A dirt cheap open hardware USB-JTAG board designed to program TinyFPGA A1 and A2 boards.
The PCBs arrived from the fab today. They look excellent. I set them up for 25 boards per panel and got a stainless steel solder paste stencil along with them.
Over the weekend I ordered 25 PIC16F1455 microcontrollers so I could assemble some boards ahead of the main order of PICs from Microchip. Programming 25 boards manually isn't too bad, especially after going through the work of assembling them.
Assembly and reflow went well. My reflow oven might be running a little hot. I noticed the silkscreen is slightly darker than the bare PCBs still in the packaging. Besides that the assembled boards look and work just as well as the boards I've gotten pre-assembled.
I broke apart the panel and tried programming one of the boards without soldering pins onto the ICSP header. It didn't work. So I figured it just wasn't making a good enough connection and I soldered on a set of pins. It still didn't work. I was pretty worried at this point, but the basic design of the PCB was the same, and all of the solder joints looked great. To try and isolate the problem I attached the PICkit3 programmer to one of my prototype boards. It didn't work. Whew. It wasn't the new boards. Finally I followed the tried and true method of unplugging and then replugging the PICkit3 programmer and trying again. Success!
I've programmed and tested about 8 boards so far. All programmed successfully, and the TinyFPGA programmer GUI has no problem recognizing the hardware and communicating with a #TinyFPGA A-Series board.
The breadboard in the picture was my basic testbench for developing the firmware for the PIC. I had connected my logic analyzer to the JTAG pins to view and debug the JTAG programming flow as well as performance issues.
All of my boards are open source and open hardware, but this is the first board to actually contain the logo.
I should be placing the boards for sale on @Tindie by next week. I only have 25 boards until the pre-programmed PICs come in. Speaking of which, I just received an email from @Microchip Technology that the 200 PIC16F1455 chips have been programmed and shipped. Once they arrive I'll assemble boards one panel at a time. I'm still working on exact pricing, but I can confirm they will be available with the right-angle header soldered on for less than $10. I could also sell bare PCBs and boards without the right-angle header at a discount.
I'm really excited to bring this programmer to market. It's significantly cheaper than the alternatives and will make #TinyFPGA A-Series even more accessible. Once this is for sale I have more open TinyFPGA projects and products in the works.
Now when you launch the GUI the port selector will also tell you whether it is an A-series board or a B-series board. As TinyFPGA boards are connected or disconnected, the list will automatically update.
While a particular port is selected for either type of TinyFPGA board the programmer will continuously test the connection to the board. If the programmer is disconnected from the board, the board lost power, or the USB cable was disconnected, it will let you know.
The #TinyFPGA A-Series boards get a progress bar for programming, just as the #TinyFPGA B-Series boards will. As the configuration flash is erased, written, and verified, you can see where the process is at every step of the way.
Programming speed is just as fast as it has been during development of the tinyfpgaa.py Python driver. This means small images take about 3 seconds and large images will take up to 12 seconds. This is faster than the Diamond Programmer and Lattice Download Cable.
Delivery of the production PCBs is estimated to be September 12th. I have a few PIC16F1455 chips I may use to assemble some boards and give them a spin.
Looks like the production PCBs have completed fabrication. Should be on track to receive them early next week. Delivery for the pre-programmed PIC16F1455 micros is set for one week from today. This means I most likely won't be able to sell any TinyFPGA Programmer boards for about a week and a half.
In the meantime I've been reorganizing the FPGA programming/configuration Python code across the git repos and consolidating the GUI into its own repo. With this change the only programmer application you will need to use is the TinyFPGA Programmer Application. It automatically recognizes TinyFPGA boards and programmers when they are connected and gives immediate feedback on connectivity and configuration status.
Yesterday I was able to root-cause the bug where only '0' data would be returned from the PIC16F1455 over USB to the Python module. This turned out to be an interesting bug.
The bug appeared after I made some seemingly unrelated changes to the code. I could not figure out why the data returned was always null. I traced the firmware's execution in the debugger and followed the return data through the various buffers and it was always correct.
Looking through the PIC16F1455 datasheet I was reminded of the special dual-port RAM for the USB controller. Looking at the buffers in the USB stack source code from Microchip, there were no annotations to keep them in this special dual-port RAM location. Upon further reading of the datasheet I found that the USB controller can only read data from the dual-port RAM.
Checking the map file with the compiled addresses of all the variables in the firmware, I found that the transmit buffer was not within the dual-port RAM. Bingo. It turns out the firmware worked before because I was lucky and the buffers happened to be allocated in the right location. However, once some extra variables and data structures were added, the compiler's allocation algorithm happened to place the USB buffers outside the dual-port RAM.
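A check like this can be scripted once the buffer addresses have been pulled out of the map file. The following is only a minimal sketch, not part of the project's tooling: the region bounds, buffer names, addresses, and sizes are all placeholder values for illustration, and the real ones should come from the PIC16F1455 datasheet and the actual map file.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int  # first address of the region
    size: int   # length of the region in bytes

    def contains(self, addr: int, length: int) -> bool:
        """Return True if [addr, addr + length) lies entirely inside the region."""
        return self.start <= addr and addr + length <= self.start + self.size


# Placeholder bounds: substitute the dual-port RAM range from the datasheet.
DUAL_PORT_RAM = Region(start=0x2000, size=0x200)

# Hypothetical buffer names and addresses, as they might be read from a map file.
buffers = {
    "cdc_tx_buffer": (0x2040, 64),  # fully inside the region
    "cdc_rx_buffer": (0x21E0, 64),  # runs past the end of the region
}

# Collect every buffer that is not fully contained in the dual-port RAM.
misplaced = [
    name
    for name, (addr, length) in buffers.items()
    if not DUAL_PORT_RAM.contains(addr, length)
]
print(misplaced)
```

Running a check like this after every build would flag the problem as soon as the compiler moves a buffer, instead of relying on a lucky allocation.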
The solution seemed simple: tell the compiler to place the USB buffers within the dual-port RAM address range. It turned out not to be that simple. Even though I could tell the compiler where to put the buffers, I would still get corruption on the transfers. As I dug further into the USB stack I started to replace sections with my own versions of the CDC device class transmit and receive path. This led to a much better understanding of the USB hardware in the PIC as well as fully functional firmware.
I'm not quite ready to order the pre-programmed PICs just yet. Will need some more testing to gain confidence. On a related note, the production PCBs are about half-way finished being fabricated. They could be delivered to me as soon as early next week.
Production PCBs are being fabricated. Should take about a week to finish and get delivered. While that's going on I've ordered all the necessary parts for a first assembly batch except for the PIC microcontroller. I believe it will take me too long to develop a robust programming sequence for the PIC on a pogo pin test bed so I've opted to have Microchip pre-program the PICs for me. As for assembly, I've had the #TinyFPGA A-Series and #TinyFPGA B-Series boards assembled in a factory, but I am planning on assembling the #TinyFPGA Programmer boards myself. They have very few parts and nothing tricky. This will help keep the final price of the boards low.
I have one more bug I'm aware of with transmitting USB data. At some point along the way of my optimizations the USB IN data started getting corrupted...it's always '0'. I'm going over the CDC code generated by MPLAB as well as the PIC16F1455 datasheet to understand what's going on. Once that's fixed I'll be able to order the pre-programmed PICs.
I've made a few updates to the PCB design.
Sending these boards off for a small production run. By the time I get them back I should have fixed the last few lingering firmware issues and should be ready to list them on my Tindie store.
In the last log I wrote about how much faster the new firmware and Python library are. Now I want to talk a little bit about the new commands that enable this performance improvement.
TinyFPGA Programmer Firmware Command Encoding
There were three original commands for driving the programmer: CONFIG_IO, SET, SET_GET. These commands allow you to configure individual GPIO pins for INPUT or OUTPUT, set pin values, and get pin values. They are enough to bit-bang protocols, but not very quickly.
To enable the faster speeds in the latest firmware, six new commands were added: SHIFT, CONFIG_SIE, LOOP, END_LOOP, CLEAR_STATUS, and GET_STATUS.
SHIFT: This command shifts many bytes of data at a time serially through the GPIO pins. The shift operation is highly configurable: it can output data, input data, check data with a mask, and even consume or produce no data at all if configured to do so. The firmware supports a total of 8 programmable SHIFT configurations that allow many aspects of the SHIFT command to be configured.
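To make the idea of a bulk shift concrete, here is a small host-side simulation of shifting bytes through a serial data pin. It is only a sketch under stated assumptions, not the firmware's implementation: the function name, the callback standing in for the pins, and the LSB-first bit order (the usual JTAG convention) are all assumptions made for illustration.

```python
def shift_bytes(data_out, clock_bit):
    """Shift every byte out LSB first and capture one input bit per clock.

    clock_bit(out_bit) is a hypothetical stand-in for one clock pulse: it is
    given the bit driven on the data-out pin and returns the bit sampled on
    the data-in pin.
    """
    captured = bytearray()
    for byte in data_out:
        in_byte = 0
        for bit in range(8):
            out_bit = (byte >> bit) & 1      # next bit to drive, LSB first
            in_bit = clock_bit(out_bit) & 1  # pulse the clock, sample the input
            in_byte |= in_bit << bit         # rebuild the captured byte
        captured.append(in_byte)
    return bytes(captured)


# A loopback target returns exactly what was sent.
loopback = shift_bytes(b"\xA5\x3C", lambda bit: bit)

# A target that inverts every bit returns the complement of each byte.
inverted = shift_bytes(b"\xA5\x3C", lambda bit: bit ^ 1)
print(loopback.hex(), inverted.hex())
```

The point of doing this in one command is that the whole inner loop runs without any per-bit round trip to the host.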
CONFIG_SIE: SIE stands for Serial Interface Engine. CONFIG_SIE allows you to program each of the 8 serial interface engine configurations. When a SHIFT command is sent, it specifies the index of the SIE configuration to use.
LOOP and END_LOOP: A total of 60 bytes worth of commands can be repeated up to (2^16 - 1) times or until a SHIFT check data operation within the loop matches. This capability is used for JTAG programming to poll the FPGA's busy status bit while waiting for a flash erase or write operation to complete. Loops must be terminated with an END_LOOP command. If the loop terminates before a successful match, a failure status code will be sent to the host computer.
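The polling behavior described above can be modeled in a few lines of Python. This is a hedged sketch of the semantics only, not the firmware code: the function name, the callback that stands in for a SHIFT with check data, and the busy mask value are hypothetical.

```python
MAX_LOOP_COUNT = 2**16 - 1  # iteration limit stated in the text


def run_loop(read_status, busy_mask, max_iterations):
    """Repeat a status read until the busy bits clear or the budget runs out.

    Returns (matched, iterations_used). matched is False when the loop ended
    without a successful check, which is the case where the programmer would
    report a failure status to the host.
    """
    if not 1 <= max_iterations <= MAX_LOOP_COUNT:
        raise ValueError("iteration count out of range")
    for iteration in range(1, max_iterations + 1):
        if (read_status() & busy_mask) == 0:
            return True, iteration
    return False, max_iterations


# Simulated FPGA that reports busy three times, then ready.
responses = iter([0x01, 0x01, 0x01, 0x00])
result_ok = run_loop(lambda: next(responses), busy_mask=0x01, max_iterations=10)

# Simulated FPGA that never leaves the busy state.
result_timeout = run_loop(lambda: 0x01, busy_mask=0x01, max_iterations=5)
print(result_ok, result_timeout)
```

Keeping this loop on the programmer means the host does not need a USB round trip for every status poll.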
CLEAR_STATUS and GET_STATUS: Results of SHIFT with check data, and of LOOPs, are only returned if there is a failure, and only if a failure has not previously been reported. Before a bitstream is programmed to a TinyFPGA board, the CLEAR_STATUS command is sent. At the very end of the entire programming procedure, the GET_STATUS command is sent. If there is an error, the error status will be sent at the time the error occurred as well as when the GET_STATUS command is executed. If there is no error, then a successful status will be sent when the GET_STATUS command is executed.
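One way to picture this reporting scheme is as a latch that remembers only the first failure. The class below is an illustrative model, not the firmware: the class name, method names, and numeric status codes are hypothetical, and `sent` simply records what would go to the host.

```python
class StatusLatch:
    SUCCESS = 0  # hypothetical code for "no error"

    def __init__(self):
        self.status = self.SUCCESS
        self.sent = []  # every status value sent to the host, in order

    def clear_status(self):
        """Model of CLEAR_STATUS: forget any earlier failure."""
        self.status = self.SUCCESS

    def report_failure(self, code):
        """Called when a SHIFT check or LOOP fails.

        Only the first failure since the last clear is latched and sent.
        """
        if self.status == self.SUCCESS:
            self.status = code
            self.sent.append(code)

    def get_status(self):
        """Model of GET_STATUS: always send the latched status."""
        self.sent.append(self.status)
        return self.status


# A run with two failures: only the first is reported, then repeated at the end.
failing = StatusLatch()
failing.clear_status()
failing.report_failure(2)
failing.report_failure(3)
failing.get_status()

# A clean run: the host sees a single success status at the end.
clean = StatusLatch()
clean.clear_status()
clean.get_status()
print(failing.sent, clean.sent)
```

With this scheme the host can stream the whole programming sequence and read a result once at the end, while still learning about a failure as soon as it happens.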
At last, after some long evenings working through various issues I've rewritten large portions of the TinyFPGA Programmer firmware and Python module to be much, much faster.
How fast? For a small design it can erase, program, and verify flash in 3 seconds. For a large design utilizing the entire FPGA it takes about 10 seconds. For comparison, the official Diamond Programmer and Lattice Download Cable take about 15 seconds for the MachXO2 1200 FPGAs.
Fast bitstream program time matters because it means you can verify your changes on the real FPGA faster. Fast programming of flash means you don't have to worry about a power glitch or power loss to the board wiping out the SRAM configuration. The latest configuration bitstream will always be loaded.
How did I enable such a large improvement in speed? It comes down to recognizing the inefficiencies in the system and implementing optimizations that work around them:
Bit-banging every JTAG pin change with individual commands increases the amount of USB traffic the PIC needs to process and doesn't allow for a fast inner loop. Below is a waveform of the JTAG pins while the TinyFPGA Programmer is writing the #TinyFPGA A-Series FPGA's flash. The sections marked A are times when the firmware was processing incoming and outgoing USB packets. The sections marked B are times when the firmware was actively driving the JTAG pins, but was only able to achieve about 15 kHz. These two inefficiencies add up to a lot of wasted time.
The solution is to add commands to shift many bytes worth of data all at once. This reduces the overall amount of USB traffic the PIC needs to process and allows for a very tight inner loop.
Writing a command to the PIC over USB, then waiting for a response takes at least a few milliseconds of time. This happened every time the programmer needed to wait for a status bit to clear or verify data from the FPGA. The section marked C in the waveform below shows where the Python application was waiting for a response from the PIC before it would send new commands.
There are multiple optimizations here:
Every time the Python programmer module writes to the serial port, the write appears to block and the process gets context-switched out. This adds a few milliseconds during which the programmer sits idle waiting for commands to process.
To hide this latency I increased the buffer to 256 bytes to enable several packets to be queued up to transmit at once. This seems to be enough to keep the programmer hardware fed with commands while the Python application is blocked.
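Seen from the host side, the idea is to hand over commands in batches that fill the buffer rather than one small write at a time. The generator below is a rough sketch of that batching, not the actual tinyfpgaa.py code: the function name is hypothetical, and it assumes each command is already encoded as a bytes object.

```python
def queue_commands(commands, buffer_size=256):
    """Group encoded commands into writes of at most buffer_size bytes.

    A command is never split across two writes.
    """
    batch = bytearray()
    for command in commands:
        if len(command) > buffer_size:
            raise ValueError("command larger than the buffer")
        if len(batch) + len(command) > buffer_size:
            yield bytes(batch)  # buffer is full, flush it as one write
            batch = bytearray()
        batch += command
    if batch:
        yield bytes(batch)      # flush whatever is left over


# Five 100-byte commands fit two to a write, with one left over.
batches = list(queue_commands([bytes(100)] * 5))
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)
```

Each yielded batch would then go out in a single serial write, so the programmer still has queued work while the Python process is blocked.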
Lattice SVF files contain large delays within polling loops, and program unused rows unnecessarily.
Now that I understand the programming protocol very well, I wrote a custom JEDEC file parser that determines exactly what JTAG commands to issue. I was able to reduce the wait time between status polls to speed up polling. I was also able to program only the rows that have non-zero data.
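Skipping empty rows is simple to express once the image has been parsed into raw bytes. The snippet below is a minimal sketch of that filter and not the project's parser: the function name is hypothetical, and the 16-byte row size is an assumption that should be checked against the MachXO2 documentation.

```python
def rows_to_program(image, row_size=16):
    """Yield (row_index, row_bytes) for every row that has non-zero data."""
    for offset in range(0, len(image), row_size):
        row = image[offset:offset + row_size]
        if any(row):  # skip rows that are entirely zero
            yield offset // row_size, row


# Three rows: empty, one non-zero byte, empty. Only row 1 needs programming.
image = bytes(16) + b"\x01" + bytes(15) + bytes(16)
rows = list(rows_to_program(image))
print([index for index, _ in rows])
```

This relies on an erased or untouched row already reading back as all zeros, which is what makes skipping those rows safe.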
A final optimization I performed is one I'm not too happy about. My firmware ran up against the...
Well, I took another look at the MPLAB Code Configurator for the firmware and I realized I was only running the CPU core at something like 8 MHz. So I modified the multiplier so it's running at 48 MHz and the improvement was significant. Flash program and verify now takes 35 seconds and @Xark reports SRAM programming takes only 2 seconds. Very nice! This is before any bulk serial optimizations have been added. I'm very happy with this result.
Next steps: I want to add the bulk serial optimizations in and see even better performance. Then I will add support for programming .jed and .bit files directly, and finally I will add the module into the TinyFPGA Programmer GUI that exists for #TinyFPGA B-Series.
Once that is done and working to my satisfaction I'll be making some more revisions to the PCB. I think it makes sense to break out all the PIC's pins so that the board can be used for other purposes as well. That way I can add support for UART and SPI along with its existing GPIO capabilities. That will make it a dirt cheap programmer that can be used for many things.
I squashed a few more bugs in my Python code and was able to successfully program and verify the flash in a #TinyFPGA A-Series board. At the same time that I'm developing a Python module to communicate with the #TinyFPGA Programmer, @Xark has been working on a solution using Lattice ispVM to more closely integrate with the Lattice tools. He's discovered the SRAM can be programmed in about a dozen seconds using the #TinyFPGA Programmer. Flash programming and verifying on the other hand is taking the Python module about 3 minutes.
I believe flash programming is so slow because there is currently a lot of overhead for each JTAG bit of data transferred. Combine that with verification and it just takes time. To remedy this I'm planning on adding a new serial acceleration command. This command will allow the JTAG data to be transferred across USB about 16x faster and should reduce the overhead in the firmware as well.