Honorable Mention

A project log for MicroPort

USB-CDC serial port for PIC18F, in under 1KiB. Refactored down from a USB-DFU bootloader, hand-written in assembler to be light and fast.

Jesse, 01/05/2017 at 03:06

I'd like to talk a little about some of the changes that cut this project down to almost a third of its original size. In earlier logs I already mentioned the USB Device Firmware Upgrade (DFU) support, extra USB descriptors, accurate baud-rate calculation, and bootloader/application separation as features that got the axe.

Here follow the remaining bits left on the cutting room floor.


As the bootloader was meant to handle USB core functionality without interfering with user firmware, any registers modified by the bootloader had to be backed up before use and restored afterwards. This included WREG, BSR, STATUS, FSR1, FSR2, and TBLPTR. Some of these are two or even three bytes wide, and TBLPTR also needed bootloader-specific values cached in RAM for descriptors larger than one packet.

Every backup and restore operation required 4 bytes, and removing this constraint cut 104 bytes of code from flash.
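
As a rough sanity check on that figure (a sketch in Python, not code from the project): assuming each byte is copied with a 4-byte MOVFF instruction, and assuming the usual PIC18 widths for the registers listed above, one accounting that lands on 104 bytes is a save and a restore of every register byte, plus a store and a load of the cached TBLPTR copy. The split between those groups is an assumption, not something stated in this log.

```python
# Register widths in bytes on a PIC18 (TBLPTR is TBLPTRU:TBLPTRH:TBLPTRL).
REGISTER_BYTES = {"WREG": 1, "BSR": 1, "STATUS": 1, "FSR1": 2, "FSR2": 2, "TBLPTR": 3}
MOVFF_BYTES = 4  # MOVFF is a two-word (4-byte) instruction


def context_overhead_bytes(registers=REGISTER_BYTES, cached=("TBLPTR",)):
    """Estimate flash spent on saving/restoring registers, one MOVFF per byte."""
    saved = sum(registers.values())      # one move per byte on entry
    restored = saved                     # one move per byte on exit
    # Assumed: cached registers are also written to and read from a RAM copy.
    cache_moves = 2 * sum(registers[name] for name in cached)
    return (saved + restored + cache_moves) * MOVFF_BYTES


print(context_overhead_bytes())  # 104
```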


Originally, USB descriptors were referenced by a series of tables to simplify the code required in user application firmware. This table setup supported multiple configurations, strings, and any other type of descriptor supported by the USB specification. In the end, the tables were replaced with code that specifies the length and flash address of the two remaining descriptors: device and configuration. The tables themselves would have occupied at least 28 bytes of flash, plus an additional 84 bytes for the code to scan them when a descriptor is requested.

Instead, a SWITCH/CASE block reduced the total footprint of descriptor lookups by 86 bytes.
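
To illustrate the shape of that lookup, here is a minimal Python model (the real thing is PIC18 assembler). The descriptor type codes 1 and 2 and the 18-byte device descriptor length come from the USB specification; the flash addresses, the configuration length, and the function name are hypothetical placeholders.

```python
DESC_TYPE_DEVICE = 1         # bDescriptorType for a device descriptor
DESC_TYPE_CONFIGURATION = 2  # bDescriptorType for a configuration descriptor

# (flash address, length in bytes); addresses are made up for illustration.
DEVICE_DESCRIPTOR = (0x0300, 18)         # a device descriptor is always 18 bytes
CONFIGURATION_DESCRIPTOR = (0x0312, 67)  # hypothetical total length


def lookup_descriptor(w_value):
    """Map the wValue of a GET_DESCRIPTOR request to (address, length).

    The descriptor type is the high byte of wValue. Anything other than
    device or configuration returns None, which the caller would turn
    into a STALL.
    """
    desc_type = (w_value >> 8) & 0xFF
    if desc_type == DESC_TYPE_DEVICE:
        return DEVICE_DESCRIPTOR
    if desc_type == DESC_TYPE_CONFIGURATION:
        return CONFIGURATION_DESCRIPTOR
    return None


print(lookup_descriptor(0x0100))  # (768, 18)
print(lookup_descriptor(0x0300))  # None (string descriptors were dropped)
```

With only two cases left, a pair of compares is cheaper than any general table walker.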


Initializing the PIC for USB, and even for general operation, typically involves loading default values into a number of registers that control operation of the device. After reviewing the datasheet to determine the reset and power-on states of those registers, I found several either didn't need to be specified in code, or could be initialized by a single bit operation. On the PIC18 (an 8-bit core with 16-bit instruction words), loading an immediate value into a register takes two instructions, while a single bit flip takes just one. By re-arranging code to take advantage of common immediate values, changing immediate constants to single bit flips, and removing initialization of registers that are already in the correct state at power-on/reset, I managed to save an additional 50 bytes.
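
The three tricks can be captured in a small cost model. This is a Python sketch under stated assumptions, not a tool used in the project: every instruction involved (MOVLW, MOVWF, BSF/BCF, CLRF, SETF) is taken to be one 2-byte word, and the register names and values in the example are invented.

```python
def is_single_bit(value):
    """True if exactly one bit is set."""
    return value != 0 and value & (value - 1) == 0


def init_cost_bytes(inits):
    """Estimate flash bytes needed to initialise registers in program order.

    inits: list of (name, reset_value, wanted_value) tuples.
    """
    wreg = None  # literal currently held in WREG, if known
    total = 0
    for _name, reset, wanted in inits:
        diff = reset ^ wanted
        if diff == 0:
            continue  # already correct after reset: no code at all
        if is_single_bit(diff) or wanted in (0x00, 0xFF):
            total += 2  # one BSF/BCF, or CLRF/SETF
            continue
        if wreg != wanted:
            total += 2  # MOVLW literal
            wreg = wanted
        total += 2  # MOVWF register
    return total


def naive_cost_bytes(inits):
    """MOVLW + MOVWF for every register, regardless of reset state."""
    return 4 * len(inits)


example = [
    ("REG_A", 0x00, 0x00),  # matches reset value
    ("REG_B", 0x00, 0x10),  # one bit differs
    ("REG_C", 0x00, 0x15),  # needs a literal
    ("REG_D", 0x3F, 0x15),  # reuses the literal already in WREG
]
print(naive_cost_bytes(example), init_cost_bytes(example))  # 16 8
```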


USB Stall and USB Error interrupts had empty handlers assigned that did nothing more than clear the interrupt flag and exit the interrupt handler. During development of the bootloader, they also served to send a notification on the serial port, which I'll go into in a little more detail shortly. However, in this project they served no purpose, and both of these interrupts could be disabled.

Removing the code for these interrupts freed up an additional 16 bytes.


Handing over buffers of filled data to the USB interface engine on the PIC is something done repeatedly by this firmware. This was a very modest saving, but in each case 3 instructions could be replaced with 1 call instruction, and two separate functions were added to handle these cases. Two were required, as these instructions are responsible for setting the number of bytes to be transmitted and setting the USB data toggle bit while handing over ownership of the buffer. The USB data toggle bit tracks whether an even or odd packet is being sent for a specific request. This is somewhat independent of the ping/pong (even/odd) buffering, which also has to be managed.

While scheming for ways to reduce the code size I had the idea, which I never made good on, to analyze the binary looking for common sequences of words in the compiled flash. The idea was that if enough common sequences could be condensed into function calls, then identifying many of these cases could be automated. While I never did get around to doing this, this specific change is exactly the kind of case I would expect such an analysis to identify. It would likely obfuscate the source code, but this kind of automated analysis could prove useful in the future.
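
A small Python model of the hand-over may help here. It is a sketch only: the bit positions (UOWN, DTS, DTSEN) are assumed from the BDnSTAT register layout of the PIC18F4550 family, so verify them against your part's datasheet, and the class and function names are hypothetical. Having one helper per data toggle value is one reading of why two functions were needed.

```python
UOWN = 0x80   # buffer owned by the USB engine (assumed bit 7)
DTS = 0x40    # data toggle value: 0 = DATA0, 1 = DATA1 (assumed bit 6)
DTSEN = 0x08  # enable data toggle checking (assumed bit 3)


class BufferDescriptor:
    """Stand-in for one buffer descriptor entry (status and byte count)."""

    def __init__(self):
        self.stat = 0
        self.cnt = 0


def hand_over(bd, count, data1):
    """Set the byte count first, then give the buffer to the USB engine."""
    bd.cnt = count
    bd.stat = UOWN | DTSEN | (DTS if data1 else 0)


def arm_data0(bd, count):
    hand_over(bd, count, data1=False)


def arm_data1(bd, count):
    hand_over(bd, count, data1=True)


bd = BufferDescriptor()
arm_data1(bd, 8)
print(bd.cnt, hex(bd.stat))  # 8 0xc8
```

The order matters: once UOWN is set the engine may use the buffer, so the count has to be in place beforehand.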

In the end, this minor change saved another 8 bytes.


Next, a minor bit of tuning that also helped slightly. RAM accesses on the PIC18 are done either through indirect addressing, through the global 'access' bank, or through banked RAM accesses. Banked accesses are used in a few places, and must be used when building for the extended instruction mode on the PIC18F series. As it turns out, opcodes that specify the lower portion of access RAM are instead treated as offsets from FSR2 in extended mode. This firmware actually takes advantage of this when parsing USB requests received from the host, as individual bytes in the received message can be retrieved in a single opcode instead of 3 or more.

Anyway, back to banked register access. By doing the bank switch earlier in the USB transfer interrupt handler, I was able to remove four other bank switches. This saved a paltry 6 bytes total.
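
The address decoding described above can be modelled in a few lines of Python. This is a sketch under assumptions: the boundary between access RAM and the SFR half of the access bank is taken to be 0x60 (it varies by device), and the function and parameter names are invented for illustration.

```python
def effective_address(f, a, bsr, fsr2, extended, split=0x60):
    """Resolve the 12-bit data address for a byte-oriented PIC18 opcode.

    f: 8-bit address field in the opcode
    a: access bit (0 = access bank, 1 = banked via BSR)
    extended: True when the extended instruction set is enabled
    split: first access-bank address that maps to the SFR region (assumed)
    """
    if a == 1:
        return ((bsr & 0x0F) << 8) | f   # banked: BSR supplies the top nibble
    if f >= split:
        return 0xF00 | f                 # upper access bank maps to the SFRs
    if extended:
        return (fsr2 + f) & 0xFFF        # indexed literal offset from FSR2
    return f                             # legacy mode: low access RAM


# With FSR2 pointing at a received request at 0x400 (address assumed),
# byte 2 of the message is reachable with a single opcode.
print(hex(effective_address(0x02, a=0, bsr=0, fsr2=0x400, extended=True)))   # 0x402
print(hex(effective_address(0x02, a=0, bsr=0, fsr2=0x400, extended=False)))  # 0x2
```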


And finally, while it saved no code space whatsoever, I'd like to mention my serial debug routines. I removed them primarily to simplify the code in the final submission. Removing this code saved no space because, similar to how asserts are done in C, these debug routines are macros which can be toggled to generate no code when not producing debug output. Specific points in code (see the stall and error interrupts above, for example) would print a symbol to the serial interface, and optionally dump a series of register values.

By having an ifdef statement with null macros for a non-debug build, and serial print statements for debug builds, I could easily litter the code with printed markers and dumped registers, which were helpful in early development. It also had the added benefit of allowing me to determine if the delays imposed by printing to the serial port were the cause of bugs. **spoiler** It was never the delays of pushing bytes through the serial interface. Turns out, USB is somewhat forgiving as long as it sees a response in less than 50 ms, which at the baud rates I was using would allow for almost 500 characters sent per USB transaction.

This feature was largely unused by the time I started on the 1KB project, as serial communication was a core part of the end goal. Trying to get meaningful debug prints while also sending/receiving data on the same serial interface simply wasn't worth the effort by that point. Instead, most of my debugging was done using the usbmon Linux module and Wireshark to trace USB communication.
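
The character budget is easy to re-derive. A rough Python sketch, assuming 10 bits on the wire per character (8N1 framing) and the 50 ms window mentioned above; the baud rates below are examples only, since this log doesn't say which ones were in use.

```python
def chars_per_window(baud, window_ms=50, bits_per_char=10):
    """Whole characters a UART can send within the response window."""
    return baud * window_ms // (bits_per_char * 1000)


for baud in (9600, 100_000, 115_200):
    print(baud, chars_per_window(baud))
# 9600 48
# 100000 500
# 115200 576
```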

While not relevant to this project, eventually the goal is to offer a similar debug printout feature using a USB-CDC interface. This should offer a development experience similar to doing Arduino development.

This, combined with the baud rate divider changes and other alterations mentioned in days past, made up enough space to fit the core functionality into 1KB while also including RTS/CTS/DTR/DSR signals.

This post wraps up the software development on this project to this point.

Thanks for reading, and stay tuned for more info on setting up the hardware!
