USB Packet Snooping

A low-cost hardware dongle for capturing and analyzing Full Speed (12Mbps) USB traffic using an ARM microcontroller

While working on my "Low Cost KVM Switch", I needed a simple way of snooping the keyboard to detect the hotkey for switching to a different computer.

While working on a bit-banged USB for the STM8, I also saw the need for a hardware device that can capture raw and possibly malformed USB packets that a software-only solution cannot capture.

There are software-based solutions which rely on installing additional low-level drivers on the target to capture packets. What if the target system is not a PC, and what if the packet is malformed? A logic analyzer is the closest solution right now, but it is limited in the analysis it offers and can get expensive.

I was working on getting a software-based USB running on the STM8 as part of my work on the HID Multimedia Dial. Windows fails to recognize the device as there are some bugs to work out. I tried a software-based USB logger, but it failed to capture the packets because the device was not recognized by the operating system.

Another hurdle for software-based solutions under Windows is code signing for low-level drivers. This was the fate of USBPcap, which ran into a BSOD bug and requires a code signing certificate to ship the bug fix.

With a transparent hardware device, no software is needed on the target. I am thinking of dumping raw USB packets to a PC and writing them to Wireshark files for analysis.

The problem with USB is that it is a half-duplex protocol and there is no receive-only USB SIE (serial interface engine) for snooping the traffic. One could probably use a high-end microcontroller with both Host and Device ports to pass packets through transparently while snooping. It is also possible to implement a custom SIE in hardware or software, e.g. V-USB is a pure software implementation, but it is limited to Low Speed (1.5Mbps).

This project will focus on Low or Full Speed USB and use a microcontroller with a minimum of additional components. Because the dongle only listens and does not interact with the traffic, some of the hard real-time requirements are relaxed. My initial thought was to use the SPI to sample the serial stream and introduce a phase shift so that the sampling happens closer to the center of the data window. As I pondered the question while running some errands, I realized that I could simply sample the signal at 2X the data rate and worry about recovering the data afterwards.
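A minimal sketch of the "sample at 2X, recover later" idea, written in Python for the PC side. It assumes the capture is a list of D+ levels (one entry per SPI sample), that D+ idles at 1 (Full Speed), and that clock drift and bit unstuffing can be ignored over the span shown. The function names are hypothetical and are not part of the project's firmware.

```python
def decimate_2x(samples, idle=1):
    """Reduce a 2x-oversampled line capture to one level per bit time.

    The first sample that differs from the idle level marks the start of
    the first bit cell, so the second sample of each 2-sample cell is kept.
    """
    start = None
    for i, s in enumerate(samples):
        if s != idle:
            start = i
            break
    if start is None:
        return []          # nothing but idle in this capture
    return [samples[i] for i in range(start + 1, len(samples), 2)]


def nrzi_decode(levels, idle=1):
    """NRZI as used by USB: a level change is a 0, no change is a 1."""
    bits = []
    prev = idle
    for level in levels:
        bits.append(1 if level == prev else 0)
        prev = level
    return bits


# Idle, then the SYNC field as seen on D+ at two samples per bit.
capture = [1, 1, 1,
           0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
levels = decimate_2x(capture)
bits = nrzi_decode(levels)
```

With real captures the sample phase drifts, so a practical decoder would probably re-align on every edge instead of only the first one.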

While the datasheet specifies an upper limit of 18Mbps for the data rate, I have successfully overclocked the SPI MOSI of the STM32F030 to 25Mbps in my "Low Cost VGA Terminal Module". I hope the same holds true for the MISO input. DMA is required to move data around at 24Mbps (3M bytes/s).

USB waveforms look something like this:
Source: AN57294 - USB 101: An Introduction to Universal Serial Bus 2.0

The signals for 1.5Mbps/12Mbps are full swing 3.3V logic level, so they can be interfaced to GPIO. A brief Single Ended Zero (SE0) state is used to indicate End Of Packet (EOP).

To detect this using an external interrupt, we would need to add an external logic gate. (I have also considered making a missing pulse detector out of a timer.) The interrupt would then terminate the DMA transfer. We have no way of knowing what packet type and length will be received until the PID field is decoded.
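As a rough illustration of what the external gate has to compute, here is a small Python model of the bus states. It assumes the gate is a 2-input NOR fed by D+ and D-, as described in the dongle design log; the function names are hypothetical.

```python
def se0_detect(dp, dm):
    """Output of a 2-input NOR gate on D+ and D-: 1 only when both are low."""
    return int(not (dp or dm))


def line_state(dp, dm, full_speed=True):
    """Classify one sample of the D+/D- pair."""
    if not dp and not dm:
        return "SE0"       # both lines low
    if dp and dm:
        return "SE1"       # both lines high, not a legal bus state
    # J is the idle state and K is its opposite.
    # Full Speed idles with D+ high, Low Speed idles with D- high.
    j = bool(dp) if full_speed else bool(dm)
    return "J" if j else "K"
```

Since idle is D+ high on Full Speed and D- high on Low Speed, reading both pins while the bus is idle is enough to tell the two speeds apart.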

I have decided to use MOSI as the other data pin. It can be used for figuring out the connection speed.

To start capturing in time, we have up to 32 clock cycles during the SYNC field. According to ARM, the interrupt latency is 16 cycles, which already includes pushing a few registers onto the stack.

This is an optimistic number as:

  • The ARM core will be running from FLASH with 1 wait state.
  • This latency applies to the core. There are additional cycles for external signals: synchronization across the clock boundary and delays inside the peripherals.
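The cycle budget can be worked out as follows. The 48MHz core clock is an assumption (it is the value that makes the 8-bit SYNC last 32 cycles at 12Mbps); the wait-state and peripheral delays listed above are not modelled, so the real margin is smaller.

```python
CORE_HZ = 48_000_000        # assumption: core clock of the microcontroller
USB_BIT_RATE = 12_000_000   # Full Speed
SYNC_BITS = 8               # length of the SYNC field in bit times
IRQ_LATENCY_CYCLES = 16     # interrupt latency figure quoted from ARM

cycles_per_bit = CORE_HZ // USB_BIT_RATE          # core cycles per USB bit
sync_cycles = cycles_per_bit * SYNC_BITS          # cycles available in SYNC
margin_cycles = sync_cycles - IRQ_LATENCY_CYCLES  # cycles left after entry
margin_bits = margin_cycles / cycles_per_bit      # same margin in bit times
```

Under these assumptions about half of the SYNC field is gone before the first instruction of the handler runs.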


AN57294 - USB 101: An Introduction to Universal Serial Bus 2.0
USB in a NutShell
USB Made Simple

From: Lakeview Research - USB Development Tools

Windows serial port reference:

The project picture is a composite from:


  • Data packets, compression

    K.C. Lee, 04/19/2017 at 02:58, 0 comments

    I have been thinking about how to move the collected raw data to the PC over the serial port for analysis. The constraints are:

    • USB data can take any 8-bit value plus 4 additional special USB symbols, i.e. 260 symbols, which cannot fit into 8 bits.
    • Special packets need to be tagged in the serial protocol, e.g. time stamps and non-data.
    • There are a lot of small packets, from 8 bytes to 64 bytes.
    • There has to be a way to synchronize or re-synchronize in case data goes missing due to overrun.
    • It is a streaming protocol with limited buffers, and there is not enough bandwidth or memory for ACK/NACK.
    • Optional data compression?

    One way of doing this is to make a packet with a header, a size field and data, with ACK/NACK. The serial link has less bandwidth than USB, so the transfer is best effort. Error recovery would consume more bandwidth and may make the problem worse.

    I am thinking of trying something else: a simple way of tagging at the byte level with a 1 or 2-bit Huffman-style prefix.

    • If the byte starts with '0' (MSB), then the rest of the byte carries 7 bits of raw USB data.
    • If the byte starts with '10', then the remaining 6 bits are an RLE (Run Length Encoding) counter.
    • If the byte starts with '11', then it is a special tag (with a byte count) which could be one of the following:
      • Special USB symbol e.g. Single Ended Zero (SE0), Start of Frame (SOF), End of Packet (EOP), Reset
      • Initial value of RLE
      • Header for time stamp, text (e.g. configuration, command return code), corrupted packets, data over run and other errors etc
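The prefix scheme above can be sketched in Python as follows. This only covers the parts the log spells out (the three prefixes and the 7-bit payload). The function names, the MSB-first bit order inside a byte and the zero padding of the last byte are assumptions made for illustration; the meaning of the 6-bit tag values is left open because the log does not define it.

```python
def pack_raw(bits):
    """Pack a list of 0/1 values into data bytes: prefix '0' + 7 payload bits.

    Assumption: bits are packed MSB first and the final byte is padded
    with zeros when fewer than 7 bits remain.
    """
    out = []
    for i in range(0, len(bits), 7):
        chunk = bits[i:i + 7]
        chunk = chunk + [0] * (7 - len(chunk))
        value = 0
        for b in chunk:
            value = (value << 1) | b
        out.append(value)              # value < 128, so the MSB is 0
    return bytes(out)


def classify(byte):
    """Split one serial byte into (kind, 6- or 7-bit field)."""
    if (byte & 0x80) == 0x00:
        return ("data", byte & 0x7F)   # prefix '0'
    if (byte & 0xC0) == 0x80:
        return ("rle", byte & 0x3F)    # prefix '10'
    return ("tag", byte & 0x3F)        # prefix '11'
```

Because every byte announces its own type, a receiver that joins the stream midway can start decoding at the next tag byte without any framing state.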

    Data compression

    If the data cannot be compressed easily, then the efficiency is 87.5%, as 7 bits are sent in every byte. If there are very long strings of '0' or '1', then RLE could reduce the amount of data. (RLE data compression is a bonus, as there are bits left over after the special tags.)
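A quick back-of-the-envelope calculation of what the uncompressed case means for the link. The 6Mbps serial rate is taken from the dongle design log; the 8N1 framing (10 line bits per byte) is an assumption.

```python
UART_BAUD = 6_000_000        # serial data rate from the dongle design log
LINE_BITS_PER_BYTE = 10      # assumption: 8N1 framing (start + 8 data + stop)
PAYLOAD_BITS_PER_BYTE = 7    # one prefix bit is spent on every data byte
USB_FS_BPS = 12_000_000      # Full Speed bus rate

efficiency = PAYLOAD_BITS_PER_BYTE / 8                       # 7 of 8 bits
serial_bytes_per_s = UART_BAUD // LINE_BITS_PER_BYTE         # bytes per second
usb_bits_per_s = serial_bytes_per_s * PAYLOAD_BITS_PER_BYTE  # raw USB bits/s
sustained_fraction = usb_bits_per_s / USB_FS_BPS             # share of the bus
```

Under these assumptions the link can sustain roughly a third of a fully loaded Full Speed bus, which is why the buffering has to absorb bursts.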

    Serial data Re-synchronization

    Bad raw USB packets, i.e. ones with a bad CRC16, can be marked with a special tag. So if the PC sees a mismatched CRC16 on a packet that was not tagged as bad, it means that the serial protocol layer has missing or corrupted data. The PC can use the next time stamp tag for synchronization.
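On the PC side, the check could look roughly like this. The sketch uses the usual CRC-16/USB parameters (polynomial 0x8005 processed LSB first, initial value 0xFFFF, result inverted), which apply to data packets; token packets use a 5-bit CRC instead. How the two received CRC bytes are assembled from the wire is left out, and the function names are hypothetical.

```python
def crc16_usb(payload):
    """CRC16 over the data field of a USB data packet."""
    crc = 0xFFFF
    for byte in payload:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001   # 0x8005 bit-reversed
            else:
                crc >>= 1
    return crc ^ 0xFFFF


def serial_data_lost(payload, received_crc, tagged_bad):
    """True when the CRC mismatch cannot be blamed on the USB bus itself."""
    return crc16_usb(payload) != received_crc and not tagged_bad
```

When `serial_data_lost` returns True, the decoder would discard bytes until the next time stamp tag, as described above.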

  • Mechanical design

    K.C. Lee, 04/10/2017 at 22:41, 4 comments

    It might be a simple thing that you don't think about, but the mechanical design takes a lot of time.

    I am trying to make the dongle as small as is reasonable, as I don't want to block neighboring ports or add too much length. I have decided on a daughter card for the USB to serial converter, which can also be swapped out for one with a different chip.

    I have offset the plug side to make room for a screw (in the lower right corner). The Micro USB connector is off the side and can exert a lot of leverage against the stacking connector. I have decided against soldering, hot gluing, duct tape and settled on using a screw.

    A lot of small adjustments had to be made to the layout. It took a long time for me to settle on an M1.6 x 12mm screw, as it is big enough to be available in smaller quantities. 49 leftover screws is a lot better than 999 that I would never use. The crazy thing is that these small orders cost more than the rest of the parts.

    The screw encroaches on the pad of the USB connector. I might look into changing the pad into a slotted pad or just live with what I have. I'll have to make the standoff myself as I can't find one this small. There is more clean up to do.

    Looks like I am in luck. This transformer core is the right height I'll need for the standoff.

    After a bit of trial and error cutting it to the right length, I rolled the piece of core into a tube. At this scale the camera is very unforgiving. It just needs to be polished up. The O.D. is 3mm, which fits within the PCB.

    Total length of screw = 1.3mm + 1.57mm + 6.477mm + 1.57mm = 10.917mm

    11mm would have been the perfect size, but they only sell even lengths, i.e. 10mm or 12mm. A bit of filing should be fine.

    I have decided to make the boards available separately. I placed an order today and hopefully the boards will show up in 3-4 weeks, but things can happen.

    These designs are not tested yet.

    USB Snoop.brd

    USB Serial.brd

  • Dongle design

    K.C. Lee, 04/10/2017 at 01:59, 0 comments

    This is what I have so far:

    It is a pretty simple design. USB D+ goes into MISO, which is sampled by the SPI using DMA at 2X the data rate. D- can be polled from a GPIO pin. A NOR gate is used to detect SE0 (Single Ended Zero), which causes an interrupt that stops the DMA. The data collected is transferred to a 3.3V TTL serial port using DMA and hardware RTS/CTS handshake, for a data rate of up to 6Mbps.

    Here is a quick survey of various popular USB serial chips:

    The USB serial design is based on this project log: STM8 breakout board with USB Serial

    I have also made provision for using the UART as a second SPI for logging data by connecting the clock pin to one of the handshake lines.

    This is what the preliminary assembly looks like with the USB serial board on top. (I have to find models for the 2mm connectors.)

    I ordered the crystals from China today, but probably won't be seeing them for about 2 months. :( I do have larger crystals.

  • Initial look at raw USB signals

    K.C. Lee, 03/23/2017 at 04:57, 0 comments

    I soldered a 3-pin header onto an old pass-through dongle that I had made for powering an amplifier. This comes in handy for snooping the signals and for powering the dongle.

    To get an idea of what the USB signals look like, I set the sampling rate of my logic analyzer to 24Mbps. This is also the highest speed at which the SPI can operate in master mode. The following is the capture for a Full Speed device (ST Link).

    This looks promising to the human eye. The firmware will be looking at a sequence of packed binary data coming from the MISO pin of the SPI and will have to process it one bit at a time.

    While the signal is differential, either one of the pair can be used for recording the data stream. Both signals are needed to determine the side band signalling, e.g. a single ended zero (SE0), when both D+ and D- are at '0', can be used to signify a device reset if held for more than 10ms.
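A hedged sketch of how a decoder might tell the two uses of SE0 apart by duration. The 10ms reset threshold comes from the text above, and an End Of Packet is nominally two bit times of SE0. The 3 bit-time tolerance and the function name are assumptions chosen for illustration.

```python
FS_BIT_TIME_S = 1 / 12_000_000   # one Full Speed bit time, about 83 ns


def classify_se0(duration_s, bit_time_s=FS_BIT_TIME_S,
                 reset_s=10e-3, eop_max_bits=3):
    """Label an SE0 interval by how long both lines stayed low."""
    if duration_s > reset_s:
        return "reset"               # held for more than 10 ms
    if duration_s <= eop_max_bits * bit_time_s:
        return "eop"                 # nominally 2 bit times, with tolerance
    return "other"                   # anything in between is left undecided
```

For Low Speed captures the same function can be called with `bit_time_s=1 / 1_500_000`.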

    The initial pulses are part of the synchronization. It is supposed to be a series of 8 bits of alternating '1' and '0'. Not sure why I am seeing 7 transitions.
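One possible explanation for the 7 transitions, offered as a sketch and not checked against the capture above: for Low and Full Speed the SYNC field is the data byte 00000001 sent through the NRZI encoder, where a 0 toggles the line and a 1 leaves it unchanged. The helper names below are hypothetical.

```python
def nrzi_encode(bits, idle=1):
    """NRZI as used by USB: toggle the line for a 0, hold it for a 1."""
    level = idle
    levels = []
    for b in bits:
        if b == 0:
            level ^= 1
        levels.append(level)
    return levels


def count_transitions(levels, idle=1):
    """Number of level changes, counting the first edge away from idle."""
    count = 0
    prev = idle
    for level in levels:
        if level != prev:
            count += 1
        prev = level
    return count


SYNC_DATA = [0, 0, 0, 0, 0, 0, 0, 1]       # SYNC before NRZI encoding
sync_levels = nrzi_encode(SYNC_DATA)       # D+ levels for a Full Speed device
sync_edges = count_transitions(sync_levels)
```

The seven 0 bits each produce an edge and the final 1 produces none, so the last two bit times sit at the same level and mark the end of SYNC.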

    Interrupt latency might introduce delay and jitter when the sampling starts. The PID field, which follows the SYNC, has 4 bits that identify the packet type. The bits are sent together with their complements, so the field is useful for figuring out where the packet starts.

    PID field layout: PID0 PID1 PID2 PID3 /PID0 /PID1 /PID2 /PID3, where /PIDn is the complement of PIDn.

    The linked reference explains the basics of the packet format at this level.
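A small sketch of the complement check, assuming the 8 PID bits have already been assembled into a byte with PID0 as bit 0 (the bits arrive LSB first). The function names are hypothetical and the name table is limited to the common Low and Full Speed PIDs.

```python
PID_NAMES = {
    0x1: "OUT", 0x9: "IN", 0x5: "SOF", 0xD: "SETUP",
    0x3: "DATA0", 0xB: "DATA1",
    0x2: "ACK", 0xA: "NAK", 0xE: "STALL",
}


def pid_is_valid(pid_byte):
    """The upper nibble must be the bitwise complement of the lower nibble."""
    low = pid_byte & 0x0F
    high = (pid_byte >> 4) & 0x0F
    return high == (~low & 0x0F)


def pid_name(pid_byte):
    """Return the packet type, or None if the check fails or it is unknown."""
    if not pid_is_valid(pid_byte):
        return None
    return PID_NAMES.get(pid_byte & 0x0F)
```

A decoder that is unsure of its bit alignment could slide a window along the stream until `pid_is_valid` passes right after a SYNC pattern.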




Paul Stoffregen wrote 04/18/2017 at 16:03

Wow, impressive work so far. If you can pull this off and publish as open source, I'm sure you'll make a lot of people very happy. Well, except maybe a few. When you write "logic analyzer is the closest solution right now", perhaps you've not seen the Beagle Protocol Analyzer from Total Phase? I'd imagine a very low cost open source hardware design on the market won't make them really happy. But I'm sure everyone else will love it!


K.C. Lee wrote 04/18/2017 at 16:51

Thanks. I am aware of protocol analyzers, but I am not doing enough USB development to justify one.
