Main screen turn on

A project log for Reverse-Engineering a low-cost USB CO₂ monitor

I'm trying to get data out of a relatively low-cost (80€) CO₂ monitor that appears to have a USB connection for data as well as for power.

Henryk Plötz 04/25/2015 at 00:00

So, looking at the transmitted data didn't get me very far in figuring out the protocol. My normal approach now would be to 'spoof' the other side of the communication channel and see how the device under test changes its behaviour. In this case I would've written a short piece of Python to send out this initial SET_REPORT packet, then vary the contents and look at how the device responds differently. But that's boring work I've done several times before with other devices, so I wanted to try something new, and this is the perfect chance. I've never reverse-engineered anything by looking at a disassembly, so let's dive in!

The friendly people at Hex-Rays offer a freeware version of IDA Pro 5 for non-commercial use. It's not the latest and greatest, but should suffice. I installed it and loaded the ZG.exe file with full analysis turned on. This is the result:

That doesn't seem right. Scrolling through only shows a lot of data, but no functions. And the strings window has none of the user interface strings we were seeing when running the program. I've also tried some online disassembler services, and one of them gave a crucial hint: It identified the program as packed with UPX. IDA seems to have noticed too (there's this "UPX1" prefix to the left), but wasn't doing anything about it. Or maybe I didn't know to tell it to do something about it.
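The "UPX1" hint can also be checked without IDA. Below is a minimal sketch that reads the section table of a PE file and looks for section names starting with "UPX". The function names are made up for illustration, and the check is only a heuristic: section names can be renamed, so a negative result doesn't prove a file is unpacked.

```python
import struct

def pe_section_names(data):
    """Return the section names listed in a PE image's section table."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # 20-byte COFF header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics.
    fields = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    num_sections, optional_size = fields[1], fields[5]
    table = pe_offset + 4 + 20 + optional_size
    names = []
    for i in range(num_sections):
        # Each section header is 40 bytes; the first 8 are the name.
        raw = data[table + 40 * i:table + 40 * i + 8]
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names

def looks_upx_packed(data):
    """Heuristic: any section name starting with 'UPX'."""
    return any(name.startswith("UPX") for name in pe_section_names(data))
```

Usage would be something like `looks_upx_packed(open("ZG.exe", "rb").read())`.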

As the Wikipedia article tells us, UPX is available for Linux and comes with a decompressor, so we can use

upx -d ZG.exe
and try IDA again. This time around, analysis takes much longer:

Hey, better. There's the "assembler code in boxes connected with arrows" UI I've seen other people use. And also the strings window seems to be usefully filled.

We know that the program somehow gets data from the USB and then decodes this into CO₂ readings to display. So the obvious point of attack would seem to be to find the place where it reads from USB and then follow to where the data is processed to get an insight into how the packets are being decoded.

Doing a search for "read" in the names window at first doesn't yield a lot of useful things. There's an "aTusbthread0", but I couldn't make sense of what it did. There are also a couple of "TComm::Read…" entries, which seem to point to an entire included serial communications library. But then finally:

Bingo! These seem to be strings that indirectly reference functions in the "HIDApi.dll" (which comes in the same directory as the .exe file). Following the cross reference gets me to

But what does that do? [Insert "I have no idea what I'm doing" dog meme here]

After being stupid for a couple hours I decided that apparently it would "LoadLibrary" the HIDApi.dll and then call "GetProcAddress" for each of "FindUSB", "WriteUSB", and "ReadUSB" and store the resulting value (likely a pointer to the respective functions) into memory. To better keep track I renamed the memory locations: dword_4FCADC became HID_FindUSB, dword_4FCAE0 became HID_WriteUSB, and dword_4FCAE4 became HID_ReadUSB. And, bonus, the cross-reference information shows that HID_ReadUSB is only used twice: Once here to write it, and a second time, in a call:

As you can see from the graph overview window there's quite a lot going on here, and the whole thing is a giant loop. (You may not be surprised to learn that I later found out that this is the body to the aforementioned USB-Thread structure.)
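For reference, the LoadLibrary/GetProcAddress pattern from the initialisation code looks roughly like this when done from Python with ctypes. This is only a sketch of the lookup step: `resolve_exports` is a made-up helper, and the argument types, return types and calling convention of the three functions are not known at this point, so nothing here tries to call them.

```python
import ctypes

# Export names seen in the strings window of ZG.exe.
EXPORT_NAMES = ("FindUSB", "WriteUSB", "ReadUSB")

def resolve_exports(dll_path, names=EXPORT_NAMES):
    """Load a shared library and look up each export by name.

    Returns a dict mapping export name to a ctypes function object,
    much like the program stores the three pointers in
    HID_FindUSB, HID_WriteUSB and HID_ReadUSB.
    """
    # Loading the library corresponds to LoadLibrary on Windows.
    # Raises OSError if the file cannot be loaded.
    library = ctypes.CDLL(dll_path)
    # Attribute access performs the symbol lookup (GetProcAddress on
    # Windows) and raises AttributeError for a missing export.
    return {name: getattr(library, name) for name in names}
```

On Windows this would be called as `resolve_exports("HIDApi.dll")`, assuming the Python interpreter has the same bitness as the DLL.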

I'm skipping a couple of steps here: For a time I was distracted by byte_4FCAAE, which seems to decide whether a read or a write on the USB will happen. I also tried out the integrated debugger, single-stepping the program from the return of the HID_ReadUSB call onwards, and then found:

That box there in the middle, it's reading single bytes from [ecx+1], [ecx+2] and storing them at byte_4FCAB1 and following. On a hunch, let's call them usbread0, usbread1 etc. and see where they're used. First up:

Now, that's strange. It sums up the first 3 bytes and compares the result to the 4th byte. That matches how the checksum is computed according to the protocol documentation we have. If the checksum matches, it calls sub_4031F4 (which happens to be next in our list of cross references to look at) and … holy shit, we hit the mother lode:

This code stores (usbread1<<8) | usbread2 into [ebp+var_2] and then does a mighty big case comparison on usbread0 (stored in ecx), starting with checking if it's less than, equal to, or greater than 5A. This obviously is the protocol parser. And it matches the protocol documentation we have (though it knows about more opcodes than just 50 and 42).
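Put together, the checksum test and the value extraction seen in the disassembly can be sketched in a few lines of Python. The function name `parse_packet` is made up, and the 8-bit wrap-around of the sum is an assumption (the comparison is against a single byte); what each opcode means is deliberately left out.

```python
def parse_packet(packet):
    """Check and split one already-decoded packet.

    Returns (opcode, value), or None if the checksum does not match.
    """
    if len(packet) < 4:
        raise ValueError("need at least 4 bytes")
    usbread0, usbread1, usbread2, usbread3 = packet[:4]
    # Sum of the first three bytes, compared against the fourth.
    # Assumption: the sum wraps around at 8 bits.
    if (usbread0 + usbread1 + usbread2) & 0xFF != usbread3:
        return None
    # 16-bit big-endian value, as stored into [ebp+var_2].
    value = (usbread1 << 8) | usbread2
    # usbread0 is the opcode that the big case comparison dispatches on.
    return usbread0, value
```

For example, the bytes 50 02 58 AA (hex) pass the check, since 50 + 02 + 58 = AA, and yield opcode 0x50 with value 0x0258.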

Verification in the debugger shows that, yes, the output data of HID_ReadUSB really contains the protocol as parsed here. So whatever has been done to the USB traffic must have happened earlier, in the HIDApi.dll. That will be what we load into IDA next.