Reverse Engineering the Keyboard, Part III

To figure out how to properly implement a keyboard, we first need to learn how a modern USB keyboard works. We'll figure this out step by step.

The first question to answer is deceptively simple: how does my computer know what button I just pressed?

When you connect a USB keyboard to your computer, it identifies itself as a keyboard, which is part of a special USB device class (HID, Human Interface Device). Since keyboards follow the specification on how a keyboard is supposed to communicate, they "just work" when we plug them in - no extra drivers needed. But there is something that we do need to configure before it works as expected.

If you have ever used a different (non-American) keyboard, you will probably recognize settings like these:

The screenshot is from Linux Mint on my main computer, but there is a similar setting in Windows, OS X and so on. You'll notice that I have selected "Swedish" for my keyboard layout, even though the GUI language is set to English. That is because I have a "Swedish" keyboard connected to my computer, and it shares its layout with this keyboard:

Compared to an American keyboard layout, the number of keys is the same. But there are some extra letters, and some of the special characters have moved around. For example, look at the button to the right of 'L'. On the Swedish layout, it is labeled 'Ö', but on an American layout you would instead find ';' in the same position.

So the layouts are different, big deal! Why do we care? Why do we even need to select the proper layout?

This is a very important point: the keyboard does not know its own layout.

This means that the only real difference between a Swedish and an American keyboard is the symbols printed on the keys - the signals sent to the computer are exactly the same. How can we prove this?

We can look at what is actually happening by sniffing the USB traffic. This is very easy to do under Linux. I followed these instructions: http://wiki.wireshark.org/CaptureSetup/USB (note that you need to run Wireshark as root for USB packet sniffing).

First we need to know the USB bus our keyboard is connected to:

On the second line, we find the keyboard as device 26 on bus 2. I then captured bus 2 in Wireshark, pressed the 'Ö' key and saw this:
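
As a rough sketch, the bus number can be pulled out of an lsusb line and turned into the name of the capture interface. On Linux the usbmon interfaces are named after the bus, so bus 2 shows up as usbmon2. The sample line below is made up for illustration (the ID and device name are placeholders, only the bus and device numbers come from the text), and `usbmon_interface` is a hypothetical helper, not part of any tool.

```python
import re

# Hypothetical lsusb line: the ID and name are placeholders,
# only "Bus 002 Device 026" matches what is described in the text.
SAMPLE = "Bus 002 Device 026: ID 0000:0000 Example Keyboard"

def usbmon_interface(lsusb_line):
    """Return (bus, device, capture interface name) for one lsusb line."""
    match = re.match(r"Bus (\d+) Device (\d+):", lsusb_line)
    if match is None:
        raise ValueError("not an lsusb device line: %r" % lsusb_line)
    bus, device = int(match.group(1)), int(match.group(2))
    # usbmon interfaces are numbered after the USB bus they capture.
    return bus, device, "usbmon%d" % bus

print(usbmon_interface(SAMPLE))
```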

Look at the highlighted area near the bottom where it says '33' in hexadecimal. We can find 33 in the scancode table on this page. Somewhat confusingly, it also says that '33' means that the ';' key was pressed, even though I really pressed the key labeled 'Ö'.
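
To make the captured bytes a bit less mysterious, here is a minimal sketch of how such a packet could be picked apart. It assumes the keyboard sends the standard 8-byte HID boot keyboard report (one modifier byte, one reserved byte, then up to six key codes). The report bytes are constructed for illustration; only the 0x33 is taken from the capture described above, and `parse_boot_report` is a name invented for this sketch.

```python
def parse_boot_report(report):
    """Split an 8-byte HID boot keyboard report into its fields."""
    if len(report) != 8:
        raise ValueError("expected 8 bytes, got %d" % len(report))
    modifiers = report[0]  # bit field for Ctrl, Shift, Alt and GUI keys
    # report[1] is reserved and ignored here.
    keys = [code for code in report[2:] if code != 0]  # 0 means "no key"
    return modifiers, keys

# Illustrative report: no modifiers held, one key with code 0x33.
report = bytes([0x00, 0x00, 0x33, 0x00, 0x00, 0x00, 0x00, 0x00])
modifiers, keys = parse_boot_report(report)
print(modifiers, [hex(k) for k in keys])
```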

Does that mean that my Swedish keyboard is really sending 'English' keypresses? Let's find out! Remember that we had to tell the computer what layout we were using. This is done in a file called the keymap. Let's look at the keymap my computer uses. It's easily dumped like this:

$ dumpkeys > backup.kmap

… and just scroll down to keycode 33 where we'll find 'Ö'... oh crap. That's not right, it says keycode 33 is the letter 'F'?

Oh that's right, the '33' we saw before was hexadecimal, right? And looking at the keymap, the keycodes seem to be in decimal. So we should really be looking at keycode 51, since 0x33 is 51 in decimal … but that's not correct either!

These values match neither the Swedish nor the American layout, so there is obviously something else happening here.

The scancode sent over USB actually gets converted before we even get to think about the keyboard layout. If we look at the table on this page again, we can see that the keycodes 33 = 'f' and 51 = ',' match the values in the column labeled 'Set 1' (just remember to convert to hex first…). Now that we know what to look for, we can see that our USB scancode '0x33' maps to '0x27' in 'Set 1'. So, remembering to convert to decimal, we should really be looking for keycode 39 in our keymap. Let's see if we are correct this time.
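
The lookup can be sketched as a small table. It only holds the three keys that show up in this post (USB codes on the left, 'Set 1' scancodes on the right); a real driver carries the complete table. The names `USB_TO_SET1` and `usb_to_keycode` are made up for this sketch.

```python
# USB scancode -> Set 1 scancode, for the three keys in this post only.
USB_TO_SET1 = {
    0x09: 0x21,  # 'f' on a US layout
    0x33: 0x27,  # ';' on a US layout ('Ö' on a Swedish one)
    0x36: 0x33,  # ',' on a US layout
}

def usb_to_keycode(usb_code):
    """Return the Set 1 scancode, which the keymap lists in decimal."""
    return USB_TO_SET1[usb_code]

for usb_code in (0x09, 0x33, 0x36):
    set1 = usb_to_keycode(usb_code)
    # The same number printed twice: once in hex, once in decimal.
    print("USB 0x%02x -> Set 1 0x%02x -> keycode %d" % (usb_code, set1, set1))
```

This also shows where the two false leads came from: keycode 33 is Set 1 0x21 ('f'), and keycode 51 is Set 1 0x33 (','), neither of which is the key that was pressed.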

That looks a lot better! Now it says that keycode 39 should be interpreted as an O with diaeresis - which is just a fancy way of saying 'Ö' :)

So to summarize what happens when I type an 'Ö' on my Swedish keyboard:

The keyboard sends USB scancode '0x33' to the computer
The generic USB keyboard driver converts this scancode to a 'Set 1' scancode (0x27, which is 39 in decimal)
Keycode 39 is looked up in the keymap file and interpreted as me pressing the 'Ö' key
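
The three steps above can be sketched end to end. The dictionaries are tiny stand-ins for the driver's translation table and for two keymap files, holding just the one key from this post; `KEYMAPS` and `lookup` are hypothetical names. The point is that the same USB scancode gives a different symbol depending only on which keymap is selected.

```python
# Step 2: USB scancode -> Set 1 scancode (only the key from this post).
USB_TO_SET1 = {0x33: 0x27}

# Step 3: stand-ins for keymap files, keycode (decimal) -> symbol name.
KEYMAPS = {
    "se": {39: "odiaeresis"},
    "us": {39: "semicolon"},
}

def lookup(usb_code, layout):
    """Translate a USB scancode to a symbol under the given layout."""
    keycode = USB_TO_SET1[usb_code]
    return KEYMAPS[layout][keycode]

# Step 1: the keyboard sends 0x33, whatever is printed on the key.
print(lookup(0x33, "se"))
print(lookup(0x33, "us"))
```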

From the keymap and the table we can also see that capital letters and special characters are deduced from the modifier keys, which are sent as separate keypresses. More info on how this is configured can be found on http://linux.die.net/man/5/keymaps.
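
On the USB side, assuming the standard HID boot keyboard report, the modifier keys do not appear in the list of pressed keys at all. They travel as one bit each in the first byte of the report. A minimal sketch of decoding that byte follows; `MODIFIER_BITS` and `decode_modifiers` are names chosen for this example.

```python
# Bit 0 is the first entry, bit 7 the last.
MODIFIER_BITS = [
    "LeftCtrl", "LeftShift", "LeftAlt", "LeftGUI",
    "RightCtrl", "RightShift", "RightAlt", "RightGUI",
]

def decode_modifiers(byte):
    """Return the names of the modifier keys set in the modifier byte."""
    return [name for bit, name in enumerate(MODIFIER_BITS) if byte & (1 << bit)]

print(decode_modifiers(0x02))  # only left Shift held
print(decode_modifiers(0x42))  # left Shift and right Alt (AltGr) held
```

Whether that combination then produces a capital letter or a special character is still decided by the keymap, not by the keyboard.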

