Reverse engineering GBoard apk to learn how to read the models

A project log for Android offline speech recognition natively on PC

Porting the Android on-device speech recognition found in GBoard to TensorFlow Lite or LWTNN

biemster • 03/15/2019 at 10:12 • 1 Comment

This is my first endeavor in reverse engineering Android APKs, so please comment below if you have any ideas for getting more information out of this one. I used the tool 'apktool', which gave me a directory full of human-readable files. Most of them are 'smali' files, which I had never heard of before.

They look to me like some kind of pseudocode (they are an assembly-like listing of the app's Dalvik bytecode), but they are still quite readable.

When I started grepping through those files again, searching for keywords like "ondevice" and "recognizer" and for the filenames found in the zip file containing the models, I found the following mention of "dictation" in smali/gpf.smali:

smali/gpf.smali:    const-string v7, "dictation"
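For anyone who wants to repeat this search without a shell, here is a minimal Python sketch of the same idea. It assumes the apktool output sits in a local directory; the function name `grep_smali` and the default keyword list are my own choices for illustration, not anything taken from the APK.

```python
import re
from pathlib import Path

# Keywords mentioned in this log; extend with the model filenames from the zip.
KEYWORDS = ("ondevice", "recognizer", "dictation")


def grep_smali(root, keywords=KEYWORDS):
    """Yield (path, line_number, line) for every .smali line containing a keyword."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    for path in sorted(Path(root).rglob("*.smali")):
        # Smali is plain text; replace undecodable bytes instead of failing.
        with open(path, encoding="utf-8", errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if pattern.search(line):
                    yield path, lineno, line.rstrip("\n")
```

Looping over `grep_smali("path/to/apktool/output")` and printing each tuple gives output in the same spirit as the grep hit above, with a line number added.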

Opening this file in an editor revealed a const-string "config" very close by, which strengthens my suspicion that the app reads the "dictation.config" file to learn how to read the rest of the files in the package. This is promising: I then don't have to figure that out myself, and if a future update comes along with better models or different languages, I just need to load the new dictation.config!
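The "which string constants sit close to this one" check can also be done programmatically. The sketch below is only an illustration: the helper name `const_strings_near` and the 20-line window are arbitrary assumptions, and lines that are close together in a smali file are a hint, not proof, that the two strings are used together.

```python
import re

# Matches smali lines such as:  const-string v7, "dictation"
# (const-string/jumbo is the variant used for large string-table indices).
CONST_STRING = re.compile(r'^\s*const-string(?:/jumbo)?\s+\w+,\s*"(.*)"\s*$')


def const_strings_near(smali_text, target, window=20):
    """Return (line_number, literal) for string constants within `window`
    lines of any const-string whose literal equals `target`."""
    literals = []  # (zero-based line index, literal)
    for i, line in enumerate(smali_text.splitlines()):
        match = CONST_STRING.match(line)
        if match:
            literals.append((i, match.group(1)))
    hits = [i for i, lit in literals if lit == target]
    return [
        (i + 1, lit)
        for i, lit in literals
        if lit != target and any(abs(i - h) <= window for h in hits)
    ]
```

Calling it with the contents of smali/gpf.smali and the target "dictation" should list "config" among the neighbours, if the two constants are as close together as they looked in the editor.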

Next up is better understanding the smali files, to figure out how this dictation.config is read, and how it (hopefully) constructs TensorFlow objects from it.

UPDATE: The dictation.config seems to be a binary protobuf file, which can be decoded with the following command:

$ protoc --decode_raw < dictation.config

The output I got is still highly cryptic, but it's progress nonetheless!
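To get a feeling for why the raw output is so cryptic, it helps to see what decoding without a schema involves. The protobuf wire format stores only field numbers and wire types, never field names or enum labels. Below is a minimal sketch of a one-level decoder. It is not how protoc is implemented, it does not recurse into nested messages (protoc does try to), and it does not validate truncated input; the function names are my own.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:               # high bit clear: last byte
            return result, pos
        shift += 7


def decode_raw(buf):
    """Return [(field_number, wire_type, value), ...] for one message level."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint (ints, bools, enums)
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit, kept as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (strings, bytes, sub-messages)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit, kept as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:                 # deprecated group markers are not handled here
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields
```

For example, `decode_raw(bytes.fromhex("089601"))` returns `[(1, 0, 150)]`: field 1 is a varint with value 150, but nothing in the bytes says what field 1 means. That is exactly the information a matching .proto or ascii_proto file has to supply.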

UPDATE 2: I've used another {dictation.config, dictation.ascii_proto} pair I found somewhere to fill in most of the enums found in the decoded config file. The resulting ascii_proto is uploaded in the files section, and it is a lot more readable now. The next step is to use this config to recreate the TensorFlow graph, which I will report on in a new log.

Discussions

HackNoob wrote 12/26/2020 at 07:44

Hello @biemster, it is great that you are looking into it. 

Gboard's faster offline voice typing is revolutionary in some ways: it is very fast while being very accurate. No other voice typing solution in my experience (including paid ones and other Google voice typing solutions, such as the ones in Google Docs or on Chromebooks) performs as well when looking at response time and accuracy together.

I was wondering: do you think there is a way to enable the faster offline voice typing model (85 MB for US English) on other Android devices? At the moment it works only on flagship phones and not on most other Android devices. For example, it works on the Samsung S10 but not on the Note10 or the S7 tab.
