This is a project that ports the Teachable Machine project to the K210 platform. Instead of running in a browser, it runs on a piece of low-cost hardware.
Just like the original Teachable Machine project, no coding is needed to play with this project. If you want to change the sample images and the classes to recognize, all you need to do is put the micro SD card into a computer and drag and delete a few files. Training will happen the next time you power on the board, and the project will recognize the new objects.
All the code in this project is open source and written in Python. It is easy to modify the project, change the output, or connect lights, motors, etc. The hardware can be easily embedded into your own project.
You will need a Sipeed M1 Dock Kit and a micro SD card. Note that there may be compatibility issues with the micro SD card: you do need a FAT-formatted card. I just used a cheap 8 GB one and it worked.
The next step is to upgrade the MaixPy firmware. When I wrote this instruction on May 1, 2020, the official build of MaixPy had a bug in the ulab module. There is a PR fixing the bug, but it has not been merged yet. You can use firmwareWithWorkingulab.bin in the testedFirmware folder of this repo, or you may use the official minimum_with_ide_support release in the future once the bug is fixed.
Then you can copy all the files in the sampleSDfiles folder of this repo onto an empty micro SD card. MaixPy will automatically run main.py on the card. mbnet75_noact.kmodel is a pre-trained MobileNet model, and tm_apple, tm_banana, and tm_empty contain sample images of the different categories.
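For reference, after copying, the root of the card should look roughly like this (the sample image filenames inside the tm_ folders can be anything):

```
/ (SD card root)
├── main.py                 script that MaixPy runs automatically on boot
├── mbnet75_noact.kmodel    pre-trained MobileNet model
├── tm_apple/               sample images of apples
├── tm_banana/              sample images of bananas
└── tm_empty/               sample images of the empty background
```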
Now, with the board unpowered, insert the micro SD card into the Sipeed M1 Dock Kit. When you power on the board, it will train on all the sample images and generate tm_parameter.bin and tm_labels.txt on the card to save the training result. After training, point the camera at a real apple, a real banana, or pictures of them. The program will try to predict whether the image is an apple, a banana, or empty.
What if you want to change the sample images and classes? Let's say you want to distinguish a watermelon from an ear of corn (2 classes this time). You can remove the tm_apple, tm_banana, and tm_empty folders and create tm_watermelon and tm_corn folders. Then put a few 224x224 color JPG sample files into those 2 folders (see the resize sketch below if your photos are larger). Finally, remove tm_parameter.bin and tm_labels.txt, and the training process will run again the next time you power up. Then the little machine will start to distinguish a watermelon from an ear of corn.
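If your photos are not already 224x224, a small script on your computer can batch-resize them. This is an optional sketch, not part of this repo; it assumes you have Pillow installed, and the folder names are just examples:

```python
# Batch-resize photos to 224x224 JPEGs for use as sample images.
# Requires Pillow: pip install Pillow
import os
from PIL import Image

SRC = "raw_photos/watermelon"   # example input folder on your computer
DST = "tm_watermelon"           # class folder to copy onto the SD card

os.makedirs(DST, exist_ok=True)
for name in os.listdir(SRC):
    img = Image.open(os.path.join(SRC, name)).convert("RGB")
    img = img.resize((224, 224))
    base = os.path.splitext(name)[0]
    img.save(os.path.join(DST, base + ".jpg"), "JPEG")
```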
There is also a cam.py file; you may copy the content of cam.py over main.py to replace it. When it is running, you can press the boot button on the Sipeed M1 Dock to take pictures. Those pictures can be used as sample images.
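If you are curious how such a capture loop works, here is a minimal sketch (not the actual cam.py from this repo). It assumes the boot button is on IO16, which is where it sits on the Sipeed M1 Dock:

```python
# Minimal capture sketch: press the boot button to save a frame to the SD card.
# Assumes MaixPy firmware and the boot button wired to IO16 (Sipeed M1 Dock).
import sensor, lcd, time
from Maix import GPIO
from fpioa_manager import fm

fm.register(16, fm.fpioa.GPIOHS0)           # map boot button pin to GPIOHS0
key = GPIO(GPIO.GPIOHS0, GPIO.IN)

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_windowing((224, 224))            # crop frames to the model's input size
sensor.run(1)
lcd.init()

count = 0
while True:
    img = sensor.snapshot()
    lcd.display(img)
    if key.value() == 0:                    # button pressed (active low)
        img.save("/sd/sample_%d.jpg" % count)
        count += 1
        time.sleep_ms(500)                  # crude debounce
```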
2. What does the Python code do?
Here I'll briefly summarize how Teachable Machine (V1) works. Rather than training a new artificial neural network on your samples, Teachable Machine uses a pre-trained MobileNet model that takes a 224x224 color image and generates a probability for each of its 1000 classes.
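On the K210, this pre-trained model runs on the KPU, the chip's neural network accelerator. Below is a rough sketch of how MaixPy loads and runs such a kmodel; the filename follows this repo, but treat the snippet as illustrative rather than the repo's exact code:

```python
# Illustrative: run the pre-trained MobileNet on one camera frame via the KPU.
import sensor
import KPU as kpu

task = kpu.load("/sd/mbnet75_noact.kmodel")  # pre-trained MobileNet

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_windowing((224, 224))             # model expects 224x224 input
sensor.run(1)

img = sensor.snapshot()
fmap = kpu.forward(task, img)                # run inference on the KPU
plist = fmap[:]                              # one value per model class
```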
In the training stage, Teachable Machine runs the MobileNet model on every sample image you provide and gets a probability list for each image. In the prediction stage, it runs the MobileNet model on the input image and gets its probability list. Then the machine compares the probability list of the unknown image with the lists of all sample images and sees which class has the most similar lists.
Back to the Python script in this project: the code first checks whether tm_parameter.bin and tm_labels.txt exist on the micro SD card. If both files exist, the training process is skipped.
Otherwise, the script checks all folders in the card's root folder whose names start with "tm_". Each folder contains the samples of one class. The script uses the folder name as the class name and gets a normalized prediction vector for each image. The result is written to tm_parameter.bin and tm_labels.txt, roughly as outlined in the sketch below.
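In outline, the training pass looks something like this; the helpers get_feature_vector() and save_parameters() are hypothetical stand-ins for the repo's actual functions:

```python
# Sketch of the training pass: scan tm_* folders, run the model on each
# sample, and persist the normalized vectors plus their class labels.
# get_feature_vector() and save_parameters() are hypothetical helpers.
import os

vectors, labels = [], []
for entry in os.listdir("/sd"):
    # folders only: skip files like tm_parameter.bin and tm_labels.txt
    if entry.startswith("tm_") and "." not in entry:
        for fname in os.listdir("/sd/" + entry):
            vec = get_feature_vector("/sd/%s/%s" % (entry, fname))
            vectors.append(vec)
            labels.append(entry)             # folder name used as class label

save_parameters("/sd/tm_parameter.bin", vectors)
with open("/sd/tm_labels.txt", "w") as f:
    f.write("\n".join(labels))
```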
Then the script loads the data from tm_parameter.bin and tm_labels.txt and starts capturing from the camera. For each camera frame, the script gets the normalized prediction vector of the image, then uses a dot product to calculate the similarity between the camera image and each sample image. Finally, the 5 largest results are picked, and the predicted class is the one best represented among them.
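The comparison step is essentially a k-nearest-neighbors vote with k=5: because all vectors are normalized to unit length, the dot product equals cosine similarity. A plain-Python sketch of the idea (not the repo's exact code):

```python
# k-NN classification sketch: score the camera vector against every stored
# sample vector, take the 5 best matches, and vote on the class label.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def classify(camera_vec, sample_vecs, sample_labels, k=5):
    # For unit-length vectors, the dot product equals the cosine similarity.
    scores = [(dot(camera_vec, v), lbl)
              for v, lbl in zip(sample_vecs, sample_labels)]
    top = sorted(scores, reverse=True)[:k]   # the k most similar samples
    votes = {}
    for _, lbl in top:
        votes[lbl] = votes.get(lbl, 0) + 1
    return max(votes, key=votes.get)         # best-represented class wins
```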
3. Resources for further development
Frankly speaking, there isn't much support for the K210 chip or its environment. Here are a few websites I used while developing this project.