
Introduction

A project log for Discrete component object recognition

Inspired by a recent paper, this project aims to develop a system that can recognize numbers from the MNIST database without a microcontroller

ciplionej 03/29/2020 at 12:45

Introduction

The typical machine vision systems I'm familiar with need to encode the camera sensor data into a protocol, transfer it from the camera to a microchip, decode it, and finally feed it into a pre-trained model evaluated on an IC, which produces the result.

The Nature paper describes an array of sensors that can be trained to recognize simple letters, i.e. it does the interpretation in the sensor array itself, forgoing most of the steps described above. The sensor array described in the paper is a complex setup well suited to a research institute but not so easy for a hacker to reproduce. At least for me to reproduce.

This was, for me, ML on the very edge, literally. I thought that maybe something similar could be achieved with a simpler setup, using off-the-shelf parts, while still forgoing some of the steps typically needed for a machine-vision system.

My take on this was to try and make a prototype machine vision system that could identify images using discrete components, by training a system based on decision trees. The sensors would be an array of photoresistors, and the decision trees would be built using voltage comparators.

The training would all be carried out on a PC and the resulting decision tree implemented on an array of voltage comparators. Each comparator's switching point would be trimmed by pairing resistors with the optical sensors to form voltage dividers, which in turn would trigger the selection process.
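As a quick sanity check on that idea, the voltage each comparator would see can be estimated in a few lines of R. This is only a sketch: the supply voltage, the fixed resistor and the LDR resistances below are illustrative assumptions, not measured parts.

```{r}
# Voltage divider: fixed resistor on top, LDR to ground.
# All component values are illustrative assumptions.
vcc   <- 5                            # supply voltage, volts
r_fix <- 10e3                         # fixed resistor, ohms
r_ldr <- c(dark = 100e3, lit = 5e3)   # assumed LDR resistance, ohms

# Divider output seen by the comparator input
v_out <- vcc * r_ldr / (r_ldr + r_fix)

# The comparator fires when the divider crosses its reference voltage,
# which implements one split of the decision tree
v_ref <- 2.5
v_out > v_ref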

There is a good possibility that this has been done in the past. If it has, I could not find anything similar.

Very quickly I realized two important things by looking at the dataset:

   1. Some photoresistors' signals would be ignored most of the time, i.e. my array would not be much of an array after all: it would be missing a few photoresistors.
   2. Even with some photoresistors missing, a simple dataset to analyse, e.g. the MNIST digits, would still need an array of a few hundred photoresistors. I'm a father of 3 small kids with a full-time job; this project quickly became impossible.

Then something came to mind: what if I could simplify the pictures by averaging pixels? Would a lower-resolution picture reduce the efficacy of the decision tree?

The first objective was to develop some code to test this. All code was written in R using the RStudio IDE. Not extremely efficient, but a nice IDE I'm familiar with from previous forays into ML.
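As a first illustration of the averaging idea, here is a minimal sketch: each non-overlapping block of pixels is replaced by its mean. The helper name `avg_pixels` and the example block size are mine, chosen for illustration.

```{r}
# Downsample a 28x28 digit by averaging non-overlapping blocks of pixels.
# `img` is a 28x28 numeric matrix; `block` must divide 28 (e.g. 2, 4, 7, 14).
avg_pixels <- function(img, block = 2) {
  n   <- nrow(img) / block
  out <- matrix(0, n, n)
  for (i in 1:n) {
    for (j in 1:n) {
      rows <- ((i - 1) * block + 1):(i * block)
      cols <- ((j - 1) * block + 1):(j * block)
      out[i, j] <- mean(img[rows, cols])
    }
  }
  out
}

# e.g. turn a 28x28 digit matrix into a 14x14 one and plot it:
# image(avg_pixels(digit_matrix, block = 2))
```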

In order to be at the level of a commercial solution, the target accuracy should be above 75%.

First decision tree

The first decision tree was produced using the rpart package on the full-resolution 28x28 digits, trained with the first 10,000 images of the MNIST dataset. Below is a sample plot of a digit, i.e. a single record of the MNIST matrix with all 28 x 28 = 784 "pixels".

```{r}
image(avpic(2,1))
```
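The training step itself is short. Below is a sketch of it, assuming the MNIST data has been loaded into a data frame `train` with a factor column `label` plus one column per pixel, as in the gist listed in the acknowledgements.

```{r}
library(rpart)

# Classification tree on the first 10,000 MNIST digits;
# `train` is assumed to hold a factor `label` plus 784 pixel columns
fitt <- rpart(label ~ ., data = train[1:10000, ], method = "class")
```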

After training the decision tree, the results were as follows:

Decision tree:

```{r}
library(rpart.plot)
rpart.plot(fitt, extra = 103, roundint = FALSE, box.palette = "RdYlGn")
```

The confusion table below shows that a lot of numbers get misclassified, but you only get what you pay for. If a better solution is desired, a neural network or a random forest will serve you well, but at the expense of this project never seeing the light of day.

Confusion matrix

The diagonal shows how often the numbers are correctly identified. All the results outside the diagonal are misclassified digits. Not looking so good.

```{r}
table_mat
```

    0   1   2   3   4   5   6   7   8   9  
0 349   5   2  10   8   7   0   4  31   0  
1   1 346   8   7  34  19   0   8   6   7  
2  48  22 215   4  12   6  15   6  42  25  
3  23   6  33 239   6  35   2   7  47  21  
4   1  10  13   7 277   8  11   6  12  40  
5  47  10   5  28  32 155   5  10  54  14  
6  36  24  31   0  63  12 186   4  32  11  
7   1   3   4  16  34   4   0 326   9  22  
8   6  23  27   7   9  13  13   1 241  43  
9  10   9   3  46  28  12   2  52  18 205
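For reference, `table_mat` above is just a cross-tabulation of the true labels against the tree's predictions on held-out digits. A sketch, assuming a `test` data frame in the same format as `train`:

```{r}
# Predicted class for each held-out digit vs. its true label
pred_class <- predict(fitt, newdata = test, type = "class")
table_mat  <- table(test$label, pred_class)
```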

Accuracy calculator

The ratio between the sum of the diagonal elements of the confusion matrix and the sum of the whole matrix; an indicator of how good the model is at classifying the digits.

```{r}
accuracy_tune(fitt)
```

[1] 0.6352264
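The `accuracy_tune()` helper comes from the guru99 tutorial listed in the acknowledgements. If you don't have it handy, the same ratio can be computed directly from the confusion matrix:

```{r}
# Correctly classified digits over all classified digits
sum(diag(table_mat)) / sum(table_mat)
```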

Accuracy for single digits

Similar to the previous metric but focused on single digits.

```{r}
diag(table_mat) / apply(table_mat, 1, sum)
```

        0         1         2         3         4         5         6         7         8         9 
0.8389423 0.7935780 0.5443038 0.5704057 0.7194805 0.4305556 0.4661654 0.7780430 0.6292428 0.5324675 

The accuracy of the system is not particularly good, and some digits fare much better than others.

Nevertheless, something became apparent: with 3 photoresistors and 3 voltage comparators, the number 1 could be identified successfully eight out of ten times. And the whole system could be built with only thirteen photoresistors.
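That count of thirteen photoresistors falls straight out of the fitted tree: every internal split tests a single pixel, so the distinct split variables are exactly the sensors required. A sketch using the `frame` slot of fitted rpart models:

```{r}
# Split variables used by the tree; "<leaf>" marks terminal
# nodes, which test nothing
used_pixels <- setdiff(unique(as.character(fitt$frame$var)), "<leaf>")
used_pixels
length(used_pixels)  # number of photoresistors required
```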

Comparison with a Random Forest

A comparison with a Random Forest trained model yields the following result:
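For completeness, a sketch of how such a model can be trained, assuming the `randomForest` package with its default of 500 trees and the same `train`/`test` split as before:

```{r}
library(randomForest)

# Random forest on the same training digits (ntree = 500 is the default)
fittrf <- randomForest(label ~ ., data = train[1:10000, ], ntree = 500)

# Confusion matrix on the held-out set
table_matrf <- table(test$label, predict(fittrf, newdata = test))
```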

Accuracy calculator

```{r}
accuracy_tune(fittrf)
```

[1] 0.8100517

Accuracy for single digits

Similar to the previous one but focused on single digits.

```{r}
diag(table_matrf) / apply(table_matrf, 1, sum)
```

     0         1         2         3         4         5         6         7         8         9 
0.8395270 0.9442351 0.7993448 0.7636512 0.7849829 0.7664504 0.9093904 0.8381030 0.7349823 0.7013946

With this model all digits fare much better than with the decision tree, but this solution cannot possibly be implemented in the way set out in this project: there is no easy way to implement 500 trees using discrete components.

Conclusion

The first calculations look promising. Further work is needed in order to verify the feasibility of this approach and confirm whether an extremely lean and extremely fast MNIST number recognition system can be built out of discrete components.

Acknowledgement

Below is a list of the giants' shoulders I remembered to reference. Sorry if there are more that this work relied upon and did not get referenced. Please let me know if you see part of your code above unreferenced and I'll correct the list below.

* Loading the MNIST dataset onto R and training with an SVM, https://gist.github.com/primaryobjects/b0c8333834debbc15be4
* RStudio Team (2015). RStudio: Integrated Development for R. RStudio, Inc., Boston, MA. URL http://www.rstudio.com/
* The Nature paper, https://www.nature.com/articles/s41586-020-2038-x
* Progress bar, https://ryouready.wordpress.com/2009/03/16/r-monitor-function-progress-with-a-progress-bar/
* Plotting and calculations on Decision trees, https://www.guru99.com/r-decision-trees.html

