Business Constraint

Technologies Used

Let's Build It

The goal of this tutorial is to demonstrate how you can easily build a compact ML model to solve a binary classification task in which the positive class means that the problem in the truck is due to a fault in the APS while the negative class means otherwise.

In our case, we utilize the dataset made using readings taken from Scania Trucks in their daily use (collected and provided by Scania themselves). The names of all the features are anonymized due to proprietary reasons. The dataset for this case study can be found here: https://archive.ics.uci.edu/ml/datasets/APS+Failure+at+Scania+Trucks

The experiment will be conducted on a $4 MCU, with no cloud computing carbon footprints :)

Dataset Description

The dataset is divided into two parts, a train set, and a test set. The train set contains 60, 000 rows while the test set contains 16, 000 rows. There are 171 columns in the dataset, one of them is the class label of the datapoint, resulting in 170 features for each data point.

Step 1: Creating a New Solution and Uploading the Dataset on the Neuton TinyML Platform

Once you are signed in to your Neuton account, you should have a Solutions Home page, click on Add New Solution button.

Once the solution is created, as shown above, proceed to dataset uploading (keep in mind that the currently supported format is CSV only).

Select the target variable or the output you want for each prediction. In this case, we have class as Output Variable: 0 for 'negative' and 1 for 'positive'

Step 2: Model Training and Parameters

Since we are going to embed the model onto a tiny MCU, we need to set the parameters accordingly. The Raspberry Pico can run 32-bit operations and set normalization type to Unique Scale for Each Feature

Click start training, it might take longer to train since the dataset is huge, for me, it took about ~6 hours. In the meantime, you can check out Exploratory Data Analysis generated once the data processing is complete, check the below video:

The target metric for me was: AUC 0.987415 and the trained model had the following characteristics:

Number of coefficients = 278, File Size for Embedding = 3.074 Kb. That's super cool!

Step 3: Prediction and Embedding on Raspberry Pico

On the Neuton ai platform, click on the Prediction tab, and click on the Download button next to Model for Embedding, this would be the model library file that we are going to use for our device.

Once you have downloaded the model files, it's time to add our custom functions and actions. I am using Arduino IDE to program Raspberry Pico

Setting up Arduino IDE for Raspberry Pico:

I used Ubuntu for this tutorial, but the same instructions should work for other Debian based distributions such as Raspberry Pi OS.

1. Open a terminal and use wget to download the official Pico setup script.

$ wget https://raw.githubusercontent.com/raspberrypi/pico-setup/master/pico_setup.sh

2. In the same terminal modify the downloaded file so that it is executable.

$ chmod +x pico_setup.sh

3. Run pico_setup.sh to start the installation process. Enter your sudo password if prompted.

$ ./pico_setup.sh

4. Download the Arduino IDE and install it on your machine.

5. Open a terminal and add your user to the group “dialout.” This group can communicate with devices such as the Arduino. Using “$USER” will automatically use your username.

$ sudo usermod -a -G dialout “$USER”

6. Log out or reboot your computer for the changes to take effect.

7. Open the Arduino application and go to File >> Preferences.

8. In the additional boards' manager add this line and click OK.

https://github.com/earlephilhower/arduino-pico/releases/download/global/package_rp2040_index.json

9. Go to Tools >> Board >> Boards Manager.

10. Type “pico” in the search box and then install the Raspberry Pi Pico / RP2040 board. This will trigger another large download, approximately 300MB in size.

Note: Since we are going to make classification on the test dataset, we will use the CSV utility provided by Neuton to run inference on data sent to the MCU via USB.

Here is our project directory,

user@desktop:~/Desktop/APS_Failure_detection$ tree
.
├── application.c
├── application.h
├── APS_Failure_detection.ino
├── checksum.c
├── checksum.h
├── model
│   └── model.h
├── neuton.c
├── neuton.h
├── parser.c
├── parser.h
├── protocol.h
├── StatFunctions.c
└── StatFunctions.h

1 directory, 13 files

Checksum, parser program files are for generating handshake with the CSV serial utility tool and sending column data to the Raspberry Pico for inference.

[Secrettip: If you train a similar binary classification model, just replace the model.h file and modify the *.ino file accordingly to run running inference on the CSV dataset via USB serial] See pin connection below

Understanding the code part in APS_Failure_detection.ino file, we set different callbacks for monitoring CPU, time, and memory usage used while inferencing.

void setup() {  Serial.begin(230400);  while (!Serial);
  pinMode(LED_RED, OUTPUT);  pinMode(LED_BLUE, OUTPUT);  pinMode(LED_GREEN, OUTPUT);  digitalWrite(LED_RED, LOW);  digitalWrite(LED_BLUE, LOW);  digitalWrite(LED_GREEN, LOW);
  callbacks.send_data = send_data;  callbacks.on_dataset_sample = on_dataset_sample;  callbacks.get_cpu_freq = get_cpu_freq;  callbacks.get_time_report = get_time_report;
  init_failed = app_init(&callbacks);
}

The real magic happens here callbacks.on_dataset_sample=on_dataset_sample

static float* on_dataset_sample(float* inputs)
{  if (neuton_model_set_inputs(inputs) == 0)  {    uint16_t index;    float* outputs;
    uint64_t start = micros();    if (neuton_model_run_inference(&index, &outputs) == 0)    {      uint64_t stop = micros();
      uint64_t inference_time = stop - start;      if (inference_time > max_time)        max_time = inference_time;      if (inference_time < min_time)        min_time = inference_time;
      static uint64_t nInferences = 0;      if (nInferences++ == 0)      {        avg_time = inference_time;      }      else      {        avg_time = (avg_time * nInferences + inference_time) / (nInferences + 1);      }  // add your functions to respond to events based upon detection     switch (index)      {      case 0:        //Serial.println("0: No Failure");        digitalWrite(LED_GREEN, HIGH);        break;
      case 1:        //Serial.println("1: APS Failure Detected");        digitalWrite(LED_RED, HIGH);        break;
      case 2:        //Serial.println("2: Unknown");        digitalWrite(LED_BLUE, HIGH);        break;
      default:        break;      }
      return outputs;    }  }
  return NULL;
}

Once the input variables are ready, neuton_model_run_inference(&index, &outputs) is called which runs inference and returns outputs.

Installing CSV dataset Uploading Utility (Currently works on Linux and macOS only)

# For Ubuntu
$ sudo apt install libuv1-dev gengetopt
# For macOS
$ brew install libuv gengetopt
$ git clone https://github.com/Neuton-tinyML/dataset-uploader.git
$ cd dataset-uploader
$ make

Once it's done, you can try running the help command, it's should be similar to shown below

user@desktop:~/dataset-uploader$ ./uploader -h

Usage: uploader [OPTION]...
Tool for upload CSV file MCU  -h, --help                Print help and exit  -V, --version             Print version and exit  -i, --interface=STRING    interface  (possible values="udp", "serial"                              default=`serial')  -d, --dataset=STRING      Dataset file  (default=`./dataset.csv')  -l, --listen-port=INT     Listen port  (default=`50000')  -p, --send-port=INT       Send port  (default=`50005')  -s, --serial-port=STRING  Serial port device  (default=`/dev/ttyACM0')  -b, --baud-rate=INT       Baud rate  (possible values="9600", "115200",                              "230400" default=`230400')      --pause=INT           Pause before start  (default=`0')

Step 4: Running inference on Raspberry Pico

Upload the program on the Raspberry Pico,

Once uploaded and running, open a new terminal and run this command:

$ ./uploader -s /dev/ttyACM0 -b 230400 -d /home/vil/Desktop/aps_failures_test.csv

Green Light means no APS failure, Red Light means APS failure based upon the CSV test dataset received via Serial.

Green Light means no APS failure, Red Light means APS failure based upon the CSV test dataset received via Serial.

Green Light means no APS failure, Red Light means APS failure based upon the CSV test dataset received via Serial.

The inference has started running, once it is completed for the whole CSV dataset it will print a full summary.

>> Request performace report
Resource report:       CPU freq: 125000000    Flash usage: 2785
RAM usage total: 821      RAM usage: 821    UART buffer: 694

Performance report:
Sample calc time, avg: 1183.0 us
Sample calc time, min: 1182.0 us
Sample calc time, max: 1346.0 us

I also did a comparison with Web Prediction on the Neuton TinyML platform and the results were similar.Plus, I tried to build the same model with TensorFlow and TensorFlow Lite. My model built with Neuton TinyML turned out to be 14.3% better in terms of AUC and 9.7 times smaller in terms of model size than the one built with TF Lite. Speaking of the number of coefficients, TensorFlow's model has 7, 060 coefficients, while Neuton's model has only 278 coefficients (which is 25.4 times smaller!).

So, the resultant model footprint and inference time are as follows:

Isn't it amazing that Raspberry Pico is able to perform tasks that are otherwise handled using high-performance machines on the cloud?

Conclusion

This tutorial is vivid proof that you don’t need to be a data scientist to rapidly build super compact ML models to proactively solve practical challenges. And, most importantly, the implementation of such solutions using tinyML, which saves lots of money and resources, doesn’t require high costs or efforts, but only a free no-code tool and a super cheap MCU!

Stay tuned for more exciting tutorials and don't forget to sign for a free Neuton AI account :)