Detecting Clean-vs-Messy Rooms using TensorFlow Deep Learning

This guide walks through creating and using a TensorFlow machine learning model that tells a clean room from a messy one.

In our house, we integrate such a model with our home automation system to block access to the TV used for video games if the kids have not kept their rooms clean.

Note that this process could be used to create a model that recognizes any type of object or scene in images - not just a clean or messy room.

Step 1 - Collect a wide variety of training data (images of the room)

Here are some guidelines for the training data:

  • Try to collect at least 100 images of each type (clean, messy)
  • The images should be as "diverse" as possible - e.g. different types of messy, different lighting, etc.
  • Try not to duplicate very similar images in the training data
  • The more training data you feed the model, the more accurate it will become (you can probably get away with starting with a smaller set of images, and then retrain as you obtain more).

For our use case, I set my security cameras to archive a photo of each room (boys' room, kitchen, upstairs play room) at one-hour intervals. Then I went through and deleted any that looked too similar, and sorted the remaining photos for training the model.

Here's an example of the scheduled cron job that snapped the hourly photos (the "matt" column is the user to run as, so these lines are in the system crontab format):

# Get training data from security cameras for Jarvis:
0  * *  *  *  matt  wget "http://192.168.1.21/snapshot.cgi" -O /tmp/boys/$(date -u +"\%Y\%m\%d_\%H_\%M_\%S").jpg
0  * *  *  *  matt  wget "http://192.168.1.23/snapshot.cgi" -O /tmp/kitchen/$(date -u +"\%Y\%m\%d_\%H_\%M_\%S").jpg
0  * *  *  *  matt  wget "http://192.168.1.25/snapshot.cgi" -O /tmp/upstairs/$(date -u +"\%Y\%m\%d_\%H_\%M_\%S").jpg
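
If you'd rather not depend on wget, the same snapshot can be sketched in Python with only the standard library. Treat this as a rough illustration rather than the setup used above: the function names snapshot_path and save_snapshot are made up for this example, and it assumes the camera returns a JPEG from its snapshot URL, as the cron lines imply. The file name follows the same UTC timestamp pattern.

```python
import urllib.request
from datetime import datetime, timezone
from pathlib import Path


def snapshot_path(room_dir, now=None):
    """Return room_dir/<UTC timestamp>.jpg, matching the cron job's naming.

    `now` is expected to be a timezone-aware datetime; it defaults to the
    current time and is converted to UTC before formatting.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return Path(room_dir) / now.strftime("%Y%m%d_%H_%M_%S.jpg")


def save_snapshot(camera_url, room_dir, timeout=10):
    """Download one frame from camera_url into room_dir and return its path."""
    target = snapshot_path(room_dir)
    # Unlike the wget lines, this creates the folder if it is missing.
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(camera_url, timeout=timeout) as response:
        target.write_bytes(response.read())
    return target
```

Calling save_snapshot("http://192.168.1.21/snapshot.cgi", "/tmp/boys") once an hour would roughly mirror the first cron line.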

After your photos have been pruned and sorted, they should be organized in a file/folder structure as follows:

/path/to/files/room           -- parent directory
/path/to/files/room/messy/    -- messy pics
/path/to/files/room/clean/    -- clean pics

It may take several weeks of collecting photos before you're able to amass an adequate level of diversity and quantity for training an effective model.
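
To see how close you are to the "at least 100 images of each type" guideline, a small script can count the photos in each label folder. This is a minimal sketch under a few assumptions: the helper names count_images and labels_below are hypothetical, and the set of file extensions is a guess at what your cameras produce.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your camera's output.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(parent_dir):
    """Map each label subfolder (e.g. 'clean', 'messy') to its image count."""
    counts = {}
    for label_dir in sorted(Path(parent_dir).iterdir()):
        if label_dir.is_dir():
            counts[label_dir.name] = sum(
                1
                for f in label_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def labels_below(counts, minimum=100):
    """Return the labels that have fewer than `minimum` images, sorted by name."""
    return sorted(label for label, n in counts.items() if n < minimum)
```

For example, labels_below(count_images("/path/to/files/room")) lists whichever of clean and messy still needs more photos.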

Step 2 - Install TensorFlow and Download Helper Scripts

TensorFlow is easy to install using Python's pip utility. Additionally, you'll need to get two Python scripts from the TensorFlow repositories on GitHub.

$ sudo apt install python-pip
$ pip install "tensorflow>=1.7.0" --user
$ pip install "tensorflow-hub" --user
$ mkdir ~/tensorflow
$ cd ~/tensorflow
$ curl -LO https://github.com/tensorflow/tensorflow/raw/master/tensorflow/examples/label_image/label_image.py
$ curl -LO https://github.com/tensorflow/hub/raw/r0.1/examples/image_retraining/retrain.py

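If you run into errors during the steps above, searching for the exact error message on Google usually turns up a fix. Keep in mind that retrain.py was written for TensorFlow 1.x, so a much newer TensorFlow release may not run it without changes.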
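
One quick sanity check before moving on is to confirm that Python can actually find the two packages installed above. The sketch below uses only the standard library; missing_modules is a hypothetical helper name, and tensorflow_hub is the import name of the tensorflow-hub package.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example call (not executed here):
#   missing_modules(["tensorflow", "tensorflow_hub"])
# An empty list means both packages are importable.
```

This only checks that the packages can be found, not that their versions work with retrain.py.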

Step 3 - Train the Dragon Model

We can now issue the TensorFlow command that takes an existing deep learning model and 'retrains' it to recognize our images (a technique known as transfer learning).

The commands below will use the Inception V3 neural network architecture pre-trained on ImageNet:

$ cd ~/tensorflow
$ python retrain.py \
    --image_dir /path/to/files/room \
    --output_graph=rooms.pb \
    --output_labels=rooms.txt \
    --tfhub_module https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1

The training process should take roughly 15 to 60 minutes, depending on your processor. The script caches some of the data it generates (the per-image "bottleneck" values), so if you ever retrain using some of the same images, it will be faster than the first time.
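
If you end up training more than one model (say, one per room, which is my assumption here and not something the command above requires), it can help to build the retrain command from Python. This is a minimal sketch that uses only the flags shown above; retrain_command and retrain are hypothetical helper names, and it runs retrain.py with whichever interpreter is executing the script.

```python
import subprocess
import sys


def retrain_command(image_dir, name, module_url):
    """Build the retrain.py argument list using only the flags shown above."""
    return [
        sys.executable,  # assumption: same interpreter that has TensorFlow
        "retrain.py",
        "--image_dir", str(image_dir),
        f"--output_graph={name}.pb",
        f"--output_labels={name}.txt",
        "--tfhub_module", module_url,
    ]


def retrain(image_dir, name, module_url, workdir):
    """Run retrain.py from `workdir`; raises CalledProcessError on failure."""
    subprocess.run(retrain_command(image_dir, name, module_url),
                   cwd=workdir, check=True)
```

With image_dir set to /path/to/files/room and name set to rooms, the argument list matches the command above and produces rooms.pb and rooms.txt.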

The newly generated files rooms.pb and rooms.txt represent the model and labels that will be used to recognize images (rooms) as messy...
