SmartScale Video Demonstration


Project Overview 

SmartScale was developed in 10 weeks as part of the EE 327: Electronic System Design II course at Northwestern University, taught by Professor Ilya Mikhelson, PhD. The build process was organized around five milestones, spaced roughly two weeks apart:

  1. Project Plan: we outlined all of our parts and goals for the product.
  2. Initial Prototype: we demonstrated functionality of the load cell (scale) and the Tornado server.
  3. Full Prototype: we added functionality for the ESP32-CAM camera, the Google Cloud Vision API, and the USDA FoodData Central API.
  4. First Revision: we finalized the 3D-printed hardware for the device and added LED feedback.
  5. Second Revision: we added the webpage and performed final testing.


Build Process

As a brief overview of the design process: after initially outlining the data pipeline and required hardware components in the system flow chart, we began by getting the load cell up and running as a functional scale. We 3D printed the necessary plates to interface with the load cell, and after wiring it to the HX711 amplifier, we used a set of calibration weights to generate a calibration curve and thus the calibration constant needed for accurate measurements. We then interfaced this with the ESP32 via a two-wire communication protocol (clock and data) based on I2C principles.
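To derive that constant, we fit a line to (raw reading, known mass) pairs. A minimal offline sketch of such a fit in Python (the numbers below are placeholders, not our measured data):

```python
import numpy as np

# Hypothetical (raw HX711 reading, known mass) pairs from calibration weights.
raw = np.array([8421, 101530, 194640, 380860, 753300])
grams = np.array([0.0, 100.0, 200.0, 400.0, 800.0])

# Fit grams = slope * raw + intercept. The slope is the calibration
# constant; the tare (empty-scale) offset is -intercept / slope.
slope, intercept = np.polyfit(raw, grams, 1)
print(f"calibration constant: {slope:.4e} g/count")
print(f"tare offset: {-intercept / slope:.0f} counts")
```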

The next main phase of the project was developing the backend server architecture, which needed to receive weights (as floats) and images from the ESP32-CAM. These were delivered in POST requests tagged with content-specific headers. The server itself was hosted on a local PC.
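During development, the device's requests can be simulated from a PC. A minimal sketch with the requests library (the server address, /upload path, and header choices here are illustrative assumptions):

```python
import requests

SERVER = "http://192.168.1.42:8888/upload"  # local PC running Tornado

# Simulate the weight POST: a bare float in the body, tagged by its header.
requests.post(SERVER, data=b"152.4", headers={"Content-Type": "text/plain"})

# Simulate the image POST: raw JPEG bytes, as captured by the ESP32-CAM.
with open("banana.jpg", "rb") as f:
    requests.post(SERVER, data=f.read(), headers={"Content-Type": "image/jpeg"})
```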

While server development was ongoing, we worked to establish a functional image pipeline on the ESP32-CAM, covering how to capture, store, and process images, with the eventual goal of packaging them into a POST request to be sent to the server.

Once images and weights could be sent to the server, we developed a processing procedure to extract the appropriate food label and macronutrient information. We did this using Google Cloud’s Vision API, specifically its label detection method. This returns a list of candidate labels for the image, which we cross-referenced with our local food inventory. Finally, we queried the USDA FoodData Central API to gather the relevant macronutrient information.
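The cross-reference against the local inventory reduces to a simple membership check over the returned label strings. A minimal sketch (the inventory contents here are hypothetical):

```python
# Hypothetical local food inventory; the real list lives on the server.
FOOD_INVENTORY = {"banana", "apple", "orange", "bagel", "chicken breast"}

def match_label(labels):
    """Return the first Vision API label found in our inventory, else None.

    `labels` is assumed to be ordered by confidence, so the first hit
    is the best candidate.
    """
    for label in labels:
        if label.lower() in FOOD_INVENTORY:
            return label.lower()
    return None

print(match_label(["Fruit", "Banana", "Yellow"]))  # -> "banana"
```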

With all the required information held locally on the server, the last step was to populate a webpage for the user to view their information. This was accomplished by rendering an HTML template populated with the specific variables of interest: food name, protein, fat, and carbohydrate content.
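As a toy illustration of that rendering step with Tornado's template engine (this markup is a stand-in, not our actual page):

```python
from tornado.template import Template

PAGE = Template("""
<h1>{{ food_name }} ({{ weight }} g)</h1>
<ul>
  <li>Protein: {{ protein }} g</li>
  <li>Fat: {{ fat }} g</li>
  <li>Carbohydrates: {{ carbs }} g</li>
</ul>
""")

html = PAGE.generate(food_name="Banana", weight=118,
                     protein=1.3, fat=0.4, carbs=27.0)
print(html.decode())
```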


System Overview

The SmartScale system contains 10 basic components as outlined below:

  1. ESP32-CAM
  2. Scale
  3. Camera
  4. Battery Pack + Voltage Regulator
  5. Tornado Web Server
  6. Food Macro Database
  7. Cloud Image Recognition
  8. Public Webpage
  9. 3D Printing
  10. Push Button + LED Feedback System

The hardware components and the interactions among these systems are outlined in the following figures.

SmartScale CAD Rendering
SmartScale Process Flow Block Diagram

ESP32-CAM

The ESP32-CAM served as the microcontroller for this system, i.e. the brains of the entire operation. The scale, camera, push button (used to signal that the user would like to input another food item), and feedback LEDs were wired to it, and it ran code to capture a weight and take a picture of a food item, then send both pieces of information via POST requests to the Tornado server. The code was written in the Arduino language (C/C++), and we enjoyed working with it greatly.
Selected ESP32-CAM Code

Scale

The scale is the backbone of this entire product and contains a number of individual parts to make it work. The first is a 1 kg load cell purchased from Adafruit. Along with the second component, the HX711 amplifier (an analog-to-digital converter), the load cell measures the strain across it, which is converted to a digital number that can be calibrated to represent a weight. This calibration is done onboard the ESP32-CAM, using actual data we collected with known weights.

Scale Calibration Graph
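The conversion itself runs onboard in the Arduino code; the arithmetic it performs looks like the following Python sketch (constants are illustrative, matching the fit sketched under Build Process):

```python
# Constants from the calibration fit (values are illustrative).
CAL_SLOPE = 1.074e-3  # grams per raw HX711 count
TARE_OFFSET = 8421    # raw reading with an empty scale

def raw_to_grams(raw_reading: int) -> float:
    """Convert a raw HX711 count to grams using the calibration line."""
    return (raw_reading - TARE_OFFSET) * CAL_SLOPE

print(raw_to_grams(118300))  # ~118 g, about one banana
```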

Camera

The camera is used to snap a photo of the food item on the scale, which is then analyzed to determine what the food item is. We used the OV2640 camera module already onboard the ESP32-CAM microcontroller. This allowed us to combine the camera with the microcontroller, saving space and greatly streamlining the system hardware.

OV2640 Camera

Battery Pack + Voltage Regulator

To power the system, we use a 6 V battery pack and a 5 V voltage regulator. The battery pack holds four AA batteries (1.5 V each, in series) for 6 V total. The voltage regulator brings this down to the 5 V that the ESP32-CAM runs on, which in turn powers all of the other electronics (HX711 chip, LEDs, push button).

Battery Pack and Voltage Regulator

Tornado Web Server

We use Tornado as the primary backend framework for our server. With the ESP32 as the client, we send POST requests to the server carrying all the necessary information, including the object’s weight and its accompanying image. A single handler reads in the data, with each POST request tagged by a content-specific content header. This makes all the data available in the handler’s local memory, which aids analysis: we perform image recognition in the same handler by pinging Google Cloud’s Vision API, and the resulting label is then sent to the USDA FoodData Central API. Once the appropriate macronutrient information has been extracted, it is stored in separate queues that eventually populate the website.

An additional handler is dedicated to rendering the website: it takes the various items from their queues, including food name, protein, fat, and carbohydrate content, and passes them to the HTML template for eventual web page rendering. The use of queues ensures that all the macronutrient information stays correctly aligned, with no mismatch between weights, pictures, etc. A condensed sketch of both handlers follows the figure below.

Primary Server Handler
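The figure above shows our actual handler; below is a condensed sketch of the same architecture, assuming hypothetical detect_label, match_label, and lookup_macros helpers like the ones sketched elsewhere on this page:

```python
import queue

import tornado.ioloop
import tornado.web

# Hypothetical helpers, sketched in the surrounding sections.
def detect_label(jpeg_bytes): ...        # Google Cloud Vision label list
def match_label(labels): ...             # cross-reference with food inventory
def lookup_macros(label, grams): ...     # USDA FoodData Central query

# Parallel queues keep each food item's fields aligned by insertion order.
names, proteins, fats, carbs = (queue.Queue() for _ in range(4))
pending_weight = 0.0  # latest weight, paired with the next image to arrive


class UploadHandler(tornado.web.RequestHandler):
    def post(self):
        global pending_weight
        content_type = self.request.headers.get("Content-Type", "")
        if content_type == "text/plain":
            # Weight POST: the body is the float measured by the scale.
            pending_weight = float(self.request.body)
        elif content_type == "image/jpeg":
            # Image POST: recognize the food, then fetch its macros.
            label = match_label(detect_label(self.request.body))
            macros = lookup_macros(label, pending_weight)
            names.put(label)
            proteins.put(macros["protein"])
            fats.put(macros["fat"])
            carbs.put(macros["carbs"])


class PageHandler(tornado.web.RequestHandler):
    def get(self):
        # Pop one aligned entry from each queue; the four fields were
        # enqueued together, so they cannot mismatch across food items.
        self.render("index.html",
                    food_name=names.get(),
                    protein=proteins.get(),
                    fat=fats.get(),
                    carbs=carbs.get())


if __name__ == "__main__":
    app = tornado.web.Application([(r"/upload", UploadHandler),
                                   (r"/", PageHandler)])
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()
```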

Food Nutrient Database: USDA FoodData Central

The USDA’s FoodData Central database, created in 2020, represents the most comprehensive, regulated nutrition database currently publicly available. This database offers access to Standard Reference (SR) Legacy data (the primary reference database for US foods for several decades) as well as updated information gathered through the Food and Nutrient Database for Dietary Studies (FNDDS) 2017-2018.

The USDA has built an API that allows easy querying of the database to extract a wide range of nutrient information, making it an optimal choice for our system. The API is extremely easy to work with: a query is a GET request featuring the name of the food item you are searching for. The database returns over 25 descriptors for a single food item, including nutritional information well beyond the scope of what we are looking to tabulate, such as an extensive list of micronutrients and sourcing information. A serving size is also provided for each entry, and by using the weight of the food item from our scale we can calculate the number of servings contained in the food item. This is then used as a multiplication factor for the various nutrients we are looking to account for. A sketch of such a query follows the figure below.

FoodData Central Database API
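A minimal sketch of such a query with the requests library, assuming a free API key from api.data.gov; for simplicity it assumes per-100 g nutrient reporting (as with SR Legacy entries), though the serving-size calculation described above works the same way using the entry’s serving size field:

```python
import requests

API_KEY = "YOUR_FDC_API_KEY"  # free key from api.data.gov
SEARCH_URL = "https://api.nal.usda.gov/fdc/v1/foods/search"

def lookup_macros(label, grams):
    """Query FoodData Central for `label` and scale nutrients to `grams`."""
    resp = requests.get(SEARCH_URL, params={"api_key": API_KEY, "query": label})
    food = resp.json()["foods"][0]  # take the top search hit

    # SR Legacy nutrient values are reported per 100 g, so the measured
    # weight gives the scaling factor directly.
    factor = grams / 100.0
    wanted = {"Protein": "protein",
              "Total lipid (fat)": "fat",
              "Carbohydrate, by difference": "carbs"}
    macros = {}
    for nutrient in food["foodNutrients"]:
        key = wanted.get(nutrient["nutrientName"])
        if key:
            macros[key] = nutrient["value"] * factor
    return macros

print(lookup_macros("banana", 118.0))
```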

Cloud Image Recognition

After considering numerous image recognition platforms, we ultimately decided to move forward with Google Cloud’s Vision API. While not specifically geared toward food, its image recognition is quite powerful for general usage. Additionally, to meet the goal of developing a working prototype in our short timeframe, this API had a much simpler setup and implementation with our existing Tornado server framework, thanks to readily available Python libraries. The API also has a web-based user interface for initial testing, which proved very helpful compared to alternatives where no such easy trial was available.

For our purposes, we utilize the API’s label detection method, which returns a list of labels for the picture, each with a confidence score and a topicality score (how well the label fits the image in the context of the broader image environment). Pictured below is a sample image taken on the ESP32-CAM and the resulting labels after feeding it to the API. As shown, “Banana” is the top result, which matches the picture.

Moving forward, we could implement a score threshold for extracting the most appropriate label; currently, we instead maintain a known food inventory and sort through the output labels for a match. This limits us to foods we explicitly tell our system to identify, which represents a significant area for future improvement.
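A minimal sketch of a label detection call with the google-cloud-vision Python client (assuming credentials are already configured via the GOOGLE_APPLICATION_CREDENTIALS environment variable; our actual server code differs in its details):

```python
from google.cloud import vision

def detect_label(jpeg_bytes):
    """Send an ESP32-CAM JPEG to the Vision API and return its label strings."""
    client = vision.ImageAnnotatorClient()
    response = client.label_detection(image=vision.Image(content=jpeg_bytes))
    for label in response.label_annotations:
        print(f"{label.description}: score={label.score:.2f}, "
              f"topicality={label.topicality:.2f}")
    return [label.description for label in response.label_annotations]

with open("banana.jpg", "rb") as f:  # e.g. an image saved from the ESP32-CAM
    labels = detect_label(f.read())
```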

Public Webpage

The next design aspect of this project was creating a user interface where our users can actually see the nutritional information calculated in the backend. While we considered a physical display on the device itself, we figured a website would be preferred, and this was confirmed in several interviews. For those interviews we developed a very basic HTML page as an example of how the information might be presented. Ultimately, the main objective was to present the desired information in a straightforward fashion without spending too much time on more “trivial” web development to improve UI aesthetics.

Webpage

3D Printing

SmartScale comprised three 3D-printed CAD components that made up the form factor of the system. The first was the set of scale plates (shown below), which combined with the load cell to form the scale. The second was the telescoping arm that held the ESP32-CAM above the food so it could snap a photo. Finally, we 3D printed a cover for the electronics, so that the user only saw the push button and feedback LEDs.

SolidWorks CAD File of Scale Plate

Push Button + LED Feedback System

SmartScale uses a push button and two LEDs to give the user feedback. When a user wishes to input a new food item, they simply press the push button, which starts the process of weighing and photographing the food and then sending this information to the Tornado server for processing and publishing to the webpage. While this is happening, a green and a red LED tell the user exactly what the system is “thinking” at any given moment: if only the green LED is lit, the system is idling and ready for input; if only the red LED is lit, an error has occurred; if both LEDs are lit, the system is processing information.

LED and Push Button