An offline voice recognition hub: no internet connection, no Wi-Fi router, just a pair of ESP8266 boards running the ESP-NOW protocol. This is what I want in my room. The devices can be controlled either by pressing switches or by voice, with a central hub, like Alexa, responding to your commands. This article explains how it all works. The idea came to me while I was operating my lab lights with an IR remote, which you can see in one of my previous articles. To move a little further towards a smart home while keeping all the data private, I decided to keep everything offline. I already had a voice recognition module lying around from DFRobot: the Gravity: Offline Language Learning Voice Recognition sensor.

I also made a getting-started tutorial on this module, which can be seen from here. It explains the types of commands and the pre-programming the sensor requires, although the current article provides all the required information. I made a solid-state relay PCB to use this with AC mains. Big thanks to PCBWay for sponsoring the PCBs for this project! Their high-quality manufacturing and quick turnaround made this build possible.

Offline Voice recognition module:

This offline speech recognition sensor is built around an offline voice recognition chip, which can be directly used without an internet connection. It comes with 121 built-in fixed command words and supports the addition of 17 custom command words.

This small sensor can be interfaced with any microcontroller board, and different actions can then be triggered by voice. It works much like Alexa, but with a far more restricted set of commands and answers, because overall it is a small ML model running on a little microcontroller. Popular applications include smart home appliances, toys, lighting fixtures, and robotics projects, among others.

Types of commands:

I will try to keep it as simple as possible by dividing all the words into 3 sections according to the action they perform. See the block diagram given below.

1) Wake-up words: The wake-up word switches a product from standby mode to operational mode. It is the first command given to wake the module, similar to saying “Alexa”, “Hey Siri”, or “OK Google”. Here the default wake-up word is “Hello Robot”. We can also add one more wake word, which is demonstrated below as “JARVIS”.

2) Command words: Fixed command words are the designated vocabulary users speak to issue specific instructions. They are essential here because this is an offline module with no external server on the internet to process speech. Instead, these are pre-trained words stored in the module's memory; whenever one is recognized, the module outputs the corresponding ID, which can later be mapped to a specific action. There are 121 fixed command words in total, each already assigned to its own ID.

When working on a project, the built-in commands will not always cover what you need, and sometimes other commands are required. This is possible here through custom command words, and you can train the model on a total of 17 of them. See all commands from here.

3) Learning-related commands (control commands): These are the command words used to interact with the machine learning model. They initiate the learning and deletion of wake-up words and command words.
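
To make the ID idea concrete, here is a minimal sketch of how a hub could turn a recognized command ID into a relay action. It is plain C++ so it can be tried on a PC before touching the hardware. The ID values (103, 104, 5, 6), the two-relay layout, and every name in it (`decodeCommand`, `applyCommand`, `RelayState`) are assumptions made for illustration, not values taken from the module's documentation or from my final firmware, so check the IDs against the module's command list before using them.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder command IDs (assumed for illustration; verify against the
// module's command list before relying on them).
constexpr uint8_t kIdLightOn  = 103;  // assumed fixed command word
constexpr uint8_t kIdLightOff = 104;  // assumed fixed command word
constexpr uint8_t kIdFanOn    = 5;    // assumed custom command word
constexpr uint8_t kIdFanOff   = 6;    // assumed custom command word

constexpr int kRelayCount = 2;  // hypothetical: relay 0 = light, relay 1 = fan

// On/off state of every relay channel.
struct RelayState {
  bool on[kRelayCount];
};

enum class Action { None, TurnOn, TurnOff };

// What a recognized ID asks for: an action on one relay channel.
struct Command {
  Action action;
  int channel;
};

// Translate a command ID into an action. Unknown IDs map to Action::None.
Command decodeCommand(uint8_t id) {
  switch (id) {
    case kIdLightOn:  return {Action::TurnOn, 0};
    case kIdLightOff: return {Action::TurnOff, 0};
    case kIdFanOn:    return {Action::TurnOn, 1};
    case kIdFanOff:   return {Action::TurnOff, 1};
    default:          return {Action::None, 0};
  }
}

// Apply a command ID to the relay state.
// Returns true if the ID was known and the state was updated.
bool applyCommand(RelayState& state, uint8_t id) {
  const Command cmd = decodeCommand(id);
  if (cmd.action == Action::None) {
    return false;  // nothing recognized, or an ID we do not handle
  }
  assert(cmd.channel >= 0 && cmd.channel < kRelayCount);
  state.on[cmd.channel] = (cmd.action == Action::TurnOn);
  return true;
}
```

On the real hub, the ID would come from the voice module on every pass of the main loop, and the updated relay state would then be passed on to the second ESP8266. Keeping the mapping in one `switch` means adding a new custom command is a one-line change.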

What is ESP-NOW?

ESP-NOW is a wireless communication protocol based on the data-link layer, which reduces the five layers of the OSI model to only one. This way, the data need not be transmitted through the network layer, the transport layer, the session layer, the presentation layer, and the application layer. Also, there is no need for packet headers or...
