First voice!

A project log for The dragon's home

A comfy, standalone DIY smart home

Xasin, 04/20/2021 at 21:00

Ahh...

Sorry to keep you guys waiting, the whole... Oh, zero followers?
Well, I'd better get started on writing project logs in the first place, otherwise no one will come here :P

Anyhow, things have been going wonderfully!
The first few weeks of the project mainly went into setting up the hardware, improving the 3D printed casing and audio quality, getting the sensors online, etc.
There... was stuff to report on, but frankly, I just wanted to have fun with it and get to a state where I can be really proud of the project!

To be fair, that state was reached when I heard how nice the system sounds with the new 3D printed casing. It includes a proper spot for the speaker to fit into that guides the sound better, and it also looks fantastic!

Looking and sounding good isn't the only thing this can do, though. It's also a wonderful listener!
It respects your privacy, too: the local microphone does not stream audio over the network, not even for keyword detection. Instead, a whistle pattern is detected locally on the ESP, and only then is the microphone audio streamed over the network to a remote server for, you guessed it, voice recognition!
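
For the curious, here is a rough idea of what a whistle gate like that could look like. This is only a sketch in Python, not the actual ESP firmware: the Goertzel tone detector, the 16 kHz sample rate, the 1000 Hz and 2000 Hz tones, the "low then high" pattern and all function names are assumptions made up for illustration.

```python
import math

SAMPLE_RATE = 16000  # assumed sample rate in Hz
FRAME_LEN = 320      # 20 ms frames at the assumed sample rate


def goertzel_power(samples, freq, sample_rate=SAMPLE_RATE):
    """Power of `samples` at `freq`, normalised by the squared frame length."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2
    return power / (len(samples) ** 2)


def classify_frame(frame, tones, threshold=0.01):
    """Return the name of the strongest tone above `threshold`, else None."""
    best, best_power = None, threshold
    for name, freq in tones.items():
        power = goertzel_power(frame, freq)
        if power > best_power:
            best, best_power = name, power
    return best


def detect_pattern(samples, pattern, tones, frame_len=FRAME_LEN):
    """True if the tone labels, with silence skipped and repeats merged,
    contain `pattern` as a contiguous run."""
    seen = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        label = classify_frame(samples[start:start + frame_len], tones)
        if label is not None and (not seen or seen[-1] != label):
            seen.append(label)
    k = len(pattern)
    return any(seen[i:i + k] == pattern for i in range(len(seen) - k + 1))


def tone(freq, n_samples, amplitude=0.5):
    """Generate a test sine wave (stand-in for real microphone samples)."""
    return [amplitude * math.sin(2.0 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(n_samples)]


TONES = {"low": 1000.0, "high": 2000.0}  # hypothetical whistle pitches
WAKE_PATTERN = ["low", "high"]           # hypothetical wake pattern

rising = tone(1000.0, 3200) + tone(2000.0, 3200)
falling = tone(2000.0, 3200) + tone(1000.0, 3200)

# Only a match would open the stream to the recognition server.
should_stream = detect_pattern(rising, WAKE_PATTERN, TONES)
```

The nice part of a gate like this is that it only needs a handful of multiplications per sample and no audio ever has to leave the device until `detect_pattern` says so. A real whistle wobbles in pitch, so an actual detector would likely need wider frequency bands and some timing checks on top.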

I finally got it working this evening, and although the recognition quality is... a little dubious, to be quite honest, it seems to work just fine for the most part!

Recognition is performed using CMU Sphinx, which is open source and can run locally. The lower recognition quality is offset by the ability to use a JSGF grammar file, which restricts the recognized words and always guarantees some parseable result, and it doesn't need any cloud services.
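
To make the grammar idea a bit more concrete, here is a small sketch. The grammar below is a made-up example, not the one this project uses, and the `parse_command` helper and its output format are likewise hypothetical. How the grammar gets loaded into Sphinx is left out on purpose.

```python
import re

# A hypothetical JSGF grammar: the recognizer can only ever output
# sentences of the form "<action> the <device>".
EXAMPLE_GRAMMAR = """\
#JSGF V1.0;
grammar home;
public <command> = <action> the <device>;
<action> = turn on | turn off;
<device> = lights | fan;
"""

# Because the grammar is so small, a single regular expression that
# mirrors it is enough to pull a recognized sentence apart again.
COMMAND_RE = re.compile(r"^(turn on|turn off) the (lights|fan)$")


def parse_command(hypothesis):
    """Turn a recognized sentence into a small command dict, or None."""
    match = COMMAND_RE.match(hypothesis.strip().lower())
    if match is None:
        return None
    action, device = match.groups()
    return {"device": device, "on": action == "turn on"}


command = parse_command("turn on the lights")
```

Since the recognizer is limited to sentences the grammar allows, the parser on the other end can stay this simple, and it never has to deal with free-form text.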

https://twitter.com/XasinTheSystem/status/1384610861436788738

Overall, things are starting to shape up, and I am very happy with it!
The ambient lights and the temperature fade it provides have been great for my sleep rhythm too, I... *checks clock*
Oh, never mind.

Well, next up: flesh out the voice recognition a bit more, build a better API to interface with it, and then build a backend that can actually do useful stuff!
