
Getting to MVP

A project log for Deep Thought

Creating an ominous terminal that responds to voice input with quotes from The Hitchhiker's Guide to the Galaxy

Richard Julian 03/06/2017 at 16:00

After designing and implementing the LEDs, I knew the rest was mostly programming: create some light effects, capture voice input, process it, and have Deep Thought respond accordingly.

Essentially, I just took the example Python scripts for the SpeechRecognition library (using its PocketSphinx integration) and the examples provided with the rpi_ws281x library for NeoPixels, and modified them to do my bidding. At this point the code is insanely janky, and it will take a little more focus to make Deep Thought's codebase extensible and sensible. I'm also using Festival text-to-speech, which, while charming, has some serious limitations. However, the British voice we found and settled on is by far one of the best I've ever heard for TTS.
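For anyone curious how those pieces fit together, here's a minimal sketch of the listen/glow/speak loop. This isn't the actual Deep Thought code: the pixel count, GPIO pin, and canned reply are placeholders, and it assumes the rpi_ws281x Python binding (the `neopixel` module its examples use), the SpeechRecognition library, and Festival installed on the Pi.

```python
import subprocess

import speech_recognition as sr
from neopixel import Adafruit_NeoPixel, Color  # rpi_ws281x Python binding

LED_COUNT = 24   # placeholder: however many pixels are in the build
LED_PIN = 18     # placeholder: GPIO 18 (PWM)

strip = Adafruit_NeoPixel(LED_COUNT, LED_PIN)
strip.begin()

def glow(color):
    """Fill every pixel with a single color."""
    for i in range(strip.numPixels()):
        strip.setPixelColor(i, color)
    strip.show()

def speak(text):
    """Pipe text into Festival's --tts mode."""
    subprocess.run(['festival', '--tts'], input=text.encode())

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)
    audio = recognizer.listen(source)

try:
    heard = recognizer.recognize_sphinx(audio)  # offline PocketSphinx decode
    glow(Color(255, 255, 255))                  # light everything up while it "answers"
    speak("The answer is forty two.")           # placeholder reply
except sr.UnknownValueError:
    glow(Color(0, 0, 0))                        # go dark if nothing was understood
```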

On the hardware side of things, probably the biggest hiccup was getting the Raspberry Pi to output audio over a USB dongle's audio jack. As I, and seemingly many other people, have learned, audio out of the 3.5mm jack + GPIO pins == craziness. There is considerable interference coming out of the speakers when you use the onboard jack, and if you play any sound the NeoPixels will respond, just not in the way anyone wants them to! So it took a couple of quick rounds of googling to get the proper configuration for USB-based audio.
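If you hit the same problem, one common approach (not necessarily the exact config we used) is simply telling ALSA to treat the USB card as the default playback device; something along these lines, where the card index 1 is an assumption you should check against `aplay -l`:

```
# /etc/asound.conf -- make the USB audio dongle the default ALSA device
# (card index 1 is an assumption; run `aplay -l` to find yours)
defaults.pcm.card 1
defaults.ctl.card 1
```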

A couple of issues still linger with Deep Thought. Most notably, my NeoPixel code throws some sort of error from free() when the program exits (maybe a memory leak or something?). Additionally, speech recognition is tough. Our trigger phrase was "Hello Deep Thought" (or just "Hello"), and we learned big time that you have to say "Hello" in a way that both emphasizes the H and ends on a rising tone (as you'll see from my _very_ enthusiastic hello in the video). PocketSphinx is very good at understanding certain words and phrases, especially for something free and open source. Its processing time, however, is quite long! Hopefully we can fix this as we give Deep Thought more functionality.
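One possible avenue for the trigger phrase, not something the current code does, is PocketSphinx's keyword-spotting mode, which SpeechRecognition exposes through the keyword_entries argument: it listens only for the wake phrase instead of doing a full decode, which should make the trigger both more forgiving and faster. A rough sketch, with the sensitivity value as a guess to tune:

```python
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)
    audio = recognizer.listen(source)

try:
    # Only listen for the wake phrase; sensitivity runs from 0.0 to 1.0.
    heard = recognizer.recognize_sphinx(
        audio, keyword_entries=[("hello deep thought", 0.8)])
    print("Triggered:", heard)
except sr.UnknownValueError:
    print("No wake phrase detected")
```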

Either way, Cara and I had a blast working on this very odd art project! It really is pretty cool to have your own Alexa, and man, I really hope that if the personal-assistant craze continues, we see some really amazing open-source alternatives to what we have currently. It really is like IRC bots, but in real life!
