Last year I hastily made a Halloween costume: a CRT monitor shell hollowed out, converted into a wearable helmet, and given life with a simple animated eye and an espeak voice. The eye has a simple animation of looking left and right, occasionally blinking. A wireless keyboard is used to provide phrases for espeak to say. This costume got a positive reaction last year. However, the quick implementation leaves much to be desired. There are three main issues: poor ergonomics, limited LED matrix utilization, and a clunky user interface. The helmet is poorly balanced, heavy, and very difficult to put on. The LED matrix is underutilized: despite having an endless number of options for displaying symbols and expressing emotions, my helmet uses only eight frames of animation played randomly. Finally, the costume's voice is driven using a keyboard. While it works thematically, it makes operating the helmet clunky and slow. My goal is to fix these three issues.
One of the simplest shortcomings of the original costume, as outlined in stage 0, is the user interface. While I can thematically get away with a keyboard as an input method, it left my interactions a little stilted. Having a "conversation" required that I sit down with the keyboard in my lap or on a desk. To respond, I would have to look down at the keyboard (my touch typing isn't that good without feedback), carefully type out something, and then look back at the person I was talking to. My solution is to use speech-to-text to recognize what I say so espeak can repeat it.
The basis of this new interface method is Mozilla's DeepSpeech (https://github.com/touchgadget/DeepSpeech), which was designed to run on Raspberry Pis. Apart from a momentary issue with ALSA, this was easy to get running and modify for my purposes. As of now, my work in this area has been done in the speechRec branch of this project's repo (https://github.com/cogFrog/computerHead/tree/speechRec). I used the mic_vad_streaming.py example as the basis for my speechToTextToSpeech.py.
At first, I thought it would be a pretty simple adjustment. My original plan was to use pyttsx3's runAndWait() function to have espeak say the recognized speech. I expected that this would pause the collection of new audio samples, preventing the system from hearing itself and "echoing". There were two problems with this. First, the audio collection was done on a separate thread, so the blocking behavior of runAndWait() didn't prevent echoing. Second, pyttsx3 crashes when it is fed an empty string. The solution came in two parts. First, I added pause and unpause functions to the audio class, shown below.
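As a rough illustration of the pause/unpause idea, here is a minimal sketch rather than the code from the repo. The class name PausableAudio, the on_frame callback, and the buffer_queue attribute are all assumptions made for this example. The capture thread keeps running, but frames are dropped while the paused flag is set, so espeak's own output never reaches the recognizer.

```python
import queue
import threading


class PausableAudio:
    """Hypothetical stand-in for the audio class: buffers frames unless paused."""

    def __init__(self):
        self.buffer_queue = queue.Queue()   # frames waiting for the recognizer
        self._paused = threading.Event()    # safe to set/clear from any thread

    def on_frame(self, frame):
        # Called from the capture thread for every block of samples.
        # While paused, the frame is simply dropped.
        if not self._paused.is_set():
            self.buffer_queue.put(frame)

    def pause(self):
        # Stop accepting new frames and discard anything already buffered,
        # so stale audio is not recognized after speaking.
        self._paused.set()
        while True:
            try:
                self.buffer_queue.get_nowait()
            except queue.Empty:
                break

    def unpause(self):
        # Resume accepting frames from the capture thread.
        self._paused.clear()
```

Flushing the queue inside pause() is a design choice for this sketch: without it, audio captured just before the pause could still be fed to the recognizer once speaking finishes.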
Second, I used the new pause/unpause functions while double-checking that the recognized text is not an empty string. This actually works!
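text = stream_context.finishStream()
print("Recognized: %s" % text)
if len(text) != 0:  # an empty string crashes pyttsx3
    vad_audio.pause()    # assumed object name: stop listening while espeak talks
    speak(text)          # hypothetical helper wrapping pyttsx3's say() and runAndWait()
    vad_audio.unpause()  # assumed object name: resume listening afterwards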
For the hardware, only two changes were needed. First, the Raspberry Pi 3 B+ was upgraded to a Raspberry Pi 4 with 4 GB of RAM. The 3 worked, but the 4 noticeably reduced the delay between an utterance and its recognition.
The second change was to replace the keyboard with a decent microphone. The challenge here was to find a decent-quality microphone that could pick up speech at low volumes. The costume effect is diminished if you can hear the human inside talking as well as the computer! I just went to the store, bought a couple of microphones, and found that the Samson Go Mic worked well enough. A little expensive at $50, but not horrendous. The picture of the current setup is below. Cable management is going to be non-existent until I get more of the functions working, so things are going to be pretty ugly for now.
Now that the speech-to-text-to-speech system is working, it is time to redo the LED matrix control. Adding new icons and animations won't be too much work. In my previous implementation, two separate scripts were used for the speech and display controls, as the two functions were independent. However, speech recognition offers a good opportunity to display more complex content, which probably means figuring out some type of threading.
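One possible shape for that threading, sketched here under assumptions and not taken from the project: the speech side runs in its own thread and pushes events onto a queue, while the display loop polls the queue and falls back to the idle eye animation when nothing arrives. The function names, the event tuples, and the "stop" sentinel are all made up for this example, and the list of strings stands in for real LED matrix frames.

```python
import queue
import threading


def speech_worker(events, phrases):
    # Stand-in for the recognition loop: report each recognized phrase
    # to the display side, then signal that there is nothing more to come.
    for phrase in phrases:
        events.put(("speaking", phrase))
    events.put(("stop", None))


def display_loop(events, timeout=0.05):
    # Stand-in for the LED matrix loop: returns the list of "frames" shown.
    shown = []
    while True:
        try:
            kind, payload = events.get(timeout=timeout)
        except queue.Empty:
            shown.append("idle")  # no speech event: show an idle eye frame
            continue
        if kind == "stop":
            return shown
        shown.append("speaking: " + payload)
```

To try it, create a queue.Queue(), start speech_worker in a threading.Thread with that queue and a list of phrases, and call display_loop with the same queue from the main thread. Using a queue keeps the two halves decoupled, so the display never blocks on recognition.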
Last year, I made a computer head helmet. Rather than building an entirely new costume this year, I am instead improving this preexisting costume. Before I get ahead of myself, I should start by documenting this preexisting design a bit.
The frame of the costume is an old CRT display that has been gutted and cleaned. From there, I made three key modifications. First, a hole was sawed in the bottom of the CRT case, with pipe insulation around the edge for comfort. The second modification is a hard hat, which allows the costume to be worn as a helmet. I was lazy, so the hard hat was literally epoxied to the CRT case. Not clean, but it works.
The third modification is a screen. For this, an acrylic one-way mirror was cut to size and glued into place.
Electrical Construction
The electronics for this project were fairly simple, as shown below: