80speak - Online DECTalk Speech Synthesis

Description

In the last 30 years, digital speech synthesis has come a long way: devices like our smartphones or the Amazon Echo are astoundingly well-spoken. But one specific speech synthesis system has prevailed through the years due to its famous use by Professor Stephen Hawking. By today's standards, the speech it produces is very robotic in nature, but it has a certain charm to a geek like me.

Follow me in the project logs for a more in-depth look into how this system works! Job processing source code is coming soon.

Details

A bit of background on this famous voice: Hawking has used a version of the "DECtalk" voice synthesizer for several years, and has come to be associated with the unique voice of the device. In 2011, Hawking's research assistant Sam Blackburn said Hawking still used a version of DECtalk identified on its board as the "CallText 5010", manufactured in 1988 by SpeechPlus, Inc., because he identified with it and had not heard a voice he liked better. The CallText 5010 is still listed on Hawking's site as of 2017.

Earlier this year (after many lazy searches for a ready-to-roll "Hawking Emulator") all I could find were links to his CallText Box which ran the DECtalk software. (DEC = Digital Equipment Corporation)

However, after a lot of digging, I came across original Windows binaries that DEC used to demonstrate the DECtalk platform. While these aren't 100% accurate to the voice he uses today, they're the closest thing one can find.

The real challenge was to take these binaries, figure out if they could be automated, and run them on a Linux VPS for all to use.

A week later, I registered the domain name 80speak.com, and the service was up and running! Now anybody can write in a custom phrase, and have it repeated in the famous voice of Professor Hawking. Check the project logs to see how the service works behind the scenes!

Files

process.py

The Python speech processing script!

plain - 1.12 kB - 03/24/2017 at 21:31

Project Logs

Automating DECtalk
Lixie Labs • 03/24/2017 at 23:36 • 0 comments
The next step was to automate 80speak:
- User submits text through POST request
- Text is received by Python Flask API, and passed to say.exe for synthesis
- say.exe outputs a .WAV file, which is converted to MP3 by PyDub
- Link to .mp3 is returned by API
- Page is updated with inline mp3 player, which plays back the speech automatically
- Speech is made available for download
Before I continue, the Python/Flask API is located on port 5000 of 80speak.com, and the endpoint http://80speak.com:5000/send_text accepts a POST request containing the following JSON:
```
{"message":"Your text to speak!"}
```
Returned is a link to an MP3 containing your synthesized speech! Feel free to use this to make embedded systems like Raspberry Pi speak!
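As a rough illustration of calling the API from a script, here is a minimal Python 3 client sketch using only the standard library. The endpoint URL and JSON body come from the description above; the `Content-Type` header, the function names, and the timeout are assumptions of this sketch. The exact shape of the response is not documented here, so the body is returned as plain text for you to inspect.

```python
import json
import urllib.request

# Endpoint as given in the log above; it may not be online any more.
ENDPOINT = "http://80speak.com:5000/send_text"


def build_request(message, endpoint=ENDPOINT):
    """Build the POST request carrying {"message": ...} as JSON."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        # Assumption: the API expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_speech(message, timeout=30):
    """Send the text and return the raw response body as a string.

    Assumption: the response format (bare link or JSON) is not specified
    in the log, so no parsing is attempted here.
    """
    with urllib.request.urlopen(build_request(message), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `request_speech("Your text to speak!")` on a Raspberry Pi should hand back whatever the API returns, which according to the log contains the link to the MP3.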
Cheating Mobile Devices
To make this process happen behind the scenes in a smooth way, I first wanted the user to be able to stay on a single page. This meant using jQuery and some clever HTML5 tricks to return an auto-playing result on the same page you requested it from.
Doing this was pretty simple: make a POST request, wait for the MP3 link to come back, and spawn a hidden auto-playing HTML5 Audio player on the page. However, mobile devices don't allow auto-playing by default. They require the user to interact directly with an Audio object before it can be controlled with code. This is to prevent malicious websites from spawning a hundred auto-playing, silent MP3s off-screen just to eat your data/bandwidth in a small-scale DDoS attack. Or something like that.
I found out that HTML5 audio CAN be auto-played on mobile by JS AFTER the user has already manually started a previous audio object playing. To cheat this system and allow mobile devices to participate in the same way, a split-second silent MP3 is played when the user presses the "SAY IT!" button. That button is designated as the play/pause control for the silent audio. The silent file plays quickly in the background, and the mobile browser now allows us to play the speech audio automatically after it returns! Aha! There's a good Hack of the Day here.
Server Side
The Flask API parses the text POSTed to it by the website, and passes it to SAY.EXE for synthesis. Because SAY.EXE is a Windows binary, it has to be run under WINE. Easy enough.
```
$ wine say.exe
Application tried to create a window, but no driver could be loaded.
Make sure that your X server is running and that $DISPLAY is set correctly.
```
Ah, okay. Needs something for a display to run. Xvfb to the rescue! We create a fake display at 1024x768 for WINE to use.
```
Xvfb :0 -screen 0 1024x768x16 &
```
This was added as an "@reboot" to the crontab to make sure the display runs when we start the machine. Now our WINE command looks like this:
```
$ DISPLAY=:0.0 wine say.exe -w WAVE_FILE.wav "Our message goes here!"
```
It works! WAVE_FILE.wav now contains a recording of DECtalk saying "Our message goes here!". Time to automate with Python's os.system() function:
```
import os
import uuid

from pydub import AudioSegment

# try_rm() and shellquote() are helpers that are not shown in this log.


def convert_to_speech(message):
    # Unique message ID, used to name the output files
    mid = uuid.uuid4().hex

    print("----------------------------------------")
    print("SPEECH CONVERSION\n")
    print("MID: " + mid)
    print("MESSAGE: " + message)
    print("Converting to speech...")

    wav_file = "/wav/" + mid + ".wav"
    out_file = "/mp3/" + mid + ".mp3"        # URL path, DOES NOT EXIST YET
    mp3_file = "/var/www/html" + out_file   # path on disk, DOES NOT EXIST YET

    try_rm(wav_file)  # Deletes if exists
    try_rm(mp3_file)

    # Run say.exe under WINE on the Xvfb display, writing to wav_file
    command = "DISPLAY=:0.0 wine say.exe -w " + wav_file + " " + shellquote(message)

    print(command)
    os.system(command)

    print("Converting to mp3...")
    sound = AudioSegment.from_file(wav_file, format="wav")
    loud = sound + 3  # boost the volume by 3 dB
    loud.export(mp3_file, bitrate='64k', format="mp3")

    print("DONE!")
    print("----------------------------------------")

    # Path relative to the web root, returned to the caller
    return out_file
```
This function is called by the Flask API "send_text" endpoint, and returns an MP3 version of the speech. This mp3 is spawned in the user's page as a hidden Audio object, and automatically plays the result on both desktop and mobile thanks to the audio button cheat!
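The function above relies on two helpers, `try_rm` and `shellquote`, whose bodies are not shown in this log. The following is only a guess at what they might look like in Python 3, not the actual code from process.py: one ignores a missing file, the other quotes the message so the shell treats it as a single literal argument.

```python
import os
import shlex


def try_rm(path):
    """Delete the file at path if it exists; do nothing if it is missing."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def shellquote(text):
    """Quote text so a POSIX shell passes it on as one literal argument."""
    return shlex.quote(text)
```

Quoting matters here because the message comes straight from a public POST request and ends up inside a shell command string. Without it, a message containing `;` or backticks could run arbitrary commands on the server.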
Finding DECtalk
Lixie Labs • 03/24/2017 at 21:55 • 0 comments

One of the biggest challenges in creating 80speak was sourcing software that emulates the famous voice of Professor Hawking and attaching it to a web server that the public can use. I possibly could have found original DECtalk hardware and tied that to a Raspberry Pi for control, but this would trade off speed for only a slight authenticity gain. Unlike the original hardware, purely software-based DECtalk instances can be run in parallel, and can produce the speech much faster than it can be spoken. A hardware solution would have to receive a command, capture 1-5000 words of speech in real time, and return the recording to that specific user before processing the next phrase.
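To make the parallelism point concrete, here is a small Python sketch of how several phrases could be synthesized at once when each job is just a software process. The `synthesize` argument is a hypothetical stand-in for whatever function turns one message into an audio file, and the worker count is an arbitrary choice for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def synthesize_all(messages, synthesize, max_workers=4):
    """Run synthesize(message) for every message concurrently.

    Results come back in the same order as the input messages.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(synthesize, messages))
```

Threads are enough for this kind of job because each worker mostly waits on an external synthesizer process. A single hardware box, by contrast, would force the same phrases through one at a time.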
Finding DECtalk Software
Finding a DECtalk demo is actually pretty easy. The most commonly distributed version is "SPEAK.EXE", which is a GUI allowing you to have one of ten voices speak any text you write in. (The one we need is the default "Perfect Paul" voice)
However, this won't do. On a Windows machine you could generate macros to control the GUI automatically, but on a headless Linux server you need something a little more command line-based.
Eventually I found a distribution of DECtalk that came with exactly what I needed: "SAY.EXE". This is a command line-only version of DECtalk, which would allow me to automate the speech generation process! If run with no arguments, it reads whatever is typed into STDIN. If provided with quoted text after the exe, it will read that aloud and then quit.
Next up was capturing the audio to a file, but luckily the SAY executable will allow you to write the output to .WAV format directly, so this saved me a step!
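As a sketch of how those SAY.EXE options fit together, the Python 3 snippet below builds the command as an argument list and runs it with `subprocess`. The `-w` flag, the WINE invocation, and the `DISPLAY` value are taken from the Server Side section; the function names are made up for this example, and passing the message as a single list element (rather than a quoted shell string) is an assumption about how WINE forwards arguments.

```python
import os
import subprocess


def build_say_command(message, wav_path=None):
    """Return the argument list for one SAY.EXE run under WINE."""
    cmd = ["wine", "say.exe"]
    if wav_path is not None:
        cmd += ["-w", wav_path]  # write the speech to a .WAV file
    cmd.append(message)          # text to speak, then say.exe quits
    return cmd


def run_say(message, wav_path, display=":0.0"):
    """Run SAY.EXE on the given X display and raise if it exits with an error."""
    env = dict(os.environ, DISPLAY=display)
    return subprocess.run(build_say_command(message, wav_path), env=env, check=True)
```

For example, `build_say_command("Our message goes here!", "WAVE_FILE.wav")` yields the same command shown in the Server Side section, minus the shell quoting.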

Discussions

Capt. Flatus O'Flaherty ☠ wrote 04/19/2017 at 15:46

Check out the Emic 2 Arduino compatible device:

https://www.parallax.com/product/30016

It uses DECtalk and one of the voices is identical to Stephen Hawking. I used it to interact with my weather station database and even sing 'God bless America' like a true patriot! ..... Yes I do know Stephen Hawking is British!

Stephen Hawking the Weather Presenter:

https://hackaday.io/project/20081-ai-weather-presenter-stephenhawking

I loved the Emic2 so much I wrote a tribute song to Stephen Hawking, who, in this song, is trying to mimic a gangster rapper:

Have fun with DECtalk - it's sooooooo beautifully retro!

flapjax wrote 04/04/2017 at 23:24

This is a great project! DECtalk is my favorite TTS. Is your API still running? I can't get any response from the endpoint.

RandyKC wrote 03/25/2017 at 00:12

You know what's involved. Any chance of this running headless on a Raspi2

Lixie Labs wrote 03/25/2017 at 00:23

Not really. :/ Unfortunately the DECtalk software is an x86 binary, so even with WINE it wouldn't work on the Raspberry Pi's ARM processor. However, if you found any embedded Linux boards that had x86-compatible processors (like the LattePanda that runs Windows 10), you could run it natively there.

Until then, the only way to do this on a Pi is to use the 80speak API detailed in the Project Logs.
