80speak - Online DECTalk Speech Synthesis

I've always been a fan of Stephen Hawking's signature speech synthesizer, so I made a publicly available version of it!

In the last 30 years, digital speech synthesis has come a long way: devices like our smartphones or the Amazon Echo are astoundingly well-spoken. But one specific speech synthesis system has endured through the years thanks to its famous use by Professor Stephen Hawking. By today's standards the speech it produces is very robotic, but it has a certain charm to a geek like me.

Follow me in the project logs for a more in-depth look into how this system works! Job processing source code is coming soon.

A bit of background for this famous voice: Hawking has used a version of the "DECtalk" voice synthesizer for several years, and has come to be associated with the unique voice of the device. In 2011, Hawking's research assistant Sam Blackburn said Hawking still used a version of DECtalk identified on its board as the "CallText 5010", manufactured in 1988 by SpeechPlus, Inc., because he identified with it and had not heard a voice he liked better. The CallText 5010 is still listed on Hawking's site as of 2017.

Earlier this year, after many lazy searches for a ready-to-roll "Hawking Emulator", all I could find were links to his CallText box, which ran the DECtalk software. (DEC = Digital Equipment Corporation)

However, after a lot of digging, I came across original Windows binaries that DEC used to demonstrate the DECtalk platform. While these aren't 100% accurate to the voice he uses today, they're the closest thing one can find.

The real challenge was to take these binaries, figure out if they could be automated, and run them on a Linux VPS for all to use.

A week later, I registered the domain name 80speak.com, and the service was up and running! Now anybody can write in a custom phrase, and have it repeated in the famous voice of Professor Hawking. Check the project logs to see how the service works behind the scenes!

process.py

The Python speech processing script!

plain - 1.12 kB - 03/24/2017 at 21:31


  • Automating DECtalk

    Lixie Labs, 03/24/2017 at 23:36

    The next step was to automate 80speak:

    • User submits text through POST request
    • Text is received by Python Flask API, and passed to say.exe for synthesis
    • say.exe writes a .WAV file, which PyDub converts to MP3
    • Link to .mp3 is returned by API
    • Page is updated with inline mp3 player, which plays back the speech automatically
    • Speech is made available for download
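The receiving end of the first steps above can be sketched without tying it to any web framework. This is only an illustration, not the code running on 80speak: the function name `handle_send_text`, the `"mp3"` response key, the error messages, and the `synthesize` callable are all assumptions made for the example.

```python
import json

def handle_send_text(raw_body, synthesize):
    """Parse a JSON POST body and return (status_code, response_dict).

    `synthesize` is any callable that takes the message text and returns
    a link to the finished MP3 (a hypothetical interface for this sketch).
    """
    try:
        payload = json.loads(raw_body)
    except ValueError:
        # json.loads raises a ValueError subclass on malformed input
        return 400, {"error": "body is not valid JSON"}

    message = payload.get("message") if isinstance(payload, dict) else None
    if not isinstance(message, str) or not message.strip():
        return 400, {"error": "missing 'message' text"}

    # Hand the text to the synthesis step and pass its link back to the page.
    return 200, {"mp3": synthesize(message.strip())}
```

Keeping the parsing and validation in a plain function like this makes it easy to test without starting a server; a Flask route would only need to pass the request body in and turn the result into a response.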

    Before I continue, the Python/Flask API is located on port 5000 of 80speak.com, and the endpoint http://80speak.com:5000/send_text accepts a POST request containing the following JSON:

    {"message":"Your text to speak!"}

    Returned is a link to an MP3 containing your synthesized speech! Feel free to use this to make embedded systems like Raspberry Pi speak!
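For a Raspberry Pi or similar client, a request like the one described above could look roughly like this, using only the Python standard library. The endpoint URL and JSON shape come from this log; the `Content-Type` header, the timeout, the function names, and the idea that the response body can be read as plain text containing the link are assumptions, since the exact response format isn't documented here.

```python
import json
import urllib.request

# Endpoint as given in this log; it may no longer be online.
API_URL = "http://80speak.com:5000/send_text"

def build_request(message, url=API_URL):
    """Build the POST request carrying {"message": ...} as JSON."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def fetch_mp3_link(message, timeout=30):
    """Send the text and return the raw response body as text.

    Assumption: the body contains the MP3 link in some form.
    """
    with urllib.request.urlopen(build_request(message), timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()

# Example (performs a real network request):
# link = fetch_mp3_link("Your text to speak!")
```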

    Cheating Mobile Devices

    To make this process happen behind the scenes in a smooth way, I first wanted the user to be able to stay on a single page. This meant using jQuery and some clever HTML5 tricks to return an auto-playing result on the same page you requested it from.

    Doing this was pretty simple: make a POST request, wait for the MP3 link to come back, and spawn a hidden auto-playing HTML5 Audio player on the page. However, mobile devices don't allow auto-playing by default; they require the user to interact directly with an Audio object before it can be controlled with code. This is to prevent malicious websites from spawning a hundred auto-playing, silent MP3s off-screen just to eat your data/bandwidth in a small-scale DDoS attack. Or something like that.

    I found out that HTML5 audio CAN be auto-played on mobile by JS AFTER the user has already manually started a previous audio object playing. To cheat this system and allow mobile devices to participate in the same way, a split-second silent MP3 is played when the user presses the "SAY IT!" button. That button is designated as the play/pause control for the silent audio. The silent file plays quickly in the background, and the mobile browser now allows us to play the speech audio automatically after it returns! Aha! There's a good Hack of the Day here.

    Server Side

    The Flask API parses the text POSTed to it by the website, and passes it to SAY.EXE for synthesis. Because SAY.EXE is a Windows binary, it has to be run under WINE. Easy enough.

    $ wine say.exe
    Application tried to create a window, but no driver could be loaded.
    Make sure that your X server is running and that $DISPLAY is set correctly.

    Ah, okay. Needs something for a display to run. Xvfb to the rescue! We create a fake display at 1024x768 for WINE to use.

    $ Xvfb :0 -screen 0 1024x768x16 &

    This was added as an "@reboot" entry in the crontab to make sure the display is running whenever the machine starts. Now our WINE command looks like this:

    $ DISPLAY=:0.0 wine say.exe -w WAVE_FILE.wav "Our message goes here!"

    It works! WAVE_FILE.wav now contains a recording of DECtalk saying "Our message goes here!". Time to automate this from Python with os.system():

    import os
    import uuid

    from pydub import AudioSegment

    # try_rm() and shellquote() are helper functions not shown in this excerpt.

    def convert_to_speech(message):
        # Unique message ID, used to name both output files
        mid = uuid.uuid4().hex

        print("----------------------------------------")
        print("SPEECH CONVERSION\n")
        print("MID: "+mid)
        print("MESSAGE: "+message)
        print("Converting to speech...")

        wav_file = "/wav/"+mid+".wav"
        out_file = "/mp3/"+mid+".mp3"       # DOES NOT EXIST YET
        mp3_file = "/var/www/html"+out_file # DOES NOT EXIST YET

        try_rm(wav_file) # Deletes if exists
        try_rm(mp3_file)

        # Quote the message so the shell passes it to say.exe as one argument
        command = "DISPLAY=:0.0 wine say.exe -w "+wav_file+" "+shellquote(message)

        print(command)
        os.system(command)

        print("Converting to mp3...")
        sound = AudioSegment.from_file(wav_file, format="wav")
        loud = sound + 3  # Boost the volume by 3 dB
        loud.export(mp3_file, bitrate='64k', format="mp3")

        print("DONE!")
        print("----------------------------------------")

        return out_file  # Web-relative path of the MP3

    This function is called by the Flask API's "send_text" endpoint and returns the path to an MP3 version of the speech. The MP3 is loaded in the user's page as a hidden Audio object, which automatically plays the result on both desktop and mobile thanks to the audio button cheat!
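As a possible alternative to building a shell string, the same WINE invocation could be launched with `subprocess.run` and an argument list, so that user-supplied text never passes through a shell. This is a sketch of a swapped-in technique, not what 80speak runs: the names `build_say_invocation` and `run_say` and the 60-second timeout are made up for the example, while the command line and the `:0.0` display come from the log above.

```python
import os
import subprocess

def build_say_invocation(wav_file, message, display=":0.0"):
    """Return (argv, env) for running say.exe under WINE on the Xvfb display."""
    # Each list item becomes exactly one argument; no shell quoting is involved.
    argv = ["wine", "say.exe", "-w", wav_file, message]
    # Copy the current environment and point WINE at the virtual display.
    env = dict(os.environ, DISPLAY=display)
    return argv, env

def run_say(wav_file, message, timeout=60):
    """Run the synthesis; raises CalledProcessError on a non-zero exit code."""
    argv, env = build_say_invocation(wav_file, message)
    subprocess.run(argv, env=env, check=True, timeout=timeout)
```

Because the message travels as a single list element, characters such as quotes or semicolons in the user's text reach say.exe as literal text instead of being interpreted by the shell.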

  • Finding DECtalk

    Lixie Labs, 03/24/2017 at 21:55

    One of the biggest challenges in creating 80speak was sourcing software that emulates the famous voice of Professor Hawking and attaching it to a web server that the public can use. I could possibly have found original DECtalk hardware and tied that to a Raspberry Pi for control, but this would trade off speed for only a slight gain in authenticity. Unlike the original hardware, purely software-based DECtalk instances can be run in parallel, and can produce the speech much faster than it can be spoken. A hardware solution would have to receive a command, capture 1-5000 words of speech in real time, and return the recording to that specific user before processing the next phrase.
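To illustrate the parallelism point, here is a minimal sketch of running several synthesis jobs at once with a thread pool. The `synthesize` callable stands in for whatever function launches one software DECtalk instance per phrase; it, the function name `synthesize_all`, and the worker count of 4 are assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def synthesize_all(messages, synthesize, max_workers=4):
    """Run `synthesize` on every message concurrently, keeping input order."""
    # Threads are enough here: each job mostly waits on an external process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(synthesize, messages))
```

With a single piece of hardware, the equivalent would be a queue that handles one phrase at a time, in real time.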

    Finding DECtalk Software

    Finding a DECtalk demo is actually pretty easy. The most commonly distributed version is "SPEAK.EXE", a GUI that lets you have one of ten voices speak any text you write in. (The one we need is the default "Perfect Paul" voice.)

    However, this won't do. On a Windows machine you could generate macros to control the GUI automatically, but on a headless Linux server you need something a little more command line-based.

    Eventually I found a distribution of DECtalk that came with exactly what I needed: "SAY.EXE". This is a command line-only version of DECtalk, which would allow me to automate the speech generation process! If run with no arguments, it reads whatever is typed into STDIN. If provided with quoted text after the exe, it will read that aloud and then quit.

    Next up was capturing the audio to a file, but luckily the SAY executable will allow you to write the output to .WAV format directly, so this saved me a step!
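Once a .WAV file has been written, a quick sanity check is to read its length back with Python's standard `wave` module. This is a sketch that assumes the file is plain PCM audio, which is the only kind `wave` can read; the function name is made up for the example.

```python
import wave

def wav_duration_seconds(path):
    """Return the playback length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Number of audio frames divided by frames per second
        return wav.getnframes() / float(wav.getframerate())
```

Comparing this duration with the time the synthesis took is one way to see how much faster than real time the software runs.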


Discussions

Capt. Flatus O'Flaherty ☠ wrote 04/19/2017 at 15:46

Check out the Emic 2 Arduino compatible device:

https://www.parallax.com/product/30016

It uses DECtalk and one of the voices is identical to Stephen Hawking. I used it to interact with my weather station database and even sing 'God bless America' like a true patriot! ..... Yes I do know Stephen Hawking is British!

Stephen Hawking the Weather Presenter:

https://hackaday.io/project/20081-ai-weather-presenter-stephenhawking

I loved the Emic2 so much I wrote a tribute song to Stephen Hawking, who, in this song, is trying to mimic a gangster rapper:

Have fun with DECtalk - it's sooooooo beautifully retro!


flapjax wrote 04/04/2017 at 23:24

This is a great project! DECtalk is my favorite TTS. Is your API still running? I can't get any response from the endpoint.


RandyKC wrote 03/25/2017 at 00:12

You know what's involved. Any chance of this running headless on a Raspi2


Lixie Labs wrote 03/25/2017 at 00:23

Not really. :/ Unfortunately the DECtalk software is an x86 binary, so even with WINE it wouldn't work on the Raspberry Pi's ARM processor. However, if you found any embedded Linux boards that had x86-compatible processors (like the LattePanda that runs Windows 10), you could run it natively there.

Until then, the only way to do this on a Pi is to use the 80speak API detailed in the Project Logs.

