Raspi Newspaper Synthesizer/Reader

Text-to-speech synthesizer to allow blind persons to read online editions of newspapers

Reads online newspapers to blind people through TTS. Navigation (select, back, next, previous) is done through a series of buttons. Headlines are read from an RSS feed; the articles themselves are retrieved from the link embedded in the RSS feed. The HTML is cleaned up so that only the article content is synthesized.
Doug Gore's port of the Android TTS engine to the Raspi has been integrated directly with ALSA for optimum synthesis speed.
As a little extra gadget the reader speaks out the air pressure when it starts up. The current user, like many other elderly people, is sensitive to changes in the weather and likes to track the pressure.

The first prototype uses a BananaPi with three navigation buttons in a repurposed Amazon cardboard box. The Raspi B+ was running near max CPU capacity with the TTS engine, so I switched to the BPi. However, the Raspi 2 that has since been released should have enough power to handle this.

A headset turns out to produce the best audio output. Initially a speaker was installed in the box, but its sound quality, combined with TTS output that is sometimes hard to understand anyway, proved too frustrating for the test user. An inexpensive 10 Euro Philips headset from Media Markt produces the clearest sound. A rotary encoder is used to adjust the volume.
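The volume logic can be sketched in Python without hardware: a rotary encoder emits a two-bit Gray code, and each valid transition between the two channel states nudges the volume up or down. This is an illustrative sketch, not the project's actual code; on the Pi the (A, B) samples would come from two GPIO pins.

```python
# Valid Gray-code transitions of the encoder's (A, B) channels.
# Any pair not in the table is treated as contact bounce and ignored.
_TRANSITIONS = {
    (0, 0, 0, 1): +1, (0, 1, 1, 1): +1, (1, 1, 1, 0): +1, (1, 0, 0, 0): +1,
    (0, 0, 1, 0): -1, (1, 0, 1, 1): -1, (1, 1, 0, 1): -1, (0, 1, 0, 0): -1,
}

def decode(old, new):
    """Return +1 (clockwise), -1 (counter-clockwise) or 0 for an
    invalid/bouncy transition between two (A, B) channel states."""
    return _TRANSITIONS.get(old + new, 0)

def track_volume(states, volume=50):
    """Fold a sequence of sampled (A, B) states into a 0-100 volume level,
    moving 2 volume points per detent step."""
    old = states[0]
    for new in states[1:]:
        volume = max(0, min(100, volume + 2 * decode(old, new)))
        old = new
    return volume
```

On the real device the resulting level would be pushed to the mixer, e.g. via python-alsaaudio or by shelling out to `amixer`.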

When turned on, the box first speaks out temperature and air pressure from an MPL3155 sensor - a feature the user found useful. It then reads out a list of newspapers. eSpeak is used for this, as its voice differs from that of the TTS engine. The user presses a button when the name of the paper (s)he wants to have read is spoken. The RSS headlines are then retrieved and read out one by one. If the user wants to know more about a headline, (s)he presses the button again and is read the summary. After another button press the full article is retrieved.

This is where it gets challenging, because we are now dealing with non-standardized web pages. BeautifulSoup in Python can strip all of the scripting and HTML tags, but a lot of nonsense still gets read out. Therefore, for each featured paper the HTML tags that encapsulate the actual article text have been identified by hand. This works reasonably well, although for some papers the text that is read contains duplicate headlines, duplicate publication dates, duplicate author names or picture subtitles. The navigation footer also turns out to be hard to remove in some cases.
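The per-paper clean-up can be sketched with BeautifulSoup. The selector table below is hypothetical - each real paper has to be inspected by hand to find the container that wraps the article body.

```python
from bs4 import BeautifulSoup

# Hypothetical per-paper selectors: tag name plus attributes of the
# container that holds the article text on that paper's site.
PAPER_SELECTORS = {
    "examplepaper": ("div", {"class": "article-body"}),
}

def article_text(html, paper):
    """Return only the readable article text for a known paper."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop scripts and styles everywhere first.
    for tag in soup(["script", "style"]):
        tag.decompose()
    name, attrs = PAPER_SELECTORS[paper]
    container = soup.find(name, attrs=attrs)
    if container is None:          # fall back to the whole page
        container = soup
    # Paragraph tags usually carry the readable text.
    return " ".join(p.get_text(" ", strip=True)
                    for p in container.find_all("p"))
```

Footers, author boxes and picture subtitles that live inside the chosen container still slip through, which matches the duplicates described above.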

The TTS engine connects directly to ALSA. This is where this project differs from other implementations of the Android TTS port to the Raspi, which either generate a wave file that is then played by ALSA, or they pipe into ALSA. Generating a wave file takes too long when the text to be spoken is long and leads to a frustrating user experience. Piping is faster than waiting but is unwieldy, to put it mildly, in Python and leads to heavy system loads. Therefore, recompiling the TTS application to interface directly to the ALSA API turned out to be the best option. Credits to the folks at ALSA for making it easy to interface to other C++ applications.

Overall, this product works reasonably well considering the comparatively simple setup. It has been used by one elderly blind person since late 2014.

  • 1 × Banana Pi Started with Raspi B+ but Banana Pi has better performance. Raspi2 should do just fine.
  • 3 × Momentary SPST NO Green Round Cap Push Button Switch From Amazon
  • 1 × Headphones (Philips)
  • 1 × Rotary Encoder
  • 1 × Recycled Amazon cardboard box Clearly the enclosure needs some work

View all 6 components

  • Upgrading to Raspi 2

    Thomas Kirchner, 09/04/2015 at 01:52

    Trying to move from the BananaPi to the Raspberry Pi 2 which should have even better performance. The problem is that Doug had to remove some files from the Android TTS git and now it won't compile any more. I will post an update if I can figure it out.


  • 1
    Step 1
    • If you want only the TTS engine with ALSA integration, follow steps 1, 2, 4 and 5. For the Newspaper Reader follow all steps.

    First install the TTS engine. I suggest you use mfurquim's fork.

    git clone git://github.com/mfurquim/picopi.git
    

    As an alternative you can go with the original repository from Doug

    git clone git://github.com/DougGore/picopi.git
    If you go with the original version you will need to add these files: strdup8to16.c, strdup16to8.c, strdup16to8.cpp, jstring.h

    which you can get from mfurquim's fork.

    Follow the first steps of Doug Gore's instructions to set up the library
    cd picopi/pico/lib
    make && sudo make install 

    It is normal to see a lot of warnings.

    You now have the TTS library installed.
  • 2
    Step 2

    Now let's install ALSA's development environment. This can take some time.

    cd picopi/pico
    wget http://alsa.cybermirror.org/lib/alsa-lib-1.0.29.tar.bz2
    tar -xjvf alsa-lib-1.0.29.tar.bz2
    cd alsa-lib-1.0*
    ./configure && make
    sudo make install
    cd ..
    rm alsa-lib-1.0.29.tar.bz2
    
    ##UPDATE FEBRUARY 2017:
    ##ALSA 1.1.3 is the current version. Use these commands instead to download and compile
    cd picopi/pico
    wget ftp://ftp.alsa-project.org/pub/lib/alsa-lib-1.1.3.tar.bz2
    tar -xjvf alsa-lib-1.1.3.tar.bz2
    cd alsa-lib-1.1*
    ./configure && make
    sudo make install
    cd ..
    rm alsa-lib-1.1.3.tar.bz2
    … and test if it works by compiling one of the supplied example programs (use the alsa-lib-1.1.3 directory instead if you installed the 2017 version)
    cd alsa-lib-1.0.29/test
    make pcm
    amixer cset numid=3 1
    ./pcm

    You should hear a 440 Hz sine wave when plugging a headphone into the analog jack. The amixer command above directs sound to the analog jack. If you want it output differently you need to change the amixer options.

    At the time of writing alsa-lib-1.0.29 was the latest version. UPDATE FEB 2017: the current version is ALSA 1.1.3.

  • 3
    Step 3

    Now we will add some of the components we need for Python. Again, this may take some time.

    sudo apt-get install espeak python-espeak python-dev python-pip python-alsaaudio
    sudo pip install feedparser beautifulsoup4 

    Why do we need eSpeak if we are going to run the Android TTS engine? I have gone back and forth on this point but decided that eSpeak is good enough for reading out the names of the papers. It gives you an extra voice that makes it easier for the user to determine, just based on voice, where in the menu hierarchy (s)he is.

    In addition, if you plan to use the MPL3155 module you need to install

    sudo apt-get install python-smbus
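The sensor read itself needs the I2C bus (hence python-smbus), but the raw-to-unit conversion can be sketched and tested on its own. This assumes an MPL3115A2-style data format - pressure as a 20-bit Pascal value with 2 fractional bits, temperature as a signed 12-bit Celsius value with 4 fractional bits - which may differ from the exact part used here.

```python
def pressure_pa(msb, csb, lsb):
    """Convert the three pressure output bytes to Pascals (Q18.2 format)."""
    raw = ((msb << 16) | (csb << 8) | lsb) >> 4   # 20 significant bits
    return raw / 4.0

def temperature_c(msb, lsb):
    """Convert the two temperature output bytes to degrees C (Q8.4 format)."""
    raw = ((msb << 8) | lsb) >> 4                 # 12 significant bits
    if raw & 0x800:                               # sign-extend negative temps
        raw -= 0x1000
    return raw / 16.0
```

On the device these bytes would be fetched with `smbus.SMBus(1).read_byte_data(...)` from the sensor's output registers before being handed to eSpeak for the startup announcement.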

View all 8 instructions

Discussions

David Effendi wrote 07/30/2015 at 05:11

Very cool project! :) I'm also working on a device for people with visual impairments, can I ask about obtaining more information on the ported Android TTS engine, please? How to install, get started etc. I'm using eSpeak but Android's TTS sounds more natural, would really love to give it a try :)


Thomas Kirchner wrote 09/04/2015 at 01:49

It's Doug Gore's port which is at https://github.com/DougGore/picopi - or rather I should say, that's where it was. Doug had to remove several files from the repository after he received an infringement notice and the version posted on the Git now is one I have not been able to compile. I have to admit that I have not tried extremely hard, with some extra effort it may be doable. 

If you can figure it out please let me know.

I agree that the Android TTS is much better than eSpeak. 

