
Reverse engineering a Renault Safrane Speech Synth

Talk to me
Like Renaults do

Let's try to find out how this box works and maybe even make it do things...

While researching the different generations of speech synthesizers used by Renault, I took pictures of the internals of every generation.

I called the version used in the Safrane "TYPE 2" and did some basic research on its components.

As it turned out it may be possible to read the voice ROM. And a "maybe" is quite a good starting point...

italian_syp_test1.mp3

another test

MPEG Video - 24.40 kB - 07/22/2023 at 00:22


SYPx54GerTEST.mp3

First (really bad) test of decoding the voice from the ROM data.

MPEG Video - 41.39 kB - 07/22/2023 at 00:05


  • Phase 1 : EPROM -> FLASH mod

    BaumInventions, 08/04/2023 at 01:56

    To speed up writing and erasing ROMs, I decided to replace the EPROM with a flash chip.

    Flash can be written much faster, and you erase it at the push of a button instead of shining UV light into the chip for 15 minutes... Neat.

    After looking through my chip collection I found some 39SF040 in a PLCC package. These are quite similar to the 27C4001 EPROMs used here (the original part is a D23C4001E mask ROM). I also ordered some PLCC-to-DIP adapters. The pinout of the flash in the adapter matches the datasheet (1:1 pinout).

    A short look at both datasheets showed that only two pins differ. A18 is on pin 1 of the flash but on pin 31 of the 27C4001. Pin 31 of the flash is Write Enable, which must be high while reading (low while writing). So we can bridge pin 31 to pin 32 on the adapter (and remove pin 31), and relocate pin 1 of the adapter to pin 31. I did this with just a wire bridge and some Kapton tape to prevent shorts.
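The rework can be written down as a tiny pin map. This is only a sketch of the two changes described in this log; the names `FLASH_TO_SOCKET` and `socket_pin` are made up for illustration, and every pin not listed is assumed to pass straight through the adapter.

```python
# Flash pin -> socket pin after the adapter rework described in this log.
# Only the two pins that differ are listed (assumption: all others are 1:1).
FLASH_TO_SOCKET = {
    1: 31,   # A18 on the flash is rerouted to socket pin 31 (A18 on the 27C4001)
    31: 32,  # WE# on the flash is bridged to pin 32 (supply) so it stays high
}


def socket_pin(flash_pin: int) -> int:
    """Return the socket pin a given flash pin is wired to after the mod."""
    if not 1 <= flash_pin <= 32:
        raise ValueError("a 32-pin package has pins 1..32")
    return FLASH_TO_SOCKET.get(flash_pin, flash_pin)
```

Checking the finished adapter with a continuity tester against this map before plugging it in is cheap insurance.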

    It now works like the regular unit but with the benefit of way faster reprogramming. 

    Would I recommend it? Probably not... EPROMs are cheaper. If you don't need the faster programming and erasing, it makes no sense... :)

  • Phase 2 : set language

    BaumInventions, 07/29/2023 at 13:54

    The language of the Phase 2 synthesizer can be changed with original Renault dealer tools... But where is the fun in using expensive tools when you can do the same thing on your desk at home with a little bit of hacking?

    Control units often have settings that can be changed to match their functions to the car they will work in. Older control units like this one use external storage to hold these settings. Because a few bytes of settings don't need much storage, a really tiny ROM is enough to hold everything.

    While looking up the chips around the speech synthesizer ICs I noticed a tiny 93C46 serial EEPROM (128 bytes in 8-bit mode / 64 words in 16-bit mode) right next to the main microcontroller.

    I connected my chip reader to the EEPROM and tried to read it. After 15 minutes I figured out that the chip won't read while power is still connected to the control unit... So don't forget to turn off the power to your unit...

    You have to do this at least on the third layer of stuff lying around on your desk...

    The data I got was really tiny. Only 3 of the 64 words (16-bit mode), or 6 of the 128 bytes (8-bit mode), are actually used to hold data.

    Because my box speaks French... and this is a French product, my guess is that French is the base setting. This led me to experiment with the first word, "0000".

    I figured out that by writing one of "0000 / 0101 / 0202 / 0303 / 0404 / 0505" here I can select each of the available languages.

    0000 French / 0101 English / 0202 German / 0303 Italian / 0404 Spanish / 0505 Dutch

    It works only if the numbers are duplicated (like 0A0A or 2626).
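Here is a minimal sketch of how that first EEPROM word can be built and checked. The language table is taken from this log; the helper names `language_word` and `describe_word` are invented for illustration, and nothing here talks to a real programmer.

```python
# Language codes observed in the first 16-bit word of the 93C46.
LANGUAGES = {
    0x00: "French",
    0x01: "English",
    0x02: "German",
    0x03: "Italian",
    0x04: "Spanish",
    0x05: "Dutch",
}


def language_word(code: int) -> int:
    """Build the 16-bit setting: the same code in the high and the low byte."""
    if code not in LANGUAGES:
        raise ValueError(f"unknown language code {code:#04x}")
    return (code << 8) | code


def describe_word(word: int):
    """Return the language name for a word, or None if it is not a known setting."""
    high, low = (word >> 8) & 0xFF, word & 0xFF
    if high != low:
        return None  # both bytes must be identical for the setting to work
    return LANGUAGES.get(low)


print(f"{language_word(0x02):04X}")  # German
```

`describe_word` returns `None` both for words whose two bytes differ and for doubled values outside the table, such as 0909.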

    Settings of 0606 and above result in funny behaviour. The system just plays random chunks of the samples, mixes all languages and sometimes makes crackling sounds or repeats short bits of samples over and over again.

    Here is an example of "0909" ... pretty nice glitches :D

  • Analyzing the Data

    BaumInventions, 07/22/2023 at 23:15

    Everything I mention here is in hex.

    The ROM has a capacity of 80000 bytes (512 KB).

    When I looked at the raw data I could clearly see the different "sectors" of data.

    This is the first larger gap. The data of the first block (starting at 00) ends at 02A4. The bytes up to 02FF are padded with zeros. At 0300 a row of 0F starts. These 0F are the squiggly lines we have seen in the audio analysis. My guess is that the audio starts at 0314. Now we know that a row of 0F is a clear indicator of where one sample ends and the next one starts, and that 00 is used for padding.


    Here is the gap we saw in the audio at the center of the data. The first block of data is padded with 00 up to (presumably) 03FFFF (that's the gap we saw). The new block of data starts at 040000. That's exactly half of the chip.


    And like the first block we have another "00 padding | row of 0F" situation at exactly 040300. Again I think the audio starts at 040314. That makes sense because the audio is visible and audible after the squiggly line.
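The layout found so far can be sketched in a few lines. The constants come from the offsets in this log; the function names are hypothetical, and the idea that each half is one self-contained language bank is still just my working assumption.

```python
ROM_SIZE = 0x80000      # whole chip, 512 KB
BANK_SIZE = 0x40000     # second block starts exactly at half of the chip
AUDIO_OFFSET = 0x0300   # first row of 0F inside each bank (0300 and 040300)


def split_banks(rom: bytes):
    """Split a full ROM image into its two halves."""
    if len(rom) != ROM_SIZE:
        raise ValueError(f"expected {ROM_SIZE:#x} bytes, got {len(rom):#x}")
    return rom[:BANK_SIZE], rom[BANK_SIZE:]


def bank_layout(bank_index: int) -> dict:
    """Absolute addresses of the table area and the audio area of one bank."""
    base = bank_index * BANK_SIZE
    return {
        "tables": (base, base + AUDIO_OFFSET - 1),  # non-audio data plus 00 padding
        "audio_start": base + AUDIO_OFFSET,         # first 0F marker row
    }
```

Working on one bank at a time keeps the offsets small: everything in the second bank is simply the first-bank offset plus 040000.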


    If I just search for "0F 0F 0F ..." I get 100 results. Now we know that there are around 100 audio samples.

    The number of 0F in a row is not always the same. It's around 10 to 1F long.
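The search for the 0F rows can also be scripted instead of done in a hex editor. This is a sketch, not the method I actually used: `find_marker_runs` is a made-up name, and the minimum run length of 10 (hex) is an assumption taken from the range observed above. Shorter runs of 0F inside the audio itself are ignored.

```python
def find_marker_runs(data: bytes, marker: int = 0x0F, min_len: int = 0x10):
    """Return (offset, length) for every run of `marker` bytes of at least min_len."""
    runs = []
    i = 0
    n = len(data)
    while i < n:
        if data[i] == marker:
            start = i
            while i < n and data[i] == marker:
                i += 1                      # walk to the end of this run
            if i - start >= min_len:
                runs.append((start, i - start))
        else:
            i += 1
    return runs
```

Each reported offset is the start of a divider, so the number of entries gives a rough count of the samples in the image.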


    Now we know the voice data starts at 0300 (040300). Because decoding the voice is a different beast, let's start with the stuff that is not voice (everything before 0300).

    This is the data from 00 to 0300. 

    While looking at that I noticed a smaller block which starts at 00 and ends at 4F. If we ignore bytes 0 and 1 you can see some kind of "counter". Bytes 2 and 3 are 0050, bytes 4 and 5 are 006C... The highest number is 0291, which is located at bytes 48 and 49. This block also seems to be padded with 00 towards the end.

    In the data from 40 to 02FF I noticed a lot of 14 and 4000. And again padding with 00 towards the end of this block.

    To visualise this a bit more for myself I exported the data as hex (not BIN) and loaded it into a text editor.

    You start with a huge block of those numbers. After every repetition or interesting number I saw, I just pressed enter to get everything into what you see here. There is the "counter" we saw at the top of the file and the "4000". Some of the data between the 4000s was similar to the data before. And there are a lot of 14. Nearly every one of those lines has at least one 14 and a 4000 in it.

    Because the first 4000 appeared after the first data it could indicate the end of a data block or the start of a new data block.

    Now I needed more flexibility to handle the data, so I loaded the block into OpenOffice Calc.

    And after "just" one day of constantly looking at those numbers I noticed some things.

    Remember the "counter"? When we take the first value of the counter, 0050, and look at the data that follows, we see that the first block of data starts at exactly 0050 and ends at 006B with a 4000. Ohhh... Let's have a look at the counter again... The next number is 006C... That's a thing! That is exactly after our 4000 from the first block of data... Let's call the "counter" a lookup table from now on.
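Reading the lookup table and cutting out the blocks it points to could look roughly like this. It is a sketch under assumptions: the entries are taken as big-endian 16-bit values (bytes 2 and 3 read as 0050), the table is assumed to end at the first 0000 padding entry, and the function names and default offsets are mine, not anything documented.

```python
import struct


def read_lookup_table(bank: bytes, table_start: int = 0x02, table_end: int = 0x50):
    """Read big-endian 16-bit pointers until the 00 padding begins."""
    pointers = []
    for offset in range(table_start, table_end, 2):
        (pointer,) = struct.unpack_from(">H", bank, offset)
        if pointer == 0:
            break  # assumption: 0000 is padding, not a valid pointer
        pointers.append(pointer)
    return pointers


def slice_blocks(bank: bytes, pointers, data_end: int):
    """Cut the data into blocks; each block runs up to the next pointer."""
    bounds = list(pointers) + [data_end]
    return [bytes(bank[a:b]) for a, b in zip(bounds, bounds[1:])]
```

If the interpretation is right, every block returned by `slice_blocks` should end with the bytes 40 00, which makes a handy sanity check on a real dump.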

    You have probably already noticed that I have coloured all the 32 values of the lookup table and the corresponding data. Also there are really a lot of 14 and 94.

    Let's bring this into a new form.

    That's looking better.

    On the left side in grey i have the values of the lookup table. Each value from the lookup table is followed by the data that it sends us to.

    This took me hours to find out. I will break it down for you.

    Each set of 3 connected light yellow boxes is a 3-byte value pointing to the beginning of a squiggly line. The brown box after the light yellow boxes is the length. Each increase of 1 (hex) in the length corresponds to roughly 0.05 to 0.06 s. The 4000 marks the end of a data block.
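As a sketch, one pointer-plus-length field can be decoded like this. The big-endian byte order of the 3-byte pointer is an assumption (it matches how the lookup table reads), the function names are hypothetical, and the other bytes in a block (the 14 and 94 values) are not interpreted here at all.

```python
SECONDS_PER_STEP = (0.05, 0.06)  # observed range per length step, not an exact figure


def decode_sample_ref(field: bytes):
    """Split a 4-byte field into (sample address, length value)."""
    if len(field) != 4:
        raise ValueError("expected 3 pointer bytes followed by 1 length byte")
    address = int.from_bytes(field[:3], "big")  # assumption: big-endian pointer
    length = field[3]
    return address, length


def estimated_duration(length: int):
    """Return the (shortest, longest) estimated playback time in seconds."""
    low, high = SECONDS_PER_STEP
    return length * low, length * high
```

The duration is returned as a range on purpose, because the step size was only estimated by ear.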

    From this table we can reconstruct the sentences that are formed from the samples. 

    Example: 

    Lookup...

    Read more »

  • Analysing the data by ear

    BaumInventions, 07/22/2023 at 00:04

    Because I had no idea what the ROM contains but expected some kind of wave data, I started by importing the .bin file I had read from the ROM into Audacity.

    At first I had no luck with that, but I kept trying different import settings until I started to hear something that really sounded like a voice. The quality is really bad and you can barely make out the words. But at least now I know that there is real voice data on this chip.

    These are the settings I used in Audacity:

    Here is a small section of the badly decoded voice. It says "Tür hinten rechts, nicht geschlossen, Tür vorne rechts" ("rear right door, not closed, front right door") and then it plays the two sounds that it can make (a rising 2-tone sound and a falling 3-tone sound).
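The same kind of raw import can be tried outside Audacity. The sketch below wraps the dump in a WAV header so any player can open it. The format (unsigned 8-bit mono PCM) and the 8000 Hz sample rate are guesses for experimenting, not the Audacity settings used above, and the real encoding of the samples is still unknown. `raw_to_wav_bytes` is a made-up helper name.

```python
import io
import wave


def raw_to_wav_bytes(raw: bytes, sample_rate: int = 8000) -> bytes:
    """Wrap raw bytes in a WAV container as unsigned 8-bit mono PCM."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(1)            # 1 byte per sample
        wav.setframerate(sample_rate)  # assumption, try other rates too
        wav.writeframes(raw)
    return buffer.getvalue()
```

Writing the returned bytes to a `.wav` file and trying a few sample rates is a quick way to hear whether the speed and pitch are even in the right ballpark.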

    https://cdn.hackaday.io/files/1920428216324928/SYPx54GerTEST.mp3

    That's another huge success.

    I looked at the waveform and noticed interesting details. There are sections that do not make sound... That's probably data. Each sound is separated from the next by an interesting squiggly line.

    And now the crazy thing... There is a second language... I think it's Italian. The second half of the ROM is entirely in Italian.

    https://cdn.hackaday.io/files/1920428216324928/italian_syp_test1.mp3

    The Italian part is built exactly like the German part. First some kind of data, followed by the voice samples. Again, all samples are separated by a squiggly line.

    You can really see that the wave data is converted wrong. It looks like the waves come from the bottom and are clipping all the time.

    I have marked one of the dividers between the samples. They should be easy to find in the binary data.

    -----

    The 2 data blocks are divided by this gap shown below. I have highlighted the gap and the data in front of the voice samples up to the first squiggly line. The "Data Data" clearly looks different from the voice Data.

    IF YOU HAVE AN IDEA HOW TO DECODE THIS KIND OF RAW AUDIO please let me know... I have no idea how PCM and Waves work... 

    The next step is to dive deeper into the raw hex bytes of the BIN file. With the information we discovered here it should be easy to find these blocks of audio and data.

  • Reading and cloning the ROM

    BaumInventions, 07/21/2023 at 20:26

    The first step to explore the ROM is to read it. As it turned out it was really easy and read just fine with the "NEC UPD27C4001 @ DIP32" chip selected in the software of my programmer.

    Removed ROM:

    ROM in programmer:

    To find out if the data I read is good (because I don't know if the "user definable" pins are configured right on the D23C4001E) I just wrote a 1:1 copy of the data to an AM27C040 IC (same as 27C4001).

    And yes. It works completely normal with a self written ROM. HUGE SUCCESS.
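A cheap extra check on top of the hardware test is to read the chip twice and compare the two dumps in software. This is only a complementary sketch, not what I did above; `first_difference` is a hypothetical helper.

```python
def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if the dumps match."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        return min(len(a), len(b))  # one dump is a truncated copy of the other
    return None
```

Two identical reads do not prove the pin configuration is right, but two different reads are a clear sign of a bad contact or wrong settings.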

    The next step is to have a closer look at the content of the ROM...

  • Used Hardware

    BaumInventions, 07/21/2023 at 19:55

    This is the top side of a Phase 1 (till 1996) voice synthesizer used in the Renault Safrane.

    If you are interested in technical details about all the chips and versions, please have a look at my initial research.

    The part with the highest probability for us to understand its inner workings and get some interesting results is the voice ROM (top right corner [NEC part]). Because my version speaks German, I would make an educated guess that the ROM contains at least German words.

    Phase 1 boxes were available in 5 languages, and each language has its own part number. That tells us that a Phase 1 box will probably contain just one language (unlike Phase 2, where there is only one part number for all languages and the option to change the language with dealer tools).

    My research on the ROM pointed me in the direction of an MC27C4001-compatible device. It can store 4 megabits, which is 512 kilobytes. Depending on the sample rate used (and maybe compression) you can fit quite a lot of speech into this chip. There are "just" around 30 messages available to be played, and most of them share the same samples. Either the sample rate for the speech is very high to fill the whole 512 KB chip with those few words, or it is half empty...
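To get a feeling for how much speech fits, the arithmetic can be sketched like this. The sample rates and bit depths in the example are pure assumptions for illustration; the real format of the samples is not known at this point.

```python
ROM_BITS = 4 * 1024 * 1024   # 4 megabits
ROM_BYTES = ROM_BITS // 8    # 524288 bytes = 512 KB


def seconds_of_audio(rom_bytes: int, sample_rate: int, bits_per_sample: int) -> float:
    """Playback time if the whole ROM were filled with uncompressed samples."""
    return rom_bytes * 8 / (sample_rate * bits_per_sample)


# Hypothetical formats, just to compare orders of magnitude.
for rate, bits in [(8000, 8), (8000, 4), (16000, 8)]:
    print(rate, "Hz,", bits, "bit:", round(seconds_of_audio(ROM_BYTES, rate, bits), 1), "s")
```

Even at a modest rate the chip holds roughly a minute or more, which fits the suspicion that it is either sampled generously or not completely filled.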

    Let's see if we can read it with my TL866II PLUS...

