From the Latin, "Docta Vox" meaning "learned voice" or something like that according to Google Translate. It sounded cool and I'm bad at naming things anyway. So there you have it.


This project uses Processing with STT (speech to text) and TTS (text to speech) to allow verbal communication with the program. The program communicates with an Arduino with RF transmitter via serial. A simple demonstration of the project in action is found here.

"Lamp on." *click* "Lamp has been turned on." *commence feeling powerful*

I started this project a couple of months ago with the simple hopes of controlling an outlet. Flashing an LED is great, but when you need to make something more substantial happen, this is the thing to do. I was almost tempted to get relays like this one. I decided against that because of this excellent tutorial called "Arduino Controlled Relay Box."

After I priced it all out at Lowe's and Sparkfun, I found that each box would cost around $30. These boxes are large, don't forget, and require a wired connection to the Arduino. To control five outlets would cost $150 in components plus low-voltage wire to run around the house to each outlet. I'm not sure my family is going to be ok with bundles of wires running down the hall.

Time for a better solution. I scrounged around the vast caverns of Amazon to find what I thought would be the best fit for my plans. Etekcity 5LX remote outlets are just the ticket. They cost a whopping $35 and offer control of 5 outlets and negate the need to run wires. Not bad.

I bought the set of 5 outlets and two remotes, but you can also order almost any combination by viewing the related products on the page.

My first plan was to hack the remote apart and add transistors to the buttons. This began one of the largest failures in my hacking career: I fried both remotes! Ah! I wound up ordering a new remote and trying a much safer method. This new method was to sniff the RF codes of the remote and retransmit them using the Arduino. Of course, this meant another stop at Sparkfun for their beautifully simple RF products: RF Receiver, RF Transmitter. Both of these <$5 components proved to be quite valuable.

Once again digging up some help online, I came across an exquisite library made for just the sort of thing I was doing. It's called the RCSwitch Arduino library.

After downloading and installing the library, open up the advanced receive sketch and follow the link in the commenting to see the tutorial. Really, you can operate it without much guidance. The code is simple, and it literally spits out codes on screen as you press buttons on the remote. I found that on my remote the decimal value was the easiest to work with. It will spit out a code that looks something like this:
Decimal: 5592371 (24Bit) Binary: 010101010101010100110011 Tri-State: FFFFFFFF0101 PulseLength: 185 microseconds Protocol: 1
Raw data: 5816,220,544,592,152,224,536,596,156,220,540,588,160,220,540,592,172,204,540,596,160,212,544,592,164,208,544,592,
164,212,548,588,164,208,544,216,544,588,168,208,548,208,544,208,544,208,548,208,548,

For the practical purposes of this project, you only care about the part that says "Decimal: 5592371". You should press each button on the remote in an order that you will remember and then copy all the data from the serial monitor into something else (Notepad or Notepad++ would be great). Save it.
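
Since only the decimal value matters, the saved log can be boiled down to a plain list of codes. Here is a minimal Java sketch of that idea. It is not part of the original project: the names `CodeLogParser` and `extractDecimalCodes` are made up for illustration, and it assumes every capture starts with `Decimal: <number>` exactly as in the output shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the "Decimal: N" values out of a saved
// serial-monitor log, in the order the buttons were pressed.
class CodeLogParser {
    private static final Pattern DECIMAL = Pattern.compile("Decimal:\\s*(\\d+)");

    static List<Long> extractDecimalCodes(String log) {
        List<Long> codes = new ArrayList<>();
        Matcher m = DECIMAL.matcher(log);
        while (m.find()) {
            codes.add(Long.parseLong(m.group(1)));
        }
        return codes;
    }

    public static void main(String[] args) {
        // Two captures in the format printed by the receive sketch;
        // the "Raw data" line is ignored because it has no "Decimal:" field.
        String log = "Decimal: 5592371 (24Bit) Binary: 010101010101010100110011\n"
                   + "Raw data: 5816,220,544\n"
                   + "Decimal: 5592380 (24Bit) Binary: 010101010101010100111100\n";
        System.out.println(extractDecimalCodes(log)); // prints [5592371, 5592380]
    }
}
```

If you pressed the buttons in a known order, the list comes out in that same order, so the first entry is your first button and so on.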

Next, open the transmit sketch and begin testing. I decided to make the repeat value 9 (I discovered that is what the remote itself sends) and I changed the decimal code to the one for "ON" on my first outlet. I uploaded the sketch and watched it work! Success is great! I then modified the code to do all 10 buttons on my remote. I added the switch/case statement to shorten it a bit. If you are wondering why I went with a serial interface, it is so that I can more easily interact with Processing in the next section of this tutorial. Here is the transmit code that I use in the final version.

/*
Example for different sending methods

http://code.google.com/p/rc-switch/

Need help? http://forum.ardumote.com
*/

#include <RCSwitch.h>
int message = 0;


RCSwitch mySwitch = RCSwitch();

void setup() {

Serial.begin(9600);

// Transmitter is connected to Arduino Pin #3
mySwitch.enableTransmit(3);

// Optional set pulse length.
mySwitch.setPulseLength(185);

// Optional set protocol (default is 1, will work for most outlets)
// mySwitch.setProtocol(2);

// Optional set number of transmission repetitions.
mySwitch.setRepeatTransmit(9);

}

void loop() {
delay(5);
if(Serial.available() > 0){
message = Serial.read();
switch(message){
case 'q':
mySwitch.send(5592371, 24);
break;
case 'w':
mySwitch.send(5592380, 24);
break;
case 'e':
mySwitch.send(5592515, 24);
break;
case 'r':
mySwitch.send(5592524, 24);
break;
case 't':
mySwitch.send(5592835, 24);
break;
case 'y':
mySwitch.send(5592844, 24);
break;
case 'u':
mySwitch.send(5594371, 24);
break;
case 'i':
mySwitch.send(5594380, 24);
break;
case 'o':
mySwitch.send(5600515, 24);
break;
case 'p':
mySwitch.send(5600524, 24);
break;
}

}
/* See Example: TypeA_WithDIPSwitches */
// mySwitch.switchOn("11111", "00010");
// delay(1000);
// mySwitch.switchOn("11111", "00010");
// delay(1000);

/* Same switch as above, but using binary code */
// mySwitch.send("000000000001010100010001");
// delay(1000);
// mySwitch.send("000000000001010100010100");
// delay(1000);

/* Same switch as above, but tri-state code */
// mySwitch.sendTriState("FFFFFFFF0101");
// delay(5000);
// mySwitch.sendTriState("FFFFFFFF0110");
// delay(1000);

//delay(20000);
}


If you wish to play with it now, simply open the serial monitor and type "q" and hit enter. I used the top row of the keyboard "qwertyuiop" because it has 10 letters and they are easy to keep track of. Processing will learn to send these characters later.
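
The key layout follows a regular pattern: reading "qwertyuiop" left to right, the keys pair up as ON/OFF for outlets 1 through 5, matching the switch/case above. A small Java sketch of that pattern follows. The names `OutletKeys`, `outletFor`, `isOn`, and `keyFor` are hypothetical and do not appear anywhere in the project; this only illustrates the mapping.

```java
// Hypothetical helper describing the "qwertyuiop" convention:
// q/w = outlet 1 on/off, e/r = outlet 2 on/off, ... o/p = outlet 5 on/off.
class OutletKeys {
    static final String KEYS = "qwertyuiop";

    // Outlet number (1..5) controlled by a key, or -1 if the key is unused.
    static int outletFor(char key) {
        int idx = KEYS.indexOf(key);
        return idx < 0 ? -1 : idx / 2 + 1;
    }

    // True when the key is an ON command (q, e, t, u, o).
    static boolean isOn(char key) {
        int idx = KEYS.indexOf(key);
        return idx >= 0 && idx % 2 == 0;
    }

    // Key to send for a given outlet (1..5) and desired state.
    static char keyFor(int outlet, boolean on) {
        if (outlet < 1 || outlet > 5) {
            throw new IllegalArgumentException("outlet must be 1..5");
        }
        return KEYS.charAt((outlet - 1) * 2 + (on ? 0 : 1));
    }
}
```

For example, `OutletKeys.keyFor(3, false)` gives `'y'`, the same character the Arduino sketch uses to turn the third outlet off.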

=====================================================================================

Time to play with Processing. The Processing IDE is a fantastic way to program your computer. It is flexible, understands Java, and makes programming unbelievably quick and easy. I hope you Arduino users out there already have some experience with this so that the project is less complicated. Most of the code is explained in the commenting, so you can pick it up from there.

I broke the code into several files (these open as tabs in Processing).
I will put all the code in this document, but you will be better off just downloading the whole thing and playing with it as is, along with its supporting files.

doctavox_complete.zip

As is usually the case when combining code from multiple programmers, it can be improved with a little work. I added a commands file in .txt format to hold all known commands and responses. It is a CSV file, and the program splits it at commas. This means that you can add commands and features in less than five minutes each. Pretty flexible! I also shortened the way to make the thing talk. The original author required the syntax "googleTTS(String, String);" where the first string is what is to be said, and the second string is "en" for English. Since I only use English, I added a bit of code to assume this. Now the syntax is "respond(String);" where the string is what is to be said. Pretty clean!
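
To make the file layout concrete: the program treats the file as one long comma-separated list where even positions are things to listen for and odd positions are the matching replies. The Java sketch below shows that pairing under a few assumptions. The class name `CommandFile`, the method `parse`, and the example phrases are all made up for illustration (the real knownVCommands.txt ships in the download), and the `trim()` calls are my own addition; the project code splits without trimming.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns "command,response,command,response,..."
// into ordered command -> response pairs.
class CommandFile {
    static Map<String, String> parse(String csv) {
        String[] parts = csv.split(",");
        Map<String, String> pairs = new LinkedHashMap<>();
        // Step through the list two entries at a time; a trailing
        // command with no response is skipped.
        for (int i = 0; i + 1 < parts.length; i += 2) {
            pairs.put(parts[i].trim(), parts[i + 1].trim());
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Example phrases only; these are not the contents of the real file.
        String csv = "lamp on,Lamp has been turned on.,lamp off,Lamp has been turned off.";
        System.out.println(parse(csv).get("lamp on"));
    }
}
```

One consequence of splitting at commas is that a response cannot itself contain a comma, so keep the spoken replies comma-free.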

Ok, let's take a look at the main setup and loop.


//STT solution by Florian Schulz
//http://florianschulz.info/stt/

//TTS solution by "Amnon"
//http://amnonp5.wordpress.com/2011/11/26/text-to-speech/

//Configuration file system added by [trademark]
//

//Excellent library for RC Codes at http://code.google.com/p/rc-switch/

//to activate listening via phone
import oscP5.*;
import netP5.*;

//import serial to talk to Arduino
import processing.serial.*;


//import minim for managing audio
import ddf.minim.spi.*;
import ddf.minim.signals.*;
import ddf.minim.*;
import ddf.minim.analysis.*;
import ddf.minim.ugens.*;
import ddf.minim.effects.*;

//speech to text library
import com.getflourish.stt.*;

//This will hold what was said by the user
String VCResult = "";

//load commands listed in file
String knownVCommands = "";
String loadedCommands[] = {};

// load configuration file
String configuration = "";
String loadedConfiguration[] = {};
String[] config = {};


boolean micOpen = false;
boolean said = false;

Serial port;
STT stt;
Minim minim;
OscP5 oscP5;
NetAddress netLoc;

void setup(){
size(600,400);
frame.setResizable(true);
stt = new STT(this);
stt.enableDebug();
stt.setThreshold(1.0);
stt.setLanguage("en");
VCResult = "System is ready for voice commands.";
minim = new Minim(this);

oscP5 = new OscP5(this, 8000);
//respond("Welcome to dokta vox beta 2 point Oh");

loadedCommands = loadStrings("knownVCommands.txt");
for(int i=0; i<loadedCommands.length; i++){
knownVCommands = knownVCommands + loadedCommands[i];
System.out.println(knownVCommands);
}
loadedConfiguration = loadStrings("config.txt");
for(int i=0; i<loadedConfiguration.length; i++){
configuration = configuration + loadedConfiguration[i];
println(configuration);
}
//TODO: change this to read config file
port = new Serial(this, "COM4", 9600);

}
//^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
void draw(){

}

//Add a way to activate the listening and not listening features via computer keyboard
void keyPressed(){
if(keyCode == CONTROL){
micOpen = true;
stt.begin();
}
// send Arduino the off command if the down arrow is pressed
if(keyCode == DOWN){
port.write('w');
}
// send Arduino the on command if the up arrow is pressed
if(keyCode == UP){
port.write('q');
}
}
// manually activate the mic via control key
void keyReleased(){
if(keyCode == CONTROL){
micOpen = false;
stt.end();
}
}


void transcribe (String utterance, float confidence)
{
println(utterance);
micOpen = false;
VCResult = utterance;
voiceCommands();
}

void stop() {
// speak stays null until the first response has been played
if(speak != null){ speak.close(); }
minim.stop();
super.stop();
}

The code is fairly simple. Pressing the Ctrl key will activate the microphone. It acts as a PTT (push to talk) button similar to a walkie-talkie. Pressing the UP or DOWN arrows will switch the first outlet on and off. I added the arrow thing for a specific personal application which requires it. You don't have to use it, but it is handy for testing. You can see that I have provisioned for a general config file. I don't have anything that uses it at the moment, but eventually it could be used to add commands, COM port, etc. The other file that is imported is the list of things to understand and say.

This next part will handle STT.


import java.io.File;

//store the state of each outlet
boolean one = false;
boolean two = false;
boolean three = false;
boolean four = false;
boolean five = false;

//These come straight from the voice command file
void voiceCommands(){
String[] vc = split(knownVCommands, ',');
// store what was said
String v = VCResult;
/*This is the model for a new command:
else if(v.equals(vc[next even number]) == true){
doStuffHere();
respond(vc[next odd number]);
}*/
if(v.equals(vc[0]) == true){
System.out.println("Success: How sweet it is");
respond(vc[1]);
}
else if(v.equals(vc[2]) == true){
background(0,255,0);
respond(vc[3]);
}

else if(v.equals(vc[4]) == true){
background(255,0,0);
respond(vc[5]);
}

else if(v.equals(vc[6]) == true){
background(0,0,255);
respond(vc[7]);
}
else if(v.equals(vc[8]) == true){
respond(vc[9]);
}
else if(v.equals(vc[10]) == true){
respond(vc[11]);

}
else if(v.equals(vc[12]) == true){
respond(vc[13]);

}
else if(v.equals(vc[14]) == true){
port.write('q');
one = true;
respond(vc[15]);
}
else if(v.equals(vc[16]) == true){
port.write('w');
one = false;
respond(vc[17]);
}
else if(v.equals(vc[18]) == true){
port.write('e');
two = true;
respond(vc[19]);
}
else if(v.equals(vc[20]) == true){
port.write('r');
two = false;
respond(vc[21]);
}
else if(v.equals(vc[22]) == true){
port.write('t');
three = true;
respond(vc[23]);
}
else if(v.equals(vc[24]) == true){
port.write('y');
three = false;
respond(vc[25]);
}
else if(v.equals(vc[26]) == true){
port.write('u');
four = true;
respond(vc[27]);
}
else if(v.equals(vc[28]) == true){
port.write('i');
four = false;
respond(vc[29]);
}
else if(v.equals(vc[30]) == true){
port.write('o');
five = true;
respond(vc[31]);
}
else if(v.equals(vc[32]) == true){
port.write('p');
five = false;
respond(vc[33]);
}
else if(v.equals(vc[34]) == true){
port.write('e');
two = true;
delay(30);
port.write('t');
three = true;
respond(vc[35]);
}
else if(v.equals(vc[36]) == true){
port.write('r');
two = false;
delay(30);
port.write('y');
three = false;
respond(vc[37]);
}
else if(v.equals(vc[38]) == true){
port.write('q');
one = true;
port.write('e');
two = true;
port.write('t');
three = true;
port.write('u');
four = true;
port.write('o');
five = true;
respond(vc[39]);
}
else if(v.equals(vc[40]) == true){
port.write('w');
one = false;
port.write('r');
two = false;
port.write('y');
three = false;
port.write('i');
four = false;
port.write('p');
five = false;
respond(vc[41]);
}
else if(v.equals(vc[52]) == true){
if(one){respond(vc[42]);}else{respond(vc[43]);}
delay(1000);
if(two){respond(vc[44]);}else{respond(vc[45]);}
delay(1000);
if(three){respond(vc[46]);}else{respond(vc[47]);}
delay(1000);
if(four){respond(vc[48]);}else{respond(vc[49]);}
delay(1000);
if(five){respond(vc[50]);}else{respond(vc[51]);}
delay(1000);
}

}

//make responses easier to write in the code.
//respond(string);
//this is an addition by [trademark], and it negates
//one of the original writer's comments on another part
//of the code, but it is marked to avoid confusion.
void respond(String say){
googleTTS(say, "en"); // add the language to the URL and pass all this to the other function
File old = sketchFile("lastThingSaid.mp3");
speak = minim.loadFile("lastThingSaid.mp3", 2048);

speak.play();

// don't clutter HDD with .mp3 files
if(old.exists()){old.delete();}

}


This code is a little more to take in, but is just as simple. To understand the list of actions and commands, use the array position to locate the command in the knownVCommands.txt file.



else if(v.equals(vc[14]) == true){
port.write('q');
one = true;
respond(vc[15]);
}
To add a command (and a response, if you like), simply add another statement like the one above and add the words to the text file. A simplified explanation of the above code is this: "If what was said equals voice command 14, send the Arduino a 'q', let the rest of the program know I turned "one" on, and tell the user I did so." The last interesting thing I did in this code was to break the tradition with the last couple of commands. I added a "status check" command that will tell me the state of each outlet. That is why the little list of if statements is needed at the end.
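
The status check picks its replies by position: in the code above, outlet one uses entries 42 (on) and 43 (off), outlet two uses 44 and 45, and so on up to 50 and 51 for outlet five. That arithmetic can be sketched in Java like this. The names `StatusCheck`, `responseIndex`, and `statusIndices` are hypothetical; the project itself just spells out the five if statements.

```java
// Hypothetical helper: computes which entry of the command list holds
// the status reply for each outlet, following the 42..51 layout above.
class StatusCheck {
    static final int FIRST_STATUS_INDEX = 42;

    // outlet is 1..5; "on" replies sit at even offsets, "off" at odd ones.
    static int responseIndex(int outlet, boolean on) {
        if (outlet < 1 || outlet > 5) {
            throw new IllegalArgumentException("outlet must be 1..5");
        }
        return FIRST_STATUS_INDEX + (outlet - 1) * 2 + (on ? 0 : 1);
    }

    // One reply index per outlet, given the current on/off states.
    static int[] statusIndices(boolean[] states) {
        int[] indices = new int[states.length];
        for (int i = 0; i < states.length; i++) {
            indices[i] = responseIndex(i + 1, states[i]);
        }
        return indices;
    }
}
```

With the five booleans collected into an array, the status command could loop over `statusIndices(...)` and call `respond(vc[index])` for each one, with the same one-second pause between replies.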

Next up: TTS!

AudioPlayer speak;

import java.net.*;
import java.io.*;



void googleTTS(String txt, String language){
String u = "http://translate.google.com/translate_tts?tl=";
u = u + language + "&q=" + txt;
u = u.replace(" ", "%20");
try {
URL url = new URL(u);
try {
URLConnection connection = url.openConnection();
// This user agent spoof is the loophole that lets this work. As you can see, Google thinks we are a regular web browser (the string below identifies as Internet Explorer 6)
connection.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; .NET CLR 1.2.30703)");
connection.connect();
InputStream is = connection.getInputStream();
File f = new File("lastThingSaid.mp3");
OutputStream out = new FileOutputStream(f);
byte buf[] = new byte[1024];
int len;
while ((len = is.read(buf)) > 0) {
out.write(buf, 0, len);
}
out.close();
is.close();
println("File created");
} catch (IOException e) {
e.printStackTrace();
}
} catch (MalformedURLException e) {
e.printStackTrace();
}
said = false;
}


//The two comments below are from the original writer of this code. As seen in the STT file,
// this particular program will speak with the "respond(string)" syntax. The system defaults to English.

// To use the above system, use the following format: googleTTS(String, String);
// The first string is the words to say, and the second is the language slot. "en" for most uses.
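
One thing to note about the URL building above: it only swaps spaces for "%20", so a response containing characters like "&" or "?" would produce a broken query. A possible alternative, sketched in plain Java, is to run the text through the standard `java.net.URLEncoder`. The class `TtsUrl` and method `build` are made-up names, and this sketch only builds the string; it makes no claim about what the Google endpoint accepts.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the same TTS URL as googleTTS(), but with
// every query value escaped instead of only the spaces.
class TtsUrl {
    static final String BASE = "http://translate.google.com/translate_tts?tl=";

    static String build(String txt, String language) {
        try {
            return BASE + URLEncoder.encode(language, "UTF-8")
                 + "&q=" + URLEncoder.encode(txt, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Note that `URLEncoder` writes spaces as "+" rather than "%20"; both forms are valid inside a query string.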

Now we are cooking. The whole thing should be fully functional by now. The last part is completely optional, but was necessary for me to be able to use the system without being at my computer. I bought a Bluetooth headset, by the way, and that improved the accuracy of the speech recognition. It also gives me more freedom to move about. This particular headset is cheap (I paid $9.99 when it was on sale) and produces some bothersome static at just 15 feet from the computer. If you have a nice one, use it! If not, you can buy a cheap one like I did. It's still better than having to carry the computer around your house.

I also added an iOS interface. I bought TouchOSC ($4.99) on the App Store a year or two ago for a different project. It usually winds up finding its way into most of my projects. It is a good system, and it is reasonably easy to use with Processing if you download the library that is built for it.

When you download the editor (free) from the website linked above, you need to open the OSC layout file included in the main download of this project (above). You will be greeted with something similar to what you see in the pictures. Hit "sync" and then allow it through your firewall. Follow the documentation of the app to download the layout to your iDevice. Note all this works with Android too. Teach the app the IP address of your computer, and make sure the ports are the same. Once you have this up and running, you can then control each outlet from your pocket device. Someday I'd like to make the system listen by pressing the button on the side of the Bluetooth headset, but these are basically impossible to interface with. This is a quick and sensible solution. Why use voice control if you have to press a button anyway? Well, for people like you, I made the other iOS page with ON/OFF switches. I'm also assuming that there aren't any people still reading who don't appreciate the voice activation aspect :)
// /*
// if you wish to use something other than OSC, you can
// delete this tab, or uncomment the first and last lines
//Add a way to make the computer start listening from a remote location
int [] button = new int [51];
void oscEvent(OscMessage theOscMessage){
//println("Got a message");
String addr = theOscMessage.addrPattern();
//println("addr " + addr);
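// "/2/push10" also contains "/2/push1", so exclude it here to keep button 10 from triggering button 1
if(addr.indexOf("/2/push1") != -1 && addr.indexOf("/2/push10") == -1){
int i = int((addr.charAt(7))) - 0x30;
button[i] = int(theOscMessage.get(0).floatValue());
}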
if(addr.indexOf("/2/push2") != -1){
int i = int((addr.charAt(7))) - 0x30;
button[i] = int(theOscMessage.get(0).floatValue());
}
if(addr.indexOf("/2/push3") != -1){
int i = int((addr.charAt(7))) - 0x30;
button[i] = int(theOscMessage.get(0).floatValue());
}
if(addr.indexOf("/2/push4") != -1){
int i = int((addr.charAt(7))) - 0x30;
button[i] = int(theOscMessage.get(0).floatValue());
}
if(addr.indexOf("/2/push5") != -1){
int i = int((addr.charAt(7))) - 0x30;
button[i] = int(theOscMessage.get(0).floatValue());
}
if(addr.indexOf("/2/push6") != -1){
int i = int((addr.charAt(7))) - 0x30;
button[i] = int(theOscMessage.get(0).floatValue());
}
if(addr.indexOf("/2/push7") != -1){
int i = int((addr.charAt(7))) - 0x30;
button[i] = int(theOscMessage.get(0).floatValue());
}
if(addr.indexOf("/2/push8") != -1){
int i = int((addr.charAt(7))) - 0x30;
button[i] = int(theOscMessage.get(0).floatValue());
}
if(addr.indexOf("/2/push9") != -1){
int i = int((addr.charAt(7))) - 0x30;
button[i] = int(theOscMessage.get(0).floatValue());
}
if(addr.indexOf("/2/push10") != -1){
int i = 10;
button[i] = int(theOscMessage.get(0).floatValue());
}
if(addr.indexOf("/1/push1") != -1){
int i = 11;
button[i] = int(theOscMessage.get(0).floatValue());
}
//===============================================================
// the main PTT button
if(button[11] == 0 && micOpen == true){
micOpen = false;
stt.end();
}
else if(button[11] == 1 && micOpen == false){
micOpen = true;
stt.begin();
}
if(button[1] == 1){
port.write('q');
println("Desk lamp on");
one = true;
}
if(button[2] == 1){
port.write('e');
two = true;
}
if(button[3] == 1){
port.write('t');
three = true;
}
if(button[4] == 1){
port.write('u');
four = true;
}
if(button[5] == 1){
port.write('o');
five = true;
}
if(button[6] == 1){
port.write('w');
one = false;
}
if(button[7] == 1){
port.write('r');
two = false;
}
if(button[8] == 1){
port.write('y');
three = false;
}
if(button[9] == 1){
port.write('i');
four = false;
}
if(button[10] == 1){
port.write('p');
five = false;
}

}
// */
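
The address matching above can also be done by pulling the page and button numbers straight out of the OSC address. The Java sketch below assumes the addresses look like "/<page>/push<number>", as in the handler above. The names `OscAddress`, `parsePush`, and `slotFor` are hypothetical, and nothing here touches the oscP5 library; it only works on the address string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: decodes addresses such as "/2/push10" into the
// slot of the button[] array that the handler above would use.
class OscAddress {
    private static final Pattern PUSH = Pattern.compile("^/(\\d{1,2})/push(\\d{1,2})$");

    // Returns {page, button}, or null if the address is not a push button.
    static int[] parsePush(String addr) {
        Matcher m = PUSH.matcher(addr);
        if (!m.matches()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    // Page 2 buttons 1..10 keep their own number; page 1 button 1 is the
    // push-to-talk slot 11. Anything else gives -1.
    static int slotFor(String addr) {
        int[] pb = parsePush(addr);
        if (pb == null) {
            return -1;
        }
        if (pb[0] == 2 && pb[1] >= 1 && pb[1] <= 10) {
            return pb[1];
        }
        if (pb[0] == 1 && pb[1] == 1) {
            return 11;
        }
        return -1;
    }
}
```

Because the whole address has to match, "/2/push10" maps only to slot 10 and never to slot 1, which is the thing to watch for when matching on substrings.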

Well, by this time you should be showing your family and friends your great new project! Thanks for reading!

--trademark