TV Remote Control based on Head Movements

A low-cost, open-source universal remote control system that translates the user's head poses into commands for electronic devices.

The project presents an open-source, low-cost universal remote control system that translates the user's head poses into commands for electronic devices. In addition, a proximity sensor circuit was combined with radio-frequency modules to act as a wireless switch. Among the alternative remote controls we found, none supports head gestures as input, even though that would make them a viable option for people whose upper limbs are compromised.

According to the Brazilian Institute of Geography and Statistics (IBGE), 23.9% of Brazilians have declared having some kind of disability. This is a huge number: out of a population of 190.7 million, it means roughly 45.6 million people in a single country have special needs. These statistics also reveal that 6.9% of the population has declared some motor impairment, i.e., approximately 13.2 million people have some kind of difficulty moving their limbs.

Among the applications that make use of gesture recognition technology, we emphasize i) home automation systems, since they provide convenience and comfort in the task of controlling electronic devices; and ii) systems developed to help people who have difficulty manipulating such devices. The latter usually fall under the Assistive Technology (AT) concept, since they are often among the few solutions that are in fact accessible to people with motor impairments, whose limitations typically involve moving around and/or manipulating physical objects.

Therefore, this project aims at developing a low-cost, universal remote control system based on two embedded platforms -- C.H.I.P. and Arduino -- configured together as a centralized hub in the home environment to control electronic devices through the IRremote library. The two boards communicate via Bluetooth: C.H.I.P. uses the BlueZ library and Arduino uses the SoftwareSerial library. The system processes video images in real time through Computer Vision techniques implemented with the OpenCV library, which are the basis for recognizing the head movements that act as an alternative control interface, thus providing autonomy for people with disabilities. Besides, a proximity sensor was combined with a radio-frequency module to behave as a wireless external trigger or AT switch, which lets the user turn the system on. An overall schematic of the system is shown below as a "flowchart". Notice that the main components are open-source and open-hardware.
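To give an idea of how simple the Arduino end of the Bluetooth-to-infrared bridge is, here is a minimal sketch of what it can look like. The HC-05 module, the pin choices and the one-byte command protocol are illustrative assumptions, not the project's actual code, and the NEC codes depend on your TV:

        #include <SoftwareSerial.h>
        #include <IRremote.h>

        SoftwareSerial bt(10, 11); // RX, TX wired to the Bluetooth module (assumed pins)
        IRsend irsend;             // classic IRremote drives the IR LED on pin 3 of the UNO

        void setup() {
          bt.begin(9600);          // default baud rate of most HC-05 modules
        }

        void loop() {
          if (bt.available()) {
            char cmd = bt.read();  // hypothetical one-byte protocol from the C.H.I.P.
            switch (cmd) {
              case 'P': irsend.sendNEC(0x20DF10EF, 32); break; // e.g. power toggle
              case 'U': irsend.sendNEC(0x20DF40BF, 32); break; // e.g. volume up
              case 'D': irsend.sendNEC(0x20DFC03F, 32); break; // e.g. volume down
            }
          }
        }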

The main idea is that people whose upper limbs are compromised (but not completely paralyzed or amputated), and whose head/neck movements are preserved, can make use of six head movements to transmit commands such as turn on/off, increase/decrease, and forward/backward to many home appliances that can be remotely controlled. It is important to mention that all APIs, embedded platforms and software packages used to create the proposed remote control system are freely available on the Internet. C.H.I.P. and Arduino (both open-hardware) follow the Creative Commons Attribution-ShareAlike (CC BY-SA) license. C.H.I.P., by the way, also runs a Debian Jessie OS (Linux-based, open-source), which is GPL licensed, as are the BlueZ Bluetooth stack (open-source), its C/C++ API and the IRremote library (open-source). Finally, OpenCV (open-source) follows the Berkeley Software Distribution (BSD 3-clause) license. The hardware modules are not free, but their schematics are, in case you're willing to build yours on your own :) A demonstration of the system is shown in the video below:

For visualization purposes, the system in the video runs on a desktop computer with Ubuntu 14.04 LTS, and the proximity AT switch was not used. To be clear: the system was initialized from the laptop keyboard, and the C.H.I.P. microcomputer (which runs a Linux-based OS) was replaced by a full laptop computer. An Arduino UNO, placed in front of the Dracula mug, runs code that receives a signal from the laptop via Bluetooth and transmits the respective command to the TV through an infrared LED connected to an amplifier circuit.

Note: you can also see a "lag" whenever I change the pose of my head when performing the movements. This happens because I did not create a thread to send the commands via Bluetooth. So, since writing to a socket is a blocking call, the frames captured by the camera are not processed until the write returns, which freezes the video for a moment.
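A minimal sketch of one possible fix, assuming `sock` is the RFCOMM socket descriptor (the function name is illustrative): hand the blocking write to a short-lived worker thread so the capture loop keeps running.

        #include <string>
        #include <thread>
        #include <unistd.h>

        // Fire-and-forget sender: only the worker thread blocks on write(),
        // so the OpenCV capture loop stays free to keep grabbing frames.
        void send_async(int sock, std::string cmd) {
            std::thread([sock, cmd]() {
                write(sock, cmd.c_str(), cmd.size());
            }).detach();
        }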


  • Another schematic created

    Cassio Batista, 2 hours ago

    Since Erick is preparing his presentation for an event at the university, we created another schematic of the project. It was added to the details section. A masterpiece!

  • Head Gesture Recognition instructions!

    Cassio Batista, 5 days ago

    After a long period of absence, here I am again. I just started to write the instructions for head gesture recognition, but I'm not finished with them yet. Erick will help me with the code, so that it ends up well documented, with the useless lines hidden.

    The face recognition step is almost finished. Tomorrow I'm planning to write the head pose estimation step.

    I think I need to write a separate instruction about the Bluetooth communication between the C.H.I.P. (Linux) and the Arduino as well. But one step at a time, right? ;)

  • HPE Instructions

    Cassio Batista, 09/07/2017 at 10:42

    This month I've been busy preparing a presentation for a conference in Portugal, but I'm planning to write the instructions on how to implement the head pose estimation (HPE) in OpenCV next week. I'll leave here the link to the paper that my colleague Erick Campos used in his implementation: https://www.researchgate.net/publication/262320501_Real-Time_Head_Pose_Estimation_for_Mobile_Devices

  • License info

    Cassio Batista, 08/22/2017 at 18:24

    Just added license info to the "Details" section. It's not pretty the way it is, so I'm planning to move the text somewhere else next time.

  • Link to git repo updated

    Cassio Batista, 08/22/2017 at 17:29

    Erick, my colleague, has started maintaining the project on GitLab. I'm a bit of a newbie with Git, so I've just imported the project from GitLab to GitHub, since I've seen a complaint about repository permissions. Well, the link to GitHub is available now.

  • Documentation of wireless proximity switch

    Cassio Batista, 08/22/2017 at 16:53

    It seems I finished the documentation of the RF-based wireless, IR-based proximity AT switch. It also seems I can't edit an instruction at all, so I've been creating a new instruction every time I wanted to change some text or insert an image or video, while deleting the previous one. Anyway, it's done! :)

  • Instructions to build a proximity, wireless switch

    Cassio Batista, 08/22/2017 at 11:57

    Last night I started to write the instructions to build the proximity, wireless button that activates the system, but I'm still fighting against the hackaday system. I'm not sure whether I'm stupid or the edit section has a real bug, but I can't modify the things I've written.

  • English subs added to YouTube video

    Cassio Batista, 08/17/2017 at 18:51

    English subtitles were added to a second video, which is an introduction to the project. It's been posted on the "Details" section.

  • Description done!

    Cassio Batista, 08/17/2017 at 03:35

    Project "description" section finished. I think more details of implementation goes under the "instructions" section. I also must put English subs on my YouTube videos.


  • 1
    IR-based proximity, RF-based wireless AT switch

    In order to turn the remote control system on, an external switch is used. You can think of it as a button, except you don't need to press it: just hover something above it. It also works apart from the circuit it triggers, which means it acts as a wireless trigger. To do so, an infrared-based proximity switch was combined with radio-frequency modules (encoders, decoders and antennas). You can think of this switch as a low-cost button that you can "press" from the kitchen to turn something on in a circuit in the living room. The details of construction and implementation are given below.

    Transmitter side of the switch

    The proximity switch is based on the principle of a simple line-follower circuit. An infrared LED (IR LED) is placed beside an infrared sensor (IR sensor), as depicted in the image below, with both pointing in the same direction (up). This circuit is kept close to the user to be used as his/her "hover button". Once an object is placed over the IR components, the infrared light emitted by the LED is expected to be reflected by the object onto the IR sensor, which "detects" the approach of the object through the change in the voltage at its anode.

    Since that voltage change is really small, an operational amplifier is used to (guess what) amplify the signal. The op amp (IC LM358) is combined with a 10 kΩ potentiometer to work as a voltage comparator: when the voltage at the IR sensor's anode is greater than the voltage at the potentiometer, the 5 V from the battery goes to a red LED, which provides visual feedback to the user about the approach of the object. Plus, a resistor was placed in series with the status LED in order to divide the voltage that goes to the pin of the radio-frequency encoder module (HT12E), as can be seen in the image below.

    The HT12E integrated circuit transmits encoded 12-bit parallel data via an RF transmitter module at 433 MHz. The same voltage used to turn on the red status LED is applied to the encoder's pin 10 (AD8, an input data pin; there are three other AD pins, since the IC has 4 channels). The data pin of the RF module (transmitter antenna, Tx), on the other hand, is connected to pin 17 of the HT12E (DOUT, data output pin). The DOUT pin serializes the data to the antenna if, and only if, there is a high voltage on the AD8 input pin. Otherwise, no RF wireless signal is generated by the transmitter circuit.

    Receiver side of the switch

    The receiver part of the wireless switch stays close to the microcomputer in order to (guess again) receive the encoded signal from the RF-Tx modules, decode the information and turn the remote control system on. The output pin of the RF module (receiver antenna, Rx) is connected to the decoder's pin 14 (DIN, input pin). The HT12D integrated circuit reads the serial data captured by the RF-Rx module on its pin 14 and decodes the 12 bits, trying to match the address. If the data is successfully decoded, the output is generated on one of its data pins. Pin 10 (D8) was used here. The schematic is shown below.

    The HT12D's D8 output pin is connected to an input pin (GPI) of the microcomputer. Since the C.H.I.P. supports only 3.3 V on its input pins, a voltage divider was built with an LED and a resistor to limit the voltage at the output of the decoder, which avoids burning the microcomputer pin (a red LED drops roughly 2 V, so the 5 V output reaches the pin at a safe level of about 3 V).
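    On the software side, the C.H.I.P. just needs to watch that input pin. Below is a minimal sketch of how the pin can be polled through the Linux sysfs interface; the GPIO number (408, which maps to XIO-P0 on some C.H.I.P. kernels) is an assumption, and the pin must already be exported and configured as an input:

        #include <chrono>
        #include <fstream>
        #include <string>
        #include <thread>

        // Block until the HT12D's D8 line (wired to a C.H.I.P. GPIO through the
        // voltage divider) reads high, i.e., until the user hovers a hand over
        // the transmitter side of the switch.
        void wait_for_at_switch(const std::string& gpio = "408") {
            const std::string path = "/sys/class/gpio/gpio" + gpio + "/value";
            char value = '0';
            do {
                std::this_thread::sleep_for(std::chrono::milliseconds(50));
                std::ifstream f(path); // reopen so we read the current level
                f >> value;
            } while (value != '1');
        }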

    You can see a demonstration of the wireless, proximity AT switch in the video below:

  • 2
    Head Gesture Recognition

    Once the system is turned on by the external AT switch, the camera starts capturing frames of the user's face/head and the image processing takes place through computer vision techniques. The head gesture recognition task is performed by the combination of two key techniques: face detection and head pose estimation, as depicted in the flowchart below. OpenCV, an open-source Computer Vision library, provides the algorithms used in our remote control. This procedure was proposed in the paper [Real-Time Head Pose Estimation for Mobile Devices].

    Face Detection

    The first step of head gesture recognition is to detect a face. This is done with OpenCV's detectMultiScale() method, which implements the Viola-Jones algorithm, published in the papers [Rapid Object Detection using a Boosted Cascade of Simple Features] and [Robust Real-Time Face Detection]. It basically applies a cascade of boosted weak classifiers over each frame in order to detect a face. If a face is successfully detected, three points that represent the locations of the eyes and the nose are calculated on a 2D plane. An example of the algorithm applied to my face is shown in the picture below:

    Pretty face, isn't it? The algorithm also needs to detect a face in 10 consecutive frames in order to ensure reliability. If this condition is met, the face detection routine stops and the tracking of the three points starts, trying to detect a change in the pose of the head. If Viola-Jones fails to detect a face in a single frame, the counter is reset. You can see below a chunk of code with the main parts of our implementation.

            face_count = 0;
            /* Viola-Jones face detection loop: runs until a face has been
             * seen in 10 consecutive frames */
            while(!is_face) {
                /* get captured frame from device */
                camera >> frame;

                /* Viola-Jones face detector */
                rosto.detectMultiScale(frame, face, 2.1, 3, 0|CV_HAAR_SCALE_IMAGE, Size(100, 100));

                /* draw a rectangle around each detected face */
                for(int i=0; i < face.size(); i++){
                    rectangle(frame, face[i], CV_RGB(0, 255, 0), 1);
                    face_dim = frame(face[i]).size().height; // face dimension
                }

                if((int) face.size() > 0) {
                    /* initial coordinates of the eyes and the nose, estimated
                     * as fixed proportions of the face bounding box */
                    re_x  = face[0].x + face_dim*0.3;  // right eye
                    le_x  = face[0].x + face_dim*0.7;  // left eye
                    eye_y = face[0].y + face_dim*0.38; // eye height

                    ponto[0] = cvPoint(re_x, eye_y); // right eye
                    ponto[1] = cvPoint(le_x, eye_y); // left eye
                    ponto[2] = cvPoint((re_x+le_x)/2, eye_y+int((le_x-re_x)*0.45)); // nose

                    /* draw circles around the eyes and the nose */
                    for(int i=0; i<3; i++)
                        circle(frame, ponto[i], 2, Scalar(10,10,255), 4, 8, 0);

                    /* count consecutive frames in which a face was detected */
                    face_count++;
                } else {
                    face_count = 0;
                }

                /* Viola-Jones ends when a face has been detected in 10
                 * consecutive frames: the current frame, in grayscale, becomes
                 * the reference for the optical flow tracking */
                if(face_count >= 10) {
                    is_face = true;
                    frame.copyTo(Prev);
                    cvtColor(Prev, Prev_Gray, CV_BGR2GRAY);
                }

                flip(frame, frame, 1);   // mirror for display
                imshow("Camera", frame); // show the frames captured by the webcam
                esc = waitKey(30);
                if(esc == 27) // press ESC to quit
                    break;
            }//close while not face
    

    We also provide a block diagram of how the face detection routine works in our implementation.

    Note: these chunks are just previews, guys, don't worry. The full code lives in the repository, and I'm praying for Erick to clean up its comments and indentation, which are still a mess.

    Head Pose Estimation

    The gesture recognition is in fact achieved through the estimation of the head pose. Here, the tracking of the face points is performed by calculating the optical flow, thanks to the Lucas-Kanade algorithm implemented in OpenCV's calcOpticalFlowPyrLK() method. This algorithm, proposed in the paper [An iterative image registration technique with an application to stereo vision], basically compares two frames -- an old frame and the current one -- in order to estimate, based on the old one, where the points will appear in the current frame.

    Defining the three face points found by the optical flow on the current frame $t$ as $(x_i^{t}, y_i^{t})$, where $i \in \{re, le, n\}$ stands for the right eye, the left eye and the nose, and the same points on the previous (old) frame as $(x_i^{t-1}, y_i^{t-1})$, we can calculate the head rotation around the three axes (which can also be called Tait-Bryan angles or Euler angles: roll, yaw and pitch) through a set of three equations:

    $$roll = \frac{180}{\pi}\arctan\!\left(\frac{y_{le}^{t} - y_{re}^{t}}{x_{le}^{t} - x_{re}^{t}}\right) \qquad yaw = x_{n}^{t-1} - x_{n}^{t} \qquad pitch = y_{n}^{t} - y_{n}^{t-1}$$

    You can see below a chunk of code with the main parts of our implementation.
            yaw_count   = 0;
            pitch_count = 0;
            roll_count  = 0;
            noise_count = 0;
            /* Optical flow loop: track the three face points until a gesture
             * (or too much noise) is detected */
            while(is_face) {
                action = "erro"; // default action: "error"
                /* get captured frame from device */
                camera >> frame;

                /* convert the current frame to grayscale */
                frame.copyTo(Current);
                cvtColor(Current, Current_Gray, CV_BGR2GRAY);

                /* compute the optical flow between the previous and the
                 * current frame (Lucas-Kanade) */
                cv::calcOpticalFlowPyrLK(Prev_Gray, Current_Gray,
                            ponto, saida, status, err, Size(15,15), 1);

                /* points found by the optical flow */
                for(int i=0; i<3; i++)
                    face_triang[i] = saida[i];

                /* geometry of the eye-eye-nose triangle: the distance from
                 * the camera to the user's face is inversely proportional to
                 * the distance between the eyes */

                /* distance between the eyes */
                float d_e2e = sqrt(
                            pow((face_triang[0].x-face_triang[1].x),2) +
                            pow((face_triang[0].y-face_triang[1].y),2));

                /* distance between the right eye and the nose */
                float d_re2n = sqrt(
                            pow((face_triang[0].x-face_triang[2].x),2) +
                            pow((face_triang[0].y-face_triang[2].y),2));

                /* distance between the left eye and the nose */
                float d_le2n = sqrt(
                            pow((face_triang[1].x-face_triang[2].x),2) +
                            pow((face_triang[1].y-face_triang[2].y),2));

                /* error conditions that make the optical flow stop: the
                 * triangle lost its expected shape or scale */
                if(d_e2e/d_re2n < 0.5 || d_e2e/d_re2n > 2.5) {
                    cout << "too much noise 0." << endl;
                    is_face = false;
                    break;
                }
                if(d_e2e/d_le2n < 0.5 || d_e2e/d_le2n > 2.5) {
                    cout << "too much noise 1." << endl;
                    is_face = false;
                    break;
                }
                if(d_e2e > 160.0 || d_e2e < 20.0) {
                    cout << "too much noise 2." << endl;
                    is_face = false;
                    break;
                }
                if(d_re2n > 140.0 || d_re2n < 10.0) {
                    cout << "too much noise 3." << endl;
                    is_face = false;
                    break;
                }
                if(d_le2n > 140.0 || d_le2n < 10.0) {
                    cout << "too much noise 4." << endl;
                    is_face = false;
                    break;
                }

                /* draw circles around the eyes and the nose */
                for(int i=0; i<3; i++)
                    circle(frame, face_triang[i], 2, Scalar(255,255,5), 4, 8, 0);

                /* head rotation axes (Euler angles), as in the equations above */
                float param = (face_triang[1].y-face_triang[0].y) / (float)(face_triang[1].x-face_triang[0].x);
                roll  = 180*atan(param)/M_PI;
                yaw   = ponto[2].x - face_triang[2].x;
                pitch = face_triang[2].y - ponto[2].y;

                /* debug */
                printf("%4d,%4d,%4d\t%4d\n", yaw, pitch, roll, noise_count);

                /* accumulate small yaw displacements; abrupt jumps count as noise */
                if((yaw > -20 && yaw < 0) || (yaw > 0 && yaw < +20)) {
                    yaw_count += yaw;
                } else {
                    yaw_count = 0;
                    if(yaw < -40 || yaw > +40)
                        noise_count++;
                }

                /* accumulate small pitch displacements; abrupt jumps count as noise */
                if((pitch > -20 && pitch < 0) || (pitch > 0 && pitch < +20)) {
                    pitch_count += pitch;
                } else {
                    pitch_count = 0;
                    if(pitch < -40 || pitch > +40)
                        noise_count++;
                }

                /* accumulate moderate roll angles; extreme tilts count as noise */
                if((roll > -60 && roll < -15) || (roll > +15 && roll < +60)) {
                    roll_count += roll;
                } else {
                    roll_count = 0;
                    if(roll < -60 || roll > +60)
                        noise_count++;
                }

                /* give up when too many noisy measurements accumulate */
                if(noise_count > 2) {
                    cout << "too much noise." << endl;
                    is_face = false;
                    break;
                }

                /* yaw/pitch gestures are only accepted while the head is not
                 * tilted (roll close to zero). The Portuguese strings are the
                 * command words sent over Bluetooth */
                if(roll_count > -1 && roll_count < +1) {
                    if(yaw_count <= -20) {
                        action = "canal menos"; // channel down
                        cout << "yaw left\tchannel down" << endl;
                        is_face = false;
                        break;
                    } else if(yaw_count >= +20) {
                        action = "canal mais"; // channel up
                        cout << "yaw right\tchannel up" << endl;
                        is_face = false;
                        break;
                    }

                    // pitch
                    if(pitch_count <= -10) {
                        action = "aumentar volume"; // volume up
                        cout << "pitch up\tvolume up" << endl;
                        is_face = false;
                        break;
                    } else if(pitch_count >= +10) {
                        action = "diminuir volume"; // volume down
                        cout << "pitch down\tvolume down" << endl;
                        is_face = false;
                        break;
                    }
                }

                // roll
                if(roll_count < -150) {
                    action = "ligar televisao"; // TV on
                    cout << "roll right\tTV on" << endl;
                    is_face = false;
                    break;
                } else if(roll_count > +150) {
                    action = "desligar televisao"; // TV off
                    cout << "roll left\tTV off" << endl;
                    is_face = false;
                    break;
                }

                /* tracked points become the reference for the next optical
                 * flow computation */
                for(int j=0; j<3; j++)
                    ponto[j] = saida[j];

                /* current frame becomes the previous frame */
                Current_Gray.copyTo(Prev_Gray);

                flip(frame, frame, 1);   // mirror for display
                imshow("Camera", frame); // show the frames captured by the webcam
                esc = waitKey(30);
                if(esc == 27) // press ESC to quit
                    break;
            }//close while isface
    

    A flowchart of the implemented head pose estimation routine is shown below:

    That's it!


Discussions

jlbrian7 wrote 08/22/2017 at 02:02:

The gitlab project doesn't exist.


Cassio Batista wrote 08/22/2017 at 02:09:

I think it's not public. I'll try to move it to GitHub this week :)

