TV Remote Control based on Head Movements

A low-cost, open-source universal remote control system that translates the user's head poses into commands for electronic devices.

The project presents an open-source, low-cost universal remote control system that translates the user's head poses into commands for electronic devices. In addition, a proximity sensor circuit was combined with radio-frequency modules to act as a wireless switch. Among the alternative remote controls we found, none supports head gestures as input, even though that would make them a viable option for people whose upper limbs are compromised.

According to the Brazilian Institute of Geography and Statistics (IBGE), 23.9% of Brazilians have declared having some kind of disability. This is a huge number: out of a population of 190.7 million, it means roughly 45.6 million people in a single country have special needs. These statistics also reveal that 6.9% of the population has declared some motor impairment, i.e., approximately 13.2 million people have some kind of difficulty moving their limbs.

Among the applications that make use of gesture recognition technology, we emphasize i) home automation systems, since they provide convenience and comfort in the task of controlling electronic devices; and ii) systems developed to help people who have difficulty manipulating such devices. The latter usually fall under the Assistive Technology (AT) concept, since they are often among the few solutions that are in fact accessible to people with motor impairments, whose limitations typically involve moving around and/or manipulating physical objects.

Therefore, this project aims at developing a low-cost, universal remote control system based on two embedded platforms -- C.H.I.P. and Arduino -- configured together as a centralized hub in the home environment to control electronic devices through the IRremote library. The two boards communicate via Bluetooth: C.H.I.P. uses the BlueZ library and Arduino uses the SoftwareSerial library. The system processes video images in real time through Computer Vision techniques implemented with the OpenCV library, which are the basis for recognizing the head movements that act as an alternative control interface, thus providing autonomy for people with disabilities. Besides, a proximity sensor was combined with a radio-frequency module to behave as a wireless external trigger or AT switch, which lets the user turn the system on. An overall schematic of the system is shown below as a "flowchart". Notice that the main components are open-source and open-hardware.
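To give an idea of how simple the Arduino end of the Bluetooth-to-infrared bridge is, here is a minimal sketch of what it can look like. The HC-05 module, the pin choices and the one-byte command protocol are illustrative assumptions, not the project's actual code, and the NEC codes depend on your TV:

        #include <SoftwareSerial.h>
        #include <IRremote.h>

        SoftwareSerial bt(10, 11); // RX, TX wired to the Bluetooth module (assumed pins)
        IRsend irsend;             // classic IRremote drives the IR LED on pin 3 of the UNO

        void setup() {
          bt.begin(9600);          // default baud rate of most HC-05 modules
        }

        void loop() {
          if (bt.available()) {
            char cmd = bt.read();  // hypothetical one-byte protocol from the C.H.I.P.
            switch (cmd) {
              case 'P': irsend.sendNEC(0x20DF10EF, 32); break; // e.g. power toggle
              case 'U': irsend.sendNEC(0x20DF40BF, 32); break; // e.g. volume up
              case 'D': irsend.sendNEC(0x20DFC03F, 32); break; // e.g. volume down
            }
          }
        }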

The main idea is that people whose upper limbs are compromised (but not completely paralyzed or amputated), and whose head/neck movements are preserved, can make use of six head movements to transmit commands such as turn on/off, increase/decrease, and forward/backward to many home appliances that can be remotely controlled. It is important to mention that all APIs, embedded platforms and software packages used to create the proposed remote control system are freely available on the Internet. C.H.I.P. and Arduino (both open-hardware) follow the Creative Commons Attribution-ShareAlike (CC BY-SA) license. C.H.I.P., by the way, also runs a Debian Jessie OS (Linux-based, open-source), which is GPL licensed, as are the BlueZ Bluetooth stack (open-source), its C/C++ API and the IRremote library (open-source). Finally, OpenCV (open-source) follows the Berkeley Software Distribution (BSD 3-clause) license. The hardware modules are not free, but their schematics are, in case you're willing to build yours on your own :) A demonstration of the system is shown in the video below:

For visualization purposes, the system in the video runs on a desktop computer with Ubuntu 14.04 LTS, and the proximity AT switch was not used. To be clear: the system was initialized from the laptop keyboard, and the C.H.I.P. microcomputer (which runs a Linux-based OS) was replaced by a full laptop computer. An Arduino UNO, placed in front of the Dracula mug, runs code that receives a signal from the laptop via Bluetooth and transmits the respective command to the TV through an infrared LED connected to an amplifier circuit.

Note: you can also see a "lag" whenever I change the pose of my head when performing the movements. This happens because I did not create a thread to send the commands via Bluetooth. So, since writing to a socket is a blocking call, the frames captured by the camera are not processed until the write returns, which freezes the video for a moment.
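A minimal sketch of one possible fix, assuming `sock` is the RFCOMM socket descriptor (the function name is illustrative): hand the blocking write to a short-lived worker thread so the capture loop keeps running.

        #include <string>
        #include <thread>
        #include <unistd.h>

        // Fire-and-forget sender: only the worker thread blocks on write(),
        // so the OpenCV capture loop stays free to keep grabbing frames.
        void send_async(int sock, std::string cmd) {
            std::thread([sock, cmd]() {
                write(sock, cmd.c_str(), cmd.size());
            }).detach();
        }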


  • Another schematic created

    Cassio Batista, 2 hours ago

    Since Erick is preparing his presentation for an event at the university, we created another schematic of the project. It was added to the details section. A masterpiece!

  • Head Gesture Recognition instructions!

    Cassio Batista, 5 days ago

    After a long period of absence, here I am again. I just started to write the instructions for head gesture recognition, but I'm not finished with them yet. Erick will help me with the code, so that it ends up well documented, with the useless lines hidden.

    The face recognition step is almost finished. Tomorrow I'm planning to write the head pose estimation step.

    I think I need to write a separate instruction about the Bluetooth communication between the C.H.I.P. (Linux) and the Arduino as well. But one step at a time, right? ;)

  • HPE Instructions

    Cassio Batista, 09/07/2017 at 10:42

    This month I've been busy preparing a presentation for a conference in Portugal, but I'm planning to write the instructions on how to implement the head pose estimation (HPE) in OpenCV next week. I'll leave here the link to the paper that my colleague Erick Campos used in his implementation: https://www.researchgate.net/publication/262320501_Real-Time_Head_Pose_Estimation_for_Mobile_Devices

  • License info

    Cassio Batista, 08/22/2017 at 18:24

    Just added license info to the "Details" section. It's not pretty the way it is, so I'm planning to move the text somewhere else next time.

  • Link to git repo updated

    Cassio Batista, 08/22/2017 at 17:29

    Erick, my colleague, has started maintaining the project on GitLab. I'm a bit of a newbie with Git, so I've just imported the project from GitLab to GitHub, since I've seen a complaint about repository permissions. Well, the link to GitHub is available now.

  • Documentation of wireless proximity switch

    Cassio Batista, 08/22/2017 at 16:53

    It seems I finished the documentation of the RF-based wireless, IR-based proximity AT switch. It also seems I can't edit an instruction at all, so I've been creating a new instruction every time I wanted to change some text or insert an image or video, while deleting the previous one. Anyway, it's done! :)

  • Instructions to build a proximity, wireless switch

    Cassio Batista, 08/22/2017 at 11:57

    Last night I started to write the instructions to build the proximity, wireless button that activates the system, but I'm still fighting against the hackaday system. I'm not sure whether I'm stupid or the edit section has a real bug, but I can't modify the things I've written.

  • English subs added to YouTube video

    Cassio Batista, 08/17/2017 at 18:51

    English subtitles were added to a second video, which is an introduction to the project. It's been posted on the "Details" section.

  • Description done!

    Cassio Batista, 08/17/2017 at 03:35

    Project "description" section finished. I think more details of implementation goes under the "instructions" section. I also must put English subs on my YouTube videos.


  • 1
    IR-based proximity, RF-based wireless AT switch

    In order to turn the remote control system on, an external switch is used. You can think of it as a button, except you don't need to press it: just hover something above it. It also works apart from the circuit it triggers, which means it acts as a wireless trigger. To do so, an infrared-based proximity switch was combined with radio-frequency modules (encoders, decoders and antennas). You can think of this switch as a low-cost button that you can "press" from the kitchen to turn something on in a circuit in the living room. The details of construction and implementation are given below.

    Transmitter side of the switch

    The proximity switch is based on the principle of a simple line-follower circuit. An infrared LED (IR LED) is placed beside an infrared sensor (IR sensor), as depicted in the image below, with both pointing in the same direction (up). This circuit is kept close to the user to be used as his/her "hover button". Once an object is placed over the IR components, the infrared light emitted by the LED is expected to be reflected by the object onto the IR sensor, which "detects" the approach of the object through the change in the voltage at its anode.

    Since that voltage change is really small, an operational amplifier is used to (guess what) amplify the signal. The op amp (IC LM358) is combined with a 10 kΩ potentiometer to work as a voltage comparator: when the voltage at the IR sensor's anode is greater than the voltage at the potentiometer, the 5 V from the battery goes to a red LED, which provides visual feedback to the user about the approach of the object. Plus, a resistor was placed in series with the status LED in order to divide the voltage that goes to the pin of the radio-frequency encoder module (HT12E), as can be seen in the image below.

    The HT12E integrated circuit transmits encoded 12-bit parallel data via an RF transmitter module at 433 MHz. The same voltage used to turn on the red status LED is applied to the encoder's pin 10 (AD8, an input data pin; there are three other AD pins, since the IC has 4 channels). The data pin of the RF module (transmitter antenna, Tx), on the other hand, is connected to pin 17 of the HT12E (DOUT, data output pin). The DOUT pin serializes the data to the antenna if, and only if, there is a high voltage on the AD8 input pin. Otherwise, no RF wireless signal is generated by the transmitter circuit.

    Receiver side of the switch

    The receiver part of the wireless switch stays close to the microcomputer in order to (guess again) receive the encoded signal from the RF-Tx modules, decode the information and turn the remote control system on. The output pin of the RF module (receiver antenna, Rx) is connected to the decoder's pin 14 (DIN, input pin). The HT12D integrated circuit reads the serial data captured by the RF-Rx module on its pin 14 and decodes the 12 bits, trying to match the address. If the data is successfully decoded, the output is generated on one of its data pins. Pin 10 (D8) was used here. The schematic is shown below.

    The HT12D's D8 output pin is connected to an input pin (GPI) of the microcomputer. Since the C.H.I.P. supports only 3.3 V on its input pins, a voltage divider was built with an LED and a resistor to limit the voltage at the output of the decoder, which avoids burning the microcomputer pin (a red LED drops roughly 2 V, so the 5 V output reaches the pin at a safe level of about 3 V).
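    On the software side, the C.H.I.P. just needs to watch that input pin. Below is a minimal sketch of how the pin can be polled through the Linux sysfs interface; the GPIO number (408, which maps to XIO-P0 on some C.H.I.P. kernels) is an assumption, and the pin must already be exported and configured as an input:

        #include <chrono>
        #include <fstream>
        #include <string>
        #include <thread>

        // Block until the HT12D's D8 line (wired to a C.H.I.P. GPIO through the
        // voltage divider) reads high, i.e., until the user hovers a hand over
        // the transmitter side of the switch.
        void wait_for_at_switch(const std::string& gpio = "408") {
            const std::string path = "/sys/class/gpio/gpio" + gpio + "/value";
            char value = '0';
            do {
                std::this_thread::sleep_for(std::chrono::milliseconds(50));
                std::ifstream f(path); // reopen so we read the current level
                f >> value;
            } while (value != '1');
        }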

    You can see a demonstration of the wireless, proximity AT switch in the video below:

  • 2
    Head Gesture Recognition

    Once the system is turned on by the external AT switch, the camera starts capturing frames of the user's face/head and the image processing takes place through computer vision techniques. The head gesture recognition task is performed by the combination of two key techniques: face detection and head pose estimation, as depicted in the flowchart below. OpenCV, an open-source Computer Vision library, provides the algorithms used in our remote control. This procedure was proposed in the paper [Real-Time Head Pose Estimation for Mobile Devices].

    Face Detection

    The first step of head gesture recognition is to detect a face. This is done with OpenCV's detectMultiScale() method, which implements the Viola-Jones algorithm, published in the papers [Rapid Object Detection using a Boosted Cascade of Simple Features] and [Robust Real-Time Face Detection]. It basically applies a cascade of boosted weak classifiers over each frame in order to detect a face. If a face is successfully detected, three points that represent the locations of the eyes and the nose are calculated on a 2D plane. An example of the algorithm applied to my face is shown in the picture below:

    Pretty face, isn't it? The algorithm also needs to detect a face in 10 consecutive frames in order to ensure reliability. If this condition is met, the face detection routine stops and the tracking of the three points starts, trying to detect a change in the pose of the head. If Viola-Jones fails to detect a face in a single frame, the counter is reset. You can see below a chunk of code with the main parts of our implementation.

            face_count = 0;
            /* Viola-Jones face detection loop: runs until a face has been
             * seen in 10 consecutive frames */
            while(!is_face) {
                /* get captured frame from device */
                camera >> frame;

                /* Viola-Jones face detector */
                rosto.detectMultiScale(frame, face, 2.1, 3, 0|CV_HAAR_SCALE_IMAGE, Size(100, 100));

                /* draw a rectangle around each detected face */
                for(int i=0; i < face.size(); i++){
                    rectangle(frame, face[i], CV_RGB(0, 255, 0), 1);
                    face_dim = frame(face[i]).size().height; // face dimension
                }

                if((int) face.size() > 0) {
                    /* initial coordinates of the eyes and the nose, estimated
                     * as fixed proportions of the face bounding box */
                    re_x  = face[0].x + face_dim*0.3;  // right eye
                    le_x  = face[0].x + face_dim*0.7;  // left eye
                    eye_y = face[0].y + face_dim*0.38; // eye height

                    ponto[0] = cvPoint(re_x, eye_y); // right eye
                    ponto[1] = cvPoint(le_x, eye_y); // left eye
                    ponto[2] = cvPoint((re_x+le_x)/2, eye_y+int((le_x-re_x)*0.45)); // nose

                    /* draw circles around the eyes and the nose */
                    for(int i=0; i<3; i++)
                        circle(frame, ponto[i], 2, Scalar(10,10,255), 4, 8, 0);

                    /* count consecutive frames in which a face was detected */
                    face_count++;
                } else {
                    face_count = 0;
                }

                /* Viola-Jones ends when a face has been detected in 10
                 * consecutive frames: the current frame, in grayscale, becomes
                 * the reference for the optical flow tracking */
                if(face_count >= 10) {
                    is_face = true;
                    frame.copyTo(Prev);
                    cvtColor(Prev, Prev_Gray, CV_BGR2GRAY);
                }

                flip(frame, frame, 1);   // mirror for display
                imshow("Camera", frame); // show the frames captured by the webcam
                esc = waitKey(30);
                if(esc == 27) // press ESC to quit
                    break;
            }//close while not face
    

    We also provide a block diagram of how the face detection routine works in our implementation.

    Note: these chunks are just previews, guys, don't worry. The full code lives in the repository, and I'm praying for Erick to clean up its comments and indentation, which are still a mess.

    Head Pose Estimation

    The gesture recognition is in fact achieved through the estimation of the head pose. Here, the tracking of the face points is performed by calculating the optical flow, thanks to the Lucas-Kanade algorithm implemented in OpenCV's calcOpticalFlowPyrLK() method. This algorithm, proposed in the paper [An iterative image registration technique with an application to stereo vision], basically compares two frames -- an old frame and the current one -- in order to estimate, based on the old one, where the points will appear in the current frame.

    Defining the three face points found by the optical flow on the current frame $t$ as $(x_i^{t}, y_i^{t})$, where $i \in \{re, le, n\}$ stands for the right eye, the left eye and the nose, and the same points on the previous (old) frame as $(x_i^{t-1}, y_i^{t-1})$, we can calculate the head rotation around the three axes (which can also be called Tait-Bryan angles or Euler angles: roll, yaw and pitch) through a set of three equations:

    $$roll = \frac{180}{\pi}\arctan\!\left(\frac{y_{le}^{t} - y_{re}^{t}}{x_{le}^{t} - x_{re}^{t}}\right) \qquad yaw = x_{n}^{t-1} - x_{n}^{t} \qquad pitch = y_{n}^{t} - y_{n}^{t-1}$$

    You can see below a chunk of code with the main parts of our implementation.
            yaw_count   = 0;
            pitch_count = 0;
            roll_count  = 0;
            noise_count = 0;
            /* Optical flow loop: track the three face points until a gesture
             * (or too much noise) is detected */
            while(is_face) {
                action = "erro"; // default action: "error"
                /* get captured frame from device */
                camera >> frame;

                /* convert the current frame to grayscale */
                frame.copyTo(Current);
                cvtColor(Current, Current_Gray, CV_BGR2GRAY);

                /* compute the optical flow between the previous and the
                 * current frame (Lucas-Kanade) */
                cv::calcOpticalFlowPyrLK(Prev_Gray, Current_Gray,
                            ponto, saida, status, err, Size(15,15), 1);

                /* points found by the optical flow */
                for(int i=0; i<3; i++)
                    face_triang[i] = saida[i];

                /* geometry of the eye-eye-nose triangle: the distance from
                 * the camera to the user's face is inversely proportional to
                 * the distance between the eyes */

                /* distance between the eyes */
                float d_e2e = sqrt(
                            pow((face_triang[0].x-face_triang[1].x),2) +
                            pow((face_triang[0].y-face_triang[1].y),2));

                /* distance between the right eye and the nose */
                float d_re2n = sqrt(
                            pow((face_triang[0].x-face_triang[2].x),2) +
                            pow((face_triang[0].y-face_triang[2].y),2));

                /* distance between the left eye and the nose */
                float d_le2n = sqrt(
                            pow((face_triang[1].x-face_triang[2].x),2) +
                            pow((face_triang[1].y-face_triang[2].y),2));

                /* error conditions that make the optical flow stop: the
                 * triangle lost its expected shape or scale */
                if(d_e2e/d_re2n < 0.5 || d_e2e/d_re2n > 2.5) {
                    cout << "too much noise 0." << endl;
                    is_face = false;
                    break;
                }
                if(d_e2e/d_le2n < 0.5 || d_e2e/d_le2n > 2.5) {
                    cout << "too much noise 1." << endl;
                    is_face = false;
                    break;
                }
                if(d_e2e > 160.0 || d_e2e < 20.0) {
                    cout << "too much noise 2." << endl;
                    is_face = false;
                    break;
                }
                if(d_re2n > 140.0 || d_re2n < 10.0) {
                    cout << "too much noise 3." << endl;
                    is_face = false;
                    break;
                }
                if(d_le2n > 140.0 || d_le2n < 10.0) {
                    cout << "too much noise 4." << endl;
                    is_face = false;
                    break;
                }

                /* draw circles around the eyes and the nose */
                for(int i=0; i<3; i++)
                    circle(frame, face_triang[i], 2, Scalar(255,255,5), 4, 8, 0);

                /* head rotation axes (Euler angles), as in the equations above */
                float param = (face_triang[1].y-face_triang[0].y) / (float)(face_triang[1].x-face_triang[0].x);
                roll  = 180*atan(param)/M_PI;
                yaw   = ponto[2].x - face_triang[2].x;
                pitch = face_triang[2].y - ponto[2].y;

                /* debug */
                printf("%4d,%4d,%4d\t%4d\n", yaw, pitch, roll, noise_count);

                /* accumulate small yaw displacements; abrupt jumps count as noise */
                if((yaw > -20 && yaw < 0) || (yaw > 0 && yaw < +20)) {
                    yaw_count += yaw;
                } else {
                    yaw_count = 0;
                    if(yaw < -40 || yaw > +40)
                        noise_count++;
                }

                /* accumulate small pitch displacements; abrupt jumps count as noise */
                if((pitch > -20 && pitch < 0) || (pitch > 0 && pitch < +20)) {
                    pitch_count += pitch;
                } else {
                    pitch_count = 0;
                    if(pitch < -40 || pitch > +40)
                        noise_count++;
                }

                /* accumulate moderate roll angles; extreme tilts count as noise */
                if((roll > -60 && roll < -15) || (roll > +15 && roll < +60)) {
                    roll_count += roll;
                } else {
                    roll_count = 0;
                    if(roll < -60 || roll > +60)
                        noise_count++;
                }

                /* give up when too many noisy measurements accumulate */
                if(noise_count > 2) {
                    cout << "too much noise." << endl;
                    is_face = false;
                    break;
                }

                /* yaw/pitch gestures are only accepted while the head is not
                 * tilted (roll close to zero). The Portuguese strings are the
                 * command words sent over Bluetooth */
                if(roll_count > -1 && roll_count < +1) {
                    if(yaw_count <= -20) {
                        action = "canal menos"; // channel down
                        cout << "yaw left\tchannel down" << endl;
                        is_face = false;
                        break;
                    } else if(yaw_count >= +20) {
                        action = "canal mais"; // channel up
                        cout << "yaw right\tchannel up" << endl;
                        is_face = false;
                        break;
                    }

                    // pitch
                    if(pitch_count <= -10) {
                        action = "aumentar volume"; // volume up
                        cout << "pitch up\tvolume up" << endl;
                        is_face = false;
                        break;
                    } else if(pitch_count >= +10) {
                        action = "diminuir volume"; // volume down
                        cout << "pitch down\tvolume down" << endl;
                        is_face = false;
                        break;
                    }
                }

                // roll
                if(roll_count < -150) {
                    action = "ligar televisao"; // TV on
                    cout << "roll right\tTV on" << endl;
                    is_face = false;
                    break;
                } else if(roll_count > +150) {
                    action = "desligar televisao"; // TV off
                    cout << "roll left\tTV off" << endl;
                    is_face = false;
                    break;
                }

                /* tracked points become the reference for the next optical
                 * flow computation */
                for(int j=0; j<3; j++)
                    ponto[j] = saida[j];

                /* current frame becomes the previous frame */
                Current_Gray.copyTo(Prev_Gray);

                flip(frame, frame, 1);   // mirror for display
                imshow("Camera", frame); // show the frames captured by the webcam
                esc = waitKey(30);
                if(esc == 27) // press ESC to quit
                    break;
            }//close while isface
    

    A flowchart of the implemented head pose estimation routine is shown below:

    That's it!


Discussions

jlbrian7 wrote 08/22/2017 at 02:02:

The gitlab project doesn't exist.


Cassio Batista wrote 08/22/2017 at 02:09:

I think it's not public. I'll try to move it to GitHub this week :)

