
Introduction - a general overview

A project log for Aina - Humanoid AI based social ROS robot

Aina is an open-source, ROS-based social robot with built-in AI that can speak and move with almost no human intervention.

Maximiliano Rojas • 09/27/2021 at 08:24 • 0 Comments

As I said in the description, Aina is an open-source humanoid robot whose aim is to interact with its environment, by moving through it and grasping light objects, and also to speak and interact with humans through voice and facial expressions. There are several tasks that this robot will be able to accomplish once it's finished, like:

These tasks are by no means far from reality; a good example is the Pepper robot, used in banks, hotels, and even homes. So let's think about which characteristics the robot must have to do similar things:

  1. Navigation: It must be able to move through the environment.
  2. Facial expressions: It must have a good-looking face with a sufficient set of emotions.
  3. Modularity: Easy to modify the software and hardware.
  4. Recognition: Of objects with Deep Learning techniques.
  5. Socialization: Be able to talk like a chatbot.
  6. Open source: Based on open-source elements as much as possible.

Because there is a lot to do, a good strategy is divide and conquer. For this reason, the project is separated into three main subsystems and a minor one:

The general characteristics of the robot are:

Several parts of the robot have holes or are detachable and even interchangeable. The holes are useful for attaching electronic components to the robot while keeping the possibility of removing them (or adding others) whenever necessary. The detachable parts, meanwhile, can be modified to fulfill future requirements, like adding a sensor or a screen, or just making room for more internal functionality. A good example of this is the mobile base: as you can see in the next images, there are several parts that others can modify (the light blue ones), and if you want to move or exchange an electronic part, you just unscrew it.


Of course, the same principle applies to the rest of the robot.


Regarding the software, as I said before, it is based on ROS and Docker. The simplified scheme of connections between the different elements is presented in the next image.

As you can see, several containers are running on a Jetson Nano. Let me clarify the purpose of each one.

The Docker container on the right is responsible for the navigation task. It is able to do the whole SLAM process with a Kinect V1 and drive the non-holonomic mobile base to the goal point specified by the user. The yellow squares represent the ROS packages implemented to accomplish the task.

The upper-left Docker container has all the necessary programs to control the head, neck, and arms. To find the right joint angles for the arms, a heuristic inverse kinematics algorithm called FABRIK is implemented as a node (you can find my Matlab implementation in the files section).

The middle-left container has all the necessary files to run RIVA, the conversational AI SDK provided by Nvidia (it is free, and there isn't any licensing trouble in using it in non-commercial open-source projects). It has more capabilities, like gaze and emotion detection, that will be very useful for the social interaction part.

The last container (lower left) has a neural network model called MDETR, which is a multimodal understanding model that uses images and plain text to detect objects in images. FlexBE is a ROS tool to implement state machines or behavior schemes in a friendly graphical way, so it can be said that this container covers the "reasoning and behavior" part of the social interaction.
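To give an idea of how the FABRIK algorithm mentioned above works, here is a minimal planar (2D) sketch in Python. It is not the Matlab implementation from the files section nor the ROS node itself; the function name `fabrik`, its parameters, and the 2D simplification are assumptions made only for illustration. The idea is to alternate two passes over the joint chain: a backward pass that pins the end effector to the target, and a forward pass that pins the base back in place, while keeping every link length fixed.

```python
import math


def fabrik(joints, target, tol=1e-3, max_iter=100):
    """Planar FABRIK sketch (hypothetical helper, not the project's node).

    joints: list of (x, y) joint positions, base first, end effector last.
    target: (x, y) goal for the end effector.
    Returns a new list of joint positions with the link lengths preserved.
    """
    pts = [tuple(p) for p in joints]
    lengths = [math.dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]
    base = pts[0]

    def place(anchor, toward, length):
        # Point at distance `length` from `anchor`, on the ray toward `toward`.
        d = math.dist(anchor, toward)
        if d == 0.0:
            # Degenerate case: pick an arbitrary direction.
            return (anchor[0] + length, anchor[1])
        r = length / d
        return (anchor[0] + r * (toward[0] - anchor[0]),
                anchor[1] + r * (toward[1] - anchor[1]))

    # Target out of reach: stretch the chain straight toward it.
    if math.dist(base, target) > sum(lengths):
        for i, length in enumerate(lengths):
            pts[i + 1] = place(pts[i], target, length)
        return pts

    for _ in range(max_iter):
        if math.dist(pts[-1], target) <= tol:
            break
        # Backward pass: pin the end effector to the target, walk to the base.
        pts[-1] = tuple(target)
        for i in range(len(pts) - 2, -1, -1):
            pts[i] = place(pts[i + 1], pts[i], lengths[i])
        # Forward pass: pin the base back in place, walk to the end effector.
        pts[0] = base
        for i in range(len(lengths)):
            pts[i + 1] = place(pts[i], pts[i + 1], lengths[i])
    return pts


if __name__ == "__main__":
    arm = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # two links of length 1
    print(fabrik(arm, (1.0, 1.0)))
```

The joint angles can then be recovered from the returned positions, for example with `math.atan2` on consecutive joints. A real arm would also need the 3D case and joint limits, which this sketch leaves out.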

That's it for now; more details will be presented in the next logs.
