Hey Rover fans!
I've started building a simulator for Rover in Unreal Engine. See the video of the simulator, along with my rationale and goals for it, below:
I'd like to build a kind of next-generation navigation stack for Rover that uses only cameras for localization, path following, and obstacle avoidance. I have experience bringing up ROS-based navigation systems using LIDAR, but those systems are expensive, and traditional LIDAR mapping and localization algorithms are typically designed for flat indoor environments like offices. While working on a robotics project for my job two years ago, I was asked to survey all possible sensing modalities for a well-funded commercial robot. I spent time looking at 3D LIDAR like the Velodyne Puck, time-of-flight cameras like the IFM O3D303 and Kinect V2, structured light cameras like the first-generation Kinect and the old Asus Xtion Pro, stereo pair cameras like the ZED Stereo Camera and the PlayStation 4 camera (which at $50 is a steal for Linux-based robotics if you don't mind wiring a USB3 connector onto the cable!), and more.
I found that getting full 360 degree surround sensor coverage for the robot would be terribly expensive, the compute required to process all the data would be prohibitive for any semblance of a low-cost system, the power budget would be terrible, and success in sunlight was still uncertain. Meanwhile, stereo cameras looked almost doable, but the quality of data one could glean with state-of-the-art algorithms was so poor it seemed hopeless. It would cost only $200 to surround the system with cameras, compared to $10k for other sensors, but we couldn't make enough sense of the data to meet our operational needs. I surveyed the algorithms by looking at deployed systems, open source libraries, and the latest research, and it seemed there could be hope in the future for camera-based systems. Indeed, most animals on Earth do well with just a pair of optical sensors and a movable head.
More recently, deep neural networks have revolutionized the way computers understand images and the world around them. We no longer need to manually tune algorithms to detect features in an image based on a human understanding of the data. Instead, we are learning to train algorithms to find the necessary details on their own. This approach is both far more accurate and more computationally efficient than past approaches. A low-power chip is all that is needed to do person following on modern drones, a task that would have taken a desktop-grade CPU just a few years ago.
And so, I've envisioned the Rover system as a sort of camera-based research platform for robotics. Rover is made for unstructured off-road environments, not flat, well-behaved offices. I've come up with a six-camera surround system I think has promise for a vehicle like this: four fisheye cameras at the corners, plus one regular-view camera each at the front and rear. This would allow Rover to do some stereo reconstruction of scenes while also giving it a monocular view all around the robot, with higher resolution images for the front and rear, just like the fovea in mammalian eyes.
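To make the six-camera layout a bit more concrete, here is a minimal Python sketch that checks which cameras can see a given direction around the robot. Everything numeric in it is my assumption for illustration, not a measured spec: the camera names, the yaw angles, the 180 degree fisheye field of view, and the 60 degree field of view for the regular cameras. It also ignores where the cameras sit on the chassis and only looks at directions, so treat it as a rough far-field check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Camera:
    name: str
    yaw_deg: float  # direction the camera points, 0 = straight ahead (assumed convention)
    fov_deg: float  # horizontal field of view (assumed value, not a real spec)


# Hypothetical layout: four corner fisheyes plus front and rear regular cameras.
CAMERAS = [
    Camera("front", 0.0, 60.0),
    Camera("rear", 180.0, 60.0),
    Camera("corner_front_left", 45.0, 180.0),
    Camera("corner_rear_left", 135.0, 180.0),
    Camera("corner_rear_right", 225.0, 180.0),
    Camera("corner_front_right", 315.0, 180.0),
]


def angle_diff_deg(a: float, b: float) -> float:
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def cameras_seeing(bearing_deg: float) -> list[str]:
    """Names of cameras whose horizontal field of view contains the bearing."""
    return [
        cam.name
        for cam in CAMERAS
        if angle_diff_deg(bearing_deg, cam.yaw_deg) <= cam.fov_deg / 2.0
    ]


if __name__ == "__main__":
    for bearing in (0, 90, 180, 270):
        print(bearing, cameras_seeing(bearing))
```

Under these assumed numbers, every direction is seen by at least two cameras, which is the overlap that stereo reconstruction would need, and the front and rear directions get the extra regular-view camera on top.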
The hardware would be a Jetson TX2 computer with six cameras feeding into its CSI camera bus. This is off-the-shelf hardware, and I think the TX2 will be enough for some pretty solid navigation work. See one such camera system below:
Rover Sim is a virtual environment designed to allow the development and training of the appropriate machine learning algorithms. It will be totally open source as soon as I get a little time to publish it on GitHub. Once the basic sim is complete (I need to modify the camera position to resemble Rover's planned camera placement), I will work on bringing up the World Models algorithm in sim: https://worldmodels.github.io/
I will start by just following the black road in the sim, a straightforward enough task by my estimation. From there I will spruce up the trails a bit, and retrain the World Model network to follow...