Beginnings and process

A project log for DIY Stereo Camera

Creating an open source and economical Stereo camera dev kit. For VR and 3d video use.

Bryan Lyon 06/01/2016 at 21:52

With this project, I'm starting from square one. I don't have experience with cameras or with building boards this complicated. After some research, I've identified the following difficulties that I will have to overcome.

USB cameras are simply not an option. USB 3.0 is hard to find on dev boards, and it is nearly impossible to sync two cameras that send their video over the USB connection. In addition, mounting USB cameras rigidly enough for stereo calibration is difficult. Unless some new discovery is made, USB is out.

MIPI CSI-2 is a high-speed LVDS bus with very tight tolerances due to the speed. To use it, I will need to route many traces with identical lengths but different paths so that they don't interfere with each other. Ideally this would be spread across many layers, but to keep the board price down, excessive layers should be avoided; hopefully it can be done reliably with 4 layers. It helps that the two cameras can have separate mounting points, so their traces might even be allowed different lengths, and even if they're just mirrored, that's easier than designing every trace from scratch.

There are many choices for chips that have sufficient CSI lanes, but they all have problems of their own.

Many would require NDAs to get low-level access to the chip's GPU. That would be unfortunate, as it makes the idea of open hardware difficult (if not impossible, depending on exactly what falls under the NDA).

Others cannot use all of their CSI lanes simultaneously (they are often designed for front and rear cameras used one at a time), but you can't know that until AFTER you read the datasheet, which itself often requires an NDA.

Others are hideously expensive, which would rule them out for an affordable camera.

Dev kits are extremely difficult to find with dual CSI headers. The ones that exist today are:

Nvidia Jetson TX1: It exposes several CSI lanes, but they're on complicated connectors and would require a specialized breakout board. Further, its cost ($299 for the module alone and $599 for a full dev kit) puts it outside the reach of most hobbyists.
Raspberry Pi Compute Module: It is currently discontinued, meaning it is getting harder and harder to find. It supports only two camera modules (the OV5647 and the Sony IMX219). This cannot be expanded because Broadcom has locked down the GPU on the chip and won't give anyone access to the documentation and source code required to implement new cameras. It can't encode both cameras at full resolution in stereo mode, and it's rather slow, which makes additional development difficult.
DragonBoard 410c: It has several CSI lanes on a complicated connector. Its GPU is locked down nearly as hard as Broadcom's, though documentation may be available under NDA. There are only 6 CSI lanes, so any cameras used will probably run at only ~75% of their rated framerate due to bandwidth limitations. Support for CSI cameras seems to be non-existent despite many users asking for it.
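The ~75% figure above is simple lane arithmetic. As a sketch (the 4-lanes-per-camera figure is an assumption, not from any specific datasheet; common MIPI sensors run at full rate on 4 data lanes):

```python
# Back-of-the-envelope CSI-2 bandwidth check for a dual-camera board.
# Assumption (hypothetical, not from a datasheet): each sensor wants
# 4 data lanes for its rated framerate, and the board exposes 6 total.

LANES_PER_CAMERA = 4   # lanes each sensor expects at full rate (assumed)
CAMERAS = 2
AVAILABLE_LANES = 6    # lanes exposed by the board (e.g. DragonBoard 410c)

needed = LANES_PER_CAMERA * CAMERAS      # 8 lanes wanted in total
fraction = AVAILABLE_LANES / needed      # share of rated bandwidth available
print(f"Achievable framerate: {fraction:.0%} of rated")  # → 75% of rated
```

Under those assumptions, each camera gets 3 of the 4 lanes it wants, hence roughly three quarters of the rated framerate.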

In addition to CPUs, there is the possibility of using FPGAs. FPGAs offer a lot of advantages over CPUs, but they also have serious drawbacks. An FPGA can easily decode and convert the video from the cameras, but getting that video data out of the FPGA and onto a CPU for processing could be difficult. FPGAs also cost considerably more, which could make the camera excessively expensive.

I have ordered the Jetson TX1 and the Compute Module dev kits. The Jetson has a lot of power that will be helpful during development, even if we move to a cheaper board later on. The Raspberry Pi has an existing workflow for 3d vision, so I can set it up quickly and begin experimenting with calibration.

I have also bought some dedicated hardware in the form of a StereoLabs Zed. The Zed is a great stereoscopic camera that includes an accessible and powerful SDK.

I have ignored the RealSense and Kinect cameras for now: they have only a single color camera, which prevents any sort of 3d video recording, although they can produce depth readings through their specialized systems. I have heard that Intel is working on a dual-camera RealSense board. That will be interesting once it's released, but for now I cannot rely on rumors or on hardware that may still not meet the requirements.

Hardware is only one side of the problem. The other, of course, is software. Right now there is a wide variety of software available. A lot of it, like VisualSFM, does things we want the stereo camera to do, but can't yet take stereo cameras as input. We could obviously send such tools single-camera data (or split the camera data into separate streams), but ideally the software would "know" about the stereo camera. The Stereolabs ZED software does the same job as VisualSFM while being stereo-aware, but it only works on the ZED because Stereolabs locks the software to their hardware platform (which is their right, but also the reason this project exists). The ZED's software is also far less accurate than VisualSFM, with a LOT of noise in the model, often repeating an object multiple times in different places.
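Splitting the camera data for single-camera tools can be sketched in a few lines. This assumes the stereo device delivers one side-by-side frame per capture (left and right halves packed into a single image, which is how cameras like the ZED present themselves over UVC); the function name is mine:

```python
import numpy as np

def split_stereo(frame: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split a side-by-side stereo frame into (left, right) halves.

    Assumes the left eye occupies the left half of the image width.
    In practice `frame` would come from e.g. cv2.VideoCapture(...).read().
    """
    w = frame.shape[1]
    half = w // 2
    return frame[:, :half], frame[:, half:]

# Stand-in for a captured 480x1280 side-by-side frame (3 color channels).
frame = np.zeros((480, 1280, 3), dtype=np.uint8)
left, right = split_stereo(frame)
print(left.shape, right.shape)  # two 480x640 single-camera images
```

Each half can then be handed to a tool like VisualSFM as if it were an ordinary camera feed, at the cost of the software never knowing the two views are a calibrated pair.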

I will continue to investigate what is possible. Right now, my focus is on seeing what software is capable of and on finding an acceptable processor. I believe something in the i.MX6 line may fit our needs, but I haven't had a chance to analyze all of them. I've begun working with OpenCV, which we'll probably use for a large part of the camera's logic. It already has libraries and functions designed for stereo cameras, which gives me a jump start. One problem I'm having is that the only way I've gotten the StereoLabs ZED to work in OpenCV is through the StereoLabs SDK, which takes a lot of shortcuts we can't rely on for other camera systems, so I'll first have to strip the extra data from the SDK output and turn it back into raw video data. Not necessarily hard, just something that must be done first.

If you're interested in helping, or have any questions, please send me a message. I'll be happy to answer, and I'd readily welcome anyone else interested in this topic.