VR Camera: FPGA Stereoscopic 3D 360 Camera

Building a camera for 360-degree, stereoscopic 3D photos and videos.

360 photos are pretty cool, especially if you have the opportunity to view them in VR. However, stereoscopic 3D is what really brings the VR experience to life. The cheapest commercially available 360 3D camera is $800, and I'm attempting to build a camera for half that price or less. I'm using a Terasic DE10-Nano FPGA development board with a Cyclone V SoC onboard for stitching and data storage, and OmniVision OV5642 camera modules as the image input.

Component Choice

Due to the large number of GPIO ports needed and the large bandwidth of data to be processed, this project seemed like an ideal application for an FPGA SoC development board. After deciding on an Altera Cyclone V SoC for the brains, I narrowed down the development board choices and eventually settled on the Terasic DE10-Nano due to its small size, large number of GPIOs, and most importantly, its ample documentation and support.

My choice of camera module was easy. The OV5642 is cheap ($25 on Amazon), has good resolution, and has a decent amount of online documentation and drivers available. In addition, it uses the publicly documented Digital Video Port protocol rather than the MIPI protocol used by most camera modules, whose specification is not publicly available.


My planned datapath for photos will be as follows:

  • FPGA implements parallel data receiver from camera
  • FPGA writes data from camera to DDR3
  • ARM running Linux and OpenCV reads frames from DDR3
  • OpenCV performs fisheye correction, stitching, and writes compressed images to MicroSD

The OV5642 module supports a variety of output formats, including JPEG compression. For still photos, I intend to set the camera modules to output raw RGB data for easy manipulation with OpenCV. For video, JPEG compression will likely be necessary.


A full 360 photo of my back yard, stitched by the latest version of the stitching software and aligned by the new camera calibration software.

JPEG Image - 4.92 MB - 01/15/2018 at 21:38



The camera calibration software that adjusts and corrects the camera matrices.

plain - 42.47 kB - 01/15/2018 at 21:35



The latest stitching software with optical flow implemented.

plain - 32.60 kB - 11/15/2017 at 19:46



My first 3D "360" photo! Featuring me making my prettiest face.

JPEG Image - 3.61 MB - 10/16/2017 at 09:09



This is OpenCV's Camera Calibration example program, with my modifications to make it work with fisheye lenses.

plain - 17.04 kB - 10/16/2017 at 08:55



  • 1 × Terasic DE10-Nano Altera/Intel FPGA development board with Cyclone V SoC. Includes dual-core ARMv7, MicroSD slot, a lot of GPIOs, HDMI, Ethernet, USB OTG, etc...
  • 8 × UCTRONICS OV5642 Camera Module 5MP camera module with parallel output. $25 on Amazon. I currently have two for testing, and will eventually scale up to 8 for full 360 3D (90 degrees FOV per camera, 2 cameras per direction)

  • Refining 360 Photos

    Colin Pate • 7 days ago • 0 comments

      I've spent the last couple of months on this project trying to get the photo output as detailed and high-quality as possible, and I'm pretty happy with the results. They're still far from perfect, but they've improved considerably since the last photos I posted. In my next couple of build logs I'll break down some of the changes I've made.

      Camera Alignment
      One of the largest sources of viewer discomfort in my photos was vertical parallax. Because each camera has a slight tilt up or down from the horizontal axis, objects in one eye's view would appear slightly higher or lower than in the other eye's view. I had created a manual alignment program that let me nudge the images up and down with the arrow keys, but it took forever to get good results this way, especially with three possible axes of rotation per camera.

      My solution, emulating the approach of Google's Jump VR, was to use custom software to figure out the rotation of each camera in 3D space, and translate this to how the images should be positioned. I chose to use the brute force method - estimate the positions and then use an iterative algorithm (Levenberg-Marquardt) to correct them. The alignment algorithm is described below:

      1. Use guess positions and lens characteristics to de-warp fisheye images to equirectangular and convert to spherical projection
      2. Use OpenCV SURF to find features in the spherical images
      3. Match these features to features in images from adjacent cameras
      4. Keep features that are seen from three adjacent cameras
      5. Use the spherical and equirectangular matrices in reverse to figure out the original pixel coordinates of the features
      6. Set up camera matrices for each camera with 3D coordinates, 3D rotation matrix, and 2D lens center of projection (pixel coordinates)
      7. Begin parameter optimization loop:
        1. Begin loop through the features with matches in 3 images:
          1. Calculate the 3D line corresponding to the pixel at the center of the feature in each of the 3 images
          2.  When viewed in the X-Y plane (horizontal plane, in this case), the three lines form a triangle. Calculate the area of the triangle and add this to the total error. This area should ideally be zero, since the lines should ideally intersect at one point.
          3. Calculate the angles of each line from the X-Y plane, and calculate the differences of these angles. Add this difference to the total error. Ideally, the difference between the angles should be almost zero, since the cameras are all on the X-Y plane and the lines should all be pointed at the same point.
        2. Figure out the derivatives of all of the parameters (angles, centers, etc.) with respect to the total error and adjust the parameters accordingly, trying to get the total error to zero
        3. Repeat until the total error stops going down
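As a rough illustration of steps 7.1.2 and 7.1.3, here's a minimal sketch of the triangle-area error term for three rays viewed in the X-Y plane. This is my own illustrative code, not the project's; the `Line2D` struct and function names are made up for this example.

```cpp
#include <cmath>
#include <array>

// A camera ray reduced to a 2D line in the X-Y plane: an origin (the camera
// position) plus a direction.
struct Line2D { double ox, oy, dx, dy; };

// Intersection of two parametric 2D lines (assumes they are not parallel).
static std::array<double, 2> intersect(const Line2D& a, const Line2D& b) {
    double denom = a.dx * b.dy - a.dy * b.dx;
    double t = ((b.ox - a.ox) * b.dy - (b.oy - a.oy) * b.dx) / denom;
    return { a.ox + t * a.dx, a.oy + t * a.dy };
}

// Area of the triangle formed by the three pairwise intersections. It is zero
// exactly when all three rays meet at a single point, i.e. when the camera
// parameters are consistent for this feature.
double triangle_error(const Line2D& a, const Line2D& b, const Line2D& c) {
    auto p1 = intersect(a, b), p2 = intersect(b, c), p3 = intersect(c, a);
    return 0.5 * std::fabs((p2[0] - p1[0]) * (p3[1] - p1[1]) -
                           (p3[0] - p1[0]) * (p2[1] - p1[1]));
}
```

Summing this area over all triple-matched features gives the horizontal part of the total error that the Levenberg-Marquardt loop drives toward zero.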

      This is a greatly simplified version of the algorithm and I left a number of steps out. I'm still tweaking it and trying to get the fastest optimization and least error possible. I also switched to the lens model in this paper by Juho Kannala and Sami S. Brandt. In addition to the physical matrix for each camera, the optimization algorithm also adjusts the distortion parameters for the lenses. Currently, I'm using the same parameters for all lenses but I may switch to individual adjustments in the future.
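For reference, OpenCV's fisheye module uses a form of the Kannala-Brandt model: the radius of a projected point from the distortion center is an odd polynomial in the angle between the incoming ray and the optical axis. A sketch, with illustrative (not calibrated) coefficients:

```cpp
#include <cmath>

// Kannala-Brandt-style radial model, in the parameterization OpenCV's fisheye
// module uses: theta is the angle from the optical axis, f the focal length
// in pixels, and k1..k4 the distortion coefficients.
double kb_radius(double theta, double f,
                 double k1, double k2, double k3, double k4) {
    double t2 = theta * theta;
    // theta_d = theta * (1 + k1*theta^2 + k2*theta^4 + k3*theta^6 + k4*theta^8)
    double theta_d = theta * (1.0 + t2 * (k1 + t2 * (k2 + t2 * (k3 + t2 * k4))));
    return f * theta_d; // pixel distance from the center of projection
}
```

With all coefficients zero this reduces to the ideal equidistant fisheye, r = f·θ, which is why the model handles near-180-degree lenses that break the pinhole model.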

      Also, here's a visual representation of the feature-finding process.

      Here are three of the eight original images that we get from each camera:

      The 8 images are then de-warped and converted to spherical projection. Here is the initial 360 photo produced from the guess parameters.

      It looks pretty decent, but there are clear vertical misalignments between some of the images.

      Using SURF, features are extracted from each image and matched to the features in the image from the next camera to the left, as shown below.

      Features that are found in images from three adjacent cameras are kept and added to a list. An example of one...


  • Optical Flow

    Colin Pate • 11/15/2017 at 18:12 • 0 comments

    It's been a while since my last update, not because I've stopped working on this project, but because I've been sinking almost every minute of my free time into rewriting the stitching software. The output is finally decent enough that I figured it's time to take a break from coding, install the other four cameras, and write a new post about what I've been doing. A few days after my last log entry, I happened across Facebook's Surround 360 project, an amazing open-source 360 camera. The hardware was pretty cool but it cost $30,000, which was a little outside my budget. What really caught my eye was their description of the stitching software, which uses an image processing technique called "optical flow" to seamlessly stitch their 360 images.

    Optical flow is the movement of a point or object between two images. It can be time-based and used to detect how fast and where things are moving. In this case, the two images are taken at the same time from different cameras that are located next to each other, so it tells us how close the point is. If the object is very far away from the cameras, there will be no flow between the images, and it will be at the same location in both. If the object is close, it will appear shifted to the left in the image taken from the right camera, and shifted to the right in the image taken from the left camera. This is also known as parallax, and it's part of how your eyes perceive depth.

    With optical flow, we can create a "true" VR image. That means that every column of pixels ideally becomes the column of pixels that would be seen if you physically placed a camera at that location. This allows a seamless 3D image to be viewed at any point in the 360 sphere. With my previous stitching algorithm, only the columns at the center of the left and right eye images for each camera were accurate representations of the perspective from that angle. Between these points, the perspective became more and more distorted as it approached the midpoint between the two cameras. With optical flow, we can estimate where each point in the image would appear to a virtual camera positioned at any column of the image.
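A minimal sketch of that interpolation idea (my reading of the technique, not Surround 360's actual code):

```cpp
// Given a pixel's x position in the left camera's view and its optical-flow
// vector toward the right camera's view, a virtual camera a fraction alpha of
// the way between the two physical cameras sees the pixel shifted by alpha
// times the flow.
double interpolated_x(double left_x, double flow_x, double alpha) {
    return left_x + alpha * flow_x; // alpha = 0 -> left view, 1 -> right view
}
```

Sweeping alpha across the gap between adjacent cameras synthesizes the in-between columns that no physical camera captured.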

    This is easier said than done, however. I decided to roll my own feature-matching and optical-flow algorithm from the ground up, which in hindsight, may have been a little over-ambitious. I started off developing my algorithm using perfect stereo images from a set of example stereo images, and it took two weeks to get an algorithm working with decent results. Using it with the images from the 360 camera opened a whole other can of worms. It took redoing the lens calibration, adding color correction, and several rewrites of the feature-matching algorithm to get half-decent results.

    Here's a re-stitched image using the new algorithm:

    As you can see, there's still plenty of work to be done, but it's a step in the right direction!  I'll upload my code so you can check out my last few weeks of work.

  • Success!

    Colin Pate • 10/16/2017 at 08:08 • 1 comment

    TL;DR: I've successfully captured, warped and stitched a 200-degree stereoscopic photo! Check it out in the photo gallery. Full 360 will come soon, once 4 more cameras show up.

    After the last log, the project went dormant for a couple weeks because I was waiting on new lenses. However, the new ones arrived about a week ago and I was very happy to see that they have a much wider field of view! They are so wide that it seems unlikely that their focal length is actually 1.44mm. I'm not complaining, though. Below is a photo from one of the new lenses.

    The field of view on these lenses is pretty dang close to 180 degrees, and as you can see, the image circle is actually smaller than the image sensor. These lenses also produce a lot more distortion than the older ones did. This caused some serious issues with calibration, which I'll cover below.

    Camera Calibration

    In order to create a spherical 360 photo, you need to warp your images with a spherical projection. I found the equations for spherical projection on this page, and they're pretty straightforward. However, they assume an equirectangular source image, and the images from the new lenses are pretty dang far from equirectangular. So far, in fact, that the standard OpenCV camera calibration example completely failed to calibrate when I took a few images with the same checkerboard I used to calibrate the old lenses.
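For reference, here's the common form of the forward spherical-warp equations as usually given for stitching. I can't confirm this matches the linked page exactly, so treat the exact form as an assumption; (x, y) are pixel coordinates relative to the image center and f is the focal length in pixels.

```cpp
#include <cmath>

// Map a pixel of an (undistorted) image onto the spherical projection:
// theta is the longitude of the ray through the pixel, phi its latitude.
void spherical_warp(double x, double y, double f, double& xs, double& ys) {
    double theta = std::atan(x / f);               // longitude
    double phi = std::atan(y / std::hypot(x, f));  // latitude
    xs = f * theta;
    ys = f * phi;
}
```

The center of the image maps to itself, and points far from the center get pulled inward, which is what lets the warped images tile the 360 sphere.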

    This required some hacking on my part. OpenCV provides a fisheye model calibration library, but I couldn't find any example of its usage. Luckily, most of its functions map one-to-one onto the regular camera calibration functions, and I only had to change a few lines in the camera calibration example! I will upload my modified version in the files section.

    Cropping and Warping

    Once I got calibration data from the new lenses, I decided to just crop the images from each camera so the image circle was in the same place, and then use the calibration data from one camera for all four. This ensured that the images would look consistent between cameras and saved me a lot of time holding a checkerboard, since you need 10-20 images per camera for a good calibration.


    OpenCV provides a very impressive built-in stitching library, but I decided to make things hard for myself and write my own stitching code. My main motivation was that the images need to wrap around circularly for a full 360-degree photo, and they need to be carefully lined up to ensure correct stereoscopic 3D. I wasn't able to figure out how to get consistent positioning across images, or a fully wrapped image, with the built-in stitcher.

    The code I wrote ended up working surprisingly decently, but there's definitely room for improvement. I'll upload the code to the files section in case anyone wants to check it out.

    The algorithm works as follows:

    The image from each camera is split vertically down the middle to get the left and right eye views. After undistortion and spherical warping, each image is placed on the 360 photo canvas. It can currently be moved around with WASD, or an X and Y offset can be entered manually. Once it has been placed and lined up with the adjacent image, the overlapping area of each image is cut into two Mats. The absolute pixel intensity difference between these Mats is calculated and then Gaussian blurred. A for() loop iterates through each row of the difference, as shown in the code snippet below:

    Mat overlap_diff;
    absdiff(thisoverlap_mat, lastoverlap_mat, overlap_diff);
    Mat overlap_blurred;
    Size blursize;
    blursize.width = 60;
    blursize.height = 60;
    blur(overlap_diff, overlap_blurred, blursize);
    for (int y = 0; y < overlap_blurred.rows; y++) {
        min_intensity = 10000;
        for (int x = 0; x < overlap_blurred.cols; x++) {
            // Sum the blurred per-channel differences at this pixel
            Vec3b color = overlap_blurred.at<Vec3b>(Point(x, y));
            intensity = color[0] + color[1] + color[2];
            // Penalize seam positions far from the previous row's minimum
            if (y > 0) intensity += (abs(x - oldMin_index)...

  • Lenses...

    Colin Pate • 09/24/2017 at 01:23 • 0 comments

    The layout for my camera is an octagon, with each camera lens positioned approximately 64mm (the average human interpupillary distance) from those next to it. A 3D image is constructed by using the left half of the image from each camera for the right eye view and the right half of each image for the left eye view. That way, if you are looking forward, the images in front of you come from two cameras spaced 64mm apart. I created some shapes in OnShape to help myself visualize this.

    The picture above is a top-down view of the camera ring. The blue triangle is the left half of the field of view of the camera on the right. The red triangle is the right half of the field of view of the camera on the left. By displaying the view from the blue triangle to your right eye and the view from the red triangle to your left eye, you get a stereoscopic 3D image with depth. By stitching together the blue-triangle views from all 8 cameras for the right eye, and doing the same with the red triangles for the left eye, you theoretically get a full 360-degree stereoscopic image (as long as you don't tilt your head too much).

    For this to work, each camera needs at least 90 degrees of horizontal field of view: to keep the left and right eye images isolated, each camera must contribute 360/8 = 45 degrees of horizontal FOV (field of view) to each eye's image, for 90 degrees total. On top of that, you want at least a few degrees extra to stitch the images seamlessly.
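The bookkeeping above can be written down as a tiny helper (illustrative only, before any stitching-overlap margin):

```cpp
// With n cameras around the ring, each camera must cover 360/n degrees for
// the left-eye panorama and another 360/n for the right-eye panorama, so its
// lens needs at least 2 * 360/n degrees of horizontal FOV.
double min_horizontal_fov_deg(int num_cameras) {
    return 2.0 * 360.0 / num_cameras;
}
```

For the 8-camera octagon this gives the 90-degree minimum; fewer cameras would demand even wider lenses.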

    Focal Lengths And Sensor Sizes

    I figured I'd play it safe when I started and get some lenses with plenty of FOV to play around with, then go smaller if I could. Since the OV5642 uses standard M12 lenses, I ordered a couple of 170-degree M12 lenses from Amazon. However, when I started using them, I quickly realized that the horizontal field of view was far less than 170 degrees, and even less than my required minimum of 90 degrees. It turned out that the actual sensor size of the OV5642 is 1/4", far smaller than the lens's designed image size of 1/2.5". Below is an image of the first lens I bought, which looks pretty much the same as the rest.

    At that point, I thought that I had learned my lesson. I did some quick research and found that a shorter focal length gives a wider field of view, and I figured that since the FOV of my current lenses was just under 90 degrees, picking lenses with a slightly shorter focal length would do the trick. My next choice was a 1.8mm lens from eBay. The field of view was even wider than before... but still not larger than 90 degrees. The OV5642's image sensor is just too small.
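The focal-length/FOV relationship can be sketched with the rectilinear thin-lens approximation. Real wide-angle lenses distort away from this model, and the OV5642's exact active-area width is an assumption here (a 1/4" sensor is roughly 3.6mm wide), so treat the numbers as ballpark only:

```cpp
#include <cmath>

// Rectilinear approximation: fov = 2 * atan(sensor_width / (2 * focal_length)).
// Shorter focal length (or wider sensor) means wider field of view.
double horizontal_fov_deg(double sensor_width_mm, double focal_length_mm) {
    const double pi = std::acos(-1.0);
    return 2.0 * std::atan(sensor_width_mm / (2.0 * focal_length_mm))
               * 180.0 / pi;
}
```

With a ~3.6mm-wide sensor, a 1.8mm lens lands right around the 90-degree mark under this model, which is consistent with the lenses coming up just short of it in practice once distortion and the actual active area are accounted for.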

    The image above is from two cameras that are facing 90 degrees apart. If the lenses had a field of view of over 90 degrees, there would be at least a little bit of overlap between the images.

    Now that I'm $60 deep in lenses I can't use, I'm really hoping the next ones work out. I couldn't find any reasonably priced lenses with a focal length under 1.8mm in North America, so I have some 1.44mm lenses on the way from China, and I'm crossing my fingers.

  • Building a Frame

    Colin Pate • 09/23/2017 at 23:38 • 0 comments

    After successfully capturing images from my OV5642 module, I decided it was time to build a camera mount to (eventually) hold all 8 cameras. I used OnShape, a free and easy-to-use online CAD program to design it. The result is pretty bare-bones for now, but it sits on top of the development board, holds all the cameras in place and gives convenient access to the camera wires and board pins.

    I got it printed in a couple of days for only $9 by a nearby printer via 3D Hubs. Here's a link to the public document in OnShape, in case you want to print or modify it yourself.

    I then used some old ribbon cable and quite a few female headers to create the monstrosity that is shown below.

  • Progress!

    Colin Pate • 09/18/2017 at 03:00 • 0 comments

    I've made quite a bit of progress since the last log, and run into a few roadblocks along the way. Here's where I'm at right now.

    Image Capturing

    Using the OV5642 software application guide, I copied the register settings from the "Processor Raw" example into my custom I2C master and used them to successfully get Bayer RGB data from the sensor! This gives a 2592 x 1944 array of 10-bit pixels. In case you aren't familiar with Bayer image formatting, here's how it works:

    On the actual image sensor, each pixel captures one color. Half of them are green, and the rest alternate between blue and red. To get full RGB data for a pixel, the colors of the nearest pixels are averaged. At first glance, a Bayer image like the ones I captured just looks like a grayscale photo. However, if you look closely, you can see the Bayer grid pattern. OpenCV includes a function called cvtColor that makes it incredibly easy to convert from Bayer to regular RGB.
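To make the averaging concrete, here's a toy sketch of the idea (cvtColor's demosaicing is far more sophisticated): recovering the green value at a non-green pixel by averaging its four green neighbors. The function is made up for this example; it assumes an RGGB-style pattern and ignores image borders.

```cpp
#include <vector>
#include <cstdint>

// In a Bayer mosaic, every red or blue pixel has green pixels directly above,
// below, left, and right of it. A simple demosaic estimates the missing green
// channel as the mean of those four neighbors.
uint8_t green_at(const std::vector<std::vector<uint8_t>>& bayer, int y, int x) {
    int sum = bayer[y - 1][x] + bayer[y + 1][x] +
              bayer[y][x - 1] + bayer[y][x + 1];
    return static_cast<uint8_t>(sum / 4);
}
```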

    Memory Writer

    In order to take the streaming image data from the OV5642's Digital Video Port interface and write it to the Cyclone's DDR3 Avalon Memory-Mapped Slave, I had to create a custom memory writer in VHDL. I might have made it a little more complicated than I needed to, but it works fine and that's all that matters. I will upload it in the Files section as mem_writer.vhd.

    Saving the Images in Linux

    To read the images from DDR3 and save them to the MicroSD card, I repurposed one of the OpenCV examples included in the LXDE Desktop image. I didn't feel like messing around with creating a Makefile, so I just copied the directory of the Houghlines example, changed the code to do what I wanted, and rebuilt it. Here are the most important parts:

    int m_file_mem = open("/dev/mem", O_RDWR | O_SYNC);
    void *virtual_base = mmap(NULL, 0x4CE300, PROT_READ | PROT_WRITE,
                              MAP_SHARED, m_file_mem, 0x21000000);
    if (virtual_base == MAP_FAILED) {
        cout << "\nMap failed :(\n";
    }
    I set up my OV5642 stream-to-DDR3 writer module to write to memory address 0x21000000, and simply used the mmap() system call on /dev/mem to get access to the DDR3 at this location. The size of the mapping, 0x4CE300, is simply the size of the imager (2592 * 1944 pixels, one byte each).
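As a quick sanity check of that size (my own arithmetic, not from the project files):

```cpp
#include <cstddef>

// One 8-bit sample per pixel for the OV5642's full 2592 x 1944 array comes
// out to exactly the mmap length used above.
constexpr std::size_t kFrameBytes = 2592u * 1944u; // = 0x4CE300
static_assert(kFrameBytes == 0x4CE300, "frame size matches the mmap length");
```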

    Then, the OpenCV commands to save the image from DDR3 onto the MicroSD card are very simple.

    Mat image = Mat(1944, 2592, CV_8UC1, virtual_base);
    imwrite("output.bmp", image);

    This doesn't include the Bayer-to-RGB conversion, which was a simple call to cvtColor.

    In my next log, I will discuss the parts of the project that have gone less smoothly...

  • Linux on the dev board

    Colin Pate • 09/01/2017 at 04:49 • 0 comments

    The DE10-Nano board ships with a MicroSD card that is loaded with Angstrom Linux. However, you can also download a couple of other MicroSD images with different Linux flavors from the Terasic website. Since I planned to use OpenCV to capture and stitch images, I downloaded the MicroSD image with LXDE desktop, which handily already has OpenCV and a few OpenCV example programs on it.


    The DE10-Nano uses a bootloader called U-Boot to load Linux for the ARM cores to run and to load an image onto the FPGA. I don't know exactly how it works, but I've been able to get it to do my bidding with a bit of trial and error. The first thing I attempted was setting aside some space in the DDR3 memory for the FPGA to write to, so that Linux wouldn't try to use that space. This turned out to be fairly simple, as I was able to follow the example "Nios II Access HPS DDR3" in the DE10-Nano User Manual. This example shows how to communicate with U-Boot and Linux on the DE10 using PuTTY on your PC, and how to set the U-Boot boot command so that Linux only uses the lower 512MB of DDR3. Here's the pertinent snippet, entered into U-Boot before it runs Linux:

    setenv mmcboot "setenv bootargs console=ttyS0,115200 root=/dev/mmcblk0p2 rw \ rootwait mem=512M;bootz 0x8000 - 0x00000100" 

    My next task was a bit more difficult. I also needed my FPGA to be able to access the DDR3 to write images for my OpenCV program in Linux to access. This required the use of a boot script that tells U-Boot to load the FPGA image from the SD card and then enable the FPGA-to-DDR3 bridge so it can be used.

    I used the helpful instructions on this page to create my boot script:

    fatload mmc 0:1 $fpgadata soc_system.rbf;
    fpga load 0 $fpgadata $filesize;
    setenv fpga2sdram_handoff 0x1ff;
    run bridge_enable_handoff;
    run mmcload;
    run mmcboot;

     And that did it! I had my FPGA writing directly to the DDR3. The next step was to actually get pictures from the cameras...

  • Setting up the camera

    Colin Pate • 08/26/2017 at 05:29 • 0 comments

    The OV5642 camera module features two communication interfaces: DVP and SCCB. DVP (Digital Video Port) is a unidirectional parallel interface used to transfer image data out of the camera module. SCCB is pretty much a clone of I2C, and it is used to read and write the registers in the camera module that control every aspect of its functionality.
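To illustrate what a single SCCB register write looks like on the wire, here's a sketch of the byte sequence the I2C master must clock out. This is my own illustration, not the FPGA code; the OV5642 uses 16-bit register addresses, and its 8-bit write address is commonly given as 0x78 (the 7-bit address 0x3C shifted left with the R/W bit clear).

```cpp
#include <array>
#include <cstdint>

// Build the four bytes of one SCCB/I2C register write to the OV5642:
// device address, register address high byte, low byte, then the data.
std::array<uint8_t, 4> sccb_write_bytes(uint16_t reg, uint8_t value) {
    return { 0x78,                              // device address + write bit
             static_cast<uint8_t>(reg >> 8),    // register address high byte
             static_cast<uint8_t>(reg & 0xFF),  // register address low byte
             value };                           // register data
}
```

The FPGA's I2C master just shifts these bytes out with start/stop conditions and ACK checks around them.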

    My first adventure in this project was the implementation of an I2C master in the FPGA that would (eventually) set up the OV5642 and write the correct registers to produce a valid image output. In the interest of saving time and doing the least work possible, I adapted the I2C master module that is used to configure the HDMI module in the DE10 example designs.

    I've attached my adaptation of the I2C master in the files section. Right now it doesn't do much, but it's enough to get PCLK, VSYNC, HSYNC, and data coming out of the camera. I have no idea what's in the data yet, however.

