Features

Project history

To do

Hardware

The diagram below shows the main blocks of the design:

PCB Design

The DSI shield consists of two PCBs: the main board, where all the cool stuff is, and a small adapter board, usually different for each display. The two are connected through a 30-pin, 2 mm pitch pin header.

The main board is a typical Arduino shield. I routed the design on 4 layers, with the signals on the 2 outer layers, a contiguous ground plane and a split power plane. The DDR is placed right under the FPGA to simplify routing. The SSTL-to-DSI level translator resistors are placed right next to the FPGA output pins to avoid stubs. All differential pairs are designed for a differential impedance of Z0 = 100 Ohm.

The adapter boards simply route the DSI lanes, power and backlight signals to the display's connector. There is no standard governing DSI LCD panel connectors and power, so each display type needs a separate adapter board. They are meant to be simple (2 layers, a connector and a few resistors for setting the bias voltage and backlight LED current). The screenshot below shows the layout of the iPhone 4/4S Retina display adapter:

The prototypes were made by Itead studio:

The FPGA

The heart and soul of the project is a Xilinx Spartan-6, a low-cost FPGA with gigabit SerDes blocks on each pin, which make it possible to sample HDMI/DVI signals or generate a DSI data stream with just a bunch of external resistors. The innards of the FPGA are shown below. For the moment it's just a very brief description of what's where in the Verilog/VHDL code.

CPU & Peripherals

The CPU is responsible for initialization of the display and controlling the framebuffer. It can also do some simple drawing operations (although not too fast).

I chose a Lattice Mico32 soft-processor due to the maturity of the design and because other successful OSHW projects use it (e.g. Milkymist). The CPU controls the following peripherals through a Wishbone interconnect:

- A small 16 kB RAM block, containing the software. The first 2 kB are reserved for the bootloader, the remaining 14 kB is the actual application.

- A UART (for debugging and loading the software).

- The DDR memory (slow access through a Wishbone-to-FML bridge).

- The Framebuffer and the DSI core (setting video mode timing).

- EDID RAM, pretending to be a 24C02 I2C EEPROM. It tells the HDMI source the timing requirements of our display.
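
Whatever the CPU writes into the EDID RAM has to pass the HDMI source's sanity checks. As a rough illustration (not the project's firmware), the Python sketch below shows the two checks every EDID base block must satisfy: the fixed 8-byte header, and a final checksum byte that makes all 128 bytes sum to 0 modulo 256. The helper names are hypothetical, and the example block is a zero-filled placeholder, not a usable EDID for any real display.

```python
EDID_BLOCK_SIZE = 128
# Fixed 8-byte pattern that starts every EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def edid_checksum(block: bytes) -> int:
    """Value for byte 127 so that all 128 bytes sum to 0 modulo 256."""
    if len(block) != EDID_BLOCK_SIZE:
        raise ValueError("an EDID base block is exactly 128 bytes")
    return (-sum(block[:127])) & 0xFF


def seal_edid(block: bytes) -> bytes:
    """Return a copy of the block with a correct checksum in byte 127."""
    return block[:127] + bytes([edid_checksum(block)])


def is_valid_edid(block: bytes) -> bool:
    """Check size, header and checksum (nothing else is validated here)."""
    return (
        len(block) == EDID_BLOCK_SIZE
        and block[:8] == EDID_HEADER
        and sum(block) % 256 == 0
    )


# Placeholder block: header only, every other field zeroed (hypothetical).
blank = EDID_HEADER + bytes(EDID_BLOCK_SIZE - len(EDID_HEADER))
sealed = seal_edid(blank)
```

A real block would also carry the vendor ID and the detailed timing descriptor for the panel; only the framing is shown here.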

DDR Memory & Framebuffer

The memory subsystem uses Milkymist's High Performance Dynamic Memory Controller (thanks Sebastien!), connected to a 32 MByte DDR RAM (16-bit bus, 100 MHz). The RAM is currently used only to store the displayed image; the CPU can't execute code from it. The framebuffer core simply pumps frame data from the DDR RAM to the video overlay engine, where it's composited with the HDMI video.
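
To get a feel for how much headroom this memory leaves, here is a back-of-the-envelope calculation in Python. The DDR figures come from the text above. The 640x960 resolution is the iPhone 4 panel; the 3 bytes per pixel in memory and the 60 Hz refresh rate are my assumptions, not values taken from the design. The peak bandwidth is the theoretical maximum and ignores refresh cycles, row activation and bus turnaround.

```python
DDR_BUS_BITS = 16
DDR_CLOCK_HZ = 100_000_000
DDR_SIZE_BYTES = 32 * 1024 * 1024

# Assumptions (not from the design files): panel geometry, pixel format, refresh.
WIDTH, HEIGHT = 640, 960
BYTES_PER_PIXEL = 3
REFRESH_HZ = 60


def ddr_peak_bytes_per_s(bus_bits: int, clock_hz: int) -> int:
    """Theoretical peak: two transfers per clock (double data rate)."""
    return (bus_bits // 8) * clock_hz * 2


def frame_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of one full frame in memory."""
    return width * height * bytes_per_pixel


peak = ddr_peak_bytes_per_s(DDR_BUS_BITS, DDR_CLOCK_HZ)
frame = frame_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL)
scanout = frame * REFRESH_HZ              # bytes/s needed just to refresh the panel
frames_that_fit = DDR_SIZE_BYTES // frame
utilisation = scanout / peak              # fraction of peak bandwidth used by scan-out

print(f"peak {peak / 1e6:.0f} MB/s, scan-out {scanout / 1e6:.1f} MB/s "
      f"({utilisation:.0%}), {frames_that_fit} frames fit in RAM")
```

Under these assumptions the scan-out alone takes roughly a quarter of the theoretical peak, which leaves room for the CPU's slow Wishbone accesses.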

HDMI Sampler

The HDMI/DVI decoder is taken from Xilinx's application note XAPP495. It takes HDMI as input and outputs parallel RGB pixels, a clock and synchronization signals. It's used pretty much as a black box here.

Video Overlay

Just as the name says, it puts together the images from HDMI and the framebuffer (using simple color keying) and outputs the final pixel stream for the display. Includes some elastic buffering and frame alignment logic between 3 different clock domains (HDMI clock, CPU system clock and the DSI core clock).
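
Color keying itself is easy to model in software. The Python sketch below is a minimal illustration, assuming the framebuffer is the keyed layer: wherever a framebuffer pixel equals the key color, the HDMI pixel shows through. The direction of the keying, the magenta key value and the function names are all hypothetical and not taken from the Verilog; the clock-domain buffering is not modelled at all.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), 8 bits per channel

# Hypothetical key color: framebuffer pixels of this color are "transparent".
KEY_COLOR: Pixel = (255, 0, 255)


def overlay_pixel(hdmi: Pixel, fb: Pixel, key: Pixel = KEY_COLOR) -> Pixel:
    """Pass the HDMI pixel through wherever the framebuffer holds the key color."""
    return hdmi if fb == key else fb


def overlay_line(hdmi_line: List[Pixel], fb_line: List[Pixel],
                 key: Pixel = KEY_COLOR) -> List[Pixel]:
    """Composite one scanline; both inputs must already be aligned."""
    if len(hdmi_line) != len(fb_line):
        raise ValueError("scanlines must have the same length")
    return [overlay_pixel(h, f, key) for h, f in zip(hdmi_line, fb_line)]


hdmi_line = [(10, 20, 30), (40, 50, 60), (70, 80, 90)]
fb_line = [KEY_COLOR, (0, 0, 0), KEY_COLOR]
out_line = overlay_line(hdmi_line, fb_line)
```

In hardware the same decision is a comparator and a multiplexer per pixel, which is why it fits comfortably in the pixel pipeline.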

DSI Core

The architecture of the DSI core is shown in the diagram below:

The pixel pipeline consists of 5 stages:

- The first stage is a big 4096-entry FIFO. Its role is to move the RGB pixel data from the system clock domain to the DSI byte clock domain (= data rate divided by 8). Thanks to that, we don't need to keep the system running synchronously to the pixel clock: the pixels can come at a slightly faster or slower rate.

- The Timing generator produces a sequence of DSI High Speed (HS) packet headers and payloads, which signal a horizontal/vertical synchronization pulse or a blanking period, or pass RGB data to the display. Display resolution and blanking timing are programmed by the CPU through the Wishbone interface.

- The Packet assembler takes the packet requests and stuffs them onto 3 or 4 parallel bytes, adding packet headers and footers with ECC/CRC checksums.

- The Lane control modules (4 for the data lanes, 1 for the clock lane) control the transitions between low power (LP) and high speed (HS) modes and generate LP mode signals to initialize and enable the display. At the output of the Lane control module, we get the LP+ / LP- signals and an 8-bit parallel data stream for the SerDes. The clock lane data output is simply fixed to 0xaa, resulting in a DDR clock at the output of the SerDes.

- The SerDes block converts parallel 8-bit data to high speed serial bits using Spartan-6's OSERDES2 blocks. It also ensures correct timing between the data and clock lanes, thanks to the IODELAY programmable delay lines at each FPGA pin.
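
A few of the pipeline steps above can be modelled in a handful of lines of Python. This is only a sketch under stated assumptions, not a port of the Verilog. The payload CRC follows my understanding of the MIPI DSI checksum (polynomial x^16 + x^12 + x^5 + 1, processed LSB first, seeded with 0xFFFF, sent low byte first), so check it against the specification before relying on it. The header ECC is left out. I also assume that the "3 or 4 parallel bytes" correspond to round-robin distribution of bytes over the data lanes. All function names are hypothetical.

```python
from typing import List


def dsi_crc16(payload: bytes) -> int:
    """CRC-16 over the payload: reflected CCITT polynomial, seed 0xFFFF."""
    crc = 0xFFFF
    for byte in payload:
        crc ^= byte
        for _ in range(8):
            # 0x8408 is 0x1021 bit-reversed, for LSB-first processing.
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc


def payload_with_footer(payload: bytes) -> bytes:
    """Append the 16-bit checksum, low byte first (assumed byte order)."""
    crc = dsi_crc16(payload)
    return payload + bytes([crc & 0xFF, crc >> 8])


def stripe_to_lanes(data: bytes, num_lanes: int) -> List[bytes]:
    """Distribute bytes round-robin: byte 0 to lane 0, byte 1 to lane 1, ..."""
    if num_lanes < 1:
        raise ValueError("need at least one lane")
    return [data[lane::num_lanes] for lane in range(num_lanes)]


def serialize_lsb_first(byte: int) -> List[int]:
    """Bit sequence a serializer would emit for one byte, LSB first."""
    return [(byte >> i) & 1 for i in range(8)]


lanes = stripe_to_lanes(bytes(range(8)), 4)
clock_bits = serialize_lsb_first(0xAA)   # alternating bits: a clock waveform
```

The last line shows why a constant 0xaa on the clock lane is enough: the serialized bits alternate on every bit period, which is exactly a DDR clock.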