2D GPU: HD resolution @ 60Hz, HDMI, double buff, hardware acceleration, USB config upgrade, ... For ARDUINO & other µC

This is an HDMI 2D GPU board for microcontrollers. The engine is embedded in an FPGA.
VHDL development was done on a Terasic board.
Specifications achieved:
- Maximum resolution HD (1280x720) @ 60Hz with 8bpp palette (256 of 16.7 M colors), double buffering, hardware acceleration (line, circle, blob, sprites, text, clipping area, ...)
- FPGA config file upgradable via USB
- Foreseen: Sound via HDMI.
Give a display to an ARDUINO, PIC, ARM or whatever microcontroller has an SPI interface.
Fully ESD protected.
Target price: 90 EUR

One of the frustrating things about small embedded systems is that as soon as you want graphics capability (at least medium to high resolution) you must use Linux or another high-level OS (which, moreover, is not real time). The reason is the bloody closed-source drivers for all these cheap ARM chips (Broadcom, TI, ...). There are some projects that try to go bare-metal on the chip (on the RasPi for example) but that's time consuming.

So I decided to launch HOMER. With HOMER, you'll have the key to doing some beautiful graphics, GUIs and even games on a classic PC monitor.


- 1280x720 (HD) @ 8bpp @ 60Hz (-> done)

- Full palette over 24bit (-> done)

- Double buffering (2 SRAM blocks) (-> done)

- 2D hardware acceleration (HOMER command example: line(0,0,120,500,option) ) (-> done)

- FPGA config file USB upgradable (nobody wants to buy a blaster II) (-> done)

- ESD protected (-> done)

- HDMI output (-> done)

- Audio (-> not worked on this yet)
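
As a rough sanity check on the figures in this list, the size of one 8bpp frame buffer and the visible pixel rate follow directly from the resolution. The function names below are only for this sketch, and blanking intervals are ignored.

```c
#include <stdint.h>

/* Bytes needed for one frame buffer at 8 bits per pixel (1 byte per pixel). */
uint32_t fb_bytes(uint32_t width, uint32_t height)
{
    return width * height;
}

/* Visible pixels that must be read per second for scan-out.
   Blanking intervals are not counted in this sketch. */
uint32_t scanout_pixels_per_s(uint32_t width, uint32_t height, uint32_t hz)
{
    return width * height * hz;
}
```

For 1280x720 at 60Hz this gives 921,600 bytes per buffer (so each of the two SRAM blocks has to hold at least that much) and about 55.3 million visible pixels per second.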

How does Homer work (see the sketch)?

Homer can connect to any host with an SPI interface, or via the USB interface when connected to a PC. Homer is powered via the SPI connector, the USB interface, or both.

The host (PC, uC, Arduino, ESP8266, whatever...) sends high-level commands to the SPI or USB interface (the maximum SPI speed is not yet defined; I'm using 10Mbps for now). The commands are pushed to the FIFO. The 2D GPU engine pulls the commands and processes them. Basically, the engine contains a set of primitives (drawLine, putTxt, circle, fill, blobCopy, selectPalette, ...) that read/write the non-displayed frame buffer (FB). Some primitives, like putTxt and blobCopy, take data from the FLASH and send it to the non-displayed FB. The flash contains user pre-loaded fonts, glyphs and images that are transferred (with processing) to the FB with single commands. The host only has to store the useful data in FLASH once.

A special command (SwapBuffers() or whatever the name) asks HOMER to switch the FB that is displayed. Internally, the 2D engine asks the FB controller to do the switch. As soon as the FB controller has done it, it sends an ack to the 2D GPU engine, and the following commands are then applied to the new non-displayed FB. Transparency management is in progress.

Displaying the content on the screen is done via a simple FB scan: the CPLD has a counter, synchronised with the FPGA one, that increments the FB RAM addresses. The RAM data is transferred to the FB controller, then to one of the user-stored palettes. The palettes are LUTs with 256 entries and 24-bit outputs (8 bits per color). The data is then transferred to the HDMI PHY chip with the right timing.
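
The palette stage described above can be modelled in software as a 256-entry table lookup. This is only a sketch of the idea, not the VHDL: the type and function names are made up for illustration, and packing a colour as 0xRRGGBB is an assumption.

```c
#include <stdint.h>

#define PALETTE_ENTRIES 256

/* One palette: 256 entries, each a 24-bit colour.
   ASSUMPTION: colours are packed as 0xRRGGBB in this model. */
typedef struct {
    uint32_t rgb[PALETTE_ENTRIES];
} palette_t;

/* Fill a palette with a grey ramp (index i -> R = G = B = i). */
void palette_grey_ramp(palette_t *p)
{
    for (uint32_t i = 0; i < PALETTE_ENTRIES; i++)
        p->rgb[i] = (i << 16) | (i << 8) | i;
}

/* Scan-out step for one pixel: the 8-bit FB value indexes the LUT. */
uint32_t palette_lookup(const palette_t *p, uint8_t fb_value)
{
    return p->rgb[fb_value];
}
```

Because the lookup happens at scan-out, selecting another palette changes the colours on screen without touching the frame buffer contents.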

Typically, a host program looks like this:

Loop() {
    Clear(); /// or fill(...) or nothing
    putString(X, Y, fontIDX, "blablabla", foregroundColor, backgroundColor);
    copyBlob(flashMemAddress, X, Y, W, H, operation); /// operation = bitwise
    drawLine(X1, Y1, X2, Y2, color, option); /// option will include antialiasing (slower)
    SwapBuffers(); /// show the frame just drawn (command name not final)
}


The host can implement the commands like this (big endian, 8-bit SPI transfers):

Example for drawLine(X1, Y1, X2, Y2, color, option), where U and L are the upper and lower bytes of each 16-bit coordinate:

buff[0] = X1U; buff[1] = X1L; buff[2] = Y1U; buff[3] = Y1L;
buff[4] = X2U; buff[5] = X2L; buff[6] = Y2U; buff[7] = Y2L;
buff[8] = color; buff[9] = option;
sendSPI(buff, 10); /// send the buffer to SPI, length = 10
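
The byte layout above can be produced from 16-bit coordinates with a small helper. This is a minimal sketch: pack_draw_line and put_u16_be are hypothetical names, and how the command itself is identified on the wire is not shown in the example above, so it is left out here as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 16-bit value big endian: upper byte first, then lower byte. */
static void put_u16_be(uint8_t *dst, uint16_t v)
{
    dst[0] = (uint8_t)(v >> 8);
    dst[1] = (uint8_t)(v & 0xFF);
}

/* Fill buff (at least 10 bytes) with the drawLine payload shown above.
   Returns the number of bytes to hand to sendSPI(). */
size_t pack_draw_line(uint8_t *buff,
                      uint16_t x1, uint16_t y1,
                      uint16_t x2, uint16_t y2,
                      uint8_t color, uint8_t option)
{
    put_u16_be(&buff[0], x1);
    put_u16_be(&buff[2], y1);
    put_u16_be(&buff[4], x2);
    put_u16_be(&buff[6], y2);
    buff[8] = color;
    buff[9] = option;
    return 10;
}
```

A host would then call something like sendSPI(buff, pack_draw_line(buff, 0, 0, 1279, 719, 15, 0)).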

The maximum frame rate is 60 frames per second (FPS), which corresponds to the HD refresh rate of 60Hz. Depending on the load on the 2D engine, the FPS may be lower.
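
Besides the engine load, the host link also puts an upper bound on how many commands fit into one frame. The estimate below is a back-of-the-envelope sketch that assumes the 10Mbps SPI clock mentioned earlier and the 10-byte drawLine payload, and ignores any framing or chip-select overhead; the function name is made up.

```c
#include <stdint.h>

/* Upper bound on fixed-size commands the host can send per frame.
   Ignores SPI framing overhead and any gaps between transfers. */
uint32_t max_cmds_per_frame(uint32_t spi_bits_per_s, uint32_t fps,
                            uint32_t cmd_bytes)
{
    uint32_t bytes_per_frame = spi_bits_per_s / 8u / fps;
    return bytes_per_frame / cmd_bytes;
}
```

With 10,000,000 bit/s, 60 FPS and 10 bytes per command this comes to roughly 2,000 drawLine commands per frame at most.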

  • 1 × FPGA The cheapest ALTERA FPGA (Cyclone IV series)
  • 1 × ARM LPC812 For loading the config file into the FPGA and for the future upgrade via USB
  • 1 × MAX V CPLD Not enough FPGA pins, so a CPLD muxes the SRAM
  • 2 × QUAD SPI FLASH The first is connected to the LPC812 for the FPGA config file. The second is connected to the FPGA for fast loading of the user's preloaded fonts, blobs, backgrounds, ... into SRAM
  • 1 × TDA19988 Video, Graphics and Imaging ICs / DVI-HDMI Interface, Receivers, Transceivers


  • Fonts ok, adding alpha transparency

    monnoliv 10/18/2015 at 18:49 0 comments

    Converting and adding (mono) TrueType fonts is done. Text is displayed in 16 grey-scale values. Text on a black background looks beautiful, but the same text on another background colour or over an image is ugly, so I decided to work on alpha transparency.

    Alpha transparency

    If one wants to display text in a specified colour on an image or other background, the system must compute a mix of the text pixel weight (16 values) and the background pixel colour (240 colour values; BTW I've fixed the palette content to 6-8-5 level RGB values, i.e. 6 x 8 x 5 = 240 colours, plus 16 grey-scale values). This calculation has to take into account a user-selected transparency (also 16 values). I've implemented a LUT to do this, but it is not finished yet.
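
The calculation described above can be written out arithmetically for one 8-bit colour channel. This is only a sketch of the idea, not the LUT used in HOMER: the function names, the rounding, and the convention that a transparency value of 15 means fully opaque are all assumptions.

```c
#include <stdint.h>

/* Blend one 8-bit channel.
   weight: glyph grey level, 0..15 (15 = fully covered pixel)
   alpha:  user transparency, 0..15 (ASSUMPTION: 15 = opaque) */
uint8_t blend_channel(uint8_t fg, uint8_t bg, uint8_t weight, uint8_t alpha)
{
    uint32_t cov = (uint32_t)weight * alpha;   /* combined coverage, 0..225 */
    /* Weighted average of fg and bg, rounded to nearest. */
    return (uint8_t)((fg * cov + bg * (225u - cov) + 112u) / 225u);
}

/* Blend two colours packed as 0xRRGGBB, channel by channel. */
uint32_t blend_rgb(uint32_t fg, uint32_t bg, uint8_t weight, uint8_t alpha)
{
    uint32_t r = blend_channel((fg >> 16) & 0xFF, (bg >> 16) & 0xFF, weight, alpha);
    uint32_t g = blend_channel((fg >> 8) & 0xFF, (bg >> 8) & 0xFF, weight, alpha);
    uint32_t b = blend_channel(fg & 0xFF, bg & 0xFF, weight, alpha);
    return (r << 16) | (g << 8) | b;
}
```

Since weight and alpha each take only 16 values, a table of the 256 possible coverage factors could be precomputed from the same formula.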

    Another benefit of this alpha transparency is that the blob copy operation will get a transparency factor. The purpose is to mix the pixel colour values of the blob being copied with the pixel colour values of the screen area that receives the blob. That should give a nice effect.

    Work in progress ...
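
Because the frame buffer holds palette indices rather than RGB values, a mixed colour has to be mapped back to a palette entry. The sketch below shows one way to do that for the 240-colour part of the palette, reading "685-RGB" as 6 x 8 x 5 levels of R, G and B. The index layout and the function names are assumptions, and the 16 grey-scale entries are not handled.

```c
#include <stdint.h>

/* Quantise an 8-bit channel to `levels` evenly spaced steps (0..levels-1),
   rounding to the nearest step. */
static uint8_t quantise(uint8_t v, uint8_t levels)
{
    return (uint8_t)((v * (levels - 1u) + 127u) / 255u);
}

/* Map 24-bit RGB to an index 0..239 of a 6-8-5 colour cube.
   ASSUMPTION: index = (r * 8 + g) * 5 + b with 6 red, 8 green and
   5 blue levels; the real HOMER palette layout may differ. */
uint8_t rgb_to_685(uint8_t r, uint8_t g, uint8_t b)
{
    uint8_t r6 = quantise(r, 6);
    uint8_t g8 = quantise(g, 8);
    uint8_t b5 = quantise(b, 5);
    return (uint8_t)((r6 * 8u + g8) * 5u + b5);
}
```

With a fixed palette like this, the mapping needs no search through the palette, which is one plausible reason to fix its content.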

  • Still alive

    monnoliv 09/21/2015 at 16:24 0 comments

    Just a log to say that I'm working on fonts and font conversion.


  • HOMER WP5: FPGA conf via USB

    monnoliv 08/15/2015 at 20:50 0 comments

    I've managed to do a better video thanks to a tripod.


    - It's now possible to upload the FPGA configuration file via USB (VCOM port). What remains is to store this file in the FLASH next to the LPC812; then the board will be ready at power-on.

    - I've made a little Python application to do some tests with the board but also, and mainly, to upload image files to the embedded flash.

    Here is the C source code for the animation. This code can easily be embedded in a small 8-bit uC.

    	//Init LisaGlyph
    	for (uint32_t i=0; i<LISA_GLYPHS; i++)
    	//Init Lisa
    	for (uint32_t i=0; i<N_LISA; i++) {
    		Lisa[i].size = (SIZE_t){ LISA_SIZE_W, LISA_SIZE_H};
    		Lisa[i].pos.X = rand() % 1000;
    		Lisa[i].pos.Y = rand() % 500;
    		Lisa[i].inc.X = rand() % 10 + 0.5;
    		Lisa[i].inc.Y = i / 10.0 + 0.5;
    		Lisa[i].idxGlyph = 0;
    	}
    	//Init HomerGlyph
    	for (uint32_t i=0; i<HOMER_GLYPHS; i++)
    	//Init Homer
    	for (uint32_t i=0; i<N_HOMER; i++) {
    		Homer[i].size = (SIZE_t){ HOMER_SIZE_W, HOMER_SIZE_H};
    		Homer[i].pos.X = rand() % 1000;
    		Homer[i].pos.Y = rand() % 500;
    		Homer[i].inc.X = rand() % 10;
    		Homer[i].inc.Y = i / 2.0;
    		Homer[i].idxGlyph = 0;
    	}
    	//Init BallGlyph
    	for (uint32_t i=0; i<BALL_GLYPHS; i++)
    	Ball.size = (SIZE_t){ BALL_SIZE_W, BALL_SIZE_H};
    	Ball.pos.X = 500;
    	Ball.pos.Y = 100;
    	Ball.vit.X = 7.0;
    	Ball.acc.X = 0;
    	Ball.vit.Y = 0;
    	Ball.acc.Y = 1.5;
    	Ball.idxGlyph = 0;
    	Mur.size = (SIZE_t){ MUR_SIZE_W, MUR_SIZE_H};
    	Mur.pos.X = 500;
    	Mur.pos.Y = 719-512;
    	//Init display
    	GPUclipArea((COORD_t){0, 0}, (COORD_t){1279, 719});
    	while (1) {
    		for (uint32_t i=0; i<N_HOMER/2; i++)
    					GlyphHomer[ Homer[i].idxGlyph ],
    					(COORD_t){ (int16_t)Homer[i].pos.X, (int16_t)Homer[i].pos.Y },
    		for (uint32_t i=0; i<N_LISA/2; i++)
    					GlyphLisa[ Lisa[i].idxGlyph ],
    					(COORD_t){ (int16_t)Lisa[i].pos.X, (int16_t)Lisa[i].pos.Y },
    				(COORD_t){ (int16_t)Mur.pos.X, (int16_t)Mur.pos.Y },
    		for (uint32_t i=N_LISA/2; i<N_LISA; i++)
    					GlyphLisa[ Lisa[i].idxGlyph ],
    					(COORD_t){ (int16_t)Lisa[i].pos.X, (int16_t)Lisa[i].pos.Y },
    		for (uint32_t i=N_HOMER/2; i<N_HOMER; i++)
    					GlyphHomer[ Homer[i].idxGlyph ],
    					(COORD_t){ (int16_t)Homer[i].pos.X, (int16_t)Homer[i].pos.Y },
    				GlyphBall[ Ball.idxGlyph ],
    				(COORD_t){ (int16_t)Ball.pos.X, (int16_t)Ball.pos.Y },
    		GPUdrawLine((COORD_t){0, 0}, (COORD_t){ 1279, 0}, 15, 0);
    		GPUdrawLine((COORD_t){0, 0}, (COORD_t){ 0, 719}, 15, 0);
    		GPUdrawLine((COORD_t){1279, 0}, (COORD_t){ 1279, 719}, 15, 0);
    		GPUdrawLine((COORD_t){0, 719}, (COORD_t){ 1279, 719}, 15, 0);
    		for (uint32_t i=0; i<N_LISA; i++) {
    			Lisa[i].pos.X += Lisa[i].inc.X;
    			Lisa[i].pos.Y += Lisa[i].inc.Y;
    			if (Lisa[i].pos.X > SCR_W - Lisa[i].size.W  || Lisa[i].pos.X < 0 ) {
    				Lisa[i].inc.X = - Lisa[i].inc.X;
    				Lisa[i].pos.X += Lisa[i].inc.X;
    			}
    			if (Lisa[i].pos.Y > SCR_H - Lisa[i].size.H || Lisa[i].pos.Y < 0 ) {
    				Lisa[i].inc.Y = - Lisa[i].inc.Y;
    				Lisa[i].pos.Y += Lisa[i].inc.Y;
    			}
    			Lisa[i].idxGlyph = ((uint16_t)(Lisa[i].pos.X/5)) % LISA_GLYPHS;
    		}
    		for (uint32_t i=0; i<N_HOMER; i++) {
    			Homer[i].pos.X += Homer[i].inc.X;
    			Homer[i].pos.Y += Homer[i].inc.Y;
    			if (Homer[i].pos.X > SCR_W - Homer[i].size.W  || Homer[i].pos.X < 0 ) {
    				Homer[i].inc.X = - Homer[i].inc.X;
    				Homer[i].pos.X += Homer[i].inc.X;
    			}
    			if (Homer[i].pos.Y > SCR_H - Homer[i].size.H || Homer[i].pos.Y < 0 ) {
    				Homer[i].inc.Y = - Homer[i].inc.Y;

  • HOMER WP4: Sprites

    monnoliv 08/05/2015 at 22:37 3 comments

    At last I managed to talk to the QSPI flash. So here are some blobs (sprites) of Lisa :-)

    in front of 200 moving lines and two clip areas.

    The image on screen is better than this bloody video, I can't figure out how to upload a good quality one.

  • HOMER WP3 bis

    monnoliv 07/21/2015 at 22:47 0 comments

    A small video, clip area, drawline, fillarea.

  • HOMER WP3, colors

    monnoliv 07/13/2015 at 21:21 0 comments

    Small demo with colors (palette a bit ugly). Drawline (Bresenham algorithm) implemented but not perfect yet. Clipping area for Drawline.
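
For readers who have not met it, here is a software sketch of a Bresenham line with a clipping rectangle. It is the textbook integer form of the algorithm with a per-pixel clip test, not the VHDL implementation; the names, the inclusive clip rectangle and the 8bpp linear buffer layout are assumptions of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Inclusive clip rectangle.
   ASSUMPTION: it lies entirely inside the frame buffer. */
typedef struct { int x0, y0, x1, y1; } clip_t;

/* Write a pixel only when it falls inside the clip rectangle. */
static void plot_clipped(uint8_t *fb, int stride, const clip_t *c,
                         int x, int y, uint8_t color)
{
    if (x >= c->x0 && x <= c->x1 && y >= c->y0 && y <= c->y1)
        fb[(size_t)y * (size_t)stride + (size_t)x] = color;
}

/* Integer Bresenham line covering all octants.
   fb is an 8bpp buffer with `stride` bytes per row. */
void draw_line(uint8_t *fb, int stride, const clip_t *c,
               int x0, int y0, int x1, int y1, uint8_t color)
{
    int dx = abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;

    for (;;) {
        plot_clipped(fb, stride, c, x0, y0, color);
        if (x0 == x1 && y0 == y1)
            break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step in y */
    }
}
```

Testing every pixel against the rectangle is the simplest form of clipping; clipping the end points first would avoid walking the invisible part of a long line.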

  • First animation

    monnoliv 07/08/2015 at 22:46 0 comments

    This is a first video result. I spent a lot of time debugging a mistake in the VHDL code. The video shows 50 blobs moving in HD resolution with double buffering. I haven't implemented the palettes yet, so it's grayscale. At every frame I do a clear screen. Once loading from flash is available, it will be possible to load a background instead of doing a clear screen.

    I'm very happy with this first result :-)
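
The clear, draw, swap cycle used for this animation can be modelled in a few lines. This is a host-side sketch of the double-buffering idea only: the structure and function names are made up, and the acknowledge handshake with the FB controller is reduced to flipping an index.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two frame buffers: one is scanned out, the other is drawn into. */
typedef struct {
    uint8_t *buf[2];     /* the two SRAM blocks */
    size_t   size;       /* bytes per buffer */
    int      displayed;  /* index of the buffer currently on screen */
} framebuffers_t;

/* The buffer the drawing primitives write to (the hidden one). */
uint8_t *fb_back(framebuffers_t *f)
{
    return f->buf[1 - f->displayed];
}

/* Clear the hidden buffer, as done at the start of every frame. */
void fb_clear(framebuffers_t *f, uint8_t color)
{
    memset(fb_back(f), color, f->size);
}

/* Swap roles: the freshly drawn buffer goes on screen. */
void fb_swap(framebuffers_t *f)
{
    f->displayed = 1 - f->displayed;
}
```

Drawing never touches the buffer being displayed, which is what keeps the moving blobs free of flicker.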

  • First HD output

    monnoliv 06/28/2015 at 17:07 2 comments

    This is a first HD output. I've managed to configure the TDA19988 chip (a little bit by chance ;-) ).

    You can see the beautiful random values of a block SRAM in greyscale.

  • HOMER is alive :-)

    monnoliv 06/23/2015 at 22:14 0 comments

    See video.


    HDMI chip OK in test mode, communication OK, but no video for the moment.

    The problem is that NXP didn't release the datasheet explaining how to configure the chip registers, so I have to dig into open-source driver code (Linux, Minix, LPC4350). If I had known that before, I would never have chosen this chip.

    Nevertheless things are progressing, and I'm very happy that the hardware is running.

    Apart from HDMI, it remains to test the overall timing for the HD video generation and of course to adapt the VHDL code to this hardware configuration.

  • Homer is born

    monnoliv 06/09/2015 at 20:39 0 comments

    Dead or alive ? I've to check soon




traverseda wrote 09/08/2017 at 18:31

Have you given any thought to supporting wishbone bus? The fantasy being things like

`ipcore > videoDecoder > display`

`sdcardIO > videoDecoder > display`

or otherwise enabling more complex fpga/fpga interactions.

salmanisheikh wrote 07/05/2015 at 11:02

video is so blurry...

monnoliv wrote 07/05/2015 at 13:45

I'll post new (better resolution) video as soon as I've some new stuff to show

salmanisheikh wrote 07/05/2015 at 15:31

cool...also anxious for some source files, will be patient :)

twl wrote 06/20/2015 at 23:51

Very nice project (especially the PCB design). I didn't understand though why did you take so many components, which IMHO unnecessarily raise the price:

- external HDMI PHY (Cyclone4 can AFAIK do HDMI at 720p/60Hz through its normal I/O)

- separate microcontroller (why not use a softcore instead?)

- 2 SRAM chips and a CPLD (presumably to decode/multiplex the mems)


PS. are you planning to post the sources somewhere?

monnoliv wrote 06/21/2015 at 07:40

Unless you take the GX version of the Cyclone IV (more expensive, 24€ instead of 12€) it's not possible (for me at least) to output an HD signal directly on the HDMI connector. The HDMI PHY costs less than 5€ and lets you transfer audio I2S signals as well.

The separate microcontroller has two advantages:

1. It will take care of the FPGA upgrade through a simple USB interface

2. It costs less (<2€) than the Altera configuration chip (15€). For this first prototype I fitted both for testing, but the idea is to remove the configuration chip.

The CPLD is necessary because I don't have enough pins on the FPGA and I didn't want to go for a BGA package (at least for this first board)

Yes, the source will be available as soon as the project is well under way.

Joel Bodenmann wrote 06/20/2015 at 08:37

This is a very interesting project. Nice work so far!

I am very interested in adding support for your GPU to my (embedded) graphics library called uGFX. What do you think of that? Are you interested?

Is it possible for you to supply one or two boards to me so I can implement the official support?

monnoliv wrote 06/20/2015 at 10:59

Yes I'm interested but wait ... the project is not yet finished ! I've to do a lot of hardware tests and finish to translate my VHDL code from Terasic to this new hardware configuration. For the moment I've only one home made board. As soon as I can (and if everything goes right) I'll build another one for you (I hope also for a successful kickstarter campaign otherwise you'll work only for me :-) ). Stay tuned for the progress!

Joel Bodenmann wrote 06/20/2015 at 11:54

No worries, I'm not bored at all ;-)

Can you contact me via    info at ugfx dot com   ?

monnoliv wrote 06/20/2015 at 14:34


PK wrote 06/19/2015 at 05:18

Excellent stuff - I'll be cheering you on!  I've got a similar, yet much more modestly spec'd project going (144 Macrocell CPLD).

monnoliv wrote 06/19/2015 at 06:33

Thank you, I saw your project. You're further along than me! It's a lot of work (and hope, since it's not going to be easy). Like you, my motivation is to have a simple graphics card for people who don't want to be forced onto Linux as soon as they want to do some graphics in their projects.

Fabian wrote 06/18/2015 at 09:42

Interesting project. Could bring some beef to arduino projects...

Do you have any target date when this thing will be available?

monnoliv wrote 06/18/2015 at 10:01

AFAP (As Fast As Possible!) it depends on the problem I'll encounter but I've a target by end of August for a board 100% working.

