Detailed VGA controller description

A project log for FPGA Doom

Porting the classic Doom engine to an FPGA-based system

Matt Stock, 12/11/2015 at 03:19 (2 comments)

As I mentioned earlier, the VGA controller was an interesting part of the design for me. While not perfect, it addresses my immediate needs, and there are several opportunities to tweak the design. For example, right now the 320x240 double pixel mode is hardcoded, but this could easily be moved into a control register to allow the CPU to change the video mode as needed.

Here's a block diagram of all the major parts:

There are three major components: the VGA output and the two bus interfaces.

Bus Slave Interface

This is the simplest of the three. It allows the CPU to update the four palette maps. For reasons explained in the section below, all four hold the same content and are quite small: 256 24-bit words each. In the future I could make these slightly larger and add the ability to palette swap.

Bus Master Interface

This one is a little more complex. The model is to fetch a scanline of data at a time from the video RAM and store it locally. The state machine uses the change of the vertical row as the trigger to start the load process. This gives the memory fetch a bit of a head start, since we have the whole horizontal blanking interval before the VGA interface starts to use the data.
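To make the trigger behaviour concrete, here is a minimal software model of such a fetch state machine. It is only a sketch under assumptions: the names `ScanlineFetcher`, `State` and `read_word` are hypothetical, one bus read per tick is a simplification, and vertical line doubling is not modelled. The default of 80 words per line follows from 320 pixels at 4 pixels per 32-bit word.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    FETCH = 1


class ScanlineFetcher:
    """Toy model of a row-triggered scanline fetch (names are hypothetical)."""

    def __init__(self, words_per_line=80):
        self.words_per_line = words_per_line
        self.state = State.IDLE
        self.last_row = None
        self.word_index = 0
        self.buffer = []

    def tick(self, row, read_word):
        """Advance one step.

        read_word(row, index) stands in for a bus-master read of video RAM.
        Assumption: one word arrives per tick; the real bus takes several cycles.
        """
        if row != self.last_row:
            # A change of the vertical row is the trigger to start a new load.
            self.last_row = row
            self.state = State.FETCH
            self.word_index = 0
            self.buffer = []
        if self.state is State.FETCH:
            self.buffer.append(read_word(row, self.word_index))
            self.word_index += 1
            if self.word_index == self.words_per_line:
                # Whole line is stored locally; wait for the next row change.
                self.state = State.IDLE
```

Because the row counter changes before the next visible pixel is drawn, a fetch started this way runs through the horizontal blanking interval before the display side needs any of the data.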

Since this is 8-bit pseudocolor, each 32-bit word from video memory represents 4 pixels. The values are used as an index into the palette map, and the resulting 24-bit color is put into the scanline memory. So that we can do all of these lookups in as few clock cycles as possible, we use 4 parallel palette maps, and 4 separate scanline memories - each responsible for 1/4th of the scanline. The addressing for each of these is again driven by the state machine.
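The unpack-and-lookup step can be sketched in software as follows. This is an illustration rather than the actual HDL: the function names are hypothetical, the grayscale palette is made-up test data, and the byte order (least significant byte is the leftmost pixel) is an assumption that the post does not state.

```python
PIXELS_PER_WORD = 4


def make_palette():
    """Hypothetical grayscale palette: 256 entries of 24-bit RGB."""
    return [(i << 16) | (i << 8) | i for i in range(256)]


def unpack_word(word):
    """Split a 32-bit word into four 8-bit palette indices.

    Assumption: the least significant byte is the leftmost pixel.
    """
    return [(word >> (8 * lane)) & 0xFF for lane in range(PIXELS_PER_WORD)]


def load_scanline(words, palettes):
    """Fill four scanline memories, one per byte lane.

    palettes is a list of four identical palette maps, one per lane, so that
    hardware could perform all four lookups in the same clock cycle. Here the
    lookups simply run one after another.
    """
    memories = [[] for _ in range(PIXELS_PER_WORD)]
    for word in words:
        for lane, index in enumerate(unpack_word(word)):
            memories[lane].append(palettes[lane][index])
    return memories
```

In this model, memory `k` ends up holding pixels `k`, `k + 4`, `k + 8` and so on, which is why each memory covers a quarter of the scanline and why each lane needs its own copy of the palette.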

VGA Interface

This component uses a fairly standard set of counters to generate the VGA control signals, such as the horizontal and vertical sync. The column position is used to select the correct values from the scanline memories: the upper bits select the word address in all of the memories, and the lower two bits select which of the memories to use via a mux.
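As a rough software model of the counters and the read-side addressing, consider the sketch below. The timing constants are the standard 640x480 at 60 Hz values, which I am assuming the controller uses since the post does not list them. Halving the horizontal counter to get a 320-pixel column is likewise an assumption about how the double pixel mode works, and the function names are hypothetical.

```python
# Standard 640x480@60 timing, in pixel clocks and lines (assumed, not from the post).
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # 800
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # 525


def vga_signals(h_count, v_count):
    """Return (in_hsync_pulse, in_vsync_pulse, visible) for the counter values."""
    hsync = H_VISIBLE + H_FRONT <= h_count < H_VISIBLE + H_FRONT + H_SYNC
    vsync = V_VISIBLE + V_FRONT <= v_count < V_VISIBLE + V_FRONT + V_SYNC
    visible = h_count < H_VISIBLE and v_count < V_VISIBLE
    return hsync, vsync, visible


def select_pixel(memories, h_count):
    """Pick the 24-bit color for the current horizontal position.

    memories is a list of four scanline memories, one per byte lane.
    """
    column = h_count >> 1      # assumed pixel doubling: 640 output -> 320 source
    word_addr = column >> 2    # upper bits: same address in every memory
    lane = column & 0b11       # lower two bits: mux select between memories
    return memories[lane][word_addr]
```

Every memory is read at the same `word_addr`, so the only per-pixel decision is the 4-to-1 mux on the low two bits of the column.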

Another thing to note is that you have two clock domains in this module, represented by the dotted line in the diagram. I attempted to minimize the number of signals that needed to cross that boundary, since they all need to be synchronized.

Discussions

Yann Guidon / YGDES wrote 12/11/2015 at 15:04

VGA is quite slow and internal block RAM is fast, so is it really necessary to parallelise the palette lookup to 4 pixels at once?

(just trying to understand)

Matt Stock wrote 12/11/2015 at 15:15

The VGA clock is 25 MHz, and the CPU clock is 50 MHz at the moment. The external RAM fetch is not currently pipelined, so it takes about 5 cycles to pull 4 bytes. If you add a clock cycle for each lookup, that means about 9 clock cycles for 4 pixels. In short, even with a head start using the horizontal blanking interval, the beam catches up with you before you're done with a scan line.
