Porting the classic Doom engine to an FPGA-based system

This project builds on my Bexkat1 CPU project, with the goal of porting the classic Doom game engine in playable form. It's a great platform to evaluate hardware design changes. The image samples shown here come from the non-public WAD file and are for illustration only. You can buy the game and get the WAD file from GOG, as well as many other places.

CPU project is at

I'm using the WAD file from Ultimate Doom on GOG. The build process is pretty simple, but it requires the full GCC, binutils, and newlib toolchain, as well as the bexkat1 source. The bexkat1 source is needed only for the support library that currently lives there: I do some things that are not part of newlib and so don't fit perfectly yet.

  • We're not dead yet!

    Matt Stock, 04/17/2020 at 17:49, 0 comments

    It's... been a while since I last added a log.  The great news is that I once again have DOOM running on the platform, this time on the DE10-Standard board.  I also made substantial improvements to all the interconnects and modules:

    • Wishbone-compliant interfaces, many of which support pipelined and block transfers
    • Update code style to reflect some additional experience on my part
    • Updates to use SystemVerilog at least to the degree I know it
    • Switch system from Princeton to Harvard architecture
    • More of Intel's IP is inferred by the synthesis tool rather than instantiated explicitly, to make things like FIFOs and dual-port RAM more portable
    • Some of the core elements (mostly the CPU) have Verilator tests and an easy to use test suite

    I also took the opportunity to aggressively cut down sections of the DOOM code that aren't relevant to my port (at least right now).  That included all of the sound support, argument parsing, and anything related to the game mode - it's hardcoded to retail.   Removing all of the parsing, strings and conditional logic stripped down the binary by a fairly large amount.
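    As a rough illustration of why hardcoding the game mode shrinks the binary, here is a minimal sketch. The enum and function names are hypothetical stand-ins, not the port's actual code. With the mode fixed at compile time, the compiler can discard the branches for the other modes along with the code that used to detect them.

```c
#include <assert.h>

/* Illustrative stand-in for the engine's game-mode enum; the names here
   are hypothetical, not copied from the port. */
typedef enum {
    MODE_SHAREWARE,
    MODE_REGISTERED,
    MODE_COMMERCIAL,
    MODE_RETAIL
} game_mode_t;

/* The mode is fixed at compile time instead of being detected at startup. */
#define PORT_GAME_MODE MODE_RETAIL

/* Number of selectable episodes for the fixed mode. Because the switch
   operand is a constant, an optimizing compiler can drop the dead cases. */
int episode_count(void) {
    switch (PORT_GAME_MODE) {
    case MODE_SHAREWARE:  return 1;
    case MODE_REGISTERED: return 3;
    case MODE_RETAIL:     return 4;
    default:              return 1; /* commercial: a single run of maps */
    }
}
```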

    I can post another video, but at the moment it's not much different from the old one.

    What this really gives me is a better baseline to experiment with things that will improve performance of the game.  In particular, I'm looking at a few options right now fairly carefully:

    • Modify gcc to pass at least some arguments in registers instead of always on the stack.  Memory access is expensive, even with a cache.
    • Finishing the pipelined version of the CPU.  It's mostly there, but it fails certain regression tests related to exceptions and I need to find the deadlocks.
    • Adding a DMA controller that will allow for fast block transfers between regular and video memory without the CPU.
    • Profiling the game to see where most of the time is spent.  I've already done this in a crude way, and not surprisingly, it seems like most of the time goes to recursively evaluating the player's visual field to determine what needs to be rendered (thus the arg handling above).  Other hot spots may jump out with more profiling.
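    For the profiling item, a crude region profiler can be sketched in a few lines of C. This is only an illustration under assumptions: the region names, the `profiler_t` type, and the tick source are all hypothetical. On real hardware the tick function would read a free-running timer or cycle counter; here it is a function pointer so the sketch runs anywhere.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tick source: returns a free-running 32-bit counter. */
typedef uint32_t (*tick_fn)(void);

/* Hypothetical regions of interest; a real port would pick its own. */
enum { REGION_BSP, REGION_DRAW, REGION_LOGIC, REGION_COUNT };

typedef struct {
    tick_fn  now;
    uint32_t start[REGION_COUNT];  /* tick value at last entry */
    uint64_t total[REGION_COUNT];  /* accumulated ticks per region */
    uint32_t calls[REGION_COUNT];  /* completed enter/leave pairs */
} profiler_t;

void prof_init(profiler_t *p, tick_fn now) {
    memset(p, 0, sizeof *p);
    p->now = now;
}

void prof_enter(profiler_t *p, int region) {
    assert(region >= 0 && region < REGION_COUNT);
    p->start[region] = p->now();
}

void prof_leave(profiler_t *p, int region) {
    assert(region >= 0 && region < REGION_COUNT);
    /* Unsigned subtraction stays correct across a single counter wrap. */
    p->total[region] += (uint32_t)(p->now() - p->start[region]);
    p->calls[region]++;
}

/* Index of the region with the largest accumulated time. */
int prof_hottest(const profiler_t *p) {
    int best = 0;
    for (int r = 1; r < REGION_COUNT; r++)
        if (p->total[r] > p->total[best]) best = r;
    return best;
}

/* Deterministic fake tick source for trying the sketch off-target. */
uint32_t fake_ticks(void) {
    static uint32_t t;
    t += 10u;
    return t;
}
```

    Since the visibility traversal is recursive, the enter/leave pair would go around the top-level call only; wrapping the recursive function itself would overwrite the start time on every nested call.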

    So that's about where we are right now.  If you've been watching this project for a while, thanks!  Let me know if there's anything in particular you are interested in seeing, or if you have any ideas on where to go from here.

  • Improved Interconnect

    Matt Stock, 06/29/2017 at 19:13, 1 comment

    To go beyond the performance I have right now, I think I'm going to have to spend some time improving memory access. It just takes too many cycles to access memory, even with the cache helping a fair bit.

    One option is to implement pipelined memory access and build a new CPU opcode that will support block memory moves. Another option would be to do another run of compacting the ISA so that there isn't as much dead space. That would have a tradeoff in decoding complexity, but I don't really think there's much of an issue there.

    For the purposes of DOOM, I think the real problem though is in making the compiler smarter about using registers vs memory locations. I think that's likely the best place to start, and should mean the least amount of complexity as well. Along similar lines, I should likely try to start passing at least a few of the function parameters in registers instead of forcing it all on the stack.
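    To put a rough number on the argument-passing idea, here is a back-of-the-envelope model. It assumes each stack-passed argument costs one store in the caller and one load in the callee, and it ignores any spills the callee might still need; the function name is made up for illustration.

```c
#include <assert.h>

/* Memory accesses spent purely on passing arguments for one call.
   nargs:    number of arguments the function takes
   reg_args: how many leading arguments the ABI passes in registers
   Assumption: every stack argument is stored once and loaded once. */
unsigned arg_mem_accesses(unsigned nargs, unsigned reg_args) {
    unsigned on_stack = nargs > reg_args ? nargs - reg_args : 0u;
    return 2u * on_stack;
}
```

    Under this model a three-argument call drops from six memory accesses to zero once at least three arguments travel in registers, which matters most for small functions that are called very often.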

    Let me know your thoughts, and if there's one area you find more interesting or useful than another, tell me why.

  • More Speed

    Matt Stock, 12/11/2016 at 02:00, 0 comments

    As with all good hobbies, I went on a couple of tangents recently. While I was waiting for the new daughterboard, I decided to spend some time on speed improvements.

    The first step was the easiest: I just doubled the clock speed from 50MHz to 100MHz. The external memory (SDRAM and SSRAM) had plenty of headroom, and I took the opportunity to parameterize all of the modules that use their own clock (SPI, I2C, UART, etc). That helped, but not enough. The memory operations were just too inefficient.
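    Parameterizing the peripheral modules mostly comes down to deriving each divider from the system clock instead of hardcoding it. The same arithmetic, written as a C helper, might look like the sketch below. The function name and the 115200 baud example rate are assumptions for illustration; in the HDL this would be a module parameter computed at elaboration time.

```c
#include <assert.h>
#include <stdint.h>

/* System clock cycles per peripheral bit period, rounded to nearest.
   clk_hz:  system clock frequency
   rate_hz: desired bit rate (UART baud, SPI or I2C clock, ...) */
uint32_t clock_divisor(uint32_t clk_hz, uint32_t rate_hz) {
    assert(rate_hz != 0u);
    return (clk_hz + rate_hz / 2u) / rate_hz;
}
```

    For a 115200 baud UART this gives 434 cycles per bit at 50MHz and 868 at 100MHz, so doubling the system clock only changes a parameter rather than the module.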

    The next step was to add simple write FIFOs for memory modules. The idea here is that writes can return immediately because the CPU isn't expecting a response. If the CPU makes a read request, that read will stall until the write FIFO is empty, which keeps memory consistent. I put this FIFO between the CPU and the memory controllers so that it would do the most good. It helped, but wasn't a game changer.
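    The write FIFO behavior can be modeled in software to make the ordering rule concrete. This is a behavioral sketch in C, not the actual HDL: the depth, memory size, and all names are illustrative assumptions. Writes are queued and return at once, and a read first drains every pending write so it can never observe stale data.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS  64u  /* illustrative backing store size */
#define FIFO_DEPTH 4u   /* illustrative FIFO depth */

typedef struct { uint32_t addr, data; } wr_entry_t;

typedef struct {
    uint32_t   mem[MEM_WORDS];
    wr_entry_t fifo[FIFO_DEPTH];
    unsigned   head;   /* index of the oldest pending write */
    unsigned   count;  /* number of pending writes */
} memsys_t;

/* Retire the oldest pending write into memory (one slow memory cycle). */
static void drain_one(memsys_t *m) {
    wr_entry_t e = m->fifo[m->head];
    m->mem[e.addr % MEM_WORDS] = e.data;
    m->head = (m->head + 1u) % FIFO_DEPTH;
    m->count--;
}

/* CPU write: queue and return at once; only a full FIFO forces a wait. */
void cpu_write(memsys_t *m, uint32_t addr, uint32_t data) {
    if (m->count == FIFO_DEPTH)
        drain_one(m);
    unsigned tail = (m->head + m->count) % FIFO_DEPTH;
    m->fifo[tail].addr = addr;
    m->fifo[tail].data = data;
    m->count++;
}

/* CPU read: stall until every queued write has landed, then read. */
uint32_t cpu_read(memsys_t *m, uint32_t addr) {
    while (m->count > 0u)
        drain_one(m);
    return m->mem[addr % MEM_WORDS];
}
```

    Draining the whole FIFO on any read is the simplest way to stay consistent. A more elaborate design could compare addresses and only stall on a match, at the cost of extra comparators.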

    As most of you probably already know, SDRAM isn't particularly efficient at single word operations. You need to "open" a row of memory, do some operations, and then "close" it again. The overhead of the open and close operations is huge when you only do a single read or write, which is what I was doing up until this point.
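    A simple cycle count shows how large that overhead is for reads. The sketch below is a simplified model, not the controller's real timing: it ignores refresh and assumes one data word per clock once the row is open. The timing field names follow common SDRAM datasheet terms, and any values passed in are assumptions.

```c
#include <assert.h>

/* Timing in controller clock cycles; values supplied by the caller are
   assumptions, not figures from this project. */
typedef struct {
    unsigned t_rcd;  /* row open: ACTIVATE to READ delay */
    unsigned cl;     /* CAS latency before the first data word */
    unsigned t_rp;   /* row close: PRECHARGE time */
} sdram_timing_t;

/* One open, read `words` consecutive words, close sequence. */
unsigned sdram_access_cycles(const sdram_timing_t *t, unsigned words) {
    return t->t_rcd + t->cl + words + t->t_rp;
}

/* The same number of words fetched one at a time, paying the open and
   close overhead on every word. */
unsigned sdram_single_word_cycles(const sdram_timing_t *t, unsigned words) {
    return words * sdram_access_cycles(t, 1u);
}
```

    With an assumed 2-cycle open, 2-cycle CAS latency, and 2-cycle close, four separate single-word reads cost 28 cycles, while one four-word access costs 10.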

    By adjusting the cache memory system and the SDRAM controller, I was able to enable pipelined operations, making memory access a lot more efficient. Since I already had the cache controller handling 4 words in a cache line, it wasn't particularly hard to enable a 4 word pipeline. The nice thing is that all of this is abstracted away from the CPU: it can still use single word operations and gain some of the benefits simply because the content gets into the cache memory more quickly. This, together with increasing the cache size to 4k words (from 1k), made a substantial difference in performance. I'm starting to use the Doom load time as my benchmark for these things, and I've now got it down to about 1m20s from program load to menu.
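    To show how a 4 word line and a 4k word cache carve up an address, here is a small sketch. It assumes a direct-mapped cache and word addressing, neither of which is stated above, so treat the layout as one plausible arrangement rather than the actual design.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_WORDS  4u     /* words per cache line (from the text) */
#define CACHE_WORDS 4096u  /* total cache size in words (from the text) */
#define NUM_LINES   (CACHE_WORDS / LINE_WORDS)  /* 1024 lines */

typedef struct {
    uint32_t tag;     /* identifies which memory block occupies the line */
    uint32_t index;   /* which cache line the address maps to */
    uint32_t offset;  /* word position within the 4 word line */
} cache_addr_t;

/* Split a word address for an assumed direct-mapped cache. */
cache_addr_t cache_split(uint32_t word_addr) {
    cache_addr_t a;
    a.offset = word_addr % LINE_WORDS;
    a.index  = (word_addr / LINE_WORDS) % NUM_LINES;
    a.tag    = word_addr / CACHE_WORDS;
    return a;
}
```

    A miss on any word fetches the whole line starting at the address with offset 0, which is exactly the 4 word transfer the pipelined SDRAM access delivers.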

    There are so many additional optimizations that I can still apply:

    • Block memory transfer CPU instructions
    • Improving the SDRAM controller to keep rows open longer and increase the time between refresh cycles
    • Improve the SSRAM controller to handle burst operations (should help with frame rate)

    I may tackle some of these in the near future.

  • New stuff

    Matt Stock, 11/17/2016 at 21:13, 0 comments

    I know it's been a while since I updated this project. I'm better at working on the projects than documenting them. I've been focused on some other projects for a while, but now I've got some time and renewed interest, so expect updates soon.

    I just finished spinning a new IO board for this project. It's a simple one that incorporates an RTC chip, a codec with line in/out and headphone out, PS/2 keyboard, and an aux output for the LED matrix. I've been working on an I2C master instance for my SoC, which I can then use to program the codec. Ultimately, I want to use the codec as the output for the game's PCM/WAV audio. So the next update will talk about the audio interface and progress on integrating the audio into Doom.

  • Cache and new codec

    Matt Stock, 01/03/2016 at 20:57, 0 comments

    I picked up the Adafruit Codec module recently, and I'm working to integrate it into the system design. It's SPI based and understands how to process both MIDI and WAV, so I'm hopeful that I'll be able to set this thing up to play sound effects and music from Doom without a lot of pain.


  • Video Demo

    Matt Stock, 12/11/2015 at 03:22, 1 comment

  • Detailed VGA controller description

    Matt Stock, 12/11/2015 at 03:19, 2 comments

    As I mentioned earlier, the VGA controller was an interesting part of the design for me. While not perfect, it addresses my immediate needs, and there are several opportunities to tweak the design. For example, right now the 320x240 double pixel mode is hardcoded, but this could easily be moved into a control register to allow the CPU to change the video mode as needed.
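    The double pixel mode is easy to express as address arithmetic. The sketch below assumes the controller scans out a 640x480 frame and that the framebuffer stores one entry per 320x240 pixel in row-major order; both are assumptions, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FB_WIDTH  320u
#define FB_HEIGHT 240u

/* Map a scan position on an assumed 640x480 output to the pixel index in
   a 320x240 framebuffer. Each framebuffer pixel covers a 2x2 block of
   screen pixels, so both coordinates are simply halved. */
uint32_t fb_index(uint32_t scan_x, uint32_t scan_y) {
    assert(scan_x < 2u * FB_WIDTH && scan_y < 2u * FB_HEIGHT);
    return (scan_y / 2u) * FB_WIDTH + (scan_x / 2u);
}
```

    In hardware the halving is just dropping the low bit of each counter, and every framebuffer row is read twice, once for each of the two scan lines it covers.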

    Here's a block diagram of all the major parts:


  • System Block Diagram

    Matt Stock, 12/11/2015 at 02:55, 0 comments

    Here's a high level view of the overall system architecture:


  • Custom Video (or the power of FPGA)

    Matt Stock, 12/09/2015 at 16:27, 3 comments

    There are a lot of howto articles about building VGA clocks in FPGAs. My favorite is the Pong Game. More complex for me was how to take that VGA clock and use it to build a true framebuffer for a CPU. This introduces two new challenges related to memory bandwidth and multiple clock domains. I'll describe how I implemented my framebuffer in Verilog in a later article, but here I thought I'd share one of the interesting things I did when porting DOOM that was made trivial on an FPGA platform.


  • It Works!

    Matt Stock, 12/09/2015 at 04:25, 0 comments

    I have the basics down: the program will start, load the WAD file, and render the player's view. Controls are crude, but I was able to map some of the keyboard commands to the analog joystick and pushbuttons to get things tested. I have a Doom on FPGA video up on YouTube.

    Some next steps:



  • Step 1

    Install the GCC, binutils, and newlib binaries following the project instructions.

  • Step 2

    Clone the Bexkat1 repo and run "make" in the soc/monitor directory.

  • Step 3

    Clone the DOOM repo, and edit the bexkat1doom/Makefile to refer to the soc/monitor/include and soc/monitor/library paths from step 2.





joepaul2126 wrote 03/24/2023 at 12:06 point

Well it is great to have that in now a days.


AlasterJames wrote 03/14/2023 at 13:22 point

Well it is great to have that in now a days, i am very excited for this project and it can be further used . i am also working on this type project.


sabyanti275 wrote 09/24/2022 at 17:06 point

Really impressive project you have started keep it up. As we recently started the development of a saas program of we are also working on it. As from your project I learnt few things going to implement on my own project.


polsjemi wrote 10/10/2020 at 13:10 point

This is super awesome project. I'm doing the similar project in my blog. You can see here

