
Architecture notes

A project log for Bexkat1 CPU

A custom 32-bit CPU core with GCC toolchain

Matt Stock, 12/09/2015 at 00:26

CPU Design

I've gone through a few iterations on the design, both as part of the trial and error learning process, and as I wanted to add new options. Here are the current key features:


Most opcodes currently take registers as source and destination, alongside simple load and store operations. There are 16 general purpose 32-bit registers. At the moment, %14 is the frame pointer, %15 is the stack pointer, and %13 (and sometimes %12) holds return values from function calls. All arguments are pushed onto the stack, though I may change that to registers eventually. Memory fetches and stores need 4-byte alignment. The CPU supports byte addressable memory, so there are byte enable signals and byte lane routing.
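To make the byte enable idea concrete, here is a small Python model of how a byte-addressed access could map onto the four lanes of a 32-bit bus. It is only a sketch under assumptions: the function name `byte_enables`, the lane ordering (bit 0 of the mask covers address offset 0), and the natural-alignment rule for 2-byte accesses are illustrative choices, not a description of the actual core.

```python
def byte_enables(addr: int, size: int) -> int:
    """Return a 4-bit byte enable mask for a `size`-byte access at `addr`.

    Assumptions (hypothetical, not taken from the real design):
    - the bus is 32 bits wide, so there are four byte lanes
    - mask bit 0 corresponds to the byte at address offset 0 in the word
    - accesses must be naturally aligned to their own size
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    offset = addr & 0x3  # position of the access inside the 32-bit word
    if offset % size != 0:
        raise ValueError("misaligned access")
    # `size` consecutive enable bits, shifted to the lane the access starts on
    return ((1 << size) - 1) << offset


def word_address(addr: int) -> int:
    """Return the 4-byte aligned address that is actually put on the bus."""
    return addr & ~0x3
```

For example, a byte store to `0x1001` would drive word address `0x1000` with only lane 1 enabled, while a full word access enables all four lanes.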


Implementation

Unlike a few hardcore folks, I'm implementing my design in an FPGA. I believe it will allow me to focus on the parts of the design I find most interesting, and allow me to skip the "where is that broken wire?" drudgery. I started the work on a Terasic DE1 board, and while the CPU itself can still run on that system (as well as something far smaller most likely), I've moved to the Terasic DE2i-150 board and the Altera Cyclone IV family.


Exceptions

Exception handling is quite primitive at this point. I have a simple vector block defined for interrupts, and currently only the reset address is defined. The base of the vector table can be remapped, so I anticipate remapping it into RAM in most system configs. No other interrupts are defined right now; the main work left is to implement the stack push before jumping into the ISR and to define a few new opcodes to enable and disable interrupts.
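As a rough illustration of a remappable vector table and of the stack push that is still to be implemented, the following Python sketch models interrupt entry. Everything specific here is an assumption made for the example: the 4-byte entry size, reset being entry 0, the pre-decrementing stack, and the names `vector_address` and `enter_isr`. It is not how the core currently behaves.

```python
MASK32 = 0xFFFFFFFF
VECTOR_ENTRY_BYTES = 4  # assumption: one 32-bit word per vector entry
RESET_VECTOR = 0        # assumption: reset occupies entry 0


def vector_address(base: int, vector: int) -> int:
    """Return the address of entry `vector` in a table starting at `base`."""
    return (base + vector * VECTOR_ENTRY_BYTES) & MASK32


def enter_isr(pc: int, sp: int, memory: dict, base: int, vector: int):
    """Model interrupt entry: push the return address, then jump to the vector.

    `memory` is a dict mapping word addresses to 32-bit values.
    Returns the new (pc, sp) pair.
    """
    sp = (sp - 4) & MASK32  # assumption: stack grows downward, pre-decrement
    memory[sp] = pc         # save where to resume after the ISR
    return vector_address(base, vector), sp
```

With the table remapped to a hypothetical RAM base of `0x00100000`, vector 1 would land at `0x00100004`, and the interrupted program counter would sit on top of the stack for the return path.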

Protected mode

A protection bit/supervisor mode is defined, with a separate stack pointer. It is effectively unused at the moment, but I can build on the stub if I need it.


Memory protection

There is no memory protection at the moment. In theory, an external exception from an MMU could do this, but I haven't spent time building out those features yet.


Caching and Pipelining


No caching or pipelining at the moment. Soon though.


Machine


I currently have IP cores for the following devices:


On my list to do:



I am also building some true custom hardware. One of my tests of the floating point core was a Mandelbrot set renderer. The algorithm is a classic parallel one, however, and should be implementable as a hardware module. You feed the matrices in, and after a short time, you read out the results. That should allow for a very fast block rendering method. On my list.
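The reason the Mandelbrot renderer parallelizes so well is that each pixel runs the same escape-time loop with no dependence on its neighbours. The Python sketch below shows that structure in software; the function names, the block parameters, and the iteration limit are illustrative assumptions, not the interface of the planned hardware module.

```python
def escape_time(cr: float, ci: float, max_iter: int) -> int:
    """Iterate z = z*z + c from z = 0 and return the step at which |z| > 2.

    Returns `max_iter` if the point has not escaped by then.
    """
    zr = zi = 0.0
    for i in range(max_iter):
        if zr * zr + zi * zi > 4.0:  # |z|^2 > 4 is the same as |z| > 2
            return i
        zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
    return max_iter


def render_block(x0: float, y0: float, step: float,
                 width: int, height: int, max_iter: int):
    """Compute escape times for a width x height block of the complex plane.

    Every pixel is independent, so a hardware version could evaluate
    many of these loops side by side.
    """
    return [
        [escape_time(x0 + col * step, y0 + row * step, max_iter)
         for col in range(width)]
        for row in range(height)
    ]
```

A block renderer in hardware would take the place of `render_block`: the coordinates of the block go in, and the grid of iteration counts comes back out.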
