Project | Bexkat1 CPU


Instruction Set Architecture

12/11/2015 at 23:57 • 1 comment

The ISA for the CPU is pretty low density. With a word size of 32 bits, there's a fair amount of room to do everything except absolute addresses and some large constants. As I've experimented with the ISA, I've left gaps, extra bits, etc., and it's a bit messy. I'm starting to clean things up and make the encoding a little more orthogonal, with the idea that this will also allow the CPU core to become more efficient.

The ISA is defined in my Opcode worksheet, and I try to make sure this is up to date as I make changes to the core and the assembler.

After some review, it turns out that it's a little more compact than I originally thought. Here's a summary of the types of opcodes. Most of them share a common structure.

REG	00oo oooo oxxx aaaa bbbb cccc xxxx xxxx
REGIND	01oo oooo vvvv aaaa bbbb vvvv vvvv vvvv
IMM	10oo oooo vvvv aaaa xxxx vvvv vvvv vvvv
DIR (32 bit)	11oo oooo vvvv aaaa bbbb vvvv vvvv vvvv
DIR (64 bit)	11oo oooo xxxx aaaa xxxx xxxx xxxx xxxx AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA

The key for the above is:

The first two bits are the form value. The different forms used to have very different structures, but they have since merged together quite a bit, so in many ways the form is now just an extension of the opcode number.
o = opcode number, unique within the form.
a, b, c = the 4 bit register identifiers. The specific opcode may not use all of these.
v = a signed offset of some kind. Used for PC offset addressing, register offset addressing, as well as for constant values for various functions.
A = the 32 bits of the second word. In most cases it is used as an absolute address, but for the ldi instruction it is a 32 bit constant.
x = unused bits. Maybe an opportunity to build more compact forms. For example, instead of having the ALU command be part of the opcode number, maybe I could have a single opcode number for the ALU (or floating point, etc), and use some of these bits to indicate the type of operation. Unfortunately, these bits aren't always available in all addressing modes, so it's not always an easy choice.
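
As a rough illustration of how these fields could be pulled apart in software, here is a minimal Python sketch of a decoder for the first instruction word. It is not taken from the core or the assembler. It assumes the rows above are written most significant bit first, that the REG form really has a 7-bit opcode as its row shows, and that the two groups of `v` bits join into one 16-bit signed value with the 4-bit group as the high nibble. The names `Fields`, `decode` and `sign_extend` are made up for the example.

```python
from dataclasses import dataclass

# Form names keyed by the top two bits of the first instruction word.
FORMS = {0b00: "REG", 0b01: "REGIND", 0b10: "IMM", 0b11: "DIR"}


@dataclass
class Fields:
    form: str
    opcode: int
    ra: int
    rb: int
    rc: int
    value: int


def sign_extend(raw, bits):
    """Interpret the low `bits` bits of raw as a two's complement number."""
    sign = 1 << (bits - 1)
    return (raw & (sign - 1)) - (raw & sign)


def decode(word):
    """Split a 32-bit instruction word into its fields.

    Fields a given opcode does not use are still extracted; the caller
    is expected to ignore them.
    """
    form = FORMS[(word >> 30) & 0x3]
    if form == "REG":
        # Assumption: the REG row shows seven 'o' bits after the form bits.
        opcode = (word >> 23) & 0x7F
        rc = (word >> 8) & 0xF
        value = 0
    else:
        opcode = (word >> 24) & 0x3F
        rc = 0
        # Assumption: bits 23..20 are the high nibble of v, bits 11..0 the rest.
        raw = (((word >> 20) & 0xF) << 12) | (word & 0xFFF)
        value = sign_extend(raw, 16)
    return Fields(form, opcode, (word >> 16) & 0xF, (word >> 12) & 0xF, rc, value)
```

Under those assumptions, `decode(0x45F37FFE)` yields form `REGIND`, opcode 5, registers a = 3 and b = 7, and an offset of -2. For the 64 bit DIR form, the second word would simply be fetched separately and used as the address or constant.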

FPGA Pin Assignments

12/09/2015 at 00:34 • 1 comment

This is in the project, but it's helpful to have an external guide as well. I should build a small breakout board for most of this stuff to avoid a bunch of wire harnesses.

Terasic DE2i-150

Function	GPIO	Pin
Serial1 TX	GPIO1[0]	G16
PS/2 Mouse - Clock	GPIO1[1]	F17
PS/2 Mouse - Data	GPIO1[2]	D18
LED Matrix - R0	GPIO1[3]	F18
LED Matrix - G0	GPIO1[4]	E14
LED Matrix - B0	GPIO1[5]	E15
LED Matrix - R1	GPIO1[6]	F15
LED Matrix - G1	GPIO1[7]	G16
LED Matrix - B1	GPIO1[8]	F12
LED Matrix - A	GPIO1[9]	F13
LED Matrix - B	GPIO1[10]	C14
LED Matrix - C	GPIO1[11]	D14
LED Matrix - OE	GPIO1[12]	D15
LED Matrix - STB	GPIO1[13]	D16
LED Matrix - Clock	GPIO1[14]	C17
PS/2 Keyboard - Clock	GPIO1[15]	C25
PS/2 Keyboard - Data	GPIO1[16]	C26
	GPIO1[17]	D28
	GPIO1[18]	D25
	GPIO1[19]	F20
	GPIO1[20]	E21
	GPIO1[21]	F23
SPI - RTC SCLK	GPIO1[22]	G20
SPI - RTC MISO	GPIO1[23]	F22
SPI - RTC SS	GPIO1[24]	G22
SPI - RTC MOSI	GPIO1[25]	G24
RST OUT	GPIO1[26]	G23
SPI - Codec XDCS	GPIO1[27]	A25
SPI - Codec CS	GPIO1[28]	A26
Codec - Data Request Intr	GPIO1[29]	A19
	GPIO1[30]	A28
	GPIO1[31]	A27
	GPIO1[32]	B30
SPI - General SCLK	GPIO1[33]	AG28
SPI - General MOSI	GPIO1[34]	AG26
SPI - General MISO	GPIO1[35]	Y21

Architecture notes

12/09/2015 at 00:26 • 0 comments

CPU Design

I've gone through a few iterations on the design, both as part of the trial and error learning process, and as I wanted to add new options. Here are the current key features:
- 32-bit address bus (byte addressable)
- 32-bit data bus
- Big endian (there is little-endian support in both the GNU utils and the CPU core, but I haven't really tested it much)
- 32-bit opcodes, with an optional 32-bit arg
- Addressing modes: inherent, direct, PC indexed, register indexed
- Single precision floating point
At the moment, the majority of the opcodes take registers as source and destination, with simple load and store operations for memory access. There are 16 general purpose 32-bit registers: %14 is used as the frame pointer, %15 is the stack pointer, and %13 (and sometimes %12) holds return values for function calls. All arguments are pushed onto the stack, though I may change that to registers eventually. Memory fetches and stores have to deal with 4-byte alignment, and because the CPU supports byte addressable memory, there are byte enable signals and byte lane handling.
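
To make the byte enable idea concrete, here is a small Python sketch of how a byte address and access size might map onto four byte enables on a 32-bit big-endian bus. This is an assumption-laden model, not the core's actual logic. It assumes bit 3 of the mask enables the lane carrying data bits 31..24, that the byte at the lowest address sits in that most significant lane, and that 2-byte accesses exist and must be 2-byte aligned. `byte_enables` and `word_address` are hypothetical names.

```python
def byte_enables(addr, size):
    """Return a 4-bit mask of active byte lanes for one bus access.

    Bit 3 of the result is assumed to enable data[31:24], bit 0 data[7:0].
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    offset = addr & 0x3
    if offset % size != 0:
        raise ValueError("misaligned access")
    # Big endian: offset 0 lands in the most significant lane.
    return ((1 << size) - 1) << (4 - offset - size)


def word_address(addr):
    """Address of the aligned 32-bit word that contains addr."""
    return addr & ~0x3
```

With this mapping, a byte store to `0x1003` would use word address `0x1000` with mask `0b0001`, while a full word access at `0x1000` enables all four lanes.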

Implementation
Unlike a few hardcore folks, I'm implementing my design in an FPGA. I believe it will allow me to focus on the parts of the design I find most interesting, and allow me to skip the "where is that broken wire?" drudgery. I started the work on a Terasic DE1 board, and while the CPU itself can still run on that system (as well as something far smaller most likely), I've moved to the Terasic DE2i-150 board and the Altera Cyclone IV family.

Exceptions
Exception handling is quite primitive at this point. I have a simple vector block defined for interrupts, and currently only the reset address is defined. The base of the vector table can be remapped, so I anticipate moving it into RAM in most system configs. No other interrupts are defined right now; the main work left is to implement the stack push before jumping into the ISR and to define a few new opcodes to enable and disable interrupts.

Protected mode
A protection bit/supervisor mode is defined, with a separate stack pointer. It is effectively unused at the moment, but I can build on the stub if I need it.

Memory protection
There is no memory protection at the moment. In theory, an external exception from an MMU could do this, but I haven't spent time building out those features yet.

Caching and Pipelining

No caching or pipelining at the moment. Soon though.

Machine

I currently have IP cores for the following devices:
- SDRAM (used for general purpose memory)
- SSRAM (used as frame buffer memory)
- Flash (unused at the moment)
- Async serial (used for the console, but could add as many as needed)
- SPI master (several busses)
  - RTC module
  - Micro SD card
  - LCD touchscreen
  - Dual A/D for joystick
- GPIO (switch and button interfacing)
- 640x480x256 VGA
On my to-do list:
- PS/2 keyboard (I have the core written, just not integrated)
- Ethernet PHY that's part of the DE2i-150
- External MIDI interface (this is really a version of async serial for basic audio out)
I'm also building some truly custom hardware. One of my tests of the floating point core was a Mandelbrot set renderer. The algorithm is a classic parallel one, however, and should be implementable as a hardware module: you feed the matrices in and, after a short time, read out the results. That should allow for a very fast block rendering method. It's on my list.
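
For reference, the reason the Mandelbrot renderer parallelizes so well is that every pixel runs the same escape-time loop with no dependence on its neighbours. The sketch below shows the standard escape-time iteration in Python; it is not the code that ran on the CPU. Python floats are double precision rather than the single precision the core provides, and `escape_count`, `render_block` and the default `max_iter` are illustrative choices.

```python
def escape_count(cr, ci, max_iter=64):
    """Iterate z = z*z + c from z = 0 and count steps until |z| exceeds 2."""
    zr = zi = 0.0
    for n in range(max_iter):
        zr2, zi2 = zr * zr, zi * zi
        if zr2 + zi2 > 4.0:
            return n
        # Both new components are computed from the old zr and zi.
        zr, zi = zr2 - zi2 + cr, 2.0 * zr * zi + ci
    return max_iter


def render_block(x0, y0, step, width, height, max_iter=64):
    """Compute escape counts for a block of points.

    Each entry depends only on its own coordinates, so a hardware module
    could evaluate many of them at once.
    """
    return [
        [escape_count(x0 + col * step, y0 + row * step, max_iter)
         for col in range(width)]
        for row in range(height)
    ]
```

A hardware version would presumably take the block coordinates as input and return the matrix of counts, which is the same shape as what `render_block` produces here.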
