A one-page CPU: spec, HDL, emulator and macro assembler each in one page. Fits in XC9572 CPLD.
The final change to OPC-1: we removed SEC, because we had to fix a bug in the Verilog, and then the machine no longer fit the CPLD, so something had to go. That's no great problem though, because with our macro assembler we can define an SEC like this:
MACRO SEC()
    lda.i 0x01
    lxa
ENDMACRO
You can run an emulation of OPC-1 in your browser: see here for a trivial program, or here or here for longer programs. (It's an emulator without many features, because, of course, it fits on one page.)
Our next mini-adventure was to see about an OPC-2 - instead of an accumulator machine, how about a load-store machine? More registers, operations on registers, and the only memory accesses to be loads and stores.
Unfortunately, although we did manage to make a working machine, fitting in the CPLD and again with all sources on one page, the result wasn't very satisfactory. The CPLD is so small that we could only have two registers, and the address space shrank again to just 10 bits. We could only afford load and store from one of the registers. We managed a conditional jump and a jump-and-link, so again one can manage subroutines. But there was so little room that JAL takes two bytes even though the second byte contains no information.
At this point we've run out of room in the CPLD, and it's time for a new project, although we might come back to OPC-1 and get it running on a breadboard. So far we've been exploring with synthesis, simulation and emulation.
For one last gasp, we wrote an OPC-3, which is a simple-minded expansion of OPC-1 into a machine with 16 bit addresses and 16 bit data. We quite like word-addressed machines, so this is one of those: addresses refer to words, not bytes! There's no simple way to access bytes, although of course you can always shift and mask. But a benefit of this size of machine is that a value, whether in a register, in memory or as an operand, is always big enough to specify a full address.
OPC-3 is a bit too big for our CPLD and a bit too simple to be impressive. The instructions have lots of unused bits. But it leads us to think of interesting directions for the next project.
Still fitting within the CPLD, and still keeping source, spec and emulator within 66 lines, we managed to add a feature: indirect addressing.
Having reduced the address space from 12 to 11 bits, so we could fit in the CPLD, it turns out we had spare logic capacity (but of course no spare flop capacity) and we also had one bit freed up in the instructions.
So now we have a 5 bit opcode field. We gain a load instruction - LDA - and the store instruction - STA - gains a second addressing mode. In both cases there's an extra level of indirection: the effective address is the value loaded from the address given in the instruction. Because the pointers are fetched using only an 8-bit address, this gives the machine a zero page, like the 6502.
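As a rough illustration of the extra level of indirection, here is a minimal Python sketch. It is not taken from our emulator: the function names and the `indirect` flag are made up for the example, and it assumes the pointer held in the zero page is a single byte, since the memory is 8 bits wide.

```python
ADDR_MASK = 0x7FF   # 11 bit address space
ZP_MASK = 0xFF      # pointers are fetched with an 8 bit address

def effective_address(mem, operand, indirect):
    """Return the address an instruction finally reads or writes."""
    if indirect:
        # Assumption: the pointer is the single byte stored at the
        # zero-page address given in the instruction.
        return mem[operand & ZP_MASK]
    return operand & ADDR_MASK

def lda(mem, operand, indirect=False):
    """Load a byte, directly or through a zero-page pointer."""
    return mem[effective_address(mem, operand, indirect)]

mem = [0] * 2048
mem[0x10] = 0x80    # pointer in the zero page
mem[0x80] = 0x2A    # the data it points at
```

With this setup a direct load from 0x10 returns the pointer itself, while the indirect load follows it to the data at 0x80.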
We also gain a set carry instruction, SEC, hoping to make subtraction-by-addition a little easier, as we lack a subtract instruction. And we gained a little by squeezing the carry bit into the link register. We needed that - the design now uses 100% of the Function Blocks in the CPLD.
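For anyone unfamiliar with subtraction-by-addition, the idea is the usual two's complement identity: a - b equals a + NOT(b) + 1, with the +1 supplied by a set carry. Here is a small Python sketch of the arithmetic. The helper names are invented for the example, and it assumes that ADD adds the carry in, which is what makes SEC useful here.

```python
MASK = 0xFF  # 8 bit accumulator

def add_with_carry(a, b, carry):
    """8 bit add with carry in; returns (result, carry_out)."""
    total = a + b + carry
    return total & MASK, total >> 8

def subtract(a, b):
    """Compute a - b as a + NOT(b) + 1 (SEC supplies the +1)."""
    not_b = ~b & MASK                    # NOT
    return add_with_carry(a, not_b, 1)   # SEC, then ADD

result, carry = subtract(0x35, 0x12)     # result 0x23, carry 1 (no borrow)
```

A carry out of 1 means no borrow was needed; `subtract(0x12, 0x35)` gives 0xDD with carry 0, which is -0x23 in two's complement.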
Here's the updated spec.
We made an update, adding a link register and three instructions, which give us subroutine capability. There's still no stack!
The cost of this update, in fitting in the CPLD, was one address bit, so we move down from a 12 bit address space to 11 bits.
JSR stores the current PC in the link register and the accumulator - the PC is 11 bits, the accumulator only 8, so we needed a 3 bit link register. RTS copies the link and accumulator into the PC. That's almost enough, but to save a return address we need access to the link register, so LXA exchanges the link register and the accumulator.
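The register shuffling is easier to see written out. The following Python sketch is only an illustration, not our emulator: the `Regs` class and function names are invented, and it assumes the link register takes the top 3 bits of the PC while the accumulator takes the low 8. It also glosses over exactly which PC value JSR saves.

```python
from dataclasses import dataclass

@dataclass
class Regs:
    pc: int = 0     # 11 bits
    acc: int = 0    # 8 bits
    link: int = 0   # 3 bits

def jsr(r, target):
    """Save the PC in link:acc, then jump."""
    r.link = (r.pc >> 8) & 0x7   # assumed: high 3 bits go to the link register
    r.acc = r.pc & 0xFF          # assumed: low 8 bits go to the accumulator
    r.pc = target & 0x7FF

def rts(r):
    """Rebuild the PC from link:acc."""
    r.pc = ((r.link & 0x7) << 8) | (r.acc & 0xFF)

def lxa(r):
    """Exchange link and accumulator (only 3 bits fit in the link register)."""
    r.link, r.acc = r.acc & 0x7, r.link
```

Since JSR overwrites the accumulator, a subroutine that wants to use it has to store both halves of the return address first: STA saves the low byte, then LXA brings the link bits into the accumulator so they can be stored too.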
In fact here's the spec on the subject:
Our aim here was to see if we could fit a useful CPU on a CPLD. We chose the Xilinx 9572 because we've used it before, and there's a breadboard-friendly dev board for it.
At the same time we wanted to see if we could describe a CPU in one page.
The 6502 is nice and simple but is too large for a CPLD and too complex to be described in one page, so we started with that and threw out the stack pointer, the index registers, and almost all the instructions. We're left with an accumulator machine, and we kept just two flags: the carry and the zero.
The first cut has a fixed instruction format of two bytes, which allows for two addressing modes: eight instructions in direct mode with a 12 bit operand, and sixteen instructions in implied/immediate mode with just an 8 bit operand. So we get a 12 bit address space, and a 256 byte zero page. We get LDA, ADD, SUB, and AND with two addressing modes, and STA, NOT, JP, JPC, JPZ and SEC with just one addressing mode.
With this version, not only must stack management be manual, but also subroutines have to be managed manually, perhaps by using a Wheeler Jump. Maybe self-modifying code would be essential. It's certainly a fully capable CPU though.
At this stage we also had an assembler and an emulator, both written in Python. See GitHub.