Project | ECM-16/TTL homebrew computer


Memory map

12/12/2022 at 19:57 • 0 comments

The computer has a 32-bit address bus, which provides an address space of 4 gigabytes. This is quite big, considering that the processor will run at frequencies in the megahertz range. Running at 4 MHz and doing only NOPs (2 clock cycles each, one 16-bit word per NOP), the PC would need roughly 18 minutes to traverse from address zero to the top of this address space. Filling all of it with memory is impractical in my opinion. Also, as I intend to use SRAM chips, it would be a very big and very, very expensive build, which might not even work reliably at such speeds.
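A quick sanity check of this estimate, as a small Python sketch. It assumes a NOP occupies one 16-bit word, so the PC advances by 2 bytes per NOP (as the Fetch 1 state does); the second call shows the figure if one NOP per address is counted instead. The function name is made up for illustration.

```python
ADDRESS_SPACE_BYTES = 2**32   # 32-bit address bus
CLOCK_HZ = 4_000_000          # 4 MHz system clock
CYCLES_PER_NOP = 2            # a NOP takes 2 clock cycles


def traverse_seconds(bytes_per_nop: int) -> float:
    """Seconds for the PC to walk the whole address space executing only NOPs."""
    nops = ADDRESS_SPACE_BYTES // bytes_per_nop
    return nops * CYCLES_PER_NOP / CLOCK_HZ


# Assumption: one NOP per 16-bit word (PC advances by 2 bytes).
print(f"{traverse_seconds(2) / 60:.1f} minutes with 2-byte NOPs")
# Counting one NOP per address instead doubles the time.
print(f"{traverse_seconds(1) / 60:.1f} minutes with one NOP per address")
```

Either way the walk takes tens of minutes, which is the point: the full 4 GB is far more than this machine can make use of.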

So there will be a maximum of 16 megabytes of SRAM (main RAM) installed -- for this, only the lowest 24 bits of the address are needed.

The main RAM address range will be from 0x0000.0000 up to 0x00FF.FFFF. This is just 1/256 of the whole address space.

Maybe some time in the future I'll implement some sort of memory management with paging and the ability to swap memory to permanent storage, so that the full address range could be utilized.

These 16 MB are the lower memory.

There will also be a chunk of address space where ROMs and I/O are mapped. For future-proofing I decided to place it at the very top, in the range from 0xFF00.0000 to 0xFFFF.FFFF.

The first megabyte (0xFF00.0000--0xFF0F.FFFF) is reserved for ROMs, where the system monitor and standard interrupt service routines are to be located.

The last megabyte (0xFFF0.0000--0xFFFF.FFFF) is for Video RAM -- two video buffers are to be located there, and they will be switched using page flipping, as described in the corresponding section of the wiki article. The two pages of VRAM will not necessarily occupy the whole megabyte; I think pages of 128 kBytes each will be enough. These buffer pages will sit at the very top of the address space.

The next-to-last megabyte (0xFFE0.0000--0xFFEF.FFFF) is reserved for various I/O ports, video control registers, the palette and the character table.

All the addresses in between will be left unused; there are just so many of them.

Graphical overview of the whole address space:

+-------------------------------------+ FFFF FFFF
|       ROMs, I/O, video RAM          |  16MB higher memory
+-------------------------------------+ FF00 0000
|                                     | FEFF FFFF
~       No hardware mapped here       ~
~  May be used for virtual addresses  ~
|                                     | 0100 0000
+-------------------------------------+ 00FF FFFF
|             Main RAM                |  16MB lower memory
+-------------------------------------+ 0000 0000
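The map above can be sketched as a small address classifier. This is only an illustration of the ranges given in the text (the function name and the returned labels are made up), not the actual decoding logic of the hardware.

```python
def classify(addr: int) -> str:
    """Name the region of the 32-bit address space that addr falls into."""
    if not 0 <= addr <= 0xFFFF_FFFF:
        raise ValueError("address does not fit into 32 bits")
    if addr <= 0x00FF_FFFF:
        return "main RAM"            # lower 16 MB
    if addr < 0xFF00_0000:
        return "unmapped"            # no hardware here
    # Higher memory: bits 23..20 select one of sixteen 1 MB slots.
    slot = (addr >> 20) & 0xF
    if slot == 0x0:
        return "ROM"
    if slot == 0xE:
        return "I/O"
    if slot == 0xF:
        return "video RAM"
    return "unused"


for a in (0x0000_0000, 0x0100_0000, 0xFF00_0000, 0xFFE2_0000, 0xFFFF_FFFF):
    print(f"{a:08X} -> {classify(a)}")
```

Only the top 8 address bits are needed to tell lower memory, higher memory and the unmapped middle apart, and 4 more bits pick the megabyte slot.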

Graphical overview of the higher memory:

+-------------------------------------+ FFFF FFFF
|              Video RAM              |  2 pages 128 kBytes each for display data
+-------------------------------------+ FFF0 0000
|                 I/O                 |  keyboard, PATA, UARTs etc. mapped here
+-------------------------------------+ FFE0 0000
|                                     |  unused
+-------------------------------------+ FFD0 0000
|                                     |  unused
+-------------------------------------+ FFC0 0000
|                                     |  unused
+-------------------------------------+ FFB0 0000
|                                     |  unused
+-------------------------------------+ FFA0 0000
|                                     |  unused
+-------------------------------------+ FF90 0000
|                                     |  unused
+-------------------------------------+ FF80 0000
|                                     |  unused
+-------------------------------------+ FF70 0000
|                                     |  unused
+-------------------------------------+ FF60 0000
|                                     |  unused
+-------------------------------------+ FF50 0000
|                                     |  unused
+-------------------------------------+ FF40 0000
|                                     |  unused
+-------------------------------------+ FF30 0000
|                                     |  unused
+-------------------------------------+ FF20 0000
|                                     |  unused
+-------------------------------------+ FF10 0000
|                 ROMs                |  EEPROMs with system software (monitor, ISRs)
+-------------------------------------+ FF00 0000
|                                     | FEFF FFFF
~          virtual addresses          ~

The ROMs and VRAM will not occupy their whole megabyte slots, but each gets a slot of its own for ease of addressing.

Graphical overview of the VRAM slot:

+-------------------------------------+ FFFF FFFF
|               page 1                |  
|            (128 kBytes)             |  
+-------------------------------------+ FFFE 0000
|               page 0                |  
|            (128 kBytes)             |  
+-------------------------------------+ FFFC 0000
|                                     |  
|                                     |  unused
+-------------------------------------+ FFFA 0000
|                                     |  
|                                     |  unused
+-------------------------------------+ FFF8 0000
|                                     |  
|                                     |  unused
+-------------------------------------+ FFF6 0000
|                                     |  
|                                     |  unused
+-------------------------------------+ FFF4 0000
|                                     |  
|                                     |  unused
+-------------------------------------+ FFF2 0000
|                                     |  
|                                     |  unused
+-------------------------------------+ FFF0 0000
|                                     | FFEF FFFF
~                 I/O                 ~

The video buffer pages occupy the topmost ranges.

At any time one of them is used by the GPU, or VGA adapter (yet to be designed and built), and the other can be written to by the CPU. When the write is done, and right before a new frame starts being drawn on screen, the pages are flipped (by a small interrupt service routine), and the next frame can be composed on the fresh page.

This way, the CPU is decoupled from the harsh VGA timings and can do its work at its own pace.
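A minimal sketch of the page-flipping idea in Python. It assumes the two 128 kByte pages fill the top 256 kBytes of the address space, which puts their bases at 0xFFFC.0000 and 0xFFFE.0000; the class and method names are hypothetical, since the video hardware is not designed yet.

```python
PAGE_SIZE = 128 * 1024
# Assumed page bases: two 128 kByte pages at the very top of the address space.
PAGE_BASE = (0xFFFC_0000, 0xFFFE_0000)


class PageFlipper:
    """Tracks which VRAM page is displayed and which one the CPU may draw to."""

    def __init__(self) -> None:
        self.front = 0  # page currently read by the video hardware

    @property
    def back(self) -> int:
        return 1 - self.front

    def display_base(self) -> int:
        return PAGE_BASE[self.front]

    def cpu_write_base(self) -> int:
        return PAGE_BASE[self.back]

    def on_new_frame(self, frame_ready: bool) -> None:
        """What the small ISR would do right before a new frame is drawn."""
        if frame_ready:
            self.front = self.back  # swap roles of the two pages


flipper = PageFlipper()
print(hex(flipper.cpu_write_base()))   # CPU composes the frame on page 1
flipper.on_new_frame(frame_ready=True)
print(hex(flipper.display_base()))     # the finished page is now displayed
```

If the CPU has not finished the frame when the ISR runs, nothing is swapped and the old page is simply shown for one more frame.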

Graphical overview of the I/O slot:

~                VRAM                 ~
|                                     | FFF0 0000
+-------------------------------------+ FFEF FFFF
|                                     |  
|            VGA controls             | 
+-------------------------------------+ FFEE 0000
|                                     |  
|               UART 3                |
+-------------------------------------+ FFEC 0000
|                                     |  
|               UART 2                |
+-------------------------------------+ FFEA 0000
|                                     |  
|               UART 1                |  
+-------------------------------------+ FFE8 0000
|                                     |  
|               UART 0                |
+-------------------------------------+ FFE6 0000
|                                     |  
|           PATA interface            |  to be used with CompactFlash storage
+-------------------------------------+ FFE4 0000
|                                     |  
|           keyboard input            | 
+-------------------------------------+ FFE2 0000
|                                     |  
|           terminal output           |  
+-------------------------------------+ FFE0 0000
|                                     | FFDF FFFF
~               unused                ~

The devices may change.

For each device/port quite a lot of address space is reserved, but in reality the whole range will be treated as a single address: the lower bits will be discarded. This is done to make the addressing circuitry simpler, and I have a lot of address space to spare.
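Discarding the lower bits can be sketched as follows. The sketch assumes eight equal 128 kByte windows in the I/O megabyte, in the order shown in the diagram above; the function name is made up, and the device list may change.

```python
# Devices in ascending address order, one per assumed 128 kByte window.
DEVICES = [
    "terminal output",
    "keyboard input",
    "PATA interface",
    "UART 0",
    "UART 1",
    "UART 2",
    "UART 3",
    "VGA controls",
]


def decode_io(addr: int):
    """Return the device selected by addr, or None if addr is outside the I/O slot."""
    if (addr >> 20) != 0xFFE:          # top 12 bits select the I/O megabyte
        return None
    window = (addr >> 17) & 0x7        # bits 19..17 select the device
    return DEVICES[window]             # bits 16..0 are simply ignored


print(decode_io(0xFFE2_0000))  # keyboard input
print(decode_io(0xFFE3_ABCD))  # still keyboard input: low bits are discarded
```

In hardware this corresponds to comparing the top 12 address bits against a constant and feeding the next 3 bits into a decoder, with no comparison logic for the remaining 17 bits.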

Rethinking multiplexed outputs
12/06/2022 at 07:12 • 0 comments

There are several 8-to-1 and 4-to-1 multiplexers around the build, routing 16-bit and 32-bit wide signals. Managing all this wiring looks like a challenge: one only has to look at the Pilot-1 CPU, which is mostly a register file coupled with an ALU, and it is ribbon cables all around.

There are quite a lot of wires criss-crossing the 8-to-1 multiplexer boards. It helps that these wires are all individual, and 3-D space allows one to weave them as one pleases.

Here is an example schematic for the Register File; there are a LOT of 16-bit paths:

If I were to replicate such a structure on a PCB, turning these wires into traces on 2 layers, it seems there would not be enough space, and I'd need multi-layered PCBs, which are more complicated and expensive to order; and if there is a wiring bug in the buried layers, there is no way to repair it.

The better solution, it seems, would be to replace the multiplexers with tri-state buffers that feed into a common bus and are activated by a single 3-to-8 decoder. This should greatly simplify the build.

Revised schematic for the Register File, using tri-state buffers; it looks much more manageable:

This new solution will be applied to replace the 4-to-1 and 8-to-1 multiplexers; the 2-to-1 multiplexers are fine as they are, since wiring them is not as complicated. This replacement will greatly cut the number of ribbon cables and weird orthogonal board connections in the final build.
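To illustrate why the two approaches are functionally equivalent, here is a small behavioural model in Python. It is a sketch only: `None` stands for a high-impedance output, and all function names are made up for the example.

```python
from typing import List, Optional, Sequence


def decode_3_to_8(select: int) -> List[bool]:
    """3-to-8 decoder: exactly one of the eight enable lines is active."""
    return [i == select for i in range(8)]


def tristate(value: int, enable: bool) -> Optional[int]:
    """Tri-state buffer: passes the value when enabled, else high impedance."""
    return value if enable else None


def shared_bus(sources: Sequence[int], select: int) -> int:
    """Eight tri-state buffers driving one common bus."""
    enables = decode_3_to_8(select)
    driven = [out for out in (tristate(v, e) for v, e in zip(sources, enables))
              if out is not None]
    # The decoder guarantees a single driver, so there is no bus contention.
    assert len(driven) == 1
    return driven[0]


def mux_8_to_1(sources: Sequence[int], select: int) -> int:
    """The multiplexer that the bus replaces."""
    return sources[select]


regs = [0x1111, 0x2222, 0x3333, 0x4444, 0x5555, 0x6666, 0x7777, 0x8888]
print(all(shared_bus(regs, s) == mux_8_to_1(regs, s) for s in range(8)))
```

The wiring benefit comes from the topology: each source connects to the same 16 bus lines instead of running its own 16 wires to a central multiplexer board.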

The reason I was wary of using this solution earlier is that I thought it would be significantly slower. But that applies more to situations where the clock speeds are much higher and there are many more such outputs on a single bus. In my case this speed penalty looks insignificant.

Also, when actually looking into the datasheets, the propagation delays for an 8-to-1 multiplexer and a tri-state buffer are almost the same, so my worries were unfounded.
ALU staggering
12/02/2022 at 19:00 • 0 comments

I solved the problem with staggering the ALU execution step and the fetch of the next instruction.

The state machine diagram is updated accordingly.

The problem was in the timing of propagation delays. All the registers are clocked on the rising edge, and the GPR and IR should have been updated at exactly the same moment. But in reality, there are several gate delays between the rising edge of the system clock signal and the latching of a particular register in the register file. The Instruction Register, on the other hand, is updated right at the system clock rising edge, so control signals for the next instruction propagate while the registers for the previous instruction are still being latched, leading to wrong operation results.

To solve this problem, I introduced a "Shadow" register right after the Instruction Register, which is latched some time after the IR:

This new register is clocked with a 90 degree shift relative to the system clock, so that the instruction word is propagated to the decoder a quarter of the system clock period later, when all other registers have long since latched their values.

Here is the clock diagram for this:

The system clock is derived from the base clock by dividing its frequency by 2, and the Shadow register clock signal is the AND of the base and system clock signals.
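The quarter-period shift can be checked with a tiny discrete-time model. This sketch makes one assumption that is not stated above: the divider toggles the system clock on the falling edge of the base clock. Time is counted in half-periods of the base clock, and the function names are made up.

```python
def base_clk(t: int) -> int:
    """Base clock: high on even half-period steps, low on odd ones."""
    return 1 if t % 2 == 0 else 0


def sys_clk(t: int) -> int:
    """System clock: base frequency / 2, toggling on base falling edges (assumed)."""
    return 1 if (t - 1) % 4 < 2 else 0


def shadow_clk(t: int) -> int:
    """Shadow register clock: AND of base and system clocks."""
    return base_clk(t) & sys_clk(t)


def rising_edges(clk, steps: int):
    """Steps at which clk goes from 0 to 1."""
    return [t for t in range(1, steps) if clk(t - 1) == 0 and clk(t) == 1]


SYS_PERIOD = 4  # one system clock period is 4 half-periods of the base clock
sys_edges = rising_edges(sys_clk, 12)
shadow_edges = rising_edges(shadow_clk, 12)
print(sys_edges, shadow_edges)
print([s - c for c, s in zip(sys_edges, shadow_edges)])  # delay in steps
```

Under this assumption each Shadow clock rising edge trails the system clock rising edge by one step out of four, i.e. by a quarter of the system clock period.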
2022-12-06 Update:
I figured there is no particular need for an additional clock phase -- all that is needed is a sufficient delay between the Instruction Register and Shadow Register latch signals. This can be accomplished with several consecutive NOT gates in the latching signal path of the Shadow Register. While tinkering with the simulation, it became clear that a delay of just 2 gates is sufficient. In a real circuit a longer delay may be better, say 6 NOT gates, which would give 20-30 ns.
Jumps simplification
11/20/2022 at 18:50 • 0 comments

The State Machine diagram and Jump instruction layouts are updated.
It looks like there is no need for jumps with offsets in a GPR - such jumps are unpredictable at the time of coding/assembly, and it is unlikely that any use can be gained from such an instruction -> removing this option.

Jumps using MPs other than PC as the base would be too confusing, and there is also no clear idea what they could be used for -> removing this option.
It seems that jumps with the current value of PC as the base address are the only useful option -- they are easily predictable.
Thus, decode complexity is reduced, and now there is also no distinction between the JMP and JSR states, so only one of them remains.

Jumping to an absolute address can be accomplished using interrupts, via a direct or immediate load of the address into the PC register pair, or via a double-word MOV into this pair.

For JSR, use of PC can be made implicit.

All of the above simplifies the use of jumps in assembly; the instruction will look like this:

JUMP offset

where JUMP is any of the 10 varieties of jump instructions, and offset is the distance between the jump target address and the current PC value.

In practice, the offset is to be calculated by the assembler, and the user should use labels.
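Here is a sketch of how an assembler might turn a label into such an offset. It assumes the jump is a 2-word (4-byte) instruction whose second word holds the signed 16-bit offset, and that the offset is relative to the PC after both words have been fetched; both points are assumptions for illustration, as are the function names.

```python
JUMP_SIZE = 4  # assumed: one instruction word plus one 16-bit offset word


def jump_offset(jump_addr: int, target_addr: int) -> int:
    """Encode the PC-relative offset for a jump located at jump_addr."""
    pc_after_fetch = jump_addr + JUMP_SIZE
    offset = target_addr - pc_after_fetch
    if not -0x8000 <= offset <= 0x7FFF:
        raise ValueError("jump target is out of signed 16-bit range")
    return offset & 0xFFFF  # two's complement, as stored in the second word


def jump_target(jump_addr: int, encoded: int) -> int:
    """What the CPU would compute: PC + sign-extended offset."""
    signed = encoded - 0x1_0000 if encoded & 0x8000 else encoded
    return (jump_addr + JUMP_SIZE + signed) & 0xFFFF_FFFF


labels = {"loop": 0x0100, "done": 0x0140}
print(hex(jump_offset(0x0120, labels["loop"])))  # backward jump
print(hex(jump_offset(0x0120, labels["done"])))  # forward jump
```

Because only the distance is encoded, the same machine code works wherever the routine is loaded, which is what makes PC-relative jumps position independent.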
Yet another subproject
11/12/2022 at 17:35 • 0 comments

Just a note: I have created a new subproject for describing the assembly language used with my CPU, and for tracking progress in the development of the assembler program.
Updated State Machine
11/10/2022 at 17:07 • 0 comments

As I tested the operations in simulation, I encountered an error in the design of the Jump instructions: the part of the State Machine responsible for orchestrating their execution was not right. Here is the updated state machine diagram with the error corrected:

List of states:

01 - F1 -- Fetch 1 state, loads 16-bit word from memory at address in PC into Instruction Register, advances PC by 2; it is a common start of any instruction execution cycle.

02 - F2 -- Fetch 2 state, loads 16-bit word from memory at address in PC into Memory Data Buffer Register, advances PC by 2; it is used for 2-word instructions, second word containing some immediate value.

03 - F_i -- Fetch Interrupt state, loads hardcoded JSR (SP) [r0] instruction into Instruction Register. The execution is modified so that [r0] value is ignored. This state is entered when the IRQ signal is set and the current instruction has been fully executed.

04 - ALU -- ALU state, perform any of the 29 ALU operations on GPR. This state is normally staggered / concurrent with F1 state. In case there is an IRQ signal active, the next instruction's Fetch state is entered only after the current ALU state is executed.

05 - MOV -- MOV state, Copy data from one register in CPU to another, can be repeated several times for multi-word transfers.

06 - MAB load -- during this state, the 32-bit Memory Address Buffer is loaded with sum of value from MP and offset; for loads/stores with pre-increment the Memory Pointer is also updated with this value.

07 - LD/ST (MP) -- load or store value from memory at address in MAB to register, advance MAB by 2; can be repeated several times for multi-word memory transfers.

08 - MP Post incr -- Updates 32-bit Memory Pointer Pair with sum of its current value and signed 16-bit offset.

09 - LD/ST direct -- Loading/storing 16-bit value from memory at immediate address (provided from combination of low byte of Instruction Register and Memory Data Buffer register).

10 - LDi -- Loading of 16-bit value from Memory Data Buffer register into one of the GPR, or sign extended 25-bit value from combination of low 9 bits of Instruction Register and Memory Data Buffer register.

11 - Addr Arithm -- Address Arithmetic, Loading 32-bit MP with sum of its current value and signed 16-bit offset.

12 - Misc -- Any of 8 unsorted instructions that are executed in one clock cycle.

13 - JMP -- Jump, Loading 32-bit PC with sum of its previous value and signed 16-bit offset.

14 - SP-4 -- Loading 32-bit SP with its current value minus 4, also storing this new value to 32-bit Memory Address Buffer.

15 - Push PC -- Store PC register pair to memory at current address in MAB, update MAB with its current value plus 2. This state is repeated 2 times, storing high and low halves of PC.

16 - EINT -- Enter Interrupt, or Jump to Interrupt Service routine, Loads PC with current value from Interrupt Vector register, clears IRQ signal.
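States 09 and 10 both build a wide value out of bits of the Instruction Register and the 16-bit Memory Data Buffer. A minimal sketch of that composition follows; it assumes the IR bits form the upper part and the MDB the lower 16 bits, which is not spelled out above, and the function names are made up.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def direct_address(ir: int, mdb: int) -> int:
    """State 09: 24-bit address from the low byte of IR and the MDB."""
    return ((ir & 0xFF) << 16) | (mdb & 0xFFFF)


def ldi_wide(ir: int, mdb: int) -> int:
    """State 10: sign-extended 25-bit immediate from low 9 bits of IR and the MDB."""
    return sign_extend(((ir & 0x1FF) << 16) | (mdb & 0xFFFF), 25)


print(hex(direct_address(0xAB12, 0x3456)))  # 24 bits: covers the 16 MB main RAM
print(ldi_wide(0x01FF, 0xFFFF))             # all 25 bits set
print(ldi_wide(0x0100, 0x0000))             # only the sign bit set
```

A 24-bit direct address reaches exactly the 16 MB of main RAM, so direct loads and stores cover the whole lower memory without a pointer register.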
Update on simulation; rethinking Jumps
11/06/2022 at 16:33 • 0 comments

With my new rudimentary assembler, I wrote a series of test routines for the instructions used in the CPU. This is a great help for debugging the simulation in a systematic way, and using them I have, to date, tested all instruction types except Jumps and Miscellaneous.

During the testing several wiring bugs were found and fixed.

As I started to test the Jumps, I found that I had thought them out in an erroneous way.

The way I designed the microinstruction sequence, the jump address is loaded from memory, which is not how it should work:

-

Old (wrong) Jump instruction sequence:
0 Fetch1 // load IR <-- Mem:PC; PC <-- PC+2
0' (Fetch2) // load MDB <-- Mem:PC; PC <-- PC+2
1 Set 2w flag
2 Load MAB <-- MP+offset // offset may be GPR or MDB
3 Load PCH <-- Mem:MAB; load MAB <-- MAB+2
3* Load PCL <-- Mem:MAB; load MAB <-- MAB+2

This sequence loads the direct jump address from somewhere in memory.
How jumps should work:

The way jumps should work is by loading the address from some Memory Pointer pair, with a possible offset, into the PC pair. This can be readily done using the Address Arithmetic instruction for unconditional jumps. But for conditional jumps, there needs to be a special kind of instruction anyway, one that encodes the condition. When using the PC as the base address, such jumps make the piece of code position independent.

So, the jump instructions have quite a lot in common with Address Arithmetic instructions while also having some resemblance to Indirect Memory Accesses, but they are not almost identical to the latter, as I thought earlier.

If there is a need to load a direct jump address from memory (e.g. when returning from a subroutine), it can be done with a 2-word load via an address pointer into the PCH and PCL registers.

-

Here is revised sequence for regular jumps:

0 Fetch1 // load IR <-- Mem:PC; PC <-- PC+2
0' (Fetch2) // load MDB <-- Mem:PC; PC <-- PC+2
1 Load PC <-- MP+offset // offset may be GPR or MDB. The offset is signed value.
-

Sequence for Jump to Subroutine:

This is a jump combined with storing the current PC value to memory, at the address pointed to by the Stack Pointer (with some offset).

The store part has the same microinstruction sequence as a regular indirect store with pre-increment; the only differences are that it is hardwired to store the PCH/PCL pair and that the pre-increment is always -4. The base pointer can be any of the four memory pointers, but the convention is to use the second one (SP). The jump part that follows is the regular unconditional jump. This combination makes it possible to store an address which, when loaded into PC again, allows the program to readily proceed with execution without further modification of the PC value.

-

Mnemonic for the Jump to Subroutine instruction:

JSR (SP) [offset]

The offset is the distance in bytes from the current PC to the start of the subroutine; it is used in the later, "jump" part of the sequence.

-

0 Fetch1 // load IR <-- Mem:PC; PC <-- PC+2
0' (Fetch2) // load MDB <-- Mem:PC; PC <-- PC+2
1 Set 2w flag
2 Load MAB <-- SP-4, Load SP <-- SP-4 // most common MP will be SP
3 Store Mem:MAB <-- PCH; MAB <-- MAB+2
3* Store Mem:MAB <-- PCL; MAB <-- MAB+2
4 Load PC <-- PC+offset // offset may be GPR or MDB. The offset is signed value.
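The JSR sequence above can be traced with a toy model in Python. This is a sketch under assumptions: memory is modelled as a dictionary of 16-bit words keyed by byte address, the instruction is taken to be 2 words long, and the `ret` method (a 2-word load into PCH/PCL with the stack pointer then advanced by 4) is a hypothetical counterpart added only to show that the stored address resumes execution correctly.

```python
MASK32 = 0xFFFF_FFFF


class ToyCpu:
    def __init__(self, pc: int, sp: int) -> None:
        self.pc = pc
        self.sp = sp
        self.mem = {}  # 16-bit words keyed by byte address

    def jsr(self, offset: int) -> None:
        # Steps 0 and 0': both instruction words fetched, PC advanced by 2 each.
        self.pc = (self.pc + 4) & MASK32
        # Step 2: SP <-- SP-4, and MAB gets the same value.
        self.sp = (self.sp - 4) & MASK32
        mab = self.sp
        # Steps 3 and 3*: store PCH, then PCL, advancing MAB by 2 each time.
        self.mem[mab] = (self.pc >> 16) & 0xFFFF
        mab += 2
        self.mem[mab] = self.pc & 0xFFFF
        mab += 2
        # Step 4: PC <-- PC + signed offset.
        self.pc = (self.pc + offset) & MASK32

    def ret(self) -> None:
        # Hypothetical return: 2-word load from the stack into PCH/PCL.
        pch = self.mem[self.sp]
        pcl = self.mem[self.sp + 2]
        self.pc = (pch << 16) | pcl
        self.sp = (self.sp + 4) & MASK32


cpu = ToyCpu(pc=0x0000_1000, sp=0x0000_8000)
cpu.jsr(0x20)
print(hex(cpu.pc), hex(cpu.sp))  # inside the subroutine
cpu.ret()
print(hex(cpu.pc), hex(cpu.sp))  # back at the instruction after the JSR
```

Since the PC is pushed after both fetches, the saved value already points at the instruction following the JSR, so no adjustment is needed on return.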
Status Register, Prefixes and Interrupt masks
10/27/2022 at 20:03 • 0 comments

There are several groups of latches scattered around the CPU, which can be collectively named as "Flags register".

The Status Register - 4 flags that store some properties of the ALU operation result. These flags are updated only by an ALU operation or by a special MOV; otherwise they stay intact during the execution of other operations. Their contents influence the results of conditional jumps.

Prefixes - 4 flags for multi-word operations and 1 flag for byte transfer. These are located in the Multi-word Transfer Enabler circuit that is integrated with the sequencer. The multi-word transfers are applicable to regular MOVs and to memory accesses via a memory pointer. The byte transfer flag only affects memory access operations. These 5 flags can only be set explicitly by "prefix" instructions - they are cleared automatically after transferring the number of words/bytes given by the particular flag. The "byte" flag works in parallel with the "multi-word operation" flags, so one can set the byte flag together with a multi-word transfer flag and then execute a load instruction, thus loading several bytes into consecutive registers. All these flags are cleared at the end of this Load operation.

The Multi-word Transfer Enabler makes some states repeat several times in a row (the looping arrows from a state into the same state on the State Machine diagram).

The operations for setting "Prefixes" flags are in the "Miscellaneous" group.

Interrupt Masks - several flags (for now I think that 7 or 8 may be needed) that can be set or cleared using operations from the "Miscellaneous" group. These flags are meant to differentiate interrupts by priority, so that while a high-priority interrupt is being served, the low-priority ones are ignored or put in a queue. For now this part is the least defined.

The Interrupt Masks group of latches/flags is to be placed in the Interrupt Controller circuit, which is yet to be developed. The sequencer itself only sees whether there is an interrupt request pending or not.
Current status (oct 2022) and plans for the future
10/27/2022 at 16:30 • 0 comments

For some time (~8-9 months) I put this project on hold... again. This time it was not just because I lost interest in favor of another topic, but because I decided to move (hopefully temporarily) from Belarus to another country, to get farther away from the war between Russia and Ukraine.

All the hardware was left in my home country, there was a bit of stress, and there were many more urgent things to do.

Finally, my interest in this project has reignited, and I am continuing to develop it.

For a time, this project will be software/documentation only - recently I updated the instruction descriptions and started writing an assembler program that translates mnemonics into machine instructions.

As of now, the plans are as follows:

- finish defining and describing instructions in Miscellaneous group

- finish and debug the model in logic simulator

- make a proper assembler

- write an emulator for the CPU - this probably will be finished before or alongside the model in logic simulator

- (provisional) - make working model using some hardware description language - this will require me to learn such a language

- (provisional) - make pcb designs

I now think that ordering custom PCBs is the way to go once I have the opportunity to return to making hardware, as wiring all the parts by hand on a perf board is a very time-consuming, finicky, and often error-prone process (although a satisfying one). The resulting hardware turns out to be quite fragile, with all these dangling wires.

If I go through with this custom PCB business, the Pilot system will remain on its own, and the proper processor will be built from scratch, with parts (the calculating core) functionally identical to the corresponding ones already built.

Before this, finishing the model is the top priority: as of now, not all aspects of the jump implementation are thought out, and the operations in the Miscellaneous group are not properly defined. The Memory subsystem is at a very early draft stage, and I/O is not done at all. The Interrupt system broadly exists, but some details still need work. Lastly, the startup system is to be developed, so that the CPU automatically enters the right working state at start.
New sub-project describing the instruction set
01/26/2022 at 10:39 • 0 comments

I created a new project, related to this one, for publishing the instruction set used by this ECM-16/TTL CPU. In that project all of the instructions are going to be presented and fully described in one place.