Log#04: New register organisation

By whygee on Monday 18 August 2008, 06:50 - Architecture

The architecture of YASEP is very unorthodox. It is a living experiment and evolves in many unexpected directions.

However, one known uncertainty has always been how to implement the instruction fetch mechanisms. The memory queues have been a guideline but no organisation has been tested yet, and validated. Back when VSP emerged from the chaos of my brain, I wanted to use one of four queues to fetch instructions, and to indicate the current queue in the 2-bit CQ register. This idea was already implemented in the RCA1802 processor (the 4-bit P register) but this adds some overhead (and YASEP's instruction stream was never meant to be ultracompact).

Funny : I find more and more common traits between YASEP and 1802 :-)

The CQ register (just like the COSMAC's P register) also slows down the core as a whole cycle is necessary to fetch the opcode from the queue. This goes against the idea of a pipelined processor, the pipeline being the implementation of a sequential principle and sequence occurs a lot in an instruction flow.

However, the availability of several queues as potential pre-cooked jump destination (address as well as corresponding data) is very interesting and this remains in the YASEP architecture. A jump instruction with a direct immediate adress remains possible but with some (future and planned) architectures, there is the risk of a high execution latency.

I recently came to the conclusion that a compromise between the completely weird and the classical approaches is necessary.

So I keep the memory queues but the first one is modified and assigned to the instruction pointer and a status register. I had sworn that I would never do that, but I'm forced to admit that in a sense, and in the current situation (where no cache can support parallel memory accesses) something "looking like that" is necessary. And I'll do my best to avoid the inherent traps !

First, why do I need the registers #0 and #1 to hold these values ? In the currently planned first implementation, I can use a bank of 512 registers, or 32 banks of 16 registers. This means that context swapping can be very fast (1 major cycle) and I need to save many informations at once. If these informations are stored in the SR space (as previously planned), more cycles are needed to save/restore the "whole" context. So the best place to store these critical informations is in the register set itself. I could have chosen to create another parallel register bank but this would consume too much memory. The availability of the "Current/Next IP" is also very useful for computing addresses in position-independent code.

So the new register map is :

0h: IP (replaces A0)
1h: ST (replaces D0)
2h: A1 \ Q1
3h: D1 /
4h: A2 \ Q2
5h: D2 /
6h: A3 \ Q3
7h: D3 /
8h: A3 \ Q4
9h: D3 /
Ah: A3 \ Q5
Bh: D3 /
Ch: R0
Dh: R1
Eh: R2
Fh: R3

Second: what does the Status Register contain ? Of course, I avoid the storage of carry flags and such. But I can't avoid the auto-update bits of the 5 remaining queues. They use 2x5 bits and 6 bits are unaffected yet (for how long ?). These two bits per queue represent the following codes :

00 : no update
10 : post-incrementation
11 : post-decrementation

2 queues are able to implement a normal stack (LIFO) and 2 additional bits represent this ability. So Q4 and Q5 have the following properties in the Status Register:

bit N   : update on/off
bit N+1 : update up/down
bit N+2 : stack on/off (pre/post modification)

Third: These registers are not really real registers. These are "shadowed" registers with a physical instance copied somewhere else. This is necessary because the register set can't have enough ports and these 2 specific registers are critical and accessed every cycle . So their incorporation in the register map makes them easily remanent through context switches and IRQs, as well as easily alterable (without going through get/put instructions) but the register bank is updated only when these new registers are accessed. Some new datapaths must be reserved for them.

sigh...

This means that most of the opcode map (the part with the jump instructions) must be redesigned.

re-sigh...

20200406:

Ah..... the old YASEP was very weird. The microYASEP solved many of the problems with a simpler structure.

I'm not sure if this log is very important but it shows what can happen when we try to make a weird idea work : it creates weird complications.

Log#03: 106MHz !

Log#05: A suitable HW platform for CPU design ?

Discussions

Become a Hackaday.io Member