Program Counter and other considerations

I made some mistakes with the previous diagrams and I might have uncovered a new concept...

Let's start with the register set: the last register is PC, which holds the address of the current instruction. It is precious for several reasons: for "loop entry", for call and return, for (conditional) jumps, for calculated/indexed or indirect jumps...

Jumping is easy: just write the desired address to PC. It can be an immediate value, a register (even PC), a value coming from memory (through a register), even data from an input port. It's THAT flexible. This justifies having the PC in the register set: the ISA is highly orthogonal and it saves a lot of opcode space.
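Here is a minimal sketch of "a jump is just a write to PC". It is a toy model, not the actual Y8 datapath: the 8-entry register list, the PC index of 7 and the set_reg helper are assumptions made for illustration.

```python
# Toy model: 8 registers, the last one (index 7) is PC.
# The register count and the helper name are assumptions for this sketch.
PC = 7

def set_reg(regs, dst, value):
    """Write a value to any register; when dst is PC, this is a jump."""
    new = list(regs)
    new[dst] = value
    return new

regs = [0] * 8
regs[PC] = 0x10      # address of the current instruction
regs[0] = 0x42       # some address held in register 0

jump_imm = set_reg(regs, PC, 0x80)       # jump to an immediate address
jump_reg = set_reg(regs, PC, regs[0])    # jump to the address held in a register
print(hex(jump_imm[PC]), hex(jump_reg[PC]))
```

The same write path serves every kind of jump, so no dedicated jump opcode is needed in this model.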

However, reading the PC register is not as simple. Physically reading the current value is easy, but most of the time this is not what we want. A computed/indexed jump simply adds a value to PC and any offset is easily adjusted by an additional instruction. However, Call and "loop entry" need PC+1!

"Loop entry" is emulated with PC+1=>Reg, but written that way it needs three addresses (PC, 1 and Reg), and Y8 has only 2 address fields: one source and one source/destination. Getting PC+1 directly is important for relocatable code: an 8-bit immediate is still possible, but you want to be able to move code blocks around (this uses a lot of registers, but you could spill to the stack).

Call absolutely requires the value of PC+1: this value gets written to the destination register (which may be an address or a data register) while the source (register or immediate) goes to PC (and the Program Address bus). But instruction fetch takes one cycle and things get a bit messy here.
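The data movement of a call can be sketched as follows. This is only a behavioural toy, ignoring the fetch timing mentioned above; the function names, the register indices and the 8-entry register list are assumptions for illustration.

```python
# Toy model of CALL: dst <- PC+1, PC <- source. Names are hypothetical.
PC = 7

def call(regs, dst, src_value):
    """The destination register gets the return address, PC gets the target."""
    new = list(regs)
    new[dst] = regs[PC] + 1    # address of the instruction after the call
    new[PC] = src_value        # target address (register or immediate)
    return new

def ret(regs, src):
    """Returning is an ordinary jump: copy the saved address back to PC."""
    new = list(regs)
    new[PC] = regs[src]
    return new

regs = [0] * 8
regs[PC] = 0x20
after_call = call(regs, dst=4, src_value=0x90)
after_ret = ret(after_call, src=4)
print(hex(after_call[4]), hex(after_call[PC]), hex(after_ret[PC]))
```

Without direct access to PC+1, the return address would have to be built with an extra instruction before the jump.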

It is NOT reasonably possible to read PC+1 from the register set: the incrementer adds some inherent latency to the already tight critical datapath (read operands, calculate, select the result from the various sources, setup & hold...). The value on the bus MUST come from the PC register itself.

Yet we need PC+1 for important features, which would otherwise take one more instruction and waste time and program space. And the incrementer on its own doesn't have much latency: its critical datapath (CDP) is a few gates at most, which leaves ample time to route the result to the Program Memory and any other MUX.

This is where I realise something very interesting... The instructions that need PC+1 keep data movement in the register set section. The ALU is short-circuited. The other instructions use the ALU but don't need PC+1. The two CDPs don't need to be added together!
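To make the timing argument concrete, here is a back-of-the-envelope comparison. All the delay figures are made up (arbitrary "gate levels") and are not measurements of the Y8; only the max-versus-sum reasoning matters.

```python
# Made-up delays, in arbitrary gate levels, purely for illustration.
T_REG_READ = 2     # read the operands from the register set
T_INCREMENT = 3    # PC -> PC+1
T_ALU = 10         # ADD/SUB, ROP2, SHL
T_RESULT_MUX = 2   # select the result from the various sources

# If any instruction could read PC+1, the incrementer would sit in front of the ALU.
chained = T_REG_READ + T_INCREMENT + T_ALU + T_RESULT_MUX

# Split paths: ALU instructions read the plain PC, control instructions tap PC+1.
alu_path = T_REG_READ + T_ALU + T_RESULT_MUX
control_path = T_REG_READ + T_INCREMENT
split = max(alu_path, control_path)

print(chained, split)
```

With these invented numbers the cycle is set by the ALU path alone, because the incrementer's delay runs in parallel with it instead of being added to it.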

ALU operations (ADD/SUB, ROP2, SHL) take 2 operands and create a result by going through a lot of circuits. Let's call that a "grand tour" :-)

OTOH the other instructions (the "control group" : SET, CALL, IN/OUT, LDCx) stay close to, or inside, the register set. Let's call that the "petit tour" :-D

SET and CALL need PC+1 and don't use the ALU, so for these instructions we can tap the desired value directly from the incrementer's output. This is possible because the value of PC+1 doesn't go much further than the register set (maybe to the data RAM). Giving PC+1 to LDCx and IN/OUT doesn't make much sense, however, because PC+1 would have to go through the main multiplexer and this would slow everything down.

Thus, SET and CALL have special access to PC+1 because they do a "petit tour". They bypass the ALU (which can get rid of its "bypass" flag) and get their value directly from the incrementer.
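The selection rule can be summed up in a few lines. The opcode names come from the text above, but the set names, the pc_seen_by function and the string encoding of opcodes are assumptions of this sketch, not the actual decoder.

```python
# Which value of PC does each instruction group get? Hypothetical helper names.
GRAND_TOUR = {"ADD", "SUB", "ROP2", "SHL"}           # go through the ALU
PETIT_TOUR = {"SET", "CALL", "IN", "OUT", "LDCx"}    # stay near the register set
TAP_INCREMENTER = {"SET", "CALL"}                    # the only ones that get PC+1

def pc_seen_by(opcode, pc):
    """Value of PC made available to an instruction in this model."""
    if opcode in TAP_INCREMENTER:
        return pc + 1    # tapped directly from the incrementer output
    return pc            # the PC register itself

pc = 0x30
loop_top = pc_seen_by("SET", pc)    # "loop entry": address of the next instruction
alu_view = pc_seen_by("ADD", pc)    # ALU operand: plain PC
io_view = pc_seen_by("IN", pc)      # petit tour, but no access to PC+1
print(hex(loop_top), hex(alu_view), hex(io_view))
```

Saving loop_top in a register and later writing it back to PC gives a relocatable loop, since no absolute address appears in the code.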
