Update 20231019 : The "carry trick" is dropped from the current ISA definition, as it applies only to forward jumps (IF blocks).
The Y8 can jump:
set label pc
it can also jump relative:
add (label-$) pc
and it can even jump relative conditionally:
add (label-$) pc ifc
But then the range is limited because only 4 bits are available for the signed offset. And I have already sacrificed one condition bit...
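To make the limit concrete, here is a minimal Python sketch of what a plain signed 4-bit field can express. It is ordinary two's-complement arithmetic, not a description of the Y8 datapath, and the name `sign_extend4` is only illustrative.

```python
def sign_extend4(field: int) -> int:
    """Interpret the low 4 bits of `field` as a two's-complement value."""
    field &= 0xF
    # Bit 3 is the sign bit of a 4-bit field.
    return field - 16 if field & 0x8 else field


# All 16 field values, in field order 0x0 .. 0xF.
values = [sign_extend4(f) for f in range(16)]
print(min(values), max(values))  # -8 7
```

A plain imm4 therefore spans -8 to +7, zero included.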
With the new assembler, here is the best that can be reasonably done:
 add (forward-$) PC ifnz
 nop ; 1
 nop ; 2
 nop ; 3
 nop ; 4
 nop ; 5
 nop ; 6
 nop ; 7
forward:

backwards:
 nop ; -8
 nop ; -7
 nop ; -6
 nop ; -5
 nop ; -4
 nop ; -3
 nop ; -2
 nop ; -1
 add (backwards-$) PC ifz
The output in .hyx:
;;hyx1
; L1: add (forward-$) PC ifnz
75BF ; @0: ADD 8 PC IFNZ
; L2: nop ; 1
0000 ; @1: NOP
; L3: nop ; 2
0000 ; @2: NOP
; L4: nop ; 3
0000 ; @3: NOP
; L5: nop ; 4
0000 ; @4: NOP
; L6: nop ; 5
0000 ; @5: NOP
; L7: nop ; 6
0000 ; @6: NOP
; L8: nop ; 7
0000 ; @7: NOP
; L9: forward:
; = 8
; L11: backwards:
; = 8
; L12: nop ; -8
0000 ; @8: NOP
; L13: nop ; -7
0000 ; @9: NOP
; L14: nop ; -6
0000 ; @10: NOP
; L15: nop ; -5
0000 ; @11: NOP
; L16: nop ; -4
0000 ; @12: NOP
; L17: nop ; -3
0000 ; @13: NOP
; L18: nop ; -2
0000 ; @14: NOP
; L19: nop ; -1
0000 ; @15: NOP
; L20: add (backwards-$) PC ifz
77C7 ; @16: ADD -8 PC IFZ
;;;; SYMBOL DUMP :
; * 'FORWARD'=8 ref:1 / sym_usr
; * 'BACKWARDS'=8 ref:1 / sym_usr
A backwards loop could then contain 8 instructions (including a test for the end of the loop) but the forward jump can only skip over 7 instructions, despite the ability to encode the constant 8 when dealing with the PC register.
The offset 1 is still possible: it targets the next instruction, which would be executed anyway. And the offset 8 points to the 8th instruction after the jump, that is, the first instruction after the skipped block; it is not the size of the skipped block.
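The counts above follow directly from the addresses in the listing. A small Python sketch of that arithmetic, assuming (as the `@` addresses in the listing suggest) that `$` is the address of the jump instruction itself; the function names are made up for the illustration:

```python
def forward_skipped(jump_addr: int, target_addr: int) -> int:
    """Instructions strictly between a forward jump and its target."""
    return target_addr - jump_addr - 1


def loop_body(jump_addr: int, target_addr: int) -> int:
    """Instructions from a backward target up to, but not including, the jump."""
    return jump_addr - target_addr


# Addresses taken from the .hyx listing.
JUMP_FWD, FORWARD = 0, 8
JUMP_BACK, BACKWARDS = 16, 8

print(FORWARD - JUMP_FWD, forward_skipped(JUMP_FWD, FORWARD))     # 8 7
print(BACKWARDS - JUMP_BACK, loop_body(JUMP_BACK, BACKWARDS))     # -8 8
```

So an offset of +8 skips 7 instructions, while an offset of -8 re-enters a body of 8 instructions that sits in front of the jump.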
At least it's now impossible to do a pointless loop such as:
ADD 0 PC ifnz ; spin endlessly doing nothing
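One way to read the behaviour described so far is that a non-negative imm4 gets 1 added when the destination is PC, while negative values are used as they are. The Python sketch below models that reading only; it is an assumption drawn from the text and the listing, not the actual Y8 logic, and `encode_short_add` / `decode_short_add` are hypothetical names.

```python
def encode_short_add(offset: int) -> int:
    """Map a PC-relative offset to a 4-bit field (assumed scheme)."""
    if -8 <= offset <= -1:
        return offset & 0xF      # negative offsets: plain two's complement
    if 1 <= offset <= 8:
        return offset - 1        # positive offsets: stored minus one
    raise ValueError(f"offset {offset} is not encodable")


def decode_short_add(field: int) -> int:
    """Inverse of encode_short_add under the same assumption."""
    field &= 0xF
    if field & 0x8:              # sign bit set: negative, taken as is
        return field - 16
    return field + 1             # non-negative: add one


print(sorted(decode_short_add(f) for f in range(16)))
# [-8, -7, -6, -5, -4, -3, -2, -1, 1, 2, 3, 4, 5, 6, 7, 8]
```

Under this model there is no field value that yields an offset of 0, which matches the remark about the endless `ADD 0 PC` loop. The field values 7 (for +8) and 8 (for -8) also happen to match bits 6..3 of the listed words 75BF and 77C7, although that bit position is a guess and not something stated here.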
To achieve a more practical goal, the operand should be the NPC (PC+1), which is being computed at the same time as the addition. But this creates a whole lot of trouble, in particular:
- If we compute PC+2, then the backwards jump will only reach 7 instructions
- Timing becomes too tight, since the pipeline must choose between PC and PC+1 depending on the imm's sign
- This will require a stall cycle, and there is already one because writes to PC must discard the prefetched instruction.
At this point, the "short add trick" requires only a few logic gates (to detect the opcode, the format and the sign of imm4; detecting the PC register is not even necessary) and no deep modification of the state machine.
Trying to squeeze in one more instruction, to skip 8 opcodes, would complicate the whole circuit for very little benefit...