Binary translation (updated)

One thing I've been thinking about : since the YGREC8 is a sort of subset of the YASEP ISA, wouldn't it be nice and easy to emulate the YGREC8 on the YASEP with a pipeline stage that performs binary translation of the YGREC8 instructions ?

UPDATE 20181014 :

Of course it would be more than a great feature.

In fact : it would be good to redesign the YASEP with that translation stage from the ground up.

There are two ways to do it :

on-the-fly binary translation by hardware. This increases the latency by one cycle for predecoding but the overhead could be kept small enough to be practical. The emulation should ideally not slow the core down, and SMT is possible.
block-wide instruction translation in software. Most instructions would be translated from 16 to 32 bits wide and some hardware assistance is required to compensate for this difference (and others).

In either case, some hardware is required to achieve this. Ideally, the less software is required, the better !

Similarly : the #YASEP's instruction set should be straightforward to emulate on the #F-CPU...

The YGREC8 instructions generally map directly to "long" YASEP instructions. A "Y8" mode bit must be set to enable the translation features, but most first-order details are pretty easy to translate :

Registers : the Y8 has 8, which are a simple subset of the YASEP's 16 registers. One bit must be extended, probably by simple sign extension. The Y8 order is D1 A1 D2 A2 R1 R2 R3 PC but the YASEP has a reverse order : PC R1 R2 R3 R4 R5 D1 A1 D2 A2 D3 A3 D4 A4 D5 A5. It's not a big deal to change this : it's just a matter of renaming registers, and the YASEP has been changed a couple of times already. I'm thinking about this layout, where bit 2 of the Y8 register code is copied to bit 3 :
```
code   Y8   YASEP
0000   A1     A1 
0001   D1     D1
0010   A2     A2
0011   D2     D2
0100          A3
0101          D3
0110          A4
0111          D4
1000          A5
1001          D5
1010          R5
1011          R4
1100   R3     R3
1101   R2     R2
1110   R1     R1
1111   PC     PC
```
So there is almost no hardware cost here and registers can be fetched almost immediately (register read speed is the most critical factor for performance in this kind of core).
Opcodes : this is where it gets more complex. Some more advanced binary trickery is required... but it's possible. The YASEP has a rather flexible instruction map, with 8 well-defined groups, that can be mapped to the 4 groups of the Y8.

Instructions : The Imm8 form maps almost directly (after some bit rerouting) to the "Long Immediate" form of the YASEP. The other forms map to the "Extended" instruction form.
Conditions : the codes are very similar too and can be easily mapped.
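
The register renaming proposed above (bit 2 of the 3-bit Y8 code copied to bit 3) can be sketched in a few lines of Python. This is only an illustration of the table, not a description of the actual hardware : the function and list names are made up for the example, and the Y8 codes are assumed to be the low 3 bits of the YASEP codes shown in the table.

```python
# Hypothetical names, for illustration only (not taken from the YASEP or YGREC8 sources).
# Register names indexed by code, as listed in the table above.
Y8_REGS = ["A1", "D1", "A2", "D2", "R3", "R2", "R1", "PC"]
YASEP_REGS = ["A1", "D1", "A2", "D2", "A3", "D3", "A4", "D4",
              "A5", "D5", "R5", "R4", "R3", "R2", "R1", "PC"]


def y8_to_yasep_reg(code3: int) -> int:
    """Widen a 3-bit Y8 register code to a 4-bit YASEP code.

    Bit 2 is copied to bit 3, which is the same as sign-extending
    a 3-bit value to 4 bits.
    """
    if not 0 <= code3 <= 7:
        raise ValueError("Y8 register codes are 3 bits wide")
    return code3 | ((code3 & 0b100) << 1)


if __name__ == "__main__":
    for code3, name in enumerate(Y8_REGS):
        code4 = y8_to_yasep_reg(code3)
        print(f"{code3:03b} -> {code4:04b}  {name} -> {YASEP_REGS[code4]}")
```

In hardware this is just one extra wire, which is why the cost is close to zero : every Y8 register lands on the YASEP register of the same name.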

There are quite a few differences too :

PC granularity : the Y8 uses one address per instruction while it's not so clear with the YASEP. It would be best to use 16-bit granularity for both, to save on a shifter in the YASEP's datapath.
IO registers : we could define a reserved range, or some other mechanism, to make the first 256 IO registers compatible with both.
The instructions LDCL and LDCH require specific hardware and the CALL/INV/OVL system needs some adaptations...
The ALU outputs the carry at a different place (8th bit, 16th bit or even 32nd).
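
To illustrate the carry issue from the list above : when an 8-bit addition runs on a wider datapath, the carry has to be picked at bit 8 of the sum instead of bit 16 or 32. The sketch below is a software model under the assumption that the operands are zero-extended in the wider registers; the function name and the `width` parameter are hypothetical and the real ALU may handle this differently.

```python
def add_with_carry(a: int, b: int, width: int = 8) -> tuple[int, int]:
    """Unsigned add on a wide datapath, returning (result, carry).

    The carry is taken at bit position `width` of the raw sum,
    so width=8 models the Y8 and width=16 or 32 a native YASEP.
    Assumption: operands are masked (zero-extended) to `width` bits first.
    """
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask)
    return total & mask, (total >> width) & 1


if __name__ == "__main__":
    # Same operands, different carry position.
    print(add_with_carry(0xFF, 0x01, width=8))
    print(add_with_carry(0xFF, 0x01, width=16))
```

With the same operands, 0xFF + 0x01 overflows in Y8 mode but not in a 16-bit mode, so the translation stage (or the "Y8" mode bit) has to select which bit of the adder feeds the carry flag.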

Of course, the emulated Y8 can't access more than it can in a native implementation... But this emulation project is a good way to reboot the YASEP design again :-)
