
Log#60 : Definition of the auto-update fields

A project log for YASEP News archives

A backup of the blog before the host erases it. A good opportunity to review the development, twists and turns of this ISA !

Yann Guidon / YGDES, 04/11/2020 at 03:48

By whygee on Friday 31 January 2014, 07:11 - Architecture
(edit 20140207: some stupid typos crept into the tables)
(edit 2014-02-08 : dropping all the pre-modifications)


The instruction set of the YASEP architecture is finally frozen, after years of fine-tuning and exploration !

In August 2013, during a discussion with JCH, I came up with a new encoding for the 4 remaining bits of the extended instructions that were reserved for register auto-updates. I had been struggling with the one big shortcoming of the architecture : the very limited range of Imm4, particularly for conditional relative jumps. I had hacked a few tricks but none were really satisfying.

JCH pointed to some auto-update codes that didn't make sense in combination with other flags, and that's how he found a way to get 2 more bits for SI4/Imm4.

I tried to simplify the system down to a few simpler codes, following these principles :

The important trick that JCH found is that the Imm/Reg field invalidates certain auto-updates and frees some bits. In particular, it makes no sense to update SI4 when this source operand is immediate, so SI4 is associated with NOP in certain cases.

There is very little room and I had to make some compromises. For example, the CND field can't be updated when other registers are. Pre-increments are also avoided (see why at the bottom). It's not possible to increment one register and decrement another in the same instruction.

The resulting format provides Imm6 and one post-update for all extended instructions, and one to three post-updates when no immediate is present.

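Update codes when an immediate is present (2 bits, the other 2 extend Imm4 to Imm6) :

00 NOP
01 SND+
10 DST+
11 CND- (this helps loops)

Update codes when no immediate is present (4 bits) :

        00           01         10    11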
00     NOP     SND+,SI4+,DST+  SI4-  SI4+
01  SND-,SI4-    SND+,SI4+     SND-  SND+
10  DST-,SI4-    DST+,SI4+     DST-  DST+
11  DST-,SND-    DST+,SND+     CND-  CND+
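
The two tables above can be turned into a small lookup, as a rough sketch only. The post does not say which bits select the row and which select the column, so the split below (upper 2 bits = row, lower 2 bits = column) is an assumption, as are the names `decode_auto_update`, `SHORT_UPDATES` and `FULL_UPDATES`.

```python
# Hypothetical decoder for the auto-update field (illustration only).
# ASSUMPTION: in the 4-bit form, the upper 2 bits select the table row
# and the lower 2 bits select the column. The post does not specify this.

# 2-bit codes, used when an immediate is present
SHORT_UPDATES = {
    0b00: [],                # NOP
    0b01: [("SND", +1)],
    0b10: [("DST", +1)],
    0b11: [("CND", -1)],     # helps loops
}

# 4-bit codes, used when no immediate is present: FULL_UPDATES[row][column]
FULL_UPDATES = [
    [[], [("SND", +1), ("SI4", +1), ("DST", +1)], [("SI4", -1)], [("SI4", +1)]],
    [[("SND", -1), ("SI4", -1)], [("SND", +1), ("SI4", +1)], [("SND", -1)], [("SND", +1)]],
    [[("DST", -1), ("SI4", -1)], [("DST", +1), ("SI4", +1)], [("DST", -1)], [("DST", +1)]],
    [[("DST", -1), ("SND", -1)], [("DST", +1), ("SND", +1)], [("CND", -1)], [("CND", +1)]],
]


def decode_auto_update(field, has_immediate):
    """Return the list of (operand, delta) post-updates for a field value."""
    if has_immediate:
        if not 0 <= field <= 0b11:
            raise ValueError("only 2 update bits are available with an immediate")
        return SHORT_UPDATES[field]
    if not 0 <= field <= 0b1111:
        raise ValueError("the auto-update field is 4 bits wide")
    row, column = field >> 2, field & 0b11
    return FULL_UPDATES[row][column]
```

For example, `decode_auto_update(0b0001, False)` gives the triple post-increment of SND, SI4 and DST, while `decode_auto_update(0b11, True)` gives the single CND decrement.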

- The big advantage of this encoding is that it increases code density for a lot of very common sequences : stack manipulation, string/vector processing, counters... Higher code density does not always mean faster execution, but it helps. Different microarchitectures might implement these flags with different approaches (serial or parallel).

- There are several drawbacks as well : the encoding favours density over decoding ease (but what can we do with only 4 bits ?). The new encoding also breaks Imm4, and a new assembler must be written from scratch (the current one is aging and its flexibility has been stretched to its limits).

- In the end, it is a progress :

- Some questions remain :

Right now, the priority is to rewrite the assembler/disassembler and keep the simulator and VHDL up-to-date. My work system is in a bad state and it will take time to get everything back in order.

Why no pre-increment or pre-decrement ?

Pre-modifications are removed because they break the very important rule that an instruction should not trap (or be able to trap) in the middle of the execution pipeline.

In the case of pre-modifying an address register, such as MOV -D1, R1, the validity of the new address in A1 is known only after it has been computed, but there is no way to gracefully stop the instruction in the middle or even restart it. The proper way to do it is to fold the -D1 update into a previous instruction that uses A1 or D1, or simply to emit a short ADD -1 A1 instruction before the actual move to R1.

Remember : all the operands must be directly ready for use (at decode stage) before the instruction can proceed to execution stage.
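
As an illustration of the workaround described above, here is a minimal sketch of how an assembler front-end could split a pre-modified source operand into a separate ADD followed by a plain access. The function name `lower_pre_modification`, the `+D1` / `-D1` operand spelling and the textual output format are all assumptions made for this example, not the actual YASEP assembler syntax.

```python
import re

# ASSUMPTION: "-Dn" / "+Dn" denote a pre-decrement / pre-increment of An
# before the access through Dn, following the MOV -D1, R1 example above.
PRE_MODIFIED = re.compile(r"^([+-])D(\d+)$")


def lower_pre_modification(opcode, source, destination):
    """Split a pre-modified source operand into an explicit ADD plus a plain access."""
    match = PRE_MODIFIED.match(source)
    if match is None:
        # Nothing to lower: the operand is already usable at decode stage.
        return [f"{opcode} {source} {destination}"]
    sign, index = match.groups()
    step = "-1" if sign == "-" else "1"
    # The address register is updated by its own short instruction,
    # so the following access sees an address that is already known.
    return [f"ADD {step} A{index}", f"{opcode} D{index} {destination}"]
```

With these assumptions, `lower_pre_modification("MOV", "-D1", "R1")` returns `["ADD -1 A1", "MOV D1 R1"]`, and an operand without pre-modification is passed through unchanged.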

The previous table was :

        00       01   10    11
00     NOP     +SI4  SI4+  SI4-
01  SND+,SI4+  +SND  SND+  SND-
10  DST+,SI4+  +DST  DST+  DST-
11  DST+,SND+  +CND  CND+  CND-

The new table uses the 4 pre-inc entries for 2-post-decrement and 3-post-increment.
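
The reuse of the freed entries can be checked by counting the kinds of entries in both tables. This is only a bookkeeping sketch: the table contents are copied from the post, while the `classify` and `summarize` helpers and the category names are made up for the example.

```python
from collections import Counter

# Table contents copied from the post; rows and columns in the printed order.
OLD_TABLE = [
    ["NOP", "+SI4", "SI4+", "SI4-"],
    ["SND+,SI4+", "+SND", "SND+", "SND-"],
    ["DST+,SI4+", "+DST", "DST+", "DST-"],
    ["DST+,SND+", "+CND", "CND+", "CND-"],
]
NEW_TABLE = [
    ["NOP", "SND+,SI4+,DST+", "SI4-", "SI4+"],
    ["SND-,SI4-", "SND+,SI4+", "SND-", "SND+"],
    ["DST-,SI4-", "DST+,SI4+", "DST-", "DST+"],
    ["DST-,SND-", "DST+,SND+", "CND-", "CND+"],
]


def classify(entry):
    """Name the kind of update: NOP, pre, or N-post-increment/decrement."""
    if entry == "NOP":
        return "NOP"
    parts = entry.split(",")
    if any(part.startswith(("+", "-")) for part in parts):
        return "pre"  # sign written before the register name
    direction = "increment" if parts[0].endswith("+") else "decrement"
    return f"{len(parts)}-post-{direction}"


def summarize(table):
    """Count how many entries of each kind a table contains."""
    return Counter(classify(entry) for row in table for entry in row)


old, new = summarize(OLD_TABLE), summarize(NEW_TABLE)
# Kinds that gained entries in the new table
gained = new - old
```

Here `old["pre"]` is 4 and `new["pre"]` is 0, and `gained` contains three 2-post-decrement entries plus one 3-post-increment entry, which accounts for the four freed codes.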


20200411:

This definition is the cornerstone that was always missing from the YASEP ISA, which would increase the coding density and speed.

It took a long time to re-define and it is not straightforward to implement without the little tricks I have developed...

But it also makes the compiler more complex, and the architecture harder to understand...

This is one of the major changes that have broken the assembler and YASEP2014 has since remained in a crippled state :-(

Anyway this is a major aspect to remember and implement in any future redesign !
