Core state machine(s)

A project log for YGREC8

A byte-wide stripped-down version of the YGREC16 architecture

Yann Guidon / YGDES, 01/12/2020 at 21:07

Update 20200711: superseded by log 106. FSM update


I'm considering writing a quick&dirty behavioural simulation of the Y8 core to get a refreshed, higher-level view (as well as an alternate platform that simulates faster, to try code out). I already wrote one two years ago but it left a lot to be desired. Today's core has a more refined ISA and many details have evolved and matured. And I have a cool assembler I can throw at more challenges!

The core is still simple anyway, with "most instructions executing in a single cycle", with two exceptions:

  1. Writes to PC (except SET and CALL, which can write directly to PC and bypass the slow ALU path)
  2. LDCL/LDCH, which need one more cycle to read the instruction memory.

And then you realise you haven't thought about LDCx writing to PC, which makes things even more complex. But who would do it? Well, if 512 instructions out of the 64K opcode space allow it, a random instruction has 1 chance in 128 (512/65536) of being one of them, so it's not negligible.

A proper FSM is clearly required. And it soon appears that it's not going to be a nice and tidy FSM like you see in tutorials, because I need two FSMs.

These can't be completely joined because they are orthogonal. The execution FSM is what is visible from the outside, particularly during debug sessions; it must also handle internal initialisation when the chip comes out of /RESET (like loading data from external memory). The instruction-cycle FSM handles the sequencing of single-cycle and complex instructions.

So it makes sense to separate the two FSMs: it untangles the complex combinations that might arise and prevents nasty bugs and race conditions.

Before going further, let's recall how the Y8 works: there is no real pipeline, yet two phases overlap:

  1. The first phase computes the next address (increment PC) and/or gets a new address from an instruction, then fetches the next instruction from the instruction memory.
  2. The second phase gets the ready instruction (or immediate word), then decodes it, fetches the operands, executes and writes back.
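The overlap of these two phases can be pictured with a tiny behavioural sketch in Python (not the real core: the instruction memory here is just a hypothetical list of strings, and the PC simply increments):

```python
# Sketch of the two overlapping phases: each cycle executes the
# instruction fetched during the previous cycle, while fetching the next.
imem = ["ADD R1 R2", "SET R3 5", "LDCL R0 R4", "NOP"]  # placeholder program

pc = 0
fetched = None          # instruction latched at the end of phase 1

for cycle in range(len(imem) + 1):
    # Phase 2: execute the instruction fetched during the previous cycle.
    if fetched is not None:
        print(f"cycle {cycle}: executing {fetched!r}")
    # Phase 1: compute the next address and fetch the next instruction.
    fetched = imem[pc] if pc < len(imem) else None
    pc += 1             # or a value written by SET/CALL in the real core
```

This is why the FSM matters: phase 2 can only run when phase 1 has actually latched a valid instruction.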

The FSM must ensure that the instruction memory is fetched before starting the execution phase.

It gets more complex with the LDCx instructions, because another round goes through the instruction memory. And if the destination is PC, yet another cycle is added to fetch the next instruction from the new address instead of PC+1.

OTOH, the core-state FSM has to receive orders from outside, latch them and send the appropriate commands to the other FSM. Reset is also handled there, since no "general RESET" signal would be propagated through the whole chip, which further simplifies the implementation.

This FSM receives a code from outside and latches it before interpretation. During debug, it deals with the usual start/step/stop commands, though it defaults to Start after power-up. The fourth command is Reset, which forces a reload of the program memory (if you change the source code being debugged, or if INV wants to switch to a different code page).

So we get the following command codes :

  Command  Code  Effect
  Start    11    Run the system.
  Step     10    Execute one instruction, then pause the core.
  Stop     01    Pause the core at the end of the current instruction.
  Reset    00    Pause the core and reset the FSM
                 (reload the instructions when another command is received).

The default command is Start when the core wakes up, and it can be modified (from input pins or through the debug interface).

The FSM can have more states to deal with initialisation. It is clocked by (and synchronised to) the core clock. Its value should be read back by the user after every command to ensure the proper value is latched and the FSM is in a coherent state.

The five states (so far) are:

  State  Description                        How to get there
  Reset  Just do nothing. Clear PC and      - the Reset command or the
         some eventual SR flags.              external signal is received.
  Load   Copy external data to the          - the external Reset signal is
         instruction memory, incrementing     de-asserted and a command
         PC for each new word.                (other than Reset) is received.
  Stop   Wait, do nothing.                  - Load is finished (PC overflow)
                                            - Stop received while in Run
                                            - after the Step state
  Step   Execute one instruction.           - the Step command is received.
  Run    Let the core do its thing.         - the Start command is received
                                              while in the Stop state.
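As a sanity check, the table above can be sketched as a behavioural transition function in Python (state names and 2-bit command codes follow the tables; the `pc_overflow` input is the Load-finished signal, simplified here):

```python
# Behavioural sketch of the core-state FSM described above.
# Command codes from the table: Start=11, Step=10, Stop=01, Reset=00.
START, STEP, STOP, RESET = 0b11, 0b10, 0b01, 0b00

def next_core_state(state, command, pc_overflow=False):
    """One clocked transition of the core-state FSM (simplified sketch)."""
    if command == RESET:
        return "Reset"                  # clear PC and flags, then wait
    if state == "Reset":
        return "Load"                   # any other command starts the reload
    if state == "Load":
        # The reload ends when PC overflows after copying the last word.
        return "Stop" if pc_overflow else "Load"
    if state == "Step":
        return "Stop"                   # one instruction executed, pause
    if state == "Stop":
        if command == START:
            return "Run"
        if command == STEP:
            return "Step"
        return "Stop"
    if state == "Run":
        return "Stop" if command == STOP else "Run"
    raise ValueError(state)
```

For example, `next_core_state("Stop", START)` yields `"Run"`, and any state falls back to `"Reset"` when the Reset command is latched.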

I hope it's clear.

The instruction FSM has 4 states:

  State  Description                      How to get there
  IDLE   Do nothing.                      - opcode=INV while in state INST
                                          - core state /= Step and /= Run
  INST   Decode and execute the           - from LDCX when SND /= PC
         instruction.                     - from WrPC
                                          - from IDLE when core state = Run
                                            or Step
  LDCX   Extra cycle to read the          - state=INST and opcode=LDCH/LDCL
         instruction memory.
  WrPC   Extra cycle to fetch the         - state=INST or LDCX, SND=PC
         instruction.                       and opcode /= SET|CALL
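Under the same conventions, the phase FSM of the table can be sketched in Python (opcodes are passed as plain strings, and `snd_is_pc` stands for the SND=PC condition; the real decode is of course more involved):

```python
def next_phase(phase, core_state, opcode, snd_is_pc=False):
    """One transition of the instruction/phase FSM (simplified sketch)."""
    if phase == "IDLE":
        # Leave IDLE only when the core-state FSM says Run or Step.
        return "INST" if core_state in ("Run", "Step") else "IDLE"
    if phase == "INST":
        if opcode == "INV":
            return "IDLE"               # refuse to execute INV
        if opcode in ("LDCL", "LDCH"):
            return "LDCX"               # extra instruction-memory read
        if snd_is_pc and opcode not in ("SET", "CALL"):
            return "WrPC"               # slow write to PC: refetch next cycle
        return "INST"                   # single-cycle instruction
    if phase == "LDCX":
        # LDCx writing to PC needs yet another fetch cycle.
        return "WrPC" if snd_is_pc else "INST"
    if phase == "WrPC":
        return "INST"                   # fetch from the new PC is done
    raise ValueError(phase)
```

Sketches like this make it easy to exercise the corner cases, such as LDCx writing to PC: `next_phase("INST", "Run", "LDCL", snd_is_pc=True)` goes to `"LDCX"`, then `"WrPC"`, then back to `"INST"`.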

These states have different names to prevent confusion between both FSMs.

From there, the code is pretty easy to write...

Note: splitting the FSMs makes them easier to run fast and less prone to bugs.

The storage for the states is marginally larger, but the decoding logic is probably smaller.

The significant difference is the latency: there are about two clock cycles of delay between the arrival of a new command and the start of the execution of an instruction. This is not a severe problem because the debug interface will be significantly slower than the core.

Crude simulations show that the "Phase FSM" has a flaw that allows a transition from IDLE back to INST when the opcode is INV. This condition should put the FSM into a different state that is exited only when a new command is received... I'm updating the code & sketches.

This system will now clearly refuse to execute any INV instruction, even in debug mode.