Hop, Skip and Jump

A project log for Suite-16

Suite-16 is a 16-bit cpu built entirely from TTL. It is a personal exploration of how hardware and software interact.

monsonite, 10/24/2019 at 17:29 (3 comments)

This week I am working towards getting my pet project, SIMPL, working on the Suite-16 simulator.

I am making good progress with the main routines that handle decimal number entry and decimal number printing. These have been relatively easy to code, and the code size is comparable to that of the equivalent code written in MSP430 assembly language.

Next I need to code up the three routines that will provide the mechanics of the interpreter.

Assembling code by hand is not too difficult, but it helps to keep a modular approach in which each module has only one entry point and one exit point. It takes more time to plan and test each module than it does to hand-assemble it. So for the moment I am not overly concerned that I don't have a full assembler.

Modular code is the approach taken by Forth. You are encouraged to write short routines that require only a few input variables, which are taken off the stack, and that in turn calculate some output result that is placed back on the stack. The stack is the all-important communicating pipeline between the functional modules.
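As a rough illustration of that stack discipline, here is a minimal Python sketch. The routine names (push, add, double) are hypothetical and are not taken from SIMPL or from any particular Forth; the point is only that each routine has one entry, one exit, and talks to the others solely through the stack.

```python
# Hypothetical illustration: small routines that communicate only via a stack.
stack = []

def push(value):
    """Place a value on top of the stack."""
    stack.append(value)

def add():
    """Take two inputs off the stack and put their sum back."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

def double():
    """Take one input off the stack and put twice its value back."""
    stack.append(stack.pop() * 2)

push(3)
push(4)
add()       # stack now holds [7]
double()    # stack now holds [14]
result = stack[-1]
```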

Whilst SIMPL is by no means a full Forth, it closely follows some of the techniques used in the Forth interpreter, except that the dictionary that is fundamental to Forth is replaced with a simple jump table. This makes it possible to have a working SIMPL kernel operating in fewer than 1000 bytes of code.

The Command Interpreter

Ward Cunningham, who wrote Txtzyme, the precursor to SIMPL, described his interpreter as a switch-case statement contained within a loop.

I now need to devise an efficient switch statement mechanism for Suite-16, as this is central to the whole functioning of the command interpreter.

The switch statement is given an input value which it translates by a look-up table mechanism to an output value, and this output value is used as a jump address for the program execution.
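A minimal Python sketch of "a switch inside a loop" driven by a look-up table might look like the following. It is not the Txtzyme or SIMPL source: the function run, the single working value x, and the behaviour given to "p" (print) and "+" (increment) are assumptions made purely for illustration. Digits are handled outside the table because the post says numerals go through a separate number entry routine.

```python
def run(program):
    """Interpret a command string; return the list of values it printed."""
    state = {"x": 0, "out": []}   # x stands in for the accumulator (assumption)

    def print_num():
        state["out"].append(state["x"])

    def increment():
        state["x"] += 1

    # Look-up table: input character -> routine to jump to (hypothetical commands).
    jump_table = {"p": print_num, "+": increment}

    i = 0
    while i < len(program):
        ch = program[i]
        if "0" <= ch <= "9":
            # Number entry is handled separately from the jump table.
            value = 0
            while i < len(program) and "0" <= program[i] <= "9":
                value = value * 10 + int(program[i])
                i += 1
            state["x"] = value
            continue
        handler = jump_table.get(ch)
        if handler is not None:   # characters with no table entry are ignored
            handler()
        i += 1
    return state["out"]

printed = run("41+p")
```

In this sketch the dictionary plays the role of the jump table, and calling the handler plays the role of jumping to the address it supplies.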

Commands

Whilst there are 96 printable ASCII codes that could be used as commands, we are unlikely to need all of them in the jump table. First, we can discount the numerical characters, as these are handled separately by the number entry routine. Capital letters are reserved for User Functions or variables, so they will also be handled differently. That just leaves 26 lower case characters and 34 other symbols. The jump table has already reduced in size from 96 to 60 entries. It may be possible to reserve 60 words of the zeropage to accommodate the jump table, leaving 196 words for essential code, user variables and structures such as the data stack and return stack.
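The arithmetic above can be checked with a short Python sketch. It assumes the 96 codes are the range 0x20 to 0x7F, in which case the 34 "other symbols" include the space and DEL codes; the post does not state the range explicitly. A 256-word zeropage is also assumed, which is what the figure of 196 remaining words implies.

```python
# Assumption: the 96 candidate codes are 0x20..0x7F inclusive.
codes = [chr(c) for c in range(0x20, 0x80)]

digits = [c for c in codes if "0" <= c <= "9"]   # handled by number entry
upper = [c for c in codes if "A" <= c <= "Z"]    # user functions / variables
lower = [c for c in codes if "a" <= c <= "z"]
others = [c for c in codes if c not in digits + upper + lower]

table_entries = len(lower) + len(others)   # entries needed in the jump table
zeropage_left = 256 - table_entries        # words left on a 256-word zeropage
```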

The jump mechanism needs some clarification. With Suite-16 we can embed an 8-bit jump address into the lower byte of the instruction. However, this is only useful for reaching addresses on the zeropage, and we will need to find an alternative method to reach the code words, which are more likely to be located outside of page 0.

The jump table will contain a list of addresses, which are the start addresses for all of the command routines. For example, if our accumulator currently holds the ASCII character "p" (0x70) and we want to use this to invoke the printnum routine, which starts at, say, address 0x0100, we need to create a table in memory which at address 0x70 contains the value 0x0100. We can get this address back into the accumulator and then jump to it.
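The table entry in this example can be modelled in a few lines of Python. This is only a sketch: the word-addressed 64K memory list is an assumption, and PRINTNUM is simply the example address 0x0100 from the text.

```python
memory = [0] * 0x10000         # assumption: 64K words, word-addressed
PRINTNUM = 0x0100              # example start address of the printnum routine

memory[ord("p")] = PRINTNUM    # table entry stored at address 0x70

accumulator = ord("p")         # the command character, 0x70
target = memory[accumulator]   # fetch the routine address from the table
```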

Trampoline Jumps

Suite-16 is currently only using an 8-bit jump address, which is stored in the payload section of the instruction. If we extend this to a 16-bit jump, the target address will be held in the word following the jump instruction. We can use the accumulator to overwrite this target address, so we can effectively jump to an address that is held in the accumulator. For now this has to be done as a two-stage process, sometimes called a Trampoline Jump.

Let's assume that the accumulator holds 0x70, the letter p, and we want to jump to address 0x0100, which is held in the lookup table. We can use the indirect register addressing mode to access the table, using register R1 as a pointer. Our trampoline will be placed at locations 0x80 and 0x81.

ST  R0, R1      // R1 now contains 0x70
LD  R0, @R1     // R0 contains 0x0100
SET R1, 0x81    // The trampoline's target address location
ST  R0, @R1     // store 0x0100 at location 0x81
JMP 0x80        // Jump to 0x80 where the trampoline jump instruction is located

This method is quite clunky, and it takes 6 instructions (the five above plus the trampoline jump itself at 0x80) to direct the program flow to the printnum routine.
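The trampoline sequence can be traced step by step in Python. This is a sketch rather than the Suite-16 simulator: the instruction semantics are read off the comments in the listing, the sparse dictionary memory is an assumption, and "JMP16" is a hypothetical marker standing in for the 16-bit jump opcode at 0x80.

```python
def trampoline_dispatch(acc, table):
    """Trace the five listed instructions plus the trampoline jump itself."""
    mem = dict(table)          # sparse word-addressed memory (assumption)
    mem[0x80] = "JMP16"        # hypothetical marker for the 16-bit jump opcode
    mem[0x81] = 0x0000         # its target word, overwritten at run time
    r0, r1 = acc, 0            # R0 is the accumulator

    r1 = r0                    # ST  R0, R1
    r0 = mem[r1]               # LD  R0, @R1
    r1 = 0x81                  # SET R1, 0x81
    mem[r1] = r0               # ST  R0, @R1
    pc = 0x80                  # JMP 0x80
    assert mem[pc] == "JMP16"  # the trampoline instruction at 0x80
    pc = mem[pc + 1]           # 16-bit jump: PC <- word following the opcode
    return pc

target = trampoline_dispatch(0x70, {0x70: 0x0100})
```

The last two lines of the function are the sixth instruction: the jump at 0x80 whose target word at 0x81 was patched by the preceding store.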

It would be better if there was an easy way of doing a direct jump based on the contents of the accumulator, but with very little additional hardware overhead. 

Fortunately we already have the means to modify the bottom 8 bits of the program counter, as it is used by our call and branching methods. It should be relatively straightforward to add a new instruction in the form of JMP @R0.

Our look-up code then becomes much simpler:

ST R0, R1     // R1 now contains 0x70
LD R0, @R1    // R0 contains 0x0100
JMP @R0       // Program jumps to address 0x0100
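For comparison, the three-instruction version can be traced the same way. Again this is only a sketch: it assumes that JMP @R0 simply copies the accumulator into the program counter, which is what the comment in the listing suggests, and it models memory as a Python dictionary.

```python
def indirect_dispatch(acc, table):
    """Trace the three-instruction look-up that uses the proposed JMP @R0."""
    mem = dict(table)   # sparse word-addressed memory (assumption)
    r0, r1 = acc, 0     # R0 is the accumulator

    r1 = r0             # ST  R0, R1
    r0 = mem[r1]        # LD  R0, @R1
    pc = r0             # JMP @R0 (assumed semantics: PC <- R0)
    return pc

target = indirect_dispatch(0x70, {0x70: 0x0100})
```

No memory location is patched at run time here, so the trampoline words at 0x80 and 0x81 are no longer needed.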


Discussions

monsonite wrote 10/25/2019 at 22:21

Roelh - Thanks for the suggestions. I have added support for an indirect jump and am just testing it out now.

I agree that making the PC just a general purpose register gives more flexibility, but this will add the overhead of INC PC to every instruction, whereas if I use a counter, the PC is automatically incremented on every clock cycle unless it is being loaded from the ALU. I'm following the Gigatron scheme, which uses a counter that can be loaded from the internal data bus.

I intend to extend the instructions so that I can add and subtract small numbers from the registers.

I'm making good progress writing code on the simulator and learning every day what can be improved in my instruction set.


roelh wrote 10/26/2019 at 21:11

Just the fact that the PC is part of the register set does not imply that it cannot have a fast-increment feature (that the other registers don't have).

Another advantage, after reading your next log: 5) return from subroutine is just popping the PC register.


roelh wrote 10/25/2019 at 21:16

Hi Monsonite,

Your instruction set seems to lack an indirect jump. Immediates were also problematic.

Why don't you take the approach of the PDP-11?

In that case, the program counter is just one of the registers. This is possible in your case, because the registers are already 16 bits wide. Advantages:

1) It can be loaded just like any other register, so your indirect jump is also easy

2) Immediate values can be loaded with indirect-pc addressing. But it needs an extra increment of the program counter (can also be done by just skipping the next instruction).

3) You will probably have instructions to add or subtract a small number from a register. The same instructions can be used to do relative jumps. You will probably want to make the jumps conditional, but that effect can also be reached by having a conditional-skip-next-instruction as part of compare instructions.

4) Since the program counter is handled as just another register, the control system of your CPU will be simpler.

... I just realized that you have an accumulator system, while in the PDP-11 a load to any register must be possible. I don't know exactly which operations your registers support.

Your accumulator system will become a bottleneck if you aim for higher performance, but I think performance of the CPU is not high on your list.
