Close

Solving Towers of Hanoi recursively in 80 nibbles

A project log for Exploring the Science Fair Microcomputer Trainer

An exploration of Radio Shack's educational computer from 1985

jason-jacquesJason Jacques 02/22/2024 at 01:490 Comments

As Michael mentioned, he somewhat nerd sniped me with the the idea of squeezing a Towers of Hanoi solver in not the Science Fair Microcomputer Trainer. As an avid fan of both the Gakken FX R-165 and the Science Fair Microcomputer Trainer, I could't let it go uncompleted.

With a good number of iterations, I managed to squeeze an "interactive" solver down to fit into just the 80 nibbles (40 bytes!) of the Microcomputer Trainer's memory.

The full listing follows.

The Listing

Memory address in column 1, 2, and 3; value to enter in column 4. Logical labels, relevant to program execution, are in column 5. The final two columns are a disassembly of the instruction, back to labels and opcodes. These correspond one-to-one with the final assembly language source (provided at the end of this entry).

; towers of hanoi for science fair microcomputer trainer
; (c) 2024 LambdaMikel & jtjacques

; assembled with https://arpruss.github.io/r165.html

; stack
; set 0x50 to a
; set 0x51 to c
; set 0x52 to b
; set 0x53 to 0

; initial state
; set z (0x6b) to n (0, 1, 2, 3)
; set y (0x6e) to 3 (pointer to m 0x53)

; start at hanoi (0x17), mode 2
; when pause (0x33, 0b0110011)
;     reset, inspect y (0x6e)
;     moves: inspect memory at 0x5[y] (from), 0x5[y+1] (to)
;     resume at resume (0x00, or simply reset), mode 2
; when hlt (0x29, 0b0101001), program complete

0         0000000   00        B         resume              AIY
1         0000001   01        2                             2
2         0000010   02        5                             MA
3         0000011   03        B                             AIY
4         0000100   04        2                             2
5         0000101   05        4                             AM
6         0000110   06        B                             AIY
7         0000111   07        D                             D
8         0001000   08        5                             MA
9         0001001   09        B                             AIY
10        0001010   0A        4                             4
11        0001011   0B        4                             AM
12        0001100   0C        B                             AIY
13        0001101   0D        B                             B
14        0001110   0E        5                             MA
15        0001111   0F        B                             AIY
16        0010000   10        6                             6
17        0010001   11        4                             AM
18        0010010   12        8                             TIA
19        0010011   13        3                             3
20        0010100   14        B                   lbl_6:    AIY
21        0010101   15        1                             1
22        0010110   16        4                             AM
23        0010111   17        2         hanoi               CH
24        0011000   18        D                             CIY
25        0011001   19        0                             0
26        0011010   1A        F                             JUMP
27        0011011   1B        3                             lbl
28        0011100   1C        6                             _0
29        0011101   1D        F                             JUMP
30        0011110   1E        2                             lbl
31        0011111   1F        5                             _1
32        0100000   20        B                   lbl_3:    AIY
33        0100001   21        C                             C
34        0100010   22        2                             CH
35        0100011   23        B                             AIY
36        0100100   24        1                             1
37        0100101   25        2                   lbl_1:    CH
38        0100110   26        5                             MA
39        0100111   27        E                             CAL
40        0101000   28        6                             SIFT
41        0101001   29        F         halt      lbl_2:    JUMP
42        0101010   2A        2                             lbl
43        0101011   2B        9                             _2
44        0101100   2C        C                             CIA
45        0101101   2D        0                             0
46        0101110   2E        F                             JUMP
47        0101111   2F        2                             lbl
48        0110000   30        0                             _3
49        0110001   31        B                             AIY
50        0110010   32        9                             9
51        0110011   33        F         pause     lbl_4:    JUMP
52        0110100   34        3                             lbl
53        0110101   35        3                             _4
54        0110110   36        B                   lbl_0:    AIY
55        0110111   37        F                             F
56        0111000   38        2                             CH
57        0111001   39        B                             AIY
58        0111010   3A        D                             D
59        0111011   3B        5                             MA
60        0111100   3C        B                             AIY
61        0111101   3D        4                             4
62        0111110   3E        4                             AM
63        0111111   3F        B                             AIY
64        1000000   40        E                             E
65        1000001   41        5                             MA
66        1000010   42        B                             AIY
67        1000011   43        3                             3
68        1000100   44        4                             AM
69        1000101   45        B                             AIY
70        1000110   46        C                             C
71        1000111   47        5                             MA
72        1001000   48        B                             AIY
73        1001001   49        5                             5
74        1001010   4A        4                             AM
75        1001011   4B        8                             TIA
76        1001100   4C        1                             1
77        1001101   4D        F                             JUMP
78        1001110   4E        1                             lbl
79        1001111   4F        4                             _6
80        1010000   50        A         source
81        1010001   51        C         dest
82        1010010   52        B         spare
83        1010011   53        0         return

Setup

The program does require you to specify the initial parameters for the first stack frame in the data area at addresses 0x50-0x53. This tells the computer the layout of your pegs, and sets the exit condition. For convenience I have included them at the end of the listing and you can simply carry on entry past the program area into the data area. However, for completeness they are as follows: 

    0x50: source (peg A),
    0x51: destination (peg C)
    0x52: spare (peg B)
    0x53: 0 (return label)

While the names are arbitrary (you could call them pegs 1, 2, and 3, for example, if you prefer) the final return label (0) is required to allow the program to exit correctly.

You also need to configure two more parameters: the number of discs between 0 and 3 in 0x6B (the Z register) and the least significant nibble marking the end of the stack in 0x6E (the Y register), 3.

Important Note: documentation for the FX R-165 and Microcomputer Trainer indicate that the Z register is held in memory at location 0x6D. This appears to be an error. The Z register is located at 0x6B on real hardware (both the FX R-165 and Microcomputer Trainer) and on emulators using the ROM, but other simulations may implement the documented location of 0x6D.

Execution

Once you have entered the program and set your desired number of discs (0-3), you need to start the program from location 0x17, in run mode 2 (run with addresses). Press [1], [7], [addr-set] to set the program counter, then press [2] and [run] to start the program. Using run mode 2 means that we can monitor the program progress on the address bus (the binary LEDs) and read out our moves when the program pauses.

Result

When the program pauses, the address bus will read 0b0110011. When this happens, the program has calculated a move for us.

To read the move, first we must stop the program by pressing [reset]. This allow us to check the Y register, at 0x6E, by pressing [6], [E], [addr-set]. This will display the least significant nibble of the data area address where we will find our move. For example, if you see 4, we would inspect 0x54. e.g. [5], [4], [addr-set]. The computer will now display the from peg on the HEX LED. Press [incr] to see the to peg. For example, if we see [B], [C], we would move the top disc on peg B to peg C.

To find our next move, we simply resume the program in mode 2. For convenience, the resume address is at 0x00. This allows us to simply reset the machine: [reset], [2], [run]. The program will proceed to calculate the next move. If the binary LEDs indicate a pause (0b0110011) then repeat the inspection process, starting with the Y register. If the binary LEDs display 0b0101001 the program has completed.

Note that attempting to solve for numbers of discs 4 or greater will overflow the 16 locations made available as data memory (0x50-5F) and used for the stack. This will cause the program to "crash" or result in other unexpected behaviour.

How the Program Works

It is possible to put executable code in the memory area (0x50-5F) of the Science Fair Microcomputer Trainer, and therefore set return addresses at run time. However, each jump requires 3 nibbles (4-bit words) which is a lot of overhead; both in the stack frame and in code required to write these addresses to memory. Fortunately Michael gave me a great start with his initial work on the problem. Instead, I used his neat trick of constant return labels and return addresses calculated at assembly time.

The program went through multiple iterations, but the draft version fairly closely follows this example pseudo code.

  TOH(N, A, B, C):
      IF (N = 0): STOP
      TOH(N-1, A, C, B)
      A -> C
      TOH(N-1, B, A, C)

Draft program

To make it easier to develop the program I use a very simple assembler. The opcode mnemonics are identical to those used in the Microcomputer Trainer documentation. While the assembler does not offer any macros, labels are supported. Using an assembler allows us to iterate our program faster and, importantly, ensures that all the jumps are correctly calculated in the final program. 

The following code represents a basic, first attempt at implementing a (somewhat) interactive solver for Towers of Hanoi.

; towers of hanoi for science fair microcomputer trainer
; (c) 2024 LambdaMikel & jtjacques

; assemble with https://arpruss.github.io/r165.html

; stack
; set 0x50 to a
; set 0x51 to c
; set 0x52 to b
; set 0x53 to 0

; initial state
; set z (0x6b) to n (0, 1, 2, 3)
; set y (0x6e) to 3 (pointer to m 0x53)

; start at hanoi (0x??), mode 2
; when pause (0x??, 0b???????)
;   reset, inspect y (0x6e)
;   moves: inspect memory at 0x5[y] (from), 0x5[y+1] (to)
;   resume at resume (0x??), mode 2
; when hlt (0x??, 0b???????), program complete

HANOI:
  CH          ; get b/z
  CIY 0       ; z <> 0
  JUMP CALL1  ;   then start
  JUMP RETURN ;   else return

CALL1:
  AIY F       ; disk - 1 (z)
  CH          ; restore a/y

  ; swap 1    ; prepare stack

    ; s =+0, d =+1, x =+2, r =+3
    ; s'=+4, d'=+5, x'=+6, r'=+7
    ; y = 3
    ; s' <- s
    ; d' <- x
    ; x' <- d

        AIY 3   ; move to x'

    ; AIY D ; s
    ; MA
    ; AIY 4 ; s'
    ; AM

    ; AIY E ; x
    ; MA
    ; AIY 3 ; d'
    ; AM

    ; AIY C ; d
    ; MA
    ; AIY 5 ; x'
    ; AM

  TIA 1       ; lbl 1
  AIY 1       ; advance to r'
  AM          ; set lbl
  JUMP HANOI  ; recurse

RET1:
  ; AIY C       ; unwind stack
  ; ; CH          ; get b/z
  ; ; AIY 1       ; restore disk number (z)
  ; ; CH          ; restore a/y
  ; AIY D       ; move stack to s
  AIY 9       ; unwind stack and move to s
PAUSE:
  JUMP PAUSE  ; pause

RESUME:
  AIY 3       ; move stack to r
CALL2:
  ; CH          ; get b/z
  ; AIY F       ; disk - 1 (z)
  ; CH          ; get a/y

  ; swap 2    ; prepare stack

    ; s =+0, d =+1, x =+2, r =+3
    ; s'=+4, d'=+5, x'=+6, r'=+7
    ; y = 3
    ; s' <- x
    ; d' <- d
    ; x' <- s

        AIY 3   ; move to x'

    ; AIY F ; x
    ; MA
    ; AIY 2 ; s'
    ; AM

    ; AIY D ; d
    ; MA
    ; AIY 4 ; d'
    ; AM

    ; AIY B ; s
    ; MA
    ; AIY 6 ; x'
    ; AM

  TIA 2       ; lbl 2
  AIY 1       ; advance to r'
  AM          ; set lbl
  JUMP HANOI  ; recurse

RET2:
  AIY C       ; unwind stack
  CH          ; get b/z
  AIY 1       ; restore disk number (z)
  ;JUMP RETURN                                  ; fallthrough

RETURN:
  CH          ; restore a/y
  MA          ; get r
  CIA 0       ; r <> 0 ?
  JUMP TEST   ;   then goto test
HLT:          ; else
  JUMP HLT    ;   hlt
TEST:
  CIA 1       ; r <> 1 ?
  JUMP RET2   ;   then ret2
  JUMP RET1   ;   else ret1

The program is fairly intelligible and its structure mimics that of the pseudo code. First we check if the number of discs is > 0, if not we exit the function by returning. Otherwise, we begin our first function call. Here we decrement the disc number (disk - 1), then begin to arrange our stack (or we would, but I comment this out for brevity and replaced it with a stub during testing).

Note that the Microcomputer Trainer does not have a subtraction instruction. Subtraction is achieved by causing the sum to overflow and wrap around. To discover the number to add, simply subtract the desired value from 0x10. For example, to subtract 0x1 we instead need to add 0xF as 0x10-0x1=0xF.

Finally, we complete the stack configuration for our first recursive call. We load the constant 1 into the accumulator, advance the stack pointer (the Y register) to the correct location, and store the return label in memory.

The program then does its first recursion, by jumping back to HANOI. When that call is done, the function will return (more on that later) and we will be put back at RET1. Here we pop the stack (or unwind, as I note in the comments), correct the disk number (the decrement should only affect the call we just made) and move to the Y register to point to the source peg in the current stack frame.

Here we see our first optimisations. Notably, there is no need to actually restore the disk number. No one is looking at it right now, and we are going to need to decrement it again in a moment for the second recursive call. So we comment that code out, saving four nibbles. This would leave two adjacent AIY instructions to advance the Y register. We can simplify this by summing these together and doing a single add to save a further two nibbles. The sum of AIY 0xD and AIY 0xC is 0x19. As this is a 4-bit machine, we simply add the lower nibble, and do AIY 0x9.

We then pause the program with a jump to the PAUSE label. This allow the user to exit out of the program and retrieve the calculated move as indicated on the stack at the position indicated by the Y register.

We then expect the user to resume the program at RESUME. We move the stack pointer (the Y register) back to the end of the stack and prepare our second recursive call. We've already commented out the redundant decrement of the disk number (as we didn't restore it), saving another four nibbles. Simply avoiding restoring and re-decrementing the disk number has saved us 10 nibbles, with no impact on execution an only a minor impact on readability.

Next, we managed the stack. For brevity we use a stub to skip over doing the actual swaps while we develop the code. Like with the first call, we set our return label, but this time to 2 to indicate that this is the second recursive call. We load the constant into the accumulator, advance the stack pointer, and store the return label in memory.

When this call completes, we will be retuned to the RET2 label. Here we again pop the stack, restore the disk number (this time we need to!) and jump to RETURN. Or we would, but why waste memory jumping to the very next instruction? Instead we simply fall through. An instant and totally free saving of three nibbles.

Now we get to the RETURN block. Michael gives a great explanation of this concept, which he calls the "jump block", in his video where he demonstrates recursion on the Busch Microtronic 2090. Essentially, we retrieve the label at the stack pointer and do a series of tests. The Microcomputer Trainer tests for inequality, which makes this a bit more confusing, but in summary: if the label is <> 0, we will TEST the label further, otherwise we halt (using a loop to HALT). In the TEST section, if the label is <> 1 (i.e. 2), we jump to the label RET2, otherwise it must be 1 so we jump to RET1.

Shrinking the Program

If we uncomment the swaps (and remove the stub code) and assemble the program we find it is 94 nibbles. While we could just about fit the program into the combined memory and data space of 96 nibbles (80 in the program area at 0x00-4F, plus the first 14 from the adjacent data area at 0x50-5D), that would leave just two nibbles (0x5E-5F) for our stack; not even enough for a single frame. First we double check our code for any easy wins.

We notice two adjacent AIY instructions in the RESUME and CALL2 areas that have surfaced when we uncommented the proper stack management code. We add the values of 0x3 and 0xF to get 0x12, and simply add 0x2 as is sufficient in our 4-bit architecture. This saves us two nibbles. Down to 92.

Our optimisations so far have not materially impacted intelligibility, but to shrink the program further we need to tweak the code somewhat.

Next we notice some duplicate code. Both CALL1 and CALL2 end with the sequence, AIY 1, AM, JUMP HANOI to set the return location on the stack and actually execute the recursive call. We squeeze in a no-cost label of SETR above this code in CALL2 and remove the corresponding code from CALL1. Instead we jump to our new label. This gives us a net saving of three nibbles. Down to 89.

The code is still fairly understandable, but now we will flip that on its head. Literally.

Next, we rearrange the program to maximise the amount of fall through. Each jump takes three nibbles of memory. By rearranging the program we can loose two more jumps (in addition to the one we saved earlier), another saving of 6 nibbles. Down to 83. We also take the opportunity to move the frequently used RESUME block to 0x00, minimising the number of times the user has to explicitly set the program counter.

To squeeze out the final nibbles, we get a little more tricksy. The RETURN block does a bit of an odd pattern of jumps due to the inequality operation. It would be nice to test if the label is 0 and, if it is, halt right away. To do this we can do a no-cost substitution of the two nibble CIA 0 operation, replacing it with a two nibble call to CAL SIFT. This built in subroutine shifts the value in the accumulator to the right and (oddly) sets the jump flag to true if the lowest bit was zero. This allows us to remove the JUMP TEST instruction and immediately halt when we see an even number, like zero. If we have an odd number, the jump will be skipped and we can begin the next TEST. Removing the jump makes a three nibble saving. Down to 80.

Unfortunately, our label of 2 would also now trigger the HALT loop, so we make sure to make a no-cost change and replace the TIA 2 instruction to TIA 3 (or 0b11, as I like to think of it now). Equally, we need to consider the effect of that shift on our comparison in the TEST block, where we change our CIA 1 to CIA 0, to account for the fact that 1 right shifted is 0 (and 3 right shifted would be 1).

The final listing looks like so:

; towers of hanoi for science fair microcomputer trainer
; (c) 2024 LambdaMikel & jtjacques

; assemble with https://arpruss.github.io/r165.html

; stack
; set 0x50 to a
; set 0x51 to c
; set 0x52 to b
; set 0x53 to 0

; initial state
; set z (0x6b) to n (0, 1, 2, 3)
; set y (0x6e) to 3 (pointer to m 0x53)

; start at hanoi (0x??), mode 2
; when pause (0x??, 0b???????)
;   reset, inspect y (0x6e)
;   moves: inspect memory at 0x5[y] (from), 0x5[y+1] (to)
;   resume at resume (0x??), mode 2
; when hlt (0x??, 0b???????), program complete

RESUME:
  ;AIY 3       ; move stack to r
CALL2:
  ; CH          ; get b/z
  ; AIY F       ; disk - 1 (z)
  ; CH          ; get a/y

  ; swap 2    ; prepare stack

    ; s =+0, d =+1, x =+2, r =+3
    ; s'=+4, d'=+5, x'=+6, r'=+7
    ; y = 0
    ; s' <- x
    ; d' <- d
    ; x' <- s

        ;AIY 6   ; move to x'

    AIY 2 ; x
    MA
    AIY 2 ; s'
    AM

    AIY D ; d
    MA
    AIY 4 ; d'
    AM

    AIY B ; s
    MA
    AIY 6 ; x'
    AM

  TIA 3       ; lbl 0b11
SETR:
  AIY 1       ; advance to r'
  AM          ; set lbl
  ;JUMP HANOI  ; recurse                        ; fallthrough

HANOI:
  CH          ; get b/z
  CIY 0       ; z <> 0
  JUMP CALL1  ;   then start
  JUMP RETURN ;   else return

RET2:
  AIY C       ; unwind stack
  CH          ; get b/z
  AIY 1       ; restore disk number (z)
  ;JUMP RETURN                                  ; fallthrough

RETURN:
  CH          ; restore a/y
  MA          ; get r
  CAL SIFT    ; (r & 1) <> 1 (i.e. is r even) ?
HLT:          ; then
  JUMP HLT    ;   hlt
TEST:         ; else
  CIA 0       ; (r >> 1) <> (0b1 >> 1) ?
  JUMP RET2   ;   then ret2
  ;JUMP RET1   ;   else ret1                    ; fallthrough

RET1:
  ; AIY C       ; unwind stack
  ; ; CH          ; get b/z
  ; ; AIY 1       ; restore disk number (z)
  ; ; CH          ; restore a/y
  ; AIY D       ; move stack to s
  AIY 9       ; unwind stack and move to s
PAUSE:
  JUMP PAUSE  ; pause

CALL1:
  AIY F       ; disk - 1 (z)
  CH          ; restore a/y

  ; swap 1    ; prepare stack

    ; s =+0, d =+1, x =+2, r =+3
    ; s'=+4, d'=+5, x'=+6, r'=+7
    ; y = 3
    ; s' <- s
    ; d' <- x
    ; x' <- d

        ;AIY 3   ; move to x'

    AIY D ; s
    MA
    AIY 4 ; s'
    AM

    AIY E ; x
    MA
    AIY 3 ; d'
    AM

    AIY C ; d
    MA
    AIY 5 ; x'
    AM

  TIA 1       ; lbl 0b1
  ; AIY 1       ; advance to r'
  ; AM          ; set lbl
  ; JUMP HANOI  ; recurse
  JUMP SETR

If we run this through the assembler (or assemble it by hand), we get the very same sequence of instructions shown at the beginning of this entry.

Conclusion

In truth, and as Michael could attest, there were a fair few more iterations of this. Previous versions had to dip into the data area to fit. Throughout the process, many subtle sequencing decisions were made at indeterminate points to minimise the use of CH and the swaps of the Y and Z registers. Equally, I admit I did some last minute post-hoc rearranging of code, such as the order of the TIA and AIY instructions around the SETR label to maximise the opportunity for code reuse. Choosing which of the blocks to keep and which to discard may sometimes also require some trial and error in maximising total overall code savings.

Equally, the more abstract program design goals evolved over time. Very early iterations attempted some actual program driven I/O, as seen in the Microtronic version, rather than relying on the address LEDs to indicate state. Unfortunately, this was quickly dismissed as impractical. I did choose to retain the programatic pauses, however, so that there is no need to single-step the program and some semblance of an interactive experience remains.

Overall while many design trade-offs were made, I think it is fair to say we did achieve the goal of squeezing a Towers of Hanoi solver into just 80 nibbles (40 bytes) of memory. I hope you've enjoyed this deep dive into the code as much as I have. Despite its limitations, the Microcomputer Trainer clearly is a "real" computer. And for me, at least, it is yet to truly disappoint!

Discussions