Writing a TOM-1 program (take 1)

I needed a better way to test the branching logic of the CPU and decided to come up with a fairly basic "assembler". Taking inspiration from Forth, we can invent a very simple programming language from scratch taking a string, splitting all the whitespace into a set of tokens, and then converting each token into a 4-byte instruction in the code binary that the circuit uses to operate. With just a little compiler magic we can throw in some convenience features like named labels and mixed decimal and hex (0x-prefixed) numbers.

Here's an example of a test that verifies subtraction:

from tom1 import *

labels = generate("""

[start]

0x8372 0x35aa -1 ~& 1 + + [check_subtraction]

0 branch0 start

""")

debug()
step_until(pc=labels['check_subtraction'])
validate(tos=0x4dc8)
print("success")

This language isn't actually Forth but something much less powerful. Here's the module "tom1" imported at the top of the file: tom1.py

So in the above script, call to generate("""..."""") will take a TOM-1 "script" as its first argument and actually compile it, generating a .hex binary file that can be used by the CPU in the simulator. This function generate() also returns a value labels that can be used to actually run the test. The methods debug() and the rest that follow generate() are actually Python code to interactively run the CPU, to step the CPU until we hit specific labels, and then validate if all the CPU values are correct. If the script doesn't throw during a call to validate(), then the test succeeded.

Let's look at the at the first and last lines of the script inside the triple quotation marks ("""):

"""
[start]

...

0 branch0 start
"""

[start] declares a label that we can conditionally jump to. When the compiler sees it, it will treat the token start everywhere else in the script as a reference to the location in the program where you wrote [start].

The token 0 means the CPU will push the number 0 to the stack with its "push_literal" opcode. The next token branch0 is implemented on the CPU as a "jump_if_0" opcode that rewrites the Program Counter with a new address if and only if the value on the top of the stack is equal to 0. So 0 branch0 is a two-opcode way to "always branch". All the tests end with a jump to the start of the program since there is no way to stop the CPU...

The middle line in the script is the actual test:

0x8372 0x35aa -1 ~& 1 + + [check_subtraction]

This is actually an executable program that subtracts two numbers! The CPU needs a lot of assistance from a compiler to abstract this away, because the CPU can do very little. For now I'm forced to write this out by hand. Let's break this down in steps:

0x8372 0x35aa => We push these two values onto the stack. The top of the stack is now 0x35aa.
-1 ~& => We push the value -1 and then NAND the top two values on the stack. The token for the NAND opcode is arbitrarily ~& so it looks similar to the other arithmetic function +. After this opcode, the top of the stack will be equal to "0x35aa nand 0xFFFF", which is 0xca55. This is actually a bit hack that inverts 0x35aa without needing a dedicated "invert" opcode.
1 + => We need to add 1 to the inverted number. There's a difference between inverting a number and turning it negative, and that difference is two's complement.
+ => We add the top two values on the stack, which are now the first number we pushed (0x8372) and the two's complement of the second (0x35aa). Adding them together performs subtraction, and we get our one result on the stack: 0x4dc8.

And there you have it, arithmetic subtraction on the TOM-1 implemented in a blazing, uh, 14 clock cycles. The final token in the script [check_subtraction] is used for testing but also is a helpful comment, since this code is very hard to follow. Eventually I'll have to stop working on an assembler and start working on an actual compiler, so -1 ~& 1 + + can be tucked away inside a function.

Opcode Encoding for a Stack Machine

Bought an EPROM Programmer + Eraser

Discussions

Become a Hackaday.io Member