
SIMPL as a hardware "bring up" language.

A project log for Suite-16

Suite-16 is a 16-bit CPU designed to be built entirely from 74xx TTL. It is a personal exploration of how hardware and software interact.

monsonite, 10/28/2020 at 14:13

Back in January, I described an interpreted language that I have been developing, for the Suite-16 TTL computer.

SIMPL is an acronym for Serially Interpreted Minimal Programming Language

It started life back in 2013 as a serial command interface to allow a microcontroller to respond to single character commands, sent to it over a serial terminal connection. At that time I was working on precision motion control systems, and during development it was useful to have a simple command interface to allow control over the motion hardware.

Commands were given easy-to-remember uppercase alpha characters: U Up, D Down, L Left, R Right, etc. Each command was preceded by a decimal number which, for the motion system, was the distance to move in millimetres.

100 U   would move the platform up by 100mm

This simple command shell was great for debugging the hardware, and rapidly evolved additional functionality to make it more useful.

Over a period of months, the test program evolved many more commands and started to become a bit unwieldy.

Further inspiration came in May of 2013, when I was introduced to Ward Cunningham's Txtzyme.  This was a short C program, written to run on the Arduino, which introduced me to a more formal structure for a command interpreter based around a switch-case running within a loop.
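The shape of such an interpreter can be sketched in a few lines. The Python below is only an illustration and is not Txtzyme's actual C code: the function name interpret, the position dictionary and the U/D/L/R motion commands (borrowed from the motion-control example above) are assumptions made for the sketch, and an if/elif chain stands in for C's switch-case.

```python
def interpret(line, position=None):
    """Run a line of '<number> <command>' pairs such as '100 U 25 L'."""
    if position is None:
        position = {"x": 0, "y": 0}   # hypothetical platform position in mm
    number = 0                         # the decimal number typed so far
    for ch in line:                    # the loop
        if "0" <= ch <= "9":           # the "switch-case" on each character
            number = number * 10 + int(ch)
        elif ch == "U":
            position["y"] += number
            number = 0
        elif ch == "D":
            position["y"] -= number
            number = 0
        elif ch == "L":
            position["x"] -= number
            number = 0
        elif ch == "R":
            position["x"] += number
            number = 0
        # any other character (spaces included) is simply ignored
    return position
```

Adding a new command is then just a matter of adding one more case to the chain.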


As a language, SIMPL is a means of communicating with a processor and of automating repetitive machine tasks. It has specifically been designed to be small and uncomplicated, requiring very few memory resources. Versions of it have been ported to other processors in less than 1000 bytes of program space.

Although the central interpreter routines and command despatcher are small in size, the basic framework is extensible and adaptable to a range of  applications, the first of which is a tool-kit to allow easier programming and testing of hardware.

In this log I describe some of the desirable features and how they are implemented, to make this a useful tool for working with new processors.

Background.

Working with any new processor is never an easy task, but after working on a few over the years you learn some techniques and build a tool-kit to help make the job easier.  Once you have reached the stage where you can blink an LED or make a musical tone from an output pin, you have fought half the battle.

Almost every modern microcontroller is now programmed in C, or some scripting language.  To make this possible you need the vast resources of a C-compiler and a large tool-chain running on a modern laptop. The Arduino Project has brought new skills to millions of new coders, who have then gone on to do great things with embedded or hobbyist  programming. However, even the Arduino IDE has become bloated over the years, and it makes the coder highly reliant on vast libraries created by others.

This log hopes to illustrate some of the early methods of coding - dating back nearly 50 years, and how some of these techniques can be applied even today.

Assembly Language.

Writing in the native assembly language of any computer is not the easiest of tasks: it is time consuming to write and debug, and prone to mistakes. This often discourages people from attempting assembly language, or at least encourages them to write as little of it as possible.  However, assembly language is the bedrock, or foundation layer, of everything we do, and at some point it is necessary to gain an understanding of its importance to the whole software stack that is built on top of it.

So if ultimately you cannot avoid assembly language, there are means of minimising your exposure to it.  You need the bare bones of assembly language routines which will allow you to interact with your new processor, in an abstracted way that simplifies the tasks of low-level programming.

Early Days

In the early days of microprocessors, there were few programming tools available, and many machines were initially programmed solely in machine code, by toggling the contents into memory using the switches on the front panel. This was incredibly time consuming and error-ridden, so when serial terminals became more widely available, microcomputer manufacturers often provided a "serial monitor" program.  This allowed the contents of memory to be examined and modified, and programs to be run from RAM.

One of the more memorable monitors was "WozMon", which Steve Wozniak supplied for the Apple 1. It was compact and fitted into 256 bytes of PROM.  You still had to do your own hand assembly using pencil and paper, but typing and viewing hex on a screen is a lot easier than toggling front panel switches.

The next evolutionary step was to provide an assembler, self-hosted on the machine. This made assembly language programming a lot quicker, especially where labels and symbols could be used to define sub-routines and allow relocatable code to be written.  However, a fully symbolic 2-pass assembler was at that time a fairly complex piece of software, and CPU manufacturers often only bundled them with their expensive tool chains and development systems.

Early hobbyists could seldom afford these tools, and so efforts were made to find alternatives.  Out of this period of late 1970's computer history came a range of interactive languages, tailored towards the resource limited microcomputers, including TinyBASIC, VTL-2, Mouse and Forth.  VTL-2   (Very Tiny Language) was small enough to fit into 768 bytes.

Basic Machine Interaction

Interacting on a one to one basis with a computer, in my opinion, is one of the purest forms of programming.  It is a conversation between man and machine, where you issue  commands, and the computer as your powerful servant  follows your commands without questioning.

To command your machine, you need to give it concise, unambiguous and accurate instructions which it will execute. These are traditionally done using a text based language which is either interpreted directly or compiled into the computer's memory. Compiled code runs faster, but it has the disadvantage and overheads of the "edit compile test" cycle. Interpreted code is slower in execution time but has the advantage that the process is interactive and you can quickly test and reiterate until you achieve the desired functionality.

The earliest electronic computers (1946-1952) were very simple machines and could not incorporate large instruction sets.  Input and output were done by telegraph paper tape and teletype, with limited character sets based on a 5-bit code. This led to small instruction sets with program commands restricted to uppercase letters. The instructions were chosen to have a strong mnemonic value - such as A for ADD and S for SUB.

In the light of these simpler methods, I wanted to move away from native assembly language, and provide a rudimentary, interpreted language, based on single character commands, which can form the basis of a toolkit to make the job of code development  somewhat easier.


Virtual Machines (VMs)

These Tiny Interpreted Languages were frequently designed around a Virtual Machine able to parse and interpret instructions stored in RAM.  Using a VM running a higher level language makes the task of programming easier, but at the expense of speed of execution. For this reason the VM needs to be efficient in fetching instructions from memory and executing them in the native machine language.  Efficiency often means simplicity, so the interpreter must use the fastest coding techniques available.

One approach is to use the minimum possible instruction set to implement the VM.  This keeps the amount of native machine language required to a few hundred bytes, which makes coding it simpler, and it also makes it easier and quicker to transfer the VM from one processor ISA to another.

So what operations will form the primitive instructions of the VM?  How many will be needed, and which operations are essential to run the mechanics of the VM?

More complex instructions can always be synthesised by combining sequences of the primitive instructions. 

How do we make the VM language easier to work with?

I have found that I struggle to remember the mnemonics and syntax when I move from one assembly language to another, as one manufacturer chooses one convention over another. As processors get more complex, the quantity and complexity of the mnemonic statements increases. It has almost got to the point where assembly language for the largest processors has become virtually unreadable.

To simplify this ever increasing complexity, I propose the use of single printable ASCII characters to represent each VM primitive instruction. As literate humans, we have learned to efficiently recognise a wide range of symbols, punctuation marks, and alpha-numerical characters.

For example we all recognise the arithmetical symbols from our basic math and algebra lessons:

+      ADD
-      SUB
*      MUL
/      DIV

Then we can include the logical symbols:

&      AND
|      OR
^      XOR
~      NOT

With just these 8 symbols we have covered almost all of the instructions performed by the ALU.

We can then add the comparison operators:

<      LESS THAN
=      EQUAL TO
>      GREATER THAN

 The memory operation symbols are borrowed from the Forth language

@      FETCH
!      STORE

 And because the VM is going to be based around a stack machine architecture, we need the stack manipulation operations:

"      DUP
'      DROP
$      SWAP
%      OVER
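
As a rough illustration of what these four operations do, here is a Python sketch that uses a list as the data stack, with the top of the stack at the end of the list. The function names and the STACK_OPS table are assumptions made for the example; only the symbols and their Forth meanings come from the list above.

```python
def dup(stack):
    """( a -- a a )  copy the top item"""
    stack.append(stack[-1])

def drop(stack):
    """( a -- )  discard the top item"""
    stack.pop()

def swap(stack):
    """( a b -- b a )  exchange the top two items"""
    stack[-1], stack[-2] = stack[-2], stack[-1]

def over(stack):
    """( a b -- a b a )  copy the second item to the top"""
    stack.append(stack[-2])

# Hypothetical mapping from the SIMPL symbols to the routines above.
STACK_OPS = {'"': dup, "'": drop, "$": swap, "%": over}

def apply_ops(symbols, stack):
    """Apply a string of stack symbols to the stack, left to right."""
    for ch in symbols:
        STACK_OPS[ch](stack)
    return stack
```

For instance, starting from a stack holding 1 and 2, the sequence $% swaps the two items and then copies the new second item to the top, leaving 2 1 2.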

 There are only a few remaining printable symbols and these are used for program flow control, allocating variables and defining structures

      SPACE
#     LIT
(     BEGIN
)     END
,     PUSH
.     POP PRINT
:     CALL
;     RETURN
?     KEY INPUT
[     OPEN ARRAY
]     CLOSE ARRAY
\     COMMENT
_     TEXT STRING
`     VAR  
{     SWITCH-CASE OPEN
}     SWITCH-CASE CLOSE

We have now defined the main constituents of a shorthand but human-readable language suitable for a stack-based VM. It has its roots in Forth, but with a much reduced word-set, and because every operation is a single ASCII character the text interpreter is greatly simplified. Any character received in the input buffer is decoded using a jump table, which supplies the address of the code to execute.

For example, if the interpreter finds a + symbol, which is ASCII character 0x2B, it uses the value 0x2B to index into a jump table, which holds the 16-bit start address of the code block that handles the + operation. In this case it will ADD the top two numbers that it finds on the stack, and put the result on the top of the stack.

The text interpreter will then fetch the next symbol from memory, and perform a similar execution process.  For each of the characters used in the tiny language, there will be a code block associated with it.
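
The dispatch mechanism described above can be sketched as follows. This is a Python illustration, not the Suite-16 assembly: a list of function references stands in for the table of 16-bit code addresses, the names JUMP_TABLE, execute and op_* are hypothetical, and wrapping results to 16 bits is an assumption based on the word size of the machine.

```python
MASK = 0xFFFF  # assumed 16-bit word size

def op_add(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append((a + b) & MASK)

def op_sub(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append((a - b) & MASK)

def op_and(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a & b)

# One entry per 7-bit ASCII code; None means "no handler, ignore".
JUMP_TABLE = [None] * 128
JUMP_TABLE[0x2B] = op_add        # '+'
JUMP_TABLE[ord("-")] = op_sub
JUMP_TABLE[ord("&")] = op_and

def execute(code, stack):
    """Fetch each character and jump to its handler via the table."""
    for ch in code:
        index = ord(ch)
        handler = JUMP_TABLE[index] if index < 128 else None
        if handler is not None:
            handler(stack)
    return stack
```

Because the character code is itself the table index, no string comparison or searching is needed to decode a command.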

Numbers and Variables.

Having dealt with most of the printable symbols - there are three main classes of characters left, numerals 0-9, lowercase alpha a-z and uppercase alpha A-Z.

The numerical characters are handled by a code routine called NUMBER.  It takes each character in turn, until it finds a space or other non-numeric character, and builds a 16-bit number which it places on the stack. It is the equivalent of an ASCII-to-binary conversion. There is a matching numerical output routine, PRINTNUM, which takes a number off the stack and prints it as a decimal number to the terminal. Further routines can be added to handle hexadecimal notation.
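
A minimal Python sketch of these two conversions is given below. The routine names follow the text, but the signatures are assumptions: number returns the value together with the position of the first unconsumed character rather than pushing to a stack, and printnum returns the digit string rather than writing to a terminal. Truncating to 16 bits on overflow is also an assumption.

```python
def number(text, pos=0):
    """Convert decimal digits starting at text[pos] to a 16-bit value.

    Returns (value, position of the first non-digit character).
    """
    value = 0
    while pos < len(text) and "0" <= text[pos] <= "9":
        digit = ord(text[pos]) - ord("0")        # ASCII digit to binary
        value = (value * 10 + digit) & 0xFFFF    # keep within 16 bits
        pos += 1
    return value, pos

def printnum(value):
    """Convert a non-negative number to its decimal digit string."""
    digits = []
    while True:
        value, remainder = divmod(value, 10)     # peel off the low digit
        digits.append(chr(ord("0") + remainder))
        if value == 0:
            break
    return "".join(reversed(digits))             # digits came out low-first
```

The multiply-by-ten-and-add loop needs only a shift-and-add multiply, which is why this style of number parsing suits small machines.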

Having dealt with the numbers, we come to the alpha characters, which have traditionally been used in assembly language to denote sub-routine addresses in the form of labels, numerical constants and variables.  Often the various registers of microprocessors are given a short-form name consisting of an uppercase letter.  Being able to substitute a single character to represent a variable or constant has always been a powerful process in symbolic programming. In the case of TinyBASIC, variables were thus limited to the 26 uppercase letters.

Subroutine labels are often denoted in lowercase.  In SIMPL we will continue this tradition.  Lowercase alpha characters will be assigned to some of the common tasks that would be required from a hardware bring-up language. With modern microcontrollers we want to exercise the GPIO ports, sample the ADC, and define the timing of delays or square waves in terms of microseconds and milliseconds. The basic kernel of SIMPL can be extended by allocating the necessary code routines to these characters.

For a hardware bring up language, here is an example of the typical routines one would want to perform.  I have suggested single character names - with a mnemonic value.

a       set address for hex dump
b       set the number base
c       clear a range of memory
d       dump as hex
e       edit an address
f       fill memory range
g       Go - run code from an address
h       set an output high
i       define an input port 
j       jump to a procedure address
k       a loop index variable
l       set an output low
m       milliseconds delay
n       output a binary number on a port
o       define output port
p       print in decimal
q       print in hex
r       read register
s       sample ADC
t       toggle port line
u       microseconds delay
v       assign a variable
w       write register
x       define x-axis position
y       define y-axis position
z       sleep until keypress or other event

 a through to f are the typical commands that you would have on a hex editor or monitor program

h through to o are for exercising GPIO and defining loops and timing - useful for wave synthesis

p and q are for printing

r and w are for directly accessing a CPU register

x and y are useful for graphics routines or  2D motion control - such as CNC

Uppercase Commands.

These are what give the language its flexibility and extensibility. They provide the mechanism for the programmer to create their own functions, and save them to memory.

For example, the following code will print a string of text to the screen, everything contained between the underscore characters is treated as a text string and is printed directly to the screen:

_This is a test message_

We can now assign this snippet of code to a user function, let's use M for message

:M_this is a test message_;

Each user function must be defined starting with a colon :  The interpreter recognises the colon as the start of a user definition, named by the next character, in this case M.  It uses the ASCII value of the M to create a unique address at which to store the remainder of the code snippet, up to the trailing semicolon ;

To execute this function, you only have to type M. 
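
One way this address calculation could work is sketched below in Python. The fixed slot size of 48 bytes, the base address of zero, the zero-byte terminator and the names slot_address, define and lookup are all assumptions made for the illustration; the text above only says that the ASCII value of the letter yields a unique storage address.

```python
SLOT_SIZE = 48   # assumed number of bytes reserved per definition
BASE = 0         # assumed start of the definition area in memory

def slot_address(name):
    """Map 'A'..'Z' to the start address of that letter's slot."""
    return BASE + (ord(name) - ord("A")) * SLOT_SIZE

def define(source, memory):
    """Store the body of ':Xbody;' in the slot belonging to X."""
    if not source.startswith(":") or ";" not in source:
        raise ValueError("definition must look like :Xbody;")
    name = source[1]
    body = source[2:source.index(";")]
    if len(body) >= SLOT_SIZE:
        raise ValueError("definition too long for its slot")
    start = slot_address(name)
    memory[start:start + len(body)] = body.encode("ascii")
    memory[start + len(body)] = 0          # terminator marks the end

def lookup(name, memory):
    """Read back the stored body for the letter name."""
    start = slot_address(name)
    end = memory.index(0, start)
    return memory[start:end].decode("ascii")
```

With a 26 x 48 byte array as memory, defining :M_this is a test message_; places the body at offset 576, and typing M would cause the interpreter to execute the text stored there.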

User functions will often incorporate the in-built functions - such as the millisecond and microsecond delays m and u.  We can incorporate these into a loop structure and use it to blink an LED, or generate audio tones.

The following code defines a scale of musical notes A,B,C,D,E,F,G 

\ Some fixed length  "Musical Tones"
:A40(1o1106u0o1106u);     \ A 440.00 Hz
:B45(1o986u0o986u);       \ B 493.88 Hz
:C51(1o929u0o929u);       \ C 523.25 Hz
:D57(1o825u0o825u);       \ D 587.33 Hz
:E64(1o733u0o733u);       \ E 659.26 Hz
:F72(1o690u0o691u);       \ F 698.46 Hz
:G81(1o613u0o613u);       \ G 783.99 Hz

 The backslash is used to state that everything following it until the newline character is a comment.

The brackets (parentheses) are the means by which we define a loop. Everything contained within the brackets will be repeated n times, where n is the number that immediately precedes the open bracket.
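
The loop rule can be sketched in Python as follows. This is an illustration only, not the SIMPL implementation: the function run and the do_command callback are hypothetical, every non-digit, non-bracket character is simply handed to the callback with the most recent number, and the brackets are assumed to be balanced.

```python
def run(code, do_command):
    """Execute code, repeating each n(...) body n times."""
    number = 0
    i = 0
    while i < len(code):
        ch = code[i]
        if "0" <= ch <= "9":
            # Collect a decimal number; it becomes the current parameter.
            number = 0
            while i < len(code) and "0" <= code[i] <= "9":
                number = number * 10 + int(code[i])
                i += 1
        elif ch == "(":
            # Find the matching close bracket, allowing nested loops.
            depth = 1
            j = i + 1
            while depth > 0:
                if code[j] == "(":
                    depth += 1
                elif code[j] == ")":
                    depth -= 1
                j += 1
            body = code[i + 1:j - 1]
            for _ in range(number):
                run(body, do_command)
            i = j
        else:
            do_command(ch, number)
            i += 1
```

Calling run("3(ab)", handler) invokes the handler for a and b three times over, and because the body is run recursively, loops can be nested.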

Consider the definition for musical note A, and assume we have a small speaker connected to one pin of an output port.

1o will set an output port high and 0o will set it low.  

So we set the output port high for 1106 microseconds and then set it low for 1106 microseconds. 

We repeat this cycle 40 times, which will produce a short tone of about 440 Hz lasting roughly 90 milliseconds.
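
The numbers in a note definition follow from the frequency: each half of the square wave lasts 1,000,000 / (2 x f) microseconds. The Python sketch below works this out. The helper names and the trim_us parameter are assumptions for the illustration. The delays in the listing are between about 25 and 30 microseconds shorter than the ideal half-period, presumably to allow for the time the interpreter itself takes on each pass, although the listing does not say so.

```python
def half_period_us(freq_hz):
    """Ideal time, in microseconds, that the output stays high (or low)."""
    return round(1_000_000 / (2 * freq_hz))

def tone_definition(name, freq_hz, cycles, trim_us=0):
    """Build a SIMPL note definition in the style of the listing.

    trim_us is a hypothetical allowance subtracted from each delay.
    """
    delay = half_period_us(freq_hz) - trim_us
    return f":{name}{cycles}(1o{delay}u0o{delay}u);"

def duration_ms(freq_hz, cycles):
    """Approximate length of the tone in milliseconds."""
    return round(1000 * cycles / freq_hz)
```

For A at 440 Hz the ideal half-period is 1136 microseconds; with an assumed trim of 30 microseconds, tone_definition("A", 440.0, 40, trim_us=30) gives the same text as the definition of A in the listing.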

If we want to play 3 notes, all we need to do is type ABC,  and if we want to play this sequence 5 times we can put it in a loop, and give it a new definition T, for "tune":

:T5(ABC);

 If we don't like the tune - we can quickly edit the definition for T.


        
