
Benchmarking Suite-16

A project log for Suite-16

Suite-16 is a 16-bit cpu built entirely from TTL. It is a personal exploration of how hardware and software interact.

monsonite, 10/28/2019 at 13:46

Over the last week I have been running Suite-16 assembly language, simulated in about 60 lines of C++ code. I have evolved the simulator over that time and added new instructions as they became necessary.

The simulator has been written using the Arduino IDE, so that anyone with an Arduino-compatible board can explore the code and learn how a very simple cpu simulator works.

Originally I had been simulating the Suite-16 cpu on an MSP430 Launchpad board with FRAM.

I noticed that, despite it being a 16-bit processor, the performance was poor, so I have swapped over to a Nucleo STM32H743 board, which has a 400MHz ARM processor.

I'm still using the Arduino IDE to develop code because it has a useful timing function, micros(), which returns the number of microseconds since the program started. With this I can get fairly accurate timing information from my simulator.

I have used one of the spare instruction opcodes to output the instruction count and the elapsed time to the terminal.

By way of a timing benchmark, I have set up a simple loop that loads R0 with 32767 and repeatedly decrements it until it reaches zero. I then print out the instruction count and the elapsed time in microseconds.
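As a rough sketch of how a benchmark like this can be structured (this is not the actual Suite-16 simulator): the opcode names, the `Instr` and `RunStats` types and the `run()` and `countdown_program()` helpers below are all hypothetical, and `std::chrono::steady_clock` stands in for the Arduino `micros()` call so that the code also builds on a desktop compiler.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes for illustration only; NOT the real Suite-16 encodings.
enum class Op : uint8_t { LoadImm, Dec, BranchNotZero, Halt };

struct Instr {
    Op op;
    uint16_t operand;  // immediate value or branch target
};

struct RunStats {
    uint64_t instructions;  // simulated instructions executed
    int64_t elapsed_us;     // wall-clock time in microseconds
    uint16_t r0;            // final value of the single register
};

// Fetch-execute loop that counts every simulated instruction and times the run.
RunStats run(const std::vector<Instr>& program) {
    uint16_t r0 = 0;
    std::size_t pc = 0;
    uint64_t count = 0;
    bool running = true;

    // Stand-in for micros() on an Arduino-compatible board.
    const auto start = std::chrono::steady_clock::now();
    while (running) {
        assert(pc < program.size());
        const Instr in = program[pc++];
        ++count;
        switch (in.op) {
            case Op::LoadImm:       r0 = in.operand; break;
            case Op::Dec:           --r0; break;
            case Op::BranchNotZero: if (r0 != 0) pc = in.operand; break;
            case Op::Halt:          running = false; break;
        }
    }
    const auto stop = std::chrono::steady_clock::now();

    const auto us =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    return RunStats{count, static_cast<int64_t>(us), r0};
}

// "Load R0 with 32767, decrement until zero" expressed in the toy instruction set.
std::vector<Instr> countdown_program() {
    return {
        {Op::LoadImm, 32767},
        {Op::Dec, 0},
        {Op::BranchNotZero, 1},  // loop back to the Dec at index 1
        {Op::Halt, 0},
    };
}
```

With this toy encoding, `run(countdown_program())` executes 65536 simulated instructions (one load, 32767 decrement-and-branch pairs, one halt). Dividing the instruction count by the elapsed microseconds gives millions of simulated instructions per second.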

Based on the "count down from 32767" loop, my Suite-16 simulator is running about 8 million simulated instructions per second.

That's about 66% of the speed I'm hoping the TTL cpu will run at.

Based on the 400MHz clock on the Nucleo board, I can estimate that the C++ simulator is taking about 50 ARM clock cycles to execute one simulated Suite-16 instruction.
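The arithmetic behind these two estimates can be written out as a small sketch. The function names are hypothetical, and the target speed of roughly 12 million instructions per second is only what the 66% figure implies, not a number stated in this log.

```cpp
#include <cassert>

// Host clock cycles available per simulated instruction.
double host_cycles_per_sim_instruction(double host_clock_hz, double sim_instr_per_sec) {
    assert(sim_instr_per_sec > 0.0);
    return host_clock_hz / sim_instr_per_sec;
}

// Target speed implied by "the simulator reaches this fraction of the target".
double implied_target_ips(double sim_instr_per_sec, double fraction_of_target) {
    assert(fraction_of_target > 0.0);
    return sim_instr_per_sec / fraction_of_target;
}
```

For example, `host_cycles_per_sim_instruction(400e6, 8e6)` gives 50, and `implied_target_ips(8e6, 0.66)` gives a little over 12 million simulated instructions per second.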

I tried exactly the same code on the MSP430, which runs at a nominal 16MHz. Unfortunately the FRAM only works at 8MHz with wait states, which slows it down considerably, to about 75,000 simulated instructions per second.

So I tried a 16MHz Arduino with an 8-bit AVR ATmega328, and the results were much improved, at nearly 139,000 simulated instructions per second.

The humble AVR is approximately 59 times slower than the ARM, but with a 7µs simulated instruction cycle it is still in the same league as some of the classic minicomputers from the 1960s.
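The simulated instruction cycle time is simply the reciprocal of the measured rate. A minimal sketch, with a hypothetical helper name:

```cpp
#include <cassert>

// Microseconds taken per simulated instruction at a given instruction rate.
double us_per_sim_instruction(double sim_instr_per_sec) {
    assert(sim_instr_per_sec > 0.0);
    return 1e6 / sim_instr_per_sec;
}
```

Using the rates measured above, 139,000 instructions per second on the AVR works out to about 7.2µs per instruction, 75,000 on the MSP430 to about 13.3µs, and 8 million on the ARM to 0.125µs.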

Update 31-3-2021.

I am now running the simulator on a 600MHz Teensy 4.0 dev board.

An empty loop executes at around 50 million iterations per second, and an addition, subtraction or logic operation can be performed around 9.2 million times per second.

I have decided that the ISA of Suite-16 is a very good match for a stack-based bytecode language called STABLE by Sandor Schneider.  
