This project is very much a race against the clock. The idea occurred to me after the first week of September, leaving three weeks before the end of the Square Inch Contest. Since it involves processor architecture, hardware and PCB design, and software, this is hardly possible in three weeks when you also have a job.

The project was not finished in time for the Square Inch deadline; perhaps the Hackaday Prize 2019 will bring more luck.

Well let's start with the usual characteristics:




The registers are in RAM. That has been done before (see TMS9900). This will not give you a speed demon, but it is needed to fit the design in one square inch. Another thing left out of the CPU for this reason is... the ALU. (I do not intend to connect the square-inch 4-bit TTL ALU to this CPU.)


A development board was made that holds the external ROM and RAM for the CPU. It also has a 32 kHz crystal with divider and 6 displays, to make a digital clock. The clock is working now! The development board also has an I/O connector, so it can be used for other projects as well.


To make programming easy, an online JavaScript editor/assembler/simulator was made. The assembly code for the application can be written and assembled in your browser, and the same tool can simulate the CPU. If you open the simulator, just press Assemble and Run to see the simulated working clock (no soldering required)!

If you do this on a Raspberry Pi, you can save the assembly code to your local memory card, and you can download the assembled binary code. The Raspberry Pi can be connected to the development board, and directly burn the binary code into the flash ROM of the development board by means of a Python script.

While connected to the development board, the Raspberry Pi can also program new microcode into the square-inch CPU. The required software and Python scripts are available in the files section.


NO ALU... I could have programmed a small PIC or AVR as ALU (Wikipedia: ALU), but that's cheating. With the current microcode version, the only arithmetic it can do is comparing bytes and addressing items in a table. And the hardware won't allow much more.

For incrementing or decrementing a byte, lookup tables are set up that contain an incremented or decremented version of the lower 8 address bits. With those tables the processor can increment or decrement a byte. Nothing more is needed to do arithmetic!
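
To illustrate why increment and decrement tables are enough, here is a minimal Python sketch. The names `INC`, `DEC` and `add_bytes` are made up for this example and do not come from the project files. The tables are built here with Python arithmetic purely for convenience; on the real board they would be fixed data in memory. Once they exist, the addition itself uses nothing but table lookups and a compare.

```python
# Hypothetical sketch, not the project's microcode.
# Each table entry holds the index plus or minus one, wrapped to 8 bits.
INC = [(i + 1) % 256 for i in range(256)]
DEC = [(i - 1) % 256 for i in range(256)]


def add_bytes(a, b):
    """Add two bytes using only table lookups and a compare against zero."""
    while b != 0:      # byte compare, which the CPU can do
        a = INC[a]     # a := a + 1, by table lookup
        b = DEC[b]     # b := b - 1, by table lookup
    return a


print(add_bytes(3, 4))
```

This is slow for large operands, since it loops once per unit of `b`, but it shows the principle: the result wraps around at 256 just like a real 8-bit add.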

This was also done in the legendary HP9100 programmable calculator that was introduced in 1968. It worked with transistors and diodes, not a single digital IC! The story behind this calculator is amazing. People have tried to reverse-engineer the diode-transistor logic and came to the conclusion that the hardware of the machine could only increment or decrement digits (described by Tony Duell). Yet it could calculate with high-precision floating point numbers, including logarithms and trigonometry, at high speed. More info about the HP9100 can be found in the HP Journal 1968-09.

So the CPU can do "calculations" by using a byte as index in a table. It can also work with linked structures as in LISP, or do a "calculated" jump in the microcode. This last feature can be used to interpret the instructions.
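
The "calculated" jump can be pictured as a table of handlers indexed by the opcode byte. The Python sketch below is only an illustration of that idea: the opcode values (0x00, 0x01, 0xFF), the handler names and the `cpu` dictionary are all invented for this example and are not the real instruction set or microcode. The `pc += 1` is a Python convenience and says nothing about how the hardware steps through memory.

```python
# Hypothetical sketch of instruction dispatch through a jump table.
INC = [(i + 1) % 256 for i in range(256)]  # increment lookup table


def op_nop(cpu):
    pass


def op_inc_a(cpu):
    cpu["a"] = INC[cpu["a"]]   # "arithmetic" by table lookup


def op_halt(cpu):
    cpu["running"] = False


def op_illegal(cpu):
    raise ValueError("undefined opcode")


# One entry per possible opcode byte; the opcode is the index into the table.
HANDLERS = [op_illegal] * 256
HANDLERS[0x00] = op_nop
HANDLERS[0x01] = op_inc_a
HANDLERS[0xFF] = op_halt


def run(program):
    """Fetch each opcode and jump to its handler until a halt is executed."""
    cpu = {"a": 0, "pc": 0, "running": True}
    while cpu["running"]:
        opcode = program[cpu["pc"]]
        cpu["pc"] += 1
        HANDLERS[opcode](cpu)  # the "calculated" jump
    return cpu


print(run([0x01, 0x01, 0x00, 0xFF])["a"])
```

Because the handler table is just data, swapping it for another one gives a different instruction set, which is the same reason reprogrammable microcode allows many instruction sets.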


The instructions of the processor are defined by microcode (Wikipedia: Microcode). The microcode can be re-programmed, so many instruction sets are possible. I will now give a short description of the instructions that are available with the current microcode. 

The instructions deal with only three registers: A, PC (both 16 bits) and SP. Most instructions work with 16-bit data, so this is a 16-bit processor with an 8-bit data bus.

In the 0x0000 - 0x00FF RAM range (called Zero Page), the processor can directly access 16-bit words. 

The contents of A can be loaded from or stored into a 16-bit zero page location.

The zero-page locations can also be used as a pointer to memory. The A register can be loaded from or stored into such a memory location (indirect addressing). It can use 16-bit word format and also 8-bit byte format.
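
The addressing modes described above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not a description of the actual microcode: the byte order of a 16-bit word (low byte first) is assumed, and the function names are invented for this example. How a byte load fills the A register is not specified here, so the byte version simply returns the byte.

```python
# Hypothetical memory model; the little-endian word order is an assumption.
def read_word(mem, addr):
    """Read a 16-bit word as two bytes, low byte first (assumed)."""
    return mem[addr] | (mem[(addr + 1) & 0xFFFF] << 8)


def write_word(mem, addr, value):
    """Write a 16-bit word as two bytes, low byte first (assumed)."""
    mem[addr] = value & 0xFF
    mem[(addr + 1) & 0xFFFF] = (value >> 8) & 0xFF


def load_a_zero_page(mem, zp):
    """Direct: A := 16-bit word at zero-page location zp (0x00..0xFF)."""
    assert 0 <= zp <= 0xFF
    return read_word(mem, zp)


def load_a_indirect_word(mem, zp):
    """Indirect: the zero-page word is a pointer; A := word it points to."""
    pointer = read_word(mem, zp)
    return read_word(mem, pointer)


def load_a_indirect_byte(mem, zp):
    """Indirect, byte format: fetch the single byte the pointer refers to."""
    pointer = read_word(mem, zp)
    return mem[pointer]


mem = bytearray(0x10000)           # 64 KiB address space
write_word(mem, 0x10, 0x1234)      # zero-page variable used as a pointer
write_word(mem, 0x1234, 0xBEEF)    # data the pointer refers to
print(hex(load_a_indirect_word(mem, 0x10)))
```

The store instructions would be the mirror image, writing A through the same direct or indirect address.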

The upper or lower byte section of the A register can be loaded with an immediate byte.

Table lookups can be done in a single instruction, and bytes can be compared with a single instruction.

There is a stack, that allows pushing or popping from zero-page variables, and that supports CALL and RET instructions. 

Conditional and unconditional branches are available.

For more information, refer to the files section or the logs.


The hardware registers UPC, H, L and B are only visible to the microcode and not to the instructions of the CPU. The instructions of the CPU only deal with architectural registers in RAM.


Programming without an ALU is very doable, as I've shown in my project NeuronZoo. In the NeuronZoo project, the "neurons" are like software objects. The eight outgoing connections of a neuron, the axons, are like eight fields within the object structure that can contain pointers to other objects. The neurotransmitters are like processor registers that point to a certain object. Processing is done by following links in the structure and changing the pointers in the object structure. Different execution paths can be followed, depending on the existence of a link. Anyone with a theoretical computer science background can tell you that such a system is Turing complete, meaning that it can execute every possible computer program if it is given enough memory and time. The NeuronZoo project demonstrates this by adding numbers and generating chess moves. [ End of NeuronZoo commercial ].
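
As a rough illustration of computing with links only, the Python sketch below represents a number as a chain of objects and adds two numbers by re-pointing one link. It is not how NeuronZoo itself is implemented: the `Node` class, the use of field 0 as a "next" link and the unary number encoding are all assumptions made for this example. Only the eight pointer fields per object are taken from the description above.

```python
# Hypothetical sketch: arithmetic by following and changing pointers.
class Node:
    """An object with eight pointer fields, like the eight axons of a neuron."""

    def __init__(self):
        self.links = [None] * 8


NEXT = 0  # assumption: field 0 points to the next node in a chain


def make_number(n):
    """Build a chain of n nodes; an empty chain (None) stands for zero."""
    head = None
    for _ in range(n):
        node = Node()
        node.links[NEXT] = head
        head = node
    return head


def add(x, y):
    """Add by walking to the end of chain x and linking chain y onto it."""
    if x is None:                              # branch on a missing link
        return y
    cursor = x
    while cursor.links[NEXT] is not None:      # follow links to the end
        cursor = cursor.links[NEXT]
    cursor.links[NEXT] = y                     # change one pointer
    return x


def chain_length(head):
    """Count the nodes; only used here to check the result."""
    count = 0
    while head is not None:
        count += 1
        head = head.links[NEXT]
    return count


print(chain_length(add(make_number(3), make_number(4))))
```

The `add` function contains no arithmetic at all: it only tests whether a link exists, follows links, and rewrites one pointer.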