
Nits Processor

8-bit TTL technology processor

Ced
Nits Processor V1
The goal is to build an 8-bit TTL based CPU and learn from the experience.
It is based on the SAP-1 and on the designs of Ben Eater, James Bates and James Sharman, with many changes and tweaks.

Nits Processor V2
V2 is an improved version of the Nits Processor with expanded ALU, 16-bit address space, more registers, interrupt manager and rich instruction set.

This project was triggered by the incredible set of videos from Ben Eater and James Bates. Many thanks to both of them. 

I do not pretend to be an expert in computer hardware. As a professional in the software side of things, I wanted to learn more and build myself a working TTL based CPU to improve and try out ideas.

Even though this build is based on the designs of both Ben Eater and James Bates, I wanted to really understand every bit of the design and therefore I have made many changes along the way. I do not pretend they are better; they are mainly driven by the wish to test things (different choice of chips, different approach, try and learn).

One thing that was really improved is the instruction set. Choosing 8-bit instructions and 256 bytes of memory is what made this possible.

Once the Nits Processor was up and running, I started working on an improved V2.

These log posts are related to Nits V2:

What next?

Improved Address Register

and more to come...

These log posts are related to Nits V1:

General Architecture

Design Principles

The clock Module (Clock part 1)

All about registers (Registers part 1)

Program Counter (Registers part 2)

Setting, Terminating, Displaying the bus

More about the clock (Clock part 2)

Selecting the right register IC (Registers part 3)

Micro Instructions

Decoder register and timing issues (Register part 4)

Switches and debouncing

It's alive

Assembly compiler and memory loader

  • 1 × 74HCT00 Four 2-input NAND gate
  • 1 × 74HCT02 Four 2-input NOR gate
  • 1 × 74HCT107 Two J-K flip flop with clear
  • 1 × 74HCT138 3 to 8 line decoder
  • 1 × 74HCT245 8-bit bi-directional bus transceiver with 3-state output

View all 15 components

  • Improved Address Registers

    Ced, 01/20/2020 at 15:28 (0 comments)

    Our current CPU has many limitations, including on the register level. This post will look at how to improve the registers that handle memory or output addresses.

    First, let's look at the current register capabilities:

    • there are basically only two memory address registers: the Program Counter (PC) and the Memory Address Register (MAR)
    • these registers are 8-bit registers, therefore they can only address 256 bytes of memory or IO
    • Only one (the Program Counter) can increment, but it cannot address the memory directly and has to go through the MAR
    • None can decrement
    • There is no stack pointer
    • There is no index register

    So let's try to design a single General Purpose Address Register (GPAR) that would provide:

    • 16 bits in order to address 64k words of memory space
    • increment and decrement capability in order to serve also as PC (Program Counter) and SP (Stack Pointer) without the need of the ALU
    • Read and Write capability from the databus which is only 8-bit, in order to access both MSB (Most Significant Byte, the byte of higher value of the 16-bits) and LSB (Least Significant Byte, the byte of lower value of the 16-bit) 
    • Write capability to the 16-bit address bus

    Now, what types of instructions should these registers be able to handle?

    • LD PC-MSB, D2 : meaning load the MSB of the PC address register with the content of register D2
    • LD D2, IX-LSB : meaning load D2 with the content of the LSB of the index pointer IX
    • LD IX-LSB, IY-MSB : meaning load the LSB part of IX with the content of the MSB part of IY
    • INC IX

    This shows that at any given time, it must be possible to output any part of any register AND load any part of any other register.

    Possibly even within the same register:

    • LD IX-MSB, IX-LSB : meaning load the MSB part of IX with the LSB value of IX

    We therefore need to separate the output action functions from the input action functions.

    For output and reset:

    • Publish MSB to data bus : -msb-out, asynchronous
    • Publish LSB to data bus : -lsb-out, asynchronous
    • Publish value to Address bus : -add-out, asynchronous
    • Clear value : -clear, asynchronous

    For input and other:

    • load MSB from data bus : -msb-in, on clock rising edge
    • load LSB from data bus : -lsb-in, on clock rising edge
    • Increment value : -inc, on clock rising edge
    • decrement value : -dec, on clock rising edge
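
    The functions listed above can be sketched as a small behavioural model. This is only an illustration in Python, not the hardware design: the class name `AddressRegister` and its method names are hypothetical (chosen to mirror the signal names), and the wrap-around on increment and decrement is an assumption.

```python
class AddressRegister:
    """Behavioural sketch of one 16-bit General Purpose Address Register."""

    def __init__(self):
        self.value = 0

    # Output / reset side (asynchronous in the hardware).
    def msb_out(self):
        return (self.value >> 8) & 0xFF

    def lsb_out(self):
        return self.value & 0xFF

    def add_out(self):
        return self.value

    def clear(self):
        self.value = 0

    # Input / other side (one call stands for one rising clock edge).
    def msb_in(self, byte):
        self.value = ((byte & 0xFF) << 8) | (self.value & 0x00FF)

    def lsb_in(self, byte):
        self.value = (self.value & 0xFF00) | (byte & 0xFF)

    def inc(self):
        # Assumption: the counter wraps around at 16 bits.
        self.value = (self.value + 1) & 0xFFFF

    def dec(self):
        self.value = (self.value - 1) & 0xFFFF


ix = AddressRegister()
ix.lsb_in(0x34)
ix.msb_in(0x12)   # two 8-bit loads build one 16-bit address
ix.inc()
print(hex(ix.add_out()))
```

    Splitting the methods into an output group and an input group mirrors the separation of output and input functions described above.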

    In the end we need:

    • 3 bits to select the output register (value 0b000 serves as not selected)
    • 2 bits to select the output/reset function (MSB, LSB, address bus or reset)
    • 3 bits to select the input register (value 0b000 serves as not selected)
    • 2 bits to select the input/other function (input MSB, input LSB, inc, dec)

    With a total of 10 bits, we can perform any type of function on the General Purpose Address Registers.

    Here is the naming convention for the register select bits:

    • None: code 0b000
    • Program Counter : PC, code 0b001
    • Stack Pointer : SP, code 0b010
    • Index Pointer 1 : IX, code 0b011
    • Index Pointer 2 : IY, code 0b100

    Here is the convention for the output action (all are asynchronous):

    • Publish MSB to data bus: -msb-out, code 0b00
    • Publish LSB to data bus: -lsb-out, code 0b01
    • Publish value to Address bus : -add-out, code 0b10
    • Clear value : -clear,  code 0b11

    Here is the convention for the input/other functions (all are on the clock rising edge):

    • load MSB from data bus: -msb-in, code 0b00
    • load LSB from data bus: -lsb-in, code 0b01
    • Increment value : -inc, code 0b10
    • decrement value : -dec, code 0b11
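
    The three code tables above can be combined into one 10-bit control word. The Python sketch below is only an illustration: the function name `gpar_control_word` is hypothetical, and the field order (output register, output function, input register, input function, from the most significant bits down) is an assumption.

```python
# Codes copied from the conventions listed above.
REGISTERS = {"NONE": 0b000, "PC": 0b001, "SP": 0b010, "IX": 0b011, "IY": 0b100}
OUT_FUNCS = {"msb-out": 0b00, "lsb-out": 0b01, "add-out": 0b10, "clear": 0b11}
IN_FUNCS = {"msb-in": 0b00, "lsb-in": 0b01, "inc": 0b10, "dec": 0b11}


def gpar_control_word(out_reg="NONE", out_func="msb-out",
                      in_reg="NONE", in_func="msb-in"):
    """Pack the four fields into 10 bits (assumed layout: 3 + 2 + 3 + 2)."""
    return (
        (REGISTERS[out_reg] << 7)
        | (OUT_FUNCS[out_func] << 5)
        | (REGISTERS[in_reg] << 2)
        | IN_FUNCS[in_func]
    )


# LD IX-MSB, IX-LSB: publish the LSB of IX and load it into the MSB of IX.
word = gpar_control_word("IX", "lsb-out", "IX", "msb-in")
print(format(word, "010b"))
```

    When a register field is NONE, the matching function field is a don't-care; the defaults simply leave it at 0b00.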

    Let's now review what the 10 bits would be for the instructions listed above:

    • LD PC-MSB, D2
      • select output register NONE: 0b000 (D2 is not considered as we are focusing on the GPAR only)
      • select output function X: 0b00
      • select input register PC: 0b001
      • select input function -msb-in: 0b00
      • Result is: 0b0000000100
    • LD D2, IX-LSB
      • select output register IX: 0b011
      • select output function -lsb-out: 0b01
      • select input register  NONE :  0b000
      • select input function X: 0b00
      • Result...

  • What next?

    Ced, 01/14/2020 at 10:23 (0 comments)

    The first version of the Nits Processor is now finished. It is Turing complete with a very basic set of instructions: a program can be uploaded and run, and a result can be shown on the 7-segment display.

    Here are a few photos.

    The instruction decoder (3 EEPROMs for 17 signals and a set of gates to decode the flags):

    The memory (256 bytes of non-volatile static RAM, with the Address Register and the memory value display):

    The two flat cables come from the memory loader module (Arduino based). One is the Address bus and one is the data bus. They are used in PROG mode to upload the program to the memory.


    It is now time to think about the next steps. What are the current limitations and how can they be improved?

    Improvements can be of 3 sorts:

    • Improve the instruction set
    • Improve the architecture
    • Improve the build

    Improve the instruction set

    The instruction set is very limited and really needs to be expanded to provide usable capabilities. For instance:

    • Add basic logic functions (And, Or, Not, Exclusive or)
    • Add shift functions (Shift, Shift Circular, Shift with carry)
    • Add compare functions (zero, equal, greater than)
    • Add Push and Pop capability (requires a dedicated stack pointer register)
    • Add Call and Return
    • Add bit management (test flags, store flags, bit operations)

    Improve the architecture

    With only 2 registers, no stack pointer and only 256 bytes of memory for both data and program, the current architecture can really be improved:

    • Expand the address bus to 16 bits (hence 64 Kbytes of memory). However, this requires many changes because the address bus is then twice the width of the data bus and ALU, which creates a challenge when computing addresses
    • Expand the number of registers, at least to 4 General Purpose Registers
    • Separate memory from Input-Output. This will provide double the addressing capability
    • Add a way to interact with the system, for instance with a proper serial interface
    • Add a ROM with basic functions including initial setup and serial management
    • Add a stack pointer and index registers to point into memory
    • Add interrupt management (required for the serial interface)
    • expand the ALU capabilities

    Improve the build

    In its current form (built on breadboards), the CPU works well at 1 MHz; however, with a 4 MHz oscillator, it breaks. This is to be expected considering the capacitance of the breadboards and how the cables are set up.

    It would therefore be interesting to improve on the design with:

    • a PCB backplane to handle all the buses and the clocks with proper connectors (I'm investigating the 96-pin DIN41612 connector)
    • PCB modules for very stable elements such as registers, clock
    • Keep the breadboards for test modules and modules that keep being improved (ALU, IO)
    • improve on test modules

  • Assembly compiler and memory loader

    Ced, 12/30/2019 at 18:12 (0 comments)

    Two new pieces were added to the system :

    • A simple program to compile the assembly code into binary
    • An Arduino-based RAM loader

    Since the beginning of the project I had to write the binary code by hand and upload it using DIP switches, which is very error prone and takes forever.

    My assembly compiler is very basic, written in PHP (just because it's the language I'm most comfortable with). The input is an assembly file in a custom format, such as:

    ;
    ; Brute force find three consecutive integers whose sum is equal to 204
    
    var x1
    var x2
    var x3
    var result
    const expected 204
    
    init
    	LD A, 0
    	LD [x1], A
    
    compute_x
    	; compute the 3 values and total
    	LD A, [x1]
    	LD B, 1
    	ADD A
    	OUT A
    	LD [x1], A
    ...

    It handles:

    • comments (anything that follows the semicolon)
    • var definitions (only unsigned bytes)
    • constant definition
    • labels 

    It produces a binary file and a human-readable processed file, which is very useful to debug both the software and the hardware:

    VAR x1 at address 11111111
    VAR x2 at address 11111110
    VAR x3 at address 11111101
    VAR result at address 11111100
    CONST expected = 11001100
    Label init
      00000000  LD A, 0x0                       00100110 00000000
      00000010  LD [x1], A                      01001000 11111111
    Label compute_x
      00000100  LD A, [x1]                      00100101 11111111
      00000110  OUT A                           00011000
      00000111  LD B, 0x1                       00101110 00000001
    

    Each line of compiled code contains:

    • the start address of the code
    • the assembly code
    • the binary code once compiled (one or two bytes depending on the operand)
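
    The processing described above can be sketched in a few lines. The real assembler is written in PHP and is not shown here, so the Python below is only a guess at the approach: the `assemble` and `classify` names are hypothetical, only the five opcodes visible in the listing above are included, and allocating variables downward from address 0xFF is inferred from that listing.

```python
# Opcodes read off the processed listing; anything else is unknown here.
OPCODES = {
    "LD A,i": 0b00100110,
    "LD A,[m]": 0b00100101,
    "LD B,i": 0b00101110,
    "LD [m],A": 0b01001000,
    "OUT A": 0b00011000,
}


def classify(operand):
    """Return (kind, name): a register, a [memory] reference or an immediate."""
    if operand in ("A", "B"):
        return operand, None
    if operand.startswith("[") and operand.endswith("]"):
        return "[m]", operand[1:-1]
    return "i", operand


def assemble(source):
    variables, constants, labels = {}, {}, {}
    program = bytearray()
    next_var = 0xFF  # assumption: variables grow down from the top of memory
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].rstrip()  # drop comments
        if not line.strip():
            continue
        words = line.split()
        if words[0] == "var":
            variables[words[1]] = next_var
            next_var -= 1
        elif words[0] == "const":
            constants[words[1]] = int(words[2])
        elif not line[0].isspace():  # unindented word: a label
            labels[words[0]] = len(program)
        else:  # indented line: an instruction
            mnemonic, _, rest = line.strip().partition(" ")
            kinds, value = [], None
            for operand in (p.strip() for p in rest.split(",") if p.strip()):
                kind, name = classify(operand)
                kinds.append(kind)
                if kind == "[m]":
                    value = variables[name]
                elif kind == "i":
                    value = constants[name] if name in constants else int(name, 0)
            program.append(OPCODES[(mnemonic + " " + ",".join(kinds)).strip()])
            if value is not None:
                program.append(value)  # second byte: address or immediate
    return bytes(program), labels, variables


SOURCE = """
var x1
init
\tLD A, 0
\tLD [x1], A
compute_x
\tLD A, [x1]
\tOUT A
\tLD B, 1
"""
binary, labels, variables = assemble(SOURCE)
print(" ".join(format(b, "08b") for b in binary))
```

    A single pass is enough here because none of the five known opcodes takes a label as operand; supporting jumps to labels defined later would need a second pass.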

    Once the binary file is obtained, the code has to be loaded into the memory. For this I used an Arduino Nano connected to a 74HC595, in a way very close to Ben Eater's EEPROM programmer.

    The Arduino takes over the Address and Data buses of the memory by activating the PROG mode, which disconnects the memory from the bus through 74HCT245 chips. Once the memory is isolated, the Arduino goes through all the needed addresses using the 74HC595 (a shift register) and uploads the data.

    Note : 

    • a dedicated signal is used to write the value on the bus to memory
    • Lines A0 to A2 of the Arduino are set to digital
    • Yes, it would be possible to connect all 8 lines of Data and 8 lines of Address directly to the Arduino, but I wanted to try out the shift register for the day I will have more lines.

    Overall everything works and it is a good way to finish 2019. I hope 2020 will bring new features such as:

    • PCB backplane with a 5A power supply
    • Stack Pointer register with associated Push and Pop
    • new ALU with compare, logic operations, etc

  • It's alive

    Ced, 12/17/2019 at 11:06 (0 comments)

    Happy to say that the Nits CPU is now alive and Turing complete. The instruction set is quite small but it runs at both slow (300 Hz) and fast (1 MHz) speed.


    The instruction set includes:

    • NOP : do nothing
    • HALT : halt the computer, at this stage, only a manual reset can restart it
    • ADD : add content of register A and register B and store the result in register A
    • SUB : subtract the value stored in register B from the value stored in register A and store the result in register A
    • OUT A : output the content of register A as a decimal unsigned value on the display
    • OUT B : output the content of register B as a decimal unsigned value on the display
    • LD A, B : copy the content of register B in register A
    • LD A,  x : copy the content of the memory at address x in register A
    • LD A, i: copy the value i in register A
    • LD B, A : copy the content of register A in register B
    • LD B,  x : copy the content of the memory at address x in register B
    • LD B, i: copy the value i in register B
    • LD x, A : copy the content of register A at memory address x
    • LD x, B : copy the content of register B at memory address x
    • JMP x : unconditional jump to address x
    • JMPC x : jump to address x if carry flag is set
    • JMPZ x : jump to address x if zero flag is set
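
    To see how these instructions work together, here is a minimal software model of the instruction set. It is a sketch in Python under several assumptions that the list above does not settle: instructions are symbolic tuples rather than real opcodes, jump targets are indexes into the program list rather than byte addresses, a memory operand is written as a one-element list such as `[0xFF]`, and both ADD and SUB update the flags (carry meaning overflow for ADD and borrow for SUB). The function name `run` is hypothetical.

```python
def run(program, max_steps=1000):
    """Execute symbolic instructions; return the values sent to the display."""
    reg = {"A": 0, "B": 0}
    memory = [0] * 256
    carry = zero = False
    pc = 0
    display = []
    for _ in range(max_steps):
        op, *args = program[pc]
        pc += 1
        if op == "HALT":
            break
        if op in ("ADD", "SUB"):
            total = reg["A"] + reg["B"] if op == "ADD" else reg["A"] - reg["B"]
            carry = not 0 <= total <= 0xFF   # assumed flag convention
            reg["A"] = total & 0xFF
            zero = reg["A"] == 0
        elif op == "OUT":
            display.append(reg[args[0]])
        elif op == "LD":
            dest, src = args
            if isinstance(src, list):        # [x]: read memory at address x
                value = memory[src[0]]
            elif isinstance(src, int):       # immediate value i
                value = src & 0xFF
            else:                            # another register
                value = reg[src]
            if isinstance(dest, list):       # [x]: write memory at address x
                memory[dest[0]] = value
            else:
                reg[dest] = value
        elif op == "JMP" or (op == "JMPC" and carry) or (op == "JMPZ" and zero):
            pc = args[0]
        # NOP and jumps that are not taken fall through.
    return display


# Add 3 to A repeatedly until the 8-bit addition overflows.
PROGRAM = [
    ("LD", "A", 250),
    ("LD", "B", 3),
    ("ADD",),
    ("OUT", "A"),
    ("JMPC", 6),
    ("JMP", 2),
    ("HALT",),
]
print(run(PROGRAM))
```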

    It is now time to improve on it.

  • Switches and debouncing

    Ced, 12/03/2019 at 16:20 (0 comments)

    Now that the first version of my CPU is running, it is time to fix some issues. One of them is manual switches and debouncing.

    I will not write one more article about debouncing, as all this is very well detailed in Elliot Williams' great article "Debounce your noisy buttons".

    In this first version of my CPU I ended up with the following switches:

    • Bus Publish : This pushbutton publishes the value of a DIP switch to the bus
    • Master Reset : This pushbutton resets the computer
    • Memory manual Write : This pushbutton writes the data set on the data DIP switch at the memory address set on the address DIP switch [PROG mode only]
    • PROG / BUS selector : this is a two way selector used to program the memory (PROG mode) or to use the memory through the regular bus and Memory Address Register (BUS mode)
    • MANUAL / AUTO clock selector : this is a two way selector used to select between the automatic clock (slow or fast) or the manual pulse pushbutton
    • SLOW / FAST clock selector : this is a two way selector to switch between the 555-based slow clock (between 0.5 Hz and 300 Hz) and the fast crystal oscillator based clock (1 MHz) [valid for AUTO clock mode only]
    • MANUAL / AUTO uCode selector : this is a two way selector disabling the microcode decoder in order to use manual action signals (debugging purposes only)

    3 are pushbuttons and 4 are two way selectors (slide buttons).

    In his article, Elliot explains how to debounce using an RC (Resistor/Capacitor) circuit and a Schmitt trigger inverter. The inverter can be found in the 74HCT14 IC.

    Here is an example of a complete debouncer. Note that the signal is inverted : when pressing the button, the signal ACTION_MANUAL_BUS goes low.

    Here is a short description of how it works :

    When the switch is open, the capacitor is charged through the 10k + 10k resistors and reaches VCC. The output signal is then 0V (inverted input).

    When the switch is pressed, the capacitor is discharged through the 10k resistor. It therefore takes about 1 ms to reach 1/3 VCC and trigger the change of state of the inverter.

    When the switch is released, the capacitor is charged again. It reaches 2/3 VCC in about 2 ms and triggers the change of state of the inverter.
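
    The two delays follow from the usual RC charge and discharge equations. The Python sketch below checks the orders of magnitude; the capacitor value is not given in this post, so the 100 nF used here is an assumption (it is the value that makes the 1 ms and 2 ms figures come out), and the function names are hypothetical.

```python
import math


def discharge_time(r_ohms, c_farads, fraction):
    """Time for a capacitor at VCC to fall to `fraction` of VCC through R."""
    return -r_ohms * c_farads * math.log(fraction)


def charge_time(r_ohms, c_farads, fraction):
    """Time for an empty capacitor to rise to `fraction` of VCC through R."""
    return -r_ohms * c_farads * math.log(1 - fraction)


C = 100e-9  # assumed capacitor value, not stated in the post

# Press: discharge through 10k down to 1/3 VCC.
print(f"press:   {discharge_time(10e3, C, 1 / 3) * 1e3:.2f} ms")
# Release: charge through 10k + 10k up to 2/3 VCC.
print(f"release: {charge_time(20e3, C, 2 / 3) * 1e3:.2f} ms")
```

    Both delays are R x C x ln(3), so the release takes twice as long as the press simply because the charging resistance is doubled.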

    For the slide switches (break-before-make type), we need to prevent any oscillation between the two states and prevent an unknown state. The best solution here is a simple SR latch.

    An SR latch (in this case an SR NOR latch, as it is built using 2 NOR gates) can only be in 1 of 2 states. 

    When moving the switch from one position to the other, what may happen is the following:

    • bounce off the first state
    • stay undefined (in between the two states)
    • bounce on the second state

    In such a situation, the SR latch will prevent any oscillation following the reasoning:

    • when the switch bounces off, the latch stays in the same state (set or reset)
    • when the switch is undefined, the latch stays in the same state
    • the first time the switch touches the other position, the latch toggles, and even if the switch bounces it stays in that second state.

    So in the end, I have built a dedicated breadboard with all the switches, RC circuits and ICs to debounce everything and get perfectly clean manual signals.
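
    This reasoning can be checked with a gate-level model of the latch. The Python sketch below is an idealised illustration with no propagation delays; the wiring (one switch contact on S, the other on R, both inputs low while the switch is in between) and the names `nor` and `settle` are assumptions.

```python
def nor(a, b):
    return int(not (a or b))


def settle(s, r, q, qn):
    """Iterate the two cross-coupled NOR gates until the outputs are stable."""
    for _ in range(4):
        new_q = nor(r, qn)
        new_qn = nor(s, new_q)
        if (new_q, new_qn) == (q, qn):
            break
        q, qn = new_q, new_qn
    return q, qn


# (S, R) samples while sliding the switch from the "set" to the "reset" contact.
SEQUENCE = [
    (1, 0),                  # resting on the first contact
    (0, 0), (1, 0), (0, 0),  # bouncing off the first contact
    (0, 0),                  # travelling: neither contact touched
    (0, 1), (0, 0), (0, 1),  # bouncing on the second contact
]

q, qn = 0, 1
outputs = []
for s, r in SEQUENCE:
    q, qn = settle(s, r, q, qn)
    outputs.append(q)
print(outputs)
```

    The output changes exactly once, on the first touch of the second contact, however many times the switch bounces.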

  • The need for a decoder register

    Ced, 11/20/2019 at 15:27 (0 comments)

    The decoder section of the architecture diagram shows a 13-bit decoder register:

    What is the rationale behind the need for this register?

    In order to properly set the action signals, 13 bits are needed to address the EEPROM used to implement the micro-code:

    • 3 bits for the steps (count from 0 to 7 steps maximum, however most macro-instructions will only need 3 or 4 steps)
    • 8 bits for the instructions (not all 256 capabilities will be used)
    • 2 bits for a combination based on the flags register
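
    As an illustration, the three fields can be packed into one 13-bit EEPROM address as follows. This is a Python sketch only: the bit order (step in the low bits, then the instruction, then the flags) is an assumption, since the actual wiring is not described here, and the function name is hypothetical.

```python
def microcode_address(step, instruction, flags):
    """Build the 13-bit EEPROM address from its three fields.

    Assumed layout: flags (2 bits) | instruction (8 bits) | step (3 bits).
    """
    if not (0 <= step < 8 and 0 <= instruction < 256 and 0 <= flags < 4):
        raise ValueError("field out of range")
    return (flags << 11) | (instruction << 3) | step


# Step 1 of instruction 0b00100110 with both flag bits at 0.
print(format(microcode_address(1, 0b00100110, 0), "013b"))
```

    With 3 + 8 + 2 = 13 bits, the address space is 8192 entries, which matches the 13 address lines of an 8K x 8 EEPROM.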

    However, when looking at the specs of the EEPROM (AT28C64) we can see that in the worst case, between the stabilization of the address inputs and the availability of proper stable output it can take up to 250ns. This means that if bits from the address are not stable the output might not be valid.

    This is why a register that takes a stable snapshot of the 13 bits helps provide stable action signals.

    This is actually a perfect use case for the 74HCT273 register we already talked about. What we need is the capability to snapshot the status of the 3 elements (instruction register, step, flags) at a given time and keep it stable until the next clock cycle:

    Now the question is : when are the values (step, instruction register, flags) changed? When in the overall instruction cycle are they stable enough that I can snapshot them?

    Well if we look at the clock cycle we can identify 3 moments:

    - main clock (rising edge) : when the actions take place (add, load, etc)

    - step clock : when the step is incremented

    - uCode clock : when to hold the value of the 3 elements that constitute the micro-instruction register

    So we need 3 clock rising edges per step cycle:

    If the main clock signal is at 1 MHz, we get 1 µs between general clock rising edges. So that is:

    • 500ns between T0 and T1
    • 500ns between T1 and T2
    • 1000ns between T2 and the next T0 which is more than enough to stabilize the EEPROM values

  • Micro Instructions

    Ced, 11/19/2019 at 10:22 (0 comments)

    All the initial modules are now built, so it is time to document the list of Micro Instructions.

    Here is a map of all the available Micro Instructions assigned to each module:

    [unless specified, all actions are synchronous on the CLOCK_MAIN clock signal]


    A register

    • CLA : Clear A : reset value to 0 - Asynchronous
    • AI : A register In : Load A register value from bus
    • AO : A register Out : publish A register value to bus

    B register

    • CLB : Clear B : reset value to 0 - Asynchronous
    • BI : B register In : Load B register value from bus
    • BO : B register Out : publish B register value to bus

    ALU & Flags

    • CLF : Clear Flags : reset flags value to 0 - Asynchronous
    • SUB : Subtract function : apply subtract function (default is Add)
    • SO : ALU Out : publish ALU result value to bus
    • FI : Flags In : set flags based on current value of ALU output

    Instruction register

    • II : Instruction In : Load instruction register value from bus

    Step counter

    • CLstep : Clear Step : reset counter to 0 - synchronous on CLOCK_STEP clock signal

    Clock 

    • HALT : stops the clock

    Program counter

    • CLC : Clear Program Counter : reset Program Counter value to 0 - Asynchronous
    • CE : Program Counter enable : increment Program Counter value by 1
    • CI : Program Counter In : load Program Counter value from bus
    • CO : Program Counter Out : publish Program Counter value to bus

    Memory Address Register

    • RI : Memory Address Register In : load Memory Address Register value from bus

    Memory

    • MI : Memory In : Load value from bus and store it in memory at the address specified by the Memory Address Register
    • MO : Memory Out : publish value of memory at address defined by the Memory Address Register to the bus

    Output Register

    • OI : Output In : load Output register value from bus

    Some of these signals will be connected to the Instruction Decoder, some are connected to the reset switch only.
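
    One way to picture how these signals end up in the microcode is to treat each one as a bit of a control word. The Python sketch below is hypothetical: it assumes the four asynchronous clear signals (CLA, CLB, CLF, CLC) are the ones wired to the reset switch only, which leaves 17 decoder signals; the bit positions, the split into three bytes and the two-step fetch sequence (in the SAP-1 style) are illustrative assumptions, not the actual microcode.

```python
from enum import IntFlag, auto


class Signal(IntFlag):
    """Decoder-driven action signals; bit positions are arbitrary here."""
    AI = auto()
    AO = auto()
    BI = auto()
    BO = auto()
    SUB = auto()
    SO = auto()
    FI = auto()
    II = auto()
    CLSTEP = auto()
    HALT = auto()
    CE = auto()
    CI = auto()
    CO = auto()
    RI = auto()
    MI = auto()
    MO = auto()
    OI = auto()


def eeprom_bytes(word):
    """Split a control word into three bytes, one per 8-bit EEPROM."""
    return [(int(word) >> shift) & 0xFF for shift in (0, 8, 16)]


# Hypothetical fetch: PC to the Memory Address Register, then memory to the
# instruction register while the Program Counter increments.
FETCH = [Signal.CO | Signal.RI, Signal.MO | Signal.II | Signal.CE]

# Hypothetical execute step for LD A, B: B drives the bus, A loads it.
LD_A_B = Signal.BO | Signal.AI

for step in FETCH + [LD_A_B]:
    print(eeprom_bytes(step))
```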

  • Selecting the right Register IC

    Ced, 11/16/2019 at 13:58 (0 comments)

    After watching many YouTube videos on breadboard TTL computers, I noticed that there is often a misunderstanding regarding the types of register TTL chips available and their best use cases.

    The most common is the 74HCT173. This one brings everything you need from a register IC:

    • 4 bit register
    • Common clock signal (rising edge)
    • Input enable signal (/E1 and /E2)
    • Output enable signal (/OE1, /OE2)
    • 3 state output
    • Asynchronous master reset

    The only drawbacks are that this IC is only 4 bits and the pinout is really weird.

    However, in many breadboard computers, designers use the 74HCT273 without really understanding the differences. The 273 has the following features:

    • 8 bit register
    • Common clock signal (rising edge)
    • Asynchronous master reset

    This IC has no Input Enable signal and no Output enable signal. What it means is that at EACH clock signal, the input is latched and that the output is always on. In other words : The output mimics the exact value of the input as it was on the previous rising edge of the clock.

    No big deal regarding the output as we can use a 74HCT245 to buffer the bus.

    The issue is with the input. Do we want to latch the input value at each rising edge of the clock ? Most of the time NO !  We want an input enable signal. Some would say, it's easy: just use an AND gate between the clock and the input enable and it will work. This is not true and should not be done. Here's why:

    Example 1, the Input Enable is activated a bit before the clock rises. This is basically what we would expect. The input Enable signal activates the clock, the register latches the value on the bus at the time of the rising edge. The AND between the clock and the Input Signal looks like the clock when activated:

    Example 2, the Input Enable signal is not really aligned with the clock, it misses the first rising edge and stays on for the second (note that the duration of the Input Enable is the same as above):

    In that situation, the AND signal provides 2 rising edges. Therefore the register will latch (if fast enough) the information twice, at points in time that are not expected. It will miss the first rising edge of the clock, latch on the rising edge of the Input Enable signal (unexpected) and then latch again on the second clock rising edge.
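
    The two examples can be reproduced with sampled waveforms. The Python sketch below is an idealised illustration (no propagation delays); the sample values and the names `gate` and `rising_edges` are made up for the example.

```python
def gate(clock, enable):
    """AND the clock with the Input Enable, sample by sample."""
    return [c & e for c, e in zip(clock, enable)]


def rising_edges(signal):
    """Indexes of the samples where the signal goes from 0 to 1."""
    return [i for i in range(1, len(signal)) if signal[i - 1] == 0 and signal[i] == 1]


#               0  1  2  3  4  5  6  7  8  9 10 11
CLOCK =        [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
# Example 1: the enable starts just before the clock rises.
EN_ALIGNED =   [0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Example 2: same duration, but it starts after the clock has risen.
EN_LATE =      [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]

print("clock edges:    ", rising_edges(CLOCK))
print("aligned enable: ", rising_edges(gate(CLOCK, EN_ALIGNED)))
print("late enable:    ", rising_edges(gate(CLOCK, EN_LATE)))
```

    In the second case the gated signal rises once at a moment that is not a clock edge and once more on the next clock edge, so the register would latch twice.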

    It is not recommended to apply gates on the clock signal to enable/disable clocks for the microinstructions. The 74HCT273 is not recommended in our use cases.

    However, there is a nice IC that matches better the needs for typical registers: the 74HCT377. It provides the following features:

    • 8 bit register
    • Common clock signal (rising edge)
    • Input Enable signal (/CLKEN)

    It still lacks some nice features of the 173 (master reset, output enable) but it is quite convenient to get 8 bits with Input Enable.

    Here is a good example of the usage of the 377 for our Instruction Register (it doesn't need to be cleared and the output is always on):

  • More about the clock

    Ced, 11/06/2019 at 21:49 (0 comments)

    In a TTL type CPU, there is a need for two rising edges of the clock clearly separated:

    • One rising edge to trigger the decode of the microinstruction and set the action signals
    • One rising edge to trigger the action itself (update the register,  etc)

    Most TTL CPUs use a single clock with an inverted signal to produce two rising edges per clock cycle (one from the clock signal, one from the inverted clock signal).


    However, this can be risky, as one rising edge may happen while the other clock signal is still active, creating situations that are very difficult to trace.

    So one alternative would be to clearly separate the two signals with absolutely no overlapping time.

    Here is an example (all screenshots with a 1 MHz input clock signal from a quartz oscillator):

    In such a situation, a rising edge can never happen while the other signal is active.

    So how can you build such a signal ? Let's look at this simple circuit and analyze what is going on:

    Let's input a clock on the JK flip-flop (pin 12). The J and K inputs are both active, meaning that the flip-flop toggles its outputs (Q at pin 3, inverse Q at pin 2) at each clock signal. Note that the 74LS107 triggers on the falling edge.

    Below is the clock signal (pin 12 of the 74LS107, top signal) and the Q output (pin 3 of the 74LS107, bottom signal). 

    At each falling edge, the J-K flip flop toggles. It is therefore a frequency divider by 2. The output is half the frequency of the clock with a 50% duty cycle (ratio between high and low duration). 

    What happens then if we AND the clock and the Q output ?

    Below is the graph of the clock signal (pin 12 of the 74LS107, top signal) and the output of the AND gate (pin 3 of the 74LS08, bottom signal):

    We get the high part of the clock signal but only once out of two.

    If at the same time, we AND the clock and the inverse Q we get the two reciprocating signals.

    Below is the output of both AND gates (pin 3 of the 74LS08, top signal and pin 6 of the 74LS08, bottom signal) :

    With these 2 clock signals, there is no chance of getting a rising edge while the other signal is high. The drawback, however, is that each clock now runs at half the input frequency.
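
    The behaviour of this circuit can be sketched in a few lines of Python. This is an idealised model with no propagation delays; the function name `two_phase` and the sampled clock are illustrative only.

```python
def two_phase(clock, q=0):
    """Derive two non-overlapping clocks from one input clock.

    Q toggles on every falling edge (as the 74LS107 does with J = K = 1);
    the two outputs are clock AND Q and clock AND (not Q).
    """
    phase_a, phase_b = [], []
    previous = clock[0]
    for level in clock:
        if previous == 1 and level == 0:  # falling edge: toggle the flip-flop
            q ^= 1
        phase_a.append(level & q)
        phase_b.append(level & (q ^ 1))
        previous = level
    return phase_a, phase_b


CLOCK = [0, 1, 0, 1, 0, 1, 0, 1]
phase_a, phase_b = two_phase(CLOCK)
print(phase_a)
print(phase_b)
```

    Each output carries every other high pulse of the input clock, so the two signals are never high at the same time.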

  • Setting / viewing the bus

    Ced, 11/05/2019 at 14:10 (0 comments)

    So we've been talking about the bus for some time. But when building a CPU we need two things from the bus:

    • see what is going on (visualize the value currently set on the bus)
    • manually input a value if needed (for instance, when testing a register, we need to set a known value on the bus to make sure the register can store it with register_in and send it back with register_out)

    For this I have built a simple module that manually sets the value of the bus and displays the value of the bus.


    However, there are a few constraints to take into account when building such a module:

    • You need to set a default value for the bus. If a line of the bus is left floating (not connected to anything), chances are that you will get either random values or unexpected behavior. Setting this default is called terminating a bus. It is usually done by connecting each line of the bus to a known value (Ground or VCC - 0V or 5V) through a resistor (called pull down or pull up).
    • It is not a good idea to plug LEDs directly into the bus, as they can draw many milliamps from the bus and eventually overload the IC that is active on the bus. To prevent that, a buffer was added (as usual a 74HCT245).
    • We are using standard DIP switches to set a value, but as explained above, the value is not connected directly to the bus: it goes through a 3-state buffer controlled by an action signal

    In the end this module is quite simple, with only two 74HCT245s, a bunch of LEDs and resistor networks (it is easier to use a resistor network than to manually plug in 8 resistors).

    [Note: do not forget the pull down resistors on the DIP switches, otherwise the line would be seen as floating when the switch is open]

    [Note: the NAND gate is used as an inverter, just because there was one available near the module]

    Manual setting of the bus value:

    Displaying the Bus value and bus termination:

    Preview of the Breadboard prototype (right of the photo):

View all 15 project logs
