Close
0%
0%

The Spikeputor

I am building a computer featuring a 16-bit RISC CPU made of discrete transistors for learning, fun and art. It will be pretty large.

Similar projects worth following
After taking the MITx course "Computation Structures", I realized how relatively easy it would be to build a CPU. Armed with a beginners knowledge of electronics and the concepts introduced in the course, I designed and started implementation of an NMOS logic-based RISC CPU. Wall-mounted, it will take up an entire wall. For extra retro-computing flair, the I/O will be handled by an old Apple II plus I had sitting in my closet. I'm physically building it out of pegboards, protoboards, 2N7000 transistors, resistors (mainly SIPs, but a few discretes here and there), LEDs, a few HP-5082 display chips, some electroluminescent wire, and lots of lovely colored jumper wire. My goal will be to be able to execute simple programs and visualize the computation as it happens.

The Spikeputor will be a fully functional computer with a CPU made exclusively from NMOS logic using about 5,000 MOSFETs (2N7000), resistors, and LEDs to visualize the logic. HP-5082-7340 Hex Display chips will be strategically added to display the numeric contents of the outputs of the major CPU components, although if you read binary, the LEDs will provide the same information. Electroluminescent wire will be added to visualize the logic pathways between the major components. The clock will be adjustable from maximum speed (predicted to be in the tens of thousands of hertz range) down to single step (design TBD). Since the primary driver of the Spikeputor is to visualize computation, speed was not a concern, although some steps will be taken (a clock tree, e.g.) to enhance the overall performance.

The initial CPU design was inspired by the Beta CPU, introduced in the MITx course, "Computation Structures" (highly recommended!) The Beta has a load-store, 32-bit RISC architecture. To save space, transistors, and the creator's sanity, the Spikeputor CPU was reduced to 16 bits. This necessitated several other major design changes, mainly centering on the fact that 16-bit opcodes could no longer support the inclusion of constants within the opcode.

Spikeputor CPU Design Features and Major Components:

  • 16-Bit Address space
  • Register Memory: Seven 16-bit registers, plus one register hard-coded to zero.
  • Multi-function ALU
    • Addition and Subtraction, supporting 2s Complement negative numbers
    • Comparison (Equal, Less Than, Less Than or Equal)
    • Boolean Logic (AND, OR, XOR, Identity)
    • Variable Bitwise Shift Right or Left with or without Sign Bit Propagation
  • Control Logic implemented with simple logic gates (no microcode or ROMs)
  • Additional registers for Program Counter, Instruction, Constant, CPU Phase, and a few required status flags
  • One-word opcodes for operations between registers
  • Two-Word opcodes for operations between registers and 16-bit constants
    • ALU Functions (see above)
    • Memory functions: Load, Load Relative to PC, and Store
    • Conditional (BEQ, BNE) and non-conditional branching with branch point storage
  • IRQ and RESET handling (including automatically using register R6 as the Exception Pointer)
  • Direct Memory Access (DMA) for all I/O

Memory

To be able to execute more than just trivial programs, the Spikeputor CPU will interface with high speed static memory RAM (2 x 32K AS6C62256) and ROM (2 x 32K AT27C256R) chips. In addition, a mirrored write-only "screen memory," also made from (roughly 5,000) transistors, will be created, providing a 48 by 18 array of addressable LEDs. Memory addresses will be restricted to word-boundaries. Attempts to address odd-numbered memory locations return the corresponding even-numbered 16 bit word. Since there weren't any 16K SRAM and ROM chips available, I used 32K chips, and will design a memory banking system to select 16K banks from the RAM and ROM for read operations. The full 32K words (64 kbytes) of RAM can be written to at any time.

Input/Ouput

I/O functions will be facilitated via custom software and hardware running on an Apple II Plus computer, vintage 1986. A custom peripheral card called BIAS (Board to Interface Apple to Spikeputor) will be designed as part of the project, as well as the software on the Apple to provide keyboard input, screen output, and long-term data storage and retrieval. The I/O controller will have direct access to the Spikeputor memory and will halt the Spikeputor CPU during all read and write operations.

16 Bits of General Purpose Input and Output signals will also be implemented, mirrored to fixed high memory locations.

Structure

The Spikeputor will be assembled on a series of mounted solderless breadboards. Each major component (ALU, Register Memory, Control Logic/Program Counter/Status Registers, and Screen Memory) will be laid out on a 4 foot by 2 foot pegboard. Each pegboard can contain a nine...

Read more »

  • 10000 × 2N7000 MOSFET
  • 200 × 830 Hole Prototype Board
  • 200 × 4.7 KΩ Resistors, SIP-8

  • Spikeputor ISA and Control Logic

    spudfishScott05/02/2019 at 20:00 0 comments

    Each Spikeputor instruction is one word (16 bits) long, with a subset of the instructions requiring a second 16-bit integer. Operations are divided into four groups: ALU Functions, Branching, Load, and Store.

    ALU Functions

    All ALU operations require two source operands, and one destination register. The two source operands are either two registers (Ra and Rb), or a register (Ra) and a 16-bit signed integer literal. The result of the ALU function is then stored in the destination register (Rc). Thus, ALU functions may either be one word or two words long, depending on the need for the 16-bit constant. Two-word instructions end in C.

    ALU Instructions:

    • Arithmetic
      • ADD(Ra, Rb, Rc)
      • SUB(Ra, Rb, Rc)
      • ADDC(Ra, const, Rc)
      • SUBC(Ra, const, Rc)
    • Bitwise Logic
      • AND(Ra, Rb, Rc)
      • OR(Ra, Rb, Rc)
      • XOR(Ra, Rb, Rc)
      • ANDC(Ra, const, Rc)
      • ORC(Ra, const, Rc)
      • XORC(Ra, const, Rc) 
    • Bitwise Shift: The bottom four bits of the second operand represents the number of bits to shift. SRA is a right shift with sign extension.
      • SHL(Ra, Rb, Rc)
      • SHR(Ra, Rb, Rc)
      • SRA(Ra, Rb, Rc)
      • SHLC(Ra, const, Rc)
      • SHRC(Ra, const, Rc)
      • SRAC(Ra, const, Rc)
    • Compare: Rc is set to 1 if the comparison is true, 0 if not. Comparisons are based on 2's complement signed integers except for CMPUL, which is an unsigned compare less than.
      • CMPEQ(Ra, Rb, Rc)
      • CMPLT(Ra, Rb, Rc)
      • CMPLE(Ra, Rb, Rc)
      • CMPUL(Ra, Rb, Rc)
      • CMPEQC(Ra, constant, Rc)
      • CMPLTC(Ra, constant, Rc)
      • CMPLEC(Ra, constant, Rc)
      • CMPULC(Ra, constant, Rc)

    Branching

    Conditional branch instructions test the value of register Ra. The branch is taken if Ra is zero (for BEQ) or not zero (for BNE). The second word of the instruction is used to specify a 16-bit signed Program Counter offset. Before the branch, register Rc is loaded with the next calculated Program Counter value so it may be used as a return vector. If the branch will be followed, the target branch address is calculated by adding the given offset to that new PC value.

    For immediate branching, there are two options: JMPC calculates the target branch address by adding the value of Ra to the given 16-bit signed literal. JMP simply takes the target branch address from the value of register Ra. As with the conditional branches, Rc is set to the address of the subsequent instruction. Branch instructions are two words long except for JMP. Also, if Ra is register 6, the JMP(R6, Rc) instruction will clear the IRQ flag, indicating the end of interrupt processing.

    Branching Instructions:

    • BEQ(Ra, PC offset, Rc)
    • BNE(Ra, PC offset, Rc)
    • JMPC(Ra, mem offset, Rc)
    • JMP(Ra, Rc)

    Load

    There are two load instructions to move values from memory into registers. Both instructions use two words, the second of which is used as a signed memory offset. For the LD instruction, the target memory address is calculated by adding the value of register Ra to the given offset . The LDR (LoaD Relative) instruction adds the offset to the memory address of the subsequent instruction. In both cases, the data in the calculated memory address is stored in register Rc.

    Load Instructions:

    • LD(Ra, mem offset, Rc)
    • LDR(PC offset, Rc)

    Store

    The only way to write to main memory is through the two-word Store instruction. The value of register Rc is stored in the memory address calculated by adding the value of register Ra to the value of the second word.

    Store Instruction: ST(Rc, Ra, mem offset)

    The format of the opcodes are as follows:

    The top five bits of the opcode are simply the ALU function (described in the ALU log entry) and are passed directly to the ALU. Bit 10 (Constant Needed) indicates whether or not the opcode is made up of a one word or two words. Bit 9 differentiates...

    Read more »

  • It's About Time!

    spudfishScott04/27/2019 at 22:56 0 comments

    Until now, I've been generating clock signals via an Arduino. With the Register Memory and ALU boards complete, and work beginning on Control Logic, it was time to think about building a clock for the Spikeputor. Like many other homebrew computers, I wanted the clock rate to be variable based on user control of some kind of knob or slider. I also wanted the ability to switch to a mode where the user can control the clock pulses on a step-by-step basis for a fine-grained look at the CPU in action.

    Because everyone loves a 555, I decided that it would be an appropriate timekeeper for the Spikeputor. I considered creating a discrete version (see this great kit, for example), but since space on the pegboards was filling up fast, I decided to start with ICs, and then go discrete later on if space is available.

    Here's the schematic for the clock circuit:

    I'm actually using two 555's - one to generate the variable-rate clock signal, and another to generate a clean 200 ms debounced pulse for the manual step pushbutton. Using a 10 MΩ variable resistor and a 200 nF capacitor, clock rates between 0.35 and about 17,000 Hz are possible. I might want a little more dynamic range, so I'm considering switching to a control knob where I can substitute a larger range of resistors, but for now, the 10 MΩ slider will do.

    The 555 clock signal and the single step pulses go into a 2-channel MUX, the output of which is controlled by a Clock/Step Select switch, itself debounced via an SR latch. Finally, one final gate allows for setting IOFLAG high to stop the clock so the DMA I/O can occur.

    Here's how it all fits together on a board:

    IOFLAG is  hard-wired to 0 right now, but that will eventually connect up to the BIAS board of the Apple II for I/O.

    And here's a little demo of the board in action. We start in clock mode, adjust the rate, then switch over to single-step mode, step a bit, then switch back to the clock. This board drives the proto-Spikeputor perfectly!

    UPDATE: I decided to eschew one of the 555 chips in anticipation of eventually replacing the single remaining 555 with discrete components. Instead, I used a DPST momentary pushbutton switch for the single step pulse, debounced via another SR Latch as shown here:

  • I'm a Big Fan of Sharp Clock Signals

    spudfishScott04/06/2019 at 18:03 0 comments

    As work on the Spikeputor continues, I wanted to check out the shape of the clock pulses, now that a single signal feeds into 112 inputs (7 registers * 16 bits per register). In the entire design, there will be upwards of 165 clock inputs. The result of such a high fan-out number was not entirely unexpected:

    At 37 µS for full rise, that starts rivaling the top speed I want to be able to run the processor (~40 µS, preferably faster if I can speed up the ALU). All of those tiny capacitances of each of the input transistors are starting to add up, slowing down the rise time! Luckily, there's a simple solution to this expected problem: build a clock tree. I inserted a buffer (two transistors) in front of every group of 16-bit inputs, and now the rise time is seven times faster. Math and physics works! Yay!

    None of this is rocket science, but it's a very helpful and educational illustration of these principles. Onward!

  • ALU and RegMem Completed - Demo Time!

    spudfishScott03/02/2019 at 22:15 5 comments

    Two of the four major pegboards of the Spikeputor are now complete and have been mounted. They come down pretty easily in case repairs or modifications need to be made that can't be performed right on the wall.

    Now that we have seven functioning registers and an ALU, we can put it through its paces by simulating the rest of the CPU. Wiring together the components along with an Arduino, the nascent CPU can be "programmed" by simply setting up the correct input signals after every rising clock pulse (also provided by the Arduino). 

    Since there is no RAM yet, the only storage is in the registers. Nevertheless, you can easily set up the Arduino to do something like this:  

    • Place the correct values in each of the registers:
      • Select the register with Ra and Rc to output to Channel A and to allow the input to overwrite it (Rb is ignored)
      • Turn Write Enable on
      • Output the correct bit pattern to the CONSTANT input. 
      • Set ASEL to 0 (input from Register Channel A goes to ALU A)
      • Set BSEL to 1 (input from CONSTANT goes to ALU B)
      • Set the ALU to the ADDC function to add Input A to Input B.
    • Rotate Left each of the registers:
      • Select the register with Ra and Rc to output to Channel A and to allow the input to overwrite it (Rb is ignored)
      • Turn Write Enable on
      • Output 0x0001 to the CONSTANT input. 
      • Set ASEL to 0 (input from Register Channel A goes to ALU A)
      • Set BSEL to 1 (input from CONSTANT goes to ALU B)
      • Set the ALU to the Rotate Left function to rotate Input A by Input B bits.
    • After doing this for all seven registers, pause briefly and loop back to another rotate cycle

    And voilá! Ghost In The Machine!

    You can also set up a similar type of loop, except instead of shifting bits, you can add a constant. This video starts with adding 1 by setting the ALU A Input DIP switch to 0x0001, outputting the selected register to Channel B with write-back, and cycling for a number of clock cycles before moving on to the next register. It shows the whole CPU, then focuses on the ALU output, and then Register Output Channel B. The DIP switch is then changed to 0x0010 to increment the second digit (by adding 16), leaving the first digit intact.

    Finally, for something a bit more advanced, we can manipulate the control signals to simulate a program to calculate the first 24 Fibonacci numbers (all of them that can fit in 16 bits). After each number, the control signals are set to place the step number on RegMem Output Channel A and the corresponding Fibonacci number on RegMem Output Channel B, then pause before calculating the next one. The "program" uses four registers: 

    • R0 = N
    • R 1 = Fib(N-2)
    • R2 = Fib(N-1)
    • R3 = Fib(N)

    (Video in the dark for dramatic effect!)

  • Adding an Unsigned Compare Function to Enable Wide Math

    spudfishScott02/18/2019 at 21:49 0 comments

    As currently designed, the Spiekputor's ALU Compare functions worked with signed integers only. For whatever reason, the "Beta" processor from MITx, upon which the Spikeputor CPU was based, only included signed compares (so, for example, the number 0xC000 would be reported as less than 0x7000 because 0xC000 represents the signed integer -16384, rather than the unsigned integer 49152). This presents a problem if one desires to do math on numbers of a higher bit depth than the bit depth of the CPU registers and the ALU. To add two 32-bit numbers with a 16-bit ALU, for example, you need a way to calculate the carry after summing the lower 16 bits of the 32-bit numbers, then add that carry to the sum of the higher 16 bits. Calculating the carry can be done a variety of ways in code, but it comes directly by checking if the low order sum is less than either of the addends. Thus, an Unsigned Compare Less Than function would be a great addition to the ALU commands. Luckily, it was fairly easy to add the new compare function by expanding the Compare module, which simply subtracts one operand from the other and reports a result based on logic with (Z)zero, (N)egative, and o(V)erflow flags. Adding one more compare function (CMPUL for CoMPare Unsigned Less-than), and bringing in the Carry Out from the Bit 15 adder, as shown below, does the trick.

    Previous Compare Module:

    New Compare Module:

    This change adds just four transistors to the circuit (converting a MUX3 to a MUX4 and inverting Cout[15]), and now, the Spikeputor will be able to easily do math on numbers arbitrarily larger than 16 bits. A very small price to pay, indeed!

  • Caught Up To Real Time! Current Status

    spudfishScott01/29/2019 at 17:42 0 comments

    Since I started this Hackaday project after I had completed a portion of the actual Spikeputor, I've spent the last few months catching Hackaday up to the current state. As of now, the "history" is complete in the logs and it'll be progress in real time from here on out!

    Here is a summary of the current state of Spikeputor project. Additional details will be addressed in separate Project Logs.

    Pegboard 1 - Register Memory: All but three bits of four registers are completely assembled.

    Pegboard 2 - ALU: Entirely assembled.

    Pegboard 3 - Control Logic, CONSTANT and INSTRUCTION registers, Program Counter, Memory Addressing, Main Memory, EL Wire Logic and System Clock: most circuits are designed, but the breadboard layouts are not yet completed (blue cells). Control Logic layout is designed and ready to be assembled.

    Pegboard 4 - Screen Memory: Circuit design and breadboard layout completed and ready to be assembled, but this will only happen after Pegboard 3 is complete and the CPU is otherwise functional.

    BIAS (Board to Interface Apple to Spikeputor) Card: Circuit designed and assembly in progress.

    In addition, I have written a custom Spikeputor simulator to confirm that the overall design and especially the control logic should work as planned.

    I have also written an assembler for the Spikeputor instruction set.

  • Designing and Building the ALU

    spudfishScott01/29/2019 at 15:35 0 comments

    The ALU is the second major module of the Spikeputor, and, including the two input MUX2's, fits neatly on a single pegboard. The design of the ALU again mimics the design of the MITx "Beta" microprocessor. The major differences are the 16 bit (vs. 32 bit for the "Beta") data widths and the fact that, since the opcode width is reduced, the ALU FN input is five bits wide (vs. six). As a result, I needed to take a shortcut with the Boolean module function select (described below). The high-level ALU schematic simply routes the inputs A and B through the four ALU modules and routes the desired function through the function-select MUX4 to the output:

    Here's six bits of one of the input MUX2's, placed at the top of the pegboard. The two MUX2's will select between the Register A input and the next PC value (ASEL signal) and between the Register B input and the CONSTANT value (BSEL signal):

    The ARITH module performs addition and subtraction. The heart of the design is a 16 bit ripple carry adder. If AFN (bit 0 of the ALU FN) is 1, the B input is inverted and the initial full adder carry is set to 1. This produces the 2's complement form of the input (B XOR 0xFFFF + 1), and thus, outputs the difference instead of the sum of A and B. There is also logic to determine whether the result is (N)egative, (Z)ero, or caused an o(V)erflow. These flags are used as inputs to the CMP module.

    The full adder circuit is standard:

    The physical layout of the ARITH function places the B Input inverters directly above the full adder circuitry, keeping all of the bits aligned. This creates some empty space on the inverter breadboards, but greatly simplified the wiring between rows.

    Three bits of inverter:

    And three bits of full adder:

    The breadboards used to add and invert the high order bit have extra space on them, and so they conveniently accommodated the N, Z, and V flag logic and the entirety of the CMP module. The CMP circuit just takes A - B from the ARITH modules and looks at those three flags to determine the single bit output based on the value of CFN (bits 1 and 2 of the ALU FN):

    The BOOL module provides bitwise logic operations between the two inputs. Each bit of the module simply combines the A and B inputs to create a two bit select signal for a MUX4. The four inputs of the MUX then correspond to the truth table for each pair and A and B bits. In the original "Beta" design, each of these four values was directly taken from the ALU FN, allowing for instructions that can create any two-input logic function. Because I only had five bits to use for the entire ALUFN, and two of those ware used to select which module to output, I was forced to make do with the remaining three bits for the BFN input. Luckily, all of the major logic functions I wanted to implement (AND, OR, XOR, and A Identity) can be produced with three bits, hard-wiring channel 0 of the MUX4 to 0. 

    To produce an AND function, BFN is 0b100. OR is 0b111, XOR is 0b011, and the A Identity function is 0b101.

    Three bits of BOOL:

    The SHIFT module is considerably more complicated. It takes the lower four bits of Input B to determine how many bits to shift Input A. Depending on the SFN inputs, the bits are shifted to the left or right, and with or without sign-extending bit 15. The sub-module is implemented with a cascade of six MUX2's. Four MUX2's shift the number 1, 2, 4, or 8 bits to the left or allow it to simply pass through, depending on the value of Input B. The remaining two MUX2's reverse the bit order of Input A and reverse it back after the shift operations in order to procedure right-shifted results:

    The physical layout of the SHIFT module breaks with the usual one row per word design, instead placing each of the MUX2's in a single column that is three rows tall. Here's one column of the SHIFT module, which corresponds to all 16 bits of the SHIFT4 MUX2 in the schematic:

    The output of each module is...

    Read more »

  • Register Memory Function Test

    spudfishScott01/18/2019 at 22:08 0 comments

    I'm almost completely finished with Register Memory. All there is left is to finish the final three-bit boards for four registers (you can see these as "wire only" breadboards on the far left). Meanwhile, here's a video of a quick test. I used an Arduino to make clock pulses and to put a random number at the Register Memory Input on each cycle. I used another Arduino to pick a random register to write to and two random registers to send to the Channel A and Channel B outputs. So far, so good!

  • Designing and Building Register Memory

    spudfishScott01/09/2019 at 16:17 0 comments

    From a functional standpoint, the only differences between the Spikeputor's Register Memory and the Register Memory of the "Beta" processor on which it is based is a decreased number of registers (8 vs. 32) and a decreased number of bits per register (16 vs. 32). Otherwise, the inputs and outputs are the same. RegMem has three register address inputs (Ra, Rb, and Rc), three bits each, a 16 bit data input (WDATA), two 16 bit outputs (RADATA and RBDATA), three control lines (RBSEL, WASEL, and WE), and a clock input. 

    Functionally, register Ra is immediately placed on RADATA. The RBSEL signal determines whether Rb or Rc is placed on RBDATA. WASEL selects whether Rc or R6 (used as the exception pointer on IRQ) will be written to, if WE is high, on the next rising clock pulse.

    The full schematic of the register memory is shown below, and includes some logic to handle the control lines, decoder logic to translate each of the address signals into enable signals for the writeback and output functions, load enable positive edge-triggered flip-flops to store the data, and tri-state buffers to select each of the two output channels.

    The flip-flops are modified from those described in the "Building Blocks" log entry to include a built-in MUX2, which selects between updating the flip-flop with new data (EN high) or previously stored data (EN low). Note that all of the tri-state buffers are implemented as a single transistor using source as the input, drain as output, and gate as enable. To overcome issues with the body diode in the 2N7000, each signal going into the tri-state must be a isolated from the rest of the circuit. This was done by simply putting an inverter in front of each tri-state input. Yann Guidon pointed out that I could have also used two transistors, rather than a transistor and an inverter. I had explored this earlier, but abandoned it for reasons I can't quite remember (might have had to do with a lower voltage output for that configuration). Since it spares a resistor, I'll likely use it for later additions to the Spikeputor CPU. 

    Since notQ is the actual desired value of the bit (because D is inverted before the tri-state), an LED was wired directly to that output to show the state of each bit. The two channel outputs actually come from Q, followed by an inverter leading to the tri-state buffers shown in the schematic. Note that for R7, the data values are hard-wired to ground, nor is there any circuitry for writing to or storing in R7, creating the "always zero" register.

    Each register is arranged in a single row on the pegboard, with one breadboard containing the decoder logic and bit 0 of the stored value, and five breadboards with three bits each completing the remaining 15 bits. The decoder also has three LEDs to indicate if the register is selected for writeback (blue), or channel A (red) or B (green) output.

    Decoder logic and Bit 0:

    Three bits of Register Memory:

    Because I had space, the Writeback Data MUX4 used to select the RegMem WDATA input was placed at the top of the RegMem pegboard. It is also five breadboards wide (three bits per board), plus an additional board for Bit 0 plus the rest of the logic to process the RegMem control signals. The WDSEL signal selects which input (PC_INC, ALU, or MEMORY) is used for WDATA. At the time of the initial design, the fourth input on the MUX4 was open. Since I had the space to accommodate it, I kept it open, initially grounding each bit of the signal, figuring I'd have another way to write a zero to a selected register. As it turns out, as of today, I think I'm going to need this open position to capture the current PC value to handle interrupts, but that's still being worked out. Bottom line: I'm glad I had the inkling to leave it rather than spare some transistors to build a MUX3 instead.

    WDATA Input from WDSEL MUX4:

    Since register R7 is zero only, it only needed two boards (one for each output channel) to lay out all of the associated...

    Read more »

  • Overall Design and Build Strategy

    spudfishScott01/08/2019 at 03:36 0 comments

    As mentioned in the Details Section, the Spikeputor CPU is modeled after the "Beta" CPU from the MITx course, "Computation Structures" (see this link to the Beta diagram). The Beta is a 32-bit RISC CPU with 32 registers (31 + 1 that is hard-coded to zero). I chose it as a model because, frankly, it was the only CPU I knew from the ground up and it seemed relatively simple and elegant, yet quite powerful. Once I did a quick calculation of the number of discrete components I'd need to fully implement the Beta, however, I knew that some compromises would have to be made. The main compromise was to switch from a 32-bit to a 16-bit architecture. At that point, I did NOT look at other 16-bit CPU designs. Instead, I took on the challenge of adapting the Beta to a 16-bit design with as few changes as possible. The biggest changes centered around opcode and instruction size. The beta always executed one opcode per clock cycle. The 32-bit opcodes were wide enough to accommodate instructions featuring up to three registers (five bits to address each of the 32 register locations) or two registers and a 16-bit constant. The remainder of the bits were more than enough to describe all of the instructions defined by the Beta's ISA (Instruction Set Architecture). 

    With only 16 bits to work with in a Spikeputor CPU opcode, the number of registers needed to be reduced to eight (three bits to address each of them, R0 to R7, where R7 was hard-coded to always be zero), leaving six bits for encoding the rest of the operations. Obviously, there was no available space for constants of any size within the opcode itself. Reducing the register count was acceptable to me. There'd be a code efficiency hit, but the transistor count for Register Memory was much more manageable. To be able to use constants, however, required a major change in the architecture. Instead of a simple one instruction per cycle design, a multi-phase instruction scheme was needed, necessitating several additional special-purpose registers, independent from the user registers. I envisioned three CPU phases. Phase information would be stored in a two bit CPU_PHASE register. In phase 0, the instruction would be read in to a 16 bit INSTRUCTION register. If the instruction called for a constant, the CPU phase would be updated to phase 1, the program counter incremented, and the constant read in to a 16 bit CONSTANT register. If a constant wasn't part of the instruction, the CPU phase would go directly from phase 0 to phase 2. In phase 2, the instruction would be executed, including updating the Program Counter to go to the next instruction or to branch to a new memory location.

    One other difference I chose with the Spikeputor CPU was to not implement a "supervisor" address bit, mainly because I wanted the flexibility to (eventually) use the entire 16-bit address space. Instead, to handle interrupts (which I still don't know if I'll even use, but still wanted to design), there would be a one bit IRQ_FLAG register to prevent the interrupt itself from being interrupted. The whole initial design looked like this:

    I'll describe each of the modules in greater detail in subsequent project logs.

    Since the Register Memory and ALU were most similar to the Beta design (something I gained experience with taking the Computation Structures course), I decided to dive right in to starting to build those first. As mentioned in the Details, I'm doing the whole project with solderless breadboards. I'm doing this for purely artistic reasons: I just love how they look, the air of impermanence they invoke, and their modularity. In addition to the schematic, I needed to plan out how the actual components would be laid out. Trying to use breadboard space most efficiently is actually another major design driver. Usually this goes hand-in-hand with transistor component efficiency, but not always. Sometimes, adding a few extra components actually makes it easier to lay out the boards without...

    Read more »

View all 13 project logs

Enjoy this project?

Share

Discussions

Lee Wilkins wrote 01/07/2019 at 18:33 point

Very cool! Can't wait to see how ti progresses. Have you got any clearer pictures up close? 

  Are you sure? yes | no

spudfishScott wrote 01/07/2019 at 18:47 point

Thanks! I'll upload more close-up pictures as I "catch up" in the project log with where I am now.

  Are you sure? yes | no

Yann Guidon / YGDES wrote 01/03/2019 at 21:49 point

Oh, a MegaProcessor on Hackaday :-D
https://www.youtube.com/results?search_query=megaprocessor
For MUX, you could use a "ladder/binary tree" system, which requires complementary control inputs but uses fewer transistors.
NMOS has some inherent drawbacks (power and speed). I've tried discrete CMOS for #Yet Another (Discrete) Clock :-)

  Are you sure? yes | no

spudfishScott wrote 01/05/2019 at 19:23 point

Thanks. I do know about the Megaprocessor. Although I found it after I started designing the Spikeputor, I decided to go forward anyway. The designs are different enough, plus I want one of my own! I'm pretty happy with NMOS, because I don't need the speed and the power requirements are reasonable (about 25 Watts total). Also, p-Channel MOSFETs are 8-10 times as expensive as n-Channel and using CMOS would double the number of transistors required.

  Are you sure? yes | no

Yann Guidon / YGDES wrote 01/09/2019 at 03:18 point

it's not as if you could have too many transistors ;-)

  Are you sure? yes | no

Peabody1929 wrote 12/18/2018 at 18:11 point

It looks like you are using EasyEDA.  Instead of breadboards, how about creating simple PCBs for different modules?  If one wire comes loose on a breadboard somewhere, it could take a LOOOOOOONG time to find it.

  Are you sure? yes | no

spudfishScott wrote 12/18/2018 at 18:26 point

Your point is well taken, and many of my friends think I'm nuts to not use PCBs. But I'm pretty committed to breadboards. I really like the way they look and they fit my artistic vision for the project. Interestingly, since things fail at the bit level, it has been simpler than you might think to track down wiring problems.

  Are you sure? yes | no

Similar Projects

Does this project spark your interest?

Become a member to follow this project and never miss any updates