
Q2 Computer

A 12-bit single-board discrete transistor computer.

The Q2 is a discrete transistor computer implemented on a single PCB using surface mount components. It is a 12-bit design with a bit-serial ALU.

Background

Making a computer out of 7400-series logic, FPGAs, or PLDs is fun, but a lot of the complexity hides in those integrated circuits and their structure significantly affects any design that uses them. To avoid this influence, I like to use transistors. This led me to design and build a computer on perf-board out of NPN transistors between 2008 and 2011 (the Q1). In the end, I was left with a functioning transistor computer, but with a somewhat less than inspired design that would be extremely difficult to improve or replicate.

Today (2021), the situation has changed quite a bit from when the Q1 was built. While designing a custom integrated circuit is probably still beyond the reach of most, it is now fairly easy to get a PCB both fabricated and assembled. This means that designing and building a transistor computer is no longer such a labor-intensive endeavor. In fact, I now have the opposite problem where I end up with 5 clones every time I make a new revision. In some respects, I feel like putting this many transistors on a PCB is like designing an integrated circuit with a really big feature size.

The Q2 is my attempt at a single-board transistor computer. It is a 12-bit design with a bit-serial ALU. It is no coincidence that the architecture is similar to the PDP-8. In designing a transistor computer like this, the design decisions of old computers start to make a lot more sense.

Technology

The Q2 is implemented in NMOS using n-channel MOSFETs with resistor pull-ups.  The current design uses 1094 2N7002 transistors.  To keep the transistor count low and the power usage low, the Q2 runs at 80 kHz. This allows it to draw well under 500 mA at 5V, so I use a USB-B connector for power.

User Interface

Programming is accomplished via a front-panel interface on the lower left side of the board. The front-panel has 12 LEDs to show the current address and 12 LEDs to show the data bus. There are also 12 switches to serve as inputs and the following buttons:

  • Reset - Set P (the program counter/current address) to the value in the switches and reset the Q2.
  • Halt - Stop the clock.
  • Run - Start the clock.
  • Deposit - Store the value in the switches to the current address.
  • Next - Increment P.

The front-panel switches are exposed to a 40-pin header allowing the use of a Raspberry Pi for programming, which is much more convenient.

For interaction with running programs, the lower right side of the board has a 16x2 LCD that can be written at address 0xFFF (bit 8 determines if the write is a command or data). There are also 12 buttons under the LCD, whose state can be read from 0xFFF.

Here is a snake game:

Memory

The Q2 uses two 6264 SRAMs for main memory, leaving 4 bits unused. A CR2032 battery is used for backup. The use of an SRAM IC might be controversial, but there aren't a whole lot of viable options. Making a DRAM out of transistors would be possible, and with large capacitors it might even be essentially non-volatile, but covering 12 bits of address space would take more transistors than the Q2 has. A core memory would also be neat, but I don't know an easy way of sourcing one.

All 12 bits of the address space are mapped to SRAM except for the last address (0xFFF), which is for I/O.  This means there are 4095 words of SRAM total. This seemingly small amount is actually enough to do quite a bit. The snake game shown above only uses a small fraction of the available memory.

Instruction Set

  FFF D Z XXXXXXX
   \  \ \    \____ Operand
    \  \ \________ Zero-Page
     \  \_________ Dereference
      \___________ Opcode

  Opcode  | Name  | F | Description
  ------- | ----- | - | ----------------
  000     | LDA   | Z | A = [X]
  001     | NOR   | Z | A = A NOR [X]
  010     | ADD   | C | A = A + [X]
  011     | SHR   | C | A = [X] >> 1
  100     | LEA   | - | A = X
  101     | STA   | - | [X] = A
  110     | JMP   | - | P = X
  111     | JFC   | - | if !F, P = X
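To make the word layout concrete, here is a minimal Python sketch that splits a 12-bit word into the four fields from the diagram. It assumes the leftmost field in the diagram sits in the most significant bits; the `decode` and `encode` helpers are hypothetical names for illustration and are not part of any Q2 tooling.

```python
# Field layout taken from the diagram above: FFF D Z XXXXXXX (12 bits).
# Assumption: the leftmost field occupies the most significant bits.
MNEMONICS = ["LDA", "NOR", "ADD", "SHR", "LEA", "STA", "JMP", "JFC"]


def decode(word):
    """Split a 12-bit instruction word into its four fields."""
    if not 0 <= word <= 0xFFF:
        raise ValueError("instruction words are 12 bits wide")
    return {
        "opcode": MNEMONICS[(word >> 9) & 0b111],  # 3-bit opcode
        "deref": (word >> 8) & 1,                  # dereference bit
        "zero_page": (word >> 7) & 1,              # zero-page bit
        "operand": word & 0x7F,                    # 7-bit operand
    }


def encode(mnemonic, deref, zero_page, operand):
    """Inverse of decode(): pack the four fields into one 12-bit word."""
    return (
        (MNEMONICS.index(mnemonic) << 9)
        | ((deref & 1) << 8)
        | ((zero_page & 1) << 7)
        | (operand & 0x7F)
    )
```

The sketch only covers the field split. How the Z and D bits turn the 7-bit operand into a full address is not modeled here.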

The instruction...


  • Smaller Flip-Flops

    Joe Wingbermuehle, 2 days ago

    I started out by looking into RTL as a possible substitute for NMOS for the Q2 as a way to speed it up. In the end, I think NMOS is probably the best bet:

    1. RTL requires base resistors, which greatly increases component count.
    2. An RTL gate always draws power (when low, just like NMOS, but also when high, through all output gates). This makes it even more difficult to get a low-power (sub-500mA) design, which is important since I want to be able to run the Q2 off of a USB adapter.
    3. RTL really only works well with NOR gates, which makes the logic slightly more complex (though in reality, NMOS works better with NOR gates too).

    From my experiments, an RTL design can easily go faster, but given the power constraint, it isn't a clear win.

    Despite this outcome, the RTL investigation caused me to look into alternative flip-flop designs. Given the 46 flip-flops in Q2, the simple edge-triggered design I had in NMOS wasn't going to cut it in RTL. So I started looking into a pulse-triggered flip-flop (shown below). This design is kind of neat because it uses so few transistors.  Unfortunately, it doesn't seem to translate easily into NMOS, and it's somewhat picky about component values as you increase the clock frequency (not to mention the high component count).

    The current Q2 flip-flop design is shown below (without the LED section). It's basically the classic positive-edge triggered flip-flop from NAND gates with set and reset.

    Using 14 transistors, 7 resistors, and 1 LED each, these DFFs make up most of the Q2 by component count and area, so saving even one component per flip-flop adds up.

    A new design is shown below. By rearranging some inputs, it's possible to "share" a couple of transistors. Also, by switching to an LED with a low voltage drop, we can save a transistor and resistor, using the LED as part of the pull-up. We're left with a positive-edge triggered D flip-flop with set and reset, using 11 transistors (all the same type), 6 resistors, and 1 LED.

    The smaller flip-flop design will save 138 transistors and 46 resistors. In addition, the pull-up on the inverted output is actually faster due to the low resistance to power the LED.
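The savings follow directly from the per-flip-flop part counts. A small Python check using only the numbers quoted in this log (the 1094 total comes from the project description and is assumed to refer to the same board revision):

```python
FLIP_FLOPS = 46          # flip-flop count quoted for the Q2
TOTAL_TRANSISTORS = 1094  # from the project description (assumed same revision)

old = {"transistors": 14, "resistors": 7, "leds": 1}
new = {"transistors": 11, "resistors": 6, "leds": 1}


def savings(old_parts, new_parts, count):
    """Parts saved across `count` flip-flops, per part type."""
    return {part: (old_parts[part] - new_parts[part]) * count for part in old_parts}


saved = savings(old, new, FLIP_FLOPS)
fraction = saved["transistors"] / TOTAL_TRANSISTORS

print(saved)                                   # per-part savings
print(f"{fraction:.1%} of all transistors")    # share of the whole board
```

Under that assumption, the new flip-flop removes roughly one in eight transistors from the board.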

  • Transistor Comparison Results

    Joe Wingbermuehle, 04/24/2021 at 14:51

    Following up on my search for faster transistors, I got the boards back earlier this week to test a new candidate transistor (a 2SK3018 vs 2N7002). The good news is that the surface mount switch I wanted to try appears to work just fine. Unfortunately, the new transistor does not appear to improve performance nearly as much as I had hoped.

    For the test, I have the output of an oscillator tied to an inverter implemented with 4 transistors of each type in parallel (I ended up going with a NOR structure instead of a NAND structure). The graph below shows the results.

    The blue line is the output of the 2SK3018 inverter and the yellow line is the output of the 2N7002 inverter. There is maybe some difference, but it's hard to see here.

    To investigate further, here is the input compared to the output for the 2N7002 inverter:

    The output is in yellow and the input is in blue. The threshold voltage is apparent and seems to be around 1.6V. After somewhere around 1.8us, the input has reached 2V.

    Nothing too surprising. Here's the 2SK3018:

    Again, the output is in yellow and the input is in blue. Here we see a lower threshold voltage of what appears to be 1.2V. Unfortunately, after 1.8us, the input has only reached 2V. This is actually quite similar to the 2N7002.

    Looking at the data sheets, although sparse on details, maybe this isn't completely unexpected. For the 2N7002, the maximum capacitance is given as 50pF and no typical capacitance is listed. For the 2SK3018, only a typical capacitance is given as 13pF.

    The lower threshold is an advantage, but we need to make sure that external components that the Q2 uses, such as the SRAM and LCD, still get a high enough voltage for the high level. In the end, I'll probably just go with the cheapest transistor for the next revision. Although it would be nice to get some more speed, I would like to be able to do so without adding significantly to the size or power requirements of the Q2.

    One, perhaps obvious, observation from this experiment that might speed things up is that, in NMOS, gates with a lot of inputs are slow. The graph below compares two 4-input NAND gates. For the NAND gate in yellow, the first input is switched (closest to ground). For the NAND gate in blue, the last input is switched (other inputs are held high).

    When the first input is switched, the output only reaches 3V after 2.4us, whereas when the last input is switched, the output reaches 4V in the same time.  I'm guessing this effect is due to the output capacitance combined with the on-state resistance. For the address decoder, there is a 12-input NAND gate to check for 0xFFF, which is very slow and probably limiting performance. I plan to revisit that with a more appropriate pull-up. This also implies that rearranging the order of signals into the gate may improve the speed of the gate (slower signals should go near the output side, and faster signals near the ground side). NOR gates don't slow down with increased inputs quite as much as NAND gates, but they still slow down as the number of inputs increases, almost certainly due to the output capacitance.

  • I2C

    Joe Wingbermuehle, 04/14/2021 at 02:21

    I should have my test board for the 2SK3018 transistors back next week, but in the meantime I've been thinking about other changes.

    In the interest of adding more I/O capabilities, I think I've settled on adding an I2C interface to the Q2.  I2C is pretty easy to support, requiring 2 open-drain outputs (SDA for data and SCL for a clock), and an input (only SDA assuming there isn't a need for clock stretching).

    The output side of I2C is the most complicated part, requiring a latch for SDA and SCL. Input is easy, requiring only a single NAND gate. Here's the current proposal:

    The idea is that bit 11 of address 0xFFF will select between the LCD (0) and I2C (1), allowing easy access to the LCD just as before.  When bit 11 is set, bit 10 sets SCL and bit 9 sets SDA. The software for I2C is fairly simple. For start/stop, I think something like this should work:

    .def I2C_EN   0x800
    .def I2C_SCL  0x400
    .def I2C_SDA  0x200
    
    ; Note I2C signals are inverted.
    i2c_zero:
      .dw   I2C_EN | I2C_SDA | I2C_SCL
    i2c_zero_clk:
      .dw   I2C_EN | I2C_SDA
    i2c_one:
      .dw   I2C_EN | I2C_SCL
    i2c_one_clk:
      .dw   I2C_EN
    i2c_input_mask:
      .dw   ~I2C_SDA
    
    ; Send I2C start
    ; Take SDA low while SCL stays high.
    i2c_start:
      sta   =x1
      lda   i2c_one_clk   ; SDA=1, CLK=1
      sta   @=neg1
      lda   i2c_zero_clk  ; SDA=0, CLK=1
      sta   @=neg1
      jmp   @=x1
    
    ; Send I2C stop
    ; Take SDA high while SCL stays high.
    i2c_stop:
      sta   =x1
      lda   i2c_zero_clk    ; SDA=0, CLK=1
      sta   @=neg1
      lda   i2c_one_clk     ; SDA=1, CLK=1
      sta   @=neg1
      jmp   @=x1
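As a cross-check on the constants in the listing above, the following Python sketch builds the word that would be stored at 0xFFF for a given pair of bus levels. It relies only on the bit assignments and the "signals are inverted" note from the listing; the helper name `i2c_word` is made up for illustration.

```python
I2C_EN = 0x800   # bit 11: select I2C instead of the LCD
I2C_SCL = 0x400  # bit 10
I2C_SDA = 0x200  # bit 9


def i2c_word(sda, scl):
    """Word to store at 0xFFF so the bus sees the requested levels.

    The I2C signals are inverted, so a set bit pulls the line low.
    """
    word = I2C_EN
    if not sda:
        word |= I2C_SDA
    if not scl:
        word |= I2C_SCL
    return word


# The four constants from the assembly listing.
i2c_zero = i2c_word(sda=0, scl=0)
i2c_zero_clk = i2c_word(sda=0, scl=1)
i2c_one = i2c_word(sda=1, scl=0)
i2c_one_clk = i2c_word(sda=1, scl=1)

# Start: SDA falls while SCL is high. Stop: SDA rises while SCL is high.
START = [i2c_one_clk, i2c_zero_clk]
STOP = [i2c_zero_clk, i2c_one_clk]
```

Each entry in `START` and `STOP` corresponds to one `sta @=neg1` in the assembly routines.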
    

    For writing, we just loop over each bit.  Being a 12-bit architecture, we have to shift off 4 bits first. So, something like:

    ; Write byte in x0.
    ; Destroys x0-x2
    i2c_write:
      sta   =x1
    
      ; Shift out high 4 bits
      lda   =x0
      add   =x0
      sta   =x0   ; x2
      add   =x0
      sta   =x0   ; x4
      add   =x0
      sta   =x0   ; x8
      add   =x0
      sta   =x0   ; x16
    
      lea   =8
    i2c_write_loop:
      add   =neg1
      sta   =x2
    
      lda   =x0
      add   =x0
      sta   =x0
      jfc   i2c_write_zero
    
      ; Write 1
      lda   i2c_one
      sta   @=neg1
      lda   i2c_one_clk
      sta   @=neg1
      lda   i2c_one
    
      jmp   i2c_write_cont
    i2c_write_zero:
    
      ; Write 0
      lda   i2c_zero
      sta   @=neg1
      lda   i2c_zero_clk
      sta   @=neg1
      lda   i2c_zero
    
    i2c_write_cont:
      sta   @=neg1
      lda   =x2
      jfc   i2c_write_loop
    
      ; Acknowledge
      lda   i2c_one
      sta   @=neg1
      lda   i2c_one_clk
      sta   @=neg1
      lda   i2c_one
      sta   @=neg1
    
      jmp   @=x1
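The shift-and-test trick in the write loop can be modeled in a few lines of Python. This is only a sketch of the arithmetic, assuming ADD sets the flag on a carry out of bit 11 as the instruction table suggests; `add12` and `bits_msb_first` are hypothetical helper names, and the bus writes themselves are left out.

```python
MASK = 0xFFF  # 12-bit accumulator


def add12(a, b):
    """12-bit add returning (result, carry), modeling ADD and its C flag."""
    total = a + b
    return total & MASK, total > MASK


def bits_msb_first(byte):
    """Return the 8 bits of `byte` in the order the write loop sends them."""
    x0 = byte & 0xFF
    # Four doublings move the byte into the top 8 bits of the 12-bit word.
    for _ in range(4):
        x0, _unused = add12(x0, x0)
    bits = []
    # Each further doubling pushes the current top bit out into the carry.
    for _ in range(8):
        x0, carry = add12(x0, x0)
        bits.append(1 if carry else 0)
    return bits


print(bits_msb_first(0xA5))
```

Because the architecture has no left-shift instruction, adding a value to itself stands in for it, and the carry flag doubles as the "bit to send".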

    Reading is similar. From simulation, this would make reading 256 bytes from an EEPROM take somewhere in the neighborhood of 26 seconds at an 80 kHz clock.  It would be nice to get this faster, but that's plenty fast to use some I2C sensors or a real-time clock, etc.

  • Faster Transistors

    Joe Wingbermuehle, 04/04/2021 at 22:04

    The high gate capacitance of the 2N7002 transistors that the Q2 uses prevents it from running much faster than 80 kHz without becoming unstable.  This is because, with resistor pull-ups, the gates have to be charged through the pull-up resistor, causing slow rise times when a lot of gates are connected together.

    Consider the A register in the Q2, which is 12 bits. To clock the A register, 2 transistor gates per bit need to be pulled high. This is a fanout of 24, so a capacitance of 50pF * 24 = 1200pF. The threshold voltage of a 2N7002 is 2.5V worst-case. If we pull this high through a 10k resistor from the 5V supply, we get the following expression for the rise time:

    $$2.5 = 5\left(1 - e^{-t / (10\,\mathrm{k\Omega} \cdot 1200\,\mathrm{pF})}\right)$$

    Solving for t we get 8.3us for a single gate (if there were only one level of logic, the frequency would be limited to 120 kHz, but there are more levels involved). Substituting a 1k resistor solves the problem, but introduces another: instead of using 0.5mA, we use 5mA. This power draw quickly adds up.

    The control lines of the Q2 are carefully designed to use 10k resistors where possible, and fall back to 1k in just enough places to allow stable 80 kHz operation. Unfortunately, going faster becomes increasingly difficult and wastes more power. This raises the question of whether another transistor would be more suitable.

    We want the following characteristics:

    • Low gate capacitance (lower than 50pF)
    • Low threshold voltage. Not only does a high threshold cause problems with the supply voltage, it also makes the computer slower since the gate output needs to reach a higher voltage, which takes longer. 
    • Low price. When using 1000s of transistors, we can't ignore the price.
    • ESD protection. Not strictly necessary, but certainly nice to have. With lower gate capacitance, this is probably more important.

    One transistor that seems to be a good contender is the 2SK3018 (Shikues brand available through LCSC). It has a gate capacitance of 13pF and threshold of 1.5V. This means that in our example, the delay would be 1.1us with a 10k resistor and 0.11us with a 1k resistor. This should allow running the Q2 at nearly 8x the clock speed and/or save some power.
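The delays quoted in this log can be reproduced with a first-order RC model. The sketch below assumes a node charging from 0V toward a 5V supply through the pull-up, with the fanout-of-24 load from the A register example; `rise_time` is an illustrative helper, not something taken from the Q2 design files.

```python
import math


def rise_time(r_ohms, c_farads, v_threshold, v_supply=5.0):
    """Time for an RC node charging from 0 V toward v_supply to reach v_threshold."""
    return -r_ohms * c_farads * math.log(1.0 - v_threshold / v_supply)


FANOUT = 24  # 2 gates per bit on the 12-bit A register

# 2N7002: 50 pF per gate (datasheet maximum), 2.5 V worst-case threshold.
t_2n7002 = rise_time(10e3, FANOUT * 50e-12, 2.5)

# 2SK3018: 13 pF per gate (typical), 1.5 V threshold.
t_2sk3018_10k = rise_time(10e3, FANOUT * 13e-12, 1.5)
t_2sk3018_1k = rise_time(1e3, FANOUT * 13e-12, 1.5)

for name, t in [
    ("2N7002, 10k", t_2n7002),
    ("2SK3018, 10k", t_2sk3018_10k),
    ("2SK3018, 1k", t_2sk3018_1k),
]:
    print(f"{name}: {t * 1e6:.2f} us")

print(f"speedup at 10k: {t_2n7002 / t_2sk3018_10k:.1f}x")
```

Note that the model mixes a maximum capacitance for one part with a typical capacitance for the other, as the datasheets do, so the ratio is an optimistic estimate.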

    To investigate this further, I put together a simple test circuit to see how the transistors compare:

    The circuit is a simple relaxation oscillator (identical to the oscillator used in the Q2, but implemented using 2SK3018s instead of 2N7002s).  The output is run through two circuits with a fanout of 4 using both types of transistors so I can compare the rise times. This circuit will also allow me to try out an SMD switch so I don't have to worry about soldering switches in future revisions.


Discussions

Tim wrote 04/25/2021 at 11:58

Nice project! Did you use JLCPCB for the assembly? In that case you probably used the CJ 2N7002 as transistors. Those are an extremely slow variant of the 7002. You will probably already see a significant speedup by going to onsemi or nexperia variants.

I would suggest to use the onsemi FDV301N or the diodes inc. counterpart. But you already found my project about that :)

Joe Wingbermuehle wrote 04/30/2021 at 01:27

Thanks!

I did use JLCPCB for the assembly and I used the CJ 2N7002 at first (they are indeed quite slow). I switched to another cheap variant, but those were just as slow.

Your projects are very interesting to me for obvious reasons! The FDV301N is tempting, but 3x as expensive. Of course, now I'm curious if a BJT RTL design would be able to run faster without using more power...

Tim wrote 04/30/2021 at 05:19

RTL should be much faster, given that the right transistors are used (PMBT2369). But circuit design is a bit more complex because you cannot use stacked devices and the load resistors need to be sized properly to have stable logic levels.
(I guess you have also seen this: https://hackaday.io/project/170697-evaluating-transistors-for-bipolar-logic-rtl )

Maybe there are also ways to speed up on logic level? E.g. by optimizing carry logic?

Joe Wingbermuehle wrote 04/30/2021 at 12:27

I was actually looking into converting everything to be NORs since they appear to be faster than NANDs and, since I'm already trying to compute the fanout of each gate in the interest of sizing the pull-up resistors to combat gate capacitance, it occurred to me that I might as well just use RTL. Of course, RTL requires an additional gate resistor, which basically doubles the part count... so those digital transistors start to look appealing. I did see your write-up about RTL and the CDC (LTL is also interesting, but that would really up the part count I think!). My concern is that if I use RTL with a 10k pull-up for low-fanout gates (in the interest of keeping power consumption at a reasonable level), then I'm back where I started as far as speed is concerned. I'll have to do some experiments to find out.

There are possibly some logic optimizations (and I'm trying to switch away from NANDs), but there is no carry logic as it's all bit-serial. The only adder is 1 bit wide. The program counter increment is just a binary ripple counter, which I don't believe is an issue because the new value is not required for a few cycles after it's incremented. I think I can probably get some more speed out of the current design with stronger pull-ups on the data bus, but I'm still only at around 100kHz. It's probably actually fast enough already, I just can't help but look for ways to improve it and justify another revision :)

Tim wrote 05/01/2021 at 04:58

I used analog simulation to verify the behavior of my digital gates. That could also be a great tool to verify the proper sizing of resistors in an RTL design.

Why do you want to use such a large pull up? Do you have a strict power limit on the design? In the old days, people used several mA per gate for high speed circuits. (One of the reasons why RTL quickly disappeared, I guess).


Digital transistors look handy, but they are also not really optimized for switching speed. I found this out the hard way, since I also built an entire CPU only to find that the switching speed is limited by long storage times.

Also see here:

https://hackaday.io/project/170697-evaluating-transistors-for-bipolar-logic-rtl/log/175362-digital-transistors

There are really only very few fast switching bipolar transistors left: PMBT2369, MMBT2369, BSV52. These are the only ones that I am aware of that use a gold implant or other means to reduce charge storage time. XX3904 is also optimized for switching speed but is much slower.

Joe Wingbermuehle wrote 05/01/2021 at 14:09

Thanks for the reply! My plan is to try out some circuits in both simulation and on breadboards (using PN2222s since I have a lot of them and that ought to be a good lower bound on speed). And then get a few test boards for the actual devices if that looks good. I'd probably be happy with anything north of 200kHz for this level of power consumption at this point.

The high value pull-up is to keep power well under 500mA. I'm using USB for power (it's so convenient and I'm sure everyone has a drawer full of 2.5W USB adapters), so that limits the design to 500mA. Well, technically 100mA without negotiation, but I've never seen that enforced on a charger. I really like the idea of having an easy-to-use single PCB transistor computer, powered with USB, and programmed with a Raspberry Pi. Just about anyone could pick this thing up and use it. The power constraint actually makes the whole thing more interesting to me.

It's disappointing that digital transistors are so slow, but maybe they're still faster than MOSFETs? I don't like the idea of using them because it's slightly less "discrete", but then again, if I add a resistor for every transistor, the size of the PCB will have to increase (maybe I can prune some transistors from the design... use pulse-triggered flip-flops, etc.).

Karl-Wilhelm Wacker wrote 04/10/2021 at 21:59

Have you looked at how the original PDP-8 and other DEC computers sped up their logic?

The pull-up resistors pulled to a higher voltage and then were diode clamped to a lower voltage.  If you pulled the 10K's to +12V and clamped them to +5V , this would put you in the earlier and faster part of the R-C charge  pull-up curve.

Joe Wingbermuehle wrote 04/11/2021 at 12:53

They do some clever things. I've been particularly interested in how the PDP-8 implements flip-flops using a capacitor for edge detection since flip-flops take up so much room.

I like the simplicity of NMOS, but I think bipolar transistors would probably be a better choice for the most part since they're cheaper and faster. I could always try to bring some of those tricks over to NMOS (just using a higher voltage would probably help a lot, though would require a different power supply and level shifting... but hey, I already have the MOSFETs for that!). 

Karl-Wilhelm Wacker wrote 04/11/2021 at 13:43

I would keep your NMOS logic, just pull the 10K's to the higher level, and clamp to your current logic level with the diodes.  Remember that the clamp level power supply [VCC] has to sink the voltage from the diodes, not source it. A zener [multi-watt?] to ground with a pull-up resistor to 12V or whatever to bias it should do the trick, and let you keep VCC as your logic high level.

Joe Wingbermuehle wrote 04/11/2021 at 14:26

It is a neat idea for only an extra diode per gate.  If my understanding is correct, it would raise the power draw a bit. A gate would draw either 1.2mA (sinking 12V through the transistor) or 0.7mA (sinking 7V through the diode). On the other hand, with just the pull-up it draws 0.5mA when the transistor is conducting, and basically nothing otherwise.  I think it would be roughly 3x faster though.  Using a 22k resistor pull up instead would get it into the same range power-wise, but then it would only be about 30% faster if my calculations are correct.

So I think it gets 3x the performance for 2x the power whereas a lower-valued pull-up would get 3x the performance at 3x the power.  I might be thinking about this all wrong though.

Another issue to solve would be getting the 12V supply (currently I'm just powering the thing directly off of USB).
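The estimates in this reply can be sanity-checked with the same first-order RC model used in the "Faster Transistors" log. The sketch below assumes the 2.5V worst-case threshold from that log, ignores the clamp diode's forward drop (as the figures above do), and uses a placeholder capacitance that cancels out of the ratios; `time_to_threshold` is an illustrative name.

```python
import math


def time_to_threshold(r_ohms, v_pullup, v_threshold=2.5, c_farads=1.0):
    """Time for an RC node charging from 0 V toward v_pullup to reach v_threshold.

    c_farads is a placeholder: it cancels when two times are divided.
    """
    return -r_ohms * c_farads * math.log(1.0 - v_threshold / v_pullup)


baseline = time_to_threshold(10e3, 5.0)      # plain 10k pull-up to 5 V
clamped_10k = time_to_threshold(10e3, 12.0)  # 10k to 12 V, clamped at 5 V
clamped_22k = time_to_threshold(22e3, 12.0)  # 22k to 12 V, clamped at 5 V

speedup_10k = baseline / clamped_10k
speedup_22k = baseline / clamped_22k

# Static current through a 10k pull-up in each state (diode drop ignored).
i_output_low = 12.0 / 10e3             # transistor conducting
i_output_high = (12.0 - 5.0) / 10e3    # current diverted into the 5 V clamp

print(f"10k to 12V: {speedup_10k:.2f}x faster")
print(f"22k to 12V: {speedup_22k:.2f}x faster")
print(f"{i_output_low * 1e3:.1f} mA low, {i_output_high * 1e3:.1f} mA high")
```

Under these assumptions the model agrees with the rough figures above: about 3x faster with 10k, and about 30% faster with 22k.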

Karl-Wilhelm Wacker wrote 04/11/2021 at 15:58

what is your power draw for the whole unit [worst case]? A way to calculate this would be to ask how many 10K resistors in your design.

I have used a dc-dc converter to get 24v [at about 50mA] from the USB 5V for a modbus/4-20mA calibrator I made that talks on modbus to a 4-20mA output smart pressure sensor I also designed, 4-20 for operation, modbus to calibrate the sensor.

Joe Wingbermuehle wrote 04/12/2021 at 12:17

Total current draw is around 380mA. 188mA from 10k pull-ups, 85mA from 1k pull-ups, and the rest for LEDs, SRAM, etc.  I've gone through the design and calculated the capacitive load on all of the gates to try to adjust down the pull-ups for certain gates that have the greatest impact on the performance, but at some point I just need all the gates to go faster. I think the 2N7002 is probably the biggest problem with its 50pF input capacitance though, so I'm hopeful that using a different transistor (supposedly only 13pF) will help a lot with basically no extra effort or components.

marazm wrote 04/08/2021 at 17:52

Please add network. Maybe put it on the web?

Joe Wingbermuehle wrote 04/09/2021 at 11:52

Not sure about putting it on the web, but I am planning on adding an I2C interface to the next revision.

vincent stuchly wrote 04/08/2021 at 16:00

Woah, this is awesome! To be honest I would be interested also in the perfboard version. The pcb looks stunning. Great work

Joe Wingbermuehle wrote 04/09/2021 at 11:50

Thanks! There is some info and pictures of the perfboard one here: https://joewing.net/projects/q1/
