Both RTL (Resistor Transistor Logic) and DTL (Diode Transistor Logic) were common logic families in the early days of transistors. The image above shows inverters in both logic families.
RTL is the simpler of the two, but it suffers from many issues: The L->H threshold is defined by the forward voltage of the base-emitter junction of the transistor and is hence rather low, around 600mV for a typical silicon transistor. Furthermore, if the input is at high level it will sink current which is supplied by the collector resistor of the preceding gate. To maintain consistent logic levels it is therefore necessary to adjust all resistors according to fan-out. Besides that, there is also a lot of nastiness in dynamic operation.
DTL addresses these shortcomings by introducing additional diodes: When the input is high, all the base current will be supplied by the base resistor (R5) and there is no current flowing into the gate. When the input is low, current is sourced from the gate, which is sunk into the transistor of the preceding gate. Since the transistor is fairly low-ohmic when it is turned on, there is much less limitation in fan-out. Furthermore, it is not necessary to adjust gate resistors depending on fan-out. This allows using the same resistor values throughout the entire circuit.
A second improvement in DTL is to lift the L->H threshold to a higher voltage. It is now defined by Vbe+Vd2+Vd1-Vd3, which is roughly 1.3V depending on the components used.
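As a quick sanity check of the threshold formula, here is a minimal sketch in Python. The forward voltages are assumed, illustrative values for generic silicon parts (0.65V each), not measurements from this circuit; the function name is made up for this example.

```python
# Assumed, illustrative forward voltages in volts (not measured values).
V_BE = 0.65  # base-emitter junction of the transistor
V_D1 = 0.65  # diode D1
V_D2 = 0.65  # diode D2
V_D3 = 0.65  # diode D3, the term that is subtracted in the formula


def dtl_threshold(v_be, v_d1, v_d2, v_d3):
    """L->H threshold of the DTL inverter: Vbe + Vd2 + Vd1 - Vd3."""
    return v_be + v_d2 + v_d1 - v_d3


print(f"DTL threshold: {dtl_threshold(V_BE, V_D1, V_D2, V_D3):.2f} V")
```

With equal drops, one diode cancels and the threshold ends up at about two junction voltages, which matches the rough 1.3V figure.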
From DTL to LTL
The image above shows the circuit of an LTL inverter. At first glance it looks very straightforward: We replace the two silicon diodes with an LED. Since the LED has a much higher forward voltage (depending on LED type), a single diode is sufficient. This is nice, because it saves one component, but is it actually a good idea?
Replacing the base diodes in a DTL gate with an LED saves one component. One very interesting side effect is that the LED also emits light. Modern LEDs are already quite bright at around 1mA of current, so the normal base current will be sufficient to turn it on.
But what is the impact on the circuit behavior? Usually we would like to use fast switching diodes for any logic circuit. Switching diodes are optimized for low capacitance and low recovery time to be able to switch very fast between forward and reverse operation. Here I am using a BAW56 dual diode, which has a capacitance of 2pF and 6ns reverse recovery time, as the input diode. LEDs are not optimized for switching operation. They typically have a fairly high capacitance of around 40pF and take a long time to switch off. Therefore, using an LED in place of D2, the input diode, would slow down the gate significantly.
The base diode is, however, never in reverse operation. Therefore the bad switching properties of the LED are not an issue. In addition, the higher capacitance helps to pull down the base potential quicker if the transistor is to be turned off. You sometimes see intentional reach-through capacitors in parallel to the base diode in DTL gates.
Some attention has to be paid to the terminal between D1 and Q1 base. If the transistor is turned off, this terminal is pulled to negative voltage and is basically floating, since the base-emitter diode is reverse biased. I found that this can lead to a shift of switching voltage depending on duty cycle and frequency of the incoming signal. It may be advisable to use a bleeder resistor to connect this node to ground. For simplicity, I omitted this and made sure to design glitch free logic instead...
I spent quite some time optimizing the basic gate in LTspice. To measure switching speed, I simulated a 5 stage ring oscillator. One crucial choice was to pick the right transistor, as I also outlined in greater detail here.
You can see simulation results for several different configurations above. I also tried various configurations with reach through caps and baker clamps but found that choosing the right transistor, the PMBT2369, yielded much better results than all other options. The PMBT2369 is available in a SOT23 SMD package for around $0.02, so there is really no reason to use the BC847 or MMBT3904 over it.
Final parameters of the simulated inverter are shown above. The relatively high L->H delay is due to the relatively large collector resistor. Note that the threshold voltages are almost centered between the 5V supply and ground, maximizing noise margin and making the gate compatible with CMOS logic levels.
I designed a neat breadboardable LTL-NAND2 gate to test the design in hardware. You can find the PCB layout and a photo of the finished product above. I use a BAW56 dual-diode in a SOT23 package for the input diode. This form factor is much easier to handle than single diode SMD packages and there is no cost disadvantage. You can see the LED in the center of the PCB. I made a few variations with different LEDs, green (high Vf) and red (lower Vf), and different collector resistors.
Transfer characteristics were measured by using the DAC and ADC of an ATtiny416. You can see that different LEDs can be used to adjust the threshold levels of the NAND2 gate. Red, with its low forward voltage, leads to an L->H threshold around 1.65V, while green puts it at 2.24V or 2.33V depending on the collector resistor.
To build up more complex circuits it is also important to study the effect of output loading. The figure above shows the NAND2 gate with floating output and with another NAND2 gate connected to it. The threshold is slightly shifted for the loaded output, but not to a level of concern.
The scope picture above shows the transient switching behavior of a loaded NAND2 output with green LED and 8.2k collector resistor. One can see that there are actually two elements to the rising edge. Initially, the voltage rises quickly; this is then followed by a slower slope. During the initial part, the 3.4k base resistor of the connected gate helps pull up the output. However, after the threshold voltage is reached, the current from the base resistor flows into the base of the transistor and the output is only pulled up by the 8.2k collector resistor.
The behavior of the NAND2 gate with 1.8k collector resistor, as shown above, is more consistent and shows a fast rising edge up to Vhigh. I therefore decided to focus on using 1.8k as a collector resistor.
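A rough first-order estimate shows why the two-slope effect is so much less visible with 1.8k. The sketch below simply treats the collector resistor and the 3.4k base resistor of the driven gate as two pull-ups in parallel during the initial part of the edge. It ignores the drop across the input diode and all capacitances, so it is an approximation only; the helper name is made up for this example.

```python
def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel (ohms)."""
    return r1 * r2 / (r1 + r2)


R_BASE = 3400.0  # base resistor of the driven gate, value from the text

for r_collector in (8200.0, 1800.0):
    # Initial part of the edge: both resistors pull the node up.
    r_initial = parallel(r_collector, R_BASE)
    # Above the threshold only the collector resistor is left.
    ratio = r_collector / r_initial
    print(f"Rc={r_collector:.0f}: initial {r_initial:.0f} ohm, ratio {ratio:.2f}")
```

Under these assumptions the pull-up resistance changes by a factor of about 3.4 with the 8.2k resistor, but only by about 1.5 with 1.8k.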
After characterizing a basic NAND2 gate, the next step is to measure the timing properties of the LTL gate. For this purpose I built a ring oscillator based on five gates. The PCB design is shown above. I added one additional inverter as an output buffer to avoid loading of the oscillator while measuring its frequency.
I built two versions of the ring oscillator: One with red LED and one with green LED, both using 1.8k collector resistors.
The scope screenshot shows the input (yellow) and output (turquoise) waveform of one inverter in the ring oscillator (green LED). The falling edge is steep since it is actively pulled down by the transistor. The rising edge is, again, separated into two regions depending on whether only the collector resistor or both base and collector resistor are involved in pulling up the node.
The diagram above shows measurements of ring oscillator frequencies versus supply voltages. The blue line corresponds to the inverter with green LED, the orange line to a red LED. There is a clear trend towards higher frequency for higher supply, which can be explained by the availability of more switching current. The red LED LTL gate has a lower threshold voltage and therefore switches earlier and faster. Since this design is based on the PMBT2369 switching transistor, no dominant influence of base saturation is observed. LTL gates with normal small signal transistors should exhibit a speed-supply relationship similar to what I observed with RTL.
The supply current shows a linear relationship with supply voltage, as expected for a resistor-loaded gate.
The table above summarizes the characterization of the LTL gates. There is a tradeoff between high noise margin (green LED) and speed (red LED). All in all, the propagation delay looks very acceptable for discrete logic. At tp=6.9ns, the red LTL gate is almost as fast as the RTL gate in a CDC6600 and still much faster than the BC847 based DTL gates used in the MT-15, while maintaining full 5V CMOS/TTL logic level compatibility.
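For reference, this is how a ring oscillator frequency relates to the per-gate delay. The sketch assumes the textbook relation f = 1/(2*N*tp) for N inverting stages, with tp being the average of rising and falling delay. Whether the numbers in the table were derived in exactly this way is an assumption on my side here.

```python
def ring_osc_frequency(n_stages, t_pd):
    """Oscillation frequency in Hz of an N-stage ring oscillator."""
    return 1.0 / (2 * n_stages * t_pd)


def propagation_delay(n_stages, frequency):
    """Average per-gate propagation delay in seconds."""
    return 1.0 / (2 * n_stages * frequency)


f = ring_osc_frequency(5, 6.9e-9)  # five stages, tp = 6.9 ns
print(f"{f / 1e6:.1f} MHz")
print(f"{propagation_delay(5, f) * 1e9:.1f} ns")
```

Under this assumption a five-stage ring with tp=6.9ns oscillates at roughly 14.5 MHz.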
Did I mention that each gate comes with its own blinkenlight?
After successful validation of the LTL concept in real hardware I started to build up a library of common gates in LTspice as a foundation for the design of a CPU.
The basic gate of LTL is the NAND2 gate. Symbol and circuit are shown above. In my final design I used gates with different threshold levels, so I added a "G" to the name of a high threshold device with a green LED.
Every gate was tested in a simple testbench in LTspice. I arbitrarily chose a fan out (FO) of 7 for the test case, although this does not occur in the real design.
Test waveforms for the NAND2 gate are shown above - nothing peculiar. There is a little crosstalk between the inputs of the gates if one input is high and the other one is pulled low. This is due to the extremely high slew-rate of the falling edge on the output. In a physical implementation this will hopefully be reduced a bit by additional parasitic capacitances to the power plane.
Next is the NOR2 gate. This can be easily realized by a wired AND of two LTL inverters. A minor but very important detail: If a gate uses a wired AND at the output, neither of the LEDs will be representative of the output signal. In practice this means that additional indicator LEDs may have to be added to monitor certain nodes.
Last one is the AOI2 gate (AND OR INVERT). You may not be familiar with this kind of gate, but it is a very useful building block due to its simple implementation. For example, it can be used as a multiplexer or as part of an ALU.
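To illustrate the multiplexer use, here is a small behavioral sketch in Python on 0/1 values. The function and signal names are made up for this example, and it assumes that both the select signal and its complement are available.

```python
def aoi2(a, b, c, d):
    """AND-OR-INVERT: NOT((a AND b) OR (c AND d)) on 0/1 values."""
    return 1 - ((a & b) | (c & d))


def mux_inverting(sel, sel_n, x, y):
    """Inverting 2:1 mux: NOT x if sel is high, NOT y if sel_n is high."""
    return aoi2(sel, x, sel_n, y)


for x in (0, 1):
    for y in (0, 1):
        # Select x, then select y; the output is the inverted input.
        print(x, y, mux_inverting(1, 0, x, y), mux_inverting(0, 1, x, y))
```

Note that the output is inverted, so a following stage either has to expect an inverted signal or invert it again.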
Finally, a list of part counts for each gate. Since my intention is to build a CPU with a minimal amount of discretes, it is important to keep track of this.
Not too exciting, so let's get to the more special building blocks next...
As a next step we will look into options to design XOR2 gates in LTL.
A straightforward approach is to build an XOR2 gate from 4 NAND gates. This is simple and robust, but results in a propagation delay of three NAND2 equivalents. Not perfect for fast circuits. Also, the component consumption is quite high.
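The well-known four-NAND construction can be checked with a few lines of Python. This is a purely logical model with made-up function names; it says nothing about the electrical behavior.

```python
def nand(a, b):
    """NAND on 0/1 values."""
    return 1 - (a & b)


def xor_nand4(a, b):
    """XOR2 built from four NAND2 gates; the longest path is three gates."""
    m = nand(a, b)                        # gate 1
    return nand(nand(a, m), nand(b, m))   # gates 2 and 3 feed gate 4


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_nand4(a, b))
```

Any input change has to pass through gate 1, then gate 2 or 3, then gate 4, which is where the three gate delays come from.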
Another option is to use an AOI2 gate and two inverters. The number of components is almost the same as the NAND2 implementation, but now the propagation delay is only two gates. Furthermore, often inverted signals are already available as output from a previous stage. In that case, the inverters can be omitted.
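The same kind of logical check works for the AOI2 variant. The sketch below uses the identity XOR(a, b) = NOT((a AND b) OR (NOT a AND NOT b)); function names are again made up for illustration.

```python
def inv(a):
    """Inverter on 0/1 values."""
    return 1 - a


def aoi2(a, b, c, d):
    """AND-OR-INVERT: NOT((a AND b) OR (c AND d))."""
    return 1 - ((a & b) | (c & d))


def xor_aoi(a, b):
    """XOR2 from one AOI2 gate and two inverters (two gate delays)."""
    return aoi2(a, b, inv(a), inv(b))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_aoi(a, b))
```

If the inverted inputs already exist elsewhere in the circuit, the two inv() calls correspond to parts that can be dropped.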
One approach that has been discussed at length at the hackaday TTLers is to use a cross coupled transistor pair, as shown above in an XNOR2 gate. This method is really tricky and drastically reduces the part count. In the context of LTL there are a few challenges, though: The gate above is basically an RTL gate and will sink current when the input is high, which leads to a reduced fan-out. The threshold voltage is not defined relative to ground, but in reference to the second input. The gate switches when the voltage difference between both inputs is equal to Vbe (~0.7V) - very different from the normal LTL threshold. Also, the output low level is 2xVCEsat instead of 1xVCEsat. The combination of all these effects leads to some headaches when designing circuits with several of these XNOR2 gates, as they will start to influence each other and the noise margin degrades. A few changes have to be introduced to make this type of gate a bit more compatible with LTL.
An LTL version of an XOR2 gate based on a cross coupled transistor pair is shown above. First, this device has an output inverter to restore the low level. An input diode and resistor are added to avoid current sinking during high. To fix the threshold levels, two additional diodes were added (D1, D3). LEDs cannot be used in this place, because there are other elements in the current path (D1, D2, output transistor from preceding gate) that add to the threshold voltage. The threshold level is still defined by the differential voltage between the two inputs. This is still of concern, but a little less relevant now since the output levels have been restored. Assuming the input low level is 1xVCEsat, the threshold level is equal to VD1-VD3+VBE+VD2+VCEsat. Since VD1 and VD3 cancel, this is 0.7+0.7+0.2 ~ 1.6V. This is much higher than the 0.7V of the bare transistor, but still not the same as the LTL threshold levels.
The three options are summarized above. Using a 3T XOR2 gate reduces the component count drastically, but still comes with some potential to screw up signal integrity. In practice often both inverted and noninverted input signals are available. In that case using an AOI2 gate is the most straightforward option and only adds 9 components.
Now to the last category of building blocks: Flip Flops.
If you grew up learning about digital electronics in the advanced CMOS era, like me, you will most likely be accustomed to using edge triggered flip flops for everything. Unfortunately, it turns out that proper edge triggered flip flops require at least 6 NAND gate equivalents, unless you have dynamic CMOS logic at your disposal.
For those of us who were suddenly beamed into the discrete LTL age, latches are a much more part count efficient solution, as they only consume about half as many components as a static edge triggered flip flop.
A commonly known minimal representation of a gated D-latch in NAND2 is shown above. See also Wikipedia article. A nice property of this design is that it only requires a single clock input and has both inverted and non-inverted data outputs. Data from Din is forwarded to the output while Clk is high. The state of the latch is frozen on the high->low transition of Clk and held while Clk is low.
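The transparent/hold behavior can be reproduced with a small zero-delay model of the four-NAND latch. This is a sketch that assumes the usual topology, where the second input gate reuses the output of the first one instead of a separate inverter. Names like set_n and reset_n are made up for this example.

```python
def nand(a, b):
    """NAND on 0/1 values."""
    return 1 - (a & b)


def d_latch(d, clk, q, qn):
    """Settle the four-NAND gated D latch and return the new (q, qn)."""
    set_n = nand(d, clk)        # first input gate
    reset_n = nand(set_n, clk)  # second input gate, fed by the first one
    for _ in range(3):          # let the cross coupled pair settle
        q = nand(set_n, qn)
        qn = nand(reset_n, q)
    return q, qn


q, qn = 0, 1
for d, clk in [(1, 1), (0, 0), (0, 1), (1, 0)]:
    q, qn = d_latch(d, clk, q, qn)
    print(f"D={d} Clk={clk} -> Q={q} Qn={qn}")
```

The sequence first loads a 1 while Clk is high, holds it while Clk is low even though D changes, then loads a 0 and holds that as well.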
When using this design in high speed circuits it becomes apparent that it has a nasty habit of generating glitches. The origin of this effect is the NAND2 gate in the lower left. The clock signal arrives on one input directly and on the other it is delayed through the NAND2 gate in the top left.
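The mechanism can be made visible with a unit-delay model of just the two input gates. This is a simplified sketch: every gate is assumed to have exactly one time step of delay, which is not what the real gates do, and the function name is made up for this example.

```python
def nand(a, b):
    """NAND on 0/1 values."""
    return 1 - (a & b)


def simulate_input_gates(d, clk_wave):
    """Unit-delay simulation of the two input gates of the latch.

    Each gate output at step t is computed from the signals at step t-1.
    Returns the output of the second (lower left) gate for every step.
    """
    top = nand(d, clk_wave[0])    # settled initial state
    low = nand(top, clk_wave[0])
    trace = [low]
    for t in range(1, len(clk_wave)):
        c = clk_wave[t - 1]
        # Both gates switch at the same time, based on the old signals.
        top, low = nand(d, c), nand(top, c)
        trace.append(low)
    return trace


print(simulate_input_gates(1, [0, 1, 1, 1, 1]))  # prints [1, 1, 0, 1, 1]
```

With D=1 the lower left gate should stay high when the clock rises. Because the clock reaches it one gate delay before the updated output of the top left gate does, its output dips low for one step.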
This effect can be somewhat reduced by tweaking the propagation delay of the logic gates. In LTL this is easily possible by changing the LED color to change threshold voltage. In this case I introduced a faster "red" (hence the R) gate with lower threshold as the top left gate.
There is also a way to reduce gate count to three by replacing one of the NAND2 gates with a wired AND. This is described in a now expired patent. (A similar design is described in this patent)
The patent also describes working around the aforementioned glitch by introducing one faster gate. A disadvantage of this design is that it only has an inverted output. The clock input is also inverted compared to the previous design: Data will be forwarded for clk='0' and held for clk='1'.
One very important point of learning came out of actually simulating a full design including the latch and observing dynamic operation. To do this, it is necessary to perform a transient simulation in Spice - LTspice was used here.
The figure above shows simulation results and modifications to the latch design that were applied as a result of the observations. The latch was used as part of a program counter here. The first line (in red) of the simulation traces shows the fourth bit (A3), the second line shows the sixth bit (A5).
The leftmost column shows the results of the unmodified circuit. A3 looks nice and clean. However, A5 shows a series of negative spikes when the output is high. Interestingly, this behavior changes over time. It should be noted that this is mostly a cosmetic issue at this point, because the glitches are far away from the clock edge where the data is latched. Eliminating this effect should still be a priority, as the additional noise may snowball into actual bit errors when the timing gets tighter.
The culprit is the gate indicated by the red box, which receives a clock signal that is partially delayed by another gate. The lowermost trace shows the node directly at the base of the gate's transistor. This node is basically floating when the gate is turned off, since both the LED and the base junction are reverse biased. You can see that it assumes a deeply negative potential. It appears that some of the charge on the node trickles away over time, making the gate more sensitive to spikes on the inputs.
A solution to this issue is to make the gate in question "weaker", so it becomes less sensitive to transient conditions at the inputs. I tried two ways of doing so: Increasing the base resistor (middle column) and adding a small capacitor (right column). Ultimately I settled on the additional capacitor, because increasing the base resistor would reduce the current through the LED, which makes the LED Vf unpredictable and reduces blinkyness (important!).
A case in point: Don't underestimate the trouble with latches. I settled on the design above for now, but would most likely look into further options when starting another design with LTL.
Now that all the gates have been designed and tested, let's discuss the CPU architecture. I spent quite some time pondering a minimal architecture that is suited to an LTL implementation. To make a long story short: I essentially ended up at the MCPU architecture again.
There are some ways to simplify the ISA or the datapath, for example by replacing the ADD instruction with something else (think Brainfuck) or reducing datapath width. All of these lead to an explosion in number of instructions needed to perform even the simplest operation, which will in turn increase memory size requirements and hence address path width. A nice example of this extreme are designs like the Qibec, which is a very nicely done implementation of an "invert bit and branch" one bit OISC. While it reduces the datapath to one bit, the address path needs to be increased to 16 bit to allow any meaningful program.
I may rant more about this at a later time, but in essence there seems to be little overall gain for these trade offs and all of them come at a great disadvantage in usability of the instruction set architecture.
I noticed way too late that I was actually not the first to implement the MCPU as a discrete CPU: The ED-64 is a very great looking implementation based on core memory. Kudos to Andrew for this great effort!
The original MCPU architecture is shown above. There are separate flows for the address and data-path. The state machine is not shown. Since this design is catered for a CPLD, it assumes edge triggered flip-flops for all registers.
The MCPU programmers model is based on two registers: Accumulator and PC. The ISA consists of four instructions in a fixed encoding that is based on a two bit opcode and a six bit memory address. This ISA can be directly mapped to the datapath shown above with minimal control overhead. See MCPU link for more information including assembler, emulator and example code.
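A minimal sketch of how such an instruction word can be split into its two fields. It assumes that the opcode sits in the upper two bits and the address in the lower six bits of an 8 bit word; the function name is made up, see the MCPU documentation for the actual encoding and the meaning of each opcode.

```python
def decode(instruction):
    """Split an 8 bit instruction word into (opcode, address).

    Assumption: opcode in bits 7..6, address in bits 5..0.
    """
    if not 0 <= instruction <= 0xFF:
        raise ValueError("instruction must fit in 8 bits")
    opcode = instruction >> 6      # upper two bits
    address = instruction & 0x3F   # lower six bits
    return opcode, address


print(decode(0b01_000101))  # prints (1, 5)
```

Six address bits give 64 addressable memory locations, which is why larger programs need a wider address path.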
In LTL it is much simpler to use latches instead of edge triggered flip flops. Introducing latches requires changing the design in some places to avoid race conditions where the input of the latch is dependent on its output. This is, for example, the case for the accumulator. The diagram above shows a latch-based architecture. The main modification is the addition of a data latch. The data latch is used to prevent race conditions for memory loads and accu-accu operations.
For implementation, I divided the design into three sections: The address path, the datapath and control. This modular architecture allows some freedom to modify the CPU at a later time and add more instructions. Both data paths will be implemented in a bitslice architecture. This allows reconfiguring the CPU from 8 (data)/6 (address) to other configurations like 16/12 by just adding more slices.
Implementation of the address path is straightforward based on the previously designed gates.
The first design of the address path is shown above. A half adder is needed to increment the PC. Since the gates are relatively fast, I used a ripple carry adder. There are two latches for address and PC respectively. An AOI2 gate is used as multiplexer between PC and external address. The address latch can be loaded with zeros by pulling both control inputs of the MUX to low. This can be used to reset the PC.
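The following behavioral sketch in Python models the function of the address path on lists of bits (LSB first). It is not a gate-accurate model: the inversion of the AOI2 gate and the two-phase latch timing are ignored, and the exact wiring order of multiplexer, latches and incrementer is an assumption. All names are made up for this example.

```python
def half_adder(a, carry_in):
    """Return (sum, carry_out) for one bit slice."""
    return a ^ carry_in, a & carry_in


def increment(bits):
    """Ripple carry increment of an LSB-first bit list (wraps around)."""
    carry = 1
    out = []
    for bit in bits:
        s, carry = half_adder(bit, carry)
        out.append(s)
    return out


def address_mux(pc_bits, ext_bits, sel_pc, sel_ext):
    """AND-OR selection per slice; both selects low yields all zeros."""
    return [(sel_pc & p) | (sel_ext & e) for p, e in zip(pc_bits, ext_bits)]


def to_int(bits):
    """Convert an LSB-first bit list to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


ext = [1, 0, 1, 0, 1, 0]
address = address_mux([1] * 6, ext, 0, 0)   # reset: both selects low
for _ in range(3):
    pc = increment(address)                  # PC latch gets address + 1
    address = address_mux(pc, ext, 1, 0)     # address latch takes the PC
    print(to_int(address))
```

After the reset the address counts 1, 2, 3. Adding more entries to the bit lists corresponds to adding more slices.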
The image above shows the design after some optimization to reduce part count. Each slice consists of 12 NAND equivalents and 56 components total. Note that the carry delay was increased to two gates.
Six of the bit slices were combined in a testbench to test the design of the full width address path. A two phase clock is needed to alternately clock both latches. In this setting the circuit is configured to reset the PC and then increment it with each clock cycle.
Output traces of clock, reset and the first two output bits are shown above. Everything works nicely.
Designing PCBs with regular structures using free tools turned out to be surprisingly tedious. In the end I used the free version of Eagle due to its block feature, which allows replicating circuit and layout units. The other options I looked at were EasyEDA and KiCad.
Unfortunately, the free version of Eagle is limited to a single sheet, so I ended up transferring my nice hierarchical LTspice implementation into a single paged mess of a schematic. The image above shows a one bit slice.
Curiously enough, none of the many LEDs in the LTL gates represented the actual state of the address output pins. Therefore I introduced two additional indicator LEDs.
The final PCB layout is shown above. I managed to fit a four bit wide slice of the address path onto one 8.5 x 7 cm PCB.
A populated PCB is shown above. Since I am not too fond of placing all those parts by hand, I used an SMD assembly service. The final stats are: 52 transistors, 48 diodes, 24 capacitors, 56 LEDs and 88 resistors.