I can't keep track of all the awesome "discrete" CPU designs on my own project. There is a list of such projects but it is "curated". Why not make my own list and invite like-minded hackers ?
If you have a similar project here, drop me a message and I'll add you to the contributors.
For practical reasons (it's impossible to list everything on the 'net), the "project" is mostly about gathering people from HaD who built their CPU (or at least digital electronic devices). Here are some external links for those who just can't get enough:
The author presents his TTL gate, then his modification inspired by TTL.
I did some tests and tried a basic gate and... "it's a weird AND".
More precisely, it works as a linear amplifier with close to no voltage gain but strong current gain. When the circuit is redrawn, it's obviously a pair of complementary "emitter followers" with the output clamped above 2.5V-Vbe=1.7V. The output can go down to about 0.2V in my tests.
This circuit also has a strong tendency to oscillate. My test setup was poorly designed but I could stop 60MHz oscillations with a 4n7F capacitor at the input of the PNP. I'll see how I can get a stable circuit...
Since this is just a pair of emitter followers, why bother with using PNP inputs after all ? With my BC559C, each with hFE=480, the overall gain is about 200K, the input current is very low but this is overkill and oscillations are not surprising at all.
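As a quick sanity check of that "about 200K" figure, here is a minimal sketch that multiplies the per-stage current gains. It assumes each follower contributes roughly hFE+1 and that both transistors really have the hFE=480 quoted above; the function name is made up for the illustration.

```python
# Rough current gain of two cascaded emitter followers.
# Assumption: each stage contributes about (hFE + 1), and both
# transistors have the hFE = 480 quoted in the text.

def cascaded_current_gain(hfe_values):
    """Multiply the per-stage current gains of a chain of followers."""
    gain = 1
    for hfe in hfe_values:
        gain *= hfe + 1
    return gain

gain = cascaded_current_gain([480, 480])
print(f"overall current gain ~ {gain}")  # 231361, i.e. roughly 200K
```

With that much current gain around a loop full of stray capacitance, the oscillations are indeed no surprise.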
The PNP emitter followers at the input are nice. The NPN emitter follower at the output is nice. AND gates are very useful in some places. However this is not what we expect from a "logic gate" because there is no real "active level" or "threshold". Current gain is nice but voltage gain is important too ! So this CTL might be faster than ECL but ECL can do more functions and provide inversion.
After this little setback (or disappointment) I looked at other ways to make this circuit and a variation appeared : it replaces selected PNP input transistors with NPN.
Thus instead of inverting the output, we can invert the necessary input(s) and we apply "bubble pushing" :-)
Of course the logic levels are modified but this leads to the interesting concept of a cascade of emitter followers, "or-dotted" together for the OR functions, with parallel transistors for AND or OR functions, and complementation (switching from PNP to NPN and vice versa) for the negation.
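The "bubble pushing" above is just De Morgan's theorem: an inverted output can be traded for inverted inputs if the gate type is swapped. A small sketch that checks this exhaustively (plain boolean logic only, it says nothing about the actual voltage levels; the helper names are made up):

```python
from itertools import product

def nand(a, b):
    return not (a and b)

def or_with_inverted_inputs(a, b):
    return (not a) or (not b)

def nor(a, b):
    return not (a or b)

def and_with_inverted_inputs(a, b):
    return (not a) and (not b)

def equivalent(f, g, n_inputs=2):
    """True if f and g agree on every input combination."""
    return all(f(*bits) == g(*bits)
               for bits in product([False, True], repeat=n_inputs))

print(equivalent(nand, or_with_inverted_inputs))   # True
print(equivalent(nor, and_with_inverted_inputs))   # True
```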
[updated 20180930, read the comments below for more background]
People usually confuse the operating frequency of the computer with the max. frequency of its individual parts.
Let's say a CPU runs at 1GHz, that must mean each transistor switches 1 billion times per second, right ? Hahaha I'm kidding.
Actually the Ft (transition frequency) of transistors is way higher than that. The whole circuit is slowed down by other factors such as wires, capacitances and resistances (which together form distributed RC networks), and countless others. Of course, the length of the CDP (critical datapath) matters too.
On average, I have observed a 1:50 ratio between the operating frequency of a processor and the "speed" of its constituent transistors, for reasonable architectures. This might be lower for recent ultrapipelined processors but when you make your own discrete processor, divide the Ft by 50 to get your final processor's speed. A ratio of 100 is much more realistic for a hobby project but it's less optimistic...
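To put numbers on this rule of thumb, here is a tiny sketch. The 300MHz Ft is only a placeholder value for the example, not a measurement; plug in the figure from your own transistor's datasheet.

```python
def estimated_clock_hz(ft_hz, ratio=50):
    """Rule-of-thumb processor clock: transistor Ft divided by the ratio."""
    return ft_hz / ratio

FT_HZ = 300e6  # hypothetical Ft, for illustration only

for ratio in (50, 100):
    mhz = estimated_clock_hz(FT_HZ, ratio) / 1e6
    print(f"ratio {ratio}: about {mhz:.1f} MHz")
```

With these assumed values the estimate is about 6MHz for the optimistic ratio and 3MHz for the hobby-grade one.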
The ratio of 50 is a realistic ceiling that shows the influence of parameters outside the transistor's ideal characteristic. One such influence is the type of logic gate (TTL, DTL, CTL, DCTL, ECL...) so you have to measure your individual inverter gate speed (for example with a ring oscillator) for a better estimate.
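For the ring oscillator measurement, the usual first-order relation is f = 1/(2·N·tpd) for N inverting stages, so the per-gate delay can be recovered from the measured frequency. A minimal sketch, assuming identical stages and ignoring probe and wiring loading (the 5 stages and 10MHz below are made-up example values):

```python
def gate_delay_from_ring(freq_hz, stages):
    """Per-gate propagation delay from a ring oscillator's frequency.

    Uses the first-order relation f = 1 / (2 * N * tpd), which assumes
    all N inverting stages are identical.
    """
    if stages < 3 or stages % 2 == 0:
        raise ValueError("use an odd number of inverting stages (3 or more)")
    return 1.0 / (2 * stages * freq_hz)

# Hypothetical measurement: 5 inverters oscillating at 10 MHz.
tpd = gate_delay_from_ring(10e6, 5)
print(f"about {tpd * 1e9:.1f} ns per gate")
```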
I'd be happy to get more datapoints from various architectures and implementations. A chart would help us identify the factors that inflate or decrease this ratio and give us a better prediction.
Note : this rule applies to transistors and semiconductors, not relays, where the delay is limited essentially by the contact switching speed and RC delays are irrelevant.
I sometimes find a small circuit with 3 resistors and 2 transistors that performs the eXclusive OR operation.
These two interlocked transistors use a very unusual structure, which requires the least theoretical number of switching elements, but it depends on a trick : the input impedances matter a lot and the circuit depends on a "hard" 0 level, because the circuit behaves almost like a "pass" element...
Thus, the question : is it the best method ? What about the switching speed or the capacitances ?
XOR is pretty important in CPUs because many mechanisms rely on it, for example ALUs. Does the gain in parts count affect the performance ? Apparently, it's pretty close to ideal because it's touted as a solution in Direct Coupled Transistor Transistor Logic:
Another version has only one transistor but 4 diodes :
Another question is : can this scheme (no amplification, just relying on the input's strength) be extended to other logic or sequential functions ?
The XOR gate has a much wider range of implementations in MOS and CMOS. You can find circuits using 4, 6, 8, 9, 10 or 12 transistors, again with varied strengths for the inputs and the output. For example, pass-transistor logic (transmission gates) makes it pretty simple :
Each pass element is a pair of complementary transistors, so this gate uses 2 NMOS and 2 PMOS. Add as many if you want to isolate the outputs with inverters...
Oh and don't forget another inverter at the output. This is why you'll find various transistor counts. Unless the designer wants to decompose the function into elementary boolean functions, and the size explodes, depending on how you break it up:
This decomposition leads to the "classic" CMOS XOR gate:
which gains weight again when the inputs are buffered and inverted :
XOR has a reputation of a "slow and large gate" for this reason and that's why I investigate smarter topologies and their compromises.
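To illustrate how the decomposition into elementary functions can vary, here is a sketch that checks two textbook ways of breaking XOR up: a sum of products and a four-NAND network. These are generic examples and not necessarily the decompositions drawn in the figures of this log.

```python
from itertools import product

def nand(a, b):
    return 1 - (a & b)

def xor_sum_of_products(a, b):
    # (A AND NOT B) OR (NOT A AND B): two inverters, two ANDs, one OR
    return (a & (1 - b)) | ((1 - a) & b)

def xor_four_nand(a, b):
    # Classic four-NAND network, the first NAND is shared
    n1 = nand(a, b)
    return nand(nand(a, n1), nand(b, n1))

for a, b in product((0, 1), repeat=2):
    assert xor_sum_of_products(a, b) == xor_four_nand(a, b) == (a ^ b)
    print(a, b, a ^ b)
```

Both give the same truth table, but the gate (and thus transistor) count differs, which is one reason why the published counts vary so much.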
Another version is also pretty nice :
This is interesting for my #Yet Another (Discrete) Clock because it is almost suitable for MOSFETs. The B input must double the transistors because of the inherent diodes but it's "only" 3×BS170 and 3×BS250. In this case, the B input actually works as a multiplexer or transmission gate... Which means it might not be suited for ultra high speed.
Even fewer parts with this 3T XOR :
In this case, only 1×BS170 and 3×BS250 are required. It's still not ideal because the BS250 is more expensive than the BS170 but I don't see how to permute the polarities without requiring more inverters... Furthermore, there seems to be a conflict with one of the input combinations : B=1 forces the output to 0, but if A=0 then the input B (which is =1) is forced to 0 by itself... The solution is another PFET controlled by A, in series with the grounding NFET.
M1 has a 1/1 ratio, almost a square, with minimal size, hence highest resistance, while M3 has a high ratio to overcome the pull-down from M1. For very high-speed CMOS circuits, where power is dominated by switching (and leakage for the newest processes) this short current can be considered "negligible".
Another interesting compromise uses only 2 of each type:
But the "upper pass trick" on input A might still need doubling of the P-MOSFET to cancel the parasitic body diodes. This could be cheaper if XNOR was made instead, so 2 PFETs are tied to Vcc in series, and the NFETs are used as pass elements.
Lately I was looking for very fast diodes to design faster DTL/TTL discrete gates.
Silicon epitaxial diodes can be quite fast but still have a limited frequency of rectification (particularly the cheap ones).
As noted by @K.C. Lee on #YGREC-ECL : "Carrier mobility isn't as good as electrons. That's why NPN, N-MOSFET have better performance than their PNP, P-MOSFET counterparts." so a complementary TTL gate, with a PNP input stage, would probably be speed-limited by the input transistors.
"The recovery time makes a difference in several designs including switching power supplies. If you dig into the physics, there is usually a trade-off between several other parameters and recovery time. Just to give you an idea, the datasheet for a BAT42 Schottky diode says the reverse recovery time at 10mA is no more than 5 ns."
Rectification speed was already a burning subject in the 40s because it was essential to the war effort (faster diodes means higher carrier frequencies, shorter wavelengths and better radar resolution).
ECL avoids all these issues because:
there are only NPN transistors
there is no diode (no recovery time)
However there are more transistors... so maybe DCTL is an interesting alternative ?
@Yann Guidon / YGDES asked me to do a write-up of the Direct Coupled Transistor Logic (DCTL) of the famous CDC 6600 computer. When it was released, and for some years, the CDC 6600 was one of the fastest and most powerful computers in the world. When we take a look at the logic family that it used, it is easy to see why:
It uses very few components, primarily transistors, with no diodes
The transistors which perform the logic are driven very hard, to the point where the quality of the transistor fabrication actually matters a great deal
The logic levels are dangerously close together (in the 6600’s case “0” = 0.2V, and “1” = 1.2V).
The cascading and interlinked circuits must be carefully “tuned”: input and output impedances MUST agree precisely, or the logic will not function, because of noise or other effects.
The Basic Unit: The Inverter
There are two main articles available online which deal with the electronic description of DCTL:
This is not a great deal. On Bitsavers, there is no folder for the engineering documents of the 6600, as opposed to the 1406, and 3600-series CDC logic, which I also aim to cover because it is a very interesting high-speed DTL.
So bear in mind that the information I am presenting here is limited, and if you want to build your own DCTL circuits you are most likely going to have to design your own, because there are no complete design documents online for the 6600 which you would have been able to copy and modify.
Anyway here is the basic building block of DCTL:
The CDC 6600’s inverter:
The 1950s DCTL inverter:
By the time the 6600 was built, transistor fabrication had developed and improved markedly. In fact the first few pages of the chapter of the CDC-published book on the digital electronics of the 6600’s DCTL go on about how the new silicon transistors they used in ‘69/‘70 made the 6600 possible. So this explains why the 6600 uses NPNs, as opposed to the older implementation using PNPs.
As you can see, you would be forgiven for mistaking DCTL for Resistor-Transistor Logic (RTL) if you had only an inverter to look at. I agree with the speculation on the (tiny) Wikipedia article on DCTL that it evolved from RTL.
Obviously the thought process that led to developing this logic family was “what if we had RTL but got rid of all the resistors?” The point of having resistors in RTL is to allow you to increase the voltage margins of the logic levels. It also allows you to better control the flow of current throughout the circuitry and match the impedances of the inputs and outputs.
The first problem you have with DCTL is – how do you make sure you can switch transistors without driving them too far into saturation? The solution is to use transistors with special impedances and gain ratios.
Take a look at the special transistor characteristics that the conference proceeding document outlines:
This is obviously based on an old understanding of exactly how well-designed transistors are, but you can see we’re only switching very small amounts of current, and the V(BE) of the operation of the transistors when in saturation/conduction is far lower than your standard BC548/9 or 2N3904/6.
I haven’t checked yet, but I believe transistors with these kinds of characteristics should be able...
(note : the ideal current is about 4mA per gate and the oscillator reaches 307MHz on breadboard)
On the other hand, Complementary Transistor Logic is a bit like DTL but the input diode is replaced by a PNP. This greatly increases the input impedance and helps with many things. Operating voltage and current might be significantly lower, and it even reduces the transistor count by 2 compared to ECL. But even if it's easy to get one sort of FAST transistors, the complementary type might not be as easy to find, as cheap or as fast... I have only stocked one type of germanium (PNP, because Ge NPN is rare) and silicon (well, I have mostly NPN, some PNP, but I have no idea how to find a PNP equivalent of BFS480...)
(as usual, the 2 diodes in series could be replaced by a red LED ?)
The BC857 has a high gain so the input current can be very low and this reduces fanin/fanout issues. The speedup capacitor might need some tuning, maybe 1 or 2nF ? And the resistors could be reduced to increase current and speed.
I've also read mentions of hysteresis of CTL gates, due maybe to capacitance, which can reduce the operating speed. A Schottky diode might be needed to remove bias buildup... or even add a resistor in parallel with the speedup capacitor ? Or what about simply avoiding the voltage shift by using more power rails ?
Another parameter is : sometimes, using better and faster transistors simply lets the gate run faster. But it can come at a high price so topology is still critical...
Now, the only way to compare is to try, right ?
(repost courtesy of @Dana Myers ) Just for the sake of discussion, Fairchild had their own TTL family back in the day when the 74-series was not an industry standard (or Fairchild had their "74F" series). They did TR, TD and TT : transistor-resistor logic, transistor-diode logic and here transistor-transistor logic. The datasheet shows no Baker clamp, but R5 might have helped...
But 5 transistors for one NAND, that's similar to ECL density :-D
EDIT 20180918 : Mea Culpa !
Apparently I made a big mistake and used the wrong schematic for CTL. CTL seems to be derived from ECL and is claimed to be even faster (for certain values of "fast" because I still have to try it)
From what I have gathered from the MT15 project, with BC847/BC857:
TTL with some load: 50ns.
TTL Schottky clamped: 35ns.
Differential ECL: 20ns.
CTL two input AND gate: maybe 5ns. To be measured...
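To get a feel for what these gate delays mean at the processor level, here is a rough sketch that divides one clock period by a chain of gates. The critical path depth of 10 gates is purely an assumption for the example, the CTL figure is still an unmeasured guess, and wiring delay and register setup time are ignored.

```python
# Gate delays quoted above, in nanoseconds (the CTL value is not measured yet).
GATE_DELAY_NS = {
    "TTL with some load": 50,
    "TTL Schottky clamped": 35,
    "Differential ECL": 20,
    "CTL two input AND (guess)": 5,
}

PATH_DEPTH = 10  # hypothetical number of gates in the critical datapath

def max_clock_mhz(gate_delay_ns, depth):
    """Upper bound on the clock if the period is only depth * gate delay."""
    return 1e3 / (gate_delay_ns * depth)

for family, delay_ns in GATE_DELAY_NS.items():
    print(f"{family}: {max_clock_mhz(delay_ns, PATH_DEPTH):.1f} MHz at best")
```

Real circuits will land below these bounds, but the ratios between families should hold as long as the same datapath is compared.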
The schematic is subtly different from the one I included earlier :
As above, the inputs are PNPs that short the base node to GND. However the difference with the previous schematic is the output stage: it is a follower and not a "shorter to GND", so it does not saturate.
Note also that the gate above is an AND gate and no inversion takes place. I'd love to see an inverter, a MUX, an OR, an XOR...
I'm also unsure about the logic levels, temperature susceptibility and noise : I'll have to compare to ECL and DCTL :-P
The availability of a richer set of logic functions is critical for me, but there is an added simplicity to CTL : one can use SIL resistor networks to keep the parts count low and share a single package for several gates. No caps, no diodes, only 2 resistors with very close values...