I can't keep track of all the awesome "discrete" CPU designs on my own. There is a list of such projects but it is "curated". Why not make my own list and invite like-minded hackers ?
If you have a similar project here, drop me a message and I'll add you to the contributors.
For practical reasons (it's impossible to list everything on the 'net), the "project" is mostly about gathering people from HaD who built their CPU (or at least digital electronic devices). Here are some external links for those who just can't get enough:
These days I'm contemplating "tasting" BFP740 (44GHz GBW but not in stock so far) and 2N2369 gates (I have a fistful but not enough to make anything interesting)...
I propose to create a new project/page where we gather all the ring oscillator experiments, sort them by technology, discuss measurement details (and gotchas) and agree on a standard "size" to help tally and compare speeds, efficiencies etc.
I was thinking that with my BFS480 (rated at 7GHz) I would need 9 inverters in series to have a reasonably observable waveform and a frequency that my HP5335A could accurately follow.
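As a rough sketch of why the number of stages matters for the measurement: an ideal ring of N inverters oscillates at f = 1/(2·N·t_pd), so more stages bring the frequency down into a range a counter can follow. The per-gate delay used below is a hypothetical placeholder, not a measured value for any transistor mentioned here.

```python
def ring_oscillator_frequency(stages: int, t_pd: float) -> float:
    """Ideal oscillation frequency (Hz) of a ring of `stages` inverters.

    An edge must travel around the ring twice (once rising, once falling)
    to complete one period, hence the factor 2.
    """
    if stages < 3 or stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverters (3 or more)")
    return 1.0 / (2 * stages * t_pd)


def delay_from_frequency(stages: int, freq_hz: float) -> float:
    """Invert the formula: per-gate delay (s) from a measured frequency."""
    return 1.0 / (2 * stages * freq_hz)


# Hypothetical per-gate delay of 1 ns (an assumption, NOT a measurement)
f9 = ring_oscillator_frequency(9, 1e-9)
print(f"9 stages: {f9 / 1e6:.1f} MHz")
```

Going the other way, `delay_from_frequency(9, measured_hz)` turns a counter reading into a per-gate delay, which is the number that can be compared across technologies once a standard ring size is agreed on.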
Edit : this exploratory page is interesting but not the final word. The rest is logged at More bistables...
You know that a MUX can be easily turned into a latch by looping the output back to one input...
And in From XOR to MUX I turn a XOR into a MUX. So the next logical step is to connect the output to one of the inputs...
The natural choice is to connect Y to A because the polarities are compatible.
This is unfair for the /B input which is inverted and requires a pull-down transistor.
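The MUX-to-latch trick can be sketched as a purely logical model. The polarity of Sel (1 = transparent, 0 = hold) is an assumption made for illustration, and the class name is hypothetical; only the Y-to-A feedback and the inverted /B input come from the description above.

```python
def mux(sel: int, a: int, b: int) -> int:
    """2-to-1 multiplexer: returns a when sel is 0, b when sel is 1."""
    return b if sel else a


class MuxLatch:
    """Transparent latch: the MUX output Y is looped back to its A input."""

    def __init__(self) -> None:
        self.y = 0

    def step(self, sel: int, n_b: int) -> int:
        b = 1 - n_b                   # /B is active-low, so invert it
        self.y = mux(sel, self.y, b)  # sel=0 re-selects the old Y (hold)
        return self.y


latch = MuxLatch()
trace = [latch.step(1, 0),  # transparent, /B low  -> stores 1
         latch.step(0, 1),  # hold, /B is ignored
         latch.step(1, 1),  # transparent, /B high -> stores 0
         latch.step(0, 0)]  # hold, /B is ignored
print(trace)
```

This ignores every analog concern (input impedance, drive strength, timing) and only shows why the feedback has to go to the non-inverted input.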
My quest is to make a D-FlipFlop circuit with the smallest number of bipolar transistors. A pair of latches will require another NPN to pull /D low, but another topology is possible if complementary transistors are allowed :-) As in the early IBM ECL circuits (Current Steering Logic) I can make the next stage complementary to save one transistor... As a bonus, there is no need for a complementary clock signal and the output data has recovered its original polarity :-D
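The idea of two latches driven by a single clock wire can be sketched behaviourally. This is an idealised model with hypothetical class names: it assumes the first latch is transparent while CLK is low and the second while CLK is high, and it knows nothing about transistors, levels or delays.

```python
class Latch:
    """Transparent latch: follows d while enabled, holds otherwise."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, enable: bool, d: int) -> int:
        if enable:
            self.q = d
        return self.q


class MasterSlaveDFF:
    """Two latches sharing one clock, enabled on opposite clock levels."""

    def __init__(self) -> None:
        self.master = Latch()
        self.slave = Latch()

    def step(self, clk: int, d: int) -> int:
        m = self.master.update(clk == 0, d)    # open while CLK is low
        return self.slave.update(clk == 1, m)  # open while CLK is high


dff = MasterSlaveDFF()
# (clk, d) pairs: Q may only change when clk goes from 0 to 1
stimulus = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
q_trace = [dff.step(clk, d) for clk, d in stimulus]
print(q_trace)
```

In the model, Q only takes the value that D had just before the rising edge; whether a real complementary pair of latches behaves this cleanly is exactly what has to be checked on the bench.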
Now, the more I look at it, the more I doubt it can work as is. There must be errors here and there...
I'm sure the CLK signal will create quite a lot of problems and it must be split into overlapped, out-of-phase signals (2-phase clock ?)
But I'll have to test and you know, you're never safe from a good surprise... who knows if it could be the basis of a new clock or UART ?
Normally, both transistors must invert the signal, but in your circuit one of them is an emitter follower that does not invert.
"yes but" double inversion is not the real requirement for latching, it's a consequence of the fact that transistors can only invert. We can apply the https://en.wikipedia.org/wiki/Barkhausen_stability_criterion which states : loop gain > 1 and loop phase = 0 (mod 2Pi). To fulfill this condition, you need 2 transistors because each adds a phase of Pi, and their gain is >1. However, most latches use both transistors in common emitter configuration, which creates the double inversion. Here I use another structure, similar to a SCR https://en.wikipedia.org/wiki/Thyristor
"It acts exclusively as a bistable switch, conducting when the gate receives a current trigger, and continuing to conduct until the voltage across the device is reversed biased, or until the voltage is removed" because I use the common collector output (borrowed from the classic ECL gate structures). My circuit is almost identical, I only added a base resistor to prevent damage and too hard a saturation. The phase is 0 and the gain is very high so latching should occur as long as the CLK level is high enough (which will be another concern for later).
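The gain/phase argument can be written down as a small check. This is only a sketch: each stage is reduced to a hypothetical (gain, phase) pair, the numbers are made up for illustration, and nothing here models the actual circuit.

```python
import math


def loop_can_latch(stages):
    """stages: list of (gain_magnitude, phase_shift_in_radians) around the loop.

    Returns True when the total loop gain exceeds 1 and the total phase
    is 0 modulo 2*pi, which is the condition quoted above.
    """
    gain = math.prod(g for g, _ in stages)
    phase = sum(p for _, p in stages) % (2 * math.pi)
    # tolerate tiny floating-point errors around the wrap-around point
    in_phase = min(phase, 2 * math.pi - phase) < 1e-9
    return gain > 1 and in_phase


# Hypothetical gains, for illustration only
two_inverters = [(20.0, math.pi), (20.0, math.pi)]  # classic cross-coupled pair
one_inverter = [(20.0, math.pi)]                    # inverts once: no latching
non_inverting = [(0.98, 0.0), (50.0, 0.0)]          # follower + non-inverting gain

print(loop_can_latch(two_inverters),
      loop_can_latch(one_inverter),
      loop_can_latch(non_inverting))
```

The point of the third case is that a stage with less than unity voltage gain does not prevent latching, as long as the product around the loop stays above 1 and the total phase is 0.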
Personally I would design a transistor CPU in such a way that the registers are latches (that was also done by Dieter in his transistor CPU).
I agree too : this cuts the transistor count in half and this is what is intended for #YGREC8.
However it is necessary to see the full DFF working on the bench and be familiar with its idiosyncrasies, before I cut it in half. It's important because I'll have to decide which part is NPN and which part is PNP. Apparently here the first/common latch is PNP because there are fewer transistors, and the bulk (replicated for each register) would be NPN because I have more of these.
The speed and timing of the circuit will depend on the power supply, the saturation and other parameters... I might have to add an anti-saturation diode in the SCR latch part, while the saturation of the emitter might not be such a problem. In fact, saturation is often considered in common emitter configurations, but here the emitter is a data input so I'm in totally uncharted territory...
And I would love to test the circuit in both Silicon and Germanium versions. I don't have Germanium NPN transistors though (or very few, at best) so it would be interesting to find a solution with only a single type/polarity.
Time to play with Falstad !!!
So I played with Falstad for hours and came up with this simulation...
But thinking about how XOR is done with pass transistors in CMOS, where the structure often creates a MUX, I wondered if I could translate this concept back to the bipolar world.
This first result is pretty nice and compact though the circuit is highly unbalanced...
A is a typical high-impedance input where a high signal is a valid 1.
/B is a negated low-impedance input that must be shorted to -V to make a valid 1. Another transistor can do the trick though that would create another delay...
Sel has to swing High and Low...
But for discrete, parts-constrained circuits, that might work...
The output could be used to directly drive another MUX stage if the next MUX swaps the NPN for PNP (and reverse polarity) though a big MUX could also be built with the single-transistor NPN-ANDN gate to then drive a big CTL AND gate.
The author presents his TTL gate, then his modification inspired by TTL.
I did some tests and tried a basic gate and... "it's a weird AND".
More precisely it works as a linear amplifier with close to no voltage gain but strong current gain. When the circuit is redrawn, it's obviously a pair of complementary "emitter followers" with the output clamped above 2.5V-Vbe=1.7V. The output can go down to about 0.2V in my tests.
This circuit also has a strong tendency to oscillate. My test setup was poorly designed but I could stop 60MHz oscillations with a 4n7F capacitor at the input of the PNP. I'll see how I can get a stable circuit...
Since this is just a pair of emitter followers, why bother with using PNP inputs after all ? With my BC559C, each with hFE=480, the overall gain is about 200K, the input current is very low but this is overkill and oscillations are not surprising at all.
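The "about 200K" figure is simple arithmetic, assuming the overall current gain is just the product of the two hFE values (the helper name below is made up for this example):

```python
def cascaded_current_gain(*hfe_values: int) -> int:
    """Overall current gain of cascaded stages, as a plain product of hFE."""
    total = 1
    for hfe in hfe_values:
        total *= hfe
    return total


gain = cascaded_current_gain(480, 480)  # two stages with hFE = 480 each
print(gain)
```

The exact product is 230400, so "200K" is the right order of magnitude.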
The PNP emitter followers at the input are nice. The NPN emitter follower at the output is nice. AND gates are very useful in some places. However this is not what we expect from a "logic gate" because there is no real "active level" or "threshold". Current gain is nice but voltage gain is important too ! So this CTL might be faster than ECL but ECL can do more functions and provide inversion.
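To illustrate why this behaves as a "weird AND" with no threshold, here is a first-order DC model. It assumes a 2.5 V supply and Vbe = 0.8 V (both inferred from the "2.5V-Vbe=1.7V" figure, reading it as the upper limit of the output), PNP input followers sharing one pulled-up node, and one NPN follower at the output. The function names are hypothetical and the model ignores base currents, saturation and the oscillation problem.

```python
VCC = 2.5  # assumed supply voltage, read from the "2.5V-Vbe" figure
VBE = 0.8  # assumed base-emitter drop, implied by 2.5 V - 1.7 V


def pnp_input_node(inputs, vcc=VCC, vbe=VBE):
    """PNP followers sharing one pull-up: the node sits one Vbe above
    the lowest input, but can never rise above the supply."""
    return min(min(inputs) + vbe, vcc)


def ctl_and(inputs, vcc=VCC, vbe=VBE):
    """NPN output follower: one Vbe below the shared node, never below 0 V."""
    return max(pnp_input_node(inputs, vcc, vbe) - vbe, 0.0)


for a, b in [(0.2, 0.2), (0.2, 2.5), (2.5, 0.2), (2.5, 2.5)]:
    print(f"A={a:.1f} V  B={b:.1f} V  ->  out={ctl_and([a, b]):.2f} V")
```

In this model the output simply tracks the lowest input until it hits the Vcc-Vbe ceiling: the two Vbe drops cancel, so there is current gain but no voltage gain and no switching threshold.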
After this little setback (or disappointment) I looked at other ways to make this circuit and a variation appeared : it replaces selected PNP input transistors with NPN.
Thus instead of inverting the output, we can invert the necessary input(s) and we apply "bubble pushing" :-)
Of course the logic levels are modified but this leads to the interesting concept of a cascade of emitter followers, "or-dotted" together for the OR functions, with parallel transistors for AND or OR functions, and complementation (switching from PNP to NPN and vice versa) for the negation.
[updated 20180930, read the comments below for more background]
People usually confuse the operating frequency of the computer with the max. frequency of its individual parts.
Let's say a CPU runs at 1GHz, that must mean each transistor switches 1 billion times per second, right ? Hahaha I'm kidding.
Actually the Ft (transition frequency) of transistors is way higher than that. And the whole circuit is slowed down by other factors such as wires, whose resistances and capacitances make distributed RC networks, and countless other factors. Of course, the CDP (critical datapath) length matters too.
But on average, I have observed a 1:50 ratio between the operating frequency of a processor versus the "speed" of the constituting transistors, for reasonable architectures. This might be lower for recent ultrapipelined processors but when you make your own discrete processor, divide the Ft by 50 to get your final processor's speed. A ratio of 100 is much more realistic for a hobby project but it's less optimistic...
The ratio of 50 is a realistic ceiling that shows the influence of parameters outside the transistor's ideal characteristic. One such influence is the type of logic gate (TTL, DTL, CTL, DCTL, ECL...) so you have to measure your individual inverter gate speed (for example with a ring oscillator) for a better estimate.
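The rule of thumb is a one-line division; here is a small sketch that applies it literally. The function name is made up for the example, and the 7GHz figure is the BFS480 rating quoted earlier in these logs, used only as sample input.

```python
def estimate_clock_hz(ft_hz: float, ratio: float = 50.0) -> float:
    """Rule-of-thumb processor clock: transistor Ft divided by `ratio`."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ft_hz / ratio


ft = 7e9  # sample input: a transistor rated at 7 GHz
optimistic = estimate_clock_hz(ft)       # ratio of 50 (ceiling)
hobby = estimate_clock_hz(ft, 100.0)     # ratio of 100 (hobby project)
print(f"ceiling: {optimistic / 1e6:.0f} MHz, hobby: {hobby / 1e6:.0f} MHz")
```

This is only the starting estimate: a measured ring oscillator delay for the chosen gate type should replace it as soon as one is available.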
I'd be happy to get more datapoints from various architectures and implementations. A chart would help us identify the factors that inflate or decrease this ratio and give us a better prediction.
Note : this rule applies to transistors and semiconductors, not relays, where the delay is limited essentially by the contact switching speed and RC delays are irrelevant.
I sometimes find a small circuit with 3 resistors and 2 transistors that performs the eXclusive OR operation.
These two interlocked transistors use a very unusual structure, which requires the least theoretical number of switching elements, but it depends on a trick : the input impedances matter a lot and the circuit depends on a "hard" 0 level, because the circuit behaves almost like a "pass" element...
Thus, the question : is it the best method ? What about the switching speed or the capacitances ?
XOR is pretty important in CPUs because many mechanisms rely on it, for example ALUs. Does the gain in parts count affect the performance ? Apparently, it's pretty close to ideal because it's touted as a solution in Direct Coupled Transistor Transistor Logic:
Another version has only one transistor but 4 diodes :
Another question is : can this scheme (no amplification, just relying on the input's strength) be extended to other logic or sequential functions ?
The XOR gate has a much wider range of implementations in MOS and CMOS. You can find circuits using 4, 6, 8, 9, 10 or 12 transistors, again with varied strengths for the inputs and the output. For example, pass-transistor logic (transmission gates) makes it pretty simple :
Each pass element is a pair of complementary transistors, so this gate uses 2 NMOS and 2 PMOS. Add as many if you want to isolate the outputs with inverters...
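A switch-level sketch of one common two-pass-gate arrangement (an assumption, since the schematic itself is not reproduced here): B and /B steer two transmission gates that pass either A or /A to the output. The inverters that generate /A and /B are not counted in the 2 NMOS + 2 PMOS figure.

```python
def transmission_gate(enable: int, signal: int):
    """Complementary NMOS+PMOS pair: passes `signal` when enabled,
    otherwise leaves the output floating (None)."""
    return signal if enable else None


def pass_gate_xor(a: int, b: int) -> int:
    not_a = 1 - a  # assumed to come from an input inverter
    not_b = 1 - b  # assumed to come from an input inverter
    tg0 = transmission_gate(not_b, a)  # conducts when B = 0, passes A
    tg1 = transmission_gate(b, not_a)  # conducts when B = 1, passes /A
    driven = [v for v in (tg0, tg1) if v is not None]
    if len(driven) != 1:
        raise RuntimeError("output must be driven by exactly one pass gate")
    return driven[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, pass_gate_xor(a, b))
```

The output is never driven by a supply rail, only by whatever drives A or /A, which is why the strength of the inputs matters so much in this family of circuits.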
Oh and don't forget another inverter at the output. This is why you'll find various transistor counts. Unless the designer wants to decompose the function into elementary boolean functions, and the size explodes, depending on how you break it up:
This decomposition leads to the "classic" CMOS XOR gate:
which gains weight again when the inputs are buffered and inverted :
XOR has a reputation of a "slow and large gate" for this reason and that's why I investigate smarter topologies and their compromises.
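To make the "size explodes" point concrete, here is a check of two textbook decompositions together with a transistor tally. The count assumes static CMOS with 4 transistors per 2-input NAND; the four-NAND form is a standard example, not necessarily the decomposition pictured above.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    return 1 - (a & b)


def xor_sum_of_products(a: int, b: int) -> int:
    """A.(/B) + (/A).B"""
    return (a & (1 - b)) | ((1 - a) & b)


def xor_four_nand(a: int, b: int) -> int:
    """Classic XOR built from four 2-input NAND gates."""
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


NAND2_TRANSISTORS = 4  # static CMOS: 2 NMOS in series + 2 PMOS in parallel
four_nand_cost = 4 * NAND2_TRANSISTORS

for a, b in product((0, 1), repeat=2):
    assert xor_sum_of_products(a, b) == xor_four_nand(a, b) == (a ^ b)

print(f"four-NAND XOR: {four_nand_cost} transistors")
```

Sixteen transistors for a gate-level XOR, against four for the bare pass-gate version, shows how much the decomposition costs.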
Another version is also pretty nice :
This is interesting for my #Yet Another (Discrete) Clock because it is almost suitable for MOSFETs. The B input must double the transistors because of the inherent diodes but it's "only" 3×BS170 and 3×BS250. In this case, the B input actually works as a multiplexer or transmission gate... Which means it might not be suited for ultra high speed.
Even fewer parts with this 3T XOR :
In this case, only 1×BS170 and 3×BS250 are required. It's still not ideal because the BS250 is more expensive than the BS170 but I don't see how to permute the polarities without requiring more inverters... Furthermore, there seems to be a conflict with one of the input combinations : B=1 forces the output to 0, but if A=0 then the input B (which is =1) is forced to 0 by itself... The solution is another PFET controlled by A, in series with the grounding NFET.
M1 has a 1/1 ratio, almost a square, with minimal size, hence highest resistance, while M3 has a high ratio to overcome the pull-down from M1. For very high-speed CMOS circuits, where power is dominated by switching (and leakage for the newest processes) this short-circuit current can be considered "negligible".
Another interesting compromise uses only 2 of each type:
But the "upper pass trick" on input A might still need doubling of the P-MOSFET to cancel the parasitic body diodes. This could be cheaper if XNOR was made instead, so 2 PFETs are tied to Vcc in series, and the NFETs are used as pass elements.
Lately I was looking for very fast diodes to design faster DTL/TTL discrete gates.
Silicon epitaxial diodes can be quite fast but still have a limited frequency of rectification (particularly the cheap ones).
As noted by @K.C. Lee on #YGREC-ECL : "Carrier mobility isn't as good as electrons. That's why NPN, N-MOSFET have better performance than their PNP, P-MOSFET counterparts." so a complementary TTL gate, with a PNP input stage, would probably be speed-limited by the input transistors.
"The recovery time makes a difference in several designs including switching power supplies. If you dig into the physics, there is usually a trade-off between several other parameters and recovery time. Just to give you an idea, the datasheet for a BAT42 Schottky diode says the reverse recovery time at 10mA is no more than 5 ns."
Rectification speed was already a burning subject in the 40s because it was essential to the war effort (faster diodes means higher carrier frequencies, shorter wavelengths and better radar resolution)
ECL prevents all these issues because there are only NPN transistors and no diode (no recovery time).
However there are more transistors... so maybe DCTL is an interesting alternative ?