04/02/2017 at 11:21 •
This is what my geek corner looks like. Spring is in the air, so in the upcoming months I will spend more time on the balcony with a Chardonnay than here :-)
04/02/2017 at 19:14 •
The control unit (CU) is the last thing to be worked out before committing to a build. The concept I had in mind requires 6 chips and I was slightly unhappy about that. After some thinking I came up with an idea that drops one chip from this unit. However, it requires that at some point I invert the clock signal using an XOR gate, whereas in my original plan this would be done with a NAND. Inverting the clock with a NAND worked fine in my previous VGA project. But the XOR is slightly slower, especially if used as an inverter, so it increases the phase shift a bit further. This inversion is on the critical path towards the RAM write-enable pin. A write into RAM is only allowed to happen in the second phase of the clock. In the first phase we have to stabilise the RAM address. We don't want any glitches there so we must be really sure this is going to fly.
That's why I built a small CU-RAM simulator and fed it fake instructions: read, read, write, write, read, read, write, write, etc., so that all possible execution orders are exercised. Then I just measure what the resulting write-enable signal looks like. Here is the idea in a drawing:
The fake instructions come from a 74LS74 dual flip-flop that just divides the clock by 4. This feeds into a 74LS273 that acts as the IR register. (For fun, before entering IR, the signal goes through 2 bonus XOR inversions and picks up some delay there. This is not important but the gates were there anyway...) From the IR register it goes to the "real" NAND that on the other pin receives the XOR-inverted clock, just as in my intended design. Here is the circuit on a breadboard:
From left to right: 10 MHz clock, 74LS74 dual flip-flop, 74LS86 quad xor, 74LS273 register and a 74LS00 quad nand. Plus some probes. This is what we get:
From top to bottom: yellow = clock, cyan = inverted clock with clearly some phase shift, purple = simulated instructions (low is read, high is write), and blue = the resulting write-enable signal that must go into RAM.
It works! We see nice pairs of negative write pulses and no glitches. Even better, with a 10 MHz clock the write pulse is still over 50 ns wide. For the 70 ns RAM I intend to use, the write pulse has to be at least 50 ns. I wonder if the rest of the system can keep up with that as well.
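The gating logic tested here can be sketched as an idealised, zero-delay model in Python. It assumes that the first clock phase is the high half and the second phase the low half, and that the NAND output is the active-low write enable. The function names are made up for illustration, and the propagation delays that the scope measurement is really about are not modelled at all.

```python
# Idealised model of the write-enable gating: /WE = NAND(is_write, inverted clock).
# Assumptions (not taken from the measurement): phase 1 = clock high,
# phase 2 = clock low, and every gate has zero delay.

def nand(a: int, b: int) -> int:
    return 0 if (a and b) else 1

def xor_invert(clk: int) -> int:
    # An XOR gate with one input tied high acts as an inverter.
    return clk ^ 1

def write_enable_trace(instructions):
    """Return one (instruction, phase, /WE) tuple per half clock cycle."""
    trace = []
    for instr in instructions:
        is_write = 1 if instr == "write" else 0
        for phase, clk in ((1, 1), (2, 0)):
            trace.append((instr, phase, nand(is_write, xor_invert(clk))))
    return trace

if __name__ == "__main__":
    for instr, phase, we_n in write_enable_trace(["read", "read", "write", "write"]):
        print(f"{instr:5s} phase {phase}: /WE = {we_n}")
```

In this model /WE goes low only during the second phase of each write instruction, which corresponds to the pairs of negative pulses on the scope.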
04/03/2017 at 23:20 •
[ Edit: I eventually settled for a totally different ALU design than sketched here! I keep this log entry just to preserve history. -MvK ]
We will do a custom ALU, not because we don't have any 74181 ICs available, but because it is more fun.
There is a beautiful 12 chip MUX-based design out there, nicely described by Dieter Mueller. It even has a shift-right instruction which the 74181 is lacking. Without that it would be 10 chips. I'm tempted to use this design but I still worry about the many control lines that go in, 9 if I count correctly. Well, 8 if we drop the SHR support. Many control lines means many chips in the decoder, unless we use a ROM but that is slow.
That's why I consider something else, based on 6 chips per nibble, with less flexibility but therefore also fewer control lines. The necessary operations are there (A+B, A-B, A&B, A|B, A^B). There is also a "B" operation that we can use to load data without modifying it. We need that because all traffic to registers goes through the ALU in our design. In our data path we can also put AC on the BUS, so we have things like A+A. And there is an "A+1" that we will make use of in the STIX instruction ("store-and-increment-X") later on.
The main part is straightforward: three stages, some logic on top, some multiplexers in the middle to select intermediates and a final addition stage. With 4 control lines we can generate our desired functions plus a handful more that are not very useful, such as "(A ^ B) + 1". Of our functions, only "A|B" is a bit difficult to visualise, because there is no OR-chip in the circuit. It uses the identity A|B = (A&B)+(A^B) instead. Finally, we won't store the carry as we don't want to have a status register. Maybe in a later phase we can use the carry in some useful way.
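The OR identity can be checked exhaustively with a few lines of Python. This is only a sanity check of the arithmetic for 8-bit operands, not a model of the circuit; the function names are made up for the example.

```python
# Exhaustive check of A|B == (A&B) + (A^B) for all 8-bit operands.
# A&B and A^B never have a 1 in the same bit position, so adding them
# produces no carries and simply merges the two bit patterns.

def or_via_adder(a: int, b: int) -> int:
    return ((a & b) + (a ^ b)) & 0xFF

def identity_holds() -> bool:
    return all(or_via_adder(a, b) == (a | b)
               for a in range(256) for b in range(256))

if __name__ == "__main__":
    print(identity_holds())
```

Because no carries are generated, the final addition stage can produce the OR without a dedicated OR chip.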
Four control lines is OK already. With that we could make an opcode scheme where 4 bits select the desired ALU operation immediately, without any further decoding, and let the other bits select the addressing modes. Then we assign the less useful codes to instructions that don't use the ALU, such as store and jump instructions. We need to derive the "write" and "jump" detectors with some logic but that shouldn't be hard. Also, during the first phase of the clock the "load" lines into the registers and the "write" line into the RAM must be muted anyway (for different reasons), which means there should be no worry for glitches while deriving these signals with combinatorial logic.
Still haven't decided yet on this one.
04/05/2017 at 19:58 •
I'm still uneasy about the planned ALU layout and its control logic. The way to extract "write" and "jump" signals remains clumsy and the STIX instruction doesn't really fit in well yet. It will work, but that one remains a wart. Besides, the alternative ALU, the 10 chip MUX-based design, remains much more attractive.
So I revisited it and realised that of the 8 control lines only 5 matter. That's good news. If that can be reduced to 4 or 3 bits we're done. 4 is easy, but then we still don't have "jump" and "write" signals.
Staring at the dense part of the truth table I realised this is a ROM's job. That can decode 3 instruction bits into 6 ALU combinations (ADD, SUB, AND, OR, XOR and "B" aka "LOAD"). Doing this in a little ROM we immediately get the "write" and "jump" for free by assigning these to the unused ALU codes. Plus, we'll have a full bit to spare in the instruction word. That might be useful for STIX. Also, managing the two clock phases can become simpler. All combined this has the potential to save up to 3 chips and that is worth exploring.
But ROM is slow. So an old idea popped up: maybe, just maybe, we can do the ALU control with a 3-to-8 decoder and a tiny, 6 word by 5 bit, diode ROM matrix? People have controlled Ataris with such contraptions and this will be much smaller. Just 20 or so small diodes. So today I got some 1N60 and 1N4148 signal diodes and started playing. I did some quick multimeter tests first: at 10 mA the 1N4148 drops 400 mV and the 1N60 is about ten times better. But both should be ok because TTL allows for a 0.4V drop. Are they also fast enough? Let's measure:
From left to right we have one 74LS161 to count, one 74LS155 to decode and the breadboard provides word and bit lines (all vertical). Both types of diodes are in between these and there is a final 74LS153 to simulate the ALU's first stage. More on that later. I put the 1N4148's on one bit line and the 1N60's on the other so I can easily compare their characteristics. In the photo I'm probing the latter. After I realised I had forgotten a pull-up resistor and added one, the signals generally looked good, provided the resistor value is tweaked a bit.
Ignore the yellow and blue. The cyan signal is the diode stage output. This signal looks poor: it has very slow rises. But once above 2V TTL has a "H", and that happens fast enough. We see the next stage (purple) rectifying it nicely. For the ALU this will be more than good enough: both diode types seem to give proper signal transfer through the ROM matrix.
So are we good? Not yet... In reality this ALU will have 8 parallel MUX chips and each one has 8 inputs. Ultimately they are all connected to the same word line coming out of the decoder.
That still doesn't sound good at all! We have an effective fanout of 8*6, + 1 for the carry. 49 is definitely more than 10. Worse, we have some nasty voltage dropping diodes in the path. Maybe this idea was too good to be true and this is a dead end. To be continued...
[ Postscript: An obvious solution to the fan-out issue is to feed the diode ROM into an inverter row. It is not really the decoder's problem anyway, as the root cause lies with the MUX-based ALU concept. We'll need 6 inverters, not 5, because one of the signals goes to 2 pins of every MUX (there are eight of them). A simple 74LS04 hex inverter IC will do the trick. An added benefit is that the ROM logic now flips from negative to positive. We can really see every operation's truth table right in front of us! And save on a few diodes along the way... ]
04/07/2017 at 14:19 •
First some perspective:
Yes, the ALU will be made out of multiplexers! (74LS153)
The 74LS155 that is to become a 3-to-8 opcode decoder, the bottom half of the diode ROM matrix and the inverter/buffer row (74LS04). You can see XOR, OR, AND and LOAD programmed here.
04/08/2017 at 20:08 •
I just hope I didn't misread any data sheet
04/16/2017 at 18:13 •
If you want to have a fighting chance of ending up with a small design, the instruction set follows the control unit and not the other way around. While writing the assembler I found that I had overlooked something in my original 4 or 5 IC design. I didn't want to proceed with anything else until this was settled. So back to the drawing board. I managed to solve everything within 5 chips, but still with a somewhat limited repertoire for conditional jumps: still only BPL and BMI were present. Doable, but not nice. After some puzzling, I figured I could get a complete conditional jump set with just one additional 74153 IC. I think the benefits outweigh the extra IC, so it is in for now. The clock will have to move over to another place on the breadboard, because this will take up the whole upper right board.
Designing the control unit turned out to be a bigger challenge than I had anticipated. I will let it rest for a couple of days. If I don't find any new oversights in that time, this is what it will look like. The instruction set follows from this, more on that another time.
[ Edit: Only one real issue found since writing this: the write instructions had unintended side effects. Also some of the wired logic was wrong, fixed that. We now use the '138 instead of '155, and AC/OUT must become '377 instead of '273. With that it still fits in 6 chips, although the wired AND remains a bit of a wart. ]
04/23/2017 at 21:00 •
During jump, the ALU is wired to calculate "-AC". Only the carry out is looked at. The negation result itself is not used and discarded.
What's the use of this? Well, only in case of all-zero bits, "-AC" overflows the ALU, setting its carry-out line. So in this case the carry out acts as a zero indicator (Z). If we also look at bit 7 of AC, we also know if AC is negative or positive. The combination of these two signals makes all condition codes possible:
With this, a single 74153 chip can decode the condition code from the opcode, compare it to the contents of AC, and determine if a jump is needed.
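The zero-detection trick can be illustrated with a small Python sketch. It models only the arithmetic, not the 74153 wiring. The condition names (eq, ne, lt, ge, gt, le) are assumptions made for this example and are not necessarily the mnemonics of the final instruction set.

```python
# Sketch of the zero/sign trick for an 8-bit AC.
# The condition names below are hypothetical, chosen for this example only.

def zero_flag(ac: int) -> int:
    """Carry out of the 8-bit negation (~AC + 1); set only when AC == 0."""
    return (((~ac) & 0xFF) + 1) >> 8

def sign_flag(ac: int) -> int:
    """Bit 7 of AC; set when AC is negative in two's complement."""
    return (ac >> 7) & 1

def condition(ac: int, cc: str) -> bool:
    z, n = zero_flag(ac), sign_flag(ac)
    table = {
        "eq": z == 1,             # AC == 0
        "ne": z == 0,             # AC != 0
        "lt": n == 1,             # AC <  0
        "ge": n == 0,             # AC >= 0
        "gt": n == 0 and z == 0,  # AC >  0
        "le": n == 1 or z == 1,   # AC <= 0
    }
    return table[cc]

if __name__ == "__main__":
    for ac in (0x00, 0x01, 0x7F, 0x80, 0xFF):
        true_ccs = [cc for cc in ("eq", "ne", "lt", "ge", "gt", "le") if condition(ac, cc)]
        print(f"AC = ${ac:02x}: {true_ccs}")
```

Only AC = 0 makes the negation carry out of bit 7, so two signals are enough to tell apart zero, positive and negative values.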
04/27/2017 at 22:30 •
Well over 50% wired now. I start to appreciate the invention of the PCB.
05/02/2017 at 00:23 •
Instead of wiring up the whole thing it is better to do a minimalistic wiring that allows basic testing before proceeding. It is possible to write simple programs without using RAM, X, Y and the EA unit, so those are still left out. Today the first program is running:
The test program is doing 7 simple load instructions followed by a jump back to address 0. Because the system is pipelined, the instruction in memory immediately behind the jump instruction (at address 8, or 1000 in binary) also gets executed, yielding a cycle of 9. The assembly looks like this:
address
|    encoding
|    |     opcode
|    |     |    operand
|    |     |    |
V    V     V    V
0000 0000  ld   $00
0001 0001  ld   $01
0002 0002  ld   $02
0003 0003  ld   $03
0004 0004  ld   $04
0005 0005  ld   $05
0006 0006  ld   $06
0007 fc00  bra  $00
0008 0008  ld   $08
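The cycle of 9 can be reproduced with a toy model in Python. It assumes a simple two-stage fetch/execute pipeline in which the next instruction is fetched while the current one executes; the real timing and the instruction encoding are not modelled, and the names are invented for this sketch.

```python
# Toy two-stage pipeline: while one instruction executes, the next is fetched.
# A taken branch therefore cannot stop the instruction right behind it.
# This is an assumed model for illustration, not the actual hardware timing.

PROGRAM = {
    0: ("ld", 0x00), 1: ("ld", 0x01), 2: ("ld", 0x02), 3: ("ld", 0x03),
    4: ("ld", 0x04), 5: ("ld", 0x05), 6: ("ld", 0x06),
    7: ("bra", 0x00),
    8: ("ld", 0x08),   # sits behind the jump and still gets executed
}

def run(program, cycles):
    """Return the addresses of the instructions executed, in order."""
    pc = 0
    fetched = None          # instruction fetched in the previous cycle
    executed = []
    for _ in range(cycles):
        current = fetched
        fetched = (pc, program[pc])   # fetch happens in parallel with execute
        pc += 1
        if current is not None:
            addr, (op, operand) = current
            executed.append(addr)
            if op == "bra":
                pc = operand          # takes effect for the next fetch only
    return executed

if __name__ == "__main__":
    print(run(PROGRAM, 19))
```

In this model the executed addresses repeat as 0 through 8, so the instruction at address 8 runs on every pass even though the jump at address 7 is always taken.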
I put in a slower oscillator so that it is easier for me to interpret the timing. For now, 1 µs = 1 clock tick. The scope in the photo traces ROM address line 3, bus line 0 and the decoding output for "ld" and "jump". You can't actually observe the data passing through the ALU here and entering the AC register. That can only be fully tested once the 74LS377's have arrived and they are still on the boat from China.