08/08/2019 at 02:31 •
As you can see in the log "Don't go full Numitron ! Unless... OK whatever.", I started to design the disassembler and I totally renewed the look:
This version now shares a pair of Numitrons to display either the SRI or the Immediate field, which saves a bit of room and 2 tubes but adds some complexity in the decoding logic (yet not enough to scare me, of course). The above display is not compliant with the normal assembler but close enough and it gets the job done :-)
I draw a lot of experience from the #Numitron Hexadecimal display module so I know what to expect and what to do to display the desired patterns on the 7 segments. There is no technical challenge anymore to display the CND, SND, SRI and Opcode fields, but it's still a lot of work, in particular for the Opcodes: this decoder must not only display 19 words on 3 tubes, but also send control signals to enable the other fields.
It's going to be small and gorgeous but behind the front panel, the electronics will be pretty dense and draw a significant amount of power...
Damnit I forgot that IN/OUT have 9 bits of immediate address...
The Imm field will create more problems, on top of the multiplexing with the SRI decoder. The field must select the width between 4, 8 and 9 bits. The last case is not a problem because an address is just a positive number. Imm4 is sign-extended to 8 bits (easy: 4 relays) but should Imm8 be represented as a signed number ?
Then there is the special case of Add imm4(>=0) where Imm4 is incremented.
The easy way is to simply display the number as is, and forget about it, though the display would not be accurate. It would even be misleading.
But then, if the Add correction or the negative display are implemented, an increment unit is required. And negative numbers require a XOR to transpose to positive numbers. This means more circuits in front of the display modules...
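To get a feel for what the extra circuits must compute, here is a minimal Python sketch of the arithmetic only (not of the relays). The function names are hypothetical, and the Add rule is my reading of "Imm4 is incremented when >= 0", so treat it as an assumption:

```python
def sign_extend_imm4(imm4):
    """Sign-extend a 4-bit field to 8 bits (bit 3 is copied into bits 4..7)."""
    imm4 &= 0x0F
    return (imm4 | 0xF0) if (imm4 & 0x08) else imm4


def sign_and_magnitude(byte):
    """Split an 8-bit two's complement value into (is_negative, magnitude)."""
    byte &= 0xFF
    if byte & 0x80:
        # The XOR flips every bit, then the increment unit adds 1.
        return True, ((byte ^ 0xFF) + 1) & 0xFF
    return False, byte


def add_imm4_display(imm4):
    """Assumed rule for 'Add imm4': non-negative immediates are incremented."""
    negative, magnitude = sign_and_magnitude(sign_extend_imm4(imm4))
    if not negative:
        magnitude += 1
    return negative, magnitude
```

For example, add_imm4_display(0x0) gives (False, 1) and add_imm4_display(0xF) gives (True, 1): the same increment unit serves both the Add correction and the negation, which is why the two features come as a package.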
But with negative numbers, the added Numitron can have another segment used, for the sign bit:
08/04/2019 at 16:41 •
I nailed it !
I solved a "bug" and I now use "normal" hexadecimal encoding knobs. This required a big redesign... There are now 29 diodes but only one per signal so I can use old, low-current point-contact D9K.
Another big difference is a rotary 4-pole 3-throw selector ! It selects between the Imm8, Imm4 and Register forms. I could use interlocked switches as well but their mechanical installation might be more complex, with more holes and alignment...
I'm only missing one row of 8 switches for now, but I can start the construction :-)
07/31/2019 at 00:40 •
As I received the buttons, I was able to not only get electrical information but also dimensions. I updated the layout:
The 50 buttons easily fit in a 18×18cm square so 20×20cm is a safe dimension for this panel.
Remember that it's only the instruction, and more things are done in other panels :
- The disassembly panel shows the instruction being sent to the core (it's the reverse of the ASM panel)
- The control panel selects the source of the instruction, generates the clock, sends the clock pulses to the core (one-shot, several ones or continuously)
- The debug panel has several event counters that can stop the core...
- More panels display the inner state of the core
07/29/2019 at 01:44 •
I received quite a few switches and more will come later !
These are 10 interlocked switches, with their caps, and they are just like I wanted :-)
2 of them will be used for the opcode selector.
For the other selectors I needed something simpler and pre-made, and I have found it in the form of 8-channel video selectors. And I have a good surprise ! Look:
I expected the box to contain a single PCB but there are two, making the switches assembly very easy to remove !
2 screws and 2 connectors to remove, and there you have the module :-)
The traces are easy to follow and they are already conveniently wired for my purpose :
The SRI, SND and CND fields are 3 bits wide, and the selector outputs 3 bits, which is a great match.
I can wire the switches' inputs to a common rail, such that the 3 output bits will contain the binary code of the selected button. I can also bind the boards together to make a more sturdy structure (I have 2 but need a 3rd for the conditions).
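As a sanity check of that wiring idea, here is a small Python sketch (hypothetical names, buttons numbered 0 to 7 by assumption) that lists which buttons must tie each output bit to the common rail, and what code comes out for a given button:

```python
def selector_bits(button):
    """Return (b2, b1, b0) for the pushed button, numbered 0..7."""
    if not 0 <= button <= 7:
        raise ValueError("the selector has 8 buttons, numbered 0..7")
    return tuple((button >> bit) & 1 for bit in (2, 1, 0))


def rail_connections():
    """For each output bit, the buttons that connect it to the common rail."""
    return {
        "b%d" % bit: [btn for btn in range(8) if (btn >> bit) & 1]
        for bit in (2, 1, 0)
    }
```

rail_connections() shows that each output is driven by exactly 4 of the 8 buttons, and button 0 drives nothing at all.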
There is a last technical problem though, with the hexadecimal encoder : the lower nibble requires a type of switch I don't own... But @Artem Kashkanov has the perfect part ! From Russia, of course ;-)
This selector has individual SPDT switches for each bit, making my diodes hack possible :-)
I'm still missing a few extra switches but they'll come soon enough.
At least I successfully designed a fully-passive assembler panel, with only one diode drop on certain signals, and no requirement of external power (no relay or other logic). The panel can work with relay logic, transistors, TTL...
Time to think about the disassembler panel now ;-)
I have easily modified two 8-way selectors !
I can say it works very nicely :-)
I'm waiting for the delivery of the 3rd module so I can have all the necessary buttons for the SND, SRI and CND fields.
07/23/2019 at 02:32 •
I figured a few things...
Here is the aggregated schematic for the opcode field :
It might look complex but most of the complexity has been examined in the previous log and it's basically a pair of binary encoders that have been chained through the XT button.
The tricky part is the simultaneous selection of buttons from both rows : some combinations would disrupt the binary code. This occurs when the "lower row" (AND through CALL) has one button pushed with a code that has more than one set bit (ADD, SUB, CMPS, ANDN). There is a potential path that is now broken by 10 diodes. And since the locked switches work together, only one switch is now required.
The "upper row" OTOH doesn't need diodes because the switches are interlocked. There is no place where different signals are brought together, except when a switch is pushed.
I'm working on the other switches...
More schematicsing :-D
Going further, I added the dual hex encoders and tried to join everything together:
It's almost complete. b2,b1,b0 didn't change, nor did b15,b14,b13,b12,b11. The SRI field (and the LSB of the condition) can be combined with the lower half of the immediate field, while the condition field (plus a couple more bits) can be combined with the higher half.
However the simultaneous activation of the lower half for both Imm4 and Imm8 is more complicated and not yet implemented. The trivial version would use another diode, so the total drop would be 3 diodes !
I need a method to reduce this drop, I suspect there is a way to keep it down to 1 drop but I need to test it...
And I should switch from dia to EAGLE :-D
I solved a potential problem caused by several diode drops in series :
The free switches for the XT button isolate the main signals and prevent the use of diodes. The hexadecimal encoder requires separate switches, so each signal can be driven by a pair of diodes (instead of one diode at the output).
There is one remaining switch and I wonder how to use it to remove more diodes...
07/20/2019 at 20:28 •
As mentioned in "A new assembler panel", I'll soon (hopefully) get a bunch of 4PDT interlocked buttons !
I have already assigned their function but how will I make them work ? Electrically, I only want to have switches, possibly some diodes, and I would love to have the panel use the least power possible, which means ideally no relay. I hope that the panel can be reused for other technologies with the least amount of changes.
The critical information is contained in these 2 diagrams :
Already we can see 2 fields that are (almost) fixed :
- The SND field uses 3 bits and a row of 8 buttons that can directly encode the binary code. The 4PDT switches are enough to encode the 3 bits easily. The question is settled and the same system will be used for the SRI sub-unit.
- The Opcode field has 19 codes (and as many buttons). Only RC and INV use 4 set bits so the 4PDT has some spare room, that we need indeed !
There is no interlocked switch with 19 buttons. The most I have found is 12 so I have chosen to use a pair of 10-button rows.
- The upper row contains the SH/SA/RO/RC/LDCL/LDCH/IN/OUT/INV opcodes. That is 9 opcodes. Each 4PDT switch can directly encode the binary value, except bit 15 which is always 1.
- The 10th button of the row is not an opcode but an "escape" button. Let's call it XT, it's an extension that selects the other row (which is the most used anyway) so it's not operated often. The 4PDT switch sets bit 15 to 1 when released (selecting the above opcodes), and otherwise activates the I8/R and Imm8 field.
- The lower row contains the AND/OR/XOR/ANDN/CMPU/CMPS/SUB/ADD/SET/CALL opcodes. Only 3 of the dual-throw switches are needed. The 4th bit remains unused.
How will this work, electrically ?
The panel uses "positive logic" where you implement a bit set to 1 with an electrical contact to an extra signal (usually a bus/rail, such as a positive voltage source). The switches will steer each instruction signal to the common voltage, depending on its encoding.
We can already write some truth equations for the most significant bits of the instruction word :
- b15 = ( XT & (SET | CALL))
- b14 = ( XT & (CMPU | CMPS | SUB | ADD))
      | (/XT & (LDCL | LDCH | IN | OUT | INV) )
- b13 = ( XT & (XOR | ANDN | SUB | ADD))
      | (/XT & (SH | SA | RO | RC | OUT | INV) )
- b12 = ( XT & (OR | ANDN | CMPS | ADD | CALL))
      | (/XT & (RO | RC | IN | INV) )
- b11 = ( XT & I8/R)
      | (/XT & (SA | RC | LDCH | ( IMM9 & (IN | OUT) ) ) )
As usual the OR operator "|" is implemented by parallel switches, while the AND operator "&" connects switches in series.
The terms contain XT which means the Normally Open side of the XT switch. So XT is a MUX for two sub-buses (b14,b13,b12).
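The equations with the explicit /XT terms can be checked with a few lines of Python. This is only a sketch of the logic, not of the wiring: the function name and its arguments are hypothetical, and b15 is evaluated exactly as listed (the constant 1 that the released XT switch provides for the upper row is deliberately left out, as in the equations).

```python
def opcode_bits(pushed, xt, i8r=False, imm9=False):
    """Evaluate b15..b11 from the truth equations.

    pushed: set of opcode button names that are down (both rows together)
    xt:     True when the XT button is pushed (lower row selected)
    Returns (b15, b14, b13, b12, b11) as 0/1 integers.
    """
    def any_of(*names):
        # Parallel switches: the OR of the named buttons
        return any(name in pushed for name in names)

    # Series switches: XT (or /XT) ANDed with each group
    b15 = xt and any_of("SET", "CALL")
    b14 = ((xt and any_of("CMPU", "CMPS", "SUB", "ADD"))
           or (not xt and any_of("LDCL", "LDCH", "IN", "OUT", "INV")))
    b13 = ((xt and any_of("XOR", "ANDN", "SUB", "ADD"))
           or (not xt and any_of("SH", "SA", "RO", "RC", "OUT", "INV")))
    b12 = ((xt and any_of("OR", "ANDN", "CMPS", "ADD", "CALL"))
           or (not xt and any_of("RO", "RC", "IN", "INV")))
    b11 = ((xt and i8r)
           or (not xt and (any_of("SA", "RC", "LDCH")
                           or (imm9 and any_of("IN", "OUT")))))
    return tuple(int(bool(b)) for b in (b15, b14, b13, b12, b11))
```

With XT pushed, opcode_bits({"ADD"}, xt=True) returns (0, 1, 1, 1, 0). With XT released, a lower-row button that is still latched has no effect: opcode_bits({"SH", "ADD"}, xt=False) gives the same result as opcode_bits({"SH"}, xt=False), which is the MUX behaviour described above.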
Due to the interlock mechanism, /XT is redundant because XT is always released when one of the SH/SA/RO/RC/LDCL/LDCH/IN/OUT/INV opcodes is selected. However the 10 other opcodes are not interlocked with that row, so XT must disconnect the affected bits to prevent the other row from interfering. The equations simplify to :
- b15 = ( XT & (SET | CALL))
- b14 = ( XT & (CMPU | CMPS | SUB | ADD))
      | LDCL | LDCH | IN | OUT | INV
- b13 = ( XT & (XOR | ANDN | SUB | ADD))
      | SH | SA | RO | RC | OUT | INV
- b12 = ( XT & (OR | ANDN | CMPS | ADD | CALL))
      | RO | RC | IN | INV
- b11 = ( XT & I8/R )
      | SA | RC | LDCH | ( IMM9 & (IN | OUT) )
This is translated into the following diagrams :
The good news is : there is apparently no need for a complex switch for XT because its signal can be shared with several subsignals, thus saving switches...
The bad news is : to prevent certain cases of feedback/bypass (despite the use of independent switches) the output of XT must be guarded with diodes.
However, there are multiple switches per button so the number of diodes is kept low. Here is the census of the usage of SPDT switches per opcode (so far) :
OTOH the XT signal must be split into 5 diode-protected paths.
IN and OUT each use only 2 SPDT switches and 2 are left so each can feed (through a diode) the rotary encoders for IMM8.
XT also can enable the IMM8 encoders through the I8/R switch and 2 diodes. There are 2 encoders :
IMM8_hi = IN | OUT | ( XT & I8/R )
IMM8_lo = IN | OUT | ( XT & I8/R ) | I4/R
Due to the nature of the Hex switches, each output bit must be guarded by a diode as well...
Another interlocked switch selects the format : I4/R, I8/R and REG (they are mutually exclusive).
I8/R is only available in XT mode and otherwise defaults to I4/R mode.
The SRI field (as well as b6) is enabled with
SRI_en = REG & /INV & /IN & /OUT
This uses inverted logic and probably one of the spare/free SPDT of INV/IN/OUT.
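The three enable equations above are small enough to be checked by hand, but a Python sketch makes the corner cases easier to try. The function and argument names are hypothetical (in_ has a trailing underscore only because "in" is a Python keyword):

```python
def field_enables(xt, i8r, i4r, reg, inv=False, in_=False, out=False):
    """Evaluate the IMM8 encoder enables and the SRI enable.

    xt, i8r, i4r, reg: state of the XT button and of the format selector
    inv, in_, out:     True when that opcode button is pushed
    """
    imm8_hi = in_ or out or (xt and i8r)
    imm8_lo = in_ or out or (xt and i8r) or i4r
    # Inverted logic: REG alone is not enough, INV/IN/OUT must all be released
    sri_en = reg and not inv and not in_ and not out
    return {
        "IMM8_hi": bool(imm8_hi),
        "IMM8_lo": bool(imm8_lo),
        "SRI_en": bool(sri_en),
    }
```

For instance, with the selector on REG and the OUT button pushed, field_enables(xt=False, i8r=False, i4r=False, reg=True, out=True) enables both IMM8 encoders and disables SRI, even though the format selector was not touched.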
Note : I have ordered 8PDT switches for the format selector because I thought it would be useful for actually MUXing the signals but it is not interlocked with XT and others and IN/OUT/INV would not reset the format to IMM8... Diodes seem to be necessary to enable the REG, COND and IMM fields.
The SND field is decoded with the following circuit :
The SRI and CND fields are similar, with some differences :
- The + is not tied directly to the power supply but to a diode-guarded contact of the REG switch
- The CND field is split, depending on the REG switch. The Bit0, Bit1, Bit2 and Bit3 options are enabled by REG (the b3 signal is switched) because otherwise the bit is used for the bit3 of the immediate (in IMM4 and IMM8 mode).
07/19/2019 at 23:11 •
A few days ago, I had a blast from the past !
While looking for multi-position switches in the local store, the clerk proposed a type of switch I had totally forgotten !
The impressive interlocked switch !
I initially wanted to use rotary buttons but you can't find all the options and it takes some turning...
These buttons solve most of the problems I had, they use more room but they provide a direct access to everything. With about 50 buttons (and 2 rotary Hex encoders) one can select a whole instruction with a few pushes. Some rows or selectors can be enabled or inhibited, the conditions and the IMM knobs can be selected depending on the format selectors...
Here is the first sketch :
I just ordered more parts to start prototyping...
These switches are quite awesome. There are rotary switches but turning them all the time would quickly become ... hmmm...
I keep the rotary selectors for the numbers but the opcode, the condition and register names now have their own button. These are not inherently linear things but symbols. So it's good to point directly at them and push the right button right away, instead of blindly turning a selector until it reaches the desired position.
The other advantage is more technical : these switches can come in 2PDT, 4PDT, 6PDT and even 8PDT ! This means that a lot of signals can be encoded directly at the mechanical level, reducing the amount of diodes and relays ! For the main parts I have chosen 4PDT and the IMM4/IMM8/REG selector is 8PDT.
04/24/2019 at 18:37 •
As the register set and the ALU are (mostly) ready, now comes the time to connect them. This raises many questions that I hadn't considered fully until now and even though they are not really hard, they deserve to be treated carefully and independently from the rest. This is why I create a specific unit, which I name "nexus".
The primary purpose of the nexus is to gather all the operands and to fan them out to the sink units. So it's basic wiring, with the newly added twist to "control gate" the datapath ("data gate" ?). So it's wiring plus extra latches. And MUXes too...
This is the part that mixes the immediate value from the instruction, with the SRI read port of the register set. With size selection and sign extension. And it must also handle the low/high part of the instruction word for LDCL/LDCH. Oh and it also must manage the crossover with PC (see the log Now faster without the "PC-swap" MUX)
Let's first solve the easy case : SND is available directly from the register set, which distributes its 8 bits to 3 ports, each with a latch:
- the I/O "OUT" port is transparent for b15.b14.b13./b12
- the "Shift" port (SH/SA/RO/RC) is enabled by b15./b14.b13
- the ALU port (CALL and SET don't use SND so it's simply decoded as /b15)
These latches isolate the buses and help solve fanout problems, on top of reducing spurious toggles in units where the result would be discarded. It looks great so far.
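The three latch enables for SND translate directly into boolean expressions. Here is a Python sketch with hypothetical names, taking the 4 upper bits of the instruction word as 0/1 values:

```python
def snd_port_enables(b15, b14, b13, b12):
    """Decode which SND latch is transparent for the given opcode bits."""
    b15, b14, b13, b12 = (bool(b) for b in (b15, b14, b13, b12))
    return {
        "out":   b15 and b14 and b13 and not b12,   # b15.b14.b13./b12
        "shift": b15 and not b14 and b13,           # b15./b14.b13
        "alu":   not b15,                           # /b15
    }
```

At most one latch is transparent for any opcode, and for some patterns with b15 set (such as 1000) none of them is, so SND does not toggle anything downstream.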
The SRI part is more complex... So let's look at a drawing:
The top half is the easy part:
- The register set provides the 8 bits on the SRI port
- SRI is latched to the shifter input, for the SH/SA/RO/RC instructions (b15./b14.b13)
- SRI is also latched at the input of ALU for 8 opcodes (/b15)
The other half is more sophisticated and subtle.
There is the part that writes to PC and sends the value to the program memory address bus. The PC is first incremented, then mixed with SRI for the SET and CALL opcodes so jumps and calls use only one cycle. But as explained earlier in Now faster without the "PC-swap" MUX, there is a 3rd case where the other opcodes write to PC but have a longer latency, the result bus goes to a bypass (btw, PC has its own write port).
Meanwhile, the incremented PC also goes to the result MUX (now called SUXEN, yes, it's a reverse NEXUS) because CALL must be able to write PC+1 to the result bus. This completes the crossover.
04/21/2019 at 20:05 •
It's time for a little census.
[yg@Host-002 VHDL]$ grep -r 'entity' * |grep 'port' |grep 'map' |sed 's/.*entity //'|sed 's/ port.*//'|sort|uniq
AND2 AND2A AND3 AND3A AO1 AX1C CLA3 INV MX2 NAND2 NAND3 NAND3A NOR2 NOR3 NOR3A OA1A OR3 XA1 XO1 XOR2
(I removed the complex unit names by hand)
There are 20 gates so far, more, and more complex, than what #Shared Silicon provides (only INV, NOR, NOR3, NAND2, NAND3 and some T-gates).
I believe that by using more complex gates with more inputs (but reasonably so), there is a bit of performance and size benefit. I don't see any roadblock to get the missing gates : either I can make mine easily, or I borrow from existing free libraries.
04/21/2019 at 17:50 •
The latest source code archive contains the enhanced decoder for the register set, including 3 strategies:
- Straight (fast)
- update only meaningful control lines
- update only meaningful control lines when the related field is used
I provide a pseudo-randomised test to compare these strategies and the outcome is great:
[yg@Host-001 R7]$ ./test.sh
Testing R7:
straight decoder:R7_tb_dec.vhdl:165:5:(report note): 100000 iterations, 702273 toggles
latching decoder:R7_tb_dec.vhdl:165:5:(report note): 100000 iterations, 301068 toggles
Instr-sensitive :R7_tb_dec.vhdl:165:5:(report note): 100000 iterations, 160231 toggles
R7: OK
There is a ratio of roughly 1/4 to 1/5 (160231/702273 ≈ 0.23) between the third and first results, which I explain below :
- Given that the probability of one bit being set is pretty close to 1/2, it makes sense that the first "straight" decoder toggles each output bit every other time on average. There are 14 control lines to drive and with a 1/2 probability, 7 lines change.
- The next method gives a better result, which you can understand using similar logic : we get 3 toggles per instruction, which makes total sense. There are 2 decoders but only 1/2 chance of change, so we can focus on one decoder. Each decoder updates only 3 of the 7 control lines because the other 4 give results that will not be used. So far, so good, no surprise at all.
- The last method gives an average toggle rate of 1.6 per instruction. This is about one half of the previous result and though it should be taken with a lot of caution, the benefit is clear. Some instructions (about 1/4) don't use the SND field, and the SRI field is not used when Imm8 or Imm4 fields are used, giving a further significant reduction of toggles.
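The per-instruction figures quoted in this list come straight from the test output. A few lines of Python redo the arithmetic (the variable names are mine, the numbers are those reported by test.sh):

```python
ITERATIONS = 100_000
TOGGLES = {
    "straight": 702_273,
    "latching": 301_068,
    "instr-sensitive": 160_231,
}

# Average number of control-line toggles per instruction, for each strategy
rates = {name: count / ITERATIONS for name, count in TOGGLES.items()}

# Estimate for the straight decoder: 14 lines, each changing half of the time
expected_straight = 14 * 0.5

# Ratio between the most selective strategy and the straight decoder
ratio = TOGGLES["instr-sensitive"] / TOGGLES["straight"]

for name, rate in rates.items():
    print(f"{name:16s} {rate:.2f} toggles/instruction")
print(f"ratio third/first: {ratio:.2f}")
```

The measured 7.02 toggles per instruction for the straight decoder sits right next to the 14 × 1/2 = 7 estimate.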
Of course, these numbers are NOT representative of real use cases. I used pretty uncorrelated bits as sources, while real workloads have some sorts of patterns. The numbers will certainly increase or decrease, depending on each program.
There is a compromise for each situation and the 3 methods are provided in the source code, so you can choose the best trade-off between latency and consumption. The numbers are pretty good and I think I reached the point of diminishing returns. Any "enhancement" will increase the logic complexity with insignificant gains...