04/04/2017 at 01:10 •
What a day ! Where do I begin ?
Ah yes, yesterday, I was thinking about the I/O system.
With a budget around 500 relays, that translates to 500/18≈27 relays per slice (I exaggerate because I/Os are not parity protected, but bear with me). That would be about 16 relays for outputs and 8 for inputs. BTW, what are these ?
- Input is simple : this is just an external signal that enters the datapath. One MUX per input channel is enough.
- Outputs are a bit more tricky : they need to be latched, so this uses two relays : one for selection and the other for storage.
Now, let's think about this:
- 8 reads and 8 writes is ... very similar to the existing register set !
- It appears that 8×16 bits of inputs AND outputs is quite a lot (128 input wires, 128 output wires...) and I don't think I can use all of them.
So today I merged the I/Os and the register set. This computer uses both "register-mapped memory" AND "register-mapped I/O"...
So far, we already have these 8 registers :
Fine, now let's add our four I/O registers:
- I1 / O1
- I2 / O2
- I3 / O3
- I4 / O4
(that's 64 input wires and 64 output wires, not bad...)
Reading I1 will read the corresponding pin while writing O1 at the same address will set the corresponding pin. They are only related by having the same address.
Now, what if you connected, say, I1 to O1 ? Well, you get a register (that can be seen by the exterior) and which can only be read by one MUX8.
And since we don't need as many GPIO, let's now introduce the "extended registers" ! Welcome to
The last one is, well.... hmmm I had to think hard about it but I have chosen to dedicate it to PC.
We have 16 registers, now, guys ! But one half can only be seen by the 2nd read port. The instructions contain these fields :
- SRC : 3 bits
- SRCX : 4 bits (SRC eXtended)
- DEST : 4 bits
SRC is missing some of the action but has its own fun : it gets the shifting done, and gets a bit of immediate values. Each slice can do :
- Pass (direct value from the register)
- Shift Left (1 bit)
- Shift Right (1 bit)
- Immediate (short)
(that's a MUX4)
The short immediate value comes from the SRC field, but 3 bits (range from -4 to +3) is a very short range. But there is more :-)
Usually, the complete set of operations is : ROR, ROL, SHL, SHR, SAR, and let's not forget the Carry : RCR, RCL, SHRC, SHLC...
They only affect the "edges" of the shifter, so we have to select the value of the next bit that gets shifted in.
- 0 (for SHx)
- sign bit (for SAR)
- Carry (for SxC)
- the bit from the opposite bitplane (for ROx)
That's just a few relays on the edges of the backplane, that use 2 control bits.
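The four "edge" choices can be sketched in Python as a single-bit shift where only the incoming bit changes. The mode names (`"zero"`, `"sign"`, `"carry"`, `"rotate"`) and the function names are my own labels for this sketch, not the actual control encoding.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1

def shift_in_bit(value, left, edge, carry):
    """Select the bit that enters at the edge of the shifter."""
    if edge == "zero":       # SHx
        return 0
    if edge == "sign":       # SAR
        return (value >> (WIDTH - 1)) & 1
    if edge == "carry":      # SxC
        return carry & 1
    if edge == "rotate":     # ROx: the bit falling off the opposite end
        return (value >> (WIDTH - 1)) & 1 if left else value & 1
    raise ValueError(edge)

def shift1(value, left, edge, carry=0):
    """Shift a 16-bit value by one position with the chosen edge mode."""
    value &= MASK
    bit = shift_in_bit(value, left, edge, carry)
    if left:
        return ((value << 1) & MASK) | bit
    return (value >> 1) | (bit << (WIDTH - 1))
```

For example, `shift1(0x8001, left=False, edge="sign")` keeps the top bit set, while the `"zero"` mode clears it. Everything in the middle of the word is identical in all modes, which is why only the edges need extra relays.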
We now have a more complete set of single-bit shift operations and 2 more bits to use with the short immediate, no dead code :-) The Pass function can also gain more features, I'll decide later.
(byte swap ?)
5 bits of signed immediate value is very short, and useful for short jumps for example, but the user needs to load whole 16-bits registers. There is not much room and I'd like to share features with the other instructions so I have decided :
- Common prefix : DEST (4 bits), Immediate flag (1) and CND (condition, a 3-bits predicate) [total: 8 bits, niiiice]
- The remaining 16 bits are either the immediate value or the source and ALU bits.
The source fields are :
- 0..2 : SRC (access the 8 lower registers)
- 3..4 : C/R/S/W (shift "edge" mode)
- 5..6 : R/L/I/P (Right, Left, Immediate, Pass)
- 7..10 : SRCX (access all the 16 registers)
Bits 11..15 are left for the ALU. 5 bits is just enough to do all the ROP2 and some funky add/sub dance.
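Here is a rough Python sketch of packing and unpacking these fields. The bit positions inside the low 16 bits follow the list above; the position of the 8-bit prefix (top byte) and the order of DEST / Immediate flag / CND inside it are assumptions of mine, since the log does not state them. All function names are hypothetical.

```python
def pack_prefix(dest, imm_flag, cnd):
    # Assumed order inside the prefix: DEST (4) | I (1) | CND (3)
    return ((dest & 0xF) << 4) | ((imm_flag & 1) << 3) | (cnd & 0x7)

def encode_reg(dest, cnd, src, edge, mode, srcx, alu):
    """Register form: SRC 0..2, edge 3..4, R/L/I/P 5..6, SRCX 7..10, ALU 11..15."""
    low = ((src & 0x7)
           | (edge & 0x3) << 3
           | (mode & 0x3) << 5
           | (srcx & 0xF) << 7
           | (alu & 0x1F) << 11)
    return (pack_prefix(dest, 0, cnd) << 16) | low

def encode_imm16(dest, cnd, imm16):
    """Immediate form: the 16 remaining bits are the value itself."""
    return (pack_prefix(dest, 1, cnd) << 16) | (imm16 & 0xFFFF)

def decode(word):
    prefix, low = (word >> 16) & 0xFF, word & 0xFFFF
    fields = {"dest": prefix >> 4, "imm_flag": (prefix >> 3) & 1, "cnd": prefix & 7}
    if fields["imm_flag"]:
        fields["imm16"] = low
    else:
        fields.update(src=low & 7, edge=(low >> 3) & 3, mode=(low >> 5) & 3,
                      srcx=(low >> 7) & 0xF, alu=(low >> 11) & 0x1F)
    return fields

def imm5(low):
    """Short immediate: SRC plus the two edge bits (bits 0..4), sign-extended."""
    v = low & 0x1F
    return v - 32 if v & 0x10 else v
```

The decode side is just shifts and masks, which matches the claim that very little decoding is needed: each field maps straight onto a group of wires.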
Conclusion : the YGREC is a pretty complete computer now, with a 16-bits datapath, I/O and 24-bits fixed instructions !
The instruction format is very well adapted to the datapath and very little decoding/recoding is needed.
Two fields need to be defined : CND and ALU.
CND uses 3 bits : usually one is for the negation, two bits remain for the source :
The condition "not always" is "never", which is often an "extension code" , that I keep for later.
20170407: "never" is now the "call" prefix. Call can't be conditional/predicated but saves some efforts (at least one instruction) for routine management.
The "input" condition could be the value of one Input signal that is selected (MUX) by an output register. We will see...
There is something funky : the address registers are fully populated but only 8 or 9 bits are used. If you write values in the MSB, they will be ignored. This can be interesting for saving/hiding values... (It just needs a byteswap feature)
The PC however is external. Damn, I need to design an incrementer with CCPBRL...
The ALU Field uses 5 bits.
1 bit for the carry chain enable and 4 for the ROP2 MUX4.
When the carry enable bit is clear, the full standard ROP2 features are available : you write the LUT directly in the opcode.
When set, the ROP2 field is overwritten to XOR or XORN. One bit selects the Carry-in, another whether the carry flag is written.
(to be determined later)
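"Writing the LUT directly in the opcode" can be sketched like this in Python: the 4 bits are a truth table, and each bit slice uses its two operand bits to pick one entry (that's the MUX4). The LUT bit order used here, index = (a << 1) | b, is an assumption for the sketch; the real wiring may differ.

```python
def rop2(lut, a, b, width=16):
    """Apply a 2-input boolean function, given as a 4-bit truth table,
    to every bit position of a and b."""
    result = 0
    for i in range(width):
        index = (((a >> i) & 1) << 1) | ((b >> i) & 1)   # MUX4 select lines
        result |= ((lut >> index) & 1) << i               # selected LUT bit
    return result

# Truth tables under the assumed bit order
AND, OR, XOR, XNOR = 0b1000, 0b1110, 0b0110, 0b1001
```

With 4 LUT bits, all 16 possible two-input functions are available without any decoding table: the opcode bits are the function.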
I love that the instructions are 24 bits wide, in particular because 3 8-positions DIP switches can encode one instruction. With the stock I have, I can wire about 256 instructions. This means that the PC needs to be 9 bits at most. OK let's say 10 but the system then needs to MUX on the backplane level....
For the sake of completeness, the diagram that shows the instructions is here:
The overall datapath is now :
Even a kid could understand that, right ? :-)
I haven't included the rest, like DRAM, PROM, clock, tree drivers...
On 2nd thought : let's scrap the IMM16 MUX at the end of the ALU. Instead, send "Pass SRC" to the ROP2 and we're done. However the gain is not significant and probably negative because
- IMM5 is sign-extended but IMM16 is transmitted as is. That's 11 MUX2 to select the MSB.
- the ROP2/ALU control signals must be "overdriven" : that's 5 more MUX
No relay has been saved but it's more compatible with a transistorized implementation.
Here is the new version : it looks even more simple !
Of course it's simple because I have taken a lot out of the picture but this gives you a "programmer's view".
Update : I added the PC in the datapath in 7. The Program Counter's circuit
04/04/2017 at 21:17 •
Let's start with the Program Memory. Or more exactly : the ROM.
It's made of a collection of diodes, making a diode matrix to store 24-bits words.
There is also the configurable ROM, which is made of boards with DIP switches.
For the ROM I can use my stock of leadless 1N4148 (SOD) and solder one diode where one bit must pass. But developing the software that must be soldered requires a development board, hence the DIP switches, which is my focus now : 256 words require 768 DIP switches (ok) and 6144 diodes.
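A quick Python check of the part count, assuming one diode in series with every switch position (which is what the 6144 figure implies). The function names are hypothetical.

```python
WORD_BITS = 24
BITS_PER_DIP = 8

def parts_for(words):
    """DIP switch packages and diodes for a fully populated switch board."""
    dip_packages = words * WORD_BITS // BITS_PER_DIP
    diodes = words * WORD_BITS        # assumed: one diode per switch position
    return dip_packages, diodes

def read_word(closed_switches):
    """closed_switches: set of bit positions whose switch is closed (bit reads 1)."""
    return sum(1 << bit for bit in closed_switches)
```

`parts_for(256)` gives the 768 packages and 6144 diodes quoted above, and the same function can be reused to size bigger boards.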
Wouldn't it be nice if they were easier to solder ? The SOD diodes roll before soldering. OTOH I know there are dual-diodes in SOT23 : half the number of parts, 3/4 the number of pins to solder (3 pins instead of 4 for each pair of diodes).
I use dual-diodes in the PWM generator of #DYPLED but I have the wrong kind, with no common anode or cathode. So I went shopping. And I found ... something else !
Further down the datasheet (well, the competitor's), I find that it's a "low leakage" diode. Low leakage, two diodes...
Yes, the DRAM cell needs that ! The SOD version was such a pain to route:
(see the log Dual Diodes (the hard way))
For now I can test a prototype array with the BAS70-04 (NXP) (I can borrow the reel from the #DYPLED project which doesn't need all of it) and I can compare the leakage, the driving methods, the routing...
Anyway I have vague estimates of the leakage currents, but no clear understanding of all the phenomena at play in this kind of array. Fortunately the capacitors have a reasonably high capacitance and high voltage so maybe I can spy on the voltages with a 10x probe.
BTW a 512-words DRAM system requires 9216 dual-diodes... that's a lot and I plan on building 40 arrays of 16×16 bits (10240 diodes).
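The numbers can be checked with a few lines of Python. The 18 bits per word is inferred from 9216/512 and from the 18 slices mentioned earlier; it is an assumption of this sketch.

```python
import math

WORDS = 512
BITS_PER_WORD = 18          # inferred: 9216 / 512, matching the 18 bit slices
ARRAY_CELLS = 16 * 16       # one array stores 16x16 bits

cells_needed = WORDS * BITS_PER_WORD
arrays_needed = math.ceil(cells_needed / ARRAY_CELLS)
cells_planned = 40 * ARRAY_CELLS
spare_cells = cells_planned - cells_needed
```

Under that assumption 36 arrays are strictly enough, so 40 arrays leave about 10% spare cells.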
OK I made a little mistake : the BAS70 are very small signal Schottky diodes.
Schottky is usually more leaky than traditional 1N4148 (though it is relative to the voltage ratings). Their repetitive pulse ratings are also far from what the system will make them endure (12V differentials during contacts).
But it can be indicative of a "worst case" so why not try and measure the data retention time ?
Happy ending : I have found some BAW56, with enough peak current rating, in SOT23, for the PROM array, but I must wait for their delivery... and only enough for 128 instructions.
04/04/2017 at 23:19 •
Now that I know the instruction width, I can finally move forward with the design of the instruction memory.
The YGREC is a Harvard architecture and the instructions can be stored in read-only memory, made from diodes arrays.
There are programmable arrays, with switches that a user can change to affect the program. With 24-bits instructions, 3 standard 8-positions DIP switches fit nicely, but what about the physical dimensions ?
I've made a few attempts with the wrong switch model (still waiting for the definitive model) and the wrong kinds of boards but I have come to a few conclusions.
One Europe-format board can contain 16 instructions. I'll see if I can get cheap 20×16 boards etched for me.
The multiplexing is pretty important too but unlike the DRAM system, we're not forced to have a 2D array. I initially thought about adding relays on the result bus but I realise it's pointless : that's 24 relays... It's better to just select which board gets the address, and populate each instruction board with one relay per line.
At worst, the backplane can have 48 instruction bits, and select one half or another, to simplify the routing.
For the diodes : any kind is fine, I wonder if I can use bipolar transistors as well :-D After all a PNP transistor is a pair of common anode diodes, right ? Well, the base current might just kill the part... For now I must use the BAW56 (still waiting for them) and the trusted 1N4148.
For now I have determined the width of the boards : 16cm (enough to carry 48 data bits and address bits). A PROM board on a Europe card (10cm) will store 16 instructions, a double (20×16) contains 32.
Diodes-only boards can double this density but require more decoding.
The decoders are made of relays and we have seen that MUX require certain topologies to balance their fan-in. Modularity dictates the grouping of boards and 64 seems to be a good compromise. That implies 32 relays per group, which is a convenient and balanced configuration (see below). Some reordering must be done, though, so it might be advisable to make a separate board with one MUX32...
64 instructions is roughly the size of the core program that computes the #Game of Life bit-parallel algorithm :-)
Each board can be removed and possibly connected to other devices for programming : I'm thinking about building an electro-mechanical assembler/disassembler. A mechanical switch will select between the user input and the PROM board to ensure that the program is correctly configured (by comparing the disassembler's panel).
PS: Yes, I remember, there was the idea of using LEDs instead of diodes.
However the forward current and reverse voltages don't allow the use of standard LEDs... The RES15 require too much current to "trip" from one state to another (at least 40mA with pre-bias) and that might damage the LED if the current remains too long. Adding a parallel diode to limit the current would increase the cost further...
04/04/2017 at 23:42 •
Development with this kind of machine is "bare-metal". I'm simply tired of writing emulators and simulators, and people actually don't need much to get started : forget Eclipse or Emacs, just take some paper and pens. You can program a computer with something to input the instructions, and something else to read/decode the binary into human-readable form.
Following my experience with the #Discrete YASEP, I have decided to create a "hardware assembler/disassembler", using this kind of technology (sans the TTL ICs, see #Quick & Dirty Frequency Generator ):
A set of two boards are connected to the instruction bus :
- The first board inputs data using knobs, in a manner that follows the instruction's logic and structure, and writes the corresponding bits on the bus.
- The second board reads the bus and displays the value, in a human-readable way (that is : with LEDs and maybe a #DYPLED)
If the user can take over the buses (instruction address bus, instruction read bus, ...), they can examine the program, input arbitrary instructions, single-step through routines, modify the memory and check the execution.
The assembler board provides one switch or rotating knob for each field :
- 4 knobs for the IMM16 field
- one for each source and destination register field
- one for the condition
- some switches and sliding switches to select modes and options
For example :
Note that the values of each position are not all marked because they are displayed in the disassembler board :
Little lamps show the value of the current instruction. In this example, we see the instruction "D1 XOR R3 => R4"
The disassembler board is always connected to the instruction read bus so you can see the machine's thoughts as it runs... But both the assembler and disassembler must be modular because they will be reused for the next implementations in various technologies :-)
04/06/2017 at 09:29 •
So I've been updating the inventory and here are the relays :
That's approximately 3000 relays, so I'm good.
I also have 10K capacitors (100µF 25V) but the sizes are mixed (4K in 6×7mm and 6K in 5×11mm). This does not help making a compact DRAM board, I'd like to fit 16×16 in a 10cm square... 16×5mm=8cm so I have room for connectors on 3 sides.
04/07/2017 at 06:03 •
Following the recent updates about the DRAM (in the log DRAM (again)), I had to test all my assertions and try with the real parts.
The BAS70 are apparently not practical for a prototype because I have no suitable PCB. I'd have to put a lot of wires... But I found a method with the legs of two 1N4148 !
It was a pretty crazy adventure but it ends well (so far, if you don't count the 30 capacitors I soldered in reverse).
I figured out a placement that is feasible and compact, but it requires two levels of wiring. First, the diodes are soldered in place and their wires occupy the PCB surface :
Make sure they are ALL in the correct direction :-) before it's too late...
I have chosen to alternate the positions to prevent soldering problems :
of course, it's still important to test ;-)
Then the capacitors. Make sure they are properly oriented ! (I wasted 30 of them...)
The positive side is connected to the diodes with a solder bridge and the other stands alone, waiting to be bent.
Side view :
The finished board :
You can see the "re-steering" diodes at the bottom. That's where data bits will go :-)
It's a pretty big mess of wires ;-)
Tomorrow I'll wire some switches...
04/08/2017 at 00:56 •
work in progress, see also The instruction register (cont'd)
I've been focused on the main datapath but the control logic needs some love...
So let's talk about the PC, or Program Counter register.
From the user's point of view, PC is a user accessible register, that can be read as operand (only as the SRCX operand) and written as a destination. The value is incremented at every cycle to point to the next instruction so the value is visible both to the datapath and the PROM circuit.
From the hardware point of view : the PC is a register with a MUX2 (to select data coming from the datapath's result bus) and an incrementer.
How to use it and why
- Write an instruction address to PC so the next executed instruction is at the new address, as an equivalent of "BRANCH" or "JUMP" (it can be conditional)
- Read PC to perform an operation on the current address and JUMP to a relative address
- Read PC to save it somewhere, just before you jump to a subroutine, to perform a CALL.
The last part requires a few MUX so the PC (actually, that's PC+1 so it returns to the next instruction) can be saved while the new address is computed.
To select the CALL path, we need a bit or signal in the topmost bits of the instruction (to save the other bits and keep the whole architecture orthogonal). That's hard because all the combinations and fields have been used. Except one !
The "NEVER" condition is now renamed "CALL" and behaves as "ALWAYS" but enables the necessary MUX.
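Seen from the programmer's side, the PC update can be sketched in Python as below. The function name and its arguments are hypothetical, and where the saved PC+1 ends up is left open because the log does not say.

```python
def next_pc(pc, dest_is_pc, result, is_call, width=16):
    """Return (new_pc, saved_value).
    new_pc      : the datapath result if PC is the destination, else PC+1
    saved_value : PC+1 when the CALL path is enabled, else None"""
    mask = (1 << width) - 1
    incremented = (pc + 1) & mask          # output of the incrementer
    new_pc = (result & mask) if dest_is_pc else incremented   # the MUX2
    saved = incremented if is_call else None
    return new_pc, saved
```

A plain instruction just advances, a write to PC is a jump, and the CALL condition does the same jump while keeping PC+1 around so the routine can return to the next instruction.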
Concerning the width : since the PC interacts with the datapath (more than I had considered initially) it becomes part of the bitplane. So 16+2 bitslices will be implemented. The backplane will enable and route the proper circuits.
In practice, not all bits might be used. The PROM can reasonably (at most ?) contain 1K instructions (that's up to 24K diodes guys !) and the address decoder really gets large at this point...
The PC's MSB (topmost bits) could be used to store a few bits at your own risk, because I'd love to create a SRAM-based PROM emulator (controlled by a Pi, serving YGWM, yada, yada...) and I can easily use the whole 64K addressing space :-)
The PC's bits are implemented in CCPBRL, with a complementary output. Here is the equivalent circuit for a DFF : 3 relays and a capacitor are required, an additional resistor (in the 10-50 Ohms range) is added to reduce inrush current (and also protect from fast changing data if the input needs time to settle).
The LSB (or bit#0) is pretty simple : at each clock cycle, it toggles. So the inverted output is looped back to the input (through the MUX). Simple !
(actually, that's a simplification because the carry in will always be set to 1 and every bitplane will use the same circuit)
For the other bits, it's barely more complicated : a MUX2 selects between Q and /Q, depending on the "carry in" signal. The real trick however is to get this carry signal...
I know the usual method : the carry chain gets interrupted by a switch if the current stage is "1". That's method a) below :
But this amounts to an AND gate, right ?
And an AND gate can have its inputs swapped, right ? So look at version b) for which the signal passes through the coil.
The great thing about this is that the fanout is 2, which is naturally implemented with a CCPBRL cell. A (hypothetical) single incrementer stage is shown below:
OK it slows the signal down... However, it's possible to
- alternate between circuit a) and circuit b) to halve the propagation delay
- include the carry chain switch's coil inside the DFF circuitry, but that requires 6V power and corresponding signals, to swing between 6V and 12V, and a resistor will waste energy
The incrementer should be quite fast because we want PC+1 at the same time the ALU delivers its result (so it can be written to the register and/or the memory).
The typical carry interrupted AND chain has a big problem with variable load (driving 16 relays can draw a lot of energy and damage the relays' contacts). But the PC+1 should be available in about 5 coil units (time to decode the instruction address, read/decode/amplify the control signals ...)
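The logic of the incrementer (not its timing, which is the hard part with relays) can be sketched bit slice by bit slice in Python. This assumes the usual incrementer relation, carry-out = carry-in AND Q, which is the AND gate discussed above; the function name is hypothetical.

```python
def increment(bits):
    """bits[0] is the LSB. Each stage toggles when its carry-in is 1."""
    carry = 1                      # carry-in of bit 0 is tied to 1
    out = []
    for q in bits:
        out.append(q ^ carry)      # MUX2: take /Q when carry is set, Q otherwise
        carry = carry & q          # the carry only continues through stages at 1
    return out
```

The loop makes the problem visible: stage N cannot settle before stage N-1, so the carry ripples through the whole chain unless the stages are arranged to shorten the path.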
04/11/2017 at 04:47 •
Oh well, this entry is maybe related to the classic piece of the same name.
But the origin goes back to last year's #SPDT16: 16-bits arithmetic unit with relays where I wanted to "ding" a bell when a multiply or divide multi-cycle operation was over.
It turns out that the "bell" I found online was not appropriate. It's an electromechanical oscillating ringer with a nasty sound that lasts as long as you supply it with current. It works but meh.
I want a real bell, a distinct sound that will rise up from the clatter and clickety vibrations of the thousands of relays. Something monotonal and different. But I don't want to bet on a wrong item again on eBay.
Then I remember that video (linked above) and the final minute. Tubular bells. I realise I don't need a "bell" shaped metallic structure, I just need a steel tube !
Well I happen to have some (sold as blinders bars in the decoration shop on the corner) and I just sawed 395mm out of the 16mm diameter rod. There is no special reason for this length, though it sounded ok when I tested the whole bar. The original bar was 2m long, 2m/0.4m=5 so I think I struck the 5th harmonic.
Then it's an easy matter to determine the nicest-sounding vibration mode, and select the best place to "excite" the tube. I pass the tube through a sheet of hard foam and find that it sounds OK when the fixation foam is approximately L/4 (a tiny bit more, probably because the interaction with slightly detuned higher harmonics creates a low frequency vibration). "Excitation" comes from the bottom. It's not practical to hang a 40cm tube from the final structure and I would prefer a horizontal system like with vibraphones...
I can use a horizontal configuration by placing the foam at the exact middle of the tube and reach the 2nd harmonic of the tube, a clear tone but I'd like it to be lower. The L/4 fixation probably got a different harmonic with a lower dominant resonance... Anyway, it works horizontally with just household^Wworkshop items (a bit of antistatic foam, can you do cheaper ?)
Then there is the excitation which must be electronically controlled. The idea is to have one bell for user output (controlled by a GPIO signal) and another later for signaling faults (like invalid opcode, out of range access, whatever).
The electronic problem is that the coil must be energized as a "one shot", monostable way. The coil pulse must last long enough to move the hammer, whatever the duration of the trigger signal.
My idea is to charge a capacitor (through a reasonable resistor) and discharge it through the coil under a relay command, a bit like with the hysteretic relay latch (and its charge pump-like system). But first I have to find the proper coil...
Looking through my collection of random "coil based devices", I find tiny vibrators : a small motor and a little mass of metal that makes the whole vibrate. That one needs 100mA at 1.5V and is a complete assembly, ready to be powered from the 3.3V rail. The voltage can be dropped with a small light bulb in series with the motor.
But this is not a one-shot hammer-style "ding", it is still close to the ringer I wanted to replace.
So one way to get an electromagnet is to disassemble a relay. Here's the core from a 12V relay, with the coil and the magnetic part to "loop back" and focus the magnetic field.
The electromagnet attracts the steel when energised - even a few volts, but the action is not "hammer-like" because one of the parts (the tube or the electromagnet) must move, and they are heavy, and this dissipates the energy, and the tube doesn't "ring".
Then there is the problem of remanence. Even after the pulse into the coil has ended, the electromagnet still sticks to the bar. Well, a freewheel diode across the coil seems to solve the problem.
I'd like the pulse to last for about 1/10s so let's see how much capacitance is needed : t=R×C, with R=57 ohms (approx). C=t/R≈1750µF
I tried with 10000µF and it was apparently too much, but 470µF works well. You can vary the energy into the coil by changing the input voltage. And I have not even considered the inductance of the coil, which certainly messes with my estimates. I just checked : 68mH is not insignificant :-D
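For the curious, the estimate can be redone in Python with the measured values quoted here (57 ohms, 68mH). This is only a first-order sketch: it treats the coil as an ideal series R-L fed by an ideal capacitor, and ignores the moving parts. The function names are mine.

```python
R = 57.0        # coil resistance in ohms (approximate)
L = 68e-3       # coil inductance in henries
T_PULSE = 0.1   # desired pulse length in seconds

c_needed = T_PULSE / R            # from t = R*C, in farads
l_over_r = L / R                  # inductive time constant, in seconds

def rc_tau(c):
    """RC time constant for a capacitor of c farads."""
    return R * c

def is_overdamped(c):
    """Series RLC discharge has no ringing when R^2 > 4L/C."""
    return R * R > 4 * L / c
```

Evaluating `rc_tau(470e-6)` and `rc_tau(10000e-6)` shows that the two capacitors I tried sit well below and well above the 1/10s target, and `is_overdamped` tells whether the inductance makes the discharge oscillate.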
There is an obvious problem with the mechanical structure as the wire is a bit plastic and changes shape after several moves. Steel wire becomes necessary. The tube's fixation system is also not adapted for repeatability...
04/11/2017 at 06:32 •
While rummaging through my stash, looking for electromagnets, guess what I found ?
Yes, memories from Active Surplus, a now closed Toronto surplus store. Damn, I miss you guys...
4 individual flip dots, 10×10mm, that flip at about 1.5V and 70mA (they are measured at 19.2 ohms).
I might use them for showing status (like Z and C flags) and they are easily interfaced in a CCPBRL way : one leg to 0V, the other leg tied to a capacitor (it worked with 100µF under 3V but 470µ is better) that swings from 0 to 3V... or something like that. Hysteresis is built into the system (through ferrite that stores the magnetic flux) so there is no need to maintain a bias current.
Which makes me think...
Without the need to provide bias, a completely electromechanical calculating machine would need no static/constant power !
04/12/2017 at 03:48 •
See log#4: Assembler and disassembler
Try to do that with an x86 !
The YGREC's instruction format is so simple that 9 hexadecimal rotary encoders and a few other switches are enough to "assemble" one instruction word. With the help of some diodes, I admit, but nothing crazy and no weird encoding table...
Luckily I have 4-positions slider switches in stock :-)
There is the question of how to encode the IMM5 field : with the IMM16 or with SRC+CRSW fields ? In either case, a 2-diodes drop is inevitable. However the IMM16 encoders can't be "split" so I Keep It Simple and remain with the SRC+CRSW solution. Simpler and easier though maybe a bit confusing. You'll get used to it ;-)
The switches and diodes are soldered. I have chosen a 26 positions connector for ribbon cable, a handy size to transmit 24 bits of instruction, one supply/enable bit and a key (not implemented).
The 4-positions slide switches are encoded to 2 bits with 4 diodes each.
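Why 4 diodes ? A small Python sketch, assuming plain binary encoding where each switch position pulls up the bit lines that must read 1 (the table name and layout are hypothetical):

```python
# position -> bit lines that receive a diode from that position's contact
DIODES = {0: [], 1: [0], 2: [1], 3: [0, 1]}

def encode(position):
    """Return the 2-bit code seen on the bus for a given slider position."""
    code = 0
    for bit in DIODES[position]:
        code |= 1 << bit     # a diode pulls that bit line up
    return code

total_diodes = sum(len(lines) for lines in DIODES.values())
```

Position 0 needs no diode at all since a disconnected line reads 0, positions 1 and 2 need one each and position 3 needs two, hence 4 diodes per switch.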
I'd like to test the board while it is being wired but I'll have to wait for the build of the disassembly board.
I notice that I have enough room to add a few DIP chips. I'd easily put some 74HCT595 but they don't provide enough current. Amplifying devices would use much room and need to be "high side". Maybe LED drivers ?
Whatever solution I imagine, the board should sink the current, because the input voltage is unknown. Electronics needs a ground reference. I also have the STP16DP05B1R in mind, a 16-ports constant current sink LED driver, that can get up to 80mA (I need 60-70mA). It's like a dual 74HC595 on steroids.
However I have planned from the beginning to have the sense relays ground-based so a higher voltage is required (at least 3V), and I have already bought the dual-diodes for the ROM matrix, connected in the direction of the grounded coils...
I can spend a bit of time reversing the 40 diodes of the assembler board but the PROM boards will still have the wrong diodes, or require more efforts during soldering...
Unlike the rest of the system, the instruction input is using unipolar signals, with disconnected=0 and 1 is signaled by a pull-up (through diodes). The current and voltage must be enough to trip 2 relays in series because the disassembler board must intercept the instructions.
20170413 : I think I've found the solution, see the next log :-)