YGREC8

A byte-wide stripped-down version of the YGREC16 architecture

YGREC can stand for many things, such as YG's Relay Electric Computer or YG's Ridiculous Electronic Contraption. You decide !

#YGREC16 is getting pretty large and moving away from the original #AMBAP inspiration, making it less likely to be implemented within my lifetime. So here is a "back to minimalism" version with
* 256 bytes of Data RAM (plus parity)
* 8 registers, 8 bits each
* fewer relays/gates than the YGREC16
This core is so simple that I focus now on the debug/test access port and the register set's structure.
Like the others, it's suitable for implementation with relays, transistors, SSI TTL, FPGA, ASIC, you name it!

I give up on the idea of playing the Game of Life (the forte of #YGREC-РЭС15-bis) but I design a VHDL version because @llo sees the YGREC8 as a perfect replacement for PICs for his #SteamBot Willie !

A significant reduction of the register set's size is required so I/O must be managed differently, through specific instructions. The register map is expected to be:

  • D1  <= for NOP
  • A1
  • D2
  • A2
  • R1
  • R2
  • R3
  • PC  <= for INV

I shrunk the instruction word down to 16 bits. It is still reminiscent of its older brother, the YGREC16, but I had to make clear cuts... The YGREC8 is a 1R1W machine (like x86) instead of the RISCy YGREC16, to remove one field. Speed should be great, with a pretty short critical datapath, and all instructions execute in one clock cycle (except the LDCx instructions and computed writes to PC).

The fields have evolved with time (I have tried various locations and sizes). For example:

20171116: The latest evolution of the instruction format has added a 9-bit immediate address field for the I/O instructions.
20180112: Imm9 is now removed again...
20181024: changed the names of some fields
20181101: modified the conditions to change Imm3 into Imm4
20180112: Imm9 back again !

There are 18 useful opcodes (plus INV, HLT and NOP), and most share two instruction forms : either an IMM8 field, or a source & condition field. The source field can be a register or a short immediate field (4 bits only but essential for conditional short jumps or increments/decrements).

The main opcode field has 4 bits and the following values:

Logic group :

  • XOR
  • OR
  • AND
  • ANDN

Arithmetic group:

  • CMPU
  • CMPS
  • SUB
  • ADD

Beware : There is no point to ADD 0, so ADD with short immediate (Imm4) will skip the value 0 and the range is now from -8 to -1 and +1 to +8. (see 17. Basic assembly programming idioms)
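
A minimal Python sketch of that Imm4 adjustment, based on the rule noted in the assembler log (the field is incremented when it is not negative). The function names are hypothetical and the two's-complement layout of the 4-bit field is an assumption, not something taken from the VHDL source.

```python
def sign_extend4(field):
    """Interpret a raw 4-bit field (0..15) as a two's-complement value (-8..7)."""
    field &= 0xF
    return field - 16 if field & 0x8 else field


def add_imm4_value(field):
    """Operand that ADD would use for a raw Imm4 field, skipping the useless 0.

    Assumption: non-negative fields are incremented, negative ones are unchanged.
    """
    value = sign_extend4(field)
    return value + 1 if value >= 0 else value


# Enumerate all 16 encodings: the value 0 never appears, +8 becomes reachable.
values = sorted(add_imm4_value(f) for f in range(16))
print(values)
```

With this convention the 16 encodings cover -8 to -1 and +1 to +8, so no code is wasted on an ADD that does nothing.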

Shift group (optional)

  • SH/SA direction is sign of shift, I/R(bit9) is Logic/Arithmetic flag.
  • RO/RC direction is sign of shift, I/R(bit 9) allows carry to be rotated.

Control group:

The COND field has 3 bits (for Imm4) or 4 bits, more than YGREC16, so we can add more direct binary input signals. CALL is moved to the opcodes so one more code is available. All conditions can be negated so we have :

  • Always
  • Z (Zero, all bits cleared)
  • C (Carry)
  • S (Sign, MSB)
  • B0, B1, B2, B3 (input signals, for register-register form)

Instruction code 0000h should map to NOP, and the NEVER condition, hence ALWAYS is coded as 1.

Instruction code FFFFh should map to INV, which traps or reboots the CPU (through the overlay mechanism): condition is implicitly ALWAYS because it's an IMM8 format.

Overall, it's still orthogonal and very simple to decode, despite the added complexity of dealing with 1R1W code.


Logs:
1. Honey, I forgot the MOV
2. Small progress
3. Breakpoints !
4. The YGREC debug system
5. YGREC in VHDL, ALU redesign
6. ALU in VHDL, day 2
7. Programming the YGREC8
8. And a shifter, and a register set...
9. I/O registers
10. Timer(s)
11. Structure update
12. Instruction cycle counter
13. First synthesis
14. Syntax highlighting for Nano
15. Assembly language and syntax
16. Inspect and control the core
17. Basic assembly programming idioms
18. Constant tables in program space
19. Trap/Interrupt vector table
20. Automated upload of overlays into program memory
21. Making...

YGREC8_VHDL.20190101.tgz

assembly passes self-tests

x-compressed-tar - 114.55 kB - 01/01/2019 at 15:25

YGREC8_VHDL.20181230.tgz

assembler reboot, not finished but promising !

x-compressed-tar - 113.57 kB - 12/30/2018 at 07:05

YGREC8_VHDL.20181101.zip

Added the proasic3 VHDL library for rough gate-level simulations, many incoherent or obsolete files though.

Zip Archive - 109.48 kB - 11/01/2018 at 16:04


  • More high-current germanium diodes

    Yann Guidon / YGDES - 02/11/2019 at 23:43 - 0 comments

    I just got a dozen Д305: these Russian Ge diodes are rated for 10A (50V reverse) with a bulky screwable black-painted body. They were cold-tested at 2.1A (the max of my digital PSU) and dropped 0.38V (two outliers at 0.4V), which is the performance of the OA31 at only 0.5A. They are not cheap, but not very expensive either (for what these are).

    The nuts are missing but are claimed to be "metric" and easy to source (M4?). At 2A, and assuming 0.4V drop, the diode would dissipate about 0.8W which should be easy to sustain without extra cooling. For a diode bridge, I suppose I'll make 2 pieces of copper with a few mounting holes (+insulation), as well as a couple of 4mm holes to screw the diodes.

    I think we have a winner here :-D

  • High-current germanium diodes

    Yann Guidon / YGDES - 01/17/2019 at 22:44 - 11 comments

    So far I have only played with nimble, fragile, low-current, point-contact germanium diodes. The "neovintage" power supply of the register set requires higher current than their usual 20 to 40mA rating.

    I got some OA31 from the usual suspect (minifux1) (who provided, among others, essential parts for #Germanium ECL) and they are pretty impressive: rated at 85V reverse voltage and 3.8A forward current. Germanium hates heat (it leaks a lot and the junction should usually not exceed 85°C) so the bulky metal package says we're doing serious business here.

    I measured the I/V curve with a crude setup and I'm rather impressed :-)

    Direct current (A)    Voltage drop (V)
    0.05                  0.24
    0.1                   0.29
    0.15                  0.31
    0.2                   0.33
    0.25                  0.35
    0.3                   0.34
    0.35                  0.35
    0.4                   0.36
    0.45                  0.36
    0.5                   0.37
    0.55                  0.38
    0.6                   0.38
    0.7                   0.4
    0.8                   0.4
    0.9                   0.4
    1                     0.4
    1.2                   0.41
    1.5                   0.43
    2                     0.45

    The curve is quite flat and the drop is lower than that of silicon diodes and comparable to a good Schottky ! This is significant progress compared to selenium rectifiers ;-) and the performance for the low-voltage diode bridge will be great :-)
    I expect a Vpeak of 4.3V at the output of the transformer (conservative estimate). The diode bridge will "eat" 2×0.45V=0.9V, make this 1V. The diode bridge peak will be at least 3.3V, better than the previous estimate of 2.9V :-)
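
The estimate above can be written out as a short Python calculation. This is only a sketch: the dictionary copies a few measured OA31 points from the table, the 4.3V peak is the conservative figure quoted in the text, and the names `OA31_DROP` and `bridge_peak` are made up for the illustration.

```python
# A few measured OA31 points from the table: forward current (A) -> drop (V).
OA31_DROP = {0.5: 0.37, 1.0: 0.40, 1.5: 0.43, 2.0: 0.45}


def bridge_peak(v_peak, diode_drop, diodes_in_series=2):
    """Peak voltage after a full-wave bridge, where two diodes conduct in series."""
    return v_peak - diodes_in_series * diode_drop


v_peak = 4.3                                  # conservative transformer peak (V)
exact = bridge_peak(v_peak, OA31_DROP[2.0])   # worst measured drop, at 2A
rounded = v_peak - 1.0                        # same estimate, drop rounded up to 1V
print(f"bridge peak: {exact:.2f} V, pessimistic: {rounded:.2f} V")
```

Rounding the bridge drop up to 1V keeps a small margin over the measured 0.9V, which is where the 3.3V figure comes from.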

    These measurements were done "cold" and are expected to vary with temperature. Not in a bad way, though, because the drop would be slightly reduced. I don't expect the diodes to require a heatsink because they would dissipate an average of 0.5W and they are already pretty bulky.

    As previously noted, this could be tested with loads made of many 39-ohm resistors in parallel.


    But wait, I also received several AY105K ! They are a little bit smaller and are rated for 5A (a different source claims 3A, which is a convenient margin)

    It's another contender for the diode bridges. Those vintage Italian diodes are smaller and have a more convenient packaging with insulated heat spreader !

    However the drop is higher than the OA31's : 0.5V at only 1A, and 0.59V at 1.5A. I would use them for the other power supplies, where the rails are higher voltage (12V ?) and slightly lower current.


    I also expect to receive a few Д305 for comparison. They are claimed at 10A so they should be impressive, though I won't have enough to make more than one bridge. So far the OA31 is the clear winner :-)

  • Another opcode re-organisation

    Yann Guidon / YGDES - 01/13/2019 at 22:23 - 2 comments

    You modify a detail and all the rest crumbles. So after a few "modified details" added up, I had to take a global view again. Here is the result:

    I did my best to avoid fragmentation, while keeping most of the constraints already established. For example, XOR and SET/CALL differ with the bit 15. The first half (8 opcodes) has not changed, but I have moved all the others : SET/CALL are now situated just after the ALU operations, instead of at the very end. The SHL (Shift/Rotate) unit comes next, without Imm8. LDCx also has no Imm8, followed by IN/OUT using only Imm9.

    This hopefully simplifies the assembler (which must be rewritten) as well as the instruction decoder (fewer gates).

    SND has been moved to the LSB because it looks simpler this way (what do you think ?). This probably amounts to half of the modifications to apply to files and documentation... and also source code ?

  • Mister Bin

    Yann Guidon / YGDES - 01/13/2019 at 19:40 - 0 comments

    Looking through my stock, I find a bag of 33Ω Russian resistors. The value is a bit low and there are only 60 of them (the register set has 64 bits)... Time to hit eBay so I can get nicer-looking resistors than the carbon YAGEO 1/2W used for the prototypes. I want to give a crazy, vintage and out-of-this-world look that will stun novices and professionals alike !

    After a while though, I start to realise something: why bother with the tolerance? Oh wait, if I bin the relays precisely, then +/-5% resistors will wreck the whole thing...

    There are now other approaches:

    • Get 1% resistors => more expensive
    • Bin the relay with its associated resistor
    • Bin the resistors

    It will depend on what curious-looking resistors I find...

  • Power supply for the register set

    Yann Guidon / YGDES - 01/09/2019 at 22:29 - 0 comments

    This log continues 43. Data retention times of hysteretic relay latches, I'm digging more into the practical details now.

    First, the fuse (you don't want this to happen) then the transformer: the TSL40/001 from INDEL. The high voltage output is ignored, I only use the 3.15V outputs. There are 2 outputs and each can supply 3A but the register set needs 2.1A (total) so each half will provide only 1A. The extra power can be used for other parts of the circuit.

    I have not found suitable Selenium rectifiers for the bridge rectifier. The peak current could be in the 3 or 4A range. I just spotted some Germanium power diodes, we'll have to wait for their delivery to test them. At high current, their drop can become "significant" so I cross my fingers : the output should be around 3V or 2.9V. If we consider the drop of silicon diodes, this is achievable, and Schottky diodes can always be used as a last resort.

    A few big capacitors filter the bridge's output then the rail is split into two : each half-rail has a small rheostat to "drop" some fractions of volt, and the result is measured with a small solenoid indicator. I'll calibrate the measurements so each branch has the correct voltage and reading. Some diodes drop the voltage so only 60µA flows through the solenoids at the right working point.

    Finally this sub-sub-branch is split again and powers 2 "slices", each with their own capacitor-inductor-capacitor filter for the extra smoothing.


    So far the only thing I don't have right now is the diode bridges, but I just ordered these parts.

    Germanium has a naughty tendency to drift with temperature. The bad way. The behaviour will change with the load and I don't have all the register boards to draw the expected current. I can however simulate one slice (of 8 relays) with 4 resistors in parallel: I have a bunch of YAGEO 39 Ohms 5% 1/2W that will do the trick, the whole set would be emulated by 32 resistors (with each resistor used at half of max. power rating). It's still progress...
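
A quick Python check of that dummy load, as a sketch only. The 3.0V rail is an assumption taken from the "around 3V or 2.9V" target mentioned earlier in this log, and the variable names are hypothetical.

```python
R = 39.0        # ohms, one YAGEO resistor
P_MAX = 0.5     # watts, resistor power rating
V_RAIL = 3.0    # volts, assumed rail voltage (the text targets 3V or 2.9V)

SLICES = 64 // 8            # 64 relays, 8 relays per slice
PER_SLICE = 4               # resistors in parallel emulating one slice

slice_load = R / PER_SLICE                  # equivalent resistance of one slice
total_resistors = SLICES * PER_SLICE        # resistors for the whole register set
power_each = V_RAIL ** 2 / R                # dissipation per resistor
total_current = total_resistors * V_RAIL / R

print(f"{slice_load:.2f} ohms per slice, {total_resistors} resistors")
print(f"{power_each:.2f} W each ({power_each / P_MAX:.0%} of rating)")
print(f"{total_current:.2f} A total")
```

Under these assumptions each resistor runs at a little under half of its rating, and the total current lands in the same range as the 2.1A budget of the register set, slightly above it.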


    I didn't check enough but... The low voltage secondary is made of 2 windings that are joined in series. I hacked it to make them independent again. Notice the small writings : the 3.15V windings have one pin in common...

    The construction quality is good so it was not hard to separate the 2 windings. I just wish I noticed it earlier !

    The resistance of the 3A circuits is very low, I can't measure it with my multimeter. It's going to be very powerful...

  • Improved linear power supply

    Yann Guidon / YGDES - 01/09/2019 at 03:02 - 0 comments

    Spoiler alert : read the bottom of the page first :-D


    This post is more or less totally not related to CPU design. It's absolutely related to power supplies however !


    Let's just jump to the conclusion :

    It's a particularly "overkill" "solution" to an old problem because it requires 2 identical transformers with dual outputs and could in theory output as much current as one.

    The advantage is a much better cos φ because current is drawn from the mains during the 4 quadrants, instead of only 2. The output ripple is also reduced (and that's the whole point of this circuit ! ) and this is significant for certain types of loads.


    The long story :

    I finally received one TSL 40/001

    This little Polish device is a well built transformer, usually targeted at lamp/valve amplifiers with 3 secondaries : one is a low-current high voltage output, that I will ignore. The other two are 3A 3.15V, with 18W cumulated power. See the end of   43. Data retention times of hysteretic relay latches   for more computations.

    I bought one, that I never received. Then I bought a second that I received so I'm considering the next steps. But what if I received the first one ?

    Using 2 transformers in parallel will not change much because they would be in phase and they both will require a large amount of filtering capacitors to keep the output ripple low.

    Then I realised that the key was the dual, symmetrical but independent outputs. Usually you can wire them to either provide more current (in parallel) or more voltage (series). Or you can power a different circuit. But I have never tried to use the secondary as an isolation transformer, or a de-phaser, though in theory nothing prevents it.

    In the above diagram, I use one secondary to de-phase the other secondary by 90°. Because the windings are identical, there should not be any mismatch and the "direct" secondary could be dampened with a small rheostat to account for the extra resistance in the de-phaser. 2 diode bridges rectify the output and only one of them is "active" at a time (one pair of diodes for each quadrant).
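
To show what the idea would buy if the 90° shift could really be obtained (which the end of this log puts in doubt), here is an idealised Python sketch. It ignores diode drops, load and filter capacitors, and simply looks at the highest rectified sine available at each instant; `envelope_min` is a hypothetical helper name.

```python
import math


def envelope_min(phases_deg, steps=3600):
    """Lowest value reached, over one mains cycle, by the highest rectified phase."""
    lowest = 1.0
    for i in range(steps):
        theta = 2 * math.pi * i / steps
        best = max(abs(math.sin(theta + math.radians(p))) for p in phases_deg)
        lowest = min(lowest, best)
    return lowest


single = envelope_min([0])       # one bridge: the output falls to zero twice per cycle
dual = envelope_min([0, 90])     # two bridges 90 degrees apart (idealised)
print(f"single bridge minimum: {single:.3f}")
print(f"dual bridge minimum:   {dual:.3f}")
```

In this ideal model the unfiltered output never drops below about 71% of the peak, instead of reaching zero, which is why smaller filter capacitors would be enough.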


    This is not as efficient as 3-phase power but it is totally what our grand-dads would have done if they could.

    I'm curious to know if this has been done before. It looks like this kind of circuit should be in some textbooks but I have never seen it. The closest I've seen is a very large inductor in front of a PSU to smooth and correct cos φ but 2 transformers ?...

    I'm happy because this is a question that has been spinning in my head for more than 2 decades and I couldn't resign myself to "using larger filtering capacitors".

    ...

    Has anybody here seen this circuit before ?


    @Bharbour  notified me that this wouldn't work and he provided a 'scope screenshot.

    I'm very surprised because this goes against my understanding of how a transformer works.

    So I tried too.

    And I find no phase shifting either ! The output mostly copies the input.

    I'm so disappointed :-D

  • Improved ROP2

    Yann Guidon / YGDES - 01/03/2019 at 23:23 - 5 comments

    In the log 5. YGREC in VHDL, ALU redesign   I show how the ROP2 unit shares gates with the adder.

    The "Pass" datapath is quite annoying with the 3rd multiplexer so I moved it upstream, taking advantage of the 3-input gates.

    The merged gate is now a type AX1 and saves a tiny bit of latency on the ROP2 critical datapath, as well as one gate. This is valid for the ProASIC3 as well as other FPGA, less so for discrete or MUX-based technologies (such as relays). This change is significant enough, however, to justify a redesign of the opcode map, following these constraints :

    • The SET opcode must "map" to the XOR opcode, F0 and F1 must be identical but F3 (or F2?) must be opposite. There is no constraint with CALL anymore because the datapath has a couple of bypasses.
    • Computing the NEG  signal should be a bit easier and I want to get rid of the XOR gate. I re-organised the opcodes so the function is MAJ3 (which must be added to the #VHDL library of ProASIC3 gates)

    The new mapping is :

    This means I have to redesign the ALU "a bit" but with more emphasis on place&route. The above new circuit is easy to process by hand. There are however a few details that change with the order of the bit, during comparison. From the previous version of the ALU8 code:

    -- Initial XOR of the operands
    XOR_DST <= (7=> negate_DST and not compare_signed, others => negate_DST);
    XOR_SRC <= (7=>                    compare_signed, others =>        '0');
    DSTX <= DST XOR XOR_DST;
    SRCX <= SRC XOR XOR_SRC;
    

    Bus names have changed since 2017, DST=>SND and SRC=>SRI. The code says that SRI(7) is XORed with the control signal "compare_signed", and SND(7) with its inverse. This adds an inconvenient corner case that I'd like to get rid of... It doesn't affect the critical datapath a lot but placement gets trickier.
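
A small Python model of those two masks may help to see why the corner case exists: flipping bit 7 of both operands turns a signed comparison into an unsigned one. This is a sketch under assumptions: the comparison is modelled as SRI + not(SND) + 1 with the carry out as the result, which is a guess at the adder's convention, and the function names are hypothetical.

```python
def pre_xor(snd, sri, negate_snd, compare_signed):
    """Model of the initial XOR stage (DST=SND, SRC=SRI), 8-bit operands."""
    low = 0x7F if negate_snd else 0x00                        # bits 6..0 of XOR_DST
    bit7 = 0x80 if (negate_snd and not compare_signed) else 0x00
    xor_snd = low | bit7
    xor_sri = 0x80 if compare_signed else 0x00                # only bit 7 of XOR_SRC
    return snd ^ xor_snd, sri ^ xor_sri


def compare_carry(snd, sri, compare_signed):
    """Carry out of SRI + not(SND) + 1 (assumed carry-in of 1)."""
    sndx, srix = pre_xor(snd, sri, True, compare_signed)
    return (srix + sndx + 1) >> 8


def signed8(x):
    """Two's-complement value of an 8-bit pattern."""
    return x - 256 if x & 0x80 else x


# Exhaustive check over all operand pairs.
for snd in range(256):
    for sri in range(256):
        assert compare_carry(snd, sri, False) == int(sri >= snd)
        assert compare_carry(snd, sri, True) == int(signed8(sri) >= signed8(snd))
print("unsigned and signed comparisons agree with the model")
```

The same adder therefore serves CMPU and CMPS; only the treatment of bit 7 differs, which is exactly the irregularity that complicates placement.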

    I'll "tweak" that later but at least the SND input could be inverted by a XOR3 instead of XOR2, or the specific NEG input could get a special treatment.

    Layout is pretty easy:

    That's a good base for the ADD8 that connects to it (I didn't show the P, G and XOR outputs).

    One nice thing with this kind of pre-routing is the opportunity to spot optimisations for later in ASIC. For example: there are MUXes driven by the same control signal so they can share a buffer and inverter with a direct neighbour.

  • Improved Shuffling Unit

    Yann Guidon / YGDES - 01/03/2019 at 21:07 - 0 comments

    The pressure on the ISA increases and I am already forced to squeeze 2 instructions in the IN and LDCL opcodes. Naturally I'm looking at the shifting/shuffling/barrel shifter and the 4 opcodes.

    Things have changed since the last time because the short immediate is now a signed 4-bit field ! This means that an opcode such as ROR can encode both left and right directions, saving space in the opcode map.

    SHL, SHR => SH
    ROL, ROR => ROT

    There are also two other desirable variations :

    • Rotate through Carry (not sure it is really necessary with all the predicated instructions ?)
    • Shift Arithmetic

    With Imm4, there is no use for the Imm8 field now as well, which saves another bit. The shifter will use only two opcodes by moving the arithmetic flag/carry flag in the R/I8 flag.
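
As an illustration of a signed shift count, here is a Python sketch of the two merged operations on 8-bit values. The direction convention (positive = left, negative = right) is an assumption, since the log only says that the direction is the sign of the shift, and the names `shift8` and `rot8` are hypothetical. Rotation through carry is left out.

```python
def shift8(value, amount, arithmetic=False):
    """SH/SA on 8 bits: the sign of 'amount' selects the direction.

    Assumption: positive amounts shift left, negative amounts shift right.
    """
    value &= 0xFF
    if amount >= 0:
        return (value << amount) & 0xFF
    if arithmetic and value & 0x80:
        value -= 256                      # sign-extend so that >> replicates the MSB
    return (value >> -amount) & 0xFF


def rot8(value, amount):
    """ROT on 8 bits: a negative amount rotates in the opposite direction."""
    value &= 0xFF
    n = amount % 8                        # e.g. -1 becomes 7
    return ((value << n) | (value >> (8 - n))) & 0xFF


print(hex(shift8(0x81, 1)), hex(shift8(0x81, -1)), hex(shift8(0x81, -1, arithmetic=True)))
print(hex(rot8(0x81, 1)), hex(rot8(0x81, -1)))
```

One opcode plus a signed count replaces the SHL/SHR pair, and the same holds for ROL/ROR.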

    What could these opcodes be used for ? They should remain reserved for now but I can already see the Imm8 extended back to 9 bits for the IN and OUT opcodes, leaving one remaining free opcode slot...

  • Assembly in VHDL works

    Yann Guidon / YGDES - 01/01/2019 at 16:04 - 0 comments

    The latest archive upload shows the new assembler, which just passed all the self-tests. The few corner cases gave some difficulties but they were solved.

    Part of the self-test includes throwing "stuff" at the assembler and disassemble it, to see if the parser chokes on anything. Of course this is not perfect but most cases of user abuse are covered.

    The other part scans the WHOLE INSTRUCTION SPACE. The instruction is disassembled, reassembled and re-disassembled to check discrepancies. This is where the ambiguities become obvious.

    • Instructions with small immediate values are always assembled to Imm4 but it is also possible to encode them in binary as Imm8. This is a sort of "funnel" and 824 Imm8 codes are converted to Imm4 codes.
    • ADD SND Imm4 increments the Imm4 field when it is not negative. This creates an additional mismatch with ADD SND Imm8, which creates an additional "funnel" of 64 codes.
    • LDCH and LDCL don't take conditions into account, which creates another "funnel" of 3712 codes.

    All these tests ensure that no "blind spot" or undefined behaviour exists, not just in the assembler and disassembler, but also in the ISA itself.
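
The round-trip scan can be sketched in Python on a toy format. This is not the YGREC8 encoding: the 9-bit word below (one form bit plus an 8-bit payload) is invented for the illustration, and it only shows how a "funnel" appears when a small immediate has more than one binary encoding.

```python
def signed(value, bits):
    """Two's-complement interpretation of the low 'bits' bits of value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def disassemble(word):
    """Toy format: bit 8 set -> Imm8 payload, clear -> Imm4 in the low nibble."""
    payload = word & 0xFF
    if word & 0x100:
        return signed(payload, 8)
    return signed(payload, 4)          # the upper nibble is ignored in this toy


def assemble(value):
    """Always prefer the short form when the value fits in 4 signed bits."""
    if -8 <= value <= 7:
        return value & 0xF
    return 0x100 | (value & 0xFF)


funnel = 0
for word in range(512):                # scan the whole toy instruction space
    text = disassemble(word)
    again = assemble(text)
    assert disassemble(again) == text  # the meaning must survive the round trip
    if again != word:
        funnel += 1                    # a non-canonical encoding was funnelled

print(funnel, "codes are re-encoded to a canonical form")
```

The scan passes as long as the second disassembly matches the first, while the funnel count measures how many binary codes are redundant.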

    Overall, VHDL is perfectly capable of assembling and disassembling instructions with only the basic feature set. It's not the easiest language but the Ada legacy helps a lot ! Thanks to GHDL there is no need for an external software module and this package ygrec8_asm adds a lot of convenience in the simulator, emulator and debugger !

  • Assembly syntax

    Yann Guidon / YGDES - 12/24/2018 at 07:01 - 0 comments

    I have to rewrite the assembler and disassembler... so here is the census of the instructions and their syntax, in order of complexity :-)

    • NOP => 0000h
      INV,
      HLT => FFFFh
    • OVL  Imm8 => FF........h
    • IN,
      OUT SND Imm8 => (no SRI)  Ch snd i/o Imm8
    • LDCL,
      LDCH  SND SRI => (no Imm8) Dh snd l/h ..... sri  (condition not supported yet)
    • XOR,
      OR,
      AND,
      ANDN,
      CMPU,
      CMPS,
      SUB,
      ADD,
      SHR,
      SHL,
      SAR,
      ROL,
      SET,
      CALL SND  Imm8/Imm4 [cond2]/SRI [cond3] => (see diagram)

    This should help structure the code :-) There are 4 special cases to check, and then it's all very orthogonal.

Discussions

Yann Guidon / YGDES wrote 11/04/2018 at 07:11 point

Another note for later :
writing to A1 or A2 starts a fetch from RAM. In theory the latency is the same as instruction memory and one wait state would be introduced. However the processor can also write directly so the wait state would be only on read to the paired data register...

Yann Guidon / YGDES wrote 11/04/2018 at 06:55 point

Note for later : don't forget the transparent latch on the destination register address field, for the (rare) case of LDCx, because the 2nd cycle doesn't preserve the opcode etc.

Yann Guidon / YGDES wrote 11/04/2018 at 07:18 point

OK, not a transparent latch, but a DFF and a mux, plus some logic to control it.

-- DFF, every cycle :

SND_latched <= SND_field;

LDCx_flag <= '1' when (LDCx_flag='0' and opcode=opc_LDC and writeBack_enabled='1')   else '0';

-- MUX2 :

WriteAddress <= SND_latched when LDCx_flag = '1' else SND_field;

______

Note : LDCx into PC must work without wait state because it's connected directly to SRI, as an IMM8, and no extra delay is required. PC wait state is required for ADD/ROP2/SHL and IN.

Frank Buss wrote 10/27/2018 at 12:51 point

Do you really plan 8 byte-wide registers? This would require thousands of relays :-)

Yann Guidon / YGDES wrote 10/27/2018 at 14:26 point

no :-)

8 registers, 8 bits each = 64 storage bits.
1 relay per bit => 64 relays


The trick is to use the hysteretic mode of the relays :-)

Frank Buss wrote 10/27/2018 at 16:17 point

Ok, makes sense. Maybe change the project description, someone might think you are planning a 64 bit architecture.
BTW, could this be parametrized for the address and data size? If you implement it in VHDL, you could use generics for this, would be no additional work to use just the generic names instead of hard coded numbers. Except maybe some work for extending the instruction opcodes.

Yann Guidon / YGDES wrote 10/27/2018 at 17:16 point

Frank : DAMNIT you're right !

I updated the description...

Yann Guidon / YGDES wrote 10/27/2018 at 17:19 point

For the parameterization : it doesn't make sense at this scale. Every fraction of bit counts and must be wisely allocated.

Larger architectures such as #YASEP Yet Another Small Embedded Processor and #F-CPU have much more headroom for this.

Bartosz wrote 11/08/2017 at 16:40 point

Will this work on Epiphany or oHm or another cheap machine?

Yann Guidon / YGDES wrote 11/08/2017 at 18:07 point

I'm preparing a version that would hopefully use less than half of a A3P060 FPGA, which is already the smallest of that family that can reasonably implement a microcontroller.

But it's a lot less fun than making one with hundreds of SPDT relays !

Bartosz wrote 11/14/2017 at 14:13 point

The question is the price and the possibility to buy one.

Yann Guidon / YGDES wrote 11/14/2017 at 16:08 point

@Bartosz : what do you want to buy ?

If you can simulate and/or synthesise VHDL, the source code is being developed and available for free, though I can't support all FPGA vendors.

If you want a ready-made FPGA board, that could be made too.

If you want relays, it's a bit more tricky ;-)

I have just enough RES15 to make my project and it might take a long while to succeed. There will be many PCB and other stuff.

However if, in the end, I see strong interest from potential buyers, I might make a cost-reduced version with easily-found minirelays. I don't remember well but the Chinese models I found cost around 1/2$ a piece. Factor in PCB and other costs and you get a very rough price estimate... It's not cheap, it's not power efficient, it's slow and won't compute useful stuff... But it certainly can make a crazy nice interactive display, when coupled with flip dots :-D

So the answer is : "it depends" :-D
