
YGREC8

A byte-wide stripped-down version of the YGREC16 architecture

YGREC can stand for many things, such as "YG's Relay Electric Computer", "Yann's Germanium and Relay Equipped Computers" or "YG's Ridiculous Electronic Contraption". You decide !

#YGREC16 is getting pretty large and moving away from the original #AMBAP inspiration, making it less likely to be implemented within my lifetime. So here is a "back to minimalism" version with
* 256 bytes of Data RAM (plus parity ?)
* 8 registers, 8 bits each (including PC)
* fewer relays/gates than the YGREC16
This core is so simple that I focus now on other issues, such as the debug/test access port, the register set's structure, I/O, power reduction...
Like the others, it's suitable for implementation with relays, transistors, SSI TTL, FPGA, ASIC, you name it (as long as it uses boolean logic)!

After the explorations with #YGREC-РЭС15-bis, I reached several limits and I decided to scale it down as much as possible. And this one will be implemented both with relays and VHDL, since the YGREC8 is a great replacement for Microchip's PICs.

A significant reduction of the register set's size is required so I/O must be managed differently, through specific instructions. The register map is now:

  • D1  <= for NOP
  • A1
  • D2
  • A2
  • R1
  • R2
  • R3
  • PC  <= for INV

The instruction word is shrunk down to 16 bits. It is still reminiscent of the YGREC16 older brother but I had to make clear cuts... The YGREC8 is a 1R1W machine (like x86) instead of the RISCy YGREC16, to remove one field. Speed should be decent, with a pretty short critical datapath, and all the instructions execute in one clock cycle (except the LDCx instructions and computed writes to PC).

The fields have evolved with time (I have tried various locations and sizes). For example:

20171116: The latest evolution of the instruction format has added a 9-bit immediate address field for the I/O instructions.
20180112: Imm9 is now removed again...
20181024: changed the names of some fields
20181101: modified the conditions to change Imm3 into Imm4
20180112: Imm9 back again ! (for speed/latency reasons, no register operand is provided, an indirect IO register is used instead, and having more IO space is more desirable, otherwise only imm4 is available if a register operand is used)

There are 18 useful opcodes (plus INV, and the pseudo-opcodes HLT and NOP), and most share two instruction forms : either an IMM8 field, or a source & condition field. The source field can be a register or a short immediate field (4 bits only but essential for conditional short jumps or increments/decrements).

The main opcode field has 4 bits and the following values:

Logic group :

  • OR
  • XOR
  • AND
  • ANDN

Arithmetic group:

  • CMPU
  • CMPS
  • SUB
  • ADD

Beware : There is no point to ADD 0, so ADD with short immediate (Imm4) will skip the value 0 and the range is now from -8 to -1 and +1 to +8. (see 17. Basic assembly programming idioms)
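The skipped zero can be pictured with a small Python sketch. It assumes that non-negative field values are simply incremented by one while negative values keep their 4-bit two's complement pattern. This is an inference, not a documented rule, but it agrees with the words 75BF (ADD 8) and 77C7 (ADD -8) from the "jumping back and forth" log if the field sits in bits 6 to 3. The function names are hypothetical.

```python
def encode_imm4(value: int) -> int:
    """Return the 4-bit field for a short ADD immediate (0 is not encodable)."""
    if value == 0 or not -8 <= value <= 8:
        raise ValueError(f"Imm4 for ADD must be in -8..-1 or +1..+8, got {value}")
    if value > 0:
        value -= 1           # assumption: +1..+8 are stored as 0..7
    return value & 0xF       # -8..-1 keep their two's complement pattern


def decode_imm4(field: int) -> int:
    """Inverse of encode_imm4: recover the signed amount from the 4-bit field."""
    field &= 0xF
    if field & 0x8:
        return field - 16    # 8..15 stand for -8..-1
    return field + 1         # 0..7 stand for +1..+8
```

With this mapping, `decode_imm4((0x75BF >> 3) & 0xF)` gives 8 and `decode_imm4((0x77C7 >> 3) & 0xF)` gives -8.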

Shift group (optional)

  • SH/SA direction is sign of shift, I/R(bit9) is Logic/Arithmetic flag.
  • RO/RC direction is sign of shift, I/R(bit 9) allows carry to be rotated.

Control group:

The COND field has 3 bits (for Imm4) or 4 bits, more than YGREC16, so we can add more direct binary input signals. CALL is moved to the opcodes so one more code is available. All conditions can be negated so we have :

  • Always
  • Z (Zero, all bits cleared)
  • C (Carry)
  • S (Sign, MSB)
  • B0, B1, B2, B3 (for register-register form, we can select 4 bits to test from user-defined sources)

Instruction code 0000h should map to NOP, and the NEVER condition, hence ALWAYS is coded as 1.

Instruction code FFFFh should map to INV, which traps or reboots the CPU (through the overlay mechanism): the condition is implicitly ALWAYS because it's an IMM8 format.

Overall, it's still orthogonal and very simple to decode, despite the added complexity of dealing with 1R1W code.


This project is more than an ISA or one implementation : the goal is to become a platform. See log 82. Project organisation

Logs:
1. Honey, I forgot the MOV
2. Small progress
3. Breakpoints !
4. The YGREC debug system
5. YGREC in VHDL, ALU redesign
6. ALU in VHDL, day 2
7. Programming the YGREC8
8. And a shifter, and a register set...
9. I/O registers
10. Timer(s)
11. Structure update
12. Instruction cycle counter
13. First synthesis
14. Coloration syntaxique pour Nano
15. Assembly language and syntax
16. Inspect and control the core
17. Basic assembly programming idioms
18. Constant tables in program space
19. Trap/Interrupt vector table
20. Automated upload of overlays into program memory
21. Making room for another instruction
22. Opcode map
23. Sequencing...


x-compressed-tar - 372.71 kB - 11/20/2021 at 14:46


x-compressed-tar - 372.09 kB - 11/20/2021 at 08:56


YGREC8_VHDL.20211118.tgz

assembler refactored, supports DW and re-assembly

x-compressed-tar - 360.31 kB - 11/18/2021 at 17:29


YGREC8_VHDL.20211114.tgz

ALU8 still bork and assembler is incomplete

x-compressed-tar - 359.56 kB - 11/14/2021 at 08:08


YGREC8_VHDL.20211112.tgz

a better assembler starts to work.

x-compressed-tar - 358.68 kB - 11/12/2021 at 06:18


  • counters

    Yann Guidon / YGDES6 days ago 0 comments

    In the log 129. Counters strike ! I started considering the new version of the "counters". It's easier said than done and the many clock domains don't ease the design. But I managed to draft one "bit":

    Each byte can be read with one byte select per counter (but the latency is quite high as the data ripples through many gates). Writing is another story because it's asynchronous, so a byte clear precedes a selective set. Each byte has its own clock domain, selected among various sources, including the preceding counters. And the counter's value comes either from the local incrementer or the previous counter's value, for the cases of arbitrary frequency dividers (think: baud generator as a trivial example, the previous incrementer is left unused in this case).

    This is quite scalable, the size of the pool of counters can be configured at will. An 8×8 bits block seems like a good compromise but nothing keeps one from changing that.

    . . . . . . . . . . . .

    Looking further, 2 main concerns arise:

    1. Latency : I don't want the I/O space to be too constrained or constraining. There might be a cacophony of wires, registers and decoders that would probably slow everything down. So let's consider the I/O space as "mostly asynchronous" from the core's point of view. This means that IN and OUT should have a "completion" flag that lets the core resume operation once the IO is done. Which means I must adapt/change/update the instruction's FSM...
    2. The register map. So far I have identified that each counter byte could have 2 addresses: one for read and write of the value, the other for control and status. Thus there is no constraint on the total number of counters: implement as many as you like. However, timing and scheduling becomes critical at this point so look at the previous point.

  • Tri-mode TAP

    Yann Guidon / YGDES7 days ago 0 comments

    Now that I have an assembler, how do I upload the program into the core ?

    The TAP (Test Access Port) allows one to upload data, instruction by instruction, then make the core run them. However this is quite complex and not suitable for autonomous operation, like, when the core works alone.

    One thing I would loooove is to hook the circuit to a serial port of my computer and then cat a binary file through /dev/ttyS0 to the circuit. When 512 bytes are transferred, the processor starts the uploaded program. It's convenient from the user's point of view, though it doesn't allow full debugging and requires baud generation circuits. Some sort of external adaptation circuit on a dongle must be designed.

    Another interesting situation would be that the TAP circuit itself goes to fetch the program by itself. Usually it comes from a SPI Flash device, and I have already developed such a system in 2014/2015 for #WizYasep.

    I am familiar with SPI Flash devices: #SPI Flasher implements a few protocols already. SPI usually works with 4 signals :

    • MISO
    • MOSI
    • CLK
    • SEL

    Compare this to the TAP interface:

    • Din
    • Dout
    • CLK
    • R/W
    • /Reset (optional)

    It looks quite similar, right? Furthermore, the TAP contains the shift register and the other circuits that write to the program memory, which is indexed by the PC register (the latter conveniently wraps around to 0 when the upload is over, and signals the FSM to start running said program).
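The "PC wraps around to 0" mechanism can be modelled in a few lines of Python. This is only a sketch under stated assumptions: 256 instruction words of 16 bits (512 bytes, matching the upload size mentioned earlier), bytes arriving low byte first (the real bit and byte order are not specified here), and a hypothetical `ProgramLoader` class standing in for the TAP shift register and FSM.

```python
class ProgramLoader:
    """Toy model of an upload that fills program memory through the PC."""

    WORDS = 256  # 8-bit PC and 16-bit instructions: 256 * 2 = 512 bytes

    def __init__(self):
        self.memory = [0] * self.WORDS
        self.pc = 0
        self.running = False
        self._low = None  # first byte of the word being assembled

    def push_byte(self, byte: int) -> None:
        if self.running:
            raise RuntimeError("upload already complete")
        if self._low is None:
            self._low = byte & 0xFF          # assumption: low byte comes first
            return
        self.memory[self.pc] = ((byte & 0xFF) << 8) | self._low
        self._low = None
        self.pc = (self.pc + 1) & 0xFF       # the 8-bit PC wraps by itself
        if self.pc == 0:
            self.running = True              # wrap-around signals "start"
```

Feeding 512 bytes leaves `pc` at 0 and `running` set, with no separate byte counter needed.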

    So there are already about one half of the circuits in place for autonomous loading, now the trick is how to hack the existing system to add the new features.

    First, the Y8 interface needs to know the operating mode. So far, the TAP had only one mode so the question didn't arise but we have identified 3 modes, that would be conveniently selected by weak pull-up/down resistors on the pins:

    1. TAP mode : R/W low, CLK low (?)
    2. External/slave programming mode : R/W low, CLK high (?)
    3. Autonomous SPI programming mode : R/W high
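As a sketch, the mode selection above boils down to a two-input decode. The pin levels for the first two modes are still marked "(?)" in the list, so the mapping below is tentative, and the names `BootMode` and `select_mode` are made up for the illustration.

```python
from enum import Enum


class BootMode(Enum):
    TAP = "TAP mode"
    EXTERNAL = "external/slave programming mode"
    SPI = "autonomous SPI programming mode"


def select_mode(rw_high: bool, clk_high: bool) -> BootMode:
    """Decode the pin levels sampled at reset into one of the three modes."""
    if rw_high:
        return BootMode.SPI       # R/W high: CLK level does not matter
    if clk_high:
        return BootMode.EXTERNAL  # R/W low, CLK high (tentative)
    return BootMode.TAP           # R/W low, CLK low (tentative)
```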

    It starts to look like what modern FPGA provide...

    Usually, the R/W pin is pulled up by an internal weak resistor, which is overridden by an external upload/control device. The default behaviour is to get the stream of 4096 bits from external SPI storage. I'm not sure yet about the CRC/scrambling, which can be designed/added later, but should not remain an afterthought forever.

    The state of the R/W pin is sampled by the FSM during the Reset sequence, just after the /Reset pin is brought high. Note that the TAP is still functional even when the rest of the chip is held in RESET state, since the TAP has its own reset sequence and clock domain.

    Note: the TAP used to have 3 pins only (plus external RESET though it could also be controlled from within the TAP registers). See https://hackaday.io/project/27280-ygrec8/log/182563-tap-pins for the diagrams. Now, in addition to the previously defined interface, the external debug/upload device must control the /Reset pin to take over the internal FSM and prevent conflicts with the internal operation. The minimal number of pins for the probe is 6 but more become desired:

    • Din
    • Dout
    • CLK
    • R/W
    • SSel (open collector)
    • /Reset (open collector)
    • Vtarget
    • GND

    It's on the "high tier" of the pin count range for probes but that's the price for sharing a SPI bus between 2 masters. It's also for safety due to the uncertainty of the support of "half-duplex" mode by SPI Flash chips. I added these 2 pins:

    • The debug probe would certainly want to access the SPI Flash for programming, and the SPI Flash should not interfere with the normal TAP operation. A separate pulled-up SPI Sel pin (SSel) becomes necessary so the SPI Flash is deselected during TAP operation.
    • I also added Vtarget because the TAP probe shouldn't operate if the target circuit is not energised. Some voltage translation buffers will ensure electrical integrity, prevent "ghost powering" through pins' protection diodes, and the probe must be sure that the target is correctly powered.

    I know it's getting more complex but not everything...


  • The YGREC8 Assembler

    Yann Guidon / YGDES11/20/2021 at 16:20 0 comments

    This is a copy of the current documentation for y8asm :

    _________________________________________________________________

    The YGREC8 Assembler

    Y8asm is a 2-pass assembler written in VHDL. It transforms source/assembly language .y8 files into .hyx files suitable for flashing/emulation. The VHDL code is compiled and elaborated with GHDL by the provided build/test script, and generates an executable program.

    Pre-processing

    The program can't include files or manage macros. Use external programs such as cpp or m4. You could also concatenate source files with cat to a temporary file.

    Program invocation

    The program runs on the command line interface or in a script. There are 3 active parameters:

    • Input file name: -gname

    $ ./y8asm -gname=example.y8

    will assemble the file named example.y8.

    The output file name is derived from this parameter, with the .hyx suffix.

    The output file will be overwritten without a warning.

    • Symbol table output: -gdump

    $ ./y8asm -gname=example.y8 -gdump=yes

    appends a dump of the user-defined symbols in the comments at the end of the .hyx output file.

    The dump also contains the number of times the symbol has been referenced (though it might be over-estimated because it includes both passes). That's still convenient if you want to clean up your source code and prune some useless lines.

    The option -gdump=full dumps all the defined symbols, including the reserved words, keywords, opcodes etc.

    • Maximum Symbol Length : -gmax_sym_len

    $ ./y8asm -gname=example.y8 -gmax_sym_len=12

    changes the maximum length of symbols (labels, identifiers etc.) from the default 16 characters to 12 characters.

    Basic Syntax

    • Comments start at ';' and remove the rest of the line.

    • All the symbols are translated to upper case during parsing.

    • The symbols can not be redefined, unless they are in a nested context (not yet implemented).

    • User's symbols have a range dependent on the VHDL simulator, "at least 32 bits". This is practical for intermediary values since the assembler checks each range for every instruction field.

    • Identifiers can contain the following letters: '_', 'A' to 'Z' and '0' to '9' (but no digit at the first position).

    • Separators are space ' ', comma ',', horizontal tab, and character 160 (&nbsp; in HTML).

    • The dollar sign '$' represents/returns the value of the current address.

    • Numbers are decimal by default. Binary numbers have the b suffix and h is the hexadecimal suffix.

    • Numerical computations ("expressions") are always between parentheses, to avoid precedence. The assembler supports the following arithmetic operations: '+', '-', '*', '/', '%'. VHDL does not allow easy boolean operations on integers, unless you go the std_logic_vector route, but I can't change the standard... I'll add them later if needed.

    • By default, code is assembled starting at address 0. Don't forget to ORG if you need otherwise.
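The number syntax above (decimal by default, b suffix for binary, h suffix for hexadecimal) can be sketched in Python as follows. This is not the y8asm code, which is written in VHDL; `parse_number` is a hypothetical helper, and accepting lower-case suffixes is an assumption based on the upper-casing rule for symbols.

```python
def parse_number(token: str) -> int:
    """Parse a literal: decimal by default, 'b' = binary, 'h' = hexadecimal."""
    text = token.strip().upper()
    # Test 'H' first: 'B' is also a valid hexadecimal digit (as in 1Bh).
    if text.endswith("H"):
        return int(text[:-1], 16)
    if text.endswith("B"):
        return int(text[:-1], 2)
    return int(text, 10)
```

For example, `parse_number("42")`, `parse_number("101010b")` and `parse_number("2Ah")` all return 42. How the real assembler tells a hexadecimal literal such as FFh apart from an identifier is not covered by this sketch.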

    Pseudo-instructions

    The assembler provides some housekeeping commands that greatly help even basic programs.

    • END

    Ends parsing of the file. The source file can contain any garbage below this line.

    • DEF

    DEFines a new user symbol.

    The line

    DEF plop 42

    defines the symbol "PLOP" and assigns the value 42. The symbol "PLOP" can be used later in the source file, and even before its definition in instructions and DW.

    See the -gdump=yes command line argument to list all the user-defined symbols.

    • Label:

    The line

    plop:

    is a shorthand to the line "DEF plop $".

    There is no separator before ':' and the label must be alone on the line.

    • ORG

    Change the address to which the next instruction will be assembled/stored.

    ORG 42

    means that the next instruction will be stored at address 42.

    The value may be a number, symbol or expression, but can not be post-defined or have a value lower than the current address.

    • DW

    DW 42

    will output the value 42 to the instruction memory space, as if it was an instruction.

    Just like an instruction, it is 16-bit wide and can have a post-defined value.

    Instructions

    The assembler pre-defines the following...


  • jumping back and forth, and carry

    Yann Guidon / YGDES11/20/2021 at 15:34 0 comments

    The Y8 can jump:

    set label pc

    it can also jump relative :

    add (label-$) pc

    and it can even jump relative conditionally :

    add (label-$) pc ifc

    But then the range is limited because only 4 bits are available for the signed amplitude. And I have already sacrificed one condition bit...

    With the new assembler, here is the best that can be reasonably done:

    add (forward-$) PC ifnz
    nop ; 1
    nop ; 2
    nop ; 3
    nop ; 4
    nop ; 5
    nop ; 6
    nop ; 7
    forward:
    
    backwards:
    nop ; -8
    nop ; -7
    nop ; -6
    nop ; -5
    nop ; -4
    nop ; -3
    nop ; -2
    nop ; -1
    add (backwards-$) PC ifz
    

    The output in .hyx:

    ;;hyx1
    ; L1: add (forward-$) PC ifnz
    75BF ; @0: ADD   8 PC IFNZ   
    ; L2: nop ; 1
    0000 ; @1: NOP               
    ; L3: nop ; 2
    0000 ; @2: NOP               
    ; L4: nop ; 3
    0000 ; @3: NOP               
    ; L5: nop ; 4
    0000 ; @4: NOP               
    ; L6: nop ; 5
    0000 ; @5: NOP               
    ; L7: nop ; 6
    0000 ; @6: NOP               
    ; L8: nop ; 7
    0000 ; @7: NOP               
    ; L9: forward:
    ; = 8
    ; L11: backwards:
    ; = 8
    ; L12: nop ; -8
    0000 ; @8: NOP               
    ; L13: nop ; -7
    0000 ; @9: NOP               
    ; L14: nop ; -6
    0000 ; @10: NOP               
    ; L15: nop ; -5
    0000 ; @11: NOP               
    ; L16: nop ; -4
    0000 ; @12: NOP               
    ; L17: nop ; -3
    0000 ; @13: NOP               
    ; L18: nop ; -2
    0000 ; @14: NOP               
    ; L19: nop ; -1
    0000 ; @15: NOP               
    ; L20: add (backwards-$) PC ifz
    77C7 ; @16: ADD  -8 PC IFZ    
    
    ;;;; SYMBOL DUMP :
    ; * 'FORWARD'=8 ref:1 / sym_usr
    ; * 'BACKWARDS'=8 ref:1 / sym_usr
    

    A backwards loop could then contain 8 instructions (including a test for the end of the loop) but the forward jump can only skip over 7 instructions, despite the ability to encode the constant 8 when dealing with the PC register.

    The offset 1 is still possible and this represents the next instruction, which would be trivial to execute otherwise. And the offset 8 points to the 8th instruction after the skipped block, it's not the size of the skipped block.
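The range check that an assembler has to apply to `(label-$)` in this conditional form can be sketched as below. The helper name is hypothetical and the sketch only mirrors the limits stated above (-8 to -1 and +1 to +8), not the actual y8asm source.

```python
def short_jump_offset(label_addr: int, current_addr: int) -> int:
    """Return label - $ if it fits the short conditional form, else raise."""
    offset = label_addr - current_addr
    if offset == 0 or not -8 <= offset <= 8:
        raise ValueError(
            f"target {label_addr} is out of short range from {current_addr}"
        )
    return offset
```

With the listing above, `short_jump_offset(8, 0)` returns 8 for the forward jump and `short_jump_offset(8, 16)` returns -8 for the backward one.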

    At least it's now impossible to do a pointless loop such as

    ADD 0 PC ifnz ; spin endlessly doing nothing

    To achieve a more practical goal, the operand should be the NPC, or PC+1, which is being computed at the same time as the addition. But this creates a whole lot of troubles, in particular:

    • if we compute PC+2 then the backwards jump will only reach 7 instructions
    • Timing becomes too tight, since the pipeline must choose between PC and PC+1 depending on the imm's sign
    • This will require a stall cycle, and there is already one because writes to PC must discard the prefetched instruction.

    At this point, the "short add trick" requires only a few logic gates (to detect the opcode, the format and the sign of imm4, detecting the PC register is not even necessary) and no deep modification of the state machine.

    Trying to squeeze one more instruction, to skip 8 opcodes, would complicate the whole circuit with quite little benefits...

  • Status 20211114

    Yann Guidon / YGDES11/14/2021 at 08:17 0 comments

    The new source code archive is there ! YGREC8_VHDL.20211114.tgz

    Some pretty things are there though it's still missing quite a lot:

    • The assembler is not complete: it misses some keywords, thorough tests, padding, proper symbol definitions and backward patches...
      update 20211118: VHDL assembler is mostly functional. more tests, doc & examples are welcome, as well as nested expressions.
    • The ALU8 is more-or-less working, the tests run but find an error.
    • TAP is incomplete and more debug infrastructure is required.
    • Some files have been split and/or moved around
    • the gates library #Libre Gates project is still somehow evolving in parallel, in a soft fork that I'll have to reconcile.

    The core is still not complete... there are still many things to manage under the hood. But what works works well :-)

    And having a proper assembler helps a lot too. No more estimates, it's now possible to test and reproduce ideas! So I think it's the priority for the next days, then I'll go back to #Libre Gates  so I can fix the ALU and progress on the TAP, which is critical to enter programs, control the core and read back its status...

    The Shift unit and the register set will then be quite easy to design, I think.

  • Undecided overlay options

    Yann Guidon / YGDES11/12/2021 at 01:55 0 comments

    Work is progressing nicely on the new assembler. This also allows me to find some corner cases that I didn't consider carefully yet. Let's look at the existing disassembler:

        if OPC=Op_CALL and SND=Reg_PC then
          if Imm9="111111111" then
            result(1 to 3):="HLT";
          else
            result(1 to 7):="OVL " & SLV_to_Hex(Imm8) & "h";
            -- /!\ bit 11 not used ?
          end if;
          return;
        end if;
    

    The HLT (halt) opcode uses all the 9 bits but the OVL (overlay) only uses 8, since only a byte can be managed by the overlay register.

    The 11th bit is not handled so there are 3 ideas that come to mind :

    1. extend the immediate field to work with IMM9 like IN/OUT (simplest)
    2. consider the R/I bit so the OVL instruction can use a register argument as well (useful to deal with multiple or indirect overlay numbers)
    3. create a new instruction that provides another functionality (which ?)

     The jury is still out.

  • Towards a better assembler, still in VHDL, sans Lex & Yacc

    Yann Guidon / YGDES11/06/2021 at 05:04 0 comments

    More and more, I'm looking at the YASEP's JavaScript framework and want to reuse it to develop better programs for the YGREC. This would be great to explore the practical limitations of the Y8...

    Then I realise how outdated, clunky, messy and unmanageable #YGWM still is :-/

    And I don't want to use C stuff : I have committed to using only bash and VHDL.

    And here we are...

    The project already provides YGREC8/ASM/Y8asm.vhdl but this is useful only for inline, context-less instructions. I want to create/write/assemble/run real programs, generate .HYX files and load them for simulation.

    Without EVEN starting to deal with all the parsing, two critical parts are already required :

    1. The .HYX filter (only a C in/out filter is available so far)
    2. The symbols table.

    IN VHDL.

    I think I'll start with 1. because the algorithms are already written in C and JS.

    Then, I'll deal with the dynamic allocation of symbols. I have chosen to use a unified table where the opcodes, the pseudo-opcodes, the unknown symbols and the defined symbols are kept together, to keep the complexity low and ensure there is no "shadowing", such as redefinition of opcodes, numbers etc. (as was the case in the buggy YASEP assembler in JS, where I lazily used string substitutions as a shortcut and it could totally break everything...)

    Update: drawing from the YASM experience, the assembler should use a collection of dictionaries, first the pseudocodes and opcodes, then the local symbols, then the global symbols, eventually more, such that function nesting becomes possible (for example) and important things don't get re-defined. At first, only a global symbol table will be defined but more should be able to be allocated and de-allocated. A kind of linked list of symbol tables will ensure precedence.

    Ephemeral Local symbol tables will be useful for the macros, for example. In this context, some inspiration from C syntax will help: the pseudo-symbol '{'  would create one table and '}' destroys the last created.

    Preprocessing would use m4.

    _________________________________________________________________

    OK, as usual, I say something then do the contrary, so here is dictionary.vhdl.

    "Methods" are provided to create a dictionary, look it up, add a symbol and flush the whole dictionary. The idea being that there is one dictionary per context and the contexts can be stacked with "{" and unstacked/flushed by "}".

    That will make the assembler way easier to write, and now I need to handle .HYX files...
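To illustrate the stacked-dictionary idea, here is a rough Python model (the real thing is dictionary.vhdl, whose interface is not reproduced here). The class and method names are invented for the sketch. It assumes that reserved words are looked up first and can never be redefined, and that a nested context may shadow a symbol from an outer one.

```python
class SymbolTables:
    """Stack of symbol dictionaries: one per context, plus reserved words."""

    def __init__(self, reserved):
        self.reserved = {k.upper(): v for k, v in reserved.items()}
        self.contexts = [{}]              # the global context always exists

    def open_context(self):               # what '{' would do
        self.contexts.append({})

    def close_context(self):              # what '}' would do: flush the last one
        if len(self.contexts) == 1:
            raise RuntimeError("cannot close the global context")
        self.contexts.pop()

    def define(self, name, value):
        key = name.upper()                # symbols are upper-cased when parsed
        if key in self.reserved or key in self.contexts[-1]:
            raise ValueError(f"{key} is already defined")
        self.contexts[-1][key] = value

    def lookup(self, name):
        key = name.upper()
        if key in self.reserved:          # opcodes and keywords take precedence
            return self.reserved[key]
        for context in reversed(self.contexts):   # then innermost to outermost
            if key in context:
                return context[key]
        raise KeyError(key)
```

A symbol defined after `open_context()` disappears again at `close_context()`, which is the behaviour wanted for ephemeral macro-local symbols.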

    ____________________________________________________________________

    20211112: a few days of passionate work and YGREC8_VHDL.20211112.tgz is now available !

    It's missing at least two important features: arithmetic expressions and symbol definition&update. Anyway it's starting to be useful to write simple programs. It's a bit bloated and requires some refactoring but it's only 741 lines so far (not including a few external packages).

    I have so far been very, overly confident maybe, about the use of the ISA and now I'll be able to prove its worth.

    ____________________________________________________________________

    20211118: YGREC8_VHDL.20211118.tgz provides a refactored assembler that supports more features and works better. I am now able to write whole programs!

    And it's all written in VHDL, 700 lines so far.

  • Add with carry : the macro

    Yann Guidon / YGDES11/03/2021 at 02:52 3 comments

    The Y8 core has a carry flag but no ADC opcode. That's a compromise, turned into a fact now. So how do we perform multi-precision add/sub ?

    The first way uses the conditional form that can contain a small immediate. This can skip an instruction that increments the MSB but then comes the problem of the eventual secondary carry, which requires another conditional test.

    Another way uses the rotate-through-carry instruction. Again, secondary carry and all...

    The last way was imagined a few moments ago and exploits the fact that the SUB opcode forces the carry to 1. The trick then is to negate the register operand, which could be simplified in some cases.

    Y8 was not meant to be an efficient multi-precision core, but not plainly awkward either.

    I'd like to run PEAC16 as a programmed BIST to exercise the RAM, ALU and decoder so the ability to use 16-bit numbers pushes the core to its limits.

    The idea is to configure the debug probe to spy on the carry signal and observe the pattern that arrives, then compare to an internally programmed bitstream (this easily fits with a small FPGA or even EPLD). Slowly increase the clock speed and watch when the output bitstream diverges from the internally generated one, and you can bin the chips.

    So it turns out that handling multi-precision addition is slightly more important than I thought but I'll find a pretty hack.

    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

    So let's say we have two 16-bit integers in R1-R2 and D1-D2.

    The LSB is added by ADD R1 D1 with result in D1. The Carry flag is set accordingly.

    The carry can then be merged with D2 : ADD 1 D2 C

    At this moment, we look if we need the extra carry, or 17th bit. If not, just do ADD R2 D2 and you're done.

    But PEAC requires the 17th bit so the 2nd instruction does not work.

    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

    Let's go back to ADD R1 D1 which generates a carry. It must be included in the MSB and this can generate a carry by itself. The second ADD R2 D2 will generate a carry too though not both at the same time so a OR is possible. If a secondary carry occurs when incrementing D2, this means that its new value is 0 and no value of R2 could trigger another/tertiary carry.

    The easy way to deal with it is to dedicate R3 to a sort of "carry register".

    1. SET 0 R3 ; init
    2. ADD R1 D1   ; primary add
    3. ADD 1 D2 C  ; secondary add
    4. SET 1 R3 C ; first correction, could also be a RCL 1 R3
    5. ADD R2 D2  ; tertiary add
    6. SET 1 R3 C ; final fix. No need of OR.

    This code is branchless : 4 is executed only if 3 generates a carry, which only happens if 2 also generates the carry. The ADD opcode overwrites the carry so 4 only occurs when we really need it.
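The six-instruction sequence can be checked with a small Python model. It rests on a few assumptions that the text implies but does not spell out: ADD updates the carry flag, SET leaves it untouched, and a conditional instruction whose condition is false changes nothing at all. The function name is hypothetical.

```python
def add16_with_carry_register(r1, r2, d1, d2):
    """Model of the 6-instruction sequence; returns (d1, d2, r3).

    R2:R1 and D2:D1 are 16-bit values (R1 and D1 are the low bytes);
    R3 receives the 17th bit of the sum.
    """
    def add8(a, b):
        total = a + b
        return total & 0xFF, total > 0xFF    # (result byte, carry out)

    r3 = 0                                   # 1. SET 0 R3
    d1, carry = add8(d1, r1)                 # 2. ADD R1 D1
    if carry:                                # 3. ADD 1 D2 C
        d2, carry = add8(d2, 1)
    if carry:                                # 4. SET 1 R3 C
        r3 = 1
    d2, carry = add8(d2, r2)                 # 5. ADD R2 D2
    if carry:                                # 6. SET 1 R3 C
        r3 = 1
    return d1, d2, r3
```

For instance, FFFFh + FFFFh yields D1 = FEh, D2 = FFh and R3 = 1, that is 1FFFEh.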

    That's 6 instructions and half of them manage the external carry flag. The flag can be kept in place by using the "SUB trick". However a couple of branches are required.

    1. ADD R1 D1   ; primary add
    2. ADD 3 PC NC  ; conditional branch to normal ADD
    3. XOR -1 D2 ; pre-correction to compensate the SUB
    4. SUB R2 D2  ; tertiary add, +1
    5. ADD 1 PC ; Goto END.
    6. ADD R2 D2  ; tertiary add, normal
    7. the end.

    That's still 6 instructions but we save one register. But wait ! The jump uses ADD which also destroys the carry flag ! Fortunately it's also possible to do a direct jump when no condition is needed.

    1. ADD R1 D1   ; primary add
    2. ADD 3 PC NC  ; conditional branch to normal ADD
    3. XOR -1 D2 ; pre-correction to compensate the SUB
    4. SUB R2 D2  ; tertiary add, +1
    5. SET theend PC ; Goto END.
    6. ADD R2 D2  ; tertiary add, normal
    7. theend:

    Et voilà.

    PEAC requires 2 consecutive byte adds with carry, and each takes 5 opcodes. Then the whole block is register-swapped to emulate the copy.

    A macro could be created :

    Define ADC SRC DST label
    ADD 3 PC NC
    XOR -1 DST
    SUB SRC DST
    SET label PC
    ADD SRC...

  • Counters strike !

    Yann Guidon / YGDES07/14/2021 at 00:40 0 comments

    In the log 62. Floorplanning the diagram shows a zone on the left with incrementers and event matching logic. Since then, I have expanded and genericised the system. Let's just say I've been traumatised by the timer circuits of chips like the i8253 and microcontrollers: the features are often very anti-orthogonal, obscure, hard to configure... This is often due to "evolutions" of the platforms, such as the PIC16F family for example, which progressively introduced this, that, and oh, this new bit for added convenience... and after several generations, the datasheet and user manual become totally obscure. The circuit itself is a mess of logic gates.

    What I propose is simple and flexible, based around 8 identical byte incrementers, and MUXes to select the clock sources. That's all.

    Of course, the "smart" parts are in the MUXes and their organisation. The first thing is that the source of one incrementer can be the overflow of the preceding one, such that they are cascaded. This can count events on one byte, or two, or three or more... The 8 bytes become fully useful, because there are much fewer structural restrictions. The whole space can be partitioned into 8 independent counters or form a monolithic counter with 64 bits, or anything between. Each incrementer has its own configuration register that selects the source between zero (default), the neighbour (cascade) or any other source of event (internal or external). To ease resource allocation, a given signal may be available on 3 different incrementers (giving the pattern 2:3:3 that maximises the chances of finding consecutive available incrementers).
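A behavioural sketch of the cascading can help here. The Python model below is a simplification and not the proposed circuit: each incrementer has a single source setting (zero, cascade or one event input), clock domains and ripple delays are ignored, and all names are invented for the illustration.

```python
ZERO, CASCADE, EVENT = "zero", "cascade", "event"


class CounterPool:
    """Pool of byte incrementers that can be chained through their overflow."""

    def __init__(self, size=8):
        self.value = [0] * size
        self.source = [ZERO] * size      # default source: never counts

    def _increment(self, index):
        self.value[index] = (self.value[index] + 1) & 0xFF
        overflowed = self.value[index] == 0
        following = index + 1
        if (overflowed and following < len(self.value)
                and self.source[following] == CASCADE):
            self._increment(following)   # overflow clocks the next byte

    def event(self, index):
        """Deliver one event to an incrementer configured for event input."""
        if self.source[index] == EVENT:
            self._increment(index)

    def read(self, first, count):
        """Read `count` consecutive bytes as one little-endian number."""
        return sum(self.value[first + i] << (8 * i) for i in range(count))
```

Configuring incrementer 0 for events and incrementer 1 as a cascade gives a 16-bit counter: after 300 events, `read(0, 2)` returns 300 while the other six bytes stay at zero.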

    Incrementing a byte is not hard. There could be a "ripple" effect in a cascade of incrementers but this can be mitigated by circuits. However reading and writing the 8 bytes, such that they are available with the IN and OUT opcodes, requires more circuits. For example, reading requires 8 MUX8, similar to the register set but with only one read port. A write port can be avoided by only allowing clear/reset commands (for example with a write-only byte-wide register with 1 bit per incrementer). However this is not enough in the case of a frequency generator for example, where the value of the counter is reloaded from another register. For this purpose, one half of the registers can be written, and their value is copied to their neighbour when it overflows. Structurally, that makes 4 pairs of bytes.

    Reading a cascade or simply one counter creates its own pile of problems if its clock is not synchronous to the core.
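One of these problems, the "torn" multi-byte read, can be shown with a toy Python model. The re-read loop below is a common software workaround and not something this log proposes. The class and function names are assumptions, and the model ignores metastability and other electrical effects entirely.

```python
class AsyncCounter16:
    """Two cascaded bytes whose clock is not synchronised to the core."""

    def __init__(self, value):
        self.value = value & 0xFFFF

    def read_lo(self):
        return self.value & 0xFF

    def read_hi(self):
        return self.value >> 8

    def event(self):
        self.value = (self.value + 1) & 0xFFFF


def naive_read(c, between=lambda: None):
    lo = c.read_lo()
    between()                 # an asynchronous event may land between the two reads
    hi = c.read_hi()
    return (hi << 8) | lo


def safe_read(c, between=lambda: None):
    while True:               # retry until the high byte is stable around the low read
        hi = c.read_hi()
        lo = c.read_lo()
        between()
        if c.read_hi() == hi:
            return (hi << 8) | lo


c = AsyncCounter16(0x00FF)
print(hex(naive_read(c, c.event)))   # prints: 0x1ff (the counter never held this value)
```

The naive read combines the low byte of 0x00FF with the high byte of 0x0100, which is why a cascade needs either a snapshot mechanism in hardware or a retry in software.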

    A clear-on-read byte can hold the overflow/saturation flags of the incrementers, if software polling is chosen.
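A clear-on-read flag byte could behave like the following Python sketch, where `OverflowFlags` is a hypothetical name and bit i is assumed to belong to incrementer i.

```python
class OverflowFlags:
    """Byte-wide register: bit i is set when incrementer i overflows."""

    def __init__(self):
        self.flags = 0

    def set_overflow(self, i):
        self.flags |= 1 << i

    def read(self):
        """Return the flags and clear them in the same access."""
        value, self.flags = self.flags, 0
        return value


f = OverflowFlags()
f.set_overflow(0)
f.set_overflow(5)
print(bin(f.read()), f.read())   # prints: 0b100001 0
```

A polling loop then needs a single IN to learn which incrementers overflowed since the previous poll.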

    Diagrams are required, I know.

  • More (virtual) relays

    Yann Guidon / YGDES 06/29/2021 at 00:49 0 comments

    I use circuitjs a lot and I just noticed that it now handles relays. After a great exchange with its author, the relays are now more configurable and can simulate hysteresis !

    During the tests, it didn't take long to design a bitslice with the 8 registers and the write circuits:

    Having drawings and sketches is nice but making the circuit work is priceless ! Thank you Paul !

    Temporary link: https://tinyurl.com/yh8b8ue4

    With this many parts, the simulation runs quite slowly. It took minutes to overwrite the 2 leftmost bits but it works as expected !

    Adding the read logic is pretty easy at this point but the simulation will be slower.

View all 137 project logs

Discussions

Yann Guidon / YGDES wrote 04/25/2021 at 21:57 point

This project is not dead, I'm just extra over-busy with more immediate concerns and priorities...

salec wrote 10/09/2019 at 09:18 point

YGREC can stand for so many things, but since my wife has been learning French on Duolingo I can't avoid noticing that it is also a wordplay on the French spelling of "Y".

:-)

Yann Guidon / YGDES wrote 10/09/2019 at 10:03 point

oh, of course, yes, too ;-)

salec wrote 10/09/2019 at 12:04 point

always have an opening joke/tease for audience :D

Yann Guidon / YGDES wrote 10/09/2019 at 12:46 point

@salec  always !

[deleted]

[this comment has been deleted]

Yann Guidon / YGDES wrote 04/14/2019 at 08:56 point

That "purposeful sense" may look drowned in the proliferation of projects, angles and ideas but it is still clear to me, since this has been my main hobby since 1998 at least :-D

I'm glad you enjoy !

Yann Guidon / YGDES wrote 11/04/2018 at 07:11 point

Another note for later :
writing to A1 or A2 starts a fetch from RAM. In theory the latency is the same as instruction memory and one wait state would be introduced. However the processor can also write directly so the wait state would be only on read to the paired data register...

Yann Guidon / YGDES wrote 11/04/2018 at 06:55 point

Note for later : don't forget the transparent latch on the destination register address field, for the (rare) case of LDCx, because the 2nd cycle doesn't preserve the opcode etc.

Yann Guidon / YGDES wrote 11/04/2018 at 07:18 point

OK, not a transparent latch, but a DFF and a mux, plus some logic to control it.

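-- DFF, every cycle (clk is an assumed clock name; VHDL-2008 syntax) :

process(clk) begin
  if rising_edge(clk) then
    SND_latched <= SND_field;
    LDCx_flag <= '1' when (LDCx_flag='0' and opcode=opc_LDC and writeBack_enabled='1')   else '0';
  end if;
end process;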

-- MUX2 :

WriteAddress <= SND_latched when LDCx_flag = '1' else SND_field;

______

Note : LDCx into PC must work without wait state because it's connected directly to SRI, as an IMM8, and no extra delay is required. PC wait state is required for ADD/ROP2/SHL and IN.
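To check the intent of the DFF and MUX2 above, here is a cycle-level Python model. It is only a sketch: `OPC_LDC` is a placeholder tag, `WriteAddressPath` is a hypothetical name, and both flip-flops are assumed to update together at the end of each cycle.

```python
OPC_LDC = "LDC"   # placeholder opcode tag, not the real encoding


class WriteAddressPath:
    def __init__(self):
        self.snd_latched = 0   # DFF: copy of the SND field from the previous cycle
        self.ldcx_flag = 0     # DFF: set during the 2nd cycle of an LDCx

    def write_address(self, snd_field):
        """MUX2, combinational: use the latched field during the 2nd LDCx cycle."""
        return self.snd_latched if self.ldcx_flag else snd_field

    def clock(self, snd_field, opcode, writeback_enabled):
        """Clock edge: both DFFs sample their inputs from the cycle that just ended."""
        self.ldcx_flag = int(self.ldcx_flag == 0 and opcode == OPC_LDC and writeback_enabled)
        self.snd_latched = snd_field


path = WriteAddressPath()
path.clock(5, OPC_LDC, True)      # 1st cycle: LDCx with destination register 5
print(path.write_address(2))      # 2nd cycle: the fetched word has other bits here; prints: 5
path.clock(2, OPC_LDC, True)      # flag clears even if those bits decode as LDC
print(path.write_address(3))      # next instruction uses its own field; prints: 3
```

Because the flag term includes `LDCx_flag='0'`, the second cycle can never re-trigger itself, whatever the fetched constant looks like.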

Frank Buss wrote 10/27/2018 at 12:51 point

Do you really plan 8 byte-wide registers? This would require thousands of relays :-)

Yann Guidon / YGDES wrote 10/27/2018 at 14:26 point

no :-)

8 registers, 8 bits each = 64 storage bits.
1 relay per bit => 64 relays


The trick is to use the hysteretic mode of the relays :-)

Frank Buss wrote 10/27/2018 at 16:17 point

Ok, makes sense. Maybe change the project description, someone might think you are planning a 64 bit architecture.
BTW, could this be parametrized for the address and data size? If you implement it in VHDL, you could use generics for this: it would be no additional work to use the generic names instead of hard-coded numbers, except maybe some work for extending the instruction opcodes.

Yann Guidon / YGDES wrote 10/27/2018 at 17:16 point

Frank : DAMNIT you're right !

I updated the description...

Yann Guidon / YGDES wrote 10/27/2018 at 17:19 point

For the parameterization : it doesn't make sense at this scale. Every fraction of bit counts and must be wisely allocated.

Larger architectures such as #YASEP Yet Another Small Embedded Processor and #F-CPU have much more headroom for this.

Bartosz wrote 11/08/2017 at 16:40 point

Will this work on Epiphany, oHm or another cheap machine?

Yann Guidon / YGDES wrote 11/08/2017 at 18:07 point

I'm preparing a version that would hopefully use less than half of an A3P060 FPGA, which is already the smallest of that family that can reasonably implement a microcontroller.

But it's a lot less fun than making one with hundreds of SPDT relays !

Bartosz wrote 11/14/2017 at 14:13 point

The question is the price and the possibility of buying it.

Yann Guidon / YGDES wrote 11/14/2017 at 16:08 point

@Bartosz : what do you want to buy ?

If you can simulate and/or synthesise VHDL, the source code is being developed and available for free, though I can't support all FPGA vendors.

If you want a ready-made FPGA board, that could be made too.

If you want relays, it's a bit more tricky ;-)

I have just enough RES15 relays to make my project and it might take a long while to succeed. There will be many PCBs and other stuff.

However if, in the end, I see strong interest from potential buyers, I might make a cost-reduced version with easily found minirelays. I don't remember well but the Chinese models I found cost around $0.50 apiece. Factor in PCB and other costs and you get a very rough price estimate... It's not cheap, it's not power efficient, it's slow and won't compute useful stuff... But it certainly can make a crazy nice interactive display, when coupled with flip dots :-D

So the answer is : "it depends" :-D
