• Adding without carry

    Kyle McInnes, 08/28/2022 at 16:43

    It would be nice to be able to make use of the carry output from the 74LS283 adder, but it's going to require at least one extra chip to store the carry bit, and maybe more to decode the opcode into a "write carry" signal.

    The alternative is to OR all the accumulator's bits together to test for zero. That doesn't need any chips, just four diodes (one per accumulator bit) forming a wired OR.

    "jnz" (jump if accumulator is not zero) will be our conditional jump.

    The question now is, how do you synthesise a carry bit in software if you don't have one in hardware? I couldn't find much information about this. A common definition of the carry bit is "1 if the result of A+B is less than A (or B)", but that's not very helpful: an unsigned comparison is not easy to do without a carry flag! In the end I found the answer in the source code for the Gigatron, which I knew doesn't have a carry flag.

    Q = A + B
    if top (sign) bit of Q is set:
      carry bit = top bit of (A & B)
    else:
      carry bit = top bit of (A | B)
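
    To convince myself the rule holds (top bit of the sum set: carry is the top bit of A AND B; otherwise it is the top bit of A OR B), it can be checked exhaustively. This is only a sketch in Python, not part of the project; the function name `synth_carry` and the `bits` parameter are made up for illustration.

```python
def synth_carry(a, b, bits=4):
    """Carry out of a + b, derived without using a carry flag."""
    mask = (1 << bits) - 1
    top = 1 << (bits - 1)
    q = (a + b) & mask              # what the adder leaves in the accumulator
    if q & top:                     # top bit of the sum is set
        return 1 if (a & b) & top else 0
    return 1 if (a | b) & top else 0


# Compare against the true carry for every pair of 4-bit values.
mismatches = [(a, b) for a in range(16) for b in range(16)
              if synth_carry(a, b) != (a + b) >> 4]
print(mismatches)  # prints []
```

    The reason it works: the top bit of the sum is the XOR of the two top bits and the carry coming into that position, and the carry out is the majority of those same three bits. Knowing the XOR narrows the majority down to either "both top bits set" or "at least one top bit set".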

    This is where having a NAND operation becomes very useful. ANDing A and B is just a case of NANDing, then inverting:

    lda $A
    nan $B
    nan f    ; nand with 0b1111 = invert

    OR is ~(~A & ~B), i.e. NAND with both inputs inverted. This requires a temporary location:

    lda $A
    nan f
    sta $notA
    lda $B
    nan f
    nan $notA     ; ~A nand ~B == A or B
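
    The same two sequences can be modelled in Python to check them against the native operators. This is a sketch under the assumption that `nan` is a plain 4-bit NAND; the helper names (`nan`, `not4`, `and4`, `or4`) are mine, not part of the instruction set.

```python
MASK = 0xF  # 4-bit accumulator


def nan(acc, value):
    """4-bit NAND, modelling the 'nan' instruction."""
    return ~(acc & value) & MASK


def not4(a):
    return nan(a, 0xF)              # nand with 0b1111 = invert


def and4(a, b):
    return nan(nan(a, b), 0xF)      # nand, then invert


def or4(a, b):
    not_a = not4(a)                 # the temporary location ($notA)
    return nan(not4(b), not_a)      # ~A nand ~B == A or B


# Check every pair of 4-bit values against Python's own & and |.
ok = all(and4(a, b) == (a & b) and or4(a, b) == (a | b)
         for a in range(16) for b in range(16))
print(ok)  # prints True
```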

    Putting it all together, here's how to add two 8-bit numbers:

    ; input values
            lda f       ;a=0xff (big endian, stored at $0/$1)
            sta $0
            lda f
            sta $1
            lda 5       ;b=0x52 (stored at $2/$3)
            sta $2
            lda 2
            sta $3
    ; add two 8-bit numbers
            lda $1      ; add lo nibbles
            add $3
            sta $5      ; store result at $5
            nan %1000   ; check hi bit
            nan f
            jnz set
            lda $1      ; msb clr: a or b
            nan f
            sta $f      ; $f = not a (lo nibble)
            lda $3
            nan f
            nan $f
            jmp next
    set:    lda $1      ; msb set: a and b
            nan $3
            nan f
    next:   nan %1000   ; hi bit is carry
            nan f
            jnz carry
            jmp addhi   ; acc already zero if no carry
    carry:  lda 1
    addhi:  add $0      ; add hi nibs + carry
            add $2
            sta $4      ; store result at $4
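
    Here is a line-by-line transliteration of that routine into Python, restricted to what the hardware can do (4-bit add, 4-bit NAND, test for non-zero). It is a sketch for checking the logic, not an emulator; `add8` and the helper names are hypothetical.

```python
MASK = 0xF


def nan(acc, value):
    return ~(acc & value) & MASK        # 4-bit NAND


def add(acc, value):
    return (acc + value) & MASK         # 4-bit adder, carry out is lost


def add8(a_hi, a_lo, b_hi, b_lo):
    """Add two 8-bit numbers given as nibbles; returns (hi, lo) of the sum."""
    lo = add(a_lo, b_lo)                    # add lo nibbles
    acc = nan(nan(lo, 0b1000), 0xF)         # check hi bit of the sum
    if acc != 0:                            # jnz set
        acc = nan(nan(a_lo, b_lo), 0xF)     # msb set: a and b
    else:
        not_a = nan(a_lo, 0xF)              # $f = not a
        acc = nan(nan(b_lo, 0xF), not_a)    # msb clr: a or b
    acc = nan(nan(acc, 0b1000), 0xF)        # hi bit is carry
    if acc != 0:                            # jnz carry
        acc = 1
    hi = add(add(acc, a_hi), b_hi)          # add hi nibs + carry
    return hi, lo


print(add8(0xF, 0xF, 0x5, 0x2))  # 0xff + 0x52 = 0x151, prints (5, 1)
```

    As in the assembly, the carry out of the high nibble is simply dropped, so the result is the sum modulo 256.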

  • ALU

    Kyle McInnes, 08/28/2022 at 15:47

    Here is what I am thinking of for the "ALU":

    • We need a way to load the accumulator with an immediate value ("lda 3"), or from a memory location ("lda $3"), so a mux is required there.
    • At a minimum we want to be able to add a value to the accumulator. We already have that mux so the value could be immediate or direct. Another mux is needed to select between "add" and "load". The input port can also feed into that mux.
    • At the cost of one extra chip, a logic function will be very useful. I've put NOR in the diagram but after playing around I think NAND is slightly more useful (easier to do AND for testing bits).

    That allows for these instructions: 

    000m vvvv  lda    acc = value
    001m vvvv  add    acc = acc + value
    010m vvvv  nan    acc = ~(acc & value)
    011x xxxx  in     acc = in
    m: 0=immediate value, 1=value from given RAM address
    v: 4-bit value
    x: don't care

    I really like how this turned out, because the decoding for these instructions requires no additional chips!

    opcode bits:
       3210 vvvv
       |||\- operand/memory mux select
       ||\-- accumulator source mux select
       |\---             -"-
       \---- accumulator write enable (active low)

    Two opcodes are taken up by "in" but it's worth that small cost.
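
    To make the bit assignments concrete, here is a small Python sketch that splits an 8-bit instruction into the control signals listed above. The field names (`use_memory`, `acc_source`, `acc_write`) and the `decode` function are my own labels; what the opcodes with bit 3 set actually do is not specified here, so the sketch only reports that they leave the accumulator alone.

```python
# Accumulator source mux setting (opcode bits 2..1) for each instruction.
ACC_SOURCES = ("lda", "add", "nan", "in")


def decode(instr):
    """Split an 8-bit instruction word into its control signals."""
    opcode = (instr >> 4) & 0xF
    return {
        "use_memory": bool(opcode & 0b0001),           # bit 0: operand/memory mux
        "acc_source": ACC_SOURCES[(opcode >> 1) & 0b11],  # bits 2..1
        "acc_write": not (opcode & 0b1000),            # bit 3, active low
        "operand": instr & 0xF,                        # 4-bit value or address
    }


# lda 3, lda $3, add 5, nan $f, in
for instr in (0x03, 0x13, 0x25, 0x5F, 0x60):
    print(f"{instr:08b}", decode(instr))
```

    Note that `acc_source` is only meaningful when `acc_write` is true; for "in" the `use_memory` bit is a don't-care, which is why it occupies two opcodes.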

  • Initial thoughts

    Kyle McInnes, 08/21/2022 at 15:28

    • Tiny 4-bit CPU with 4-bit input and output ports. Like the TD4 CPU but a bit more capable.
    • Connect the output port to a SSD1306-style OLED display. These can be driven via SPI which would be simple enough.
    • Have it do something non-trivial - I love the TD4 for its simplicity, but can a slightly more capable CPU do something more interesting? How about finding prime numbers, or displaying a fractal?
    • 256 words of RAM is plenty. It will have to be a Harvard architecture with a considerably bigger program ROM - 4Kx8 sounds good. Instructions would be 8 bits wide.
    • As few ICs as possible. The TD4 has 12 chips, but can only run very small programs. This Brainfuck machine has 14 chips and can run non-trivial programs, but you need to be a wizard to write programs for it. Let's see what can be done with fewer than 20. Instruction decoding can be done with the help of diodes.

    If we're going to write characters to the display we need a font, which is a lot of constant data. The easiest thing is for that constant data to reside in ROM in the form of instructions, which populate RAM:

        lda %1011    ; load accumulator with 4 bits
        sta $1       ; store at some memory location (upper 4 address bits will come from somewhere else)
        lda %0101    ; next 4 bits
        sta $2

     The next part of the program then reads the data in RAM and clocks it out serially.
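
    Writing those lda/sta pairs by hand would get tedious for a whole font, so they could be generated. The following Python sketch assumes each byte of font data is split into two nibbles stored at consecutive addresses, high nibble first; `font_to_asm` and its parameters are hypothetical, and no real assembler exists yet.

```python
def font_to_asm(rows, base=0):
    """Turn bytes of constant data into lda/sta pairs that populate RAM."""
    lines = []
    addr = base
    for row in rows:
        for nibble in (row >> 4, row & 0xF):    # high nibble first (assumption)
            if addr > 0xF:
                # Direct addresses are 4 bits; the upper address bits
                # would have to come from somewhere else.
                raise ValueError("data does not fit in one 16-word page")
            lines.append(f"lda %{nibble:04b}")
            lines.append(f"sta ${addr:x}")
            addr += 1
    return lines


# One byte of data, 0xB5, stored at $1 and $2.
print("\n".join(font_to_asm([0xB5], base=1)))
```

    Each 4-bit nibble of data costs two 8-bit instructions in ROM, which is one reason the program ROM needs to be considerably bigger than the RAM.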

    Control flow: we will have something simple like "jump if accumulator is zero". Is this enough? Can you have subroutines when you can only jump to an immediate address? Maybe if you have a jump table at the end of each subroutine, selecting which caller to jump back to.

    Addressing: similar question - can all addresses be immediate, or do we need the ability to store addresses in RAM?

    There will be many tradeoffs to look at - if removing a couple of chips causes the code size to balloon to make up for it, it may not be worth it.