Kobold - retro TTL computer

A computer with 20 address bits, 16-bit instructions and a video display, built from just a few TTL and memory chips. Aiming for the PDP11 instruction set.

After having designed the one square inch ttl cpu, the moment comes that there is a new
minimal-parts computer that wants to be designed. The Kobold computer has chosen me to design him.

Constraints are:
- low number of parts (TTL)
- no off-the-shelf processor or microcontroller
- no 74181 ALU

The drawing shows how the Kobold CPU (in SMD version) could be plugged into the mainboard of the Kobold computer system.

PARTS

It will consist of the following parts: 

  • processor similar to the #1 Square Inch TTL CPU  
  • a simple ALU 
  • some address registers 
  • at least 256 KByte RAM 
  • flash-ROM for booting 
  • VGA text output 80 * 25 chars 
  • VGA color output (resolution TBD) 
  • PS/2 keyboard input 
  • user-accessible I/O pins (footprint of Arduino shield)
  • a sound system
  • serial flash (16 or 32 MByte) for storing all your programs. 

The text mode will have several (probably 4) colors, and the goal for video is free mixing of text and full-color modes.

At this moment, the idea is that the CPU and the video part will operate quite independently. The CPU might even be a separate PCB placed on top of a "motherboard" that holds the memory, video and other I/O.

The CPU will probably be around 27 TTL parts and a microprogram Flash. (But now that I am aiming for a full PDP11 instruction set, it will be more than 27.)

INSTRUCTION SET

The instructions will be implemented in microcode. The first instruction set that I have in mind is similar to the PDP-11. Binary compatibility with the PDP11 is under investigation.

But the microcode can easily be reprogrammed with a Raspberry Pi as programmer (just as with the 1 Square Inch TTL CPU), so it could also behave as another processor like the 6502 or Z80. The microcode Flash is large enough to accommodate a lot of microcode.

PROGRAMMING

A Raspberry Pi can be connected to read files from, or write files to, the serial flash of the Kobold.

When enough supporting software is in place, the programs for Kobold can be developed on the device itself, perhaps first in BASIC and later in C. 

PLAN

  • design the CPU
  • pcb design of CPU with thru-hole components
  • design the motherboard
  • pcb design of motherboard
  • some simulations
  • order boards, and assembly
  • hw fault finding
  • pcb design of a small Kobold CPU in SMD version
  • build or adapt a C compiler
  • make system software. Might adapt an OS.

This is a work in progress, the logs will show the design steps....

  • Design upgrade

    roelh, 2 hours ago, 0 comments

    The current design of the microcode structure is almost complete, with just a few loose ends that have to be solved.

    However, the total design effort for the project will be considerable, and it would be a pity if the result were inferior in certain aspects. It seems that with a little more effort, a better result can be obtained (at the cost of a few more components).

    This will focus on two main aspects:

    • Most instructions just move data around. If we change the bus from 8 to 16 bit, the system will run almost twice as fast. The ALU might stay 8 bit to keep the part count reasonable (this will perhaps cost only a single extra cycle for instructions that do calculations).
    • Several aspects of the Kobold are inspired by the PDP11. With some more effort, the design could be such that the microcode can implement a binary-compatible PDP-11 instruction set. (This will only be practical with a 16 bit bus.)

    So, work to do !

  • Opcode handling

    roelh, 3 days ago, 3 comments

    The Kobold is advertised to handle 16 bit instructions, but everything is 8 bit. Even the microcode is only 8 bit wide. How does it work ?

    [ What you need to know for the examples: The program counter is in register R7 (as on the PDP11), and is copied into address register PC when needed. The lowest bit of the PC is always 0.]

    EXAMPLE: 16-bit ADD

    As an example, take an instruction that adds (X+6) to register R4:

     add (X+6),R4 

    The instruction is split into two parts that operate almost independently of each other:

    The first byte of the instruction is fetched from (PC) into the micro-program counter. From here, the micro-instructions determine the operation:

    • First part:
    • load the LSB of the 16bit accumulator with (X+6), (the LSB part)
    • load the MSB of the 16bit accumulator with (X+6),  (the MSB part)
    • fetch an instruction byte from (PC+1) into the micro-program counter (changing flow of the microcode). 

    This instruction byte tells the microcode to add R4 (16 bits) to the accumulator and store the result back to R4:

    • Second part:
    • add the LSB of (WP+4) to the LSB of the accumulator
    • add the MSB of (WP+4) to the MSB of the accumulator
    • store the LSB of the accumulator back into (WP+4) (the LSB part of R4)
    • store the MSB of the accumulator back into (WP+4) (the MSB part of R4)

    The next section of the microcode will increment the PC and start the next instruction:

    • connect the LSB of the PC (from the address registers) to the B-input of the adder of the ALU. The value 2 will be provided to the A-input of the ALU, so the byte value PC+2 will be put in the accumulator.
    • fetch the MSB of the PC from R7 [in the workspace] and put it into the accumulator, to make the 16 bits complete.
    • move the 16-bit accumulator contents to the PC in the address register set.
    • fetch the next instruction from (PC) into the micro-program counter.
           ( from here, the micro-instructions for the next instruction are executed )
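
    The PC update above can be sketched in Python. This is only a rough model, not the actual microcode: the function name `increment_pc` is made up for illustration, and it assumes that the carry out of the 8-bit adder is simply dropped, in line with the remark in the address register log that only the lower byte of the PC is incremented.

```python
def increment_pc(pc: int) -> int:
    """Model the byte-wise PC update: only the low byte goes through the adder."""
    low = (pc & 0xFF) + 2        # B-input = low byte of PC, A-input = constant 2
    low &= 0xFF                  # assumption: carry out of the 8-bit adder is dropped
    high = (pc >> 8) & 0xFF      # MSB is taken unchanged from R7 in the workspace
    return (high << 8) | low     # 16-bit accumulator moved back into the PC


print(hex(increment_pc(0x1234)))  # 0x1236
print(hex(increment_pc(0x12FE)))  # 0x1200, the PC wraps inside its 256-byte page
```

    The second call shows why crossing a 256-byte boundary needs an explicit instruction in this scheme.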

    EXAMPLE: 8 bit immediate load

    There are also instructions that have a single opcode byte, followed by an 8-bit immediate operand or zero-page location. Branch instructions are an example of this.

    As an example, load register R3 with value 0x80:

    mov #0x80,R3
    • load the LSB of the accumulator from (PC+1)
    • load the MSB of the accumulator with zero
    • store the LSB of the accumulator in (WP+3)  (the LSB part of R3)
    • store the MSB of the accumulator in (WP+3)  (the MSB part of R3)

    Finally, the PC is incremented and the next instruction is fetched, as in the previous example.
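
    As a rough illustration, the four micro-steps of the immediate load can be replayed on a byte array in Python. The addresses `WP` and `PC`, the helper `reg_addr` and the placeholder opcode are all hypothetical; the sketch also assumes that each register occupies two bytes in the workspace with the low byte first (as on the PDP11), which the log does not state explicitly.

```python
WP = 0x0100   # hypothetical workspace base; R0 is assumed to start here
PC = 0x0200   # hypothetical address of the instruction

mem = bytearray(0x400)
mem[PC] = 0x00       # placeholder opcode byte; the real encoding is not defined here
mem[PC + 1] = 0x80   # the 8-bit immediate operand


def reg_addr(n: int, high: bool) -> int:
    """Byte address of one half of Rn (assumes 2 bytes per register, low byte first)."""
    return WP + 2 * n + (1 if high else 0)


acc_lo = mem[PC + 1]                    # load the LSB of the accumulator from (PC+1)
acc_hi = 0x00                           # load the MSB of the accumulator with zero
mem[reg_addr(3, high=False)] = acc_lo   # store the LSB into the low half of R3
mem[reg_addr(3, high=True)] = acc_hi    # store the MSB into the high half of R3

r3 = mem[reg_addr(3, False)] | (mem[reg_addr(3, True)] << 8)
print(hex(r3))  # 0x80
```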

  • Address registers schematic

    roelh, 4 days ago, 0 comments

    Here you see the schematic of the address registers, with the five HC670 chips. 

    Loading the address register

    The inputs (REG0-REG15) come from the accumulator. The address register to write to is selected with the WA and WB inputs (that connect to bits IR4 and IR5 of the microcode byte). When ADDR_WE is active (low) the data is written into the selected address register. 

    The upper four bits (16-19) of the address register are written when ADDR_PAGE_WE is active.

    Output of the address registers

    The five 670's always put an address on the address bus (A0-A19), because the GR signal is always active. When the USE_XY signal is low, the PC or WP is connected to the bus (selected with bit IR4 from the microcode). When USE_XY is high, the X or Y register is connected to the bus, also selected with IR4. 

    Adding the displacement

    At the upper right you see five OR-gates that "add" a displacement to the address. The lower four bits of the displacement come from the microcode byte. The fifth bit (DISPL4) comes from the control section. Since the displacement is not really added, it only works if the address in the address register is properly aligned, so that the address bits that receive the displacement are zero. Although a four-bit adder chip could have been used for A1-A4, this only moves the problem to the case where that adder produces a carry, and having more than one adder chip here is against the minimum-parts philosophy of the project.
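
    The alignment requirement is easy to check with a small Python sketch. It assumes, for illustration only, that the five displacement bits are OR-ed onto the five lowest address lines; the function names are made up.

```python
def or_displacement(base: int, disp: int) -> int:
    """Combine a 20-bit base address with a 5-bit displacement using OR gates."""
    return (base & 0xFFFFF) | (disp & 0x1F)


def true_add(base: int, disp: int) -> int:
    """What a real adder would produce for the same inputs."""
    return ((base & 0xFFFFF) + (disp & 0x1F)) & 0xFFFFF


aligned = 0x12340     # the low five bits are zero
unaligned = 0x12348   # bit 3 is already set

print(or_displacement(aligned, 0x0C) == true_add(aligned, 0x0C))      # True
print(or_displacement(unaligned, 0x0C) == true_add(unaligned, 0x0C))  # False
```

    With an aligned base the OR gates and a real adder agree for every displacement; with an unaligned base they differ as soon as a displacement bit overlaps a bit that is already set.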

    The upper signals AP0-AP4, together with A5-A8 can be connected to the adder in the ALU. This connection is used for incrementing the PC. Only the lower byte of the PC is incremented. To cross a 256-byte boundary, an explicit instruction will be needed.

  • The ALU of the Kobold

    roelh, 6 days ago, 0 comments

    This is the current state of the ALU design. There are still several loose ends. Carry signals for addition and shifting are not yet present. The ALU is 8 bits wide, but for clarity I only show 4 bits. (Clicking it will show a more readable version.)

    The ALU functions are LOAD, ADD, BIS, BIC, SHR and SHL

    The ALU is intended to work on 16 bit words, in two sequential cycles that each handle a byte.

    ADD function

    The ADD function is the easiest to explain. The upper MPX (multiplexer) connects the output of the AL accumulator byte to the upper inputs of the adder chip, a 74HC283. The lower MPX is disabled (by the ALU_F0 signal), so all its outputs are high. That means that the input from the databus (D0-D7) flows through the AND-gates to the lower inputs of the adder chip. So, the adder will add the databus byte to the AL byte and deliver the result in AH.

    Ooops... what is that lower byte doing in the high part of the accumulator ?  And the accumulator has been clocked, so the high byte of the accumulator is now in the low byte ?

    After the next cycle, it will be all right. The high byte coming from the databus will be added to the high byte of the accumulator, that is in AL now. The result will go to the AH register, and the previous result of the low byte will at the same time go to the AL register.
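
    The byte swapping is easier to follow in a small simulation. The Python sketch below is a model under assumptions, not the real design: the names `alu_cycle` and `add16` are made up, and it assumes that a carry is passed from the first cycle to the second, although the carry signals are not in the schematic yet.

```python
def alu_cycle(ah: int, al: int, data: int, carry_in: int = 0):
    """One ALU clock: AL + data goes into AH, while the old AH moves down into AL."""
    total = al + data + carry_in
    return total & 0xFF, ah, total >> 8   # new AH, new AL, carry out


def add16(acc: int, operand: int) -> int:
    """Add a 16-bit operand to the 16-bit accumulator in two byte-wide cycles."""
    ah, al = (acc >> 8) & 0xFF, acc & 0xFF
    # Cycle 1: low bytes are added; the low result sits in AH for the moment.
    ah, al, carry = alu_cycle(ah, al, operand & 0xFF)
    # Cycle 2: high bytes are added; the low result drops back into AL.
    ah, al, carry = alu_cycle(ah, al, (operand >> 8) & 0xFF, carry)
    return (ah << 8) | al


print(hex(add16(0x12FF, 0x0001)))  # 0x1300
```

    After the first cycle the two halves are swapped; after the second cycle both bytes are back in their normal place.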

    LOAD function

    How do we put something in the accumulator ? We set the ALU_F1 signal (connected to the upper MPX) to 1. This disables the upper MPX, so its output will be zero. We now do the ADD operation. The databus contents will be added to zero, and the result will be put in the accumulator.

    BIC function

    For the BIC (bit clear) function (that is a logical AND where one of the operands is inverted), the upper MPX is disabled but the lower MPX is now enabled. The lower MPX inverts the data (it is a 74HC158). The following AND gates give the result DATA and (not ACCU). The adder adds zero to this, which does not change the result. So any bit that was set in the accumulator causes the same bit from the databus input to be cleared.

    BIS function

    The BIS (bit set) function (PDP11 parlance for logical OR) is similar to BIC, but now the upper MPX is also enabled again, so both adder inputs can receive data. When an accumulator bit is 0, the AND gate passes the databus bit and the adder adds the 0 bit to it, so the result for this bit equals the databus bit (there is no carry from the previous bit). When an accumulator bit is 1, it is inverted by the lower MPX so it delivers 0 to its AND gate, and the lower input of the adder will be 0. The adder output will be 1 because the upper input is 1 and the lower input is 0. The key idea is that, at the adder, the two inputs for a given bit are never both 1. The adder therefore never generates an (internal or external) carry, and each adder chip behaves as four OR gates.
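
    The data path described so far (upper MPX, inverting lower MPX, AND gates, adder) can be modelled in a few lines of Python to check that BIS really comes out as a logical OR. This is a sketch of one 8-bit slice: the function names and the two boolean enable arguments are illustrative, and the accumulator swapping and the carries between bytes are left out.

```python
def alu(acc: int, data: int, upper_enabled: bool, lower_enabled: bool) -> int:
    """One pass through the data path for an 8-bit accumulator and databus byte."""
    upper = acc if upper_enabled else 0x00             # upper MPX: accumulator or zero
    mask = (~acc & 0xFF) if lower_enabled else 0xFF    # lower MPX inverts; all high when disabled
    lower = data & mask                                # the AND gates
    return (upper + lower) & 0xFF                      # the adder


def load(acc, data): return alu(acc, data, False, False)
def add(acc, data):  return alu(acc, data, True, False)
def bic(acc, data):  return alu(acc, data, False, True)
def bis(acc, data):  return alu(acc, data, True, True)


print(bin(bis(0b1100, 0b1010)))  # 0b1110
print(bin(bic(0b1100, 0b1010)))  # 0b10
```

    Because the two adder inputs never have the same bit set in BIS mode, the sum equals the bitwise OR for every combination of accumulator and databus value.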

    SHR function

    Finally, the SHR (shift right) function can be done by the upper MPX, because it has a shifted version of the accumulator bits connected to one of its input groups. The databus input should be zero (if it is not, it will be added to the result).

    SHL function

    SHL (shift left) is the same as adding a value to itself.
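
    Both shifts can be written down for a single byte in Python. This is a hedged sketch: it assumes that the bit shifted in at the top for SHR is zero, and it ignores the carry between the two bytes of a word, which is not designed yet. The function names are illustrative.

```python
def shr(acc: int, data: int = 0) -> int:
    """Upper MPX selects the shifted accumulator; the adder adds the databus byte."""
    return (((acc & 0xFF) >> 1) + data) & 0xFF


def shl(acc: int) -> int:
    """Shift left is the same as adding the value to itself."""
    return (acc + acc) & 0xFF


print(bin(shr(0b10110010)))  # 0b1011001
print(bin(shl(0b10110010)))  # 0b1100100
```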

    Other functions

    A few functions are missing, but they can be composed from the functions that are available. This can be done in microcode, so the instruction set can still contain these missing functions. It will only have a small impact on performance. The functions that I'm talking about are SUB, XOR and NOT.
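
    One possible way to compose the missing functions from the available ones is sketched below in Python. This is not necessarily how the microcode will do it; it only shows that ADD, BIC and BIS are sufficient. The helper names are made up, and everything is limited to one byte.

```python
MASK = 0xFF

# The available ALU functions, reduced to their arithmetic meaning.
def add(acc: int, data: int) -> int: return (acc + data) & MASK
def bic(acc: int, data: int) -> int: return data & ~acc & MASK   # DATA and (not ACCU)
def bis(acc: int, data: int) -> int: return (acc | data) & MASK


def not_(x: int) -> int:
    """NOT x: clear, in an all-ones byte, every bit that is set in x."""
    return bic(x, 0xFF)


def xor(a: int, b: int) -> int:
    """XOR as (b and not a) or (a and not b)."""
    return bis(bic(a, b), bic(b, a))


def sub(a: int, b: int) -> int:
    """a - b as a + (not b) + 1, the usual two's complement identity."""
    return add(add(a, not_(b)), 1)


print(hex(not_(0x0F)), hex(xor(0x3C, 0x0F)), hex(sub(0x10, 0x01)))  # 0xf0 0x33 0xf
```

    Each composed function takes a few extra ALU passes, which matches the expected small impact on performance.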

  • Address generation

    roelh, 04/10/2019 at 14:15, 0 comments

    For generating the memory address, the square inch processor had only a H-L register pair, that had to be reloaded each time when another address was needed. For the new processor, I want to have several addresses on standby, ready to be connected to the address bus when needed. I also want 20 bit addresses.

    The 74HC670 seems to be very suitable for this. It has 4 latches of 4 bits each (and only 16 pins). If we use five of them, we have four 20-bit registers. 1-to-4 decoders for read and write are built-in ! Here are the internals:

    Bits 1 to 4 will get or-gates or an adder to add the 4-bit word-displacement to the address. The lowest address bit, bit 0, is used to select upper or lower byte in a word.

    So, the address generation takes only 6 chips !
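
    To illustrate how five 4x4 register files make four 20-bit registers, here is a minimal Python model. The class name, the slot order in `NAMES` and the assignment of chip 0 to address bits 0-3 are assumptions made for the sketch; the real board also writes the lower 16 bits and the upper 4 bits separately, which is not modelled.

```python
class AddressRegisters:
    """Five 74HC670-style chips, each holding four 4-bit words."""

    NAMES = ("PC", "WP", "X", "Y")   # assumed slot order, for illustration only

    def __init__(self):
        # chips[c][r] is word r of chip c; chip c holds address bits 4c .. 4c+3
        self.chips = [[0] * 4 for _ in range(5)]

    def write(self, reg: int, value: int) -> None:
        """Spread a 20-bit value over the same word of all five chips."""
        for chip in range(5):
            self.chips[chip][reg] = (value >> (4 * chip)) & 0xF

    def read(self, reg: int) -> int:
        """Collect the five nibbles of one register into a 20-bit address."""
        return sum(self.chips[chip][reg] << (4 * chip) for chip in range(5))


regs = AddressRegisters()
regs.write(AddressRegisters.NAMES.index("X"), 0xABCDE)
print(hex(regs.read(AddressRegisters.NAMES.index("X"))))  # 0xabcde
```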

  • High level instructions

    roelh, 04/09/2019 at 19:45, 0 comments

    For the instructions, there will be the following registers visible:

     
    Hardware address registers: (20 bit wide)
      PC program counter
      WP workspace pointer
       X  index register
       Y  index register  

    Registers in RAM (16 bit wide):  R0 - R15

    The WP register points to the location of R0. 

    The instructions will follow the assembly format of the PDP11 computer.

    Most instructions will have two operands, like:

      MOVB #17,R4   ; load register R4 with the value 17 (decimal)

    Instructions handle a single byte or a word (2 bytes).

    Possible instructions are: 

    MOVB src,dst  ; move data (byte size)
    ADDB src,dst  ; add data (byte size)
    SUBB src,dst  ; subtract data (byte size)
    BISB src,dst  ; bit set  (byte size)
    BICB src,dst  ; bit clear (byte size)
    BITB src,dst  ; bit test (byte size)
    CMPB src,dst  ; compare (byte size)
    
    MOV src,dst  ; move data (word size)
    ADD src,dst  ; add data (word size)
    SUB src,dst  ; subtract data (word size)
    BIS src,dst  ; bit set  (word size)
    BIC src,dst  ; bit clear (word size)
    BIT src,dst  ; bit test (word size)
    CMP src,dst  ; compare (word size)
    
    BR label     ; branch. conditional versions also available
    JSR label    ; jump to subroutine
    RTS          ; return from subroutine 
    

    There will be more instructions, but these are the main ones.

    Now the addressing modes:

    Rn        ; general register
    X, Y, WP  ; address register
    (Rn)      ; register indirect
    (Rn+)     ; register indirect with post-increment
    disp(X)   ; indirect with displacement, displacement 0-15 words
    disp(Y)   ; indirect with displacement, displacement 0-15 words
    #number   ; immediate data
    label     ; zero-page memory location

    Most instructions will be 2 bytes (16 bits).

    There will be special instructions to load the four upper bits of X, Y and WP.  This might be done with instructions that handle LONG operands (4 bytes).

    Due to the limited number of opcodes available, not all combinations of addressing modes will be available in the final instruction set.

    It is under investigation to make the instruction encoding equal to the encoding of the PDP11.

    Well all that has to be done is write microcode to implement this.... Oh wait, there is no hardware yet....

  • Hardware registers and microcode

    roelh, 04/09/2019 at 18:59, 0 comments

    REGISTERS

    The data width of the CPU is 8 bits.

    The CPU will now have four address registers (instead of the single HL register pair ).

    The address registers are 20 bits wide. Regular instructions will operate on the lowest 16 bits, and there will be special instructions to fill the upper 4 bits of each register.

    Register naming:
      PC program counter
      WP workspace pointer
       X index register
       Y index register  

    There will be two 8-bit accumulator registers, coupled to the new ALU. They can, together,
    contain a 16 bit value.

    MICROCODE

    Hopefully, the microcode can stay 8 bits wide.

    For the moment, this is the idea for the micro-instructions:

    00AA DDDD  load accumulator 8bit, addr register AA with displ. DDDD
    01AA DDDD  store accumulator 8bit, addr register AA with displ. DDDD
    10AA DDDD  add to accumulator, addr register AA with displ. DDDD
    
    1100 FFFF  set ALU function FFFF (instead of the default ADD)
    
    1101 AA00  16-bit accumulator to addr. register AA (bits 0-15)
    1101 AA01  16-bit accumulator to addr. register AA (bits 16-19) 
    1101 AA10  16-bit accumulator to addr. register AA if true (bits 0-15)
    1101 AA11  16-bit accumulator to addr. register AA if false (bits 0-15)
    
    1110 FFFF  reserved for I/O
    
    1111 FFFF  microcode jump
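
    To make the bit layout concrete, the table above can be turned into a small decoder. The Python sketch below follows the table as written; the function name `decode`, the tuple layout and the mnemonic strings are made up for illustration and are not part of the design.

```python
ACC_TO_ADDR_MODES = {
    0b00: "acc16 to addr reg bits 0-15",
    0b01: "acc16 to addr reg bits 16-19",
    0b10: "acc16 to addr reg bits 0-15 if true",
    0b11: "acc16 to addr reg bits 0-15 if false",
}


def decode(byte: int):
    """Split one 8-bit micro-instruction into its fields."""
    byte &= 0xFF
    top2 = byte >> 6
    if top2 != 0b11:
        # 00AA DDDD / 01AA DDDD / 10AA DDDD
        op = ("load", "store", "add")[top2]
        return (op, (byte >> 4) & 0b11, byte & 0xF)   # (operation, AA, DDDD)
    group = (byte >> 4) & 0b11
    low4 = byte & 0xF
    if group == 0b00:
        return ("set_alu_function", low4)             # 1100 FFFF
    if group == 0b01:
        # 1101 AAmm: AA in bits 3-2, mode in bits 1-0
        return (ACC_TO_ADDR_MODES[low4 & 0b11], low4 >> 2)
    if group == 0b10:
        return ("io_reserved", low4)                  # 1110 FFFF
    return ("microcode_jump", low4)                   # 1111 FFFF


print(decode(0b00100110))  # ('load', 2, 6)
print(decode(0b11011001))  # ('acc16 to addr reg bits 16-19', 2)
```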


Discussions

bobricius wrote 4 days ago

I am excited about this project.


roelh wrote 04/10/2019 at 12:48

Yes, you made it famous, thank you !

