
ECM-16/TTL

16-bit computer made from TTL logic chips

The aim of this project is to build a functional computer from scratch, based around a 16-bit datapath, using logic chips of the 74HC family. There are three parts to this project:

1. hardware design
2. hardware build
3. writing software

For design, the "Digital" logic simulator (Logisim clone from H. Neemann, https://github.com/hneemann/Digital) is used. After the simulation is fully designed and works, actual building/soldering begins. Software for the computer is developed throughout, and this development does not stop after hardware completion. I plan to release the instruction set and *.dig files publicly (when they become stable), so anyone interested can write their own software for this computer, or tinker with its design.

Computer design goals:

Architecture:

RISC-like – inspired by MIPS, but quite different. This is a Load/Store architecture, meaning that ALU operations are applied only to data in registers. Data in memory must first be loaded into these registers, and results are stored from them back to memory, in a separate instruction cycle.

16-bit computer, 16-bit wide registers, 16-bit wide ALU and 16-bit bus.

Memory consists of 16-bit words.

Up to 16M words can be addressed.

Component base: 74HCxx SSI and MSI chips.

Input-output

Input: keyboard, serial interface from magnetic tape (this is still very speculative).

Output: Monitor (TV) characters, pseudographics, bitmap. Serial to tape recorder(?). Or Compact Flash card(?).

Mass storage: CF or tape.

Registers

Register file: 8 16-bit registers, 2-address/3-address

2-address mode:

First address has read-write access: it provides the A operand for the ALU and is overwritten by the ALU result (when write-enabled). It can also be written to from the bus and enabled onto the bus.

Second address is read-only, provides B operand for ALU.

3-address mode:

First address (A operand) is written with result of ALU operation on 2 registers (B and C operands). C operand has restriction that it cannot be GPR0 or GPR1.
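
The two addressing modes above can be sketched in a few lines of Python. This is only a behavioural model, not the actual wiring: the class name `RegisterFile` and its method names are made up for illustration, and the ALU operation is passed in as a plain function.

```python
MASK16 = 0xFFFF


class RegisterFile:
    """Behavioural model of eight 16-bit registers (names are hypothetical)."""

    def __init__(self):
        self.regs = [0] * 8

    def alu_2addr(self, op, a, b):
        # 2-address mode: A is both source and destination, B is read-only.
        self.regs[a] = op(self.regs[a], self.regs[b]) & MASK16

    def alu_3addr(self, op, a, b, c):
        # 3-address mode: A receives op(B, C); C may not be GPR0 or GPR1.
        if c < 2:
            raise ValueError("C operand cannot be GPR0 or GPR1")
        self.regs[a] = op(self.regs[b], self.regs[c]) & MASK16


def add(x, y):
    return x + y


rf = RegisterFile()
rf.regs[1], rf.regs[2] = 0xFFFF, 0x0003
rf.alu_3addr(add, 0, 1, 2)  # like ADD a0 b1 c2: reg0 = reg1 + reg2 = 0x0002
rf.alu_2addr(add, 0, 1)     # like ADD a0 b1:    reg0 = reg0 + reg1 = 0x0001
```

Results are truncated to 16 bits; carry and other flags are left out of this sketch.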

Program counter: presettable synchronous counter – 24 bits

Instruction register: holds running instruction.

Memory address register – 24 bits, can address up to 16M locations

Stack pointer: presettable synchronous up/down counter – 24 bits

ALU

Functions: ADD, SUB, AND, NOR, XOR, SHIFT, ROTATE

B operand modifications: none, invert (ones' complement), two's complement, replace with a constant: 0, 1-255.

Adder: fast adder (with carry look-ahead) for high speed.

Computer042.zip

Several bugs fixed; addressing logic remade from scratch, so it is more regular; true subroutine calls added. Comes pre-loaded with a 16-bit positive integer multiplication program; input and output are hexadecimal.

x-zip-compressed - 2.29 MB - 12/12/2019 at 05:44


asm3.zip

Third version of assembler; supports constant and address labels, several directives.

x-zip-compressed - 1.46 MB - 12/12/2019 at 05:42


asm2.zip

Second version of assembler: couple of bugs fixed, now can have arbitrary number of spaces in lines, and, most important, address labels are supported.

x-zip-compressed - 1.43 MB - 11/30/2019 at 08:20


asm.zip

simple assembler for code for use with simulated computer

application/x-zip-compressed - 1.43 MB - 11/24/2019 at 13:24


Computer04.zip

Simulation files for computer described in this project + documentation of its instruction set. (Updated with new instructions and execution cycle) (zip archive)

application/x-zip-compressed - 284.03 kB - 11/18/2019 at 17:12


  • Name for the computer; asm v3

    Pavel, 5 days ago, 0 comments

    I am not good at coming up with names, but after some thinking I've at last come up with a name for this computer:
    from now on it will be called ECM-16/TTL. This is itself quite a generic name, meaning Electronic Computing Machine, 16-bit, based on TTL logic chips (of the 74HC family). A quick search on Google didn't return this name for any other homebrew computer/CPU, so I am claiming it for my machine in development.

    On the assembler front, it is now at version 3 and is becoming a treat to use, as it now supports labels and directives (though no expressions yet). Constantly looking up instruction bit patterns and hand-calculating jump and load/store addresses is no longer needed; it is all done automatically.

    During development of this assembler, several bugs were found in it, and also in the wiring of computer itself. Now, they are fixed.

    There are some things that are yet to be added to the assembler. I think support for at least rudimentary expressions (like adding a constant to a label) will be handy. I am also thinking about adding PC-relative addressing for short jumps and loads/stores, so that awkward situations on page and block boundaries can be avoided. But this means adding another adder to the machine, this time in the addressing circuitry, which is the thing I've tried to avoid from the very beginning. Also, the assembler logic will have to be changed a bit.

    Below are descriptions of some aspects of ECM-16/TTL:

    Reason for 24-bit addresses

    At first, I was content with having 64k words for this computer design.
    But later, when designing memory access instructions, I faced a problem: the address has to live somewhere. The 16 bits needed to address 64k fit comfortably in a general purpose register, and that size is comfortable for shuttling around on the main bus. But I wanted instructions which have the address encoded in them. I already had ALU instructions with a constant value encoded in them, like ADD aX 0xff, and this scheme can also be employed for addressing. Thus, there is an instruction like LD aX [0xff], and it is laid out in the instruction word as [high byte][low byte] => [opcode][address]. But this can only address 256 words (what I call a single "page") of memory, which is far from the desired 64k words.

    An instruction that contains the whole address has to be longer than one word. Naturally, it takes up two words, as this design does not support byte addressing. So, I had 32 bits to play with. One layout could be [opcode][high 8 bits of address]:[low 8 bits of address][not used], but it would be too awkward to implement in hardware. Another could be [opcode][not used]:[high 8 bits of address][low 8 bits of address], which is much more natural in this design.

    Now, though, I have 8 bits more than I need to address 64k locations, so why not use them? Let's define a block of memory as the 64k locations which can be addressed by one 16-bit word. If we then use the byte right next to the opcode as the block index, up to 256 blocks can be addressed, and therefore 16M memory locations with a 24-bit address. In reality, I don't think the computer will ever have that much memory, mostly because it would be relatively expensive (SRAM chips are to be used for memory).
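
    As a rough illustration of the address arithmetic described above, here is a small Python sketch. The function names and the opcode value used in the example are hypothetical, and the rule that in-page addressing keeps the upper 16 bits of some "current" address is an assumption of this sketch, not something taken from the instruction set documentation.

```python
BLOCK_BITS = 16  # a block is the 64k words addressable by one 16-bit word
ADDR_BITS = 24   # 8-bit block index + 16-bit in-block address


def full_address(block_index, offset16):
    """Join an 8-bit block index and a 16-bit in-block address."""
    return ((block_index & 0xFF) << BLOCK_BITS) | (offset16 & 0xFFFF)


def page_address(current_address, low8):
    """Assumption: in-page access replaces only the low byte of the address."""
    return (current_address & 0xFFFF00) | (low8 & 0xFF)


def two_word_layout(opcode, address24):
    """[opcode][block index] : [high 8 bits][low 8 bits] of the address."""
    first = ((opcode & 0xFF) << 8) | ((address24 >> BLOCK_BITS) & 0xFF)
    second = address24 & 0xFFFF
    return first, second


total_words = 1 << ADDR_BITS            # 16777216 words, i.e. 16M
addr = full_address(0x12, 0x3456)       # 0x123456
words = two_word_layout(0xAB, addr)     # (0xAB12, 0x3456); 0xAB is a made-up opcode
```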

    Memory space can be presented as a hierarchical structure, with levels differing by ease of access (number of clock cycles needed):

    Page -- 256 words, which can be accessed with the shortest and fastest instructions;
    --I am thinking about changing in-page addressing into PC-relative addressing, so that a close load/store or jump could reach +-127 memory locations, independent of page/block boundaries. This change will need the addition of another adder, 24 bits wide, so that the [base + offset] address of the location is calculated on the fly. This may also add quite a lot of complexity to the assembler.

    Block -- 65536 words, addressed by value in one of the general purpose...
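
    The PC-relative idea mentioned in the Page entry above can be sketched as follows. This is a guess at how it might behave, not a description of the machine: it assumes the 8-bit offset is a two's complement value (which technically allows -128 as well as the +-127 mentioned) and that the 24-bit sum wraps around.

```python
ADDR_MASK = 0xFFFFFF  # 24-bit address space


def sign_extend8(value):
    """Interpret an 8-bit field as a signed two's complement offset."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def pc_relative(pc, offset8):
    """[base + offset] as a 24-bit adder would compute it (wrap-around assumed)."""
    return (pc + sign_extend8(offset8)) & ADDR_MASK


forward = pc_relative(0x0100FE, 0x05)   # 0x010103, crosses a page boundary
backward = pc_relative(0x010000, 0xFF)  # 0x00FFFF, crosses a block boundary
```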


  • Assembler, first version

    Pavel, 11/24/2019 at 13:23, 0 comments

    Just another quick note:

    The first version of the assembler is done. It seems to output good machine code in hex, which can then be imported into the ROM of the simulated computer.

    It is not very sophisticated, just converts mnemonics into machine code; there is no support for directives and labels yet.

    The source code, compiled binary for Windows, .bat file with commands for compilation of source with g++ and test file are in archive in "Files" section.

    On the other hand, work on making instruction cycles shorter has run into some problems: it turned out that much more work is needed than I thought at first. So far my attempts have been hindered by data/clock races, where the next instruction is fetched and starts to execute while the previous one is still in progress, or by bus congestion.

  • Small update on simulation

    Pavel, 11/18/2019 at 06:47, 0 comments

    Turns out there is still room for some optimisations.

    I found a way to squeeze a 3rd register address into the 16-bit ALU instruction, so now operations on two operands with the result saved into a 3rd register are possible. For example, ADD a0 b1 c2 adds the values stored in registers 1 and 2 together and stores the result into register 0. This in no way breaks the previous mode, where ops were like ADD a0 b1: the values in registers 0 and 1 are added and the result overwrites register 0. These new operations became possible by slightly changing the addressing wiring in the Register File. As it maintains compatibility with the previous instructions, the new mode has the limitation that only registers 2 to 7 can be used as operand C.

    Other changes to ALU operations are made with some modification of ALU itself. The wiring was changed slightly, so one operand instructions (such as shifts) now can store result into other register, without overwriting contents of the source.

    Also, a barrel rotator was added to the ALU, which can rotate the word left by up to 15 bits.
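
    A rotate-left by 0 to 15 bits can be modelled in Python as below. The staged structure (rotate by 1, 2, 4 and 8, each stage enabled by one bit of the amount) is the textbook way to build a barrel rotator from multiplexers; whether the simulated circuit is organised exactly like this is an assumption.

```python
def rotl16_barrel(value, amount):
    """Rotate a 16-bit word left by 0..15 bits using four mux-like stages."""
    value &= 0xFFFF
    for stage in (1, 2, 4, 8):
        if amount & stage:
            # Bits shifted out at the top re-enter at the bottom.
            value = ((value << stage) | (value >> (16 - stage))) & 0xFFFF
    return value


r1 = rotl16_barrel(0x8001, 1)    # 0x0003
r4 = rotl16_barrel(0x1234, 4)    # 0x2341
r15 = rotl16_barrel(0x1234, 15)  # 0x091A, same as rotating right by 1
```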

    The encoding of the new register addresses for the above operations, as well as of the rotation bit count, was achieved by utilising previously unused bits of the instruction (or by conditionally reinterpreting some bits). This took surprisingly small changes to the ALU/Register File wiring and decode logic.

    One encoding which was changed is that the +/- bit is now used to indicate shift direction, instead of the previously used dedicated Left/Right bit. All the other changes are additions to the previously existing set.

    On another note, I found a way to speed up execution by shaving clock cycles off some of the instruction types. This is done by staggering instruction execution, so that while the last step of the current instruction cycle is in progress, the fetch of the next instruction begins.

    It was possible for instructions which don't do writes to Memory Address Register (MAR) at their last step, so no bus congestion is created.

    Instructions affected are ALU, MOV and Load/Store instructions. This way, if they are going in succession, ALU and MOV instructions take only 2 clock cycles instead of 3, and Load/Store operations take 3-4 clocks instead of 4-5, depending on operation.
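
    To get a feel for the numbers above, here is a small clock-count model in Python. It is a simplification under stated assumptions: each instruction is given as a (kind, base clocks) pair using the un-staggered counts from this log (3 for ALU/MOV, 4 or 5 for Load/Store), and one clock is saved whenever an affected instruction is followed by another instruction, because the next fetch overlaps its last step. The kind names are made up for this sketch.

```python
# Instruction types named above as benefiting from the staggered fetch.
OVERLAPPING = {"ALU", "MOV", "LD", "ST"}


def total_clocks(program):
    """program: list of (kind, base_clocks) tuples; returns the clock total."""
    total = 0
    for i, (kind, base) in enumerate(program):
        total += base
        # Assumed rule: the following fetch hides behind this instruction's
        # last step, so one clock is saved unless it is the last instruction.
        if kind in OVERLAPPING and i + 1 < len(program):
            total -= 1
    return total


run = [("ALU", 3), ("ALU", 3), ("MOV", 3)]
staggered = total_clocks(run)           # 7 clocks
plain = sum(base for _, base in run)    # 9 clocks without staggering
```

    In a long run of ALU and MOV instructions this approaches 2 clocks per instruction, which matches the figure quoted above.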

    To achieve this, the changes made were also surprisingly small, 3 or 4 logic gates were added, as well as a couple of wires, to the simulated processor.

    The files with updated instruction set as well as simulation can be found in Computer04.7z archive in "Files" section. 

    -------------------------

    Things to add to simulation later:

    1. Start-up sequence -- for a first cycle of 8 clocks nothing is done and Reset is applied to the whole system, then execution commences with the instruction at address 0x000000.

    2. Interrupt handling. For now it is still a somewhat obscure topic for me.

    3. More memory :) For now a placeholder is used, with a 256-word ROM and RAM. For testing purposes, that is enough for now.

    -------------------------

    Right now I need to write an assembler (however crude) in C++, so I won't need to type 0's and 1's into a spreadsheet by hand and look up every bit pattern, as it has become apparent that this process is incredibly slow and error-prone.

    --------------------------------------------------------

    Quick Update: today it occurred to me that I can make first two steps of instruction cycle into one, by bypassing MAR when doing addressing from Program Counter or Stack pointer. This can be done by rerouting some busses and using two 4:1 24-bit multiplexers instead of one 8:1. So, in coming days I'll try to implement this. It will not change any instruction encoding in any way, but will make all instruction cycles 1 clock cycle shorter. This, combined with recently implemented pre-fetching (described above), will make ALU and MOV operations effectively done 1 instruction per clock cycle in some situations.

  • Simulation completion

    Pavel, 11/06/2019 at 06:12, 5 comments

    There was a year-long hiatus in my work on this project, but I have recently resumed it.

    Looking with a fresh eye, I found several things which I hadn't noticed previously, such as commonalities in the decode structure of different instructions. This led to an overhaul of the decoding circuitry, reducing redundancies while at the same time adding some new instructions (or rather variations on them) that exploit these commonalities.

    The CPU now is Turing complete. Though there are still possibilities for adding additional instructions, existing set is quite big already.

    There are multiple operations for manipulating data with the ALU; others move data between registers, load from and store to memory, and perform unconditional and conditional jumps.

    The registers are of two types - General purpose (GPR) and Special (SpR). Data in GPR can be manipulated in ALU, while data in SpR can not. Data can be moved between all registers, and also loaded/stored in RAM; although there are some restrictions for particular SpRs.

    The ALU can only perform its functions on data from GPRs. Data from memory need to be explicitly loaded or stored as distinct operation.

    There is no microcode -- all instructions are decoded by combinatorial logic. 

    The computer simulation has very rudimentary I/O as of right now.

    It also lacks any interrupt handling.

    The last two points will be worked on in the future.

    In the Files section there is an archive with simulation files as well as an Excel spreadsheet with all instructions described. It also contains manual "assembler", which makes programming this computer slightly easier.

    The simulation files can be opened with Digital logic circuit simulator software.

  • ALU

    Pavel, 08/20/2018 at 07:32, 3 comments

    This is the description of the Arithmetic-logic unit of this "computer". 

    Most of it I conceived a year ago, and now, after some modifications, it is one of the most stable parts of the system (along with the register file).

    Here is its inner organisation:

    The ALU accepts two 16-bit operands, A_in and B_in, and outputs one 16-bit value, Y_out, which for ease of debugging is also output to 16 LEDs on this schematic.

    A_in and B_in are supplied from the Register File, but B_in can be substituted with a constant value in the range 0 to 255, supplied through the Const input. Selecting what goes in as the B operand is done by the 1-bit control line Sel_B through the Incrementor module (just a 2-to-1 mux).

    Next, the B operand goes through the Negator module (16 parallel XOR gates), where the 1-bit control line Sub dictates whether B passes through unchanged or has all its bits flipped. This same line flips the carry bit that goes into the adder, to facilitate subtraction in two's complement.
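
    The Negator trick relies on the identity A - B = A + ~B + 1 in two's complement. A minimal Python model, with function names invented for this sketch and the adder reduced to plain integer addition:

```python
MASK16 = 0xFFFF


def negator(b, sub):
    """16 parallel XOR gates: pass B unchanged (sub=0) or flip every bit (sub=1)."""
    return (b ^ (MASK16 if sub else 0)) & MASK16


def add_sub(a, b, sub, carry_in=0):
    """Return (result, carry_out); the Sub line also flips the incoming carry."""
    total = (a & MASK16) + negator(b, sub) + (carry_in ^ sub)
    return total & MASK16, total >> 16


diff = add_sub(5, 3, sub=1)          # (0x0002, 1): 5 - 3
borrow = add_sub(3, 5, sub=1)        # (0xFFFE, 0): 3 - 5 = -2 in two's complement
wrap = add_sub(0xFFFF, 1, sub=0)     # (0x0000, 1): plain addition with carry out
```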

    On the next stage there are several blocks in parallel, namely, Shifter, Fast Adder and Logic unit.

    The Shifter works only on the A_in operand, so it ignores whatever the B value is. This module takes in 16-bit D_in and 1-bit C_in (carry in) and outputs 16-bit D_out and a 1-bit carry out. The three controls are: A for Arithmetic shift, L/R for choosing the shift direction, and sh/RC for choosing between Shift and Rotate through carry. The Shifter mostly consists of 2-to-1 muxes with a handful of logic gates.
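
    A one-bit shifter with these three controls could behave as in the Python sketch below. The exact semantics are assumptions of this sketch: the bit shifted out always becomes the carry out, the arithmetic control only matters for right shifts, and rotate-through-carry takes priority over the arithmetic control.

```python
def shifter(d_in, c_in, left, arithmetic=False, rotate_carry=False):
    """Shift or rotate a 16-bit word by one bit; returns (d_out, c_out)."""
    d_in &= 0xFFFF
    if left:
        fill = c_in if rotate_carry else 0          # ROLC or SHL
        c_out = d_in >> 15
        d_out = ((d_in << 1) | fill) & 0xFFFF
    else:
        if rotate_carry:
            fill = c_in                             # RORC
        elif arithmetic:
            fill = d_in >> 15                       # ASHR keeps the sign bit
        else:
            fill = 0                                # SHR
        c_out = d_in & 1
        d_out = (d_in >> 1) | (fill << 15)
    return d_out, c_out


shl = shifter(0x8001, 0, left=True)                       # (0x0002, 1)
ashr = shifter(0x8001, 0, left=False, arithmetic=True)    # (0xC000, 1)
rorc = shifter(0x0001, 1, left=False, rotate_carry=True)  # (0x8000, 1)
```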

    Fast adder takes in C_in, A_in and B, outputs S and C_out. It is comprised of four chained 4-bit fast adder units. Constructed from simple logic gates throughout, no fancy 74181/2/3 chips.
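
    For reference, the carry look-ahead principle can be written out in Python as below. This follows the standard generate/propagate equations for a 4-bit unit and chains four such units through their carries, as the description above suggests; the actual gate-level layout of the simulated adder may differ.

```python
def cla4(a, b, c0):
    """4-bit carry look-ahead adder; returns (sum4, carry_out)."""
    g0, g1, g2, g3 = [((a & b) >> i) & 1 for i in range(4)]  # generate
    p0, p1, p2, p3 = [((a ^ b) >> i) & 1 for i in range(4)]  # propagate
    # Every carry is computed directly from g, p and c0, with no rippling.
    c1 = g0 | p0 & c0
    c2 = g1 | p1 & g0 | p1 & p0 & c0
    c3 = g2 | p2 & g1 | p2 & p1 & g0 | p2 & p1 & p0 & c0
    c4 = g3 | p3 & g2 | p3 & p2 & g1 | p3 & p2 & p1 & g0 | p3 & p2 & p1 & p0 & c0
    s = (p0 ^ c0) | (p1 ^ c1) << 1 | (p2 ^ c2) << 2 | (p3 ^ c3) << 3
    return s, c4


def add16(a, b, c_in=0):
    """Four chained 4-bit units; returns (sum16, carry_out)."""
    result, carry = 0, c_in
    for nibble in range(4):
        shift = 4 * nibble
        s, carry = cla4((a >> shift) & 0xF, (b >> shift) & 0xF, carry)
        result |= s << shift
    return result, carry


total = add16(0x1234, 0x0FFF)    # (0x2233, 0)
overflow = add16(0xFFFF, 0x0001) # (0x0000, 1)
```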

    The Logic unit also operates bitwise on the A_in and B_in inputs, and has three outputs. Essentially, it is a compound XOR gate (made from ANDs and NORs plus inverters), so AND, OR and XOR are obtained from the same circuit.
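
    One way a compound XOR gate yields all three results is sketched below in Python: the AND and NOR terms are computed first, OR is the inverted NOR, and XOR is the NOR of the AND and NOR terms. This is one possible arrangement consistent with the description; the real gate arrangement is not spelled out here.

```python
MASK16 = 0xFFFF


def logic_unit(a, b):
    """Return (AND, OR, XOR) of two 16-bit words, built from AND/NOR/NOT."""
    and_ab = a & b & MASK16
    nor_ab = ~(a | b) & MASK16
    or_ab = ~nor_ab & MASK16                 # inverter on the NOR output
    xor_ab = ~(and_ab | nor_ab) & MASK16     # NOR of the AND and NOR terms
    return and_ab, or_ab, xor_ab


outputs = logic_unit(0b1100, 0b1010)  # (0b1000, 0b1110, 0b0110)
```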

    The last stage is the big 8to1 16 bit mux, which is controlled by 3-bit F (as in Function) control line. It chooses which function result will be output from the ALU.

    Here are these functions:

    000  Zero       -- no matter what the inputs are, all output bits are set to '0'
    001  A shifted  -- whatever comes from the Shifter unit (itself controlled by the A, L/R and sh/RC lines); could be SHL, SHR, ASHR, ROLC, RORC
    010  A          -- A_in goes straight through to the output
    011  ~A         -- A_in gets inverted
    100  ADD (A,B)  -- result from the adder; can be any of ADD, SUB, ADDC, SUBC
    101  AND (A,B)
    110  OR (A,B)
    111  XOR (A,B)

    In total, this ALU is capable of 28 different operations, though in many cases some of the operations can give the same result.
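
    The final 8-to-1 mux stage can be modelled as a simple table lookup. In this Python sketch the Shifter and adder results are passed in as ready-made values, and the function name `alu_select` is made up for illustration.

```python
MASK16 = 0xFFFF


def alu_select(f, a, b, shifter_out, adder_out):
    """8-to-1 mux: pick the ALU output according to the 3-bit F line."""
    outputs = (
        0,            # 000 Zero
        shifter_out,  # 001 A shifted
        a,            # 010 A
        ~a,           # 011 ~A
        adder_out,    # 100 ADD (A,B)
        a & b,        # 101 AND (A,B)
        a | b,        # 110 OR (A,B)
        a ^ b,        # 111 XOR (A,B)
    )
    return outputs[f & 0b111] & MASK16


inverted = alu_select(0b011, 0x00FF, 0, 0, 0)        # 0xFF00
xored = alu_select(0b111, 0xF0F0, 0xFF00, 0, 0)      # 0x0FF0
```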

  • At the beginning...

    Pavel, 08/16/2018 at 07:47, 0 comments

    So, a couple of years ago I saw people on the internet making their own computers and CPUs from scratch at home, the kind of thing that the average person would think is possible only in an industrial setting with a team of experts. Since then this idea has infected my mind, and I started to learn electronics and digital circuits in my spare time, then bought a soldering iron and a small assortment of electronic components, and started to experiment.

    A year ago I first started to think seriously about really building a CPU, and I thought that using individual transistors would be cool (I still think so, but it means A LOT of tedious work, with at least half of the circuits not working as designed at first). At that time I read several books about CPU design (But how do it know?, DIY Calculator, and others). Based on my new knowledge and an estimate of my limitations, I decided that this computer should be based around a 16-bit datapath and have 16-bit wide instructions. It would also have 16-bit memory (data and address), all for simplicity of implementation.

    The instructions are not to be microcoded, but rather decoded from instruction word by combinational logic. 16 bit instructions are this width explicitly for ease of decoding. Longer instructions would be even easier, but it may increase component count and overall cost beyond what could be allotted to it.

    Another feature is a register file of eight 16-bit registers, so they are addressed by 3-bit addresses. They are to be operated as: A <-- ALUop(A,B). The general architecture is Load/Store.

    Well, after that I put this project on hold, and recently, about a month ago, I turned to it again. I made simulations in Digital for the ALU and Register File first, and started to design the rest of the system around them. Right now I am implementing instructions for this simulated computer, and it already kind of works, with a little program in machine language.

    The instructions available now are:

    ALU instructions -- the simplest decoder, because ALU instruction is almost the same as instruction word for hardware. There are quite a lot of different instructions, so the ALU can do many things with its operands, which could be one of the registers, two of the registers, or one of the registers and some 8-bit constant, which is encoded in instruction.

    Short jumps (within a 256-word page) -- unconditional, and conditional on the Carry, Negative, and Zero flags. The lower 8 bits of the jump address are encoded in the instruction itself.

    Load from page -- can load a word into one of the general purpose registers from within the current memory page (256 words long), as the lower 8 bits of the memory address are encoded in the instruction.

    Load immediate -- loads a word which is on the next address after instruction.

    Here is a screenshot of this "computer" simulation running a small program in the Digital logic simulator (only part is seen):

    This is work in progress (in the beginning stage of it), so the layout is subject to great change.


Discussions

Ken Yap wrote 11/24/2019 at 14:15

So no byte addressing? I suppose that keeps the addressing simple. Still, it means you will waste half the storage for characters and character strings.

This is the kind of CPU BCPL was targetting, but to handle strings, BCPL first had byte packing and unpacking routines, and later the % infix byte indirection operator.


Pavel wrote 11/24/2019 at 17:07

Well, I have consciously made such choice. And it doesn't mean that half of the storage is wasted, as character set could be made richer with more pseudographics and other non-Latin scripts (in my country the script used is Cyrillic, for example), and also maybe colour info encoded in the higher bits. On the other hand, basic Unicode is also 16-bit, so going this route, the strings can be made just that.

And I intend to use fairly modern and capacious storage, so having text info using up twice as much space than it would if I used byte-addressable memory is not that big of a deal.


Julian wrote 08/20/2018 at 23:31

Nice.  I think yours might be the only project on this site using Digital other than my 6-bit CPU (https://hackaday.io/project/159003-c61). :)

Out of interest, in case you need it I have a Digital plug-in library that provides a variable width/length FIFO component, which I've implemented for my planned IO processor project but haven't got around to actually using yet.  If you have any need for such a thing, let me know, and I'll upload it somewhere so you can use it.
