16-bit Computer made from TTL logic chips

The aim of this project is to build a functional computer, based around a 16-bit datapath, from scratch, using logic chips of the 74HC family. There are three parts to this project:
1. hardware design
2. hardware build
3. writing software

For design, the "Digital" logic simulator (a Logisim clone from H. Neemann) is used. Once the design is complete and works in simulation, the actual building and soldering begins. Software for the computer is developed throughout, and this development does not stop after the hardware is complete. I plan to release the instruction set and *.dig files publicly (when they become stable), so anyone interested can write their own software for this computer, or tinker with its design.

Computer design goals:


RISC-like – inspired by MIPS, but quite different. This is a Load/Store architecture, meaning that ALU operations are applied only to data in registers; data in memory must first be loaded into these registers, and results are stored from them back to memory, in separate instruction cycles.

16-bit computer, 16-bit wide registers, 16-bit wide ALU and 16-bit bus.

Memory consists of 16-bit words.

Up to 8M bytes can be addressed, byte-addressable memory.

Component base: 74HCxx SSI and MSI chips.


Input: keyboard, serial interface from magnetic tape (this is still very speculative).

Output: Monitor (TV) characters, pseudographics, bitmap.

Mass storage: CompactFlash through Parallel ATA interface.


Register file: 8 16-bit registers, 2-address/3-address

2-address mode:

First address has read-write access: it provides the A operand for the ALU and is overwritten by the ALU result (when write-enabled); it can also be written from the bus and enabled onto the bus.

Second address is read-only, provides B operand for ALU.

3-address mode:

First address (A operand) is written with the result of an ALU operation on 2 registers (B and C operands). The C operand has the restriction that it cannot be GPR0 or GPR1.
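The two register-file modes described above can be modelled in a few lines. This is only a behavioural sketch, not the project's wiring: the `RegisterFile` class, its method names and the `add` helper are hypothetical, and the ALU operation is passed in as a plain function.

```python
MASK16 = 0xFFFF

class RegisterFile:
    """Hypothetical model of 8 x 16-bit general purpose registers."""

    def __init__(self):
        self.regs = [0] * 8

    def op2(self, alu_op, a, b):
        """2-address mode: A <- alu_op(A, B)."""
        self.regs[a] = alu_op(self.regs[a], self.regs[b]) & MASK16

    def op3(self, alu_op, a, b, c):
        """3-address mode: A <- alu_op(B, C); C may not be GPR0 or GPR1."""
        if c in (0, 1):
            raise ValueError("C operand cannot be GPR0 or GPR1")
        self.regs[a] = alu_op(self.regs[b], self.regs[c]) & MASK16

def add(x, y):
    return x + y

rf = RegisterFile()
rf.regs[1], rf.regs[2] = 0xFFFF, 2
rf.op3(add, 0, 1, 2)   # r0 <- r1 + r2, wrapped to 16 bits
rf.op2(add, 0, 2)      # r0 <- r0 + r2
```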

Program counter: presettable synchronous counter – 24 bits

Instruction register: holds running instruction.

Memory address register – 24 bits, can address up to 16M locations

Stack pointer: presettable synchronous up/down counter – 24 bits

Frame pointer: special 24 bit register for temporary storage of SP value.

ALU (16-bit)


B operand modifications: none, invert (ones' complement), two's complement, replace with a constant: 0, 1-255.

Adder: fast adder (with carry look-ahead) for high speed.

Address AU ( 24-bit )

Arithmetic unit, add and subtract for indexed address calculation.

Added byte addressing mode; fixed a couple of wiring bugs. Additionally, the simulation is pre-loaded with an integer calculator program (decimal input and output), which does additions, subtractions, multiplications and divisions (it can also calculate the remainder of a division). Numbers can be up to 9 digits long.

application/x-zip-compressed - 1.67 MB - 01/13/2020 at 06:37


Several bugs fixed; addressing logic remade from scratch, so it is more regular; true subroutine calls added. Comes pre-loaded with a 16-bit positive integer multiplication program; input and output are hexadecimal.

x-zip-compressed - 2.29 MB - 12/12/2019 at 05:44


Third version of assembler; supports constant and address labels, several directives.

x-zip-compressed - 1.46 MB - 12/12/2019 at 05:42


Second version of assembler: a couple of bugs fixed, lines can now contain an arbitrary number of spaces, and, most importantly, address labels are supported.

x-zip-compressed - 1.43 MB - 11/30/2019 at 08:20


Simple assembler producing code for use with the simulated computer

application/x-zip-compressed - 1.43 MB - 11/24/2019 at 13:24



  • Thoughts on expanding addressing circuitry

    Pavel 02/08/2020 at 17:56 • 0 comments


    It looks like I definitely need to add a relative addressing mode. As I am reading K&R right now, I started wondering how "automatic" variables are created and implemented, and started searching for information about it. This way I learned a lot about how the stack operates, and about stack frames. This explanation is the best of all I've read.

    It seems that being able to push and pop one word at a time won't suffice for implementing proper function calls. (Well, there is still a way, but it is very awkward.) So I need the ability to access not only the location at the very top of the stack, but a whole range of locations (as many as are needed to store all local variables and parameters of a given function). Being able to access them by an address that is a known offset from a known value (Stack Pointer or Frame Pointer) makes this much easier.
    Thus it is now clear that the addressing logic needs modification, namely an additional adder for calculating offset addresses. Since it has become apparent that most of the circuitry will be muxes anyway, another adder now seems a relatively small addition.
    Given that there will be an adder in the addressing circuitry, it seems logical to implement additional addressing modes (various indexed ones). This would of course require yet another change to the memory access decoding circuits.
    These new modes should mostly be additions to the existing absolute addressing modes. One exception is that local (in-page) addressing will no longer be available: I am already discarding it with the addition of byte addressing, as it adds much hassle for little gain.
    Right now I can think of PC-relative and FP-relative modes (relative to the Frame Pointer, which is absent from the simulation right now).
    Maybe a GPR-relative mode as well. The number of modes may be constrained by the number of available instruction bits.
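What the address adder would compute for a base-plus-offset access can be sketched as follows. This is an illustration under stated assumptions, not the planned circuit: the function name, the example frame layout (parameters above FP, locals below it) and the concrete offsets are all hypothetical; only the 24-bit address width and the 2-byte word come from the text.

```python
ADDR_MASK = 0xFFFFFF  # 24-bit address space

def effective_address(base, offset):
    """Base (an SP, FP or PC value) plus a signed offset, wrapped to 24 bits."""
    return (base + offset) & ADDR_MASK

# Assumed frame layout: byte-addressed memory, 16-bit (2-byte) words.
fp = 0x00F000
first_param = effective_address(fp, +4)   # assumed to sit above the frame pointer
first_local = effective_address(fp, -2)   # assumed to sit below the frame pointer
```

The same function covers PC-relative and GPR-relative modes; only the source of `base` changes, which is why most of the extra hardware ends up being multiplexers.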
    Another implication of having an adder in the addressing circuitry is that SP and PC no longer need to be counters, and could be simple registers. On the other hand, after some thought, it looks like leaving these special registers the way they are now (counters which can auto-increment and, in the case of SP, also auto-decrement) may serve a purpose: this way there is no need for dedicated hard-wired inputs of +2 and -2 to the address adder/subtractor, so there are 2 fewer inputs to be multiplexed.
    All of the above implies that quite a few changes have to be applied to the simulation, as well as to the assembler. As for the latter, I am already planning to rewrite it from scratch, in C rather than C++, in a way that will make it more feasible to eventually rewrite it in its own language, i.e. make it a native assembler which will run on the simulated (and hopefully physically built) computer and produce code for that same computer. As of now it is of course a cross-assembler running on a PC.

    On the topic of building hardware -- I recently built a board which may become, at least temporarily, a part of this machine: a 256-bit PROM (16 x 16-bit words), made of switch banks, 1N4148 diodes and a handful of logic ICs. I gather that in the building and debugging phase it will be handier than burning an EEPROM for each change. As a preliminary test result, it seems it can handle accesses at 1 MHz, though I don't think such performance will ever be needed for a 32-byte PROM.
    This board is described in more detail in my other project.

  • Adding byte addressing; and ramblings on some other topics

    Pavel 01/13/2020 at 06:36 • 0 comments

    2020-01-04: MemAccess wiring bug fixed: JSRg instruction now works

    What to do with direct addressing modes?
    There are two of them: local and global. Local means only a page of 256 words is accessible, while global instructions can access any memory location, at the penalty of the instruction being 2 words long and taking 1 clock cycle longer to execute.

    As jumps are used fairly frequently (in the integer calculator they are encountered once per 5-10 instructions), leaving out the local instructions will bloat the code by 10 to 20%, and slow execution by somewhere near 5% in terms of clock counts (this should be measured; for now it is a guess after looking at the code).
    But these local jumps and load/stores are a real pain in the ass when the program exceeds 256 words in length, because one must be careful that all of them stay within their local page and do not try to reach an adjacent one, as this will surely fail.
    A more sophisticated assembler, which automatically checks for this locality and automatically chooses global or local memory access/jump, would make assembly a highly iterative process: as these kinds of instructions differ in length, each change affects all the addresses and the alignment of the following code.
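The iterative process mentioned above can be sketched as a small relaxation loop. This is not the project's assembler, only an illustration: the program representation, the `relax` function and the rule that a jump is "local" when its own address and its target share a page are assumptions made for the example. Sizes only ever grow from 1 to 2 words, so the loop is guaranteed to stop.

```python
PAGE_WORDS = 256

def relax(program):
    """program: list of ("label", name), ("op",) or ("jmp", name) items.
    Returns {index: size_in_words} for every jump after relaxation."""
    size = {i: 1 for i, item in enumerate(program) if item[0] == "jmp"}
    while True:
        # Pass 1: assign addresses using the current size guesses.
        addr, labels, where = 0, {}, {}
        for i, item in enumerate(program):
            if item[0] == "label":
                labels[item[1]] = addr
            else:
                where[i] = addr
                addr += size.get(i, 1)
        # Pass 2: widen any local jump whose target lies in another page.
        changed = False
        for i, s in size.items():
            target = labels[program[i][1]]
            if s == 1 and where[i] // PAGE_WORDS != target // PAGE_WORDS:
                size[i] = 2
                changed = True
        if not changed:
            return size

program = ([("jmp", "far"), ("label", "near"), ("op",), ("jmp", "near")]
           + [("op",)] * 300
           + [("label", "far"), ("op",)])
sizes = relax(program)
```

In this example the first jump has to become a 2-word global jump, which shifts every later address by one word; the second pass confirms that the short jump to `near` still fits in its page.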

    PC-relative addressing -- not appealing.
    Making these local references PC-relative rather than absolute would ameliorate the problem slightly, but it does not solve the issue of code length and jump/memory access lengths changing in the process of adjustment.

    Padding with "holes" while preserving page alignment -- this might actually be of some use.
    Part of the solution would be to make holes in the code, so that parts which fit inside one page have all their internal references local, while references to other parts of the program are global. If a part of the program does not take up the full page, the remaining locations are filled with zeros until the start of the next page. Such padding would be quite a waste of memory space, but would offer slightly faster execution without the complications caused by shifting code across page boundaries.
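The amount of padding this scheme needs is simple modular arithmetic. A minimal sketch, with a hypothetical function name and the 256-word page size taken from the text:

```python
PAGE_WORDS = 256

def padding_to_next_page(addr):
    """Zero words needed after `addr` words of code so that the next
    part of the program starts on a page boundary."""
    return (-addr) % PAGE_WORDS
```

For example, a routine ending at word address 300 would need 212 padding words before the next page starts at 512, which gives a feel for how much memory the "holes" can cost.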

    Thoughts on byte addressing

    It is starting to seem inevitable that byte addressing should be added. This will call for a hardware update, so that special instructions for byte loads and stores can be implemented (mainly memory addressing circuitry, but also changes to the memaccess decoder), as well as an update to the assembler.

    Right now the machine is happily working in word addressing mode, but this seems good only for number crunching, while working with characters has not yet been explored. It now seems that this could bite me in the ass in the future.

    Byte addressing is implemented.
    The simulation is now at version 5.0.
    The version was incremented due to the addition of new functionality, namely byte addressing.
    A comment by Ken Yap regarding the lack of byte addressing provoked thoughts on this topic. Although in my reply I wrote that this seemed to be of no big consequence, afterwards I started to think about it more deeply. So far I am implementing a calculator program, and it could not care less whether byte addressing is present, as it works with words. But to make programming for other problems easier on this computer, adding byte addressing seemed a better and better idea. There are thoughts of implementing a C compiler for it in some distant future, and that is the kind of thing which would benefit from the ability to address individual bytes.
    So, as of right now, I have implemented it. On one hand, this was not as difficult as I thought it would be. On the other hand, some sacrifices were made (not that these were real): total addressable memory has been reduced at least 4 times: first, halved because the addressable unit is now an 8-bit byte instead of a 16-bit machine word; second, I need an additional bit in the instruction word for loads and stores to indicate that I want load...


  • Name for the computer; asm v3

    Pavel 12/11/2019 at 06:48 • 0 comments

    I am not good at coming up with names, but after some thinking I've come up with a name for this computer at last:
    from now on it will be called ECM-16/TTL. This is itself quite a generic name, meaning Electronic Computing Machine, 16-bit, based on TTL logic chips (of the 74HC family). A quick search on Google didn't return this name for any other homebrew computer/CPU, so I am claiming it for my machine in development.

    On the assembler front, it is now at version 3 and becoming a treat to use, as it now supports labels and directives (though no expressions yet). Constantly referring to instruction bit patterns and hand-calculating jump and load/store addresses is no longer needed; all of this is done automatically.

    During development of this assembler, several bugs were found in it, and also in the wiring of the computer itself. They are now fixed.

    There are some things that are yet to be added to the assembler -- I think adding support for at least rudimentary expressions (like adding a constant to a label) will be handy. I am also thinking about adding PC-relative addressing for short jumps and load/stores; this way awkward situations on page and block boundaries could be avoided. But this means adding another adder to the machine, this time to the addressing circuitry, the thing I've tried to avoid from the very beginning. Also, the assembler logic will have to be changed a bit.

    Below are descriptions on some of the aspects of ECM-16/TTL:

    Reason for 24-bit addresses

    At first, I was content with having 64k words for this computer design.
    But after a while, when designing memory access instructions, I faced a problem: the address must be somewhere. The 16 bits needed to address 64k fit comfortably in general purpose registers, and that size is comfortable for shuttling around on the main bus. But I wanted instructions which have the address encoded into them. I already had ALU instructions with a constant value encoded into them, like ADD aX 0xff, and this scheme can also be employed for addressing. Thus, there is an instruction like LD aX [0xff], and it is laid out in the instruction word as [high byte][low byte] => [opcode][address]. But this can only address 256 words (what I call a single "page") of memory, which is far from the desired 64k words.

    For an instruction to contain the whole address, it has to be longer than one word. Naturally, it would take up two words, as this design does not support byte addressing. So, I had 32 bits to play with. One layout could be [opcode][high 8 bits of address]:[low 8 bits of address][not used], but it would be too awkward to implement in hardware. Another could be [opcode][not used]:[high 8 bits of address][low 8 bits of address], which is much more natural in this design.

    Now, though, I have 8 more bits than I need to address 64k locations, so why not use them? Let's define a block of memory as the 64k locations which can be addressed by one 16-bit word. If we then use the byte right next to the opcode as the block index, up to 256 blocks can be addressed, and therefore 16M memory locations can be addressed by a 24-bit address. In reality, I don't think the computer will ever have that much memory, mostly because it would be relatively expensive (SRAM chips are to be used for memory).
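The two-word layout [opcode][block index]:[high byte][low byte] can be illustrated with a pair of packing helpers. This is a sketch of the layout as described, not the real instruction encoding: the function names and the opcode value 0xA5 are placeholders, and the "opcode" byte is treated as opaque (in the real machine it presumably also carries the register field).

```python
def encode_global(opcode, address):
    """Pack an 8-bit opcode byte and a 24-bit address into two 16-bit words."""
    if not 0 <= opcode <= 0xFF or not 0 <= address <= 0xFFFFFF:
        raise ValueError("opcode must fit in 8 bits, address in 24 bits")
    block = address >> 16              # which 64k block
    word0 = (opcode << 8) | block      # [opcode][block index]
    word1 = address & 0xFFFF           # [high byte][low byte] within the block
    return word0, word1

def decode_global(word0, word1):
    """Recover (opcode, 24-bit address) from the two instruction words."""
    opcode = word0 >> 8
    address = ((word0 & 0xFF) << 16) | word1
    return opcode, address

words = encode_global(0xA5, 0x12ABCD)   # 0xA5 is a made-up opcode byte
```

With this layout the second word can go onto the 16-bit bus unchanged, and only the low byte of the first word needs routing to the top 8 address bits.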

    Memory space can be presented as a hierarchical structure, with levels differing by ease of access (number of clock cycles needed):

    Page -- 256 words, which can be accessed with the shortest and fastest instructions;
    -- I am thinking about re-making in-page addressing into PC-relative addressing, so that a close load/store or jump could reach +-127 memory locations, independent of page/block boundaries. This change will need the addition of another adder, 24 bits wide, so that the [base + offset] address of the location is calculated on the fly. This may also add quite a lot of complexity to the assembler.

    Block -- 65536 words, addressed by value in one of the general purpose...
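The PC-relative idea noted under "Page" above amounts to sign-extending the 8-bit field from the instruction and adding it to the 24-bit PC. A minimal sketch, assuming the offset is stored in two's complement (which technically allows -128 as well as the +-127 mentioned); the function names are hypothetical:

```python
def sign_extend8(value):
    """Interpret the low 8 bits as a two's complement number (-128..127)."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value

def pc_relative(pc, offset_byte):
    """24-bit [base + offset] address, as the extra adder would compute it."""
    return (pc + sign_extend8(offset_byte)) & 0xFFFFFF
```

Because the addition is done on the full 24 bits, a jump from 0x0000FF by +2 lands on 0x000101 in the next page, and a jump from 0x010000 by -1 lands on 0x00FFFF in the previous block, which is exactly the boundary independence the note is after.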


  • Assembler, first version

    Pavel 11/24/2019 at 13:23 • 0 comments

    Just another quick note:

    The first version of the assembler has been created. It seems to output good machine code in hex, which can then be imported into the ROM of the simulated computer.

    It is not very sophisticated; it just converts mnemonics into machine code, and there is no support for directives and labels yet.

    The source code, a compiled binary for Windows, a .bat file with commands for compiling the source with g++, and a test file are in an archive in the "Files" section.

    On another note, work on making instruction cycles shorter has run into some problems -- it turned out that much more work is needed than I thought at first. So far my attempts have been hindered by data/clock races: the next instruction is fetched and starts to execute while the previous one is still in progress, or bus congestion occurs.

  • Small update on simulation

    Pavel 11/18/2019 at 06:47 • 0 comments

    Turns out there is still room for some optimisations.

    I found a way to squeeze a 3rd register address into the 16-bit ALU instruction, so operations on two operands with the result saved into a 3rd are now possible. For example, ADD a0 b1 c2 adds the values stored in registers 1 and 2 and stores the result in register 0. This in no way breaks the previous mode, where ops look like ADD a0 b1: the values in registers 0 and 1 are added and the result overwrites register 0. These new operations became possible by slightly changing the addressing wiring in the Register File. As it maintains compatibility with the previous instructions, the new mode has the limitation that only registers 2 to 7 can be used as operand C.

    Other changes to ALU operations were made with some modification of the ALU itself. The wiring was changed slightly, so one-operand instructions (such as shifts) can now store the result into another register, without overwriting the contents of the source.

    Also, a barrel rotator was added to the ALU, which can rotate the word left by up to 15 bits.
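What the barrel rotator computes can be written as a one-line function. This is a behavioural sketch only (the hardware would be layers of muxes); the function name is hypothetical.

```python
def rotl16(value, n):
    """Rotate a 16-bit word left by n bit positions (n taken modulo 16)."""
    n &= 0xF
    value &= 0xFFFF
    return ((value << n) | (value >> (16 - n))) & 0xFFFF
```

A left-only rotator is enough for both directions on a 16-bit word, since rotating left by 12 gives the same result as rotating right by 4.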

    The encoding of the new register addresses for the above operations, as well as of the number of bits to rotate by, was achieved by utilising previously unused bits of the instruction (or by conditionally reinterpreting some bits). This turned out to take surprisingly small changes to the ALU/Register File wiring and decode logic.

    One encoding which was changed is that the +/- bit is now used to indicate shift direction, instead of the previously used dedicated Left/Right bit. All the other changes are additions to the previously existing set.

    On another note, I found a way to speed up execution by shaving clock cycles off some of the instruction types. This is done by staggering instruction execution, so that while the last step of the current cycle is in progress, the fetch of the next instruction begins.

    This was possible for instructions which don't write to the Memory Address Register (MAR) at their last step, so no bus congestion is created.

    The instructions affected are ALU, MOV and Load/Store instructions. When they follow one another in succession, ALU and MOV instructions take only 2 clock cycles instead of 3, and Load/Store operations take 3-4 clocks instead of 4-5, depending on the operation.

    The changes needed to achieve this were also surprisingly small: 3 or 4 logic gates and a couple of wires were added to the simulated processor.

    The files with updated instruction set as well as simulation can be found in Computer04.7z archive in "Files" section. 


    Things to add to simulation later:

    1. Start-up sequence -- for the first cycle of 8 clocks nothing is done and Reset is applied to the whole system, then execution commences with the instruction at address 0x000000.

    2. Interrupt handling. For now it is still a somewhat obscure topic for me.

    3. More memory :) For now a placeholder is used, with 256-word ROM and RAM. For testing purposes, that is enough for now.


    Right now I need to write an assembler (however crude) in C++, so I won't need to type 0's and 1's into a spreadsheet by hand and look up every bit pattern, as it has become apparent that this process is incredibly slow and error-prone.


    Quick Update: today it occurred to me that I can merge the first two steps of the instruction cycle into one, by bypassing the MAR when addressing from the Program Counter or Stack Pointer. This can be done by rerouting some buses and using two 4:1 24-bit multiplexers instead of one 8:1. So, in the coming days I'll try to implement this. It will not change any instruction encoding in any way, but will make all instruction cycles 1 clock cycle shorter. This, combined with the recently implemented pre-fetching (described above), will mean ALU and MOV operations effectively execute at 1 instruction per clock cycle in some situations.

  • Simulation completion

    Pavel 11/06/2019 at 06:12 • 5 comments

    There was a year-long hiatus in my work on this project, but I have recently resumed it.

    Looking with fresh eyes, I found several things which I hadn't noticed previously, such as commonalities in the decode structure of different instructions. This led to an overhaul of the decoding circuitry, reducing redundancies while at the same time adding some new instructions (or rather variations on them) that exploit the commonalities found.

    The CPU is now Turing complete. Though there are still possibilities for adding further instructions, the existing set is quite big already.

    There are multiple operations for manipulating data with the ALU; others move data between registers, load and store data from and to memory, and perform unconditional and conditional jumps.

    The registers are of two types: General purpose (GPR) and Special (SpR). Data in GPRs can be manipulated in the ALU, while data in SpRs cannot. Data can be moved between all registers, and also loaded/stored in RAM, although there are some restrictions for particular SpRs.

    The ALU can only perform its functions on data from GPRs. Data from memory needs to be explicitly loaded or stored as a distinct operation.

    There is no microcode -- all instructions are decoded by combinational logic.

    The computer simulation has very rudimentary I/O as of right now.

    It also lacks any interrupt handling.

    The last two points will be worked on in the future.

    In the Files section there is an archive with the simulation files as well as an Excel spreadsheet describing all instructions. It also contains a manual "assembler", which makes programming this computer slightly easier.

    The simulation files can be opened with Digital logic circuit simulator software.

  • ALU

    Pavel 08/20/2018 at 07:32 • 3 comments

    This is the description of the Arithmetic-logic unit of this "computer". 

    Most of it I conceived a year ago, and now, after some modifications, it is one of the most stable parts of the system (along with the register file).

    Here is its inner organisation:

    The ALU accepts two 16-bit operands, A_in and B_in, and outputs one 16-bit value, Y_out, which for ease of debugging is also output to 16 LEDs in this schematic.

    A_in and B_in are supplied from the Register File, but B_in can be substituted with a constant value in the range 0 to 255, supplied through the Const input. Selecting what goes in as the B operand is accomplished by the 1-bit control line Sel_B through the Incrementor module (just a 2-to-1 mux).

    Next, the B operand goes through the Negator module (16 parallel XOR gates), where the 1-bit control line Sub dictates whether B passes through unchanged or has all its bits flipped. This same line flips the carry bit that goes into the adder, to facilitate subtraction in two's complement.
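The Negator plus carry flip is the usual identity A - B = A + ~B + 1 on 16-bit words. A small behavioural sketch, with a hypothetical function name; the `c_in` parameter stands for whatever carry the instruction supplies, and Sub is modelled as XOR-ing both the B bits and that carry:

```python
def add_sub16(a, b, sub, c_in=0):
    """Return (result, carry_out) of the adder path for 16-bit words."""
    b_mod = (b ^ (0xFFFF if sub else 0)) & 0xFFFF   # 16 XOR gates driven by Sub
    carry = c_in ^ (1 if sub else 0)                # Sub also flips the carry in
    total = (a & 0xFFFF) + b_mod + carry
    return total & 0xFFFF, total >> 16
```

With this convention a carry out of 1 after a subtraction means "no borrow" (5 - 3), and 0 means the result wrapped around (3 - 5).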

    In the next stage there are several blocks in parallel, namely the Shifter, Fast Adder and Logic unit.

    The Shifter works only on the A_in operand, so it ignores whatever the B value is. This module takes in 16-bit D_in and 1-bit C_in (carry in) and outputs 16-bit D_out and 1-bit carry out. The three controls are: A, for Arithmetic shift; L/R, for choosing shift direction; and sh/RC, for choosing between Shift and Rotate through carry. The Shifter mostly consists of 2-to-1 muxes with a handful of logic gates.
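A behavioural sketch of the five Shifter operations is given below. It makes two assumptions that the description does not state: each operation moves the word by exactly one bit position, and the carry out is the bit that falls off the end. The operation is selected by name here rather than by the three control lines, whose exact encoding is not given.

```python
def shifter(d_in, c_in, op):
    """Return (d_out, carry_out) for a one-position shift or rotate."""
    d_in &= 0xFFFF
    c_in &= 1
    if op == "SHL":    # logical shift left, 0 shifted in
        return (d_in << 1) & 0xFFFF, d_in >> 15
    if op == "SHR":    # logical shift right, 0 shifted in
        return d_in >> 1, d_in & 1
    if op == "ASHR":   # arithmetic shift right, sign bit kept
        return (d_in >> 1) | (d_in & 0x8000), d_in & 1
    if op == "ROLC":   # rotate left through carry
        return ((d_in << 1) | c_in) & 0xFFFF, d_in >> 15
    if op == "RORC":   # rotate right through carry
        return (d_in >> 1) | (c_in << 15), d_in & 1
    raise ValueError(op)
```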

    The Fast adder takes in C_in, A_in and B, and outputs S and C_out. It is composed of four chained 4-bit fast adder units. It is constructed from simple logic gates throughout, with no fancy 74181/2/3 chips.
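The structure "four chained 4-bit fast adder units" can be sketched with generate/propagate signals. This is an illustration of the carry look-ahead technique in general, not a transcription of the schematic; function names are hypothetical. The loop computes the carries one after another for brevity, whereas in gates each carry is expanded into a flat expression of g, p and the unit's carry in.

```python
def cla4(a, b, c_in):
    """4-bit carry look-ahead unit: returns (sum, carry_out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]     # generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]   # propagate
    c = [c_in]
    for i in range(4):
        c.append(g[i] | (p[i] & c[i]))                  # c[i+1] = g + p*c
    s = sum((p[i] ^ c[i]) << i for i in range(4))
    return s, c[4]

def add16(a, b, c_in=0):
    """Four chained 4-bit units; the carry ripples only between units."""
    result, carry = 0, c_in
    for k in range(4):
        nib, carry = cla4((a >> 4 * k) & 0xF, (b >> 4 * k) & 0xF, carry)
        result |= nib << (4 * k)
    return result, carry
```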

    The Logic unit operates bitwise on the A_in and B_in inputs and has three outputs. Essentially, it is a compound XOR gate (made from ANDs and NORs + inverters), so AND, OR and XOR are obtained from the same circuit.
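One way such a compound gate can yield all three results is shown below. The exact gate arrangement in the schematic is not described, so this particular decomposition (XOR as the NOR of the AND term and the NOR term) is an assumption made for illustration.

```python
def logic_unit(a, b):
    """Return (AND, OR, XOR) of two 16-bit words using AND, NOR and inverters."""
    m = 0xFFFF
    and_ = a & b & m
    nor_ = ~(a | b) & m
    or_ = ~nor_ & m             # inverter on the NOR output
    xor_ = ~(and_ | nor_) & m   # NOR of the AND and NOR terms
    return and_, or_, xor_
```

The AND and NOR terms are needed for XOR anyway, so tapping them off gives AND directly and OR at the cost of one extra inverter per bit.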

    The last stage is the big 8-to-1 16-bit mux, which is controlled by the 3-bit F (as in Function) control line. It chooses which function result is output from the ALU.

    Here are these functions:

    000    Zero        -- no matter what the inputs are, all output bits are set to '0'
    001    A shifted   -- whatever comes from the Shifter unit (itself controlled by the A, L/R and sh/RC lines) -- could be SHL, SHR, ASHR, ROLC, RORC
    010    A           -- A_in goes straight through to the output
    011    ~A          -- A_in gets inverted
    100    ADD (A,B)   -- result from the adder, can be any of ADD, SUB, ADDC, SUBC
    101    AND (A,B)
    110    OR (A,B)
    111    XOR (A,B)

    In total, this ALU is capable of 28 different operations, though in many cases some of the operations can give the same result.
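The final selection stage follows directly from the function table above. In this sketch the Shifter and adder results are passed in as already-computed values, since all blocks work in parallel and the mux only picks one; the function and parameter names are hypothetical.

```python
def alu_select(f, a, b, shifted, adder_sum):
    """8-to-1 mux: pick one 16-bit result by the 3-bit function code F."""
    inputs = [
        0,           # 000 Zero
        shifted,     # 001 output of the Shifter
        a,           # 010 A
        ~a,          # 011 ~A
        adder_sum,   # 100 output of the adder
        a & b,       # 101 AND
        a | b,       # 110 OR
        a ^ b,       # 111 XOR
    ]
    return inputs[f & 0b111] & 0xFFFF
```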

  • At the beginning...

    Pavel 08/16/2018 at 07:47 • 0 comments

    So, a couple of years ago I saw people on the internet making their own computers and CPUs from scratch at home, the kind of thing that the average person would think is possible only in an industrial setting by a team of experts. Since then this idea has infected my mind, and I started to learn electronics and digital circuits in my spare time, then bought a soldering iron and a small assortment of electronic components, and started to experiment.

    A year ago I first started to think seriously about actually building a CPU; then I thought that using individual transistors would be cool (I still think so, but it means A LOT of tedious work, with at least half of the circuits not working as designed at first). At that time I read several books about CPU design (But How Do It Know?, DIY Calculator, and others). Based on my new knowledge and an estimate of my limitations, I decided that this computer should be based around a 16-bit datapath and have 16-bit-wide instructions. It would also have 16-bit memory (data and address), all for simplicity of implementation.

    The instructions are not to be microcoded, but rather decoded from the instruction word by combinational logic. The instructions are 16 bits wide explicitly for ease of decoding. Longer instructions would be even easier to decode, but they might increase component count and overall cost beyond what can be allotted to the project.

    Another feature is a register file of 8 16-bit registers, so they are addressed by 3-bit addresses. They are to be operated as:  A <-- ALUop(A,B). The general architecture is Load/Store.

    Well, after that I put this project on hold, and recently, a month or so ago, I returned to it. I made simulations in Digital for the ALU and Register File first, and started to design the rest of the system around them. Right now I am implementing instructions for this simulated computer, and it already kind of works, with a little program in machine language.

    The instructions available now are:

    ALU instructions -- these have the simplest decoder, because an ALU instruction is almost the same as the instruction word for the hardware. There are quite a lot of different instructions, so the ALU can do many things with its operands, which can be one of the registers, two of the registers, or one of the registers and an 8-bit constant encoded in the instruction.

    Short jumps (within 256-byte page) -- unconditional, and on condition of the Carry, Negative, and Zero flags. The lower 8 bits of the jump address are encoded in the instruction itself.

    Load from page -- can load a word into one of the general purpose registers from within the current memory page (256 byte long), as the lower 8 bits of the memory address are encoded into the instruction.

    Load immediate -- loads the word which is at the next address after the instruction.
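For the in-page jump and load instructions listed above, the target address is formed by keeping the page part of the current address and replacing its lower 8 bits with the field from the instruction. A minimal sketch, assuming the "current page" is taken from the program counter; the function name is hypothetical.

```python
def in_page_target(pc, low8):
    """Keep the page part of PC, take the lower 8 bits from the instruction."""
    return (pc & ~0xFF) | (low8 & 0xFF)
```

No adder is involved, only wiring, which is why these were the cheapest addressing instructions to implement; the price is that code at 0x12FF cannot reach 0x1300 this way.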

    Here is a screenshot of this "computer" simulation running a small program in the Digital logic simulator (only part is seen):

    This is work in progress (in the beginning stage of it), so the layout is subject to great change.




peter wrote 12/20/2019 at 17:39 point

Hi, great project!! How did you do the transfer from the Digital software to schematics and PCB design? Any tools, or made by hand?

Pavel wrote 12/26/2019 at 05:33 point

All by hand. Digital has a library of DIL shapes for chips, so I start by replacing the conventional logic element shapes with these, and then use the result as a reference when soldering the thing. I do not make or order custom PCBs; everything is point-to-point soldering on perfboard.

Ken Yap wrote 11/24/2019 at 14:15 point

So no byte addressing? I suppose that keeps the addressing simple. Still, it means you will waste half the storage for characters and character strings.

This is the kind of CPU BCPL was targetting, but to handle strings, BCPL first had byte packing and unpacking routines, and later the % infix byte indirection operator.


Pavel wrote 11/24/2019 at 17:07 point

Well, I have consciously made such choice. And it doesn't mean that half of the storage is wasted, as character set could be made richer with more pseudographics and other non-Latin scripts (in my country the script used is Cyrillic, for example), and also maybe colour info encoded in the higher bits. On the other hand, basic Unicode is also 16-bit, so going this route, the strings can be made just that.

And I intend to use fairly modern and capacious storage, so having text info using up twice as much space than it would if I used byte-addressable memory is not that big of a deal.


Julian wrote 08/20/2018 at 23:31 point

Nice.  I think yours might be the only project on this site using Digital other than my 6-bit CPU :)

Out of interest, in case you need it I have a Digital plug-in library that provides a variable width/length FIFO component, which I've implemented for my planned IO processor project but haven't got around to actually using yet.  If you have any need for such a thing, let me know, and I'll upload it somewhere so you can use it.
