
Fundamentals of Digital Computation and Introduction to CPUs

A project log for The Hardware Side of Computation

An exploration of CPUs, GPUs, and FPGAs, and how they each perform computation

PointyOintment • 01/30/2017 at 06:02

The central processing unit, or CPU, is a ubiquitous type of computational device these days. Every desktop and laptop computer has one (or more), of course, and so do cell phones, but they are also found in set-top boxes, TVs, modems, and cameras, just to name a few. As well, every microcontroller includes a (small) CPU, so they are also found in Arduino boards, remote controls, light timers, engine control units, and toy robots. This post will cover the basic workings of CPUs. It assumes that you have a basic familiarity with binary numbers. Throughout the post there are links to Wikipedia articles where you can read more about each subject.

Switches and logic

The smallest unit found within a CPU is the switch. In all modern CPUs, these switches are transistors. In the past, vacuum tubes were used. You could also construct a CPU out of mechanical switches.

A switch generally has two states, on and off. Therefore, just about all CPUs work in binary, where everything is represented by ons and offs, aka ones and zeros. It is also possible (and sometimes done as a hobby project) to build a ternary CPU, where each switch can have three states, but this is not done commercially because it is less practical.

These switches are arranged into logic gates. A logic gate is a device that takes one or more input signals and produces an output based on those inputs. These implement Boolean logic, first described by George Boole in 1847, which deals with binary states, in this context called true and false. The basic Boolean logic functions are AND, OR, and NOT. AND takes two inputs, and produces a true output if and only if both of its inputs are true, producing a false output otherwise. OR takes two inputs, and produces a true output if either of its inputs is true, producing a false output only if both inputs are false. NOT takes one input, and produces the output opposite to the state of its input. The truth tables for AND, OR, and NOT are as follows:

AND

Input 1 Input 2 Output
false false false
false true false
true false false
true true true

OR

Input 1 Input 2 Output
false false false
false true true
true false true
true true true

NOT

Input Output
false true
true false

If you have programming experience, these functions should already be familiar to you.
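
If you want to play with them, Python's Boolean operators (just one convenient choice of language) implement these functions directly, and a few lines reproduce the truth tables above:

    # Reproduce the NOT, AND, and OR truth tables using
    # Python's built-in Boolean operators.
    for a in (False, True):
        print("NOT", a, "->", not a)
    for a in (False, True):
        for b in (False, True):
            print(a, b, "| AND:", a and b, "| OR:", a or b)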

There are also other functions, including NAND (NOT-AND), NOR (NOT-OR), and XOR (exclusive OR). NAND and NOR are simply an AND and an OR, respectively, with a NOT applied to the output, so everything in their output columns is the opposite of what it is for AND and OR. XOR is similar to OR, but produces a false output when both inputs are true.

It turns out that all of the other Boolean operations can be composed of arrangements of multiple NANDs or of multiple NORs; each of these two functions is, on its own, "functionally complete". For this reason, modern digital circuits tend to be built from all NANDs or all NORs, because using a single gate type makes manufacturing easier.
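
To see this universality for yourself, here is a small Python sketch that builds NOT, AND, OR, and XOR out of nothing but a NAND function:

    def NAND(a, b):
        return not (a and b)

    def NOT(a):        # a NAND with its two inputs tied together
        return NAND(a, a)

    def AND(a, b):     # a NAND followed by a NOT
        return NOT(NAND(a, b))

    def OR(a, b):      # De Morgan's law: a OR b = NOT(NOT a AND NOT b)
        return NAND(NOT(a), NOT(b))

    def XOR(a, b):     # the classic four-NAND construction
        n = NAND(a, b)
        return NAND(NAND(a, n), NAND(b, n))

    for a in (False, True):
        for b in (False, True):
            print(a, b, "| AND:", AND(a, b), "| OR:", OR(a, b), "| XOR:", XOR(a, b))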

As mentioned above, the electronic implementations of these functions are called logic gates. There are multiple competing systems of symbols used to represent gates in circuit diagrams, such as the ANSI distinctive-shape symbols and the rectangular IEC symbols.

More logic circuits

Another common type of circuit composed of the parts described above is the flip-flop, or latch. These circuits enable the storage of information. An example is the SR (set–reset) latch; DrJolo has published an interactive diagram of one built from individual switches.

An SR latch can also be made from logic gates:

Animated diagram by Napalm Llama, used under CC-BY 2.0
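
If you can't view the diagrams, here is a rough software model of the same circuit: an SR latch made of two cross-coupled NOR gates. Real hardware settles continuously through feedback; the loop below just re-evaluates the gates until the outputs stop changing, which stands in for that settling.

    def NOR(a, b):
        return not (a or b)

    class SRLatch:
        """SR latch built from two cross-coupled NOR gates."""
        def __init__(self):
            self.q, self.q_bar = False, True

        def update(self, s, r):
            # Re-evaluate the gates until the feedback loop settles.
            for _ in range(4):
                q = NOR(r, self.q_bar)
                q_bar = NOR(s, q)
                if (q, q_bar) == (self.q, self.q_bar):
                    break
                self.q, self.q_bar = q, q_bar
            return self.q

    latch = SRLatch()
    print(latch.update(s=True,  r=False))   # set   -> True
    print(latch.update(s=False, r=False))   # hold  -> still True
    print(latch.update(s=False, r=True))    # reset -> False
    print(latch.update(s=False, r=False))   # hold  -> still False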

Several flip-flops connected together and sharing a clock form a register, which is a form of memory.
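
As a minimal sketch of that idea (assuming a rising-edge-triggered design; clocking is explained in the next section):

    class Register:
        """A toy n-bit register: captures its input only on the
        rising edge of the clock, like a bank of D flip-flops."""
        def __init__(self, bits=8):
            self.bits = bits
            self.value = 0
            self._last_clk = False

        def tick(self, clk, data):
            if clk and not self._last_clk:               # rising edge
                self.value = data & ((1 << self.bits) - 1)
            self._last_clk = clk
            return self.value

    reg = Register()
    reg.tick(clk=False, data=0xAB)   # no edge: value stays 0
    reg.tick(clk=True,  data=0xAB)   # rising edge: value becomes 0xAB
    print(hex(reg.value))            # 0xab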

Clocking

To transfer data between different flip-flops, they need a trigger signal. This is provided by the clock. A clock in a processor is a signal that alternates between on and off at a specific frequency. (It has nothing to do with telling the time of day—that is the function of a real-time clock, which is not usually part of the CPU at all.)

In modern computer CPUs, the main clock frequency can get up to 2 to 4 GHz—that's two to four billion oscillations between on and off per second. However, there is often a clock generator circuit that generates clock signals at different frequencies for different parts of the CPU. For example, the front-side bus (which the CPU uses to transfer data in and out) is often clocked more slowly than the core(s).
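
As a toy illustration of deriving a slower clock: a flip-flop that toggles on every rising edge of its input produces exactly half the input frequency. Real clock generators use PLLs and chains of dividers, but the divide-by-two building block looks like this:

    def divide_by_two(clock_samples):
        """Toy clock divider: a flip-flop that toggles on every
        rising edge of the input clock halves its frequency."""
        out, last = False, False
        divided = []
        for clk in clock_samples:
            if clk and not last:     # rising edge of the fast clock
                out = not out
            last = clk
            divided.append(out)
        return divided

    fast = [False, True] * 8       # 8 cycles of the fast clock
    slow = divide_by_two(fast)     # 4 cycles: half the frequency
    print("fast:", "".join("1" if c else "0" for c in fast))
    print("slow:", "".join("1" if c else "0" for c in slow))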

CPU architecture and subsystems

There is a huge variety of different CPU architectures, all optimized according to different constraints and requirements. For just about any method of operation you can think of, there's been a CPU designed that works that way.

However, the subsystems commonly found in a CPU are the following:

Control unit

The control unit, or CU, manages the flow of data between the other subsystems, and provides them with control signals. Typically, the CU is what takes an instruction (described below) from the computer program that the CPU is executing, and breaks that instruction down into steps to be taken by the other subsystems. These steps can include fetching data from memory, sending it to the ALU and telling the ALU what to do with it, sending data to or receiving data from external hardware, and storing the results back in memory.

Some control units break down the instructions using hardwired circuitry, while others use microcode. Microcode is basically a reprogrammable lookup table of what steps to perform for a given instruction. Using microcode makes design and development of a CPU easier, and it enables higher-level instructions to be implemented, which makes assembly programming easier. Microcode also makes it possible to use the same instruction set across several microarchitectures (designs of the CPU's hardware subsystems). As well, it allows patches to work around hardware bugs in a CPU design.
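
Here is a deliberately oversimplified sketch of the lookup-table idea; every instruction and micro-step name below is made up for illustration:

    # A grossly simplified picture of microcode: the control unit
    # looks each instruction up in a table of micro-steps and issues
    # the corresponding control signals. All names here are made up.
    MICROCODE = {
        "ADD":  ["fetch_operand_a", "fetch_operand_b", "alu_add", "store_result"],
        "LOAD": ["compute_address", "read_memory", "store_result"],
        "JMP":  ["compute_address", "write_program_counter"],
    }

    def execute(instruction):
        for micro_step in MICROCODE[instruction]:
            print("control signal:", micro_step)

    execute("ADD")
    # A microcode patch changes the table, not the silicon:
    MICROCODE["LOAD"].append("workaround_for_hardware_bug")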

Arithmetic logic unit

The part of the CPU that actually does the calculations is called the arithmetic logic unit, or ALU. The ALU takes operands (data to operate on, typically two operands at once, each of which is a binary number) and an opcode (the operation it is told to perform by the CU). It uses logic circuitry to perform the requested operation on the operands, and when it has done so, the result appears at its output.

The operations performed by the ALU are arithmetic (addition and subtraction, multiplication and division, incrementing and decrementing, negation) and bitwise logic (bitwise AND/OR/etc. and left/right shifts). Multiplication and division are often omitted from simpler ALUs, and were not available in microprocessors until the late 1970s, because arbitrary multiplication and division are more difficult to do at the binary hardware level. However, multiplication and division by powers of two are easy: simply shift the bits to the left or to the right. (This is analogous to multiplying or dividing by a power of 10 in the base-10 system most humans are more familiar with.)
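
You can check the shift trick in any language with bitwise operators; in Python:

    x = 13
    print(x << 3, x * 2**3)    # 104 104 (shifting left by 3 multiplies by 8)
    print(x >> 2, x // 2**2)   # 3 3 (shifting right by 2 divides by 4, discarding the remainder)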

The ALU also has some status outputs (commonly including carry-out, zero, negative, overflow, and parity, which all indicate properties of the output number, and are described in more detail in the Wikipedia article). These are used in subsequent operations (for example, the carry-out becomes the carry-in when adding numbers wider than the ALU can handle at once) or to control conditional branching (e.g. if statements in higher-level languages).
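
Tying the last two paragraphs together, here is a toy 8-bit ALU. The opcode names and the exact flag definitions are invented for illustration and don't correspond to any real CPU:

    def alu(opcode, a, b, bits=8):
        """Toy ALU: returns (result, flags). Opcode names and
        flag choices are made up for illustration."""
        mask = (1 << bits) - 1
        ops = {
            "ADD": a + b,
            "SUB": a - b,
            "AND": a & b,
            "OR":  a | b,
            "XOR": a ^ b,
            "SHL": a << 1,
            "SHR": a >> 1,
        }
        raw = ops[opcode]
        result = raw & mask                 # keep only the low 8 bits
        flags = {
            "zero":     result == 0,
            "negative": bool(result & (1 << (bits - 1))),  # top bit set
            "carry":    raw != result,      # raw result didn't fit in 8 bits
        }
        return result, flags

    print(alu("ADD", 200, 100))  # result 44, carry set: 300 overflows 8 bits
    print(alu("SUB", 5, 5))      # result 0, zero flag set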

Floating point unit

Some CPUs include a floating-point unit (FPU, commonly called a math coprocessor back when it was a separate chip), which is optimized for working with floating-point (non-integer) numbers. Without an FPU, floating-point operations can still be performed in software using the integer ALU (with the help of shift-and-add algorithms such as CORDIC for trigonometric functions), but this is slower.
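
To give a taste of how such software routines work, here is a sketch of CORDIC in rotation mode. It computes sine and cosine with nothing but additions, subtractions, halvings (bit shifts, in an integer implementation), and a small table of precomputed angles; a float version is shown for clarity rather than efficiency:

    import math

    # CORDIC (rotation mode): valid for inputs up to about +/-1.74 rad.
    ANGLES = [math.atan(2.0 ** -i) for i in range(24)]
    GAIN = math.prod(math.cos(a) for a in ANGLES)   # ~0.607253

    def cordic_sin_cos(theta):
        x, y, z = GAIN, 0.0, theta   # start pre-scaled so |result| = 1
        for i, angle in enumerate(ANGLES):
            d = 1.0 if z >= 0 else -1.0     # rotate toward z = 0
            x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
            z -= d * angle
        return y, x   # (sin(theta), cos(theta))

    print(cordic_sin_cos(math.pi / 6))              # ~(0.5, 0.866)
    print(math.sin(math.pi / 6), math.cos(math.pi / 6))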

Processor registers

Almost all CPUs include processor registers, which are a small amount of memory where the CPU stores pieces of data it is actively working with. Most CPUs have a few data and address registers for storing operands and outputs that will be used again soon, as well as some more specialized registers such as the program counter, which points to the address (location in memory) of the instruction currently being executed. There are also internal registers (registers not accessible to the user's program) such as the instruction register, which holds the actual instruction being executed (as opposed to its address). Some also have control registers, where specific bits in specific registers control the operation of hardware features such as switchable clock speeds.

Instructions and assembly languages

Above, I mentioned instructions. An instruction specifies an operation that a CPU can perform on data. A program expressed directly in the CPU's binary instruction encoding is called machine code. Together, the instructions a given CPU supports make up its instruction set architecture (ISA). Each instruction may specify where to access or store pieces of data (processor registers, addresses in main memory), what to do with the data (such as arithmetic operations), and other operations such as choosing the next instruction to execute based on certain conditions (branching).

In traditional architectures, an instruction is composed of an opcode, specifying what operation to perform, and one or more operands, which are usually addresses of the data on which to perform the operation. Some architectures also allow an 'immediate value': an operand encoded directly in the instruction itself, similar to a literal in higher-level programming.
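
To make this concrete, here is a toy fetch-decode-execute loop for a made-up three-instruction machine. A real CPU packs the opcode and operand into binary words and keeps ACC and the program counter in hardware registers, but the structure is the same:

    # A toy CPU with one accumulator register and three instructions,
    # each a (opcode, operand) pair with an immediate operand.
    def run(program):
        registers = {"ACC": 0}
        pc = 0                     # program counter
        while pc < len(program):
            instruction = program[pc]       # fetch (a real CPU latches
            pc += 1                         # this into the instruction register)
            opcode, operand = instruction   # decode
            if opcode == "LOADI":           # load immediate value
                registers["ACC"] = operand
            elif opcode == "ADDI":          # add immediate value
                registers["ACC"] += operand
            elif opcode == "JNZ":           # branch if ACC is not zero
                if registers["ACC"] != 0:
                    pc = operand
            else:
                raise ValueError("illegal opcode")
        return registers["ACC"]

    # Count down from 3: ACC = 3; loop: ACC -= 1; if ACC != 0 goto loop
    print(run([("LOADI", 3), ("ADDI", -1), ("JNZ", 1)]))   # 0

The JNZ instruction is a branch: it consults the accumulator (a real CPU would usually consult the ALU's status flags) to choose the next value of the program counter.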

Assembly languages

An assembly language is a slightly more human-friendly language than machine code. Generally there is a one-to-one (or almost one-to-one) correspondence between the assembly language of a given architecture and its instruction set. The assembly language, however, is easier to read because it uses alphabetic mnemonics for commands and registers instead of binary opcodes and addresses. Examples include mov (move a value from one location to another), jmp (jump to a new instruction location), and jnz (jump if not zero, a branch operation). The assembly language is converted to machine code by a program called an assembler. Some assemblers support higher-level features such as function declarations and calls, macros, and even object-oriented programming.
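
Continuing with the toy machine from the previous section, the essence of an assembler fits in a few lines: read mnemonics, emit instructions. (A real assembler emits binary encodings and resolves symbolic labels, which this sketch skips.)

    # A tiny "assembler" for the toy machine above: it converts text
    # mnemonics into the (opcode, operand) form the run() loop executes.
    OPCODES = {"LOADI", "ADDI", "JNZ"}

    def assemble(source):
        program = []
        for line in source.strip().splitlines():
            mnemonic, operand = line.split()
            if mnemonic.upper() not in OPCODES:
                raise ValueError("unknown mnemonic: " + mnemonic)
            program.append((mnemonic.upper(), int(operand)))
        return program

    source = """
    loadi 3
    addi -1
    jnz 1
    """
    print(assemble(source))   # [('LOADI', 3), ('ADDI', -1), ('JNZ', 1)]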

Until higher-level languages became popular in the 1970s and 1980s, assembly programming was the main way programming was done. Today it is still used when resources are very tight, when speed of computation is paramount, and also for programming of home-built processors. As well, many of today's high-level languages are compiled into assembly code by their compilers, which is then converted to machine code to run on the CPU.


Next post

In the next post, coming within the next week, I plan to cover modern improvements to the CPU, such as pipelining, caches, and branch prediction. If there's anything you'd like me to cover specifically, let me know in the comments. Also let me know what you think of this post.

Discussions

Eric Hertz wrote 02/06/2017 at 12:50 point

Looking forward to the next installment!
