
Part 4: The Computer Architecture

A project log for ALUMinimum

Learn deep logic optimization by implementing the fastest ALU West of the Pecos River, using no specialized arithmetic chip

Squonk42, 05/22/2017 at 20:39

At the same time as the central role of the processor (CPU) was being defined, the evolution of memory technologies provided a random-access capability that departed from the model of the Universal Turing Machine, which is based on an infinite tape as its main storage unit.

As Hao Wang wrote in 1954 in his paper "A Variant to Turing's Theory of Computing Machines" (unfortunately, behind a paywall):

Turing's theory of computable functions antedated but has not much influenced the extensive actual construction of digital computers. These two aspects of theory and practice have been developed almost entirely independently of each other. The main reason is undoubtedly that logicians are interested in questions radically different from those with which the applied mathematicians and electrical engineers are primarily concerned. It cannot, however, fail to strike one as rather strange that often the same concepts are expressed by very different terms in the two developments

Interestingly though, Turing himself worked on both the theoretical and the practical aspects of computers.

As a matter of fact, the theory of computational complexity was only founded in the early 1960s, whereas devising minimal universal Turing machines had already been a mental sport for years (see Peter van Emde Boas's paper "Machine Models and Simulations" for a good enumeration and classification of abstract machines).

Von Neumann (Princeton) Architecture

Both of these early papers, von Neumann's "First Draft of a Report on the EDVAC" and Turing's own "Proposal for Development in the Mathematics Division of an Automatic Computing Engine (ACE)", describe a computer architecture that at first sight seems completely unconnected to the concept of the Universal Turing Machine.

First, the fact that these machines are serial computers, processing each instruction and data word one atomic component at a time rather than all at once as in today's parallel-word computers, makes them difficult to relate to. But this choice does not affect the logical architecture per se; it was rather a compromise imposed by the technology available at the time, the sequential acoustic mercury delay-line memories. With the practical availability of random-access memories of higher capacity, such as the iconoscope and the Williams tube, Arthur Burks, Herman H. Goldstine and John von Neumann published only one year later, on 28 June 1946, the first edition of their famous paper "Preliminary Discussion of the Logical Design of an Electronic Computing Instrument" (followed on 2 September 1947 by a second edition), where a more modern design based on a parallel-word architecture is described.

Second, and more important, is the substitution of the sequential tape with a random-access memory and registers.

This architecture, known as the von Neumann or Princeton architecture, was only categorized theoretically much later (in October 1964, by Calvin Elgot and Abraham Robinson in "Random-Access Stored-Program Machines, an Approach to Programming Languages", unfortunately behind a paywall) as an example implementation of a specific class of Turing-equivalent abstract register machine: the Random-Access Stored-Program (RASP) machine.

Although the paper is attributed to von Neumann alone, there is a long-standing controversy over whether all the ideas it exposes are really his own.

At least some details suggest that his ideas did not come ex nihilo. In section 5.3 of the Burks, Goldstine and von Neumann paper:

Several of the digital computers being built or planned in this country and England are to contain a so-called "floating decimal point"

This shows that they were aware, at that time, of at least Bell Labs' Model V computer and of the Pilot ACE development at the UK's National Physical Laboratory.

In fact, there were many direct contacts between the British and American laboratories, showing that exchanges between the concurrent projects were not uncommon at this time.

Whatever von Neumann's motivations were, disclosing the computer architecture early in such public documents prevented the various people working on these projects from patenting anything they worked on, and had the positive result of placing the invention of the computer in the public domain.

What is remarkable though is that the von Neumann architecture closely follows Turing's stored-program computer model, to the point where both concepts are often confused. Having a stored-program capability was mandatory for those primitive computers, which had only very poor means of specifying the data they were working on: essentially the addresses hard-coded in the instructions themselves.

It was then impossible to address whole blocks of data to process, unless the program was able to modify itself by computing instructions with pre-computed addresses to be executed further down in the algorithm: hence the stored-program requirement.
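To make this concrete, here is a minimal sketch of such a self-modifying loop (the LOAD/ADD/STORE/INCADDR mnemonics and the memory layout are invented for this example, they are not taken from any historical machine): the program sums a block of four values by rewriting the address field of its own ADD instruction at each pass, which is only possible because the program sits in the same writable memory as its data.

    def run(mem):
        acc, pc = 0, 0                    # accumulator and program counter
        while True:
            op, addr = mem[pc]            # fetch: instructions and data share memory
            pc += 1
            if op == "LOAD":              # acc = mem[addr]
                acc = mem[addr]
            elif op == "ADD":             # acc += mem[addr]
                acc += mem[addr]
            elif op == "SUB":             # acc -= mem[addr]
                acc -= mem[addr]
            elif op == "STORE":           # mem[addr] = acc
                mem[addr] = acc
            elif op == "INCADDR":         # self-modification: bump the address
                o, a = mem[addr]          # field of the instruction stored at addr
                mem[addr] = (o, a + 1)
            elif op == "JNZ":             # jump to addr if acc is not zero
                if acc != 0:
                    pc = addr
            elif op == "HALT":
                return

    mem = {
        # program: sum the block at cells 10..13 into cell 20
        0: ("LOAD", 20), 1: ("ADD", 10), 2: ("STORE", 20), 3: ("INCADDR", 1),
        4: ("LOAD", 21), 5: ("SUB", 22), 6: ("STORE", 21), 7: ("JNZ", 0),
        8: ("HALT", 0),
        10: 3, 11: 5, 12: 7, 13: 11,      # the block of data to sum
        20: 0,                            # running sum
        21: 4,                            # elements left to process
        22: 1,                            # the constant 1
    }

    run(mem)
    print(mem[20])  # 26: instruction 1 was rewritten into ADD 11, 12, 13 in turn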

Harvard Architecture

However, the stored-program model is not a strict requirement for a computer to be universal: a simple design with a looping external program capable of indirect addressing (i.e. instructions containing the address of the address of a value, or two levels of indirection for accessing values) suffices to simulate a universal Turing machine (see the paper by Raúl Rojas, "Conditional Branching is not Necessary for Universal Computation in von Neumann Computers").
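A minimal sketch of what these two levels of indirection buy (the memory layout and helper names below are invented for this example, they do not come from Rojas's paper): with indirect addressing, the instruction holds the address of a pointer cell, and advancing that pointer is enough to walk through a block of data without ever modifying the program.

    mem = [0] * 16
    mem[10], mem[11], mem[12] = 3, 5, 7   # a block of data
    mem[4] = 10                           # a pointer cell: holds the address of the data

    def load_direct(mem, addr):
        # direct addressing: the address field designates the operand itself
        return mem[addr]

    def load_indirect(mem, addr):
        # indirect addressing: the address field designates a cell that in
        # turn holds the address of the operand (two memory accesses)
        return mem[mem[addr]]

    print(load_direct(mem, 4))    # 10: the pointer itself
    print(load_indirect(mem, 4))  # 3:  the value it points to
    mem[4] += 1                   # advance the pointer; the program is untouched
    print(load_indirect(mem, 4))  # 5:  the next element of the block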

The architecture that matches this description is called the Harvard Architecture, in reference to the Harvard Mark I and Mark II machines designed by Howard Aiken in 1944-47. These machines were not capable of universal computation, as they lacked an indirect addressing mode, but they featured a split memory for program and data.

One distinctive advantage of the Harvard Architecture over von Neumann's is that the CPU can read an instruction and perform a data memory access at the same time, whereas a von Neumann machine is limited by the von Neumann bottleneck. But this performance limitation is mitigated in today's computers by using caches and separate instruction and data access paths.

Pure Harvard Architecture is still used today in cache-less processors, such as Digital Signal Processors (DSPs) and some small microcontrollers.

Stack Architecture

Another popular computer architecture is the Stack Architecture, where a pushdown stack, rather than a set of registers, is used to hold data. Most of its instructions implicitly take their operands from the stack and place their results back onto the stack.

Because the instructions carry no operand addresses, the object code is very compact. And as algebraic expressions are directly convertible into a binary syntax tree, which is itself easy to translate into stack-based instructions, the stack architecture permits simple compilers and interpreters to be written. However, the stack architecture suffers from several flaws.

Only a few processors using the stack architecture were actually built, but this architecture is commonly used for Virtual Machines interpreted in software.
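To illustrate both points, the compact zero-address code and the direct translation from a syntax tree, here is a minimal sketch of such a software stack machine (the PUSH/ADD/MUL mnemonics and the tuple encoding of the tree are invented for this example):

    # Syntax tree for (2 + 3) * 4, as nested tuples: (operator, left, right)
    tree = ("*", ("+", 2, 3), 4)

    def compile_expr(node, code):
        # Post-order walk: emit both operands first, then the operator.
        # Operators carry no addresses, hence the compact zero-address code.
        if isinstance(node, tuple):
            op, left, right = node
            compile_expr(left, code)
            compile_expr(right, code)
            code.append(("ADD" if op == "+" else "MUL",))
        else:
            code.append(("PUSH", node))
        return code

    def run(code):
        stack = []
        for instr in code:
            if instr[0] == "PUSH":
                stack.append(instr[1])
            else:                             # ADD or MUL: pop two, push the result
                b, a = stack.pop(), stack.pop()
                stack.append(a + b if instr[0] == "ADD" else a * b)
        return stack.pop()

    code = compile_expr(tree, [])
    print(code)       # [('PUSH', 2), ('PUSH', 3), ('ADD',), ('PUSH', 4), ('MUL',)]
    print(run(code))  # 20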

Transport Triggered Architecture

One last computer architecture of interest is the Transport Triggered Architecture, where computation happens as a side effect of data transport: accessing certain addresses in memory triggers a functional unit to start computing. Put in other terms, the processor features only a single instruction that moves data from a source to a destination, and the operation performed depends on the actual addresses that are accessed.
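Here is a minimal sketch of the idea (the address map and the single move() routine are invented for this example, they do not model any real TTA processor): the "program" consists of nothing but moves, and the addition happens as a side effect of writing to the adder's trigger port.

    # Memory-mapped adder functional unit; the address map is invented for this sketch.
    OPERAND_A = 0x10   # writing here only latches the first operand
    TRIGGER_B = 0x11   # writing here latches the second operand AND fires the adder
    RESULT    = 0x12   # reading here returns the last computed sum

    mem = {}           # plain data memory plus the functional-unit ports

    def read(addr):
        return mem.get(addr, 0)

    def write(addr, value):
        mem[addr] = value
        if addr == TRIGGER_B:                      # the side effect of the transport:
            mem[RESULT] = read(OPERAND_A) + value  # the computation itself

    def move(src, dst):
        # The machine's single instruction: copy the value at src to dst.
        write(dst, read(src))

    # The "program" is nothing but moves; the addition is triggered by the second one.
    mem[0], mem[1] = 34, 8
    move(0, OPERAND_A)   # operand -> adder input port
    move(1, TRIGGER_B)   # operand -> trigger port, the adder fires
    move(RESULT, 2)      # sum -> back into data memory
    print(mem[2])        # 42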

Although this architecture greatly simplifies the control logic of the processor, only a few commercial processors actually use it.

To summarize, most computers today use either the von Neumann Architecture with a cache or a Modified Harvard Architecture with separate instruction and data caches or access paths, the exception being DSPs and some small microcontrollers that still use a pure Harvard Architecture.
