The previous chapter gave a general introduction to the functional computer. In this chapter, let me detail the design aspect of computer architecture. We said that computer architecture is all about designing a computer for the desired performance. This design is based on a set of rules and methods that describe the functionality, organization and implementation of computer systems for efficient information processing using the available technologies. Technology and applications form a feedback loop: each one drives the other to the next generation. Computer architecture is thus an iterative process of searching for designs at all levels.
Whatever technological changes come in, the basic rules of thumb of architecture design stand unchanged. Technology brings in efficiency, speed, capacity and capability. The two basic architectural designs are known as the Von Neumann architecture and the Harvard architecture.
Von Neumann Architecture
In 1945, John Von Neumann described the idea that a program to be executed should be stored in binary format in a memory device, so that intermediate results can affect the subsequent course of the program. This is known as the Stored Program Concept in computer design.
The stored program concept says that the program, i.e. the instructions, is stored along with the data in the computer’s memory in machine-readable binary form (machine language). The computer treats a memory word as an instruction or as data depending on the control state. The CPU executes the instructions sequentially, following the control flow of the program and its intermediate results. This is done as a Fetch-Execute cycle: the CPU fetches an instruction from memory in the Fetch control cycle and executes it in the Execute control cycle, accessing data from memory as needed. Thus every instruction goes through the Fetch-Execute cycle in the Stored Program Concept.
For the clarity of readers, let me recall that modern computers have a hard disk where we save our program and data files. These get loaded into main memory while the program is being executed. Note that a hard disk is much slower to access than main memory (RAM). In addition, RAM is structurally closer to the CPU than the hard disk is.
The Stored Program Concept is the basic operating principle of every computer and also its default method of execution. Without it, every instruction would have to be initiated manually, which is unimaginable. A Von Neumann architecture computer has five basic blocks: an Arithmetic-Logic Unit (ALU), a control unit, a memory, some form of input/output, and a bus that provides a data path between these functional units, as in figure 2.1. The ALU has a register called the Accumulator. The control unit has a register-cum-counter called the Program Counter. The program counter keeps track of the next instruction to be executed based on the flow of the program, including loops and conditional branches such as “for” and “if” statements. These registers act as small internal memory in the CPU, facilitating the flow of execution.
A Von Neumann architecture computer performs or emulates the Fetch-Decode-Execute sequence for program execution, as detailed below:
- Fetch the instruction from memory at the address denoted by the Program Counter.
- Increment the program counter to point to the next instruction to be fetched.
- Decode the instruction using the control unit.
- Execute the decoded instruction using the data path, control unit and ALU. As part of this step, any data needed by the instruction is fetched from memory, and any results that must be stored are written back to memory at the end of the instruction's execution.
- Go back to the first step.
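The cycle above can be sketched as a small simulator. This is a minimal illustration, not a model of any real processor: the four opcodes and the encoding of a word as `opcode * 100 + address` are assumptions invented for this example. Note that instructions and data sit in the same memory list, as the Stored Program Concept requires.

```python
# Hypothetical toy instruction set; the opcode values are invented for illustration.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3


def run(memory):
    """Run the program stored in `memory` and return the final memory contents."""
    memory = list(memory)   # work on a copy so the caller's list is untouched
    pc = 0                  # program counter (control unit)
    acc = 0                 # accumulator (ALU)
    while True:
        word = memory[pc]                     # fetch the word addressed by the PC
        pc += 1                               # increment the PC
        opcode, address = divmod(word, 100)   # decode (assumed encoding)
        if opcode == HALT:                    # execute
            break
        elif opcode == LOAD:
            acc = memory[address]
        elif opcode == ADD:
            acc += memory[address]
        elif opcode == STORE:
            memory[address] = acc
        else:
            raise ValueError(f"unknown opcode {opcode} at address {pc - 1}")
    return memory


program = [
    105,  # address 0: LOAD  from address 5
    206,  # address 1: ADD   from address 6
    307,  # address 2: STORE to address 7
    0,    # address 3: HALT
    0,    # address 4: unused
    20,   # address 5: data
    22,   # address 6: data
    0,    # address 7: result is written here
]
result = run(program)
print(result[7])  # 42
```

The same word, for example `105`, would be treated as plain data if an instruction loaded it, and as `LOAD 5` if the program counter pointed at it. That is the sense in which the control state decides how a memory word is interpreted.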
I/O systems need to communicate with memory and/or the CPU, and interrupts are used for this purpose. Exceptions, in turn, are error conditions that arise during execution, for example in arithmetic operations. The CPU handles both interrupts and exceptions by breaking out of the Fetch-Decode-Execute cycle of the pure Von Neumann architecture. Such a design enhances the efficiency of a system.
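As a rough sketch of how an interrupt breaks into the cycle, the loop below checks for pending requests after each instruction completes and services them before moving on. The function name, the instruction labels and the way requests are supplied (a dictionary from step number to device name) are assumptions made for illustration. Exceptions are not modelled here.

```python
from collections import deque


def run_with_interrupts(instructions, interrupts):
    """Return the order of work done.

    `instructions` is a list of instruction labels; `interrupts` maps a step
    index to the (hypothetical) device that raises a request during that step.
    """
    pending = deque()
    trace = []
    for step, instruction in enumerate(instructions):
        if step in interrupts:
            pending.append(interrupts[step])   # request arrives during this step
        trace.append(f"execute {instruction}")
        while pending:                         # checked only between instructions
            trace.append(f"service {pending.popleft()}")
    return trace


trace = run_with_interrupts(["I0", "I1", "I2"], {1: "keyboard"})
print(trace)
# ['execute I0', 'execute I1', 'service keyboard', 'execute I2']
```

The request raised during `I1` is serviced only once `I1` has finished, after which normal execution resumes with `I2`.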
Von Neumann bottleneck
The Stored Program Concept, however, has an attendant bottleneck: it was designed to process instructions one after the other rather than in parallel. In a computer with a Von Neumann architecture, the CPU can either read an instruction in the fetch cycle or read/write data from/to memory in the execute cycle. Both cannot occur at the same time, since instructions and data use the same signal pathway (the bus between the CPU and memory units), as in figure 2.1. Essentially, the CPU's operational speed is limited by memory access. Memory is a slow device compared with the CPU: every memory cycle is longer than a CPU cycle. The CPU therefore spends much of its time waiting for memory to deliver an instruction or data, which limits instruction throughput. This is known as the Von Neumann bottleneck.
The bandwidth, i.e. the data transfer rate, between the CPU and memory is very small in comparison with the amount of memory. In modern systems, both CPU clock rates and memory capacities are much higher than before. Under these circumstances, the Von Neumann bottleneck becomes a serious limitation on overall processing speed, because the CPU is continually forced to wait for vital data to be transferred to or from memory.
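A back-of-the-envelope calculation makes the effect concrete. The sketch below assumes a deliberately simple model: every instruction fetch and every data access takes one memory cycle on the shared bus, nothing overlaps, and each instruction needs one CPU cycle of actual computation. The function name and the figures (10 ns memory cycle, 1 ns CPU cycle) are hypothetical, chosen only to illustrate the ratio.

```python
def shared_bus_time(instructions, data_accesses, mem_cycle_ns, cpu_cycle_ns):
    """Return (total_ns, memory_ns) for a single shared CPU-memory bus.

    Simplified model: fetches and data accesses are serialized on one bus
    and do not overlap with computation.
    """
    bus_transfers = instructions + data_accesses   # all share the same bus
    memory_ns = bus_transfers * mem_cycle_ns
    compute_ns = instructions * cpu_cycle_ns
    return memory_ns + compute_ns, memory_ns


# Hypothetical workload: 1000 instructions, 400 of which also touch data.
total, memory = shared_bus_time(1000, 400, mem_cycle_ns=10, cpu_cycle_ns=1)
print(total, memory)               # 15000 14000
print(f"{memory / total:.0%}")     # 93%
```

Under these assumed numbers the CPU spends roughly 93% of the run time on memory transfers, which is the bottleneck described above.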
Solutions to Von Neumann bottleneck
- Design the CPU-memory interface with two buses, one exclusively for instructions and the other for data.
- Design the CPU with a cache, which increases bandwidth utilization between the CPU and main memory and thereby reduces the CPU's waiting time.
- Use modern functional and object-oriented programming, which reduces the need to push vast numbers of words back and forth compared with older conventional languages such as FORTRAN.
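To see why a cache reduces traffic to main memory, consider the following minimal sketch of a direct-mapped cache with one word per line. The function name and the address trace are assumptions for illustration; real caches hold multi-word lines and use tags, which are omitted here.

```python
def simulate_cache(addresses, num_lines):
    """Return (hits, misses) for a direct-mapped cache with one word per line."""
    lines = [None] * num_lines        # each entry holds the cached address
    hits = misses = 0
    for address in addresses:
        index = address % num_lines   # the line this address maps to
        if lines[index] == address:
            hits += 1                 # served from the cache, no bus transfer
        else:
            misses += 1               # must go to main memory over the bus
            lines[index] = address
    return hits, misses


# Hypothetical trace: a loop that touches the same four words ten times.
trace = [0, 1, 2, 3] * 10
hits, misses = simulate_cache(trace, num_lines=4)
print(hits, misses)  # 36 4
```

Only 4 of the 40 accesses reach main memory in this trace. With `num_lines=2` the four addresses keep evicting one another and every access misses, which shows that the benefit depends on the cache being large enough for the working set.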
Harvard Architecture
A Harvard architecture design has two physical memories, each with its own interface to the CPU, whereas a Von Neumann design has one memory and one interface to the CPU. The two physical memories in the Harvard architecture are dedicated to instruction storage and data storage respectively. This structure allows the CPU to read an instruction and data from the two separate memories in the same clock cycle, or as needed. The instruction memory and the data memory each have a separate bus (memory interface) to the CPU for data transfer, as shown in figure 2.2. A computer with a Harvard design is faster because it can fetch the next instruction while the current instruction is being executed. However, the better throughput comes at the cost of added complexity in the control unit design.
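The speed difference can be illustrated by counting bus slots under a simplified model, assumed here for illustration only: every instruction needs one slot for its fetch, some instructions need one more slot for a data access, and execution time itself is ignored. On a shared bus the two kinds of access are serialized; with separate buses, the data access of one instruction overlaps with the fetch of the next. The function names are hypothetical.

```python
def von_neumann_slots(needs_data):
    """Bus slots used when fetches and data accesses share one bus."""
    slots = 0
    for uses_data in needs_data:
        slots += 1          # instruction fetch
        if uses_data:
            slots += 1      # data access waits for its own slot on the same bus
    return slots


def harvard_slots(needs_data):
    """Bus slots used when instruction and data buses work in parallel."""
    slots = len(needs_data)            # one instruction-bus slot per fetch
    if needs_data and needs_data[-1]:
        slots += 1                     # last data access has no later fetch to overlap
    return slots


# Hypothetical program: True marks an instruction that accesses data memory.
program = [True, False, True, True, False]
print(von_neumann_slots(program))  # 8
print(harvard_slots(program))      # 5
```

In this sketch the Harvard design needs 5 slots against 8 for the shared bus. The overlap is exactly what the control unit has to manage, which is where the extra design complexity comes from.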
In recent decades, processor designs have incorporated aspects of both the Harvard and Von Neumann architectures. These processors have on-chip caches for both instructions (I-Cache) and data (D-Cache). The Harvard architecture is used for the on-chip cache interface to the CPU. In the case of a cache miss, i.e. when the desired information is not available in the cache, it is retrieved from main memory. The main memory is not divided into separate instruction and data sections, so the Von Neumann architecture is used for the off-chip memory design. The combined design is illustrated in figure 2.3.
General-purpose microcontrollers are used in electronics applications and are based on the Harvard architecture model. Microcontrollers are characterized by small amounts of program and data memory, and take advantage of the Harvard architecture. They have Read-Only Memory (ROM) where the program is stored, while the RAM is available for storing data from I/O and is also accessed by the CPU. A small instruction set ensures that most instructions are executed within only one machine cycle.
The Von Neumann and Harvard architectures are the same with regard to sequential access and execution of instructions in the Fetch-Decode-Execute cycle, but differ in the way memory access is handled. The control unit design in the Harvard architecture is more complex than in the Von Neumann architecture.
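The combined arrangement can be sketched as two separate caches in front of one unified memory. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, the caches are unbounded dictionaries, and writes are not modelled.

```python
class SplitCacheMemory:
    """Separate I-cache and D-cache (Harvard style) over one unified main memory."""

    def __init__(self, main_memory):
        self.main_memory = main_memory   # instructions and data stored together
        self.i_cache = {}                # address -> instruction word
        self.d_cache = {}                # address -> data word
        self.main_memory_reads = 0

    def _read(self, cache, address):
        if address not in cache:         # cache miss: go to main memory
            cache[address] = self.main_memory[address]
            self.main_memory_reads += 1
        return cache[address]

    def fetch_instruction(self, address):
        return self._read(self.i_cache, address)

    def load_data(self, address):
        return self._read(self.d_cache, address)


# Hypothetical memory image: three instructions followed by one data word.
memory = SplitCacheMemory(["LOAD 3", "ADD 3", "HALT", 21])
memory.fetch_instruction(0)
memory.load_data(3)
memory.fetch_instruction(1)
value = memory.load_data(3)              # second access hits the D-cache
print(value, memory.main_memory_reads)   # 21 3
```

The CPU side sees two independent paths, one for instructions and one for data, while a miss on either path falls through to the same main memory.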