CS 473 - Pipelining

Basic idea: the various steps in executing an instruction don't use the same parts of the CPU. This tells us we can overlap them, with each part of the CPU working on a different instruction at the same time; doing this is called pipelining. The text illustrates this well with its "laundry analogy."

In the four or five basic steps involved in executing a MIPS instruction, we in fact never have to reuse hardware: each step has its own. This will let us pipeline it. The steps, and their corresponding hardware parts, are:
Step                    Hardware
Instruction Fetch       PC, Instruction Memory
Decode/Register Read    Registers (read ports)
Arithmetic              ALU
Memory Read/Write       Data Memory
Writeback               Registers (write port)

We implement the pipeline by putting registers, called pipeline registers, between the stages of the pipeline. Now, on each cycle, we move the data through the CPU by one step. The result is that we are able to get a substantial improvement in performance, at very minimal cost.
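
To see what "move the data through the CPU by one step" looks like, here is a minimal sketch of a five-stage pipeline in Python. It only tracks which instruction is sitting in which stage; it does not model what the stages compute, and it assumes no stalls. The stage abbreviations (IF, ID, EX, MEM, WB), the function name simulate, and the three sample instructions are choices made for this illustration, not something taken from the text.

```python
# Stage names are shorthand chosen for this sketch:
# fetch, decode/register read, arithmetic, memory, writeback.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def simulate(instructions):
    """Return, for each cycle, which instruction occupies each stage (None if empty)."""
    slots = [None] * len(STAGES)   # slots[i] is the instruction currently in stage i
    pending = list(instructions)   # instructions not yet fetched
    history = []
    while pending or any(s is not None for s in slots):
        # On each cycle every instruction advances one stage, and a new
        # instruction (if there is one) enters the fetch stage.
        next_instr = pending.pop(0) if pending else None
        slots = [next_instr] + slots[:-1]
        if any(s is not None for s in slots):
            history.append(list(slots))
    return history

# Hypothetical three-instruction program, one row printed per cycle.
for cycle, slots in enumerate(simulate(["lw", "add", "sw"]), start=1):
    print(cycle, [s or "-" for s in slots])
```

Each row of the output is one cycle. With three instructions and five stages the run takes 3 + 5 - 1 = 7 cycles, and in the third cycle three different instructions are in three different stages at once.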

Speedup

How much speedup can we expect from pipelining? Let's take a not-too-unreasonable case: a processor whose five stages take the following times (these numbers are actually pulled out of a hat, so they shouldn't be quoted as any sort of real estimate of how long these steps take):
Time (ns)   Stage
5           Instruction Fetch
2           Register Read
3           Execute
5           Memory
2           Register Write

For a single-cycle implementation, this machine will take 5 + 2 + 3 + 5 + 2 = 17 nanoseconds to execute an instruction.

To pipeline it, we need to insert the pipeline registers, which will take time (let's call it 1 nanosecond; it's reasonable that this takes less time than the register file, because no addressing is needed). This adds one nanosecond to the time for each stage. The maximum speed we can run at is now determined by the slowest stage, which takes 5 + 1 = 6 nanoseconds, so we can run at 6 nanoseconds per cycle: a speedup of 17/6, or almost a factor of three.
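
The arithmetic above can be checked with a few lines of Python. This is just a sketch using the made-up stage times from this example; the variable names are chosen for illustration.

```python
# Stage latencies in nanoseconds (the illustrative numbers from this example).
stage_ns = {
    "Instruction Fetch": 5,
    "Register Read": 2,
    "Execute": 3,
    "Memory": 5,
    "Register Write": 2,
}
PIPELINE_REGISTER_NS = 1  # assumed pipeline register overhead added to every stage

# Single-cycle: one instruction passes through every stage in one long cycle.
single_cycle_ns = sum(stage_ns.values())

# Pipelined: the clock must be long enough for the slowest stage plus its register.
pipelined_cycle_ns = max(stage_ns.values()) + PIPELINE_REGISTER_NS

speedup = single_cycle_ns / pipelined_cycle_ns
print(single_cycle_ns, pipelined_cycle_ns, round(speedup, 2))  # 17 6 2.83
```

Notice that the speedup is well short of five even though there are five stages: the stages are unbalanced, so the 2 and 3 nanosecond stages sit idle for part of every 6 nanosecond cycle.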

But there is still more to the story: this is the rate with the pipeline full. If it has to be drained and refilled, some stages sit idle while it fills back up, and the speedup goes down.
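
To put rough numbers on this, here is a sketch that computes the speedup for a run of n instructions starting from an empty pipeline. It uses the same made-up stage times as above (5, 2, 3, 5, 2 nanoseconds plus 1 nanosecond per pipeline register) and assumes nothing else ever stalls the pipeline, which is optimistic. The function name pipelined_speedup is chosen for this illustration.

```python
def pipelined_speedup(n_instructions, stage_ns=(5, 2, 3, 5, 2), register_ns=1):
    """Speedup of the pipelined design over the single-cycle design for a run
    of n_instructions that starts with an empty pipeline (no stalls assumed)."""
    single_cycle_ns = sum(stage_ns)          # 17 ns per instruction
    cycle_ns = max(stage_ns) + register_ns   # 6 ns per pipelined cycle
    # The first instruction needs one cycle per stage to get through;
    # each later instruction finishes one cycle after the one before it.
    cycles = len(stage_ns) + n_instructions - 1
    return (n_instructions * single_cycle_ns) / (cycles * cycle_ns)

for n in (1, 5, 100, 10000):
    print(n, round(pipelined_speedup(n), 2))
```

For a single instruction the pipelined machine is actually slower (5 cycles of 6 nanoseconds is 30 nanoseconds, against 17), and the speedup only approaches 17/6 as the run of instructions between drains gets long.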


Last modified: Wed Sep 12 11:36:31 MDT 2001