How to handle data dependencies through memory?

We covered different types of data dependencies through memory in previous post. To handle data memory dependencies in CPU out-of-order scheduling, there are usually two steps involved:

  1. Calculate effective address for memory access
  2. Check addresses of all active load / store and ensure conflicting load / store cannot execute out of order

J. P. Shen and M. H. Lipasti have excellent elaboration on this topic in their book, Modern Processor Design. In this post, we will briefly discuss their idea about load / store processing.

Continue reading → How to handle data dependencies through memory?

What are data dependencies through memory in Tomasulo’s Algorithm?

The RAW, WAW and WAR hazards we discussed in previous posts are with respect to registers. For memory accesses, there exists subtle data dependencies that are barely mentioned. These data dependencies include RAW, WAW and WAR as well. Note,

  1. Memory data dependencies only exist for accesses to the same memory address; for memory accesses to different addresses, load and store can be safely done out of order
  2. In MIPS 5-stage pipeline, all memory accesses are performed in MEM stage and thus are serialized, so these data dependencies will not happen in MIPS 5-stage pipeline. In this post, we will focus on CPU out-of-order scheduling.

Continue reading → What are data dependencies through memory in Tomasulo’s Algorithm?

How does Tomasulo’s Algorithm work?

A fundamental restriction in MIPS 5-stage pipeline is, if one instruction stalls, all instructions following it stall as well, even if they have no dependencies at all. If the stalled instruction take a long time to finish, the throughput of the CPU pipeline will suffer.

In modern CPU technology, out-of-order scheduling is an important technique to address the restriction above. Tomasulo’s algorithm is a typical solution to CPU out-of-order scheduling.

Continue reading → How does Tomasulo’s Algorithm work?

How to implement hardware based branch predictions?

Hardware based branch prediction is an important CPU technology to improve control hazards. Branch prediction has two aspects: branch condition prediction and branch target prediction. Branch condition decides whether the branch is taken or not, and branch target decides the target address. Both aspects are equally important.

Continue reading → How to implement hardware based branch predictions?

Hazards and solutions: a case study using MIPS 5-stage pipeline

The CPU pipelining introduces throughput increasing and possible higher clock frequency, but it does not come for free. By allowing multiple instructions being executed in parallel, CPU designers need to take care of the following hazards:

Structural hazards
Data hazards
Control hazards

In this post, we will discuss these hazards in detail and use MIPS 5-stage pipeline for case study.

Continue reading → Hazards and solutions: a case study using MIPS 5-stage pipeline

What does MIPS 5-stage pipeline looks like?

MIPS 5-stage pipeline is a classic way to illustrate CPU pipelining, and it is a common interview questions for new grads and junior engineers. The 5-stage pipeline consists of the following stages:

IF – instruction fetch
ID – instruction decode and operand fetch
EX – instruction execution
MEM – memory access
WB – write back

Continue reading → What does MIPS 5-stage pipeline looks like?