How to implement precise interrupt?

In previous post, we discussed the definition of precise interrupt. In this post, we will cover the implementations, based on J.E. Smith and A.R. Pleszkun’s paper Implementing Precise Interrupt in Pipelined Processor.

A total of 4 approaches will be discussed:

Result shift register
Reorder buffer with bypass
History buffer
Future file

Continue reading → How to implement precise interrupt?

How to implement hardware-based speculation in Tomasulo’s Algorithm to minimize control hazards?

We have covered how Tomasulo’s algorithm minimize performance impact of data hazards, we can also improve on control hazards. The key idea is to execute instructions based on branch prediction / speculation. Now the concern is, how to restore the program if the speculation is incorrect.

Continue reading → How to implement hardware-based speculation in Tomasulo’s Algorithm to minimize control hazards?

What are data dependencies through memory in Tomasulo’s Algorithm?

The RAW, WAW and WAR hazards we discussed in previous posts are with respect to registers. For memory accesses, there exists subtle data dependencies that are barely mentioned. These data dependencies include RAW, WAW and WAR as well. Note,

  1. Memory data dependencies only exist for accesses to the same memory address; for memory accesses to different addresses, load and store can be safely done out of order
  2. In MIPS 5-stage pipeline, all memory accesses are performed in MEM stage and thus are serialized, so these data dependencies will not happen in MIPS 5-stage pipeline. In this post, we will focus on CPU out-of-order scheduling.

Continue reading → What are data dependencies through memory in Tomasulo’s Algorithm?

How does Tomasulo’s Algorithm work?

A fundamental restriction in MIPS 5-stage pipeline is, if one instruction stalls, all instructions following it stall as well, even if they have no dependencies at all. If the stalled instruction take a long time to finish, the throughput of the CPU pipeline will suffer.

In modern CPU technology, out-of-order scheduling is an important technique to address the restriction above. Tomasulo’s algorithm is a typical solution to CPU out-of-order scheduling.

Continue reading → How does Tomasulo’s Algorithm work?