We have covered how Tomasulo’s algorithm minimize performance impact of data hazards, we can also improve on control hazards. The key idea is to execute instructions based on branch prediction / speculation. Now the concern is, how to restore the program if the speculation is incorrect.
Tomasulo’s Algorithm with Speculation Overview
Tomasulo’s algorithm introduces a reorder buffer, to separate the “completion” from “commit”. Results can be produced early based on prediction (“completion”), and they can be bypassed to subsequent instructions before permanently modifying CPU state (“commit). Reorder buffer provides a fast mechanism to undo speculative state changes: instructions are committed in order, and reorder buffers and CPU pipelines are flushed if head of reorder buffer is a mispredicted branch.
A diagram of Tomasulo’s Algorithm with speculation from J. Hennessy and D. Patterson is shown below.
The reorder buffer will store:
Instruction type – branch / store / load / register op
Destination field – register number or memory address where results should be committed
Value – temporarily hold result until instruction commits
Ready – indication of results being ready and waiting to be committed
Beside the addition of reorder buffer, the differences between Tomasulo’s Algorithm with and without speculation are:
- The Qi tag of each register will no longer store which reservation station will produce the result; instead, it tracks which reorder buffer entry will commit the result
- Qj, Qk field of reservation station now tracks which reorder buffer entry will have the result
- Vj, Vk field of reservation station can be copied from either register file, or reorder buffer entries. If the value is from reorder buffer entries, these values are speculative.
Instruction Execution in Tomasulo’s Algorithm
In issuing phase, as long as there are available reservation stations and reorder buffer entries, instructions can be issued. Instruction type, operands and ROB entry number are stored in reservation station; instruction type and destination register number are stored in ROB entry; ROB entry number will be stored in destination register Qi tag.
In execution phase, if all operands and the execution unit are available, the operation is performed. For load and store, effective address computation is done in this phase.
In result writing phase, when the result is available and the CDB is free, the result is broadcasted to waiting reservation stations and corresponding ROB entry. Note, results are not written into register file yet.
In commit phase, if the head of ROB is ready and it is not a mispredicted branch, results are allowed to be updated in register file. For store, memory is updated in this phase.
J. Hennessy and D. Patterson have summarized a detailed algorithmic description of Tomasulo’s algorithm with speculation in the table below:
Resolving Data Hazards and Dependencies in Tomasulo’s Algorithm with Speculation
How data hazards are resolved in Tomasulo’s algorithm with speculation is similar to without speculation case. As for data memory dependencies, it is worth some analysis.
Since stores will update memory in program order, WAW and WAR data memory dependencies are eliminated with speculation.
RAW data memory dependency is eliminated since computation of effective addresses of loads with respect to all earlier stores is in program order. In addition, a load is not allowed to initiate the second step of execution if any active ROB entry is occupied by a store to the same memory address.
To check the understanding of speculation, interviewer will often focus on the concept of “completion” and “commit” separation, and how data memory dependencies are resolved.
The use of reorder buffer can also be extended from Tomasulo’s algorithm, and interviewer can even ask follow up questions about how to size reorder buffer, etc.
We highly recommend readers to refer to J. Hennessy and D. Patterson’s book, Computer Architecture, for more detail.