Page 15 of 16

How does Tomasulo’s Algorithm work?

A fundamental restriction in MIPS 5-stage pipeline is, if one instruction stalls, all instructions following it stall as well, even if they have no dependencies at all. If the stalled instruction take a long time to finish, the throughput of the CPU pipeline will suffer.

In modern CPU technology, out-of-order scheduling is an important technique to address the restriction above. Tomasulo’s algorithm is a typical solution to CPU out-of-order scheduling.

Continue reading → How does Tomasulo’s Algorithm work?

How to implement hardware based branch predictions?

Hardware based branch prediction is an important CPU technology to improve control hazards. Branch prediction has two aspects: branch condition prediction and branch target prediction. Branch condition decides whether the branch is taken or not, and branch target decides the target address. Both aspects are equally important.

Continue reading → How to implement hardware based branch predictions?

Hazards and solutions: a case study using MIPS 5-stage pipeline

The CPU pipelining introduces throughput increasing and possible higher clock frequency, but it does not come for free. By allowing multiple instructions being executed in parallel, CPU designers need to take care of the following hazards:

Structural hazards
Data hazards
Control hazards

In this post, we will discuss these hazards in detail and use MIPS 5-stage pipeline for case study.

Continue reading → Hazards and solutions: a case study using MIPS 5-stage pipeline

What does MIPS 5-stage pipeline looks like?

MIPS 5-stage pipeline is a classic way to illustrate CPU pipelining, and it is a common interview questions for new grads and junior engineers. The 5-stage pipeline consists of the following stages:

IF – instruction fetch
ID – instruction decode and operand fetch
EX – instruction execution
MEM – memory access
WB – write back

Continue reading → What does MIPS 5-stage pipeline looks like?

Design a non-power-of-2 Synchronous FIFO

If the sync FIFO has non-power-of-2 entries, for example, 3 entries, what can we do? Well, we have at least 2 approaches.

The easiest way is to maintain a counter for FIFO occupied entries, and use it to generate the FIFO empty or full condition.

The second approach is to use the same trick for pointer encoding as what we see in power-of-2-entry FIFO. For example, both read and write pointers have (2 + 1) bits, where 2 is for bits to index 3-entry FIFO, and 1 is to decode FIFO full and empty conditions. Pointer value change shall have the following patterns:

000 -> 001 -> 010 -> 100 -> 101 -> 110 -> 000 -> …

When read pointer is equal to write pointer, the FIFO is empty; When read pointer is the same as write pointer other than MSB, the FIFO is full.