Least Recently Used, or LRU, is one of the most widely used cache replacement policies, and variants of it appear in many applications where arbitration is required. In this post, we will discuss different true LRU implementations, based on the paper “Highly Efficient LRU Implementations for High Associativity Cache Memory”.

In the following discussion, we assume the number of ways in a cache set is N.

Square Matrix Implementation

The square matrix implementation keeps an N x N 2D array of bits per cache set.

Upon an access to way i, the entire row i is set to 1, and then the entire column i is set to 0.

The number of 1’s in each row indicates the order in which the cache ways within the set were accessed: a way with more 1’s in its row was accessed more recently than a way with fewer 1’s.

On a cache miss, the LRU way is found by locating the row whose storage elements are all 0.
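The paper describes this scheme in hardware; as an illustrative sketch only, the same bookkeeping can be modeled in software. The class name `MatrixLRU` and its methods are my own, not from the paper.

```python
class MatrixLRU:
    """Software model of the square matrix true-LRU scheme for one cache set."""

    def __init__(self, n):
        self.n = n
        # N x N bit matrix; a row with more 1's marks a more recently used way.
        self.m = [[0] * n for _ in range(n)]

    def access(self, way):
        # Set the entire row for the accessed way to 1...
        for j in range(self.n):
            self.m[way][j] = 1
        # ...then clear the entire column for that way.
        for i in range(self.n):
            self.m[i][way] = 0

    def lru_way(self):
        # The LRU way is the one whose row is all zeros.
        for i in range(self.n):
            if not any(self.m[i]):
                return i
```

For example, in a 4-way set, after accessing ways 0, 1, 2, 3 in order, row 0 is all zeros, so way 0 is the LRU victim; accessing way 0 again makes way 1 the victim.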

Counter Based Implementation

The square matrix implementation requires N^2 bits of storage per set. The counter based implementation reduces this to N x log2(N) bits.

There is a one-to-one mapping between the counters and the cache ways in a set. The counter values indicate the order in which the cache ways within the set have been accessed.

A counter with a larger value means the corresponding cache way was accessed more recently than a way whose counter holds a smaller value.

Consequently, a counter value of 0 marks the corresponding cache way as LRU, while the highest value, N-1, marks the most recently accessed way.

Upon a cache access, the value of the “active counter”, the counter belonging to the way being accessed, is compared with the values of the other counters.

Every counter whose value is greater than the active counter is decremented, and the active counter is then set to the highest value, N-1.
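Again purely as a sketch of the update rule (the class name `CounterLRU` and the software framing are mine, not the paper’s), the counter scheme could be modeled as:

```python
class CounterLRU:
    """Software model of the counter based true-LRU scheme for one cache set."""

    def __init__(self, n):
        self.n = n
        # Start with distinct values 0..N-1 so the ordering is well defined.
        self.count = list(range(n))

    def access(self, way):
        active = self.count[way]
        # Decrement every counter whose value exceeds the active counter...
        for i in range(self.n):
            if self.count[i] > active:
                self.count[i] -= 1
        # ...then promote the accessed way to most recently used.
        self.count[way] = self.n - 1

    def lru_way(self):
        # A counter value of 0 marks the least recently used way.
        return self.count.index(0)
```

Because the update is a closed permutation of the values 0..N-1, every counter stays distinct and exactly one way always holds the value 0.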


In this post, we discussed two true LRU implementations; we will cover more in the next post.
