Cache Organization and Address Mapping

A block is the minimum unit of information that can be either present or not present in a cache. A block in the main memory is called a frame and a block in the cache memory is called a cache block or a cache line.

A processor accesses data within a frame or cache block through an addressing system. The smallest chunk of information that the processor can read is in units of words.

The main and cache memory are each divided into frames/blocks of $2^n = N$ words. When data is moved into the cache an entire frame of $N$ words occupies one block of $N$ words in the cache.

If the main memory address $A$ contains $p$ bits, and the least significant $n$ bits are used to represent the $N$ words in each frame/block, the remaining $(p - n)$ bits of the address form the tag for the block, whereby the tag is the beginning address of the block of $N$ words. Thus $(p-n)$ bits can reference any frame from $0$ to $2^{(p - n) - 1}$ in the main memory.

The cache capacity is $2^b = B$ blocks. The data area of the cache has a total capacity of $B \times N$ words. The cache also has a tag area consisting of $B$ tags. Each tag in the tag area identifies the address range of the $N$ words in the corresponding block in the cache.

When the processor references a primary memory address $A$ , the cache mechanism compares the tag portion of $A$ to each tag in the tag area. The least significant $n$ bits of $A$ are used to select the appropriate word from the block, to be transmitted to the processor.

Examples

Word Size	# Words in a Block	Bits to reference Word within a Block	Primary Memory	Bits to reference Primary Memory	Tag Length	Cache Memory	# of Cache Blocks
$w$	$N$	$n$	$P$	$p$	$t$	$C$	$B$
		$log_{2}N$		$log_{2}(\frac{P}{w})$	$p-n$		$log_{2}(\frac{C}{Nw})$
$1$ B	$8$	$3$	$64$ KB	$16$	$13$	$1$ KB	$128$
$1$ B	$16$	$4$	$64$ KB	$16$	$12$	$1$ KB	$64$
$1$ B	$8$	$3$	$128$ KB	$17$	$14$	$1$ KB	$128$
$1$ B	$16$	$4$	$128$ KB	$17$	$13$	$1$ KB	$64$
$1$ B	$8$	$3$	$64$ KB	$16$	$13$	$2$ KB	$256$
$1$ B	$16$	$4$	$64$ KB	$16$	$12$	$4$ KB	$256$
$1$ B	$8$	$3$	$128$ KB	$17$	$14$	$8$ KB	$1024$
$1$ B	$16$	$4$	$128$ KB	$17$	$13$	$16$ KB	$1024$

Fully Associative Mapping

In the above mapping scheme one frame can occupy any of the cache blocks. This is called a fully associative mapping scheme. All the $B$ tags must be searched in order to determine a hit/miss.

Direct Mapping

On the other extreme, in a direct mapping each memory address $A$ maps to a unique cache block. Use the least significant $b$ bits of the tag to indicate the cache block in which a frame can reside. The tag search is now reduced to just one comparison.

This mapping divides the $2^p$ addresses in the main memory into $B = 2^b$ partitions. Thus, $2^{(p-b)}$ addresses map to the same cache block. The addresses that map to the same cache block are $B \times N$ words apart in main memory.

Since the cache data area can hold upto $32$ blocks, the frames that map to the same cache block are $32$ frames apart.

Word Size $w = 1$ B
Cache Size $C = 256$ B
Number of Blocks in Cache $B = 32$
Bits to reference a block $b = log_{2}B = 5$
Size of one Block/Frame $= 256$ B $/32 = 8$ B
Words per Block/Frame $N = 8$ B $/1$ B $= 8$
Bits to reference Word within a block $n = log_{2}N = 3$
Primary Memory $P = 64$ KB
Bits to reference Primary Memory $p = log_{2}(\frac{P}{w}) = 16$
Number of Frames in Primary Memory $= 64$ KB $/8$ B $= 8192$
Tag length $p - n - b = 8$ .

Set Associative Mapping

A compromise between the two types of mapping is called set-associative mapping. The $B$ cache blocks are divided into $2^k = K$ partitions each containing $\frac{B}{K}$ blocks. This is called a $K$ -way set-associative mapping.

This mapping divides the main memory into $2^{(b-k)}$ partitions. Each frame can now reside in of the $K$ corresponding cache blocks, known as a set. The tag search is now limited to $K$ tags in the set.

Since the cache data area can hold upto $32$ blocks, the frames that map to the same set are $32$ frames apart.

Word Size $w = 1$ B
Bits to reference Word within a block $n = 3$
Words per Block/Frame $N = 2^3 = 8$
Size of one Block/Frame $= 8 \times 1$ B $= 8$ B
Number of slots in a set $K = 4$
Size of one set $= 4 * 8$ B $= 32$ B
Number of Sets in Cache $= 32$
Bits to reference a set $b - k = log_{2}32 = 5$
Cache Size $C = 32 \times 32$ B $= 1$ KB
Primary Memory $P = 64$ KB
Bits to reference Primary Memory $p = log_{2}(\frac{P}{w}) = 16$
Number of Frames in Primary Memory $= 64$ KB $/8$ B $= 8192$
Tag length $p - n - (b - k) = 8$ .

Observations

$K = 0$ implies direct mapping.

$k = b$ implies fully associative mapping.

A direct-mapped cache is simply a one-way set-associative cache: each cache entry holds one block and each set has one element. A fully associative cache with $m$ entries is simply an $m$ -way set-associative cache; it has one set with $m$ blocks, and an entry can reside in any block within that set.

Increasing the degree of associativity usually decreases the miss rate, with a potential increase in the hit time. While the benefit of going from one-way to two-way set associative is significant, the benefits of further associativity are smaller.

Each increase by a factor of $2$ in associativity doubles the number of blocks per set and halves the number of sets. Each factor of $2$ increase in associativity decreases the size of the index by $1$ bit and increases the size of the tag by $1$ bit.

Total size of cache in blocks = Number of sets $\times$ Associativity

The Beard Sage