CS2100 Computer Organisation Cache II Road Map: Part II Performance Assembly Language Cache Part II Processor: Datapath Type of Cache Misses Direct-Mapped Cache Processor: Control Set Associative Cache Block Replacement Policy Cache Framework Pipelining Cache Cache II 2 Cache Misses: Classifications Compulsory / Cold Miss • First time a memory block is accessed • Cold fact of life: Not much can be done • Solution: Increase cache block size Conflict Miss • Two or more distinct memory blocks map to the same cache block • Big problem in direct-mapped caches • Solution 1: Increase cache size • Inherent restriction on cache size due to SRAM technology • Solution 2: Set-Associative caches (coming next ..) Capacity Miss • Due to limited cache size CS2100 Cache II 3 Block Size Tradeoff (1/2) Average Access Time = Hit rate x Hit Time + (1-Hit rate) x Miss penalty Larger block size: + Takes advantage of spatial locality -- Larger miss penalty: Takes longer time to fill up the block -- If block size is too big relative to cache size Too few cache block miss rate will go up Miss Rate Miss Penalty Exploits Spatial Locality Average Access Time Fewer blocks: compromises temporal locality Block Size CS2100 Block Size Cache II Block Size 4 Block Size Tradeoff (2/2) Miss rate (%) 10 8 KB 16 KB 64 KB 256 KB 5 0 8 16 32 64 128 256 Block size (bytes) CS2100 Cache II 5 Another way to organize the cache blocks SET ASSOCIATIVE CACHE CS2100 Cache II 6 Set-Associative (SA) Cache Analogy Many book titles start with "T" Too many conflicts! Hmm… how about we give more slots per alphabet, 2 books start with "A", 2 books start with "B", …. etc? CS2100 Cache II 7 Set Associative (SA) Cache N-way Set Associative Cache A memory block can be placed in a fixed number of locations ( N > 1) in the cache Key Idea: Cache consists of a number of sets: CS2100 Each set contains N cache blocks Each memory block maps to a unique cache set Within the set, a memory block can be placed in any element of the set Cache II 8 SA Cache Structure Valid Tag Data Valid Tag Data Set Index 00 01 10 11 Set 2-way Set Associative Cache An example of 2-way set associative cache Each set has two cache blocks A memory block maps to an unique set In the set, the memory block can be placed in either of the cache blocks Need to search both to look for the memory block CS2100 Cache II 9 Set-Associative Cache: Mapping Memory Address 31 N N-1 Block Number 0 Offset Cache Block size = 2N bytes Cache Set Index = (BlockNumber) modulo (NumberOfCacheSets) N+M-1 31 Tag N N-1 Set Index 0 Offset Cache Block size = 2N bytes Number of cache sets = 2M Offset = N bits Set Index = M bits Tag = 32 – (N + M) bits CS2100 Cache II Observation: It is essentially unchanged from the direct-mapping formula 10 SA Cache Mapping: Example Memory 4GB 1 Block = 4 bytes Memory Address N N-1 31 Block Number Offset Offset, N = 2 bits Block Number = 32 – 2 = 30 bits Check: Number of Blocks = 230 N+M-1 31 Tag Cache 4 KB 0 N N-1 Set Index 0 Offset Number of Cache Blocks = 4KB / 4bytes = 1024 = 210 4-way associative, number of sets = 1024 / 4 = 256 = 28 Set Index, M = 8 bits Cache Tag = 32 – 8 – 2 = 22 bits CS2100 Cache II 11 Set Associative Cache Circuitry 31 30 . . . Tag Tag 22 Idx 8 V Tag Data 0 1 2 . . . 254 255 22 = 22 = 22 = B C A Hit CS2100 V Tag Data 0 1 2 . . . 254 255 0 1 2 . . . 254 255 A 4-way 4-KB cache: 1-word (4-byte) blocks Os V Tag Data 22 Note the simultaneous "search" on all elements of a set 0 Index V Tag Data 0 1 2 . . . 254 255 11 10 9 . . . 2 1 Cache II BC D = D 4 x 1 Select 32 Data 12 Advantage of Associativity (1/3) Block Number (Not Address) 0 Cache Index 00 1 01 2 10 3 11 Memory 4 Cache 5 6 7 8 9 10 11 12 13 14 15 CS2100 Example: Given this memory access sequence: 0 4 0 4 0 4 0 4 Result: Cold Miss = 2 (First two accesses) Conflict Miss = 6 (The rest of the accesses) Cache II 13 Advantage of Associativity (2/3) Block Number (Not Address) 0 Set Index 0 1 1 2 Cache 3 Memory 4 5 6 7 8 9 10 11 12 13 14 15 CS2100 Example: Given this memory access sequence: 0 4 0 4 0 4 0 4 Result: Cold Miss = 2 (First two accesses) Conflict Miss = None Cache II 14 Advantage of Associativity (3/3) Rule of Thumb: A direct-mapped cache of size N has about the same miss rate as a 2-way set associative cache of size N / 2 CS2100 Cache II 15 SA Cache Example: Setup Given: Memory access sequence: 4, 0, 8, 36, 0 2-way set-associative cache with a total of four 8byte blocks total of 2 sets Indicate hit/miss for each access 31 3 2 4 Tag Set Index 0 Offset Offset, N = 3 bits Block Number = 32 – 3 = 29 bits 2-way associative, number of sets= 2 = 21 Set Index, M = 1 bits Cache Tag = 32 – 3 – 1 = 28 bits CS2100 Cache II 16 Example: LOAD #1 - 1 Tag Index Offset Load from 4 000000000000000000000000000 0 100 Check: Both blocks in Set 0 are invalid [ Cold Miss ] Block 0 Block 1 Set Idx Valid 0 0 0 1 0 0 CS2100 Tag W0 W1 Cache II Valid Tag W0 W1 17 Example: LOAD #1 - 2 Tag Index Offset Load from 4 000000000000000000000000000 0 100 Result: Load from memory and place in Set 0-Block 0 Block 0 Block 1 Set Idx Valid Tag W0 W1 Valid 0 1 0 M[0] M[4] 0 1 0 CS2100 Tag W0 W1 0 Cache II 18 Example: LOAD #2 Tag Index Offset Load from 0 000000000000000000000000000 0 000 Result: [Valid and Tag match] in Set 0-Block 0 [ Spatial Locality ] Block 0 Block 1 Set Idx Valid Tag W0 W1 Valid 0 1 0 M[0] M[4] 0 1 0 CS2100 Tag W0 W1 0 Cache II 19 Example: LOAD #3 - 1 Tag Index Offset Load from 8 000000000000000000000000000 1 000 Check: Both blocks in Set 1 are invalid [ Cold Miss ] Block 0 Block 1 Set Idx Valid Tag W0 W1 Valid 0 1 0 M[0] M[4] 0 1 0 CS2100 Tag W0 W1 0 Cache II 20 Example: LOAD #3 - 2 Tag Index Offset Load from 8 000000000000000000000000000 1 000 Result: Load from memory and place in Set 1-Block 0 Block 0 Block 1 Set Idx Valid Tag W0 W1 Valid 0 1 0 M[0] M[4] 0 1 1 0 M[8] M[12] CS2100 Cache II Tag W0 W1 0 21 Example: LOAD #4 - 1 Tag Index Offset Load from 36 000000000000000000000000010 0 100 Check: [Valid but tag mismatch] Set 0-Block 0 [Invalid] Set 0-Block1 [ Cold Miss ] Block 0 Block 1 Set Idx Valid Tag W0 W1 Valid 0 1 0 M[0] M[4] 0 1 1 0 M[8] M[12] CS2100 Cache II Tag W0 W1 0 22 Example: LOAD #4 - 2 Tag Index Offset Load from 36 000000000000000000000000010 0 100 Result: Load from memory and place in Set 0-Block 1 Block 0 Set Valid Tag Idx Block 1 W0 W1 Valid Tag M[4] 1 2 0 1 0 M[0] 1 1 0 M[8] M[12] CS2100 Cache II W0 W1 M[32] M[36] 0 23 Example: LOAD #5 Tag Load from 0 Index Offset 000000000000000000000000000 0 000 Check: [Valid and tag match] Set 0-Block 0 [Valid but tag mismatch] Set 0-Block1 [ Temporal Locality ] Block 0 Set Valid Tag Idx Block 1 W0 W1 Valid Tag M[4] 1 2 0 1 0 M[0] 1 1 0 M[8] M[12] CS2100 Cache II W0 W1 M[32] M[36] 0 24 Who should I kick out next…..? BLOCK REPLACEMENT POLICY CS2100 Cache II 25 Block Replacement Policy (1/3) Set Associative or Fully Associative Cache: Can choose where to place a memory block Potentially replacing another cache block if full Need block replacement policy Least How: For cache hit, record the cache block that was accessed CS2100 Recently Used (LRU) When replacing a block, choose one which has not been accessed for the longest time Why: Temporal locality Cache II 26 Block Replacement Policy (2/3) Least Recently Used policy in action: 4- way Cache Memory accesses: 0 4 8 12 4 16 12 0 4 LRU Access 4 CS2100 0 4 8 12 Access 16 x0 8 12 4 Access 12 8 12 4 16 Hit Access 0 x8 4 16 12 Miss (Evict Block 8) Access 4 4 16 12 0 16 12 0 4 Cache II Hit Miss (Evict Block 0) Hit 27 Block Replacement Policy (3/3) Drawback for LRU Hard to keep track if there are many choices Other replacement policies: First in first out (FIFO) Second chance variant Random replacement (RR) Least frequently used (LFU) CS2100 Cache II 28 Cache Organizations: Summary CS2100 Cache II 29 Cache Framework (1/2) Block Placement: Where can a block be placed in cache? Direct Mapped: • Only one block defined by index N-way SetAssociative: • Any one of the N blocks within the set defined by index Fully Associative: • Any cache block Block Identification: How is a block found if it is in the cache? Direct Mapped: • Tag match with only one block CS2100 N-way Set Associative: Fully Associative: • Tag match for all the blocks within the set • Tag match for all the blocks within the cache Cache II 30 Cache Framework (2/2) Block Replacement: Which block should be replaced on a cache miss? Direct Mapped: • No Choice n-way SetAssociative: • Based on replacement policy Fully Associative: • Based on replacement policy Write Strategy: What happens on a write? Write Policy: Write-through vs. write-back Write Miss Policy: Write allocate vs. write no allocate CS2100 Cache II 31 Reading Assignment Large and Fast: Exploiting Memory Hierarchy CS2100 Chapter 7 sections 7.3 (3rd edition) Chapter 5 sections 5.3 (4th edition) Cache II 32 END CS2100 Cache II 33
© Copyright 2026 Paperzz