Insertion Policy Selection Using Decision Tree Analysis
Samira Khan, Daniel A. Jiménez
University of Texas at San Antonio

Motivation
  - L1 and L2 filter the cache accesses
  - The Last Level Cache (LLC) does not have much temporal locality
  - A large fraction of the blocks brought into the cache are never accessed again (zero reuse lines)
  - For the SPEC CPU 2006 benchmarks, on average 60.18% of lines are never accessed again while they are in the LLC

Motivation (continued)
  - No cache bursts in the LLC
  - Only a small portion of hits occur near the MRU position

Goal
  - Get rid of zero reuse lines as early as possible
  - Keep lines in the cache long enough to get their first hit
  - Minimal change to the LRU policy
  - Use as little space as possible

Insertion Position Selection
  - Find the optimal insertion position
    - Zero reuse lines will get evicted earlier
    - Most of the non zero reuse lines should stay in the cache until their first hit
    - This gets rid of zero reuse lines and makes space for useful lines
  - Use Decision Tree Analysis via set dueling to find the position
    - This allows choosing among the insertion positions to set duel

  [Figure: recency stack with five candidate insertion positions: MRU pos, nearMRU pos, middle pos, nearLRU pos, LRU pos. For 400.perlbench, 66.67% of lines brought into the cache are never accessed again and 73.03% of hits occur between the MRU and middle positions.]

  [Figure: decision tree]
  Set dueling between middle and MRU pos
    MRU pos winner: set dueling between nearMRU and MRU pos
      nearMRU pos winner: insert pos nearMRU
      MRU pos winner: insert pos MRU
    Middle pos winner: set dueling between LRU and middle pos
      LRU pos winner: insert pos LRU
      Middle pos winner: set dueling between nearLRU and middle pos
        nearLRU pos winner: insert pos nearLRU
        Middle pos winner: insert pos middle

Adaptive Multi Set Dueling
  - Current multi set dueling
    - Has one leader set for each insertion policy
    - Partial follower sets duplicate the winning leader set's policy
    - The policies set duel in a tournament manner
    - Not scalable
    - Leader sets running the losing policies hurt performance
  - Adaptive multi set dueling
    - Leader sets adaptively choose the policy
    - No need for partial follower sets
    - Scalable

Result

Space Overhead
  Space overhead for a 1 MB 16-way set associative LLC

  Parameter                      Storage        Total storage
  LRU overhead per line          4 bits         1024 * 16 * 4 bits = 8 KB
  Set type per set               2 bits         1024 * 2 = 2048 bits
  Two counters (psel1 & psel2)   10 bits each   20 bits
  One counter (switched)         1 bit          1 bit
  Total                                         8 KB + 2069 bits

Conclusion
  - Insertion Position Selection using Decision Tree Analysis
    - Requires minimal change to LRU
    - Needs only 2069 bits of extra space
    - Chooses the best insertion position adaptively
    - Gets rid of zero reuse lines without any storage hungry predictor
    - Makes multi set dueling scalable

Questions

Zero Reuse Lines in SPEC CPU 2006
  [Figure]

Adaptive Multi Set Dueling
  [Figure: leader sets in the current multi set dueling scheme (policies pa, pb, pc, pd, pe, pf, pg, ph; partial follower sets φab, φcd, φef, φgh; counters pselab, pselcd, pselef, pselgh, psel1, psel2) compared with leader sets in the adaptive multi set dueling scheme (pa, pb, pα across all sets in the LLC; counters psel1, psel2). Counter update: +1 if pb wins, -1 if pa wins.]

Result
  [Figure; legend: MRU, nearMRU, middle, nearLRU, LRU, psel1, psel2]
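
Sketch: walking the decision tree

The decision tree on the Insertion Position Selection slide can be read as a short chain of pairwise duels. The following Python sketch is only an illustration under stated assumptions, not the authors' implementation: the function names, the exact recency-stack index assigned to each named position, and the use of leader-set miss counts as the duel outcome are all hypothetical. In hardware the duels would be resolved over time by the psel counters rather than by a direct comparison.

```python
# Hypothetical sketch; none of these names come from the slides.
ASSOC = 16  # ways per set, matching the 1 MB 16-way LLC on the Space Overhead slide

# Assumed recency-stack indices (0 = MRU, ASSOC - 1 = LRU). The slides only
# name the five positions; the exact indices here are an assumption.
POSITIONS = {
    "MRU": 0,
    "nearMRU": ASSOC // 4,
    "middle": ASSOC // 2,
    "nearLRU": 3 * ASSOC // 4,
    "LRU": ASSOC - 1,
}


def select_insertion_position(duel):
    """Walk the decision tree; duel(a, b) returns the name of the winner."""
    # Root: middle vs. MRU.
    if duel("middle", "MRU") == "MRU":
        # MRU side of the tree: nearMRU vs. MRU decides the final position.
        return duel("nearMRU", "MRU")
    # Middle side of the tree: LRU vs. middle.
    if duel("LRU", "middle") == "LRU":
        return "LRU"
    # Middle won again: nearLRU vs. middle decides the final position.
    return duel("nearLRU", "middle")


def make_duel(misses):
    """Stand-in for set dueling: the position with fewer leader-set misses wins.

    Ties go to the second argument.
    """
    def duel(a, b):
        return a if misses[a] < misses[b] else b
    return duel


def insert_line(stack, tag, position):
    """Insert tag into a recency stack (index 0 = MRU).

    Evicts and returns the LRU line if the set is full, else returns None.
    """
    victim = stack.pop() if len(stack) >= ASSOC else None
    stack.insert(min(POSITIONS[position], len(stack)), tag)
    return victim
```

With made-up miss counts such as `{"MRU": 90, "nearMRU": 80, "middle": 60, "nearLRU": 50, "LRU": 70}`, `select_insertion_position(make_duel(...))` takes the middle branch twice and ends at nearLRU. At most three duels are needed to pick among five positions, which is the property the slides rely on to keep the counter storage small.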