Evaluation of the Gini-index for Studying Branch Prediction Features
Veerle Desmet, Lieven Eeckhout, Koen De Bosschere

A simple prediction example
[Diagram: features (outlook, t°, windy, season, ...) feed a prediction mechanism that, using past observations, produces an umbrella prediction; goal = prediction accuracy of 100%]

A simple prediction example
• Daily prediction
• Binary prediction: yes or no
• Outcome in the evening
• Prediction strategies:
  – No need in summer, yes otherwise
    • Easy, not very accurate
  – Based on humidity and temperature
    • More complex, very accurate

Predicting
• How to improve prediction accuracy?
• Shortcomings of existing models?
  – Feature set
  – Prediction mechanism
  – Implementation limits
  – ...
• This talk: evaluation of prediction features for branch prediction

Program execution
• Phases during instruction execution:
  – Fetch = read the next instruction
  – Decode = analyze the type and read the operands
  – Execute = perform the computation
  – Write Back = write the result
[Diagram: the addition R1=R2+R3 passes through the four stages; the operands 4 and 3 are read, the sum is computed, and R1 finally contains 7]

Pipelined architectures
Parallel versus sequential:
[Diagram: the instructions R1=R2+R3, R5=R2+1, R4=R3-1, R7=2*R1, R5=R6 occupy successive pipeline stages (Fetch, Decode, Execute, Write Back) in consecutive cycles]
• A constant flow of instructions becomes possible
• Faster applications
• Limitation due to branches

Branches
• Branches determine the program flow or execution path
• They introduce 2 bubbles, affecting pipeline throughput
[Diagram: control-flow graph — after R1=R2+R3, R5=R6, R5=R2+1, the branch "test R1=0" continues with R7=2*R1 (no) or with R2=R2-1, R7=0 (yes); in the pipeline, fetching stalls for two cycles (?R1=0) until the test is executed]

Solution
• 1 out of 8 instructions is a branch
• Waiting for the outcome of branches seriously limits the amount of parallelism
• Increasing number of pipeline stages
  – Pentium 4: up to 20 stages
⇒ Predict the outcome of the branch

Branch prediction
• Fetch those instructions that are likely to be executed
• A correct prediction eliminates the bubbles
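The throughput figures above (1 branch in 8 instructions, 2 bubbles per unresolved branch) can be turned into a back-of-the-envelope cost model. The sketch below is illustrative rather than taken from the talk: it assumes an ideal CPI of 1 and treats the 2-bubble figure from the slides as the full penalty per mispredicted (or unpredicted) branch.

```python
def effective_cpi(branch_fraction, miss_rate, penalty, base_cpi=1.0):
    """Cycles per instruction when each mispredicted branch costs `penalty` bubbles."""
    return base_cpi + branch_fraction * miss_rate * penalty

# Figures from the slides: 1 in 8 instructions is a branch,
# and an unresolved branch introduces 2 bubbles.
no_prediction = effective_cpi(1/8, miss_rate=1.0, penalty=2)    # stall on every branch
with_predictor = effective_cpi(1/8, miss_rate=0.05, penalty=2)  # 95% prediction accuracy

print(no_prediction)   # 1.25
print(with_predictor)  # 1.0125
```

Even with only a 2-cycle penalty, a 95%-accurate predictor recovers most of the lost throughput; with the 20-stage Pentium 4 pipeline mentioned above, the penalty, and hence the payoff, is far larger.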
[Diagram: with a correct prediction for "test R1=0", the instructions on the predicted path (R2=R2-1, R7=2*R1, ...) enter the pipeline immediately and no bubbles appear]

Branch prediction
• A prediction for each branch execution
• Binary prediction: taken or not-taken
• Outcome known after the test is executed
• Prediction strategies:
  – Many predictors in the literature
  – Static versus dynamic

Static branch prediction
• BTFNT: Backward Taken, Forward Not Taken
  – Loops (e.g. for, while)
  – Analogy: in summer, no need for an umbrella
• Based on the type of test in the branch
  – "Branch if equal" is mostly not-taken
  – Analogy: on Sunday, no need for an umbrella
• Easy; the prediction is fixed at compile time
• Prediction accuracy: about 75%

Dynamic branch prediction
• Bimodal
• Global
• Gshare
• Local
Simulations:
• SimpleScalar/Alpha
• SPEC2000 integer benchmarks
• 250M branches

Bimodal branch predictor
• The branch address indexes a table of saturating counters (e.g. counter 3 or 2)
• The counter gives the prediction (e.g. taken) and is updated with the outcome (e.g. taken)
• Analogy: averaging the outcomes from previous years

Global branch predictor
• The global history (e.g. 0111 1111) indexes the table of saturating counters (e.g. 3 or 2)
• The counter gives the prediction (e.g. taken) and is updated with the outcome (e.g. taken)
• Analogy: averaging the outcomes of the last days

Gshare branch predictor (AMD K6)
• The branch address XORed with the global history (e.g. 1010) indexes the saturating counters (e.g. 2)
• The counter gives the prediction (e.g. taken) and is updated with the outcome

Local branch predictor
• The branch address indexes a table of local histories (e.g. 1111); the local history in turn indexes the saturating counters (e.g. 2)
• Analogy: record the day's outcome over previous years and average over the same-day histories

Accuracy versus storage
[Chart: prediction accuracy (75–100%) versus predictor size (1 byte to 100 KB, logarithmic) for the gshare, local, bimodal, and global predictors]

Branch prediction strategies
• All use a saturating-counter mechanism
• All use tables of limited size
  – problem with so-called aliasing
• Different prediction features
• Accuracies up to 95%
• Further improvement?
• Predictive power of the features?

Feature selection
[Diagram: which features should feed the prediction mechanism?]
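The bimodal and gshare schemes described above differ only in how the counter table is indexed, which a few lines of Python can make concrete. This is a minimal sketch, not the hardware: the table size, the 2-bit counter width, the weakly-not-taken initial value and the history length are illustrative assumptions.

```python
class CounterPredictor:
    """Table of 2-bit saturating counters, indexed bimodal- or gshare-style."""

    def __init__(self, index_bits=10, use_gshare=False):
        self.mask = (1 << index_bits) - 1
        self.table = [1] * (1 << index_bits)  # 2-bit counters, weakly not-taken
        self.use_gshare = use_gshare
        self.history = 0                      # global branch history register

    def _index(self, pc):
        # gshare XORs the branch address with the global history;
        # bimodal uses the branch address alone.
        idx = pc ^ self.history if self.use_gshare else pc
        return idx & self.mask

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2  # counter 2 or 3 means "taken"

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)  # saturate at 3
        else:
            self.table[i] = max(0, self.table[i] - 1)  # saturate at 0
        self.history = ((self.history << 1) | int(taken)) & self.mask

bimodal = CounterPredictor()
for _ in range(3):
    bimodal.update(0x40, True)   # train on an always-taken branch
print(bimodal.predict(0x40))     # True: the counter has saturated
```

The saturating counter gives the "averaging" behaviour the slides describe: one atypical outcome nudges the counter but does not immediately flip the prediction.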
Feature selection
• Which features are relevant?
• Fewer features
  – require less storage
  – allow faster prediction

Systematic feature evaluation
• Feature = input to the predictor
• Power of the features
  – predictor size not fixed
  – prediction strategy not fixed
• Decision trees:
  – select a feature
  – split the observations
  – recursive algorithm
  – easily understandable

Decision tree construction
  outlook   t°    windy | umbrella
  sunny     high  no    | no
  sunny     low   yes   | yes
  overcast  high  no    | no
  overcast  low   no    | no
  overcast  high  yes   | yes
  overcast  low   yes   | yes
  rain      low   no    | yes
  rain      high  yes   | yes
[Diagram: the resulting tree first splits on windy (yes → YES); for windy = no it splits on outlook, separating rain (→ YES) from sunny and overcast (→ NO)]

Gini-index
Metric for the partition purity of a data set S:

  Gini(S) = 1 − p0² − p1²

where pi is the relative frequency of class i in S.
For binary prediction: minimum 0, maximum 0.5.
The higher the Gini-index, the more difficult the set is to predict.

Finding good split points
• If the data set S is split into two subsets S0 and S1 with sizes N0 and N1 (N = N0 + N1):

  Ginisplit(S) = (N0/N)·Gini(S0) + (N1/N)·Gini(S1)

• The feature with the lowest Ginisplit is chosen
• Extensible to non-binary features
• We look for features with a low Ginisplit-index, i.e. features with good predictive power

Ginisplit-index of individual feature bits
[Chart: Ginisplit-index (0 to 0.5) of individual feature bits — the dynamic features global history, branch address, gshare-index and local history, and the static features target direction, branch type, ending type and successor basic block]

Individual features
• Local history bits are very good
  – a perfect local history uses the branch address
• Static features are powerful
  – non-binary
  – except target direction
  – known at compile time
• Looking for good feature combinations...
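Applying the two formulas above to the eight weather observations reproduces the tree's choice of windy as the first split. A small sketch (the dictionary keys such as `temp` are just illustrative names):

```python
from collections import Counter

def gini(labels):
    """Gini(S) = 1 - sum_i p_i^2, the partition purity of a label set."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(rows, labels, feature):
    """Size-weighted Gini of the subsets obtained by splitting on `feature`
    (the slide's Ginisplit, extended to non-binary features)."""
    n = len(rows)
    total = 0.0
    for value in {r[feature] for r in rows}:
        subset = [lab for r, lab in zip(rows, labels) if r[feature] == value]
        total += len(subset) / n * gini(subset)
    return total

# The eight observations from the decision-tree slide.
rows = [
    {"outlook": "sunny",    "temp": "high", "windy": "no"},
    {"outlook": "sunny",    "temp": "low",  "windy": "yes"},
    {"outlook": "overcast", "temp": "high", "windy": "no"},
    {"outlook": "overcast", "temp": "low",  "windy": "no"},
    {"outlook": "overcast", "temp": "high", "windy": "yes"},
    {"outlook": "overcast", "temp": "low",  "windy": "yes"},
    {"outlook": "rain",     "temp": "low",  "windy": "no"},
    {"outlook": "rain",     "temp": "high", "windy": "yes"},
]
labels = ["no", "yes", "no", "no", "yes", "yes", "yes", "yes"]

print(gini(labels))  # 0.46875: the unsplit set is fairly impure
for f in ("outlook", "temp", "windy"):
    print(f, gini_split(rows, labels, f))
# outlook 0.375, temp 0.4375, windy 0.1875: windy has the lowest
# Ginisplit, so the tree splits on it first
```

Recursing on the windy = no subset the same way yields the outlook split of the diagram.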
Ginisplit-index of features as used in predictors
[Chart: Ginisplit-index (0 to 0.5) versus feature length (0 to 20 bits) for the gshare-index, branch address, global history and local history]

Features as used in predictors
• Static features are better at small lengths
• Longer features do better
• A few local history bits are enough
• Same behaviour as the accuracy curves
  – a low Gini-index implies a high accuracy
• Independent of the predictor size
• Independent of the prediction strategy

Remark
• Limitation of decision trees: outliers
  – majority vote
  – clean data
[Table: three identical observations (sunny, high, no) with outcomes no, yes, no — the single "yes" is an outlier handled by majority vote]
• Keep the implementation in mind

Conclusion
• Modern microprocessors need accurate branch prediction
• Towards systematic predictor development
  – selecting features
  – predictive power of features
• The Gini-index is useful for studying branch prediction features
  – without fixing any predictor aspect

Thanks for Listening
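The per-feature curves in the talk come from measuring Ginisplit over real branch traces; the same measurement can be sketched on a synthetic trace. Everything here is illustrative and not from the talk: the alternating branch, the trace length, and the noise feature are made up to show the two extremes of the metric.

```python
import random

def gini(outcomes):
    """Gini(S) = 1 - p0^2 - p1^2 for a list of 0/1 outcomes."""
    p1 = sum(outcomes) / len(outcomes)
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def gini_split(feature_bits, outcomes):
    """Size-weighted Gini of the subsets induced by one binary feature."""
    n = len(outcomes)
    total = 0.0
    for v in (0, 1):
        subset = [o for f, o in zip(feature_bits, outcomes) if f == v]
        if subset:
            total += len(subset) / n * gini(subset)
    return total

random.seed(0)
outcomes = [i % 2 for i in range(1000)]           # a branch alternating taken/not-taken
last_outcome = [0] + outcomes[:-1]                # 1 bit of local history
noise = [random.randint(0, 1) for _ in outcomes]  # a bit with no predictive power

print(gini_split(last_outcome, outcomes))  # close to 0: strong feature
print(gini_split(noise, outcomes))         # close to 0.5: useless feature
```

A single local-history bit drives the Ginisplit of this branch almost to zero, the behaviour behind the observation that "a few local history bits are enough".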