Branch Prediction

Module 3: Branch Prediction
ECE 4100/6100: Yalamanchili
Fall 2003
Control Dependencies
Dependencies: structural, data, name (anti, output), control
Control dependencies determine execution order of instructions
– Instructions are control dependent on a branch instruction
Why do we preserve control dependencies?
– Correctness
– Exception behavior and dataflow
Goal: Maximize utilization of instruction fetch bandwidth → branch prediction
– How do we improve prediction accuracy and reduce penalties?
Impact of Branches
[Figure: pipeline with stages IF, ID, instruction issue, parallel execution units (EX INT, EX FP, EX BR), MEM, WB]
In general pipelines, branch penalties occur for two reasons
– Branch target address generation
  – PC-relative address generation “can” occur after instruction fetch
– Branch condition resolution
  – Unconditional branches do not incur this penalty
  – In what cycle is the condition known?
    – ID? → testing the contents of a register, as in BNEZ R1, Loop
    – EX? → testing equality, as in BNE R1, R2, Loop
Branch Prediction


Dominated by history-based predictors: is past behavior a good
indicator of future behavior?
Design issues
– How is history maintained?
– How are decisions made based on this history?

Significant analysis of the behavior of benchmarks is used in the design
of predictors
Dynamic Branch Prediction Strategies
[Figure: an n-bit shift register (bits n-1 … 0) records the last branch behavior, i.e., taken or not taken, and produces a prediction. How do we capture this history? How do we predict?]
From Ref: “Modern Processor Design: Fundamentals of Superscalar Processors,” J. Shen and M. Lipasti



Use history of behavior, taken vs. not-taken, by a single branch
Predict the next decision that will be taken by this branch, i.e.,
taken vs. not-taken
A general model (shown above) captures history and uses it to
make predictions
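The general model can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `HistoryRegister`, its method names, and the majority-vote prediction rule are hypothetical choices made to keep the sketch runnable, not the scheme from the cited reference.

```python
class HistoryRegister:
    """n-bit shift register holding the most recent outcomes of one branch."""

    def __init__(self, n):
        self.n = n
        self.bits = 0  # bit 0 is the most recent outcome (1 = taken)

    def record(self, taken):
        # Shift the history left by one and insert the newest outcome at bit 0.
        mask = (1 << self.n) - 1
        self.bits = ((self.bits << 1) | int(taken)) & mask

    def predict(self):
        # Assumed rule (hypothetical): predict taken when at least half of
        # the n recorded outcomes were taken.
        return bin(self.bits).count("1") * 2 >= self.n


h = HistoryRegister(4)
for outcome in [True, True, False, True]:
    h.record(outcome)
prediction = h.predict()
```

Any function of the history bits could replace the majority vote; the later slides use the history to select saturating counters instead.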
Predicting the Outcome of a Single Branch
[Figure: 1-bit predictor table, indexed using the LSBs of the branch address; change to a 2-bit predictor; generalize to an n-bit predictor]
n-bit predictors
– Prediction bit addressed by the k LSBs of the address of the branch instruction
– Prediction bit set by an n-bit history: 2-bit most common
– Useful when the branch address is known before the branch condition is known, so
as to support pre-fetching
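A minimal sketch of such a table, assuming the n-bit history is kept as a saturating counter (the usual reading of a "2-bit predictor"). The names `CounterPredictor` and `count_mispredictions`, the initial counter value of 0, and the use of raw address LSBs without dropping byte-offset bits are assumptions made for illustration.

```python
class CounterPredictor:
    """Table of n-bit saturating counters indexed by the k LSBs of the branch address."""

    def __init__(self, k, n=2):
        self.index_mask = (1 << k) - 1
        self.max_count = (1 << n) - 1
        self.threshold = 1 << (n - 1)   # counter >= threshold predicts taken
        self.table = [0] * (1 << k)     # assumed start: every counter at 0

    def predict(self, pc):
        return self.table[pc & self.index_mask] >= self.threshold

    def update(self, pc, taken):
        i = pc & self.index_mask
        if taken:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = max(self.table[i] - 1, 0)


def count_mispredictions(predictor, pc, outcomes):
    """Predict, compare with the real outcome, then train the predictor."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# A loop branch taken three times and then not taken, executed three times over.
loop_outcomes = [True, True, True, False] * 3
one_bit = count_mispredictions(CounterPredictor(k=4, n=1), 0x40, loop_outcomes)
two_bit = count_mispredictions(CounterPredictor(k=4, n=2), 0x40, loop_outcomes)
```

In steady state the 1-bit table mispredicts twice per loop execution (at loop exit and again on re-entry), while the 2-bit table mispredicts only at loop exit.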
Performance parameters: prediction accuracy, penalties, branch frequency
Example – how does this work in the pipeline? Impact on CPI.
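One simple way to relate the three performance parameters to CPI is sketched below. The formula assumes that correctly predicted branches cost no extra cycles and that every misprediction costs the same fixed penalty; the function name and the numbers in the usage line are hypothetical.

```python
def effective_cpi(base_cpi, branch_frequency, prediction_accuracy, misprediction_penalty):
    """CPI including stall cycles caused by mispredicted branches."""
    mispredictions_per_instruction = branch_frequency * (1.0 - prediction_accuracy)
    return base_cpi + mispredictions_per_instruction * misprediction_penalty


# Hypothetical workload: 20% branches, 90% accuracy, 3-cycle penalty.
cpi = effective_cpi(base_cpi=1.0, branch_frequency=0.2,
                    prediction_accuracy=0.9, misprediction_penalty=3)
```

With these assumed values the stall term is 0.2 × 0.1 × 3 = 0.06 cycles per instruction, so the CPI rises from 1.0 to about 1.06.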
Correlating Predictions Across Multiple Branches
[Figure: correlating across two successive branches]


Instead of having a predictor for a single branch, have a predictor for
the most recent history of branch decisions
For each branch history sequence, use an n-bit predictor
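A sketch of this idea, assuming an m-bit global history register that selects one of 2^m tables of n-bit saturating counters, each indexed by the k LSBs of the branch address. The class name `CorrelatingPredictor` and the parameter names are hypothetical.

```python
class CorrelatingPredictor:
    """The last m branch outcomes select which table of n-bit counters to use."""

    def __init__(self, m, k, n=2):
        self.history = 0                      # most recent outcome in bit 0
        self.history_mask = (1 << m) - 1
        self.index_mask = (1 << k) - 1
        self.max_count = (1 << n) - 1
        self.threshold = 1 << (n - 1)
        # One table of 2**k counters for each of the 2**m history patterns.
        self.tables = [[0] * (1 << k) for _ in range(1 << m)]

    def predict(self, pc):
        return self.tables[self.history][pc & self.index_mask] >= self.threshold

    def update(self, pc, taken):
        table = self.tables[self.history]
        i = pc & self.index_mask
        if taken:
            table[i] = min(table[i] + 1, self.max_count)
        else:
            table[i] = max(table[i] - 1, 0)
        # Shift the new outcome into the global history.
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


# A branch that strictly alternates between taken and not taken.
predictor = CorrelatingPredictor(m=1, k=4)
mispredictions = 0
for taken in [True, False] * 8:
    if predictor.predict(0x40) != taken:
        mispredictions += 1
    predictor.update(0x40, taken)
```

After a short warm-up the alternating branch is predicted correctly every time, because the counter used after a taken outcome differs from the one used after a not-taken outcome. A lone 2-bit saturating counter initialised to zero would keep predicting not taken on this pattern.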
Performance Comparison

Size and resolution of predictors are established empirically
Multi-level Predictors

Use multiple predictors and choose between them
– Employ predictors based on local and global information: state of
the art
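Choosing between predictors can be sketched as follows. This is a simplified illustration, not any particular processor's design: the "local" component here is just a per-address 2-bit counter, the "global" component is indexed only by an m-bit global history, and the chooser is a 2-bit counter per branch that is trained only when the two components disagree. All names are hypothetical.

```python
def bump(counter, up, max_count=3):
    """Move a saturating counter up or down by one, staying within [0, max_count]."""
    return min(counter + 1, max_count) if up else max(counter - 1, 0)


class TournamentPredictor:
    def __init__(self, k, m):
        self.index_mask = (1 << k) - 1
        self.history_mask = (1 << m) - 1
        self.history = 0                   # outcomes of the last m branches
        self.local = [0] * (1 << k)        # 2-bit counters indexed by branch address
        self.global_ = [0] * (1 << m)      # 2-bit counters indexed by global history
        self.chooser = [0] * (1 << k)      # >= 2 means "trust the global component"

    def _components(self, pc):
        i = pc & self.index_mask
        return i, self.local[i] >= 2, self.global_[self.history] >= 2

    def predict(self, pc):
        i, local_pred, global_pred = self._components(pc)
        return global_pred if self.chooser[i] >= 2 else local_pred

    def update(self, pc, taken):
        i, local_pred, global_pred = self._components(pc)
        # Train the chooser only when the two components disagree.
        if local_pred != global_pred:
            self.chooser[i] = bump(self.chooser[i], global_pred == taken)
        self.local[i] = bump(self.local[i], taken)
        self.global_[self.history] = bump(self.global_[self.history], taken)
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


# An alternating branch: the per-address counter never learns it, the
# history-indexed counters do, and the chooser migrates to the global side.
predictor = TournamentPredictor(k=4, m=1)
mispredictions = 0
for taken in [True, False] * 8:
    if predictor.predict(0x40) != taken:
        mispredictions += 1
    predictor.update(0x40, taken)
```

The warm-up is longer than for either component alone, because the chooser must first observe the global component winning before it switches over.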

Adaptive, multi-level predictors
– Substantial work throughout the 90’s, starting with the seminal work of
Yeh & Patt (1992)
Misprediction Recovery

What actions must be taken on a misprediction?
– Remove “predicted” instructions
– Start fetching from the correct branch target(s)

What information is necessary to recover from misprediction?
– Address information for the non-predicted branch target
– Identification of those instructions that are “predicted”
  – To be invalidated and prevented from completion
– Association between “predicted” instructions and a specific branch
  – When that branch is mispredicted, only those instructions must
be squashed
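The bookkeeping listed above can be sketched as follows. The sketch assumes a single level of speculation (no nested unresolved branches) and uses hypothetical names throughout: each in-flight instruction carries the id of the branch it was fetched under, and the address of the non-predicted path is saved when the prediction is made.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InFlight:
    name: str
    branch_tag: Optional[int] = None   # id of the unresolved branch it was fetched under
    squashed: bool = False


class SpeculationTracker:
    def __init__(self):
        self.in_flight = []
        self.recovery_pc = {}          # branch id -> address of the non-predicted path

    def predict_branch(self, branch_id, alternate_pc):
        # Save the address needed to restart fetch if the prediction is wrong.
        self.recovery_pc[branch_id] = alternate_pc

    def fetch(self, name, branch_tag=None):
        self.in_flight.append(InFlight(name, branch_tag))

    def resolve(self, branch_id, mispredicted):
        """Return the PC to redirect fetch to, or None if the prediction was right."""
        alternate_pc = self.recovery_pc.pop(branch_id)
        for insn in self.in_flight:
            if insn.branch_tag == branch_id:
                if mispredicted:
                    insn.squashed = True      # invalidated; must not complete
                else:
                    insn.branch_tag = None    # no longer speculative
        return alternate_pc if mispredicted else None


tracker = SpeculationTracker()
tracker.fetch("add")                                   # fetched before the branch
tracker.predict_branch(branch_id=1, alternate_pc=0x200)
tracker.fetch("load", branch_tag=1)                    # fetched along the predicted path
tracker.fetch("mul", branch_tag=1)
redirect = tracker.resolve(branch_id=1, mispredicted=True)
squashed = [insn.name for insn in tracker.in_flight if insn.squashed]
```

Only the instructions tagged with the mispredicted branch are squashed; the earlier `add` is untouched, and fetch restarts at the saved alternate address.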
Branch Target Buffers

Store the branch instruction address (PC) and corresponding
target address in a small associative cache
– Miss on the first access to a branch instruction

Access in parallel with instruction cache
– Hit produces the branch target address
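A minimal software model of a branch target buffer, assuming a fully associative organization with LRU replacement (the slide only says "small associative cache", so both choices are assumptions). The class and method names are hypothetical.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Small cache mapping branch PC -> target address, with LRU replacement."""

    def __init__(self, entries):
        self.entries = entries
        self.table = OrderedDict()

    def lookup(self, pc):
        # Returns the stored target on a hit, or None on a miss.
        if pc in self.table:
            self.table.move_to_end(pc)       # mark as most recently used
            return self.table[pc]
        return None

    def insert(self, pc, target):
        if pc in self.table:
            self.table.move_to_end(pc)
        elif len(self.table) >= self.entries:
            self.table.popitem(last=False)   # evict the least recently used entry
        self.table[pc] = target


btb = BranchTargetBuffer(entries=2)
first = btb.lookup(0x100)      # miss on the first access to this branch
btb.insert(0x100, 0x180)       # filled once the target has been computed
second = btb.lookup(0x100)     # hit produces the branch target address
```

In hardware the lookup happens in the same cycle as the instruction cache access, so a hit supplies the next fetch address before the branch is even decoded.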
Branch Target Buffers: Operation

Couple speculative generation of the branch target address
with branch prediction
– Continue to fetch and resolve branch condition
– Take appropriate action if wrong

Any of the preceding history-based techniques can be used for
branch condition speculation

Example: impact on CPI

Store prediction information, e.g., n-bit predictors, along with
BTB entry
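One possible way to work the CPI example is sketched below. It assumes that a BTB hit with a correct prediction costs nothing, that a BTB miss only hurts when the branch turns out to be taken (fetch fell through), and that both failure cases cost the same fixed penalty. The function name, parameters, and numbers are all hypothetical.

```python
def btb_cpi(base_cpi, branch_frequency, hit_rate, accuracy_on_hit,
            taken_rate_on_miss, penalty):
    """CPI for a pipeline that pairs a BTB with a branch predictor."""
    wrong_on_hit = hit_rate * (1.0 - accuracy_on_hit)        # in BTB, mispredicted
    taken_on_miss = (1.0 - hit_rate) * taken_rate_on_miss    # not in BTB, but taken
    stall_cycles_per_branch = (wrong_on_hit + taken_on_miss) * penalty
    return base_cpi + branch_frequency * stall_cycles_per_branch


# Hypothetical values: 25% branches, 90% BTB hit rate, 90% accuracy on hits,
# 60% of BTB-missing branches taken, 2-cycle penalty.
cpi = btb_cpi(base_cpi=1.0, branch_frequency=0.25, hit_rate=0.9,
              accuracy_on_hit=0.9, taken_rate_on_miss=0.6, penalty=2)
```

With these assumed values each branch stalls for (0.09 + 0.06) × 2 = 0.3 cycles on average, giving a CPI of about 1.075.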
Some Other Techniques

Static prediction techniques
– Opcode-based: offline frequency analysis guides prediction

Static prediction recorded in the branch instruction
– Off-line prediction (Motorola 88110)

Offset-based prediction: a negative target address offset
triggers a branch-taken prediction
– Motivated by behavior of loops (IBM RS/6000)
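The offset-based rule is small enough to state directly in code. The function name and the addresses in the usage lines are hypothetical.

```python
def predict_taken_static(branch_pc, target_pc):
    """Offset-based static prediction: a backward branch is predicted taken."""
    offset = target_pc - branch_pc
    return offset < 0


# Loop-closing branch at 0x120 jumping back to 0x100: predicted taken.
loop_branch = predict_taken_static(branch_pc=0x120, target_pc=0x100)
# Forward branch skipping ahead to 0x140: predicted not taken.
forward_branch = predict_taken_static(branch_pc=0x120, target_pc=0x140)
```

No history is kept, so the prediction for a given branch never changes at run time.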
Concluding Remarks

The challenge in keeping the execution core fed is handling
control flow

Prediction and recovery mechanisms are key to keeping the
pipeline active

Superscalar datapaths increase the pressure for
better, more innovative techniques to keep pace with the
technology-enabled appetite for instruction-level parallelism

What next?