COSC6385 Advanced Computer Architecture
Lecture 12. ROB and Precise Interrupts
Instructor: Weidong Shi (Larry), PhD
Computer Science Department, University of Houston

Precise Interrupt

Out-of-Order Execution
• We're now executing instructions in data-flow order
  - Great! More performance
• But the outside world can't know about this
  - Must maintain the illusion of sequentiality

Issue with Imprecise Interrupt
Left side (data page fault on the load):
    lw  r5, 8(r10)
    add r10, r9, r8
    add r12, r10, r7
Right side (instruction page fault; the code straddles the end of non-resident page X and the start of resident page X+1):
    L1: add r3, r1, r2
        add r4, r1, r4
        add r2, r4, r4
• add instructions take one cycle
• E.g.,
  - Load (left side) induces a "data page fault"
  - Add (right side) induces an "instruction page fault"
• If out-of-order completion is allowed
  - r10, r12 (or r2, r4) ... will be modified
  - Wrong values will be used by the re-issued instruction (e.g., the re-issued load reads the new r10)
• Interrupt classes
  - Program interrupts (exceptions or traps)
  - External interrupts (asynchronous)

Precise Interrupts
• To reflect a sequential architecture model
  - Serially correct (think of a single-issue, non-pipelined processor)
• Keep "precise state" of an execution
  - All instructions before the interrupted instruction must be completed
  - The state should appear as if no instruction after the interrupted instruction has issued
  - The interrupted PC should be presented to the interrupt handler (restartable)
• Similar to branch misprediction handling
• Out-of-order execution makes the ordering hard
  - Undo what comes after an interrupt

Why Support Precise Interrupts
• Need to maintain a precise state (for recovery)
• Software debugging
• I/O or timer interrupts
• Virtual memory (page fault)
• Instruction emulation
• Virtual machines

Support Precise Interrupt
• Buffer results
• Can reconstruct the scenario (state) as sequential execution
• Restart from saved PC with saved PC state

Reorder Buffer (ROB) [Smith & Pleszkun '85, '88]
• Architecture Register File (ARF) keeps the "in-order state"
• Reorder Buffer (ROB)
  - A circular buffer
  - Contains all in-flight instructions
  - Buffers the "lookahead state"
  - In-order allocation/deallocation with head/tail pointers
• When an exception occurs
  - Halt instruction issue
  - Revert to the in-order state using the RF and discard ROB results
• Also used for branch misprediction recovery
• Pentium Pro/II/III integrates the physical register file within the ROB
• Pentium 4 decouples the ROB and the physical register file

Reorder Buffer (with physical registers)
[Figure: ROB drawn as a circular buffer. Each entry holds V, Spec?, Done?, PC, Exp event, RegDst, and Data (physical register). Head points to the oldest instruction; Tail points to the next entry to be allocated.]
• ROB in Sandy Bridge: 168 entries

Handling Precise Interrupts
[Figure sequence: ROB contents (V, Spec?, Done?, PC, Exp event, RegDst, Data) next to the ARF (R1 ... R31), one step per slide.]
• Step 1: ROB holds xA000 R1=R1+10 (done, data 11), xA004 R2=R2*2, and xA008 FR1=FR2/0.0; Exp event is 0000 for all three
• Step 2: R1=R1+10 has retired (ARF R1 = 11); the head is now xA004 R2=R2*2; xA00C R3=R3+1 is allocated at the tail
• Step 3: R3=R3+1 is done (data 4) ahead of the older R2=R2*2 and FR1=FR2/0.0; xA010 R4=R4*2 is allocated
• Step 4: R2=R2*2 (data 4) and R4=R4*2 (data 8) are done; FR1=FR2/0.0 records Exp event 0010; xA014 FR4=FR4*2.0 is allocated
• Step 5: R2=R2*2 at the head commits to the ARF (R2 = 4)
Handling Precise Interrupts
[Figure: final step. FR1=FR2/0.0 (xA008, Exp event 0010) is at the head of the ROB; R3=R3+1 (data 4), R4=R4*2 (data 8), and FR4=FR4*2.0 sit behind it.]
• Exception detected
• The buffered values 4 and 8 were not committed into the RF
• Back up "PC" and the current RF
• Depending on the exception, the process will either abort or execution will resume from this excepting instruction

Handling Speculative Execution
[Figure sequence: same ROB and ARF layout, one step per slide.]
• Step 1: ROB holds xB000 R1=R1+10 (RegDst R1) and xB004 BEQ R1, R0, L1
• Step 2: BEQ R1, R0, L1 is predicted TAKEN; speculative entries (Spec? = 1) are allocated behind it: xC100 R2=R3 << 2 (done, data 32), xC104 R1=R2*R3, xD2AC BEQ R3, R0, L1, xD2B0 R1=R7+1 (done, data 28)
• Step 3: BEQ R1, R0, L1 reaches the head and is resolved: actually NOT TAKEN, a misprediction
• Step 4: Retire the branch and clear all entries after the mis-speculated branch
• Step 5: xB008 R2=R5 << 4 is allocated in the ROB
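The ROB behaviour in these walk-throughs (in-order allocation, out-of-order completion, in-order commit, discarding buffered results on an exception, and flushing entries younger than a mispredicted branch) can be sketched in a few lines of Python. This is only an illustrative model, not the design of any real processor: the names RobEntry, ReorderBuffer, allocate, complete, commit, and flush_after are hypothetical, a deque stands in for the circular buffer with head/tail pointers, and the initial ARF values R1..R4 = 1..4 are assumed so that the results match the data values shown on the slides.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class RobEntry:
    pc: int
    dst: Optional[str]            # RegDst; None for branches
    done: bool = False            # "Done?" bit
    value: Optional[int] = None   # buffered result (lookahead state)
    exception: bool = False       # stands in for the "Exp event" field


class ReorderBuffer:
    def __init__(self, arf, capacity=8):
        self.arf = dict(arf)      # in-order (architectural) state
        self.capacity = capacity
        self.entries = deque()    # left end = head (oldest), right end = tail

    def allocate(self, pc, dst):
        """Allocate at the tail, in program order."""
        if len(self.entries) == self.capacity:
            raise RuntimeError("ROB full: instruction issue must stall")
        entry = RobEntry(pc, dst)
        self.entries.append(entry)
        return entry

    def complete(self, entry, value=None, exception=False):
        """Record a result; this may happen in any order."""
        entry.done, entry.value, entry.exception = True, value, exception

    def commit(self):
        """Retire finished entries from the head, in order.

        Returns the PC of an excepting instruction once it reaches the head
        (after discarding every buffered result), otherwise None.
        """
        while self.entries and self.entries[0].done:
            head = self.entries[0]
            if head.exception:
                self.entries.clear()          # nothing younger reaches the ARF
                return head.pc
            if head.dst is not None:
                self.arf[head.dst] = head.value
            self.entries.popleft()
        return None

    def flush_after(self, branch):
        """Drop every entry younger than a mispredicted branch.

        If the branch is not in the ROB, this empties the buffer.
        """
        while self.entries and self.entries[-1] is not branch:
            self.entries.pop()


# Replay of the precise-interrupt walk-through (initial ARF values assumed).
rob = ReorderBuffer({"R1": 1, "R2": 2, "R3": 3, "R4": 4})
i1 = rob.allocate(0xA000, "R1")    # R1 = R1 + 10
i2 = rob.allocate(0xA004, "R2")    # R2 = R2 * 2
i3 = rob.allocate(0xA008, "FR1")   # FR1 = FR2 / 0.0
i4 = rob.allocate(0xA00C, "R3")    # R3 = R3 + 1
i5 = rob.allocate(0xA010, "R4")    # R4 = R4 * 2
rob.complete(i1, 11)
rob.complete(i4, 4)                # finishes before older instructions
rob.complete(i5, 8)
rob.complete(i3, exception=True)   # divide by zero is recorded, not yet taken
rob.complete(i2, 4)
fault_pc = rob.commit()            # R1 and R2 commit; the fault stops the rest
```

After the final commit call, the ARF holds the new R1 and R2 but the old R3 and R4, and fault_pc is the PC of the divide, which is the precise state a handler would need.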
Handling Speculative Execution
• Continue execution from the correct path (fall through in this case)

RAT Recovery
[Figure: instruction stream between the ARF and the RAT, with a mispredicted branch (br) and the RAT state marked "?!?".]
• ARF state corresponds to the state prior to the oldest non-committed instruction
• As instructions are processed, the RAT corresponds to the register mapping after the most recently renamed instruction
• On a branch misprediction, wrong-path instructions are flushed from the machine
• The RAT is left with an invalid set of mappings corresponding to the wrong-path instruction state

Solution: Stall and Drain
[Figure: ARF, mispredicted branch (br), flushed wrong-path instructions (X), the RAT marked "?!?", and the next correct-path instruction (foo).]
• Correct-path instructions arrive from fetch but can't rename because the RAT is wrong
• Allow all instructions to execute and commit; the ARF corresponds to the last committed instruction
• The ARF now corresponds to the state right before the next instruction to be renamed (foo)
• Reset the RAT so that all mappings refer to the ARF
• Resume renaming the new correct-path instructions from fetch
• Pros: Very simple to implement
• Cons: Performance loss due to stalls

Another Solution: Checkpointing
[Figure: ARF followed by a stream of branches (br), each with its own RAT copy, then foo with the current RAT; a Checkpoint Free Pool holds unused checkpoints.]
• At each branch, make a copy of the RAT (the register mapping at the time of the branch)
• On a misprediction:
  1. flush wrong-path instructions
  2. deallocate RAT checkpoints
  3. recover RAT from checkpoint
  4. resume renaming

Commit Illustrated
• Make instruction execution "visible" to the outside world
  - "Commit" the changes to the architected state
[Figure: ROB holding instructions A B C D E F G H J K; results are written back (WB) into the ROB and committed results move to the ARF. The outside world "sees": A executed, B executed, C executed, D executed, E executed.]
• Instructions execute out of program order, but the outside world still "believes" it's in-order
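Returning to the two RAT recovery options described earlier, the following Python sketch contrasts them. It is a simplified model under stated assumptions: the class RenameTable and its methods are hypothetical names, the RAT is a plain dictionary from architectural register to either the string "ARF" or a ROB tag such as "ROB3", and the checkpoint of the mispredicted branch itself is freed together with the younger ones once the RAT has been restored.

```python
class RenameTable:
    def __init__(self, arch_regs):
        # Initially every architectural register maps to its ARF copy.
        self.rat = {reg: "ARF" for reg in arch_regs}
        self.checkpoints = []   # (branch_tag, RAT copy) pairs, oldest first

    def rename_dest(self, reg, rob_tag):
        """Point reg at the ROB entry that will produce its next value."""
        self.rat[reg] = rob_tag

    def checkpoint(self, branch_tag):
        """At each branch, save the register mapping as of that branch."""
        self.checkpoints.append((branch_tag, dict(self.rat)))

    def recover(self, branch_tag):
        """Misprediction: restore the branch's RAT copy and free that
        checkpoint along with every younger (wrong-path) checkpoint."""
        for i, (tag, saved) in enumerate(self.checkpoints):
            if tag == branch_tag:
                self.rat = dict(saved)
                del self.checkpoints[i:]
                return
        raise KeyError(branch_tag)

    def release(self, branch_tag):
        """Branch resolved as predicted: return its checkpoint to the pool."""
        self.checkpoints = [(t, s) for t, s in self.checkpoints if t != branch_tag]

    def reset_to_arf(self):
        """Stall-and-drain: once the ROB has drained, every mapping is the ARF."""
        self.rat = {reg: "ARF" for reg in self.rat}


# Checkpointing: two nested predicted branches, the older one mispredicts.
rt = RenameTable(["R1", "R2", "R3"])
rt.rename_dest("R1", "ROB0")   # R1 = R1 + 10
rt.checkpoint("br1")           # first branch, predicted taken
rt.rename_dest("R2", "ROB2")   # wrong-path instruction
rt.rename_dest("R1", "ROB3")   # wrong-path instruction
rt.checkpoint("br2")           # second (wrong-path) branch
rt.rename_dest("R1", "ROB5")   # wrong-path instruction
rt.recover("br1")              # mappings revert to the state at br1

# Stall-and-drain on a separate table for comparison.
drained = RenameTable(["R1", "R2"])
drained.rename_dest("R1", "ROB7")
drained.reset_to_arf()
```

The trade-off matches the slides: reset_to_arf needs no extra storage but is only valid after everything older has committed, while recover can run as soon as the branch resolves at the cost of one RAT copy per in-flight branch.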