Lecture 11: Modern Superscalar Processor Models
Generic Superscalar Models, Issue Queue-based Pipeline, Multiple-Issue Design

Generic Superscalar Processor Models

[Figure: two pipeline organizations, revised from Palacharla's PhD thesis, 1998.
 Issue queue based: Fetch, Rename, schedule (Wakeup/select), Regfile, execute (FU, bypass, D-cache), commit.
 Reservation based (already studied): Fetch, Rename, ROB/Reg, schedule (Wakeup/select), execute (FU, bypass, D-cache), commit.]

Issue Queue Based Pipeline

Fetch -> Rename -> Issue -> Reg-read -> Execute -> Writeback/Commit

Core structure: register mapping table
Rename: translate architectural registers into physical registers
Issue: send instructions out to register read and then to execution
Commit: process mis-predictions/exceptions, update register renaming
Why study it? It is used in the Alpha 21264, MIPS R10000, and Intel P4.

Compare Reservation Station and Issue Queue

Pipeline stage sequence
  1. RS: IF -> REN -> REG/ROB -> SCHD -> …
  2. IQ: IF -> REN -> SCHD -> REG -> …
Mapping table vs. status table
  1. RS: the status table chooses between the architectural register and the ROB
  2. IQ: always renames to a physical register
Register file
  1. RS: the architectural register file stores the architectural state
  2. IQ: physical register file only, with no architectural register file; the mapping table determines the architectural state
Reservation station / issue queue entry
  1. RS: busy, fu, op, Qj, Qk, Vj, Vk
  2. IQ: busy, fu, op, Pj, Pk, ReadyJ, ReadyK
ROB
  1. RS: stores register values
  2. IQ: holds no register contents
Pros and cons of IQ
  Good: no copying between ROB and register file
  Good: efficient use of registers
  Bad: complex mapping table design

Register Mapping Table

Records the mapping from architectural (virtual) registers to physical registers.
The mapping is stored in RAM or CAM.

  Arch reg (virtual)    Phy reg
  R1                    P3
  R2                    P10
  R3                    P6
  R4                    P8
  R5                    P12
  …

Register Renaming Example

Original loop:
  Loop: LW  R2, 0(R1)
        ADD R2, R2, 1
        SW  R2, 0(R1)
        ADD R1, R1, 4
        BNE R2, R3, Loop

LW returns 100; R1 = 4000 (so that the register values on the next slide, P1=4000 and P34=4004, are consistent).

Renamed dynamic instructions:
  …
  BNE P2, P3, Loop
  LW  P32, 0(P1)
  ADD P33, P32, 1
  SW  P33, 0(P1)
  ADD P34, P1, 4
  BNE P33, P3, Loop
  …

Assume that when the first BNE is renamed, R1-R31 are mapped to P1-P31 and P32-P127 are free.
The first BNE may be predicted either correctly or incorrectly.

Register Mapping Status

Mapping table after each instruction is renamed (SW has no destination register, so it leaves the mapping unchanged):

  After renaming     R1     R2     R3     R4     R5
  first BNE          P1     P2     P3     P4     P5
  LW                 P1     P32    P3     P4     P5
  ADD (writes R2)    P1     P33    P3     P4     P5
  SW (no change)     P1     P33    P3     P4     P5
  ADD (writes R1)    P34    P33    P3     P4     P5

Physical register values: P1=4000, P2=200, …, P32=100, P33=101, P34=4004

At commit (possible sequence):
  P1=4000  P2=200  …  P32=100  P33=?    P34=4004
  P1=4000  P2=200  …  P32=100  P33=101  P34=4004
  P1=4000  P2=200  …  P32=100  P33=101  P34=4004

Commit and Rollback

[Figure: the sequence of mapping snapshots from the previous slide (R2 => P2, P32, P33; R1 => P1, P34), with a commit point marking the committed mapping and a rename point marking the current mapping. Physical register values shown: P1=4000, P2=200, …, P32=100, P33=?, P34=4004.]

On a successful commit: the next mapping status becomes the committed mapping status, and the previous physical register is freed.
On a mis-prediction or exception: flush the pipeline and discard the mappings that follow the commit point.

Program Execution Correctness

Do only committed instructions write to registers and memory?
  Yes, from the programmer's viewpoint: only the register outputs of committed instructions become visible.
Is correct data flow maintained, so that a child instruction always uses the values from its parents?
  Yes, in renamed form, and it is not affected by speculative execution.
Does each register/memory location receive the value of the last write?
  Yes, from the programmer's viewpoint: the architectural mapping status is updated in program order.
Note that memory correctness is not affected.

Mapping Table Design: MIPS R10000

[Figure: mapping tables (current mapping, committed mapping) and a branch stack whose entries hold the mapping after Br1 through Br4 together with alternative PC1 through PC4.]

RAM-based structure: the mapping is saved automatically and in parallel when a branch is renamed.
On a mis-prediction: restore the previous mapping immediately, flush the pipeline, and restart fetch at the alternative PC.
On commit of a branch instruction: make the corresponding mapping the committed one.
Stall if the branch stack is full.

Mapping Table Design: MIPS R10000 (precise exceptions)

How about precise exceptions? The processor cannot preserve the mapping status for every instruction.
Solution: record each change of mapping in the ROB.
ROB entry contains: destination architectural register, renamed physical register, old renamed physical register.
On an exception: roll back the mapping one instruction at a time, four instructions per cycle.
Recovery is slow, but how frequent are exceptions?
Note that branch mis-prediction has fast recovery.

Mapping Table Design: Alpha 21264

[Figure: CAM with one entry per physical register p0 … pk. Each entry holds an architectural register number and two valid bits, one for the committed mapping and one for the current mapping (example bit pairs shown: p0 = 1 1, p1 = 1 0, p2 = 0 1, pk = 1 1). A lookup selects the entry that matches and is valid.]

CAM structure: associative search on the architectural register index; outputs the physical register index (through an encoder).
One column represents one mapping; an entry is allocated at rename to each instruction with a register output.
One pair of valid bits changes per destination renaming.
Fast recovery, even on exceptions.

Multiple Issue Pipelines

Each pipeline stage accepts k instructions: a k-issue processor.
  Alpha 21264: 4-issue
  MIPS R10000: 4-issue
  Intel P4: 3-issue
Memory structures must have a number of ports proportional to the issue width!
What if the k instructions at rename have dependences among them?
Dependence check logic is needed!

Dependence Check Logic

[Figure: four instructions (Rs0 Rt0 Rd0 through Rs3 Rt3 Rd3) look up the mapping table in parallel and receive physical source registers and physical destination registers Pd0 … Pd3. No dependence check is applied yet.]

Is there any change to the first renaming?
What is the change to the second one?
The third and fourth ones?
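
One way to explore the questions on the dependence check slide is to model group renaming in software. The following is a minimal Python sketch, not the hardware design of any of the processors above: real dependence check logic uses parallel comparators, whereas this loop only reproduces the result they should produce. The function name `rename_group`, the `(op, dest, sources)` tuple layout, and the omission of immediates are assumptions made for illustration. The group is the first four instructions of the loop from the renaming example, with R1-R31 mapped to P1-P31 and P32-P127 free.

```python
from collections import deque


def rename_group(group, map_table, free_list):
    """Rename instructions that enter the rename stage in the same cycle.

    group:     list of (op, dest_arch_reg_or_None, [source_arch_regs])
    map_table: dict arch reg -> phys reg, updated in place
    free_list: deque of free physical registers
    """
    snapshot = dict(map_table)  # the table contents every instruction looks up
    in_group = {}               # arch reg -> phys reg of the newest earlier writer in the group
    renamed = []
    for op, dest, sources in group:
        # Dependence check: if an earlier instruction in this group writes a
        # source register, use its new physical register instead of the
        # (stale) value read from the mapping table.
        phys_sources = [in_group.get(r, snapshot[r]) for r in sources]
        phys_dest = None
        if dest is not None:
            phys_dest = free_list.popleft()
            in_group[dest] = phys_dest
        renamed.append((op, phys_dest, phys_sources))
    # The last writer of each architectural register determines the new mapping.
    map_table.update(in_group)
    return renamed


# Usage: LW R2,0(R1); ADD R2,R2,1; SW R2,0(R1); ADD R1,R1,4 (immediates omitted)
map_table = {r: r for r in range(1, 32)}   # R1-R31 -> P1-P31
free_list = deque(range(32, 128))          # P32-P127 are free
group = [("LW", 2, [1]), ("ADD", 2, [2]), ("SW", None, [2, 1]), ("ADD", 1, [1])]
out = rename_group(group, map_table, free_list)
for op, pd, ps in out:
    print(op, pd, ps)
```

In this model the first instruction's renaming never changes, because nothing precedes it in the group. The second instruction takes its source from the first instruction's new destination (P32) rather than from the table (P2), and each later instruction must be compared against every earlier destination in the group.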