Tomasulo’s Algorithm RHK.S96 1 Tomasulo Organization RHK.S96 2 Reservation Station Components Op—Operation to perform in the unit (e.g., + or –) Qj, Qk—Reservation stations producing source registers Vj, Vk—Value of Source operands Rj, Rk—Flags indicating when Vj, Vk are ready Busy—Indicates reservation station and FU is busy Register result status—Indicates which functional unit will write each register, if one exists. Blank when no pending instructions that will write that register. RHK.S96 3 Three Stages of Tomasulo Algorithm 1. Issue—get instruction from FP Op Queue If reservation station free, the scoreboard issues instr & sends operands (renames registers). 2. Execution—operate on operands (EX) When both operands ready then execute; if not ready, watch CDB for result 3. Write result—finish execution (WB) Write on Common Data Bus to all awaiting units; mark reservation station available. RHK.S96 4 Tomasulo Example Cycle 0 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Write Result Load1 Load2 Load3 Busy Op No No No No No Clock 0 Issue Execution complete F0 S1 Vj S2 Vk RS for j Qj RS for k Qk F2 F4 F6 F8 Busy No No No Address F10 F12 ... F30 FU RHK.S96 5 Tomasulo Example Cycle 1 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy Op No No No No No Clock 1 Issue 1 F0 FU Execution complete Write Result Load1 Load2 Load3 S1 Vj S2 Vk RS for j Qj RS for k Qk F2 F4 F6 F8 Busy No Yes No No Address 34+R2 F10 F12 ... F30 Load1 RHK.S96 6 Tomasulo Example Cycle 2 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy Op No No No No No Clock 2 Issue 1 2 F0 FU Execution complete Write Result Load1 Load2 Load3 S1 Vj S2 Vk RS for j Qj RS for k Qk F2 F4 F6 F8 Load2 Busy Yes Yes No Address 34+R2 45+R3 F10 F12 ... F30 Load1 RHK.S96 7 Tomasulo Example Cycle 3 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy Op No No No Yes MULTD No Clock 3 Issue 1 2 3 FU Execution complete 3 Write Result S1 Vj S2 Vk RS for j Qj R(F4) Load2 F4 F6 F0 F2 Mult1 Load2 Load1 Load2 Load3 Busy Yes Yes No Address 34+R2 45+R3 F10 F12 ... RS for k Qk F8 F30 Load1 RHK.S96 8 Tomasulo Example Cycle 4 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy Op Yes SUBD No No Yes MULTD No Clock 4 Issue 1 2 3 4 FU Execution complete 3 S1 Vj M(34+R2) F0 F2 Mult1 Load2 Write Result 4 Load1 Load2 Load3 S2 Vk RS for j Qj R(F4) Load2 F4 F6 F8 M(34+R2) Add1 Busy No Yes No Address F10 F12 ... 45+R3 RS for k Qk Load2 F30 RHK.S96 9 Tomasulo Example Cycle 5 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy Yes No No Yes Yes Clock 5 FU Issue 1 2 3 4 5 Op SUBD Execution complete 3 5 Write Result 4 S1 Vj M(34+R2) S2 Vk RS for j Qj R(F4) M(34+R2) Load2 Mult1 F4 F6 M(34+R2) MULTD DIVD F0 F2 Mult1 Load2 Busy No Yes No Address F8 F10 F12 ... Add1 Mult2 Load1 Load2 Load3 45+R3 RS for k Qk Load2 F30 RHK.S96 10 Tomasulo Example Cycle 6 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 2 Add1 0 Add2 Add3 10 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy Yes Yes No Yes Yes Clock 6 FU Issue 1 2 3 4 5 6 Op SUBD ADDD Execution complete 3 5 Write Result 4 6 S1 Vj M(34+R2) S2 Vk M(45+R3) M(45+R3) Load1 Load2 Load3 RS for j Qj Busy No No No Address F12 ... RS for k Qk Add1 MULTD M(45+R3) DIVD R(F4) M(34+R2) Mult1 F0 F2 F4 F6 F8 F10 Mult1 M(45+R3) Add2 Add1 Mult2 F30 RHK.S96 11 Tomasulo Example Cycle 7 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 1 Add1 0 Add2 Add3 9 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy Yes Yes No Yes Yes Clock 7 FU Issue 1 2 3 4 5 6 Op SUBD ADDD Execution complete 3 5 Write Result 4 6 S1 Vj M(34+R2) S2 Vk M(45+R3) M(45+R3) Load1 Load2 Load3 RS for j Qj Busy No No No Address F12 ... RS for k Qk Add1 MULTD M(45+R3) DIVD R(F4) M(34+R2) Mult1 F0 F2 F4 F6 F8 F10 Mult1 M(45+R3) Add2 Add1 Mult2 F30 RHK.S96 12 Tomasulo Example Cycle 8 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 8 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy Yes Yes No Yes Yes Clock 8 FU Issue 1 2 3 4 5 6 Op SUBD ADDD Execution complete 3 5 Write Result 4 6 Load1 Load2 Load3 Busy No No No Address F12 ... 8 S1 Vj M(34+R2) S2 Vk M(45+R3) M(45+R3) RS for j Qj RS for k Qk Add1 MULTD M(45+R3) DIVD R(F4) M(34+R2) Mult1 F0 F2 F4 F6 F8 F10 Mult1 M(45+R3) Add2 Add1 Mult2 F30 RHK.S96 13 Tomasulo Example Cycle 9 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 7 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy No Yes No Yes Yes Clock 9 FU Issue 1 2 3 4 5 6 Op Execution complete 3 5 Write Result 4 6 8 S1 Vj Load1 Load2 Load3 Busy No No No Address F10 F12 ... 9 S2 Vk RS for j Qj RS for k Qk ADDD M()–M() M(45+R3) MULTD M(45+R3) DIVD R(F4) M(34+R2) Mult1 F0 F2 F4 F6 F8 Mult1 M(45+R3) Add2 M()–M() Mult2 F30 RHK.S96 14 Tomasulo Example Cycle 10 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 2 Add2 Add3 67 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy No Yes No Yes Yes Clock 10 FU Issue 1 2 3 4 5 6 Op Execution complete 3 5 Write Result 4 6 8 S1 Vj Load1 Load2 Load3 Busy No No No Address F10 F12 ... 9 S2 Vk RS for j Qj RS for k Qk ADDD M()–M() M(45+R3) MULTD M(45+R3) DIVD R(F4) M(34+R2) Mult1 F0 F2 F4 F6 F8 Mult1 M(45+R3) Add2 M()–M() Mult2 F30 RHK.S96 15 Tomasulo Example Cycle 11 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 1 Add2 Add3 5 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy No Yes No Yes Yes Clock 11 FU Issue 1 2 3 4 5 6 Op Execution complete 3 5 Write Result 4 6 8 S1 Vj Load1 Load2 Load3 Busy No No No Address F10 F12 ... 9 S2 Vk RS for j Qj RS for k Qk ADDD M()–M() M(45+R3) MULTD M(45+R3) DIVD R(F4) M(34+R2) Mult1 F0 F2 F4 F6 F8 Mult1 M(45+R3) Add2 M()–M() Mult2 F30 RHK.S96 16 Tomasulo Example Cycle 12 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 4 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy No Yes No Yes Yes Clock 12 FU Issue 1 2 3 4 5 6 Op Execution complete 3 5 Write Result 4 6 8 Load1 Load2 Load3 Busy No No No Address F10 F12 ... 9 12 S1 Vj S2 Vk RS for j Qj RS for k Qk ADDD M()–M() M(45+R3) MULTD M(45+R3) DIVD R(F4) M(34+R2) Mult1 F0 F2 F4 F6 F8 Mult1 M(45+R3) Add2 M()–M() Mult2 F30 RHK.S96 17 Tomasulo Example Cycle 13 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 3 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 9 12 13 Busy Op No No No Yes MULTD M(45+R3) Yes DIVD FU Write Result 4 6 8 S1 Vj Clock 13 Issue 1 2 3 4 5 6 Execution complete 3 5 F0 F2 Mult1 M(45+R3) Load1 Load2 Load3 Busy No No No Address F10 F12 ... S2 Vk RS for j Qj RS for k Qk R(F4) M(34+R2) Mult1 F4 F6 F8 (M–M)+M() M()–M() Mult2 F30 RHK.S96 18 Tomasulo Example Cycle 14 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 2 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 9 12 13 Busy Op No No No Yes MULTD M(45+R3) Yes DIVD FU Write Result 4 6 8 S1 Vj Clock 14 Issue 1 2 3 4 5 6 Execution complete 3 5 F0 F2 Mult1 M(45+R3) Load1 Load2 Load3 Busy No No No Address F10 F12 ... S2 Vk RS for j Qj RS for k Qk R(F4) M(34+R2) Mult1 F4 F6 F8 (M–M)+M() M()–M() Mult2 F30 RHK.S96 19 Tomasulo Example Cycle 15 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 1 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 9 12 13 Busy Op No No No Yes MULTD M(45+R3) Yes DIVD FU Write Result 4 6 8 S1 Vj Clock 15 Issue 1 2 3 4 5 6 Execution complete 3 5 F0 F2 Mult1 M(45+R3) Load1 Load2 Load3 Busy No No No Address F10 F12 ... S2 Vk RS for j Qj RS for k Qk R(F4) M(34+R2) Mult1 F4 F6 F8 (M–M)+M() M()–M() Mult2 F30 RHK.S96 20 Tomasulo Example Cycle 16 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Busy Op No No No Yes MULTD M(45+R3) Yes DIVD FU Write Result 4 6 F0 F2 Mult1 M(45+R3) Load1 Load2 Load3 Busy No No No Address F10 F12 ... 9 12 S1 Vj Clock 16 Issue 1 2 3 4 5 6 Execution complete 3 5 16 8 13 S2 Vk RS for j Qj RS for k Qk R(F4) M(34+R2) Mult1 F4 F6 F8 (M–M)+M() M()–M() Mult2 F30 RHK.S96 21 Tomasulo Example Cycle 17 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 FU Write Result 4 6 17 9 12 Load1 Load2 Load3 Busy No No No Address F10 F12 ... 13 S1 Vj S2 Vk M*F4 M(34+R2) F0 F2 F4 M*F4 M(45+R3) Busy Op No No No No Yes DIVD Clock 17 Issue 1 2 3 4 5 6 Execution complete 3 5 16 8 RS for j Qj RS for k Qk F6 F8 (M–M)+M() M()–M() Mult2 F30 RHK.S96 22 Tomasulo Example Cycle 18 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 40 Mult2 Register result status k R2 R3 F4 F2 F6 F2 FU Write Result 4 6 17 9 12 Load1 Load2 Load3 Busy No No No Address F10 F12 ... 13 S1 Vj S2 Vk M*F4 M(34+R2) F0 F2 F4 M*F4 M(45+R3) Busy Op No No No No Yes DIVD Clock 18 Issue 1 2 3 4 5 6 Execution complete 3 5 16 8 RS for j Qj RS for k Qk F6 F8 (M–M)+M() M()–M() Mult2 F30 RHK.S96 23 Tomasulo Example Cycle 57 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 1 Mult2 Register result status k R2 R3 F4 F2 F6 F2 FU Write Result 4 6 17 9 12 Load1 Load2 Load3 Busy No No No Address F10 F12 ... 13 S1 Vj S2 Vk M*F4 M(34+R2) F0 F2 F4 M*F4 M(45+R3) Busy Op No No No No Yes DIVD Clock 57 Issue 1 2 3 4 5 6 Execution complete 3 5 16 8 RS for j Qj RS for k Qk F6 F8 (M–M)+M() M()–M() Mult2 F30 RHK.S96 24 Tomasulo Example Cycle 58 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Write Result 4 6 17 9 M*F4 M(34+R2) F0 F2 F4 M*F4 M(45+R3) Issue 1 2 3 4 5 6 Busy Op No No No No Yes DIVD Clock 58 Execution complete 3 5 16 8 58 12 S1 Vj FU Load1 Load2 Load3 Busy No No No Address F10 F12 ... 13 S2 Vk RS for j Qj RS for k Qk F6 F8 (M–M)+M() M()–M() Mult2 F30 RHK.S96 25 Tomasulo Example Cycle 59 Instruction status Instruction j LD F6 34+ LD F2 45+ MULTDF0 F2 SUBD F8 F6 DIVD F10 F0 ADDD F6 F8 Reservation Stations Time Name 0 Add1 0 Add2 Add3 0 Mult1 0 Mult2 Register result status k R2 R3 F4 F2 F6 F2 Write Result 4 6 17 9 59 13 S2 Vk RS for j Qj RS for k Qk F0 F2 F4 F6 F8 M*F4 M(45+R3) (M–M)+M() M()–M() M*F4/M Issue 1 2 3 4 5 6 Busy Op No No No No No Clock 59 Execution complete 3 5 16 8 58 12 S1 Vj FU Load1 Load2 Load3 Busy No No No Address F10 F12 ... F30 RHK.S96 26 Tomasulo Loop Example Loop: LD MULTD SD SUBI BNEZ F0 F4 F4 R1 R1 0 F0 0 R1 Loop R1 F2 R1 #8 • Multiply takes 4 clocks • Load have cache misses RHK.S96 27 Loop Example Cycle 0 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 No 0 Mult2 No Register result status Clock 0 R1 80 F0 Issue S1 Vj F2 Execution Write complete Result S2 Vk F4 Busy Address No No No Qi No No No Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Qi RHK.S96 28 Loop Example Cycle 1 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 No 0 Mult2 No Register result status Clock 1 R1 80 F0 Qi Issue 1 S1 Vj F2 Execution Write complete Result S2 Vk F4 Busy Address Yes 80 No No Qi No No No Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Load1 RHK.S96 29 Loop Example Cycle 2 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 2 R1 80 F0 Qi Load1 Issue 1 2 S1 Vj Execution Write complete Result S2 Vk R(F2) F2 F4 Busy Address Yes 80 No No Qi No No No Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 Load1 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 30 Loop Example Cycle 3 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 3 R1 80 F0 Qi Load1 Issue 1 2 3 S1 Vj Execution Write complete Result S2 Vk R(F2) F2 F4 Busy Address Yes 80 No No Qi Yes 80 Mult1 No No Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 Load1 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 31 Loop Example Cycle 4 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 4 R1 72 F0 Qi Load1 Issue 1 2 3 S1 Vj Execution Write complete Result S2 Vk R(F2) F2 F4 Busy Address Yes 80 No No Qi Yes 80 Mult1 No No Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 Load1 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 32 Loop Example Cycle 5 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 5 R1 72 F0 Qi Load1 Issue 1 2 3 S1 Vj Execution Write complete Result S2 Vk R(F2) F2 F4 Busy Address Yes 80 No No Qi Yes 80 Mult1 No No Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 Load1 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 33 Loop Example Cycle 6 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 6 R1 72 F0 Qi Load1 Issue 1 2 3 6 S1 Vj Execution Write complete Result S2 Vk R(F2) F2 F4 Busy Address Yes 80 Yes 72 No Qi Yes 80 Mult1 No No Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 Load1 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 34 Loop Example Cycle 7 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 Yes MULTD Register result status Clock 7 R1 72 F0 Qi Load2 Issue 1 2 3 6 7 S1 Vj F2 Execution Write complete Result Busy Address Yes 80 Yes 72 No Qi Yes 80 Mult1 No No R(F2) R(F2) Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 Load1 SUBI R1 Load2 BNEZ R1 F4 F6 S2 Vk F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult2 RHK.S96 35 Loop Example Cycle 8 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 Yes MULTD Register result status Clock 8 R1 72 F0 Qi Load2 Issue 1 2 3 6 7 8 S1 Vj F2 Execution Write complete Result Busy Address Yes 80 Yes 72 No Qi Yes 80 Mult1 Yes 72 Mult2 No R(F2) R(F2) Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 Load1 SUBI R1 Load2 BNEZ R1 F4 F6 S2 Vk F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult2 RHK.S96 36 Loop Example Cycle 9 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 Yes MULTD Register result status Clock 9 R1 64 F0 Qi Load2 Issue 1 2 3 6 7 8 S1 Vj F2 Execution Write complete Result 9 Load1 Load2 Load3 Store1 Store2 Store3 S2 RS for j RS for k Vk Qj Qk R(F2) R(F2) Load1 Load2 F4 F6 F8 Busy Address Yes 80 Yes 72 No Qi Yes 80 Mult1 Yes 72 Mult2 No Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult2 RHK.S96 37 Loop Example Cycle 10 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 4 Mult1 Yes MULTD 0 Mult2 Yes MULTD Register result status Clock 10 R1 64 F0 Qi Load2 Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 Load1 Load2 Load3 10 Store1 Store2 Store3 S2 RS for j RS for k Vk Qj Qk M(80) R(F2) R(F2) Load2 F2 F6 F4 F8 Busy Address No Yes 72 No Qi Yes 80 Mult1 Yes 72 Mult2 No Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult2 RHK.S96 38 Loop Example Cycle 11 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 3 Mult1 Yes MULTD 4 Mult2 Yes MULTD Register result status Clock 11 R1 64 F0 Qi Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 Load1 Load2 Load3 10 11 Store1 Store2 Store3 S2 RS for j RS for k Vk Qj Qk M(80) R(F2) M(72) R(F2) F2 F4 F6 F8 Busy Address No No Yes 64 Qi Yes 80 Mult1 Yes 72 Mult2 No Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult2 RHK.S96 39 Loop Example Cycle 12 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 2 Mult1 Yes MULTD 3 Mult2 Yes MULTD Register result status Clock 12 R1 64 F0 Qi Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 Load1 Load2 Load3 10 11 Store1 Store2 Store3 S2 RS for j RS for k Vk Qj Qk M(80) R(F2) M(72) R(F2) F2 F4 F6 F8 Busy Address No No Yes 64 Qi Yes 80 Mult1 Yes 72 Mult2 No Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult2 RHK.S96 40 Loop Example Cycle 13 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 1 Mult1 Yes MULTD 2 Mult2 Yes MULTD Register result status Clock 13 R1 64 F0 Qi Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 Load1 Load2 Load3 10 11 Store1 Store2 Store3 S2 RS for j RS for k Vk Qj Qk M(80) R(F2) M(72) R(F2) F2 F4 F6 F8 Busy Address No No Yes 64 Qi Yes 80 Mult1 Yes 72 Mult2 No Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult2 RHK.S96 41 Loop Example Cycle 14 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 1 Mult2 Yes MULTD Register result status Clock 14 R1 64 F0 Qi Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 Load1 14 Load2 Load3 10 11 Store1 Store2 Store3 S2 RS for j RS for k Vk Qj Qk M(80) R(F2) M(72) R(F2) F2 F4 F6 F8 Busy Address No No Yes 64 Qi Yes 80 Mult1 Yes 72 Mult2 No Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult2 RHK.S96 42 Loop Example Cycle 15 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 No 0 Mult2 Yes MULTD Register result status Clock 15 R1 64 F0 Qi Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 Load1 14 15 Load2 Load3 10 11 Store1 15 Store2 Store3 S2 RS for j RS for k Vk Qj Qk M(72) R(F2) F2 F4 F6 F8 Busy Address No No Yes 64 Qi Yes 80 M(80)*R(F2) Yes 72 Mult2 No Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult2 RHK.S96 43 Loop Example Cycle 16 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 16 R1 64 F0 Qi Issue 1 2 3 6 7 8 S1 Vj F2 Execution Write complete Result 9 10 Load1 14 15 Load2 Load3 10 11 Store1 15 16 Store2 Store3 S2 RS for j RS for k Vk Qj Qk R(F2) Load3 F4 F6 F8 Busy Address No No Yes 64 Qi Yes 80 M(80)*R(F2) Yes 72 M(72)*R(72) No Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 44 Loop Example Cycle 17 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 17 R1 64 F0 Qi Issue 1 2 3 6 7 8 S1 Vj F2 Execution Write complete Result 9 10 Load1 14 15 Load2 Load3 10 11 Store1 15 16 Store2 Store3 S2 RS for j RS for k Vk Qj Qk R(F2) Load3 F4 F6 F8 Busy Address No No Yes 64 Qi Yes 80 M(80)*R(F2) Yes 72 M(72)*R(72) Yes 64 Mult1 Code: LD F0 MULTDF4 SD F4 SUBI R1 BNEZ R1 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 45 Loop Example Cycle 18 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 18 R1 56 F0 Qi Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 14 15 18 10 11 15 16 S2 Vk R(F2) F2 F4 Busy Address No No Yes 64 Qi Yes 80 M(80)*R(F2) Yes 72 M(72)*R(72) Yes 64 Mult1 Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 Load3 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 46 Loop Example Cycle 19 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 19 R1 56 F0 Qi Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 14 15 18 19 10 11 15 16 S2 Vk R(F2) F2 F4 Busy Address No No Yes 64 Qi No Yes 72 M(72)*R(72) Yes 64 Mult1 Load1 Load2 Load3 Store1 Store2 Store3 RS for j RS for k Qj Qk Code: LD F0 MULTDF4 SD F4 Load3 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 47 Loop Example Cycle 20 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 20 R1 56 F0 Qi Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 14 15 18 19 10 11 15 16 20 S2 RS for j Vk Qj R(F2) F2 F4 Busy Address No No Yes 64 Qi No Yes 72 M(72)*R(72) Yes 64 Mult1 Load1 Load2 Load3 Store1 Store2 Store3 RS for k Qk Code: LD F0 MULTDF4 SD F4 Load3 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 48 Loop Example Cycle 21 Instruction status Instruction j k iteration LD F0 0 R1 1 MULTDF4 F0 F2 1 SD F4 0 R1 1 LD F0 0 R1 2 MULTDF4 F0 F2 2 SD F4 0 R1 2 Reservation Stations Time Name Busy Op 0 Add1 No 0 Add2 No 0 Add3 No 0 Mult1 Yes MULTD 0 Mult2 No Register result status Clock 21 R1 56 F0 Qi Issue 1 2 3 6 7 8 S1 Vj Execution Write complete Result 9 10 14 15 18 19 10 11 15 16 20 21 S2 RS for j Vk Qj R(F2) F2 F4 Busy Address No No Yes 64 Qi No No Yes 64 Mult1 Load1 Load2 Load3 Store1 Store2 Store3 RS for k Qk Code: LD F0 MULTDF4 SD F4 Load3 SUBI R1 BNEZ R1 F6 F8 0 R1 F0 F2 0 R1 R1 #8 Loop F10 F12 ... F30 Mult1 RHK.S96 49 Tomasulo Summary • • • • Prevents Register as bottleneck Avoids WAR, WAW hazards of Scoreboard Allows loop unrolling in HW Not limited to basic blocks (provided branch prediction) • Lasting Contributions – Dynamic scheduling – Register renaming – Load/store disambiguation • Next: More branch prediction RHK.S96 50
© Copyright 2026 Paperzz