CS 161 Computer Architecture
Chapter 5, Lecture 11
Instructor: L.N. Bhuyan, www.cs.ucr.edu/~bhuyan
Adapted from notes by Dave Patterson (http.cs.berkeley.edu/~patterson)
1999 ©UCB

Implementing Main Control
° Main Control has one 6-bit input (op) and 9 output bits: 7 one-bit signals (RegDst, Branch, MemRead, MemtoReg, MemWrite, ALUSrc, RegWrite) plus the 2-bit ALUOp
° To build Main Control as sum-of-products:
  (1) Construct a minterm for each different instruction (or R-type); each minterm corresponds to a single instruction (or to all of the R-type instructions), e.g., M_R-format, M_lw
  (2) Determine each main control output by forming the logical OR of the relevant minterms (instructions), e.g., RegWrite: M_R-format OR M_lw

Single-Cycle MIPS-lite CPU
[Figure: single-cycle datapath with PC, Imem, Regs, Sign Extend, ALU, Dmem, two adders and multiplexors; Main Control takes op = Instruction[31:26]; ALU Control takes Instruction[5:0] and the 2-bit ALUOp; PCSrc is derived from Branch and Zero]

Fig. 5.17 Datapath with Control Signals

Fig. 5.18 Control Line Settings Depend on Opcode

Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1

Control Design
° Simple combinational logic (truth tables)
[Figure: ALU control block with inputs ALUOp1, ALUOp0 and F(5–0), outputs Operation2, Operation1, Operation0; main control with inputs Op5 to Op0, product terms R-format, lw, sw, beq, and outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0]

Fig. 5.19 R-type operation, add $t1, $t2, $t3. Active parts are highlighted.
Fig. 5.20 Active parts for a load instruction
Fig. 5.21 Active parts for a beq instruction
Fig. 5.24 Extension for jump instruction

Single-Cycle Machine: Appraisal
° All instructions complete in one clock cycle (CPI = 1)
° Some instructions take more steps than others
  • lw is most expensive (5 steps, vs. 4 for R-type and sw, 3 for beq)
° Clock cycle must cover the longest instruction, which is inefficient
  • suppose mul is added?
  • 32 shift/add steps would delay every other instruction

Example
° Assume 2 ns for instruction/data memory, 1 ns for decode/register read, 2 ns for ALU and 1 ns for register write.
° Single-cycle datapath clock period = 8 ns.
° Assume an instruction mix of 24% loads, 12% stores, 44% R-format, 18% branches, and 2% jumps.
° Assuming a variable-cycle datapath, average clock period = 6.3 ns.
° Possible speed-up = 1.27

Multicycle Implementation (MIPS-lite v.2)
° Want a more efficient implementation
° Each step will take one clock cycle (not each instruction) [CPI > 1]
  • shorter clock cycle: cycle time constrained by longest step, not longest instruction
° Simpler instructions take fewer cycles, giving higher overall performance
° Complex control: finite state machine
° Versatile (can extend for new instructions: add3, swap, etc.)

Recap: Clocking: single-cycle vs. multicycle
[Figure: clock timelines for add $t0,$t1,$t2 and beq $t0,$t1,L under the single-cycle implementation (with wasted time marked) and under the multicycle implementation]
• Multicycle implementation: less waste = higher performance

Recap: How fast can we run the clock?
° Depends on how much we want done per clock cycle
  • Can do: several “inexpensive” datapath operations per clock
    - simple gates (AND, OR, …)
    - single datapath registers (PC)
    - sign extender, left shifter, multiplexor
  • PLUS: exactly one “expensive” datapath operation per clock
    - ALU operation
    - Register File access (2 reads, or 1 write)
    - Memory access (read or write)

MIPS-lite Multicycle Version
Multicycle Datapath (overview)
[Figure: PC, a single Memory (instruction or data), Instruction Register, Memory Data Register, Registers, temporary registers A and B, ALU, ALUOut]
• One ALU (no extra adders)
• One Memory (no separate Imem, Dmem)
• New Temporary Registers (“clocked”/require clock input)

Multicycle Implementation
° Datapath changes
  • one memory: both instructions and data (because they can be accessed on separate steps)
  • one ALU (eliminate extra adders)
  • extra “invisible” registers to capture intermediate (per-step) datapath results
° Controller changes
  • controller must fire control lines in the correct sequence and at the correct time
  • controller must remember current execution step, advance to next step

Multicycle Datapath: Add Multiplexors
[Figure: multicycle datapath with multiplexors added in front of the memory address, the register write port, and both ALU inputs; the second ALU input is a 4-way multiplexor (inputs 0 to 3)]
• Note inputs to multiplexors

Datapath + Control Points
[Figure: multicycle datapath with control signals IRWrite, RegWrite, MemRead, MemWrite, RegDst, IorD, PCSrc, PCWrite, PCWriteCond, ALUSrcA, ALUSrcB (2 bits), MemtoReg, ALUOp (2 bits)]

Multicycle Instruction Execution
° All instructions execute in 3-5 cycles
  • 3 cycles: beq
  • 4 cycles: R-type, sw
  • 5 cycles: lw
° 1: fetch instruction, PC=PC+4
° 2: decode, fetch registers, branch target
° 3: execute/compute address/branch
° 4: access memory/complete R-type
° 5: (lw) store memory data in register

Cycle 1 Datapath: IR = Mem[PC]; PC = PC + 4

Cycle 2: A = Reg[IR[25:21]]; B = Reg[IR[20:16]]; ALUOut = PC + (sgn-ext(IR[15:0]) << 2)

Cycle 3: R-format: ALUOut = A op B

Cycle 4: R-format: Reg[IR[15:11]] = ALUOut
• How many times is the ALU used?
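The four R-format steps above can be traced as plain register transfers. The following is a minimal Python sketch, not the course's own material: the helper names (`sign_extend16`, `encode_r`, `run_r_format`), the dictionary standing in for memory, and the register contents are all assumptions made for illustration. It also counts ALU operations, on the assumption (from the "One ALU, no extra adders" slide) that PC+4 and the branch-target add both go through the single ALU.

```python
# Register-transfer sketch of the multicycle R-format steps (cycles 1-4).
# Assumed for illustration only: helper names, a dict used as memory,
# and the register values in the usage example.

MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Sign-extend the low 16 bits of value to a Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def encode_r(rs, rt, rd, funct):
    """Pack an R-format word: op=0, rs, rt, rd, shamt=0, funct."""
    return (rs << 21) | (rt << 16) | (rd << 11) | funct

# A few standard MIPS funct codes mapped to ALU operations.
ALU_FUNCT = {
    0x20: lambda a, b: (a + b) & MASK32,  # add
    0x22: lambda a, b: (a - b) & MASK32,  # sub
    0x24: lambda a, b: a & b,             # and
    0x25: lambda a, b: a | b,             # or
}

def run_r_format(pc, mem, regs):
    """Execute one R-format instruction; return (new_pc, alu_uses)."""
    alu_uses = 0

    # Cycle 1: IR = Mem[PC]; PC = PC + 4
    ir = mem[pc]
    pc = (pc + 4) & MASK32
    alu_uses += 1

    # Cycle 2: A = Reg[IR[25:21]]; B = Reg[IR[20:16]];
    #          ALUOut = PC + (sgn-ext(IR[15:0]) << 2)
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    alu_out = (pc + (sign_extend16(ir) << 2)) & MASK32  # unused by R-format
    alu_uses += 1

    # Cycle 3: ALUOut = A op B
    alu_out = ALU_FUNCT[ir & 0x3F](a, b)
    alu_uses += 1

    # Cycle 4: Reg[IR[15:11]] = ALUOut
    regs[(ir >> 11) & 0x1F] = alu_out
    return pc, alu_uses

# Usage: add $t1, $t2, $t3 with $t1=9, $t2=10, $t3=11 as register numbers.
regs = [0] * 32
regs[10], regs[11] = 5, 7
mem = {0: encode_r(rs=10, rt=11, rd=9, funct=0x20)}
new_pc, uses = run_r_format(0, mem, regs)
print(new_pc, regs[9], uses)  # prints "4 12 3"
```

Under these assumptions the ALU is exercised in cycles 1, 2 and 3, while cycle 4 only writes the register file.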
Cycle 3 beq: if (A==B) PC = ALUOut

Cycle 3 lw: ALUOut = A + sgn-ext(IR[15:0])
• Control settings shown: RegDst=x, IorD=x, PCSrc=x, ALUSrcA=1, ALUSrcB=2, MemtoReg=x, ALUOp=0

Cycle 4 lw: MDR = Mem[ALUOut]
• Control settings shown: RegDst=x, IorD=1, PCSrc=x, ALUSrcA=x, ALUSrcB=x, MemtoReg=x, ALUOp=x

Cycle 5 lw: Reg[IR[20:16]] = MDR
• Control settings shown: RegDst=0, IorD=x, PCSrc=x, ALUSrcA=x, ALUSrcB=x, MemtoReg=1, ALUOp=x
• With RegDst=0 the write register is IR[20:16] (rt), the destination of lw

Cycle 4 (sw): Mem[ALUOut] = B
• Control settings shown: IorD=1 (values for the other control points are not given)
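The timing figures on the Example slide and the cycle counts on the Multicycle Instruction Execution slide can be checked with a short calculation. This is a hedged sketch: the mapping of each instruction class to the units it uses is an assumption consistent with the step counts stated earlier (5 for lw, 4 for R-type and sw, 3 for beq), and both the jump path (instruction fetch only) and the 3-cycle multicycle jump are assumptions not stated in these notes.

```python
# Worked arithmetic for the single-cycle vs. variable-clock Example and
# for an average multicycle CPI. Names and the jump entries are assumptions.

STEP_NS = {"mem": 2, "reg_read": 1, "alu": 2, "reg_write": 1}

# Units used by each instruction class in the single-cycle datapath.
PATHS = {
    "lw":       ["mem", "reg_read", "alu", "mem", "reg_write"],
    "sw":       ["mem", "reg_read", "alu", "mem"],
    "r_format": ["mem", "reg_read", "alu", "reg_write"],
    "beq":      ["mem", "reg_read", "alu"],
    "jump":     ["mem"],  # assumption: only the instruction fetch matters
}

# Instruction mix from the Example slide.
MIX = {"lw": 0.24, "sw": 0.12, "r_format": 0.44, "beq": 0.18, "jump": 0.02}

# Multicycle cycle counts; the jump entry is an assumption.
CYCLES = {"lw": 5, "sw": 4, "r_format": 4, "beq": 3, "jump": 3}

def weighted_average(values, mix):
    """Average of values[k] weighted by the instruction frequency mix[k]."""
    return sum(values[k] * mix[k] for k in mix)

latency_ns = {name: sum(STEP_NS[s] for s in path) for name, path in PATHS.items()}

single_cycle_period = max(latency_ns.values())        # longest instruction (lw)
variable_period = weighted_average(latency_ns, MIX)   # average with a variable clock
speedup = single_cycle_period / variable_period
multicycle_cpi = weighted_average(CYCLES, MIX)

print(f"single-cycle period: {single_cycle_period} ns")
print(f"variable-clock average: {variable_period:.2f} ns")
print(f"speed-up: {speedup:.2f}")
print(f"multicycle CPI: {multicycle_cpi:.2f}")
```

With these inputs the unrounded average period is about 6.34 ns, which the slide reports as 6.3 ns. Dividing 8 ns by the rounded 6.3 ns gives the slide's 1.27, while the unrounded figure gives roughly 1.26.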