CpE242 Computer Architecture and Engineering
Designing a Single Cycle Datapath

CPE 442 single-cycle datapath.1 Intro. To Computer Architecture

Outline of Today’s Lecture
° Recap and Introduction
  • Where are we with respect to the BIG picture? (10 minutes)
° The Steps of Designing a Processor (10 minutes)
° Datapath and timing for Reg-Reg Operations (15 minutes)
° Datapath for Logical Operations with Immediate (5 minutes)
° Datapath for Load and Store Operations (10 minutes)
° Datapath for Branch and Jump Operations (10 minutes)

Course Overview
Computer Design
Instruction Set Design ("Building Architect")
° Machine Language
° Compiler View
° "Computer Architecture"
° "Instruction Set Processor"
Computer Hardware Design ("Construction Engineer")
° Machine Implementation
° Logic Designer's View
° "Processor Architecture"
° "Computer Organization"

The Big Picture: Where are We Now?
° The Five Classic Components of a Computer
  • Processor: Control and Datapath
  • Memory
  • Input
  • Output
° Today’s Topic: Datapath Design

The Big Picture: The Performance Perspective
° Performance of a machine is determined by:
  • Instruction count
  • Clock cycle time
  • Clock cycles per instruction
° Processor design (datapath and control) will determine:
  • Clock cycle time
  • Clock cycles per instruction
° In the next two lectures:
  • Single cycle processor:
    - Advantage: One clock cycle per instruction
    - Disadvantage: long cycle time

Recall the MIPS Instruction Set Architecture
The MIPS Instruction Formats
° All MIPS instructions are 32 bits long. The three instruction formats:
  • R-type
      bits:   31-26   25-21   20-16   15-11   10-6    5-0
      field:  op      rs      rt      rd      shamt   funct
      width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
  • I-type
      bits:   31-26   25-21   20-16   15-0
      field:  op      rs      rt      immediate
      width:  6 bits  5 bits  5 bits  16 bits
  • J-type
      bits:   31-26   25-0
      field:  op      target address
      width:  6 bits  26 bits
° The different fields are:
  • op: operation of the instruction
  • rs, rt, rd: the source and destination register specifiers
  • shamt: shift amount
  • funct: selects the variant of the operation in the “op” field
  • address / immediate: address offset or immediate value
  • target address: target address of the jump instruction

The MIPS Subset
° ADD and subtract (R-type: op, rs, rt, rd, shamt, funct)
  • add rd, rs, rt
  • sub rd, rs, rt
° OR Immediate (I-type: op, rs, rt, immediate)
  • ori rt, rs, imm16
° LOAD and STORE
  • lw rt, rs, imm16
  • sw rt, rs, imm16
° BRANCH:
  • beq rs, rt, imm16
° JUMP (J-type: op, target address)
  • j target

An Abstract View of the Implementation
[Figure: Instruction Fetch: the PC (clocked by Clk) supplies the Instruction Address to the Ideal Instruction Memory. Operand Fetch: the Instruction’s Rd, Rs, and Rt fields (5 bits each) and Imm (16 bits) drive Rw, Ra, and Rb of the 32 32-bit Registers. Execute: the 32-bit ALU connects to the Ideal Data Memory (Data Address, Data In, DataOut). The registers and the data memory are clocked by Clk.]

Clocking Methodology
[Figure: timing diagram of Clk with Setup and Hold windows around each clock edge; signal values are Don’t Care outside those windows.]
° All storage elements are clocked by the same clock edge
° Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew
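The cycle-time relation on the clocking slide can be evaluated directly. The sketch below is only an illustration: the function name `cycle_time` and all delay values are hypothetical and are not taken from the lecture.

```python
def cycle_time(clk_to_q, longest_delay_path, setup, clock_skew):
    """Minimum clock period; all arguments must use the same time unit."""
    # Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew
    return clk_to_q + longest_delay_path + setup + clock_skew


# Hypothetical delays in nanoseconds, chosen only for illustration.
period_ns = cycle_time(clk_to_q=1, longest_delay_path=12, setup=1, clock_skew=1)
print(period_ns)  # 15
```

Because every instruction takes exactly one cycle in this design, the longest delay path of the slowest instruction sets the period for all of them.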
An Abstract View of the Critical Path
° Register file and ideal memory:
  • The CLK input is a factor ONLY during write operation
  • During read operation, behave as combinational logic:
    - Address valid => Output valid after “access time.”
° Critical Path (Load Operation) =
    PC’s Clk-to-Q +
    Instruction Memory’s Access Time +
    Register File’s Access Time +
    ALU to Perform a 32-bit Add +
    Data Memory Access Time +
    Setup Time for Register File Write +
    Clock Skew
[Figure: the abstract datapath: PC, Ideal Instruction Memory, 32 32-bit Registers (Rw, Ra, Rb), ALU, and Ideal Data Memory (Data Address, Data In, DataOut), all clocked by Clk.]

The Steps of Designing a Processor
0) Instruction Set Architecture => Register Transfer Language Program
   • Convert each instruction to RTL statements
1) Register Transfer Language program => Datapath Design
   • Determine Datapath components
   • Determine Datapath interconnects
2) Datapath components => Control signals
3) Control signals => Control logic => Control Unit Design

Step 0): Instruction Set Architecture => Register Transfer Language Program
Register Transfer Language (RTL): The ADD Instruction
° add rd, rs, rt
  • mem[PC]                   Fetch the instruction from memory
  • R[rd] <- R[rs] + R[rt]    The ADD operation
  • PC <- PC + 4              Calculate the next instruction’s address
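The RTL for add can be read as two state updates. A minimal sketch, assuming a register file modeled as a Python list and results wrapped to 32 bits; the function name `execute_add` is hypothetical, and overflow handling and the fetch step are left out for brevity.

```python
MASK32 = 0xFFFFFFFF  # keep values within 32 bits


def execute_add(regs, pc, rd, rs, rt):
    """Apply the RTL of `add rd, rs, rt` to regs and return the next PC."""
    regs[rd] = (regs[rs] + regs[rt]) & MASK32  # R[rd] <- R[rs] + R[rt]
    return (pc + 4) & MASK32                   # PC <- PC + 4


regs = [0] * 32          # 32 registers, all initially zero (assumption)
regs[1], regs[2] = 5, 7
next_pc = execute_add(regs, pc=0x100, rd=3, rs=1, rt=2)
print(regs[3], hex(next_pc))  # 12 0x104
```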
RTL: The Load Instruction
° lw rt, rs, imm16
  • mem[PC]                           Fetch the instruction from memory
  • Addr <- R[rs] + SignExt(imm16)    Calculate the memory address
  • R[rt] <- Mem[Addr]                Load the data into the register
  • PC <- PC + 4                      Calculate the next instruction’s address

Step 1): RTL To Data Path Design – Data Path Components, Combinational Logic Elements
° Adder: inputs A and B (32 bits each) and CarryIn; outputs Sum (32 bits) and Carry
° MUX: inputs A and B (32 bits each) and Select; output Y (32 bits)
° ALU: inputs A and B (32 bits each) and OP; outputs Result (32 bits) and Zero

Step 1): RTL To Data Path Design – Data Path Components, Storage Elements: Register
° Register
  • Similar to the D Flip Flop except
    - N-bit input and output
    - Write Enable input
  • Write Enable:
    - 0: Data Out will not change
    - 1: Data Out will become Data In
[Figure: register with N-bit Data In, N-bit Data Out, Write Enable, and Clk.]

Step 1): RTL To Data Path Design – Data Path Components, Storage Elements: Register File
° Register File consists of 32 registers:
  • Two 32-bit output busses: busA and busB
  • One 32-bit input bus: busW
° Register is selected by:
  • RA selects the register to put on busA
  • RB selects the register to put on busB
  • RW selects the register to be written via busW when Write Enable is 1
° Clock input (CLK)
  • The CLK input is a factor ONLY during write operation
  • During read operation, behaves as a combinational logic block:
    - RA or RB valid => busA or busB valid after “access time.”
[Figure: register file of 32 32-bit Registers with 5-bit RW, RA, and RB inputs, 32-bit busW input, 32-bit busA and busB outputs, Write Enable, and Clk.]
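The register file behavior described above (combinational reads, clocked writes gated by Write Enable) might be modeled as follows. This is a behavioral sketch only; the class and method names are hypothetical, and access time is not modeled.

```python
class RegisterFile:
    """Behavioral model of a register file with 32 32-bit registers."""

    def __init__(self):
        self.regs = [0] * 32  # initial contents are an assumption

    def read(self, ra, rb):
        # Reads behave like combinational logic: no clock is involved.
        return self.regs[ra], self.regs[rb]  # (busA, busB)

    def clock_edge(self, rw, bus_w, write_enable):
        # The clock matters only for writes, and only when Write Enable is 1.
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(rw=4, bus_w=99, write_enable=False)  # ignored: Write Enable is 0
before = rf.read(4, 0)
rf.clock_edge(rw=4, bus_w=99, write_enable=True)   # register 4 becomes 99
after = rf.read(4, 0)
print(before, after)  # (0, 0) (99, 0)
```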
Step 1): RTL To Data Path Design – Data Path Components, Storage Elements: Idealized Memory
° Memory (idealized)
  • One input bus: Data In
  • One output bus: Data Out
° Memory word is selected by:
  • Address selects the word to put on Data Out
  • Write Enable = 1: address selects the memory word to be written via the Data In bus
° Clock input (CLK)
  • The CLK input is a factor ONLY during write operation
  • During read operation, behaves as a combinational logic block:
    - Address valid => Data Out valid after “access time.”
[Figure: memory block with Address, 32-bit Data In, 32-bit DataOut, Write Enable, and Clk.]

Step 1): RTL To Data Path Design – Data Path Components, Overview of the Instruction Fetch Unit
° The common RTL operations
  • Fetch the Instruction: mem[PC]
  • Update the program counter:
    - Sequential Code: PC <- PC + 4
    - Branch and Jump: PC <- “something else”
[Figure: the PC (clocked by Clk) is updated by the Next Address Logic and supplies the Address to the Instruction Memory, which outputs the 32-bit Instruction Word.]

Step 1): RTL To Data-Path Design – Determine Data-path interconnects
RTL: The ADD Instruction
    bits:   31-26   25-21   20-16   15-11   10-6    5-0
    field:  op      rs      rt      rd      shamt   funct
    width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
° add rd, rs, rt
  • mem[PC]                   Fetch the instruction from memory
  • R[rd] <- R[rs] + R[rt]    The actual operation
  • PC <- PC + 4              Calculate the next instruction’s address
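The R-type field boundaries (31, 26, 21, 16, 11, 6, 0) translate into shifts and masks. A minimal sketch; the function name `decode_r_type` is hypothetical, and the example word uses funct = 0x20, the standard MIPS encoding for add, which the slides do not state.

```python
def decode_r_type(instr):
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "op":    (instr >> 26) & 0x3F,  # bits 31-26, 6 bits
        "rs":    (instr >> 21) & 0x1F,  # bits 25-21, 5 bits
        "rt":    (instr >> 16) & 0x1F,  # bits 20-16, 5 bits
        "rd":    (instr >> 11) & 0x1F,  # bits 15-11, 5 bits
        "shamt": (instr >> 6) & 0x1F,   # bits 10-6, 5 bits
        "funct": instr & 0x3F,          # bits 5-0, 6 bits
    }


# Build the word for add rd=3, rs=1, rt=2 (op = 0, shamt = 0, funct = 0x20).
word = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | (0 << 6) | 0x20
fields = decode_r_type(word)
print(hex(word), fields["rs"], fields["rt"], fields["rd"])  # 0x221820 1 2 3
```

In the datapath, the rs, rt, and rd fields go straight to the register file address inputs, so no extra decoding logic sits in front of the register read.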
RTL: The Subtract Instruction
    bits:   31-26   25-21   20-16   15-11   10-6    5-0
    field:  op      rs      rt      rd      shamt   funct
    width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
° sub rd, rs, rt
  • mem[PC]                   Fetch the instruction from memory
  • R[rd] <- R[rs] - R[rt]    The actual operation
  • PC <- PC + 4              Calculate the next instruction’s address

RTL To Data Path Design – Data-path for Register-Register Operations
° R[rd] <- R[rs] op R[rt]    Example: add rd, rs, rt
  • Ra, Rb, and Rw come from the instruction’s rs, rt, and rd fields
  • ALUctr and RegWr: control logic after decoding the instruction
[Figure: the instruction’s Rd, Rs, and Rt fields (5 bits each) drive Rw, Ra, and Rb of the 32 32-bit Registers; busA and busB (32 bits each) feed the ALU, which is controlled by ALUctr; the 32-bit Result returns on busW and is written under RegWr and Clk.]

Register-Register Timing
[Figure: timing diagram. After the Clk edge, each signal changes from Old Value to New Value in turn:]
  • PC: new value after Clk-to-Q
  • Rs, Rt, Rd, Op, Func: new values after the Instruction Memory Access Time
  • ALUctr, RegWr: new values after the Delay through Control Logic
  • busA, busB: new values after the Register File Access Time
  • busW: new value after the ALU Delay
  • Register Write Occurs Here: at the next clock edge
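The ALU at the center of the register-register datapath can be sketched as a function of ALUctr and the two bus values. The slides do not give an encoding for ALUctr, so the string values "add", "sub", and "or" below are placeholders, and the function name `alu` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def alu(alu_ctr, bus_a, bus_b):
    """Return (result, zero) for two 32-bit operands."""
    if alu_ctr == "add":
        result = (bus_a + bus_b) & MASK32
    elif alu_ctr == "sub":
        result = (bus_a - bus_b) & MASK32  # wraps like 32-bit two's complement
    elif alu_ctr == "or":
        result = (bus_a | bus_b) & MASK32
    else:
        raise ValueError("unsupported ALUctr value: " + repr(alu_ctr))
    return result, result == 0  # Zero is 1 exactly when the result is 0


print(alu("add", 5, 7))  # (12, False)
print(alu("sub", 5, 5))  # (0, True)
```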
RTL: The OR Immediate Instruction
    bits:   31-26   25-21   20-16   15-0
    field:  op      rs      rt      immediate
    width:  6 bits  5 bits  5 bits  16 bits
° ori rt, rs, imm16
  • mem[PC]                            Fetch the instruction from memory
  • R[rt] <- R[rs] or ZeroExt(imm16)   The OR operation
  • PC <- PC + 4                       Calculate the next instruction’s address
° ZeroExt(imm16): bits 31-16 = 0000000000000000 (16 bits), bits 15-0 = immediate (16 bits)

Datapath for Logical Operations with Immediate
° R[rt] <- R[rs] op ZeroExt[imm16]    Example: ori rt, rs, imm16
[Figure: a Mux controlled by RegDst selects Rd or Rt as Rw; Rs drives Ra; Rb is Don’t Care (Rt); imm16 passes through ZeroExt (16 to 32 bits); a Mux controlled by ALUSrc selects busB or the zero-extended immediate as the second ALU input; ALUctr controls the ALU; the 32-bit Result returns on busW and is written under RegWr and Clk.]

Outline of Today’s Lecture
° Recap and Introduction
  • Where are we with respect to the BIG picture? (10 minutes)
° The Steps of Designing a Processor (10 minutes)
° Datapath and timing for Reg-Reg Operations (15 minutes)
° Datapath for Logical Operations with Immediate (5 minutes)
° Datapath for Load and Store Operations (10 minutes)
° Datapath for Branch and Jump Operations (10 minutes)
° Assignment #3

RTL: The Load Instruction
    bits:   31-26   25-21   20-16   15-0
    field:  op      rs      rt      immediate
    width:  6 bits  5 bits  5 bits  16 bits
° lw rt, rs, imm16
  • mem[PC]                           Fetch the instruction from memory
  • Addr <- R[rs] + SignExt(imm16)    Calculate the memory address
  • R[rt] <- Mem[Addr]                Load the data into the register
  • PC <- PC + 4                      Calculate the next instruction’s address
° SignExt operation: bits 15-0 = immediate (16 bits); bits 31-16 = 16 copies of immediate bit 15
  • immediate<15> = 0: bits 31-16 = 0000000000000000
  • immediate<15> = 1: bits 31-16 = 1111111111111111
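The two extension operations differ only in what fills bits 31-16. A minimal sketch, with hypothetical function names `zero_ext` and `sign_ext`, that returns the extended value as an unsigned 32-bit integer:

```python
def zero_ext(imm16):
    """ZeroExt: bits 31-16 are always 0."""
    return imm16 & 0xFFFF


def sign_ext(imm16):
    """SignExt: bits 31-16 are copies of bit 15 of the immediate."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:              # bit 15 is 1: fill the upper half with 1s
        return imm16 | 0xFFFF0000
    return imm16                    # bit 15 is 0: upper half stays 0


print(hex(zero_ext(0x8001)))  # 0x8001
print(hex(sign_ext(0x8001)))  # 0xffff8001
print(hex(sign_ext(0x7FFF)))  # 0x7fff
```

Zero extension suits logical operations such as ori, while sign extension lets lw and sw use negative offsets from the base register.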
Datapath for Load Operations
° R[rt] <- Mem[R[rs] + SignExt[imm16]]    Example: lw rt, rs, imm16
[Figure: a Mux controlled by RegDst selects Rd or Rt as Rw; Rs drives Ra; Rb is Don’t Care (Rt); imm16 passes through the Extender (16 to 32 bits, controlled by ExtOp); a Mux controlled by ALUSrc selects busB or the extended immediate as the second ALU input; the ALU output (controlled by ALUctr) is the Adr input of the Data Memory (WrEn driven by MemWr, Data In, Clk); a Mux controlled by MemtoReg selects the ALU output or the Data Memory output for busW; the write is controlled by RegWr and Clk.]

RTL: The Store Instruction
    bits:   31-26   25-21   20-16   15-0
    field:  op      rs      rt      immediate
    width:  6 bits  5 bits  5 bits  16 bits
° sw rt, rs, imm16
  • mem[PC]                           Fetch the instruction from memory
  • Addr <- R[rs] + SignExt(imm16)    Calculate the memory address
  • Mem[Addr] <- R[rt]                Store the register into memory
  • PC <- PC + 4                      Calculate the next instruction’s address

Datapath for Store Operations
° Mem[R[rs] + SignExt[imm16]] <- R[rt]    Example: sw rt, rs, imm16
[Figure: the same datapath as for load operations, with Rt driving Rb so that busB supplies the Data In input of the Data Memory; MemWr drives WrEn. The control signals shown are RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg.]
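The load and store RTL share the same address calculation and differ only in the direction of the data transfer. A minimal sketch, assuming the data memory is a Python dict keyed by byte address and that unwritten words read as 0; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16):
    """Sign-extend a 16-bit immediate to an unsigned 32-bit value."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16


def execute_lw(regs, mem, rt, rs, imm16):
    addr = (regs[rs] + sign_ext(imm16)) & MASK32  # Addr <- R[rs] + SignExt(imm16)
    regs[rt] = mem.get(addr, 0)                   # R[rt] <- Mem[Addr]


def execute_sw(regs, mem, rt, rs, imm16):
    addr = (regs[rs] + sign_ext(imm16)) & MASK32  # Addr <- R[rs] + SignExt(imm16)
    mem[addr] = regs[rt]                          # Mem[Addr] <- R[rt]


regs, mem = [0] * 32, {}
regs[1], regs[2] = 0x1000, 42
execute_sw(regs, mem, rt=2, rs=1, imm16=0xFFFC)  # offset -4: writes address 0xFFC
execute_lw(regs, mem, rt=3, rs=1, imm16=0xFFFC)  # reads the same word back
print(hex(list(mem)[0]), regs[3])  # 0xffc 42
```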
RTL: The Branch Instruction
    bits:   31-26   25-21   20-16   15-0
    field:  op      rs      rt      immediate
    width:  6 bits  5 bits  5 bits  16 bits
° beq rs, rt, imm16
  • mem[PC]                   Fetch the instruction from memory
  • Cond <- R[rs] - R[rt]     Calculate the branch condition
  • if (Cond eq 0)            Calculate the next instruction’s address
        PC <- PC + 4 + ( SignExt(imm16) x 4 )
    else
        PC <- PC + 4

Datapath for Branch Operations
° beq rs, rt, imm16    We need to compare Rs and Rt!
[Figure: Rs and Rt drive Ra and Rb of the 32 32-bit Registers; busA and busB feed the ALU (controlled by ALUctr), whose Zero output goes to the Next Address Logic together with Branch and imm16; the Next Address Logic updates the PC (clocked by Clk), which goes To Instruction Memory. RegDst, RegWr, ALUSrc, and ExtOp are shown as in the earlier datapaths.]
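The branch RTL can be checked with a small function that returns the next PC. This is a sketch under the same assumptions as the RTL (32-bit byte-addressed PC); the name `next_pc_beq` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16):
    """Sign-extend a 16-bit immediate to an unsigned 32-bit value."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16


def next_pc_beq(pc, rs_val, rt_val, imm16):
    """Next PC for `beq rs, rt, imm16`, given the two register values."""
    cond = (rs_val - rt_val) & MASK32                    # Cond <- R[rs] - R[rt]
    if cond == 0:
        return (pc + 4 + sign_ext(imm16) * 4) & MASK32   # taken branch
    return (pc + 4) & MASK32                             # fall through


print(hex(next_pc_beq(0x100, 7, 7, 3)))       # 0x110
print(hex(next_pc_beq(0x100, 7, 7, 0xFFFF)))  # 0x100 (offset -1 returns to the branch)
print(hex(next_pc_beq(0x100, 7, 8, 3)))       # 0x104
```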
Binary Arithmetic for the Next Address
° In theory, the PC is a 32-bit byte address into the instruction memory:
  • Sequential operation: PC<31:0> = PC<31:0> + 4
  • Branch operation: PC<31:0> = PC<31:0> + 4 + SignExt[Imm16] * 4
° The magic number “4” always comes up because:
  • The 32-bit PC is a byte address
  • And all our instructions are 4 bytes (32 bits) long
° In other words:
  • The 2 LSBs of the 32-bit PC are always zeros
  • There is no reason to have hardware to keep the 2 LSBs
° In practice, we can simplify the hardware by using a 30-bit PC<31:2>:
  • Sequential operation: PC<31:2> = PC<31:2> + 1
  • Branch operation: PC<31:2> = PC<31:2> + 1 + SignExt[Imm16]
  • In either case: Instruction Memory Address = PC<31:2> concat “00”

Next Address Logic: Expensive and Fast Solution
° Using a 30-bit PC:
  • Sequential operation: PC<31:2> = PC<31:2> + 1
  • Branch operation: PC<31:2> = PC<31:2> + 1 + SignExt[Imm16]
  • In either case: Instruction Memory Address = PC<31:2> concat “00”
[Figure: the 30-bit PC (clocked by Clk) feeds an Adder that adds “1”; a second Adder adds SignExt(imm16), taken from Instruction<15:0>; a Mux controlled by Branch and Zero selects between the two sums as the next PC. Addr<31:2> comes from the PC and Addr<1:0> is “00”; the Instruction Memory outputs Instruction<31:0>.]

Next Address Logic: Cheap and Slow Solution
° Why is this slow?
  • Cannot start the address add until Zero (output of ALU) is valid
° Does it matter that this is slow in the overall scheme of things?
  • Probably not here. Critical path is the load operation.
[Figure: a single Adder with Carry In tied to “1”; a Mux controlled by Branch and Zero selects “0” or SignExt(imm16), taken from Instruction<15:0>, as the value added to the 30-bit PC. Addr<31:2> comes from the PC and Addr<1:0> is “00”; the Instruction Memory outputs Instruction<31:0>.]
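The 30-bit formulation can be sketched and compared against byte addresses. The function names below are hypothetical; the sketch models only the arithmetic, not the difference in delay between the two hardware solutions.

```python
MASK30 = (1 << 30) - 1


def sign_ext30(imm16):
    """Sign-extend a 16-bit immediate to 30 bits."""
    imm16 &= 0xFFFF
    return imm16 | (MASK30 & ~0xFFFF) if imm16 & 0x8000 else imm16


def next_pc30(pc30, imm16, branch, zero):
    """Next value of PC<31:2>; branch and zero model the Branch and Zero signals."""
    nxt = (pc30 + 1) & MASK30                     # PC<31:2> + 1
    if branch and zero:
        nxt = (nxt + sign_ext30(imm16)) & MASK30  # ... + SignExt[Imm16]
    return nxt


def instr_addr(pc30):
    """Instruction Memory Address = PC<31:2> concat "00"."""
    return pc30 << 2


pc30 = 0x100 >> 2  # byte address 0x100 held as a 30-bit word address
print(hex(instr_addr(next_pc30(pc30, 3, True, True))))   # 0x110
print(hex(instr_addr(next_pc30(pc30, 3, True, False))))  # 0x104
```

The results match the byte-address arithmetic PC + 4 + SignExt[Imm16] * 4, since shifting left by two at the end is the same as multiplying every term by 4.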
RTL: The Jump Instruction
    bits:   31-26   25-0
    field:  op      target address
    width:  6 bits  26 bits
° j target
  • mem[PC]                                       Fetch the instruction from memory
  • PC<31:2> <- PC<31:28> concat target<25:0>     Calculate the next instruction’s address

Instruction Fetch Unit
° j target
  • PC<31:2> <- PC<31:28> concat target<25:0>
[Figure: the next address logic for sequential and branch operation (Adders, SignExt of imm16 from Instruction<15:0>, and a Mux controlled by Branch and Zero) feeds a second Mux controlled by Jump, which selects between that result and PC<31:28> concatenated with Target (Instruction<25:0>, 26 bits). The 30-bit PC (clocked by Clk) supplies Addr<31:2>, Addr<1:0> is “00”, and the Instruction Memory outputs Instruction<31:0>.]

Putting it All Together: A Single Cycle Datapath
° We have everything except control signals (underlined)
[Figure: the Instruction Fetch Unit (inputs Branch, Jump, Zero, and Clk) outputs Instruction<31:0>, which is split into Rs = Instruction<25:21>, Rt = Instruction<20:16>, Rd = Instruction<15:11>, and Imm16 = Instruction<15:0>. A Mux controlled by RegDst selects Rd or Rt as Rw; Rs and Rt drive Ra and Rb of the 32 32-bit Registers; the Extender (controlled by ExtOp) widens imm16 from 16 to 32 bits; a Mux controlled by ALUSrc selects busB or the extended immediate; the ALU (controlled by ALUctr) produces Zero and the Adr input of the Data Memory (WrEn driven by MemWr, Data In from busB, Clk); a Mux controlled by MemtoReg selects the ALU output or the Data Memory output for busW. Control signals: RegDst, RegWr, ALUctr, ExtOp, ALUSrc, MemWr, MemtoReg, Branch, Jump.]
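The jump RTL from the Instruction Fetch Unit slides is a pure bit concatenation. A minimal sketch using the 30-bit PC representation; the function name `jump_pc30` is hypothetical, and the upper bits are taken from the current PC exactly as the RTL on the slide states.

```python
def jump_pc30(pc30, target26):
    """PC<31:2> <- PC<31:28> concat target<25:0>, with the PC held as 30 bits."""
    upper4 = (pc30 >> 26) & 0xF                     # PC<31:28>: top 4 of the 30 bits
    return (upper4 << 26) | (target26 & 0x3FFFFFF)  # append the 26-bit target


pc30 = 0x10000100 >> 2            # byte address 0x10000100 as a word address
new_pc30 = jump_pc30(pc30, 0x123)
print(hex(new_pc30 << 2))         # 0x1000048c (instruction memory byte address)
```

Four PC bits plus 26 target bits give exactly the 30 bits of PC<31:2>, which is why the upper field must be PC<31:28>.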