RISC Processor Design
Single Cycle Implementation
MIPS

Virendra Singh
Computer Design and Test Lab.
Indian Institute of Science, Bangalore
[email protected]

Advance Computer Architecture
http://www.serc.iisc.ernet.in/~viren/ACA/ACA.htm


MIPS Instructions

MIPS assembly language

Category            Instruction               Example               Meaning                                    Comments
Arithmetic          add                       add $s1, $s2, $s3     $s1 = $s2 + $s3                            Three operands; data in registers
                    subtract                  sub $s1, $s2, $s3     $s1 = $s2 - $s3                            Three operands; data in registers
                    add immediate             addi $s1, $s2, 100    $s1 = $s2 + 100                            Used to add constants
Data transfer       load word                 lw $s1, 100($s2)      $s1 = Memory[$s2 + 100]                    Word from memory to register
                    store word                sw $s1, 100($s2)      Memory[$s2 + 100] = $s1                    Word from register to memory
                    load byte                 lb $s1, 100($s2)      $s1 = Memory[$s2 + 100]                    Byte from memory to register
                    store byte                sb $s1, 100($s2)      Memory[$s2 + 100] = $s1                    Byte from register to memory
                    load upper immediate      lui $s1, 100          $s1 = 100 × 2^16                           Loads constant in upper 16 bits
Conditional branch  branch on equal           beq $s1, $s2, 25      if ($s1 == $s2) go to PC + 4 + 100         Equal test; PC-relative branch
                    branch on not equal       bne $s1, $s2, 25      if ($s1 != $s2) go to PC + 4 + 100         Not equal test; PC-relative
                    set on less than          slt $s1, $s2, $s3     if ($s2 < $s3) $s1 = 1; else $s1 = 0       Compare less than; for beq, bne
                    set less than immediate   slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0       Compare less than constant
Unconditional jump  jump                      j 2500                go to 10000                                Jump to target address
                    jump register             jr $ra                go to $ra                                  For switch, procedure return
                    jump and link             jal 2500              $ra = PC + 4; go to 10000                  For procedure call


MIPS Registers and Memory

32 registers
  Names:    $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
  Comments: Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.

2^30 memory words
  Names:    Memory[0], Memory[4], ..., Memory[4294967292]
  Comments: Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.


Addressing Modes

1. Immediate addressing (example: addi)
   Fields: op | rs | rt | Immediate
2. Register addressing (example: add)
   Fields: op | rs | rt | rd | ... | funct; the operand is a register in the register file.
3. Base addressing (examples: lw, sw)
   Fields: op | rs | rt | Address; register + Address selects a byte, halfword or word in memory.
4. PC-relative addressing (examples: beq, bne)
   Fields: op | rs | rt | Address; PC + Address selects a word in memory.
5. Pseudodirect addressing (example: j)
   Fields: op | Address; Address combined with the PC selects a word in memory.

2004 © Morgan Kaufman Publishers


MIPS – Instruction Set

MIPS Instruction – Subset
• Arithmetic and Logical Instructions: add, sub, or, and, slt
• Memory reference Instructions: lw, sw
• Branch: beq, j


Overview of MIPS

• simple instructions, all 32 bits wide
• very structured, no unnecessary baggage
• only three instruction formats

    R   op | rs1 | rs2 | rd | shamt | funct
    I   op | rs1 | rd | 16 bit address
    J   op | 26 bit address

• rely on compiler to achieve performance


Where Does It All Begin?

• In a register called program counter (PC).
• PC contains the memory address of the next instruction to be executed.
• In the beginning, PC contains the address of the memory location where the program begins.


Where is the Program?

[Figure: the processor's program counter (register) holds the start address, which points into the memory holding the machine code of the program.]


How Does It Run?
1. Start: PC has memory address where program begins.
2. Fetch instruction word from memory address in PC and increment PC ← PC + 4 for next instruction.
3. Decode and execute instruction.
4. Program complete? If no, go back to step 2. If yes, STOP.


Datapath and Control

• Datapath: Memory, registers, adders, ALU, and communication buses. Each step (fetch, decode, execute) requires communication (data transfer) paths between memory, registers and ALU.
• Control: Datapath for each step is set up by control signals that set up dataflow directions on communication buses and select ALU and memory functions. Control signals are generated by a control unit consisting of one or more finite-state machines.


Datapath for Instruction Fetch

[Figure: PC supplies the address to the instruction memory; an adder computes PC + 4 and feeds it back to PC; the instruction word goes to the control unit and registers.]


Register File: A Datapath Component

[Figure: register file (32 registers) with two 5-bit read register numbers (reg 1, reg 2), a 5-bit write register number, a 32-bit write data input, two 32-bit outputs (reg 1 data, reg 2 data), and RegWrite from control.]


Multi-Operation ALU

Operation select    ALU function
000                 AND
001                 OR
010                 Add
110                 Subtract
111                 Set on less than

[Figure: ALU with a 3-bit operation select from control and outputs result, zero and overflow.]
zero = 1, when all bits of result are 0


R-Type Instructions

• Also known as arithmetic-logical instructions
• add, sub, slt
• Example: add $t0, $s1, $s2
  • Machine instruction word

        000000  10001  10010  01000  00000  100000
        opcode  $s1    $s2    $t0           function

  • Read two registers
  • Write one register
  • Opcode and function code go to control unit that generates RegWrite and ALU operation code.


Datapath for R-Type Instruction

    000000  10001  10010  01000  00000  100000
    opcode  $s1    $s2    $t0           function
    (add)                               (add)

[Figure: read register numbers 10001 ($s1) and 10010 ($s2) select two 32-bit operands from the register file; the ALU, with its 3-bit operation select from control set to add, produces the result (plus zero and overflow); the result is the write data for write register 01000 ($t0), with RegWrite from control activated.]
Load and Store Instructions

I-type instructions

    lw $t0, 1200 ($t1)     # incr. in bytes

        100011  01001  01000  0000 0100 1011 0000
        opcode  $t1    $t0    1200

    sw $t0, 1200 ($t1)     # incr. in bytes

        101011  01001  01000  0000 0100 1011 0000
        opcode  $t1    $t0    1200


Datapath for lw Instruction

    100011  01001  01000  0000 0100 1011 0000
    opcode  $t1    $t0    1200

[Figure: read register number 01001 selects $t1; the 16-bit offset 0000 0100 1011 0000 is sign extended to 32 bits; the ALU (operation select from control: add) adds the two to form the data memory address; with MemRead activated, the memory data is written to write register 01000 ($t0), with RegWrite from control activated. MemWrite is not activated.]


Datapath for sw Instruction

    101011  01001  01000  0000 0100 1011 0000
    opcode  $t1    $t0    1200

[Figure: read register numbers 01001 ($t1) and 01000 ($t0); the 16-bit offset is sign extended to 32 bits; the ALU (operation select from control: add) adds $t1 and the offset to form the data memory address; with MemWrite activated, the $t0 data is written to memory. RegWrite and MemRead are not activated.]


Branch Instruction (I-Type)

    beq $s1, $s2, 25     # if $s1 = $s2, advance PC through 25 instructions

        000100  10001  10010  0000 0000 0001 1001
        opcode  $s1    $s2    25 (16 bits)

Note: Can branch within ±2^15 words from the current instruction address in PC.


Datapath for beq Instruction

    000100  10001  10010  0000 0000 0001 1001
    opcode  $s1    $s2    25

[Figure: read register numbers 10001 ($s1) and 10010 ($s2); the ALU (operation select from control: subtract) compares them and its zero output goes to the branch control logic; the 16-bit offset 0000 0000 0001 1001 is sign extended to 32 bits, shifted left 2, and added to PC+4 from the instruction fetch datapath to give the 32-bit branch target. RegWrite from control is not activated.]
J-Type Instruction

    j 2500     # jump to instruction 2,500

        000010  0000 0000 0000 0010 0111 0001 00
        opcode  2,500 (26 bits)

    32-bit jump address:

        0000 0000 0000 0000 0010 0111 0001 0000

    Bits 28-31 come from PC+4; the 26-bit field is shifted left by 2 to fill bits 0-27.


Datapath for Jump Instruction

[Figure: PC addresses the instruction memory; an adder computes PC+4; a mux controlled by Branch selects between PC+4 and the branch address; instruction bits 0-25 (26 bits) are shifted left 2 to 28 bits and combined with the upper 4 bits of PC+4 to form the 32-bit jump address; a mux controlled by Jump selects between the branch mux output and the jump address as the next PC; the opcode (bits 26-31, 6 bits) goes to control, and the instruction word goes to control and registers.]


Combined Datapaths

[Figure: combined single-cycle datapath: PC, instruction memory, register file (read registers from bits 21-25 and 16-20, write register from bits 16-20 or 11-15 via the RegDst mux), sign extend of bits 0-15, ALU with ALU control (bits 0-5), data memory with MemRead and MemWrite, MemtoReg mux, adders for PC+4 and the branch target (shift left 2), Branch mux gated by ALU zero, and Jump mux fed by bits 0-25 shifted left 2; opcode bits 26-31 go to CONTROL.]


Control

[Figure: the control logic takes the opcode (instruction bits 26-31) and generates RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp (2 bits), MemWrite, ALUSrc and RegWrite; the ALU control takes ALUOp and the funct. field (instruction bits 0-5) and drives the ALU.]


Control Logic: Truth Table

Inputs: instr. opcode bits 31-26. Outputs: control signals.

Instr   Opcode bits                                                                     
type    31-26         RegDst  Jump  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp2
R       000000        1       0     0       0         1         0        0         0       1       0
lw      100011        0       0     1       1         1         1        0         0       0       0
sw      101011        X       0     1       X         0         0        1         0       0       0
beq     000100        X       0     0       X         0         0        0         1       0       1
j       000010        X       1     X       X         X         X        X         X       X       X

X marks a don't-care as far as the instruction's result is concerned. For j, RegWrite and MemWrite still have to be 0 in an actual implementation so that the jump does not modify a register or memory.


How Long Does It Take?

• Assume control logic is fast and does not affect the critical timing.
• Major time delay components are ALU, memory read/write, and register read/write.
Arithmetic-type (R-type)
• Fetch (memory read)       2ns
• Register read             1ns
• ALU operation             2ns
• Register write            1ns
• Total                     6ns


Time for lw and sw (I-Types)

ALU (R-type)                                6ns
Load word (I-type)
• Fetch (memory read)       2ns
• Register read             1ns
• ALU operation             2ns
• Get data (mem. read)      2ns
• Register write            1ns
• Total                     8ns
Store word (no register write)              7ns


Time for beq (I-Type)

ALU (R-type)                                6ns
Load word (I-type)                          8ns
Store word (I-type)                         7ns
Branch on equal (I-type)
• Fetch (memory read)       2ns
• Register read             1ns
• ALU operation             2ns
• Total                     5ns


Time for Jump (J-Type)

ALU (R-type)                                6ns
Load word (I-type)                          8ns
Store word (I-type)                         7ns
Branch on equal (I-type)                    5ns
Jump (J-type)
• Fetch (memory read)       2ns
• Total                     2ns


How Fast Can the Clock Be?

If every instruction is executed in one clock cycle, then:
• Clock period must be at least 8ns to perform the longest instruction, i.e., lw.
• This is a single cycle machine.
• It is slower because many instructions take less than 8ns but are still allowed that much time.

Method of speeding up: Use multicycle datapath.


A Single Cycle Example

[Figure: 32-bit ripple-carry adder built from 32 1-bit full adders; inputs a31 ... a0 and b31 ... b0, carry-in 0, outputs c32 and s31 ... s0.]
• Delay of 1-bit full adder = 1ns
• Clock period ≥ 32ns
• Time of adding words ~ 32ns
• Time of adding bytes ~ 32ns


A Multicycle Implementation

[Figure: bit-serial adder built from a single 1-bit full adder; operands a31 ... a0 and b31 ... b0 and the sum s31 ... s0 are held in shift registers, and the carry is held in a flip-flop (FF) initialized to 0; c32 is the final carry.]
• Delay of 1-bit full adder = 1ns
• Clock period ≥ 1ns
• Time of adding words ~ 32ns
• Time of adding bytes ~ 8ns
Multicycle Datapath

[Figure: PC, memory (Addr., Data), instruction register (IR), memory data register (MDR), register file, A and B registers, ALU (with constant 4 as one input choice) and ALUOut register.]
One-cycle data transfer paths (need registers to hold data)


Multicycle Datapath Requirements

• Only one ALU, since it can be reused.
• Single memory for instructions and data.
• Five registers added:
  • Instruction register (IR)
  • Memory data register (MDR)
  • Three ALU registers, A and B for inputs and ALUOut for output


Multicycle Datapath

[Figure: full multicycle datapath with control signals IorD, MemRead, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB, ALUOp, PCSource and "PCWrite etc.". IR bits 26-31 go to the control FSM; bits 21-25 and 16-20 select the read registers; bits 16-20 or 11-15 select the write register; bits 0-15 are sign extended and optionally shifted left 2; bits 0-5 go to ALU control; bits 0-25 shifted left 2, joined with PC bits 28-31, form the jump address. Each MUX has inputs in1 and in2, a control input and an output out.]


3 to 5 Cycles for an Instruction

Instruction types: R-type (4 cycles), Mem. Ref. (4 or 5 cycles), Branch type (3 cycles), J-type (3 cycles).

Step 1. Instruction fetch (all types)
    IR ← Memory[PC]; PC ← PC+4

Step 2. Instr. decode/Reg. fetch (all types)
    A ← Reg(IR[21-25]); B ← Reg(IR[16-20])
    ALUOut ← PC + (sign extend IR[0-15]) << 2

Step 3. Execution, addr. comp., branch & jump completion
    R-type:       ALUOut ← A op B
    Mem. Ref.:    ALUOut ← A + sign extend (IR[0-15])
    Branch type:  If (A == B) then PC ← ALUOut
    J-type:       PC ← PC[28-31] || (IR[0-25] << 2)

Step 4. Mem. access or R-type completion
    R-type:       Reg(IR[11-15]) ← ALUOut
    Mem. Ref.:    MDR ← M[ALUOut] or M[ALUOut] ← B

Step 5. Memory read completion
    Mem. Ref.:    Reg(IR[16-20]) ← MDR


Cycle 1 of 5: Instruction Fetch (IF)

Read instruction into IR, M[PC] → IR
• Control signals used:
  • IorD      = 0     select PC
  • MemRead   = 1     read memory
  • IRWrite   = 1     write IR

Increment PC, PC + 4 → PC
• Control signals used:
  • ALUSrcA   = 0     select PC into ALU
  • ALUSrcB   = 01    select constant 4
  • ALUOp     = 00    ALU adds
  • PCSource  = 00    select ALU output
  • PCWrite   = 1     write PC


Cycle 2 of 5: Instruction Decode (ID)

        31-26    25-21   20-16   15-11   10-6    5-0
    R   opcode | reg 1 | reg 2 | reg 3 | shamt | fncode
    I   opcode | reg 1 | reg 2 | word address increment
    J   opcode | word address jump

Control unit decodes instruction
Datapath prepares for execution
• R and I types, reg 1 → A reg, reg 2 → B reg
  • No control signals needed
• Branch type, compute branch address in ALUOut
  • ALUSrcA   = 0     select PC into ALU
  • ALUSrcB   = 11    instr. bits 0-15 shift 2 into ALU
  • ALUOp     = 00    ALU adds


Cycle 3 of 5: Execute (EX)

R type: execute function on reg A and reg B, result in ALUOut
• Control signals used:
  • ALUSrcA   = 1     A reg into ALU
  • ALUSrcB   = 00    B reg into ALU
  • ALUOp     = 10    instr. bits 0-5 control ALU

I type, lw or sw: compute memory address in ALUOut ← A reg + sign extend IR[0-15]
• Control signals used:
  • ALUSrcA   = 1     A reg into ALU
  • ALUSrcB   = 10    instr. bits 0-15 into ALU
  • ALUOp     = 00    ALU adds


Cycle 3 of 5: Execute (EX)

I type, beq: subtract reg A and reg B, write ALUOut to PC
• Control signals used:
  • ALUSrcA   = 1     A reg into ALU
  • ALUSrcB   = 00    B reg into ALU
  • ALUOp     = 01    ALU subtracts
  • If zero = 1, PCSource = 01       ALUOut to PC
  • If zero = 1, PCWriteCond = 1     write PC
  • Instruction complete, go to IF

J type: write jump address to PC ← IR[0-25] shift 2 and four leading bits of PC
• Control signals used:
  • PCSource  = 10
  • PCWrite   = 1     write PC
  • Instruction complete, go to IF


Cycle 4 of 5: Reg Write/Memory

R type, write destination register from ALUOut
• Control signals used:
  • RegDst    = 1     instr. bits 11-15 specify reg.
  • MemtoReg  = 0     ALUOut into reg.
  • RegWrite  = 1     write register
  • Instruction complete, go to IF

I type, lw: read M[ALUOut] into MDR
• Control signals used:
  • IorD      = 1     select ALUOut into mem adr.
  • MemRead   = 1     read memory to MDR

I type, sw: write M[ALUOut] from B reg
• Control signals used:
  • IorD      = 1     select ALUOut into mem adr.
  • MemWrite  = 1     write memory
  • Instruction complete, go to IF


Cycle 5 of 5: Reg Write

I type, lw: write MDR to reg[IR(16-20)]
• Control signals used:
  • RegDst    = 0     instr. bits 16-20 are write reg
  • MemtoReg  = 1     MDR to reg file write input
  • RegWrite  = 1     write register
  • Instruction complete, go to IF

For an alternative method of designing datapath, see N. Tredennick, Microprocessor Logic Design, the Flowchart Method, Digital Press, 1987.


1-bit Control Signals

Signal name    Value = 0                        Value = 1
RegDst         Write reg. # = bits 16-20        Write reg. # = bits 11-15
RegWrite       No action                        Write reg. ← Write data
ALUSrcA        First ALU operand ← PC           First ALU operand ← Reg. A
MemRead        No action                        Mem. Data Output ← M[Addr.]
MemWrite       No action                        M[Addr.] ← Mem. Data Input
MemtoReg       Reg. File Write In ← ALUOut      Reg. File Write In ← MDR
IorD           Mem. Addr. ← PC                  Mem. Addr. ← ALUOut
IRWrite        No action                        IR ← Mem. Data Output
PCWrite        No action                        PC is written
PCWriteCond    No action                        PC is written if zero(ALU) = 1

[Figure: zero(ALU) and PCWriteCond are ANDed, and the result is ORed with PCWrite to produce the signal labeled "PCWrite etc.".]


2-bit Control Signals

Signal name    Value    Action
ALUOp          00       ALU performs add
               01       ALU performs subtract
               10       Funct. field (bits 0-5 of IR) determines ALU operation
ALUSrcB        00       Second input of ALU ← B reg.
               01       Second input of ALU ← 4 (constant)
               10       Second input of ALU ← bits 0-15 of IR sign ext. to 32b
               11       Second input of ALU ← bits 0-15 of IR sign ext. and left shift 2 bits
PCSource       00       ALU output (PC + 4) sent to PC
               01       ALUOut (branch target addr.) sent to PC
               10       Jump address IR[0-25] shifted left 2 bits, concatenated with PC+4[28-31], sent to PC


Control: Finite State Machine

• Start
• State 0 (clock cycle 1): Instruction fetch
• State 1 (clock cycle 2): Instruction decode and register fetch
• Clock cycles 3-5, one of:
  • FSM-M: Memory access instr.
  • FSM-R: R-type instr.
  • FSM-B: Branch instr.
  • FSM-J: Jump instr.


State 0: Instruction Fetch (CC1)

[Figure: multicycle datapath with the signals active in state 0: IorD = 0, MemRead = 1, IRWrite = 1, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00 (add), PCSource = 00, PCWrite etc. = 1.]


State 0 Control FSM Outputs

Start → State 0: Instruction fetch
    MemRead = 1
    ALUSrcA = 0
    IorD = 0
    IRWrite = 1
    ALUSrcB = 01
    ALUOp = 00
    PCWrite = 1
    PCSource = 00
State 0 → State 1: Instruction decode/Register fetch/Branch addr.
    Outputs?
State 1: Instr. Decode/Reg. Fetch/Branch Address (CC2)

[Figure: multicycle datapath with the signals active in state 1: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00 (add).]


State 1 Control FSM Outputs

Start → State 0: Instruction fetch (IF)
    MemRead = 1
    ALUSrcA = 0
    IorD = 0
    IRWrite = 1
    ALUSrcB = 01
    ALUOp = 00
    PCWrite = 1
    PCSource = 00
State 0 → State 1: Instruction decode (ID)/Register fetch/Branch addr.
    ALUSrcA = 0
    ALUSrcB = 11
    ALUOp = 00
From State 1:
    Opcode = lw, sw    → FSM-M
    Opcode = R-type    → FSM-R
    Opcode = BEQ       → FSM-B
    Opcode = J-type    → FSM-J


State 1 (Opcode = lw) → FSM-M (CC3-5)

[Figure: multicycle datapath with the signals active for lw. CC3: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00 (add). CC4: IorD = 1, MemRead = 1. CC5: RegDst = 0, MemtoReg = 1, RegWrite = 1.]


State 1 (Opcode = sw) → FSM-M (CC3-4)

[Figure: multicycle datapath with the signals active for sw. CC3: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00 (add). CC4: IorD = 1, MemWrite = 1.]


FSM-M (Memory Access)

From state 1, Opcode = "lw" or "sw"
• Compute mem address
    ALUSrcA = 1
    ALUSrcB = 10
    ALUOp = 00
• Opcode = "lw": Read memory data
    MemRead = 1
    IorD = 1
  then Write register
    RegWrite = 1
    MemtoReg = 1
    RegDst = 0
• Opcode = "sw": Write memory
    MemWrite = 1
    IorD = 1
• To state 0 (Instr. Fetch)


State 1 (Opcode = R-type) → FSM-R (CC3-4)

[Figure: multicycle datapath with the signals active for an R-type instruction. CC3: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10 (ALU control uses the funct. code). CC4: write register selected by bits 11-15 (RegDst = 1, as in the FSM-R state outputs), MemtoReg = 0, RegWrite = 1.]
FSM-R (R-type Instruction)

From state 1, Opcode = R-type
• ALU operation
    ALUSrcA = 1
    ALUSrcB = 00
    ALUOp = 10
• Write register
    RegWrite = 1
    MemtoReg = 0
    RegDst = 1
• To state 0 (Instr. Fetch)


State 1 (Opcode = beq) → FSM-B (CC3)

[Figure: multicycle datapath with the signals active for beq in CC3: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01 (subtract), PCSource = 01, and PCWrite etc. = 1 if (zero).]


Write PC on "zero"

[Figure: with zero = 1 and PCWriteCond = 1, the gate output "PCWrite etc." = 1 even though PCWrite is not asserted.]


FSM-B (Branch)

From state 1, Opcode = "beq"
• Write PC on branch condition
    ALUSrcA = 1
    ALUSrcB = 00
    ALUOp = 01
    PCWriteCond = 1
    PCSource = 01
• Branch condition: If A – B = 0, zero = 1
• To state 0 (Instr. Fetch)


State 1 (Opcode = j) → FSM-J (CC3)

[Figure: multicycle datapath with the signals active for j in CC3: PCSource = 10 and PCWrite etc. = 1.]


Write PC

[Figure: with PCWrite = 1, the gate output "PCWrite etc." = 1 regardless of zero and PCWriteCond.]


FSM-J (Jump)

From state 1, Opcode = "jump"
• Write jump addr. in PC
    PCWrite = 1
    PCSource = 10
• To state 0 (Instr. Fetch)


Control FSM

State    Action                                     Entered from         Next state
0        Instr. fetch/adv. PC                       Start, 4, 5, 7, 8, 9  1
1        Instr. decode/reg. fetch/branch addr.      0                    2 (lw or sw), 6 (R), 8 (B), 9 (J)
2        Compute memory addr.                       1 (lw or sw)         3 (lw), 5 (sw)
3        Read memory data                           2 (lw)               4
4        Write register                             3                    0
5        Write memory data                          2 (sw)               0
6        ALU operation                              1 (R)                7
7        Write register                             6                    0
8        Write PC on branch condition               1 (B)                0
9        Write jump addr. to PC                     1 (J)                0


Control FSM (Controller)

[Figure: combinational logic with 6 inputs (opcode) plus the present state, producing 16 control outputs and the next state; four flip-flops (FF), driven by Reset and Clock, hold the state.]


Designing the Control FSM

• Encode states; need 4 bits for 10 states, e.g., state 0 is 0000, state 1 is 0001, and so on.
• Write a truth table for combinational logic:

    Opcode    Present state    Control signals       Next state
    000000    0000             0001000110000100      0001
    ....      ....             ....                  ....

• Synthesize a logic circuit from the truth table.
• Connect four flip-flops between the next state outputs and present state inputs.


Block Diagram of a Processor

[Figure: the controller (control FSM), driven by Reset and Clock, receives the 6-bit opcode, zero and overflow from the datapath and sends it MemRead, MemWrite, PCWriteCond, PCWrite, IRWrite, IorD, MemtoReg, RegWrite, RegDst, ALUSrcA, ALUSrcB (2 bits) and PCSource (2 bits); it also sends ALUOp (2 bits) to the ALU control, which combines it with funct. [0,5] into a 3-bit ALU operation code. The datapath (PC, register file, registers, ALU) exchanges Mem. Addr., Mem. write data and Mem. data out with memory.]


Exceptions or Interrupts

Conditions under which the processor may produce incorrect result or may "hang".
• Illegal or undefined opcode.
• Arithmetic overflow, divide by zero, etc.
• Out of bounds memory address.

EPC: 32-bit register holds the affected instruction address.
Cause: 32-bit register holds an encoded exception type. For example,
• 0 for undefined instruction
• 1 for arithmetic overflow


Implementing Exceptions

[Figure: exception hardware added to the multicycle datapath. Overflow from the ALU goes to the control FSM. A mux selecting 0 or 1 feeds the 32-bit Cause register, written with CauseWrite = 1. The ALU, with ALUSrcA = 0 (PC), ALUSrcB = 01 (constant 4) and ALUOp = 01 (subtract), feeds EPC, written with EPCWrite = 1. PCSource = 11 selects the address 8000 0180 (hex) into PC, with PCWrite etc. = 1.]


How Long Does It Take? Again

• Assume control logic is fast and does not affect the critical timing.
• Major time components are ALU, memory read/write, and register read/write.
• Time for hardware operations, suppose
  • Memory read or write      2ns
  • Register read             1ns
  • ALU operation             2ns
  • Register write            1ns


Single-Cycle Datapath

R-type                        6ns
Load word (I-type)            8ns
Store word (I-type)           7ns
Branch on equal (I-type)      5ns
Jump (J-type)                 2ns

Clock cycle time = 8ns
Each instruction takes one cycle


Multicycle Datapath

Clock cycle time is determined by the longest operation, ALU or memory:
• Clock cycle time = 2ns

Cycles per instruction (CPI):
• lw        5    (10ns)
• sw        4    (8ns)
• R-type    4    (8ns)
• beq       3    (6ns)
• j         3    (6ns)


CPI of a Computer

    CPI = [ ∑k (Instructions of type k) × CPIk ] / [ ∑k (Instructions of type k) ]

where CPIk = Cycles for instruction of type k

Note: CPI is dependent on the instruction mix of the program being run. Standard benchmark programs are used for specifying the performance of CPUs.


Example

Consider a program containing:
• loads         25%
• stores        10%
• branches      11%
• jumps          2%
• Arithmetic    52%

CPI = 0.25×5 + 0.10×4 + 0.11×3 + 0.02×3 + 0.52×4
    = 4.12 for multicycle datapath
CPI = 1.00 for single-cycle datapath


Multicycle vs. Single-Cycle

Performance ratio = Single cycle time / Multicycle time

                  = (CPI × cycle time) for single-cycle / (CPI × cycle time) for multicycle

                  = (1.00 × 8ns) / (4.12 × 2ns)

                  = 0.97

Single cycle is faster in this case, but remember, performance ratio depends on the instruction mix.


Thank You
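
The CPI and performance-ratio arithmetic from the example slides can be checked with a short script. The following is a minimal sketch, not part of the original slides: the names `average_cpi` and `performance_ratio` and the dictionary layout are assumptions made for illustration, while the cycle counts, clock periods and instruction mix are the values quoted above.

```python
# Illustrative sketch only: helper names and data layout are assumptions.
# The numbers are the ones quoted on the slides.

CYCLE_TIME_SINGLE_NS = 8.0   # single-cycle clock period (set by lw)
CYCLE_TIME_MULTI_NS = 2.0    # multicycle clock period (ALU or memory step)

# Cycles per instruction type on the multicycle datapath.
CYCLES = {"lw": 5, "sw": 4, "r_type": 4, "beq": 3, "j": 3}

# Instruction mix of the example program (fractions of executed instructions).
MIX = {"lw": 0.25, "sw": 0.10, "beq": 0.11, "j": 0.02, "r_type": 0.52}


def average_cpi(mix, cycles):
    """Weighted average: sum_k(count_k * CPI_k) / sum_k(count_k)."""
    total = sum(mix.values())
    weighted = sum(share * cycles[kind] for kind, share in mix.items())
    return weighted / total


def performance_ratio(mix, cycles, single_ns, multi_ns):
    """Single-cycle time per instruction divided by multicycle time per instruction."""
    single_time = 1.0 * single_ns                    # CPI is 1 by construction
    multi_time = average_cpi(mix, cycles) * multi_ns
    return single_time / multi_time


if __name__ == "__main__":
    cpi = average_cpi(MIX, CYCLES)
    ratio = performance_ratio(MIX, CYCLES, CYCLE_TIME_SINGLE_NS, CYCLE_TIME_MULTI_NS)
    print(f"multicycle CPI    = {cpi:.2f}")    # 4.12
    print(f"performance ratio = {ratio:.2f}")  # 0.97
```

Changing the fractions in `MIX` shows how the ratio moves with the instruction mix: a program with fewer loads lowers the multicycle CPI and can push the ratio above 1, in which case the multicycle design is the faster one.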