Finite State Machines - IISc-SERC

RISC Processor Design
Single Cycle Implementation MIPS
Virendra Singh
Computer Design and Test Lab.
Indian Institute of Science
Bangalore
[email protected]
Advance Computer Architecture
http://www.serc.iisc.ernet.in/~viren/ACA/ACA.htm
MIPS Instructions
add
MIPS assembly language
Example
Meaning
add $s1, $s2, $s3
$s1 = $s2 + $s3
Three operands; data in registers
Arithmetic
subtract
sub $s1, $s2, $s3
$s1 = $s2 - $s3
Three operands; data in registers
Data transfer
add immediate
load word
store word
load byte
store byte
load upper
immediate
addi $s1, $s2, 100
lw $s1, 100($s2)
sw $s1, 100($s2)
lb $s1, 100($s2)
sb $s1, 100($s2)
lui $s1, 100
$s1 = $s2 + 100
Used to add constants
$s1 = Memory[ $s2 + 100] Word from memory to register
Memory[ $s2 + 100] = $s1 Word from register to memory
$s1 = Memory[ $s2 + 100] Byte from memory to register
Memory[ $s2 + 100] = $s1 Byte from register to memory
Loads constant in upper 16 bits
$s1 = 100 * 2 16
branch on equal
beq
$s1, $s2, 25
if ($s1 == $s2 ) go to
PC + 4 + 100
Equal test; PC-relative branch
branch on not equal
bne
$s1, $s2, 25
if ($s1 != $s2 ) go to
PC + 4 + 100
Not equal test; PC-relative
set on less than
slt
$s1, $s2, $s3
if ($s2 < $s3 ) $s1 = 1;
else $s1 = 0
Compare less than; for beq, bne
set less than
immediate
slti
if ($s2 < 100 ) $s1 = 1;
else $s1 = 0
Compare less than constant
jump
jump register
jump and link
j
jr
jal
go to 10000
go to $ra
$ra = PC + 4; go to 10000
Jump to target address
For switch, procedure return
For procedure call
Category
Conditional
branch
Unconditional jump
Instruction
$s1, $s2, 100
2500
$ra
2500
Advance Computer Architecture
Comments
2
MIPS Registers and Memory
$ s 0 - $ s 7 , $ t 0 - $ t 9 , F$azsetrloo ,c a t io n s f o r d a t a . In M IP S , d a t a m u s t b e in r e g is
3 2 r e g is te$ras0 - $ a 3 , $ v 0 - $ v 1 , a$rgitph,m e t ic . M IP S r e g is t e r $ z e r o a lw a y s e q u a ls 0 . R e
$ f p , $ s p , $ r a , $ a t r e s e r v e d f o r t h e a s s e m b le r t o h a n d le la r g e c o n s t a n
M e m o r y [ 0 ],
A c c e s s e d o n ly b y d a t a t r a n s f e r in s t r u c t io n s . M IP S u s
23 0 m e m oMrye m o r y [ 4 ], ...,
w ords
a d d r e s s e s , s o s e q u e n t ia l w o r d s d if f e r b y 4 . M e m o r y
M e m o r y [ 4 2 9 4 9 6 7 2 9 2 ] s t r u c t u r e s , s u c h a s a r r a y s , a n d s p ille d r e g is t e r s , s u
s a v e d o n p r o c e d u r e c a lls .
Advance Computer Architecture
3
Addressing Modes
Example
1. Immediate addressing
addi
add
op
rs
rt
Immediate
2. Register addressing
op
rs
rt
rd
...
funct
Registers
Register
3. Base addressing
lw, sw
op
rs
rt
+
Register
beq, bne
Memor y
Address
Byte
Halfword
Word
4. PC-relative addressing
op
rs
rt
Memor y
Address
PC
+
Word
5. Pseudodirect addressing
j
op
Memor y
Address
Word
PC
2004 © Morgan Kaufman Publishers
Advance Computer Architecture
4
MIPS – Instruction Set
MIPS Instruction – Subset
 Arithmetic and Logical Instructions
 add, sub, or, and, slt
 Memory reference Instructions
 lw, sw
 Branch
 beq, j
Advance Computer Architecture
5
Overview of MIPS
simple instructions, all 32 bits wide
 very structured, no unnecessary baggage
 only three instruction formats

funct
R
op
rs1
rs2
rd
I
op
rs1
rd
16 bit address
J
op

shmt
26 bit address
rely on compiler to achieve performance
2004 © Morgan Kaufman Publishers
Advance Computer Architecture
6
Where Does It All Begin?
 In
a register called program counter
(PC).
 PC contains the memory address of the
next instruction to be executed.
 In the beginning, PC contains the
address of the memory location where
the program begins.
Advance Computer Architecture
7
Where is the Program?
Processor
Memory
Program counter
(register)
Start
address
Advance Computer Architecture
Machine code
of program
8
How Does It Run?
Start
PC has memory address where program begins
Fetch instruction word from memory address in PC
and increment PC ← PC + 4 for next instruction
Decode and execute instruction
No
Program
complete?
Advance Computer Architecture
Yes
STOP
9
Datapath and Control
 Datapath: Memory, registers, adders, ALU, and communication
buses. Each step (fetch, decode, execute) requires
communication (data transfer) paths between memory, registers
and ALU.
 Control: Datapath for each step is set up by control signals that
set up dataflow directions on communication buses and select
ALU and memory functions. Control signals are generated by a
control unit consisting of one or more finite-state machines.
Advance Computer Architecture
10
Datapath for Instruction Fetch
Add
4
PC
Address
Instruction
Memory
Advance Computer Architecture
Instruction
word to
control unit
and registers
11
Register File: A Datapath Component
5
Read
registers
Write
register
Write data
5
5
reg 1
reg 2
32
reg 1 data
32 Registers
(reg. file)
32
reg 2 data
32
RegWrite
from control
Advance Computer Architecture
12
Multi-Operation ALU
Operation select
ALU function
000
001
010
110
111
AND
OR
Add
Subtract
Set on less than
Operation
select
from control
3
zero
ALU
result
overflow
zero = 1, when all bits of result are 0
Advance Computer Architecture
13
R-Type Instructions
 Also
known as arithmetic-logical instructions
 add, sub, slt
 Example: add
$t0, $s1, $s2
• Machine instruction word
000000 10001 10010 01000 00000 100000
opcode $s1 $s2 $t0
function
• Read two registers
• Write one register
• Opcode and function code go to control unit that
generates RegWrite and ALU operation code.
Advance Computer Architecture
14
Datapath for R-Type Instruction
01000 10010 10001
Operation
000000 10001 10010 01000 00000 100000
select
opcode $s1 $s2 $t0
function (add)
from control (add)
Read
5
$s1
register
3
32
numbers
zero
5
$s2 32 Registers
ALU
result
Write reg.
(reg. file)
32
number
overflow
5
$t0
Write data
32
RegWrite from control
activated
Advance Computer Architecture
15
Load and Store Instructions
 I-type

lw
instructions
$t0, 1200 ($t1)
# incr. in bytes
100011 01001 01000 0000 0100 1011 0000
opcode $t1

sw
$t0
$t0, 1200 ($t1)
1200
# incr. in bytes
101011 01001 01000 0000 0100 1011 0000
opcode $t1
$t0
1200
Advance Computer Architecture
16
Datapath for lw Instruction
Read
register
numbers
01001
100011 01001 01000 0000 0100 1011 0000
opcode $t1
$t0
1200
5
$t1
3
32
01000
result
ALU
32 Registers
(reg. file)
5
Addr.
Data
memory
overflow
$t0
Write
data
Write data
0000 0100 1011 0000
Read
data
zero
5
Write reg.
number
MemWrite
Operation
select
from control
(add)
32
32
RegWrite
from control
activated
16
Sign
extend
mem. data to $t0
Advance Computer Architecture
MemRead
activated
17
Datapath for sw Instruction
101011 01001 01000 0000 0100 1011 0000
opcode $t1
$t0
1200
01001
Read
register
numbers
5
01000
MemWrite activated
5
Write reg.
number
$t1
3
32
$t0
Read
data
zero
result
ALU
32 Registers
(reg. file)
Addr.
Data
memory
overflow
5
32
Write data
0000 0100 1011 0000
Operation
select
from control
(add)
$t0 data to mem.
32
Write
data
32
RegWrite
from control
Sign
extend
MemRead
16
Advance Computer Architecture
18
Branch Instruction (I-Type)

beq
$s1, $s2, 25
# if $s1 = $s2,
advance PC through
25 instructions
16-bits
000100 10001 10010 0000 0000 0001 1001
opcode $s1 $s2
25
Note:
Can branch within ± 215 words from the current instruction
address in PC.
Advance Computer Architecture
19
10001
Read
register
numbers
5
10010
Datapath for beq Instruction
5
Write reg.
number
000100 10001 10010 0000 0000 0001 1001
opcode $s1
$s2
25
$s1
Operation select from control (subtract)
32
3
$s2 32 Registers
zero
(reg. file)
ALU
5
result
32
To branch
control
logic
overflow
Write data
0000 0000 0001
1001
32
16
Sign
extend
RegWrite
from
control
32
PC+4
From instruction
32
fetch datapath
Shift
left 2
Add
Branch
target
32
32
Advance Computer Architecture
20
J-Type Instruction

j 2500
# jump to instruction 2,500
26-bits
000010 0000 0000 0000 0010 0111 0001 00
opcode
2,500
32-bit jump address
0000 0000 0000 0000 0010 0111 0001 0000
bits 28-31 from PC+4
Advance Computer Architecture
21
Datapath for Jump Instruction
Branch
4
Add
Branch
addr.
32
PC+4
Jump
1
mux
0
32
32
0
mux
1
4
Shift
left 2
32
PC
32
Address
Instruction
Memory
26
32
Advance Computer Architecture
28
opcode (bits 26-31)
to control
6
Instruction word to
control and registers
22
16-20
11-15
Combined
Datapaths
0-15
Sign
ext.
Shift
left 2
MemWrite
MemRead
Data
mem.
ALU
Cont.
0-5
Advance Computer Architecture
0 mux 1
1 mux 0
ALU
zero
MemtoReg
23
0 mux 1
Instr.
mem.
ALU
PC
1 mux 0
21-25
1 mux 0
26-31
Branch
Reg. File
opcode
Jump
Shift
left 2
CONTROL
RegDst
4
Add
0-25
Control
RegDst
Jump
Branch
Instruction
bits 26-31
opcode
Control
Logic
MemRead
MemtoReg
ALUOp
MemWrite
ALUSrc
RegWrite
Instruction
bits 0-5
funct.
2
ALU
Control
Advance Computer Architecture
to
ALU
24
Control Logic: Truth Table
Outputs: control signals
0
0
0
1
0
0
0
1
0
lw
1
0
0
0
1
1
0
0
1
1
1
1
0
0
0
0
sw
1
0
1
0
1
1
X
0
1
X
0
0
1
0
0
0
beq
0
0
0
1
0
0
X
0
0
X
0
0
0
1
0
1
j
0
0
0
0
1
0
X
1
X
X
X
X
X
X
X
X
Advance Computer Architecture
25
ALUOp2
1
ALOOp1
0
Branch
0
MemWrite
0
MemRead
0
RegWrite
0
MemtoReg
0
ALUSrc
R
31 30 29 28 27 26
Jump
Inputs: instr. opcode bits
RegDst
Instr
type
How Long Does It Take?
 Assume
control logic is fast and does not
affect the critical timing. Major time delay
components are ALU, memory read/write,
and register read/write.
 Arithmetic-type (R-type)
• Fetch (memory read)
• Register read
• ALU operation
• Register write
• Total
2ns
1ns
2ns
1ns
6ns
Advance Computer Architecture
26
Time for lw and sw (I-Types)
 ALU
(R-type)
 Load word (I-type)
• Fetch (memory read)
• Register read
• ALU operation
• Get data (mem. Read)
• Register write
• Total
 Store
6ns
2ns
1ns
2ns
2ns
1ns
8ns
word (no register write) 7ns
Advance Computer Architecture
27
Time for beq (I-Type)
 ALU
(R-type) 6ns
 Load word (I-type) 8ns
 Store word (I-type) 7ns
 Branch on equal (I-type)
• Fetch (memory read)
1ns
• Register read
2ns
• ALU operation
• Total 5ns
2ns
Advance Computer Architecture
28
Time for Jump (J-Type)
 ALU
(R-type)
 Load word (I-type)
 Store word (I-type)
 Branch on equal (I-type)
 Jump (J-type)
• Fetch (memory read)
• Total
Advance Computer Architecture
6ns
8ns
7ns
5ns
2ns
2ns
29
How Fast Can the Clock Be?
 If
every instruction is executed in one clock
cycle, then:
• Clock period must be at least 8ns to perform the
•
•
longest instruction, i.e., lw.
This is a single cycle machine.
It is slower because many instructions take less
than 8ns but are still allowed that much time.
 Method
of speeding up: Use multicycle
datapath.
Advance Computer Architecture
30
A Single Cycle Example
Delay of 1-bit full adder = 1ns
Clock period ≥ 32ns
a31
.
.
.
a2
a1
a0
b31
.
.
.
b2
b1
b0
1-b full
adder
1-b full
adder
1-b full
adder
1-b full
adder
0
Time of adding words ~ 32ns
Time of adding bytes ~ 32ns
Advance Computer Architecture
31
c32
s31
.
.
.
s2
s1
s0
a31
.
.
.
a2
a1
a0
b31
.
.
.
b2
b1
b0
Delay of 1-bit full adder = 1ns
Clock period ≥ 1ns
Time of adding words ~ 32ns
Time of adding bytes ~ 8ns
1-b full
adder
c32
FF
Initialize
to 0
Advance Computer Architecture
s31
.
.
.
s2
s1
s0
Shift
Shift
Shift
A Multicycle Implementation
32
ALUOut Reg.
4
ALU
A Reg.
B Reg.
Register file
Mem. Data (MDR)
Data
Instr. reg. (IR)
Addr.
Memory
PC
Multicycle Datapath
One-cycle data transfer paths
(need registers to hold data)
Advance Computer Architecture
33
Multicycle Datapath Requirements
 Only
one ALU, since it can be reused.
 Single memory for instructions and data.
 Five registers added:
• Instruction register (IR)
• Memory data register (MDR)
• Three ALU registers, A and B for inputs and
ALUOut for output
Advance Computer Architecture
34
Multicycle Datapath
MUX
in1
control
in2
MemRead
MemWrite
IRWrite
RegDst
MemtoReg
0-15
Sign
extend
0-5
Advance Computer Architecture
ALUOut Reg.
ALU
ALUSrcB
28-31
ALUSrcA
A Reg.
11-15
16-20
Shift
left 2
RegWrite
B Reg.
out
21-25
Register file
Data
0-25
Mem. Data (MDR)
IorD
Memory
Addr.
PC
26-31 to
Control
FSM
Instr. reg. (IR)
PCWrite
etc.
PCSource
4
Shift
left 2
ALU
control
35
ALUOp
3 to 5 Cycles for an Instruction
Step
R-type
(4 cycles)
Mem. Ref.
(4 or 5 cycles)
Branch type
(3 cycles)
J-type
(3 cycles)
Instruction fetch
IR ← Memory[PC]; PC ← PC+4
Instr. decode/
Reg. fetch
A ← Reg(IR[21-25]); B ← Reg(IR[16-20])
ALUOut ← PC + (sign extend IR[0-15]) << 2
Execution, addr.
Comp., branch &
jump completion
ALUOut ←
A op B
Mem. Access or Reg(IR[11R-type
15]) ←
completion
ALUOut
Memory read
completion
ALUOut ←
A+sign extend
(IR[0-15])
If (A= =B) PC←PC[28-31]
then
||
PC←ALUOut (IR[0-25]<<2)
MDR←M[ALUout]
or M[ALUOut]←B
Reg(IR[16-20]) ←
MDR
Advance Computer Architecture
36
Cycle 1 of 5: Instruction Fetch (IF)
 Read
instruction into IR, M[PC] → IR
• Control signals used:
=
0
• IorD
1
• MemRead =
=
1
• IRWrite
 Increment PC, PC + 4 → PC
• Control signals used:
• ALUSrcA
• ALUSrcB
• ALUOp
• PCSource
• PCWrite
=
=
=
=
=
0
01
00
00
1
select PC
read memory
write IR
select PC into ALU
select constant 4
ALU adds
select ALU output
write PC
Advance Computer Architecture
37
Cycle 2 of 5: Instruction Decode (ID)
31-26


25-21
20-16
15-11
10-6
5-0
R
opcode | reg 1 | reg 2 | reg 3 | shamt | fncode
I
opcode | reg 1 | reg 2 | word address increment
J
opcode |
word address jump
Control unit decodes instruction
Datapath prepares for execution
• R and I types, reg 1→ A reg, reg 2 → B reg
• No control signals needed
• Branch type, compute branch address in ALUOut
=0
select PC into ALU
• ALUSrcA
= 11
Instr. Bits 0-15 shift 2 into ALU
• ALUSrcB
• ALUOp= 00 ALU adds
Advance Computer Architecture
38
Cycle 3 of 5: Execute (EX)

R type: execute function on reg A and reg B, result in
ALUOut
• Control signals used:

=
1
A reg into ALU
• ALUSrcA
=
00
B reg into ALU
• ALUsrcB
=
10
instr. Bits 0-5 control ALU
• ALUOp
I type, lw or sw: compute memory address in ALUOut
← A reg + sign extend IR[0-15]
• Control signals used:
•
•
•
ALUSrcA
ALUSrcB
ALUOp
=
=
=
1
10
00
A reg into ALU
Instr. Bits 0-15 into ALU
ALU adds
Advance Computer Architecture
39
Cycle 3 of 5: Execute (EX)

I type, beq: subtract reg A and reg B, write ALUOut to PC
• Control signals used:
•
•
•
•
•
•

ALUSrcA
=
1
ALUsrcB
=
00
ALUOp
=
01
If zero = 1, PCSource = 01
If zero = 1, PCwriteCond =1
Instruction complete, go to IF
A reg into ALU
B reg into ALU
ALU subtracts
ALUOut to PC
write PC
J type: write jump address to PC ← IR[0-25] shift 2 and
four leading bits of PC
• Control signals used:
•
•
•
PCSource
=
10
PCWrite
=
1
write PC
Computer
InstructionAdvance
complete,
goArchitecture
to IF
40
Cycle 4 of 5: Reg Write/Memory


R type, write destination register from ALUOut
• Control signals used:
=
1
Instr. Bits 11-15 specify reg.
• RegDst
=
0
ALUOut into reg.
• MemtoReg
=
1
write register
• RegWrite
• Instruction complete, go to IF
I type, lw: read M[ALUOut] into MDR
• Control signals used:

=
1 select ALUOut into mem adr.
• IorD
=
1 read memory to MDR
• MemRead
I type, sw: write M[ALUOut] from B reg
• Control signals used:
•
•
•
IorD
=
1
select ALUOut into mem adr.
MemWrite
=
1
write memory
41
Advance Computer Architecture
Instruction complete, go to IF
Cycle 5 of 5: Reg Write
I
type, lw: write MDR to reg[IR(16-20)]
• Control signals used:
=
0
instr. Bits 16-20 are write reg
• RegDst
1
MDR to reg file write input
• MemtoReg =
1
read memory to MDR
• RegWrite =
• Instruction complete, go to IF
For an alternative method of designing datapath, see
N. Tredennick, Microprocessor Logic Design, the Flowchart Method,
Digital Press, 1987.
Advance Computer Architecture
42
1-bit Control Signals
Signal name
Value = 0
Value =1
RegDst
Write reg. # = bit 16-20
Write reg. # = bit 11-15
RegWrite
No action
Write reg. ← Write data
ALUSrcA
First ALU Operand ← PC
First ALU Operand←Reg. A
MemRead
No action
Mem.Data Output←M[Addr.]
MemWrite
No action
M[Addr.]←Mem. Data Input
MemtoReg
Reg.File Write In←ALUOut
Reg.File Write In←MDR
IorD
Mem. Addr. ← PC
Mem. Addr. ← ALUOut
IRWrite
No action
IR ← Mem.Data Output
PCWrite
No action
PC is written
PCWriteCond
No action
PC is written if zero(ALU)=1
zero(ALU)
PCWriteCond
PCWrite
Advance Computer Architecture
PCWrite etc.
43
2-bit Control Signals
Signal name
ALUOp
ALUSrcB
PCSource
Value
Action
00
ALU performs add
01
ALU performs subtract
10
Funct. field (0-5 bits of IR ) determines ALU operation
00
Second input of ALU ← B reg.
01
Second input of ALU ← 4 (constant)
10
Second input of ALU ← 0-15 bits of IR sign ext. to 32b
11
Second input of ALU ← 0-15 bits of IR sign ext. and left shift 2
bits
00
ALU output (PC +4) sent to PC
01
ALUOut (branch target addr.) sent to PC
10
Jump address IR[0-25] shifted left 2 bits, concatenated with
PC+4[28-31], sent to PC
Advance Computer Architecture
44
Control: Finite State Machine
Start
State 0
Clock
cycle 1
Instruction fetch
State 1
Clock
cycle 2
Instruction decode and register fetch
FSM-M
Clock
cycles
3-5
Memory
access
instr.
FSM-R
R-type
instr.
FSM-B
Branch
instr.
Advance Computer Architecture
FSM-J
Jump
instr.
45
State 0: Instruction Fetch (CC1)
MUX
in1
control
in2
MemRead = 1
MemWrite
IRWrite
=1
RegDst
MemtoReg
0-15
Shift
left 2
Sign
extend
0-5
Advance Computer Architecture
4
ALUOut Reg.
ALU
ALUSrcB=01
28-31
ALUSrcA=0
A Reg.
RegWrite
B Reg.
out
16-20
21-25
Register file
Data
0-25
Mem. Data (MDR)
IorD=0
Memory
Addr.
PC
26-31 to
Control
FSM
Instr. reg. (IR)
PCWrite
etc.=1
PCSource=00
Add
Shift
left 2
ALU
control
46
ALUOp
=00
State 0 Control FSM Outputs
Start
State0
Instruction
fetch
MemRead =1
ALUSrcA = 0
IorD = 0
IRWrite = 1
ALUSrcB = 01
ALUOp = 00
PCWrite = 1
PCSource = 00
State 1
Instruction decode/
Register fetch/
Branch addr.
Advance Computer Architecture
Outputs?
47
MUX
in1
control
in2
MemRead
MemWrite
IRWrite
RegDst
MemtoReg
0-15
0-5
Advance Computer Architecture
4
ALUOut Reg.
ALU
ALUSrcB=11
28-31
ALUSrcA=0
A Reg.
Sign
extend
PCSource
Shift
left 2
RegWrite
B Reg.
out
16-20
21-25
Register file
Data
0-25
Mem. Data (MDR)
IorD
Memory
Addr.
PC
26-31 to
Control
FSM
Instr. reg. (IR)
PCWrite
etc.
State 1: Instr. Decode/Reg.
Fetch/ Branch Address (CC2)
Add
Shift
left 2
ALU
control
48
ALUOp
= 00
State 1 Control FSM Outputs
Start
State0
Instruction
fetch
MemRead =1
(IF)
ALUSrcA = 0
IorD = 0
IRWrite = 1
ALUSrcB = 01
ALUOp = 00
PCWrite = 1
PCSource = 00
State 1
Instruction decode (ID) /
Register fetch /
Branch addr.
ALUSrcA = 0
ALUSrcB = 11
ALUOp = 00
e=
d
=
o
c
Op , sw ode
lw
c ype
p
O R-t
FSM-M
FSM-R
Opcode
= BEQ
FSM-B
Advance Computer Architecture
Opcode = J-type
FSM-J
49
State 1 (Opcode = lw) → FSM-M (CC3-5)
PCSource
MUX
in1
control
in2
MemRead=1
MemWrite
IRWrite
4
RegDst=0
MemtoReg=1
0-15
Sign
extend
0-5
Advance Computer Architecture
CC3
ALU
ALUSrcB=10
ALUSrcA=1
A Reg.
28-31
ALUOut Reg.
CC5
B Reg.
out
16-20
Shift
left 2
RegWrite=1
21-25
Register file
Data
0-25
Mem. Data (MDR)
IorD=1
Memory
Addr.
PC
26-31 to
Control
FSM
Instr. reg. (IR)
PCWrite
etc.
CC4
Add
Shift
left 2
ALU
control
50
ALUOp
= 00
State 1 (Opcode= sw)→FSM-M (CC3-4)
CC4
MUX
in1
control
IRWrite
CC4
in2
MemRead
MemWrite=1
RegDst=0
MemtoReg
0-15
Shift
left 2
Sign
extend
0-5
Advance Computer Architecture
4
ALUOut Reg.
CC3
ALU
ALUSrcB=10
28-31
ALUSrcA=1
A Reg.
RegWrite
B Reg.
out
16-20
21-25
Register file
Data
0-25
Mem. Data (MDR)
IorD=1
Memory
Addr.
PC
26-31 to
Control
FSM
Instr. reg. (IR)
PCWrite
etc.
PCSource
Add
Shift
left 2
ALU
control
51
ALUOp
= 00
FSM-M (Memory Access)
From state 1 Opcode = “lw” or “sw”
Opcode
Read
Memory data
Compute mem
addrress
ALUSrcA =1
ALUSrcB = 10 Opcode
ALUOp = 00
= “lw”
= “sw”
MemRead = 1
IorD = 1
Write
register
Write
memory
MemWrite = 1
IorD = 1
RegWrite = 1
MemtoReg = 1
RegDst = 0
Advance Computer Architecture
To state 0
(Instr. Fetch)
52
State 1(Opcode=R-type)→FSM-R (CC3-4)
MUX
in1
control
in2
MemRead
MemWrite
IRWrite
RegDst=0
4
CC4
MemtoReg=0
0-15
Sign
extend
0-5
Advance Computer Architecture
ALUOut Reg.
ALU
ALUSrcB=00
A Reg.
CC3
28-31
ALUSrcA=1
11-15
Shift
left 2
RegWrite
B Reg.
out
16-20
21-25
Register file
Data
0-25
Mem. Data (MDR)
IorD
Memory
Addr.
PC
26-31 to
Control
FSM
Instr. reg. (IR)
PCWrite
etc.
PCSource
“funct. code”
Shift
left 2
ALU
control
53
ALUOp
= 10
FSM-R (R-type Instruction)
From state 1 Opcode = R-type
ALU
operation
ALUSrcA =1
ALUSrcB = 00
ALUOp = 10
Write
register
RegWrite = 1
MemtoReg =
0
RegDst = 1
Advance Computer Architecture
To state 0
(Instr. Fetch)
54
State 1 (Opcode = beq ) → FSM-B (CC3)
MUX
in1
control
in2
MemRead
MemWrite
IRWrite
RegDst
MemtoReg
0-15
Sign
extend
0-5
Advance Computer Architecture
4
ALUOut Reg.
zero
ALU
ALUSrcB=00
CC3
28-31
ALUSrcA=1
11-15
A Reg.
16-20
PCSource
01
Shift
left 2
RegWrite
B Reg.
out
0-25
Register file
Data
21-25
Mem. Data (MDR)
IorD
Memory
Addr.
PC
26-31 to
Control
FSM
Instr. reg. (IR)
PCWrite
etc.=1
If(zero)
subtract
Shift
left 2
ALU
control
55
ALUOp
= 01
Write PC on “zero”
zero=1
PCWriteCond=1
PCWrite etc.=1
PCWrite
Advance Computer Architecture
56
FSM-B (Branch)
From state 1 Opcode = “beq”
Write PC on
branch condition
ALUSrcA =1
ALUSrcB = 00
ALUOp = 01
PCWriteCond=1
PCSource=01
Branch condition:
If A – B=0
zero = 1
To state 0
(Instr. Fetch)
Advance Computer Architecture
57
State 1 (Opcode = j) → FSM-J (CC3)
control
MUX
in1
in2
MemRead
MemWrite
IRWrite
RegDst
MemtoReg
0-15
Shift
left 2
0-5
Advance Computer Architecture
ALUOut
Reg.
zero
ALU
ALUSrcB
A Reg.
Sign
extend
28-31
ALUSrcA
11-15
PCSource
10
RegWrite
Register file
16-20
21-25
B Reg.
out
Dat
a
0-25
Mem. Data (MDR)
IorD
Memory
Addr.
PC
26-31 to
Control
FSM
Instr. reg. (IR)
PCWrite
etc.
CC3
4
Shift
left 2
ALU
control
ALUOp
58
Write PC
zero
PCWriteCond
PCWrite=1
PCWrite etc.=1
Advance Computer Architecture
59
FSM-J (Jump)
From state 1 Opcode = “jump”
Write
jump addr. In PC
PCWrite=1
PCSource=10
To state 0
(Instr. Fetch)
Advance Computer Architecture
60
Control FSM
State 0
Start
1
Instr.
fetch/
adv. PC
3
l
2
Read
memory
data
Compute
memory
addr.
lw
sw
r
wo
Instr.
decode/reg.
fetch/branch
R addr.
B
ALU
operation
6
8
sw
4
5
Write
register
J
Write
jump addr.
to PC
Write PC
on branch
condition
9
7
Write
memory
data
Write
register
Advance Computer Architecture
61
Control FSM (Controller)
6 inputs
(opcode)
Combinational
logic
Present
state
Reset
Clock
16 control
outputs
Next
state
FF
FF
FF
FF
Advance Computer Architecture
62
Designing the Control FSM


Encode states; need 4 bits for 10 states, e.g.,
• State 0 is 0000, state 1 is 0001, and so on.
Write a truth table for combinational logic:
Opcode
000000
....


Present state
0000
....
Control signals
0001000110000100
....
Next state
0001
....
Synthesize a logic circuit from the truth table.
Connect four flip-flops between the next state outputs and
present state inputs.
Advance Computer Architecture
63
Block Diagram of a Processor
MemWrite
funct.
[0,5]
PCWriteCond
PCWrite
IRWrite
IorD
MemtoReg
RegWrite
ALUSrcB
2-bits
PCSource
2-bits
RegDst
ALUSrcA
Overflow
Opcode
6-bits
zero
Reset
Clock
Mem. Addr.
Datapath
(PC, register file, registers, ALU)
Mem. write data
Mem. data out
Advance Computer Architecture
ALU
control
64
ALUOp
3-bits
ALUOp
2-bits
Controller
(Control FSM)
MemRead
Exceptions or Interrupts

Conditions under which the processor may
produce incorrect result or may “hang”.
•
•
•


Illegal or undefined opcode.
Arithmetic overflow, divide by zero, etc.
Out of bounds memory address.
EPC: 32-bit register holds the affected instruction
address.
Cause: 32-bit register holds an encoded exception
type. For example,
•
•
0 for undefined instruction
1 for arithmetic overflow
Advance Computer Architecture
65
Implementing Exceptions
MUX
out
control
in2
0
1
Cause
32-bit
register
Advance Computer Architecture
4
EPCWrite=1
ALU
ALUSrcB=01
PC
Instr. reg. (IR)
26-31 to
Control
FSM
ALUSrcA=0
PCWrite
etc.=1
8000 0180(hex)
CauseWrite=1
in1
PCSource
11
EPC
Overflow to
Control
FSM
Subtract
ALU
control
66
ALUOp
=01
How Long Does It Take? Again
 Assume
control logic is fast and does not
affect the critical timing. Major time
components are ALU, memory read/write,
and register read/write.
 Time for hardware operations, suppose
• Memory read or write
• Register read
• ALU operation
• Register write
2ns
1ns
2ns
1ns
Advance Computer Architecture
67
Single-Cycle Datapath
 R-type
6ns
8ns
 Load
word (I-type)
 Store word (I-type)
7ns
 Branch on equal (I-type)
5ns
 Jump (J-type)
2ns
 Clock cycle time
=
8ns
 Each instruction takes one cycle
Advance Computer Architecture
68
Multicycle Datapath
 Clock
cycle time is determined by the
longest operation, ALU or memory:
• Clock cycle time = 2ns
 Cycles
•
•
•
•
•
per instruction (CPI):
lw
sw
R-type
beq
j
5
4
4
3
3
Advance Computer Architecture
(10ns)
(8ns)
(8ns)
(6ns)
(6ns)
69
CPI of a Computer
=
∑k (Instructions of type k) × CPIk
——————————————
∑k (instructions of type k)
CPIk
=
Cycles for instruction of type k
Note:
CPI is dependent on the instruction mix of the
program being run. Standard benchmark programs
are used for specifying the performance of CPUs.
CPI
where
Advance Computer Architecture
70
Example
 Consider
a program containing:
• loads
• stores
• branches
• jumps
• Arithmetic
 CPI
 CPI
25%
10%
11%
2%
52%
= 0.25×5 + 0.10×4 + 0.11×3 +
0.02×3 + 0.52×4
= 4.12 for multicycle datapath
= 1.00 for single-cycle datapath
Advance Computer Architecture
71
Multicycle vs. Single-Cycle
Performance ratio
=
Single cycle time / Multicycle time
=
(CPI × cycle time) for single-cycle
———————————————
(CPI × cycle time) for multicycle
=
1.00 × 8ns
—————
4.12 × 2ns
=
0.97
Single cycle is faster in this case, but remember, performance ratio
depends on the instruction mix.
Advance Computer Architecture
72
Thank You
Advance Computer Architecture
73