"set" pseudo-instruction in delay slot

UltraSPARC
UltraSPARC data types
• byte
• halfword
• word
• doubleword
• quadword
UltraSPARC
• big endian
• fixed length instructions (1 word / 4 bytes each)
• label: opcode src, dst !comment
UltraSPARC registers
• 32 “general” purpose registers (%r0..%r31)
• 4 groups of 8 each:
– %g0..%g7
– %o0..%o7
– %l0..%l7
– %i0..%i7
UltraSPARC registers
• 32 “general” purpose registers (%r0..%r31)
• %g0..%g7
– same as %r0..%r7
– globals
– %g0 is always 0. It cannot be changed.
– There is only one set of (shared) global registers.
UltraSPARC registers
• 32 “general” purpose registers (%r0..%r31)
• %o0..%o7
(note: letter oh)
– same as %r8..%r15
– outs
• for arguments passed to functions called by the current
function
• %o0 is the first arg, %o1 is the second, …
– %o6 = %sp
– %o7 is used for return address (do not use)
UltraSPARC registers
• 32 “general” purpose registers (%r0..%r31)
• %l0..%l7 (note: letter ell)
– same as %r16..%r23
– locals (for function local variables)
UltraSPARC registers
• 32 “general” purpose registers (%r0..%r31)
• %i0..%i7
– same as %r24..%r31
– ins
• arguments coming into this function
• %i0 is the first arg, %i1 is the second, …
– %i0 is also used to return value to caller
– %i6 = %fp
– %i7 is used for return address (do not use)
More registers
• Actually, there are many more than 32 registers.
– There is only one set of global registers, shared by all.
– There are 2..32 other sets of 16 registers (32..512
registers).
– At any time, you have a “window” into these sets.
– %l0..%l7 and %i0..%i7 (%r16..%r31) are the current set.
– %o0..%o7 (%r8..%r15) are the in registers from the next
set.
– SAVE/RESTORE instructions switch to the next/previous
set.
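The window overlap described above can be sketched as a toy model. This is only an illustration under stated assumptions: the class name RegisterFile, the default of 8 sets, and the 32-bit masking are hypothetical choices, and real hardware also traps when the sets run out and performs an add as part of SAVE/RESTORE, neither of which is modeled here.

```python
class RegisterFile:
    """Toy model of SPARC register windows (hypothetical, not trap-accurate)."""

    def __init__(self, nwindows=8):
        self.nwindows = nwindows
        self.globals = [0] * 8                 # one shared set of globals
        # Each set holds 16 registers: 8 ins followed by 8 locals.
        self.sets = [[0] * 16 for _ in range(nwindows)]
        self.current = 0                       # index of the current set

    def _locate(self, r):
        """Map %r0..%r31 to (storage list, index) for the current window."""
        if r < 8:                              # %g0..%g7
            return self.globals, r
        if r < 16:                             # %o0..%o7 = ins of the next set
            nxt = (self.current + 1) % self.nwindows
            return self.sets[nxt], r - 8
        if r < 24:                             # %l0..%l7
            return self.sets[self.current], 8 + (r - 16)
        return self.sets[self.current], r - 24  # %i0..%i7

    def read(self, r):
        store, i = self._locate(r)
        return store[i]

    def write(self, r, value):
        if r == 0:                             # %g0 is always 0; writes are discarded
            return
        store, i = self._locate(r)
        store[i] = value & 0xFFFFFFFF

    def save(self):
        """Switch to the next set (the caller's outs become our ins)."""
        self.current = (self.current + 1) % self.nwindows

    def restore(self):
        """Switch back to the previous set."""
        self.current = (self.current - 1) % self.nwindows
```

In this sketch, a value written to %o0 (register 8) before save() is read back from %i0 (register 24) afterwards, and a value the callee leaves in %i0 shows up in the caller's %o0 after restore().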
Special purpose registers
• %psr - processor status register
• %pc - holds the address of the instruction
which is currently being executed
• %npc - holds the address of the instruction
which is currently being fetched and which
will be executed next
• And many others.
Comments
!======================================================================
/*
    file:        simple.s
    date:        6-dec-2008
    author:      george j. grevera, ph.d.
    description: example "hello world" program in sparc assembler.
    build:       gcc -mcpu=ultrasparc -o simple.exe simple.s
                 Note: Add -v to above to see separate assemble/link steps.
    run:         ./simple.exe
    debug:       mdb ./simple.exe
*/
!======================================================================
Defining symbols and data
        .file   "simple.s"
!symbol definitions
        SSIZE = 112                 !minimum stack allocation size
!read-only data definitions
        .section ".rodata"
        .align  8
msg:    .asciz  "\nhello, world!\n\n"
!read-write data definitions
        .section ".data"
        .align  8
val:    .word   12
!code
        .section ".text"
        .align  4
        .global main
Code
main:
        save    %sp, -SSIZE, %sp    !allocate _required_ temp storage on
                                    ! stack. 96 is the minimum amount to allocate.
        set     msg, %o0            !load address of message.
                                    ! actually a "synthetic" instruction which
                                    ! translates into two instructions below:
                                    !   sethi %hi(msg), %o0
                                    !   or    %o0, %lo(msg), %o0
        call    printf              !call function to output the message.
                                    ! 1 out reg used as an arg.
                                    ! (in general, args may be in %o0..%o5)
        nop                         !delay instruction (actually done before call!)
                                    ! (comment this out and see what happens)
        ret                         !return from main
        restore                     !restore old set of registers (delay slot)
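The sethi/or pair in the comments above can be checked with a little arithmetic. The following is a minimal sketch, assuming a 32-bit address: %hi() takes the upper 22 bits, %lo() the lower 10 bits, and sethi places its operand in the upper 22 bits of the destination while clearing the rest. The Python function names are hypothetical stand-ins for the assembler operators.

```python
MASK32 = 0xFFFFFFFF


def hi(value):
    """Model of %hi(value): the upper 22 bits of a 32-bit value."""
    return (value & MASK32) >> 10


def lo(value):
    """Model of %lo(value): the lower 10 bits of a 32-bit value."""
    return value & 0x3FF


def sethi(imm22):
    """Model of sethi: put imm22 in the upper 22 bits, clear the low 10."""
    return (imm22 << 10) & MASK32


def set_value(value):
    """Model of the synthetic 'set value, rd' as its two real instructions."""
    rd = sethi(hi(value))      # sethi %hi(value), rd
    rd = rd | lo(value)        # or    rd, %lo(value), rd
    return rd


if __name__ == "__main__":
    addr = 0x12345678          # made-up address, for illustration only
    print(hex(hi(addr)), hex(lo(addr)), hex(set_value(addr)))
```

Since one instruction is only one word, a full 32-bit constant cannot fit inside it next to the opcode and register fields, which is why set expands to two instructions.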
Basic instructions
Loads:
• ldsb [addr], rd !load signed byte
• ldsh [addr], rd !ld signed halfword
• ldub [addr], rd !ld unsigned byte
• lduh [addr], rd !ld unsigned halfword
• ld [addr], rd !ld word
• ldd [addr], rd !ld doubleword
                 ! rd must be even
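The difference between the signed and unsigned loads is easiest to see on a byte whose top bit is set. Below is a rough sketch, assuming a 32-bit view of the destination register (as in these slides) and a made-up 8-byte memory; the helper _load and the memory contents are hypothetical, and ldd is left out.

```python
# Made-up big-endian memory: the most significant byte comes first.
memory = bytes([0xFF, 0x80, 0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC])


def _load(addr, size, signed):
    """Read size bytes at addr and return the 32-bit register contents."""
    if addr % size != 0:
        raise ValueError("address must be aligned to the access size")
    chunk = memory[addr:addr + size]
    value = int.from_bytes(chunk, byteorder="big", signed=signed)
    return value & 0xFFFFFFFF      # sign extension shows up as leading 1 bits


def ldsb(addr): return _load(addr, 1, signed=True)    # load signed byte
def ldub(addr): return _load(addr, 1, signed=False)   # load unsigned byte
def ldsh(addr): return _load(addr, 2, signed=True)    # load signed halfword
def lduh(addr): return _load(addr, 2, signed=False)   # load unsigned halfword
def ld(addr):   return _load(addr, 4, signed=False)   # load word


if __name__ == "__main__":
    print(hex(ldsb(0)), hex(ldub(0)))
    print(hex(ldsh(0)), hex(lduh(0)))
    print(hex(ld(4)))
```

The signed forms copy the top bit of the loaded value into all higher bits of the register; the unsigned forms fill those bits with zeros.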
Basic instructions
Stores:
• stb rs, [addr] !store byte
• sth rs, [addr] !st halfword
• st rs, [addr] !st word
• std rs, [addr] !st doubleword
                 ! rs must be even
Basic instructions
• add rs, reg_or_imm, rd
• addcc rs, reg_or_imm, rd !add w/ mod cc
• sub rs, reg_or_imm, rd
• subcc rs, reg_or_imm, rd !sub w/ mod cc
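"w/ mod cc" means the instruction also updates the integer condition codes (N, Z, V, C) that later branches test. Here is a hedged sketch of what addcc computes on 32-bit values; the function name and the dictionary of flags are illustrative choices, not assembler syntax.

```python
MASK32 = 0xFFFFFFFF


def addcc(a, b):
    """Return (result, flags) for a 32-bit add that modifies the condition codes."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    flags = {
        "N": result >> 31,                             # negative: top bit of result
        "Z": int(result == 0),                         # zero
        "V": ((a ^ result) & (b ^ result)) >> 31,      # signed overflow
        "C": int(full > MASK32),                       # carry out of bit 31
    }
    return result, flags


if __name__ == "__main__":
    print(addcc(0x7FFFFFFF, 1))    # largest positive value plus one
    print(addcc(0xFFFFFFFF, 1))    # -1 plus one
```

Plain add produces the same result value but leaves the condition codes untouched.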
Basic instructions
• and rs, reg_or_imm, rd
• andcc rs, reg_or_imm, rd !and w/ mod cc
• or rs, reg_or_imm, rd
• orcc rs, reg_or_imm, rd !or w/ mod cc
• xor rs, reg_or_imm, rd
• xorcc rs, reg_or_imm, rd !xor w/ mod cc
Basic instructions
• sdiv rs, reg_or_imm, rd !signed
• smul rs, reg_or_imm, rd !signed
• udiv rs, reg_or_imm, rd !unsigned
• umul rs, reg_or_imm, rd !unsigned
Basic instructions
• nop
• ret
Synthetic instructions
• call reg_or_imm
• cmp reg, reg_or_imm
• dec rd
• deccc rd !dec w/ mod cc
• inc rd
• inccc rd !inc w/ mod cc
• mov reg_or_imm, rd
• neg rd !2’s complement
• not rd !1’s complement
Synthetic instructions
• restore
• save rs, reg_or_imm, rd
• set value, rd
• tst rd
Operand addressing modes
Pipeline and delay slot
• Consider the following sequence of instructions:
        call    f
        mov     52, %o0
• Before the first instruction of f can be
executed, the mov instruction is already in the
pipeline and execution of it has started.
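The effect can be imitated with a tiny interpreter that keeps both %pc and %npc, as described under special purpose registers. This is a sketch under simplifying assumptions, not a pipeline model: addresses are instruction indexes rather than byte addresses, and the program, the "show" pseudo-operation, and the function name run are all made up for illustration.

```python
def run(program, labels):
    """Execute a toy program with pc/npc semantics and return a trace."""
    regs = {"o0": 0, "o7": 0}
    trace = []
    pc, npc = 0, 1
    while True:
        op, arg = program[pc]
        next_npc = npc + 1                 # default: fall through
        if op == "halt":
            trace.append("halt")
            return trace
        if op == "call":
            regs["o7"] = pc                # remember where the call is
            next_npc = labels[arg]         # the target only replaces npc, not pc
            trace.append(f"call {arg}")
        elif op == "retl":
            next_npc = regs["o7"] + 2      # skip the call and its delay slot
            trace.append("retl")
        elif op == "mov":
            regs["o0"] = arg
            trace.append(f"mov {arg}, %o0")
        elif op == "show":
            trace.append(f"f sees %o0 = {regs['o0']}")
        elif op == "nop":
            trace.append("nop")
        pc, npc = npc, next_npc            # the delay slot executes next


program = [
    ("call", "f"),     # 0
    ("mov", 52),       # 1: delay slot of the call
    ("halt", None),    # 2: execution resumes here after f returns
    ("show", None),    # 3: f starts here
    ("retl", None),    # 4
    ("nop", None),     # 5: delay slot of the return
]
labels = {"f": 3}

if __name__ == "__main__":
    for line in run(program, labels):
        print(line)
```

Because a control transfer only changes npc, the instruction already sitting at the old npc (the delay slot) runs before the first instruction of the target, so f finds 52 in %o0.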
Pipeline and delay slot
• Consider the following sequence of instructions:
        beq     someLabel
        mov     52, %o0
• If the branch is not taken, then the instruction
after beq should be executed.
• If the branch is taken, then the instruction after
beq should NOT be executed.
• Regardless, the mov instruction is already in the
pipeline and execution of it has started.
What should we do?
Options:
1. Invalidate the next instruction in the pipeline
(and everything in the pipeline after the branch,
of course).
2. Go ahead and execute the next instruction in the
pipeline (the delay slot).
What’s the most efficient?
Pipeline and delay slot
#1 rule: You must ALWAYS fill the delay slot!
#2 rule: Only fill the delay slot w/ 1 instruction.
• Consider the following sequence of
instructions:
call f
call g
• What must be done? (See rule #1.)
Pipeline and delay slot
#1 rule: You must ALWAYS fill the delay slot!
#2 rule: Only fill the delay slot w/ 1 instruction!
• Consider the following sequence of
instructions:
call f
nop
call g
nop
• Should we always fill the delay slot w/ NOP?
Pipeline and delay slot
• Should we always fill the delay slot w/ NOP?
• What happens if we don’t? Recall our example:
        save    %sp, -SSIZE, %sp    !allocate _required_ temp storage on
                                    ! stack. 96 is the minimum amount to allocate.
        set     msg, %o0            !load address of message.
                                    ! actually a "synthetic" instruction which
                                    ! translates into two instructions below:
                                    !   sethi %hi(msg), %o0
                                    !   or    %o0, %lo(msg), %o0
        call    printf              !call function to output the message.
                                    ! 1 out reg used as an arg.
                                    ! (in general, args may be in %o0..%o5)
        nop                         !delay instruction (actually done before call!)
                                    ! (comment this out and see what happens)
        ret                         !return from main
        restore                     !restore old set of registers (delay slot)
What happens if we remove this nop?
Pipeline and delay slot
• Should we always fill the delay slot w/ NOP?
• What happens if we don’t? Recall our example:
        save    %sp, -SSIZE, %sp    !allocate _required_ temp storage on
                                    ! stack. 96 is the minimum amount to allocate.
        set     msg, %o0            !load address of message.
                                    ! actually a "synthetic" instruction which
                                    ! translates into two instructions below:
                                    !   sethi %hi(msg), %o0
                                    !   or    %o0, %lo(msg), %o0
        call    printf              !call function to output the message.
                                    ! 1 out reg used as an arg.
                                    ! (in general, args may be in %o0..%o5)
        !nop                        !delay instruction (actually done before call!)
                                    ! (comment this out and see what happens)
        ret                         !return from main
        restore                     !restore old set of registers (delay slot)
ret fills the delay slot and is executed before printf so NOTHING is printed!
Pipeline and delay slot
• Let’s try to fill the delay slot with something
useful.
        save    %sp, -SSIZE, %sp    !allocate _required_ temp storage on
                                    ! stack. 96 is the minimum amount to allocate.
        call    printf              !call function to output the message.
                                    ! 1 out reg used as an arg.
                                    ! (in general, args may be in %o0..%o5)
        set     msg, %o0            !load address of message.
                                    ! actually a "synthetic" instruction which
                                    ! translates into two instructions below:
                                    !   sethi %hi(msg), %o0
                                    !   or    %o0, %lo(msg), %o0
        ret                         !return from main
        restore                     !restore old set of registers (delay slot)
This is something useful. What happens now?
Pipeline and delay slot
• Let’s try to fill the delay slot with something
useful.
        save    %sp, -SSIZE, %sp    !allocate _required_ temp storage on
                                    ! stack. 96 is the minimum amount to allocate.
        call    printf              !call function to output the message.
                                    ! 1 out reg used as an arg.
                                    ! (in general, args may be in %o0..%o5)
        set     msg, %o0            !load address of message.
                                    ! actually a "synthetic" instruction which
                                    ! translates into two instructions below:
                                    !   sethi %hi(msg), %o0
                                    !   or    %o0, %lo(msg), %o0
        ret                         !return from main
        restore                     !restore old set of registers (delay slot)
bash-3.00$ gcc simple3.s
/usr/ccs/bin/as: "simple3.s", line 38: warning: 2-word "set" pseudo-instruction in delay slot (follows CTI)
Pipeline and delay slot
• Let’s try to fill the delay slot with something
useful.
        save    %sp, -SSIZE, %sp    !allocate _required_ temp storage on
                                    ! stack. 96 is the minimum amount to allocate.
        call    printf              !call function to output the message.
                                    ! 1 out reg used as an arg.
                                    ! (in general, args may be in %o0..%o5)
        set     msg, %o0            !load address of message.
                                    ! actually a "synthetic" instruction which
                                    ! translates into two instructions below:
                                    !   sethi %hi(msg), %o0
                                    !   or    %o0, %lo(msg), %o0
        ret                         !return from main
        restore                     !restore old set of registers (delay slot)
So what should we do? Back to nop?
Pipeline and delay slot
• Let’s try to fill the delay slot with something
useful.
        save    %sp, -SSIZE, %sp    !allocate _required_ temp storage on
                                    ! stack. 96 is the minimum amount to allocate.
        sethi   %hi(msg), %o0       !load addr of message
        call    printf              !call function to output the message.
                                    ! 1 out reg used as an arg.
                                    ! (in general, args may be in %o0..%o5)
        or      %o0, %lo(msg), %o0  !delay slot: finish loading addr of message
        ret                         !return from main
        restore                     !restore old set of registers (delay slot)
This works well!