Optimizations on the Move framework

Enhancing the MOVE framework
Endianness port
Long Immediates
Master’s thesis | Ivo Janssen | May 11, 2001
overview
• introduction
• endianness
• long immediates
• conclusions
Enhancing the MOVE framework - Ivo Janssen - May 11, 2001
introduction
• project motivation
• overview Move project
motivation
• Laboratory of Computer Engineering
• NEC C&CRL, Princeton, NJ, USA
• PcomP / packet processor
• endianness
• immediates
• linux/x86 machines
• cheap
• little endian
the move framework
[Figure: overview of the Move framework. Labels: application (C/C++), machine description, technology description, software framework, hardware framework, cycle count, cost/performance, explorer, modify configuration.]
the move framework
C++
c = a+b;
c = c<<4;
d = func100(c);
Pascal
begin
c := a+b;
c := c*16;
d := func100(c)
end;
RISC
add r3,r8,r9
shl r3,r3,4
jump 100
TTA
r8 -> add_o
r9 -> add_t
add_r -> r3
r3 -> shl_o
4 -> shl_t
shl_r -> r3
100 -> jump_t
the move framework
[Figure: schedule table of move bus against cycle (cycles 0-4), placing the moves r8 -> add_o, r9 -> add_t, add_r -> r3, r3 -> shl_o, 4 -> shl_t, shl_r -> r3 and 100 -> jump_t.]
endianness
• what is endianness
• endianness in the Move framework
what is endianness
• Gulliver’s travels
• byte ordering
• 32 bit architecture
• ‘byte addressable’
what is endianness
• little endian (x86, PDP-11, Alpha)
least significant byte is stored at the least significant address
memory address  00 01 02 03
                11 22 33 44   = 0x44332211
• big endian (Sparc, HPPA, m68k)
most significant byte is stored at the least significant address
memory address  00 01 02 03
                44 33 22 11   = 0x44332211
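The two layouts above can be reproduced with Python's built-in `int.to_bytes`. This is only an illustrative sketch and not part of the Move framework; index 0 of each result stands for memory address 00.

```python
value = 0x44332211

# Byte at index 0 is the byte stored at the lowest memory address.
little = value.to_bytes(4, byteorder="little")
big = value.to_bytes(4, byteorder="big")

print(little.hex(" "))  # 11 22 33 44
print(big.hex(" "))     # 44 33 22 11
```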
changing endianness
• ‘byte swap’
memory address  00 01 02 03
                11 22 33 44
                44 33 22 11
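A byte swap of a 32-bit word can be sketched with shifts and masks. The function name `byte_swap32` is hypothetical and chosen for this example only.

```python
def byte_swap32(word):
    """Reverse the order of the four bytes in a 32-bit word."""
    return (((word & 0x000000FF) << 24) |
            ((word & 0x0000FF00) << 8) |
            ((word & 0x00FF0000) >> 8) |
            ((word & 0xFF000000) >> 24))

print(hex(byte_swap32(0x44332211)))  # 0x11223344
```

Applying the swap twice gives back the original word, so the same routine converts in both directions.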
host and target endianness
• host endianness
  • file on disk always has the same endianness
  • ‘swap’ if host != file
• target endianness
  • file on disk is e.g. a binary with a certain endianness
  • ‘swap’ if host != target
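The ‘swap if host != file’ rule can be sketched as follows. The choice of big endian for the on-disk format and the name `read_word` are assumptions made for this example; the slides do not state which byte order the Move file formats use.

```python
import struct
import sys

FILE_BYTEORDER = "big"  # assumption: on-disk format is big endian

def read_word(raw, file_byteorder=FILE_BYTEORDER):
    """Decode a 4-byte word read from disk into a host integer."""
    (word,) = struct.unpack("=I", raw)   # interpret the bytes in host order
    if sys.byteorder != file_byteorder:  # 'swap' if host != file
        word = int.from_bytes(word.to_bytes(4, "little"), "big")
    return word

print(hex(read_word(bytes([0x44, 0x33, 0x22, 0x11]))))  # 0x44332211
```

The result is the same on a little-endian and a big-endian host, which is the point of the host-endianness port.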
endianness in the Move framework
• apply principles to the Move framework
• host endianness
  Move framework has to run on both little-endian (x86) and big-endian (Sparc) hosts
• target endianness
  host should be able to run both ‘big move’ and ‘little move’
endianness in the front end
[Figure: front-end tool flow. Labels: C/C++, gcc-move, assembler, .o, move libraries, linker, seq. TTA, bintools.]
endianness in the back end
[Figure: back-end tool flow. Labels: seq. TTA, s. simulator, profiling, profile, verification, machine parameters, scheduler, par. TTA, p. simulator, .txt.]
immediates
• what are immediates
• existing implementation
• requirements
• possible solutions
• ‘resource variant’
• ‘pseudo-move variant’
• conclusions
• future work
what are immediates
dy=y-1993;
1993 -> sub_o
y -> sub_t
sub_r -> dy
move slot: guard (2 bits) | source (8 bits): 1993 | destination (6 bits): sub_o
• immediates take lots of bits
• more than available space?
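Whether an immediate exceeds the available space is a simple range check. The sketch below assumes that the whole 8-bit source field can hold a signed two's-complement immediate; the actual Move encoding may reserve part of the field, and the name `fits_signed` is hypothetical.

```python
def fits_signed(value, bits):
    """True if value is representable as a two's-complement number of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

SOURCE_FIELD_BITS = 8  # width of the source field in the example move slot

print(fits_signed(1993, SOURCE_FIELD_BITS))  # False
print(fits_signed(100, SOURCE_FIELD_BITS))   # True
```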
existing implementation
• fixed immediate fields
• always writes to ‘immediate register’
[Figure: instruction word with three move slots and a dedicated immediate field. The immediate field holds 1993 (definition); a move slot holds guard | i0 | sub_o (use).]
requirements
• possibility of no dedicated fields
• short immediates stay in source field
• long immediate bits in instruction stream
• add state between ‘definition’ and ‘use’
• clean code interface
• must be applicable to PcomP
possible solutions
1. make move slot wider
2. use multiple ‘short immediates’ to construct a large one
3. schedule immediate fields in the move slots
[Figure: instruction word in which a move slot position is used as an immediate field holding 1993, next to regular move slots; one move slot holds guard | i0 | sub_o.]
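Option 2 can be illustrated by cutting a long immediate into short chunks and rebuilding it with shift-and-or. The chunk width of 8 bits and the helper names are assumptions for this sketch; the slides do not specify how the short immediates would be combined.

```python
def split_chunks(value, chunk_bits, total_bits=32):
    """Split value into chunks of chunk_bits, most significant chunk first."""
    mask = (1 << chunk_bits) - 1
    value &= (1 << total_bits) - 1
    count = -(-total_bits // chunk_bits)  # ceiling division
    return [(value >> (i * chunk_bits)) & mask for i in reversed(range(count))]

def join_chunks(chunks, chunk_bits):
    """Rebuild the value by shifting in one chunk at a time."""
    value = 0
    for chunk in chunks:
        value = (value << chunk_bits) | chunk
    return value

chunks = split_chunks(1993, 8)
print([hex(c) for c in chunks])  # ['0x0', '0x0', '0x7', '0xc9']
print(join_chunks(chunks, 8))    # 1993
```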
‘resource variant’
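[Figure: resource table with time running down and columns LIT and i0. Register i0 is free, becomes busy when the immediate bits are written, stays busy until the use i0 -> r4, and is free again afterwards.]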
‘resource variant’
• decoupling of def/use
• no dedicated fields required
• LIT tag
• immediates not part of movelist
• Ifetch unit stores bits in immediate registers
• immediate registers become part of state
scheduling algorithms
• mach-file format
• data structures
• algorithms
mach file format
LongImmediate
{
  Registers:
    i0 20, signed, ir_0;
    i1 20, signed, ir_1;
    i2 32, signed, ir_2;
  Control:
    {};
    i0 20 : {4};
    i1 20 : {5};
    i0 20 : {4}, i1 20: {5}, i2 32: {4,5};
}
ImmediateUnits
{
  i0 32, signed, ir_1;
  i1 20, signed, ir_2;
  i2 20, signed, ir_3;
}
scheduling algorithms
FindImmMoveBus
  if (immediate fits in source field)
    return success
  else
    forall (iregs) do
      assign ireg socket to source
      check resources on ireg
      if (possible allocation imm-use found)
        tentatively claim imm-use
        for (this cycle downto zero) do
          check if ireg is available
          check if LIT encoding is possible
          tentatively assign LIT tag
          if (movebuses allocatable) then break
        commit imm-def and imm-use
        return success
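The backward search in FindImmMoveBus can be sketched in a few lines. This is a strongly simplified model and not the scheduler's actual code: the resource table is reduced to a set of busy cycles per immediate register (`busy`) and a set of cycles in which a LIT encoding is still possible (`lit_free`), and the move-bus allocation check is folded into `lit_free`. All names and the 8-bit source field are assumptions.

```python
def fits_in_source_field(value, bits=8):
    """Short immediates stay in the source field (signed range assumed)."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def find_imm_move_bus(value, use_cycle, iregs, busy, lit_free):
    """Return ("short", None, None), ("long", ireg, def_cycle) or None."""
    if fits_in_source_field(value):
        return ("short", None, None)
    for ireg in iregs:
        # Walk back from the use cycle; the register must stay free
        # for the whole interval between definition and use.
        for cycle in range(use_cycle, -1, -1):
            if cycle in busy[ireg]:
                break  # register occupied, an earlier definition cannot work
            if cycle in lit_free:
                # Commit imm-def and imm-use in the resource table.
                busy[ireg].update(range(cycle, use_cycle + 1))
                lit_free.discard(cycle)
                return ("long", ireg, cycle)
    return None

busy = {"i0": {3}, "i1": set()}
lit_free = {1, 4}
first = find_imm_move_bus(1993, 5, ["i0", "i1"], busy, lit_free)
second = find_imm_move_bus(1993, 5, ["i0", "i1"], busy, lit_free)
print(first)   # ('long', 'i0', 4)
print(second)  # ('long', 'i1', 1)
```

The second request cannot reuse i0 because that register is now busy at the use cycle, so the search moves on to i1 and has to place the definition much earlier.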
benchmarks
• various mach-files:
  • mach.pcomp
    • 6 buses, 3 imm. reg., 2 imm. ‘slots’
  • mach.one
    • 6 buses, 1 imm. reg., 1 imm. ‘slot’
  • mach.small
    • 3 buses, 1 imm. reg., 1 imm. ‘slot’
  • mach.big
    • 8 buses, 2 imm. reg., 2 imm. ‘slots’
• no dedicated fields
benchmarks
• various benchmarks:
• dsp-suite (arfreq, music, radproc, edge, expand, flatten, smooth)
• g722main
• cjpeg, djpeg
• go
• compress
• m88ksim
benchmarks
• metric under test:
• instruction counts
• also derived: code size
• prediction:
• slight increase instr. count
• if dedicated fields go, huge reduction in code size
benchmark results
• instruction count increases
  • especially for smaller machines
  • 1-2% average increase (6% for small machines)
• code size decreases
  • dedicated fields are ~20% of instruction word width
  • code size decrease can be near 20% if dedicated fields go
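The two effects combine multiplicatively if code size is modelled as instruction count times instruction word width. That model, and the assumption that the dedicated fields are dropped entirely, are simplifications for this worked example; the percentages are the ones on the slide.

```python
def relative_code_size(instr_count_increase, field_fraction):
    """New code size relative to the old one (1.0 means unchanged)."""
    return (1.0 + instr_count_increase) * (1.0 - field_fraction)

for label, increase in (("average", 0.02), ("small machine", 0.06)):
    ratio = relative_code_size(increase, 0.20)
    print(f"{label}: {100 * (1 - ratio):.1f}% smaller")
# average: 18.4% smaller
# small machine: 15.2% smaller
```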
‘pseudo-move variant’
• TNO-FEL needed an implementation too
• paradigm shift:
• ‘resource variant’: clean code interface
• resulting in the ‘pseudo-move variant’
‘pseudo-move variant’
[Figure: the move 1993 -> sub_o is split into an immediate operation (1993 -> i0, i0 -> r33) and a move r33 -> sub_o, linked by the dependence dflw(r33).]
‘pseudo-move variant’
• split immediate move into two operations
• schedule the immediate operation (def and use) as normal moves
• count on bypass of virtual register as optimization
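The split can be written down as a small transformation on a textual move. The function name and the default register names (`i0`, `r33`, taken from the example figure) are illustrative only and do not reflect the scheduler's data structures.

```python
def split_immediate_move(value, destination, ireg="i0", vreg="r33"):
    """Split `value -> destination` into an immediate operation and a normal move."""
    immediate_operation = [f"{value} -> {ireg}",  # definition of the immediate
                           f"{ireg} -> {vreg}"]   # use, into a virtual register
    use_move = f"{vreg} -> {destination}"         # ordinary move, scheduled as usual
    return immediate_operation, use_move

imm_op, move = split_immediate_move(1993, "sub_o")
print(imm_op)  # ['1993 -> i0', 'i0 -> r33']
print(move)    # r33 -> sub_o
```

If the virtual register is bypassed, the extra move r33 -> sub_o disappears again, which is the optimization the last bullet counts on.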
qualitative comparison
‘resource variant’
• one ‘move’ less added
• more flexible encoding
• clean code interface
‘pseudo-move variant’
• importing is possible
• ‘real’ moves can be scheduled efficiently
quantitative comparison
• two completely different schedulers
• compare relative cycle counts, not absolute
• cycle count increase about the same for both
• small machines: ‘real’ move -> better schedule
future work
• exploration
• importing of immediate writes
• sharing of immediate writes
exploration
region scheduling immediate writes
[Figure: basic blocks A, B and C, with ‘immediate bits’ written in two places and one use i0 -> r4.]
sharing immediate writes
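[Figure: resource table with time running down and columns LIT and i0. One write of immediate bits serves two uses, i0 -> sub_o and i0 -> ld_t; the i0 column shows the counts 0, 2, 2, 2, 1, 0.]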
conclusions
conclusions endianness
• completed
• host-dependency:
  • sources compile on both platforms
• target-dependency:
  • one Makefile switch controls all tools
conclusions immediates
• small (negligible) instruction count increase
• possible large decrease in code size
• clean code interface not entirely achieved
questions?
drinks reception
Ricardishof, 21:00