Lecture 6A Machine-Level Programming IV: Structured Data Basic

Basic Data Types
Lecture 6A
Machine-Level Programming IV:
Structured Data
Integral
Stored & operated on in general registers
Signed vs. unsigned depends on instructions used
Topics
Intel
GAS
Bytes
C
byte
b
1
[unsigned] char
word
w
2
[unsigned] short
Arrays
double word
l
4
[unsigned] int
Structs
Floating Point
Unions
Stored & operated on in floating point registers
Intel
GAS
Bytes
C
Single
s
4
float
Double
l
8
double
Extended
t
10/12
long double
Datorarkitektur 2007
F6A – 2 –
Array Allocation
Array Access
Basic Principle
Basic Principle
T A[L];
T A[L];
Array of data type T and length L
Identifier A can be used as a pointer to array element 0
Array of data type T and length L
Contiguously allocated region of L * sizeof(T) bytes
char string[12];
x
x
int val[5];
x
double a[4];
x
x+4
x+8
x+8
x + 16
x + 12
x + 16
x + 24
x + 20
F6A – 3 –
x+4
x+4
2
x+8
Type
Value
val[4]
val
val+1
int
int *
int *
3
x
x+4
&val[2]
int *
x+8
val[5]
int
??
*(val+1)
int
5
val + i
int *
x+4i
x + 32
x+8
Datorarkitektur 2007
5
Reference
char *p[3];
x
1
int val[5];
x + 12
F6A – 4 –
1
3
x + 12
x + 16
x + 20
Datorarkitektur 2007
Array Example
Array Accessing Example
typedef int zip_dig[5];
Computation
zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };
zip_dig cmu;
1
16
5
20
zip_dig mit;
36
60
68
Use memory reference (%edx, eax,4)
56
2
64
int get_digit
(zip_dig z, int dig)
{
return z[dig];
}
9
52
7
Desired digit at 4*%eax + %edx
36
3
48
4
3
32
1
44
9
56
1
28
2
40
zip_dig ucb;
Register %eax contains array index
2
24
0
Register %edx contains starting
address of array
0
72
Memory Reference Code
76
# %edx = z
# %eax = dig
movl (%edx,%eax,4),%eax # z[dig]
Notes
Declaration “zip_dig cmu” equivalent to “int cmu[5]”
Example arrays were allocated in successive 20 byte blocks
Not guaranteed to happen in general
Datorarkitektur 2007
F6A – 5 –
Referencing Examples
zip_dig cmu;
1
16
zip_dig mit;
5
20
0
36
zip_dig ucb;
56
24
40
9
1
28
1
44
4
60
Array Loop Example
2
2
3
32
3
48
7
64
52
2
68
36
Original Source
9
56
0
72
76
Transformed Version
Code Does Not Do Any Bounds Checking!
Reference
Address
Value
Datorarkitektur 2007
F6A – 6 –
As generated by GCC
Guaranteed?
Eliminate loop variable i
Yes
mit[3]
36 + 4* 3 = 48 3
mit[5]
36 + 4* 5 = 56 9
No
mit[-1]
36 + 4*-1 = 32 3
No
cmu[15]
16 + 4*15 = 76 ??
No
Out of range behavior implementation-dependent
Convert array code to pointer
code
Express in do-while form
No need to test at entrance
int zd2int(zip_dig z)
{
int i;
int zi = 0;
for (i = 0; i < 5; i++) {
zi = 10 * zi + z[i];
}
return zi;
}
int zd2int(zip_dig z)
{
int zi = 0;
int *zend = z + 4;
do {
zi = 10 * zi + *z;
z++;
} while(z <= zend);
return zi;
}
No guaranteed relative allocation of different arrays
F6A – 7 –
Datorarkitektur 2007
F6A – 8 –
Datorarkitektur 2007
Nested Array Example
Array Loop Implementation
#define PCOUNT 4
zip_dig pgh[PCOUNT] =
{{1, 5, 2, 0, 6},
{1, 5, 2, 1, 3 },
{1, 5, 2, 1, 7 },
{1, 5, 2, 2, 1 }};
int zd2int(zip_dig z)
{
int zi = 0;
int *zend = z + 4;
do {
zi = 10 * zi + *z;
z++;
} while(z <= zend);
return zi;
}
Registers
%ecx z
%eax zi
%ebx zend
Computations
10*zi + *z implemented as
*z + 2*(zi+4*zi)
z++ increments by 4
# %ecx = z
xorl %eax,%eax
leal 16(%ecx),%ebx
.L59:
leal (%eax,%eax,4),%edx
movl (%ecx),%eax
addl $4,%ecx
leal (%eax,%edx,2),%eax
cmpl %ebx,%ecx
jle .L59
zip_dig
pgh[4];
76
# zi = 0
# zend = z+4
#
#
#
#
#
#
Nested Array Allocation
A[0][0]
R rows, C columns
Type T element requires K
bytes
A[R-1][0]
Datorarkitektur 2007
F6A – 10 –
•••
A[0][C-1]
Row Vectors
•
•
•
•••
A[i] is array of C elements
Each element of type T
Starting address A + i * C * K
A[R-1][C-1]
int A[R][C];
Array Size
A[0]
R * C * K bytes
Arrangement
int A[R][C];
Row-Major Ordering
•••
156
Nested Array Row Access
•
•
•
Array of data type T
136
“Row-Major” ordering of all elements guaranteed
Datorarkitektur 2007
T A[R][C];
116
Variable pgh denotes array of 4 elements
» Allocated contiguously
Each element is an array of 5 int’s
» Allocated contiguously
5*zi
*z
z++
zi = *z + 2*(5*zi)
z : zend
if <= goto loop
Declaration
96
Declaration “zip_dig pgh[4]” equivalent to “int pgh[4][5]”
F6A – 9 –
A
[0]
[0]
1 5 2 0 6 1 5 2 1 3 1 5 2 1 7 1 5 2 2 1
A
A
[0] [1]
[C-1] [0]
•••
A
[1]
[C-1]
• • •
A
A
[R-1] • • • [R-1]
[0]
[C-1]
A
[0]
[0]
A
•••
A[i]
A
[0]
[C-1]
• • •
A
[i]
[0]
A+i*C*4
•••
A[R-1]
A
[i]
[C-1]
A
• • • [R-1]
[0]
•••
A
[R-1]
[C-1]
A+(R-1)*C*4
4*R*C Bytes
F6A – 11 –
Datorarkitektur 2007
F6A – 12 –
Datorarkitektur 2007
Nested Array Row Access Code
Nested Array Element Access
int *get_pgh_zip(int index)
{
return pgh[index];
}
A[i][j] is element of type T
Address A + (i * C + j) * K
int A[R][C];
Row Vector
pgh[index] is array of 5 int’s
A[0]
Starting address pgh+20*index
A
[0]
[0]
Code
Computes and returns address
Compute as pgh + 4*(index+4*index)
•••
A[i]
A
[0]
[C-1]
A
int get_pgh_digit
(int index, int dig)
{
return pgh[index][dig];
}
Code
A
• • • [R-1]
[0]
A
[R-1]
[C-1]
•••
A+(R-1)*C*4
Datorarkitektur 2007
zip_dig
pgh[4];
pgh + 20*index + 4*dig
•••
Strange Referencing Examples
Array Elements
pgh[index][dig] is int
A
[i]
[j]
F6A – 14 –
Nested Array Element Access Code
Address:
•••
A+(i*C+j)*4
Datorarkitektur 2007
F6A – 13 –
• • •
A[R-1]
A+i*C*4
# %eax = index
leal (%eax,%eax,4),%eax # 5 * index
leal pgh(,%eax,4),%eax # pgh + (20 * index)
Computes address
pgh + 4*dig + 4*(index+4*index)
movl performs memory reference
# %ecx = dig
# %eax = index
leal 0(,%ecx,4),%edx
leal (%eax,%eax,4),%eax
movl pgh(%edx,%eax,4),%eax
A
[i]
[j]
Array Elements
# 4*dig
# 5*index
# *(pgh + 4*dig + 20*index)
1 5 2 0 6 1 5 2 1 3 1 5 2 1 7 1 5 2 2 1
76
96
Reference Address
116
136
156
Value Guaranteed?
pgh[3][3] 76+20*3+4*3 = 148
2
Yes
pgh[2][5] 76+20*2+4*5 = 136
1
Yes
pgh[2][-1] 76+20*2+4*-1 = 112
3
Yes
pgh[4][-1] 76+20*4+4*-1 = 152
1
Yes
pgh[0][19] 76+20*0+4*19 = 152
1
Yes
pgh[0][-1] 76+20*0+4*-1 = 72
??
No
Code does not do any bounds checking
Ordering of elements within array guaranteed
F6A – 15 –
Datorarkitektur 2007
F6A – 16 –
Datorarkitektur 2007
Structures
Generating Pointer to Struct.
Member
r
Concept
struct rec {
int i;
int a[3];
int *p;
};
Contiguously-allocated region of memory
Refer to members within structure by names
Members may be of different types
struct rec {
int i;
int a[3];
int *p;
};
Memory Layout
i
0
a
Generating Pointer to Array
Element
p
16 20
4
Offset of each structure
member determined at
compile time
Accessing Structure Member
void
set_i(struct rec *r,
int val)
{
r->i = val;
}
i
0
a
p
4
16
r + 4 + 4*idx
int *
find_a
(struct rec *r, int idx)
{
return &r->a[idx];
}
Assembly
# %eax = val
# %edx = r
movl %eax,(%edx)
# %ecx = idx
# %edx = r
leal 4(%edx,%ecx,4),%eax # r+4*idx+4
# Mem[r] = val
Datorarkitektur 2007
F6A – 17 –
Alignment
Datorarkitektur 2007
F6A – 18 –
Specific Cases of Alignment
Size of Primitive Data Type:
Aligned Data
1 byte (e.g., char)
no restrictions on address
Primitive data type requires K bytes
2 bytes (e.g., short)
Address must be multiple of K
lowest 1 bit of address must be 02
4 bytes (e.g., int, float, char *, etc.)
Required on some machines; advised on IA32
lowest 2 bits of address must be 002
treated differently by Linux and Windows!
8 bytes (e.g., double)
Motivation for Aligning Data
Windows (and most other OS’s & instruction sets):
» lowest 3 bits of address must be 0002
Linux:
» lowest 2 bits of address must be 002
» i.e., treated the same as a 4-byte primitive data type
Memory accessed by (aligned) double or quad-words
Inefficient to load or store datum that spans quad word boundaries
Virtual memory very tricky when datum spans 2 pages
12 bytes (long double)
Compiler
Linux:
» lowest 2 bits of address must be 002
» i.e., treated the same as a 4-byte primitive data type
Inserts gaps in structure to ensure correct alignment of fields
F6A – 19 –
Datorarkitektur 2007
F6A – 20 –
Datorarkitektur 2007
Satisfying Alignment with Structures
Linux vs. Windows
struct S1 {
char c;
int i[2];
double v;
} *p;
Offsets Within Structure
Must satisfy element’s alignment requirement
Overall Structure Placement
Each structure has alignment requirement K
Largest alignment of any element
struct S1 {
char c;
int i[2];
double v;
} *p;
Initial address & structure length must be
multiples of K
K = 8, due to double element
i[0]
p+4
i[1]
p+8
p+16
Datorarkitektur 2007
F6A – 21 –
Overall Alignment Requirement
i[0]
p+0
p+8
struct S3 {
float x[2];
int i[2];
char c;
} *p;
F6A – 23 –
i[1]
p+12
i[0]
p+8
v
p+16
i[1]
p+12
Multiple of 8
i[1]
p+8
v
p+12
p+20
Multiple of 4
Multiple of 4
Datorarkitektur 2007
F6A – 22 –
10 bytes wasted space in Windows
c1
Windows: p+24
Linux: p+20
v
p+0
v
p+20
p+0
Datorarkitektur 2007
F6A – 24 –
c2
p+8
struct S5 {
double v;
char c1;
char c2;
int i;
} *p;
c
p+16
i[0]
p+4
struct S4 {
char c1;
double v;
char c2;
int i;
} *p;
c
p+16
p+24
Multiple of 8
Multiple of 4
Multiple of 4
p must be multiple of 4 (in either OS)
x[1]
i[1]
p+8
Ordering Elements Within Structure
p must be multiple of:
8 for Windows
4 for Linux
x
p+4
c
p+0
Multiple of 8
struct S2 {
double x;
int i[2];
char c;
} *p;
i[0]
p+4
K = 4; double treated like a 4-byte data type
p+24
Multiple of 8
Multiple of 8
p+0
c
p+0
Linux:
v
Multiple of 4
x[0]
K = 8, due to double element
Multiple of 4
Multiple of 8
Example (under Windows):
c
p+0
Windows (including Cygwin):
p+16
i
p+20
p+24
2 bytes wasted space
c1 c2
p+8
i
p+12
p+16
Datorarkitektur 2007
Accessing Element within Array
Arrays of Structures
Compute offset to start of structure
Principle
Compute 12*i as 4*(i+2i)
Allocated by repeating allocation for
array type
In general, may nest arrays & structures
to arbitrary depth
a[1].i
a[1].v
a+12
struct S6 {
short i;
float v;
short j;
} a[10];
Offset by 8
Assembler gives displacement as a + 8
» Linker must set actual value
a[1].j
a+16
a+20
a+24
short get_j(int idx)
{
return a[idx].j;
}
a[0]
a[0]
a+0
a[1]
a+12
a[2]
a+24
# %eax = idx
leal (%eax,%eax,2),%eax # 3*idx
movswl a+8(,%eax,4),%eax
•• •
a[i]
•••
a[i].i
a[i].v
a+12i
Datorarkitektur 2007
Satisfying Alignment within Structure
Achieving Alignment
Overlay union elements
Overall size of structure must be multiple of worstcase alignment for any element
Structure padded with unused space to be 12 bytes
•• •
a[i]
•• •
a+12i
a+0
a[1].i
F6A – 27 –
Datorarkitektur 2007
Principles
Allocate according to largest element
struct S6 {
short i;
Offset of element within structure must be multiple of float v;
element’s alignment requirement
short j;
v’s offset of 4 is a multiple of 4
} a[10];
a+12i
a+12i+8
Union Allocation
a must be multiple of 4
Multiple of 4
a[i].j
F6A – 26 –
Starting address of structure array must be multiple
of worst-case alignment for any element
a[0]
•• •
a+12i
a+0
a+36
F6A – 25 –
struct S6 {
short i;
float v;
short j;
} a[10];
Access element according to its offset
within structure
a[1].v
struct S1 {
char c;
int i[2];
double v;
} *sp;
c
sp+0
a[1].j
a+12i+4
Multiple of 4
Can only use one field at a time
Datorarkitektur 2007
F6A – 28 –
sp+4
union U1 {
char c;
int i[2];
double v;
} *up;
c
i[0]
up+0
i[1]
v
up+4
up+8
(Windows alignment)
i[0]
sp+8
i[1]
v
sp+16
sp+24
Datorarkitektur 2007
Summary
Using Union to Access Bit Patterns
typedef union {
float f;
unsigned u;
} bit_float_t;
u
f
0
4
Get direct access to bit
representation of float
bit2float generates float with
given bit pattern
float bit2float(unsigned u) {
bit_float_t arg;
arg.u = u;
return arg.f;
}
unsigned float2bit(float f) {
bit_float_t arg;
arg.f = f;
return arg.u;
}
Arrays in C
Contiguous allocation of memory
Pointer to first element
No bounds checking
Compiler Optimizations
Compiler often turns array code into pointer code (zd2int)
Uses addressing modes to scale array indices
Lots of tricks to improve array indexing in loops
Structures
Allocate bytes in order declared
Pad in middle and at end to satisfy alignment
NOT the same as (float) u
float2bit generates bit pattern
from float
Unions
Overlay declarations
Way to circumvent type system
NOT the same as (unsigned) f
F6A – 29 –
Datorarkitektur 2007
F6A – 30 –
Datorarkitektur 2007