Code - On Time Data Solutions

Virtual Memory Layout
FFFFFFFF
Stack
Break (Heap)
.text
.bss
.data
Operating System Reserved
0
How The Sections Work
.text
Instruction1
Instruction2
OpCode Operand1 Operand2
ADD 01010101 11110000
Code
a=1
b=1
a+b
.data
Variable b @ (11110000) = 00000001
Variable a @ (01010101) = 00000001
Indirection
.text
Instruction1
Instruction2
OpCode Operand1 Operand2
ADD 11110011 11111111
Code
a=1
b=1
c* = &a
d* = &b
(*c) + (*d) //The operands are reached indirectly, through c and d
.data
Variable d @ (11111111) = 11110000 (the address of b)
Variable c @ (11110011) = 01010101 (the address of a)
Variable b @ (11110000) = 00000001
Variable a @ (01010101) = 00000001
Example A “C” “Array”
Code
myArray = array()
i=1
myArray[i] = 0
Index value (indirection)
i @ (01010101) = 00000001
myArray @ (11110000)
myArray[i] == myArray[1] == *(myArray + 1) @ (11110001) = 00000000
Back To Memory
Code
a=0
p* = &a
Pointer value (indirection)
Pointer p @ (01010101) = 11110001
a = Actual pointed variable = Whole_Memory_Array[p] @ (11110001) = 00000000
Intro To Binary
Switch = Bit
8 Switches = 8 Bits = Byte
We can have only 0 or 1
Is there any way to get real numbers?
1 = On
0 = Off
Let's Take an Example From Regular (Decimal) Numbers
• 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ?????
We have only 10 numerals (0 through 9)
So how do we proceed?
• 10, 11, 12, …, 20, 21, …, 30, …, 99, ?????
• 100, 101, …, 200, …
The answer is combination
Back To Binary
• 0, 1, ???
• 10, 11, ?????
• 100, 101, 110, 111, ??????
• 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111, ??????
The answer is combination
Binary Compared To Decimal
Decimal   Binary     Octal
0         00000000   0
1         00000001   1
2         00000010   2
3         00000011   3
4         00000100   4
5         00000101   5
6         00000110   6
7         00000111   7
8         00001000   10
9         00001001   11
……        ……..       ……
Negative Numbers
• Rule 1: A negative number has a 1 in front (the sign bit)
Ex: 10000001
• Rule 2: The 2’s complement
1. The 1’s Complement – Invert all bits (Xor every bit with 1)
Ex: 00000001 (Decimal “1”)
Inverted: 11111110
2. The 2’s Complement – Add 1
Ex: 00000001 (Decimal “1”)
Inverted: 11111110
Add 1: 11111111 (Decimal “-1”)
Converting Between Smaller And Larger Types
Positive Values
Code
byte a = 1
short b = a
00000001 -> 0000000000000001 (Correct)
unsigned byte a = 255
short b = a
11111111 -> 0000000011111111 (Correct)
Negative Values
Code
byte a = -1
short b = a
11111111 -> 0000000011111111 (Wrong)
11111111 -> 1111111111111111 (Correct, the sign bit is copied into the new bits)
Identifiers Turned Into Memory Addresses
1. The identifiers that are being turned into memory addresses:
– Global Variables
– Functions
– Labels
2. The identifiers that are NOT being turned into memory addresses (they are only used to measure the size to reserve):
– Custom types
– Struct
– Class names
3. The identifiers that are used as offsets:
– Array index
– Class (and struct) field (NOT function) member
– Local variables
Variables
Variables are a “higher language – human readable” name for a memory address
• Size of the reserved memory is based on the type
• There might be the following types
1. Built-in type – which is just a group of bytes, based on the compiler and/or platform
2. Custom type – which is just a series of built-in types
3. Array (in C and C++) – which is just a series of the same built-in type (but no one keeps track of the length, so it is the programmer's job)
4. Pointer – system defined to hold memory addresses
5. Reference – a pointer without the possibility to access the memory address
6. Handle – a value supplied by the operating system (like the index of an array)
7. Typedef and Define – available in C and C++ to rename existing types
Labels
Labels are a “higher language – human readable” name for a memory address
• There is no size associated with it
• Location
– In assembly it may appear anywhere
• In fact in assembly it is generally the only way to declare a variable, function, loop, or “else”
• In DOS batch it is the only way to declare a function
– In C and C++ it may appear only inside a function, and “goto” is only recommended for breaking out of a nested loop
– In VB it is used for error handling “On Error GoTo”
– In Java it is normally placed before a loop, for a labeled break or continue
Label Sample In Assembly
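.data
var1: .int 1
var2: .int 10
.text
.global start
start:
    movl var1, %eax
    call myfunc      # call pushes a return address before jumping to the label
    jmp myfunc       # jmp reaches the same label without pushing anything
myfunc:
    movl var2, %ebx
    addl %eax, %ebx
    ret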
Label Sample In Java (Or C)
outerLabel:
while(1==1)
{
    while(2==2)
    {
        //Java syntax
        break outerLabel;
        //C syntax (not the same as before, as it will start the outer loop again)
        goto outerLabel;
    }
}
Boolean Type
• False == 0 (all switches are off)
• True == 1 (a switch is on, which also matches Boolean algebra)
• All other numbers are also considered true (as there are switches on)
• There are languages that require explicit conversion between numbers and Boolean (and others do it behind the scenes)
However TWO languages are an exception
Why Some Languages Require conversion
•Consider the following C code, all of them are perfectly valid:
if(1==1) //true
if(1) //true as well
if(a) //Checks if “a” is non-zero
if(a==b) //Compares “a” to “b”
if(a=b) //Sets “a” to the value of “b”, and then checks “a” if it is non-zero. Is this by intention or typo?
•However in Java the last statement would not compile (unless “a” and “b” are themselves Boolean), as it is not a Boolean operation
•For C there is some avoidance by having constants to the left, i.e.
if(20==a) instead of if(a==20)
Because if(20=a) is a compile error
While if(a=20) is perfectly valid
Implicit Conversion
• However some languages employ implicit conversion
• JavaScript considers
– if (1) : true
– if (0) : false
– if (“”) : false
– if (“0”) : true
• PHP considers
– if (“0”) : false
– if (array()) : false
Boolean In Visual Basic
• True in VB = -1
• To see why let us see it in binary
– 1 = 00000001
– 1’s complement = 11111110
– 2’s complement = 11111111
• So all switches are on
But why is it different from all the others?
To answer that we need to understand the difference between Logical and Bitwise operators, and why Logical operators short circuit
Logical vs bitwise
• Logical
– Step 1 – Check the left side for true
– Step 2 – If still no conclusion check the right side
– Step 3 – Compare both sides and give the answer
• Bitwise
– Step 1 – Translate both sides into binary
– Step 2 – Compare both sides bit per bit
– Step 3 – Provide the solution
Example Bitwise vs Logical
• Example 1
If 1==1 AND 2==2
          Logical And              Bitwise And
Step 1:   1==1 is true             1==1 is true = 00000001
Step 2:   2==2 is true             2==2 is true = 00000001
Step 3:   True And True = True     00000001 AND 00000001 = 00000001
More Examples
• Example 2
If 1==2 AND 2==2
          Logical And              Bitwise And
Step 1:   1==2 is false            1==2 is false = 00000000
Step 2:   return false             2==2 is true = 00000001
Step 3:   N/A                      00000000 AND 00000001 = 00000000
• Example 3
If 1 AND 2
          Logical And              Bitwise And
Step 1:   1 is True                00000001
Step 2:   2 is True                00000010
Step 3:   True                     00000001 AND 00000010 = 00000000 (False)
Bitwise vs Logical Operators
Operator       C     Basic   VB.Net
(Logical)
Logical AND    &&    N/A     AndAlso
Logical OR     ||    N/A     OrElse
Logical NOT    !     N/A     N/A
(Bitwise)
Bitwise AND    &     AND     AND
Bitwise OR     |     OR      OR
Bitwise XOR    ^     XOR     XOR
Bitwise NOT    ~     NOT     NOT
Back To VB
• Since we have only a bitwise NOT we have to make sure it works on Boolean
NOT On 1
1 = 00000001 = True
NOT 1 = 11111110 = True
NOT On -1
-1 = 11111111 = True
NOT -1 = 00000000 = False
Beware Of The Win32 API
Boolean In Bash Shell Scripting
# if (true) then echo "works"; fi
works
#
# if (false) then echo "works"; fi
#
# if (test 1 -eq 1) then echo "works"; fi
works
#
# if (test 1 -eq 2) then echo "works"; fi
#
# echo $(test 1 -eq 1)

#
# test 1 -eq 1
# echo $?
0
# test 1 -eq 2
# echo $?
1
#
# true
# echo $?
0
# false
# echo $?
1
Simple Function Call
Stack
Base Pointer
Stack Pointer
Code
Function func1()
{
func2();
}
Function func2()
{
return;
}
Simple Function Call – Assembly Code
Stack
Base Pointer
9997
Stack Pointer
9995
Assembly Code
Func1: #label
Jmp func2 #simplified, a real Call would also push the return address
Func2: #label
Push BasePointer
BasePointer = StackPointer
#some code
Pop BasePointer
Code
Function func1()
{
func2();
}
Function func2()
{
//Some Code
return;
}
Simple Function Call – Step 1
After Push BasePointer:
Stack: 9997
Base Pointer: 9997
Stack Pointer: 9994
Simple Function Call – Step 2
After BasePointer = StackPointer:
Stack: 9997
Base Pointer: 9994
Stack Pointer: 9994
Simple Function Call – Step 3
After Pop BasePointer:
Stack: (empty)
Base Pointer: 9997
Stack Pointer: 9995
Function Call
Stack
Base Pointer
Stack Pointer
Code
Function func1()
{
func2(5, 10);
}
Function func2(int x, int y)
{
int a;
int[10] b;
//Some code
return;
}
Function Call – Assembly Code
Stack
Base Pointer
9997
Stack Pointer
9995
Assembly Code
Func1: #label
Push 10
Push 5
Jmp func2 #simplified, a real Call would also push the return address
Func2: #label
Push BasePointer
BasePointer = StackPointer
Sub 1
Sub 10
#some code
Add 10
Add 1
Pop BasePointer
Code
Function func1()
{
func2(5, 10);
}
Function func2(int x, int y)
{
int a;
int[10] b;
//Some code
return;
}
Function Call – Step 1
After Push 10 and Push 5:
Stack: 10, 5
Base Pointer: 9997
Stack Pointer: 9993
Function Call – Step 2
After Push BasePointer:
Stack: 10, 5, 9997
Base Pointer: 9997
Stack Pointer: 9992
Function Call – Step 3
After BasePointer = StackPointer:
Stack: 10, 5, 9997
Base Pointer: 9992
Stack Pointer: 9992
Function Call – Step 4
After Sub 1 and Sub 10:
Stack: 10, 5, 9997, followed by 11 reserved slots (1 for a, 10 for b)
Base Pointer: 9992
Stack Pointer: 9981
Function Call – Step 5
After Add 10 and Add 1:
Stack: 10, 5, 9997
Base Pointer: 9992
Stack Pointer: 9992
Function Call – Step 6
After Pop BasePointer:
Stack: 10, 5
Base Pointer: 9997
Stack Pointer: 9993
Function Call – Step 7
After Add 2 StackPointer:
Stack: (empty)
Base Pointer: 9997
Stack Pointer: 9995
Assembly Code
Func1: #label
Push 10
Push 5
Jmp func2
Add 2 StackPointer
Func2: #label
Push BasePointer
BasePointer = StackPointer
Sub 1 StackPointer
Sub 10 StackPointer
#some code
Add 10 StackPointer
Add 1 StackPointer
Pop BasePointer
Function call - summary
Call
1. Arguments are copied (passed by value) onto the stack from right to left (in the C calling convention)
2. The return address is pushed
3. The old base pointer is pushed
4. We make room for local variables (they are not initialized and contain “Garbage”)
Access
1. All access to local variables or parameters is relative to the base pointer
2. i.e. the compiler does not translate “localVar” to “FF55”, but to “BasePointer - 1”
Return
1. The space for the local variables is removed (everything there is destroyed)
2. The old base pointer is restored
3. The return address is popped and jumped to
4. Parameters are destroyed
The Heap
FFFFFFFF
Stack
Break (Heap) <- myPtr points here after malloc and new
.text
.bss
myPtr
.data
?????
Operating System Reserved
0
C Code
int* myPtr = NULL;
myPtr = malloc(sizeof(int));
if(myPtr == NULL) exit(1);
//Some code
free(myPtr);
C++ Code
int* myPtr = NULL;
myPtr = new (std::nothrow) int; //a plain “new” throws std::bad_alloc instead of returning NULL
if(myPtr == NULL) exit(1);
//Some code
delete myPtr;
Heap vs Stack Summary
• All types of objects can be allocated both on the stack and on the heap
• To use the heap we must have a pointer and then call “malloc” or “new”
• Objects created on the stack are automatically destroyed when the function exits (unless declared with the keyword static in C, which actually creates them in the data section)
• Objects created on the heap must be explicitly freed with “free” or “delete”
– If the pointer is destroyed first then we are stuck (the same happens if we just forget)
Copying
Code
a=1
b=a
c* = &a
d* = c
.data
Variable d @ (11111111) = 01010101 (the address of a, copied from c)
Variable c @ (11110011) = 01010101 (the address of a)
Variable b @ (11110000) = 00000001
Variable a @ (01010101) = 00000001
Copying And Then Changing
Code
a=1
b=a
b = 2 //Not affecting a
c* = &a
d* = c
.data
Variable d @ (11111111) = 01010101 (the address of a, copied from c)
Variable c @ (11110011) = 01010101 (the address of a)
Variable b @ (11110000) = 00000010
Variable a @ (01010101) = 00000001
Copying Pointers And Then Changing Value
Code
a=1
b=a
b = 2 //Not affecting a
c* = &a
d* = c
(*d) = 3
.data
Variable d @ (11111111) = 01010101 (the address of a, copied from c)
Variable c @ (11110011) = 01010101 (the address of a)
Variable b @ (11110000) = 00000010
Variable a @ (01010101) = 00000011
Copying Pointers And Then Changing Pointer Value
Code
a=1
b=a
b = 2 //Not affecting a
c* = &a
d* = c
(*d) = 3
d = &b
(*d) = 4
.data
Variable d @ (11111111) = 11110000 (now the address of b)
Variable c @ (11110011) = 01010101 (still the address of a)
Variable b @ (11110000) = 00000100
Variable a @ (01010101) = 00000011
Copying Summary
• Copying a value variable just copies the value, any later change to one does NOT affect the other
• Copying a pointer variable makes both pointers point to the same value variable
– Any change to the value variable via one of the pointers IS reflected through the other
– Any change to the pointer value of one is NOT reflected by the other pointer, and also does not affect the original value variable
Pointer vs Reference
• Problems with pointers, stack and heap so far:
– A pointer can be directly (by intention or accidentally) changed to access any memory
– Failure to free the allocated heap results in a memory leak
– If the allocated heap is not freed before the function returns, there is no longer a way to free it
– Creating a large object on the stack is not worthwhile, and passing it by value even less so
References
• A reference is the same as a pointer but without the ability to change the memory address directly, it is only possible to change it from pointing at one object to pointing at another object of the same type
Code
object myVar = new object();
object myVar2 = myVar;//myVar2 points to the same object as myVar
myVar = new object();//Now myVar points to a new object, while myVar2 points to the old
• A primitive variable is in general a value type and cannot be NULL
• An object is generally a reference and is NULL as long as there is no call to “new()”
• Now the compiler can keep track of what is referenced by what
Garbage Collection
• Garbage Collection means to free heap memory
no longer in use
• In COM and VB6 (based on COM) it is done by
keeping a count of the variables that still
reference the object
– For each new reference it is incremented, and for each
out of scope or null or just change in reference it is
decremented
– When the reference count reaches 0 it is deleted
– Problem if two objects just reference each other, they
will never be deleted
• In Java and .Net the JVM and CLR keep track of which objects are no longer reachable through live references
Null Pointer And DB Null
• A pointer or reference not pointing anywhere is pointing to null
• In C NULL is a constant with the value 0
• 0 == NULL (in C) and null == null
• NULL + “SomeString” = “SomeString”
– (but not in Linq To SQL!!! WATCH OUT)
• In a database it is different, NULL is nothing
• NULL = NULL is also NULL, use IS NULL
• NULL + “someString” is NULL
• WHERE NOT IN(NULL) matches no rows at all
• SQL Server has an option “SET ANSI_NULLS OFF”, and MySQL has an operator <=> to compare NULLs
NaN
• Division of an integer by 0 is an error
• Division of a floating point 0 by 0 results in NaN (a non-zero floating point value divided by 0 gives Infinity)
IEEE (and Java, which follows it)
NaN == NaN = false
NaN != NaN = true
• Check for NaN by checking
If((myVar == myVar) == false)
If(Double.isNaN(myVar))
• This does not work for a database NULL
NULL == NULL = NULL
NULL != NULL = NULL
Characters
String
String Quoting
Pass By Value vs Pass By Reference
Immutable Strings