Intro

Program Semantics
Xiangyu Zhang
Why Formalization
Analysis != hacking
CS510 Software Engineering
Corner cases haven’t been considered
Work-arounds
Approximate implementation
Ideally, soundness and completeness
Soundness – what you implement achieves the goal
Completeness – you have considered everything
In practice, soundness and completeness cannot be
achieved
Clean up your thoughts before coding
Expose problems earlier
Language
Program   P  ::= s
Statement s  ::= s1; s2 | x = y | x = y op z | x = c |
                 if (x) s1 else s2 |
                 while (x) s
Operation op ::= + | - | * | / | > | < | …
Value     c  ::= 0 | 1 | 2 … | true | false
Variable  x, x1, x2, x3
Configuration
Semantics Rules
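Const-Assign
\[ \frac{\delta' = \delta[x \mapsto c]}{\langle x = c;\ s,\ \delta \rangle \to \langle s,\ \delta' \rangle} \]
Copy
\[ \frac{\delta' = \delta[x \mapsto \delta[y]]}{\langle x = y;\ s,\ \delta \rangle \to \langle s,\ \delta' \rangle} \]
BinOp-Add
\[ \frac{a = \delta[y] \quad b = \delta[z] \quad c = a + b \quad \delta' = \delta[x \mapsto c]}{\langle x = y + z;\ s,\ \delta \rangle \to \langle s,\ \delta' \rangle} \]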
Semantics Rules
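If-T
\[ \frac{\delta[x] = \mathtt{true}}{\langle \mathtt{if}\ x\ s_1\ \mathtt{else}\ s_2;\ s,\ \delta \rangle \to \langle s_1;\ s,\ \delta \rangle} \]
If-F
\[ \frac{\delta[x] = \mathtt{false}}{\langle \mathtt{if}\ x\ s_1\ \mathtt{else}\ s_2;\ s,\ \delta \rangle \to \langle s_2;\ s,\ \delta \rangle} \]
While
\[ \langle \mathtt{while}\ x\ s_1;\ s,\ \delta \rangle \to \langle \mathtt{if}\ x\ \{ s_1;\ \mathtt{while}\ x\ s_1 \}\ \mathtt{else}\ \mathtt{skip};\ s,\ \delta \rangle \]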
Example
i=0;
sum=0;
N=2;
while (i<N) {
sum=sum+i;
i=i+1;
}
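The rules can be exercised on this loop with a small interpreter. The following is a minimal Python sketch, not part of the slides: the tuple encoding of statements, the names `step` and `run`, and the helper variables `one` and `t` are assumptions made for illustration. Because the grammar only allows `while (x)` and `x = y op z` over variables, the loop condition `i<N` and the constant `1` are desugared into extra assignments, and `skip` is modeled as an empty statement list.

```python
import operator

# Subset of the slide's operators; division is left out of this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "<": operator.lt, ">": operator.gt}

def step(stmts, store):
    """Apply one small-step rule to the configuration <stmts, store>."""
    head, rest = stmts[0], stmts[1:]
    kind = head[0]
    if kind == "const":                      # Const-Assign: x = c
        _, x, c = head
        return rest, {**store, x: c}
    if kind == "copy":                       # Copy: x = y
        _, x, y = head
        return rest, {**store, x: store[y]}
    if kind == "binop":                      # BinOp: x = y op z
        _, x, y, op, z = head
        return rest, {**store, x: OPS[op](store[y], store[z])}
    if kind == "if":                         # If-T / If-F
        _, x, s1, s2 = head
        return (s1 if store[x] else s2) + rest, store
    if kind == "while":                      # While: unfold into an if
        _, x, body = head
        return [("if", x, body + [head], [])] + rest, store
    raise ValueError(f"unknown statement: {head!r}")

def run(stmts, store=None):
    """Step until no statement is left and return the final store."""
    store = dict(store or {})
    while stmts:
        stmts, store = step(stmts, store)
    return store

# The example loop, desugared so that it fits the grammar.
program = [
    ("const", "i", 0),
    ("const", "sum", 0),
    ("const", "N", 2),
    ("const", "one", 1),
    ("binop", "t", "i", "<", "N"),
    ("while", "t", [
        ("binop", "sum", "sum", "+", "i"),
        ("binop", "i", "i", "+", "one"),
        ("binop", "t", "i", "<", "N"),
    ]),
]
final = run(program)
```

Each call to `step` corresponds to one arrow in a derivation, so printing the configuration inside `run` would give the full trace of the example.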
Extend the Language with Pointers
Program   P  ::= s
Statement s  ::= s1; s2 | x = y | x = y op z | x = c |
                 x = &y | (*x) = y | x = (*y) |
                 if (x) s1 else s2 |
                 while (x) s
Operation op ::= + | - | * | / | > | < | …
Value     c  ::= 0 | 1 | 2 … | true | false
Address   a  ::= 0 | 1 | 2 …
Variable  x, x1, x2, x3
Configuration
Semantics Rules
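Const-Assign
\[ \frac{\delta' = \delta[\alpha[x] \mapsto c]}{\langle x = c;\ s,\ \delta \rangle \to \langle s,\ \delta' \rangle} \]
Copy
\[ \frac{\delta' = \delta[\alpha[x] \mapsto \delta[\alpha[y]]]}{\langle x = y;\ s,\ \delta \rangle \to \langle s,\ \delta' \rangle} \]
BinOp-Add
\[ \frac{a = \delta[\alpha[y]] \quad b = \delta[\alpha[z]] \quad c = a + b \quad \delta' = \delta[\alpha[x] \mapsto c]}{\langle x = y + z;\ s,\ \delta \rangle \to \langle s,\ \delta' \rangle} \]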
Semantics Rules
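Addr-Of
\[ \frac{a = \alpha[y] \quad \delta' = \delta[\alpha[x] \mapsto a]}{\langle x = \&y;\ s,\ \delta \rangle \to \langle s,\ \delta' \rangle} \]
Ptr-Write
\[ \frac{\delta' = \delta[\delta[\alpha[x]] \mapsto \delta[\alpha[y]]]}{\langle (*x) = y;\ s,\ \delta \rangle \to \langle s,\ \delta' \rangle} \]
Ptr-Read
\[ \frac{\delta' = \delta[\alpha[x] \mapsto \delta[\delta[\alpha[y]]]]}{\langle x = *y;\ s,\ \delta \rangle \to \langle s,\ \delta' \rangle} \]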
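The three pointer rules differ only in how many times the store is looked up. A minimal Python sketch, assuming α maps each variable to a distinct address and δ maps addresses to values; the function names and the helper `make_env` are hypothetical, not from the slides:

```python
def make_env(variables):
    """alpha: variable -> address, delta: address -> value (all zero initially)."""
    alpha = {x: addr for addr, x in enumerate(variables)}
    delta = {addr: 0 for addr in alpha.values()}
    return alpha, delta

def addr_of(alpha, delta, x, y):
    """Addr-Of: x = &y stores the address of y in x."""
    return {**delta, alpha[x]: alpha[y]}

def ptr_write(alpha, delta, x, y):
    """Ptr-Write: (*x) = y writes y's value to the address held in x."""
    return {**delta, delta[alpha[x]]: delta[alpha[y]]}

def ptr_read(alpha, delta, x, y):
    """Ptr-Read: x = *y reads the cell whose address is held in y."""
    return {**delta, alpha[x]: delta[delta[alpha[y]]]}

alpha, delta = make_env(["p", "v", "w", "r"])
delta[alpha["w"]] = 42
delta = addr_of(alpha, delta, "p", "v")     # p = &v
delta = ptr_write(alpha, delta, "p", "w")   # (*p) = w, so v now holds 42
delta = ptr_read(alpha, delta, "r", "p")    # r = *p
```

Each function returns a fresh store rather than mutating its argument, which mirrors the way the rules relate δ to δ′.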
Extend the Language with Heap
Program   P  ::= s
Statement s  ::= s1; s2 | x = y | x = y op z | x = c |
                 x = &y | (*x) = y | x = (*y) | x = malloc(y) |
                 if (x) s1 else s2 |
                 while (x) s
Operation op ::= + | - | * | / | > | < | …
Value     c  ::= 0 | 1 | 2 … | true | false
Address   a  ::= 0 | 1 | 2 …
Variable  x, x1, x2, x3
Configuration
Semantics Rules
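Const-Assign
\[ \frac{\delta' = \delta[\alpha[x] \mapsto c]}{\langle x = c;\ s,\ \delta,\ \gamma \rangle \to \langle s,\ \delta',\ \gamma \rangle} \]
Copy
\[ \frac{\delta' = \delta[\alpha[x] \mapsto \delta[\alpha[y]]]}{\langle x = y;\ s,\ \delta,\ \gamma \rangle \to \langle s,\ \delta',\ \gamma \rangle} \]
BinOp-Add
\[ \frac{a = \delta[\alpha[y]] \quad b = \delta[\alpha[z]] \quad c = a + b \quad \delta' = \delta[\alpha[x] \mapsto c]}{\langle x = y + z;\ s,\ \delta,\ \gamma \rangle \to \langle s,\ \delta',\ \gamma \rangle} \]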
Semantics Rules
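Addr-Of
\[ \frac{a = \alpha[y] \quad \delta' = \delta[\alpha[x] \mapsto a]}{\langle x = \&y;\ s,\ \delta,\ \gamma \rangle \to \langle s,\ \delta',\ \gamma \rangle} \]
Ptr-Write
\[ \frac{\delta' = \delta[\delta[\alpha[x]] \mapsto \delta[\alpha[y]]]}{\langle *x = y;\ s,\ \delta,\ \gamma \rangle \to \langle s,\ \delta',\ \gamma \rangle} \]
Ptr-Read
\[ \frac{\delta' = \delta[\alpha[x] \mapsto \delta[\delta[\alpha[y]]]]}{\langle x = *y;\ s,\ \delta,\ \gamma \rangle \to \langle s,\ \delta',\ \gamma \rangle} \]
Malloc
\[ \frac{\delta' = \delta[\alpha[x] \mapsto \gamma] \quad \gamma' = \gamma + \delta[\alpha[y]]}{\langle x = \mathtt{malloc}(y);\ s,\ \delta,\ \gamma \rangle \to \langle s,\ \delta',\ \gamma' \rangle} \]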
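The Malloc rule describes a bump allocator: x receives the current heap frontier γ, and the frontier then advances by the requested size. A minimal Python sketch under that reading; the function name, the concrete addresses, and the initial frontier of 100 are assumptions chosen for illustration:

```python
def malloc(alpha, delta, gamma, x, y):
    """Malloc: x = malloc(y). Returns the updated (delta, gamma)."""
    size = delta[alpha[y]]
    new_delta = {**delta, alpha[x]: gamma}   # x receives the old frontier
    return new_delta, gamma + size           # the frontier advances by the size

alpha = {"p": 0, "q": 1, "n": 2}             # variable -> address
delta = {0: 0, 1: 0, 2: 10}                  # address -> value, so n holds 10
gamma = 100                                  # hypothetical initial heap frontier

delta, gamma = malloc(alpha, delta, gamma, "p", "n")   # p = malloc(n)
delta, gamma = malloc(alpha, delta, gamma, "q", "n")   # q = malloc(n)
```

Two consecutive allocations return adjacent buffers, which is why an out-of-bounds write through `p` can silently land inside `q`'s buffer unless the semantics tracks bounds.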
Extend the Language with Functions
Program   P  ::= fd; s
Function  f  ::= M(y) { s }
FuncDef   fd ::= f | fd; f
FuncId    M, M1, M2, …
Statement s  ::= s1; s2 | x = y | x = y op z | x = c |
                 if (x) s1 else s2 |
                 while (x) s | call(M, x)
Operation op ::= + | - | * | / | > | < | …
Value     c  ::= 0 | 1 | 2 … | true | false
Variable  x, x1, x2, x3
Formalizing Dynamic Analysis
A Dynamic Checker for Heap Overflow
Check if a heap access dereferences an address
beyond the allocated buffer
p=malloc (10);
x=p+2;
q=x+9;
(*q)=…
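The overflow in this snippet can be checked by hand. As a sketch, assume that malloc returns the current heap frontier, written γ as in the Malloc rule, so the buffer covers the ten addresses starting at γ:

\[
\begin{aligned}
p &= \gamma \\
x &= p + 2 = \gamma + 2 \\
q &= x + 9 = \gamma + 11
\end{aligned}
\]

The valid addresses satisfy \( \gamma \le a < \gamma + 10 \). Since \( \gamma + 11 \ge \gamma + 10 \), the write through q lands outside the buffer, even though q was derived from p through two individually harmless additions.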
Heap Language
Program   P  ::= s
Statement s  ::= s1; s2 | x = y | x = y op z | x = c |
                 x = &y | (*x) = y | x = *y | x = malloc(y) |
                 if (x) s1 else s2 |
                 while (x) s
Operation op ::= + | - | * | / | > | < | …
Value     c  ::= 0 | 1 | 2 … | true | false
Address   a  ::= 0 | 1 | 2 …
Variable  x, x1, x2, x3
A Plausible Solution - Configuration
Semantics Rules
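Malloc
\[ \frac{\delta' = \delta[\alpha[x] \mapsto \gamma] \quad \gamma' = \gamma + \delta[\alpha[y]] \quad H' = H[x \mapsto \langle \gamma, \delta[\alpha[y]] \rangle]}{\langle x = \mathtt{malloc}(y);\ s,\ \delta,\ \gamma,\ H \rangle \to \langle s,\ \delta',\ \gamma',\ H' \rangle} \]
Ptr-Write
\[ \frac{H[x] = \langle a, i \rangle \quad a \le \delta[\alpha[x]] < a + i \quad \delta' = \delta[\delta[\alpha[x]] \mapsto \delta[\alpha[y]]]}{\langle *x = y;\ s,\ \delta,\ \gamma,\ H \rangle \to \langle s,\ \delta',\ \gamma,\ H \rangle} \]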
Semantics Rules
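Ptr-Write-Excp
\[ \frac{H[x] = \langle a, i \rangle \quad a + i \le \delta[\alpha[x]] \ \lor\ \delta[\alpha[x]] < a \quad \delta' = \delta[\delta[\alpha[x]] \mapsto \delta[\alpha[y]]]}{\langle *x = y;\ s,\ \delta,\ \gamma,\ H \rangle \to \langle \mathtt{exception},\ \delta',\ \gamma,\ H \rangle} \]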
Another Solution - Configuration
Semantics Rules
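Const-Assign
\[ \frac{\delta' = \delta[\alpha[x] \mapsto c]}{\langle x = c;\ s,\ \delta,\ \gamma,\ H \rangle \to \langle s,\ \delta',\ \gamma,\ H \rangle} \]
Copy
\[ \frac{\delta' = \delta[\alpha[x] \mapsto \delta[\alpha[y]]] \quad H' = H[\alpha[x] \mapsto H[\alpha[y]]]}{\langle x = y;\ s,\ \delta,\ \gamma,\ H \rangle \to \langle s,\ \delta',\ \gamma,\ H' \rangle} \]
Pnt-Add-z
\[ \frac{a = \delta[\alpha[y]] \quad b = \delta[\alpha[z]] \quad c = a + b \quad H[\alpha[z]] = \langle a, i \rangle \quad \delta' = \delta[\alpha[x] \mapsto c] \quad H' = H[\alpha[x] \mapsto \langle a, i \rangle]}{\langle x = y + z;\ s,\ \delta,\ \gamma,\ H \rangle \to \langle s,\ \delta',\ \gamma,\ H' \rangle} \]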
Semantics Rules
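Pnt-Add-y
\[ \frac{a = \delta[\alpha[y]] \quad b = \delta[\alpha[z]] \quad c = a + b \quad H[\alpha[y]] = \langle a, i \rangle \quad \delta' = \delta[\alpha[x] \mapsto c] \quad H' = H[\alpha[x] \mapsto \langle a, i \rangle]}{\langle x = y + z;\ s,\ \delta,\ \gamma,\ H \rangle \to \langle s,\ \delta',\ \gamma,\ H' \rangle} \]
NonPnt-Add
\[ \frac{a = \delta[\alpha[y]] \quad b = \delta[\alpha[z]] \quad c = a + b \quad H[\alpha[y]] = \mathit{undef} \quad H[\alpha[z]] = \mathit{undef} \quad \delta' = \delta[\alpha[x] \mapsto c]}{\langle x = y + z;\ s,\ \delta,\ \gamma,\ H \rangle \to \langle s,\ \delta',\ \gamma,\ H \rangle} \]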
Semantics Rules
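Malloc
\[ \frac{\delta' = \delta[\alpha[x] \mapsto \gamma] \quad \gamma' = \gamma + \delta[\alpha[y]] \quad H' = H[\delta[\alpha[x]] \mapsto \langle \gamma, \delta[\alpha[y]] \rangle]}{\langle x = \mathtt{malloc}(y);\ s,\ \delta,\ \gamma,\ H \rangle \to \langle s,\ \delta',\ \gamma',\ H' \rangle} \]
Ptr-Write
\[ \frac{H[\delta[\alpha[x]]] = \langle a, i \rangle \quad a \le \delta[\alpha[x]] < a + i \quad \delta' = \delta[\delta[\alpha[x]] \mapsto \delta[\alpha[y]]] \quad H' = H[\delta[\alpha[x]] \mapsto H[\delta[\alpha[y]]]]}{\langle *x = y;\ s,\ \delta,\ \gamma,\ H \rangle \to \langle s,\ \delta',\ \gamma,\ H' \rangle} \]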
Example Revisit
p=malloc (10);
x=p+2;
q=x+9;
(*q)=…
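Running the example under the second solution shows how the buffer bounds travel with the pointer through both additions. The following Python sketch is one possible reading of the rules, not the slides' implementation. Its assumptions: H is keyed by the address of the pointer variable, α[x], as in the Copy and Pnt-Add rules; the class and method names, the initial heap frontier of 1000, and the helper variables `n`, `two`, `nine` and `v` (needed because the grammar only takes variables as operands) are all hypothetical.

```python
class HeapOverflow(Exception):
    """Raised when a pointer write lands outside its buffer."""

class Checker:
    def __init__(self, variables, heap_start=1000):
        self.alpha = {x: addr for addr, x in enumerate(variables)}  # variable -> address
        self.delta = {addr: 0 for addr in self.alpha.values()}      # address -> value
        self.H = {}              # address of a pointer -> (base, size) of its buffer
        self.gamma = heap_start  # heap frontier (hypothetical starting value)

    def const(self, x, c):                     # Const-Assign: x = c
        self.delta[self.alpha[x]] = c

    def malloc(self, x, y):                    # Malloc: x = malloc(y)
        size = self.delta[self.alpha[y]]
        self.delta[self.alpha[x]] = self.gamma
        self.H[self.alpha[x]] = (self.gamma, size)
        self.gamma += size

    def add(self, x, y, z):                    # x = y + z
        ax, ay, az = (self.alpha[v] for v in (x, y, z))
        meta = self.H.get(ay, self.H.get(az))  # Pnt-Add-y, otherwise Pnt-Add-z
        self.delta[ax] = self.delta[ay] + self.delta[az]
        if meta is not None:                   # NonPnt-Add leaves H unchanged
            self.H[ax] = meta

    def ptr_write(self, x, y):                 # (*x) = y
        ax, ay = self.alpha[x], self.alpha[y]
        base, size = self.H[ax]                # bounds inherited by this pointer
        target = self.delta[ax]
        if not (base <= target < base + size): # Ptr-Write-Excp
            raise HeapOverflow(f"write to {target} outside [{base}, {base + size})")
        self.delta[target] = self.delta[ay]    # Ptr-Write
        if ay in self.H:                       # the stored value may itself be a pointer
            self.H[target] = self.H[ay]

chk = Checker(["p", "x", "q", "n", "two", "nine", "v"])
chk.const("n", 10)
chk.const("two", 2)
chk.const("nine", 9)
chk.const("v", 7)
chk.malloc("p", "n")          # p = malloc(10)
chk.add("x", "p", "two")      # x = p + 2
chk.add("q", "x", "nine")     # q = x + 9
chk.ptr_write("x", "v")       # in bounds: 1002 lies in [1000, 1010)
try:
    chk.ptr_write("q", "v")   # out of bounds: 1011 lies outside [1000, 1010)
    overflow_detected = False
except HeapOverflow:
    overflow_detected = True
```

A write through a variable with no entry in H raises a `KeyError` in this sketch, which corresponds to a configuration where no rule applies.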
Soundness Proof
For each memory access, if the address exceeds the
bound of the corresponding buffer, the execution
must terminate with an exception.
Prove by induction
In-class Exercise: Enhance the Analysis to Detect Dangling Pointers
In-class Exercise: Dynamic Data Dependence Detection
We aim to detect data dependence on the fly
Heap Language
Program   P  ::= s
Statement s  ::= s1; s2 | x =_L y | x =_L y op z | x =_L c |
                 x =_L &y | (*x) =_L y | x =_L *y | x =_L malloc(y) |
                 if (x_L) s1 else s2 |
                 while (x_L) s
Operation op ::= + | - | * | / | > | < | …
Value     c  ::= 0 | 1 | 2 … | true | false
Address   a  ::= 0 | 1 | 2 …
Variable  x, x1, x2, x3
Label     L, L1, L2, …
Configuration
\[ \langle s,\ \delta,\ \gamma,\ C,\ D,\ X \rangle \to \langle s',\ \delta',\ \gamma',\ C',\ D',\ X' \rangle \]
Counter C: Label -> Int
Definition D: Address -> Label × Int
Dependences X: P(Label × Int × Label × Int)
Semantics Rules
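Copy
\[ \frac{\delta' = \delta[\alpha[x] \mapsto \delta[\alpha[y]]] \quad C' = C[L \mapsto C[L] + 1] \quad D' = D[\alpha[x] \mapsto \langle L, C[L] \rangle] \quad X' = X \cup \langle L, C[L], D[\alpha[y]] \rangle}{\langle x =_L y;\ s,\ \delta,\ \gamma,\ C,\ D,\ X \rangle \to \langle s,\ \delta',\ \gamma,\ C',\ D',\ X' \rangle} \]
Ptr-Write
\[ \frac{\delta' = \delta[\delta[\alpha[x]] \mapsto \delta[\alpha[y]]] \quad C' = C[L \mapsto C[L] + 1] \quad D' = D[\delta[\alpha[x]] \mapsto \langle L, C[L] \rangle] \quad X' = X \cup \langle L, C[L], D[\alpha[y]] \rangle \cup \langle L, C[L], D[\alpha[x]] \rangle}{\langle *x =_L y;\ s,\ \delta,\ \gamma,\ C,\ D,\ X \rangle \to \langle s,\ \delta',\ \gamma,\ C',\ D',\ X' \rangle} \]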
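The Copy and Ptr-Write rules can be tried out with a small tracker that maintains C, D and X alongside the store. This is a minimal Python sketch rather than a solution to the exercise: the class name `DepTracker`, the helper `define` (standing in for a constant or address-of assignment, for which the slides give no rule), and the choice to skip reads of addresses that have no recorded definition are all assumptions.

```python
from collections import defaultdict

class DepTracker:
    def __init__(self, variables):
        self.alpha = {x: a for a, x in enumerate(variables)}  # variable -> address
        self.delta = {a: 0 for a in self.alpha.values()}      # address -> value
        self.C = defaultdict(int)  # label -> number of executions so far
        self.D = {}                # address -> (label, instance) of its last definition
        self.X = set()             # {(use_label, use_inst, def_label, def_inst)}

    def _use(self, inst, addr):
        """Record that statement instance `inst` reads the value defined at `addr`."""
        if addr in self.D:         # assumption: undefined reads add no dependence
            self.X.add(inst + self.D[addr])

    def define(self, L, x, value):  # hypothetical helper: x =_L c or x =_L &y
        self.delta[self.alpha[x]] = value
        self.D[self.alpha[x]] = (L, self.C[L])
        self.C[L] += 1

    def copy(self, L, x, y):        # Copy: x =_L y
        ax, ay = self.alpha[x], self.alpha[y]
        inst = (L, self.C[L])
        self._use(inst, ay)
        self.delta[ax] = self.delta[ay]
        self.D[ax] = inst
        self.C[L] += 1

    def ptr_write(self, L, x, y):   # Ptr-Write: (*x) =_L y
        ax, ay = self.alpha[x], self.alpha[y]
        inst = (L, self.C[L])
        self._use(inst, ay)         # depends on the value being stored
        self._use(inst, ax)         # and on the pointer being dereferenced
        target = self.delta[ax]
        self.delta[target] = self.delta[ay]
        self.D[target] = inst
        self.C[L] += 1

t = DepTracker(["a", "b", "p"])
t.define("L1", "a", 5)               # a =_L1 5
t.define("L2", "p", t.alpha["b"])    # p =_L2 &b
t.ptr_write("L3", "p", "a")          # (*p) =_L3 a
t.copy("L4", "a", "b")               # a =_L4 b
```

After this run, `t.X` holds three dependences: the write at L3 depends on the definitions of `a` at L1 and of `p` at L2, and the copy at L4 depends on the write at L3, because that write was the last definition of `b`'s cell.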
In-class Exercise: Logging and Replay Semantics
In-class Exercise: Dynamic Control Dependence Detection
We aim to detect dynamic control dependence on the
fly
In-class Exercise (hard): Dual Execution Semantics
How to align two executions with slightly different
inputs?
Critical for explaining behavior differences (i.e., trace
comparison)
Discussion
Did you formulate your dynamic analysis in your
projects?
With what you have learned, can you formulate it now?
What properties do you want to prove regarding your
analysis?
Can you prove them now with your formalism?