Sample exam-style questions (as PDF)

CS 618 Program Analysis: Practice Questions
Uday P. Khedker
(http://www.cse.iitb.ac.in/˜uday)
Department of Computer Science and Engineering
Indian Institute of Technology, Bombay
May 10, 2011
About These Questions
c
These questions are a copyrighted material (2009
Uday Khedker) and have been designed for the
course CS 618: Program Analysis offered at the Department of Computer Science and Engineering,
IIT Bombay. They are made available only for academic use as material accompanying the book
Data Flow Analysis: Theory and Practice by Khedker, Sanyal, and Karkare. Eventually they are
expected to be included in the next edition of the book. Additional material accompanying the book
can be found at the book page http://www.cse.iitb.ac.in/ uday/dfaBook-web.
Number of
Questions
Bit vector data flow frameworks
15
Theoretical abstractions in data flow analysis
6
General data flow frameworks
27
Interprocedural data flow analysis
14
Topic
Page Number
2
8
9
16
We would be happy to receive suggestions for corrections and improvements in questions. We
also welcome new questions or ideas for new questions.
1
1 Bit Vector Data Flow Frameworks
1. Consider the available expressions analysis framework for the following control flow graph.
n1 a ∗ b n1
n2 a ∗ b n2
n3 a ∗ b n3
n4 a = n4
n5 a ∗ b n5
(a) What should be BI? Which program point should it be associated with? What should be
the initialization for all internal nodes?
(b) Perform available expression analysis using round robin iterative algorithm by traversing
the graph in the forward direction. Show the values for each node in each iteration. How
many iterations are needed for this analysis?
(c) Repeat the analysis by traversing the graph in the backward direction. In backward traversal, compute Out n before computing Inn . How many iterations do you need now?
(d) Show the trace of the worklist iterative algorithm for the given control flow graph. Assume that the work list follows FIFO (First in First Out) policy.
Step No.
Program Point Remaining Data Flow Program Point(s) Resulting
Selected
Work list
Value
Added
Work list
(e) Compare the work performed by the two algorithms for the given control flow graph in
terms of the total number of nodes processed until convergence is established.
Let one unit of work be defined as processing one node (i.e. computing In and Out for
the node). Include the initialization of Ini and Out i as one visit to node i. Compare the
work done by
(i) Round robin analysis with forward traversal
(ii) Round robin analysis with backward traversal
(iii) Work list based method
Which algorithm involves less work? Why?
(f) Does your work list contain nodes that need not be included in it? Can you suggest how
the worklist algorithm can be made more efficient in terms of nodes processed?
(g) What is the depth of this graph? What is its width for available expressions analysis
with forward traversal in a round robin method? What is its width for if the direction of
traversal is changed to backward? In each case, identify the width defining information
flow path.
2. Consider the following dump by gcc for a C program.
2
#
#
e
d
#
BLOCK
PRED:
= a *
= b *
SUCC:
# BLOCK
# PRED:
<L0>:;
e = c *
if (c <
# BLOCK 5
# PRED: 3 (false)
<L2>:;
d = 3;
# SUCC: 6 (fallthru)
2
ENTRY (fallthru)
b;
c;
3 (fallthru)
# BLOCK
# PRED:
<L3>:;
c = d *
c = a *
if (c <
3
2 (fallthru) 6 (true)
6
4 (fallthru) 5 (fallthru)
e;
b;
d) goto <L0>;
else goto <L3>;
# SUCC: 3 (true) 7 (false)
d;
d) goto <L1>;
else goto <L2>;
# SUCC: 4 (true) 5 (false)
# BLOCK 4
# PRED: 3 (true)
<L1>:;
c = 2;
goto <bb 6> (<L3>);
# SUCC: 6 (fallthru)
# BLOCK 7
# PRED: 6 (false)
<L4>:;
D.1547 = d * e;
return D.1547;
# SUCC: EXIT
(a) Draw the control flow graph. Which control construct may have been used in the input C
program?
(b) Perform available expressions analysis. Clearly show Genn and Killn and the initial values
of Inn and Out n , and the values of Inn and Out n in each iteration of analysis.
(c) Show common subexpression elimination using the above data flow values.
3. Perform reaching definitions analysis on the following CFG using round-robin algorithm. List
Gen and Kill for each noode and show the complete trace of all In/Out values in each iteration.
Do you find any scope of copy propagation?
n1 a = 3 n1
n2 b = 5 nn3 2 d = a + 1 n3
n4
c=b
n4
n5 d = 7 n5
n6 print a, b, c, d n6
3
4. The following function computes the mth fibonacci number for a given m ≥ 1. Construct its
control flow graph and perform reaching definitions analysis. Do you find any scope for copy
propagation?
int fib(unsigned int m)
{
int f0, f1, f2, i;
f0 = 0;
f1 = 1;
if (m <= 1)
f2 = m;
else
{
for (i=2; i<=m; i++)
{
f2 = f0 + f1;
f0 = f1;
f1 = f2;
}
}
return f2;
}
5. Construct an instance of live variables analysis which requires four iterations to converge
regardless of whether the control flow graph is traversed in forward direction (i.e. reverese
depth first order), or in the backward direction (i.e. depth first order).
• You need to construct a control flow graph and show the data flow values in each iteration
for both directions of traversal.
• Convergence in four iterations means that the data flow values computed in the first
three iterations are different but the data flow values computed in the fourth iterations
are identical to the data flow values computed in the third iteration.
• Assume that in a forward traversal, Inn is computed before Out n whereas in a backward
traversal, Out n is computed before Inn .
6. Construct a C program for which available expressions analysis for which the minimum fixed
point assignment is different from the maximum fixed point assignment and both computations
require four iterations to converge (three for convergence and one for detecting convergence).
You are not allowed to use goto statements in your C programs. Use at most two expressions.
Assume that the control flow graph is traversed in forward direction (i.e. reverse depth first
order). Show the data flow values in each iteration.
7. Construct an instance of live variables analysis such that
• the maximum fixed point assignment for the instance is different from its minimum fixed
point assignment, and
4
• the computation of maximum fixed point assignment converges in 3+1 iterations (3 for
computation and 1 for discovering convergence), and
• the computation of minimum fixed point assignment also converges in 3+1 iterations.
Assume that analysis is performed by making a backward traversal over the control flow graph.
8. Construct a C program for which available expressions analysis requires four iterations to
converge. You are not allowed to use goto statements in your C programs. Assume that the
control flow graph is traversed in forward direction (i.e. reverese depth first order). Show the
data flow values in each iteration.
9. Recall that it is possible to perform total and partial availability analyses together by using
the component lattice shown below. Perform this analysis for the following program using
the round robin iterative method. Assume that the control flow graph is traversed in reverse
postorder (i.e. forward direction). Is the result of your analysis any different from the results
obtained by independent analyses? Why?
n1 a ∗ b n1
unknown
n2 a ∗ b n2
no
must
n3 a = nn34 a ∗ c n4
may
n5 a ∗ b n5
n6 a ∗ b n5
10. Backward Slicing finds the smallest set of statements relevant to a given computation at a
program point of interest (known as the Slicing Criterion). The first step in slicing is to
construct a set of variables that are directly relevant to the computation at the slicing criterion.
For each edge i → j in the CFG, a variable v is relevant at i (denoted v ∈ Relevant(i)) if:
/
(v ∈ Relevant( j) ∧ v ∈
/ Def (i)) ∨ (v ∈ Ref (i) ∧ Def (i) ∩ Relevant( j) 6= 0)
Consider the following example:
Program
L1: b = 5;
L2: c = 10;
if (<cond>)
L3:
a = a + c;
else
L4:
a = a + 1;
L5: return (a*b);
// slicing criterion
Directly relevant variables
• b ∈ Relevant(L4) because b ∈ Relevant(L5) and
b∈
/ Def (L4)
• b ∈ Relevant(L2) because b ∈ Relevant(L4) and
b∈
/ Def (L2)
• a ∈ Relevant(L4), because a ∈ Ref (L4) and
a ∈ Def (L4) ∩ Relevant(L5)
• a, c ∈ Relevant(L3), because a, c ∈ Ref (L3) and
a ∈ Def (L4) ∩ Relevant(L5)
5
Define the data-flow equations for computing Relevant(i). Please provide:
(a) Definitions of Gen, Kill, In, Out .
(b) Interpretation of BI for the problem with appropriate assumptions.
(c) Lattice, confluence operation, and default initialization.
(d) Comment on separability of the problem with illustration.
11. Perform partial redundancy elimination for the following control flow graphs. Use bit vector
notation for convenience.
n1
n1 a ∗ b n1
n2 a ∗ b n2
a∗b
n2
n4 b ∗ c
t0 = a ∗ b
a = t0
1
1
t1 = b ∗ c
c = t1
a=2
b∗c
n3 c = 5
n3 a = n3
n4 a = n4
n5
n5 a ∗ b n5
a∗b
b∗c
t0 = a ∗ b
c = t0
2
2
t1 = b ∗ c
d = t1
(C)
(A)
n1 a ∗ b
n1 a ∗ b n1
n2 a ∗ b
n8 a ∗ b
n2 a ∗ b n2
n3 a = n3
n4 a = n4
n5 a ∗ b n5
(B)
n3 a ∗ b
n5 a ∗ b
n4 a ∗ b
n6 a ∗ b
n7 a =
n9 a =
3
t1 = b ∗ c
3
b = t1
4
t1 = b ∗ c
4
a = t1
a ∗ b n10
a ∗ b n11
a ∗ b n12
(E)
(D)
(a) List the values of Genn , Killn , AntGenn .
(b) Compute PavInn /PavOut n , and AvInn /AvOut n for each node n.
(c) Show the values of Inn and Out n for each node n in each iteration.
(d) Show the values of Redundant n and Insert n .
(e) Show the hoisting paths and explain the resulting transformation intuitively. If no hoisting
is possible, explain why it is not possible.
(f) What is the width of the CFG for PRE? Identify the width determining path.
6
(g) Computer AntInn /AntOut n . Do you observe any relationship between these values and
the Inn /Out n values for PRE?
12. Construct a C program for which PRE requires 5 iterations of round robin iterative analysis (4
to converge and 1 to detect convergence). Assume that the control flow graph is traversed in
postorder (i.e. backward direction). You have to meet the following design constraints.
• The program should not contain goto statements or nested loops.
• Design the program to contain a single expression for analysis.
• The structure of the program should not resemble the structures that we have seen in the
class.
Simple programs will receive more credit.
(a) Write the program and draw the corresponding control flow graph.
(b) Show the final values of available expressions and partially available expressions analyses.
(c) Perform work list based PRE analysis for your program. Show the trace of analysis in the
following format. Assume that the work list follows FIFO (First in First Out) policy.
Step No.
Program Point Remaining Data Flow Program Point(s) Resulting
Selected
Work list
Value
Added
Work list
(d) Show an ifp leading to 5 iterations of round robin analysis. Indicate the step numbers in
your trace of the work list algorithm that cover this ifp.
13. Construct a C program for which MFP and LFP assignments for PRE are different.
14. Let N be the set of nodes of a control flow graph. A node x ∈ N dominates a node y ∈ N,
denoted x dom y, if and only if x appears on every path from Start to y. The dominance
relation dom is reflexive, transitive, and antisymmetric. An example of dominator information
has been shown below.
n1 a ∗ b n1
Node
n1
n2
n3
n4
n5
n2 a ∗ b n2
n3 a = n3
n4 a = n4
Dominators
{n1 }
{n1 , n2 }
{n1 , n2 , n3 }
{n1 , n2 , n4 }
{n1 , n2 , n5 }
n5 a ∗ b n5
Define a data flow analysis for computing dominators of each node. The result on analysis
should be to compute, for a given node n, a set domInn which is the set of dominators of n
excluding n and domOutn which includes n also.
7
In particular, you need to specify BI, the program point with which BI should be associated,
the flow functions in terms of Genn and Killn , the confluence operation, and tie them into data
flow equations. Is your framework a bit vector framework?
15. Let the set of dominators of node y be denoted by dom(y). An immediate dominator of a node
y is a node z ∈ dom(y) such that ∀w ∈ (dom(y) − {y}), w ∈ dom(z). Intuitively, an immediate
dominator of y is the dominator that is “closest” to y but is not y.
Define a data flow analysis based method to compute immediate dominators.
2 Theoretical Abstractions in Data Flow Analysis
1. Is the following poset a lattice? If not, explain whether it is a meet semilattice or a join
semilattice. If yes, explain whether it is a complete lattice or a bounded lattice.
a
b
c
d
e
f
2. Assume that L is a complete lattice in which all strict chains are bounded. Given a monotonic
function f : L 7→ L, show that
∃k ≥ 0 such that f k+1 (⊥) = f k (⊥) and ∀ j < k, f
j+1
(⊥) 6= f j (⊥)
Prove that f k (⊥) is the least fixed point f .
3. Given f : L 7→ L, prove that the following two definitions of monotonicity are equivalent.
∀x, y ∈ L,
x ⊑ y ⇒ f (x) ⊑ f (y)
∀x, y ∈ L, f (x ⊓ y) ⊑ f (x) ⊓ f (y)
4. Let fρ and fi→ j denote the flow functions associated with path ρ and edge i → j, respectively.
Let Paths(i) denote the paths from Entry to i. Given the definitions of MFP and MoP,
MoP(i) =
ρ∈Paths(i)
MFP(i) =
fρ (BI)
f p→i (MFP(p))
p∈Pred(i)
show that ∀i, MFP(i) = MoP(i) for distributive frameworks.
8
5. Construct an instance of a framework by describing a lattice L, a ⊑ relation, and a ⊓ operation
and by constructing a control flow graph with the associated flow functions. Assume the
following data flow equations:
(
BI
n is Start block
Inn =
Out p otherwise
p∈pred(n)
Out n = fn (Inn )
The constructed instance should have the following characteristics:
• L must be a complete lattice.
• Every flow function fn must be distributive.
• Select a BI, an initialization, and flow functions such that the round robin iterative algorithm should not terminate for this instance.
Does your instance have a fixed point assignment? If yes, explain why the round robin iterative
algorithm does not terminate. If it does not have a fixed point assignment, explain why.
6. Recall that a data flow framework hL, ⊓, Fi is k-bounded if
∀ f ∈ F, ∀x ∈ L, f ∗ (x) = x ⊓ f (x) ⊓ f 2 (x) ⊓ . . . = x ⊓ f (x) ⊓ f 2 (x) ⊓ . . . ⊓ f k−1 (x)
Is the boundedness parameter k related to the the component lattice? The overall lattice?
Justify your answer in the context of
(a) Constant propagation
(b) Available Expressions Analysis
(c) Combined Total and Partially Available Expressions Analysis
3 General Data Flow Frameworks
1. Perform possibly uninitialized variables analysis for the following CFG using round-robin
algorithm and show the values of In/Out in each iteration.
n1 a = 3 n1
n2 b = 5 nn3 2 d = a + 1 n3
n4
c=b
n4
n5 d = 7 n5
n6 print a, b, c, d n6
9
2. Perform faint variables analysis for the following CFG round-robin algorithm and show the
values of In/Out in each iteration.
n1
b=2
c=3
n1
n2 a = b n32 d = c + 1 n3
n4
d=c
n4
n5 print a nn56 d = 5 n6
n7 print d n7
n8 a = b n8
3. (a) Construct a CFG containing only one back edge such that
(i) Faint variables analysis requires four iterations of backward traversal for convergence
(i.e. values in 2nd and 3rd iterations are different but the values in 3rd and 4th
iteration are identical).
(ii) Does the result of your analysis lead to dead code elimination?
(iii) Perform live variables analysis on the same graph. How many iterations does it
take? What is the relationship between dead variables as discovered by live variables
analysis and faint variables?
(b) Construct another example of faint variables analysis which has 3 back edges but requires
only two iterations to converge.
4. Construct an instance of possibly uninitialized variables analysis such that
• the maximum fixed point assignment for the instance is different from its minimum fixed
point assignment, and
• the computation of maximum fixed point assignment converges in 3+1 iterations (3 for
computation and 1 for discovering convergence), and
• the computation of minimum fixed point assignment also converges in 3+1 iterations.
Assume that analysis is performed by making a forward traversal over the control flow graph.
5. Consider the constant propagation framework for the following control flow graph.
n1 a = 0 n1
n2 a = a + 1 n2
n3 a ∗ b n3
10
(a) Compute the maximum fixed point for constant propagation for the give control flow
graph. How many iterations does it need?
(b) Observe that fn2 does not have a fixed point. Still the data flow equations compute a fixed
point. How?
6. Perform constant propagation for the following program using the round robin iterative method.
Assume that the control flow graph is traversed in the forward direction.
n1 a = b + 1 n1
n2 a = b + 1 n2
n4 a = 1 n4
n3 b = c + 1 n3
n5 c = a + 1 n5
n6 a = 1 n6
7. (a) Perform constant propagation for the following two programs using round-robin algorithm. Show the complete trace (in tabular form) of all In/Out values in each iteration.
Clearly specify the number iterations in each case.
n1 a = b n1
n2 a = b n1
n4 a = b n1
n6 a = b n1
n8 a = b n1
n3 d = 2 n1
n5 c = d n1
n7 b = c n1
n10 a = b nn109 a = b n1
(b) Repeat constant propagation analysis for the same CFG by changing the statements in
selected nodes as described below: n3 contains a = b, n5 contains b = c, n6 contains
c = d, n7 contains d = 2. How does the number of iterations changes?
11
8. Recall that copy constant propagation is a simplified version of constant propagation in expressions are not evaluated; whenever expression evaluation is required, the result of the expression is assumed to be ⊥. Perform copy constant propagation for the following CFG.
n1 read (e); n1
n2
a = 7; b = 2; f = e;
n2
if ( f > 0)
n3
n4
a = 2;
if ( f ≥ e + 2) n3
b = c + 1;
if (b ≥ 7) n4n if ( f ≥ e + 1) n
6
6
n5
f = f + 1; n5
n7 c = d; n7n8
n10
e = a + b; n10
n9
d = f + b; n8
d = a;
f = f + 1 n9
9. Create an example of constant propagation for which the MoP, the MFP, and the LFP, are all
different from each other.
10. Signs Analysis involves determining signs of the possible values of variables at each program
point as positive non-zero (+), negative (−), and zero (0). An example of a program and the
desired data flow information is:
a=0
ha, {0}i
a = a−5
ha, {−}i
a = −a
ha, {+}i
ha, {+, −}i
a = a−2
ha, {+, −, 0}i
Define a data flow analysis to compute signs information for each variable. Clearly define
the lattice, the confluence operation, the BI, the initialization for internal nodes, and data flow
12
equations. For the purpose the flow functions, assume that the arithmetic expressions are
restricted to both binary and unary versions of operators + and −. Define the flow functions
in terms of ConstGenn , ConstKilln , DepGenn (X ), and DepKilln (X ).
Is your framework separable? Distributive?
11. Consider the following program fragments for points-to analysis.
if (...)
p = &x;
else
p = &y;
x = &a;
y = &b;
∗p = &c;
∗y = &a;
if (...)
p = &y;
else
p = &x;
y = &b;
x = &a;
∗y = &a;
∗p = &c;
(a) Show the result of flow insensitive May points-to analysis.
(b) Show the result of flow sensitive May points-to analysis.
(c) Show the result of flow sensitive Must points-to analysis.
12. Perform independent May and Must points-to analyses on the following control flow graph.
When performing May points-to analysis, assume conservative Must points-to information
(no pointer points to any location). When performing Must points-to analysis, assume conservative May points-to information (a pointer points to every location).
n1 b = &a; n1
n2 c = b; n2
n3 a = &b; nn3 4 a = &c; n5
n5 a = ∗a; n6
n6 ∗b = c; n7
13. Is May points-to analysis framework k-bounded? If yes, what is the value of k?
14. Is Must points-to analysis framework k-bounded? If yes, what is the value of k?
15. Create an example to show the non-distributivity of May points-to analysis framework.
16. Create an example to show the non-distributivity of Must points-to analysis framework.
17. Perform liveness analysis of heap references for the following program. Clearly specify the
number iterations.
13
n1 x = x.n
n2 y = x
n3 x = y.n
n4 print x.d
18. Construct a control flow graph for this program and perform explicit liveness analysis for heap
references in the following program. You have to show the Inn and Out n in each iteration.
bool find(int n, Tree * t)
{ found = false;
while (t != NULL)
{ if (n == t->n)
{ found = true; break; }
else if (n < t->n)
t = t->l;
else t = t->r;
}
}
19. Consider the following programs for heap reference analysis.
n1 x = x.n n1
n2 x = x.n n1n2 x = x.n n1 n2 x = x.n n1
n3 y = x.r n1
n1
x = x.n
n1 x = x.n n1 n1 x = x.n n1 n1 x = x.n n1
n2 x = x.n n1
(b)
(c)
x = x.l
n4
y = x.r
n3 y = x.r n1 n3 y = x.r n1 n3 y = x.r n1
(a)
n3
n2
x = x.n
(d)
(e)
(a) Show the final explicit liveness access graphs for each node.
(b) Liveness graphs of programs (b) and (c) are identical at node n1 . Why is this semantically
correct inspite of the fact that both the programs have different structures?
20. Consider the following access graphs representing explicit liveness at some program point
(perhaps in different programs).
x
l
l
r
l
l
y
r
r
r
14
l
Unfortunately the student who constructed these access graphs forgot to attach a statement
number as a subscript to the node labels and has misplaced the programs which gave rise
to these graphs. Please help her by constructing (independent) CFGs for which these access
graphs represent explicit liveness at some program point in the CFGs. Please also show the
liveness access graphs at all other program points in the CFGs.
21. Perform heap liveness analysis for the following program:
a=0
1
x.n.n.n = y 2
y = x.n 3
x = y.n 4
print x.d 5
22. Is the access path x−⊲n live at the entry of node n4 ? Does explicit liveness include it? If there
is a discrepancy in your observation and the result of explicit liveness analysis, please suggest
how this discrepancy can be removed. If there is no discrepancy, explain why there is no
discrepancy.
n1
x.n.n.d
n3 x = y
x = z n2
n4 y.n = Null
n5 print x.n.n.d
23. Create an example of explicit liveness analysis in HRA such that it requires five iterations. The
values must change in the first four iterations and should remain constant in the fifth iteration.
Draw the CFG and show the values in each iteration.
24. Construct an example to show that explicit liveness analysis is non-distributive.
25. Is it possible to get the following two liveness access graphs reaching the same program point
p along two different control flow paths? Please explain your answer and describe the conclusion that you draw from your explanation.
x
l1
r2
15
x
r2
l1
26. Perform explicit liveness analysis for the following control flow graph. What nullification
statements, if any, can be inserted and at what points?
n1 w = x n1
n2 while (x.data < max) n1
n3 y = x.lptr n1 n5 x = x.rptr n1
n4 x = y.lptr n1n6 print x.data n1
27. Heap reference analysis computes explicit liveness at each program point and then aliasing
information is used to compute implicitly live access paths by discovering link aliases of the
explicitly live access paths. There are two ways of doing this:
• Implicitly live access paths are computed during explicit liveness analysis and are propagated backwards in the program along with the explicitly live access paths.
• Implicitly live access paths are computed after explicit liveness analysis and are not
propagated backwards.
Perform heap reference analysis for the following program to show that for correctness it is
necessary to compute implicitly live access paths during explicit liveness analysis. Also show
how it results in imprecise liveness information. (Since this program does not have loops,
there is no need to construct access graphs. Give your answer in terms of sets of access paths
for liveness and pairs of access paths for aliasing.)
n1 x = y n1n2 x = y n2
n3 x.n = z n3
n4 Use y.n.n.d n4
4 Interprocedural Data Flow Analaysis
1. Consider the following program.
void p()
{ if (...)
{ a = a∗b;
p();
}
}
int a, b, c;
int main()
{ c = a∗b
p();
}
16
(a) Draw control flow graphs of the two procedures and perform available expressions analysis using functional approach. Clearly show Φmain (n) and Φ p (n) for each block n in
procedures main and p. You have to provide complete trace of computation of these flow
functions in a tabular form.
(b) Construct a supergraph for the program and perform interprocedural available expressions
analysis using the call strings approach assuming that a call site needs to appear in a call
string at most thrice. Use the work list approach and show the trace of your computation
in the following format:
Step
No.
Selected
Node
Qualified Data Flow Value
INn
OUTn
Remaining Work List
Intraprocedural Call/Return
Nodes
Nodes
(c) Use the modified call strings approach and perform interprocedural available expressions
analysis. Clearly indicate representation and regeneration of call strings. Recall that the
algorithm requires that
• intraprocedural nodes are processed before any call/return node in the worklist, and
• call nodes are processed before any return node in the work list.
Do you have to construct fewer call strings?
2. Perform interprocedural available expressions analysis for the following program using the
functional approach. Draw the control flow graphs and construct summary flow functions by
iterating over them. Show the trace of construction in a tabular form.
1.
2.
3.
4.
5.
6.
void main(void)
{
c = a*b;
p();
printf ("%d\n", c);
}
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
void p (void)
{
if (...)
{
a = a*b;
p();
}
else if (...)
{
c = a * b;
p();
printf ("%d\n", c);
}
else
; /* ignore */
}
What does your analysis conclude about the availability of a ∗ b at line 18? At line 6? If it is
not available, please show an interprocedurally valid path along with the expression is killed.
3. Construct an instance of interprocedural available expressions analysis to show that the result of context sensitive interprocedural analysis is more precise than the result of context
insensitive interprocedural analysis.
17
4. Construct an example of liveness analysis to show that context sensitive interprocedural data
flow analysis is more precise compared to context insensitive interprocedural data flow analysis. You do not have to perform analysis but only show possible interprocedurally invalid paths
in a supergraph which, if not avoided, lead to imprecision for your instance of interprocedural
liveness analysis.
Your program must have a single variable which must be a global variable. Use only two
procedures, one of which must be main and is not called by any procedure.
5. Construct a supergraph for the program shown below. Perform interprocedural available expressions analysis using call-strings approach. Use the work list approach and give a table of
the trace of the algorithm.
Main
1. start
2. call P
3. c ∗ d
4. call T
5. end
P
1.
2.
3.
4.
5.
6.
7.
8.
Q
1.
2.
2.
3.
4.
6.
start
a∗b
if() then
call Q
else
call T
c=5
end
start
if() then
a=5
else
b=6
end
T
1.
2.
3.
start
call Q
end
6. Recall that the 3 occurrences bound on call strings for interprocedural data flow analysis of
bit vector frameworks is a general program independent bound and fewer than 3 occurrences
of any call sites in every call string may be sufficient for some programs. Using the reasoning
for 3 occurrences bound, explain why a single occurrence of any call site is sufficient for
computing a precise solution for available expressions analysis of the following program.
(You have to explain this without performing data flow analysis.)
Startmain
read a, b
t := a ∗ b
StartP i f a = 0 StartP
n2 a := a − 1 n2
C1 call p C1
C2 call p C2
R1 call p R1
n1
R2 call p R2
t := a ∗ b
n1
print t
call p
Endmain
n3 t := a ∗ b n3
EndP call p EndP
18
7. Is a ∗ b available immediately after c1 in the following supergraph? If it is not available,
identify and interprocedural valid information flow path that causes it to be un-available. How
is it covered by the functional method? By the call strings method?
Start p a ∗ b Start p
Startmain a ∗ b Startmain
a=1
n1 c = a + b n1n2 b = 2 n2
C1 call p C1
C2 call p C2 C3 call p C3
R1 call q R1
R2 call p R2 R3 call p R3
Endmain a ∗ b Endmain
n3 e = d + 1 nn43 d = c + 1 n4
End p a ∗ b End p
8. Use the modified call strings method to perform interprocedural available expressions analysis
of the following program.
Start p a ∗ b Start p
Startmain a + b Startmain
n1 a = a + b n12 c = a + b n2
C1 call p C1
C2 call p C2 C3 call p C3
R1 call q R1
R2 call p R2 R3 call p R3
Endmain a ∗ b Endmain
End p a ∗ b End p
Is expression a + b available at R2 ? At R3 ? If it is not available, please show an interprocedurally valid path along with the expression is killed.
9. Explain using the staircase diagram why 3 occurrences of a call site are sufficient in any call
string for performing interprocedural data flow analysis of bit vector data flow frameworks.
19
10. Consider the following program
Sp x ∗ y Sp
Entry x ∗ y Entry
C2 call p C2
C1 call p C1
R2 call p R2
R1 call p R1
n2
Exit x ∗ y Exit
a = b;
b = c;
c = 2;
n2
Ep x ∗ y Ep
(a) Perform call strings based interprocedural copy constant propagation using the modified
call strings method for the following supergraph. Show the trace of work-list algorithm.
(b) Perform copy constant propagation using the functional approach. Show the constructed
summary flow functions.
11. The approximate call strings approach maintains k-length suffixes of call strings instead of
full call strings.
When length of a call string σ reaching a call nodes Ci in procedure P is less then
m, ci is appended to σ. If length of σ is equal to m, then the first call site is removed
and ci is appended to σ, thus maintaining length m. Assume that hc j · σ′ , x j i and
hck · σ′ , xk i reach Ci and |σ′ | = m − 1. Then hσ′ · ci , x j ⊓ xk i is propagated further.
At return node Ri , the last call site ci is removed from the incoming call string
and all call sites containing a call to procedure P are attached at the first position,
thereby reconstructing the call strings whose first call site was removed at Ci . Assume that hσ′ · ci , xi i reach Ri , then hc j · σ′ , xi i and hck · σ′ , xi i are propagated further.
(a) Construct a supergraph for a non-recursive program such that when this method (with
m = 2) is used for performing interprocedural live variables analysis, the data flow information so obtained is imprecise. Use at most two variables in your program.
(b) Repeat the above exercise for a recursive program where approximation that causes imprecision is influenced by recursion.
12. (a) Perform interprocedural constant propagation for the following program using the functional approach. Draw the control flow graphs and construct summary flow functions by
iterating over them. Show the trace of construction in a tabular form.
20
1.
2.
3.
4.
5.
6.
void main(void)
{
a = 5;
p();
printf ("%d\n", a);
}
7.
8.
9.
10.
11.
12.
13.
void p (void)
{
if (...)
{
a = a+7;
p();
a = a-7;
}
}
Does variable a have a constant value at line 5? What does your analysis conclude about
it?
(b) Draw the supergraph for the above program and explain the difficulty in performing interprocedural constant propagation over the program using the classical Sharir-Pnueli call
strings method. Does the problem go away if we use the modified call strings method?
Why?
(c) Use approximate call strings method with m = 2 for interprocedural constant propagation
analysis of the above program. Does the problem you described in answer to part (b)
above go away now? Why?
Does your method discover a to be constant line 5? Why?
13. Perform may points-to analysis for the following program using modified call strings method.
Make conservative assumptions about must points-to information. How many call strings do
you need to construct? How many call strings would be required by the Sharir-Pnueli method?
p()
{ if (...)
{ p(); /* C2 */
x = *x;
}
}
main()
{ x = &y;
z = &x;
y = &z;
p(); /* C1 */
}
14. In the proof of correctness of the modified call strings method, we have used the concept of a
fixed point closure bound which is defined as the smallest n ≥ 0 such that ∀x ∈ L, f n+1 (x) = f n (x).
Consider the following program for may points-to analysis. Assume that there are only three
pointer variables a, b, and c. Also assume that must points-to analysis is not to be performed.
p()
{
main()
{
a = &b;
b = &c;
c = &b;
p();
}
if (...)
{
a = *a;
p();
}
}
21
Does the flow function for the cyclic call sequence of the program have a fixed point closure bound? Perform interprocedural may points-to analysis using the modified call strings
method. Does the method terminate? Explain why.
22