
Component-Level Dataflow Analysis
Atanas (Nasko) Rountev
Ohio State University
CBSE'05
Outline

Interprocedural dataflow analysis
  Whole-program analysis: limitations
Problem: making dataflow analysis usable and useful for component-based software
  Technical challenges
Ongoing and future work
Uses of Dataflow Analysis

Software understanding tools
  e.g. dependence analysis for program slicing, change impact analysis, refactoring, etc.
Software testing
  e.g. dataflow-based testing; testing of object interactions in OO software
Software checking
  e.g. object protocols: open(read|write)*close
Performance optimizations in compilers
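To make the object-protocol example concrete, here is a minimal sketch of the protocol open(read|write)*close as a finite-state machine. The state names and the `follows_protocol` helper are assumptions made for illustration. A real checker would propagate these states along all program paths with a dataflow analysis, whereas this sketch only checks one concrete call sequence.

```python
# Hypothetical typestate encoding of the protocol open(read|write)*close.
# State names ("init", "opened", "closed") are illustrative only.
TRANSITIONS = {
    ("init", "open"): "opened",
    ("opened", "read"): "opened",
    ("opened", "write"): "opened",
    ("opened", "close"): "closed",
}


def follows_protocol(calls):
    """Return True if the call sequence is accepted by the protocol."""
    state = "init"
    for call in calls:
        state = TRANSITIONS.get((state, call))
        if state is None:  # no legal transition: protocol violation
            return False
    return state == "closed"
```

For example, `follows_protocol(["open", "read", "close"])` is accepted, while a sequence that reads before opening is rejected.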
Model for Whole-Program Analysis
[Diagram: code for C1, code for C2, …, code for Cn → Whole-Program Dataflow Analysis → dataflow solution for C1 + C2 + … + Cn]

C1 + C2 + … + Cn constitute a complete program
Implicit assumption: it is possible and desirable to analyze the source code of the entire program as a single unit
Limitations of Whole-Program Analysis



What if some of the components are only available in binary form?
What if we are building a library?
What if we are using large libraries that need to be re-analyzed from scratch?
  e.g. the standard Java libraries contain a few thousand classes
What if one part of the program changes?
  may have to re-analyze the entire program
Outline

Interprocedural dataflow analysis


Whole-program analysis: limitations
Problem: making dataflow analysis usable
and useful for component-based software


Dozens of existing analyses could potentially become useful for component-based software
in tools for software understanding, testing, checking, and optimization
Technical challenges
A Simple Case: Main + Lib
[Diagram: code for Lib → Component-Level Dataflow Analysis → dataflow solution for Lib + summary for Lib]
[Diagram: code for Main + summary for Lib → Component-Level Dataflow Analysis → dataflow solution for Main]
Goal: the solution for Main should be as good as
the solution that would have been computed by
a whole-program analysis (no loss of precision)
Component Model and Summary Info

Component = set of related procedures
or classes




Component interactions: synchronous calls,
shared variables
Challenge: more sophisticated component
models
Summary information is computed based
only on the source code of Lib
Challenge: use info from component
specifications
Summary Functions
[Diagram: Main calls procedure Q in Lib; two paths run through Q]
path p1: dataflow function f1
path p2: dataflow function f2
Summary function for Q: fQ = f1 ∧ f2, computed by the analysis of Lib
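A minimal sketch of how a summary function for Q could be formed from its two path functions, assuming the combining operator is the lattice meet. The analysis is assumed to be a "may" problem over sets of facts (so the meet is set union), and the facts `d0` to `d3` and the helpers `gen_kill`, `in_order`, and `meet` are hypothetical names chosen for illustration.

```python
from functools import reduce


def gen_kill(gen, kill):
    """Transfer function x -> (x - kill) | gen over frozensets of facts."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda facts: (facts - kill) | gen


def in_order(*functions):
    """Compose transfer functions in execution order along a path."""
    return lambda facts: reduce(lambda acc, f: f(acc), functions, facts)


def meet(f, g):
    """Pointwise meet of two functions; set union for this 'may' problem."""
    return lambda facts: f(facts) | g(facts)


# Hypothetical path functions for the two paths through Q.
f1 = gen_kill(gen={"d1"}, kill={"d0"})                  # path p1
f2 = in_order(gen_kill(gen={"d2"}, kill=set()),         # path p2, first statement
              gen_kill(gen={"d3"}, kill={"d2"}))        # path p2, second statement

# Summary function for Q: the meet over both paths.
f_Q = meet(f1, f2)
```

With these assumed functions, a call site in Main only needs to apply `f_Q` to the facts that hold before the call. It never has to look inside Q.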
Open Questions

Challenge: compact representation of
dataflow functions and their transitive
composition and meet


Existing work solves this problem for some
analysis categories; need generalizations
Challenge: callbacks



e.g., function pointers in C
e.g., virtual dispatch in C++ and Java
Fundamental problem, not addressed
adequately by existing work
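As one illustration of a compact representation, gen/kill functions over sets of facts are closed under both composition and meet, so a summary can be stored as a pair of sets rather than as a whole path. The sketch below assumes a "may" problem whose meet is set union. The `GenKill` class and its method names are hypothetical, and this is only one of the analysis categories for which such a representation is known to work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenKill:
    """Compact form of the function x -> (x - kill) | gen."""
    gen: frozenset
    kill: frozenset

    def __call__(self, facts):
        return (facts - self.kill) | self.gen

    def then(self, other):
        """Composition in execution order: apply self first, then other."""
        return GenKill(gen=(self.gen - other.kill) | other.gen,
                       kill=self.kill | other.kill)

    def meet(self, other):
        """Pointwise union of the two results, again as a single gen/kill pair."""
        return GenKill(gen=self.gen | other.gen,
                       kill=self.kill & other.kill)
```

Both operations return another `GenKill`, so the summary stays a single pair of sets no matter how many functions are combined. Its size is bounded by the number of distinct facts, not by the length or number of paths.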
Callbacks
[Diagram: Main contains R; Lib contains Q. Main calls procedure Q; during the call, Lib calls R]
The function for p2 cannot be computed until Main is analyzed
Solution: summary functions for subpaths, computed during the analysis of Lib; later, compose them with the functions from Main
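The subpath idea can be sketched as a two-phase computation. This is a simplified illustration, not the theoretical model from the talk. It assumes set-valued facts with union as the meet and a single callback site on path p2, and the dictionary keys, fact names, and `instantiate_summary` helper are hypothetical.

```python
from functools import reduce


def gen_kill(gen, kill):
    """Transfer function x -> (x - kill) | gen over frozensets of facts."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda facts: (facts - kill) | gen


def in_order(*functions):
    """Compose transfer functions in execution order."""
    return lambda facts: reduce(lambda acc, f: f(acc), functions, facts)


# Phase 1: analysis of Lib alone. Path p2 contains a callback to R, so only
# the subpaths before and after the callback can be summarized here.
lib_summary = {
    "p1": gen_kill(gen={"a"}, kill=set()),
    "p2_before_callback": gen_kill(gen={"b"}, kill={"a"}),
    "p2_after_callback": gen_kill(gen={"c"}, kill=set()),
}


def instantiate_summary(summary, f_R):
    """Phase 2: plug the function for R (known once Main is analyzed) into p2."""
    f_p2 = in_order(summary["p2_before_callback"], f_R,
                    summary["p2_after_callback"])
    return lambda facts: summary["p1"](facts) | f_p2(facts)  # meet = union


# Phase 2: analysis of Main supplies the function for the callback target R.
f_R = gen_kill(gen=set(), kill={"b"})
f_Q = instantiate_summary(lib_summary, f_R)
```

The Lib summary is computed once and can be reused with any Main. Only the cheap composition step in `instantiate_summary` has to be repeated when Main changes.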
Ongoing Work


Goal 1 (achieved): theoretical model for
computing and using summary functions in
the presence of callbacks
Goal 2 (ongoing): instantiate the model to
common categories of analyses


dependence analysis, pointer analysis, etc.
Goal 3: experimental evaluation


e.g. how large are the summaries?
Eclipse plug-in for call graph construction:
needs summary info for all Java 1.4 libraries
Future Work

Beyond the traditional restrictions


Higher level of abstraction for component interfaces and interactions
  Right now: low-level mechanisms such as procedure calls and shared variables
Use not only code, but also component specifications: e.g., “sharpen” the summary functions based on preconditions
Extensive experimental evaluation on
real-world software systems