Abstract Interpretation with Applications to Timing Validation

Runtime Guarantees for
Real-Time Systems
Reinhard Wilhelm, Björn Wachter
AVACS Spring School
2010
Hard Real-Time Systems
Figure: a control loop. The physical environment (situation of the aircraft, aerodynamics, ...) is observed through sensors by the flight control software, which drives the actuators (fly-by-wire: rudder, slats, flaps; engine). Reaction time is critical.
Hard Real-Time Systems
• Hard real-time systems: avionics, automotive, train industries, manufacturing control.
• Examples: airbag, reaction in < 10 ms; crankshaft-synchronous tasks, < 45 µs.
• Embedded controllers must finish their tasks within given time bounds.
• Developers would like to know the Worst-Case Execution Time (WCET) to give a guarantee.
Overview
• The Problem
• Sketch of a Solution
• Tool Architecture
• Analyses
Static Timing Analysis - the Application Domain
The problem: given
1. software expected to produce some reaction,
2. a hardware platform on which the software runs,
3. a required reaction time,
derive a guarantee for timeliness.
What does Execution Time Depend on?
• The input - inevitable.
• The state of the execution platform - this is (relatively) new; caused by caches, pipelines, speculation, etc.
• Interferences from the environment - "external" interference as seen from the analyzed task, in the presence of preemptive scheduling and interrupts.
Modern Hardware Features
• Modern processors increase performance by using caches, pipelines, branch prediction, and speculation.
• These features make the computation of bounds difficult; execution times can differ by several hundred cycles:
- Best case (fast): everything goes smoothly - no cache miss, operands ready, needed resources free, branch correctly predicted.
- Worst case (slow): everything goes wrong - all loads miss the cache, needed resources are occupied, operands are not ready.
(Concrete) Instruction Execution
Figure: a mul instruction passing through the pipeline stages Fetch (I-cache miss?), Issue (unit occupied?), Execute (multicycle?), and Retire (pending instructions?). The figure annotates the stages with example cycle counts (1, 3, 4, 6, 30, 41): depending on the answers to these questions, the instruction takes anywhere from a few cycles to several tens of cycles.
Access Times
Example: x = a + b; compiled to
LOAD  r2, _a
LOAD  r1, _b
ADD   r3, r2, r1
Figure: the execution time of this sequence on an MPC 5xx and on a PPC 755 varies greatly, depending on where the operands reside in the memory hierarchy. The threat: over-estimation by a factor of 100.
Notions in Timing Analysis
Figure: the distribution of execution times of a task. The actual worst-case execution time is hard or impossible to determine; timing analysis determines upper bounds instead.
High-Level Requirements for
Timing Analysis
• Upper bounds must be safe, i.e. not underestimated.
• Upper bounds should be tight, i.e. not far away from real execution times.
• Analogously for lower bounds.
• The analysis effort must be tolerable.
Note: all analyzed programs are terminating and loop bounds need to be known, so there is no decidability problem, but a complexity problem!
aiT WCET Analyzer
From the IST Project DAEDALUS final review report:
"The AbsInt tool is probably the best of its kind in the world and it is justified to consider this result as a breakthrough."
Several time-critical subsystems of the Airbus A380 have been certified using aiT; aiT is the only validated tool for these applications.
Tremendous Progress during the past 15 Years
Figure: cache-miss penalties grew from about 4 to about 200 cycles over this period, while reported over-estimation shrank from roughly 20-30% / 30-50% (Lim et al., 1995) over 15-25% (Thesing et al., 2002) to about 10% (Souyris et al., 2005). The explosion of penalties has been compensated by the improvement of the analyses!
Timing Accidents and Penalties
Timing accident - a cause for an increase of the execution time of an instruction.
Timing penalty - the associated increase.
• Types of timing accidents:
- Cache misses
- Pipeline stalls
- Branch mispredictions
- Bus collisions
- Memory refresh of DRAM
- TLB misses
History-Sensitivity of Instruction Execution Time
The contribution of the execution of an instruction to a program's execution time depends on the execution state (e.g., cache hit vs. cache miss), which in turn depends on the execution history.
• Needed: an invariant about the set of execution states produced by all executions reaching a program point,
• e.g., "this load instruction is always a cache hit".
Deriving Run-Time Guarantees
• Our method and tool, aiT, derives safety properties from these invariants: certain timing accidents (e.g., a cache miss at a particular load) will never happen.
• The more accidents excluded, the lower the upper bound.
Figure: with no invariant, the bound must cover the full variance of execution times, from the fastest to the slowest run.
Tool Architecture
Abstract interpretations:
• Value analysis - determines enclosing intervals for the sets of values in registers and local variables, used for determining addresses.
• Loop-bound analysis - determines loop bounds.
• Control-flow analysis - determines infeasible paths.
These are followed by micro-architectural analysis (abstract interpretation) and path analysis (integer linear programming).
Tool Architecture
Next: the micro-architectural analysis, i.e. abstract interpretation of caches and pipelines, followed by path analysis via integer linear programming.
Overview
• Intro (20 min)
• Analysis
- Caches (20 min)
- Pipelines (25 min)
- Value Analysis (20 min)
- IPET (5 min)
• Conclusion (5 min)
Caches: Small & Fast Memory on Chip
• Bridge speed gap between CPU and RAM
• Caches work well in the average case:
– Programs access data locally (many hits)
– Programs reuse items (instructions, data)
– Access patterns are distributed evenly across the cache
• Cache performance has a strong influence on
system performance!
• The precision of cache analysis has a strong
influence on the degree of over-estimation!
Caches: How they work
The CPU reads/writes at memory address a and sends a request for a to the bus. Let m be the memory block containing a. Cases:
• Hit: block m is in the cache - the request is served in the next cycle.
• Miss: block m is not in the cache - m is transferred from main memory to the cache and may replace some block; the request for a is served as soon as possible while the transfer still continues.
• The replacement strategy (LRU, PLRU, FIFO, ...) determines which line to replace in a full cache (set).
LRU Strategy
• Each cache set has its own replacement logic =>
Cache sets are independent: Everything explained
in terms of one set
• LRU-Replacement Strategy:
– Replace the block that has been Least Recently Used
– Modeled by Ages
• Example: 4-way set associative cache, ages 0 (youngest) to 3 (oldest):
Initial contents:   [m0, m1, m2, m3]
Access m4 (miss):   [m4, m0, m1, m2]
Access m1 (hit):    [m1, m4, m0, m2]
Access m5 (miss):   [m5, m1, m4, m0]
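To make the aging concrete, here is a minimal Python sketch (my illustration, not aiT code) of a single k-way LRU cache set; the block names and the access sequence are just the example above.

# Minimal sketch of one k-way LRU cache set.
class LRUSet:
    def __init__(self, k):
        self.k = k
        self.blocks = []          # index = age: 0 (youngest) ... k-1 (oldest)

    def access(self, m):
        """Access block m; return True on hit, False on miss."""
        hit = m in self.blocks
        if hit:
            self.blocks.remove(m) # rejuvenate m
        self.blocks.insert(0, m)  # m becomes the youngest block
        del self.blocks[self.k:]  # evict the oldest block if the set overflows
        return hit

s = LRUSet(4)
for m in ["m3", "m2", "m1", "m0"]:
    s.access(m)                   # contents now [m0, m1, m2, m3]
for m in ["m4", "m1", "m5"]:
    print(m, "hit" if s.access(m) else "miss", s.blocks)
# m4 miss ['m4', 'm0', 'm1', 'm2']
# m1 hit  ['m1', 'm4', 'm0', 'm2']
# m5 miss ['m5', 'm1', 'm4', 'm0']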
Cache Analysis
How to statically precompute cache contents:
• Must analysis: for each program point (and context), find out which blocks are definitely in the cache => prediction of cache hits.
• May analysis: for each program point (and context), find out which blocks may be in the cache; the complement says what is definitely not in the cache => prediction of cache misses.
(Must) Cache Analysis
• Consider one instruction in the program, say "load a".
• There may be many paths leading to this instruction.
• How can we compute whether a will always be in the cache, independently of which path execution takes?
Question: is the access to a always a cache hit?
Cache Analysis - Over-approximation of the Collecting Semantics
The collecting semantics collects, at each program point, all states that any execution may encounter there. It determines the "cache" semantics, which reduces the program to the sequence of memory references and yields, for each program point, the set of cache states that may occur there. The abstract semantics computes an abstract cache state for each program point; its concretization (conc) yields a superset of the cache states of the collecting semantics.
Cache Analysis - how does it work?
For each program point, compute an abstract cache state. It represents the set of memory blocks guaranteed to be in the cache each time execution reaches this program point.
Example (4-way set): abstract cache with line contents {x}, {a, b} (remaining lines empty), ages running from 0 (youngest) to 3 (oldest). Interpretation: the set of all concrete cache states in which x, a, and b occur with
• Age(x) at most 1,
• Age(a), Age(b) at most 2,
i.e., x occupies one of the two youngest lines, and a and b are among the three youngest lines.
Cache information contains
1. only memory blocks guaranteed to be in the cache,
2. associated with their maximal age.
Abstract Domain: Must Cache - Concretization
Sets of concrete caches are described by an abstract cache: the abstract cache [{}, {}, {z, x}, {s}] concretizes to all concrete caches in which z and x occur with age at most 2 and s with age at most 3; the remaining lines are filled up with any other blocks. Abstraction (alpha) and concretization (gamma) form a Galois connection.
Abstract Domain: Must Cache - Abstraction
Representing sets of concrete caches by their description: abstracting, e.g., the concrete caches
[z, s, x, a], [z, s, x, t], [s, z, x, t], [x, s, z, t], [z, t, x, s]
yields the abstract cache [{}, {}, {z, x}, {s}]: only z, x, and s occur in all of them, with maximal ages 2, 2, and 3, respectively.
(Must) Cache analysis of a memory access
with LRU replacement
Concrete transfer function (cache): concrete cache [x, a, b, y]; after the access to a: [a, x, b, y].
Abstract transfer function (analysis): abstract cache [{x}, {a, b}, {}, {}]; after the access to a: [{a}, {b, x}, {}, {}].
After the access to a, a is the youngest memory block in the cache, and we must assume that x has aged.
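A minimal Python sketch (my illustration, not the aiT implementation) of this abstract must-update for a k-way LRU set; an abstract cache is a list of sets, indexed by maximal age.

# Must-cache update for LRU (sketch). must[i] = set of blocks with maximal age i.
def must_update(must, b):
    k = len(must)
    # maximal age of b before the access (k means "not guaranteed in cache")
    old_age = next((i for i, line in enumerate(must) if b in line), k)
    new = [set() for _ in range(k)]
    new[0] = {b}                            # b becomes the youngest block
    for i, line in enumerate(must):
        for x in line - {b}:
            j = i + 1 if i < old_age else i # blocks younger than b age by one
            if j < k:                       # blocks aging past k-1 are no longer guaranteed
                new[j].add(x)
    return new

before = [{"x"}, {"a", "b"}, set(), set()]
print(must_update(before, "a"))             # [{'a'}, {'x', 'b'}, set(), set()]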
What happens when control-paths merge?
On one incoming path we can guarantee the content [{c}, {e}, {a}, {d}]; on the other, [{a}, {}, {c, f}, {d}]. Which content can we guarantee after the merge? Combine the cache information at each control-flow merge point by "intersection + maximal age": the result is [{}, {}, {a, c}, {d}].
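The corresponding join, sketched in Python (again my own illustration): keep only blocks guaranteed on both paths, each at its maximal age in either abstract cache.

# Must-cache join: intersection of contents + maximal age (sketch).
def must_join(m1, m2):
    k = len(m1)
    def age(must, b):
        return next(i for i, line in enumerate(must) if b in line)
    common = set().union(*m1) & set().union(*m2)   # blocks guaranteed on both paths
    joined = [set() for _ in range(k)]
    for b in common:
        joined[max(age(m1, b), age(m2, b))].add(b) # keep the maximal (older) age
    return joined

p1 = [{"c"}, {"e"}, {"a"}, {"d"}]
p2 = [{"a"}, set(), {"c", "f"}, {"d"}]
print(must_join(p1, p2))   # [set(), set(), {'a', 'c'}, {'d'}]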
Cache Analysis - Over-approximation of the Collecting Semantics (continued)
As before, the abstract semantics over-approximates the set of cache states at each program point; the abstract cache states are computed by a fixpoint computation.
Predictability of Caches - Speed of Recovery from Uncertainty
Example sequence: write z; read y; read x; mul x, y.
Sources of uncertainty:
1. Initial cache contents unknown.
2. Information must be combined at control-flow merges.
3. The address of x cannot be resolved.
4. Imprecise analysis domain / update functions.
The analysis needs to recover information: predictability = speed of recovery.
J. Reineke et al.: Predictability of Cache Replacement Policies, Real-Time Systems, Springer, 2007
The Influence of the Replacement Strategy (an excursion into the area of predictability)
• LRU: the information gained through an access to m is that m is in the cache (as the youngest block), plus the aging of a prefix of unknown length of the cache contents.
• FIFO: the access only tells us that m is in the cache; at least the k-1 youngest blocks are still in the cache.
Metrics of Predictability
Figure: for the access sequence a b c d e f g h, different replacement policies end up with different contents, e.g. [f, e, d], [f, e, c], [h, g, f], [f, d, c]. The metrics evict (accesses needed until all prior contents are evicted) and fill (accesses needed until the cache contents are completely known) measure the speed of recovery; two variants are considered: M (misses only) and HM (hits and misses). Results: tight bounds; generic examples prove tightness.
Wrap-up
• Cache analysis
- The problem is solved for LRU.
- ... but other strategies exist; they are provably less predictable, and their analyses are on-going work.
• Design recommendation: use LRU caches - the most predictable replacement strategy.
Tool Architecture
Next: pipeline analysis (abstract interpretation), before path analysis by integer linear programming.
Figure: for each basic block (e.g., one containing the instruction move.1 (A0,D0),D1), the micro-architectural analysis derives a bound on its execution time. A basic block is a sequence of instructions; the time bound covers the entire block. Later, the local bounds are combined into a global bound by ILP.
What really happens
• Pipeline: overlapped execution of instructions - while instruction 1 executes, instruction 2 is decoded and instruction 3 is fetched (Fetch, Decode, Execute, WB).
• Timing model: a finite state machine. Its states comprise the pipeline, buffers, and caches; each transition from a state s to a state t corresponds to one cycle.
• Timing behavior is influenced by
- the availability of resources (bus, cache),
- speculation, branch prediction, ...
Time Bounds for Basic Blocks
Figure: starting from the set of incoming states (s1, s2, s3), the analysis executes the basic block on the timing-model FSM cycle by cycle, following all successor states (s4, s5, ..., s12). The time bound for the block is the length of the longest path through this state graph.
Figure (continued): the states reached at the end of the block become the incoming states of the successor basic blocks (e.g., the block containing move.1 (A0,D0),D1). Propagating these state sets along the control-flow graph is a fixpoint computation.
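A schematic Python sketch of this cycle-by-cycle traversal (a simplification of mine, not the aiT pipeline analysis): step is an assumed abstract cycle function returning all possible successor states, and retired tells when the block has left the pipeline. The returned outgoing states would feed the successor blocks.

# Bound the execution time of a basic block by breadth-first, cycle-by-cycle
# traversal of an abstract timing model (sketch). 'step' and 'retired' are
# assumed to be provided by the processor model.
def block_bound(incoming_states, step, retired, max_cycles=1000):
    frontier = set(incoming_states)   # states still executing the block
    outgoing = set()                  # states in which the block has retired
    cycles = 0
    while frontier:
        if cycles >= max_cycles:
            raise RuntimeError("no bound found within max_cycles")
        cycles += 1
        nxt = set()
        for s in frontier:
            for t in step(s):         # non-deterministic: several successors
                (outgoing if retired(t) else nxt).add(t)
        frontier = nxt
    # 'cycles' is the length of the longest path; 'outgoing' feeds the successors
    return cycles, outgoing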
Challenge: Imprecise Information
• Imprecision due to abstraction:
- Value analysis computes intervals.
- Cache analysis may answer "don't know" (hit or miss).
• State traversal becomes non-deterministic.
Example latency table:
Memory area                 latency (cycles)
[0x00000, 0x00000fff]       2
[0x01000, 0x0000ffff]       1
[0x10000, 0x0007ffff]       10
[0x80000, 0xffffffff]       4
A "memory access to [0x7fff, 0xffffffff]" turns the single state { s1 } into several successor states { s1', s2', s3' }.
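A small Python sketch of how such an imprecise access fans out into several successors; the latency table and the address interval are taken from the slide above, and the state is reduced to an elapsed-cycles counter purely for illustration.

# An access whose target is only known as an interval may hit several memory
# areas, each with its own latency => several successor states (sketch).
LATENCY_TABLE = [                    # (lo, hi, latency in cycles)
    (0x00000, 0x00000fff, 2),
    (0x01000, 0x0000ffff, 1),
    (0x10000, 0x0007ffff, 10),
    (0x80000, 0xffffffff, 4),
]

def access_successors(state_cycles, addr_lo, addr_hi):
    """One successor (here: elapsed cycles) per memory area the access may hit."""
    succs = set()
    for lo, hi, lat in LATENCY_TABLE:
        if addr_lo <= hi and addr_hi >= lo:   # address interval overlaps this area
            succs.add(state_cycles + lat)
        # in the real analysis a successor is a full pipeline state, not a number
    return succs

print(access_successors(0, 0x7fff, 0xffffffff))   # {1, 10, 4} -> three successors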
Timing Anomalies
• Counterintuitive timing behaviour: a cache miss is not always the worst case [Lundqvist/Stenström 1999].
Figure: with a cache hit for A, the branch condition is evaluated early and the processor speculatively prefetches C, which later causes a miss for C ("miss due to prefetch"). With a cache miss for A, the mispredicted branch is delayed, which limits the cache damage.
Timing Anomalies
• Counterintuitive timing behaviour: the local worst case does not entail the global worst case.
Figure: a scheduling example with two resources and tasks A to E, where C becomes ready only later. If A finishes early, B is scheduled next and blocks the critical sequence C, D, E; the locally faster execution of A thus leads to the longer overall schedule.
Timing Anomalies and Domino Effects
• Timing anomalies inhibit local worst-case assumptions.
• Domino effects: the difference in execution time between two hardware states cannot be bounded by a constant. Examples:
- the PPC 755 pipeline [Schneider 2003],
- PLRU cache replacement [Berg 2006].
...
What is the problem?
• It makes timing analysis more difficult:
...
The analysis has to follow all possibilities
-> exponential blow-up
• Pipeline analysis must consider all
possibilities.
...
– Explicit state representation doesn’t scale well.
– Analysis can become infeasible [Thesing 2004].
Challenge: State Explosion
• Many possible initial hardware states.
• Non-determinism in the timing model.
Figure: the set of initial states fans out during state traversal.
What to do?
• Most technologies presented in this talk are well established.
• Now: some discussion of on-going work - how to tackle state explosion?
• On-going research: symbolic state traversal for timing analysis (research within AVACS R2).
Solution: Symbolic Representation of System and its States
A finite state machine (FSM) is a triple M = (Q, I, T), where
• Q is the set of states,
• I is the set of input values,
• T ⊆ Q × I × Q is the transition relation.
A subset A ⊆ Q is described by its characteristic function A: Q -> {0,1} with A(x) = 1 iff x ∈ A. Likewise, the transition relation has the characteristic function T: Q × I × Q -> {0,1} with T(x, i, y) = 1 iff (x, i, y) ∈ T.
Figure: an explicit FSM with states A (encoded 00), B (10), C (01) and inputs 0, 1; its symbolic transition relation over the state bits s1, s2; and the OBDD for the characteristic function of state C. Characteristic functions can be represented efficiently as ordered binary decision diagrams (OBDDs).
OBDD Performance
• Explicit: performance depends on the number of states.
• Symbolic: performance depends on the BDD size; exploiting redundancy gives a compact representation.
- The size is worst-case exponential in the number of variables.
- Finding an optimal variable ordering is NP-hard.
Symbolic Timing Analysis
• Store state sets implicitly/symbolically in BDDs, as in symbolic model checking [McMillan's PhD thesis].
• Explore states in BFS order, layer by layer.
• This exploits redundancies in the large state space: many of the states are equal up to a few bits. Transitions and state sets are stored implicitly in BDDs.
Image Computation
• Computes all states that are reachable via the transition relation in one step from a set of source states.
- The core computation of symbolic model checkers.
- The basic operation for implementing state traversal.
• Efficient implementations are available [Ranjan et al. 1995].
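For intuition, an explicit-set Python sketch of image computation and layered state traversal (the real analysis performs the same operation on BDDs via the relational product); the tiny transition relation here is made up for illustration.

# Image computation over an explicit transition relation (sketch).
# T is a set of triples (x, i, y): from state x under input i to state y.
def image(T, sources):
    return {y for (x, i, y) in T if x in sources}

T = {("A", 0, "A"), ("A", 1, "B"), ("B", 0, "C"), ("B", 1, "C"), ("C", 0, "A")}
frontier, reached = {"A"}, {"A"}
while frontier:                       # BFS state traversal, layer by layer
    frontier = image(T, frontier) - reached
    reached |= frontier
print(reached)                        # {'A', 'B', 'C'}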
Transition Relation
• The transition relation is built as a conjunction T of
- the transition relation for the program-independent abstract processor model, and
- the transition relation for program-analysis information:
  • control-flow information, e.g. branch targets,
  • value-analysis information (register contents), dependencies, ...
Optimizations
Goal: reduce the number of state bits required in the encoding.
• 32-bit addresses are too large -> enumerate instructions compactly.
• Uniquely identify all instructions in all contexts for building the program transition relation TL.
• This reduces the number of state bits and the number of conjoined relations in TL.
• Yet: the program size still affects every OBDD operation!
Decomposition I
• The scope of program information required for analyzing a basic block is limited.
- Enumerate only the instructions within that scope.
- Decompose TL into relations for the individual basic blocks.
- Translate between blocks using image computation.
- The program size no longer affects the OBDD size.
Decomposition II
Figure: translation between the instruction enumerations of two blocks, e.g. Translate(b3, b5) = { (5,1), (6,2), (7,5), (8,6) }.
Infineon TriCore
• 2 major pipelines, 1 minor pipeline
• 16-byte instruction prefetch buffer
• static branch prediction
• used in the automotive industry
• a commercial, explicit-state model is available (aiT); it can be reduced to the pipeline core.
Implementations
• Explicit-state implementation:
- full model including peripheral modules: 696 bytes per state,
- pipeline core representation alone: 500 bytes per state.
• Symbolic implementation:
- implements most of the pipeline core: ~240 bits per state,
- carefully tuned to minimize the required state bits,
- the loop pipeline still has a simplified representation.
Symbolic vs. Explicit Analysis
Figure (bar chart): analysis runtime in seconds for the dhrystone benchmark, using the basic TriCore model and different latencies for memory accesses (7, 15, 31, 63, and 127 splits per memory access). The explicit-state analysis quickly becomes very expensive (runtimes in the hundreds to thousands of seconds, up to 4885 s), while the symbolic analysis remains far cheaper (starting at a few seconds). Partitioned transition relations, the IWLS95 image engine, and dynamic reordering were used; no decomposition.
Summary
• The symbolic state implementation scales much better:
- there is a lot of redundancy between the abstract states considered concurrently by the analysis,
- frequently needed operations like set union and checking equality of sets have efficient OBDD implementations.
• OBDD performance depends on the size of the abstract models.
- Models require careful tuning.
- The approach scales very well with growing program size (decomposition).
EMSOFT 2009, joint work with Stephan Wilhelm
Ongoing Work: Pipeline-Cache Integration I
• The analyses must interact during the analysis run (cyclic dependency).
• The relation between pipeline and cache states affects precision.
Figure: abstract pipeline states P1, ..., Pn are related to abstract cache states C1, ..., Cm.
On-going Work: Pipeline-Cache Integration II
Figure: in each update step, the pipeline analysis (symbolic representation, states {P1, P2, ..., Pn}) sends the cache accesses, e.g. { A1:0x4, A3:0x10, ... }, to the cache analysis (compact, explicit representation, states {C1, C2, ..., Cm}); the cache analysis updates its states to {C1', C2', ..., Cm'} and returns a classification, e.g. { A1:hit, A3:miss, ... }; the pipeline analysis then updates its states to {P1', P2', ..., Pn'}.
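A minimal Python sketch of this query/response interface (purely illustrative; the class and method names and the placeholder classification are mine, not the tool's):

# Sketch of the per-cycle pipeline/cache analysis interaction.
class CacheAnalysis:
    def classify_and_update(self, accesses):
        """accesses: {access_id: address}. Update the abstract cache states and
        return {access_id: 'hit' | 'miss' | 'unknown'}."""
        return {acc: "unknown" for acc in accesses}   # placeholder classification

class PipelineAnalysis:
    def __init__(self, cache_analysis):
        self.cache = cache_analysis

    def cycle(self, pipeline_states):
        accesses = {"A1": 0x4, "A3": 0x10}            # accesses issued this cycle (example)
        classification = self.cache.classify_and_update(accesses)
        # advance all abstract pipeline states according to the hit/miss answers
        return pipeline_states, classification

p = PipelineAnalysis(CacheAnalysis())
print(p.cycle({"P1", "P2"})[1])                       # {'A1': 'unknown', 'A3': 'unknown'}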
Tool Architecture
Next: value analysis (abstract interpretation), then path analysis by integer linear programming.
Static Analysis in Timing Analysis
Invariants about the values of variables:
• Constant propagation (CP)
- propagates statically available information about the values of variables, e.g. "variable x always has value 5 at program point p",
- the traditional static analysis in compilers.
• Interval analysis (IA), Cousot & Cousot 1977
- determines intervals enclosing all potential values of variables, e.g. "variable x always has a value between -2 and +∞ at program point p",
- used, e.g., to exclude index-out-of-bounds errors.
Simple imperative language
• a finite set of program variables: Var
• arithmetic expressions: +, -, *, /
• statements:
- assignment: x := e,
- conditions e, with outgoing edges labeled true(e) and false(e)
• programs are represented by their control-flow graph; statements/conditions/instructions are associated with the edges (e.g. x := 1, x := x + 1, y := x + 100, true(x < y), false(x < y)).
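A possible Python encoding of such a control-flow graph (my own representation, for illustration): nodes are program points, and each edge carries a statement or condition label.

# A control-flow graph with labeled edges (sketch).
# Edge labels: ("assign", var, expr), ("true", expr) or ("false", expr).
cfg = {
    # program point -> list of (successor point, edge label)
    0: [(1, ("assign", "x", "1"))],
    1: [(2, ("true", "x < y")), (3, ("false", "x < y"))],
    2: [(1, ("assign", "x", "x + 1"))],
    3: [(4, ("assign", "y", "x + 100"))],
    4: [],
}

for point, edges in cfg.items():
    for succ, label in edges:
        print(f"{point} --{label}--> {succ}")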
Collecting Semantics
• A state binds variables to values: States = Var -> Z.
• Collecting semantics:
- for each program point, compute the set of states reached during any execution - a (possibly infinite) subset of States,
- in general non-computable,
- so we need (efficiently) computable static approximations!
What should be the value of x?
Example: if c then x := 3 else x := 1. At the join point:
CS: {1, 3}
CP: ⊤ (don't know)
IA: [1, 3]
What should be the value?
Example: x := 3; while true do x := x + 2. At the program point after x := x + 2:
CS: {5, 7, 9, ...}
CP: ⊤ (don't know)
IA: [5, +∞)
Two Different Abstractions of the Same Concrete Semantics
• Concrete semantics: States = Var -> Z.
• Both abstractions are domains of functions binding variables to "values": States# = Var -> V,
• but with different domains of "values" V:
- CP: Z plus a value ⊤ representing "unknown",
- IA: intervals over Z, including -∞ and +∞.
• Both need arithmetic on V; define [e]# to describe the new abstract value of x computed by [x := e]#.
CP vs. IA Domain
Both abstract the concrete semantics States = Var -> Z, with different value domains:
• CP: Var -> Z⊤ (where ⊤ means "don't know"), lifted with ⊥ for "not reached": (Var -> Z⊤)⊥.
• IA: Var -> (Z ∪ {-∞}) × (Z ∪ {+∞}), lifted with ⊥ for "not reached": (Var -> (Z ∪ {-∞}) × (Z ∪ {+∞}))⊥.
Figure: the CP value lattice is flat, with ⊤ ("don't know") above all integers ..., -2, -1, 0, 1, 2, ...; the IA value lattice orders intervals by inclusion, e.g. [0,0] ⊑ [0,1] ⊑ [-1,1] ⊑ [-2,2] ⊑ ... ⊑ (-∞,+∞).
Meaning Function / Concretization
The concretization γ maps an abstract state (in Dabs) to the set of concrete states (in Dconc) it describes, e.g. in CP:
γ({x=1, y=⊤}) = { {x=1, y=a} | a ∈ Z }
Abstraction
The abstraction α maps a set of concrete states (an element of Dconc) to its best description in the abstract domain Dabs.
Abstraction (example, CP)
Both domains are lattices. The abstraction α maps, for instance:
α({ {x=1, y=2} }) = {x=1, y=2}
α({ {x=1, y=2}, {x=1, y=3} }) = {x=1, y=⊤}
α({ {x=1, y=2}, {x=1, y=3}, {x=2, y=2} }) = {x=⊤, y=⊤}
Order ⊑
• Example: {x=1, y=⊤, z=3} ⊑ {x=1, y=⊤, z=⊤}, since
γ({x=1, y=⊤, z=3}) = { {x=1, y=a, z=3} | a ∈ Z } ⊆ { {x=1, y=a, z=b} | a, b ∈ Z } = γ({x=1, y=⊤, z=⊤}).
• s ⊑ t iff s and t agree on the values of the variables they both know, but s may know some more values, on which t has value ⊤.
CP: Properties of the Order
• The CP domain has infinite size - there are infinitely many values.
• ... however, it is of finite height - there are no infinite ascending chains:
- finitely many program variables,
- one can only go up in the lattice by changing a variable's value to ⊤.
(Value lattice: ⊤ ("don't know") above ..., -2, -1, 0, 1, 2, ...)
CP: Expression Evaluation
• Abstract evaluation of a binary operator ⊙:
[a ⊙ b]#(s) = ⊤                   if [a]#(s) = ⊤ or [b]#(s) = ⊤
[a ⊙ b]#(s) = [a]#(s) ⊙ [b]#(s)   otherwise
• Examples:
- 1 + 2 = 3,
- 1 + ⊤ = ⊤
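A small Python sketch of this abstract evaluation for CP (my illustration; TOP stands for ⊤):

# Constant-propagation abstract evaluation of a binary operator (sketch).
import operator

TOP = "⊤"   # "don't know"

def cp_binop(op, a, b):
    """Abstract evaluation: ⊤ is absorbing, otherwise compute concretely."""
    if a == TOP or b == TOP:
        return TOP
    return op(a, b)

print(cp_binop(operator.add, 1, 2))    # 3
print(cp_binop(operator.add, 1, TOP))  # ⊤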
Abstract Semantics [.]#
• [x := e]#(s) = s[x -> [e]#(s)]
• [true(e)]#(s) = ⊥ if [e]#(s) = ff, and s otherwise
• [false(e)]#(s) = ⊥ if [e]#(s) = tt, and s otherwise
More Precise Value Domain - Intervals
Figure: the interval lattice, ordered by inclusion, from the singletons [-2,-2], [-1,-1], [0,0], [1,1], [2,2] up through [-1,0], [0,1], [-2,2], [0,+∞), (-∞,0], ... to (-∞,+∞). In contrast to CP, this lattice has infinite height, e.g. [0,0] ⊑ [0,1] ⊑ [0,2] ⊑ ...
IA Domain
• Abstract domain: (Var -> (Z ∪ {-∞}) × (Z ∪ {+∞}))⊥
• ⊥ in the function domain stands for "not reached".
(An interval such as [-2, 2] describes the integers between -2 and 2 on the number line.)
Value Domain: Intervals - Abstraction
α(M) = [inf M, sup M], e.g.
α({1, 3, 5, 7}) = [1, 7]
α({1, 2, 3, ...}) = [1, +∞)
The abstract order ⊑ corresponds to set inclusion ⊆ on the concrete side.
Value Domain: Intervals - Concretization
γ([1, 7]) = { m ∈ Z | 1 ≤ m ≤ 7 } = {1, 2, 3, 4, 5, 6, 7}
IA: Expression Evaluation
• Addition: [l1, u1] + [l2, u2] = [l1 + l2, u1 + u2], where -∞ + anything = -∞ and +∞ + anything = +∞.
• Unary minus: -[l1, u1] = [-u1, -l1]
• Multiplication: ...
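A Python sketch of these interval operations (my illustration; math.inf plays the role of ±∞, and the 0 * ∞ corner case is ignored):

# Interval arithmetic for the IA value domain (sketch).
from math import inf

def iadd(i1, i2):
    (l1, u1), (l2, u2) = i1, i2
    return (l1 + l2, u1 + u2)               # -inf/+inf absorb as required

def ineg(i):
    l, u = i
    return (-u, -l)

def imul(i1, i2):
    (l1, u1), (l2, u2) = i1, i2
    products = [l1 * l2, l1 * u2, u1 * l2, u1 * u2]
    return (min(products), max(products))   # sign cases handled by min/max

print(iadd((1, 7), (-2, inf)))   # (-1, inf)
print(ineg((1, 7)))              # (-7, -1)
print(imul((-2, 3), (4, 5)))     # (-10, 15)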
Abstract Semantics [.]#
• [x := e]#(s) = s[x -> [e]#(s)]
• [true(e)]#(s) = ⊥ if [e]#(s) = ff, and s otherwise
• [false(e)]#(s) = ⊥ if [e]#(s) = tt, and s otherwise
Interval Analysis in Timing Analysis
• Data-cache analysis needs effective addresses at analysis time to know where accesses go.
• Effective addresses are approximately precomputed by an interval analysis for the values in registers and local variables.
• "Exact" intervals - singleton intervals.
• "Good" intervals - the addresses fit into fewer than 16 cache lines.
Value Analysis (Airbus Benchmark)
Task  Unreached  Exact  Good  Unknown  Time [s]
 1       8%       86%    4%     2%       47
 2       8%       86%    4%     2%       17
 3       7%       86%    4%     3%       22
 4      13%       79%    5%     3%       16
 5       6%       88%    4%     2%       36
 6       9%       84%    5%     2%       16
 7       9%       84%    5%     2%       26
 8      10%       83%    4%     3%       14
 9       6%       89%    3%     2%       34
10      10%       84%    4%     2%       17
11       7%       85%    5%     3%       22
12      10%       82%    5%     3%       14
1 GHz Athlon, memory usage <= 20 MB
Tool Architecture
Finally: path analysis by integer linear programming.
Path Analysis by Integer Linear Programming (ILP)
• Execution time of a program =
  Σ over all basic blocks b of  Execution_Time(b) × Execution_Count(b)
• An ILP solver maximizes this objective function to determine the WCET bound.
• The program structure is described by linear constraints
- automatically created from the CFG structure,
- user-provided loop/recursion bounds,
- arbitrary additional linear constraints to exclude infeasible paths.
Example (simplified constraints)
Program:
  if a then b
  elseif c then d
  else e
  endif
  f
Block execution times: a: 4t, b: 10t, c: 3t, d: 2t, e: 6t, f: 5t.
ILP:
  max: 4 xa + 10 xb + 3 xc + 2 xd + 6 xe + 5 xf
  where
    xa = 1
    xa = xb + xc
    xc = xd + xe
    xf = xb + xd + xe
Solution: value of the objective function: 19, with xa = 1, xb = 1, xc = 0, xd = 0, xe = 0, xf = 1.
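The same toy IPET problem, sketched in Python (assuming the PuLP library is available; this is my illustration, not the tool's ILP back end):

# IPET example as an integer linear program (sketch, using PuLP).
import pulp

times = {"a": 4, "b": 10, "c": 3, "d": 2, "e": 6, "f": 5}
x = {b: pulp.LpVariable(f"x_{b}", lowBound=0, cat="Integer") for b in times}

prob = pulp.LpProblem("wcet", pulp.LpMaximize)
prob += pulp.lpSum(times[b] * x[b] for b in times)   # objective: total execution time
prob += x["a"] == 1                                  # the entry block runs once
prob += x["a"] == x["b"] + x["c"]                    # structural (flow) constraints
prob += x["c"] == x["d"] + x["e"]
prob += x["f"] == x["b"] + x["d"] + x["e"]

prob.solve()
print(pulp.value(prob.objective))                    # 19.0
print({b: int(x[b].value()) for b in times})         # xa=1, xb=1, xc=0, xd=0, xe=0, xf=1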
Conclusion
• Static Timing Analysis
– Execution time bounds for embedded software
• Ingredients of successful approach
– Decomposition of the problem
• Microarchitectural analysis
• Static analyses: Value analysis
• Implicit path enumeration
– Heavy use of abstractions
• Cache
• Pipeline
Relevant Publications (from my group)
• C. Ferdinand et al.: Reliable and Precise WCET Determination of a Real-Life Processor, EMSOFT 2001
• R. Heckmann et al.: The Influence of Processor Architecture on the Design and the Results of WCET Tools, IEEE Proc. on Real-Time Systems, July 2003
• M. Langenbach et al.: Pipeline Modeling for Timing Analysis, SAS 2002
• St. Thesing et al.: An Abstract Interpretation-based Timing Validation of Hard Real-Time Avionics Software, IPDS 2003
• R. Wilhelm: AI + ILP is good for WCET, MC is not, nor ILP alone, VMCAI 2004
• L. Thiele, R. Wilhelm: Design for Timing Predictability, 25th Anniversary edition of the Kluwer Journal Real-Time Systems, Dec. 2004
• R. Wilhelm: Determination of Execution-Time Bounds, CRC Handbook on Embedded Systems, 2005
• J. Reineke et al.: Predictability of Cache Replacement Policies, Real-Time Systems, Springer, 2007
• R. Wilhelm et al.: The worst-case execution-time problem - overview of methods and survey of tools, ACM Transactions on Embedded Computing Systems (TECS), Volume 7, Issue 3, April 2008
• R. Wilhelm et al.: Memory Hierarchies, Pipelines, and Buses for Future Architectures in Time-critical Embedded Systems, IEEE TCAD, July 2009