Habilitation à Diriger des Recherches
de l'INSTITUT NATIONAL POLYTECHNIQUE DE TOULOUSE

Xavier Thirioux
September 19, 2016

Verifying Embedded Systems

Rapporteurs (reviewers):
  Matthieu Martel — Professor at Université de Perpignan Via Domitia, LAMPS
  Marc Pouzet — Professor at Université Pierre et Marie Curie; ENS, Computer Science Department, PARKAS team
  Sylvain Conchon — Professor at Université Paris-Sud, LRI, Toccata team

Examinateurs (examiners):
  Virginie Wiels — Director of DTIM, ONERA Toulouse
  Éric Féron — Professor of Aerospace Engineering at Georgia Tech, USA
  Didier Henrion — Research Director at LAAS-CNRS, Université de Toulouse
  Robert De Simone — Research Director, INRIA Sophia-Antipolis, AOSTE team
  Philippe Queinnec — Professor at INPT/IRIT, Université de Toulouse (INPT correspondent)
Contents

1 Who by null pointer, who by out-of-buffer   7
  1.1 Formal Verification of Critical Software   8
  1.2 Dataflow languages   8
  1.3 Trusting a Compiler   9
    1.3.1 Related Work   9
    1.3.2 Compiling Lustre   10
    1.3.3 Testing-based Approach   11
    1.3.4 Translation Validation   11
    1.3.5 Correct-by-construction   11
  1.4 Trusting the Control Law   12
    1.4.1 Contribution   13
    1.4.2 Related Works   14

I Validation of code generators   17

2 Our Code Generation Framework: lustrec   19
  2.1 Introduction to Lustre   19
  2.2 Structure of a Lustre development project   21
    2.2.1 Interface files   21
    2.2.2 Source files   21
  2.3 Dataflow Examples   22
  2.4 Stateflow Examples   24
  2.5 Compilation Workflow   24
    2.5.1 Stateflow Unfolding   28
    2.5.2 Normalization   34
    2.5.3 Inlining   38
    2.5.4 Typing   38
    2.5.5 Clocking   40
    2.5.6 Scheduling   40
    2.5.7 Generation of sequential intermediate code   42
    2.5.8 Code Optimization   47
    2.5.9 Targetting C Code   51
    2.5.10 Targetting Horn Clauses   51
  2.6 Perspectives   56

3 Testing-based Approach   59
  3.1 Test-suite Generation for Compiler Validation   61
    3.1.1 MC/DC as Conditions over Traces   62
  3.2 Reinforcing Test Suite via Mutation Testing   63
  3.3 Compiler Validation via ∆-neighborhood   65
  3.4 A Validating Lustre Compiler   67
    3.4.1 Experimental evaluation   68
  3.5 Perspectives   69

4 Translation Validation   71
  4.1 Introduction   71
  4.2 A Denotational Reference Semantics   72
    4.2.1 Lustre Operators in Coq   74
    4.2.2 Lustre Nodes in Coq   74
    4.2.3 A Small Example   76
  4.3 Correction of Operational Semantics   79
    4.3.1 Operational Semantics in Coq   79
    4.3.2 A Small Example – Continued   80
  4.4 Correction of Code   80
    4.4.1 From Horn Clauses to ACSL Annotations   83
    4.4.2 Memory Representation   83
    4.4.3 More Simulation Relations   84
    4.4.4 Closed Formulation and Code Optimization   87
    4.4.5 A Small Example – Final   88
  4.5 Compilation of Synchronous Observers   89
    4.5.1 Contract verification via k-induction   94
    4.5.2 Synchronous Observers as Code Contracts   95
    4.5.3 Case study: The NASA Transport Class Model   96
  4.6 Synthesis of Modular Invariants   98
  4.7 Perspectives   105

II Certified Taylor Expansions   109

5 Type-level arithmetics   111
  5.1 Motivation   111
  5.2 Introduction to GADT   111
  5.3 A Simple Example   112
  5.4 GADT versus Proof Assistants   113
  5.5 Encoding Arithmetics at Type-level   113
    5.5.1 Equality   113
    5.5.2 Natural Numbers   115
    5.5.3 Relation to primitive integers   115
    5.5.4 Arithmetical Operations   116
    5.5.5 Functional Specification   116
    5.5.6 Properties of addition   117
    5.5.7 Properties of multiplication   118
  5.6 Conclusion   119
    5.6.1 User Experience   119
    5.6.2 Proof normalization and Performance   120
    5.6.3 Change Encoding   120
    5.6.4 Memoization   120
    5.6.5 Laziness   120
    5.6.6 Type annotations   121

6 Symmetric Tensor Algebra   123
  6.1 Safety versus Efficiency   123
  6.2 Definition   124
  6.3 Representations of symmetric tensors   124
    6.3.1 Index versus Occurrence   124
    6.3.2 Tensor versus Homogeneous Polynomial   124
    6.3.3 Recursive decomposition   124
  6.4 Data Structure   126
  6.5 Complexity analysis   128
  6.6 Functorial Structure   131
  6.7 Algebraic Operations   132
  6.8 Non-structural Decomposition   135
  6.9 Reduction Operations   137
  6.10 Differential Operations   138
  6.11 Changing Tensor Basis   140
  6.12 Perspectives   141

7 Values and Errors   143
  7.1 Values   143
  7.2 Errors   143
    7.2.1 Introduction   143
  7.3 Error model   144
  7.4 Tensors of Error Model Elements   145
  7.5 Error refinement   146
    7.5.1 Reduction of Value-Error Tensor   147
    7.5.2 Refinement of Value-Error Tensor   149
  7.6 Implementation   150
    7.6.1 Memoization   150
    7.6.2 Zero Functions   150
    7.6.3 OCaml Code   151
  7.7 Perspectives   153
    7.7.1 Coping with Rounding Errors   153
    7.7.2 Using Complex Numbers   153
    7.7.3 Change of Tensor Basis   154
    7.7.4 More Precise Error Models   154

8 Taylor Expansions   157
  8.1 Introduction   157
  8.2 Data Structure   158
  8.3 Causality   158
  8.4 Taylor Model   159
  8.5 Convolution   160
  8.6 Polynomial Operations   162
  8.7 Differential Operations   166
  8.8 Taylor Expansions of Elementary Functions   167
  8.9 Composition of Taylor Expansions   169
  8.10 Error Refinement   172
  8.11 Perspectives   173
    8.11.1 Solving (Partial) Differential Equations   173
    8.11.2 Improved Data Structures   175
    8.11.3 Improved Composition of Taylor series   176
    8.11.4 Other Decompositions   177
Chapter 1

Who by null pointer, who by out-of-buffer
It is nowadays folklore, in the critical systems community, to pinpoint the
difficulty of getting rid of software bugs that may creep under the hoods of
our cars, planes, trains, satellites, rockets, power plants, etc. The occurrence
of faulty behaviors in these embedded systems could endanger the lives of
many and jeopardize the completion of critical missions. Notwithstanding
drastic and time-consuming validation campaigns, the danger still remains.
We aim at specifically attacking the problem of verifying the correct
behavior of the software part of critical systems. More precisely, we propose
and advocate an integrated compilation and verification approach, from high-level specification and source code to low-level C code, focusing on languages
used in the development of such critical systems in industrial contexts.
It is now accepted that formal reasoning might be an interesting path to follow, to supplement or even replace traditional testing-based approaches embedded in a development cycle. Indeed, it is now admitted that classical industrial development processes, even under the heavy constraints of an extensive norm such as DO-178B (for the aerospace industry), have shown their limitations as regards the huge cost of testing campaigns and the relative reliability of their outcomes.
Critical systems usually obey strict rules: simple imperative first-order
pieces of code with only reliable commonplace instructions, bounded static
memory, simple datatypes, low resource requirements. That strongly impacts the sensible choices for the high-level languages used by production
engineers and the way that compilation to low-level code may be achieved.
Furthermore, a compilation process is declared apt only if it ensures traceability, i.e. each source code artifact must structurally correspond to a target
code artifact. This feature usually prevents aggressive optimizations, which
is somehow antagonistic to the low resource rule.
1.1 Formal Verification of Critical Software
One particularly difficult case for software validation is the generation of code. Proving the correctness of a compiler, i.e. that the low-level code preserves the semantics of the higher-level model, amounts to validating a second-order artifact; it is therefore a non-trivial task. Moreover, compilers tend to become ever larger and more complex programs 1. Since they play
a major role in the production of the final executable, proving their soundness
is an important concern. In the safety critical domain, a common workflow
consists in stating and verifying (safety) properties of systems, where proofs
for these properties are usually established at the level of source code or formal models. Source code and/or models are then compiled to executables for
some target platform. This compilation may invalidate already established
verification results and imposes re-validation of the final executable product,
where verification efforts are much more tedious on low-level binary code.
It is thus of utmost importance to have a trustworthy compilation process.
Existing approaches to trusted compilation fall into two categories. Either
they aim at verifying the compiler itself (e.g., [61]), or they aim at validating the compiled output using a verified validator (e.g., [73]). Both exist
in weaker variants, where verification is replaced by testing. There exists
a body of work on generating test suites for verifying the correctness of a
compiler (c.f., [10]). Testing the correctness of a compiled artifact is usually
done by some form of specification-based testing (e.g., [91]).
The more rigorous approaches come at a high cost. Establishing the
correctness of a compiler takes a lot of effort. Developing and verifying a
validator is not less of an effort. Also, to be successful, a shared semantic
basis is needed between the source and the target language. Testing the correctness of a compiler is difficult because the set of potential input programs
to a compiler is potentially infinite and hard to sample in an automated fashion. Specification-based testing, on the other hand, is well understood and
cheap (compared to the other approaches). It will, however, in many cases
not uncover errors in a compiler: test-suites are geared towards finding violations of a specification and not towards uncovering faults in the translation
of a program.
1.2 Dataflow languages

We are interested in dataflow languages, of which block-diagram languages constitute a prominent sub-class. These languages, such as Lustre [17, 41], Scade [26], Signal [1], Esterel [6], Lucid Synchrone [82],
Simulink [88], to name a few, are extensively used in industrial contexts
1. E.g., gcc 4.7.3 proposes 185 configuration parameters for optimizations. See gcc -Q --help=optimizers.
to develop critical embedded systems. They bear similarities with (analog)
electronic circuits and other continuous physical devices dealing with signals.
Instead of manipulating continuous values and a real time variable, dataflow
languages handle sequences of values (i.e. flows) and a discrete time variable.
Mixing discrete and continuous time (i.e. hybrid systems) is possible
and already available in industrial contexts (e.g. Simulink). Yet, giving a
sensible executable semantics to continuous time models alone is very difficult and implementations so far seem to favor efficiency and other practical
concerns over reliability and correctness of the semantics. Hybrid systems
keep all weaknesses from both sides: inherently complex discrete-time and
hardly predictable continuous-time. As we plan to apply validation and verification techniques, the lack of a reliable semantical basis so far compels us
to stick to the simpler case of pure discrete time models. The reach of our
arguments may be mitigated, as indeed some preliminary works exist around
soundly mixing continuous-time and discrete-time [3, 13] and even proposing
a reliable verification process for continuous-time models [29]. But we feel
these works are not mature enough.
In this document, we mainly consider Lustre, a simple synchronous
language that we use as a common denominator to all other members of
the dataflow family. A synchronous dataflow language is simply a language
where output flows are produced at the same rate as input flows, i.e. each
time a value is input, an output value is produced. Such instants are called
ticks of a (logical) clock. Moreover, the output flow can be produced using
only a bounded memory.
To conclude this short review of dataflow languages, we mention a related
paradigm, the so-called Functional Reactive Programming paradigm, which
we don’t consider here, where flows are lifted as first-class citizens of a (most
often functional) programming language. Flows are not synchronized and the
underlying model of computation is much more general and doesn’t guarantee bounded memory. Therefore these languages cannot fit the constraints
of critical systems. A representative member of this class is the language Yampa, a
domain-specific language embedded in Haskell, cf. [35].
1.3 Trusting a Compiler

1.3.1 Related Work
Approaches to trusted compilation can be classified in two respects. First,
approaches are either based (1) on testing or (2) on formal methods (model
checking, automated reasoning, etc.). Second, they can aim (a) at proving
the compiler correct or (b) at proving the compiled artifact correct. This
yields four classes. There exist many examples for class 1a in industry:
There are companies that specialize on test suites for compilers (e.g., the
Plum Hall validation suite for C [79]). There is also an extensive body of
research on generating meaningful test-inputs for compilers in an automated
fashion (e.g., [93]).
When used in a critical context (subject to certification authorities), compilers producing critical code have to be qualified at the same level as the code they produce. For example, in civil aircraft, the DO-178/C [75] and its qualification supplement DO-333 specify the required verification and validation activities. When using testing, tests should be driven by formalized requirements and applied until reaching a given coverage criterion on the compiler source code; typically the Modified Condition/Decision Coverage, aka MC/DC, criterion. The method presented in this document does not target certification.
An example for class 2a is Leroy's work on CompCert [9], a verified compiler from a subset of C to PowerPC assembly code. Except for some preprocessing code and the linker, CompCert is entirely proved in Coq [87]. Most of the development has been made in Coq. However, the most complex parts of the compilation were developed in OCaml [62] and proved a posteriori with Coq. In the same line of thought, some classical code optimizations
are proved correct in [24].
Pnueli’s Compiler Validation Project [80], based on proof-carrying code,
or Tony Hoare’s grand challenge [44], the verifying compiler, fall into class 2b.
Those approaches typically require the compiler to construct a proof, which
can be checked with an external tool after compilation [72]. We slightly depart from this as we carry only specifications from source level to target level,
providing enough information so that proofs can be automatically discharged
by external independent tools.
The standard testing-based approach intersects categories 1a and 1b:
producing tests specific to a given source file and generating test inputs for
compilers. The approach of comparing two similar programs on one compiler is dual to the differential testing of [93], which already identified an important number of bugs or discrepancies in state-of-the-art compilers. Differential testing amounts to comparing the behavior of two separate compilers on the same input. It allows reasoning about the program semantics by comparison, without the need for a formal characterization of the source or target language semantics.
1.3.2 Compiling Lustre
We implemented a prototype compiler from Lustre to C, which is also
able to produce a Horn clause encoding. The techniques involved are mainly inspired by previous works and our prototype serves as an experimental
platform for integrating verification activities in a compiler workflow. An introduction to modern Lustre as well as the global workflow of our compiler
are developed in Chapter 2.
1.3.3 Testing-based Approach
We implemented our own testing-based approach to validate our code
generator. As testing a compiler is notably difficult, we mix several techniques. First, we use so-called differential testing through source program mutations. This technique allows measuring the influence of every code artifact on the target C code. Second, we also use the MC/DC criterion to generate test suites, which is standard in industrial contexts. This approach is detailed in Chapter 3 and has been published in [31].
1.3.4 Translation Validation
In Chapter 4, we study an approach based on translation validation techniques. It consists in producing certificate-carrying code from a Lustre
program, which, if validated, ensures that the code conforms to a more abstract semantics of the program. This principle also extends to user-defined
properties (i.e. synchronous observers [25]), which bring their own code contracts. We also study the case when properties are automatically discovered
by an external tool, in a modular way [32].
1.3.5 Correct-by-construction
Last but not least, we also took the a priori approach and studied means of building a correct code generator from its specification, or more precisely from the specification of its different phases. We have been following this path since the Gene-Auto project [92]. We choose not to detail our contribution here and instead present a summary of our activities.
We opted for a constructive fixed point framework, general enough to
express the different compiler phases in Coq. Our framework computes
fixed points in lattices satisfying the ascending chain condition (ACC).
We first developed a theory of indexed families of lattices, with various operators such as product, lexicographic product, disjoint sum, finite mappings (or environments), etc., which take their part in the specification of compiler phases. Note that all operators were proved to preserve ACC, in contrast with state-of-the-art practices [78, 8, 9] where termination guarantees of monotonic lattice functions were left out of scope.
Then, a library of fixed points (least and greatest) on lattice domains
was developed, with all the properties needed to exploit them in proofs.
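The Coq development itself is not reproduced here. As a purely illustrative sketch (ours, with hypothetical names, not the actual framework), the computational skeleton it packages is the familiar iteration to a least fixed point on a lattice satisfying ACC, here a finite powerset lattice in OCaml:

(* Illustrative sketch only: least fixed point by iteration from bottom, on a
   powerset lattice over a finite universe. Termination relies on f being
   monotone and on the ascending chain condition. *)
module IntSet = Set.Make (Int)

let lfp (f : IntSet.t -> IntSet.t) : IntSet.t =
  let rec iter x =
    let y = f x in
    if IntSet.equal x y then x else iter y
  in
  iter IntSet.empty

(* A typical dataflow-style use: nodes reachable from node 0 in a small graph. *)
let reachable =
  let succ = function 0 -> [1] | 1 -> [2] | 2 -> [0; 1] | _ -> [] in
  lfp (fun s ->
      IntSet.union (IntSet.singleton 0)
        (IntSet.of_list (List.concat_map succ (IntSet.elements s))))

The actual framework goes much further (indexed families of lattices, ACC-preservation proofs for every operator, greatest fixed points), but this is the shape of computation at stake.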
We applied this framework to two important phases of a Gene-Auto
language compiler. As a significant subset of the Simulink language, Gene-Auto contained many semantically ill-formed constructs that were nevertheless part of industrial practices. The mix of dataflow and stateflow, along with a notion of priority induced by graphical position and "inheritance" between blocks, was particularly tedious and daunting, with a large textual specification.
CHAPTER 1. WHO BY NULL POINTER, . . .
12
The first phase consisted in devising a scheduling algorithm from the
textual specification. Our contribution is summarized below:
— When trying to build and prove a scheduling algorithm, we found several bugs in the specification itself.
— We developed a lattice of finite sets of Node Calls and Node Returns
events, general enough to cope with all the constraints and priorities.
— We built a terminating algorithm, totally within Coq, that computes
a scheduling from any Gene-Auto source diagram.
— We proved the generated scheduling respects the causality constraints
of the diagram, from dataflow and stateflow altogether, as well as the
priorities.
— This component is a part of the industrial Gene-Auto compiler.
The second phase we implemented was a typing algorithm, through a
dataflow analysis, again seamlessly embedded within our lattice and fixed
point framework. The novelty was the possibility, as required by the users,
to specify input types and/or output types of any diagram block.
— We developed a lattice of type boundaries, i.e. a pair of lower and
upper type bounds, which fully supports Gene-Auto subtyping rules.
— Each block gives rise to a pair of typing functions, forward and backward, forming an adjunction. It guarantees the coherence of computed
type boundaries.
— The typing algorithm terminates and infers the internal type information from input and/or output types.
— It paves the way for further developments such as: type unification,
general type constructors and also contracts.
For the interested reader, more details can be found in [50, 53, 52, 51, 49].
1.4 Trusting the Control Law
In the context of critical embedded systems, many software parts consist
of implementations of various control laws. Such control laws are discretized
versions of smooth transfer functions, or their linear approximations. They
intensively use floating-point numbers and arbitrary numerical functions such
as trigonometric functions, logarithms, etc. Even more, these laws often
assume their environment behaves as the solution to a differential equation.
In order to study numerical properties of these implemented control laws,
such as the domain of possible values for every variable of the system, we may
facilitate analysis by restricting ourselves to polynomial functions and systems. Indeed, there exist methods and algorithms to characterize dynamics
of polynomial systems. Still, providing certified polynomial approximations
for arbitrary transfer functions remains to be done.
In Part II, we propose Taylor series as a solution to this prominent problem. Taylor expansion is a very pervasive tool to obtain more tractable
polynomial approximations from arbitrary mathematical functions. For our
purpose, Taylor expansions appear quite natural: any analytical function can
be given a polynomial approximation and a certified bounding error may be
computed. Also, the solution to any differential equation can be expressed as
a Taylor series, when the equation is put in solved form, so Taylor expansions
can be used to model even the continuous environment.
Therefore, Taylor expansions seem a promising tool as they render analysis of embedded systems more tractable by reducing them to polynomial
systems, not to mention other pervasive usages of polynomial approximations.
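As a textbook one-dimensional illustration of what a polynomial approximation with a certified bounding error looks like (standard material, independent of our library), expanding the exponential at 0 to degree 2 and bounding the Lagrange remainder on an interval [−r, r] gives:

\[
  e^{x} \;=\; 1 + x + \frac{x^{2}}{2} + R_{2}(x),
  \qquad
  R_{2}(x) = \frac{e^{c}\,x^{3}}{3!} \ \text{ for some } |c| \le |x|,
  \qquad
  |R_{2}(x)| \;\le\; \frac{e^{r}\, r^{3}}{6} \ \text{ whenever } |x| \le r .
\]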
1.4.1 Contribution
Although Taylor series are far from being new, we claim our method possesses several unprecedented key features, to the best of our knowledge. Novelty is present in different aspects; each feature is detailed in a dedicated chapter:
• Strong static guarantees: Chapter 5
• Multivariate: Chapter 6
• Specialized at point 0: Chapter 7
• Certified errors: Chapter 7
• On-demand refinable models: Chapter 8
• Resolution of differential equations: Chapter 8
Last, but not least, the overall development is highly modular, mainly
structured with OCaml modules and functors seen as (implementations of)
various algebras: values, errors, values-with-errors, tensors and finally Taylor
series.
Some further design choices clearly derive from our above requirements.
For instance:
— The requirement for strong static guarantees indeed stems from a wish to reduce the effort to prove our implementation correct (with a proof assistant), which is why we need advanced features such as Generalized Algebraic Data Types, type-level arithmetics and modularity.
— Multivariate Taylor expansions entice us to handle symmetric tensors as a suitable representation for partial derivatives.
— Taylor expansions specialized at point 0 naturally lead to zero-centered
errors. Proving correctness would also require handling floating-point
errors as additional zero-centered errors.
— The pervasive nature of multiplication, which is a performance bottleneck, strongly suggests that we carefully devise a fast convolution algorithm (the convolution product is recalled below), while keeping static safety in mind.
— In the same line of thought, many arithmetical expressions are already polynomial and we need to cope with them efficiently, with a specific treatment of finite Taylor expansions and zero error functions.
— The need for on-demand Taylor model refinement is only tractable if the same expansion is computed once and reused many times, anywhere in the neighbourhood of point 0. This clearly favors co-induction and laziness, as well as error functions (as opposed to error values).
— As a conclusion, OCaml stands out as an implementation language with its matching assets: highly modular capabilities, strong static guarantees, clean support for functions and laziness, and finally managed side-effects when efficiency is needed.
The global design appears in Figure 1.1.

[Figure 1.1 – Overall design: a module diagram combining Type-level Arith., Zero-centered Errors, Values, Values×Errors, Tensor Basics, Tensor Algebra, Error Refinement, Convolution prod., 1D Series, Taylor Basics, Taylor Series and, finally, Refined Taylor Series.]
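For reference, the convolution alluded to above is the usual Cauchy product on series coefficients; in our multivariate setting the index ranges over multi-indices (symmetric tensors), but the shape of the computation is the same:

\[
  \Big(\sum_{n \ge 0} a_n x^{n}\Big)\Big(\sum_{n \ge 0} b_n x^{n}\Big)
  \;=\; \sum_{n \ge 0} c_n x^{n},
  \qquad
  c_n \;=\; \sum_{k=0}^{n} a_k\, b_{n-k}.
\]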
1.4.2 Related Works
Although Taylor expansions are well known and form a very rich and
interesting algebra, their realizations as software items are not widespread.
From a mathematical perspective, some weaknesses may explain this lack
of success: they only support analytical functions, a rather limited class of
functions; they don’t possess good convergence properties, uniform convergence is hardly guaranteed for instance; typical applications for polynomial
approximations are usually not concerned with certified errors, mean error
or integrated square error (through various norms) are more important and
don’t easily fit in Taylor expansion schemes. Finally, from a programming
perspective, Taylor expansions are: hard to implement as they require many
different operations to be implemented, from low-level pure numbers to highlevel abstract Taylor expansions seen as first-class citizens; error-prone with
lots of complex floating-point computations on non-trivial data structures;
heavily resource demanding in our multi-dimensional setting because data
structures grow exponentially as the precision order increases.
To the best of our knowledge, here are a few important milestones on our own ambitious path. In [58], the author presents an early application
of laziness to cleanly obtain Taylor polynomial approximations. Laziness
allows augmenting the degree of the resulting polynomial on demand. Yet,
the setting is much simpler as it is strictly one-dimensional and certified errors are not computed at all. With these restrictions, the author obtains
nice formulations of automatic differentiation and polynomial approximations of classical phenomena in physics. Speaking about implementation,
related works come in many flavors and date back to the now well established folklore of automatic differentiation (forward or backward modes). As
for symmetric tensor algebra, a huge menagerie of (mostly C++) libraries
exists, for tensors of arbitrary order and dimension (but some libraries put
a very low upper-bound on order or dimension). These implementations are
clearly not oriented towards reliability and proof of correctness, but towards
mere efficiency. This also comes at the expense of some user-friendliness, as
memory management and user interface are more complex and error-prone
than in our own library. Still, we may consider interfacing our code base
with a trusted and stable tensor library, for much better performance.
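To give the flavour of the lazy, on-demand construction mentioned above for [58], here is a self-contained OCaml sketch of one-dimensional formal power series (illustrative only: it is not taken from our code base and ignores certified errors entirely):

(* A power series as a lazy stream of coefficients: the n-th forced cell
   is the coefficient of x^n, so the degree can be refined on demand. *)
type series = Cons of float * series Lazy.t

(* The exponential series: coefficient of x^n is 1/n!. *)
let exp_series =
  let rec from n fact =
    Cons (1. /. fact, lazy (from (n + 1) (fact *. float_of_int (n + 1))))
  in
  from 0 1.

(* Formal derivative: coefficient n of the result is (n+1) * a_{n+1}. *)
let derive (Cons (_, s)) =
  let rec aux n (Cons (a, s)) =
    Cons (float_of_int n *. a, lazy (aux (n + 1) (Lazy.force s)))
  in
  aux 1 (Lazy.force s)

(* Evaluate the truncation to the first n coefficients at point x. *)
let eval n x s =
  let rec go i pow (Cons (a, s)) acc =
    if i >= n then acc
    else go (i + 1) (pow *. x) (Lazy.force s) (acc +. a *. pow)
  in
  go 0 1. s 0.

Since forcing a lazy cell is memoized, asking later for a higher truncation only computes the missing coefficients; this on-demand behaviour is what motivates the co-inductive design advocated above, which our development generalizes with certified errors and several dimensions.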
One of the most prominent implementations of Taylor expansions is the
COSY tool, cf. [83, 63]. This tool has been used in industrial-scale engineering and scientific contexts, to model and predict the complex dynamics of particles in accelerators for instance. It supports 1D Taylor expansions with interval-based certified errors. Polynomial degree is not refinable on demand and Taylor expansions are not handled per se (i.e. not
first-class citizens). The authors managed anyway to implement an error
refinement scheme for solved form differential equations, that allows solving
them with tight certified errors. Experiments show that this tool compares
favorably to other traditional approximations and bounding techniques, such
as branch-and-bound approaches and interval arithmetics, in terms of speed
and precision. We also aim at implementing differential equation solving in
our multi-dimensional setting.
At the other end of the spectrum lies [68], which proposes correct-by-construction one-dimensional Taylor expansions with certified errors, which appears as a huge step. Integration of floating-point errors into this scheme is also a concern addressed in [67]. Still, apart from its limitation to the 1D
case, this approach suffers from weaknesses: expansion degree is fixed and
differential equations cannot be handled. The underlying algorithm won’t
be so easily turned into a co-inductive (lazy) equivalent version.
Part I

Validation of code generators
Chapter 2

Our Code Generation Framework: lustrec
We stress the fact that we don’t intend to replace renowned, reliable
and efficient available Lustre compilers, but use our framework to explore
and implement verification techniques tightly coupled with code generation.
Still, we feel free to push forward some changes to the traditional Lustre
folklore, without departing too much from established practices though. Our
own resulting flavor of Lustre blends features from Lustre V4 and V6, with
a particular emphasis on modularity. In this chapter, we describe rather
roughly and informally the source language and the compilation process of
our code generator lustrec [33] and suppose the reader is somewhat familiar
with the Lustre language, the specific syntax of which is not presented in
detail. Indeed, many compiler phases are directly inspired from previous
works and don’t bring so much novelty. As we feel that a nearly identical
transposition would dangerously border on plagiarism, we try to present most
parts by displaying examples that nevertheless speak for the general case.
We invite the interested reader to refer to the original sources for a more
thorough and formal presentation.
2.1 Introduction to Lustre
Synchronous languages are a class of languages proposed for the design of so-called "reactive systems" – systems that maintain a permanent interaction with their physical environment. Such languages are based on the theory of synchronous time, in which the system and its environment are time-triggered with respect to a symbolic "abstract" universal clock. In order to simplify reasoning about such systems, outputs are usually considered to be calculated instantly [4]. Lustre [16] is a synchronous dataflow language that combines each data stream with an associated clock as a means to discretize time. The overall system is considered to have a universal clock that
represents the smallest time span the system is able to distinguish, with additional, coarser-grained, user-defined clocks. Therefore the overall system
may have different subsections that react to inputs at different frequencies.
At each clock tick, the system is considered to evaluate all streams, so all
values are considered stable for any actual time spent in the instant between
clock ticks.
Variables in Lustre are used to represent individual typed streams; basic
types include streams of real numbers, integers, and Booleans. Lustre
programs and subprograms are expressed in terms of nodes. Nodes directly
model subsystems in a modular fashion, with an externally visible set of
inputs and outputs. A node can be seen as a mapping of a finite set of
input streams (in the form of a tuple) to a finite set of output streams (also
expressed as a tuple). At each instant t, the node reads the values of its input
streams and writes the values of its output streams. Operationally, a node
has a cyclic behavior: at each clock tick t, it takes as input the value of each
input stream at position or instant t, and returns the value of each output
stream at instant t. Lustre nodes have a limited form of memory in that,
when computing the output values they can also look at input and output
values from previous instants, up to a finite limit statically determined by
the program itself. Listing 2.1.1 describes a simple Lustre program: a node
that every four computation steps activates its output signal, starting at the
third step. The reset input reinitializes this counter.
node counter(reset : bool) returns (active : bool);
var a, b: bool;
let
a = false -> (not reset and not (pre b));
b = false -> (not reset and pre a);
active = a and b;
tel
Listing 2.1.1 – A simple Lustre example.
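To make this behaviour concrete, the first computation steps of node counter with reset held false unfold as follows; active is raised at steps 3, 7, 11, etc., i.e. every four steps starting at the third:

step      1      2      3      4      5      6      7
a         false  true   true   false  false  true   true
b         false  false  true   true   false  false  true
active    false  false  true   false  false  false  true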
Typically, the body of a Lustre node consists in a set of definitions,
stream equations of the form x = t (as seen in Listing 2.1.1) where x is a
variable denoting an output or a locally defined stream and t is an expression, in a certain stream algebra, whose variables are input, output, or local
streams. More generally, x can be a tuple of stream variables and t an expression evaluating to a tuple of the same type. Most of Lustre’s operators
are point-wise liftings to streams of the usual operators over stream values. For example, let x = [x0, x1, ...] and y = [y0, y1, ...] be two integer streams. Then, x + y denotes the stream [x0 + y0, x1 + y1, ...]; an integer constant c is implicitly lifted to the constant integer stream [c, c, ...]. Two important additional operators are a unary shift-right operator pre ("previous"),
and a binary initialization operator → ("followed by"). The first is defined as pre(x) = [u, x0, x1, ...] with the value u left unspecified. The second is defined as x → y = [x0, y1, y2, ...]. Syntactical restrictions on the equations
in a Lustre program guarantee that all its streams are well defined, that is,
(mutually) recursive stream definitions traverse at least one pre operator.
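Read denotationally, both operators admit a very small model on lazy streams. The OCaml sketch below is purely illustrative (the extra default argument plays the role of the unspecified value u of pre):

(* Illustrative stream model of the two temporal operators of Lustre. *)
type 'a stream = St of 'a * 'a stream Lazy.t

(* pre u x = [u, x0, x1, ...] : shift right, exposing an unspecified first value. *)
let pre (u : 'a) (x : 'a stream) : 'a stream = St (u, lazy x)

(* arrow x y = [x0, y1, y2, ...] : head of x, then the tail of y. *)
let arrow (St (x0, _)) (St (_, ys)) = St (x0, ys)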
More advanced features such as interfaces, arrays, clocks and automata
will be presented in dedicated sections.
2.2 Structure of a Lustre development project
A typical development in our source language involves several interface
files with a .lusi extension and source files with a .lus extension. This
rudimentary module system is inspired by the similar .mli/.ml modules of
OCaml. We don’t support parameterized modules as Lustre V6 does.
2.2.1 Interface files
An interface file contains declared global constant values, defined user
types (no type abstraction is provided yet) and declared input/output interfaces of standard Lustre nodes. Memory-less nodes, as they are compiled
in a more specific way, may be declared as functions. Also, single output
functions that are meant to be linked to standalone C functions with a return value (such as trigonometric functions of the math library) can receive
a special external annotation. As in classical Lustre V4, we don’t support
type polymorphism (but could do so). Nevertheless we somehow allow size
(of arrays) polymorphism and also clock polymorphism through a restricted
implementation of dependent typing.
Compiling an interface file file.lusi produces a standard C header
file file.h containing declarations of global constant values, definitions of
user types and interfaces of nodes, as well as a file.lusic file containing
a compiled internal version of the same content, later used to match the
contents of file.lus against, when this source file exists and is compiled.
2.2.2 Source files
A source file contains definitions of global constant values, defined user
types and definitions of Lustre nodes and functions. For the time being,
user-defined types may only consist of type aliases and enumeration types; type expressions may include arrays of any dimension with symbolic sizes; we plan to support records in the near future. Inside nodes, source code involves
standard Lustre V4 equations, without array slicing operations but with
array and tuple homomorphism. As for clock constructs, we support Lustre
V6 syntax with merge and when, with clocks being values of an enumerated
type (bool being the only supported enumerated type in Lustre V4). We
got rid on purpose of the current Lustre V4 construct, which is not well-behaved. We also support the reset construct, i.e. a node memory may be
reset immediately prior to being called.
These new Lustre V6 features are quite rare in the corpus of Lustre
codes, but principally serve as an intermediate target language for Lustre
V6 stateflows, that we also support with only small differences.
Compiling a source file file.lus consists in producing a C source file
file.c containing definitions of global constants, types and procedures corresponding to Lustre artifacts, a file.alloc_h file containing static memory allocation primitives for each node, needed to execute them, and also a
makefile file.makefile to automate the compilation process of the generated C file (managing module dependencies, linking with external C libraries,
etc). Compiling a source file also produces the aforementioned file.h and
file.lusic files, in case an explicit file.lusi interface doesn’t exist. When
an executable is needed, a specific main node is chosen and an additional
main C file (file_main.c with associated file_main.makefile) is also generated with all the necessary stub code to interface the main node with the
external world.
To complete this description, we insist again on one of the main assets of our compiler lustrec for its use in a certified context: modularity.
(i) This allows the user to split his project into separate files, compile them
independently through interface files and link them seamlessly with external
C libraries, getting rid of the old Lustre V4 practice of textually including
source files. This is possible as every node is compiled into a single bunch
of C procedures (indeed 2 or even 4 procedures, depending on compilation
options) and static memory allocation macros, whatever its inherent size or
clock polymorphism. Moreover, every state of a stateflow automaton is also
seen and compiled as an independent node, enhancing modularity. (ii) Traceability is also more tractable by preserving the component structures of the
initial model. Each line of code will be defined in a function associated to
a Lustre node. Modularity at the code level obviously leverages modular
specification and verification, which we believe is a key aspect for the formal verification of huge pieces of Lustre code, as one can find in industrial
developments.
2.3 Dataflow Examples
We illustrate some abilities of our compiler through small interface and
source pure dataflow examples. Stateflows are handled in Section 2.5.1.
Listing 2.3.1 illustrates node interfaces with some amount of dependent
typing, either through symbolic array sizes or clock declarations. In node
imp1, input a and output c are arrays whose respective size depends on
a static/constant parameter m. In node test , output a is clocked. It has a
node imp1(const m: int; a: int^(3*m)) returns (c: int^m);

function _MatMul_real (
  const n, m, p : int;
  in1 : real^n^m;
  in2 : real^m^p)
returns (
  out : real^n^p);

node test (x: bool; d: bool clock) returns (a: int when true(d); b: int);
Listing 2.3.1 – Node interfaces in Lustre.
meaningful value (i.e. is “active”) only at times when d is true. The semantics
of clocked signals is presented in [7].
Listing 2.3.2 illustrates arrays and library importation. Our compiler
doesn’t support array slices typical of Lustre V4, which are compensated
for by the much more general Lustre V6 array iterators. As a first step, we
choose to leave iterators out too, even if our framework is able to generate
code from them. Our contributions are geared towards verification more
than towards expressivity and since iterators raise specific problems in this
respect, we left them for future work. If one is interested in writing array-related algorithms, such as matrix product, the workflow supported by our
tool is to write an external piece of C code and link it to a Lustre interface
file. Both foreign code and interface can be generic with respect to array
sizes.
Listing 2.3.3 illustrates a stopwatch example using Lustre enumerated
clocks and node reset. Enumerated clocks are an advanced form of the traditional Lustre clocks. They allow sampling a value of a flow depending on
the value of a clock. For example, the expression “ tick when Start(run) ”
denotes a signal that is only defined when the clock flow run has value Start.
The sampled flows can be gathered together using the merge operator as in
the definition of variable “seconds” in node stopwatch. Moreover, a node call
can be reset to its initial state when a given boolean condition is set to true.
For example, in Listing 2.3.3 the expression "count(..) every reset" resets the node count to its initial state when reset is true. The function
switch is a memoryless node, hence is declared with the keyword function.
Overall, this piece of code describes the behaviour of a simple stopwatch.
First, the enumeration type run_mode encodes whether the stopwatch is
running or not and the function switch exchanges the two states. The node
count increments an internal counter, starting from 0, each time it is called
(with a dummy argument). Finally, the principal node stopwatch has two
possible modes, the active one being represented by the clock variable run,
toggled by the external boolean start_stop. When the stopwatch is running
(mode Start), the variable seconds is counting upwards, unless the external
#open "libarrays"
node ctl (in0 : real^1^1) returns (mem : real^2^1);
var _A: real^2^2;
_B: real^2^1;
let
_A = [[1.5 , 1.] , [−0.7, 0 . ] ] ;
_B = [[1.6 , 0 . ] ] ;
assert (in0 [0][0] >= −1. and in0 [0][0] <= 1.);
mem = [[0. , 0.]] −> _MatMul_real(2, 2, 1, _A, pre mem)
+ _MatMul_real(2, 1, 1, _B, in0 );
tel
node top(in0 : real) returns (x, y:
var res : real^2^1;
let
res = ctl ([[ in0 ] ] ) ;
x = res [0][0];
y = res [0][1];
tel
real );
Listing 2.3.2 – Arrays in Lustre.
boolean reset is set true and thus resets the counter. When the stopwatch
is stopped (mode Stop), seconds freezes its value.
2.4 Stateflow Examples
The first example of Listing 2.4.1 is a rephrasing of Listing 2.3.3 in the
automata language, with the same semantics.
The more significant example of Listing 2.4.2 entirely borrows the description of a simple controller for a personal gas heater from [81] and
illustrates the combination of dataflow equations and state machines. The
original example has been translated from the synchronous language Lucid
Synchrone to Lustre, with a minimal amount of syntax changes.
2.5 Compilation Workflow
While initial compilation schemes for Lustre were computing a global
automaton of the system [17], the approach of [7], which we globally follow,
relies on an object-like compilation of the program: each Lustre node call
type run_mode = enum { Start, Stop };

function switch (mode_in: run_mode) returns (mode_out: run_mode);
let mode_out = if mode_in = Start then Stop else Start; tel

node count (tick: bool) returns (seconds: int);
let seconds = 0 -> pre seconds + 1; tel

node stopwatch (tick: bool; start_stop: bool; reset: bool) returns (seconds: int);
var run : run_mode clock;
let
  run = Stop -> if start_stop then switch(pre run) else pre run;
  seconds = merge run (Start -> count(tick when Start(run)) every reset)
                      (Stop  -> (0 -> pre seconds) when Stop(run));
tel
Listing 2.3.3 – Clocks and resets in Lustre.
node count (tick: bool) returns (seconds: int);
let seconds = 0 -> pre seconds + 1; tel

node stopwatch (tick: bool; start_stop: bool; reset: bool) returns (seconds: int);
var last_seconds : int;
let
  last_seconds = pre seconds;
  automaton run
  state Stop:
    let
      seconds = 0 -> last_seconds;
    tel
    until start_stop resume Start
  state Start:
    unless reset restart Start
    let
      seconds = count(tick);
    tel
    until start_stop resume Stop
tel
Listing 2.4.1 – A simple automaton in Lustre.
const low = 5;
const high = 5;
const delay_on = 200;
const delay_off = 500;
node edge(c : bool) returns (edge_c : bool)
let
  edge_c = false -> c && not (pre c);
tel

node count(d : int; t : bool) returns (ok : bool)
var cpt : int;
let
  ok = (cpt = 0);
  cpt = 0 -> (if t then pre cpt + 1 else pre cpt) mod d;
tel

(* controlling the heat *)
(* returns [true] when [expected_temp] does not agree with [actual_temp] *)
node heat(expected_temp, actual_temp : int) returns (ok : bool)
let
  automaton heat_control
  state Stop:
    unless (actual_temp <= expected_temp - low) resume Start
    let
      ok = false;
    tel
  state Start:
    unless (actual_temp >= expected_temp + high) resume Stop
    let
      ok = true;
    tel
tel

(* a cyclic two mode automaton with an internal timer *)
(* [open_light = true] and [open_gas = true] for [delay_on millisecond] *)
(* then [open_light = false] and [open_gas = false] for *)
(* [delay_off millisecond] *)
node command(millisecond : bool) returns (open_light, open_gas : bool)
let
automaton command_control
state Open :
let
open_light = true ;
open_gas = true ;
tel
until count(delay_on, millisecond) restart Silent
state Silent :
let
open_light = false ;
open_gas = false ;
tel
until count(delay_off , millisecond) restart Open
tel
(∗ the main command which controls the opening of the light and gas ∗)
node light (millisecond : bool ; on_heat, on_light : bool)
returns (open_light, open_gas, nok : bool)
let
automaton light_control
state Light_off :
let
nok = false ;
open_light = false ;
open_gas = false ;
tel
until on_heat restart Try
state Light_on :
let
nok = false ;
open_light = false ;
open_gas = true ;
tel
until not on_heat restart Light_off
state Try :
let
nok = false ;
(open_light, open_gas) = command(millisecond );
tel
until on_light restart Light_on
until count(3, edge(not open_light)) restart Failure
state Failure :
let
nok = true ;
open_light = false ;
open_gas = false ;
tel
tel
(* the main function *)
node main(millisecond : bool ; reset : bool ; expected_temp, actual_temp : int ; on_light : bool)
returns (open_light, open_gas, ok : bool)
var on_heat, nok : bool ;
let
on_heat = heat(expected_temp,actual_temp) every reset ;
(open_light, open_gas, nok) = light (millisecond , on_heat, on_light) every reset ;
ok = not nok;
tel
Listing 2.4.2 – Automata in Lustre.
is seen as an instance of the generic declaration of the node.
The “object” associated to each node will represent an instance of the
node. It is principally fitted with a method/function that expresses a single
step computation: it takes as many input values as input flows and produces
as many output values as output flows. However, a node is stateful, i.e. it is
not a mathematical function of its current inputs only since it can contain
memories – typically bound by operators pre in Lustre or by unit delays in
Simulink – or instances of subnodes that could recursively be defined with
memory and subnode instances. This information characterizes the current
state of this object, e.g. represented through object fields. When generating C code, those objects are typically compiled into an allocated struct
representing the object state. The single step function takes as parameters
inputs, pointers to outputs and a pointer to this instance struct. Moreover, to support resetting of node internal memories, an additional method
is also provided: it takes as input the instance struct pointer as well as the
constant node inputs (to have access to sizes) and doesn’t produce any output. It may recursively reset instances of subnodes. This reset method is
only called within step methods and mainly comes in handy when handling
Lustre automata.
Listing 2.5.1 describes a struct definition and the signature of the single step function, as well as the reset function, for the Lustre example in Listing 2.1.1. The "self" parameter of both functions accounts for instance structs of node counter.
struct counter_mem {
  struct counter_reg {
    _Bool __counter_1;
    _Bool __counter_2;
  } _reg;
};

void counter_step (_Bool reset,
                   _Bool (*active),
                   struct counter_mem *self);

void counter_reset (struct counter_mem *self);
Listing 2.5.1 – C header of the compiled node counter of Listing 2.1.1.
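To convey the intent of this scheme in a few lines, the object-like compilation of node counter can be pictured as follows in OCaml. This is an illustrative sketch only: the actual compiler emits the C code of Listing 2.5.1, whose memory layout differs; here an explicit initialization flag stands for the -> operator.

(* Illustrative rendering of the object-like scheme for node counter:
   one mutable memory record per instance, a step function and a reset. *)
type counter_mem = {
  mutable init  : bool;   (* true before the first step, for the -> operator *)
  mutable pre_a : bool;   (* value of a at the previous instant *)
  mutable pre_b : bool;   (* value of b at the previous instant *)
}

let counter_reset (self : counter_mem) =
  self.init <- true

let counter_step (reset : bool) (self : counter_mem) : bool =
  let a = (not self.init) && not reset && not self.pre_b in
  let b = (not self.init) && not reset && self.pre_a in
  let active = a && b in
  self.init <- false;
  self.pre_a <- a;
  self.pre_b <- b;
  active

(* Running seven steps with reset = false prints:
   false false true false false false true,
   matching the trace given after Listing 2.1.1. *)
let () =
  let self = { init = true; pre_a = false; pre_b = false } in
  for _step = 1 to 7 do Printf.printf "%b " (counter_step false self) done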
The very classical workflow of our Lustre compiler, lustrec, is organized in different successive phases as follows: stateflow unfolding, normalization and common subexpression elimination, inlining, typing, clocking,
production of sequential intermediate code, optimization of intermediate
code, production of final code. On top of that stack, another companion
tool named CocoSim [56], developed at NASA Ames, translates Simulink
dataflow diagrams to Lustre code. The translation of Stateflow diagrams into Lustre automata is currently being addressed.

[Figure 2.1 – Automaton as a pure dataflow: on the base clock, the strong (unless) transitions compute the actual state state_act and restart_act from the putative state_in and restart_in; the state equations together with the weak (until) transitions then produce next_state_in and next_restart_in, which are fed back through a unit delay (z−1).]
2.5.1 Stateflow Unfolding
The first step consists in unfolding automata into pure dataflow equations. In our setting, we follow the approach developed in [20, 19], which
is to the best of our knowledge the most disciplined and simple approach.
Following earlier works around statecharts [43] and mode automata [65, 64],
automata in Lustre are simpler and better-behaved than the de facto standard Stateflow [89].
This approach is also implemented in the commercial KCG Scade compiler [26]. Yet, the motivation to present this phase in more detail is
twofold: first, Lustre automata are still a novel and unusual feature; second, we slightly depart from the original compilation model to meet our own
requirements.
Informally, the modular compilation scheme developed in [20, 19] is enforced at the expense of raw (and somehow undesired) expressivity, disallowing for instance transitions that go through boundaries of hierarchical state
machines or the firing of an unbounded number of transitions per instant
(e.g. in Matlab Simulink and Stateflow).
The overall behaviour of automata is pictured in Figure 2.1. At each
instant, two pairs of variables are computed: a putative state_in and an
actual state state_act and also, for both states, two booleans restart_in
and restart_act, that tell whether their respective state equations should
be reset before execution. The actual state is obtained via a strong (unless)
transition from the putative state, whereas the next putative state is obtained
via a weak (until) transition from the actual state. Only the actual state
equations are executed at each instant. Finally, a reset function is driven by
the restart/resume keyword switches. As transition-firing conditions may
have their own memories, they can be reset if needed before being evaluated.
Specifically, unless conditions are reset according to restart_in, whereas
until and state equations altogether are reset according to restart_act.
Our approach builds on top of the aforementioned compilation scheme. In
our setting, we promote the computation of strong transitions, state equations and weak transitions to independent auxiliary Lustre nodes. This allows a
certain flexibility: (i) independent scheduling and optimization of different
state equations; (ii) addition of code contracts to different states. Those
features are not supported by the commercial KCG suite.
Another benefit of our approach, unlike the original scheme, is that we
don't need to modify state equations to account for clock constraints, local variables or the reset operation. Indeed, we encapsulate state equations
in new node definitions and generate new equations for calling these nodes,
greatly facilitating the management of local state invariants for instance.
On the contrary, in the original scheme, local state information may only be
recovered through clock calculus and is not structural any more, as generated
code may be optimized and scattered. Yet, our encapsulation comes at the
expense of a rather limited loss in expressivity, due to possible causality
issues 1 . Note that inlining these auxiliary nodes is already an available
option that fully recovers the original semantics.
Differences Between our Approach and [20]
Differences are illustrated in Listings 2.5.2a, 2.5.2b and 2.5.2c, from a
user’s viewpoint. Example 2.5.2a is a typical program that cannot be statically scheduled and produces a compilation error in both approaches. A
solution may be devised as in Example 2.5.2b, using an automaton to encode the boolean switch i = 0. Even if scheduling is done prior to other static
analyses (and thus is unaware of exclusive automaton states for instance),
we succeed in generating a correct code whereas the method proposed in [20]
would fail 2 . Example 2.5.2c is non-causal and won’t compile if we remove
the pre occurring in the unless clause. But if we keep it, KCG will handle it
correctly whereas our causality analysis will reject this program. Generally
speaking, we forbid unless clauses that would refer to putative state memories (such as o). Accepting these clauses appear problematic or at least
confusing as it makes the putative state visible and distinct from the actual
state, thus duplicating state variables.
1. We recall that the classical causality analysis in modern Lustre doesn’t cross
boundaries of nodes, hence the conservative rejection of some correct programs.
2. Although KCG, built from the same principles, handles example 2.5.2b.
node failure (i: int) returns (o1, o2: int);
let
  (o1, o2) = if i = 0
             then (o2, i)
             else (i, o1);
tel
(a) Scheduling failure.
node solution (i: int) returns (o1, o2: int);
let
  automaton condition
  state OK:
    unless i <> 0 resume KO
    let
      (o1, o2) = (o2, i);
    tel
  state KO:
    unless i = 0 resume OK
    let
      (o1, o2) = (i, o1);
    tel
tel
(b) Automaton based solution.
node triangle (r: bool) returns (o: int);
let
  automaton trivial
  state One:
    unless r || pre o = 100
    let
      o = 0 -> 1 + pre o;
    tel
tel
(c) Causality issues.
Listing 2.5.2 – Examples comparing our approach with the one developed
in [20].
Compiling Automata to Clocked Expressions
We denote by ReadEqs_i and WriteEqs_i the sets of read and write variables occurring in the equations of an automaton state S_i. We also denote by ReadUnless_i and ReadUntil_i the sets of variables occurring in its unless and until clauses.
Our compilation scheme from automata to clocked expressions follows
Figure 2.1 and is applied to a generic automaton such as the one described
in Listing 2.5.3a (node nd). As illustrated in Listing 2.5.3b, the variables
state_act and state_in are modelled as clocks of enumerated type. Also, two new nodes are introduced for each automaton state: one to express the semantics of the state equations, and another one to capture the weak and strong transitions (as explained in Section 2.5.1).
Listing 2.5.4 illustrates the compiled node c_nd that replaces the original automaton description of node nd. Evaluation of each single node call embedding state equations and transitions only takes place when its corresponding clock is active; this is done via "when Value(clock)" sampling operators applied to all node arguments. All the node calls that correspond to the global evaluation of the automaton are then gathered in two merge constructs, which are driven by the putative state clock state_in (for strong transitions) and the actual state clock state_act (for weak transitions and state equations).
Arguments to unless and until clauses are triples (sc_j, sr_j, SS_j), although the concrete syntax is fleshed out as "sc_j resume/restart SS_j". Here, sc_j is the triggering condition for an outgoing transition to state SS_j. The boolean sr_j denotes whether SS_j will be reset when entered (restart) or not (resume). Strong and weak transitions are evaluated in order of appearance.
As witnessed by our example in “general position”, our translation is
simple as it owes much to modularity at the automaton level, but also at
the state level within an automaton. Starting from a Lustre source, the
translation process repeatedly transforms it to remove at each step some
automaton, until no more is present. Parallel as well as nested automata are
seamlessly handled. The resulting pure dataflow program is independent of
the particular order in which automata are removed.
The reader can find a more detailed example of automata unfolding in Section 4.6. The high level of modularity of our encoding clearly leverages modular analyses. Indeed, the structural information we keep could be lost if automaton states were inlined, because various code transformations may scatter or erase information, even if the clock calculus helps in this respect. In that case, subsequent analyses would be forced to try to recover the lost structural information before extracting interesting properties, such as local invariants, from the program.
Remark. In the forthcoming Chapters 3 and 4, we implicitly assume automata have been unfolded, so that we can stick to plain dataflow Lustre. This
node nd (inputs) returns (outputs);
var locals
let
  other_equations
  automaton aut
  ...
  state S_i:
    ...
    unless (sc_j, sr_j, SS_j)
    ...
    var locals_i
    let
      equations_i
    tel
    ...
    until (wc_j, wr_j, WS_j)
    ...
tel
(a) Automaton skeleton.
type aut_type = enum { S_1, ..., S_n };
...
node S_i_unless (ReadUnless_i)
returns (restart_act : bool,
         state_act : aut_type clock);
let
  (restart_act, state_act) =
    if sc_1 then (sr_1, SS_1) else
    if sc_2 then (sr_2, SS_2) else
    ...
    (false, S_i);
tel

node S_i_handler_until (ReadEqs_i ∪ ReadUntil_i)
returns (restart_in : bool,
         state_in : aut_type clock,
         WriteEqs_i);
var locals_i
let
  (restart_in, state_in) =
    if wc_1 then (wr_1, WS_1) else
    if wc_2 then (wr_2, WS_2) else
    ...
    (false, S_i);
  equations_i
tel
(b) Automaton states as new nodes with enumerated clocks.
Listing 2.5.3 – Automata in Lustre and their representation as Lustre nodes.
node c_nd (inputs) returns (outputs);
var locals;
    aut_restart_in, aut_next_restart_in, aut_restart_act : bool;
    aut_state_in, aut_next_state_in, aut_state_act : aut_type clock;
let
  ...
  (aut_restart_in, aut_state_in) =
    (false, S_1) -> pre (aut_next_restart_in, aut_next_state_in);
  (aut_restart_act, aut_state_act) = merge aut_state_in
    ...
    (S_i -> S_i_unless((ReadUnless_i) when S_i(aut_state_in))
            every aut_restart_in)
    ...
  (aut_next_restart_in, aut_next_state_in, WriteEqs) = merge aut_state_act
    ...
    (S_i -> S_i_handler_until((ReadEqs_i ∪ ReadUntil_i) when S_i(aut_state_act))
            every aut_restart_act)
    ...
tel
Listing 2.5.4 – Compiled node c_nd with clocked expressions from node nd in Listing 2.5.3a.
is not a bold stance as regards verification activities. First, our translation of automata is the reference semantics. Second, the very structural nature of this translation doesn't much impact the traceability and user-friendliness of verification diagnoses.
2.5.2 Normalization
The second step consists in normalizing dataflow equations and also performing elimination of common sub-expressions. This phase, although almost classical, nonetheless contains some variations inspired by our treatment of array expressions and homomorphic extension. For instance, arrays
are not unfolded as they may have symbolic sizes. The compiler transforms
the equations of a Lustre node to extract the stateful computations that appear inside expressions. A stateful computation is either an explicit use of a pre construct or a call to another node which is itself stateful. The extraction is made through a linear traversal of the node's equations, introducing new equations for the stateful computations 3. When possible, tuple definitions are split into simpler definitions. To ease later compilation phases, each node call is labeled with a unique identifier. The expressions in Figure 2.2 give an example of such normalization, including operators supporting homomorphic extension as well as array expressions:
Original program                          Normalized program

a = false -> not (r or pre b);            pb = pre b;
                                          arr = true -> false;
                                          a = if arr then false else not (r or pb);

y = 3 + node(x, 2);                       res_node = node_uid1(x, 2);
                                          y = 3 + res_node;

(x, y) = pre (x, y) + (1, 2);             x = pre x + 1;
                                          y = pre y + 2;

zi = (x + (1^n))[i];                      assert (i <= n);
                                          zi = x[i] + 1;
Figure 2.2 – Normalization examples.
We define the normalization functions NormEq and NormExpr, applied respectively to a set of equations and to an expression. In NormExpr, op is a Lustre operator supporting homomorphic extension, N is a node in a Lustre program L and uid is a unique identifier associated to the call of N with arguments (e1, ..., en). All normalization functions take as inputs an element to be normalized, a set of accumulated normalized equations Eqs, a set of accumulated auxiliary variables Vars and finally a stack of offsets Offs 4. They return a normalized element and an updated version of both accumulators.

3. As opposed to three-address code, only the stateful part of expressions is extracted.
For the sake of simplicity, and as an abuse of notation, NormExpr may also be applied to a sequence of expressions ẽ, from left to right, where both accumulators are carried along the sequence. In this case, NormExpr returns a sequence of normalized expressions and the final value of both accumulators. Also, normalization functions perform homomorphic extension wherever possible, so normalizing an expression may return a tuple of normalized expanded expressions. Homomorphic operators op that may be lifted to tuples include arithmetical operators and any Lustre primitive except array operators 5.
Normalization of an expression
The operation NormExpr of Figure 2.3 returns a modified expression along with a set of newly bound stateful expressions and associated new variables. Also, the special expression "true -> false" is interpreted as an instance of a built-in Lustre node "Arrow", with its own boolean memory:
Normalization of equations
The operation NormEqs of Figure 2.4 normalizes each equation of a set. For each equation, NormEq simplifies tuple definitions and normalizes the right-hand side expression. Both return a set of normalized equations and a set of variables occurring in these equations:
Common subexpression Elimination
Elimination of common subexpressions is embedded in the normalization steps, i.e. we avoid creating a new variable (x ∉ Vars) and a new equation (x = norm_rhs) if we already have another normalized equation with the same right-hand side (y = norm_rhs) present in the set Eqs. In this case, the fresh variable x is simply replaced by y. For more expression sharing, this step could be enhanced through a more liberal equivalence relation between (right-hand side) expressions. For instance, pre operators can migrate through other Lustre operators while keeping the same semantics, and normalization could account for these moves, so that pre(x + y) is deemed equivalent to pre(x) + pre(y).
4. To access array elements (and record fields as future work).
5. Note also that conditional expressions are not homomorphic with respect to their
conditions.
NormExpr(e, Offs, Eqs, Vars) ≜ match e with

  v    ⟶  v[Offs_1]...[Offs_n], Eqs, Vars

  cst  ⟶  cst[Offs_1]...[Offs_n], Eqs, Vars

  true -> false  ⟶  let x ∉ Vars in
                    x, {x = Arrow_uid()} ∪ Eqs, {x} ∪ Vars

  e1 -> e2  ⟶  NormExpr(if (true -> false) then e1 else e2, Offs, Eqs, Vars)

  op(e^1, ..., e^n)  ⟶  let ẽ^1, Eqs, Vars = NormExpr(e^1, Offs, Eqs, Vars) in
                        ...
                        let ẽ^n, Eqs, Vars = NormExpr(e^n, Offs, Eqs, Vars) in
                        (op(e^1_1, ..., e^n_1), ..., op(e^1_p, ..., e^n_p)), Eqs, Vars

  pre e  ⟶  let ẽ, Eqs, Vars = NormExpr(e, Offs, Eqs, Vars) in
            let x̃ ∉ Vars in
            x̃, ∪_i {x_i = pre e_i} ∪ Eqs, ∪_i {x_i} ∪ Vars

  N(ẽ) every e0  ⟶  let e0, Eqs, Vars = NormExpr(e0, [], Eqs, Vars) in
                    let ẽ, Eqs, Vars = NormExpr(ẽ, Offs, Eqs, Vars) in
                    let x̃ ∉ Vars, c ∉ {x̃} ∪ Vars in
                    x̃, {x̃ = N_uid(ẽ) every c} ∪ {c = e0} ∪ Eqs, ∪_i {x_i} ∪ {c} ∪ Vars

  if c then e1 else e2  ⟶  let c, Eqs, Vars = NormExpr(c, Offs, Eqs, Vars) in
                           let ẽ1, Eqs, Vars = NormExpr(e1, Offs, Eqs, Vars) in
                           let ẽ2, Eqs, Vars = NormExpr(e2, Offs, Eqs, Vars) in
                           (if c then e1_1 else e2_1, ..., if c then e1_n else e2_n), Eqs, Vars

  e ^ s  when Offs = []
         ⟶  let e, Eqs, Vars = NormExpr(e, [], Eqs, Vars) in
            let s, Eqs, Vars = NormExpr(s, [], Eqs, Vars) in
            e ^ s, Eqs, Vars

  e ^ s  when Offs = idx :: Offs'
         ⟶  let e, Eqs, Vars = NormExpr(e, Offs', Eqs, Vars) in
            let s, Eqs, Vars = NormExpr(s, [], Eqs, Vars) in
            e, {assert (idx <= s)} ∪ Eqs, Vars

  [ẽ]  ⟶  let ẽ, Eqs, Vars = NormExpr(ẽ, Offs, Eqs, Vars) in
          let x ∉ Vars in
          x, {x = [ẽ]} ∪ Eqs, {x} ∪ Vars

  e[idx]  ⟶  let idx, Eqs, Vars = NormExpr(idx, Offs, Eqs, Vars) in
             let e, Eqs, Vars = NormExpr(e, idx :: Offs, Eqs, Vars) in
             e, Eqs, Vars

Figure 2.3 – Normalization of expressions.
NormEqs({}, Eqs, Vars) ≜ Eqs, Vars

NormEqs({eq} ∪ NewEqs, Eqs, Vars) ≜
    let Eqs, Vars = NormEq(eq, Eqs, Vars) in
    NormEqs(NewEqs, Eqs, Vars)

NormEq((ṽ = e), Eqs, Vars) ≜ match the equation with

  ṽ = ẽ              ⟶  NormEqs(∪_i {v_i = e_i}, Eqs, Vars)

  v = true -> false  ⟶  {v = Arrow_uid()} ∪ Eqs, Vars

  v = e1 -> e2       ⟶  NormEq(v = if (true -> false) then e1 else e2, Eqs, Vars)

  ṽ = N(ẽ) every e0  ⟶  let e0, Eqs, Vars = NormExpr(e0, [], Eqs, Vars) in
                        let ẽ, Eqs, Vars = NormExpr(ẽ, [], Eqs, Vars) in
                        let c ∉ Vars in
                        {ṽ = N_uid(ẽ) every c} ∪ {c = e0} ∪ Eqs, Vars

  ṽ = op(e^1, ..., e^n)  ⟶  NormEqs(∪_i {v_i = op(e^1_i, ..., e^n_i)}, Eqs, Vars)

Figure 2.4 – Normalization of equations.
Normalization of a node
It amounts to normalizing each of its equations; from the final variable
accumulator, only the newly bound variables are added to the set of local
variables.
From these definitions, we normalize a node N essentially by computing
the set of normalized equations and variables:
(norm_eqs, all_vars) = NormEqs(N.eqs, {}, N.vars)
where N.eqs and N.vars = N.ins ∪ N.outs ∪ N.locs respectively denote the
equations and all input/output/local variables of N . The resulting node is
then built by replacing original equations by normalized ones and by adding
only the new variables created by normalization (i.e. all_vars \ N.vars) to
the set of local variables.
Identifying Memories
Finally, the state of a node is characterized by its memories. For a normalized node, memories are straightforwardly the variables that are defined by pre constructs, together with the memories associated to each of its called node instances.
Assuming every node is now normalized, we first define the set of local memories for a node N:

Mem(N) = {v | (v = pre e) ∈ N.eqs}
Then we characterize the set of callee instances, using their unique identifiers uid:
Inst(N) = {(M, uid) | (ṽ = M_uid(ẽ) every c) ∈ N.eqs}
The set of memories fully characterizing the state of a node is organized
as a tree, for the sake of modularity. This tree structure will be carried over
to target code level. We choose to represent it as a set of dot-separated name
paths.
We denote by State(N) the set of memories fully characterizing the state of any instance of node N 6:

State(N) = Mem(N) ∪ {M_uid.v | (M, uid) ∈ Inst(N) ∧ v ∈ State(M)}
2.5.3 Inlining
Then, following user-provided annotations and compilation directives, we inline node calls as needed, replacing the original node equations by their specialized versions with respect to the types, clocks and expressions of the call arguments. Node call inlining is used to circumvent spurious causality problems, as a way to cure the short-sightedness of purely modular, per-node algorithms. It may also be fruitful as a potential source of code optimization, as inlining brings more opportunities to factor out common subexpressions and control flow.
2.5.4 Typing
The typing phase relies on type information provided at the interface of nodes and is able to perform type inference for local variables and expressions, with a classical Hindley-Milner algorithm. We support a small amount of subtyping: static/constant values as well as clock values are subtypes of normal values (of the same base type), abstracted as the subtype relations:

const t < t
t clock < t
For instance, symbolic expressions that may occur as array sizes must be
declared or inferred as static (integer) values. The clock type annotation is
obviously used to mark clocks.
As array sizes may be arbitrary expressions, we need to keep the type system simple and altogether avoid handling arithmetical constraint systems and complex unification problems. For that purpose, we enforce the declaration of
symbolic parameters occurring in size expressions. A typical node interface
is then, for instance:
6. By construction, circular definitions of nodes are forbidden in Lustre: this recursive definition is well-founded.
node n1(const int p; float t1[3*p]) returns (float t2[p+1]);
For node n1 to be used without any restriction, we must prove or dynamically check (with assertions) that parameter p is in “general position”,
i.e. arrays t1 and t2 always have valid sizes whatever the positive value of
p. The following piece of code violates this rule, as p and m must hold the
same value and thus are not generalizable:
node n2_wrong(const int p,m; float t1[p]) returns (float t2[m])
let
t2 = t1;
tel
Also, for each call site of node n1, we must check that the actual parameter for p is a positive integer. For the equality constraints between actual and formal array sizes, we simply employ syntactic equality, up to simplification, partial evaluation and constant propagation. This solution has proved to cover our usages so far, but we could implement a cleverer and more user-friendly equality check between size expressions e1 and e2: first, check that the parameters are the same in both expressions, as this would otherwise mean that some parameters are dependent on others and thus not generalizable; second, exploit the decidability of polynomial equality (over integer coefficients) to perform the check. Of course, this only works if both expressions are reducible to polynomial forms, which is expressive enough.
Our solution avoids more general and more complex type systems such
as HM(X) [86] where type constraints are gathered and propagated throughout expressions. In our particular case, it would amount to maintaining
and solving integer arithmetical constraints between array sizes. The gain
in expressivity doesn’t seem appreciable enough, not to mention potential
decidability issues.
Compiling a node with symbolic array sizes still yields a single, C99-compliant piece of code, whereas other compilers choose to duplicate code for each node instantiation and then check the instantiated sizes. Our sizes are checked statically, once and for all, and this policy preserves modular compilation. However, this approach fails to generate correct C code (due to limitations of the C language) if the node memory itself contains an array with a symbolic size. In this case, we fall back to a mere C pointer representation and must insert a (single!) type cast to convert the memory pointer to a local sized array variable.
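As an illustration, a node interface with symbolic sizes such as n1 above can be compiled into a single C99 function whose array parameters are variable-length arrays sized by the const parameter. The following sketch is only indicative (the function name, the absence of a memory argument and the placeholder body are our own assumptions, not the exact lustrec output):

void n1_step(const int p, const float t1[3 * p], float t2[p + 1])
{
  /* placeholder body standing for the compiled equations of n1;
     declaring p first lets the symbolic sizes 3*p and p+1 appear in the
     parameter declarations of t1 and t2 (C99 VLA parameters) */
  for (int i = 0; i < p + 1; i++)
    t2[i] = (i < 3 * p) ? t1[i] : 0.0f;
}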
Also, since it is not a standard Lustre feature anyway, we forbid type polymorphism, as it would introduce boxing/unboxing of values, type casts or code duplication in our compilation scheme, which would defeat both the simplicity and the reliability of code generation.
2.5.5 Clocking
The clock calculus is identical to the Lustre V6 one [7]. Folklore tells us
it is a dependent typing problem, but it is much more akin to existential or
singleton types, which can be readily supported by modern type systems 7 .
Clock calculus consists in building, for each node, a tree of sub-clocks occurring in the node equations. The root of this tree is the base clock of the node
and branches are built according to when and merge constructs. A sub-clock is represented by its branch:

base on K_1(c_1) on K_2(c_2) ... on K_n(c_n)

where base is the base clock, each c_i is a Lustre variable declared as a clock and K_i one of its possible values.
As far as we know, our implementation might be a bit more tolerant
than available Lustre compilers so as to allow more clock expressions in
the input/output signature of nodes.
2.5.6 Scheduling
The per-node scheduling of the different equations first computes a dependency graph between node variables, then performs a causality analysis to detect algebraic loops and finally computes a topological sort with priorities. The final outcome is a total ordering for the evaluation of normalized equations. We currently have two priority levels: if a variable occurring inside a tuple is chosen to be evaluated by the scheduler at a given step, then at the next steps priority is given to the other variables of the same tuple, which are chosen before other variables whenever possible (i.e. if dependencies allow it). Enforcing this priority ensures a better sharing of control-flow structures in the generated C code. Another possible intermediate priority level would be to privilege variables under the same clock, again for the benefit of a smaller control flow.
First, for each node a dependency graph is built from its normalized
equations. Vertices of this graph belong to three categories:
• input variables (represented as “#x”): these are read-only variables, not
meant to be scheduled. Read-only variables are: global constants, node
inputs and node current memories. They are included for optimization
purposes and detection of useless variables.
• instance variables (represented as “?Nuid ” and “!Nuid ”): these are dummy
variables, not meant to be scheduled either, that respectively represent
calls to (or returns from) node instances occurring in equations and
are used to factorize out graph dependencies. These variables are also
useful for presenting a more user-friendly diagnosis in case our compiler
flags a causality problem.
7. Such as Generalized Algebraic Data Types, cf. Chapter 5.
• normal variables: these are the scheduled variables: either local, output or to-be-updated memory variables.
The global roadmap is the following:
1. compute two dependency graphs: gm for dependencies between memory variables only; gn for the other dependencies between scheduled non-memory/memory variables.
2. check for a dependency cycle in gn , which likely indicates a causality
error and is flagged as such.
3. check for cycles in gm and break them if any, by introducing fresh
local variables. These cycles are legit as they indeed represent only
a dependency between current and next-time memory variables. For
instance:
x=pre (y+1); y=pre (x+2);
is treated as if we wrote:
lx=x; x=pre (y+1); y=pre (lx+2);
instead, with fresh variable lx . This latter equivalent program yields
an acyclic graph.
4. finally merge gn with gm , giving the full dependency graph.
Each normalized equation corresponds to edges of gn or gm as follows,
where an edge x → y specifies that the equation defining y must be scheduled
before the one of x.
Figure 2.5 summarizes the construction of the dependency graph from
normalized equations. Due to normalization, no expression contains node
calls, array or pre constructs as subexpressions. We respectively denote the
set of non-memory variables and memories occurring in an expression e as
V (e) and M (e).
Equation                      Edges in gn                                        Edges in gm
                              (for all m ∈ M(e) and y ∈ V(e))

x = pre e                     x → #m                                             m → x
                              x → #y   when y ∉ N.locs ∪ N.outs
                              x → y    otherwise

x̃ = N_uid(ẽ) every c          (for all x_i ∈ x̃)  !N_uid → x_i → ?N_uid           m → !N_uid
                              ?N_uid → #m,  ?N_uid → c
                              ?N_uid → #y   when y ∉ N.locs ∪ N.outs
                              ?N_uid → y    otherwise

x = e                         x → #m                                             m → x
                              x → #y   when y ∉ N.locs ∪ N.outs
                              x → y    otherwise

Figure 2.5 – Dependency graph from equations.
Dependencies to read-only variables (such as x → #m) are meaningless
with respect to causality and scheduling, they are gathered only to emit
warnings about dummy input or memory variables that are never used.
Node call (?Nuid ) and node return (!Nuid ) events are also an interesting
basis to perform fine-grained causality analysis in Simulink diagrams, as
shown in [52, 51], where control-flows and data-flows are really intricate and
interdependent.
From the final acyclic dependency graph, a total ordering of the schedulable variables is finally produced, which always respects dependencies and respects priorities whenever possible.
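The sketch below illustrates this kind of priority-aware topological sort; it is only a simplification of ours (adjacency matrix, a single static priority per variable, no distinction between the graphs gn and gm), intended to make the scheduling step concrete.

/* Simplified sketch (not the actual lustrec implementation).
   dep[i][j] != 0 means: the equation defining j must be scheduled before i.
   prio[i] holds a static priority; the actual scheduler updates priorities
   dynamically (e.g. when a tuple mate has just been scheduled). */
#define MAXV 64

int schedule(int n, const int dep[MAXV][MAXV], const int prio[MAXV], int order[MAXV])
{
  int done[MAXV] = {0};
  for (int k = 0; k < n; k++) {
    int best = -1;
    for (int i = 0; i < n; i++) {
      if (done[i]) continue;
      int ready = 1;                     /* all predecessors already scheduled? */
      for (int j = 0; j < n; j++)
        if (dep[i][j] && !done[j]) { ready = 0; break; }
      if (ready && (best < 0 || prio[i] > prio[best]))
        best = i;                        /* prefer the highest-priority ready variable */
    }
    if (best < 0) return -1;             /* remaining cycle: causality error */
    order[k] = best;
    done[best] = 1;
  }
  return 0;                              /* order[0..n-1] is a valid schedule */
}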
2.5.7 Generation of sequential intermediate code
To avoid introducing another specific syntax for our intermediate language, we adopt a C-like syntax to express control-flow structures and memory representation.
Let us assume a generic normalized node N , on which we apply our
modular code generation scheme, such that:
N.ins = {in_1, ..., in_a}    (of respective types {I_1, ..., I_a})
N.locs = {l_1, ..., l_b}     (of respective types {L_1, ..., L_b})
N.outs = {o_1, ..., o_c}     (of respective types {O_1, ..., O_c})
N.eqs = {eq_1, ..., eq_r}
Mems(N) = {m_1, ..., m_n}    (of respective types {T_1, ..., T_n})
Inst(N) = {(M_1, uid_1), ..., (M_p, uid_p)}

Moreover, we assume the scheduling phase produced the following sequence of variables, a permutation σ of the variables belonging to N.locs ∪ N.outs:

Sched(N) = {x_σ(1), ..., x_σ(m)}
Memory Representation
The memory representation of a node, according to the tree structure of
Section 2.5.2, consists in a record containing its immediate memories, plus
a reference 8 to a record for every node call instance. From node N , we
generate the following type definitions:
struct N_mem {
  struct N_reg {
    T1 m1;
    ...
    Tn mn;
  } _reg;
  struct M1_mem *ni_uid1;
  ...
  struct Mp_mem *ni_uidp;
};
Code Generation
Code generation, as in [7], consists in providing two procedures per node N. The first procedure N_reset takes as input the current memory of an N instance and recursively resets every node call instance in N's equations, until it ultimately reaches the special node "Arrow" (cf. Section 2.5.2). The procedure N_reset is used to initialize a node before any reaction occurs and to reset node calls when needed (through every constructs). The second procedure N_step computes one reaction step, taking the current inputs and memory of an N instance and producing the current outputs and an updated memory. Outputs and memory are passed by reference.
We start by illustrating in Listing 2.5.5 our scheme on the node “Arrow”,
which outputs the value of its internal boolean state _first. This state is initialized to true and then becomes false after the first reaction, unless it is reset.
As for our generic node N , its equations are directly translated as a
sequence of variable assignments. Each assignment is possibly a branch of
a conditional statement, where the condition depends upon the assigned
variable's own enumerated clock. Assignments are still high-level constructs, as they concern variables of any type (for instance, arrays of any dimension and size), not only base types. As this phase is no more than a copy of an already published algorithm, we sum it up rather sketchily. First, for each clocked equation "x_ck = e", where ck is the clock of x, the main translation function Code_ck calls the function Control, which takes care of the original equation clock ck and then generates a piece of "unclocked"
8. References specifically allow modular compilation.
struct _arrow_mem { struct _arrow_reg { _Bool _first; } _reg; };

void _arrow_reset(struct _arrow_mem *self) {
  self->_reg._first = 1;
  return;
}

void _arrow_step(_Bool *output, struct _arrow_mem *self) {
  *output = self->_reg._first;
  if (self->_reg._first) { self->_reg._first = 0; };
  return;
}
Listing 2.5.5 – Encoding the node “Arrow”.
code, which is in turn translated into sequential code through the recursive translation function Code. Translation of pure expressions is omitted. Non-memory variables keep their names, whereas each memory variable m turns into "self->_reg.m", according to Section 2.5.7.
Function(argument) ⟶ Result

Code_ck(x_ck = e) ⟶ Control(ck, Code(x = e))

Control(ck on Ki(c), code) ⟶ Control(ck,
                                switch (c) {
                                  case K1: break;
                                  ...
                                  case Ki: code; break;
                                  ...
                                  case Kn: break;
                                })

Control(base, code) ⟶ code

Code(x = e when Ki(ck)) ⟶ Code(x = e)

Code(x = pre e) ⟶ self->_reg.x = e;

Code(x = e) ⟶ x = e;

Code(x̃ = M_uid(ỹ) every c) ⟶ if (c) {
                                M_reset(self->ni_uid);
                              }
                              M_step(y1, ..., yn, &x1, ..., &xp, self->ni_uid);

Code(x = if (c) then e1 else e2) ⟶ if (c) {
                                      x = e1;
                                    } else {
                                      x = e2;
                                    }

Code(x = merge (c) (K1 -> e1) ... (Kn -> en)) ⟶ switch (c) {
                                                  case K1: x = e1; break;
                                                  ...
                                                  case Kn: x = en; break;
                                                }
Then, all these equations must be scheduled. We first have to produce an evaluation order for equations, obviously according to how variables are scheduled. There is however a small difference between equations and variables, as an equation left-hand side may contain more than one variable. An evaluation order is a permutation π of the node equations {eq_1, ..., eq_r} compatible with the schedule σ, i.e. such that:

∀ i < j ∈ [1, r], ∃ a ∈ [1, m], x_σ(a) ∈ lhs(eq_π(i)) ∧ (∀ b ∈ [1, m], x_σ(b) ∈ lhs(eq_π(j)) ⟹ a < b)

This formula expresses the fact that equations are scheduled at the earliest possible position, i.e. according to the earliest variables of their left-hand sides.
Finally, we are able to produce the reset and step procedures, as shown
in Listing 2.5.6.
void N_reset(struct N_mem *self) {
  M1_reset(self->ni_uid1);
  ...
  Mp_reset(self->ni_uidp);
  return;
}

void N_step(I1 i1; ...; Ia ia; O1 *o1; ...; Oc *oc;
            struct N_mem *self) {
  L1 l1;
  ...
  Lb lb;
  Code_ck(eq_π(1));
  ...
  Code_ck(eq_π(r));
  return;
}
Listing 2.5.6 – Sequential code for a node N .
As several equations paced on the same clock may be scheduled one right
after another, the resulting sequence of C switches (on the same condition)
is merged into a single switch, gathering all branches.
Original code excerpt:

  ...
  switch (c) {
    case K1: code1; break;
    ...
    case Kn: coden; break;
  }
  switch (c) {
    case K1: code1'; break;
    ...
    case Kn: coden'; break;
  }
  ...

Merged code:

  ...
  switch (c) {
    case K1: code1; code1'; break;
    ...
    case Kn: coden; coden'; break;
  }
  ...
Moving code_i' right next to code_i may trigger another round of code merging, as they may both contain a switch statement on the same condition.
2.5.8 Code Optimization
We implemented several classical optimizations. These optimizations
may be freely enabled or disabled by the user. They are all compatible with
verification activities.
• Variable inlining: first, we inline a variable when the duplication of its
defining expression doesn’t cost much. This may also trigger constant
propagation and further simplifications.
• Variable recycling: second, we try to exploit variable reuse. Yet, for the sake of traceability and good coding rules, we only replace variables type for type. In the generated sequential code, dead variables may obviously be reused, so we perform a liveness analysis at each program point. But we also exploit clock information to detect mutually exclusive variables, i.e. variables that cannot be live in the same time frame (depending on the dynamic values of clocks). A variable may thus be replaced by another "live" one, from another time frame. Our current experimental policy is to try to choose as a replacement candidate a live variable not too far away in the clock tree (in terms of ultra-distance), or a dead one as a fallback. This policy obviously saves more variables than dead-variable reuse alone and really shines with clock-heavy Lustre programs, such as the ones produced by complex nested automata. Reusing variables is subject to further constraints: an input variable may even be reused (and thus overwritten) if and only if its type is not aliasable, i.e. is not an array (or struct) type. This is due to the argument passing mode of the C language for arrays, which is by reference. Hence, the reuse policy for a node doesn't depend upon the possible contexts of the node calls, where this aliasable input may be dead or live.
• Enumerated type elimination: third, we perform a kind of “cut elimination” for variables belonging to an enumerated type (e.g. clocks). It
means that we try to merge introduction cases (assignments of variables to enumeration constants) with elimination cases (switch cases
depending upon this variable). Again, this is useful when trying to
efficiently compile clock-heavy programs.
Let us illustrate our policy on the toy program of Listing 2.5.7:
First, the underlying clock tree of node test is:
type choice1 = enum { On, Off };
type choice2 = enum { Up, Down };
node test (x: int) returns (y: int)
var c: choice1 clock; d: choice2 clock; b1, b2, b3, z: int;
let
  c = if 0 = x when Up(d) then Off else On;
  d = if x > 0 then Up else Down;
  b1 = 1 when On(c);
  b2 = 2 when Off(c);
  b3 = 3 when Down(d);
  y = merge d (Up -> z) (Down -> b3);
  z = merge c (On -> b1) (Off -> b2);
tel
Listing 2.5.7 – A Lustre example suited to optimization purposes.
base
├─ base on Up(d)
│   ├─ base on Up(d) on On(c)
│   └─ base on Up(d) on Off(c)
└─ base on Down(d)

where base denotes the activation clock of test.
Second, the dependency graph links the following clocked variables, according to the data dependencies of Listing 2.5.7:

x^base, d^base, c^(base on Up(d)), b1^(base on Up(d) on On(c)), b2^(base on Up(d) on Off(c)), b3^(base on Down(d)), z^(base on Up(d)), y^base

where x^c denotes a variable x with clock c.
Third, a schedule is computed:

d; c; b3; b1; b2; z; y
and a liveness analysis is performed, yielding the following death table, where v ↦ set means that as soon as variable v is evaluated, all the variables in the set become dead (as well as the dead variables gathered by previous evaluations) and may be reused to replace v:

d  ↦ {x}
c  ↦ {}
b3 ↦ {}
b1 ↦ {}
b2 ↦ {}
z  ↦ {b1, b2, c}
y  ↦ {z, b3, d}
In our example, if x was not reused before, when z is evaluated the variables {x, b1, b2, c} are all dead and freely reusable. Since we may only reuse variables of the same type, b3 is the first variable that may reuse x. Reusing the maximum number of dead variables would give for instance the following updated schedule, where input x is reused, b1 occurs twice and y is not substituted (although it could be) since we don't want output variables to be assigned more than once:

d; c; x/b3; b1; b2; b1/z; y
Finally, the node test may be evaluated with 6 variables instead of 8 (6
local variables, plus 1 input and 1 output), with the help of very classical
techniques. But we may even go further by reusing live but time-frame
disjoint variables. For instance, when evaluating b2 , b1 is live but active in
another time frame, i.e. when clock c has another value. Therefore, we may
freely reuse b1 in place of b2 . In our example, b1 is disjoint from b2 and
c,z,b1 ,b2 are disjoint from b3 .
Thus, applying our policy at its full potential yields the following, much more variable-saving schedule, with only 4 variables left:

d; c; x/b3; x/b1; x/b2; x/z; y
We illustrate several optimization levels in Listing 2.5.8. Here, because we
work on a toy example, constant propagation coupled with “cut elimination”
alone renders our advanced variable reuse scheme useless. This is not the
case anymore for more complex examples with automata for instance.
As a conclusion, we stress the fact that these optimizations are fully and
seamlessly supported by our verification framework. This contrasts with
correct-by-construction approaches, such as [24], where interesting optimizations are hard to prove correct.
void test_step (int x, int (*y)) {
  int b1;
  int b2;
  int b3;
  choice1 c;
  choice2 d;
  int z;
  if (x > 0) {
    d = Up;
  } else {
    d = Down;
  }
  switch (d) {
    case Down:
      b3 = 3;
      *y = b3;
      break;
    case Up:
      if (0 == x) {
        c = Off;
      } else {
        c = On;
      }
      switch (c) {
        case Off:
          b2 = 2;
          z = b2;
          break;
        case On:
          b1 = 1;
          z = b1;
          break;
      }
      *y = z;
      break;
  }
  return;
}

(a) Unoptimized code.

void test_step (int x, int (*y)) {
  choice1 c;
  choice2 d;
  if (x > 0) {
    d = Up;
  } else {
    d = Down;
  }
  switch (d) {
    case Down:
      x = 3;
      *y = x;
      break;
    case Up:
      if (0 == x) {
        c = Off;
      } else {
        c = On;
      }
      switch (c) {
        case Off:
          x = 2;
          x = x;
          break;
        case On:
          x = 1;
          x = x;
          break;
      }
      *y = x;
      break;
  }
  return;
}

(b) Variable reuse only.

void test_step (int x, int (*y)) {
  if ((x > 0)) {
    if (0 == x) {
      *y = 2;
    } else {
      *y = 1;
    }
  } else {
    *y = 3;
  }
  return;
}

(c) Full optimization.

Listing 2.5.8 – Several optimization levels from Lustre source 2.5.7.
2.5.9 Targetting C Code
From the previous intermediate sequential code, we generate by default
a C99 compliant code. Assignments for structured types are translated into
nested loops and/or loop unrolling. Several side options may be activated:
standard floating point numbers may be replaced with multiple precision
numbers provided by the MPFR [42] library; a dynamic memory allocation
may replace the static one. Modular static memory allocation is provided
through a system of C macros. Another specificity of “real” code is that
an executable file must be produced. For that purpose, a main node must
be chosen and its code wrapped in a piece of code that computes reactions
forever, while interacting with its environment, as shown in Listing 2.5.9.
We choose to elide the presentation of this phase, as it is more akin to a
mere transliteration from our simple sequential language to the C language.
Moreover, we have already presented our internal sequential language in a
C-like syntax.
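For instance, an intermediate-code assignment between two small array variables may be emitted either as nested loops or fully unrolled; the sketch below (with hypothetical names and a fixed 3×2 size, not the exact lustrec output) shows both options:

/* Assignment  y = x;  between two 3x2 arrays, emitted as nested loops... */
void copy_loops(const double x[3][2], double y[3][2])
{
  for (int i = 0; i < 3; i++)
    for (int j = 0; j < 2; j++)
      y[i][j] = x[i][j];
}

/* ...or fully unrolled, which may be preferable for small static sizes. */
void copy_unrolled(const double x[3][2], double y[3][2])
{
  y[0][0] = x[0][0]; y[0][1] = x[0][1];
  y[1][0] = x[1][0]; y[1][1] = x[1][1];
  y[2][0] = x[2][0]; y[2][1] = x[2][1];
}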
2.5.10 Targetting Horn Clauses
Another possible target used for formal verification purposes is a first-order predicate encoding, specifically as Horn clauses. The use of Horn
clauses as intermediate representation for verification was proposed in [38],
with the verification of concurrent programs as the main application. More
recently, Horn clauses have been used in a tool dedicated to verification of C
code [39]. The underlying procedure for solving sets of recursion-free Horn
clauses, over the combined theory of Linear Rational Arithmetic (LRA) and
Uninterpreted Functions (UF), was presented in [37]. A further application of Horn clauses is their use as an inter-procedural analysis and exchange format for verification problems, supported by the SMT solver Z3 [23, 45]. We refer to the traditional definition of transition systems in model checking
refer to the traditional definition of transition systems in model checking
techniques. A detailed description of transition systems for Lustre programs can be found in [40].
Predicates encode properties, initial states and transition relations of
Lustre nodes. Horn clauses are depicted as “ rule (=> Body Head)”, where
Body is a conjunction of expressions and Head is the predicate being defined
by this clause. Putting verification aside, a straightforward use of such encoding could be to detect uninitialized Lustre programs, a task for which
the lack of a specific phase in our compiler may have been noticed by the
seasoned reader. An initialization problem corresponds to an under-specified behavior, i.e. to a non-deterministic program. The property of being
deterministic easily translates into a formula, quantifying on inputs, states
and outputs, that can be checked valid with appropriate tools. Thus, initialization checking can be performed with our Horn encoding, as a first
and meaningful application. Other verification-oriented applications will be
int main (int argc, char *argv[]) {
  /* Declaration of inputs/outputs variables */
  I1 i1;
  ...
  Ia ia;
  O1 o1;
  ...
  Ob ob;
  /* Main memory allocation */
  main_ALLOC(static, main_mem);
  /* Initialize the main memory */
  main_reset(&main_mem);
  /* Infinite loop */
  while (1) {
    /* Read inputs */
    i1 = _get_O1("i_1");
    ...
    ia = _get_Oa("i_a");
    /* Compute reaction */
    main_step(i1, ..., ia, &o1, ..., &ob, &main_mem);
    /* Write outputs */
    _put_O1("o_1", o1);
    ...
    _put_Ob("o_b", ob);
  }
  return 1;
}
Listing 2.5.9 – Generation of main code.
presented in Chapters 3 and 4.
It turns out that our Horn encoding closely resembles code generation
and doesn’t need much change, because generated code, during each step,
assigns a value to each variable exactly once. This simple code structure is
close to Static Single Assignment 9, yet requires disabling the optimizations that would reuse variables. This doesn't mean that the generated C code is doomed to be totally naive, as we are able to prove that optimized C code still conforms
to its specification as a set of Horn clauses.
As for the predicate syntax, we choose to conform to the extension of
the standard SMT-LIB(v2) provided in the tool Z3 [23] because it supports
Horn clauses as well as structured datatypes (such as arrays and records)
and also enumeration types (and even general algebraic datatypes, for that
matter).
Enumerated Clocks
Enumerated types, as typically used in conjunction with clocks, are translated into a similar datatype in SMT-LIB. From the type declaration:
type T = enum { K1, ..., Kn };
we get:
(declare-datatypes () ((T K1 ... Kn)))
One fundamental property of such datatypes is that expressions of type
T are forced to take values in {K1 , . . . , Kn }, so that for instance one can
prove that n + 1 values of type T cannot be all pairwise different.
Memory Representation
The SMT-LIB encoding keeps the original tree structure. For a struct
type, there is only one constructor with as many arguments as struct fields.
Fields are turned into accessor functions. In this case, the fundamental
property is that any expression e of a struct type equals the “product” of its
field values, i.e.:
(= e (struct (field_1 e) ... (field_s e)))
This is crucial since we are then able to specify a whole struct as a compound
specification of its fields, each in isolation, preserving our modular approach.
We translate the material of Section 2.5.7 into the following type declarations:
9. Except for conditional statements, where several assignments may correspond to the same variable.
(declare-datatypes ()
  ((type_N_reg (struct_N_reg (m1 T1)
                             ...
                             (mn Tn)))))
(declare-datatypes ()
  ((type_N_mem (struct_N_mem (N_reg type_N_reg)
                             (ni_uid1 type_M1_mem)
                             ...
                             (ni_uidp type_Mp_mem)))))
Predicate Encoding
Predicates here denote transition relations which relate current inputs
and node memory “self” to current outputs and updated memory “selfp”.
Non-memory variables keep their names. Memory variables stand for the
current memory when read and the updated one when written. Therefore,
depending upon its usage, each memory variable m of node N turns into "(m (N_reg self))" or "(m (N_reg selfp))". The free variable selfm_uid occurring in the node call encoding corresponds to an intermediate state, which takes either the value specified by the reset predicate in case of reset, or self otherwise.
Function(argument) ⟶ Result

Horn_ck(x_ck = e) ⟶ HControl(ck, Horn(x = e))

HControl(ck on K(c), pred) ⟶ HControl(ck, (=> (= c K) pred))

HControl(base, pred) ⟶ pred

Horn(x = e when Ki(ck)) ⟶ Horn(x = e)

Horn(x = pre e) ⟶ (= (x (N_reg selfp)) e)

Horn(x̃ = M_uid(ỹ) every c) ⟶ (and (=> (= c true) (M_reset selfm_uid))
                                   (=> (= c false) (= selfm_uid (ni_uid self)))
                                   (M_step selfm_uid ỹ (ni_uid selfp) x̃))

Horn(x = if (c) then e1 else e2) ⟶ (and (=> (= c true) (= x e1))
                                        (=> (= c false) (= x e2)))

Horn(x = merge (c) (K1 -> e1)
     ...
     (Kn -> en)) ⟶ (and (=> (= c K1) (= x e1))
                        ...
                        (=> (= c Kn) (= x en)))
Finally, mimicking Listing 2.5.6, we are able to produce the reset and
step procedures as Horn clauses, as shown in Listing 2.5.10. One main difference is the absence of explicit local variable declarations, departing from the
tradition of existentially quantified variables in standard predicate encoding.
Rather, local variables freely appear in the body of the N_step rule. Another difference is that, although we use the equation ordering π, we don't need to encode equations in any specific order, other than for better performance of tools such as Z3. Finally, the N_reset rule appears non-deterministic as it doesn't specify the local memory of a node being reset. However, its usage in
doesn’t specify the local memory of a node being reset. However, its usage in
a node call (through an every construct) cannot produce a non-deterministic
step rule, since local memory is not meaningful when computing the current
outputs and updated memory of a reset node call.
(declare-rel N_reset (type_N_mem))
(rule (=> (and (M1_reset (ni_uid1 self))
               ...
               (Mp_reset (ni_uidp self)))
          (N_reset self)))

(declare-rel N_step (type_N_mem I1 ... Ia type_N_mem O1 ... Oc))
(rule (=> (and Horn_ck(eq_π(1))
               ...
               Horn_ck(eq_π(r)))
          (N_step self i1 ... ia selfp o1 ... oc)))
Listing 2.5.10 – Horn encoding for a node N .
Trace Semantics
From the above definitions, we may study the trace semantics of a given (main) node N, the basis for many analyses. For a fixed depth D, we encode the executions of N as a set of possible sequences of inputs, states and outputs of length D, by unrolling the transition relation D times. In order to let tools perform reasoning on these traces and validate properties, we are committed to declaring all intermediate inputs, states and outputs as global symbols. This classical presentation appears in Equation 3.1 of Chapter 3.
On the contrary, in our Horn framework, we can also define inductively the set of reachable states, as displayed in Listing 2.5.11. The depth-counting part (shown in red in the original listing) is optional and allows one to control the depth of traces under study, if desired. Tools such as Z3 can automatically unroll such definitions so that they are equivalent to the classical presentation. Then, these tools apply reasoning techniques such as k-induction [57] and PDR [45] to check satisfiability of N_reach.
(declare-rel N_reach (Int type_N_mem))
(rule (=> (and (= depth 0)
               (N_reset self))
          (N_reach depth self)))
(rule (=> (and (> depth 0)
               (N_reach (- depth 1) self)
               (N_step self i1 ... ia selfp o1 ... oc))
          (N_reach depth selfp)))
Listing 2.5.11 – Trace semantics for a node N .
2.6 Perspectives
We recall that we are not so much interested in developing a competitive compiler per se as in experimenting with tightly coupled code generation and verification activities. As such, we resort to classical compilation solutions for classical constructs, so that the applicability of our verification techniques is maximal: they should easily carry over to other compilers and similar languages within the same realm of critical systems. In any case, to comply with critical-systems requirements, we must keep ourselves from generating complex pieces of code from fancy high-level constructs, which obviously curbs our ambition in this regard.
In terms of language alone, few things remain to be added, as we are inclined to favor automation of verification duties over expressivity for the end user. Adding records seems the most straightforward and useful task to perform, as many industrial examples use records and currently need to be rewritten. To enhance modularity, at both the code and verification levels, parameterized modules (as found in Lustre V6) would be a good fit and would also allow us to support type polymorphism in a clean way, while avoiding burdensome
value boxing and unboxing. Using ideas similar to our typing mechanism, a
more advanced clock calculus could be devised, as a first step towards more
expressive Signal-like clocks. One could employ boolean constraints between clock variables and use BDD-like normalization to compare two clock
expressions for equality.
As for code generation, more optimizations may be worth studying, as long as we are able to support them at the verification level. As an example, since the equation pattern "x = K -> pre(e);" is very commonplace (with K a constant), a pervasive optimization is to separate initialization from nominal reaction steps. The equation then becomes "x = pre(e);", whereas initialization is moved into the reset procedure, as "x = K;". In the end, we save a call to the node "Arrow". Also, Listing 2.5.8 contains an infeasible path,
i.e. nested conditions x > 0 and 0 = x that cannot hold at the same time.
Such dead code could be removed (or a warning could be issued) with techniques such as MC/DC coverage criteria coupled with SAT-solving tools, as
allowed by our Horn encoding. We address related concerns in Chapter 3.
Chapter 3
Testing-based Approach
In this chapter, we present a lightweight approach to compiler validation.
We build our validating compiler upon specification-based testing, which we
augment with a method for generating test cases targetting potential bugs
in a compiler. Figure 3.1 illustrates the overall idea of our techniques. When
translating a source program P to a target program Q, we want to assert
that the compilation preserves the semantics prescribed in P , as shown in
sub-figure (a). For a correct compilation we require that all expected behaviors of P are still present in Q, i.e., that the compilation has not introduced
any deviation from the expected behavior. We present two lightweight approaches to this challenge, using the same building blocks. First, we extend
our compiler to generate test cases targeted towards discovering differences in
the behavior of P and Q introduced by incorrect compilation. We show that
commonly used coverage criteria are not an optimal choice when generating
tests for this purpose. Instead we will use source-level program mutations to
emulate bugs in a compiler and for generating test cases geared towards detecting the introduced behavioral differences. As shown in sub-figure (b) the
generated tests are a lightweight certificate for the behavioral equivalence of
P and Q. With the same building blocks, we can approach the problem of
generating inputs for testing compilers: As shown in sub-figure (c), we can
use source-level mutation to fuzz input programs. We formalize the notion
of fuzzing using a measure for the introduced behavioral difference (∆). In
this scenario, we can use the test suites generated as certificates for single
translations to assert that a compiler works correctly in the neighborhood of P: We check that the behavioral difference between P and its mutant M is preserved in the corresponding compiled artifacts Q and Q′.
Technically, our approach for a compiler validation relies on the following
main components:
Initial test synthesis generates a test suite from the source program P
driven by a coverage criterion (e.g., MC/DC). Let T be such initial
test suite. A test case t ∈ T includes the input data and the expected
output (test oracle).

Figure 3.1 – Sketch of our lightweight compiler validation. (a) Challenge: asserting semantic equivalence (≙) when compiling P to Q; (b) Generating tests as a lightweight certificate for ≙; (c) Using ∆-neighborhoods to fuzz compiler inputs.
Test reinforcement via mutation generates a set of mutants M = {M_1, ..., M_n} from P. A mutant M_i of a program P is a program where a single syntactic mutation has been introduced. The set of mutants M is used to reinforce the initial test suite T into the lightweight certificate T_≙. This is done by finding test cases that distinguish the behavior of the original program P and its mutants M_i ∈ M.
Test Execution on the target evaluates the generated test suites automatically on the compiled source code and compares the result to the expected output (a minimal harness sketch is given after this list).
∆-neighborhood preservation checking evaluates test suites not only on the compiled program P, but also on the compiled versions (Q′) of the mutated programs M_i. For every test case, a discrepancy found at the source level (which acts as a semantic distance ∆) between M_i and P must exactly carry over to the compiled level, by comparing the results of executing M_i and P.
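The sketch below shows what a minimal test-execution harness on the target can look like: it replays one test case (an input sequence plus the expected outputs, i.e. the oracle) against the generated step function and reports any discrepancy. The node interface (top_reset, top_step, a single int input and output) is a hypothetical placeholder for whatever interface the compiler generates for the node under test.

#include <stdio.h>

/* Hypothetical interface assumed to be produced by the compiler. */
struct top_mem;
extern struct top_mem top_memory;
void top_reset(struct top_mem *self);
void top_step(int in1, int *out1, struct top_mem *self);

/* Replay one test case and compare each output against the oracle. */
int run_test(int len, const int in1[], const int expected[])
{
  int out1;
  top_reset(&top_memory);
  for (int i = 0; i < len; i++) {
    top_step(in1[i], &out1, &top_memory);
    if (out1 != expected[i]) {
      printf("step %d: got %d, expected %d\n", i, out1, expected[i]);
      return 1;   /* discrepancy: possible compilation bug */
    }
  }
  return 0;       /* observed behaviour matches the oracle */
}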
Summarizing, the central idea of this approach is threefold: (i) we verify that
the compiler did not introduce any difference between the source program
P and the compiled version Q. We do this by generating automatically test
suites from the source program P based on some coverage criterion. (ii) We
use mutations of the source program P to simulate bugs in the compiler, relying on the fact that test cases that differentiate mutants from the original program are likely to uncover errors in the translation of this program. (iii) We use a set of new input programs generated at a certain distance from the source program in order to check that this distance is also preserved after compilation.
We have implemented a prototypical version of such a lightweight validating compiler from the synchronous dataflow language Lustre to C. Our
initial experimental evaluation illustrates that the presented techniques were quite effective in validating the trustworthiness of our compiler lustrec.
3.1 Test-suite Generation for Compiler Validation
In this section, we describe our technique for generating test suites with a certain coverage criterion. In our case we use Modified Condition/Decision Coverage (MC/DC) [59] as the criterion during test generation. Our test generation technique relies on a bounded model checking (BMC) algorithm to produce a trace, i.e. a test case.
BMC is to interpret counterexamples as test cases. Then, a suitable test case
execution framework can extract from the counterexamples the test data and the expected results, i.e., the test oracle. A BMC procedure can be used to find
test cases by formulating a test criterion as a verification condition for the
BMC. A test purpose describes characteristics of a test case that should be
created. For example, it could describe the final state of the test case, or
a sequence of states that should be traversed. The test purpose is specified
in temporal logic and then converted to a trap property by negation. The
latter asserts that the test purpose never becomes true. The counterexample
illustrates how the trap property is violated, and thus shows how the original
test purpose is fulfilled.
Bounded model checking (BMC) allows, through the enumeration of predicates characterizing traces of increasing length, to build valid traces of the program S encoded as a transition system (see Section 2.5.10). When associated to a given condition C(s̃, ĩn, õut) over input variables ĩn, output variables õut and state variables s̃, the BMC algorithm will try to find a trace matching the condition at some point. That is, a valid finite trace of length n for S that satisfies the following expression:

S_reset(s̃_0) ∧ ⋀_{i=0..n} S_step(s̃_i, ĩn_i, s̃_{i+1}, õut_i) ∧ C(s̃_n, ĩn_n, õut_n)        (3.1)
In other words, a satisfiability check, using for instance an SMT solver, over the expression in (3.1) for a given n will produce a set of values for ĩn_i, s̃_i and õut_i for i ∈ [0 ... n]. In practice, tools unroll the transition relation one step at a time trying to meet the specific condition C. This can be done efficiently with an SMT solver by reusing previously computed states. The following definition formalizes the BMC algorithm.
Definition. Let Σ be a set of states, In be a set of inputs and Out be a set of outputs. For all states s̃ ∈ Σ, let S_reset(s̃) be the initial/reset predicate. For all s̃, s̃′ ∈ Σ, ĩ ∈ In, õ ∈ Out, let S_step(s̃, ĩ, s̃′, õ) be the transition predicate. For all s̃ ∈ Σ, ĩ ∈ In, õ ∈ Out, let C(s̃, ĩ, õ) be a property over the state,
input and output variables. We define bmc(S_reset, S_step, C, n) as the algorithm computing a trace satisfying Eq. 3.1 in n steps. We denote as

sat(n, (s̃_i)_{0≤i≤n}, (ĩn_i)_{0≤i≤n}, (õut_i)_{0≤i≤n})

an assignment of free variables satisfying Eq. (3.1). When such an assignment is unfeasible with n ≤ max_n, we denote by unsat the result of the call to the bmc algorithm.
In the next subsection, we specialize the generation of test cases using
MC/DC conditions as trap properties.
3.1.1 MC/DC as Conditions over Traces
The MC/DC coverage criterion is mainly used for the verification of the most critical parts of an embedded system. Coverage criteria are typically used not to synthesize tests but rather to evaluate the quality of a test suite, making sure it covers most of the source code. MC/DC coverage requires that a test case shows that each sub-expression of a boolean formula, as used in a guard, actually impacts the status of the boolean formula. Failing to reach this coverage may be a witness of potential dead code, i.e., code that is useless since it does not impact the expressions where it is used.
In order to achieve a good coverage over the activation condition of a
model M, one can gather those conditions, i.e. enumerating the paths, the
˜ s̃0 , out)
˜ and
disjunctions, in (the syntax of) the transition relation Mstep (s̃, in,
characterize a tree of constraints. An activation condition would be associated with a path in this tree. For example, considering the path [g1 ; g2 ; g3 ] of
˜ and out,
˜ an MC/DC like coverage would require
three predicates over s̃, in
to cover all possible values for atoms in those predicates.
We now show how to express the MC/DC criterion as a predicate C(s̃, ĩn, õut). First, a procedure is needed to extract the decision predicates from the source code. Coverage of each decision predicate is checked in isolation against a given global set of test cases. The principle is the following: from a decision P(v1, ..., vn), where the vi's are primitive predicates over the variables s̃, ĩn and õut, we have to exercise the value of each variable vi with respect to the global truth value of P, the other variables v_{j≠i} being left untouched. Precisely, we have to find two test cases for which, in the last element of the trace, vi is respectively assigned to False and True. Then, for each such test case, blindly changing the value of vi should also change the truth value of the global predicate. Formally, for a given decision P(v1, ..., vn), the set of conditions describing the last element of covering traces is:
\[
  \bigcup_{i \in 1..n} \Bigl\{\, v_i \wedge \bigl(P(v_1, \dots, v_n) \oplus P(v_1, \dots, v_{i-1}, \neg v_i, v_{i+1}, \dots, v_n)\bigr),\;
  \neg v_i \wedge \bigl(P(v_1, \dots, v_n) \oplus P(v_1, \dots, v_{i-1}, \neg v_i, v_{i+1}, \dots, v_n)\bigr) \,\Bigr\}
\]
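For instance, for the decision P(v1, v2) = v1 ∧ v2, the conditions obtained for i = 1 are v1 ∧ v2 and ¬v1 ∧ v2, while those obtained for i = 2 are v1 ∧ v2 and v1 ∧ ¬v2; covering them amounts to exercising the three classical MC/DC assignments TT, FT and TF of a conjunction.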
Note that not every condition generated will result in a test case (counterexample), since some parts of the model may be unreachable or specific
combinations of truth values unfeasible (for example, because of masking).
Therefore, expressions occurring in those portions of the model may not
achieve MC/DC coverage. Moreover, BMC may also not be able to find
a test case for some condition within an acceptable time limit. In such
cases, we conclude that the generated test suite does not reach the MC/DC
coverage.
3.2 Reinforcing Test Suite via Mutation Testing
We now present our test-suite reinforcement method based on techniques from mutation-based testing. In the following, we denote by the term mutant a mutated model or implementation in which a single mutation has been introduced. The considered mutations, which are grammar-based, do not change the control-flow graph or the structure of the semantics; each one either (i) performs an arithmetic, relational or boolean operator replacement; (ii) introduces an additional delay 1; (iii) negates a boolean variable or expression; or (iv) replaces a constant (an example is sketched below).
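For instance, the following hypothetical node (not drawn from our benchmarks) and its relational operator replacement mutant illustrate case (i):

node monitor (temp : int) returns (alarm : bool)
let
  alarm = temp > 30;   -- original guard
tel

node monitor_mut (temp : int) returns (alarm : bool)
let
  alarm = temp >= 30;  -- mutant: > replaced by >=
tel

The two nodes only disagree on input traces containing the value 30, so a test suite kills this mutant exactly when one of its test cases exercises that boundary.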
We propose a novel way to exploit the generated mutants in order to reinforce the test suite generated via the MC/DC coverage criterion. A mutant is either killed by the test suite, or the test suite is too weak to detect it. In the latter case, adding a new test case able to distinguish between the original source program and this yet-unkilled mutant makes the test suite stronger. Let P and M be respectively the original source program and the mutated one. Both are encoded with predicates, as described in Section 3.1. Then, the following formula Killer gives a formulation for differentiating M from P:
\[
\begin{array}{ll}
  \mathit{Killer}(P_{\mathit{reset}}, P_{\mathit{step}}, M_{\mathit{reset}}, M_{\mathit{step}}, \widetilde{in}, \widetilde{out}, \widetilde{out}', n) \;\triangleq &
    \exists \tilde{s}, \tilde{s}' :\; P_{\mathit{reset}}(\tilde{s}_0) \wedge M_{\mathit{reset}}(\tilde{s}'_0) \\
  & \wedge\; \bigwedge_{i=0}^{n} P_{\mathit{step}}(\tilde{s}_i, \widetilde{in}_i, \tilde{s}_{i+1}, \widetilde{out}_i)
    \;\wedge\; \bigwedge_{i=0}^{n} M_{\mathit{step}}(\tilde{s}'_i, \widetilde{in}_i, \tilde{s}'_{i+1}, \widetilde{out}'_i) \\
  & \wedge\; \widetilde{out}_n \neq \widetilde{out}'_n
\end{array}
\tag{3.2}
\]
Note that while the trace of P is built on the state variables s̃ and the one of M on s̃′, those state variables are independent of each other, and so are the output variables. However, the inputs are shared all along the two traces. A satisfying assignment of the variables of this formula characterizes two runs of P and M with the same input sequence ĩn, where different outputs õut_n and õut′_n are produced at step n. The input sequence ĩn is our new test case.
1. Note that introducing an additional delay could produce a program with initialization issues.
The satisfiability check of formula (3.2) may fail for two reasons: (i) no trace of length n can exhibit a difference between P and M, or (ii) M and P are observationally equivalent, potentially because the mutation occurred in dead code or in a portion of masked code. In the first case, we continue the search with an increased trace length. In the second case, formula (3.2) is used to prove that P and M are observationally equivalent using k-induction [85]. The latter is formulated as follows 2:
\[
\begin{array}{rl}
  \textit{Base case} := &
    \forall \tilde{s}, \tilde{s}', \tilde{t}, \tilde{t}', \widetilde{in}, \widetilde{out}_P, \widetilde{out}_M : \\
  & \quad P_{\mathit{reset}}(\tilde{s}) \wedge M_{\mathit{reset}}(\tilde{t})
      \wedge P_{\mathit{step}}(\tilde{s}, \widetilde{in}, \tilde{s}', \widetilde{out}_P)
      \wedge M_{\mathit{step}}(\tilde{t}, \widetilde{in}, \tilde{t}', \widetilde{out}_M) \\
  & \quad \Longrightarrow \widetilde{out}_P = \widetilde{out}_M \\[1ex]
  \textit{Inductive case} := &
    \forall \tilde{s}_1, \tilde{s}_2, \tilde{s}_3, \tilde{t}_1, \tilde{t}_2, \tilde{t}_3,
            \widetilde{in}_1, \widetilde{in}_2,
            \widetilde{out}^1_P, \widetilde{out}^2_P, \widetilde{out}^1_M, \widetilde{out}^2_M : \\
  & \quad P_{\mathit{step}}(\tilde{s}_1, \widetilde{in}_1, \tilde{s}_2, \widetilde{out}^1_P)
      \wedge M_{\mathit{step}}(\tilde{t}_1, \widetilde{in}_1, \tilde{t}_2, \widetilde{out}^1_M) \\
  & \quad \wedge\; P_{\mathit{step}}(\tilde{s}_2, \widetilde{in}_2, \tilde{s}_3, \widetilde{out}^2_P)
      \wedge M_{\mathit{step}}(\tilde{t}_2, \widetilde{in}_2, \tilde{t}_3, \widetilde{out}^2_M)
      \wedge \widetilde{out}^1_P = \widetilde{out}^1_M \\
  & \quad \Longrightarrow \widetilde{out}^2_P = \widetilde{out}^2_M
\end{array}
\tag{3.3}
\]
Figure 3.2 illustrates the general procedure used to reinforce the test suite based on mutation testing. The call to killMutant attempts to kill a mutant using the test suite generated via the MC/DC coverage criterion, while the call to genNewTest either produces a new test case based on a still-unkilled mutant, or returns a NULL result.
MC/DC Recast as a Mutation-Based Technique
Incidentally, we show that MC/DC testing can be seen as a mutation-based technique, which broadens the scope of our approach. The MC/DC criterion can be envisioned as a kind of transient mutation of some primitive predicate v nested in some condition P, negating its expected output at some point in time and thereby inducing a change of value at the condition level. This transient mutation can easily be recast as an ordinary one.
First, an auxiliary boolean input aux and a local boolean variable once_aux
are added to the original program:
once_aux = false −> pre (once_aux or aux);
It memorizes the fact that aux occurred at least once in the strict past. This is neutral with respect to the semantics of the original program.
Second, mutate some primitive predicate v, permanently replacing it:
v → v ⊕ (aux and not once_aux)
2. To keep the presentation simple, we expressed it as an inductive proof, i.e. a 1-induction.
Input: Υ: set of mutants, Θ: initial test suite
Output: Updated test suite Θ
proc reinforceTestSuite(Preset , Pstep , Υ, Θ) begin
foreach (Mreset , Mstep ) ∈ Υ do
killed := false ;
foreach t ∈ Θ ∧ ¬killed do
killed := killMutant(Preset , Pstep , Mreset , Mstep , t);
end
if ¬killed then
new_test := genNewTest(Preset , Pstep , Mreset , Mstep );
if new_test ≠ NULL then
Θ := Θ ∪ {new_test};
else
write “Inconclusive on mutant M”;
end
end
end
return Θ;
end
Figure 3.2 – Procedure to reinforce test suite generated via MC/DC coverage
criterion.
It follows that a change in the expected output of v is only possible at the first instant when aux equals true. Last, we try to generate some discriminating test cases between the original instrumented program and the mutated one. These test cases satisfy the MC/DC coverage criterion for the original program.
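As a hypothetical illustration, applying the whole recipe to a node whose single decision is y = a and b, and mutating the primitive predicate a, yields:

node decision_mut (a, b, aux : bool) returns (y : bool)
var once_aux : bool;
let
  once_aux = false -> pre (once_aux or aux);
  -- the predicate a is flipped exactly once, at the first instant where aux holds
  y = (a xor (aux and not once_aux)) and b;
tel

Any test case separating this mutant from the instrumented original exhibits an instant where flipping a alone changes the value of the decision, which is exactly the MC/DC requirement for a.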
3.3 Compiler Validation via ∆-neighborhood
In the previous section, we have focused on how mutants of an original source program can be used to generate test cases targeting bugs in a
compiler as a certificate for the correct translation of this program. In this
section, we discuss how we can use mutation to fuzz existing input programs
in order to test the correctness of a compiler in the neighborhood of these
programs.
Compilers are notoriously difficult to test for absence of bugs. This is due to their second-order nature (i.e., compilers are programs that operate on programs). The main challenge is generating meaningful input programs for exercising all interactions between the different features of an input language. There is no established automated (or semi-automated) method for generating sensible test cases based on some coverage criterion on the code of the compiler itself. For instance, MC/DC or path coverage at the compiler source level would not provide enough guidance in how to create test programs that actually compile. Generating random test cases seems ineffective for the same reason.
Mutation of source programs seems to be a good heuristic for generating meaningful inputs to a compiler (testing more behaviors than a single input program) for two reasons. First, mutations can be designed purely syntactically and still be guaranteed to result in a program that can be compiled. Second, except for the case of dead code, a mutation should lead to an observable difference in program behavior.
For an executable source language, we can check if differences in behavior are preserved through compilation. For a source language with formal
semantics, we can even use the source programs to search for differences (as
described in the previous section) and test their preservation. This allows us
to move from considering the correctness of the compiler on one particular
input to the correctness of a region of the input space of a compiler.
In order to quantify the additional coverage achieved by testing a mutated
version of a source program, we introduce the notion of distance between a
program and its mutated version. For the case of dataflow languages, the
well-known semantic notion of ultra-metric distance between two models,
i.e., the first instant at which output traces disagree, with respect to any
common given input trace, seems to be a good choice: Syntactic mutations
(and any distance measure based on these) will not necessarily be preserved
from source to target level in optimizing compilers.
Figure 3.3 illustrates the ultra-metric distance on the output flows generated by the two compared versions, the original model and the mutated one, on a specific provided test case. Inputs are named i0, ..., and outputs o0, .... In the example, the outputs differ after k steps, leading to a distance of k on this particular sequence of inputs. Please note that a small distance in this setup means that the two programs expose different behavior already for short test cases. Ideally, we want to create mutations at many distances.
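As a hypothetical illustration, take an original node computing alarm = temp > 30 and a constant replacement mutant computing alarm = temp > 32:

-- test case (input flow):  temp  = 10,    25,    31,    40,   ...
-- original outputs:        alarm = false, false, true,  true, ...
-- mutant outputs:          alarm = false, false, false, true, ...

On this input sequence the two output flows first disagree at the third instant, so the measured distance for this (mutant, test case) pair is 2 when instants are counted from 0; a test case never reaching the interval between the two constants would exhibit no difference at all.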
Testing a compiler now decomposes into two independent phases, (i) generating mutations from an original input program, and (ii) generating test
cases for the original program. The question of choosing sensible mutations
and test cases can probably not be answered in general. We used automatically generated MC/DC test cases, as they are supposed to cover the
different execution paths of the program well. As for the mutation part, we
stick to a simple random generation of structurally mutated programs, each
only containing one mutation. Ideally though, generation of test cases could
be designed to capture a good under-approximation of the distance between
original and mutated programs.
The semantic difference, if any, should carry over from source to target level. We compute an approximation of the distance by executing the original and mutated programs against the whole set of test cases.
Figure 3.3 – Characterization of the distance between the original model, a mutant and a given test. [Diagram: on a common input flow i0, ..., in, the original system (states s0, ..., sn, outputs o0, ..., on) and the mutant m (states s′k, ..., s′n, outputs o′k, ..., o′n) produce identical outputs up to instant k − 1 and first disagree at instant k, which defines ∆^m_t.]
Figure 3.4 – Evaluating the compiler on a source neighborhood. [Two panels: “Original model and its neighborhood” shows the source program src surrounded by mutants m1, m2, m3 at distances ∆^{m1}_{t1}, ∆^{m2}_{t2}, ∆^{m3}_{t3}; “Compiled versions” shows the compiled program tgt surrounded by the compiled mutants at the same distances.]
For each test case t and each mutant m, we compute an approximation ∆^m_t of the distance, simply by executing both the original and the mutant program in the reference semantics and stopping as soon as their respective outputs disagree or a trace-length limit is met. Then, we check that exactly the same discrepancy ∆^m_t occurs at the target level, according to the schema presented in Fig. 3.4. The test reinforcement process presented in Sect. 3.2 makes it possible to add new tests when the current test suite only shows a 0 distance between the original source and its mutant.
3.4 A Validating Lustre Compiler
We have implemented a prototypical version of the lightweight validating compiler from Lustre to C using the PKind model checker [55] and have performed a preliminary evaluation 3. Figure 3.5 gives an overview of the developed framework.

Figure 3.5 – Schematic view of a testing-based validating compiler from Lustre to C.

The lightweight validating compiler from Lustre to C consists of five main components: a modular compiler lustrec from Lustre to C, a test suite generator for Lustre programs, a grammar-based mutant generator for Lustre programs, a test suite enhancer, which extends test suites with test cases for killing mutants, and a validator, which executes the test suite on the compiled C program. The validator uses the input Lustre program as a test oracle. Test results are provided as output along with the compiled C program.
Group      |     | No. of | Mutants                 | No. of Tests           | Killed [%]
           |     | Prog.  | Total       Inconcl.    | MC/DC       Reinf.     | MC/DC   Reinf.
-----------+-----+--------+-------------------------+------------------------+---------------
MC/DC      | SUM | 408    | 74,096.00   24,136.00   | 31,355.00   11,216.00  | 23.84   15.14
           | AVG |        | 181.61      59.16       | 76.85       27.49      |
No MC/DC   | SUM | 464    | 85,587.00   52,019.00   | 49,908.00   5,879.00   | 20.67   6.87
           | AVG |        | 184.45      112.11      | 107.56      12.67      |
All        | SUM | 872    | 159,683.00  76,155.00   | 81,263.00   17,095.00  | 22.14   10.71
           | AVG |        | 183.12      87.33       | 93.19       19.60      |

Table 3.1 – Experimental results.
3.4.1 Experimental evaluation
We ran the lightweight validating compiler on a set of 872 Lustre benchmarks. For every benchmark, we use MC/DC conditions to generate basic
test suites. We then automatically generate a set of mutants (183 on average) for each benchmark. Table 3.1 shows the results of our experiments. We
break results down for programs on which we could achieve MC/DC coverage
and those on which we did not. For both groups (and their union) we report
the number of mutants that were generated, the number of mutants that did
not lead to new test cases, the number of test cases generated for MC/DC,
the number of additional new tests generated during reinforcement, and the
percentage of mutants killed by tests from MC/DC and reinforcement.
3. The prototypical implementation, benchmarks and results can be found at
https://bitbucket.org/lememta/sefm-14
Test suites generated via BMC guided by MC/DC conditions were able to achieve 100% MC/DC coverage on 408 of the benchmarks. In this group, test cases generated using BMC guided by the MC/DC conditions were able to kill 23.84% of the generated mutants (with a standard deviation of 0.17). On the remaining 464 benchmarks the MC/DC conditions could not be satisfied completely. For this group, test cases generated using BMC guided by the MC/DC conditions were able to kill 20.67% of the generated mutants (with a standard deviation of 0.20). Test cases generated using the procedure highlighted in Section 3.2 increased the performance of these basic test suites by 63.5% (for the MC/DC group) and by 33.2% (for the other group). In absolute terms, the combined test suites killed 38.98% of the mutants (respectively 27.54%). Table 3.1 additionally shows the combined results for the set of all examples. However, we are convinced that the data for the MC/DC group is most indicative of the actual performance of the approach: examples without MC/DC had dead code, which also led to much higher numbers of inconclusive (i.e., indistinguishable) mutants.
Finally, Figure 3.6 shows a more detailed view of the results. For every
experiment we show the percentages of mutants killed by MC/DC generated test cases and additional mutants killed by test cases generated using
genNewTest. The data set was sorted by the overall number of killed mutants.
Figure 3.6 – Killed mutants per Benchmark. Ordered by percentage of mutants killed.
3.5 Perspectives
In this chapter, we described a novel lightweight approach to validating compilers based on testing. Instead of verifying a compiler for all input programs or providing a fixed suite of regression tests, our technique generates test suites with high behavioral coverage that are geared towards the discovery of faults for single compiled artifacts. The generated test suites are automatically evaluated on the compiled program and provide a (lightweight) certificate for the correctness of the compiler on a concrete input. Using a set of mutants of the source program, we validate that a compiler works correctly in the ‘neighborhood’ of a source program: we check that the distance (or mutation) of the source program from its mutants is preserved after the compilation. We have developed a tool that implements the proposed techniques on top of the lustrec suite. Experimental evidence shows that the presented techniques are quite effective.
A strength of our approach lies in the independence of the different components: predicate encoding, test suite generation, ∆-neighbourhood preservation checking and finally the tested artifact (Lustre to C compiler) may
all be provided by different sources. We conducted experiments with our
own lustrec tool, either in Lustre to C or Lustre to Horn mode, with
the standard Lustre V4 suite [90] and with verification tools such as Z3 or
PKind [57]. The obtained reproducibility of results greatly strengthens the
confidence one can put in the correctness of the overall approach and of each
component in isolation.
As a next step we plan to assess the fault finding capabilities of the
generated test suites and compare these to other methods for generating test
suites (e.g., random testing). A more recent work by Whalen et al. extended MC/DC with a notion of observability (OMC/DC) [91]. Our approach is orthogonal to theirs: in principle, any coverage criterion can be used to generate the initial test cases. We plan to integrate the OMC/DC technique in our
validating compiler. Moreover, we plan to perform experiments with seeded
bugs in the Lustre to C compiler to confirm that the mutations that we
selected on Lustre programs mimic the effects of potential bugs in a compiler.
We also plan to investigate how the validation results can be quantified and
provide an estimate for the trustworthiness of the compiler.
Chapter 4
Translation Validation
4.1 Introduction
In this chapter, we are principally interested in leveraging the translation
validation approach, which roughly consists, from a source code, in generating a target code along with a proof of connection between both, i.e. a
(bi)simulation relation. The proof has to convey enough information for an
external tool to replay it. As the target code is definitely proved correct, this
is obviously more reliable than testing-based approaches, for which one can
always wonder whether a good and large enough test suite has been provided
in order to acquire confidence. We also translate high-level specifications (i.e.
synchronous observers [84]) down to Horn level and C code level. These
high-level specifications may be provided by the system designer or automatically synthesized by external tools with techniques inspired by Property
Directed Reachability [45]. The interested reader will find further explanations and examples in [54, 30, 32]. The current state of the art allows us,
for instance with the tool Spacer [60] or the more recent Kind2, to build
modular invariants, i.e. invariants specifically bound to nodes and automata
states. Proofs are not carried over, but are replayed at lower levels, with
some hints such as the needed auxiliary lemmas that were discovered and
the depth of the performed k-induction.
Figure 4.1 depicts the different components of our verification framework and how they relate to each other. “Denot. (Coq)” represents a denotational semantics of a Lustre source, which is easily obtained by a simple transliteration (i.e. with our Coq library Lustre, a Domain Specific Language). This denotational semantics, which serves as a reference semantics, is described in Section 4.2. “Oper. (Horn)” is an operational semantics, i.e. a transition system with reset and step predicates as provided by our lustrec Horn encoding, which supposedly conforms to the reference semantics. Section 4.3 addresses this conformance issue.
Section 4.4 deals with the problem of proving that the generated “C code”
complies with its operational specification, the last step required to establish
that our compiler generates valid code instances. Code annotations are used,
written in the ANSI/ISO C specification language ACSL and supported by
the framework Frama-C [22, 2, 71]. This framework is dedicated to static
analysis of C code, with several available “plugins”: simple dataflow analysis,
weakest precondition calculus, etc.
Another facet of our contribution is that, as a kind of double-check, we may wish to prove that “Invariant” properties, either translated from synchronous observers or synthesized from the Lustre source 1, still hold at the C code level. To achieve this, a synchronous observer, which is most of the time a stateful component, must also be given an operational semantics. Since observers are no more expressive than Lustre nodes, we may consider them as normal Lustre programs and use lustrec to extract their semantics. Variations around this scheme are discussed in Section 4.5. The proofs made at Horn level, mostly by k-induction (cf. Section 4.5.1), are replayed at the C code/ACSL level, as shown in Section 4.5.2.
Figure 4.1 – Overall verification framework. [Diagram: the Lustre source is translated to the denotational semantics “Denot. (Coq)” via the Coq library Lustre (Section 4.2) and, via lustrec (Section 2.5), to the operational semantics “Oper. (Horn)” and to the C code; the denotational and operational semantics are related in Section 4.3, the Horn encoding and the C code with its ACSL annotations in Section 4.4 (using Frama-C, Section 4.5.2); invariants, produced by Spacer (Section 4.6) or obtained from a synchronous observer compiled by lustrec (Section 4.5), are proved at the Horn level by k-induction (Section 4.5.1).]
4.2 A Denotational Reference Semantics
We develop in this section a denotational semantics for Lustre. It is meant to be used as a reference semantics that other semantics must conform to. Denotational characterizations, in terms of stream transformers or stream relations, have been a semantical basis for most synchronous languages [16, 5]. Although all these semantics share a fair amount of material (the synchronous hypothesis, bounded memory, an explicit mark for the absence of a signal (⊥), clocks of signals, etc.), they notably differ in being constructive or not. Most often, semantics for Lustre (or other related languages) are
1. Section 4.6 develops an example of invariant synthesis, using the tool Spacer.
defined constructively, whereas Signal semantics are not. Indeed, Signal
allows the definition of non-deterministic “nodes”, which doesn’t tip the scales
in favour of an explicit computation of outputs from inputs. This is not so
much a difference until the semantics is formalized in a proof assistant such
as Coq, in which case constructiveness is a fundamental feature. Recent
formalizations include [11, 76] (constructive) and [74] (non constructive). In
the synchronous paradigm, streams are naturally defined as a co-inductive
structure, instead of a more cumbersome choice where streams are functions
from instants to values. Therefore, having a constructive and co-inductive
characterization grants you a free interpreter, very close in nature to the
reactive loop of an operational semantics. As appealing as it sounds, we
nevertheless chose a more lightweight option:
• No ⊥ value, no fixed points. First, we get rid of the explicit signal
absence ⊥ as it has no counterpart, either in the “physical” world or in an
operational semantics as implemented by a compiler, which would complicate the proof of semantical equivalence. On the one hand, removing ⊥ also
practically prevents us from building a sensible constructive formalization,
since it would mean providing mandatory signal values even at instants when
signal is not active. From a semantical viewpoint, we would clearly be better
off not specifying spurious values. But, on the other hand, dropping constructiveness also means dropping ω-cpos and computations of fixed points
within an instant. These fixed points mimic the effect of following an evaluation schedule without computing it, but we would need an extra endeavor
to prove this assertion, without which there is no possible correspondence
between denotational and operational semantics.
Also, recursive definitions of variables through Lustre equations will
be avoided so as to deal with a semantics totally devoid of any program
fixed point. Definitional equations will be translated to weaker bisimulation
equivalences. This entails that every Lustre operator must be proved to be
a congruence with respect to bisimulation.
A non-constructive denotational semantics without ⊥ is thus easier to
establish and if constructiveness is needed, the operational semantics is still
available. As is, this relaxed situation allows us to define ill-formed Lustre
programs, potentially non-deterministic or even deadlocking. Neither plague will be dealt with in the denotational setting, because it is the duty of
the compiler to ensure that the purportedly equivalent operational semantics
is deterministic and non-deadlocking.
• Co-induction. Second, we keep the co-inductive flavor as we claim it
is more naturally suited to co-inductive (bi)simulation proofs. These proofs
will consist in relating instants of denotational streams to operational states,
through destructuring streams and unrolling transition relations. Note that
without ⊥, some Lustre operators must be defined axiomatically, such as
pre, when and merge.
4.2.1 Lustre Operators in Coq
We provide a library of streams: the Signal library, sketched in Listing 4.2.1, which enriches the one present in the standard Streams Coq library. For the sake of readability, we keep only some significant results and
omit their proofs. This library contains Str_apply and Str_lift that both
aim at building stateless stream functions and also zip/unzip to convert
tuples of streams to/from streams of tuples.
The ForAll operator is already defined in Coq and represents the “always” temporal operator on streams. The bisimulation relation between
streams is denoted by EqSt. Both definitions are found in Listing 4.2.2.
Then, our Lustre library shown in Listing 4.2.3 is built on top of the Signal library. The special node “Arrow” gains, in our denotational semantics, an extra reset parameter, which allows us to reinitialize it. The denotations ⟦·⟧ of some Lustre operators can also be found in Figure 4.2, where uppercase symbols (such as S) denote sets of streams. Note that,
without loss of generality, we restrict ourselves to the case of boolean clocks.
In the boolean case, the operators “merge” and “ if then else” have identical
denotations 2 . Enumerated types and clocks are easily encoded in Coq and
don’t deserve more attention.
⟦pre ⟧ ≜ λS. { s′ | (tl s′) = s, s ∈ S }

⟦if then else ⟧ ≜ λS_c, S_t, S_e. { s′ | (c|t|e|s′) ⊨ □(if c_i = true then s′_i = t_i else s′_i = e_i), (c, t, e) ∈ S_c × S_t × S_e }

⟦ when ⟧ ≜ λS, S_c. { s′ | (s|c|s′) ⊨ □(c_i = true ⟹ s′_i = s_i), (s, c) ∈ S × S_c }

⟦merge (true −> )(false −> )⟧ ≜ ⟦if then else ⟧
Figure 4.2 – Denotations of some Lustre operators.
Since we want to keep the translation to Coq as simple as possible and forbid ourselves to perform any syntactical alteration of the Lustre source, we don't need any property of the Lustre operators besides congruence properties. More clever properties might come in handy when devising automatic proof strategies, but we have not felt the need for them so far.
4.2.2 Lustre Nodes in Coq
We define the denotational semantics of a node as a relation between
input streams and output streams, to which we add a special reset boolean
stream, to account for resetting conditions of node calls. Each equation is
2. These operators only differ in terms of clock calculus.
Require Import Bool List.
Require Export Streams.
(* Value lifting *)
CoFixpoint Str_lift {A} (v : A) := Cons v (Str_lift v).
(* Stream is an Applicative functor *)
CoFixpoint Str_apply {A B} : Stream (A -> B) -> Stream A -> Stream B :=
fun f a =>
match f, a with
| (Cons f0 f’), (Cons a0 a’) => Cons (f0 a0) (Str_apply f’ a’)
end.
(* The ’next-time’ temporal operator *)
Definition next {A} (P : Stream A -> Prop) s :=
P (tl s).
(* The stream semantics of a state predicate *)
Definition now {A} (P : A -> Prop) : Stream A -> Prop :=
fun s => P (hd s).
Notation "x ~ y" := (EqSt x y) (at level 70, no associativity).
Notation "f @ x" := (Str_apply f x) (at level 65, left associativity).
Notation "! x" := (Str_lift x) (at level 60).
Notation "s |= [] P" := (ForAll P s) (at level 90).
Notation "s |= () P" := (next P s) (at level 90).
Definition zip {A B} (s : Stream A * Stream B) :=
!(fun a b => (a, b)) @ (fst s) @ (snd s).
Notation "x | y" := (zip (x, y)) (at level 61, left associativity).
Definition unzip {A B} (s : Stream (A * B)) :=
(!(@fst _ _) @ s, !(@snd _ _) @ s).
Lemma eqst_apply : forall {A B} {f f’ : Stream (A -> B)} a a’,
f ~ f’ -> a ~ a’ -> f @ a ~ f’ @ a’.
Lemma zip_fst_eqst : forall {A B} (s1 s1’ : Stream A) (s2 s2’ : Stream B),
(s1 | s2) ~ (s1’ | s2’) -> s1 ~ s1’.
Lemma fst_zip_eqst : forall {A B} (s1 : Stream A) (s2 : Stream B),
!(@fst _ _) @ (s1 | s2) ~ s1.
Lemma zip_unzip_eqst : forall {A B} (s : Stream (A * B)),
zip (unzip s) ~ s.
...
Listing 4.2.1 – The Signal library.
Variable A : Type.
CoInductive Stream : Type :=
Cons : A -> Stream -> Stream.
Definition hd (x:Stream) := match x with
| Cons a _ => a
end.
Definition tl (x:Stream) := match x with
| Cons _ s => s
end.
CoInductive EqSt (s1 s2: Stream) : Prop :=
eqst : hd s1 = hd s2 -> EqSt (tl s1) (tl s2) -> EqSt s1 s2.
Variable P : Stream -> Prop.

CoInductive ForAll (x: Stream) : Prop :=
  HereAndFurther : P x -> ForAll (tl x) -> ForAll x.
Listing 4.2.2 – An excerpt of the Stream library.
turned into a bisimulation constraint between both hand sides, and a conjunction is formed with all equations. Within an equation, each operator in each expression turns into its Lustre library equivalent, depending upon how Lustre types are turned into Coq types (Z, bool, etc), and each node call turns into a similar denotation form, depending upon its own inputs, outputs and reset. We denote by ⟦·⟧_L the translation of each type, expression, etc, in the Coq library Lustre, as shown in Figure 4.3, where op is a Lustre operator listed in the library Lustre.
Since our Horn encoding may contain free variables, such as local variables, we must quantify them existentially. The special reset stream, similarly to the N_reset function, propagates down the tree structure of node instances, until it reaches the only resettable base Lustre operator, the “Arrow” operator. Listing 4.2.4 contains the denotational semantics of the generic node N of Section 2.5.7. The denotation is a function N_denot from input (and reset) streams to output streams, to which we associate the main specification predicate, through the axiom N_denot_spec.
4.2.3 A Small Example
Let us consider the toy example of Listing 4.2.5.
Listing 4.2.6 shows a simplified denotational characterization of node f.
Note that the translation of equations is rather straightforward. The node
Require Import ZArith Bool Signal.
(* The special Arrow node: true -> false *)
Definition _arrow reset := Cons true (tl reset).
(* The if-then-else operator *)
Definition ite_at {A} (e_if : bool) (e_then e_else : A) :=
if e_if then e_then else e_else.
Definition ite {A} (s_if : Stream bool) (s_then s_else : Stream A) :=
!ite_at @ s_if @ s_then @ s_else.
(* Specification of pre: tl (pre s) ~ s *)
Variable pre : forall {A}, Stream A -> Stream A.
Axiom pre_spec : forall {A}, forall (s : Stream A), (tl (pre s)) ~ s.
(* Specification of when: forall i, c_i=true -> (s when c)_i = s_i *)
Definition when_rel_at {A} (si_ci_wi : (A * bool) * A) : Prop :=
match si_ci_wi with
| ((si, ci), wi) => if ci then wi = si else True
end.
Variable when : forall {A}, Stream bool -> Stream A -> Stream A.
Axiom when_spec : forall {A}, forall (s : Stream A) (c : Stream bool),
((s | c) | (when c s)) |= [] (now when_rel_at).
(* Specification of merge : forall i,
(merge c (true -> s) (false -> t))_i = if c_i then s_i else t_i *)
Definition merge {A} := fun c s t => @ite A c s t.
(* Boolean operators *)
Definition bor (s1 s2 : Stream bool) :=
!orb @ s1 @ s2.
...
(* Arithmetical operators *)
Definition iplus (s1 s2 : Stream Z) :=
!(fun i1 i2 => (i1 + i2)%Z) @ s1 @ s2.
...
(* Relational operators *)
Definition ieq_at (i1 i2 : Z) :=
match (i1 ?= i2)%Z with Eq => true | _ => false end.
Definition ieq (s1 s2 : Stream Z) :=
!ieq_at @ s1 @ s2.
...
(* Congruence properties *)
Theorem ite_cong_eqst : forall {A} c c’ (t t’ e e’ : Stream A),
c ~ c’ -> t ~ t’ -> e ~ e’ -> ite c t e ~ ite c’ t’ e’.
...
Listing 4.2.3 – The Lustre library.
⟦real⟧_L ≜ R
⟦int⟧_L ≜ Z
⟦bool⟧_L ≜ bool
⟦T[n]⟧_L ≜ {i | 0 <= i <= n} -> ⟦T⟧_L

(⟦e1 −> e2⟧_L reset) ≜ (ite (_arrow reset) (⟦e1⟧_L reset) (⟦e2⟧_L reset))
(⟦op(e1, ..., en)⟧_L reset) ≜ (⟦op⟧_L (⟦e1⟧_L reset) ... (⟦en⟧_L reset))
(⟦Muid(e1, ..., en) every c⟧_L reset) ≜ (Muid_denot (bor reset (⟦c⟧_L reset)) ((⟦e1⟧_L reset), ..., (⟦en⟧_L reset)))

Figure 4.3 – Denotation of types and expressions in Coq.
Variable N_denot : Stream bool
  -> (Stream ⟦I1⟧_L * ... * Stream ⟦Ia⟧_L)
  -> (Stream ⟦O1⟧_L * ... * Stream ⟦Oc⟧_L).

Definition N_denot_pred (reset : Stream bool)
  (inputs : Stream ⟦I1⟧_L * ... * Stream ⟦Ia⟧_L)
  (outputs : Stream ⟦O1⟧_L * ... * Stream ⟦Oc⟧_L) :=
  let (i1, ..., ia) := inputs in
  let (o1, ..., oc) := outputs in
  exists (l1 : Stream ⟦L1⟧_L) ... (lb : Stream ⟦Lb⟧_L),
       lhs(eq1) ~ (⟦rhs(eq1)⟧_L reset)
    ...
    /\ lhs(eqr) ~ (⟦rhs(eqr)⟧_L reset).

Axiom N_denot_spec : forall reset inputs,
  (N_denot_pred reset inputs (N_denot reset inputs)).
Listing 4.2.4 – Denotation of a Lustre node in Coq.
node f (x : int) returns (y, cpt : int)
let
y = x + 1;
cpt = (0 −> pre cpt) + 1;
tel
Listing 4.2.5 – A simple Lustre node.
f turns into a relation between input and output streams, with the addition
of a reset stream.
Variable f_denot : Stream bool -> Stream Z -> Stream Z*Stream Z.
Definition f_denot_pred reset x y_cpt :=
let (y, cpt) := y_cpt in
y ~ (iplus x (!1%Z))
/\ cpt ~ (iplus (ite (_arrow reset) (!0%Z) (pre cpt))
(!1%Z)).
Axiom f_denot_spec : forall reset x,
(f_denot_pred reset x (f_denot reset x)).
Listing 4.2.6 – Denotational semantics of Listing 4.2.5 in Coq.
4.3 Correctness of the Operational Semantics
We prove that the operational semantics is correct with respect to the
denotational reference semantics. We need to embed transition relations in
our denotational setting.
4.3.1 Operational Semantics in Coq
We omit the translation from our Horn encoding to the Coq language.
Simply, each operator in each expression turns into its Coq equivalent, depending upon how SMT-LIB types are turned into Coq types. Structured
memory types are easily encoded in Coq. Reset and transition predicates
are still Coq predicates, with the same interface. Again, the free variables
of our Horn encoding must be existentially quantified, if any.
The library Refine, presented in Listing 4.3.1, contains the results needed
to express the trace semantics of reset and step predicates and establish the
refinement between denotational and operational semantics. Proving that
the operational reset scheme is compatible with the denotational one is the
main concern. For that purpose, we extract the operational definition of
reset, the predicate resettable, which can be applied to any transition
relation. It enjoys the property of being “transitive”, meaning that two resets before a transition are equivalent to a single one, with a disjunction of
both reset conditions. This property allows us to relate resets to their denotational interpretation, where conditions are disjunctively gathered, through
the reset stream argument of Listing 4.2.4. Then, oper_to_denot gives a
denotation to a resettable transition relation: considering reset, input, output and memory streams, at each instant, the transition holds between two
successive memory states mem and (tl mem).
Since we make use of the temporal operator “always” 3 , we include necessary properties that fully leverage its Coq counterpart ForAll. We list in
Figure 4.4 the necessary rules written in a Natural Deduction logical style,
as well as their Coq names.
HereAndFurther:  from  P (hd s)  and  (tl s) ⊨ □P,  deduce  s ⊨ □P
ForAll_apply:    from  s ⊨ □P  and  s ⊨ □(P ⇒ Q),  deduce  s ⊨ □Q
ForAll_coind:    from  s ⊨ I,  (I ⇒ P)  and  (I ⇒ ◯I),  deduce  s ⊨ □P
ForAll_forall:   from  s ⊨ □(P ∘ hd),  deduce  ∀n. P(s_n)
necessitation:   from  ⊢ P  (P holds of every stream),  deduce  s ⊨ □P
next:            from  (tl s) ⊨ P,  deduce  s ⊨ ◯P

Figure 4.4 – Rules for the temporal operator □P.
4.3.2 A Small Example – Continued
We illustrate the semantical refinement on the example of Listing 4.2.5.
A simplified version of the operational semantics of node f appears in Listing 4.3.2. It is a direct adaption of the Horn encoding, with only few syntax
changes. Our compiler introduces the memory variable __f_2 and the node
instance ni_2 respectively denoting “pre cpt” and “−>”. Then, a refinement
theorem follows, which proves that the operational semantics respects the
denotational reference semantics. We don’t assume that node f is a main
node (i.e. reset from the outside world may occur).
Roughly, the proof principle of Theorem f_refinement consists in undoing the normalization process and rebuilding the original equations so
that they can be compared with the denotational formulation which directly
stems from these equations too. Before that, one must prove that its only
instance, a node “Arrow”, also satisfies the same refinement theorem. We
recall its operational definition, similar to Listing 2.5.5.
More generally, considering the node tree structure of an arbitrary node
N , all node instances must be handled before N is proved correct. Terminal
nodes, i.e. stateless nodes and node “Arrow”, are easily proved correct.
4.4 Correctness of Code
This section focuses on proving that a given C code produced by our
compiler lustrec satisfies its corresponding operational semantics given in
terms of Horn clauses. The framework Frama-C will be used to annotate
3. Recall that “always” also appears in the denotation of Lustre constructs.
(* Adding the posibility to reset a transition relation *)
Definition resettable {I M O} (Re : M -> Prop) (c : bool) Tr
(mem_in : M) (inputs : I)
(mem_out : M) (outputs : O) :=
if c then exists mem_m, Re mem_m /\ Tr mem_m inputs mem_out outputs
else Tr mem_in inputs mem_out outputs.
(* Resetting is ’transitive’ *)
Theorem resettable_transitive : forall {I M O} c c’ Re Tr
(mem_in : M) (inputs : I) (mem_out : M) (outputs : O),
resettable Re c (resettable Re c’ Tr)
<-> resettable Re (orb c c’) Tr.
(* The stream semantics of a resettable transition predicate *)
Definition now_tr {I M O} Tr (s : Stream (bool * M * I * M * O)) : Prop :=
match s with
| Cons (reset, mem_in, inputs, mem_out, outputs) _
  => Tr reset mem_in inputs mem_out outputs
end.
(* Denotational view of reset and transition predicates Re and Tr *)
Definition oper_to_denot {I M O} Re Tr (reset : Stream bool)
(inputs : Stream I) (outputs : Stream O) (mem : Stream M) :=
(reset|mem|inputs|tl mem|outputs)
|= [] (now_tr (fun reset => (resettable Re reset Tr))).
(* ForAll is an Applicative Functor (system K) *)
CoFixpoint ForAll_apply {A} {P Q : (Stream A -> Prop)} :
forall s, (s |= [] P) -> (s |= [] (fun s => P s -> Q s))
-> (s |= [] Q) :=
fun s box_p box_pq =>
match box_p, box_pq with
| (HereAndFurther p_s box_p’), (HereAndFurther pq_s box_pq’)
=> HereAndFurther s (pq_s p_s) (ForAll_apply (tl s) box_p’ box_pq’)
end.
(* The standard necessitation rule (system K) *)
CoFixpoint necessitation {A} {P : Stream A -> Prop}
(I : forall s, P s) (s : Stream A) : (s |= [] P) :=
HereAndFurther s (I s) (necessitation I (tl s)).
(* Point-wise equivalent of [] P *)
Theorem ForAll_forall {A} {P : A -> Prop} :
  forall s, (s |= [] (fun s => P (hd s))) -> forall n, P (Str_nth n s).
Listing 4.3.1 – The Refine library.
(* Operational semantics of node _arrow *)
Definition _arrow_reset (mem : bool) := mem = true.
Definition _arrow_step (mem_in : bool) (_ : unit) mem_out output :=
output = mem_in /\ mem_out = false.
Theorem arrow_refinement : forall reset input output mem,
oper_to_denot _arrow_reset _arrow_step reset input output mem
-> output ~ _arrow reset.
(* Simplified operational semantics of node f *)
(* memory = node arrow memory * pre cpt *)
Definition f_mem := (bool * Z)%type.
(* ni_2 is the node arrow *)
Definition ni_2 (mem : f_mem) := fst mem.
(* __f_2 is the memory pre cpt *)
Definition __f_2 (mem : f_mem) := snd mem.
Definition f_reset (mem : f_mem) := ni_2 mem = true.
Definition f_step mem_in x mem_out y_cpt :=
  match y_cpt with | (y, cpt) =>
    (* __f_1 is the output of ni_2 *)
    (* __f_3 is the if-then-else  *)
    exists __f_1 __f_3,
       (y = (x + 1)%Z)
    /\ (resettable _arrow_reset false _arrow_step
          (ni_2 mem_in) tt (ni_2 mem_out) __f_1)
    /\ (__f_3 = if __f_1 then 0%Z else (__f_2 mem_in))
    /\ (cpt = (__f_3 + 1)%Z)
    /\ ((__f_2 mem_out) = cpt)
  end.
Theorem f_refinement : forall reset x y_cpt mem,
oper_to_denot f_reset f_step reset x (zip y_cpt) mem
-> y_cpt ~ f_denot reset x.
Listing 4.3.2 – Operational semantics of Listing 4.2.5.
the code and check whether annotations hold. We want a fully automatic
proof process, so we must annotate very precisely every line of code, to guide
the tools. Since we are about to compare the C code with its operational
semantics, where different symbols share the same name, we shall rename
in this section the predicates N _step and N _reset as N _trans and N _init
respectively, to avoid ambiguities.
4.4.1 From Horn Clauses to ACSL Annotations
Let us consider, similarly to Section 2.5.7, a node N with r equations. A generic encoding of the transition relation N_trans can be found in Listing 2.5.10 of Section 2.5.10. We notice that we can generalize the step function so that it “executes” the k first equations in order π(1), ..., π(k), and then stops. Let us call it N_trans_k. It could be defined in SMT-LIB as:
(declare-rel N_trans_k (type_N_mem I1 ... Ia type_N_mem O1 ... Oc))
(rule (=> (and Hornck(eq_π(1))
               ...
               Hornck(eq_π(k)))
          (N_trans_k self i1 ... ia selfp o1 ... oc)))
Sketchily, provided the C memory representation conforms to the tree-structured scheme of Section 2.5.7, the property
(N_trans_k self i1 ... ia selfp o1 ... oc)
should hold after equation k has been executed. Moreover, speaking
about syntax, translation from SMT-LIB predicates to ACSL annotations
is rather straightforward. Still, what remains to be done is to give a closed
formula interpretation to Horn clauses (as they may contain free variables)
as well as to give a meaning to the variable selfp , which may depend on k.
These issues are addressed in the following sections.
4.4.2 Memory Representation
We first need a way to distinguish between current and updated memory
in ACSL annotations, which cannot be done with the memory representation
of the C code, passed by reference and overwritten as execution proceeds.
Furthermore, even if we could hack our way up to a representable transition,
the memory model of low-level languages like C can dramatically complicate
the task of verification tools, by obfuscating simple operations, even in our
static memory scheme.
We propose to build a ghost memory representation in ACSL 4 , with the
same tree structure but without any pointers, only structs within structs. For
4. Ghost types and variables are akin to logical variables as used in Hoare axiomatic
specifications.
a node N , we then consider a relation (N _ghost) between real (N _mem)
and ghost (N _mem_ghost) memory, which amounts to equating corresponding struct fields of both memories, disregarding pointers. This correspondence is indeed a relation of simulation of real memory states by ghost
memory states, as pictured in Figure 4.5. The stateful node “Arrow” also
has its ghost memory.
Figure 4.5 – The ghost simulation relation. [Commuting square: the ghost memories mem_in and mem_out are related by N_trans, the real memories self (before) and self (after) are related by the execution of N_step, and N_ghost relates each ghost memory to the corresponding real memory.]
Any other ghost representation is possible, as long as it contains enough
information. The goal is to establish a correspondence between real and
ghost memory, strong enough to prove annotations. One could for instance
totally flatten the memory tree structure (as long as every Lustre module is provided) or relate an integer-based counter to a Gray-encoded ghost
counter. Even if not every Lustre module is provided, one could still promote modular verification through an axiomatic specification of step functions and ghost memories of external nodes, which is possible in ACSL.
Let us not forget the precondition on the validity of the C memory state
N _valid. A valid memory has all its pointers allocated and pairwise different. We show in Listing 4.4.1 all the requirements that the N _reset function
and the N _step function must meet.
4.4.3 More Simulation Relations
Indeed, the meaning of the Horn variable selfp does depend upon the
number of already executed equations. At the beginning and at the end of an
execution of the C function N _step, the simulation relation N _ghost must
hold. But, in order to annotate every single line of code, we must provide a
different simulation relation whenever a memory assignment or a node call
(within an equation) partially alters the real memory. Therefore, for a node
N containing r equations, we first recursively define the set of r + 1 partial
ghost simulation relations N_ghostk, for k ∈ [0, r], as shown in Listing 4.4.2.
We can now decorate every equation with the corresponding ghost simulation relation, with assert statements, as shown in Listing 4.4.3. Only the
predicates N _transk remain to be defined. For the time being, we safely
assume that these predicates have access to every variable (inputs, outputs,
locals and memories).
/∗@ predicate N _valid(struct N_mem ∗ s e l f ) =
\valid( s e l f )
&& M1 _valid( self−>ni_uid1 )
...
&& Mp _valid( self−>ni_uidp ) ;
∗/
/∗@
requires N _valid( s e l f ) ;
requires \separated( self , ( self−>ni_uid1 ) , . . . , ( self−>ni_uidp ) , . . . ) ;
ensures \forall struct N_mem_ghost mem;
N_ghost(mem, s e l f ) ==> N _init(mem) ;
assigns ( self−>ni_uid1)−>_reg, . . . , ( self−>ni_uidp)−>_reg, . . . ;
∗/
void N _reset (struct N_mem ∗ s e l f ) {
...
return;
}
/∗@
requires \valid(o1 , . . . , oc ) ;
requires N _valid( s e l f ) ;
requires \separated( self , ( self−>ni_uid1 ) , . . . , ( self−>ni_uidp ) , . . . ) ;
ensures \forall struct N_mem_ghost mem_in;
\forall struct N_mem_ghost mem_out;
\at(N_ghost(mem_in, s e l f ) , Pre)
==> N_ghost(mem_out, s e l f )
==> N _trans(mem_in, i1 , . . . , ia , mem_out, ∗o1 , . . . , ∗oc ) ;
assigns ∗o1 , . . . , ∗oc , ( self−>ni_uid1)−>_reg, . . . ,
( self−>ni_uidp)−>_reg, . . . ;
∗/
void N_step (I1 i1 , . . . , Ia ia ,
O1 (∗o1 ) , . . . , Oc (∗oc ) ,
struct N_mem ∗ s e l f ) {
...
return;
}
Listing 4.4.1 – Code contracts with ghost memories in ACSL.
−− partial ghost simulations
//@ predicate N_ghost0 (struct N_mem_ghost mem, struct N_mem ∗ s e l f ) =
\true ;
...
−− When eqπ(i) is an assignment of memory m
//@ predicate N_ghosti+1 (struct N_mem_ghost mem, struct N_mem ∗ s e l f ) =
N_ghosti (mem, s e l f )
&& (mem._reg. m == self−>_reg. mi ) ;
−− When eqπ(i) is a c a l l of node Muid
//@ predicate N_ghosti+1 (struct N_mem_ghost mem, struct N_mem ∗ s e l f ) =
N_ghosti (mem, s e l f )
&& M_ghost(mem. ni uid , self−>ni uid ) ;
−− Otherwise , when eqπ(i) is a local/output assignment
//@ predicate N_ghosti+1 (struct N_mem_ghost mem, struct N_mem ∗ s e l f ) =
N_ghosti (mem, s e l f ) ;
...
−− external main ghost simulation
−− holds before (k=0) and after (k=p) execution
//@ predicate N_ghost (struct N_mem_ghost mem, struct N_mem ∗ s e l f ) =
N_ghostp (mem, s e l f ) ;
Listing 4.4.2 – Ghost simulations in ACSL.
void N_step (I1 i1 , . . . , Ia ia ,
O1 (∗o1 ) , . . . , Oc (∗oc ) ,
struct N_mem ∗ s e l f ) {
//@ assert \forall struct N_mem_ghost mem1;
\forall struct N_mem_ghost mem2;
\at(N_ghost(mem1, s e l f ) , Pre)
==> N_ghost0 (mem2, s e l f )
==> N _trans0 (mem1, i1 , . . . , ia , l1 , . . . , lb ,
mem2, ∗o1 , . . . , ∗oc ) ;
...
Codeck (eqπ(k) ) ;
//@ assert \forall struct N_mem_ghost mem1;
\forall struct N_mem_ghost mem2;
\at(N_ghost(mem1, s e l f ) , Pre)
==> N_ghostk (mem2, s e l f )
==> N _transk (mem1, i1 , . . . , ia , l1 , . . . , lb ,
mem2, ∗o1 , . . . , ∗oc ) ;
...
return;
}
Listing 4.4.3 – Code annotations in ACSL.
4.4.4 Closed Formulation and Code Optimization
Finally, after every equation, we have to provide a closed formulation
of the annotation core, given by the predicate N _transk , for k ∈ [0, r] (cf.
Section 4.4.1). We recall that the Horn formulation may contain free variables, such as local variables and intermediate reset states (for resetting node
instances before node calls, cf. Section 2.5.10).
Intermediate reset state variables will be existentially quantified in our
ACSL assertions as they have no counterpart in the C code. This won’t
break the ghost simulation relation, since there is essentially a unique way
to reset a node state 5 . Therefore, even if one can imagine modifying code
generation to introduce in ACSL a ghost location and a ghost variable to
capture intermediate states, this is not worth the effort.
As for local variables, since they obviously do exist as C variables, there
are at least two possible strategies: either one can let local variables be free
in every annotation, or quantify over them as soon as they become dead. We
chose this second, more clever, solution as it allows us to activate the variable reuse optimization during code generation, without compromising the proof process 6. Indeed, dead variables, once quantified away, are freely reusable without interference. Live but time-frame-disjoint variables, because they are disjoint, won't interfere either.
Interfaces of predicates N _transk will also be reduced to the set of current live variables only, as opposed to Section 4.4.3 where interfaces contain
a priori all variables. For the generic node N of Section 2.5.7, we assume the
live analysis has computed the following sets of live variables Livei , one per
equation eqπ(i) . The set Livei denotes the set of assigned local or output variables so far, after the evaluation of equation eqπ(i) , minus the set of local variables not occurring anywhere in the remaining equations eqπ(i+1) , . . . , eqπ(r) .
This is dual to the notion of death table (cf. Section 2.5.8) that we compute
to enable variable reuse. Also, at each step i + 1, we must project away local
“fresh” dead variables, i.e. variables belonging in Locals ∩ (Livei \ Livei+1 ).
The resulting predicates N_transk are defined in Listing 4.4.4. We use the symbol Acslck( ) to denote an ACSL-syntax-compliant version of our Horn encoding Hornck( ).
Since we claimed our annotation scheme supports code optimization as
we stated it in Section 2.5.8, we feel that some evidence should ensue 7 .
Note that this is not mandatory as the underlying framework supporting
ACSL annotations will have to prove (or disprove) our claim on every Lustre instance anyway. Note also that the main predicate N _trans is totally
independent of any optimization concern, since it doesn't involve local variables that may get replaced, erased, etc.
5. State variables left unspecified by a reset predicate don't hold meaningful values.
6. Yet, annotations must be produced from an unoptimized operational semantics.
7. Yet, we support local variable optimization only, but not input variable reuse for instance.
/∗@ predicate N _trans0 (struct N_mem_ghost mem_in, I1 i1 , . . . , Ia ia ,
struct N_mem_ghost mem_out, O1 o1 , . . . , Oc oc ) =
\true ;
∗/
...
/∗@ predicate N _transi+1 (struct N_mem_ghost mem_in, I1 i1 , . . . , Ia ia ,
Locals ∩ Livei+1 ,
struct N_mem_ghost mem_out, Outputs ∩ Livei+1 ) =
\exists Locals ∩ (Livei \ Livei+1 ) ;
N _transi (mem_in, i1 , . . . , ia , Locals ∩ Livei , mem_out, Outputs ∩ Livei )
&& Acslck (eqπ(i+1) ) ;
∗/
...
/∗@ predicate N _trans(struct N_mem_ghost mem_in, I1 i1 , . . . , Ia ia ,
struct N_mem_ghost mem_out, O1 o1 , . . . , Oc oc ) =
\exists Locals ∩ Liver ;
N _transr (mem_in, i1 , . . . , ia , Locals ∩ Liver , mem_out, Outputs ∩ Liver )
∗/
Listing 4.4.4 – Annotation cores in ACSL.
We list hereunder the different optimization phases that our compiler lustrec supports:
• Variable inlining: we may freely substitute every variable by its defining
expression, in code and annotations altogether, without altering the
definitions of N _transk . The annotation associated to the substituted
equation then becomes equivalent to the previous one and may also be
removed.
• Variable recycling: since the predicates N _transk refer to live variables
only by projecting away dead ones, substituting a variable by a dead
one, in code and annotations, cannot raise any variable capture problem. Live disjoint variables don’t interfere with one another and don’t
cause any problem either.
• Enumerated type elimination: conflating assignments such as “x=Ki ;” with
branches “switch(x) { . . . case Ki : Codei ; break; . . . }” results in a
code where the variable x is locally substituted by Ki in Codei , at
the location of “x=Ki ;” and replacing it. So it amounts to inlining
variable x locally, which is deemed correct. Once x is substituted, the
annotations attached to Codei can be simplified.
4.4.5 A Small Example – Final
Again from Listing 4.2.5, let us assume we obtain the piece of C code of
Listing 4.4.5, obviously similar to the operational description of Listing 4.3.2.
Version (b) is an optimized version, where __f_3, which is dead
after ∗cpt is set, has been reused instead of cpt. Then, to avoid confusion
in the node interface, the original output name cpt has been restored 8 .
void f_step (int x,
             int (*y), int (*cpt),
             struct f_mem *self) {
  _Bool __f_1;
  int __f_3;
  *y = (x + 1);
  _arrow_step (&__f_1, self->ni_2);
  if (__f_1) {
    __f_3 = 0;
  } else {
    __f_3 = self->_reg.__f_2;
  }
  *cpt = (__f_3 + 1);
  self->_reg.__f_2 = *cpt;
  return;
}

(a) Unoptimized.

void f_step (int x,
             int (*y), int (*cpt),
             struct f_mem *self) {
  _Bool __f_1;
  *y = (x + 1);
  _arrow_step (&__f_1, self->ni_2);
  if (__f_1) {
    *cpt = 0;
  } else {
    *cpt = self->_reg.__f_2;
  }
  *cpt = (*cpt + 1);
  self->_reg.__f_2 = *cpt;
  return;
}

(b) Variable reuse.
Listing 4.4.5 – Different codes from Listing 4.2.5.
Independently of any optimization, we generate the ghost simulations
and the transition predicates, as shown respectively in Listings 4.4.6 and
4.4.7. We deliberately omit annotations attached to the function f_init ,
since it is not a target of our optimization algorithms.
Finally, the annotated optimized code appears in Listing 4.4.8, where the
notation “\at(f_ghost(mem1, self ), Pre)” stands for the value of expression f_ghost(mem1, self ) before execution (i.e. at Precondition). Every
annotation was proved within the framework Frama-C, using Z3 as a blind
decision procedure. No manual intervention was ever necessary. In the end,
the C code was proved to respect its operational semantics.
More complex examples with a much greater amount of variable reuse
(dead and alive) and containing more advanced constructs such as clocks
were also automatically validated.
4.5 Compilation of Synchronous Observers
A synchronous observer is typically a wrapper node used to test observable properties of a node N with minimal modification to the node itself;
it returns an error signal if the property does not hold, reducing the more
8. For the needs of our demonstration, we let an output variable be assigned twice,
which may be considered a bad practice.
struct f_mem {struct f_reg {int __f_2; } _reg; struct _arrow_mem ∗ni_2; } ;
//@ ghost struct _arrow_mem_ghost { struct _arrow_reg _reg; };
/∗@ predicate _arrow_init (struct _arrow_mem_ghost mem_in) =
(mem_in._reg. _first == true ) ;
∗/
/∗@ predicate _arrow_trans(struct _arrow_mem_ghost mem_in,
struct _arrow_mem_ghost mem_out, _Bool out) =
(out == mem_in._reg. _first )
&& (mem_in._reg. _first )?(mem_out._reg. _first == false )
: (mem_out._reg. _first == mem_in._reg. _first ) ;
∗/
/∗@ predicate _arrow_ghost(struct _arrow_mem_ghost memp,
struct _arrow_mem ∗mem) =
(memp._reg. _first == mem−>_reg. _first ) ;
∗/
//@ ghost struct f_mem_ghost {struct f_reg _reg;
struct _arrow_mem_ghost ni_2; } ;
/∗@ predicate f_ghost0(struct f_mem_ghost memp, struct f_mem ∗mem) =
\true ;
∗/
/∗@ predicate f_ghost1(struct f_mem_ghost memp, struct f_mem ∗mem) =
f_ghost0(memp, mem) ;
∗/
/∗@ predicate f_ghost2(struct f_mem_ghost memp, struct f_mem ∗mem) =
f_ghost1(memp, mem)
&& _arrow_ghost(memp.ni_2, mem−>ni_2) ;
∗/
/∗@ predicate f_ghost3(struct f_mem_ghost memp, struct f_mem ∗mem) =
f_ghost2(memp, mem) ;
∗/
/∗@ predicate f_ghost4(struct f_mem_ghost memp, struct f_mem ∗mem) =
f_ghost3(memp, mem) ;
∗/
/∗@ predicate f_ghost5(struct f_mem_ghost memp, struct f_mem ∗mem) =
f_ghost4(memp, mem)
&& (memp._reg.__f_2 == mem−>_reg.__f_2) ;
∗/
/∗@ predicate f_ghost(struct f_mem_ghost memp, struct f_mem ∗mem) =
f_ghost5(memp, mem) ;
∗/
Listing 4.4.6 – Ghost simulations from Listing 4.2.5.
/*@ predicate f_trans0(struct f_mem_ghost mem_in, int x,
                       struct f_mem_ghost mem_out) =
      \true;
*/
/*@ predicate f_trans1(struct f_mem_ghost mem_in, int x,
                       struct f_mem_ghost mem_out, int y) =
      f_trans0(mem_in, x, mem_out)
      && (y == x + 1);
*/
/*@ predicate f_trans2(struct f_mem_ghost mem_in, int x, _Bool __f_1,
                       struct f_mem_ghost mem_out, int y) =
      f_trans1(mem_in, x, mem_out, y)
      && _arrow_trans(mem_in.ni_2, mem_out.ni_2, __f_1);
*/
/*@ predicate f_trans3(struct f_mem_ghost mem_in, int x, int __f_3,
                       struct f_mem_ghost mem_out, int y) =
      \exists _Bool __f_1;
        f_trans2(mem_in, x, __f_1, mem_out, y)
        && ((__f_1)?(__f_3 == 0):(__f_3 == mem_in._reg.__f_2));
*/
/*@ predicate f_trans4(struct f_mem_ghost mem_in, int x,
                       struct f_mem_ghost mem_out, int y, int cpt) =
      \exists int __f_3;
        f_trans3(mem_in, x, __f_3, mem_out, y)
        && (cpt == __f_3 + 1);
*/
/*@ predicate f_trans5(struct f_mem_ghost mem_in, int x,
                       struct f_mem_ghost mem_out, int y, int cpt) =
      f_trans4(mem_in, x, mem_out, y, cpt)
      && (mem_out._reg.__f_2 == cpt);
*/
/*@ predicate f_trans(struct f_mem_ghost mem_in, int x,
                      struct f_mem_ghost mem_out, int y, int cpt) =
      f_trans5(mem_in, x, mem_out, y, cpt);
*/

Listing 4.4.7 – Core annotations from Listing 4.2.5.
/*@
  requires f_valid(self);
  requires \valid(y);
  requires \valid(cpt);
  requires \separated(self, y, cpt, (self->ni_2));
  ensures \forall struct f_mem_ghost mem1;
          \forall struct f_mem_ghost mem2;
          \at(f_ghost(mem1, self), Pre) ==> f_ghost(mem2, self)
          ==> f_trans(mem1, x, mem2, *y, *cpt);
  assigns *y, *cpt, self->_reg.__f_2, (self->ni_2)->_reg._first;
*/
void f_step(int x,
            int (*y), int (*cpt),
            struct f_mem *self) {
  _Bool __f_1;
  /*@ assert \forall struct f_mem_ghost mem1;
             \forall struct f_mem_ghost mem2;
             \at(f_ghost(mem1, self), Pre) ==> f_ghost0(mem2, self)
             ==> f_trans0(mem1, x, mem2); */
  *y = (x + 1);
  /*@ assert \forall struct f_mem_ghost mem1;
             \forall struct f_mem_ghost mem2;
             \at(f_ghost(mem1, self), Pre) ==> f_ghost1(mem2, self)
             ==> f_trans1(mem1, x, mem2, *y); */
  _arrow_step(&__f_1, self->ni_2);
  /*@ assert \forall struct f_mem_ghost mem1;
             \forall struct f_mem_ghost mem2;
             \at(f_ghost(mem1, self), Pre) ==> f_ghost2(mem2, self)
             ==> f_trans2(mem1, x, __f_1, mem2, *y); */
  if (__f_1) {
    *cpt = 0;
  } else {
    *cpt = self->_reg.__f_2;
  }
  /*@ assert \forall struct f_mem_ghost mem1;
             \forall struct f_mem_ghost mem2;
             \at(f_ghost(mem1, self), Pre) ==> f_ghost3(mem2, self)
             ==> f_trans3(mem1, x, *cpt, mem2, *y); */
  *cpt = (*cpt + 1);
  /*@ assert \forall struct f_mem_ghost mem1;
             \forall struct f_mem_ghost mem2;
             \at(f_ghost(mem1, self), Pre) ==> f_ghost4(mem2, self)
             ==> f_trans4(mem1, x, mem2, *y, *cpt); */
  self->_reg.__f_2 = *cpt;
  /*@ assert \forall struct f_mem_ghost mem1;
             \forall struct f_mem_ghost mem2;
             \at(f_ghost(mem1, self), Pre) ==> f_ghost5(mem2, self)
             ==> f_trans5(mem1, x, mem2, *y, *cpt); */
  return;
}

Listing 4.4.8 – A fully annotated optimized code from Listing 4.2.5.
complicated property to a single Boolean stream where we need only to
check if the stream is constantly true. Specifically in Lustre, an observer
is a node taking as input all the flows relevant to the safety property to be
specified, and computing a Boolean flow, say “safe”, which is true as long as
the observed flows satisfy the property. To support such specification during
the compilation process, we have extended the traditional Lustre language
with annotations, written in ACSL, which we call Lustre contracts. Indeed, these contracts denote synchronous assume/guarantee specifications in
the following way:
(*@ requires Pre(x_1 : T_1, ..., x_n : T_n);
    ensures  Post(x_1 : T_1, ..., x_n : T_n, y_1 : T'_1, ..., y_p : T'_p);
 *)
node N (x_1 : T_1, ..., x_n : T_n) returns (y_1 : T'_1, ..., y_p : T'_p);

The above contract corresponds to the following Linear Temporal Logic formula, where $\tilde{x}$ and $\tilde{y}$ respectively denote streams of input tuples $(x_1, \ldots, x_n)$
and output tuples $(y_1, \ldots, y_p)$:

$$[\![\langle Pre, Post \rangle]\!] \;\triangleq\; \forall \tilde{x}, \tilde{y},\ (\tilde{x} \mid \tilde{y}) \models \Box\big(H(Pre) \implies Post\big)$$
where Pre and Post are Lustre Boolean expressions representing the assumptions and guarantees respectively. H(P) holds on the traces where P has been verified
since the beginning of the streams 9 . We now define more precisely what it
means for a node N to satisfy a contract ⟨Pre, Post⟩.
Contract satisfaction. We say that a node N fulfills its contract (N ⊨
⟨Pre, Post⟩) if and only if, for all input and output sequences $\{\tilde{x}_i\}_{i\in\mathbb{N}} \in \tilde{T}^{\mathbb{N}}$
and $\{\tilde{y}_i\}_{i\in\mathbb{N}} \in \tilde{T'}^{\mathbb{N}}$, the following holds:

$$N_{init}(\tilde{s}_0)\ \wedge\ \forall j.\ \big(\forall i \le j.\ N_{trans}(\tilde{s}_i, \tilde{x}_i, \tilde{s}_{i+1}, \tilde{y}_i) \wedge Pre(\tilde{x}_i)\big)\ \implies\ Post(\tilde{x}_j, \tilde{y}_j)$$
Intuitively, the above definition asserts that if the assumption Pre has
held at all instants up to and including the current time, then the guarantee
Post holds at the current time. Such a formulation closely follows the contract
semantics defined in the AGREE framework [18].
Remark. Definition 4.5 specifies a safety property. It should not be confused with the similar-looking property □Pre ⟹ □Post, which is a liveness
property, unsuited for verification purposes. As a matter of fact, the second property is trivially satisfied as soon as Pre is false at
one instant at least, in which case Post is left totally unspecified. On the contrary,
Definition 4.5 requires Post to hold at least until Pre becomes false, and is
thus stronger.
9. H stands for historically.
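As a purely illustrative sketch of this trace semantics (not part of our toolchain; satisfies_contract and its parameters are hypothetical names), the following OCaml function checks □(H(Pre) ⟹ Post) over a finite trace, with pre and post standing for the Boolean functions denoted by Pre and Post:

(* Illustrative sketch: check [](H(Pre) ==> Post) on a finite trace. *)
let satisfies_contract ~(pre : 'i -> bool) ~(post : 'i -> 'o -> bool)
    (trace : ('i * 'o) list) : bool =
  let rec check hist_pre = function
    | [] -> true
    | (i, o) :: rest ->
        let hist_pre = hist_pre && pre i in       (* H(Pre) up to now *)
        (not hist_pre || post i o) && check hist_pre rest
  in
  check true trace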
4.5.1 Contract verification via k-induction

Lustre contracts can be automatically verified via k-induction based
model checking, for example using the Pkind [55] tool. Let us briefly recall
the basic ideas behind k-induction as a tool for checking safety, that is, invariant properties. In general terms, it can be formulated as the following
Linear Time Logic inference rule:

$$\frac{\;\models\ \bigwedge_{i=0}^{k-1} \bigcirc^i P \qquad \models\ \Box\Big(\bigwedge_{i=0}^{k-1} \bigcirc^i P \implies \bigcirc^k P\Big)\;}{\models\ \Box P}\ \ \text{($k$-induction)}$$
Although k-inductiveness is a sufficient condition for invariance, the procedure is not complete, since there exist systems with invariant
properties that are not k-inductive for any k. For those properties, the
procedure will keep increasing k indefinitely. A number of improvements
are possible to increase the procedure's precision, i.e. the set of invariant
properties it can prove [40, 54].
In our case, we apply k-induction where the property P is
(H(Pre) ⟹ Post). The specific form of our contracts allows a
reinforcement: if we assume that Pre has held at all instants up to
and including the current time, then Post has also held at all previous instants (excluding the current time).
Verifying a contract ⟨Pre, Post⟩ as a k-inductive property then amounts to
proving that for all i such that 0 ≤ i < k, the base cases hold:
$$N_{init}(\tilde{s}_0)\ \wedge\ \bigwedge_{j=0}^{i}\big(N_{trans}(\tilde{s}_j, \widetilde{in}_j, \tilde{s}_{j+1}, \widetilde{out}_j) \wedge Pre(\widetilde{in}_j)\big)\ \wedge\ \bigwedge_{j=0}^{i-1} Post(\widetilde{in}_j, \widetilde{out}_j)\ \implies\ Post(\widetilde{in}_i, \widetilde{out}_i) \quad (4.1)$$

and also that the k-inductive case holds:

$$\bigwedge_{j=0}^{k}\big(N_{trans}(\tilde{s}_j, \widetilde{in}_j, \tilde{s}_{j+1}, \widetilde{out}_j) \wedge Pre(\widetilde{in}_j)\big)\ \wedge\ \bigwedge_{j=0}^{k-1} Post(\widetilde{in}_j, \widetilde{out}_j)\ \implies\ Post(\widetilde{in}_k, \widetilde{out}_k) \quad (4.2)$$
In the framework of BMC (cf. Definition 3.1), a satisfiability check using
an SMT solver over the negation of expressions 4.1 and 4.2 for a given
k may produce a set of values for $\widetilde{in}_i$, $\tilde{s}_i$ and $\widetilde{out}_i$ for $i \in [0..k]$, i.e. a
counterexample to the k-induction. If the result is unsat, then the k-induction
succeeds and the contract holds.
In practice, tools unroll the transition relation one step at a time, trying
to meet the specific ¬Post condition for the lowest possible value of k. This
can be done efficiently with an SMT solver by reusing previously computed
states.
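Purely as an illustration of this incremental strategy (base_holds and step_holds are hypothetical names standing for the SMT queries over formulas 4.1 and 4.2), the loop can be sketched in OCaml as follows:

(* Schematic k-induction loop; the two checks abstract the SMT queries. *)
let rec k_induction ~base_holds ~step_holds ~max_k k =
  if not (base_holds k) then `Counterexample k   (* some base case fails    *)
  else if step_holds k then `Proved k            (* property is k-inductive *)
  else if k >= max_k then `Unknown               (* give up                 *)
  else k_induction ~base_holds ~step_holds ~max_k (k + 1)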
4.5.2 Synchronous Observers as Code Contracts
One may consider replaying the whole k-induction proof at the code and
ACSL level, for the benefit of revalidating the proofs in another context and
within a single framework where everything is handled: correctness of the code,
k-induction, etc.
First, the code has to be proved correct with respect to its description
in terms of transition systems (init and trans predicates), as developed in
Section 4.4. Second, we have to translate the original contract into a plain
assumption/guarantee specification for sequential code. This again divides into
two steps: first, the stateful nodes present in Pre or Post must be compiled;
second, the "historically" operator H must somehow be unrolled to account
for past instants.
Compiling stateful Pre or Post as any other Lustre node raises the
problem that their own memory would impact the memory representation of
the node they specify. This is potentially dangerous, as the code we check
would then not be the code that will be executed (without observers). To overcome this
problem, we encode these specification nodes as transition systems and embed them in N_init and N_trans; Pre and Post become pure stateless Boolean
expressions. Only if one is interested in instrumenting the code with assertions (to perform checking at run time) should standard C code be
produced for Pre or Post.
The last step towards ACSL instrumentation consists in unrolling transitions
to replay the k-induction proof. We assume that the value of k is provided.
We have to unroll k times the transitions of stateful observer nodes, the transitions
of the observed node, and the H operator. For that purpose, we define
Prefix_k(T), a k-bounded unfolding of a predicate transition relation (T_init,
T_trans), defined as follows:
Prefix definition:

$$Prefix_0(T_{init}, T_{trans}, s) \;\triangleq\; true$$
$$Prefix_{k+1}(T_{init}, T_{trans}, s) \;\triangleq\; T_{init}(s)\ \vee\ \exists i, o, t.\ T_{trans}(t, i, s, o) \wedge Prefix_k(T_{init}, T_{trans}, t)$$
The main benefit of this definition is that it includes both base and inductive
cases in a single definition. It is the duty of the model-checking tools to split
it into several cases. With Prefix_k, we can recast formulas 4.1 and 4.2 by
considering the unique formula:
$$Prefix_k(T_{init}, T_{trans}, s) \wedge N_{trans}(s, i, s', o) \wedge Pre(i)\ \implies\ Post(i, o) \quad (4.3)$$

with

$$T_{init}(s) = N_{init}(s)$$
$$T_{trans}(s, i, s', o) = N_{trans}(s, i, s', o) \wedge Pre(i) \wedge Post(i, o)$$
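The recursion of Prefix_k can be mirrored in executable form; the OCaml sketch below is only illustrative (the record type ts, prefix and candidates are hypothetical names), and it replaces the existential quantification by an explicit enumeration of candidate (input, previous state, output) triples that an actual tool would discharge to an SMT solver:

(* Illustrative mirror of the Prefix_k definition above. *)
type ('s, 'i, 'o) ts = {
  init  : 's -> bool;
  trans : 's -> 'i -> 's -> 'o -> bool;  (* trans pre_state input state output *)
}

let rec prefix ts candidates k s =
  if k = 0 then true
  else
    ts.init s
    || List.exists
         (fun (i, t, o) -> ts.trans t i s o && prefix ts candidates (k - 1) t)
         candidates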
Figure 4.6 shows the ACSL notation for the proof obligations of k-induction with a provided value k = 2. Without loss of generality, we assume
a single input and a single output for the node N. The ACSL contract of node N
now includes its "synchronous observer" contract. All proofs are replayed at
the ACSL level and no longer rely on external oracles.
4.5.3 Case study: The NASA Transport Class Model
This case study is derived from NASA Langley’s Transport Class Model
[46], a modern transport-class aircraft simulation intended for controls and
health management system research. The simulator is meant to model a
mid-size (approximately 250,000 lb.), twin-engine, commercial transport-class aircraft. The simulator is not intended to be a high-fidelity simulation of
any particular transport aircraft, but instead is meant to be representative
of the types of non-linear behaviors that must be considered for the class as
a whole.
The TCM simulator is primarily implemented in Simulink with a few
C/C++ libraries, and includes models for the avionics (with transport delay), actuators, engines, landing gear, nonlinear aerodynamics, sensors (including noise), aircraft parameters, equations of motion, and gravity. The
overall Simulink model consists of approximately 5700 Simulink blocks
and of several thousand additional lines of C/C++ code in libraries. Such
libraries are primarily used for the engines and the nonlinear aerodynamics
models. Figure 4.7 illustrates the Guidance, Navigation, and Control (GNC)
system of the TCM.
The TCM’s autopilot can control altitude (either by flying to a directed
altitude, or holding a current altitude), can reach and maintain a desired
flight path angle (FPA), can reach and maintain a desired heading, and can
control the airplane’s speed. Each of the autopilot functions listed above
is implemented as a Simulink subsystem within the controls system of the
TCM. For our case study, we considered more specifically the longitudinal
control, i.e., the Altitude, FPA and the Pitch Inner Loop controller described
in Figure 4.7.
The TCM was not intended for embedded production-level code. As
such, written requirements or specifications for the autopilot and controls
software have not been developed. For this case study, we chose some
relevant requirements for the TCM described in [15]. In the latter, the
authors gave an overview of 20 properties for the TCM. We now give details
of the verification of one property regarding the longitudinal controller. The
natural language version of P reads as follows:
The guidance shall be capable of climbing at a defined rate, to
be limited by minimum and maximum engine performance and
airspeed.
/∗@
predicate N _init(struct N_mem_ghost mem) = . . . ;
predicate N _trans(struct N_mem_ghost mem_in, I input ,
struct N_mem_ghost mem_out, O output) = . . . ;
predicate Pre(I input) = . . . ;
predicate Post(I input , O output) = . . . ;
predicate Prefix0 (struct N_mem_ghost mem) =
\true ;
predicate Prefix1 (struct N_mem_ghost mem) =
N _init(mem)
| | \exists I in ;\ exists O out;\ exists struct N_mem_ghost pre ;
N _trans(pre , in ,mem, out) && Pre( in ) && Post(in , out)
&& Prefix0 (pre ) ;
predicate Prefix2 (struct N_mem_ghost mem) =
N _init(mem)
| | \exists I in ;\ exists O out;\ exists struct N_mem_ghost pre ;
N _trans(pre , in ,mem, out) && Pre( in ) && Post(in , out)
&& Prefix1 (pre ) ;
lemma kinduction : \forall struct N_mem_ghost mem_in; \forall I in ;
\forall struct N_mem_ghost mem_out; \forall O out ;
Prefix2 (mem_in)
&& N _trans(mem_in, in , mem_out, out) && Pre( in ) => Post(in , out ) ;
∗/
/∗@ requires \forall struct N_mem_ghost mem_in;
N_ghost(mem_in, s e l f )
==> Prefix2 (mem_in) ;
requires Pre(input)
ensures \forall struct N_mem_ghost mem_in;
\forall struct N_mem_ghost mem_out;
\at(N_ghost(mem_in, s e l f ) , Pre)
==> N_ghost(mem_out, s e l f )
==> N _trans(mem_in, input , mem_out, output ) ;
ensures Post(input , output ) ;
∗/
void N_step(I input , O (∗output) , struct N_mem ∗ s e l f ) { . . . };
Figure 4.6 – k-induction proof obligations in ACSL.
Figure 4.7 – The Simulink controls system for the TCM.
This property involves three components: the Autopilot, the Altitude controller and the FPA controller. In order to prove P, we decomposed the
property into four component-level properties: P1 and P2 for the Autopilot, P3
for the Altitude controller and P4 for the FPA controller. Pkind was able
to prove P1, ..., P4 on the individual components. In order to prove P, we
then assumed the properties P1, ..., P4. The latter argument is captured in Figure 4.8. The upper box shows the Lustre nodes of the various components
involved in P. Each component is annotated with an assume/guarantee
contract.
4.6 Synthesis of Modular Invariants
We conclude this chapter by an illustration of the synthesis approach
on a simple example based on counters and automata. We fully develop
the different compilation and translation phases, from the original Lustre
source to the final invariant properties.
A classical example used to evaluate tools on Lustre models relies on
two counters computing the same traces but implemented with different
computations. The traces generated are the following:
0 → 0 → 1 → 0 → 0 → 0 → 1 → 0 → ...
They can be also defined with the following regular expression:
(0 → 0 → 1 → 0 →)∗
The two existing implementations encode these four states either with
an integer counter intloopcounter (cf. Listing 4.6.1c) or with a pair of Booleans,
graycounter (cf. Listing 4.6.1b).
First, an enumerated type is generated to denote automaton states:
node auto (x:bool)
returns (out:bool);
let
  automaton four_states
  state One :
  let
    out = false;
  tel until true restart Two
  state Two :
  let
    out = false;
  tel until true restart Three
  state Three :
  let
    out = true;
  tel until true restart Four
  state Four :
  let
    out = false;
  tel until true restart One
tel

(a) Automaton-based

node graycounter (x:bool)
returns (out:bool);
var a,b:bool;
let
  a = false -> not pre(b);
  b = false -> pre(a);
  out = a and b;
tel

(b) Gray encoding

node intloopcounter (x:bool)
returns (out:bool);
var time: int;
let
  time = 0 -> if pre(time) = 3 then 0
              else pre time + 1;
  out = (time = 2);
tel

(c) Integer-based

Listing 4.6.1 – Counters in three different flavours: automaton (4.6.1a), pair
of booleans (4.6.1b) and integer (4.6.1c).
(∗@ requires (FPAMode = 0.0) or (not (AilStick = 0.0))
or (not (ElevStick = 0.0)) or FPAEng;
ensures (not (AltMode= 0.0)) => not AltEng;
∗)
node AutoPilot(HeadMode, AilStick , ElevStick , AltMode: real ;
FPAMode, ATMode, AltCmd, Altitude , CAS,CASCmdMCP: real ;)
returns (HeadEng, AltEng, FPAEng, ATEng: bool; CASCmd: real );
(∗@ requires not AltEng;
ensures (AltGammaCmd = 0.0);
∗)
node AltitudeControl(AltEng: bool; AltCmd, Alt : real ;
GsKts, Hdot, HdotChgRate: real)
returns (AltGammaCmd: real ) ;
(∗@ requires true −> (Engage = false) or (AltGammaCmd = Gamma)
or((AltGammaCmd > Gamma) and (PitchCmd > pre(PrePitchCmd)))
or((AltGammaCmd < Gamma) and (PitchCmd < pre(PrePitchCmd)));
∗)
node FPAControl(Engage: bool; AltGammaCmd, Gamma: real ;
ThetaDeg, VT: real)
returns (PitchCmd, PrePitchCmd: real );
(∗@ requires FPain = (AltGammaCmd + GammaCmd);
requires (AltMode = 0.0);
requires (not (FPAMode = 0.0));
requires (ElevStick = 0.0);
requires (AilStick = 0.0);
requires (GammaCmd> 1.0 and GammaCmd< 10.0);
ensures obs = true ;
∗)
node G−120 (HeadMode, AilStick , ElevStick , AltMode: real ;
FPAMode, ATMode, AltCmd, Altitude , CAS: real ;
CASCmdMCP, Gskts, Hdot, HDotChgRate, GammaCmd: real ;
Gamma, ThetaDeg, VT: real)
returns (Obs: bool);
Figure 4.8 – Verification of property P at the Lustre level.
type auto_ck = enum { One, Two, Three, Four };
Each automaton state is associated with nodes describing respectively its
strong (unless) transitions and its weak (until) ones. The "handler and
until" function is the last one called. To keep the presentation simple, we
renamed the generated code variables to ease reading, and we focus only
on the state Four of the automaton and its generated equations.
The handler_until function assigns the next state to One and requires the node to be restarted; the output of this node has value out = false.
Since no unless transition is defined in this example, the unless function is the
identity. Note also that, since all automaton states are stateless, restart and
resume behave similarly.
−− state body and until transitions
function Four_handler_until (restart_act : bool; state_act: auto_ck)
returns (restart_in : bool; state_in: auto_ck; out: bool)
let −− encodes the next state , here One
restart_in , state_in = (true , One);
out = false ;−− returns true in the handler for state Three
tel
−− unless transitions (none in this example)
function Four_unless (restart_in : bool; state_in: auto_ck)
returns (restart_act : bool; state_act: auto_ck)
let
restart_act , state_act = (restart_in , state_in);
tel
The generated code without automaton is described below:
node auto (x: bool) returns (out: bool)
var mem_restart: bool; mem_state: auto_ck;
next_restart_in: bool; restart_in : bool;
restart_act : bool; next_state_in: auto_ck;
state_in: auto_ck clock; state_act: auto_ck clock;
four_restart_in: bool; four_state_in: auto_ck; four_out: bool;
four_restart_act: bool; four_state_act: auto_ck;
. . . −− similar declarations for other states
let
restart_in , state_in = ((false , One) −> (mem_restart, mem_state));
mem_restart, mem_state = pre (next_restart_in, next_state_in);
next_restart_in, next_state_in, out =
merge state_act (One −> (one_restart_in, one_state_in, one_out))
(Two−> (two_restart_in, two_state_in, two_out))
(Three −> (three_restart_in, three_state_in, three_out))
(Four −> (four_restart_in, four_state_in, four_out));
four_restart_in, four_state_in, four_out = Four_handler_until
(restart_act when Four(state_act) ,
state_act when Four(state_act)) every (restart_act);
restart_act , state_act =
merge state_in (One −> (one_restart_act, one_state_act))
(Two−> (two_restart_act, two_state_act))
(Three −> (three_restart_act, three_state_act))
(Four −> (four_restart_act, four_state_act));
four_restart_act, four_state_act = Four_unless
(restart_in when Four(state_in) ,
state_in when Four(state_in)) every (restart_in );
. . . −− similar definitions for other states
tel
The Horn encoding can now be produced. The enumerated type enables the
declaration of clock values:
(declare−datatypes () ((auto_ck One Two Three Four)))
Until and unless functions are defined as Horn predicates. As an optimization, the Horn encoding of Four_handler_until doesn’t contain any input
or output state parameter, since it is stateless.
(declare-rel Four_handler_until (Bool auto_ck Bool auto_ck Bool))
(rule (=> (and (= out false)
               (= state_in One)
               (= restart_in true))
          (Four_handler_until restart_act state_act
                              restart_in state_in out)))

(declare-rel Four_unless (Bool auto_ck Bool auto_ck))
(rule (=> (and (= state_act state_in)
               (= restart_act restart_in))
          (Four_unless restart_in state_in restart_act state_act)))
Finally the reset and step predicates are defined:
(rule (=> (arrow_reset (ni_1 self))
          (auto_reset self)))

(rule (=>
  (and -- update of arrow state
       (arrow_step (ni_1 self) (ni_1 selfp) auto_l1)
       (=> (= auto_l1 true)  -- current arrow is first
           (and (= state_in One)
                (= restart_in false)))
       (=> (= auto_l1 false) -- current arrow is not first
           (and (= state_in (mem_state (auto_reg self)))
                (= restart_in (mem_restart (auto_reg self)))))
       (=> (= state_in Four) -- unless block for automaton state Four
           (and (Four_unless restart_in state_in
                             four_restart_act four_state_act)
                (= state_act four_state_act)
                (= restart_act four_restart_act)))
       ... -- similar definition for other states
       (=> (= state_act Four) -- handler and until block for state Four
           (and (Four_handler_until restart_act state_act
                                    four_restart_in four_state_in four_out)
                (= out four_out)
                (= next_state_in four_state_in)
                (= next_restart_in four_restart_in)))
       ... -- similar definition for other states
       -- next value for memory mem_state
       (= (mem_state (auto_reg selfp)) next_state_in)
       -- next value for memory mem_restart
       (= (mem_restart (auto_reg selfp)) next_restart_in))
  (auto_step self  -- state input
             x     -- inputs
             selfp -- state output
             out   -- outputs
  )))
Once these Horn clause definitions are produced, our lustrec compiler
also generates a traceability file mapping Horn encoding variables to the initial
Lustre model variables, which are identified by their exact location in the
tree-like memory structure. Another tool, Zustre, then interacts with the
SMT-based model checker Spacer [60] to verify the contracts and generate
node-local invariants.
These invariants are expressed in a syntax similar to Lustre nodes.
Without dwelling too much on the verification algorithm, which is inspired by PDR (Property Directed Reachability) [14], verifying
a specific target property amounts to computing inductive invariants over the
reachable states of the transition system. These invariants are needed to
prove the target property, but they may not capture precisely the reachable
states of the system. Thanks to our modular encoding, it is relatively easy
to recover the local information associated with each node.
The next invariant has been computed on the intloopcounter node. It
captures the fact that if the previous counter value was at least 3 (its only
reachable such value being exactly 3), then either the current value is smaller
than 2 or the previous value is at least 4. This property seems somewhat imprecise,
since when the previous value of the counter is 3, the current value is
0. But it is indeed an invariant.
contract intloopcounter (x:bool) returns (out:bool);
let
  guarantee (
    true -> (pre (top.ni_2.intloopcounter.__intloopcounter_2) >= 3 =>
             not (top.ni_2.intloopcounter.__intloopcounter_2 >= 2) or
             pre top.ni_2.intloopcounter.__intloopcounter_2 >= 4));
tel
The second one is also a little bit weak, but valid. It states that the only
transition from state One is targeting state Two.
contract auto (x:bool) returns (out:bool);
let
guarantee (true −> (One = pre auto.automaton_state)
=> (Two = auto.automaton_state));
tel
Thanks to our encoding of automata, we have nodes for the until/unless
transitions. The next two invariants capture exactly the behavior of those
states: state Three produces a true output and switches to state Four, while
state Four produces a false output and loops back to state One.
automaton_contract four_states__Three_handler_until;
let
four_states__Three_handler_until.out_out and
four_states__Three_handler_until.four_states__state_in = Four
tel
automaton_contract four_states__Four_handler_until;
let
not four_states__Four_handler_until.out_out and
four_states__Four_handler_until.four_states__state_in = One
tel
The last invariant is harder to interpret. It is associated with an unless
transition which, in our example, is essentially the identity function. Because of clock expressions, the call to Four_unless is only defined
when state_in = Four, in which case state_act = Four. When the clock condition when Four(state_in) is not satisfied, the value of the signal could be
anything. This is exactly the property captured by this invariant:
automaton_contract four_states__Four_unless;
let
four_states__Four_unless.four_states__state_in = One or
four_states__Four_unless.four_states__state_act = Four or
four_states__Four_unless.four_states__state_in = Three or
four_states__Four_unless.four_states__state_in = Two
tel
In other words, functions or nodes, such as the unless and until nodes, that are
always used on clocked inputs should be interpreted under this hypothesis.
Doing so gives us the following contract:
automaton_contract four_states__Four_unless;
let
four_states__Four_unless.four_states__state_in = Four and
(
four_states__Four_unless.four_states__state_in = One or
four_states__Four_unless.four_states__state_act = Four or
four_states__Four_unless.four_states__state_in = Three or
four_states__Four_unless.four_states__state_in = Two
)
tel
This contract reduces to:
automaton_contract four_states__Four_unless;
let
four_states__Four_unless.four_states__state_in = Four and
four_states__Four_unless.four_states__state_act = Four
tel
All the synthesized contracts can now be translated into ACSL assertions
and carried over to the C code to check if they still hold. This is another
kind of validation for our compiler lustrec. We may suspect that an error
in the Horn encoding would likely occur as well in the C code target, because
both stem from the same preceding phases of the same compiler. Under such
a hypothesis, we would only prove that properties of wrong Horn clauses
are preserved in wrong C code, which is useless and deceptive.
Yet, we are able to stop the compilation process right after the automata
unfolding and feed the resulting pure dataflow Lustre to another tool, such as Kind2,
which will perform its own compilation to Horn clauses and discover modular
invariants. Being able to use independent tools is an important
feature that strengthens confidence in verification results.
4.7 Perspectives
Our framework for comparing operational and denotational semantics is
still not very mature and largely untested, except on simple examples. We
believe that not much extra material, in terms of Coq specifications, will ever
be needed. Yet, the main stumbling block here is the crafting of automatic
refinement proof tactics able to cope with any user-provided Lustre node.
Even if we aimed at a high level of engineering discipline, which tends to
lighten the burden of (automatic) proofs, we are stepping into unfamiliar
territory. Modern SMT solvers such as Z3, now available in Coq through
an interface to the Why3 specification framework [27], will undoubtedly be
of great help. Also, in terms of Coq arcana, we are convinced that type
classes and morphisms will play a major role in organizing and automating
complex proof structures.
The KCG compiler [26] implements another reset scheme, which appears
closer to the denotational semantics. In particular, reset and transition
phases are no longer separated. Having such an operational scheme may
help simplify proofs, besides the benefit of generating more efficient code,
which makes it worth studying.
Moreover, we are unable to handle additional optimizing code transformations, such as retiming, in the absence of more specific properties of our
denotational operators. Other factorizations are also possible at the Lustre level, for instance the sampling operator “when” can be rearranged when
applied to stateless Lustre expressions. Listing 4.7.1 shows an example of
what a retiming library might contain. As stateless stream functions are
produced by Str_apply and Str_lift, it boils down to proving that pre
commutes with both, ruling out the first instant of pre which is unspecified.
(* Retiming properties of pre *)
Theorem retiming_pre_apply : forall {A B} (f : Stream (A -> B)) s,
tl (pre (f @ s)) ~ tl ((pre f) @ (pre s)).
Theorem retiming_pre_lift : forall {A} (v : A),
tl (pre (!v)) ~ !v.
Proof.
apply pre_spec.
Qed.
Listing 4.7.1 – Retiming in Coq.
As for the validation of code annotations, we still need to mechanize it
and perform thorough testing to check whether the full automation promise
holds or not. Then, arrays and array iterators also have to pass the test. As a
second-order artifact, iterators won't lend themselves to verification activities
based on predicate encodings. At instance level, provided the iterated node
is given a contract, it . Arrays alone, the way we support them, seem much
easier to handle, since array traversals always take the form of nested for
loops with a uniform processing of array elements, whose result is easily
specified from the loop body.
Section 4.5 showed that we can re-validate proofs that were made at the
Lustre/Horn level at the C code level. This is possible since k-induction
can be expressed in ACSL and replayed within Frama-C, through the same
kind of SMT tools that were used to build the proof in the first place. For an
even greater level of safety, we can also imagine replaying proofs at the Coq
level. That would involve dedicated instrumentation, such as the statement
(and proof) of the k-induction principle. Some preliminary results have been
obtained in this direction.
From the perspective of making specification and verification techniques
cooperate, we still have to address the very important problem of the numerous representations of numbers in Coq, SMT-LIB, Lustre and C code.
Discrepancies are likely to occur between unbounded integers and 64-bit
machine integers, and between real numbers and floating-point numbers. The
reference point is the C code with its complicated numerical phenomena.
Bounding and minimizing numerical errors between the specification level
and the code level seems a good fit, following approaches such as [47, 48, 66].
In this respect, the forthcoming Part II may constitute an interesting starting
point for bounding errors in the case of complex numerical computations.
As a conclusion, hybrid systems and more specifically the subtle semantical interplay between a synchronous program and a differential equation
seem interesting and challenging. Since the matter is far from being settled
yet, cf. [12], verification activities may seem far-fetched. But still, our contributions of Part II and more particularly of Section 8.11 may bring a new
light on that subject, since we are able to handle differential equations.
Part II
Certified Taylor Expansions
Chapter 5
Type-level arithmetics
5.1 Motivation
The rationale behind this implementation of type-level arithmetics is
simple: to statically check various properties of the complex data structures
involved in the computation of multivariate Taylor expansions. Generalized
Algebraic Data Types (GADT) are a general means to perform type-level
arithmetics in a non-hackish and extensible way.
Type-level arithmetics allows one to ensure, at compile time, that the dimensions and orders of the various tensors, functions, convolutions and power series
always conform to their specification. This is of particular importance in
such a complex and particularly error-prone context, involving a vast number
of numerical computations. A type system that settles all dimension-related
issues greatly helps in keeping the focus on the purely numerical concerns: correctness of approximation, precision, convergence.
Type-level arithmetics also helps guide the development of recursive
algorithms on strongly typed data structures. In practice, when designing
the behavior of an algorithm on some recursive sub-case, few candidate terms
fit the required type constraints, most often only one.
As a matter of fact, the only bug that was unveiled during our tests
was directly related to a silent numerical mistake occurring when computing an error margin for the logarithm function. On the algebraic side, for
instance, all tensor operations, which can be quite complex, were correct.
5.2 Introduction to GADT
What stands behind the coined term "GADT" may slightly differ, from
their introduction in the early 2000's (see [77]) to their recent implementation in OCaml. Roughly, it is a form of ad-hoc type-dependent typing
built on top of the Hindley-Milner type system. Notably, a difference with plain
dependent typing is that computation is not allowed at the type level;
therefore type relations are used instead of type functions (though in some
early works a limited form of type functions was envisaged). The theory
is much simpler than dependent typing and doesn't need heavy annotations
from the user, sticking to the Hindley-Milner tradition of type inference.
Still, a recursive function processing recursive values of a GADT requires a
type scheme at its definition site for inference to succeed.
Until now, as far as we know, applications of GADT have been restricted to
simple cases: singleton types, existential types, safe typecasting (through
type equality witnesses) and safe interpreters for term algebras. All the
examples we encountered involve monadic (unary) predicates on types (or at best dyadic
ones, as found in the equality type, but not recursive ones).
5.3 A Simple Example
To depart from classical examples, we introduce here a type-dependent
projection function, built with a singleton type. The GADT p triple_proj
is a predicate on type variable p which constrains it to take one of three
different type shapes, one for each projection function of a triple. Thus
one can publicly release only one generic projection function proj_triple,
parameterized by the component to keep (First, Second or Third) and
yielding one of the three primitive projections (first, second or third).
type _ triple_proj =
| First: (’a *’b *’c -> ’a) triple_proj
| Second: (’a *’b *’c -> ’b) triple_proj
| Third: (’a *’b *’c -> ’c) triple_proj
let first (a, b, c) = a
let second (a, b, c) = b
let third (a, b, c) = c
let proj_triple : type proj. proj triple_proj -> proj = function
  | First -> first
  | Second -> second
  | Third -> third
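A short usage sketch shows the same generic function specializing to each component:

let x = proj_triple First  (1, "a", 2.0)   (* x : int,    value 1   *)
let s = proj_triple Second (1, "a", 2.0)   (* s : string, value "a" *)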
To make GADT worth the annotation cost, a non-uniform type scheme
is required (usually called non-uniform recursion as it concerns most often
the recursive call sites of a function, but not in this example). In our specific case, the non-uniform flavour, tagged by the construction “type proj”,
comes in handy when typing the different branches of the pattern-matching.
Each branch may be given a different type, through the generation of local
(per branch) fresh existential type variables, based on the type declaration of
the matched constructor. Then, in each branch, the resulting expression is
typed and the whole pattern→result type should unify with the global function type scheme, for some (local) value of the type variable proj. If that suc-
ceeds, proj_triple is given the type ’proj triple_proj -> ’proj. Note
that assigning this type directly to the function proj_triple doesn't trigger the non-uniform typing feature. Therefore, although the whole typing
still succeeds, we obtain ’proj ≡ (’a * ’a * ’a) -> ’a, which is the best
that a standard Hindley-Milner type system allows for a "generic" projection
function, due to its dependency on a dynamic value.
5.4 GADT versus Proof Assistants
Our ambition of using GADT to encode type-level arithmetics naturally
raises the question of the implementation language. As proof assistants are
naturally dedicated to making proofs, why not directly choose a proof assistant such as Coq? The answer is twofold. First, code extraction to
a regular programming language is not currently GADT-aware: it would
drop all non-uniform type parameters and thus weaken type definitions. This is problematic since our application is not so much an end-user
product as a library; we would lose the type safety provided by GADT. Second, as coinduction and streams are involved (see Chapter 8), code
extraction again fails to provide a strictly synchronous semantics for functions
that we fundamentally need to be synchronous and know to be so.
But we must admit that we embarked on a perilous journey, since a
programming language typically lacks support to prove complex properties,
such as automatic proof search strategies. Moreover, as termination is not
enforced, an inconsistency may be introduced by forever looping programs
(viewed as proofs of falsity). We have to be particularly careful when designing inductive proofs.
As a final remark, we are also concerned with the absence of a "proof by
contradiction" principle. Apart from using looping programs as workarounds,
the property False cannot be defined 1 and this prevents us from using negations. Unfortunately, some specifications happen to use negations quite liberally and these negations must be discarded. This makeover may yield
artificial formulations though.
5.5 Encoding Arithmetics at Type-level

5.5.1 Equality
We start with a simple but useful relation: equality. Equality as we define it
is not restricted to natural numbers. The module Equal defines the classical
equality predicate eq with a single proof constructor Eq. This is reminiscent
1. In OCaml, False may be supported by continuations since the seminal work of
Griffin [36]. But continuations are not implementable in the host language and need
foreign low-level external primitives.
of similar definitions found in proof assistants. Some minimal properties
follow. The general Leibniz equality theorem cannot be proved since OCaml
has no higher-order kinds. Still, without any modification to the eq_elim proof term,
having an (a, b) eq witness allows us, for instance, to cast an a list into a b list.
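The following minimal sketch illustrates this cast (it would live alongside eq_elim in module Equal, whose definition is given below):

(* The Eq witness refines a ~ b, so the identity is a valid coercion. *)
let eq_elim_list : type a b. (a, b) eq -> a list -> b list =
  fun Eq l -> l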
A salient characteristic of GADT, which particularly stands out in the
following module Equal, is that, although most proof constructors such as
Eq happen to be polymorphic values, instantiation parameters don't need
to be explicitly stated: Hindley-Milner type inference does all the work.
This can also be a hindrance in the case of an ill-typed proof term, since it is
sometimes hard to tell which specific instantiation we are dealing with and
why type inference fails. Proof assistants, on the contrary, essentially force
users to choose and write manually the instantiation parameters they want.
For that matter, mimicking explicit type instantiation in the current state
of GADT in OCaml, by explicitly typing every proof constructor, requires
the use of type variables provided through type annotations.
More anecdotal is the obligation to perform pattern-matching to spark
GADT typing specificities, even in the case of a single parameter-less constructor to match against. As an example, in function eq_sym, removing the
pattern matching construct or weakening it by replacing Eq with an universal pattern _ leads the type inference to fail. Even if perfectly normal, this
obligation sometimes interferes with a programmer’s habit to apply simple
optimizations such as removing (computationally) useless code.
module Equal =
struct
type (_, _) eq =
| Eq: (’a, ’a) eq
let eq_refl : type a. (a, a) eq =
Eq
let eq_sym : type a b. (a, b) eq -> (b, a) eq =
fun Eq -> Eq
let eq_trans : type a b c. (a, b) eq -> (b, c) eq -> (a, c) eq =
fun Eq bc_eq -> bc_eq
let eq_elim : type a b. (a, b) eq -> a -> b =
fun Eq a -> a
end
5.5.2 Natural Numbers
The module Nat defines a type-based representation of Peano natural
numbers and operations. First, the types zero and ’a succ reflect the structure
of the Peano numbers Z and S(_). Their constructors, Zero and Succ, are
irrelevant and never used in our formalization. They could (should) be totally
abstracted away since they allow meaningless terms such as:
Succ ("hello", 1.5)
which is not a natural number. Then, predicate ’a isnat states that its
parameter ’a represents a Peano number. Its proof cases Z and S are the
“concrete” value-level natural numbers.
module Nat =
struct
open Equal
type ’a succ = Succ of ’a
type zero = Zero
type _ isnat =
| Z: zero isnat
| S: ’a isnat -> ’a succ isnat
.
.
.
end
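As a small usage sketch, the witness for the number 2 is built and typed as follows:

(* 2 = S (S Z), whose type-level representation is zero succ succ *)
let two : Nat.zero Nat.succ Nat.succ Nat.isnat = Nat.(S (S Z))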
5.5.3 Relation to primitive integers
Interfacing primitive integers and Peano type-level numbers is possible
through the following functions isnat_to_int and int_to_isnat. Note that
the second one needs an unsafe type-casting (Obj.magic) to be accepted,
because it exhibits a non-uniform type scheme that cannot be expressed as
it depends on a dynamic value, the primitive number to be converted. It
is the only function that needs such a hack and anyway one can check by
simple code inspection that it always produces a correct isnat instance. So
using this function cannot jeopardize type safety. It is merely intended for
an end-user and we never make use of it inside our application.
let rec isnat_to_int : type a. a isnat -> int =
function | Z -> 0
| S p -> 1 + (isnat_to_int p)
let rec int_to_isnat : type n. int -> n isnat =
function | 0 -> Obj.magic Z
| p -> Obj.magic (S (int_to_isnat (p-1)))
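A small round trip illustrates the two conversions, the unsafe direction being confined to this boundary:

let three = int_to_isnat 3                 (* S (S (S Z)) *)
let () = assert (isnat_to_int three = 3)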
5.5.4 Arithmetical Operations
Arithmetical operations are total functions on (defined subsets of) natural numbers. As type-level functions are not allowed, we specify them as
inductive relations, plus a property that states the existence of a result for all possible arguments. Addition add and multiplication mul are defined by
proof cases representing the following axioms:
Zadd : ∀b. 0 + b = b
Sadd : ∀a b c. a + b = c =⇒ (1 + a) + b = 1 + c
Zmul : ∀b. 0 ∗ b = 0
Smul : ∀a b c d. a ∗ b = c ∧ b + c = d =⇒ (1 + a) ∗ b = d
The types ex_add and ex_mul stand for the fact that one can always add
and multiply numbers. The type le encodes the standard order predicate.
Associated proof terms represent the following axioms:
Exadd : ∀a b c. a + b = c =⇒ ∃c. a + b = c
Exmul : ∀a b c. a ∗ b = c =⇒ ∃c. a ∗ b = c
Le : ∀a b d. d + a = b =⇒ a ≤ b
Finally, in terms of GADT, we have:
type (_, _, _) add =
| Zadd: (zero, ’b, ’b) add
| Sadd: (’a, ’b, ’c) add -> (’a succ, ’b, ’c succ) add
type (’a, ’b) ex_add =
| Exadd: (’a, ’b, ’c) add -> (’a, ’b) ex_add
type (_, _, _) mul =
| Zmul: (zero, ’b, zero) mul
| Smul: (’a, ’b, ’c) mul * (’b, ’c, ’d) add -> (’a succ, ’b, ’d) mul
type (’a, ’b) ex_mul =
| Exmul: (’a, ’b, ’c) mul -> (’a, ’b) ex_mul
type (_, _) le =
| Le: (’d, ’a, ’b) add -> (’a, ’b) le
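For instance, a proof term for 2 + 1 = 3 is obtained by stacking these constructors (a small usage sketch):

(* Sadd (Sadd Zadd) proves (S (S Z)) + (S Z) = S (S (S Z)) *)
let two_plus_one : (zero succ succ, zero succ, zero succ succ succ) add =
  Sadd (Sadd Zadd)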
5.5.5 Functional Specification
The relational semantics of arithmetical operations must be supplemented
by: first, a property add_eq stating that addition is functional in its first
two arguments and, more generally, that equality is a congruence with respect
to addition; second, a property exists_add stating that addition is total.
Similar properties for multiplication are stated and proved.
let rec add_eq : type a1 a2 b1 b2 c1 c2. (a1, a2) eq -> (b1, b2) eq ->
(a1, b1, c1) add -> (a2, b2, c2) add -> (c1, c2) eq =
fun Eq Eq add_1 add_2 ->
match add_1, add_2 with
| Zadd , Zadd
-> Eq
| Sadd a1, Sadd a2 -> match add_eq Eq Eq a1 a2 with Eq -> Eq
let rec exists_add : type a. a isnat -> (a, ’b) ex_add =
fun a ->
match a with
| Z
-> Exadd Zadd
| S a’ -> match exists_add a’ with Exadd add’ -> Exadd (Sadd add’)
let rec mul_eq : type a1 a2 b1 b2 c1 c2. (a1, a2) eq -> (b1, b2) eq ->
(a1, b1, c1) mul -> (a2, b2, c2) mul -> (c1, c2) eq =
fun Eq Eq mul_1 mul_2 ->
match mul_1, mul_2 with
| Zmul
, Zmul
-> Eq
| Smul (m1, a1), Smul (m2, a2) -> add_eq Eq (mul_eq Eq Eq m1 m2) a1 a2
let rec exists_mul : type a b. a isnat -> b isnat -> (a, b) ex_mul =
fun a b ->
match a with
| Z
-> Exmul Zmul
| S a’ -> match exists_mul a’ b with Exmul mul’ ->
match exists_add b
with Exadd add’ ->
Exmul (Smul (mul’, add’))
5.5.6 Properties of addition
let rec isnat_add1 : type a b c. (a, b, c) add -> a isnat =
fun add ->
match add with
| Zadd -> Z
| Sadd p -> S (isnat_add1 p)
let rec addZ_right : type a. a isnat -> (a, zero, a) add =
fun pr ->
match pr with
| Z -> Zadd
| S p -> Sadd (addZ_right p)
let rec addS_right : type a b c. (a, b, c) add -> (a, b succ, c succ) add =
fun add ->
match add with
| Zadd -> Zadd
| Sadd a -> Sadd (addS_right a)
let rec add_comm : type a b c. b isnat -> (a, b, c) add -> (b, a, c) add =
fun pr add ->
match add with
| Zadd -> addZ_right pr
| Sadd a -> addS_right (add_comm pr a)
let rec addS_comm : type a b c. b isnat -> (a, b succ, c) add ->
(a succ, b, c) add =
fun b pr ->
match add_comm (S b) pr with
| Sadd pr’ -> Sadd (add_comm (isnat_add1 pr) pr’)
let rec addZ_elim : type a b c. b isnat -> (a, b, b) add -> (a, zero) eq =
fun b abb ->
match b, abb with
| _ , Zadd
-> Eq
| S b’, Sadd abb’ -> addZ_elim b’ (addS_comm b’ abb’)
let rec add_assoc_left : type a b c ab bc abc. (a, b, ab) add ->
(b, c, bc) add -> (ab, c, abc) add -> (a, bc, abc) add =
fun ab bc ab_c ->
match ab, ab_c with
| Zadd
, _
-> let Eq = add_eq Eq Eq bc ab_c in Zadd
| Sadd ab’, Sadd ab_c’ -> Sadd (add_assoc_left ab’ bc ab_c’)
let rec add_assoc_right : type a b c ab bc abc. (a, b, ab) add ->
(b, c, bc) add -> (a, bc, abc) add -> (ab, c, abc) add =
fun ab bc a_bc ->
match ab, a_bc with
| Zadd
, Zadd
-> bc
| Sadd ab’, Sadd a_bc’ -> Sadd (add_assoc_right ab’ bc a_bc’)
5.5.7 Properties of multiplication
let rec factr_left : type a b ab c d e de. (a, b, ab) add -> (a, c, d) mul ->
(b, c, e) mul -> (d, e, de) add -> (ab, c, de) mul =
.
.
.
let rec distr_left : type a b ab c d e de. (a, b, ab) add -> (a, c, d) mul ->
(b, c, e) mul -> (ab, c, de) mul -> (d, e, de) add =
.
.
.
let rec mulZ_right : type a. a isnat -> (a, zero, zero) mul =
.
.
.
let rec mulS_right : type a b c d. (a, b, c) mul -> (a, c, d) add ->
(a, b succ, d) mul =
.
.
.
let rec mul_comm : type a b c. b isnat -> (a, b, c) mul -> (b, a, c) mul =
.
.
.
let rec le_mul : type a b c d e. (c, a, d) mul -> (c, b, e) mul ->
(a, b) le -> (d, e) le =
.
.
.
let mul_zero_inv : type a b. (a, b, zero) mul ->
((a, zero) eq, (b, zero) eq) Sum.issum =
.
.
.
let rec mul_left_inv : type a b c d. (a, b succ, c) mul ->
(d, b succ, c) mul -> (a, d) eq =
.
.
.
5.6 Conclusion

5.6.1 User Experience
GADT are still in their infancy, in terms of useful use-cases as much as in
terms of mere usability. They already provide a very powerful way, yet a very
lightweight one from the user's perspective, to represent strong properties
checked at compile time. Another strength of GADT is the reduction of the number of
pattern-matching cases needed. It helps the developer get rid of spurious
impossible cases not caught by standard type systems, which normally result
in a lot of failwith or assert false branches. Also, it could have an
impact on code optimization, with fewer patterns to match data against. We
think our application clearly showcases all these features.
Nevertheless, our massive use of GADT left us with a bitter feeling of
working at a low and undocumented level so far. Most type errors encountered, although perfectly legitimate, are not user-friendly and end up showing
not really informative messages such as the following:
“type #ex388 is not compatible with type #ex340 succ”
To interpret this kind of oracle, one must understand GADT algorithms and
heuristics from sparse explanations, valid use-cases and one’s own unsuccessful tryouts. At early stages of this development, when we were still missing this expertise, problematic pieces of code were translated into the Coq
proof assistant language until worked out, then translated back to GADT in
OCaml.
In our opinion, more information from GADT internals should be provided to the user, so that the learning curve is less steep and encourages one
to try more complex examples. But we also believe that with complex properties comes the need for a certain amount of proof automation. In this
respect, type classes or similar proof/term-searching mechanisms could really be helpful in leveraging the use of GADT.
5.6.2 Proof normalization and Performance
As properties are carried over through the code, to ensure strong static
guarantees whenever possible, proof normalization occurs at runtime, along
“useful” computations. It incurs a complexity penalty, which depends on
the application at hand and on the computational content of proofs. Our
own application is very demanding in terms of computing power and the
proofs carried represent a negligible fraction of it. But it may not be the case
anymore if we wish more safety and have to handle more complex properties.
Speaking of proof optimization, let us note first that we are not facing a
situation where we could invoke a proof by reflection principle, as in modern
proof assistants. Proofs with chained equational reasoning are the sweet spot
of proof by reflection principle and equality in our formalization is most often
avoided and expressed by type unification. Still, it seems that we would be
in a favourable case, because our type relations on types embed equality on
natural numbers. But, in any case, it would at least require an interplay
between values and types at type-level, which is not possible with GADT.
So we stick to classical program optimization and propose solutions to
limit resources spent by proof normalization at runtime. These solutions
are still to be tested though and some of them are even not expressible in
OCaml.
5.6.3 Change Encoding
A simple way to obtain logarithmic-time complexity for proof normalization is to express numbers in base 2 instead of using Peano numbers. This
solution is obviously specific to the problem at hand of encoding arithmetics.
Moreover, re-encoding the whole library with more complex data and proof
structures seems a bit daunting, due to lack of support from the language.
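A possible shape for such an encoding is sketched below (hypothetical, not part of the current library): each constructor roughly halves the remaining value, so proof terms have logarithmic size.

type zero = Zero
type 'a twice = Twice of 'a            (* n  |->  2n     *)
type 'a twice_succ = TwiceSucc of 'a   (* n  |->  2n + 1 *)

type _ isbin =
  | BZ : zero isbin
  | B0 : 'a isbin -> 'a twice isbin
  | B1 : 'a isbin -> 'a twice_succ isbin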
5.6.4 Memoization
One possibility is to use memoization on theorems to avoid replaying
their proofs. Yet, checking that arguments lie in the memoization table
is still a linear-time operation, with respect to the size of Peano natural
numbers. Memoization should be applied only to super-linear normalization
schemes and cannot discard linear-time penalties. One positive point is that
memoization would be introduced very easily in the code base.
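A possible sketch of such a table, assuming the conversions of Section 5.5.3 live in module Nat (the names below are hypothetical): results must be packed in an existential so that witnesses for different type-level numbers fit in a single table.

type packed_nat = Pack : 'n Nat.isnat -> packed_nat

let table : (int, packed_nat) Hashtbl.t = Hashtbl.create 17

let memo_int_to_isnat (n : int) : packed_nat =
  match Hashtbl.find_opt table n with
  | Some p -> p
  | None -> let p = Pack (Nat.int_to_isnat n) in Hashtbl.add table n p; p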
5.6.5 Laziness
Another possibility is to resort to laziness and thus avoid computing
normalized proof when not needed. This requires a small encapsulation
of every proof function in module Nat with lazy/force constructs. We
estimate the impact of memory-cluttering frozen closures around every proof
argument to be modest when compared to memory footprint of tensor data
structures. This seems a good compromise as it avoids a complete makeover
while guaranteeing a minimal time complexity. Yet, as laziness may hide
non-terminating programs, users are exposed to the risk of writing looping
programs as proofs of falsity, introducing an inconsistency without being
noticed if the proof is never used.
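The intended encapsulation can be sketched as follows (illustrative only, assuming the isnat type and S constructor of module Nat): proofs become suspensions that are normalized only when forced.

type 'a lazy_isnat = 'a Nat.isnat Lazy.t

(* Building the successor proof does not force the argument's proof. *)
let lazy_succ (n : 'a lazy_isnat) : 'a Nat.succ lazy_isnat =
  lazy (Nat.S (Lazy.force n))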
5.6.6 Type annotations
Last, a very minimal solution is to rely on very aggressive compiler optimizations that would detect that most of our proofs are dragged from
functions to functions but never used and therefore could be safely removed
from generated code. This is unrealistic as the compiler would have to track
values through function calls and general recursion. But if we were offered
the possibility to annotate proofs and values with a Prop or Set type tag,
as in Coq, then after checking well-typedness, Prop values (that is, proofs)
would be discarded and Set values would be kept in generated code.
In a language supporting type classes, such an annotation scheme could
be implemented without specific additional compiler support. Prop a or
Set a would be type classes parameterized with type a, with a coercion
from Set a to Prop a instances that would remove the proof of a provided
by Set a and thus provide no information at all in Prop a, except that a
proof of a does indeed exist. This would naturally fit type class based proof
search strategies.
Chapter 6
Symmetric Tensor Algebra
Symmetric tensors are heavily involved in multivariate Taylor expansions,
specifically representing partial derivatives of functions at any order. In order
to obtain a certifiable implementation of Taylor expansions, it thus appeared
wise to try to define a disciplined version of tensor algebra. Disciplined in the
sense that: first, we define a well-behaved recursive data structure; second,
we ensure through strong static typing guarantees that every structure and
sub-structure represents a tensor of correct dimension and order. This is
in strong contrast with commonly found implementations, usually written in
C++ with single-dimension arrays for efficiency reasons and with involved array
index or pointer management, for which certification seems far-fetched.
6.1 Safety versus Efficiency
Our design and implementation choices were settled with the following
goals:
— static guarantee of correctness, with respect to tensor dimension and
order. Tensor modules were also carefully crafted so that full correctness could be achieved through a seamless embedding in a proof
assistant.
— as the data structure may seem a little alien when compared to classical
mathematical descriptions, static guarantees should be a guide to a
correct implementation.
— the data structure should be flexible, allowing for instance complex
operations that increase or decrease order and dimension of tensor
arguments in an easy and structural way.
— still, as a pure functional implementation would be largely inefficient,
we propose side-effecting versions of ubiquitous tensor operations (such
as addition and multiplication) that modify coefficients of their tensor
arguments, instead of allocating a whole new structure as a result.
6.2 Definition
Covariant tensor. A covariant tensor T of order R on a vector space V is
a multilinear application with signature $V^R \to K$, where K is a field. When
V is N-dimensional ($N < \infty$), T is usually denoted by $T_{i_1,\ldots,i_R}$ where the
$i_k$'s range over the N dimensions.

Symmetric tensor. A tensor S is said to be symmetric when its values are
invariant under permutations of indices, i.e. $S_{i_1,\ldots,i_R} = S_{i_{\sigma(1)},\ldots,i_{\sigma(R)}}$ for any
permutation $\sigma$ on $[\![1, R]\!]$.

Symmetric tensors of dimension N and order R form a vector space of
dimension $\binom{N+R-1}{R}$ (the number of their components) and also an R-graded
N-dimensional algebra.
6.3 Representations of symmetric tensors

6.3.1 Index versus Occurrence
Order between indices being irrelevant in a symmetric tensor, access to tensor components may be achieved with an occurrence vector, i.e. for every dimension $d \in \llbracket 0, N-1\rrbracket$, the number of occurrences of index $d$ in the sequence $i_1, \ldots, i_R$. As an example, a tensor element at indices $(0, 0, 0, 1)$ when $N = 3$ and $R = 4$ corresponds to the occurrence $(3, 1, 0)$. In other words, an occurrence vector is $(o_0, \ldots, o_{N-1}) \in \mathbb{N}^{\llbracket 0, N-1\rrbracket}$ such that $o_0 + \ldots + o_{N-1} = R$.
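As a down-to-earth illustration (a minimal sketch on our side, using plain integer lists rather than the typed encodings of the next sections), the index-to-occurrence conversion can be written as:

(* Hypothetical helper: turn an index sequence i1,...,iR of a dimension-N
   symmetric tensor into its occurrence vector (o_0,...,o_{N-1}), where o_d
   counts how many times index d occurs. *)
let occurrence_of_indices ~dim indices =
  let occ = Array.make dim 0 in
  List.iter (fun i -> occ.(i) <- occ.(i) + 1) indices;
  occ

(* occurrence_of_indices ~dim:3 [0; 0; 0; 1] = [|3; 1; 0|],
   matching the example above (N = 3, R = 4). *)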
6.3.2 Tensor versus Homogeneous Polynomial
The occurrence representation $(o_0, \ldots, o_{N-1})$ is also a means to select the coefficient of a specific monomial $X_0^{o_0} \cdot \ldots \cdot X_{N-1}^{o_{N-1}}$ in a homogeneous polynomial of degree $R$ in $N$ variables. Therefore symmetric tensors are in bijection with homogeneous polynomials, and some specific polynomial operations may be defined on symmetric tensors, such as derivation, integration, product (which is different from the tensor product), partial evaluation at $X_i = 0$ or $X_i = 1$, etc.
In this document, we follow two different notations when a symmetric tensor $S$ is interpreted as a polynomial on variables $x_i$: either implicitly, as $S[x_0, \ldots, x_{N-1}]$; or explicitly, as the coefficient-wise product (also called Hadamard product) of $S$ with the variables power tensor $x^R(i_1, \ldots, i_R) = x_{i_1} \times \ldots \times x_{i_R}$, denoted $S \odot x^R$. This latter notation emphasizes $S$ as a standalone data structure, independent from the variables $x_i$.
6.3.3 Recursive decomposition
Mathematical descriptions handle tensors abstractly, essentially as functions from indices/occurrences to values. At the other end of the spectrum, most implementations choose to store coefficients in a single-dimension array. This choice typically requires heavy use of conversions between a single index and occurrences/indices, for instance when computing the symmetric tensor product. This representation is very simple and obviously efficient for linear operations, but incurs a great complexity penalty for operations that raise or lower the tensor dimension or order.
Yet a canonical recursive decomposition is possible, if we adopt a representation based on ordered sequences of indices. Such an ordered sequence serves as the representative for any of its permutations, since we deal with symmetric tensors. For instance, the prefix of a tree representing a dimension-4 tensor could be sketched as:

[Tree sketch, omitted: from the root, branches labeled $x_3, x_2, x_1, x_0$ in decreasing index order; each child again branches on the indices no greater than its own, and so on.]
As can be seen, components of a tensor are accessed through a sequence
of decreasing indices. Components are stored in the leaves, which lie at depth
R from the tree root, where R is the order of the tensor. This representation
corresponds to an ordered multivariate Horner scheme applied to homogeneous polynomials. The above tree prefix describes the following polynomial
skeleton:
x3 × (x3 × . . . + x2 × . . . + x1 × . . . + x0 × . . .)
+ x2 × (x2 × . . . + x1 × . . . + x0 × . . .)
+ x1 × (x1 × . . . + x0 × . . .)
+ x0 × x0 × . . .
This first decomposition would yield n-ary trees of varying arity, so that its implementation would in any case require some extra data structure for the sequence of direct sub-trees of a given tree, as well as a set of extra properties to ensure that every node has the correct dimension and order according to its position in the tree. We therefore opted for a simpler representation using only binary trees.
The principle is quite simple: at each node, we choose either to keep the same variable, accounting for the final tensor order, or to drop it and repeat the same process for lower-dimension variables. This is pictured on the tree branches of the following example as $x_i$ for the first case and $\overline{x_i}$ for the second case. This tree is fully developed and represents a symmetric tensor $s$ of dimension 4 and order 3:
[Fully developed binary tree, omitted: each internal node branches on $x_i$ (variable kept) or $\overline{x_i}$ (variable dropped); the leaves carry the coefficients $s_{i,j,k}$ of the dimension-4, order-3 tensor, from $s_{0,0,0}$ up to $s_{3,3,3}$.]

6.4 Data Structure

If we denote by $ST(N, R)$ the set of symmetric tensors of dimension $N$ and order $R$, then it may be described as the recursive family of data structures $(ST(d, R))_{d \in \llbracket 1, N+1\rrbracket}$ indexed by $R$ and $N$:
\[
ST(0, R) \triangleq 0 \qquad\quad ST(N+1, 0) \triangleq K \qquad\quad ST(N+1, R+1) \triangleq ST(N, R+1) \times ST(N+1, R)
\]
Mathematically, this decomposition corresponds to having a pair of projections/injections from $ST(N+1, R)$ and $ST(N, R+1)$ to $ST(N, R)$. As our description doesn't match the traditional mathematical ways in which tensors and tensor operations are defined, expressing these operations in our context may require these projections/injections. Unfortunately, these operations have a complex expression. For instance, in standard index representation, we have:
\[
\begin{aligned}
\pi_1 &: ST(N+1, R) \to ST(N, R)\\
\pi_1(S) &\triangleq (i_1, \ldots, i_R) \in [0, N-1]^R \mapsto S_{i_1,\ldots,i_R}\\
\pi_1^{-1}(S) &\triangleq (i_1, \ldots, i_R) \in [0, N]^R \mapsto \text{if } N \in \{i_1, \ldots, i_R\} \text{ then } 0 \text{ else } S_{i_1,\ldots,i_R}\\[4pt]
\pi_2 &: ST(N, R+1) \to ST(N, R)\\
\pi_2(S) &\triangleq (i_1, \ldots, i_R) \in [0, N-1]^R \mapsto S_{i_1,\ldots,i_R,N}\\
\pi_2^{-1}(S) &\triangleq (i_1, \ldots, i_{R+1}) \in [0, N-1]^{R+1} \mapsto \text{if } N-1 \in \{i_1, \ldots, i_{R+1}\} \text{ then } S_{\{i_1,\ldots,i_{R+1}\}\setminus\{N\}} \text{ else } 0
\end{aligned}
\]
We let the reader check that indeed, for any symmetric tensor $S$, the following equality holds:
\[
S = \pi_1^{-1}(\pi_1(S)) + \pi_2^{-1}(\pi_2(S))
\]
Note that the above equality is much simpler and clearly holds when recast in terms of homogeneous polynomials, as it merely expresses the Euclidean division of the polynomial $S$ by $X_N$. In this respect, $\pi_1$ consists in specializing $S$ at $X_N = 0$ (equivalently, computing a remainder), $\pi_1^{-1}$ is the natural injection from $N$-variate into $(N+1)$-variate polynomials, $\pi_2$ is quotienting $S$ by $X_N$ and finally $\pi_2^{-1}$ is obviously multiplication by $X_N$:
\[
S(X_0, \ldots, X_{N-1}, X_N) = S(X_0, \ldots, X_{N-1}, 0) + X_N \cdot \frac{S(X_0, \ldots, X_{N-1}, X_N)}{X_N}
\]
Therefore it seems reasonable to seek a direct expression of tensor operations under our decomposition instead of embedding it within more classical
mathematical descriptions. We feel that using π1 and π2 would lead to overly
complex and contrived encodings of algorithms in order to fit our decomposition.
Also, our recursive type scheme bears similarities with binary decision trees over ordered variables. Our structure is a relaxation of such decision trees in which we no longer impose that a variable occurs only once on every tree path.
Finally, in terms of OCaml implementation, this decomposition scheme
naturally translates into the following type definition:
type ('a, _, _) st =
  | Nil:
      ('a, Nat.zero, 'r) st
  | Leaf:
      'a
      -> ('a, 'n Nat.succ, Nat.zero) st
  | Node:
      ('a, 'n, 'r Nat.succ) st
      * ('a, 'n Nat.succ, 'r) st
      -> ('a, 'n Nat.succ, 'r Nat.succ) st
Here the type st of symmetric tensors has three type parameters: the type of elements 'a, the dimension type and the order type. The last two parameters not being constant through the recursion, they appear as _ in the type declaration. Then, the three cases where respectively N = 0, N ≠ 0 ∧ R = 0, and N, R ≠ 0 are handled with three different constructors. The type parameters of the constructors' arguments behave according to the decomposition of ST(N, R).
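As a small illustration (a sketch under the Nat encoding assumed above, with hypothetical coefficient values), the dimension-2, order-1 tensor $s_0.X_0 + s_1.X_1$ is built by following the decomposition $ST(2,1) = ST(1,1) \times ST(2,0)$: the left sub-tree is the part without $X_1$ (namely $s_0.X_0$), the right one is the cofactor of $X_1$ (the constant $s_1$):

let gradient (s0 : float) (s1 : float)
  : (float, Nat.zero Nat.succ Nat.succ, Nat.zero Nat.succ) st =
  (* Node (S1, S2) with S1 : ST(1,1) = Node (Nil, Leaf s0)
     and S2 : ST(2,0) = Leaf s1 *)
  Node (Node (Nil, Leaf s0), Leaf s1)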
6.5 Complexity analysis
To estimate the complexity of tensor operations, some counting formulas
are needed. We follow a standard approach and use formal power series and
generating functions to obtain closed formulas.
First, let us establish a recursive formulation to count the total number of
nodes and/or leaves in a tensor-as-tree. This definition is parameterized by 3
values b0 , b1 , b2 ∈ {0, 1} accounting for the presence of null-leaves, coefficients
and internal nodes respectively:
\[
C(0, R) = b_0 \qquad C(N+1, 0) = b_1 \qquad C(N+1, R+1) = b_2 + C(N, R+1) + C(N+1, R)
\]
Then the following bivariate ordinary generating function is formed, where indices range over $\mathbb{N}$:
\[
\begin{aligned}
F(z,t) &\triangleq \sum_{n,r} C(n,r)\, z^n t^r = \sum_r C(0,r)\, t^r + \sum_{n,r} C(n+1,r)\, z^{n+1} t^r\\
&= \frac{b_0}{1-t} + \sum_n C(n+1,0)\, z^{n+1} + \sum_{n,r} C(n+1,r+1)\, z^{n+1} t^{r+1}\\
&= \frac{b_0}{1-t} + \frac{b_1 z}{1-z} + \sum_{n,r} \bigl(b_2 + C(n,r+1) + C(n+1,r)\bigr)\, z^{n+1} t^{r+1}\\
&= \frac{b_0}{1-t} + \frac{b_1 z}{1-z} + \frac{b_2 z t}{(1-z)(1-t)} + z\Bigl(F(z,t) - \sum_n C(n,0)\, z^n\Bigr) + t\Bigl(F(z,t) - \sum_r C(0,r)\, t^r\Bigr)\\
&= \frac{b_0}{1-t} + \frac{b_1 z}{1-z} + \frac{b_2 z t}{(1-z)(1-t)} - b_0 z - \frac{b_1 z^2}{1-z} - \frac{b_0 t}{1-t} + (z+t)\,F(z,t)\\
&= b_0 (1-z) + b_1 z + \frac{b_2 z t}{(1-z)(1-t)} + (z+t)\,F(z,t)
\end{aligned}
\]
Finally:
\[
F(z,t) = \frac{1}{1-z-t}\Bigl(b_0 (1-z) + b_1 z + \frac{b_2 z t}{(1-z)(1-t)}\Bigr)
\]
For the sake of simplicity, we study each term bi in isolation. Then, going
back to formal power series and equating the two formulations for F (z, t),
we obtain the following results.
Counting coefficients ($b_0 = b_2 = 0$, $b_1 = 1$):
\[
F(z,t) = \frac{z}{1-z-t} = z \sum_i (z+t)^i = z \sum_{i,\, j\le i} \binom{i}{j} z^j t^{i-j}
= z \sum_{n,r} \binom{n+r}{r} z^n t^r \;\;(\text{with } n = j \text{ and } r = i-j)
= \sum_{n,r} \binom{n+r}{r} z^{n+1} t^r
\]
Finally:
\[
C(0, r) = 0 \qquad C(n+1, r) = \binom{n+r}{r}
\]
This result, although well known, comforts us in trusting the calculations performed so far. In our application, the dimension $n$ is fixed and we have $C(n, r) = \theta(r^{n-1})$.
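The closed form can be cross-checked against the recursion with a few lines of throwaway OCaml (a sketch on plain integers, not part of the tensor library):

(* C for b0 = b2 = 0, b1 = 1 (counting coefficients), straight from the recursion. *)
let rec c n r = match n, r with
  | 0, _ -> 0                              (* b0 = 0 *)
  | _, 0 -> 1                              (* b1 = 1 *)
  | _, _ -> c (n - 1) r + c n (r - 1)      (* b2 = 0 *)

let rec binom n k =
  if k = 0 || k = n then 1 else binom (n - 1) (k - 1) + binom (n - 1) k

(* e.g. c 3 4 = 15 = binom (2 + 4) 4, as predicted by C(n+1, r) = binom(n+r, r). *)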
Counting null-leaves ($b_1 = b_2 = 0$, $b_0 = 1$):
\[
F(z,t) = \frac{1-z}{1-z-t}
= \sum_{n,r} \binom{n+r}{r} z^n t^r - \sum_{n,r} \binom{n+r}{r} z^{n+1} t^r
= \sum_r t^r + \sum_{n,r} \Bigl(\binom{n+r+1}{r} - \binom{n+r}{r}\Bigr) z^{n+1} t^r
= \sum_r t^r + \sum_{n,r} \binom{n+r+1}{r} z^{n+1} t^{r+1}
\]
Finally:
\[
C(0, r) = 1 \qquad C(n+1, 0) = 0 \qquad C(n+1, r+1) = \binom{n+r+1}{r}
\]
For $n, r \neq 0$, the relative proportion of null-leaves with respect to useful coefficients equals $\frac{r}{n}$. Null-leaves rapidly outnumber coefficients, so it raises the question of a less space-wasting scheme for representing symmetric tensors. Finally, for $n$ fixed, $C(n, r) = \theta(r^n)$.
Counting internal nodes ($b_0 = b_1 = 0$, $b_2 = 1$):
\[
\begin{aligned}
F(z,t) &= \frac{zt}{(1-z-t)(1-z)(1-t)}
= zt \sum_{i,j,k,\, l\le k} \binom{k}{l}\, z^i t^j z^l t^{k-l}
= zt \sum_{n,r} \sum_{i\le n,\, j\le r} \binom{n-i+r-j}{n-i} z^n t^r \quad (n = i+l,\; r = j+k-l)\\
&= \sum_{n,r} \sum_{i\le n,\, j\le r} \binom{i+j}{i} z^{n+1} t^{r+1}
= \sum_{n,r} \sum_{i\le n} \binom{r+i+1}{i+1} z^{n+1} t^{r+1}
= \sum_{n,r} \Bigl(\binom{r+n+2}{r+1} - 1\Bigr) z^{n+1} t^{r+1}
\quad \Bigl(\text{using } \sum_{m=0}^{n} \binom{m}{k} = \binom{n+1}{k+1}\Bigr)
\end{aligned}
\]
Finally:
\[
C(0, r) = 0 \qquad C(n, 0) = 0 \qquad C(n+1, r+1) = \binom{r+n+2}{r+1} - 1
\]
Again, for $n \neq 0$, $r \neq 0$, the relative proportion of internal nodes with respect to useful coefficients equals $\frac{r+n}{n}$. For $n$ fixed, $C(n, r) = \theta(r^n)$.
Thus, iterating through the full set of coefficients of a tensor will take more time than just looping through the elements of a single-dimension array of tensor coefficients. Moreover, from the previous results, this total extra time/memory overhead, $1 + \frac{2r}{n}$, increases with the tensor order r. Unfortunately, for the targeted application, whereas n remains fixed, r will keep increasing until the precision of the Taylor approximations satisfies the user.
Linear operations, where tensors are seen as a mere vector space, are particularly subject to this penalty. Cumulating these effects when computing Taylor expansions, where every tensor from order 0 to r needs to be computed, yields an overall θ(r²) penalty. The question of finding a better trade-off is addressed in section 6.12.
On the contrary, as regards multiplication or other operations with more
complex iteration schemes, the previous pessimistic result may be mitigated.
Indeed, such algorithms will surely need to address tensor coefficients individually, through repeated conversions between single index and occurrence
representations for instance. To illustrate this phenomenon, consider the
following formulation for tensor multiplication for N = 2:
\[
(S \times T)_{(o_1, o_2)} \;\triangleq \sum_{t_1 \le o_1,\; t_2 \le o_2} S_{(t_1, t_2)} \times T_{(o_1 - t_1,\, o_2 - t_2)}
\]
It clearly appears that iterating through (o1 , o2 ) and (t1 , t2 ), even when efficiently performed through a single index encoding, imposes computing the
dual index (o1 − t1 , o2 − t2 ) which doesn’t follow an iterative scheme itself
and needs explicit conversions.
In our scheme, the depth of any coefficient lies in the range [r, r + n − 1]
and therefore grows linearly with r.
6.6 Functorial Structure
With respect to its element type, the type st is functorial and comes equipped with generic structural iterators. For efficiency reasons, we also define side-effecting versions of some operators. For instance, iter modifies its second argument, a tensor of mutable elements. Every function exhibits, at its type declaration point, non-uniform recursion over the dimension n and the order r. As OCaml doesn't allow kind variables, the generic folder cannot be parameterized over some ('n, 'r) b type and as such won't be very useful: the weak type of fold cannot accommodate the strong type discipline the tensor code is committed to. As usual, apply denotes an applicative functor and serves as a basis to easily implement map2 (and map_n). As can be seen, GADTs help here in reducing the number of patterns necessary to ensure that functions are total: both arguments of apply, having the same type, are matched against the same type-discriminating constructors.
let rec map : type n r. ('a -> 'b) -> ('a, n, r) st -> ('b, n, r) st =
  fun f st ->
    match st with
    | Nil             -> Nil
    | Leaf v          -> Leaf (f v)
    | Node (stl, str) -> Node (map f stl, map f str)

let rec iter : type n r. ('a -> 'a) -> ('a ref, n, r) st -> unit =
  fun f st ->
    match st with
    | Nil             -> ()
    | Leaf v          -> v := f !v
    | Node (stl, str) -> (iter f stl; iter f str)

let rec apply : type n r. ('a -> 'b, n, r) st -> ('a, n, r) st -> ('b, n, r) st =
  fun stf sta ->
    match stf, sta with
    | Nil              , Nil               -> Nil
    | Leaf f           , Leaf a            -> Leaf (f a)
    | Node (stfl, stfr), Node (stal, star) -> Node (apply stfl stal,
                                                    apply stfr star)

let rec fold : type n r. ('a, n, r) st -> ('a -> 'b -> 'b) -> 'b -> 'b =
  fun st f e ->
    match st with
    | Nil             -> e
    | Leaf v          -> f v e
    | Node (stl, str) -> fold stl f (fold str f e)
These iterators all perform a single traversal of their tensor argument,
hence exhibit a complexity of θ(rn ), which is also the number of coefficients
(disregarding the effects of applied functions).
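For instance (a minimal usage sketch, with a hypothetical tensor t of float elements):

let doubled = map (fun v -> 2.0 *. v) t    (* scale every coefficient *)
let total   = fold t ( +. ) 0.0            (* sum all coefficients *)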
Tensors also enjoy a (graded) monadic structure, which allows a tensor of tensors to be spliced into a single tensor, as well as the converse splitting operation. Yet, the type system comes short of expressivity here. To define a graded monad, we would need the following functions:

let rec join : type n r1 r2. (r1, r2, r) Nat.add ->
  (('a, n, r2) st, n, r1) st -> ('a, n, r) st =
  ...

let unit : ('a, n, Nat.zero) st = fun v -> Leaf v
Yet again, in the absence of higher-order kinds, elements of a tensor
cannot be made dependent on n and r. Thus, the above join function
cannot be written.
6.7 Algebraic Operations
From this section onward, we assume tensor elements form a field, represented by a module R containing a type R.t and algebraic operations on it. The functions lambda and sum, as well as their imperative versions lambda_update and sum_update, straightforwardly witness the vector-space structure of tensors. The function hprod computes the component-wise product of two tensors of the same dimension and order, i.e. the Hadamard product “⊙”.
let lambda k st = map (R.( * ) k) st
let lambda_update k st = iter (R.( * ) k) st
let sum st1 st2 = apply (map R.( + ) st1) st2
let sum_update st1 st2 = iter2 R.( + ) st1 st2
let hprod st1 st2 = apply (map R.( * ) st1) st2
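For instance (a minimal usage sketch, with hypothetical tensors s and t of the same dimension and order, and a hypothetical scalar k : R.t):

let linear_comb = sum (lambda k s) t   (* the linear combination k*s + t *)
let hadamard    = hprod s t            (* the component-wise product s ⊙ t *)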
Multiplication of symmetric tensors is not a tensor product (which would not yield a symmetric result), nor even a symmetrized version of the tensor product. It is simply the product of homogeneous polynomials. According to our decomposition, we use the following Horner-compliant scheme with two products, one between tensors of equal dimensions (denoted $\times$) and the other between tensors of successive dimensions (denoted $\times'$). For any tensors $S$ and $T$ of the same dimension $N$ and arbitrary orders $R_1, R_2$, with $S$ structurally decomposed into
\[
S(X_0, \ldots, X_{N-1}) = S_1(X_0, \ldots, X_{N-2}) + X_{N-1}.S_2(X_0, \ldots, X_{N-1})
\]
we define:
\[
S \times T \;\triangleq\; S_1 \times' T + X_{N-1}.(S_2 \times T)
\]
Simultaneously, for any tensors $S$ and $T$ of successive dimensions $N-1, N$ and arbitrary orders $R_1, R_2$, with $T$ structurally decomposed into
\[
T(X_0, \ldots, X_{N-1}) = T_1(X_0, \ldots, X_{N-2}) + X_{N-1}.T_2(X_0, \ldots, X_{N-1})
\]
we define:
\[
S \times' T \;\triangleq\; S \times T_1 + X_{N-1}.(S \times' T_2)
\]
Clearly, these definitions amount to multiplying every coefficient of S by every coefficient of T exactly once, as at each recursive step a structural decomposition of exactly one of the arguments occurs. This remark alone suffices to establish the usual algebraic properties of the product: commutativity, associativity, bilinearity. This translates into the following imperative piece of code, where the partial products, whose sum is the final result, are successively used to update a result tensor (most of the time initially filled with R.zero). This gives an important speedup with respect to a functional encoding where, at each recursive step, some intermediate and ephemeral structures are likely to be allocated, unless aggressive compiler optimizations take place. Also, the functions are locally tail-recursive, which saves some stack space. Indeed, the product definitions show that recursion over the tensor order is done tail-recursively whereas recursion over the dimension is not. Therefore stack space grows only with N, as opposed to the resulting product tensor depth of R₁ + R₂ + N. The functions product_update and product_aux respectively implement × and ×′:
let rec product_update : type n r r1 r2.
  n Nat.isnat -> r1 Nat.isnat -> r2 Nat.isnat ->
  (r1, r2, r) Nat.add -> (R.t, n, r1) st -> (R.t, n, r2) st ->
  (R.t ref, n, r) st -> unit =
  fun n r1 r2 pr st1 st2 str ->
    match st1, pr with
    | Nil              , _            -> ()
    | Leaf v1          , Nat.Zadd     -> iter2 (fun v2 vr -> R.(vr + (v1 * v2)))
                                           st2 str
    | Node (st1l, st1r), Nat.Sadd pr' ->
      match str, r1, n with
      | Node (strl, strr), Nat.S r1', Nat.S n' ->
        begin
          let pr'' = Nat.add_comm r2 pr' in
          product_aux    n' r1' r2 pr'' st1l st2 str;
          product_update n  r1' r2 pr'  st1r st2 strr;
        end

and product_aux : type n r r1 r2. n Nat.isnat -> r1 Nat.isnat -> r2 Nat.isnat ->
  (r2, r1, r) Nat.add -> (R.t, n, r1 Nat.succ) st -> (R.t, n Nat.succ, r2) st ->
  (R.t ref, n Nat.succ, r Nat.succ) st -> unit =
  fun n r1 r2 pr st1 st2 str ->
    match st2, pr with
    | Leaf v2          , Nat.Zadd     -> iter2 (fun v1 vr -> R.(vr + (v1 * v2)))
                                           st1 (match str with
                                                | Node (strl, _) -> strl)
    | Node (st2l, st2r), Nat.Sadd pr' ->
      match str, r2 with
      | Node (strl, strr), Nat.S r2' ->
        begin
          let pr'' = Nat.Sadd (Nat.add_comm r1 pr) in
          product_update n (Nat.S r1) r2  pr'' st1 st2l strl;
          product_aux    n r1         r2' pr'  st1 st2r strr;
        end
Note that a pure functional implementation with the least number of non-structural operations (such as auxiliary tensor additions) is obtainable by unfolding the companion function $\times'$ and rearranging terms to explicitly display a convolution formula (where the $S_{12^i}$ denote the tensors at depth $i$ belonging to the fringe of the right branch of $S$, and similarly for $T$, as illustrated in Figure 6.1):

[Figure 6.1 – Tensor decomposition unfolded: the chain of right sub-tensors $S_1, S_{12}, S_{12^2}, \ldots, S_{12^{R_1-1}}, S_{12^{R_1}}$ reached by following the $x_N$ branches, omitted.]

\[
S \times T
= \sum_{i=0}^{R_1} (S_{12^i} \times' T)\, X_{N-1}^{\,i}
= \sum_{i\in[0,R_1],\, j\in[0,R_2]} (S_{12^i} \times T_{12^j})\, X_{N-1}^{\,i+j}
= \sum_{r=0}^{R_1+R_2} X_{N-1}^{\,r} \sum_{i+j=r} (S_{12^i} \times T_{12^j})
\]

It appears that, at each step, it would perform approximately $\frac{R_1 \times R_2}{2}$ more tensor additions, each with an allocation cost of up to $\theta\bigl(\binom{R_1+R_2+N}{R_1+R_2}\bigr) = \theta((R_1+R_2)^N)$, assuming $N$ is fixed. Overall, this would yield a solution of prohibitive complexity.
The above decomposition serves as a basis to estimate the number of recursive calls made by the product functions. We prove that:

Proposition 6.7.1 The number of recursive calls of the product functions is in $\theta((R_1 \times R_2)^N)$ for $N$ fixed.

Proof By induction on $N$. The base case holds since there are no sub-calls. For $R_1 = 0$ or $R_2 = 0$, the property holds because the product is then equivalent to a single map. For $R_1, R_2 \neq 0$, the above decomposition is obtained through $(R_1 - 1) \times (R_2 - 1)$ recursive calls of $\times$ and $\times'$, then the induction hypothesis can be applied to the remaining products, whose arguments are tensors of dimension $N - 1$. So the total number of calls equals:
\[
(R_1 - 1) \times (R_2 - 1) + \sum_{i=0}^{R_1} \sum_{j=0}^{R_2} \theta\bigl((i \times j)^{N-1}\bigr) = \sum_{i=0}^{R_1} \theta\bigl(i^{N-1} \times R_2^N\bigr) = \theta\bigl(R_1^N \times R_2^N\bigr)
\]
Therefore, our implementation lies in the optimal complexity class, as the number of coefficient products required to form the whole tensor product obviously belongs to the same class $\theta((R_1 \times R_2)^N)$.
6.8 Non-structural Decomposition
Tensors of dimension N may not only be structurally decomposed on XN
but also on any other Xk . For that purpose, the “ [ ]” function specializes
a tensor, i.e. drops some index by specializing it to a specific dimension k.
Conversely, the “ ↑ ” function increments the order of its tensor argument.
For a tensor of dimension N and order R, they are defined in terms of
polynomials as:
\[
(S[k])(X_0, \ldots, X_{N-1}) \;\triangleq\; \frac{S(X_0, \ldots, X_{N-1}) - S(X_0, \ldots, X_{k-1}, 0, X_{k+1}, \ldots, X_{N-1})}{X_k}
\]
\[
(S \uparrow k)(X_0, \ldots, X_{N-1}) \;\triangleq\; X_k\,.\,S(X_0, \ldots, X_{N-1})
\]
Using the same notations as for the tensor product, we show how these operators simply fit the structural decomposition:
\[
S[k] = (S_1 + X_{N-1}.S_2)[k] =
\begin{cases}
S_2 & \text{for } k = N-1\\[4pt]
\dfrac{S_1 + X_{N-1}.S_2 - S_1|_{X_k \leftarrow 0} - X_{N-1}.S_2|_{X_k \leftarrow 0}}{X_k} = S_1[k] + X_{N-1}.S_2[k] & \text{otherwise}
\end{cases}
\]
\[
S \uparrow k =
\begin{cases}
0 + X_{N-1}.S & \text{for } k = N-1\\[4pt]
(S_1 + X_{N-1}.S_2).X_k = S_1 \uparrow k + X_{N-1}.(S_2 \uparrow k) & \text{otherwise}
\end{cases}
\]
Figure 6.2 illustrates these operations on a tensor of dimension N = 4 and order R = 3, for k = 2. The set operation removes the red sub-trees and the blue edges, and merges the nodes at the extremities of the blue edges. The lift operation proceeds the other way, inserting blue edges and creating zero-filled red sub-trees.

[Figure 6.2 – Illustration of set and lift operations: the fully developed binary tree of the dimension-4, order-3 tensor, with the sub-trees and edges affected by set/lift for k = 2 highlighted, omitted.]
It yields the following implementation, where set and lift respectively denote “[ ]” and “↑”. Both functions perform a single traversal of their argument, hence exhibit a complexity of θ(R^N) for N fixed. The auxiliary function make implements constant tensors, used for 0:
(* set k T_{i0,...,iR} = {i1,...,iR} |-> T_{k,i1,...,iR} *)
let rec set : type n d k r. (d, k, n) Nat.add ->
  (R.t, n Nat.succ, r Nat.succ) st -> (R.t, n Nat.succ, r) st =
  fun pr st ->
    match pr, st with
    | Nat.Zadd    , Node (stl, str) -> str
    | Nat.Sadd pr', Node (stl, str) ->
      match str with
      | Node _ -> Node (set pr' stl, set pr str)
      | Leaf _ -> match set pr' stl with | Leaf v -> Leaf v

(* lift k T_{i1,...,iR} =
   {i0,...,iR} |-> if \E m: im=k then T_({i0,...,iR}\{im}) else 0 *)
(* set k (lift k T) = T *)
let rec lift : type n d k r. r Nat.isnat -> k Nat.isnat -> (d, k, n) Nat.add ->
  (R.t, n Nat.succ, r) st -> (R.t, n Nat.succ, r Nat.succ) st =
  fun r k pr st ->
    match pr, st with
    | Nat.Zadd    , _               -> Node (make k (Nat.S r) R.zero, st)
    | Nat.Sadd pr', Leaf v          -> Node (lift r k pr' (Leaf v), Leaf R.zero)
    | Nat.Sadd pr', Node (stl, str) -> Node (lift r k pr' stl,
                                             lift (Nat.pred r) k pr str)
Both set and lift will come in handy when considering the differential
structure of tensors.
6.9 Reduction Operations
Reduction consists in computing the global sum of all coefficients. This is easily implemented by a single traversal with an auxiliary accumulator. As an operation on polynomials, reduction is merely evaluation at the point $X_0 = \ldots = X_{N-1} = 1$. k-reduction only performs partial sums and yields a tensor of a lower order $k$ by cutting tensor branches deeper than $k$; reduction is 0-reduction. k-reduction, denoted “$\Sigma_k$”, has a meaningful tensor interpretation in terms of occurrences only. For a tensor $S_{(o_0,\ldots,o_{N-1})}$ of dimension $N$ and order $R$, i.e. such that $\sum_i o_i = R$, we define:
\[
(\Sigma_k S)_{(r_0,\ldots,r_{N-1})} \;\triangleq \sum_{(o_0,\ldots,o_{N-1}) \,\ge\, (r_0,\ldots,r_{N-1})} S_{(o_0,\ldots,o_{N-1})} \qquad \Bigl(\text{for } \textstyle\sum_i r_i = k\Bigr)
\]
where $(o_0, \ldots, o_{N-1}) \ge (r_0, \ldots, r_{N-1})$ is the following ordering:
\[
(o_0, \ldots, o_{N-1}) \ge (r_0, \ldots, r_{N-1}) \;\triangleq\; o_0 \ge r_0 \;\wedge\;
\begin{cases}
(o_1, \ldots, o_{N-1}) \ge (r_1, \ldots, r_{N-1}) & \text{if } r_0 = 0\\
(o_1, \ldots, o_{N-1}) = (r_1, \ldots, r_{N-1}) & \text{otherwise}
\end{cases}
\]
Proposition 6.9.1 The occurrence ordering (o0 , . . . , oN −1 ) ≥ (r0 , . . . , rN −1 )
also represents the sub-tree ordering.
Proof (⇒) Let k be the least index such that ri = 0 for every i < k. Then
(o0 , . . . , oN −1 ) characterizes a sub-tree of (0, . . . , 0, rk , . . . , rN −1 ). (⇐) Let
us consider a node nr which lies at occurrence (0, . . . , 0, rk , . . . , rN −1 ) from
the root with rk 6= 0. Then (o0 , . . . , oN −1 ) ≥ (0, . . . , 0, rk , . . . , rN −1 ) means
from the definition that (o0 , . . . , oN −1 ) denotes a sub-tree of nr .
reduction and k_reduction are implemented below. They will serve as
a basis for error refinement of Taylor expansions:
let rec reduction : type n r. (R.t, n, r) st -> R.t -> R.t =
  fun st acc ->
    match st with
    | Nil             -> acc
    | Leaf v          -> R.( + ) v acc
    | Node (stl, str) -> reduction stl (reduction str acc)

let rec k_reduction : type n d r0 r1. (r0, d, r1) Nat.add ->
  (R.t, n, r1) st -> (R.t, n, r0) st =
  fun pr st ->
    match pr with
    | Nat.Zadd     -> (match st with
                       | Nil    -> Nil
                       | Leaf v -> Leaf v
                       | Node _ -> Leaf (reduction st R.zero))
    | Nat.Sadd pr' -> match st with
                      | Nil             -> Nil
                      | Node (stl, str) -> Node (k_reduction pr stl, k_reduction pr' str)
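For instance (a minimal usage sketch, with a hypothetical tensor t of R.t elements), evaluating the associated homogeneous polynomial at $X_0 = \ldots = X_{N-1} = 1$ is simply:

let at_ones = reduction t R.zero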
6.10 Differential Operations
Differential operations introduce partial differentiation and integration, as well as a primitive “variable” tensor, in the tensor algebra. To that purpose, the dimension N is defined as a constant, in the form of a module satisfying the following interface Natural:

(* instantiations of this interface denote natural numbers *)
module type Natural =
sig
  (* the type encoding a natural number *)
  type t
  (* the proof that it encodes a natural number *)
  val isnat : t Nat.isnat
end
The var function builds a tensor of order 1 corresponding to a monomial $X_k$. proj and inj respectively call set and lift to perform partial derivation and integration, with the help of the auxiliary function pdiff_coeff which computes the tensor of integration/derivation factors “$\Delta_k$” such that:
\[
\begin{aligned}
(\Delta_k)_{(o_0,\ldots,o_{N-1})} &\;\triangleq\; 1 + o_k, \quad \text{for } \textstyle\sum_i o_i = R\\[4pt]
\frac{dS(X_0,\ldots,X_{N-1})}{dX_k} &\;\triangleq\; S[k] \odot \Delta_k\\[4pt]
\int_0^{X_k} S(X_0,\ldots,x_k,\ldots,X_{N-1})\, dx_k &\;\triangleq\; (S \odot \Delta_k^{-1}) \uparrow k
\end{aligned}
\]
Finally, the main functions pdiff and pinteg themselves call proj and
inj while encapsulating indices and their associated proofs into Indices.index
structures. Such structures, which represent safe indices only, are provided
by our infrastructure and keep the user away from complex type-aware indices manipulation.
let var : 'k Nat.isnat -> ('d Nat.succ, 'k, N.t) Nat.add ->
  (R.t, N.t, Nat.zero Nat.succ) st =
  fun k pr -> delta N.isnat k pr

let proj : 'r Nat.isnat -> 'k Nat.isnat -> ('d Nat.succ, 'k, N.t) Nat.add ->
  (R.t, N.t, 'r Nat.succ) st -> (R.t, N.t, 'r) st =
  fun r k pr ->
    match pr with
    | Nat.Sadd pr' ->
      fun ve -> map2 R.( * ) (set k pr' ve) (pdiff_coeff r k pr)

let inj : 'r Nat.isnat -> 'k Nat.isnat -> ('d Nat.succ, 'k, N.t) Nat.add ->
  (R.t, N.t, 'r) st -> (R.t, N.t, 'r Nat.succ) st =
  fun r k pr ->
    match pr with
    | Nat.Sadd pr' ->
      fun ve -> lift r k pr' (map2 R.( / ) ve (pdiff_coeff r k pr))

let pinteg : N.t Indices.index -> 'r Nat.isnat -> (R.t, N.t, 'r) st ->
  (R.t, N.t, 'r Nat.succ) st =
  fun (Indices.Idx (_, k, pr)) r -> inj r k pr

let pdiff : N.t Indices.index -> 'r Nat.isnat -> (R.t, N.t, 'r Nat.succ) st ->
  (R.t, N.t, 'r) st =
  fun (Indices.Idx (_, k, pr)) r -> proj r k pr
6.11 Changing Tensor Basis
We define here a circular permutation of tensor dimensions. This operation is easily described when seen from the viewpoint of homogeneous polynomials:
\[
S(X_0, X_1, \ldots, X_{N-2}, X_{N-1}) \;\mapsto\; S(X_1, X_2, \ldots, X_{N-1}, X_0)
\]
As regards our tensor decomposition of an $ST(N, R)$ tensor $S = S_1 + X_{N-1}.S_2$, the principle is:
— first, to rotate the basis of $S_2$ and then lift it from $ST(N+1, R)$ to $ST(N+1, R+1)$ by implicitly multiplying it by $X_0$, the image of the leading decomposition variable $X_{N-1}$;
— second, as $S_1$ has no $X_{N-1}$ variable to transform, it is also implicitly lifted from $ST(N, R+1)$ to $ST(N+1, R+1)$ by shifting up its variables, creating a tensor without variable $X_0$ that needs to be completed;
— third, the rotated $S_2$ and $S_1$, now of the same dimension and order, may be merged together. This consists in a single traversal of both structures where the null-leaves of $S_1$ are replaced by the now $X_0$-prefixed coefficients of $S_2$. The result is thus a complete rotated tensor.
A formalization of this algorithm, denoted “⟲”, is straightforward, as merging boils down to polynomial addition and rotation:
\[
⟲S \;\triangleq\; ⟲(S_1 + X_{N-1}.S_2) = S_1(X_1, \ldots, X_{N-1}) + X_0.⟲S_2
\]
A proof of correctness then follows:

Proposition 6.11.1 $⟲(S(X_0, \ldots, X_{N-2}, X_{N-1})) = S(X_1, \ldots, X_{N-1}, X_0)$

Proof By induction on the tensor order $R$. The base case $R = 0$ trivially holds as there is no variable and no decomposition. As for the inductive case, we have: $⟲(S_1 + X_{N-1}.S_2) = S_1(X_1, \ldots, X_{N-1}) + X_0.⟲S_2 = S_1(X_1, \ldots, X_{N-1}) + X_0.S_2(X_1, \ldots, X_{N-1}, X_0) = S(X_1, \ldots, X_{N-1}, X_0)$.
Speaking of complexity, rotating as well as merging essentially amounts to a single traversal of their arguments. They are implemented by the functions superpose and rotate below:
let rec superpose : type n r. ('a, n, r Nat.succ) st -> ('a, n Nat.succ, r) st
  -> ('a, n Nat.succ, r Nat.succ) st =
  fun stl x0_str ->
    match stl, x0_str with
    | Nil                , Leaf v            -> Node (Nil, Leaf v)
    | Node (stll, Leaf w), Leaf v            -> Node (superpose stll (Leaf v), Leaf w)
    | Nil                , Node (Nil, strr)  -> Node (Nil, superpose Nil strr)
    | Node (stll, stlr)  , Node (strl, strr) -> Node (superpose stll strl,
                                                      superpose stlr strr);;

(* rotate variables to the right: x(0) -> x(1),..., x(N-1) -> x(0) *)
let rec rotate : type n r. ('a, n, r) st -> ('a, n, r) st =
  fun st ->
    match st with
    | Nil             -> Nil
    | Leaf v          -> Leaf v
    | Node (stl, str) -> superpose stl (rotate str);;
Basis change comes in handy when Taylor approximations are successively refined into sharper models. Our refinement technique is strictly structural with respect to the tensor data structure, which may be a weakness, as shown in 8.10. Therefore we might need a more general way to shuffle and mix precision information from different dimensions, to increase refinement opportunities. Actually, arbitrary basis permutations of $x_0, \ldots, x_{N-1}$ may be achieved quite simply through rotations of (sub-)tensors only: first, by a certain number of rotations, bring the desired variable $x_{\sigma(N-1)}$ to the first position $N-1$; then, recursively, apply the remaining changes on every sub-tensor of dimension $N-1$. Nonetheless, we feel the amount of spurious auxiliary rotations doesn't make it a workable technique.
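For instance (a minimal sketch), iterating the circular permutation is immediate, and rotating a dimension-N tensor N times in a row yields the original basis back:

let rec rotate_n k st = if k <= 0 then st else rotate_n (k - 1) (rotate st)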
6.12 Perspectives
All operations developed so far have optimal complexity with respect to the size of the tensor structure, but are sub-optimal in the number of tensor elements: r times more costly than optimal for a fixed dimension n. This seems to tip the scales in favour of a linear storage of tensor elements in a single-dimension array, as done in most implementations. Nevertheless, this does not account for complex operations that are hardly implementable in such a basic scheme. For instance, many operations, such as multiplication, would require non-trivial index manipulations, which have a cost. Moreover, many operations, such as reduction or basis change, which are simply structural here, would involve an overall complex reordering and reindexing of array elements. We firmly think that the only tractable and trustable way lies in keeping track of tensor branches and basically rebuilding the tensor structure at run-time.
Still, for structure-preserving functions, such as linear operators, we may sacrifice some more storage and attain optimal complexity. Instead of directly storing elements in tree leaves, we store an offset within a single-dimension array of elements. This redundancy allows structure-preserving functions to be computed in optimal time, provided the tree structure pointing to the array is computed only when needed. This lazy pairing between an
array and a tree can be computed efficiently with a global offset index that
increments each time a leaf is encountered while creating the tree structure.
Independently, prolonging the analogy with binary decision trees seems
promising, as some amount of sharing and subsequent path reductions may
be introduced in the data structure, so that we get very close to binary
decision diagrams. In our case, the only practical sharing occurs for 0 tensors
and also more marginally for tensors resulting from some Taylor expansion
of the exp function (because we have $\frac{d\exp(x)}{dx} = \exp(x)$, hence the sharing).
Chapter 7
Values and Errors
7.1 Values
Numerical values are at the core of our application. They must be provided as a module containing a type t for values as well as the associated operations, implementing the common module interface Numeric. This interface contains arithmetical functions, classical elementary functions and order-related functions. The latter make it possible to implement error refinement, whereas the former are present only because they reflect corresponding functions implemented at the level of Taylor series.
If one is not interested in certified errors and only estimates orders of magnitude, an implementation with floating-point numbers seems a sensible choice. If one wishes to obtain certified computations though, implementations where a value is a (small) interval of possible real numbers are already available. Representing numbers with exact rational values is not possible in our case, because we plan to include in our framework Taylor expansions of elementary functions, which in general do not produce rational numbers. More involved representations of real numbers are possible, such as limits of converging computing processes, but we feel that they would bring an intolerable amount of extra complexity to an application that already spends most of its time producing lots of numbers.
7.2 Errors

7.2.1 Introduction
Having an error domain is the second requisite for our application. To
the best of our knowledge, implementations are most often based on intervals
and interval operations. These intervals may seamlessly blend in pure numerical uncertainties coming from floating-point numbers. Therefore values and
errors together may be represented as a single domain, which simplifies formalization and implementation. Yet, what our application really needs are
pointed intervals whose bounds are functions, not values. First, an interval with a specific value pointed at is required, as we compute Taylor expansions at a certain point A. Second, we want to be able to quickly recompute the error made on the value of some function f around point A when the uncertainty on A changes. To avoid recomputing tensor coefficients and Taylor expansions with the same values and new errors, we produce error functions once and for all. These functions, when provided with an error on A, yield an error on f(A). Hence, they are zero-centered and monotonous.
To save computation time, we gave up some potential precision. Instead of dealing with pointed intervals, which amount to one value and two functions for the upper and lower bounds, we implemented centered errors with one value v and one positive error function ε. This is detailed in section 7.3 below.
As a conclusion, we assume that more accurate results will come not from the error domain alone, but mostly from the ability to dynamically choose on demand the expansion order of Taylor series, supplemented by error-refinement heuristics. Although this choice may be deemed reasonable, many case studies are needed before this intuition turns into trust.
7.3 Error model
Let us assume $K$ stands for the value domain. Error functions are then elements of the following domain $E$:
\[
E \;\triangleq\; \{\, f \in (K^+)^N \to K^+ \mid f(0) = 0,\ f \text{ monotonous} \,\}
\]
The error model is then the product $K \times E$. The semantics $\llbracket\,\cdot\,\rrbracket$ of an element of this model represents a function from variable bounds to sets of possible values:
\[
\llbracket (v, \varepsilon) \rrbracket \;\triangleq\; X \in (K^+)^N \mapsto \{\, k \in K \mid |k - v| \le \varepsilon(X) \,\}
\]
The error model has $N + 1$ constructors: $(k, 0)$ for $k \in K$, denoted “$k$”, and the $i \in [0, N-1]$ indexed family $(0, X \mapsto X_i)$, denoted “$X_i$”. It is endowed with a $K$-algebra structure:
\[
\begin{aligned}
(v_1, \varepsilon_1) + (v_2, \varepsilon_2) &\;\triangleq\; (v_1 + v_2,\ \varepsilon_1 + \varepsilon_2)\\
\alpha \times (v, \varepsilon) &\;\triangleq\; (\alpha \times v,\ |\alpha| \times \varepsilon)\\
(v_1, \varepsilon_1) \times (v_2, \varepsilon_2) &\;\triangleq\; (v_1 \times v_2,\ v_1 \times \varepsilon_2 + v_2 \times \varepsilon_1 + \varepsilon_1 \times \varepsilon_2)
\end{aligned}
\]
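As a small worked instance of the product rule (with hypothetical values $v_1 = 2$ and $v_2 = 3$):
\[
(2, \varepsilon_1) \times (3, \varepsilon_2) = (6,\ 2\varepsilon_2 + 3\varepsilon_1 + \varepsilon_1\varepsilon_2)
\]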
So far the error model allows the definition of polynomial terms. It is further turned into a full-fledged domain using the definitions of elementary functions on $K \times E$ in the following table. Similar definitions may be devised for other elementary functions, such as inverse trigonometric functions:
\[
\begin{aligned}
e^{(v,\varepsilon)} &\;\triangleq\; (e^v,\ e^v \times (e^{\varepsilon} - 1))\\
\log(v, \varepsilon) &\;\triangleq\; (\log v,\ \log(1 + \tfrac{\varepsilon}{v})) \quad (v \neq 0)\\
\sin(v, \varepsilon) &\;\triangleq\; (\sin v,\ |\sin v| \times (1 - \cos\varepsilon) + |\cos v| \times |\sin\varepsilon|)\\
\cos(v, \varepsilon) &\;\triangleq\; (\cos v,\ |\cos v| \times (1 - \cos\varepsilon) + |\sin v| \times |\sin\varepsilon|)\\
\sinh(v, \varepsilon) &\;\triangleq\; (\sinh v,\ |\sinh v| \times (\cosh\varepsilon - 1) + |\cosh v| \times |\sinh\varepsilon|)\\
\cosh(v, \varepsilon) &\;\triangleq\; (\cosh v,\ |\cosh v| \times (\cosh\varepsilon - 1) + |\sinh v| \times |\sinh\varepsilon|)
\end{aligned}
\]
We have to prove that these operators are consistent with respect to the semantics of the error domain:

Proposition 7.3.1 For every operator “op” of arity $n$ defined above, for every $x_1, \ldots, x_n \in K$ and every $X \in (K^+)^N$, the following inference is valid:
\[
\frac{x_1 \in \llbracket (v_1, \varepsilon_1) \rrbracket(X) \quad \ldots \quad x_n \in \llbracket (v_n, \varepsilon_n) \rrbracket(X)}
     {op(x_1, \ldots, x_n) \in \llbracket op((v_1, \varepsilon_1), \ldots, (v_n, \varepsilon_n)) \rrbracket(X)}
\]
Proof (sketch). Algebraic operations are trivially correct. As regards non-algebraic operations, they are easily proved correct once an additive decomposition of op is granted, i.e. an expression of $op(x_1 + x_2)$ in terms of $op(x_1)$ and $op(x_2)$. This kind of decomposition may be found in reference textbooks or figured out from operator properties. As an example, we develop the sin case. The error around $\sin v$ is given by:
\[
\begin{aligned}
|\sin(v \pm \varepsilon) - \sin v| &= |\sin v \times \cos\varepsilon \pm \cos v \times \sin\varepsilon - \sin v|\\
&= |\sin v \times (\cos\varepsilon - 1) \pm \cos v \times \sin\varepsilon|\\
&\le |\sin v| \times (1 - \cos\varepsilon) + |\cos v| \times |\sin\varepsilon|
\end{aligned}
\]
7.4 Tensors of Error Model Elements
We state the general form of multivariate Taylor expansions at order $R$:
\[
f(x) = \sum_{|\alpha| < R} D^{\alpha} f(0)\, \frac{x^{\alpha}}{\alpha!} \;+\; \sum_{|\alpha| = R} D^{\alpha} f(\lambda * x)\, \frac{x^{\alpha}}{\alpha!}
\qquad x \in \mathbb{R}^N,\ \alpha \in \mathbb{N}^N,\ \lambda \in [0, 1]
\]
If we suppose $R$ is not fixed and that a user can increment it dynamically to improve the quality of a Taylor expansion, what appears in this formulation is the need to compute tensors of partial derivatives applied at point 0 at any order, as well as a Taylor remainder at any order too, because $R$ may vary. These remainders depend upon a point $\lambda * x$ which is unknown but still lies in some bounded neighbourhood of point 0. The position of $\lambda * x$ is interpreted as an error model element $(0, \varepsilon)$. The partial derivatives applied at point 0 are values, interpreted as error model elements $(v, 0)$. Interestingly, for any $R$ and $P$, the sum of terms of a Taylor expansion under the following form:
\[
\sum_{|\alpha| = R+1}^{R+P} D^{\alpha} f(0)\, \frac{x^{\alpha}}{\alpha!} \;+\; \sum_{|\alpha| = R+P+1} D^{\alpha} f(\lambda * x)\, \frac{x^{\alpha}}{\alpha!}
\]
is also an error term of order $R + 1$, i.e. may be compared to the original error term:
\[
\sum_{|\alpha| = R+1} D^{\alpha} f(\lambda * x)\, \frac{x^{\alpha}}{\alpha!}
\]
In particular, both formulations are identical for $P = 0$. This remark paves the way for error refinement, for Taylor expansions get more precise as $P \to +\infty$, under a convergence hypothesis.
7.5 Error refinement
Error refinement is an extra step, mandatory to obtain the best precision. At first glance, it seems that existing approaches, implemented in tools such as COSY, don't need such refinement. As a matter of fact, this refinement is indeed present but totally implicit and hard-coded in the core algorithms. We propose to detach it from the bare computation of Taylor expansions, so that the user can parameterize it, even dynamically if so wished. This is a strength of our approach, which appears more “natural” in this respect.
The difference is straightforward. Classical approaches handle Taylor expansions as polynomials of fixed degree d. Therefore, when computing products of such polynomials for instance, terms of degree greater than d are approximated and gathered in the error of degree d. In the particular case of the polynomial product, this amounts to refining the error at degree d by terms of degree up to 2d. While this is a totally arbitrary refinement bound, induced and imposed by the specific implementation of a product algorithm, it would also prove overly costly in our multidimensional setting. A rapid calculation shows that computing an error this way involves computing θ(2^N) times more tensor coefficients than strictly necessary to obtain a Taylor expansion without any error refinement with respect to order d, N being the fixed dimension of the problem.
To define refinement, we first need an inclusion ordering on errors.

Definition For any pair of error model elements $(v_1, \varepsilon_1)$ and $(v_2, \varepsilon_2)$, we define the following ordering:
\[
\begin{aligned}
(v_1, \varepsilon_1) \sqsubseteq (v_2, \varepsilon_2) &\;\triangleq\; \forall X \in (K^+)^N.\ \llbracket (v_1, \varepsilon_1) \rrbracket(X) \subseteq \llbracket (v_2, \varepsilon_2) \rrbracket(X)\\
&= \{k \in K \mid |k - v_1| \le \varepsilon_1(X)\} \subseteq \{k \in K \mid |k - v_2| \le \varepsilon_2(X)\}\\
&= [v_1 - \varepsilon_1(X),\ v_1 + \varepsilon_1(X)] \subseteq [v_2 - \varepsilon_2(X),\ v_2 + \varepsilon_2(X)]\\
&= v_2 - \varepsilon_2(X) \le v_1 - \varepsilon_1(X) \;\wedge\; v_1 + \varepsilon_1(X) \le v_2 + \varepsilon_2(X)\\
&= |v_2 - v_1| \le (\varepsilon_2 - \varepsilon_1)(X)
\end{aligned}
\]
As a direct consequence of property 7.3.1, all operators on error model
elements can be proved monotonic with respect to inclusion ordering.
7.5.1 Reduction of Value-Error Tensor
In order to exploit the order r + 1 tensor to refine order r, we must somehow turn a degree r + 1 polynomial into a degree r one. As long as it preserves inclusion, any kind of injection would do the job. To save complexity, we propose a structural operation that decomposes an order r + 1 tensor containing elements of the error model into an order r tensor whose elements are order 1 tensors (as the monadic structure of tensors would have given, see 6.6). These order 1 tensors are then each fully reduced to error model elements. This is akin to an (r − 1)-reduction as defined in 6.9. This operation, named k-reduction-as-error and denoted “$\Sigma_X$”, is defined for an occurrence-style tensor $S_{(o_0,\ldots,o_{N-1})}$ of error model elements, with dimension N and order R, i.e. such that $\sum_i o_i = R > 0$:
\[
(\Sigma_X S)_{(r_0,\ldots,r_{N-1})} \;\triangleq \sum_{(o_0,\ldots,o_{N-1}) \,\ge\, (r_0,\ldots,r_{N-1})} X_0^{\,o_0 - r_0} \cdot \ldots \cdot X_{N-1}^{\,o_{N-1} - r_{N-1}} \cdot S_{(o_0,\ldots,o_{N-1})}
\qquad \Bigl(\text{for } \textstyle\sum_i r_i = R - 1\Bigr)
\]
Here, sums, products and the variables $X_i$ are interpreted in the error model. Further order reductions are possible by iterating $\Sigma_X$. An application is illustrated in figure 7.1. Reduction-as-error is there applied to a tensor of dimension and order 3 whose elements $s_{i,j,k}$ are values of the error domain $(v, \varepsilon)$, and it yields an order 2 tensor, as the result of folding order 1 tensors (pictured in red) into zero-centered error values $(0, \varepsilon)$. To obtain these error values, nodes are interpreted as sums, $x_i$ branches as multiplication by $(0, X_i)$, $\overline{x_i}$ branches as multiplication by $(1, 0)$, and null-leaves as $(0, 0)$. Therefore, the resulting tensor is zero-centered, i.e. contains only $(0, \varepsilon)$ values.
Finally, we prove that $\Sigma_X S$ gives an over-approximation of $S$ with respect to the inclusion ordering. This property lets us use k-reduction-as-error without compromising our claim for certified errors. For that purpose, we denote “$S[x_0, \ldots, x_{N-1}]$” the polynomial error model term where $S$ is an error model tensor and the variables $x_i \in K$. This polynomial term represents an error model element. For the sake of simplicity, we slightly abuse the notation for function application and denote:
\[
(v, \varepsilon)(X) \;\triangleq\; (v, \varepsilon(X))
\]
[Figure 7.1 – Illustration of tensor reduction as error: the dimension-3, order-3 tensor with coefficients $s_{i,j,k}$ is folded into an order-2 tensor; its red order-1 sub-trees are reduced to zero-centered error values, the branches being weighted by $1$ or $X_i$, omitted.]
Proposition 7.5.1 For every error model tensor $S$ of order $R > 0$ and dimension $N$, for every $x_0, \ldots, x_{N-1} \in K$ and every $X \in (K^+)^N$:
\[
(\forall i \in [0, N-1].\ |x_i| \le X_i) \implies S[x_0, \ldots, x_{N-1}](X) \sqsubseteq (\Sigma_X S)[x_0, \ldots, x_{N-1}](X)
\]
Proof By double induction on $N$ and $R$. We assume $|x_i| \le X_i$ for every $i \in [0, N-1]$. The property trivially holds for $N = 0$ as $S$ is then the zero polynomial. For the other base case $R = 1$, we develop both hand sides of the inclusion:
\[
\begin{aligned}
& S[x_0, \ldots, x_{N-1}](X) \sqsubseteq (\Sigma_X S)[x_0, \ldots, x_{N-1}](X)\\
\equiv\;& S[x_0, \ldots, x_{N-1}](X) \sqsubseteq \bigl(\textstyle\sum_i X_i \times (v_i, \varepsilon_i)\bigr)[x_0, \ldots, x_{N-1}](X)\\
\equiv\;& \bigl(\textstyle\sum_i x_i \times (v_i, \varepsilon_i)\bigr)(X) \sqsubseteq \bigl(\textstyle\sum_i X_i \times (v_i, \varepsilon_i)\bigr)(X)\\
\equiv\;& \bigl(\textstyle\sum_i x_i \times v_i,\ \textstyle\sum_i |x_i| \times \varepsilon_i\bigr)(X) \sqsubseteq \bigl(\textstyle\sum_i (0,\ X_i \times (|v_i| + \varepsilon_i))\bigr)(X)\\
\equiv\;& \bigl(\textstyle\sum_i x_i \times v_i,\ \textstyle\sum_i |x_i| \times \varepsilon_i(X)\bigr) \sqsubseteq \bigl(0,\ \textstyle\sum_i X_i \times (|v_i| + \varepsilon_i(X))\bigr)\\
\equiv\;& \bigl|\textstyle\sum_i x_i \times v_i\bigr| \le \textstyle\sum_i X_i \times (|v_i| + \varepsilon_i(X)) - \textstyle\sum_i |x_i| \times \varepsilon_i(X)\\
\equiv\;& \bigl|\textstyle\sum_i x_i \times v_i\bigr| \le \textstyle\sum_i X_i \times |v_i| + \textstyle\sum_i \varepsilon_i(X) \times (X_i - |x_i|)\\
\Leftarrow\;& \bigl|\textstyle\sum_i x_i \times v_i\bigr| \le \textstyle\sum_i X_i \times |v_i| \qquad \bigl(\text{as } \textstyle\sum_i \varepsilon_i(X) \times (X_i - |x_i|) \ge 0\bigr)\\
\Leftarrow\;& \textstyle\sum_i |x_i| \times |v_i| \le \textstyle\sum_i X_i \times |v_i|\\
\Leftarrow\;& |x_i| \le X_i
\end{aligned}
\]
For the recursive case, we structurally decompose $S$ and assume the inductive hypothesis holds for both $S_1$ and $S_2$ below. Then we deduce:
\[
\begin{aligned}
& S[x_0, \ldots, x_{N-1}](X) = S_1[x_0, \ldots, x_{N-2}](X) + x_{N-1}.S_2[x_0, \ldots, x_{N-1}](X)\\
\Rightarrow\;& \Sigma_X S[x_0, \ldots, x_{N-1}](X) = \Sigma_X S_1[x_0, \ldots, x_{N-2}](X) + x_{N-1}.\Sigma_X S_2[x_0, \ldots, x_{N-1}](X)\\
\Rightarrow\;& \Sigma_X S[x_0, \ldots, x_{N-1}](X) \sqsupseteq S_1[x_0, \ldots, x_{N-2}](X) + x_{N-1}.S_2[x_0, \ldots, x_{N-1}](X)\\
\equiv\;& \Sigma_X S[x_0, \ldots, x_{N-1}](X) \sqsupseteq S[x_0, \ldots, x_{N-1}](X)
\end{aligned}
\]
7.5.2 Refinement of Value-Error Tensor
Now that we can reduce the order of any tensor, we may compare an original error tensor with reduced tensors of higher orders, with respect to the inclusion ordering. The general problem is then stated as follows: suppose we are given two different error model tensors (of the same order and dimension) $S$ and $T$. Is there any error model tensor that describes values belonging to both $\llbracket S \rrbracket$ and $\llbracket T \rrbracket$? If so, is there a most precise one?
In a one dimensional framework, the answer to the above question is
positive and readily implemented in the COSY tool [83] (with a different
interval-based error model though). In the 1D case, tensors are merely
(value, error) pairs. Refinement simply consists in taking the minimum
error of both inputs, since the value part is identical. In our multidimensional setting, we are not aware of any general solution that would compare
different errors at different positions in tensors and come out with a best
approximation. For instance, the naïve idea of taking the minimum of both
errors component-wise in tensors S and T, somehow prolonging the 1D case,
is clearly wrong. Our fallback solution is currently to replace the original
tensor by the reduced/refined one, since the refined one is supposed to bring
tighter error bounds.
7.6 Implementation
Implementation of the error domain closely follows its specification. To
improve efficiency, we added two simple optimizations.
7.6.1 Memoization
First, error formulas may duplicate error symbols, as in the case of multiplication:
\[
(v_1, \varepsilon_1) \times (v_2, \varepsilon_2) \;\triangleq\; (v_1 \times v_2,\ v_1 \times \varepsilon_2 + v_2 \times \varepsilon_1 + \varepsilon_1 \times \varepsilon_2)
\]
To avoid recomputing applications of error functions, we memoize these calls by encapsulating them in a memo function. Memoization also proves useful for error refinement. When a user is unsatisfied with the error computed at some order r, the choice is given to compute a new, smaller error at order r + 1 or greater. This new error naturally depends on previously computed values and errors at lower orders, so memoization avoids recomputing lower-order error terms; we render the computation of such terms of increasing order totally incremental. Nevertheless, if the user changes the bounds X, every error ε(X) must then be recomputed.
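A minimal sketch of such a memo wrapper (an assumption on our side, a simplified one-argument version keyed on the bounds X by structural equality, not necessarily the implementation used in the tool):

let memo f =
  let cache = Hashtbl.create 7 in
  fun x ->
    match Hashtbl.find_opt cache x with
    | Some y -> y
    | None   -> let y = f x in Hashtbl.add cache x y; y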
7.6.2 Zero Functions
Second, another optimization is based on the observation that many errors are in fact zero functions. This occurs for instance when computing Taylor expansions of polynomial functions, which are obviously exact from a certain development order onward and for which errors vanish and become zero functions. Our data structure thus distinguishes between this special case of the zero function and the general case. We also handle the special case of multiplication of an error function by the value 0, which yields zero functions too. This allows for much simplification, as 0 also absorbs other operands, is a neutral element of all error functions and propagates through error terms. To illustrate this phenomenon, we show the obvious simplifications obtained on our definitions:
\[
\begin{aligned}
(v_1, 0) + (v_2, \varepsilon_2) &\;\triangleq\; (v_1 + v_2,\ \varepsilon_2)\\
0 \times (v, \varepsilon) &\;\triangleq\; (0, 0)\\
\alpha \times (v, 0) &\;\triangleq\; (\alpha \times v,\ 0)\\
(v_1, 0) \times (v_2, \varepsilon_2) &\;\triangleq\; (v_1 \times v_2,\ v_1 \times \varepsilon_2)\\
e^{(v,0)} &\;\triangleq\; (e^v, 0)\\
\log(v, 0) &\;\triangleq\; (\log v, 0) \quad (v \neq 0)\\
\sin(v, 0) &\;\triangleq\; (\sin v, 0)\\
\cos(v, 0) &\;\triangleq\; (\cos v, 0)\\
\sinh(v, 0) &\;\triangleq\; (\sinh v, 0)\\
\cosh(v, 0) &\;\triangleq\; (\cosh v, 0)
\end{aligned}
\]
These simplifications are performed once and for all when error functions
are built. Therefore any subsequent application of these functions benefits
from a much smaller error term.
7.6.3 OCaml Code
The corresponding OCaml code is split into three modules, one for each of K, E and K × E. We only provide short excerpts of these modules. The value module K is currently built with simple floating-point numbers, not accounting for rounding errors. More sophisticated domains of values may be devised, provided they form an algebra with the required operations.
type t = float

let of_float x = x
let zero = of_float 0.
let one = of_float 1.
let ( + ) = ( +. )
let ( * ) = ( *. )
let exp = exp
let log = log
let sin = sin
The error module E and the value-error module K × E implement zero-centered errors, assuming Num denotes K and Vector denotes K^N. Other sensible choices, such as interval arithmetics, are also possible. We start with the error domain Err, with a specialization for zero errors.
type t = Nul | Fun of ((Num.t, N.t) Vector.t -> Num.t)
let var k pr =
let ik = Nat.isnat_to_int k in
Fun (fun env -> Num.abs (Vector.get (Indices.Idx (ik, k, pr)) env))
let ( <+> ) f g =
  match f, g with
  | Nul  , _     -> g
  | _    , Nul   -> f
  | Fun f, Fun g -> Fun (fun x -> Num.((f x) + (g x)))

let lambda k =
  let abs_k = Num.abs k in
  if abs_k = Num.zero then fun _ -> Nul else
  fun f -> match f with
    | Nul   -> Nul
    | Fun f -> Fun (fun x -> Num.( * ) abs_k (f x))

(* translate k e e' = (k+e)*e' *)
let translate k =
  if k = Num.zero then ( <*> ) else fun f1 ->
    match f1 with
    | Nul    -> lambda k
    | Fun f1 -> let abs_k = Num.abs k in
                let tr_f1 = Fun (fun arg -> Num.( + ) abs_k (f1 arg))
                in ( <*> ) tr_f1

let expm1 f =
  match f with
  | Nul   -> Nul
  | Fun f -> Fun (fun x -> Num.(-) (Num.exp (f x)) Num.one)

let log1p f =
  match f with
  | Nul   -> Nul
  | Fun f -> Fun (fun x -> Num.abs (Num.log (Num.(+) Num.one (f x))))

let sin f =
  match f with
  | Nul   -> Nul
  | Fun f -> Fun (fun x -> Num.abs (Num.sin (f x)))
The error-value module is built from Num and Err and follows the above description of our value-error domain. While most functions are standard members of numerical algebras and self-explanatory, the functions cancel_error and absorb_value are used to implement error refinement.
type t = Num.t * Err.t

let of_float x = (Num.of_float x, Err.nul)
let cancel_error (v, e) = (v, Err.nul)
let absorb_value e' (v, e) = (Num.zero, Err.translate v e e')
let ( + ) (v1, e1) (v2, e2) =
(v1 + v2, e1 <+> e2)
let lambda k (v, e) =
(k * v, Err.lambda k e)
let ( * ) (v1, e1) (v2, e2) =
let v1_v2 = v1 * v2 in
let e1 = memo e1 in
let e2 = memo e2 in
(v1_v2, (Err.lambda v1 e2) <+> (Err.lambda v2 e1) <+> (e1 <*> e2))
let exp (v, e) =
let exp_v = Num.exp v
in (exp_v, Err.lambda exp_v (Err.expm1 e))
let log (v, e) =
let log_v = Num.log v
in (log_v, Err.log1p (Err.lambda (Num.one / v) e))
let sin (v, e) =
let sin_v = Num.sin v in
let cos_v = Num.cos v in
let e = memo e in
(sin_v, (Err.lambda sin_v (Err.cosm1 e)) <+> (Err.lambda cos_v (Err.sin e)))
7.7 Perspectives

7.7.1 Coping with Rounding Errors
To really deserve the qualification of certified errors, our value-error algebra must cope with rounding errors of floating-point operations. Integrating
these errors into our scheme doesn’t seem too complex, as there already exist
tools and libraries (see for instance [47]) that augment numbers with certified
bounding boxes to represent errors.
7.7.2 Using Complex Numbers
Another, totally independent enhancement would be to use complex numbers instead of “real” numbers. In case a Taylor expansion has to be computed outside of its convergence domain, errors will silently fail to converge (or even diverge) as the expansion order increases. A famous example is the function $e^{-1/x^2}$, which is not analytical at point 0 when $x \in \mathbb{R}$. Its Taylor expansion is the zero polynomial and the error remains constant, for any domain of x values. A solution to these pathological cases is to switch to an algebra of complex numbers, where all elementary functions are well behaved. Complex numbers allow us to take advantage of a larger domain of convergence of Taylor expansions.
7.7.3 Change of Tensor Basis
Being structural and therefore cheap is the principal advantage of our error refinement technique. But it is also a drawback, as it imposes a predefined way of merging tensor coefficients, regardless of their respective magnitudes. In figure 7.1, the error values $s_{2,2,2}$, $s_{2,2,1}$ and $s_{2,2,0}$ will be merged together. Assuming that $X_i \simeq 1$ and that $s_{2,2,2}$ is much greater than any other tensor value, the total error of the rightmost red tree will be of the same magnitude as $s_{2,2,2}$. But if we swap dimensions 0 and 2, the rightmost red tree then merges $s_{0,0,0}$, $s_{1,0,0}$ and $s_{2,0,0}$ and will thus be of a lower magnitude.
Of course, this requires not losing track of the current tensor basis. Using for instance the function rotate defined in 6.11, which implements a circular permutation of basis vectors, would improve the precision of Taylor approximations. Refining errors N times in a row for a given Taylor expansion with this function also finally yields an unchanged basis. This would allow us to perform this operation locally, on demand, without perturbing the computation and representation of the other Taylor expansions.
7.7.4 More Precise Error Models
If we want to go to the other end of this spectrum, i.e. precise and potentially costly error domains that would save the computation of higher-order terms of Taylor expansions, some solutions seem to exist. For instance, errors can be represented as zonotopes (see [34]). Zonotopes are known to be resource-demanding and challenging from an algorithmic and implementation standpoint, yet on average they are much more precise than intervals, as they represent linear dependencies exactly. Using zonotopes would amount to replacing the $v \pm \varepsilon(A)$ formulation with a more precise one, by making explicit the linear part of the dependency of $\varepsilon$ on the problem variables $X_i$:
\[
v \pm \varepsilon(A) \quad\text{becomes}\quad v + \sum_{i\in[0,N-1]} X_i\,\kappa_i(A) + \rho(A)
\]
Here, the functions $\kappa_i$ are not necessarily error functions anymore, i.e. $\kappa_i(0)$ may be different from 0. They form the gradient of the original function $\varepsilon$. The error function $\rho$ captures non-linear dependencies, if any. This decomposition is a zonotope with functions as coefficients. Such zonotopes can be added and multiplied easily. Other elementary operations can also be worked out with classical formulas for computing partial derivatives. A great advantage of zonotopes is that they avoid many cancellation effects that have already proved to be a nuisance in interval analysis. Even more, special extra error terms may be used to handle rounding errors, taking care of floating-point issues in the same move.
We can rephrase the above paragraph as a quite self-referential proposition: replace order-0 Taylor expansions of errors with order-1 Taylor expansions, also known as zonotopes. We can obviously push this idea further and make explicit the order of convergence in the error domain, i.e. use explicit powers of $X_i$ in error functions, which is normally rendered as an "$O(X_i^k)$" notation in classical mathematical textbooks. Indeed, nothing prevents us from computing order-k Taylor expansions of errors for any k, but the fear of a program greedily devouring its host computer's resources.
Chapter 8
Taylor Expansions
8.1 Introduction
We recall the canonical presentation of a Taylor expansion at order R in dimension N. This expansion converges to f(x) when R → +∞ for an analytic function f, and only in a chosen neighbourhood of point 0. The good behaviour of our application is highly dependent on this hypothesis. In case it doesn’t hold, several phenomena may occur: a “NaN” may be raised during numerical computations, or the error may diverge, i.e. the Taylor remainder may grow without bound when R → +∞. We assume the user will provide us with a function f that is analytic on some neighbourhood of interest around point 0. We don’t try to check the validity of this hypothesis.
$$ f(x) \;=\; \sum_{|\alpha| < R} D^{\alpha} f(0)\,\frac{x^{\alpha}}{\alpha!} \;+\; \sum_{|\alpha| = R} D^{\alpha} f(\lambda * x)\,\frac{x^{\alpha}}{\alpha!} $$
In the above formulation, x = (x0 , . . . , xN −1 ) ∈ RN is a vector, α =
(α0 , . . . , αN −1 ) ∈ NN is a multi-index and λ ∈ [0, 1] is an unknown coefficient
that characterizes the exact Taylor remainder.
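As a small illustration (added here for concreteness), take N = 2 and R = 2: the multi-indices with |α| < 2 are (0,0), (1,0) and (0,1), so the expansion reads
$$ f(x_0, x_1) \;=\; f(0) + \frac{\partial f}{\partial x_0}(0)\,x_0 + \frac{\partial f}{\partial x_1}(0)\,x_1 + \sum_{|\alpha| = 2} D^{\alpha} f(\lambda * x)\,\frac{x^{\alpha}}{\alpha!}, $$
the remainder multi-indices with |α| = 2 being (2,0), (1,1) and (0,2).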
At the level of Taylor series, we don’t need variables xi to appear explicitly, i.e. we can handle tensors as elements of a black-box algebra, regardless
of their interpretation as homogeneous polynomials in variables xi . Therefore, Taylor series may and will be represented as formal power series in one
formal variable Z, where coefficients are elements of our graded tensor algebra. These tensors themselves are built on top of the error-model algebra of the previous chapter (Chapter 7). Finally, the abbreviated formulation of Taylor series is:
$$ f_Z(Z) \;\triangleq\; \sum_{r \in \mathbb{N}} T_r\,.\,Z^{r}
\qquad\text{where}\qquad
T_r(i_0, \ldots, i_{r-1}) \;\triangleq\; \sum_{|\alpha| = r} \left( \frac{D^{\alpha} f(0)}{\alpha!},\ \frac{D^{\alpha} f(\lambda * x)}{\alpha!} \right) .\, x^{\alpha} $$
The coefficient extractor and the suffix extractor of a Taylor series f are then defined as:
$$ [Z^{m}](f) \;=\; [Z^{m}]\Big(\sum_{r \in \mathbb{N}} T_r\,.\,Z^{r}\Big) \;\triangleq\; \frac{1}{m!}\,\frac{d^{m} f}{dZ^{m}}(Z = 0) \;=\; T_m $$
$$ [Z^{\geq m}](f) \;=\; \sum_{r \geq m} [Z^{r}](f)\,.\,Z^{r} $$
We also specify the shift operator [Z+](f), which is equivalent to [Z^{≥r+1}](f) up to right composition with [Z^{≥r}](f), i.e.:
$$ [Z^{+}]\,[Z^{\geq r}](f) \;=\; [Z^{\geq r+1}]\,[Z^{\geq r}](f) \;=\; [Z^{\geq r+1}](f) $$
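For instance (a small example added for illustration), if f = T0 + T1.Z + T2.Z² + ⋯, then
$$ [Z^{1}](f) = T_1, \qquad [Z^{\geq 1}](f) = T_1.Z + T_2.Z^{2} + \cdots, \qquad [Z^{+}]\,[Z^{\geq 0}](f) = [Z^{\geq 1}](f). $$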
8.2 Data Structure
The above formulation as a power series in the variable Z lets us encode any Taylor expansion as a (lazy) stream of tensors of increasing orders. The laziness specifically allows the expansion order to be set and increased dynamically.
The following functor instantiates our Taylor series as type 'r g_t, where 'r denotes a current (and increasing) expansion order. From a user’s perspective, the main type is then t, the type of Taylor series developed from order 0 onwards. Here, type 'r V.t accounts for some 'r-graded algebra, typically our value-error tensor algebra (with a fixed dimension).
For optimization purposes, lazy co-recursion may be stopped with the
Nil constructor, representing the Taylor series of the zero function.
module TaylorType (V : sig type 'r t end) =
struct
  type 'r opt =
    | Nil  of 'r Nat.isnat
    | Cons of ('r V.t * 'r Nat.succ g_t)
  and 'r g_t = T of 'r opt Lazy.t

  type t = Nat.zero g_t
end
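As a usage sketch, added here for illustration (it assumes the development’s Nat module, with constructors Nat.Z and Nat.S, and is not part of the original code), the functor can be instantiated with a trivial algebra in which every coefficient is a plain float; the constant series c + 0.Z + 0.Z² + ⋯ is then a single Cons followed by Nil:

(* Illustration only: coefficients of any order are plain floats. *)
module F = TaylorType (struct type 'r t = float end)

(* The constant series c + 0.Z + 0.Z^2 + ... : one coefficient, then Nil. *)
let constant (c : float) : F.t =
  F.T (lazy (F.Cons (c, F.T (lazy (F.Nil (Nat.S Nat.Z))))))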
8.3 Causality
Every function operating on Taylor series, such as algebraic operations, differentiation and so on, should respect causality, as if Taylor series were interpreted in a flow semantics. This is indeed a natural restriction, as it amounts to checking that, for any op, (f op g)(0) only depends on f(0) and g(0). This condition carries over to any derivation order: $\frac{\partial^{\alpha}(f\ \mathrm{op}\ g)}{\partial x^{\alpha}}(0)$ only depends on $\frac{\partial^{\beta} f}{\partial x^{\beta}}(0)$ and $\frac{\partial^{\gamma} g}{\partial x^{\gamma}}(0)$ for β, γ ≤ α.
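For instance (an illustration added here), addition is causal:
$$ [Z^{r}](f + g) \;=\; [Z^{r}](f) + [Z^{r}](g), $$
so the order-r coefficient of the sum only depends on the order-r coefficients of its arguments. By contrast, an operation that needs the order r + 1 coefficient of its argument to produce its order r result, such as the error refinement of section 8.10, is not causal.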
Obviously, non-causal operations may provide more precise Taylor models, as they anticipate higher-order value-error tensors of their arguments to compute a smaller error at the current order. But they suffer from a fundamental flaw: such operations cannot be used to define Taylor series co-recursively, because they are not productive. This strictly forbids specifying solutions to ordinary/partial differential equations co-recursively, which is one of our not-so-far-fetched goals. In a sense, operations on Taylor series must be synchronous. Moreover, the better models that non-causal operations provide can also be obtained with supplementary layers of error refinement.
8.4 Taylor Model
Starting from a Taylor series $s = \sum_{r \in \mathbb{N}} T_r\,.\,Z^{r}$, we define our Taylor model at order R in an ε-neighbourhood of point 0, where $\epsilon \in \mathbb{R}_{+}^{N}$. We assume ≤ (respectively | |) is the pointwise ordering (respectively absolute value) when applied to vectors in $\mathbb{R}^{N}$. Also, Tr(0) denotes the value part of the value-error tensor Tr, whereas Tr(ε) denotes its error part, applied to ε.
For any function $f \in \mathbb{R}^{N} \to \mathbb{R}$, a Taylor model predicate TM(f, R, ε) of s is defined as follows:
$$ TM(f, R, \epsilon) \;\triangleq\; \forall x \in \mathbb{R}^{N}.\ |x| \leq \epsilon \implies \Big| f(x) - \sum_{r=0}^{R} T_r(0)\,x^{r} \Big| \;\leq\; \Sigma_0\big(T_R(\epsilon)\,|x|^{R}\big) $$
A Taylor model for parameters R and ε is then the set of functions f such that TM(f, R, ε) holds true.
We claim our Taylor models provide valuable information in a user-friendly mode of interaction. Let us suppose we have to study a complex arithmetical expression e depending on several variables and determine tight enclosing bounds from these variables’ domain of values D. A typical workflow may be as follows: once a Taylor expansion of e is computed at some arbitrary order R, we appropriately choose ε to cover the whole of D. In case the resulting Taylor model is not tight enough, we may keep on computing orders R + 1, R + 2, etc., until the required precision is met. This will only succeed if the expression is analytic (let alone defined) in the ε-neighbourhood. Later on, if we want to reuse our Taylor model when variables take values in an even greater domain D′ ⊇ D, we only need to recompute the new enclosing bound with an appropriate ε′.
Sketchily, this is particularly suited to computing abstract collecting semantics of programs, in an abstract interpretation framework, see [21].
8.5 Convolution
The product of Taylor expansions is pervasive and appears in many operations (derivation formulas, composition of Taylor series, etc.). It is naturally defined with an explicit convolution. Concretely:
$$ \Big(\sum_{r \in \mathbb{N}} T_r\,.\,Z^{r}\Big) \times \Big(\sum_{r \in \mathbb{N}} S_r\,.\,Z^{r}\Big) \;=\; \sum_{r \in \mathbb{N}} \Big(\sum_{i \in \mathbb{N}} T_i \times S_{r-i}\Big) Z^{r} $$
In our setting, we maintain a strongly typed convolution structure to express the computation of the term $\sum_{i \in \mathbb{N}} T_i \times S_{r-i}$. This structure, while geared towards strong static guarantees and proof of correctness, still allows for an efficient implementation.
— Initially, the r = 0-order structure contains two 0-order tensors T0 and
S0 .
— Then, to compute the r + 1-iterate, we add Tr+1 and Sr+1 to the
r-order iterate.
— The r-order tensor of the product is then a folding of the r-iterate of
this convolution structure.
Indeed, the previous description must be generalized in order to prove
type safety. Informally, we may specify our structure as an array containing
tensors of appropriate orders, parameterized by four values l1 , l2 , r1 and r2 ,
although our own usage always specializes this structure with l1 = r2 = 0:
$$ \begin{array}{cccc} T_{l_1} & T_{l_1+1} & \cdots & T_{r_1} \\ S_{l_2} & S_{l_2-1} & \cdots & S_{r_2} \end{array} $$
Folding this structure to compute the terms of a product simply consists in multiplying tensors column-wise ($T_{l_1} \times S_{l_2}$, etc.) and then summing all these intermediate results together.
In each column, the sum of the orders from both lines is equal to some constant r, the current order of the convolution structure. This is easily guaranteed and proved, since orders increase along the first line while they decrease along the second line.
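As a worked instance (added here for illustration), with l1 = r2 = 0 the 2-iterate holds the columns (T0, S2), (T1, S1) and (T2, S0), and folding it yields
$$ T_0 \times S_2 + T_1 \times S_1 + T_2 \times S_0 \;=\; [Z^{2}]\Big(\big(\textstyle\sum_r T_r.Z^{r}\big) \times \big(\textstyle\sum_r S_r.Z^{r}\big)\Big). $$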
The initial value of this “array” is a single column containing T0 and S0 .
Schematically, the r + 1-iterate is obtained from the r-iterate through the following line-shifting operation, the r-iterate contributing every entry except the new $T_{r_1+1}$ and $S_{l_2+1}$:
$$ \begin{array}{ccccc} T_{l_1} & T_{l_1+1} & \cdots & T_{r_1} & T_{r_1+1} \\ S_{l_2+1} & S_{l_2} & S_{l_2-1} & \cdots & S_{r_2} \end{array} $$
Therefore, the r-iterate contains exactly r + 1 columns. As a data structure, convolution yields the following types: (’r1,’r2,’r) column which
represents a column of elements from a graded group G.t (usually tensors)
such that r1 + r2 = r and (’l1, ’l2, ’r1, ’r2, ’r) t which represents
a convolution “array”. The various occurrences of option type are used to
distinguish between 0 tensors (constructor None) and non-0 tensors and further simplify and optimize computations. Constructor One denotes the initial
one-column array, whereas More denotes the r + 1-iterate:
type ('r1, 'r2, 'r) column =
  | Item: ('r1, 'r2, 'r) Nat.add
        * 'r1 G.t option
        * 'r2 G.t option
       -> ('r1, 'r2, 'r) column

type (_, _, 'r1, 'r2, 'r) t =
  | One : ('r1, 'r2, 'r) column
       -> ('r1, 'r2, 'r1, 'r2, 'r) t
  | More: ('l1, 'l2 Nat.succ, 'r) column
        * ('l1 Nat.succ, 'l2, 'r1, 'r2, 'r) t
       -> ('l1, 'l2 Nat.succ, 'r1, 'r2, 'r) t
Supported operations, i.e. initialization, line-shifting and folding, are
respectively implemented by functions one, more and product_update. For
the sake of efficiency, this latter function doesn’t compute partial sums of
tensor products as independent tensors that would require a lot of transient
memory allocations, but rather accumulates these products in a unique tensor (of type G.t_u) made of updatable values, initially filled with zeroes.
This is performed by the auxiliary impure functions prod_item_update and
prod_clist_update:
let one : ('r1, 'r2, 'r) Nat.add -> 'r1 G.t option -> 'r2 G.t option
    -> ('r1, 'r2, 'r1, 'r2, 'r) t =
  fun pr st1 st2 -> One (Item (pr, st1, st2))

let rec more : type l1 l2 r1 r2 r.
    (l1, l2, r1, r2, r) t -> r1 Nat.succ G.t option -> l2 Nat.succ G.t option
    -> (l1, l2 Nat.succ, r1 Nat.succ, r2, r Nat.succ) t =
  fun conv ->
    match conv with
    | One (Item (pr, st_r1, st_l2)) ->
        (* last column: its T keeps the incoming S element, and a fresh
           rightmost column receives the new T paired with the displaced S *)
        fun st_r1' st_l2' ->
          More (Item (Nat.addS_right pr, st_r1, st_l2'),
                One (Item (Nat.Sadd pr, st_r1', st_l2)))
    | More (Item (pr, st_l1, st_l2), q) ->
        (* current column keeps its T, receives the incoming S element,
           and its displaced S is pushed to the next column *)
        fun st_r1' st_l2' ->
          More (Item (Nat.addS_right pr, st_l1, st_l2'),
                more q st_r1' st_l2)
let prod_item_update : ('r1, 'r2, 'r) column -> 'r G.t_u -> bool ref -> unit =
  fun (Item (pr, st_r1_opt, st_r2_opt)) st nope ->
    match st_r1_opt, st_r2_opt with
    | Some st_r1, Some st_r2 ->
        begin
          G.prod_u pr st_r1 st_r2 st;
          nope := false;
        end
    | _ -> ()
let rec prod_clist_update : type l1 l2 r1 r2 r.
(l1, l2, r1, r2, r) t -> r G.t_u -> bool ref -> unit =
fun conv ->
match conv with
| One item
-> fun st nope ->
begin
prod_item_update item st nope;
end
| More (item, q) -> fun st nope ->
begin
prod_item_update item st nope;
prod_clist_update q st nope
end
let product_update : (’l1, ’l2, ’r1, ’r2, ’r) t -> ’r G.t_u
-> ’r G.t option =
fun conv st ->
begin
let nope = ref true in
prod_clist_update conv st nope;
if !nope then None else Some (G.( !: ) st)
end
8.6 Polynomial Operations
Polynomial operations include algebraic operations, built on top of the graded tensor group operations, which we augment with N symbolic polynomial variables. For the record, special attention was necessary to obtain a causal version of division, which is a rather complex operation. It is specified as:
$$ \Big(\sum_{r \in \mathbb{N}} T_r\,.\,Z^{r}\Big) \Big/ \Big(\sum_{r \in \mathbb{N}} S_r\,.\,Z^{r}\Big) \;=\; \sum_{r \in \mathbb{N}} D_r\,.\,Z^{r}
\qquad\text{where}\qquad
D_r \;\triangleq\; \frac{1}{S_0} \times \Big( T_r - \sum_{k=1}^{r} S_k \times D_{r-k} \Big)
\qquad (S_0 \neq 0) $$
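As a sanity check (a worked example added here), divide the constant series 1 by 1 − Z, i.e. T0 = 1, Tr = 0 for r ≥ 1, S0 = 1, S1 = −1 and Sk = 0 for k ≥ 2. The recurrence gives
$$ D_0 = \frac{T_0}{S_0} = 1, \qquad D_1 = \frac{1}{S_0}(T_1 - S_1 D_0) = 1, \qquad D_2 = \frac{1}{S_0}(T_2 - S_1 D_1 - S_2 D_0) = 1, \;\ldots $$
so that 1/(1 − Z) = Σ_r Z^r, as expected.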
In the following piece of code, V stands for the pure value model whereas
VE stands for the value-error model. We use three auxiliary functions, not
detailed here. Functions fzero and fcons respectively encapsulate constructors Nil and Cons, while hdtl destructures a Taylor series f , i.e. computes
the pair (f (Z = 0), [Z + ]f ).
Function of_float injects a floating-point number as a constant Taylor
series. Function var creates a variable, i.e. an order 1 monomial. Its main
value at point 0 is 0 and its error is given by the corresponding error function Err.var. Linear functions such as ( + ), ( - ) and lambda bear their
intended meanings. Multiplicative functions such as ( * ) and ( / ) have a
separate initialization phase to compute 0-order tensors and involve convolution. Multiplication between series of lowest non-zero tensors respectively at
order r1 and r2 yields a series of lowest order r1 + r2 . Division of a series of
lowest order r1 by a series of lowest order 0 ( 1 ) yields a series of lowest order
r1 . These specifications are entirely embedded and proved at the type-level.
As a matter of optimization, multiplication between finite polynomial
Taylor series should yield finite series as well. To handle this case precisely,
multiplication by 0 tensors and convolution with 0 elements are treated as
special optimized cases. Similar optimizations are implemented for division.
let of_float : float -> Nat.zero g_t =
fun f -> if f = 0.
then fzero Nat.Z
else fcons (VE.inject Nat.Z (V.of_float f))
(fzero (Nat.S Nat.Z))
let var : ’k Nat.isnat -> (’d Nat.succ, ’k, N.t) Nat.add -> Nat.zero g_t =
fun k pr ->
fcons (VE.inject Nat.Z (fst zeroV, Err.var k pr))
(fcons (VE.var k pr)
(fzero (Nat.S (Nat.S Nat.Z))))
let lambda : V.t -> ’r g_t -> ’r g_t =
fun k -> if k = zeroV then fun ctm -> fzero (order ctm) else
let rec lambda_rec : type r. r g_t -> r g_t =
fun (T ctm) ->
T (lazy (
match Lazy.force ctm with
| Nil r
-> Nil r
| Cons (ve, q) -> Cons (VE.( *.: ) k ve, lambda_rec q)
))
in fun ctm -> lambda_rec ctm
let rec ( + ) : type r. r g_t -> r g_t -> r g_t =
fun (T ctm1) (T ctm2) ->
T (lazy (
match Lazy.force ctm1, Lazy.force ctm2 with
| ctm1
, Nil _
-> ctm1
| Nil _
, ctm2
-> ctm2
| Cons (ve1, q1), Cons (ve2, q2) -> Cons (VE.( +: ) ve1 ve2,
q1 + q2)
))
1. We need a non-zero constant part as division must be defined in the neighbourhood
of 0.
let ( * ) : (’R1, ’R2, ’R) Nat.add -> ’R1 g_t -> ’R2 g_t -> ’R g_t =
let rec prod_rec : type r1 r2 r. r Nat.isnat -> (r1, ’R2, r) Nat.add
-> (’R1, r2, r1, ’R2, r) Conv.t
-> r1 Nat.succ g_t -> r2 Nat.succ g_t -> r Nat.succ g_t =
fun r pr conv (T ctm1) (T ctm2) ->
T (lazy (
let (ve1_opt, q1) =
match Lazy.force ctm1 with
| Nil r1
-> None, fzero (Nat.S r1)
| Cons (ve1, q1) -> Some ve1, q1 in
let (ve2_opt, q2) =
match Lazy.force ctm2 with
| Nil r2
-> None, fzero (Nat.S r2)
| Cons (ve2, q2) -> Some ve2, q2 in
let str’ = VE.inject_u (Nat.S r) zeroV in
let conv = Conv.more conv ve1_opt ve2_opt in
let str_opt’ = Conv.product_update conv str’ in
match str_opt’ with
| None
-> Nil (Nat.S r)
| Some str’ -> Cons (str’, prod_rec (Nat.S r) (Nat.Sadd pr) conv q1 q2)
))
in fun pr (T ctm1) (T ctm2) ->
T (lazy (
let r = Nat.isnat_add pr (order (T ctm2)) in
let (ve1_opt, q1) =
match Lazy.force ctm1 with
| Nil r1
-> None, fzero (Nat.S r1)
| Cons (ve1, q1) -> Some ve1, q1 in
let (ve2_opt, q2) =
match Lazy.force ctm2 with
| Nil r2
-> None, fzero (Nat.S r2)
| Cons (ve2, q2) -> Some ve2, q2 in
let str = VE.inject_u r zeroV in
let conv = Conv.one pr ve1_opt ve2_opt in
let str_opt = Conv.product_update conv str in
match str_opt with
| None
-> Nil r
| Some str -> Cons (str, prod_rec r pr conv q1 q2)
))
let rec div : type r1 r2. V.t -> r1 Nat.isnat -> (r2, ’R1, r1) Nat.add
-> (Nat.zero Nat.succ, r1, r2 Nat.succ, ’R1, r1 Nat.succ) Conv.t
-> r1 Nat.succ VE.t -> r1 Nat.succ Nat.succ g_t -> r2 Nat.succ Nat.succ g_t
-> r1 Nat.succ Nat.succ g_t =
fun factor r1 pr conv d1 ctm1 (T ctm2) ->
T (lazy (
let (ve1, q1) = hdtl ctm1 in
let (ve2_opt, q2) =
match Lazy.force ctm2 with
| Nil r2
-> None, fzero (Nat.S r2)
| Cons (ve2, q2) -> Some ve2, q2 in
let conv = Conv.more conv ve2_opt (Some d1) in
let str1’ = VE.inject_u Nat.(S (S r1)) zeroV in
let str1’ =
match Conv.product_update conv str1’ with
| None
-> VE.( !: ) str1’
| Some str1’ -> str1’ in
let d1’ = VE.(factor *.: (str1’ -: ve1)) in
Cons (d1’, div factor (Nat.S r1) (Nat.Sadd pr) conv d1’ q1 q2)
))
let ( / ) : ’R1 g_t -> Nat.zero g_t -> ’R1 g_t =
fun (T ctm1) (T ctm2) ->
T (lazy (
let (_R1, ve1_opt0, q1) =
match Lazy.force ctm1 with
| Nil r1
-> r1, None, fzero (Nat.S r1)
| Cons (ve1_0, q1) -> VE.order ve1_0, Some ve1_0, q1 in
let (inv_v2_0, q2) =
match Lazy.force ctm2 with
| Nil _
-> failwith "division␣by␣zero"
| Cons (VE.ST.Leaf v2_0, q2) -> V.(one / v2_0), q2 in
let d_0, d_opt0 = match ve1_opt0 with
| None
-> VE.inject (order (T ctm1)) zeroV, None
| Some ve1_0 -> let ve_0 = VE.ST.lambda inv_v2_0 ve1_0 in
ve_0, Some ve_0 in
Cons (d_0,
T (lazy (
match Lazy.force (unT q2) with
| Nil r2
-> Lazy.force (unT (lambda inv_v2_0 q1))
| Cons (ve2_1, q2) ->
let (ve2_opt1, qq2) = Some ve2_1, q2 in
let conv = Conv.one Nat.(Sadd Zadd) ve2_opt1 d_opt0 in
let str1 = VE.inject_u (Nat.S _R1) zeroV in
let str1 =
match Conv.product_update conv str1 with
| None
-> VE.( !: ) str1
| Some str1 -> str1 in
match hdtl q1 with
| (ve_1, qq1) ->
let factor = V.(minusoneV * inv_v2_0) in
let d_1 = VE.(factor *.: (str1 -: ve_1)) in
Cons (d_1, div factor _R1 Nat.Zadd conv d_1 qq1 qq2)
)))))
8.7 Differential Operations
Partial differential as well as integral operators are supported, with respect to any variable xi ,i = 0, . . . , N − 1. Differentiation amounts to pointwise tensor differentiation of every element of a Taylor series, minus the
constant part (i.e. 0-order tensor). It is implemented as function pdiff,
which in turn calls the recursive function proj which differentiates every
tensor of the series and decrements the order of the series by 1.
Integration is implemented as function pinteg, which in turn calls the
recursive function inj which integrates every tensor of the series and increments the order of the series by 1. Yet, a 0-order tensor must be provided.
Its main value is obviously 0 but a sensible error must be provided. We
choose to simply integrate the 0-order error of the argument through function absorb_value. Nevertheless, we stress the fact that this definition is
not strictly productive, as the 0-order result is produced from the 0-order argument.
As such, it cannot be used to specify differential equations with a recursive
so-called solved form, as it would only create a diverging computation (in
fact, thanks to laziness in OCaml, this problem would be spotted and an
exception would be thrown). A simpler version of pinteg which doesn’t
compute 0-order errors is needed in this case and computation of sensible
errors is recovered through a general fixed point formulation, following the
Picard-Lindelöf theorem.
let rec proj : type r. r Nat.isnat -> ’k Nat.isnat
-> (’d Nat.succ, ’k, N.t) Nat.add -> r Nat.succ g_t -> r g_t =
fun r k pr (T ctm) ->
T (lazy (
match Lazy.force ctm with
| Nil _
-> Nil r
| Cons (ve, q) -> Cons (VE.proj r k pr ve, proj (Nat.S r) k pr q)
))
let pdiff : ’k Nat.isnat -> (’d Nat.succ, ’k, N.t) Nat.add -> ’r g_t -> ’r g_t =
fun k pr (T ctm) ->
T (lazy (
match Lazy.force ctm with
| Nil r
-> Nil r
| Cons (ve, q) -> Lazy.force (unT (proj (VE.order ve) k pr q))
))
let rec inj : type r. r Nat.isnat -> ’k Nat.isnat
-> (’d Nat.succ, ’k, N.t) Nat.add -> r g_t -> r Nat.succ g_t =
fun r k pr (T ctm) ->
T (lazy (
match Lazy.force ctm with
| Nil _
-> Nil (Nat.S r)
| Cons (ve, q) -> Cons (VE.inj r k pr ve, inj (Nat.S r) k pr q)
))
let pinteg : ’k Nat.isnat -> (’d Nat.succ, ’k, N.t) Nat.add
-> Nat.zero g_t -> Nat.zero g_t =
fun k pr ->
let var_k = Err.var k pr in
fun ctm ->
T (lazy (
let ve = VE.ST.map (V.absorb_value var_k) (fst (hdtl ctm))
in Cons (ve, inj Nat.Z k pr ctm)
))
8.8 Taylor Expansions of Elementary Functions
Elementary functions, limited to one-argument functions, are specified as 1-dimensional Taylor series. Therefore, as all tensors contain one coefficient only, 1D series are treated separately. This is only a matter of efficiency and obviously not mandatory. To obtain a Taylor expansion of an elementary function, we need to be able to compute any n-th derivative. Taylor series for elementary functions are well known, but not with certified errors; therefore we need to compute them in the value-error domain. We present the r-th derivative of a function f in the ε-neighbourhood of point 0 as D^r_f(ε), which yields a value of the derivative of f at point 0 and an error, a function of ε. The error accounts for the evaluation of the derivative in the ε-neighbourhood and not exactly at point 0, as occurs in the Taylor remainder.
Some arbitrary derivatives of standard elementary functions are summarized in the following enumeration. Similar formulations are available for elementary functions not presented here. One should bear in mind that such derivatives may be hard to obtain in a suitable closed form. For instance, as far as we know, the derivatives of atan don’t enjoy a closed form and are presented here in a recursive way. Incidentally, we use the function log(1 + ·) instead of mere log, owing to the more practical side of zero-centered functions.
$$\begin{aligned}
D^{r}_{\exp}(\epsilon) &= \exp(\epsilon) \\
D^{r+1}_{\log(1+\cdot)}(\epsilon) &= -(r+1)!\ \Big(\frac{-1}{1+\epsilon}\Big)^{r+1} \\
D^{2r}_{\sin}(\epsilon) &= (-1)^{r} \sin(\epsilon) \\
D^{2r+1}_{\sin}(\epsilon) &= (-1)^{r} \cos(\epsilon) \\
D^{r+2}_{\mathrm{atan}}(\epsilon) &= \big(-2(r+1)\,\epsilon\,D^{r+1}_{\mathrm{atan}}(\epsilon) - r(r+1)\,D^{r}_{\mathrm{atan}}(\epsilon)\big)\,/\,(1+\epsilon^{2})
\end{aligned}$$
The data structure is best represented by the following type ’r coefficients
standing for an infinite sequence of 1D derivatives. Functions exp_coeffs,
log1p_coeffs and trigo_coeffs respectively stand for exp, log(1 + ·) and
trigonometric functions. Function atan is not implemented yet.
type ’r coefficients =
| Next of (VE.t * ’r Nat.succ coefficients) Lazy.t
let zeroR = VE.of_float 0.
let oneR = VE.of_float 1.
let minusoneR = VE.of_float (-1.)
let err = fst zeroR, VE.Err.var Nat.Z (Nat.Sadd Nat.Zadd)
let errp1 = V.(oneR + err)
let sin_err = VE.sin err
let cos_err = VE.cos err
let minus_sin_err = VE.(minusoneR * sin_err)
let minus_cos_err = VE.(minusoneR * cos_err)
let exp_coeffs : Nat.zero coefficients =
let coeff_0 = VE.exp err in
let rec coeffs : type r. VE.t -> float -> r Nat.succ Nat.isnat
-> r Nat.succ coefficients =
fun inv_fact_Pn’ n’ n ->
let inv_fact_n’ = VE.(inv_fact_Pn’ / (of_float n’))
in Next (lazy (inv_fact_n’, coeffs inv_fact_n’ (n’ +. 1.) (Nat.S n)))
in Next (lazy (coeff_0, coeffs coeff_0 1. (Nat.S Nat.Z)))
let log1p_coeffs : Nat.zero coefficients =
let coeff_0 = VE.log errp1 in
let rec coeffs : type r. VE.t -> bool -> float -> r Nat.succ Nat.isnat
-> r Nat.succ coefficients =
fun coeff_n sgn n’ n ->
let inv_nV = VE.(oneR / (coeff_n * (of_float (if sgn then -. n’
else n’))))
in Next (lazy (inv_nV,
coeffs VE.(coeff_n * errp1) (not sgn) (n’ +. 1.) (Nat.S n)))
in Next (lazy (coeff_0, coeffs errp1 false 1. (Nat.S Nat.Z)))
let trigo_coeffs : int -> Nat.zero coefficients =
let f_n i =
match i with
| 0 -> sin_err
| 1 -> cos_err
| 2 -> minus_sin_err
| 3 -> minus_cos_err
| _ -> assert false in
fun i ->
let rec coeffs : type r. int -> V.t -> float -> r Nat.isnat
-> r coefficients =
fun i inv_fact_Pn’ n’ n ->
let inv_fact_n’ = VE.(inv_fact_Pn’ / (of_float (n’ +. 1.)))
in Next (lazy (VE.(inv_fact_Pn’ * (f_n i)),
coeffs ((i+1) mod 4) inv_fact_n’ (n’ +. 1.) (Nat.S n)))
in coeffs i oneR 0. Nat.Z
let sin_coeffs = trigo_coeffs 0
8.9 Composition of Taylor Expansions
We only need to apply elementary functions to arbitrary arguments, i.e.
to compose 1D Taylor series with multidimensional ones. The general composition of multidimensional Taylor series is also possible in our setting but
out of the scope of our current concerns. As in formal power series, composition (f ◦ g) may be achieved only when g has no constant part. To factorize
out the constant part of g (so that we fall back to evaluation at point 0),
we depend on an additive decomposition of f , an elementary function, when
available.
Again, we sum up some decompositions of standard elementary functions:
$$\begin{aligned}
\exp(x_0 + x'_0) &= \exp(x_0)\,\exp(x'_0) \\
\log(x_0 + x'_0) &= \log(x_0) + \log\Big(1 + \frac{x'_0}{x_0}\Big) \\
\sin(x_0 + x'_0) &= \sin(x_0)\cos(x'_0) + \cos(x_0)\sin(x'_0) \\
\mathrm{atan}(x_0 + x'_0) &= \mathrm{atan}(x_0) + \mathrm{atan}\Big(\frac{x'_0}{1 + x_0\,(x_0 + x'_0)}\Big)
\end{aligned}$$
Then, we propose a general composition mechanism to compute (f ◦ g), where f is 1D and g(0) = 0. Mathematically speaking, it boils down to replacing the variable Z by g in the expansion of f:
$$ f \circ g \;\triangleq\; \sum_{r \in \mathbb{N}} f_r\,.\,g^{r} \;=\; \sum_{r \in \mathbb{N}} T_r\,.\,Z^{r} $$
Still, the tensors Tr must be extracted from this infinite sum to build the resulting Taylor series. Since g(0) = 0, the lowest non-zero term of g is at least of order 1 and g^r is at least of order r. Therefore, we can characterize Tr as a finite sum:
$$ T_r \;=\; \sum_{k=0}^{r} [Z^{r}]\big(f_k\,.\,g^{k}\big) $$
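As a small worked example (added here for illustration), take f with coefficients fk = 1/k! (the exp series) and g = Z + Z²; then
$$ T_0 = f_0 = 1, \qquad T_1 = f_1\,[Z^{1}]g = 1, \qquad T_2 = f_1\,[Z^{2}]g + f_2\,[Z^{2}]g^{2} = 1 + \tfrac{1}{2} = \tfrac{3}{2}, $$
which indeed matches exp(Z + Z²) = 1 + Z + (3/2)Z² + O(Z³).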
First, we propose a recursive formulation for the value part Tr(0):
$$\begin{aligned}
\mathit{pow}_0 &= 1, & \mathit{pow}_{r+1} &= g \times \mathit{pow}_r \\
S_0 &= 0, & S_{r+1} &= [Z^{\geq r+1}]\sum_{k=0}^{r} f_k\,.\,g^{k} \;=\; [Z^{+}](S_r + f_r\,.\,\mathit{pow}_r) \\
T_0(0) &= f_0(0), & T_{r+1}(0) &= [Z^{r+1}]\,S_{r+1}(0)
\end{aligned}$$
Second, in order to define the error part Tr(ε), we compose the error part of f with some Taylor model of g. We choose to employ the Taylor model of g at order r, which acts as a zero-centered (error) function. Any other approximation of g, at any order, would also be possible. We obtain the following recursive error scheme:
$$\begin{aligned}
\mathit{Poly}_0(\epsilon) &= g_0(0) = 0, & \mathit{Poly}_{r+1}(\epsilon) &= \mathit{Poly}_r(\epsilon) + \Sigma\big(g_{r+1}(0)\,\epsilon^{r+1}\big) \\
\mathit{err\_g}_r(\epsilon) &= \Sigma\big(|g_r(0)|\,\epsilon + g_r(\epsilon)\big) \\
T_0(\epsilon) &= f_0\big(\mathit{Poly}_0(\epsilon) + g_0(\epsilon)\big) = f_0(g_0(\epsilon)), & T_{r+1}(\epsilon) &= f_{r+1}\big(\mathit{Poly}_r(\epsilon) + \mathit{err\_g}_{r+1}(\epsilon)\big)
\end{aligned}$$
These recursive schemes are implemented by the (recursive) functions
series and series_rec. The module Series stands for the specific functionalities of 1D Taylor series. The correspondence between function arguments and their mathematical counterparts is detailed in the following
table:
f_0            ↔  [Z^{≥0}](f) = f
err_g_0        ↔  err_g_0
g_1            ↔  [Z^{≥1}](g) = g
comp_infeq_r   ↔  S_{r+1}
poly_g_infeq_r ↔  Poly_r
f_Sr           ↔  [Z^{≥r+1}](f)
g_Sr           ↔  [Z^{≥r+1}](g)
g_1_power_Sr   ↔  pow_{r+1}
fg0            ↔  T_0
f_1            ↔  [Z^{≥1}](f)
let series : Nat.zero Series.coefficients -> Err.t
-> Nat.zero Nat.succ g_t -> Nat.zero g_t =
fun (Series.Next f_0) err_g_0 g_1 ->
let rec series_rec : type r. r Nat.isnat -> r Nat.succ g_t -> Err.t
-> r Nat.succ Series.coefficients
-> r Nat.succ g_t -> r Nat.succ g_t -> r Nat.succ g_t =
fun r comp_infeq_r poly_g_infeq_r (Series.Next f_Sr) g_Sr g_1_power_Sr ->
T (lazy (
match Lazy.force f_Sr, hdtl comp_infeq_r,
hdtl g_Sr
, hdtl g_1_power_Sr with
| (fSr, f_SSr), (comp_infeqSr, comp_infeq_Sr),
(gSr, g_SSr), (g_1_powerSr, g_1_power_Sr’) ->
let gSr_as_err =
match VE.k_reduction_as_error Nat.Zadd gSr with
| ST.Leaf (_, err) -> err in
let fSr_err = Err.(memo (abs poly_g_infeq_r <+> gSr_as_err)) in
let fSr_comp = VE.compose fSr fSr_err in
let fgSr = VE.(comp_infeqSr +: (fSr_comp *.: g_1_powerSr)) in
let comp_infeq_Sr = comp_infeq_Sr + lambda fSr_comp g_1_power_Sr’ in
let g_1_power_SSr = product Nat.(Sadd Zadd) g_1 g_1_power_Sr in
let gSr_as_poly =
match VE.k_reduction_as_error ~signed:true ~with_error:false
Nat.Zadd gSr with
| ST.Leaf (_, err) -> err in
let poly_g_infeq_Sr = Err.(memo (poly_g_infeq_r <+> gSr_as_poly)) in
Cons (fgSr,
series_rec Nat.(S r) comp_infeq_Sr poly_g_infeq_Sr
f_SSr g_SSr g_1_power_SSr)
))
in T (lazy (
match Lazy.force f_0 with
| (f0, f_1) ->
let fg0 = VE.compose f0 err_g_0 in
Cons (VE.inject Nat.Z fg0, series_rec Nat.Z (fzero Nat.(S Z)) Err.nul
f_1 g_1 g_1)
))
Finally, we apply the composition primitive series to the aforementioned
elementary functions. We first get rid of the constant part of arguments using
additive decompositions.
let exp : Nat.zero g_t -> Nat.zero g_t =
fun (T ctm) ->
T (lazy (
match Lazy.force ctm with
| Nil r
->
Lazy.force (unT oneT)
| Cons (ve_0, ctm’) ->
let ve_0 = ST.get Indices.Inil ve_0 in
let exp_ve_0 = VE.exp (VE.cancel_error ve_0) in
let err_ve_0 = VE.get_absolute_error ve_0 in
Lazy.force (unT (lambda exp_ve_0 (series Series.exp_coeffs
err_ve_0 ctm’)))
))
let log : Nat.zero g_t -> Nat.zero g_t =
fun (T ctm) ->
T (lazy (
match Lazy.force ctm with
| Nil r
->
failwith "log:␣argument␣has␣zero␣constant␣part␣!"
| Cons (ve_0, ctm’) ->
let ve_0 = ST.get Indices.Inil ve_0 in
let ve_0_wo_err = VE.cancel_error ve_0 in
let ve_0_rel_err = VE.get_relative_error ve_0 in
let ctm_log_ve_0 = cons (ST.inject Nat.Z (VE.log ve_0_wo_err))
(fzero (Nat.S Nat.Z)) in
Lazy.force (unT (ctm_log_ve_0 + (series (Series.log1p_coeffs)
ve_0_rel_err
(lambda VE.(oneR / ve_0_wo_err)
ctm’))))
))
let sin : Nat.zero g_t -> Nat.zero g_t =
fun (T ctm) ->
T (lazy (
match Lazy.force ctm with
| Nil r
-> Nil r
| Cons (ve_0, ctm’) ->
let ve_0 = ST.get Indices.Inil ve_0 in
let ve_0_wo_err = VE.cancel_error ve_0 in
let ve_0_err = VE.get_absolute_error ve_0 in
let sin_ve_0 = VE.sin ve_0_wo_err in
let cos_ve_0 = VE.cos ve_0_wo_err in
let ctm_sin’ = series Series.sin_coeffs ve_0_err ctm’ in
let ctm_cos’ = series Series.cos_coeffs ve_0_err ctm’ in
Lazy.force (unT (lambda sin_ve_0 ctm_cos’ + lambda cos_ve_0 ctm_sin’))
))
8.10 Error Refinement
Error refinement in a Taylor series simply consists in replacing the error
part of the tensor Tr at order r by the reduction of the tensor at order
r + 1: ΣX Tr+1 , which gives a pure error tensor (see 7.5.2). We assume that
the Taylor series being currently refined converges towards some function,
so that the global error decreases when taken at order r + 1 instead of r.
Iterative refinement is possible to obtain tighter error bounds. Refinement is
not productive/causal, as it clearly produces order-r results from arguments of order r + 1 and above. Thus, refinement cannot be mixed with differential
equation solving.
We give an excerpt of a functor that refines every operation of its Taylor
algebra module argument, yielding another refined Taylor algebra. Module
TA denotes the original Taylor algebra. The function refine_err1 implements the above tensor refinement scheme. The function refine1 applies
one level of refinement, whereas refine recursively applies refine1 any given
number of times, specified by a module-level integer P. Finally, we illustrate
two operations. Assuming arguments ctm1 and ctm2 are refined, multiplication may need refinement but we know for sure that addition doesn’t.
let refine_err1 : ’r0 Nat.isnat -> ’r0 Nat.succ VE.t -> ’r0 VE.t -> ’r0 VE.t =
fun r0 st1 ->
(* pr : r0 + (1 + 0) = 1 + r0 *)
let pr = Nat.addS_right (Nat.addZ_right r0) in
let st0’ = VE.k_reduction pr st1 in
fun st0 -> ST.map2 (fun (v, _) (_, e’) -> (v, e’)) st0 st0’
let refine1 : ’r Nat.isnat -> ’r g_t -> ’r g_t =
let rec loop : type r0. r0 Nat.isnat -> r0 g_t -> r0 g_t =
fun r0 (TA.T ctm0) ->
TA.T (lazy (
match Lazy.force ctm0 with
| TA.Nil r0
-> TA.fzero r0
| TA.Cons (ve0, q0) -> match TA.hdtl q0 with
| (ve1, _) -> TA.fcons (refine_err1 r0 ve1 ve0) (loop (Nat.S r0) q0)
))
in fun r0 ctm0 -> loop r0 ctm0
let refine : type r. r g_t -> r g_t =
let rec loop : type p r0. p Nat.isnat -> r0 Nat.isnat -> r0 g_t -> r0 g_t =
fun p r (TA.T ctm) ->
match p with
| Nat.Z
-> TA.T ctm
| Nat.S p’ -> match Lazy.force ctm with
| TA.Nil r
-> TA.fzero r
| TA.Cons (ve, q) -> refine1 r (TA.fcons ve (loop p’ (Nat.S r) q))
in fun ctm0 -> loop P.isnat (TA.order ctm0) ctm0
let ( + ) ctm1 ctm2 = TA.(ctm1 + ctm2)
let ( * ) ctm1 ctm2 = refine (TA.( * ) ctm1 ctm2)
Last but not least, a change of tensor basis may help in obtaining tighter error bounds. Assuming a Taylor expansion unfolded up to order r, we may imagine a heuristic that:
1. changes the tensor basis (currently, only rotation is supported) and tries different mixes of more precise dimensions with less precise ones;
2. keeps the best basis change, based on some criterion over the collection of error functions of the order-r tensor, such as the max-norm ‖ ‖∞ or the abs-norm ‖ ‖1.
Unfortunately, this has not been fully worked out nor implemented yet.
8.11 Perspectives
8.11.1 Solving (Partial) Differential Equations
Some conditions were already stated to obtain differential equation solving for free from our Taylor algebra. Namely, we need a productive solved-form differential equation.
We present a differential equation in solved-form:
f (x) = expr(x, f (x))
Here, expr is an arbitrary expression containing elementary functions, as
well as integral and differential operators. It must also be productive, i.e.
in each branch from the root of expr to the variable f, there are always strictly more integral operators than differential ones. No extra initial conditions
are necessary. This definition naturally extends to systems of differential
equations. For the sake of simplicity, the equation also appears homogeneous,
i.e. f is always applied to the same arguments x. This condition may be
relaxed as long as the arguments of f are zero-centered expressions.
An equation in solved-form naturally corresponds to a recursive Taylor
series for f . As an example, let us consider the following differential equation,
the solution of which is exp(x):
$$ f(x) \;=\; 1 + \int_{0}^{x} f(h)\,dh $$
Its translation in our framework yields the following piece of code. Vars.isnat
is 1, both at value and type levels. The first expression builds the whole list
of different usable variables in a 1-dimensional space. The second expression
recursively builds the solution of the differential equation, closely following
its mathematical description. The only difference is the boxing/unboxing of
lazy values, due to OCaml restrictions.
let [x] = Indices.fold Vars.isnat [] (fun idx l -> idx :: l) in
let rec f = TA.(T (lazy (Lazy.force (unT
(of_float 1. + pinteg x f))))) in
f
Yet, although the polynomial part of the Taylor development is correct,
the current state of our development doesn’t handle errors in such a recursive
definition. To overcome this limitation, a general fixed-point mechanism
must be implemented above this definition of f , following the Picard-Lindelöf
theorem. We sketchily state this theorem, in the 1D case. Assuming a
differential equation in f :
$$ f(x) \;=\; f(0) + \int_{0}^{x} \mathit{expr}(h, f(h))\,dh $$
The following sequence of iterates:
$$ \varphi_0(x) = 0, \qquad \varphi_{n+1}(x) \;=\; f(0) + \int_{0}^{x} \mathit{expr}(h, \varphi_n(h))\,dh $$
converges to a solution of the equation.
We propose to adapt this technique to our solved-form productive expression expr. First, we endow integral operators with explicit error parameters (one for each): ε0, . . . , εp. The expression expr being productive, each new iterate φn+1 computes at least one supplementary order of the solution. Suppose we want to compute a Taylor expansion up to order r. Given the lowest iterate φn such that order r appears in its expansion, we may inject φn into the expression to compute φn+1. Then, we need to tweak ε0, . . . , εp in every dimension in order to find the tightest error bounds such that the Taylor model of φn+1 is included in the Taylor model of φn. This kind of algorithm has already been documented and implemented in the COSY tool in the 1D case. Comparison between both Taylor models can be achieved if we reduce the order of φn+1 down to r, through error refinement. Inclusion then holds if the error terms of φn+1 are all smaller than the corresponding error terms of φn.
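The overall driver could look like the following OCaml sketch, added here as an illustration of the intended mechanism and not as an implementation: the helpers unfold_once, refine_to_order and included are hypothetical and respectively stand for one unfolding of the solved-form expression expr, error refinement down to order r as in section 8.10, and order-wise comparison of error terms; the tweaking of ε0, . . . , εp is assumed to happen inside these helpers.

(* Illustration only: schematic Picard-Lindelöf fixed-point driver. *)
let rec picard ~r ~unfold_once ~refine_to_order ~included phi_n =
  let phi_n1 = unfold_once phi_n in                   (* phi_{n+1} = F(phi_n) *)
  if included ~r (refine_to_order ~r phi_n1) phi_n    (* Taylor-model inclusion *)
  then phi_n1
  else picard ~r ~unfold_once ~refine_to_order ~included phi_n1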
8.11.2 Improved Data Structures
Our framework for multidimensional Taylor series is a conceptually simple extension of the 1D case, which has been the subject of numerous studies, see Section 1.4.2. In essence, scalar coefficients must be replaced by symmetric tensors. This allows us to split the design and development into independent modules, with clear algebraic interfaces. Yet, the memory footprint is not optimal because, for instance, tensors at order r + 1 contain as a sub-structure the shape of tensors at order r (and below), and all these common tree prefixes hold no values at all (only leaves hold values in our tensor data structure). This naturally raises the question of whether there exists a better memory scheme which would retain every static guarantee. It happens that the answer is positive, provided a co-recursive tree scheme is adopted for tensors; Taylor series then no longer exist as infinite flows. The most general co-tensor data structure we need is called cost and is defined as follows, in its raw experimental version:
type left = Left
type right = Right
type (_, _, ’s) element =
| L : ’a -> (’a, ’b, left) element
| R : ’b -> (’a, ’b, right) element
(* (’a,’b,’n,’r,’l,’s) cost :
’a : left nodes elements
’b : right nodes elements
’n : current dimension/variable
’r : current order
’l : current variable order
’s : node side (left/right)
*)
type (’a, ’b, _, _, _, _) cost_t =
| CoNil: (’a, ’b, Nat.zero, ’r, ’l, left) cost_t
| CoZero: (’a, ’b, ’n Nat.succ, ’r, ’l, right) cost_t
| CoNode: (’a, ’b, ’s) element
* (’a, ’b, ’n, ’r, Nat.zero, left) cost
* (’a, ’b, ’n Nat.succ, ’r Nat.succ, ’l Nat.succ, right) cost
-> (’a, ’b, ’n Nat.succ, ’r, ’l, ’s) cost_t
and (’a, ’b, ’n, ’r, ’l, ’s) cost = (’a, ’b, ’n, ’r, ’l, ’s) cost_t Lazy.t
Since all derivatives are stored in a single co-tensor structure, the current
global order of derivation/expansion is not handled anymore in the infinite
Taylor series. Yet, most of our algorithms rely on the knowledge of this
current derivation/expansion order, such as the convolution algorithm. To
compensate for this loss, instead of co-tensors alone, we may use tensors, the
coefficients of which are co-tensors of appropriate dimensions. Adapting our
tensor data structure yields the following type definition, where leaves are
co-tensors:
type (’a, _, _) st =
| Nil:
(’a, Nat.zero, ’r) st
| Leaf:
(unit, ’a, ’n Nat.succ, Nat.zero, Nat.zero) cost
-> (’a, ’n Nat.succ, Nat.zero) st
| Node:
(’a, ’n, ’r Nat.succ) st
* (’a, ’n Nat.succ, ’r) st
-> (’a, ’n Nat.succ, ’r Nat.succ) st
We feel the current code base would only need refurbishing to cope with co-tensors, although, for now, this remains a rather uninformed opinion.
8.11.3 Improved Composition of Taylor series
Composition of Taylor series is a rather heavyweight piece of code, with a lot of similar parameters, which is error-prone. It is also a clear performance bottleneck. Another possibility is to employ the renowned Faà di Bruno formula for composition of formal power series:
Di Bruno’s formula for composition of formal power series:
!mj
n
(j) (x)
X
Y
dn
n!
g
f (g(x)) =
f (m1 +···+mn ) (g(x))
dxn
m1 ! m2 ! · · · mn !
j!
j=1
where summation variables m1 , . . . , mn span over positive integers such that:
n
P
i ∗ mi = n.
i=1
Similarly to the product which needs a disciplined and reliable convolution algorithm, composition needs a disciplined version of Faa Di Bruno’s
formula to keep static guarantees. Unfortunately, this formula relies on enumerations of set partitions, a problem which seems hard to encode entirely at
the type-level. Indeed, we have conducted some experimental works showing
that this is possible, by mimicking classical algorithms that compute partitions, but at the expense of simplicity. So far, the resulting data structure
is crippled with auxiliary parameters, serving as counters and accumulators.
The memory footprint is also not very favorable, with potentially very long
empty branches. Improved composition has not been implemented yet but is
8.11. PERSPECTIVES
177
worth a thorough study, even if we suspect the complexity of its implementation is likely to curb its potential benefits in terms of performance. Here is a
possible attempt, inspired by our tensor data structure, where we specify the
expected sum and weighted sum of all mi as parameters. Recursion over the
branches of such a structure allows to browse through every possible valid
combination of mi values.
(* (’rank, ’cpt, ’sum, ’wsum) partitions
’rank : rank of current variable m_{rank}
’cpt : current value of counter in [0, rank], to encode (rank*m_{rank})
’sum : expected sum: \sigma_i m_i
’wsum : expected weighted sum: \sigma_i (i*m_i), equal to N initially
*)
type (_, _, _, _) partitions =
| Null :
(Nat.zero, Nat.zero, Nat.zero, Nat.zero) partitions
| Part :
(’rank, ’rank, ’sum Nat.succ, ’wsum) partitions
* (’rank Nat.succ, ’rank Nat.succ, ’sum, ’wsum) partitions
-> (’rank Nat.succ, Nat.zero, ’sum Nat.succ, ’wsum) partitions
| WSum :
(’rank Nat.succ, ’cpt, ’sum, ’wsum) partitions
-> (’rank Nat.succ, ’cpt Nat.succ, ’sum, ’wsum Nat.succ) partitions
(* ’sum faadibruno is the main type *)
type ’sum faadibruno = (N.t, Nat.zero, ’sum, N.t) partitions
8.11.4 Other Decompositions
Finally, other decompositions exist to approximate real-valued functions. As we must keep certified errors in scope, using Fourier series, the physicist’s classical recipe, seems unfit. Actually, Fourier series have serious drawbacks: they involve a non-polynomial basis (trigonometric functions); the computation of coefficients is resource-demanding if precision is needed; coefficients are valid only with respect to a given “period” that may change when the variables’ domains change; convergence is good, but in the L2 norm, which doesn’t yield certified errors.
Yet, the not-so-well-known Poisson series [69, 70] seem a good candidate to replace traditional Taylor series. The decomposition basis of Poisson series is (in the 1D case) the collection of functions e^{−x} x^n/n!. While being non-polynomial, the common supplementary factor e^{−x} is easily factorized out in calculations. Poisson series bear strong similarities with the Bernstein basis and Bézier curves, which are used to approximate continuous functions on compact intervals. As an example, they too are endowed with interesting geometric properties, which lead to cheap but powerful refinement techniques as an alternative way to obtain more precision without computing tensors of increasing orders. We list some features of Poisson series:
— Precision refinement techniques through cheap De Casteljau-like subdivision algorithms.
— Enhanced uniform convergence properties, for functions vanishing at
infinity, on the whole of RN .
— Algebra similar to Taylor (differential) algebra. Derivation is more
complex, but again a cheap De Casteljau-like algorithm can cope with
it.
— Geometric properties of Poisson coefficients: in a quantifiable way, the
graph of the approximated function draws near each coefficient, seen
as a geometric point. Other interesting geometric properties also hold.
Poisson series definitely seem worth a try, all the more so as they are within close reach of our Taylor implementation (though not in the most efficient way). Basically, in our multidimensional setting, it amounts to multiplying every Taylor series by the exponential function e^{−(x0 + ... + x_{N−1})}.
Bibliography
[1] P. Amagbégnon, L. Besnard, and P. Le Guernic. Implementation of
the data-flow synchronous language signal. In In Conference on Programming Language Design and Implementation, pages 163–173. ACM
Press, 1995.
[2] Patrick Baudin, Jean-Christophe Filliâtre, Claude Marché, Benjamin
Monate, Yannick Moy, and Virgile Prevosto. ACSL: ANSI/ISO C Specification Language, 2008. frama-c.cea.fr/acsl.html.
[3] Calin Belta and Franjo Ivancic, editors. Proceedings of the 16th international conference on Hybrid systems: computation and control, HSCC
2013, April 8-11, 2013, Philadelphia, PA, USA. ACM, 2013.
[4] A. Benveniste and G. Berry. The synchronous approach to reactive and
real-time systems. In Proceedings of the IEEE, pages 1270–1282, 1991.
[5] Albert Benveniste, Paul Le Guernic, and Christian Jacquemot. Synchronous programming with events and relations: the signal language
and its semantics. Science of Computer Programming, 16(2):103 – 149,
1991.
[6] G. Berry and G. Gonthier. The esterel synchronous programming language: design, semantics, implementation. Sci. Comput. Program.,
19(2):87–152, 1992.
[7] Dariusz Biernacki, Jean-Louis Colaço, Grégoire Hamon, and Marc
Pouzet. Clock-directed modular code generation for synchronous dataflow languages. In Krisztián Flautner and John Regehr, editors, LCTES,
pages 121–130. ACM, 2008.
[8] Sandrine Blazy, Vincent Laporte, André Oliveira Maroneze, and David
Pichardie. Formal verification of a C value analysis based on abstract
interpretation. In Francesco Logozzo and Manuel Fähndrich, editors,
Static Analysis - 20th International Symposium, SAS 2013, Seattle, WA,
USA, June 20-22, 2013. Proceedings, volume 7935 of Lecture Notes in
Computer Science, pages 324–344. Springer, 2013.
[9] Sandrine Blazy and Xavier Leroy. Formal verification of a memory
model for C-like imperative languages. In ICFEM 2005, volume 3785
of LNCS, pages 280–299. Springer, 2005.
[10] A.S. Boujarwah and K. Saleh. Compiler test case generation methods: a
survey and assessment. Information and Software Technology, 39(9):617
– 625, 1997.
[11] Sylvain Boulmé and Grégoire Hamon. A clocked denotational semantics
for lucid-synchrone in coq, 2001.
[12] Timothy Bourke, Jean-Louis Colaço, Bruno Pagano, Cédric Pasteur,
and Marc Pouzet. A synchronous-based code generator for explicit hybrid systems languages. In Franke [28], pages 69–88.
[13] Timothy Bourke and Marc Pouzet. Zélus: a synchronous language with
odes. In Belta and Ivancic [3], pages 113–118.
[14] Aaron R. Bradley. IC3 and beyond: Incremental, inductive verification.
In P. Madhusudan and Sanjit A. Seshia, editors, Computer Aided Verification - 24th International Conference, CAV 2012, Berkeley, CA, USA,
July 7-13, 2012 Proceedings, volume 7358 of Lecture Notes in Computer
Science, page 4. Springer, 2012.
[15] G. Brat, D. Bushnell, M. Davies, D. Giannakopoulou, F. Howar, and
T. Kahsai. Verifying the safety of a flight-critical system. Technical
report, NASA Ames, September 2011.
[16] P. Caspi, D. Pilaud, N. Halbwachs, and J. A. Plaice. Lustre: a declarative language for real-time programming. In Proceedings of the 14th
ACM SIGACT-SIGPLAN symposium, POPL ’87, pages 178–188. ACM,
1987.
[17] Paul Caspi, Daniel Pilaud, Nicolas Halbwachs, and John Plaice. Lustre: A declarative language for programming synchronous systems. In
POPL-87, pages 178–188. ACM Press, 1987.
[18] Darren D. Cofer, Andrew Gacek, Steven P. Miller, Michael W. Whalen,
Brian LaValley, and Lui Sha. Compositional verification of architectural
models. In NASA Formal Methods - 4th International Symposium, NFM
2012, pages 126–140, 2012.
[19] Jean-Louis Colaço, Grégoire Hamon, and Marc Pouzet. Mixing signals
and modes in synchronous data-flow systems. In Sang Lyul Min and
Wang Yi, editors, Proceedings of the 6th ACM & IEEE International
conference on Embedded software, EMSOFT 2006, October 22-25, 2006,
Seoul, Korea, pages 73–82. ACM, 2006.
[20] Jean-Louis Colaço, Bruno Pagano, and Marc Pouzet. A conservative
extension of synchronous data-flow with state machines. In Wayne Wolf,
editor, EMSOFT 2005, September 18-22, 2005, Jersey City, NJ, USA,
5th ACM International Conference On Embedded Software, Proceedings,
pages 173–182. ACM, 2005.
[21] Patrick Cousot and Radhia Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Robert M. Graham, Michael A. Harrison,
and Ravi Sethi, editors, Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, Los Angeles, California,
USA, January 1977, pages 238–252. ACM, 1977.
[22] P. Cuoq, F. Kirchner, N. Kosmatov, V. Prevosto, J. Signoles, and
B. Yakobowski. Frama-c: a software analysis perspective. SEFM’12,
pages 233–247. Springer, 2012.
[23] Leonardo Mendonça de Moura and Nikolaj Bjørner. Z3: An efficient
smt solver. In C. R. Ramakrishnan and Jakob Rehof, editors, TACAS,
volume 4963 of Lecture Notes in Computer Science, pages 337–340.
Springer, 2008.
[24] Delphine Demange, David Pichardie, and Léo Stefanesco. Verifying fast
and sparse ssa-based optimizations in coq. In Franke [28], pages 233–
252.
[25] Arnaud Dieumegard, Pierre-Loic Garoche, Temesghen Kahsai, Alice
Tailliar, and Xavier Thirioux. Compilation of synchronous observers
as code contracts. In Roger L. Wainwright, Juan Manuel Corchado,
Alessio Bechini, and Jiman Hong, editors, 30th ACM/SIGAPP Symposium on Applied Computing, SAC 2015, Salamanca, Spain - April 13 17, 2015, pages 1933–1939. ACM, 2015. Short paper.
[26] Inc. Esterel Technologies. Scade. http://www.esterel-technologies.
com/products/scade-suite/.
[27] Jean-Christophe Filliâtre and Claude Marché. The why/krakatoa/caduceus platform for deductive program verification. In Werner Damm
and Holger Hermanns, editors, Computer Aided Verification, 19th International Conference, CAV 2007, Berlin, Germany, July 3-7, 2007,
Proceedings, volume 4590 of Lecture Notes in Computer Science, pages
173–177. Springer, 2007.
[28] Björn Franke, editor. Compiler Construction - 24th International Conference, CC 2015, Held as Part of the European Joint Conferences on
Theory and Practice of Software, ETAPS 2015, London, UK, April 1118, 2015. Proceedings, volume 9031 of Lecture Notes in Computer Science. Springer, 2015.
[29] Sicun Gao, Soonho Kong, and Edmund M. Clarke. dreal: An smt solver
for nonlinear theories over the reals. In Maria Paola Bonacina, editor,
CADE, volume 7898 of Lecture Notes in Computer Science, pages 208–
214. Springer, 2013.
[30] Pierre-Loïc Garoche, Arie Gurfinkel, and Temesghen Kahsai. Synthesizing modular invariants for synchronous code. In Nikolaj Bjørner, Fabio
Fioravanti, Andrey Rybalchenko, and Valerio Senni, editors, Proceedings First Workshop on Horn Clauses for Verification and Synthesis,
HCVS 2014, Vienna, Austria, 17 July 2014., volume 169 of EPTCS,
pages 19–30, 2014.
[31] Pierre-Loïc Garoche, Falk Howar, Temesghen Kahsai, and Xavier Thirioux. Testing-based compiler validation for synchronous languages. In
Julia M. Badger and Kristin Yvonne Rozier, editors, NASA Formal
Methods - 6th International Symposium, NFM 2014, Houston, TX,
USA, April 29 - May 1, 2014. Proceedings, volume 8430 of Lecture Notes
in Computer Science, pages 246–251. Springer, 2014. Short paper.
[32] Pierre-Loïc Garoche, Xavier Thirioux, and Temesghen Kahsai. Hierarchical state machines as modular horn clauses. In Proceedings Third
Workshop on Horn Clauses for Verification and Synthesis, HCVS 2016,
Eindhoven, The Netherlands, 03 April 2016., 2016.
[33] Pierre-Loïc Garoche, Xavier Thirioux, and Temesghen Kahsai. Lustrec:
a modular lustre compiler. https://github.com/coco-team/lustrec,
2012–. Collaboration avec l’ONERA et NASA Ames.
[34] Khalil Ghorbal, Eric Goubault, and Sylvie Putot. The zonotope abstract domain taylor1+. In Ahmed Bouajjani and Oded Maler, editors,
Computer Aided Verification, 21st International Conference, CAV 2009,
Grenoble, France, June 26 - July 2, 2009. Proceedings, volume 5643 of
Lecture Notes in Computer Science, pages 627–633. Springer, 2009.
[35] George Giorgidze and Henrik Nilsson. Switched-on yampa. In Paul Hudak and David Scott Warren, editors, Practical Aspects of Declarative
Languages, 10th International Symposium, PADL 2008, San Francisco,
CA, USA, January 7-8, 2008., volume 4902 of Lecture Notes in Computer Science, pages 282–298. Springer, 2008.
[36] Timothy Griffin. A formulae-as-types notion of control. In Frances E.
Allen, editor, Conference Record of the Seventeenth Annual ACM Symposium on Principles of Programming Languages, San Francisco, California, USA, January 1990, pages 47–58. ACM Press, 1990.
[37] Ashutosh Gupta, Corneliu Popeea, and Andrey Rybalchenko. Solving
Recursion-Free Horn Clauses over LI+UIF. In APLAS, pages 188–203,
2011.
[38] Ashutosh Gupta, Corneliu Popeea, and Andrey Rybalchenko. Threader:
A Constraint-Based Verifier for Multi-threaded Programs. In CAV,
pages 412–417, 2011.
[39] Arie Gurfinkel, Temesghen Kahsai, Anvesh Komuravelli, and Jorge A.
Navas. The seahorn verification framework. In Daniel Kroening and
Corina S. Pasareanu, editors, Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24,
2015, Proceedings, Part I, volume 9206 of Lecture Notes in Computer
Science, pages 343–361. Springer, 2015.
[40] George Hagen and Cesare Tinelli. Scaling up the formal verification of
Lustre programs with SMT-based techniques. In FMCAD-2008, pages
109–117. IEEE, 2008.
[41] N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous
dataflow programming language lustre. In Proceedings of the IEEE,
pages 1305–1320, 1991.
[42] Guillaume Hanrot, Vincent Lefèvre, Patrick Pélissier, and Paul Zimmermann. The mpfr library. http://www.mpfr.org, 2000–.
[43] David Harel. Statecharts: A visual formalism for complex systems. Sci.
Comput. Program., 8(3):231–274, June 1987.
[44] Tony Hoare. The verifying compiler: A grand challenge for computing
research. Journal of the ACM, 50:2003, 2003.
[45] Kryštof Hoder and Nikolaj Bjørner. Generalized property directed
reachability. In Alessandro Cimatti and Roberto Sebastiani, editors,
Theory and Applications of Satisfiability Testing – SAT 2012, volume
7317 of LNCS, pages 157–171. 2012.
[46] Richard M. Hueschen. Development of the Transport Class Model
(TCM) aircraft simulation from a sub-scale Generic Transport Model
(GTM) simulation. Technical report, NASA, Langley Research Center,
Hampton, VA, August 2011.
[47] Arnault Ioualalen and Matthieu Martel. Synthesis of arithmetic expressions for the fixed-point arithmetic: The sardana approach. In Proceedings of the 2012 Conference on Design and Architectures for Signal and
Image Processing, DASIP 2012, Karlsruhe, Germany, October 23-25,
2012, pages 1–8. IEEE, 2012.
[48] Arnault Ioualalen and Matthieu Martel. Synthesizing accurate floatingpoint formulas. In 24th International Conference on Application-Specific
Systems, Architectures and Processors, ASAP 2013, Washington, DC,
USA, June 5-7, 2013, pages 113–116. IEEE Computer Society, 2013.
[49] Nassima Izerrouken. Développement prouvé de composants formels pour
un générateur de code embarqué critique pré-qualifié. Thèse de doctorat,
Institut National Polytechnique de Toulouse, Toulouse, France, juillet
2011. (Soutenance le 06/07/2011).
[50] Nassima Izerrouken, Marc Pantel, Olivier Ssi Yan Kai, and Xavier Thirioux. D2.36: Geneauto block sequencer tool requirements. Rapport de
contrat D2.36, Institut National Polytechnique de Toulouse, Toulouse,
France, août 2008.
[51] Nassima Izerrouken, Marc Pantel, and Xavier Thirioux. Machine
checked sequencer for critical embedded code generator (regular paper). In International Conference on Formal Engineering Methods
(ICFEM), Rio de Janeiro, Brazil, 09/12/2009-12/12/2009, pages 521–
540, http://www.springerlink.com/, décembre 2009. Springer-Verlag.
[52] Nassima Izerrouken, Marc Pantel, Xavier Thirioux, and Olivier Ssi
Yan Kai.
Integrated formal approach for qualified critical embedded code generator (short paper). In International Workshop
on Formal Methods for Industrial Critical Systems (FMICS), Eindhoven - The Netherlands, 02/11/2009-03/11/2009, pages 199–201, http://www.springerlink.com/, 2009. Springer-Verlag.
[53] Nassima Izerrouken, Xavier Thirioux, Marc Pantel, and Martin Strecker. Certifying an automated code generator using formal tools: Preliminary experiments in the geneauto project. In European Congress on Embedded Real-Time Software (ERTS), Toulouse, 29/01/2008-01/02/2008, page (electronic medium), http://www.sia.fr, 2008. Société des Ingénieurs de l’Automobile.
[54] T. Kahsai, Y. Ge, and C. Tinelli. Instantiation-based invariant discovery. In NFM, volume 6617 of LNCS, pages 192–207, 2011.
[55] T. Kahsai and C. Tinelli. PKind: a parallel k-induction based model checker. In PDMC, volume 72 of EPTCS, pages 55–62, 2011.
[56] Temesghen Kahsai, Arnaud Dieumegard, Xavier Thirioux, Pierre-Loïc Garoche, Claire Pagetti, and Éric Noulard. CocoSim. https://github.com/coco-team/cocoSim, 2015–.
[57] Temesghen Kahsai and Cesare Tinelli. Pkind: A parallel k-induction based model checker. In Jiri Barnat and Keijo Heljanko, editors, Proceedings 10th International Workshop on Parallel and Distributed Methods in verifiCation, PDMC 2011, Snowbird, Utah, USA, July 14, 2011., volume 72 of EPTCS, pages 55–62, 2011.
[58] Jerzy Karczmarczuk. Functional differentiation of computer programs. Higher-Order and Symbolic Computation, 14(1):35–57, 2001.
[59] Hayhurst Kelly J., Veerhusen Dan S., Chilenski John J., and Rierson Leanna K. A practical tutorial on modified condition/decision coverage. Technical report, 2001.
[60] Anvesh Komuravelli, Arie Gurfinkel, Sagar Chaki, and Edmund M. Clarke. Automatic abstraction in smt-based unbounded software model checking. In Natasha Sharygina and Helmut Veith, editors, Computer Aided Verification - 25th International Conference, CAV 2013, Saint Petersburg, Russia, July 13-19, 2013. Proceedings, volume 8044 of Lecture Notes in Computer Science, pages 846–862. Springer, 2013.
[61] Xavier Leroy. Formal verification of a realistic compiler. Communications of the ACM, 52(7):107–115, 2009.
[62] Xavier Leroy, Damien Doligez, Alain Frisch, Didier Rémy, and Jérôme Vouillon. The ocaml system, release 4.03. http://caml.inria.fr/pub/docs/manual-ocaml/.
[63] Kyoko Makino and Martin Berz. Rigorous integration of flows and odes using taylor models. In Hiroshi Kai, Hiroshi Sekigawa, Tateaki Sasaki, Kiyoshi Shirayanagi, and Ilias S. Kotsireas, editors, Symbolic Numeric Computation, SNC ’09, Kyoto, Japan - August 03 - 05, 2009, pages 79–84. ACM, 2009.
[64] Florence Maraninchi and Yann Rémond. Mode-automata: a new
domain-specific construct for the development of safe critical systems.
Sci. Comput. Program., 46(3):219–254, 2003.
[65] Florence Maraninchi and Yann Rémond. Mode-automata: About modes
and states for reactive systems. In Chris Hankin, editor, Programming
Languages and Systems, volume 1381 of Lecture Notes in Computer
Science, pages 185–199. Springer Berlin Heidelberg, 1998.
[66] Matthieu Martel. An overview of semantics for the validation of numerical programs. In Radhia Cousot, editor, Verification, Model Checking, and Abstract Interpretation, 6th International Conference, VMCAI
2005, Paris, France, January 17-19, 2005, Proceedings, volume 3385 of
Lecture Notes in Computer Science, pages 59–77. Springer, 2005.
[67] Érik Martin-Dorel, Guillaume Hanrot, Micaela Mayero, and Laurent
Théry. Formally verified certificate checkers for hardest-to-round computation. J. Autom. Reasoning, 54(1):1–29, 2015.
[68] Érik Martin-Dorel, Laurence Rideau, Laurent Théry, Micaela Mayero, and Ioana Pasca. Certified, efficient and sharp univariate Taylor models in COQ. In Nikolaj Bjørner, Viorel Negru, Tetsuo Ida, Tudor Jebelean, Dana Petcu, Stephen M. Watt, and Daniela Zaharie, editors, 15th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC 2013, Timisoara, Romania, September 23-26, 2013, pages 193–200. IEEE Computer Society, 2013.
[69] Géraldine Morin and Ronald N. Goldman. A subdivision scheme for Poisson curves and surfaces. Computer Aided Geometric Design, 17(9):813–833, 2000.
[70] Géraldine Morin and Ronald N. Goldman. Trimming analytic functions using right sided Poisson subdivision. Computer-Aided Design, 33(11):813–824, 2001.
[71] Yannick Moy and Claude Marché. Jessie Plugin Tutorial, Beryllium
version. INRIA, 2009. www.frama-c.cea.fr/jessie.html.
[72] George C. Necula. Proof-carrying code. In Proceedings of the 24th
ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, POPL ’97, pages 106–119. ACM, 1997.
[73] George C. Necula. Translation validation for an optimizing compiler.
SIGPLAN Not., 35(5):83–94, May 2000.
[74] David Nowak, Jean-René Beauvais, and Jean-Pierre Talpin. Coinductive axiomatization of a synchronous language. In Proceedings of
the 11th International Conference on Theorem Proving in Higher Order
Logics, pages 387–399, London, UK, 1998. Springer-Verlag.
[75] Special Committee of RTCA. DO-178C, Software considerations in airborne systems and equipment certification, 2011.
[76] Christine Paulin-Mohring. A constructive denotational semantics for Kahn networks in Coq. In Yves Bertot, Gérard Huet, Jean-Jacques Lévy, and Gordon Plotkin, editors, From Semantics to Computer Science, pages 383–414. Cambridge University Press, 2009. Cambridge Books Online.
[77] Simon Peyton Jones, Dimitrios Vytiniotis, Stephanie Weirich, and Geoffrey Washburn. Simple unification-based type inference for GADTs. SIGPLAN Not., 41(9):50–61, September 2006.
[78] David Pichardie. Building certified static analysers by modular construction of well-founded lattices. Electr. Notes Theor. Comput. Sci.,
212:225–239, 2008.
[79] Plum Hall, Inc. The Plum Hall validation suite for C. http://www.plumhall.com/stec.html, viewed on May 2nd, 2014.
[80] A. Pnueli, M. Siegel, and F. Singerman. Translation validation. In TACAS, volume 1384 of Lecture Notes in Computer Science, pages 151–166. Springer, 1998.
[81] Marc Pouzet. The heater. https://www.di.ens.fr/~pouzet/lucid-synchrone/manual_html/manual019.html, 2006.
[82] Marc Pouzet. Lucid Synchrone, version 3. Tutorial and reference manual. Université Paris-Sud, LRI, April 2006.
[83] Nathalie Revol, Kyoko Makino, and Martin Berz. Taylor models and
floating-point arithmetic: proof that arithmetic operations are validated
in COSY. J. Log. Algebr. Program., 64(1):135–154, 2005.
[84] John Rushby. The versatile synchronous observer. In Proceedings of the
15th Brazilian conference on Formal Methods: foundations and applications, SBMF’12, pages 1–1, Berlin, Heidelberg, 2012. Springer-Verlag.
[85] Mary Sheeran, Satnam Singh, and Gunnar Stålmarck. Checking safety
properties using induction and a SAT-solver. In FMCAD ’00, pages
108–125. Springer, 2000.
[86] Christian Skalka and François Pottier. Syntactic type soundness for HM(X). In TIP'02, International Workshop on Types in Programming. Electronic Notes in Theoretical Computer Science, 75:61–74, 2003.
[87] The Coq Development Team. The Coq proof assistant, version 8.5. https://coq.inria.fr/.
[88] The MathWorks, Inc. Simulink. http://www.mathworks.com/products/simulink/.
[89] The MathWorks, Inc. Stateflow. http://www.mathworks.com/products/stateflow/.
[90] Verimag. Lustre v4 toolbox. http://www-verimag.imag.fr/The-Lustre-Toolbox.
[91] Michael Whalen, Gregory Gay, Dongjiang You, Mats P. E. Heimdahl,
and Matt Staats. Observable modified condition/decision coverage. In
ICSE 2013, pages 102–111. IEEE Press, 2013.
[92] Olivier Ssi Yan Kai (project leader). The Gene-Auto project. https://itea3.org/project/gene-auto.
[93] Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. Finding and
understanding bugs in c compilers. ACM SIGPLAN Notices, 47(6):283–
294, 2012.