An Example of Translation and Proof using Higher-Order Abstract Syntax
Michael W. Whalen
Advanced Technology Center
Rockwell Collins Inc.
Safety-Critical Systems
http://www.cs.umn.edu/crisys
Code Generation Requirements
• Automatic
• Formally-Defined
Formal description of source/target language
Proof that generated code implements specification
• Correctly-Implemented
Transparent transliteration of translation rules
Implementation should be rigorously tested
• Usable for Safety-Critical Systems
Human-Understandable and traceable
Necessary for fault analysis, code instrumentation
Required by regulatory agencies
Fast enough for target environment
Aspects of Translation
1. Foundations
Language Semantics and Proof
2. Formal Architecture
How do we create a formal translation approach from foundations?
3. Application
Applying Semantics and Proofs to RSML-e Translator
4. Implementation
Designing a Translator that transparently implements rules
Formal Definition of Compiler Correctness
[Diagram: the Compiler Definition maps RSML-e Syntax to generated Program Syntax. RSML-e Semantics gives the source an Output; Program Semantics gives the generated program an Output. Proof: Same Outputs.]
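The "same outputs" notion of correctness in the diagram can be illustrated on a toy language. The sketch below is not the RSML-e translator: the expression language, the stack machine, and every name in it are hypothetical, and the property is checked on one program rather than proved.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical toy source language: integer literals and addition.
@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Add]


def eval_source(e):
    """Source semantics: evaluate the expression tree directly."""
    if isinstance(e, Lit):
        return e.value
    return eval_source(e.left) + eval_source(e.right)


def compile_expr(e):
    """Compiler definition: translate to stack-machine instructions."""
    if isinstance(e, Lit):
        return [("push", e.value)]
    return compile_expr(e.left) + compile_expr(e.right) + [("add",)]


def run_target(code):
    """Target semantics: execute the stack-machine program."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:  # "add": pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack.pop()


program = Add(Lit(2), Add(Lit(3), Lit(4)))
# Correctness property, tested here on a single program:
same_output = eval_source(program) == run_target(compile_expr(program))
```

A proof of correctness would establish `same_output` for every program by induction over the syntax, which is the role the semantics rules play in the rest of the talk.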
Operational Semantics and Proof
• Operational semantics provides framework for evaluation, static semantics, and transformations
• Several different “flavors” of operational semantics
SOS, Natural Semantics, Abstract Machines
• We want formalism that leads to elegant transformations and proofs
Managing Identifiers
• Large part of translation and proof complexity
• Explicit Environments
“Environment Carrying” functions [Plotkin: SOS, Despeyroux: Mini-ML]
Renaming over scopes [Drossopoulou: Java]
Substitution as meta-rule [Pierce PL Book]
• Implicit Environments
Lambda variables in object language
Metalevel support [Hannan93, Whalen05]
Lambda variables in metalanguage
Proofs describing substitution behavior provided by metalogic
Extended Natural Semantics Example
Concrete Syntax:
function sum(y: int; z: int) : int
{
return y + z;
}
Higher-Order Abstract Syntax:
(function_def int
(param int
(λy. (param int
(λz. (body
(binary_expr (lit_expr y) plus (lit_expr z))))))))
Evaluation Rules:
Example:
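The higher-order abstract syntax for `sum` can be mimicked in a general-purpose language by letting host-language functions play the role of the λ-binders. The sketch below is an illustration only: the class names mirror the constructors on the slide, while `eval_expr` and `call` are hypothetical helpers and not the talk's evaluation rules.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class LitExpr:       # (lit_expr v)
    value: Any


@dataclass
class BinaryExpr:    # (binary_expr e1 op e2)
    left: Any
    op: str
    right: Any


@dataclass
class Body:          # (body e)
    expr: Any


@dataclass
class Param:         # (param type (λx. rest)): the binder is a Python function
    type_name: str
    binder: Callable[[Any], Any]


@dataclass
class FunctionDef:   # (function_def return_type params)
    return_type: str
    params: Any


# HOAS term for: function sum(y: int; z: int) : int { return y + z; }
sum_def = FunctionDef(
    "int",
    Param("int", lambda y:
          Param("int", lambda z:
                Body(BinaryExpr(LitExpr(y), "plus", LitExpr(z))))))


def eval_expr(e):
    """Evaluate the small expression fragment used in this example."""
    if isinstance(e, LitExpr):
        return e.value
    if isinstance(e, BinaryExpr) and e.op == "plus":
        return eval_expr(e.left) + eval_expr(e.right)
    raise ValueError(f"unsupported expression: {e!r}")


def call(fn, args):
    """Apply each binder to its argument, then evaluate the body.

    No environment is passed around: applying the Python function
    performs the substitution of the argument for the bound variable.
    """
    node = fn.params
    for arg in args:
        if not isinstance(node, Param):
            raise TypeError("too many arguments")
        node = node.binder(arg)
    if not isinstance(node, Body):
        raise TypeError("too few arguments")
    return eval_expr(node.expr)


result = call(sum_def, [2, 3])
```

The identifier names `y` and `z` never appear in the data structure, which is the sense in which the environment is implicit.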
Extended Natural Semantics: Typing
Higher-Order Abstract Syntax:
(function_def int
(param int
(λy. (param int
(λz. (body
(binary_expr (lit_expr y) plus
(lit_expr z)))))))).
Typing Rules:
Example:
Extended Natural Semantics: Transformation
Higher-Order Abstract Syntax:
(function_def int
(param int
(λy. (param int
(λz. (body
(binary_expr (lit_expr y) plus (lit_expr z))…
is transformed into:
(function_def int
(param int
(λy. (param int
(λz. (body (binary_expr
(unary_expr minus (lit_expr y)) plus
(unary_expr minus (lit_expr z))…
Transformation Rules:
ENS Transformation, Expanded
(∀x ((λz.<Body>) x)
Instantiating new constant c for x:
(λz.<Body>) c
Applying c for z in Body:
Body[z := c]
Apply trans rule here. Rule premises define how Body is transformed to Body':
Body'[z := c]
Several functions may match Body', replacing zero or more instances of c with z:
{ (λz.<Body''>) c, (λz.<Body'''>) c, …}
However, only one function can match the ∀, because the c must be new: it cannot exist outside the scope of the ∀, so all c's must be replaced by z's:
((λz.<Body*>) x))
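The steps above (instantiate a fresh constant, transform the opened body, abstract the constant away again) can be traced in a small sketch. This is an approximation under stated assumptions: terms are plain tuples, binders are Python functions, and `Const`, `transform`, and `substitute` are hypothetical names. λProlog obtains the final abstraction step from higher-order unification; Python has no such mechanism, so the sketch replaces the constant explicitly.

```python
import itertools

_fresh_ids = itertools.count()


class Const:
    """A fresh constant standing in for a bound variable under a binder."""

    def __init__(self):
        self.name = f"c{next(_fresh_ids)}"

    def __repr__(self):
        return self.name


def substitute(term, c, z):
    """Replace every occurrence of the constant c in term by z."""
    if term is c:
        return z
    if callable(term):  # a binder: substitute inside its body lazily
        return lambda w: substitute(term(w), c, z)
    if isinstance(term, tuple):
        return tuple(substitute(t, c, z) for t in term)
    return term


def transform(term):
    """Negate both operands of each binary_expr, as in the slide example."""
    tag = term[0]
    if tag == "lam":
        c = Const()                 # instantiate a new constant c
        body = term[1](c)           # Body[z := c]
        new_body = transform(body)  # Body'[z := c]
        # c is new, so every c in Body' must become the bound variable again.
        return ("lam", lambda z: substitute(new_body, c, z))
    if tag == "param":
        return ("param", term[1], transform(term[2]))
    if tag == "body":
        return ("body", transform(term[1]))
    if tag == "binary_expr":
        _, left, op, right = term
        return ("binary_expr",
                ("unary_expr", "minus", left), op,
                ("unary_expr", "minus", right))
    return term


sum_params = (
    "param", "int", ("lam", lambda y:
        ("param", "int", ("lam", lambda z:
            ("body", ("binary_expr", ("lit_expr", y), "plus",
                      ("lit_expr", z)))))))

transformed = transform(sum_params)
# Open both binders with the names "Y" and "Z" to inspect the result.
opened = transformed[2][1]("Y")[2][1]("Z")
```

Because the constants are created inside `transform`, they cannot appear anywhere outside the binder being processed, which is the freshness argument made on the slide.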
Aspects of Translation
1. Foundations
Language Semantics and Proof
2. Formal Architecture
How do we create a formal translation approach using foundations?
3. Application
Applying Semantics and Proofs to RSML-e Translator
4. Implementation
Designing a Translator that transparently implements rules
Notions of Completeness and Determinism
[Diagram: the Compiler Rules map Source Syntax to Program Syntax. RSML-e Semantics Rules give the source an Output; Program Semantics Rules give the program an Output. Questions posed on the diagram: Are rules deterministic? Are rules complete?]
Correctness Obligations for SOS Rules
• Despeyroux’s obligations:
• Obligations for deterministic language:
• Obligations are equivalent if source semantics are complete.
Translation in Layers
[Diagram labels: RSML-e; Semantics Rules; Translation Rules; Completeness Proofs; …; C, Ada, Java, …]
Evaluation Rules in Translation
Source AST Grammar:
...
v_expr ::=
unknown |
id(expr list) |
expr.
...
New Syntax:
if expr
then v_expr
else v_expr
Target AST Grammar:
...
v_expr ::=
if expr then v_expr
else v_expr |
expr.
...
Evaluation rules for new syntax:
Target Evaluation Rules = Source Evaluation Rules - Rules for Removed Syntax + Evaluation rules for new syntax
Translation Proof Structure
• Describe the correctness of contexts:
• Describe equivalence of program states:
• Describe completeness obligation using evaluation rules for source and target languages + transformation rules:
Aspects of Translation
1. Foundations
Language Semantics and Proof
2. Formal Architecture
How do we create a formal translation approach from foundations?
3. Application
Applying Semantics and Proofs to RSML-e Translator
4. Implementation
Designing a Translator that transparently implements rules
Source Language: RSML-e
• RSML-e is a Reactive Synchronous Dataflow Language
Reactive: Specification reacts to changes in external environment at discrete intervals
Synchronous: those reactions take (logically) zero time
Dataflow: value of object (variable or interface) can be computed as soon as objects on which it is dependent have been computed.
• Specification consists of Variables and Interfaces
Variables maintain internal state of model
Interfaces describe interaction with the external environment
• Two-state model
Values of variables from previous step can be referenced
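The reactive, dataflow, two-state reading of a specification can be sketched as a step function. Everything here is an assumption made for illustration: the variable names (`BelowThreshold`, `JustDescended`), the threshold of 2000, and the dictionary-based representation are hypothetical and are not RSML-e syntax.

```python
def step(spec, prev, inputs):
    """Compute one synchronous step.

    spec maps each name to a function of (cur, prev, inputs), where cur is
    a lookup for values in the current step and prev holds the values of
    the previous step (the two-state model). Values are computed on
    demand, so each object is evaluated once its dependencies are known.
    The sketch assumes the dependencies are acyclic.
    """
    current = {}

    def cur(name):
        if name not in current:
            current[name] = spec[name](cur, prev, inputs)
        return current[name]

    for name in spec:
        cur(name)
    return current


# Hypothetical model with one interface value and two variables.
spec = {
    "Altitude": lambda cur, prev, inputs: inputs["Altitude"],
    "BelowThreshold": lambda cur, prev, inputs: cur("Altitude") < 2000,
    "JustDescended": lambda cur, prev, inputs: (
        cur("BelowThreshold") and not prev.get("BelowThreshold", False)),
}

state = {}
trace = []
for altitude in [2500, 1900, 1800]:
    state = step(spec, state, {"Altitude": altitude})
    trace.append(state["JustDescended"])
```

`JustDescended` is true only in the step where the altitude first drops below the threshold, which requires the previous-step value of `BelowThreshold`.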
Source Language: RSML-e
[Figure: the Altitude Switch Specification consuming a sequence of Input Frames (Reset_Receiver, DOI_Receiver, Clock, ...) and producing Output Frames (Fault_Sender, DOICmd_Sender, <empty>, ...); labels mark the Frame Being Evaluated and the Evaluation Result.]
Source Language: RSML-e
Altitude Switch Specification
Input Interfaces: Reset Receiver, Clock Reader, DOI Receiver, Altitude Reader
State Variables: Reset, Clock, System Mode, DOI, DOI Status, Altitude, Altitude Quality, Altitude Status
Output Interfaces: DOICmd Sender, Fault Sender
Source Language: RSML-e
• Each variable or interface has an assignment:
AltitudeStatus :=
when initially defined: Status:Ok
equals Status:Failed if
Altitude < 0 or
AltitudeQuality = Quality:Bad
equals Status:Ok if
Altitude > 0 and
AltitudeQuality = Quality:Good
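The AltitudeStatus assignment reads as a list of guarded cases. The sketch below models it as a Python function. Two points are assumptions rather than statements from the slides: cases are tried in the order written, and when no guard holds (for example Altitude = 0 with good quality) the previous value is kept. The string encodings of the enumeration values are also hypothetical.

```python
def altitude_status(altitude, altitude_quality, previous="Ok"):
    """Evaluate the AltitudeStatus case list for one step.

    previous defaults to "Ok", mirroring "when initially defined: Status:Ok".
    """
    cases = [
        ("Failed", altitude < 0 or altitude_quality == "Bad"),
        ("Ok", altitude > 0 and altitude_quality == "Good"),
    ]
    for value, guard in cases:
        if guard:
            return value
    # Assumption of this sketch: no guard held, so keep the previous value.
    return previous


first = altitude_status(-5, "Good")
second = altitude_status(100, "Good", previous=first)
```

The input with no matching case is the kind of situation that the later removal of case lists and undefined values has to account for explicitly.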
Translation: Intermediate Languages
• We move the language successively closer to an imperative language
RSML-p: We move from the RSML-e synchronous specification language to a synchronous programming language: remove undefined and case lists.
RSML-t: Switch from a structural to a nominal type system
RSML-v: Switch from two-state variables to one-state variables
SIMPLr: Add imperative, rather than functional, assignments to variables (subset of Ada)
SIMPL: Remove record assignments from SIMPLr (subset of C, Java)
Example: RSML-e to RSML-p
• This transformation does two things:
Replaces assignment case lists with assignment expressions
Removes undefined_val from the type system
• To remove undefined_val we transform all variables in the specification
var x : T; becomes
var x : record{ val: T,
def: Boolean };
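The record encoding of undefined values can be sketched as follows. The field is named `is_def` because `def` is a Python keyword. `None` stands in for undefined_val, and both `encode` and `similar` are hypothetical helpers: `similar` is only a guess at the shape of a value similarity relation, not the definition used in the proofs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Defined:
    """Stand-in for record{ val: T, def: Boolean }."""
    val: object
    is_def: bool


def encode(source_value, default):
    """Map a source value (None = undefined_val) to the record encoding.

    default fills the val field when the source value is undefined,
    since the target language has no undefined value of its own.
    """
    if source_value is None:
        return Defined(default, False)
    return Defined(source_value, True)


def similar(source_value, target):
    """Hedged similarity check between a source value and its encoding."""
    if source_value is None:
        return not target.is_def
    return target.is_def and target.val == source_value


encoded = encode(5, default=0)
encoded_undefined = encode(None, default=0)
```

With this encoding, every read of a variable in the target has to consult `is_def` before using `val`, which is where the transformed expressions become larger than their sources.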
Transformation Rules
• Expressions
• Declarations
Proof Obligations
Context Relation:
State Variable Value Similarity Relation:
State Relation:
Proof Obligation: Expressions
Expression Obligation:
Lemma about deref:
Example Proof: pre_expr
Transformation Rule:
RSML-e Evaluation Rule:
From deref Lemma:
From definition of ≈, and from premise Val_s ≠ undefined_val, Val_t = […] with V_2 = Val_s.
Now, we can derive:
Aspects of Translation
1. Foundations
Language Semantics and Proof
2. Formal Architecture
How do we create a formal translation approach from foundations?
3. Application
Applying Semantics and Proofs to RSML-e Translator
4. Implementation
Designing a Translator that transparently implements rules
Implementation
• Prototype Translator In λProlog
• Transparently Implements ENS Rules
becomes…
Translator Architecture
[Diagram: an RSML-e Concrete Syntax File is read by the Untrusted "User Friendly" ML Static Semantics Checker, which produces a Static Semantics Error Report, and by the Trusted MLYACC Parser, which produces an RSML-e Abstract Syntax File. The Trusted Lambda-Prolog Translator takes the abstract syntax and produces a C/C++ Concrete Syntax File, a Java Concrete Syntax File, an Ada Concrete Syntax File (Planned), and a Traceability Information Report (Planned).]
Implementation
• Translator Stats
Source Code: ≈100KB in 27 source/header files
Rule Type                  Lines of Code   Number of Rules
Translation                ≈2000           278
RSML-e Static Semantics    ≈1000           141
Scaffolding                ≈500            45
RSML-e Evaluation          ≈350            100
SIMPL Evaluation           ≈320            91
Implementation
• Translation Results
File Name                Size (LOC)   Compilation Time
records3.rsmle           71           1s
three_altimeters.rsmle   131          2s
numeric_ops.rsmle        230          DNF – Ran out of Memory
function_test.rsmle      215          DNF – Ran out of Memory
• Teyjus Needs Garbage Collection!
Post-Mortem
Discussion
• Original work was in first-order system
Used ID-substitution (Drossopoulou)
Requires additional rules describing which ids
should be substituted (e.g. no record fields)
Required significant additional lemmas about
how terms behave under id substitutions
I was struggling to complete proofs (and bored)
due to sheer number of details related to
identifiers
Discussion
• HOAS and λProlog made my dissertation much more straightforward
Language descriptions became simpler
Translation became much simpler
Use of implication allowed immediate and simple constructions of compiler environment
Relations over correct environments are straightforward to construct
Proofs became much simpler
No substitution lemmas [Pierce, Despeyroux]
Proofs 2-3x shorter
Binding I: Removing Names
• One goal of HOAS: make identifier names irrelevant
• I was not totally able to do this:
Record fields still keyed by id
λ-bindings assume a specific order – record expressions allow arbitrary order
Question: is it possible / a good idea to remove field identifiers?
Binding II: Adding Variables
• Translation from higher-level to lower-level language often requires introduction of new variables
Difficult to motivate translation rules at first
Led to some odd rule constructions where bindings and code were constructed “in parallel”
Example: moving from a language with record-creation expressions (a la ML) to one that does not (a la C)
Remove Record Expressions Example
• Given:
type a = record {
f1 : int,
f2 : real } ;
• Want to change something like:
[f1 : 2+y, f2 : 3.1]
• Into:
create_a(2+y, 3.1)
• Need to create:
fun create_a(
f1 : int,
f2 : real): a =
var
r_result : a ;
in
r_result.f1 = f1 ;
r_result.f2 = f2 ;
return r_result ;
end
Remove Record Expressions Example
Rule: create_type_fn_body Var Type Fields StmtList Block
- Var is the fresh constant bound to the r_result local variable
- Type is the return type of Var
- Fields describes the remaining fields to be assigned within the record
- StmtList defines the field assignments performed thus far
- Block is the returned function block
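The shape of this rule, which consumes the remaining fields while accumulating assignment statements, can be imitated with an accumulator-style recursion. The sketch is a string-based stand-in and not the λProlog rule: `var` is a fixed name rather than a fresh constant introduced by a binder, and `render_create_fn` and `rewrite_record_expr` are hypothetical helpers that print in the notation of the example.

```python
def create_type_fn_body(var, type_name, fields, stmt_list):
    """Consume fields one at a time, accumulating assignments in stmt_list.

    When no fields remain, return the function block: the local variable
    declaration plus the accumulated statements and the final return.
    """
    if not fields:
        return {"locals": [(var, type_name)],
                "stmts": stmt_list + [f"return {var} ;"]}
    (name, _field_type), rest = fields[0], fields[1:]
    return create_type_fn_body(
        var, type_name, rest, stmt_list + [f"{var}.{name} = {name} ;"])


def render_create_fn(type_name, fields):
    """Print the constructor function for a record type."""
    block = create_type_fn_body("r_result", type_name, fields, [])
    params = ", ".join(f"{name} : {ty}" for name, ty in fields)
    lines = [f"fun create_{type_name}({params}): {type_name} =", "var"]
    lines += [f"{v} : {t} ;" for v, t in block["locals"]]
    lines += ["in"] + block["stmts"] + ["end"]
    return "\n".join(lines)


def rewrite_record_expr(type_name, fields, exprs):
    """Replace a record expression by a call, using declared field order.

    exprs maps field names to expression text, so the record expression
    may list its fields in any order.
    """
    args = ", ".join(exprs[name] for name, _ in fields)
    return f"create_{type_name}({args})"


fields_a = [("f1", "int"), ("f2", "real")]
block_a = create_type_fn_body("r_result", "a", fields_a, [])
fn_text = render_create_fn("a", fields_a)
call_text = rewrite_record_expr("a", fields_a, {"f2": "3.1", "f1": "2+y"})
```

The statement list and the remaining-field list evolve together at each recursive step, which is the "in parallel" construction of bindings and code mentioned earlier.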
Binding II: Adding Variables
• Similar project at RCI: Translating Lustre to several languages (NuSMV, PVS, SAL)
• Lustre supports PRE-operator that allows reference to previous values of variables
Fibonacci: x = pre(pre(x, 0), 0) + pre(x,1) ;
• To translate to C, we must introduce additional variables for each pre-operator
• Seems tricky to do in HOAS!
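The need for one extra variable per pre-operator can be seen by writing the Fibonacci equation as an imperative loop. The sketch assumes that `pre(e, v)` denotes the value of `e` in the previous step, with `v` as its value in the first step. The variable names are hypothetical.

```python
def fibonacci(steps):
    """Run x = pre(pre(x, 0), 0) + pre(x, 1) for the given number of steps."""
    pre_x_1 = 1    # pre(x, 1): previous x, initially 1
    pre_x_0 = 0    # inner pre(x, 0): previous x, initially 0
    pre_pre_x = 0  # outer pre(..., 0): previous value of pre_x_0, initially 0
    outputs = []
    for _ in range(steps):
        x = pre_pre_x + pre_x_1
        outputs.append(x)
        # Update the state for the next step. pre_pre_x must read the old
        # pre_x_0 before pre_x_0 is overwritten.
        pre_pre_x = pre_x_0
        pre_x_0 = x
        pre_x_1 = x
    return outputs


sequence = fibonacci(6)
```

Each of the three pre-operators in the equation becomes a separate state variable with its own initial value and update, none of which is bound anywhere in the source expression.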
Binding III: Non-Lexical Scoping
• Many languages allow forward references to identifiers
Java
Lustre/SCADE
I changed the RSML-e semantics to disallow forward references
• (How) Can we represent “global” scopes in HOAS?
Alternately, can we add environments for “global” ids and still get most of the HOAS benefits?
Working in a Positivist Logic
• It would be difficult to write semantics and
translator entirely without the use of cut
List non-membership in static semantics
Evaluation rule for not-equal expressions
Occasional use of set data structure
• Cuts were not used in rules that referenced
structures that could contain meta-level variables
or universal constants
These uses could affect correctness of reasoning
• How will my use of cut affect reasoning in a
formal framework?
Tool Support
• λProlog gripes
No syntax for naming commonly used types – makes for long type descriptions
Syntax allows misplaced comma to conjoin two rule instances;
New symbol for reverse implication in rule instance? (<- )
New rule begins with turnstile? (|- )
Implication (=>) binds tighter than and (,)
• Teyjus gripes
No garbage collector
No warnings on single use of variable
No warnings on rule declaration without definition
No warnings on non-use of bound variable within term
No debugger
Conclusion
• Formal approach can be used for real translators
• Difficulty is dependent on choice of formalism
Original work was in natural semantics
Much simpler with extended natural semantics
• Some things are still tricky to do in HOAS
• A few improvements to tools would really benefit serious users
Conclusion
• SIMPL – “Small Imperative Language”
semantics may be useful to others
I didn’t want to write it
However, I needed a small subset of Ada/Java/C
YAILS - boring
Literature semantics are cleaner, but no clear
correspondence to “real” languages
Supports basic records, arrays, block
structuring, functions
Recursion could be added easily
However, matching C/Java syntax for recursion
would be harder
Future Work
• Generalizing work to other source languages
Lustre, SCR
• Adding other target languages
• Extensive testing (if actually to be used on DO-178B development effort)
• Teyjus Improvements
• Optimizations
Contact Information
• Crisys Research Group
on the web: http://www.cs.umn.edu/crisys
• Mike Whalen
e-mail: [email protected]
phone: (612) 625-4543
• Mats P.E. Heimdahl
e-mail: [email protected]
phone: (612) 625-2068