Normalisation by Calculation

R. Loader∗
April, 1995.
Abstract
I outline a new proof of strong normalisation for system F, by
direct calculation of the size of a normalisation tree. System F
itself is used as a formalism for representing these calculations, and
any reasonably sensible model of the calculus can be used to verify
the correctness of the calculations. The method works for a variety of
calculi, and seems reasonably general.
Discussion
Proofs of strong normalisation for typed λ-calculi, such as system F, are well
known. The standard proof, developed by Tait and Girard using a notion
called reducibility, proceeds by defining sets of certain normalising terms
in such a way that it can be shown that every term is a member of the
appropriate set. This proof is probably best understood in a model theoretic
manner: the sets of normalising terms form a model of some sort. A term
t may be interpreted in this model by some term ([t]), which is necessarily
normalising, and either actually t = ([t]), or t and ([t]) are sufficiently similar
so that we may infer that t is normalising.
All previous proofs of normalisation for system F follow this pattern, there
being some differences in the exact machinery used. We present here a novel
proof, which does not rely on a model based on sets of terms. Instead, we
present a translation t ↦ ([t]) on terms, such that ([t]) is a program computing
a witness for the Σ^0_1 statement that t is strongly normalising. By interpreting
([t]) in a model, or by otherwise extracting an actual value from ([t]), we can
verify that t is in fact strongly normalising.
∗ Merton College, Oxford. Email: [email protected]
In hindsight, there is a close proof-theoretical connection between this
proof and the more traditional ones. Consider the traditional normalisation
proof of system F. It is well known that this proof cannot be formalised
in second order logic; the final induction over terms verifying reducibility
has an induction statement that is not second order. However, it is easy to
check that if we fix a term t, and unwind this problematic induction for the
particular term t, then we obtain a proof of SN(t) that is second order. By
known facts about program-extraction and realisability, we can extract from
this proof a term of system F witnessing SN(t); comparison of the result of
this process with the translation we shall construct reveals them to be very
similar.
Two similar proofs of normalisation for Gödel’s T exist in the literature.
The first is that of Gandy [1], where a similar translation is used to show
that weak normalisation implies strong normalisation. The second is that of
van de Pol and Schwichtenberg [3], where an interpretation of terms as strict
functionals is used to derive strong normalisation; this seems similar to the
proof here, but the translation is carried out directly into a model; clearly our
proof of strong normalisation could be done in a similar manner if desired.
For simplicity we shall work with terms that do not contain type decorations; therefore all terms are terms of the untyped λ-calculus. It simplifies
our presentation if we work with a slightly unusual presentation of strong
normalisation.
Definition 1 A term is strongly normalising if it is derivable using the following three rules:

       t_1  · · ·  t_n
    ---------------------
     x(t_1) · · · (t_n)

     s[x ← t_0](t_1) · · · (t_n)     t_0
    -------------------------------------
       (λx. s)(t_0)(t_1) · · · (t_n)

       t
    -------
     λx. t

This is essentially a simplification of H. Goguen’s typed operational semantics [2]: we retain only the fact that a term is strongly normalising, while
Goguen gives a system that simultaneously infers types, and gives normal
forms and strong normalisation.
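
The three rules of Definition 1 are syntax-directed, so they can be read as a recursive procedure on untyped terms. The following Python sketch is illustrative only: the tuple encoding of terms and the helper names (`subst`, `spine`, `sn_size`) are assumptions of this example and do not come from the paper. `sn_size` counts the rule applications in the derivation, so it returns a value only when such a derivation exists (and Python's recursion limit is not exceeded).

```python
import itertools

# Hypothetical term encoding: ("var", x), ("lam", x, body), ("app", s, t).
_fresh = itertools.count()

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, s):
    """Capture-avoiding substitution t[x <- s]."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    y, body = t[1], t[2]
    if y == x:
        return t
    if y in free_vars(s):
        # Assumption: names of the form "<name>_<number>" are not used by the caller.
        z = f"{y}_{next(_fresh)}"
        body, y = subst(body, y, ("var", z)), z
    return ("lam", y, subst(body, x, s))

def spine(t):
    """Split t into its head and the list of arguments it is applied to."""
    args = []
    while t[0] == "app":
        args.append(t[2])
        t = t[1]
    return t, args[::-1]

def sn_size(t):
    """Number of rule applications in the Definition 1 derivation of t."""
    head, args = spine(t)
    if head[0] == "var":                      # rule 1: x(t_1)...(t_n)
        return 1 + sum(sn_size(a) for a in args)
    if not args:                              # rule 3: lambda x. t
        return 1 + sn_size(head[2])
    t0, rest = args[0], args[1:]              # rule 2: head redex
    contracted = subst(head[2], head[1], t0)
    for a in rest:
        contracted = ("app", contracted, a)
    return 1 + sn_size(contracted) + sn_size(t0)

identity = ("lam", "x", ("var", "x"))
print(sn_size(("app", identity, ("var", "y"))))  # 3
```

The second premise of rule 2 (the argument t_0 itself) is what makes the definition capture strong rather than weak normalisation: the argument must be checked even when the body discards it.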
So that our proofs of strong normalisation are meaningful, it is necessary
to check that our definition implies the usual one; the converse also holds.
Lemma 2 A term t is strongly normalising if and only if there is no infinite
sequence of β-reductions starting from t.
Proof 3 The ‘only if’ part is easily seen by induction over the derivation of
t; the only non-trivial case is to show that if s[x ← t_0](t_1) · · · (t_n) and t_0 have
no infinite reduction sequences, then neither does (λx. s)(t_0)(t_1) · · · (t_n).
The ‘if’ part follows by induction over the length of possible normalisation
sequences.
□

1 Simply Typed λ-Calculus
We shall first give a proof of normalisation for the simply typed λ-calculus;
the extension to system F is fairly straightforward. We consider the simply
typed λ-calculus with a collection o_1, o_2, … of base types. We define our
interpretation on types and then on terms. Let N be one of the base types.
Set [[o]] = N for any base type o, and [[A ⇒ B]] = [[A]] ⇒ [[B]]. Before we
define the interpretation of terms, we define some auxiliary terms used in the
translation. For any type A we shall require terms Σ_A : N ⇒ [[A]] ⇒ [[A]],
C_A : [[A]] and ℵ_A : [[A]] ⇒ N. We introduce constants Σ and C of type
N ⇒ N ⇒ N and N respectively, and define
    Σ_o = Σ,
    C_o = C,
    ℵ_o = λx. x,
for base types o, and for function types A ⇒ B:
    Σ_{A⇒B} = λi^N. λf^{[[A]]⇒[[B]]}. λx^{[[A]]}. Σ_B(i)(f x),
    C_{A⇒B} = λx^{[[A]]}. Σ_B(ℵ_A x)(C_B),
    ℵ_{A⇒B} = λf^{[[A]]⇒[[B]]}. ℵ_B(f C_A).
These three terms correspond closely to properties of reducibility predicates
used in traditional normalisation proofs; they are essentially realisers of the
proofs of these properties.
Given a of type [[A]] we define ℵ^A a of type N by
    ℵ^o n = n,
    ℵ^{A⇒B} f = ℵ^B (f C_A).
Now given a term t of type A we define a term ([t]) of type [[A]] as follows:
1. If x is a variable of type A then let ([x]) be a variable of type [[A]].
2. ([s(t)]) = ([s])(([t])).
3. If x is a free variable of type A and t has type B then ([λx. t]) =
λ([x]). Σ_B(ℵ_A x)(([t])).
Given a term t of type A with free variables x_1 : B_1, …, x_n : B_n, define
    |t| = ℵ^A ([t])[([x̄]) ← C_B̄].
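
The translation and the measure |t| can be evaluated directly once a model is fixed. The Python sketch below is illustrative only. It assumes the numeric reading Σ(x)(y) = x + y + 1 and C = 0 from Proof 9, and it assumes that λ-binders carry their argument type so that the type B in clause 3 can be computed. The names `type_of`, `translate` and `measure` are hypothetical.

```python
# Types: a string is a base type, ("fn", A, B) is A => B.
# Terms: ("var", x), ("lam", x, A, body), ("app", s, t).

def SIGMA(i):
    return lambda n: i + n + 1       # assumed reading of the constant Sigma

C = 0                                # assumed reading of the constant C

def sigma(ty):
    if isinstance(ty, str):
        return SIGMA
    return lambda i: lambda f: lambda x: sigma(ty[2])(i)(f(x))

def const(ty):
    if isinstance(ty, str):
        return C
    return lambda x: sigma(ty[2])(aleph(ty[1])(x))(const(ty[2]))

def aleph(ty):
    if isinstance(ty, str):
        return lambda n: n
    return lambda f: aleph(ty[2])(f(const(ty[1])))

def type_of(t, tenv):
    if t[0] == "var":
        return tenv[t[1]]
    if t[0] == "lam":
        _, x, a, body = t
        return ("fn", a, type_of(body, {**tenv, x: a}))
    fun_ty, arg_ty = type_of(t[1], tenv), type_of(t[2], tenv)
    if isinstance(fun_ty, str) or fun_ty[1] != arg_ty:
        raise TypeError("ill-typed application")
    return fun_ty[2]

def translate(t, tenv, venv):
    """Value of ([t]) in the model; venv gives values for the free variables."""
    if t[0] == "var":
        return venv[t[1]]
    if t[0] == "app":
        return translate(t[1], tenv, venv)(translate(t[2], tenv, venv))
    _, x, a, body = t
    inner = {**tenv, x: a}
    b = type_of(body, inner)
    # Clause 3: ([lambda x. t]) = lambda x. Sigma_B (aleph_A x) ([t])
    return lambda v: sigma(b)(aleph(a)(v))(translate(body, inner, {**venv, x: v}))

def measure(t, tenv):
    """|t|: substitute C_B for each free variable x : B, then apply aleph."""
    venv = {x: const(ty) for x, ty in tenv.items()}
    return aleph(type_of(t, tenv))(translate(t, tenv, venv))

env = {"y": "o", "f": ("fn", "o", "o")}
identity = ("lam", "x", "o", ("var", "x"))
print(measure(identity, {}))                     # 1
print(measure(("app", identity, ("var", "y")), env))  # 1
```

Because the terms are interpreted straight into Python functions and integers, this is closer in spirit to carrying out the translation directly into a model than to producing the syntactic term ([t]).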
The following lemma is shown by the obvious induction:
Lemma 4 ([·]) commutes with substitution: ([b[x ← a]]) = ([b])[([x]) ← ([a])].
The heart of our normalisation technique is the behaviour of the operation
| · | under reduction; the following lemma provides the calculations necessary
to use the rules of Definition 1 in deducing that a term t is SN from properties
of |t|.
Lemma 5
1. Let a be a term of type A = A_1 ⇒ · · · ⇒ A_n ⇒ B.
Then ℵ_A(a) β-reduces to ℵ^A a and Σ_A(n)(a)(a_1) · · · (a_n) β-reduces to
Σ_B(n)(a a_1 … a_n).
2. If λx. b has type A ⇒ B then |λx. b| β-reduces to Σ(ℵ^A C_A)(|b|).
3. If x is a variable of type A_1 ⇒ · · · ⇒ A_n ⇒ B, and a_i has type A_i for
i = 1, …, n, then |x(a_1) · · · (a_n)| β-reduces to
Σ(|a_1|)(· · · Σ(|a_n|)(ℵ^B C_B) · · ·).
4. |(λx. b)(a_0)(a_1) · · · (a_n)| β-reduces to Σ(|a_0|)(|b[x ← a_0](a_1) · · · (a_n)|).
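
As an illustration of how these calculations go, part 2 can be unfolded as follows. This is a sketch under stated assumptions rather than the paper's own derivation: the left-hand side is read as |λx. b|, the superscripted ℵ^A denotes the operation defined by ℵ^o n = n and ℵ^{A⇒B} f = ℵ^B(f C_A), and θ is shorthand (introduced here) for the substitution of C_{B_i} for each free variable x_i : B_i.

```latex
% Requires amsmath. Sketch of Lemma 5(2) for \lambda x.\,b : A \Rightarrow B.
\begin{align*}
|\lambda x.\,b|
  &= \aleph^{A \Rightarrow B}\bigl(([\lambda x.\,b])\theta\bigr) \\
  &= \aleph^{B}\bigl(([\lambda x.\,b])\theta \; C_A\bigr) \\
  &= \aleph^{B}\bigl((\lambda x.\,\Sigma_B(\aleph_A x)(([b])\theta))\; C_A\bigr) \\
  &\to_\beta \aleph^{B}\bigl(\Sigma_B(\aleph_A C_A)(([b])\theta[x \leftarrow C_A])\bigr) \\
  &\to_\beta^{*} \Sigma(\aleph^{A} C_A)\bigl(\aleph^{B}(([b])\theta[x \leftarrow C_A])\bigr)
     && \text{by part 1} \\
  &= \Sigma(\aleph^{A} C_A)(|b|).
\end{align*}
```

The appeal to part 1 uses the fact that ℵ^B applies its argument to C_{B_1}, …, C_{B_k} when B = B_1 ⇒ · · · ⇒ B_k ⇒ o, so that Σ_B(i)(g) applied to those arguments reduces to Σ(i)(ℵ^B g).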
Proposition 6 Let 𝒩 be the set of terms of type N that contain no free
variables but may contain Σ and C. Suppose that µ is a function from 𝒩 to
ω such that
1. µ(·) is invariant under β-conversion,
2. µ(m), µ(n) < µ(Σ(m)(n)) for all n, m ∈ 𝒩.
Then the simply typed λ-calculus is strongly normalising.
Proof 7 By the previous lemma and induction on µ(|t|).
□
Corollary 8 The simply typed λ-calculus is strongly normalising.
Proof 9 There is a model of the simply typed λ-calculus in which the base
type N is the standard natural numbers (e.g., the PER model). Take such a
model, and interpret Σ and C by Σ(x)(y) = x + y + 1 and C = 0 respectively.
Then interpreting terms in the model gives µ as required above.
□
Corollary 10 If the simply typed λ-calculus is weakly normalising, then it
is strongly normalising. This fact can be shown in weak arithmetic systems
such as Buss’s S^1_2.
Proof 11 Every normal form in the set 𝒩 can be built from Σ and C; this
fact, together with normalisation and the Church-Rosser property, suffices
to define a µ suitable for use in the proposition.
To formalise this in weak arithmetic systems we must be careful, as this
µ is not definable by a bounded formula, and thus cannot be used in an
induction statement of the weak system. However, for any term |a|, we may
take an upper bound L for a witness of the Σ^0_1 statement that |a| has a normal
form. Now µ can be defined for all the terms we need to show SN(a), by a
bounded formula involving L.
□
Inspection of the proof above shows that the use of the Church-Rosser
property is easily avoided; any normal form of |a| will suffice to bound the
induction in the proof of Proposition 6.
An alternative proof of Corollary 10 can be obtained by noting that ([a])
is a λI term, and hence ([a]) must be strongly normalising if it is weakly
normalising. Then comparison of reduction sequences shows that SN(([a]))
implies SN(a).

2 System F
The jump from Tait’s proof of normalisation for the simply typed calculus
to Girard’s proof for system F needed some ingenuity to achieve
the degree of abstraction required.
Our translation essentially lays bare the proof-theoretical structure of
these proofs; in doing so we make the extension to second, and higher-order,
cases fairly obvious; we just extend the inductive clauses of the translation
as desired.
For notational convenience, we deal with system F with the single basic
type N. The inferences in the definition of strong normalisation should be
extended with the rules

        a                a
    ---------   ,    ---------  .
      ΛX. a            a(A)

The three auxiliary terms Σ_X, C_X and ℵ_X will be variables for a type variable
X, and we will abstract them in the translated term whenever X is abstracted
in the original term. Thus we make [[X]] a type variable whenever X is, and
set
    [[ΠX. A]] = Π[[X]]. (N ⇒ [[X]] ⇒ [[X]]) ⇒ [[X]] ⇒ ([[X]] ⇒ N) ⇒ [[A]].
The definition of the auxiliary terms is now extended in a sensible fashion:
    Σ_{ΠX.A} = λi. λa. Λ[[X]]. λΣ_X. λC_X. λℵ_X. Σ_A(i)(a X Σ_X C_X ℵ_X),
    C_{ΠX.A} = Λ[[X]]. λΣ_X. λC_X. λℵ_X. C_A,
    ℵ_{ΠX.A} = λa. ℵ_{A[X←N]}(a [[N]] Σ_N C_N ℵ_N).
Also, we put
    ℵ^{ΠX.A} a = ℵ^{A[X←N]} (a([[N]])(Σ_N)(C_N)(ℵ_N)).
We shall only deal with ℵ^A for closed A, so that there is no need to define
ℵ^X for a type variable X.
The clauses defining ([t]) are extended with
    ([ΛX. a]) = Λ[[X]]. λΣ_X. λC_X. λℵ_X. Σ_A(C)(([a]))
and
    ([a(A)]) = ([a])(A)(Σ_A)(C_A)(ℵ_A).
If a is a term of type A, with free object variables x_i : B_i (i = 1, …, n) and
free type variables X_1, …, X_m, then |a| is defined to be
    ℵ^{A[X̄←N]} ([a])[x̄ ← C_B̄, [[X̄]] ← N, Σ_X̄ ← Σ_N, C_X̄ ← C_N, ℵ_X̄ ← ℵ_N].
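
The extended clauses can be exercised in the same numeric reading used for the simply typed case. The Python sketch below makes several assumptions that are not in the paper: Σ(x)(y) = x + y + 1 and C = 0 as in Proof 9; type abstraction is erased, so a polymorphic value is a function of the triple (Σ_X, C_X, ℵ_X) passed as one bundled argument instead of three curried ones; the body of ΛX. a is translated recursively; and binders carry type annotations in a hypothetical tuple encoding. The substitution A[X ← N] is realised by binding X to the triple for N in an environment.

```python
# Types: "N", ("tvar", X), ("fn", A, B), ("all", X, A) for Pi X. A.
# Terms: ("var", x), ("app", s, t), ("lam", x, A, B, body) with body : B,
#        ("tlam", X, A, body) with body : A, ("tapp", s, A).

def SIGMA(i):
    return lambda n: i + n + 1            # assumed reading of Sigma

C = 0                                     # assumed reading of C
N_TRIPLE = (SIGMA, C, lambda n: n)        # (Sigma_N, C_N, aleph_N)

def sigma(ty, te):
    if ty == "N":
        return SIGMA
    if ty[0] == "tvar":
        return te[ty[1]][0]
    if ty[0] == "fn":
        return lambda i: lambda f: lambda v: sigma(ty[2], te)(i)(f(v))
    _, x, body = ty
    return lambda i: lambda a: lambda tr: sigma(body, {**te, x: tr})(i)(a(tr))

def const(ty, te):
    if ty == "N":
        return C
    if ty[0] == "tvar":
        return te[ty[1]][1]
    if ty[0] == "fn":
        return lambda v: sigma(ty[2], te)(aleph(ty[1], te)(v))(const(ty[2], te))
    _, x, body = ty
    return lambda tr: const(body, {**te, x: tr})

def aleph(ty, te):
    if ty == "N":
        return lambda n: n
    if ty[0] == "tvar":
        return te[ty[1]][2]
    if ty[0] == "fn":
        return lambda f: aleph(ty[2], te)(f(const(ty[1], te)))
    _, x, body = ty
    return lambda a: aleph(body, {**te, x: N_TRIPLE})(a(N_TRIPLE))

def translate(t, te, venv):
    tag = t[0]
    if tag == "var":
        return venv[t[1]]
    if tag == "app":
        return translate(t[1], te, venv)(translate(t[2], te, venv))
    if tag == "lam":
        _, x, a, b, body = t
        return lambda v: sigma(b, te)(aleph(a, te)(v))(
            translate(body, te, {**venv, x: v}))
    if tag == "tlam":
        _, x, a, body = t
        def poly(tr):
            inner = {**te, x: tr}
            return sigma(a, inner)(C)(translate(body, inner, venv))
        return poly
    _, s, a = t                           # type application s(A)
    return translate(s, te, venv)((sigma(a, te), const(a, te), aleph(a, te)))

def measure(t, ty, var_types=None, type_vars=()):
    """|t| for t : ty, instantiating free type variables at N."""
    te = {x: N_TRIPLE for x in type_vars}
    venv = {x: const(b, te) for x, b in (var_types or {}).items()}
    return aleph(ty, te)(translate(t, te, venv))

X = ("tvar", "X")
poly_id = ("tlam", "X", ("fn", X, X), ("lam", "x", X, X, ("var", "x")))
print(measure(poly_id, ("all", "X", ("fn", X, X))))  # 2
```

Passing the triple at each type application is the usual dictionary-passing reading of the abstraction over Σ_X, C_X and ℵ_X.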
Versions of Proposition 6 and Corollaries 8 and 10 for system F may now
be proven in much the same manner as they were proven for the simply typed
case.

3 Extensions
The proof given extends with no surprises to higher order calculi and inductive datatypes. It has the surprising feature that a general treatment
of inductive datatypes is easier than treatment of the canonical example of
the natural numbers. Treating dependent datatypes needs some care due to
the reductions in type decorations on λ-abstractions. In fact, the translation
doesn’t use dependencies, so that, for instance, the calculus of constructions
is translated into F_ω.
References
[1] R. Gandy, Proofs of Strong Normalisation. In J. Seldin and R. Hindley (editors), To H. B. Curry: Essays on Combinatory Logic, Lambda
Calculus and Formalism. Academic Press, 1980.
[2] H. Goguen, A Typed Operational Semantics for Type Theory. Ph.D. thesis,
University of Edinburgh, 1994.
[3] J. van de Pol and H. Schwichtenberg, Strict Functionals for Termination Proofs. In M. Dezani-Ciancaglini and G. Plotkin (editors), Typed
Lambda Calculi and Applications, Lecture Notes in Computer Science
902. Springer-Verlag, 1995.