MLF
Raising ML to the Power of System F
Didier Le Botlan and Didier Rémy
INRIA-Rocquencourt
78153 Le Chesnay Cedex, France
{Didier.Le_Botlan,Didier.Remy}@inria.fr
Abstract
We propose a type system MLF that generalizes ML with first-class
polymorphism as in System F. Expressions may contain second-order type annotations. Every typable expression admits a principal
type, which however depends on type annotations. Principal types
capture all other types that can be obtained by implicit type instantiation and they can be inferred. All expressions of ML are well-typed without any annotations. All expressions of System F can be
mechanically encoded into MLF by dropping all type abstractions
and type applications, and injecting types of lambda-abstractions
into MLF types. Moreover, only parameters of lambda-abstractions
that are used polymorphically need to remain annotated.
Categories and Subject Descriptors: D.3.3 Language Constructs
and Features.
General Terms: Theory, Languages.
Keywords: Type Inference, First-Class Polymorphism, Second-Order Polymorphism, System F, ML, Type Annotations.
The quest for type inference with first-class polymorphic types
Programming languages considerably benefit from static typechecking. In practice however, types may sometimes trammel programmers, for two opposite reasons. On the one hand, type annotations may quickly become a burden to write; while they usefully
serve as documentation for toplevel functions, they also obfuscate
the code when every local function must be decorated. On the other
hand, since types are only approximations, any type system will
reject programs that are perfectly well-behaved and that could be
accepted by another more expressive one; hence, sharp programmers may be irritated in such situations.
ICFP’03, August 25–29, 2003, Uppsala, Sweden.
Copyright 2003 ACM 1-58113-756-7/03/0008 ...$5.00
Fortunately, solutions have been proposed to both of these problems. Type inference allows most type annotations to be elided, which
relieves the programmer from writing such details and simultaneously lightens programs. In parallel, more expressive type systems
have been developed, so that programmers are less often exposed
to their limitations.
Unfortunately, these two solutions often conflict. Expressive type systems tend to require an unbearable amount of type
decorations, so many of them have never progressed beyond the prototype stage. Indeed, full type inference for System F is undecidable [26].
Conversely, languages with simple type inference are still limited
in expressiveness; more sophisticated type inference engines, such
as those with subtyping constraints or higher-order unification have
not yet been proved to work well in practice.
The ML language [4] appears to be a surprisingly stable point of
equilibrium between those two forces: it combines a reasonably
powerful yet simple type system and comes with an effective type
inference engine. Besides, the ML experience made it clear that
expressiveness of the type system and a significant amount of type
inference are equally important.
Despite its success, ML could still be improved: indeed, there are
real examples that require first-class polymorphic types [25, 20, 7]
and, even though these may not occur too frequently, ML does not
offer any reasonable alternative. (The inconvenience is often underestimated, since the lack of a full-fledged language to experiment with first-class polymorphism insidiously keeps programmers
thinking in terms of ML polymorphism.)
A first approach is to extend ML with first-class second-order polymorphism [15, 25, 20, 7]. However, the existing solutions are still
limited in expressiveness, and the amount of type declarations they require makes first-class polymorphism awkward to use.
An alternative approach, initiated by Cardelli [2], is to start with
an expressive but explicitly typed language, say F^ω_<:, and perform
a sufficient amount of type inference, so that simple programs (ideally including all ML programs) would not need any type annotation at all. This led to local type inference [24], recently improved to colored local type inference [21]. These solutions are
quite impressive. In particular, they include subtyping in combination with higher-order polymorphism. However, they fail to type all
ML programs. Moreover, they also fail to provide an intuitive and
simple specification of where type annotations are mandatory.
In this work, we follow the first approach. At least, by being conservative over ML, we are guaranteed to please programmers who are
already quite happy with ML¹. We build on some previous work [7],
which has been used to add polymorphic methods to OCaml [16].
Here, we retain the same primary goal, that is to type all expressions
of System-F, providing explicit annotations when needed, and to
keep all expressions of ML unannotated. In addition, we aim at the
elimination of all backward coercions from polymorphic types to
ML-types. In particular, our goal is not to guess polymorphic types.
Our track
Church’s style System-F and ML are quite different in nature. In
ML, the elimination of polymorphism is implicitly performed at
every use occurrence of a variable bound with a polymorphic type
∀ᾱ.τ, which can then be given any instance, of the form τ[τ̄/ᾱ].
Indeed, a polymorphic type somehow represents the set of its instances. This induces an instance relation between polymorphic
types themselves. For example, the (polymorphic) type ∀ (α) α →
α is said to be more general than ∀ (α) ∀ (β) (α → β) → (α → β)
and we write ∀ (α) α → α ≼ ∀ (α) ∀ (β) (α → β) → (α → β) because all instances of the latter are also instances of the former.
Conversely, in Church’s style System F, a type ∀ (α) α → α only
stands for itself (modulo renaming of bound variables) and the elimination of polymorphism must be performed explicitly by type application (and abstraction) at the source level. A counter-part in
System F is that bound type variables may be instantiated by polymorphic types, allowing for expressive impredicative second-order
types. For example, an expression of type ∀ (α) α → α can be given
type (∀ (β) β → β) → (∀ (β) β → β) by an explicit type-application
to ∀ (β) β → β.
Unfortunately, combining implicit instantiation of polymorphic
types with second-order types raises conflicts almost immediately.
For illustration, consider the application of the function choose,
defined as λ(x) λ(y) if true then x else y, to the identity function
id. In ML, choose and id have principal types ∀ (α) α → α → α
and ∀ (α) α → α, respectively. For conciseness, we shall write
idα for α → α and σid for ∀ (α) idα . Should choose id have
type σ1 equal to ∀ (α) idα → ∀ (α) idα , obtained by keeping the
type of id uninstantiated? Or, should it have type σ2 equal to
∀ (α) (idα → idα ), obtained by instantiating the type of id to the
monomorphic type idα and generalizing α only at the end? Indeed, both σ1 and σ2 are correct types for choose id. However,
neither one is more general than the other in System F. Indeed, the
function auto defined as λ(x : σid ) x x can be typed with σ1 , as
choose id, but not with σ2 ; otherwise auto could be applied, for
instance, to the successor function, which would lead to a runtime
error. Hence, σ1 cannot be safely coerced to σ2 . Conversely, however, there is a retyping function—a function whose type erasure
η-reduces to the identity [19]—from type σ2 to type σ1 , namely,
λ(g : σ2 ) λ(x : σid ) λ(α) g α (x α). Actually, σ2 is a principal type
for choose id in Fη∗ (System F closed by η-expansion) [19].
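The operational side of this argument can be replayed in an untyped setting. The following minimal Python sketch is type-erased, so it says nothing about typability; it only shows that auto applied to a successor function fails at run time, which is why σ1 cannot safely be coerced to σ2. The Python names (choose, identity, succ, auto) are illustrative transcriptions of the terms above and are assumptions of this example.

```python
# Type-erased transcriptions of the terms discussed above (illustrative only).

def choose(x):
    # λ(x) λ(y) if true then x else y
    return lambda y: x if True else y

def identity(x):
    # id
    return x

def succ(n):
    # the successor function on integers
    return n + 1

def auto(x):
    # λ(x : σid) x x, with the annotation erased
    return x(x)

# choose id accepts a monomorphic argument such as succ ...
print(choose(identity)(succ)(3))

# ... and also a polymorphic one such as auto.
print(choose(identity)(auto)(identity) is identity)

# auto requires a genuinely polymorphic argument: applying it to succ
# evaluates succ(succ), which fails at run time.
try:
    auto(succ)
except TypeError as exc:
    print("runtime error:", exc)
```

In the sketch, choose(identity) always returns identity, so both uses succeed; only auto(succ) goes wrong, matching the claim that auto cannot be given type σ2.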
While the argument of auto must be at least as polymorphic as σid ,
the argument of the function choose id need not be polymorphic:
it may be any instance τ of σid and the type of the return value is
then τ. We could summarize these constraints by saying that:
auto : ∀ (α = σid) α → α
choose id : ∀ (α ≥ σid) α → α
The type given to choose id captures the intuition that this application has type τ → τ for any instance τ of σid. This form of quantification makes it possible to postpone the decision of whether σid should
be instantiated as soon as possible or kept polymorphic as long as
possible. The bound of α in ∀ (α ≥ σid) α → α, which is said to
be flexible, can be weakened either by instantiating σid or by replacing ≥ by =. Both forms of weakening can be captured by an
appropriate instance relation ≼ between types. In a binder of the
form (α = σ) the bound σ, which is said to be rigid, cannot be instantiated any longer. Intuitively, the type ∀ (α = σ) σ′ stands for
the System-F type σ′[σ/α].
¹ On a practical level, this would also ensure upward compatibility of existing code, although translating tools could always be provided.
Finally, both choose id succ and choose id auto are well-typed,
taking int → int or σid for the type of α, respectively. In fact,
the type ∀ (α ≥ σid ) α → α happens to be a principal type for
choose id in MLF. This type summarizes in a compact way the part
of typechecking choose id that depends on the context in which it
will be used: some typing constraints have been resolved definitely
and forgotten; others, such as “α is any instance of σid ”, are kept
unresolved. In short, MLF provides richer types with constraints on
the bounds of variables so that instantiation of these variables can
be delayed until there is a principal way of instantiating them.
A technical road-map
The instantiations between types used above remain to be captured
formally within an appropriate relation ≼. Indeed, the instance relation plays a crucial role in type inference, via a typing rule INST
stating that any expression a of type σ in a context Γ has also type
σ′ in the same context whenever σ ≼ σ′. Intuitively, the larger the
relation ≼ is, the more flexibility is left for inference, and, usually,
the harder the inference algorithm is.
Unsurprisingly, the “smallest reasonable relation” ≼ that validates
all the instantiations used above leads to undecidable type inference, for full type inference in System-F is undecidable. Still, the
relation ≼ induces an interesting variant UMLF that has the same
expressiveness as MLF but requires no type annotations at all. Fortunately, the relation ≼ can be split into a composition of relations ⊐ ⊑ ⊐ where uses of the relation ⊑ can be inferred as long
as all applications of ⊐ are fully explicit: this sets a clear distinction between explicit and inferred type information, which is the
essence of MLF.
Unfortunately, subject reduction does not hold in MLF for a simple
notion of reduction (non-local computation of annotations would
be required during reduction). Thus, we introduce an intermediate
variant MLF⋆ where only place holders for ⊐ are indicated. For example, using the symbol ⋆ in place of polytypes, λ(x : ⋆) x x belongs
to MLF⋆ since λ(x : σid) x x belongs to MLF (and of course, λ(x) x x
belongs to UMLF). Subject reduction and progress are proved for
MLF⋆ and type soundness follows for MLF⋆ and, indirectly, for MLF.
In fact, we abstract the presentation of MLF⋆ over a collection of
primitives so that MLF can then be embedded into MLF⋆ by treating
type annotations as an appropriate choice of primitives and disallowing annotation place holders in source terms. Thus, although
our practical interest is the system MLF, most of the technical developments are pursued in MLF⋆.
Unsurprisingly, neither UMLF nor MLF⋆ admits principal types.
Conversely, every expression typable in MLF admits a principal
type. Of course, principal types depend on type annotations in
the source term. More precisely, if an expression is not typable
in MLF, it may sometimes be typable by adding extra type annotations. Moreover, two different type annotations may lead to two incomparable principal types. As an example, the expression λ(x) x x
is not typable in MLF, while both expressions λ(x : ∀α.α) x x and
λ(x : ∀α.α → α) x x are typable, with incomparable types. Adding
a (polymorphic) type annotation to a typable expression may also
lead to a new type that is not comparable with the previous one.
This property should not be surprising since it is inherent to second-order polymorphism, which we keep explicit (remember that we
only infer first-order polymorphism in the presence of second-order
types). Still, the gain is the elision of most type annotations, via the
instance relation ⊑.
The paper is organized as follows. In Section 1, we describe types
and the relations ⊏ and ⊑. The syntax and the static and dynamic semantics of MLF⋆ are described in Section 2. Section 3
presents formal properties, including type soundness for MLF⋆ and
type inference for MLF. Section 4 introduces explicit type annotations. A comparison with System-F is drawn in Section 5. In Section 6, we discuss expressiveness, language extensions, and related
works. For the sake of readability, unification and type inference algorithms have been moved to the appendices. Due to lack of space,
all proofs are omitted.
“Monomorphic abstraction of polymorphic types”
In our proposal, ML-style polymorphism, as in the type of choose
or id, can be fully inferred. (We will show that all ML programs
remain typable without type annotations.) Unsurprisingly, some
polymorphic functions cannot be typed without annotations. For
instance, λ(x) x x cannot be typed in MLF. In particular, we do not
infer types for function arguments that are used polymorphically.
Fortunately, such arguments can be annotated with a polymorphic
type, as illustrated in the definition of auto given above. Once defined, a polymorphic function can be manipulated by another unannotated function, as long as the latter does not use polymorphism,
which is then retained. This is what we call “monomorphic abstraction of polymorphic types”. For instance, both id auto and
choose id auto remain of type ∀ (α = σid) α → α (the type σid
of auto is never instantiated) and neither choose nor id requires
any type annotation. Finally, polymorphic functions can be used by
implicit instantiation, much as in ML.
To summarize, a key feature of MLF is that type variables can always be implicitly instantiated by polymorphic types. This can be
illustrated by the killer-app(lication) (λ(x) x id) auto. This expression is typable in MLF as such, that is, without any type application or type annotation, except, of course, in the definition of auto itself. In fact, a generalization of this example
is the app function λ(f) λ(x) f x, whose MLF principal type is
∀ (α, β) (α → β) → α → β. It is remarkable that whenever a1 a2
is typable in MLF, so is app a1 a2, without any type annotation or
type application. This includes, of course, cases where a1 expects a polymorphic value as argument, such as in app auto id. We
find such examples quite important in practice, since they model iterators (e.g. app) applied to polymorphic functions (e.g. auto)
over structures holding polymorphic values (e.g. id).
1 Types
1.1 Syntax of types
Figure 1. Syntax of Types
τ ::= α | g^n τ1 .. τn                         Monotypes
σ ::= τ | ⊥ | ∀ (α ≥ σ) σ | ∀ (α = σ) σ        Polytypes
The syntax of types is given in Figure 1. The syntax is parameterized by an enumerable set of type variables α ∈ ϑ and a family of
type symbols g ∈ G given with their arity |g|. To avoid degenerate
cases, we assume that G contains at least a symbol of arity two (the
infix arrow →). We write g^n if g is of arity n. We also write τ̄ for
tuples of types. The polytype ⊥ corresponds to (∀α.α), intuitively.
More precisely, it will be made equivalent to ∀ (α ≥ ⊥) α.
We distinguish between monotypes and polytypes. By default, types
refer to the more general form, i.e. to polytypes. As in ML, monotypes do not contain quantifiers. Polytypes generalize ML type
schemes. Actually, ML type schemes can be seen as polytypes of
the form ∀ (α1 ≥ ⊥) . . . ∀ (αn ≥ ⊥) τ with outer quantifiers. Inner
quantifiers as in System F cannot be written directly inside monotypes. However, they can be simulated with types of the form
∀ (α = σ) σ′, which stands, intuitively, for the polytype σ′ where
all occurrences of α would have been replaced by the polytype σ.
However, our notation contains additional meaningful sharing information. Finally, the general form ∀ (α ≥ σ) σ′ intuitively stands
for the collection of all polytypes σ′ where α is an instance of σ.
Notation. We say that α has a rigid bound in (α = σ) and a
flexible bound in (α ≥ σ). A particular case of flexible bound is
the unconstrained bound (α ≥ ⊥), which we abbreviate as (α). For
convenience, we write (α ⋄ σ) for either (α = σ) or (α ≥ σ). The
symbol ⋄ acts as a meta-variable, and all occurrences of ⋄ in the
same context stand either all for = or all for ≥. To
allow independent choices we use indices ⋄1 and ⋄2 for unrelated
occurrences.
Conversion and free variables. Polytypes are considered
equal modulo α-conversion where ∀ (α ⋄ σ) σ′ binds α in σ′, but not
in σ. The set of free variables of a polytype σ is written ftv(σ) and
defined inductively as follows:
ftv(α) = {α}
ftv(g^n τ1 . . . τn) = ⋃ i=1..n ftv(τi)
ftv(⊥) = ∅
ftv(∀ (α ⋄ σ) σ′) = ftv(σ′)                     if α ∉ ftv(σ′)
ftv(∀ (α ⋄ σ) σ′) = (ftv(σ′) \ {α}) ∪ ftv(σ)    otherwise
The capture-avoiding substitution of α by τ in σ is written σ[τ/α].
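As an illustration, the grammar of Figure 1 and the inductive definition of ftv can be transcribed directly. This is a minimal sketch, not an implementation from the paper: the class names (Var, Con, Bottom, Forall), the `flexible` flag, and the `arrow` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A type variable α."""
    name: str


@dataclass(frozen=True)
class Con:
    """A constructed monotype g τ1 .. τn (hypothetical encoding)."""
    symbol: str
    args: Tuple["Mono", ...] = ()


@dataclass(frozen=True)
class Bottom:
    """The polytype ⊥."""


@dataclass(frozen=True)
class Forall:
    """A quantified polytype ∀ (α ⋄ bound) body."""
    var: str
    flexible: bool  # True for a flexible bound (≥), False for a rigid one (=)
    bound: "Poly"
    body: "Poly"


Mono = Union[Var, Con]
Poly = Union[Var, Con, Bottom, Forall]


def arrow(left, right):
    """Build the monotype left → right."""
    return Con("→", (left, right))


def ftv(t):
    """Free type variables, following the inductive definition above."""
    if isinstance(t, Var):
        return frozenset({t.name})
    if isinstance(t, Con):
        return frozenset().union(*(ftv(a) for a in t.args))
    if isinstance(t, Bottom):
        return frozenset()
    if isinstance(t, Forall):
        body = ftv(t.body)
        if t.var not in body:
            # The bound of an unused binder does not contribute.
            return body
        return (body - {t.var}) | ftv(t.bound)
    raise TypeError(f"not a type: {t!r}")


# σid = ∀ (α ≥ ⊥) α → α is closed.
sigma_id = Forall("a", True, Bottom(), arrow(Var("a"), Var("a")))
print(ftv(sigma_id))
```

Note the asymmetry in the last clause: when α does not occur in the body, the free variables of its bound are ignored, which is consistent with unused binders being removable by Rule EQ-FREE.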
EXAMPLE 1. The syntax of types only allows quantifiers to be
outermost, as in ML, or in the bound of other bindings. Therefore, the type ∀ α · (∀ β · (τ[β] → α)) → α of System F² cannot
be written directly. (Here, τ[β] means a type τ in which the variable β occurs.) However, it could be represented by the type
∀ (α) ∀ (β′ = ∀ (β) τ[β] → α) β′ → α. In fact, all types of System F can easily be represented as polytypes by recursively binding all occurrences of inner polymorphic types to fresh variables
beforehand; an encoding from System F into MLF is given in Section 5.1.
Types may be instantiated implicitly as in ML along an instance
relation ≼. As explained above, we decompose ≼ into ⊐ ⊑ ⊐. In
Section 1.2, we first define an equivalence relation between types,
which is the kernel of both ⊏ and ⊑. In Section 1.3, we define the
relation ⊏ that is the inverse of ⊐. The instance relation ⊑, which
contains ⊏, is defined in Section 1.4.
² We write ∀ α · τ for types of System F, so as to avoid confusion.
1.2 Type equivalence
The order of quantifiers and some other syntactical notations are
not always meaningful. Such syntactic artifacts are captured by a
notion of type equivalence. Type equivalence and all other relations
between types are relative to a prefix that specifies the bounds of
free type variables.
DEFINITION 1 (PREFIXES). A prefix Q is a sequence of bindings (α1 ⋄1 σ1) . . . (αn ⋄n σn) where variables α1, . . . αn are pairwise distinct and form the domain of Q, which we write dom(Q).
The order of bindings in a prefix is significant: bindings are meant
to be read from left to right; furthermore, we require that variables
αj do not occur free in σi whenever i ≤ j. Since α1, . . . αn are pairwise distinct, we can unambiguously write (α ⋄ σ) ∈ Q to mean that
Q is of the form (Q1, α ⋄ σ, Q2). We also write ∀ (Q) σ for the type
∀ (α1 ⋄1 σ1) . . . ∀ (αn ⋄n σn) σ. (Note that the αi can be renamed in the
type ∀ (Q) σ, but not in the prefix Q.)
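The side conditions of Definition 1 can be checked mechanically. The sketch below is only an illustration under a simplifying assumption: each binding is abstracted to a triple (variable name, bound kind, free variables of its bound), so the bound itself is not represented. The names `well_formed` and `lookup` are hypothetical.

```python
from typing import FrozenSet, List, Optional, Tuple

# A binding (α ⋄ σ), abstracted as (name of α, "=" or "≥", ftv(σ)).
Binding = Tuple[str, str, FrozenSet[str]]


def well_formed(prefix: List[Binding]) -> bool:
    """Check the two requirements of Definition 1.

    1. The bound variables are pairwise distinct.
    2. αj does not occur free in σi whenever i ≤ j, i.e. a bound may only
       mention variables introduced strictly to its left.
    """
    names = [name for name, _, _ in prefix]
    if len(set(names)) != len(names):
        return False
    for i, (_, kind, free_vars) in enumerate(prefix):
        if kind not in ("=", "≥"):
            return False
        if free_vars & set(names[i:]):
            return False
    return True


def lookup(prefix: List[Binding], name: str) -> Optional[Tuple[str, FrozenSet[str]]]:
    """Return the unique binding of `name` in the prefix, if any.

    Uniqueness is what makes the notation (α ⋄ σ) ∈ Q unambiguous.
    """
    for bound_name, kind, free_vars in prefix:
        if bound_name == name:
            return kind, free_vars
    return None


# (α ≥ ⊥, β = σ) where σ mentions α: well-formed when read left to right.
q = [("a", "≥", frozenset()), ("b", "=", frozenset({"a"}))]
print(well_formed(q), lookup(q, "b"))
```

Variables that a bound mentions but that the prefix does not bind are allowed by this check; they are simply free in the prefix.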
DEFINITION 2 (EQUIVALENCE). The equivalence under prefix
is a relation on triples composed of a prefix Q and two types σ1 and
σ2, written (Q) σ1 ≡ σ2. It is defined as the smallest relation that
satisfies the rules of Figure 2. We write σ1 ≡ σ2 for (∅) σ1 ≡ σ2.
Rule EQ-COMM allows the reordering of independent binders;
Rule EQ-FREE eliminates unused bound variables. Rules EQ-CONTEXT-L and EQ-CONTEXT-R tell that ≡ is a congruence. Reasoning under prefixes makes it possible to break up a polytype ∀ (Q) σ and
“look inside under prefix Q”. For instance, it follows from iterations of Rule EQ-CONTEXT-R that (Q) σ ≡ σ′ suffices to show
(∅) ∀ (Q) σ ≡ ∀ (Q) σ′.
Rule EQ-MONO allows the bound of a variable to be read from the
prefix when it is (equivalent to) a monotype. An example of use
of EQ-MONO is (Q, α = τ0, Q′) α → α ≡ τ0 → τ0. Rule EQ-MONO makes no difference between ≥ and = whenever the bound
is (equivalent to) a monotype. The restriction of Rule EQ-MONO
to the case where σ0 is (equivalent to) a monotype is required for
the well-formedness of the conclusion. Moreover, it also disallows
(Q, α = σ0, Q′) α ≡ σ0 when τ is a variable α: variables with non-trivial
bounds must be treated abstractly and cannot be silently expanded. In particular, (Q) ∀ (α = σ0, α′ = σ0) α → α′ ≡ ∀ (α = σ0)
α → α does not hold.
Rule EQ-VAR expands into both ∀ (α = σ) α ≡ σ and ∀ (α ≥ σ)
α ≡ σ. The former captures the intuition that ∀ (α = σ) σ′ stands
for σ′[σ/α], which, however, is not always well-formed. The latter
may be surprising, since one could expect ∀ (α ≥ σ) α ⊑ σ to hold,
but not the converse. The inverse part of the equivalence could be
removed without changing the set of typable terms. However, it is
harmless and allows for a more uniform presentation.
The equivalence (Q) ∀ (α ⋄ τ) σ ≡ σ[τ/α] follows from Rules EQ-MONO, context rules, transitivity, and EQ-FREE, which we further
refer to as the derived rule EQ-MONO⋆.
The equivalence under (a given) prefix is a symmetric relation.
In other words, it captures reversible transformations. Irreversible
transformations are captured by an instance relation ⊑. Moreover,
we distinguish a subrelation ⊏ of ⊑ called abstraction. Inverses of
abstractions are sound for ≼ but are made explicit so as to
preserve type inference, while inverses of instance relations would,
in general, be unsound for ≼.

Figure 2. Type equivalence under prefix
All rules are considered symmetrically.

EQ-REFL
(Q) σ ≡ σ

EQ-TRANS
(Q) σ1 ≡ σ2    (Q) σ2 ≡ σ3
──────────────────────────
(Q) σ1 ≡ σ3

EQ-CONTEXT-R
(Q, α ⋄ σ) σ1 ≡ σ2
──────────────────────────
(Q) ∀ (α ⋄ σ) σ1 ≡ ∀ (α ⋄ σ) σ2

EQ-CONTEXT-L
(Q) σ1 ≡ σ2
──────────────────────────
(Q) ∀ (α ⋄ σ1) σ ≡ ∀ (α ⋄ σ2) σ

EQ-FREE
α ∉ ftv(σ1)
──────────────────────────
(Q) ∀ (α ⋄ σ) σ1 ≡ σ1

EQ-COMM
α1 ∉ ftv(σ2)    α2 ∉ ftv(σ1)
──────────────────────────
(Q) ∀ (α1 ⋄1 σ1) ∀ (α2 ⋄2 σ2) σ ≡ ∀ (α2 ⋄2 σ2) ∀ (α1 ⋄1 σ1) σ

EQ-MONO
(α ⋄ σ0) ∈ Q    (Q) σ0 ≡ τ0
──────────────────────────
(Q) τ ≡ τ[τ0/α]

EQ-VAR
(Q) ∀ (α ⋄ σ) α ≡ σ

Figure 3. The abstraction relation

A-EQUIV
(Q) σ1 ≡ σ2
──────────────────────────
(Q) σ1 ⊏ σ2

A-TRANS
(Q) σ1 ⊏ σ2    (Q) σ2 ⊏ σ3
──────────────────────────
(Q) σ1 ⊏ σ3

A-HYP
(α1 = σ1) ∈ Q
──────────────────────────
(Q) σ1 ⊏ α1

A-CONTEXT-R
(Q, α ⋄ σ) σ1 ⊏ σ2
──────────────────────────
(Q) ∀ (α ⋄ σ) σ1 ⊏ ∀ (α ⋄ σ) σ2

A-CONTEXT-L
(Q) σ1 ⊏ σ2
──────────────────────────
(Q) ∀ (α = σ1) σ ⊏ ∀ (α = σ2) σ
1.3 The abstraction relation
DEFINITION 3. The abstraction under prefix is a relation on
triples composed of a prefix Q and two types σ1 and σ2, written³
(Q) σ1 ⊏ σ2, and defined as the smallest relation that satisfies the
rules of Figure 3. We write σ1 ⊏ σ2 for (∅) σ1 ⊏ σ2.
Rules A-CONTEXT-L and A-CONTEXT-R are context rules; note
that Rule A-CONTEXT-L does not allow abstraction under flexible
bounds. The interesting rule is A-HYP, which replaces a polytype
σ1 by a variable α1, provided α1 is rigidly bound to σ1 in Q.
Remarkably, rule A-HYP is not reversible. In particular, (α = σ) ∈
Q does not imply (Q) α ⊏ σ, unless σ is (equivalent to) a monotype.
This asymmetry is essential, since uses of ⊏ will be inferred, but
uses of ⊐ will not. Intuitively, the former consists in abstracting
the polytype σ as the name α (after checking that α is declared as
an alias for σ in Q). The latter consists in revealing the polytype
abstracted by the name α. An abstract polytype, i.e. a variable
bound to a polytype in Q, can only be manipulated by its name, i.e.
abstractly. The polytype must be revealed explicitly (by using the
relation ⊐) before it can be further instantiated (along the relation
⊏ or ⊑). (See also examples 6 and 7.)
³ Read: σ2 is an abstraction of σ1 (or σ1 is a revelation of σ2) under prefix Q.
Figure 4. Type instance

I-ABSTRACT
(Q) σ1 ⊏ σ2
──────────────────────────
(Q) σ1 ⊑ σ2

I-TRANS
(Q) σ1 ⊑ σ2    (Q) σ2 ⊑ σ3
──────────────────────────
(Q) σ1 ⊑ σ3

I-HYP
(α1 ≥ σ1) ∈ Q
──────────────────────────
(Q) σ1 ⊑ α1

I-BOT
(Q) ⊥ ⊑ σ

I-CONTEXT-R
(Q, α ⋄ σ) σ1 ⊑ σ2
──────────────────────────
(Q) ∀ (α ⋄ σ) σ1 ⊑ ∀ (α ⋄ σ) σ2

I-CONTEXT-L
(Q) σ1 ⊑ σ2
──────────────────────────
(Q) ∀ (α ≥ σ1) σ ⊑ ∀ (α ≥ σ2) σ

I-RIGID
(Q) ∀ (α ≥ σ1) σ ⊑ ∀ (α = σ1) σ

EXAMPLE 4. The instance relation covers an interesting case of
type isomorphism [3]. In System F, type ∀ α · τ′ → τ is isomorphic⁵
to τ′ → ∀ α · τ whenever α is not free in τ′. In MLF, the two corresponding polytypes are not equivalent but in an instance relation.
Precisely, ∀ (α′ ≥ ∀ (α) τ) τ′ → α′ is more general than ∀ (α) τ′ → τ,
as shown by the following derivation:
∀ (α′ ≥ ∀ (α) τ) τ′ → α′
⊑ ∀ (α) ∀ (α′ ≥ τ) τ′ → α′     by I-UP
≡ ∀ (α) τ′ → τ                 by EQ-MONO⋆
EXAMPLE 2. The abstraction (α = σ) ∀ (α = σ) σ′ ⊏ σ′ is derivable: on the one hand, (α = σ) σ ⊏ α holds by A-HYP, leading to
(α = σ) ∀ (α = σ) σ′ ⊏ ∀ (α = α) σ′ by A-CONTEXT-L; on the other
hand, (α = σ) ∀ (α = α) σ′ ≡ σ′ holds by EQ-MONO⋆. Hence, we
conclude by A-EQUIV and A-TRANS.
1.4 The instance relation
The instance relation coincides with the one of ML on ML-types. In particular, ∀ (ᾱ) τ0 ⊑ τ1 if and only if τ1 is of the form
τ0[τ̄/ᾱ].
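On ML types, the characterization above (τ1 is of the form τ0[τ̄/ᾱ]) amounts to one-way matching of τ1 against τ0. The sketch below illustrates this with syntactic equality only; it does not implement the relation ⊑ of MLF, and equivalence under a prefix is ignored. The encoding is an assumption of the example: a type variable is a string, and a constructed type is a tuple whose first component is the type symbol. The names `match` and `ml_instance` are hypothetical.

```python
def match(pattern, target, quantified, subst):
    """Extend `subst` so that pattern[subst] == target, if possible.

    Only variables listed in `quantified` may be instantiated.
    """
    if isinstance(pattern, str):
        if pattern in quantified:
            if pattern in subst:
                # Every occurrence must receive the same instance.
                return subst[pattern] == target
            subst[pattern] = target
            return True
        # A free variable only matches itself.
        return pattern == target
    if isinstance(target, str):
        return False
    if pattern[0] != target[0] or len(pattern) != len(target):
        return False
    return all(match(p, t, quantified, subst)
               for p, t in zip(pattern[1:], target[1:]))


def ml_instance(quantified, tau0, tau1):
    """Return a substitution witnessing that tau1 is tau0[τ̄/ᾱ], else None."""
    subst = {}
    return subst if match(tau0, tau1, set(quantified), subst) else None


INT = ("int",)
BOOL = ("bool",)
id_body = ("→", "a", "a")  # body of ∀ (α) α → α

print(ml_instance(["a"], id_body, ("→", INT, INT)))
print(ml_instance(["a"], id_body, ("→", INT, BOOL)))
```

The second call fails because α would have to be instantiated to both int and bool.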
DEFINITION 4. The instance under prefix is a relation on triples
composed of a prefix Q and two types σ1 and σ2, written⁴ (Q)
σ1 ⊑ σ2. It is defined as the smallest relation that satisfies the rules
of Figure 4. We write σ1 ⊑ σ2 for (∅) σ1 ⊑ σ2.
Rule I-BOT means that ⊥ behaves as a least element for the instance relation. Rules I-CONTEXT-L and I-RIGID mean that flexible bounds can be instantiated and changed into rigid bounds. Conversely, instantiation cannot occur under rigid bounds, except when
it is an abstraction, as described by Rule A-CONTEXT-L.
The interesting rule is I-HYP, the counterpart of rule A-HYP,
which replaces a polytype σ1 by a variable α1, provided σ1 is a
flexible bound of α1 in Q.
EXAMPLE 3. The instance relation (α ≥ σ) ∀ (α ≥ σ) σ′ ⊑ σ′
holds. The derivation follows the one of Example 2 but uses I-HYP
and I-CONTEXT-L instead of A-HYP and A-CONTEXT-L. More
generally, (QQ′) ∀ (Q′) σ ⊑ σ holds for any Q, Q′, and σ, which we
refer to as Rule I-DROP.
The relation (Q) ∀ (α1 ≥ ∀ (α2 ⋄ σ2) σ1) σ ⊑ ∀ (α2 ⋄ σ2) ∀ (α1 ≥ σ1)
σ holds whenever α2 ∉ ftv(σ), which we further refer to as the
derived rule I-UP.
As expected, the equivalence is the kernel of the instance relation:
LEMMA 1 (EQUIVALENCE). For any prefix Q and types σ and
σ′, we have (Q) σ ≡ σ′ if and only if both (Q) σ ⊑ σ′ and (Q) σ′ ⊑ σ
hold.
(However, as opposed to type containment [19], the instance relation cannot express any form of contravariance.)
1.5 Operation on prefixes and unification
Rules A-CONTEXT-L and I-CONTEXT-L show that two types
∀ (Q) σ and ∀ (Q′) σ with the same suffix can be in an instance
relation, for any suffix σ. This suggests a notion of inequality between prefixes alone. However, because prefixes are “open”, this
relation must be defined relative to a set of variables that lists (a
superset of) the free type variables of σ. In this context, a set of
type variables is called an interface and is written with letter I.
DEFINITION 5 (PREFIX INSTANCE). A prefix Q is an instance
of a prefix Q′ under the interface I, and we write Q ⊑^I Q′, if and
only if ∀ (Q) σ ⊑ ∀ (Q′) σ holds for all types σ whose free variables
are included in I. We omit I in the notation when it is equal to
dom(Q). We define Q ≡^I Q′ and Q ⊏^I Q′ similarly.
Prefixes can be seen as a generalization of the notion of substitutions to polytypes. Then, Q ⊑ Q′ captures the usual notion of (a
substitution) Q being more general than (a substitution) Q′.
DEFINITION 6 (UNIFICATION). A prefix Q′ unifies monotypes
τ1 and τ2 under Q if and only if Q ⊑ Q′ and (Q′) τ1 ≡ τ2.
The unification algorithm, called unify, is defined in Appendix A.
THEOREM 1. For any prefix Q and monotypes τ1 and τ2, unify
(Q, τ1, τ2) returns the smallest prefix (for the relation ⊑^dom(Q)) that
unifies τ1 and τ2 under Q, or fails if there exists no prefix Q′ that
unifies τ1 and τ2 under Q.
The following lemma shows that first-order unification lies under
MLF unification.
LEMMA 3. If (Q) τ1 ≡ τ2, then there exists a substitution Q̂ (depending only on Q) that unifies τ1 and τ2.
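For reference, the first-order unification that Lemma 3 alludes to can be sketched as follows. This is the textbook algorithm on monotypes with an occurs check; it is not the algorithm unify of Appendix A, which operates on prefixes. The encoding (strings for type variables, tuples headed by a type symbol for constructed types) and the function names are assumptions of the example.

```python
def walk(t, subst):
    """Follow variable bindings in `subst` until a non-bound term is reached."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t


def occurs(var, t, subst):
    """Does `var` occur in `t` under `subst`?"""
    t = walk(t, subst)
    if isinstance(t, str):
        return t == var
    return any(occurs(var, arg, subst) for arg in t[1:])


def unify_first_order(t1, t2):
    """Return a (triangular) substitution unifying t1 and t2, or None."""
    subst = {}
    stack = [(t1, t2)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if isinstance(a, str):
            if occurs(a, b, subst):
                return None  # would build an infinite type
            subst[a] = b
        elif isinstance(b, str):
            if occurs(b, a, subst):
                return None
            subst[b] = a
        elif a[0] == b[0] and len(a) == len(b):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None  # clash between distinct type symbols
    return subst


def apply_subst(t, subst):
    """Apply a triangular substitution exhaustively to a monotype."""
    t = walk(t, subst)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply_subst(arg, subst) for arg in t[1:])


INT = ("int",)
s = unify_first_order(("→", "a", "b"), ("→", INT, "a"))
print(apply_subst(("→", "a", "b"), s))
```

Under the substitution found, both sides become int → int.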
The instance relation coincides with equivalence on monotypes,
which captures the intuition that “monotypes are really monomorphic”.
LEMMA 2. For all prefixes Q and monotypes τ and τ′, we have
(Q) τ ⊑ τ′ if and only if (Q) τ ≡ τ′.
⁴ Read: σ2 is an instance of σ1 (or σ1 is more general than σ2) under prefix Q.
2 The core language
As explained in the introduction we formalize the language MLF
as a restriction to the more permissive language MLF? . We assume
given a countable set of variables, written with letter x, and a countable set of constants c ∈ C . Every constant c has an arity |c|. A
constant is either a primitive f or a constructor C. The distinction
is, there exists a function (η, β)-reducible to the identity
that transforms one into the other, and conversely.
5 That
Figure 5. Expressions of MLF?
a ::= x | c | λ(x) a | a a | let x = a in a
| (a : ?)
Figure 6. Typing rules for MLF and MLF?
Terms
Oracles
c ::= f | C
Constants
z ::= x | c
Identifiers
VAR
The language MLF is the restriction of MLF? to expressions that do
not contain oracles.
2.1
Static semantics
Typing contexts, written with letter Γ are lists of assertions of the
form z : σ. We write z : σ ∈ Γ to mean that z is bound in Γ and
z : σ is its rightmost binding in Γ. We assume given an initial typing
context Γ0 mapping constants to closed polytypes.
Typing judgments are of the form (Q) Γ ` a : σ. A tiny difference
with ML is the presence of the prefix Q that assigns bounds to type
variables appearing free in Γ or σ. By comparison, this prefix is
left implicit in ML because all free type variables have the same
(implicit) bound ⊥. In MLF, we require that σ and all polytypes of
Γ be closed with respect to Q, that is, ftv(Γ) ∪ ftv(σ) ⊆ dom(Q).
L ET
(Q) Γ, x : τ0 ` a : τ
(Q) Γ ` λ(x) a : τ0 → τ
(Q, α ¦ σ) Γ ` a : σ0
α∈
/ ftv(Γ)
(Q) Γ ` a : ∀ (α ¦ σ) σ0
I NST
(Q) Γ ` a : σ
(Q) σ v σ0
(Q) Γ ` a : σ0
(Q) Γ ` a : σ
(Q) σ A
− σ0
(Q) Γ ` a : σ0
U-I NST
(Q) Γ ` a : σ
(Q) σ A
−vA
− σ0
(Q) Γ ` a : σ0
As in ML, there is an important difference between rules F UN and
L ET: while typechecking their bodies, a let-bound variable can be
assigned a polytype, but a λ-bound variable can only be assigned
O RACLE
(Q) Γ ` a : σ
(Q) σ A
− σ0
(Q) Γ ` (a : ?) : σ0
a monotype in Γ. Indeed, the latter must be guessed while the former can be inferred from the type of the bound expression. This
restriction is essential to enable type inference. Notice that a λbound variable can refer to a polytype abstractly via a type variable
α bound to a polytype σ in Q. However, this will not allow to take
different instances of σ while typing the body of the abstraction,
unless the polytype bound σ of α is first revealed by an oracle. Indeed, the only possible instances of α under a prefix Q that contains
the binding (α = σ) are types equivalent to α under Q. However,
(Q) α ≡ σ does not hold. Thus, if x : α is in the typing context
Γ, the only way of typing x (modulo equivalence) is (Q) Γ ` x : α,
whereas (Q) Γ ` x : σ is not derivable. Conversely, (Q) Γ ` (x : ?) : σ
is derivable, since (Q) α A
− σ.
ML as a subset of MLF. ML can be embedded into MLF
by restricting all bounds in the prefix Q to be unconstrained. Rules
GEN and INST are then exactly those of ML. Hence, any closed
program typable in ML is also typable in MLF.
EXAMPLE 5. This first example of typing illustrates the use of
polytypes in typing derivations: we consider the simple expression
K′ defined by λ(x) λ(y) y. Following ML, one possible typing
derivation is (we recall that (α, β) stands for (α ≥ ⊥, β ≥ ⊥)):

(α, β) x : α, y : β ⊢ y : β
-------------------------------- FUN
(α, β) x : α ⊢ λ(y) y : β → β
-------------------------------- FUN
(α, β) ⊢ K′ : α → (β → β)
-------------------------------- GEN
⊢ K′ : ∀ (α, β) α → (β → β)

There is, however, another typing derivation that infers a more general
type for K′ in MLF (for conciseness we write Q for α, β ≥ σid):

(Q, γ) x : α, y : γ ⊢ y : γ
-------------------------------- FUN
(Q, γ) x : α ⊢ λ(y) y : γ → γ
-------------------------------- GEN
(Q) x : α ⊢ λ(y) y : σid      (Q) σid ⊑ β
-------------------------------------------- INST
(Q) x : α ⊢ λ(y) y : β
-------------------------------- FUN
(Q) ⊢ K′ : α → β
-------------------------------- GEN
⊢ K′ : ∀ (Q) α → β

Notice that the polytype ∀ (α, β ≥ σid) α → β is more general than
∀ (α, β) α → (β → β), which follows from Example 4.
between constructors and primitives lies in their dynamic semantics:
primitives (such as +) are reduced when fully applied, while
constructors (such as cons) represent data structures, and are not
reduced. We use letter z to refer to identifiers, i.e. either variables
or constants.
Expressions of MLF?, written with letter a, are described in Figure 5.
Expressions are those of ML extended with oracles. An
oracle, written (a : ?), is simply a placeholder for an implicit type
annotation around the expression a. Intuitively, oracles are places
where the type inference algorithm must call an “oracle” to fill
the hole with a type annotation. Equivalently, the oracles can be
replaced by explicit type annotations before type inference. Explicit
annotations (a : σ), which are described in Section 4, are
actually syntactic sugar for applications (σ) a where (σ) are constants.
Examples in the introduction also use the notation λ(x : σ) a,
which does not appear in Figure 5, because this is, again, syntactic
sugar for λ(x) let x = (x : σ) in a. Similarly, λ(x : ?) a means
λ(x) let x = (x : ?) in a.
Typing rules. The typing rules of MLF? and MLF are described
in Figure 6. They correspond to the typing rules of ML modulo the
richer types, the richer instance relation, and the explicit binding of
free type variables in judgments. In addition, Rule ORACLE allows
for the revelation of polytypes, that is, the transformation of types
along the inverse of the abstraction relation. (This rule would have
no effect in ML where abstraction is the same as equivalence.) For
UMLF, it suffices to replace Rule ORACLE by U-ORACLE given
below or, equivalently, combine ORACLE with INST into U-INST.

VAR
z : σ ∈ Γ
--------------------------------
(Q) Γ ⊢ z : σ

APP
(Q) Γ ⊢ a1 : τ2 → τ1      (Q) Γ ⊢ a2 : τ2
--------------------------------
(Q) Γ ⊢ a1 a2 : τ1

LET
(Q) Γ ⊢ a1 : σ      (Q) Γ, x : σ ⊢ a2 : τ
--------------------------------
(Q) Γ ⊢ let x = a1 in a2 : τ
2.2 Syntax directed presentation
As in ML, we can replace the typing rules of MLF? by a set of
equivalent syntax-directed typing rules, which are given in Figure 7.
Naively, a sequence of non-syntax-directed typing Rules GEN and
INST should be placed around any other rule. However, many of
these occurrences can be proved unnecessary by following an appropriate
strategy. For instance, in ML, judgments are maintained
instantiated as much as possible and are only generalized on the
left-hand side of Rule LET. In MLF?, this strategy would require
more occurrences of generalization. Instead, we prefer to maintain
typing judgments generalized as much as possible. Then, it suffices
to allow Rule GEN right after Rule FUN and to allow Rule INST
right before Rule APP (see Rules FUNᴼ and APPᴼ).

Figure 7. Syntax directed typing rules

VARᴼ
z : σ ∈ Γ
--------------------------------
(Q) Γ ⊢ᴼ z : σ

FUNᴼ
(QQ′) Γ, x : τ0 ⊢ᴼ a : σ      dom(Q′) ∩ Γ = ∅
--------------------------------
(Q) Γ ⊢ᴼ λ(x) a : ∀ (Q′, α ≥ σ) τ0 → α

APPᴼ
(Q) Γ ⊢ᴼ a1 : σ1      (Q) Γ ⊢ᴼ a2 : σ2
(Q) σ1 ⊑ ∀ (Q′) τ2 → τ1      (Q) σ2 ⊑ ∀ (Q′) τ2
--------------------------------
(Q) Γ ⊢ᴼ a1 a2 : ∀ (Q′) τ1

LETᴼ
(Q) Γ ⊢ᴼ a1 : σ1      (Q) Γ, x : σ1 ⊢ᴼ a2 : σ2
--------------------------------
(Q) Γ ⊢ᴼ let x = a1 in a2 : σ2

ORACLEᴼ
(Q) Γ ⊢ᴼ a : σ      (Q) σ ⊐ σ′
--------------------------------
(Q) Γ ⊢ᴼ (a : ?) : σ′

EXAMPLE 6. As we claimed in the introduction, a λ-bound variable
that is used polymorphically must be annotated. Let us check
that λ(x) x x is not typable in MLF by means of contradiction. A
syntax-directed type derivation of this expression would be of the
form:

(Q) x : τ0 ⊢ᴼ x : τ0   (VARᴼ)      (Q) τ0 ⊑ ∀ (Q′) τ2 → τ1   (2)
(Q) x : τ0 ⊢ᴼ x : τ0   (VARᴼ)      (Q) τ0 ⊑ ∀ (Q′) τ2   (1)
---------------------------------------------------------------- APPᴼ
(Q) x : τ0 ⊢ᴼ x x : ∀ (Q′) τ1
---------------------------------------------------------------- FUNᴼ
(Q) ∅ ⊢ᴼ λ(x) x x : ∀ (α ≥ ∀ (Q′) τ1) τ0 → α

Applying Rule I-DROP to (2) and (1), we get respectively (QQ′)
τ0 ⊑ τ2 → τ1 and (QQ′) τ0 ⊑ τ2. Then (QQ′) τ2 → τ1 ≡ τ2 follows
by Lemma 2 and EQ-TRANS. Thus, by Lemma 3, there exists a
substitution θ such that θ(τ2) = θ(τ2 → τ1), that is, θ(τ2) =
θ(τ2) → θ(τ1), which cannot be the case.
This example shows the limit of type inference, which is actually
the strength of our system! That is, to maintain principal types
by rejecting examples where type inference would need to guess
second-order types.
EXAMPLE 7. Let us recover typability by introducing an oracle
and build a derivation for λ(x) (x : ?) x. Taking (α = σid) for Q and
α for τ0, we obtain:

(Q) x : α ⊢ᴼ x : α   (VARᴼ)      (Q) α ⊐ σid   (3)
---------------------------------------------------------------- ORACLEᴼ
(Q) x : α ⊢ᴼ (x : ?) : σid      (Q) σid ⊑ α → α      (Q) x : α ⊢ᴼ x : α   (VARᴼ)
---------------------------------------------------------------- APPᴼ
(Q) x : α ⊢ᴼ (x : ?) x : α
---------------------------------------------------------------- FUNᴼ
⊢ᴼ λ(x) (x : ?) x : ∀ (α = σid) α → α

The oracle plays a crucial role in (3): the revelation of the type
scheme σid that is the bound of the type variable α used in the type
of x. We have (Q) σid ⊑ α, indeed, but the converse relation does
not hold, so rule INST cannot be used here to replace α by its bound
σid.
2.3 Dynamic semantics
The semantics of MLF? is the standard call-by-value semantics of
ML. We present it as a small-step reduction semantics. Values and
call-by-value evaluation contexts are described below.
v ::= λ(x) a
    | f v1 . . . vn      n < |f|
    | C v1 . . . vn      n ≤ |C|
    | (v : ?)
E ::= [ ] | E a | v E | (E : ?) | let x = E in a
The reduction relation −→ is parameterized by a set of δ-rules of
the form (δ) below:
f v1 . . . vn −→ a      when |f| = n      (δ)
(λ(x) a) v −→ a[v/x]                      (βv)
let x = v in a −→ a[v/x]                  (βlet)
(v1 : ?) v2 −→ (v1 v2 : ?)                (?)
The main reduction is the β-reduction, which takes two forms: Rule
(βv) and Rule (βlet). Oracles are maintained during reduction, to
which they do not contribute: they are simply pushed out of applications
by rule (?). Finally, the reduction is the smallest relation
containing (δ), (βv), (βlet), and (?) rules that is closed by
E-congruence:
E[a] −→ E[a′] if a −→ a′                  (CONTEXT)
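The reduction rules above can be read as a small interpreter. The following Python sketch is only an illustration under stated assumptions: the term classes, the constant table CONSTS (constructors Z and S, primitive pred) and the δ-rule for pred are hypothetical and do not come from the paper, and substitution assumes that the substituted value is closed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str
@dataclass(frozen=True)
class Const:  # a primitive f or a constructor C
    name: str
@dataclass(frozen=True)
class Lam:  # λ(x) a
    param: str
    body: object
@dataclass(frozen=True)
class App:  # a1 a2
    fun: object
    arg: object
@dataclass(frozen=True)
class Let:  # let x = a1 in a2
    var: str
    bound: object
    body: object
@dataclass(frozen=True)
class Oracle:  # (a : ?)
    body: object

# Hypothetical constants, name -> (kind, arity); not taken from the paper.
CONSTS = {"Z": ("ctor", 0), "S": ("ctor", 1), "pred": ("prim", 1)}
# Hypothetical δ-rule: pred (S v) −→ v and pred Z −→ Z.
DELTA = {"pred": lambda vs: vs[0].arg if isinstance(vs[0], App) else Const("Z")}

def spine(t):
    # Split c a1 .. an into its head c and the argument list [a1, .., an].
    args = []
    while isinstance(t, App):
        args.append(t.arg)
        t = t.fun
    return t, args[::-1]

def is_value(t):
    if isinstance(t, Lam):
        return True
    if isinstance(t, Oracle):
        return is_value(t.body)
    head, args = spine(t)
    if isinstance(head, Const) and all(is_value(a) for a in args):
        kind, arity = CONSTS[head.name]
        return len(args) < arity if kind == "prim" else len(args) <= arity
    return False

def subst(t, x, v):
    # a[v/x]; v is assumed closed, so no variable capture can occur.
    if isinstance(t, Var):
        return v if t.name == x else t
    if isinstance(t, Lam):
        return t if t.param == x else Lam(t.param, subst(t.body, x, v))
    if isinstance(t, App):
        return App(subst(t.fun, x, v), subst(t.arg, x, v))
    if isinstance(t, Let):
        body = t.body if t.var == x else subst(t.body, x, v)
        return Let(t.var, subst(t.bound, x, v), body)
    if isinstance(t, Oracle):
        return Oracle(subst(t.body, x, v))
    return t  # constants

def step(t):
    # One reduction step, or None if t is a value or is stuck.
    if is_value(t):
        return None
    if isinstance(t, Let):
        if is_value(t.bound):
            return subst(t.body, t.var, t.bound)          # (βlet)
        s = step(t.bound)                                 # let x = E in a
        return None if s is None else Let(t.var, s, t.body)
    if isinstance(t, Oracle):
        s = step(t.body)                                  # (E : ?)
        return None if s is None else Oracle(s)
    if isinstance(t, App):
        if not is_value(t.fun):
            s = step(t.fun)                               # E a
            return None if s is None else App(s, t.arg)
        if not is_value(t.arg):
            s = step(t.arg)                               # v E
            return None if s is None else App(t.fun, s)
        if isinstance(t.fun, Lam):
            return subst(t.fun.body, t.fun.param, t.arg)  # (βv)
        if isinstance(t.fun, Oracle):
            return Oracle(App(t.fun.body, t.arg))         # (?)
        head, args = spine(t)
        if isinstance(head, Const) and head.name in DELTA and CONSTS[head.name][1] == len(args):
            return DELTA[head.name](args)                 # (δ)
    return None

def evaluate(t, fuel=1000):
    for _ in range(fuel):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    raise RuntimeError("out of fuel")
```

For instance, with ident = Lam("y", Var("y")), evaluating App(Lam("x", App(Oracle(Var("x")), Var("x"))), ident) first applies (βv), then pushes the oracle out of the application by rule (?), and finally reduces under the oracle, so that the result is the value Oracle(ident).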
3 Formal properties
We verify type soundness for MLF? and address type inference in
MLF.
3.1 Type soundness
Type soundness for MLF? is shown as usual by a combination of
subject reduction, which ensures that typings are preserved by reduction, and progress, which ensures that well-typed programs that
are not values can be further reduced.
To ease the presentation, we introduce a relation ⊆ between programs:
we write a ⊆ a′ if and only if every typing of a, i.e. a triple
(Q, Γ, σ) such that (Q) Γ ⊢ a : σ holds, is also a typing of a′. A
relation R on programs preserves typings whenever it is a sub-relation
of ⊆.
Of course, type soundness cannot hold without some assumptions
relating the static semantics of constants described by the initial
typing context Γ0 and their dynamic semantics.
DEFINITION 7 (HYPOTHESES). We assume that the following
three properties hold for constants.
(H0) (Arity) Each constant c ∈ dom(Γ0) has a closed type Γ0(c) of
the form ∀ (Q) τ1 → . . . τ|c| → τ and such that the top symbol
of ∀ (Q) τ is not in {→, ⊥} whenever c is a constructor.
(H1) (Subject-Reduction) All δ-rules preserve typings.
(H2) (Progress) Any expression a of the form f v1 . . . v|f| such
that (Q) Γ0 ⊢ a : σ is in the domain of (δ).
THEOREM 2 (SUBJECT REDUCTION). Reduction preserves typings.
THEOREM 3 (PROGRESS). Any expression a such that (Q) Γ0 ⊢
a : σ is a value or can be further reduced.
Combining Theorems 2 and 3 ensures that the reduction of well-typed
programs either proceeds forever or ends up with a value.
This holds for programs in MLF? but also for programs in MLF,
since MLF is a subset of MLF?. Hence MLF is also sound. However,
MLF does not enjoy subject reduction, since reduction may create
oracles. Notice, however, that oracles can only be introduced by
δ-rules.
3.2 Type inference
A type inference problem is a triple (Q, Γ, a), where all free type
variables in Γ are bound in Q. A pair (Q′, σ) is a solution to this
problem if Q ⊑ Q′ and (Q′) Γ ⊢ a : σ. A pair (Q′, σ′) is an instance
of a pair (Q, σ) if Q ⊑ Q′ and (Q′) σ ⊑ σ′. A solution of a type
inference problem is principal if all other solutions are instances of
the given one.
Figure 9 in Appendix B defines a type inference algorithm WF
for MLF. This algorithm proceeds much as type inference for ML:
the algorithm follows the syntax-directed typing rules and reduces
type inference to unification under prefixes.
THEOREM 4 (TYPE INFERENCE). The set of solutions of a solvable
type inference problem admits a principal solution. Given any
type inference problem, the algorithm WF either returns a principal
solution or fails if no solution exists.
4 Type annotations
In this section, we restrict our attention to MLF, i.e. to expressions
that do not contain oracles. Since expressions of MLF are exactly
those of ML, its expressiveness may only come from richer types
and typing rules. However, the following lemma shows that this is
not sufficient:
LEMMA 4. If the judgment (Q) Γ ⊢ a : σ holds in MLF where the
typing context Γ contains only ML types and Q contains only type
variables with unconstrained bounds, then there exists a derivation
of Γ ⊢ a : ∀ (ᾱ) τ in ML where ∀ (ᾱ) τ is obtained from σ by moving
all inner quantifiers ahead.
The inverse inclusion has already been stated in Section 2.1. In the
particular case where the initial typing context Γ0 contains only ML
types, a closed expression can be typed in MLF under Γ0 if and only
if it can be typed in ML under Γ0 . This is not true for MLF? in which
the expression λ(x) (x : ?) x is typable. Indeed, as shown below, all
terms of System F can be typed in MLF? . Fortunately, there is an
interesting choice of constants that provides MLF with the same
expressiveness as MLF? while retaining type inference. Precisely,
we provide type annotations as a collection of coercion primitives,
i.e. functions that change the type of expressions without changing
their meaning. The following example, which describes a single
annotation, should provide intuition for the general case.
EXAMPLE 8. Let f be a constant of type σ equal to ∀ (α =
σid, α′ ≥ σid) α → α′ with the δ-reduction f v −→ (v : ?). Then,
the expression a defined as λ(x) (f x) x behaves as λ(x) x x and
is well-typed, of type ∀ (α = σid) α → α. To see this, let Q
and Γ stand for (α = σid, α′ = σid) and x : α. By Rules INST,
VAR, and APP, (Q) Γ ⊢ f x : α′; hence by Rule GEN, (α = σid)
Γ ⊢ f x : ∀ (α′ = σid) α′ since α′ is not free in Γ. By Rule
EQ-VAR, we have ∀ (α′ = σid) α′ ≡ σid (under any prefix); besides,
σid ⊑ α → α under any prefix that binds α. Thus, we get
(α = σid) Γ ⊢ f x : α → α by Rule INST. The result follows by
Rules APP, FUN, and GEN.
Observe that the static effect of f in f x is (i) to enforce the type
of x to be abstracted by a variable α bound to σid in Q and (ii) to
give f x, that is x, the type σid , exactly as the oracle (x : ?) would.
Notice that the bound of α in σ is rigid: the function f expects a
value v that must have type σid (and not a strict instance of σid).
Conversely, the bound of α′ is flexible: the type of f v is σid but
may also be any instance of σid.
DEFINITION 8. We call annotations the denumerable collection
of primitives (∃ (Q) σ), of arity 1, defined for all prefixes Q and
polytypes σ closed under Q. The initial typing environment Γ0 contains
these primitives with their associated type:
(∃ (Q) σ) : ∀ (Q) ∀ (α = σ) ∀ (β ≥ σ) α → β ∈ Γ0
We may identify annotation primitives up to the equivalence of their
types.
Besides, we write (a : ∃ (Q) σ) for the application (∃ (Q) σ) a. We
also abbreviate (∃ (Q) σ) as (σ) when all bounds in Q are unconstrained. Actually, replacing an annotation (∃ (Q) σ) by (σ) preserves typability and, more precisely, preserves typings.
While annotations are introduced as primitives for simplicity of presentation, they are obviously meant to be applied. Notice that the
type of an annotation may be instantiated before the annotation is
applied. However, the annotation keeps exactly the same “revealing power” after instantiation. This is described by the following
technical lemma (the reader may take ∅ for Q0 at first).
LEMMA 5. The judgment (Q0) Γ ⊢ (a : ∃ (Q) σ) : σ0 is valid if and
only if there exists a type ∀ (Q′) σ′1 such that the judgment (Q0) Γ ⊢
a : ∀ (Q′) σ′1 holds together with the following relations: Q0Q ⊑
Q0Q′, (Q0Q′) σ′1 ⊐ σ, and (Q0) ∀ (Q′) σ ⊑ σ0.
The prefix Q of the annotation ∃ (Q) σ may be instantiated into Q′.
However, Q′ guards σ′1 ⊐ σ in (Q0Q′) σ′1 ⊐ σ. In particular, the
lemma would not hold with (Q0) ∀ (Q′) σ′1 ⊐ ∀ (Q″) σ and (Q0)
∀ (Q″) σ′1 ⊑ σ0. Lemma 5 has similarities with Rule ANNOT of
Poly-ML [7].
COROLLARY 6. The judgment (Q) Γ ⊢ (a : ?) : σ′ holds if and
only if there exists an annotation (σ) such that (Q) Γ ⊢ (a : σ) : σ′
holds.
Hence, all expressions typable in MLF? are typable in MLF as long
as all annotation primitives are in the initial typing context Γ0.
Reduction of annotations. The δ-reduction for annotations
just replaces explicit type information by oracles.
(v : ∃ (Q) σ) −→ (v : ?)
LEMMA 7 (SOUNDNESS OF TYPE ANNOTATIONS). All three
hypotheses (H0, arity), (H1, subject-reduction), and (H2, progress)
hold when primitives are the set of annotations, alone.
The annotation (∃ (Q) σ) can be simulated by λ(x) (x : ?) in MLF?,
both statically and dynamically. Hence annotation primitives are
unnecessary in MLF?.
Syntactic sugar. As mentioned in Section 2, λ(x : σ) a is
syntactic sugar for λ(x) let x = (x : σ) in a. The derived typing
rule is:

FUN?
(Q) Γ, x : σ ⊢ a : σ′      Q′ ⊑ Q
----------------------------------------------------------------
(Q) Γ ⊢ λ(x : ∃ (Q′) σ) a : ∀ (α = σ) ∀ (α′ ≥ σ′) α → α′

This rule is actually simpler than the derived annotation rule suggested
by Lemma 5, because instantiation is here left to each occurrence
of the annotated program variable x in a.
The derived reduction rule is
(λ(x : ∃ (Q) σ) a) v −→ let x = (v : ∃ (Q) σ) in a      (β?)
Values must then be extended with expressions of
the form λ(x : ∃ (Q) σ) a, indeed.
5 Comparison with System-F
We have already seen that all ML programs can be written in MLF
without annotations. In Section 5.1, we provide a straightforward
compositional translation of terms of System-F into MLF. In Section 5.2,
we then identify a subsystem of MLF, called Shallow-MLF?,
whose let-binding free version is exactly the target of the
encoding of System-F.
5.1 Encoding System-F into MLF
The types, terms, and typing contexts of System F are given below:
t ::= α | t → t | ∀ α · t
M ::= x | M M | λ(x : t) M | Λ(α)M | M t
A ::= ∅ | A, x : t | A, α
The translation of types of System-F into MLF types uses auxiliary
rigid bindings for arrow types. This ensures that there are no inner
polytypes left in the result of the translation, which would otherwise
be ill-formed. Quantifiers that are present in the original types are
translated to unconstrained bounds.
[[α]] = α
[[∀ α · t]] = ∀ (α) [[t]]
[[t1 → t2]] = ∀ (α1 = [[t1]]) ∀ (α2 = [[t2]]) α1 → α2
In order to state the correspondence between typing judgments, we
must also translate typing contexts. We write A ⊢ M : t to mean that
M has type t in typing context A in System F. The translation of A,
written [[A]], returns a pair (Q) Γ of a prefix and a typing context
and is defined inductively as follows:

[[∅]] = () ∅

[[A]] = (Q) Γ
--------------------------------
[[A, x : t]] = (Q) Γ, x : [[t]]

[[A]] = (Q) Γ      α ∉ dom(Q)
--------------------------------
[[A, α]] = (Q, α) Γ

The translation of System F terms into MLF terms forgets type
abstraction and type applications, and translates types in
term-abstractions.
[[Λ(α)M]] = [[M]]
[[M t]] = [[M]]
[[M M′]] = [[M]] [[M′]]
[[x]] = x
[[λ(x : t) M]] = λ(x : [[t]]) [[M]]
Finally, we can state the following lemma:
LEMMA 8. For any closed typing context A (that does not bind
the same type variable twice), term M, and type t of System F such
that A ⊢ M : t, there exists a derivation (Q) Γ ⊢ [[M]] : τ such that
(Q) Γ = [[A]] and [[t]] ⊏ τ.
Remarkably, translated terms contain strictly fewer annotations
than original terms—a property that was not true in Poly-ML. In
particular, all type Λ-abstractions and type applications are dropped
and only annotations of λ-bound variables remain. Moreover, some
of these annotations are still superfluous.
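The translation of Section 5.1 is directly executable. The Python sketch below is a minimal illustration, not the authors' implementation: the class names, the convention that a bound of None stands for the unconstrained bound ⊥, and the fresh-name generator producing _a1, _a2, ... (assumed not to clash with source type variables) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from itertools import count

# System F types: t ::= α | t → t | ∀ α · t
@dataclass(frozen=True)
class TVar:
    name: str
@dataclass(frozen=True)
class TArrow:
    dom: object
    cod: object
@dataclass(frozen=True)
class TForall:
    var: str
    body: object

# MLF polytype ∀ (var ⋄ bound) body; bound None stands for ⊥ (assumption).
@dataclass(frozen=True)
class MForall:
    var: str
    rigid: bool      # True for "=", False for "≥"
    bound: object
    body: object

# Terms: M ::= x | M M | λ(x : t) M | Λ(α)M | M t
@dataclass(frozen=True)
class Var:
    name: str
@dataclass(frozen=True)
class App:
    fun: object
    arg: object
@dataclass(frozen=True)
class Lam:
    param: str
    ann: object
    body: object
@dataclass(frozen=True)
class TLam:
    var: str
    body: object
@dataclass(frozen=True)
class TApp:
    fun: object
    ty: object

def fresh_names():
    # Hypothetical supply of auxiliary type variables α1, α2, ...
    return (f"_a{i}" for i in count(1))

def translate_type(t, fresh):
    if isinstance(t, TVar):                       # [[α]] = α
        return t
    if isinstance(t, TForall):                    # [[∀ α · t]] = ∀ (α) [[t]]
        return MForall(t.var, False, None, translate_type(t.body, fresh))
    if isinstance(t, TArrow):                     # rigid bindings for both sides
        a1, a2 = next(fresh), next(fresh)
        return MForall(a1, True, translate_type(t.dom, fresh),
                       MForall(a2, True, translate_type(t.cod, fresh),
                               TArrow(TVar(a1), TVar(a2))))
    raise TypeError(t)

def translate_term(m, fresh):
    if isinstance(m, Var):                        # [[x]] = x
        return m
    if isinstance(m, TLam):                       # [[Λ(α)M]] = [[M]]
        return translate_term(m.body, fresh)
    if isinstance(m, TApp):                       # [[M t]] = [[M]]
        return translate_term(m.fun, fresh)
    if isinstance(m, App):                        # [[M M′]] = [[M]] [[M′]]
        return App(translate_term(m.fun, fresh), translate_term(m.arg, fresh))
    if isinstance(m, Lam):                        # [[λ(x : t) M]] = λ(x : [[t]]) [[M]]
        return Lam(m.param, translate_type(m.ann, fresh),
                   translate_term(m.body, fresh))
    raise TypeError(m)
```

As a usage example, with poly_id = TForall("α", TArrow(TVar("α"), TVar("α"))), the System F term λ(x : poly_id) x [poly_id] x, written Lam("x", poly_id, App(TApp(Var("x"), poly_id), Var("x"))), translates to a term of the shape λ(x : σ) x x: the type application disappears and only the annotation on x remains.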
5.2 Shallow-MLF?
Types whose flexible bounds are always ⊥ are called F-types
(they are the translation of types of System F). Types of the form
∀ (α ≥ σ) τ, where σ is not equivalent to a monotype nor to ⊥, have
been introduced to factor out choices during type inference. Such
types are indeed used in a derivation of let f = choose id in
( f auto) ( f succ). However, they are never used in the encoding
of System-F. Are they needed as annotations?
Let a type be shallow if and only if all its rigid bounds are F-types.
More generally, prefixes, typing contexts, and judgments are shallow if and only if they contain only shallow types. A derivation is
shallow if all its judgments are shallow and Rule O RACLE is only
applied to F-types. Notice that the explicit annotation (σ) has a
shallow type if and only if σ is an F-type. We call Shallow-MLF?
the set of terms that have shallow derivations. Interestingly, subject
reduction also holds for Shallow-MLF? .
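The two definitions above are simple structural checks on types. The following Python sketch spells them out under assumptions that are not in the paper: polytypes are represented by a hypothetical Forall class, monotypes are kept opaque, and "all its rigid bounds" is read as ranging over rigid bounds at any depth.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bottom:            # the polytype ⊥
    pass

@dataclass(frozen=True)
class Mono:              # a monotype, kept opaque for this illustration
    text: str

@dataclass(frozen=True)
class Forall:            # ∀ (var ⋄ bound) body
    var: str
    rigid: bool          # True for "=", False for "≥"
    bound: object
    body: object

def is_f_type(s):
    # F-type: every flexible bound, at any depth, is ⊥.
    if not isinstance(s, Forall):
        return True
    if not s.rigid and not isinstance(s.bound, Bottom):
        return False
    return is_f_type(s.bound) and is_f_type(s.body)

def is_shallow(s):
    # Shallow: every rigid bound, at any depth, is an F-type.
    if not isinstance(s, Forall):
        return True
    if s.rigid and not is_f_type(s.bound):
        return False
    return is_shallow(s.bound) and is_shallow(s.body)

def annotation_type(s):
    # Type of the annotation (σ): ∀ (α = σ) ∀ (β ≥ σ) α → β.
    return Forall("α", True, s, Forall("β", False, s, Mono("α → β")))

# σid = ∀ (α) α → α, and the type ∀ (β ≥ σid) β → β used for choose id.
sigma_id = Forall("α", False, Bottom(), Mono("α → α"))
choose_id = Forall("β", False, sigma_id, Mono("β → β"))
```

Under these assumptions, sigma_id is an F-type, choose_id is shallow but not an F-type, and annotation_type(choose_id) is not shallow, in line with the remark that the annotation (σ) has a shallow type if and only if σ is an F-type.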
Let-bindings do not increase expressiveness in MLF? , since they can
always be replaced by λ-bindings with oracles or explicit annotations. This is not true for Shallow-MLF? , since shallow-types that
are not F-types cannot be used as annotations. Therefore, we also
consider the restriction Shallow-F of Shallow-MLF? to programs
without let-bindings.
The encoding of System-F into MLF given in Section 5.1 is actually
an encoding into Shallow-F. Conversely, all programs typable
in Shallow-F are also typable in System-F. Hence, Shallow-F and
System-F have the same expressiveness.
6 Discussion
6.1 Expressiveness of MLF
By construction, we have the chain of inclusions Shallow-F ⊆
Shallow-MLF? ⊆ MLF? . We may wonder whether these languages
have strictly increasing power. That is, ignoring annotations and
the difference in notation between let-bindings and λ-redexes, do
they still form a strict chain of inclusions? We conjecture that this
is true.
Still, MLF? remains a second-order system and in that sense should
not be significantly more expressive than System F. In particular,
we conjecture that the term (λ(y) y I ; y K) (λ(x) x x) that is typable
in Fω but not in F [9] is not typable in MLF either. Conversely, we
do not know whether there exists a term of MLF that is not typable
in Fω .
Reducing all let-bindings in a term of Shallow-MLF? produces a
term in Shallow-F. Hence, terms of Shallow-MLF? are strongly normalizable. We conjecture that so are all terms of MLF.
6.2 Simple language extensions
Because the language is parameterized by constants, which can be
used either as constructors or primitive operations, the language
can import foreign functions defined via appropriate δ-rules. These
could include primitive types (such as integers, strings, etc.) and
operations over them. Sums and products, as well as predefined
datatypes, can also be treated in this manner, but some extension is
required to declare new data-types within the language itself.
The value restriction of polymorphism [28] that allows for safe mutable data-structures in ML should carry over to MLF by allowing
only rigid bounds that appear in the type of expansive expressions to
be generalized. However, this solution is likely to be disappointing
in MLF, as it is in Poly-ML, which uses polymorphism extensively.
An interesting relaxation of the value-only restriction has been recently
proposed [6]: it always allows generalizing type variables
that never appear on the left-hand side of an arrow type. This gave
quite satisfactory results in the context of Poly-ML, and we can expect
similar benefits for MLF.
6.3 Related works
Our work is related to all other works that aim at some form of
type inference in the presence of higher-order types. The closest
of them is unquestionably Poly-ML [7], with which close connections
have already been made. Poly-ML also subsumes previous proposals
that encapsulate first-class polymorphic values within
datatypes [25]. Odersky and Läufer’s proposal [20] also falls into
this category; however, a side mechanism simultaneously allows
a form of toplevel rank-2 quantification, which is not covered by
Poly-ML but is, we think, subsumed by MLF.
Rank-2 polymorphism actually allows for full type inference [13,
11]. However, the algorithm proceeds by reduction on source terms
and is not very intuitive. Rank-2 polymorphism has also been incorporated in the Hugs implementation of Haskell [17], but with
explicit type annotations. The GHC implementation of Haskell has
recently been released with second-order polymorphism at arbitrary
ranks [8]; however, types at rank 2 or higher must be given explicitly
and the interaction of annotations with implicit types remains
unclear. Furthermore, to the best of our knowledge, this has not yet
been formalized. Indeed, type inference is undecidable as soon as
universal quantifiers may appear at rank 3 [14].
Although our proposal relies on the let-binding mechanism to introduce
implicit polymorphism and flexible bounds in types to factorize
all ways of obtaining type instances, there may still be some
connection with intersection types [12], which we would like to
explore. Our treatment of annotations as type-revealing primitives
also resembles retyping functions (functions whose type-erasure
η-reduces to the identity) [19]. However, our annotations are explicit
and contain only certain forms of retyping functions. Type inference for System F modulo η-expansion is known to be undecidable
as well [27].
Several people have considered partial type inference for System F
[10, 1, 23] and stated undecidability results for some particular variants that in all cases amount—directly or indirectly—to permitting
(and so forcing) inference of the type of at least one variable that
can be used in a polymorphic manner, which we avoid.
Second-order unification, although known to be undecidable, has
been used to explore the practical effectiveness of type inference
for System F by Pfenning [22]. Despite our opposite choice, that is
not to support second-order unification, there are at least two comparisons to be made. Firstly, Pfenning’s work does not cover the
language ML per se, but only the λ-calculus, since let-bindings are
expanded prior to type inference. Indeed, ML is not the simply-typed
λ-calculus and type inference in ML cannot, in practice, be
reduced to type inference in the simply-typed λ-calculus after expansion
of let-bindings. Secondly, one proposal seems to require
annotations exactly where the other can skip them: in [22], markers
(but no type) annotations must replace type-abstraction and
type-application nodes; conversely, this information is omitted in MLF,
but instead, explicit type information must remain for (some) arguments
of λ-abstractions.
Our proposal is implicitly parameterized by the type instance relation and its corresponding unification algorithm. Thus, most of the
technical details can be encapsulated within the instance relation.
We would like to understand our notion of unification as a particular
case of second-order unification. One step in this direction would
be to consider a modular constraint-based presentation of second-order
unification such as [5]. Flexible bounds might partly capture,
within principal types, what constraint-based algorithms capture as
partially unresolved multi-sets of unification constraints. Another
example of restricted unification within second-order terms is unification under a mixed prefix [18]. However, our notion of prefix
and its role in abstracting polytypes is quite different.
Actually, none of the above works considers subtyping at all.
This is a significant difference with proposals based on local type
inference [2, 24, 21] where subtyping is a prerequisite. The addition
of subtyping to our framework remains to be explored.
Furthermore, beyond its treatment of subtyping, local type inference also brings the idea that explicit type annotations can be propagated up and down the source tree according to fixed well-defined
rules, which, at least intuitively, could be understood as a preprocessing of the source term. Such a mechanism is being used in the
GHC Haskell compiler, and could in principle be added on top of
MLF as well.
Conclusions
We have proposed an integration of ML and System F that combines the convenience of type inference as present in ML and the
expressiveness of second-order polymorphism. Type information is
only required for arguments of functions that are used polymorphically in their bodies. This specification should be intuitive to the
user. Besides, it is modular, since annotations depend more on the
behavior of the code than on the context in which the code is placed;
in particular, functions that only carry polymorphism without using
it can be left unannotated.
The obvious potential application of our work is to extend ML-like
languages with second-order polymorphism while keeping full type
inference for a large subset of the language, containing at least all
ML programs. Indeed, we implemented a prototype of MLF and
verified on a variety of examples that few annotations are actually
required and always at predictable places. However, further investigations are still needed regarding the syntactic-value polymorphism
restriction and its possible relaxation.
Furthermore, on the theoretical side, we wish to better understand
the concept of “first-order unification of second-order terms”, and,
if possible, to confine it to an instance of second-order unification.
We would also like to give logical meaning to our types and to the
abstraction and instance relations.
7 References
[1] H.-J. Boehm. Partial polymorphic type inference is undecidable. In
26th Annual Symposium on Foundations of Computer Science, pages
339–345. IEEE Computer Society Press, Oct. 1985.
[2] L. Cardelli. An implementation of FSub. Research Report 97, Digital
Equipment Corporation Systems Research Center, 1993.
[3] R. D. Cosmo. Isomorphisms of Types: from lambda-calculus to
information retrieval and language design. Birkhäuser, 1995.
[4] L. Damas and R. Milner. Principal type-schemes for functional
programs. In Proceedings of the Ninth ACM Conference on Principles of
Programming Languages, pages 207–212, 1982.
[5] G. Dowek, T. Hardin, C. Kirchner, and F. Pfenning. Higher-order
unification via explicit substitutions: the case of higher-order patterns.
In M. Maher, editor, Joint international conference and symposium on
logic programming, pages 259–273, 1996.
[6] J. Garrigue. Relaxing the value-restriction. Presented at the
third Asian workshop on Programming Languages and Systems
(APLAS), 2002.
[7] J. Garrigue and D. Rémy. Extending ML with semi-explicit
higher-order polymorphism. Journal of Functional Programming,
155(1/2):134–169, 1999. A preliminary version appeared in TACS’97.
[8] The GHC Team. The Glasgow Haskell Compiler User’s Guide,
Version 5.04, 2002. Chapter Arbitrary-rank polymorphism.
[9] P. Giannini and S. R. D. Rocca. Characterization of typings in
polymorphic type discipline. In Third annual Symposium on Logic in
Computer Science, pages 61–70. IEEE, 1988.
[10] J. James William O’Toole and D. K. Gifford. Type reconstruction with
first-class polymorphic values. In SIGPLAN ’89 Conference on
Programming Language Design and Implementation, Portland, Oregon,
June 1989. ACM. Also in ACM SIGPLAN Notices 24(7), July 1989.
[11] T. Jim. Rank-2 type systems and recursive definitions. Technical
Report MIT/LCS/TM-531, Massachusetts Institute of Technology,
Laboratory for Computer Science, Nov. 1995.
[12] T. Jim. What are principal typings and what are they good for? In
ACM, editor, ACM Symposium on Principles of Programming Languages
(POPL), St. Petersburg Beach, Florida, pages 42–53, 1996.
[13] A. J. Kfoury and J. B. Wells. A direct algorithm for type inference
in the rank-2 fragment of the second-order lambda-calculus. In ACM
Conference on LISP and Functional Programming, 1994.
[14] A. J. Kfoury and J. B. Wells. Principality and decidable type inference
for finite-rank intersection types. In ACM Symposium on Principles of
Programming Languages (POPL), pages 161–174. ACM, Jan. 1999.
[15] K. Läufer and M. Odersky. Polymorphic type inference and abstract
data types. ACM Transactions on Programming Languages and Systems,
16(5):1411–1430, Sept. 1994.
[16] X. Leroy, D. Doligez, J. Garrigue, D. Rémy, and J. Vouillon. The
Objective Caml system, documentation and user’s manual - release
3.05. Technical report, INRIA, July 2002. Documentation distributed
with the Objective Caml system.
[17] Mark P Jones, Alastair Reid, the Yale Haskell Group, and the OGI
School of Science & Engineering at OHSU. An overview of Hugs
extensions. Available electronically, 1994-2002.
[18] D. Miller. Unification under a mixed prefix. Journal of Symbolic
Computation, 14:321–358, 1992.
[19] J. C. Mitchell. Polymorphic type inference and containment.
Information and Computation, 2/3(76):211–249, 1988.
[20] M. Odersky and K. Läufer. Putting type annotations to work. In
Proceedings of the 23rd ACM Conference on Principles of Programming
Languages, pages 54–67, Jan. 1996.
[21] M. Odersky, C. Zenger, and M. Zenger. Colored local type inference.
ACM SIGPLAN Notices, 36(3):41–53, Mar. 2001.
[22] F. Pfenning. Partial polymorphic type inference and higher-order
unification. In Proceedings of the ACM Conference on Lisp and
Functional Programming, pages 153–163. ACM Press, July 1988.
[23] F. Pfenning. On the undecidability of partial polymorphic type
reconstruction. Fundamenta Informaticae, 19(1,2):185–199, 1993.
Preliminary version available as Technical Report CMU-CS-92-105, School
of Computer Science, Carnegie Mellon University, January 1992.
[24] B. C. Pierce and D. N. Turner. Local type inference. In Proceedings of
the 25th ACM Conference on Principles of Programming Languages,
1998. Full version in ACM Transactions on Programming Languages
and Systems (TOPLAS), 22(1), January 2000, pp. 1–44.
[25] D. Rémy. Programming objects with ML-ART: An extension to ML
with abstract and record types. In M. Hagiya and J. C. Mitchell,
editors, Theoretical Aspects of Computer Software, volume 789 of
Lecture Notes in Computer Science, pages 321–346. Springer-Verlag,
April 1994.
[26] J. B. Wells. Typability and type checking in the second order
λ-calculus are equivalent and undecidable. In Ninth annual IEEE
Symposium on Logic in Computer Science, pages 176–185, July 1994.
[27] J. B. Wells. Type Inference for System F with and without the Eta Rule.
PhD thesis, Boston University, 1996.
[28] A. K. Wright. Simple imperative polymorphism. Lisp and Symbolic
Computation, 8(4):343–355, 1995.
A Unification algorithm
The algorithm unify is specified in Section 1.5; it takes a prefix
Q and two types τ and τ′ and returns a prefix that unifies τ and τ′
under Q (as described in Theorem 1) or fails. In fact, the algorithm
unify is recursively defined by an auxiliary unification algorithm
for polytypes: polyunify takes a prefix Q and two type schemes
σ1 and σ2 and returns a pair (Q′, σ′) such that Q ⊑ Q′ and (Q′)
σ1 ⊑ σ′ and (Q′) σ2 ⊑ σ′.

Figure 8. Unification algorithm
unify (Q, τ0′, τ0″)
First rewrites all bounds of Q in normal form and proceeds
by case analysis on (τ0′, τ0″):
Case (α, α): return Q.
Case (g τ_1^1 .. τ_1^n, g τ_2^1 .. τ_2^n):
• let Q1 be Q in
• let Qi+1 be unify (Qi, τ_1^i, τ_2^i) for 1 ≤ i ≤ n in
• return Qn+1.
Case (g1 τ_1^1 .. τ_1^p, g2 τ_2^1 .. τ_2^q) with g1 ≠ g2: fail.
Case (α, τ) or (τ, α) when (α ⋄ τ′) ∈ Q:
• return unify (Q, τ, τ′).
Case (α, τ) or (τ, α) when (α ⋄ σ) ∈ Q
and τ ∉ dom(Q) and σ ∉ T
• let (Q′, _) be polyunify (Q, σ, τ) in
• fail if (α ⋄ σ) ∉ Q′.
• return (Q′) ⇐ (α = τ)
Case (α1, α2) when (α1 ⋄1 σ1) ∈ Q and (α2 ⋄2 σ2) ∈ Q and
α1 ≠ α2 and σ1, σ2 are not in T.
• let (Q′, σ3) be polyunify (Q, σ1, σ2) in
• fail if (α1 ⋄1 σ1) ∉ Q′ or if (α2 ⋄2 σ2) ∉ Q′
• return (Q′) ⇐ (α1 ⋄1 σ3) ⇐ (α2 ⋄2 σ3) ⇐ α1 ∧ α2.
polyunify (Q, σ1, σ2)
Requires σ1, σ2, and all bounds in Q to be in normal form (1):
Case (⊥, σ) or (σ, ⊥): return (Q, σ)
Case (∀ (Q1) τ1, ∀ (Q2) τ2) with Q1, Q2, and Q having disjoint
domains (which usually requires renaming σ1 and σ2)
• let Q0 be unify (QQ1Q2, τ1, τ2) in
• let (Q3, Q′) be Q0 ↑ dom(Q) in
• return (Q3, ∀ (Q′) τ1)
(1) Actually, we only need to replace types of the form ∀ (α ⋄ σ) α by
σ, which can always be done lazily.

Figure 9. Algorithm WF
infer (Q, Γ, a):
Proceeds by case analysis on expression a:
Case x: return Q, Γ(x)
Case λ(x) a:

The algorithms unify and polyunify are described in Figure 8.
For the sake of comparison with ML, think of the input prefix Q
given to unify as a substitution and of the result prefix Q′ as an
instance of Q (i.e. a substitution of the form Q″ ◦ Q) that unifies τ
and τ′. First-order unification of polytypes essentially follows the
general structure of first-order unification of monotypes. The main
differences are that (i) the computation of the unifying substitution
is replaced by the computation of a unifying prefix, (ii) additional
work must be performed when a variable bound to a strict polytype
(i.e. other than ⊥ and not equivalent to a monotype) is being unified:
bounds must be further unified (last case of polyunify) and
the prefix updated accordingly. Auxiliary algorithms are used for
this purpose.
Let a rearrangement of a prefix Q be a prefix equivalent to Q obtained by a permutation of bindings of Q.
D EFINITION 9. The split algorithm Q↑ᾱ takes a prefix Q and returns a pair of prefixes (Q1 , Q2 ) such that (i) Q1 Q2 is a rearrangement of Q, (ii) ᾱ ⊆ dom(Q1 ), and (iii) dom(Q1 ) is minimal (in
other words, Q1 contains only bindings of Q useful for exporting
the interface ᾱ, and Q2 contains the rest).
let Q1 = (Q, α ≥ ⊥) with α ∈
/ dom(Q)
let (Q2 , σ) = infer (Q1 , Γ, x : α, a)
let β ∈
/ dom(Q2 ) and (Q3 , Q4 ) = Q2 ↑ dom(Q)
return Q3 , ∀ (Q4 ) ∀ (β ≥ σ) α → β
Case a b:
• let (Q1 , σa ) = infer (Q, Γ, a)
• let (Q2 , σb ) = infer (Q1 , Γ, b)
• let αa , αb , β ∈
/ dom(Q2 )
• let Q3 = unify ((Q2 , αa ≥ σa , αb ≥ σb , β ≥ ⊥),
αa , αb → β)
• let (Q4 , Q5 ) = Q3 ↑ dom(Q)
• return (Q4 , ∀ (Q5 ) β)
Case let x = a1 in a2 :
• let (Q1 , σ1 ) = infer (Q, Γ, a1 )
• return infer (Q1 , (Γ, x : σ1 ), a2 )
DEFINITION 10. The abstraction-check algorithm (Q) σ ⊏? σ′
takes a prefix Q and two polytypes σ and σ′ such that (Q) σ ⊑ σ′
and checks that (Q) σ ⊏ σ′ or fails otherwise.
DEFINITION 11. The update algorithm Q ⇐ (α ⋄ σ) takes a prefix
Q and a binding (α ⋄ σ) such that α is in the domain of Q and returns
a prefix (Q0, α ⋄ σ, Q1) such that (i) (Q0, α ⋄′ σ′, Q1) is a rearrangement of Q and (ii) dom(Q1) ∩ ftv(σ) = ∅. The algorithm fails when
there is no such decomposition (because of circular dependencies)
or when ⋄′ is = and (Q) σ′ ⊏? σ fails.
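The rearrangement part of Definition 11 can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: a prefix is taken to be a Python list of bindings in dependency order, each bound is represented solely by its set of free type variables, and the abstraction check of Definition 10 is a hypothetical callback that accepts everything by default. The names `Binding`, `update`, and `abstraction_check` are all made up for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Binding:
    var: str          # bound type variable
    kind: str         # ">=" (flexible) or "=" (rigid)
    ftv: frozenset    # free type variables of the bound (assumed given)

def update(prefix, new, abstraction_check=lambda old, new: True):
    """Replace the binding of new.var in prefix, moving its dependencies first."""
    by_var = {b.var: b for b in prefix}
    old = by_var[new.var]
    # Collect every variable the new bound depends on, directly or indirectly.
    needed, todo = set(), list(new.ftv)
    while todo:
        v = todo.pop()
        if v not in needed:
            needed.add(v)
            if v in by_var:
                todo.extend(by_var[v].ftv)
    if new.var in needed:
        raise ValueError("circular dependency: no suitable rearrangement")
    # Hypothetical stand-in for the abstraction check on rigid bindings.
    if old.kind == "=" and not abstraction_check(old, new):
        raise ValueError("abstraction check failed")
    pos = prefix.index(old)
    # Q0: everything already before the variable, plus needed bindings after it.
    q0 = [b for k, b in enumerate(prefix)
          if k < pos or (k > pos and b.var in needed)]
    # Q1: the remaining bindings, none of which the new bound mentions.
    q1 = [b for k, b in enumerate(prefix) if k > pos and b.var not in needed]
    return q0 + [new] + q1

q = [Binding("a", ">=", frozenset()),
     Binding("b", ">=", frozenset()),
     Binding("c", ">=", frozenset({"b"}))]
updated = update(q, Binding("a", ">=", frozenset({"c"})))
```

In this example the new bound of `a` mentions `c`, which in turn mentions `b`, so both bindings are moved in front of `a`. Updating `b` with a bound that mentions `c` would instead be rejected as circular.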
DEFINITION 12. The merge algorithm Q ⇐ α ∧ α′ takes two variables α and α′ and a prefix of the form (Q0, α ⋄ σ, Q1, α′ ⋄′ σ′, Q2)
and returns the prefix (Q0, α ⋄″ σ, Q1, α′ = α, Q2) where ⋄″ is ≥ if
both ⋄ and ⋄′ are ≥, and ⋄″ is = otherwise.
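As an illustration of Definition 12, here is a minimal sketch of merge. It assumes a prefix is a Python list of bindings and treats bounds as opaque values; a bound that is a plain variable name stands for the monotype consisting of that variable. The names `Binding` and `merge` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass

FLEXIBLE, RIGID = ">=", "="

@dataclass(frozen=True)
class Binding:
    var: str        # bound type variable
    kind: str       # FLEXIBLE or RIGID
    bound: object   # opaque polytype (assumption: its structure is not inspected)

def merge(prefix, a, a2):
    """Merge a2 into a: a keeps its bound, a2 becomes rigidly bound to a."""
    names = [b.var for b in prefix]
    i, j = names.index(a), names.index(a2)
    if i >= j:
        raise ValueError("expected %s to be bound before %s" % (a, a2))
    # The merged binding stays flexible only if both bindings were flexible.
    both_flexible = prefix[i].kind == FLEXIBLE and prefix[j].kind == FLEXIBLE
    out = list(prefix)
    out[i] = Binding(a, FLEXIBLE if both_flexible else RIGID, prefix[i].bound)
    out[j] = Binding(a2, RIGID, a)
    return out

q = [Binding("a1", FLEXIBLE, "s3"),
     Binding("b", FLEXIBLE, "bot"),
     Binding("a2", RIGID, "s3")]
merged = merge(q, "a1", "a2")
```

Here `a2` was rigid, so the merged binding of `a1` becomes rigid as well, and `a2` is rebound to `a1`. The binding of `b` in between is left untouched.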
The implementation of algorithms split, update, and merge is
straightforward. The algorithm abstraction-check can be reduced
to a simple check on the structure of paths, thanks to the assumption (Q) σ ⊑ σ′. For lack of space, they are all omitted.
B Type inference algorithm
Figure 9 defines the type-inference algorithm WF for MLF. The
algorithm follows the algorithm W for ML, with only two differences: first, the algorithm builds a prefix Q instead of a substitution; second, all free type variables not in Γ are quantified at each
abstraction or application. Since free variables of Γ are in dom(Q),
finding quantified variables consists in splitting the current prefix
according to dom(Q), as described by Definition 9.
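The splitting step used to find the quantified variables can be sketched as follows. This is a minimal illustration rather than the authors' implementation. It assumes that a prefix is a Python list of bindings in dependency order (each bound mentions only variables bound earlier), that each bound is represented only by its set of free type variables, and that a binding is "useful" for the interface when it is reachable from it through free variables of bounds. The names `Binding` and `split` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Binding:
    var: str          # bound type variable
    kind: str         # ">=" (flexible) or "=" (rigid)
    ftv: frozenset    # free type variables of the bound (assumed given)

def split(prefix, interface):
    """Return (q1, q2): q1 holds the bindings needed by interface, q2 the rest."""
    needed = set(interface)
    q1_rev, q2_rev = [], []
    # Walk from right to left: once a variable is known to be needed,
    # the free variables of its bound become needed too.
    for b in reversed(prefix):
        if b.var in needed:
            needed |= b.ftv
            q1_rev.append(b)
        else:
            q2_rev.append(b)
    # Restore the original relative order inside each half.
    return q1_rev[::-1], q2_rev[::-1]

q = [Binding("a", ">=", frozenset()),
     Binding("b", ">=", frozenset({"a"})),
     Binding("c", ">=", frozenset()),
     Binding("d", "=", frozenset({"c"}))]
q1, q2 = split(q, {"b"})
```

With interface `{"b"}`, the first half keeps `b` and the binding of `a` that its bound depends on, while `c` and `d` go to the second half. In the inference cases, the second half corresponds to the bindings that are quantified in the returned type.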