
Type inference as
abstract interpreter
Giorgio Levi
Dipartimento di Informatica, Università di Pisa
http://www.di.unipi.it/~levi/levi.html
1
Types as abstract interpretations
 inspired by a paper of Cousot (POPL ‘97), which

derives several type abstract semantics from a collecting
semantics for eager untyped lambda calculus
• including à la Church/Curry polytypes (with polymorphic recursion
and abstraction)

discusses their relation to “more traditional” rule based
presentations
 most of the abstract semantics are not “effective”


infinite sets of monotypes in the abstract domain
non-effective semantics for lambda abstraction
2
Our experiment
 abstract interpreters rather than abstract semantics

executable specifications of the semantics in ML
 principal type inference via Herbrand abstraction


generic and parametric polytypes, represented by (possibly
quantified) type expressions with variables (terms)
much in the spirit of ML type inference algorithm
 all the abstract interpreters are “effective” type
inference algorithms

easy to extend, modify and compare
3
The language: syntax
 type ide = Id of string
 type exp =

| Eint of int
| Var of ide
| Sum of exp * exp
| Diff of exp * exp
| Ifthenelse of exp * exp * exp
| Fun of ide * exp
| Rec of ide * exp
| Appl of exp * exp
| Let of ide * exp * exp
| Letrec of ide * exp * exp

| Letmutrec of (ide * exp) * (ide *exp) * exp
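As an illustration of the abstract syntax above, the sketch below encodes the ML expression let f = function x -> x + 1 in f 2 as a value of type exp. The printer show and the name succ_example are hypothetical helpers introduced here, not part of the original interpreter; the conditional is printed with "= 0" because the guard is taken to hold when it evaluates to 0.

```ocaml
type ide = Id of string

type exp =
  | Eint of int
  | Var of ide
  | Sum of exp * exp
  | Diff of exp * exp
  | Ifthenelse of exp * exp * exp
  | Fun of ide * exp
  | Rec of ide * exp
  | Appl of exp * exp
  | Let of ide * exp * exp
  | Letrec of ide * exp * exp
  | Letmutrec of (ide * exp) * (ide * exp) * exp

(* Hypothetical helper (not in the slides): renders a term in ML-like syntax. *)
let rec show (e : exp) : string =
  match e with
  | Eint n -> string_of_int n
  | Var (Id x) -> x
  | Sum (a, b) -> "(" ^ show a ^ " + " ^ show b ^ ")"
  | Diff (a, b) -> "(" ^ show a ^ " - " ^ show b ^ ")"
  | Ifthenelse (a, b, c) ->
      "(if " ^ show a ^ " = 0 then " ^ show b ^ " else " ^ show c ^ ")"
  | Fun (Id x, b) -> "(function " ^ x ^ " -> " ^ show b ^ ")"
  | Rec (Id f, b) -> "(rec " ^ f ^ ". " ^ show b ^ ")"
  | Appl (a, b) -> "(" ^ show a ^ " " ^ show b ^ ")"
  | Let (Id x, a, b) -> "(let " ^ x ^ " = " ^ show a ^ " in " ^ show b ^ ")"
  | Letrec (Id x, a, b) ->
      "(let rec " ^ x ^ " = " ^ show a ^ " in " ^ show b ^ ")"
  | Letmutrec ((Id f, a), (Id g, b), c) ->
      "(let rec " ^ f ^ " = " ^ show a ^ " and " ^ g ^ " = " ^ show b
      ^ " in " ^ show c ^ ")"

(* let f = function x -> x + 1 in f 2 *)
let succ_example =
  Let (Id "f", Fun (Id "x", Sum (Var (Id "x"), Eint 1)),
       Appl (Var (Id "f"), Eint 2))
```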









4
Concrete semantics
 denotational interpreter
 eager semantics with “run-time” type checking
 separation from the main semantic evaluation
function of the primitive operations

which will then be replaced by their abstract version
 abstraction of concrete values

identity function in the concrete semantics
 symbolic “non-deterministic” semantics of the
conditional
5
Semantic domains
 type proc = eval -> eval
and eval =
| Funval of proc
| Int of int
| Wrong
let alfa x = x
 type env = ide -> eval
let emptyenv (x: ide) = alfa(Wrong)
let applyenv ((x: env), (y: ide)) = x y
let bind ((r:env), (l:ide), (e:eval)) (lu:ide) =
if lu = l then e else r(lu)
6
Semantic evaluation function

let rec sem (e:exp) (r:env) = match e with
| Eint(n) -> alfa(Int(n))
| Var(i) -> applyenv(r,i)
| Sum(a,b) -> plus ( (sem a r), (sem b r))
| Diff(a,b) -> diff ( (sem a r), (sem b r))
| Ifthenelse(a,b,c) -> let a1 = sem a r in
(if valid(a1) then sem b r else
(if unsatisfiable(a1) then sem c r
else merge(a1,sem b r,sem c r)))
| Fun(ii,aa) -> makefun(ii,aa,r)
| Rec(i,e1) -> makefunrec (i,e1,r)
| Appl(a,b) -> applyfun(sem a r, sem b r)
| Let(i,e1,e2) -> let d = sem e1 r in if d = alfa(Wrong) then d else
sem e2 (bind(r, i,d))
| Letrec(i,e1,e2) -> sem (Let(i,Rec(i,e1),e2)) r
| Letmutrec ((i1,e1),(i2,e2),e3) ->
sem e3 (makemutrec((i1,e1),(i2,e2),r))
7
Primitive operations 1
let plus (x,y) = match (x,y) with
|(Int nx, Int ny) -> Int (nx + ny)
| _ -> Wrong
let diff (x,y) = match (x,y) with
|(Int nx, Int ny) -> Int (nx - ny)
| _ -> Wrong
let valid x = match x with
|Int n -> n=0
let unsatisfiable x = match x with
|Int n -> if n=0 then false else true
let merge (a,b,c) = match a with
|Int n -> if b=c then b else Wrong
| _ -> Wrong
let applyfun ((x:eval),(y:eval)) = match x with
|Funval f -> f y
| _ -> Wrong
let rec makefun(ii,aa,r) = Funval(function d ->
if d = Wrong then Wrong
else sem aa (bind(r,ii,d)))
8
Primitive operations 2
let rec funzionale ff i e r = sem e (bind(r,i,ff))
and makefunrec (i,e1,r) =
Funval(let rec ff = function d ->
if d = Wrong then Wrong else
(match funzionale (Funval ff) i e1 r with
| Funval f -> f d) in ff)
 lfp (funzionale ff i e r)
let rec makemutrec ((i1,e1),(i2,e2),r) =
let rec ff1 = function d -> if d = Wrong then Wrong else
(match sem e1 (bind(bind(r,i1,Funval(ff1)),i2,Funval(ff2)))
with Funval f -> f d)
and ff2 = function d -> if d = Wrong then Wrong else
(match sem e2 (bind(bind(r,i1,Funval(ff1)),i2,Funval(ff2)))
with Funval f -> f d)
in bind(bind(r,i1,Funval(ff1)),i2,Funval(ff2))
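The pieces shown so far refer to each other (sem calls makefun, makefun calls sem), so they have to be tied together with let rec ... and. The following is a minimal self-contained sketch of that assembly, under some simplifying assumptions that are not in the slides: Letrec and Letmutrec are omitted, the conditional is deterministic (no valid/unsatisfiable/merge), and strictness in Wrong is tested by pattern matching rather than with (=), which avoids comparing functional values.

```ocaml
type ide = Id of string

type exp =
  | Eint of int
  | Var of ide
  | Sum of exp * exp
  | Diff of exp * exp
  | Ifthenelse of exp * exp * exp
  | Fun of ide * exp
  | Rec of ide * exp
  | Appl of exp * exp
  | Let of ide * exp * exp

type eval = Funval of (eval -> eval) | Int of int | Wrong
type env = ide -> eval

let emptyenv : env = fun _ -> Wrong
let bind (r : env) (l : ide) (e : eval) : env =
  fun lu -> if lu = l then e else r lu

let plus (x, y) = match (x, y) with Int a, Int b -> Int (a + b) | _ -> Wrong
let diff (x, y) = match (x, y) with Int a, Int b -> Int (a - b) | _ -> Wrong

let rec sem (e : exp) (r : env) : eval =
  match e with
  | Eint n -> Int n
  | Var i -> r i
  | Sum (a, b) -> plus (sem a r, sem b r)
  | Diff (a, b) -> diff (sem a r, sem b r)
  | Ifthenelse (a, b, c) ->
      (* the guard holds when it evaluates to 0 *)
      (match sem a r with
       | Int 0 -> sem b r
       | Int _ -> sem c r
       | _ -> Wrong)
  | Fun (i, a) -> makefun (i, a, r)
  | Rec (i, a) -> makefunrec (i, a, r)
  | Appl (a, b) ->
      (match sem a r with Funval f -> f (sem b r) | _ -> Wrong)
  | Let (i, a, b) ->
      (match sem a r with Wrong -> Wrong | d -> sem b (bind r i d))

and makefun (i, a, r) =
  Funval (fun d -> match d with Wrong -> Wrong | _ -> sem a (bind r i d))

and makefunrec (i, a, r) =
  (* the recursive name is bound to the function being defined *)
  let rec ff d =
    match d with
    | Wrong -> Wrong
    | _ ->
        (match sem a (bind r i (Funval ff)) with
         | Funval f -> f d
         | _ -> Wrong)
  in
  Funval ff

(* (function x -> x x) (function x -> x) 3, not typable in ML *)
let selfapp =
  Appl (Appl (Fun (Id "x", Appl (Var (Id "x"), Var (Id "x"))),
              Fun (Id "x", Var (Id "x"))),
        Eint 3)
```

Evaluating sem selfapp emptyenv in this sketch yields Int 3, in agreement with the run-time type checking view: the expression is rejected by ML but has a type-correct concrete evaluation.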
9
Examples 1
 expressions, which have a (type) correct concrete
evaluation, and which cannot be typed by the ML
type system
◆ f f1 g n x = g(f1^n(x))
# let rec f f1 g n x = if n=0 then g(x) else f(f1) (function x -> (function h -> g(h(x)) )) (n-1) x f1
in f (function x -> x+1) (function x -> -x) 10 5;;
This expression has type ('a -> 'a) -> 'b but is here used with type 'b
 cannot be typed for the approximation in the abstract fixpoint computation
# sem (Letrec (Id "f",Fun(Id "f1", Fun(Id "g", Fun(Id "n", Fun(Id "x",
Ifthenelse(Var(Id "n"),Appl(Var(Id "g"),Var(Id "x")),
Appl(Appl(Appl(Appl(Appl(Var(Id "f"),Var(Id "f1")),
Fun(Id "x",Fun(Id "h", Appl(Var(Id "g"),Appl(Var(Id "h"),Var(Id "x")))))),
Diff(Var(Id "n"),Eint 1)),Var(Id "x")),Var(Id "f1"))))))),
Appl(Appl(Appl(Appl(Var(Id "f"),Fun(Id "x",Sum(Var(Id "x"),Eint 1))),
Fun(Id "x",Var(Id "x"))), Eint 10),Eint 5) ) ) emptyenv;;
- : eval = Int 15
10
Examples 2
 expressions, which have a (type) correct concrete
evaluation, and which cannot be typed by the ML
type system
# let rec f x = x and g x = f (1+x) in f f 2;;
This expression has type int -> int but is here used with type int
 cannot be typed for the approximation in the abstract fixpoint computation (because of mutual
recursion)
# sem (Letmutrec((Id "f",Fun(Id "x",Var(Id "x"))),
(Id "g",Fun(Id "x",Appl(Var(Id "f"),Sum(Eint 1,Var(Id "x"))))),
Appl(Appl(Var(Id "f"),Var(Id "f")),Eint 2))) emptyenv;;
- : eval = Int 2
11
Examples 3
 expressions, which have a (type) correct concrete
evaluation, and which cannot be typed by the ML
type system
# let rec polyf x y = if x=0 then 0 else
if (x-1)=0 then (polyf (x-1)) (function z -> z)
else (polyf (x-2)) 0 in polyf 3 1;;
This expression has type int but is here used with type 'a -> 'a
 no polymorphic recursion
# sem (Letrec (Id "polyf", Fun(Id "x", Fun (Id "y",
Ifthenelse(Var (Id "x"), Eint 0,
Ifthenelse (Diff (Var (Id "x"), Eint 1), Appl(Appl (Var (Id "polyf"),
Diff (Var (Id "x"), Eint 1)), Fun (Id "z", Var (Id "z"))),
Appl (Appl (Var (Id "polyf"), Diff (Var (Id "x"), Eint 2)), Eint 0))))),
Appl(Appl(Var (Id "polyf"),Eint 3),Eint 1) )) emptyenv;;
- : eval = Int 0
12
Examples 4
 expressions, which have a (type) correct concrete
evaluation, and which cannot be typed by the ML
type system
# (function x -> x x) (function x -> x) 3
This expression has type 'a -> 'b but is here used with type 'a
 no polymorphic abstraction
# sem (Appl(Appl(Fun(Id "x", Appl(Var(Id "x"),Var(Id "x"))),
(Fun(Id "x",Var(Id "x")))), Eint 3) ) emptyenv;;
- : eval = Int 3
13
From the concrete to the
collecting semantics
 the concrete semantic evaluation function

sem: exp -> env -> eval
 the collecting semantic evaluation function




semc: exp -> env -> ℘(eval)
semc e r = {sem e r}
all the concrete primitive operations have to be lifted to
℘(eval) in the design of the abstract operations
there exist other (more concrete) collecting semantics
• semc': exp -> ℘(env -> eval)
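Lifting a primitive operation to ℘(eval) can be sketched as follows. This is only an illustration under explicit assumptions: sets are modelled as duplicate-free sorted lists, and the value type is restricted to first-order values (no Funval) so that elements can be compared; lift2 and plus_c are hypothetical names.

```ocaml
(* First-order fragment of eval, so that values can be compared. *)
type value = Int of int | Wrong

let plus (x, y) = match (x, y) with
  | Int nx, Int ny -> Int (nx + ny)
  | _ -> Wrong

(* Lift a binary operation to sets: apply it to every pair of elements. *)
let lift2 (op : value * value -> value) (xs : value list) (ys : value list) =
  List.sort_uniq compare
    (List.concat (List.map (fun x -> List.map (fun y -> op (x, y)) ys) xs))

(* The collecting version of plus. *)
let plus_c = lift2 plus
```

For instance, plus_c [Int 1; Int 2] [Int 10; Wrong] contains Int 11, Int 12 and Wrong, one result for each possible pair of concrete arguments.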
14
From the collecting to the
abstract semantics
■ concrete domain: (℘(ceval), ⊆)
■ concrete (non-collecting) environment:
cenv = ide -> ceval
■ abstract domain: (eval, ≤)
■ abstract environment: env = ide -> eval

■ the collecting semantic evaluation function
□ semc: exp -> env -> ℘(ceval)
■ the abstract semantic evaluation function

sem: exp -> env -> eval
15
Type abstract interpreter 1
 essentially the Hindley monotype abstract
interpreter


exact Herbrand abstraction of the Church/Curry
monotype semantics
can be made more precise (fixpoint computation) (?)
 principal types



monotypes with variables
which subsume all the other types
represented as Herbrand terms
• terms built on type variables
16
Monotypes with variables
 type evalt = Notype
| Vvar of string
| Intero
| Mkarrow of evalt * evalt
 the partial order relation (on equivalence classes of terms
modulo variance)
 anti-instance relation:
• t1 ≤ t2, if t2 is an instance of t1
• Notype is the top element
there exist infinite increasing chains
 we look for more general (principal) types


in least fixpoint computations, possible non termination
problems
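The anti-instance order can be decided by one-way matching. The sketch below is an assumption about how such a test could be written, not code from the original interpreter; matching and leq are hypothetical names.

```ocaml
type evalt = Notype | Vvar of string | Intero | Mkarrow of evalt * evalt

(* matching pat t s extends the bindings s so that pat, instantiated by s,
   equals t; only the variables of pat are bound. Returns None on failure. *)
let rec matching pat t s =
  match pat, t with
  | Vvar x, _ ->
      (match List.assoc_opt x s with
       | None -> Some ((x, t) :: s)
       | Some t' -> if t' = t then Some s else None)
  | Intero, Intero -> Some s
  | Notype, Notype -> Some s
  | Mkarrow (a1, b1), Mkarrow (a2, b2) ->
      (match matching a1 a2 s with
       | None -> None
       | Some s' -> matching b1 b2 s')
  | _ -> None

(* t1 <= t2: t2 is an instance of t1, or t2 is the top element Notype. *)
let leq t1 t2 = t2 = Notype || matching t1 t2 [] <> None
```

With this definition a variable is below every term, and Mkarrow (Vvar "a", Vvar "a") is below Mkarrow (Intero, Intero) but not below Mkarrow (Intero, Mkarrow (Intero, Intero)).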
17
Concrete and abstract domains
 type evalt = Notype
| Vvar of string
| Intero
| Mkarrow of evalt * evalt
■ t1 ≤ t2, if t2 is an instance of t1
 lub on evalt:

gci (greatest common instance), computed by unification
 glb on evalt:

lcg (least common generalization), computed by anti-unification
 even if evalt is not the final abstract domain, we relate it to the
concrete domain of the collecting semantics
■ concrete domain: (℘(ceval), ⊆, {}, ceval, ∩, ∪)
■ abstract domain: (evalt, ≤, Vvar(_), Notype, lcg, gci)
18
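The glb of the previous slide (lcg, computed by anti-unification) can be sketched as below. This is an illustrative version, not the original code; it assumes that the input terms do not already use the generated variable names g0, g1, ...

```ocaml
type evalt = Notype | Vvar of string | Intero | Mkarrow of evalt * evalt

(* lcg of two terms: common structure is kept, and each distinct pair of
   disagreeing subterms is replaced by one generalization variable. *)
let lcg (t1 : evalt) (t2 : evalt) : evalt =
  let table = ref [] in   (* (subterm of t1, subterm of t2) -> variable *)
  let rec go a b =
    match a, b with
    | Mkarrow (a1, a2), Mkarrow (b1, b2) ->
        let l = go a1 b1 in
        let r = go a2 b2 in
        Mkarrow (l, r)
    | _ when a = b -> a
    | _ ->
        (match List.assoc_opt (a, b) !table with
         | Some v -> v
         | None ->
             let v = Vvar ("g" ^ string_of_int (List.length !table)) in
             table := ((a, b), v) :: !table;
             v)
  in
  match t1, t2 with
  | Notype, t | t, Notype -> t   (* Notype is the top element *)
  | _ -> go t1 t2
```

Reusing the same variable for the same pair of subterms is what makes the result the least generalization: the lcg of Intero -> Intero and (Intero -> Intero) -> (Intero -> Intero) is g0 -> g0, not g0 -> g1.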
Concretization function
■ concrete domain: (℘(ceval), ⊆, {}, ceval, ∩, ∪)
■ abstract domain: (evalt, ≤, Vvar(_), Notype, lcg, gci)
■ gt(x) =
• ceval, if x = Notype
• { y | ∃z. y = Int(z) }, if x = Intero
• {}, if x = Vvar(_)
• { Funval(f) | ∀d ∈ gt(s). f(d) ∈ gt(t) }, if x = Mkarrow(s, t), s, t ground terms
• ∩ gt(m), for m ground instance of x, if x = Mkarrow(s, t), either s or t non-ground
19
Abstraction function
■ concrete domain: (℘(ceval), ⊆, {}, ceval, ∩, ∪)
■ abstract domain: (evalt, ≤, Vvar(_), Notype, lcg, gci)
■ at(y) = gci{
Notype, if Wrong ∈ y
Intero, if ∃z. Int(z) ∈ y
Vvar(_), if y = {}
lcg{s | Funval(f) ∈ gt(s)}, if Funval(f) ∈ y
}
■ at and gt
are monotonic
define a Galois connection
20
The abstraction of functions

given the concrete (non-collecting) operation
let rec makefun(ii,aa,r) = Funval(function d ->
if d = Wrong then Wrong
else sem aa (bind(r,ii,d)))

in the abstract version one should
• for each ground type ti
– bind d to ti
– compute the type si = sem aa (bind(r,ii,d))
• compute the glb of all the resulting functional types:
lcg ({Mkarrow(ti, si)})


this can be made effective by making a single evaluation,
starting from the bottom element
(wrong!) abstract operation
let rec makefun(ii,aa,r) = let d = newvar() in
let t = sem aa (bind(r,ii,d)) in
Mkarrow(d,t)
21
Type variables
 gt(Vvar(_)) = {}

a type variable represents the set of all the (concrete) values
which have every type, i.e., the empty set
 (fresh) type variables are introduced in the abstract
version of makefun

(wrong!) abstract operation
let rec makefun(ii,aa,r) = let d = newvar() in
let t = sem aa (bind(r,ii,d)) in
Mkarrow(d,t)
 the problem

the evaluation of the function body should (possibly) lead to an
instantiation of d
22
Towards constraints
 (wrong!) abstract operation
let rec makefun(ii,aa,r) = let d = newvar() in
let t = sem aa (bind(r,ii,d)) in
Mkarrow(d,t)
◆ Fun(Id "x", Sum(Var(Id "x"), Eint 1))

"x" is bound to a new variable Vvar "0" in the environment r
the expression Sum(Var(Id "x"), Eint 1) is evaluated in r
the abstract Sum operation needs to instantiate the type variable Vvar "0" to
Intero
• abstraction of the concrete type checking
 this can be achieved
 by forcing abstract operations to return an abstract value and an
abstract environment
• changing the structure of the concrete semantic evaluation function

by extending the abstract domain to pairs consisting of a term
(type) and a constraint on type variables
23
The real abstract domain
 type evalt = Notype
| Vvar of string
| Intero
| Mkarrow of evalt * evalt
type eval = evalt * (evalt * evalt) list
 the second component of an abstract value (the constraint)
represents a set of term equalities (equations)
 each abstract operation




combines the constraints in the arguments and updates the result with new constraints
checks the resulting constraint for satisfiability and transforms it to solved form (by
means of unification)
applies the constraint in solved form (substitution) to the type
returns the pair (type,constraint)
 the partial order on eval and the corresponding lub and glb
operations are obtained by lifting the definitions for evalt
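The slides use unifylist and applysubst without showing them. The sketch below is one possible way to write them, given as an assumption rather than as the original implementation: a constraint is a list of equations, and its solved form is a list of bindings (Vvar x, term) in which no bound variable occurs on a right-hand side. The helpers apply, occurs and solve are hypothetical.

```ocaml
type evalt = Notype | Vvar of string | Intero | Mkarrow of evalt * evalt
type subst = Fail | Subst of (evalt * evalt) list

(* Apply a solved form (bindings of the shape (Vvar x, term)) to a type. *)
let rec apply (s : (evalt * evalt) list) (t : evalt) : evalt =
  match t with
  | Vvar _ -> (match List.assoc_opt t s with Some u -> u | None -> t)
  | Mkarrow (a, b) -> Mkarrow (apply s a, apply s b)
  | Intero | Notype -> t

let applysubst (sigma : subst) (t : evalt) : evalt =
  match sigma with Fail -> Notype | Subst s -> apply s t

(* Occurs check: does variable x occur in t? *)
let rec occurs x t =
  match t with
  | Vvar y -> x = y
  | Mkarrow (a, b) -> occurs x a || occurs x b
  | Intero | Notype -> false

(* Solve the equations one at a time, keeping s in solved form. *)
let rec solve eqs s =
  match eqs with
  | [] -> Subst s
  | (a, b) :: rest ->
      (match apply s a, apply s b with
       | a', b' when a' = b' -> solve rest s
       | Vvar x, t | t, Vvar x ->
           if occurs x t then Fail
           else
             let bind1 = [ (Vvar x, t) ] in
             let s' = List.map (fun (v, u) -> (v, apply bind1 u)) s in
             solve rest ((Vvar x, t) :: s')
       | Mkarrow (a1, b1), Mkarrow (a2, b2) ->
           solve ((a1, a2) :: (b1, b2) :: rest) s
       | _ -> Fail)

let unifylist (eqs : (evalt * evalt) list) : subst = solve eqs []
```

The occurs check is what makes an equation such as Vvar "a" = Mkarrow (Vvar "a", Vvar "b") unsatisfiable, so that the corresponding abstract value becomes Notype.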
24
Two abstract operations
let plus ((v1,c1),(v2,c2)) =
let sigma = unifylist((v1,Intero) :: (v2,Intero) :: (c1 @ c2)) in
match sigma with
|Fail -> (Notype,[])
|Subst(s) -> (Intero,s)
let rec makefun(ii,aa,r) =
let f1 =newvar() in
let f2 =newvar() in
let body = sem aa (bind(r,ii,(f1,[]))) in
(match body with (t,c) ->
let sigma = unifylist( (t,f2) :: c) in
(match sigma with
|Fail -> (Notype,[])
|Subst(s) -> ((applysubst sigma (Mkarrow(f1,f2))),s)))
25
Merge and function application
let gci ((v1,c1),(v2,c2)) = let sigma = unifylist((v1,v2) :: (c1 @ c2)) in
match sigma with
|Fail -> (Notype,[])
|Subst(s) -> (applysubst sigma v1,s)
let merge (a,b,c) = match a with
|(Notype,_) -> (Notype,[])
|(v0,c0) -> let sigma = unifylist((v0,Intero)::c0) in
match sigma with
|Fail -> (Notype,[])
|Subst(s) -> match gci(b, c) with
|(Notype,_) -> (Notype,[])
|(v1,c1) -> let sigma1 = unifylist(c1@s) in
match sigma1 with
|Fail -> (Notype,[])
|Subst(s1) -> (applysubst sigma1 v1,s1)
let applyfun ((v1,c1),(v2,c2)) = let f1 =newvar() in let f2 =newvar() in
let sigma = unifylist((v1,Mkarrow(f1,f2)) :: (v2,f1) :: (c1 @ c2)) in
match sigma with
|Fail -> (Notype,[])
|Subst(s) -> (applysubst sigma f2,s)
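To see the abstract operations working together, here is a self-contained sketch of a monotype interpreter for a small fragment of the language (Eint, Var, Sum, Fun, Appl). It follows the pattern of the operations above but is not the original code: identifiers are plain strings, the environment is an association list, unification is a simple assumed implementation, and the helper result is hypothetical.

```ocaml
type exp = Eint of int | Var of string | Sum of exp * exp
         | Fun of string * exp | Appl of exp * exp
type evalt = Notype | Vvar of string | Intero | Mkarrow of evalt * evalt
type subst = Fail | Subst of (evalt * evalt) list
type eval = evalt * (evalt * evalt) list

let counter = ref 0
let newvar () = incr counter; Vvar ("var" ^ string_of_int !counter)

let rec apply s t = match t with
  | Vvar _ -> (match List.assoc_opt t s with Some u -> u | None -> t)
  | Mkarrow (a, b) -> Mkarrow (apply s a, apply s b)
  | _ -> t

let rec occurs x = function
  | Vvar y -> x = y
  | Mkarrow (a, b) -> occurs x a || occurs x b
  | _ -> false

(* Unification with occurs check; any equation involving Notype fails. *)
let rec solve eqs s = match eqs with
  | [] -> Subst s
  | (a, b) :: rest ->
      (match apply s a, apply s b with
       | Notype, _ | _, Notype -> Fail
       | a', b' when a' = b' -> solve rest s
       | Vvar x, t | t, Vvar x ->
           if occurs x t then Fail
           else
             let s' = List.map (fun (v, u) -> (v, apply [ (Vvar x, t) ] u)) s in
             solve rest ((Vvar x, t) :: s')
       | Mkarrow (a1, b1), Mkarrow (a2, b2) ->
           solve ((a1, a2) :: (b1, b2) :: rest) s
       | _ -> Fail)
let unifylist eqs = solve eqs []

(* Solve the combined constraint and apply the solved form to the type. *)
let result t eqs : eval = match unifylist eqs with
  | Fail -> (Notype, [])
  | Subst s -> (apply s t, s)

let rec sem e r : eval = match e with
  | Eint _ -> (Intero, [])
  | Var x -> (match List.assoc_opt x r with Some v -> v | None -> (Notype, []))
  | Sum (a, b) ->
      let (v1, c1) = sem a r in
      let (v2, c2) = sem b r in
      result Intero ((v1, Intero) :: (v2, Intero) :: (c1 @ c2))
  | Fun (x, a) ->
      let f1 = newvar () in
      (match sem a ((x, (f1, [])) :: r) with
       | (Notype, _) -> (Notype, [])
       | (t, c) -> result (Mkarrow (f1, t)) c)
  | Appl (a, b) ->
      let (v1, c1) = sem a r in
      let (v2, c2) = sem b r in
      let f1 = newvar () in
      let f2 = newvar () in
      result f2 ((v1, Mkarrow (f1, f2)) :: (v2, f1) :: (c1 @ c2))
```

In this sketch function x -> x + 1 is given the type Mkarrow (Intero, Intero), because the constraint produced by the abstract Sum instantiates the fresh parameter variable, while function x -> x x is mapped to Notype by the occurs check.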
26
Abstract least fixpoint computation
let makefunrec (i, e1, r) =
alfp ((newvar(),[]), i, e1, r )
let rec alfp (ff, i, e1, r ) =
let tnext = funzionale ff i e1 r in
(match tnext with
|(Notype, _) -> (Notype,[])
|_ -> if abstreq(tnext,ff) then ff
else
alfp(tnext, i, e1, r ) )
 because of infinite increasing chains, the fixpoint
computation may diverge (an example later)

we need a widening operator computing an upper approximation of the
lfp
27
Abstract least fixpoint computation
let makefunrec (i, e1, r) =
alfp ((newvar(),[]), i, e1, r, k)
let rec alfp (ff , i, e1, r, n) =
let tnext = funzionale ff i e1 r in
(match tnext with
|(Notype, _) -> (Notype,[])
|_ -> if abstreq(tnext,ff) then ff
else
(if n = 0 then
widening(ff,tnext)
else alfp(tnext, i, e1, r, n-1) ) )
let widening ((f1,c1),(t,c)) =
let sigma = unifylist( (t,f1) :: (c@c1)) in
(match sigma with
|Fail -> (Notype,[])
|Subst(s) -> (applysubst sigma t,s))
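The control structure of alfp does not depend on the type domain. The sketch below isolates it as a generic function and runs it on a toy domain; the parameter names step, equal and widen, and the toy domain itself (int option, with None standing for the top element), are assumptions made for illustration only.

```ocaml
(* Iterate step from x. Stop when a fixpoint is reached; when the counter n
   reaches 0, give up and return widen x next. A negative n never widens,
   so the iteration may then diverge. *)
let rec alfp step equal widen x n =
  let next = step x in
  if equal next x then x
  else if n = 0 then widen x next
  else alfp step equal widen next (n - 1)

(* Toy domain: the chain Some 0 < Some 1 < ... < Some 10, with None as top. *)
let step = function Some v -> Some (min 10 (v + 1)) | None -> None
let widen _ _ = None

let with_k3 = alfp step ( = ) widen (Some 0) 3
let with_k20 = alfp step ( = ) widen (Some 0) 20
let without_widening = alfp step ( = ) widen (Some 0) (-1)
```

Here with_k3 is None (the widening fires before the fixpoint is reached), while with_k20 and without_widening are both Some 10. The second case corresponds to k = -1 in the slides, where no widening is ever applied and termination is not guaranteed in general.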
28
Mutual recursion
let makemutrec ((i1,e1), (i2,e2), r) =
let (v1,v2) = alfpm (((newvar(),[]),(newvar(),[])), (i1,i2), (e1,e2), r, k)
in bind(bind(r,i1,v1),i2,v2)
let rec alfpm ((ff1,ff2), (i1,i2), (e1,e2), r, n) =
let r1 = bind(bind(r,i1,ff1),i2,ff2) in
let tnext1 = sem e1 r1 in
let tnext2 = sem e2 r1 in
(match (tnext1, tnext2) with
|((Notype, _), _) -> ((Notype,[]), (Notype,[]))
|(_,(Notype, _)) -> ((Notype,[]), (Notype,[]))
|_ -> if abstreq(tnext1,ff1) && abstreq(tnext2,ff2) then
(ff1,ff2)
else
(if n = 0 then
widening(ff1, ff2, tnext1, tnext2)
else alfpm((tnext1,tnext2), (i1,i2), (e1,e2), r, n-1) ) )
 with the straightforward extension of widening
29
Abstract least fixpoint computation
 both Hindley’s and ML’s type inference algorithms do not try
to compute the fixpoint and simply perform the widening
(unification) at step 1
 abstreq checks the two terms in the abstract values for
variance

not always correct when there are free variables
• solved in the next abstract domain
 the abstract semantic evaluation function is left unchanged

we just add an external function which sets the widening parameter k
let sem1 (e:exp) (k:int) =
………
in sem e emptyenv
30
Examples 1: non-termination
# let rec f x = f in f;;
This expression has type 'a -> 'b but is here used with type 'b
# sem1 (Rec(Id "f",Fun(Id "x",Var(Id "f")))) 0;;
- : eval = Notype, []
# sem1 (Rec(Id "f",Fun(Id "x",Var(Id "f")))) 5;;
- : eval = Notype, []
# sem1 (Rec(Id "f",Fun(Id "x",Var(Id "f")))) (-1);;
Interrupted.
31
2: “easy” recursion
# let fact =
Rec ( Id("pfact"), Fun(Id("x"),
Ifthenelse(Diff(Var(Id("x")),Eint(1)), Eint(1),
Sum(Var(Id("x")),
Appl(Var(Id("pfact")),Diff(Var(Id("x")),Eint(1)))))))
…..
# sem1 fact 0;;
- : eval = Mkarrow (Intero, Intero),[…]
# sem1 fact (-1);;
- : eval = Mkarrow (Intero, Intero),[…]
32
3: non-typable Cousot’s function
# let rec f f1 g n x = if n=0 then g(x)
else f(f1)(function x -> (function h -> g(h(x))))
(n-1) x f1;;
This expression has type ('a -> 'a) -> 'b but is here used with type 'b
# let monster = Rec (Id "f",Fun(Id "f1", Fun(Id "g", Fun(Id "n", Fun(Id "x",
Ifthenelse(Var(Id "n"),Appl(Var(Id "g"),Var(Id "x")),
Appl(Appl(Appl(Appl(Appl(Var(Id "f"),Var(Id "f1")),
Fun(Id "x",Fun(Id "h", Appl(Var(Id "g"),Appl(Var(Id "h"),Var(Id
"x")))))),Diff(Var(Id "n"),Eint 1)),Var(Id "x")),Var(Id "f1"))))))));;
…...
# sem1 monster 0;;
- : eval = Notype, []
# sem1 monster 1;;
- : eval = Notype, []
# sem1 monster 2;;
- : eval = Mkarrow (Mkarrow (Vvar "var43", Vvar "var43"),
Mkarrow (Mkarrow (Vvar "var43", Vvar "var36"),
Mkarrow (Intero, Mkarrow (Vvar "var43", Vvar "var36")))),
[...]
 same result for k=-1 (lfp)
33
3: failure in widening
(Vvar "var0", [])
(initial approximation)
(Mkarrow(Vvar "var17", Mkarrow(Mkarrow (Vvar "var15", Vvar "var8"),
Mkarrow (Intero, Mkarrow (Vvar "var15", Vvar "var8")))),
[.…; Vvar "var0",
Mkarrow (Vvar "var17", Mkarrow(Mkarrow (Vvar "var13",
Mkarrow (Mkarrow (Vvar "var13", Vvar "var15"), Vvar "var8")),
Mkarrow(Intero, Mkarrow (Vvar "var15", Mkarrow (Vvar "var17", Vvar
"var8"))))); .......])
(Mkarrow (Mkarrow (Vvar "var43", Vvar "var43"),Mkarrow
(Mkarrow (Vvar "var43", Vvar "var36"),
Mkarrow (Intero, Mkarrow (Vvar "var43", Vvar "var36")))),
[...…; Vvar "var8", Mkarrow (Mkarrow (Vvar "var43", Vvar "var43"), Vvar
"var36"); ...])
(Mkarrow (Mkarrow (Vvar "var43", Vvar "var43"),
Mkarrow (Mkarrow (Vvar "var43", Vvar "var36"),
Mkarrow (Intero, Mkarrow (Vvar "var43", Vvar "var36")))), [...])
34
4: successful widening loses precision
# let f1 x = x in let g x = f1 (1+x) in f1;;
- : 'a -> 'a = <fun>
# let rec f x = x and g x = f (1+x) in f;;
- : int -> int = <fun>
# let f1 = Let(Id "f",Fun(Id "x",Var(Id "x")), Let(Id "g",Fun(Id "x",Appl(Var(Id
"f"),Sum(Eint 1,Var(Id "x")))), Var(Id "f")));;
….
# sem1 f1 (-1);;
- : eval = Mkarrow (Vvar "var1", Vvar "var1"), […..]
# let f = Letmutrec((Id "f",Fun(Id "x",Var(Id "x"))),
(Id "g",Fun(Id "x",Appl(Var(Id "f"),Sum(Eint 1,Var(Id "x"))))), Var(Id "f"));;
….
# sem1 f (-1);;
- : eval = Mkarrow (Vvar "var9", Vvar "var9"), […..]
# sem1 f 0;;
- : eval = Mkarrow (Intero, Intero), […...]
35
5: no let-polymorphism
# let f x = x in f f;;
- : '_a -> '_a = <fun>
# sem1 (Let(Id "f",Fun(Id "x", Var(Id "x")),Appl(Var(Id "f"),Var(Id "f")))) 0;;
- : eval = Notype, []
36
Polymorphism 1
 evaluation of an expression e in an environment containing
an association between an identifier n and a type t



Let(i,e1,e2): evaluation of e2 with a new association for i
Rec(f,e): evaluation of e with a new association for f (function name)
Fun(x,e): evaluation of e with a new association for x (parameter)
 if t contains (type) variables, different occurrences of n in e
can use different instances of t
 let-polymorphism

applies only to associations created by Let
 polymorphic recursion

applies to recursive calls of f in e
 polymorphic abstraction

applies to the occurrences of the formal parameter x in e
37
Polymorphism 2
 ML provides let polymorphism only

similar to our second abstract interpreter
 our third interpreter will handle polymorphic recursion as
well
 I was not able to handle polymorphic abstraction within our
approach

Herbrand abstraction and principal types
 it can be handled using sets of types as abstract domain


see Cousot’s paper
abstract semantics rather than effective abstract interpreter
38
Towards polymorphism
 the basic mechanism to allow polymorphism is to represent
types as universally quantified (closed) terms

whenever the environment is applied to an identifier it returns a
renamed version of the type (using fresh variables)
 unfortunately, types are not closed terms

they are in general open terms, in which some variables (free
variables) cannot be renamed
 free variables can be determined from the current
environment

they are exactly those variables which occur in the environment
 all the remaining variables are bound and therefore
explicitly universally quantified
39
Parametric polytypes 1
type evalt = Notype
| Vvar of string
| Intero
| Mkarrow of evalt * evalt
type tscheme = Forall of (string list) * evalt
type eval = tscheme * (evalt * evalt) list
 the new operations

val instance: eval -> eval
• returns a new abstract value, whose tscheme is the most general instance of the input
one
– a renaming with fresh variables replacing universally quantified variables

val generalize : evalt -> subst -> env -> tscheme
• generalizes the input type by returning a tscheme in which all the variables which do
not occur in the current environment (as determined by the environment and the
current substitution) are universally quantified
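The two operations can be sketched as follows. The signatures are deliberately simplified with respect to the ones listed above, and this is an assumption of the sketch: instance works on a tscheme rather than on an eval, and generalize receives directly the list of variables occurring in the environment (after applying the current substitution) instead of the substitution and the environment. The helpers vars and rename are hypothetical.

```ocaml
type evalt = Notype | Vvar of string | Intero | Mkarrow of evalt * evalt
type tscheme = Forall of string list * evalt

let counter = ref 0
let newvar () = incr counter; "fresh" ^ string_of_int !counter

(* Variables occurring in a type, possibly with repetitions. *)
let rec vars t = match t with
  | Vvar x -> [ x ]
  | Mkarrow (a, b) -> vars a @ vars b
  | Intero | Notype -> []

(* Rename variables according to the association list m. *)
let rec rename m t = match t with
  | Vvar x -> (match List.assoc_opt x m with Some y -> Vvar y | None -> t)
  | Mkarrow (a, b) -> Mkarrow (rename m a, rename m b)
  | Intero | Notype -> t

(* instance: each quantified variable is replaced by a fresh one;
   free variables are left untouched. *)
let instance (Forall (qs, t)) =
  let m = List.map (fun q -> (q, newvar ())) qs in
  Forall ([], rename m t)

(* generalize: quantify the variables of t that do not occur in env_vars. *)
let generalize (t : evalt) (env_vars : string list) : tscheme =
  let qs = List.filter (fun x -> not (List.mem x env_vars)) (vars t) in
  Forall (List.sort_uniq compare qs, t)
```

For example, generalizing Mkarrow (Vvar "a", Vvar "b") when "b" occurs in the environment quantifies only "a", and every later instance renames "a" while keeping "b" shared.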
40
Parametric polytypes 2
type evalt = Notype
| Vvar of string
| Intero
| Mkarrow of evalt * evalt
type tscheme = Forall of (string list) * evalt
type eval = tscheme * (evalt * evalt) list
 the new environment

to make the application of substitutions easier
type env = (ide * eval) list
let emptyenv = []
let rec bind ((r:env),i,t) = match r with
| [] -> [(i,t)]
| (j,t1):: r1 -> if i=j then (i,t):: r1 else (j,t1):: (bind(r1,i,t))
let rec applyenv ((r:env),i) = match r with
|[] -> (Forall([],Notype),[])
|(j,t):: r1 -> if i=j then instance(t) else applyenv(r1,i)
41
Type abstract interpreter 2
 partial order, glb and lub on the new abstract domain are
similar to the previous ones
 we might have used the same domain in the monotype
interpreter


without taking instances in applyenv
abstreq could (correctly) check variance of universally quantified
variables only
 the new abstract operations are the straightforward
adaptation of the previous ones


quantification prefixes are ignored
computed type schemes have usually an empty list of quantified
variables
42
Some abstract operations
let plus ((Forall(_,v1),c1),(Forall(_,v2),c2)) =
let sigma = unifylist((v1,Intero) :: (v2,Intero) :: (c1 @ c2)) in
match sigma with
|Fail -> (Forall([],Notype),[])
|Subst(s) -> (Forall([],Intero),s)
let rec makefun (ii,aa,r) = let f1 =newvar() in
let f2 =newvar() in
let body = sem aa (bind(r,ii,(Forall([],f1),[]))) in
(match body with
| (Forall(_,t),c) -> let sigma = unifylist( (t,f2) :: c) in
(match sigma with
|Fail -> (Forall([],Notype),[])
|Subst(s) -> (Forall([],applysubst sigma (Mkarrow(f1,f2))),s)))
43
Let-polymorphism
 we only use type generalization in the semantics of
the let construct
let rec sem (e:exp) (r:env) = match e with
….
| Let(i,e1,e2) -> let (s,c) = sem e1 r in
match s with
|Forall(_,Notype) -> (Forall([],Notype),[])
|Forall(_,t) -> let t1 = generalize t (Subst c) r in
sem e2 (bind (r, i, (t1,c)))
44
Typings
■ type systems specified by inference rules usually
infer a typing for a given expression

a pair (t,r)
• t is a (non-quantified) type expression
• r is an environment
■ this allows one to assign a typing to open lambda expressions containing references to global names

the result gives us constraints on the global environment
 our type interpreter 2 has been modified so as to
return a typing

the “external function” sem1 takes an additional
argument
• the list of global names
45
Examples 1
# let f x = x in f f;;
- : '_a -> '_a = <fun>
# sem1 (Let(Id "f",Fun(Id "x", Var(Id "x")),Appl(Var(Id "f"),Var(Id "f")))) [] 0;;
- : evalt * (string * evalt) list = Mkarrow (Vvar "var2", Vvar "var2"), []
# let f x y = x in f (f 2 3)(f 3 (f 2 ));;
- : int = 2
# sem1 (Let(Id "f", Fun (Id "x",Fun(Id "y",Var(Id "x"))), Appl(Appl(Var(Id "f"),
Appl(Appl(Var(Id "f"),Eint 2),Eint 3)), Appl(Appl(Var(Id "f"),Eint 3), Appl(Var(Id "f"), Eint
2))))) [] 0;;
- : evalt * (string * evalt) list = Intero, []
46
Examples 2
 no news for the monster function

as expected, since it does not contain any Let
# sem1 monster [] 0;;
- : evalt * (string * evalt) list = Notype, []
# sem1 monster [] 1;;
- : evalt * (string * evalt) list = Notype, []
# sem1 monster [] 2;;
- : evalt * (string * evalt) list = Mkarrow(Mkarrow (Vvar "var43", Vvar "var43"),
Mkarrow(Mkarrow (Vvar "var43", Vvar "var36"),
Mkarrow(Intero, Mkarrow (Vvar "var43", Vvar "var36")))), []
47
Examples 3
 inferring typings for “open” expressions

not typable in ML
# sem1(Appl(Fun(Id "x", Appl(Var(Id "x"),Var(Id "y"))), Fun (Id "x", Var(Id "x")))) ["y"] 0;;
- : evalt * (string * evalt) list = Vvar "var9", ["y", Vvar "var9"]
# sem1(Rec(Id "times", Fun(Id "x", Fun (Id "y", Ifthenelse(Var(Id "x"), Var(Id "z"),
Sum(Var(Id "y"),Appl(Appl(Var(Id "times"),
Diff(Var(Id "x"),Eint 1)),Var(Id
"y")))))))) ["z"] 0;;
- : evalt * (string * evalt) list = Mkarrow (Intero, Mkarrow (Intero, Intero)), ["z", Intero]
48
Examples 4
 still no polymorphic recursion
# let rec polyf x y = if x=0 then 0 else if (x-1)=0 then (polyf (x-1)) (function z -> z)
else (polyf (x-2)) 0;;
This expression has type int but is here used with type 'a -> 'a
# sem1(Rec (Id "polyf", Fun (Id "x", Fun (Id "y", Ifthenelse (Var (Id "x"), Eint 0, Ifthenelse (
Diff (Var (Id "x"), Eint 1), Appl (Appl (Var (Id "polyf"), Diff (Var (Id "x"), Eint 1)), Fun (Id
"z", Var (Id "z"))), Appl(Appl (Var (Id "polyf"), Diff (Var (Id "x"), Eint 2)), Eint 0))))))) []
(-1);;
- : evalt * (string * evalt) list = Notype, []
49
Type abstract interpreter 3
■ polymorphic recursion
■ same abstract domain as interpreter 2
 the essential feature

the abstract values denoted by the recursive function name within
the fixpoint computation need to be generalized
• and therefore correctly universally quantified
so as to allow different instantiations in each iteration of the
evaluation of the function body
 we have decided to generalize all the computed type
schemes



universally quantified types in the result
no special handling of Let
fresh variables used in the semantics of functional abstraction are
still free variables
50
Abstract semantics of recursive
functions
let rec makefunrec (i,e1,r) = let f1 =newvar() in (match f1 with Vvar(x) ->
alfp((Forall([x],f1),[]), i, e1, r, k))
and alfp (ff ,i ,e1 ,r, n) = let tnext = sem e1 (bind(r,i,ff)) in
(match tnext with
|(Forall(_,Notype),_) -> (Forall([],Notype),[])
|(Forall(_,t),c) -> let t1 = generalize t (Subst c) r in
if abstreq((t1,c),ff) then ff else
(if n = 0 then widening(ff,(t1,c),r) else alfp ((t1,c) ,i,e1 ,r, n-1) ))

note that the ML widening (k=0) would always succeed,
computing an incorrect type
51
Generalizations
 generalization is performed before returning the abstract
value in




merge
applyfun
widening
makefun
let rec makefun (ii,aa,r) = let f1 =newvar() in
let f2 =newvar() in
let body = sem aa (bind(r,ii,(Forall([],f1),[]))) in
(match body with
| (Forall(_,t),c) -> let sigma = unifylist( (t,f2) :: c) in
(match sigma with
|Fail -> (Forall([],Notype),[])
|Subst(s) -> let t1 = applysubst sigma (Mkarrow(f1,f2)) in
(generalize t1 sigma r,s)))
 generalization is removed from the semantics of Let

it becomes the same as in interpreter 1
52
Examples 1
 let polymorphism still ok without ad-hoc semantics
# sem1 (Let(Id "f",Fun(Id "x", Var(Id "x")),Appl(Var(Id "f"),Var(Id "f")))) 0;;
- : eval = Forall (["var2"], Mkarrow (Vvar "var2", Vvar "var2")), [...]
 polymorphic recursion
# let rec polyf x y = if x=0 then 0 else if (x-1)=0 then (polyf (x-1)) (function z -> z)
else (polyf (x-2)) 0;;
# sem1(Rec (Id "polyf", Fun (Id "x", Fun (Id "y", Ifthenelse (Var (Id "x"), Eint 0, Ifthenelse
(Diff (Var (Id "x"), Eint 1), Appl (Appl (Var (Id "polyf"), Diff (Var (Id "x"), Eint 1)), Fun
(Id "z", Var (Id "z"))), Appl(Appl (Var (Id "polyf"), Diff (Var (Id "x"), Eint 2)), Eint
0))))))) [] (-1);;
- : eval = Forall (["var3"], Mkarrow (Intero, Mkarrow (Vvar "var3", Intero))), [...]
53
Examples 2
 improvement in the fixpoint computation (more precise
widening)
# sem1 monster 1;;
- : eval = Forall (["var15"; "var8"],
Mkarrow(Mkarrow (Vvar "var15", Vvar "var15"),
Mkarrow(Mkarrow (Vvar "var15", Vvar "var8"),
Mkarrow (Intero, Mkarrow (Vvar "var15", Vvar "var8"))))), [...]
54
No polymorphic abstraction
 universally quantifying the variable bound to the formal parameter
in the lambda abstraction does not work
 with polymorphic abstraction the semantics of the following
constructs should be equivalent (no need for let)
 let x = e1 in e2 and (function x -> e2) e1
 this is not the case as shown by the following example
# (function x -> x x) (function x -> x)
This expression has type 'a -> 'b but is here used with type 'a
# let x = function x -> x in x x
- : '_a -> '_a = <fun>
# sem1 (Appl(Fun(Id "x", Appl(Var(Id "x"),Var(Id "x"))),(Fun(Id "x",Var(Id "x")))) ) 0;;
- : eval = Forall ([], Notype), []
# sem1 (Let(Id "x",Fun(Id "x",Var(Id "x")),Appl(Var(Id "x"),Var(Id "x"))) ) 0;;
- : eval = Forall (["var2"], Mkarrow (Vvar "var2", Vvar "var2")), [...]
55
Relation to type systems
■ traditional type systems are specified by providing

a notion of type
type assertions of the form: expression ⇒ type
• Eint(x) ⇒ Intero

a notion of judgment of the form: environment ⊢ assertion
• the environment entails the assertion
• r ⊢ Eint(x) ⇒ Intero

a set of type rules which assert the validity of certain judgments on the basis
of other judgments

r ⊢ Eint(x) ⇒ Intero

r ⊢ e1 ⇒ Intero
r ⊢ e2 ⇒ Intero
----------------------------------------------------
r ⊢ Sum(e1,e2) ⇒ Intero

we will look at some of the rules corresponding to the let polymorphic abstract semantics
with 0-widening

ML and Damas-Milner’s semantics
56
The Damas-Milner type system
■ types

terms with type variables, generalized in the environment when created by
the let construct
■ some rules

r ⊢ Eint(x) ⇒ Intero

t ← elim(applyenv(r,i))
------------------------------------------
r ⊢ Var(i) ⇒ t

r ⊢ e1 ⇒ Intero
r ⊢ e2 ⇒ Intero
----------------------------------------------------
r ⊢ Sum(e1,e2) ⇒ Intero

r[x ← t1] ⊢ e ⇒ t2
----------------------------------------------------
r ⊢ Fun(x,e) ⇒ Mkarrow(t1, t2)

r[f ← Mkarrow(t1, t2)] ⊢ Fun(x,e) ⇒ Mkarrow(t1, t2)
--------------------------------------------------------------------
r ⊢ Rec(f,Fun(x,e)) ⇒ Mkarrow(t1, t2)
57
The Damas-Milner type system 2

r ⊢ e1 ⇒ Intero
r ⊢ e2 ⇒ Intero
----------------------------------------------------
r ⊢ Sum(e1,e2) ⇒ Intero
• ok for type checking
• for type inference, judgments have to be understood as going from
expressions to pairs (environment,type), i.e.
– we need to compute an environment making the judgment valid
– the algorithm is not straightforward

r[x ← t1] ⊢ e ⇒ t2
----------------------------------------------------
r ⊢ Fun(x,e) ⇒ Mkarrow(t1, t2)

r[f ← Mkarrow(t1, t2)] ⊢ Fun(x,e) ⇒ Mkarrow(t1, t2)
--------------------------------------------------------------------
r ⊢ Rec(f,Fun(x,e)) ⇒ Mkarrow(t1, t2)
• even worse: no hints on how to choose t1 and on how to solve the
recursive definition
58
Type systems vs. abstract interpreters
■ for a given notion of type
 a type system
• is easier to specify
• it may be quite complex to move from the rules to the type
inference algorithm
• needs some way of relating it to the semantics, to show the
correctness (for example subject reduction)

an abstract interpreter
• may require an abstract domain more complex than just
the type
• is directly a type inference algorithm
• is correct by construction
59