Quasi-Linear Types
Naoki Kobayashi
Department of Information Science, University of Tokyo
email:[email protected]
linear pair, the second element y of the pair can be destructively updated to y + 1. Similarly, the “append” of linear
lists can be performed destructively, without copying cons
cells. This kind of improvement will be especially effective
for functional programs, because many intermediate data
structures like cons cells and closures are created in naive
functional programs.
Abstract
Linear types (types of values that can be used just once)
have been drawing a great deal of attention because they
are useful for memory management, in-place update of data
structures, etc.: an obvious advantage is that a value of a
linear type can be immediately deallocated after being used.
However, the linear types have not been applied so widely
in practice, probably because linear values (values of linear
types) in the traditional sense do not so often appear in actual programs. In order to increase the applicability of linear
types, we relax the condition of linearity by extending the
types with information on an evaluation order and simple
dataflow information. The extended type system, called a
quasi-linear type system, is formalized and its correctness
is proved. We have implemented a prototype type inference system for the core-ML that can automatically find
out which value is linear in the relaxed sense. Promising results were obtained from preliminary experiments with the
prototype system.
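1 Introduction
1.2 Limitation of linear types
In spite of the above-mentioned benefit from linear type systems, they do not seem to have been used so widely, even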
after type inference algorithms [8,9,22] were proposed that
can automatically find which values are linear.
We think one of the major reasons for this is that
linear type systems are too naive for the above application: what is actually important for automatic deallocation and in-place update is to identify which is the last
access to a heap value; the condition of linearity is too
strong for this purpose. For example, consider an expression M ≡ let x = 2.0 in let y = x + 1.0 in y + x: it is easy
to see that y + x is the last access to x, but linear type systems cannot insert code to deallocate x - just because
x is used twice! This limitation also forces a programmer
to follow a particular programming style. For instance, in
λz.(fst(z), snd(z) + 1), the pair z is judged to be accessed
twice; so, if one wants the pair to be deallocated, one must
write λ(x, y).(x, y + 1) instead.
1.1 Linear types
A number of type systems based on Girard’s linear logic [6]
have been proposed [1, 3, 9, 13, 14, 22].
They guarantee that
certain data structures (called linear values) are accessed
just once (or at most once). The distinction between linear values and other values brings us several benefits: improvement of memory management [1], safe inlining [22], etc.
Among them, we are here interested in the improvement of
memory management. If some data structure is statically
known to be linear, it can be deallocated immediately after it is accessed. Moreover, if another heap value needs
to be created after a linear value is accessed, we can just
replace the linear value with the new value, instead of deallocating the linear value and allocating space for the new
data (provided that the physical size of the new value is not
greater than that of the linear value). For instance (throughout this paper, we assume a call-by-value language), when
the function λ(x, y).(x, y + 1) is called, if the argument is a
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
POPL 99 San Antonio Texas USA
Copyright ACM 1999 1-58113-095-3/99/01...$5.00

1.3 Our proposal - almost linear types (quasi-linear types)
In order to remove the above-mentioned limitation of linear
types, we relax the condition of linearity.
A key idea is to express information on the evaluation
order and dataflow by using types. Recall the above expression M: if a type system can take into account
that x + 1.0 is evaluated before y + x, it can ignore the
use of x in x + 1.0 and treat the above expression in the
same way as let x = 2.0 in let y = 1.0 in y + x. In order to obtain such information on the evaluation order,
the type system should also be able to deal with some
dataflow information. Consider, for example, an expression
let y = f(x) in g(x) + h(y). f(x) is evaluated before g(x),
but this does not imply that the access to x in g(x) is the
last one: x may be returned by f(x) and accessed again in
h(y). In order to deal with this, we distinguish between the
type of a value that must be used up in an expression and
the type of a value that may be returned as a part of the
evaluation result of an expression.
It would be possible to obtain the above kinds of information by using more sophisticated analyses, but we instead
obtain it by a very simple method - by introducing a new
use [9,22]. Advantages of that approach include: (1) the
whole analysis can be formalized uniformly in terms of a
type system (this is especially advantageous when we introduce polymorphism) and (2) a program containing free
variables can be analyzed as long as their type information is
provided, which is useful for modular (or incremental) analysis and separate compilation of programs. We first review
the ideas of previous linear type systems [9,22] and then
overview the ideas of our system below.
1.3.1 Review of linear type systems
with use 1 (hereafter, we call it a quasi-linear value or simply a linear value, and call a use-once value in the traditional
sense a strictly linear value) can now be accessed in a more
relaxed manner: it can be accessed many times as a δ-value,
and then it can be accessed as a strictly linear value (i.e., as
a value that can be accessed once and deallocated).
In order to guarantee such usage of a value, it is crucial
that the type system can take the evaluation order into account. The type derivation shown above is now changed as
follows:
\[
\frac{x : \mathit{real}^{\delta} \vdash x + 1.0 : \mathit{real}^{1} \qquad x : \mathit{real}^{1},\ y : \mathit{real}^{1} \vdash x + y : \mathit{real}^{1}}
     {x : \mathit{real}^{\delta};\ x : \mathit{real}^{1}\ (= x : \mathit{real}^{1}) \vdash \mathbf{let}\ y = x + 1.0\ \mathbf{in}\ x + y : \mathit{real}^{1}}
\]
Here, the new combination (x : real^δ); (x : real^1) of type environments captures the fact that x is first accessed as a
δ-value in x + 1.0, and then it is accessed as a linear value in
x + y. Thus, x can be deallocated after x + y is evaluated.
Because a quasi-linear value may be used more than once,
in addition to allocation of a heap value, each access to a
heap value is also annotated with a use: it is for distinguishing between the last access of the value and the other
accesses. For example, the expression M is annotated as
let x^1 = 2.0 in let y^1 = x^δ + 1.0 in y^1 + x^1 (actually, we
will annotate constructors and destructors of heap values
instead of variables). Here, we annotate x in y + x with 1 in
order to indicate that it is the last access, while x in x + 1.0
is annotated with δ in order to indicate that it is not the
last one.
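The annotation of M described above can be illustrated with a minimal sketch. It assumes a straight-line program whose variable accesses are already listed in evaluation order, and it ignores the dataflow question of whether a value escapes through a result, which the type system handles. The function name `annotate_accesses` and the string labels `"delta"` and `"1"` are hypothetical and chosen only for this illustration; the paper obtains the annotation by type inference, not by scanning.

```python
def annotate_accesses(accesses):
    """Annotate each access with "1" if it is the last access to that
    variable in evaluation order, and with "delta" otherwise.

    `accesses` is a list of variable names in evaluation order
    (an assumption of this sketch, not the paper's algorithm).
    """
    last_index = {}
    for i, name in enumerate(accesses):
        last_index[name] = i  # later occurrences overwrite earlier ones
    return [
        (name, "1" if last_index[name] == i else "delta")
        for i, name in enumerate(accesses)
    ]


# M = let x = 2.0 in let y = x + 1.0 in y + x
# Accesses in evaluation order: x (in x + 1.0), then y and x (in y + x).
m_accesses = ["x", "y", "x"]
print(annotate_accesses(m_accesses))
```

The first access to x is labelled `delta` and the second `1`, matching the annotated expression in the text.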
A use was introduced to control how often values can be accessed [9,22]: it was either 0 (not used at all), 1 (accessed at
most once), or ω (accessed an arbitrary number of times) in
[9] and was either 1 or ω in [22]. Types are annotated with
such uses. For example, real^1 represents the type of real
numbers that can be accessed at most once, and int →^ω int
is the type of functions on integers that can be called an
arbitrary number of times. A type judgment can be accordingly refined so that it gives more useful information
than the ordinary one: for example, x : real^1 ⊢ N : real^ω
means not only that x is used as a real number in N but
also that it is used at most once. So, it is invalid if N is
let y = x + 1.0 in x + y.
The following type derivation highlights a key point of
the linear type system:
\[
x : \mathit{real}^{1} \vdash x + 1.0 : \mathit{real}^{1} \qquad x : \mathit{real}^{1},\ y : \mathit{real}^{1} \vdash x + y : \mathit{real}^{1}
\]
The premises of the derivation imply that x is used once
each in x + 1.0 and x + y. In order to know how x is totally used in the expression, we combine the type environments of subexpressions by adding the corresponding uses as
above. This is in contrast to the following rule of ordinary
type systems, where type environments are shared between
subexpressions:
\[
\frac{\Gamma \vdash M_1 : \tau_1 \qquad \Gamma,\ y : \tau_1 \vdash M_2 : \tau_2}
     {\Gamma \vdash \mathbf{let}\ y = M_1\ \mathbf{in}\ M_2 : \tau_2}
\]
By using the linear type system, we can infer how often
each heap value is accessed, and annotate each allocation
with a use. For instance, we can annotate the expression
M as let x^ω = 2.0 in let y^1 = x + 1.0 in x + y, so that y
is deallocated after it is accessed in x + y while x is not
deallocated.
1.4 Main results
Main contributions of the present work are formalization
of the new type system sketched above and a proof of its
correctness. Since the evaluation order must be taken into
account, the proof is non-trivial and more involved than that
of previous type systems.
Another contribution is implementation of a type inference system for the core-ML (i.e., Standard ML without
modules [15]) based on the new type system, which inputs an unannotated core-ML expression and outputs a use-annotated expression. We have so far tested several fairly
small programs and obtained promising results: for programs that use lists frequently (such as sorting programs,
the sieve of Eratosthenes and Conway’s game of life), the
type inference system could judge that most heap values (in
the case of the sorting programs and the sieve of Eratosthenes, all values except for top-level functions) are linear,
which indicates that they can be executed almost without
garbage collection. Moreover, the annotated programs output by the system indicate that most linear cons cells in
those programs can be updated in-place (the application of
linear types to the in-place quick sort has been suggested
by Baker [2] but a remarkable point here is that it can be
performed for naive programs with no programmer’s annotation).
\[
x : \mathit{real}^{1} + x : \mathit{real}^{1}\ (= x : \mathit{real}^{\omega}) \vdash \mathbf{let}\ y = x + 1.0\ \mathbf{in}\ x + y : \mathit{real}^{1}
\]
1.3.2 Overview of our type system
In order to express simple dataflow information,
we introduce another use δ. Intuitively, a value with use δ (which
we call a δ-value below) may be accessed many times but
only locally - it cannot be returned as (a part of) the computed result. So, x can have type real^δ in x + 1.0 and
(λy.(y + 1.0))x, but not in (x, 1.0) or λy.x + y. A value
1.5 Structure of the paper
The rest of this paper is structured as follows. Section 2 introduces the syntax of the target language. Section 3 gives
our type system for judging whether an expression is correctly annotated with uses. Section 4 sketches a proof of
(V1, V2)^κ and λ^κ y.M1 allocate a pair and a closure respectively in the heap, and record that their use is κ (which
shall be restricted to 0, 1, or ω by the type system given
later). The recorded use expresses how the value can be accessed in the rest of the computation: 0 means that the value
cannot be accessed at all (so, (V1, V2)^0 need not actually allocate the pair in the heap), 1 means that the value can
be accessed as a linear value and may be deallocated later,
and ω means that the value can be accessed in an arbitrary
manner and can be deallocated only by other mechanisms
such as garbage collection. rec^κ(f, y, M1) creates a recursive
function f such that f(y) = M1.
There are three operations to access the heap: fst^κ(x)
(snd^κ(x), resp.) extracts the first (second, resp.) element
of the pair stored at x, and apply^κ(y, V) reads the closure
stored at y and applies it to V. The use κ attached to an access
operation states the use at which the heap value
is accessed by this operation. If κ is δ, the heap is just read
and unchanged; if it is 1, the heap value is deallocated after
being read (if the use of the heap value is also 1; see footnote 1). If κ is ω,
then the operation assumes that the accessed heap value is
an ω-value and does not deallocate the value (so, ω is the
same as δ in effect and is unnecessary; it is included just for
technical convenience). If κ is greater than the actual use
recorded at the accessed address, the access is invalid and
causes an error.
A conditional expression if0 V then M1 else M2 is reduced to M1 if V = 0 and otherwise reduced to M2.
the correctness of the type system and Section 5 briefly discusses type inference issues. Section 6 discusses several extensions of the target language. Section 7 explains our prototype type inference system and the results of preliminary
experiments. Section 8 discusses related work and Section 9
concludes this paper. For space restriction, we omit some
formal definitions, proofs, etc.: they are found in the full
version of this paper [12].
2 Target Language
2.1 Uses
As explained in Section 1, a use is introduced to control how
often and in which way each heap value can be accessed.
Definition 2.1 [uses]:
A use is 0, δ, 1, or ω.
We use metavariables κ1, κ2, ... for uses.
A new point here is the use δ: basically, we say that a
heap value is accessed in an expression as a value of use δ
if it is accessed locally inside the expression and cannot be
returned as a part of the evaluation result. 0 means that the
data cannot be accessed at all, 1 means that after the data
is accessed in a restricted manner as a value of use δ, it can
be either (1) accessed at most once and deallocated inside
the current expression being evaluated or (2) put into the
evaluation result as a value of use 1. ω means that the data
can be accessed many times in any way.
A value of use ω can be regarded as a value of any other
use, and a value of use 1 can be regarded as a value of use δ
or 0. In order to express this kind of relationship between
uses, we define a total order ≤ on uses by 0 ≤ δ ≤ 1 ≤ ω.
We write κ1 < κ2 if κ1 ≠ κ2 and κ1 ≤ κ2.
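The total order on uses can be written down directly. The following is a small sketch; the names `Use` and `can_regard_as` are hypothetical and not taken from the paper. It encodes 0 ≤ δ ≤ 1 ≤ ω as integer ranks and reads "a value of use κ can be regarded as a value of use κ′" as κ′ ≤ κ.

```python
from enum import IntEnum


class Use(IntEnum):
    """The four uses, ranked so that ZERO <= DELTA <= ONE <= OMEGA."""
    ZERO = 0
    DELTA = 1
    ONE = 2
    OMEGA = 3


def can_regard_as(actual: Use, wanted: Use) -> bool:
    """A value of use `actual` may be treated as a value of use `wanted`
    exactly when wanted <= actual in the total order."""
    return wanted <= actual


# An omega-value can be regarded as a value of any other use.
print(all(can_regard_as(Use.OMEGA, u) for u in Use))
# A value of use 1 can be regarded as a value of use delta or 0.
print(can_regard_as(Use.ONE, Use.DELTA), can_regard_as(Use.ONE, Use.ZERO))
# A delta-value cannot be regarded as a linear value.
print(can_regard_as(Use.DELTA, Use.ONE))
```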
Example 2.3: let x = (1, 2)^1 in let y = fst^δ(x) in
let z = snd^1(x) in z allocates a linear pair (1, 2) in the
heap, and then reads its first element; at this moment, x
is not deallocated since fst is annotated with δ. After that,
the second element is read and the pair is deallocated.
Remark 2.2: By a heap value being accessed (or used), we
mean the contents of the heap value are indeed read; so,
just passing the reference to the heap value does not count
as access. For example, we do not say that y is accessed in
(λx.x)y, since the expression can be reduced to y without
reading the actual contents of y. A pair is considered to be
accessed only when its first or second element is extracted,
a closure is accessed only when it is invoked, and a real
number is accessed only when it is passed as an argument
to a primitive function on real numbers.
2.3 Operational semantics
In order to clarify how use annotation is used for allocating/deallocating heap values at run-time, we define an operational semantics as a rewriting relation on pairs of a heap
and an expression [17,22].
Definition 2.4 [evaluation contexts]: The set of evaluation contexts is given by the following syntax:
C[ ] ::= [ ] | let x = C[ ] in M.
We write C[M] for the expression obtained from C[ ] by
replacing the hole [ ] with M.

2.2 Expressions
We consider a simply-typed λ-calculus with recursion and
pairs as a target language. Its extension with polymorphism and other data structures will be briefly discussed
in Section 6. Because our type system is sensitive to an
evaluation order, we use the K-normal form [5] in order to
make the evaluation order explicit and name each intermediate value. (For readability, we sometimes use expressions
that are not in K-normal form when giving examples.)
The syntax of expressions is given in Figure 1. Each
allocation or read operation of a heap value is annotated
with a use. The annotation can be automatically inferred
by a type inference algorithm explained in Section 5. We
write ℰ for the set of expressions.
The metavariable V represents a value that can be represented in one word and need not be allocated in the heap.
Here, it is either an integer constant (denoted by n) or a
variable (denoted by x). A variable can be considered a
heap address.
Definition 2.5 [heap]: A heap value, denoted by h, is a
term of the form (V1, V2) or λx.M.
A heap is a mapping
from a finite set of variables to pairs consisting of a use and
a heap value. We write {x1 ↦^κ1 h1, ..., xn ↦^κn hn} for a
heap H such that H(xi) = (κi, hi).
The use associated with a heap value is used for judging
whether the value can be deallocated after it is accessed
as a linear value: the value can be deallocated only if the
use is 1. The use 0 means that the value has been already
Footnote 1: The check of the use of the heap value is necessary, because we
allow an ω-value to be accessed as a linear value. If we forbid such
coercion of an ω-value into a linear value as in [22], then the check is
unnecessary.
However, we prefer not to make such a restriction because
the check can be implemented efficiently (see Remark 2.6).
M (expressions) ::= V
    | let x = (V1, V2)^κ in M
    | let x = fst^κ(y) in M
    | let x = snd^κ(y) in M
    | let x = λ^κ y.M1 in M2
    | let x = rec^κ(f, y, M1) in M2
    | let x = apply^κ(y, V) in M
    | let x = M1 in M2
    | if0 V then M1 else M2
V (non-heap values) ::= x | n
Figure 1: The Syntax of Expressions
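As an illustration of the grammar in Figure 1, here is a minimal sketch of a syntax tree for the pair fragment only (pair allocation, fst, snd, and one-word values). The class names `LetPair`, `LetFst`, `LetSnd` and the printer `show` are hypothetical; uses are written as the strings "0", "delta", "1", "omega".

```python
from dataclasses import dataclass
from typing import Union

# A non-heap value V is an integer constant n or a variable x.
Value = Union[int, str]


@dataclass
class LetPair:          # let x = (V1, V2)^use in body
    x: str
    v1: Value
    v2: Value
    use: str
    body: "Expr"


@dataclass
class LetFst:           # let x = fst^use(y) in body
    x: str
    y: str
    use: str
    body: "Expr"


@dataclass
class LetSnd:           # let x = snd^use(y) in body
    x: str
    y: str
    use: str
    body: "Expr"


Expr = Union[int, str, LetPair, LetFst, LetSnd]


def show(e: Expr) -> str:
    """Render an expression in the concrete notation used in the text."""
    if isinstance(e, (int, str)):
        return str(e)
    if isinstance(e, LetPair):
        return f"let {e.x} = ({e.v1}, {e.v2})^{e.use} in {show(e.body)}"
    op = "fst" if isinstance(e, LetFst) else "snd"
    return f"let {e.x} = {op}^{e.use}({e.y}) in {show(e.body)}"


# Example 2.3 as a syntax tree.
example_2_3 = LetPair("x", 1, 2, "1",
                      LetFst("y", "x", "delta",
                             LetSnd("z", "x", "1", "z")))
print(show(example_2_3))
```

Closures, recursion, application, and conditionals are omitted here only to keep the sketch short; they would be further dataclasses of the same shape.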
deallocated. We often write h^κ for (V1, V2)^κ or λ^κ x.M. We
write ℋ for the set of heaps.
The reduction relation ⟶ ⊆ (ℋ × ℰ) × ((ℋ × ℰ) ∪ {Error})
is defined by the rules given in Figure 2. Error indicates
that invalid access to the heap has occurred (there are other
kinds of errors such as application of a non-function to a
value, etc.; however, because the lack of those errors can be
guaranteed by a type system in the usual way, this paper
focuses on errors caused by invalid heap access).
The first two rules (R-HEAP) and (R-REC) are for allocating heap values. The use κ is recorded at a new address
together with the allocated value. The heap is not changed
in (R-SVAL) since V is a one-word value.
The rules (R-APP) and (R-APP-ERROR) for function application deserve attention. Since apply^κ'(x, V) reads a closure stored at x as a value with use κ', the currently available
use κ of x should be greater than or equal to κ': otherwise,
the application causes an error. If κ = κ' = 1, the closure
is deallocated after the application and it can no longer be
called; so, the use of x should be changed to 0. It is expressed by the subtraction κ - κ', whose definition is given
in Figure 3. Similarly, access to a pair fst^κ'(x) or snd^κ'(x)
succeeds only if κ ≥ κ', and the use is decreased by κ' after
the access ((R-FST) - (R-SND-ERROR)).

Example 2.7: The expression in Example 2.3 is reduced as follows:
({}, let x = (1, 2)^1 in let y = fst^δ(x) in let z = snd^1(x) in z)
⟶ ({u ↦^1 (1, 2)}, let y = fst^δ(u) in let z = snd^1(u) in z)
⟶ ({u ↦^1 (1, 2)}, let y = 1 in let z = snd^1(u) in z)
⟶ ({u ↦^1 (1, 2)}, let z = snd^1(u) in z)
⟶ ({u ↦^0 (1, 2)}, let z = 2 in z)
⟶ ({u ↦^0 (1, 2)}, 2)
The heap value u is considered to be deallocated when the
use changes from 1 to 0.

Example 2.8: If the expression in Example 2.3 is wrongly
annotated as:
let x = (1, 2)^1 in let y = fst^1(x) in let z = snd^δ(x) in z,
then it is reduced to Error as follows:
({}, let x = (1, 2)^1 in let y = fst^1(x) in let z = snd^δ(x) in z)
⟶ ({u ↦^1 (1, 2)}, let y = fst^1(u) in let z = snd^δ(u) in z)
⟶ ({u ↦^0 (1, 2)}, let y = 1 in let z = snd^δ(u) in z)
⟶ ({u ↦^0 (1, 2)}, let z = snd^δ(u) in z)
⟶ Error

Remark 2.6: One may think that the above operation on
uses incurs a heavy overhead at run-time. However, because
the type system guarantees that no invalid access occurs, the
check of κ ≥ κ' is always guaranteed to succeed and can be
eliminated. Moreover, the use associated with a heap value
can change only from 1 to 0; since this change means that
the heap value is deallocated in the actual implementation,
the use need not actually be updated. This further implies
that the use of a heap value can be recorded not in the heap
space itself but in the pointer to it by using a one-bit tag,
and that the extra run-time cost is only to check this tag
(to know whether the use of the accessed heap value is 1 or
ω) when the access operation is annotated with 1. So, we
think the actual run-time overhead is quite small.

3 Type System
As was shown in Example 2.8, if an expression is wrongly
annotated with uses, a run-time error may occur. In order
to reject such expressions, we introduce a type system. Our
type system is a conservative extension of the usual type
system for the simply typed λ-calculus in the sense that
any expressions well-typed in the usual type system are well
typed also in our type system.
There are two major differences between our quasi-linear
type system and the usual linear type systems. As explained
in Section 1, a linear value can be accessed many times as
long as the last access is identified by the quasi-linear type
system. The other difference is the interpretation of a pair
type (see below).

3.1 Types
Definition 3.1 [types]: The set of types, ranged over by
τ, is given by the following syntax.
τ ::= int | τ1 ×^κ τ2 | τ1 →^{κ1,κ2} τ2
κ of τ1 ×^κ τ2 and κ1 of τ1 →^{κ1,κ2} τ2 express how a pair and
a closure can be accessed. We require κ2 of τ1 →^{κ1,κ2} τ2 not
to be δ. It represents how often (0, 1, or ω) the closure can
be invoked in the sense of previous linear type systems [9,
22]. The closure itself is allocated and deallocated based on
κ1. The reason why we keep κ2 will become clear when we
present a typing rule for closure creations.
Note that the meaning of a pair type is different from
that in previous linear type systems [8,22]. In the previous
linear type systems, int ×^ω (int ×^1 int) corresponds to the
linear logic formula !(int ⊗ (int ⊗ int)); so, if a pair (1, (2, 3))
has this type, it can be read an arbitrary number of times,
and each time the second element (2, 3) is extracted, it can
be read at most once. On the other hand, in our quasi-linear
type system, the second element (2, 3) can be accessed only
linearly throughout all the access to the pair (1, (2, 3)) (not
each time (2, 3) is extracted).
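\[ (H,\ C[\mathbf{let}\ x = h^{\kappa}\ \mathbf{in}\ M]) \longrightarrow (H\{y \mapsto^{\kappa} h\},\ C[[y/x]M]) \quad (y\ \text{fresh}) \qquad \text{(R-HEAP)} \]
\[ (H,\ C[\mathbf{let}\ x = \mathbf{rec}^{\kappa}(f, y, M_1)\ \mathbf{in}\ M]) \longrightarrow (H\{z \mapsto^{\kappa} \lambda y.[z/f]M_1\},\ C[[z/x]M]) \quad (z\ \text{fresh}) \qquad \text{(R-REC)} \]
\[ (H,\ C[\mathbf{let}\ x = V\ \mathbf{in}\ M]) \longrightarrow (H,\ C[[V/x]M]) \qquad \text{(R-SVAL)} \]
\[ \frac{\kappa \geq \kappa' > 0}{(H\{x \mapsto^{\kappa} \lambda y.M\},\ C[\mathbf{apply}^{\kappa'}(x, V)]) \longrightarrow (H\{x \mapsto^{\kappa - \kappa'} \lambda y.M\},\ C[[V/y]M])} \qquad \text{(R-APP)} \]
\[ \frac{(\kappa < \kappa') \lor (\kappa' = 0)}{(H\{x \mapsto^{\kappa} \lambda y.M\},\ C[\mathbf{apply}^{\kappa'}(x, V)]) \longrightarrow \mathbf{Error}} \qquad \text{(R-APP-ERROR)} \]
\[ \frac{\kappa \geq \kappa' > 0}{(H\{x \mapsto^{\kappa} (V_1, V_2)\},\ C[\mathbf{fst}^{\kappa'}(x)]) \longrightarrow (H\{x \mapsto^{\kappa - \kappa'} (V_1, V_2)\},\ C[V_1])} \qquad \text{(R-FST)} \]
\[ \frac{(\kappa < \kappa') \lor (\kappa' = 0)}{(H\{x \mapsto^{\kappa} (V_1, V_2)\},\ C[\mathbf{fst}^{\kappa'}(x)]) \longrightarrow \mathbf{Error}} \qquad \text{(R-FST-ERROR)} \]
\[ \frac{\kappa \geq \kappa' > 0}{(H\{x \mapsto^{\kappa} (V_1, V_2)\},\ C[\mathbf{snd}^{\kappa'}(x)]) \longrightarrow (H\{x \mapsto^{\kappa - \kappa'} (V_1, V_2)\},\ C[V_2])} \qquad \text{(R-SND)} \]
\[ \frac{(\kappa < \kappa') \lor (\kappa' = 0)}{(H\{x \mapsto^{\kappa} (V_1, V_2)\},\ C[\mathbf{snd}^{\kappa'}(x)]) \longrightarrow \mathbf{Error}} \qquad \text{(R-SND-ERROR)} \]
\[ (H,\ C[\mathbf{if0}\ 0\ \mathbf{then}\ M_1\ \mathbf{else}\ M_2]) \longrightarrow (H,\ C[M_1]) \qquad \text{(R-IFT)} \]
\[ \frac{n \neq 0}{(H,\ C[\mathbf{if0}\ n\ \mathbf{then}\ M_1\ \mathbf{else}\ M_2]) \longrightarrow (H,\ C[M_2])} \qquad \text{(R-IFF)} \]
Figure 2: Operational Semantics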
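The pair-access rules of Figure 2 can be exercised with a small sketch. It models a heap as a dictionary from addresses to (use, pair) and implements only the side conditions of (R-FST)/(R-SND) and their error rules. The subtraction κ - κ′ is defined in Figure 3, which is not reproduced here; the sketch assumes, following Remark 2.6, that the only change it makes is from 1 to 0 when κ = κ′ = 1. The names `read_pair`, `subtract`, and `HeapAccessError` are hypothetical.

```python
# Rank of each use in the total order 0 <= delta <= 1 <= omega.
RANK = {"0": 0, "delta": 1, "1": 2, "omega": 3}


class HeapAccessError(Exception):
    """Stands for the Error configuration of Figure 2."""


def subtract(recorded, access):
    # Assumption (Remark 2.6): a recorded use changes only from 1 to 0,
    # and only when a linear value is accessed with annotation 1.
    return "0" if recorded == "1" and access == "1" else recorded


def read_pair(heap, addr, access, index):
    """Read component `index` (0 for fst, 1 for snd) of the pair at `addr`
    with access annotation `access`, updating the recorded use."""
    recorded, pair = heap[addr]
    # Error side condition: (kappa < kappa') or (kappa' = 0).
    if RANK[recorded] < RANK[access] or access == "0":
        raise HeapAccessError(f"invalid access to {addr}")
    heap[addr] = (subtract(recorded, access), pair)
    return pair[index]


# Example 2.7: fst^delta then snd^1 on a linear pair.
heap = {"u": ("1", (1, 2))}
y = read_pair(heap, "u", "delta", 0)   # heap unchanged, u still has use 1
z = read_pair(heap, "u", "1", 1)       # u's use becomes 0 (deallocated)
print(y, z, heap["u"][0])

# Example 2.8: fst^1 first, so the later snd^delta is an invalid access.
bad_heap = {"u": ("1", (1, 2))}
read_pair(bad_heap, "u", "1", 0)
try:
    read_pair(bad_heap, "u", "delta", 1)
except HeapAccessError as err:
    print("Error:", err)
```

A pair recorded with use ω keeps use ω even when read with annotation 1, which matches the remark that deallocation happens only if the recorded use is also 1.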
Example 3.2: (int ×^δ int) →^{1,ω} int
is the type of a linear
closure that takes a pair of integers as an argument, uses it
locally, and returns an integer. The function may be invoked
many times (thus its use is ω in the strict sense), but it can
be invoked only linearly in the relaxed sense (i.e., invoked
as a closure of use δ except for the last invocation). For
example, λ^1 x.(fst^δ(x) + snd^δ(x))
can have this type.

3.3 Operations on uses, types, and type environments
In presenting typing rules, we use several operations on uses,
types, and type environments given in Figures 3 and 4. As
explained in Section 1, the operation κ1 + κ2 is used for
computing the total use of a value when it is accessed as κ1
in one place and as κ2 in another place. When the order of
the uses is statically known, κ1; κ2 is used instead. ⌈κ⌉ is
the least non-δ use that is greater than or equal to κ, and
⌊κ⌋ is the greatest non-δ use that is less than or equal to κ.
κ1 · κ2 is used for computing the total use of a value when
the value is used as a κ2-value κ1 times. Motivations for
these operations will become clearer when typing rules are
given below.
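The operations on uses can be sketched as small tables. The ceiling and floor below follow the wording of the text directly. The tables for + and ; are assumptions: the authoritative definitions are in Figure 3, which is not reproduced here, so the sketch only picks tables that agree with the instances quoted in this paper (δ + 1 = ω, 1 + 1 = ω, and δ; 1 = 1). The product κ1 · κ2 is deliberately left out. All function names are hypothetical.

```python
def ceil_use(k):
    """Least non-delta use greater than or equal to k."""
    return "1" if k == "delta" else k


def floor_use(k):
    """Greatest non-delta use less than or equal to k."""
    return "0" if k == "delta" else k


def add(k1, k2):
    """Total use when the order of the two accesses is unknown.
    Assumed table; see Figure 3 of the paper for the real definition."""
    if k1 == "0":
        return k2
    if k2 == "0":
        return k1
    if k1 == "delta" and k2 == "delta":
        return "delta"      # two local accesses are still local
    return "omega"


def seq(k1, k2):
    """Total use when the k1-access is known to happen before the
    k2-access. Assumed table; see Figure 3 for the real definition."""
    if k2 == "0":
        return k1
    if k1 in ("0", "delta"):
        return k2           # earlier local accesses can be ignored
    return "omega"


# The introductory example: x is read in x + 1.0 and then in x + y.
print(add("1", "1"))        # traditional linear typing: omega
print(seq("delta", "1"))    # quasi-linear typing: 1
print(ceil_use("delta"), floor_use("delta"))
```

The contrast between `add("1", "1")` and `seq("delta", "1")` is the reason the expression M can be typed with x linear once the evaluation order is taken into account.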
These operations are extended to operations on types
and type environments.
For example, if a heap value is
accessed as a value of type int ×^δ int in one place and as a
value of type int ×^1 int in another place, it is totally accessed
as a value of type (int ×^δ int) + (int ×^1 int) = int ×^ω int,
which means that the value may be accessed more than once.
The definitions of operations on pair types and function types deserve special attention; notice that the operations act on sub-components of pair types, but not on
the arguments and return types of functions. In order to
understand the reason for this, suppose x is accessed in
two places, each as a value of type (int ×^1 int) ×^1 int (i.e.,
the inner pair of integers and the outer pair are both accessed linearly). Then, both the inner pair of integers and
the outer pair are totally accessed more than once; it is
expressed by the type (int ×^{1+1} int) ×^{1+1} int obtained by
adding each use attached to the pair types, not by the type
(int ×^1 int) ×^{1+1} int (which means that the inner pair is
totally accessed only linearly). On the other hand, suppose x is accessed in two places as a function of type
(int ×^1 int) →^{1,1} (int ×^1 int). Then, the function is invoked
twice, and each time it is invoked, it uses an argument pair
linearly and returns a linear pair. So, the total access to the
function is expressed by (int ×^1 int) →^{1+1,1+1} (int ×^1 int),
Example 3.3: Consider an expression
let f = λ^ω z.(fst^δ(x) + z) in let y = fst^1(snd^δ(x)) in M
and suppose x does not appear in M. Because the second
element of the pair x is accessed just once in fst^1(snd^δ(x)),
it can be assigned the type int ×^ω (int ×^1 int); so, the
pair of integers can be deallocated at fst^1.
Consider
another expression let f = λ^1 z.(fst^1(snd^δ(z)) + 1) in
apply^δ(f, x) + apply^1(f, x). The second element of x is accessed once (by fst^1) each time it is extracted by snd^δ(z).
Since it is totally accessed more than once, the type int ×^ω
(int ×^ω int) is assigned to x in our quasi-linear type system.
3.2 Type judgment
A type judgment is of the form Γ ⊢ M : τ, where Γ (called a
type environment) is a mapping from a finite set of variables
to types. It expresses how each heap value is accessed during
evaluation of M. For example, x : int ×^δ int ⊢ M : int ×^1 int
implies (1) M may access the heap address x, (2) the result
is a linear pair of integers, and (3) x is used up in M and
does not escape through the result.
Notation 3.4: We write V1 : τ1, ..., Vn : τn for the type environment Γ such that dom(Γ) = {V1, ..., Vn} ∩ 𝒱 and
Γ(Vi) = τi for each Vi ∈ dom(Γ),
where 𝒱 denotes the
set of variables. For example, (x : int, 2 : int) denotes the
type environment that maps x to int. If V ∉ dom(Γ),
we write Γ, V : τ for the type environment Γ' such that (1)
dom(Γ') = dom(Γ) ∪ ({V} ∩ 𝒱) and (2)
Γ'(x) = τ if x ∈ ({V} ∩ 𝒱)
and Γ'(x) = Γ(x) if x ∈ dom(Γ).
We require κ ≠ δ because there is no point in creating a δ-value:
since it can never be deallocated, creating a δ-value is the
same as creating an ω-value.
Remark 3.6: Actually, annotating allocation of a pair with
δ is meaningful and may be rather useful if the type τ3 of
M does not contain δ: we can modify the operational semantics so that let x = (V1, V2)^δ in M allocates the pair
(V1, V2) in the stack and deallocates it after M has been
evaluated (recall that a value with use δ cannot be a part of
the evaluation result).
not by (int ×^{1+1} int) →^{1+1,1+1} (int ×^{1+1} int).
Similarly, the
operation ⌈·⌉ also acts on sub-components of pair types
but not on the arguments and return types of function
types. This is because ⌈τ⌉ is intended to express the type
of an expression whose evaluation result does not contain
δ-values. Although the type (int ×^δ int) →^{1,1} (int ×^1 int)
contains δ, it just means that a function of that type
uses an argument pair locally, not that the function must
be used locally. So, ⌈(int ×^δ int) →^{1,1} (int ×^1 int)⌉ =
(int ×^δ int) →^{1,1} (int ×^1 int).
We often omit · and write κ1κ2 for κ1 · κ2 and κΓ for
κ · Γ. We give higher precedence to ·, +, and ; in this order.
Example 3.5: Let ri be
(id
x0 id).
Then,
h
+
72
=
int
=
r1; 3-2
rr11 =
w * 71
int x’ (id
=
xy
x6 int)
and 72 be
The rule for the allocation of a closure is:
int x1
r2,f: 71+*l+*rT22] k
(id x’ int)
l~~[l?ll
int x1 (id x6 int)
id xw (int xw hat).
2 : (71 + 72), y :7-l
= 2: int xw (int x6 int),y:
Typing
int x6 (int x6 id).
rules
Variables,
constants,
and weakening
The rule for variables and constants is:
v:rl-v:7.
(T-VAL)
If an expression is well typed under some type environment, then it should also be well typed under a type environment that represents more access capabilities. For example,
if z : int x1 int t- M : int, then 2 : int xw int k M : int,
since the former judgment requires z to be used as a linear
pair, while the latter does not impose such a requirement.
The following rule allows such weakening of an assumption:
r’kM:r
r’
5
r
t-let
The relation I” 5 P, which means “P represents more access
capabilities than I” (in other words, I’ allows more liberal
access to the heap),” is defined in Figure 4.
3.4.2
Allocation
M2
:5
X”‘x.M1
Kl
#S
in MZ : 73
(T-Ass)
f = recrK3(1+K1)1
(f, x, Ml)
in M2 : 73
( T-REC)
The total number of calls of the function f is calculated by
using the number ~2 of calls in MI and the number ~4 of
calls in Ms. Since each invocation of f may result in other
~2 invocations of f, the total calls are counted as ~(1 +
~2 + K: + . . .) = ~4(1+ ~2). The use of function f itself (in
the relaxed sense) is similarly calculated by using its uses
~1 and ~3 in Ml and Ms. The ceiling function r-1 applied
to ~r(l+ ~3) is just to make sure that the use is not 6.
(T-WEAK)
rkM:r
let f =
[7-21 in the premise Pi, x : 71 i- Ml : [-r21 guarantees that
when the closure Xx.M
is called, it does not return a 6value as a result. PI, x : 71 I- Ml : [7-21also means that
during each call of the closure, heap values are accessed as
described by Pi. Because the other premise means that
the closure will be called ~2 times, the total access to heap
values by MI can be represented by ~2 copies of Pi, which
is written as Icsl?i. Since the access happens only later (not
when Xx.Ml
is evaluated and the closure is created), r-1 is
applied to it, so that it does not contain &values.
The following rule for recursive functions is similar to
(T-Ass) except for the estimation of the use of the created
function:
=
3.4.1
IT2 I-
int x1 (int x1 int)
(z:71,y:71)+(z:72)
3.4
+
of heap values
3.4.3
Let us consider an expression let z = (y, z)~ in M. In M, y
and .z can be accessed in two ways: either directly by using
addresses y and z, or through the pair z (by extracting y and
2). If P, z : ~1 x s 72 I- M : 7, then the access in the former
way is represented by the types I’(y) and I’(z), while the
access in the latter way is represented by the types ri and
~2. Therefore, information on the total access is obtained by
adding those types. For example, if I’(y) = int x1 int and
71 = int x ’ int, then the total access to y can be represented
by P(Y) + 71 = int xy int. Thus, the rule for the allocation
of a pair is:
r,z:rlx~T22M:r3
Let-expressions
The following rule for let-expressions best illustrates how the
evaluation order is taken into account in our type system:
rl I-MI :
[Tll
r2, z : [T11 k
l?l;~21-letx=MIinM2:~
M2
: T (T_LET)
Here, the premise means that heap values are accessed by
Ml as described by I’i and by Ms as described by I’s, x :
[q]. In ordinary linear type systems, the total use of heap
values in let x = Ml in M2 was computed by adding Pi and
I’s. However, since Ms is evaluated only after MI has been
fully evaluated, we estimate the total access of heap values
by sequential composition Ii; 12. We require the type of MI
K#d
(T-PAIR)
our type system is to capture information on the order of
heap access (recall (T-LET), for example), whereas the flat
representation of a heap loses that information. Therefore,
we define an alternative operational semantics that represents a heap by using nested letrec-expressions, and show the
soundness of the type system with respect to that semantics. Although it includes rather strange reduction rules, it
is easy to see that the alternative semantics is essentially
equivalent to the original semantics.
Before presenting the alternative semantics, we explain
why the reduction relation $\longrightarrow$ fails to preserve typing. Let
us consider the following configuration:
not to contain $\delta$ at top-level (by applying $\lceil\cdot\rceil$ to $\tau_1$), in order
to make sure that every $\delta$-value represented by $\Gamma_1$ is really
“used up” during the evaluation of $M_1$ and does not escape
to the rest of computation.
3.4.4
Access to heap values
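The following is the rule for closure invocation:
$$\frac{\Gamma, x : \tau_1 \vdash M : \tau_2 \qquad \kappa > 0}{(f : \tau_3 \to^{\kappa,1} \tau_1 + V : \tau_3); \Gamma \vdash \mathbf{let}\ x = \mathit{apply}^{\kappa}(f, V)\ \mathbf{in}\ M : \tau_2}\ \text{(T-APP)}$$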
({w e1
The access to heap values by $\mathit{apply}^{\kappa}(f, V)$ is estimated as
$f : \tau_3 \to^{\kappa,1} \tau_1 + V : \tau_3$. Note that the first use $\kappa$ of the type of
$f$ should not be zero, and that the second use is 1, since the
function is used once here. The access of heap values except
for $x$ by $M$ is estimated as $\Gamma$. Since $\mathit{apply}^{\kappa}(f, V)$ is evaluated
before $M$ is evaluated, the total access can be estimated as
their sequential composition $(f : \tau_3 \to^{\kappa,1} \tau_1 + V : \tau_3); \Gamma$.
The rule for reading the first element of a pair is:
rly : 7:,x : Tl d
72 k M :
73
x : (id
v = MI in M2).
x1 id)
x1 int k let v = MI in M2 : int,
since
((id x6 int) x’
int);((int X1 int) X1 int)
=
So it is fine that w is a linear
(id xl int) x1 int.
pair. However, if we reduce the configuration by (R-FsT),
the resulting configuration:
tc>o
(T-FST)
({w e1 (1,2), 3: e1 (w, 3)],
let u = (let z = f&(w) in z) in M2)
The premise implies that the pair is used as a $\kappa'$-value in $M$
after being used as a $\kappa$-value in $\mathit{fst}^{\kappa}(x)$. So, the total use of
$x$ is estimated as $\kappa; \kappa'$. The first element may be accessed
through either $y$ or $x$, hence the total use of the first element
is estimated as $\tau_1 + \tau_1'$.
Similarly, the rule for extraction of the second element
of the pair is:
requires $w$ to be an $\omega$-pair: $w$ is accessed as a linear pair in $M_2$
(through $x$) and as a $\delta$-pair in $\mathit{fst}^{\delta}(w)$, and the type system
cannot capture the order between the two uses (because the
access is made through different variables $x$ and $w$).
One way to avoid the above problem would be to extend
the type system so that the order of access to different variables can be expressed, as in [11]. Since the resulting typing
rules would become rather complex, we instead redefine the
operational semantics. The idea is, before losing information on the order of heap access, to split the heap space in
advance by taking the order into account. For example, the
above configuration is first converted to something like:
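$$\frac{\Gamma, y : \tau_2', x : \tau_1 \times^{\kappa'} \tau_2 \vdash M : \tau_3 \qquad \kappa > 0}{\Gamma, x : \tau_1 \times^{\kappa;\kappa'} (\tau_2 + \tau_2') \vdash \mathbf{let}\ y = \mathit{snd}^{\kappa}(x)\ \mathbf{in}\ M : \tau_3}\ \text{(T-SND)}$$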
Rules for conditionals
let v = ({w H’ (1,2),x ++’ (w, 3)), MI)
in ({w ++’ (1,2),x e1 (w, 3)}, M2).
In the following rule for conditional expressions, we require
that the then-part and the else-part have the same typing.
rt-Ml:Tl
r 1 M2 : 71
V:int+I’kifDVthenMielseMa:ri
4
++l (w,3)},let
where Ml = let y = fst’(x) in let z = f&(y)
in z. Then,
z : (id x6 int) x’ int k MI : int. So, if x : (id x1 int) x1
int k M2 : id, then by (T-LET),
r, 2 : CT1+ Q X-K’ 72 f- let y = f&*(x) in M : 73
3.4.5
(1,2),x
Type
Since the heap space has been split for the definition part
and for the body part of the let-expression, we can safely
throw away information that the definition part is evaluated
before the body part. So, it can be reduced to:
(T-IF)
Soundness
let v = ({w +k’ (1,2),x I-+’ (w,3)},let
in ({w +-+l (1,2),z ++’ (w,3)}, Mz).
This section sketches a proof of the type soundness property
that no invalid heap access occurs during evaluation of well-typed expressions. The full proof is given in the technical
report [12].
The type soundness is formally stated as follows:
Theorem 4.1:
z = fst’(w) in z)
In order to express the above kind of nested heaps and
expressions, we introduce a new class of expressions called
dynamic expressions and define a reduction relation for dynamic expressions. Note that this new operational semantics is introduced just for proving Theorem 4.1. The actual
implementation can be based on the semantics defined in
Section 2.3 and no cost for splitting heap needs to be paid.
If $\emptyset \vdash M : \tau$, then $(\emptyset, M) \not\longrightarrow^{*} \mathbf{Error}$.
We want to show this property as usual [8,22], by defining a typing system for a run-time state $(H, M)$ and proving that the reduction relation preserves typing and that no
well-typed state $(H, M)$ causes an error immediately. Unfortunately, however, the reduction relation $\longrightarrow$ defined in
Section 2.3 is not suitable for this purpose: a key feature of
4.1
Dynamic
expressions
The syntax of expressions given in Section 2 is extended to
that of dynamic expressions as follows:
[Figure 3: Operations on uses (tables defining $\kappa_1 \cdot \kappa_2$, $\kappa_1; \kappa_2$, $\kappa_1 + \kappa_2$, $\kappa_1 - \kappa_2$, and the unary operations on $\kappa$, over the uses $0$, $\delta$, $1$, $\omega$).]
[Figure 4: Definitions of a relation and operations on types and type environments (binary operations $op = +, ;$ and unary operations $op = \lceil\cdot\rceil, \kappa \cdot \_$, on int, pair types, function types, and type environments).]
[Figure 5: Main reduction rules for dynamic expressions: (DR-HEAP), (DR-FST), (DR-READ), (DR-ERROR), (DR-SPLIT), (DR-VAR).]
Definition 4.2 [dynamic expressions]: The set $\mathcal{D}$ of dynamic expressions, ranged over by $D$, is given by the following syntax:
$$D ::= M \mid \mathbf{let}\ x = D_1\ \mathbf{in}\ D_2 \mid \mathbf{letrec}\ x = h^{\kappa}\ \mathbf{in}\ D$$
Here, we introduced a letrec-expression letrec $x = h^{\kappa}$ in $D$
to express a heap binding, and extended the syntax of let-expressions so that heap bindings and let-bindings can be
nested. By using the new syntax, the nested heap and expression
let u = ({z ti’
(w,3)},
I’,x:~~x~~~i-D:r~
2 e w, v,l
I?+ VI : 71 + V2 : 72 k letrec 2 = (VI, Vs)” in D : 73
(DT-PAIR)
~~(1 + ICZ.)[~~~+ r2 t-letrec f = X”x.Ml in D : 73
&
letrec x = (1, 2)6” in let z = snd’(x) in z
&
let z = (letrec x = (1,2)l in snd’(x))
(letrec 2 = (1,2)’ in z)
-%
let z = (letrec 2 = (1,2)’ in 2) in
(letrec x = (1,2)' in z)
A
letrec x = (1,2)’ in 2
in
then D f,
Operational semantics of dynamic expressions
semantics by using a reduction
a IfI’l-D:randDiD’,thenI’I-D’:r;and
l
letrec x = (1,2)l in
let y = fst’(x) in let z = snd’(x)
in z
If I’ I- D : r and D o\
that D %
D’, then there exists D” such
D” and I’ k D” : r.
We write -% for the relation % a._ a&.
We also
write D t if D is reduced to Error no matter how the heap
bindings in D are split, i.e., if D &
Error and there is no
D’ E 2) such that D 4
D’. The correspondence between
the new semantics and the original semantics can be stated
as follows:
Lemma 4.6: There is a relation 7Z(s (‘H x E) x D) that
satisfies the following conditions:
l
(0, M)‘RM;
l
if (H, M) -A (IS’, M’)
D 4
l
The expression in Example 2.3 is reduced
in z
Error]: If l? F D : 7,
Error.
Lemma 4.5 [Subject Reduction]:
relation $D \xrightarrow{l} E$. $D$ is a dynamic expression and $E$ is
either a dynamic expression or a special constant Error
representing invalid heap access. The label $l$ shows which
heap value is accessed in the reduction step. It is either $\epsilon$,
which indicates that an internal heap value is accessed, or
$x = h^{\kappa}$, which indicates that the heap address $x$ is accessed
as a value with use $\kappa$, or $x$, which indicates that the heap of
address $x$ is split as explained above.
We show only the key rules in Figure 5. The rule
(DR-HEAP) corresponds to the rule (R-HEAP) of the original semantics.
The rules (DR-FST), (DR-READ) and
(DR-ERROR) correspond to the rules (R-FST) and (R-FST-ERROR).
The rule (DR-SPLIT) is the most important rule:
it allows the current heap to be split into that for the
definition part of a let-expression and that for the body
part. The rule (DR-VAR) corresponds to the rule (R-VAR),
but it allows split heap bindings to be merged after
the definition part of a let-expression has been fully evaluated. In the rule, letrec $\tilde{x} = \tilde{h}^{\tilde{\kappa}}$ in $D$ is a shorthand for
letrec $x_1 = h_1^{\kappa_1}$ in ... letrec $x_n = h_n^{\kappa_n}$ in $D$.
let 2 = (1,2)l in
let y = fst6(x) in let z = snd’(x)
Proof sketch of Theorem 4.1
Lemma 4.4 [Lack of Immediate
(DT-ABS)
A
let y = (letrec x = (1,2)6 in 1) in
(letrec x = (1,2)l in let z = snd’(x) in z)
Theorem 4.1 follows immediately from the soundness of the
type system with respect to the new operational semantics
and the fact that if an expression is reduced to Error in
the original semantics then it is also the case in the new
semantics.
The type soundness with respect to the new operational
semantics is stated as the two lemmas given below. The
subject reduction property (Lemma 4.5) is a little more
complicated than the usual one because even a well-typed
dynamic expression may be reduced to an ill-typed expression if a heap value is split in an inappropriate way. The
second statement of the lemma says that, if heap splitting
(the rule (DR-SPLIT)) can be applied to a well-typed expression, there is always a good splitting that preserves the
well-typedness of the expression.
The typing rules for expressions can be easily extended
for dynamic expressions. The new rules are as follows:
Example 4.3:
as follows:
A
4.3
let v = (letrec x = (w, 3)’ in Ml)
in letrec x = (u, 3)l in MZ
We define an operational
let y = (letrec 2 = (1,2)’ in fst6(x)) in
(letrec x = (1,2)' in let z = snd’(x) in z)
Ml) in ({x til (w,3)}, Mz)
can be expressed by the following dynamic expression:
4.2
-%
5
and (H, M)RD,
D’ and (H’, M’)RD’
then either
for some D’ or D t; and
if (H, M) -A Error and (H, M)RD,
then D t.
Type Reconstruction
Use annotations can be automatically inferred from unannotated expressions. Since it is done basically in the same way
as that for previous linear type systems [9,22], we explain
it only informally. The basic idea is to introduce variables
ranging over uses and types, and constraints on them, so
type expression (such as $\beta + \gamma$) constructed only from type
variables. This simplification is performed by partial unification of types, with some uses being kept different. For
example, given a constraint $\alpha \geq (\beta \times^{j} \mathit{real}^{i}) + \gamma$, we can
that we can express the most general typing (principal typing) of an expression. A new type judgment is of the form
$\Gamma, C \vdash M : \tau$, where a set of constraints $C$ specifies which
set of uses/types each use/type variable (in $\Gamma$, $M$, or $\tau$) can
range over. For instance, let $x = (y, y)$ in $x$ is typed as:
$$y : \alpha,\ \{\alpha \geq \beta + \gamma,\ i \neq \delta,\ i \geq j\} \vdash \mathbf{let}\ x = (y, y)^{i}\ \mathbf{in}\ x : \beta \times^{j} \gamma.$$
This typing is principal in the sense that all the typings
derivable from the rules in Section 3 can be obtained by instantiating variables $\alpha$, $\beta$, $\gamma$, $i$ and $j$ so that the constraint is
satisfied. (We do not give the formal definition of principal
typings here; it is basically the same as that in [22].)
Given an unannotated expression M, the inference of a
use annotation proceeds as follows (actually, the second and
third steps may overlap):
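instantiate $\alpha$ and $\gamma$ with $\beta' \times^{j'} \mathit{real}^{i'}$ and $\beta'' \times^{j''} \mathit{real}^{i''}$ for
fresh variables $\beta'$, $\beta''$, $i'$, $i''$, $j'$, $j''$ and reduce the constraint
to $\{\beta' \geq \beta + \beta'',\ i' \geq i + i'',\ j' \geq j + j''\}$.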
Next, since the set of trivial constraints on type variables
always has a solution (let all the remaining type variables
be instantiated with, say, int), we can obtain the least assignment to use variables by solving the set of constraints
on use variables. We can use the simple iterative method
used in previous linear type systems [8,9]. Of course, the
least assignment to some use variables cannot be completely
determined until the whole program is given. For example,
given an expression of type $\mathit{real}^{i}$, we cannot find the least
assignment to the use variable $i$ until we know all the places
where the evaluation result of the expression is used. For
these unknown variables, we can either simply assign $\omega$ or
delay constraint solving.
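The iterative method mentioned above can be illustrated by a small Python sketch. It is not the algorithm of the prototype: it assumes that uses form the chain 0 ≤ δ ≤ 1 ≤ ω, and it models each constraint as "variable ≥ rhs(assignment)" with an arbitrary monotone right-hand side (the actual right-hand sides use the operations of Figure 3, which are not encoded here). The names `least_solution` and `join` and the example constraints are hypothetical.

```python
# Assumption: uses are ordered as the chain 0 <= δ <= 1 <= ω.
USES = ["0", "δ", "1", "ω"]
RANK = {use: k for k, use in enumerate(USES)}


def join(*uses):
    """Least upper bound of the given uses under the assumed chain order."""
    return max(uses, key=lambda use: RANK[use])


def least_solution(variables, constraints):
    """Iterate to the least assignment satisfying every constraint.

    Each constraint is a pair (var, rhs): `var` must be at least
    rhs(assignment), where rhs is a monotone function of the assignment.
    """
    assignment = {var: "0" for var in variables}  # start from the bottom use
    changed = True
    while changed:  # terminates: a variable can be raised at most 3 times
        changed = False
        for var, rhs in constraints:
            required = rhs(assignment)
            if RANK[required] > RANK[assignment[var]]:
                assignment[var] = required
                changed = True
    return assignment


# Hypothetical constraints: j >= i, i >= 1, k >= join(j, δ); m is unconstrained.
constraints = [
    ("j", lambda a: a["i"]),
    ("i", lambda a: "1"),
    ("k", lambda a: join(a["j"], "δ")),
]
solution = least_solution(["i", "j", "k", "m"], constraints)
print(solution)
```

Because every update strictly raises one variable in a finite chain, the loop reaches a fixed point after a bounded number of passes, whatever order the constraints are listed in.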
One important difference from the previous linear
type inference is that the least solution is not necessarily the best solution from the viewpoint of memory management.
For example, consider an expression let $y = (1,2)$ in (let $x = \mathit{fst}(y)$ in $x$).
The least annotation is let $y = (1,2)^{1}$ in (let $x = \mathit{fst}^{\delta}(y)$ in $x$).
In this expression, $y$ cannot be deallocated since the
only access to the pair $y$ ($\mathit{fst}(y)$) is annotated with
$\delta$. Therefore, it is better to annotate the expression as
let $y = (1,2)^{1}$ in (let $x = \mathit{fst}^{1}(y)$ in $x$). We can think of
two ways for dealing with this. One way is to classify
uses into those for annotating heap allocation and those
for annotating heap access. After the least solution is obtained, we can maximize heap access annotations as long
as no heap allocation annotations increase. The other way
is to extend a target language with an explicit deallocation operation $\mathit{free}(\Gamma)$:
it deallocates heap values as if
the heap were accessed as described by $\Gamma$. For example,
$\mathit{free}(x : \mathit{real}^{1}, y : \mathit{real}^{\delta})$ deallocates $x$ but not $y$. The rule
(T-WEAK) can be changed as follows ($\Gamma - \Gamma'$ is the least
type environment $\Gamma''$ such that $\Gamma'; \Gamma'' = \Gamma$):
1. Attach a fresh use variable to each place where use
annotation is required (let the annotated expression be
M’).
2. Compute a principal typing P, C t- M’ : r.
3. Solve the constraint C and apply the obtained substitution to M’.
In the following subsections, we explain the second and third
steps in a little more detail, and then discuss the cost of the
analysis.
5.1
Computing
a principal
typing
An algorithm for computing a principal typing is obtained
by constructing syntax-directed typing rules for the new
type judgment form and by reading them from bottom to
top. For example, we can merge the rules (T-PAIR) and
(T-WEAK) and obtain the following rule for the new type
judgment form:
$$\frac{\Gamma, x : \tau_1 \times^{\kappa} \tau_2, C \vdash M : \tau_3 \qquad C \models \Gamma' \geq (\Gamma + V_1 : \tau_1 + V_2 : \tau_2) \qquad C \models \kappa \neq \delta}{\Gamma', C \vdash \mathbf{let}\ x = (V_1, V_2)^{\kappa}\ \mathbf{in}\ M : \tau_3}\ \text{(TR-PAIR)}$$
Here, $C \models \Gamma_1 \geq \Gamma_2$ means that, for any substitution $\theta$ on
type/use variables, if $\theta C$ is satisfied then $\theta\Gamma_1 \geq \theta\Gamma_2$ is also
satisfied. The above rule says that in order to compute a
typing for let $x = (V_1, V_2)^{\kappa}$ in $M$, we should first compute
a typing for $M$ and then add new constraints entailing $\Gamma' \geq
\Gamma + V_1 : \tau_1 + V_2 : \tau_2$ and $\kappa \neq \delta$.
One major difference from the previous linear type inference [8,22] is that constraints on type variables appear
in $C$ (recall the typing for let $x = (y, y)$ in $x$ given above).
They are of the form $\tau_1 \geq \tau_2$ or $\tau_1 \sim \tau_2$ (meaning $\tau_1$ and $\tau_2$
are compatible in the sense that $\tau_1 + \tau_2$ and $\tau_1; \tau_2$ are well
defined). Here, the right-hand type expression $\tau_2$ of $\tau_1 \geq \tau_2$
may contain $+$, $\lceil\cdot\rceil$, etc. as constructors of type expressions
(rather than as functions on types).
5.2
Solving
constraints
$$\frac{\Gamma' \vdash M : \tau \qquad \Gamma' \leq \Gamma}{\Gamma \vdash M; \mathit{free}(\Gamma - \Gamma') : \tau}$$
This change avoids forgetting to deallocate linear values.
We can optimize the result by moving the operation free
leftward and merging it with heap access annotation as far
as possible. Although there is no guarantee that these methods give the best use annotation, we expect that both give
a good enough annotation in practice.
5.3
Cost of the analysis
Unfortunately, the computational cost for type reconstruction can be exponential in the size of an input in the worst
case. The reason is that the number of use variables can be
linear in the size of the type of a variable appearing in an
input expression and that the size of a type can be exponential in the size of an input expression. For example, if a
variable $x$ is required to have type (int × int) × (int × int)
by the context, its typing is:
on types and uses
An algorithm for solving constraints can be divided into two
steps. First, a set of constraints on type/use variables can
be simplified into a pair of a set of constraints on use variables of the form $i \geq \kappa$ and a set of “trivial” constraints
on type variables of the form $\alpha \sim \beta$ or $\alpha \geq \tau$ where $\tau$ is a
2
‘Note
(T-WEAK’)
that a constraint $i \neq \delta$ can be expressed as $i \geq \lceil i \rceil$.
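$$x : (\mathit{int} \times^{i_1} \mathit{int}) \times^{i_3} (\mathit{int} \times^{i_2} \mathit{int}),\ \{i_1 \geq i_1',\ i_2 \geq i_2',\ i_3 \geq i_3'\} \vdash x : (\mathit{int} \times^{i_1'} \mathit{int}) \times^{i_3'} (\mathit{int} \times^{i_2'} \mathit{int}).$$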
So, the number of use variables is linear in the size 3 of the
type (int × int) × (int × int). Next, consider an expression
let $z_0 = 1$ in let $z_1 = (z_0, z_0)$ in let $z_2 = (z_1, z_1)$ in
... let $z_n = (z_{n-1}, z_{n-1})$ in $z_n$.
The size of the type of the expression is exponential in n.
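The growth can be checked with a short Python sketch. It counts the int leaves and the pair constructors in the type of $z_n$, assuming (as in the example above, where the type of size 3 carries three use variables) one use variable per pair constructor. The function name `type_shape` is hypothetical.

```python
def type_shape(n):
    """Return (int_leaves, pair_constructors) for the type of z_n."""
    leaves, pairs = 1, 0  # z_0 = 1 has type int
    for _ in range(n):
        # z_{k+1} = (z_k, z_k): two copies of the previous type under one new pair
        leaves, pairs = 2 * leaves, 2 * pairs + 1
    return leaves, pairs


for n in range(5):
    print(n, type_shape(n))
```

For n = 2 this gives 4 leaves and 3 pair constructors, matching the type (int × int) × (int × int); in general the type of $z_n$ has $2^n$ leaves and $2^n - 1$ pair constructors, while the expression itself grows only linearly in n.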
In spite of the above fact, we expect that the reconstruction can be performed efficiently for realistic programs for
the following reasons. First, the above problem occurs only
when tuple or record types are extraordinarily nested; a huge
flat tuple type expression int × int × ... × int or an arrow
type expression $\tau_1 \to \tau_2$ does not cause such problems. Second,
if the computational cost is really problematic for some program, we can save the cost by sacrificing the accuracy of
the analysis. For example, we can restrict the $+$ operator
so that $(\tau_1 \times^{\kappa} \tau_2) + (\tau_1' \times^{\kappa'} \tau_2')$ is defined only if $\tau_1 = \tau_1'$ and
$\tau_2 = \tau_2'$. This forces many type expressions to be shared and
saves the time and space of the analysis (in fact, the previous analysis in [8,9], which imposes a similar restriction,
is performed in polynomial time for the monomorphic type
system).
6
6.2
So far, we have considered only integers as constants. In
order to deal with large constants (i.e., constants that cannot
be represented in one word and must be allocated in a heap),
we need to annotate each allocation of a large constant. For
example, allocation of a linear real number 2.0 is annotated
as let $x = 2.0^{1}$ in .... The type of a large constant is also
annotated with a use. For instance, the type of linear real
numbers is represented by $\mathit{real}^{1}$. The typing rule for real
numbers is given as follows:
$$\frac{\Gamma, x : \mathit{real}^{\kappa} \vdash M : \tau}{\Gamma \vdash \mathbf{let}\ x = r^{\kappa}\ \mathbf{in}\ M : \tau}$$
6.3
Primitive
functions
Primitive functions can be treated as constants of polymorphic types, and each occurrence of a primitive function can
be annotated with uses. For example, the primitive + on
real numbers can be assigned a type
Extensions
We have so far considered a simply-typed X-calculus with
recursion and pairs. It can be extended by introducing polymorphism on uses and types, and by adding other kinds of
data such as recursive data structures and reference cells.
For lack of space, we only briefly discuss how to introduce
polymorphism, constants such as real numbers, and recursive data structures.
6.1
Large constants
+ z in . . . can be annotated like:
The expression let z =
?
let z=+[l,l,l,l](y,z)
in ....
6.4
Because the type system presented in Section 3 is monomorphic in both uses and types, the analysis of (quasi-)linear
values is often rough. Consider the following expression:
in
::
:
nil :
hd
:
If either $z$ or $w$ is used as an $\omega$-value in $M$, the other is
also forced to be an $\omega$-value, because the code for the pair
creation (i.e., the body of the function $f$) is shared.
In order to avoid the above problem, we can introduce polymorphism on uses. Then, f can be given a type
Vi # G.(int+“+‘(int
xi id))
and the above expression can
be annotated, for example, as
...
7
rl,
4
A C2
k
Ml
: 1~11
7.1
Mz : 72
Ci,a do not affect the values of variables in Pi, M
c2.[Tll,&
Va, i.o list’
Vo,i :: {i > S}.(a
list’)
list’+“+a)
Preliminary
Experiments
We have implemented a prototype type reconstruction system for the core-ML (i.e., SML [15] without modules) based
on our quasi-linear type system and a simple profiler that
executes the output program and counts the number of allocated heap values. We first describe the current status
of the system in Section 7.1 and then report the results of
simple experiments in Section 7.2. The prototype system
will be available from http://www.yl.is.s.u-tokyo.ac.jp/~koba/research/qlinear/.
Here, uses are explicitly passed as parameters to polymorphic functions.
In general, we need to introduce a constrained type
scheme of the form $\forall \alpha_1, \ldots, \alpha_n, i_1, \ldots, i_k :: C.\tau$, where $C$
is a set of constraints on type and use variables such as
$\alpha_1 \geq \alpha_2 + \alpha_3$ and $i_1 \geq i_2 + 1$. A new rule for polymorphic
let-expressions roughly looks like:
: trd,;::
VCY,
i.((a xi a liSti)+u*wa
In general, we can generate the types of constructors and destructors from datatype declarations of Standard ML [15].
let $f = \Lambda i.\lambda x.(x, x)^{i}$ in let $z = \mathit{apply}(f[1], 2)$ in
let $w = \mathit{apply}(f[\omega], 3)$ in $M$.
l-2,2
data structures
Recursive data structures such as lists and trees can be
treated in the same way as pairs, by annotating their type
constructors with uses. The type of a list of values of type
$\tau$ is expressed as $\tau\ \mathit{list}^{\kappa}$, where $\kappa$ expresses the use of each
cons cell of the list. For example, $\mathit{real}^{1}\ \mathit{list}^{\omega}$ is the type
of a list whose cons cells may be accessed many times in
an arbitrary manner, but whose elements are real numbers
that can be accessed only linearly. Primitives for lists can
be given the following types:
Polymorphism
let $f = \lambda x.(x, x)$ in let $z = \mathit{apply}(f, 2)$ in
let $w = \mathit{apply}(f, 3)$ in $M$.
Recursive
t-
A prototype
type reconstruction
system
Following the extensions discussed in Section 6, we have implemented a prototype analyzer. It takes an expression of
the core-ML as an input, performs type (and use) reconstruction and produces a use-annotated expression. It has
rl;rz,C1AC3~letz=M~inM2:r2
(T-LETPOLY)
been implemented by using the ML kit version 1 [4] as a
front end, and supports full features of the core-ML: records,
datatype declarations, references, and exceptions. Polymorphism on types and uses is based on the let-polymorphism
discussed in Section 6.
Polymorphic recursion on uses would be sound (just as
polymorphic recursion on regions is sound in the region inference [20]), but it is not supported currently. That is because the algorithm would become complex and also because
the polymorphic recursion does not seem so crucial for our
analysis.
The current system has the following limitations.
It produces the least use annotation, not the best one (see
Section 5); the analysis of the use of a reference cell is
rough; unnecessary uses are passed as parameters to use-polymorphic functions (they can be removed in the same
manner as unnecessary region parameters are removed in a
post-pass of the region inference [5]); the current system
is very slow for several known reasons (it is just because
we preferred rapid prototyping and the system will be optimized in the future); a compiler that utilizes the analyzed
information for in-place update, etc. has not been implemented yet.
7.2
Results
of preliminary
boyer programs, the percentage of linear values is relatively
smaller. This is probably because unification of first-order
terms performed in the program causes sharing of many
heap values.
One should, however, note that the percentages shown
in the rightmost column do not necessarily indicate how
much memory space can be saved by our analysis: compared with memory management using conventional garbage
collection, extra memory space is required for representing
use-polymorphic functions and their instances, and it is not yet
clear how soon linear values can indeed be deallocated. Also,
the current analysis is performed on unoptimized programs;
other optimizations such as inlining and hoisting may reduce
the ratio of linear values. Therefore, we will know for sure the
real impact of our analysis only after more serious experiments are carried out (for that purpose, some of the future
work described in Section 9 must be completed).
8
Work
Previous linear type systems
To our knowledge, among
previous linear type systems [3,9,22] that can automatically infer the usage of values, only Barendsen and Smetsers’s uniqueness typing [3] takes an evaluation order into
account. The effect of taking an evaluation order into account sometimes depends on a programming style, but it is
evident at least for the following functions:
experiments
Table 1 shows the result of experiments. The columns ‘zero,’
‘linear,’ and ‘omega’ respectively show how many heap values of use 0, 1, and $\omega$ are allocated dynamically in each
program. The rightmost column shows what percentage of
heap values are zero or linear. For a use-polymorphic function like f : Vi.(realZ+‘“+ real’), the allocation was counted
only once; in other words, creations of instances like f[l] and
f[w] are not counted as the heap allocation (this is because
the current analyzer infers not the use of each instance of
a use-polymorphic function but the total use of all the instances).
“sumlist10000” is a program that makes a list of integers
from 0 to 10,000 and computes the sum of them. “qsort20”
and “msort20” sort a list of 20 real numbers by using the
quick sort and merge sort algorithms. They were written in
a naive way; the main part of the quick sort program is:
fun quick [] = []
  | quick (x::l) =
      let val (l1,l2) = divide(x,l)
      in append(quick(l1), x::quick(l2))
      end
Related
fun map f [] = []
  | map f (x::l) = f x::map f l
fun diff ([], S2) = []
  | diff (x::S1, S2) =
      if member(x, S2) then diff(S1, S2)
      else x::diff(S1, S2)
fun move (p, dx) = update(p, X, p.X+dx)
The function f in map and the list S2 in diff (which is a
function computing the set difference) are used many times,
but are judged to be linear in the quasi-linear type system.
The function move takes a record representing a point object
as the first argument and adds dx to its X field. The point
p is accessed twice, but it is also judged to be linear.
The uniqueness typing [3] is different from ours in many
ways: the evaluation order is call-by-need,^3 their type system does not have a “δ-type” of values that must be used up
locally, and instead it relies on a separate analysis for analyzing the order of memory access (so, the analysis is not
integrated as a use-type system, and it is rather complex).
Guzman and Hudak [7] and Odersky [18] also proposed a
kind of an extension of a linear type system, which can take
an evaluation order into account and check that destructive
operations on arrays or lists are safely used. Unlike ours,
their approach is to let programmers explicitly declare where
destructive operations should be performed (by using special
primitives for destructive update of data structures) and
check whether they are safe by performing type inference.
“sieve2000” finds prime numbers less than 2,000 by using
the sieve of Eratosthenes. “life” computes the first 10 generations of lives generated from the initial 44 lives. “dangle”
and “reynolds3” are programs taken from [5] (with some parameters being changed). “reynolds3u” is a slightly modified
version of “reynolds3,” in which one function is uncurried so
that our analysis works better.
In the cases of sumlist, merge/quick sorts and the sieve of
Eratosthenes, almost all of the heap values are linear: nonlinear values were only top-level functions. Moreover, by
looking at the produced use-annotated expression, we can
find that the cons cells generated in merge/quick sorts and
the sieve of Eratosthenes can all be updated in-place.
In the case of life (some curried functions in the original program were uncurried: see discussions in Section 9),
dangle, and reynolds3, not all heap values are linear, but the
number of linear values is still large enough to greatly reduce
the need for garbage collection. In the Knuth-Bendix and
Region inference
There is an alternative approach to static
memory management: region inference [5,20]. The region
inference abstracts a bunch of memory addresses as a region,
and estimates its life-time by performing a kind of type inference. Our type system and the region inference have both
^3 Recently, they dealt also with strict let expressions, but the analysis for them seems to be more naive than our analysis [19].
Program        zero       linear       omega   (zero+linear)/total
sumlist           0       40,001           2   100.0%
qsort20           0          448           4    99.1%
msort20           0          496           4    99.2%
sieve2000         0      256,455           8   100.0%
life              0    2,353,116      26,600    98.9%
Knuth-Bendix      0   10,339,410   2,949,292    77.8%
boyer             0    1,163,786     342,642    77.3%
mandelbrot        0    5,672,170     920,944    86.0%
dangle       20,000       20,032          34    99.9%
reynolds3         0       21,515       2,059    91.3%
reynolds3u        0       21,514          13    99.9%
Table 1: The number of allocated heap values
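The rightmost column of Table 1 can be recomputed from the other three. The following Python sketch does so, using the counts as read from the table (the variable names are hypothetical, and the counts for a few rows were hard to read in the source, so they should be treated as a best reading rather than authoritative data).

```python
# (program, zero, linear, omega, reported percentage) as read from Table 1
ROWS = [
    ("sumlist", 0, 40_001, 2, "100.0"),
    ("qsort20", 0, 448, 4, "99.1"),
    ("msort20", 0, 496, 4, "99.2"),
    ("sieve2000", 0, 256_455, 8, "100.0"),
    ("life", 0, 2_353_116, 26_600, "98.9"),
    ("Knuth-Bendix", 0, 10_339_410, 2_949_292, "77.8"),
    ("boyer", 0, 1_163_786, 342_642, "77.3"),
    ("mandelbrot", 0, 5_672_170, 920_944, "86.0"),
    ("dangle", 20_000, 20_032, 34, "99.9"),
    ("reynolds3", 0, 21_515, 2_059, "91.3"),
    ("reynolds3u", 0, 21_514, 13, "99.9"),
]


def percent_zero_or_linear(zero, linear, omega):
    """Percentage of allocated heap values whose use is 0 or 1."""
    return 100.0 * (zero + linear) / (zero + linear + omega)


computed = {
    name: f"{percent_zero_or_linear(zero, linear, omega):.1f}"
    for name, zero, linear, omega, _ in ROWS
}
for name, _, _, _, reported in ROWS:
    print(f"{name:13s} computed {computed[name]:>5s}%  reported {reported:>5s}%")
```

With these counts, every recomputed percentage agrees with the reported column to one decimal place.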
ture. First, static memory management is especially attractive in parallel/distributed environments, since it requires
much less communications/synchronizations
than conventional garbage collection. Second, with the increasing importance of cache memory, saving the memory space for a
program is also likely to save its execution time.
The type system proposed in this paper is for call-by-value languages; it is not yet clear whether a similar type
system can be developed for lazy functional languages.
Much work is left to be done to apply our type system
to a compiler of ML and know its real impact. We need to
work at least on the following issues (although they seem to
be fairly straightforward except for the last issue):
advantages and disadvantages. Because the life-time of data
is estimated region-wise, it is difficult to use the region inference for in-place update. Another shortcoming of the region inference is that too many data are often merged into
the same region (especially when recursive data structures,
higher-order functions, and reference cells are used), and as
a result, the analysis of the life-time of data sometimes becomes rough. For example, consider how to represent the
type of a list. Because a list may contain an arbitrary number of cons cells, it is impossible to use a distinct region for
each cons cell; so, the type of a list must be something like
$(\tau\ \mathit{list}, \rho)$ where $\rho$ is the region in which all the cons cells are
stored. This implies that a cons cell of a list can be deallocated only after all the cons cells of the list become garbage.
On the other hand, in our analysis, the type of a list is of
the form $\tau\ \mathit{list}^{\kappa}$; if $\kappa$ is 1, each cons cell can be deallocated
separately when it is accessed as a value of use 1. Thus,
unlike in the region inference, the life-times of cons cells are
not merged. (We do not intend to say that our analysis is
always better for lists than the region inference. Indeed, if
some cons cell of a list is accessed as an $\omega$-value, our analysis
assigns type $\tau\ \mathit{list}^{\omega}$ to the list; hence no cons cells of the list
can be deallocated automatically. In this case, the region
inference would be much better.)
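The contrast between per-cell and region-wise deallocation can be made concrete with a toy Python model. It is only a sketch under simplifying assumptions: cons cells are modelled as integer addresses, the traversal is a list sum, and the functions `sum_linear` and `sum_region` are hypothetical names that correspond to neither system's implementation. Each function returns the sum together with the number of live cons cells observed after each element is read.

```python
def sum_linear(xs):
    """Cons cells have use 1: a cell is freed at its (last) access."""
    live = set(range(len(xs)))  # addresses of allocated cons cells
    total, live_trace = 0, []
    for addr, value in enumerate(xs):
        total += value
        live.discard(addr)  # last access to this cell: deallocate it now
        live_trace.append(len(live))
    return total, live_trace


def sum_region(xs):
    """All cons cells share one region: nothing is freed until the end."""
    live = set(range(len(xs)))
    total, live_trace = 0, []
    for addr, value in enumerate(xs):
        total += value
        live_trace.append(len(live))  # the whole region is still allocated
    live.clear()  # the region is deallocated as a whole, in one step
    return total, live_trace


print(sum_linear([3, 1, 4]))  # (8, [2, 1, 0])
print(sum_region([3, 1, 4]))  # (8, [3, 3, 3])
```

In the linear model the number of live cells shrinks as the traversal proceeds, whereas in the region model it stays constant until the single deallocation at the end.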
On the other hand, the advantages of the region inference
are that it is completely independent of how often values
are accessed, and that the deallocation of data in the same
region can be performed in a constant time. So, it may be
interesting to use our type system and the region inference
as complementary methods for static memory management.
The goal of our analysis is also similar to that of the
storage mode analysis [5], which was proposed as a complement to the region inference. It analyzes whether each
access to a value is the last access to the region of the value,
and if so, it inserts code to deallocate all values in the
region. Thus, unlike our analysis, with the storage mode
analysis, a value can be deallocated only after all the data
in the region become garbage.
9 Conclusion and Future Work
We proposed an extended linear type system that can elegantly take an evaluation order into account, and proved its correctness. Static memory management like the one proposed here and the region inference is currently less popular than conventional garbage collection. However, we think it will become very important in the near future.
Refinement of the type inference system
As discussed in Section 5.2, the least use annotation may not be optimal from the viewpoint of memory management; so, we need to modify the analyzer so that it can produce a better use annotation.
Transformation for in-place updates
We need to implement a compilation pass that finds places where deallocation of a heap value is immediately followed by allocation of another value of the same type, and replaces each such pair with an in-place update.
Combination with conventional garbage collection
Since the linear-type-based memory management can automatically deallocate only quasi-linear values, we need a conventional garbage collector as well. There is, however, a subtle problem: since the linear-type-based memory management is not based on the pointer traversal information, it may create a dangling pointer (just as the region-based memory management does [20]). For example, when the closure (λz.(fst(y)+z), {y = (1.0,2.0)}) is traced at garbage collection time, the second element 2.0 of y may have already been deallocated. There are two ways of dealing with this problem. One way is to exclude the use 0 from the type system so that dangling pointers cannot be created. The other way, which we are currently exploring (with Atsushi Igarashi), is to use the idea of tag-free garbage collection [16,21]: we can utilize use-annotated types for tracing heap values at garbage collection time, instead of conventional types. For example, from the use-annotated pair type of y in the above closure, whose second component has use 0, it is known that the second element of y will not be accessed by the closure and hence it need not be traced by a garbage collector.
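The idea of tracing by use-annotated types can be sketched with a toy collector. This is a minimal illustration in Python under stated assumptions, not the design being explored by the authors: the tuple encoding of types, the `NONZERO` placeholder, and the functions `trace_all` and `trace_by_use` are all hypothetical. Only the use 0 on the second component of y comes from the text; the uses of the pair itself and of its first component are left as an unspecified non-zero use. The heap models the closure environment that binds y to the pair (1.0, 2.0), after the second element has already been deallocated.

```python
NONZERO = "nonzero"  # assumption: stands for any use other than 0

# Hypothetical type encoding: ("real", use) or ("pair", use, left, right).
Y_TYPE = ("pair", NONZERO, ("real", NONZERO), ("real", 0))

# Toy heap: the pair y points to cells "a" and "b"; "b" (holding 2.0)
# has already been deallocated, so it is absent from the heap.
HEAP = {"y": ("a", "b"), "a": 1.0}


def trace_all(heap, addr, ty, marked):
    """Conventional tracing: follow every pointer regardless of its use."""
    if addr not in heap:
        raise LookupError(f"dangling pointer to {addr!r}")
    marked.add(addr)
    if ty[0] == "pair":
        _, _, left_ty, right_ty = ty
        left, right = heap[addr]
        trace_all(heap, left, left_ty, marked)
        trace_all(heap, right, right_ty, marked)
    return marked


def trace_by_use(heap, addr, ty, marked):
    """Use-directed tracing: never follow a pointer whose type has use 0."""
    if ty[1] == 0:
        return marked  # the value will not be accessed, so skip it
    marked.add(addr)
    if ty[0] == "pair":
        _, _, left_ty, right_ty = ty
        left, right = heap[addr]
        trace_by_use(heap, left, left_ty, marked)
        trace_by_use(heap, right, right_ty, marked)
    return marked
```

On this heap, `trace_by_use(HEAP, "y", Y_TYPE, set())` marks only `"y"` and `"a"`, while `trace_all` raises a `LookupError` when it reaches the freed cell `"b"`.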
The following improvement of the accuracy of the analysis is also future work.
Taking the order of access to different variables into account
Because the type system cannot deal with the order of access to different variables, the analysis can be rough when variables are aliased. For example, consider an expression let y = z in fst’(z) + fst’(y). Since z and y refer to the same heap value, we could regard z as a linear value; however, the current type system judges z to be an ω-value. We can overcome this problem by representing a type environment as a poset of type bindings in order to express the order of access to different variables, as in an earlier version [10] of Kobayashi's type system for deadlock-freedom [11].
Combining our analysis with other analyses
Our analysis of δ-values is weak in dealing with curried functions. Consider an expression (λz.(fst(z) + 1.0))(2.0,3.0) and its curried version ((λz.λy.(z + 1.0))2.0)3.0. For the former expression, δ can be assigned to 2.0, while 1 must be assigned to it for the latter one. This problem may not be so serious because the ordinary compiler optimization uncurries functions as far as possible, but it may be interesting to improve the analysis by using more accurate dataflow information like the one obtained by the region inference.
Acknowledgment
We would like to thank Atsushi Igarashi, Martin Odersky, Kenjiro Taura, Mads Tofte, and the anonymous referees for useful discussions and comments.
References
[1] H. G. Baker. Lively linear lisp - look ma, no garbage! ACM Sigplan Notices, 27(8):89-98, 1992.
[2] H. G. Baker. A linear logic quicksort. ACM Sigplan Notices, 29(2):13-18, 1994.
[3] E. Barendsen and S. Smetsers. Conventional and uniqueness typing in graph rewrite systems. Technical Report CSI-R9328, Computer Science Institute, University of Nijmegen, 1993. An extended abstract appeared in Proc. of FST&TCS 12, Springer LNCS 761, pp.41-51.
[4] L. Birkedal, N. Rothwell, M. Tofte, and D. N. Turner. The ML Kit (Version 1). Technical Report 93/14, Department of Computer Science, University of Copenhagen, 1993.
[5] L. Birkedal, M. Tofte, and M. Vejlstrup. From region inference to von Neumann machines via region representation inference. In Proceedings of ACM SIGPLAN/SIGACT Symposium on Principles of Programming Languages, pages 171-183, 1996.
[6] J.-Y. Girard. Linear logic. Theoretical Computer Science, 50:1-102, 1987.
[7] J. C. Guzmán and P. Hudak. Single-threaded polymorphic lambda calculus. In Proceedings of IEEE Symposium on Logic in Computer Science, pages 333-343, 1990.
[8] A. Igarashi. Type-based analysis of usage of values for concurrent programming languages. Master's thesis, Department of Information Science, University of Tokyo, 1997.
[9] A. Igarashi and N. Kobayashi. Type-based analysis of usage of communication channels for concurrent programming languages. In Proceedings of International Static Analysis Symposium (SAS'97), Springer LNCS 1302, pages 187-201, 1997.
[10] N. Kobayashi. A Partially Deadlock-free Typed Process Calculus (I) - A Simple System -. Technical Report 96-02, Department of Information Science, University of Tokyo, September 1996.
[11] N. Kobayashi. A partially deadlock-free typed process calculus. ACM Transactions on Programming Languages and Systems, 20(2):436-482, 1998. A preliminary summary appeared in Proceedings of LICS'97, pages 128-139.
[12] N. Kobayashi. Quasi-linear types. Technical Report 98-02, Department of Information Science, University of Tokyo, 1998. Available through http://ava.yl.is.s.u-tokyo.ac.jp/~koba/publications.html.
[13] N. Kobayashi, B. C. Pierce, and D. N. Turner. Linearity and the pi-calculus. In Proceedings of ACM SIGPLAN/SIGACT Symposium on Principles of Programming Languages, pages 358-371, January 1996.
[14] I. Mackie. Lilac : A functional programming language based on linear logic. Journal of Functional Programming, 4(4):1-39, October 1994.
[15] R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML (Revised). The MIT Press, 1997.
[16] G. Morrisett. Compiling with Types. PhD thesis, School of Computer Science, Carnegie Mellon University, 1995.
[17] G. Morrisett, M. Felleisen, and R. Harper. Abstract models of memory management. In Proceedings of Functional Programming Languages and Computer Architecture, pages 66-76, 1995.
[18] M. Odersky. Observers for linear types. In Proceedings of 4th European Symposium on Programming (ESOP'92), Springer LNCS 582, pages 390-407, 1992.
[19] R. Plasmeijer and M. van Eekelen. Concurrent Clean ver.1.3 language report, 1997.
[20] M. Tofte and J.-P. Talpin. Implementing the call-by-value lambda-calculus using a stack of regions. In Proceedings of ACM SIGPLAN/SIGACT Symposium on Principles of Programming Languages, pages 188-201, 1994.
[21] A. Tolmach. Tag-free garbage collection using explicit type parameters. In Proceedings of ACM Conference on Lisp and Functional Programming, pages 1-11, 1994.
[22] D. N. Turner, P. Wadler, and C. Mossin. Once upon a type. In Proceedings of Functional Programming Languages and Computer Architecture, San Diego, California, pages 1-11, 1995.