Quasi-Linear Types Naoki Kobayashi Department of Information Science, University of Tokyo email:[email protected] linear pair, the second element y of the pair can be destructively updated to y + 1. Similarly, the “append” of linear lists can be performed destructively, without copying cons cells. This kind of improvement will be especially effective for functional programs, because many intermediate data structures like cons cells and closures are created in naive functional programs. Abstract Linear types (types of values that can be used just once) have been drawing a great deal of attention because they are useful for memory management, in-place update of data structures, etc.: an obvious advantage is that a value of a linear type can be immediately deallocated after being used. However, the linear types have not been applied so widely in practice, probably because linear values (values of linear types) in the traditional sense do not so often appear in actual programs. In order to increase the applicability of linear types, we relax the condition of linearity by extending the types with information on an evaluation order and simple dataflow information. The extended type system, called a quasi-linear type system, is formalized and its correctness is proved. We have implemented a prototype type inference system for the core-ML that can automatically find out which value is linear in the relaxed sense. Promising results were obtained from preliminary experiments with the prototype system. 1 1.2 tems, they do not seem to have been used so widely, even after type inference algorithms [8,9,22] were proposed that can automatically find which values are linear. We think one of the major reasons for this is that linear type systems are too naive for the above application: what is actually important for automatic deallocation and in-place update is to identify which is the last access to a heap value; the condition of linearity is too strong for this purpose. For example, consider an expression M E let 2 = 2.0 in let y = x + 1.0 in y + 2: it is easy to see that y + x is the last access to z, but linear type systems cannot insert a code to deallocate z - just because z is used twice! This limitation also forces a programmer to follow a particular programming style. For instance, in Xz.(fst(z),snd(z) + l), the pair z is judged to be accessed twice; so, if one wants the pair to be deallocated, one must write X(z, y).(x, y + 1) instead. Linear types A number of type systems based on Girard’s linear logic [6] have been proposed [l, 3,9,13,14,22]. They guarantee that certain data structures (called linear values) are accessed just once (or at most once). The distinction between linear values and other values brings us several benefits: improvement of memory management [ 11, safe inlining [22], etc. Among them, we are here interested in the improvement of memory management. If some data structure is statically known to be linear, it can be deallocated immediately after it is accessed. Moreover, if another heap value needs to be created after a linear value is accessed, we can just replace the linear value with the new value, instead of deallocating the linear value and allocating space for the new data (provided that the physical size of the new value is not greater than that of the linear value). For instance (throughout this paper, we assume a call-by-value language), when the function X(z, y).(z, y + 1) is called, if the argument is a Permission to make digital or hard copies ofall or part of this work 1.3 POPL 99 San Antonio Copyright Texas proposal - almost linear types (quasi-linear In order to remove the above-mentioned limitation of linear types, we relax the condition of linearity. A key idea is to express information on the evaluation order and dataflow by using types. Recall the above expression M: if a type system can take it into account that z + 1.0 is evaluated before y + z, it can ignore the use of z in z + 1.0 and treat the above expression in the same way as let 2 = 2.0 in let y = 1.0 in y + 5. In order to obtain such information on the evaluation order, the type system should also be able to deal with some dataflow information. Consider, for example, an expression let y = f(x) in g(x) + h(y). f(z) is evaluated before g(z) but it does not imply that the access to x in g(z) is the last one: x may be returned by f(z) and accessed again in h(y). In order to deal with this, we distinguish between the type of a value that must be used up in an expression and for on the first page. ‘To copy otherwise. to republish, to post on servers or to redistribute requires prior specific permission and/or a fee. Our types) personal or classroon~ use is granted without f’ee provided that copies are not made or distributed for prolit or commercial advantage and that topics bear this notice and the full citation of linear types In spite of the above-mentioned benefit from linear type sys- Introduction 1.1 Limitation to lists, USA ACM 1999 l-581 13-095-3/99/01...$5.00 29 the type of a value that may be returned as a part of the evaluation result of an expression. It would be possible to obtain the above kinds of information by using more sophisticated analyses, but we instead obtain it by a very simple method - by introducing a new use [9,22]. Advantages of that approach include: (1) the whole analysis can be formalized uniformly in terms of a type system (this is especially advantageous when we introduce polymorphism) and (2) a program containing free variables can be analyzed as long as their type information is provided, which is useful for modular (or incremental) analysis and separate compilation of programs. We f&t review the ideas of previous linear type systems [9,22] and then overview the ideas of our system below. 1.3.1 Review with use 1 (hereafter, we call it a quasi-linear value or simply a linear value, and call a use-once value in the traditional sense a strictly linear value) can now be accessed in a more relaxed manner: It can be accessed many times as a b-value, and then it can be accessed as a strictly linear value (i.e., as a value that can be accessed once and deallocated). In order to guarantee such usage of a value, it is crucial that the type system can take the evaluation order into account. The type derivation shown above is now changed as follows: x : real6 t x + 1.0 : real’ x:reaP,y:reaPl-z+y:reall 2:rea16;z:reaZ1(=x:reaZ1) b let y = x + 1.0 in x + y : real1 of linear type systems Here, the new combination (x : reals); (z : real’) of type environments captures the fact that z is first accessed as a J-value in z + 1.0, and then it is accessed as a linear value in x + y. Thus, 2 can be deallocated after x + y is evaluated. Because a quasi-linear value may be used more than once, in addition to allocation of a heap value, each access to a heap value is also annotated with a use: it is for distinguishing between the last access of the value and the other accesses. For example, the expression M is annotated as let x1 = 2.0 in let y1 = x6 + 1.0 in y1 + z1 (actually, we will annotate constructors and destructors of heap values instead of variables). Here, we annotate x in y + x with 1 in order to indicate that it is the last access, while x in x + 1.0 is annotated with 6 in order to indicate that it is not the last one. A use was introduced to control how often values can be accessed [9,22]: it was either 0 (not used at all), 1 (accessed at most once), or w (accessed an arbitrary number of times) in [9] and was either 1 or w in [22]. Types are annotated with such uses. For example, real’ represents the type of real numbers that can be accessed at most once, and int-b’int is the type of functions on integers that can be called an arbitrary number of times. A type judgment can be accordingly refined so that it gives more useful information than the ordinary one: for example, x : real’ l- N : real” means not only that 2 is used as a real number in N but also that it is used at most once. So, it is invalid if N is let y = 2 + 1.0 in x + y. The following type derivation highlights a key point of the linear type system: 1.4 z : real1 I- 3: + 1.0 : real’ 2:reaZ’,y:reallt-z+y:real’ The premises of the derivation imply that z is used once each in x + 1.0 and x + y. In order to know how x is totally used in the expression, we combine the type environments of subexpressions by adding the corresponding uses as above. This is in contrast to the following rule of ordinary type systems, where type environments are shared between subexpressions: rl--r :7-i r, y : 71 i- iI& : 72 r t- let y = Ml in M2 : 73 By using the linear type system, we can infer how often each heap value is accessed, and annotate each allocation with a use. For instance, we can annotate the expression. M as let xy = 2.0 in let y1 = x + 1.0 in z + y, so that y is deallocated after it is accessed in x + y while x is not deallocated. Overview results Main contributions of the present work are formalization of the new type system sketched above and a proof of its correctness. Since the evaluation order must be taken into account, the proof is non-trivial and more involved than that of previous type systems. Another contribution is implementation of a type inference system for the core-ML (i.e., Standard ML without modules [15]) based on the new type system, which inputs an unannotated core-ML expression and outputs a useannotated expression. We have so far tested several fairly small programs and obtained promising results: for programs that use lists frequently (such as sorting programs, the sieve of Eratosthenes and Conway’s game of life), the type inference system could judge that most heap values (in the case of the sorting programs and the sieve of Eratosthenes, all values except for top-level functions) are linear, which indicates that they can be executed almost without garbage collection. Moreover, the annotated programs output by the system indicate that most linear cons cells in those programs can be updated in-place (the application of linear types to the in-place quick sort has been suggested by Baker [2] but a remarkable point here is that it can be performed for naive programs with no programmer’s annotation). z : real’ + 2 : real1 (= z : real”) I- let y = z + 1.0 in z + y : real’ 1.3.2 Main of our type system In order to express simple dataflow information, we introduce another use 6. Intuitively, a value with use S (which we call a S-value below) may be accessed many times but only locally - it cannot be returned as (a part of) the computed result. So, x can have type real6 in z + 1.0 and (Xy.(y + l.O))z, but not in (2,l.O) or Xy.s + y. A value 1.5 Structure of the paper The rest of this paper is structured as follows. Section 2 introduces the syntax of the target language. Section 3 gives our type system for judging whether an expression is correctly annotated with uses. Section 4 sketches a proof of 30 (VI, Vz)” and Xny.Ml allocate a pair and a closure respectively in the heap, and records that their use is IC(which shall be restricted to O,l, or w by the type system given later). The recorded use expresses how the value can be accessed in the rest of the computation: 0 means that the value cannot be accessed at all (so, (VI, V2)’ need not actually allocate the pair in the heap), 1 means that the value can be accessed as a linear value and may be deallocated later, and w means that the value can be accessed in an arbitrary manner and can be deallocated only by other mechanisms such as garbage collection. recn (f, y, Ml) creates a recursive function f such that f(y) = Ml. There are three operations to access the heap: fst”(z) (snd”(z), resp.) extracts the first (second, resp.) element of the pair stored at 2, and appZy”(y, V) reads the closure stored at y and applies it to V. R attached to an access operation represents as a value of what use the heap value is accessed by this operation. If K.is 6, the heap is just read and unchanged; if it is 1, the heap value is deallocated after being read (if the use of the heap value is also 1’). If n is w, then the operation assumes that the accessed heap value is an w-value and does not deallocate the value. (so, w is the same as 6 in effect and is unnecessary; it is included just for technical convenience). If K. is greater than the actual use recorded at the accessed address, the access is invalid and causes an error. A conditional expression ifD V then Ml else Ma is reduced to Ml if V = 0 and otherwise reduced to M2. the correctness of the type system and Section 5 briefly discusses type inference issues. Section 6 discusses several extensions of the target language. Section 7 explains our prototype type inference system and the results of preliminary experiments. Section 8 discusses related work and Section 9 concludes this paper. For space restriction, we omit some formal definitions, proofs, etc.: they are found in the full version of this paper [12]. 2 2.1 Target Language Uses As explained in Section 1, a use is introduced to control how often and in which way each heap value can be accessed. Definition 2.1 [uses]: A use is 0, 6, 1, or w. We use metavariables ~1, ~2,. . . for uses. A new point here is the use 6: basically, we say that a heap value is accessed in an expression as a value of use 6 if it is accessed locally inside the expression and cannot be returned as a part of the evaluation result,. 0 means that the data cannot be accessed at all, 1 means that after the data is accessed in a restricted manner as a value of use 6, it can be either (1) accessed at most once and deallocated inside the current expression being evaluated or (2) put into the evaluation result as a value of use 1. w means that the data can be accessed many times in any way. A value of use w can be regarded as a value of any other use, and a value of 1 can be regarded as a value of use S or 0. In order to express this kind of relationship between uses, we define a total order 5 on uses by 0 5 6 5 1 5 w. We write ~1 < ~2 if ~1 # ~2 and ~1 5 ~2. Example 2.3: let z = (1,2)l in let y = fst’(z) in let z = snd’(z) in z allocates a linear pair (1,2) in the heap, and then reads its first element; at this moment, z is not deallocated since fst is annotated with 6. After that, the second element, is read and the pair is deallocated. Remark 2.2: By a heap value being accessed (or used), we mean the contents of the heap value are indeed read; so, just passing the reference to the heap value does not count as access. For example, we do not say that y is accessed in (Xz.z)y, since the expression can be reduced to y without reading the actual contents of y. A pair is considered to be accessed only when its first or second element is extracted, a closure is accessed only when it is invoked, and a real number is accessed only when it is passed as an argument to a primitive function on real numbers. 2.2 2.3 Operational semantics In order to clarify how use annotation is used for allocating/deallocating heap values at run-time, we define an operational semantics as a rewriting relation on pairs of a heap and an expression [17,22]. Definition 2.4 [evaluation contexts]: The set, of evaluation contexts is given by the following syntax: C[ ] ::= [ ] 1let z = C[ ] in M. Expressions We write C[M] for the expression obtained from C[ ] by replacing the hole [ ] with M. We consider a simply-typed X-calculus with recursion and pairs as a target language. Its extension with polymorphism and other data structures will be briefly discussed in Section 6. Because our type system is sensitive to an evaluation order, we use the K-normal form [5] in order to make the evaluation order explicit and name each intermediate value. (For readability, we sometimes use expressions that are not in K-normal form when giving examples.) The syntax of expressions is given in Figure 1. Each allocation or read operation of a heap value is annotated with a use. The annotation can be automatically inferred by a type inference algorithm explained in Section 5. We write & for the set of expressions. The metavariable V represents a value that can be represented in one word and need not be allocated in the heap. Here, it is either an integer constant (denoted by n) or a variable (denoted by 2). A variable can be considered a heap address. Definition 2.5 [heap]: A heap value, denoted by h, is a term of the form (VI, Vz) or Xx.M. A heap is a mapping from a finite set, of variables to pairs consisting of a use and a heap value. We write (21 I-+~~ hl, . . . , z,, I-+~” h,} for a heap H such that H(zi) = (Ki, hi). The use associated with a heap value is used for judging whether the value can be deallocated after it is accessed as a linear value: the value can be deallocated only if the use is 1. The use 0 means that the value has been already ‘The check of the use of the heap value is necessary, because we allow an w-value to be accessed as a linear value. If we forbid such coercion of an w-value into a linear value as in [22], then the check is unnecessary. However, we prefer not to make such restriction because the check can be implemented efficiently(see Remark 2.6). 31 A4 (expressions) ::= V (non-heap values) ::= V 1 let z = (VI,VZ)~ in M 1let z = fst”(y) in M 1let z = snd”(y) in M 1let x = X”y.Ml in M2 I let z = reC(f,y,Ml) in M2 I let z = appZy’((y, V) in M I let z = Ml in M2 1ifII V then MI else M2 z 1n Figure 1: The Syntax of Expressions Example 2.8: If the expression in Example 2.3 is wrongly annotated as: deallocated. We often write h” for (VI, Vz)” or A”z.M. We write X for the set of heaps. The reduction relation WE (X x E) x ((74x E) U{Error}) is defined by the rules given in Figure 2. Error indicates that invalid access to the heap has occurred (there are other kinds of errors such as application of a non-function to a value, etc.; however, because the lack of those errors can be guaranteed by a type system in the usual way, this paper focuses on errors caused by invalid heap access). The first two rules (R-HEAP) and (R-REC) are for allocating heap values. The use K.is recorded at a new address together with the allocated value. The heap is not changed in (R-SVAL) since V is a one-word value. The rules (R-APP) and (R-APP-ERROR) for function ap- then it is reduced to Error ({},let - 3 Remark 2.6: One may think that the above operation on uses incurs a heavy overhead at run-time. However, because the type system guarantees that no invalid access occurs, the check of K > n’ is always guaranteed to succeed and can be eliminated. Moreover, the use associated with a heap value can change only from 1 to 0; since this change means that the heap value is deallocated in the actual implementation, the use need not be updated actually. This further implies that the use of a heap value can be recorded not in the heap space itself but in the pointer to it by using a one-bit tag, and that the extra run-time cost is only to check this tag (to know whether the use of the accessed heap value is 1 or w) when the access operation is annotated with 1. So, we think the actual run-time overhead is quite small. - 3.1 (1,2)), = fst6(u) in (1,2)},let y (1,2)),let z (1,2)}, let z 4 ‘u Q 7 - ({u ++O (112)), 2) (L2)L = fst’(u) in let z = snd6(u) in z) (1,2)},let y = 1 in let z = snd6(u) in z) (1,2)}, let z = snd’(u) in z) System Types ::= int 171 x6 72 I 71-bnl+Q72 n of 71 xK r2 and ~1 of r1-+K11r;2r2expresses how a pair and a closure can be accessed. We require n2 of rr-+~L’lnars not to be 6. It represents how often (0, 1, or w ) the closure can be invoked in the sense of previous linear type systems [9, 221. The closure itself is allocated and deallocated based on nl. The reason why we keep ~2 will become clear when we present a typing rule for closure creations. Note that the meaning of a pair type is different from that in previous linear type systems [8,22]. In the previous linear type systems, int x“’ (int x1 int) corresponds to the linear logic formula !( int @ (int @ int)); so, if a pair (1, (2,3)) has this type, it can be read an arbitrary number of times, and each time the second element (2,3) is extracted, it can be read at most once. On the other hand, in our quasi-linear type system, the second element (2,3) can be accessed only linearly througghout aEZthe access to the pair (1, (2,3)) (not each time (2,3) is extracted). in z) let I = snd’(u) in z) = 1 in let z = snd’(u) = snd’(u) in z) = 2 in z) as follows: Definition 3.1 [types]: The set of types, ranged over by 7, is given by the following syntax. The expression in Example 2.3 is reduced ({u +bl let y ({u I+)’ ({u I+’ ({u *’ in z, As was shown in Example 2.8, if an expression is wrongly annotated with uses, a run-time error may occur. In order to reject such expressions, we introduce a type system. Our type system is a conservative extension of the usual type system for the simply typed X-calculus in the sense that any expressions well-typed in the usual type system are well typed also in our type system. There are two major differences between our quasi-linear type system and the usual linear type systems. As explained in Section 1, a linear value can be accessed many times as long as the last access is identified by the quasi-linear type system. The other difference is the interpretation of a pair type (see below). in Figure 3. Similarly, access to a pair fst”’ (x) or sndn’ (z) succeeds only if K,2 IE’, and the use is decreased by K’ after the access ((T-FST) - (T-SND-ERROR)). ({},let z = (1,2)l in let y = fst’(z) in let z = snd’(z) Type in let z = snd’(z) z = (1,2)l in let y = fst’(z) in let z = snd6 (z) in z) ({u *l let y “v) ({u I+’ u I+’ Z Lror plication deserve attention. Since apply”’ (2, V) reads a closure stored at z as a value with use K’, the currently available use K of z should be greater than or equal to PC’:otherwise, the application causes an error. If K. = K’ = 1, the closure is deallocated after the application and it can no longer be called; so, the use of z should be changed to 0. It is expressed by the subtraction n - K’, whose definition is given Example 2.7: as follows: in let y = fst’(z) let z = (1,2)l in z9 The heap value u is considered to be deallocated when the use changes from 1 to 0. 32 (If, C[let x = h” in M]) +P (H{y r-)‘( h}, C[[y/x]M]) (y fresh) (H, C[let z = rec“(f, y, MI) in M]) w (H{z eK Xy.[z/fMr), C[[z/x]M]) (K CPet z = VP>:!)>; (H, CVI4W) (z fresh) _ (H{x +-FKXy.M}, C[appZy”‘(x, V)]) ‘u (H{x t-k-’ (fc < d) v (K’ = 0) (H{x tie Xy.M}, C[appZy'('(x, V)]) - Xy.M}, C[[V/y]M]) Error a>>‘>0 (H{x +-bK 047vz)}, CwqZ)]) - (H{x +b- (R-APP) (R-APP-ERROR) (R-FST) (v-l, vi)}, C[vl]) (lc < d) v (K’ = 0) (H{x I-+~ (VI, Vz)), C[fst”‘(z)]) K>IE’>O (R-HEAP) (R-REC) (R-SVAL) u Error (H{x I+= (V,,Vz)},C[snd’E’(x)]) “v) (H{z tiKPK’ (Vr, V,)},C[Vz]) (K < K’) v (fc’ = 0) (H{z +-+“ (VI, Vz)}, C[snd”‘(z)]) u Error (H, C[ifo 0 then Mi else Mz]) ti (H, C[Mi]) n#O (H, C[ifo n then Mr else Mz]) -..+ (H, C[M2]) (R-FST-ERROR) (R-SND) (R-SND-ERROR) (R-IFT) (R-IFF) Figure 2: Operational Semantics 3.3 Example 3.2: (int x6 int)-+‘+‘int is the type of a linear closure that takes a pair of integers as an argument, uses it locally, and returns an integer. The function may be invoked many times (thus its use is w in the strict sense), but it can be invoked only linearly in the relaxed sense (i.e., invoked as a closure of use 6 except for the last invocation). For example, X’z.(f&(z) + snd’(z)) can have this type. In presenting typing rules, we use several operations on uses, types, and type environments given in Figure 3 and 4. As explained in Section 1, the operation ~1 + IE~is used for computing the total use of a value when it is accessed as ~1 in one place and as tcz in another place. When the order of the uses is statically known, K~;KZ is used instead. 1~1 is the least non-6 use that is greater than or equal to K, and ]K] is the greatest non-6 use that is less than or equal to n. ~1 . ~2 is used for computing the total use of a value when the value is used as a Kz-value ~1 times. Motivations for these operations will become clearer when typing rules are given below. These operations are extended to operations on types and type environments. For example, if a heap value is accessed as a value of type int x6 int in one place and as a value of type int x ’ int in another place, it is totally accessed as a value of type (int x6 int) + (int x1 id) = int xw int, which means that the value may be accessed more than once. The definitions of operations on pair types and function types deserve special attention; notice that the operations act on sub-components of pair type, but not on the arguments and return types of functions. In order to understand the reason for this, suppose x is accessed in two places, each as a value of type (int x1 int) x1 int (i.e., the inner pair of integers and the outer pair are both accessed linearly). Then, both the inner pair of integers and the outer pair are totally accessed more than once; it is expressed by the type (int x1+’ int) x1+’ int obtained by adding each use attached to the pair types, not by the type (int x1 int) xl+l int (which means that the inner pair is totally accessed only linearly). On the other hand, suppose z is accessed in two places as a function of type (int x1 int)--+‘yl(int x1 int). Then, the function is invoked twice, and each time it is invoked, it uses an argument pair linearly and returns a linear pair. So, the total access to the function is expressed by (int x1 int)+‘+‘~‘+‘(int x1 int), Example 3.3: Consider an expression let f = XYz.(fst6(x) + z) in let y = fst’(snd’(x)) in M and suppose x does not appear in M. Because the second element of the pair x is accessed just once in fst’(snd’(x)), it can be assigned the type int xw (int x1 int); so, the pair of integers can be deallocated at fst’. Consider another expression let f = X’z.(fst’(snd’(z)) + 1) in apply’(f, x) + apply’(f, x). The second element of x is accessed once (by fst’) each time it is extracted by snd’(z). Since it is totally accessed more than once, the type int xw (int x w int) is assigned to x in our quasi-linear type system. 3.2 Operations on uses, types, and type environments Type judgment A type judgment is of the form r !- M : 7, where I’ (called a type environment) is a mapping from a finite set of variables to types. It expresses how each heap value is accessed during evaluation of M. For example, x : int x6 int I- M : int x1 int implies (l)M may access the heap address x, (2)the result is a linear pair of integers, and (3)~ is used up in M and does not escape through the result. Notation 3.4: We write VI : ~1,. . . , V, : 7n for the type environment l? such that dom(I’) = {VI,. . . , Vn} f~ V and l?(V;) = Ti for each vi E don@), where V denotes the set of variables. For example, (x : int, 2 : id) denotes the type environment that maps x to int. If V $Z dom(l?), we write I, V : T for the type environment I such that (1) I”=dom(r)u({V}nV)and(2) I”(x) =rifx E ({V}nV) and r’(x) = r(x) if 2 E dam(r). 33 We require E # 6 because there is no use creating a J-value: since it can never be deallocated, creating a J-value is the same as creating an w-value. x1+’ id). Similarly, the not by (int x ‘+l int)-+‘+ll’+l(int operation r.1 also acts on sub-components of pair type but not on the arguments and return types of function types. This is because [rl is intended to express the type of an expression whose evaluation result does not contain S-values. Although the type (int x6 int)-+‘>‘(int x1 int) contains 6, it just means that a function of that type uses an argument pair locally, not that the function must so, [(int x6 int)+l*‘(int x1 int)l = be used locally. (id x6 int)+lJ(int Remark 3.6: Actually, annotating allocation of a pair with 6 is meaningful and may be rather useful if the type 7s of M does not contain 6: we can modify the operational semantics so that let x = (VI, V2)’ in M allocates the pair (VI, V2) in the stack and deallocates it after M has been evaluated (recall that a value with use 6 cannot be a part of the evaluation result). x1 iTId). We often omit . and write ~1~2 for ~1 . ~2 and IEPfor K. I?. We give higher precedence to ., +, and ; in this order. Example 3.5: Let ri be (id x0 id). Then, h + 72 = int = r1; 3-2 rr11 = w * 71 int x’ (id = xy x6 int) and 72 be The rule for the allocation of a closure is: int x1 r2,f: 71+*l+*rT22] k (id x’ int) l~~[l?ll int x1 (id x6 int) id xw (int xw hat). 2 : (71 + 72), y :7-l = 2: int xw (int x6 int),y: Typing int x6 (int x6 id). rules Variables, constants, and weakening The rule for variables and constants is: v:rl-v:7. (T-VAL) If an expression is well typed under some type environment, then it should also be well typed under a type environment that represents more access capabilities. For example, if z : int x1 int t- M : int, then 2 : int xw int k M : int, since the former judgment requires z to be used as a linear pair, while the latter does not impose such a requirement. The following rule allows such weakening of an assumption: r’kM:r r’ 5 r t-let The relation I” 5 P, which means “P represents more access capabilities than I” (in other words, I’ allows more liberal access to the heap),” is defined in Figure 4. 3.4.2 Allocation M2 :5 X”‘x.M1 Kl #S in MZ : 73 (T-Ass) f = recrK3(1+K1)1 (f, x, Ml) in M2 : 73 ( T-REC) The total number of calls of the function f is calculated by using the number ~2 of calls in MI and the number ~4 of calls in Ms. Since each invocation of f may result in other ~2 invocations of f, the total calls are counted as ~(1 + ~2 + K: + . . .) = ~4(1+ ~2). The use of function f itself (in the relaxed sense) is similarly calculated by using its uses ~1 and ~3 in Ml and Ms. The ceiling function r-1 applied to ~r(l+ ~3) is just to make sure that the use is not 6. (T-WEAK) rkM:r let f = [7-21 in the premise Pi, x : 71 i- Ml : [-r21 guarantees that when the closure Xx.M is called, it does not return a 6value as a result. PI, x : 71 I- Ml : [7-21also means that during each call of the closure, heap values are accessed as described by Pi. Because the other premise means that the closure will be called ~2 times, the total access to heap values by MI can be represented by ~2 copies of Pi, which is written as Icsl?i. Since the access happens only later (not when Xx.Ml is evaluated and the closure is created), r-1 is applied to it, so that it does not contain &values. The following rule for recursive functions is similar to (T-Ass) except for the estimation of the use of the created function: = 3.4.1 IT2 I- int x1 (int x1 int) (z:71,y:71)+(z:72) 3.4 + of heap values 3.4.3 Let us consider an expression let z = (y, z)~ in M. In M, y and .z can be accessed in two ways: either directly by using addresses y and z, or through the pair z (by extracting y and 2). If P, z : ~1 x s 72 I- M : 7, then the access in the former way is represented by the types I’(y) and I’(z), while the access in the latter way is represented by the types ri and ~2. Therefore, information on the total access is obtained by adding those types. For example, if I’(y) = int x1 int and 71 = int x ’ int, then the total access to y can be represented by P(Y) + 71 = int xy int. Thus, the rule for the allocation of a pair is: r,z:rlx~T22M:r3 Let-expressions The following rule for let-expressions best illustrates how the evaluation order is taken into account in our type system: rl I-MI : [Tll r2, z : [T11 k l?l;~21-letx=MIinM2:~ M2 : T (T_LET) Here, the premise means that heap values are accessed by Ml as described by I’i and by Ms as described by I’s, x : [q]. In ordinary linear type systems, the total use of heap values in let x = Ml in M2 was computed by adding Pi and I’s. However, since Ms is evaluated only after MI has been fully evaluated, we estimate the total access of heap values by sequential composition Ii; 12. We require the type of MI K#d (T-PAIR) 34 our type system is to capture information on the order of heap access (recall (T-LET), for example), whereas the flat representation of a heap loses that information. Therefore, we define an alternative operational semantics that represents a heap by using nested letrec-expressions, and show the soundness of the type system with respect to that semantics. Although it includes rather strange reduction rules, it is easy to see that the alternative semantics is essentially equivalent to the original semantics. Before presenting the alternative semantics, we explain why the reduction relation ?r) fails to preserve typing. Let us consider the following configuration: not to contain 6 at top-level (by applying 1-1 to ri), in order to make sure that every &value represented by Ii is really “used up” during the evaluation of Ml and does not escape to the rest of computation. 3.4.4 Access to heap values The following is the rule for closure invocation: r, z : ~1 I- M : 7-2 K>O (f : T~-+~*~TI+ V : 5); I? I- let x = appZy”(f, V) in M : ~2 (T-APP) ({w e1 The access to heap values by apply” (f, V) is estimated as f : 7y-b+ n+ V : 7-3. Note that the first use K.of the type of f should not be zero, and that the second use is 1, since the function is used once here. The access of heap values except for z by M is estimated as r. Since apply” (f, V) is evaluated before M is evaluated, the total access can be estimated as their sequential composition (f : T~+~*‘TI+ V : 7-s); r. The rule for reading the first element of a pair is: rly : 7:,x : Tl d 72 k M : 73 x : (id v = MI in M2). x1 id) x1 int k let v = MI in M2 : int, since ((id x6 int) x’ int);((int X1 int) X1 int) = So it is fine that w is a linear (id xl int) x1 int. pair. However, if we reduce the configuration by (R-FsT), the resulting configuration: tc>o (T-FST) ({w e1 (1,2), 3: e1 (w, 3)], let u = (let z = f&(w) in z) in M2) The premise implies that the pair is used as a &‘-value in M after being used as a n-value in fst”(x). So, the total use of x is estimated as K; K’. The first element may be accessed through either y or CC,hence the total use of the first element is estimated as ~1 + 71. Similarly, the rule for extraction of the second element of the pair is: requires w to be an w-pair: w is accessed as a linear pair in M (through x) and as a S-pair in fst!(w), and the type system cannot capture the order between the two uses (because the access is made through different variables z and w). One way to avoid the above problem would be to extend the type system so that the order of access to different variables can be expressed, as in [ll]. Since the resulting typing rules would become rather complex, we instead redefme the operational semantics. The idea is, before losing information on the order of heap access, to split the heap space in advance by taking the order into account. For example, the above configuration is first converted to something like: r,y:~~,x:~~x~‘72~M:73 Ic>o I?,x : 71 xILia’ (72 + T;) t- let y = snd”(x) in M : 73 (T-SND) Rules for conditionals let v = ({w H’ (1,2),x ++’ (w, 3)), MI) in ({w ++’ (1,2),x e1 (w, 3)}, M2). In the following rule for conditional expressions, we require that the then-part and the else-part has the same typing. rt-Ml:Tl r 1 M2 : 71 V:int+I’kifDVthenMielseMa:ri 4 ++l (w,3)},let where Ml = let y = fst’(x) in let z = f&(y) in z. Then, z : (id x6 int) x’ int k MI : int. So, if x : (id x1 int) x1 int k M2 : id, then by (T-LET), r, 2 : CT1+ Q X-K’ 72 f- let y = f&*(x) in M : 73 3.4.5 (1,2),x Type Since the heap space has been split for the definition part and for the body part of the let-expression, we can safely throw away information that the definition part is evaluated before the body part. So, it can be reduced to: (T-IF) Soundness let v = ({w +k’ (1,2),x I-+’ (w,3)},let in ({w +-+l (1,2),z ++’ (w,3)}, Mz). This section sketches a proof of the type soundness property that no invalid heap access occurs during evaluation of welltyped expressions. The full proof is given in the technical report [12]. The type soundness is formally stated as follows: Theorem 4.1: z = fst’(w) in z) In order to express the above kind of nested heaps and expressions, we introduce a new class of expressions called dynamic expressions and define a reduction relation for dynamic expressions. Note that this new operational semantics is introduced just for proving Theorem 4.1. The actual implementation can be based on the semantics defined in Section 2.3 and no cost for splitting heap needs to be paid. If 0 l- M : 7, then (0, M) +* Error. We want to show this property as usual [8,22], by defining a typing system for a run-time state (H, M) and proving that the reduction relation preserves typing and that no well-typed state (H, M) causes an error immediately. Unfortunately, however, the reduction relation -Q defined in Section 2.3 is not suitable for this purpose: a key feature of 4.1 Dynamic expressions The syntax of expressions given in Section 2 is extended to that of dynamic expressions as follows: 35 L I 6 1 W 0 II 0 I, un def undef undef 0 undef undef undef W W -, ,, , 6 b 6 1 1 1 W w w Definition of ~1 * ~2 IE1 Tc2 0 6 1 0 0 0 0 6 0 1II0l~1 W 6 Definition of ~1; ~2 IErIE 0 6 1 Definition of ~1 + ~2 Definition of IEI - KZ I Idl\K.2 II 0 I 6 w 0 llw Definition of 1~1 /c ~~0~6~1~w] Definition of [/cl R /016111w] W 1r111,,,.,., 1 I/cl 11 u w 1111 1 w I I n JllOlOlllwl llolwlwlw Figure 3: Operations on uses Binary operations (op = +, ;) (n-t intopint = int (71 x6 T2)OP(4 xK’ T;) = =l+272)op(T~-+ 44,) = ’ (7-20p~i) if rropri ~~~~~dx *ziLerwise NloPfi:,fi20P4h Tl-+ (r10pr2)(x) and r20pri are well defined. if x E dOm(rI) n d0m(r2) lf x E dom(rl)\do~(r2) if x E d0m(r2)\d077i(rl) = Umry operations (op = r-1, fc . _) op(int) = = int op(71) X”P(R)op(72) T1+~P(41M2)~2 = Opmx)) op(7-1xXT2) = op(71+=1'~2T2) (op(r))(x) Figure 4: Definitions of a relation and operations on types and type environments let x = h” in A4 -% letrec z = h” in [z/x]M (z fresh) ==C~zlz,” let u = fst”(x) in M wI4~ D q’ D, letrec x = h” in D & D .$ D, K>K’>O letrec x = h”-“’ in D’ (/c’ > K) v (d = 0) letrec x = h” in D -% Error letrec x = hRlzs2 in let y = D1 in D2 & (DR.-HEAP) (DR-FST) (DR-READ) (DR-ERROR) let y = letrec x = hnl in D1 in letrec z = hsa in D2 @R-SPLIT) _ let x = (letrec Z = Z+ in letrec 23 = hfiS3in V) in (letrec r’= Z” in M) (DR-VAR) A letrec z’ = ilLGJilr’a in letrec 2u= I?‘@ in [V/z]M Figure 5: Main reduction rules for dynamic expressions 36 Definition 4.2 [dynamic expressions]: The set D of dynamic expressions, ranged over by D, is given by the following syntax: D ::= M 1 let z = D1 in Dz 1letrec x = h” in D Here, we introduced a letrec-expression letrec x = h” in D to express a heap binding, and extended the syntax of letexpressions so that heap bindings and let-bindings can be nested. By using the new syntax, the nested heap and expression let u = ({z ti’ (w,3)}, I’,x:~~x~~~i-D:r~ 2 e w, v,l I?+ VI : 71 + V2 : 72 k letrec 2 = (VI, Vs)” in D : 73 (DT-PAIR) ~~(1 + ICZ.)[~~~+ r2 t-letrec f = X”x.Ml in D : 73 & letrec x = (1, 2)6” in let z = snd’(x) in z & let z = (letrec x = (1,2)l in snd’(x)) (letrec 2 = (1,2)’ in z) -% let z = (letrec 2 = (1,2)’ in 2) in (letrec x = (1,2)' in z) A letrec x = (1,2)’ in 2 in then D f, Operational semantics of dynamic expressions semantics by using a reduction a IfI’l-D:randDiD’,thenI’I-D’:r;and l letrec x = (1,2)l in let y = fst’(x) in let z = snd’(x) in z If I’ I- D : r and D o\ that D % D’, then there exists D” such D” and I’ k D” : r. We write -% for the relation % a._ a&. We also write D t if D is reduced to Error no matter how the heap bindings in D are split, i.e., if D & Error and there is no D’ E 2) such that D 4 D’. The correspondence between the new semantics and the original semantics can be stated as follows: Lemma 4.6: There is a relation 7Z(s (‘H x E) x D) that satisfies the following conditions: l (0, M)‘RM; l if (H, M) -A (IS’, M’) D 4 l The expression in Example 2.3 is reduced in z Error]: If l? F D : 7, Error. Lemma 4.5 [Subject Reduction]: relation D & E. D is a dynamic expression and E is either a dynamic expression or a special constant Error representing invalid heap access. The label I shows which heap value is accessed in the reduction step. It is either e, which indicates that an internal heap value is accessed, or x = h”, which indicates that the heap address x is accessed as a value with use n, or x, which indicates that the heap of address x is split as explained above. We show only the key rules in Figure 5. The rule (DR-HEAP) corresponds to the rule (R-HEAP) of the original semantics. The rules (DR-FsT), (DR-READ) and (DR-ERROR) correspond to the rules (R-FST) and (F-FSTERROR). The rule (DR-SPLIT) is the most important rule: it allows the current heap to be split into that for the definition part of a let-expression and that for the body The rule (DR-VAR) corresponds to the rule (Rpart. VAR), but it allows split heap bindings to be merged after the definition part of a let-expression has been fully evaluated. In the rule, letrec z’= 2 in D is a shorthand for letrec zi = h;’ in . . . letrec z,, = hz” in D. let 2 = (1,2)l in let y = fst6(x) in let z = snd’(x) Proof sketch of Theorem 4.1 Lemma 4.4 [Lack of Immediate (DT-ABS) A let y = (letrec x = (1,2)6 in 1) in (letrec x = (1,2)l in let z = snd’(x) in z) Theorem 4.1 follows immediately from the soundness of the type system with respect to the new operational semantics and the fact that if an expression is reduced to Error in the original semantics then it is also the case in the new semantics. The type soundness with respect to the new operational semantics is stated as the two lemmas given below. The subject reduction property (Lemma 4.5) is a little more complicated than the usual one because even a well-typed dynamic expression may be reduced to an ill-typed expression if a heap value is split in an inappropriate way. The second statement of the lemma says that, if heap splitting (the rule (DR-SPLIT)) can be applied to a well-typed expression, there is always a good splitting that preserves the well-typeness of the expression. The typing rules for expressions can be easily extended for dynamic expressions. The new rules are as follows: Example 4.3: as follows: A 4.3 let v = (letrec x = (w, 3)’ in Ml) in letrec x = (u, 3)l in MZ We define an operational let y = (letrec 2 = (1,2)’ in fst6(x)) in (letrec x = (1,2)' in let z = snd’(x) in z) Ml) in ({x til (w,3)}, Mz) can be expressed by the following dynamic expression: 4.2 -% 5 and (H, M)RD, D’ and (H’, M’)RD’ then either for some D’ or D t; and if (H, M) -A Error and (H, M)RD, then D t. Type Reconstruction Use annotations can be automatically inferred from unannotated expressions. Since it is done basically in the same way as that for previous linear type systems [9,22], we explain it only informally. The basic idea is to introduce variables ranging over uses and types, and constraints on them, so 31 type expression (such as p + 7) constructed only from type variables. This simplification is performed by partial unification of types, with some uses being kept different. For example, given a constraint (Y 2 (p Xj real’) + y, we can that we can express the most general typing (principal typing) of an expression. A new type judgment is of the form I’, C t- A4 : r, where a set of constraints C specifies which set of uses/types each use/type variable (in P, M, or 7) can range over. For instance, let z = (y, y) in z is typed as: y:a,{a>~+~,i#6,i>j}I-letx=(y,y)‘inz:/3xj~. This typing is principal% the sense that all the typings derivable from the rules in Section 3 can be obtained by instantiating variables (Y,p, 7, i and j so that the constraint is satisfied. (We do not give the formal definition of principal typings here; it is basically the same as that in [22].) Given an unannotated expression M, the inference of a use annotation proceeds as follows (actually, the second and third steps may overlap): instantiate a and y with ,B’ xj’ real” and p” xj” real”’ for fresh variables ,B’, /3”, i’, i”, j’, j” and reduce the constraint to{p’~~+~“,i’~i+i”,j’~j+j”}. Next, since the set of trivial constraints on type variables always has a solution (let all the remaining type variables be instantiated with, say, id), we can obtain the least assignment to use variables by solving the set of constraints on use variables. We can use the simple iterative method used in previous linear type systems [8,9]. Of course, the least assignment to some use variables cannot be completely determined until the whole program is given. For example, given an expression of type real’, we cannot fmd the least assignment to the use variable i until we know all the places where the evaluation result of the expression is used. For these unknown variables, we can either simply assign w or delay constraint solving. One important difference from the previous linear type inference is that the least solution is not necessarily the best solution from the viewpoint of memory management. For example, consider an expression let y = (1,2) in (let x = fst(y) in x). The least annotation is let y = (1,2)’ in (let x = fsts(y) in x). In this expression, y cannot be deallocated since the only access to the pair y (fst(y)) is annotated with 6. Therefore, it is better to annotate the expression as let y = (I,2)l in (let z = fd(y) in x). We can think of two ways for dealing with this. One way is to classify uses into those for annotating heap allocation and those for annotating heap access. After the least solution is obtained, we can maximize heap access annotations as long as no heap allocation amrotations increase. The other way is to extend a target language with an explicit deallocation operation free(P): It deallocates heap values as if the heap were accessed as described by I’. For example, fiee(z : reall,y : rea!6) deallocates x but not y. The rule (T-WEAK) can be changed as follows (l? - I” is the least type environment I” such that I”; I”’ = I’): 1. Attach a fresh use variable to each place where use annotation is required (let the annotated expression be M’). 2. Compute a principal typing P, C t- M’ : r. 3. Solve the constraint C and apply the obtained substitution to M’. In the following subsections, we explain the second and third steps in a little more detail, and then discuss the cost of the analysis. 5.1 Computing a principal typing An algorithm for computing a principal typing is obtained by constructing syntax-directed typing rules for the new type judgment form and by reading them from bottom to up. For example, we can merge the rules (T-PAIR) and (T-WEAK) and obtain the following rule for the new type judgment form: r,x: rl xKn,C k M: 7-3 c + r’ 1 (r + K : r1 + v, : rz) C!=K#J P’,C E let x = (Vl,V2jR in M : 5 (TR-PAIR) Here, C + Pi 1 Pz means that, for any substitution 0 on type/use variables, if K’ is satisfied then BI’i 1 8’1‘s is also satisfied. The above rule says that in order to compute a typing for let x = (VI,VZ)~ in M, we should first compute a typing for M and then add new constraints entailing I” > P + Vl : 71 + Vz : Q and IE# 6. One major difference from the previous linear type inference [8,22] is that constraints on type variables appear in C (recall the typing for let z = (y, y) in x given above). They are of the form r1 2 7s or 71 - 7s (meaning 71 and 72 are compatible in the sense that 7-1+ 72 and 71; 72 are well defined). Here, the righthand type expression 72 of rl > 7-z may contain +, r-1, etc. as constructors of type expressions (rather than as functions on types). 5.2 Solving constraints r't-M:r r'5 r r I-M;free(r - r’) : 7 This change avoids forgetting to deallocate linear values. We can optimize the result by moving the operation free leftward and merging it with heap access annotation as far as possible. Although there is no guarantee that these methods give the best use annotation, we expect that both give enough good an annotation in practice. 5.3 Cost of the analysis Unfortunately, the computational cost for type reconstruction can be exponential in the size of an input in the worst case. The reason is that the number of use variables can be linear in the size of the type of a variable appearing in an input expression and that the size of a type can be exponential in the size of an input expression. For example, if a variable x is required to have type (int x int) x (id x int) by the context, its typing is: on types and uses An algorithm for solving constraints can be divided into two steps. First, a set of constraints on type/use variables can be simplified into a pair of a set of constraints on use variables of the form i 2 K’ and a set of “trivial” constraints on type variables of the form a - /3 or cr 2 r where r is a 2 ‘Note (T-WEAK’) that a constraint i # 6 can be expressed as i 2 [il. : (int x jl id) xi3 (id x i* k 2 : (int 38 xi; int) id), xi; {il (int 2 i;, xi: i2 > id). i;, is > ii) So, the number of use variables is linear in the size 3 of the type (int x int) x (int x int). Next, consider an expression let 20 = 1 in let 21 = (20,ze) in let 22 = (zi,zi) in . ..let 2, = (2,_1,z,_i) in zn. The size of the type of the expression is exponential in n. In spite of the above fact, we expect that the reconstruction can be performed efficiently for realistic programs for the following reasons. First, the above problem occurs only when tuple or record types are extraordinarily nested; a huge flat tuple type expression int x int x ... x int or an arrow type expression ri+ 72 do not cause such problems. Second, if the computational cost is really problematic for some program, we can save the cost by sacrificing the accuracy of the analysis. For example, we can restrict the + operator so that (71 xK 72) + (ri x”’ ~4) is defined only if ri = ri and r2 = r;. This forces many type expressions to be shared and saves the time and space of the analysis (in fact, the previous analysis in [8,9], which imposes a similar restriction, is performed in polynomial time for the monomorphic type system). 6 6.2 So far, we have considered only integers as constants. In order to deal with large constants (i.e., constants that cannot be represented in one word and must be allocated in a heap), we need to annotate each allocation of a large constant. For example, allocation of a linear real number 2.0 is annotated as let 2 = 2.0’ in . . .. The type of a large constant is also annotated with a use. For instance, the type of linear real numbers is represented by real’. The typing rule for real numbers is given as follows: P, z : real” k M : T l? k let x = rK in M : r 6.3 Primitive functions Primitive functions can be treated as constants of polymorphic types, and each occurrence of a primitive function can be annotated with uses. For example, the primitive + on real numbers can be assigned a type Extensions We have so far considered a simply-typed X-calculus with recursion and pairs. It can be extended by introducing polymorphism on uses and types, and by adding other kinds of data such as recursive data structures and reference cells. For lack of space, we only briefly discuss how to introduce polymorphism, constants such as real numbers, and recursive data structures. 6.1 Large constants + z in . . . can be annotated like: The expression let z = ? let z=+[l,l,l,l](y,z) in .... 6.4 Because the type system presented in Section 3 is monomorphic in both uses and types, the analysis of (quasi-)linear values is often rough. Consider the following expression: in :: : nil : hd : If either z or w is used as an w-value in M, the other is also forced to be an w-value, because the code for the pair creation (i.e., the body of the function f) is shared. In order to avoid the above problem, we can introduce polymorphism on uses. Then, f can be given a type Vi # G.(int+“+‘(int xi id)) and the above expression can be annotated, for example, as ... 7 rl, 4 A C2 k Ml : 1~11 7.1 Mz : 72 Ci,a do not affect the values of variables in Pi, M c2.[Tll,& Va, i.o list’ Vo,i :: {i > S}.(a list’) list’+“+a) Preliminary Experiments We have implemented a prototype type reconstruction system for the core-ML (i.e., SML [15] without modules) based on our quasi-linear type system and a simple profiler that executes the output program and counts the number of allocated heap values. We first describe the current status of the system in Section 7.1 and then report the results of simple experiments in Section 7.2. The prototype system willbe availablefrom http://wuv.yl.is.s.u-tokyo.ac. jp /'koba/research/qlinear/. Here, uses are explicitly passed as parameters to polymorphic functions. In general, we need to introduce a constrained type scheme of the form Voi,. . . ,a,, il,. . . ,ik :: C.-r, where C is a set of constraints on type and use variables such as (~1 2 (~2 + (~3 and ii 1 iz + 1. A new rule for polymorphic let-expressions roughly looks like: : trd,;:: VCY, i.((a xi a liSti)+u*wa In general, we can generate the types of constructors and destructors from datatype declarations of Standard ML [15]. let f = Ai.Xz.(z, z)~ in let L = appZy(f[l], 2) in let w = appZy(f[w],3) in M. l-2,2 data structures Recursive data structures such as lists and trees can be treated in the same way as pairs, by annotating their type constructors with uses. The type of a list of values of type r is expressed as r list”, where K expresses the use of each cons cell of the list. For example, real’ list” is the type of a list whose cons cells may be accessed many times in an arbitrary manner, but whose elements are real numbers that can be accessed only linearly. Primitives for lists can be given the following types: Polymorphism let f = Az.(z,z) in let z = upply(f,2) let w = apply (f, 3) in M. Recursive t- A prototype type reconstruction system Following the extensions discussed in Section 6, we have implemented a prototype analyzer. It takes an expression of the core-ML as an input, performs type (and use) reconstruction and produces a use-annotated expression. It has rl;rz,C1AC3~letz=M~inM2:r2 (T-LETPOLY) 39 been implemented by using the ML kit version 1 [4] as a front end, and supports full features of the core-ML: records, datatype declarations, references, and exceptions. Polymorphism on types and uses is based on the let-polymorphism discussed in Section 6. Polymorphic recursion on uses would be sound (just as polymorphic recursion on regions is sound in the region inference [20]), but it is not supported currently. That is because the algorithm would become complex and also because the polymorphic recursion does not seem so crucial for our analysis. The current system has the following limitations. It produces the least use annotation, not the best one (see Section 5); The analysis of the use of a reference cell is rough; Unnecessary uses are passed as parameters to usepolymorphic functions (they can be removed in the same manner as unnecessary region parameters are removed in a post-path of the region inference [5]); The current system is very slow for several known reasons (it is just because we preferred rapid prototyping and the system will be optimized in the future); A compiler that utilizes the analyzed information for in-place update, etc. has not been implemented yet. 7.2 Results of preliminary boyer programs, the percentage of linear values is relatively smaller. This is probably because unification of first-order terms performed in the program causes sharing of many heap values. One should, however, note that the percentages shown in the rightmost column do not necessarily indicate how much memory space can be saved by our analysis: compared with memory management using conventional garbage collection, extra memory space is required for representing use-polymorphic functions and its instances, and it is not yet clear how soon linear values can indeed be deallocated. Also, the current analysis is performed on unoptimized programs; other optimizations such as inlining and hoisting may reduce the ratio of linear values. Therefore, we know for sure the real impact of our analysis only after more serious experiments are carried out (for that purpose, some of the future work described in Section 9 must be completed). 8 Work Previous linear type systems To our knowledge, among previous linear type systems [3,9,22] that can automatically infer the usage of values, only Barendsen and Smetsers’s uniqueness typing [3] takes an evaluation order into account. The effect of taking an evaluation order into account sometimes depends on a programming style, but it is evident at least for the following functions: experiments Table 1 shows the result of experiments. The columns ‘zero,’ ‘linear,’ and ‘omega’ respectively show how many heap values of use 0, 1, and w are allocated dynamically in each program. The rightmost column shows what percentage of heap values are zero or linear. For a use-polymorphic function like f : Vi.(realZ+‘“+ real’), the allocation was counted only once; in other words, creations of instances like f[l] and f[w] are not counted as the heap allocation (this is because the current analyzer infers not the use of each instance of a use-polymorphic function but the total use of all the instances). “sumlist10000” is a program that makes a list of integers from 0 to 10,000 and computes the sum of them. “qsort20” and “msort20” sort a list of 20 real numbers by using the quick sort and merge sort algorithms. They were written in a naive way; the main part of the quick sort program is: fun quick (Cl> = [I I quick (x::l) = let val (11,12)=divide(x,l) in append(quick(ll), x::quick(l2)) Related fun map f Cl = Cl = f x::map f 1 I map f (x::l) ( Cl, S2) = Cl fun diff S2) = I diff (x::Sl, if member(x, S2) then diff(S1, 52) else x::diff(Sl,SZ) fun move (p, dx) = updatecp, X, p.X+dx) The function f in map and the list S2 in diff (which is a function computing the set difference) are used many times, but are judged to be linear in the quasi-linear type system. The function move takes a record representing a point object as the first argument and adds dx to its X field. The point p is accessed twice, but it is also judged to be linear. The uniqueness typing [3] is different from ours in many ways: the evaluation order is call-by-need,3 their type system does not have “b-type” of values that must be used up locally, and instead it relies on a separate analysis for analyzing the order of memory access (so, the analysis is not integrated as a use-type system, and it is rather complex). Guzman and Hudak [7] and Odersky [18] also proposed a kind of an extension of a linear type system, which can take an evaluation order into account and check that destructive operations on arrays or lists are safely used. Unlike ours, their approach is to let programmers explicitly declare where destructive operations should be performed (by using special primitives for destructive update of data structures) and check whether they are safe by performing type inference. end “sieve2000” finds prime numbers less than 2,000 by using the sieve of Eratosthenes. “life” computes the Iirst 10 generations of lives generated from the initial 44 lives. “dangle” and “reynolds3” are programs taken from [5] (with some parameters being changed). “reynolds3u” is a slightly modified version of “reynolds3, ” in which one function is uncurried so that our analysis works better. In the cases of sumlist, merge/quick sorts and the sieve of Eratosthenes, almost all of the heap values are linear: nonlinear values were only top-level functions. Moreover, by looking at the produced use-annotated expression, we can find that the cons cells generated in merge/quick sorts and the sieve of Eratosthenes can all be updated in-place. In the case of the life (some curried functions in the original program were uncurried: see discussions in Section 9), dangle, and reynolds3, not all heap values are linear, but the number of linear values is still huge enough to greatly reduce the need for garbage collection. In the Knuth-Bendix and Region inference There is an alternative approach to static memory management: region inference [5,20]. The region inference abstracts a bunch of memory addresses as a region, and estimates its life-time by performing a kind of type inference. Our type system and the region inference have both 3Recently, they dealt also with strict let expressions, ysis for them seems to be more naive than our analysis 40 but the anal[19]. I sumlist qsort20 msort20 sieve2000 life Knuth-Bendix boyer mandelbrot danale reynolds3 reynolds3u zero I linear II omena I (zero+linear)/total v 0 2 40,0011 [ 0 448$ I 4 0 496 4 0 256,455 8 0 2.353.116 26.600 0 10;339;410 2,949;292 0 1,163,786 342,642 0 5,672,170 920,944 20.000 20.032 34 0 1 211515 ] 2,059 0 I 21,514 1 13 I\ 1 I ] ] I, I I 100.0% 99.1% 99.2% 100.0% 98.9% 77.8% 77.3% 86.0% 99.9% 91.3% 99.9% Table 1: The number of allocated heap values ture. First, static memory management is especially attractive in parallel/distributed environments, since it requires much less communications/synchronizations than conventional garbage collection. Second, with the increasing importance of cache memory, saving the memory space for a program is also likely to save its execution time. The type system proposed in this paper is for call-byvalue languages; it is not yet clear whether a similar type system can be developed for lazy functional languages. Much work is left to be done to apply our type system to a compiler of ML and know its real impact. We need to work at least on the following issues (although they seem to be fairly straightforward except for the last issue): advantages and disadvantages. Because the life-time of data is estimated region-wise, it is difficult to use the region inference for in-place update. Another shortcoming of the region inference is that too many data are often merged into the same region (especially when recursive data structures, higher-order functions, and reference cells are used), and as a result, the analysis of the life-time of data sometimes becomes rough. For example, consider how to represent the type of a list. Because a list may contain an arbitrary number of cons cells, it is impossible to use a distinct region for each cons cell; so, the type of a list must be something like (7 list, T) where T is the region in which all the cons cells are stored. This implies that a cons cell of a list can be deallocated only after all the cons cells of the list become garbage. On the other hand, in our analysis, the type of a list is of the form r list”; if K.is 1, each cons cell can be deallocated separately when it is accessed as a value of use 1. Thus, unlike in the region inference, the life-times of cons cells are not merged. (We do not intend to say that our analysis is always better for lists than the region inference. Indeed, if some cons cell of a list is accessed as an w-value, our analysis assigns type r list” to the list; hence no cons cells of the list can be deallocated automatically. In this case, the region inference would be much better.) On the other hand, the advantages of the region inference are that it is completely independent of how often values are accessed, and that the deallocation of data in the same region can be performed in a constant time. So, it may be interesting to use our type system and the region inference as complementary methods for static memory management. The goal of our analysis is also similar to that of the storage mode analysis [5], which was proposed as a complementary to the region inference. It analyzes whether each access to a value is the last access to the region of the value, and if so, it inserts a code to deallocate all values in the region. Thus, unlike our analysis, with the storage mode analysis, a value can be deallocated only after all the data in the region become garbage. 9 Conclusion and Future Refinement of the type inference system As discussed in Section 5.2, the least use annotation may not be optimal from the viewpoint of memory management; so, we need to modify the analyzer so that it can produce a better use annotation. Transformation for in-place updates We need to implement a compilation path that finds places where deallocation of a heap value is immediately followed by allocation of another value of the same type and replaces it with m-place update. Combination with conventional garbage collection Since the linear-type-based memory management can automatically deallocate only quasi-linear values, we need a conventional garbage collector as well. There is, however, a subtle problem: since the linear-type-based memory management is not based on the pointer traversal information, it may create a dangling pointer (just as the region-based memory managements does [20]). For example, when the closure (Xz.(fst(y)+z), {y = (1.0,2.0)}) is traced at garbage collection time, the second element 2.0 of y may have already been deallocated. There are two ways for dealing with this problem. One way is to exclude the use 0 from the type system so that dangling pointers cannot be created. The other way, which we are currently exploring (with Atsushi Igarashi), is to use the idea of tag-free garbage collection [16,21]: we can utilize use-annotated types for tracing heap values at garbage collection time, instead of conventional types. For example, from the type real’ xy real’ of y in the above closure, it is known that the second element of y will not be accessed by the closure and hence it need not be traced by a garbage collector. Work We proposed an extended linear type system that can elegantly take an evaluation order into account, and proved its correctness. Static memory management like the one proposed here and the region inference is currently less popular than conventional garbage collection. However, we think it will become very important in the near fu- 41 The following improvement of the accuracy of the analysis is also future work. Taking count the order of access to different variables into PI A. Igarashi. Type-based analysis of usage of values for concurrent programming languages. Master’s thesis, Department of Information Science, University of Tokyo, 1997. PI A. Igarashi and N. Kobayashi. Type-based analysis of usage of communication channels for concurrent programming languages. In Proceedings of International Static Analysis Symposium (SAS’g7), Springer LNCS 1302, pages 187-201, 1997. ac- Because the type system cannot deal with the order of access to different variables, the analysis can be rough when variables are aliased. For example, consider an expression let y = z in fst’(z) + fst’(y). Since z and y refer to the same heap value, we could regard z as a linear value; however, the current type system judges z to be an w-value. We can overcome this problem by representing a type environment as a poset of type bindings in order to express the order of access to different variables, as in an earlier version [lo] of Kobayashi’s type system for deadlockfreedom [ill. PO1N. Kobayashi. A Partially Deadlock-free Typed Process Calculus (I) - A Simple System -. Technical Report 96-02, Department of Information Science, University of Tokyo, September 1996. Pll Combining our analysis with other analyses Our analysis of &values is weak in dealing with curried functions. Consider an expression (Xr.(fst(z) + 1.0))(2.0,3.0) and its curried version ((Xz.Xy.(z + 1.0))2.0)3.0. For the former expression, 6 can be assigned to 2.0, while 1 must be assigned to it for the latter one. This problem may not be so serious because the ordinary compiler optimization uncurries functions as far as possible, but it may be interesting to improve the analysis by using more accurate dataflow information like the one obtained by the region inference. TechniKobayashi. Quasi-linear types. cal Report 98-02, Department of Information SciUniversity of Tokyo, 1998. Available ence, through http://ava.yl.is.s.u-tokyo.ac.jp/’koba /publications.html. P21 N. 1131 N. Kobayashi, B. C. Pierce, and D. N. Turner. Linearity and the pi-calculus. In Proceedings of ACM SIGPLAN/SIGACT Symposium on Principles of Progmmming Languages, pages 358-371, January 1996. Acknowledgment We would like to thank Atsushi Igarashi, Martin Odersky, Kenjiro Taura, Mads Tofte, and the anonymous referees for useful discussions and comments. [I41 I. Mackie. Lilac : A functional programming language based on linear logic. Journal of Functional Programming, 4(4):1-39, October 1994. 1151 R. Milner, M. Tofte, R. Harper, and D. MacQueen. The References Definition of Standard ML (Revised). The MIT Press, 1997. PI H. G. Baker. Lively linear lisp - look ma, no garbage! ACM Sigplan Notices, 27(8):89-98, 1992. PI H. G. Baker. A linear logic quicksort. Notices, 29(2):13-18, 1994. [I61 G. Morrisett. Compiling with Types. PhD thesis, School ACM Sigplan of Computer Science, Carnegie Mellon University, 1995. P71 G. Morrisett, M. Felleisen, and R. Harper. Abstract models of memory management. In Proceeding8 of Functional Programming Languages and Computer Architecture, pages 66-76, 1995. Conventional and [31 E. Barendsen and S. Smetsers. uniqueness typing in graph rewrite systems. Technical Report CSI-R9328, Computer Science Institute, University of Nijmegen, 1993. An extended abstract appeared in Proc. of FST&TCS 12, Springer LNCS 761, pp.41-51. PI [I81M. Odersky. Observers for linear types. In Proceedings of 4th European Symposium on Programming (ESOP’g2), Springer LNCS 582, pages 390-407, 1992. L. Birkedal, N. Rothwell, M. Tofte, and D. N. Turner. The ML Kit (Version 1). Technical Report 93/14, Department of Computer Science, University of Copenhagen, 1993. P91 R. Plasmeijer and M. van Eekelen. Concurrent Clean ver.l.3 language report, 1997. PO1M. Tofte and J.-P. Talpin. Implementing the call-byvalue lambda-calculus using a stack of regions. In Proceedings of ACM SIGPLAN/SIGACT Symposium on Principles of Programming Languages, pages 188-201, 1994. FJIL. Birkedal, M. Tofte, and M. Vejlstrup. From region inference to von neumann machines via region representation inference. In Proceedings of ACM SIGPLAN/SIGACT Symposium on Principles of Programming Languages, pages 171-183, 1996. 161J.-Y. Girard. Linear logic. ence, 50:1-102, 1987. N. Kobayashi. A partially deadlock-free typed process calculus. ACM nansuctions on Progmmming Languages and Systems, 20(2):436-482, 1998. A preliminary summary appeared in Proceedings of LICS’97, pages 128-139. WI Theoretical Computer Sci- A. Tolmach. Tag-free garbage collection using explicit type parameters. In Proceedings of ACM Conference on Lisp and Functional Programming, pages l-11, 1994. P21 D. N. Turner, P. Wadler, and C. Mossin. Once upon a type. In Proceedings of Functional Programming Languages and Computer Architecture, San Diego, California, pages l-11, 1995. 171 J. G. GuzmBn and P. Hudak. Single-threaded polymor- phic lambda calculus. In Proceedings of IEEE Symposium on Logic in Computer Science, pages 333-343, 1990. 42
© Copyright 2025 Paperzz