Almost Every Simply Typed λ-Term Has a Long β-Reduction Sequence

Ryoma Sin'ya, Kazuyuki Asada, Naoki Kobayashi, and Takeshi Tsukada
The University of Tokyo

Abstract. It is well known that the length of a β-reduction sequence of a simply typed λ-term of order k can be huge; it is as large as k-fold exponential in the size of the λ-term in the worst case. We consider the following relevant question about quantitative properties, instead of the worst case: how many simply typed λ-terms have very long reduction sequences? We provide a partial answer to this question, by showing that asymptotically almost every simply typed λ-term of order k has a reduction sequence as long as (k−2)-fold exponential in the term size, under the assumption that the arity of functions and the number of variables that may occur in every subterm are bounded above by a constant. The work has been motivated by quantitative analysis of the complexity of higher-order model checking.

1 Introduction

It is well known that the length of a β-reduction sequence of a simply typed λ-term can be extremely long. Beckmann [1] showed that, for any k ≥ 0,

  max{β(t) | t is a simply typed λ-term of order k and size n} = exp_k(Θ(n)),

where β(t) is the maximum length of the β-reduction sequences of the term t, and exp_k(x) is defined by: exp_0(x) ≜ x and exp_{k+1}(x) ≜ 2^{exp_k(x)}. Indeed, the following order-k term [1]:

  (Twice_k)^n Twice_{k−1} · · · Twice_2 (λx.a x x) c,

where Twice_j is the twice function λf^{σ_{j−1}}.λx^{σ_{j−2}}.f(f x) (with σ_j being the order-j type defined by: σ_0 = o and σ_j = σ_{j−1} → σ_{j−1}), has a β-reduction sequence of length exp_k(Ω(n)). Although the worst-case length of the longest β-reduction sequence is well known as above, little is known about the average-case length of the longest β-reduction sequence: how often does one encounter a term having a very long β-reduction sequence?
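To make the growth rate concrete, here is a small Python sketch (ours, not from the paper; the names `exp_k`, `twice` and `succ` are our own) that computes exp_k and illustrates why chains of twice functions explode: each application of twice doubles the number of calls to its argument.

```python
def exp_k(k, x):
    # exp_0(x) = x, exp_{k+1}(x) = 2^(exp_k(x))
    for _ in range(k):
        x = 2 ** x
    return x

# The twice combinator: λf.λx.f (f x)
twice = lambda f: lambda x: f(f(x))
succ = lambda n: n + 1

print(exp_k(3, 1))                   # 2^(2^(2^1)) = 16
print(twice(twice)(succ)(0))         # succ applied 4 times: 4
print(twice(twice)(twice)(succ)(0))  # succ applied 16 times: 16
```

Each extra twice in the chain squares the iteration count again, which is the mechanism behind the exp_k(Ω(n)) lower bound above.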
In other words, suppose we pick a simply-typed λ-term t of order k and size n randomly; then what is the probability that t has a β-reduction sequence longer than a certain bound, like exp_k(cn) (where c is some constant)? One may expect that, although there exists a term (like the one above) whose reduction sequence is as long as exp_k(Ω(n)), such a term is rarely encountered. In the present paper, we provide a partial answer to the above question, by showing that almost every simply typed λ-term of order k has a reduction sequence as long as (k−2)-fold exponential in the term size, under a certain assumption. More precisely, we shall show:

  lim_{n→∞} #({[t]_α ∈ Λ^α_n(k, ι, ξ) | β(t) ≥ exp_{k−2}(n)}) / #(Λ^α_n(k, ι, ξ)) = 1,

where Λ^α_n(k, ι, ξ) is the set of (α-equivalence classes [−]_α of) simply-typed λ-terms such that the term size is n, the order is up to k, the (internal) arity is up to ι ≥ k, and the number of variable names is up to ξ (see the next section for the precise definition). To obtain the above result, we use techniques inspired by the quantitative analysis of untyped λ-terms [2,3,4]. For example, David et al. [2] have shown that almost all untyped λ-terms are strongly normalizing, whereas the opposite holds in the corresponding combinatory logic. A more sophisticated analysis is, however, required in our case, for considering only well-typed terms, and also for reasoning about the length of a reduction sequence instead of a qualitative property like strong normalization. This work is a part of our long-term project on the quantitative analysis of the complexity of higher-order model checking [5,6]. Higher-order model checking asks whether the (possibly infinite) tree generated by a ground-type term of the λY-calculus (or, a higher-order recursion scheme) satisfies a given regular property, and it is known that the problem is k-EXPTIME complete for order-k terms [6].
Despite the huge worst-case complexity, practical model checkers [7,8,9] have been built, which run fast for many typical inputs, and have successfully been applied to automated verification of functional programs [10,11,12,13]. The project aims to provide a theoretical justification for this success, by studying how many inputs actually suffer from the worst-case complexity. Since the problem appears to be hard due to recursion, as an intermediate step towards the goal, we aimed to analyze the variant of the problem considered by Terui [14]: given a term of the simply-typed λ-calculus (without recursion) of type Bool, decide whether it evaluates to true or false (where Booleans are Church-encoded; see [14] for the precise definition). Terui has shown that even for this problem, the complexity is k-EXPTIME complete for order-(2k+2) terms. If, contrary to the result of the present paper, the upper bound of the lengths of β-reduction sequences were small for almost every term, then we could have concluded that the decision problem above is easily solvable for most inputs. The result in the present paper does not necessarily provide a negative answer to the question above, because one need not necessarily apply β-reductions to solve Terui's decision problem. The present work may also shed some light on other problems on typed λ-calculi with exponential or higher worst-case complexity. For example, despite the DEXPTIME-completeness of ML typability [15,16], it is often said that the exponential behavior is rarely seen in practice. That claim is, however, based only on empirical studies. Our technique may be used to provide a theoretical justification (or refutation). The rest of this paper is organized as follows. Section 2 states our main result formally. Section 3 analyzes the asymptotic behavior of the number of typed λ-terms of a given size. Section 4 proves the main result. Section 5 discusses related work, and Section 6 concludes the paper.
2 Main Result

In this section we give the precise statement of our main theorem. We denote the cardinality of a set S by #(S), and the domain and image of a function f by Dom(f) and Im(f), respectively. The set of (simple) types, ranged over by τ and σ, is given by: τ ::= o | σ → τ. Let V be a countably infinite set, which is ranged over by x, x_1, x_2, etc. The set of λ-terms (or terms), ranged over by t, is defined by:

  t ::= x | λx^τ.t | t t        x ::= x | ∗

We call elements of V ∪ {∗} variables; V ∪ {∗} is ranged over by x, x_1, x_2, etc. We call the special variable ∗ an unused variable. We sometimes omit type annotations and just write λx.t for λx^τ.t. Terms of our syntax can be translated to usual λ-terms by regarding elements in V ∪ {∗} as usual variables. We define the notions of free variables, closed terms, and α-equivalence ∼_α through this identification. The α-equivalence class of a term t is written as [t]_α. In this paper, we do not consider a term as an α-equivalence class, and we always use [−]_α explicitly. For a term t, we write FV(t) for the set of all the free variables of t. For a term t, we define the set V(t) of variables (except ∗) in t by:

  V(x) ≜ {x}   V(λx^τ.t) ≜ {x} ∪ V(t)   V(λ∗^τ.t) ≜ V(t)   V(t_1 t_2) ≜ V(t_1) ∪ V(t_2).

Note that neither V(t) nor even #(V(t)) is preserved by α-equivalence. For example, t = λx_1.(λx_2.x_2)(λx_3.x_1) and t′ = λx_1.(λx_1.x_1)(λ∗.x_1) are α-equivalent, but #(V(t)) = 3 and #(V(t′)) = 1. A type environment Γ is a finite set of type bindings of the form x : τ such that if (x : τ), (x : τ′) ∈ Γ then τ = τ′; sometimes we regard an environment also as a function. Note that (∗ : τ) cannot belong to a type environment; we do not need any type assumption for ∗ since it does not occur in terms.
We give the typing rules as follows:

  ─────────────
  x : τ ⊢ x : τ

  Γ′ ⊢ t : τ
  ──────────────────   (Γ′ = Γ or Γ′ = Γ ∪ {x : σ}, x ∉ Dom(Γ))
  Γ ⊢ λx^σ.t : σ→τ

  Γ_1 ⊢ t_1 : σ→τ    Γ_2 ⊢ t_2 : σ
  ─────────────────────────────────
  Γ_1 ∪ Γ_2 ⊢ t_1 t_2 : τ

The above typing rules are equivalent to the usual ones for closed terms, and if Γ ⊢ t : τ is derivable, then the derivation is unique. Moreover, if Γ ⊢ t : τ then Dom(Γ) = FV(t). Below we consider only well-typed λ-terms. A pair ⟨Γ; τ⟩ of Γ and τ is called a typing. We use θ as a metavariable for typings. When Γ ⊢ t : τ is derived, we call ⟨Γ; τ⟩ a typing of the term t, and call t an inhabitant of ⟨Γ; τ⟩ or a ⟨Γ; τ⟩-term.

Definition 1 (size, order and internal arity of a term). The size of a term t, written |t|, is defined by:

  |x| ≜ 1   |λx^τ.t| ≜ |t| + 1   |t_1 t_2| ≜ |t_1| + |t_2| + 1.

The order and internal arity of a type τ, written ord(τ) and iar(τ), are defined respectively by:

  ord(o) ≜ 0
  ord(τ_1 → · · · → τ_n → o) ≜ max{ord(τ_i) + 1 | 1 ≤ i ≤ n}   (n ≥ 1)
  iar(o) ≜ 0
  iar(τ_1 → · · · → τ_n → o) ≜ max({n} ∪ {iar(τ_i) | 1 ≤ i ≤ n})   (n ≥ 1).

For a ⟨Γ; τ⟩-term t, we define the order and internal arity of Γ ⊢ t : τ, written ord(Γ ⊢ t : τ) and iar(Γ ⊢ t : τ), by:

  ord(Γ ⊢ t : τ) ≜ max{ord(τ′) | (Γ′ ⊢ t′ : τ′) occurs in ∆}
  iar(Γ ⊢ t : τ) ≜ max{iar(τ′) | (Γ′ ⊢ t′ : τ′) occurs in ∆}

where ∆ is the (unique) derivation tree for Γ ⊢ t : τ. Note that the notions of size, order, internal arity, and β(t) (defined in the introduction) are well-defined with respect to α-equivalence.

Definition 2 (terms with bounds on types and variables). Let δ, ι, ξ ≥ 0 and n ≥ 1 be integers. We denote by Types(δ, ι) the set of types {τ | ord(τ) ≤ δ, iar(τ) ≤ ι}. For Γ and τ we define:

  Λ^α_n(⟨Γ; τ⟩, δ, ι, ξ) ≜ {[t]_α | Γ ⊢ t : τ, |t| = n, min_{t′∈[t]_α} #(V(t′)) ≤ ξ, ord(Γ ⊢ t : τ) ≤ δ, iar(Γ ⊢ t : τ) ≤ ι}
  Λ^α(⟨Γ; τ⟩, δ, ι, ξ) ≜ ∪_{n≥1} Λ^α_n(⟨Γ; τ⟩, δ, ι, ξ).

Also we define:

  Λ^α_n(δ, ι, ξ) ≜ ∪_{τ∈Types(δ,ι)} Λ^α_n(⟨∅; τ⟩, δ, ι, ξ)
  Λ^α(δ, ι, ξ) ≜ ∪_{n≥1} Λ^α_n(δ, ι, ξ).
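As a sanity check on Definition 1, the following Python sketch (ours, not code from the paper) computes ord and iar for types represented as nested pairs, with `'o'` for the base type and `(σ, τ)` for σ→τ; `flatten` recovers the form τ_1 → · · · → τ_n → o used in the definition.

```python
def flatten(tau):
    # Decompose tau = t1 -> ... -> tn -> o into the argument list [t1, ..., tn].
    args = []
    while tau != 'o':
        sigma, tau = tau
        args.append(sigma)
    return args

def ord_(tau):
    args = flatten(tau)
    return 0 if not args else max(ord_(t) + 1 for t in args)

def iar(tau):
    args = flatten(tau)
    return 0 if not args else max([len(args)] + [iar(t) for t in args])

# sigma_j from the introduction: sigma_0 = o, sigma_j = sigma_{j-1} -> sigma_{j-1}
def sigma(j):
    return 'o' if j == 0 else (sigma(j - 1), sigma(j - 1))

print(ord_(sigma(3)), iar(sigma(3)))  # 3 3
```

As expected, ord(σ_j) = j; note that iar(σ_j) = j as well for j ≥ 1, since flattening σ_j yields the arguments σ_{j−1}, σ_{j−2}, …, o.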
Our main result is the following theorem, which will be proved in Section 4.

Theorem 1. Let δ, ι, ξ ≥ 2 be integers and let k = min{δ, ι}. Then,

  lim_{n→∞} #({[t]_α ∈ Λ^α_n(δ, ι, ξ) | β(t) ≥ exp_{k−2}(n)}) / #(Λ^α_n(δ, ι, ξ)) = 1.

Remark 1. Note that in the above theorem, the order δ, the internal arity ι and the number ξ of variables are bounded above by a constant, independently of the term size n. It is debatable whether the assumption is reasonable, and a slight change of the assumption may change the result, as is the case for strong normalization of untyped λ-terms [2,4]. When λ-terms are viewed as models of functional programs, our rationale behind the assumption is as follows. The assumption that the size of types (hence also the order and the internal arity) is fixed is sometimes made in the context of type-based program analysis [17]. The assumption on the number of variables comes from the observation that a large program usually consists of a large number of small functions, and that the number of variables is bounded by the size of each function.

Remark 2. Actually, by refining the proof of Theorem 1, we can strengthen the main result as follows: for δ, ι, ξ ≥ 2 and k = min{δ, ι}, there exists a real number q > 0 such that

  lim_{n→∞} #({[t]_α ∈ Λ^α_n(δ, ι, ξ) | β(t) ≥ exp_{k−1}(n^q)}) / #(Λ^α_n(δ, ι, ξ)) = 1.

A proof of this strengthened result is presented in a revised version of this paper under preparation.

3 Analysis of Λ^α_n(δ, ι, ξ)

To prove our main theorem, we first analyze some formal-language-theoretic structure and properties of Λ^α(δ, ι, ξ): in Section 3.1, we construct a regular tree grammar such that there is a size-preserving bijection between its tree language and Λ^α(δ, ι, ξ); in Section 3.2, we show that the grammar has two important properties: irreducibility and aperiodicity. Thanks to those properties, we can obtain a simple asymptotic formula for #(Λ^α_n(δ, ι, ξ)) using analytic combinatorics [18].
The irreducibility and aperiodicity properties will also be used in Section 4 for adjusting the size and typing of a λ-term. 3.1 Λα (δ, ι, ξ) as a Regular Tree Language We first recall some basic definitions for regular tree grammars. A ranked alphabet Σ is a mapping from a finite set of symbols to the set of natural numbers. For a symbol a ∈ Dom(Σ), we call Σ(a) the rank of a. A Σ-tree is a tree composed from symbols in Σ according to their ranks: (i) a is a Σ-tree if Σ(a) = 0, (ii) a(T1 , · · · , TΣ(a) ) is a Σ-tree if Σ(a) ≥ 1 and Ti is a Σ-tree for each i ∈ {1, . . . , Σ(a)}. We use the meta-variable T for trees. The size of T , written as |T |, is the number of nodes and leaves of T . We denote the set of all Σ-trees by TΣ . A regular tree grammar is a triple G = (Σ, N , R) where (i) Σ is a ranked alphabet; (ii) N is a finite set of non-terminals; (iii) R is a finite set of rewriting rules of the form N −→ a(N1 , · · · , NΣ(a) ) where a ∈ Dom(Σ), N ∈ N and Ni ∈ N for every i ∈ {1, . . . , Σ(a)}. A (Σ ∪ N )-tree T is a tree composed from symbols in Σ ∪ N according to their ranks where the rank of every symbol in N is zero (thus non-terminals appear only in leaves of T ). For a tree grammar G = (Σ, N , R) and a non-terminal N ∈ N , the language L (G, N ) of N is defined by L (G, N ) ≜ {T ∈ TΣ | N −→∗G T } where −→∗G denotes the reflexive and transitive closure of the rewriting relation −→G . We also define Ln (G, N ) ≜ {T ∈ TΣ | N −→∗ T, |T | = n}. We often omit G and write N −→∗ N ′ , L (N ), and Ln (N ) for N −→∗G N ′ , L (G, N ), and Ln (G, N ) respectively, if G is clear from the context. We say that N ′ is reachable from N if there exists a (Σ ∪ N )-tree T such that N −→∗ T and T contains N ′ as a leaf. A grammar G is unambiguous if, for every pair of a non-terminal N and a tree T , there exists at most one leftmost reduction sequence from N to T . Definition 3 (grammar of Λα (δ, ι, ξ)). 
Let δ, ι, ξ ≥ 0 be integers and X_ξ = {x_1, · · · , x_ξ} be a subset of V. The regular tree grammar G(δ, ι, ξ) is defined as (Σ(δ, ι, ξ), N(δ, ι, ξ), R(δ, ι, ξ)) where:

  Σ(δ, ι, ξ) ≜ {x ↦ 0 | x ∈ X_ξ} ∪ {@ ↦ 2} ∪ {λx^τ ↦ 1 | x ∈ {∗} ∪ X_ξ, τ ∈ Types(δ−1, ι)}
  N(δ, ι, ξ) ≜ {N_⟨Γ;τ⟩ | τ ∈ Types(δ, ι), Dom(Γ) ⊆ X_ξ, Im(Γ) ⊆ Types(δ−1, ι), Γ ⊢ t : τ for some t}
  R(δ, ι, ξ) ≜ {N_⟨{x_i:τ};τ⟩ −→ x_i}
      ∪ {N_⟨Γ;σ→τ⟩ −→ λ∗^σ(N_⟨Γ;τ⟩)}
      ∪ {N_⟨Γ;σ→τ⟩ −→ λx_i^σ(N_⟨Γ∪{x_i:σ};τ⟩) | i = min{j ≥ 1 | x_j ∉ Dom(Γ)}, #(Γ) < ξ}
      ∪ {N_⟨Γ;τ⟩ −→ @(N_⟨Γ_1;σ→τ⟩, N_⟨Γ_2;σ⟩) | Γ = Γ_1 ∪ Γ_2}

Here, the special symbol @ ∈ Dom(Σ(δ, ι, ξ)) corresponds to application. For technical convenience, the above definition excludes from N(δ, ι, ξ) typings which have no inhabitant. Note that Σ(δ, ι, ξ), N(δ, ι, ξ) and R(δ, ι, ξ) are finite. To see the finiteness of N(δ, ι, ξ), notice that X_ξ and Types(δ−1, ι) are finite, hence so is {Γ | Dom(Γ) ⊆ X_ξ, Im(Γ) ⊆ Types(δ−1, ι)}. The finiteness of R(δ, ι, ξ) follows immediately from that of N(δ, ι, ξ).

Example 1. Let us consider the case where δ = ι = ξ = 1. The grammar G(1, 1, 1) consists of the following.

  Σ(1, 1, 1) = {x_1, @, λx_1^o, λ∗^o}
  N(1, 1, 1) = {N_⟨∅;o→o⟩, N_⟨{x_1:o};o⟩, N_⟨{x_1:o};o→o⟩}
  R(1, 1, 1) = { N_⟨∅;o→o⟩ −→ λx_1^o(N_⟨{x_1:o};o⟩),
      N_⟨{x_1:o};o⟩ −→ x_1 | @(N_⟨{x_1:o};o→o⟩, N_⟨{x_1:o};o⟩) | @(N_⟨∅;o→o⟩, N_⟨{x_1:o};o⟩),
      N_⟨{x_1:o};o→o⟩ −→ λ∗^o(N_⟨{x_1:o};o⟩) }

There is an obvious embedding e^(δ,ι,ξ) (e for short) from trees in T_Σ(δ,ι,ξ) into λ-terms. For N_⟨Γ;τ⟩ ∈ N(δ, ι, ξ) we define

  π^(δ,ι,ξ)_⟨Γ;τ⟩ ≜ [−]_α ◦ e : L(N_⟨Γ;τ⟩) → Λ^α(⟨Γ; τ⟩, δ, ι, ξ).

We sometimes omit the superscript and/or the subscript.

Proposition 1. For δ, ι, ξ ≥ 0, π_⟨Γ;τ⟩ is a size-preserving bijection, and G(δ, ι, ξ) is unambiguous.

The former part of Proposition 1 says that G(δ, ι, ξ) gives a complete representation system of the α-equivalence classes.
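To illustrate Example 1 concretely, the following Python sketch (ours; a plain dynamic program, not code from the paper) counts the trees of each size generated from each non-terminal of G(1, 1, 1). Writing A, B, C for N_⟨∅;o→o⟩, N_⟨{x_1:o};o⟩, N_⟨{x_1:o};o→o⟩, each unary rule adds 1 to the size, and each @-rule adds 1 plus the sizes of the two subtrees.

```python
MAX = 12
A = [0] * (MAX + 1)  # N_<0; o->o>
B = [0] * (MAX + 1)  # N_<{x1:o}; o>
C = [0] * (MAX + 1)  # N_<{x1:o}; o->o>

for n in range(1, MAX + 1):
    B[n] = 1 if n == 1 else 0  # B -> x1
    # B -> @(C, B) | @(A, B): subtree sizes n1 + n2 = n - 1
    B[n] += sum((C[n1] + A[n1]) * B[n - 1 - n1] for n1 in range(1, n - 1))
    A[n] = B[n - 1]  # A -> λx1^o(B)
    C[n] = B[n - 1]  # C -> λ*^o(B)

print(A[2], A[5])  # 1 2
```

The two closed inhabitants of size 5 are λx_1.(λ∗.x_1)x_1 and λx_1.(λx_1.x_1)x_1, matching A[5] = 2; the unique inhabitant of size 2 is λx_1.x_1.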
For [t]_α ∈ Λ^α(⟨Γ; τ⟩, δ, ι, ξ), we define ν^(δ,ι,ξ)_⟨Γ;τ⟩([t]_α) (or ν([t]_α) for short) as (e^(δ,ι,ξ) ◦ (π^(δ,ι,ξ)_⟨Γ;τ⟩)^{−1})([t]_α). The function ν normalizes variable names. For example, t = λx.x(λy.λz.z) is normalized to ν([t]_α) = λx_1.x_1(λ∗.λx_1.x_1). For technical reasons, we restrict the grammar G(δ, ι, ξ) to G^∅(δ, ι, ξ), which contains only non-terminals reachable from N_⟨∅;σ⟩ for some σ (see Appendix B for details):

  N^∅(δ, ι, ξ) ≜ {N_θ ∈ N(δ, ι, ξ) | N_θ is reachable from some N_⟨∅;σ⟩ ∈ N(δ, ι, ξ)}
  R^∅(δ, ι, ξ) ≜ {N_θ −→ T ∈ R(δ, ι, ξ) | N_θ ∈ N^∅(δ, ι, ξ)}
  G^∅(δ, ι, ξ) ≜ (Σ(δ, ι, ξ), N^∅(δ, ι, ξ), R^∅(δ, ι, ξ)).

For N_θ ∈ N^∅(δ, ι, ξ), clearly L(G^∅(δ, ι, ξ), N_θ) = L(G(δ, ι, ξ), N_θ). Through the bijection π, we can show that, for any N_⟨Γ;τ⟩ ∈ N(δ, ι, ξ), N_⟨Γ;τ⟩ also belongs to N^∅(δ, ι, ξ) if and only if there exists a term in Λ^α(δ, ι, ξ) whose derivation contains a type judgment of the form Γ ⊢ t : τ.

3.2 Irreducibility and Aperiodicity

We discuss two important properties of the grammar G^∅(δ, ι, ξ) where δ, ι, ξ ≥ 2: irreducibility and aperiodicity [18].^1

Definition 4 (irreducibility and aperiodicity). Let G = (Σ, N, R) be a regular tree grammar. We say that G is:

– non-linear if R contains at least one rule of the form N −→ a(N_1, · · · , N_Σ(a)) with Σ(a) ≥ 2,
– strongly connected if for any pair of non-terminals N_1, N_2 ∈ N, N_1 is reachable from N_2,
– irreducible if G is both non-linear and strongly connected,
– aperiodic if for any non-terminal N ∈ N there exists an integer m > 0 such that #(L_n(N)) > 0 for any n > m.

Proposition 2. G^∅(δ, ι, ξ) is irreducible and aperiodic for any δ, ι, ξ ≥ 2.

The following theorem is a minor modification of Theorem VII.5 in [18], which states the asymptotic behavior of an irreducible and aperiodic context-free specification (see Appendix C for details). Below, ∼ means asymptotic equality, i.e., f(n) ∼ g(n) ⇐⇒ lim_{n→∞} f(n)/g(n) = 1.
^1 In [18], irreducibility and aperiodicity are defined for context-free specifications. Our definition is a straightforward adaptation of the definition for regular tree grammars.

Theorem 2 ([18]). Let G = (Σ, N, R) be an unambiguous, irreducible and aperiodic regular tree grammar. Then there exists a constant γ(G) > 1 such that, for any non-terminal N ∈ N, there exists a constant C_N(G) > 0 such that

  #(L_n(N)) ∼ C_N(G) γ(G)^n n^{−3/2}.

As a corollary of Proposition 2 and Theorem 2 above, we obtain:

  #(Λ^α_n(δ, ι, ξ)) ∼ C γ^n n^{−3/2}   (1)

where C > 0 and γ > 1 are some real constants determined by δ, ι, ξ ≥ 2. For proving our main theorem, we use a variation of the formula (1) above, stated as Lemma 1 later.

4 Proof of the Main Theorem

We give a proof of Theorem 1 in this section. In the rest of the paper, we denote by log^(2)(n) the 2-fold logarithm: log^(2)(n) ≜ log log n. All logarithms are base 2. The outline of the proof is as follows. We prepare a family (t_n)_{n∈N} of λ-terms such that t_n is of size Ω(log^(2)(n)) and has a β-reduction sequence of length exp_k(Ω(|t_n|)), i.e., exp_{k−2}(Ω(n)). Then we show that almost every λ-term of size n contains t_n as a subterm. The latter is shown by adapting (a parameterized version of) the infinite monkey theorem^2 for words to simply-typed λ-terms. To clarify the idea, let us first recall the infinite monkey theorem for words. Let A be an alphabet, i.e., a finite non-empty set of symbols. For a word w = a_1 · · · a_n, we write |w| = n for the length of w. As usual, we denote by A^n the set of all words of length n over A, and by A^∗ the set of all finite words over A: A^∗ = ∪_{n≥0} A^n. For two words w, w′ ∈ A^∗, we say w′ is a subword of w and write w′ ⊑ w if w = w_1 w′ w_2 for some words w_1, w_2 ∈ A^∗. The infinite monkey theorem states that, for any word w ∈ A^∗, the probability that a randomly chosen word of size n contains w as a subword tends to one as n tends to infinity.
To prove our main theorem, we need to extend the above infinite monkey theorem to the following parameterized version^3, and then further extend it to simply-typed λ-terms instead of words. We give a proof of the following proposition, because it clarifies the overall structure of the proof of the main theorem.

Proposition 3 (parameterized infinite monkey theorem). Let A be an alphabet and (w_n)_n be a family of words over A such that |w_n| = ⌈log^(2)(n)⌉. Then, we have:

  lim_{n→∞} #({w ∈ A^n | w_n ⊑ w}) / #(A^n) = 1.

^2 It is also called "Borges's theorem" (cf. [18, p. 61, Note I.35]).
^3 Although it is a simple extension, we are not aware of literature that explicitly states this parameterized version.

Proof. Let p(n) be 1 − #({w ∈ A^n | w_n ⊑ w})/#(A^n), i.e., the probability that a word of size n does not contain w_n. We write s(n) for ⌈log^(2)(n)⌉ and c(n) for ⌊n/s(n)⌋. Given a word w = a_1 · · · a_n ∈ A^n, let us partition it into subwords of length s(n):

  w = (a_1 · · · a_{s(n)}) · · · (a_{(c(n)−1)s(n)+1} · · · a_{c(n)s(n)}) a_{c(n)s(n)+1} · · · a_n,

where the i-th parenthesized block is the i-th subword (i = 1, . . . , c(n)). Then,

  p(n) ≤ (the probability that none of the i-th subwords is w_n)
       = ((#(A^{s(n)}) − 1) / #(A^{s(n)}))^{c(n)}
       = (#(A^{s(n)} \ {w_n}) / #(A^{s(n)}))^{c(n)}
       = (1 − 1/#(A)^{s(n)})^{c(n)}.

Since (1 − 1/#(A)^{s(n)})^{c(n)} = (1 − 1/#(A)^{⌈log^(2)(n)⌉})^{⌊n/⌈log^(2)(n)⌉⌋} tends to zero (see Appendix E) as n tends to infinity, we have the required result. ⊓⊔

To prove an analogous result for simply-typed λ-terms, we consider below subcontexts of a given term instead of subwords of a given word. To consider "contexts up to α-equivalence", in Section 4.1 we introduce the set U^ν_n(δ, ι, ξ) of "normalized" contexts (of size n and with the restriction by δ, ι and ξ), where U^ν_{s(n)}(δ, ι, ξ) corresponds to A^{s(n)} above, and give an upper bound of #(U^ν_n(δ, ι, ξ)). A key property used in the above proof was that any word of length n can be partitioned into sufficiently many subwords of length log^(2)(n).
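The convergence claimed at the end of the proof can be checked numerically. The sketch below (ours, with alphabet size 2) evaluates the bound (1 − 1/#(A)^{⌈log^(2) n⌉})^{⌊n/⌈log^(2) n⌉⌋}: intuitively, the number of blocks grows almost linearly in n, while the per-block survival probability approaches 1 only doubly-logarithmically slowly, so the product collapses to zero.

```python
import math

def bound(n, alphabet_size=2):
    s = math.ceil(math.log2(math.log2(n)))  # block length s(n) = ceil(log log n)
    c = n // s                              # number of blocks c(n)
    return (1 - 1 / alphabet_size ** s) ** c

for n in [2 ** 8, 2 ** 16, 2 ** 32]:
    print(n, bound(n))
```

For n = 2^8 the bound is already around 10^{-5}, and for n = 2^16 it underflows to essentially zero.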
Section 4.2 below shows an analogous result that any term of size n can be decomposed into sufficiently many subcontexts of a given size. Section 4.3 constructs a family of contexts Expl kn (called “explosive contexts”) that have very long reduction sequences; (Expl kn )n corresponds to (wn )n above. Finally, Section 4.4 proves the main theorem using an argument similar to (but more involved than) the one used in the proof above. 4.1 Normalized Contexts We first introduce some basic definitions of contexts, and then we define the notion of a normalized context, which is a context normalized by the function ν given in Section 3.1. The set of contexts, ranged over by C, is defined by C ::= [ ] | x | λxτ.C | CC The size of C, written |C|, is defined by: |[ ]| ≜ 0 |x| ≜ 1 |λxτ.C| ≜ |C| + 1 |C1 C2 | ≜ |C1 | + |C2 | + 1. We call a context C an n-context (and define hn(C) ≜ n) if C contains n occurrences of [ ]. We use the metavariable S for 1-contexts. A 0/1-context is a term t or a 1-context S and we use the metavariable u to denote 0/1-contexts. The holes in C occur as leaves and we write [ ]i for the i-th hole, which is counted in the left-to-right order. For C, C1 , . . . , Chn(C) , we write C[C1 ] . . . [Chn(C) ] for the context obtained by replacing [ ]i in C with Ci for each i ≤ hn(C). For C and C ′ , we write C[C ′ ]i for the context obtained by replacing the i-th hole [ ]i in C with C ′ . As usual, these substitutions may capture variables; e.g., (λx.[ ])[x] is λx.x. We say that C is a subcontext of C ′ and write C ⪯ C ′ if there exist C ′′ , 1 ≤ i ≤ hn(C ′′ ) and C1 , · · · , Chn(C) such that C ′ = C ′′ [C[C1 ] · · · [Chn(C) ]]i . The set of context typings, ranged over by κ, is defined by: κ ::= θ1 · · · θk ⇒θ where k ∈ N and θi is a typing of the form ⟨Γi ; τi ⟩ for each 1 ≤ i ≤ k (recall that we use θ as a metavariable for typings). A ⟨Γ1 ; τ1 ⟩ · · · ⟨Γk ; τk ⟩⇒⟨Γ ; τ ⟩-context is a k-context C such that Γ ⊢ C : τ is derivable from Γi ⊢ [ ]i : τi . 
We identify a context typing ⇒θ with the typing θ, and call a θ-context also a θ-term. We now define normalized contexts. First we consider contexts in terms of the grammar G^∅(δ, ι, ξ) given in Section 3.1. Let δ, ι, ξ ≥ 0. For κ = θ_1 · · · θ_n ⇒ θ such that N_{θ_1}, . . . , N_{θ_n}, N_θ ∈ N(δ, ι, ξ), a (κ-)context-tree is a tree T̂ in T_{Σ(δ,ι,ξ)∪N(δ,ι,ξ)} such that there exists a reduction N_θ −→∗ T̂ and the occurrences of non-terminals in T̂ (in the left-to-right order) are exactly N_{θ_1}, . . . , N_{θ_n}. We use T̂ as a metavariable for context-trees. We write L(κ, δ, ι, ξ) for the set of all κ-context-trees. For a θ_1 · · · θ_n ⇒ θ-context-tree T̂ and θ^i_1 · · · θ^i_{k_i} ⇒ θ_i-context-trees T̂_i (i = 1, . . . , n), we define the substitution T̂[T̂_1] · · · [T̂_n] as the θ^1_1 · · · θ^1_{k_1} · · · θ^n_1 · · · θ^n_{k_n} ⇒ θ-context-tree obtained by replacing N_{θ_i} in T̂ with T̂_i. The set C^ν(κ, δ, ι, ξ) of normalized κ-contexts is defined by:

  C^ν(κ, δ, ι, ξ) ≜ e^(δ,ι,ξ)_κ(L(κ, δ, ι, ξ))

where e^(δ,ι,ξ)_κ is the obvious embedding from κ-context-trees to κ-contexts that preserves the substitution (i.e., e^(δ,ι,ξ)_κ(T[T′]) = e^(δ,ι,ξ)_κ(T)[e^(δ,ι,ξ)_κ(T′)]). Further, the sets U^ν(δ, ι, ξ) and U^ν_n(δ, ι, ξ) of normalized 0/1-contexts are defined by:

  U^ν(δ, ι, ξ) ≜ (∪_{N_θ∈N^∅(δ,ι,ξ)} C^ν(θ, δ, ι, ξ)) ∪ (∪_{N_θ,N_{θ′}∈N^∅(δ,ι,ξ)} C^ν(θ⇒θ′, δ, ι, ξ))
  U^ν_n(δ, ι, ξ) ≜ {u ∈ U^ν(δ, ι, ξ) | |u| = n}.

In our proof of the main theorem, the set U^ν_{s(n)}(δ, ι, ξ) plays a role corresponding to A^{s(n)} in the word case explained above. Note that in the word case we calculated the limit of an upper bound of p(n); similarly, in our proof, we only need an upper bound of #(U^ν_n(δ, ι, ξ)), which is given as follows.

Lemma 1 (upper bound of #(U^ν_n(δ, ι, ξ))). For any δ, ι, ξ ≥ 2, there exists some constant γ(δ, ι, ξ) > 1 such that #(U^ν_n(δ, ι, ξ)) = O(γ(δ, ι, ξ)^n).

Proof Sketch.
Given an unambiguous, irreducible and aperiodic regular tree grammar, adding a new terminal of the form a_N and a new rule of the form N −→ a_N for each non-terminal N does not change the unambiguity, irreducibility and aperiodicity. Let G̅^∅(δ, ι, ξ) be the grammar obtained by applying this transformation to G^∅(δ, ι, ξ). We can regard a tree of G̅^∅(δ, ι, ξ) as a normalized context, with a_{N_θ} considered a hole with typing θ. Then, clearly we have

  #(U^ν_n(δ, ι, ξ)) ≤ #(∪_{N∈N^∅(δ,ι,ξ)} L_n(G̅^∅(δ, ι, ξ), N)).

Thus the required result follows from Theorem 2. ⊓⊔

4.2 Decomposition

As explained in the beginning of this section, to prove the parameterized infinite monkey theorem for terms, we need to decompose a λ-term into sufficiently many subcontexts of the term. Thus, in this subsection, we will define a decomposition function Φ̂_m (where m is a parameter) that decomposes a term t into (i) a (sufficiently long) sequence P of 0/1-subcontexts of t such that every component u of P satisfies |u| ≥ m, and (ii) a "second-order" context E (defined later), which is the remainder of extracting P from t. Figure 1 illustrates how a term is decomposed by Φ̂_3. Here, the symbols J K in the second-order context on the right-hand side represent the original positions of the subcontexts (λy.[ ])x, λz.λ∗.z, and (λ∗.y)λz.z.

[Fig. 1: Example of a decomposition — a term is split into a second-order context and the sequence of 0/1-subcontexts (λy.[ ])x · λz.λ∗.z · (λ∗.y)λz.z.]

In order to define Φ̂_m, let us give a precise definition of second-order contexts. The set of second-order contexts, ranged over by E, is defined by:

  E ::= J K^{θ_1···θ_k⇒θ}_n [E_1] · · · [E_k]  (n ∈ N) | x | λx^τ.E | E_1 E_2.

Intuitively, a second-order context is an expression having holes of the form J K^κ_n (called second-order holes).
In the second-order context J K^{θ_1···θ_k⇒θ}_n [E_1] · · · [E_k], the hole J K^{θ_1···θ_k⇒θ}_n should be filled with a θ_1 · · · θ_k ⇒ θ-context of size n, yielding a term whose typing is θ. We use the metavariable P for sequences of contexts. For a sequence of contexts P = C_1 · C_2 · · · C_ℓ and i ≤ ℓ, we write #(P) for the length ℓ, and P.i for the i-th component C_i. We define |J K^κ_n| ≜ n. We write shn(E) for the number of the second-order holes in E. For i ≤ shn(E), we write E.i for the i-th second-order hole (counted in the depth-first left-to-right pre-order). For a context C and a second-order hole J K^κ_n, we write C : J K^κ_n if C is a κ-context of size n. For E and P = C_1 · C_2 · · · C_{shn(E)}, we write P : E if C_i : E.i for each i ≤ shn(E). We distinguish between second-order contexts with different annotations; for example, J K^{⟨{x:o};o⟩⇒⟨{x:o};o⟩}_2 [x], J K^{⟨{x:o→o};o→o⟩⇒⟨{x:o→o};o→o⟩}_2 [x] and J K^{⟨{x:o};o⟩⇒⟨{x:o};o⟩}_0 [x] are pairwise different. Note that every term can be regarded as a second-order context E such that shn(E) = 0. The bracket [−] in a second-order context is just a syntactical representation rather than the substitution operation of contexts. Given E and C such that shn(E) ≥ 1 and C : E.1, we write EJCK for the second-order context obtained by replacing the leftmost second-order hole of E (i.e., E.1) with C (and by interpreting the syntactical bracket [−] as the substitution operation). For example, we have:

  ((λx.J K[x][x]) J K) Jλy.y[ ][ ]K = (λx.(λy.y[ ][ ])[x][x]) J K = (λx.λy.y x x) J K.

Formally, it is defined by induction on E as follows:

  (J K^{⟨Γ_1;τ_1⟩···⟨Γ_k;τ_k⟩⇒⟨Γ;τ⟩}_n [E_1] · · · [E_k]) JCK ≜ C[E_1] · · · [E_k]
  (λx^τ.E) JCK ≜ λx^τ.(EJCK)
  (E_1 E_2) JCK ≜ (E_1JCK) E_2   (shn(E_1) ≥ 1)
  (E_1 E_2) JCK ≜ E_1 (E_2JCK)   (shn(E_1) = 0)

where C[E_1] · · · [E_n] is the second-order context obtained by replacing [ ]_i in C with E_i for each i ≤ n. We extend the operation EJ−K to sequences of contexts.
Given E and a sequence of contexts P = C_1 · C_2 · · · C_ℓ such that ℓ ≤ shn(E) and C_i : E.i for each i ≤ ℓ, we define EJP K by induction on P:

  EJϵK ≜ E    EJC · P K ≜ (EJCK)JP K.

Note that shn(EJCK) = shn(E) − 1, so if P : E then EJP K is a term. Below we actually consider only second-order contexts whose second-order holes are of the form J K^θ_n or J K^{θ′⇒θ}_n.

We are now ready to define the decomposition function Φ̂_m. We first prepare an auxiliary function Φ_m(t) = (E, u, P) such that (i) u is an auxiliary 0/1-subcontext, (ii) EJu·P K = t, and (iii) the size of each context in P is between m and 2m − 1. It is defined by induction on the size of t as follows. If |t| < m, then Φ_m(t) ≜ (J K, t, ϵ). If |t| ≥ m, then:

  Φ_m(λx^τ.t_1) ≜ (E_1, λx^τ.u_1, P_1)                if |λx^τ.u_1| < m
                  (J K[E_1], [ ], (λx^τ.u_1) · P_1)    if |λx^τ.u_1| = m
  where (E_1, u_1, P_1) = Φ_m(t_1);

  Φ_m(t_1 t_2) ≜ (J K[(E_1Ju_1K)(E_2Ju_2K)], [ ], P_1 · P_2)   if |t_i| ≥ m (i = 1, 2)
                 (E_1, u_1 t_2, P_1)                   if |t_1| ≥ m, |t_2| < m, |u_1 t_2| < m
                 (J K[E_1], [ ], (u_1 t_2) · P_1)      if |t_1| ≥ m, |t_2| < m, |u_1 t_2| ≥ m
                 (E_2, t_1 u_2, P_2)                   if |t_1| < m, |t_1 u_2| < m
                 (J K[E_2], [ ], (t_1 u_2) · P_2)      if |t_1| < m, |t_1 u_2| ≥ m
  where (E_i, u_i, P_i) = Φ_m(t_i) (i = 1, 2).

In the rest of this subsection, we show key properties of Φ̂_m. We say that a 0/1-context u is good for m if u is either (i) a λ-abstraction with |u| = m; or (ii) an application u_1 u_2 where |u_j| < m for each j = 1, 2. By the definition of Φ̂_m(t) = (E, P), every component u of P is good for m. For m ≥ 2, E and 1 ≤ i ≤ shn(E), we define Û^m_{E.i}(δ, ι, ξ), Λ^m_E(δ, ι, ξ), and B^m_n(δ, ι, ξ) by:

  Û^m_{E.i}(δ, ι, ξ) ≜ {u ∈ U^ν(δ, ι, ξ) | u : E.i and u is good for m}
  Λ^m_E(δ, ι, ξ) ≜ {[EJu_1 · · · u_{shn(E)}K]_α | u_i ∈ Û^m_{E.i}(δ, ι, ξ) for 1 ≤ i ≤ shn(E)}
  B^m_n(δ, ι, ξ) ≜ {E | (E, P) = Φ̂_m(ν([t]_α)) for some [t]_α ∈ Λ^α_n(δ, ι, ξ)}.
Intuitively, Û^m_{E.i}(δ, ι, ξ) is the set of good contexts that can fill E.i, Λ^m_E(δ, ι, ξ) is the set of terms obtained by filling the second-order holes of E with good contexts, and B^m_n(δ, ι, ξ) is the set of second-order contexts that can be obtained by decomposing a term of size n. The following lemma states the key properties of Φ̂_m.

Lemma 2 (decomposition). Let δ, ι, ξ ≥ 0 and 2 ≤ m ≤ n.
1. Λ^α_n(δ, ι, ξ) is the disjoint union of the Λ^m_E(δ, ι, ξ)'s, i.e.,
   Λ^α_n(δ, ι, ξ) = ⊎_{E∈B^m_n(δ,ι,ξ)} Λ^m_E(δ, ι, ξ).
   Moreover, Φ̂_m(EJP K) = (E, P) holds for any P ∈ ∏_{1≤i≤shn(E)} Û^m_{E.i}(δ, ι, ξ).
2. m ≤ |E.i| < 2m (1 ≤ i ≤ shn(E)) for every E ∈ B^m_n(δ, ι, ξ).
3. shn(E) ≥ n/4m for every E ∈ B^m_n(δ, ι, ξ).

The second and third properties say that Φ̂_m decomposes each term into sufficiently many contexts of appropriate size.

4.3 Explosive Context

Here, we show that each Û^m_{E.i}(δ, ι, ξ) contains at least one context that has a very long reduction sequence. To this end, we first prepare a special context Expl^m_k that has a long reduction sequence, and show that at least one element of Û^m_{E.i}(δ, ι, ξ) contains Expl^m_k as a subcontext. We define a "duplicating term" Dup ≜ λx^o.(λx^o.λ∗^o.x)x x, and Id ≜ λx^o.x. For two terms t, t′ and integer n ≥ 1, we define the "n-fold application" operation ↑^n by t ↑^0 t′ ≜ t′ and t ↑^n t′ ≜ t(t ↑^{n−1} t′). For an integer k ≥ 2, we define an order-k term 2_k ≜ λf^{τ(k)→τ(k)}.λx^{τ(k)}.f(f x) where τ(i) is defined by τ(2) ≜ o and τ(i+1) ≜ τ(i)→τ(i).

Definition 5 (explosive context). Let m ≥ 1 and k ≥ 2 be integers and let

  t ≜ ν(λx^o.((2_k ↑^m 2_{k−1}) 2_{k−2} · · · 2_2 Dup (Id x^†)))

where x^† is just the variable x, with † marking the occurrence. We define the explosive context Expl^m_k (of m-fold and order k) as the 1-context obtained by replacing the "normalized" variable x_1^† in t with [ ]. We state key properties of Expl^m_k below. The proof of Item 5 is the same as that in [1].
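The explosive power of a prefix built from the terms 2_k can be seen already with untyped Church numerals in Python (our illustration, not the paper's typed term): applying the numeral 2 to itself at ever higher orders produces a tower of exponentials, which is what makes a reduction sequence of length exp_k(m) possible for a context of size only O(m + k).

```python
# Church numeral 2 at every order: λf.λx.f (f x)
two = lambda f: lambda x: f(f(x))
succ = lambda n: n + 1

def tower(height):
    """Evaluate 2 2 ... 2 succ 0 with `height` copies of 2 (left-associated)."""
    f = two
    for _ in range(height - 1):
        f = f(two)
    return f(succ)(0)

print([tower(h) for h in [1, 2, 3, 4]])  # [2, 4, 16, 65536]
```

Each additional copy of 2 exponentiates the result once more, mirroring how the m-fold application 2_k ↑^m 2_{k−1} drives the lower bound of Item 5 of Lemma 3.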
The other items follow by straightforward calculation.

Lemma 3 (explosive).
1. ∅ ⊢ Expl^m_k[x1] : o→o is derivable.
2. |Expl^m_k| = 8m + 8k − 3.
3. ord(Expl^m_k[x1]) = k, iar(Expl^m_k[x1]) = k and #(V(Expl^m_k)) = 2.
4. Expl^m_k ∈ U^ν(δ, ι, ξ) if δ, ι ≥ k and ξ ≥ 2.
5. If a term t satisfies Expl^m_k ⪯ t, then β(t) ≥ exp_k(m) holds.

We show that at least one element of Û^m_{E,i}(δ, ι, ξ) contains Expl^{m′}_k as a subcontext.

Lemma 4. Let δ, ι, ξ ≥ 2 be integers and k = min{δ, ι}. There exist integers b, c ≥ 2 such that, for any n ≥ 1, m′ ≥ b, E ∈ B^{cm′}_n(δ, ι, ξ) and i ∈ {1, . . . , shn(E)}, Û^{cm′}_{E,i}(δ, ι, ξ) contains u′ such that Expl^{m′}_k ⪯ u′.

Proof Sketch. We pick u′′ ∈ Û^{cm′}_{E,i}(δ, ι, ξ) and construct u′ by replacing some subcontext u0 of u′′ with a 0/1-context of the form S◦[Expl^{m′}_k[u◦]]. Here S◦ and u◦ adjust the context typing and size of Expl^{m′}_k, and they can be obtained by using Proposition 2. The subcontext u0 is chosen so that the goodness of u′′ is preserved by this replacement. ⊓⊔

4.4 Proof Sketch of Theorem 1

We are now ready to prove the main theorem; see Appendix E for details. For readability, we omit the parameters (δ, ι, ξ), and write Λ^α_n, U^ν_n, Λ^m_E, Û^m_{E,i} and B^m_n for Λ^α_n(δ, ι, ξ), U^ν_n(δ, ι, ξ), Λ^m_E(δ, ι, ξ), Û^m_{E,i}(δ, ι, ξ) and B^m_n(δ, ι, ξ), respectively.

Let p(n) be the probability that a randomly chosen normalized term t in Λ^α_n does not contain Expl^{⌈log^(2)(n)⌉}_k as a subcontext. By Item 5 of Lemma 3, it suffices to show lim_{n→∞} p(n) = 0. Let b and c be the constants in Lemma 4 and let n ≥ 2^{2^b}, m′ = ⌈log^(2)(n)⌉ and m = cm′. Then m′ ≥ log^(2)(n) ≥ b.

By Lemma 2, Λ^α_n can be represented as the disjoint union ⊎_{E∈B^m_n} Λ^m_E. Let Λ̄^m_E be the subset of Λ^m_E consisting of the terms that do not contain Expl^{m′}_k as a subcontext. By Lemma 4, each Û^m_{E,i} contains at least one element that has Expl^{m′}_k as a subcontext. Furthermore, since m ≤ |E.i| < 2m, we have #(Û^m_{E,i}) ≤ #(U^ν_{2m+d}) for some constant d (see Appendix H).
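Two quick numeric sanity checks may help here (both are our own illustrations, with arbitrary placeholder constants, not the paper's actual terms or constants). First, iterated "twice" functions produce towers of exponentials, which is the engine behind Expl^m_k and Item 5 of Lemma 3. Second, bounds of the shape arising below, (1 − 1/(a·2^{cm′}))^{n/(4cm′)} with m′ = ⌈log log n⌉, vanish as n grows, since the exponent is almost linear in n while the base defect shrinks only like 1/log n.

```python
import math

# (1) Iterated "twice" produces exponential towers (the paper's term also
#     routes the result through Dup and Id; this is the growth mechanism only).
def expk(k, x):
    """exp_0(x) = x and exp_{k+1}(x) = 2^{exp_k(x)}."""
    for _ in range(k):
        x = 2 ** x
    return x

twice = lambda f: lambda x: f(f(x))   # the term 2_k = λf.λx.f(f x)
succ = lambda n: n + 1

assert twice(twice)(succ)(0) == 4                    # 2^2
assert twice(twice)(twice)(succ)(0) == 16            # 2^(2^2)
assert twice(twice)(twice)(twice)(succ)(0) == 65536  # 2^(2^(2^2)) = exp_3(2)
assert expk(3, 2) == 65536

# (2) (1 - 1/(a*2^(c*m')))^(n/(4*c*m')) with m' = ceil(log2 log2 n) tends to 0.
#     The constants a, c are placeholders for c'γ(δ,ι,ξ)... and c of Lemma 4.
def bound(n, a=4.0, c=1):
    mp = math.ceil(math.log2(math.log2(n)))
    return (1 - 1 / (a * 2 ** (c * mp))) ** (n / (4 * c * mp))

vals = [bound(10 ** e) for e in (3, 4, 5, 6, 7)]
assert all(x > y for x, y in zip(vals, vals[1:]))    # strictly decreasing
assert vals[-1] < 1e-9                               # essentially zero already
```

Part (2) only illustrates the shape of the convergence; the actual convergence used below is proved in Appendix I.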
Combining the above, we have

#(Λ̄^m_E)/#(Λ^m_E) ≤ ∏_{1≤i≤shn(E)} (1 − 1/#(Û^m_{E,i})) ≤ (1 − 1/#(U^ν_{2m+d}))^{shn(E)}
  ≤ (1 − 1/#(U^ν_{2m+d}))^{n/4m}   (∵ Item 3 of Lemma 2).

Let q(n) be the rightmost expression. Then we have

p(n) = (Σ_{E∈B^m_n} #(Λ̄^m_E)) / (Σ_{E∈B^m_n} #(Λ^m_E)) ≤ (Σ_{E∈B^m_n} q(n)#(Λ^m_E)) / (Σ_{E∈B^m_n} #(Λ^m_E))
  = q(n) ≤ (1 − 1/(c′γ(δ, ι, ξ)^{2m}))^{n/4m}   (∵ Lemma 1)

for sufficiently large n. Finally, we can conclude that

p(n) ≤ (1 − 1/(c′γ(δ, ι, ξ)^{2c⌈log^(2)(n)⌉}))^{n/(4c⌈log^(2)(n)⌉)} −→ 0 (as n −→ ∞)

(see Appendix I for the last convergence), as required. ⊓⊔

5 Related Work

As mentioned in Section 1, there are several pieces of work on probabilistic properties of untyped λ-terms [2,3,4]. David et al. [2] have shown that almost all untyped λ-terms are strongly normalizing, whereas the result is opposite for terms expressed in SK combinators. Their former result implies that untyped λ-terms do not satisfy the infinite monkey theorem, i.e., for any term t, the probability that a randomly chosen term of size n contains t as a subterm tends to zero. Bendkowski et al. [4] proved that almost all terms in de Bruijn representation are not strongly normalizing, by regarding the size of an index i as i + 1, instead of the constant 1. The discrepancies among those results suggest that this kind of probabilistic property is quite fragile and depends on the definition of the syntax and the size of terms. Thus, the setting of our paper, especially the assumption on the boundedness of internal arities and the number of variables, is a matter of debate, and it would be interesting to study how the result changes under different assumptions.

We are not aware of similar studies on typed λ-terms. In fact, in their paper about combinatorial aspects of λ-terms, Grygiel and Lescanne [3] pointed out that the combinatorial study of typed λ-terms is difficult, due to the lack of a (simple) recursive definition of typed terms.
In the present paper, we have avoided the difficulty by making the assumption on the boundedness of internal arities and the number of variables (which, as mentioned above, is subject to debate). In a larger context, our work may be viewed as an instance of the studies of average-case complexity ([19], Chapter 10), which discusses "typical-case feasibility". We are not aware of much work on the average-case complexity of problems with hyper-exponential complexity.

6 Conclusion

We have shown that almost every simply-typed λ-term of order k has a β-reduction sequence as long as (k − 2)-fold exponential in the term size, under a certain assumption. To our knowledge, this is the first result of this kind for typed λ-terms. A lot of questions are left for future work, such as (i) whether our assumption (on the boundedness of arities and the number of variables) is reasonable, and how the result changes under different assumptions, (ii) whether our result is optimal (e.g., whether almost every term has a k-fold exponentially long reduction sequence), and (iii) whether similar results hold for Terui's decision problems [14] and/or the higher-order model checking problem [6].

Acknowledgment

We would like to thank the anonymous referees for useful comments. This work was supported by JSPS KAKENHI Grant Number JP15H05706.

References

1. Beckmann, A.: Exact bounds for lengths of reductions in typed lambda-calculus. Journal of Symbolic Logic 66(3) (2001) 1277–1285
2. David, R., Grygiel, K., Kozik, J., Raffalli, C., Theyssier, G., Zaionc, M.: Asymptotically almost all λ-terms are strongly normalizing. Logical Methods in Computer Science 9(2) (2013)
3. Grygiel, K., Lescanne, P.: Counting and generating lambda terms. Journal of Functional Programming 23(5) (2013) 594–628
4. Bendkowski, M., Grygiel, K., Lescanne, P., Zaionc, M.: A natural counting of lambda terms.
In: SOFSEM 2016: Theory and Practice of Computer Science: 42nd International Conference on Current Trends in Theory and Practice of Computer Science, Harrachov, Czech Republic, January 23-28, 2016, Proceedings. Springer Berlin Heidelberg, Berlin, Heidelberg (2016) 183–194
5. Knapik, T., Niwinski, D., Urzyczyn, P.: Higher-order pushdown trees are easy. In: FoSSaCS 2002. Volume 2303 of Lecture Notes in Computer Science, Springer (2002) 205–222
6. Ong, C.H.L.: On model-checking trees generated by higher-order recursion schemes. In: LICS 2006, IEEE Computer Society Press (2006) 81–90
7. Kobayashi, N.: Model-checking higher-order functions. In: Proceedings of PPDP 2009, ACM Press (2009) 25–36
8. Broadbent, C.H., Kobayashi, N.: Saturation-based model checking of higher-order recursion schemes. In: Proceedings of CSL 2013. Volume 23 of LIPIcs (2013) 129–148
9. Ramsay, S., Neatherway, R., Ong, C.H.L.: An abstraction refinement approach to higher-order model checking. In: Proceedings of POPL 2014 (2014)
10. Kobayashi, N.: Types and higher-order recursion schemes for verification of higher-order programs. In: Proceedings of POPL 2009, ACM Press (2009) 416–428
11. Kobayashi, N., Sato, R., Unno, H.: Predicate abstraction and CEGAR for higher-order model checking. In: Proceedings of PLDI 2011, ACM Press (2011) 222–233
12. Ong, C.H.L., Ramsay, S.: Verifying higher-order programs with pattern-matching algebraic data types. In: Proceedings of POPL 2011, ACM Press (2011) 587–598
13. Sato, R., Unno, H., Kobayashi, N.: Towards a scalable software model checker for higher-order programs. In: Proceedings of PEPM 2013, ACM Press (2013) 53–62
14. Terui, K.: Semantic evaluation, intersection types and complexity of simply typed lambda calculus. In: 23rd International Conference on Rewriting Techniques and Applications (RTA'12). Volume 15 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2012) 323–338
15. Mairson, H.G.: Deciding ML typability is complete for deterministic exponential time. In: POPL, ACM Press (1990) 382–401
16.
Kfoury, A.J., Tiuryn, J., Urzyczyn, P.: ML typability is DEXPTIME-complete. In: Proceedings of CAAP '90. Volume 431 of Lecture Notes in Computer Science, Springer (1990) 206–220
17. Heintze, N., McAllester, D.A.: Linear-time subtransitive control flow analysis. (1997) 261–272
18. Flajolet, P., Sedgewick, R.: Analytic Combinatorics. 1st edn. Cambridge University Press, New York, NY, USA (2009)
19. Goldreich, O.: Computational Complexity: A Conceptual Perspective. Cambridge University Press (2008)
20. Chomsky, N., Schützenberger, M.: The algebraic theory of context-free languages. In Braffort, P., Hirschberg, D., eds.: Computer Programming and Formal Systems. Volume 35 of Studies in Logic and the Foundations of Mathematics. Elsevier (1963) 118–161
21. Flajolet, P., Sedgewick, R.: Analytic combinatorics: functional equations, rational and algebraic functions. Research Report RR-4103, INRIA (2001)

A Proof of Proposition 1

Proof. It is trivial that the image of π⟨Γ;τ⟩ is contained in Λ^α(⟨Γ; τ⟩, δ, ι, ξ) and that π preserves size. We omit the proof of the unambiguity, as it is similar to the proof of the injectivity below.

The injectivity, i.e., that e(T) ∼α e(T′) implies T = T′ for T, T′ ∈ L(N⟨Γ;τ⟩), is shown by induction on the length of the leftmost reduction sequence of T and by case analysis on the rewriting rule used first in the reduction sequence. We use simultaneous induction on all non-terminals N⟨Γ;τ⟩.

Cases of variables and λ-abstractions: These cases are clear because each rewriting rule of this kind is determined by the non-terminal on the left-hand side and the terminal on the right-hand side.

Case of N⟨Γ;τ⟩ −→ @(N⟨Γ1;σ→τ⟩, N⟨Γ2;σ⟩) where Γ = Γ1 ∪ Γ2: Let the first rewriting rule for T′ be N⟨Γ;τ⟩ −→ @(N⟨Γ1′;σ′→τ⟩, N⟨Γ2′;σ′⟩) where Γ = Γ1′ ∪ Γ2′. Let T = @(T1, T2) and T′ = @(T1′, T2′). Since e(T1)e(T2) = e(T) ∼α e(T′) = e(T1′)e(T2′), we have e(Ti) ∼α e(Ti′) and hence FV(e(Ti)) = FV(e(Ti′)) for each i = 1, 2.
Now we have FV(e(Ti)) = Dom(Γi) and FV(e(Ti′)) = Dom(Γi′), as our typing rules do not allow weakening; hence Dom(Γi) = Dom(Γi′). Since Γi, Γi′ ⊆ Γ, we obtain Γi = Γi′ for each i = 1, 2. Then we also have σ = σ′. Hence, by the induction hypothesis, Ti = Ti′ for i = 1, 2, and therefore T = T′.

Next we show the surjectivity, i.e., that for any [t]α ∈ Λ^α(⟨Γ; τ⟩, δ, ι, ξ) there exists T ∈ L(N⟨Γ;τ⟩) such that e(T) ∼α t. For δ, ι, ξ with Xξ = {x1, · · · , xξ} and N⟨Γ;τ⟩ ∈ N(δ, ι, ξ), we define a "renaming" function ρ^{(δ,ι,ξ)}_{⟨Γ;τ⟩} (ρ⟨Γ;τ⟩, or ρ for short) from {t | [t]α ∈ Λ^α(⟨Γ; τ⟩, δ, ι, ξ)} to L(N⟨Γ;τ⟩) by induction on t, so that we obtain π(ρ(t)) = [t]α. The induction is simultaneous on all ⟨Γ; τ⟩.

ρ⟨{x:τ};τ⟩(x) ≜ x
ρ⟨Γ;σ→τ⟩(λ∗^σ.t) ≜ λ∗^σ(ρ⟨Γ;τ⟩(t))   where Γ ⊢ t : τ
ρ⟨Γ;σ→τ⟩(λx^σ.t) ≜ λ∗^σ(ρ⟨Γ;τ⟩(t))   (if x ∉ FV(t)) where Γ ⊢ t : τ
ρ⟨Γ;σ→τ⟩(λx^σ.t) ≜ λx_i^σ(ρ⟨Γ∪{x_i:σ};τ⟩(t[x_i/x]))   (if x ∈ FV(t)) where Γ ∪ {x : σ} ⊢ t : τ and i ≜ min{j | x_j ∉ Dom(Γ)}
ρ⟨Γ1∪Γ2;τ⟩(t1 t2) ≜ @(ρ⟨Γ1;σ→τ⟩(t1), ρ⟨Γ2;σ⟩(t2))   where Γ1 ⊢ t1 : σ→τ and Γ2 ⊢ t2 : σ

Then π(ρ(t)) = [t]α follows by straightforward induction on the size of t. ⊓⊔

B Proof of Proposition 2

Proposition 2 follows from the following three lemmas and the property in the definition of N∅(δ, ι, ξ); the non-linearity is clear. We note that, if a regular tree grammar G is irreducible, then G is aperiodic if and only if there exists a non-terminal N ∈ N and an integer m > 0 such that #(Ln(N)) > 0 for any n > m.⁴

Lemma 5. Let δ, ι ≥ 2 and ξ ≥ 1 be integers. Then for any non-terminal N⟨∅;τ⟩ ∈ N∅(δ, ι, ξ), N⟨∅;τ⟩ is reachable from N⟨∅;o→o⟩.

Proof. Let τ = τ1→ . . . →τn→o and τi = τ^i_1→ . . . →τ^i_{ni}→o for i = 1, . . . , n. For i = 1, . . . , n, we define T_{τi} ≜ λ∗^{τ^i_1}(. . . λ∗^{τ^i_{ni}}(x1) . . . ), and then

N⟨x1:o;τi⟩ −→ λ∗^{τ^i_1}(N⟨x1:o;τ^i_2→...→τ^i_{ni}→o⟩)
 −→∗ λ∗^{τ^i_1}(. . . λ∗^{τ^i_{ni}}(N⟨x1:o;o⟩) . . . )
 −→ λ∗^{τ^i_1}(. . .
λ∗^{τ^i_{ni}}(x1) . . . ) = T_{τi}, and hence

N⟨∅;o→o⟩ −→ λx1^o(N⟨x1:o;o⟩) −→ λx1^o(@(N⟨x1:o;τ1→o⟩, N⟨x1:o;τ1⟩))
 −→∗ λx1^o(@(N⟨x1:o;τ1→o⟩, T_{τ1}))
 −→∗ λx1^o(@(@(. . . @(N⟨∅;τ⟩, T_{τ1}), . . . ), T_{τn})). ⊓⊔

Lemma 6. Let δ, ι, ξ ≥ 2 be integers. Then for any non-terminal N⟨Γ;τ⟩ ∈ N∅(δ, ι, ξ), N⟨∅;o→o⟩ is reachable from N⟨Γ;τ⟩.

Proof. By the definition of N∅(δ, ι, ξ) (and N(δ, ι, ξ)), there exists t such that Γ ⊢ t : τ. Let T ≜ π⁻¹([t]α) ∈ L(N⟨Γ;τ⟩). Now T has at least one variable, say x, and let x have type σ. Since T ∈ L(N⟨Γ;τ⟩), there exists T′ such that T′ contains just one occurrence of N⟨{x:σ};σ⟩ and

N⟨Γ;τ⟩ −→∗ T′ −→ T′[x/N⟨{x:σ};σ⟩] = T.

If σ is a function type, say σ = σ1→σ2, then since ξ > 1, we can apply "η-expansion":

T′ −→ T′[λx_i^{σ1}(N⟨{x:σ,x_i:σ1};σ2⟩)/N⟨{x:σ};σ⟩]   (i ≜ min{j | x_j ∉ Dom({x : σ})})
 −→ T′[λx_i^{σ1}(@(N⟨{x:σ};σ⟩, N⟨{x_i:σ1};σ1⟩))/N⟨{x:σ};σ⟩].

Next we focus on N⟨{x_i:σ1};σ1⟩ and repeat similar reductions until the types σ, σ1, . . . become o; note that this terminates in finitely many steps since the orders of σ, σ1, . . . are decreasing. Thus we have a tree T0 that contains N⟨{y:o};o⟩ such that N⟨Γ;τ⟩ −→∗ T0. Then we have

N⟨Γ;τ⟩ −→∗ T0 −→ T0[@(N⟨∅;o→o⟩, N⟨{y:o};o⟩)/N⟨{y:o};o⟩]

and thus N⟨∅;o→o⟩ is reachable from N⟨Γ;τ⟩. ⊓⊔

⁴ cf. Footnote 10 on p. 483 of [18]

Lemma 7. Let δ, ι ≥ 2 and ξ ≥ 1 be integers. Then for any integer n ≥ 5, the non-terminal N⟨∅;o→o⟩ of G∅(δ, ι, ξ) satisfies Ln(N⟨∅;o→o⟩) ≠ ∅.

Proof. For simplicity, in this proof we identify trees with terms, which can be justified by the size-preserving functions e in Section 3.1 and ρ in Appendix A. The proof proceeds by induction on n. For n = 5, 6, 7, the terms

λx^o.(λ∗^o.x)x        λx^o.(λ∗^{o→o}.x)(λ∗^o.x)        λx^o.(λ∗^{o→o→o}.x)(λ∗^o.λ∗^o.x)

belong to Ln(N⟨∅;o→o⟩), respectively. Next, for any t ∈ Ln(N⟨∅;o→o⟩), we have λx^o.tx ∈ Ln+3(N⟨∅;o→o⟩).
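The arithmetic behind this induction, namely that the base sizes {5, 6, 7} closed under the +3 step t ↦ λx^o.tx cover every n ≥ 5, is immediate, but easy to sanity-check (our own illustration):

```python
# Sizes reachable from the base witnesses (5, 6, 7) via the +3 step t -> λx.t x.
N = 200
reachable = {5, 6, 7}
frontier = [5, 6, 7]
while frontier:
    s = frontier.pop()
    if s + 3 <= N and s + 3 not in reachable:
        reachable.add(s + 3)
        frontier.append(s + 3)
assert reachable == set(range(5, N + 1))   # every size from 5 up to N is hit
```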
⊓⊔

In Appendix G, to prove Lemma 4, we will use the following Corollary 1, which is a consequence of Proposition 2. The following two lemmas are nothing but the irreducibility and the aperiodicity in Proposition 2, respectively, except that here we consider terms/contexts rather than trees. The proofs are given by pulling back Proposition 2 through π and π⁻¹.

Lemma 8 (irreducibility). Let δ, ι, ξ ≥ 2 be integers. For N⟨Γ;τ⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ), there exists a ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩-context S ∈ S^ν(δ, ι, ξ).

Lemma 9 (aperiodicity). Let δ, ι, ξ ≥ 2 be integers. For any integer n ≥ 5, there exists a ⟨∅; o→o⟩-term t such that [t]α ∈ Λ^α_n(δ, ι, ξ).

Corollary 1 (of Lemmas 8 and 9). Let δ, ι, ξ ≥ 2 be integers. There exists a constant d0 ≥ 1 such that for any n ≥ d0 and for any N⟨Γ;τ⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ), there exist a ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩-context S ∈ S^ν_n(δ, ι, ξ) and a ⟨Γ; τ⟩-term t such that [t]α ∈ Λ^α_n(δ, ι, ξ).

C Analytic Combinatorics

C.1 Chomsky-Schützenberger Enumeration Theorem for Regular Tree Grammars

We have an obvious translation from a regular tree grammar G to a context-free grammar G̲, obtained by replacing every rewriting rule N −→_G a(N1, · · · , N_{Σ(a)}) with N −→_{G̲} aN1 · · · N_{Σ(a)}; this translation preserves unambiguity. It also preserves size, i.e., the size of trees and the size of words. Therefore, the generating functions for an unambiguous regular tree grammar G are equal to those for the corresponding context-free grammar G̲. Also, there is a well-known translation from a context-free grammar to a system of polynomial equations: it replaces each non-terminal N by the corresponding indeterminate F_N, each terminal a by the variable z, each union ∪ by +, and each concatenation · by ×. The translation given in Section C.2 is obtained as the composite of the above two translations.
The well-known Chomsky-Schützenberger enumeration theorem [20][21, Proposition 8.7] states that the generating function of an unambiguous context-free grammar G is a solution of the corresponding system of polynomial equations. In the literature of analytic combinatorics [18,21], the notion of a context-free specification is discussed, which is just a syntactical representation of a system of polynomial equations. The above translation from context-free grammars to systems of polynomial equations can also be seen as a translation to context-free specifications (cf. Section 6.2 of [21]). Irreducibility and aperiodicity are defined also for context-free specifications in the same way [18, Definition VII.5], and the translation from regular tree grammars to context-free specifications preserves irreducibility and aperiodicity.

C.2 Polynomial Equations for the Generating Function of Λ^α(δ, ι, ξ)

For a sequence (a_n)_{n∈N} of numbers, the (ordinary) generating function of (a_n)_n is the formal power series Σ^∞_{n=0} a_n z^n. For a set S of α-equivalence classes of terms or of trees, the generating function of S is the generating function of (#(φ⁻¹(n)))_n, where φ is the size function. A function f that is analytic at 0 is regarded as the generating function of (f^(n)(0)/n!)_n by the Taylor expansion. We let [z^n]F(z) denote the operation of extracting the coefficient of z^n of the generating function F(z). We often simply write F instead of F(z).

For a regular tree grammar G and a non-terminal N of G, we write F^G_N (or F_N for short) for the generating function of L(N). Also, for δ, ι and ξ, we write F_(δ,ι,ξ) for the generating function of Λ^α(δ, ι, ξ). By Proposition 1, F_(δ,ι,ξ) is also the generating function of ∪_{N⟨∅;τ⟩∈N(δ,ι,ξ)} L(N⟨∅;τ⟩) = ∪_{N⟨∅;τ⟩∈N∅(δ,ι,ξ)} L(N⟨∅;τ⟩).

Now we explain a well-known translation [20] of a regular tree grammar G to a system of polynomial equations.
The definition of the translation itself is simple: Let G = (Σ, N, R) be a tree grammar. For each non-terminal N, we prepare an indeterminate F_N, and let

N −→ a_i(N^i_1, · · · , N^i_{Σ(a_i)})   (i = 1, . . . , m_N)

be all the rewriting rules for N. Then we define a polynomial equation P_N as follows:

F_N = Σ^{m_N}_{i=1} z F_{N^i_1} . . . F_{N^i_{Σ(a_i)}}

where the formal power series z (= 0 + z + 0z² + . . . ) is a coefficient of this polynomial. Thus we have obtained a system of polynomial equations (P_N)_{N∈N}. The Chomsky-Schützenberger enumeration theorem [20][21, Proposition 8.7] states that, if G is unambiguous, then (F_N)_{N∈N} is a solution to this system⁵.

⁵ We can translate a regular tree grammar also to a context-free grammar or a context-free specification [18], which, then, can be (trivially) transformed to the system of polynomial equations. We explain this in Appendix C.

For G(1, 1, 1) in Example 1, let us see how this translation works. Applying the translation to G(1, 1, 1), we have:

F⟨∅;o→o⟩ = zF⟨{x1:o};o⟩
F⟨{x1:o};o⟩ = z + z(F⟨{x1:o};o→o⟩ × F⟨{x1:o};o⟩) + z(F⟨∅;o→o⟩ × F⟨{x1:o};o⟩)
F⟨{x1:o};o→o⟩ = zF⟨{x1:o};o⟩

For this small example, we can solve the system of equations algebraically as follows. From the above system of equations, one can obtain a one-variable polynomial equation

2z(F⟨∅;o→o⟩)² − F⟨∅;o→o⟩ + z² = 0

by suitable substitutions. This equation has two solutions:

(1 − √(1 − 8z³))/(4z)   and   (1 + √(1 − 8z³))/(4z).

Since #(Λ^α_2(1, 1, 1)) = #({[λx^o.x]α}) = 1, F⟨∅;o→o⟩ must be the former solution. Then

F_(1,1,1) = F⟨∅;o→o⟩ = (1 − √(1 − 8z³))/(4z) = z² + 2z⁵ + 8z⁸ + 40z¹¹ + 224z¹⁴ + · · ·

and

#(Λ^α_n(1, 1, 1)) = [z^n]F_(1,1,1) = [z^n]F⟨∅;o→o⟩ = O(2^n)

where the last equality is obtained from F⟨∅;o→o⟩ = (1 − √(1 − 8z³))/(4z) and a well-known fact in analytic combinatorics [18] (namely, the exponential growth rate 2 (the 2 in O(2^n)) is given as the multiplicative inverse of the dominant singularity (the singularity closest to the origin) of F⟨∅;o→o⟩).
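The coefficients above can be cross-checked numerically: the equation 2zF² − F + z² = 0 can be rewritten as F = z² + 2zF², and the coefficients of the (unique power-series) solution can be computed by fixed-point iteration on truncated polynomials. The following quick check is our own illustration, not part of the paper's proof:

```python
# Coefficients of the power series F satisfying F = z^2 + 2 z F^2,
# truncated at degree N.
N = 15

def mul(p, q):
    """Product of two coefficient lists, truncated at degree N."""
    r = [0] * N
    for i, a in enumerate(p):
        if a:
            for j, b in enumerate(q):
                if i + j < N:
                    r[i + j] += a * b
    return r

F = [0] * N
for _ in range(N):                 # iterate F := z^2 + 2 z F^2 to the fixed point
    F2 = mul(F, F)
    F = [0] * N
    F[2] = 1                       # the z^2 term
    for i in range(N - 1):
        F[i + 1] += 2 * F2[i]      # the 2 z F^2 term

# F(1,1,1) = z^2 + 2 z^5 + 8 z^8 + 40 z^11 + 224 z^14 + ...
assert [F[n] for n in (2, 5, 8, 11, 14)] == [1, 2, 8, 40, 224]
assert all(F[n] == 0 for n in range(N) if n % 3 != 2)   # support is 2 + 3N
```

The iteration converges coefficient-wise because the coefficient of z^d on the right-hand side depends only on coefficients of F of degree below d.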
C.3 Deduction of Theorem 2

In this subsection we fill the gap between Theorem VII.5 in [18] and Theorem 2. Theorem VII.5 states the asymptotic behavior of an irreducible and aperiodic context-free specification, and as its direct consequence via the above translation, we have the following corollary.

Corollary 2 ([18]). Let G = (Σ, N, R) be an unambiguous, irreducible and aperiodic regular tree grammar. Then for any non-terminal N ∈ N, there exist constants γ_N(G) > 1 and C_N(G) > 0 such that #(Ln(N)) ∼ C_N(G)γ_N(G)^n n^{−3/2}.

The above corollary is weaker than Theorem 2: in Theorem 2 the constant γ(G) depends only on G, while in Corollary 2 the constant γ_N(G) depends on G and N. This gap can easily be filled as follows. Below we use the notion of substitution of context-trees from Section 4.1. Let N1, N2 ∈ N. Since G is irreducible, there exists an N1⇒N2-context-tree T̂. Since the function T̂[−] : L(N1) → L(N2) is injective, and |T̂[T]| = |T̂| + |T| for any T ∈ L(N1), we have #(Ln(N1)) ≤ #(L_{|T̂|+n}(N2)). Hence for sufficiently large n, the following holds:

C_{N1}(G)γ_{N1}(G)^n n^{−3/2} ≤ C_{N2}(G)γ_{N2}(G)^{|T̂|+n}(|T̂| + n)^{−3/2}.

Then γ_{N1}(G) ≤ γ_{N2}(G) must hold. By symmetry, we have γ_{N2}(G) ≤ γ_{N1}(G), and so γ_{N1}(G) = γ_{N2}(G). Thus all the non-terminals in N have the same growth rate, which is γ(G).

D Proof of Lemma 1 (upper bound of #(U^ν_n(δ, ι, ξ)))

To prove Lemma 1, we define a regular tree grammar that represents all subcontexts of Λ^α(δ, ι, ξ).

Definition 6 (Generalized grammar of Λ^α(δ, ι, ξ) and its analysis). Let δ, ι, ξ ≥ 0 be integers. The generalized grammar Ḡ∅(δ, ι, ξ) = (Σ̄(δ, ι, ξ), N∅(δ, ι, ξ), R̄∅(δ, ι, ξ)) of Λ^α(δ, ι, ξ) is defined as an extension of G∅(δ, ι, ξ) by adding new symbols and rules as follows:

– Σ̄(δ, ι, ξ) ≜ Σ(δ, ι, ξ) ∪ {h⟨Γ;τ⟩ ↦ 0 | N⟨Γ;τ⟩ ∈ N∅(δ, ι, ξ)}.
– R̄∅(δ, ι, ξ) ≜ R∅(δ, ι, ξ) ∪ {N⟨Γ;τ⟩ −→ h⟨Γ;τ⟩ | N⟨Γ;τ⟩ ∈ N∅(δ, ι, ξ)}.

We call a terminal symbol of the form h⟨Γ;τ⟩ a hole-terminal.
Intuitively, h⟨Γ;τ⟩ represents a hole [ ] with typing annotation ⟨Γ; τ⟩, and so there is an obvious embedding e_h^{(δ,ι,ξ)} (e_h for short) from Σ̄(δ, ι, ξ)-trees to contexts such that e_h(h⟨Γ;τ⟩) ≜ [ ]. Note that this embedding e_h drops the type annotations of hole-terminals. We call a tree T ∈ L(N⟨Γ;τ⟩) in which exactly one hole-terminal occurs, say h⟨Γ′;τ′⟩, a ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩-tree. It is easy to show that, for a ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩-tree T, the context e_h(T) is an inhabitant of ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩.

In the previous section, we defined the size of a tree T as the number of nodes and leaves of T. In this section, however, we re-define the size of a Σ̄(δ, ι, ξ)-tree T, written |T|, as the number of nodes and non-hole-terminal leaves of T. This definition of the size is necessary for size-consistency between Σ̄(δ, ι, ξ)-trees and contexts. For δ, ι, ξ ≥ 0 and n ≥ 0, we define

Ln(Ḡ∅(δ, ι, ξ)) ≜ ∪_{N⟨Γ;τ⟩∈N∅(δ,ι,ξ)} Ln(Ḡ∅(δ, ι, ξ), N⟨Γ;τ⟩).

Note that above all non-terminals N⟨Γ;τ⟩ are involved, rather than only those of the form N⟨∅;τ⟩, since the purpose of this grammar is to (approximately) represent all the subcontexts of closed terms.

We can prove that Ḡ∅(δ, ι, ξ) has properties similar to those of G∅(δ, ι, ξ). In general, adding a new symbol of the form a_N for each non-terminal N and adding a new rule of the form N −→ a_N does not change unambiguity, irreducibility, or aperiodicity. The last item in the following lemma follows from Theorem VII.5 in [18] (see Appendix C).

Lemma 10. Let δ, ι, ξ ≥ 2 be integers. Then the following hold.
1. Ḡ∅(δ, ι, ξ) is unambiguous, irreducible and aperiodic.
2. There exists a constant γ̄(δ, ι, ξ) ≥ γ(δ, ι, ξ) such that, for any non-terminal N⟨Γ;τ⟩ ∈ N∅(δ, ι, ξ) of Ḡ∅(δ, ι, ξ),

#(Ln(Ḡ∅(δ, ι, ξ), N⟨Γ;τ⟩)) ∼ C̄^{(δ,ι,ξ)}_{N⟨Γ;τ⟩} γ̄(δ, ι, ξ)^n n^{−3/2}

holds for some constant C̄^{(δ,ι,ξ)}_{N⟨Γ;τ⟩} > 0 determined by δ, ι, ξ and N⟨Γ;τ⟩.
Now, to prove Lemma 1, it suffices to show the following lemma.

Lemma 11. For δ, ι, ξ ≥ 0, e_h is a surjection from Ln(Ḡ∅(δ, ι, ξ)) to U^ν_n(δ, ι, ξ).

Lemma 11 immediately follows from the following lemma.

Lemma 12. Suppose that T ∈ L(Ḡ∅(δ, ι, ξ), N⟨Γ0;τ0⟩) and S′[S[t′]] = e(T), where ctt(S) = ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩. Then N⟨Γ;τ⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ) and there exists a ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩-tree T′ such that e_h(T′) = S.

Proof. The proof proceeds by induction on the length of the leftmost reduction sequence of T, by case analysis on the rewriting rule used first in the reduction sequence, and by case analysis on S′ and S. Recall that to show N⟨Γ;τ⟩ ∈ N∅(δ, ι, ξ) is to show that N⟨Γ;τ⟩ is reachable from a non-terminal of the form N⟨∅;τ′⟩. We note that, if S′ = [ ], then N⟨Γ;τ⟩ ∈ N∅(δ, ι, ξ) since ⟨Γ; τ⟩ = ⟨Γ0; τ0⟩; below, the required condition N⟨Γ;τ⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ) can be shown either by this condition S′ = [ ] or directly by the induction hypothesis.

We first consider the case S = S′ = [ ]. We have ⟨Γ0; τ0⟩ = ⟨Γ; τ⟩ = ⟨Γ′; τ′⟩, and because of the rule N⟨Γ;τ⟩ −→ h⟨Γ;τ⟩, T′ ≜ h⟨Γ;τ⟩ is a ⟨Γ; τ⟩ ⇒ ⟨Γ; τ⟩-tree and e_h(T′) = [ ] = S.

From now on, we perform the case analysis on the rewriting rule used first in the reduction sequence of T.

Case of N⟨{x_i:τ0};τ0⟩ −→ x_i: Since e(T) = x_i, we have S′ = S = [ ].

Case of N⟨Γ0;σ0→τ0⟩ −→ λ∗^{σ0}(N⟨Γ0;τ0⟩): Let T = λ∗^{σ0}(T1); then e(T) = λ∗^{σ0}.e(T1) = S′[S[t′]]. Now S′ is either λ∗^{σ0}.S1′ or [ ]. In the former case, by the induction hypothesis for T1 with S1′[S[t′]] = e(T1), we directly obtain the required properties: N⟨Γ;τ⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ) and there exists a ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩-tree T′ such that e_h(T′) = S. In the case that S′ = [ ], by the typing rules of terms, ⟨Γ0; σ0→τ0⟩ = ⟨Γ; τ⟩, and S is either λ∗^{σ0}.S1 or [ ].
In the former case, by the typing rules of terms, we have ctt(S1) = ⟨Γ′; τ′⟩ ⇒ ⟨Γ0; τ0⟩. Then, by the induction hypothesis for T1 with ([ ])[S1[t′]] = e(T1), we obtain the following properties: N⟨Γ0;τ0⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ) and there exists a ⟨Γ′; τ′⟩ ⇒ ⟨Γ0; τ0⟩-tree T1′ such that e_h(T1′) = S1. By combining N⟨Γ0;σ0→τ0⟩ −→ λ∗^{σ0}(N⟨Γ0;τ0⟩) and the fact that T1′ is a ⟨Γ′; τ′⟩ ⇒ ⟨Γ0; τ0⟩-tree, we find that T′ ≜ λ∗^{σ0}(T1′) is a ⟨Γ′; τ′⟩ ⇒ ⟨Γ0; σ0→τ0⟩-tree and e_h(T′) = λ∗^{σ0}.e_h(T1′) = λ∗^{σ0}.S1 = S.

Case of N⟨Γ0;σ0→τ0⟩ −→ λx_i^{σ0}(N⟨Γ0∪{x_i:σ0};τ0⟩) where #(Γ0) < ξ and i = min{j | x_j ∉ Dom(Γ0)}: Let T = λx_i^{σ0}(T1); then e(T) = λx_i^{σ0}.e(T1) = S′[S[t′]]. Now S′ is either λx_i^{σ0}.S1′ or [ ]. In the former case, by the induction hypothesis for T1 with S1′[S[t′]] = e(T1), we directly obtain the required properties: N⟨Γ;τ⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ) and there exists a ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩-tree T′ such that e_h(T′) = S. In the case that S′ = [ ], by the typing rules of terms, ⟨Γ0; σ0→τ0⟩ = ⟨Γ; τ⟩, and S is either λx_i^{σ0}.S1 or [ ]. In the former case, by the typing rules of terms, we have ctt(S1) = ⟨Γ′; τ′⟩ ⇒ ⟨Γ0 ∪ {x_i : σ0}; τ0⟩. Then, by the induction hypothesis for T1 with ([ ])[S1[t′]] = e(T1), we obtain the following properties: N⟨Γ0∪{x_i:σ0};τ0⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ) and there exists a ⟨Γ′; τ′⟩ ⇒ ⟨Γ0 ∪ {x_i : σ0}; τ0⟩-tree T1′ such that e_h(T1′) = S1. By combining N⟨Γ0;σ0→τ0⟩ −→ λx_i^{σ0}(N⟨Γ0∪{x_i:σ0};τ0⟩) and the fact that T1′ is a ⟨Γ′; τ′⟩ ⇒ ⟨Γ0 ∪ {x_i : σ0}; τ0⟩-tree, we find that T′ ≜ λx_i^{σ0}(T1′) is a ⟨Γ′; τ′⟩ ⇒ ⟨Γ0; σ0→τ0⟩-tree and e_h(T′) = λx_i^{σ0}.e_h(T1′) = λx_i^{σ0}.S1 = S.

Case of N⟨Γ0;τ0⟩ −→ @(N⟨Γ1;σ0→τ0⟩, N⟨Γ2;σ0⟩) where Γ0 = Γ1 ∪ Γ2: Let T = @(T1, T2); then e(T) = e(T1)e(T2) = S′[S[t′]]. Now S′ is either S1′t2′, t1′S2′ or [ ]. In the first case, we can use the induction hypothesis for T1 with S1′[S[t′]] = e(T1),
and we directly obtain the required properties: N⟨Γ;τ⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ) and there exists a ⟨Γ′; τ′⟩ ⇒ ⟨Γ; τ⟩-tree T′ such that e_h(T′) = S. The second case is similar. In the case that S′ = [ ], by the typing rules of terms, ⟨Γ0; τ0⟩ = ⟨Γ; τ⟩, and S is either S1t2, t1S2 or [ ]. In the first case, by the typing rules of terms, we have

Γ1′ ⊢ S1[t′] : σ0′→τ0        Γ2′ ⊢ t2 : σ0′

as well as ctt(S1) = ⟨Γ′; τ′⟩ ⇒ ⟨Γ1′; σ0′→τ0⟩, where Γ1′ ∪ Γ2′ = Γ0. Since T1 ∈ L(N⟨Γ1;σ0→τ0⟩), we have Γ1 ⊢ e(T1) : σ0→τ0. Now e(T1) = S1[t′], and hence

Dom(Γ1′) = FV(S1[t′]) = FV(e(T1)) = Dom(Γ1),

and therefore Γ1′ = Γ1 since Γ1′, Γ1 ⊆ Γ0. Then we also have σ0′ = σ0. Now, we can use the induction hypothesis for T1 with ([ ])[S1[t′]] = e(T1), and obtain the following properties: N⟨Γ1;σ0→τ0⟩, N⟨Γ′;τ′⟩ ∈ N∅(δ, ι, ξ) and there exists a ⟨Γ′; τ′⟩ ⇒ ⟨Γ1; σ0→τ0⟩-tree T1′ such that e_h(T1′) = S1. By combining the following facts:

– T1′ is a ⟨Γ′; τ′⟩ ⇒ ⟨Γ1; σ0→τ0⟩-tree,
– N⟨Γ0;τ0⟩ −→ @(N⟨Γ1;σ0→τ0⟩, N⟨Γ2;σ0⟩),
– T2 ∈ L(G∅(δ, ι, ξ), N⟨Γ2;σ0⟩) and e(T2) = t2,

we find that T′ ≜ @(T1′, T2) is a ⟨Γ′; τ′⟩ ⇒ ⟨Γ0; τ0⟩-tree. Also we have e_h(T′) = e_h(T1′)t2 = S1t2 = S. The case that S = t1S2 is similar. ⊓⊔

Since there is a surjection from Ln(Ḡ∅(δ, ι, ξ)) to U^ν_n(δ, ι, ξ), we have

#(Ln(Ḡ∅(δ, ι, ξ))) ≥ #(U^ν_n(δ, ι, ξ)).   (2)

This ends the proof of Lemma 1.

E Proof of Lemma 2 (decomposition)

In order to prove Lemma 2, we give a complete definition of Φm (note that the definition of Φm given in Section 4 is a simplified version which ignores the context-typing/size annotations of second-order holes). We first prove Items 2 and 3 of Lemma 2, and then prove Item 1. In the rest of this section, we always assume n ≥ 1 and m ≥ 2.

E.1 Complete Definition of Φm

The set of second-order context typings ζ and the typing rules for E are given as follows.

ζ ::= κ1 · · · κk ⇒ θ   (k ∈ N)

⊢ Ei : κ^i_1 · · · κ^i_{ki} ⇒ θi   (i ∈ {1, . . .
, k})   κ = θ1 · · · θk ⇒ θ
――――――――――――――――――――――――――――
⊢ ⟦ ⟧^κ_n[E1] · · · [Ek] : κ^1_1 · · · κ^1_{k1} · · · κ^k_1 · · · κ^k_{kk} ⇒ θ

⊢ x : ⟨x : τ; τ⟩

⊢ E1 : κ^1_1 · · · κ^1_{k1} ⇒ ⟨Γ1; σ→τ⟩    ⊢ E2 : κ^2_1 · · · κ^2_{k2} ⇒ ⟨Γ2; σ⟩
――――――――――――――――――――――――――――
⊢ E1 E2 : κ^1_1 · · · κ^1_{k1} κ^2_1 · · · κ^2_{k2} ⇒ ⟨Γ1 ∪ Γ2; τ⟩

⊢ E : κ1 · · · κk ⇒ ⟨Γ′; τ⟩    Γ′ = Γ or Γ ∪ {x : σ}    x ∉ Dom(Γ)
――――――――――――――――――――――――――――
⊢ λx^σ.E : κ1 · · · κk ⇒ ⟨Γ; σ→τ⟩

Below we consider only well-typed second-order contexts. We identify a second-order context typing ⇒θ with θ. We define the size of E as follows:

|⟦ ⟧^κ_n[E1] · · · [Ek]| ≜ n + |E1| + · · · + |Ek|        |x| ≜ 1
|λx^τ.E| ≜ |E| + 1        |E1 E2| ≜ |E1| + |E2| + 1

Note that |E⟦C⟧| = |E⟦P⟧| = |E|. We define κ̄(⟦ ⟧^κ_n) ≜ κ. For a ⟨Γ; τ⟩-term t and a 0/1-subcontext occurrence u of t, we write κ̄(u ⪯ t : ⟨Γ; τ⟩) for the context-typing of u obtained from the derivation of Γ ⊢ t : τ. For E such that shn(E) ≥ 1, we write E↾^κ_n for the second-order context obtained by replacing E.1 with ⟦ ⟧^κ_n. We omit κ if κ = κ̄(E.1), and n if n = |E.1|. We write E↾(u ⪯ t) (or E↾u) for E↾^{κ̄(u⪯t:⟨Γ;τ⟩)}_{|u|}, if Γ, τ (and t) are clear from the context.

Now, we define the "decomposition" of a term. Below we consider only second-order contexts whose second-order holes are of the form ⟦ ⟧^{⟨Γ;τ⟩}_n or ⟦ ⟧^{⟨Γ′;τ′⟩⇒⟨Γ;τ⟩}_n. For a ⟨Γ; τ⟩-term t, we define Φm(t) = (E, u, P) with the following properties:

– u · P : E and E⟦u · P⟧ = t;
– u occurs at the root of t: E is of the form ⟦ ⟧^κ_n[E1] · · · [Ek] where k ≤ 1;
– |u| < m.

We define Φm(t) by induction on the size of t as follows; in this definition we use the above properties (for the operation −↾(− ⪯ −)), and thus the induction is simultaneous.

– If |t| < m, then Φm(t) ≜ (⟦ ⟧^{⟨Γ;τ⟩}_{|t|}, t, ϵ).
– If |t| ≥ m, then:

Φm(λx^τ.t1) ≜
  (E1↾(λx^τ.u1 ⪯ λx^τ.t1), λx^τ.u1, P1)
      if Φm(t1) = (E1, u1, P1) and |λx^τ.u1| < m
  (⟦ ⟧^{⟨Γ;τ⟩⇒⟨Γ;τ⟩}_0[E1↾(λx^τ.u1 ⪯ λx^τ.t1)], [ ], (λx^τ.u1) · P1)
      if Φm(t1) = (E1, u1, P1) and |λx^τ.u1| = m

Φm(t1 t2) ≜
  (⟦ ⟧^{⟨Γ;τ⟩⇒⟨Γ;τ⟩}_0[(E1⟦u1⟧)(E2⟦u2⟧)], [ ], P1 · P2)
      if Φm(ti) = (Ei, ui, Pi) and |ti| ≥ m for each i ∈ {1, 2}
  (E1↾(u1 t2 ⪯ t1 t2), u1 t2, P1)
      if Φm(t1) = (E1, u1, P1), |t1| ≥ m, |t2| < m and |u1 t2| < m
  (⟦ ⟧^{⟨Γ;τ⟩⇒⟨Γ;τ⟩}_0[E1↾(u1 t2 ⪯ t1 t2)], [ ], (u1 t2) · P1)
      if Φm(t1) = (E1, u1, P1), |t1| ≥ m, |t2| < m and |u1 t2| ≥ m
  (E2↾(t1 u2 ⪯ t1 t2), t1 u2, P2)
      if Φm(t2) = (E2, u2, P2), |t1| < m and |t1 u2| < m
  (⟦ ⟧^{⟨Γ;τ⟩⇒⟨Γ;τ⟩}_0[E2↾(t1 u2 ⪯ t1 t2)], [ ], (t1 u2) · P2)
      if Φm(t2) = (E2, u2, P2), |t1| < m and |t1 u2| ≥ m

It is trivial that Φm is well-defined, and that E, u, and each component of P are well-typed. The decomposition function Φ̂m is then defined by Φ̂m(t) ≜ (E⟦u⟧, P) where (E, u, P) = Φm(t).

E.2 Proof of Items 2 and 3 of Lemma 2

The following lemma can be shown by straightforward induction on the size of t. Item 4 amounts to Item 2 of Lemma 2.

Lemma 13. For Φm(t) = (E, u, P):
1. u · P : E and E⟦u · P⟧ = t;
2. u occurs at the root of t: E is of the form ⟦ ⟧^κ_n[E1] · · · [Ek] where k ≤ 1;
3. |u| < m;
4. P = u1 · · · uℓ where m ≤ |ui| < 2m for each 1 ≤ i ≤ ℓ.

Now we prove Item 3 of Lemma 2. Recall that we write #(P) for the length of a sequence of contexts P. Note that shn(E) − 1 = #(P) always holds for any t and Φm(t) = (E, u, P).

Lemma 14. For any term t and m such that 2 ≤ m ≤ |t|, if Φm(t) = (E, u, P), then #(P) ≥ |t|/4m.

Proof. We show below that |u| + 4m#(P) − 2m ≥ |t| by induction on t. Then it follows that 4m#(P) ≥ |t| + 2m − |u| > |t| + 2m − m > |t|, which implies #(P) > |t|/4m, as required.

– Case t = λx^τ.t1: By the assumption |t| ≥ m, we have |t1| ≥ m − 1. If |t1| = m − 1, then Φm(t1) = (⟦ ⟧, t1, ϵ).
Thus, Φ_m(t) = ([[ ]][ [[ ]] ], [ ], λx^τ.t1). Therefore, we have |u| + 4m·#(P) − 2m = 0 + 4m − 2m = 2m ≥ m = |t| as required.

If |t1| ≥ m and Φ_m(t1) = (E1, u1, P1), then we have |u1| + 4m·#(P1) − 2m ≥ |t1| by the induction hypothesis. If |λx^τ.u1| < m, then

|u| + 4m·#(P) − 2m = |u1| + 1 + 4m·#(P1) − 2m ≥ |t1| + 1 = |t|

as required. If |λx^τ.u1| = m, then

|u| + 4m·#(P) − 2m = 0 + 4m(#(P1) + 1) − 2m = 4m·#(P1) + 2m ≥ (|t1| + 2m − |u1|) + 2m > |t1| + 1 = |t|

as required.

– Case t = t1 t2: Let Φ_m(ti) = (Ei, ui, Pi). We perform further case analysis.

• Case |ti| ≥ m for each i ∈ {1, 2}: In this case, we have P = P1 · P2 and u = [ ]. By the induction hypothesis, we have |ui| + 4m·#(Pi) − 2m ≥ |ti| for i ∈ {1, 2}. Thus,

|u| + 4m·#(P) − 2m = 0 + 4m(#(P1) + #(P2)) − 2m
≥ (|t1| + 2m − |u1|) + (|t2| + 2m − |u2|) − 2m
= |t1| + |t2| + (m − |u1|) + (m − |u2|)
≥ |t1| + |t2| + 2 > |t|

as required.

• Case |t1| ≥ m and |u1 t2| < m: In this case, we have P = P1 and u = u1 t2. By the induction hypothesis, we have |u1| + 4m·#(P1) − 2m ≥ |t1|. Thus, we have:

|u| + 4m·#(P) − 2m = |u1| + |t2| + 1 + 4m·#(P1) − 2m
≥ |u1| + |t2| + 1 + (|t1| + 2m − |u1|) − 2m
= |t1| + |t2| + 1 = |t|

as required.

• Case |t1| ≥ m, |t2| < m and |u1 t2| ≥ m: In this case, we have P = (u1 t2) · P1 and u = [ ]. By the induction hypothesis, we have |u1| + 4m·#(P1) − 2m ≥ |t1|. Thus, we have

|u| + 4m·#(P) − 2m = 0 + 4m(#(P1) + 1) − 2m ≥ (|t1| + 2m − |u1|) + 2m > |t1| + 4m − m > |t1| + |t2| + 1 = |t|

as required.

• Case |t1| < m and |t1 u2| < m: It must be the case that |t2| ≥ m, because otherwise we would have u2 = t2, which would imply |t| = |t1 t2| = |t1 u2| < m, contradicting the assumption |t| ≥ m. Thus, the required result can be obtained in a manner similar to the case for |t1| ≥ m and |u1 t2| < m.

• Case |t1| < m and |t1 u2| ≥ m: In this case, we have P = (t1 u2) · P2 and u = [ ].
If |t2| ≥ m, then the required result can be obtained in a manner similar to the case for |t1| ≥ m, |t2| < m and |u1 t2| ≥ m. If |t2| < m, then P2 = ϵ and u2 = t2. Therefore, we have:

|u| + 4m·#(P) − 2m = 0 + 4m − 2m = 2m ≥ |t1| + |t2| + 1 = |t|

as required. ⊓⊔

E.3 Proof of Item 1 of Lemma 2

For n > 0 and m > 2, we define an auxiliary set B̂^m_n(δ, ι, ξ) of second-order contexts as:

B̂^m_n(δ, ι, ξ) ≜ {(E, u) | (E, u, P) = Φ_m(ν([t]_α)) for some [t]_α ∈ Λ^α_n(δ, ι, ξ)}.

Recall that we have defined B^m_n(δ, ι, ξ) as:

B^m_n(δ, ι, ξ) = {E | (E, P) = Φ̂_m(ν([t]_α)) for some [t]_α ∈ Λ^α_n(δ, ι, ξ)},

and obviously, we have B^m_n(δ, ι, ξ) = {E⟦u⟧ | (E, u) ∈ B̂^m_n(δ, ι, ξ)}.

We call a sequence P of contexts a good context sequence for (E, u, m) if u · P : E and Φ_m(E⟦u · P⟧) = (E, u, P). We write P^m_{E,u}(δ, ι, ξ) for the set of all good context sequences for (E, u, m). We define a subset P̂^m_{E,u}(δ, ι, ξ) of P^m_{E,u}(δ, ι, ξ) as:

P̂^m_{E,u}(δ, ι, ξ) ≜ {P ∈ P^m_{E,u}(δ, ι, ξ) | E⟦u · P⟧ = ν([E⟦u · P⟧]_α) and [E⟦u · P⟧]_α ∈ Λ^α(δ, ι, ξ)}.

In the rest of this subsection, we will show that the following two properties hold:

– Φ̂_m induces a bijection between Λ^α_n(δ, ι, ξ) and ∐_{(E,u)∈B̂^m_n(δ,ι,ξ)} P̂^m_{E,u}(δ, ι, ξ).
– P̂^m_{E,u}(δ, ι, ξ) = ∏_{2≤i≤shn(E)} Û^m_{Ei}(δ, ι, ξ) for (E, u) ∈ B̂^m_n(δ, ι, ξ).

Item 1 of Lemma 2 is a direct consequence of these two properties. First we show the coproduct part.

Lemma 15. For any δ, ι, ξ ≥ 0, n ≥ 1 and m ≥ 2, there is a bijection

Λ^α_n(δ, ι, ξ) ≅ ∐_{(E,u)∈B̂^m_n(δ,ι,ξ)} P̂^m_{E,u}(δ, ι, ξ)  ( = ⊎_{(E,u)∈B̂^m_n(δ,ι,ξ)} {(E, u)} × P̂^m_{E,u}(δ, ι, ξ) ).

Proof. We define a function

f : ∐_{(E,u)∈B̂^m_n(δ,ι,ξ)} P̂^m_{E,u}(δ, ι, ξ) −→ Λ^α_n(δ, ι, ξ)

by f((E, u), P) ≜ [E⟦u · P⟧]_α, and a function

g : Λ^α_n(δ, ι, ξ) −→ ∐_{(E,u)∈B̂^m_n(δ,ι,ξ)} P̂^m_{E,u}(δ, ι, ξ)

by g([t]_α) ≜ ((E, u), P) where (E, u, P) = Φ_m(ν([t]_α)).
Then we have

f(g([t]_α)) = f((E, u), P) = [E⟦u · P⟧]_α = [ν([t]_α)]_α = [t]_α

and

g(f((E, u), P)) = g([E⟦u · P⟧]_α) = ((E, u), P),

where the last equation holds since Φ_m(ν([E⟦u · P⟧]_α)) = Φ_m(E⟦u · P⟧) = (E, u, P) due to P ∈ P̂^m_{E,u}(δ, ι, ξ). ⊓⊔

Next we show the product part.

Lemma 16. For δ, ι, ξ ≥ 0, n ≥ 1, m ≥ 2 and (E, u) ∈ B̂^m_n(δ, ι, ξ),

P̂^m_{E,u}(δ, ι, ξ) = Û^m_{E2}(δ, ι, ξ) × ··· × Û^m_{E shn(E)}(δ, ι, ξ).

To prove Lemma 16, we take two steps, Lemmas 20 and 23, with some auxiliary lemmas. For m ≥ 2, E and i ∈ {1, . . . , shn(E)}, we define

U^m_{Ei} ≜ {u | u : Ei and u is good for m}.

Recall that, for δ, ι, ξ ≥ 0, we have defined Û^m_{Ei}(δ, ι, ξ) as:

Û^m_{Ei}(δ, ι, ξ) = {u ∈ U^ν(δ, ι, ξ) | u : Ei and u is good for m} = U^m_{Ei} ∩ U^ν(δ, ι, ξ).

For sets U1, . . . , Uk of 0/1-contexts, we use the following notation:

U1 × ··· × Uk ≜ {u1 ··· uk | ui ∈ Ui for each i}.

It is easy to show, by induction on C, that if C ∈ C(θ1 ··· θk ⇒ θ), 1 ≤ i ≤ k and C′ ∈ C(θ′_1 ··· θ′_{k′} ⇒ θi), then C[C′]_i ∈ C(θ1 ··· θ_{i−1} θ′_1 ··· θ′_{k′} θ_{i+1} ··· θk ⇒ θ).

For a κ-context C with its derivation ∆ and a subcontext occurrence C′ of C, we write κ̄(C′ ⪯ C : κ) for the context typing of C′ that is determined by the corresponding sub-derivation of ∆. Thus we have the following two trivial facts; we may use these two lemmas without referring to them explicitly.

Lemma 17. For κ = θ1 ··· θn ⇒ θ, κi = θ^i_1 ··· θ^i_{n_i} ⇒ θi (i = 1, . . . , n), C ∈ C^ν(κ, δ, ι, ξ) and Ci ∈ C^ν(κi, δ, ι, ξ), let κ′ = θ^1_1 ··· θ^1_{n_1} ··· θ^n_1 ··· θ^n_{n_n} ⇒ θ; then C[C1] ··· [Cn] ∈ C^ν(κ′, δ, ι, ξ).

Lemma 18. For C ∈ C^ν(κ, δ, ι, ξ) and C′ ⪯ C, let κ′ = κ̄(C′ ⪯ C : κ). Then C′ ∈ C^ν(κ′, δ, ι, ξ).

Lemma 19. For any δ, ι, ξ ≥ 0, E, typing θ, and two context sequences P and P′ of the same length k (i.e., #(P) = #(P′) = k), if P′ : E, P : E, E⟦P′⟧ ∈ C^ν(θ, δ, ι, ξ) and C_i ∈ C^ν(κ̄(Ei), δ, ι, ξ) for all i ≤ k, then E⟦P⟧ ∈ C^ν(θ, δ, ι, ξ).

Proof.
Let P = C1 ··· Ck and P′ = C′_1 ··· C′_k. The proof proceeds by induction on E. Below we use Lemmas 17 and 18.

– Case E = [[ ]]^{θ1···θℓ⇒θ}_{m1} [E1] ··· [Eℓ]: Let k_1 = 2, k_{i+1} = k_i + shn(E_i) for 1 ≤ i < ℓ, and k′_i = k_i + shn(E_i) − 1 for 1 ≤ i ≤ ℓ; thus (C′_{k_i} ··· C′_{k′_i}), (C_{k_i} ··· C_{k′_i}) : E_i for each i ≤ ℓ. Since

E⟦P′⟧ = C′_1[E1⟦C′_{k_1} ··· C′_{k′_1}⟧] ··· [Eℓ⟦C′_{k_ℓ} ··· C′_{k′_ℓ}⟧] ∈ C^ν(θ, δ, ι, ξ),

we have E_i⟦C′_{k_i} ··· C′_{k′_i}⟧ ∈ C^ν(θ_i, δ, ι, ξ) for each i ≤ ℓ. By the induction hypothesis, E_i⟦C_{k_i} ··· C_{k′_i}⟧ ∈ C^ν(θ_i, δ, ι, ξ) for each i ≤ ℓ, and hence

E⟦P⟧ = C_1[E1⟦C_{k_1} ··· C_{k′_1}⟧] ··· [Eℓ⟦C_{k_ℓ} ··· C_{k′_ℓ}⟧] ∈ C^ν(θ, δ, ι, ξ).

– Case E = x: Clear.

– Case E = λx^σ.E′: We have E⟦P′⟧ = λx^σ.E′⟦P′⟧ ∈ C^ν(θ, δ, ι, ξ); let θ′ = κ̄(E′⟦P′⟧ ⪯ λx^σ.E′⟦P′⟧ : θ). Then E′⟦P′⟧ ∈ C^ν(θ′, δ, ι, ξ) and λx^σ.[ ] ∈ C^ν(θ′⇒θ, δ, ι, ξ). By the induction hypothesis, E′⟦P⟧ ∈ C^ν(θ′, δ, ι, ξ), and hence E⟦P⟧ = λx^σ.E′⟦P⟧ ∈ C^ν(⟨Γ; σ→τ⟩, δ, ι, ξ).

– Case E = E1 E2: Let k_1 = 1, k′_1 = shn(E1), k_2 = shn(E1) + 1, and k′_2 = k. Then E⟦P′⟧ = E1⟦C′_{k_1} ··· C′_{k′_1}⟧ E2⟦C′_{k_2} ··· C′_{k′_2}⟧ ∈ C^ν(θ, δ, ι, ξ). Let θ_i = κ̄(E_i⟦C′_{k_i} ··· C′_{k′_i}⟧ ⪯ E⟦P′⟧ : θ). We have E_i⟦C′_{k_i} ··· C′_{k′_i}⟧ ∈ C^ν(θ_i, δ, ι, ξ) and [ ][ ] ∈ C^ν(θ1θ2⇒θ, δ, ι, ξ). By the induction hypothesis, E_i⟦C_{k_i} ··· C_{k′_i}⟧ ∈ C^ν(θ_i, δ, ι, ξ). Hence, E⟦P⟧ = E1⟦C_{k_1} ··· C_{k′_1}⟧ E2⟦C_{k_2} ··· C_{k′_2}⟧ ∈ C^ν(θ, δ, ι, ξ). ⊓⊔

Lemma 20. For δ, ι, ξ ≥ 0, n ≥ 1, m ≥ 2 and (E, u) ∈ B̂^m_n(δ, ι, ξ), we have

P̂^m_{E,u}(δ, ι, ξ) = {P ∈ P^m_{E,u}(δ, ι, ξ) | P_i ∈ U^ν(δ, ι, ξ) for each i ≤ shn(E) − 1}
= P^m_{E,u}(δ, ι, ξ) ∩ (U^ν(δ, ι, ξ))^{shn(E)−1}.

Proof. Let P ∈ P̂^m_{E,u}(δ, ι, ξ). By the definition of P̂^m_{E,u}(δ, ι, ξ), we have P ∈ P^m_{E,u}(δ, ι, ξ), E⟦u · P⟧ = ν([E⟦u · P⟧]_α) and [E⟦u · P⟧]_α ∈ Λ^α(δ, ι, ξ). Hence E⟦u · P⟧ ∈ U^ν(δ, ι, ξ), and by Lemma 18, P ∈ (U^ν(δ, ι, ξ))^{shn(E)−1}. Thus P ∈ P^m_{E,u}(δ, ι, ξ) ∩ (U^ν(δ, ι, ξ))^{shn(E)−1}.
Conversely, let P ∈ P^m_{E,u}(δ, ι, ξ) ∩ (U^ν(δ, ι, ξ))^{shn(E)−1}. Since (E, u) ∈ B̂^m_n(δ, ι, ξ), there exists P′ ∈ P̂^m_{E,u}(δ, ι, ξ). Hence P′ ∈ P^m_{E,u}(δ, ι, ξ), E⟦u · P′⟧ = ν([E⟦u · P′⟧]_α) and [E⟦u · P′⟧]_α ∈ Λ^α(δ, ι, ξ). Therefore E⟦u · P′⟧ ∈ C^ν(⟨∅; τ⟩, δ, ι, ξ) for some τ ∈ Types(δ, ι). By Lemma 19, E⟦u · P⟧ ∈ C^ν(⟨∅; τ⟩, δ, ι, ξ), and hence E⟦u · P⟧ = ν([E⟦u · P⟧]_α) and [E⟦u · P⟧]_α ∈ Λ^α(δ, ι, ξ). Thus, P ∈ P̂^m_{E,u}(δ, ι, ξ). ⊓⊔

Lemma 21. For m ≥ 2 and Φ_m(t) = (E, u, P), the following are equivalent:
– |t| ≥ m
– shn(E) ≥ 2
– u is a 1-context.

Proof. Straightforward induction on |t|. ⊓⊔

In the next two lemmas, we use the fact that P ∈ P^m_{E,u}(δ, ι, ξ) if and only if Φ_m(t) = (E, u, P) for some t. The following lemma states that, for Φ_m(u[t]) = (E, u, P), the sequence P is determined only by t (u does not matter), i.e., the decomposition is performed in a bottom-up manner.

Lemma 22. For m ≥ 2 and a θ′-term t such that |t| ≥ m, let Φ_m(t) = (E, u, P) and κ̄(u ⪯ t : θ′) = θ⇒θ′. Then

P^m_{E,u}(δ, ι, ξ) = P^m_{E↾^{θ⇒θ}_0, [ ]}(δ, ι, ξ).

Proof. The proof proceeds by induction on the size of u. Note that, by the assumption |t| ≥ m and Lemma 21, u is a 1-context. The case where u = [ ] is trivial.

– Case u = λx^σ.u1: By the definition of Φ_m, we have P^m_{E,u}(δ, ι, ξ) = P^m_{E↾u1, u1}(δ, ι, ξ). Then by the induction hypothesis, P^m_{E↾u1, u1}(δ, ι, ξ) = P^m_{E↾^{θ⇒θ}_0, [ ]}(δ, ι, ξ).

– Case u = u1 t2: Since u is a 1-context, it must be the case that shn(E) > 1. Therefore, for every P1 ∈ P^m_{E↾u1, u1}(δ, ι, ξ), |(E↾u1)⟦u1 · P1⟧| ≥ m. Thus, we have P1 ∈ P^m_{E, u1 t2}(δ, ι, ξ) by the definition of P^m_{E↾u1, u1}(δ, ι, ξ). Conversely, if P1 ∈ P^m_{E, u1 t2}(δ, ι, ξ), then (E, u1 t2, P1) must have been computed by the second case of Φ_m(t1 t2); therefore, we have P1 ∈ P^m_{E↾u1, u1}(δ, ι, ξ). Thus, we have P^m_{E,u}(δ, ι, ξ) = P^m_{E↾u1, u1}(δ, ι, ξ) = P^m_{E↾^{θ⇒θ}_0, [ ]}(δ, ι, ξ) as required.

– Case u = t1 u2: We omit the proof, as it is similar to that of the previous case.
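The case analysis that drives Φ_m (and hence the size bounds of Lemmas 13 and 14) can be exercised on concrete terms. The following Python sketch is only an illustration: it works on untyped term shapes and tracks sizes alone (the piece sizes |u_i| and the length #(P)), ignoring typing, α-renaming and the parameters (δ, ι, ξ); all names below (`phi_sizes`, `random_term`) are ours, not notation from this paper.

```python
import random

# Terms as nested tuples: ('var',), ('lam', t), ('app', t1, t2).
def size(t):
    if t[0] == 'var':
        return 1
    if t[0] == 'lam':
        return 1 + size(t[1])
    return 1 + size(t[1]) + size(t[2])

def phi_sizes(t, m):
    """Size-only analogue of Phi_m: returns (|u|, [|u_1|, ..., |u_l|]),
    mirroring the case analysis of Phi_m (typing ignored)."""
    if size(t) < m:
        return size(t), []              # base case: t itself is the root piece
    if t[0] == 'lam':
        u1, p1 = phi_sizes(t[1], m)
        # |lam x. u1| = |u1| + 1; if it reaches m, it becomes a new piece
        return (u1 + 1, p1) if u1 + 1 < m else (0, [u1 + 1] + p1)
    t1, t2 = t[1], t[2]
    if size(t1) >= m and size(t2) >= m:
        _, p1 = phi_sizes(t1, m)
        _, p2 = phi_sizes(t2, m)
        return 0, p1 + p2               # u1, u2 stay inside the skeleton E
    if size(t1) >= m:                   # hence |t2| < m
        u1, p1 = phi_sizes(t1, m)
        u = u1 + size(t2) + 1
        return (u, p1) if u < m else (0, [u] + p1)
    u2, p2 = phi_sizes(t2, m)           # |t1| < m
    u = size(t1) + u2 + 1
    return (u, p2) if u < m else (0, [u] + p2)

def random_term(n):
    """Random term shape with exactly n nodes (binders/variables ignored)."""
    if n <= 1:
        return ('var',)
    if n == 2 or random.random() < 0.5:
        return ('lam', random_term(n - 1))
    k = random.randint(1, n - 2)
    return ('app', random_term(k), random_term(n - 1 - k))
```

On random inputs one can then observe the invariants |u| < m, m ≤ |u_i| < 2m for every piece (Item 4 of Lemma 13), and #(P) ≥ |t|/(4m) (Lemma 14).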
The following lemma shows that the components of a good context sequence are independent of each other.

Lemma 23. For n ≥ 1, m ≥ 2 and (E, u),

P^m_{E,u}(δ, ι, ξ) = U^m_{E2} × ··· × U^m_{E shn(E)}.

Proof. For readability, we omit the parameters (δ, ι, ξ), and write P^m_{E,u} for P^m_{E,u}(δ, ι, ξ). Let k = shn(E). The proof proceeds by induction on k ≥ 1. When k = 1, P^m_{E,u} = U^m_{E2} × ··· × U^m_{Ek} = {ϵ}.

When k ≥ 2, let us pick P ∈ P^m_{E,u}; then (E, u, P) = Φ_m(E⟦u · P⟧). Now we perform a case analysis on how E has been computed. Since k ≥ 2, u is a 1-context, and hence by Lemma 22, we can assume u = [ ] in this case analysis.

– Case E = [[ ]][E′]: In this case, E is computed by the second case of Φ_m(λx^{τ′}.t′), or the third or fifth case of Φ_m(t′_1 t′_2). By the fact we mentioned above,

P^m_{E,[ ]} = {P | Φ_m(t) = (E, [ ], P) for some t}
= {λx^τ.u1 · P1 | P1 ∈ P^m_{E′,u1}, |λx^τ.u1| = m, λx^τ.u1 : E2}
∪ {u1 t2 · P1 | P1 ∈ P^m_{E′,u1}, |E′⟦u1 · P1⟧| ≥ m, |t2| < m, |u1 t2| ≥ m, u1 t2 : E2}
∪ {t1 u2 · P2 | P2 ∈ P^m_{E′,u2}, |t1| < m, |t1 u2| ≥ m, t1 u2 : E2}

(by the definition of Φ_m(t)). Since, by Lemma 21, |E′⟦u1 · P1⟧| ≥ m if and only if shn(E′) > 1, we further have:

P^m_{E,[ ]} = {λx^τ.u1 · P1 | P1 ∈ P^m_{E′,u1}, |λx^τ.u1| = m, λx^τ.u1 : E2}
∪ {u1 t2 · P1 | P1 ∈ P^m_{E′,u1}, shn(E′) > 1, |t2| < m, |u1 t2| ≥ m, u1 t2 : E2}
∪ {t1 u2 · P2 | P2 ∈ P^m_{E′,u2}, |t1| < m, |t1 u2| ≥ m, t1 u2 : E2}.   (3)

We perform further case analysis:

• Case shn(E′) = 1: In this case, in the right-hand side of Equation (3), the second set is empty, P1 = P2 = ϵ, and u1 and u2 are 0-contexts by Lemma 21. Then we can deduce from Equation (3) that:

P^m_{E,[ ]} = {λx^τ.u1 | |λx^τ.u1| = m, λx^τ.u1 : E2} ∪ {t1 u2 | |t1| < m, |t1 u2| ≥ m, t1 u2 : E2}
= {λx^τ.u1 | λx^τ.u1 is good, λx^τ.u1 : E2} ∪ {t1 u2 | t1 u2 is good, t1 u2 : E2}
= U^m_{E2}.

• Case shn(E′) > 1: Let κ̄(E′2) = κ⇒⟨Γ; τ⟩. In this case, in the right-hand side of Equation (3), u1 and u2 are 1-contexts by Lemma 21.
Further, λx^τ.u1 : E2, u1 t2 : E2 and t1 u2 : E2 imply that u1 and u2 have the same hole typing κ, i.e., ∅ ⊢ u1 : κ⇒⟨Γ1; τ1⟩ and ∅ ⊢ u2 : κ⇒⟨Γ2; τ2⟩ for some ⟨Γ1; τ1⟩ and ⟨Γ2; τ2⟩. Therefore P^m_{E′,u_i} = P^m_{E′↾^{κ⇒κ}_0, [ ]} by Lemma 22. Then we can deduce from Equation (3) that:

P^m_{E,[ ]} = {λx^τ.u1 · P1 | P1 ∈ P^m_{E′↾^{κ⇒κ}_0,[ ]}, |λx^τ.u1| = m, λx^τ.u1 : E2}
∪ {u1 t2 · P1 | P1 ∈ P^m_{E′↾^{κ⇒κ}_0,[ ]}, shn(E′) > 1, |t2| < m, |u1 t2| ≥ m, u1 t2 : E2}
∪ {t1 u2 · P2 | P2 ∈ P^m_{E′↾^{κ⇒κ}_0,[ ]}, |t1| < m, |t1 u2| ≥ m, t1 u2 : E2}
= ({λx^τ.u1 | |λx^τ.u1| = m, λx^τ.u1 : E2}
∪ {u1 t2 | shn(E′) > 1, |t2| < m, |u1 t2| ≥ m, u1 t2 : E2}
∪ {t1 u2 | |t1| < m, |t1 u2| ≥ m, t1 u2 : E2}) × P^m_{E′↾^{κ⇒κ}_0,[ ]}
= ({λx^τ.u1 | λx^τ.u1 is good, λx^τ.u1 : E2} ∪ {u1 t2 | u1 t2 is good, u1 t2 : E2} ∪ {t1 u2 | t1 u2 is good, t1 u2 : E2}) × P^m_{E′↾^{κ⇒κ}_0,[ ]}
= U^m_{E2} × P^m_{E′↾^{κ⇒κ}_0,[ ]}.

By the induction hypothesis, P^m_{E′↾^{κ⇒κ}_0,[ ]} = U^m_{E3} × ··· × U^m_{Ek}; thus we obtain P^m_{E,[ ]} = U^m_{E2} × ··· × U^m_{Ek}.

– Case E = [[ ]][(E1⟦u1⟧)(E2⟦u2⟧)]: In this case, E is computed by the first case of Φ_m(t′_1 t′_2). By the definition of Φ_m, we have:

P^m_{E,[ ]} = {P1 · P2 | P1 ∈ P^m_{E1,u1} and P2 ∈ P^m_{E2,u2}} = P^m_{E1,u1} × P^m_{E2,u2}.

Let k_1 = shn(E1). By the induction hypothesis, P^m_{E1,u1} = U^m_{E2} × ··· × U^m_{E k_1} and P^m_{E2,u2} = U^m_{E k_1+1} × ··· × U^m_{Ek}; thus we obtain P^m_{E,[ ]} = U^m_{E2} × ··· × U^m_{Ek}. ⊓⊔

Combining Lemmas 20 and 23, we obtain Lemma 16 as a corollary.

F Proof of Lemma 3

Proof. First, by the definition of 2_k, we have: |2_k| = 7, ⊢ 2_k : τ(k + 2), ord(2_k) = k, and iar(2_k) = k. Item 1 is clear. On Item 2, since we have |t ↑m s| = m|t| + |s| + m, |Dup| = 8 and |Id| = 2,

|Expl^k_m| = 1 + (7m + 7 + m) + 1 + 7 + 1 + ··· + 7 + 1 + 8 + 1 + 2 + 1 + 0
= 1 + (7m + 7 + m) + 1 + (k − 3)(7 + 1) + 8 + 1 + 2 + 1
= 8m + 8k − 3.

On Item 3, #(V(Expl^k_m)) = 2 is clear.
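The arithmetic in Item 2 can be double-checked mechanically. The sketch below (Python; `expl_size` is our name, not notation from this paper) re-evaluates the sum from the proof term by term and compares it with the closed form 8m + 8k − 3; as in the proof, it assumes k ≥ 3 so that the (k − 3)(7 + 1) contribution makes sense.

```python
def expl_size(k, m):
    """Size of Expl^k_m, summed exactly as in the proof of Item 2:
    1 + (7m + 7 + m) + 1 + (k - 3)(7 + 1) + 8 + 1 + 2 + 1."""
    assert k >= 3
    return 1 + (7 * m + 7 + m) + 1 + (k - 3) * (7 + 1) + 8 + 1 + 2 + 1

# Sanity check against the closed form 8m + 8k - 3 of Lemma 3:
# e.g. expl_size(3, 1) → 29 = 8*1 + 8*3 - 3
```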
To calculate ord(−) and iar(−) inductively, we show the following by induction (where ∆, ∆1, ∆2 denote the derivations of the premises):

ord( Γ ⊢ x : τ ) = ord(τ)
ord( ∆ / Γ ⊢ λx^{τ′}.t : τ ) = max{ord(τ), ord(∆)}
ord( ∆1 ∆2 / Γ ⊢ t s : τ ) = max{ord(∆1), ord(∆2)}

iar( Γ ⊢ x : τ ) = iar(τ)
iar( ∆ / Γ ⊢ λx^{τ′}.t : τ ) = max{iar(τ), iar(∆)}
iar( ∆1 ∆2 / Γ ⊢ t s : τ ) = max{iar(∆1), iar(∆2)}

Note that we have excluded ord(τ) and iar(τ) from the right-hand sides of the two application cases. Also, we have

ord(Dup) = 1, ord(Id) = 1, iar(Dup) = 2, iar(Id) = 1.

Hence,

ord(Expl^k_m[x]) = max{ord(o→o), k, . . . , 2, ord(Dup), ord(Id), 1} = k
iar(Expl^k_m[x]) = max{iar(o→o), k, . . . , 2, iar(Dup), iar(Id), 1} = k.

Item 4 follows from Items 1 and 3. The proof of Item 5 is the same as that in [1]. ⊓⊔

G Proof of Lemma 4

Proof. We shall construct u′ by embedding Expl^k_{m′} in an appropriate context. We use Corollary 1 to adjust the goodness, typing and size of u′. For given δ, ι, ξ ≥ 2, let d0 ≥ 1 be the constant in Corollary 1. We choose b and c so that the following condition holds for any m′ ≥ b:

(cm′ − 1)/2 ≥ |Expl^k_{m′}| + 2d0.   (4)

For example, we can set b = 8k − 2 + 2d0 and c = 18; note that |Expl^k_{m′}| = 8m′ + 8k − 3 by Lemma 3.

Suppose that n, m′, (E, u) and i in the statement are given. By the definition, Û^{cm′}_{Ei}(δ, ι, ξ) is non-empty. Pick an arbitrary element u′′ ∈ Û^{cm′}_{Ei}(δ, ι, ξ). We construct u′ by finding a subcontext u0 of u′′ such that |Expl^k_{m′}| + 2d0 ≤ |u0| < cm′, and replacing it with a 0/1-context of the form S◦[Expl^k_{m′}[u◦]] such that (i) |S◦| = d0, (ii) |u◦| = |u0| − |Expl^k_{m′}| − d0 ≥ d0, (iii) S◦, u◦ ∈ U^ν(δ, ι, ξ), and (iv) the size and typing of S◦[Expl^k_{m′}[u◦]] are the same as those of u0. The existence of such S◦ and u◦ is guaranteed by Corollary 1. Since u′′ is good, so is u′ (note that |u0| < cm′). It remains to show how to find the subcontext u0. We perform a case analysis:

– Case where u′′ is of the form λx^σ.u′_0:
In this case, since λx^σ.u′_0 is good for cm′, we have |Ei| = |λx^σ.u′_0| = cm′. Thus, we can let u0 = u′_0. Note that the required condition |Expl^k_{m′}| + 2d0 ≤ |u0| follows from inequality (4).

– Otherwise, u′′ must be of the form u1 u2. Since u1 u2 is good for cm′, we have |u1|, |u2| < cm′ and |u1 u2| = |u1| + |u2| + 1 ≥ cm′. Let u0 be the ui such that |ui| ≥ |u_{3−i}|. By the conditions above and inequality (4), we have |Expl^k_{m′}| + 2d0 ≤ |u0| < cm′ as required. ⊓⊔

H Proof of the Main Theorem

The following lemma will be used in the main proof.

Lemma 24. Let δ, ι, ξ ≥ 2 be integers. There exists a constant d such that

#(U^ν_n(δ, ι, ξ)) ≤ #(U^ν_{n+d+n′}(δ, ι, ξ))

holds for any n, n′ ≥ 0.

Proof. Let d = d0, where d0 is the constant in Corollary 1. We show that there is an injection f : U^ν_n(δ, ι, ξ) → U^ν_{n+d+n′}(δ, ι, ξ). For a 0/1-κ-context u ∈ U^ν_n(δ, ι, ξ), we define f(u) ≜ S_κ[u], where S_κ is a 1-context, determined by the context typing κ of u, such that S_κ[u] ∈ U^ν_{n+d+n′}(δ, ι, ξ), i.e., |S_κ| = d + n′ and S_κ[u] is a κ-context. The existence of such an S_κ is assured by Corollary 1, and it is clear by the definition that f is an injection. ⊓⊔

Now, we give a complete proof of the main theorem.

Proof (of Theorem 1). Given δ, ι, ξ ≥ 2 as in the assumption, let k = max(δ, ι). We take the constants b and c from Lemma 4. Let n ≥ 2^{2^b}, m′ ≜ ⌈log^{(2)}(n)⌉ and m ≜ cm′; then m′ ≥ log^{(2)}(n) ≥ b. By Lemma 15, there is a bijection

g : Λ^α_n(δ, ι, ξ) ≅ ∐_{(E,u)∈B̂^m_n(δ,ι,ξ)} P̂^m_{E,u}(δ, ι, ξ).

For each (E, u) ∈ B̂^m_n(δ, ι, ξ), we have P̂^m_{E,u}(δ, ι, ξ) = Û^m_{E2}(δ, ι, ξ) × ··· × Û^m_{E shn(E)}(δ, ι, ξ) by Lemma 16. Let

P̄^m_{E,u}(δ, ι, ξ) ≜ {u2 ··· u_{shn(E)} ∈ P̂^m_{E,u}(δ, ι, ξ) | Expl^k_{m′} ⪯̸ ui for all 2 ≤ i ≤ shn(E)}.
Then, by Lemma 3 (Item 5), we can restrict g to an injection:

g′ : {[t]_α ∈ Λ^α_n(δ, ι, ξ) | β(t) < exp_{k−2}(n)} ↣ ∐_{(E,u)∈B̂^m_n(δ,ι,ξ)} P̄^m_{E,u}(δ, ι, ξ).

By Lemma 4, every Û^m_{Ei}(δ, ι, ξ) contains some u′_i such that Expl^k_{m′} ⪯ u′_i, so we have P̄^m_{E,u}(δ, ι, ξ) ⊆ (Û^m_{E2}(δ, ι, ξ) \ {u′_2}) × ··· × (Û^m_{E shn(E)}(δ, ι, ξ) \ {u′_{shn(E)}}). Then

#(P̄^m_{E,u}(δ, ι, ξ)) / #(P̂^m_{E,u}(δ, ι, ξ))
≤ ∏_{i=2}^{shn(E)} (#(Û^m_{Ei}(δ, ι, ξ)) − 1) / ∏_{i=2}^{shn(E)} #(Û^m_{Ei}(δ, ι, ξ))
= ∏_{i=2}^{shn(E)} ( 1 − 1/#(Û^m_{Ei}(δ, ι, ξ)) ).

By Lemma 24, there exists a constant d such that for any n′, n′′ ≥ 0 we have #(U^ν_{n′}(δ, ι, ξ)) ≤ #(U^ν_{n′+d+n′′}(δ, ι, ξ)). Hence we can deduce

#(P̄^m_{E,u}(δ, ι, ξ)) / #(P̂^m_{E,u}(δ, ι, ξ))
≤ ∏_{i=2}^{shn(E)} ( 1 − 1/#(Û^m_{Ei}(δ, ι, ξ)) )
≤ ∏_{i=2}^{shn(E)} ( 1 − 1/#(U^ν_{|Ei|}(δ, ι, ξ)) )
≤ ( 1 − 1/#(U^ν_{2m+d}(δ, ι, ξ)) )^{shn(E)−1}
≤ ( 1 − 1/#(U^ν_{2m+d}(δ, ι, ξ)) )^{n/(4m)},

where we note that m ≤ |Ei| < 2m for each 2 ≤ i ≤ shn(E) by Lemma 13, and shn(E) − 1 ≥ n/(4m) by Lemma 14. Now, for sufficiently large n, we have

#({[t]_α ∈ Λ^α_n(δ, ι, ξ) | β(t) < exp_{k−2}(n)}) / #(Λ^α_n(δ, ι, ξ))
≤ ( ∑_{(E,u)∈B̂^m_n(δ,ι,ξ)} #(P̄^m_{E,u}(δ, ι, ξ)) ) / ( ∑_{(E,u)∈B̂^m_n(δ,ι,ξ)} #(P̂^m_{E,u}(δ, ι, ξ)) )
≤ ( ∑_{(E,u)∈B̂^m_n(δ,ι,ξ)} ( 1 − 1/#(U^ν_{2m+d}(δ, ι, ξ)) )^{n/(4m)} × #(P̂^m_{E,u}(δ, ι, ξ)) ) / ( ∑_{(E,u)∈B̂^m_n(δ,ι,ξ)} #(P̂^m_{E,u}(δ, ι, ξ)) )
= ( 1 − 1/#(U^ν_{2m+d}(δ, ι, ξ)) )^{n/(4m)}
≤ ( 1 − 1/(c′ γ(δ, ι, ξ)^{2m}) )^{n/(4m)}   (∵ for sufficiently large n, by Lemma 1)
= ( 1 − 1/(c′ γ(δ, ι, ξ)^{2c⌈log^{(2)}(n)⌉}) )^{n/(4c⌈log^{(2)}(n)⌉)} −→ 0 (as n −→ ∞),

where the last convergence is shown in Appendix I. Thus, we have proved Theorem 1. ⊓⊔

I Proof of the Convergences

Proof. First note that, on the convergence in the proof of Proposition 3, we can assume that #(A) ≥ 2, and we have

lim_{n→∞} ( 1 − 1/#(A)^{⌈log^{(2)}(n)⌉} )^{⌊n/⌈log^{(2)}(n)⌉⌋}
≤ lim_{n→∞} ( 1 − 1/#(A)^{2 log^{(2)}(n)} )^{(1/4)·n/log^{(2)}(n)}
= lim_{n→∞} ( 1 − 1/(#(A)^2)^{log^{(2)}(n)} )^{(1/4)·n/log^{(2)}(n)}.
Also, on the convergence in the proof of Theorem 1, we have

lim_{n→∞} ( 1 − 1/(c′ γ(δ, ι, ξ)^{2c⌈log^{(2)}(n)⌉}) )^{n/(4c⌈log^{(2)}(n)⌉)}
≤ lim_{n→∞} ( 1 − 1/(γ(δ, ι, ξ)^{log^{(2)}(n)} γ(δ, ι, ξ)^{4c log^{(2)}(n)}) )^{n/(8c log^{(2)}(n))}
≤ lim_{n→∞} ( 1 − 1/(γ(δ, ι, ξ)^{4c+1})^{log^{(2)}(n)} )^{(1/(8c))·n/log^{(2)}(n)}.

Hence, since the positive constant factors in the exponents do not affect convergence to 0, it is enough to show

lim_{n→∞} ( 1 − 1/γ^{log^{(2)}(n)} )^{n/log^{(2)}(n)} = 0

for any γ > 1, where γ = #(A)^2 for Proposition 3 and γ = γ(δ, ι, ξ)^{4c+1} for Theorem 1.

Now we transform the variable n by setting γ^{log^{(2)}(n)} = x, i.e., n = 2^{(x^{1/log γ})}; then

lim_{n→∞} ( 1 − 1/γ^{log^{(2)}(n)} )^{n/log^{(2)}(n)} = lim_{x→∞} ( 1 − 1/x )^{2^{(x^{1/log γ})}·(log γ)/(log x)},

where log γ > 0. Then

lim_{x→∞} ( 1 − 1/x )^{2^{(x^{1/log γ})}·(log γ)/(log x)}
= lim_{x→∞} ( 1 − 1/x )^{(log γ)·2^{x^{1/log γ} − log^{(2)}(x)}}
≤ lim_{x→∞} ( 1 − 1/x )^{2^{(1/2)·x^{1/log γ}}}   ( since (1/2)·x^{1/log γ} ≥ log^{(2)}(x) for sufficiently large x )
= 0,

where the last equation can be calculated by applying log(−) to the left-hand side:

lim_{x→∞} log( ( 1 − 1/x )^{2^{(1/2)·x^{1/log γ}}} )
= lim_{x→∞} log(1 − 1/x) / 2^{−(1/2)·x^{1/log γ}}
= lim_{x→∞} [ x^{−2} / ( (log_e 2)(1 − 1/x) ) ] / [ −(log_e 2)·(1/(2 log γ))·x^{(1/log γ)−1}·2^{−(1/2)·x^{1/log γ}} ]   (by L'Hôpital's rule)
= lim_{x→∞} −( 2 log γ / ( (log_e 2)^2 (1 − 1/x) ) ) · 2^{(1/2)·x^{1/log γ}} / x^{(1/log γ)+1}
= −∞. ⊓⊔
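The final convergence can also be observed numerically. The Python sketch below is our illustration, not part of the proof: `log_value` evaluates the natural logarithm of (1 − 1/γ^{log^{(2)}(n)})^{n/log^{(2)}(n)} (in log-space, to avoid floating-point underflow), taking log^{(2)}(n) = log₂ log₂ n as in the paper. For any fixed γ > 1 the logarithm heads rapidly toward −∞ as n grows, i.e., the quantity itself toward 0.

```python
import math

def log_value(n, gamma):
    """Natural log of (1 - 1/gamma**ll) ** (n / ll), where ll = log2(log2(n)).
    Computed in log-space: the value itself underflows to 0.0 very quickly."""
    ll = math.log2(math.log2(n))
    return (n / ll) * math.log1p(-gamma ** (-ll))
```

For instance, with γ = 2 the values at n = 10³, 10⁴, 10⁵, 10⁶ decrease steeply (the exponent n/log^{(2)}(n) grows almost linearly while 1/γ^{log^{(2)}(n)} shrinks only polylogarithmically, so the product diverges to −∞).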