A Comparison of Tree Transductions dened by Monadic Second Order Logic and by Attribute Grammars Roderick Bloem? and Joost Engelfriet?? Department of Computer Science, Leiden University P.O.Box 9512, 2300 RA Leiden, The Netherlands e-mail: [email protected] Abstract. Formulas from monadic second order (mso) logic with one and two free variables can be used to dene the nodes and edges (respectively) of a graph, in terms of a given graph. Such mso denable graph transductions play a role in the theory of graph grammars. Here we investigate the special case of trees. The main result is that the mso denable tree transductions are exactly those tree transductions that can be computed by attributed tree transducers with look-ahead, which are a specic type of two-stage attribute grammar: in the rst (look-ahead) stage all attributes have nitely many values, in the second stage all attributes are trees, and the second stage satises the single use restriction (i.e., each attribute is used at most once). Moreover, if we allow the mso transductions to produce trees with shared subtrees (i.e., term graphs, that have to be unfolded), then the single use restriction can be dropped. 1 Introduction Formulas of monadic second order (mso) logic can be used to express properties of labeled graphs, and, in particular, trees and strings. Thus, monadic second order logic is a convenient language for the specication of sets of graphs, relations between graphs, relations between the nodes of graphs, etcetera. Several results are known that provide formal models for the implementation of such specications (see, e.g., [Tho, Eng5]). The classical one, proved in [Buc, Elg], is that a set of strings can be specied in monadic second order logic (by a closed formula) if and only if it can be recognized by a nite-state automaton. This was generalized in [Don, ThaWri] to sets of node-labeled ordered trees, with an appropriate generalization of the nite-state automaton to the bottom-up nite-state tree automaton. The trees considered are the usual representations of terms over a nite set of operators. In [Cou1] one direction of the result was generalized to The present address of the rst author is: Department of Computer Science, University of Colorado at Boulder, P.O.Box 430, Boulder, CO 80303, email: [email protected] ?? The second author was supported by ESPRIT BRWG No.7183 COMPUGRAPH II and TMR Network GETGRATS ? 1 graphs: every set of graphs that can be specied in monadic second order logic is recognizable in the algebraic sense (see [MezWri]), for a specic algebra of graphs. This was used to show that the class of context-free graph languages, generated by context-free graph grammars, is closed under intersection with mso denable sets of graphs. Inspired by the ideas of [Cou1], a characterization of the context-free graph languages in terms of monadic second order logic was presented in [Oos, Eng4, EngOos], based on the following natural logical way to specify graph transductions, i.e., functions from graphs to graphs. The idea is that the output graph is specied by formulas that are interpreted on the input graph (see [ArnLagSee] for the history of the concept of interpreting one logical structure in another). More precisely, for a given input graph g1 , the nodes of the output graph g2 are a subset of the nodes of g1 , which is specied by a unary mso formula, i.e., a formula with one free variable ranging over the nodes of g1; the edges of g2 are specied by a binary mso formula, i.e., a formula with two free variables ranging over the nodes of g1 . In fact, to dene the labels of the nodes and edges of g2 , these formulas are indexed by node and edge labels, respectively. It was shown in [Oos, Eng4, EngOos] that a set of graphs is context-free i it is the image of an mso denable set of trees under an mso denable graph transduction. Since it is easy to see that the mso denable graph transductions are closed under composition, this characterization implies that the context-free graph languages are closed under mso denable graph transductions (comparable to the closure of the context-free string languages under nite state transductions). 3 In [Cou3] the mso denable graph transductions were generalized, in a natural way, such that the above characterization of the context-free graph languages still holds. As a result, the context-free graph languages are even closed under this larger class of functions on graphs. The main idea of the generalization is to allow the output graph to contain a xed number k of copies of each node of the input graph. Accordingly, the mso formulas specifying the graph transduction are additionally indexed by numbers from 1 to k. This is the type of mso denable function on graphs that we consider in this paper. The mso denable functions in [EngOos, Cou3] are partial, and in [Cou3] also mso denable relations are considered. Here, we investigate total functions only. For a survey on mso denable graph transductions, see [Cou4]. The results discussed above show how to specify the context-free graph languages by mso denable graph transductions, but, the other way around, they do not answer the question of how to implement mso specications of graph transductions, which is the topic of this paper. Since there do not seem to be 3 It should be noted here that there are (at least) two natural classes of contextfree graph languages, those generated by (hyper)edge replacement (HR, see, e.g., [DreHabKre]) and those generated by node replacement (NR, see, e.g., [EngRoz]). They are related to mso logic with and without variables ranging over edges, respectively. The original result of [Cou1] was proved for HR (and, later, for NR in [CouEngRoz]). The characterization of [Oos, Eng4, EngOos] was proved for NR (and, later, for HR in [CouEng]). For a survey of the relationships between context-free graph grammars and monadic second order logic, see [Cou2, Cou5, Eng6]. 2 suitable formal models for the implementation of functions on graphs, we consider in this paper the special case of trees: both input and output graph are trees. This choice is also motivated by the fact that tree transductions form a well-known and well-investigated model of syntax-directed semantics. A partial answer to the question of how to implement the mso denable tree transductions was given in [BloEng], where it was shown how to implement unary and binary mso formulas for trees: unary formulas can be implemented by attribute grammars of which all attributes have nitely many values (proved independently in [NevBus]), and binary formulas can be implemented by a particular type of nite-state tree-walking automaton. Since the specication of a graph transduction consists of a collection of unary and binary formulas, we can use these characterizations to obtain a model for the implementation of mso specications of tree transductions. The main idea is that the tree-walking automata can be turned into an attribute grammar. Thus, we obtain a characterization of the mso denable tree transductions in terms of attribute grammars, one of the best known formal models for dening syntax-directed semantics. Of course, attribute grammars are still a specication language, but they are much closer to implementation than mso logic, and their implementation has been studied extensively (see, e.g., [DerJouLor, Eng3, KuhVog2]). To be precise, we prove that a tree transduction is mso denable i it can be computed by an \attributed tree transducer with look-ahead" (attR for short). This is a transducer which consists of two attribute grammars, each computing a tree transduction, the composition of which is the tree transduction computed by the transducer. The two attribute grammars work in dierent ways. The rst attribute grammar is a so-called relabeling attribute grammar (introduced in [BloEng]) which just preprocesses the input tree by relabeling its nodes: all attributes have nitely many values and one of the attributes holds the new label of each node. This is the look-ahead part of the attR ; the R stands for `relabeling' or for `regular look-ahead', where `regular' is used to refer to nite-state devices (in this case an attribute grammar of which all attributes have nitely many values). The second attribute grammar is an attributed tree transducer (introduced in [Ful], see also [EngFil]) which performs the actual computation: the values of all attributes are trees, with substitution as the only operation on trees, and the output tree is the value of a designated attribute at the root of the input tree. Moreover, the second attribute grammar satises the so-called single use restriction (investigated in [Gan, GanGie, Gie]), which means that each attribute is used at most once. An attR can also be viewed as one attribute grammar of which the attributes can be evaluated in two phases: in the rst phase only attributes with nitely many values are evaluated, called \ags" in [Eng2, Blo], and in the second phase the value of each attribute is a tree which is dened by a conditional rule, depending on the values of the ags. Intuitively, this is a more understandable model. Technically, due to the presence of conditional rules in such two-phase attribute grammars, it is more convenient to work with compositions of attribute grammars as above. Since trees are terms and terms can be evaluated in a semantic domain, at3 tributed tree transducers are a schematic model of attribute grammars: every attribute grammar can be viewed as an attributed tree transducer followed by an evaluation of terms (cf. [Eng2]). The attributed tree transducers seem to have certain undesirable properties, see, e.g., [FulVag]. Based on the results of this paper, we think that the attributed tree transducer with look-ahead is a more attractive and robust formal model of attribute grammars. By the result of [FulVag] it is more powerful than the attributed tree transducer. This can be compared with the addition of regular look-ahead to top-down tree transducers (see [Eng1, GecSte]), which are, in fact, attributed tree transducers with synthesized attributes only. When implementing tree transductions (as in term rewriting systems) it is natural, and ecient, to allow trees with shared subtrees: so-called term graphs (see, e.g., [SlePleEek, CorMon]). Thus, we also investigate mso denable graph transductions of which the input graph is a tree, but the output graph is a term graph. Such a graph transduction denes a tree transduction, obtained by unfolding the output graph into the tree it represents. We prove that a tree transduction can be dened by such an mso denable term graph transduction i it can be computed by an attributed tree transducer with look-ahead, without the single use restriction. In fact, one of the reasons that attribute grammars are a popular tool for the implementation of syntax-directed semantics is the fact that attributes can be used several times, but are computed only once. Note that the single use restriction naturally correspond to the requirement that the term graph has no sharing, i.e., is a tree. The structure of this paper is as follows. Section 2 contains the basic terminology on graphs, term graphs, trees, mso logic, and attribute grammars. In Section 3 we recall the denition of mso denable graph transduction, and dene the two classes of mso denable tree transductions that we investigate, with and without sharing of subtrees. The notion of an mso relabeling is introduced, which is an mso denable tree transduction that just relabels the nodes of the input tree. In Section 4 we dene the attributed tree transducer with look-ahead, by recalling the denitions of a relabeling attribute grammar and an attributed tree transducer. This section also contains some basic properties of attributed tree transducers, which are used in Section 5 to show that every attributed tree transduction is mso denable (without sharing of subtrees if the single use restriction is satised). In Section 6 we recall the characterization of unary and binary formulas on trees, proved in [BloEng], and we show that relabeling attribute grammars and mso relabelings have the same power. With the result of the previous section, this shows half of our main results: every tree transduction computed by an attributed tree transducer with look-ahead can be specied in mso logic. The other half is proved in Section 7: every mso denable tree transduction can be computed by an attR . After the proof, a detailed example is given of this implementation. The attributed tree transducer constructed in the proof is only guaranteed to be noncircular when restricted to the output trees of the relabeling attribute grammar. It is shown in Section 8 that it can be turned into a noncircular attributed tree transducer. Finally, the main results are stated and 4 discussed in Section 9. Most of the results of this paper were proved as part of [Blo], the Master's Thesis of the rst author. 2 Preliminaries In this section we recall some well-known concepts concerning graphs and trees, monadic second order logic on graphs, and attribute grammars. N = f0; 1; 2; : : : g, and for m; n 2 N , [m; n] = fi j m i ng. For a set S , P (S ) is its powerset and #S its cardinality. For binary relations R1 and R2 , their composition is R1 R2 = f(x; z ) j 9y : (x; y) 2 R1 and (y; z ) 2 R2 g; note that the order of R1 and R2 is nonstandard. For sets of relations R1 and R2 , R1 R2 = fR1 R2 j R1 2 R1 ; R2 2 R2 g. The transitive reexive closure of a binary relation R is denoted R . A binary relation R is said to be functional if it is a partial function, i.e., if (x; y); (x; z ) 2 R implies y = z . 2.1 Graphs, term graphs, and trees We view trees and term graphs as nite, directed graphs with labeled nodes and edges, in the usual way. Let and be alphabets (of node labels and edge labels, respectively). A graph over (; ) is a triple (V; E; lab), with V a nite set of nodes, E V V the set of labeled edges, and lab : V ! the node-labeling function. For a given graph g, its nodes, edges, and node-labeling function are denoted Vg , Eg , and labg , respectively. The set of all graphs over (; ) is denoted G; . Let g = (V; E; lab) be a graph. An edge (u; ; v) is an edge with label from u to v, it is an outgoing edge of u, and an incoming edge of v. We also say that u is a parent of v and that v is a child of u. Instead of (u; ; v) 2 E we also write u! v. For nodes u; v 2 V , a (directed, possibly empty) path from u to v is an alternating sequence u0 e1 u1 e2 u2 en un of nodes ui and edges ei , with n 0 (the length of the path), u0 = u, un = v, and ei is an edge from ui 1 to ui . We also say that u is an ancestor of v and that v is a descendant of u. The path 2 1 n u2 ! u1 ! un , if ei = (ui 1 ; i ; ui ). It is a cycle is also written u0 ! if n 1 and u0 = un . A graph is noncircular if it has no cycles, and circular otherwise. The trees we consider are the usual graphical representations of terms, which form the free algebra over a set of operators. An operator alphabet is an alphabet together with a rank function rk : ! N . For all k 2 N , k = f 2 j rk() = kg is the set of operators of rank k, i.e., with k arguments. The rank interval of the operator alphabet is rki( ) = [1; m] where m is the maximal rank of the elements of . The nodes of a tree or term graph over are labeled by operators. To indicate the order of the arguments of an operator, we label the edges by natural numbers. Since a term graph may contain \garbage", the root of the tree it represents has 5 to be marked. Here it is technically convenient to add the mark # to the label of the root. This is only necessary if the term graph itself has no unique root. Let be an operator alphabet. By # we denote the ranked alphabet [ ( f#g) in which, for every 2 , both and (; #) have the same rank as in . A term graph over is a graph t over (# ; rki( )) such that (1) t is noncircular, (2) for every node u of t, if labt (u) = or labt (u) = (; #), with 2 k , then all outgoing edges of u have labels in [1; k], and, for every i 2 [1; k], u has exactly one outgoing edge with label i, and (3) either there is a unique node with label in f#g, or there is no node with label in f#g and there is a (unique) node which is an ancestor of all nodes. The unique node mentioned in condition (3) is called the root of t, denoted root(t). Thus, either root(t) is the unique node that is marked with #, or no node is marked with # and root(t) is the unique node that is an ancestor of all nodes of t. A node without outgoing edges is called a leaf of t. For nodes u and v of t, if (u; i; v) 2 Et , then v is called the i-th child of u, denoted by u i. For technical convenience, we also dene u 0 = u. All nodes of t that are not descendants of root(t) are referred to as garbage. Note that, by condition (3), there is no garbage if the root is not marked by #. A forest is a term graph such that no node has more than one incoming edge. A tree is a forest such that all nodes have their labels in . The set of all trees over is denoted T . A function from T to T (where is an operator alphabet too) is called a tree transduction or a term transduction. As usual, trees will also be denoted by the corresponding terms over . Thus, for 2 k and t1 ; : : : ; tk 2 T , the term (t1 ; : : : ; tk ) denotes the tree t with labt (root(t)) = and root(t) i = root(ti ) for every i 2 [1; k]. Term graphs can be unfolded into trees. For a term graph t over , the unfolding of t, denoted unfold(t), is the tree over dened by unfold(t) = unf t (root(t)), where, for u 2 Vt , unf t (u) = (unf t (u 1); : : : ; unf t (u k)) with labt (u) = or labt (u) = (; #), 2 k . A term graph over is clean if all its node labels are in (and hence all its nodes are descendants of the root). Garbage can be removed from term graphs as follows. For a term graph t over , the cleaning of t, denoted clean(t), is the subgraph of t induced by the descendants of root(t), in which # is removed from the label of root(t), if present. Obviously, clean(t) is a term graph over of which all nodes have their labels in . Moreover, root(clean(t)) = root(t) and unfold(clean(t)) = unfold(t). We also need terms with variables. These variables are treated as constants. Let be an operator alphabet and a nite set (of variables), disjoint with . Let ( ) be the operator alphabet with ( )0 = 0 [ and ( )k = k for k 1. Then T ( ) = T( ) is the set of trees over with variables in . A tree t 2 T ( ) is linear if every variable appears at most once in t. If = f1 ; : : : ; k g and ti 2 T ( ) for i 2 [0; k], then t0 [1 7! t1 ; : : : ; k 7! tk ] 6 denotes the result of (simultaneously) substituting ti for i in t0 , i 2 [1; k]. 2.2 Monadic second order logic Monadic second order logic can be used to describe properties of graphs (see, e.g., [Cou2, Cou5, Eng4, Eng6, EngOos]), and hence in particular to describe properties of trees and term graphs. For alphabets and , we use the language MSOL(; ) of monadic second order (mso) formulas over (; ). Formulas over (; ) describe properties of graphs over (; ). This logical language has node variables x; y; : : : , and node-set variables X; Y; : : : . For a given graph g over (; ), node variables range over the elements of Vg , and node-set variables range over the subsets of Vg . There are three types of atomic formulas in MSOL(; ): lab (x), for every 2 , denoting that x has label ; edg (x; y), for every 2 , denoting that there is an edge labeled from x to y; and x 2 X , denoting that x is an element of X . The formulas are built from the atomic formulas using the connectives :, ^, _, !, and $, as usual. Both node variables and node-set variables can be quantied with 9 and 8. We will use edg(x; y) to abbreviate the disjunction of all edg (x; y), 2 ; it denotes that there is an edge from x to y. We will use x = y for 8X (x 2 X $ y 2 X ), denoting that x equals y. Finally, we will use the formula path(x; y) which expresses the existence of a directed path from x to y: path(x; y) = 8X ((closed(X ) ^ x 2 X ) ! y 2 X ) where closed(X ) = 8x; y((edg(x; y) ^ x 2 X ) ! y 2 X ). For every k 2 N , the set of mso formulas over (; ) with k free node variables and no free node-set variables is denoted MSOLk (; ). For k = 1; 2, the elements of MSOLk (; ) are also called unary and binary formulas, respectively. Since we are predominantly interested in trees and term graphs, over an operator alphabet , an mso formula over (# ; rki( )) will simply be called an mso formula over . Also, MSOL(# ; rki( )) will be abbreviated to MSOL( ), and similarly for MSOLk ( ). Note that, in MSOL( ), edgi (x; y) means that y is the i-th child of x, and edg(x; y) means that x is a parent of y. We will additionally use root(x) for a formula that expresses that x is the root of a term graph, e.g., the formula lab# (x) _ 8y(: lab# (y) ^ path(x; y)), where lab# (x) is the disjunction of all lab(;#)(x), for 2 . Moreover, we will use leaf(x) for :9y(edg(x; y)), which denotes that x is a leaf. In case we consider trees only, the component f#g can of course be dropped, and, e.g., root(x) can be simplied to 8y(path(x; y)). For a closed formula 2 MSOL0 (; ) and a graph g 2 G; , we write g j= if g satises . Given a graph g, a valuation is a function that assigns to each node variable an element of Vg , and to each node-set variable a subset of Vg . We write (g; ) j= , if holds in g, where the free variables of are assigned values according to the valuation . If a formula has free variables, say, x; X; y and no others, we also write (x; X; y). Moreover, we write (g; u; U; v) j= (x; X; y) for (g; ) j= (x; X; y), where (x) = u, (X ) = U , and (y) = v. As a very 7 simple example, (g; u; v) j= path(x; y) means that there is a path from u to v in g, and g j= 8x; y(path(x; y)) means that g is strongly connected. 2.3 Attribute Grammars In this subsection we recall some terminology concerning attribute grammars (see, e.g., [Knu, DerJouLor, Eng3, KuhVog2]). In order to allow the attribute grammar to work on arbitrary trees over an operator alphabet, rather than on derivation trees of an underlying context-free grammar, we consider a slight variation of the attribute grammar that was introduced in [Ful]. The semantic rules of the attribute grammar are grouped by operator rather than by grammar production, and there are special semantic rules for the inherited attributes of the root. All operators have the same attributes. Let be an operator alphabet. An attribute grammar over is a six-tuple G = (; S; I; ; W; R; m) where { is the input alphabet. { S is a nite set, the set of synthesized attributes. { I , disjoint with S , is a nite set, the set of inherited attributes. { is a nite set of sets, the semantic domains of the attributes. { W : (S [ I ) ! is the domain assignment. { R describes the semantic rules; it is a function associating a set of rules with every 2 [ frootg: For 2 , R() is the set of internal rules for ; R() contains one rule h0 ; i0i = f (h1 ; i1 i; : : : ; hk ; ik i) for every pair h0 ; i0 i, where either 0 is a synthesized attribute and i0 = 0, or 0 is an inherited attribute and i0 2 [1; rk()]. Furthermore, k 0, 1 ; : : : ; k 2 S [ I , i1 ; : : : ; ik 2 [0; rk()], f is a function from W (1 ) W (k ) to W (0 ), and the hj ; ij i are all distinct. R(root) is the set of root rules; R(root) contains one rule h0 ; 0i = f (h1 ; 0i; : : : ; hk ; 0i) for every 0 2 I , where k 0, 1 ; : : : ; k 2 S [ I , f is a function from W (1 ) W (k ) to W (0 ), and the hj ; ij i are all distinct. { m 2 S is the meaning attribute. Usually, for a semantic rule h0 ; i0 i = f (h1 ; i1 i; : : : ; hk ; ik i) the function f is given as f = x1 ; : : : ; xk :e for some expression e with variables in fx1 ; : : : ; xk g. We will then informally denote the rule by h0 ; i0 i = e0 where e0 is obtained from e by substituting hj ; ij i for xj , for all j 2 [1; k]. If I = ;, i.e., G has synthesized attributes only, then G is said to be Only Synthesized (os). Note that an os attribute grammar has no root rules. If all sets in are nite, then G is said to be nite-valued; this means that each attribute has nitely many values. 8 Let t be a tree over . The set of attributes of t is A(t) = (S [ I ) Vt . The set R(t) of semantic instructions of t is dened as follows. Recall that, for u 2 Vt , u 0 = u. For every node u 2 Vt , if labt (u) = , and h0 ; i0 i = f (h1 ; i1 i; : : : ; hk ; ik i) is a rule in R(), then h0 ; u i0 i = f (h1 ; u i1 i; : : : ; hk ; u ik i) is an internal instruction of t. Analogously, if h0 ; 0i = f (h1 ; 0i; : : : ; hk ; 0i) is a root rule, then h0 ; root(t)i = f (h1 ; root(t)i; : : : ; hk ; root(t)i) is a root instruction of t. The set of all internal instructions and root instructions of t is R(t). Note that for every h; ui 2 A(t) there is exactly one semantic instruction with left-hand side h; ui in R(t). With the help of R(t), the dependencies between the attributes of t can be represented in the usual way by a graph. The dependency graph of a tree t over is the unlabeled directed graph D(t) = (V; E ), where V = A(t) and E consists of all edges (h; ui; h0 ; u0 i) such that there is a semantic instruction h0 ; u0i = f (h1 ; u1i; : : : ; hk ; uk i) 2 R(t) with h; ui = hi ; ui i for some i 2 [1; k]. An attribute grammar G is noncircular on a tree t 2 T if D(t) is noncircular, and G is noncircular if it is noncircular on every tree t 2 T . An attribute grammar G is single use restricted (sur) on a tree t 2 T if no node of D(t) has more than one outgoing edge, and G is sur if it is sur on every tree t 2 T . The single use restriction was investigated in [Gan, GanGie, Gie]; it received its name in [Gie]. We now dene how to give the correct values to the attributes of the tree t. Let dec be a function from A(t) to [ , such that dec(h; ui) 2 W () for every h; ui 2 A(t). The function dec is a decoration of t if all instructions are satised, i.e., for every instruction h0 ; u0 i = f (h1 ; u1 i; : : : ; hk ; uk i) 2 R(t), dec(h0 ; u0 i) = f (dec(h1 ; u1 i); : : : ; dec(hk ; uk i)). It is well known that if D(t) is noncircular, then t has a unique decoration; this decoration will be denoted by decG;t. We will not use the meaning attribute m here. It will be used in later sections, in dierent ways, to dene the translation realized by an attribute grammar. Finally, we dene the dependency graphs of symbols (and the root), which are similar to the dependency graphs of productions in the usual type of attribute grammar [Knu]. They are the atomic graphs from which all dependency graphs D(t) of trees t 2 T are built. For 2 k , the dependency graph of is the unlabeled directed graph D() = (V; E ) where V = A [0; k] and E consists of all edges (hj ; ij i; h0 ; i0i) such that there is a rule h0 ; i0 i = f (h1 ; i1 i; : : : ; hr ; ir i) 2 R(), with j 2 [1; r]. The dependency graph of the root is the unlabeled directed graph D(root) = (V; E ) where V = A f0g and E consists of all edges (hj ; 0i; h0 ; 0i) such that there is a rule h0 ; 0i = f (h1 ; 0i; : : : ; hr ; 0i) 2 R(root), with j 2 [1; r]. 9 3 MSO denable graph transductions The main concept in this paper is that of an mso denable function f on graphs, introduced in [Oos, Eng4, EngOos], and independently in [Cou3] (for a recent survey see [Cou4]). The idea is that, for a given input graph g1 , the nodes, edges, and labels of the output graph g2 = f (g1 ) are described in terms of mso formulas on g1 . For the simplest type of mso denable function (see [EngOos]), the nodes of g2 are a subset of the nodes of g1 . In fact, for each node label of g2 there is a unary formula (x) expressing that node x of g1 will be a node of g2 with label (provided no other such formula is true for x). The edges of g2 are specied by a binary formula (x; y), for every edge label of g2 , expressing that there will be a -labeled edge from x to y in g2 . Although this type of mso denable function is sucient for many purposes, its power is restricted by the fact that the size of the output graph cannot be larger than the size of the input graph. Thus, in [Cou3] the above idea was extended by allowing the output graph to contain (a xed number k of) copies of each node of the input graph. Now, for each i 2 [1; k] there is a formula ;i (x) expressing that the i-th copy of node x of g1 will be a node of g2 with label , and, similarly, for i; j 2 [1; k] there are formulas ;i;j (x; y) for the edges of g2. It will be convenient to use arbitrary names for the copies, rather than numbers. We will only consider total functions f (see [Cou4] for the extension to partial functions and to relations). Since our aim is to compare the dening power of monadic second order logic with the power of certain types of tree transducers, we will view an mso specication of f as a \graph transducer"; this is then a total deterministic transducer. We now dene the syntax and semantics of such mso graph transducers. An mso graph transducer from (1 ; 1 ) to (2 ; 2 ) is a triple T = (C; ; X ) where { C is a nite set of copy names, { = f ;c(x)g22;c2C , with ;c(x) 2 MSOL1(1 ; 1), is the family of node formulas, and { X = f;c;c (x; y)g2 2;c;c 2C , with ;c;c (x; y) 2 MSOL2 (1; 1), is the family of edge formulas. The copy number of T is #C . The graph transduction Tgr : G1 ; 1 ! G2 ; 2 dened by T is dened as follows. For every graph g1 over (1 ; 1 ), Tgr (g1 ) is the graph g2 over (2 ; 2 ) with { Vg2 = f(c; u) j c 2 C; u 2 Vg1 ; and there is exactly one 2 2 such that (g1 ; u) j= ;c (x)g, { Eg2 = f((c; u); ; (c0; u0)) j (c; u); (c0; u0) 2 Vg2 ; 2 2; and (g1 ; u; u0) j= ;c;c (x; y)g, { labg2 = f((c; u); ) j (c; u) 2 Vg2 ; 2 2; and (g1; u) j= ;c(x)g. 0 0 0 0 10 The set of all graph transductions dened by mso graph transducers will be denoted mso-gt. They are the mso denable graph transductions. If the copy number of an mso graph transducer is 1, i.e., the copy set C is a singleton fcg, we will drop the subscripts c from the node and edge formulas, i.e., we write (x) and (x; y) instead of ;c (x) and ;c;c(x; y), respectively. Note that a copy (c; u) of a node u of g1 may not be a node of g2 = Tgr (g1 ) for two reasons: either there is no such that (g1 ; u) j= ;c (x), or there is more than one such . However, it is easy to see that we may always assume, for xed c 2 C , the formulas ;c (x) to be mutually exclusive, in which case only the rst reason remains and { Vg2 = f(c; u) 2 C Vg1 j 9 2 2 : (g1; u) j= ;c(x)g, and { labg2 = f((c; u); ) 2 (C Vg1 ) 2 j (g1; u) j= ;c(x)g. In fact, the formula ;c (x) can be replaced by its conjunction with all : ;c(x), for 0 2 2 fg. Similarly, it can always be assumed that an edge formula ;c;c (x; y) only holds for nodes u; u0 2 Vg1 if (c; u); (c0 ; u0 ) are nodes of the output graph g2, i.e., that (g1 ; u; u0) j= ;c;c (x; y) implies (c; u); (c0 ; u0 ) 2 Vg2 . In that case { Eg2 = f((c; u); ; (c0; u0)) j (c; u); (c0 ; u0) 2 C Vg1 ; 2 2 ; (g1 ; u; u0) j= ;c;c (x; y)g. Assuming the node formulas to be mutually exclusive (for xed c), this is achieved by changing the formula ;c;c (x; y) into its conjunction with the disjunction of all formulas ;c(x) ^ ;c (y), for ; 0 2 2 . These two assumptions will be made whenever convenient, without mentioning. They allow us, e.g., to check fewer conditions in the simulation of mso graph transducers by attribute grammars (in Section 7). 0 0 0 0 0 0 0 Terms and trees We now introduce the specic mso graph transducers that we are interested in. They have trees as input, and they also produce trees as output, either by directly dening the output tree or by dening a term graph of which the unfolding is the output tree. Let and be operator alphabets. An mso term graph transducer from to is an mso graph transducer T from (; rki( )) to (# ; rki()) such that Tgr (t) is a term graph over for every t 2 T . It denes the tree transduction T : T ! T with T (t) = unfold(Tgr (t)) for every t 2 T . An mso tree transducer from to is an mso term graph transducer T from to such that Tgr (t) 2 T for every t 2 T . Note that T (t) = Tgr (t) for every t 2 T , because unfolding has no eect on trees. The set of all tree transductions dened by mso term graph transducers is denoted mso-tgt, and the set of all tree transductions dened by mso tree transducers is denoted mso-tt. To distinguish between mso-tt and mso-tgt, the transductions in mso-tt will be called mso denable tree transductions, and those in mso-tgt will be called mso denable term transductions. Note that, 11 by denition, mso-tt mso-tgt. Properness of this inclusion will be shown in Example 1(5, binary); thus, in general, the unfolding of term graphs is not mso denable. Special cases Two special cases of interest are the following. First, an mso graph transducer is said to be \direction preserving" if edges of the output graph correspond to (directed) paths in the input graph. If, in particular, the input graph is a tree, then all edges in the output graph lead from (a copy of) a node of the input tree to (a copy of) one of its descendants. Formally, an mso graph transducer T from (1 ; 1 ) to (2 ; 2 ) is direction preserving if, for every graph g1 2 G1 ; 1 , if ((c; u); ; (c0 ; u0 )) is an edge of Tgr (g1 ) then there is a directed path from u to u0 in g1 . We will indicate the direction preserving property by a subscript `dir'. Thus, mso-tgtdir denotes the class of all term transductions dened by direction preserving mso term graph transducers. We will show that these transducers are related to Only Synthesized attribute grammars. Second, an mso tree transducer is said to be a \relabeling" if it just relabels all nodes of the input tree. Formally, an mso tree transducer T = (C; ; X ) from to is an mso relabeling if (1) the copy number of T is 1, (2) i (x; y) = edgi (x; y) for every i 2 rki(), and (3) for every t 2 T and u 2 Vt there is a unique 2 such that (t; u) j= (x). The class of all tree transductions dened by mso relabelings will be denoted mso-rel. Note that mso-rel mso-ttdir . An inclusion diagram of the classes of tree transductions dened above is given in Fig. 1. Its correctness will follow from Example 1 below, in particular Example 1(3, stars), Example 1(5, binary), and Example 1(6, yield). mso-tgt mso-tgtdir mso-tt mso-ttdir mso-rel Fig. 1. An inclusion diagram of classes of tree transductions 12 We observe here that, restricting attention to input trees over , all the above properties of mso graph transducers T are decidable: whether or not T is an mso term graph transducer, is an mso tree transducer, is direction preserving, or is an mso relabeling. This is because for each of these properties a closed formula 2 MSOL( ) can be found such that T has the property i t j= for every tree t 2 T . Decidability now follows from the classical fact that ft 2 T j t j= :g is a regular tree language ([Don, ThaWri]), and the well-known fact that emptiness of regular tree languages is decidable (see, e.g., [GecSte]). As an example, for the direction preserving property the formula is = 8x; y(0 (x; y)), where 0 (x; y) is the conjunction of all formulas i;c;c (x; y) ! path(x; y), for every edge formula i;c;c (x; y). 0 0 Examples We now give some examples of mso denable graph transductions. Example 1. (1, clean) The cleaning of term graphs (see Section 2.1) is mso denable, i.e., for every operator alphabet there is an mso graph transducer T from (# ; rki( )) to (; rki( )) such that for every term graph t over , Tgr (t) = clean(t). The copy number of T is 1. For every 2 it has node formula (x) = (lab (x) _ lab(;#) (x)) ^ 8y (root(y ) ! path(y; x)); and for every i 2 rki( ) it has edge formula i (x; y) = edgi (x; y). Through the node formulas, only those nodes of the input term graph t are copied to the output graph that are descendants of root(t). The labels of those nodes stay the same, except that root(t) is unmarked (if it was marked by #), and the edges between them are just copied to the output graph. This shows that the output graph is clean(t). (2, relab) If 1 (x); : : : ; n (x) is a sequence of formulas in MSOL1 ( ), then there is an mso relabeling T from to ftrue; falsegn such that, for every t 2 T and u 2 Vt , labT (t) (u) = (; b1 ; : : : ; bn ) where = labt (u) and, for every i 2 [1; n], bi = true i (t; u) j= i (x). In fact, for 0 = (; b1 ; : : : ; bn ), the node formula (x) of T is the conjunction of the formula lab (x) with all formulas i (x) for which bi = true and all formulas :i (x) for which bi = false. (3, stars) We give an mso tree transducer T from to , where = 0 [ 2 , with 0 = fag, and 2 = fg, and = 0 [ 1 [ 2 , with 0 = 0 , 2 = 2 , and 1 = fg. Note that rki() = rki( ) = f1; 2g. The transducer transforms a tree by inserting a node with label on each of its edges. See Fig. 2 for an example. The transducer T = (C; ; X ) is dened as follows. 0 { The copy set C is fo; ng, where o stands for `old', and n for `new'. A node (o; u) of T (t) is an old copy of the node u of t (with the same label). A node (n; u) is a new copy of the node u; it has label , and it has node (o; u) as its child. Two copies are made of every node except the root. 13 1 a 1 2 a 1 1 1 a 2 a 2 1 2 1 a 1 a Fig. 2. Example of a tree t and its transduction T (t) { consists of the node formulas ;o(x) = lab (x) for all 2 , ;o (x) = false; for all 2 , and ;n(x) = false; ;n(x) = : root(x): { X consists of the edge formulas i;o;o (x; y) = false for i 2 f1; 2g, i;n;n (x; y) = false for i 2 f1; 2g, i;o;n (x; y) = edgi (x; y) for i 2 f1; 2g, 1;n;o (x; y) = (x = y); and 2;n;o (x; y) = false : Note that T is a direction preserving mso tree transducer. Thus, T is in mso-ttdir, but not in mso-rel because for an mso relabeling the output tree has the same size as the input tree. (4, path) The next example is adapted from [FulVag]. It is a direction preserving mso tree transducer T from to , where the input alphabet is f; ; ag, with rk () = 2, and rk () = rk (a) = 0, and the output alphabet is f1; 2; ; ag, with rk (1) = rk (2) = 1, and rk () = rk (a) = 0. If the input tree t contains exactly one leaf labeled , the transducer transforms it into a tree over , which codes the path leading from the root of t to the leaf labeled in the obvious way. Otherwise, the output is a. For example, T (((a; (a; )); a)) = 1(2(2())), and T ((; (a; ))) = T ((a; (a; a))) = a. The copy number of T is 1. The node and edge formulas of T are 1 (x) = us ^ 9y; z (edg1 (x; y ) ^ path(y; z ) ^ lab (z )); 2 (x) = us ^ 9y; z (edg2 (x; y ) ^ path(y; z ) ^ lab (z )); (x) = us ^ lab (x); a (x) = :us ^ root(x); 1 (x; y) = edg(x; y); 14 where `us' is the closed formula 9z (lab (z ) ^ 8y(lab (y) ! y = z )), expressing that the input tree has a unique star. (5, binary) Consider the mso term graph transducer T from to , where = = f; ag, with rk () = 1, rk () = 2, and rk (a) = rk (a) = 0. The transducer transforms a linear input tree into a full binary output tree, of the same height. To do this, it turns the input tree into a term graph by changing every edge (with label 1) into two edges (with labels 1 and 2). This term graph is then unfolded into a full binary tree. The copy number of T is 1, it has node formulas (x) = lab (x) and a (x) = laba (x), and edge formulas 1 (x; y) = edg1 (x; y) and 2 (x; y) = edg1 (x; y). Note that, again, T is direction preserving. Thus, T is in mso-tgtdir, but not in mso-tt; in fact, it should be obvious that, for every T 2 mso-tt and every input tree t, the size of T (t) is linear in the size of t, where the constant is the copy number of T . (6, yield) Finally, we consider an example of an mso tree transducer that is not direction preserving. Let the input alphabet be = 0 [2 , for some 0 and 2 . The output alphabet is = 0 [ 1 , with 1 = 0 and 0 = f^ j 2 0 g. We present an mso tree transducer T such that for any tree t 2 T , T (t) is equal to the yield of t as a monadic tree (a string can be seen as a monadic tree, with its rst symbol as root, the second symbol as child of the rst, etcetera, up to the last symbol, which is the leaf of the tree). We have to beware, because all labels in the monadic tree that constitutes the yield have rank 1, except for the last one, which has rank 0. We will therefore use elements of 1 for all of the labels, except the last one, for which we will use a corresponding label from 0 . To do this we need the following mso formula over , which, for a node x of a (binary) tree, checks that it is on the path from the root to the rightmost leaf: rm(x) = 8y(root(y) ! path2 (y; x)) where the formula path2 (x; y) expresses that there is a path from x to y of which the edges are all labeled 2 (obtained by changing edg(x; y) into edg2 (x; y) in the formula path(x; y), see Section 2.2). Let T = (fcg; f g20 [ f ^ g20 ; f1 g) be the mso tree transducer from to , with (x) = lab (x) ^ : rm(x) ^ (x) = lab (x) ^ rm(x) 1 (x; y) = leaf(x) ^ (x; y) ^ leaf(y) where (x; y) = 9z (9zl (edg1 (z; zl ) ^ path2 (zl ; x)) ^ 9zr (edg2 (z; zr ) ^ path1 (zr ; y))) : For leaves x and y, the formula (x; y) checks that y directly follows x in the left-to-right order of leaves. Note that in this formula, z is the least common ancestor of x and y, and zl and zr are its two children. The formula path1 (x; y) is analogous to path2 (x; y), as explained above. 15 Note that we could as well have taken 1 (x; y) = (x; y), because, by denition, an edge in the output graph is only drawn if both nodes it is incident to exist, so this need not be checked explicitly. However, as observed before, such a check can always be added to 1 (x; y), and that is what we have done here. Note also that T is in mso-tt, but not in mso-tgtdir; in fact, it is not dicult to see that, for every T 2 mso-tgtdir and every input tree t, the height of T (t) is linear in the height of t, where the constant is the copy number of T (cf. the similar statement on size in the previous example). Composition We end this section by stating a well-known and basic prop- erty of mso denable graph transductions, and its consequences for the specic transductions considered in this paper. It is proved, e.g., in Proposition 3.2(2) of [Cou3]. Proposition 1. mso-gt is closed under composition. An immediate consequence of this proposition is that mso-tt is closed under composition, and that mso-tt mso-tgt mso-tgt. To see that also mso-gtdir , mso-ttdir , and mso-rel are closed under composition, it suces to know the following about the proof of Proposition 1. If T 0 and T 00 are mso graph transducers with copy sets C 0 and C 00 , then their composition is realized by an mso graph transducer T with copy set C 00 C 0 . Moreover, for every input graph g1 , the output graphs Tgr (g1 ) and Tgr00 (Tgr0 (g1 )) are isomorphic, with node ((c00 ; c0 ); u) corresponding to node (c00 ; (c0 ; u)). Proposition 2. mso-gtdir, mso-tt, mso-ttdir, and mso-rel are closed under composition; mso-tt mso-tgt mso-tgt and mso-ttdir mso-tgtdir mso-tgtdir . It will follow from our results that mso-tgtdir is also closed under composition (see Section 9), thus strengthening the last inclusion in Proposition 2. It is straightforward to show that mso-tgt is not closed under composition, by a size argument similar to the one in Example 1(5, binary). 4 Attributed tree transducers We will show that mso denable term transductions can be computed by attribute grammars, and that mso denable tree transductions can be computed by sur attribute grammars. To obtain precise characterizations, we consider two types of tree transducers that are based on attribute grammars: the relabeling attribute grammar (see [BloEng]) and the attributed tree transducer (see [Ful, EngFil, FHVV, KuhVog1]). We prove that mso term graph transducers have the same power as two-stage attribute grammars, of which the rst stage is a relabeling attribute grammar, and the second stage is an attributed tree transducer. For mso tree transducers the second phase is sur. Such two-stage 16 attribute grammars will be called attributed tree transducers with look-ahead (attR ). A relabeling attribute grammar is a quite restricted type of tree transducer: it only changes the labels of the nodes of the input tree. All its attributes have nitely many values, and, for each node of a given input tree, the new label of the node is the value of its meaning attribute. Since it is nite-valued, a relabeling attribute grammar can be viewed as a nite-state tree automaton that relabels the nodes of the tree. It is the look-ahead stage of an attR . An attributed tree transducer is a much more powerful device. It is an attribute grammar in which all attribute values are trees and the semantic rules are limited to involve substitution only. For a given input tree, the output tree is the value of the meaning attribute at the root. It is the computation stage of an attR . In this section we rst dene relabeling attribute grammars, attributed tree transducers, and attributed tree transducers with look-ahead. Then we discuss a normal form for attributed tree transducers, and we dene the notion of a semantic graph for attributed tree transducers in this normal form. 4.1 Formal denitions Let and be operator alphabets. A relabeling attribute grammar from to is a nite-valued noncircular attribute grammar G = (; S; I; ; W; R; m) with W (m ) = , such that for every t 2 T and u 2 Vt , decG;t (hm ; ui) has the same rank as labt (u). The tree transduction computed by G is the total function r(G) = f(t; t0 ) 2 T T j Vt = Vt ; Et = Et ; and labt (u) = decG;t (hm ; ui) for all u 2 Vt g; r(G) is called an attributed relabeling. The set of all attributed relabelings is denoted att-rel. In Section 6.1 we will show that att-rel equals the class mso-rel of mso relabelings. Let ; be operator alphabets (the input and output alphabet, respectively). An attributed tree transducer (att, for short) from to is an attribute grammar G = (; S; I; ; W; R; m) with the following two properties: { = fTg, and { every semantic rule is of the form h0 ; i0 i = f (h1 ; i1 i; : : : ; hk ; ik i); where for all t1 ; : : : ; tk 2 T , f (t1 ; : : : ; tk ) = r[1 7! t1 ; : : : ; k 7! tk ] for some linear r 2 T (f1 ; : : : ; k g). For the notation and terminology used for r see the end of Section 2.1. Note that we did not require G to be noncircular. Although we are only interested in noncircular attributed tree transducers, we need circular ones as a technical tool (see Sections 7 and 8). A circular attributed tree transducer only translates input trees with a noncircular dependency graph. 0 17 0 0 The tree transduction computed by attributed tree transducer G is the partial function G from T to T such that for every tree t 2 T with noncircular dependency graph D(t), G (t) = decG;t (hm ; root(t)i). The class of all tree transductions that can be computed by noncircular attributed tree transducers is denoted att. Note that att contains total functions only. The class of tree transductions computed by noncircular attributed tree transducers that satisfy the single use restriction (sur), is denoted attsur . The corresponding classes of tree transductions computed by os attributed tree transducers are denoted attos and attos;sur , respectively. From here on, we will identify the right-hand side f (h1 ; i1 i; : : : ; hk ; ik i) of a semantic rule as above, with the tree r[1 7! h1 ; i1 i; : : : ; k 7! hk ; ik i]. Note that, since the hj ; ij i are all distinct (see Section 2.3), this is a linear term in T ((I [ S ) N ). As an example, with 2 2 and c 2 0 , the semantic rule h; 0i = f (h; 1i; h; 2i), with f (t1 ; t2 ) = (t1 ; (c; t2 )) is identied with h; 0i = (h; 1i; (c; h; 2i)): Note that f (t1 ; t2 ) = r[1 7! t1 ; 2 7! t2 ] for r = (1 ; (c; 2 ))). As a second example, the semantic rule h; 0i = f (h; 1i) with f (t) = t = 1 [1 7! t], is identied with h; 0i = h; 1i. In exactly the same way, for an input tree t, the right-hand sides of semantic instructions in R(t) will be identied with trees in T (A(t)). Thus, for a decoration dec of t and an instruction h; ui = r, dec(h; ui) is the result of substituting dec(h; vi) for every h; vi in r. Linearity of the terms used in the semantic rules is a convenient technical detail which is intuitively required for sur att's. For att's that are not sur it is an inessential restriction, because it can always be achieved by duplicating attributes. To see this, we give an example of a rule that is not linear, and transform it to two linear rules. With 2 2 , a typical nonlinear rule would be h; 0i = (h; 1i; h; 1i). We can make it linear by adding an attribute 0 with rule h0 ; 0i = h; 1i, and changing the rule for into h; 0i = (h0 ; 0i; h; 1i). Note that this transformation does not preserve the sur because h; 1i is used twice. An attributed tree transducer with look-ahead (attR , for short) from to is a pair G = (G1 ; G2 ) where G1 is a relabeling attribute grammar from to , and G2 is an attributed tree transducer from to , for some operator alphabet . The tree transduction computed by G is r(G1 ) G2 . The attR G is noncircular, os, or sur, if G2 is (with no restrictions on G1 except that it is noncircular). Obviously, the class of all tree transductions that can be computed by noncircular attributed tree transducers with look-ahead is att-rel att. Restricting the attR 's to be sur, or os, or both, they compute the classes att-rel attsur , att-rel attos , and att-rel attos;sur , respectively. We now give some examples, numbered (3){(6), corresponding to the mso term graph transducers of Example 1(3){(6). All examples are noncircular. 18 Example 2. (3, stars) The tree transduction T of Example 1(3, stars) is computed by an att G with one synthesized attribute , which is also the meaning attribute, and semantic rules R(a) = fh; 0i = ag and R() = fh; 0i = ((h; 1i); (h; 2i))g. Note that G is os and sur, and so T 2 attos;sur . (4, path) The tree transduction T of Example 1(4, path) cannot be computed by an attributed tree transducer, as proved in [FulVag]. We now show that it can be computed by an attributed tree transducer with look-ahead, i.e., that it is the composition of an attributed relabeling r(G1 ) and an attributed tree transduction G2 . The relabeling attribute grammar G1 has a synthesized attribute `ns' that counts the number of descendants that are labeled by , with W (ns) = f0; 1; manyg, where `many' means `more than one'. It has meaning attribute that adds to the label of each node the values of the ns-attribute of its children, with W () = fa; g [ f(; i; j ) j i; j 2 W (ns)g. The semantic rules of G1 are R() = f hns; 0i = hns; 1i + hns; 2i; h; 0i = (; hns; 1i; hns; 2i) g; where addition is dened in the obvious way (1 + 1 = many, and many plus anything equals many), R(a) = f hns; 0i = 0; h; 0i = a g; and R() = f hns; 0i = 1; h; 0i = g: The att G2 has one synthesized attribute , with semantic rules R(a) = f h; 0i = a g; R() = f h; 0i = g; and, for i; j 2 W (ns), 8 > <f h; 0i = 1(h; 1i) g if i = 1 and j = 0, R(; i; j ) = >f h; 0i = 2(h; 2i) g if i = 0 and j = 1, :f h; 0i = a g otherwise. It should be clear that T = r(G1 ) G2 , and so T 2 att-rel attos;sur . (5, binary) The term transduction T of Example 1(5, binary) is computed by an att G with two synthesized attributes and 0 (where is the meaning attribute), and semantic rules R(a) = fh; 0i = a; h0 ; 0i = ag and R() = fh; 0i = (h; 1i; h0 ; 1i); h0 ; 0i = (h; 1i; h0 ; 1i)g. Thus, T 2 attos . Note that T cannot be computed by a sur attributed tree transducer with look-ahead, because for a transduction in att-rel attsur the size of the output tree is linear in the size of the input tree, where the constant is the number of attributes of the att times the maximal size of the right-hand sides of its semantic rules. Note also that if semantic rules with nonlinear right-hand sides were allowed, G could do without 0 , and have rules h; 0i = a for a, and h; 0i = (h; 1i; h; 1i) for . 19 (6, yield) Finally, we show that the tree transduction T of Example 1(6, yield) is in att-rel attsur , i.e., can be computed by a sur attR (G1 ; G2 ). The obvious idea is to construct an att G2 with a synthesized attribute `up' and an inherited attribute `down', that makes a depth-rst right-to-left walk through the input tree, adding one symbol to the output tree at each leaf. The only problem is that it has to treat the rightmost leaf dierently from the other leaves. Thus, the relabeling attribute grammar G1 will mark the rightmost leaf. The output alphabet of G1 is [ 0 . It has an inherited attribute `rm' with W (rm) = ftrue; falseg, root rule hrm; 0i = true, and, for 2 2 , internal rules hrm; 1i = false and hrm; 2i = hrm; 0i. It has meaning attribute with internal rule h; 0i = for 2 2 , and the following internal rule for 2 0 : ( hrm; 0i = true, h; 0i = ^ ifotherwise. Now the semantic rules of G2 are: for 2 2 , R() = f hdown; 2i = hdown; 0i; hdown; 1i = hup; 2i; hup; 0i = hup; 1i g; for 2 0 , R(^) = fhup; 0i = ^ g and R() = fhup; 0i = (hdown; 0i)g; and R(root) contains a \dummy" rule hdown; 0i = ?, where ? is any element of 0 . It should be clear that T = r(G1 ) G2 . Thus, T 2 att-rel attsur . Note that T cannot be computed by an os attributed tree transducer with lookahead, because for a transduction in att-rel attos the height of the output tree is linear in the height of the input tree, where the constant is the number of attributes of the att times the maximal height of the right-hand sides of its semantic rules. ut 4.2 Operator form To simplify proofs we will consider attributed tree transducers for which every right-hand side of a semantic rule contains exactly one output symbol. Here we show that this is, in a certain sense, a normal form for attributed tree transducers. An attributed tree transducer G from to is in operator form if every semantic rule of G is of the form h0 ; i0i = (h1 ; i1 i; : : : ; hk ; ik i) for some 2 k . Thus, for a tree t 2 T , the semantic instructions in R(t) are of the form h0 ; u0 i = (h1 ; u1i; : : : ; hk ; uk i), and a decoration dec of t should satisfy dec(h0 ; u0i) = (dec(h1 ; u1i); : : : ; dec(hk ; uk i)), accordingly. We rst show that for every att there is an equivalent one for which every right-hand side of a semantic rule contains at most one output symbol. Lemma3. For every att G there is an att G1 such that G1 = G and every semantic rule of G1 is either of the form h0 ; i0 i = (h1 ; i1 i; : : : ; hk ; ik i), for some 2 k , or of the form h0 ; i0 i = h1 ; i1i. If G is sur or os, then so is G1 . 20 Proof. The straightforward idea is to introduce new attributes for the subtrees of the right-hand side of a semantic rule. As an example, the semantic rule h; 0i = (h; 1i; (c; h; 2i)) can be changed into the semantic rules h; 0i = (h; 1i; h0 ; 0i), h0 ; 0i = (h00 ; 0i; h; 2i), and h00 ; 0i = c, where 0 and 00 are new synthesized attributes. Formally we transform G into G1 by iterating the following procedure. Let h; ii = r be a semantic rule such that r = (r1 ; : : : ; rj ; : : : ; rk ) and rj is not an attribute occurrence, i.e., rj 2= (S [ I ) N . Replace this semantic rule by the two semantic rules h; ii = (r1 ; : : : ; rj 1 ; h0 ; 0i; rj+1 ; : : : ; rk ) and h0 ; 0i = rj , where 0 is a new, synthesized, attribute. Moreover, since the new attribute is not used in the R()'s to which h; ii = r does not belong, it is dened there by a \dummy" rule h0 ; 0i = ?, where ? is any element of 0 . It should be clear that one such replacement does not change G and preserves the sur and os properties. Iteration of these replacements leads to an att of the required form. ut A semantic rule of the form h0 ; i0 i = h1 ; i1 i is called a copy rule. To put an att G from to into operator form, we just replace such a copy rule by the rule h0 ; i0 i = id(h1 ; i1 i), where `id' is a new output symbol of rank 1, the identity operator. The new att outputs trees over [ fidg from which then all occurrences of id have to be removed to obtain the output tree of G. For an operator alphabet and a symbol `id' of rank 1, not in , let id denote the operator alphabet [ fidg. For a tree t over id , the pruning of t is the tree prune(t) over which is obtained from t by pruning all occurrences of id. In other words, prune(id(t)) = prune(t), and, for 2 k , prune((t1 ; : : : ; tk )) = (prune(t1 ); : : : ; prune(tk )). The eect of the identity operator, in general, is stated in the following lemma. In the statement of the lemma we apply the function `prune' to righthand sides r of semantic rules of an att from to id . This is well dened because r is (identied with) a tree over the operator alphabet [ fidg [ ((I [ S ) N ). Lemma 4. Let G be an att from to id, and let Gprune be the att from to which is obtained from G by changing every semantic rule h; ii = r into h; ii = prune(r). Then Gprune = G prune. Moreover, G is os i Gprune is os, and, for t 2 T , G is sur on t i Gprune is sur on t. Proof. Obviously, a tree t 2 T has the same dependency graph D(t) in both G and Gprune. Hence, Gprune is sur on t i G is sur on t, and Gprune is noncircular on t i G is noncircular on t. For such a noncircular t, it is straightforward to show that the mapping decG;t prune is a decoration of t for Gprune. Thus, since t has a unique decoration, decGprune ;t = decG;t prune, and so Gprune = G prune. ut We now show in which sense operator form is a normal form for attributed tree transducers. Lemma 5. For every att G from to there is an att G0 in operator form from to id such that G = G 0 prune. If G is sur or os, then so is G0 . 21 Proof. Let G1 be the att obtained from G by Lemma 3, and change G1 into G0 by replacing every copy rule h0 ; i0 i = h1 ; i1i by h0 ; i0 i = id(h1 ; i1 i). Then 0 G0prune = G1 , and so, by Lemma 4, G = G1 = Gprune = G 0 prune. ut The operator form is used in Section 5 to facilitate the proof that every attributed tree transduction is an mso denable term transduction. In fact, for an att in operator form the output tree is the unfolding of a term graph which is closely related to the dependency graph of the input tree, and which is therefore mso denable in a straightforward way. This term graph is dened in the next subsection. 4.3 Semantic graphs If G is an att in operator form, and t is an input tree with noncircular dependency graph D(t), then the nodes and edges of D(t) can be labeled in a straightforward way such that, after reversing the direction of all its edges, it turns into a term graph of which the unfolding is the output tree G (t). This term graph will be called the \semantic graph" of t. In fact, an attribute of D(t) is labeled with the single operator 2 used in the semantic instruction that denes the attribute, and an edge of D(t) is labeled according to the order of the attributes in the instruction that denes the dependency. Moreover, the mark # is added to the label of hm ; root(t)i. We now turn to the formal denition. Let G be an att from to in operator form. The semantic graph SG (t) of a tree t 2 T is the graph SG (t) = (V; E; lab) over (# ; rki()), where V = A(t), and E and lab are determined as follows. If h0 ; u0 i = (h1 ; u1 i; : : : ; hk ; uk i) is a semantic instruction in R(t), then (h0 ; u0 i; j; hj ; uj i) is in E for every j 2 [1; k], lab(h0 ; u0i) = if h0 ; u0 i 6= hm ; root(t)i, and lab(h0 ; u0i) = (; #) if h0 ; u0i = hm ; root(t)i. It should be clear that SG (t) satises all requirements of a term graph over , except that it may be circular. Hence, for every t 2 T , SG (t) is a term graph over if and only if SG (t) is noncircular if and only if D(t) is noncircular. Moreover, if SG (t) is a term graph, then root(SG (t)) = hm ; root(t)i. Also, SG (t) is a forest if and only if G is sur on t. We now show that the output tree is the unfolding of the semantic graph of the input tree. Lemma6. Let G be an att from to in operator form, and let t 2 T be an input tree with noncircular D(t). Then G (t) = unfold(SG (t)). If, moreover, G is sur on t, then G (t) = clean(SG (t)). Proof. Let S = SG (t). Since D(t) is noncircular, S is a term graph over . Dene the mapping dec : A(t) ! T such that dec(h; ui) = unf S (h; ui). Since t has a unique decoration decG;t , it suces to show that dec is a decoration of t. When this is established, it follows that decG;t = dec, and hence that G (t) = decG;t(hm ; root(t)i) = unf S (hm ; root(t)i) = unf S (root(S )) = unfold(S ). A mapping is a decoration of t if all semantic instructions in R(t) are obeyed. If R(t) contains a semantic instruction h0 ; u0 i = (h1 ; u1i; : : : ; hk ; uk i), then 22 labS (h0 ; u0 i) = (or (; #)) and (h0 ; u0i; j; hj ; uj i) 2 ES for all j 2 [1; k], and hence unf S (h0 ; u0 i) = (unf S (h1 ; u1i); : : : ; unf S (hk ; uk i)), i.e., dec obeys this semantic instruction. If G is sur on t, then SG (t) is a forest. Clearly, unfold( ) = clean( ) for any forest . ut Example 3. To illustrate Lemma 6, we give a simple example of a (sur) attributed tree transducer G in operator form. Given a tree t, it reproduces the monadic tree that constitutes the path from the root of t to its leftmost leaf, cf. Example 2(4, path). The input alphabet of G is = 2 [ 0 , with 2 = fg and 0 = fa; bg, and the output alphabet is = 1 [ 0 , with 1 = fg and 0 = fa; bg. There is only one, synthesized, attribute . The rules are simple: the single rule in R() is h; 0i = (h; 1i), R(a) contains the single rule h; 0i = a, and R(b) the single rule h; 0i = b. For the tree t = ((a; a); (b; a)), a b a a Fig. 3. Dependency graph the dependency graph D(t) is given in Fig. 3, the semantic graph SG (t) in Fig. 4, and the output tree G (t) = ((a)) in Fig. 5. Note that since G is sur, unfold(SG (t)) = clean(SG (t)). ut 5 From att to mso In this section we will prove that every noncircular attributed tree transducer can be simulated by an mso term graph transducer, i.e., that att mso-tgt. By Lemma 5 it suces to show this for att's in operator form, provided we also show that mso-tgt fpruneg mso-tgt. Recall that for an mso term graph transducer T and an input tree t, T (t) = unfold(Tgr (t)); thus, the output tree is obtained by unfolding the term graph Tgr (t), which is the output graph of the mso graph transducer T . 23 (; #) 1 1 a 1 a a b a b a a Fig. 4. Semantic graph 1 1 a Fig. 5. Output tree Lemma7. For every noncircular attributed tree transducer G in operator form there is an mso term graph transducer T such that T = G . Moreover, if G is sur, then T is an mso tree transducer, and if G is os, then T is direction preserving. Proof. Let G = (; S; I; ; W; R; m) be a noncircular att in operator form with = fTg. We will abbreviate S [ I by A. By Lemma 6, G (t) = unfold(SG (t)) for every t 2 T . Thus, it suces to dene an mso graph transducer T such that, for every t 2 T , Tgr (t) = SG (t), the semantic graph of t. To simplify the description of this T , we introduce semantic graphs S (), for 2 , and S (root). These graphs are (partially) labeled versions of the usual dependency graphs D() and D(root), as dened in Section 2.3, with the edges reversed. They are the atomic graphs from which all semantic graphs SG (t) are built, just as the dependency graphs D(t) are built from the D() and D(root). For x 2 [ frootg, the semantic graph of x is the graph S (x) = (V; E; lab) over (; rki()) where V = A [0; rk()] if x = 2 and V = A f0g if x = 24 root, and E and lab are determined as follows. If h0 ; i0i = (h1 ; i1 i; : : : ; hk ; ik i) is a rule in R(x), then (h0 ; i0i; j; hj ; ij i) is in E for every j 2 [1; k], and lab(h0 ; i0 i) = . Note that lab is a partial function. The mso graph transducer dening the semantic graphs of the input trees of G is T = (A; f ; g2# ;2A; fj;; gj2rki();; 2A ); where the copy set, edge formulas, and node formulas are dened as follows (in that order). { The set of copy names is A = S [I , the set of attributes of G. For each node u of an input tree t, T makes a copy (; u) for every attribute , corresponding to node h; ui of SG(t). Thus, all these copies are nodes of the output graph SG (t). { The edge formulas check for a dependency between attributes. In particular, the edge formula j;; (x; y) checks the semantic rules to see if there is a semantic instruction that denes h; xi in terms of h0 ; yi. In what follows edg0 (x; y) stands for x = y. For all j 2 rki() and ; 0 2 A, the edge formula j;; (x; y) is dened to be the disjunction of all formulas 9z (lab (z ) ^ edgi (z; x) ^ edgi (z; y)), for all 2 and i; i0 2 [0; rk()] such that (h; ii; j; h0 ; i0 i) is an edge of S (), and the formula root(x) ^ x = y in case (h; 0i; j; h0 ; 0i) is an edge of S (root). { The node formulas determine the operators in by which the attributes are dened, and they determine the root of the semantic graph SG (t). Again, edg0 (x; y) stands for x = y. Let 2 and 2 A. We rst consider the case that 6= m . Then the node formula (;#);(x) is dened to be false, and the node formula ; (x) is dened to be the disjunction of all formulas 9z (lab (z ) ^ edgi (z; x)), for all 2 and i 2 [0; rk()] such that h; ii has label in S (), and the formula root(x) in case h; 0i has label in S (root). 0 (x) be the disjunction of all To dene the node formulas for m , let ; m formulas 9z (lab (z ) ^ edgi (z; x)), for all ; i such that hm ; ii has label in 0 (x), and ;m (x) = : root(x) ^ S (). Then (;#);m (x) = root(x) ^ ; m 0 (x). ;m It should be obvious that the above formulas closely correspond to the denition of SG(t) in Section 4.3. More exactly, for 2 and 6= m , (t; u) j= ;(x) i there is a semantic instruction of the form h; ui = (: : : ) 2 R(t), and similarly for the other node formulas. Also, (t; u; v) j= j;; (x; y) i there is a semantic instruction in R(t) of the form h; ui = (: : : ; h0 ; vi; : : : ) where h0 ; vi is the j -th argument of . This implies that Tgr (t) = SG (t) for every t 2 T . If G is os and (h; ui; j; h; vi) is an edge of SG(t), then v is a descendant of u (in fact, either v = u or v is a child of u). Hence T is direction preserving in that case. Now assume that G is sur. Then, for every tree t 2 T , SG (t) is a forest. Thus, T need not be a tree transducer. However, in this case G (t) = clean(SG (t)) 0 0 0 0 0 0 25 for every t 2 T , by Lemma 6. Since `clean' is denable by a direction preserving mso graph transducer, as shown in Example 1(1, clean), we can replace T by the mso tree transducer T 0 such that T 0 = Tgr0 = Tgr clean which can be found by Proposition 1 (mso-gt is closed under composition). Then G (t) = T 0 (t) for every t 2 T . If, moreover, G is os, then T 0 is direction preserving by Proposition 2 (mso-gtdir is closed under composition). ut It now remains to be shown that for every term graph transducer T from to id there is a term graph transducer T 0 from to such that T 0 (t) = prune(T (t)) for every t 2 T . Recall from Section 4.2 that the mapping `prune' removes all occurrences of `id' from a tree. Thus, by Proposition 1, it suces to construct an mso graph transducer which denes a mapping that extends `prune' to term graphs, in such a way that unfold(prune( )) = prune(unfold( )) for every term graph . We rst dene the extension of `prune' to term graphs. It is convenient to dene it for clean term graphs only. Let t be a clean term graph over id. Then the (clean) term graph prune(t) over is dened as follows. For a node u of t, let the \forward node" of u, denoted fw(u), be the rst descendant of u that does not have label id. Formally, if labt (u) 2 then fw(u) = u, and if labt (u) = id then fw(u) = fw(u 1). Now we dene Vprune(t) = ffw(u) j u 2 Vt g = fu 2 Vt j labt (u) 2 g, Eprune(t) = f(u; i; fw(v)) j u 2 Vprune(t) ; (u; i; v) 2 Et g, and labprune(t) (u) = labt (u) for every u 2 Vprune(t) . Obviously this denes the old prune(t) for trees over id. It should be clear that root(prune(t)) = fw(root(t)) for every clean term graph t; this is because, in t, every node u with labt (u) 2 is a descendant of fw(root(t)). It should also be clear that unfold(prune(t)) = prune(unfold(t)). In fact, it is straightforward to show from the denitions that unf prune(t) (fw(u)) = prune(unf t (u)) for every u 2 Vt , by induction on u. The result then follows by taking u = root(t) and using the fact that fw(root(t)) = root(prune(t)). The next lemma shows that the mapping `prune' is an mso denable graph transduction. Lemma8. Let be an operator alphabet. There is a direction preserving mso graph transducer T such that Tgr (t) = prune(t) for every clean term graph t over id . Proof. It is not dicult to nd a formula (x; y) in MSOL2 (id ) such that, for every term graph t and nodes u; v 2 Vt , (t; u; v) j= (x; y) i v is the forward node of u, i.e., fw(u) = v. In fact, (x; y) = 0 (x; y) ^ : labid(y), where 0 (x; y) is obtained from the formula for path(x; y) in Section 2.2, by changing edg(x; y) into labid(x) ^ edg1 (x; y). The required mso graph transducer T has copy number 1, node formulas ( x ) = lab (x) for every 2 , and edge formulas i (x; y ) = 9z (edgi (x; z ) ^ (z; y)) for every i 2 rki(). ut We now prove the simulation of attributed tree transducers by mso term graph transducers. 26 Theorem 9. For every noncircular attributed tree transducer G there is an mso term graph transducer T such that T = G . Moreover, if G is sur, then T is an mso tree transducer, and if G is os, then T is direction preserving. Proof. Let G be a noncircular att from to . By Lemma 5 there is a noncircular att G0 in operator form from to id such that G = G 0 prune. Moreover, if G is sur or os, then so is G0 . By Lemma 7 there is an mso term graph transducer T 0 such that T 0 = G 0 . Moreover, if G0 is sur, then T 0 is an mso tree transducer, and if G0 is os, then T 0 is direction preserving. We may assume that, for every t 2 T , Tgr0 (t) is clean. In fact, if this is not the case, then, since unfold(clean( )) = unfold( ) for every term graph , we can replace T 0 by an mso graph transducer that denes Tgr0 clean which can be found by Propositions 1 and 2, and the fact that cleaning is denable by a direction preserving mso graph transducer, as shown in Example 1(1, clean). Let T be the mso term graph transducer with Tgr = Tgr0 prune which can be found by Proposition 1 and Lemma 8. Since unfold(prune( )) = prune(unfold( )) for every clean term graph (cf. the discussion before Lemma 8), we obtain, for t 2 T , that T (t) = unfold(Tgr (t)) = unfold(prune(Tgr0 (t))) = prune(unfold(Tgr0 (t))) = prune(T 0 (t)) = prune(G 0 (t)) = G (t). Note that if T 0 is an mso tree transducer then so is T (because prune transforms trees into trees), and if T 0 is direction preserving then so is T (by Proposition 2 and Lemma 8). ut 6 Characterizing unary and binary mso formulas To show that mso denable term transductions can be computed by attribute grammars, we will use the characterizations proved in [BloEng] of mso formulas (x) 2 MSOL1 ( ) and (x; y) 2 MSOL2 ( ), for trees in T . These characterizations are recalled in this section. The characterization of unary formulas is already in terms of attribute grammars, and we use this characterization to show that the class mso-rel of mso relabelings (dened in Section 3 as a \special case") is equal to the class att-rel of attributed relabelings (dened in Section 4.1). The characterization of binary formulas is in terms of tree-walking automata that employ unary formulas as tests. 6.1 Unary mso formulas Let (x) be a formula in MSOL1 ( ). It is proved in Theorem 18 of [BloEng] (and independently in [NevBus]) that there exists a nite-valued noncircular attribute grammar G = (; S; I; ; W; R; m) with W (m ) = ftrue; falseg such that for every t 2 T and u 2 Vt , decG;t (hm ; ui) = true i (t; u) j= (x). 27 This also holds the other way around: for every such attribute grammar there is a formula (x) 2 MSOL1 ( ) satisfying the above condition. We now use this characterization to show that the class mso-rel of mso relabelings is equal to the class att-rel of attributed relabelings. This implies that the look-ahead stage of an attR can as well be realized by an mso relabeling. Theorem 10. mso-rel = att-rel. Proof. We rst show mso-rel att-rel. Let T be an mso relabeling from to . By the above mentioned characterization of unary formulas, there exists for every node formula (x) of T a nite-valued noncircular attribute grammar G over such that decG ;t (hm; ; ui) = true i (t; u) j= (x), where m; is the meaning attribute of G . Let G be the relabeling attribute grammar obtained by taking the union of all G , in the obvious way, and adding a new meaning attribute m with W (m ) = and the semantic rule hm ; 0i = the unique 2 such that hm; ; 0i = true in every R(). It should be clear that r(G) = T . In fact, for every t 2 T , u 2 Vt , and 2 , labr(G)(t) (u) = i decG;t (hm ; ui) = i decG ;t (hm; ; ui) = true i (t; u) j= (x) i labT (t) (u) = . This also shows that G is indeed a relabeling attribute grammar, because decG;t (hm ; ui) has the same rank as labT (t) (u), which has the same rank as labt (u). Next we show att-rel mso-rel. Let G be a relabeling attribute grammar from to . We now claim (as in the rst part of the proof of Theorem 19 of [BloEng]) that for every 2 there is a formula (x) in MSOL1 ( ) such that, for every t 2 T and u 2 Vt , (t; u) j= (x) i labr(G)(t)(u) = . In fact, let G be the attribute grammar that is obtained from G by adding a new meaning attribute m; with W (m; ) = ftrue; falseg which has, for every 2 , the internal rule hm; ; 0i = (hm ; 0i = ). By the characterization of unary formulas mentioned above, there is a formula (x) such that (t; u) j= (x) i decG ;t (hm; ; ui) = true i decG;t (hm ; ui) = i labr(G)(t)(u) = . This proves the claim. Clearly, the formulas (x), 2 , uniquely determine an mso relabeling T such that labT (t) (u) = labr(G)(t)(u), i.e., T = r(G). ut 6.2 Binary mso formulas It is shown in [BloEng] that a formula (x; y) 2 MSOL2 ( ) can be computed by a tree-walking automaton. A tree-walking automaton (see, e.g., [AhoUll, EngRozSlu, KamSlu]) is a nite-state automaton that walks on a tree from node to node, following the edges of the tree (downwards or upwards). Here, and in [BloEng], we allow the tree-walking automaton to test any mso denable property of the current node, using formulas from MSOL1 ( ). Let A be a treewalking automaton corresponding to (x; y). Then, for every tree t 2 T and nodes u; v 2 Vt , (t; u; v) j= (x; y) if and only if A can walk from u to v in t, starting in an initial state and ending in a nal state. 28 We recall the formal denition of the tree-walking automata of [BloEng]. The instructions used by tree-walking automata are called directives (as in [KlaSch]). Let be an operator alphabet. The (innite) set of directives over is D = f#i j i 2 rki( )g [ f"i j i 2 rki( )g [ MSOL1 ( ): A directive is an instruction of how to move from one node to another: #i means \move along an edge labeled i", i.e., \move to the i-th child of the current node (provided it has one)"; "i means \move against an edge labeled i", i.e., \move to the parent of the current node (provided it has one, and it is the i-th child of its parent)"; and (x) means \check that holds for the current node (and do not move)". Formally, we dene for each t 2 T and each directive d 2 D the node relation Rt (d) Vt Vt , as follows: Rt (#i ) = f(u; v) j (u; i; v) 2 Et g; Rt ("i ) = f(u; v) j (v; i; u) 2 Et g; and Rt ( (x)) = f(u; u) j (t; u) j= (x)g: Syntactically, a tree-walking automaton is just an ordinary nite automaton (on strings) with a nite subset of D as input alphabet. However, the symbols of D are interpreted as instructions on the input tree as explained above. Let be an operator alphabet. A tree-walking automaton (with mso tests) over is a quintuple A = (Q; ; ; I; F ), where { Q is a nite set of states, { is the instruction set, a nite subset of D , { Q Q is the transition relation, the elements of which are called transitions, { I Q is the set of initial states, and { F Q is the set of nal states. For a tree-walking automaton A = (Q; ; ; I; F ) and a tree t, an element (q; u) of Q Vt is a conguration of the automaton. Intuitively, it denotes that A is in state q at node u. One step of A on t is dened by the binary relation A;t on the set of congurations, as follows. For every q; q0 2 Q and u; u0 2 Vt , (q; u) A;t (q0 ; u0 ) i 9d 2 : (q; d; q0 ) 2 and (u; u0) 2 Rt (d): To indicate the directive that is executed by A in this step, we also write d (q; u) A;t (q0 ; u0 ). For each tree t 2 T , A computes the node relation Rt (A) = f(u; v) 2 Vt Vt j (q; u) A;t (q0 ; v) for some q 2 I and q0 2 F g. Thus, A computes all pairs of nodes (u; v) such that A can walk from u to v in t, starting in an initial state and ending in a nal state. Example 4. Let = 0 [ 2 , and consider the edge formula 1 (x; y) of the mso tree transducer of Example 1(6, yield). For nodes x and y of a tree t 2 T , it checks that x is a leaf and y is the next leaf after x, in the left-to-right order of the 29 leaves of t. The following tree-walking automaton A = (Q; ; ; I; F ) computes Rt (A) = f(u; v) j (t; u; v) j= 1 (x; y)g, for every t 2 T , i.e., it walks from one leaf to the next. The transition graph of A is depicted in Fig. 6, in the way usual for nite automata. Thus, A has states Q = fi; ii; iii; iv; vg, initial state I = fig, I leaf(x) II "1 III #2 "2 IV leaf(x) V #1 Fig. 6. The tree-walking automaton for 1(x; y) and nal state F = fvg. The instruction set is = f#1; #2 ; "1; "2 ; leaf(x)g, and the transition relation consists of the transitions (i; leaf(x); ii), (ii; "2; ii), (ii; "1 ; iii), (iii; #2 ; iv), (iv; #1; iv), and (iv; leaf(x); v). On a given tree, the automaton A rst checks that it is at a leaf. If so, it walks to the next leaf along the shortest undirected path: it walks upwards over 2-labeled edges as far as possible (in state ii), moves to the other child of its parent (through state iii), and walks downwards along 1-labeled edges as far as possible (in state iv). ut The characterization of binary mso formulas by tree-walking automata is stated next; it is Theorem 13 of [BloEng]. For a formula (x; y) in MSOL2 ( ) and a tree t 2 T , we denote by Rt () the node relation f(u; v) 2 Vt Vt j (t; u; v) j= (x; y)g. The characterization says that binary mso formulas and tree-walking automata compute the same node relations. We also need a special case of this. We will say that a formula (x; y) in MSOL2 ( ) is direction preserving if (t; u; v) j= (x; y) implies that v is a descendant of u, for every t 2 T and u; v 2 Vt . Moreover, we will say that a tree-walking automaton is descending if it does not have directives "i in its instruction set. Proposition 11. For every formula (x; y) 2 MSOL2( ) there is a tree-walking automaton A over such that, for all t 2 T , Rt (A) = Rt (). The other way around, for every tree-walking automaton A over there is a formula (x; y) 2 MSOL2 ( ) such that, for all t 2 T , Rt () = Rt (A). The same two statements hold for direction preserving formulas and descending automata. The special case of this characterization was not explicitly stated in [BloEng]. From automata to formulas it is obvious because a descending automaton always walks from a node to one of its descendants. From formulas to automata, it was shown in the proof of Theorem 13 of [BloEng] that, in general, the automaton can always compute (t; u; v) j= (x; y) by walking from u to v along the shortest undirected path in t. Thus, if is direction preserving, the automaton walks downwards only. 30 Determinism The formulas i (x; y) used in an mso term graph (or tree) trans- ducer are all functional, i.e., the node relation Rt (i ) is functional for every input tree t. This follows, of course, from the fact that every node u of a term graph has a unique i-th child u i. It is shown in [BloEng] that such relations can be computed by deterministic tree-walking automata. This will be essential in the proof of the simulation of mso term graph transducers by attribute grammars, in the next section. A tree-walking automaton A = (Q; ; ; I; F ) over is deterministic if the following three conditions hold: (1) I is a singleton, (2) if (q; d; q0 ) 2 , then q 2= F , and (3) for every tree t 2 T and every conguration (q; u), there is at most one conguration (q0 ; u0) such that (q; u) A;t (q0 ; u0 ). Note that the automaton of Example 4 is deterministic. In [BloEng] a stronger denition of determinism is given (which is not satised by the automaton of Example 4). The above denition suces for our present purposes. A tree-walking automaton A is functional if the node relation Rt (A) is functional for every input tree t. Obviously, every deterministic tree-walking automaton is functional. The next result is Theorem 14 of [BloEng]. Proposition 12. For every functional tree-walking automaton A over there is a deterministic tree-walking automaton A0 over with Rt (A0 ) = Rt (A) for every t 2 T . Moreover, if A is descending then so is A0 . As for Proposition 11, the second statement was not explicitly stated in Theorem 14 of [BloEng], but is obvious from its proof. 7 From mso to att In this section we prove the more dicult part of our main result, viz. that every mso term graph transducer can be simulated by an attributed tree transducer with look-ahead. The constructed attR is, however, not necessarily noncircular (though it is, of course, noncircular on every output tree of its relabeling attribute grammar). Moreover, for an mso tree transducer, the attR is not necessarily sur though it is sur on every output tree of its relabeling attribute grammar. These problems are taken care of in the next section. One of the essential dierences between attributed tree transducers and mso term graph transducers is that the former operate locally, i.e., attributes of a node can only be computed in terms of attributes of its neighbors (viz. its children, its parent, or itself), whereas the latter operate \globally", i.e., edges of the output graph can be established between (copies of) nodes of the input tree that are not necessarily neighbors, but may be far apart. On the other hand, attributed tree transducers have copy rules by which they can transport attribute values through the tree. Thus, to facilitate the simulation of mso term graph transducers by attributed tree transducers, we rst prove that every mso term graph transducer can, in a certain sense, be simulated by one that is \local", i.e., establishes edges between (copies of) neighbors only. Moreover, we want to forbid term graphs 31 with multiple edges; this is related to the requirement in att's that the righthand side of a semantic rule should be linear. Finally, we require that the root of the output term graph is a copy of the root of the input tree. An mso term graph transducer T = (C; ; X ) from to is local if the following three conditions hold: { For every edge formula j;c;c (x; y), every t 2 T , and every u; v 2 Vt, if (t; u; v) j= j;c;c (x; y) then v = u or v is a child of u or v is the parent of u. { For every t 2 T , Tgr(t) has no multiple edges, i.e., no edges ((c; u); j; (c0 ; v)) and ((c; u); i; (c0 ; v)) with j 6= i. { There exists cm 2 C such that for every t 2 T , root(Tgr(t)) = (cm; root(t)). To alleviate the locality restriction, we will allow a local mso term graph transducer to use the identity operator `id'. This corresponds to the use of copy rules in attributed tree transducers, cf. Section 4.2 where the identity operator was used to put att's into operator form. The formulation of the next lemma is similar to that of Lemma 5. Lemma13. For every mso term graph transducer T from to there is a local mso term graph transducer T 0 from to id such that T = T 0 prune. If T is an mso tree transducer then so is T 0, and if T is direction preserving then so is T 0. 0 0 Proof. Let T = (C; ; X ) be an mso term graph transducer from to . We may assume, by Propositions 1 and 2 and Example 1(1, clean), that T produces clean term graphs only, i.e., Tgr (t) is clean for every t 2 T . We rst discuss the intuitive ideas behind the construction of T 0. Let t 2 T , = Tgr (t), and 0 = Tgr0 (t). The (clean) term graph 0 will have the same nodes (c; u) 2 C Vt as , but it will have additional id-labeled nodes, in such a way that prune( 0 ) = (see Section 5 for the denition of `prune'). Then T (t) = unfold( ) = unfold(prune( 0 )) = prune(unfold( 0 )) = prune(T 0 (t)), as required. The new, id-labeled, nodes of 0 and the edges of 0 are constructed as follows. Consider an edge (c; u) !j (c0 ; v) in . The dicult point for T 0 is that the nodes u and v may be far apart in t. Thus, in 0 the edge will be divided into small pieces, by introducing id-labeled nodes that are copies of nodes of t on an undirected path from u to v, see Fig. 7. In particular, the edge is replaced by a path in 0 of the form 1 (c; u) !j (c1 ; u1) ! !1 (cn ; un ) !1 (c0 ; v) where (c1 ; u1 ); : : : ; (cn ; un ) are id-labeled nodes, n 1, c1 ; : : : ; cn are new copy names, and u1 ; : : : ; un are the nodes on an undirected path in t from u to v. More precisely, u1 = u, un = v, and ui+1 = ui or ui+1 is a child of ui or ui+1 is the parent of ui , for every i 2 [1; n 1]. If we view (c; u) and (c0 ; v) as having \value" unf (c; u) and unf (c0 ; v), respectively, then the value of (c0 ; v) is transported backwards along the above path, from v to u, in order to be used in the value of (c; u). The problem for T 0 is that, to transport this value through the tree, it has 32 (cm ; root(t)) 1 1 (c; u) 2 2 1 (c; u) (c0 ; v) 2 (c0 ; v) 12 3 Fig. 7. To the left: = Tgr(t), to the right: = Tgr(t), where dashed lines symbolize 0 0 paths in corresponding to edges of , plus the root path 0 to know the proper route from u to v. Now, since (c; u) !j (c0 ; v) in , the edge formula j;c;c (x; y) of T is satised by (t; u; v). Hence, by the characterization of binary mso formulas in Proposition 11, there is a tree-walking automaton A = Aj;c;c that knows how to walk from u to v. The formula j;c;c (x; y) is functional, because every node of has a unique j -th child, and so we may assume that the automaton is deterministic (see Proposition 12). Hence A has a unique walk (q; u) A;t (q0 ; v), starting at u in its initial state q and ending at v in one of its nal states q0 . If, in detail, the walk is (q1 ; u1) A;t (q2 ; u2 ) A;t A;t (qn ; un) (1) 0 0 0 with (q1 ; u1) = (q; u) and (qn ; un ) = (q0 ; v), then the nodes u1 ; : : : ; un are the ones that will be used in the above path in 0 . As the new copy names c1 ; : : : ; cn we will use ((j; c; c0 ); q1 ); : : : ; ((j; c; c0 ); qn ) which uniquely identify the states q1 ; : : : ; qn of the automaton A = Aj;c;c . Thus, the above path of 0 now has the following form, where e = (j; c; c0 ), qe = q1 = q is the initial state of A, (qi ; ui ) A;t (qi+1 ; ui+1 ) for every i 2 [1; n 1], and qn is a nal state of A: 0 1 1 (c; u) !j ((e; qe ); u) ! ((e; q2 ); u2 ) ! 1 ! ((e; qn 1 ); un 1) !1 ((e; qn ); v) !1 (c0 ; v): (2) It should be clear that in this way the rst locality requirement for T 0 is satised. The second is also satised because if j 6= i then e = (j; c; c0 ) diers from e0 = (i; c; c00 ), and hence ((e; qe ); u) 6= ((e0 ; qe ); u). Finally, to satisfy the third locality requirement, viz. that root( 0 ) = (cm ; root(t)), we introduce a new copy name cm and we add to 0 the \root path" 0 1 (cm ; u1) ! !1 (cm ; un) !1 (c; u) 33 (3) where u1 = root(t), un = u, (c; u) = root( ), (cm ; u1); : : : ; (cm ; un ) are id-labeled nodes, with n 1, and u1 ; : : : ; un are the nodes on the path in t from root(t) to u, see Fig. 7. In this way, the \value" unf (c; u) = unfold( ) is transported from u to root(t). We now turn to the formal denitions. In what follows we will denote elements (j; c; c0 ) of rki() C C by e. We will also denote rki() C C by E . Thus, T has an edge formula e (x; y) for every e 2 E . For every e 2 E , let Ae = (Qe ; e ; e ; fqe g; Fe ) be a deterministic tree-walking automaton such that for all t 2 T and u; v 2 Vt , (u; v) 2 Rt (A) i (t; u; v) j= e (x; y), according to Propositions 11 and 12 (recall that e (x; y) is functional). We will write e;t instead of Ae ;t . We will need an mso formula rootc (x), with c 2 C , such that, for t 2 T and u 2 Vt , (t; u) j= rootc (x) i (c; u) is the root of = Tgr (t). Since we have assumed that is clean, it should be clear that the root of is the unique node without incoming edges, and so we can take rootc (x) to be the conjunction of all formulas :9y(j;c ;c (y; x)), for j 2 rki() and c0 2 C . The local mso term graph transducer T 0 = (C 0 ; 0 ; X 0) from to id is now dened. The set of copy names of T 0 is C 0 = C [ f(e; q) j e 2 E; q 2 Qe g [ fcmg. The node formulas of T 0 are dened as follows. All node formulas that are not mentioned, are dened to be false. We also give some informal comments, related to the informal explanations above. 0 { For all 2 and c 2 C , 0 ;c (x) = ;c (x). This means that all nodes of are also nodes of 0 , with the same labels. { For all e 2 E and q 2 Qe, id0 ;(e;q)(x) is dened in such a way that, for all t 2 T and w 2 Vt , (t; w) j= id0 ;(e;q) (x) i there are nodes u; v 2 Vt such that (qe ; u) e;t (q; w) e;t (q0 ; v) for some q0 2 Fe . By Proposition 11 such a formula exists. In fact, applying Proposition 11 to the tree-walking automaton (Qe ; e ; e ; fq1 g; fq2g), which is Ae with start state q1 and nal state q2 , we obtain, for any two states q1 ; q2 of Ae , a formula q1 ;q2 (x; y) which expresses that (q1 ; x) e (q2 ; y). Then 0 id;(e;q) (x) = 9y; z (qe ;q (y; x) ^ q;Fe (x; z )); where q;Fe (x; z ) is the disjunction of all q;q (x; z ), for q0 2 Fe . Through this node formula, we only add an id-labeled node ((e; q); w) to 0 when it is needed on a path (2) of 0 that replaces an edge of , as discussed above, i.e., when the conguration (q; w) is on the walk (1) of Ae that determines that path. { id0 ;cm (x) is the disjunction of all formulas 9y(path(x; y)^rootc(y)), for c 2 C . Thus, an id-labeled node (cm ; w) is only added to 0 when it is needed on the root path (3). 0 The edge formulas of T 0 are now dened as follows. Again, edge formulas that are not mentioned, are dened to be false. 34 { For every e = (j; c; c0 ) 2 E , 0j;c;(e;q )(x; y) = (x = y). This edge formula establishes the rst edge of a path (2) of 0 that replaces an edge of . { For every e = (j; c; c0 ) 2 E and q 2 Fe , 01;(e;q);c (x; y) = (x = y). This edge formula establishes the last edge of a path (2) of 0 . { For every e 2 E and q; q0 2 Qe, e 0 01;(e;q);(e;q ) (x; y) = d1 (x; y) _ _ dn (x; y) where fd1 ; : : : ; dn g = fd 2 e j (q; d; q0 ) 2 e g and, for d 2 D , the formula d (x; y) is dened as follows: if d = #i then d (x; y) = edgi (x; y), if d = "i then d (x; y) = edgi (y; x), and if d = (x) then d (x; y) = ((x) ^ x = y). These edge formulas establish all other edges of a path (2) of 0 . Each such edge corresponds to one step of the automaton Ae in the walk (1). Note that, for every t 2 T , Rt (d ) = Rt (d). { 01;cm;cm (x; y) = edg(x; y), and for every c 2 C , 01;cm;c(x; y) = (rootc (x) ^ x = y). The second formula establishes the last edge on the root path (3) of 0 , and 0 the rst formula establishes all other edges on that path. This ends the denition of T 0. We will now show the correctness of T 0. To this aim, let again t 2 T , = Tgr (t), and 0 = Tgr0 (t). We have to prove that T 0 is local, that 0 is a clean term graph such that prune( 0 ) = , that 0 is a tree if is one, and that T 0 is direction preserving if T is. We rst make four easy observations. First, it is immediate from the definitions that the rst two requirements for locality hold for T 0 . Second, if T is direction preserving, then every edge formula e (x; y) of T is direction preserving, and so, by Propositions 11 and 12, Ae is descending. This means that Ae has no transitions (q; "i ; q0 ), which implies that all edge formulas of T 0 are direction preserving, i.e., that T 0 is direction preserving. Third, it is straightforward to see from the denition of A;t that, for every t 2 T and u; u0 2 Vt , (t; u; u0 ) j= 01;(e;q);(e;q ) (x; y) i (q; u) e;t (q0 ; u0). Fourth, since all automata Ae are deterministic, it is easy to see from the denitions (and the previous observation) that each id-labeled node of 0 has at most one outgoing edge, with label 1. To prove the remaining facts, we show that 0 has the special form that was suggested in the intuitive introduction of this proof, see Fig. 7. By the denition 0 (x), the nodes of 0 of the form (c; u) are exactly the nodes of , with the of ;c same labels. All other nodes of 0 have label id. To describe the id-labeled nodes and the edges of 0 , we associate with each edge (c; u) !j (c0 ; v) of a particular path in 0 as follows. Let e = (j; c; c0 ) and consider the walk (1) of the treewalking automaton A = Ae from u1 = u to un = v, with q1 = qe and qn 2 Fe . Such a walk exists because (t; u; v) j= e (x; y), and it is unique because Ae is 0 35 deterministic. Then we associate with (c; u) !j (c0 ; v) the path (2) of 0 . It is easy to check from the denition of T 0 that this path indeed exists in 0 . In fact, each id-labeled node ((e; qi ); ui ) exists because the walk (qe ; u) A;t (qi ; ui ) A;t (qn ; v) implies that (t; ui ) j= id0 ;(e;qi ) (x). The edges of the path are clearly established by the edge formulas of T 0; in particular, (qi ; ui ) A;t (qi+1 ; ui+1 ) implies (t; ui ; ui+1 ) j= 01;(e;qi );(e;qi+1 ) (x; y). Having associated a path of 0 with every edge of , we dene one additional path of 0 , the \root path". It is the path (3), where u1 = root(t), un = u, (c; u) = root( ), and u1 ; : : : ; un are the nodes on the path in t from root(t) to u. It is easy to check from the denition of T 0 that this root path exists in 0 . Having dened these special paths, we now claim that every id-labeled node and every edge of 0 is on one of these paths. Consider an id-labeled node ((e; q); w) with e = (j; c; c0 ). Since (t; w) j= id0 ;(e;q) (x), there are nodes u; v 2 Vt such that (qe ; u) e;t (q; w) e;t (q0 ; v) for some q0 2 Fe . Consequently (t; u; v) j= e (x; y) and so has an edge (c; u) !j (c0 ; v). This means that (qe ; u) e;t (q; w) e;t (q0 ; v) is in fact the unique walk (1) of Ae from u to v, and hence the node ((e; q); w) is on the path (2) associated with this edge of . A similar (but easier) argument shows that all id-labeled nodes of the form (cm ; w) are on the root path (3). Thus, every id-labeled node of 0 is on one of the special paths. Since, as observed above, an id-labeled node has at most one outgoing edge, all these edges are also on the special paths. The remaining edges of 0 are of the form (c; u) !j ((e; qe ); u) with e = (j; c; c0 ); obviously, the node ((e; qe ); u) has at most one such incoming edge, and hence this edge is also on a special path (2). This shows that also every edge of 0 is on one of the special paths. Having shown that 0 is of the above special form, we now prove the remaining facts. Since every node (c; u) has the same number of outgoing edges in 0 as in , with the same labels, 0 satises the second requirement for being a term graph. Moreover, the special paths of 0 do not contain cycles, because the Ae are deterministic and hence the walks (1) do not contain repetitions. A cycle of 0 would consist of consecutive special paths, and thus would give a cycle in . Hence 0 is noncircular. Finally, since is clean, 0 has root (cm ; root(t)), which shows that 0 is a clean term graph, and that the third locality requirement holds. It should now be clear, from the special form of 0 and the denition of `prune', that prune( 0 ) = . It remains to be shown that 0 is a tree if is one. It obviously suces, in view of the special form of 0 , to prove that the special paths have no idlabeled nodes in common. Suppose that the special paths associated with two j 0 distinct edges (c1 ; u1) !j (c01 ; v1 ) and (c2 ; u2 ) ! (c2 ; v2 ) of have an id-labeled node in common. Since id-labeled nodes have exactly one outgoing edge, these paths have to end at the same node (c01 ; v1 ) = (c02 ; v2 ). But then (c01 ; v1 ) has two incoming edges in , contradicting the fact that is a tree. ut We now prove that every mso term graph transducer can be simulated by a, possibly circular, attributed tree transducer with look-ahead. More precisely, 0 36 we show the following theorem. Theorem 14. For every mso term graph transducer T from to there exist an attributed relabeling r and an attributed tree transducer G such that, for every tree t 2 T , G is noncircular on r(t) and G (r(t)) = T (t). Moreover, if T is an mso tree transducer, then G is sur on r(t) for every t 2 T , and if T is direction preserving, then G is os. Proof. Let T = (C; ; X ) be an mso term graph transducer from to . We may assume, by Lemmas 13 and 4, that T is local. Let cm be the copy name that satises the third locality requirement. Based on the characterization of unary mso formulas, Theorem 10 shows that it suces to dene an mso relabeling r instead of an attributed relabeling. The mso relabeling r will be used to compute the truth values of the node and edge formulas of T . Since T is local, the truth values of the edge formulas can indeed be stored in the labels of the nodes. Thus, r will be of the type discussed in Example 1(2, relab), from to 0 = ftrue; falsegn , where each symbol 0 = (; b1 ; : : : ; bn ) stores truth values for the unary formulas 1 (x); : : : ; n (x) that we wish to consider. To avoid having to order these formulas, we will write [i (x)] for bi . The formulas that determine r are all the node formulas ;c(x) 2 , and all the formulas j;c;c (x i; x i0 ) with j;c;c (x; y) 2 X , and i; i0 2 f0g[ rki() [f"g. What we mean by the latter formulas is the following. For i 2 rki(), x i denotes the i-th child of x, x0 denotes x itself, and x " denotes the parent of x. Formally, j;c;c (x i; x i0 ) is the formula 9y; z (edgi (x; y) ^ edgi (x; z ) ^ j;c;c (y; z )), with edg0 (x; y) = (x = y) and edg" (x; y) = edg(y; x). The att G = ( 0 ; S; I; ; W; R; m), with = fT g, is dened as follows. For each copy name c of T , G has a synthesized attribute c and an inherited attribute c . The meaning attribute m of G is cm . For t 2 T and u 2 Vt , the attribute hc ; ui of r(t) will have the value unf (c; u) where = Tgr (t). Since T is local, the rst two locality requirements imply that 0 0 0 0 0 0 unf (c; u) = (unf (c1 ; u1 ); : : : ; unf (ck ; uk )) where each ui is either u itself or one of its children or its parent, and the nodes (c1 ; u1); : : : ; (ck ; uk ) of are all distinct. Thus, the (synthesized) attribute hc ; ui depends on the -attributes of itself, its children, and its parent. If T is direction preserving, then there is no dependency on the attributes of the parent, and the inherited attributes c are not needed, i.e., G is os. Otherwise, the inherited attribute c is used to transport the attribute c of a node down to its children, by a copy rule, cf. Fig. 8. We now dene the semantic rules of G. All root rules of G are \dummy" rules, i.e., rules hc ; 0i = ?, where ? is any element of 0 . For 0 2 0 , the semantic rules in R(0 ) are dened as follows. 37 c u j c ui c 0 Fig. 8. Edge of the output graph (dashed line) and attribute dependencies (solid lines), for an edge from child to parent { If [ ;c (x)] = true and [j;c;cj (x 0; x ij )] = true for all j 2 [1; k] (where k 2 rki(), 2 k , c; c1 ; : : : ; ck 2 C , i1; : : : ; ik 2 [0; rk(0 )] [ f"g, and (c1 ; i1 ); : : : ; (ck ; ik ) are all distinct), then R(0 ) contains the rule hc ; 0i = (hc1 ; m1 i; : : : ; hck ; mk i) where hcj ; mj i = hcj ; ij i if ij 2 [0; rk(0 )] and hcj ; mj i = hcj ; 0i if ij = ". 0 0 { If [j;c ;c(x i; x 0)] = true (where i 2 [1; rk(0 )], j 2 rki(), and c0; c 2 C ), then R(0 ) contains the copy rule hc ; ii = hc ; 0i. Attributes hc ; 0i or hc ; ii for which no semantic rule is dened above, are dened by dummy rules hc ; 0i = ? or hc ; ii = ?, respectively. Attributes for 0 0 which more than one semantic rule is dened above, are also dened by dummy rules (this can only happen when 0 does not occur in any tree r(t)). Note that in the rst item above, hc1 ; m1 i; : : : ; hck ; mk i are all distinct, i.e., the right-hand side of the semantic rule is linear, as required. This ends the denition of G. To show the correctness of G, let t 2 T and = Tgr (t). The denition of the attributed relabeling r implies the following facts. For a node u of r(t) with label 0 , [ ;c (x)] = true i (t; u) j= ;c(x) i (c; u) is a node of with label . Also, [j;c;c (x i; x i0 )] = true i (t; u i; u i0) j= j;c;c (x i; x i0 ) i (c; u i) !j (c0 ; u i0 ) is an edge of , where u " denotes the parent of u. We will refer to these facts as \the correctness of r". Suppose that G is noncircular on r(t). Dene the mapping dec : A(r(t)) ! T as follows. For c 2 C and u 2 Vt , 0 0 0 0 { dec(hc; ui) = unf (c; u) if (c; u) 2 V , and ? otherwise, and { dec(hc; ui) = unf (c; v) where v is the parent of u, if ((c0 ; u); j; (c; v)) 2 E for some c0 2 C and j 2 rki(), and ? otherwise. The correctness of r implies that dec is a decoration of r(t), and consequently dec = decG;r(t), and so, using the third locality requirement, we obtain that G (r(t)) = dec(hm ; root(r(t))i) = unf (cm ; root(t)) = unf (root( )) = unfold( ) = T (t): 38 To prove that G is noncircular on r(t), and that G is sur on r(t) if is a tree, we consider the semantic graph SG (r(t)), where G0 is the att in operator form which is obtained from G by changing every copy rule hc ; ii = hc ; 0i into hc ; ii = id(hc ; 0i), cf. Lemma 5 and its proof. It now suces to show that SG (r(t)) is noncircular, and that SG (r(t)) is a forest if is a tree. To prove these facts, we consider the edges of SG (r(t)). By the denition of G and the correctness of r, the edges of SG (r(t)) are of one of the following three forms 0 0 0 0 0 : hc ; ui !j hc ; u ii with i 2 N and (c; u) !j (c0 ; u i) in , : hc ; ui !j hc ; ui with (c; u) !j (c0 ; u ") in , or : hc ; ui !1 hc ; u "i, where, again, u " denotes the parent of u. Moreover, if edge () is in SG (r(t)) then so is edge ( ) for some c 2 C . This shows that a cycle in SG (r(t)) is changed into a cycle in by changing every edge hc ; ui !j hc ; u0i into (c; u) !j (c0 ; u0), and every two consecutive 1 hc ; u "i into (c; u) !j (c0 ; u "). Since is noncircuedges hc ; ui !j hc ; ui ! lar, so is SG (r(t)), which means that G is noncircular on r(t). It also shows that if a node hc ; ui of SG (r(t)) has two incoming ()-edges, then (c; u) has two incoming edges in . Similarly, if a node hc ; ui of SG (r(t)) has two incoming edges (which are necessarily ( )-edges), then (c0 ; u ") has two incoming edges in from two nodes (c1 ; u) and (c2 ; u). If hc ; vi has two incoming ()-edges, then (c0 ; v) has two incoming edges from nodes (c1 ; u1 ) and (c2 ; u2 ), for two children u1 ; u2 of v. Finally, if hc ; ui has an incoming ()-edge and an incoming ()-edge, then (c; u) has two incoming edges from nodes (c1 ; u1 ) and (c2 ; u2 ), where u2 is a child of u and u1 is not. Altogether this shows that if SG (r(t)) has a node with two incoming edges, then so has . Thus, if is a tree, then SG (r(t)) is a forest, which means that G is sur on r(t). This proves the theorem. We nally note that can in fact be obtained from SG (r(t)) by rst removing all (isolated) nodes hc ; ui of SG (r(t)) with (c; u) 2= V , and then applying `prune' (and removing the root mark # if has none). ut 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 An Example In the remainder of this section we give an example of the implementation of an mso term graph transduction by an attributed tree transducer with look-ahead, as described in Lemma 13 and Theorem 14. Consider the mso tree transducer T = (fcg; ; X ) of Example 1(6, yield) which transforms a binary tree into its yield, viewed as a monadic tree. Recall that T is from = 0 [ 2 to = 0 [ 1 , with 1 = 0 and 0 = f^ j 2 0 g. Recall also that T has the following node and edge formulas: for 2 0 , ;c (x) = lab (x) ^ : rm(x) for 2 0 , ^ ;c (x) = lab (x) ^ rm(x) 1;c;c(x; y) = leaf(x) ^ (x; y) ^ leaf(y); 39 where the formula (x; y) checks that y directly follows x in the left-to-right order of leaves, and the formula rm(x) expresses that x is a node on the path from the root to the rightmost leaf. We will also need the formula lm(x) = 8y(root(y) ! path1 (y; x)) which expresses that x is a node on the path from the root to the leftmost leaf. In the following, let e = (1; c; c). Clearly, this is the only element of E = rki() C C . Let Ae = A = (Q; ; ; I; F ) be the deterministic tree-walking automaton of Example 4 which is equivalent with e (x; y), see Fig. 6. The local mso tree transducer T 0 constructed in the proof of Lemma 13 has copy names C 0 = fc; i; ii; iii; iv; v; cm g, where we have identied the copy name (e; q) with q for every q 2 Q, because there is only one automaton. We now give the node formulas of T 0, or rather, node formulas that are equivalent to the ones that are actually constructed. First, the node formulas of T 0 include those of T , with a prime. Second, after analyzing the precise behavior of A, it can be seen that T 0 has the following node formulas id0 ;q (x) for q 2 Q: 0 id;i (x) = leaf(x) ^ : rm(x); 0 (x) = : rm(x); id;ii 0 (x) = : leaf(x); id;iii 0 id;iv (x) = : lm(x); and 0 id;v (x) = leaf(x) ^ : lm(x): Third, since the root of the output tree is the leftmost leaf, id0 ;cm (x) = lm(x). Next, we give the edge formulas of T 0. First, since qe = i, it has the edge formula 01;c;i(x; y) = (x = y). Second, since F = fvg, 01;v;c (x; y) = (x = y). Third, the transitions of A result in the following six edge formulas: 01;i;ii(x; y) = (leaf(x) ^ x = y); 01;ii;ii(x; y) = edg2 (y; x); 01;ii;iii(x; y) = edg1 (y; x); 01;iii;iv(x; y) = edg2 (x; y); 01;iv;iv(x; y) = edg1 (x; y); and 01;iv;v(x; y) = (leaf(x) ^ x = y): Fourth, 01;cm;cm (x; y) = edg(x; y) and, since the root of the output tree is the leftmost leaf, 01;cm;c(x; y) = (leaf(x) ^ lm(x) ^ x = y). In what follows, it is assumed that each of the above edge formulas 01;c1 ;c2 (x; y) of T 0 is restricted in such a way that (t; u; v) j= 01;c1 ;c2 (x; y) implies that (c1 ; u) and (c2 ; v) are nodes of the output tree T 0 (t), as discussed in Section 3 and, in fact, as assumed in the proof of Theorem 14. We now turn to the attributed relabeling r and attributed tree transducer G, constructed in the proof of Theorem 14 for the local mso tree transducer T 0 . As explained in that proof, r computes the truth values of the node and edge formulas of T 0. In this example, since the truth values of the formulas leaf(x) 40 and lab (x), for 2 0 , are obvious from the -label of the node x, and since most of the formulas j;c;c (x i; x i0 ) are trivial for similar reasons, it suces that r computes the truth values of the formulas rm(x), lm(x), 9y(edg1 (y; x)), and 9y(edg2 (y; x)). Thus, the symbols of the output alphabet 0 of r are of the form 0 = (; b1 ; b2 ; b3; b4 ), with 2 and bi 2 ftrue; falseg. The att G has synthesized attributes c; i ; : : : ; v ; cm , with m = cm , and it has inherited attributes ii and iii . In fact, since, for i 1 and c1 ; c2 2 C 0 , [j;c1 ;c2 (x i; x 0)] can only be true for c1 = ii and c2 2 fii; iiig, the other inherited attributes of G will be dened by dummy rules only, and hence are superuous. For 0 2 20 , if [rm(x)] = false, then R(0 ) contains the semantic rules hii ; 2i = hii ; 0i and hiii ; 1i = hiii ; 0i, and all other -rules are dummy rules. 0 0 0 III II V IV III IV II I II III V IV II I V IV II I Fig. 9. All walks of the automaton on tree t The synthesized attributes have the following semantic rules. We do not mention the dummy rules, i.e., we dene the attribute c only for leaves, and an attribute c with c0 6= c only for nodes x for which id0 ;c (x) holds. { For 0 = (; b1 ; b2; b3; b4) 2 00 , R(0 ) contains the rule hc; 0i = (hi ; 0i) if [rm(x)] = false, and the rule hc ; 0i = ^ otherwise. { For 0 2 00 with [rm(x)] = false, R(0 ) contains hi; 0i = hii; 0i. { For every 0 2 0 with [rm(x)] = false, R(0 ) contains hii ; 0i = hii; 0i if [9y(edg2 (y; x))] = true, and hii ; 0i = hiii ; 0i if [9y(edg1 (y; x))] = true. 0 0 0 0 0 0 0 { For 0 2 20 , R(0 ) contains hiii; 0i = hiv ; 2i. 41 { For every 0 2 0 with [lm(x)] = false, R(0) contains hiv ; 0i = hiv ; 1i if 0 2 20 , and hiv ; 0i = hv ; 0i otherwise. { For 0 2 00 with [lm(x)] = false, R(0 ) contains hv; 0i = hc; 0i. { For every 0 2 0 with [lm(x)] = true, R(0) contains hm; 0i = hm; 1i if 0 2 20 , and hm ; 0i = hc ; 0i otherwise. 0 0 0 This ends the description of r and G. To conclude the example, we consider how G acts on tree t = $($(a; $(b; a)); a), where a; b 2 0 and $ 2 2 , with output T (t) = a(b(a(^a))). Figure 9 shows all walks of the automaton on t. Figure 10 gives the dependency graph D(r(t)). It shows the inherited attributes to the II III II III II III a rt IV V c I II III $ rt IV V c I II III II III b $ rt II III $ IV V c I rt II III IV V c I II III rt IV V c I II III II III a rt IV V c I II III rt IV V c I II III II III a Fig. 10. The dependency graph of r(t) left of each node, and the synthesized attributes to the right. The label of each node is the one in t. The attribute names have been abbreviated. Attribute m is denoted m, c is denoted c, and the attributes q and q are both denoted q, for all q 2 Q. Note that the direction of the edges in D(r(t)), i.e., the direction in which the attributes are evaluated, is opposite to the direction in which the automaton walks on t (cf. the rst part of the proof of Lemma 13). We can easily infer the values of the attributes. Let the leaves of t be numbered u1 through u4 , from left to right. The values of the c are decG;t (hc ; u4i) = a^, decG;t (hc ; u3 i) = a(^a), decG;t (hc ; u2 i) = b(a(^a)), and decG;t (hc ; u1 i) = a(b(a(^a))), and the other occurrences of c have value ?. The rules for all the 42 other attributes merely copy values, so, in particular, decG;t (hm ; root(t)i) = a(b(a(^a))). It should be clear that r and G are very similar to r(G1 ) and G2 of Example 2(6, yield). Roughly speaking, the attribute `down' of that example corresponds to the attributes i , ii , iii , ii , and iii , and the attribute `up' to the attributes c , iv , v , and m . 8 Noncircularity and single use We have shown in Theorem 14 that every mso term graph transducer can be simulated by a, possibly circular, attR . We might be quite satised with this, but it is even more satisfactory to have a noncircular attR , and, thus, to be able to state that mso-tgt att-rel att. That this is possible will be proved in this section. Thus, we will show the following theorem. Note that without the last statement, the theorem says the following: for every attR that computes a total function there is an equivalent noncircular attR . Theorem 15. Let r be an attributed relabeling and G be an att which is noncircular on r(t) for every input tree t of r. Then there exist an attributed relabeling r0 and a noncircular att G0 such that r0 G 0 = r G . Moreover, if G is sur on r(t) for every input tree t of r, then G0 is sur, and if G is os, then so is G0 . To prove this, it suces to prove the following lemma. Note that the rst sentence of this lemma says the following: for every att G there is a noncircular attR (G1 ; G2 ) such that G r(G1 ) G2 . Lemma 16. 1. For every att G there exist an attributed relabeling r0 and a noncircular att G0 such that G 0 (r0 (t)) = G (t) for every input tree t on which G is noncircular. Moreover, for every input tree t, if G is sur on t, then G0 is sur on r0 (t). 2. For every noncircular att G0 there exist an attributed relabeling r1 and a noncircular sur att G00 such that G 00 (r1 (t)) = G 0 (t) for every input tree t on which G0 is sur. Moreover, if G0 is os, then so is G00 . In fact, assuming this lemma, the rst part of Theorem 15 can be proved as follows. Take r0 = r r0 . By Proposition 2 and Theorem 10, att-rel is closed under composition, and hence r0 is an attributed relabeling. Then, for every input tree t of r, G 0 (r0 (t)) = G 0 (r0 (r(t))) = G (r(t)) because G is noncircular on r(t). To prove the second part of Theorem 15 (with r00 and G00 instead of r0 and G0 , respectively), take r00 = r0 r1 . Again, r00 is an attributed relabeling. Let t be an input tree of r. Since G is sur on r(t), G0 is sur on r0 (r(t)) = r0 (t). Hence G 00 (r1 (r0 (t))) = G 0 (r0 (t)), and so G 00 (r00 (t)) = G (r(t)). Note that for the os-part we only need the second statement of Lemma 16, because every os att is noncircular. In the remainder of this section we prove Lemma 16. The lemma holds for arbitrary attribute grammars, in such a way that G0 has the same semantic 43 domains and uses the same functions f in its semantic rules (and similarly for G00 ). The reader who is not interested in technical subtleties should skip the remainder of this section. We start with the rst part of Lemma 16. Let G = (; S; I; ; W; R; m) be an arbitrary att, with = fT g. We will abbreviate S [ I by A. Note that if the dependency graph D(root) of the root is circular, then G is circular on every input tree, and Lemma 16 holds trivially. Hence, we may assume that D(root) is noncircular. The relabeling r0 will add is-graphs, well known from the circularity test for attribute grammars, to the labels of the nodes of an input tree t. This will allow G0 to see whether or not G is circular on t. Let us rst recall these wellknown ideas from [Knu], in our notation. An is-graph is a subset of I S . For 2 k and is-graphs q1 ; : : : ; qk , D()[q1 ; : : : ; qk ] is the unlabeled directed graph obtained from the dependency graph D() of by adding all edges (h; ii; h; ii) with (; ) 2 qi , for i 2 [1; k]. Furthermore, 0 (D()[q1 ; : : : ; qk ]) is the isgraph consisting of all (; ) such that there is a path from h; 0i to h; 0i in D()[q1 ; : : : ; qk ]. For an is-graph q, D(root)[q] is the unlabeled directed graph obtained from D(root) by adding all edges (h; 0i; h; 0i) with (; ) 2 q. For t 2 T and u 2 Vt , the is-graph of u, denoted is(u), is dened recursively to be the is-graph is(u) = 0 (D()[is(u 1); : : : ; is(u k)]), where labt (u) = 2 k . The circularity test for a single tree t 2 T is now as follows: D(t) is circular i either D(root)[is(root(t))] is circular or there exists u 2 Vt such that D()[is(u 1); : : : ; is(u k)] is circular, where labt (u) = 2 k . We now dene the attributed relabeling r0 . Let Q = P (I S ) be the set of all is-graphs, and let = froot; nonrootg. The output alphabet 0 of r0 (which will also be the input alphabet of G0 ) consists of all symbols (; ; (q0 ; q1 ; : : : ; qk )) of rank k, with k 0, 2 k , 2 , qi 2 Q for all i 2 [0; k], and q0 = 0 (D()[q1 ; : : : ; qk ]). Note that q0 is superuous; it is added to simplify the description of the semantic rules of G0 . The attributed relabeling r0 relabels a node u of a tree t 2 T that has label 2 k , with the label (; ; (q0 ; q1 ; : : : ; qk )) 2 0 , where qi = is(u i) for all i 2 [0; k], and = root i u = root(t). It is straightforward to dene a relabeling attribute grammar G0 such that r(G0 ) = r0 . It has a synthesized attribute `is' with W (is) = Q to compute is(u) for every node u, an inherited attribute `r' with W (r) = to indicate the root, and a (synthesized) meaning attribute `lab' with W (lab) = 0 to compute the new label of every node. It has one root rule: hr; 0i = root. For every 2 k it has the internal rules his; 0i = 0 (D()[his; 1i; : : : ; his; ki]) hr; ii = nonroot for all i 2 [1; k] hlab; 0i = (; hr; 0i; (his; 0i; his; 1i; : : : ; his; ki)): From these rules it should be clear that r(G0 ) = r0 . 44 We now turn to the denition of the att G0 from 0 to . For G0 we will use the same notation as for G, with primes. The attributes of G0 are S 0 = S [ (S Q) and I 0 = I [ (I Q), with 0m = m . We denote S 0 [ I 0 by A0 . The idea in the construction of G0 is the following. Let t0 be a tree over 0 , and let t be the tree over that is obtained from t0 by changing every label (; ; (q0 ; q1 ; : : : ; qk )) into . Let us rst consider the case that t0 = r0 (t). Then the attribute h; ui of t in G is simulated in G0 by the attribute h(; q); ui of t0 if is(u) = q and u is not the root, and by the attribute h; ui if u is the root. The remaining attributes of t0 are superuous and get a dummy value. This will guarantee that G 0 (r0 (t)) = G (t) if D(t) is noncircular. If D(t) is circular, then all cycles of D(t) will be broken in D0 (t0 ) because we will dene D0 (; ; (q0 ; q1 ; : : : ; qk )) to have no edges whenever D()[q1 ; : : : ; qk ] is circular (and when D(root)[q0 ] is circular, if = root). Now consider the case that t0 6= r0 (t). Then a possible cycle in D(t) will be broken in D0 (t0 ) either for the same reason as above, or because the qi in the labels of t0 do not \t". Roughly speaking, the semantic rules for a symbol (; ; (q0 ; q1 ; : : : ; qk )) in G0 will be the same as the semantic rules for in G, except that every h; ii is replaced by h(; qi ); ii. Now, if a node u of t0 has label (; ; (q0 ; q1 ; : : : ; qk )), and its i-th child has a label of the form ( ; ; (p0 ; : : : )) that does not \t", i.e., with p0 6= qi , then a path in D(t) through h; u ii will be broken in D0 (t0 ) because h; u ii is split into h(; p0 ); u ii and h(; qi ); u ii between which there is no connection, see Fig. 11. It remains to specify the semantic rules of G0 . For h0 ; ii 2 A0 N , labt (u) = (; ; (: : : ; qr ; : : : )) 0 (; qr ) (; pr0 ) labt (u r) = (0 ; 0 ; (pr0 ; : : : )) 0 Fig. 11. Nontting labels of a parent and child a dummy rule for h0 ; ii is a semantic rule h0 ; ii = ?, where ? is an arbitrary element of 0 . Note that dummy rules do not produces edges in dependency graphs. The root rules of G0 are the root rules of G, together with dummy rules for the attributes in I Q. The internal rules of G0 for a symbol 0 = (; ; (q0 ; q1 ; : : : ; qk )) 2 0 are dened as follows, distinguishing between the two possible values of . Case 1: = nonroot. If D()[q1 ; : : : ; qk ] is circular, then all internal rules for 0 are dummy rules. Otherwise, R0 (0 ) consists of all rules of R(), in which every h; ii is replaced by h(; qi ); ii, and dummy rules for the remaining attributes. 45 Case 2: = root. If D()[q1 ; : : : ; qk ] is circular or D(root)[q0 ] is circular, then all internal rules for 0 are dummy rules. Otherwise, R0 (0 ) consists of all rules of R(), in which every h; ii with i 6= 0 is replaced by h(; qi ); ii, and dummy rules for the remaining attributes. This ends the construction of G0 . Clearly, if G is sur on t 2 T , then G0 is sur on r0 (t). In fact, G0 is then sur on any tree t0 2 T that is obtained from t by changing each label into some (; ; (q0 ; q1 ; : : : ; qk )). We will now rst show that r0 and G0 compute the same tree transduction as G, for noncircular input, and then we will show that G0 is noncircular. Let t 2 T and t0 = r0 (t), and assume that both D(t) and D0 (t0 ) are noncircular. We have to prove that G 0 (t0 ) = G (t). For that purpose we dene a valuation val : A0 (t0 ) ! T of the attributes of t0 as follows. Let u be a node of t0 with label (; ; (q0 ; q1 ; : : : ; qk )). If = nonroot, then val(h(; q0 ); ui) = decG;t (h; ui) for every 2 A, and val(h0 ; ui) = ? for all remaining attributes 0 2 A0 . If = root, then val(h; ui) = decG;t(h; ui) for every 2 A, and val(h0 ; ui) = ? for every 0 2 A Q. We claim that val is a decoration of t0 (for G0 ). In fact, for every node u of t0 with label (; ; (q0 ; q1 ; : : : ; qk )), it follows from the denition of r0 that qi = is(u i) for all i 2 [0; k], and that = root i u is the root. Since D(t) is noncircular, this implies that D()[q1 ; : : : ; qk ] is noncircular, and that D(root)[q0 ] is noncircular if u is the root. It now follows from the denition of the semantic rules of G0 (and the fact that decG;t is a decoration of t) that val is a decoration of t0 . Hence, since D0 (t0 ) is noncircular by assumption, val = decG ;t and so G 0 (t0 ) = decG ;t (h0m ; root(t0 )i) = val(hm ; root(t)i) = decG;t(hm ; root(t)i) = G (t). Finally we show that G0 is noncircular, i.e., that D0 (t0 ) is noncircular for every tree t0 2 T . We claim that the following four statements hold for every node u of t0 , with label 0 = (; ; (q0 ; q1 ; : : : ; qk )). 0 0 0 0 0 0 1. If = nonroot and (0 ; 0 ) 2 is0 (u), then there exist ; 2 A such that 0 = (; q0 ), 0 = (; q0 ), and (; ) 2 q0 . 2. If = root and (0 ; 0 ) 2 is0 (u), then 0 ; 0 2 A and (0 ; 0 ) 2 q0 . 3. If D0 (0 )[is0 (u 1); : : : ; is0 (u k)] is circular, then D()[q1 ; : : : ; qk ] is circular. 4. If D0 (root)[is0 (u)] is circular, then = root and D(root)[q0 ] is circular. From Statements (3) and (4) the noncircularity of D0 (t0 ) can be proved as follows. Assume that D0 (t0 ) is circular. Then either there is a node u such that D0 (0 )[is0 (u 1); : : : ; is0 (u k)] is circular, or D0 (root)[is0 (u)] is circular for u = root(t). In the rst case, Statement (3) implies that D()[q1 ; : : : ; qk ] is circular, and so, by the denition of the semantic rules of G0 , D0 (0 ) has no edges, contradicting the circularity of D0 (0 )[is0 (u 1); : : :; is0 (u k)]. In the second case, Statement (4) implies that = root and D(root)[q0 ] is circular, and so also in this case, by the denition of the semantic rules of G0 , D0 (0 ) has no edges. This implies that is0 (u) has no edges, and so, since G0 and G have the same root rules, D(root) = D0 (root) = D0 (root)[is0 (u)] is circular. We have assumed, however, that D(root) is noncircular. 46 It remains to show the above four statements. First Statements (1) and (2) are proved simultaneously, by induction on u, and then Statements (3) and (4) are proved. In these proofs the following lemma is useful. Let : A0 ! A be dened by ((; q)) = () = for 2 A and q 2 Q. Lemma 17. Let u be a node of t0 2 T with label 0 = (; ; (q0 ; q1; : : : ; qk )), and assume that Statements (1) and (2) hold for u 1; : : : ; u k. If 0 h00 ; i0 ie1 h01 ; i1 ie2 h02 ; i2i en h0n ; ini is a path in D0 (0 )[is0 (u 1); : : : ; is0 (u k)], with ej = (h0j 1 ; ij 1 i; h0j ; ij i), and e1 is an edge of D0 (0 ), then h(00 ); i0 i(e1 )h(01 ); i1 i(e2 )h(02 ); i2 i (en )h(0n ); in i is a path in D()[q1 ; : : : ; qk ], with (ej ) = (h(0j 1 ); ij 1 i; h(0j ); ij i). Proof. Let the label of u r be (r ; r ; (pr0 ; pr1 ; : : : ; prkr )), for r 2 [1; k]. Now the essence of the whole construction of G0 is that if the given path uses an edge from is0 (u r), r 2 [1; k], then pr0 = qr , see Fig. 11. The formal proof is as follows. It is an immediate consequence of the denition of the semantic rules of G0 that if ej is an edge in D0 (0 ), then (ej ) is an edge in D() and hence in D()[q1 ; : : : ; qk ]. Now consider an edge ej+1 = (h0j ; ri; h0j+1 ; ri) from is0 (u r); or more precisely, (0j ; 0j+1 ) 2 is0 (u r) with ij = ij+1 = r 2 [1; k]. Since e1 is an edge of D0 (0 ), j 1. Since, clearly, the edge ej = (h0j 1 ; ij 1 i; h0j ; ri) is in D0 (0 ), it follows from the denition of the semantic rules of G0 that 0j = (; qr ) for some inherited attribute 2 A. Hence, by Statements (1) and (2) for u r, r = nonroot, qr = pr0 , 0j+1 = (; pr0 ) for some 2 A, and (; ) 2 pr0 . So (; ) 2 qr . Since (0j ) = and (0j+1 ) = , this implies that (ej+1 ) = (h(0j ); ri; h(0j+1 ); ri) is in D()[q1 ; : : : ; qk ]. This proves the lemma. ut The proof of Statements (1) and (2) is now straightforward. Assume by induction that the statements hold for u 1; : : : ; u k. Recall that is0 (u) is equal to the graph 0 (D0 (0 )[is0 (u 1); : : : ; is0 (u k)]). Consider a path from h0 ; 0i to h 0 ; 0i in the graph D0 (0 )[is0 (u 1); : : : ; is0 (u k)]. Such a path must start and end with an edge in D0 (0 ). Hence, by the denition of the semantic rules of G0 , there exist ; 2 A such that either = nonroot, 0 = (; q0 ), and 0 = (; q0 ), or = root, 0 = , and 0 = . According to the above lemma, there is a path from h(0 ); 0i = h; 0i to h( 0 ); 0i = h; 0i in D()[q1 ; : : : ; qk ], and so (; ) 2 0 (D()[q1 ; : : : ; qk ]) = q0 . Now the proof of Statement (3) is straightforward. Consider a cycle in the graph D0 (0 )[is0 (u 1); : : : ; is0 (u k)]. It must contain an edge from D0 (0 ), which we can take as the rst edge. Since we know that Statements (1) and (2) hold for all nodes, the above lemma gives us a cycle in D()[q1 ; : : : ; qk ]. Finally we prove Statement (4). It follows from the circularity of the graph D0 (root)[is0 (u)], the noncircularity of D0 (root), and the fact that the root rules of G0 involve attributes from A only, that is0 (u) must have an edge that involves attributes from A. By Statements (1) and (2) this implies that = root and 47 is0 (u) q0 . Hence, since D0 (root) = D(root), all edges of D0 (root)[is0 (u)] are also edges of D(root)[q0 ], and so D(root)[q0 ] is circular. This shows that G0 is noncircular, and ends the proof of the rst part of Lemma 16. We now turn to the proof of the second part of that lemma, but for simplicity we use G and G0 instead of G0 and G00 , respectively. Since the idea of the construction is similar to that of the rst part of the lemma, we will not give a detailed correctness proof. Let G = (; S; I; ; W; R; m) be an arbitrary noncircular att from to . We will turn semantic rules into dummy rules whenever an attribute is used more than once. To see whether an attribute of a node u is used more than once, we do not only need the dependency graph D() of the label of u, but also the dependency graph of the label of the parent of u (or the dependency graph of the root, if u is the root). To provide this information is the job of the relabeling r1 . Let be the set of all pairs (; i) such that either 2 and i 2 [1; rk()], or = root and i = 0. The attributed relabeling r1 has output alphabet 0 = , and r1 relabels a node u of a tree t 2 T that has label , with the label (; ; i) such that (1) if u = v j then = labt (v) and i = j , and (2) if u = root(t) then = root and i = 0. This can easily be realized by a relabeling attribute grammar that has inherited attributes `par' and `num' with root rules hpar; 0i = root and hnum; 0i = 0, and internal rules hpar; ii = and hnum; ii = i, i 2 [1; rk()], in R(). The att G0 , from 0 to , has attributes S 0 = S and I 0 = I , with 0 m = (m ; root; 0). The root rules of G0 are the root rules of G in which every h; 0i is replaced by h(; root; 0); 0i, together with dummy rules for the remaining attributes. In fact, we assume that D(root) does not contain a node with more than one outgoing edge (otherwise G is not sur on any input tree and Lemma 16(2) is trivial). The internal rules of G0 for the symbol (; ; i) are dened as follows. If D() has a node with more than one outgoing edge, or a node h; 0i with one outgoing edge such that h; ii has at least one outgoing edge in D(), then all internal rules of (; ; i) are dummy rules. Otherwise, R0 (; ; i) consists of all rules of R(), in which every h; 0i is replaced by h(; ; i); 0i and every h; j i, j 6= 0, by h(; ; j ); j i, and dummy rules for the remaining attributes. The idea in the construction of G0 is the following. Let t0 be a tree over 0 , and let t be the tree over that is obtained from t0 by changing every label (; ; i) into . Let us rst consider the case that t0 = r1 (t). Then the attribute h; ui of t in G is simulated in G0 by the attribute h(; ; i); ui of t0 , where (; ; i) is the label of u in r1 (t). This guarantees that G 0 (r1 (t)) = G (t) if G is sur on t. If G is not sur on t, then all \bad" nodes of D(t) (i.e., nodes with more than one outgoing edge) are not bad any more in D0 (t0 ) because we dened D0 (; ; i) to have no edges whenever D(), combined with the parent dependency graph D(), has bad nodes. Now consider the case that t0 6= r1 (t). Then a bad node of D(t) is removed from D0 (t0 ) either for the same reason as above, or because the labels of t0 do not \t". In fact, if a node u of t has label , and its i-th child 48 has a label (; 0 ; i0 ) in r1 (t) that does not \t", i.e., with (0 ; i0 ) 6= (; i), then a bad node h; u ii of D(t) is not bad in D0 (t0 ) because it is split into the two nodes h(; ; i); u ii and h(; 0 ; i0 ); u ii. Note nally that G0 is still noncircular: a cycle in D0 (t0 ) would produce a cycle in D(t) when replacing every h(; ; i); ui by h; ui. This ends the proof of the second part of Lemma 16, and thus also ends the proof of Theorem 15. 9 Main results In this nal section we combine all previously proved results, and derive our main results: the equivalence of mso term graph transducers and attributed tree transducers with look-ahead, and the equivalence of mso tree transducers and attributed tree transducers with look-ahead that satisfy the single use restriction. Theorem 18. mso-tgt = att-rel att, mso-tt = att-rel attsur, mso-tgtdir = att-rel attos , and mso-ttdir = att-rel attos;sur . Proof. The -inclusions are an immediate consequence of Theorems 14 and 15. Without att-rel, the -inclusions are stated in Theorem 9. Then, the class att-rel can be added because of the equality att-rel = mso-rel (Theorem 10) and the composition results of Proposition 2 (recalling that mso-rel mso-ttdir). ut Together with Theorem 10 this characterizes all classes of tree transductions from Fig. 1 in terms of attribute grammars. In the remainder of the section we discuss a number of issues related to these results. Look-ahead As observed in the introduction, an attributed tree transducer with look-ahead can also be viewed as one attribute grammar of which the attributes can be evaluated in two phases (see [Blo]). Such an attribute grammar is like an att, except that it also has \ags", i.e., attributes with nitely many values. The semantic rules for ags use ags only, and therefore the ags can be evaluated in a rst phase. The \tree attributes", i.e., the attributes with W () = T , have conditional semantic rules in which the ags can be tested, i.e., a semantic rule in R() is of the form 8 > <r. 1 h; ii = >.. :rn if c1 .. . if cn (4) where c1 ; : : : ; cn are mutually exclusive, exhaustive tests on the ags, and rj 2 T ((I [ S ) [0; rk()]) for every j 2 [1; n]. The root rules are of a similar form. As an example, the tree transduction of Example 2(4, path) can be computed by such an attribute grammar, with a ag `ns' which is dened as in Example 2(4, 49 path), and a tree attribute which has the same semantic rules in R(a) and R() as in Example 2(4, path) and the following conditional semantic rule in R(): 8 > <1(h; 1i) if hns; 1i = 1 and hns; 2i = 0, h; 0i = >2(h; 2i) if hns; 1i = 0 and hns; 2i = 1, :a otherwise. The reason that we have not chosen for this model is that it leads to more complicated denitions of noncircularity (cf. [Boy]) and the single use restriction. The usual notion of dependency graph takes a worst case view on dependencies in the sense that for a conditional semantic rule (4), it would assume h; ii to depend on all attributes occurring in r1 ; : : : ; rn , whereas, for a particular input tree, h; ii depends of course on the attributes of one rj only. Another, equivalent, way of viewing an attR , based on Theorem 10, is as an att which has conditional rules as above, in which the conditions c1 ; : : : ; cn are unary mso formulas 1 (x); : : : ; n (x), where x refers to the node under consideration (usually referred to by the number 0). The tree transduction of Example 2(4, path) can be computed by such an attribute grammar, with one attribute which, again, has the same semantic rules R(a) and R() as in Example 2(4, path), but now has the following conditional rule in R(): 8 > <1(h; 1i) if 8y((edg1 (x; y) ! 1 (y)) ^ (edg2 (x; y) ! 0 (y))), h; 0i = >2(h; 2i) if 8y((edg1 (x; y) ! 0 (y)) ^ (edg2 (x; y) ! 1 (y))), :a if :1 (x), where 1 (x) is a formula expressing that x has exactly one descendant with label , and 0 (x) expresses that x has no descendant with label . This seems to be an attractive formal model, which was in fact used in the proof of Theorem 14, in disguise. It is natural to extend the attR by allowing semantic conditions on the ags, which are specied for each operator (and the root) in addition to the semantic rules. Such an extended attR G computes a partial function: it accepts only those input trees t for which decG;t satises the semantic conditions. Using the characterization of unary mso formulas proved in [BloEng, NevBus] (see Section 6.1) it is straightforward to show (see [Blo]) that the domains of these functions are exactly the mso denable tree languages. Thus, the attR with semantic conditions computes precisely the partial mso term transductions (of which the domain is specied by a closed mso formula, see [EngOos, Cou3]), and similarly for the sur and os restrictions. Time complexity It is known from [CouMos] that mso denable graph transductions can be computed in polynomial time. As an immediate consequence of Theorem 18 we obtain that the mso denable tree transductions can be computed in linear time, in the size of the input tree. In fact, it is well known that 50 attribute grammars can be evaluated in linear time, provided each semantic rule can be evaluated in constant time. This condition is obviously satised for nite-valued attribute grammars (and hence for attributed relabelings), and for sur attributed tree transducers. If one is willing to accept a term graph as the representation of the output tree, then it also holds for arbitrary att's. The next corollary also follows from the results in Section 7 of [BloEng], where it is shown that both unary formulas and functional binary formulas can be evaluated on trees in linear time. Corollary 19. mso denable tree transductions can be computed in linear time. Using a term graph to represent the output tree, mso denable term transductions can be computed in linear time. Context-free graph grammars Let regt denote the class of regular tree languages (see, e.g., [GecSte]). According to the classical result of [Don, ThaWri], this is also the class of mso denable tree languages. Let us now consider the class mso-tgt(regt) of images of the mso denable tree languages under mso denable term transductions. As observed in the Introduction, it is shown in [Oos, Eng4, EngOos] that mso-gt(regt), the class of images of mso denable tree languages under mso denable graph transductions, is equal to the class of context-free graph languages. Thus mso-tgt(regt) = unfold(cf-tgl), the class of all term languages unfold(L) where L is a set of term graphs that can be generated by a context-free graph grammar. This class was investigated in [EngHey], where it was shown that it equals the class att(regt) of images of the regular tree languages under attributed tree transductions. Since it is straighforward to prove that regt is closed under attributed relabelings, this result is now an immediate consequence of Theorem 18. We have, however, skipped some details. First, as also observed in a footnote in the Introduction, there are two dierent types of context-free graph languages: HR (hyperedge replacement) and NR (node replacement). The result of [Oos, Eng4, EngOos] is for NR, but the result of [EngHey] is for HR. However, there is another type of mso denable graph transductions, closely related to the one dened here, in which the incidence relation between nodes and edges has to be dened by a binary formula, see [Cou4]. It is shown in [CouEng] that mso-gt0 (regt) is the class of HR context-free graph languages, where the prime indicates this other type of mso denable graph transductions. Now it is easy to see that for term graphs there is no dierence between the two types of mso denable graph transductions, i.e., Theorem 18 holds for both types: mso-tgt0 = mso-tgt and similarly for the other three classes. Consequently, the class unfold(cf-tgl) is the same for HR and NR. Second, the notion of term graph is dened in a slightly dierent way in [EngHey], and called \jungle". This does not make a dierence, because jungles can be transformed into term graphs, and vice versa, by mso denable graph transductions (of both types). Third, the attributed tree transducers dened in [EngHey] are not of the type dened here (as introduced in [Ful]), but are ordinary attribute grammars of which all attributes have trees as values (as considered, e.g., in [EngFil]). The domain of 51 such an attributed tree transducer is the set of all derivation trees of the underlying context-free grammar, and the result of [EngHey] concerns the class of all ranges of such attributed tree transducers. It is however straightforward to prove, using the close relationship between regular tree languages and derivation tree languages of context-free grammars (see, e.g., [GecSte]), that this class of ranges equals att(regt). Consider now the class mso-tt(regt) of images of the mso denable tree languages under mso denable tree transductions. This is the class of tree languages that can be generated by context-free graph grammars. Thus, by Theorem 18, it is equal to the class attsur (regt) of images of the regular tree languages under sur attributed tree transductions. This result is in fact also clear from the proof of [EngHey]. Corollary 20. attsur(regt) is the class of tree languages that can be generated by (HR or NR) context-free graph grammars. Tree transducers In tree language theory several types of tree transducers are studied, apart from the attributed tree transducer (see, e.g., [GecSte]). In what follows we consider total deterministic transducers only. We consider three types of such transducers. First the top-down tree transducer, which is well known to be equivalent with the os attributed tree transducer (see [CouFra, Ful]). To ensure closure under composition, the top-down tree transducer was extended with regular look-ahead in [Eng1]. Let TR denote the class of all tree transductions that are computed by top-down tree transducers with regular look-ahead. It is not dicult to understand, and is proved in [EngMan], that preprocessing the input tree with an attributed relabeling has the same eect as regular look-ahead. This implies that att-rel attos = TR . Thus, by Theorem 18, mso-tgtdir = TR , i.e., the direction preserving mso denable term transductions are exactly the top-down tree transductions with regular look-ahead. Since TR is closed under composition (see [Eng1]), this implies that mso-tgtdir is closed under composition (see Proposition 2 and the discussion following it). Second the bottom-up tree transducer, which is incomparable with the topdown tree transducer. The result of [FulVag] shows that not every bottom-up tree transducer can be simulated by an att. However, since every bottom-up tree transducer can be simulated by a top-down tree transducer with regular lookahead, the class of bottom-up tree transductions is included in mso-tgtdir. Third the macro tree transducer, introduced in [CouFra, EngVog], which extends the top-down tree transducer with parameters (and thus can be viewed as a model of denotational semantics). Every attributed tree transduction can be computed by a macro tree transducer and thus, since every macro tree transducer with regular look-ahead can be simulated by one without, the same is true for attributed tree transductions with look-ahead. Hence, by Theorem 18, every mso denable term transduction can be implemented by a macro tree transducer. Using results from [Eng2, EngVog], it can now be shown that the class of macro tree transductions is mso-tgtdir mso-tgt. A precise characterization 52 of the mso denable tree transductions (mso-tt) as a subclass of the macro tree transductions is presented in [EngMan]. Closure under composition When looking for a model for the implemen- tation of the tree transductions in mso-tt or mso-ttdir, a good guideline is that these classes are closed under composition, by Proposition 2. Thus, in [EngMan], the class mso-ttdir is shown to be equal to a subclass of TR which is well known to be closed under composition. When searching for our main result mso-tt = att-rel attsur , we were also guided by closure under composition. The main reason for the introduction of the single use restriction in [Gan, GanGie, Gie] was that the sur attribute coupled grammars are closed under composition. An attribute coupled grammar is an attribute grammar in which a distinction is made between syntactic attributes (which have trees as values) and semantic attributes (which have arbitrary values). The single use restriction is imposed on the syntactic attributes only (which is why it is actually called the syntactic single use requirement in [Gie]). Taking syntactic attributes only, the result of [Gan, GanGie, Gie] shows that attsur is closed under composition. Adding \ags", i.e., attributes with nitely many values, as semantic attributes, it shows that the class att-rel attsur of sur attributed tree transductions with look-ahead is closed under composition. It is observed in [Gie] that the same construction proves that the composition of a sur attribute coupled grammar with an arbitrary attribute coupled grammar can again be realized by an attribute coupled grammar. This corresponds to the fact that mso-tt mso-tgt mso-tgt, as stated in Proposition 2. References [AhoUll] A. V. Aho, J. D. Ullman; Translations on a context-free grammar, Inf. and Control 19 (1971), 439{475 [ArnLagSee] S. Arnborg, J. Lagergren, D. Seese; Easy problems for tree-decomposable graphs, J. of Algorithms 12 (1991), 308{340 [Blo] R. Bloem; Attribute Grammars and Monadic Second Order Logic, Master's Thesis, Leiden University, June 1996 [BloEng] [Boy] [Buc] [CorMon] http://www.wi.LeidenUniv.nl/MScThesis/bloem.html R. Bloem, J. Engelfriet; Characterization of properties and relations dened in monadic second order logic on the nodes of trees, Tech. Report 97{03, Leiden University, August 1997 http://www.wi.LeidenUniv.nl/TechRep/tr97-03.html J. T. Boyland; Conditional attribute grammars, TOPLAS 18 (1996), 73{ 108 J. Buchi; Weak second-order arithmetic and nite automata, Z. Math. Logik Grundlag. Math. 6 (1960), 66{92 A. Corradini, U.Montanari, eds., Proc. SEGRAGRA'95, Joint COMPUGRAPH/SEMAGRAPH Workshop on Graph Rewriting and Computation, Electronic Notes in Theoretical Computer Science, Vol. 2, Elsevier, 1995 53 [Cou1] B. Courcelle; The monadic second-order logic of graphs I: Recognizable sets of nite graphs, Inform. and Comput. 85 (1990), 12{75 [Cou2] B. Courcelle; Graph rewriting: an algebraic and logic approach, in Handbook of Theoretical Computer Science, Vol. B (J. van Leeuwen, ed.), Elsevier, 1990, 193{242 [Cou3] B. Courcelle; The monadic second-order logic of graphs V: On closing the gap between denability and recognizability, Theor. Comput. Sci. 80 (1991), 153{202 [Cou4] B. Courcelle; Monadic second-order denable graph transductions: a survey, Theor. Comput. Sci. 126 (1994), 53{75 [Cou5] B. Courcelle; The expression of graph properties and graph transformations in monadic second-order logic, Chapter 5 of Handbook of Graph Grammars and Computing by Graph Transformation, Vol. 1: Foundations (G. Rozenberg, ed.), World Scientic, 1997 [CouEng] B. Courcelle, J. Engelfriet; A logical characterization of the sets of hypergraphs dened by hyperedge replacement grammars, Math. Systems Theory 28 (1995), 515{552 [CouEngRoz] B. Courcelle, J. Engelfriet, G. Rozenberg; Handle-rewriting hypergraph grammars, J. of Comp. Syst. Sci. 46 (1993), 218{270 [CouFra] B. Courcelle, P. Franchi-Zannettacci; Attribute grammars and recursive program schemes I, II, Theor. Comput. Sci. 17 (1982), 163{191; 235{257 [CouMos] B. Courcelle, M. Mosbah; Monadic second-order evaluations on tree decomposable graphs, Theor. Comput. Sci. 109 (1993), 49{82 [DerJouLor] P. Deransart, M. Jourdan, B. Lorho; Attribute Grammars, Lecture Notes in Computer Science 323, Springer-Verlag, Berlin, 1988 [Don] J. Doner; Tree acceptors and some of their applications, J. of Comp. Syst. Sci. 4 (1970), 406{451 [DreHabKre] F. Drewes, A. Habel, H.-J. Kreowski; Hyperedge replacement graph grammars, Chapter 2 of Handbook of Graph Grammars and Computing by Graph Transformation, Vol. 1: Foundations (G. Rozenberg, ed.), World Scientic, 1997 [Elg] C. C. Elgot; Decision problems of nite automata and related arithmetics, Trans. Amer. Math. Soc. 98 (1961), 21{51 [Eng1] J. Engelfriet; Top-down tree transducers with regular look-ahead, Math. Systems Theory 10 (1977), 289{303 [Eng2] J. Engelfriet; Tree transducers and syntax-directed semantics, Memorandum nr. 363, Twente University of Technology, 1981, presented at CAAP'82 [Eng3] J. Engelfriet; Attribute grammars: attribute evaluation methods, in Methods and Tools for Compiler Construction (B. Lorho, ed.), Cambridge University Press, 1984, 103{138 [Eng4] J. Engelfriet; A characterization of context-free NCE graph languages by monadic second-order logic on trees, in Graph Grammars and their Application to Computer Science (H. Ehrig, H.-J. Kreowski, G. Rozenberg, eds.), Lecture Notes in Computer Science 532, Springer-Verlag, Berlin, 1991, 311{327 [Eng5] J. Engelfriet; A regular characterization of graph languages denable in monadic second-order logic, Theor. Comput. Sci. 88 (1991), 139{150 [Eng6] J. Engelfriet; Context-free graph grammars, Chapter 3 of Handbook of Formal Languages, Vol. 3: Beyond Words (G. Rozenberg, A. Salomaa, 54 eds.), Springer-Verlag, 1997 J. Engelfriet, G. File; The formal power of one-visit attribute grammars, Acta Informatica 16 (1981), 275{302 [EngHey] J. Engelfriet, L. M. Heyker; Context-free hypergraph grammars have the same term-generating power as attribute grammars, Acta Informatica 29 (1992), 161{210 [EngMan] J. Engelfriet, S. Maneth; Monadic second order logic and macro tree transductions, in preparation [EngOos] J. Engelfriet, V. van Oostrom; Logical description of context-free graph languages, Tech. Report 96{22, Leiden University, August 1996, to appear in J. of Comp. Syst. Sci. [EngRoz] J. Engelfriet, G. Rozenberg; Node replacement graph grammars, Chapter 1 of Handbook of Graph Grammars and Computing by Graph Transformation, Vol. 1: Foundations (G. Rozenberg, ed.), World Scientic, 1997 [EngRozSlu] J. Engelfriet, G. Rozenberg, G. Slutzki; Tree transducers, L systems, and two-way machines, J. of Comp. Syst. Sci. 20 (1980), 150{202 [EngVog] J. Engelfriet, H. Vogler; Macro tree transducers, J. of Comp. Syst. Sci. 31 (1985), 71{146 [FHVV] Z. Fulop, F. Herrmann, S. Vagvolgyi, H. Vogler; Tree transducers with external functions, Theor. Comput. Sci. 108 (1993), 185{236 [Ful] Z. Fulop; On attributed tree transducers, Acta Cybernetica 5 (1981), 261{279 [FulVag] Z. Fulop, S. Vagvolgyi; Attributed tree transducers cannot induce all deterministic bottom-up tree transformations, Inform. and Comput. 116 (1995), 231{240 [Gan] H. Ganzinger; Increasing modularity and language-independency in automatically generated compilers, Sci. Comput. Programming 3 (1983), 223{278 [GanGie] H. Ganzinger, R. Giegerich; Attribute coupled grammars, in Proc. SIGPLAN'84, SIGPLAN Notices 19 (1984), 157{170 [GecSte] F. Gecseg, M. Steinby; Tree automata, Akademiai Kiado, Budapest, 1984 [Gie] R. Giegerich; Composition and evaluation of attribute coupled grammars, Acta Informatica 25 (1988), 355{423 [KamSlu] T. Kamimura, G. Slutzki; Parallel and two-way automata on directed ordered acyclic graphs, Inf. and Control 49 (1981), 10{51 [KlaSch] N. Klarlund, M. L. Schwartzbach; Graph Types, in Proc. of the 20th Conference on Principles of Programming Languages, 1993, 196{205 [Knu] D. E. Knuth; Semantics of context-free languages, Math. Syst. Theory 2 (1968), 127{145. Correction: Math. Syst. Theory 5 (1971), 95{96 [KuhVog1] A. Kuhnemann, H. Vogler; Synthesized and inherited functions, a new computational model for syntax-directed semantics, Acta Informatica 31 (1994), 431{477 [KuhVog2] A. Kuhnemann, H. Vogler; Attributgrammatiken, Vieweg, Braunschweig, 1997 [MezWri] J. Mezei, J. B. Wright; Algebraic automata and context-free sets, Inform. and Control 11 (1967), 3{29 [NevBus] F. Neven, J. Van den Bussche; On the expressive power of Boolean-valued attribute grammars, extended abstract, University of Limburg, Belgium, 1997 [EngFil] 55 [Oos] V. van Oostrom; Graph grammars and 2nd order logic (in Dutch), Master's Thesis, Leiden University, January 1989 [SlePleEek] M. R. Sleep, M. J. Plasmeijer, M. C. J. D. van Eekelen, eds., Term Graph Rewriting { Theory and Practice, John Wiley, London, 1993 [ThaWri] J. W. Thatcher, J. B. Wright; Generalized nite automata theory with an application to a decision problem of second-order logic, Math. Systems Theory 2 (1968), 57{81 [Tho] W. Thomas; Automata on innite objects, in Handbook of Theoretical Computer Science, Vol. B (J. van Leeuwen, ed.), Elsevier, 1990, 133{192 This article was processed using the LATEX macro package with LLNCS style 56
© Copyright 2026 Paperzz