An extension of data automata that captures XPath

Mikołaj Bojańczyk and Sławomir Lasota
Warsaw University
Email: bojan,[email protected]
Abstract—We define a new kind of automata recognizing
properties of data words or data trees and prove that the
automata capture all queries definable in Regular XPath. We
show that the automata-theoretic approach may be applied
to answer decidability and expressibility questions for XPath.
Finally, we use the newly introduced automata as a common
framework to classify existing automata on data words and
trees, including data automata, register automata and alternating register automata.
Keywords-Regular XPath, data automata, register automata.
I. INTRODUCTION
In this paper, we study data trees. In a data tree, each node
carries a label from a finite alphabet and a data value from
an infinite domain. We study properties of data trees, such
as those defined in XPath, which refer to data values only
by testing if two nodes carry the same data value. Therefore
we define a data tree as a pair (t, ∼) where t is a tree over
a finite alphabet and ∼ is an equivalence relation on nodes
of t. Data values are identified with equivalence classes of
∼.
Recent years have seen a lot of interest in automata
for data trees and the special case of data words. The
general theme is that it is difficult to design an automaton
which recognizes interesting properties and has decidable
emptiness.
Decidable emptiness is important in XML static analysis.
A typical question of static analysis is the implication
problem: given two properties ϕ1 , ϕ2 of XML documents
(modeled as data trees), decide if every document satisfying
ϕ1 must also satisfy ϕ2 . Solving the implication problem
boils down to deciding emptiness of ϕ1 ∧ ¬ϕ2 .
A common logic for expressing properties is XPath. For
XPath, satisfiability is undecidable in general, even for data
words, see [1]. Satisfiability is undecidable also for most
other natural logics, including first-order logic with predicates for order (or even just successor) and data equality.
The approach chosen in prior work was to find automata
on data words or trees that would have decidable emptiness
and recognize interesting, but necessarily weak, logics or
fragments of XPath. These include: fragments of XPath
without recursion or negation [1], [2]; first-order logic with two variables [3], [4]; forward-only fragments related to alternating automata [5]–[8]. The original automaton model for data words was introduced in [9]. See [10] for a survey.
(This work has been partially supported by the Polish government grant no. N206 008 32/0810 and by the FET-Open grant agreement FOX, number FP7-ICT-233599.)
In this paper, we take a different approach. Any model
that captures XPath will have undecidable emptiness. We
are not discouraged by this, and try to capture XPath by
something that feels like an “automaton”. Three tangible
goals are: 1. use the automaton to prove decidable emptiness for interesting restrictions of data trees; 2. use the automaton to
prove easily that the automaton (and consequently XPath)
cannot express a property; 3. unify other automata models
that have been suggested for data trees and words.
What is our new model? To explain it, we use logic. From
a logical point of view, a nondeterministic automaton is a
formula of the form ∃X1 . . . ∃Xn ϕ(X1 , . . . , Xn ). As often
in automata theory, when designing the automaton model,
we try to use the prefix of existential set quantifiers as
much as possible, in the interest of simplifying the kernel
ϕ. For satisfiability, this is like a free lunch, since deciding
satisfiability with or without the prefix are the same problem.
In the automaton model that we propose in this paper, the
kernel ϕ is of the form “for every class X of ∼, property
ψ(X, X1 , . . . , Xn ) holds”, where ψ is an MSO formula that
can use predicates for navigation (sibling order, descendant),
predicates for testing labels from the finite alphabet, but not
the predicate ∼ for data equality. The data ∼ is only used
in saying that X is a class. In the case of data words, this
model is an extension of the data automata introduced in [3],
which correspond to the special case when ψ quantifies only
over positions from X. For instance, our new model, but not
data automata, can express the property “between every two
different positions in the same class there is at most one
position outside the class with label a”.
The principal technical contribution of this paper is that
the model above can recognize all boolean queries of XPath.
This proof is difficult, and takes over ten pages. We believe the real value of this paper lies in this proof, which
demonstrates some powerful normalization techniques for
formulas describing properties of data trees. Since the scope of applicability for these techniques will become clear only in the future, and since the appreciation of an “automaton model” may ultimately be a question of taste, we describe in more detail the three tangible goals mentioned above.
1. The ultimate goal of this research is to find interesting
classes of data trees which yield decidable emptiness for
XPath. As a proof of concept, we define a simple subclass
of data trees, called bipartite data trees, and prove that
emptiness of our automata (and consequently of XPath) is
decidable for bipartite data trees. This is only a preliminary
result; we intend to find new subclasses in the future.
2. We use the automaton to prove that XPath cannot define
certain properties. Proving inexpressibility results for XPath
is difficult, because the truth value of an XPath query in a
position x might depend on the truth value of a subquery in
a position y < x, which in turn might depend on the truth
value of a subquery in a position z > y, and so on. On the
other hand, our automaton works in one direction, so it is
easier to understand its limitations. We use (an extension
of) our automata to prove that for documents with two
equivalence relations ∼1 and ∼2 , some properties of two-variable first-order logic cannot be captured by XPath, which was an open question.
3. We use the automaton to classify existing models for
data words in a single framework. A problem with the
research on data words and data trees is that the models
are often incomparable in expressive power. We show that
all the existing models can be seen as syntactic fragments
of our automaton. We hope that this classification underlines
more clearly what the differences are between the models.
II. PRELIMINARIES
Trees. Trees are unranked, finite, and labeled by a finite
alphabet Σ. We use the terms child, parent, sibling, descendant, ancestor, node in the usual way. The siblings are
ordered. We write x ≤ y when x is an ancestor of y. Every
nonempty set of nodes x1 , . . . , xn in a tree has a greatest common ancestor (the greatest lower bound with respect to ≤), which is denoted gca(x1 , . . . , xn ).
Let t and s be two trees, over alphabets Σ and Γ,
respectively, that have the same sets of nodes. We write t⊗s
for the tree over the product alphabet Σ×Γ that has the same
nodes as s and t, and where every node has the label from
t on the first coordinate, and the label from s on the second
coordinate. If X is a set of nodes in a tree t, we write t ⊗ X
for the tree t ⊗ s, where s is the tree over alphabet {0, 1},
whose nodes are the nodes of t and whose labeling is the
characteristic function of X.
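Over words (the special case treated in Section IV), the two product operations can be sketched in a few lines. The names `tensor` and `tensor_set` are ours, standing for ⊗, with positions indexed from 0:

```python
def tensor(t, s):
    # t (x) s: both words must have the same positions;
    # each position carries the pair of labels.
    assert len(t) == len(s)
    return list(zip(t, s))

def tensor_set(t, X):
    # t (x) X: annotate t with the characteristic function of the set X.
    return tensor(t, [int(i in X) for i in range(len(t))])
```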
Regular tree languages and transducers. We use the
standard notion of regular tree languages for unranked trees.
We also use transductions, which map trees to trees. Let Σ
be an input alphabet and Γ an output alphabet. A regular
tree language f over the product alphabet Σ × Γ can be
interpreted as a binary relation, which contains pairs (s, t)
such that s ⊗ t ∈ f . We use the name letter-to-letter
transducer for such a relation, underlining that the trees in a
pair (s, t) ∈ f must have the same nodes. In short, we simply
say transducer. We often treat a transducer as a function that
maps an input tree to a set of output trees, writing t ∈ f (s)
instead of (s, t) ∈ f .
Data trees. A data tree is a tree t equipped with an
equivalence relation ∼ on its nodes that represents data
equality. We use the name class for equivalence classes of
∼.
Queries. Fix an input alphabet. We use the name n-ary
query for a function φ that maps a tree t over the input
alphabet to a set φ(t) of n-tuples of its nodes. In this paper,
we will deal with queries of arities 0,1,2 and 3, which are
called boolean, unary, binary and ternary. We also study
queries that input a data tree (t, ∼); they output a set of
node tuples φ(t, ∼) as well.
MSO. Logic is a convenient way of specifying queries,
both for trees and data trees. We use monadic second-order
logic (MSO). In a given tree, or a data tree, a formula of
MSO is allowed to quantify over nodes of the tree using
individual variables x, y, z, and also over sets of nodes using
set variables X, Y, Z. A formula φ with free individual
variables x1 , . . . , xn defines an n-ary query, which selects
in a tree t the set φ(t) of tuples (x1 , . . . , xn ) that make the
formula true. To avoid confusion, we use round parentheses
for the tree input of a query, φ(t), and square parentheses for
indicating the free variables of a query. The two kinds of parentheses can appear together: e.g. φ[x1 , . . . , xn ](t) will be the set of
n-tuples selected in a tree t by a query with free variables
x1 , . . . , xn .
When working over trees without data, MSO formulas use
binary predicates for the child, descendant and next-sibling
relations, as well as a unary predicate for each label. Queries
defined by MSO with these predicates are called regular
queries (of course, regular queries can also be characterized
in terms of automata). When working over data trees, we
additionally allow a binary predicate ∼ to test data equality.
A query using ∼ is no longer called regular. For instance,
the following formula says that each class contains nodes
with the same label.
∀x ∀y ( x ∼ y ⇒ ⋁_{a∈Σ} a(x) ∧ a(y) )
Extended Regular XPath. We define a variant of XPath
that works over data trees. For unary queries, the variant is
an extension of XPath, thanks to including MSO as part of
its syntax. Expressions of extended Regular XPath, which
we call XPath in short, are defined below. Each expression
defines a unary query.
• Every letter of the alphabet a is a unary query, which
selects nodes with label a.
• Let Γ = {φ1 , . . . , φn } be a set of already defined
unary queries of Extended Regular XPath, which will
be treated as unary predicates. Let ϕ[x, y1 , y2 ] be a regular (i.e. not using ∼) ternary query of MSO
that uses the predicates from Γ. (Data equality ∼ might
have been used to define the queries from Γ, but inside
ϕ, the queries from Γ are treated as unary predicates.)
Then the following property of x is a unary query:

∃y1 ∃y2 ( y1 ∼ y2 ∧ ϕ[x, y1 , y2 ] ).    (1)

Likewise for y1 ≁ y2 instead of y1 ∼ y2 .
Boolean queries can be defined by taking a unary query, and
choosing the data trees where the root is selected.
Binary trees. A binary tree is a tree where each node
has at most two children. Although the interest of XPath is
mainly for unranked trees, we assume in the proofs that trees
are binary. This assumption can be made because XPath,
as well as the models of automata introduced later on, are
stable under the usual first-child / next-sibling encoding in
the following sense. A language L of unranked data trees
can be expressed by a boolean XPath query if and only if
the set of binary encodings of trees from L can be expressed
by a boolean XPath query. A similar, though more technical,
statement holds for unary queries.
III. CLASS AUTOMATA
In this section we define a new type of automaton for data
trees, called a class automaton, and state the main result:
class automata capture all queries definable in XPath.
A class automaton is a type of automaton that recognizes
properties of data trees. A class automaton is given by: an
input alphabet Σ, a work alphabet Γ, a nondeterministic
letter-to-letter tree transducer f from the input alphabet Σ to
the work alphabet Γ, and a regular tree language on alphabet
Γ × {0, 1}, called the class condition. The class automaton
accepts a data tree (t, ∼) over input alphabet Σ if there is
some output s ∈ f (t) such that for every class X, the class
condition contains the tree s ⊗ X.
Example 1. Consider an input alphabet Σ = {a, b}. Let L
be the data trees where some class contains at least three
nodes with label a. This language is recognized by a class
automaton. The work alphabet is Γ = {a, c}. The transducer
guesses three nodes with label a and outputs a on them; all other nodes get c. The class condition consists of trees s⊗X
over alphabet Γ × {0, 1} where X contains all or none of
the nodes with label a. Note that the class condition does
not inspect positions outside X.
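The acceptance condition can be made concrete for data words with a brute-force sketch. All names here are ours, and the letter-to-letter transducer is simplified to an independent per-letter relabeling, which is strictly weaker than a regular relation:

```python
from itertools import product

def class_automaton_accepts(word, classes, relabelings, class_condition):
    """Brute-force acceptance for a class automaton on a data word.
    word: list of input letters; classes: parallel list of class ids
    encoding ~; relabelings: for each input letter, the set of work
    letters the transducer may output (a simplification of a regular
    letter-to-letter transducer); class_condition: predicate on words
    over Gamma x {0,1}."""
    for output in product(*(relabelings[a] for a in word)):
        # accept if, for EVERY class X, the condition holds on s (x) X
        if all(class_condition([(g, int(c == cls))
                                for g, c in zip(output, classes)])
               for cls in set(classes)):
            return True
    return False

# Example 1 on words: "some class contains at least three a's".
# Because our simplified transducer cannot enforce "exactly three
# output a's", that count is folded into the class condition here;
# the automaton of Example 1 places it in the transducer instead.
def example1_condition(annotated):
    kept = [x for g, x in annotated if g == 'a']
    return len(kept) == 3 and (all(kept) or not any(kept))

relabelings = {'a': {'a', 'c'}, 'b': {'c'}}
```

With these definitions, a word whose class 1 holds three a-labeled positions is accepted, and a word with no such class is rejected.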
Example 2. Let K be the set of data words over Σ = {a, b}
where each class has exactly two positions x < y, and there
is at most one a in the positions {x + 1, . . . , y − 1}. In
the class automaton recognizing K, the transducer is the
identity function, and the class condition is
Σ0∗ · Σ1 · b0∗ · (a0 + ε) · b0∗ · Σ1 · Σ0∗ ,

where Σi is a shortcut for Σ × {i}, and likewise for ai and bi .
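This class condition is an ordinary regular expression; as a quick sketch, membership for one class can be checked with Python's re module, encoding each position as a two-character chunk letter-plus-membership-bit (our encoding, not the paper's):

```python
import re

# The class condition of Example 2, over (Gamma x {0,1}), with each
# position encoded as a chunk like "a1" (letter, bit for membership
# in the class X).
CLASS_CONDITION = re.compile(r"([ab]0)*[ab]1(b0)*(a0)?(b0)*[ab]1([ab]0)*")

def condition_holds(word, cls_bits):
    """Check the class condition for one class of a data word."""
    encoded = "".join(a + str(x) for a, x in zip(word, cls_bits))
    return CLASS_CONDITION.fullmatch(encoded) is not None
```

For instance, the class {1st, 4th position} of the word abab passes (one a strictly between the two class positions), while the same class in aaaa fails (two a's in between).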
Comparison to data automata. Class automata are closely related to the data automata introduced in [3]. Data automata were defined for data words. Since it is not clear what the
correct tree version thereof is, we just present the version for
data words. Like a class automaton, a data automaton has an
input alphabet Σ, a work alphabet Γ, and a nondeterministic
letter-to-letter transducer f (this time only for words). The
difference is in the class condition, which is less powerful in
a data automaton. In a data automaton, the class condition
is a word language over Γ, and not Γ × {0, 1}. The data
automaton accepts a data word (w, ∼) if there is some
output v ∈ f (w) such that for every class X, the class
condition contains the substring of v obtained by only
keeping positions from X. In the realm of data words, data automata can be seen as a special case of class automata, where the class condition is only allowed to look at positions from
the current class. The language L in Example 1 can be
recognized by a data automaton (in the case of words), while
the language K in Example 2 is a language that can be
recognized by class automata, but not data automata. We
will comment more closely on the relationship between data
automata and class automata in Section VI, and also on
the relationship to other types of automata for data words
and data trees, including pebble automata [11], register
automata [9] and alternating register automata.
The difference between data automata and class automata
is crucial for decidability of emptiness. Data automata have
decidable emptiness [3], the proof being a reduction to
reachability in Vector Addition Systems with States.
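The weaker acceptance condition of data automata can be sketched the same way as before; again the names are ours, the transducer is simplified to per-letter relabeling, and a `global_condition` parameter stands in for the regular constraints that the real transducer would impose on its whole output:

```python
from itertools import product

def data_automaton_accepts(word, classes, relabelings,
                           global_condition, class_condition):
    """Brute-force acceptance for a data automaton on a data word.
    Unlike a class automaton, the class condition only sees the
    subsequence of output letters at positions of the current class."""
    for output in product(*(relabelings[a] for a in word)):
        if not global_condition(output):
            continue  # stands in for the regular power of the transducer
        if all(class_condition([g for g, c in zip(output, classes)
                                if c == cls])
               for cls in set(classes)):
            return True
    return False

# The language L of Example 1 ("some class has at least three a's"),
# this time as a data automaton: the transducer keeps exactly three
# a's, and each class must contain all of them or none of them.
def accepts_L(word, classes):
    return data_automaton_accepts(
        word, classes,
        {'a': {'a', 'c'}, 'b': {'c'}},
        lambda out: out.count('a') == 3,
        lambda sub: sub.count('a') in (0, 3))
```

Note how the class condition here never looks outside the current class, which is exactly the restriction separating data automata from class automata.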
Closure properties. Suppose that f : Σ1 → Σ2 is any
function. We extend f to a function fˆ from data trees over
alphabet Σ1 to data trees over alphabet Σ2 , by just changing
the labels of nodes, and not the tree structure or data values.
We use the name relabeling for any such function fˆ.
Lemma 1. Languages of data trees recognized by class
automata are closed under union, intersection, images under
relabelings, and inverse images under relabelings.
In the proof, one uses Cartesian product for intersection,
nondeterminism for union and images. The inverse images
are the simplest: the letter-to-letter tree transducer in the class automaton is composed with the relabeling.
Evaluation. The evaluation problem (given an automaton
and a data word/tree, check if the latter is accepted by the
former) is NP-complete, even for a fixed data automaton
(cf. [12]). Hence it is also NP-complete for class automata,
which extend data automata.
Class automata as a fragment of MSO. One can see a
class automaton as a restricted type of formula of monadic
second-order logic. This is a formula of the form:
∃X1 · · · ∃Xn ∀X class(X) ⇒ ϕ(X1 , . . . , Xn , X)    (2)

where X1 , . . . , Xn , X are variables for sets of nodes, the class formula is defined by

class(X) = ∃y ∀x ( x ∈ X ⇐⇒ y ∼ x ),
and ϕ is a formula of MSO that does not use ∼. Formulas of
the above form recognize exactly the same languages of data
trees as class automata. For translating a class automaton to
a formula, one uses the variables X1 , . . . , Xn to encode the
output of the transducer, and the formula ϕ to test two things:
a) the variables X1 , . . . , Xn encode a legitimate output of
the transducer; and b) the class condition holds for X.
Main result. The main result of this paper is that unary
XPath queries over data trees can be recognized by class
automata. To state the theorem, we need to say how a class
automaton recognizes a unary query. We do this by encoding
a unary query φ over data trees as a language of data trees:
Lφ = {(t ⊗ X, ∼) : (t, ∼) is a data tree, X = φ(t, ∼)}.
In other words, the language consists of data trees decorated
with the set of nodes selected by the query. (This encoding
does not generalize to binary queries.)
Theorem 1. Every unary XPath query over data trees can
be recognized by a class automaton.
We begin the proof of Theorem 1, mainly to show where
the difficulties appear. Then, we lay out the proof strategy
in more detail. When referring to the language of a unary
query, we mean the encoding above.
We do an induction on the size of the unary query. The
base case, when the query is a label a, is straightforward.
Consider now the induction step, with a unary query
φ[x] = ∃y1 ∃y2 ( y1 ∼ y2 ∧ ϕ[x, y1 , y2 ] ),

where ϕ is a regular ternary query. Likewise for y1 ≁ y2 .
Proof strategy. The construction of the automaton for
φ[x] is spread across several sections. In Section III-B,
we introduce the main concepts underlying the proof. In
particular, we define a new complexity measure for binary
relations on tree domains, called guidance width, that seems
to be of independent interest. In Section III-C we start the
proof itself, formulate an induction, and reduce Proposition 1
to a more technical Theorem 2. In Section III-D we identify a simplified form of the queries appearing in Theorem 2 (we show in Appendix VII how arbitrary queries can be transformed into this simplified form). Finally, Section IV contains the proof of Theorem 2 for the simplified queries, the heart of the whole proof. Section IV treats the case of words only – it already contains some
of the important ideas for the general tree case, but is
easier to digest. The proof for the general tree case is in
Appendix VIII.
A. Discussion of the proof
In this section, we discuss informally the concepts that
appear in the proof of Theorem 1.
We begin our discussion with words without data. For a
regular binary query ϕ[x, y], consider the unary query
ψ[x] = ∃y ϕ[x, y].
Recall also the query from the induction step,

φ[x] = ∃y1 ∃y2 ( y1 ∼ y2 ∧ ϕ[x, y1 , y2 ] ),

as in (1). (The same argument works for the case where y1 ≁ y2 .) Let φ1 , . . . , φn be all the unary XPath subqueries
that appear in ϕ. By the induction assumption, the languages of the subqueries are recognized by class automata
A1 , . . . , An . Let the variables X, X1 , . . . , Xn denote sets of
nodes. Consider the set L of data trees
(t ⊗ X ⊗ X1 ⊗ · · · ⊗ Xn , ∼)
such that a) for each i ∈ {1, . . . , n}, the data tree (t⊗Xi , ∼)
is accepted by the automaton Ai ; and b) X is the set of
nodes selected by the query φ0 obtained from φ by replacing
each subquery φi with “has 1 on coordinate corresponding
to Xi ”. Suppose that the language of φ0 is recognized by a
class automaton. Then so is L, by closure of class automata
under intersection and inverse images of projections, see
Lemma 1. Finally, the language of φ is the image of L under
the projection which removes the labels describing the sets
X1 , . . . , Xn .
It remains to show that φ0 is recognized by a class automaton (the advantage of φ0 over φ is that the ternary query is now regular). Most of this paper is devoted to this case, which is stated in the following proposition.

Proposition 1. Class automata can recognize queries of the form (1) in which the ternary query ϕ is regular, and likewise with y1 ≁ y2 in place of y1 ∼ y2 .
We use the name witness function in a word w for a function
which maps every position x satisfying ψ to some y such
that ϕ[x, y] holds. Consider, as an example, the case where
ϕ[x, y] says that there exists exactly one z that has label
a and satisfies x < z < y. The following picture shows a
witness function.
[Figure: a word over {a, b}, with an arrowed line from each position satisfying ψ to a chosen witness; the lines are drawn in three colors (black, dashed black, grey) so that no position is traversed by two lines of the same color.]
The way the picture is drawn is important. The witness
function is recovered by following arrowed lines. The arrowed lines are colored black, dashed black, or grey, in such
a way that no position is traversed by two arrowed lines of
the same color. With the formula ψ in the example, any input
word has a witness function that can be drawn with three
colors of arrowed lines. This can be generalized to arbitrary
MSO binary queries; the number of colors depends only on
the query, and not the input word.
The above observation may be used to design a nondeterministic automaton recognizing a property like ∀x ψ[x].
The automaton would guess the labeling by arrows and then
verify its correctness. The number of states in the automaton
would grow with the number of colors; hence the need for
a bound on the number of colors. Of course, there are other
ways of recognizing ∀xψ[x], but we talk about the coloring
since this is the technique that will work with data.
We now move to data words. Consider a unary query
ψ[x] = ∃y1 ∃y2 ( y1 ∼ y2 ∧ ϕ[x, y1 , y2 ] ),
where ϕ[x, y1 , y2 ] says that y1 ≤ x ≤ y2 , there is exactly
one a label in the positions {y1 , . . . , x} and there is exactly
one b label in the positions {x, . . . , y2 }. The query ψ[x] is
an example of a query as in Proposition 1. Consider the
following data word (the labels are blank, a and b).
[Figure: a data word with data values 1 2 3 4 5 6 5 4 3 2 1; the position with value 6 has label a, the five positions after it have label b, and the remaining positions are blank.]
Look at the first node with label b, which is selected by ψ,
and consider pairs (y1 , y2 ) required by ψ[x], which we call
witnesses. The only possibility for y2 is x itself; thus y1 is
also determined, as the only other position with the same
data value. The same situation holds for all other positions
with label b, which are the only positions selected by ψ.
The drawing below shows how witness pairs are assigned
to positions.
[Figure: the same data word, with each b-labeled position joined by arrowed lines to its witness pair (y1 , y2 ); the lines to the first witnesses all traverse the middle position labeled a.]
We would like to draw this picture with colored arrows,
as we did for the first example of witness functions. If
we insist on drawing arrows that connect each position x
with its corresponding witness y1 , then we will need 5 colors, as the middle position (labeled by a) is traversed by 5 arrows; the example generalizes to require any number of colors. On the other hand, connecting each position x with its corresponding y2 (a self-loop) requires only one color. We can come up with symmetric instances of data words where connecting each node x to y2 requires an unbounded number of colors.
A consequence of our main technical result, Theorem 2,
is that a bounded number of colors is sufficient if we want
to perform the following task: for each position x selected
by ψ, choose some witness pair y1 ∼ y2 , and connect x to
either y1 or y2 .
The concepts of witness functions and coloring are defined
more precisely below.
B. The core result
Witness functions. We will state some technical results for
a structure more general than a data tree, namely a graph
tree. A graph tree is a tree t together with an arbitrary
symmetric binary relation E. A data tree is the special case
of a graph tree where E is an equivalence relation.
Let ϕ[x, y1 , y2 ] be a regular query (think of Proposition 1),
and consider a graph tree (t, E). We are interested in triples
(x, y1 , y2 ) selected by ϕ in t such that (y1 , y2 ) ∈ E. (Think
of E being either the data equivalence relation ∼, or its
complement.) Consider any such triple. The node x is called
the source node; the notion of source node is relative to the
query ϕ and relation E, which will usually be clear from the context and not mentioned explicitly. The pair (y1 , y2 ) is
called the witness pair, y1 is called the first witness, and y2 is
called the second witness. These notions are all relative to a
given x, but if we do not mention the x, then x is quantified
existentially. Let X be a set of source nodes in a graph
tree (not necessarily containing all source nodes). A witness
function for ϕ and X in a graph tree is a function which
maps every source node x ∈ X to some (first or second)
witness. There may be many witness functions, since for
each node we can choose to use either a first witness or a
second one, and there may be multiple witness pairs.
The key technical result of this paper is that one can
always find a witness function of low complexity. The notion
of complexity is introduced below.
Guidance width. A guide in a tree t is given by two
nonempty sets of source nodes and target nodes. The support
of the guide is the set of all nodes and edges on (the shortest)
paths that connect some source node with a target node,
including all the source and target nodes. A guide conflicts
with another guide if their supports intersect. We write π for
guides.
A guidance system is a set of guides Π. It induces a
relation containing all pairs (x, y) of tree nodes such that
x is a source and y a target in some guide in Π. An n-color
guidance system is a guidance system whose guides can be
colored by n colors so that conflicting guides have different
colors. The guidance width of a binary relation R on tree
nodes is the smallest n such that some n-color guidance
system induces R.
In the proof we will only consider guidance systems
for relations R that are partial functions. In such cases,
it is sufficient to restrict to deterministic guides, i.e., those
with precisely one target node. From now on, if not stated
otherwise, a guidance system will be implicitly assumed to
contain only deterministic guides.
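For words, a support is just an interval of positions, so counting the colors a given guidance system needs reduces to interval-graph coloring. A sketch with our names; note it computes the colors needed by one given system, whereas the guidance width of a relation minimizes over all systems inducing it:

```python
def colors_needed(guides):
    """guides: list of (sources, targets) pairs of position sets on a
    word.  On a word, the support of a guide is the interval spanned
    by its sources and targets, and two guides conflict iff their
    supports intersect; the minimum number of colors is the maximum
    number of supports covering a single point."""
    intervals = [(min(s | t), max(s | t)) for s, t in guides]
    events = []
    for lo, hi in intervals:
        events.append((lo, 1))       # support starts
        events.append((hi + 1, -1))  # support has ended
    depth = best = 0
    for _, d in sorted(events):
        depth += d
        best = max(best, depth)
    return best
```

For instance, two guides whose supports share even a single position conflict and need two colors, while disjoint supports can reuse one color.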
Witness functions of bounded width. We are now ready
to state the main technical result, which forms the core of
the proof of Theorem 1.
Theorem 2. Let ϕ be a regular ternary query. There exists
a constant m, depending only on ϕ, such that in every graph
tree, every set of source nodes has some witness function of
guidance width at most m.
In other words, regular ternary queries have witness
functions of bounded guidance width. Before proving the
theorem, we show how it implies Theorem 1.
C. From Theorem 2 to Proposition 1
We show how Theorem 2 implies the last remaining piece
of Theorem 1, namely Proposition 1. Consider a unary query
φ[x] as in the statement of Proposition 1. We begin with the
case when φ[x] requires y1 ∼ y2 . We need to find a class
automaton that accepts the data trees (t ⊗ X, ∼) where X is
the set of all nodes selected by φ in the data tree (t, ∼). The
class automaton will test the conjunction of two properties:
Completeness. Each node selected by φ in (t, ∼) is in X.
Correctness. Each node in X is selected by φ in (t, ∼).
We give separate class automata for the two properties.
Completeness is simple. It can be rephrased as
for every class Y and triple (x, y1 , y2 ) selected by
ϕ, if y1 , y2 ∈ Y then x ∈ X.
This is the type of property class automata are designed for:
for every class, test a regular property. (Recall the discussion
on class automata as a fragment of MSO.) Correctness is
the difficult property, since the order of quantifiers is not
the same as in a class automaton:
for every x ∈ X there is a class Y and y1 , y2 ∈ Y
such that (x, y1 , y2 ) is selected by ϕ.
Our solution is to use, as a part of the class automaton to
be designed, a guidance system given by Theorem 2.
Apply Theorem 2 to ϕ, yielding a constant m. The class
automaton for the correctness property works as follows.
Given an input data tree (t ⊗ X, ∼), it guesses an m-color
guidance system; let R stand for the induced relation. The
automaton then checks the two conditions below.
A. For every x ∈ X there is some y with xRy.
B. For every class Y , if xRy, x ∈ X, y ∈ Y , then either (x, y, y′ ) or (x, y′ , y) is in ϕ(t), for some y′ ∈ Y .
If the class automaton accepts, then clearly every position in
X is a source node. Conversely, if all nodes in X are source
nodes, then there is an accepting run of the above class
automaton. This accepting run uses the guidance system for
the witness function from Theorem 2.
This completes the proof for the case when φ[x] requires y1 ∼ y2 . For the case y1 ≁ y2 , the proof is almost the same, but for two changes. The first change is that we apply Theorem 2 to the graph trees (t, E), obtained from data trees (t, ∼) by taking as E the complement of ∼. This explains why Theorem 2 is formulated for graph trees and not just data trees. The second change is that we write y′ ∉ Y instead of y′ ∈ Y at the end of condition B.
D. Simplifying assumptions
Before proving Theorem 2, we make two simplifying
assumptions about the query ϕ[x, y1 , y2 ].
For two nodes x, y in a tree t, we write wordt (x, y) for
the sequence of labels on the unique shortest path from x
to y in t, including x and y. We omit the subscript t when
a tree is clear from the context. Note that wordt (x, y) is
always nonempty and wordt (x, x) is the label of x.
The two assumptions about the query ϕ[x, y1 , y2 ] are:
1) All selected triples satisfy y1 < x < y2 .
2) Whether or not a triple is selected depends only on
the words wordt (y1 , x) and wordt (x, y2 ). It does not
depend on nodes outside the path from y1 to y2 .
In Appendix VII we show that the above conditions can
be assumed without loss of generality. In the case of words,
the simplification is standard, but for trees it requires new
ideas about guidance systems.
IV. THEOREM 2 FOR WORDS
In this section we prove Theorem 2 for (graph) words,
which are the special case of (graph) trees where each node
has at most one child. For words we say position instead of
node. The case of trees is solved in Appendix VIII.
Recall the simplifying assumptions from Section III-D. Since the query is regular, the dependency stated in item 2) is a regular one. We use an algebraic approach
to represent the regular query. There is a morphism α :
Σ∗ → S, which maps each word to an element of a finite
semigroup S. Whether or not a triple (x, y1 , y2 ) is selected
by ϕ depends only on the images
α(wordw (y1 , x)) ∈ S    and    α(wordw (x, y2 )) ∈ S.    (3)
Forward Ramseyan splits. In a graph word (w, E) we will
distinguish two types of edges: word edges, which connect
positions in the word w with their successors, and class
edges, which are from the additional structure E. We also
have two dummy word edges: one entering the first position,
and one leaving the last position. We order word edges
according to the positions in the word, with the two dummy
word edges coming first and last, respectively. For two word
edges e ≤ f in a word w, we write wordw (e, f ) for the infix
of w that begins in the target of e and ends in the source of
f . In particular wordw (e, e) = ε.
A split of height n in a word w is a function σ that maps
each word edge to a number in {1, . . . , n}. We say that two
word edges e, f are neighbours with respect to a split σ, if
σ assigns the same number to e and f , and all word edges
between e and f are mapped to at most σ(e). A split σ is
called forward Ramseyan with respect to a morphism α if
α(wordw (e, f )) = α(wordw (e, g))    (4)
holds for every three pairwise neighbouring word edges e <
f < g. The following theorem was shown in [13].
Theorem 3. Any word w ∈ Σ∗ has a forward Ramseyan
split σw of height O(|S|). Furthermore, the split is left-to-right deterministic in the following sense: if two words agree
on a prefix leading to a word edge e, then the splits also
agree on the prefix leading to e, including e.
The reason for the determinism property is that for any
morphism α, there is a deterministic left-to-right transducer
that produces the splits. We do not need the determinism
property for words, but it will be important for trees. In
the sequel we assume fixed forward Ramseyan splits σw for
each word w ∈ Σ∗ as stated in Theorem 3.
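To make these definitions concrete, here is a small sketch of our own (not from [13]) that checks whether a given split is forward Ramseyan with respect to a morphism alpha. We index word edges 0, ..., |w|, so that edge i sits just before position i and word[e:f] is the infix between edges e and f.

```python
def neighbours(sigma, e, f):
    # e < f are neighbours: same value, and every edge strictly between
    # them is mapped to at most that value
    return sigma[e] == sigma[f] and all(sigma[i] <= sigma[e] for i in range(e + 1, f))

def is_forward_ramseyan(word, sigma, alpha):
    """Check condition (4): alpha(word[e:f]) == alpha(word[e:g]) for every
    three pairwise neighbouring word edges e < f < g.  sigma has one entry
    per word edge, i.e. len(sigma) == len(word) + 1."""
    assert len(sigma) == len(word) + 1
    edges = range(len(sigma))
    for e in edges:
        for f in edges:
            for g in edges:
                if not (e < f < g):
                    continue
                if (neighbours(sigma, e, f) and neighbours(sigma, f, g)
                        and neighbours(sigma, e, g)):
                    if alpha(word[e:f]) != alpha(word[e:g]):
                        return False
    return True
```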
Factors. Fix a word w. The split σw divides the word
into pieces, called factors, which are defined as follows.
The two word edges e and f are called visible if all word
edges between them get values strictly smaller than both
σw (e) and σw (f ). A factor is a set of positions between
visible word edges. These word edges are called the border
edges of the factor. For convenience assume that the two dummy edges beginning and ending w are visible.
We write F, G, H for factors. Consider a factor F with
border edges e < f . Let i be the maximal number assigned
by the split to word edges inside the factor (word edges
strictly between the border edges), and let e1 , . . . , ek be
all the word edges in the factor that are assigned this
maximal number i. It is not difficult to see that the maximal
(with respect to inclusion) factors strictly included in F are
the factors whose pairs of border edges are, respectively,
(e, e1 ), (e1 , e2 ), . . . , (ek−1 , ek ), (ek , f ). We call these factors
the subfactors of F . The subfactors form a partition of the
factor.
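The factor decomposition can likewise be sketched directly from the definitions (again our own illustration, with the same edge indexing as before):

```python
def visible(sigma, e, f):
    """Edges e < f are visible: every edge strictly between them gets a
    value strictly smaller than both sigma[e] and sigma[f]."""
    return all(sigma[i] < min(sigma[e], sigma[f]) for i in range(e + 1, f))

def subfactor_borders(sigma, e, f):
    """Border-edge pairs (e, e1), (e1, e2), ..., (ek, f) of the subfactors
    of the factor with border edges e < f, where e1, ..., ek are the edges
    carrying the maximal split value strictly inside the factor."""
    inside = list(range(e + 1, f))
    if not inside:
        return []
    top = max(sigma[i] for i in inside)
    cuts = [e] + [i for i in inside if sigma[i] == top] + [f]
    return list(zip(cuts, cuts[1:]))
```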
Our proof of Theorem 2 is based on a lemma stated below.
The lemma is proved by induction on the height of factors (the height of a factor is the maximal value assigned by the split inside the factor).
To state the Main Lemma, we need one new notion. A
guidance system is called consistent if each of its guides
obeys the following uniqueness requirement: if a set Z of source nodes is guided to the same node y, then there is a pair (y1, y2) that is a witness pair for all nodes in Z, with y1 = y or y2 = y. Roughly speaking: if all nodes in Z
agree on the witness they are guided to, then they agree
on the other witness as well. The notion of consistency is
meaningful only relative to a given ternary query.
Lemma 2 (Main Lemma). Fix a factor height h. There is
a bound n ∈ N, depending only on ϕ and h, such that for
every graph word (w, E), every factor F in w of height h,
and every set X ⊆ F of source nodes, there is a witness
function for ϕ and X in (w, E) induced by a consistent
guidance system using at most n colors. Furthermore, this
witness function only points to nodes inside or to the right
of F .
Note that we do not require the witnesses for nodes in
X to be contained in F . Theorem 2 is a special case of
the lemma for F that contains all positions in the word. By
Theorem 3, this factor F has bounded height O(|S|).
The proof of the lemma is by induction on the height h.
The number of colors n will depend on h and the size of
the monoid S recognizing the query ϕ. When going from
height h to height h + 1, there will be a linear blowup in the
number of colors. Therefore, n will be exponential in the
height of F . This contrasts with the tree case, where each
increment in the height comes with a quadratic blowup in
n, which makes n doubly exponential in the height.
Since the witness function will be induced by a guidance
system, the last assumption in the Main Lemma could be
restated as saying that no guide in the guidance system
passes through the left border edge of F .
The induction base, when F is empty or contains just one
position, is immediate. Consider now the induction step. Let
G1 , . . . , Gk be all subfactors of F , listed from left to right.
We use the term internal border edge for any word edge that
connects Gi with Gi+1 . We use the term external border
edge for the two border edges of F , which are incident
with the first position in G1 and the last position in Gk ,
respectively.
Let n be the number of colors that suffices for guidance systems for sets contained in the subfactors of F. This number is obtained from the induction assumption.
For a node x ∈ X and a witness (y1 , y2 ) we define
two numbers m1 , m2 . Let m1 be the number of internal
border edges between y1 and x, and let m2 be the number
of internal border edges between x and y2 . For technical
convenience, we deliberately choose not to count external
border edges. We divide the set X into three parts:
1) Nodes x ∈ X that have a witness with m2 ≤ 1.
2) Nodes x ∈ X that have a witness with m1 ≤ 1 and
m2 ≥ 2.
3) Nodes x ∈ X that have a witness with m1 , m2 ≥ 2.
If some node belongs to more than one of these parts, by having several witnesses, we assign it to one of them arbitrarily. We
prove the Main Lemma for each of the three parts separately.
Next, we combine the three guidance systems into a single
guidance system.
Nodes x ∈ X that have a witness with m2 ≤ 1. Take a
subfactor Gi , and remove from the graph E all class edges
(y1 , y2 ) where the path from the inside of Gi to y2 crosses
more than one internal border edge. None of these removed pairs is necessary as a witness for X ∩ Gi, so we can apply the induction assumption to Gi, producing a guidance system
Πi with at most n colors. Since the Main Lemma prohibits
crossing the left border edge of Gi , we infer that inside F
the guides of Πi pass through at most Gi and Gi+1 , and
no other subfactors. Therefore, all the guidance systems Πi
can be combined into a single guidance system with at most
2n colors, by using one set of n colors for even-numbered
subfactors and another set of n colors for odd-numbered
subfactors.
Nodes x ∈ X that have a witness with m1 ≤ 1 and m2 ≥
2. As in the previous case, for each subfactor Gi we use the induction assumption to produce a guidance system Πi with at most n colors for nodes in X ∩ Gi. Consider a
guide π of Πi that exits the subfactor Gi . Let y2 be the
target of π. By the consistency property of π, we know
that there is some y1 such that the pair (y1 , y2 ) ∈ E is a
witness pair for all the source nodes of π. We remove π from
Πi and create a new guide, with a new color, that directs
all the sources of π, which are contained in Gi , to y1 . By
the assumption m1 ≤ 1, we know that the new guide crosses at most one internal border edge. After this modification,
the only subfactors that are crossed by the guides of the
guidance system are Gi and possibly Gi−1 . Since we used
new colors for the new guides, the new guidance system
might now require 2n colors. We combine all these guidance
systems into a single one using the even/odd strategy from
the first case, thus yielding a guidance system with at most
4n colors.
Nodes x ∈ X that have a witness with m1 , m2 ≥ 2. In
this case we use the forward Ramseyan split. All guides
created in this case will point to the right.
Consider a source x with a witness (y1 , y2 ). The internal
border edges naturally split the words word(y1 , x) and
word(x, y2 ) into m1 +1 and m2 +1 words, respectively:
word(y1, x) = v0 · v1 · . . . · vm1
word(x, y2) = w0 · w1 · . . . · wm2.
The first letter of v0 is the label of y1 . The last letter
of vm1 and also the first letter of w0 is the label of x.
The last letter of wm2 is the label of y2 (recall that we
do not count an external border edge, so y1 or y2 might
be outside F ). Furthermore, each two consecutive internal
border edges are not only visible, but also neighbouring, by
definition of subfactors. Hence, the values α(word(y1 , x))
and α(word(x, y2 )) are determined by the first two parts
and the last part:
α(word(y1, x)) = α(v0) · α(v1) · α(vm1)
α(word(x, y2)) = α(w0) · α(w1) · α(wm2).    (5)
Let us fix six values s1, . . . , s6 ∈ S. By splitting the set X into at most |S|^6 parts, each requiring a separate guidance system, we can assume that each x has a witness where

s1 = α(v0)    s2 = α(v1)    s3 = α(vm1)
s4 = α(w0)    s5 = α(w1)    s6 = α(wm2).
We consider witnesses satisfying the assumptions above.
For each i = 0, . . . , k we will create a guidance system
Πi that will provide witnesses for all elements of X in the
union of subfactors G1 ∪ · · · ∪ Gi . The guidance system
will use three colors, and will have the following additional
property: if e is a word edge that connects a subfactor Gj
with Gj+1 , then at most 2 guides pass through e, all of them
directed left-to-right, and at most one of them exits Gj+1 .
The induction base, when i = 0, is immediate, since the union of subfactors is empty. We now show how to extend Πi
by adding a subfactor Gi+1 (only the case i+1 ≤ k −2 is of
interest because of the assumption m2 ≥ 2). To this end we
need a guidance system Π that induces a witness function
for all positions of X inside Gi+1 . Surprisingly enough, we
will not apply the Main Lemma to Gi+1 , as we have:
Claim 1. There is a single consistent guide π that induces
a witness function for all sources in X ∩ Gi+1 .
Let Πi+1 := Πi ∪ {π}, where π exits the subfactor Gi+1 towards Gi+2.
Now we show that Πi+1 has the property specified above:
at most two of its guides exit Gi+1 , and at most one of
them exits Gi+2. By induction assumption on Πi, at most one of the guides of Πi entering the subfactor Gi+1, say π′, exits this subfactor; hence at most two guides (π and π′) of Πi+1 exit Gi+1. Furthermore, if there are two such guides and neither of them has its target in Gi+2, then they may be merged into one by the following:
Claim 2. If the targets of both π and π′ are not in Gi+2, then all source nodes of π can be moved to π′.
The required property of Πi+1 is thus demonstrated, which completes the proof of the induction step of the Main Lemma, and with it the entire proof of Theorem 2 for words.
V. APPLICATIONS
In this section, we present two applications of our results.
The first application is a class of XML documents for which
emptiness of XPath is decidable. The second application is
a proof that two-variable first-order logic is not captured by
XPath, in the presence of two attribute values per node.
Satisfiability of XPath. As we said in the introduction, our study of class automata is a first step in the search for structural restrictions on data words and data trees which make XPath satisfiability decidable.
One idea for a structural restriction would be a variant
of bounded clique width, or tree width. Maybe bounded
clique or tree width are interesting restrictions, but they are
not relevant in the study of class automata. This is because
bounded clique width or tree width, when defined in the
natural way for data trees, guarantees decidable satisfiability
for a logic far more powerful than class automata: MSO with
navigation and equal data value predicates.
Here we provide a basic example of a restriction on inputs
that works for class automata but not for MSO. A data tree
is called bipartite (bipartite refers to the data) if its nodes
can be split into two connected (by the child relation) sets
X, Y such that every class has at most one node in X and
at most one node in Y .
Satisfiability of MSO, or even FO, with navigation and
data equality predicates is undecidable even for bipartite data
words. For instance, a solution to the Post Correspondence
Problem can be encoded in a bipartite data word.
This coding, however, cannot be captured by class automata. In the appendix, we prove the following theorem.
The proof uses semilinear sets.
Theorem 4. On bipartite data trees, emptiness is decidable
for class automata, and therefore also for XPath.
Multiple attributes. Heretofore, we have studied data trees,
which model XML documents where each node has one data
value. In this section, and this one only, we consider the
situation where each node x has n data values. Formally,
an n-data tree consists of a tree t over the finite alphabet
and functions d1 , . . . , dn which map the tree nodes to data
values. How does XPath deal with multiple data values?
Instead of y1 ∼ y2 and y1 ≁ y2, we can use any formula of the form

di(y1) = dj(y2)    where i, j ∈ {1, . . . , n}

or its negation (for inequality). For n > 1 we need more
information than just the partitions of nodes into classes
of ∼i , for each i ∈ {1, . . . , n}. An example is the property
“every node has the same data value on attributes 1 and 2”.
How do we extend class automata to read n-data trees?
For one data value, the class condition is a language over the
alphabet Γ × {0, 1}. For n data values, the class condition is
a language over the alphabet Γ × {0, 1}n . An n-data tree
(t, d1, . . . , dn) is accepted if there is an output s of the transducer on t such that for every data value d, the tree

s ⊗ d1^{-1}(d) ⊗ · · · ⊗ dn^{-1}(d)
is accepted by the class condition. By the same technique as
in the proof of Theorem 1, we can prove that the automata
capture XPath.
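As an illustration of this acceptance condition, here is a sketch of our own, using lists as stand-ins for trees (a word is a special case of a tree): the word fed to the class condition for a data value d is built as follows.

```python
def class_word(s, attrs, d):
    """Build s ⊗ d1^{-1}(d) ⊗ ... ⊗ dn^{-1}(d) for a word: each position
    of the transducer output s gets one bit per attribute, telling whether
    that attribute of the position carries the data value d."""
    return [(a, tuple(int(di[i] == d) for di in attrs))
            for i, a in enumerate(s)]
```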
A consequence is that for n ≥ 2, XPath does not capture
two-variable first-order logic (unlike the case of n = 1).
This was an open question.
Theorem 5. The following (two-variable) property

ψ = ∀x ∀y ( d1(x) = d1(y) ⇐⇒ d2(x) = d2(y) )
cannot be defined by a boolean query of XPath.
VI. CLASSIFICATION OF AUTOMATA ON DATA WORDS
In this section, we show how all existing models of
automata over data words can be presented as special cases
of class automata. In particular, we show how alternating
one register automata [5] are captured by a restriction of
class automata. We do the comparison for data words only.
We also characterize expressibility of the relevant subclasses of class automata by (restrictions of) counter automata.
Counter automata. Let C be a finite set of names for
counters that store natural numbers. A counter automaton is
a nondeterministic finite state automaton, whose transitions, allowing also ε-transitions, are labeled by counter operations.
For each counter name c we have the operations: increment
(c := c + 1), decrement (c := c − 1) and zero test (c = 0?).
There are restrictions on executing transitions labeled by
decrement or zero test: for the decrement, the counter must
have value at least 1, for the zero test the counter must have
value 0. A counter automaton accepts if it reaches an accepting state with all counters equal to zero. We write A0 for the class of
counter automata. We identify the following subclasses of
A0 .
A1 : Counter automata without zero tests.
A2 : Presburger automata. Counter values are in Z. Only
increments and decrements are allowed, and no zero
tests. Decrements can always be executed. It is easy to
encode this as a counter automaton, by representing an
integer as the difference of two naturals.
A3: Gainy counter automata. Like unrestricted counter automata, but each state must have a self-loop ε-transition labeled by c := c + 1, for every c ∈ C.
Unlike the unrestricted model, each of the three restricted
models has decidable emptiness. For counter automata without zero tests, the underlying problem is reachability for
vector addition systems, see [14]. For Presburger automata,
the proof uses semilinear Parikh images. For gainy counter
automata, the proof uses well-structured transition systems,
see e.g. [5].
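A minimal sketch of our own (not tied to any particular automaton in the paper) of the counter discipline: each operation carries its guard, and a run that violates a guard is blocked.

```python
def run_ops(ops, counters=None):
    """Execute a sequence of counter operations ('inc', c), ('dec', c),
    ('zero?', c), enforcing the guards: a decrement requires value >= 1
    and a zero test requires value 0.  Returns the final valuation, or
    None if some guard fails (the run is blocked)."""
    counters = dict(counters or {})
    for kind, c in ops:
        v = counters.get(c, 0)
        if kind == 'inc':
            counters[c] = v + 1
        elif kind == 'dec':
            if v == 0:
                return None  # decrement blocked on an empty counter
            counters[c] = v - 1
        elif kind == 'zero?':
            if v != 0:
                return None  # zero test fails
    return counters
```

The gainy variant A3 would additionally allow spontaneous increments at any point of the run.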
Restrictions on class condition. Recall the ingredients of
a class automaton: a transducer f from the input alphabet
Σ to the work alphabet Γ, and a class condition, which is a
regular language L over the alphabet Γ × {0, 1}. We write
C0 for all class automata. We identify three restrictions on
the class condition L, which limit the way it inspects a word
w ⊗ X, with w ∈ Γ∗ and X a set of positions in w. They
yield three subclasses of class automata:
C1 : The class condition L is local, i.e., membership w ⊗
X ∈ L depends only on the labels of positions x ∈ X.
It does not depend on labels of positions outside X.
C2 : The class condition is commutatively local, i.e., membership w ⊗ X ∈ L depends only on the multiset
{w|Y : Y is a maximal interval contained in X} ⊆ Γ∗ .
In the above, we write w|Y for the substring of w
corresponding to positions from Y . An interval means
a connected set of positions.
C3 : The class condition is a tail language, i.e., there is a
regular language K such that w ⊗ X ∈ L if and only
if K contains every suffix of w ⊗ X that begins in a
position from X.
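For concreteness, here is a sketch of our own of how the restriction C3 inspects w ⊗ X, given a membership test for the regular language K:

```python
def tail_condition(in_K, w, X):
    """Restriction C3: w ⊗ X belongs to the class condition iff every
    suffix of w ⊗ X that begins in a position from X lies in K.  Here w
    is a list of letters, X a set of positions, and in_K a membership
    test for K over the alphabet Gamma x {0, 1}."""
    marked = [(a, int(i in X)) for i, a in enumerate(w)]
    return all(in_K(marked[i:]) for i in X)
```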
In the appendix we prove that C1 is equivalent to the model of data automata from [3]. We also show that C3 captures alternating one register automata (the proof also works for trees). In consequence C3 captures weak two-pebble automata [11], as they are simulated by alternating one register automata [15]. The same applies to the top view weak pebble automata introduced in [15].

Table I. CLASSIFICATION OF AUTOMATA ON DATA WORDS.
Model of counter automata       | Class condition     | Complexity of emptiness       | Languages of data words captured
Counter automata                | No restriction      | Undecidable                   | (Extended Regular) XPath
Counter automata w/o zero tests | Local               | Decidable, EXPSPACE-hard      | Data automata, FO2(<, +1, ∼)
Presburger automata             | Commutatively local | NP-complete                   | FO2(+1, ∼)
Gainy counter automata          | Tail language       | Decidable, non-primitive rec. | Alternating one register automata, one register freeze LTL
Classification. We will show that the restrictions on counter
automata match the restrictions on class conditions. To
compare languages of words without data and languages of
data words, we use the following definition. Let L be a set
of data words over an alphabet Σ, and π : Σ → Γ ∪ {}
a relabeling of Σ that may erase some letters. We define
π(L) ⊆ Γ∗ to be the words without data obtained from L
by ignoring the data and applying π to the underlying word.
Any such language π(L) is called a projection of L.
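A projection can be sketched directly (our own code; the data value of each position is simply dropped, and π may erase a letter by returning None, playing the role of ε):

```python
def project(data_word, pi):
    """Projection of a data word: forget the data (here, the class of each
    position) and apply the relabeling pi, which may erase a letter by
    returning None."""
    return ''.join(pi(a) for a, _cls in data_word if pi(a) is not None)
```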
Theorem 6 (Classification theorem). For i ∈ {0, 1, 2, 3} the following language classes are equal:
• Languages of words without data recognized by counter automata from Ai.
• Projections of languages of data words recognized by class automata from Ci.
The translations in both directions are effective in the
following sense. For any automaton A ∈ Ai , one can
compute an automaton C ∈ Ci and a projection π such that
the language of A is the projection under π of the language
of C. Conversely, for any C ∈ Ci and π, one can compute
an automaton A ∈ Ai recognizing the projection under π
of the language of C. Consequently, the emptiness problems
for Ai and Ci can be reduced to each other. In most of the
cases the reductions are polynomial.
The theorem is illustrated in Table I. In the third column,
the table includes the complexities of deciding emptiness for
the counter automata. In the last column, the table lists some
properties of data words captured by the automata.
The table does not include nondeterministic automata with
multiple registers of [9]. Their projections are regular word
languages. By [12] we know that nondeterministic register
automata are captured by data automata, and therefore they
correspond to some restriction on the class condition in
a class automaton. But the construction in [12] is quite
sophisticated, and we do not know what this restriction is.
We are curious how our undecidable model over data words and trees relates to other undecidable models, such as strong one-way pebble automata, alternating automata with multiple registers, two-way automata with registers, or combinations thereof.

ACKNOWLEDGMENT
The authors would like to thank the anonymous reviewers for their valuable comments.

REFERENCES
[1] M. Benedikt, W. Fan, and F. Geerts, “XPath satisfiability in the presence of DTDs,” J. ACM, vol. 55, no. 2, 2008.
[2] F. Geerts and W. Fan, “Satisfiability of XPath queries with sibling axes,” in DBPL, 2005, pp. 122–137.
[3] M. Bojańczyk, A. Muscholl, T. Schwentick, L. Segoufin, and C. David, “Two-variable logic on words with data,” in LICS, 2006, pp. 7–16.
[4] M. Bojańczyk, A. Muscholl, T. Schwentick, and L. Segoufin, “Two-variable logic on data trees and XML reasoning,” J. ACM, vol. 56, no. 3, 2009.
[5] S. Demri and R. Lazić, “LTL with the freeze quantifier and register automata,” ACM Trans. Comput. Log., vol. 10, no. 3, 2009.
[6] M. Jurdziński and R. Lazić, “Alternation-free modal mu-calculus for data trees,” in LICS, 2007, pp. 131–140.
[7] D. Figueira, “Satisfiability of downward XPath with data equality tests,” in PODS, 2009, pp. 197–206.
[8] ——, “Forward-XPath and extended register automata on data-trees,” in ICDT, 2010, to appear.
[9] M. Kaminski and N. Francez, “Finite-memory automata,” Theor. Comput. Sci., vol. 134, no. 2, pp. 329–363, 1994.
[10] L. Segoufin, “Automata and logics for words and trees over an infinite alphabet,” in CSL, 2006, pp. 41–57.
[11] F. Neven, T. Schwentick, and V. Vianu, “Finite state machines for strings over infinite alphabets,” ACM Trans. Comput. Log., vol. 5, no. 3, pp. 403–435, 2004.
[12] H. Björklund and T. Schwentick, “On notions of regularity for data languages,” in FCT, 2007, pp. 88–99.
[13] T. Colcombet, “A combinatorial theorem for trees,” in ICALP, 2007, pp. 901–912.
[14] E. W. Mayr, “An algorithm for the general Petri net reachability problem,” in STOC, 1981, pp. 238–246.
[15] T. Tan, “On pebble automata for data languages with decidable emptiness problem,” in MFCS, 2009, pp. 712–723.
VII. SIMPLIFYING THE QUERY
The goal of this section is to reduce Theorem 2 to the case
when ϕ[x, y1 , y2 ] is a simplified query (see Section III-D).
This simplification is achieved in several steps.
A. Generalized witness functions
Fix any number n ∈ N, although we will be mainly
interested in n ∈ {1, 2}. Consider a regular query
ϕ[x, y1 , . . . , yn ] over trees. Consider now a tree t together
with a set E of n-tuples of nodes in t. As before, the idea is
that E gives a constraint on the witness variables. A witness
tuple for a node x is a tuple (y1 , . . . , yn ) ∈ E such that
(x, y1 , . . . , yn ) is selected by ϕ in t. In this case, we say
that x is a source, and yi is an i-th witness for x (the other
variables are quantified existentially).
A witness function for ϕ and a set of source nodes X in
(t, E) is a function which assigns to each node x ∈ X some
witness (an i-th witness for some i, with i depending on x).
We say that a regular query ϕ[x, y1 , . . . , yn ] has witness
functions of guidance width m if for every tree t and every
choice E of n-tuples of nodes of t, there is a witness function
for ϕ and any set X of source nodes in (t, E) of guidance
width at most m. A query ϕ has bounded width witness
functions if some such m exists.
The following lemma concerns the case n = 1:
Lemma 3. Every binary query ϕ[x, y] has bounded width
witness functions.
B. Three arrangements
By an arrangement of the nodes x, y1 , y2 in a tree we
mean the information on how these nodes, and their greatest
common ancestors
gca(x, y1 )
gca(x, y2 )
gca(y1 , y2 )
are related with respect to the descendant ordering. We
distinguish three different arrangements, pictured below.
[Figure: the three arrangements of x, y1, y2, drawn left to right as (A1), (A2), (A3).]

These arrangements correspond, respectively, to the following situations:

gca(x, y1) = gca(x, y2) ≤ gca(y1, y2)    (A1)
gca(x, y1) = gca(y1, y2) ≤ gca(x, y2)    (A2)
gca(x, y2) = gca(y1, y2) ≤ gca(x, y1)    (A3)

The arrangements are not mutually exclusive; for instance the case x = y1 = y2 is covered by all three.

Lemma 4. We may assume without loss of generality that all the triples selected by ϕ have the same arrangement.
Proof: Otherwise we can split ϕ into a union of three queries, one for each arrangement, and then combine the three separate guidance systems.
C. Path based queries
Let us fix one of the arrangements. There are four words w1, w2, w3, w4 that will interest us. These are shown in the picture below for arrangement (A1) only, but the reader can easily see the situation for the other arrangements.
[Figure: in arrangement (A1), the four words are]

w1 = word(gca(x, y1), x)
w2 = word(gca(x, y1), gca(y1, y2))
w3 = word(gca(y1, y2), y1)
w4 = word(gca(y1, y2), y2).
The query is called path based if its truth value depends only
on some regular properties of the four words w1 , . . . , w4 .
That is, there are regular word languages L1, . . . , L4 such that
ϕ selects precisely those triples (x, y1 , y2 ) where w1 ∈
L1 , . . . , w4 ∈ L4 .
Lemma 5. We may assume without loss of generality that
ϕ is path based.
Proof: We first claim that for any ternary regular query
ϕ which selects only tuples in one arrangement, there is a
functional transducer f and a path based query γ such that
ϕ = γ ◦f , i.e. for a tree t the set ϕ(t) of tuples selected by ϕ
in t is the same as the set of tuples selected by γ in f (t). The
claim can be proved using logical methods (the transducer
computes MSO theories) or using automata methods (the
transducer computes state transformations).
To prove the lemma, we need to show that if Theorem 2
is true for the path based queries γ, then it is also true for
arbitrary ternary queries ϕ. But this is straightforward, as ϕ
and γ have the same witness functions in trees t and f (t),
respectively.
In the proofs later on we will use algebra, hence we prefer to define the query ϕ in algebraic terms. We assume a monoid morphism

α : Σ∗ → S

such that membership (x, y1, y2) ∈ ϕ(t) depends only on the values assigned by α to the words w1, . . . , w4. In other words, there is a set of accepting tuples F ⊆ S^4 such that ϕ[x, y1, y2] holds if and only if

(α(w1), . . . , α(w4)) ∈ F.
D. Composing guidance systems
In the sequel we will need to compose the guidance
systems as outlined in the lemma below. For two partial
functions f, g on the set of nodes of a tree, the composition
g ◦ f has the same domain as f , and is defined as follows:
(g ◦ f)(x) = g(f(x))   if g is defined on f(x),
(g ◦ f)(x) = f(x)      otherwise.
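With partial functions modeled as dicts, this composition is a one-liner (our own sketch):

```python
def compose(g, f):
    """The composition g ∘ f of partial functions: same domain as f, and
    on each x it applies g on top of f(x) when g is defined there,
    otherwise it keeps f(x)."""
    return {x: g.get(y, y) for x, y in f.items()}
```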
Lemma 6. Let f, g be partial functions on the set of nodes
of a tree, of guidance width m1 and m2 , respectively. Then
their composition g ◦f is of guidance width at most 2m1 m2 .
Proof: Fix a tree t together with some m1- and m2-color guidance systems Πf and Πg, inducing f and g, respectively. We will show the existence of a 2m1m2-color guidance system for g ◦ f.
As the first step, combine Πf and Πg as follows: a node
x is first guided by Πf , and then, if g is defined on f (x),
guided by Πg to its final destination. Formally, Π contains
those guides of Πf whose destination node is not in the
domain of g; and moreover a number of guides that are
composed of at least two guides, to be described now. Fix
a pair of colors (k, l), where k is a color used in Πf and
l is used in Πg . A composed guide, colored by the pair
(k, l), consists of one l-colored guide from Πg , say π, and
all those k-colored guides from Πf whose destination node
is a source node of π. We will focus on the latter ’composed’
guides only. (A ’non-composed’ guide in Π, say colored k,
may be safely considered as colored by (k, l), for any l.)
The above coloring, using m1m2 colors, is not yet satisfactory, as guides of the same color may be in conflict. We will show how to resolve these conflicts by introducing an additional distinguishing piece of data into the colors. Fix a color pair
(k, l) as above. Note that a conflict may only arise when the
Πg -part (l-colored in Πg ) of one (k, l)-colored guide, say
π1 , conflicts with the Πf -part (k-colored in Πf ) of another
same colored guide, say π2 . Consider an undirected graph
G, whose nodes are all (k, l)-colored guides; there is an
edge between π1 and π2 in the graph if the abovementioned
conflict arises.
We claim that the graph G is a forest, i.e., a disjoint union
of trees. Towards a contradiction, suppose that G has a cycle
consisting of n pairwise different guides π1 , . . . , πn . Take
πn+1 = π1 . Let x1 , . . . , xn denote arbitrarily chosen nodes
witnessing the conflicts, i.e., xi belongs to the guides πi
and πi+1 . In πi+1 , for any i ≤ n, there is a unique path
from xi to xi+1 (take xn+1 as x1 ), denote it pi ; pi always
uses a path of a guide from Πf , colored k, and a path of
a guide from Πg , colored l. As the k-colored guides never
conflict, and likewise the l-colored ones, the k-colored part
of pi is separated from the same colored part of pi+1 by
at least one l-colored edge; thus the paths pi are nonempty,
i.e., xi ≠ xi+1. Assume that x1, . . . , xn are pairwise distinct (if this is not the case, i.e., xi = xj, consider xi, . . . , xj−1 instead; and consider πi, . . . , πj−1 instead of π1, . . . , πn).
Now we are prepared to obtain a contradiction, thus
proving that G is a forest. If two paths pi and pi+1 share an
edge adjacent to xi+1 , the edge may be removed from both
paths; this clearly forces xi+1 to be moved appropriately.
Thus the paths can be made edge-disjoint; moreover we keep
the xi nodes pairwise distinct, argued as above. Hence the
paths p1 , . . . , pn form a cycle in the tree t, a contradiction.
Knowing that G is a forest, we may easily label its
nodes by two numbers 1, 2, level by level, starting from an
arbitrary leaf in any connected component. This additional
numbering, added to the colors of the guides in Π, eliminates
the problematic conflicts and makes Π a 2m1 m2 -color
guidance system as required.
E. Arrangement (A1)
In this section we show that Theorem 2 holds if all triples
selected by ϕ[x, y1 , y2 ] have arrangement (A1), pictured
below.
[Figure: arrangement (A1).]
Suppose τ [x, y] is a binary query, and σ[y, y1 , y2 ] is a
ternary query. We define the following ternary query
(τ ◦y σ)[x, y1, y2] = ∃y ( τ[x, y] ∧ σ[y, y1, y2] ).
Lemma 7. Let τ, σ be as above. If τ and σ have bounded width witness functions, then so does τ ◦y σ.
Proof: Consider the witness function for τ ◦y σ obtained as a composition of witness functions for τ and σ, and apply Lemma 6.
Lemma 8. We may assume without loss of generality that
x = gca(y1 , y2 ).
Proof: By the considerations in Section VII-C, we
know that a triple (x, y1 , y2 ) is selected by ϕ if and only
if the images, under the morphism α, of the four path
words w1, w2, w3, w4 belong to a designated set F ⊆ S^4
of accepting tuples.
Let s1 , . . . , s4 ∈ S. Let τs1 ,s2 be the binary query that
selects a pair (x, y) if
α(wordt (gca(x, y), x)) = s1
α(wordt (gca(x, y), y)) = s2 .
Likewise, let σs3 ,s4 be the ternary query that selects a triple
(y, y1 , y2 ) if
gca(y1 , y2 ) = y
α(wordt (y, y1 )) = s3
α(wordt (y, y2 )) = s4 .
The queries τs1 ,s2 and σs3 ,s4 can be joined to define ϕ, in
the following way.
ϕ =  ∪_{(s1, s2, s3, s4) ∈ F}  τs1,s2 ◦y σs3,s4.
By Lemma 7, we see that the width of witness functions for
ϕ is bounded by the widths of the witness functions for the
τ queries, which is bounded by Lemma 3, and the width of
the witness functions for the σ queries. The latter are queries
where the first variable is the gca of the second and third
variables, which concludes the proof of the lemma.
Thanks to the above lemma, we are left with a query ϕ
that selects triples in the arrangement pictured below (for
future reference let us call this arrangement trivial).
[Figure: the trivial arrangement, where x = gca(y1, y2).]
We will provide a 2-color guidance system that induces a
witness function for ϕ in (t, E). This is guaranteed by the
following lemma:
Lemma 9. Let ϕ be any (not necessarily regular) query that selects only triples in the trivial arrangement. Then ϕ has witness functions of guidance width 2.
Proof: The guidance system is constructed in a single
root-to-leaf pass.
More formally, for each set X of nodes that is closed
under ancestors, we will provide a guidance system ΠX that
directs each source in X to some witness, either y1 or y2 .
The guidance system will have the additional property that
no tree edge is traversed by two guides.
The guidance system is constructed by induction on the
size of X. The induction base, when X has no nodes, is
straightforward. We now show how ΠX should be modified when adding a single node x to X. Since all guides in ΠX originate in nodes from X, any guide that passes through x must also pass through its parent. Using the additional property, we conclude that at most one guide π from ΠX
passes through x. In particular, either the left or right subtree
of x has no guide passing through it. If x has no witness,
nothing needs to be done. If x has a witness (y1 , y2 ), then
we create a new guide that connects x to the node yi which
is in the child subtree of x without guides.
For arrangement (A1), the proof of Theorem 2 is thus completed.
F. Arrangements (A2) and (A3)
For the remaining arrangements, in this section we only show how they can be reduced to the simplified ones.
Formally, we provide here a link between the Main Lemma
and Theorem 2, missing in the body of the paper. We formulate Theorem 7 below, which we will use in this section,
and which follows easily from the tree version of Main
Lemma (forthcoming Lemma 10). To state the theorem,
recall the notion of consistent guidance system introduced
in Section IV: a guidance system in a graph tree is called consistent wrt. a given ternary query if each of its guides obeys the following uniqueness requirement: whenever a set Z ⊆ X of source nodes is guided to the same node y, then there is a pair (y1 , y2 ) that is a witness pair for all nodes in Z, with y1 = y or y2 = y. Below, the consistency
property will make it possible to combine two guidance
systems appropriately.
Theorem 7. Every simplified regular query has witness
functions of bounded guidance width. Furthermore, a consistent guidance system always exists (of the required bounded
width).
Now using Theorem 7 we prove Theorem 2 for arrangements (A2) and (A3). By symmetry, we only consider the
arrangement (A2). We simplify the arrangement in two steps.
First we claim that without loss of generality x can be
assumed to be an ancestor of y2 – this may be shown by
essentially the same technique as in Lemma 8, hence we omit
the details. Second, we show that y1 can be assumed to be
an ancestor of x and y2 . The arrangement (A2), as well as
its two successive simplifications, are pictured below.
[Figure: the arrangement (A2) and its two successive simplifications.]
Let our starting arrangement now be the middle one in
the picture above, i.e. we assume that the first simplification
has been already applied. Wlog we may assume that y1 is in the left subtree, and x in the right subtree, of the node gca(x, y1 ) (thus we again split into two sub-cases), and that neither x nor y1 equals gca(x, y1 ).
By the considerations in Section VII-C, we know that a
triple (x, y1 , y2 ) is selected by ϕ in t if and only if the
images, under the morphism α, of the three path words:
wordt (x, gca(x, y1 )), wordt (x, y2 ), wordt (gca(x, y1 ), y1 ),
belong to a designated set F ⊆ S 3 of accepting tuples.
Fix (s1 , s2 , s3 ) ∈ F (thus the guidance system will be
a disjoint union over all triples (s1 , s2 , s3 ) ∈ F ). Let
σs1 ,s2 [x, y, y2 ] be a simplified query that selects a triple
(x, y, y2 ) if
α(wordt (x, y)) = s1
α(wordt (x, y2 )) = s2
y < x ≤ y2
and x is in the right subtree of y. Consider an arbitrary
graph tree (t, E) over which ϕ is evaluated. The idea now
is that the query σs1 ,s2 is evaluated over a modified graph
tree (t, Es3 ), where a pair (y, y2 ) is in Es3 iff (y1 , y2 ) ∈ E
for some y1 such that
• y = gca(y1 , y2 ),
• y1 is in the left subtree of y,
• α(wordt (y, y1 )) = s3 .
(Intuitively, every edge (y1 , y2 ) ∈ E is ’moved’ to (y =
gca(y1 , y2 ), y2 ), but only if α(wordt (y, y1 )) = s3 .)
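The construction of the modified edge set can be sketched directly. The Python fragment below is our own illustration with hypothetical names: the morphism α is abstracted as an arbitrary function `alpha_path` on pairs of nodes, and `side` tells whether a node lies in the left subtree of its ancestor.

```python
def moved_edges(E, gca, side, alpha_path, s3):
    """E: set of class edges (y1, y2); gca(y1, y2): their greatest common
    ancestor; side(y, y1): 'L' if y1 lies in the left subtree of y;
    alpha_path(y, y1): image under the morphism of the path word from y
    to y1. Returns the modified edge set E_{s3}."""
    Es3 = set()
    for (y1, y2) in E:
        y = gca(y1, y2)
        # move the edge (y1, y2) to (y, y2), but only if y1 is on the
        # left of y and the path word from y to y1 maps to s3
        if side(y, y1) == 'L' and alpha_path(y, y1) == s3:
            Es3.add((y, y2))
    return Es3
```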
Let us now fix a graph tree (t, E) and a set X of source
nodes wrt. ϕ and E. By Theorem 7 we know that σs1 ,s2 has
a witness function induced by a consistent m-color guidance
system Π (m does not depend on t, E or X).
Let τs3 ,Π [y, y1 , y2 ] be a ternary query that selects a triple
(y, y1 , y2 ) in (t, E) if
• y = gca(y1 , y2 ), y1 is in the left subtree of y and y2
in the right one,
• y is one of the target nodes of Π,
• (y1 , y2 ) ∈ E is a witness pair for all nodes x that are
guided by Π to y (by consistency of Π, such a pair
(y1 , y2 ) exists for any target node y of Π),
• α(wordt (y, y1 )) = s3 .
Since the query τs3 ,Π depends on the guidance system Π, we implicitly assume that Π is included in the labeling of the tree. The query τs3 ,Π is not regular in general, but it selects
only triples in trivial arrangement. Applying Lemma 9 we
get a 2-color guidance system that induces a witness function
for τs3 ,Π . The two guidance systems can be then combined
into one 4m-color guidance system due to Lemma 6. This
guidance system induces a witness function for ϕ and X in
(t, E).
As the graph tree (t, E) was chosen arbitrarily, this completes the proof.
VIII. T HEOREM 2 FOR TREES
Fix a query ϕ[x, y1 , y2 ] that satisfies the simplifying
assumptions from Section III-D. As in the word case, we can assume that the query is represented by a morphism α : Σ∗ → S
in the following sense: ϕ only selects triples (x, y1 , y2 ) with
y1 ≤ x ≤ y2 and where the values
α(wordt (y1 , x)) ∈ S
α(wordt (x, y2 )) ∈ S    (6)
belong to an accepting set of pairs Acc ⊆ S^2 . We fix this morphism for the rest of this section.
Factors.: As in the word case, we distinguish two types
of edges in a graph tree (t, E). The tree edges are edges that
connect parents with children, as well as a dummy edge
going into the root of the tree and dummy edges going out
of the leaves. The class edges are the edges from E.
We will use a forward Ramseyan split for the morphism α,
as given by Theorem 3. The Ramseyan split is defined for
every path (root to leaf) in the tree. By the determinism
property, the splits agree on common prefixes of the paths, so
we can see the split as a labeling of the tree edges, including
the dummy edges. Again by determinism, all tree edges from
a node to its children have the same label in the split. Wlog
we consider only complete binary trees, where each non-leaf
node has precisely two children.
Two comparable (i.e., belonging to one path) tree edges
e and f are called visible if all tree edges between e and f
are mapped by the split to values strictly smaller than σ(e)
and σ(f ). Visible tree edges naturally determine a nested
factorization of t in the following way.
A pre-factor in a tree t is a connected set of nodes
(connected by tree edges) such that if a node x is in the
pre-factor, then either all sons of x are in the pre-factor,
or none of them. Each pre-factor of t has a root and some
leaves (maximal nodes wrt. ≤), and inherits its edges from
t. We distinguish internal edges of a pre-factor, connecting
two nodes in that pre-factor, and external edges connecting
the root or the leaves with their children outside the pre-factor. This includes the tree edge leading to the root of the
pre-factor (called the root edge of the pre-factor) and the tree
edges going out of the leaves (called the leaf edges of the
pre-factor). Note that external edges may be either proper
tree edges, or dummy edges. As the split σ is assumed to
be deterministic, all tree edges leaving a given leaf of a
pre-factor are assigned the same number. A pre-factor F is
called a factor in t if it respects the split σ in the following
way: the root edge is visible from each of the leaf edges.
This means that on each (shortest) path in a factor from
its root edge to a leaf edge, numbers assigned by σ to the
internal edges on that path are strictly smaller than those
assigned to the two external edges. By the height of a factor
we mean the greatest number assigned to an internal edge,
or 0 if no such edge exists (the case of a one-node pre-factor).
Additionally, the whole tree t is also a factor if, wlog, we assume that the root dummy edge is visible from all leaf dummy edges; its height is at most the height of σ.
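In code, the visibility condition along a single root-to-leaf path is a one-liner. This small Python sketch is our own illustration (the name `visible` is hypothetical): it checks whether the edges at positions i < j of a path are visible with respect to the split values on that path.

```python
def visible(split_values, i, j):
    """split_values: the split's numbers on the tree edges of one
    root-to-leaf path; the edges at positions i < j are visible if every
    value strictly between them is smaller than both endpoint values."""
    bound = min(split_values[i], split_values[j])
    return all(v < bound for v in split_values[i + 1:j])
```

For example, on a path labeled 4, 1, 2, 1, 4 the two outer edges are visible, while the two edges labeled 1 are not, since the value 2 lies between them.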
A subfactor of a factor F is any factor G ⊊ F that is maximal with respect to inclusion. By the definition
of factor, two different subfactors of F are disjoint (have
disjoint sets of nodes, but possibly share an external edge).
Hence each factor F is the disjoint union of its subfactors.
We say a subfactor G is an ancestor of a subfactor H if their
roots are so related. Likewise we talk about a subfactor being
a child or parent of some other subfactor.
A factor together with its decomposition into subfactors
is pictured below.
[Figure: a factor, with split values labeling its tree edges, decomposed into its subfactors.]
We are now ready to present the tree version of the Main
Lemma.
Lemma 10 (Main Lemma). Fix a factor height h. There is a
bound n ∈ N, depending only on ϕ and h, such that for every
graph tree (t, E), every factor F in t of height h, and every
set X ⊆ F of source nodes, there is a witness function for ϕ and X in (t, E) induced by a consistent guidance system using at most n colors. Furthermore, this witness function
only points to descendants of the root of F .
The proof of the lemma is by induction on the height h.
The number of colors n will depend on h and the size of
the monoid S recognizing the query. It will not depend on
t. When going from height h to height h + 1, there will be
a quadratic blowup in the number of colors. Therefore, n
will be doubly exponential in the height of F . (Note how
this differs from the word case, cf. Section IV.)
Since the witness function will be induced by a guidance
system, the last assumption in Lemma 10 could be restated
as saying that no guide passes through the root edge of F .
Theorem 7 is a special case of the Main Lemma when F is
the whole tree.
The base case, when the factor F has one or no nodes,
is easy (1 color, going downwards, is sufficient). For the
induction step, fix a factor F , and assume that there is a
bound n sufficient for any factor of smaller height than F ,
which includes all subfactors of F . Below, subfactors of F
are simply called subfactors, without explicitly referring to
F.
A tree edge of F that is an external edge of one of its subfactors is called a border edge. In particular each external
edge of F is a border edge. Special care will be paid in our
proof to internal (i.e. not external) border edges, i.e., the root
edges of all subfactors except the root edge of F itself.
Claim 3. If two internal border edges in a factor are comparable by the ancestor relation ≤, then they are assigned the same value by the split.
We do the same case distinction as in the word case,
regarding the number of internal border edges on the paths
from a source to witness nodes. For a node x ∈ X and a
witness (y1 , y2 ) we define two numbers m1 , m2 . Let m1 be
the number of internal border edges on the path between y1
and x, and let m2 be the number of internal border edges on
the path between x and y2 . For technical convenience, we deliberately choose not to count external border edges.
We divide the set X into three parts:
1) Nodes x ∈ X that have a witness with m2 ≤ 1.
2) Nodes x ∈ X that have a witness with m1 ≤ 1 and
m2 ≥ 2.
3) Nodes x ∈ X that have a witness with m1 , m2 ≥ 2.
We prove the Main Lemma for each of the three parts
separately. Next, we combine the three guidance systems
into a single guidance system. Our construction will yield
two kinds of guides: the ancestor guides, pointing to the first witness and thus going up the tree; and the descendant guides, pointing to the second witness and thus going down the
tree. Interestingly, ancestor guides will be only created in
case 2. All the guides will satisfy the consistency condition
required in Lemma 10.
Nodes x ∈ X that have a witness with m2 ≤ 1.: This
case works the same way as for words. Consider a subfactor
G of F . In this case, each node x ∈ X ∩ G has a descendant
witness y2 that is either in G, or in a child subfactor of G,
or perhaps outside F . Apply the induction assumption to
G, producing a guidance system ΠG with at most n colors.
Since the Main Lemma requires the guidance system to point
to descendants of the factor’s root, and m2 ≤ 1, we infer
that inside F the guides of ΠG can only intersect G and
its child subfactors, and no other subfactors (it is possible
that the guides leave the factor F , though). Therefore, all
the guidance systems ΠG can be combined into a single
guidance system with at most 2n colors, used alternatingly
for even and odd depths.
Nodes x ∈ X that have a witness with m1 ≤ 1 and
m2 ≥ 2.: Here we need a different argument for trees than
for words. In this case, for each node x ∈ X there is an
ancestor witness y1 that is either in the subfactor of x, in
the parent subfactor, or outside F . Note that the latter is
possible only when x belongs either to the root subfactor
of F , or to some of its child subfactors; denote this set of
subfactors by G0 . We will construct the guidance system in
a step-by-step manner, for all subfactors, according to the
ancestor ordering.
Formally speaking, consider a family G of subfactors
that is closed under ancestors and includes G0 . We provide
a guidance system ΠG of 4n^2 + 3n colors that provides
witnesses for all nodes of X belonging to the subfactors in
G. The construction of ΠG is by induction on the number
of subfactors in G.
The induction base is when G equals G0 . For each subfactor G ∈ G0 , we apply the Main Lemma, for the smaller height, to G and the nodes from X that belong to G, yielding an n-color guidance system. We combine these guidance systems into
ΠG0 as follows: use one set of n colors for the root subfactor,
and another set of n colors for all the child subfactors of
the root.
For the induction step, suppose that we have already constructed ΠG for G, and that G ∉ G is a subfactor whose parent is in G. Consider the guides of ΠG that pass through
the root edge of G. We apply two distinctions to these
guides. First, we use the name parent guides for the guides that originate in the parent subfactor of G, and the name
far guides for the other guides. Second, we use the name
ending guides for the guides whose target is in G and the
name transit guides for the other guides, which continue into
a child subfactor of G, or even exit F . Altogether, there are
four possibilities: parent transit guides, far ending guides,
etc. We assume additionally that there are at most n parent
guides and at most 2n far and parent guides altogether, and
hence at most 2n guides entering G. This additional invariant
is satisfied by the induction base, and it will be preserved
through the construction.
Apply now the induction assumption of the Main Lemma
to G and nodes from X that belong to G, yielding a guidance
system Π with n fresh colors. We use the name starting
guides for the guides of Π. We want to combine ΠG with Π
in such a way that the resulting guidance system still uses at
most 4n^2 + 3n colors, like ΠG , and satisfies the additional
invariant. If we were to simply take the two systems together,
we might end up with an external leaf edge of G which is
traversed both by starting and transit guides, which could
exceed the bound 2n on guides passing through external
edges.
We solve this problem as follows. Consider the external
leaf edges of G that are traversed by the far transit guides.
There are at most 2n such edges by our invariant assumption.
We will remove all starting guides that pass any of these
edges, and find other witnesses for nodes that use these
starting guides. This guarantees that the invariant condition
is recovered: at most 2n guides pass through any external
leaf edge of G, and at most n of them are starting guides.
These other witnesses will be ancestors. This explains why
the induction starts with G containing the root subfactor and
its children, since these are the subfactors that may have
ancestor witnesses outside the whole factor F (recall that
passing through the root edge is not counted in m1 ). The
statement of Main Lemma does not allow guides that pass
through the root edge of F .
The removal of starting guides proceeds as follows. Let e
be an external leaf edge of G that is traversed by a far transit
guide, which has color j in the guidance system ΠG . Let π
be a starting guide, which has color i in Π, that also traverses
e, with y2 its target node. By the consistency property of π,
there is some y1 such that (y1 , y2 ) ∈ E is a witness pair for
all source nodes of π. Note that by assumption on m1 ≤ 1,
the node y1 is either in the subfactor G or its parent. We
create an ancestor guide with a fresh color that connects
all the source nodes of π to y1 . The color of this guide,
which we call an ancestor color, will take into account three
parameters: the colors i and j, as well as a parity bit b ∈
{0, 1}. The parity bit is 0 if and only if G has an even
number of ancestor subfactors. We use the triple (i, j, b) for
the color name.
We will show that this new ancestor guide does not
conflict with any other ancestor guide with the same color.
Each new ancestor guide is contained in G and possibly
its parent subfactor H, by assumption on m1 ≤ 1. Inside
the subfactor G there is at most one ancestor guide of
each color, so there are no collisions inside G. One could
imagine, though, that a new ancestor guide π with color (i, j, b) collides inside the parent subfactor H with some other ancestor guide π′ of the same color. Since the colors of π and π′ agree on the parity bit b, we conclude that the guide π′ cannot originate in H, which has a different parity than G. Therefore, π′ must originate in some other child subfactor of H, call it G′, that had been previously added to G. Since the color of π′ is also (i, j, b), we conclude that j was the color of a far transit guide in G′. This is impossible, since a far transit guide in G′ or G must originate not in H, but in an ancestor of H, and therefore there would be a collision in the root of H.
The ancestor guides created above are the only ancestor
guides in our solution. In the subfactors where they are used,
the ancestor guides have the target in the parent subfactor.
Let us count the number of colors used. We need 2n colors
for the transit guides, and n colors for the starting guides.
For the ancestor guides, we need 4n^2 colors. Altogether, we need 4n^2 + 3n colors. Note that all the guides satisfy the
consistency condition required by Lemma 10.
Nodes x ∈ X that have a witness with m1 , m2 ≥ 2.:
In this case, each node x ∈ X has a witness (y1 , y2 ) such
that the path from y1 to x, as well as the path from x to y2 ,
passes through at least two internal border edges. This case
is the only one where we use the forward Ramseyan split.
Consider a source x with a witness (y1 , y2 ). The internal
border edges split naturally word(y1 , x) and word(x, y2 )
into m1 +1 and m2 +1 words, respectively:
word(y1 , x) = v0 · v1 · . . . · vm1
word(x, y2 ) = w0 · w1 · . . . · wm2 .
The first letter of v0 is the label of y1 . The last letter of
vm1 and also the first letter of w0 is the label of x. The last
letter of wm2 is the label of y2 (recall that we do not count
an external border edge, so y1 or y2 might be outside F ).
Furthermore, each two consecutive internal border edges are
not only visible, but also neighbouring, by our last claim.
Hence, as we have a forward Ramseyan split (cf. (4) in
Section IV), the values α(word(y1 , x)) and α(word(x, y2 ))
are determined by the first two parts and the last part:
(i) α(word(y1 , x)) = α(v0 ) · α(v1 ) · α(vm1 )
(ii) α(word(x, y2 )) = α(w0 ) · α(w1 ) · α(wm2 ).    (7)
Let us fix six values s1 , . . . , s6 ∈ S. By splitting the set X into at most |S|^6 parts, each requiring a separate guidance system, we can assume that each x ∈ X has a witness where
s1 = α(v0 )
s2 = α(v1 )
s3 = α(vm1 )
s4 = α(w0 )
s5 = α(w1 )
s6 = α(wm2 ).
We will only consider witnesses that satisfy the assumptions
above.
We now proceed to create the guidance system. As in the
case m1 ≤ 1, the guidance system will be defined for a
family G of subfactors that is closed under ancestors. The
guidance system will use at most 3 colors, and will have
the following additional invariant property: if e is an edge
that connects a subfactor G with a child subfactor H, then
at most two guides pass through e. Furthermore, if exactly
two guides pass through e, then one of the guides has its
target in H.
The construction is by induction on the number of subfactors in G. The induction base, when G has no subfactors, is obvious. Below we show how to modify a guidance system ΠG for G when adding a new subfactor G.
Consider the (at most two) guides of ΠG that pass through the root edge of G, which connects G to its parent subfactor. As in the case m1 ≤ 1, we
use the term transit guide for the guides of ΠG that enter
G through its root and exit through one of its external leaf
edges. By the invariant assumption, there is at most one
transit guide.
We now define a guidance system Π for the nodes in G,
which we will next combine with ΠG .
Claim 4. There is a one-color guidance system Π defining
a witness function for all nodes in G ∩ X.
Proof: For each node x ∈ G ∩ X, choose the lexicographically first witness y2 > x that satisfies the assumptions
on the six images in the semigroup, call it yx . Let Y be all
these witnesses yx ; this set is an antichain with respect to
the descendant relation. For each y ∈ Y , let Xy be the chain
of nodes x which are witnessed by y. By the lexicographic
assumption, if y, y′ ∈ Y are such that y is lexicographically before y′ , then no element from Xy has an ancestor in Xy′ . Consequently, if we define πy to be the guide that connects all Xy to y, then Π = {πy }y∈Y is a one-color guidance system for all nodes in G ∩ X.
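The choice made in the proof of Claim 4 can be sketched as follows. This is our own illustration with hypothetical names; the lexicographic order is represented by a DFS numbering of the nodes.

```python
def one_color_guidance(candidates, lex):
    """candidates: x -> non-empty list of admissible witnesses y2 for x;
    lex: node -> its lexicographic (DFS) number. Each source is sent to
    its lexicographically first witness; sources sharing a witness are
    grouped into a single guide."""
    target = {x: min(ys, key=lex) for x, ys in candidates.items()}
    guides = {}
    for x, y in target.items():
        guides.setdefault(y, []).append(x)
    return guides
```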
The guides of Π we call starting guides as usual.
We now need to combine Π and ΠG . If we simply
combine ΠG and Π, we might end up with a starting
guide going through an external edge of G that is already
traversed by two transit guides. To avoid this problem, we
need to do an optimisation relying on a simple observation
formulated in the claim below.
A descendant guide π is called live in a subfactor G if π
passes through G, and its target is not in a child subfactor of
G (i.e., the target is in a proper descendant of some child of
G). The idea is that the target of π satisfies the assumption
m2 ≥ 2 from the 'point of view' of nodes in G. Note that guides that are live in G may be either transit or starting in G.
Claim 5. Suppose that two consistent descendant guides π and π′ are live in a subfactor G and exit G through the same edge. Suppose also that at least one of them is starting in G. Then all source nodes of one of π, π′ can be moved to the other.
Proof: Let (y1 , y2 ) be the witness pair corresponding to π and let (y1′ , y2′ ) be the witness pair corresponding to π′ ; by consistency, not only the second witnesses y2 and y2′ are determined by π and π′ , but the whole witness pairs. Note that the source nodes of a descendant guide in G are all situated on one path from the root of G to one of the leaves. If both π and π′ are starting in G, assume wlog that π has a source node that is an ancestor of all source nodes of π′ ; otherwise one of the guides is starting in G, wlog assume it is π′ . As only the case of m1 , m2 ≥ 2 is considered, and the values s1 , . . . , s6 are fixed, due to equation (7)(i) the pair (y1 , y2 ) is a witness for all source nodes of π′ as well. Thus, these nodes may be guided to y2 instead of y2′ .
We use the term live transit guide for the transit guides
that are live in G, and dead transit guide for the other transit
guides (those that have their target in a child subfactor of
G).
Consider an edge e that connects G with a child subfactor
H. Suppose first that e is traversed by a starting guide and
a live transit guide. Using the claim we merge the starting
guide with the transit one. Therefore, we end up satisfying
the invariant property: e is passed by at most one live guide,
and possibly by one dead transit guide.
IX. A PPLICATIONS
This section contains the proofs of Theorem 4 and 5.
Proof of Theorem 4
We now prove Theorem 4, which says that emptiness
is decidable for class automata on bipartite data trees. For
simplicity, we consider a more restricted definition: we allow
only data words, each class has exactly two elements: one
in the first half, and the other in the second half. Stated
differently, a bipartite data word is defined as a data word
(w, ∼) with an even number 2n of positions, where every
class has two positions: one in {1, . . . , n} and the other
in {n + 1, . . . , 2n}. A similar proof works for the general
definition.
Consider a class automaton A where the transducer is
f : Σ∗ → Γ∗ and the class condition is a language L over
alphabet Γ × {0, 1}. In a bipartite data word each class has
size exactly two, so we can model the class condition as a
binary query ϕ with input alphabet Γ, which selects a pair
x < y in a word w if and only if w ⊗ {x, y} belongs to L.
We first state a simple lemma. Suppose w is a word and
X a set of positions (usually a prefix or suffix). Then w|X
is the substring of w which contains only the labels from
the positions in X.
Lemma 11. Let ϕ be a regular binary query with input alphabet Σ. There is a morphism α : (Σ × {0, 1})∗ → S and a subset F ⊆ S^2 such that for any positions x ≤ z < y in a word w ∈ Σ∗ , ϕ(w) contains (x, y) if and only if

(α(w ⊗ {x}|{1, . . . , z}), α(w ⊗ {y}|{z + 1, . . . , |w|})) ∈ F.

Coming back to the proof of Theorem 4, we will prove a stronger result, namely that there exists an automaton with Presburger conditions that accepts exactly the data erasure of L(A), i.e., the set K ⊆ Σ∗ of words w such that for some ∼, the pair (w, ∼) is a bipartite data word accepted by A. The result follows, since emptiness is decidable for automata with Presburger conditions.

The language K is the inverse image, under the transducer f , of the set M ⊆ Γ∗ of words that satisfy:

There exists an equivalence relation ∼ such that (w, ∼) is a bipartite data word where each class x < y satisfies (x, y) ∈ ϕ(w).

Since automata with Presburger conditions are closed under inverse images of transducers, it suffices to show that M is recognized by an automaton with Presburger conditions.

Apply Lemma 11, obtaining a morphism α and a set F . We claim that membership of a word w of length 2n in the language M depends only on the following 2|S| numbers, which we call the footprint of w:

i_s = |{x : x ≤ n and α(w ⊗ {x}|{1, . . . , n}) = s}|
j_s = |{y : y > n and α(w ⊗ {y}|{n + 1, . . . , 2n}) = s}|.

More specifically, w belongs to M if and only if the footprint satisfies the following semilinear property Q ⊆ N^{2|S|}: there exist numbers {k_{s,t}}_{(s,t)∈F} such that for each s ∈ S

Σ_{t : (s,t)∈F} k_{s,t} = i_s        Σ_{t : (t,s)∈F} k_{t,s} = j_s.

This completes the proof of Theorem 4, since an automaton with Presburger conditions can compute the footprint and test if it satisfies Q.
Proof of Theorem 5
Towards a contradiction, suppose that ψ is recognized by a class automaton in the generalized version for 2 data values. The document that will confuse the automaton will be a word. Consider the document (over a one letter alphabet) with positions 1, . . . , 2n, where the data values are defined by

d1 (i) = d1 (n + i) = i     for i ∈ {1, . . . , n}
d2 (i) = d2 (n + i) = −i    for i ∈ {1, . . . , n}.

Since the above document satisfies ψ, it should also be accepted by the automaton. Let the work alphabet of the automaton be Γ, and let a1 · · · a2n ∈ Γ∗ be the word produced by the automaton in the accepting run. For a data value d, we use the term class string of d for the word

a1 · · · a2n ⊗ d1^{−1}(d) ⊗ d2^{−1}(d).

By definition of the way class automata accept documents, each class string should belong to the class condition. Consider a number i ∈ {1, . . . , n}. The class string of i is
ui = a1 · · · a2n ⊗ {i, n + i} ⊗ ∅.
Suppose that
α : (Γ × {0, 1} × {0, 1})∗ → S
is a morphism recognizing the class condition. If n is greater
than |S|^2 , then we can find two data values i < j ∈
{1, . . . , n} such that
α(ui |{1, . . . , n}) = α(uj |{1, . . . , n})
α(ui |{n + 1, . . . , 2n}) = α(uj |{n + 1, . . . , 2n}).
Consider a new document obtained from the previous one by
swapping the first, but not second, data value in the positions
i and j. This new document violates the property ψ. To get
the contradiction, we will construct an accepting run of the
class automaton for this new document. The output of the
transducer is the same a1 · · · a2n . The class strings are the
same for data values other than i and j, so they are also
accepted. For the class strings of i and j, the images under
α are the same by assumption on i and j, and hence they
are also accepted.
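The pigeonhole step of this argument is easy to make concrete. The sketch below is our own illustration (with hypothetical names): the morphism is abstracted as an arbitrary function `image_pair` into a finite set of pairs, and the function finds two data values i < j whose class strings have the same pair of images, as is guaranteed once n exceeds the number of distinct pairs.

```python
def find_equal_pair(n, image_pair):
    """image_pair: i -> (image of the first half, image of the second
    half) of the class string u_i. Returns some i < j with
    image_pair(i) == image_pair(j), or None if no such pair exists."""
    seen = {}
    for i in range(1, n + 1):
        key = image_pair(i)
        if key in seen:
            return seen[key], i   # pigeonhole: two values collide
        seen[key] = i
    return None
```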
X. T HE CLASSIFICATION THEOREM
We now prove Theorem 6. Most of the proof is standard,
and uses known results from [5] and [3]. There are two
original contributions: a) we show that data automata are
the same as class automata with a local class condition; b)
we show that alternating one register automata are captured
by class automata with tail class conditions.
A. A0 versus C0
We begin by considering the general case, which involves
the powerful but undecidable automata: counter automata
and class automata.
Proposition 2. The following language classes are equal.
1) Languages of words without data recognized by
counter automata.
2) Projections of languages of data words recognized by
class automata.
Proof sketch: We begin with the inclusion of the first class
in the second class. Consider a counter automaton with the
set of transitions ∆. Let L be a language over alphabet ∆
that contains all data words (w, ∼) such that the following
condition holds.
Let x be a position with label c := c + 1. Then
there is exactly one other position y in the same class; this position is after x and has label c := c − 1. Furthermore, no position between x and y
is labeled by the test c = 0.
The language L is recognized by a class automaton. It
is not difficult to see that after erasing the data values
from L, we get exactly the accepting runs of the counter
automaton. Consequently, the language accepted by the
counter automaton is obtained from L by erasing data values
and projecting each transition onto the letter it reads.
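The condition defining L is simple enough to check directly on a data word. The following Python sketch is our own illustration (the labels 'inc', 'dec' and 'zero?' stand for the transitions c := c + 1, c := c − 1 and the test c = 0):

```python
from collections import defaultdict

def valid_counter_encoding(labels, classes):
    """labels: 'inc', 'dec' or 'zero?' per position; classes: the class
    (data value) of each position. Checks the condition defining L."""
    members = defaultdict(list)
    for pos, c in enumerate(classes):
        members[c].append(pos)
    for pos, lab in enumerate(labels):
        if lab != 'inc':
            continue
        others = [q for q in members[classes[pos]] if q != pos]
        # exactly one other position in the class, after pos, labeled 'dec'
        if len(others) != 1 or others[0] < pos or labels[others[0]] != 'dec':
            return False
        # no zero test strictly between the increment and its decrement
        if any(labels[z] == 'zero?' for z in range(pos + 1, others[0])):
            return False
    return True
```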
The opposite inclusion, of the second class in the first
class, is proved the same way as when translating a data
automaton into a counter automaton without zero tests in [3].
The zero tests are used to capture the additional power of
class automata.
B. A1 versus C1
We move to the first of the three decidable restrictions:
counter automata without zero tests, and class automata with
local class conditions. Case i = 1 of Theorem 6 follows now
from the two results below, which use data automata [3].
Before we define data automata, we state the two results.
Proposition 3. The following language classes are equal.
1) Languages of words without data recognized by
counter automata without zero tests.
2) Projections of languages of data words recognized by
data automata.
Proposition 4. Data automata and class automata with
local class conditions recognize the same languages of data
words.
Proposition 3 was shown in [3]. We now define a data
automaton and prove Proposition 4.
A data automaton is given by a transducer and a class condition. The transducer is the same as in a class automaton: a letter-to-letter transducer f : Σ∗ → Γ∗ from the input alphabet Σ to the work alphabet Γ. The difference is in
the class condition: in a data automaton the class condition
is a language L ⊆ Γ∗ over the work alphabet Γ, and not
over Γ × {0, 1}. A data word (w, ∼) is accepted if there is
an output v ∈ f (w) of the transducer such that for every
class X, the string v|X belongs to L.
Clearly a data automaton is a special case of a class
automaton with a local class condition. The interest of
Proposition 4 is that each automaton with a local class
condition is equivalent to a data automaton. We show this
below.
Consider a class automaton, with transducer f and a local
class condition L. Let X be a set of positions. The X-successor of x is the first position after x in X. This position
may be undefined, written as ⊥. For n ∈ N, the n-profile of
a position x with respect to a set X is a symbol in
Πn = {⊥, +1, . . . , +n, mod 1, . . . , mod n}
defined as follows. If x has no X-successor, the symbol is ⊥. Otherwise, let d be the distance between x and its X-successor; if d is at most n, then the symbol is +d; otherwise the symbol is mod d′ , where d′ ∈ {1, . . . , n} and d′ ≡ d mod n.
The n-profile of data word (w, ∼) is defined to be the
function that maps each position of w to its n-profile with
respect to its own class. The n-profile can be treated as a
word v ∈ Π∗n of the same length as w.
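As a concrete illustration (our own sketch, not from the paper), the n-profile of a data word can be computed position by position:

```python
def n_profile(classes, n):
    """classes: the class of each position of the data word.
    Returns the n-profile as a list of symbols from Πn."""
    length = len(classes)
    profile = []
    for x in range(length):
        # the X-successor of x, where X is the class of x
        succ = next((y for y in range(x + 1, length)
                     if classes[y] == classes[x]), None)
        if succ is None:
            profile.append('⊥')
            continue
        d = succ - x
        if d <= n:
            profile.append('+%d' % d)
        else:
            # representative d' in {1, ..., n} with d' ≡ d mod n
            profile.append('mod %d' % (((d - 1) % n) + 1))
    return profile
```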
Proposition 4 follows from the following two lemmas, and
closure of data automata under relabelings and intersections.
The second lemma follows from [12].
Lemma 12. Let L ⊆ (Γ × {0, 1})∗ be a local language. There is some n ∈ N and a regular language K ⊆ (Γ × Πn )∗ such that for any data word (w, ∼) over Γ with n-profile v:

w ⊗ X ∈ L iff (w ⊗ v)|X ∈ K,

for any class X.
Proof sketch: Let A be a deterministic automaton recognizing
L. Since L is local, for any two a, b ∈ Γ, the transitions of
A over (a, 0) and (b, 0) may be assumed to be the same.
Intuitively, only the number of consecutive positions outside
of X matters. Furthermore, as A is finite, this number can
only be measured up to some finite threshold, and then
modulo some number. This information is stored in the
profile.
Lemma 13. For any n ∈ N there is a data automaton, with
input alphabet Σ×Πn , which accepts a data word (w⊗v, ∼)
if and only if v is the n-profile of (w, ∼).
A2 versus C2: The relationship between Presburger
automata and commutatively local class automata, analogous to Propositions 2 and 3, follows from results of [4]
specialized to data words.
C. A3 versus C3
We now move to the last of the three decidable restrictions: gainy counter automata, and class automata with tail
class conditions. Case i = 3 of Theorem 6 will be a consequence of Propositions 5 and 6 below. These propositions
talk about alternating one register automata, which we define
and discuss in more detail shortly.
Proposition 5 ( [5]). The following language classes are
equal.
1) Languages of words without data recognized by gainy
counter automata.
2) Projections of languages of data words recognized by
alternating one register automata.
Suppose π : Σ → Γ. If K is a set of data trees over Σ, we
write π(K) for the set of data trees over Γ obtained from K
by keeping the data and replacing the labels using π. Any
such π(K) we call a relabeling of K. Note that unlike with
projections, we do not erase any positions.
Proposition 6. The following language classes are equal:
• Languages of data trees recognized by class automata
with a tail class condition.
• Relabelings of languages of data trees recognized by
alternating one register automata.
The two propositions give case i = 3 of Theorem 6,
since the composition of a relabeling and a projection is
a projection.
Now our main goal is to prove Proposition 6. We will
prove this proposition for the more general case of data trees.
(Under the natural generalization of tail class conditions for
trees, where instead of a suffix we talk about a subtree.) We
will talk about binary data trees, although the same proofs
would work for unranked data trees, using the first child /
next sibling encoding.
We begin by defining an alternating one register automaton on data trees. When talking about register automata, it
is convenient to represent a data tree in a different way.
Instead of the equivalence relation ∼ in (t, ∼) in a data
tree over Σ, we use a labeling µ of the nodes of t with
data values from some fixed domain of data values, call it
D. We present this as a single tree t ⊗ µ over the alphabet
Σ × D. The presentation using µ is more convenient for
the automaton with a register, since it makes clear what is
loaded into the register: an element of D. A data tree in the
representation t ⊗ µ corresponds to exactly one data tree in
the representation (t, ∼); but a data tree in the representation
(t, ∼) corresponds to many data trees in the representation
t ⊗ µ.
Alternating one register automata
The ingredients of an alternating one register automaton
are a state space Q, which is partitioned into states owned
by ∃ and states owned by ∀; an initial state qI ; an input
alphabet and a set of transitions. Each transition is a triple
(q, o, p), where q, p are states, and o is one of the operations:
“test if the current label is a”, “test if current class is in the
register”, “load class into the register”, “move to the left
child”, or “move to the right child”. A configuration of the
automaton in a data tree is a triple consisting of a state, a
tree node and a data value, called the content of the register.
The automaton accepts a data tree if player ∃ has a strategy
to win a certain game, played by two players ∃ and ∀, to be
described below.
The game begins in the initial configuration, consisting
of the initial state, the root node, and the data value of the
root node. In a configuration (q, x, d), the player who owns
state q chooses the next configuration (p, y, e) in such a
way that for some transition (q, o, p) the two configurations
are related by the →o relation defined in Table II. Player ∃
wins the game if a configuration owned by ∀ is reached from which no move can be made; otherwise ∃ loses (this includes
infinite plays, which can happen if the automaton stays in a
node forever). Note that due to alternation we don’t need to
distinguish accepting states. Likewise, we don’t need testing
whether the current class is not in the register, as this can be
simulated by the opponent’s test whether the current class
is in the register.
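On a concrete finite data tree, the winning set of this game can be computed as a least fixpoint: player ∃ must force a dead end of ∀ in finitely many steps, so infinite plays are excluded automatically. Below is a minimal Python sketch; the encodings of the tree, the transitions, and the ownership map are our own illustrative choices, and register contents are restricted to data values occurring in the tree, which is sound since the register starts at the root's value and is only ever loaded with values of nodes:

```python
def winning_configurations(states, owner, transitions, tree):
    """Least-fixpoint computation of the configurations of an alternating
    one register automaton from which player E wins the acceptance game
    on a finite data tree.

    tree maps a node id to (label, data_value, left, right), children
    being None when absent.  transitions is a list of triples (q, op, p)
    with op one of ('label', a), ('test',), ('load',), ('left',),
    ('right',).  owner maps a state to 'E' or 'A'."""
    nodes = list(tree)
    data = {tree[x][1] for x in nodes}
    configs = [(q, x, d) for q in states for x in nodes for d in data]

    def successors(conf):
        q, x, d = conf
        label, mu, left, right = tree[x]
        out = []
        for (q1, op, p) in transitions:
            if q1 != q:
                continue
            if op[0] == 'label' and label == op[1]:
                out.append((p, x, d))          # test the label, stay put
            elif op[0] == 'test' and d == mu:
                out.append((p, x, d))          # test class against register
            elif op[0] == 'load':
                out.append((p, x, mu))         # load the current class
            elif op[0] == 'left' and left is not None:
                out.append((p, left, d))
            elif op[0] == 'right' and right is not None:
                out.append((p, right, d))
        return out

    win = set()
    changed = True
    while changed:
        changed = False
        for c in configs:
            if c in win:
                continue
            succ = successors(c)
            if owner[c[0]] == 'A':
                # A dead end of the opponent is winning; all() is then True.
                good = all(s in win for s in succ)
            else:
                good = any(s in win for s in succ)
            if good:
                win.add(c)
                changed = True
    return win
```

The automaton accepts the tree exactly when the initial configuration (initial state, root, data value of the root) lands in the computed set.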
Tail class automata
The class tail of a node x, belonging to a class X, in a
data tree (t, ∼) is the subtree of t ⊗ X rooted in x. A tail
class automaton with input alphabet Σ and work alphabet Γ
is given by a letter-to-letter transducer f with input alphabet
Σ and output alphabet Γ, and a regular language over Γ ×
{0, 1}, called the tail condition. So far, the definition is the
same as for normal class automata. The difference is in the
semantics: a tail class automaton accepts a data tree (t, ∼) if
there is an output s ∈ f (t) such that all class tails in (s, ∼)
are accepted by the tail condition. (In class automata, for
each class X, s ⊗ X needs to be accepted.)
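A sketch of this acceptance condition, under our own encoding of binary trees as dictionaries and of the tail condition as a predicate on marked subtrees:

```python
def class_tails(tree, classes):
    """Enumerate the class tails of a data tree: for each node x with
    class X, the subtree rooted at x with every node marked 1 or 0
    according to membership in X.  tree maps a node id to
    (label, left_child, right_child); classes maps a node id to its
    class.  Both encodings are ours, chosen for brevity."""
    def subtree(x, X):
        if x is None:
            return None
        label, left, right = tree[x]
        mark = 1 if classes[x] == X else 0
        return ((label, mark), subtree(left, X), subtree(right, X))
    return [subtree(x, classes[x]) for x in tree]

def tail_automaton_accepts_output(tree, classes, tail_condition):
    """The tail condition (a predicate standing in for the regular tree
    language over Gamma x {0,1}) must accept every class tail."""
    return all(tail_condition(s) for s in class_tails(tree, classes))
```

As the comment on the marking indicates, the root of each class tail is always marked 1, since a node belongs to its own class.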
It is not difficult to see that tail class automata are
equivalent to class automata with a tail class condition.
Hence it suffices to prove the version of Proposition 6 where
“tail automata” is substituted for “class automata with a tail
class condition”.
Proof of Proposition 6
The inclusion of the first class in the second class in the
statement of Proposition 6 is straightforward. Suppose
that we have a tail automaton recognizing a set L of data
trees, given by a transducer f from Σ to Γ, and a tail
condition M . Consider the set K of data trees

t ⊗ s ⊗ µ,   t over Σ, s over Γ,

where: a) s ∈ f (t); and b) for every node x, the class tail of x in (s, ∼) belongs to M . Clearly, the language L recognized by the tail automaton is obtained from K by the relabeling
which erases Γ. On the other hand, K is recognized by an
alternating one register automaton. Condition a) is regular,
and condition b) is tested by universally branching over all
nodes x, putting node x into the register, and then inspecting
the subtree.
The inclusion of the second class in the first class in the
statement of Proposition 6 is more difficult. This inclusion
follows from the following theorem, as languages recognized
by tail class automata are closed under relabeling.
Theorem 8. For every alternating one register automaton
one can compute an equivalent tail class automaton.
The rest of this section is devoted to showing the above
theorem. Fix an alternating one register automaton A. We
will show that there is a tail class automaton equivalent to A.
Table II
Definition of →o given a data tree t ⊗ µ.

operation o                               | condition on (q, x, d) →o (p, y, e)
test if current label is a                | x = y, d = e, t(x) = a
test if current class is in the register  | x = y, d = e = µ(x)
load current class into the register      | x = y, e = µ(x)
move to the left/right child              | y is the left/right child of x, d = e
Recall the acceptance game described in the definition
of alternating one register automata, which we denote by
G. This game depends on the automaton A and on the
data tree t ⊗ µ, which we hope will be clear from the
context. Below, we will also consider games with initial
configurations different from the one described above.
A configuration of A in a data tree is called local if its
register value is the class of the current position. After performing a “load...” transition, the resulting configuration is
always local. Suppose that Γ is a set of local¹ configurations
in a data tree. We define the game GΓ in the same way
as the game G in the definition of alternating one register
automata, with the difference that when a local configuration
is seen, different from the initial configuration of the game,
the play is ended. Player ∃ wins if the configuration is in Γ
and player ∀ wins otherwise.
Lemma 14. In a data tree, the following conditions are equivalent for any local configuration (q, x, d).
1) Player ∃ wins the game G from a local configuration
(q, x, d).
2) For some set of local configurations Γ containing
(q, x, d), player ∃ wins the game GΓ from every
configuration in Γ.
Proof: For the bottom up implication, we create a
winning strategy in G by composing the strategies winning
in the game GΓ from the positions in Γ. For the top down
implication, we choose Γ to be the set of all local configurations where ∃ can win the game. The second implication
works because G and GΓ are positionally determined.
In Lemma 15 below, we will show that a tail class
automaton can test if a set of local configurations is winning
for player ∃, in the sense of the above lemma. First, we
say how a set of local configurations is coded in the input
of a tail class automaton. Suppose that Γ is a set of local
configurations in a data tree t ⊗ µ. We write t ⊗ Γ for the
tree with the same nodes as t, but where the label of each
node x is enriched by the set of states q such that Γ contains
the unique local configuration with state q and node x. (The
alphabet of t ⊗ Γ is Σ × P (Q) instead of Σ.)
Lemma 15. There is a tail class automaton E such that for any data tree t ⊗ µ and set of local configurations Γ, the automaton E accepts t ⊗ Γ ⊗ µ if and only if player ∃ wins the game GΓ from every configuration in Γ.

¹The definition of GΓ works for any kind of configurations, but we only use local ones.
Proof: Fix t ⊗ µ and a set Γ of local configurations.
What do we need to know about t ⊗ µ to see if player ∃
wins GΓ from the unique local configuration in state q and
node x?
The crucial observation is that as far as playing the game
GΓ is concerned, we only need to know which positions
are in the class of x, call it X, and we do not care
about the precise data values of other positions. This is
because the game GΓ is terminated when the register value is
changed. Using this observation, we can write an alternating
automaton B (on trees without data) such that for any data
tree t ⊗ µ, any set Γ of local configurations, and any class X with a node x ∈ X, B accepts the subtree of t ⊗ Γ ⊗ X
rooted in x if and only if player ∃ wins the game GΓ from
all the local configurations in node x that are in Γ. Using B
for the tail condition, together with the identity transducer,
we get a tail class automaton as required.
Lemmas 14 and 15 give Theorem 8. The tail class
automaton that is equivalent to A works as follows: it uses
the transducer to guess a set of local configurations Γ that
contains the initial configuration (the initial configuration
is necessarily local). Then it uses the automaton E from
Lemma 15 to test if player ∃ can win the game GΓ from
every configuration in Γ. By Lemma 14, the latter condition
is equivalent to player ∃ winning the game G from the initial
configuration, which is the same as A accepting the tree.
XI. D ECIDING EMPTINESS OF TAIL CLASS AUTOMATA
We now prove that emptiness is decidable for tail class
automata.
Theorem 9. Emptiness is decidable for tail class automata.
Note that this theorem follows from the previous results.
To decide emptiness of the language L of a tail class
automaton, do the following steps. First, apply Proposition 6
to compute a relabeling π and a language K recognized by
an alternating one register automaton, such that L = π(K).
Next, apply Proposition 5 (more precisely, its tree variant
from [6]) to compute a gainy counter automaton (for trees)
whose emptiness is equivalent to emptiness of K. Finally,
use results on gainy counter automata for trees from [6] to
decide emptiness of the gainy counter automaton for trees.
We give the proof below aiming at a self-contained presentation: we do not use any external results to show that emptiness is decidable for alternating one register automata, or for tail class automata. We would like to underline, however, that none of the ideas here are substantially
different from the ones in the original papers on alternating
one register automata.
For the rest of this section fix a tail class automaton T .
Let f be the transducer, and let A be a tree automaton
recognizing the tail condition. Let P and Q be the state
spaces of f and A, respectively.
A. Profiles
A profile is a pair consisting of a transducer state p ∈ P
and an obligation set S, which is a finite subset of Q ×
D. The rough idea is that a profile represents a run of T at the moment it enters a node x from its parent. The
transducer state represents the state of the transducer in x,
while each pair (q, d) in the obligation set represents a run
of the automaton A that was started for a class tail of an
ancestor of x with data value d. A more precise definition
is presented below.
In the following we write A(q) for the automaton A with
initial state changed to q, likewise f (p) for the transducer.
A profile (p, S) is called satisfiable if there exists a data tree
t ⊗ µ and
1) A tree s output by f (p).
2) For each node x in s, a run ρx of A on the class tail
of x in s ⊗ µ.
3) For each (q, d) ∈ S, a run ρ(q,d) of A(q) on s ⊗ µ⁻¹(d).
Emptiness of T is equivalent to satisfiability of the initial
profile, where the transducer state is the initial state of the
transducer, and the obligation set is empty. The rest of
Section XI is devoted to an algorithm that decides if a given
profile is satisfiable.
B. Well structured transition systems
Consider an enumerable set, which we will call a domain.
A transition system on the domain is a function f that maps
each element x of the domain to a nonempty family of
subsets of the domain. A derivation is a finite tree whose
nodes are labeled by elements of the domain, such that for
each node with label x, the labels in the children of x form
a set from f (x). We write f ∗ for the set of elements in the
domain that have (i.e. appear in the root of) a derivation.
A transition system f is called computable if f maps each
element to a finite family of finite subsets of the domain, and
f is a computable function.
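Restricted to a finite domain, the set f∗ is a least fixpoint: an element enters f∗ as soon as some set in its image is already included in f∗. A minimal sketch, with f encoded, as an assumption of ours, as a function returning a list of sets:

```python
def derivable(domain, f):
    """Least-fixpoint computation of f* for a computable transition
    system f restricted to a finite domain: x is in f* iff some set in
    f(x) consists entirely of elements already known to be in f*.  In
    particular, x is in f* outright when the empty set is in f(x)."""
    star = set()
    changed = True
    while changed:
        changed = False
        for x in domain:
            if x not in star and any(set(X) <= star for X in f(x)):
                star.add(x)
                changed = True
    return star

# Toy example: 0 is derivable outright, 1 derives from 0,
# while 2 and 3 only derive from each other (no finite derivation).
f = {0: [set()], 1: [{0}], 2: [{3}], 3: [{2}]}
assert derivable(range(4), lambda x: f[x]) == {0, 1}
```

The toy example also illustrates why f∗ is a least rather than a greatest fixpoint: the cycle between 2 and 3 would survive a greatest fixpoint but yields no finite derivation tree.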
A binary relation R over the domain is called a downward
simulation if for any (y, x) ∈ R and any X ∈ f (x), there is
some Y ∈ f (y) such that each element of Y is related by
R to some element of X: for each y 0 ∈ Y there is x0 ∈ X
with (y 0 , x0 ) ∈ R.
Lemma 16. If a downward simulation R contains a pair
(y, x) and x ∈ f ∗ then y ∈ f ∗ and |y| ≤ |x|.
A transition system is monotone if its domain is equipped
with a partial order ≤ that is a downward simulation. By
Lemma 16 in a monotone transition system the set f ∗ is
downward-closed, and thus its complement ¬f ∗ is upward
closed.
Theorem 10. Consider an enumerable domain with a
computable, well founded order without infinite antichains.
For any computable and monotone transition system f ,
membership in f ∗ is decidable.
Proof: For an element x of the domain we run two
algorithms in parallel. The first algorithm enumerates all
derivations and terminates if it finds one with x in the root.
Thus the algorithm terminates if and only if x belongs to
f ∗.
The second algorithm enumerates all finite antichains in
the domain and terminates if it finds an antichain Z such that x belongs to the upward closure of Z, call it Z↑,
and Z satisfies the following property (*): for any element
z ∈ Z, every set in f (z) has nonempty intersection with Z ↑ .
We now show that this algorithm terminates if and only if
x does not belong to f ∗ .
Suppose that x does not belong to f ∗ . We claim that
the algorithm succeeds; one possible set Z produced by the
algorithm is all minimal elements in the complement of f ∗ .
Since the order has no infinite antichains, Z is finite. By
well foundedness, the complement of f ∗ equals Z ↑ . Thus
Z satisfies property (*) and x ∈ Z ↑ .
Suppose that x belongs to f ∗ . We claim that the upward
closure Z ↑ of any antichain Z satisfying (*) contains no
elements of f ∗ , and hence the algorithm must fail. For
the proof we will use a bit of duality of induction and
coinduction.
The set f ∗ is the smallest set 𝒳 with the property that whenever f (x) contains a set X with X ⊆ 𝒳, then x ∈ 𝒳. Thus f ∗ is the smallest (pre-)fixed point of the mapping

F : 𝒳 ↦ {x : some set in f (x) is included in 𝒳},

i.e., the smallest 𝒳 with F(𝒳) ⊆ 𝒳. By duality, the complement of f ∗ is the greatest (post-)fixed point of the mapping 𝒳 ↦ ¬F(¬𝒳); i.e., the greatest 𝒳 with 𝒳 ⊆ ¬F(¬𝒳); i.e., the greatest 𝒳 such that whenever x ∈ 𝒳, each set in f (x) has nonempty intersection with 𝒳.
Now we are ready to argue that if Z satisfies (*) then
Z ↑ is included in ¬f ∗ . Indeed, by monotonicity it follows
that Z ↑ satisfies (*): for every z ∈ Z ↑ , every set in f (z)
has nonempty intersection with Z ↑ . Thus Z ↑ is a post-fixed
point, Z ↑ ⊆ ¬F(¬Z ↑ ), and hence necessarily is included in
the greatest one.
C. Profiles form a transition system
In this and in the next section we apply the machinery
of Theorem 10 to decide which profiles are satisfiable,
completing the decision procedure for emptiness of tail class
automata. Our aim is first to define an ordered domain and a
transition system f , and then its quotient f̂ that satisfies the assumptions of Theorem 10, and such that satisfiable profiles can be computed on the basis of testing membership in f̂∗.
The domain of f will contain all profiles; f itself we
define below. The definition of f is designed so that a data
tree that makes a profile satisfiable is essentially a derivation
of this profile, and vice versa (cf. Lemma 17 below).
A profile (p, S) we call accepting if p as well as all states
appearing in S are accepting. If a profile x = (p, S) is
accepting then let f (x) contain only ∅. Otherwise, f (x) will contain all the sets {(p0 , S0 ), (p1 , S1 )} (each of size at most two) that arise as outcomes of the following nondeterministic procedure:
• Choose an input letter a of f , a transition p, a → b, p0 , p1 of f , and a data value d̄.
• For each (q, d) ∈ S ∪ {(qinit , d̄)}, where qinit is the initial state of A, choose a transition of A

q, (b, 1) → q0 , q1   if d = d̄,
q, (b, 0) → q0 , q1   if d ≠ d̄,

and add (q0 , d) to S0 and (q1 , d) to S1 .
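The procedure can be sketched in Python; the encodings are hypothetical (the transducer transitions as tuples (p, a, b, p0, p1), the transitions of A as a function returning the pairs (q0, q1) available from q on (b, 1) or (b, 0)), and only a single representative fresh data value is used, which suffices up to isomorphism of profiles:

```python
from itertools import product

def profile_successors(profile, trans_f, trans_A, q_init,
                       accepting_f, accepting_A):
    """Enumerate the family f(x) for a profile x = (p, S): pick a
    transducer transition p, a -> b, p0, p1 and a data value d_bar, then
    split every obligation in S, together with the fresh obligation
    (q_init, d_bar), along a transition of A."""
    p, S = profile
    if accepting_f(p) and all(accepting_A(q) for (q, _) in S):
        return [frozenset()]            # accepting profiles derive the empty set
    values = {d for (_, d) in S}
    fresh = max(values, default=0) + 1  # one representative fresh value
    family = []
    for (p_src, a, b, p0, p1) in trans_f:
        if p_src != p:
            continue
        for d_bar in values | {fresh}:
            obligations = list(S) + [(q_init, d_bar)]
            # For each obligation, the available splittings in A.
            choices = [trans_A(q, b, d == d_bar) for (q, d) in obligations]
            for picks in product(*choices):
                S0 = frozenset((q0, d)
                               for (_, d), (q0, _) in zip(obligations, picks))
                S1 = frozenset((q1, d)
                               for (_, d), (_, q1) in zip(obligations, picks))
                family.append({(p0, S0), (p1, S1)})
    return family
```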
Lemma 17. Satisfiable profiles are exactly f ∗ .
The profiles are ordered coordinatewise, with the first
coordinate (states of the transducer) ordered discretely (all
states are incomparable), and the second coordinate (obligations) ordered by inclusion.
Lemma 18. The transition system f is monotone.
D. Abstractions of profiles form a well-structured transition system

Now we define the well structured transition system f̂. Elements of the domain will not be profiles, but isomorphism classes of profiles, in the following sense. Two profiles are called isomorphic if they have the same transducer state and there is a bijection on data values that maps one obligation set to the other. It is not difficult to see that the isomorphism class of (p, S) is identified by its abstraction (p, [S]), where the vector

[S] ∈ N^{P(Q)\{∅}},

called the abstraction of S, stores on coordinate Q′ ⊆ Q the number of data values in the set

{d ∈ D : Q′ = {q : (q, d) ∈ S}}.

The domain of f̂ will consist of abstractions of profiles; in particular, the domain is enumerable. Since isomorphic profiles are equivalently satisfiable, this abstraction is not a problem: it makes sense to talk about satisfiable abstractions.

The transition function f̂ is defined by projecting the transition function f : if w = [S] is an abstraction of the obligation set S, then f̂(p, w) contains precisely those sets that are obtained by abstraction of some set X from f (p, S); by abstraction of X we mean here the set of profile abstractions obtained by replacing each profile (p′, S′) appearing in X with its abstraction (p′, [S′]).

The following lemma together with Lemma 17 reduces emptiness of tail class automata to testing f̂∗.

Lemma 19. A profile is in f ∗ if and only if its abstraction is in f̂∗.

Proof: For both directions we may use Lemma 16 applied to the disjoint union of f and f̂; as the downward simulation we will use either the relation R containing all pairs (a profile, its abstraction), or the inverse relation R⁻¹, respectively. It remains to show that both R and R⁻¹ are downward simulations.

Observe that the isomorphism of profiles satisfies a stronger property, which we call a 2-level bisimulation: whenever x and y are isomorphic and X ∈ f (x), there is Y ∈ f (y) such that X and Y witness the following 1-level bisimulation condition: for any x′ ∈ X there is y′ ∈ Y isomorphic to x′; and vice versa, for any y′ ∈ Y there is x′ ∈ X isomorphic to y′.

In consequence, the relation R is a 2-level bisimulation too. Thus both R and R⁻¹ are indeed downward simulations.

The transition function f̂ always maps an abstraction x to a finite family of finite sets of abstractions, and f̂(x) is computable, for a given x, by simulating the nondeterministic procedure given in Section XI-C. Thus we have:

Lemma 20. The transition system f̂ is computable.

The order on abstractions of obligation sets is defined by projecting the inclusion order on obligation sets: w ≤ w′ if w = [S], w′ = [S′] for some obligation sets S ⊆ S′. Accordingly we define the order on abstractions of profiles: (p, w) ≤ (p′, w′) if p = p′ and w ≤ w′. Directly by Lemma 18 and by the bisimulation property of the profile isomorphism we get:

Lemma 21. The transition system f̂ is monotone.

We complete the check-list of assumptions of Theorem 10 by showing the following:

Lemma 22. The order ≤ is computable, well founded and has no infinite antichains.
Proof: Since there are finitely many transducer states,
it suffices that these properties hold for the order on abstractions of obligation sets.
Recall that abstractions of obligation sets are vectors of numbers indexed by nonempty subsets of Q. Note that the order on abstractions of obligation sets used here is not the same as the classical coordinatewise order on vectors of numbers, call it ⊑.
To show computability, we give the following alternative
definition of ≤. For two abstractions v, w of obligation sets,
we have v ≤ w if and only if v can be reached from w by
a finite number of steps of the form: choose a coordinate
R ⊆ Q and an element q ∈ R, decrement the vector on
coordinate R; and then increment the vector on coordinate
R \ {q} if R \ {q} is nonempty.
This alternative definition also shows that the order is well founded: it is impossible to perform an infinite sequence of such steps.
Finally, we want to show that ≤ has no infinite antichains. It is not hard to see that v ⊑ w implies v ≤ w. It is well known that ⊑ has no infinite antichains (Dickson's lemma), and hence the same must hold for ≤.
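To close, the abstraction of an obligation set used throughout this section can be sketched in a few lines; using a Counter keyed by frozensets is our own stand-in for the vector in N^{P(Q)\{∅}}:

```python
from collections import Counter

def abstraction(S):
    """Compute the abstraction [S] of an obligation set S, a finite set
    of (state, data value) pairs: for each nonempty Q' included in Q,
    count the data values d whose set of states {q : (q, d) in S}
    equals Q'."""
    per_value = {}
    for (q, d) in S:
        per_value.setdefault(d, set()).add(q)
    return Counter(frozenset(qs) for qs in per_value.values())

# Two isomorphic obligation sets (related by a bijection on data
# values) get the same abstraction.
S1 = {('q', 1), ('r', 1), ('q', 2)}
S2 = {('q', 7), ('r', 7), ('q', 3)}
assert abstraction(S1) == abstraction(S2)
```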