Implication and axiomatization of functional and constant constraints

Ann Math Artif Intell
DOI 10.1007/s10472-015-9473-7
Implication and axiomatization of functional
and constant constraints
Jelle Hellings1 · Marc Gyssens1 · Jan Paredaens2 ·
Yuqing Wu3
© Springer International Publishing Switzerland 2015
Abstract Akhtar et al. introduced equality-generating constraints and functional constraints as a first step towards dependency-like integrity constraints for RDF data [3]. Here,
we focus on functional constraints. Since the usefulness of functional constraints is not
limited to the RDF data model, we study the functional constraints in the more general setting of relations with arbitrary arity. We further introduce constant constraints and study
the functional and constant constraints combined. Our main results are sound and complete
This is a revised and extended version of the paper ‘Implication and Axiomatization of Functional
Constraints on Patterns with an Application to the RDF Data Model’ presented at the 8th International
Symposium on Foundations of Information and Knowledge Systems, Bordeaux, France (FOIKS
2014) [25].
Yuqing Wu carried out part of her work during a sabbatical visit to Hasselt University with a Senior
Visiting Postdoctoral Fellowship of the Research Foundation Flanders (FWO).
Jelle Hellings
[email protected]
Marc Gyssens
[email protected]
Jan Paredaens
[email protected]
Yuqing Wu
[email protected]
1
Faculty of Sciences, Hasselt University and Transnational University of Limburg,
Martelarenlaan 42, 3500 Hasselt, Belgium
2
Department of Mathematics and Computer Science, University of Antwerp,
Campus Middelheim, Bldg. G, Middelheimlaan 1, 2020 Antwerp, Belgium
3
Computer Science Department, Pomona College, 185 E. 6th St., Claremont, CA 91711, USA
J. Hellings et al.
axiomatizations for the functional and constant constraints, both separately and combined.
These axiomatizations are derived using the chase algorithm for equality-generating constraints. For derivations of constant constraints, we show how every chase step can be
simulated by a bounded number of applications of inference rules. For derivations of functional constraints, we show that the chase algorithm can be normalized to a more specialized
symmetry-preserving chase algorithm performing so-called symmetry-preserving steps. We
then show how each symmetry-preserving step can be simulated by a bounded number of
applications of inference rules. The axiomatization for functional constraints is in particular
applicable to the RDF data model, solving a major open problem of Akhtar et al.
Keywords Functional constraints · Constant constraints · Chase algorithm ·
Axiomatization
Mathematics Subject Classfication (2010) 68P15 · 05C60
1 Introduction
Usually, data is subject to integrity constraints implied by the semantics of the data.
Formalizing these constraints can help us to reason over the data and identify inconsistencies. As such, formal constraints play a major role in database management systems
that automatically maintain integrity of the data and optimize query evaluation. In this
work, we study what we call constant-functional constraints. These constant-functional constraints are a generalization of the well-known functional dependencies of Codd [13], the
conditional functional dependencies of Fan et al. [21], and the functional constraints of
Akhtar et al. [3, 15, 25]. More concretely, the constant-functional constraints can be characterized as the union of the functional constraints applied to arbitrary relations, as presented
by Hellings et al. [25], and the constant constraints, which we introduce here.
Functional constraints have the form
(P, L → R),
where P specifies a pattern in the data and L and R are sets of variables occurring in
this pattern. Their semantics is comparable to that of the functional dependencies: if two
parts of the data match the pattern and are equal on L, then they must also be equal on R.
Example 1 illustrates this for ternary RDF data.
Example 1 Consider the family tree shown in Fig. 1. On these data, the constraint
“a child only has one biological father and mother” holds. This constraint can be
expressed by the pair of functional constraints ({($p, motherOf, $c)}, $c → $p) and
({($p, fatherOf, $c)}, $c → $p). The constraint “children have only one biological parent”,
which can be expressed by ({($p, $t, $c)}, $c → $p), does not hold, however.
Functional constraints allow the expression of not only the traditional functional dependencies, but also of context-dependent functional dependencies that only apply to a part of
the data. In the context of Example 1, a context-dependent functional dependency could be
the constraint child → parent restricted to motherOf triplets.
Implication and axiomatization of functional and constant constraints
Fig. 1 Simplified visualization
of an RDF representation of a
small family tree
Because of the use of free variables and constants in patterns, patterns may also match
specific structures in the relation. This is particularly useful if the underlying relation
represents a graph. In this setting, functional constraints may impose structural constraints.
Example 2 Let Edge(from, to) be a binary relation schema representing the edge relation
of a graph. The functional constraint ({($n, $n)}, ∅ → $n) expresses that there is at most
one node with a self-loop. The pattern {($n, $m), ($m, $n)} in the functional constraint
({($n, $m), ($m, $n)}, $n → $m) matches cycles (closed paths) of length 2 (including selfloops). Consider two pairs of such cycles starting at node v. By the constraint, the second
node in both cycles must be equal, and thus the latter constraint expresses that every node v
is part of at most one cycle of length 2.
With functional constraints, one cannot specify that data entries matching a variable
in some pattern should have a specified constant value. To extend the scope of the constraints under consideration, we introduce constant constraints and study functional and
constant constraints combined and refer to them as constant-functional constraints. Constant
constraints have the form
(P, E),
where P is a pattern and E is a finite set of constant equalities of the form (c = $v), with c
a constant and $v a variable occurring in the pattern P.
Example 3 Let PI(name, country, cc, phone) be the schema of a relation storing personal
information, in which the attribute cc provides the country code that should be used to
call the phone number phone. The constant constraint ({($n, BE, $c, $p)}, {(‘32’ = $c)})
specifies that people from Belgium have a Belgian phone number.
For the functional dependencies in the relational data model, a sound and complete
axiomatization is already long known [5], and, more recently, Akhtar et al. presented a
sound and complete axiomatization for the equality-generating constraints in the RDF data
model [3]. Since the constant-functional constraints are subsumed by equality-generating
constraints, this axiomatization can also be used for the inference of constant-functional
constraints only. In this case, intermediate inference steps can generate equality-generating
constraints that are not necessarily equivalent to constant-functional constraints, unfortunately. For the functional constraints, Akhtar et al. identified the existence of a sound and
complete axiomatization (not involving other types of constraints) as a major open problem.
On the one hand, the Armstrong axiomatization for the functional dependencies [5] can be
generalized to the setting of functional constraints. This generalization, however, lacks the
reasoning power over patterns necessary for a complete axiomatization. On the other hand,
there is no straightforward way to specialize the axiomatization of the equality-generating
constraints to functional constraints only.
J. Hellings et al.
In this paper, we present a sound and complete axiomatization for the constant constraints, the functional constraints, and the constant-functional constraints over relations of
arbitrary arity. The constraints used by all intermediate inference steps allowed by these
axiomatizations are constant constraints, functional constraints, and constant-functional
constraints, respectively. In particular, the case of ternary relations yields a sound and complete axiomatization for the functional constraints in the RDF data model, thereby positively
solving the open problem of Akhtar et al. [3].
Our axiomatization is derived using the chase algorithm for equality-generating constraints [3], which is a variation of the standard chase algorithm [2, 8]. For derivations of
constant constraints, we show how every chase step can be simulated by a bounded number of applications of inference rules. For derivations of functional constraints, such a direct
simulation of chase steps is not straightforward. The key insight we use to circumvent
this problem is that the chase algorithm, when applied to decide if a functional constraint
holds, can be normalized to a more specialized, symmetry-preserving, chase algorithm. The
main idea behind the symmetry-preserving chase algorithm is that, due to their semantics,
chases for functional constraints always start with tableaux that are symmetric. We prove
that during such chases one can always maintain this symmetry in the tableau, by using socalled symmetry-preserving steps. We then show how each symmetry-preserving step can
be simulated by a bounded number of applications of inference rules. The axiomatization
for the constant-functional constraints follows from these simulations. Axiomatizations for
only the constant constraints and only the functional constraints can be derived in a similar
manner.
This is a revised and extended version of Hellings et al. [25]. Compared to Hellings et al.,
we present a slightly simplified sound and complete axiomatization of the functional constraints. Additionally, we present the constant constraints and provide sound and complete
axiomatizations of constant constraints and of constant-functional constraints.
Organization. In Section 2, we present the necessary definitions used throughout this
paper. In Section 3, we discuss the constant-functional constraints and equality-generating
constraints. In Section 4, we present the chase algorithm for equality-generating constraints and specialize it to constant-functional constraints. In Section 5, we propose an
axiomatization for the constant-functional constraints. In Section 6, we look at the relationships between the constant-functional constraints and previously introduced classes of
dependencies. In Section 7, we summarize our findings and discuss directions for future
work.
2 Preliminaries
Functional and equality-generating constraints [3] have originally been introduced in the
context of the RDF data model. In this model, RDF data are usually represented by a single
ternary relation. In the Introduction, we have already argued that functional and equalitygenerating constraints are useful in a wider range of data models. We therefore generalize
functional and equality-generating constraints to relations of arbitrary arity. The following
notations and definitions will be used throughout the paper.
We consider disjoint infinitely enumerable sets U and V of constants and variables,
respectively. For distinction, we usually prefix variables by “$”. A term is either a constant or a variable. Hence, the set T of all terms equals U ∪ V . A tuple of arity n is a
Implication and axiomatization of functional and constant constraints
sequence (t1 , . . . , tn ) of terms. A pattern of arity n is a finite set of tuples of arity n. If P
is a pattern, then UP , VP , and TP denote the set of all constants, variables, and terms in P,
respectively. A pattern P with VP = ∅ is usually referred to as a relation.
We define the domain, range, and inverse of a function f in the usual way and denote
these by domain(f ), range(f ), and f −1 , respectively. Two functions f and g agree on a
set S, denoted by f =S g, if f (x) = g(x) for all x ∈ S. The restriction of a function f to a
set S is defined as f |S = {(x, y) | x ∈ S, y = f (x)}. The identity on a set S is defined as
idS = {(s, s) | s ∈ S}.
The term-based renaming function φa1 ←b1 ,...,ai ←bi , a1 , b1 , . . . , ai , bi ∈ T , is the function on T for which φa1 ←b1 ,...,ai ←bi (bj ) = aj , j = 1, . . . , i, and which is the identity
elsewhere. Likewise, the function-based renaming function f ←g , with f a function and
g an injective function with domain(f ) = domain(g), is the function on T for which
f ←g (g(t)) = f (t), t ∈ domain(f ), and which is the identity elsewhere. Notice that this
function is well-defined due to the injectivity of g.
A function f on terms is extended to tuples, patterns, and sets in the following natural
way: for a tuple (t1 , . . . , tn ), f ((t1 , . . . , tn )) = (f (t1 ), . . . , f (tn )), and, for a set S, f (S) =
{f (s) | s ∈ S}. For two patterns P and Q, a function e : VP ∪ U → T is an embedding of P
into Q if e|U = idU and e(P) ⊆ Q.
For constraints for which this is relevant, we denote “pattern P satisfies constraint C” by
P |≡ C. Likewise, for a set of constraints S , we write P |≡ S to denote that P |≡ C for all
C ∈ S .1
We say that “set of constraints S implies set of constraints S ”, denoted by S |= S ,
if, for every relation R, R |≡ S implies R |≡ S . The sets of constraints S and S are
equivalent, denoted by S ≡ S , if S |= S and S |= S . In the above, whenever S = {C}
and/or S = {C } are singletons, we usually write C and/or C instead of {C} and/or {C }.
An inference rule can be described using a schema of the form
S
conditions,
C
which should be informally read as “if S , subject to some conditions, then C”, where S is
a set of constraints and C is a constraint. The inference rule is k-ary if |S | = k. A set of
inference rules R is k-ary if every rule r ∈ R is at most k-ary.
If a constraint C can be obtained from a set of constraints S by repeatedly applying
rules of a set of inference rules R, we say that C can be derived from S using R, denoted
by S R C. We usually omit R if R is clear from the context. A set of inference rules
R is sound whenever S R C implies S |= C and complete whenever S |= C implies
S R C.
A set of inference rules is a finite axiomatization if it is sound, complete, k-ary (for some
k ≥ 0), and if the inference rules can be represented by a finite number of decidable inference rule schemas. An inference rule schema s = “if S , subject to some conditions, then C”
is decidable if, given a set of constraints S with |S | ≤ |S | and constraint C , it is decidable
if s can be applied to derive S s C .
1 Observe
that the semantics of satisfaction depends on the type of constraints considered. The semantics of
satisfaction for functional constraints is defined in Definition 5 and the semantics of satisfaction for constant
constraints is defined in Definition 7. However, we use a generic notation for satisfaction independent of the
type of constraints, which is why we introduce it here.
J. Hellings et al.
3 Functional and constant constraints
Functional constraints and constant constraints on n-ary relations are special subclasses of
equality-generating constraints on n-ary relations. We formally define functional, constant,
and equality-generating constraints on n-ary relations.
Definition 1 A functional constraint is a pair (P, L → R), where P is a nonempty pattern
and L, R ⊆ VP .
Let C = (P, L → R) be a functional constraint. We say that P, L, and R are the pattern,
left-hand side, and right-hand side of C, respectively. We denote these by PTN(C), LHS(C),
and RHS(C).
Definition 2 A constant constraint is a pair (P, E), where P is a nonempty pattern and E is
a finite set of equalities of the form (c = $v) for c ∈ U and $v ∈ VP .
Definition 3 An equality-generating constraint is a pair (P, E), where P is a non-empty
pattern and E is a finite set of equalities of the form (t = u) for t, u ∈ VP ∪ U .
Let C = (P, E) be a constant or an equality-generating constraint. We say that P and
E are the pattern and the equalities of C, respectively. We denote these by PTN(C) and
EQ (C). We say that an equality (t = u) ∈ E is a constant equality on pattern P if t ∈
U and u ∈ VP . Notice that the constant constraints are a proper syntactical subclass of
the equality-generating constraints, in which the set of equalities is restricted to constant
equalities.
Definition 4 A constant-functional constraint is a constraint that is either a constant
constraint or a functional constraint.
We formally define the semantics of the functional, constant, and equality-generating
constraints on n-ary relations.
Definition 5 Let C = (P, L → R) be a functional constraint and let P be a pattern. Then
P satisfies C if, for every pair of embeddings e1 and e2 of P into P with e1 =L e2 , we have
e1 =R e2 .
Each time we have two embeddings of P into P , it makes sense to look at the structure
e1 (P) ∪ e2 (P) in order to compare the two embeddings. To make reasoning about such
structures easier, we formalize them.
Definition 6 Let f1 and f2 be a pair of embeddings of pattern P into some pattern P
with f1 =L f2 for L ⊆ VP . We say that the pattern D = f1 (P) ∪ f2 (P) is a double
pattern of P and L. A pattern D is a maximally double pattern of P and L if f1 and f2
are injections, f1 =L∪idU f2 , range(f1 |VP ) ⊆ V , range(f2 |VP ) ⊆ V , and range(f1 |VP \L ) ∩
range(f2 |VP \L ) = ∅.
We observe that if f1 (P) ∪ f2 (P) is a maximally double pattern, then, by construction,
f1 and f2 are bijective functions of VP ∪ U into Vf1 (P) ∪ U and Vf2 (P) ∪ U , embedding
P into f1 (P) and f2 (P), respectively. In addition, f1 and f2 map constants to themselves
Implication and axiomatization of functional and constant constraints
and variables onto variables. Hence, P, f1 (P), and f2 (P) are isomorphic and, as a direct
consequence, the maximally double pattern of P and L is unique up to isomorphisms. We
shall therefore often leave the embeddings f1 and f2 that define f1 (P) ∪ f2 (P) implicit.
Definition 7 Let P be a pattern and let C = (P, E) be a constant constraint or an equalitygenerating constraint. Then P satisfies C if, for every embedding e of P into P and every
equality (t = u) ∈ E, we have e(t) = e(u).
Akhtar et al. [3] already showed that every functional constraint can be written as
an equality-generating constraint. To establish a formal relationship between equalitygenerating and functional constraints, we provide the following results.
Proposition 1 Let C = (P, L → R) and let D = f1 (P) ∪ f2 (P) be the maximally double
pattern of P and L. For every two embeddings e1 and e2 of P into pattern P with e1 =L
e2 , there is an embedding e of D into P with, for every $r ∈ R, e1 ($r) = e ◦ f1 ($r)
and e2 ($r) = e ◦ f2 ($r). Conversely, for every embedding e of D into P , there are two
embeddings e1 and e2 of P into P with e1 =L e2 and, for every $r ∈ R, e ◦ f1 ($r) = e1 ($r)
and e ◦ f2 ($r) = e2 ($r).
Proof In one direction, we define e = (e1 ◦ f1 −1 ) ∪ (e2 ◦ f2 −1 ), and, in the other direction,
we define e1 = e ◦ f1 and e2 = e ◦ f2 .
Corollary 1 Let C = (P, L → R) be a functional constraint and let D = f1 (P) ∪ f2 (P) be
the maximally double pattern of P and L. We have C ≡ (D, {(f1 ($r) = f2 ($r)) | $r ∈ R}).
As already mentioned, the functional constraints are a generalization of the functional
dependencies. We also formalize this relationship.
Proposition 2 Let C = L → R be a functional dependency over the relation schema
R(A1 , . . . , An ) with L, R ⊆ {A1 , . . . , An }. Consider the functional constraint C =
({(A1 , . . . , An )}, L → R), in which the attribute names are assumed to be variables. Then
C ≡ C .
The functional dependencies have a well-known axiomatization in the form of Armstrong’s axioms, consisting of the three inference rules Reflexivity, Augmentation, and
Transitivity [5]. We generalize Armstrong’s inference rules to our setting of the functional
constraints.
Rule 1 (Reflexivity) Let P be a pattern. If R ⊆ L ⊆ VP , then (P, L → R).
Rule 2 (Augmentation) If (P, L → R) and V ⊆ VP , then (P, L ∪ V → R ∪ V ).
Rule 3 (Transitivity) If (P, L → M) and (P, M → R), then (P, L → R).
Since Armstrong’s axioms can be generalized to functional constraints, it is straightforward to show that the well-known decomposition and union rules can also be generalized to
functional constraints. Also, similar decomposition and union rules can be obtained for the
equality-generating constraints [3]. For the decomposition and union of constant constraints,
we introduce the following inference rules, which also hold for the equality-generating
constraints in general.
J. Hellings et al.
Rule 4 (Decomposition) If (P, E ∪ E ), then (P, E).
Rule 5 (Union) If (P, E) and (P, E ), then (P, E ∪ E ).
From now on, if the left-hand side or right-hand side in a functional constraint is a singleton set {$v}, then we usually write $v instead. Likewise, if the set of equalities in a
equality-generating or constant constraint is a singleton set {(t = u)}, then we usually write
(t = u) instead.
Armstrong’s axioms together with Decomposition and Union are not complete for the
functional constraints, the constant constraints, or the constant-functional constraints: these
inference rules only provide means to reason on constraints specified on a single pattern.
The following example exhibits a situation in which this is not sufficient:
Example 4 Consider the functional constraint C = ({($a, $b)}, $a → $b) and the pattern
P = {($a, c1 ), ($a, c2 )} with c1 , c2 ∈ U and c1 = c2 . If C holds on a relation R, then no
embedding of P into R is possible, and, hence, every constant-functional constraint on the
pattern P holds.
Likewise, consider the constant constraint C = ({($a, $b)}, (c = $b)) and the pattern
P = {($a, c )} with c ∈ U and c = c . Also in this case, if C holds on a relation R, then
no embedding of P into R is possible, and, hence, every constant-functional constraint on
the pattern P holds.
Both these cases consider the derivation of a constraint on pattern P from a constraint C
on pattern PTN(C) with P = PTN(C). None of the presented inference rules are, however,
able to reason about constraints specified on patterns that are not all equivalent.
4 Chasing constant-functional constraints
For equality-generating constraints, Algorithm 1 is known to decide implication [3]. This
algorithm is a constant-aware variation of standard chase algorithms for equality-generating
dependencies [2, 8].
In Algorithm 1, we refer to lines 5–10 as equalization steps, to line 12 as inconsistency
termination, and to line 15 as regular termination.
Theorem 1 (Akhtar et al. [3]) Algorithm 1 is correct for equality-generating constraints.
We use the relationship between the constant-functional constraints and the equalitygenerating constraints, as described in Corollary 1, to construct a chase-based algorithm
that decides implication of constant-functional constraints, shown as Algorithm 2.
In Algorithm 2, we refer to lines 15–21 as equalization steps, to line 23 as inconsistency
termination, and to line 26 as regular termination. This is analogue to the equalization steps,
inconsistency termination, and regular termination in Algorithm 1.
Theorem 2 Algorithm 2 is correct for constant-functional constraints.
Proof Algorithm 2 makes a case distinction on the type of all constraints involved. In it,
constant constraints are treated as normal equality-generating constraints. For functional
constraints, the algorithm implicitly translates functional constraints to equality-generating
constraints using the results of Proposition 1 and Corollary 1.
Implication and axiomatization of functional and constant constraints
Algorithm 1 Chase for equality-generating constraints
Input: Set of equality-generating constraints S ,
Equality-generating constraint C
Output: S |= C
1: T ← PTN(C)
2: while there exists a constraint C ∈ S with T |≡ C do
3:
Choose equality (t = u ) ∈ EQ(C ) and
embedding e of PTN(C ) into T with e(t ) = e(u )
4: /* equalize e(t ) and e(u ) in T */
5: if e(t ) ∈ V then
6:
/* replace all occurrences of e(t ) in T by e(u ) */
7:
T ← φe(u )←e(t ) (T)
8: else if e(u ) ∈ V then
9:
/* replace all occurrences of e(u ) in T by e(t ) */
10:
T ← φe(t )←e(u ) (T)
11: else /* e(t ), e(u ) ∈ U and e(t ) = e(u ) */
12:
return TRUE
13: end if
14: end while
15: return T |≡ C
In Section 5, we shall use the correctness of Algorithm 2 to prove that there is a complete
axiomatization for the derivation of constant-functional constraints from a set of constantfunctional constraints.
5 Axiomatizing constant-functional constraints
Algorithm 2 is a correct algorithm to decide S |= C, with S a set of constant-functional
constraints and C a single constant-functional constraint. Hence, if S |= C holds,
and if we can simulate every equalization step and the termination of an execution of
Algorithm 2 by sound inference rules, then we have a sound and complete axiomatization
for deriving constant-functional constraints.
For chases that do not perform equalization steps, we provide straightforward inference
rules to simulate the eventual termination step in a single inference step (Section 5.1).
For chases that do initially perform an equalization step, we show that we can reduce any
chase that decides S |= C to a single equalization step (constant constraints) or at most two
equalization steps (functional constraints) with a single constraint C ∈ S , followed by the
chase deciding S |= C , for some constraint C . We do so by showing that, after the initial
equalization step(s) with C , the resulting tableau T is equivalent to the initial tableau for
any chase deciding S |= C . To show that S |= C holds when S |= C holds, we show
C |= C . Lastly, we show {C , C } |= C, while providing sound inference rules to derive
{C , C } C in a finite number of inference steps.
Due to the dual nature of Algorithm 2 with respect to, on the one hand, constant constraints, and on the other hand, functional constraints, we divide our search for inference
rules to simulate the initial equalization step(s) of an execution of Algorithm 2 into two
cases.
J. Hellings et al.
Algorithm 2 Chase for constant-functional constraints
Input: Set of constant-functional constraints S ,
Constant-functional constraint C
Output: S |= C
1: if C is a functional constraint then
2: Let D be the maximally double pattern of PTN(C) and LHS(C)
3: T ← D
4: else /* C is a constant constraint */
5: T ← PTN(C)
6: end if
7: while there exists a constraint C ∈ S with T |≡ C do
8:
if C is a functional constraint then
9:
Choose variable $r ∈ RHS(C ) and pair of embeddings e1 and e2 of
PTN(C) into T with e1 =LHS(C ) e2 and e1 ($r ) = e2 ($r )
10:
t1 , t2 ← e1 ($r ), e2 ($r )
11:
else /* C is a constant constraint */
12:
Choose equality (c = $v) ∈ EQ(C ) and
embedding e of PTN(C ) into T with c = e($v)
13:
t1 , t2 ← c, e($v)
14:
end if
15:
/* equalize t1 and t2 in T */
16:
if t1 ∈ V then
17:
/* replace all occurrences of t1 in T by t2 */
18:
T ← φt2 ←t1 (T)
19:
else if t2 ∈ V then
20:
/* replace all occurrences of t2 in T by t1 */
21:
T ← φt1 ←t2 (T)
22:
else /* t1 , t2 ∈ U and t1 = t2 */
23:
return TRUE
24:
end if
25: end while
26: return T |≡ C
Firstly, we consider the derivation of constant constraints (Section 5.2). Thereto, we
simulate the initial equalization step in the chase that decides S |= C with C a constant
constraint by using a finite number of inference steps.
Secondly, we consider the derivation of functional constraints (Section 5.3). Thereto we
try to simulate the chase that decides S |= C with C a functional constraint. In this case,
we conclude that a direct simulation of individual equalization steps is not straightforward.
To circumvent this problem, we show that, in this case, Algorithm 2 can be normalized to
a more specialized, symmetry-preserving, chase algorithm. For the symmetry-preserving
chase that decides S |= C, we are able to simulate the initial symmetry-preserving step by
using a finite number of inference steps.
Finally, the simulation of direct termination in chases for constant-functional constraints,
the initial equalization step in chases for constant constraints, and the initial symmetrypreserving step in chases for functional constraints are used as the basis for an induction
Implication and axiomatization of functional and constant constraints
argument to extend the simulation to the entire chase (Section 5.4). This induction argument
shows how to construct a derivation S C by showing how to translate each equalization
step (constant constraints) or symmetry-preserving step (functional constraints) of a chase
deciding S |= C to a finite number of inference steps, this such that the produced sequence
of inference steps is a derivation of S C.
5.1 Inference rules for direct termination
Consider the case where Algorithm 2 decides whether S |= C terminates without performing any equalization steps. Two subcases are possible, namely regular termination and
inconsistency termination.
5.1.1 Regular termination
Assuming no equalization steps are possible, we have regular termination if there does not
exist a constraint C ∈ S with T |≡ C . In this case, due to line 26 in Algorithm 2, we have
S |= C if and only if T |≡ C. We investigate necessary conditions on C such that T |≡ C
holds.
Proposition 3 Let C be a functional constraint and let D be the maximally double pattern
of PTN(C) and LHS(C). If D |≡ C, then RHS(C) ⊆ LHS(C).
If we have regular termination and C is a functional constraint, then T is equivalent to
the maximally double pattern of PTN(C) and LHS(C). Hence, by Proposition 3, we conclude RHS(C) ⊆ LHS(C), and thus we can use Reflexivity to derive ∅ C. Hence, also
S C.
Proposition 4 Let C be a constant constraint. If PTN(C) |≡ C, then EQ(C) = ∅.
If we have regular termination and C is a constant constraint, then T is equivalent to
Hence, by Proposition 4, we conclude EQ(C) = ∅. For this case, we introduce the
following inference rule.
PTN(C).
Rule 6 (Empty)
Let P be a pattern. We have (P, ∅).
Proof (soundness) Let R be a relation and assume there are embeddings of P into R. Let e
be any embedding of P into R. The constraint specifies no equalities, hence, every equality
specified by the constraint is satisfied by embedding e.
If we have regular termination and C is a constant constraint, then EQ(C) = ∅, and thus
we can use Empty to derive ∅ C. Hence, also S C.
5.1.2 Inconsistency termination
Still assuming no equalization steps are possible, we have inconsistency termination if there
does exist a constraint C ∈ S with T |≡ C . In this case, due to line 23 in Algorithm 2, we
can find two terms t1 ∈ U and t2 ∈ U with t1 = t2 that should be equal according to C. We
introduce the following inference rules to deal with such inconsistencies.
J. Hellings et al.
Rule 7 (Inconsistency I) Let P be a pattern. If (P , L → R ) with $r ∈ R , and if there is
a pair of embeddings of P into P that agree on L and map $r to two distinct constants,
then (P, E), for every finite set of constant equalities E on P, and (P, L → R), for every
L ⊆ VP and R ⊆ VP .
Proof (soundness) Let R be a relation with R |≡ (P , L → R ), and assume there are
embeddings of P into R. Let e be such an embedding. Let e1 and e2 be embeddings mapping
P into P that agree on L and map $r to two distinct constants. Now, the embeddings
e1 = e ◦ e1 and e2 = e ◦ e2 map P into R, agree on L , and map $r to two distinct
constants, a contradiction. Hence, no embedding e of P into R exists. Thus, we can conclude
R |≡ (P, E), for every finite set of constant equalities E on P, and R |≡ (P, L → R), for
every L ⊆ VP and R ⊆ VP .
Rule 8 (Inconsistency II) Let P be a pattern. If (P , E ) with (c = $v) ∈ E , and if there
is an embedding of P into P mapping $v to a constant unequal to c, then (P, E), for
every finite set of constant equalities E on P, and (P, L → R), for every L ⊆ VP and
R ⊆ VP .
Proof (soundness) Let R be a relation with R |≡ (P , E ), and assume there are embeddings
of P into R. Let e be such an embedding. Let e be an embedding mapping P into P and
mapping $v to a constant unequal to c. Now, the embedding e = e ◦ e maps P into R
and maps $v to a constant unequal to c, a contradiction. Hence, no embedding e of P into
R exists. Thus, we can conclude R |≡ (P, E), for every finite set of constant equalities E
on P, and R |≡ (P, L → R), for every L ⊆ VP and R ⊆ VP .
It is straightforward to verify that these inference rules can be applied to simulate chases
that decide S |= C, with C a constant constraint, with inconsistency termination and without
using equalization steps. We shall now prove that this is also the case when C is a functional
constraint.
Proposition 5 Let S |= C with C a functional constraint. If Algorithm 2 decides S |= C by
immediate inconsistency termination using constraint C ∈ S , then Rule 7 or Rule 8 can be
applied to derive C C.
Proof Let C = (P, L → R) and let T = D = f1 (P) ∪ f2 (P) be the maximally
double pattern of P and L. Let e be any embedding of PTN(C ) into D. As P, f1 (P),
and f2 (P) are isomorphic and f1 and f2 are bijections, the function e = f1 ←f2 ◦ e
is well-defined and an embedding of PTN(C ) into f1 (P). Hence, the function e =
f1 −1 ◦ e = f1 −1 ◦ f1 ←f2 ◦ e is also well-defined and an embedding of PTN(C ) into
P. By construction, the function f1 −1 ◦ f1 ←f2 is the identity on constants. We now consider two cases, namely where C is a functional constraint and where C is a constant
constraint.
1.
If C = (P , L → R ) is a functional constraint, then, since Algorithm 2 must pass
line 9 to terminate on line 23, we can choose a variable $r ∈ R and pair of embeddings
e1 and e2 of P into T = D with e1 =L e2 and e1 ($r ) = e2 ($r ). Now, using the above
construction, we obtain the pair of embeddings e1 = f1 −1 ◦ f1 ←f2 ◦ e1 and e2 =
f1 −1 ◦ f1 ←f2 ◦ e2 of P into P. As e1 =L e2 , we also have e1 =L e2 . Furthermore,
Implication and axiomatization of functional and constant constraints
as e1 ($r ) ∈ U , e2 ($r ) ∈ U , e1 ($r ) = e2 ($r ), and f1 −1 ◦ f1 ←f2 is the identity
on constants, we have e1 ($r ) ∈ U , e2 ($r ) ∈ U , and e1 ($r ) = e2 ($r ). We can thus
apply Inconsistency I using constraint C and embeddings e1 and e2 to derive C C.
2. If C = (P , E ) is a constant constraint, then, since Algorithm 2 must pass line 12
to terminate on line 23, we can choose an equality (c = $v) ∈ E and embedding e
of P into T = D with c = e($v). Now, using the above construction, we obtain an
embedding e = f1 −1 ◦ f1 ←f2 ◦ e of P into P. As e($v) ∈ U and f1 −1 ◦ f1 ←f2 is
the identity on constants, we have e ($v) ∈ U and c = e (c) = e ($v). We can thus
apply Inconsistency II using constraint C and embedding e to derive C C.
This case analysis completes the proof.
5.2 Inference rules for deriving constant constraints
From now on, we assume that a chase deciding S |= C always performs equalization steps.
For simulating the initial equalization step in the case where C is a constant constraint, we
use the following three inference rules.
Rule 9 (Application I) Let P be a pattern, let $v ∈ VP be a variable, and let t ∈ VP ∪ U
be a term. If (φt←$v (P), E), if (P , L → R ), and if there is a pair of embeddings of
P into P that agree on L and map $r ∈ R to t and $v, respectively, then (P, E). If, in
addition, there exists a c ∈ U such that either c = t (if t ∈ U ) or (c = t) ∈ E (if t ∈ V ),
then also (P, E ∪ {(c = $v)}).
Proof (soundness) Let R be a relation with R |≡ (φt←$v (P), E) and R |≡ (P , L → R ),
and assume there are embeddings of P into R. Let e be any embedding of P into R and let
e1 and e2 be any pair of embeddings of P into P that agree on L and map $r ∈ R to t
and $v, respectively. Now e1 = e ◦ e1 and e2 = e ◦ e2 are embeddings of P into R with
e1 =L e2 . By R |≡ (P , L → R ), we have e1 ($r ) = e2 ($r ), and, by construction, we
have e1 ($r ) = e(t) and e2 ($r ) = e($v). Hence, e($v) = e(t), and ε = e|domain(e)\{$v}
is an embedding of φt←$v (P) into R. Now, by R |≡ (φt←$v (P), E), we have, for every
(c = $w) ∈ E, ε($w) = c . Notice that $v ∈ domain(ε). Hence, if ε($w) = c , then
$w = $v and e($w) = c . Hence, we conclude R |≡ (P, E).
Above, we already showed that e($v) = e(t). It now follows readily that, if, in addition,
there exists c ∈ U such that either c = t (if t ∈ U ) or (c = t) ∈ E (if t ∈ V ), we also have
R |≡ (P, (c = $v)), and, hence, R |≡ (P, E ∪ {(c = $v)}).
Rule 10 (Application II) Let P be a pattern, let $v ∈ VP be a variable, and let c ∈ U be a
constant. If (φc←$v (P), E), if (P , E ) with (c = $v ) ∈ E , and if there is an embedding
of P into P mapping $v to $v, then (P, E ∪ {(c = $v)}).
Proof (soundness) Let R be a relation with R |≡ (φc←$v (P), E) and R |≡ (P , E ), and
assume there are embeddings of P into R. Let e be any embedding of P into R and let e be
any embedding of P into P mapping $v to $v. Now e = e ◦ e is an embedding of P into
R. By R |≡ (P , E ), we have e ($v ) = e($v) = c.
As a consequence, ε = e|domain(e)\{$v} is an embedding of φc←$v (P) into R. Now, by
R |≡ (φt←$v (P), E), we have, for every (c = $w) ∈ E, ε($w) = c . Notice that $v ∈
domain(ε). Hence, if ε($w) = c , then $w = $v and e($w) = c . Hence, we conclude
R |≡ (P, E ∪ {(c = $v)}).
J. Hellings et al.
The following example exhibits situations in which the Application I and II rules can be
used.
Example 5 Using Application I in a straightforward manner, we can derive
{({($a, $b, $c)}, $a → $b), ({($a, b, $c), ($a, b, d)}, (c = $c))} ({($a, b, $c), ($a, $b, d)}, (c = $c))
by using the embeddings
e1 = {$a → $a, $b → b, $c → $c}
e2 = {$a → $a, $b → $b, $c → d}.
We thus have e1 =$a e2 , e1 ($b) = b, and e2 ($b) = $b. Using Application I, we can also
derive
({($a, $b, $c), $a → $c) ({($a, $b, c), ($a, $b, $c)}, {c = $c})
by using the embeddings
e1 = {$a → $a, $b → $b, $c → c}
e2 = {$a → $a, $b → $b, $c → $c}.
Observe that in this case, Application I is applied to a single non-trivial functional
constraint and to the trivial constant constraint ({($a, $b, c), ($a, $b, $c)}, ∅). Using
Application II, we can derive
{({($a, $b, $c)}, (c = $c)), ({($a, $b, c)}, (b = $b))} ({($a, $b, $c)}, (b = $b))
by using the embedding e = {$a → $a, $b → $b, $c → $c}.
Rule 11 (Inconsistent Propagation) Let P be a pattern, $v ∈ VP , and c1 , c2 ∈ U with
c1 = c2 . If (P, {(c1 = $v), (c2 = $v)}), then (P, E) for every finite set of constant
equalities E on P.
Proof (soundness) Let R be a relation with R |≡ (P, {(c1 = $v), (c2 = $v)}) and assume
there are embeddings of P into R. Let e be such an embedding mapping P into R. By
R |≡ (P, {(c1 = $v), (c2 = $v)}), we must have c1 = e($v) and c2 = e($v). As a
consequence, we conclude c1 = c2 , a contradiction. Hence, no embedding e of P into R
exists. Thus we can conclude R |≡ (P, E), for every finite set of constant equalities E on
P.
Before we show how the initial equalization step in any chase deciding S |= C with C
a constant constraint can be simulated by the introduced inference rules, we introduce an
additional inference rule to simplify our proofs:
Rule 12 (Embedding I) If (P , E ) and h is an embedding from P into P, then (P, {(c =
h($v)) | ((c = $v) ∈ E ) ∧ (h($v) ∈ V )}).
Proof (soundness) Let R be a relation with R |≡ (P , E ), and assume there are embeddings
of P into R. Let e be any embedding of P into R. Then ε = e ◦ h is an embedding of P into
R. Hence, for every (c = $v) ∈ E , we have e(c) = c = ε(c) = ε($v) = e(h($v)).
Embedding I explicitly maps a constant constraint defined on a pattern to a different pattern. Observe that Embedding I is applicable to equality-generating constraints as
Implication and axiomatization of functional and constant constraints
well. However, if we apply Embedding I to a constant constraint (P , E ), then the derived
constraint is always a constant constraint. The following example illustrates the usage of
Embedding I.
Example 6 If ({($a, $b)}, ($a = c)) holds, then trivially also ({($x, $y)}, ($x = c)) holds.
We can derive ({($x, $y)}, ($x = c)) from ({($a, $b)}, ($a = c)) by using Embedding I
with the embedding h = {$a → $x, $b → $y}.
Next, we show how the initial equalization step in any chase deciding S |= C with C a
constant constraint can be simulated by the introduced inference rules.
Theorem 3 Let S be a set of constant-functional constraints, let C = (P, E) be a constant
constraint, and let S |= C.
If Algorithm 2 is initially able to perform an equalization step with C ∈ S resulting in
tableau T = P , then there exists a constant constraint C = (P , E ) such that we have
the following:
1.
2.
S |= C ,
{C , C } C using Rules 4, 9, 10, and 11.
Proof Without loss of generality, we can assume that φt1 ←t2 is the equalization performed
in the initial equalization step: this always holds if C is a constant constraint, and, if C is
a functional constraint, then we can always swap the roles of e1 and e2 . Hence, we assume
that P = φt1 ←t2 (P).
We distinguish two types of executions of Algorithm 2, namely those that terminate
regularly and those that terminate due to inconsistency. We divide our analysis into these
two cases.
1.
Algorithm 2 terminates regularly. We define
E = {c = φt1 ←t2 ($v) | ((c = $v) ∈ E) ∧ (φt1 ←t2 ($v) ∈ V )}
and C = (P , E ). Observe that we have E = E if and only if there exists an equality
(c = t2 ) ∈ E: if (c = t2 ) ∈ E and t1 ∈ V , then we have E = E ∪{(c = t1 )}\{(c = t2 )},
and, if (c = t2 ) ∈ E and t1 ∈ U , then we have E = E \ {(c = t2 )}. Also observe that, if
t1 ∈ U and if there exists an equality (c = t2 ) ∈ E, then, due to this chase terminating
regularly and S |= C, we have t1 = c.
We have C |= C , since we have C C by a straightforward application of
Embedding I using the embedding φt1 ←t2 . By S |= C and C |= C , we conclude
S |= C . Next, we prove {C , C } C. We do so by a case distinction on the type of
constraint C .
1.a. C is a functional constraint. Let e1 and e2 be embeddings of PTN(C ) into T = P
and let $r ∈ RHS(C ) with t1 = e1 ($r ) and t2 = e2 ($r ), meeting the conditions
of line 9 of Algorithm 2. Since Algorithm 2 terminates regularly, it is not possible
that t1 , t2 ∈ U , and, since φt1 ←t2 is the equalization performed, it is not possible
that t1 ∈ V , t2 ∈ U . Two cases remain, namely t1 , t2 ∈ V and t1 ∈ U , t2 ∈ V .
First consider the case where t1 ∈ V and t2 ∈ V . If there exists an equality
(c = t2 ) ∈ E, then we have E = E ∪ {(c = t1 )} \ {(c = t2 )}, and, if there
does not exist an equality (c = t2 ) ∈ E, then we have E = E. In both cases,
we apply Application I to conclude {C , C } C. Next consider the case where
J. Hellings et al.
t1 ∈ U and t2 ∈ V . In this case, we have E = E \ {(t1 = t2 )} and we apply
Application I to conclude {C , C } (P, E ∪ {(t1 = t2 )}) and Decomposition to
conclude {C , C } (P, E ). If (t1 = t2 ) ∈ E, then C = (P, E ∪ {(t1 = t2 )}),
otherwise C = (P, E ). Hence, we conclude {C , C } C.
1.b. C is a constant constraint. Let e be an embedding of PTN(C ) into T = P and
let (c = $v) ∈ EQ(C ) be an equality with t1 = c and t2 = e($v), meeting
the conditions of line 12 of Algorithm 2. In this case, we have E = E \ {(t1 =
t2 )}. We apply Application II to conclude {C , C } (P, E ∪ {(t1 = t2 )}) and
Decomposition to conclude {C , C } (P, E ). If (t1 = t2 ) ∈ E, then C = (P, E ∪
{(t1 = t2 )}), otherwise C = (P, E ). Hence, we conclude {C , C } C.
2.
Algorithm 2 terminates due to inconsistency. Let $w ∈ VP be a variable and let c1 , c2 ∈
U be constants with c1 = c2 . We define E = {(c1 = $w), (c2 = $w)} and C =
(P , E ). Chasing pattern P using S will lead to inconsistency, hence, S |= C . Next,
we prove {C , C } C. We do so by a case distinction on the type of constraint C .
2.a. C is a functional constraint. Let e1 and e2 be embeddings of PTN(C ) into T = P
and let $r ∈ RHS(C ) with t1 = e1 ($r ) and t2 = e2 ($r ), meeting the conditions of
line 9 of Algorithm 2. We apply Application I to conclude (P, E ) and Inconsistent
Propagation to conclude {C , C } C.
2.b. C is a constant constraint. Let e be an embedding of PTN(C ) into T = P
and let (c = $v) ∈ EQ(C ) be an equality with t1 = c and t2 = e($v),
meeting the conditions of line 12 of Algorithm 2. We apply Application II to conclude {C , C } (P, E ∪ {(c = $v)}), Decomposition to conclude (P, E ), and
Inconsistent Propagation to conclude {C , C } C.
This case analysis completes the proof.
5.3 Inference rules for deriving functional constraints
For chases deciding S |= C with C a constant constraint, there is a direct correspondence
between the tableau T and patterns of constant constraints. In Section 5.2, we showed that
we can use this correspondence to derive inference rules simulating the initial equalization
step.
For chases deciding S |= C with C = (P, L → R) a functional constraint, such a direct
correspondence between the tableau T and patterns of functional constraints does not exist.
Line 3 of Algorithm 2 does however give an initial correspondence between the tableau
T and the maximally double pattern of P and L. A single equalization step can however
destroy any correspondence between the tableau T and any maximally double pattern that
is useful in derivations of non-trivial functional constraints, as shown by the next example.
Example 7 We apply Algorithm 2 to the set of constraints
S = {({($a, $b, $c)}, $b → $c), ({($a, $b, $c), ($a, $d, e)}, $a → $b)}
and the target functional constraint C = ({($a, $b, $c), ($a, $b, e)}, $a → $b). It is
straightforward to verify S |= C. We initially have the tableau
T = ($a, $b1 , $c1 ), ($a, $b1 , e), ($a, $b2 , $c2 ), ($a, $b2 , e) .
Implication and axiomatization of functional and constant constraints
We have ({($a, $b, $c)}, $b → $c) |≡ T, leading to the equalization φe←$c1 (T) which
results in the tableau
T = ($a, $b1 , e), ($a, $b2 , $c2 ), ($a, $b2 , e) .
Using the definition of a maximally double pattern, we can search for a pattern P , set
of variables L ⊆ VP , and injective functions f1 and f2 such that T = f1 (P ) ∪ f2 (P ),
f1 =L ∪idU f2 , and range(f1 |VP \L ) ∩ range(f2 |VP \L ) = ∅. It is easily verifiable that
the only way to achieve this is by putting P = T , f1 = f2 = idVP ∪U , and L =
{$a, $b1 , $b2 , $c2 }. Since L = VP , any functional constraint of the form (P , L → R ) will
have R ⊆ L , and, hence, holds trivially.
We can however always maintain a direct correspondence between tableau T and a
maximally double pattern that is useful in derivations and we can do so in at most two
equalization steps, as shown next. Theorem 4 is visualized in Fig. 2.
Theorem 4 Let S be a set of constant-functional constraints, let C = (P, L → R) be a
functional constraint, let D = f1 (P) ∪ f2 (P) be the maximally double pattern of P and L,
and let S |= C.
If T = D and an equalization step with C ∈ S is possible, then also a sequence of
at most two equalization steps with C is possible that results in a tableau T = D =
f1 (P ) ∪ f2 (P ), a maximally double pattern of P and L with L ⊆ VP , such that there
is a mapping m with f1 (P ) = m(f1 (P)) and f2 (P ) = m(f2 (P)).
Proof First, we consider all the cases in which C = (P , L → R ) is a functional constraint. Let $r ∈ R be a variable and let e1 and e2 be a pair of embeddings of P into T = D
with e1 =L e2 and e1 ($r ) = e2 ($r ), meeting the conditions of line 9 of Algorithm 2.
Since an equalization step is performed, we have e1 ($r ) ∈ U or e2 ($r ) ∈ U .
We can swap the roles of both f1 , f2 , and e1 , e2 . Hence, without loss of generality, we
can assume that e1 ($r ) = f1 (t) with t ∈ TP , and, for $v ∈ VP , e2 ($r ) = f1 ($v) or
e2 ($r ) = f2 ($v). First, we consider the cases in which e2 ($r ) = f1 ($v).
1.
t ∈ L ∪ U and $v ∈ L. Since $v ∈ L, we have f1 ($v) = f2 ($v). Let φf1 (t)←f1 ($v) be
the equalization performed by the initial equalization step using C and embeddings e1
and e2 , resulting in the tableau φf1 (t)←f1 ($v) (f1 (P) ∪ f2 (P)), which is equivalent to the
tableau
T = (φf1 (t)←f1 ($v) ◦ f1 (P)) ∪ (φf1 (t)←f1 ($v) ◦ f2 (P)).
By construction, φf1 (t)←f1 ($v) ◦ f1 and φf1 (t)←f1 ($v) ◦ f2 are embeddings of P into
T that agree on L = φt←$v (L) ∩ V = L \ {$v} and, for all t ∈ TP \ {$v}, are equal
to f1 and f2 . Hence, we can put P = φt←$v (P), f1 = f1 |TP , f2 = f2 |TP , and
m = φf1 (t)←f1 ($v) .
2. t ∈ VP ∪ U and $v ∈ L. Since P, f1 (P), and f2 (P) are isomorphic, and f1 and f2
are bijections mapping constants to themselves and variables to variables, the functions
ε1 = f2 ←f1 ◦e1 and ε2 = f2 ←f1 ◦e2 are well-defined and are embeddings of P into
f2 (P) that agree on L . Hence, by construction, f2 (t) = ε1 ($r ) = ε2 ($r ) = f2 ($v).
Let φf1 (t)←f1 ($v) be the equalization performed by the initial equalization step using C
and embeddings e1 and e2 , resulting in the tableau φf1 (t)←f1 ($v) (f1 (P) ∪ f2 (P)). Since
f1 ($v) ∈ range(f2 ), this tableau is equivalent to the tableau
T = (φf1 (t)←f1 ($v) ◦ f1 (P)) ∪ f2 (P).
J. Hellings et al.
Fig. 2 Visualization of a symmetry-preserving step performed on the maximally double pattern D = f1 (P)∪
f2 (P) of P and L with L ⊆ VP , resulting in the maximally double pattern D = f1 (P ) ∪ f2 (P ) of P and
L with L ⊆ VP . Since the mapping m is symmetry-preserving, the symmetry between f1 (P) and f2 (P) is
maintained: structural changes are applied symmetrically to both halves of D. Hence, m maps each half of D
to the corresponding half of D
Hence, f2 (P) is unaffected by the initial equalization step and thus a second
equalization step is possible using functional constraint C and embeddings ε1 and ε2 .
Performing this second equalization φf2 (t)←f2 ($v) on T results in the tableau
φf2 (t)←f2 ($v) (φf1 (t)←f1 ($v) (f1 (P) ∪ f2 (P))). Since f2 ($v) ∈ range(f1 ), this tableau is
equivalent to the tableau
T = (φf1 (t)←f1 ($v),f2 (t)←f2 ($v) ◦ f1 (P)) ∪ (φf1 (t)←f1 ($v),f2 (t)←f2 ($v) ◦ f2 (P)).
By construction, φf1 (t)←f1 ($v),f2 (t)←f2 ($v) ◦ f1 and φf1 (t)←f1 ($v),f2 (t)←f2 ($v) ◦ f2
are embeddings of P into T that agree on L = φt←$v (L) ∩ V = L and, for all
t ∈ TP \ {$v}, are equal to f1 and f2 . Hence, we can put P = φt←$v (P), f1 = f1 |TP ,
f2 = f2 |TP , and m = φf1 (t)←f1 ($v),f2 (t)←f2 ($v) .
Next we consider the cases in which e2 ($r ) = f2 ($v).
3. e1 ($r ) = f1 ($v). From e1 ($r ) = f1 ($v), e2 ($r ) = f2 ($v), and e1 ($r ) =
e2 ($r ), we conclude that $v ∈ L. Let φf1 ($v)←f2 ($v) be the equalization performed by the initial equalization step using C and embeddings e1 and e2 ,
resulting in the tableau φf1 ($v)←f2 ($v) (f1 (P) ∪ f2 (P)), which is equivalent to the
tableau
T = (φf1 ($v)←f2 ($v) ◦ f1 (P)) ∪ (φf1 ($v)←f2 ($v) ◦ f2 (P)).
By construction, φf1 ($v)←f2 ($v) ◦ f1 and φf1 ($v)←f2 ($v) ◦ f2 are embeddings of
P into T that agree on L = L∪{$v} and, for all t ∈ TP \{$v}, are equal to f1 and
f2 . Hence, we can put P = P, f1 = φf1 ($v)←f2 ($v) ◦f1 , f2 = φf1 ($v)←f2 ($v) ◦f2 ,
and m = φf1 ($v)←f2 ($v) .
4. e1 ($r ) = f1 (t) = f1 ($v). Since P, f1 (P), and f2 (P) are isomorphic and f1 and
f2 are bijections mapping constants to themselves and variables to variables, the
functions ε1 = f1 ←f2 ◦ e1 and ε2 = f1 ←f2 ◦ e2 are well-defined and are
embeddings of P into f1 (P) with ε1 =L ε2 . Since e1 ($r ) = f1 (t) and e2 ($r ) =
f2 ($v) with t = $v, we have e1 ($r ) = f1 (t) = ε1 ($r ) = ε2 ($r ) = f1 ($v).
Hence, if the equalization step with C , e1 , and e2 is initially possible, then also
an equalization step with C , ε1 , and ε2 is possible. We choose to perform the
equalization step with C , ε1 , and ε2 instead. Hence, we reduce this case to the
cases 1 and 2.
Finally, we consider all the cases in which C = (P , E ) is a constant constraint. Let (c =
$v ) ∈ E be an equality and let e be an embedding of P into T = P with c = e($v ),
meeting the conditions of line 12 of Algorithm 2. Since an equalization step is performed,
Implication and axiomatization of functional and constant constraints
we have e($v ) ∈ U . Since we can swap the roles of f1 and f2 , we can assume, without loss
of generality, that e($v ) = f1 ($v) with $v ∈ VP . This yields us two cases, namely $v ∈ L
or $v ∈ L.
5. $v ∈ L. Since $v ∈ L, we have f1 ($v) = f2 ($v). Let φc←f1 ($v) be the equalization performed by the initial equalization step using C and embedding e, resulting
in the tableau φc←f1 ($v) (f1 (P) ∪ f2 (P)), which is equivalent to the tableau
T = (φc←f1 ($v) ◦ f1 (P)) ∪ (φc←f1 ($v) ◦ f2 (P)).
By construction, the functions φc←f1 ($v) ◦ f1 and φc←f1 ($v) ◦ f2 are embeddings of P into T that agree on L = φc←$v (L) ∩ V = L \ {$v} and, for all
t ∈ TP \ {$v}, are equal to f1 and f2 . Hence, we can put P = φc←$v (P),
f1 = f1 |TP , f2 = f2 |TP , and m = φc←f1 ($v) .
6. $v ∈ L. Since P, f1 (P), and f2 (P) are isomorphic, and f1 and f2 are bijections mapping constants to themselves and variables to variables, the function
ε = f2 ←f1 ◦ e is well-defined and is an embedding of P into f2 (P) with
ε($v ) = c. Hence, by construction, ε($v ) = f2 ($v). Let φc←f1 ($v) be the
equalization performed by the initial equalization step using C and embedding e,
resulting in the tableau φc←f1 ($v) (f1 (P) ∪ f2 (P)). Since f1 ($v) ∈ range(f2 ), this
tableau is equivalent to the tableau
T = (φc←f1 ($v) ◦ f1 (P)) ∪ f2 (P).
Hence f2 (P) is unaffected by the initial equalization step and thus a second
equalization step is possible using constant constraint C and embedding ε.
Performing this second equalization φc←f2 ($v) on T results in the tableau
φc←f2 ($v) (φc←f1 ($v) (f1 (P) ∪ f2 (P))). Since f2 ($v) ∈ range(f1 ), this tableau is
equivalent to the tableau
T = (φc←f1 ($v),c←f2 ($v) ◦ f1 (P)) ∪ (φc←f1 ($v),c←f2 ($v) ◦ f2 (P)).
By construction, φc←f1 ($v),c←f2 ($v) ◦ f1 and φc←f1 ($v),c←f2 ($v) ◦ f2 are
embeddings of P into T that agree on L = φc←$v (L) ∩ V = L and, for all
t ∈ TP \ {$v}, are equal to f1 and f2 . Hence, we can put P = φc←$v (P),
f1 = f1 |TP , f2 = f2 |TP , and m = φc←f1 ($v),c←f2 ($v) .
This case analysis completes the proof.
We refer to the sequence of at most two equalization steps performed in Cases 1–3, 5
and 6 of Theorem 4 as a symmetry-preserving step of type 1–3, 5 and 6, respectively. The
name “symmetry-preserving” reflects that this sequence of equalization steps maintains the
symmetry between the parts in the tableau originating from f1 (P) and f2 (P), as illustrated
in Fig. 2. Notice that we do not consider equalization steps performed by Case 4 of Theorem
4. In this case, we have shown that we can always perform other equalization steps that can
be represented by a symmetry-preserving step of type 1 or 2.
J. Hellings et al.
Observe that the initial tableau in a chase deciding S |= C, with C a functional constraint, is a maximally double pattern of PTN(C) and LHS(C). Hence, if equalization steps
are possible, then we can apply Theorem 4 inductively to yield a sequence of symmetrypreserving steps that ends when no further equalization steps are possible. We refer to such
chases performing only symmetry-preserving steps as symmetry-preserving chases.
Corollary 2 Let S be a set of constant-functional constraints, let C = (P, L → R) be a
functional constraint. If S |= C, then there is a symmetry-preserving chase that decides
S |= C.
For simulating the initial symmetry-preserving step in a chase deciding S |= C with C a
functional constraint, we use the following three inference rules.
Rule 13 (Application III) Let P be a pattern, let $v ∈ VP be a variable, and let t ∈ VP ∪ U
be a term. If (φt←$v (P), φt←$v (L) ∩ V → φt←$v (R) ∩ V ), if (P , L → R ), and if there
is a pair of embeddings of P into P that agree on L and map $r ∈ R to t and $v,
respectively, then (P, L → R).
Proof (soundness) Let R be a relation with R |≡ (φt←$v (P), φt←$v (L)∩ V → φt←$v (R)∩
V ) and R |≡ (P , L → R ), and assume there are embeddings of P into R. Let e1 and e2 be
any pair of embeddings of P into R with e1 =L e2 , and let e1 and e2 be embeddings of P
into P that agree on L and map $r ∈ R to t and $v, respectively. Now, g1 = e1 ◦ e1 and
g2 = e1 ◦ e2 are embeddings of P into R with g1 =L g2 . By R |≡ (P , L → R ), we have
g1 ($r ) = g2 ($r ), and, by construction, we have g1 ($r ) = e1 (t) and g2 ($r ) = e1 ($v),
and, hence, e1 ($v) = e1 (t). In a similar way, we obtain e2 ($v) = e2 (t).
As a consequence, ε1 = e1 |domain(e1 )\{$v} and ε2 = e2 |domain(e2 )\{$v} are embeddings of
φt←$v (P) into R. By construction, we have ε1 =φt←$v (L) ε2 and, hence, ε1 =φt←$v (L)∩V ε2 .
By R |≡ (φt←$v (P), φt←$v (L) ∩ V → φt←$v (R) ∩ V ), we conclude ε1 =φt←$v (R)∩V ε2 . If
$v ∈ R, then R = φt←$v (R) ∩ V and thus e1 =R e2 . In the other case, when $v ∈ R, we
have e1 ($v) = ε1 (t) and e2 ($v) = ε2 (t). As t is either a constant or t ∈ φt←$v (R) ∩ V , we
have ε1 (t) = ε2 (t). Hence, in both cases we have e1 =R e2 , and we conclude R |≡ (P, L →
R).
Rule 14 (Application IV) Let P be a pattern, let $v ∈ VP be a variable, and let c ∈ U be a
constant. If (φc←$v (P), φc←$v (L)∩ V → φc←$v (R)∩ V ), if (P , E ) with (c = $v ) ∈ E ,
and if there is an embedding of P into P mapping $v to $v, then (P, L → R).
Proof (soundness) Let R be a relation with R |≡ (φc←$v (P), φc←$v (L)∩V → φc←$v (R)∩
V ) and R |≡ (P , E ) and assume there are embeddings of P into R. Let e1 and e2 be any pair
of embeddings of P into R with e1 =L e2 , and let e be an embedding of P into P mapping
$v to $v. Now, e1 = e1 ◦ e and e2 = e2 ◦ e are embeddings of P into R. By R |≡ (P , E ),
we have c = e1 ($v ) = e2 ($v ) and, by construction, we have c = e ($v ) = e1 ($v) and
c = e2 ($v ) = e2 ($v), and thus c = e1 ($v) = e2 ($v).
As a consequence, ε1 = e1 |domain(e1 )\{$v} and ε2 = e2 |domain(e2 )\{$v} are embeddings of
φc←$v (P) into R. By construction, we have ε1 =φc←$v (L) ε2 and, hence, ε1 =φc←$v (L)∩V ε2 .
By R |≡ (φc←$v (P), φc←$v (L) ∩ V → φc←$v (R) ∩ V ), we conclude ε1 =φc←$v (R)∩V ε2 .
If $v ∈ R, then R = φc←$v (R) ∩ V and thus e1 =R e2 . In the other case, when $v ∈ R,
Implication and axiomatization of functional and constant constraints
we already have e1 ($v) = e2 ($v) = c. Hence, in both cases we have e1 =R e2 , and we
conclude R |≡ (P, L → R).
The following example exhibits situations in which the Application III and IV rules can
be used.
Example 8 Using Application III, we can derive
{({($a, $b, $c)}, $a → $b), ({($a, b, $c), ($a, b, d)}, $a → $c)} ({($a, b, $c), ($a, $b, d)}, $a → $c)
by using the embeddings
e1 = {$a → $a, $b → b, $c → $c}
e2 = {$a → $a, $b → $b, $c → d}.
Using Application IV, we can derive
{({($a, $b, $c)}, (c = $c)), ({($a, $b, c)}, $a → $b)} ({($a, $b, $c)}, $a → $b)
by using the embedding e = {$a → $a, $b → $b, $c → $c}.
If (P, (c = $v)) is a constant constraint, then, using Application IV, we can also
derive (P, ∅ → $v) by using the identity on VP ∪ U as the embedding e of P onto
itself.
Rule 15 (Functional Consequence) Let P be a pattern and let L ⊆ VP be a set of variables. If (P , L → R ) and P has a pair of embeddings into the maximally double pattern
f1 (P) ∪ f2 (P) of P and L, agreeing on L and mapping $r ∈ R to f1 ($v) and f2 ($v),
respectively, then (P, L → $v).
Proof (soundness) Let R be a relation with R |≡ (P , L → R ), and assume there are
embeddings of P into R. Let e1 and e2 be any pair of embeddings of P into R with e1 =L e2 .
Let f1 (P) ∪ f2 (P) be the maximally double pattern of P and L. Let e1 and e2 be embeddings
of P into f1 (P) ∪ f2 (P) with e1 =L e2 , e1 ($r ) = f1 ($v), and e2 ($r ) = f2 ($v). As
f1 and f2 are bijections, the functions ε1 = e1 ←f1 ◦ e2 ←f2 ◦ e1 and ε2 = e1 ←f1 ◦
e2 ←f2 ◦ e2 are well-defined and are embeddings of P into R. By construction, we have
ε1 =L ε2 , ε1 ($r ) = e1 ($v), and ε2 ($r ) = e2 ($v). Hence, by R |≡ (P , L → R ),
e1 ($v) = ε1 ($r ) = ε2 ($r ) = e2 ($v).
The following example exhibits situations in which the Functional Consequence rule can
be used.
Example 9 We can derive
({($a, $b, $c), ($x, $y, $z)}, $b → $z) ({($a, $b, $c), ($x, $y, $z)}, ∅ → $z)
using Functional Consequence. We have the maximally double pattern
{($a1 , $b1 , $c1 ), ($x1 , $y1 , $z1 ), ($a2 , $b2 , $c2 ), ($x2 , $y2 , $z2 )}
of {($a, $b, $c), ($x, $y, $z)} and ∅. We use the embeddings
e1 = {$a → $a1 , $b → $b1 , $c → $c1 , $x → $x1 , $y → $y1 , $z → $z1 }
e2 = {$a → $a1 , $b → $b1 , $c → $c1 , $x → $x2 , $y → $y2 , $z → $z2 }.
J. Hellings et al.
We thus have e1 =$b e2 , e1 ($z) = $z1 and e2 ($z) = $z2 . Hence, using Functional
Consequence, we conclude ({($a, $b, $c), ($x, $y, $z)}, ∅ → $z).
Notice that there is a relationship between the Functional Consequence rule and the
well-known join dependencies [2] and multivalued dependencies [6]. In this example, the
possible embeddings of the pattern {($a, $b, $c), ($x, $y, $z)} can be represented by a relational table T with schema R(A, B, C, X, Y, Z). Due to the construction of T , the join
dependency ABC XY Z holds on T . This join dependency is equivalent to the multivalued dependency ∅ ABC. Due to ({($a, $b, $c), ($x, $y, $z)}, $b → $z), the functional
dependency B → Z also holds on T , and, hence, ABC → Z holds. Using the well-known
derivation rules for functional dependencies and multivalued dependencies, we conclude
∅ → Z.
Before we show how the initial equalization step in any chase deciding S |= C with C
a functional constraint can be simulated by the introduced inference rules, we introduce an
additional inference rule to simplify our proofs:
Rule 16 (Embedding II) If (P , L → R ) and h is an embedding from P into P, then
(P, h(L ) ∩ V → h(R ) ∩ V ).
Proof (soundness) Let R be a relation with R |≡ (P , L → R ), and assume there are
embeddings of P into R. Let e1 and e2 be any pair of embeddings of P into R with
e1 =h(L )∩V e2 . Then, ε1 = e1 ◦ h and ε2 = e2 ◦ h are embeddings of P into R. As embeddings always agree on constants and e1 =h(L )∩V e2 , we have e1 =h(L ) e2 and, hence,
also ε1 =L ε2 . By R |≡ (P , L → R ), we have ε1 =R ε2 . As a consequence, we have
e1 =h(R ) e2 and, hence, also e1 =h(R )∩V e2 .
Embedding II explicitly maps a functional constraint defined on a pattern to a different
pattern. The following example illustrates this.
Example 10 If ({($a, $b)}, $a → $b) holds, then trivially also ({($c, $d)}, $c → $d)
holds. We can derive ({($c, $d)}, $c → $d) from ({($a, $b)}, $a → $b) by using
Embedding II with the embedding h = {$a → $c, $b → $d}.
Next, we show how the initial symmetry-preserving step in any symmetry-preserving
chase deciding S |= C with C a functional constraint can be simulated by the introduced
inference rules.
Theorem 5 Let S be a set of constant-functional constraints, let C = (P, L → R) be a
functional constraint, and let S |= C.
If Algorithm 2 is initially able to perform a symmetry-preserving step with C ∈ S resulting in tableau T = f1 (P ) ∪ f2 (P ), which is the maximally double pattern of P and L
with L ⊆ VP , then there exists a functional constraint C = (P , L → R ) such that we
have the following:
1.
2.
C |= C ,
{C , C } C using Rules 2, 3, 13, 14, and 15.
Proof Initially, we have T = f1 (P) ∪ f2 (P), a maximally double pattern of P and L.
By Theorem 4, the initial symmetry-preserving step with constraint C results in tableau
Implication and axiomatization of functional and constant constraints
T = f1 (P ) ∪ f2 (P ), a maximally double pattern of P and L , such that f1 (P ) =
m(f1 (P)) and f2 (P ) = m(f2 (P)) for some mapping m. We make a case distinction on the
type of symmetry-preserving step initially performed.
The initial symmetry-preserving step is of type 1 or 2. Let C = (P , L → R ), let
$r ∈ R , and let e1 and e2 be the embeddings of P into T with e1 ($r ) = f1 (t) and
e2 ($r ) = f1 ($v), as in the proof of Theorem 4.
We choose C = (φt←$v (P), φt←$v (L) ∩ V → φt←$v (R) ∩ V ). First, we have C |=
C , since we have C C by a straightforward application of Embedding II with the
embedding φt←$v . The functions ε1 = f1 −1 ◦f1 ←f2 ◦e1 and ε2 = f1 −1 ◦f1 ←f2 ◦e2
are well-defined and are embeddings of P into P with, by construction, ε1 =L ε2 ,
ε1 ($r ) = t, and ε2 ($r ) = $v. Hence, we can apply Application III using embeddings
ε1 and ε2 , and variable $r to derive {C , C } C.
2. The initial symmetry-preserving step is of type 3. Let C = (P , L → R ), let $r ∈ R ,
and let e1 and e2 be the embeddings of P into T with e1 ($r ) = f1 ($v) and e2 ($r ) =
f2 ($v), as in the proof of Theorem 4.
We choose C = (P, L ∪ {$v} → R). First, we have C |= C , since we have
C C by a straightforward application of Augmentation, Decomposition, and Union.
A straightforward application of Functional Consequence with e1 , e2 , and $v yields
C (P, L → $v). Using Augmentation, we obtain C (P, L → L ∪ {$v}), and,
hence, using Transitivity, {C , C } C.
3. The initial symmetry-preserving step is of type 5 or 6. Let C = (P , E ), let (c = $v ) ∈
E , and let e be the embedding of P into T with e($v ) = f1 ($v), as in the proof of
Theorem 4.
We choose C = (φc←$v (P), φc←$v (L) ∩ V → φc←$v (R) ∩ V ). First, we have
C |= C , since we have C C by a straightforward application of Embedding II with
the embedding φc←$v . The function ε = f1 −1 ◦ f1 ←f2 ◦ e is well-defined, and is
an embedding of P into P with, by construction, ε($v ) = $v. Hence, we can apply
Application IV using embedding ε and variable $v to derive {C , C } C.
1.
This case analysis completes the proof.
5.4 Inference rules for constant-functional constraints
The results from Section 5.1–5.3 only concerned the simulation of direct termination
in chases for constant-functional constraints, the initial equalization step in chases for
constant constraints, and the initial symmetry-preserving step in chases for functional
constraints, respectively. These results are now used as the basis for an induction argument to extend the simulation to the entire chase, as such showing how to translate a
chase (constant constraints) or a symmetry-preserving chase (functional constraints) to a
derivation.
Proposition 6 The set of inference rules Reflexivity, Empty, Augmentation, Transitivity,
Decomposition, Inconsistency I and II, Application I-IV, Inconsistent Propagation, and
Functional Consequence is complete for the class of constant-functional constraints.
Proof Suppose that S is a set of constant-functional constraints and C is a single constantfunctional constraint with S |= C.
If initially direct termination is possible, then, by Proposition 3–5, Reflexivity, Empty,
Inconsistency I, or Inconsistency II can be used to derive C from S .
J. Hellings et al.
For chases performing equalization steps we make a case distinction on the type of the
constraint C.
1.
C is a constant constraint. Let j > 0 be the number of equalization steps in a shortest
chase deciding S |= C. Assume, as induction hypothesis, that, if S |= C, with C
a constant constraint, and if this can be decided by a chase performing less than j
equalization steps, then S C.
Choose a chase deciding S |= C in exactly j equalization steps. By Theorem 3,
there is a constant constraint C with S |= C and {C , C } C using Decomposition,
Application I, Application II, and Inconsistent Propagation such that the result of the
initial equalization step is a tableau T , which is equivalent to the initial tableau in any
chase deciding S |= C . Hence, the remaining chase of j − 1 equalization steps is a
chase deciding S |= C , and, by the induction hypothesis, S C . Since S C and
{C , C } C, we conclude S C.
2. C is a functional constraint. By Corollary 2, we only have to consider symmetrypreserving chases. Let j > 0 be the number of symmetry-preserving steps in a
shortest symmetry-preserving chase deciding S |= C. Assume, as induction hypothesis, that, if S |= C, with C a functional constraint, and if this can be decided by
a symmetry-preserving chase performing less than j symmetry-preserving steps, then
S C.
Choose a symmetry-preserving chase deciding S |= C in exactly j symmetrypreserving steps. By Theorem 5, there is a functional constraint C with C |= C
and {C , C } C using Augmentation, Transitivity, Application III, Application IV,
and Functional Consequence such that the result of the initial symmetry-preserving
step is a tableau T , which is equivalent to the initial tableau in any chase deciding
S |= C . Hence, the remaining chase of j −1 symmetry-preserving steps is a symmetrypreserving chase deciding S |= C , and, by the induction hypothesis, S C . Since
S C and {C , C } C, we conclude S C.
This case analysis completes the proof.
Thus, the provided set of inference rules is sound and complete. Finally, we prove that
the provided set of inference rules is 2-ary and that each inference rule is indeed decidable.
As a consequence, the provided set of inference rules is an axiomatization for the constantfunctional constraints.
Theorem 6 The set of inference rules Reflexivity, Empty, Augmentation, Transitivity,
Decomposition, Inconsistency I and II, Application I-IV, Inconsistent Propagation, and
Functional Consequence is an axiomatization for the class of constant-functional constraints.
Proof Each time we introduced an inference rule, we proved its soundness. Proposition 6
proves that the set of inference rules is complete for deriving constant-functional constraints.
Hence, it only remains to prove that the inference rules are at most k-ary, for some k ≥ 0,
and that applicability of each inference rule is decidable. The inference rules Reflexivity and
Empty are 0-ary. The inference rules Augmentation, Decomposition, Inconsistent Propagation and Functional Consequence are 1-ary. The inference rules Transitivity, Inconsistency
I and II, and Application I-IV are 2-ary.
Applicability of an inference rule can be decided by checking if patterns are equal, which
is straightforward, or by checking if embeddings from pattern P into Q exist that satisfy
Implication and axiomatization of functional and constant constraints
certain properties. Functionally, each embedding from a pattern P into Q can be specified by
the mapping from VP to TQ . Since VP and TQ are finite, we can easily enumerate all possible
embeddings and check whether embeddings that satisfy the conditions of an inference rule
exist. Hence, checking if an inference rule can be applied is decidable.
We observe that the Embedding I and II rules are not part of the axiomatization, as they
are not used in the simulation of a chase. Their soundness was only used to simplify certain
proofs. Likewise, the Union rule is not part of the axiomatization.
By carefully restricting the simulation of chases by inference rules, as presented in
Theorem 6, to either chases involving only constant constraints or chases involving only
functional constraints, respectively, we can also provide axiomatizations for only the
functional constraints and only the constant constraints.
Theorem 7 The set of inference rules Reflexivity, Augmentation, Transitivity, Inconsistency I 2 , Application III, and Functional Consequence is an axiomatization for the class of
functional constraints.
Theorem 8 The set of inference rules Empty, Decomposition, Inconsistency II 3 ,
Application II and Inconsistent Propagation is an axiomatization for the class of constant
constraints.
Next, we investigate the complexity of the provided axiomatization for the constantfunctional constraints.
Theorem 9 Let S be a set of constant-functional constraints and let C be a single constantfunctional constraint. If S C, then there is a derivation that performs O(|VPTN(C) |)
inference steps.
Proof Consider the chase deciding S |= C. At each step of the chase, the number of
distinct variables in the tableau T decreases by one. If C is a constant constraint, then we
initially have |VT | = |VP |. Hence, the chase performs at most |VP | equalization steps and
Proposition 6 provides a way to simulate each equalization step by a bounded number of
inference steps. If C is a functional constraint, then we initially have |VT | ≤ 2|VP |. Hence,
the chase performs at most 2|VP | symmetry-preserving steps and Proposition 6 provides a
way to simulate each symmetry-preserving step by a bounded number of inference steps.
6 Related work
Many types of constraints have been investigated for the relational model, and among the
simplest of these constraints are the functional dependencies [13]. Functional dependencies
play an important role in the well-known Boyce-Codd normal form [14] and in relational
2 The Inconsistency I inference rule can be used to derive both functional constraints and constant constraints.
In this setting we only use Inconsistency I to derive functional constraints.
3 The Inconsistency II inference rule can be used to derive both constant constraints and functional constraints.
In this setting we only use Inconsistency II to derive constant constraints.
J. Hellings et al.
schema normalization in general. Besides the functional dependencies, many other types of
constraints have been investigated, and for the relational model most of these constraints
can be categorized as a subclass of the equality-generating and/or tuple-generating dependencies [1, 16, 17, 26, 29]. The constraints studied in this work are all equality-generating.
In the following, we shall describe how the constant-functional constraints are related with
other classes of equality-generating dependencies.
For describing these relationships, we introduce some terminology to define restrictions on the classes of functional, constant, and equality-generating constraints. We say
that a constraint C over pattern P is typed if, for every pair of tuples (t1 , . . . , tn ) ∈ P and
(u1 , . . . un ) ∈ P, and for every pair of variables ti ∈ VP , uj ∈ VP , we have ti = uj only
if i = j . We say that a functional constraint (P, L → R) is constant-free if UP = ∅. We
say that an equality-generating constraint (P, E) is constant-free if UP = ∅ and, for every
(t = u) ∈ E, t, u ∈ VP . We say that an equality-generating constraint (P, E) is many sorted
if it is typed and, for every (t = u) ∈ E, there exist tuples (t1 , . . . , tn ) ∈ P, (u1 , . . . , un ) ∈ P
and there exists a j , 1 ≤ j ≤ n, such that t = tj and u = uj . Lastly, we say that an
equality-generating or functional constraint C is an n-pattern constraint if |PTN(C)| ≤ n.
Using the terminology introduced above, we can describe the relationships between
various well-known classes of equality-generating constraints. A summary of these relationships is visualized in Fig. 3. The equality-generating constraints (EGC) and the
functional constraints (FC) of Akhtar et al. [3] were both originally introduced for the
RDF data model, and have been generalized to relations of arbitrary arity in this work.
The equality-generating constraints on relations of arbitrary arity are equivalent to the
full equality-generating dependencies of Wijsen [31]. The equality-generating dependencies (EGD) of Beeri and Vardi [7] are equivalent to the constant-free fragment of the
equality-generating constraints.
In the literature on dependencies for the relational model, one often considers a n-ary
relation to be a subset of D1 × · · · × Dn , where Di , 1 ≤ i ≤ n, is a domain of values
disjoint from all other domains. As such, many dependencies are many-sorted. The implication dependencies of Fagin [16] are an example, and the equality-generating fragment of
these implication dependencies are indeed equivalent to the many-sorted and constant-free
equality-generating constraints.
Also the well-known functional dependencies (FD) of Codd [13] are many-sorted. It is
straightforward to verify that the functional dependencies are also equivalent to the typed,
constant-free, 1-pattern functional constraints (1-pattern FD). Using this relationship, it is
also straightforward to verify that the functional dependencies are equivalent to the constantfree, many-sorted, 2-pattern equality-generating dependencies.
In the literature, several context-dependent generalizations of the functional dependencies have been studied. Among them are the conditional functional dependencies (CFD)
of Fan et al. [21]. These dependencies allow the use of context-dependent information in
the specification of integrity constraints, and, as such, allow for richer tools to maintain
integrity of databases. In a straightforward manner, the conditional functional dependencies in normal form [21] can be shown to be equivalent to the union of the typed, 1-pattern
subset of the constant constraints (CD) and the typed, 1-pattern functional constraints.
Hence, the constant-functional constraints are a proper untyped generalization of the
conditional functional dependencies towards arbitrary patterns in relations. The extended
conditional functional dependencies of Bravo et al. [9, 20] have been introduced as an
extension of the conditional functional dependencies in which one can also express that the
valuation of an attribute (in some context) is restricted to a set of allowed values.
Implication and axiomatization of functional and constant constraints
Fig. 3 Overview of the various classes of constraints. Directed edges from a class C1 to another class C2
express inclusion of C1 into C2 , meaning that every constraint in C1 is expressible by a set of constraints in
C2
Example 11 Consider again the relational schema PI(name, country, cc, phone) of
Example 3. The attribute cc gives the country code, and the set of allowed values can be
restricted to all existing country calling codes.
Such an allowed set of values is however not expressible by a set of equalities that should
all hold at the same time, and, hence, these types of constraints are not expressible by
constant-functional constraints.
Another example of constraints that allow the specification of context-dependent information are the qualified functional dependencies (QFD) of He et al. [24]. The qualified
functional dependencies introduce a convenient syntax to specify context-dependent functional dependencies on several relations with equivalent schemas in a straightforward
manner. It is straightforward to show that each qualified functional dependency holding
on each individual relation can be expressed by a set of typed, 1-pattern functional constraints on the relation, and vice versa. Hence, the qualified functional dependencies are
also equivalent to the functional part of the conditional functional dependencies.
For the RDF and XML graph data models, a large body of work on the integrity of data
focuses on the schema of the data. Examples are RDF Schema [18] and, for the XML data
model, DTDs [10] and XSDs [19]. The usage of dependency-like constraints is less common
for these data-models although initial steps have been made (e.g., [3, 4, 11, 12, 22, 23, 27,
28, 30, 32]).
We repeat that functional constraints were originally introduced for the RDF data
model [3, 15]. Besides this obvious relationship with the RDF data model, the concept
J. Hellings et al.
of applying functional dependencies or functional-like dependencies to a view on the data
plays a major role in other proposals for constraints for the RDF and XML data models. A
clear and recent example are the XML Constraints based on XML-to-relational mappings of
Niewerth et al. [28], who studied functional dependencies over tree patterns. Specifically,
the tree patterns using only edges are equivalent to patterns, as we consider them, over the
edge relation of XML documents.
7 Conclusions and future work
Starting from functional and equality-generating constraints for the RDF data model,
we studied constant-functional constraints on arbitrary relations. As our main result, we
prove the existence of a sound and complete axiomatization for the constant-functional
constraints. This axiomatization can easily be specialized to only deal with functional constraints; hence our results solve a major open problem in the work of Akhtar et al. [3] on
functional constraints for the RDF data model. The major result leading to the axiomatization of the constant-functional constraints is that chases for functional constraints can
always be normalized to symmetry-preserving chases.
We believe that our work provides a promising formal basis for reasoning about constantfunctional constraints. As for future work, several open problems remain.
Firstly, for several types of equality-generating constraints, it is known whether
Armstrong-like relations [5, 16] exist. An Armstrong-like relation R for a set of constraints S satisfies constraint C if and only if S |= C. For both the equality-generating
constraints and the constant-functional constraints, it is unknown under which conditions
Armstrong-like relations exist.
Secondly, the addition of constant constraints to the functional constraints also introduced the possibility to define a set of inconsistent constraints, such that no non-empty
relation can satisfy the constraints. An example is the set
{({($a, $b, $c)}, (c1 = $c)), ({($a, $b, $c)}, (c2 = $c))}.
Being able to decide consistency of a set of constraints is useful. Inconsistent sets of
constraints are not very useful in practice and might even hint at a mistake in the definition
of one or more constraints. The complexity to decide consistency of a set of constraints S
is unknown, however.
Thirdly, the exact complexity of the implication problem for constant-functional constraints is not yet known. However, the relationship with conditional functional dependencies already provides lower bounds for the complexity of the consistency problem and
the implication problem: for the conditional functional dependencies, these problems are
NP-complete and coNP-complete, respectively [21].
References
1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995)
2. Aho, A.V., Beeri, C., Ullman, J.D.: The theory of joins in relational databases. ACM Trans. Database
Syst. 4(3), 297–314 (1979)
3. Akhtar, W., Cortés-Calabuig, Á., Paredaens, J.: Constraints in RDF. In: Semantics in Data and
Knowledge Bases, Lecture Notes in Computer Science, vol. 6834, pp. 23–39. Springer (2011)
4. Arenas, M., Libkin, L.: A normal form for XML documents. ACM Trans. Database Syst. 29(1), 195–
232 (2004)
Implication and axiomatization of functional and constant constraints
5. Armstrong, W.W.: Dependency structures of data base relationships. In: Information Processing 74, pp.
580–583 (1974)
6. Beeri, C., Fagin, R., Howard, J.H.: A complete axiomatization for functional and multivalued dependencies in database relations. In:Proceedings of the 1977 ACM SIGMOD International Conference on
Management of Data, SIGMOD ’77, pp. 47–61 (1977)
7. Beeri, C., Vardi, M.: The implication problem for data dependencies. In: Automata, Languages and
Programming, Lecture Notes in Computer Science, vol. 115, pp. 73–85. Springer (1981)
8. Beeri, C., Vardi, M.Y.: A proof procedure for data dependencies. J. ACM 31(4), 718–741 (1984)
9. Bravo, L., Fan, W., Geerts, F., Ma, S.: Increasing the expressivity of conditional functional dependencies
without extra complexity. In: ICDE ’08 Proceedings of the 2008 IEEE 24th International Conference on
Data Engineering, pp. 516–525 (2008)
10. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F.: Extensible markup language
(XML) 1.0 (fifth edition), W3C recommendation 26 November 2008 (2008). http://www.w3.org/TR/
2008/REC-xml-20081126/
11. Buneman, P., Davidson, S., Fan, W., Hara, C., Tan, W.C.: Keys for XML. Comput. Netw. 39(5), 473–487
(2002)
12. Calbimonte, J.P., Porto, F., Keet, C.M.: Functional dependencies in OWL ABOX. In: XXIV Simpósio
Brasileiro de Banco de Dados, pp. 16–30 (2009)
13. Codd, E.F.: Relational completeness of data base sublanguages, vol. 987. IBM Research Laboratory, San
Jose, California (1972)
14. Codd, E.F.: Recent investigations in relational data base systems. In: Information Processing 74, pp.
1017–1021 (1974)
15. Cortés-Calabuig, A., Paredaens, J.: Semantics of constraints in RDFS. In: Proceedings of the 6th Alberto
Mendelzon International Workshop on Foundations of Data Management, pp. 75–90 (2012)
16. Fagin, R.: Horn clauses and database dependencies. J. ACM 29(4), 952–985 (1982)
17. Fagin, R., Vardi, M.Y.: The theory of data dependencies—a survey. In: Mathematics of Information Processing, Proceedings of Symposia in Applied Mathematics, vol. 34, pp. 19–71. American Mathematical
Society (1986)
18. Fallside, D.C., Walmsley, P.: RDF schema 1.1, W3C recommendation 25 February 2014 (2014). http://
www.w3.org/TR/2014/REC-rdf-schema-20140225/
19. Fallside, D.C., Walmsley, P.: XML schema part 0: Primer second edition, W3C recommendation 28
October 2004 (2004). http://www.w3.org/TR/2004/REC-xmlschema-0-20041028/
20. Fan, W., Geerts, F., Jia, X.: Conditional dependencies: A principled approach to improving data quality.
In: Dataspace: The Final Frontier, Lecture Notes in Computer Science, vol. 5588, pp. 8–20. Springer
(2009)
21. Fan, W., Geerts, F., Jia, X., Kementsietsidis, A.: Conditional functional dependencies for capturing data
inconsistencies. ACM Trans. Database Syst. 33(2), 6:1–6:48 (2008)
22. Hartmann, S., Link, S.: More functional dependencies for XML. In: Advances in Databases and
Information Systems, Lecture Notes in Computer Science, vol. 2798, pp. 355–369. Springer (2003)
23. Hartmann, S., Link, S.: Efficient reasoning about a robust XML key fragment. ACM Trans. Database
Syst. 34(2), 10:1–10:33 (2009)
24. He, Q., Ling, T.W.: Extending and inferring functional dependencies in schema transformation. In: Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management,
CIKM ’04, pp. 12–21. ACM (2004)
25. Hellings, J., Gyssens, M., Paredaens, J., Wu, Y.: Implication and axiomatization of functional constraints
on patterns with an application to the RDF data model. In: Foundations of Information and Knowledge
Systems, Lecture Notes in Computer Science, vol. 8367, pp. 250–269. Springer (2014)
26. Kanellakis, P.C.: Elements of relational database theory. Tech. Rep. CS-89-39, Brown University (1989)
27. Lausen, G., Meier, M., Schmidt, M.: SPARQLing constraints for RDF. In: Proceedings of the 11th International Conference on Extending Database Technology: Advances in Database Technology, EDBT ’08,
pp. 499–509 (2008)
28. Niewerth, M., Schwentick, T.: Reasoning about XML constraints based on XML-to-relational mappings.
In: Proceedings of the 17th International Conference on Database Theory, ICDT ’14, pp. 72–83 (2014)
29. Vardi, M.Y.: Fundamentals of dependency theory. In: Trends in Theoretical Computer Science, Principles
of Computer Science Series, vol. 34, pp. 171–224. Computer Science Press (1987)
30. Vincent, M.W., Liu, J., Mohania, M.: The implication problem for ‘closest node‘ functional dependencies
in complete XML documents. J. Comput. Syst. Sci. 78(4), 1045–1098 (2012)
31. Wijsen, J.: Database repairing using updates. ACM Trans. Database Syst. 30(3), 722–768 (2005)
32. Yu, Y., Heflin, J.: Extending functional dependency to detect abnormal data in RDF graphs. In: The
Semantic Web ISWC 2011, Lecture Notes in Computer Science, vol. 7031, pp. 794–809. Springer
(2011)