Ann Math Artif Intell DOI 10.1007/s10472-015-9473-7 Implication and axiomatization of functional and constant constraints Jelle Hellings1 · Marc Gyssens1 · Jan Paredaens2 · Yuqing Wu3 © Springer International Publishing Switzerland 2015 Abstract Akhtar et al. introduced equality-generating constraints and functional constraints as a first step towards dependency-like integrity constraints for RDF data [3]. Here, we focus on functional constraints. Since the usefulness of functional constraints is not limited to the RDF data model, we study the functional constraints in the more general setting of relations with arbitrary arity. We further introduce constant constraints and study the functional and constant constraints combined. Our main results are sound and complete This is a revised and extended version of the paper ‘Implication and Axiomatization of Functional Constraints on Patterns with an Application to the RDF Data Model’ presented at the 8th International Symposium on Foundations of Information and Knowledge Systems, Bordeaux, France (FOIKS 2014) [25]. Yuqing Wu carried out part of her work during a sabbatical visit to Hasselt University with a Senior Visiting Postdoctoral Fellowship of the Research Foundation Flanders (FWO). Jelle Hellings [email protected] Marc Gyssens [email protected] Jan Paredaens [email protected] Yuqing Wu [email protected] 1 Faculty of Sciences, Hasselt University and Transnational University of Limburg, Martelarenlaan 42, 3500 Hasselt, Belgium 2 Department of Mathematics and Computer Science, University of Antwerp, Campus Middelheim, Bldg. G, Middelheimlaan 1, 2020 Antwerp, Belgium 3 Computer Science Department, Pomona College, 185 E. 6th St., Claremont, CA 91711, USA J. Hellings et al. axiomatizations for the functional and constant constraints, both separately and combined. These axiomatizations are derived using the chase algorithm for equality-generating constraints. For derivations of constant constraints, we show how every chase step can be simulated by a bounded number of applications of inference rules. For derivations of functional constraints, we show that the chase algorithm can be normalized to a more specialized symmetry-preserving chase algorithm performing so-called symmetry-preserving steps. We then show how each symmetry-preserving step can be simulated by a bounded number of applications of inference rules. The axiomatization for functional constraints is in particular applicable to the RDF data model, solving a major open problem of Akhtar et al. Keywords Functional constraints · Constant constraints · Chase algorithm · Axiomatization Mathematics Subject Classfication (2010) 68P15 · 05C60 1 Introduction Usually, data is subject to integrity constraints implied by the semantics of the data. Formalizing these constraints can help us to reason over the data and identify inconsistencies. As such, formal constraints play a major role in database management systems that automatically maintain integrity of the data and optimize query evaluation. In this work, we study what we call constant-functional constraints. These constant-functional constraints are a generalization of the well-known functional dependencies of Codd [13], the conditional functional dependencies of Fan et al. [21], and the functional constraints of Akhtar et al. [3, 15, 25]. More concretely, the constant-functional constraints can be characterized as the union of the functional constraints applied to arbitrary relations, as presented by Hellings et al. [25], and the constant constraints, which we introduce here. Functional constraints have the form (P, L → R), where P specifies a pattern in the data and L and R are sets of variables occurring in this pattern. Their semantics is comparable to that of the functional dependencies: if two parts of the data match the pattern and are equal on L, then they must also be equal on R. Example 1 illustrates this for ternary RDF data. Example 1 Consider the family tree shown in Fig. 1. On these data, the constraint “a child only has one biological father and mother” holds. This constraint can be expressed by the pair of functional constraints ({($p, motherOf, $c)}, $c → $p) and ({($p, fatherOf, $c)}, $c → $p). The constraint “children have only one biological parent”, which can be expressed by ({($p, $t, $c)}, $c → $p), does not hold, however. Functional constraints allow the expression of not only the traditional functional dependencies, but also of context-dependent functional dependencies that only apply to a part of the data. In the context of Example 1, a context-dependent functional dependency could be the constraint child → parent restricted to motherOf triplets. Implication and axiomatization of functional and constant constraints Fig. 1 Simplified visualization of an RDF representation of a small family tree Because of the use of free variables and constants in patterns, patterns may also match specific structures in the relation. This is particularly useful if the underlying relation represents a graph. In this setting, functional constraints may impose structural constraints. Example 2 Let Edge(from, to) be a binary relation schema representing the edge relation of a graph. The functional constraint ({($n, $n)}, ∅ → $n) expresses that there is at most one node with a self-loop. The pattern {($n, $m), ($m, $n)} in the functional constraint ({($n, $m), ($m, $n)}, $n → $m) matches cycles (closed paths) of length 2 (including selfloops). Consider two pairs of such cycles starting at node v. By the constraint, the second node in both cycles must be equal, and thus the latter constraint expresses that every node v is part of at most one cycle of length 2. With functional constraints, one cannot specify that data entries matching a variable in some pattern should have a specified constant value. To extend the scope of the constraints under consideration, we introduce constant constraints and study functional and constant constraints combined and refer to them as constant-functional constraints. Constant constraints have the form (P, E), where P is a pattern and E is a finite set of constant equalities of the form (c = $v), with c a constant and $v a variable occurring in the pattern P. Example 3 Let PI(name, country, cc, phone) be the schema of a relation storing personal information, in which the attribute cc provides the country code that should be used to call the phone number phone. The constant constraint ({($n, BE, $c, $p)}, {(‘32’ = $c)}) specifies that people from Belgium have a Belgian phone number. For the functional dependencies in the relational data model, a sound and complete axiomatization is already long known [5], and, more recently, Akhtar et al. presented a sound and complete axiomatization for the equality-generating constraints in the RDF data model [3]. Since the constant-functional constraints are subsumed by equality-generating constraints, this axiomatization can also be used for the inference of constant-functional constraints only. In this case, intermediate inference steps can generate equality-generating constraints that are not necessarily equivalent to constant-functional constraints, unfortunately. For the functional constraints, Akhtar et al. identified the existence of a sound and complete axiomatization (not involving other types of constraints) as a major open problem. On the one hand, the Armstrong axiomatization for the functional dependencies [5] can be generalized to the setting of functional constraints. This generalization, however, lacks the reasoning power over patterns necessary for a complete axiomatization. On the other hand, there is no straightforward way to specialize the axiomatization of the equality-generating constraints to functional constraints only. J. Hellings et al. In this paper, we present a sound and complete axiomatization for the constant constraints, the functional constraints, and the constant-functional constraints over relations of arbitrary arity. The constraints used by all intermediate inference steps allowed by these axiomatizations are constant constraints, functional constraints, and constant-functional constraints, respectively. In particular, the case of ternary relations yields a sound and complete axiomatization for the functional constraints in the RDF data model, thereby positively solving the open problem of Akhtar et al. [3]. Our axiomatization is derived using the chase algorithm for equality-generating constraints [3], which is a variation of the standard chase algorithm [2, 8]. For derivations of constant constraints, we show how every chase step can be simulated by a bounded number of applications of inference rules. For derivations of functional constraints, such a direct simulation of chase steps is not straightforward. The key insight we use to circumvent this problem is that the chase algorithm, when applied to decide if a functional constraint holds, can be normalized to a more specialized, symmetry-preserving, chase algorithm. The main idea behind the symmetry-preserving chase algorithm is that, due to their semantics, chases for functional constraints always start with tableaux that are symmetric. We prove that during such chases one can always maintain this symmetry in the tableau, by using socalled symmetry-preserving steps. We then show how each symmetry-preserving step can be simulated by a bounded number of applications of inference rules. The axiomatization for the constant-functional constraints follows from these simulations. Axiomatizations for only the constant constraints and only the functional constraints can be derived in a similar manner. This is a revised and extended version of Hellings et al. [25]. Compared to Hellings et al., we present a slightly simplified sound and complete axiomatization of the functional constraints. Additionally, we present the constant constraints and provide sound and complete axiomatizations of constant constraints and of constant-functional constraints. Organization. In Section 2, we present the necessary definitions used throughout this paper. In Section 3, we discuss the constant-functional constraints and equality-generating constraints. In Section 4, we present the chase algorithm for equality-generating constraints and specialize it to constant-functional constraints. In Section 5, we propose an axiomatization for the constant-functional constraints. In Section 6, we look at the relationships between the constant-functional constraints and previously introduced classes of dependencies. In Section 7, we summarize our findings and discuss directions for future work. 2 Preliminaries Functional and equality-generating constraints [3] have originally been introduced in the context of the RDF data model. In this model, RDF data are usually represented by a single ternary relation. In the Introduction, we have already argued that functional and equalitygenerating constraints are useful in a wider range of data models. We therefore generalize functional and equality-generating constraints to relations of arbitrary arity. The following notations and definitions will be used throughout the paper. We consider disjoint infinitely enumerable sets U and V of constants and variables, respectively. For distinction, we usually prefix variables by “$”. A term is either a constant or a variable. Hence, the set T of all terms equals U ∪ V . A tuple of arity n is a Implication and axiomatization of functional and constant constraints sequence (t1 , . . . , tn ) of terms. A pattern of arity n is a finite set of tuples of arity n. If P is a pattern, then UP , VP , and TP denote the set of all constants, variables, and terms in P, respectively. A pattern P with VP = ∅ is usually referred to as a relation. We define the domain, range, and inverse of a function f in the usual way and denote these by domain(f ), range(f ), and f −1 , respectively. Two functions f and g agree on a set S, denoted by f =S g, if f (x) = g(x) for all x ∈ S. The restriction of a function f to a set S is defined as f |S = {(x, y) | x ∈ S, y = f (x)}. The identity on a set S is defined as idS = {(s, s) | s ∈ S}. The term-based renaming function φa1 ←b1 ,...,ai ←bi , a1 , b1 , . . . , ai , bi ∈ T , is the function on T for which φa1 ←b1 ,...,ai ←bi (bj ) = aj , j = 1, . . . , i, and which is the identity elsewhere. Likewise, the function-based renaming function f ←g , with f a function and g an injective function with domain(f ) = domain(g), is the function on T for which f ←g (g(t)) = f (t), t ∈ domain(f ), and which is the identity elsewhere. Notice that this function is well-defined due to the injectivity of g. A function f on terms is extended to tuples, patterns, and sets in the following natural way: for a tuple (t1 , . . . , tn ), f ((t1 , . . . , tn )) = (f (t1 ), . . . , f (tn )), and, for a set S, f (S) = {f (s) | s ∈ S}. For two patterns P and Q, a function e : VP ∪ U → T is an embedding of P into Q if e|U = idU and e(P) ⊆ Q. For constraints for which this is relevant, we denote “pattern P satisfies constraint C” by P |≡ C. Likewise, for a set of constraints S , we write P |≡ S to denote that P |≡ C for all C ∈ S .1 We say that “set of constraints S implies set of constraints S ”, denoted by S |= S , if, for every relation R, R |≡ S implies R |≡ S . The sets of constraints S and S are equivalent, denoted by S ≡ S , if S |= S and S |= S . In the above, whenever S = {C} and/or S = {C } are singletons, we usually write C and/or C instead of {C} and/or {C }. An inference rule can be described using a schema of the form S conditions, C which should be informally read as “if S , subject to some conditions, then C”, where S is a set of constraints and C is a constraint. The inference rule is k-ary if |S | = k. A set of inference rules R is k-ary if every rule r ∈ R is at most k-ary. If a constraint C can be obtained from a set of constraints S by repeatedly applying rules of a set of inference rules R, we say that C can be derived from S using R, denoted by S R C. We usually omit R if R is clear from the context. A set of inference rules R is sound whenever S R C implies S |= C and complete whenever S |= C implies S R C. A set of inference rules is a finite axiomatization if it is sound, complete, k-ary (for some k ≥ 0), and if the inference rules can be represented by a finite number of decidable inference rule schemas. An inference rule schema s = “if S , subject to some conditions, then C” is decidable if, given a set of constraints S with |S | ≤ |S | and constraint C , it is decidable if s can be applied to derive S s C . 1 Observe that the semantics of satisfaction depends on the type of constraints considered. The semantics of satisfaction for functional constraints is defined in Definition 5 and the semantics of satisfaction for constant constraints is defined in Definition 7. However, we use a generic notation for satisfaction independent of the type of constraints, which is why we introduce it here. J. Hellings et al. 3 Functional and constant constraints Functional constraints and constant constraints on n-ary relations are special subclasses of equality-generating constraints on n-ary relations. We formally define functional, constant, and equality-generating constraints on n-ary relations. Definition 1 A functional constraint is a pair (P, L → R), where P is a nonempty pattern and L, R ⊆ VP . Let C = (P, L → R) be a functional constraint. We say that P, L, and R are the pattern, left-hand side, and right-hand side of C, respectively. We denote these by PTN(C), LHS(C), and RHS(C). Definition 2 A constant constraint is a pair (P, E), where P is a nonempty pattern and E is a finite set of equalities of the form (c = $v) for c ∈ U and $v ∈ VP . Definition 3 An equality-generating constraint is a pair (P, E), where P is a non-empty pattern and E is a finite set of equalities of the form (t = u) for t, u ∈ VP ∪ U . Let C = (P, E) be a constant or an equality-generating constraint. We say that P and E are the pattern and the equalities of C, respectively. We denote these by PTN(C) and EQ (C). We say that an equality (t = u) ∈ E is a constant equality on pattern P if t ∈ U and u ∈ VP . Notice that the constant constraints are a proper syntactical subclass of the equality-generating constraints, in which the set of equalities is restricted to constant equalities. Definition 4 A constant-functional constraint is a constraint that is either a constant constraint or a functional constraint. We formally define the semantics of the functional, constant, and equality-generating constraints on n-ary relations. Definition 5 Let C = (P, L → R) be a functional constraint and let P be a pattern. Then P satisfies C if, for every pair of embeddings e1 and e2 of P into P with e1 =L e2 , we have e1 =R e2 . Each time we have two embeddings of P into P , it makes sense to look at the structure e1 (P) ∪ e2 (P) in order to compare the two embeddings. To make reasoning about such structures easier, we formalize them. Definition 6 Let f1 and f2 be a pair of embeddings of pattern P into some pattern P with f1 =L f2 for L ⊆ VP . We say that the pattern D = f1 (P) ∪ f2 (P) is a double pattern of P and L. A pattern D is a maximally double pattern of P and L if f1 and f2 are injections, f1 =L∪idU f2 , range(f1 |VP ) ⊆ V , range(f2 |VP ) ⊆ V , and range(f1 |VP \L ) ∩ range(f2 |VP \L ) = ∅. We observe that if f1 (P) ∪ f2 (P) is a maximally double pattern, then, by construction, f1 and f2 are bijective functions of VP ∪ U into Vf1 (P) ∪ U and Vf2 (P) ∪ U , embedding P into f1 (P) and f2 (P), respectively. In addition, f1 and f2 map constants to themselves Implication and axiomatization of functional and constant constraints and variables onto variables. Hence, P, f1 (P), and f2 (P) are isomorphic and, as a direct consequence, the maximally double pattern of P and L is unique up to isomorphisms. We shall therefore often leave the embeddings f1 and f2 that define f1 (P) ∪ f2 (P) implicit. Definition 7 Let P be a pattern and let C = (P, E) be a constant constraint or an equalitygenerating constraint. Then P satisfies C if, for every embedding e of P into P and every equality (t = u) ∈ E, we have e(t) = e(u). Akhtar et al. [3] already showed that every functional constraint can be written as an equality-generating constraint. To establish a formal relationship between equalitygenerating and functional constraints, we provide the following results. Proposition 1 Let C = (P, L → R) and let D = f1 (P) ∪ f2 (P) be the maximally double pattern of P and L. For every two embeddings e1 and e2 of P into pattern P with e1 =L e2 , there is an embedding e of D into P with, for every $r ∈ R, e1 ($r) = e ◦ f1 ($r) and e2 ($r) = e ◦ f2 ($r). Conversely, for every embedding e of D into P , there are two embeddings e1 and e2 of P into P with e1 =L e2 and, for every $r ∈ R, e ◦ f1 ($r) = e1 ($r) and e ◦ f2 ($r) = e2 ($r). Proof In one direction, we define e = (e1 ◦ f1 −1 ) ∪ (e2 ◦ f2 −1 ), and, in the other direction, we define e1 = e ◦ f1 and e2 = e ◦ f2 . Corollary 1 Let C = (P, L → R) be a functional constraint and let D = f1 (P) ∪ f2 (P) be the maximally double pattern of P and L. We have C ≡ (D, {(f1 ($r) = f2 ($r)) | $r ∈ R}). As already mentioned, the functional constraints are a generalization of the functional dependencies. We also formalize this relationship. Proposition 2 Let C = L → R be a functional dependency over the relation schema R(A1 , . . . , An ) with L, R ⊆ {A1 , . . . , An }. Consider the functional constraint C = ({(A1 , . . . , An )}, L → R), in which the attribute names are assumed to be variables. Then C ≡ C . The functional dependencies have a well-known axiomatization in the form of Armstrong’s axioms, consisting of the three inference rules Reflexivity, Augmentation, and Transitivity [5]. We generalize Armstrong’s inference rules to our setting of the functional constraints. Rule 1 (Reflexivity) Let P be a pattern. If R ⊆ L ⊆ VP , then (P, L → R). Rule 2 (Augmentation) If (P, L → R) and V ⊆ VP , then (P, L ∪ V → R ∪ V ). Rule 3 (Transitivity) If (P, L → M) and (P, M → R), then (P, L → R). Since Armstrong’s axioms can be generalized to functional constraints, it is straightforward to show that the well-known decomposition and union rules can also be generalized to functional constraints. Also, similar decomposition and union rules can be obtained for the equality-generating constraints [3]. For the decomposition and union of constant constraints, we introduce the following inference rules, which also hold for the equality-generating constraints in general. J. Hellings et al. Rule 4 (Decomposition) If (P, E ∪ E ), then (P, E). Rule 5 (Union) If (P, E) and (P, E ), then (P, E ∪ E ). From now on, if the left-hand side or right-hand side in a functional constraint is a singleton set {$v}, then we usually write $v instead. Likewise, if the set of equalities in a equality-generating or constant constraint is a singleton set {(t = u)}, then we usually write (t = u) instead. Armstrong’s axioms together with Decomposition and Union are not complete for the functional constraints, the constant constraints, or the constant-functional constraints: these inference rules only provide means to reason on constraints specified on a single pattern. The following example exhibits a situation in which this is not sufficient: Example 4 Consider the functional constraint C = ({($a, $b)}, $a → $b) and the pattern P = {($a, c1 ), ($a, c2 )} with c1 , c2 ∈ U and c1 = c2 . If C holds on a relation R, then no embedding of P into R is possible, and, hence, every constant-functional constraint on the pattern P holds. Likewise, consider the constant constraint C = ({($a, $b)}, (c = $b)) and the pattern P = {($a, c )} with c ∈ U and c = c . Also in this case, if C holds on a relation R, then no embedding of P into R is possible, and, hence, every constant-functional constraint on the pattern P holds. Both these cases consider the derivation of a constraint on pattern P from a constraint C on pattern PTN(C) with P = PTN(C). None of the presented inference rules are, however, able to reason about constraints specified on patterns that are not all equivalent. 4 Chasing constant-functional constraints For equality-generating constraints, Algorithm 1 is known to decide implication [3]. This algorithm is a constant-aware variation of standard chase algorithms for equality-generating dependencies [2, 8]. In Algorithm 1, we refer to lines 5–10 as equalization steps, to line 12 as inconsistency termination, and to line 15 as regular termination. Theorem 1 (Akhtar et al. [3]) Algorithm 1 is correct for equality-generating constraints. We use the relationship between the constant-functional constraints and the equalitygenerating constraints, as described in Corollary 1, to construct a chase-based algorithm that decides implication of constant-functional constraints, shown as Algorithm 2. In Algorithm 2, we refer to lines 15–21 as equalization steps, to line 23 as inconsistency termination, and to line 26 as regular termination. This is analogue to the equalization steps, inconsistency termination, and regular termination in Algorithm 1. Theorem 2 Algorithm 2 is correct for constant-functional constraints. Proof Algorithm 2 makes a case distinction on the type of all constraints involved. In it, constant constraints are treated as normal equality-generating constraints. For functional constraints, the algorithm implicitly translates functional constraints to equality-generating constraints using the results of Proposition 1 and Corollary 1. Implication and axiomatization of functional and constant constraints Algorithm 1 Chase for equality-generating constraints Input: Set of equality-generating constraints S , Equality-generating constraint C Output: S |= C 1: T ← PTN(C) 2: while there exists a constraint C ∈ S with T |≡ C do 3: Choose equality (t = u ) ∈ EQ(C ) and embedding e of PTN(C ) into T with e(t ) = e(u ) 4: /* equalize e(t ) and e(u ) in T */ 5: if e(t ) ∈ V then 6: /* replace all occurrences of e(t ) in T by e(u ) */ 7: T ← φe(u )←e(t ) (T) 8: else if e(u ) ∈ V then 9: /* replace all occurrences of e(u ) in T by e(t ) */ 10: T ← φe(t )←e(u ) (T) 11: else /* e(t ), e(u ) ∈ U and e(t ) = e(u ) */ 12: return TRUE 13: end if 14: end while 15: return T |≡ C In Section 5, we shall use the correctness of Algorithm 2 to prove that there is a complete axiomatization for the derivation of constant-functional constraints from a set of constantfunctional constraints. 5 Axiomatizing constant-functional constraints Algorithm 2 is a correct algorithm to decide S |= C, with S a set of constant-functional constraints and C a single constant-functional constraint. Hence, if S |= C holds, and if we can simulate every equalization step and the termination of an execution of Algorithm 2 by sound inference rules, then we have a sound and complete axiomatization for deriving constant-functional constraints. For chases that do not perform equalization steps, we provide straightforward inference rules to simulate the eventual termination step in a single inference step (Section 5.1). For chases that do initially perform an equalization step, we show that we can reduce any chase that decides S |= C to a single equalization step (constant constraints) or at most two equalization steps (functional constraints) with a single constraint C ∈ S , followed by the chase deciding S |= C , for some constraint C . We do so by showing that, after the initial equalization step(s) with C , the resulting tableau T is equivalent to the initial tableau for any chase deciding S |= C . To show that S |= C holds when S |= C holds, we show C |= C . Lastly, we show {C , C } |= C, while providing sound inference rules to derive {C , C } C in a finite number of inference steps. Due to the dual nature of Algorithm 2 with respect to, on the one hand, constant constraints, and on the other hand, functional constraints, we divide our search for inference rules to simulate the initial equalization step(s) of an execution of Algorithm 2 into two cases. J. Hellings et al. Algorithm 2 Chase for constant-functional constraints Input: Set of constant-functional constraints S , Constant-functional constraint C Output: S |= C 1: if C is a functional constraint then 2: Let D be the maximally double pattern of PTN(C) and LHS(C) 3: T ← D 4: else /* C is a constant constraint */ 5: T ← PTN(C) 6: end if 7: while there exists a constraint C ∈ S with T |≡ C do 8: if C is a functional constraint then 9: Choose variable $r ∈ RHS(C ) and pair of embeddings e1 and e2 of PTN(C) into T with e1 =LHS(C ) e2 and e1 ($r ) = e2 ($r ) 10: t1 , t2 ← e1 ($r ), e2 ($r ) 11: else /* C is a constant constraint */ 12: Choose equality (c = $v) ∈ EQ(C ) and embedding e of PTN(C ) into T with c = e($v) 13: t1 , t2 ← c, e($v) 14: end if 15: /* equalize t1 and t2 in T */ 16: if t1 ∈ V then 17: /* replace all occurrences of t1 in T by t2 */ 18: T ← φt2 ←t1 (T) 19: else if t2 ∈ V then 20: /* replace all occurrences of t2 in T by t1 */ 21: T ← φt1 ←t2 (T) 22: else /* t1 , t2 ∈ U and t1 = t2 */ 23: return TRUE 24: end if 25: end while 26: return T |≡ C Firstly, we consider the derivation of constant constraints (Section 5.2). Thereto, we simulate the initial equalization step in the chase that decides S |= C with C a constant constraint by using a finite number of inference steps. Secondly, we consider the derivation of functional constraints (Section 5.3). Thereto we try to simulate the chase that decides S |= C with C a functional constraint. In this case, we conclude that a direct simulation of individual equalization steps is not straightforward. To circumvent this problem, we show that, in this case, Algorithm 2 can be normalized to a more specialized, symmetry-preserving, chase algorithm. For the symmetry-preserving chase that decides S |= C, we are able to simulate the initial symmetry-preserving step by using a finite number of inference steps. Finally, the simulation of direct termination in chases for constant-functional constraints, the initial equalization step in chases for constant constraints, and the initial symmetrypreserving step in chases for functional constraints are used as the basis for an induction Implication and axiomatization of functional and constant constraints argument to extend the simulation to the entire chase (Section 5.4). This induction argument shows how to construct a derivation S C by showing how to translate each equalization step (constant constraints) or symmetry-preserving step (functional constraints) of a chase deciding S |= C to a finite number of inference steps, this such that the produced sequence of inference steps is a derivation of S C. 5.1 Inference rules for direct termination Consider the case where Algorithm 2 decides whether S |= C terminates without performing any equalization steps. Two subcases are possible, namely regular termination and inconsistency termination. 5.1.1 Regular termination Assuming no equalization steps are possible, we have regular termination if there does not exist a constraint C ∈ S with T |≡ C . In this case, due to line 26 in Algorithm 2, we have S |= C if and only if T |≡ C. We investigate necessary conditions on C such that T |≡ C holds. Proposition 3 Let C be a functional constraint and let D be the maximally double pattern of PTN(C) and LHS(C). If D |≡ C, then RHS(C) ⊆ LHS(C). If we have regular termination and C is a functional constraint, then T is equivalent to the maximally double pattern of PTN(C) and LHS(C). Hence, by Proposition 3, we conclude RHS(C) ⊆ LHS(C), and thus we can use Reflexivity to derive ∅ C. Hence, also S C. Proposition 4 Let C be a constant constraint. If PTN(C) |≡ C, then EQ(C) = ∅. If we have regular termination and C is a constant constraint, then T is equivalent to Hence, by Proposition 4, we conclude EQ(C) = ∅. For this case, we introduce the following inference rule. PTN(C). Rule 6 (Empty) Let P be a pattern. We have (P, ∅). Proof (soundness) Let R be a relation and assume there are embeddings of P into R. Let e be any embedding of P into R. The constraint specifies no equalities, hence, every equality specified by the constraint is satisfied by embedding e. If we have regular termination and C is a constant constraint, then EQ(C) = ∅, and thus we can use Empty to derive ∅ C. Hence, also S C. 5.1.2 Inconsistency termination Still assuming no equalization steps are possible, we have inconsistency termination if there does exist a constraint C ∈ S with T |≡ C . In this case, due to line 23 in Algorithm 2, we can find two terms t1 ∈ U and t2 ∈ U with t1 = t2 that should be equal according to C. We introduce the following inference rules to deal with such inconsistencies. J. Hellings et al. Rule 7 (Inconsistency I) Let P be a pattern. If (P , L → R ) with $r ∈ R , and if there is a pair of embeddings of P into P that agree on L and map $r to two distinct constants, then (P, E), for every finite set of constant equalities E on P, and (P, L → R), for every L ⊆ VP and R ⊆ VP . Proof (soundness) Let R be a relation with R |≡ (P , L → R ), and assume there are embeddings of P into R. Let e be such an embedding. Let e1 and e2 be embeddings mapping P into P that agree on L and map $r to two distinct constants. Now, the embeddings e1 = e ◦ e1 and e2 = e ◦ e2 map P into R, agree on L , and map $r to two distinct constants, a contradiction. Hence, no embedding e of P into R exists. Thus, we can conclude R |≡ (P, E), for every finite set of constant equalities E on P, and R |≡ (P, L → R), for every L ⊆ VP and R ⊆ VP . Rule 8 (Inconsistency II) Let P be a pattern. If (P , E ) with (c = $v) ∈ E , and if there is an embedding of P into P mapping $v to a constant unequal to c, then (P, E), for every finite set of constant equalities E on P, and (P, L → R), for every L ⊆ VP and R ⊆ VP . Proof (soundness) Let R be a relation with R |≡ (P , E ), and assume there are embeddings of P into R. Let e be such an embedding. Let e be an embedding mapping P into P and mapping $v to a constant unequal to c. Now, the embedding e = e ◦ e maps P into R and maps $v to a constant unequal to c, a contradiction. Hence, no embedding e of P into R exists. Thus, we can conclude R |≡ (P, E), for every finite set of constant equalities E on P, and R |≡ (P, L → R), for every L ⊆ VP and R ⊆ VP . It is straightforward to verify that these inference rules can be applied to simulate chases that decide S |= C, with C a constant constraint, with inconsistency termination and without using equalization steps. We shall now prove that this is also the case when C is a functional constraint. Proposition 5 Let S |= C with C a functional constraint. If Algorithm 2 decides S |= C by immediate inconsistency termination using constraint C ∈ S , then Rule 7 or Rule 8 can be applied to derive C C. Proof Let C = (P, L → R) and let T = D = f1 (P) ∪ f2 (P) be the maximally double pattern of P and L. Let e be any embedding of PTN(C ) into D. As P, f1 (P), and f2 (P) are isomorphic and f1 and f2 are bijections, the function e = f1 ←f2 ◦ e is well-defined and an embedding of PTN(C ) into f1 (P). Hence, the function e = f1 −1 ◦ e = f1 −1 ◦ f1 ←f2 ◦ e is also well-defined and an embedding of PTN(C ) into P. By construction, the function f1 −1 ◦ f1 ←f2 is the identity on constants. We now consider two cases, namely where C is a functional constraint and where C is a constant constraint. 1. If C = (P , L → R ) is a functional constraint, then, since Algorithm 2 must pass line 9 to terminate on line 23, we can choose a variable $r ∈ R and pair of embeddings e1 and e2 of P into T = D with e1 =L e2 and e1 ($r ) = e2 ($r ). Now, using the above construction, we obtain the pair of embeddings e1 = f1 −1 ◦ f1 ←f2 ◦ e1 and e2 = f1 −1 ◦ f1 ←f2 ◦ e2 of P into P. As e1 =L e2 , we also have e1 =L e2 . Furthermore, Implication and axiomatization of functional and constant constraints as e1 ($r ) ∈ U , e2 ($r ) ∈ U , e1 ($r ) = e2 ($r ), and f1 −1 ◦ f1 ←f2 is the identity on constants, we have e1 ($r ) ∈ U , e2 ($r ) ∈ U , and e1 ($r ) = e2 ($r ). We can thus apply Inconsistency I using constraint C and embeddings e1 and e2 to derive C C. 2. If C = (P , E ) is a constant constraint, then, since Algorithm 2 must pass line 12 to terminate on line 23, we can choose an equality (c = $v) ∈ E and embedding e of P into T = D with c = e($v). Now, using the above construction, we obtain an embedding e = f1 −1 ◦ f1 ←f2 ◦ e of P into P. As e($v) ∈ U and f1 −1 ◦ f1 ←f2 is the identity on constants, we have e ($v) ∈ U and c = e (c) = e ($v). We can thus apply Inconsistency II using constraint C and embedding e to derive C C. This case analysis completes the proof. 5.2 Inference rules for deriving constant constraints From now on, we assume that a chase deciding S |= C always performs equalization steps. For simulating the initial equalization step in the case where C is a constant constraint, we use the following three inference rules. Rule 9 (Application I) Let P be a pattern, let $v ∈ VP be a variable, and let t ∈ VP ∪ U be a term. If (φt←$v (P), E), if (P , L → R ), and if there is a pair of embeddings of P into P that agree on L and map $r ∈ R to t and $v, respectively, then (P, E). If, in addition, there exists a c ∈ U such that either c = t (if t ∈ U ) or (c = t) ∈ E (if t ∈ V ), then also (P, E ∪ {(c = $v)}). Proof (soundness) Let R be a relation with R |≡ (φt←$v (P), E) and R |≡ (P , L → R ), and assume there are embeddings of P into R. Let e be any embedding of P into R and let e1 and e2 be any pair of embeddings of P into P that agree on L and map $r ∈ R to t and $v, respectively. Now e1 = e ◦ e1 and e2 = e ◦ e2 are embeddings of P into R with e1 =L e2 . By R |≡ (P , L → R ), we have e1 ($r ) = e2 ($r ), and, by construction, we have e1 ($r ) = e(t) and e2 ($r ) = e($v). Hence, e($v) = e(t), and ε = e|domain(e)\{$v} is an embedding of φt←$v (P) into R. Now, by R |≡ (φt←$v (P), E), we have, for every (c = $w) ∈ E, ε($w) = c . Notice that $v ∈ domain(ε). Hence, if ε($w) = c , then $w = $v and e($w) = c . Hence, we conclude R |≡ (P, E). Above, we already showed that e($v) = e(t). It now follows readily that, if, in addition, there exists c ∈ U such that either c = t (if t ∈ U ) or (c = t) ∈ E (if t ∈ V ), we also have R |≡ (P, (c = $v)), and, hence, R |≡ (P, E ∪ {(c = $v)}). Rule 10 (Application II) Let P be a pattern, let $v ∈ VP be a variable, and let c ∈ U be a constant. If (φc←$v (P), E), if (P , E ) with (c = $v ) ∈ E , and if there is an embedding of P into P mapping $v to $v, then (P, E ∪ {(c = $v)}). Proof (soundness) Let R be a relation with R |≡ (φc←$v (P), E) and R |≡ (P , E ), and assume there are embeddings of P into R. Let e be any embedding of P into R and let e be any embedding of P into P mapping $v to $v. Now e = e ◦ e is an embedding of P into R. By R |≡ (P , E ), we have e ($v ) = e($v) = c. As a consequence, ε = e|domain(e)\{$v} is an embedding of φc←$v (P) into R. Now, by R |≡ (φt←$v (P), E), we have, for every (c = $w) ∈ E, ε($w) = c . Notice that $v ∈ domain(ε). Hence, if ε($w) = c , then $w = $v and e($w) = c . Hence, we conclude R |≡ (P, E ∪ {(c = $v)}). J. Hellings et al. The following example exhibits situations in which the Application I and II rules can be used. Example 5 Using Application I in a straightforward manner, we can derive {({($a, $b, $c)}, $a → $b), ({($a, b, $c), ($a, b, d)}, (c = $c))} ({($a, b, $c), ($a, $b, d)}, (c = $c)) by using the embeddings e1 = {$a → $a, $b → b, $c → $c} e2 = {$a → $a, $b → $b, $c → d}. We thus have e1 =$a e2 , e1 ($b) = b, and e2 ($b) = $b. Using Application I, we can also derive ({($a, $b, $c), $a → $c) ({($a, $b, c), ($a, $b, $c)}, {c = $c}) by using the embeddings e1 = {$a → $a, $b → $b, $c → c} e2 = {$a → $a, $b → $b, $c → $c}. Observe that in this case, Application I is applied to a single non-trivial functional constraint and to the trivial constant constraint ({($a, $b, c), ($a, $b, $c)}, ∅). Using Application II, we can derive {({($a, $b, $c)}, (c = $c)), ({($a, $b, c)}, (b = $b))} ({($a, $b, $c)}, (b = $b)) by using the embedding e = {$a → $a, $b → $b, $c → $c}. Rule 11 (Inconsistent Propagation) Let P be a pattern, $v ∈ VP , and c1 , c2 ∈ U with c1 = c2 . If (P, {(c1 = $v), (c2 = $v)}), then (P, E) for every finite set of constant equalities E on P. Proof (soundness) Let R be a relation with R |≡ (P, {(c1 = $v), (c2 = $v)}) and assume there are embeddings of P into R. Let e be such an embedding mapping P into R. By R |≡ (P, {(c1 = $v), (c2 = $v)}), we must have c1 = e($v) and c2 = e($v). As a consequence, we conclude c1 = c2 , a contradiction. Hence, no embedding e of P into R exists. Thus we can conclude R |≡ (P, E), for every finite set of constant equalities E on P. Before we show how the initial equalization step in any chase deciding S |= C with C a constant constraint can be simulated by the introduced inference rules, we introduce an additional inference rule to simplify our proofs: Rule 12 (Embedding I) If (P , E ) and h is an embedding from P into P, then (P, {(c = h($v)) | ((c = $v) ∈ E ) ∧ (h($v) ∈ V )}). Proof (soundness) Let R be a relation with R |≡ (P , E ), and assume there are embeddings of P into R. Let e be any embedding of P into R. Then ε = e ◦ h is an embedding of P into R. Hence, for every (c = $v) ∈ E , we have e(c) = c = ε(c) = ε($v) = e(h($v)). Embedding I explicitly maps a constant constraint defined on a pattern to a different pattern. Observe that Embedding I is applicable to equality-generating constraints as Implication and axiomatization of functional and constant constraints well. However, if we apply Embedding I to a constant constraint (P , E ), then the derived constraint is always a constant constraint. The following example illustrates the usage of Embedding I. Example 6 If ({($a, $b)}, ($a = c)) holds, then trivially also ({($x, $y)}, ($x = c)) holds. We can derive ({($x, $y)}, ($x = c)) from ({($a, $b)}, ($a = c)) by using Embedding I with the embedding h = {$a → $x, $b → $y}. Next, we show how the initial equalization step in any chase deciding S |= C with C a constant constraint can be simulated by the introduced inference rules. Theorem 3 Let S be a set of constant-functional constraints, let C = (P, E) be a constant constraint, and let S |= C. If Algorithm 2 is initially able to perform an equalization step with C ∈ S resulting in tableau T = P , then there exists a constant constraint C = (P , E ) such that we have the following: 1. 2. S |= C , {C , C } C using Rules 4, 9, 10, and 11. Proof Without loss of generality, we can assume that φt1 ←t2 is the equalization performed in the initial equalization step: this always holds if C is a constant constraint, and, if C is a functional constraint, then we can always swap the roles of e1 and e2 . Hence, we assume that P = φt1 ←t2 (P). We distinguish two types of executions of Algorithm 2, namely those that terminate regularly and those that terminate due to inconsistency. We divide our analysis into these two cases. 1. Algorithm 2 terminates regularly. We define E = {c = φt1 ←t2 ($v) | ((c = $v) ∈ E) ∧ (φt1 ←t2 ($v) ∈ V )} and C = (P , E ). Observe that we have E = E if and only if there exists an equality (c = t2 ) ∈ E: if (c = t2 ) ∈ E and t1 ∈ V , then we have E = E ∪{(c = t1 )}\{(c = t2 )}, and, if (c = t2 ) ∈ E and t1 ∈ U , then we have E = E \ {(c = t2 )}. Also observe that, if t1 ∈ U and if there exists an equality (c = t2 ) ∈ E, then, due to this chase terminating regularly and S |= C, we have t1 = c. We have C |= C , since we have C C by a straightforward application of Embedding I using the embedding φt1 ←t2 . By S |= C and C |= C , we conclude S |= C . Next, we prove {C , C } C. We do so by a case distinction on the type of constraint C . 1.a. C is a functional constraint. Let e1 and e2 be embeddings of PTN(C ) into T = P and let $r ∈ RHS(C ) with t1 = e1 ($r ) and t2 = e2 ($r ), meeting the conditions of line 9 of Algorithm 2. Since Algorithm 2 terminates regularly, it is not possible that t1 , t2 ∈ U , and, since φt1 ←t2 is the equalization performed, it is not possible that t1 ∈ V , t2 ∈ U . Two cases remain, namely t1 , t2 ∈ V and t1 ∈ U , t2 ∈ V . First consider the case where t1 ∈ V and t2 ∈ V . If there exists an equality (c = t2 ) ∈ E, then we have E = E ∪ {(c = t1 )} \ {(c = t2 )}, and, if there does not exist an equality (c = t2 ) ∈ E, then we have E = E. In both cases, we apply Application I to conclude {C , C } C. Next consider the case where J. Hellings et al. t1 ∈ U and t2 ∈ V . In this case, we have E = E \ {(t1 = t2 )} and we apply Application I to conclude {C , C } (P, E ∪ {(t1 = t2 )}) and Decomposition to conclude {C , C } (P, E ). If (t1 = t2 ) ∈ E, then C = (P, E ∪ {(t1 = t2 )}), otherwise C = (P, E ). Hence, we conclude {C , C } C. 1.b. C is a constant constraint. Let e be an embedding of PTN(C ) into T = P and let (c = $v) ∈ EQ(C ) be an equality with t1 = c and t2 = e($v), meeting the conditions of line 12 of Algorithm 2. In this case, we have E = E \ {(t1 = t2 )}. We apply Application II to conclude {C , C } (P, E ∪ {(t1 = t2 )}) and Decomposition to conclude {C , C } (P, E ). If (t1 = t2 ) ∈ E, then C = (P, E ∪ {(t1 = t2 )}), otherwise C = (P, E ). Hence, we conclude {C , C } C. 2. Algorithm 2 terminates due to inconsistency. Let $w ∈ VP be a variable and let c1 , c2 ∈ U be constants with c1 = c2 . We define E = {(c1 = $w), (c2 = $w)} and C = (P , E ). Chasing pattern P using S will lead to inconsistency, hence, S |= C . Next, we prove {C , C } C. We do so by a case distinction on the type of constraint C . 2.a. C is a functional constraint. Let e1 and e2 be embeddings of PTN(C ) into T = P and let $r ∈ RHS(C ) with t1 = e1 ($r ) and t2 = e2 ($r ), meeting the conditions of line 9 of Algorithm 2. We apply Application I to conclude (P, E ) and Inconsistent Propagation to conclude {C , C } C. 2.b. C is a constant constraint. Let e be an embedding of PTN(C ) into T = P and let (c = $v) ∈ EQ(C ) be an equality with t1 = c and t2 = e($v), meeting the conditions of line 12 of Algorithm 2. We apply Application II to conclude {C , C } (P, E ∪ {(c = $v)}), Decomposition to conclude (P, E ), and Inconsistent Propagation to conclude {C , C } C. This case analysis completes the proof. 5.3 Inference rules for deriving functional constraints For chases deciding S |= C with C a constant constraint, there is a direct correspondence between the tableau T and patterns of constant constraints. In Section 5.2, we showed that we can use this correspondence to derive inference rules simulating the initial equalization step. For chases deciding S |= C with C = (P, L → R) a functional constraint, such a direct correspondence between the tableau T and patterns of functional constraints does not exist. Line 3 of Algorithm 2 does however give an initial correspondence between the tableau T and the maximally double pattern of P and L. A single equalization step can however destroy any correspondence between the tableau T and any maximally double pattern that is useful in derivations of non-trivial functional constraints, as shown by the next example. Example 7 We apply Algorithm 2 to the set of constraints S = {({($a, $b, $c)}, $b → $c), ({($a, $b, $c), ($a, $d, e)}, $a → $b)} and the target functional constraint C = ({($a, $b, $c), ($a, $b, e)}, $a → $b). It is straightforward to verify S |= C. We initially have the tableau T = ($a, $b1 , $c1 ), ($a, $b1 , e), ($a, $b2 , $c2 ), ($a, $b2 , e) . Implication and axiomatization of functional and constant constraints We have ({($a, $b, $c)}, $b → $c) |≡ T, leading to the equalization φe←$c1 (T) which results in the tableau T = ($a, $b1 , e), ($a, $b2 , $c2 ), ($a, $b2 , e) . Using the definition of a maximally double pattern, we can search for a pattern P , set of variables L ⊆ VP , and injective functions f1 and f2 such that T = f1 (P ) ∪ f2 (P ), f1 =L ∪idU f2 , and range(f1 |VP \L ) ∩ range(f2 |VP \L ) = ∅. It is easily verifiable that the only way to achieve this is by putting P = T , f1 = f2 = idVP ∪U , and L = {$a, $b1 , $b2 , $c2 }. Since L = VP , any functional constraint of the form (P , L → R ) will have R ⊆ L , and, hence, holds trivially. We can however always maintain a direct correspondence between tableau T and a maximally double pattern that is useful in derivations and we can do so in at most two equalization steps, as shown next. Theorem 4 is visualized in Fig. 2. Theorem 4 Let S be a set of constant-functional constraints, let C = (P, L → R) be a functional constraint, let D = f1 (P) ∪ f2 (P) be the maximally double pattern of P and L, and let S |= C. If T = D and an equalization step with C ∈ S is possible, then also a sequence of at most two equalization steps with C is possible that results in a tableau T = D = f1 (P ) ∪ f2 (P ), a maximally double pattern of P and L with L ⊆ VP , such that there is a mapping m with f1 (P ) = m(f1 (P)) and f2 (P ) = m(f2 (P)). Proof First, we consider all the cases in which C = (P , L → R ) is a functional constraint. Let $r ∈ R be a variable and let e1 and e2 be a pair of embeddings of P into T = D with e1 =L e2 and e1 ($r ) = e2 ($r ), meeting the conditions of line 9 of Algorithm 2. Since an equalization step is performed, we have e1 ($r ) ∈ U or e2 ($r ) ∈ U . We can swap the roles of both f1 , f2 , and e1 , e2 . Hence, without loss of generality, we can assume that e1 ($r ) = f1 (t) with t ∈ TP , and, for $v ∈ VP , e2 ($r ) = f1 ($v) or e2 ($r ) = f2 ($v). First, we consider the cases in which e2 ($r ) = f1 ($v). 1. t ∈ L ∪ U and $v ∈ L. Since $v ∈ L, we have f1 ($v) = f2 ($v). Let φf1 (t)←f1 ($v) be the equalization performed by the initial equalization step using C and embeddings e1 and e2 , resulting in the tableau φf1 (t)←f1 ($v) (f1 (P) ∪ f2 (P)), which is equivalent to the tableau T = (φf1 (t)←f1 ($v) ◦ f1 (P)) ∪ (φf1 (t)←f1 ($v) ◦ f2 (P)). By construction, φf1 (t)←f1 ($v) ◦ f1 and φf1 (t)←f1 ($v) ◦ f2 are embeddings of P into T that agree on L = φt←$v (L) ∩ V = L \ {$v} and, for all t ∈ TP \ {$v}, are equal to f1 and f2 . Hence, we can put P = φt←$v (P), f1 = f1 |TP , f2 = f2 |TP , and m = φf1 (t)←f1 ($v) . 2. t ∈ VP ∪ U and $v ∈ L. Since P, f1 (P), and f2 (P) are isomorphic, and f1 and f2 are bijections mapping constants to themselves and variables to variables, the functions ε1 = f2 ←f1 ◦e1 and ε2 = f2 ←f1 ◦e2 are well-defined and are embeddings of P into f2 (P) that agree on L . Hence, by construction, f2 (t) = ε1 ($r ) = ε2 ($r ) = f2 ($v). Let φf1 (t)←f1 ($v) be the equalization performed by the initial equalization step using C and embeddings e1 and e2 , resulting in the tableau φf1 (t)←f1 ($v) (f1 (P) ∪ f2 (P)). Since f1 ($v) ∈ range(f2 ), this tableau is equivalent to the tableau T = (φf1 (t)←f1 ($v) ◦ f1 (P)) ∪ f2 (P). J. Hellings et al. Fig. 2 Visualization of a symmetry-preserving step performed on the maximally double pattern D = f1 (P)∪ f2 (P) of P and L with L ⊆ VP , resulting in the maximally double pattern D = f1 (P ) ∪ f2 (P ) of P and L with L ⊆ VP . Since the mapping m is symmetry-preserving, the symmetry between f1 (P) and f2 (P) is maintained: structural changes are applied symmetrically to both halves of D. Hence, m maps each half of D to the corresponding half of D Hence, f2 (P) is unaffected by the initial equalization step and thus a second equalization step is possible using functional constraint C and embeddings ε1 and ε2 . Performing this second equalization φf2 (t)←f2 ($v) on T results in the tableau φf2 (t)←f2 ($v) (φf1 (t)←f1 ($v) (f1 (P) ∪ f2 (P))). Since f2 ($v) ∈ range(f1 ), this tableau is equivalent to the tableau T = (φf1 (t)←f1 ($v),f2 (t)←f2 ($v) ◦ f1 (P)) ∪ (φf1 (t)←f1 ($v),f2 (t)←f2 ($v) ◦ f2 (P)). By construction, φf1 (t)←f1 ($v),f2 (t)←f2 ($v) ◦ f1 and φf1 (t)←f1 ($v),f2 (t)←f2 ($v) ◦ f2 are embeddings of P into T that agree on L = φt←$v (L) ∩ V = L and, for all t ∈ TP \ {$v}, are equal to f1 and f2 . Hence, we can put P = φt←$v (P), f1 = f1 |TP , f2 = f2 |TP , and m = φf1 (t)←f1 ($v),f2 (t)←f2 ($v) . Next we consider the cases in which e2 ($r ) = f2 ($v). 3. e1 ($r ) = f1 ($v). From e1 ($r ) = f1 ($v), e2 ($r ) = f2 ($v), and e1 ($r ) = e2 ($r ), we conclude that $v ∈ L. Let φf1 ($v)←f2 ($v) be the equalization performed by the initial equalization step using C and embeddings e1 and e2 , resulting in the tableau φf1 ($v)←f2 ($v) (f1 (P) ∪ f2 (P)), which is equivalent to the tableau T = (φf1 ($v)←f2 ($v) ◦ f1 (P)) ∪ (φf1 ($v)←f2 ($v) ◦ f2 (P)). By construction, φf1 ($v)←f2 ($v) ◦ f1 and φf1 ($v)←f2 ($v) ◦ f2 are embeddings of P into T that agree on L = L∪{$v} and, for all t ∈ TP \{$v}, are equal to f1 and f2 . Hence, we can put P = P, f1 = φf1 ($v)←f2 ($v) ◦f1 , f2 = φf1 ($v)←f2 ($v) ◦f2 , and m = φf1 ($v)←f2 ($v) . 4. e1 ($r ) = f1 (t) = f1 ($v). Since P, f1 (P), and f2 (P) are isomorphic and f1 and f2 are bijections mapping constants to themselves and variables to variables, the functions ε1 = f1 ←f2 ◦ e1 and ε2 = f1 ←f2 ◦ e2 are well-defined and are embeddings of P into f1 (P) with ε1 =L ε2 . Since e1 ($r ) = f1 (t) and e2 ($r ) = f2 ($v) with t = $v, we have e1 ($r ) = f1 (t) = ε1 ($r ) = ε2 ($r ) = f1 ($v). Hence, if the equalization step with C , e1 , and e2 is initially possible, then also an equalization step with C , ε1 , and ε2 is possible. We choose to perform the equalization step with C , ε1 , and ε2 instead. Hence, we reduce this case to the cases 1 and 2. Finally, we consider all the cases in which C = (P , E ) is a constant constraint. Let (c = $v ) ∈ E be an equality and let e be an embedding of P into T = P with c = e($v ), meeting the conditions of line 12 of Algorithm 2. Since an equalization step is performed, Implication and axiomatization of functional and constant constraints we have e($v ) ∈ U . Since we can swap the roles of f1 and f2 , we can assume, without loss of generality, that e($v ) = f1 ($v) with $v ∈ VP . This yields us two cases, namely $v ∈ L or $v ∈ L. 5. $v ∈ L. Since $v ∈ L, we have f1 ($v) = f2 ($v). Let φc←f1 ($v) be the equalization performed by the initial equalization step using C and embedding e, resulting in the tableau φc←f1 ($v) (f1 (P) ∪ f2 (P)), which is equivalent to the tableau T = (φc←f1 ($v) ◦ f1 (P)) ∪ (φc←f1 ($v) ◦ f2 (P)). By construction, the functions φc←f1 ($v) ◦ f1 and φc←f1 ($v) ◦ f2 are embeddings of P into T that agree on L = φc←$v (L) ∩ V = L \ {$v} and, for all t ∈ TP \ {$v}, are equal to f1 and f2 . Hence, we can put P = φc←$v (P), f1 = f1 |TP , f2 = f2 |TP , and m = φc←f1 ($v) . 6. $v ∈ L. Since P, f1 (P), and f2 (P) are isomorphic, and f1 and f2 are bijections mapping constants to themselves and variables to variables, the function ε = f2 ←f1 ◦ e is well-defined and is an embedding of P into f2 (P) with ε($v ) = c. Hence, by construction, ε($v ) = f2 ($v). Let φc←f1 ($v) be the equalization performed by the initial equalization step using C and embedding e, resulting in the tableau φc←f1 ($v) (f1 (P) ∪ f2 (P)). Since f1 ($v) ∈ range(f2 ), this tableau is equivalent to the tableau T = (φc←f1 ($v) ◦ f1 (P)) ∪ f2 (P). Hence f2 (P) is unaffected by the initial equalization step and thus a second equalization step is possible using constant constraint C and embedding ε. Performing this second equalization φc←f2 ($v) on T results in the tableau φc←f2 ($v) (φc←f1 ($v) (f1 (P) ∪ f2 (P))). Since f2 ($v) ∈ range(f1 ), this tableau is equivalent to the tableau T = (φc←f1 ($v),c←f2 ($v) ◦ f1 (P)) ∪ (φc←f1 ($v),c←f2 ($v) ◦ f2 (P)). By construction, φc←f1 ($v),c←f2 ($v) ◦ f1 and φc←f1 ($v),c←f2 ($v) ◦ f2 are embeddings of P into T that agree on L = φc←$v (L) ∩ V = L and, for all t ∈ TP \ {$v}, are equal to f1 and f2 . Hence, we can put P = φc←$v (P), f1 = f1 |TP , f2 = f2 |TP , and m = φc←f1 ($v),c←f2 ($v) . This case analysis completes the proof. We refer to the sequence of at most two equalization steps performed in Cases 1–3, 5 and 6 of Theorem 4 as a symmetry-preserving step of type 1–3, 5 and 6, respectively. The name “symmetry-preserving” reflects that this sequence of equalization steps maintains the symmetry between the parts in the tableau originating from f1 (P) and f2 (P), as illustrated in Fig. 2. Notice that we do not consider equalization steps performed by Case 4 of Theorem 4. In this case, we have shown that we can always perform other equalization steps that can be represented by a symmetry-preserving step of type 1 or 2. J. Hellings et al. Observe that the initial tableau in a chase deciding S |= C, with C a functional constraint, is a maximally double pattern of PTN(C) and LHS(C). Hence, if equalization steps are possible, then we can apply Theorem 4 inductively to yield a sequence of symmetrypreserving steps that ends when no further equalization steps are possible. We refer to such chases performing only symmetry-preserving steps as symmetry-preserving chases. Corollary 2 Let S be a set of constant-functional constraints, let C = (P, L → R) be a functional constraint. If S |= C, then there is a symmetry-preserving chase that decides S |= C. For simulating the initial symmetry-preserving step in a chase deciding S |= C with C a functional constraint, we use the following three inference rules. Rule 13 (Application III) Let P be a pattern, let $v ∈ VP be a variable, and let t ∈ VP ∪ U be a term. If (φt←$v (P), φt←$v (L) ∩ V → φt←$v (R) ∩ V ), if (P , L → R ), and if there is a pair of embeddings of P into P that agree on L and map $r ∈ R to t and $v, respectively, then (P, L → R). Proof (soundness) Let R be a relation with R |≡ (φt←$v (P), φt←$v (L)∩ V → φt←$v (R)∩ V ) and R |≡ (P , L → R ), and assume there are embeddings of P into R. Let e1 and e2 be any pair of embeddings of P into R with e1 =L e2 , and let e1 and e2 be embeddings of P into P that agree on L and map $r ∈ R to t and $v, respectively. Now, g1 = e1 ◦ e1 and g2 = e1 ◦ e2 are embeddings of P into R with g1 =L g2 . By R |≡ (P , L → R ), we have g1 ($r ) = g2 ($r ), and, by construction, we have g1 ($r ) = e1 (t) and g2 ($r ) = e1 ($v), and, hence, e1 ($v) = e1 (t). In a similar way, we obtain e2 ($v) = e2 (t). As a consequence, ε1 = e1 |domain(e1 )\{$v} and ε2 = e2 |domain(e2 )\{$v} are embeddings of φt←$v (P) into R. By construction, we have ε1 =φt←$v (L) ε2 and, hence, ε1 =φt←$v (L)∩V ε2 . By R |≡ (φt←$v (P), φt←$v (L) ∩ V → φt←$v (R) ∩ V ), we conclude ε1 =φt←$v (R)∩V ε2 . If $v ∈ R, then R = φt←$v (R) ∩ V and thus e1 =R e2 . In the other case, when $v ∈ R, we have e1 ($v) = ε1 (t) and e2 ($v) = ε2 (t). As t is either a constant or t ∈ φt←$v (R) ∩ V , we have ε1 (t) = ε2 (t). Hence, in both cases we have e1 =R e2 , and we conclude R |≡ (P, L → R). Rule 14 (Application IV) Let P be a pattern, let $v ∈ VP be a variable, and let c ∈ U be a constant. If (φc←$v (P), φc←$v (L)∩ V → φc←$v (R)∩ V ), if (P , E ) with (c = $v ) ∈ E , and if there is an embedding of P into P mapping $v to $v, then (P, L → R). Proof (soundness) Let R be a relation with R |≡ (φc←$v (P), φc←$v (L)∩V → φc←$v (R)∩ V ) and R |≡ (P , E ) and assume there are embeddings of P into R. Let e1 and e2 be any pair of embeddings of P into R with e1 =L e2 , and let e be an embedding of P into P mapping $v to $v. Now, e1 = e1 ◦ e and e2 = e2 ◦ e are embeddings of P into R. By R |≡ (P , E ), we have c = e1 ($v ) = e2 ($v ) and, by construction, we have c = e ($v ) = e1 ($v) and c = e2 ($v ) = e2 ($v), and thus c = e1 ($v) = e2 ($v). As a consequence, ε1 = e1 |domain(e1 )\{$v} and ε2 = e2 |domain(e2 )\{$v} are embeddings of φc←$v (P) into R. By construction, we have ε1 =φc←$v (L) ε2 and, hence, ε1 =φc←$v (L)∩V ε2 . By R |≡ (φc←$v (P), φc←$v (L) ∩ V → φc←$v (R) ∩ V ), we conclude ε1 =φc←$v (R)∩V ε2 . If $v ∈ R, then R = φc←$v (R) ∩ V and thus e1 =R e2 . In the other case, when $v ∈ R, Implication and axiomatization of functional and constant constraints we already have e1 ($v) = e2 ($v) = c. Hence, in both cases we have e1 =R e2 , and we conclude R |≡ (P, L → R). The following example exhibits situations in which the Application III and IV rules can be used. Example 8 Using Application III, we can derive {({($a, $b, $c)}, $a → $b), ({($a, b, $c), ($a, b, d)}, $a → $c)} ({($a, b, $c), ($a, $b, d)}, $a → $c) by using the embeddings e1 = {$a → $a, $b → b, $c → $c} e2 = {$a → $a, $b → $b, $c → d}. Using Application IV, we can derive {({($a, $b, $c)}, (c = $c)), ({($a, $b, c)}, $a → $b)} ({($a, $b, $c)}, $a → $b) by using the embedding e = {$a → $a, $b → $b, $c → $c}. If (P, (c = $v)) is a constant constraint, then, using Application IV, we can also derive (P, ∅ → $v) by using the identity on VP ∪ U as the embedding e of P onto itself. Rule 15 (Functional Consequence) Let P be a pattern and let L ⊆ VP be a set of variables. If (P , L → R ) and P has a pair of embeddings into the maximally double pattern f1 (P) ∪ f2 (P) of P and L, agreeing on L and mapping $r ∈ R to f1 ($v) and f2 ($v), respectively, then (P, L → $v). Proof (soundness) Let R be a relation with R |≡ (P , L → R ), and assume there are embeddings of P into R. Let e1 and e2 be any pair of embeddings of P into R with e1 =L e2 . Let f1 (P) ∪ f2 (P) be the maximally double pattern of P and L. Let e1 and e2 be embeddings of P into f1 (P) ∪ f2 (P) with e1 =L e2 , e1 ($r ) = f1 ($v), and e2 ($r ) = f2 ($v). As f1 and f2 are bijections, the functions ε1 = e1 ←f1 ◦ e2 ←f2 ◦ e1 and ε2 = e1 ←f1 ◦ e2 ←f2 ◦ e2 are well-defined and are embeddings of P into R. By construction, we have ε1 =L ε2 , ε1 ($r ) = e1 ($v), and ε2 ($r ) = e2 ($v). Hence, by R |≡ (P , L → R ), e1 ($v) = ε1 ($r ) = ε2 ($r ) = e2 ($v). The following example exhibits situations in which the Functional Consequence rule can be used. Example 9 We can derive ({($a, $b, $c), ($x, $y, $z)}, $b → $z) ({($a, $b, $c), ($x, $y, $z)}, ∅ → $z) using Functional Consequence. We have the maximally double pattern {($a1 , $b1 , $c1 ), ($x1 , $y1 , $z1 ), ($a2 , $b2 , $c2 ), ($x2 , $y2 , $z2 )} of {($a, $b, $c), ($x, $y, $z)} and ∅. We use the embeddings e1 = {$a → $a1 , $b → $b1 , $c → $c1 , $x → $x1 , $y → $y1 , $z → $z1 } e2 = {$a → $a1 , $b → $b1 , $c → $c1 , $x → $x2 , $y → $y2 , $z → $z2 }. J. Hellings et al. We thus have e1 =$b e2 , e1 ($z) = $z1 and e2 ($z) = $z2 . Hence, using Functional Consequence, we conclude ({($a, $b, $c), ($x, $y, $z)}, ∅ → $z). Notice that there is a relationship between the Functional Consequence rule and the well-known join dependencies [2] and multivalued dependencies [6]. In this example, the possible embeddings of the pattern {($a, $b, $c), ($x, $y, $z)} can be represented by a relational table T with schema R(A, B, C, X, Y, Z). Due to the construction of T , the join dependency ABC XY Z holds on T . This join dependency is equivalent to the multivalued dependency ∅ ABC. Due to ({($a, $b, $c), ($x, $y, $z)}, $b → $z), the functional dependency B → Z also holds on T , and, hence, ABC → Z holds. Using the well-known derivation rules for functional dependencies and multivalued dependencies, we conclude ∅ → Z. Before we show how the initial equalization step in any chase deciding S |= C with C a functional constraint can be simulated by the introduced inference rules, we introduce an additional inference rule to simplify our proofs: Rule 16 (Embedding II) If (P , L → R ) and h is an embedding from P into P, then (P, h(L ) ∩ V → h(R ) ∩ V ). Proof (soundness) Let R be a relation with R |≡ (P , L → R ), and assume there are embeddings of P into R. Let e1 and e2 be any pair of embeddings of P into R with e1 =h(L )∩V e2 . Then, ε1 = e1 ◦ h and ε2 = e2 ◦ h are embeddings of P into R. As embeddings always agree on constants and e1 =h(L )∩V e2 , we have e1 =h(L ) e2 and, hence, also ε1 =L ε2 . By R |≡ (P , L → R ), we have ε1 =R ε2 . As a consequence, we have e1 =h(R ) e2 and, hence, also e1 =h(R )∩V e2 . Embedding II explicitly maps a functional constraint defined on a pattern to a different pattern. The following example illustrates this. Example 10 If ({($a, $b)}, $a → $b) holds, then trivially also ({($c, $d)}, $c → $d) holds. We can derive ({($c, $d)}, $c → $d) from ({($a, $b)}, $a → $b) by using Embedding II with the embedding h = {$a → $c, $b → $d}. Next, we show how the initial symmetry-preserving step in any symmetry-preserving chase deciding S |= C with C a functional constraint can be simulated by the introduced inference rules. Theorem 5 Let S be a set of constant-functional constraints, let C = (P, L → R) be a functional constraint, and let S |= C. If Algorithm 2 is initially able to perform a symmetry-preserving step with C ∈ S resulting in tableau T = f1 (P ) ∪ f2 (P ), which is the maximally double pattern of P and L with L ⊆ VP , then there exists a functional constraint C = (P , L → R ) such that we have the following: 1. 2. C |= C , {C , C } C using Rules 2, 3, 13, 14, and 15. Proof Initially, we have T = f1 (P) ∪ f2 (P), a maximally double pattern of P and L. By Theorem 4, the initial symmetry-preserving step with constraint C results in tableau Implication and axiomatization of functional and constant constraints T = f1 (P ) ∪ f2 (P ), a maximally double pattern of P and L , such that f1 (P ) = m(f1 (P)) and f2 (P ) = m(f2 (P)) for some mapping m. We make a case distinction on the type of symmetry-preserving step initially performed. The initial symmetry-preserving step is of type 1 or 2. Let C = (P , L → R ), let $r ∈ R , and let e1 and e2 be the embeddings of P into T with e1 ($r ) = f1 (t) and e2 ($r ) = f1 ($v), as in the proof of Theorem 4. We choose C = (φt←$v (P), φt←$v (L) ∩ V → φt←$v (R) ∩ V ). First, we have C |= C , since we have C C by a straightforward application of Embedding II with the embedding φt←$v . The functions ε1 = f1 −1 ◦f1 ←f2 ◦e1 and ε2 = f1 −1 ◦f1 ←f2 ◦e2 are well-defined and are embeddings of P into P with, by construction, ε1 =L ε2 , ε1 ($r ) = t, and ε2 ($r ) = $v. Hence, we can apply Application III using embeddings ε1 and ε2 , and variable $r to derive {C , C } C. 2. The initial symmetry-preserving step is of type 3. Let C = (P , L → R ), let $r ∈ R , and let e1 and e2 be the embeddings of P into T with e1 ($r ) = f1 ($v) and e2 ($r ) = f2 ($v), as in the proof of Theorem 4. We choose C = (P, L ∪ {$v} → R). First, we have C |= C , since we have C C by a straightforward application of Augmentation, Decomposition, and Union. A straightforward application of Functional Consequence with e1 , e2 , and $v yields C (P, L → $v). Using Augmentation, we obtain C (P, L → L ∪ {$v}), and, hence, using Transitivity, {C , C } C. 3. The initial symmetry-preserving step is of type 5 or 6. Let C = (P , E ), let (c = $v ) ∈ E , and let e be the embedding of P into T with e($v ) = f1 ($v), as in the proof of Theorem 4. We choose C = (φc←$v (P), φc←$v (L) ∩ V → φc←$v (R) ∩ V ). First, we have C |= C , since we have C C by a straightforward application of Embedding II with the embedding φc←$v . The function ε = f1 −1 ◦ f1 ←f2 ◦ e is well-defined, and is an embedding of P into P with, by construction, ε($v ) = $v. Hence, we can apply Application IV using embedding ε and variable $v to derive {C , C } C. 1. This case analysis completes the proof. 5.4 Inference rules for constant-functional constraints The results from Section 5.1–5.3 only concerned the simulation of direct termination in chases for constant-functional constraints, the initial equalization step in chases for constant constraints, and the initial symmetry-preserving step in chases for functional constraints, respectively. These results are now used as the basis for an induction argument to extend the simulation to the entire chase, as such showing how to translate a chase (constant constraints) or a symmetry-preserving chase (functional constraints) to a derivation. Proposition 6 The set of inference rules Reflexivity, Empty, Augmentation, Transitivity, Decomposition, Inconsistency I and II, Application I-IV, Inconsistent Propagation, and Functional Consequence is complete for the class of constant-functional constraints. Proof Suppose that S is a set of constant-functional constraints and C is a single constantfunctional constraint with S |= C. If initially direct termination is possible, then, by Proposition 3–5, Reflexivity, Empty, Inconsistency I, or Inconsistency II can be used to derive C from S . J. Hellings et al. For chases performing equalization steps we make a case distinction on the type of the constraint C. 1. C is a constant constraint. Let j > 0 be the number of equalization steps in a shortest chase deciding S |= C. Assume, as induction hypothesis, that, if S |= C, with C a constant constraint, and if this can be decided by a chase performing less than j equalization steps, then S C. Choose a chase deciding S |= C in exactly j equalization steps. By Theorem 3, there is a constant constraint C with S |= C and {C , C } C using Decomposition, Application I, Application II, and Inconsistent Propagation such that the result of the initial equalization step is a tableau T , which is equivalent to the initial tableau in any chase deciding S |= C . Hence, the remaining chase of j − 1 equalization steps is a chase deciding S |= C , and, by the induction hypothesis, S C . Since S C and {C , C } C, we conclude S C. 2. C is a functional constraint. By Corollary 2, we only have to consider symmetrypreserving chases. Let j > 0 be the number of symmetry-preserving steps in a shortest symmetry-preserving chase deciding S |= C. Assume, as induction hypothesis, that, if S |= C, with C a functional constraint, and if this can be decided by a symmetry-preserving chase performing less than j symmetry-preserving steps, then S C. Choose a symmetry-preserving chase deciding S |= C in exactly j symmetrypreserving steps. By Theorem 5, there is a functional constraint C with C |= C and {C , C } C using Augmentation, Transitivity, Application III, Application IV, and Functional Consequence such that the result of the initial symmetry-preserving step is a tableau T , which is equivalent to the initial tableau in any chase deciding S |= C . Hence, the remaining chase of j −1 symmetry-preserving steps is a symmetrypreserving chase deciding S |= C , and, by the induction hypothesis, S C . Since S C and {C , C } C, we conclude S C. This case analysis completes the proof. Thus, the provided set of inference rules is sound and complete. Finally, we prove that the provided set of inference rules is 2-ary and that each inference rule is indeed decidable. As a consequence, the provided set of inference rules is an axiomatization for the constantfunctional constraints. Theorem 6 The set of inference rules Reflexivity, Empty, Augmentation, Transitivity, Decomposition, Inconsistency I and II, Application I-IV, Inconsistent Propagation, and Functional Consequence is an axiomatization for the class of constant-functional constraints. Proof Each time we introduced an inference rule, we proved its soundness. Proposition 6 proves that the set of inference rules is complete for deriving constant-functional constraints. Hence, it only remains to prove that the inference rules are at most k-ary, for some k ≥ 0, and that applicability of each inference rule is decidable. The inference rules Reflexivity and Empty are 0-ary. The inference rules Augmentation, Decomposition, Inconsistent Propagation and Functional Consequence are 1-ary. The inference rules Transitivity, Inconsistency I and II, and Application I-IV are 2-ary. Applicability of an inference rule can be decided by checking if patterns are equal, which is straightforward, or by checking if embeddings from pattern P into Q exist that satisfy Implication and axiomatization of functional and constant constraints certain properties. Functionally, each embedding from a pattern P into Q can be specified by the mapping from VP to TQ . Since VP and TQ are finite, we can easily enumerate all possible embeddings and check whether embeddings that satisfy the conditions of an inference rule exist. Hence, checking if an inference rule can be applied is decidable. We observe that the Embedding I and II rules are not part of the axiomatization, as they are not used in the simulation of a chase. Their soundness was only used to simplify certain proofs. Likewise, the Union rule is not part of the axiomatization. By carefully restricting the simulation of chases by inference rules, as presented in Theorem 6, to either chases involving only constant constraints or chases involving only functional constraints, respectively, we can also provide axiomatizations for only the functional constraints and only the constant constraints. Theorem 7 The set of inference rules Reflexivity, Augmentation, Transitivity, Inconsistency I 2 , Application III, and Functional Consequence is an axiomatization for the class of functional constraints. Theorem 8 The set of inference rules Empty, Decomposition, Inconsistency II 3 , Application II and Inconsistent Propagation is an axiomatization for the class of constant constraints. Next, we investigate the complexity of the provided axiomatization for the constantfunctional constraints. Theorem 9 Let S be a set of constant-functional constraints and let C be a single constantfunctional constraint. If S C, then there is a derivation that performs O(|VPTN(C) |) inference steps. Proof Consider the chase deciding S |= C. At each step of the chase, the number of distinct variables in the tableau T decreases by one. If C is a constant constraint, then we initially have |VT | = |VP |. Hence, the chase performs at most |VP | equalization steps and Proposition 6 provides a way to simulate each equalization step by a bounded number of inference steps. If C is a functional constraint, then we initially have |VT | ≤ 2|VP |. Hence, the chase performs at most 2|VP | symmetry-preserving steps and Proposition 6 provides a way to simulate each symmetry-preserving step by a bounded number of inference steps. 6 Related work Many types of constraints have been investigated for the relational model, and among the simplest of these constraints are the functional dependencies [13]. Functional dependencies play an important role in the well-known Boyce-Codd normal form [14] and in relational 2 The Inconsistency I inference rule can be used to derive both functional constraints and constant constraints. In this setting we only use Inconsistency I to derive functional constraints. 3 The Inconsistency II inference rule can be used to derive both constant constraints and functional constraints. In this setting we only use Inconsistency II to derive constant constraints. J. Hellings et al. schema normalization in general. Besides the functional dependencies, many other types of constraints have been investigated, and for the relational model most of these constraints can be categorized as a subclass of the equality-generating and/or tuple-generating dependencies [1, 16, 17, 26, 29]. The constraints studied in this work are all equality-generating. In the following, we shall describe how the constant-functional constraints are related with other classes of equality-generating dependencies. For describing these relationships, we introduce some terminology to define restrictions on the classes of functional, constant, and equality-generating constraints. We say that a constraint C over pattern P is typed if, for every pair of tuples (t1 , . . . , tn ) ∈ P and (u1 , . . . un ) ∈ P, and for every pair of variables ti ∈ VP , uj ∈ VP , we have ti = uj only if i = j . We say that a functional constraint (P, L → R) is constant-free if UP = ∅. We say that an equality-generating constraint (P, E) is constant-free if UP = ∅ and, for every (t = u) ∈ E, t, u ∈ VP . We say that an equality-generating constraint (P, E) is many sorted if it is typed and, for every (t = u) ∈ E, there exist tuples (t1 , . . . , tn ) ∈ P, (u1 , . . . , un ) ∈ P and there exists a j , 1 ≤ j ≤ n, such that t = tj and u = uj . Lastly, we say that an equality-generating or functional constraint C is an n-pattern constraint if |PTN(C)| ≤ n. Using the terminology introduced above, we can describe the relationships between various well-known classes of equality-generating constraints. A summary of these relationships is visualized in Fig. 3. The equality-generating constraints (EGC) and the functional constraints (FC) of Akhtar et al. [3] were both originally introduced for the RDF data model, and have been generalized to relations of arbitrary arity in this work. The equality-generating constraints on relations of arbitrary arity are equivalent to the full equality-generating dependencies of Wijsen [31]. The equality-generating dependencies (EGD) of Beeri and Vardi [7] are equivalent to the constant-free fragment of the equality-generating constraints. In the literature on dependencies for the relational model, one often considers a n-ary relation to be a subset of D1 × · · · × Dn , where Di , 1 ≤ i ≤ n, is a domain of values disjoint from all other domains. As such, many dependencies are many-sorted. The implication dependencies of Fagin [16] are an example, and the equality-generating fragment of these implication dependencies are indeed equivalent to the many-sorted and constant-free equality-generating constraints. Also the well-known functional dependencies (FD) of Codd [13] are many-sorted. It is straightforward to verify that the functional dependencies are also equivalent to the typed, constant-free, 1-pattern functional constraints (1-pattern FD). Using this relationship, it is also straightforward to verify that the functional dependencies are equivalent to the constantfree, many-sorted, 2-pattern equality-generating dependencies. In the literature, several context-dependent generalizations of the functional dependencies have been studied. Among them are the conditional functional dependencies (CFD) of Fan et al. [21]. These dependencies allow the use of context-dependent information in the specification of integrity constraints, and, as such, allow for richer tools to maintain integrity of databases. In a straightforward manner, the conditional functional dependencies in normal form [21] can be shown to be equivalent to the union of the typed, 1-pattern subset of the constant constraints (CD) and the typed, 1-pattern functional constraints. Hence, the constant-functional constraints are a proper untyped generalization of the conditional functional dependencies towards arbitrary patterns in relations. The extended conditional functional dependencies of Bravo et al. [9, 20] have been introduced as an extension of the conditional functional dependencies in which one can also express that the valuation of an attribute (in some context) is restricted to a set of allowed values. Implication and axiomatization of functional and constant constraints Fig. 3 Overview of the various classes of constraints. Directed edges from a class C1 to another class C2 express inclusion of C1 into C2 , meaning that every constraint in C1 is expressible by a set of constraints in C2 Example 11 Consider again the relational schema PI(name, country, cc, phone) of Example 3. The attribute cc gives the country code, and the set of allowed values can be restricted to all existing country calling codes. Such an allowed set of values is however not expressible by a set of equalities that should all hold at the same time, and, hence, these types of constraints are not expressible by constant-functional constraints. Another example of constraints that allow the specification of context-dependent information are the qualified functional dependencies (QFD) of He et al. [24]. The qualified functional dependencies introduce a convenient syntax to specify context-dependent functional dependencies on several relations with equivalent schemas in a straightforward manner. It is straightforward to show that each qualified functional dependency holding on each individual relation can be expressed by a set of typed, 1-pattern functional constraints on the relation, and vice versa. Hence, the qualified functional dependencies are also equivalent to the functional part of the conditional functional dependencies. For the RDF and XML graph data models, a large body of work on the integrity of data focuses on the schema of the data. Examples are RDF Schema [18] and, for the XML data model, DTDs [10] and XSDs [19]. The usage of dependency-like constraints is less common for these data-models although initial steps have been made (e.g., [3, 4, 11, 12, 22, 23, 27, 28, 30, 32]). We repeat that functional constraints were originally introduced for the RDF data model [3, 15]. Besides this obvious relationship with the RDF data model, the concept J. Hellings et al. of applying functional dependencies or functional-like dependencies to a view on the data plays a major role in other proposals for constraints for the RDF and XML data models. A clear and recent example are the XML Constraints based on XML-to-relational mappings of Niewerth et al. [28], who studied functional dependencies over tree patterns. Specifically, the tree patterns using only edges are equivalent to patterns, as we consider them, over the edge relation of XML documents. 7 Conclusions and future work Starting from functional and equality-generating constraints for the RDF data model, we studied constant-functional constraints on arbitrary relations. As our main result, we prove the existence of a sound and complete axiomatization for the constant-functional constraints. This axiomatization can easily be specialized to only deal with functional constraints; hence our results solve a major open problem in the work of Akhtar et al. [3] on functional constraints for the RDF data model. The major result leading to the axiomatization of the constant-functional constraints is that chases for functional constraints can always be normalized to symmetry-preserving chases. We believe that our work provides a promising formal basis for reasoning about constantfunctional constraints. As for future work, several open problems remain. Firstly, for several types of equality-generating constraints, it is known whether Armstrong-like relations [5, 16] exist. An Armstrong-like relation R for a set of constraints S satisfies constraint C if and only if S |= C. For both the equality-generating constraints and the constant-functional constraints, it is unknown under which conditions Armstrong-like relations exist. Secondly, the addition of constant constraints to the functional constraints also introduced the possibility to define a set of inconsistent constraints, such that no non-empty relation can satisfy the constraints. An example is the set {({($a, $b, $c)}, (c1 = $c)), ({($a, $b, $c)}, (c2 = $c))}. Being able to decide consistency of a set of constraints is useful. Inconsistent sets of constraints are not very useful in practice and might even hint at a mistake in the definition of one or more constraints. The complexity to decide consistency of a set of constraints S is unknown, however. Thirdly, the exact complexity of the implication problem for constant-functional constraints is not yet known. However, the relationship with conditional functional dependencies already provides lower bounds for the complexity of the consistency problem and the implication problem: for the conditional functional dependencies, these problems are NP-complete and coNP-complete, respectively [21]. References 1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995) 2. Aho, A.V., Beeri, C., Ullman, J.D.: The theory of joins in relational databases. ACM Trans. Database Syst. 4(3), 297–314 (1979) 3. Akhtar, W., Cortés-Calabuig, Á., Paredaens, J.: Constraints in RDF. In: Semantics in Data and Knowledge Bases, Lecture Notes in Computer Science, vol. 6834, pp. 23–39. Springer (2011) 4. Arenas, M., Libkin, L.: A normal form for XML documents. ACM Trans. Database Syst. 29(1), 195– 232 (2004) Implication and axiomatization of functional and constant constraints 5. Armstrong, W.W.: Dependency structures of data base relationships. In: Information Processing 74, pp. 580–583 (1974) 6. Beeri, C., Fagin, R., Howard, J.H.: A complete axiomatization for functional and multivalued dependencies in database relations. In:Proceedings of the 1977 ACM SIGMOD International Conference on Management of Data, SIGMOD ’77, pp. 47–61 (1977) 7. Beeri, C., Vardi, M.: The implication problem for data dependencies. In: Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 115, pp. 73–85. Springer (1981) 8. Beeri, C., Vardi, M.Y.: A proof procedure for data dependencies. J. ACM 31(4), 718–741 (1984) 9. Bravo, L., Fan, W., Geerts, F., Ma, S.: Increasing the expressivity of conditional functional dependencies without extra complexity. In: ICDE ’08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, pp. 516–525 (2008) 10. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F.: Extensible markup language (XML) 1.0 (fifth edition), W3C recommendation 26 November 2008 (2008). http://www.w3.org/TR/ 2008/REC-xml-20081126/ 11. Buneman, P., Davidson, S., Fan, W., Hara, C., Tan, W.C.: Keys for XML. Comput. Netw. 39(5), 473–487 (2002) 12. Calbimonte, J.P., Porto, F., Keet, C.M.: Functional dependencies in OWL ABOX. In: XXIV Simpósio Brasileiro de Banco de Dados, pp. 16–30 (2009) 13. Codd, E.F.: Relational completeness of data base sublanguages, vol. 987. IBM Research Laboratory, San Jose, California (1972) 14. Codd, E.F.: Recent investigations in relational data base systems. In: Information Processing 74, pp. 1017–1021 (1974) 15. Cortés-Calabuig, A., Paredaens, J.: Semantics of constraints in RDFS. In: Proceedings of the 6th Alberto Mendelzon International Workshop on Foundations of Data Management, pp. 75–90 (2012) 16. Fagin, R.: Horn clauses and database dependencies. J. ACM 29(4), 952–985 (1982) 17. Fagin, R., Vardi, M.Y.: The theory of data dependencies—a survey. In: Mathematics of Information Processing, Proceedings of Symposia in Applied Mathematics, vol. 34, pp. 19–71. American Mathematical Society (1986) 18. Fallside, D.C., Walmsley, P.: RDF schema 1.1, W3C recommendation 25 February 2014 (2014). http:// www.w3.org/TR/2014/REC-rdf-schema-20140225/ 19. Fallside, D.C., Walmsley, P.: XML schema part 0: Primer second edition, W3C recommendation 28 October 2004 (2004). http://www.w3.org/TR/2004/REC-xmlschema-0-20041028/ 20. Fan, W., Geerts, F., Jia, X.: Conditional dependencies: A principled approach to improving data quality. In: Dataspace: The Final Frontier, Lecture Notes in Computer Science, vol. 5588, pp. 8–20. Springer (2009) 21. Fan, W., Geerts, F., Jia, X., Kementsietsidis, A.: Conditional functional dependencies for capturing data inconsistencies. ACM Trans. Database Syst. 33(2), 6:1–6:48 (2008) 22. Hartmann, S., Link, S.: More functional dependencies for XML. In: Advances in Databases and Information Systems, Lecture Notes in Computer Science, vol. 2798, pp. 355–369. Springer (2003) 23. Hartmann, S., Link, S.: Efficient reasoning about a robust XML key fragment. ACM Trans. Database Syst. 34(2), 10:1–10:33 (2009) 24. He, Q., Ling, T.W.: Extending and inferring functional dependencies in schema transformation. In: Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, CIKM ’04, pp. 12–21. ACM (2004) 25. Hellings, J., Gyssens, M., Paredaens, J., Wu, Y.: Implication and axiomatization of functional constraints on patterns with an application to the RDF data model. In: Foundations of Information and Knowledge Systems, Lecture Notes in Computer Science, vol. 8367, pp. 250–269. Springer (2014) 26. Kanellakis, P.C.: Elements of relational database theory. Tech. Rep. CS-89-39, Brown University (1989) 27. Lausen, G., Meier, M., Schmidt, M.: SPARQLing constraints for RDF. In: Proceedings of the 11th International Conference on Extending Database Technology: Advances in Database Technology, EDBT ’08, pp. 499–509 (2008) 28. Niewerth, M., Schwentick, T.: Reasoning about XML constraints based on XML-to-relational mappings. In: Proceedings of the 17th International Conference on Database Theory, ICDT ’14, pp. 72–83 (2014) 29. Vardi, M.Y.: Fundamentals of dependency theory. In: Trends in Theoretical Computer Science, Principles of Computer Science Series, vol. 34, pp. 171–224. Computer Science Press (1987) 30. Vincent, M.W., Liu, J., Mohania, M.: The implication problem for ‘closest node‘ functional dependencies in complete XML documents. J. Comput. Syst. Sci. 78(4), 1045–1098 (2012) 31. Wijsen, J.: Database repairing using updates. ACM Trans. Database Syst. 30(3), 722–768 (2005) 32. Yu, Y., Heflin, J.: Extending functional dependency to detect abnormal data in RDF graphs. In: The Semantic Web ISWC 2011, Lecture Notes in Computer Science, vol. 7031, pp. 794–809. Springer (2011)
© Copyright 2026 Paperzz