PODS 2012
2012 ACM SIGMOD/PODS Conference
Scottsdale, Arizona, USA
A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies
Benny Kimelfeld
IBM Research – Almaden
Deletion Propagation
• Translate a tuple deletion on the view
back to the source relations … properly
• Classic database problem
– Specializing the more general view-update problem
– [Dayal & Bernstein 1982; Cosmadakis & Papadimitriou 1984; Keller 1986; Cui &
Widom 2001; Buneman & Khanna & Tan 2002; Cong & Fan & Geerts 2006; …]
• Renewed motivation: debug/causality for false positives
[K, Vondrák, Williams, 2011]
• Various definitions of “properly” were studied
– Minimize the view side effect (this work!)
• # view tuples lost except the intentional one
– Minimize the source side effect
• # source tuples to delete
• = maximal “responsibility” for an answer [Meliou et al., 2010]
Example: File Access
[Cui & Widom 2001; Buneman et al. 2002]
Access(u,f) :– UserGroup(u,g), GroupFile(g,f)
Access = UserGroup ⋈ GroupFile
UserGroup
user    group
Emma    ai
Emma    db
Olivia  os
Olivia  db
Jacob   ai
GroupFile
group   file
ai      a.txt
ai      b.txt
db      a.txt
db      b.txt
os      a.txt
Access
user    file
Emma    a.txt
Emma    b.txt
Olivia  a.txt
Olivia  b.txt
Jacob   a.txt
Jacob   b.txt
Delete source rows, s.t. Emma won’t access a.txt.
But, maintain maximum access permissions!
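The example can be replayed in a few lines of Python. This is a minimal sketch, not code from the talk: the names `access` and `view_side_effect` are made up for illustration, and the two deletions tried at the end are just two candidates chosen by hand.

```python
# Source relations from the File Access example.
USER_GROUP = {("Emma", "ai"), ("Emma", "db"), ("Olivia", "os"),
              ("Olivia", "db"), ("Jacob", "ai")}
GROUP_FILE = {("ai", "a.txt"), ("ai", "b.txt"), ("db", "a.txt"),
              ("db", "b.txt"), ("os", "a.txt")}

def access(user_group, group_file):
    """Access(u,f) :- UserGroup(u,g), GroupFile(g,f)."""
    return {(u, f) for (u, g) in user_group
            for (g2, f) in group_file if g == g2}

def view_side_effect(deleted_ug, deleted_gf, answer):
    """Answers lost besides the intended one; None if `answer` survives."""
    before = access(USER_GROUP, GROUP_FILE)
    after = access(USER_GROUP - deleted_ug, GROUP_FILE - deleted_gf)
    if answer in after:
        return None
    return before - after - {answer}

a = ("Emma", "a.txt")
# Candidate 1: remove Emma from both of her groups.
lost_1 = view_side_effect({("Emma", "ai"), ("Emma", "db")}, set(), a)
# Candidate 2: remove Emma from ai, and a.txt from db.
lost_2 = view_side_effect({("Emma", "ai")}, {("db", "a.txt")}, a)
```

On this instance the first candidate also costs Emma her access to b.txt, while the second loses nothing besides the intended answer.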
Formal Definitions
Schema S: rel. symbols R1,…,Rm + functional dependencies (FDs) of the form Ri: attribute-set → attribute
Conjunctive Query (CQ) Q (no self joins!):
Q( y1 , y2 , y3 ) :– R1(x1 , y1), R2(x1 ,'ibm'), R3(x2 , y1 , y2 , x3), R4(x4 , y3)
• head variables: y1, y2, y3
• existential variables: x1, x2, x3, x4
• atom: each Ri(…) in the body
Input:
• DB D over S
• Answer a ∈ Q(D) to delete
Solution: E ⊆ D s.t. a ∉ Q(E)
• Side-effect free: Q(E) = Q(D) – {a}
• Optimal: |Q(E)| is maximal
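To make the definitions concrete, the following sketch searches all sub-databases E ⊆ D exhaustively on the File Access instance. It is an illustration only: the encoding of facts as tagged triples and the names `Q` and `optimal_solution` are assumptions of this sketch, and the search is exponential in |D|, so it says nothing about the complexity results of the talk.

```python
from itertools import combinations

# Facts of D, tagged with their relation name.
D = {("UserGroup", "Emma", "ai"), ("UserGroup", "Emma", "db"),
     ("UserGroup", "Olivia", "os"), ("UserGroup", "Olivia", "db"),
     ("UserGroup", "Jacob", "ai"),
     ("GroupFile", "ai", "a.txt"), ("GroupFile", "ai", "b.txt"),
     ("GroupFile", "db", "a.txt"), ("GroupFile", "db", "b.txt"),
     ("GroupFile", "os", "a.txt")}

def Q(db):
    """Access(u,f) :- UserGroup(u,g), GroupFile(g,f)."""
    ug = {(u, g) for (r, u, g) in db if r == "UserGroup"}
    gf = {(g, f) for (r, g, f) in db if r == "GroupFile"}
    return {(u, f) for (u, g) in ug for (g2, f) in gf if g == g2}

def optimal_solution(db, a):
    """Try every E ⊆ D with a ∉ Q(E); keep one that maximizes |Q(E)|."""
    facts = sorted(db)
    best = None
    for k in range(len(facts) + 1):          # k = number of deleted facts
        for deleted in combinations(facts, k):
            E = db - set(deleted)
            if a not in Q(E) and (best is None or len(Q(E)) > len(Q(best))):
                best = E
    return best

a = ("Emma", "a.txt")
E = optimal_solution(D, a)
side_effect_free = Q(E) == Q(D) - {a}
```

Here the optimal solution happens to be side-effect free; in general an optimal solution always exists, while a side-effect-free one need not.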
Complexity Questions
What is the complexity of
• Deciding if a side-effect-free solution exists?
• Finding an optimal solution?
– Or one w/ approximately minimal side effect?
– Or one w/ approximately maximal # surviving answers?
• Not the same [K, Vondrák, Williams, 2011]
Data complexity:
Fixed: Schema S, CQ Q
Input: DB D over S, answer a ∊ Q(D) to delete
Unirelation Algorithm (1Rel): Example
[Buneman et al., 2002]
Access(u,f) :– UserGroup(u,g), GroupFile(g,f)
Access = UserGroup ⋈ GroupFile
UserGroup
user    group
Emma    ai
Emma    db
Olivia  os
Olivia  db
Jacob   ai
GroupFile
group   file
ai      a.txt
ai      b.txt
db      a.txt
db      b.txt
os      a.txt
Access
user    file
Emma    a.txt
Emma    b.txt
Olivia  a.txt
Olivia  b.txt
Jacob   a.txt
Jacob   b.txt
Delete a = (Emma, a.txt)
• Better than the previous candidate ⇒ selected solution
• Recall: there is an even better solution (side-effect free)
1Rel: General Case
Input: DB D over R1, R2, …, Rk (Q has k atoms); undesired a ∈ Q(D)
• Solution i (i=1,…,k): delete from Ri each tuple consistent w/ a
• Select the best of solutions 1,…,k
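The general procedure can be sketched as follows. This is a reading of the slide, not the authors' code. Assumed here: a CQ is encoded as a head tuple plus a list of (relation, variables) atoms without constants; "consistent w/ a" means agreeing with a on every head variable of the atom; "best" means the most surviving answers. The helper names `extend`, `evaluate` and `one_rel` are hypothetical.

```python
def extend(match, variables, row):
    """Extend a partial variable assignment with one tuple; None on conflict."""
    match = dict(match)
    for v, c in zip(variables, row):
        if match.setdefault(v, c) != c:
            return None
    return match

def evaluate(head, atoms, db):
    """Answers of a CQ without self joins or constants (naive nested loops)."""
    matches = [{}]
    for rel, variables in atoms:
        matches = [m2 for m in matches for row in db[rel]
                   if (m2 := extend(m, variables, row)) is not None]
    return {tuple(m[v] for v in head) for m in matches}

def one_rel(head, atoms, db, a):
    """Try one single-relation deletion per atom; return the best (E, Q(E))."""
    target = dict(zip(head, a))
    best = None
    for rel, variables in atoms:
        # Keep only tuples that disagree with `a` on some head variable.
        kept = {row for row in db[rel]
                if any(v in target and target[v] != c
                       for v, c in zip(variables, row))}
        candidate = {**db, rel: kept}
        answers = evaluate(head, atoms, candidate)
        if best is None or len(answers) > len(best[1]):
            best = (candidate, answers)
    return best

DB = {"UserGroup": {("Emma", "ai"), ("Emma", "db"), ("Olivia", "os"),
                    ("Olivia", "db"), ("Jacob", "ai")},
      "GroupFile": {("ai", "a.txt"), ("ai", "b.txt"), ("db", "a.txt"),
                    ("db", "b.txt"), ("os", "a.txt")}}
HEAD = ("u", "f")
ATOMS = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]
E, answers = one_rel(HEAD, ATOMS, DB, ("Emma", "a.txt"))
```

Under these assumptions the sketch selects the UserGroup candidate on the File Access instance, which keeps 4 of the 6 answers. The side-effect-free solution keeps 5, so 1Rel is not optimal for this query.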
Head Domination [K, Vondrák, Williams, 2011]
Q: a CQ over a schema S
G∃[Q]: nodes = atoms(Q), edges = “sharing ≥1 existential var.”
head domination:
∀ C ∊ CC(G∃[Q]) ∃j ∊ atoms(Q) s.t. headVars(C) ⊆ vars(j)
(CC = connected components)
Examples:
Q( y1 , y2) :– R1(x1 , y1) , R2(x1 , y2) , R3(x1 , y1 , y2)
Q( y1 , y2 , y3) :– R1(x1 , y1) , R2(x1 , y2) , R3(y1 , y2) , R4(x2 , y2 , y3)
Q( y1 , y2) :– R1(x , y1) , R2(x , y2) (cf. Access(u,f))
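The head-domination test is purely syntactic, so it is easy to sketch. The function below is an illustration under assumed conventions: atoms are given as tuples of variable names (relation names are irrelevant to the test), and `head_domination` is a made-up name.

```python
def head_domination(head, atoms):
    """Check: every connected component of G∃[Q] has its head variables
    contained in the variables of a single atom."""
    head = set(head)
    atom_vars = [set(vs) for vs in atoms]
    n = len(atoms)
    # Component labels; atoms sharing an existential variable are merged.
    label = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if (atom_vars[i] & atom_vars[j]) - head:
                old, new = label[j], label[i]
                label = [new if l == old else l for l in label]
    for c in set(label):
        head_vars = set().union(*(atom_vars[i] & head
                                  for i in range(n) if label[i] == c))
        if not any(head_vars <= vs for vs in atom_vars):
            return False
    return True

q1 = head_domination(("y1", "y2"),
                     [("x1", "y1"), ("x1", "y2"), ("x1", "y1", "y2")])
q2 = head_domination(("y1", "y2", "y3"),
                     [("x1", "y1"), ("x1", "y2"), ("y1", "y2"),
                      ("x2", "y2", "y3")])
q3 = head_domination(("y1", "y2"), [("x", "y1"), ("x", "y2")])
```

The first two example queries pass the test and the third (the Access query) fails it, because no single atom contains both y1 and y2.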
Previous Dichotomy Theorem [KVW 2011]
Let Q be a CQ over a schema S (no self joins)
[K, Vondrák, Williams, 2011], no FDs:
• Q has head domination ⇒ 1Rel returns an optimal solution (in PTime)
• otherwise ⇒ ∃side-effect-free is NP-complete; NP-hard to find an (αQ-approx.) optimal solution
Examples:
Q( y1 , y2) :– R1(x1 , y1) , R2(x1 , y2) , R3(x1 , y1 , y2): PTime (1Rel)
Q( y1 , y2 , y3) :– R1(x1 , y1) , R2(x1 , y2) , R3(y1 , y2) , R4(x2 , y2 , y3): PTime (1Rel)
Q( y1 , y2) :– R1(x , y1) , R2(x , y2) (cf. Access(u,f)): NP-hard
Access Example Revisited
Delete (Emma, a.txt)
Access = UserGroup ⋈ GroupFile
UserGroup
user    group
Emma    ai
Emma    db
Olivia  os
Olivia  db
Jacob   ai
GroupFile
group   file
ai      a.txt
ai      b.txt
db      a.txt
db      b.txt
os      a.txt
Access
user    file
Emma    a.txt
Emma    b.txt
Olivia  a.txt
Olivia  b.txt
Jacob   a.txt
Jacob   b.txt
Without FDs: NP-hard
With user → group: PTime
With user ← group: PTime
With group → file: PTime
With group ← file: PTime
Every nontrivial set of FDs brings the problem to PTime
Additional Examples
Q(y , y1 , y2) :– R1(y1 , x1) , R(x1 , y , x2) , R2(y2 , x2): NP-hard
Q(y , y1 , y2) :– R1(x1 , y1) , R(x1 , y , x2) , R2(x2 , y2): PTime
Q(y , y1 , y2) :– R1(x1 , y1) , R(x1 , y , x2) , R2(x2 , y2): NP-hard
Dichotomy with FDs
Let Q be a CQ over a schema S (no self joins)
[K, Vondrák, Williams, 2011], no FDs:
• Q has head domination ⇒ 1Rel returns an optimal solution (in PTime)
• otherwise ⇒ ∃side-effect-free is NP-complete; NP-hard to find an (αQ-approx.) optimal solution
This paper (FDs):
• Q+ has functional head dom. ⇒ 1Rel* returns an optimal solution (in PTime)
– 1Rel*: remove a tuple only if it is used for the undesired answer
• otherwise ⇒ ∃side-effect-free is NP-complete; NP-hard to find an (αQ-approx.) optimal solution
Depending on the CQ and FDs, the problem is either straightforward or hard!
FDs Among Variables
Definition:
CQ Q over schema S, U, V ⊆ variables(Q)
U → V: ∀ D ∈ db(S), ∀ m1, m2 ∈ hom(Q→D): m1=m2 on U ⇒ m1=m2 on V
Example:
Access(u,f) :– UserGroup(u,g), GroupFile(g,f)
FDs: user → group, group → file
Implied FDs among variables: u→g, g→f, u→f, {u,g} → f
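The four variable-level FDs listed for the Access query can be reproduced with the textbook attribute-closure procedure. This is a sketch under an assumption: it treats U → V as "V lies in the closure of U under the FDs that the schema induces on the variables". That test is sound for the definition above, and whether it is also complete in every case is not claimed here. The names `closure` and `implies` are illustrative.

```python
def closure(variables, fds):
    """Smallest superset of `variables` closed under FDs (lhs_set, rhs_var)."""
    closed = set(variables)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed

def implies(U, V, fds):
    """Sufficient test for U → V among the variables of the query."""
    return set(V) <= closure(U, fds)

# Access(u,f) :- UserGroup(u,g), GroupFile(g,f)
# user → group becomes u → g; group → file becomes g → f.
FDS = [({"u"}, "g"), ({"g"}, "f")]

derived = {
    "u→g": implies({"u"}, {"g"}, FDS),
    "g→f": implies({"g"}, {"f"}, FDS),
    "u→f": implies({"u"}, {"f"}, FDS),
    "{u,g}→f": implies({"u", "g"}, {"f"}, FDS),
}
```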
The CQ Q+
Tractability Condition: Q+ has functional head domination
Definition:
CQ Q over schema S, U, V ⊆ variables(Q)
U → V: ∀ D ∈ db(S), ∀ m1, m2 ∈ hom(Q→D): m1=m2 on U ⇒ m1=m2 on V
Q+ : add to Q’s head every x s.t. headVars → x
Example:
Access(u,f) :– UserGroup(u,g), GroupFile(g,f)
group ← file ⇒ g ← {u,f}
Access+(u,g,f) :– UserGroup(u,g), GroupFile(g,f)
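The construction of Q+ amounts to closing the head under the variable-level FDs. A minimal sketch, assuming the FDs are already stated over variables as (lhs, rhs) pairs and that the closure procedure is an adequate stand-in for "headVars → x". The function name `q_plus_head` and the choice to list head variables in order of first appearance in the body are assumptions of this sketch.

```python
def q_plus_head(head, atoms, fds):
    """Head of Q+: every variable functionally determined by the head of Q."""
    closed = set(head)
    while True:
        new = {rhs for lhs, rhs in fds if set(lhs) <= closed} - closed
        if not new:
            break
        closed |= new
    # List the variables in order of first appearance in the body.
    ordered = []
    for _, variables in atoms:
        for v in variables:
            if v in closed and v not in ordered:
                ordered.append(v)
    return tuple(ordered)

ATOMS = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]
# group ← file, i.e. f → g on the variables of the Access query.
plus_head = q_plus_head(("u", "f"), ATOMS, [(("f",), "g")])
# Without FDs the head is unchanged.
same_head = q_plus_head(("u", "f"), ATOMS, [])
```

With group ← file the head grows to (u, g, f), matching Access+ on the slide.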
Functional Head Domination
Tractability Condition: Q+ has functional head domination
Q: a CQ over a schema S
G∃[Q]: nodes = atoms(Q), edges = “sharing ≥1 existential var.”
head domination:
∀ C∈CC(G∃[Q]) ∃j ∊ atoms(Q), s.t. vars(j) ⊇ headVars(C)
functional head domination:
∀ C∈CC(G∃[Q]) ∃j ∊ atoms(Q), s.t. vars(j) → headVars(C)
Example:
Access(u,f) :– UserGroup(u,g), GroupFile(g,f)
{u,g} → {u,f} ⇐ group → file
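Functional head domination replaces the containment test by an FD test. The sketch below makes the same assumptions as before: atoms are tuples of variable names, FDs are given over variables, and the closure procedure approximates vars(j) → headVars(C). The names are illustrative, not from the talk.

```python
def closure(variables, fds):
    """Variables determined by `variables` under FDs (lhs_tuple, rhs_var)."""
    closed = set(variables)
    while True:
        new = {rhs for lhs, rhs in fds if set(lhs) <= closed} - closed
        if not new:
            return closed
        closed |= new

def functional_head_domination(head, atoms, fds):
    """Each component's head variables must be determined by some atom."""
    head = set(head)
    atom_vars = [set(vs) for vs in atoms]
    n = len(atoms)
    label = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if (atom_vars[i] & atom_vars[j]) - head:   # shared existential var
                old, new = label[j], label[i]
                label = [new if l == old else l for l in label]
    for c in set(label):
        head_vars = set().union(*(atom_vars[i] & head
                                  for i in range(n) if label[i] == c))
        if not any(head_vars <= closure(vs, fds) for vs in atom_vars):
            return False
    return True

ACCESS = [("u", "g"), ("g", "f")]
with_g_to_f = functional_head_domination(("u", "f"), ACCESS, [(("g",), "f")])
no_fds = functional_head_domination(("u", "f"), ACCESS, [])
# group ← file: Q itself fails, but Q+ = Access+(u,g,f) passes.
q_with_f_to_g = functional_head_domination(("u", "f"), ACCESS, [(("f",), "g")])
q_plus_with_f_to_g = functional_head_domination(("u", "g", "f"), ACCESS,
                                                [(("f",), "g")])
```

The last two calls illustrate why the condition is stated on Q+ rather than on Q: with group ← file, the test fails on Access(u,f) but holds trivially on Access+(u,g,f), which has no existential variables.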
Examples
Tractability Condition:
Q+ has functional head domination
Q( y , y1 , y2) :– R1(x1 , y1) , R(x1 , y , x2) , R2(x2 , y2)
NP-hard
Q(y , y1 , y2) :– R1(x1 , y1) , R(x1 , y , x2) , R2(x2 , y2)
{y , y1 , y2} → x2
Q+(y , y1 , y2, x2) :– R1(x1 , y1) , R(x1 , y , x2) , R2(x2 , y2)
PTime (1Rel*)
Example: Key-Preserving Views
Tractability Condition:
Q+ has functional head domination
Theorem [Cong, Fan, Geerts, 2006]:
Q preserves keys* ⇒ deletion propagation in PTime
For CQs w/o self joins, follows directly from our positive side:
Q preserves keys
⇒ Q+ has no existential vars ⇒ G∃[Q+] has no edges
⇒ Q+ trivially has functional head domination
(every connected component is a node, dominated by itself…)
⇒ 1Rel* returns an optimal solution
* Each relation has a key; none of
the key attributes are projected out
About the Proof
• The positive side is fairly simple
– … once the tractability condition is found
• The negative side is intricate
– Reduction from the special case of the Access CQ
– Challenge: simulating Access(u,f) by an instance that
satisfies all the FDs
– Central concept: graph separation on the variable
graph of the CQ
Q(y1 , y2) :– R1(y1 , x) , R2(x , y2)
→
Q'(y1 , y2) :– R1(y1 , x1 , x) , R2(x , x2 , y2) , R3(x1 , x2)
Conclusions & Ongoing Work
• Studied deletion propagation in the presence of
functional dependencies
• Established a dichotomy in complexity:
– PTime by a straightforward algorithm vs.
– Hardness (of approximation)
• Generalizes previously established special
cases: no FDs, key-preserving views
• Ongoing work: deletion of multiple answers
– Preview: trichotomy
• Straightforward
• Hard but approximable (by a constant factor)
• Hard to approximate
Questions?