
Staging User Feedback toward Rapid Conflict Resolution in Data Fusion
Romila Pradhan*, Siarhei Bykau✝, Sunil Prabhakar*
*Purdue University, ✝Bloomberg L.P.

Fusing data from multiple sources

Data Item     | S1        | S2      | S3       | S4
Zootopia      |           | Howard  | Spencer  | Spencer
Kung Fu Panda | Stevenson |         | Nelson   |
Inside Out    |           | leFauve | Docter   |
Finding Dory  |           |         |          | Stanton
Minions       | Coffin    | Renaud  |          |
Rio           | Jones     |         | Saldanha |

Data fusion systems

A data fusion model such as ACCU [1] takes the conflicting claims above as input and iteratively computes source accuracies and the correctness of claims, each in terms of the other.

Data Item     | Correctness of claims
Zootopia      | Howard (0.000), Spencer (1.000)
Kung Fu Panda | Stevenson (0.015), Nelson (0.985)
Inside Out    | leFauve (0.001), Docter (0.999)
Finding Dory  | Stanton (1.000)
Minions       | Coffin (0.921), Renaud (0.079)
Rio           | Jones (0.015), Saldanha (0.985)

Source | Accuracy
S1     | 0.317
S2     | 0.027
S3     | 0.992
S4     | 1.000

[1] Xin Luna Dong, Laure Berti-Equille, Divesh Srivastava. Data Fusion: Resolving Conflicts from Multiple Sources. WAIM 2013.

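To make the iterative computation concrete, here is a toy Python sketch in the spirit of ACCU [1] on the running example. It is illustrative only: the claim-to-source assignment follows the table above, a simplified accuracy-weighted vote stands in for ACCU's vote-count formula, and the printed numbers therefore will not match the table exactly.

    # Toy fusion loop: alternate between claim correctness and source accuracy.
    claims = {  # {data item: {source: claimed value}}, the running example
        "Zootopia":      {"S2": "Howard", "S3": "Spencer", "S4": "Spencer"},
        "Kung Fu Panda": {"S1": "Stevenson", "S3": "Nelson"},
        "Inside Out":    {"S2": "leFauve", "S3": "Docter"},
        "Finding Dory":  {"S4": "Stanton"},
        "Minions":       {"S1": "Coffin", "S2": "Renaud"},
        "Rio":           {"S1": "Jones", "S3": "Saldanha"},
    }

    def fuse(claims, iterations=20):
        sources = {s for votes in claims.values() for s in votes}
        accuracy = {s: 0.8 for s in sources}          # uniform initialization
        for _ in range(iterations):
            # 1) correctness of claims from accuracy-weighted votes
            correctness = {}
            for item, votes in claims.items():
                weight = {}
                for src, value in votes.items():
                    weight[value] = weight.get(value, 0.0) + accuracy[src]
                total = sum(weight.values())
                correctness[item] = {v: w / total for v, w in weight.items()}
            # 2) source accuracy = average correctness of the source's claims
            for src in sources:
                probs = [correctness[item][value]
                         for item, votes in claims.items()
                         for s, value in votes.items() if s == src]
                accuracy[src] = sum(probs) / len(probs)
        return correctness, accuracy

    correctness, accuracy = fuse(claims)
    print(accuracy)  # S4 and S3 come out as the most accurate sources, S2 as the least
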
Comparison with ground truth

Data Item     | Truth     | Correctness of claims (ACCU)
Zootopia      | Howard    | Howard (0.000), Spencer (1.000)
Kung Fu Panda | Stevenson | Stevenson (0.015), Nelson (0.985)
Inside Out    | Docter    | leFauve (0.001), Docter (0.999)
Finding Dory  | Stanton   | Stanton (1.000)
Minions       | Coffin    | Coffin (0.921), Renaud (0.079)
Rio           | Saldanha  | Jones (0.015), Saldanha (0.985)

Despite its high confidence, the fused result is wrong for Zootopia and Kung Fu Panda: the true directors (Howard, Stevenson) are assigned near-zero correctness.

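Later slides track how far the fused result is from the ground truth. A minimal sketch of one plausible such metric, assuming "distance to ground truth" means the percentage of data items whose highest-probability claim disagrees with the truth (the paper's exact definition may differ):

    def distance_to_ground_truth(correctness, truth):
        """correctness: {item: {claim: probability}}; truth: {item: true claim}."""
        wrong = sum(1 for item, probs in correctness.items()
                    if max(probs, key=probs.get) != truth[item])
        return 100.0 * wrong / len(correctness)

    correctness = {
        "Zootopia":      {"Howard": 0.000, "Spencer": 1.000},
        "Kung Fu Panda": {"Stevenson": 0.015, "Nelson": 0.985},
        "Inside Out":    {"leFauve": 0.001, "Docter": 0.999},
        "Finding Dory":  {"Stanton": 1.000},
        "Minions":       {"Coffin": 0.921, "Renaud": 0.079},
        "Rio":           {"Jones": 0.015, "Saldanha": 0.985},
    }
    truth = {"Zootopia": "Howard", "Kung Fu Panda": "Stevenson", "Inside Out": "Docter",
             "Finding Dory": "Stanton", "Minions": "Coffin", "Rio": "Saldanha"}
    print(distance_to_ground_truth(correctness, truth))  # 33.3...: 2 of 6 items are wrong
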
Involve the User

The user validates data items; the resulting labels (user feedback) are fed back to the data fusion model, which recomputes the correctness of claims.

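A minimal sketch of this loop; the names (fuse, rank, oracle) are illustrative stand-ins, not the paper's API. fuse stands in for a fusion routine that also accepts the user's labels, rank for one of the strategies introduced next, and oracle plays the role of the user:

    def feedback_loop(claims, oracle, fuse, rank, budget):
        """claims: {item: {source: value}}; oracle: item -> validated true value."""
        validated = {}                                   # user-provided labels so far
        correctness, accuracy = fuse(claims, validated)
        for _ in range(budget):
            candidates = [d for d in claims if d not in validated]
            if not candidates:
                break
            item = rank(candidates, claims, correctness, accuracy)   # next item to validate
            validated[item] = oracle(item)                           # ask the user
            correctness, accuracy = fuse(claims, validated)          # re-fuse with feedback
        return correctness, accuracy, validated
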
How to be most effective with user feedback?

This talk

4 ranking strategies
  Item-level ranking: Query-by-committee (QBC), Uncertainty Sampling (US)
  Holistic ranking: Maximum Expected Utility (MEU), Approximate MEU

Feedback errors (non-expert users)
  • Confidence
  • Error-rate
  • Conflicting feedback

Evaluation
  [Plot: Δ distance to ground truth (%) vs. data items validated (%), comparing Random, QBC, US, ApproxMEU, MEU, and GUB]

Query-by-committee (QBC)

QBC prioritizes data items on which the sources disagree over items on which most sources agree.

Data Item     | S1        | S2      | S3       | S4
Zootopia      |           | Howard  | Spencer  | Spencer
Kung Fu Panda | Stevenson |         | Nelson   |
Inside Out    |           | leFauve | Docter   |
Finding Dory  |           |         |          | Stanton
Minions       | Coffin    | Renaud  |          |
Rio           | Jones     |         | Saldanha |

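A minimal QBC-style ranking sketch using vote entropy over the sources' claims (illustrative; the paper's committee formulation may differ in detail):

    import math

    claims = {  # {data item: {source: claimed value}}, the running example
        "Zootopia":      {"S2": "Howard", "S3": "Spencer", "S4": "Spencer"},
        "Kung Fu Panda": {"S1": "Stevenson", "S3": "Nelson"},
        "Inside Out":    {"S2": "leFauve", "S3": "Docter"},
        "Finding Dory":  {"S4": "Stanton"},
        "Minions":       {"S1": "Coffin", "S2": "Renaud"},
        "Rio":           {"S1": "Jones", "S3": "Saldanha"},
    }

    def vote_entropy(votes):
        """Entropy of the sources' votes for one data item."""
        counts = {}
        for v in votes.values():
            counts[v] = counts.get(v, 0) + 1
        total = sum(counts.values())
        return -sum(c / total * math.log2(c / total) for c in counts.values())

    # Items with the most disagreement among sources are validated first.
    ranking = sorted(claims, key=lambda d: vote_entropy(claims[d]), reverse=True)
    print(ranking)  # items with a 1-1 split rank above Zootopia and Finding Dory
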
Uncertainty Sampling (US)

US prioritizes the data items whose claim correctness, as estimated by the fusion model, is most uncertain.

Data Item     | Correctness of claims
Zootopia      | Howard (0.000), Spencer (1.000)
Kung Fu Panda | Stevenson (0.015), Nelson (0.985)
Inside Out    | leFauve (0.001), Docter (0.999)
Finding Dory  | Stanton (1.000)
Minions       | Coffin (0.921), Renaud (0.079)
Rio           | Jones (0.015), Saldanha (0.985)

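A minimal uncertainty-sampling sketch that ranks data items by the entropy of their fused claim probabilities (illustrative):

    import math

    correctness = {  # {data item: {claim: probability of being true}}, from the fusion model
        "Zootopia":      {"Howard": 0.000, "Spencer": 1.000},
        "Kung Fu Panda": {"Stevenson": 0.015, "Nelson": 0.985},
        "Inside Out":    {"leFauve": 0.001, "Docter": 0.999},
        "Finding Dory":  {"Stanton": 1.000},
        "Minions":       {"Coffin": 0.921, "Renaud": 0.079},
        "Rio":           {"Jones": 0.015, "Saldanha": 0.985},
    }

    def entropy(probs):
        """Shannon entropy of a claim-probability distribution."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # The most uncertain data item (highest entropy) is validated first.
    ranking = sorted(correctness, key=lambda d: entropy(correctness[d].values()), reverse=True)
    print(ranking)  # Minions (0.921 / 0.079) first, then Kung Fu Panda and Rio, ...
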
Implication of a validation

Validating a single data item does more than settle that item: it changes the estimated accuracies of the sources that make claims on it, and those accuracies in turn change the correctness of claims on other, unvalidated data items (see the running example above).

Ideal utility function

The ideal utility of a validation is the average correctness that the fusion model assigns to the true claims after incorporating the feedback. Computing it requires a truth function over the data, i.e., the very ground truth the fusion model is trying to recover.

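Written out, in notation that is mine rather than necessarily the paper's: if v*_d is the true claim of data item d and P(v*_d) its correctness after the feedback is incorporated, then

    U_{\mathrm{ideal}} \;=\; \frac{1}{|D|} \sum_{d \in D} P(v^{*}_{d})
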
Practical utility function

Since the truth is unknown, the utility is instead defined over the correctness of all claims, using the entropies of all data items: the entropy utility function. Lower total entropy means less uncertainty, and hence higher utility.

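One natural way to write the entropy utility (again my notation; the paper's sign and normalization may differ): with V_d the candidate claims of data item d,

    U_{\mathrm{entropy}} \;=\; -\sum_{d \in D} H(d),
    \qquad H(d) \;=\; -\sum_{v \in V_d} P(v)\,\log P(v)
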
Maximum Expected Utility (MEU)

 Value of perfect information: for a candidate data item, weigh the entropy utility that would result if each of its claims were true by the probability of that claim being true.
 The best alternative in the absence of ground truth.

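In symbols (my reconstruction of the standard value-of-information form), the next data item to validate is the one maximizing

    EU(d) \;=\; \sum_{v \in V_d} P(v)\; U_{\mathrm{entropy}}\bigl(\text{fusion output after validating } d = v\bigr),
    \qquad d_{\mathrm{next}} \;=\; \arg\max_{d}\, EU(d)
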
Approximate-MEU

• Key idea: propagation of changes
  correctness of claims of the validated data item
  → accuracies of its sources
  → correctness of claims of unvalidated data items
• No need to run the fusion for every candidate claim: this removes the bottleneck of MEU, its repeated iterative fusion computations.

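A rough Python sketch of the propagation idea. It is illustrative only: a simplified accuracy-weighted vote stands in for the actual fusion model, the per-source claim count (n_claims) is an assumed piece of bookkeeping, and the paper's update rules will differ in detail.

    def propagate(claims, correctness, accuracy, n_claims, item, value):
        """Estimate the effect of validating `item` = `value` without re-running fusion.

        claims:      {item: {source: claimed value}}
        correctness: {item: {claim: probability}}   (current fusion output)
        accuracy:    {source: accuracy}             (current fusion output)
        n_claims:    {source: number of claims made by that source}
        """
        new_correctness = {d: dict(p) for d, p in correctness.items()}
        new_accuracy = dict(accuracy)

        # Step 1: the validated item's claims become certain.
        for v in new_correctness[item]:
            new_correctness[item][v] = 1.0 if v == value else 0.0

        # Step 2: shift the accuracy of each source claiming on the validated item
        # by the change in that one claim's probability (accuracy = average here).
        for src, v in claims[item].items():
            delta = new_correctness[item][v] - correctness[item][v]
            new_accuracy[src] = accuracy[src] + delta / n_claims[src]

        # Step 3: one accuracy-weighted voting pass over the other items touched by
        # the affected sources -- no iterative re-fusion.
        affected = set(claims[item])
        for d, votes in claims.items():
            if d == item or not affected & set(votes):
                continue
            weight = {}
            for src, v in votes.items():
                weight[v] = weight.get(v, 0.0) + new_accuracy[src]
            total = sum(weight.values()) or 1.0
            new_correctness[d] = {v: w / total for v, w in weight.items()}
        return new_correctness, new_accuracy
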
Users can be wrong

o Honest but unsure user: e.g., only 80% certain about a claim
o Error-rate of user: e.g., the user is correct 85% of the time
o Conflicting feedback from a crowd of workers: e.g., 10 workers split their votes across Claim1, Claim2, Claim3 as 6/10, 3/10, 1/10

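A small sketch of one way to fold such imperfect feedback into a data item's claim probabilities: a Bayesian-style update weighted by a reliability value combining the user's confidence and error-rate (the paper's feedback model may differ). Conflicting crowd votes can then be applied one worker at a time:

    def apply_noisy_feedback(probs, labeled_claim, reliability):
        """probs: {claim: probability} for one data item; returns the updated distribution."""
        k = len(probs)
        updated = {}
        for claim, p in probs.items():
            if claim == labeled_claim:
                updated[claim] = reliability * p
            else:
                # a wrong label is assumed equally likely to point at any other claim
                updated[claim] = (1.0 - reliability) / max(k - 1, 1) * p
        total = sum(updated.values()) or 1.0
        return {c: v / total for c, v in updated.items()}

    # Conflicting crowd feedback: apply each worker's vote with that worker's reliability.
    probs = {"Claim1": 1 / 3, "Claim2": 1 / 3, "Claim3": 1 / 3}
    votes = {"Claim1": 6, "Claim2": 3, "Claim3": 1}          # 10 workers, split 6/3/1
    for claim, n in votes.items():
        for _ in range(n):
            probs = apply_noisy_feedback(probs, claim, reliability=0.85)
    print(probs)  # probability mass concentrates on Claim1, the majority vote
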
Real-world datasets

        | Books [1] | FlightsDay [2] | Population [3] | Flights [2]
Items   | 1263      | 5836           | 40696          | 121567
Sources | 894       | 38             | 2545           | 38
Claims  | 24303     | 80452          | 46734          | 1931701

Feedback Simulation
o Books: silver standard provided in [4]
o Flight information: data provided by carrier websites considered ground truth
o Population: manually identified the true claim for data items having multiple claims

1. X. L. Dong, L. Berti-Equille, and D. Srivastava. Integrating conflicting data: The role of source dependence. PVLDB, 2009.
2. X. Li, X. L. Dong, K. Lyons, W. Meng, and D. Srivastava. Truth finding on the deep web: Is the problem solved? PVLDB, 2012.
3. J. Pasternack and D. Roth. Knowing what to believe (when you already know something). COLING, 2010.
4. http://lunadong.com/fusionDataSets.htm

Competing methods

o Item-level ranking methods
   QBC / US
o Decision-theoretic ranking methods
   MEU / Approx-MEU
   Greedy Upper Bound (GUB): uses a utility based on the ground truth
o Random
   treats all data items as equally beneficial

Large number of sources, few claims: holistic ranking

[Plots: results on the Books and Population datasets]

Large number of claims, few sources: either QBC or holistic ranking

[Plots: results on the FlightsDay and Flights datasets]

Contributions

o Integrated user feedback to improve the performance of existing data fusion systems
o Designed strategies to generate an effective ordering for validating claims
o Proposed a scalable decision-theoretic solution for iterative fusion
o Explored imperfect feedback scenarios
o Evaluation on real-world datasets confirmed that guided feedback rapidly increases the effectiveness of data fusion