SView Team Report

.nju.edu.cn
Integrating Class Hierarchies
Yuzhong Qu
NJVR
ws .nju.edu.cn
2,996 vocabularies
From 261 PLDs (many are from w3.org)
455,718 terms
396,023 classes and 59,868 properties (many are in the YAGO namespace)
Instantiation found for
115,707 classes (29.2%), e.g. foaf:Person
25,963 properties (43.4%), e.g. dc:creator
1,874 vocabularies (62.6%)
Select Vocabulary
Class and property
Instantiated classes (and their ancestors)
The amount of instantiation, e.g. k ≥ 10 (100?)
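A minimal sketch of this selection step, assuming the threshold means "at least k instances" and that the hierarchy is available as a child-to-parents map. The function name, the `ex:` classes and the counts are hypothetical toy data, not figures from NJVR.

```python
def select_classes(instance_counts, parents, k=10):
    """Return the classes with at least k instances, plus all their ancestors."""
    selected = set()
    stack = [c for c, n in instance_counts.items() if n >= k]
    while stack:
        c = stack.pop()
        if c in selected:
            continue
        selected.add(c)
        # Walk upwards so every ancestor of an instantiated class is kept.
        stack.extend(parents.get(c, ()))
    return selected


# Hypothetical toy data: instance counts and direct superclasses.
counts = {"foaf:Person": 120, "ex:Student": 4, "ex:Robot": 15}
parents = {
    "foaf:Person": ["foaf:Agent"],
    "ex:Student": ["foaf:Person"],
    "ex:Robot": ["foaf:Agent"],
}
```

Raising k from 10 to 100 in this toy example drops `ex:Robot` but still keeps `foaf:Agent` as an ancestor of `foaf:Person`.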
Instantiated Class Hierarchy
[Figure: instantiated class hierarchy with classes A, C, D and instances e1, e2]
Homomorphism
Let $M = \{S_1, S_2, \ldots\}$ be a partially ordered set (or poset),
and likewise $N = \{C_1, C_2, \ldots\}$.
Let $H: M \to N$ be a functional relation from M to N (partial?) such that
$S_i \le S_j \Rightarrow H(S_i) \le H(S_j)$
Note
Merging class hierarchies (taxonomies)
Abstractive summary of a given class hierarchy
$|\mathrm{Range}(H)| \le K$
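The order-preservation condition can be checked mechanically. The sketch below assumes each poset is given as a child-to-parents map (so x ≤ y means y is x or an ancestor of x); the helper names and the toy hierarchies are hypothetical.

```python
def ancestors_or_self(x, parents):
    """All y with x <= y, where the order is given by child -> parents links."""
    seen, stack = set(), [x]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def is_order_preserving(H, parents_M, parents_N):
    """Check S_i <= S_j  =>  H(S_i) <= H(S_j) wherever H is defined (H may be partial)."""
    for si in H:
        for sj in ancestors_or_self(si, parents_M):
            if sj in H and H[sj] not in ancestors_or_self(H[si], parents_N):
                return False
    return True


# Hypothetical source hierarchy M and target hierarchy N.
parents_M = {"Student": ["Person"], "Person": ["Agent"], "Robot": ["Agent"]}
parents_N = {"Human": ["Thing"], "Machine": ["Thing"]}
H_good = {"Student": "Human", "Person": "Human", "Agent": "Thing", "Robot": "Machine"}
H_bad = {"Student": "Machine", "Person": "Human", "Agent": "Thing", "Robot": "Machine"}
```

Here `H_good` has a range of three classes, so it would satisfy the size bound for any K ≥ 3; `H_bad` breaks the condition because Student ≤ Person but Machine is not below Human.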
Distance
[Figure: diagram of H relating S and C]
$\|H\| = \max_i \{\mathrm{dist}(S_i, H(S_i))\}$
$= \max_i \{1 - p(S_i, H(S_i))\}$
$\|I\| = 0$
$\|H \circ F\| \le \|H\| + \|F\|$
$\max_k \{\mathrm{dist}(S_k, C) \mid H(S_k) = C\}$
$\frac{1}{m} \sum_i \mathrm{dist}(S_i, H(S_i))$
$\sum_i p(S_i) \cdot \mathrm{dist}(S_i, H(S_i))$
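A small sketch comparing three ways of aggregating dist(S_i, H(S_i)) over a mapping H: the maximum, the plain average, and the p-weighted sum. The slides do not fix dist or p, so the code assumes a hypothetical dist (number of edges from a class up to its image in a tree) and hypothetical weights.

```python
def steps_up(s, c, parent):
    """Hypothetical dist: number of edges from s up to its ancestor c."""
    steps = 0
    while s != c:
        s = parent[s]  # raises KeyError if c is not an ancestor of s
        steps += 1
    return steps


def aggregate_distances(H, parent, p):
    """Return (max, mean, p-weighted sum) of dist(S, H(S)) over all S in H."""
    dists = {s: steps_up(s, c, parent) for s, c in H.items()}
    worst = max(dists.values())
    mean = sum(dists.values()) / len(dists)
    weighted = sum(p[s] * d for s, d in dists.items())
    return worst, mean, weighted


# Hypothetical tree (child -> parent), weights and mapping.
parent = {"Person": "Agent", "Student": "Person", "Robot": "Agent"}
p = {"Student": 0.5, "Person": 0.2, "Robot": 0.2, "Agent": 0.1}
H = {"Student": "Agent", "Person": "Person", "Robot": "Agent", "Agent": "Agent"}
identity = {c: c for c in p}
```

With these toy values the identity mapping scores zero under all three aggregates, which matches the requirement that the identity has norm 0.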
Merge
Summary of instances (class hierarchy)
$|\mathrm{Range}(H)| \le K$
$S_i \le H(S_i)$ ?
Minimize
$\sum_i p(S_i) \cdot \mathrm{dist}(S_i, H(S_i))$
OR
Maximize
$\sum_i p(S_i) \cdot p(S_i, H(S_i))$
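For very small hierarchies the minimization variant can be solved by exhaustive search, which is useful as a reference when evaluating heuristics. The sketch below assumes a tree given as child-to-parent links, maps every class to its nearest selected ancestor (or itself), and uses a hypothetical dist equal to the number of edges climbed; all names and weights are illustrative.

```python
from itertools import combinations


def nearest_selected(s, parent, chosen):
    """Walk up from s; return (class, steps) for the first chosen class, else None."""
    steps = 0
    while s is not None:
        if s in chosen:
            return s, steps
        s, steps = parent.get(s), steps + 1
    return None


def best_summary(parent, p, K):
    """Try every set of at most K classes; return (cost, H) minimizing sum p(S) * dist."""
    classes = list(p)
    best = None
    for size in range(1, K + 1):
        for chosen in combinations(classes, size):
            chosen = set(chosen)
            H, cost = {}, 0.0
            for s in classes:
                hit = nearest_selected(s, parent, chosen)
                if hit is None:
                    break  # some class has no selected ancestor: infeasible
                H[s] = hit[0]
                cost += p[s] * hit[1]
            else:
                if best is None or cost < best[0]:
                    best = (cost, H)
    return best


# Hypothetical tree and weights.
parent = {"Agent": None, "Person": "Agent", "Student": "Person", "Robot": "Agent"}
p = {"Student": 0.5, "Person": 0.2, "Robot": 0.2, "Agent": 0.1}
```

The search is exponential in K, so it only serves as a baseline; in the toy tree with K = 2 it keeps the root and the heaviest leaf.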
Instance category and taxonomy
[Figure: class taxonomy with classes A, C, D and instances e1, e2]
Leaf nodes are weighted.
Related Problem
Huffman Coding
Minimum-cost flow problem
(Directed) Steiner Tree
Node-weighted Steiner Tree
(Weighted) Vertex Cover
(Weighted) Dominating Set
Maximum coverage problem (select no more than K sets)
Weighted version (elements are weighted)
Minimum Set Cover
Weighted version (sets are weighted)
Huffman Coding (Minimum weighted path length)
Huffman D A. A method for the construction of minimum-redundancy
codes. Proceedings of the IRE, 1952, 40(9): 1098-1101.
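As a reminder of how the minimum weighted path length is obtained, here is a minimal sketch of Huffman's greedy merging using Python's `heapq`. The function name is hypothetical; the weights play the role of leaf weights in a binary tree.

```python
import heapq


def huffman_cost(weights):
    """Minimum weighted path length of a binary tree whose leaves carry the weights."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        # Merge the two lightest subtrees; every leaf below them gets one level deeper.
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total
```

For the weights 5, 9, 12, 13, 16, 45 the merges cost 14, 25, 30, 55 and 100, giving a weighted path length of 224.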
Minimum-cost flow problem
Given a directed graph with source s and sink t, where
each edge (u, v) has capacity c(u, v), flow f(u, v), and
cost a(u, v), send an amount of flow d from s to t.
Minimize the total cost $\sum_{(u,v)} a(u, v) \cdot f(u, v)$
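One textbook way to solve small instances is the successive shortest path method: repeatedly push flow along the cheapest augmenting path in the residual graph. The sketch below is an illustration under assumptions (integer vertex ids 0..n-1, non-negative edge costs, no self-loops), not the network simplex algorithm cited later; the function name and edge format are hypothetical.

```python
def min_cost_flow(n, edges, s, t, d):
    """Send d units from s to t at minimum total cost.

    edges: list of (u, v, capacity, cost). Returns the cost, or None if infeasible.
    """
    # Each arc is [to, residual capacity, cost, index of the reverse arc].
    graph = [[] for _ in range(n)]
    for u, v, cap, cost in edges:
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])
    inf = float("inf")
    total = 0
    while d > 0:
        # Bellman-Ford: cheapest path from s in the residual graph.
        dist = [inf] * n
        dist[s] = 0
        prev = [None] * n  # (previous node, arc index) on the cheapest path
        for _ in range(n - 1):
            updated = False
            for u in range(n):
                if dist[u] == inf:
                    continue
                for i, (v, cap, cost, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, i)
                        updated = True
            if not updated:
                break
        if dist[t] == inf:
            return None  # cannot route the remaining demand
        # Find the bottleneck capacity along the path, capped by the remaining demand.
        push, v = d, t
        while v != s:
            u, i = prev[v]
            push = min(push, graph[u][i][1])
            v = u
        # Update residual capacities along the path.
        v = t
        while v != s:
            u, i = prev[v]
            graph[u][i][1] -= push
            graph[v][graph[u][i][3]][1] += push
            v = u
        total += push * dist[t]
        d -= push
    return total


# Hypothetical instance: 4 vertices, source 0, sink 3.
example_edges = [(0, 1, 2, 1), (0, 2, 1, 2), (1, 2, 1, 1), (1, 3, 1, 3), (2, 3, 2, 1)]
```

In the toy instance, two units can be routed for a total cost of 6 and three units for 10; a demand of four exceeds the capacity out of the source.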
Minimum Steiner Tree
Given a set V of points (vertices), interconnect them by a network
(graph) of shortest length, where the length is the sum of the
lengths of all edges.
Minimum Steiner Tree
Given an edge-weighted graph G = (V, E, w) and a subset S ⊆ V of
required vertices, a Steiner tree is a tree in G that spans all
vertices of S. The task is to find a minimum-weight Steiner tree.
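For tiny graphs the definition can be turned directly into an exact (exponential-time) search: an optimal Steiner tree is a spanning tree of the subgraph induced by its own vertices, so it suffices to try every set of optional vertices and take the cheapest minimum spanning tree. This is a brute-force sketch assuming positive edge weights, not one of the approximation algorithms in the references; function names and the toy graph are hypothetical.

```python
from itertools import combinations


def mst_weight(nodes, edges):
    """Kruskal's algorithm on the subgraph induced by nodes; None if it is disconnected."""
    root = {v: v for v in nodes}

    def find(v):
        while root[v] != v:
            root[v] = root[root[v]]  # path halving
            v = root[v]
        return v

    total, used = 0, 0
    induced = sorted((w, u, v) for u, v, w in edges if u in root and v in root)
    for w, u, v in induced:
        ru, rv = find(u), find(v)
        if ru != rv:
            root[ru] = rv
            total += w
            used += 1
    return total if used == len(root) - 1 else None


def min_steiner_weight(vertices, edges, required):
    """Weight of a minimum Steiner tree spanning the required vertices (brute force)."""
    optional = [v for v in vertices if v not in required]
    best = None
    for r in range(len(optional) + 1):
        for extra in combinations(optional, r):
            w = mst_weight(set(required) | set(extra), edges)
            if w is not None and (best is None or w < best):
                best = w
    return best


# Hypothetical graph: a triangle a, b, c (weight 2 each) plus a hub x (weight 1 each).
toy_vertices = ["a", "b", "c", "x"]
toy_edges = [("a", "x", 1), ("b", "x", 1), ("c", "x", 1),
             ("a", "b", 2), ("b", "c", 2), ("a", "c", 2)]
```

In the toy graph, connecting a, b, c directly costs 4, while routing through the optional hub x costs 3, which shows why non-required vertices matter.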
Dominating Set problem
A dominating set for a graph G = (V, E) is
a subset D of V such that every vertex not
in D is adjacent to at least one member of D.
Finding a minimum dominating set is NP-hard.
Its decision version is a classical NP-complete decision problem.
The problem is not fixed-parameter tractable, in
the sense that no algorithm with running
time $f(k) \cdot n^{O(1)}$ exists for any function f unless
the W-hierarchy collapses to FPT = W[2].
If the input graph is planar, the problem
remains NP-hard, but a fixed-parameter
algorithm is known.
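Since exact solutions are out of reach in general, a common fallback is the greedy heuristic: repeatedly pick the vertex whose closed neighbourhood covers the most vertices that are not yet dominated. The sketch assumes an undirected graph given as a vertex-to-neighbour-set map; the function names are hypothetical, and the result is a valid dominating set but not necessarily a minimum one.

```python
def greedy_dominating_set(adj):
    """Greedy heuristic; adj maps each vertex to the set of its neighbours."""
    undominated = set(adj)
    chosen = set()
    while undominated:
        # Vertex whose closed neighbourhood covers the most undominated vertices.
        best = max(adj, key=lambda v: len(({v} | adj[v]) & undominated))
        chosen.add(best)
        undominated -= {best} | adj[best]
    return chosen


def is_dominating(adj, D):
    """Every vertex is in D or adjacent to a member of D."""
    return all(v in D or adj[v] & D for v in adj)


# Hypothetical graphs: a star with centre c, and a path a-b-c-d-e.
star = {"c": {"1", "2", "3", "4"}, "1": {"c"}, "2": {"c"}, "3": {"c"}, "4": {"c"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
```

On the star the heuristic returns only the centre; on the five-vertex path it returns two vertices, which is optimal for that graph.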
Vertex Cover problem
A vertex cover of a graph is a set of vertices such that each edge
of the graph is incident to at least one vertex of the set.
Finding a minimum vertex cover is NP-hard.
Its decision version, the vertex cover problem, was one of Karp's
21 NP-complete problems.
Deciding whether G has a vertex cover of k vertices is fixed-parameter tractable,
e.g. in $O(kn + 1.2852^k)$ time.
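The simplest fixed-parameter algorithm is a bounded search tree: any cover must contain an endpoint of each edge, so branch on the two endpoints of some uncovered edge and recurse with budget k - 1. The sketch below runs in roughly O(2^k · m) time, which is weaker than the 1.2852^k bound quoted above but shows the idea; the function name is hypothetical.

```python
def vertex_cover_at_most_k(edges, k):
    """Return a vertex cover with at most k vertices, or None if none exists."""
    if not edges:
        return set()  # nothing left to cover
    if k == 0:
        return None  # edges remain but the budget is spent
    u, v = edges[0]
    for pick in (u, v):
        # Taking `pick` covers every edge incident to it.
        rest = [e for e in edges if pick not in e]
        sub = vertex_cover_at_most_k(rest, k - 1)
        if sub is not None:
            return sub | {pick}
    return None


# Hypothetical graphs given as edge lists.
triangle = [(1, 2), (2, 3), (1, 3)]
path4 = [(1, 2), (2, 3), (3, 4)]
```

The recursion depth is at most k, so the running time depends exponentially only on the parameter and not on the size of the graph.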
Other Techniques
Graph summarization
Graph edit distance
Reference (Minimum cost flow)
James B. Orlin. A polynomial time primal network simplex
algorithm for minimum cost flows. Mathematical Programming.
1997(78): 109–129.
Reference (Steiner Tree)
Klein P, Ravi R. A nearly best-possible approximation algorithm
for node-weighted Steiner trees. Journal of Algorithms, 1995,
19(1): 104-115.
Zelikovsky A. A series of approximation algorithms for the
acyclic directed Steiner tree problem. Algorithmica, 1997,
18(1): 99-110.
Charikar M, Chekuri C, Cheung T, et al. Approximation
algorithms for directed Steiner problems. Proceedings of the
ninth annual ACM-SIAM symposium on Discrete algorithms.
1998: 192-200.
Zosin L, Khuller S. On directed Steiner trees. Proceedings of
the thirteenth annual ACM-SIAM symposium on Discrete
algorithms. 2002: 59-63.
Reference (Vertex Cover)
Niedermeier R, Rossmanith P. On efficient fixed-parameter
algorithms for weighted vertex cover. Journal of Algorithms,
2003, 47(2): 63-77.
White L J, Gillenson M L. An efficient algorithm for minimum
k-covers in weighted graphs. Mathematical Programming,
1975, 8(1): 20-42.
Chen J, Kanj I A, Xia G. Improved parameterized upper bounds
for vertex cover. Mathematical Foundations of Computer
Science 2006. Springer Berlin Heidelberg, 2006: 238-249.
Reference (Graph Summarization)
Navlakha S, Rastogi R, Shrivastava N. Graph summarization
with bounded error. Proceedings of the 2008 ACM SIGMOD
international conference on Management of data. ACM, 2008:
419-432.
Tian Y, Hankins R A, Patel J M. Efficient aggregation for graph
summarization. Proceedings of the 2008 ACM SIGMOD
international conference on Management of data. ACM, 2008:
567-580.
Zhang N, Tian Y, Patel J M. Discovery-driven graph
summarization. Data Engineering (ICDE), 2010 IEEE 26th
International Conference on. IEEE, 2010: 880-891.
Gao X, Xiao B, Tao D, et al. A survey of graph edit distance.
Pattern Analysis and Applications, 2010, 13(1): 113-129.
Reference (Document Summarization)
Celikyilmaz A, Hakkani-Tur D. A hybrid hierarchical model
for multi-document summarization. ACL 2010: 815-824.
Shen C, Li T. Multi-document summarization via the
minimum dominating set. COLING 2010: 984-992.
Acknowledgement
Q&A
Discussion