SView 0.2 Fusion, 2013-04-09 - Websoft Research Group

Websoft Research Group, Nanjing University
ws.nju.edu.cn
SView 0.2 Fusion
2013-04-09
[email protected] 龚赛赛
Contents
System functions
Problem description
Existing research
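System functions
Users merge properties (resp. entities) according to their own preferences, forming a partition of the properties (resp. entities)
Browse the fused data
Help SView fuse data
Goals of SView
Preserve each user's personalized partition ---- Finished
Mine a consensus partition from the set of users' partitions ---- TODO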
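How a sequence of user merges induces a partition can be sketched with a disjoint-set (union-find) structure. This is a minimal illustration, not SView's implementation: the class `UnionFind` and the property names are hypothetical.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure: each merge joins two classes of the partition."""

    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        # Follow parent links to the class representative, halving the path.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def merge(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def partition(self):
        # Group every item under its representative, then sort for stable output.
        classes = defaultdict(set)
        for x in self.parent:
            classes[self.find(x)].add(x)
        return sorted(sorted(c) for c in classes.values())


# Hypothetical property names, for illustration only.
props = ["foaf:name", "rdfs:label", "dc:title", "foaf:mbox"]
uf = UnionFind(props)
uf.merge("foaf:name", "rdfs:label")
uf.merge("rdfs:label", "dc:title")
user_partition = uf.partition()
```

Properties the user never merges stay in singleton classes, so each user's merges always yield a valid partition of the items they have seen.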
Problem description
Notation
$X$: a set of $n$ elements
$\mathcal{P}$: the set of all partitions of $X$
$\Pi \subseteq \mathcal{P}$: a profile of $m$ partitions, where $m$ is the number of users
$\pi(i)$: the class that $x_i \in X$ belongs to in $\pi \in \Pi$
Problem description
Consensus of partitions: find a partition P (the central partition) that best summarizes the profile according to a specific criterion
Use a metric S(P, Q) between partitions
Optimize
Differences in the SView context
Each partition in the profile is partial (it covers only part of X)
Existing research
Axiomatic approach
The central partition satisfies conditions derived from experimental evidence and other sources
Constructive approach
A way to construct the consensus is given explicitly
Combinatorial optimization problem
Based on some criterion measuring the remoteness
of partitions
Combinatorial optimization problem
Remoteness of partitions: measures computable in O(n) to O(n^3) time
Symmetric difference distance between the equivalence relations of the partitions
Minimum number of elements deleted so that the two induced partitions are identical
Minimum number of elements moved between clusters so that the resulting partitions are equal (this number equals the one above)
……
Consensus of partitions is NP-hard
When n is not too large, a branch-and-bound algorithm can find the optimal solution
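The symmetric difference distance and an exact search for small n can be sketched as follows. Note the swap: this uses plain exhaustive enumeration of all partitions rather than branch and bound, so it is only a baseline for tiny inputs. All function names and the example profile are assumptions made for illustration.

```python
from itertools import combinations


def joined_pairs(partition):
    """Return the set of unordered pairs placed in the same class."""
    pairs = set()
    for cls in partition:
        pairs.update(frozenset(p) for p in combinations(sorted(cls), 2))
    return pairs


def sym_diff_distance(p, q):
    """Number of pairs joined in exactly one of the two partitions."""
    return len(joined_pairs(p) ^ joined_pairs(q))


def all_partitions(items):
    """Yield every partition of a list (Bell-number many, so small n only)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for sub in all_partitions(rest):
        # Put `first` into each existing class in turn ...
        for k in range(len(sub)):
            yield sub[:k] + [sub[k] | {first}] + sub[k + 1:]
        # ... or into a new class of its own.
        yield sub + [{first}]


def exhaustive_consensus(items, profile):
    """Brute force: the partition minimising total distance to the profile."""
    return min(all_partitions(items),
               key=lambda p: sum(sym_diff_distance(p, q) for q in profile))


# Hypothetical profile of m = 3 user partitions over X = {1, 2, 3, 4}.
profile = [
    [{1, 2, 3}, {4}],
    [{1, 2}, {3, 4}],
    [{1, 2}, {3}, {4}],
]
best = exhaustive_consensus([1, 2, 3, 4], profile)
best_cost = sum(sym_diff_distance(best, q) for q in profile)
```

In this example only the pair (1, 2) is joined by a majority of users, so the search returns the partition that joins 1 and 2 and leaves 3 and 4 as singletons.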
Combinatorial optimization problem
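Sketch of the NP-hardness proof
Use the symmetric difference as the distance measure
$\delta_{P(i)P(j)} = 1$ if and only if $x_i$ and $x_j$ are joined in $P$
Optimization objective: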
Combinatorial optimization problem
Sketch of the NP-hardness proof (cont.)
The objective is equivalent to:
$T_{ij}$: the number of partitions in which the two elements $x_i$ and $x_j$ are joined
The objective is transformed into:
$R(P)$: the set of joined pairs in $P$
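The equivalence can be checked with a short derivation. This is a sketch under the assumption that the distance between two partitions is the size of the symmetric difference of their sets of joined pairs; $\delta^{P}_{ij}$ is shorthand introduced here for $\delta_{P(i)P(j)}$, and $D_\Pi$ is a name introduced here for the total distance to the profile.

```latex
\begin{align*}
d(P,\pi) &= |R(P)\,\triangle\,R(\pi)|
          = \sum_{i<j}\bigl(\delta^{P}_{ij} + \delta^{\pi}_{ij}
            - 2\,\delta^{P}_{ij}\,\delta^{\pi}_{ij}\bigr),\\
D_\Pi(P) &= \sum_{\pi\in\Pi} d(P,\pi)
          = \sum_{i<j}\bigl(m\,\delta^{P}_{ij} + T_{ij}
            - 2\,T_{ij}\,\delta^{P}_{ij}\bigr)\\
         &= \sum_{i<j} T_{ij}
            \;-\; 2\sum_{(i,j)\in R(P)}\Bigl(T_{ij}-\frac{m}{2}\Bigr).
\end{align*}
```

The first term does not depend on $P$, so minimizing $D_\Pi(P)$ amounts to maximizing $\sum_{(i,j)\in R(P)} (T_{ij} - m/2)$: a pair contributes positively exactly when more than half of the partitions join it.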
Combinatorial optimization problem
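Sketch of the NP-hardness proof (cont.)
Construct the complete graph $K_n$ on $X$ and assign each edge the weight $w(i,j) = T_{ij} - m/2$
A partition P has p classes, and each class corresponds to a clique of $K_n$; the weight of a clique is the sum of the weights of the edges in its subgraph
The objective is transformed into:
the weighted clique partitioning problem (NP-hard)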
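The construction above can be sketched in a few lines: count the co-occurrences $T_{ij}$, derive the edge weights, and score a partition as the total weight of its cliques. Function names and the example profile are assumptions for illustration.

```python
from itertools import combinations


def co_occurrence(items, profile):
    """T[(i, j)]: number of partitions in the profile that join i and j."""
    t = {pair: 0 for pair in combinations(sorted(items), 2)}
    for partition in profile:
        for cls in partition:
            for pair in combinations(sorted(cls), 2):
                t[pair] += 1
    return t


def edge_weights(items, profile):
    """w(i, j) = T_ij - m/2 on the complete graph over the items."""
    m = len(profile)
    return {pair: count - m / 2
            for pair, count in co_occurrence(items, profile).items()}


def clique_partition_weight(partition, w):
    """Sum of the weights of all edges inside the classes of a partition."""
    return sum(w[pair]
               for cls in partition
               for pair in combinations(sorted(cls), 2))


# Hypothetical profile of m = 3 user partitions over X = {1, 2, 3, 4}.
profile = [
    [{1, 2, 3}, {4}],
    [{1, 2}, {3, 4}],
    [{1, 2}, {3}, {4}],
]
w = edge_weights([1, 2, 3, 4], profile)
score_a = clique_partition_weight([{1, 2}, {3}, {4}], w)
score_b = clique_partition_weight([{1, 2, 3}, {4}], w)
```

Here `score_a` exceeds `score_b` because adding element 3 to the class {1, 2} brings in two edges with negative weight.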
Combinatorial optimization problem
Heuristic algorithms
Régnier's transfer method and its optimizations
Based on hill climbing
……
Fusion-Transfert (FT) method
Alain Guénoche. Consensus of partitions: a constructive approach. Advances in Data Analysis and Classification, 2011.
Fusion-Transfert(FT) method
Hierarchical procedure
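A hierarchical (fusion) step of this kind can be sketched as a greedy agglomeration: start from singletons and repeatedly merge the two classes whose connecting edges have the largest positive total weight. This is an assumption-laden illustration, not the exact procedure from the cited paper; the function `fusion` and the weight table are hypothetical.

```python
def fusion(items, w):
    """Greedy agglomeration: merge the best pair of classes while the gain is positive."""
    classes = [{x} for x in items]

    def link(c1, c2):
        # Total weight of the edges running between two classes.
        return sum(w[tuple(sorted((i, j)))] for i in c1 for j in c2)

    while len(classes) > 1:
        gain, a, b = max(
            ((link(classes[p], classes[q]), p, q)
             for p in range(len(classes))
             for q in range(p + 1, len(classes))),
            key=lambda t: t[0],
        )
        if gain <= 0:
            break  # no merge improves the total clique weight
        classes[a] |= classes[b]
        del classes[b]
    return classes


# Hypothetical edge weights w(i, j) = T_ij - m/2 for m = 3 partitions.
w = {
    (1, 2): 1.5, (3, 4): 0.5,
    (1, 3): -0.5, (2, 3): -0.5,
    (1, 4): -1.5, (2, 4): -1.5,
}
fused = fusion([1, 2, 3, 4], w)
```

With these weights the sketch first joins 1 and 2, then 3 and 4, and stops because merging the two resulting classes would lower the total weight.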
Fusion-Transfert(FT) method
Transfer procedure
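A transfer step can be sketched as hill climbing over single-element moves: each element is moved to the class (or to a new singleton) that most increases the total clique weight, until no move helps. This is a generic illustration under stated assumptions, not the exact procedure from the cited paper; the function `transfer` and the weight table are hypothetical.

```python
def transfer(classes, w):
    """Move single elements between classes while the total weight increases."""
    classes = [set(c) for c in classes]

    def link(x, cls):
        # Total weight of the edges between x and the other members of cls.
        return sum(w[tuple(sorted((x, y)))] for y in cls if y != x)

    improved = True
    while improved:
        improved = False
        for x in sorted(set().union(*classes)):
            src = next(c for c in classes if x in c)
            stay = link(x, src)
            # Baseline candidate: isolate x in a new singleton class.
            best_gain, best_target = -stay, None
            for c in classes:
                if c is not src and link(x, c) - stay > best_gain:
                    best_gain, best_target = link(x, c) - stay, c
            if best_gain > 0:
                src.remove(x)
                if best_target is None:
                    classes.append({x})
                else:
                    best_target.add(x)
                classes = [c for c in classes if c]  # drop emptied classes
                improved = True
    return classes


# Hypothetical edge weights w(i, j) = T_ij - m/2 for m = 3 partitions.
w = {
    (1, 2): 1.5, (3, 4): 0.5,
    (1, 3): -0.5, (2, 3): -0.5,
    (1, 4): -1.5, (2, 4): -1.5,
}
refined = transfer([{1, 2, 3}, {4}], w)
```

Every accepted move strictly increases the total weight, so the loop terminates, although only at a local optimum.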
Fusion-Transfert (FT) method
Time complexity
Experiments