Type Similarity Measure and Its Application to
Entity Recommendation
Zheng Liang
Scenario
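[Figure: three example entities, each shown with its set of types.
Set 1: Scientist, English Physicists, English Mathematicians, Christian Mystics, ... (80 types)
Set 2: Scientist, Jewish American Scientists, Nobel Laureates In Physics, Deists, ...
Set 3: Scientist, Italian Physicists, Italian Astronomers, Italian Astrologers, ...
The other two sets are labeled 105 types and 51 types.]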
• When viewing an entity, our goal is to recommend the most
similar entities based on type similarity.
Type Similarity Measure
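[Figure: bipartite graph between type set X (nodes $t_{x1}, \dots, t_{xm}$ with weights $w_{x1}, \dots, w_{xm}$) and type set Y (nodes $t_{y1}, \dots, t_{yn}$ with weights $w_{y1}, \dots, w_{yn}$). The edge between $t_{xi}$ and $t_{yj}$ is labeled $s(t_{xi}, t_{yj})$.]
$0 \le s(t_{xi}, t_{yj}) \le 1$, $1 \le i \le m$, $1 \le j \le n$
$\sum_i w_{xi} = 1$, $\sum_j w_{yj} = 1$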
• S(Albert_Einstein, Isaac_Newton) = SetSim(X, Y)
• $w_{xi}$ is the weight of $t_{xi}$. Intuitively, it is the importance of $t_{xi}$ within the type set X.
• $0 \le w_{xi} \le 1$
Type Similarity Measure Based on Network Flow
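[Figure: flow network from source $v_s$ through the X nodes $v_{x1}, \dots, v_{xm}$ and the Y nodes $v_{y1}, \dots, v_{yn}$ to sink $v_t$. Each arc is labeled (capacity, cost):
$v_s \to v_{xi}$: $(w_{xi}, 0)$
$v_{xi} \to v_{yj}$: $(1, b(v_{xi}, v_{yj}))$
$v_{yj} \to v_t$: $(w_{yj}, 0)$]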
$b(v_{xi}, v_{yj}) = [b_{ij}] = 1 - s(t_{xi}, t_{yj})$
$1 \le i \le m$; $1 \le j \le n$; $0 \le b_{ij} \le 1$
The problem is formalized as follows:
• We turn to the problem of finding a maximum flow of minimum
cost. (Edmonds, 1972)
• Given a network N = {V, A, C, B}
($V = X \cup Y \cup \{v_s, v_t\}$)
• Let the cost of a flow f be $\sum_{(v_{xi}, v_{yj}) \in A} b_{ij} f_{ij}$ and let its value be
$f(v_s, v_t)$.
• We find a flow which is maximum, but has the lowest cost among
the maximums.
• $b(f) = \sum_{(v_{xi}, v_{yj}) \in A} b_{ij} f_{ij}$
$= \sum (1 - s_{ij}) f_{ij}$
$= \sum f_{ij} - \sum s_{ij} f_{ij}$
(maximum flow, $\sum f_{ij} = 1$)
$= 1 - \sum s_{ij} f_{ij}$
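• For two identical sets, $b(f) = 0$
• For two completely different sets, $b(f) = 1$ ($s_{ij} = 0$)
• The smaller $b(f)$ is, the larger $\sum s_{ij} f_{ij}$ is, indicating higher similarity between the two sets; conversely,
…
• $0 \le b(f) \le 1$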
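The construction above can be sketched in code. The following is a minimal sketch, not the author's implementation: it builds the network with arcs (capacity, cost) as described on the slides and solves for a maximum flow of minimum cost using successive shortest augmenting paths, a standard technique chosen here for brevity. The function names `min_cost_max_flow` and `set_distance`, the node numbering, and the tolerance `EPS` are all assumptions made for illustration.

```python
from collections import deque

EPS = 1e-12  # tolerance for float capacities (assumption of this sketch)

def min_cost_max_flow(n_nodes, arcs, source, sink):
    """arcs: list of (u, v, capacity, cost). Returns (flow value, total cost)."""
    graph = [[] for _ in range(n_nodes)]
    edges = []  # edge 2k is a forward arc, edge 2k+1 is its residual reverse
    for u, v, cap, cost in arcs:
        graph[u].append(len(edges)); edges.append([v, cap, cost])
        graph[v].append(len(edges)); edges.append([u, 0.0, -cost])
    flow = total_cost = 0.0
    while True:
        # Shortest path by cost in the residual graph (queue-based Bellman-Ford).
        dist = [float("inf")] * n_nodes
        prev_edge = [-1] * n_nodes
        in_queue = [False] * n_nodes
        dist[source] = 0.0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for ei in graph[u]:
                v, cap, cost = edges[ei]
                if cap > EPS and dist[u] + cost < dist[v] - EPS:
                    dist[v] = dist[u] + cost
                    prev_edge[v] = ei
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if prev_edge[sink] == -1:
            break  # no augmenting path left: the flow is maximum
        # Find the bottleneck capacity along the path, then push that much flow.
        push, v = float("inf"), sink
        while v != source:
            ei = prev_edge[v]
            push = min(push, edges[ei][1])
            v = edges[ei ^ 1][0]
        v = sink
        while v != source:
            ei = prev_edge[v]
            edges[ei][1] -= push
            edges[ei ^ 1][1] += push
            v = edges[ei ^ 1][0]
        flow += push
        total_cost += push * dist[sink]
    return flow, total_cost

def set_distance(wx, wy, sim):
    """wx, wy: type weights (each summing to 1); sim[i][j] = s(t_xi, t_yj).
    Returns (flow value, b(f)) for the network on the slides."""
    m, n = len(wx), len(wy)
    vs, vt = 0, m + n + 1  # node ids: vs, X nodes, Y nodes, vt
    arcs = [(vs, 1 + i, wx[i], 0.0) for i in range(m)]
    arcs += [(1 + m + j, vt, wy[j], 0.0) for j in range(n)]
    arcs += [(1 + i, 1 + m + j, 1.0, 1.0 - sim[i][j])
             for i in range(m) for j in range(n)]
    return min_cost_max_flow(m + n + 2, arcs, vs, vt)

# Two identical two-type sets, then two sets with no similar types.
identical = set_distance([0.5, 0.5], [0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
disjoint = set_distance([0.5, 0.5], [0.5, 0.5], [[0.0, 0.0], [0.0, 0.0]])
print(identical, disjoint)
```

In this sketch the identical sets give a cost of 0 and the dissimilar sets give a cost of 1, matching the two boundary cases listed above. Since $b(f) = 1 - \sum s_{ij} f_{ij}$, one natural reading (an assumption, not stated on the slides) is to report $1 - b(f)$ as the SetSim score.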
Type Weight Measure
1: Information content (Resnik 1992, 1995).
$IC(t_i) = -\log P(t_i) = -\log(\mathrm{freq}(t_i)/N)$
A unigram model; in essence this is IDF.
$w_{xi} = IC(t_{xi}) / \sum_{t_{xi} \in X} IC(t_{xi})$
$0 \le w_{xi} \le 1$
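As an illustration of the information-content weighting, here is a minimal sketch. The function name `ic_weights`, the example frequencies, and the uniform fallback for the degenerate case where every type has zero information content are assumptions, not part of the slides.

```python
import math

def ic_weights(types, freq, total):
    """types: type names of one entity; freq[t]: number of entities having
    type t; total: N, the number of entities. Returns normalized IC weights."""
    ic = [-math.log(freq[t] / total) for t in types]
    norm = sum(ic)
    if norm == 0:
        # Every type occurs on all N entities, so IC is 0 everywhere.
        # Falling back to uniform weights is an assumption of this sketch.
        return [1.0 / len(types)] * len(types)
    return [value / norm for value in ic]

# Hypothetical counts: a very common type and a rare one.
freq = {"Scientist": 500, "English Physicists": 10}
weights = ic_weights(["Scientist", "English Physicists"], freq, total=1000)
print(weights)
```

The rare type receives the larger weight, which is the IDF-like behavior noted above.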
Type Weight Measure
2: Conditional Entropy
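A multi-gram model; takes context into account.
$H(t_{xi} \mid T_{xi}) = -P(X) \log P(t_{xi} \mid T_{xi})$
$X = \{t_{x1}, t_{x2}, \dots, t_{xm}\}$; $t_{xi} \in X$; $T_{xi} = X - \{t_{xi}\}$
$w_{xi} = H(t_{xi} \mid T_{xi}) / \sum H(t_{xi} \mid T_{xi})$
($0 \le w_{xi} \le 1$)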
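The slides do not say how the probabilities are estimated, so the following is only a sketch under stated assumptions: $P(X)$ is taken as the fraction of entities in a corpus whose type set contains all of X, and $P(t_{xi} \mid T_{xi})$ as the number of entities containing all of X divided by the number containing all of $T_{xi}$. The name `conditional_entropy_weights`, the toy corpus, and the uniform fallback are hypothetical.

```python
import math

def conditional_entropy_weights(types, corpus):
    """types: the type set X of one entity, as a list.
    corpus: list of type sets, one per entity, assumed to include an entity
    that has every type in X. Returns weights in the same order as `types`."""
    x = set(types)
    count_x = sum(1 for entity in corpus if x <= entity)
    p_x = count_x / len(corpus)
    entropies = []
    for t in types:
        rest = x - {t}  # T_xi = X - {t_xi}
        count_rest = sum(1 for entity in corpus if rest <= entity)
        # -P(X) * log P(t | rest), written as +log(count_rest / count_x)
        # so that a fully predictable type yields exactly 0.0.
        entropies.append(p_x * math.log(count_rest / count_x))
    norm = sum(entropies)
    if norm == 0:
        # Every type is implied by the others; uniform weights are an
        # assumption of this sketch.
        return [1.0 / len(types)] * len(types)
    return [h / norm for h in entropies]

# Toy corpus: "Scientist" is implied by either of the other two types.
corpus = [
    {"Scientist", "Physicist", "Deist"},
    {"Scientist", "Physicist"},
    {"Scientist", "Deist"},
    {"Scientist"},
]
print(conditional_entropy_weights(["Scientist", "Physicist", "Deist"], corpus))
```

Under these assumptions, a type that is fully predictable from its context (here "Scientist") gets weight 0, while the two types that the context does not determine share the weight equally. Because $P(X)$ is the same for every $i$, the weights depend only on the conditional terms.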