SOCIAL NETWORK ANALYSIS VIA FACTOR GRAPH MODEL
Zi Yang
OUTLINE
Background
Challenge
Unsupervised case 1: Representative user finding
Unsupervised case 2: Community discovery
Experiments
Supervised case: Modeling information diffusion in social network
BACKGROUND
Social network
Example: Digg.com
A popular social news website where people discover and share content
Various types of user behavior: submit, digg, comment, and reply to a comment
Edges: an edge is created if one user diggs or comments on a story of another
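BACKGROUND
Community discovery
Modularity property:
$\prod_{i,j} \exp\left\{ [y_i = y_j] \left( A_{i,j} - \frac{k_i k_j}{2m} \right) \right\}$
where $A_{i,j}$ is the weight of the edge between $v_i$ and $v_j$, $k_i$ is the degree of $v_i$, and $m$ is the total edge weight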
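The modularity property can be checked numerically on a toy graph. The sketch below computes the standard modularity $Q = \frac{1}{2m} \sum_{i,j} [y_i = y_j] (A_{i,j} - k_i k_j / 2m)$ for an undirected weighted graph; the function name `modularity`, the edge-list format, and the example graph are assumptions made for illustration, not part of the slides.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Modularity Q of a labelling on an undirected weighted graph.

    edges  : iterable of (i, j, weight), each undirected edge listed once
    labels : dict mapping node -> community id (hypothetical input format)
    """
    adj = defaultdict(float)      # symmetric adjacency A[i, j]
    degree = defaultdict(float)   # weighted degree k_i
    for i, j, w in edges:
        adj[(i, j)] += w
        adj[(j, i)] += w
        degree[i] += w
        degree[j] += w
    two_m = sum(degree.values())  # 2m equals the sum of all degrees
    q = 0.0
    for i in labels:
        for j in labels:
            if labels[i] == labels[j]:
                q += adj.get((i, j), 0.0) - degree[i] * degree[j] / two_m
    return q / two_m


# Two triangles joined by a single edge (toy data).
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
             (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (2, 3, 1.0)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}
print(modularity(toy_edges, split), modularity(toy_edges, merged))
```

Splitting the toy graph into its two triangles scores higher than placing every node in one community, which is the behavior the modularity term is meant to reward.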
Affinity propagation
Clustering via factor graph model
Update rules:
$r(i,k) \leftarrow s(i,k) - \max_{k'\ \mathrm{s.t.}\ k' \neq k} \{ a(i,k') + s(i,k') \}$
$a(i,k) \leftarrow \min\{ 0,\ r(k,k) + \sum_{i'\ \mathrm{s.t.}\ i' \notin \{i,k\}} \max\{0, r(i',k)\} \}$
$a(k,k) \leftarrow \sum_{i'\ \mathrm{s.t.}\ i' \neq k} \max\{0, r(i',k)\}$
Pairwise constraint
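The update rules can be sketched in a few lines of plain Python. This is a minimal illustration under assumptions: the similarity matrix `s` is a list of lists with preferences on the diagonal, and the `damping` factor in the driver loop is a common stabilizing choice that the slide does not mention.

```python
def update_responsibilities(s, a):
    """r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))."""
    n = len(s)
    return [[s[i][k] - max(a[i][kp] + s[i][kp] for kp in range(n) if kp != k)
             for k in range(n)]
            for i in range(n)]


def update_availabilities(r):
    """a(i,k) and a(k,k) computed from the responsibilities r."""
    n = len(r)
    a = [[0.0] * n for _ in range(n)]
    for k in range(n):
        pos = [max(0.0, r[i][k]) for i in range(n)]
        others = sum(pos) - pos[k]  # sum over i' != k
        for i in range(n):
            if i == k:
                a[i][k] = others
            else:
                # remove i's own term so the sum runs over i' not in {i, k}
                a[i][k] = min(0.0, r[k][k] + others - pos[i])
    return a


def affinity_propagation(s, iters=100, damping=0.5):
    """Alternate both updates; damping is an assumption, not from the slide."""
    n = len(s)
    r = [[0.0] * n for _ in range(n)]
    a = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        new_r = update_responsibilities(s, a)
        r = [[damping * r[i][k] + (1 - damping) * new_r[i][k]
              for k in range(n)] for i in range(n)]
        new_a = update_availabilities(r)
        a = [[damping * a[i][k] + (1 - damping) * new_a[i][k]
              for k in range(n)] for i in range(n)]
    # each point picks the exemplar k that maximizes a(i,k) + r(i,k)
    return [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]


toy_s = [[-1, -2, -8], [-2, -1, -6], [-8, -6, -1]]
print(affinity_propagation(toy_s))
```

Keeping the two updates as separate functions makes each rule easy to check by hand against the formulas before running the full loop.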
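BACKGROUND
Affinity propagation
$S(c) = \sum_{i=1}^{N} s(i, c_i) + \sum_{k=1}^{N} \delta_k(c_{1:N})$
where $\delta_k(c_{1:N}) = -\infty$ if $c_k \neq k$ but $\exists i: c_i = k$, and $0$ otherwise
Local factor: $s(i, c_i)$
Regional constraint: $\delta_k(c_{1:N})$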
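A small sketch of how this objective scores a candidate assignment: the local terms are summed, and the regional constraint returns minus infinity as soon as some point chooses k as its exemplar while k does not choose itself. The function name `net_similarity` and the toy matrix are illustrative assumptions.

```python
import math


def net_similarity(s, c):
    """S(c): sum of s(i, c_i), or -inf if an exemplar does not choose itself."""
    n = len(c)
    for k in range(n):
        chosen_by_someone = any(c[i] == k for i in range(n))
        if chosen_by_someone and c[k] != k:
            return -math.inf  # regional constraint violated
    return sum(s[i][c[i]] for i in range(n))


toy_s = [[-1, -2, -8], [-2, -1, -6], [-8, -6, -1]]
print(net_similarity(toy_s, [0, 0, 2]))  # valid: points 0 and 2 are exemplars
print(net_similarity(toy_s, [1, 0, 2]))  # invalid: 1 is chosen but c[1] != 1
```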
CHALLENGES
How can the local properties be captured for social network analysis?
Treating community discovery as graph clustering, how can the edge information be considered directly?
Homophily
What constraints can be applied to describe the formation and evolution of communities?
REPRESENTATIVE USER FINDING
Problem definition
Given a social network $G = (V, E)$ and (optionally) a confidence value for each user $v_i$, the objective is to find a pairwise representativeness on each edge in the network and to estimate the representative degree of each user $v_i$. The result is denoted by a set of variables $\{y_i\}$ satisfying $y_i \in \{1, \dots, N\}$. In other words, $y_i$ represents the user that $v_i$ trusts (or relies on) most.
REPRESENTATIVE USER FINDING
Modeling
[Figure: the input network with nodes v1, v2, v3, v4 and one variable per node (y1, y2, y3, y4); each variable represents the representative of its node.]
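REPRESENTATIVE USER FINDING
Modeling
Node feature function
Observation: similarity between the node and the variable
Up to a normalization factor:
$g_i(y_i) \propto \begin{cases} w_{i,y_i} & \text{if } y_i \in O(i) \text{ (neighbor representative)} \\ \sum_{j \in NB(i)} w_{j,i} & \text{if } y_i = i \text{ (self-representative)} \\ 0 & \text{otherwise} \end{cases}$
[Figure: factor graph with nodes v1 to v4, variables y1 to y4, and node factors g1(y1), g2(y2), g3(y3), g4(y4).]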
REPRESENTATIVE USER FINDING
Modeling
Edge feature function
$f_{i,j}(y_i, y_j) = 1$ if $y_i \neq y_j$ (the vertices of the edge have different representatives); if $y_i = y_j$ (the vertices have the same representative) it takes a separate value whose symbol is not legible in this copy
Undirected edge: bidirectional influence
[Figure: the factor graph extended with edge factors f2,1(y2,y1), f2,3(y2,y3), f3,2(y3,y2), f2,4(y2,y4) alongside the node factors g1(y1) to g4(y4).]
REPRESENTATIVE USER FINDING
Modeling
Regional feature function: a feature function defined on the set of neighboring nodes of $v_k$ and $v_k$ itself
$h_k(y_{I(k) \cup \{k\}}) = \begin{cases} 0 & \text{if } y_k = k \text{ and } \forall i \in I(k),\ y_i \neq k \\ 1 & \text{otherwise} \end{cases}$
To avoid a “leader without followers”
[Figure: the factor graph extended with regional factors h1(y1,y2), h2(y2,y3,y4), h3(y3,y1), h4(y4,y2) alongside the edge factors f and node factors g.]
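The regional feature is simple enough to state directly in code. The sketch below assumes that $I(k)$ is the set of in-neighbors of $v_k$ and stores it in a hypothetical dict `in_neighbors`; the labelling `y` maps each node to its chosen representative.

```python
def regional_feature(k, y, in_neighbors):
    """h_k: 0 when v_k represents itself but no in-neighbor selects it, else 1."""
    if y[k] == k and all(y[i] != k for i in in_neighbors[k]):
        return 0  # a "leader without followers"
    return 1


# Toy example: node 2 points to node 1, node 1 points to node 3.
in_neighbors = {1: [2], 2: [], 3: [1]}
y = {1: 1, 2: 1, 3: 3}
print([regional_feature(k, y, in_neighbors) for k in (1, 2, 3)])
```

Note that a self-representative node with an empty in-neighbor set also gets 0 under this reading of the formula, because the "for all" condition holds vacuously.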
REPRESENTATIVE USER FINDING
Modeling
Objective function
$\max_{y_{1:N}} \log P(y_{1:N})$
$P(y_{1:N}) = \frac{1}{Z} \prod_{i=1}^{N} g_i(y_i) \prod_{e_{i,j} \in E} f_{i,j}(y_i, y_j) \prod_{k=1}^{N} h_k(y_{I(k) \cup \{k\}})$
Solving: max-sum algorithm
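On a very small network the objective can be maximized by brute force, which helps to see what the max-sum algorithm is approximating. The sketch below enumerates every assignment and scores it in the log domain; the constant $Z$ is dropped because it does not affect the argmax. The node scores in `log_g`, the same-representative bonus `log_same`, and the three-node graph are hypothetical values chosen for illustration, and exhaustive search is a stand-in for max-sum, not the method used in the slides.

```python
import itertools
import math


def log_score(y, log_g, edges, in_nb, log_same):
    """Unnormalized log P(y): node terms + edge terms + regional constraints."""
    for k in y:
        # h_k = 0 (log = -inf) for a self-representative node nobody selects
        if y[k] == k and all(y[i] != k for i in in_nb[k]):
            return -math.inf
    total = 0.0
    for i, yi in y.items():
        total += log_g[i].get(yi, -math.inf)  # g_i = 0 outside its support
    for i, j in edges:
        total += log_same if y[i] == y[j] else 0.0  # log f = 0 when different
    return total


def map_assignment(nodes, log_g, edges, in_nb, log_same):
    """Exhaustive search over all N^N assignments (tiny graphs only)."""
    best, best_score = None, -math.inf
    for combo in itertools.product(nodes, repeat=len(nodes)):
        y = dict(zip(nodes, combo))
        score = log_score(y, log_g, edges, in_nb, log_same)
        if score > best_score:
            best, best_score = y, score
    return best, best_score


nodes = [1, 2, 3]
edges = [(2, 1), (3, 1)]               # users 2 and 3 both point to user 1
in_nb = {1: [2, 3], 2: [], 3: []}
log_g = {1: {1: math.log(1.0)},
         2: {1: math.log(0.7), 2: math.log(0.3)},
         3: {1: math.log(0.6), 3: math.log(0.4)}}
best, score = map_assignment(nodes, log_g, edges, in_nb, math.log(2.0))
print(best, score)
```

In this toy case users 2 and 3 cannot represent themselves because nobody points to them, so every user ends up represented by user 1.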
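REPRESENTATIVE USER FINDING
Model learning
$a_{jj} \leftarrow \max_{k \in I(j)} \min\{r_{kj}, 0\}$
$a_{ij} \leftarrow \min\{ -\min\{r_{jj}, 0\} - \max_{k \in I(j) \setminus \{i\}} \min\{r_{kj}, 0\},\ \max\{r_{jj}, 0\} \}$
$r_{ij} \leftarrow g_{ij} + \sum_{k \in I(i) \cup O(i)} c_{ikj} - \max_{j' \in O(i) \cup \{i\} \setminus \{j\}} \{ g_{ij'} + a_{ij'} + \sum_{k \in I(i) \cup O(i)} c_{ikj'} \}$
$p_{ijk} \leftarrow g_{ik} + a_{ik} + \sum_{l \in I(i) \cup O(i) \setminus \{j\}} c_{ilk} - \max_{j' \in O(i)} \{ g_{ij'} + a_{ij'} + \sum_{l \in I(i) \cup O(i) \setminus \{j\}} c_{ilj'} \}$
[Update rule for $c_{ijk}$: a $\max\{\cdot, 0\}$ expression involving $\log$ and $p_{jik}$; the full formula is not legible in this copy.]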
REPRESENTATIVE USER FINDING
A brief explanation
$p_{ijk}$: how likely user $v_i$ is to persuade $v_j$ to take $v_k$ as his representative
$c_{ijk}$: how likely user $v_i$ is to comply with the suggestion from $v_j$ that he consider $v_k$ as his representative
The direction of this process: along the directed edges
[Figure: three panels, each showing the nodes v1, v2, v3.]
REPRESENTATIVE USER FINDING
Algorithm
COMMUNITY DISCOVERY
Problem definition
Given a social network $G$ and an expected number of communities $C$, a virtual node $u_c \in U$ is introduced for each community. The objective is to find a community $y_i$ for each person $v_i$, satisfying $y_i \in \{1, \dots, C\}$, where $y_i$ represents the community that $v_i$ belongs to, so as to maximize the preservation of structure (or maximize the modularity $Q$ of the communities).
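COMMUNITY DISCOVERY
Feature definition – What’s different?
Node feature function: $g_i(y_i) = \exp(\cdot)$, with an exponent that sums over the neighbors $j \in I(i) \cup O(i)$, compares $y_j$ with $y_i$, and involves $|X_{y_j}|$ [exact form not legible in this copy]
Edge feature function: $f_{i,j}(y_i, y_j) = \exp\{q_{i,j}\} = \exp\left\{ [y_i = y_j] \left( A_{i,j} - \frac{k_i k_j}{2m} \right) \right\}$
where $A_{i,j}$ is the weight of the edge between $v_i$ and $v_j$, $k_i$ is the degree of $v_i$, and $m$ is the total edge weight
[Figure: factor graph with virtual community nodes u1, u2; nodes v1 to v4; variables y1 to y4; node factors g1(y1) to g4(y4); and edge factors f2,1(y2,y1), f2,4(y2,y4), f2,3(y2,y3), f3,2(y3,y2), f1,3(y1,y3).]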
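To make the edge feature concrete, the sketch below evaluates the exponent $q_{i,j} = [y_i = y_j] (A_{i,j} - k_i k_j / 2m)$ on a toy graph. It assumes an undirected weighted graph, and the helper names `graph_stats` and `edge_log_potential` are hypothetical; the slides work with directed neighbor sets, so this is an illustration of the term only.

```python
from collections import defaultdict


def graph_stats(edges):
    """Return the symmetric weights A, the weighted degrees k, and 2m."""
    weight = defaultdict(float)
    degree = defaultdict(float)
    for i, j, w in edges:
        weight[(i, j)] += w
        weight[(j, i)] += w
        degree[i] += w
        degree[j] += w
    return weight, degree, sum(degree.values())


def edge_log_potential(i, j, y, weight, degree, two_m):
    """q_ij = [y_i = y_j] * (A_ij - k_i * k_j / 2m), i.e. log f_ij."""
    if y[i] != y[j]:
        return 0.0
    return weight.get((i, j), 0.0) - degree[i] * degree[j] / two_m


# Two triangles joined by the edge (2, 3); toy data.
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
             (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (2, 3, 1.0)]
weight, degree, two_m = graph_stats(toy_edges)
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}
print(edge_log_potential(0, 1, split, weight, degree, two_m))   # linked, same community
print(edge_log_potential(0, 4, merged, weight, degree, two_m))  # not linked, same community
```

Pairs that share a community and an edge contribute a positive exponent, while unconnected pairs forced into the same community contribute a negative one.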
COMMUNITY DISCOVERY
Algorithm
Result output and variable updates
Experiments
Dataset: Digg.com
A popular social news website where people discover and share content
9,583 users, 56,440 contacts
Various types of user behavior: submit, digg, comment, and reply to a comment (in total: 308,362)
Edges: an edge is created if one user diggs or comments on a story of another
Weight of the edge: the total number of diggs and comments
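The edge construction described here can be sketched as a simple counting pass over an action log. The record layout `(user, action, story_id)`, the `story_owner` lookup, the direction of the edge (from the acting user to the story's submitter), and the exclusion of actions on one's own stories are all assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter


def build_weighted_edges(actions, story_owner):
    """Count diggs and comments from each user on each other user's stories."""
    weights = Counter()
    for user, action, story_id in actions:
        owner = story_owner.get(story_id)
        if action in ("digg", "comment") and owner is not None and owner != user:
            weights[(user, owner)] += 1  # edge weight = number of diggs + comments
    return weights


# Hypothetical log records, not taken from the Digg dataset.
story_owner = {"s1": "alice"}
actions = [("alice", "submit", "s1"),
           ("bob", "digg", "s1"),
           ("bob", "comment", "s1"),
           ("carol", "digg", "s1"),
           ("alice", "digg", "s1")]
edges = build_weighted_edges(actions, story_owner)
print(dict(edges))
```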
Experiments
Dataset: Digg.com
9,583 users, 56,440 contacts
308,362 edges
Weight of the edge: the total number of diggs and comments
Settings:
Parameter: 0.6
Experiments
Result: 3 most self-representative users
on 3 different topics for Digg user network
Experiments
Result: 3 most representative users of 5 communities on 3 different subsets
Experiments
Result: Representative network on a subgraph of the Digg-2 network
[Figure: representative network over the users irfanmp, SirPopper, wonderwal, upick, maxthreepwood, zohaibusman, mpind176, numberneal, rocr69, optimusprime01, mklopez, louiebaur, GordonFree, pyrates, Omek, pavelmah, ritubpant, 1nfiniteLoop, and mikek814; edge labels range from 0.0000 to 0.0024.]
Modeling information diffusion in social network
Supervised model
Bridging the actual value (label) with the variable
More variables to come?
Learning the weights
Thanks