On the Submodularity of Influence in Social Networks
Elchanan Mossel & Sébastien Roch
STOC 2007
Speaker: Xinran He
[email protected]
Social Network
• Social network as a graph:
– Nodes represent individuals.
– Edges are social relations of different strengths:
• Neighbor and coworker relations in real life
• Virtual friendships on Facebook
• Follower-followee relations on Twitter
Diffusion in Social Networks
• The adoption of new products can propagate in the social network: diffusion in the social network.
• Information, rumors, innovations, ...
Influence Maximization
• Influence maximization: Find k people that generate the largest influence spread (i.e. the expected number of activated nodes) [KKT 2003].
Linear Threshold Model
• Given a social network with edge weights wuv and a set of initially active individuals S as the seed.
• Every individual v independently chooses a threshold θv uniformly in [0,1].
• At any later step t, a still-inactive node v becomes activated if
∑u∈Nv wuv ≥ θv,
where Nv is the set of activated direct neighbors of v.
• The diffusion ends when no more nodes are activated.
• The influence spread σ(S) = E[|Pend| | S] is the expected number of active nodes when the diffusion process ends.
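The update rule above can be sketched in a few lines of Python. This is an illustrative implementation, not code from the paper: the graph encoding (a dict mapping edges (u, v) to wuv), the function names, and the estimator are my own choices, and σ(S) is estimated by simply redrawing uniform thresholds.

```python
import random

def linear_threshold(weights, thresholds, seeds):
    """Run linear threshold diffusion to completion.

    weights: dict (u, v) -> w_uv, the influence of u on v
    thresholds: dict v -> theta_v in [0, 1]
    seeds: initially active set S
    Returns the final active set P_end.
    """
    active = set(seeds)
    changed = True
    while changed:                      # stop when no node is newly activated
        changed = False
        for v in thresholds:
            if v in active:
                continue
            # total weight from already-activated in-neighbors of v
            incoming = sum(w for (u, x), w in weights.items()
                           if x == v and u in active)
            if incoming >= thresholds[v]:
                active.add(v)
                changed = True
    return active

def influence_spread(weights, nodes, seeds, trials=1000):
    """Estimate sigma(S) = E[|P_end|] by redrawing thresholds each trial."""
    rng = random.Random(0)
    total = 0
    for _ in range(trials):
        thresholds = {v: rng.random() for v in nodes}
        total += len(linear_threshold(weights, thresholds, seeds))
    return total / trials
```

For example, with a single edge of weight 0.3 from a to b and seed {a}, node b is activated exactly when θb ≤ 0.3, so σ({a}) ≈ 1.3.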
Linear Threshold Example
[Figure: step-by-step example of linear threshold diffusion on a small graph (nodes u, v, w; edge weights between 0.1 and 0.6; each node's threshold shown beside it). Starting from the active seeds, nodes activate once the total weight from active neighbors reaches their threshold; the diffusion runs for steps 0–3 and then stops, since no more nodes can be activated.]
Influence Maximization
• Find a seed set S with |S| ≤ k such that σ(S) is maximized.
• The influence maximization problem is NP-hard under the linear threshold model [Kempe et al. 2003].
• We have to solve it approximately.
• Main tool for analysis:
Theorem: The greedy algorithm is a (1−1/e)-approximation for maximizing monotone and submodular set functions [Nemhauser/Wolsey 1978].
Submodular & Monotone
• A set function f: 2^V → ℝ is monotone if
f(S) ≤ f(T), for all S ⊆ T ⊆ V.
• A set function f: 2^V → ℝ is submodular if
f(S) + f(T) ≥ f(S ∩ T) + f(S ∪ T), for all S, T ⊆ V.
Submodularity
• A set function f is submodular if
f(S) + f(T) ≥ f(S ∩ T) + f(S ∪ T), for all S, T ⊆ V.
• Or equivalently,
f(T ∪ {v}) − f(T) ≤ f(S ∪ {v}) − f(S), for all S ⊆ T ⊆ V.
• Submodularity can be seen as a diminishing-returns property: adding an element to a larger set helps less.
Submodularity: Examples
• Maximum coverage problem: given a collection of sets S = {S1, ..., Sm} and a number k, find S' ⊆ S with |S'| ≤ k maximizing σ(S') = |∪Si∈S' Si|. This σ is submodular.
• The influence spread σ under the linear threshold model is submodular [Kempe et al. 2003]. ⇒ The influence maximization problem under the linear threshold model can be solved approximately.
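The coverage example lends itself to a direct check of the submodularity inequality. A small sketch (the collection of sets is made-up illustrative data):

```python
from itertools import combinations

def coverage(sets, chosen):
    """sigma(S') = size of the union of the chosen sets."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)

# Illustrative collection {S1, ..., S4}.
sets = {1: {1, 2, 3}, 2: {3, 4}, 3: {4, 5, 6}, 4: {1, 6}}
ids = list(sets)

# Check f(S) + f(T) >= f(S ∩ T) + f(S ∪ T) for every pair of subsets.
subsets = [set(c) for r in range(len(ids) + 1) for c in combinations(ids, r)]
for S in subsets:
    for T in subsets:
        assert (coverage(sets, S) + coverage(sets, T)
                >= coverage(sets, S & T) + coverage(sets, S | T))
```

The equivalent diminishing-returns form can be checked the same way: the marginal gain of any fixed set only shrinks as the current selection grows.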
General Threshold Model
Linear Threshold Model: ∑u∈Nv wuv ≥ θv
General Threshold Model: fv(S) ≥ θv
fv(S): activation function of node v over S, where S is the set of already activated nodes.
• The General Threshold model is a generalization of many diffusion models:
– fv(S) = ∑u∈Nv wuv — Linear Threshold Model [KKT 2003]
– fv(S) = 1 − ∏u∈Nv (1 − puv) — Independent Cascade Model [KKT 2003]
– fv(S) = 1 − ∏i=1..r (1 − pv(ωi, Si−1)) — Decreasing Cascade Model [KKT 2005]
– ...
General Threshold Model(2)
For the Linear Threshold model, the influence spread σ(S) is submodular [KKT 2003].
Conjecture: Under the general threshold model
with monotone and submodular fv , σ(S) is
monotone and submodular [KKT 2003].
Main Result
Theorem: Under the general threshold model with
monotone and submodular fv , σ(S) is monotone
and submodular [Mossel/Roch 2007].
Corollary: The greedy algorithm is a (1−1/e)-approximation for the influence maximization problem under the general threshold model.
Proof: General Idea(1)
• Couple four diffusion processes:
A = {A0 = S, A1, A2, ..., Aend}
B = {B0 = T, B1, B2, ..., Bend}
C = {C0 = S∩T, C1, C2, ..., Cend}
D = {D0 = S∪T, D1, D2, ..., Dend}
• such that Ct ⊆ At ∩ Bt and Dt ⊆ At ∪ Bt for all t.
Proof: General Idea(2)
If Ct ⊆ At ∩ Bt and Dt ⊆ At ∪ Bt for all t, then
|Aend| + |Bend| ≥ |Aend ∩ Bend| + |Aend ∪ Bend| ≥ |Cend| + |Dend|.
Taking expectations, we have
σ(S) + σ(T) ≥ σ(S ∩ T) + σ(S ∪ T).
Ct ⊆ At ∩ Bt
• Couple the four processes with the same thresholds θv.
• Show Ct ⊆ At and Ct ⊆ Bt by induction on t.
– Base case: C0 = S ∩ T ⊆ S = A0.
– Assume Ct ⊆ At.
– For a node v still inactive at step t, monotonicity of fv gives fv(Ct) ≤ fv(At). Therefore if v is activated at step t+1 in C, it is also activated in A.
⇒ Ct+1 ⊆ At+1
Dt ⊆ At ∪ Bt: First Attempt
• Let's try the same coupling method for Dt ⊆ At ∪ Bt.
[Figure: a three-node run of processes D, A, and B. Nodes 1 and 2 each point to node 3 with weight 0.3, and θ3 = 0.5. In D both 1 and 2 are seeds, so node 3 activates (0.3 + 0.3 ≥ 0.5); in A and B only one of them is a seed, so node 3 activates in neither. The same-threshold coupling alone does not give Dt ⊆ At ∪ Bt.]
Antisense Coupling
• Then how can we keep Dt ⊆ At ∪ Bt?
• Intuitively, using θ for the activation of S and 1 − θ for the activation of T will maximize their union.
Piecemeal Growth
Define P = P(S(1), ..., S(k)) as the piecemeal growth diffusion process, where S(1), ..., S(k) is a partition of the seed set S:
– Stage 1: add S(1) and grow until the diffusion ends.
– Stage 2: add S(2) and grow until the diffusion ends.
– ...
– Stage k: add S(k) and grow until the diffusion ends.
Lemma: The distributions over the activated node set at the end of the original process with seed set S and of the piecemeal growth process P(S(1), ..., S(k)) are identical.
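Under the linear threshold model the lemma can even be observed sample-wise: once the thresholds are fixed, the final active set is the closure of the seed set under the activation rule, so adding the seeds in stages gives the same final set as adding them all at once. A small sketch (the graph, weights, and function names are made up for illustration):

```python
import random

def lt_diffuse(weights, thresholds, active):
    """Grow the given active set under the LT rule until it stabilizes."""
    active = set(active)
    changed = True
    while changed:
        changed = False
        for v in thresholds:
            if v not in active:
                incoming = sum(w for (u, x), w in weights.items()
                               if x == v and u in active)
                if incoming >= thresholds[v]:
                    active.add(v)
                    changed = True
    return active

def piecemeal(weights, thresholds, stages):
    """Add each part of the seed partition in turn, diffusing to completion."""
    active = set()
    for part in stages:
        active = lt_diffuse(weights, thresholds, active | set(part))
    return active

rng = random.Random(42)
nodes = range(8)
# a random sparse weighted digraph (illustrative)
weights = {(u, v): 0.4 for u in nodes for v in nodes
           if u != v and rng.random() < 0.3}
for _ in range(100):
    thresholds = {v: rng.random() for v in nodes}
    # one-shot seeding and staged seeding reach the same final set
    assert (lt_diffuse(weights, thresholds, {0, 1})
            == piecemeal(weights, thresholds, [{0}, {1}]))
```

This sample-wise equality is special to fixed thresholds; the lemma in the paper is the distributional statement for the general threshold model.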
Piecemeal Growth: Proof
• Couple three piecemeal growth processes T', T, T'' and the original process S with the same thresholds θ (illustrated for k = 2):
– T': add S at stage 1, nothing at stage 2.
– T: add S(1) at stage 1, S(2) at stage 2.
– T'': add nothing at stage 1, S at stage 2.
• By induction, T''s ⊆ Ts ⊆ T's at every step s, and T'end = T''end = Send,
so that Send = Tend.
Need-to-know Representation(1)
• Consider the diffusion in a different way: the Need-to-know Representation.
• Principle of Deferred Decisions: we don't decide all thresholds at the beginning; instead we reveal the value of a threshold whenever it is needed.
• For example: if node v is inactive at step t−1, we only need to know whether it is activated at step t, i.e. whether θv falls in the interval [fv(St−2), fv(St−1)].
Need-to-know Representation(2)
Lemma: The following process is equivalent to the original one:
1. Initialize S0 = S.
2. At each step 1 ≤ t ≤ n−1, initialize St = St−1, and for each still-inactive node v:
– With probability (fv(St−1) − fv(St−2)) / (1 − fv(St−2)), v becomes activated, and we pick θv uniformly in [fv(St−2), fv(St−1)].
– Otherwise we do nothing.
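As a sanity check on this representation, the step-wise conditional probabilities should compose to the marginal law of a uniform threshold: the chance that v is still inactive after step t must equal 1 − fv(St−1). A small numeric sketch (the fv values are made up):

```python
def survival(f_values):
    """Probability that v is still inactive after following the
    deferred-decision process along values f_v(S_0) <= ... <= f_v(S_{t-1}),
    all assumed < 1."""
    p_inactive = 1.0
    prev = 0.0                  # before the diffusion starts, f_v(empty) = 0
    for f in f_values:
        p_activate = (f - prev) / (1.0 - prev)   # conditional step probability
        p_inactive *= 1.0 - p_activate
        prev = f
    return p_inactive

# With a uniform threshold, P(still inactive after step t) = 1 - f_v(S_{t-1}).
fs = [0.1, 0.3, 0.55, 0.8]
for t in range(len(fs)):
    assert abs(survival(fs[:t + 1]) - (1.0 - fs[t])) < 1e-12
```

The product telescopes, ∏ (1 − fi)/(1 − fi−1) = 1 − ft, which is exactly why revealing the threshold lazily is equivalent to drawing it up front.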
Antisense Coupling(1)
Define the antisense diffusion Q = Q(S(1), ..., S(k); T), where S(1), ..., S(k) is a partition of the seed set S:
• Stages 1 to k: piecemeal growth of S(1), ..., S(k), each stage run until it ends.
• At time τ, the beginning of stage k+1, add T.
• At any step t in the final stage, activate nodes under the condition fv(Qt) ≥ fv(Qτ) + 1 − θv.
Antisense Coupling(2)
[Figure: the processes P and Q run the same stages — grow S(1), ..., S(k), then T — and coincide up to time τ. In the final stage, P activates v when fv(Pt) ≥ θv, while Q activates v when fv(Qt) ≥ θ'v, where θ'v = fv(Qτ) + 1 − θv.]
Antisense Coupling(3)
[Figure: P and Q side by side, identical through time τ, then each growing T in its final stage.]
Lemma: The distributions over the activated node set at the end of the piecemeal growth process P(S(1), ..., S(k); T) and the antisense diffusion process Q(S(1), ..., S(k); T) are identical.
Antisense Coupling: Proof(1)
• From the Need-to-know Representation point of view: for any node v still inactive at time t = τ, θv is uniformly distributed in [fv(Pτ), 1] = [fv(Qτ), 1].
Antisense Coupling: Proof(2)
• For any still-inactive node, θv is picked uniformly in [fv(Pτ), 1].
• Define θ'v = fv(Qτ) + 1 − θv.
• Since θv and θ'v have the same distribution (the reflection maps the interval [fv(Qτ), 1] onto itself), the final stage of growing T has the same law in P and Q.
• Therefore Pend and Qend have the same distribution.
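The distributional identity rests on one elementary fact: the antisense map θ ↦ fv(Qτ) + 1 − θ is a reflection of the interval [fv(Qτ), 1] onto itself, so it preserves the uniform law. A quick numeric sketch (the constant c stands in for fv(Qτ); the sampling check is illustrative, not a proof):

```python
import random

def reflect(theta, c):
    """Antisense transform: theta' = c + 1 - theta."""
    return c + 1.0 - theta

c = 0.4                                  # plays the role of f_v(Q_tau)
rng = random.Random(1)
samples = [c + (1 - c) * rng.random() for _ in range(100_000)]   # U[c, 1]
reflected = [reflect(t, c) for t in samples]

# The reflection maps [c, 1] onto itself ...
assert all(c - 1e-9 <= t <= 1 + 1e-9 for t in reflected)
# ... and preserves the uniform distribution (same mean (c + 1) / 2).
mean = sum(reflected) / len(reflected)
assert abs(mean - (c + 1) / 2) < 0.01
```

The endpoints swap (θ = c maps to θ' = 1 and vice versa), which is why "antisense" is an apt name.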
Coupling: Overview
• Run the three processes in three stages, each stage until it ends:
– A: grow S∩T, then S\T, then nothing.
– B: grow S∩T, then nothing, then T\S.
– D: grow S∩T, then S\T, then T\S.
• Goal: Dt ⊆ At ∪ Bt for any step t in all three stages.
Coupling: First two stages
• At = Dt for all t in the first two stages.
• Therefore Dt ⊆ At ∪ Bt for all steps t in the first two stages.
• It remains to show Dt ⊆ At ∪ Bt for any step t in the final stage, which begins at time τ.
Coupling: Antisense Coupling
• We first prove Dt \ Dτ ⊆ Bt \ Bτ for any step t in the final stage by induction on t.
• Base case: Dτ+1 \ Dτ ⊆ Bτ+1 \ Bτ, because
Dτ+1 = Dτ ∪ (T \ S), Bτ+1 = Bτ ∪ (T \ S), and Bτ ⊆ Dτ.
Coupling: Antisense Coupling
• Assume Dt \ Dτ ⊆ Bt \ Bτ. We need to show that Dt+1 \ Dτ ⊆ Bt+1 \ Bτ.
Lemma: For any S ⊆ S' and T ⊆ T' and submodular f, we have f(S ∪ T') − f(S) ≥ f(S' ∪ T) − f(S').
• Apply the lemma with S = Bτ, S' = Dτ, T = Dt \ Dτ, T' = Bt \ Bτ (valid since Bτ ⊆ Dτ and, by the induction hypothesis, Dt \ Dτ ⊆ Bt \ Bτ):
fv(Bt) − fv(Bτ) ≥ fv(Dt) − fv(Dτ).
• So if v is activated in D at step t+1, i.e. fv(Dt) ≥ fv(Dτ) + 1 − θv, then fv(Bt) ≥ fv(Bτ) + 1 − θv, and v is also activated in B.
⇒ Dt+1 \ Dτ ⊆ Bt+1 \ Bτ
Coupling: Wrapup
• Therefore, for any step t in the final stage:
Dt \ Dτ ⊆ Bt \ Bτ ⊆ Bt
At = Dτ for all t in the final stage
⇒ Dt = Dτ ∪ (Dt \ Dτ) ⊆ At ∪ Bt.
• Together with Ct ⊆ At ∩ Bt (previously proved), this completes the coupling.
Further Generalization
• We have defined σ(S) = E[|Pend| | S].
• We can instead introduce a set function ω(·) on Pend and define the influence spread as σω(S) = E[ω(Pend) | S].
Theorem: Under the general threshold model with
monotone and submodular fv and ω, σω(S) is
monotone and submodular. [Mossel/Roch 2007]
Further Generalization: Proof
Assume Cend ⊆ Aend ∩ Bend and Dend ⊆ Aend ∪ Bend.
Then ω(Aend) + ω(Bend)
≥ ω(Aend ∩ Bend) + ω(Aend ∪ Bend)  (submodularity of ω)
≥ ω(Cend) + ω(Dend).  (monotonicity of ω)
Taking expectations, we have
σω(S) + σω(T) ≥ σω(S ∩ T) + σω(S ∪ T).
Conclusion
• General Threshold Model generalizes many
popular diffusion models.
Theorem: Under the general threshold model with
monotone and submodular fv and ω, σω(S) is
monotone and submodular. [Mossel/Roch 2007]
• Proof methodology: Coupling (piecemeal
growth & antisense coupling)
Algorithm for Influence Maximization
Corollary: The greedy algorithm is a (1−1/e)-approximation for the influence maximization problem under the general threshold model.
Algorithm 1 : Greedy(k)
1 : initialize S to empty set
2 : for i = 1 to k do
3 : select u = arg max v∈V \ S (σ ( S ∪ {v}) − σ ( S ))
4 : S = S ∪ {u}
5 : end for
6 : return S
Algorithm for Influence Maximization
Algorithm 1 : Greedy(k)
1 : initialize S to empty set
2 : for i = 1 to k do
3 : select u = arg max v∈V \ S (σ ( S ∪ {v}) − σ ( S ))
4 : S = S ∪ {u}
5 : end for
6 : return S
• Time complexity: O(knCm), where n = |V|, m = |E|, and C is the number of Monte Carlo simulations used to estimate σ.
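A minimal runnable sketch of Greedy(k), with σ estimated by plain Monte Carlo simulation under the linear threshold model. The graph, trial count, and helper names are illustrative, and none of the speedups from the next slide (lazy evaluation, etc.) are applied:

```python
import random

def estimate_spread(weights, nodes, seeds, trials=200):
    """Monte Carlo estimate of sigma(S) under the linear threshold model."""
    rng = random.Random(0)
    total = 0
    for _ in range(trials):
        thresholds = {v: rng.random() for v in nodes}
        active = set(seeds)
        changed = True
        while changed:                  # run one LT diffusion to completion
            changed = False
            for v in nodes:
                if v not in active:
                    incoming = sum(w for (u, x), w in weights.items()
                                   if x == v and u in active)
                    if incoming >= thresholds[v]:
                        active.add(v)
                        changed = True
        total += len(active)
    return total / trials

def greedy(weights, nodes, k):
    """Greedy(k): repeatedly add the node with the largest marginal gain."""
    S = set()
    for _ in range(k):
        best = max((v for v in nodes if v not in S),
                   key=lambda v: estimate_spread(weights, nodes, S | {v}))
        S.add(best)
    return S
```

On a small star where node 0 points to nodes 1 and 2 with weight 0.9, greedy(weights, [0, 1, 2, 3], 1) picks {0}, the obvious best single seed.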
Algorithm for Influence Maximization

Name   | Main Idea                                     | Model | Guarantee           | Reference
CELF   | Lazy forward optimization                     | All   | 1 − 1/e             | Leskovec et al. 2007
CELF++ | Further optimization of CELF                  | All   | 1 − 1/e             | Goyal et al. 2011
PMIA   | Use directed tree structure                   | IC    | No                  | Chen et al. 2010
LDAG   | Use DAG structure                             | LT    | No                  | Chen et al. 2010
IRIE   | Use PageRank to initialize and update locally | IC    | No                  | Chen et al. 2012
CGA    | Use community structure                       | IC    | 1 − e^(−1/(1+∆dθ))  | Wang et al. 2010
MSA    | Simulated annealing                           | All   | No                  | Jiang et al. 2011
Open Questions
• Different classes of activation function fv:
– Does a locally subadditive fv imply a globally subadditive influence spread σ(S)?
• Find approximation algorithms for the influence maximization problem under diffusion models with a non-submodular influence spread σ(S).