Statistical Inference for Some Network Models

Statistical Inference for Some Network Models
Aad van der Vaart
Mathematical Institute
Leiden University
Conference on Probability and Statistics in High Dimensions
A Scientific Tribute to Evarist Giné
CRM, Barcelona, June 2016
1 / 15
Co-author
Fengnan Gao
Remco van der Hofstad
Rui Castro
2 / 15
Preferential Attachment
Start: complete graph with nodes 1, 2.
3 / 15
Preferential Attachment
Start: complete graph with nodes 1, 2.
Recursions for n = 2, 3, . . .:
(n)
(n)
given graph with nodes 1, 2, . . . , n with degrees d1 , . . . , dn ,
(n)
connect node n + 1 to node k ∈ {1, . . . , n} with probability ∝ f (dk ).
3 / 15
Preferential Attachment
Start: complete graph with nodes 1, 2.
Recursions for n = 2, 3, . . .:
(n)
(n)
given graph with nodes 1, 2, . . . , n with degrees d1 , . . . , dn ,
(n)
connect node n + 1 to node k ∈ {1, . . . , n} with probability ∝ f (dk ).
f(k)=k
1/2
1/2
3 / 15
Preferential Attachment
Start: complete graph with nodes 1, 2.
Recursions for n = 2, 3, . . .:
(n)
(n)
given graph with nodes 1, 2, . . . , n with degrees d1 , . . . , dn ,
(n)
connect node n + 1 to node k ∈ {1, . . . , n} with probability ∝ f (dk ).
f(k)=k
1/2
1/2
2/4
1/4
1/4
3 / 15
Preferential Attachment
Start: complete graph with nodes 1, 2.
Recursions for n = 2, 3, . . .:
(n)
(n)
given graph with nodes 1, 2, . . . , n with degrees d1 , . . . , dn ,
(n)
connect node n + 1 to node k ∈ {1, . . . , n} with probability ∝ f (dk ).
f(k)=k
1/2
1/2
2/4
1/4
1/4
3/6
1/6
1/6
1/6
3 / 15
Preferential Attachment
Start: complete graph with nodes 1, 2.
Recursions for n = 2, 3, . . .:
(n)
(n)
given graph with nodes 1, 2, . . . , n with degrees d1 , . . . , dn ,
(n)
connect node n + 1 to node k ∈ {1, . . . , n} with probability ∝ f (dk ).
f(k)=k
1/2
1/2
2/4
1/4
1/4
3/6
1/6
1/6
1/6
Estimate the attachment function f : N → [0, ∞) from observed network
4 / 15
Preferential Attachment with Random Initial Degree
For i.i.d. m1 , m2 , . . ., connect node n to mn existing nodes.
Start: graph with nodes 1, 2 and m1 edges.
5 / 15
Preferential Attachment with Random Initial Degree
For i.i.d. m1 , m2 , . . ., connect node n to mn existing nodes.
Start: graph with nodes 1, 2 and m1 edges.
Recursions for n = 2, 3, . . .:
for i = 1, . . . , mn ,
(n,i−1)
given graph with nodes 1, 2, . . . , n with current degrees d1
(n,i−1)
, . . . , dn
,
(n,i−1)
connect node n + 1 to node k ∈ {1, . . . , n} with probability ∝ f (dk
).
5 / 15
Degree distribution
pk (n): =
1
#nodes 1, 2, . . . , n of degree k.
n
6 / 15
Degree distribution
pk (n): =
THEOREM
1
#nodes 1, 2, . . . , n of degree k.
n
[Barabasi-Albert, 2000; Mori 2002; Rudas, Toth, Valko, 2007]
k−1
Y f (j)
α
,
pk (n) → pk : =
α + f (k)
α + f (j)
a.s..
j=1
(α makes (pk ) a probability distribution on N.)
EXAMPLE
f (k) = k
f (k) = k + δ
f (k) = k β , β ∈ [1/2, 1)
[Barabasi, Albert, 99]
[Krapivsky, Redner, 01]
pk = 4/(k(k + 1)(k + 2)).
pk ∼ k −3−δ .
1−β
c
−c
k
1
2
pk = k e
.
6 / 15
Preferential Attachment with f (k) = k
start movie
Movies by Matjaz Perc downloaded from Youtube
7 / 15
Preferential Attachment with f (k) = k 0.25 or f (k) = k or f (k) = k 2
From movies by Matjaz Perc downloaded from Youtube
8 / 15
Empirical Estimator
p>k (n)
ˆ
fn (k) =
pk (n)
Motivation: # nodes of degree > k at time t
= # of times up to time t that the new node chose a node of degree k .
9 / 15
Empirical Estimator
p>k (n)
ˆ
fn (k) =
pk (n)
Motivation: # nodes of degree > k at time t
= # of times up to time t that the new node chose a node of degree k .
So, for large t:
p>k (n) ≈ P(node t + 1 connects to node of degree k)
∝ f (k)pk (t) ≈ f (k)pk (n).
9 / 15
Empirical Estimator
p>k (n)
ˆ
fn (k) =
pk (n)
Motivation: # nodes of degree > k at time t
= # of times up to time t that the new node chose a node of degree k .
So, for large t:
p>k (n) ≈ P(node t + 1 connects to node of degree k)
∝ f (k)pk (t) ≈ f (k)pk (n).
THEOREM
[Gao,vdV]
P
ˆ
fn (k) → f (k)/ j f (j)pj a.s. as n → ∞, for every fixed k .
Proof is based on LLN of supercritical branching processes by Jagers, 1975 and Nerman,
1981, along the lines of Rudas, Toth, Valko, 2008.
9 / 15
Supercritical Branching
Individual x born at time σx has children at times of counting process
(ξx (t − σx ): t ≥ σx ).
For given numerical time-dependent characteristic (φx (t − σx ): t ≥ σx ):
Ztφ : =
X
x:σx ≤t
φx (t − σx ).
If (ξx , φx , ψx ) are i.i.d. ∼ (ξ, φ, ψ) and suitably integrable, then
Ztφ
Ztψ
R −αt
e Eφ(t) dt
R
→
,
e−αt Eψ(t) dt
for α the “Malthusian parameter”:
R
a.s.,
e−αt µ(dt) = 1, for µ(t) = Eξ(t).
φ
In fact e−αt Zt converges to a random limit.
EXAMPLE
φ(t) = 1t≥0 gives Ztφ = #(x: σx ≤ t).
10 / 15
Maximum Likelihood
Growing the graph is (nonstationary) Markov.
The log likelihood for observing the full evolution up to time n is
f 7→ log
n
Y
f (dt )
t=3
Sf (t)
=
∞
X
k=1
log f (k)N>k (n) −
n
X
log Sf (t),
t=3
where
dt = degree of the node to which node t + 1 is attached,
Sf (t) = tf (1) +
t−1
X
i=2
f (di + 1) − f (di ) =
∞
X
f (k)tpk (t).
k=1
11 / 15
Maximum Likelihood
Growing the graph is (nonstationary) Markov.
The log likelihood for observing the full evolution up to time n is
f 7→ log
n
Y
f (dt )
t=3
Sf (t)
=
∞
X
k=1
log f (k)N>k (n) −
n
X
log Sf (t),
t=3
where
dt = degree of the node to which node t + 1 is attached,
Sf (t) = tf (1) +
t−1
X
i=2
f (di + 1) − f (di ) =
∞
X
f (k)tpk (t).
k=1
The degree sequence d3 , d4 , . . . , dn is sufficient.
11 / 15
Maximum Likelihood in the Affine Case fδ (k) = k + δ
Sfδ (t) =
∞
X
(k + δ)tpk (t) = 2t + tδ
k=1
The log likelihood for observing the full evolution up to time n is
δ 7→ log
n
Y
fδ (dt )
t=3
Sfδ (t)
=
∞
X
k=1
log(k + δ)N>k (n) −
n
X
log(2t + tδ),
t=3
12 / 15
Maximum Likelihood in the Affine Case fδ (k) = k + δ
Sfδ (t) =
∞
X
(k + δ)tpk (t) = 2t + tδ
k=1
The log likelihood for observing the full evolution up to time n is
δ 7→ log
n
Y
fδ (dt )
t=3
Sfδ (t)
=
∞
X
k=1
log(k + δ)N>k (n) −
n
X
log(2t + tδ),
t=3
Observation of the graph at time n is sufficient for the full evolution.
12 / 15
Maximum Likelihood in the Affine Case fδ (k) = k + δ
Sfδ (t) =
∞
X
(k + δ)tpk (t) = 2t + tδ
k=1
The log likelihood for observing the full evolution up to time n is
δ 7→ log
n
Y
fδ (dt )
t=3
Sfδ (t)
=
∞
X
k=1
log(k + δ)N>k (n) −
n
X
log(2t + tδ),
t=3
THEOREM [Gao, vdV]
The model is locally asymptotically normal in parameter δ and
√
n(δ̂n − δ)
N (0, i−1
δ ),
iδ =
∞
X
k=1
µ(k + δ)pδ,k
µ
−
.
(k + δ)2 (2µ + δ) (2µ + δ)2
µ = mean initial degree distribution
Proof uses the martingale central limit theorem.
12 / 15
Preferential Attachment in the Affine Case f (k) = k + δ
log pk (n) (vertical) versus log k for single realization with n = 150000 and m = 5.
pk ∼ k −3−δ ,
k → ∞.
13 / 15
Maximum Likelihood in the General Parametric Case fθ , for θ ∈ Rd
The log likelihood for observing the full evolution up to time n is
θ 7→ log
n
Y
fθ (dt )
t=3
Sfθ (t)
=
∞
X
k=1
log fθ (k)N>k (n) −
n
X
log Sfθ (t),
t=3
THEOREM
Under general conditions on fθ the model of observing the full evolution up to time n is
locally asymptotically normal with respect to θ and
√
n(θ̂n − θ)
N (0, i−1
θ ),
iθ =
∞ ˙
X
fθ
k=1
P∞ ˙
k=1 fθ (k)pθ,k
P
(k)pθ,>k − ∞
.
fθ
k=1 fθ (k)pθ,k
Proof uses the martingale central limit theorem and the LLN for supercritical branching
processes.
14 / 15
Maximum Likelihood in the General Parametric Case fθ , for θ ∈ Rd
The log likelihood for observing the full evolution up to time n is
θ 7→ log
n
Y
fθ (dt )
t=3
Sfθ (t)
=
∞
X
k=1
log fθ (k)N>k (n) −
n
X
log Sfθ (t),
t=3
THEOREM
Under general conditions on fθ the model of observing the full evolution up to time n is
locally asymptotically normal with respect to θ and
√
n(θ̂n − θ)
N (0, i−1
θ ),
iθ =
∞ ˙
X
fθ
k=1
P∞ ˙
k=1 fθ (k)pθ,k
P
(k)pθ,>k − ∞
.
fθ
k=1 fθ (k)pθ,k
Proof uses the martingale central limit theorem and the LLN for supercritical branching
processes.
EXAMPLE
??
fθ (k) = (k + δ)β , for θ = (δ, β).
14 / 15
Some Open Questions
Is observing the graph at time n asymptotically sufficient for observing the full evolution
up to time n?
15 / 15
Some Open Questions
Is observing the graph at time n asymptotically sufficient for observing the full evolution
up to time n?
Does the maximum likelihood estimator based on observing the graph at time n
behave the same as the maximum likelihood estimator based on the full evolution?
15 / 15
Some Open Questions
Is observing the graph at time n asymptotically sufficient for observing the full evolution
up to time n?
Does the maximum likelihood estimator based on observing the graph at time n
behave the same as the maximum likelihood estimator based on the full evolution?
Is the empirical estimator asymptotically normal?
15 / 15
Some Open Questions
Is observing the graph at time n asymptotically sufficient for observing the full evolution
up to time n?
Does the maximum likelihood estimator based on observing the graph at time n
behave the same as the maximum likelihood estimator based on the full evolution?
Is the empirical estimator asymptotically normal?
Can we estimate f under nonparametric shape constraints?
15 / 15