An empirical Bayes approach to network recovery
using external data
G.B. Kpogbezan (UL)
Joint work with:
Aad van der Vaart (UL), Wessel N. van Wieringen (VU &
VUmc), Gwenael G.R. Leday (MRC Biostatistics Unit,
Cambridge), Mark van de Wiel (VU & VUmc)
Background
Model
VB vs Gibbs sampling
Simulation and Illustration
Conclusions
- Goal: network reconstruction from data, using prior/external knowledge.
- Network = graph. A graph is a pair (I, E), where I = {1, 2, ..., p} is a set of indices representing nodes and E ⊆ I × I is the set of edges (relations between the nodes).
- Here edges reflect conditional dependencies between the nodes ⇒ Conditional Independence Graph.
- Data: Y_j ∼iid N(0, Ω_p^{-1}), j ∈ {1, ..., n}, where Ω_p^{-1} is the covariance matrix and Ω_p = (w_kl)_{k,l=1,...,p} is the inverse covariance (or precision) matrix.
- In this setting (Gaussian Graphical Model), corr(Y_i1, Y_i2 | Y_{-i1,-i2}) = -w_i1i2 / √(w_i1i1 w_i2i2): the entry w_i1i2 determines the conditional dependency, and w_i1i2 = 0 means conditional independence.
- Reconstructing the network (conditional independence graph) is equivalent to determining the support of Ω_p (illustrated below).
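As a small illustration (a minimal sketch with an invented 3x3 precision matrix), the partial correlations follow from Ω_p via base R's cov2cor(), and a zero entry of Ω_p corresponds to an absent edge:

# Support of Omega_p encodes the conditional independence graph;
# partial correlations are the off-diagonal entries of -cov2cor(Omega).
Omega <- matrix(c( 2.0, -0.8,  0.0,
                  -0.8,  2.0, -0.5,
                   0.0, -0.5,  2.0), nrow = 3, byrow = TRUE)

pcor <- -cov2cor(Omega)    # partial correlations on the off-diagonal
diag(pcor) <- 1
pcor
# w_13 = 0 gives partial correlation 0: nodes 1 and 3 are conditionally
# independent given node 2, so edge (1, 3) is absent from the graph.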
- For n ≪ p, as is typical for gene expression data, (unpenalized) estimation of Ω_p is not feasible.
- One proposed solution, the graphical lasso: maximize the penalized log-likelihood

  log det(Ω_p) - tr(SΩ_p) - ρ ||Ω_p||_1

  over the space M^+ of positive definite matrices, with sample covariance S and shrinkage parameter ρ > 0.
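A minimal usage sketch, assuming the CRAN package glasso and toy data with no real structure:

# Graphical lasso: penalized MLE of the precision matrix.
library(glasso)

set.seed(1)
n <- 50; p <- 20
Y <- matrix(rnorm(n * p), n, p)    # toy data, no real structure
S <- cov(Y)                        # sample covariance

fit <- glasso(S, rho = 0.2)        # rho is the shrinkage parameter
Omega_hat <- fit$wi                # estimated (sparse) precision matrix
sum(Omega_hat[upper.tri(Omega_hat)] != 0)   # number of selected edges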
SEMs
- Simultaneous Equations Models (SEMs): model the full conditional distribution of each node, resulting in a system of p regression equations.
- It is:

  Y_i = Σ_{s=1, s≠i}^p β_is Y_s + ε_i,   i ∈ I.   (1)
- Equivalence between regression parameters and precision matrix elements, namely β_is = -w_is / w_ii.
- Estimating the support of Ω_p ⟺ variable selection in p regressions (a lasso-based sketch follows).
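A minimal sketch of this node-wise selection with a lasso penalty (the approach discussed on the next slide), assuming the CRAN package glmnet and toy data:

# Variable selection in the p node-wise regressions via the lasso.
library(glmnet)

set.seed(1)
n <- 50; p <- 30
Y <- matrix(rnorm(n * p), n, p)    # toy data

neighbors <- vector("list", p)
for (i in seq_len(p)) {
  fit <- cv.glmnet(Y[, -i], Y[, i])         # regress node i on the rest
  beta <- coef(fit, s = "lambda.min")[-1]   # drop the intercept
  neighbors[[i]] <- which(beta != 0)        # indices within Y[, -i]
}
# Edge (i, s) is declared present when beta_is != 0 (possibly combined
# with an AND/OR rule across the two regressions involving i and s).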
Bayesian SEM
- Meinshausen & Buehlmann put a lasso penalty on each regression parameter to select the neighbors of each variable.
- Previously we proposed a Bayesian formulation of the SEM (BSEM), putting priors on the parameters in (1):

  ε_i ∼ N(0_n, σ_i^2 I_n),
  β_is ∼ N(0, σ_i^2 τ_i^{-2}),
  τ_i^2 ∼ Γ(a, b),   (2)
  σ_i^{-2} ∼ Γ(c, d),

  c, d: non-informative.
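As a reference point for the Gibbs comparison later, here is a minimal Gibbs sampler sketch for a single regression equation of (2); the full conditionals below are the standard ones for this normal-gamma hierarchy and are an assumption of this sketch, not a transcript of our implementation:

# Gibbs sampler for one regression of model (2):
# y = X beta + eps, eps ~ N(0, sigma^2 I), beta_s ~ N(0, sigma^2 / tau^2),
# tau^2 ~ Gamma(a, b), 1/sigma^2 ~ Gamma(c, d).
gibbs_bsem <- function(y, X, a = 1, b = 1, c = 0.001, d = 0.001,
                       n_iter = 2000) {
  n <- length(y); q <- ncol(X)
  tau2 <- 1; sig2inv <- 1
  draws <- list(beta = matrix(NA, n_iter, q),
                tau2 = numeric(n_iter), sig2inv = numeric(n_iter))
  XtX <- crossprod(X); Xty <- crossprod(X, y)
  for (t in seq_len(n_iter)) {
    # beta | rest ~ N(m, sigma^2 A^{-1}) with A = X'X + tau^2 I
    A <- XtX + tau2 * diag(q)
    R <- chol(A)
    m <- backsolve(R, forwardsolve(t(R), Xty))
    beta <- m + backsolve(R, rnorm(q)) / sqrt(sig2inv)
    # tau^2 | rest ~ Gamma(a + q/2, b + sigma^{-2} ||beta||^2 / 2)
    tau2 <- rgamma(1, a + q / 2, b + sig2inv * sum(beta^2) / 2)
    # 1/sigma^2 | rest ~ Gamma(c + (n+q)/2, d + (rss + tau^2 ||beta||^2)/2)
    rss <- sum((y - X %*% beta)^2)
    sig2inv <- rgamma(1, c + (n + q) / 2, d + (rss + tau2 * sum(beta^2)) / 2)
    draws$beta[t, ] <- beta; draws$tau2[t] <- tau2; draws$sig2inv[t] <- sig2inv
  }
  draws
}
# e.g. draws <- gibbs_bsem(Y[, 1], Y[, -1]) for the first regression equation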
Variational Bayes
- Variational approximation to a distribution = the closest element in a target set Q, chosen both for computational tractability and accuracy.
- Distance is measured by the Kullback-Leibler divergence.
- Distributions Q with stochastically independent marginals (i.e. product laws) are popular.
- The accuracy of the approximation is naturally restricted to the marginal distributions.
Why is Bayesian SEM attractive?
- Double shrinkage; the amount of shrinkage is node-specific.
- Fast posterior approximation by variational Bayes.
- Approximate joint posterior under the product-laws assumption:

  π(β_i, τ_i^2, σ_i^{-2}) ≈ q(β_i, τ_i^2, σ_i^{-2}) = q_1(β_i) q_2(τ_i^2) q_3(σ_i^{-2})

- Appealing EB procedure for hyperparameter estimation.
- Efficient EM-type algorithms for the minimization (see the sketch below).
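A minimal sketch of these coordinate-ascent updates for one regression equation of (2); the closed-form updates are the standard mean-field ones for this hierarchy (an assumption of this sketch), and the EB maximization of M(a, b) is omitted:

# Mean-field VB for one regression of model (2):
# q(beta) q(tau^2) q(sigma^{-2}), with q(beta) normal, the others gamma.
vb_bsem <- function(y, X, a = 1, b = 1, c = 0.001, d = 0.001,
                    n_iter = 100) {
  n <- length(y); q <- ncol(X)
  XtX <- crossprod(X); Xty <- crossprod(X, y)
  E_tau2 <- 1; E_sig2inv <- 1
  for (t in seq_len(n_iter)) {
    # q(beta) = N(m, S) with precision E[sigma^{-2}] (X'X + E[tau^2] I)
    A_inv <- solve(XtX + E_tau2 * diag(q))
    m <- A_inv %*% Xty
    S <- A_inv / E_sig2inv
    E_bb <- sum(m^2) + sum(diag(S))             # E ||beta||^2
    # q(tau^2) = Gamma(a + q/2, b + E[sigma^{-2}] E||beta||^2 / 2)
    E_tau2 <- (a + q / 2) / (b + E_sig2inv * E_bb / 2)
    # q(sigma^{-2}) = Gamma(c + (n+q)/2,
    #                       d + (E||y - X beta||^2 + E[tau^2] E||beta||^2)/2)
    E_rss <- sum((y - X %*% m)^2) + sum(XtX * A_inv) / E_sig2inv
    E_sig2inv <- (c + (n + q) / 2) / (d + (E_rss + E_tau2 * E_bb) / 2)
  }
  list(m = m, S = S, E_tau2 = E_tau2, E_sig2inv = E_sig2inv)
}
# e.g. fit <- vb_bsem(Y[, 1], Y[, -1]) for the first regression equation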
Why is Variational Bayes attractive?
- It is FAST (remember: p penalized regressions...).
- It is ACCURATE (verified by Gibbs sampling).
- Analytical lower bound for the marginal likelihood: M_i(a, b).
- Summation: M(a, b) = Σ_{i=1}^p M_i(a, b).
- Maximizing M(a, b) yields the EB estimate of the prior parameters.
- M_i(a, b) facilitates posterior edge selection.
Extension of BSEM to BSEMed
- Prior knowledge on the to-be-reconstructed network topology is usually available, for instance:
  - from pathway repositories like KEGG,
  - inferred from data of a pilot study.
- It is natural to take such information into account during network reconstruction.
- Prior knowledge is assumed to be available as a prior network, which specifies which edges are present and absent.
BSEMed
Incorporation of the prior network P, in the form of an adjacency matrix containing only zeros (no edge) and ones (edge is present):

  Y_i = Σ_{s=1, s≠i}^p β_is Y_s + ε_i,   i ∈ I,
  ε_i ∼ N(0_n, σ_i^2 I_n),
  β_is ∼ N(0, σ_i^2 τ_{i,P_is}^{-2}),   (3)
  τ_{i,P_is}^2 ∼ Γ(a_{P_is}, b_{P_is}),
  σ_i^{-2} ∼ Γ(c, d),

  c, d: non-informative.
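A minimal sketch of how P enters: each coefficient β_is is routed to the hyperparameter pair (a_0, b_0) (edge absent in P) or (a_1, b_1) (edge present); the toy network and values are illustrative:

# Edge-specific hyperparameters from a prior adjacency matrix P.
p <- 6
P <- matrix(0, p, p)
P[1, 2] <- P[2, 1] <- 1       # toy prior network with a single edge
a <- c(1, 1); b <- c(1, 1)    # (a0, b0) and (a1, b1); illustrative values

i <- 1
group <- P[i, -i]                           # 0/1 prior status per candidate edge
a_is <- a[group + 1]; b_is <- b[group + 1]  # hyperparameters per coefficient
# The VB updates of the earlier sketch then run with a separate
# q(tau^2) for each of the two groups of coefficients.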
VB vs Gibbs sampling
- How close is the variational approximation to the true posterior distribution?
- Here we investigate this question by comparing the variational Bayes estimates of the marginal densities with the corresponding Gibbs sampling-based estimates.
- n = 50 independent replicates from a N(0, Ω_p^{-1}) with p = 50.
- Ω_p was chosen to be a band matrix with b_l = b_u = 4, i.e. a total of 9 bands including the diagonal.
- Simulation study with a single regression equation (say i = 1); the data generation is sketched below.
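A minimal data-generation sketch; the 0.3^|k-l| decay within the band is an illustrative choice (the talk does not specify the band values) that keeps Ω_p diagonally dominant, hence positive definite:

# Simulate n = 50 draws from N(0, Omega^{-1}) with a band precision matrix.
library(MASS)

set.seed(1)
p <- 50; n <- 50
Omega <- outer(1:p, 1:p,
               function(k, l) ifelse(abs(k - l) <= 4, 0.3^abs(k - l), 0))
min(eigen(Omega, symmetric = TRUE)$values) > 0   # check positive definiteness
Y <- mvrnorm(n, mu = rep(0, p), Sigma = solve(Omega))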
[Figure: per-parameter density panels] Comparison of variational marginal densities of β_{1,2}, ..., β_{1,10} (blue curves) and τ_{1,0}^2, τ_{1,1}^2 and σ_1^{-2} (black curves) with the corresponding Gibbs sampling-based histograms. The red vertical lines display the variational marginal means.
VB vs Gibbs sampling: Time comparison

  Method           Time in seconds
  BSEMed (VB)      40
  Gibbs sampling   2542 × 50 = 127,100

Computing times for an R-implementation of the variational Bayes method and of the Gibbs sampling method with n = p = 50.
[Figure: graphs (a)-(d)] Visualization of the true graph (a), the BSEMed estimate using the perfect prior (b), the BSEMed estimate using 50% true-edge information (c), and the BSEM estimate (d), in case n = 50 and p = 100.
Data: Gene expression data from GEO
Lung: 49 Normals, 58 Cancers
Apoptosis pathway: p = 84 genes
Pancreas: 39 Normals, 39 Cancers
p53 pathway: p = 68 genes.
Idea: use the network fitted on the Normals to inform the network for the Cancers.
EB estimates of the prior means of τ_{i,0}^2 and τ_{i,1}^2

                          Lung    Pancreas
  Not in Normal Network   27.32   20.03
  In Normal Network       1.71    1.21
  ratio                   20.13   12.97
Prior networks are clearly of use:
- the mean prior precision for regression parameters corresponding to edges absent in the prior network is relatively large,
- hence stronger shrinkage towards zero compared to the mean prior precision corresponding to edges present in the prior network.
Reproducibility

100 random splits of the data: what proportion of the top 50 edges reproduces, on average? (A sketch of this computation follows the tables.)

  # edges   BSEM    SEMLasso   glasso   BSEMed
  50        9.12%   2.64%      6.84%    59.16%

Lung data, average percentage of edges in overlap.

  # edges   BSEM     SEMLasso   glasso   BSEMed
  50        14.84%   5.6%       9.04%    55.64%

Pancreas data, average percentage of edges in overlap.
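A minimal sketch of this computation; fit_network() is a hypothetical stand-in for any of the compared methods, assumed to return edges ranked by strength:

# Reproducibility of the top k edges over random half-splits of the samples.
overlap_topk <- function(Y, fit_network, n_splits = 100, k = 50) {
  n <- nrow(Y)
  mean(replicate(n_splits, {
    idx <- sample(n, n %/% 2)
    e1 <- fit_network(Y[idx, ])   # hypothetical: returns ranked edge list
    e2 <- fit_network(Y[-idx, ])
    length(intersect(head(e1, k), head(e2, k))) / k
  }))
}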
Conclusions

- BSEMed and BSEM are attractive frameworks for network inference and are computationally very fast.
- The performance of BSEMed increases when the external data are relevant.
- BSEMed performs as well as BSEM when the external data are not relevant at all.
- In case of multiple sources of external data, BSEMed can easily be used: one source at a time.
THANK YOU!!!