Statistical Genomics
Lecture 27: Bayes' theorem
Zhiwu Zhang
Washington State University
Administration
Homework 6 (last) posted, due Friday, April 29, 3:10PM
Final exam: May 3, 120 minutes (3:10-5:10PM), 50
Evaluation due May 6 (7 out of 19 received).
Outline
Concept development for genomic selection
Bayes' theorem
Bayesian transformation
Bayesian likelihood
Bayesian alphabet for genomic selection
All SNPs have the same distribution
y = x1g1 + x2g2 + … + xpgp + e
rrBLUP: b ~ N(0, I σg²)
gBLUP: U ~ N(0, K σa²)
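A minimal ridge-regression sketch of the rrBLUP idea on simulated data. The marker matrix, the true effects, and the shrinkage ratio λ = σe²/σg² are illustrative assumptions, not from the lecture; in practice the variance ratio is estimated (e.g., by REML).

```r
# Ridge-regression (rrBLUP-style) estimate of marker effects on simulated data.
set.seed(1)
n <- 100; p <- 20
X <- matrix(rnorm(n * p), n, p)      # marker scores x1..xp
g <- rnorm(p, 0, 0.5)                # true effects, g ~ N(0, sigma_g^2), sigma_g^2 = 0.25
y <- X %*% g + rnorm(n)              # y = Xg + e, residual variance 1
lambda <- 1 / 0.25                   # shrinkage ratio sigma_e^2 / sigma_g^2
g_hat <- solve(t(X) %*% X + lambda * diag(p), t(X) %*% y)
c(cor(g, g_hat))                     # shrunken estimates track the true effects
```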
Selection of priors
[Figure: density plots of two candidate priors for σg²: a flat prior and an identical normal prior]
Flat prior: LSE, solving the likelihood (LL) alone
Identical normal prior: RR, solved by REML (e.g., EMMA)
Distributions of gi
More realistic: each marker effect has its own distribution
g1 ~ N(0, I σg1²), g2 ~ N(0, I σg2²), …, gp ~ N(0, I σgp²)
y = x1g1 + x2g2 + … + xpgp + e
Out of control and overfitting?
Need help from Thomas Bayes
"An Essay towards solving a Problem in the Doctrine of Chances" which
was read to the Royal Society in 1763 after Bayes' death by Richard Price
An example from middle school
A school has 60% boys and 40% girls. All boys wear pants. Half the girls wear pants and half wear skirts.
What is the probability of meeting a student with pants?
P(Pants) = 60%×100% + 40%×50% = 80%
Probability
P(Pants) = P(Boy) P(Pants | Boy) + P(Girl) P(Pants | Girl)
= 60%×100% + 40%×50% = 80%
Inverse question
A school has 60% boys and 40% girls. All boys wear pants. Half the girls wear pants and half wear skirts.
You meet a student with pants. What is the probability that the student is a boy?
P(Boy | Pants) = P(Pants | Boy) P(Boy) / [P(Pants | Boy) P(Boy) + P(Pants | Girl) P(Girl)]
= P(Pants | Boy) P(Boy) / P(Pants)
= (100%×60%) / (100%×60% + 50%×40%) = 75%
Bayes' theorem
P(Boy | Pants) P(Pants) = P(Pants | Boy) P(Boy)
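The pants example can be checked numerically; this sketch just encodes the proportions given above, first total probability, then the Bayesian inversion:

```r
# Total probability of pants, then Bayes' theorem for P(Boy | Pants).
p_boy <- 0.6; p_girl <- 0.4
p_pants_boy <- 1.0                   # all boys wear pants
p_pants_girl <- 0.5                  # half the girls wear pants
p_pants <- p_boy * p_pants_boy + p_girl * p_pants_girl   # 0.8
p_boy_pants <- p_pants_boy * p_boy / p_pants             # 0.75
```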
Bayesian transformation
P(θ | y) ∝ P(y | θ) × P(θ)
θ denotes the parameters and y the data.
P(θ | y): posterior distribution of θ given y (the role of P(Boy | Pants))
P(y | θ): likelihood of the data given the parameters (the role of P(Pants | Boy))
P(θ): prior distribution of the parameters (the role of P(Boy))
Bayesian for hard problem
A public school contains 60% males and 40% females. What is the probability of drawing four males? -- Probability (0.6⁴ ≈ 13%)
Four males were drawn from a public school. What are the gender proportions? -- Inverse probability (?)
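The forward (easy) problem can be checked in one line of R, treating the four draws as independent so the count of males is binomial:

```r
# Probability that four independent draws from a 60%-male school are all male.
p_four_males <- dbinom(4, size = 4, prob = 0.6)   # 0.6^4 = 0.1296, about 13%
```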
Prior knowledge
[Figure: two candidate prior densities for the gender proportion. A flat, broad prior admits anything from 100% female to 100% male: unsure, but safe. A prior concentrated near one value marks that region as likely and the rest as unlikely, so it may reject the truth.]
Four males were drawn from a public school. What are the gender proportions? -- Inverse probability (?)
Transform hard problem to easy one
P(G | y) ∝ P(y | G) × P(G)
P(G | y): probability of the unknown given the data (hard to solve)
P(y | G): probability of the observed given the unknown (easy to solve)
P(G): prior knowledge of the unknown (freedom)
P(y|G)
p = seq(0, 1, .01)
n = 4
k = n
pyp = dbinom(k, n, p)
theMax = pyp == max(pyp)
pMax = p[theMax]
plot(p, pyp, type="b", main=paste("Data=", pMax, sep=""))
[Plot: likelihood pyp against p, rising to its maximum at p = 1 ("Data=1")]
P(G)
ps = p*10 - 5
pd = dnorm(ps)
theMax = pd == max(pd)
pMax = p[theMax]
plot(p, pd, type="b", main=paste("Prior=", pMax, sep=""))
[Plot: prior density pd against p, peaking at p = 0.5 ("Prior=0.5")]
P(y|G) P(G)
ppy = pd*pyp
theMax = ppy == max(ppy)
pMax = p[theMax]
plot(p, ppy, type="b", main=paste("Optimum=", pMax, sep=""))
[Plot: unnormalized posterior ppy against p, peaking at p = 0.57 ("Optimum=0.57")]
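Condensing the three R fragments above into one pipeline, the grid maximum reproduces the compromise between data (which favor p = 1) and prior (centered at p = 0.5):

```r
# Likelihood x prior on a grid: four males in four draws, prior centered at 0.5.
p   <- seq(0, 1, .01)        # candidate male proportions G
pyp <- dbinom(4, 4, p)       # P(y | G), maximized at p = 1
pd  <- dnorm(p * 10 - 5)     # P(G), peaked at p = 0.5
ppy <- pyp * pd              # unnormalized posterior P(y | G) P(G)
p[which.max(ppy)]            # 0.57, between the data and the prior
```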
Depends on what you believe
[Figure: two columns, same data (four of four males, "Data=1"). With the prior centered at 0.5 ("Prior=0.5") the posterior peaks at Optimum=0.57; with the prior centered at 0.7 ("Prior=0.7") it peaks at Optimum=0.75.]
Ten are all males
[Figure: with ten of ten draws male ("Data=1"), the data pull harder against each prior: Prior=0.5 gives Optimum=0.65, Prior=0.7 gives Optimum=0.82, and Prior=0.9 gives Optimum=1.]
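A condensed sketch of this experiment, assuming (as the panel titles suggest) normal priors built the same way as before but centered at 0.5, 0.7, and 0.9:

```r
# Grid posterior with ten males in ten draws, under three prior centers.
p   <- seq(0, 1, .01)
pyp <- dbinom(10, 10, p)                 # likelihood: all ten draws male
optimum <- function(center) {
  pd <- dnorm(p * 10 - center * 10)      # normal prior centered at 'center'
  p[which.max(pyp * pd)]                 # grid mode of likelihood x prior
}
sapply(c(0.5, 0.7, 0.9), optimum)        # 0.65 0.82 1.00
```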
Control of unknown parameters
Prior distributions: g1 ~ N(0, I σg1²), g2 ~ N(0, I σg2²), …, gp ~ N(0, I σgp²)
y = x1g1 + x2g2 + … + xpgp + e
Selection of priors
[Figure: candidate prior distributions of gi: a flat prior, the identical normal of RR, and others (Bayes)]
One choice is inverse Chi-Square
g1 ~ N(0, I σg1²), g2 ~ N(0, I σg2²), …, gp ~ N(0, I σgp²)
σgi² ~ χ⁻²(v, S), where the hyperparameters v and S are shared across markers
y = x1g1 + x2g2 + … + xpgp + e
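A sketch of drawing marker variances from a scaled inverse chi-square prior, using the fact that if z ~ χ²(v) then vS/z ~ χ⁻²(v, S); the values v = 5 and S = 0.04 are illustrative, not from the lecture:

```r
# Sampling sigma_gi^2 ~ scaled inverse chi-square(v, S) via rchisq.
# v = 5, S = 0.04 are hypothetical hyperparameter values.
set.seed(42)
v <- 5; S <- 0.04
sigma2 <- v * S / rchisq(1e5, df = v)
mean(sigma2)                 # close to the theoretical mean v * S / (v - 2)
```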
Bayesian likelihood
P(gi, σgi², σe², v, S | y) ∝ P(y | gi, σgi², σe², v, S) P(gi, σgi², σe², v, S)
Variation of assumption
Bayes A: σgi² > 0 for all i, with σgi² ~ χ⁻²(v, S)
Bayes B: σgi² = 0 with probability π; σgi² ~ χ⁻²(v, S) with probability 1-π
Bayes alphabet
Method | Genomic (marker) effect | Effect variance | Residual variance | Unknown parameters
Bayes A | All SNPs | χ⁻²(v, S) | χ⁻²(0, -2) |
Bayes B | P(1-π) | χ⁻²(v, S) | χ⁻²(0, -2) |
Bayes Cπ | P(1-π) | χ⁻²(v, S) | χ⁻²(0, -2) | π
Bayes Dπ | P(1-π) | χ⁻²(v, S') | χ⁻²(0, -2) | S, π
Bayesian LASSO | All SNPs | Double exponential effects | χ⁻²(0, -2) | λ, t
BayesMulti, BayesR | P(1-π) | Multiple normal distributions | χ⁻²(0, -2) | γ
Highlight
Concept development for genomic selection
Bayes' theorem
Bayesian transformation
Bayesian likelihood
Bayesian alphabet for genomic selection