Statistical Genomics
Lecture 26: Bayesian theory
Zhiwu Zhang
Washington State University
Administration
Homework 6 (last) due April 28, Friday, 3:10PM
Final exam: May 4 (Thursday), 120 minutes (3:10-5:10 PM), 50 questions
Party: April 28, Friday, 4:30-7:30 (food at 5:00), 130 Johnson Hall
Course evaluation starts next Wednesday, April 17
Outline
Concept development for genomic selection
Bayesian theorem
Bayesian transformation
Bayesian likelihood
Bayesian alphabet for genomic selection
All SNPs have the same distribution
rrBLUP: g_i ~ N(0, I σ²_g)
y = x_1 g_1 + x_2 g_2 + … + x_p g_p + e
gBLUP: u ~ N(0, K σ²_a)
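A minimal R sketch of these two formulations, assuming the rrBLUP package and illustrative objects y (n phenotypes) and X (an n × p marker matrix coded -1/0/1); the object names are hypothetical, not from the lecture:

library(rrBLUP)                       # provides mixed.solve() and A.mat()
fit.rr=mixed.solve(y, Z=X)            # rrBLUP: marker effects g_i ~ N(0, I sigma_g^2); y, X assumed to exist
gebv.rr=as.vector(X %*% fit.rr$u)     # breeding values as the sum of marker effects
K=A.mat(X)                            # additive relationship (kinship) matrix from the markers
fit.g=mixed.solve(y, K=K)             # gBLUP: individual effects u ~ N(0, K sigma_a^2)
cor(gebv.rr, fit.g$u)                 # the two sets of predicted breeding values are essentially equivalent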
Selection of priors
[Figure: density plots of two candidate priors on σ²_g: a flat prior and an identical normal prior]
Flat: LSE, solve LL solely
Identical normal: RR, solve REML by EMMA
Distributions of g_i
More realistic: each SNP has its own variance
g_1 ~ N(0, I σ²_g1), g_2 ~ N(0, I σ²_g2), …, g_p ~ N(0, I σ²_gp)
y = x_1 g_1 + x_2 g_2 + … + x_p g_p + e
Out of control and overfitting?
Need help from Thomas Bayes
"An Essay towards solving a Problem in the Doctrine of Chances" which
was read to the Royal Society in 1763 after Bayes' death by Richard Price
An example from middle school
A school has 60% boys and 40% girls. All boys wear pants. Half of the girls wear pants and half wear skirts.
What is the probability of meeting a student wearing pants?
P(Pants) = 60%×100% + 40%×50% = 80%
Probability
P(Pants) = P(Boy) P(Pants | Boy) + P(Girl) P(Pants | Girl) = 60%×100% + 40%×50% = 80%
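A quick check of this arithmetic in R (a minimal sketch; the variable names are illustrative):

p.boy=0.6; p.girl=0.4                 # school composition (illustrative names)
p.pants.boy=1.0; p.pants.girl=0.5     # chance of wearing pants within each group
p.pants=p.boy*p.pants.boy + p.girl*p.pants.girl
p.pants                               # 0.8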
Inverse question
A school has 60% boys and 40% girls. All boys wear pants. Half of the girls wear pants and half wear skirts.
You meet a student wearing pants. What is the probability that the student is a boy?
P(Boy | Pants) = 60%×100% / (60%×100% + 40%×50%) = 75%
P(Boy | Pants) = P(Pants | Boy) P(Boy) / [P(Pants | Boy) P(Boy) + P(Pants | Girl) P(Girl)] = P(Pants | Boy) P(Boy) / P(Pants)
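The same inversion in R, continuing the sketch above (illustrative names):

p.boy.pants=p.pants.boy*p.boy/p.pants # Bayes' rule: P(Boy | Pants)
p.boy.pants                           # 0.75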
Bayesian theorem
θ (parameters), y (data)
P(Boy | Pants) P(Pants) = P(Pants | Boy) P(Boy)
P(θ | y) P(y) = P(y | θ) P(θ)
Bayesian transformation
θ (parameters), y (data)
P(θ | y) ∝ P(y | θ) P(θ)
P(Boy | Pants) ∝ P(Pants | Boy) P(Boy)
Posterior distribution of θ given y ∝ likelihood of data given parameters × distribution of parameters (prior)
Bayesian for hard problem
A public school contains 60% males and 40% females. What is the probability of drawing four males? -- Probability (0.6^4 = 12.96%)
Four males were drawn from a public school. What is the male proportion? -- Inverse probability (?)
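A one-line R check of the forward probability (dbinom gives the binomial probability of k successes in n draws):

0.6^4                                 # 0.1296
dbinom(4, 4, 0.6)                     # same value from the binomial density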
Prior knowledge
[Figure: candidate prior densities for the gender proportion, from 100% female to 100% male, labeled Likely, Unsure, Safe, Unlikely, and Reject]
Four males were drawn from a public school. What are the gender proportions? -- Inverse probability (?)
Transform hard problem to easy one
P(G | y) ∝ P(y | G) P(G)
P(G | y): probability of unknown given data (hard to solve)
P(y | G): probability of observed given unknown (easy to solve)
P(G): prior knowledge of unknown (freedom)
P(y | G): Probability of having 4 males given the male proportion
p=seq(0, 1, .01)                      # grid of candidate male proportions
n=4                                   # number of students drawn
k=n                                   # all four draws were male
pyp=dbinom(k,n,p)                     # likelihood P(y|G) at each proportion
theMax=pyp==max(pyp)                  # locate the maximum-likelihood proportion
pMax=p[theMax]
plot(p,pyp,type="b",main=paste("Data=", pMax,sep=""))
[Plot: likelihood P(y|G) against p; maximum at p = 1 (Data=1)]
P(G): Probability of male proportion
ps=p*10-5                             # rescale p so the prior is centered at 0.5
pd=dnorm(ps)                          # normal prior density P(G) on the grid
theMax=pd==max(pd)                    # locate the prior mode
pMax=p[theMax]
plot(p,pd,type="b",main=paste("Prior=", pMax,sep=""))
[Plot: prior P(G) against p; maximum at p = 0.5 (Prior=0.5)]
P(G|y) ∝ P(y|G) P(G)
Probability of male proportion given 4 males drawn
ppy=pd*pyp                            # posterior (up to a constant): prior x likelihood
theMax=ppy==max(ppy)                  # locate the posterior mode
pMax=p[theMax]
plot(p,ppy,type="b",main=paste("Optimum=", pMax,sep=""))
[Plot: posterior P(G|y) against p; maximum at p = 0.57 (Optimum=0.57)]
Depends on what you believe
[Figure: two panels with the same likelihood (Data=1). The Male=Female prior (Prior=0.5) gives a posterior Optimum=0.57; the More Male prior (Prior=0.7) gives a posterior Optimum=0.75.]
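A minimal sketch of how the More Male panel can be reproduced, reusing the grid p and the likelihood pyp computed earlier; only the centering of the prior changes:

ps=p*10-7                             # prior now centered at 0.7 (More Male); reuses p and pyp from above
pd=dnorm(ps)
ppy=pd*pyp                            # posterior: prior x likelihood
p[ppy==max(ppy)]                      # posterior mode, about 0.75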
Ten draws are all males
[Figure: three panels with ten males drawn (Data=1). The Male=Female prior (Prior=0.5) gives Optimum=0.65 (vs. 57% with four males); the More Male prior (Prior=0.7) gives Optimum=0.82; the Much more male prior (Prior=0.9) gives Optimum=1.]
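A sketch of the ten-male case, recomputing the likelihood with n = k = 10 and looping over the three prior centers shown in the panels:

p=seq(0, 1, .01)
n=10
k=n
pyp=dbinom(k,n,p)                     # likelihood with ten males in ten draws
for(center in c(0.5, 0.7, 0.9)){
  pd=dnorm(p*10-center*10)            # prior centered at 0.5, 0.7, or 0.9
  ppy=pd*pyp                          # posterior: prior x likelihood
  print(p[ppy==max(ppy)])             # modes: 0.65, 0.82, 1
}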
Bayesian likelihood
θ (parameters), y (data)
P(θ | y) ∝ P(y | θ) P(θ)
P(Boy | Pants) ∝ P(Pants | Boy) P(Boy)
Posterior distribution of θ given y ∝ likelihood of data given parameters × distribution of parameters (prior)
Highlight
Concept development for genomic selection
Bayesian theorem
Bayesian transformation
Bayesian likelihood
Bayesian alphabet for genomic selection