A Fresh Look at Bayes’ Theorem from Information Theory
Tan Bui-Thanh
Computational Engineering and Optimization (CEO) Group
Department of Aerospace Engineering and Engineering Mechanics
Institute for Computational Engineering and Sciences (ICES)
The University of Texas at Austin
Babuska Series, ICES Sep 9, 2016
Outline
1. Bayesian Inversion Framework
2. Entropy
3. Relative Entropy
4. Bayes’ Theorem and Information Theory
5. Conclusions
Large-scale computation under uncertainty
Inverse electromagnetic scattering
Randomness
Random errors in measurements are unavoidable
Inadequacy of the mathematical model (Maxwell equations)
Challenge
How to invert for the invisible shape/medium using computational electromagnetics with O(10^6) degrees of freedom?
Large-scale computation under uncertainty
Full wave form seismic inversion
Randomness
Random errors in seismometer measurements are unavoidable
Inadequacy of the mathematical model (elastodynamics)
Challenge
How to image the Earth's interior using a forward computational model with O(10^9) degrees of freedom?
Inverse Shape Electromagnetic Scattering Problem
Maxwell Equations:

\nabla \times E = -\mu \frac{\partial H}{\partial t} \quad \text{(Faraday)}, \qquad
\nabla \times H = \epsilon \frac{\partial E}{\partial t} \quad \text{(Ampere)}

E: Electric field, H: Magnetic field, \mu: permeability, \epsilon: permittivity
Forward problem (discontinuous Galerkin discretization)
d = G(x)
where G maps the shape parameters x to the electric/magnetic fields d at the measurement points
Inverse Problem
Given (possibly noise-corrupted) measurements of d, infer x.
The Bayesian Statistical Inversion Framework
Bayes’ Theorem

\pi_{\text{post}}(x \mid d) \propto \pi_{\text{like}}(d \mid x) \times \pi_{\text{prior}}(x)
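As a minimal numerical illustration of this proportionality (a hypothetical two-hypothesis example, not from the talk), the posterior is obtained by multiplying prior and likelihood pointwise and renormalizing:

# Minimal sketch (hypothetical numbers): Bayes' theorem on a discrete parameter.
import numpy as np

prior = np.array([0.7, 0.3])                   # pi_prior(x) over two candidate shapes
likelihood = np.array([0.1, 0.6])              # pi_like(d | x) for the observed data d
unnormalized = likelihood * prior              # pi_like(d|x) * pi_prior(x)
posterior = unnormalized / unnormalized.sum()  # normalize so it sums to 1
print(posterior)                               # [0.28, 0.72]: the data shifts belief to shape 2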
Bayes’ theorem for inverse electromagnetic scattering
Prior knowledge: the obstacle is smooth:

\pi_{\text{pr}}(x) \propto \exp\left( -\int_0^{2\pi} r''(x) \, d\theta \right)

Likelihood: additive Gaussian noise, for example,

\pi_{\text{like}}(d \mid x) \propto \exp\left( -\frac{1}{2} \left\| G(x) - d \right\|^2_{C_{\text{noise}}} \right)
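Here \| r \|^2_{C_{\text{noise}}} denotes the noise-covariance-weighted misfit r^T C_{\text{noise}}^{-1} r. A minimal sketch of evaluating such a Gaussian log-likelihood (the toy forward map and numbers are placeholders, not the talk's scattering model):

# Sketch: unnormalized Gaussian log-likelihood with a covariance-weighted misfit norm.
import numpy as np

def log_likelihood(x, d, G, C_noise):
    """log pi_like(d | x) up to an additive constant, assuming d = G(x) + N(0, C_noise)."""
    r = G(x) - d                                   # data misfit
    return -0.5 * r @ np.linalg.solve(C_noise, r)  # -0.5 * r^T C_noise^{-1} r

# Hypothetical toy forward map and data (placeholders for the scattering solver)
G = lambda x: np.array([x[0] + x[1], x[0] - x[1]])
d = np.array([1.0, 0.2])
C_noise = 0.01 * np.eye(2)
print(log_likelihood(np.array([0.6, 0.4]), d, G, C_noise))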
Entropy
Definition
We define the uncertainty in a random variable X distributed by 0 \le \pi(x) \le 1 as

H(X) = -\int \pi(x) \log \pi(x) \, dx \ge 0
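A minimal numerical sketch of this definition for a discrete distribution (the probabilities are made up for illustration):

# Sketch: entropy H(X) = -sum_i pi_i log(pi_i) of a discrete distribution.
import numpy as np

pi = np.array([0.5, 0.25, 0.125, 0.125])   # hypothetical probabilities, summing to 1
H = -np.sum(pi * np.log(pi))               # natural log; use np.log2 for bits
print(H)                                    # about 1.21 nats (1.75 bits)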
Entropy
[Photos: Wiener and Shannon; Kolmogorov. Copied from Sergio Verdu]
Wiener: “...for it belongs to the two of us equally”
Shannon: “...a mathematical pun”
Kolmogorov: “...has no physical interpretation”
Entropy
Entropy of uniform distribution
Let U be a uniform random variable with values in X, and |X| < \infty

\pi(u) := \frac{1}{|X|} \;\Longrightarrow\; H(U) = \log(|X|)

How uncertain is the uniform random variable?

H(X) \le H(U)
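A quick numerical sanity check of H(X) \le H(U) on a finite set (a sketch; the random distributions are only illustrative):

# Sketch: among distributions on a finite set of size k, the uniform one maximizes entropy.
import numpy as np

rng = np.random.default_rng(0)
k = 6
H = lambda p: -np.sum(p * np.log(p))

H_uniform = np.log(k)                       # H(U) = log |X|
for _ in range(5):
    p = rng.dirichlet(np.ones(k))           # a random distribution on k outcomes
    assert H(p) <= H_uniform + 1e-12        # H(X) <= H(U)
print("all random distributions had entropy <= log(k) =", H_uniform)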
100 years of uniform distribution
[Photo: Hermann Weyl; source: Christoph Aistleitner]
Gaussian and Maximum entropy
Maximum entropy distribution
X with known mean and variance
\pi(x)? with maximum entropy

\max_{\pi(x)} H(X) = -\int \pi(x) \log(\pi(x)) \, dx

subject to

\int x \, \pi(x) \, dx = \mu, \qquad \int (x - \mu)^2 \, \pi(x) \, dx = \sigma^2, \qquad \int \pi(x) \, dx = 1

Gaussian distribution: \pi(x) = N(\mu, \sigma^2)
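A brief variational sketch of why the maximizer is Gaussian (the standard Lagrange-multiplier argument, not spelled out on the slide; \lambda_0, \lambda_1, \lambda_2 are multipliers for the three constraints):

\mathcal{L}[\pi] = -\int \pi \log \pi \, dx
  + \lambda_0 \left( \int \pi \, dx - 1 \right)
  + \lambda_1 \left( \int x \, \pi \, dx - \mu \right)
  + \lambda_2 \left( \int (x - \mu)^2 \pi \, dx - \sigma^2 \right)

\frac{\delta \mathcal{L}}{\delta \pi} = -\log \pi - 1 + \lambda_0 + \lambda_1 x + \lambda_2 (x - \mu)^2 = 0
\;\Longrightarrow\;
\pi(x) \propto \exp\left( \lambda_1 x + \lambda_2 (x - \mu)^2 \right).

Matching the constraints forces \lambda_1 = 0 and \lambda_2 = -1/(2\sigma^2), i.e. \pi(x) = N(\mu, \sigma^2).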
Relative Entropy
Abraham Wald (1945)
Harold Jeffreys (1945)

D(\pi \,\|\, q) := \int \pi(x) \log\left( \frac{\pi(x)}{q(x)} \right) dx
Kullback-Leibler divergence = Relative Entropy
Solomon Kullback (1951), Richard Leibler (1951)

D(\pi \,\|\, q) := \int \pi(x) \log\left( \frac{\pi(x)}{q(x)} \right) dx
\;\overset{\text{discrete}}{=}\; \sum_i \pi_i \log\left( \frac{\pi_i}{q_i} \right)
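A minimal sketch of the discrete form (the two distributions are made up):

# Sketch: Kullback-Leibler divergence D(pi || q) between two discrete distributions.
import numpy as np

pi = np.array([0.4, 0.4, 0.2])      # hypothetical "true" distribution
q  = np.array([0.6, 0.3, 0.1])      # hypothetical reference distribution
D = np.sum(pi * np.log(pi / q))     # D(pi || q) >= 0, and = 0 iff pi == q
print(D)                             # about 0.09 nats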
Information Inequality
The most important inequality in information theory
D(\pi \,\|\, q) \ge 0
Can we see it easily?
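One standard way to see it, not spelled out on the slide, is Jensen's inequality applied to the convex function -\log:

D(\pi \,\|\, q) = \int \pi(x) \left( -\log \frac{q(x)}{\pi(x)} \right) dx
\;\ge\; -\log \int \pi(x) \, \frac{q(x)}{\pi(x)} \, dx
= -\log \int q(x) \, dx = -\log 1 = 0,

with equality if and only if \pi = q almost everywhere.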
From Relative Entropy to Bayes’ Theorem
Toss a k-sided die n times, with the prior distribution of each face \{p_i\}_{i=1}^k: \sum_{i=1}^k p_i = 1

Let n_i be the number of times we see face i: \frac{n_i}{n} \to p_i

What is the likelihood that these n faces are also distributed by the posterior distribution \{q_i\}_{i=1}^k: \sum_{i=1}^k q_i = 1?

The likelihood of \{n_i\}_{i=1}^k distributed by \{q_i\}_{i=1}^k (multinomial distribution):

L := \frac{n!}{\prod_{i=1}^k n_i!} \, \prod_{i=1}^k q_i^{n_i}
From Relative Entropy to Bayes’ Theorem
The likelihood of \{n_i\}_{i=1}^k distributed by \{q_i\}_{i=1}^k:

L := \frac{n!}{\prod_{i=1}^k n_i!} \, \prod_{i=1}^k q_i^{n_i}

Take the log-likelihood:

\log L = \log(n!) - \sum_i \log(n_i!) + \sum_i n_i \log(q_i)

Stirling's approximation \log n! \approx n \log(n) - n:

\log L = n \log(n) - \sum_i n_i \log(n_i) + \sum_i n_i \log(q_i) + \underbrace{\left( \sum_i n_i - n \right)}_{0}

Relative entropy = negative average log-likelihood (per toss):

\frac{1}{n} \left( -\log L \right) = \sum_i \frac{n_i}{n} \log\left( \frac{n_i/n}{q_i} \right) = \sum_i p_i \log\left( \frac{p_i}{q_i} \right) = D(p \,\|\, q)
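A quick numerical check of this identity (made-up p and q; the hypothetical counts are the idealized ones with n_i/n = p_i):

# Sketch: -(1/n) log L approaches D(p || q) as the number of tosses n grows.
import numpy as np
from math import lgamma

p = np.array([0.5, 0.3, 0.2])            # "prior"/empirical face probabilities
q = np.array([0.25, 0.25, 0.5])          # candidate "posterior" face probabilities
D = np.sum(p * np.log(p / q))            # relative entropy D(p || q)

n = 100000
n_i = np.round(n * p).astype(int)        # idealized counts with n_i / n = p_i
# exact multinomial log-likelihood: log n! - sum log n_i! + sum n_i log q_i
logL = lgamma(n + 1) - sum(lgamma(k + 1) for k in n_i) + float(np.sum(n_i * np.log(q)))
print(-logL / n, "vs D(p||q) =", D)      # the two numbers agree to a few decimal places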
From Relative Entropy to Bayes’ Theorem
Relative entropy = negative average log-likelihood → Bayes

Write \sum \to \int:

\frac{1}{n} \left( -\log L \right) = D(p \,\|\, q)
\quad\leadsto\quad
-\int \log(L(x)) \, p(x) \, dx = \int \log\left( \frac{p(x)}{q(x)} \right) p(x) \, dx

Matching the integrands gives

q(x) = L(x) \, p(x)

Bayes’ theorem
From Optimization to Bayes’ Theorem
Inverse Problem
Given observation model
d = G(x) + \varepsilon

Inverse task: given d, infer x
Statistical inversion: Prior knowledge: X \sim \pi_{\text{prior}}(x). Look for the posterior distribution \pi_{\text{post}}(x) that combines prior information and information from the data.
The likelihood: assume \varepsilon \sim N(0, C)

\pi_{\text{like}}(x) = \exp\left( -\frac{1}{2} \left\| d - G(x) \right\|^2_C \right)
From Optimization to Bayes’ Theorem
Prior Elicitation
Try to get the best prior information: the discrepancy of the prior relative to the posterior should be minimized
Conversely, with the best prior, the information gained in the posterior should not be large
Equivalently,

\pi_{\text{post}} = \arg\min_{\pi(x)} D(\pi \,\|\, \pi_{\text{prior}}) = \arg\min_{\pi(x)} \int \pi(x) \log\left( \frac{\pi(x)}{\pi_{\text{prior}}(x)} \right) dx
From Optimization to Bayes’ Theorem
How about information from the data?
Want to find x that matches the data as well as we can
“Equivalently”: want to find the posterior distribution such that \| d - G(x) \|^2_C is minimized!
One approach: minimize the mean squared error

\pi_{\text{post}} = \arg\min_{\pi(x)} \int \pi(x) \, \| d - G(x) \|^2_C \, dx = \arg\min_{\pi(x)} \left( -\int \pi(x) \log(\pi_{\text{like}}(x)) \, dx \right)
From Optimization to Bayes’ Theorem
Prior + data information
From prior

\pi_{\text{post}} = \arg\min_{\pi(x)} D(\pi \,\|\, \pi_{\text{prior}}) = \arg\min_{\pi(x)} \int \pi(x) \log\left( \frac{\pi(x)}{\pi_{\text{prior}}(x)} \right) dx

From likelihood

\pi_{\text{post}} = \arg\min_{\pi(x)} \left( -\int \pi(x) \log(\pi_{\text{like}}(x)) \, dx \right)

A Compromise

\pi_{\text{post}} = \arg\min_{\pi(x)} \; -\int \pi(x) \log(\pi_{\text{like}}(x)) \, dx + \int \pi(x) \log\left( \frac{\pi(x)}{\pi_{\text{prior}}(x)} \right) dx

subject to \int \pi(x) \, dx = 1, and \pi(x) \ge 0.
From Optimization to Bayes’ Theorem
Prior + data information
A Compromise

\pi_{\text{post}} = \arg\min_{\pi(x)} \; -\int \pi(x) \log(\pi_{\text{like}}(x)) \, dx + \int \pi(x) \log\left( \frac{\pi(x)}{\pi_{\text{prior}}(x)} \right) dx

subject to \int \pi(x) \, dx = 1, and \pi(x) \ge 0.

Does it have a solution \pi_{\text{post}}(x)? Is it unique?
How to solve?
Lagrangian + calculus of variations
Solution = Bayes’ theorem

\pi_{\text{post}}(x \mid d) = \frac{ \pi_{\text{like}}(d \mid x) \times \pi_{\text{prior}}(x) }{ \int \pi_{\text{like}}(d \mid x) \times \pi_{\text{prior}}(x) \, dx }
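A short sketch of that variational step (the standard argument; \lambda is the multiplier for the normalization constraint, and the nonnegativity constraint turns out to be inactive):

\mathcal{L}[\pi] = -\int \pi \log \pi_{\text{like}} \, dx
  + \int \pi \log \frac{\pi}{\pi_{\text{prior}}} \, dx
  + \lambda \left( \int \pi \, dx - 1 \right)

\frac{\delta \mathcal{L}}{\delta \pi} = -\log \pi_{\text{like}} + \log \frac{\pi}{\pi_{\text{prior}}} + 1 + \lambda = 0
\;\Longrightarrow\;
\pi(x) \propto \pi_{\text{like}}(d \mid x) \, \pi_{\text{prior}}(x),

and enforcing \int \pi \, dx = 1 gives exactly the normalized Bayes formula above.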
Conclusions
1. Information theory provides an intuitive and fresh view of Bayes’ theorem
2. Relative entropy → Bayes’ theorem
3. Optimization + information → Bayes’ theorem