Data-Driven Bayesian Model Selection: Parameter Space Dimension
Reduction using Automatic Relevance Determination Priors
Mohammad Khalil†
† [email protected]
Sandia National Laboratories,
Livermore, CA
Workshop on Uncertainty Quantification and Data-Driven Modeling
Austin, Texas
March 23 - 24, 2017
Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin
Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

Overview
Motivation
Bayesian Model Selection
Automatic Relevance Determination
Application: Aeroelasticity
Application: Predictive Modeling of Wavelet Coefficients
Application: Optimal Embedding of Model Error
Summary

Why Model Selection?
● Model selection is the task of selecting a physical/statistical model from a set of candidate models, given data.
● When dealing with nontrivial physics under limited a priori understanding of the system, multiple plausible models can be envisioned to represent the system with reasonable accuracy.
● A complex model may overfit the data but results in a higher model prediction uncertainty.
● A simpler model may misfit the data but results in a lower model prediction uncertainty.
● An optimal model provides a balance between data-fit and prediction uncertainty.
● Common approaches:
✦ Cross-validation
✦ Akaike information criterion (AIC)
✦ Bayesian information criterion (BIC)
✦ (Bayesian) Model evidence

Inverse Problems
[Diagram: model parameters, forward model, noisy observations]
● Forward Problem: Given model parameters, predict “clean” observations
● Inverse Problem: Given “noisy” observations, infer model parameters
✦ observations are
■ inherently noisy with unknown (or weakly known) noise model
■ sparse in space and time (insufficient resolution)
✦ problem typically ill-posed, i.e. no guarantee of solution existence or uniqueness

Bayes’ Theorem
The parameters $\phi$ are treated as a random vector. Using Bayes’ rule, one can write
$$
\underbrace{p(\phi \mid d, M)}_{\text{posterior}} = \frac{\overbrace{p(d \mid \phi, M)}^{\text{likelihood}} \; \overbrace{p(\phi \mid M)}^{\text{prior}}}{\underbrace{p(d \mid M)}_{\text{evidence}}}
$$
[Figure: prior, likelihood, and posterior pdfs plotted against u]
● $p(\phi \mid M)$ is the prior pdf of $\phi$: induces regularization
● $p(d \mid \phi, M)$ is the likelihood pdf: describes data misfit
● $p(\phi \mid d, M)$ is the posterior pdf of $\phi$, the full Bayesian solution:
✦ Not a single point estimate but a probability density
✦ Completely characterizes the uncertainty in $\phi$
✦ Used in simulations for prediction under uncertainty
● For parameter inference alone, it is sufficient to consider
$$ p(\phi \mid d, M) \propto p(d \mid \phi, M)\, p(\phi \mid M) $$
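
The roles of prior, likelihood, evidence, and posterior can be illustrated with a minimal sketch for a scalar parameter evaluated on a grid. Everything specific here is an assumption made for illustration and is not taken from the slides: the toy model d_k = phi + noise, the prior and noise standard deviations, the three data values, and the function name `grid_posterior`.

```python
import numpy as np

def grid_posterior(d, phi_grid, prior_std=1.0, noise_std=0.5):
    """Evaluate prior, likelihood, posterior and evidence on a uniform grid.

    Assumed toy model: d_k = phi + noise, noise ~ N(0, noise_std^2),
    with prior phi ~ N(0, prior_std^2).
    """
    dphi = phi_grid[1] - phi_grid[0]
    # Prior pdf p(phi | M)
    prior = np.exp(-0.5 * (phi_grid / prior_std) ** 2) / (prior_std * np.sqrt(2 * np.pi))
    # Likelihood p(d | phi, M): product over independent observations
    resid = d[:, None] - phi_grid[None, :]
    log_like = (-0.5 * np.sum((resid / noise_std) ** 2, axis=0)
                - len(d) * np.log(noise_std * np.sqrt(2 * np.pi)))
    likelihood = np.exp(log_like)
    # Evidence p(d | M): integral of likelihood times prior (Riemann sum)
    evidence = np.sum(likelihood * prior) * dphi
    # Posterior p(phi | d, M) from Bayes' rule
    posterior = likelihood * prior / evidence
    return prior, likelihood, posterior, evidence

d = np.array([0.9, 1.2, 0.7])            # hypothetical observations
phi_grid = np.linspace(-5.0, 5.0, 2001)  # uniform grid over phi
prior, likelihood, posterior, evidence = grid_posterior(d, phi_grid)
dphi = phi_grid[1] - phi_grid[0]
post_mean = np.sum(phi_grid * posterior) * dphi
print("evidence:", evidence, "posterior mean:", post_mean)
```

Dropping the division by `evidence` leaves the unnormalized posterior, which is all that parameter inference needs; the evidence only matters once models are compared.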

Stages of Bayesian Inference
Bayesian inverse modeling from real data is often an iterative process:
● Select a model (parameters + priors)
● Using available data, perform model calibration: Parameter inference
● Using posterior parameter pdf, compute model evidence: Model selection
● Refine model or propose new model and repeat
Stage 1: I have a model and parameter priors. Parameter inference: assume I have an accurate model
Stage 2: I have more than one plausible model. Model selection: compute relative plausibility of models given data
Stage 3: None of the models is clearly the best. Model averaging: obtain posterior predictive density of QoI averaged over plausible models

Model Evidence and Bayes Factor
● When there are competing models, Bayesian model selection allows us to obtain their relative probabilities in light of the data and prior information
● The “best” model is then the one which strikes an optimum balance between quality of fit and predictivity
● Model evidence: An integral of the likelihood over the prior, or marginalized (averaged) likelihood
$$ p(d \mid M) = \int p(d \mid \phi, M)\, p(\phi \mid M)\, d\phi $$
● Model posterior/plausibility: Obtained using Bayes’ Theorem
$$ p(M \mid d) \propto p(d \mid M)\, p(M) $$
● Relative model posterior probabilities: Obtained using Bayes’ factor
Posterior odds = Bayes’ factor × prior odds
Bayes’ factor = relative model evidence

Model Evidence and Occam’s Razor
● Bayes’ model evidence balances quality of fit vs unwarranted model complexity
● It does that by penalizing “wasted” parameter space and thereby rewarding highly predictive models
[Figure: likelihood and prior sketches for two cases. Annotations: penalizes complex models; automatic Occam’s razor effect]
● The parameter prior plays a decisive role as it reflects the available parameter space under the model M prior to assimilating data.

Model Evidence: Nested Models
● Nested models are investigated often in practice: a more complex model, $M_1$, with prior $p(\phi \mid M_1)$, which reduces to a simpler nested model, $M_0$, for a certain value of the parameter, $\phi = \phi^* = 0$
● Question: Is the extra complexity of $M_1$ warranted by the data?
[Figure: likelihood and prior over $\phi$, with “Define:” and “We have:” expressions that did not survive extraction. Annotations: wasted parameter space favors the simpler model; mismatch between prediction and likelihood favors the more complex model]

Bayesian Model Selection: Occam’s Razor at Work
● Generate 6 noisy data points from the “true” model given by
$$ y_i = 1 + x_i^2 + \epsilon_i, \qquad \epsilon_i \sim N(0, 1) $$
● Question: Not knowing the true model, what is the “best” model?
● We propose polynomials of increasing order:
$$ M_0: \; y = a_0 + \epsilon $$
$$ \vdots $$
$$ M_5: \; y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + \epsilon $$
[Figure: panels comparing the candidate model fits with the data and the true model over x from -2 to 2; remaining panel and axis labels not recoverable]
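
For polynomial models with Gaussian noise and a Gaussian prior on the coefficients, the model evidence has a closed form, so the comparison of M0 to M5 can be sketched in a few lines. This is only an illustrative sketch, not the computation behind the slide's figure: the zero-mean prior with variance `prior_var = 10`, the evenly spaced x values on [-2, 2], the random seed, and equal prior model probabilities are all assumptions.

```python
import numpy as np

def log_evidence(x, y, order, prior_var=10.0, noise_var=1.0):
    """Log model evidence log p(y | M_order) for a polynomial of given order.

    Assumes coefficients a_j ~ N(0, prior_var) and noise ~ N(0, noise_var),
    so that marginally y ~ N(0, noise_var * I + prior_var * Phi Phi^T).
    """
    Phi = np.vander(x, order + 1, increasing=True)  # columns 1, x, ..., x^order
    n = len(y)
    C = noise_var * np.eye(n) + prior_var * Phi @ Phi.T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 6)                  # 6 data points (locations assumed)
y = 1.0 + x**2 + rng.standard_normal(x.size)   # "true" model plus N(0, 1) noise

log_ev = np.array([log_evidence(x, y, k) for k in range(6)])

# Posterior model probabilities under equal prior model probabilities
post = np.exp(log_ev - log_ev.max())
post /= post.sum()

# Log Bayes' factor of the quadratic model M2 against the linear model M1
log_bf_21 = log_ev[2] - log_ev[1]

for k in range(6):
    print(f"M{k}: log evidence = {log_ev[k]:.3f}, posterior probability = {post[k]:.3f}")
print("log Bayes factor (M2 vs M1):", log_bf_21)
```

Changing `prior_var` changes the ranking of the models, which is the prior sensitivity of the evidence in action: a wider prior spreads probability over more parameter space and penalizes the higher-order models more strongly.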

Challenges with Bayesian Model Selection
● Model evidence is extremely sensitive to prior parameter pdfs
● Missing out on better candidate models:
✦ The number of possible models grows rapidly with the number of possible terms in the physical/statistical model
✦ For the previous example, with 6 candidate terms (polynomials of order up to and including 5), the number of possible models is the number of k-combinations of the 6 terms for k = 1, ..., 6:
$$ N_M = \sum_{k=1}^{6} \binom{6}{k} = \frac{6!}{1!\,5!} + \frac{6!}{2!\,4!} + \frac{6!}{3!\,3!} + \frac{6!}{4!\,2!} + \frac{6!}{5!\,1!} + \frac{6!}{6!\,0!} = 63 $$
✦ For polynomials with 10 candidate terms, $2^{10} - 1 = 1023$ possible models!
● Solution: Automatic Relevance Determination (ARD)
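
The count of candidate models can be checked by direct enumeration. The sketch below treats each model as a non-empty subset of the candidate terms (the six coefficients a0 to a5 in the polynomial example); the function names are hypothetical.

```python
from itertools import combinations
from math import comb

def count_models(n_terms):
    """Number of non-empty subsets of n_terms candidate terms."""
    return sum(comb(n_terms, k) for k in range(1, n_terms + 1))

def enumerate_models(n_terms):
    """List every candidate model as a tuple of included term indices."""
    return [subset
            for k in range(1, n_terms + 1)
            for subset in combinations(range(n_terms), k)]

models = enumerate_models(6)
print(count_models(6), len(models))   # both equal 2**6 - 1
print(count_models(10))               # equals 2**10 - 1
```

Each additional candidate term doubles the model space (plus one), so exhaustive evidence computation quickly becomes impractical.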

Automatic Relevance Determination
● A parametrized prior distribution known as the ARD prior is assigned to the unknown model parameters
● The ARD prior pdf is a Gaussian with zero mean and unknown variance (could also use Laplace priors, etc.)
● The hyper-parameters, $\alpha$, are estimated using the data by performing evidence maximization or type-II maximum likelihood estimation
Prior: $p(\phi \mid \alpha, M)$
Posterior: $p(\phi \mid d, \alpha, M)$
Type-II likelihood: $p(d \mid \alpha, M) = \int p(d \mid \phi, M)\, p(\phi \mid \alpha, M)\, d\phi$

Bayesian Model Selection: ARD Priors
● Revisiting the previous example with the “true” model given by
$$ y_i = 1 + x_i^2 + \epsilon_i, \qquad \epsilon_i \sim N(0, 1) $$
● Question: What is the “best” model nested under the model:
$$ y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + \epsilon $$
[Figure: panels plotted against optimizer iteration (0 to 300), including the Type-II likelihood (model evidence). Annotation: convergence could be improved with a better optimizer]
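
Evidence maximization with ARD priors can be sketched for the polynomial example as follows. This is not the optimizer behind the slide's figure: it swaps in the classical expectation-maximization (EM) update for a linear-Gaussian model, in which each coefficient has prior N(0, 1/alpha_j) and alpha_j is re-estimated from the posterior second moment. The parameterization by precision alpha_j, the known unit noise variance, the 50 data points, the seed, and all function names are assumptions for illustration.

```python
import numpy as np

def posterior(Phi, d, alpha, noise_var=1.0):
    """Gaussian posterior mean and covariance of phi for fixed alpha."""
    Sigma = np.linalg.inv(np.diag(alpha) + Phi.T @ Phi / noise_var)
    mean = Sigma @ Phi.T @ d / noise_var
    return mean, Sigma

def log_type2_likelihood(Phi, d, alpha, noise_var=1.0):
    """log p(d | alpha): phi integrated out analytically."""
    n = len(d)
    # Marginal covariance of d: noise_var * I + Phi diag(1/alpha) Phi^T
    C = noise_var * np.eye(n) + (Phi / alpha) @ Phi.T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + d @ np.linalg.solve(C, d))

def ard_em(Phi, d, noise_var=1.0, n_iter=300):
    """EM iterations on the hyper-parameters alpha (one per coefficient)."""
    alpha = np.ones(Phi.shape[1])
    history = [log_type2_likelihood(Phi, d, alpha, noise_var)]
    for _ in range(n_iter):
        mean, Sigma = posterior(Phi, d, alpha, noise_var)
        # M-step: alpha_j = 1 / E[phi_j^2] under the current posterior
        alpha = 1.0 / (mean**2 + np.diag(Sigma))
        history.append(log_type2_likelihood(Phi, d, alpha, noise_var))
    mean, _ = posterior(Phi, d, alpha, noise_var)
    return alpha, mean, np.array(history)

rng = np.random.default_rng(1)
x = np.linspace(-2.0, 2.0, 50)                 # assumed sample locations
d = 1.0 + x**2 + rng.standard_normal(x.size)   # "true" model plus N(0, 1) noise
Phi = np.vander(x, 6, increasing=True)         # columns 1, x, ..., x^5

alpha, mean, history = ard_em(Phi, d)
print("log10(alpha):", np.round(np.log10(alpha), 2))
print("posterior mean of coefficients:", np.round(mean, 3))
```

A large alpha_j pins the corresponding coefficient near zero, which is how the continuous hyper-parameters stand in for a discrete choice among nested models. EM never decreases the Type-II likelihood but can be slow; a gradient-based or stochastic optimizer is a reasonable alternative.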

Application: Nonlinear Modeling in Aeroelasticity
● Limit cycle oscillation (LCO) is observed in wind tunnel experiments for a 2-D rigid airfoil in the transitional Reynolds number (Re) regime
● Pure pitch LCO due to nonlinear aerodynamic loads
● Objective: Inverse modeling of nonlinear oscillations with an aim to understand and quantify the contribution of unsteady and nonlinear aerodynamics.
[Figure: time history plotted against normalized time]

Research Group/Resources
● Philippe Bisaillon, Ph.D. candidate, Carleton University
● Rimple Sandhu, Ph.D. candidate, Carleton University
● Dominique Poirel, Royal Military College (RMC) of Canada
● Abhijit Sarkar, Carleton University
● Chris Pettit, United States Naval Academy
[Photos: HPC lab at Carleton University; wind tunnel at RMC]

Previous Work
● Start with a candidate model set:
$$ I_{EA}\ddot{\theta} + D\dot{\theta} + K\theta + K'\theta^3 = D'\,\mathrm{sign}(\dot{\theta}) + \tfrac{1}{2}\rho U^2 c^2 s\, C_M(\theta, \dot{\theta}, \ddot{\theta}) $$
$$ M_1: \; C_M = e_1\theta + e_2\dot{\theta} + e_3\theta^3 + e_4\theta^2\dot{\theta} + \sigma\xi(\tau) $$
$$ \vdots $$
$$ M_6: \; \frac{(B_1 + B_2)\,\dot{C}_M}{B_1 B_2} + \frac{\ddot{C}_M}{B_1 B_2} + C_M = e_1\theta + e_2\dot{\theta} + e_3\theta^3 + e_4\theta^2\dot{\theta} + e_5\theta^5 + \frac{(2 c_6 c_7 + 0.5)\,\ddot{\theta}}{B_1 B_2} + \frac{c_6\,\dddot{\theta}}{B_1 B_2} + \sigma\xi(\tau) $$
● We observe the pitch degree-of-freedom (DOF):
$$ d_k = \theta(t_k) + \epsilon_k $$
● We perform Bayesian model selection in discrete model space
Sandhu et al., JCP, 2016
Sandhu et al., CMAME, 2014
Khalil et al., JSV, 2013

Use of ARD Priors
● Start with an encompassing model:
$$ I_{EA}\ddot{\theta} + D\dot{\theta} + K\theta + K'\theta^3 = D'\,\mathrm{sign}(\dot{\theta}) + \tfrac{1}{2}\rho U^2 c^2 s\, C_M $$
$$ \frac{\dot{C}_M}{B} + C_M = a_1\theta + a_2\dot{\theta} + a_3\theta^3 + a_4\theta^2\dot{\theta} + a_5\theta^5 + a_6\theta^4\dot{\theta} + \frac{c_6}{B}\ddot{\theta} + \sigma\xi(\tau) $$
● We would like to find the optimal model nested under the overly prescribed encompassing model

Hybrid approach: ARD Priors vs Fixed Priors
● We assign prior distributions by categorizing parameters, based on prior knowledge about the aerodynamics, as Required ($\phi_{-\alpha}$) or Contentious ($\phi_{\alpha}$)
$$ \frac{\dot{C}_M}{B} + C_M = a_1\theta + a_2\dot{\theta} + a_3\theta^3 + a_4\theta^2\dot{\theta} + a_5\theta^5 + a_6\theta^4\dot{\theta} + \frac{c_6}{B}\ddot{\theta} + \sigma\xi(\tau) $$

Hierarchical Bayes
Using a hierarchical Bayes approach:
● Posterior pdf $p(\alpha \mid d)$ of hyper-parameter vector $\alpha$:
$$ p(\alpha \mid d) \propto p(d \mid \alpha)\, p(\alpha) $$
● For a fixed “hyper-prior” $p(\alpha)$, Task: Stochastic optimization:
$$ \alpha_{MAP} = \arg\max_{\alpha}\; p(\alpha \mid d) $$
● Model evidence as a function of hyper-parameter, Task: Evidence computation:
$$ p(d \mid \alpha) = \int p(d \mid \phi)\, p(\phi \mid \alpha)\, d\phi $$
● Parameter likelihood computation, Task: State estimation:
$$ p(d \mid \phi) = \prod_{k=1}^{n_d} \int p\left(d_k \mid u_{j(k)}, \phi\right) p\left(u_{j(k)} \mid d_{1:k-1}, \phi\right) du_{j(k)} $$
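
The product form of the parameter likelihood, in which each factor integrates the observation density against the one-step-ahead state prediction, has a closed form when the state-space model is linear and Gaussian: every factor is a Gaussian predictive density produced by the Kalman filter. The sketch below uses a scalar model chosen purely for illustration (it is not the aeroelastic model): state u_k = a u_{k-1} + w_k and observation d_k = u_k + v_k, with hypothetical parameters `a`, `q`, `r` and a hypothetical initial state N(m0, p0).

```python
import numpy as np

def kalman_log_likelihood(d, a, q, r, m0=0.0, p0=1.0):
    """log p(d | phi) for the assumed scalar linear-Gaussian model

        u_k = a * u_{k-1} + w_k,  w_k ~ N(0, q)
        d_k = u_k + v_k,          v_k ~ N(0, r)

    with phi = (a, q, r) and initial state u_0 ~ N(m0, p0).
    """
    m, p = m0, p0
    loglik = 0.0
    for dk in d:
        # Prediction step: p(u_k | d_{1:k-1}, phi) = N(m_pred, p_pred)
        m_pred = a * m
        p_pred = a * a * p + q
        # Integral of p(d_k | u_k) p(u_k | d_{1:k-1}) du_k = N(d_k; m_pred, s)
        s = p_pred + r
        loglik += -0.5 * (np.log(2 * np.pi * s) + (dk - m_pred) ** 2 / s)
        # Update step: condition the state on d_k
        gain = p_pred / s
        m = m_pred + gain * (dk - m_pred)
        p = (1.0 - gain) * p_pred
    return loglik

# Simulate hypothetical data and evaluate the likelihood at two values of a
rng = np.random.default_rng(2)
a_true, q_true, r_true = 0.9, 0.1, 0.2
u, data = 0.0, []
for _ in range(200):
    u = a_true * u + np.sqrt(q_true) * rng.standard_normal()
    data.append(u + np.sqrt(r_true) * rng.standard_normal())
data = np.array(data)

print(kalman_log_likelihood(data, 0.9, q_true, r_true))
print(kalman_log_likelihood(data, 0.2, q_true, r_true))
```

For nonlinear models the integrals no longer have closed forms, which is where the extended, unscented, and ensemble Kalman filters or particle filters come in.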

Numerical techniques
● Evidence computation: Chib-Jeliazkov method, Power posteriors, Nested sampling, Annealed importance sampling, Harmonic mean estimator, adaptive Gauss-Hermite quadrature; and many others
● MCMC sampler for Chib-Jeliazkov method: Metropolis-Hastings, Gibbs, tMCMC, adaptive Metropolis, Delayed Rejection Adaptive Metropolis (DRAM); and many others
● State estimation: Kalman filter, extended Kalman filter, unscented Kalman filter, ensemble Kalman filter, particle filter; and many others.
● Results are in: R. Sandhu, C. Pettit, M. Khalil, D. Poirel, A. Sarkar, Bayesian model selection using automatic relevance determination for nonlinear dynamical systems, Computer Methods in Applied Mechanics and Engineering, in press.

Numerical Results: ARD Priors
[Figure: numerical results with ARD priors]

Numerical Results: ARD vs Flat Priors
● We compare selected marginal and joint pdfs for (a) ARD priors with optimal hyper-parameters, and (b) flat priors
● ARD priors are able to remove superfluous parameters while having an insignificant effect on the posterior pdfs of important parameters

Application: Predictive Modeling of Wavelet Coefficients
● Collaborators: Jina Lee, Maher Salloum (Sandia)
● Objective: Replace computationally expensive simulations of physical systems with response predictions constructed at the wavelet coefficient level
● Procedure:
✦ Perform compressed sensing of high-dimensional system response from full-order model simulations
✦ Model resulting low-dimensional wavelet coefficients using an autoregressive-moving-average (ARMA) model
$$ x_t = \sum_{i=1}^{p} \varphi_i x_{t-i} + \sum_{j=1}^{q} \theta_j \epsilon_{t-j}, \qquad \epsilon_t \sim N(0, 1) $$
$$ y_t = x_t + \zeta_t, \qquad \zeta_t \sim N(0, \gamma^2) $$
✦ The parameter likelihood for $\varphi_i$, $\theta_j$ and $\gamma$ involves state estimation using the Kalman filter
✦ Model selection, i.e. determining model orders p and q, is performed using the Akaike information criterion (AIC)
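
Order selection with AIC can be sketched on a simplified stand-in for the ARMA setting: a pure autoregressive model without observation noise, fitted by least squares rather than through a Kalman filter likelihood. The simulated AR(2) series, its coefficients, the seed, and the function name `ar_aic` are all assumptions for illustration. AIC is computed as 2k minus twice the maximized log-likelihood, where k counts the AR coefficients plus the noise variance.

```python
import numpy as np

def ar_aic(y, p, p_max):
    """AIC of an AR(p) model fitted by conditional least squares.

    All orders are fitted on the same targets y[p_max:], so that the AIC
    values are comparable across p.
    """
    target = y[p_max:]
    n = len(target)
    if p > 0:
        # Column i holds the lag-i values y[t - i] for t = p_max, ..., len(y) - 1
        X = np.column_stack([y[p_max - i: len(y) - i] for i in range(1, p + 1)])
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ coef
    else:
        resid = target
    sigma2 = np.mean(resid**2)                    # ML estimate of noise variance
    log_like = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    k = p + 1                                     # AR coefficients + variance
    return 2 * k - 2 * log_like

# Simulate a hypothetical AR(2) series
rng = np.random.default_rng(3)
n_samples = 500
y = np.zeros(n_samples)
for t in range(2, n_samples):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.standard_normal()

p_max = 5
aics = np.array([ar_aic(y, p, p_max) for p in range(p_max + 1)])
best_p = int(np.argmin(aics))
print("AIC by order:", np.round(aics, 2))
print("selected order:", best_p)
```

In the ARMA setting with noisy observations, the least-squares fit would be replaced by maximizing a Kalman filter likelihood over the ARMA coefficients and the observation noise level for each candidate pair (p, q).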

Wavelet Coefficient Predictions
● For illustration we consider the transient response of the 2D heat equation on a square domain with randomly chosen holes (for added heterogeneity)
● Compressed sensing is performed and 7 dominant wavelet coefficients are modeled

Application: Optimal Embedding of Model Error
● Collaborators: Layal Hakim, Guilhem Lacaze, Khachik Sargsyan, Habib Najm, Joe Oefelein (Sandia)
● Objective: Calibrate a simple chemical model against computations from a detailed kinetic model
✦ Simple model with an embedded parameterization of model error using polynomial chaos expansions
✦ Optimal placement of model error achieved via Bayesian model selection (Bayes’ factor)
[Figure: Bayes’ factors]

Summary
● Presented a framework for data-driven model selection using ARD prior pdfs
● ARD priors enable the transformation of the model selection problem from the discrete model space into the continuous hyper-parameter space
● ARD priors allow for parameter space dimension reduction informed by noisy observations of the system
● Applications:
✦ Nonlinear dynamical systems modeled using stochastic ordinary differential equations (ARD priors)
✦ Predictive Modeling of Wavelet Coefficients (AIC)
✦ Optimal Embedding of Model Error (Bayes’ factor)
● ARD priors are able to remove superfluous parameters while having an insignificant effect on the posterior pdfs of important parameters