Fast Estimation of Expected Information Gain for Bayesian Experimental Design Based on Laplace Approximation

Quan Long, Marco Scavino, Raúl Tempone, Suojin Wang
Computer, Electrical and Mathematical Sciences & Engineering, King Abdullah University of Science and Technology, KSA
Department of Statistics, Texas A&M University, College Station, TX 77843, USA

Introduction

Shannon-type expected information gain is an important utility for evaluating the usefulness of a proposed experiment that involves uncertainty. Its estimation, however, cannot rely solely on Monte Carlo sampling methods, which are generally too computationally expensive for realistic physical models, especially those involving the solution of stochastic partial differential equations. In this work we present a new methodology, based on the Laplace approximation of the posterior probability density function, to accelerate the estimation of the expected information gain in the model parameters and in predictive quantities of interest. Furthermore, to deal with the issue of dimensionality in complex problems, we use sparse quadratures for the integration over the prior. We show the accuracy and efficiency of the proposed method via several nonlinear numerical examples, including a single-parameter design for a one-dimensional cubic polynomial model and the choice of current patterns for impedance tomography.

Multi-dimensional Inference

Let us consider additive Gaussian experimental noise,

    y_i = g(θ, ξ) + ε_i ,

where both y_i and θ are vectors, ξ is the design parameter, and ε_i is a Gaussian noise term with covariance Σ_ε, independent of θ. The posterior of the parameter vector, before the Laplace approximation is applied, reads

    p(θ | {y_i}) = exp( −(1/2) Σ_{i=1}^{M} r_i^T Σ_ε^{−1} r_i ) p(θ) / ∏_{i=1}^{M} p(y_i) ,

where r_i = g(θ_d) + ε_i − g(θ) is the residual of the i-th measurement for a given design, and θ_d is the "true" parameter of the system. The Laplace approximation leads to normality of the posterior pdf of the parameters,

    p(θ | {y_i}) ≈ p(θ̂ | {y_i}) exp( −(θ − θ̂)^T Σ^{−1} (θ − θ̂) / 2 ) ,    (1)

where the mode θ̂ and the covariance Σ are derived below. We first present how to obtain Σ, assuming a Gaussian prior for θ, θ ∼ N(θ_0, Σ_p). Define the following quantity, the negative logarithm of the original parameter posterior before approximation:

    F(θ) := −log p(θ | {y_i})
          = (1/2) Σ_{i=1}^{M} r_i^T Σ_ε^{−1} r_i − log p(θ) + C_1
          = (1/2) Σ_{i=1}^{M} r_i^T Σ_ε^{−1} r_i + (1/2) (θ − θ_0)^T Σ_p^{−1} (θ − θ_0) + C_2 ,

where C_1 and C_2 are both constants. A Taylor expansion of F(θ) around θ̂ yields

    F(θ) ≈ F(θ̂) + ∇F(θ̂) (θ − θ̂) + (1/2) (θ − θ̂)^T ∇∇F(θ̂) (θ − θ̂) ,

where the gradient of the negative log posterior with respect to θ is

    ∇F(θ̂) = −J_g(θ̂)^T Σ_ε^{−1} Σ_{i=1}^{M} r_i + Σ_p^{−1} (θ̂ − θ_0) ,

and the corresponding Hessian is

    H_f = ∇∇F(θ̂) = −Σ_{i=1}^{M} H_g(θ̂)^T Σ_ε^{−1} r_i + M J_g(θ̂)^T Σ_ε^{−1} J_g(θ̂) + Σ_p^{−1} ,

where J_g and H_g are the Jacobian and Hessian of g with respect to θ. For M sufficiently large, the magnitude of the first term grows only like √M (it is a sum of M zero-mean noise contributions), whereas the second term grows like M. We can therefore disregard the term Σ_{i=1}^{M} H_g(θ̂)^T Σ_ε^{−1} r_i and obtain the approximation

    H_f(θ̂) ≈ M J_g^T(θ̂) Σ_ε^{−1} J_g(θ̂) + Σ_p^{−1} .    (2)

Now consider the maximum a posteriori solution for the parameters,

    θ̂ := arg min_θ { Σ_{i=1}^{M} r_i^T Σ_ε^{−1} r_i + (θ − θ_0)^T Σ_p^{−1} (θ − θ_0) } .

Indeed, neglecting the noise cross terms, which average out for large M, we have

    θ̂ = arg min_θ { M (g(θ_d) − g(θ))^T Σ_ε^{−1} (g(θ_d) − g(θ)) + (θ − θ_0)^T Σ_p^{−1} (θ − θ_0) } .

Eventually, the covariance matrix of the posterior is Σ = H_f^{−1}(θ̂).

Expected Information Gain

The expected information gain in the multi-dimensional case, i.e. the prior-averaged Kullback–Leibler divergence between posterior and prior, can be approximated in the following way:

    I ≈ ∫_Θ [ (1/2) log( |Σ_p| / |Σ| ) − d/2 + (1/2) Σ : Σ_p^{−1} + (1/2) (θ̂ − θ_0)^T Σ_p^{−1} (θ̂ − θ_0) ] p(θ_d) dθ_d
      ≈ (1/2) ∫_Θ log( |Σ_p| / |Σ| ) p(θ_d) dθ_d ,

where d is the dimension of θ and the symbol : denotes the summation of the component-wise product of two tensors.
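To make the estimator above concrete, here is a minimal Python sketch of the Laplace-based procedure. It makes several assumptions beyond the poster: the model g (and the toy usage at the bottom) is a hypothetical placeholder, the Jacobian is obtained by forward differences, the Gauss-Newton Hessian (2) is evaluated directly at the prior draw θ_d (exploiting the concentration of θ̂ around θ_d for small noise), and plain prior sampling stands in for the sparse quadrature used in the poster.

```python
import numpy as np

def jacobian_fd(g, theta, h=1e-6):
    """Forward-difference Jacobian of the model g at theta."""
    y0 = np.atleast_1d(g(theta))
    J = np.empty((y0.size, theta.size))
    for k in range(theta.size):
        tp = theta.copy()
        tp[k] += h
        J[:, k] = (np.atleast_1d(g(tp)) - y0) / h
    return J

def eig_laplace(g, theta0, Sigma_p, Sigma_eps, M, n_outer=100, seed=0):
    """Estimate I ~ (1/2) E_{theta_d}[ log(|Sigma_p| / |Sigma|) ], Sigma from eq. (2).

    For small noise the posterior mode concentrates around theta_d, so this
    sketch evaluates the Gauss-Newton Hessian directly at the prior draw
    instead of solving the inner optimization problem for the mode.
    """
    rng = np.random.default_rng(seed)
    Sp_inv = np.linalg.inv(Sigma_p)
    Se_inv = np.linalg.inv(Sigma_eps)
    _, logdet_Sp = np.linalg.slogdet(Sigma_p)
    total = 0.0
    for _ in range(n_outer):
        theta_d = rng.multivariate_normal(theta0, Sigma_p)
        J = jacobian_fd(g, theta_d)
        H = M * J.T @ Se_inv @ J + Sp_inv        # approximate Hessian, eq. (2)
        _, logdet_H = np.linalg.slogdet(H)       # log|Sigma| = -log|H_f|
        total += 0.5 * (logdet_Sp + logdet_H)    # (1/2) log(|Sigma_p|/|Sigma|)
    return total / n_outer

# Toy usage with a hypothetical 2-parameter, 3-observation model:
g = lambda th: np.array([th[0]**3, th[0] * th[1], np.exp(th[1])])
print(eig_laplace(g, np.zeros(2), 0.1 * np.eye(2), 1e-2 * np.eye(3), M=10))
```

Note that each outer iteration costs only one Jacobian evaluation and two log-determinants, whereas a sampling-based estimator requires a full inner Monte Carlo loop per outer draw.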
Generalization to Under-Determined Model

For under-determined models, the approach is extended by a projection method (see the second reference): the expected information gain is approximated as

    I ≈ ∫ [ −log p_s(0) − log( (√(2π))^n |Σ_{s|t}|^{1/2} ) − n/2 ] p(θ_0) dθ_0 ,

where Σ_{s|t} is the projected covariance matrix and n is the dimension of the projected parameter s.

Prediction of Quantity of Interest

In some cases we are interested in a physical quantity of interest Q, commonly defined as a function of θ plus an independent error,

    Q = τ(θ) + ε_Q ,

where the prediction error ε_Q is independent of θ. Thus, the uncertainty in Q comes from the direct combination of the two sources θ and ε_Q. Since the posterior pdf of the parameters is concentrated around θ̂, a small-noise assumption can be applied to propagate the randomness from θ to Q. Linearization of τ at θ̂ leads to

    τ(θ) ≈ τ(θ̂) + (∂τ/∂θ) (θ − θ̂) .

We can consequently conclude that Q | {y_i} is Gaussian,

    p(Q | {y_i}) = 1 / (√(2π) σ_{Q|{y_i}}) exp( −(Q − Q̂)^2 / (2 σ²_{Q|{y_i}}) ) ,    (3)

with σ²_{Q|{y_i}} = (∂τ/∂θ) Σ (∂τ/∂θ)^T + σ²_Q, where σ²_Q, assumed to be a known constant, is the variance of ε_Q, and Q̂ = τ(θ̂). The expected information gain in Q is therefore

    I = ∫∫ ( log p(Q | {y_i}) − log p(Q) ) p(Q | {y_i}) dQ p({y_i}) d{y_i} = H(Q) − H(Q | {y_i}) ,

where

    H(Q) = −∫ (log p(Q)) p(Q) dQ

and

    H(Q | {y_i}) = −∫∫ (log p(Q | {y_i})) p(Q | {y_i}) dQ p({y_i}) d{y_i} .

We substitute expression (3) into H(Q | {y_i}) and obtain

    H(Q | {y_i}) ≈ ∫ (1/2) [ log(2π) + log( (∂τ/∂θ) Σ (∂τ/∂θ)^T + σ²_Q ) + 1 ] p(θ_d) dθ_d .    (4)
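Equation (4) is cheap to evaluate once the Laplace covariance Σ is available. The sketch below is our illustration rather than the authors' code; `tau` and the finite-difference step are placeholders. It computes the entropy of Q | {y_i} for a single mode θ̂; averaging this quantity over prior draws of θ_d (each giving its own θ̂ and Σ from the Laplace step) approximates H(Q | {y_i}), while H(Q) is computed once from the prior predictive.

```python
import numpy as np

def qoi_entropy_given_data(tau, theta_hat, Sigma, sigma2_Q, h=1e-6):
    """Entropy of the Gaussian Q | {y_i} of eq. (3):
    var = (dtau/dtheta) Sigma (dtau/dtheta)^T + sigma2_Q,
    entropy = 0.5 * (log 2*pi + log var + 1), as in eq. (4)."""
    t0 = tau(theta_hat)
    # Forward-difference gradient of the quantity of interest at theta_hat.
    grad = np.array([(tau(theta_hat + h * e) - t0) / h
                     for e in np.eye(theta_hat.size)])
    var_Q = grad @ Sigma @ grad + sigma2_Q   # linearized predictive variance
    return 0.5 * (np.log(2.0 * np.pi) + np.log(var_Q) + 1.0)

# Example (hypothetical quantities):
tau = lambda th: th[0]**2 + th[1]
print(qoi_entropy_given_data(tau, np.array([0.5, 0.5]),
                             1e-3 * np.eye(2), sigma2_Q=1e-4))
```

The design choice here mirrors the derivation: the linearization of τ makes the conditional predictive Gaussian, so its entropy is available in closed form and no sampling over Q is needed.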
Numerical Examples

Model with one parameter

We consider the following simple model for the scalar data y:

    y(θ, ξ) = θ³ ξ² + θ exp(−|0.2 − ξ|) + ε ,

with θ ∼ U(−1, 1) and ε ∼ N(0, 0.001). A brute-force Monte Carlo sketch for this model is given at the end of this section.

Figure 1: Performance of the Laplace approximation (1, 3, and 10 quadrature points) and Monte Carlo sampling (10³×10³ and 10⁴×10⁴ samples) in computing the expected information gain as a function of ξ; M = 10.

Model with two indistinguishable parameters

We change the model in the first example to

    y = (α θ_1 + β θ_2)³ ξ² + (α θ_1 + β θ_2) exp(−|0.2 − ξ|) + ε ,

with θ ∼ N(θ_0, Σ_p) and ε ∼ N(0, σ²_m), where θ_0 = [0.5, 0.5]^T, Σ_p(1,1) = Σ_p(2,2) = 0.1, and Σ_p(1,2) = Σ_p(2,1) = 0.

Figure 2: Expected information gain for different design parameters ξ, computed by Monte Carlo sampling (MC) and Laplace approximation (LA) for M = 1 and M = 10; α = 0.7, β = 0.3. In the left figure σ²_m = 0.01, while in the right figure σ²_m = 0.001.

Figure 3: Logarithmic plot of the absolute consecutive difference of the expected information gain versus the number of quadrature points/samples, for LA at ξ = 0.25, 0.5, 0.75 and for MC; σ²_m = 0.01, α = 1, β = 1.

Impedance Tomography

The parameter space is defined by 9 piecewise constant conductivities, modeled as exp(θ) with θ ∼ N(0, Σ_p), where Σ_p(3,3) = Σ_p(5,5) = Σ_p(7,7) = 1 and Σ_p(1,1) = Σ_p(2,2) = Σ_p(4,4) = Σ_p(6,6) = Σ_p(8,8) = Σ_p(9,9) = 0.01. The design variable is the pattern of current sources applied on the boundary.

Figure 4: Admissible locations of boundary sources and subdomains of piecewise constant random conductivities.

Figure 5: Information gains computed for all the possible combinations of current sources, indexed by design scenario.

Figure 6: Two examples of current patterns and corresponding potential contours inducing the most information gain. (a) Current pattern of the 35th scenario. (b) Current pattern of the 36th scenario.

Figure 7: Two examples of current patterns and corresponding potential contours inducing the least information gain. (a) Current pattern of the 5th scenario. (b) Current pattern of the 52nd scenario.
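For reference, the Monte Carlo curves of Figure 1 correspond to a brute-force double-loop estimator, illustrated below for the one-parameter model. This is our reconstruction rather than the authors' code, and the sample sizes n_outer and n_inner are illustrative placeholders. Its cost of roughly n_outer × n_inner model evaluations per design point ξ is precisely what motivates the Laplace-based acceleration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2, M = 1e-3, 10   # noise variance and number of repeated measurements

def g(theta, xi):
    return theta**3 * xi**2 + theta * np.exp(-abs(0.2 - xi))

def eig_mc(xi, n_outer=1000, n_inner=1000):
    """Double-loop Monte Carlo estimate of the expected information gain at xi."""
    theta_out = rng.uniform(-1.0, 1.0, n_outer)   # prior draws, outer loop
    theta_in = rng.uniform(-1.0, 1.0, n_inner)    # prior draws, evidence estimate
    total = 0.0
    for th in theta_out:
        y = g(th, xi) + rng.normal(0.0, np.sqrt(sigma2), M)   # synthetic data
        r = y[None, :] - g(theta_in, xi)[:, None]             # residuals, (n_inner, M)
        loglik = -0.5 * np.sum(r**2, axis=1) / sigma2         # constants dropped
        loglik_true = -0.5 * np.sum((y - g(th, xi))**2) / sigma2
        # Log evidence by log-mean-exp; dropped constants cancel in the difference.
        log_evid = np.log(np.mean(np.exp(loglik - loglik.max()))) + loglik.max()
        total += loglik_true - log_evid
    return total / n_outer

print(eig_mc(0.5))   # EIG estimate at the design point xi = 0.5
```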
Acknowledgement

Support of this work by the AEA UT-KAUST project entitled "Predictability and uncertainty quantification for models of porous media" is gratefully acknowledged. Quan Long, Marco Scavino and Raúl Tempone are members of the KAUST SRI Center for Uncertainty Quantification in Computational Science and Engineering.

Reference

Quan Long, Marco Scavino, Raúl Tempone, Suojin Wang. Fast estimation of expected information gains for Bayesian experimental designs based on Laplace approximation. Computer Methods in Applied Mechanics and Engineering, Vol. 259, pp. 24-39, 2013.

Quan Long, Marco Scavino, Raúl Tempone, Suojin Wang. A projection method for optimal Bayesian experimental design. Preprint, 2013.