Slides for Introduction to Stochastic Search and Optimization (ISSO) by J. C. Spall

CHAPTER 17
OPTIMAL DESIGN FOR EXPERIMENTAL INPUTS

• Organization of chapter in ISSO
  – Background
    • Motivation
    • Finite-sample and asymptotic (continuous) designs
    • Precision matrix and D-optimality
  – Linear models
    • Connections to D-optimality
    • Key equivalence theorem
  – Response surface methods
  – Nonlinear models

Optimal Design in Simulation
• Two roles for experimental design in simulation
  – Building approximation to existing large-scale simulation via “metamodel”
  – Building simulation model itself
• Metamodels are “curve fits” that approximate simulation input/output
  – Usual form is low-order polynomial in the inputs; linear in parameters
  – Linear design theory useful
• Building simulation model
  – Typically need nonlinear design theory
• Some terminology distinctions:
  – “Factors” (statistics term) ↔ “Inputs” (modeling and simulation terms)
  – “Levels” ↔ “Values”
  – “Treatments” ↔ “Runs”

Unique Advantages of Design in Simulation
• Simulation experiments may be considered special case of general experiments
• Some unique benefits occur due to simulation structure
• Can control factors not generally controllable (e.g., arrival rates into network)
• Direct repeatability due to deterministic nature of random number generators
  – Variance reduction (CRNs, etc.) may be helpful
• Not necessary to randomize runs to avoid systematic variation due to inherent conditions
  – E.g., randomization in run order and input levels in biological experiment to reduce effects of change in ambient humidity in laboratory
  – In simulation, systematic effects can be eliminated since analyst controls nature

Design of Computer Experiments in Statistics
• There exists significant activity among statisticians for experimental design based on computer experiments
  – T. J. Santner et al. (2003), The Design and Analysis of Computer Experiments, Springer-Verlag
  – J. Sacks et al. (1989), “Design and Analysis of Computer Experiments (with discussion),” Statistical Science, 409–435
  – Etc.
• Above statistical work differs from experimental design with Monte Carlo simulations
  – Above work assumes deterministic function evaluations via computer (e.g., solution to complicated ODE)
• One implication of deterministic function evaluations: no need to replicate experiments for given set of inputs
• Contrasts with Monte Carlo, where replication provides variance reduction

General Optimal Design Formulation (Simulation or Non-Simulation)
• Assume model $z = h(\theta, x) + v$, where $x$ is an input we are trying to pick optimally
• Experimental design $\xi$ consists of $N$ specific input values $x = \chi_i$ and proportions (weights) $w_i$ assigned to these input values:
  $$\xi = \begin{Bmatrix} \chi_1 & \chi_2 & \cdots & \chi_N \\ w_1 & w_2 & \cdots & w_N \end{Bmatrix}$$
• Finite-sample design allocates $n \ge N$ available measurements exactly; asymptotic (continuous) design allocates based on $n \to \infty$

D-Optimal Criterion
• Picking optimal design $\xi$ requires criterion for optimization
• Most popular criterion is D-optimal measure
• Let $M(\theta, \xi)$ denote the “precision matrix” for an estimate of $\theta$ based on a design $\xi$
  – $M(\theta, \xi)$ is inverse of covariance matrix for estimate and/or
  – $M(\theta, \xi)$ is Fisher information matrix for estimate
• D-optimal solution is
  $$\xi^* = \arg\max_{\xi} \det M(\theta, \xi)$$

Equivalence Theorem
• Consider linear model
  $$z_k = h_k^T \theta + v_k, \quad k = 1, 2, \ldots, n$$
• Prediction based on parameter estimate $\hat{\theta}_n$ and “future” measurement vector $h^T$ is $\hat{z} = h^T \hat{\theta}_n$
• Kiefer-Wolfowitz equivalence theorem states: D-optimal solution for determining the $\xi$ to be used in forming $\hat{\theta}_n$ is the same $\xi$ that minimizes the maximum variance of predictor $\hat{z}$
• Useful in practical determination of optimal $\xi$

Variance Function as it Depends on Input: Optimal Asymptotic Design for Example 17.6 in ISSO
[Figure: variance function plotted against input]

Orthogonal Designs
• With linear models, usually more than one solution is D-optimal
• Orthogonality is means of reducing number of solutions
• Orthogonality also introduces desirable secondary properties
  – Separates effects of input factors (avoids “aliasing”)
  – Makes estimates for elements of $\theta$ uncorrelated
• Orthogonal designs are not generally D-optimal; D-optimal designs are not generally orthogonal
  – However, some designs are both
• Classical factorial (“cubic”) designs are orthogonal (and often D-optimal)

Example Orthogonal Designs, r = 2 Factors
[Figure: two panels with axes $x_{k1}$, $x_{k2}$: Cube ($2^r$ design) and Star ($2r$ design)]

Example Orthogonal Designs, r = 3 Factors
[Figure: two panels with axes $x_{k1}$, $x_{k2}$, $x_{k3}$: Cube ($2^r$ design) and Star ($2r$ design)]

Response Surface Methodology (RSM)
• Suppose want to determine inputs $x$ that minimize the mean response $z$ of some process ($E(z)$)
  – There are also other (nonoptimization) uses for RSM
• RSM can be used to build local models with the aim of finding the optimal $x$
  – Based on building a sequence of local models as one moves through factor ($x$) space
• Each response surface is typically a simple regression polynomial
• Experimental design can be used to determine input values for building response surfaces

Steps of RSM for Optimizing x
Step 0 (Initialization) Initial guess at optimal value of $x$.
Step 1 (Collect data) Collect responses $z$ from several $x$ values in neighborhood of current estimate of best $x$ value (can use experimental design).
Step 2 (Fit model) From the $x$, $z$ pairs in step 1, fit regression model in region around current best estimate of optimal $x$.
Step 3 (Identify steepest descent path) Based on response surface in step 2, estimate path of steepest descent in factor space.
Step 4 (Follow steepest descent path) Perform series of experiments at $x$ values along path of steepest descent until no additional improvement in $z$ response is obtained. This $x$ value represents new estimate of best vector of factor levels.
Step 5 (Stop or return) Go to step 1 and repeat process until final best factor level is obtained.
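
The RSM steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure as given in ISSO: the process `true_mean`, its minimizer (2, -1), the noise level, the half-width of the local design, the step length, and all function names are hypothetical. It uses a replicated cube design for r = 2 factors and a first-order (linear) local model, whose slopes have a closed form because the design is orthogonal.

```python
import random

# 2^r factorial ("cube") design in coded units for r = 2 factors
CUBE = [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def true_mean(x):
    """Hypothetical E(z), minimized at x = (2, -1); unknown to the analyst."""
    return (x[0] - 2.0) ** 2 + 2.0 * (x[1] + 1.0) ** 2

def response(x, rng, noise_sd=0.1):
    """One noisy measurement z at input x."""
    return true_mean(x) + rng.gauss(0.0, noise_sd)

def mean_response(x, rng, reps=4):
    """Average of a few replicated measurements at x."""
    return sum(response(x, rng) for _ in range(reps)) / reps

def fit_slopes(center, half_width, rng, reps=4):
    """Steps 1-2: run a replicated cube design around center and fit a
    first-order model. With an orthogonal design, each least-squares slope
    is sum(coded level * response) / (number of runs)."""
    runs = []
    for _ in range(reps):
        for d in CUBE:
            x = [c + half_width * dj for c, dj in zip(center, d)]
            runs.append((d, response(x, rng)))
    n = len(runs)
    # Divide by half_width to convert coded-unit slopes to natural units.
    return [sum(d[j] * z for d, z in runs) / (n * half_width)
            for j in range(len(center))]

def rsm_minimize(x0, rng, half_width=0.5, step=0.25, max_rounds=20):
    x = list(x0)                                   # step 0: initial guess
    for _ in range(max_rounds):
        slopes = fit_slopes(x, half_width, rng)    # steps 1-2
        norm = sum(b * b for b in slopes) ** 0.5
        if norm == 0.0:
            break
        direction = [-b / norm for b in slopes]    # step 3: steepest descent
        best_z = mean_response(x, rng)
        moved = False
        for _ in range(100):                       # step 4: follow the path
            trial = [xi + step * di for xi, di in zip(x, direction)]
            z = mean_response(trial, rng)
            if z >= best_z:                        # no further improvement
                break
            x, best_z, moved = trial, z, True
        if not moved:                              # step 5: stop
            break
    return x
```

Calling `rsm_minimize([0.0, 0.0], random.Random(0))` returns the final estimate of the best factor levels. A practical implementation would typically switch to a second-order model (e.g., adding star points) near the solution, which this sketch omits.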

Conceptual Illustration of RSM for Two Variables in x; Shows More Refined Experimental Design Near Solution
[Figure] Adapted from: Montgomery (2001), Design and Analysis of Experiments, Fig. 11-3

Nonlinear Design
• Assume model $z = h(\theta, x) + v$, where $\theta$ enters nonlinearly
• D-optimality remains dominant measure
  – Maximization of determinant of Fisher information matrix (from Chapter 13 of ISSO: $F_n(\theta, X)$ is Fisher information matrix based on $n$ data points)
• Fundamental distinction from linear case is that D-optimal criterion depends on $\theta$
• Leads to conundrum: Choosing $X$ to best estimate $\theta$, yet need to know $\theta$ to determine $X$

Strategies for Coping with Dependence on $\theta$
• Assume nominal value of $\theta$ and develop an optimal design based on this fixed value
• Sequential design strategy based on an iterated design and model fitting process
• Bayesian strategy where a prior distribution is assigned to $\theta$, reflecting uncertainty in the knowledge of the true value of $\theta$

Sequential Approach for Parameter Estimation and Optimal Design
Step 0 (Initialization) Make initial guess at $\theta$, $\hat{\theta}_0$. Allocate $n_0$ measurements to initial design. Set $k = 0$ and $n = 0$.
Step 1 (D-optimal maximization) Given $X_n$, choose the $n_k$ inputs in $X = X_{n_k}$ to maximize
  $$\det\left[ F_n(\hat{\theta}_n, X_n) + F_{n_k}(\hat{\theta}_n, X) \right]$$
Step 2 (Update $\theta$ estimate) Collect $n_k$ measurements based on inputs from step 1. Use measurements to update from $\hat{\theta}_n$ to $\hat{\theta}_{n+n_k}$.
Step 3 (Stop or return) Stop if the value of $\theta$ in step 2 is satisfactory. Else return to step 1 with the new $k$ set to the former $k + 1$ and the new $n$ set to the former $n + n_k$ (updated $X_n$ now includes inputs from step 1).
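
The sequential steps above can be illustrated with a small Python sketch. Everything specific here is an assumption made for illustration and is not taken from ISSO: the scalar model $h(\theta, x) = e^{-\theta x}$, Gaussian noise with known standard deviation `SIGMA`, the candidate grids, one new input per iteration ($n_k = 1$), a brute-force least-squares update in place of a proper estimator, and a fixed measurement budget in place of the stopping test in step 3. For scalar $\theta$ the Fisher information is a number, so the determinant is trivial.

```python
import math
import random

SIGMA = 0.05                                      # assumed known noise std. dev.
X_GRID = [0.05 * i for i in range(1, 61)]         # candidate inputs in (0, 3]
THETA_GRID = [0.01 * i for i in range(10, 501)]   # candidate theta in [0.1, 5]

def h(theta, x):
    """Hypothetical nonlinear model h(theta, x) = exp(-theta * x)."""
    return math.exp(-theta * x)

def fisher(theta, xs):
    """Scalar Fisher information under Gaussian noise:
    sum over inputs of (dh/dtheta)^2 / sigma^2."""
    return sum((x * math.exp(-theta * x)) ** 2 for x in xs) / SIGMA ** 2

def next_input(theta_hat, xs_so_far):
    """Step 1 with n_k = 1: pick x maximizing F_n(theta_hat, X_n) + F_1(theta_hat, x)."""
    f_n = fisher(theta_hat, xs_so_far)
    return max(X_GRID, key=lambda x: f_n + fisher(theta_hat, [x]))

def update_theta(xs, zs):
    """Step 2: least-squares estimate by brute-force search over THETA_GRID."""
    def sse(theta):
        return sum((z - h(theta, x)) ** 2 for x, z in zip(xs, zs))
    return min(THETA_GRID, key=sse)

def sequential_design(theta_true, theta0, n_total, rng):
    xs, zs, theta_hat = [], [], theta0            # step 0: initial guess
    for _ in range(n_total):                      # step 3 simplified: fixed budget
        x = next_input(theta_hat, xs)             # step 1
        z = h(theta_true, x) + rng.gauss(0.0, SIGMA)
        xs.append(x)
        zs.append(z)
        theta_hat = update_theta(xs, zs)          # step 2
    return theta_hat, xs
```

For this particular model, the information from a single input is proportional to $x^2 e^{-2\theta x}$, which is maximized at $x = 1/\theta$, so the chosen inputs track $1/\hat{\theta}_n$ as the estimate is refined. This makes the dependence of the design on $\theta$ concrete.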

Comments on Sequential Design
• Note two optimization problems being solved: one for $\theta$, one for $\xi$
• Determine next $n_k$ input values (step 1) conditioned on current value of $\theta$
  – Each step analogous to nonlinear design with fixed (nominal) value of $\theta$
• “Full sequential” mode ($n_k = 1$) updates $\theta$ based on each new input-output pair $(x_k, z_k)$
• Can use stochastic approximation to update $\theta$:
  $$\hat{\theta}_{n+1} = \hat{\theta}_n - a_n Y_n(\hat{\theta}_n \mid z_{n+1}, x_{n+1})$$
  where
  $$Y_n(\theta \mid z_{n+1}, x_{n+1}) = \frac{1}{2} \frac{\partial}{\partial \theta} \left[ z_{n+1} - h(\theta, x_{n+1}) \right]^2$$

Bayesian Design Strategy
• Assume prior distribution (density) for $\theta$, $p(\theta)$, reflecting uncertainty in the knowledge of the true value of $\theta$
• There exist multiple versions of D-optimal criterion
• One possible D-optimal criterion:
  $$E\left[ \log\det F_n(\theta, X) \right] = \int \log\det F_n(\theta, X) \, p(\theta) \, d\theta$$
• Above criterion related to Shannon information
• While log transform makes no difference with fixed $\theta$, it does affect integral-based solution
• To simplify integral, may be useful to choose discrete prior $p(\theta)$
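
The effect of the log transform under a discrete prior can be checked numerically. The sketch below is illustrative only: the scalar model $h(\theta, x) = e^{-\theta x}$, the noise level, the three-point prior, and the single-input design are all assumptions chosen to keep the arithmetic simple, and none of them comes from ISSO. With a discrete prior the integral becomes a finite weighted sum.

```python
import math

SIGMA = 0.05                                  # assumed known noise std. dev.
X_GRID = [0.01 * i for i in range(1, 301)]    # candidate inputs in (0, 3]
PRIOR = {0.5: 0.25, 1.0: 0.5, 4.0: 0.25}      # hypothetical discrete prior p(theta)

def fisher(theta, x):
    """Scalar Fisher information of one measurement at x for
    h(theta, x) = exp(-theta * x) with Gaussian noise."""
    return (x * math.exp(-theta * x)) ** 2 / SIGMA ** 2

def expected_log_det(x):
    """E[log det F(theta, x)] as a finite sum over the discrete prior."""
    return sum(p * math.log(fisher(theta, x)) for theta, p in PRIOR.items())

def expected_det(x):
    """E[det F(theta, x)] without the log, for comparison."""
    return sum(p * fisher(theta, x) for theta, p in PRIOR.items())

x_log = max(X_GRID, key=expected_log_det)     # Bayesian log-det design
x_plain = max(X_GRID, key=expected_det)       # design without the log
prior_mean = sum(theta * p for theta, p in PRIOR.items())
```

For this model $\log F = 2\log x - 2\theta x - 2\log\sigma$, so the expected log criterion is maximized at $x = 1/E(\theta)$, whereas the criterion without the log is dominated by the prior values of $\theta$ that give large information and selects a noticeably larger input. With $\theta$ fixed, both versions would select the same $x$.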