Experimental Space-Filling Designs For Complicated Simulation Outputs
LTC Alex MacCalman
PhD Student Candidate
Modeling, Virtual Environments, and Simulations (MOVES) Institute
Naval Postgraduate School

Agenda
• Background
– Design of Simulation Experiments
– Meta-models (how we characterize complexity)
• Motivation: why orthogonality and space-filling properties are important for understanding system behavior
• Introduce the 2nd Order Nearly Orthogonal Latin Hypercube Designs
• Contributions to the literature
• Conclusions

Simulation Studies Underpin Many DoD Decisions
• DoD uses complex, high-dimensional simulation models as an important tool in the acquisition process.
– Used when it is too difficult or costly to experiment on "real systems."
– Needed for future systems: we shouldn't wait until they're operational to decide on appropriate capabilities and operational tactics, or to evaluate their potential performance.
– Used to investigate the impact of randomness and other uncertainties.
• Many simulations involve dozens, hundreds, or thousands of "factors" that can be set to different levels.

Differences Between Physical and Simulation Experiments
Characteristic | Physical World Experiments | Simulation Experiments
Number of factors | Few | Many
Number of levels | Few | Many
Number of responses | Single | Multiple
Error variance | Homogeneous | Heterogeneous
Presence of interactions | Negligible or limited | Important and complex
Error structure | iid Normal | Complex structure
Response surface form | Linear | Non-linear
Classical experimental designs used for physical experiments are not suited to simulation experiments; new designs are needed to account for their complex characteristics.

Design of Experiments Allows Us to Understand the Input/Output Behavior of a Simulation
[Figure: Real World (Observation Post, 1st IED, 2nd IED) → Simulation Model of Real World, with Input Factors (X) and Output Responses (Y) → Meta-model of Simulation Model ("model of a model", a surrogate of the simulation) → Insights]
[Figure: second order meta-model with its terms labeled Linear Effects, Non-linear Quadratic Effects, Interaction Effects (Synergies) and Random Noise]
• Simulation models tend to have several inputs and several outputs, or "responses," with complicated behavior.
• The second order model is the most common polynomial meta-model used to describe response surfaces.
• Simulation analysts want designs that can handle multiple high-order response surfaces and explore the entire landscape.

Significant Factor Effects (Linear)
[Figure: Output (Y) against Factor Effect (X), from Low to High]
• "Which factors matter?"
• How do increases in factor X change Y when all other factors are held constant?
• Measured by the estimated slope or rate of change.
• Identifying the factors that have no effect is just as important as finding the ones that do.
Linear effects indicate how much the resources in factor X must increase in order to increase or decrease output Y.

Quadratic Factor Effects (Non-linear)
[Figure: Output (Y) against Factor Effect (X), from Low to High, showing Diminishing Returns]
• "Where is the knee in the curve?"
• Identifies increasing or decreasing rates of return.
If a non-linear quadratic term becomes significant, we may find that we can increase the output Y with far fewer resources in factor X than if we had assumed the effect was linear.

Two-Way Interactions / Synergistic Effects (Non-linear)
[Figure: Output (Y) against Factor Effect (X1), with one line for X2 High and one for X2 Low]
When X2 is High, increasing X1 will have an impact on the output Y.
When X2 is Low, increasing X1 will have NO impact on the output Y.
Varying one factor at a time will not identify interactions; we must use experimental designs.
• A factor's effect depends on the level of another factor.
• Heat and pressure together will cause C-4 to explode.
• Lighting C-4 on fire or pounding it with a hammer alone will not.

1st Order Orthogonal Latin Hypercube
[Figure: Least Squares Fit; NOLH Correlation Matrix]
True Model: Y = 8X1 - 15X2 - 30X1^2 + 20X2^2 + 30X1X2
High order correlations impact the high order B estimates.

2nd Order Orthogonal Latin Hypercube
[Figure: Least Squares Fit; 2nd Order NOLH Correlation Matrix]
True Model: Y = 8X1 - 15X2 - 30X1^2 + 20X2^2 + 30X1X2
Orthogonality (0 correlation) between columns ensures an accurate B estimate regardless of the model fit.

Threshold Effects / Step Functions / Change Points
[Figure: response Y with a region where the response behavior is significantly different from the rest of the response surface]
• "Is there a threshold that separates data into vastly different areas?"
• The rate of descent of a cargo parachute will increase at a certain rate as the weight increases . . .
• Until it collapses.
We must explore throughout the interior of the experimental region to find thresholds.

Benefits of Space-Filling
[Figure: response Y sampled by a 2nd Order OLH Design (Space-Filling) and by a D-Optimal Design (Traditional)]
• Space-filling designs provide multiple "cameras" across the entire landscape. In the past, these designs often had high correlations among the 2nd order terms.
• Traditional designs for 2nd order response surfaces sample only at the corners, edges and center.
• Both of the designs shown have 21 design points and are orthogonal across all second order terms.
• If the true response surface contains a threshold, traditional second order designs may not identify its presence.
The 2nd Order OLH designs have minimal correlations AND good space-filling properties.

Genetic Algorithm Objective Function (Fitness)
• Currently, state-of-the-art space-filling designs minimize correlations only for the 1st order model (linear terms only).
• To understand the system complexities, we must include the second order terms in the regression matrix, Z.
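
A minimal sketch, in Python, of how a design's columns might be expanded into the second order regression matrix Z and scored by the largest absolute correlation between any two of its columns. The function names, the two example designs and the choice to center each factor before forming quadratic and interaction columns are assumptions of this sketch, not details taken from the slides.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Sample Pearson correlation between two equal-length columns."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


def second_order_matrix(design):
    """Return the columns of Z: linear, quadratic and two-way interaction terms.

    `design` is a list of rows, one per design point. Each factor is centered
    first (an assumption of this sketch). The intercept column is left out
    because its correlation with any other column is undefined.
    """
    n, k = len(design), len(design[0])
    cols = [[row[j] for row in design] for j in range(k)]
    centered = [[x - sum(c) / n for x in c] for c in cols]
    z = {}
    for j in range(k):
        z[f"X{j + 1}"] = centered[j]
    for j in range(k):
        z[f"X{j + 1}^2"] = [x * x for x in centered[j]]
    for i, j in combinations(range(k), 2):
        z[f"X{i + 1}X{j + 1}"] = [a * b for a, b in zip(centered[i], centered[j])]
    return z


def map_correlation(z):
    """Maximum absolute pairwise correlation over all pairs of columns of Z."""
    return max(abs(pearson(z[a], z[b])) for a, b in combinations(z, 2))


# Illustrative designs, both invented for this sketch.
factorial_3x3 = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)]
small_lh = [(1, 2), (2, 4), (3, 1), (4, 5), (5, 3)]

for name, design in (("3x3 factorial", factorial_3x3), ("5-run LH", small_lh)):
    z = second_order_matrix(design)
    print(name, len(z), "columns, map correlation =", round(map_correlation(z), 3))
```

A search procedure such as a genetic algorithm would call a scoring function like `map_correlation` on each candidate design and keep the candidates with the smallest value.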
[Figure: a design matrix X (linear columns) expanded into the regression matrix Z, whose columns are the linear terms, the quadratics and the two-way interactions]
• The correlation between any two columns i and j in Z, for a design with n design points, is:
$$\rho_{ij} = \frac{\sum_{m=1}^{n} (z_{mi} - \bar{z}_i)(z_{mj} - \bar{z}_j)}{\sqrt{\sum_{m=1}^{n} (z_{mi} - \bar{z}_i)^2 \, \sum_{m=1}^{n} (z_{mj} - \bar{z}_j)^2}}$$
• The objective function for the genetic algorithm is the maximum absolute pairwise (map) correlation between the columns of Z:
$$\rho_{map} = \max_{i < j} \left| \rho_{ij} \right|$$

Color Correlation Plots of Designs with 4 Factors and 25 Design Points
[Figure panels: FCCD, Uniform, 2nd Order NOLH, Sphere Pack, D Optimal, LHS]

2D Projections of Designs with 4 Factors and 25 Design Points
[Figure panels: FCCD, Uniform, 2nd Order NOLH, Sphere Pack, D Optimal, LHS]

Cataloged 2nd Order Nearly Orthogonal Latin Hypercubes
• The 2nd Order NOLH designs have a maximum absolute pairwise correlation < 0.05.
• No other designs in the literature perform as well as the 2nd Order NOLH design in terms of both correlation and space-filling characteristics.
MacCalman, A. D., H. Vieira Jr., and T. W. Lucas. 2012. Flexible Second Order Nearly Orthogonal Latin Hypercubes for Multiple Unknown Response Surface Forms. Working paper, MOVES Institute, Naval Postgraduate School, Monterey, CA.

Space-Filling and Second Order Design Domain Convergence
[Figure: designs from the Space-Filling Design Domain and the Second Order Design Domain, converging over time. Designs shown:]
• Discrete and Categorical NO/B (Vieira 2010)
• NOLH (Cioppa & Lucas 2007)
• Extended OLH (Steinberg & Lin 2006)
• Orthogonal LH: OLH (Ye 1998)
• Extended OLH (Ang 2006)
• Updating LHS (Florian 1992)
• Controlling Correlation in LHS (Owen 1994)
• Maximum Entropy Design (Shewry & Wynn 1987)
• Sphere-Packing Design (Johnson et al. 1990)
• LHS (McKay et al. 1979)
• Inducing Correlation in LHS (Iman & Conover 1982)
• Uniform Design (Fang 1980)
• Hoke Design (Hoke 1974)
• Box-Behnken Design (Box & Behnken 1960)
• Central Composite Design (Box & Wilson 1951)
• Optimal Design Theory (Kiefer & Wolfowitz 1959)
• 3-Level Factorial Designs (Fisher 1920)
• Discrete 2nd Order NO/B (MacCalman 2012)
• Saturated NOLH (Hernandez 2008)
• 2nd Order NOLH (MacCalman 2012)
• Very-Large Fractional Factorial and CCD (Sanchez & Sanchez 2005)
• Hybrid Design (Roquemore 1976)

Conclusions
• Simulation experiments allow us to understand the complex nature of our world when physical experiments are infeasible.
• Analysts can characterize these complexities with accurate meta-models that act as surrogates for the simulation.
• The 2nd Order NOLH Design enables the creation of accurate meta-models by:
– Minimizing all pairwise correlations for a full second order model, thereby nearly guaranteeing that no term is confounded with another.
– Providing excellent space-filling properties to detect thresholds and non-linear behavior.
– Providing flexible designs across a wide variety of high-order meta-models for multiple output responses.
• The designs are available at http://harvest.nps.edu under the software downloads page (2ndOrderNOLHDesigns.xlsm).
Contact: LTC Alex MacCalman [email protected]

Back up

The Latin Hypercube (LH)
• In its basic form, each column in an n-run, k-factor LH is a random permutation of the integers 1,2,…,n.
A 6-run, 2-factor design:
Factor 1 | Factor 2
1 | 4
5 | 1
6 | 2
4 | 5
3 | 3
2 | 6
• The n integers correspond to levels across the range of the factor.
• Design points are typically spread over the factor ranges, which is good for exploratory purposes.
[Figure: f(X) over X1 and X2, and the pairwise projection of Factor 2 against Factor 1, ρ = -0.66]
Random LH designs often result in correlations among factors.
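
The basic construction described on this slide can be sketched in a few lines of Python. The function names and the seed are assumptions made for illustration; the six-run design is the one tabulated on the slide, stored here column by column.

```python
import random
from math import sqrt


def random_latin_hypercube(n_runs, n_factors, rng):
    """Each column is an independent random permutation of 1..n_runs."""
    columns = []
    for _ in range(n_factors):
        levels = list(range(1, n_runs + 1))
        rng.shuffle(levels)
        columns.append(levels)
    return columns  # one list per factor


def pearson(u, v):
    """Sample Pearson correlation between two equal-length columns."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


# The 6-run, 2-factor design from the slide.
factor_1 = [1, 5, 6, 4, 3, 2]
factor_2 = [4, 1, 2, 5, 3, 6]
print(round(pearson(factor_1, factor_2), 2))  # -0.66, the value shown on the slide

# A freshly drawn random LH (the seed is arbitrary).
lh = random_latin_hypercube(n_runs=6, n_factors=2, rng=random.Random(1))
print(lh, round(pearson(lh[0], lh[1]), 2))
```

Nothing in the random construction controls the correlation between columns, which is why a drawn design can land far from orthogonal.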

Impact on Estimates When Varying Angle Between Vectors
[Figure: design matrix column vectors X1 at (1,0) and X2 at (0,1), separated by angle θ]
True Model: Y = 12X1 - 4X2
Charts show the impact on the coefficient estimates when the angle between the column vectors varies between 0 and 90 degrees. Each chart shows the results from 1000 least squares fits with both terms.
A small angle between vectors inflates the variance.

Geometric Depiction of Non-Orthogonal Impact
True Model: Y = 2X1 + X2
Design Matrix:
 | X1 | X2
Experiment 1 | 0 | -1
Experiment 2 | 1 | 1
Angle Between Column Vectors = 45 degrees
[Figure: Y = (-1,3) projected onto the column vectors X1 = (0,1) and X2 = (-1,1). Fitting both terms gives B1 = 2 and B2 = 1; fitting X1 alone gives B1 = 3; fitting X2 alone gives B2 = 2.]
Non-orthogonal column vectors produce different B estimates depending on the model fit.

Geometric Depiction of Orthogonal Impact
True Model: Y = 2X1 + X2
Design Matrix:
 | X1 | X2
Experiment 1 | 1 | -1
Experiment 2 | 1 | 1
Angle Between Column Vectors = 90 degrees
[Figure: Y = (1,3) projected onto the column vectors X1 = (1,1) and X2 = (-1,1). Fitting both terms gives B1 = 2 and B2 = 1; fitting X1 alone gives B1 = 2; fitting X2 alone gives B2 = 1.]
Orthogonal column vectors (where the angle between them is 90 degrees) ensure that the B estimates are correct regardless of the model we fit using least squares. The B estimates do not change no matter what model is fit.

Impact on Variance When Varying Angles Between Vectors
[Figure: design matrix column vectors X1 at (1,0) and X2 at (0,1), separated by angle θ; formula for the variance of the estimates, with the variance of the response labeled]
The variance of the estimates is a function of the design matrix X.
The smaller the angle between vectors, the more the variance of the estimates is inflated.

Impact on Estimates When Varying Angle Between Vectors
[Figure: design matrix column vectors X1 at (1,0) and X2 at (0,1), separated by angle θ]
True Model: Y = 12X1 - 4X2
Charts show the impact on the coefficient estimates when the angle between the column vectors varies between 0 and 90 degrees. Each chart shows the results from 1000 least squares fits with one term only.
Column vectors that are not orthogonal can result in a completely inaccurate interpretation of a factor's effect.
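
The two geometric examples (true model Y = 2X1 + X2, two experiments each) can be checked numerically. The sketch below assumes noise-free responses and least squares fits without an intercept, which is what the two-run, two-term examples imply; the helper names are invented for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fit_one_term(x, y):
    """Least squares estimate of b in y ~ b*x (no intercept)."""
    return dot(x, y) / dot(x, x)


def fit_two_terms(x1, x2, y):
    """Least squares estimates of (b1, b2) in y ~ b1*x1 + b2*x2,
    found by solving the 2x2 normal equations directly."""
    a11, a12, a22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    c1, c2 = dot(x1, y), dot(x2, y)
    det = a11 * a22 - a12 * a12
    return (a22 * c1 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det


def response(x1, x2):
    """Noise-free response from the true model Y = 2*X1 + X2."""
    return [2 * a + b for a, b in zip(x1, x2)]


# Non-orthogonal design: columns (0,1) and (-1,1), 45 degrees apart.
x1, x2 = [0, 1], [-1, 1]
y = response(x1, x2)               # [-1, 3]
print(fit_two_terms(x1, x2, y))    # (2.0, 1.0): full model recovers the truth
print(fit_one_term(x1, y))         # 3.0: X1 alone is biased
print(fit_one_term(x2, y))         # 2.0: X2 alone is biased

# Orthogonal design: columns (1,1) and (-1,1), 90 degrees apart.
x1, x2 = [1, 1], [-1, 1]
y = response(x1, x2)               # [1, 3]
print(fit_two_terms(x1, x2, y))    # (2.0, 1.0)
print(fit_one_term(x1, y))         # 2.0: unchanged when X2 is dropped
print(fit_one_term(x2, y))         # 1.0: unchanged when X1 is dropped
```

With orthogonal columns the cross term in the normal equations is zero, so each estimate depends only on its own column and does not move when the other term is dropped.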

Computational Complexity
[Figure: growth in the number of pairwise correlations as the number of factors k increases]
• The number of terms, p, in a full second order model grows as the number of factors, k, increases.
• p = 1 + 2k + k(k – 1)/2
• The number of pairwise correlations, p choose 2, grows with k at the rate shown in the figure above.
• Finding a 2nd Order NOLH design is significantly harder than finding a 1st Order design.
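
To give a feel for the growth, the short Python sketch below tabulates p and p choose 2 for a few values of k, using the formula on the slide. The function names are assumptions. The count follows the slide in including the intercept in p; a search that scores only correlations among non-constant columns would have slightly fewer pairs.

```python
from math import comb


def second_order_terms(k):
    """p = 1 + 2k + k(k - 1)/2: intercept, k linear, k quadratic,
    and k(k - 1)/2 two-way interaction terms."""
    return 1 + 2 * k + k * (k - 1) // 2


def pairwise_correlations(k):
    """Number of column pairs, p choose 2, as counted on the slide."""
    return comb(second_order_terms(k), 2)


for k in (2, 4, 10, 20):
    print(k, second_order_terms(k), pairwise_correlations(k))
# k = 4 gives p = 15 and 105 pairs; k = 10 gives p = 66 and 2145 pairs.
```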