Non-Monotonic Modification to the General Monotone Model
Ashley Lawrence, University of Oklahoma
Rick Thomas, University of Oklahoma
Michael Dougherty, University of Maryland
OKJDM 4.12.2014

Researcher Degrees of Freedom
• Common statistical analyses provide researchers with flexibility
  • Which covariates to include
  • Whether or not to transform the data
  • How to handle outliers
• Simmons et al. (2011) labeled these decisions “researcher degrees of freedom” and showed how they lead to inflated Type-I error rates
  • Used an ANCOVA to show that participants were a year-and-a-half younger after listening to a particular song
  • Selecting covariates among a number of variables
  • Selecting independent variable
  • Using a flexible sample size

Possible Solution
• To reduce use of researcher degrees of freedom, researchers should use methods that have less stringent assumptions
• General monotone model (GeMM) is one method that makes few assumptions about the form of the data

Overview
1. Propose a modification to the general monotone model (GeMM) which allows it to be applied to non-monotonic associations
2. Compare non-monotonic GeMM to other methods of analyzing non-monotonic data using Monte Carlo simulations
   1. Linear and non-linear monotonic environments
   2. Symmetric and asymmetric non-monotonic environments with and without outliers

GeMM
• Semi-metric alternative to multiple regression
• Squared error replaced with rank-order inversion
• Model for GeMM
  • $\hat{Y} = \sum_i \beta_i X_i$
• Parameter weights ($\beta_i$) correspond to relative importance of $X_i$ to the rank-order correspondence between $Y$ and $\hat{Y}$
• Uses genetic algorithm (GA) to find weights that maximize a metric of monotonic association between the $Y$s and $\hat{Y}$s, while accounting for model complexity via the BIC-Tau
• Does assume that the data show a monotonic association (Dougherty & Thomas, 2012)

Examples of Non-Monotonic Associations
• Yerkes & Dodson, 1908
• Cepeda, Vul, Rohrer, Wixted, & Pashler, 2008

Non-Monotonic GeMM
• Adds a reflection parameter for each predictor when needed
• Identify a point of reflection in the criterion
  • Typically the maximum or minimum value
• Predictor values on one side of that point are reflected
[Figure: two panels plotting Criterion against Predictor 1]

Non-Monotonic GeMM
• If the predictor and the criterion are actually monotonically associated, non-monotonic GeMM will identify a maximum or minimum value on the predictor as a point of reflection
[Figure: Criterion plotted against Predictor 2]

Non-Monotonic GeMM
• Points of reflection for each predictor are found using the GA
• BIC-Tau is augmented to take into account the number of reflection parameters used in the model
  • No reflection parameters for predictors with 0 weight
  • No penalty in BIC-Tau if the GA fails to identify a reflection point
• Weights indicate the relative importance of the predictor in predicting the rank order on the criterion

Non-Monotonic GeMM
• Designed to capture rank order
  • Many researchers are interested in answering questions about rank
• Makes few assumptions about the form of the data
  • Should be relatively invariant to extreme scores and non-normality
• Requires the use of few researcher degrees of freedom

Model Comparisons
• Non-monotonic GeMM (nmGeMM) was compared with polynomial regression (PR; k ≤ 2) and piecewise polynomial spline (PPS; n ≤ 1, k ≤ 1)
  • Power, Type-I error rate, and predictive accuracy
• Monotonic environments
  • Linear environment simulated from a continuous multivariate distribution: $Y = .5x_1 + .3x_2 + .2x_3 + 0x_4 + 0x_5 + 0x_6 + e$
  • Monotonic but non-linear environment used the above equation with the criterion changed to $2^{Y}$

Model Comparisons
• Non-monotonic environments
  • Symmetric: $Y = -1(.1x_1 + .9x_1^2 + .2x_2 + .7x_2^2 + .8x_3 + 0x_4 + 0x_5 + 0x_6 + e)$
  • Asymmetric: $Y = .5(x_1 + x_1^2 + x_1^4) + .3(x_2 + x_2^2 + x_2^4) + .2(x_3 + x_3^2 + x_3^4) + 0x_4 + 0x_5 + 0x_6 + e$
  • $e \sim N(0, 1)$
  • Five univariate outliers were also added to the data
• All simulations used N = 100 for estimation and holdout samples

Linear Results
[Figure: power for GeMM, nmGeMM, PR, and Spline]
Predictive Accuracy
  Method   Tau   r
  GeMM     .38   .54
  nmGeMM   .37   .53
  PR       .40   .59
  PPS      .32   .47
• nmGeMM showed reduced power for the weakest predictors when compared to PR
• nmGeMM did not cross-validate as well as PR but did better than PPS

Monotonic but Nonlinear
[Figure: power for GeMM, nmGeMM, PR, and Spline]
Predictive Accuracy
  Method   Tau   r
  GeMM     .40   .53
  nmGeMM   .40   .52
  PR       .39   .51
  PPS      .29   .39
• nmGeMM showed better power and smaller Type-I error rates than the other methods
• nmGeMM had a slight advantage over PR and a large advantage over PPS in predictive accuracy

Symmetric Non-Monotonic
[Figure: two panels, Power for Predictors and Power for Reflection Parameters, over W1, L1 through W6, L6 for nmGeMM, PR, and Spline]
Predictive Accuracy
  Method   Tau   r
  nmGeMM   .54   .76
  PR       .53   .76
  PPS      .58   .80
• nmGeMM had the worst power for non-monotonic predictors but the best power for monotonic predictors
• PPS had the best power for reflection parameters but also had the most Type-I errors
• PPS cross-validated the best, with nmGeMM showing a slight advantage over PR

Asymmetric Non-Monotonic
[Figure: two panels, Power for Predictors and Power for Reflection Parameters, over W1, L1 through W6, L6 for nmGeMM, PR, and Spline]
Predictive Accuracy
  Method   Tau   r
  nmGeMM   .46   .61
  PR       .32   .65
  PPS      .45   .67
• PPS had the best power in general, followed by nmGeMM
• PPS also showed more Type-I errors for the reflection parameters
• nmGeMM performed best on predictive accuracy for tau and PPS performed best for r

Effects of Outliers
• Symmetric data
  • Outliers influenced the performance of the spline
    • Worse power for monotonic predictor
    • Increased Type-I error rates
    • Poorer performance for predictive accuracy
• Asymmetric data
  • No strong effect of outliers in this data environment
  • Pattern of results did not change from the data without outliers
Predictive Accuracy
  Method   Tau          r
  nmGeMM   .50   .54    .71   .76
  PR       .49   .53    .71   .76
  PPS      .48   .58    .71   .80
[Column sub-labels not recoverable; the second value in each pair equals the symmetric result without outliers]

Summary of Results
[Table: nmGeMM, PR, and PPS compared on power, Type-I error, and predictive accuracy in the linear, monotonic, symmetric, and asymmetric environments; cell contents not recoverable]

Conclusions
• If one knows that the data show a non-monotonic, symmetric association, then PR or PPS could be preferred
• If the structure of the data is unknown, then nmGeMM may be preferred
  • PR and PPS do not perform as well as nmGeMM when the data are monotonically associated
  • PR performs poorly in the asymmetric environment
  • nmGeMM rarely showed the worst performance and often showed the best
• If one is interested in estimating rank, nmGeMM may be the best option

Importance of nmGeMM
• Provides researchers with a way to estimate rank when the data are either monotonic or non-monotonic
• Gives researchers a way to analyze data that requires very few assumptions about the form of the data
• Uses few researcher degrees of freedom
  • PR and PPS require researchers to make a number of decisions about the form of the data
• More research is needed to determine the effects of researcher degrees of freedom on these methods

Thank you!
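The reflection step described on the Non-Monotonic GeMM slides can be illustrated with a small sketch. It is not the authors' implementation: the slides say the reflection points are found with a genetic algorithm, whereas this example simply takes the predictor value at which the criterion peaks. The data, the helper names `kendall_tau` and `reflect`, and the use of Kendall's tau-a as the rank-order measure are all assumptions made for illustration.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    s = 0
    for i, j in combinations(range(len(a)), 2):
        prod = (a[i] - a[j]) * (b[i] - b[j])
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    n_pairs = len(a) * (len(a) - 1) // 2
    return s / n_pairs


def reflect(x, point):
    """Fold predictor values below `point` onto the other side of it."""
    return [point + abs(v - point) for v in x]


# Hypothetical inverted-U data: the criterion peaks at predictor = 0.
x = list(range(-5, 6))
y = [100 - v * v for v in x]

# Assumption: use the predictor value where the criterion is at its maximum.
point = x[y.index(max(y))]

tau_raw = kendall_tau(x, y)
tau_reflected = kendall_tau(reflect(x, point), y)
print(tau_raw, tau_reflected)
```

For this symmetric example the raw predictor has a tau of zero with the criterion, while the reflected predictor is strongly (negatively) monotonically associated with it. That is the situation in which a single weight on the reflected predictor can recover the rank order.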
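The four data environments listed on the Model Comparisons slides can be sketched as a small generator. Several details are assumptions rather than facts from the slides: the predictors are drawn as independent standard normals (the slides say only "continuous multivariate distribution"), the function name `simulate` and its arguments are hypothetical, and the five univariate outliers are left out because the slides do not say how they were generated.

```python
import random


def simulate(env, n=100, seed=0):
    """Draw n cases of six predictors and a criterion for one environment."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Assumption: independent standard-normal predictors.
        x1, x2, x3, x4, x5, x6 = (rng.gauss(0, 1) for _ in range(6))
        e = rng.gauss(0, 1)  # e ~ N(0, 1), as stated on the slide
        if env == "linear":
            y = .5 * x1 + .3 * x2 + .2 * x3 + e
        elif env == "monotonic":
            # Same equation, with the criterion transformed to 2**Y.
            y = 2 ** (.5 * x1 + .3 * x2 + .2 * x3 + e)
        elif env == "symmetric":
            y = -(.1 * x1 + .9 * x1**2 + .2 * x2 + .7 * x2**2 + .8 * x3 + e)
        elif env == "asymmetric":
            y = (.5 * (x1 + x1**2 + x1**4)
                 + .3 * (x2 + x2**2 + x2**4)
                 + .2 * (x3 + x3**2 + x3**4) + e)
        else:
            raise ValueError(f"unknown environment: {env}")
        # x4, x5, x6 have zero weight, so they never enter the criterion.
        data.append(((x1, x2, x3, x4, x5, x6), y))
    return data


# Separate estimation and holdout samples, each with N = 100.
estimation = simulate("symmetric", seed=1)
holdout = simulate("symmetric", seed=2)
print(len(estimation), len(holdout))
```

Drawing the holdout sample with a different seed keeps it independent of the estimation sample, which is what a cross-validated measure of predictive accuracy needs.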