Missing Values Analysis with IBM SPSS

Analyze, Missing Values Analysis

MVA VARIABLES=Gender Ideal Statoph Nucoph SATM Year Eye
  /MAXCAT=25
  /CATEGORICAL=Eye
  /TTEST PROB PERCENT=5
  /MPATTERN
  /EM(TOLERANCE=0.001 CONVERGENCE=0.0001 ITERATIONS=25).

[DataSet] C:\Users\Vati\Documents\StatData\IntroQ\IntroQ.sav

See IntroQ Questionnaire for a description of the questions asked.

Univariate Statistics

                                              Missing          No. of Extremes(a)
Variable      N       Mean   Std. Deviation   Count  Percent     Low    High
Gender       667      1.26        .438           0       .0        0       0
Ideal        662     70.32       3.854           5       .7        8       0
Statoph      658      6.29       2.315           9      1.3       11       0
Nucoph       665     58.04      22.423           2       .3       26       0
SATM         529    505.29      96.080         138     20.7        1       2
Year         667   1997.52       8.621           0       .0        0       0
Eye          666                                 1       .1
a. Number of cases outside the range (Q1 - 1.5*IQR, Q3 + 1.5*IQR).

Here we see that almost 21% of the cases are missing data on the SATM variable.

Summary of Estimated Means

              Gender   Ideal   Statoph   Nucoph     SATM      Year
All Values      1.26   70.32      6.29    58.04   505.29   1997.52
EM              1.26   70.32      6.29    58.02   504.60   1997.51

Please read David Howell’s document on the Expectation-Maximization algorithm. The table above and the one below show the results of SPSS’ EM procedure. The algorithm leads to an estimated mean of 504.60 and standard deviation of 95.77 for SATM, not much different from the observed mean (505.29) and standard deviation (96.08) for those cases on which we do have data.

Summary of Estimated Standard Deviations

              Gender   Ideal   Statoph   Nucoph     SATM    Year
All Values      .438   3.854     2.315   22.423   96.080   8.621
EM              .437   3.851     2.308   22.431   95.770   8.620

Separate Variance t Tests(a)

                        Gender    Ideal   Statoph   Nucoph     SATM      Year
SATM   t                   1.5       .3      -2.7      -.1        .      -1.8
       df                228.9    209.3     221.6    222.7        .     234.1
       P(2-tail)          .132     .787      .007     .952        .      .066
       # Present           529      527       525      527      529       529
       # Missing           138      135       133      138        0       138
       Mean(Present)      1.27    70.34      6.17    58.01   505.29   1997.23
       Mean(Missing)      1.21    70.24      6.75    58.14        .   1998.65
For each quantitative variable, pairs of groups are formed by indicator variables (present, missing).
a. Indicator variables with less than 5% missing are not displayed.
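For readers who want to see how a separate-variance t test is computed, here is a minimal Python sketch of the Welch statistic and the Welch-Satterthwaite degrees of freedom, computed from group summary statistics. The function name welch_t is made up for this illustration. The Statoph means and group sizes come from the table above, but the SPSS output does not report the within-group standard deviations, so the value 2.3 used for both groups is an assumption.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Separate-variance (Welch) t and Welch-Satterthwaite df from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first group's mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second group's mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Statoph: SATM present (mean 6.17, n = 525) versus SATM missing (mean 6.75, n = 133).
# ASSUMPTION: both standard deviations are set to 2.3 for illustration only;
# the SPSS table does not give them.
t, df = welch_t(6.17, 2.3, 525, 6.75, 2.3, 133)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Because the standard deviations are assumed, the result will be close to, but not the same as, the t of -2.7 on 221.6 df that SPSS reports for Statoph.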
These t tests compare the group of cases with data on SATM to the group of cases without data on SATM. Notice that those who did not answer the SATM question scored significantly higher on the Statophobia item (M = 6.75) than did those who did answer the SATM question (M = 6.17), p = .007. There is also a hint (p = .066) that the frequency of failure to answer the SATM question has increased over the years.

I have cut out most of the table below, but left in enough to show you how SPSS groups cases by the pattern of missing values. The most frequent pattern was missing data on SATM but not on any other variable. Cases 626 through 629 and case 631 were missing data on SATM and Statophobia. Case 630, among others, was missing data only on Statophobia, and so on.

Missing Patterns (cases with missing values)

Pattern columns, in order: Gender Year Eye Nucoph Ideal Statoph SATM
(the column positions of the symbols are not recoverable here)

Case   # Missing   % Missing   Missing and Extreme Value Patterns(a)
660        1          14.3     S
661        1          14.3     S
662        1          14.3     S
626        2          28.6     S S
627        2          28.6     S S
628        2          28.6     S S
629        2          28.6     S S
631        2          28.6     S S
630        1          14.3     S
632        1          14.3     S
633        1          14.3     S -
194        2          28.6     S S
665        1          14.3     S
558        1          14.3
 78        1          14.3     S
444        1          14.3     S
221        2          28.6     S S
552        2          28.6     S S
311        2          28.6     S S S -
- indicates an extreme low value, while + indicates an extreme high value. The range used is (Q1 - 1.5*IQR, Q3 + 1.5*IQR).
a. Cases and variables are sorted on missing patterns.

EM Estimated Statistics

Here we have estimated means, covariances, and correlation coefficients. Little’s MCAR test evaluates the null hypothesis that the missing data are Missing Completely At Random. Since it is significant (chi-square = 61.477, df = 32, p = .001), we conclude that the data are NOT missing completely at random. The majority opinion is that EM estimates are not trustworthy when the data are not missing completely at random.

EM Means(a)

Gender   Ideal   Statoph   Nucoph     SATM      Year
  1.26   70.32      6.29    58.02   504.60   1997.51
a. Little's MCAR test: Chi-Square = 61.477, DF = 32, Sig. = .001

EM Covariances(a)

           Gender     Ideal   Statoph    Nucoph       SATM     Year
Gender       .191
Ideal       -.949    14.833
Statoph     -.146      .698     5.326
Nucoph      -.815     8.702     1.749   503.155
SATM        2.480   -18.295   -73.667    63.812   9171.847
Year         .032     -.483    -3.288     1.114    259.774   74.301
a. Little's MCAR test: Chi-Square = 61.477, DF = 32, Sig. = .001

EM Correlations(a)

           Gender   Ideal   Statoph   Nucoph   SATM   Year
Gender          1
Ideal       -.563       1
Statoph     -.145    .079         1
Nucoph      -.083    .101      .034        1
SATM         .059   -.050     -.333     .030      1
Year         .008   -.015     -.165     .006   .315      1
a. Little's MCAR test: Chi-Square = 61.477, DF = 32, Sig. = .001

ECU Users: Curiously, the SPSS (20) installation provided for ECU faculty to use on campus does not contain the missing values and multiple imputation modules, but that provided for use off campus does. Go figure.

Karl L. Wuensch, December, 2012
Return to Wuensch’s SPSS Lessons Page
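As a check on the EM Covariances and EM Correlations tables in the EM Estimated Statistics output, each correlation should equal the covariance divided by the square root of the product of the two variances. The short Python sketch below applies that identity to three entries copied from the covariance table. The helper name correlation and the dictionary layout are choices made for this illustration, not part of the SPSS output.

```python
import math

# EM variances (diagonal of the EM Covariances table)
variances = {"Statoph": 5.326, "SATM": 9171.847, "Year": 74.301}

# Selected off-diagonal EM covariances from the same table
covariances = {
    ("Statoph", "SATM"): -73.667,
    ("SATM", "Year"): 259.774,
    ("Statoph", "Year"): -3.288,
}


def correlation(cov, var_x, var_y):
    """Pearson r from a covariance and the two variances."""
    return cov / math.sqrt(var_x * var_y)


for (x, y), cov in covariances.items():
    r = correlation(cov, variances[x], variances[y])
    print(f"{x} with {y}: r = {r:.3f}")
```

To three decimals these agree with the tabled correlations of -.333, .315, and -.165. Entries involving very small variances (Gender, for example) can differ in the third decimal, because the tabled covariances are themselves rounded.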