Lecture 10 - Qualitative and Quantitative IVs

What's special about mixing Qualitative and Quantitative IVs? Nothing.

What procedure should I use – REGRESSION (or its R equivalent) or GLM (or its R equivalent)? If you're using SPSS, either REGRESSION or GLM can be used to perform all of the analyses described here. Use the procedure that is easiest for you. There are two procedures in Rcmdr – the Linear Regression procedure and the Linear Model procedure. They're roughly analogous to SPSS's REGRESSION and GLM procedures.

Simplest Example – One qualitative and one quantitative factor

Suppose that two groups are given training on how to perform a job. One of the groups has been given the old training method. It is Group 0. The other group has been given a recently purchased new training program. It is Group 1. Suppose also that a test of cognitive ability, an ability that is probably related to job performance, has been given to each person. The dependent variable is the final performance after completion of the training program. That final performance depends on both the type of training and on cognitive ability. In the following, cognitive ability is X and final performance is Y. The data are as follows . . .

CA (X)  Group  Perf (Y)
  54      0      45
  49      0      60
  66      0      68
  45      0      43
  65      0      61
  47      0      29
  50      0      46
  39      0      49
  53      0      59
  29      0      41
  52      0      58
  38      0      42
  32      0      30
  24      0      30
  58      0      54
  49      1      54
  62      1      77
  46      1      67
  35      1      50
  50      1      58
  55      1      47
  55      1      74
  44      1      47
  44      1      62
  56      1      71
  56      1      54
  55      1      67
  67      1      68
  37      1      58
  52      1      54

REGRESSION Menu Sequence

Note that since in this example the qualitative factor has only two levels, we do not have to create group coding variables. (Whew!)

The REGRESSION analysis – same old stuff – a two-predictor analysis.

Regression
[DataSet0] G:\MDBT\P513\P513L07B-QualQuant\Training program and cognitive ability data.sav

Variables Entered/Removed(a)
Model 1. Variables Entered: CA, Group(b). Variables Removed: none. Method: Enter.
a. Dependent Variable: Y
b. All requested variables entered.

Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      .781a  .610      .581               8.210
a. Predictors: (Constant), CA, Group

ANOVA(a)
Model 1      Sum of Squares  df  Mean Square  F       Sig.
Regression   2844.774        2   1422.387     21.102  .000b
Residual     1819.926        27  67.405
Total        4664.700        29
a. Dependent Variable: Y
b. Predictors: (Constant), CA, Group

Coefficients(a)
Model 1      B       Std. Error  Beta  t      Sig.
(Constant)   14.644  7.095             2.064  .049
Group        9.946   3.057       .399  3.253  .003
CA           .707    .145        .598  4.877  .000
a. Dependent Variable: Y

Verbal Interpretation of B Coefficients

BGroup = 9.946: Among persons of equal cognitive ability, a 1-point change in GROUP, i.e., moving from Group 0 to Group 1, is associated with a 9.946-point increase in performance. The difference is significant.

BCogability = .707: Among persons in the same group, a 1-point increase in cognitive ability is associated with a .707-point increase in performance. The relationship is statistically significant.

Predicted Y = 14.644 + .707*X + 9.946*GROUP.

When one of the variables is a grouping variable, analysts often write separate equations for each group.
For people in Group 1, Predicted Y = 14.644 + .707*X + 9.946*1 = 24.590 + .707*X
For people in Group 0, Predicted Y = 14.644 + .707*X + 9.946*0 = 14.644 + .707*X

The same analysis using GLM

Why Group is in the Covariate(s) field

You may recall that I told you to put quantitative variables in the Covariate(s) field in GLM. Group is a qualitative variable. Why did I not put it in the Fixed Factor(s) field? The answer is that since Group has only two values, it can be put in EITHER the Fixed Factor(s) field or the Covariate(s) field. The results will be the same, although they may be formatted slightly differently. I chose the Covariate(s) field because the formatting of the output is slightly easier to understand. I chose my preferred list of options.

The GLM Output

Note that because I put Group in the Covariate(s) field, GLM does not know that there are two groups of participants. It thinks there is just one group.

Descriptive Statistics (Dependent Variable: Y)
Mean   Std. Deviation  N
54.10  12.683          30

Tests of Between-Subjects Effects (Dependent Variable: Y)
Source           Type III SS  df  Mean Square  F       Sig.  Partial Eta Sq.  Noncent. Parameter  Observed Power(b)
Corrected Model  2844.774a    2   1422.387     21.102  .000  .610             42.204              1.000
Intercept        287.107      1   287.107      4.259   .049  .136             4.259               .512
Group            713.443      1   713.443      10.584  .003  .282             10.584              .880
CA               1603.141     1   1603.141     23.784  .000  .468             23.784              .997
Error            1819.926     27  67.405
Total            92469.000    30
Corrected Total  4664.700     29
a. R Squared = .610 (Adjusted R Squared = .581)
b. Computed using alpha = .05

Parameter Estimates (Dependent Variable: Y)
Parameter  B       Std. Error  t      Sig.  95% CI Lower  95% CI Upper  Partial Eta Sq.  Noncent. Parameter  Observed Power(a)
Intercept  14.644  7.095       2.064  .049  .085          29.202        .136             2.064               .512
Group      9.946   3.057       3.253  .003  3.673         16.219        .282             3.253               .880
CA         .707    .145        4.877  .000  .409          1.004         .468             4.877               .997
a. Computed using alpha = .05

The same output, but this time with Group in the Fixed Factor(s) field

Between-Subjects Factors
Group  N
0      15
1      15

Descriptive Statistics (Dependent Variable: Y)
Group  Mean   Std. Deviation  N
0      47.67  12.251          15
1      60.53  9.716           15
Total  54.10  12.683          30

Tests of Between-Subjects Effects (Dependent Variable: Y)
Source           Type III SS  df  Mean Square  F       Sig.  Partial Eta Sq.  Noncent. Parameter  Observed Power(b)
Corrected Model  2844.774a    2   1422.387     21.102  .000  .610             42.204              1.000
Intercept        496.499      1   496.499      7.366   .011  .214             7.366               .744
CA               1603.141     1   1603.141     23.784  .000  .468             23.784              .997
Group            713.443      1   713.443      10.584  .003  .282             10.584              .880
Error            1819.926     27  67.405
Total            92469.000    30
Corrected Total  4664.700     29
a. R Squared = .610 (Adjusted R Squared = .581)
b. Computed using alpha = .05

Parameter Estimates (Dependent Variable: Y)
Parameter  B       Std. Error  t       Sig.  95% CI Lower  95% CI Upper  Partial Eta Sq.  Noncent. Parameter  Observed Power(b)
Intercept  24.590  7.669       3.206   .003  8.854         40.325        .276             3.206               .871
CA         .707    .145        4.877   .000  .409          1.004         .468             4.877               .997
[Group=0]  -9.946  3.057       -3.253  .003  -16.219       -3.673        .282             3.253               .880
[Group=1]  0a      .           .       .     .             .
a. This parameter is set to zero because it is redundant.
b. Computed using alpha = .05

The essential results are the same. The output has been formatted to acknowledge that there are two groups.
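For R users, the same point can be made with a minimal sketch (the data frame and variable names follow the Rcmdr run that comes next; none of this is SPSS output):

  # A two-level grouping variable can be entered as a number or as a factor;
  # the fit is identical, only the labeling of the coefficient differs.
  as_number <- lm(y ~ ca + group,         data = trainingprograms)
  as_factor <- lm(y ~ ca + factor(group), data = trainingprograms)
  coef(as_number)   # group          =  9.946
  coef(as_factor)   # factor(group)1 =  9.946 (Group 0 is R's reference level)

Note that R makes the first level (Group 0) the reference, so its coefficient is +9.946, whereas the SPSS fixed-factor output above set [Group=1] to zero and therefore printed -9.946 for [Group=0]. Same fit, different parameterization.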
The same Analysis in Rcmdr

R > Rcmdr > Import data from SPSS file . . .

Note that if you're going to be doing only regression, you should uncheck the "Convert value labels . . ." box.

Statistics > Fit models > Linear regression

These are the commands that Rcmdr gave to R:

> RegModel.1 <- lm(y ~ ca + group, data=trainingprograms)
> summary(RegModel.1)

Call:
lm(formula = y ~ ca + group, data = trainingprograms)

Residuals:
     Min       1Q   Median       3Q      Max
-18.8551  -4.9045   0.4651   6.7782  10.7317

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  14.6438     7.0954   2.064  0.04876 *
ca            0.7066     0.1449   4.877 4.24e-05 ***
group         9.9460     3.0571   3.253  0.00306 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.21 on 27 degrees of freedom
Multiple R-squared: 0.6099, Adjusted R-squared: 0.581
F-statistic: 21.1 on 2 and 27 DF, p-value: 3.031e-06

The regression parameters are identical to those obtained in SPSS. Any difference would be due to printing format. There is another Rcmdr procedure that will yield the same results.

Statistics > Fit models > Linear model

Double-click each variable's name in order to put it into the field. Don't worry about the stuff we haven't covered yet. We'll get to some of it in the Advanced SPSS course.

> LinearModel.2 <- lm(y ~ group + ca, data=trainingprograms)
> summary(LinearModel.2)

Call:
lm(formula = y ~ group + ca, data = trainingprograms)

Residuals:
     Min       1Q   Median       3Q      Max
-18.8551  -4.9045   0.4651   6.7782  10.7317

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  14.6438     7.0954   2.064  0.04876 *
group         9.9460     3.0571   3.253  0.00306 **
ca            0.7066     0.1449   4.877 4.24e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.21 on 27 degrees of freedom
Multiple R-squared: 0.6099, Adjusted R-squared: 0.581
F-statistic: 21.1 on 2 and 27 DF, p-value: 3.031e-06

Note that the output of this command is IDENTICAL to that of the command above. The Linear Regression procedure performs only linear regression. The Linear Model procedure is analogous to GLM – it can include grouping variables. In this analysis, since Group has only two values, I treated it as a quantity.

Visualizing the Data – A scatterplot which contains markers for group membership

A useful way of visualizing the relationship of a dependent variable to a continuous independent variable and a dichotomous independent variable is a plot of the dependent variable vs. the continuous independent variable, using different symbols for the two groups represented by the dichotomous IV. In this case, we would plot Y vs. CA with different symbols used for Group 0 and Group 1. The following represents such a plot . . .

This plot allows you to see . . .
1) The relationship of Y to X – within each group (each group is circled by a green ellipse) and overall (the red ellipse highlights the overall relationship).
2) The relationship of Y to Group – compare the heights of the two green ellipses.
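Here is a minimal sketch of how such a plot could be drawn in R. The intercepts and slope come from the Coefficients table above; the data frame name follows the Rcmdr runs, and the plotting choices (symbols, labels) are mine, not SPSS's:

  # Plot Y vs. CA with a different symbol for each group, then add the
  # two parallel lines implied by the common-slope regression.
  tp <- trainingprograms
  plot(tp$ca, tp$y, pch = ifelse(tp$group == 0, 1, 16),
       xlab = "Cognitive ability (CA)", ylab = "Performance (Y)")
  abline(a = 14.644, b = 0.707)            # Group 0 (open circles)
  abline(a = 24.590, b = 0.707, lty = 2)   # Group 1 (filled circles); intercept shifted by 9.946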
Relationship of the Visual Aid to the Coefficients Table – The Group variable

The regression formula is Predicted Y = 14.644 + 9.946*Group + 0.707*X.

So, for Group 1: Predicted Y = 14.644 + 9.946*1 + .707*X = 24.590 + 0.707*X
And for Group 0: Predicted Y = 14.644 + 9.946*0 + .707*X = 14.644 + 0.707*X

I've drawn the regression line for Group 1 and that for Group 0 below. Note that the within-group regressions have the same slope. So the only difference between the two within-group regressions is the Y-intercept.

G1: Y-hat = 24.590 + .707X
G0: Y-hat = 14.644 + .707X

Partial regression coefficient for Group (9.946). The difference in intercepts of the two lines represents the partial regression coefficient for Group. That difference is 9.946. Recall from the definition of a partial regression coefficient that the Group B value is the expected change in Y when Group changes by 1 unit (going from 0 to 1 or vice versa) among persons with the same X value. The concept "among persons with the same X value" can be illustrated in the figure with a narrow rectangular sliver over the scatterplot. Within the rectangle are persons satisfying the condition that they're equal on Cogability.

Relationship of the Visual Aid to the Coefficients Table – The Cogability (quantitative) IV

Within each group, Y = Intercept + .707*Cognitive Ability. The common slope of the two lines represents the partial regression coefficient for the quantitative predictor.

G1: Y-hat = 24.590 + .707X
G0: Y-hat = 14.644 + .707X

The B value for the quantitative IV (0.707) is the common slope of the lines. Recall from the definition of a partial regression coefficient that the Cogability B is the expected change in Y when Cogability increases by 1 unit among persons with the same Group value. So each line represents the concept of people equal on Group.

The Y vs. X graph with group-specific lines through each group of points

When you ask SPSS to draw lines for each group, the lines that SPSS draws will have slopes specific to the group. They won't be set equal to the value in the Coefficients table. Later we'll see that the extent to which the group-specific lines are not parallel reflects the extent to which the two variables interact. More on that in the chapter on Moderation.
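As a preview of that chapter, here is a minimal R sketch of how the parallel-lines (no-interaction) assumption could be checked; the data frame name follows the Rcmdr runs above, and this is my illustration rather than anything in the SPSS output:

  # Compare the common-slope (additive) model with one that gives each
  # group its own slope. A significant ca:group term means the lines
  # are not parallel, i.e., the two predictors interact.
  additive    <- lm(y ~ ca + group, data = trainingprograms)
  with_slopes <- lm(y ~ ca * group, data = trainingprograms)
  anova(additive, with_slopes)   # F test of the slope difference
  coef(with_slopes)              # ca:group = difference between the two slopes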
Regression with One 3-Category Qualitative and One Quantitative Independent Variable

(Data are RandomizedBlocks in the P595B HA folder. N is 240, too large to allow the data to be displayed here.)

A company is considering switching from the current training program to one of two alternative programs. The current program has been in effect for many years. Two alternative programs have recently become available. The first is a video/computer based method, involving presenting key information on DVDs and then using a stand-alone computer program to present drill-and-practice on the material. The second is a completely web-based training method, in which the information is presented over the web and interactive drill-and-practice is also presented over the web. Before the company decides to make any kind of switch, it must determine whether there are any significant differences in learning after having gone through the three programs.

Two hundred forty participants are randomly assigned to one of three groups of trainees, with 80 participants in each group. One group receives the current training, the second group receives the video/computer training, and the third receives the web-based training. A measure of cognitive ability (X), known to predict performance after training, is taken prior to beginning training.

So the question we wish to answer is: Are there significant differences between the group means when controlling for differences in cognitive ability?

The data are as follows. X is the cognitive ability measure. Y is the performance after training. TRTMENT is the training program: 1 = standard; 2 = video/computer; 3 = web-based. So, for this problem, we're comparing TRTMENT means controlling for cognitive ability.

Descriptive Statistics
TRTMENT  Statistic   CA (X)  PERF (Y)
1        N           80      80
         Mean        49.68   110.06
         Median      50.50   111.00
         Std. Dev.   10.45   15.51
2        N           80      80
         Mean        49.54   111.95
         Median      50.00   114.00
         Std. Dev.   9.65    13.96
3        N           80      80
         Mean        49.21   117.94
         Median      51.00   120.00
         Std. Dev.   10.91   15.28
Total    N           240     240
         Mean        49.47   113.32
         Median      50.00   114.00
         Std. Dev.   10.31   15.25

Note that the differences between groups in cognitive ability are small. You might wonder why we bother to control for it. It turns out that controlling for a variable that is related to the dependent variable, even though it is unrelated to the other independent variables, increases our power to detect differences associated with those other independent variables. So it pays to include a covariate, as long as that covariate is related to the dependent variable.

Recall, the questions we're asking are the following:

1) Among persons equal in cognitive ability (X), are there mean differences between the groups in performance after training? This is the main question.

2) Among persons within the same group, is there a relationship between performance after training and cognitive ability? This is a question that we assume will be answered positively; otherwise we wouldn't control for cognitive ability. But we will check it anyway.

The data prior to group-coding variables

The data originally consist of just four columns – the ID column, the Y column, the X column, and the TRTMENT column. The rows are in 1 – 2 – 3 order of TRTMENT. This is not often the case but was done here as part of the randomization procedure.

Creation of Group Coding Variables

To analyze the problem using the REGRESSION procedure, we must create group coding variables for the TRTMENT variable. Which method should we use? Since one of the groups is a natural control group, we'll use dummy coding, with TRTMENT=1 as the reference group. So the coding will be

TRTMENT  DCODE1  DCODE2
1        0       0       <==== Reference group
2        1       0
3        0       1

So, after creating the group coding variables and using them to represent the groups, the data editor looks like the following.
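For R users, here is a minimal sketch of the same group-coding step. The data frame name rb and its column names are illustrative, mirroring the SPSS columns just described:

  # Dummy coding with TRTMENT = 1 (the standard program) as the reference group.
  rb$dcode1 <- ifelse(rb$trtment == 2, 1, 0)   # 1 = video/computer group
  rb$dcode2 <- ifelse(rb$trtment == 3, 1, 0)   # 1 = web-based group
  # Members of the reference group get 0 on both coding variables.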
Testing the two hypotheses . . .

1. The significance of differences in mean performance between groups, controlling for X. This test is assessed by computing the increase in R2 due to the addition of the group coding variables to the equation. Since there is a set of IVs representing the groups, we have to use the techniques discussed previously involving sets of independent variables (the GRE tests in the previous example). We do a two-step regression. First, we regress Y onto just X. Then we add the group coding variables to the equation. Specifically: first, we enter X; second, we enter the two group-coding variables representing TRTMENT. We assess the significance of the R2 change in Step 2.

2. The significance of the relationship of Y to X, controlling for group differences. Since X is a single variable, we can assess its significance by simply examining the t value (and its p-value) in the Coefficients box. That t assesses the significance of X controlling for all the other variables, specifically the group coding variables.

Regression

A two-step regression analysis is conducted because one of the tests involves a set of independent variables.
Step 1: Enter the continuous predictor, X.
Step 2: Add the set of group-coding variables, DCODE1 and DCODE2.

[DataSet1] F:\MdbT\P595B\HAs\RandomizedBlocks.sav

Descriptive Statistics
        Mean    Std. Deviation  N
x       49.48   10.307          240
y       113.32  15.247          240
dcode1  .33     .472            240
dcode2  .33     .472            240

Group     DCODE1  DCODE2
Standard  0       0
Video     1       0
Web       0       1

Correlations (Pearson)
        x      y      dcode1  dcode2
x       1.000  .430   .004    -.018
y       .430   1.000  -.064   .215
dcode1  .004   -.064  1.000   -.500
dcode2  -.018  .215   -.500   1.000

Variables Entered/Removed(b)
Model 1: x entered(a). Method: Enter.
Model 2: dcode1, dcode2 entered(a). Method: Enter.
a. All requested variables entered.
b. Dependent Variable: y

Model Summary
                                                            Change Statistics
Model  R      R Square  Adj. R Sq.  Std. Error of Estimate  R Sq. Change  F Change  df1  df2  Sig. F Change
1      .430a  .185      .181        13.798                  .185          53.847    1    238  .000
2      .487b  .237      .227        13.404                  .052          8.092     2    236  .000
a. Predictors: (Constant), x
b. Predictors: (Constant), x, dcode1, dcode2

The Model 1 row gives the significance of the increase in R2 (from 0) when X (cognitive ability) was added to the equation, p < .001. The Model 2 row gives the significance of the change in R2 when the group-coding variables were added to the equation, p < .001.

ANOVA(c)
Model 1      Sum of Squares  df   Mean Square  F       Sig.
Regression   10250.965       1    10250.965    53.847  .000a
Residual     45308.968       238  190.374
Total        55559.933       239
Model 2
Regression   13158.565       3    4386.188     24.413  .000b
Residual     42401.369       236  179.667
Total        55559.933       239
a. Predictors: (Constant), x
b. Predictors: (Constant), x, dcode1, dcode2
c. Dependent Variable: y

Performance is related to the whole collection of IVs in each model. For Model 1, X is the only predictor. For Model 2, X plus the two group coding variables make up the collection of IVs.

The significance of each individual variable can be obtained from the Coefficients table below. We want the significance of X, the continuous predictor, for the 2nd hypothesis. We may also want the significance of the dummy coding variables.

Coefficients(a)
Model 1      B       Std. Error  Beta  t       Sig.  Zero-order  Partial  Part
(Constant)   81.880  4.376             18.712  .000
x            .635    .087        .430  7.338   .000  .430        .430     .430
Model 2
(Constant)   78.182  4.440             17.609  .000
x            .642    .084        .434  7.628   .000  .430        .445     .434
dcode1       1.976   2.119       .061  .932    .352  -.064       .061     .053
dcode2       8.172   2.120       .253  3.855   .000  .215        .243     .219
a. Dependent Variable: y

There was a significant relationship of Y to X among persons in the same group.
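For reference, the F for the R2 change can be verified by hand: F change = (R2 change / df1) / ((1 - R2 of Model 2) / df2) = (.052/2) / ((1 - .237)/236) = 8.04, which matches the printed 8.092 up to rounding of the R2 values. And here is a minimal R sketch of the same two-step analysis, continuing with the illustrative rb data frame from above:

  # Step 1: covariate only. Step 2: covariate plus the set of dummy codes.
  step1 <- lm(y ~ x, data = rb)
  step2 <- lm(y ~ x + dcode1 + dcode2, data = rb)
  anova(step1, step2)   # F test of the R-squared change for the dummy-code set
  summary(step2)        # t tests for x and for each dummy code individually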
Excluded Variables(b) (Model 1)
        Beta In  t       Sig.  Partial Correlation  Collinearity Statistics: Tolerance
dcode1  -.065a   -1.117  .265  -.072                1.000
dcode2  .223a    3.914   .000  .246                 1.000
a. Predictors in the Model: (Constant), x
b. Dependent Variable: y

I've never used this table.

So, our conclusion is that there ARE significant differences between the group post-training means after controlling for cognitive ability. The t-tests in the Coefficients box tell us that only Treatment 3 (dcode2) is significantly different from the control, or standard, method: the mean of Group 3 was significantly larger than the mean of the control group among people equal on X.

The visual aid plot of Y vs. X with different plotting symbols for each group. (Regression lines with slopes specific to each group were added because they're so easy to get. Note, however, that the analysis assumes that the regression lines all have the same slope.)

1) Note that the Treatment 3 line – the green one – is above the Treatment 2 line (the red one), which is (usually) above the Treatment 1 line (the dotted one). This hints at differences in average performance between the groups.

2) Note the generally positive relationship of Y to X, overall and within each group. It looks like the dependent variable is related to cognitive ability, which makes cognitive ability a good covariate.

The analysis of the same data with GLM (much easier)

The data, again.

Specifying the analysis . . .

Putting the name of a variable in the Fixed Factor(s) field tells GLM that the variable needs group coding variables. GLM will automatically create them. Don't put the name of a quantitative variable in the Fixed Factor(s) field. GLM will create many, many group coding variables, then perhaps terminate with an error message.

Plots . . .

Post hocs . . .

Alas, post hocs are not available when you have a continuous covariate. So we can't, for example, use post hocs to discover which pairs of means are significantly different from each other. This is a problem with GLM, or perhaps it's simply a problem with mathematical statisticians who are too lazy to work out post hoc tests for designs with covariates.

Options . . .

Results . . .

Univariate Analysis of Variance
[DataSet1] G:\MdbT\P595B\HAs\RandomizedBlocks.sav

Between-Subjects Factors
trtment  N
1        80
2        80
3        80

Descriptive Statistics (Dependent Variable: y)
trtment  Mean    Std. Deviation  N
1        110.06  15.514          80
2        111.95  13.965          80
3        117.94  15.276          80
Total    113.32  15.247          240

Good news – using GLM allows you to get group means and standard deviations without having to use a different procedure.

Levene's Test of Equality of Error Variances(a) (Dependent Variable: y)
F     df1  df2  Sig.
.064  2    237  .938
Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept + x + trtment

This compares the variances of the dependent variable across groups. We passed the test – the data suggest that the null hypothesis of equal population variances should be retained.
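A sketch of the same GLM-style run in R, for those curious. It uses the car package (which Rcmdr itself relies on) and the illustrative rb data frame from above; treat it as an approximation rather than a line-for-line SPSS replica:

  library(car)
  # factor() does what the Fixed Factor(s) field does: it builds the
  # group-coding variables automatically.
  glmfit <- lm(y ~ x + factor(trtment), data = rb)
  Anova(glmfit)   # Type II tests; these agree with SPSS's Type III here
                  # because the model contains no interaction term
  leveneTest(y ~ factor(trtment), data = rb, center = mean)  # SPSS centers at the mean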
Tests of Between-Subjects Effects (Dependent Variable: y)
Source           Type III SS  df   Mean Square  F        Sig.  Partial Eta Sq.  Noncent. Parameter  Observed Power(b)
Corrected Model  13158.565a   3    4386.188     24.413   .000  .237             73.239              1.000
Intercept        66125.624    1    66125.624    368.046  .000  .609             368.046             1.000
x                10453.806    1    10453.806    58.184   .000  .198             58.184              1.000
trtment          2907.600     2    1453.800     8.092    .000  .064             16.183              .956
Error            42401.369    236  179.667
Total            3137320.000  240
Corrected Total  55559.933    239
a. R Squared = .237 (Adjusted R Squared = .227)
b. Computed using alpha = .05

Corrected Model: Same information as in the REGRESSION ANOVA box.

Intercept: Test of the hypothesis that the population Y-intercept ("Constant" in REGRESSION output) is zero. I've dimmed these rows to remind you that they're technical stuff that has nothing to do with the hypotheses.

X: Test of the hypothesis that, in the population, controlling for trtment, the slope relating the DV to X is zero. Conclusion: Among persons equal on trtment (i.e., in the same group), there is a significant relationship of Y to X (cognitive ability).

Trtment: Test of the hypothesis that the population means of the 3 conditions are equal when controlling for differences in X. Conclusion: Among persons equal on X (cognitive ability), there are significant differences in the means of the three groups.

The plot of group means for each TRTMENT condition.

New Topic - Creating Scale Scores – May be covered at end of Sem.

Questions like "Does Conscientiousness predict Test Performance?" are answered by computing scale scores. A scale score is computed from a collection of conscientiousness items. This scale score represents Conscientiousness. A scale score may be computed from the items of a measure of performance. That scale score would represent Performance. Finally, the correlation coefficient between the two scale scores is computed.

Procedure for computing a scale score

Data: Biderman, Nguyen, & Sebren, 2008.

GET FILE='G:\MdbR\1Sebren\SebrenDataFiles\SebrenCombined070726NOMISS2EQ1.sav'.

The data typically are entered in the order in which they appear on the questionnaire data sheets. In this case, 50 columns contain the responses to the 50-item IPIP Big Five exactly as they appear on the questionnaire. The next 20 or so columns contain the reverse-coded responses to the negatively-worded items. I created them using the SPSS RECODE command.

1. Reverse score the negatively-worded items.

q2 q4 q6 q8 q10 q12 q14 q16 q18 q20 q22 q24 q26 q28 q29 q30 q32 q34 q36 q38 q39 q44 q46 q49

Here's syntax to perform the recode. SAVE THE FILE IMMEDIATELY BEFORE WHAT FOLLOWS.

recode q2 q4 q6 q8 q10 q12 q14 q16 q18 q20 q22 q24 q26 q28 q29 q30 q32 q34 q36 q38 q39 q44 q46 q49
  (1=7)(2=6)(3=5)(4=4)(5=3)(6=2)(7=1)
  into q2r q4r q6r q8r q10r q12r q14r q16r q18r q20r q22r q24r q26r q28r q29r q30r q32r q34r q36r q38r q39r q44r q46r q49r.

SAVE THE FILE UNDER A DIFFERENT NAME IMMEDIATELY AFTER THIS.

You don't need to do this using syntax. It can be done using the pull-down menus or by hand. But it must be done.

2. Define Missing Values

Tell SPSS if specific values are to be treated as missing. This is very important. A fairly recent thesis student lost several days because the student created scale scores without declaring missing values. MISSING VALUES MUST BE DECLARED FOR ALL ITEMS.
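For R users, a minimal sketch of the reverse-scoring step. The data frame name bigfive is illustrative; the only fact it relies on is that for a 7-point scale, the reverse of a response q is 8 - q:

  rev_items <- c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 29,
                 30, 32, 34, 36, 38, 39, 44, 46, 49)
  for (i in rev_items) {
    # qNr = 8 - qN reverses a 1..7 response; NA stays NA, so missing
    # values are preserved automatically.
    bigfive[[paste0("q", i, "r")]] <- 8 - bigfive[[paste0("q", i)]]
  }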
3. Determine which items belong to which scale.

The IPIP items are distributed as follows: E A C S O E A C S O E A C S O . . .

That is, the 1st, 6th, 11th, 16th, 21st, 26th, 31st, 36th, 41st, and 46th items are E items. The 2nd, 7th, 12th, 17th, etc. are A items. And so forth.

4. Compute scale scores.

4a. In syntax

To compute the E scale score with manual arithmetic,

E = (q1 + q6r + q11 + q16r + q21 + q26r + q31 + q36r + q41 + q46r) / 10.

In syntax, that would be

compute e = (q1+q6r+q11+q16r+q21+q26r+q31+q36r+q41+q46r)/10.

If it's computed this way, the result for any case with a missing value will be treated as missing. It can also be computed as

compute e = mean(q1,q6r,q11,q16r,q21,q26r,q31,q36r,q41,q46r).

If it's computed this way and a response is missing, the mean will be taken across the remaining nonmissing items.

So, after all negatively worded items have been recoded, the syntax to compute all of the Big Five scale scores would be

compute e = mean(q1,q6r,q11,q16r,q21,q26r,q31,q36r,q41,q46r).
compute a = mean(q2r,q7,q12r,q17,q22r,q27,q32r,q37,q42,q47).
compute c = mean(q3,q8r,q13,q18r,q23,q28r,q33,q38r,q43,q48).
compute s = mean(q4r,q9,q14r,q19,q24r,q29r,q34r,q39r,q44r,q49r).
compute o = mean(q5,q10r,q15,q20r,q25,q30r,q35,q40,q45,q50).

Cut this page and the previous page out and paste them on your wall for when you analyze your thesis data.

4b. Computing a scale score using the TRANSFORM menu . . .

Repeat the above for each scale, substituting the appropriate item names.

5. Run FREQUENCIES on the scale scores.

(The scale-score distributions here were negatively skewed.)

6. Run reliabilities of each scale.

Reliability Statistics
Scale  Cronbach's Alpha  Alpha Based on Standardized Items  N of Items
e      .792              .794                               10
a      .833              .832                               10
c      .799              .799                               10
s      .825              .836                               10
o      .848              .849                               10

7. Compute correlations between scale scores.

Correlations among the summated scale scores from the IPIP 50-item Big Five (Pearson r, N = 135):

        hext   hagr   hcon   hsta   hopn
hext    1.000  .254   .155   .194   .241
hagr    .254   1.000  .421   .224   .426
hcon    .155   .421   1.000  .277   .238
hsta    .194   .224   .277   1.000  .226
hopn    .241   .426   .238   .226   1.000

All of the correlations except hext with hcon (p = .074, two-tailed) are significant at the .05 level.

The mean of the correlations between scale scores is (.254+.155+.194+.241+.421+.224+.426+.277+.238+.226)/10 = .265.

Wait!! Aren't the Big Five dimensions supposed to be independent dimensions of personality? If so, why are they generally positively correlated? This question is leading to a ton of research right now. Key phrases: higher-order factors of the Big Five; general factor of personality.
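Steps 4 through 6 can be sketched in R as well. The names are illustrative, and Cronbach's alpha comes from the psych package, which this handout doesn't otherwise use:

  library(psych)
  # Items for the E scale, after reverse scoring (see the list above).
  e_items <- bigfive[, c("q1","q6r","q11","q16r","q21",
                         "q26r","q31","q36r","q41","q46r")]
  bigfive$e <- rowMeans(e_items, na.rm = TRUE)  # like SPSS mean(): skips missing items
  alpha(e_items)                                # Cronbach's alpha for the E scale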
8. Compute correlations of scale scores with the variables your theory says they should correlate with. INCLUDE SCATTERPLOTS TO CHECK FOR NONLINEARITY.

Correlations
hcon (summated scale score from the IPIP 50-item Big Five) with test: Pearson r = .086, Sig. (2-tailed) = .320, N = 135.

In this case, the correlation is not significant. After two years of thinking about it, we hit upon the idea that perhaps the correlation was suppressed by method bias. That turned out to be a viable hypothesis.
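Finally, a minimal R sketch of step 8, using the same illustrative names (it assumes no missing values remain in the two scale scores):

  # Correlation with its significance test, plus the scatterplot the
  # text insists on for spotting nonlinearity.
  cor.test(bigfive$hcon, bigfive$test)   # the handout reports r = .086, p = .320
  plot(bigfive$hcon, bigfive$test,
       xlab = "Conscientiousness scale score", ylab = "Test performance")
  lines(lowess(bigfive$hcon, bigfive$test))  # smoother to reveal any curvature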