Stat 301 – Lab 10

Goals:
In this lab, we will see how to:
- fit models with indicator variables
- create indicator variables
- fit models to factor variables with automatically created indicator variables
- fit models with both qualitative and quantitative variables, possibly including an interaction

Fit models with indicator variables for qualitative variables:
Load the BIDMAINT.txt data set. That data set has one qualitative factor, State, with 3 levels: Kansas, Kentucky, and Texas. It already includes the variables X1 and X2. The coding of X1 and X2 is the one used in the text and emphasized in lecture.
1. Look at the data set and compare the value of the State variable with the value of the X1 variable. X1 is an indicator variable: it has the value 1 for one level of State and 0 otherwise. X1 is an indicator for which state? X2 is also an indicator variable. For which state?
2. To fit a regression, use Analyze / Fit Model with X1 and X2 as the X variables. The parameter estimates are the values reported in the text. Interpretation of the regression slopes for this coding (0/1 values) was explained in lecture and in the text.

To create indicator variables from a qualitative variable:
Load the BIDMAINT.txt data set. Here is how to create X1 and X2 if they weren't already there.
Select the column containing the values of the qualitative variable, then select Cols / Utilities. From the Utilities menu, choose Make Indicator Columns. Three new columns will appear in the data set. Their names are the values of State (so Kansas, Kentucky, and Texas here). Each is a red bar variable (qualitative) even though the values are 0 and 1. Turn each into a quantitative variable by left-clicking on the red bar by the variable name and selecting continuous (instead of nominal).
Note: This is a new feature in JMP 12 Pro (the version you should be using). If you don't see this Utilities menu, check that you are using JMP 12 Pro. If the menu items are grayed out, you forgot to select the State column before selecting Cols / Utilities.

Automatically create indicator variables for a qualitative variable:
JMP will automatically create +1/-1 indicator variables.
Load the BIDMAINT.txt data set. Select Analyze / Fit Model.
1. You will notice a set of red bars by State. That tells you that State is a qualitative variable (which JMP calls a nominal variable).
2. Select Cost as the Y variable and State as the X variable, then run the model.
3. Some parts of the output are identical to the multiple regression output we’re familiar with:
- Summary of Fit: familiar. Root Mean Square Error is a very useful number.
- Analysis of Variance: familiar. Compares the full model (3 means) to the “intercept only” model.
- Parameter Estimates: these are “hidden” (because they correspond to the specific coding used by JMP). To see them, click the sideways open triangle by “Parameter Estimates”. You see three rows: Intercept, STATE[Kansas], and STATE[Kentucky]. The two “slopes” are for indicator variables that JMP creates for you. These are for the -1/+1 coding that I briefly discussed in lecture.
- Residual by Predicted Plot: familiar.
Some parts of the output are new:
- Effect Tests: This is a comparison of the model that includes STATE to a model without it. When there is only one factor, the effect test for STATE is the same as the test in the Analysis of Variance box because both compare the same pair of models.
- Least Squares Means: When a model has a qualitative variable, the real interest is the mean for each level of the factor. These means are calculated from the parameter estimates and reported in this table. For the models we’re considering, the Least Squares Mean and the Mean are the same quantity. The Standard Error is what you expect it to be: for this one-factor model, the Root Mean Square Error divided by the square root of that level's sample size. You can also get a 95% confidence interval for each mean by clicking in the table and selecting the Lower 95% and Upper 95% items.
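The claim that the least squares means do not depend on the coding, even though the parameter estimates do, can be checked outside JMP. The following Python sketch (NumPy only) is an illustration under stated assumptions, not JMP's implementation: the costs are made-up numbers rather than the BIDMAINT values, the helper names (design_01, design_effect, fit) are hypothetical, Texas is chosen as the reference level for the 0/1 coding purely for illustration, and the 1/0/-1 coding gives the last level -1 in every column.

```python
import numpy as np

# Hypothetical costs for three states (made-up numbers, NOT the BIDMAINT data).
state = np.array(["Kansas"] * 3 + ["Kentucky"] * 3 + ["Texas"] * 3)
cost = np.array([10.0, 12.0, 14.0, 20.0, 22.0, 24.0, 30.0, 33.0, 36.0])


def design_01(levels, reference):
    """Intercept plus one 0/1 indicator for every level except the reference."""
    others = [lv for lv in np.unique(levels) if lv != reference]
    cols = [np.ones(len(levels))]
    cols += [(levels == lv).astype(float) for lv in others]
    return np.column_stack(cols)


def design_effect(levels):
    """Intercept plus 1/0/-1 columns; the last level gets -1 in every column."""
    names = list(np.unique(levels))
    last = names[-1]
    cols = [np.ones(len(levels))]
    for lv in names[:-1]:
        col = (levels == lv).astype(float)
        col[levels == last] = -1.0
        cols.append(col)
    return np.column_stack(cols)


def fit(X, y):
    """Ordinary least squares estimates for design matrix X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


X01 = design_01(state, reference="Texas")  # reference level is an assumption
Xef = design_effect(state)
b01 = fit(X01, cost)
bef = fit(Xef, cost)

# Fitted value for each observation = estimated mean for its state.
fitted_01 = X01 @ b01
fitted_ef = Xef @ bef

print("0/1 coding estimates:    ", b01)
print("1/0/-1 coding estimates: ", bef)
print("Same fitted means under both codings:", np.allclose(fitted_01, fitted_ef))
```

With 0/1 coding, the intercept is the mean of the reference level and each slope is a difference from that level. With 1/0/-1 coding, the intercept is the average of the three level means and each slope is a deviation from that average. The fitted means for each state agree either way.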
To see what indicator variables JMP creates for you (optional):
1. After fitting the ANOVA model (using Analyze / Fit Model), click the red triangle by Response Cost, select Save Columns, then Save Coding Table (the very last item on the menu). A new data set window opens with one row for each observation and the indicator variables that JMP creates for you. If you look at them, you see that JMP uses 1 / 0 / -1 values. This coding changes the parameter estimates for each indicator variable (which is why they’re hidden) but doesn’t change the SS associated with each factor or the least squares means for each level.

To fit a model with both qualitative and quantitative variables, and optionally their interaction:
1. Load the bear.csv data set. This has the chest, sex and weight variables for the black bear data set. It also has other variables, which we won't use here. Sex is the qualitative (nominal) variable; Isex is the corresponding indicator variable, with 1 for Males and 0 for Females.
2. Add the desired variables to the model box in Analyze / Fit Model. If you create the indicator variable, you can specify which level is the reference level. If JMP creates the indicator for you (i.e. the model includes the nominal variable), it will use +1 / -1 coding. There is little practical difference when the qualitative variable has two levels. I find results from 0/1 coding a little easier to use.
3. When there are more than two levels for the nominal variable, there is a huge advantage to letting JMP create the variables: it knows that the k-1 indicator variables (for k levels) are related. That means JMP automatically gives you the test that all k-1 regression coefficients = 0. If you create the indicator variables yourself, you have to use a custom test.
4. To include their interaction, the easiest way is to create the product on the fly using Cross. This can be used to specify the interaction as the cross of a nominal and a quantitative variable.
5. To fit the different slopes model to the bear data, add Chest and Sex to the model box, then select both variables again and click Cross. Sex*Chest is added to the Model Effects box. If you run the model and look at the parameter estimates, you see different results from what I gave in lecture, because of the different coding.
6. Open the Effect Tests box to see the effect-specific tests (the test that all interaction coefficients = 0 is the Sex*Chest test).

Self-assessment:
We'll use the bear data in bear.csv.
1. Use Fit Model to test whether males and females differ in average weight. (This is a t-test, but use Fit Model.) What is the p-value for this test?
2. Test whether males and females differ in average weight when compared at the same Chest size. What is the p-value for the comparison of their weights?
3. What is the difference between average male and average female weight, when compared at the same chest size?
4. What is the slope for Chest in the ANCOVA model with Chest and Sex?
5. Is there evidence of an interaction between chest size and sex?

Answers:
1. p = 0.20
2. p = 0.68
3. Males are 4.0 lbs heavier, when compared at the same chest size. The most straightforward way to get this number is to switch to a 0/1 indicator variable and look at the regression coefficient. Isex is 1 when the bear is male; the estimated regression coefficient is 4.0.
4. 12.65 lbs/inch. This is (and should be) the same value for models using Sex and models using Isex to describe the bear's sex.
5. Yes, p = 0.029. This is the p-value for the test that the Sex*Chest (or Isex*Chest) coefficient = 0, which is also the Effect Test p-value for the Sex*Chest interaction. If you had more than two levels for the qualitative variable, you would want to look at the Effect Test to test all k-1 coefficients = 0 simultaneously.
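The relationship between the 0/1 and +1/-1 codings for a two-level factor, mentioned in answers 3 to 5, can be illustrated with a small Python sketch (NumPy only). It is a sketch under stated assumptions: the chest and weight values are made up and are not the values in bear.csv, the helper name fit is hypothetical, and the +1/-1 column assumes +1 for males and -1 for females, which may not match the sign convention JMP uses.

```python
import numpy as np

# Hypothetical bears (made-up numbers, NOT the values in bear.csv).
chest = np.array([30.0, 34.0, 38.0, 42.0, 46.0, 32.0, 36.0, 40.0, 44.0, 48.0])
isex = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0])  # 1 = male
weight = np.array([150.0, 205.0, 248.0, 310.0, 352.0,
                   170.0, 240.0, 295.0, 372.0, 430.0])

# Assumed +1/-1 coding: +1 for males, -1 for females (sign convention is an assumption).
sex_pm = 2.0 * isex - 1.0


def fit(columns, y):
    """Least squares fit of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Equal-slopes (ANCOVA) model under each coding: intercept, Chest, sex.
add_01 = fit([chest, isex], weight)
add_pm = fit([chest, sex_pm], weight)

# Different-slopes model: the interaction column is the product of the two columns.
int_01 = fit([chest, isex, isex * chest], weight)
int_pm = fit([chest, sex_pm, sex_pm * chest], weight)

print("Chest slope, 0/1 coding:   ", add_01[1])
print("Chest slope, +1/-1 coding: ", add_pm[1])
print("Sex coefficient, 0/1:      ", add_01[2])
print("Sex coefficient, +1/-1:    ", add_pm[2])
print("Interaction, 0/1:          ", int_01[3])
print("Interaction, +1/-1:        ", int_pm[3])
```

In the equal-slopes model the Chest slope is identical under both codings, and the 0/1 coefficient for sex (the male minus female difference at a given chest size) is twice the +1/-1 coefficient. In the different-slopes model the 0/1 interaction coefficient is likewise twice the +1/-1 one, so a test that either equals 0 is testing the same hypothesis.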