HW 8 due March 30.
1. Consider the charitable contributions data set, but use only CHARITY and DEPS in the analysis.
First consider the model specified by the R syntax lm(CHARITY ~ DEPS).
A. What is the theoretical model for E(CHARITY | DEPS = x) corresponding to the R syntax?
B. How do you interpret the parameters of the theoretical model?
C. Explain why the theoretical model in A. is wrong as it applies to charitable contributions,
without doing any data analysis. (Note: Unlike the example in the book, where the predictor
was “location,” DEPS is in fact ordinal, so the linear model is not nearly as silly for DEPS as it
was for location. In other words, don’t use the same argument given in Chapter 10. Consider
something you learned earlier, maybe in Chapter 1?)
D. Graph the scatterplot and overlay the fitted model.
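
One way to produce a graph like the one requested in 1.D is sketched below. This is only an illustration under stated assumptions: the data frame sim and all of its values are simulated placeholders, not the charitable contributions data; only the variable names CHARITY and DEPS come from the problem.

```r
# Simulated stand-in data (hypothetical values, NOT the real data set)
set.seed(1)
sim <- data.frame(DEPS = sample(0:6, 200, replace = TRUE))
sim$CHARITY <- 5 + 0.3 * sim$DEPS + rnorm(200)

# Fit the model specified by lm(CHARITY ~ DEPS)
fit1 <- lm(CHARITY ~ DEPS, data = sim)

# Scatterplot with the fitted straight line overlaid
plot(CHARITY ~ DEPS, data = sim)
abline(fit1, col = "red", lwd = 2)
```

With the real data, the same plot() and abline() calls should apply once sim is replaced by the actual data frame.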
2. Consider again the charitable contributions data set. Again use only CHARITY and DEPS in the
analysis. But now consider the model specified by the R syntax lm(CHARITY ~ as.factor(DEPS)).
A. What is the theoretical model for E(CHARITY | DEPS = x) corresponding to the R syntax?
B. How do you interpret the parameters of the theoretical model?
C. Explain why the theoretical model in A. is correct, as it applies to charitable contributions,
without doing any data analysis.
D. Graph the scatterplot and overlay the fitted model.
E. Comment on the difference between the appearance of the graphs in 1.D. and 2.D.
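
A possible approach to the overlay in 2.D is sketched below, again on simulated placeholder data (the data frame sim and its values are hypothetical, not the charitable contributions data). Because the fitted model now has one fitted value per distinct DEPS level, the sketch overlays points rather than a single line.

```r
# Simulated stand-in data (hypothetical values, NOT the real data set)
set.seed(2)
sim <- data.frame(DEPS = sample(0:4, 200, replace = TRUE))
sim$CHARITY <- 5 + sqrt(sim$DEPS) + rnorm(200)

# Fit the model specified by lm(CHARITY ~ as.factor(DEPS))
fit2 <- lm(CHARITY ~ as.factor(DEPS), data = sim)

# Fitted value at each distinct DEPS level
lev <- sort(unique(sim$DEPS))
pred <- predict(fit2, newdata = data.frame(DEPS = lev))

# Scatterplot with the fitted values overlaid as large red points
plot(CHARITY ~ DEPS, data = sim)
points(lev, pred, pch = 19, col = "red", cex = 1.5)
```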
3. Read the data set clinical = read.table("http://westfall.ba.ttu.edu/isqs5349/Rdata/clinical.txt").
The dependent variable is the variable T4 in the data set, which is a doctor’s assessment of a
patient’s asthma symptoms, on a 0 – 4 scale. Consider first the two-way ANOVA model using
GENDER and DRUG (A = “Active”, P = “Placebo”) as predictors, with no interaction.
A. Specify the theoretical model and interpret all its parameters.
B. Estimate the theoretical model and draw the graph as shown in Figure 10.6.2.
C. This model states that the effect of DRUG is the same for men and women. Why might the
drug manufacturer like this to be true?
4. See the previous problem. Now consider the two-way ANOVA model using GENDER and
DRUG (A = “Active”, P = “Placebo”) as predictors, with interaction.
A. Specify the theoretical model and interpret all its parameters.
B. Estimate the theoretical model and draw the graph as shown in Figure 10.6.3.
C. See the graph in B. Does the effect of DRUG appear to be different for Males and Females?
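
The two-way ANOVA fits in problems 3 and 4 can be sketched in R as follows. This is a minimal illustration under assumptions: the data frame sim is simulated, the GENDER codes "F" and "M" are hypothetical (the coding in clinical.txt may differ), and the profile plot is only one plausible rendering, not necessarily the layout of Figures 10.6.2 and 10.6.3.

```r
# Simulated stand-in data (hypothetical values, NOT the clinical data)
set.seed(3)
sim <- expand.grid(GENDER = c("F", "M"), DRUG = c("A", "P"), rep = 1:25)
sim$T4 <- round(pmin(4, pmax(0, 2 + (sim$DRUG == "P") + rnorm(nrow(sim)))))

fit_add <- lm(T4 ~ GENDER + DRUG, data = sim)  # no interaction (problem 3)
fit_int <- lm(T4 ~ GENDER * DRUG, data = sim)  # with interaction (problem 4)

# Fitted means for the four GENDER x DRUG cells under each model
cells <- expand.grid(GENDER = c("F", "M"), DRUG = c("A", "P"))
cells$fit_add <- predict(fit_add, newdata = cells)
cells$fit_int <- predict(fit_int, newdata = cells)
print(cells)

# Profile plots: parallel lines without interaction, unconstrained with it
with(cells, interaction.plot(DRUG, GENDER, fit_add, ylab = "Fitted mean T4"))
with(cells, interaction.plot(DRUG, GENDER, fit_int, ylab = "Fitted mean T4"))
```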
5. Consider the model of problem 4, and now include “Age” as a covariate. Call this model2, call
the model in problem 4 model1, and call the model in problem 3 model0.
A. Explain why this sequence of the models is “nested”; i.e., explain why each in the sequence
is either full or restricted relative to the next.
B. Test all three models using the “anova” function in R and interpret the results.
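
The mechanics of the comparison in 5.B might look like the sketch below. The data are simulated placeholders, and the column name AGE is an assumption (the actual column name in the clinical data may be spelled differently); only the model names model0, model1, and model2 come from the problem.

```r
# Simulated stand-in data (hypothetical values, NOT the clinical data)
set.seed(4)
sim <- expand.grid(GENDER = c("F", "M"), DRUG = c("A", "P"), rep = 1:25)
sim$AGE <- round(runif(nrow(sim), 18, 70))
sim$T4 <- 2 + (sim$DRUG == "P") + 0.01 * sim$AGE + rnorm(nrow(sim))

model0 <- lm(T4 ~ GENDER + DRUG, data = sim)        # problem 3
model1 <- lm(T4 ~ GENDER * DRUG, data = sim)        # problem 4
model2 <- lm(T4 ~ GENDER * DRUG + AGE, data = sim)  # problem 4 plus covariate

# Sequential F tests: model0 vs. model1, then model1 vs. model2
tab <- anova(model0, model1, model2)
print(tab)
```

Each row after the first compares a model with the one listed just before it, which is why the models are passed in order from most restricted to least restricted.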
6. Consider the (slightly farcical) example of predicting drownings from ice cream sales, given in
Chapter 6. See Figure 6.2.1 in particular. What “unmeasured confounders” can explain the
appearance of the graph?
7. Now consider the data set firms = read.csv("http://westfall.ba.ttu.edu/ISQS5347/firms.csv").
This is firm-level data, with a measure of Y = firm performance (perform_y) and a measure of X =
the extent to which the firm has adopted a particular performance-enhancing strategy
(strategy_x). There is also another variable called “code_3dig” which identifies the industry that
the firm is in.
A. Fit the model where Y is predicted as a function of X alone. Is the observed effect of X on Y
explainable by chance alone? Discuss. As part of your answer, explain (i) what “explainable
by chance alone” means, and (ii) what “explained by chance alone” means. Identify, as part
of your answer to (ii), the specific chance-only model in this case.
B. Fit the model where Y is predicted as a function of X and the fixed effects of industry, as
measured by “code_3dig.” Is the observed effect of X on Y explainable by chance alone in
this model? Discuss. As part of your answer, explain (i) what “explainable by chance alone”
means, and (ii) what “explained by chance alone” means. Identify, as part of your answer to
(ii), the specific chance-only model in this case.
C. Comparing your answers to A. and B., was there any indication of unobserved confounding
due to industry-specific unmeasured confounders?
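
The two fits in 7.A and 7.B can be sketched as below. Everything about the data here is an assumption made for illustration: firms_sim is simulated so that an industry-level effect drives both X and Y with no direct effect of X on Y, which need not resemble the real firms data. Only the column names perform_y, strategy_x, and code_3dig come from the problem.

```r
# Simulated stand-in data (hypothetical values, NOT the firms data)
set.seed(5)
n_ind <- 20
firms_sim <- data.frame(code_3dig = rep(100 + 1:n_ind, each = 15))
ind_effect <- rnorm(n_ind, sd = 2)[as.integer(factor(firms_sim$code_3dig))]
firms_sim$strategy_x <- ind_effect + rnorm(nrow(firms_sim))
firms_sim$perform_y <- 2 * ind_effect + rnorm(nrow(firms_sim))  # no direct X effect

# A. Y as a function of X alone
fitA <- lm(perform_y ~ strategy_x, data = firms_sim)

# B. Y as a function of X plus industry fixed effects
fitB <- lm(perform_y ~ strategy_x + as.factor(code_3dig), data = firms_sim)

# Estimate, standard error, t value, and p-value for strategy_x in each model
print(coef(summary(fitA))["strategy_x", ])
print(coef(summary(fitB))["strategy_x", ])
```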