Assignment 3: Inferential statistics

Please answer the questions in a Word document and submit the completed assignment by November 29, 2016 via email to Jaime Sebastián: [email protected]. This assignment is worth 10% of your final grade.

1. You are studying the use of two different substrates for oil sands remediation. You want to know whether these substrates affect the growth of aspen, so you measured the height of aspens growing in both substrates and performed a t-test on the data collected. You calculated the t-value manually and obtained 1.834, then ran the following code in R:

   qt(0.95, 14) = 1.761
   qt(0.975, 14) = 2.145

   a. What are the null and alternative hypotheses? (1 point)
   b. Is the t-test one- or two-sided? With one or two samples? Why? (1 point)
   c. Is aspen growth in these substrates significantly different? If it is, at what level of confidence? (1 point)

2. Explain the difference between type I and type II errors. Give an example where one of the errors is preferred over the other and explain why. (2 points)

3. In the macroinvertebrate study in the first assignment, we wanted to know whether the introduction of rainbow trout affects macroinvertebrate density in water bodies in Alberta. After the visual exploration, we performed an ANOVA and this was the result. Explain what the numbers in the red circles mean. (2 points)

4. Why should we adjust when making multiple inferences? Name one method for doing so. (1 point)

5. After the first result of the substrate study in question 1, we wanted to further explore the effect of both substrates under different moisture conditions. Looking at the interaction plot below, what conclusions can you draw about the use of these substrates under different moisture conditions? Is there any interaction? (2 points)

6. What statistical analysis would you perform for the following objectives? Why? (2 points)
   a. You want to know if poplar growth is affected by climate.
      You have data for diameter increment and several yearly climate variables for 20 years.
   b. You want to know if three salinity levels (low, medium and high) affect pine seedling mortality.
   c. You found out that the data you used for the t-test in question 1 are not normally distributed. What method would you use now?
   d. You want to check whether your data follow a normal distribution.

7. In forestry it is important to know the timber volume standing on a site in order to manage it properly. However, volume can only be measured accurately after the tree is cut down. For that reason it is important to have accurate models based on variables that are easier to measure while the tree is standing, such as diameter. We used 100 trees to establish a relationship between volume (cm³) and diameter at breast height (DBH, cm), recorded in the file Assignment3.csv.
   a. Try a linear model with squared DBH, Volume = a + b*DBH^2 (use I(DBH^2) in R). Is it a good model? Does it meet the assumptions of linear regression? (Show figures to support your statements where possible.) How could we fix it? (2 points)
   b. Try transforming the volume with a square root and run the model again. Is it a good model? Does it meet the assumptions of linear regression? (Show figures to support your statements.) How could we fix it? (2 points)
   c. Try a non-linear model of your choice. How did you choose it? (2 points)
   d. Which is the best model? Why? (2 points)
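
As a starting point for question 7a, the model call and the usual diagnostic plots might be set up roughly as follows. This is only a sketch under stated assumptions: the column names DBH and Volume are guesses about Assignment3.csv, and the data frame below is simulated purely so that the code runs on its own. It says nothing about what the real data will show.

```r
# Simulated stand-in for Assignment3.csv (hypothetical values, NOT the real data).
# The column names DBH and Volume are assumptions; check them with names(trees).
set.seed(1)
trees <- data.frame(DBH = runif(100, min = 10, max = 50))
trees$Volume <- 50 + 0.8 * trees$DBH^2 + rnorm(100, sd = 40)

# With the real file, replace the two lines above by:
# trees <- read.csv("Assignment3.csv")

# Linear model on squared DBH: Volume = a + b * DBH^2.
# I() makes R treat ^ as arithmetic rather than as a formula operator.
fit <- lm(Volume ~ I(DBH^2), data = trees)
summary(fit)

# Standard diagnostic plots for a fitted lm object, arranged in a 2 x 2 grid:
# residuals vs fitted, normal Q-Q, scale-location, residuals vs leverage.
par(mfrow = c(2, 2))
plot(fit)
```

The same pattern should carry over to question 7b by changing the left-hand side of the formula to sqrt(Volume) and adjusting the right-hand side as you see fit.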