Linear models and their mathematical foundations
Cynthia Huber, Steffen Unkel
Study sheet 4, Winter term 2016/17
17 November 2016

Exercise 1:
Assume we are working with a model $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, $\mathrm{E}(\boldsymbol{\varepsilon}) = \mathbf{0}_n$, $\mathrm{Cov}(\boldsymbol{\varepsilon}) = \sigma^2 \mathbf{I}_n$, but the true covariance matrix of $\boldsymbol{\varepsilon}$ is $\sigma^2 \mathbf{V}$. Find the mean of $s^2$ obtained using the least squares approach and show, specifically, that if $\mathbf{X}$ contains a column of ones and
\[
\mathbf{V} = (1-\rho)\mathbf{I}_n + \rho\,\mathbf{1}_n\mathbf{1}_n^T, \qquad 0 < \rho < 1,
\]
then $\mathrm{E}(s^2) = \sigma^2(1-\rho) < \sigma^2$ (thus $s^2$ underestimates $\sigma^2$). Describe verbally the covariance structure implied by $\sigma^2 \mathbf{V}$.

Exercise 2:
Prove the following theorem. If $\mathbf{y}$ is $N_n(\mathbf{X}\boldsymbol{\beta}, \sigma^2 \mathbf{V})$, where $\mathbf{V}$ is a known positive definite matrix and $\mathbf{X}$ is an $n \times (k+1)$ matrix of full rank $k+1$, then the maximum likelihood estimators for $\boldsymbol{\beta}$ and $\sigma^2$ are
\[
\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{V}^{-1}\mathbf{X})^{-1}\mathbf{X}^T\mathbf{V}^{-1}\mathbf{y}
\qquad \text{and} \qquad
\hat{\sigma}^2 = \frac{1}{n}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})^T\mathbf{V}^{-1}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}).
\]

Exercise 3:
Use the dataset mtcars, which is available in R. Assume the following model:
\[
\mathrm{mpg}_i = \beta_0 + \beta_1 \cdot \mathrm{cyl}_i + \beta_2 \cdot \mathrm{hp}_i + \beta_3 \cdot \mathrm{wt}_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2), \qquad i = 1, \ldots, 32.
\]
a) Define the matrix $\mathbf{C}$ and the vector $\mathbf{t}$ of $H_0: \mathbf{C}\boldsymbol{\beta} = \mathbf{t}$ for the following hypotheses:
   i) None of the covariates influences the response mpg.
   ii) The number of cylinders cyl and the horsepower hp have the same influence.
   iii) The influence of weight wt is $-3$.
b) Which of the mentioned hypotheses are considered in the R summary?
c) Calculate the coefficients under the null hypotheses $H_0$ considered in a).

Exercise 4:
Let $\mathbf{x} = (x_1, \ldots, x_n)^T$ be a random vector with $x_i \overset{\mathrm{iid}}{\sim} N(\mu, \sigma^2)$ for $i = 1, \ldots, n$, where $\mu$ is known. The parameter $\sigma^2$ is to be estimated. Consider the estimator
\[
T = T(\mathbf{x}) = \sum_{i=1}^{n} (x_i - \mu)^2.
\]
Is $T$ sufficient for $\sigma^2$?

Exercise 5:
When fitting the model $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i1}^2 + \varepsilon_i$, $i = 1, \ldots, n$, to our data using the function lm() in R, we obtain the following output:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.74669    0.24922    2.996
x           -0.45842    0.12677
x2           0.02796    0.01255    2.227

Residual standard error: 0.9343 on 73 degrees of freedom
Multiple R-squared: 0.3383, Adjusted R-squared: 0.3202
F-statistic:        on 2 and 73 DF,  p-value: 2.841e-07

a) How many observations did we use? (The design matrix in our model was of full column rank.)
b) Fill in the missing values. You know that the missing p-values were 0.02900, 0.000547 and 0.003733.
c) Test the overall regression hypothesis (significance level 1%).
d) Can we reject (at the 5% significance level) the hypothesis that the regression function goes through the origin?
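
Illustrative R sketches (not part of the exercises):

For Exercise 1, the underestimation of $\sigma^2$ can be checked numerically. The following is a minimal Monte Carlo sketch; the sample size, covariate and parameter values are arbitrary choices, not taken from the sheet. It uses the representation $\varepsilon_i = \sqrt{\rho}\,u + \sqrt{1-\rho}\,e_i$ with independent $u, e_i \sim N(0, \sigma^2)$, which has exactly the covariance matrix $\sigma^2\mathbf{V}$ above.

# Monte Carlo check of E(s^2) = sigma^2 * (1 - rho); all settings are illustrative
set.seed(1)
n      <- 50
rho    <- 0.4
sigma2 <- 2
x      <- runif(n)                      # one covariate plus an intercept column

s2 <- replicate(5000, {
  u   <- rnorm(1, sd = sqrt(sigma2))    # shared component inducing correlation rho
  e   <- rnorm(n, sd = sqrt(sigma2))
  eps <- sqrt(rho) * u + sqrt(1 - rho) * e
  y   <- 1 + 2 * x + eps
  summary(lm(y ~ x))$sigma^2            # the usual least squares estimate s^2
})

mean(s2)                                # close to sigma2 * (1 - rho) = 1.2, not 2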
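
The estimators stated in Exercise 2 can also be evaluated directly in R. The sketch below is only a numerical illustration under an assumed design matrix and an assumed AR(1)-type $\mathbf{V}$; all names and values are my own choices.

# Direct computation of the estimators from Exercise 2 on simulated data
set.seed(2)
n <- 30
X <- cbind(1, rnorm(n))                        # n x (k+1) design matrix, k = 1
V <- 0.5^abs(outer(1:n, 1:n, "-"))             # a known positive definite V
y <- X %*% c(1, -2) + t(chol(V)) %*% rnorm(n)  # y ~ N(X beta, V), sigma^2 = 1

Vinv       <- solve(V)
beta_hat   <- solve(t(X) %*% Vinv %*% X, t(X) %*% Vinv %*% y)
res        <- y - X %*% beta_hat
sigma2_hat <- as.numeric(t(res) %*% Vinv %*% res) / n
beta_hat; sigma2_hat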
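
For Exercise 3 c), one standard route is the restricted least squares estimator under a general linear hypothesis $H_0: \mathbf{C}\boldsymbol{\beta} = \mathbf{t}$, namely $\hat{\boldsymbol{\beta}}_{H_0} = \hat{\boldsymbol{\beta}} - (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{C}^T[\mathbf{C}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{C}^T]^{-1}(\mathbf{C}\hat{\boldsymbol{\beta}} - \mathbf{t})$. A sketch of this machinery in R is given below; the hypothesis used in the example call is purely hypothetical and is not one of the hypotheses asked on the sheet.

# Restricted least squares under H0: C beta = t (sketch)
fit  <- lm(mpg ~ cyl + hp + wt, data = mtcars)
X    <- model.matrix(fit)
bhat <- coef(fit)

restricted_coef <- function(X, bhat, C, t0) {
  XtXinv <- solve(crossprod(X))
  bhat - XtXinv %*% t(C) %*% solve(C %*% XtXinv %*% t(C), C %*% bhat - t0)
}

# Illustrative call only: H0: beta_1 + beta_2 = 0 (not one of the hypotheses above)
C <- matrix(c(0, 1, 1, 0), nrow = 1)
restricted_coef(X, bhat, C, t0 = 0)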
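
For Exercise 5, the missing entries follow from standard relations: a t value is the estimate divided by its standard error, its p-value comes from a t distribution with the residual degrees of freedom, and the overall F statistic can be recovered from $R^2$ via $F = \frac{R^2/k}{(1-R^2)/(n-k-1)}$. A short R sketch of this arithmetic, using only numbers already printed in the output, is:

# Arithmetic behind Exercise 5 b) and c); numbers are copied from the output above
df_res <- 73                                    # residual degrees of freedom
t_x    <- -0.45842 / 0.12677                    # t value = estimate / std. error
p_x    <- 2 * pt(abs(t_x), df = df_res, lower.tail = FALSE)   # two-sided p-value
R2     <- 0.3383
F_stat <- (R2 / 2) / ((1 - R2) / df_res)        # overall F statistic from R^2
qf(0.99, df1 = 2, df2 = df_res)                 # 1% critical value for part c)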