Homework #7 – Simulation Study Problems
ST750
02 November 2010
1 Least Three-halves Regression

The location problem in Exercise 8.13 describes the estimator µ̃ that minimizes Σ_i |Xi − µ|^{3/2} as intermediate between L2 (least squares) and L1 (least absolute value). Here we examine its application in a simple linear regression situation, minimizing

Σ_i |Yi − β0 − β1 xi|^{3/2}.
The asymptotics for this estimator should follow

√n (β̃ − β) ≈ Normal(0, (a/b²)(XᵀX)⁻¹),

where a = ∫ |e| f(e) de is estimated by n⁻¹ Σ_i |ẽi|, b = ∫ |e|^{−1/2} f(e) de is estimated by n⁻¹ Σ_i |ẽi|^{−1/2}, and ẽi = Yi − β̃ᵀxi. Compare this estimator to the usual least squares estimator and see where the asymptotics apply.
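A minimal R sketch of one replication (the design, sample size, true parameters, and seed are illustrative choices, not part of the problem):

    ## Fit the L_{3/2} estimator by direct minimization and form the
    ## plug-in variance estimate from the residuals.
    set.seed(750)
    n <- 50
    x <- runif(n)
    X <- cbind(1, x)
    y <- drop(X %*% c(2, 1)) + rnorm(n)       # true beta = (2, 1)

    obj <- function(beta) sum(abs(y - drop(X %*% beta))^1.5)
    fit <- nlm(obj, p = coef(lm(y ~ x)))      # start from least squares
    beta.tilde <- fit$estimate

    e <- y - drop(X %*% beta.tilde)
    a <- mean(abs(e))                         # estimates a
    b <- mean(abs(e)^(-0.5))                  # estimates b
    V <- (a / b^2) * solve(crossprod(X))      # plug-in covariance for beta-tilde

Repeating this over many samples and comparing the Monte Carlo variance of β̃ with V indicates where the asymptotics take hold; the proportionality constant in the covariance is worth checking empirically.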
2 Logit/Probit Misspecification
Both the logit and probit models can be viewed as regression models with a latent variable Y* and cutoff point c so that Pr(Y = 1) = Pr(Y* > c) = Pr(γᵀx + e > c). If the error random variable e has a logistic distribution, then this leads to the usual logistic regression model

Pr(Y = 1) = e^{βᵀx} / (1 + e^{βᵀx})
with an obvious relationship between β and γ. For the probit model, the error distribution is the usual N(0, σ²). In both cases, the standard approach for parameter estimation is maximum likelihood. What may be interesting is misspecification, that is, using a probit model when the truth is logistic, and vice versa. For example, examine the bias of β̂ estimated from the probit model when the data arose from the logistic.
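A sketch of one such comparison in R (n, γ, the cutoff c = 0, and the replicate count are illustrative choices):

    ## Generate latent-variable data with logistic errors, fit both links,
    ## and look at the slope estimates; note that probit and logit
    ## coefficients are on different scales, which is itself part of the story.
    set.seed(750)
    n <- 200; gam <- c(-1, 2); nrep <- 500
    x <- runif(n)
    est <- replicate(nrep, {
      ystar <- gam[1] + gam[2] * x + rlogis(n)   # truth is logistic
      y <- as.numeric(ystar > 0)                 # cutoff c = 0
      c(coef(glm(y ~ x, family = binomial(link = "probit")))[2],
        coef(glm(y ~ x, family = binomial(link = "logit")))[2])
    })
    rowMeans(est) - gam[2]    # apparent bias of each slope estimate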
3 Poisson Regression via Iterative GLS

The Poisson regression model from Section 9.x has yi distributed as independent Poisson(λi) random variables, where λi = exp{βᵀxi} for i = 1, . . . , n. The usual (Newton, scoring) method for
computing the MLE is Iteratively Reweighted Least Squares (IRWLS). Notice that this is not GLS, as GLS would be minimizing

S(β) = Σ_{i=1}^{n} (yi − λi)² / Var(yi),
where Var(yi) = λi. An iterative GLS estimator that minimizes S(β) can be computed using a general optimizer (like nlm) or a nonlinear least squares algorithm (like nls). (Don't treat the choice of algorithm as a factor in the study.) Compare this estimator with (a) the MLE or (b) the nonlinear least squares estimator that takes sqrt(Yi) as the response and sqrt(λi) as the mean function.
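The three fits can be set up in R as follows (the true β and sample size are illustrative choices):

    ## Iterative GLS via a general optimizer, compared with the glm() MLE
    ## and with nls() on the square-root scale.
    set.seed(750)
    n <- 100
    x <- runif(n)
    y <- rpois(n, exp(0.5 + 1.0 * x))           # true beta = (0.5, 1)

    S <- function(b) {                          # the GLS criterion S(beta)
      lam <- exp(b[1] + b[2] * x)
      sum((y - lam)^2 / lam)
    }
    gls <- nlm(S, p = c(0, 0))$estimate
    mle <- coef(glm(y ~ x, family = poisson))   # IRWLS / maximum likelihood
    sqr <- coef(nls(sqrt(y) ~ sqrt(exp(b0 + b1 * x)),
                    start = list(b0 = 0, b1 = 0)))
    rbind(gls = gls, mle = mle, sqrt.nls = sqr)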
4 Regression with Likert Scale Responses
Many surveys have responses that may be viewed as a continuous latent variable restricted to a five- or seven-point scale. For a simple case, suppose that latent variable Zi for the ith respondent followed a regression model with mean β0 + β1 xi and constant variance σ², and the categorization of the reported response Yi arose from

Pr(Yi = k) = Pr(c_{k−1} ≤ Zi < c_k),
where the cutoff points cj are KNOWN. (For example, income is reported by categories with cutoffs
at $25k, $50k, etc.) One obvious approach is to assume normality and fit using maximum likelihood.
Examine the power and level of tests for the slope and compare them to the tests that would be used if the latent variable Zi were observed.
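A sketch of the maximum likelihood fit in R (the cutoffs at −1, 0, 1 giving four categories, and the true parameters, are illustrative choices):

    ## Interval-censored normal likelihood with KNOWN cutoffs.
    set.seed(750)
    n <- 100
    x <- runif(n, -2, 2)
    z <- 0 + 1 * x + rnorm(n)              # latent Z_i: beta = (0, 1), sigma = 1
    cuts <- c(-Inf, -1, 0, 1, Inf)         # known cutoff points
    yk <- cut(z, cuts, labels = FALSE)     # reported category 1, ..., 4

    negll <- function(th) {                # th = (beta0, beta1, log sigma)
      mu <- th[1] + th[2] * x
      s <- exp(th[3])
      -sum(log(pnorm(cuts[yk + 1], mu, s) - pnorm(cuts[yk], mu, s)))
    }
    fit <- optim(c(0, 0, 0), negll, hessian = TRUE)
    se <- sqrt(diag(solve(fit$hessian)))
    fit$par[2] / se[2]                     # Wald statistic for the slope

Rejection rates of this Wald test across replicates give the level (β1 = 0) and power (β1 ≠ 0), to be compared with the usual t-test from lm(z ~ x) fit to the latent variable itself.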
5 Goodness of Fit with Dependence
Goodness of fit tests, e.g. chi-square, Kolmogorov-Smirnov, Anderson-Darling, as well as tests
for normality, are designed to test whether a random sample arose independently from a specified
distribution. In some applications, however, the sample may arise from dependent data. Investigate
the effect of dependence on the level and power of one or more of these tests.
Some simple methods for generating dependent series {Yt, t = 1, . . . , n} are:
1. Yt = at + α a_{t−1}, where at iid N(0, σ²), so that Yt ∼ N(0, σ²(1 + α²)) (aka "moving average")

2. Yt = α Y_{t−1} + at, where at iid N(0, σ²), so that Yt ∼ N(0, σ²/(1 − α²)) (aka "autoregression"); take Y0 = a0/√(1 − α²)
3. Yt = Σ_{j=0}^{k} a_{t−j} + ut, where at iid χ²_α and ut iid χ²_β, so that Yt ∼ χ²_{(k+1)α+β}

4. Yt = Σ_{j=0}^{k} a_{t−j} + ut, where at iid Exponential(mean = 1) and ut iid Gamma(α, 1), so that Yt ∼ Gamma(k + 1 + α, 1)

5. Yt = min{at, . . . , a_{t−k}} + ut, where at iid Exponential(mean = k + 1) and ut iid Gamma(α, 1), so that Yt ∼ Gamma(1 + α, 1)
Changing the parameters (α, β, k) will affect the dependence. One measure of the dependence is the serial correlation, the correlation between Yt and Y_{t−1}.
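For instance, the level of the Kolmogorov-Smirnov test under scheme 2 can be estimated as in the following R sketch (n, the grid of α values, and the replicate count are illustrative choices):

    ## Estimated level of the K-S normality test under AR(1) dependence;
    ## the null parameters are the true stationary mean and sd (sigma = 1).
    set.seed(750)
    n <- 100; nrep <- 1000
    sapply(c(0, 0.3, 0.6, 0.9), function(alpha) {
      mean(replicate(nrep, {
        y0 <- rnorm(1) / sqrt(1 - alpha^2)   # stationary start
        y <- filter(rnorm(n), alpha, method = "recursive", init = y0)
        ks.test(y, "pnorm", 0, 1 / sqrt(1 - alpha^2))$p.value < 0.05
      }))
    })

At α = 0 this should recover the nominal 0.05 level; the question is how quickly it deteriorates as the serial correlation grows.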