A New Difference-based Weighted Mixed Liu Estimator in Partially Linear Models
Fikri Akdeniz ([email protected])
Çağ University, Mersin-TURKEY, Department of Mathematics and Computer Science
11-06-2014

Outline
1. Shrinkage Estimators
2. Semiparametric Partially Linear Models: Motivation
3. Difference-based estimation
4. Restricted estimation
5. Generalized Difference-based Liu Estimator
6. Efficiency Properties
7. Application
8. Simulation Study

Shrinkage Estimators: Ridge and Liu

The classical linear regression model is defined as

  y = X\beta + \varepsilon,  E(\varepsilon) = 0,  \mathrm{Cov}(\varepsilon) = \sigma^2 I.   (1)

The OLS estimator of the coefficients is \hat{\beta} = (X^\top X)^{-1} X^\top y. For nearly collinear data (X^\top X ill-conditioned) the OLS estimator is very unreliable, with very high variance: small perturbations in the data change everything! Multicollinearity between the predictors motivates shrinkage estimators.

The most popular shrinkage estimator is the ridge regression estimator, defined by (Hoerl and Kennard, 1970)

  \hat{\beta}(k) = (X^\top X + kI)^{-1} X^\top y,

where k > 0 is the tuning (biasing) parameter.

Ridge Regression

Ridge regression is like least squares but shrinks the estimated coefficients towards zero. Given a response vector y \in \mathbb{R}^n and a predictor matrix X \in \mathbb{R}^{n \times p}, the ridge regression coefficients are defined as

  \hat{\beta}(k) = \arg\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^n (y_i - x_i^\top \beta)^2 + k \sum_{j=1}^p \beta_j^2
               = \arg\min_{\beta \in \mathbb{R}^p} \underbrace{\|y - X\beta\|_2^2}_{\text{Loss}} + k \underbrace{\|\beta\|_2^2}_{\text{Penalty}}.

Here k \ge 0 is a tuning parameter which controls the strength of the penalty term. Note that:
- for k = 0 we get the OLS estimate;
- for k = \infty we get \hat{\beta}(k) = 0;
- for k between 0 and \infty, model fit and amount of shrinkage are balanced.

More on Ridge

The canonical form of model (1): let \Lambda and T be the matrices of eigenvalues and eigenvectors of X^\top X. Then

  y = XTT^\top \beta + \varepsilon = Z\gamma + \varepsilon,

where Z = XT, \gamma = T^\top \beta and Z^\top Z = T^\top X^\top X T = \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p), \lambda_i being the i-th eigenvalue of X^\top X. The ridge regression estimator for the canonical model is

  \hat{\beta}(k) = (\Lambda + kI)^{-1} \Lambda \hat{\beta},  i.e.  \hat{\beta}(k)_i = \frac{\lambda_i}{\lambda_i + k} \hat{\beta}_i,  i = 1, \ldots, p.

This illustrates the essential feature of ridge regression: shrinkage.

Bias and Variance of Ridge Regression

Theorem. For any design matrix X and any k > 0, the matrix X^\top X + kI is always invertible; thus there is always a unique solution \hat{\beta}(k).

Theorem. The variance of the ridge regression estimate is Var(\hat{\beta}(k)_i) = \sigma^2 \frac{\lambda_i}{(\lambda_i + k)^2}.

Theorem. The bias of the ridge regression estimate is Bias(\hat{\beta}(k)_i) = -\frac{k}{\lambda_i + k}\beta_i.

The total squared bias \sum_i Bias^2(\hat{\beta}(k)_i) is monotone increasing in k (the amount of shrinkage), while the total variance \sum_i Var(\hat{\beta}(k)_i) is monotone decreasing in k.

Theorem (Existence theorem). There always exists a k > 0 such that the MSE of \hat{\beta}(k) is less than the MSE of \hat{\beta}.

Ridge regression is a linear estimator (\hat{y} = H_{ridge}\, y), with H_{ridge} = X(X^\top X + kI)^{-1} X^\top. One may define its degrees of freedom to be tr(H_{ridge}); furthermore one can show that df_{ridge} = \sum_i \frac{\lambda_i}{\lambda_i + k}, where \lambda_i are the eigenvalues of X^\top X.
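As a quick numerical illustration of the ridge estimator just defined (a minimal sketch with made-up data, not the talk's application):

```python
import numpy as np

def ridge(X, y, k):
    """Ridge estimator (X'X + kI)^{-1} X'y (Hoerl and Kennard, 1970)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Two nearly collinear predictors (illustrative data only).
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)          # almost a copy of x1
X = np.column_stack([x1, x2])
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=n)

b_ols = ridge(X, y, 0.0)                     # k = 0 recovers OLS
b_ridge = ridge(X, y, 1.0)                   # k > 0 shrinks toward zero

# Shrinkage: the ridge solution has a smaller Euclidean norm than OLS,
# and as k grows the estimate is driven to zero.
assert np.linalg.norm(b_ridge) < np.linalg.norm(b_ols)
assert np.linalg.norm(ridge(X, y, 1e8)) < 1e-3
```

The two assertions check exactly the limiting behaviour listed above: shrinkage relative to OLS for moderate k and collapse to zero as k grows.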
Effect of Multicollinearity

Consider the two-parameter linear regression y = \beta_1 x_1 + \beta_2 x_2, and apply the correlation transformation

  x^*_{tj} = \frac{x_{tj} - \bar{x}_j}{s_j \sqrt{n-1}},  s_j^2 = \frac{1}{n-1}\sum_{t=1}^n (x_{tj} - \bar{x}_j)^2,  y^*_t = \frac{y_t - \bar{y}}{s_y \sqrt{n-1}}.

Then

  X^{*\top} X^* = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix},  Var(\hat{\beta}) = \sigma^2 (X^{*\top} X^*)^{-1} = \frac{\sigma^2}{1-r^2}\begin{pmatrix} 1 & -r \\ -r & 1 \end{pmatrix},

so Var(\hat{\beta}_j) grows like 1/(1-r^2):

  r           0    0.3   0.5    0.95    0.99
  1/(1-r^2)   1    1.1   1.33   10.26   50.25

Another Shrinkage Estimator: Liu

Given a response vector y \in \mathbb{R}^n and a predictor matrix X \in \mathbb{R}^{n \times p}, the Liu regression coefficients are defined as

  \hat{\beta}(\eta) = \arg\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^n (y_i - x_i^\top \beta)^2 + \sum_{j=1}^p (\eta \hat{\beta}_j - \beta_j)^2
                  = \arg\min_{\beta \in \mathbb{R}^p} \underbrace{\|y - X\beta\|_2^2}_{\text{Loss}} + \underbrace{\|\eta \hat{\beta} - \beta\|_2^2}_{\text{Penalty}},

where \hat{\beta} is the OLS estimator. Here 0 \le \eta \le 1 is a tuning parameter which controls the strength of the penalty term. Note that:
- for \eta = 1 we get the OLS estimate;
- for \eta = 0 we get the ridge estimate with k = 1;
- for \eta between 0 and 1, model fit and amount of shrinkage are balanced.

The Liu (Linearly Unified) estimator is a linear combination of the ridge and Stein-rule estimators:

  \hat{\beta}(\eta) = (X^\top X + I)^{-1}(X^\top y + \eta \hat{\beta}),  0 < \eta < 1,
                  = \underbrace{(X^\top X + I)^{-1} X^\top y}_{\hat{\beta}(k=1)} + \underbrace{(X^\top X + I)^{-1} \eta \hat{\beta}}_{c\hat{\beta}}.

In canonical form the Liu estimator is defined as (Liu, 1993)

  \hat{\beta}(\eta) = (\Lambda + I)^{-1}(\Lambda + \eta I)\hat{\beta}.

Theorem. The variance of the Liu regression estimate is Var(\hat{\beta}(\eta)_i) = \sigma^2 \frac{(\eta + \lambda_i)^2}{(1 + \lambda_i)^2 \lambda_i}.

Theorem. The bias of the Liu regression estimate is Bias(\hat{\beta}(\eta)_i) = -\frac{1 - \eta}{\lambda_i + 1}\beta_i.

Theorem (Existence theorem). There always exists an \eta, 0 < \eta < 1, such that the MSE of \hat{\beta}(\eta) is less than the MSE of \hat{\beta}.

The Liu estimator is a linear estimator (\hat{y} = H_{liu}\, y), with H_{liu} = X(X^\top X + I)^{-1}\big(I + \eta (X^\top X)^{-1}\big)X^\top. One may define its degrees of freedom to be tr(H_{liu}).
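The Liu estimator and its two limiting cases can be checked numerically (a minimal sketch with made-up data, not the talk's application):

```python
import numpy as np

def liu(X, y, eta):
    """Liu estimator (X'X + I)^{-1}(X'y + eta * beta_OLS) (Liu, 1993)."""
    p = X.shape[1]
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    return np.linalg.solve(X.T @ X + np.eye(p), X.T @ y + eta * b_ols)

# Illustrative data (not from the talk).
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=40)

b_ols = np.linalg.solve(X.T @ X, X.T @ y)
b_ridge1 = np.linalg.solve(X.T @ X + np.eye(3), X.T @ y)

# The limiting cases stated above:
assert np.allclose(liu(X, y, 1.0), b_ols)      # eta = 1 -> OLS
assert np.allclose(liu(X, y, 0.0), b_ridge1)   # eta = 0 -> ridge with k = 1
```

Both assertions are exact algebraic identities, so they hold up to floating-point precision for any data.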
Furthermore, one can show that df_{liu} = \sum_i \frac{\lambda_i + \eta}{\lambda_i + 1}, where \lambda_i are the eigenvalues of X^\top X.

Motivation: Determinants of Electricity Demand

Monthly aggregated (20 cities) electricity consumption (EC) in Germany, 01.1996-08.2010. Unique physical attributes of electricity:
- non-storability
- uncertain and inelastic demand
- steep supply function
Electricity providers are interested in understanding and hedging demand fluctuations!

To determine the relationship between electricity consumption (EC) and temperature we have to account for:
- price: electricity price, gas price, etc.
- seasonal effects: monthly effects, etc.
- economic activity: income, etc.
Pre-assumptions: EC and price: negative; EC and income: positive; EC and temperature: unknown.

[Figure: income vs. electricity consumption; linear fit (green), local polynomial fit (red), 95% confidence bands (blue).]
[Figure: relative price vs. electricity consumption; linear fit (green), local polynomial fit (red), 95% confidence bands (blue).]
[Figure: temperature vs. electricity consumption; linear fit (green), local polynomial fit (red), 95% confidence bands (black).]

Semiparametric Models

The ordinary regression model can be written as

  Y = X^\top \beta + \varepsilon.   (2)

Assumption: the predictors are linearly related to the response. The semiparametric partially linear model is defined as

  Y = X^\top \beta + f(U) + \varepsilon.
  (3)

It was introduced by Engle, Granger, Rice and Weiss (1986), who analyzed the relationship between temperature and electricity usage, and became popular in the statistical literature through the seminal works of Robinson (1988) and Speckman (1988). Advantages: increased flexibility and reduced modeling bias; partially linear models alleviate the curse of dimensionality.

Nonparametric Models versus Semiparametric Models

Nonparametric models relax the restrictive assumptions of parametric models, but are too flexible to permit concise conclusions. Semiparametric models are natural extensions of ordinary linear regression models with multiple predictors and of nonparametric models with several covariates; they retain the explanatory power of parametric models and the flexibility of nonparametric models.

Semiparametric Partially Linear Model

To avoid identifiability problems we assume that a variable contained in X is not also contained in U and, in general, that no component of X can be mapped to any component of U. Stone (1985) states that the three fundamental aspects of statistical models are flexibility, dimensionality and interpretability. Flexibility is the ability of the model to provide accurate fits in a wide variety of situations, inaccuracy here leading to bias in estimation. Dimensionality can be thought of in terms of the variance in estimation, the curse of dimensionality being that the amount of data required to avoid an unacceptably large variance increases rapidly with increasing dimensionality. Interpretability lies in the potential for shedding light on the underlying structure.

Model and Differencing

The semiparametric partially linear model is

  y_i = x_i^\top \beta + f(u_i) + \varepsilon_i,  i = 1, \ldots, n,   (4)

or, in vector/matrix notation,

  y = X\beta + f(U) + \varepsilon,   (5)

where the y_i are observations at u_i with 0 \le u_1 \le u_2 \le \ldots \le u_n \le 1, x_i^\top = (x_{i1}, x_{i2}, \ldots, x_{ip}), y = (y_1, \ldots, y_n)^\top, X = (x_1, \ldots, x_n)^\top, f = (f(u_1), \ldots, f(u_n))^\top, \varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)^\top. We assume E(\varepsilon | x, u) = 0, Var(\varepsilon | x, u) = \sigma^2, and that f has a bounded first derivative (|f'| < L). Differencing removes the nonparametric component (well... almost).

How does the approximation work?

The m-th order differencing equation (Yatchew, 1997) is

  \sum_{j=0}^m d_j y_{i-j} = \sum_{j=0}^m d_j x_{i-j}^\top \beta + \sum_{j=0}^m d_j f(u_{i-j}) + \sum_{j=0}^m d_j \varepsilon_{i-j},

where d_0, d_1, \ldots, d_m are the differencing weights. Suppose the u_i are equally spaced on the unit interval and |f'| \le L. By the mean value theorem, for some u_i^* \in [u_{i-1}, u_i],

  f(u_i) - f(u_{i-1}) = f'(u_i^*)(u_i - u_{i-1}) \le L/n.

For m = 1,

  y_i - y_{i-1} = (x_i - x_{i-1})^\top \beta + f(u_i) - f(u_{i-1}) + \varepsilon_i - \varepsilon_{i-1}
              = (x_i - x_{i-1})^\top \beta + O(1/n) + \varepsilon_i - \varepsilon_{i-1}
              \approx (x_i - x_{i-1})^\top \beta + \varepsilon_i - \varepsilon_{i-1}.

The differencing matrix is

  D = \begin{pmatrix}
    d_0 & d_1 & d_2 & \cdots & d_m & 0 & \cdots & 0 \\
    0 & d_0 & d_1 & d_2 & \cdots & d_m & \cdots & 0 \\
    \vdots & & \ddots & & & & \ddots & \vdots \\
    0 & \cdots & 0 & d_0 & d_1 & d_2 & \cdots & d_m
  \end{pmatrix},

where d_0, d_1, \ldots, d_m are differencing weights that minimise \sum_{k=1}^m \big( \sum_{j=0}^{m-k} d_j d_{k+j} \big)^2 subject to

  \sum_{j=0}^m d_j = 0  and  \sum_{j=0}^m d_j^2 = 1.   (6)

The constraints (6) ensure that the nonparametric effect is removed as n \to \infty and that Var(\tilde{\varepsilon}) = Var(\varepsilon) = \sigma^2, respectively.

Differencing the model:

  Dy = DX\beta + Df + D\varepsilon \approx DX\beta + D\varepsilon,  i.e.  \tilde{y} \approx \tilde{X}\beta + \tilde{\varepsilon},   (7)

where \tilde{y} = Dy, \tilde{X} = DX and \tilde{\varepsilon} = D\varepsilon. The difference-based estimator is

  \hat{\beta}_D = \big((DX)^\top (DX)\big)^{-1} (DX)^\top Dy = (\tilde{X}^\top \tilde{X})^{-1} \tilde{X}^\top \tilde{y}   (8)

(Yatchew, 2003), and

  \hat{\sigma}^2 = \frac{\tilde{y}^\top (I - P^\perp)\tilde{y}}{tr\{D^\top (I - P^\perp) D\}}   (9)

with P^\perp = \tilde{X}(\tilde{X}^\top \tilde{X})^{-1}\tilde{X}^\top, I the identity matrix and tr(\cdot) the trace of a matrix (Eubank et al., 1998).

Difference-based Liu-type estimator

Akdeniz-Duran, Haerdle and Osipenko (2012) proposed the difference-based Liu-type estimator, minimizing (w.r.t. \beta)

  L = (\tilde{y} - \tilde{X}\beta)^\top (\tilde{y} - \tilde{X}\beta) + (\eta \hat{\beta}_D - \beta)^\top (\eta \hat{\beta}_D - \beta),

which gives

  \hat{\beta}_D(\eta) = (\tilde{X}^\top \tilde{X} + I)^{-1}(\tilde{X}^\top \tilde{y} + \eta \hat{\beta}_D),   (10)

where \eta, 0 \le \eta \le 1, is a biasing parameter; for \eta = 1, \hat{\beta}_D(\eta) = \hat{\beta}_D.
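The first-order differencing scheme can be sketched as follows (a minimal illustration with assumed data, not the talk's application):

```python
import numpy as np

def diff_matrix(d, n):
    """(n - m) x n differencing matrix D whose rows carry the weights d."""
    m = len(d) - 1
    D = np.zeros((n - m, n))
    for i in range(n - m):
        D[i, i:i + m + 1] = d
    return D

# First-order (m = 1) weights d = (1, -1)/sqrt(2): they satisfy the
# constraints (6), sum d_j = 0 and sum d_j^2 = 1.
d = np.array([1.0, -1.0]) / np.sqrt(2)
assert abs(d.sum()) < 1e-12 and abs(d @ d - 1.0) < 1e-12

# Illustrative partially linear data.
rng = np.random.default_rng(2)
n = 200
u = np.sort(rng.uniform(size=n))             # ordered design points
X = rng.normal(size=(n, 2))
f = np.sin(2 * np.pi * u)                    # smooth nonparametric part
y = X @ np.array([2.0, -1.0]) + f + 0.2 * rng.normal(size=n)

D = diff_matrix(d, n)
Xt, yt = D @ X, D @ y                        # differencing (almost) removes f
beta_D = np.linalg.solve(Xt.T @ Xt, Xt.T @ yt)
assert np.allclose(beta_D, [2.0, -1.0], atol=0.15)
```

Note how the parametric part is recovered accurately even though f is never estimated: differencing the ordered data turns the smooth component into an O(1/n) remainder, exactly as in the approximation argument above.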
Its bias is

  Bias\{\hat{\beta}_D(\eta)\} = -(1 - \eta)(\tilde{X}^\top \tilde{X} + I)^{-1}\beta   (11)

(Akdeniz-Duran, E., Haerdle, W., Osipenko, M. (2012). Difference based ridge and Liu type estimators in semiparametric regression models. Journal of Multivariate Analysis.)

Generalization of the Model

Consider the semiparametric partially linear model for observations \{y_i, x_i, u_i\}_{i=1}^n:

  y_i = x_i^\top \beta + f(u_i) + \varepsilon_i,  i = 1, \ldots, n.   (12)

Previously we assumed E(\varepsilon \varepsilon^\top) = \sigma^2 I. Real data often reveal heteroscedasticity, and in a time-series context the error term exhibits autocorrelation. Now we assume E(\varepsilon \varepsilon^\top) = \sigma^2 V, with V not necessarily diagonal. Thus E(\tilde{\varepsilon} \tilde{\varepsilon}^\top) = \sigma^2 V_D, where V_D = DVD^\top. The generalized difference-based estimator is

  \hat{\beta}_{GD} = (\tilde{X}^\top V_D^{-1} \tilde{X})^{-1} \tilde{X}^\top V_D^{-1} \tilde{y}.

The properties of the generalized difference-based estimator of \beta depend on the characteristics of the information matrix G = \tilde{X}^\top V_D^{-1} \tilde{X}. An estimate of \sigma^2 is

  \hat{\sigma}^2 = \frac{1}{n}(\tilde{y} - \tilde{X}\hat{\beta}_{GD})^\top V_D^{-1}(\tilde{y} - \tilde{X}\hat{\beta}_{GD}),   (13)

and it is easy to show that

  s^2 = \frac{1}{n - p}(\tilde{y} - \tilde{X}\hat{\beta}_{GD})^\top V_D^{-1}(\tilde{y} - \tilde{X}\hat{\beta}_{GD})   (14)

is an unbiased estimator of \sigma^2. If the (p \times p) matrix G, with p \ll n - m, is ill-conditioned with a large condition number, then \hat{\beta}_{GD} produces large sampling variances. Use restricted estimation or shrinkage estimation!

Generalized Difference-based Restricted Estimator

Available prior information can sometimes be expressed in the form of exact, stochastic or inequality restrictions. Assume an exact linear restriction on the parameters,

  R\beta = r,

with R a (q \times p) matrix and r a (q \times 1) vector. Then the generalized difference-based restricted estimator is

  \hat{\beta}_{GRD} = \hat{\beta}_{GD} + G^{-1} R^\top (R G^{-1} R^\top)^{-1}(r - R\hat{\beta}_{GD}),

where G = \tilde{X}^\top V_D^{-1} \tilde{X} (Akdeniz, F., Akdeniz-Duran, E., Roozbeh, M., Arashi, M. (2013). Efficiency of the generalized difference-based Liu estimators in semiparametric regression models with correlated errors. Journal of Statistical Computation and Simulation.)

Generalized Difference-based Stochastic Restricted Estimator

We do not always have exact prior information such as R\beta = r, due to economic relations, industrial structures, production planning, etc. Thus we assume a stochastic linear constraint

  r = R\beta + \phi,  \phi \sim (0, \sigma^2 W),

where W is assumed known and positive definite. The prior information need not be assigned the same weight (\omega) as the sample information. To incorporate the restrictions in the estimation of the parameters, we minimize (w.r.t. \beta)

  (\tilde{y} - \tilde{X}\beta)^\top V_D^{-1}(\tilde{y} - \tilde{X}\beta) + \omega (r - R\beta)^\top W^{-1}(r - R\beta).

For 0 < \omega < 1 the prior information receives less weight than the sample information; \omega > 1 gives higher weight to the prior information (of little practical interest). This leads to the following solution for \beta:

  \hat{\beta}_{GDWM}(\omega) = (G + \omega R^\top W^{-1} R)^{-1}(\tilde{X}^\top V_D^{-1} \tilde{y} + \omega R^\top W^{-1} r).

Since

  (G + \omega R^\top W^{-1} R)^{-1} = G^{-1} - \omega G^{-1} R^\top (W + \omega R G^{-1} R^\top)^{-1} R G^{-1},

we have the generalized difference-based weighted mixed estimator in the form

  \hat{\beta}_{GDWM}(\omega) = \hat{\beta}_{GD} + \omega G^{-1} R^\top (W + \omega R G^{-1} R^\top)^{-1}(r - R\hat{\beta}_{GD}).   (15)

For \omega = 1 we obtain the generalized difference-based mixed estimator

  \hat{\beta}_{GDM} = (G + R^\top W^{-1} R)^{-1}(\tilde{X}^\top V_D^{-1} \tilde{y} + R^\top W^{-1} r),   (16)

which gives equal weight to sample and prior information.

Generalized Difference-based Liu Estimator

The (generalized) Liu estimator, following Liu (1993) and Akdeniz and Kaciranlar (1995), is defined as

  \hat{\beta}_{GDL}(\eta) = (G + I)^{-1}(\tilde{X}^\top V_D^{-1} \tilde{y} + \eta \hat{\beta}_{GD}),  0 \le \eta \le 1,
                        = (G + I)^{-1}(G + \eta I)\hat{\beta}_{GD} = F_\eta \hat{\beta}_{GD} = F_\eta G^{-1} \tilde{X}^\top V_D^{-1} \tilde{y},   (17)

where \hat{\beta}_{GD} = (\tilde{X}^\top V_D^{-1} \tilde{X})^{-1} \tilde{X}^\top V_D^{-1} \tilde{y} is the generalized difference-based estimator of \beta and F_\eta = (G + I)^{-1}(G + \eta I).
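The closed form and the update form (15) of the weighted mixed estimator are algebraically identical; a quick numerical check with arbitrary made-up matrices (taking V_D = I for simplicity):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, q = 60, 4, 2
Xt = rng.normal(size=(n, p))       # differenced design (placeholder data)
yt = rng.normal(size=n)            # differenced response
R = rng.normal(size=(q, p))        # restriction matrix
r = rng.normal(size=q)
W = np.eye(q)                      # prior covariance (taken as I here)
VD = np.eye(n)                     # V_D = I: homoscedastic special case
omega = 0.5

G = Xt.T @ np.linalg.solve(VD, Xt)
b_GD = np.linalg.solve(G, Xt.T @ np.linalg.solve(VD, yt))

# Closed form: (G + w R'W^{-1}R)^{-1} (X~'V_D^{-1} y~ + w R'W^{-1} r)
Wi = np.linalg.inv(W)
b_closed = np.linalg.solve(G + omega * R.T @ Wi @ R,
                           Xt.T @ np.linalg.solve(VD, yt) + omega * R.T @ Wi @ r)

# Update form (15): b_GD + w G^{-1}R'(W + w R G^{-1}R')^{-1}(r - R b_GD)
Gi = np.linalg.inv(G)
b_update = b_GD + omega * Gi @ R.T @ np.linalg.solve(
    W + omega * R @ Gi @ R.T, r - R @ b_GD)

assert np.allclose(b_closed, b_update)
```

The equivalence is exactly the matrix-inversion identity quoted for (G + \omega R^\top W^{-1} R)^{-1}, so the assertion holds for any data.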
Observing that F_\eta and G^{-1} commute, we have

  \hat{\beta}_{GDL}(\eta) = G^{-1} F_\eta \tilde{X}^\top V_D^{-1} \tilde{y}.   (18)

Generalized Difference-based Weighted Mixed Liu Estimator

We have obtained

  \hat{\beta}_{GDWM}(\omega) = \hat{\beta}_{GD} + \omega G^{-1} R^\top (W + \omega R G^{-1} R^\top)^{-1}(r - R\hat{\beta}_{GD}).

Substituting \hat{\beta}_{GDL}(\eta) for \hat{\beta}_{GD} in \hat{\beta}_{GDWM}(\omega), we obtain a generalized difference-based weighted mixed Liu estimator (GDWMLE):

  \hat{\beta}_{GDWML}(\omega, \eta) = \hat{\beta}(\omega, \eta)
    = \hat{\beta}_{GDL}(\eta) + \omega G^{-1} R^\top (W + \omega R G^{-1} R^\top)^{-1}(r - R\hat{\beta}_{GDL}(\eta)).

Rearranging terms, we finally have

  \hat{\beta}_{GDWML}(\omega, \eta) = (G + \omega R^\top W^{-1} R)^{-1}(F_\eta \tilde{X}^\top V_D^{-1} \tilde{y} + \omega R^\top W^{-1} r)   (19)

(Hubert and Wijekoon (2006) and Yang et al. (2009)). In fact, from the definition of \hat{\beta}(\omega, \eta) we can see that it is a general estimator which includes \hat{\beta}_{GDWM}, \hat{\beta}_{GDL} and the mixed regression estimator \hat{\beta}_{GDM} as special cases. Namely:
- if \eta = 1, then \hat{\beta}(\omega, \eta = 1) = \hat{\beta}_{GDWM}(\omega);
- if \omega = 0, then \hat{\beta}(\omega = 0, \eta) = \hat{\beta}_{GDL}(\eta);
- if \eta = 1 and \omega = 1, then \hat{\beta}(\omega = 1, \eta = 1) = \hat{\beta}_{GDM}.

Efficiency Properties: Lemmas

Lemma 1 (Farebrother, 1976). Let A be a positive definite matrix (A > 0) and let \alpha be some vector. Then A - \alpha\alpha^\top \ge 0 if and only if \alpha^\top A^{-1} \alpha \le 1.

Lemma 2 (Rao, 2008). Let the n \times n matrices satisfy M > 0 and N \ge 0. Then M > N if and only if \lambda_{max}(N M^{-1}) < 1.

Lemma 3 (Trenkler and Toutenburg, 1990). Let \hat{\beta}_j = A_j y, j = 1, 2, be two competing estimators of \beta. Suppose that \Delta = Cov(\hat{\beta}_1) - Cov(\hat{\beta}_2) > 0. Then MSEM(\hat{\beta}_1) - MSEM(\hat{\beta}_2) \ge 0 if and only if b_2^\top (\Delta + b_1 b_1^\top)^{-1} b_2 \le 1, where b_j denotes the bias vector of \hat{\beta}_j.

MSEM Superiority of \hat{\beta}_{GDWM}(\omega) and \hat{\beta}(\omega, \eta) over \hat{\beta}_{GD}

The expectation and dispersion matrix of \hat{\beta}(\omega, \eta) are

  E(\hat{\beta}(\omega, \eta)) = BA\beta  and  Var(\hat{\beta}(\omega, \eta)) = \sigma^2 B \tilde{A} B^\top,   (20)

its bias is

  Bias(\hat{\beta}(\omega, \eta)) = E(\hat{\beta}(\omega, \eta)) - \beta = B(F_\eta - I)G\beta,   (21)

and its mean squared error matrix is

  MSEM(\hat{\beta}(\omega, \eta)) = Var(\hat{\beta}(\omega, \eta)) + Bias(\hat{\beta}(\omega, \eta))\,Bias(\hat{\beta}(\omega, \eta))^\top = \sigma^2 B \tilde{A} B^\top + b_1 b_1^\top,   (22)

where B := (G + \omega R^\top W^{-1} R)^{-1}, A := F_\eta G + \omega R^\top W^{-1} R, \tilde{A} := F_\eta G F_\eta^\top + \omega^2 R^\top W^{-1} R and b_1 = B(F_\eta - I)G\beta.

MSEMs of the Estimators

  MSEM(\hat{\beta}_{GD}) = Var(\hat{\beta}_{GD}) = \sigma^2 G^{-1},   (23)
  MSEM(\hat{\beta}_{GDM}) = Var(\hat{\beta}_{GDM}) = \sigma^2 (G + R^\top W^{-1} R)^{-1},   (24)
  MSEM(\hat{\beta}_{GDL}(\eta)) = \sigma^2 F_\eta G^{-1} F_\eta^\top + b_2 b_2^\top,  with b_2 = Bias(\hat{\beta}_{GDL}(\eta)) = (F_\eta - I)\beta,   (25)
  MSEM(\hat{\beta}_{GDWM}(\omega)) = Var(\hat{\beta}(\omega)) = \sigma^2 B (G + \omega^2 R^\top W^{-1} R) B^\top.   (26)

MSEM Comparison between \hat{\beta}(\omega) and \hat{\beta}(\omega, \eta)

Theorem. The generalized difference-based weighted mixed Liu estimator \hat{\beta}(\omega, \eta) is superior to the generalized difference-based weighted mixed estimator \hat{\beta}(\omega) in the MSEM sense, namely MSEM(\hat{\beta}(\omega)) - MSEM(\hat{\beta}(\omega, \eta)) \ge 0, if and only if \sigma^{-2} b_1^\top (B^*)^{-1} b_1 \le 1.

Proof. MSEM(\hat{\beta}(\omega)) - MSEM(\hat{\beta}(\omega, \eta)) = \sigma^2 B \Delta_1 B^\top - b_1 b_1^\top, where \Delta_1 = G - F_\eta G F_\eta^\top has eigenvalues

  \delta_i = \lambda_i(G) - \frac{(\lambda_i(G) + \eta)^2 \lambda_i(G)}{(\lambda_i(G) + 1)^2} = \frac{(1 - \eta)(2\lambda_i(G) + 1 + \eta)\lambda_i(G)}{(\lambda_i(G) + 1)^2}.

Since 0 < \eta < 1 and \lambda_i(G) > 0, we have \delta_i > 0; note that \delta_i is monotone decreasing in \eta. Observing that B := (G + \omega R^\top W^{-1} R)^{-1} > 0, we get \Delta_1 > 0 and B^* := B \Delta_1 B^\top > 0. By Lemma 1, MSEM(\hat{\beta}(\omega)) - MSEM(\hat{\beta}(\omega, \eta)) \ge 0 if and only if \sigma^{-2} b_1^\top (B^*)^{-1} b_1 \le 1.

MSEM Comparison between \hat{\beta}_{GDL}(\eta) and \hat{\beta}(\omega, \eta)

Theorem. When \lambda_{max}(N M^{-1}) < 1, the generalized difference-based weighted mixed Liu estimator \hat{\beta}(\omega, \eta) is superior to the generalized difference-based Liu estimator \hat{\beta}_{GDL}(\eta) in the MSEM sense, namely MSEM(\hat{\beta}_{GDL}(\eta)) - MSEM(\hat{\beta}(\omega, \eta)) \ge 0, if and only if b_1^\top (\sigma^2 \Delta_2 + b_2 b_2^\top)^{-1} b_1 \le 1.

Proof. MSEM(\hat{\beta}_{GDL}(\eta)) - MSEM(\hat{\beta}(\omega, \eta)) = \sigma^2 \Delta_2 + b_2 b_2^\top - b_1 b_1^\top, where M = F_\eta G^{-1} F_\eta^\top, N = B(F_\eta G F_\eta^\top + \omega^2 R^\top W^{-1} R) B^\top and \Delta_2 = M - N. Clearly M > 0 and N > 0; therefore, when \lambda_{max}(N M^{-1}) < 1, we get \Delta_2 > 0 by applying Lemma 2. By Lemma 3, MSEM(\hat{\beta}_{GDL}(\eta)) - MSEM(\hat{\beta}(\omega, \eta)) \ge 0 if and only if b_1^\top (\sigma^2 \Delta_2 + b_2 b_2^\top)^{-1} b_1 \le 1.

Variance Comparison between \hat{\beta}_{GD} and \hat{\beta}_{GDWM}

Theorem. The generalized difference-based weighted mixed estimator \hat{\beta}_{GDWM} is superior to the generalized difference-based estimator in the sense that Var(\hat{\beta}_{GD}) - Var(\hat{\beta}_{GDWM}) \ge 0 if and only if \big(\frac{2}{\omega} - 1\big) W + R G^{-1} R^\top > 0.

Proof.

  \Delta_3 = Var(\hat{\beta}_{GD}) - Var(\hat{\beta}_{GDWM}) = \sigma^2 \omega^2 B R^\top W^{-1} \Big[ \big(\tfrac{2}{\omega} - 1\big) W + R G^{-1} R^\top \Big] W^{-1} R B^\top.

The difference is positive definite when B is positive definite and \big(\frac{2}{\omega} - 1\big) W + R G^{-1} R^\top > 0, which holds as long as \omega < 2. When q < p, R has full row rank, so R^\top has full column rank, and in this case we can only conclude that \Delta_3 \ge 0.

Application 1

The data set is a household demand for gasoline in Canada (Yatchew, 2003). We allow price and age to appear nonparametrically. The basic specification is

  dist = f(price, age) + \beta_1 income + \beta_2 drivers + \beta_3 hhsize + \beta_4 youngsingle + \beta_5 urban + monthly dummies + \varepsilon,

where dist is the log of distance traveled per month by the household, price the log of the price of 1 lt of gasoline, age the log of age, hhsize the size of the household, etc.
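Returning to the variance-comparison theorem above: the claim that \Delta_3 \ge 0 whenever \omega < 2 can be spot-checked numerically with arbitrary positive definite G (assumed matrices, \sigma^2 = 1, W = I):

```python
import numpy as np

rng = np.random.default_rng(4)
p, q = 4, 2
A = rng.normal(size=(p, p))
G = A @ A.T + p * np.eye(p)        # a positive definite information matrix
R = rng.normal(size=(q, p))        # restriction matrix; W = I, sigma^2 = 1

def var_gdwm(omega):
    """Var of the weighted mixed estimator: B (G + omega^2 R'R) B'."""
    B = np.linalg.inv(G + omega * R.T @ R)
    return B @ (G + omega**2 * R.T @ R) @ B.T

for omega in (0.1, 0.5, 1.0, 1.9):
    delta3 = np.linalg.inv(G) - var_gdwm(omega)
    # Delta_3 should be positive semidefinite for 0 < omega < 2
    # (it has rank at most q, so some eigenvalues are zero).
    eigs = np.linalg.eigvalsh((delta3 + delta3.T) / 2)
    assert eigs.min() > -1e-10
```

Because q < p here, \Delta_3 is only positive semidefinite (rank at most q), matching the final remark of the proof.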
The specification test for joint significance of the nonparametric variables gives 5.96. Order of differencing: m = 10, with weights

  d = (0.9494, -0.1437, -0.1314, -0.1197, -0.1085, -0.0978, -0.0877, -0.0782, -0.0691, -0.0606, -0.0527)

(these satisfy \sum_j d_j = 0 and \sum_j d_j^2 = 1).

[Figure: ordering of the data with respect to price and age.]

Results 1

              Diff. Liu            Difference
              Est.    Std. Error   Est.    Std. Error
  income       0.287   0.021        0.291   0.021
  drivers      0.532   0.035        0.571   0.033
  hhsize       0.122   0.029        0.093   0.027
  youngsingle  0.198   0.063        0.191   0.061
  urban       -0.333   0.020       -0.332   0.020
  R^2           .270                 .263
  s^2_diff      .496                 .501

Results 2

[Figure: raw data graph (left); smooth graph after estimating the parametric terms with the Liu estimator (right).]

Having estimated the parametric effects using the GDWML estimator, the constructed data (y - z^\top \hat{\beta}, age, price) are then smoothed to estimate f.

Application 2: Hedonic Pricing of Housing Attributes

Housing prices are very much affected by location (Yatchew, 2003). The price surface may be unimodal, multimodal or have ridges (prices along the subway are often higher). Thus we include a two-dimensional nonparametric effect. Dataset: 92 detached homes in Ottawa; y (dependent variable): saleprice; lotarea, square footage of housing, average neighbourhood income, etc. are independent variables.

Results 1

               Estimate   Std. Error   t value   Pr(>|t|)
  (Intercept)   73.9775    17.9961      4.11      0.0001
  fireplac      11.7950     6.1523      1.92      0.0587
  garage        11.8383     5.0793      2.33      0.0223
  luxbath       60.7362    10.5148      5.78      0.0000
  avginc         0.4776     0.2244      2.13      0.0364
  disthwy      -15.2774     6.6974     -2.28      0.0252
  lotarea        3.2432     2.2766      1.42      0.1581
  nrbed          6.5860     4.8962      1.35      0.1823
  usespace      21.1285    10.9864      1.92      0.0580
  south          7.5268     2.1505      3.50      0.0008
  west          -3.2062     2.5296     -1.27      0.2086
  R^2 = .62,  s^2_res = 424.3

Results 2

[Figure: (a) location versus saleprice; (b) local polynomial regression.]

Results 3 (parametric part estimated by Liu-type regression)

               Estimate   Std. Error   t value   Pr(>|t|)
  fireplac      12.6489     5.7875      2.19      0.0316
  garage        12.9252     4.9146      2.63      0.0102
  luxbath       57.6187    10.5741      5.45      0.0000
  avginc         0.5968     0.2342      2.55      0.0126
  disthwy        1.5538    21.3877      0.07      0.9423
  lotarea        3.0986     2.2387      1.38      0.1700
  nrbed          6.4323     4.7490      1.35      0.1792
  usespace      24.7213    10.5893      2.33      0.0220
  R^2 = .66,  s^2_diff = 375.5

Simulation Design

Various biasing parameters are considered: \eta = (0, 0.1, 0.2, \ldots, 1), and various weights \omega = (0.1, 0.3, 0.5, 0.7, 0.9). To achieve different degrees of collinearity, the predictors are generated by

  x_{ij} = (1 - \gamma^2)^{1/2} z_{ij} + \gamma z_{i,p+1},  i = 1, \ldots, n,  j = 1, \ldots, p,

where the z_{ij} are i.i.d. standard normal numbers and \gamma \in (0.80, 0.90, 0.99), so that the correlation between any two explanatory variables is \gamma^2. The dependent observations are generated by

  y_i = \sum_{j=1}^5 x_{ij} \beta_j + f(t_i) + \varepsilon_i,

where \beta = (1.5, 2, 3, -5, 4) and \varepsilon \sim N(0, \sigma^2 V), the elements of V being v_{ij} = \rho^{|i-j|} for an autocorrelation parameter \rho (its value is garbled in the extracted source), with \sigma^2 = 4. The nonparametric function is

  f(t_i) = \sqrt{t_i(1 - t_i)}\,\sin\big(2.1\pi / (t_i + 0.05)\big),  t_i = (i - 0.5)/n,

which is the Doppler function.
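The simulation design just described can be sketched as follows; the index of the shared z column in the collinearity scheme is an assumption (the subscript is garbled in the source), and V is taken as I here since the autocorrelation base is not recoverable:

```python
import numpy as np

def collinear_X(n, p, gamma, rng):
    """Collinear design with corr(x_j, x_k) = gamma^2 for j != k.
    The shared z column is the standard device for this construction;
    its index is an assumption about the garbled subscript."""
    Z = rng.normal(size=(n, p + 1))
    return np.sqrt(1 - gamma**2) * Z[:, :p] + gamma * Z[:, [p]]

def doppler(t):
    """Doppler function: sqrt(t(1-t)) sin(2.1 pi / (t + 0.05))."""
    return np.sqrt(t * (1 - t)) * np.sin(2.1 * np.pi / (t + 0.05))

rng = np.random.default_rng(5)
n, p, gamma = 100, 5, 0.99
X = collinear_X(n, p, gamma, rng)
t = (np.arange(1, n + 1) - 0.5) / n
beta = np.array([1.5, 2.0, 3.0, -5.0, 4.0])
y = X @ beta + doppler(t) + 2.0 * rng.normal(size=n)  # sigma^2 = 4, V = I here

# Pairwise correlations are close to gamma^2 and X'X is ill-conditioned.
C = np.corrcoef(X, rowvar=False)
off_diag = C[np.triu_indices(p, 1)]
assert np.all(np.abs(off_diag - gamma**2) < 0.03)
assert np.linalg.cond(X.T @ X) > 50
```

With \gamma = 0.99 the condition number of X^\top X is in the hundreds, reproducing the kind of multicollinearity the study targets.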
[Figure: nonparametric part of the model (Doppler function).]

The Doppler function is difficult to estimate: it is spatially inhomogeneous, its smoothness varying over t.

The restriction on the parameters is taken to be R\beta + \phi = r, \phi \sim N(0, 1), with R = (0, 1, -1, 1, 0) and r = 0. The fourth-order optimal differencing weights for m = 4 are

  d_0 = 0.8873,  d_1 = -0.3099,  d_2 = -0.2464,  d_3 = -0.1901,  d_4 = -0.1409

(Yatchew (2003), p. 61). The (100 - 4) \times 100 differencing matrix is

  D = \begin{pmatrix}
    d_0 & d_1 & d_2 & d_3 & d_4 & 0 & \cdots & \cdots & 0 \\
    0 & d_0 & d_1 & d_2 & d_3 & d_4 & 0 & \cdots & 0 \\
    \vdots & & \ddots & & & & & \ddots & \vdots \\
    0 & \cdots & \cdots & 0 & d_0 & d_1 & d_2 & d_3 & d_4
  \end{pmatrix}.

The X and y values are ordered with respect to the nonparametric variable. The matrix G = \tilde{X}^\top V_D^{-1} \tilde{X} has condition numbers 20.92, 33.23 and 352.63 for \gamma = 0.8, \gamma = 0.9 and \gamma = 0.99, respectively, which implies the existence of multicollinearity in the data set. After differencing we can perform inference on \beta as if there were no nonparametric component f in the model to begin with.

Performance criteria

The performances of the estimators are measured by

  \widehat{SMSE}(\hat{\beta}) = \frac{1}{1000} \sum_{l=1}^{1000} \big\| \hat{\beta}_{(l)} - \beta \big\|^2,

  \widehat{ABIAS}(\hat{\beta}) = \frac{1}{1000} \sum_{l=1}^{1000} \sum_{i=1}^{p} \big| \hat{\beta}_{i(l)} - \beta_i \big|,

where \hat{\beta}_{(l)} denotes the estimated parameter vector in the l-th iteration, and

  \widehat{MSE}(\hat{f}(u), f(u)) = \frac{1}{1000} \sum_{i=1}^{1000} \big[ \hat{f}(u_i) - f(u_i) \big]^2.

Simulation Results

Table: evaluation of the risk functions for \gamma = 0.99 and different values of \omega.

                       GDE      GDWME    GDME     GDLE     GDWMLE
  mse (\omega = 0.1)   157.66   157.66   157.17   155.80   155.80
  bias (\omega = 0.1)   21.62    21.62    21.54    21.46    21.46
  mse (\omega = 0.3)   156.81   156.60   156.32   154.96   154.74
  bias (\omega = 0.3)   21.65    21.62    21.56    21.48    21.45
  mse (\omega = 0.5)   156.35   156.00   155.81   154.51   154.16
  bias (\omega = 0.5)   21.53    21.48    21.44    21.36    21.31

The optimum value of \eta for the Liu estimator is calculated using generalized cross validation.
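The two performance criteria can be sketched directly from their definitions (a toy check with made-up unbiased estimates, not the study's replications):

```python
import numpy as np

def smse(estimates, beta):
    """SMSE: average squared Euclidean error over the replications."""
    return np.mean(np.sum((estimates - beta)**2, axis=1))

def abias(estimates, beta):
    """ABIAS as defined on the slide: average summed absolute deviation."""
    return np.mean(np.sum(np.abs(estimates - beta), axis=1))

# 1000 replications of an unbiased estimator with sd 0.1 per component.
rng = np.random.default_rng(6)
beta = np.array([1.0, -2.0])
est = beta + 0.1 * rng.normal(size=(1000, 2))

assert abs(smse(est, beta) - 0.02) < 0.01   # E[SMSE] = 2 * 0.1^2
assert abias(est, beta) < 0.3
```

With 1000 replications the Monte Carlo averages sit close to their expectations, which is what makes tables like the one above comparable across estimators.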
Final Considerations

We have proposed estimation methods for the partially linear regression framework based on the difference-based method; in particular, estimation of the parametric part is considered. Stochastic restrictions were put on the parameter space with specified weights, and the results were generalized to heteroscedastic cases. The proposed estimators GDLE and GDWMLE had smaller SMSEs in all cases.

References

Engle, R.F., Granger, C.W.J., Rice, J. and Weiss, A. Semiparametric estimates of the relation between weather and electricity sales. Journal of the American Statistical Association, 81:310-320, 1986.

Härdle, W.K., Müller, M., Sperlich, S. and Werwatz, A. Nonparametric and Semiparametric Models. Springer Verlag, Heidelberg, 2004.

Hoerl, A.E. and Kennard, R.W. Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 12:55-67, 1970.

Liu, K. A new class of biased estimate in linear regression. Communications in Statistics: Theory and Methods, 22:393-402, 1993.

Green, P.J. and Silverman, B.W. Nonparametric Regression and Generalized Linear Models. Chapman and Hall, 1994.

Akdeniz-Duran, E., Haerdle, W. and Osipenko, M. Difference based ridge and Liu type estimators in semiparametric regression models. Journal of Multivariate Analysis, 105:164-175, 2012.

Akdeniz, F., Akdeniz-Duran, E., Roozbeh, M. and Arashi, M. Efficiency of the generalized difference-based Liu estimators in semiparametric regression models with correlated errors. Journal of Statistical Computation and Simulation, 2013, DOI: 10.1080/00949655.2013.806924.

Akdeniz-Duran, E., Akdeniz, F. and Hu, H. Efficiency of a Liu-type estimator in semiparametric regression models. Journal of Computational and Applied Mathematics, 235:1418-1428, 2011.

Akdeniz, F., Akdeniz-Duran, E., Roozbeh, M. and Arashi, M. New difference-based estimator of parameters in semiparametric regression models. Journal of Statistical Computation and Simulation, 83(5), 2013.

www.r-project.org

Thank you. Questions?