
Regression Analysis
Chapter 3
Two-Variable Regression Model:
The Problem of Estimation
Estimate the population regression function
(PRF) on the basis of the sample regression
function (SRF)
Two generally used methods of estimation:
(1) ordinary least squares (OLS)
(2) maximum likelihood (ML)
The method of OLS is used extensively in regression analysis.
3.1 THE METHOD OF ORDINARY LEAST
SQUARES
The method of ordinary least squares is
attributed to Carl Friedrich Gauss
The PRF is not directly observable; we estimate it from the SRF:
$$Y_i = \hat\beta_1 + \hat\beta_2 X_i + \hat u_i = \hat Y_i + \hat u_i$$
But how is the SRF itself determined?
First, express the residuals as
$$\hat u_i = Y_i - \hat Y_i = Y_i - \hat\beta_1 - \hat\beta_2 X_i$$
• Now given n pairs of observations on Y and X, we would like to determine the SRF in such a manner that it is as close as possible to the actual Y.
• One possibility: choose the SRF in such a way that the sum of the residuals is as small as possible:
$$\sum \hat u_i = \sum (Y_i - \hat Y_i)$$
• Under this criterion, however, all the residuals receive equal importance no matter how close or how widely scattered the individual observations are from the SRF.
[Figure: observations scattered about the SRF in the Y–X plane.]
A consequence of this is that it is quite possible for the algebraic sum of the residuals to be small (even zero) although the residuals are widely scattered about the SRF.
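To see this concretely, here is a minimal Python sketch with entirely made-up data: any candidate line forced through the point of sample means has residuals summing to exactly zero, however absurd its slope, so the plain sum of residuals cannot discriminate between good and bad fits.

```python
import numpy as np

# Made-up sample: five (X, Y) observations.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.5, 5.0, 8.0, 10.5])

def residuals(b1, b2):
    """Residuals u_hat = Y - (b1 + b2 * X) for a candidate SRF."""
    return Y - (b1 + b2 * X)

# Any line through the point of means (X_bar, Y_bar) has residuals that
# sum to zero, whatever its slope.
for name, b2 in [("plausible", 2.0), ("absurd", -3.0)]:
    b1 = Y.mean() - b2 * X.mean()          # force the line through the means
    u = residuals(b1, b2)
    print(f"{name:9s} slope={b2:5.1f}  sum(u)={u.sum():9.2e}  "
          f"sum(u^2)={np.sum(u**2):8.2f}")
# Both candidate lines give sum(u) ~ 0, but the sum of squared residuals
# immediately exposes how badly the absurd line fits.
```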
• We can avoid this problem if we adopt the least-squares criterion: choose the SRF so that the sum of the squared residuals is as small as possible:
$$\sum \hat u_i^2 = \sum (Y_i - \hat Y_i)^2 = \sum (Y_i - \hat\beta_1 - \hat\beta_2 X_i)^2 \qquad (3.1.2)$$
It is obvious from (3.1.2) that
$$\sum \hat u_i^2 = f(\hat\beta_1, \hat\beta_2)$$
the sum of the squared residuals is some function of the estimators.
• The method of least squares provides us with unique estimates of $\beta_1$ and $\beta_2$ that give the smallest possible value of $\sum \hat u_i^2$.
To find these estimates, differentiate $\sum \hat u_i^2$ partially with respect to each estimator:
$$\frac{\partial \sum \hat u_i^2}{\partial \hat\beta_1} = -2 \sum (Y_i - \hat\beta_1 - \hat\beta_2 X_i) = -2 \sum \hat u_i$$
$$\frac{\partial \sum \hat u_i^2}{\partial \hat\beta_2} = -2 \sum (Y_i - \hat\beta_1 - \hat\beta_2 X_i) X_i = -2 \sum \hat u_i X_i$$
Setting these derivatives equal to zero gives the normal equations:
$$\sum Y_i = n \hat\beta_1 + \hat\beta_2 \sum X_i \qquad (3.1.4)$$
$$\sum X_i Y_i = \hat\beta_1 \sum X_i + \hat\beta_2 \sum X_i^2 \qquad (3.1.5)$$
Multiply (3.1.4) by $\sum X_i$ and (3.1.5) by $n$:
$$\sum X_i \sum Y_i = \hat\beta_1 \, n \sum X_i + \hat\beta_2 \left(\sum X_i\right)^2 \qquad (1)$$
$$n \sum X_i Y_i = \hat\beta_1 \, n \sum X_i + \hat\beta_2 \, n \sum X_i^2 \qquad (2)$$
Subtracting (1) from (2) gives
$$n \sum X_i Y_i - \sum X_i \sum Y_i = \hat\beta_2 \left[ n \sum X_i^2 - \left(\sum X_i\right)^2 \right]$$
so that
$$\hat\beta_2 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2} = \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sum (X_i - \bar X)^2} = \frac{\sum x_i y_i}{\sum x_i^2} \qquad (3.1.6)$$
where the lowercase letters denote deviations from the sample means: $x_i = X_i - \bar X$, $y_i = Y_i - \bar Y$.
$$\hat\beta_1 = \frac{\sum X_i^2 \sum Y_i - \sum X_i \sum X_i Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2} = \bar Y - \hat\beta_2 \bar X \qquad (3.1.7)$$
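As a quick numerical illustration, a minimal Python sketch (with made-up sample data) computes (3.1.6) and (3.1.7) in deviation form and checks the result against the normal equations:

```python
import numpy as np

# Illustrative made-up data: ten (X, Y) pairs.
X = np.array([80., 100., 120., 140., 160., 180., 200., 220., 240., 260.])
Y = np.array([70.,  65.,  90.,  95., 110., 115., 120., 140., 155., 150.])
n = len(X)

# Deviations from the sample means: x_i = X_i - X_bar, y_i = Y_i - Y_bar.
x = X - X.mean()
y = Y - Y.mean()

b2 = np.sum(x * y) / np.sum(x ** 2)   # slope, Eq. (3.1.6)
b1 = Y.mean() - b2 * X.mean()         # intercept, Eq. (3.1.7)
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}")

# Sanity check: the estimates satisfy the normal equations (3.1.4)-(3.1.5).
assert np.isclose(Y.sum(), n * b1 + b2 * X.sum())
assert np.isclose(np.sum(X * Y), b1 * X.sum() + b2 * np.sum(X ** 2))
```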
The estimators obtained previously are known as the least-squares estimators. Numerical properties of estimators obtained by the method of OLS:
I. The OLS estimators are expressed solely in terms of the observable (i.e., sample) quantities (i.e., X and Y).
II. They are point estimators.
III. Once the OLS estimates are obtained from the sample data, the sample regression line (Figure 3.1) can be easily obtained. The regression line thus obtained has the following properties:
• 1. It passes through the sample means of Y and X: from (3.1.7), $\hat\beta_1 = \bar Y - \hat\beta_2 \bar X$, i.e., $\bar Y = \hat\beta_1 + \hat\beta_2 \bar X$.
• 2. The mean value of the estimated Y is equal to the mean value of the actual Y: since $\hat Y_i = \hat\beta_1 + \hat\beta_2 X_i = (\bar Y - \hat\beta_2 \bar X) + \hat\beta_2 X_i = \bar Y + \hat\beta_2 (X_i - \bar X)$, summing both sides of this last equality over the sample values and dividing through by the sample size n gives $\bar{\hat Y} = \bar Y$.
• 3. The mean value of the residuals is zero: the first normal equation gives $\sum \hat u_i = 0$, hence $\bar{\hat u} = 0$. As a result, the sample regression $Y_i = \hat\beta_1 + \hat\beta_2 X_i + \hat u_i$ (2.6.2) can be written in deviation form. Sum (2.6.2) on both sides to give $\sum Y_i = n\hat\beta_1 + \hat\beta_2 \sum X_i + \sum \hat u_i = n\hat\beta_1 + \hat\beta_2 \sum X_i$ (3.1.11). Dividing Eq. (3.1.11) through by n, we obtain $\bar Y = \hat\beta_1 + \hat\beta_2 \bar X$ (3.1.12), which is the same as (3.1.7). Subtracting Eq. (3.1.12) from (2.6.2), we obtain the deviation form $y_i = \hat\beta_2 x_i + \hat u_i$ (3.1.13).
• 4. The residuals are uncorrelated with the predicted Y: $\sum \hat u_i \hat Y_i = 0$.
• 5. The residuals are uncorrelated with X: $\sum \hat u_i X_i = 0$.
3.2 THE ASSUMPTIONS UNDERLYING THE
METHOD OF LEAST SQUARES
• We want to draw inferences about the true β1 and β2.
• PRF: $Y_i = \beta_1 + \beta_2 X_i + u_i$
• $Y_i$ depends on both $X_i$ and $u_i$.
• Assumptions made about the $X_i$ variable(s) and the error term are extremely critical to the valid interpretation of the regression estimates.
• The Gaussian, standard, or classical linear regression model (CLRM)
In other words, richer families on the average consume more than poorer
families, but there is also more variability in the consumption expenditure of
the former.
Which assumptions are appropriate depends on the type of data.
• (1) What variables should be included in the model?
• (2) What is the functional form of the model? Is it linear in the parameters, the variables, or both?
• (3) What are the probabilistic assumptions made about the $Y_i$, the $X_i$, and the $u_i$ entering the model?
3.3 PRECISION OR STANDARD ERRORS OF LEAST-SQUARES ESTIMATES
• From Eqs. (3.1.6) and (3.1.7), it is evident that least-squares estimates are a function of the sample data. But since the data are likely to change from sample to sample, the estimates will change ipso facto.
• What is needed is some measure of the "reliability" or precision of the estimators.
• In statistics the precision of an estimate is measured by its standard error (se).
The standard errors of the OLS estimates:
$$\operatorname{var}(\hat\beta_2) = \frac{\sigma^2}{\sum x_i^2}, \qquad \operatorname{se}(\hat\beta_2) = \frac{\sigma}{\sqrt{\sum x_i^2}}$$
$$\operatorname{var}(\hat\beta_1) = \frac{\sum X_i^2}{n \sum x_i^2}\,\sigma^2, \qquad \operatorname{se}(\hat\beta_1) = \sqrt{\frac{\sum X_i^2}{n \sum x_i^2}}\;\sigma$$
where $\sigma^2$ is estimated by
$$\hat\sigma^2 = \frac{\sum \hat u_i^2}{n-2}$$
3.4 PROPERTIES OF LEAST-SQUARES ESTIMATORS: THE GAUSS–MARKOV THEOREM
• The least-squares estimates possess some ideal or optimum properties. These properties are contained in the well-known Gauss–Markov theorem.
• To understand this theorem, we need to consider the best linear unbiasedness property of an estimator.
• An estimator is the best linear unbiased estimator (BLUE) if:
• 1. It is linear.
• 2. It is unbiased.
• 3. It has minimum variance in the class of all such linear unbiased estimators.
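The unbiasedness part of this claim can be illustrated by simulation: under the CLRM assumptions, slope estimates computed over many repeated samples average out to the true parameter. A sketch with assumed, made-up population values (β1 = 3, β2 = 0.5, σ = 2):

```python
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2, sigma = 3.0, 0.5, 2.0    # assumed "true" population values
X = np.linspace(10, 100, 20)           # regressor fixed in repeated samples
x = X - X.mean()

# Draw many samples from the PRF and re-estimate the slope each time.
b2_draws = []
for _ in range(10_000):
    u = rng.normal(0.0, sigma, size=X.size)
    Y = beta1 + beta2 * X + u
    b2_draws.append(np.sum(x * (Y - Y.mean())) / np.sum(x ** 2))

# The average of the estimates sits on top of the true slope: E[b2] = beta2.
print(f"mean of b2 across samples: {np.mean(b2_draws):.4f} (true beta2 = {beta2})")
```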
3.5 THE COEFFICIENT OF DETERMINATION r²: A MEASURE OF "GOODNESS OF FIT"
• We now consider the goodness of fit of the fitted regression line to a set of data;
• we shall find out how "well" the sample regression line fits the data.
• What we hope for is that the residuals around the regression line are as small as possible.
The Venn diagram, or Ballentine: the overlap of the two circles (the shaded area) indicates the extent to which the variation in Y is explained by the variation in X (say, via an OLS regression).
• The greater the extent of the overlap, the greater the variation in Y that is explained by X.
$$\sum y_i^2 = \sum (Y_i - \bar Y)^2$$
is the total sum of squares (TSS): the total variation of the actual Y values about their sample mean.
$$\text{TSS} = \text{ESS} + \text{RSS} \qquad (3.5.2)$$
where $\text{ESS} = \sum (\hat Y_i - \bar Y)^2$ is the explained sum of squares and $\text{RSS} = \sum \hat u_i^2$ is the residual sum of squares.
Dividing (3.5.2) by TSS on both sides, we obtain
$$1 = \frac{\text{ESS}}{\text{TSS}} + \frac{\text{RSS}}{\text{TSS}}$$
The coefficient of determination
$$r^2 = \frac{\text{ESS}}{\text{TSS}} = 1 - \frac{\text{RSS}}{\text{TSS}}$$
measures the proportion or percentage of the total variation in Y explained by the regression model.
Two properties of r² may be noted:
• 1. It is a nonnegative quantity.
• 2. Its limits are 0 ≤ r² ≤ 1.
• r² can be obtained more quickly from the following formula:
$$r^2 = \hat\beta_2^2 \, \frac{\sum x_i^2}{\sum y_i^2}$$
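A short Python check (same made-up data as before) that the definition, the shortcut formula, and the TSS decomposition all agree:

```python
import numpy as np

# Same made-up data as in the earlier sketches.
X = np.array([80., 100., 120., 140., 160., 180., 200., 220., 240., 260.])
Y = np.array([70.,  65.,  90.,  95., 110., 115., 120., 140., 155., 150.])

x, y = X - X.mean(), Y - Y.mean()
b2 = np.sum(x * y) / np.sum(x ** 2)
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X
u_hat = Y - Y_hat

TSS = np.sum(y ** 2)
ESS = np.sum((Y_hat - Y.mean()) ** 2)
RSS = np.sum(u_hat ** 2)
assert np.isclose(TSS, ESS + RSS)                   # the decomposition (3.5.2)

r2_def = ESS / TSS                                  # definition
r2_alt = b2 ** 2 * np.sum(x ** 2) / np.sum(y ** 2)  # shortcut formula
assert np.isclose(r2_def, r2_alt)
print(f"r^2 = {r2_def:.4f}")
```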
The coefficient of correlation r measures the degree of linear association between two variables; in deviation form, $r = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2}\sqrt{\sum y_i^2}}$.
Some of the properties of r:
• 1. It can be positive or negative.
• 2. It lies between the limits of −1 and +1.
• 3. It is symmetrical in nature.
• 4. It is independent of the origin and scale.
• 5. If X and Y are statistically independent, the correlation coefficient between them is zero.
• 6. It is a measure of linear association or linear dependence only.
• 7. It does not necessarily imply any cause-and-effect relationship.
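Properties 3 and 4 can be verified mechanically; a brief sketch, again with made-up data:

```python
import numpy as np

def corr(a, b):
    """Sample correlation r = sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    x, y = a - a.mean(), b - b.mean()
    return np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))

# Same made-up data as in the earlier sketches.
X = np.array([80., 100., 120., 140., 160., 180., 200., 220., 240., 260.])
Y = np.array([70.,  65.,  90.,  95., 110., 115., 120., 140., 155., 150.])

r = corr(X, Y)
assert np.isclose(r, corr(Y, X))          # property 3: symmetric in X and Y
assert np.isclose(r, corr(2 * X + 5, Y))  # property 4: unchanged by a positive
                                          # rescaling and a shift of origin
print(f"r = {r:.4f}, r^2 = {r**2:.4f}")
```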
EXAMPLES
SUMMARY AND CONCLUSIONS:
1. The basic framework of regression analysis is the CLRM (classical linear regression model).
2. The CLRM is based on a set of assumptions.
3. Based on these assumptions, the least-squares estimators are BLUE (best linear unbiased estimators).
4. The precision of OLS estimators is measured by their standard errors.
• 5. The overall goodness of fit of the regression model is measured by the coefficient of determination.
• 6. A concept related to the coefficient of determination is the coefficient of correlation. It is a measure of linear association between two variables and it lies between −1 and +1.
• 7. The CLRM is a theoretical construct or abstraction because it is based on a set of assumptions that may be stringent or "unrealistic."