Chapter 3: Linear Regression
[email protected]
1. The Meaning of Regression

The Meaning of Regression

Regression examines the relationship between a dependent variable and one or more independent variables.
Ex: How are the quantity of a good and its price related?

It estimates the population mean of the dependent variable from given values of the independent variables.
Ex: Given a particular income level, what is the level of consumption?

The Meaning of Regression

Regression also tests hypotheses:

Ex: about the precise relationship between consumption and income.

It tests how much consumption rises when income rises.

2. Regression Example

Regression Example

Consider a country whose total population consists of 60 families.

Let us examine the relationship between income and consumption.
Some families have the same income.

Group the families by weekly income ($100, $120, $140, etc.).

Regression Example

Within each group, take the range of family consumption as given.

Suppose 6 families have an income of $100, and their expenditures are $65, $70, $74, $80, $85, and $88.
Denote income by X and expenditure by Y.
Within each income category we then have a distribution of Y conditional on the given X.

Regression Example

For each distribution, compute the conditional mean: \(E(Y \mid X = X_i)\).

How did we obtain \(E(Y \mid X = X_i)\)?
Multiply each Y value by its conditional probability (1/6) and sum the products.

• For the $100 income group in our example this value is 77.

We can plot these conditional distributions for each income level.

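The conditional-mean calculation for the $100 income group can be sketched in a few lines of Python. The expenditure figures are the six values given in the example; the variable names are illustrative only.

```python
# Expenditures of the six families with weekly income X = 100 (from the example).
expenditures = [65, 70, 74, 80, 85, 88]

# Each family is equally likely within the group: conditional probability 1/6.
prob = 1 / len(expenditures)

# E(Y | X = 100) is the probability-weighted sum of the Y values.
conditional_mean = sum(prob * y for y in expenditures)
print(round(conditional_mean, 2))
```

Repeating the same calculation for every income group would give the points that the population regression line joins.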
Regression Example

The population regression is the line that joins the conditional means of the dependent variable for fixed values of the explanatory variables.
Formally: \(E(Y \mid X_i)\)
This population regression function (PRF) describes how the mean of Y changes with X.

Regression Example

What form does this function take?

Many forms are possible, but assume it is linear: \(E(Y \mid X_i) = \beta_1 + \beta_2 X_i\)
\(\beta_1\) and \(\beta_2\) are the regression coefficients (intercept and slope).

The slope shows how much Y changes for a given change in X.
We estimate \(\beta_1\) and \(\beta_2\) from actual observations on X and Y.

3. Linearity

Linearity

Linearity can be in the variables or in the parameters.
Linearity in the variables:

The conditional expectation of Y is a linear function of X.
The regression is a straight line.
The slope is constant.
The variables cannot enter the function as squares, square roots, or interaction terms.

Linearity

We are concerned with linearity in the parameters.
The parameters are raised to the first power only.
The variables may or may not be linear.

Linearity

Linearity in the parameters:

The conditional expectation of Y is a linear function of the parameters.

The Xs may or may not be linear.
\(E(Y \mid X_i) = \beta_1 + \beta_2 X_i^2\) is linear in the parameters.
\(E(Y \mid X_i) = \beta_1 + \beta_2^2 X_i\) is not linear in the parameters.
A model is linear in the parameters if each \(\beta\) appears to the first power and is not multiplied or divided by another parameter.

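A short Python sketch may help show why linearity in the parameters is what matters for estimation. It assumes a hypothetical relation Y = 2 + 3X² with no error term (the numbers and the helper `fit_line` are made up for illustration) and fits it with the usual single-regressor least-squares formulas by treating Z = X² as the regressor.

```python
def fit_line(z, y):
    """Least-squares intercept and slope for y = b1 + b2 * z."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    b2 = s_zy / s_zz
    b1 = y_bar - b2 * z_bar
    return b1, b2

# Hypothetical data generated exactly from Y = 2 + 3 * X^2 (no error term).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 + 3.0 * xi ** 2 for xi in x]

# The model is nonlinear in X but linear in the parameters,
# so we regress Y on the transformed variable Z = X^2.
z = [xi ** 2 for xi in x]
b1, b2 = fit_line(z, y)
print(b1, b2)
```

Because the transformed model is linear in b1 and b2, the fit recovers the intercept and slope used to generate the data.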
4. Stochastic Error

Stochastic Error

Individual values can lie above or below the conditional mean.
Define \(u_i = Y_i - E(Y \mid X_i)\), where \(u_i\) is the deviation of an individual value from the conditional mean.
Rearranging: \(Y_i = E(Y \mid X_i) + u_i\)

\(u_i\) is the stochastic error term.

It is a random disturbance.
Without it the model would be deterministic.

Stochastic Error Example

Assume that family consumption is linearly related to income, plus a disturbance term. Some examples:

A family that spends $65 can be written as:

\(Y_i = 65 = \beta_1 + \beta_2(100) + u_i\)
A family that spends $75:

\(Y_i = 75 = \beta_1 + \beta_2(100) + u_i\)

Stochastic Error Example

The model has a deterministic part and a stochastic part.

An econometric model expresses the relationship between consumption and income.

The systematic part is represented by variables such as price, education, etc.
The relationship is not exact: it is subject to individual variation, and this variation is captured by u.

Expected Value of u

\(Y_i = E(Y \mid X_i) + u_i\)
Take conditional expectations of both sides:
\(E(Y_i \mid X_i) = E[E(Y \mid X_i)] + E(u_i \mid X_i)\)
\(E(Y_i \mid X_i) = E(Y \mid X_i) + E(u_i \mid X_i)\)

The expected value of a constant is the constant itself, and once \(X_i\) is fixed, \(E(Y \mid X_i)\) is a constant.
So \(E(u_i \mid X_i) = 0\)

The conditional mean of \(u_i\) is zero.

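The zero conditional mean of the error can be checked on the $100 income group from the regression example. This is a minimal Python sketch using the six expenditure values from that example; the variable names are illustrative.

```python
# Expenditures of the six families with income X = 100 (from the example).
expenditures = [65, 70, 74, 80, 85, 88]

# Conditional mean E(Y | X = 100).
conditional_mean = sum(expenditures) / len(expenditures)

# u_i = Y_i - E(Y | X_i): deviation of each family from the conditional mean.
errors = [y - conditional_mean for y in expenditures]

# Within the group the deviations average to zero by construction.
mean_error = sum(errors) / len(errors)
print(errors, mean_error)
```

Some families lie below the conditional mean and some above it, but the deviations cancel within the group.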
What Does the Error Term Capture?

Omitted variables

Other variables that may affect consumption are not included in the model.
A correctly specified model would include them.
We may omit a variable because we do not know its economic relationship with the dependent variable.
We may not have data on it.
Random events such as bad weather or strikes occur irregularly.

What Does the Error Term Capture?

Measurement error in the dependent variable

Friedman's model of consumption:
Permanent consumption is a function of permanent income.
These quantities are not observable, so we must use proxies such as current consumption and current income.
The error term then represents and captures this measurement error.

What Does the Error Term Capture?

Randomness of human behavior

People do not act in exactly the same way even in identical circumstances.

The error term captures this randomness.

5. Sample Regression Function

Sample Regression
Function

If we have the whole population, we can determine a regression line by taking conditional means.

In practice, we usually have only a sample.
Suppose we took a sample of the population.

We cannot estimate the population regression line exactly, because of sampling fluctuations.

Sample Regression
Function

Our sample regression line can be denoted:
\[ \hat{Y}_i = b_1 + b_2 X_i \]
\(\hat{Y}_i\) is the estimator of \(E(Y \mid X_i)\), the conditional mean
\(b_1\) is the estimator of \(B_1\), the population intercept
\(b_2\) is the estimator of \(B_2\), the population slope

Sample Regression
Function

In stochastic form:
\[ Y_i = b_1 + b_2 X_i + e_i \]
where \(e_i\) is the sample residual, an estimate of \(u_i\).
We can have several independent variables; this is multivariate regression. For example, consumption may depend on the interest rate as well as income.

6. Ordinary Least
Squares
OLS Regression

Estimate the PRF by the method of ordinary least squares.

The PRF is not directly observable, so we estimate it from the sample regression function (SRF):

PRF: \(Y_i = \beta_1 + \beta_2 X_i + u_i\)
SRF: \(Y_i = b_1 + b_2 X_i + e_i\)
We can rewrite the SRF as
\(e_i\) = actual \(Y_i\) - predicted \(Y_i\)
\(e_i = Y_i - b_1 - b_2 X_i\)

OLS Regression

We determine the SRF in such a manner that it is a good fit.

We make the sum of squared residuals as small as possible:
\[ \sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2 \]
By squaring, we give more weight to larger residuals.

OLS Regression

Residuals are a function of the
betas
Choosing different values for beta
gives different values for squared
residuals.
 We choose the beta values that
minimize this sum.


These are the least-squares
estimators.
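The idea that different coefficient values give different sums of squared residuals can be illustrated with a brute-force search. The Python sketch below is not the analytical least-squares solution; it simply evaluates the sum of squared residuals on a coarse grid of candidate (b1, b2) pairs and keeps the smallest. The five data points and the grid limits are hypothetical.

```python
def sum_squared_residuals(b1, b2, x, y):
    """Sum of e_i^2, where e_i = Y_i - b1 - b2 * X_i."""
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Candidate intercepts from -2.0 to 2.0 and slopes from 0.0 to 4.0, in steps of 0.1.
candidates = [(i / 10, j / 10) for i in range(-20, 21) for j in range(0, 41)]

# Keep the pair with the smallest sum of squared residuals.
best_b1, best_b2 = min(
    candidates, key=lambda b: sum_squared_residuals(b[0], b[1], x, y)
)
print(best_b1, best_b2, sum_squared_residuals(best_b1, best_b2, x, y))
```

A grid search only finds the best point on the grid; the normal equations give the exact minimizer without any search.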
Normal Equations

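The least squares estimates are derived in the following manner:
Least squares minimizes ESS, which on these slides denotes the error (residual) sum of squares. (Section 12 calls this quantity RSS and uses ESS for the explained sum of squares.)
\[ ESS = \sum \hat{u}_i^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2 \]
Partially differentiate ESS with respect to \(\hat{\beta}_1\):
\[ \frac{\partial ESS}{\partial \hat{\beta}_1} = \frac{\partial \sum \hat{u}_i^2}{\partial \hat{\beta}_1} = 2 \sum \hat{u}_i \frac{\partial \hat{u}_i}{\partial \hat{\beta}_1} = 2 \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)(-1) \]
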
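Normal Equations

Partially differentiate ESS with respect to \(\hat{\beta}_2\):
\[ \frac{\partial ESS}{\partial \hat{\beta}_2} = \frac{\partial \sum \hat{u}_i^2}{\partial \hat{\beta}_2} = 2 \sum \hat{u}_i \frac{\partial \hat{u}_i}{\partial \hat{\beta}_2} = 2 \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)(-X_i) \]
Set the resulting equations to zero and solve them:
\[ \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) = 0 \]
\[ \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)(X_i) = 0 \]
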
Normal Equations

Simplifying yields:
\[ \sum Y_i = n \hat{\beta}_1 + \hat{\beta}_2 \sum X_i \]
\[ \sum Y_i X_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2 \]
Solving both simultaneously yields:
\[ \hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} \]
\[ \hat{\beta}_2 = \frac{\sum Y_i X_i - n \bar{X} \bar{Y}}{\sum X_i^2 - n \bar{X}^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} \]

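The closed-form solution can be checked numerically. The following Python sketch implements the two formulas above and then verifies that both normal equations hold for the fitted residuals. The ten (X, Y) pairs are an assumed data set, not given on the slides; they were chosen because they give n = 10 and a sum of squared X values of 322,000, the figures quoted in the later worked example. The function name `ols_simple` is illustrative.

```python
def ols_simple(x, y):
    """Solve the two normal equations for the intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations over sum of squared X deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta2_hat = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    beta1_hat = y_bar - beta2_hat * x_bar
    return beta1_hat, beta2_hat

# Assumed illustrative data (not taken from the slides).
x = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

beta1_hat, beta2_hat = ols_simple(x, y)
residuals = [yi - beta1_hat - beta2_hat * xi for xi, yi in zip(x, y)]

# Both normal equations should hold up to floating-point rounding.
first_normal_eq = sum(residuals)
second_normal_eq = sum(r * xi for r, xi in zip(residuals, x))
print(round(beta1_hat, 4), round(beta2_hat, 4))
print(first_normal_eq, second_normal_eq)
```

The two printed sums should be zero apart from rounding error, which is exactly what the normal equations require.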
8. Assumptions of
Classical Linear
Regression Model
Assumptions

Using model Y = B1 + B2X + u

Y depends on X and u
X values are fixed and u values are
random.
 Thus Y values are random too.
 Assumptions about u are very
important.


Assumptions are made that ensure
that OLS estimates are BLUE.
Linearity Assumption

The regression model is linear in
the parameters and the error term.


Y = B1 + B2X + u.
Not necessarily linear in the
variables

We can still apply OLS to models
that are nonlinear in the variables.
Specification Assumption

Assume the regression model is
correctly specified


All variables included (no
specification bias).
Otherwise, specification error
results.
Expected Value of Error

Expected value of the error term=0

E(ui) = 0


Its mean value is 0, conditional on
the Xs.
Add a stochastic error term to
equations to explain individual
variation.

Assume the error term is from a
distribution whose mean is zero
Expected Value of Error

In practice the mean is forced to be
zero by intercept term, which
incorporates any difference from
zero
Intercept represents the fixed
portion of Y that cannot be
explained by the independent
variables.
 The error term is the random
portion

No Correlation with Error

Explanatory variables are
uncorrelated with the error term

There is zero covariance between
the disturbance ui and the
explanatory variable Xi.


Cov(Xi, ui) = 0
Alternatively, X and u have
separate influences on Y
No Correlation with Error

Suppose the error term and X are
positively correlated.

The estimated coefficient would be higher than it should be, because the variation in Y caused by u is attributed to X.

No Correlation with Error

Consumption function violates this
assumption

Increase in C leads to increase in
income which leads to increase in
C.


So the error term in consumption and income move together.
If this assumption does not hold, we need simultaneous-equation estimation.

Constant Variance of Error

The variance of each ui is the same given a value of Xi.
var(ui) = σ², a constant (homoscedasticity)
Ex: the variance of consumption is the same at all levels of income

Alternative: the variance of the error term changes (heteroscedasticity)
Ex: the variance of consumption increases as income increases

No Correlation Across
Error Terms

No correlation between two error
terms

The covariance between the u's is zero

Cov(ui, uj) = 0 for i not equal to j

No Correlation Across
Error Terms

Often shows up in time series as serial correlation

A random shock in one period that affects the error term may persist and affect subsequent error terms.

Ex: a positive error in one period is associated with a positive error in another.

No Perfect Linear
Function Among Variables


No explanatory variable is a
perfect linear function of other
explanatory variables
Multicollinearity occurs when
variables move together

Ex: explain home purchases and
include both real and nominal
interest rates for a time period in
which inflation was constant.
9. Properties of OLS
Estimators
OLS Properties


1) Linear: b1 and b2 are linear functions of Y, with fitted line Ŷ = b1 + b2X
2) Unbiased:
E(b1) = B1 and E(b2) = B2
 In repeated sampling, the expected
values of b1 and b2 will coincide
with their true values B1 and B2.

OLS Properties

3) They have minimum variance
var b1 is less than the variance of
any other unbiased linear estimator
of B1
 var b2 is less than the variance of
any other unbiased linear estimator
of B2

BLUE Estimator

Given the assumptions of the
CLRM, OLS estimators, in the
class of unbiased linear estimators,
have minimum variance

They are BLUE.
10. Variances and
Standard Errors of
OLS Estimators
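Variances and Standard
Errors

Remember the OLS estimators are:
\[ \hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} \]
\[ \hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{\sum x_i y_i}{S_{XX}} \]
where lowercase \(x_i\) and \(y_i\) are deviations from the sample means and \(S_{XX} = \sum x_i^2\).
The estimators will vary across samples:
\[ \mathrm{var}(\hat{\beta}_1) = \frac{\sum X_i^2}{n S_{XX}} \sigma^2 \]
\[ \text{standard error}(\hat{\beta}_1) = \sqrt{\mathrm{var}(\hat{\beta}_1)} \]
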
Variances and Standard
Errors

The variance and standard error of \(\hat{\beta}_2\):
\[ \mathrm{var}(\hat{\beta}_2) = \frac{\sigma^2}{S_{XX}} \]
\[ \text{standard error}(\hat{\beta}_2) = \sqrt{\mathrm{var}(\hat{\beta}_2)} \]

Variances and Standard
Errors

\(\sigma^2\) is the variance of the error term, assumed constant for each u (homoscedasticity).
If we know \(\sigma^2\), we can compute all these terms.
If we don't know it, we use its estimator.
The estimator of \(\sigma^2\) is \(\hat{\sigma}^2 = \sum e_i^2 / (n-2)\)

\(\sum e_i^2\) is the RSS, \(\sum (Y_i - \hat{Y}_i)^2\): the sum of squared differences between actual and predicted Y.

Degrees of Freedom

n-2 is the degrees of freedom for error
Degrees of freedom are the number of independent observations
To get e, we have to compute predicted Y

To compute predicted Y, we must first obtain b1 and b2, so we lose 2 df.

Standard Error of
Estimate

The square root of \(\hat{\sigma}^2\) is \(\hat{\sigma}\).
This is called the standard error of the estimate (the standard deviation of the Y values about the regression line).
It is used as a measure of goodness of fit of the estimated regression line.

Example

Estimated regression line
Ŷ  = 24.47 + 0.509 X
se = (6.41)   (0.036)
t  = (3.813)  (14.243)
How do we get these figures?
\[ \hat{\sigma}^2 = \frac{\sum e_i^2}{n-2} = 337.3 / 8 = 42.16 \]

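Example

\[ \mathrm{var}(\hat{\beta}_1) = \frac{\sum X_i^2}{n S_{XX}} \hat{\sigma}^2 = \frac{322{,}000}{10(33{,}000)} (42.16) = 41.14 \]
\[ \text{standard error}(\hat{\beta}_1) = \sqrt{\mathrm{var}(\hat{\beta}_1)} = 6.41 \]
\[ \mathrm{var}(\hat{\beta}_2) = \frac{\hat{\sigma}^2}{S_{XX}} = \frac{42.16}{33{,}000} = 0.0013 \]
\[ \text{standard error}(\hat{\beta}_2) = \sqrt{\mathrm{var}(\hat{\beta}_2)} = 0.036 \]
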
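The arithmetic in this example can be reproduced directly from the quoted sums. The Python sketch below uses only the figures stated on the slides (RSS of 337.3, n = 10, sum of squared X of 322,000, and S_XX of 33,000); the variable names are illustrative.

```python
import math

# Sums quoted in the worked example.
rss = 337.3              # sum of squared residuals
n = 10                   # number of observations
sum_x_squared = 322_000  # sum of X_i^2
s_xx = 33_000            # sum of squared deviations of X from its mean

# Estimated error variance, with n - 2 degrees of freedom.
sigma2_hat = rss / (n - 2)

# Variances and standard errors of the intercept and slope estimators.
var_b1 = sum_x_squared / (n * s_xx) * sigma2_hat
var_b2 = sigma2_hat / s_xx
se_b1 = math.sqrt(var_b1)
se_b2 = math.sqrt(var_b2)

print(round(sigma2_hat, 2), round(var_b1, 2), round(se_b1, 2))
print(round(var_b2, 4), round(se_b2, 3))
```

Rounded to the precision used on the slides, the standard errors match the values reported under the estimated regression line.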
Example

The estimated slope coefficient is 0.509 and its standard error (standard deviation) is 0.036.
This is a measure of how much \(\hat{\beta}_2\) varies from sample to sample.
We can say our computed \(\hat{\beta}_2\) lies within a certain number of standard deviations from the true \(\beta_2\).

11. Hypothesis
Testing
Hypothesis Testing

Set up the null hypothesis that our parameter values are not significantly different from zero
\(H_0: \beta_2 = 0\)
What does this mean?

Income has no effect on spending.
So set up this null hypothesis and see if it can be rejected.

Hypothesis Testing

In problem 5.3, \(\hat{\beta}_2 = 1.25\)

This is different from zero, but it is derived from just one sample.
If we took another sample we might get +0.509, and from a third sample we might get 0.
In other words, how do we know that this is significantly different from zero?

Hypothesis Testing

\[ \hat{\beta}_2 \sim N(B_2, \sigma^2_{\hat{\beta}_2}) \]
We can test either by the confidence interval approach or by the test of significance approach.
\(\hat{\beta}_2\) follows the normal distribution with mean and variance as above:
\[ Z = \frac{\hat{\beta}_2 - B_2}{\sigma / \sqrt{\sum x_i^2}} \sim N(0, 1) \]

Hypothesis Testing

However, we do not know the true variance \(\sigma^2\).
We can estimate \(\sigma^2\).
Then we have:
\[ \frac{\hat{\beta}_2 - B_2}{\hat{\sigma} / \sqrt{\sum x_i^2}} \sim t_{n-2}, \quad \text{where } \hat{\sigma} = \sqrt{\frac{\sum e_i^2}{n-2}} \]
More generally: \((\hat{\beta}_2 - B_2) / se(\hat{\beta}_2)\)

Problem 5.3 Example

\(\hat{\beta}_2 / se(\hat{\beta}_2) = 1.25/0.039 = 31.793 \sim t(n-2)\)

At 95% with 7 df, the critical value is t = 2.365, so reject the null.
We could also do a one-tailed test.
Set up the alternative hypothesis that \(\beta_2 > 0\).
We also reject the null, since the critical value is t = 1.895 for the one-tailed test.

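The test-of-significance steps can be sketched in Python. The estimate, standard error, and critical values below are the rounded figures quoted on the slides; the helper `t_statistic` is an illustrative name. With these rounded inputs the ratio comes out close to, but not exactly equal to, the value printed on the slide, which presumably used unrounded inputs.

```python
# Figures quoted in the Problem 5.3 example.
beta2_hat = 1.25          # estimated slope
se_beta2 = 0.039          # its standard error
t_crit_two_tail = 2.365   # 5% two-tailed critical value, 7 df (from the slide)
t_crit_one_tail = 1.895   # 5% one-tailed critical value, 7 df (from the slide)

def t_statistic(estimate, hypothesized, std_error):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error

# H0: B2 = 0 against a two-sided and a one-sided (B2 > 0) alternative.
t_zero = t_statistic(beta2_hat, 0.0, se_beta2)
reject_two_tail = abs(t_zero) > t_crit_two_tail
reject_one_tail = t_zero > t_crit_one_tail

# H0: B2 = 1, the alternative null considered on the following slide.
t_one = t_statistic(beta2_hat, 1.0, se_beta2)
reject_b2_equals_one = abs(t_one) > t_crit_two_tail

print(round(t_zero, 2), reject_two_tail, reject_one_tail)
print(round(t_one, 1), reject_b2_equals_one)
```

Passing the hypothesized value as an argument keeps the same function usable for nulls other than zero.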
Problem 5.3 Example


Most of the time, we assume a null
that the parameter value = 0.
There are occasions where we
may want to set up a different null
hypothesis.
In the Fisher example, we set up the hypothesis that B2 = 1.
So now (1.25 - 1)/se = 0.25/0.039 = 6.4, so the difference from 1 is significant.

Confidence Interval
Approach

\[ P(-2.365 \le t \le 2.365) = 0.95 \]
\[ P\left(-2.365 \le \frac{\hat{\beta}_2 - B_2}{\hat{\sigma} / \sqrt{\sum x_i^2}} \le 2.365\right) = 0.95 \]
\[ P\left(\hat{\beta}_2 - \frac{2.365\,\hat{\sigma}}{\sqrt{\sum x_i^2}} \le B_2 \le \hat{\beta}_2 + \frac{2.365\,\hat{\sigma}}{\sqrt{\sum x_i^2}}\right) = 0.95 \]
\[ P(\hat{\beta}_2 - 2.365\,se(\hat{\beta}_2) \le B_2 \le \hat{\beta}_2 + 2.365\,se(\hat{\beta}_2)) = 0.95 \]
\[ P(1.25 - 2.365(0.039) \le B_2 \le 1.25 + 2.365(0.039)) = 0.95 \]
\[ P(1.158 \le B_2 \le 1.342) = 0.95 \]
B2 = 0 and B2 = 1 do not lie in this interval

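The interval can be recomputed from the quoted figures. This Python sketch uses the slope estimate, standard error, and critical value stated on the slides; the helper `in_interval` is an illustrative name.

```python
# Figures quoted in the Problem 5.3 example.
beta2_hat = 1.25   # estimated slope
se_beta2 = 0.039   # its standard error
t_crit = 2.365     # 5% two-tailed critical value, 7 df (from the slide)

# 95% confidence interval: estimate plus or minus t_crit standard errors.
margin = t_crit * se_beta2
lower = beta2_hat - margin
upper = beta2_hat + margin

def in_interval(value):
    """True if a hypothesized B2 lies inside the confidence interval."""
    return lower <= value <= upper

print(round(lower, 3), round(upper, 3))
print(in_interval(0.0), in_interval(1.0))
```

A hypothesized value that falls outside the interval is rejected at the 5% level, which agrees with the conclusion of the t tests.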
12. Coefficient of
Determination, \(R^2\)

Coefficient of
Determination

The coefficient of determination, \(R^2\), measures the overall goodness of fit of the regression line.
\[ e_i = Y_i - \hat{Y}_i \]
\[ Y_i = \hat{Y}_i + e_i \]
Alternatively,
\[ (Y_i - \bar{Y}) = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) \]
variation in Y from its mean value = variation in Y explained by X around its mean + unexplained variation

Coefficient of
Determination

Using lower case letters to represent deviations from means:
\[ y_i = \hat{y}_i + e_i \]
Now square and sum (the cross-product term \(\sum \hat{y}_i e_i\) is zero for OLS):
\[ \sum y_i^2 = \sum \hat{y}_i^2 + \sum e_i^2 \]
\[ TSS = ESS + RSS \]
The total variation in the observed Y values about their mean is partitioned into two parts, one attributable to the regression line and the other to random forces.

Coefficient of
Determination


If the sample regression line fits the data well, ESS should be much larger than RSS.
The coefficient of determination is \(R^2 = ESS/TSS\)

It measures the proportion or percentage of the total variation in Y explained by the regression model.

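The decomposition TSS = ESS + RSS and the resulting R² can be verified numerically. The Python sketch below fits a two-variable regression by least squares and computes the three sums of squares. The ten (X, Y) pairs are an assumed data set, not given on the slides; they were chosen because they yield a residual sum of squares close to the 337.3 quoted in the earlier worked example.

```python
# Assumed illustrative data (not taken from the slides).
x = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b1 = y_bar - b2 * x_bar
y_hat = [b1 + b2 * xi for xi in x]

# Total, explained, and residual sums of squares.
tss = sum((yi - y_bar) ** 2 for yi in y)
ess = sum((yh - y_bar) ** 2 for yh in y_hat)
rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

r_squared = ess / tss
print(round(tss, 2), round(ess, 2), round(rss, 2), round(r_squared, 4))
```

Computing R² as 1 - RSS/TSS gives the same number whenever the model includes an intercept, because the decomposition then holds exactly.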
Correlation Coefficient

In the two-variable model, the correlation coefficient r is the square root of \(R^2\), taking the sign of the slope.
The correlation coefficient measures the strength of the linear relationship between two variables.
However, in a multivariate context, R has little meaning.

13. Forecasting
Forecasting

Suppose we want to predict out of sample and know the relation between CPI and S&P (Problem 5.2).
We have data to 1989 and want to predict 1990 stock prices.
We expect inflation in 1990 to be 10%, so CPI is 124 + 12.4 = 136.4
Ŷ = -195.08 + 3.82 CPI
The estimated Y for 1990 is 325.97 = -195.08 + 3.82(136.4)

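The point forecast is a direct substitution into the estimated line. This Python sketch uses the coefficients, the 1989 CPI level, and the expected inflation rate stated on the slide; the variable names are illustrative.

```python
# Estimated relation from the slide: Y-hat = -195.08 + 3.82 * CPI
intercept = -195.08
slope = 3.82

cpi_1989 = 124.0           # last observed CPI level
expected_inflation = 0.10  # expected inflation for 1990

# Projected CPI for 1990, then the fitted value at that CPI.
cpi_1990 = cpi_1989 * (1 + expected_inflation)
forecast_1990 = intercept + slope * cpi_1990

print(round(cpi_1990, 1), round(forecast_1990, 2))
```

The forecast depends on the assumed inflation rate as well as on the estimated coefficients, so an error in either carries through to the prediction.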
Forecasting

There will be some error to this
forecast - prediction error.
This has quite a complicated
formula.
 This error increases as we get
further away from the sample
mean.
 Hence, we cannot forecast very far
out of sample with a great deal of
certainty.
