E. Kusideł EKONOMETRIA

Data and correlation
Demand Analysis
Conditions for applying statistical
methods to demand analysis
1. Economic phenomena must be quantified

Most of them are naturally quantified: prices,
income, GDP, level of demand, production, etc.
Some of them have no natural measure:
properties of goods such as quality, smell, or color.
Conditions for applying statistical methods
2. Statistical data are accessible and
the series are long enough

Sometimes we know that statistical data
exist but we have no access to them.
Formally, the number of observations must be
larger than the number of estimated
parameters; in practice, the number of
observations must be much larger than
the number of estimated parameters.
Kinds of statistical data
Time series data (TSD):
x_t for t=1,2,…,T.
Cross-section or spatial data (CSD):
x_i for i=1,2,…,N.
Panel data, or cross-section time-series data (PD):
x_it for i=1,2,…,N; t=1,2,…,T.
An example of time series data: x_t, t=1,2,…,T.
Number of employees in Poland in the period
1st quarter 1995 to 4th quarter 2006
[Figure: line chart of the number of employees (vertical axis, 7000 to 9500) by quarter, quarters 1 to 4 of each year from 1995 to 2006 (horizontal axis).]
t=1,2,…,48 (12 years with 4 quarters in every year: 4x12=48 observations).
An example of cross-section data: x_i, i=1,2,…,N.
Number of employees in the 4th quarter of 2006 in 16 regions of Poland
[Figure: bar chart of the number of employees (vertical axis, 0 to 2500) by region (horizontal axis): dolnośląskie, kujawsko-pomorskie, lubelskie, lubuskie, łódzkie, małopolskie, mazowieckie, opolskie, podkarpackie, podlaskie, pomorskie, śląskie, świętokrzyskie, warmińsko-mazurskie, wielkopolskie, zachodniopomorskie.]
i=1,…,16 (number of regions in Poland).
An example of panel data:
x_it, i=1,2,…,N; t=1,2,…,T.
Number of employees in 6 regions of Poland measured in 4 quarters of 2006
[Figure: grouped bar chart of the number of employees per region, with one bar for each quarter (1Q2006, 2Q2006, 3Q2006, 4Q2006). Region labels recovered from the horizontal axis: dolnośląskie, kujawsko-pomorskie, lubelskie, lubuskie, łódzkie, małopolskie, mazowieckie.]
i=1,…,6, t=1,…,4. Number of observations: 4x6=24.
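The three kinds of data differ only in how observations are indexed. A minimal Python sketch of the three index sets used in the examples above follows. It is an illustration, not part of the original slides: the names `ts_index`, `cs_index` and `panel_index` are hypothetical, regions are represented only by their index i, and no data values are included.

```python
from itertools import product

# Time series: one observation per quarter, from 1Q1995 to 4Q2006.
ts_index = [f"{q}Q{year}" for year in range(1995, 2007) for q in range(1, 5)]
T = len(ts_index)

# Cross-section: one observation per region, i = 1..16, at a single date.
cs_index = list(range(1, 17))
N = len(cs_index)

# Panel: one observation per (region, quarter) pair,
# here 6 regions and the 4 quarters of 2006.
panel_regions = range(1, 7)
panel_quarters = [f"{q}Q2006" for q in range(1, 5)]
panel_index = list(product(panel_regions, panel_quarters))

print(T, N, len(panel_index))  # prints: 48 16 24
```

The panel index is the Cartesian product of the region index and the time index, which is why the number of observations is N times T.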
How to measure relationships between
economic phenomena expressed in
series of data?
1. Correlation coefficient
2. Elasticities
3. Regression analysis
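Formula 1
Correlation coefficient
between two variables x and y
\[
r_{xy} = \frac{\operatorname{cov}(x, y)}{s_x s_y}
       = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]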
n - sample size,
cov(x,y) - covariance between x and y,
s_x, s_y - standard deviations of the variables x and y.
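Formula 1 can be evaluated step by step. The sketch below is a minimal plain-Python version, assuming two lists of equal length; the function name `correlation` is hypothetical. It is applied to the price and demand values listed in Example 1 of these slides, with decimal commas written as decimal points.

```python
from math import sqrt


def correlation(xs, ys):
    """Correlation coefficient r_xy computed as in Formula 1."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Covariance: mean product of the deviations from the means.
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    # Standard deviations with the 1/n convention used in Formula 1.
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
    return cov_xy / (s_x * s_y)


# Price and demand (sale) from Example 1.
price = [2.40, 2.30, 2.20, 2.20, 2.10, 2.00, 1.90, 1.80, 1.70, 1.60]
demand = [136, 151, 157, 157, 185, 199, 235, 246, 271, 294]

r = correlation(price, demand)
print(round(r, 2))  # negative and close to -1
```

The function fails with a division by zero when either variable is constant, because the corresponding standard deviation is then zero.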
Properties of the correlation
coefficient r_xy
r_xy is a measure that varies
between -1 and 1.
The absolute value of r_xy measures the strength of the
relationship.
The sign of r_xy indicates the direction of the
relationship.
Sign of the correlation coefficient
r_xy < 0: negative correlation (if x grows
then y falls, or if x falls then y grows)
r_xy > 0: positive correlation (if x grows
then y grows, or if x falls then y falls)
Strength of the correlation coefficient
The absolute value of r_xy, which lies between 0 and
1, indicates the strength of the relationship.
r_xy = 0: the variables are uncorrelated (no
linear relationship), e.g. the demand
for women's handbags is not correlated with the demand
for computers.
|r_xy| = 1: the strongest possible relationship between two
variables; values close to 1 indicate a very strong
relationship, e.g. we can expect a strong
relationship between the demand for computers
and the demand for computer screens.
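The two readings of the coefficient, direction from the sign and strength from the absolute value, can be separated in a few lines. This is a minimal sketch; the function name `describe_correlation` and the labels it returns are hypothetical, and no thresholds for "weak" or "strong" are assumed because the slides do not give any.

```python
def describe_correlation(r):
    """Split a correlation coefficient into (direction, strength)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient lies between -1 and 1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    # Strength is the absolute value: 0 means uncorrelated, 1 is the maximum.
    return direction, abs(r)


print(describe_correlation(-0.75))  # prints: ('negative', 0.75)
print(describe_correlation(0.0))    # prints: ('none', 0.0)
```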
Scatter diagram
The first step in most correlation
problems is to construct a graphic
picture of the relationship between the
two variables. Such a picture is best
provided by a so-called scatter
diagram.
Example 1.
Price and demand for a product

price   demand (sale)
2.40    136
2.30    151
2.20    157
2.20    157
2.10    185
2.00    199
1.90    235
1.80    246
1.70    271
1.60    294

[Figure: two line charts, priceA (vertical axis 0.00 to 40.00) and demandA (vertical axis 0 to 250), each plotted against observations 1 to 10.]
Scatter diagram:
graph of the relationship between
price and demand for product A
[Figure: scatter diagram with price on the horizontal axis (1.50 to 2.50) and demand (sale) on the vertical axis (100 to 300).]
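A scatter diagram places one point per observation, with one variable on each axis. The sketch below draws a rough text version using only the Python standard library, so it is an approximation of a real chart rather than a replacement for one. The function name `ascii_scatter` and the grid size are assumptions; the data are the price and demand values from Example 1, and the function assumes that neither variable is constant.

```python
def ascii_scatter(xs, ys, width=40, height=12):
    """Return a text scatter diagram: x runs left to right, y bottom to top."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Rescale each coordinate to a column / row of the character grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line).rstrip() for line in grid)


# Price and demand (sale) from Example 1.
price = [2.40, 2.30, 2.20, 2.20, 2.10, 2.00, 1.90, 1.80, 1.70, 1.60]
demand = [136, 151, 157, 157, 185, 199, 235, 246, 271, 294]

diagram = ascii_scatter(price, demand)
print(diagram)
```

The points run from the top left (low price, high demand) to the bottom right (high price, low demand), which is the falling pattern associated with a negative correlation.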
Shape of the scatter diagram
Sign of the correlation coefficient (CC)
- Rising line: positive correlation
- Falling line: negative correlation
Strength of the CC
- The closer the points lie to a straight line,
the stronger the correlation.
- The rounder the cloud of points,
the weaker the correlation.
- Straight line, but perpendicular or parallel
to an axis: no correlation
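The limiting shapes can be checked numerically. In the sketch below, the three small data sets are hypothetical and chosen only to produce a rising line, a falling line, and a line parallel to the x axis. The helper `correlation_or_none` is also hypothetical; it uses the same ratio as Formula 1 (the 1/n factors cancel) and returns `None` when one variable is constant, since the denominator is then zero and no coefficient can be computed.

```python
from math import isclose, sqrt


def correlation_or_none(xs, ys):
    """r_xy as in Formula 1, or None when x or y does not vary."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        return None  # line parallel to an axis: coefficient undefined
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical illustrative points, not data from the slides.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # points on a rising straight line
falling = [10, 8, 6, 4, 2]  # points on a falling straight line
flat = [3, 3, 3, 3, 3]      # points on a line parallel to the x axis

r_rising = correlation_or_none(x, rising)
r_falling = correlation_or_none(x, falling)
r_flat = correlation_or_none(x, flat)
print(r_rising, r_falling, r_flat)
```

A perfectly straight rising or falling line gives a coefficient of 1 or -1, while for the flat line y carries no variation that could move together with x, which matches reading such a diagram as showing no correlation.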
What does this scatter diagram say?
[Figure: scatter diagram with price on the horizontal axis (1.50 to 2.50) and demand (sale) on the vertical axis (100 to 300).]
Draw a scatter diagram for priceB and demandB
and for income and demandA
(data from demandAB.xls)

priceA  demandA  wages  priceB  demandB
33.40   127      1001   2.27    136
32.30   134      1200   3.02    151
28.20   139      1355   1.11    157
27.10   135      1406   1.60    157
26.00   153      1800   4.03    185
25.90   170      2106   1.87    199
24.80   191      2653   0.63    235
23.70   197      3009   3.37    246
22.60   187      3504   1.44    271
21.50   195      4000   1.69    294

And answer the following questions:
1. Is this a case of positive or negative correlation?
2. Is the correlation stronger than in the case of product A?
Scatter diagram for price and demand
for product B
[Figure: scatter diagram of demandB (vertical axis, 0 to 400) against priceB (horizontal axis, 0.00 to 5.00).]
Homework
Calculate the correlation coefficient
between price and demand for product B
using Formula 1.
Scatter diagram for price and
demand for product B
[Figure: scatter diagram of demandB (vertical axis, 0 to 400) against priceB (horizontal axis, 0.00 to 5.00).]
1. Is this a case of positive or negative correlation?
2. Is the correlation stronger than in the case of product A?