Data and correlation
E. Kusideł, Demand Analysis

Conditions for applying statistical methods to demand analysis

1. Economic phenomena must be quantified.
Most of them are quantified naturally: prices, income, GDP, level of demand, production, etc.
Some of them have no natural measure: properties of goods such as quality, smell, or color.

2. Statistical data must be accessible and the series must be long enough.
Sometimes we know that statistical data exist, but we have no access to them.
Formally, the number of observations must be larger than the number of estimated parameters; in practice, it must be much larger.

Kinds of statistical data
Time series data (TSD): x_t for t = 1, 2, …, T.
Cross-section or spatial data (CSD): x_i for i = 1, 2, …, N.
Panel data, or cross-section time series data (PD): x_it for i = 1, 2, …, N; t = 1, 2, …, T.

An example of time series data: x_t, t = 1, 2, …, T.
Number of employees in Poland, 1st quarter 1995 to 4th quarter 2006.
[Figure: line chart; horizontal axis: quarters 1 to 4 of each year from 1995 to 2006; vertical axis: 7000 to 9500.]
t = 1, 2, …, 48 (12 years with 4 quarters in each year: 4 x 12 = 48 observations).

An example of cross-section data: x_i, i = 1, 2, …, N.
Number of employees in the 4th quarter of 2006 in the 16 regions of Poland.
[Figure: bar chart; horizontal axis: regions (voivodeships) of Poland; vertical axis: 0 to 2500.]
i = 1, …, 16 (number of regions in Poland).

An example of panel data: x_it, i = 1, 2, …, N; t = 1, 2, …, T.
Number of employees in 6 regions of Poland measured in the 4 quarters of 2006.
[Figure: bar chart with data labels; one series per quarter (1Q2006 to 4Q2006), grouped by region (voivodeship).]
i = 1, …, 6; t = 1, …, 4. Number of observations: 4 x 6 = 24.

How to measure relationships between economic phenomena expressed in series of data?
1. Correlation coefficient
2. Elasticities
3. Regression analysis

Formula 1
Correlation coefficient between two variables x and y:

\[
r_{xy} = \frac{\operatorname{cov}(x, y)}{s_x s_y}
       = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]

where n is the sample size, cov(x, y) is the covariance between x and y, and s_x, s_y are the standard deviations of x and y.

Properties of the correlation coefficient r_xy
r_xy is a measure that takes values between -1 and 1.
The absolute value of r_xy measures the strength of the relationship.
The sign of r_xy indicates the direction of the relationship.

Sign of the correlation coefficient
r_xy < 0: negative correlation (if x grows then y falls, or if x falls then y grows).
r_xy > 0: positive correlation (if x grows then y grows, or if x falls then y falls).

Strength of the correlation coefficient
The absolute value of r_xy, which lies between 0 and 1, indicates the strength of the relationship.
|r_xy| = 0: the variables are not linearly related (no correlation), e.g. demand for women's handbags is not correlated with demand for computers.
|r_xy| = 1: the strongest possible linear relationship; values close to 1 indicate a very strong relationship, e.g. we can expect a strong relationship between demand for computers and demand for computer screens.

Scatter diagram
The first step in most correlation problems is to construct a graphic picture of the relationship between the two variables. Such a picture is best provided by a so-called scatter diagram.

Example 1.
Price and demand for a product

price   demand (sale)
2.40    136
2.30    151
2.20    157
2.20    157
2.10    185
2.00    199
1.90    235
1.80    246
1.70    271
1.60    294

[Figure: two charts, priceA and demandA, each plotted against observation number 1 to 10.]

Scatter diagram: relationship between price and demand for product A
[Figure: scatter diagram; horizontal axis: price, 1.50 to 2.50; vertical axis: demand (sale), 100 to 300.]

Shape of the scatter diagram
Sign of the correlation coefficient (CC)
Rising line: positive correlation.
Falling line: negative correlation.
Strength of the CC
The closer the points lie to a straight line, the stronger the correlation.
The rounder the cloud of points, the weaker the correlation.
Points along a straight line that is perpendicular or parallel to an axis: no correlation.

What does this scatter diagram say?
[Figure: scatter diagram; horizontal axis: price, 1.50 to 2.50; vertical axis: demand (sale), 100 to 300.]

Draw a scatter diagram for priceB and demandB, and for income (the wages column) and demandA (data from demandAB.xls).

priceA  demandA  wages  priceB  demandB
33.40   127      1001   2.27    136
32.30   134      1200   3.02    151
28.20   139      1355   1.11    157
27.10   135      1406   1.60    157
26.00   153      1800   4.03    185
25.90   170      2106   1.87    199
24.80   191      2653   0.63    235
23.70   197      3009   3.37    246
22.60   187      3504   1.44    271
21.50   195      4000   1.69    294

Then answer the following questions:
1. Is this a case of positive or negative correlation?
2. Is the correlation stronger than in the case of product A?

Scatter diagram for price and demand for product B
[Figure: scatter diagram; horizontal axis: priceB, 0.00 to 5.00; vertical axis: demandB, 0 to 400.]

Homework
Calculate the correlation coefficient between price and demand for product B using Formula 1.
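As an illustration of how Formula 1 works in practice, the sketch below applies it to the price and demand data for product A from Example 1. The function name `correlation` and the variable names are assumptions made for this illustration; they do not come from the slides. The code follows Formula 1 term by term, using 1/n in both the covariance and the standard deviations.

```python
from math import sqrt


def correlation(x, y):
    """Correlation coefficient r_xy, computed term by term as in Formula 1."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # cov(x, y): average product of deviations from the means
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    # s_x and s_y: standard deviations with the same 1/n weighting
    s_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)
    s_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / n)
    return cov_xy / (s_x * s_y)


# Data from Example 1 (product A): price and demand (sale)
price_a = [2.40, 2.30, 2.20, 2.20, 2.10, 2.00, 1.90, 1.80, 1.70, 1.60]
demand_a = [136, 151, 157, 157, 185, 199, 235, 246, 271, 294]

r_a = correlation(price_a, demand_a)
print(f"r_xy for product A: {r_a:.3f}")
```

For product A the result is negative and close to -1, which agrees with the falling, nearly straight scatter diagram. The same function can be called with the priceB and demandB columns to check a hand calculation for product B.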