Some Essentials of Probability
Introduction to Econometrics
Spring 2012
Ken Simons

Key Concepts from Probability
1. Probabilities and distributions
2. Expected values; variances, moments
3. Multiple random variables
4. The normal and other distributions
5. Sample averages
6. Large sample distribution

1. Random Variables
• Random variables vary randomly
  – Multiple outcomes possible
    • Outcomes are finest-grain (indivisible)
  – An event is one or more outcomes
  – The sample space is all possible outcomes
• Probability of each outcome is its chance of happening (with many trials, the proportion of times it occurs)
• Types of random variables:
  – Discrete: e.g. Heads, Tails; or 0, 1, 2, 3, …
  – Continuous: any value in a range, e.g. [0,1], any number from 0 to 1

Probability Distributions
• For a discrete random variable,
  – Probability distribution gives the probability of each outcome
    • E.g.,
      \[ p(x) = \begin{cases} 0.2 & \text{if } x = 0 \\ 0.8 & \text{if } x = 1 \end{cases} \]
      – X is called a Bernoulli random variable: outcomes are 0 and 1
      – p(x) is the Bernoulli distribution: tells the probability of each outcome
    • Or, a table listing all outcomes and their probabilities
    • Note that the probabilities add to 1
  – Cumulative distribution function (c.d.f.) is the probability that X ≤ x

Probability Density Functions
• For a continuous random variable,
  – Probability density function (p.d.f.) gives the density at each value; the probability of a range of values is the area under the p.d.f. over that range
    • E.g.,
      \[ f(x) = \frac{1}{\sigma_X \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu_X}{\sigma_X} \right)^2 \right] \]
      – X is called a normal random variable
      – ƒ(x) is the p.d.f. of the normal distribution
    • Total probability is 1, because $\int_{-\infty}^{\infty} f(x)\,dx = 1$
  – Cumulative distribution function (c.d.f.) is the probability that X ≤ x

The next two slides each present one half of Figure 2.2.

2. Moments
• $E(X)$: expected value or mean of X is the average outcome (in infinitely many trials)
• $\sigma_X^2 \equiv \mathrm{Var}(X)$: variance of X is $E[(X - E(X))^2]$; a measure of how much X varies
• $\sigma_X$: standard deviation of X is $\sqrt{\mathrm{Var}(X)}$; a "typical" deviation from the mean
More generally:
• $E(X^r)$: r-th moment of X is the average value of $X^r$
• $E[(X - E(X))^r]$: r-th central moment of X, the average value of $(X - E(X))^r$

Discrete r.v.                                          Continuous r.v.
$E(X) = \sum_{i=1}^{k} x_i p_i$                        $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
$\sigma_X^2 = \sum_{i=1}^{k} (x_i - E(X))^2 p_i$       $\sigma_X^2 = \int_{-\infty}^{\infty} (x - E(X))^2 f(x)\,dx$
$\sigma_X = \sqrt{\mathrm{Var}(X)}$                    $\sigma_X = \sqrt{\mathrm{Var}(X)}$
$E(X^2) = \sum_{i=1}^{k} x_i^2 p_i$                    $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$

If Y = a + bX, where a and b are constants, then $E(Y) = a + bE(X)$ and $\sigma_Y^2 = b^2 \sigma_X^2$.

If X = winnings from: flip a coin, get $4 if heads, $0 if tails.
What are: $E(X)$, $\mathrm{Var}(X)$, $\sigma_X$, $E(X^2)$, $E[(X - E(X))^2]$?
Another way to compute $\sigma_X^2$: $\sigma_X^2 = E(X^2) - E(X)^2$

3. Multiple Random Variables
• Joint probability distribution gives the probabilities for outcomes of 2 or more random variables together
  – E.g., temperature (X) and precipitation (Y)
    • E.g., probabilities:

                       Y = 0 (Dry)    Y = 1 (Snow/Rain)
      X = 0 (Warm)     0.2            0.1
      X = 1 (Cold)     0.4            0.3

  – Marginal probability distribution is then another name for the probability distribution of one of the variables
    • E.g., Pr(X=0) = 0.3, Pr(X=1) = 0.7
    • In general: $\Pr(Y = y) = \sum_{i=1}^{l} \Pr(X = x_i, Y = y)$
• Continuous
  • As discrete, but (a) substitute "p.d.f." for "probability distribution," and (b) integrate instead of summing: $f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx$

Conditional distributions
• With 2 random variables, X and Y
  – If we know X takes on a particular value, x, but don't know the value of Y
    • Conditional distribution of Y given X=x then tells the probability of each outcome for Y
      – $\Pr(Y = y \mid X = x) = \dfrac{\Pr(X = x, Y = y)}{\Pr(X = x)}$
    • Conditional expectation (or variance, etc.) of Y given X=x then tells the expectation (etc.) for Y
      – E.g., $E(Y \mid X = x) = \sum_{i=1}^{k} y_i \Pr(Y = y_i \mid X = x)$
  – Example:

                       Y = 0 (Dry)    Y = 1 (Snow/Rain)
      X = 0 (Warm)     0.2            0.1
      X = 1 (Cold)     0.4            0.3

    What are: Pr(Y=0 | X=0), Pr(Y=1 | X=0), E(Y | X=0), Var(Y | X=0), $\sigma_{Y|X=0}$, $E(Y^2 \mid X=0)$?

Independence
• Independent random variables:
  – Value taken by one variable is unrelated to the value taken by another variable
  – Mathematically, for all x and y:
    • Pr(Y=y | X=x) = Pr(Y=y), or
    • Pr(X=x, Y=y) = Pr(X=x) × Pr(Y=y)

Covariance and Correlation
Covariance: a measure of how much 2 variables vary together
\[ \mathrm{Cov}(X,Y) = \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] \quad \text{where } \mu_X = E(X),\ \mu_Y = E(Y) \]
\[ = \sum_{i=1}^{k} \sum_{j=1}^{l} (x_j - \mu_X)(y_i - \mu_Y) \Pr(X = x_j, Y = y_i) \]
Correlation: a similar, but unit-free, measure
\[ \mathrm{corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} \]
This is a number from -1 to +1.
X and Y are uncorrelated if corr(X,Y) = 0.
If X and Y are independent, then Cov(X,Y) = 0 and corr(X,Y) = 0.

[Figure: scatter plot, Investment % of GDP (CI) versus GDP (RGDPL), 1985, different nations. Estimated correlation: +0.50]
[Figure: scatter plot, Investment % of GDP (CI) versus Consumption % of GDP (CC), 1985. Estimated correlation: -0.53]
[Figure: scatter plot, Consumption % of GDP (CC) vs. Population (POP), 1985. Estimated correlation: -0.05]

Key Concept 2.3

4. The Normal and Other Distributions

Four PDFs We May Use & Their Relationships
Normal with mean $\mu$ and variance $\sigma^2$: $x \sim N(\mu, \sigma^2)$,
\[ f_x(x_0) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x_0 - \mu)^2 / (2\sigma^2)} \]
Standard normal has mean 0 and variance 1: $x \sim N(0,1)$
Standardization: if $x \sim N(\mu, \sigma^2)$, then $\dfrac{x - \mu}{\sigma} \sim N(0,1)$
Multiple independent standard normal random variables: $x \sim N(0, I_n)$
Chi-square with m degrees of freedom: $x \sim \chi^2(m)$,
\[ f_x(x_0) = \begin{cases} \left[\Gamma(m/2)\right]^{-1} 2^{-m/2}\, x_0^{(m/2)-1} e^{-x_0/2} & \text{if } x_0 > 0 \\ 0 & \text{otherwise} \end{cases} \]
where $\Gamma(m/2) = (m/2 - 1)!$ when m is even. Note E(x) = m, Var(x) = 2m.
Student's t with m degrees of freedom: $x \sim t(m)$,
\[ f_x(x_0) = \frac{\Gamma\left(\frac{m+1}{2}\right)}{(m\pi)^{1/2}\, \Gamma(m/2)} \left( 1 + \frac{x_0^2}{m} \right)^{-(m+1)/2} \]
F with m and n degrees of freedom: $x \sim F(m,n)$, for $x_0 > 0$,
\[ f_x(x_0) = \frac{\Gamma[(m+n)/2]\, m^{m/2} n^{n/2}\, x_0^{(m/2)-1}}{\Gamma(m/2)\, \Gamma(n/2)\, (m x_0 + n)^{(m+n)/2}} \]

[Figure: Normal Probability Density Function for a Single Variable]

Relationships between the Four PDFs
Weighted sum of independent normal random variables is normal:
  $ax + by \sim N(a\mu_x + b\mu_y,\ a^2\sigma_x^2 + b^2\sigma_y^2)$ if $x \sim N(\mu_x, \sigma_x^2)$ and $y \sim N(\mu_y, \sigma_y^2)$ are independent
  (remember, covariance is 0 from independence)
Sum of squares of m independent standard normal random variables is $\chi^2(m)$:
  $x^2 + y^2 + z^2 + \cdots \sim \chi^2(m)$ if x, y, z, ... are independent and standard normal
If $x \sim N(0,1)$ and $y \sim \chi^2(m)$, with x and y independent: $\dfrac{x}{\sqrt{y/m}} \sim t(m)$
If $x \sim \chi^2(m)$ and $y \sim \chi^2(n)$, with x and y independent: $\dfrac{x/m}{y/n} \sim F(m,n)$

5. Sample Averages
Collect data from n entities, $Y_1, Y_2, \ldots, Y_n$.
Suppose $Y_1, Y_2, \ldots, Y_n$ are independent and that they all have the same distribution (called independent and identically distributed, i.i.d.).
Sample average:
\[ \bar{Y} = \frac{1}{n}(Y_1 + Y_2 + \cdots + Y_n) = \frac{1}{n} \sum_{i=1}^{n} Y_i \]
Since each $Y_i$ is a random number, so is $\bar{Y}$.
Questions: What is $E(\bar{Y})$? What is $\mathrm{Var}(\bar{Y})$? What is $\sigma_{\bar{Y}}$?
If $Y_i \sim N(\mu_Y, \sigma_Y^2)$, what precisely is the pdf of $\bar{Y}$?
Hints:
• Use the relations in Key Concept 2.3.
• It might help to think of n = 2, so $\bar{Y} = \frac{1}{2}(Y_1 + Y_2)$, first.
• Recall that the weighted sum of independent normal random variables is normal.

Answers
\[ \bar{Y} = \frac{1}{n}(Y_1 + Y_2 + \cdots + Y_n) = \frac{1}{n} \sum_{i=1}^{n} Y_i \]
\[ E(\bar{Y}) = \frac{1}{n} \sum_{i=1}^{n} E(Y_i) = \frac{1}{n} \sum_{i=1}^{n} \mu_Y = \mu_Y \quad \text{where } \mu_Y = E(Y_i) \text{ for all } i \]
\[ \mathrm{Var}(\bar{Y}) = \mathrm{Var}\left( \frac{1}{n} \sum_{i=1}^{n} Y_i \right) = \mathrm{Var}\left( \sum_{i=1}^{n} \frac{1}{n} Y_i \right) = \mathrm{Var}\left( \frac{1}{n} Y_1 + \frac{1}{n} Y_2 + \cdots \right) \]
\[ = \frac{1}{n^2} \mathrm{Var}(Y_1) + \frac{1}{n^2} \mathrm{Var}(Y_2) + \cdots + \frac{2}{n^2} \mathrm{Cov}(Y_1, Y_2) + \cdots \quad \text{(Use what formulas in Key Concept 2.3?)} \]
\[ = \frac{1}{n^2} \mathrm{Var}(Y_1) + \frac{1}{n^2} \mathrm{Var}(Y_2) + \cdots + 0 \quad \text{(Covariances are zero for independent r.v.'s)} \]
\[ = \frac{1}{n^2} \sigma_Y^2 + \frac{1}{n^2} \sigma_Y^2 + \cdots \quad \text{where } \sigma_Y^2 = \mathrm{Var}(Y_i) \text{ for all } i \]
\[ = n \left( \frac{1}{n^2} \sigma_Y^2 \right) = \frac{\sigma_Y^2}{n} \]
\[ \sigma_{\bar{Y}} = \sqrt{\sigma_Y^2 / n} = \frac{\sigma_Y}{\sqrt{n}} \]
If $Y_i \sim N(\mu_Y, \sigma_Y^2)$, then since the sum of independent normal random variables is a normal random variable, $\bar{Y}$ is a normal random variable. The mean of $\bar{Y}$ is $\mu_Y$, and the variance of $\bar{Y}$ is $\sigma_Y^2 / n$. Hence: $\bar{Y} \sim N(\mu_Y, \sigma_Y^2 / n)$.

6. Large Sample Distribution
What if the $Y_i$ variables are not normally distributed?
Just assume they are collected by random sampling (this ensures they are i.i.d.) and have a finite variance.
Law of Large Numbers: as $n \to \infty$, $\bar{Y}$ converges in probability to the mean $\mu_Y$. I.e., the probability that $\bar{Y}$ differs from $\mu_Y$ by more than any fixed amount goes to zero. To be precise:
\[ \lim_{n \to \infty} \Pr(|\bar{Y} - \mu_Y| > \varepsilon) = 0 \quad \text{for every } \varepsilon > 0. \]
Hence $\bar{Y}$ is called consistent for $\mu_Y$.
Central Limit Theorem: as $n \to \infty$, the distribution of $\bar{Y}$ becomes well approximated by the normal distribution $N(\mu_Y, \sigma_Y^2 / n)$. To be precise, $\sqrt{n}(\bar{Y} - \mu_Y)$ converges in distribution to $N(0, \sigma_Y^2)$.

Large Sample Distribution
• This central limit theorem is used repeatedly
• Tells typical dispersion in estimates we make
  – Like here, the estimates will be averages (albeit special types of averages)
• This works even if we don't know the true distributions of the variables being averaged
  – If we do know, we can improve our knowledge of the dispersion
  – Most texts focus on improved formulae assuming the true distributions are normal; our text avoids the assumption and considers large samples only

You have learned (or reviewed):
1. Probabilities and distributions
2. Expected values; variances, moments
3. Multiple random variables
4. Normal and other distributions we'll use
5. Sample averages & their distribution
6. Large sample distribution

What's Next
• Do assignment for chapter 2 ("assignment ch2")
  – Due Thursday Feb. 2nd
  – It's long, so start now
• Next Monday Jan. 30th, lab session using Stata
• Thursday Feb. 2nd, start chapter 3: statistics
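
As a numerical check on the coin-flip exercise in section 2 (Moments), the following Python sketch computes E(X), E(X²), Var(X) by both routes, and the standard deviation. It assumes a fair coin (probability 0.5 of heads), which the slide does not state explicitly. The helper name `moment` and the constants `a` and `b` are illustrative choices, not part of the course material.

```python
from math import sqrt, isclose

# Winnings from one coin flip: $4 if heads, $0 if tails.
# ASSUMPTION: the coin is fair, so each outcome has probability 0.5.
outcomes = [4.0, 0.0]
probs = [0.5, 0.5]


def moment(values, weights, r=1):
    """r-th moment E(X^r) of a discrete random variable."""
    return sum(p * x**r for x, p in zip(values, weights))


mean = moment(outcomes, probs)               # E(X)
second_moment = moment(outcomes, probs, 2)   # E(X^2)

# Variance from the definition E[(X - E(X))^2].
var_def = sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))
# Variance from the shortcut E(X^2) - E(X)^2.
var_short = second_moment - mean**2
sd = sqrt(var_def)

# Linear transformation Y = a + bX (a and b are hypothetical constants).
a, b = 10.0, 3.0
y_outcomes = [a + b * x for x in outcomes]
mean_y = moment(y_outcomes, probs)
var_y = sum(p * (y - mean_y) ** 2 for y, p in zip(y_outcomes, probs))

print("E(X) =", mean, " E(X^2) =", second_moment)
print("Var(X) =", var_def, " sd =", sd)
print("both variance formulas agree:", isclose(var_def, var_short))
print("Var(a + bX) equals b^2 Var(X):", isclose(var_y, b**2 * var_def))
```

Under the fair-coin assumption the two variance formulas give the same number, and the variance of a + bX does not depend on a.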
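
The temperature and precipitation table from section 3 (Multiple Random Variables) can also be worked through in code. The sketch below computes the marginal distributions, the conditional distribution of Y given X = 0, and the covariance and correlation, using only the four joint probabilities on the slide. The function and variable names are illustrative.

```python
from math import sqrt, isclose

# Joint probabilities Pr(X = x, Y = y) from the slide's table.
# X: 0 = warm, 1 = cold.  Y: 0 = dry, 1 = snow/rain.
joint = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.3}


def marginal(table, axis):
    """Sum the joint probabilities over the other variable."""
    out = {}
    for key, p in table.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


def conditional_y_given_x(table, x):
    """Pr(Y = y | X = x) = Pr(X = x, Y = y) / Pr(X = x)."""
    p_x = sum(p for (xi, _), p in table.items() if xi == x)
    return {y: p / p_x for (xi, y), p in table.items() if xi == x}


px = marginal(joint, 0)   # distribution of X
py = marginal(joint, 1)   # distribution of Y

cond = conditional_y_given_x(joint, 0)   # Y given warm weather
e_y_given_warm = sum(y * p for y, p in cond.items())
var_y_given_warm = sum((y - e_y_given_warm) ** 2 * p for y, p in cond.items())

mu_x = sum(x * p for x, p in px.items())
mu_y = sum(y * p for y, p in py.items())
var_x = sum((x - mu_x) ** 2 * p for x, p in px.items())
var_y = sum((y - mu_y) ** 2 * p for y, p in py.items())
cov_xy = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
corr_xy = cov_xy / sqrt(var_x * var_y)

# Independence requires Pr(X=x, Y=y) = Pr(X=x) * Pr(Y=y) in every cell.
independent = all(isclose(p, px[x] * py[y]) for (x, y), p in joint.items())

print("Pr(Y | X=0):", cond)
print("E(Y | X=0) =", e_y_given_warm, " Var(Y | X=0) =", var_y_given_warm)
print("Cov(X,Y) =", cov_xy, " corr(X,Y) =", corr_xy)
print("independent:", independent)
```

In this table the covariance is small but positive, so X and Y are not independent: the cell probabilities do not equal the products of the marginals.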
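
The results in sections 5 and 6 for the sample average can be illustrated by simulation. The sketch below draws many samples of size n from the Bernoulli variable used earlier (probability 0.8 of the outcome 1), then compares the mean and variance of the simulated sample averages with $\mu_Y$ and $\sigma_Y^2 / n$, and checks how often the standardized average falls within ±1.96 as the normal approximation suggests. The sample size, number of replications, and seed are arbitrary assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

P = 0.8                 # Pr(Y_i = 1), as in the Bernoulli example
MU = P                  # E(Y_i) for a Bernoulli variable
SIGMA2 = P * (1 - P)    # Var(Y_i) for a Bernoulli variable


def sample_average(n, rng):
    """Average of n i.i.d. Bernoulli(P) draws."""
    return sum(1 if rng.random() < P else 0 for _ in range(n)) / n


def simulate(n, reps=10_000, seed=0):
    """Return `reps` independent sample averages, each from n draws."""
    rng = random.Random(seed)   # seeded so the run is repeatable
    return [sample_average(n, rng) for _ in range(reps)]


n = 100
averages = simulate(n)

mean_of_averages = fmean(averages)     # should be close to MU
var_of_averages = pvariance(averages)  # should be close to SIGMA2 / n

# Standardize each average; by the CLT roughly 95% should lie in [-1.96, 1.96].
z = [(ybar - MU) / sqrt(SIGMA2 / n) for ybar in averages]
share_within_196 = sum(abs(v) <= 1.96 for v in z) / len(z)

print("mean of sample averages:", mean_of_averages, "vs mu =", MU)
print("variance of sample averages:", var_of_averages, "vs sigma^2/n =", SIGMA2 / n)
print("share of |z| <= 1.96:", share_within_196)
```

Because the underlying variable is discrete, the share within ±1.96 is only approximately 95% at n = 100; increasing n in the sketch brings it closer.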