PS 405 – Week 3 Section: Point Estimation, Confidence Intervals, and Hypothesis Testing
D.J. Flynn
January 28, 2014

Today's plan
- Confidence intervals
- LLN and CLT
- t-tests

But first... Valencia:

Confidence intervals
- Suppose we have a point estimate (e.g., 48% of voters prefer Romney).
- Now we need to quantify our (un)certainty about that estimate.
- If we're really certain, Romney will almost certainly lose.
- If we're uncertain, he could have >50% and win!
- We have a variety of tools to help us quantify certainty: significance tests, confidence intervals, visualization, etc.
- A confidence interval is a type of interval estimate of a population parameter and is used to indicate the reliability of an estimate (Wikipedia).

Suppose we observe a value, $\bar{X}$, on [0,1]. The 95% C.I. around $\bar{X}$ is:

$$\bar{X} \pm 1.96\,SE(\bar{X}) = \bar{X} \pm 1.96\,\frac{\hat{\sigma}}{\sqrt{n}}$$

Which terms are easily observable?

Example from lecture
We do a survey (N = 400) and find that 79% of respondents have Facebook. Let's calculate the confidence interval around the estimate, .79.
1. Estimate $\sigma^2$: $p(1 - p) = .79(1 - .79) = .1659$
2. Estimate $\sigma$: $\sqrt{\sigma^2} = \sqrt{.1659} = .407$
3. Calculate $SE(p)$: $\hat{\sigma}/\sqrt{n} = .407/\sqrt{400} = .02$
4. Plug into the C.I. formula: $p \pm 1.96\,SE(p) = .79 \pm .04 = [.75, .83]$

Doing this in R
> p.hat<-.79                  # sample proportion from the survey
> p.hat
[1] 0.79
> alpha<-.05
> z<-qnorm(1-alpha/2)         # critical value for a 95% interval
> z
[1] 1.959964
> se<-.407/sqrt(400)          # sigma-hat / sqrt(n)
> se
[1] 0.02035
> conf.int<-c(p.hat-z*se, p.hat+z*se)
> conf.int
[1] 0.7501147 0.8298853

LLN
Definition (Weak Law of Large Numbers): for any $\varepsilon > 0$,

$$\lim_{n \to \infty} P(|\bar{X} - \mu| \geq \varepsilon) = 0$$

In words: the sample mean converges in probability to the population mean as our sample size increases to infinity.

CLT
Definition (Central Limit Theorem): as $n \to \infty$,

$$\frac{\sqrt{n}(\bar{X} - \mu)}{\sigma} \xrightarrow{d} N(0, 1)$$

In words: the distribution of the standardized sample mean converges to the standard Normal as our sample size increases to infinity.
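The LLN and CLT statements can be checked informally by simulation. The sketch below is not from the lecture: the seed, the "true" proportion p = 0.79, and the number of replications are all assumed values chosen to echo the Facebook survey example. It tracks how far the running sample mean sits from p, standardizes simulated sample proportions as in the CLT, and counts how often the 95% interval covers p.

```r
# Illustrative simulation; seed, p, n, and reps are assumed values.
set.seed(405)   # arbitrary seed so the run is reproducible
p    <- 0.79    # assumed true population proportion
n    <- 400     # sample size, matching the survey example
reps <- 5000    # number of simulated surveys

# LLN: the running mean of Bernoulli(p) draws settles near p.
draws        <- rbinom(10000, size = 1, prob = p)
running_mean <- cumsum(draws) / seq_along(draws)
lln_error    <- abs(running_mean[c(10, 100, 1000, 10000)] - p)

# CLT: standardized sample proportions are approximately N(0, 1).
p_hats   <- rbinom(reps, size = n, prob = p) / n
z_scores <- sqrt(n) * (p_hats - p) / sqrt(p * (1 - p))

# Coverage: share of intervals p_hat +/- z * SE that contain p.
se       <- sqrt(p_hats * (1 - p_hats) / n)
z_crit   <- qnorm(0.975)
coverage <- mean(p_hats - z_crit * se <= p & p <= p_hats + z_crit * se)

print(lln_error)                                    # distance from p at n = 10, 100, 1000, 10000
print(c(mean = mean(z_scores), sd = sd(z_scores)))  # should be near 0 and 1
print(coverage)                                     # should be near 0.95
```

If the theory holds, the errors should shrink toward zero as n grows, the standardized values should have mean near 0 and standard deviation near 1, and coverage should land close to 0.95.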
t-tests
- t-tests are useful when we observe two sample means and want to see if the difference between them is statistically distinguishable from zero.
- The null hypothesis is that the two population means are equal: $H_0: \mu_1 = \mu_2$
- The alternative hypothesis is that the two are different: $H_A: \mu_1 \neq \mu_2$

Formula for t-test (which we'll never use):

$$\mu_1 - \mu_2 = (\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$

where $\alpha$ is the significance level, $s_p$ is the pooled standard deviation, and $df = (n_1 - 1) + (n_2 - 1)$. For $\alpha = 0.05$ and large df, the critical t-value is approximately 1.96.

.... Lots of math .... [interval]

If the interval contains zero, the difference is not significant at level $\alpha$.

t-tests in R
> x<-rnorm(100,10,2)          # two samples drawn from the same population
> y<-rnorm(100,10,2)
> t.test(x,y)

	Welch Two Sample t-test

data:  x and y
t = -0.7687, df = 196.309, p-value = 0.443
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.7800831  0.3424939
sample estimates:
mean of x mean of y
  9.78514  10.00393

> y<-x+2                      # same x as above, shifted by 2
> t.test(x,y)

	Welch Two Sample t-test

data:  x and y
t = -6.7221, df = 198, p-value = 1.867e-10
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.586725 -1.413275
sample estimates:
mean of x mean of y
  9.78514  11.78514
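For anyone curious what the "lots of math" amounts to, here is a minimal sketch that builds the pooled interval by hand and compares it with R's answer. The seed and the simulated data are assumptions for illustration, not the samples used above. Note that `t.test()` runs the Welch (unequal-variance) test by default, so the pooled version has to be requested with `var.equal = TRUE`.

```r
# Hypothetical data; the seed and sample sizes are assumed for illustration.
set.seed(405)
x <- rnorm(100, mean = 10, sd = 2)
y <- rnorm(100, mean = 10, sd = 2)

alpha <- 0.05
n1 <- length(x)
n2 <- length(y)
df <- (n1 - 1) + (n2 - 1)

# Pooled standard deviation: df-weighted average of the two sample variances.
s_p <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / df)

# Critical value and standard error of the difference in means.
t_crit  <- qt(1 - alpha / 2, df)
se_diff <- s_p * sqrt(1 / n1 + 1 / n2)

# Interval: (mean(x) - mean(y)) +/- t_crit * se_diff.
manual_ci <- (mean(x) - mean(y)) + c(-1, 1) * t_crit * se_diff

# Built-in pooled test for comparison (the default would be Welch).
builtin_ci <- as.numeric(t.test(x, y, var.equal = TRUE)$conf.int)

print(t_crit)      # slightly above 1.96 because df is finite
print(manual_ci)
print(builtin_ci)  # should match manual_ci
```

With 198 degrees of freedom the critical value comes out a little larger than 1.96, which is why the intervals reported by `t.test()` are slightly wider than a Normal approximation would give.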