Introduction to Statistics for the Social Sciences
SBS200 - Lecture Section 001, Fall 2016
Room 150 Harvill Building
10:00 - 10:50 Mondays, Wednesdays & Fridays

https://www.youtube.com/watch?v=RhyRx42H-EQ
http://onlinestatbook.com/stat_sim/sampling_dist/index.html

By the end of lecture today (10/21/16)
Law of Large Numbers
Central Limit Theorem

Before next exam (November 18th)
Please read Chapters 1 - 11 in the OpenStax textbook
Please read Chapters 2, 3, and 4 in Plous
Chapter 2: Cognitive Dissonance
Chapter 3: Memory and Hindsight Bias
Chapter 4: Context Dependence

Everyone will want to be enrolled in one of the lab sessions
Labs continue next week with Project 3

Homework
On class website: Homework Assignments 15 & 16
Please complete the homework modules on the D2L website: HW15 - Confidence Intervals
Please complete this homework worksheet: 16 - Confidence Intervals
Both are due Monday, October 24th

Central Limit Theorem
Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.

[Figure: "Distribution of Raw Scores" (individual scores, e.g. Melvin, Eugene) above "Sampling Distribution of Sample means" (individual sample means, e.g. 2nd sample, 23rd sample)]

Central Limit Theorem
Proposition 1: If sample size (n) is large enough (e.g. 100), the mean of the sampling distribution will approach the mean of the population. (As n ↑, x̄ will approach µ.)
Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population. (As n ↑, the curve will approach normal shape.)
Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size. (As n ↑, the curve's variability gets smaller.)
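The three propositions above can be checked with a small simulation. This is only a sketch under stated assumptions: the skewed population, the sample sizes, and the helper name `sampling_distribution` are all made up for illustration and do not come from the lecture.

```python
import random
import statistics


def sampling_distribution(population, n, num_samples, rng):
    """Return the means of num_samples random samples, each of size n."""
    return [
        statistics.fmean(rng.choices(population, k=n))
        for _ in range(num_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical, strongly skewed population (an assumption, not lecture data).
population = [rng.expovariate(1 / 30) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

results = {}
for n in (4, 25, 100):
    means = sampling_distribution(population, n, 2_000, rng)
    observed_mean = statistics.fmean(means)  # Proposition 1: close to mu
    observed_sd = statistics.pstdev(means)   # Proposition 3: close to sigma / sqrt(n)
    predicted_sem = sigma / n ** 0.5
    results[n] = (observed_mean, observed_sd, predicted_sem)
    print(f"n={n:>3}  mean of sample means={observed_mean:6.2f}  "
          f"SD of sample means={observed_sd:5.2f}  sigma/sqrt(n)={predicted_sem:5.2f}")

print(f"population mean={mu:.2f}  population SD={sigma:.2f}")
```

A histogram of `means` for each n would show Proposition 2 as well: the curve looks more normal as n grows, even though the population itself is skewed.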
As n increases, SEM decreases.

Animation for creating sampling distribution of sample means
[Figure: panels "Distribution of Raw Scores" (e.g. Eugene), "Distribution of single sample", and "Sampling Distribution of Sample means" (e.g. mean for sample 7, mean for sample 12)]
http://onlinestatbook.com/stat_sim/sampling_dist/index.html

What are the three propositions of the Central Limit Theorem?
As n goes up …
1. Sample mean approaches true population mean
2. Curve becomes more “normal”
3. Variability goes down (includes standard deviation, variance, width of the curve, and random error)

What is the formula for the standard error of the mean?

What are confidence intervals for?
Estimating a value (could be a single score, a mean of a sample, or a mean of a population) by providing a range within which we believe (with a certain level of confidence) it falls

Confidence Intervals: What are they used for?
We are estimating a value by providing two scores between which we believe the true value lies. We can be 95% confident that our mean falls between these two scores.
95% Confidence Interval: We can be 95% confident that our population mean falls between these two scores
99% Confidence Interval: We can be 99% confident that our population mean falls between these two scores
We are using this to estimate a value, such as a mean, as a range of values with a known degree of certainty
• The interval refers to possible values of the population mean.
• We can be reasonably confident that the population mean falls in this range (90%, 95%, or 99% confident)
• In the long run, a series of intervals like the one we figured out will contain the population mean about 95% of the time.
Greater confidence implies loss of precision. (95% confidence is most often used.)
Can actually generate a CI for any confidence level you want; these are just the most common

How to find the two scores that border the middle 95% of the curve
[Figure: normal distribution map linking raw scores, z-scores, and area/probability. Have raw score, find z (formula); have z, find raw score (formula); have z, find area (z table); have area, find z (z table).]

Try this one: Please find the (2) raw scores that border exactly the middle 95% of the curve
Mean of 30 and standard deviation of 2
Upper score: go to table with area .4750, nearest z = 1.96
mean + z σ = 30 + (1.96)(2) = 33.92
Lower score: go to table with area .4750, nearest z = -1.96
mean + z σ = 30 + (-1.96)(2) = 26.08

Try this one: Please find the (2) raw scores that border exactly the middle 99% of the curve
Mean of 30 and standard deviation of 2
Upper score: go to table with area .4950, nearest z = 2.58
mean + z σ = 30 + (2.58)(2) = 35.16
Lower score: go to table with area .4950, nearest z = -2.58
mean + z σ = 30 + (-2.58)(2) = 24.84

Which is wider?
Please find the raw scores that border the middle 95% of the curve
Please find the raw scores that border the middle 99% of the curve
95% Confidence Interval: We can be 95% confident that the estimated score really does fall between these two scores
99% Confidence Interval: We can be 99% confident that the estimated score really does fall between these two scores

Part 1
Find the scores that border the middle 95%
Mean = 50, Standard deviation = 10
x = mean ± (z)(standard deviation)
[Figure: normal curve with the middle 95% (.9500) split into two areas of .4750; the two boundary scores (30.4 and 69.6) are to be found]
1) Go to z table - find z score for area .4750: z = 1.96
Please note: We will be using this same logic for “confidence intervals”
2) x = mean + (z)(standard deviation)
x = 50 + (-1.96)(10)
x = 30.4
3) x = mean + (z)(standard deviation)
x = 50 + (1.96)(10)
x = 69.6
Scores 30.4 - 69.6 capture the middle 95% of the curve

Construct a 95% confidence interval
Mean = 50, Standard deviation = 10, n = 100
standard error of the mean = σ / √n = 10 / √100 = 1
For “confidence intervals”: same logic, same z-score. But - we’ll replace standard deviation with the standard error of the mean
x = mean ± (z)(s.e.m.)
x = 50 + (1.96)(1)
x = 51.96
x = 50 + (-1.96)(1)
x = 48.04
95% Confidence Interval is captured by the scores 48.04 – 51.96
[Figure: normal curve with the middle 95% (.9500) split into two areas of .4750; boundaries 48.04 and 51.96]

Scores that border the middle of the curve, using z = 2.58 (the 99% value)
Mean = 55, Standard deviation = 10
Upper boundary raw score
x = mean + (z)(standard deviation)
x = 55 + (+2.58)(10)
x = 80.8
Lower boundary raw score
x = mean + (z)(standard deviation)
x = 55 + (-2.58)(10)
x = 29.2

Confidence interval uses SEM
Mean = 55, Standard deviation = 10, n = 49
standard error of the mean = σ / √n = 10 / √49 = 1.43
Upper boundary raw score
x = mean + (z)(standard error of the mean)
x = 55 + (+2.58)(1.43)
x = 58.7
Lower boundary raw score
x = mean + (z)(standard error of the mean)
x = 55 + (-2.58)(1.43)
x = 51.3

Confidence Intervals: A range of values that, with a known degree of certainty, includes an estimated value (like a mean)
How can we make our confidence interval smaller?
• Decrease variability (make standard deviation smaller)
  1. Increase sample size (this will decrease variability)
  2. Very careful assessment and measurement practices (improving reliability will minimize noise)
• Decrease level of confidence
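The worked examples above can be reproduced with a short sketch. The function names `middle_interval` and `confidence_interval` are hypothetical helpers, not part of the lecture. The code also assumes an exact z from the standard normal curve rather than the table-rounded 1.96 and 2.58, so results match the slides only after rounding.

```python
from math import sqrt
from statistics import NormalDist


def middle_interval(mean, spread, confidence):
    """Return the two scores that border the middle `confidence` proportion
    of a normal curve with the given mean and spread."""
    # For 95%, this looks up the z that leaves .4750 on each side of the mean.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * spread, mean + z * spread


def confidence_interval(mean, sd, n, confidence):
    """Same logic, but the spread is the standard error of the mean."""
    sem = sd / sqrt(n)
    return middle_interval(mean, sem, confidence)


# Middle 95% of raw scores: mean 50, standard deviation 10
print(tuple(round(x, 1) for x in middle_interval(50, 10, 0.95)))           # (30.4, 69.6)

# 95% confidence interval: mean 50, standard deviation 10, n = 100
print(tuple(round(x, 2) for x in confidence_interval(50, 10, 100, 0.95)))  # (48.04, 51.96)

# 99% versions: mean 55, standard deviation 10, n = 49
print(tuple(round(x, 1) for x in middle_interval(55, 10, 0.99)))           # (29.2, 80.8)
print(tuple(round(x, 1) for x in confidence_interval(55, 10, 49, 0.99)))   # (51.3, 58.7)
```

Changing `n` or `confidence` in these calls shows the two ways to shrink the interval listed above: a larger sample size lowers the SEM, and a lower confidence level lowers z.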