Introduction to Statistics for the Social Sciences
SBS200, COMM200, GEOG200, PA200, POL200, or SOC200
Lecture Section 001, Fall 2015
Room 150 Harvill Building
10:00 - 10:50 Mondays, Wednesdays & Fridays

http://courses.eller.arizona.edu/mgmt/delaney/d15s_database_weekone_screenshot.xlsx

Everyone will want to be enrolled in one of the lab sessions
Labs continue next week

Please re-register your clicker
http://student.turningtechnologies.com/

By the end of lecture today 10/9/15
• Law of Large Numbers
• Central Limit Theorem

Schedule of readings
Before next exam (October 16th)
Please read Chapters 1 - 8 in OpenStax textbook
Please read Chapters 10, 11, 12 and 14 in Plous
• Chapter 10: The Representativeness Heuristic
• Chapter 11: The Availability Heuristic
• Chapter 12: Probability and Risk
• Chapter 14: The Perception of Randomness

Homework
On class website: Please print and complete homework worksheet #11
Due Monday October 12th
Dan Gilbert Reading and Law of Large Numbers

Review of Homework Worksheet (just in case of questions)

Homework review

2/5 = .40
• Based on a priori probability: all options equally likely, not based on previous experience or data
• Based on expert opinion: don’t have previous data for these two companies merging together
• Based on frequency data (percent of rockets that successfully launched)

30/100 = .30
• Based on a priori probability: all options equally likely, not based on previous experience or data
• Based on frequency data (percent of times at bat that resulted in hits)
• Based on frequency data (percent of pages that are “fake”)

5/50 = .10
• Based on frequency data (percent of students who chose to be Economics majors)
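The normal-curve problems in this homework review all follow the same recipe: convert each raw score to a z-score, look up the area between the mean and that z, then add or subtract areas. A minimal sketch of that recipe is below. It assumes a mean of 50 and a standard deviation of 4 (read off the worksheet arithmetic), and the helper names z_score and area_mean_to_z are hypothetical, chosen only for illustration. Python's standard-library NormalDist stands in for the printed z table.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1


def z_score(x, mean, sd):
    """Standardize a raw score, rounded to 2 decimals as with a printed z table."""
    return round((x - mean) / sd, 2)


def area_mean_to_z(z):
    """Area under the standard normal curve between the mean (z = 0) and z."""
    return round(abs(STANDARD_NORMAL.cdf(z) - 0.5), 4)


# Assumed from the worksheet arithmetic: mean = 50, SD = 4
mean, sd = 50, 4
area_44 = area_mean_to_z(z_score(44, mean, sd))  # z = -1.5
area_52 = area_mean_to_z(z_score(52, mean, sd))  # z = +0.5
area_55 = area_mean_to_z(z_score(55, mean, sd))  # z = +1.25

# Scores on opposite sides of the mean: add the two areas
between_44_and_55 = round(area_44 + area_55, 4)
# Tail beyond a score: subtract its area from .5000
above_55 = round(0.5 - area_55, 4)
# Scores on the same side of the mean: subtract the smaller area
between_52_and_55 = round(area_55 - area_52, 4)

print(between_44_and_55, above_55, between_52_and_55)
```

The three printed values can be compared against the worksheet answers (.8276, .1056, .2029). The same helpers apply to the other two problem sets by changing mean and sd.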
Homework review: areas under the normal curve

Mean = 50, standard deviation = 4
• Between 44 and 55: (44 - 50) / 4 = -1.5, z of 1.5 = area of .4332; (55 - 50) / 4 = +1.25, z of 1.25 = area of .3944; .4332 + .3944 = .8276
• Above 55: (55 - 50) / 4 = +1.25, z of 1.25 = area of .3944; .5000 - .3944 = .1056
• Between 52 and 55: (52 - 50) / 4 = +.5, z of .5 = area of .1915; (55 - 50) / 4 = +1.25, z of 1.25 = area of .3944; .3944 - .1915 = .2029

Mean = 2708, standard deviation = 650
• Above 3000: (3000 - 2708) / 650 = 0.45, z of 0.45 = area of .1736; .5000 - .1736 = .3264
• Between 3000 and 3500: (3000 - 2708) / 650 = 0.45, z of 0.45 = area of .1736; (3500 - 2708) / 650 = 1.22, z of 1.22 = area of .3888; .3888 - .1736 = .2152
• Between 2500 and 3500: (2500 - 2708) / 650 = -0.32, z of -0.32 = area of .1255; (3500 - 2708) / 650 = 1.22, z of 1.22 = area of .3888; .3888 + .1255 = .5143

Mean = 15, standard deviation = 3.5
• Above 20: (20 - 15) / 3.5 = 1.43, z of 1.43 = area of .4236; .5000 - .4236 = .0764
• Below 20: (20 - 15) / 3.5 = 1.43, z of 1.43 = area of .4236; .5000 + .4236 = .9236
• Between 10 and 12: (10 - 15) / 3.5 = -1.43, z of -1.43 = area of .4236; (12 - 15) / 3.5 = -0.86, z of -0.86 = area of .3051; .4236 - .3051 = .1185

Comments on Dan Gilbert Reading

Law of large numbers: As the number of measurements increases, the data become more stable and a better approximation of the true (theoretical) probability, or of the true signal (e.g. mean).
As the number of observations (n) increases, or the number of times the experiment is performed, the estimate will become more accurate: the signal will become more clear (static cancels out).
• With only a few people, any little error is noticed (becomes exaggerated when we look at the whole group)
• With many people, any little error is corrected (becomes minimized when we look at the whole group)

http://www.youtube.com/watch?v=ne6tB2KiZuk

Sampling distributions of sample means versus frequency distributions of individual scores

Distribution of raw scores: an empirical probability distribution of the values from a sample of raw scores from a population

Frequency distributions of individual scores
• derived empirically
• we are plotting raw data
• this is a single sample
Take a single score. Repeat over and over.
[Figure: individual scores (e.g. Preston, Eugene, Melvin) drawn one at a time from the population and stacked into a frequency distribution]

Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population
Important note: “fixed n”

Sampling distributions of sample means
• theoretical distribution
• we are plotting means of samples
Take a sample, get its mean. Repeat over and over.
[Figure: means of samples (mean for 1st sample, 2nd sample, ... 23rd sample) drawn from the population and stacked into a distribution of means of samples]

Sampling distribution for continuous distributions

Central Limit Theorem: If random samples of a fixed n are drawn from any population (regardless of the shape of the population distribution), as n becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.

[Figure: Distribution of Raw Scores (individual scores such as Melvin and Eugene) beside Sampling Distribution of Sample Means (sample means such as the 2nd and 23rd sample)]

Notice: SEM is smaller than SD, especially as n increases
• Distribution of raw scores: Mean = 100, Standard Deviation = 3 (µ = 100, σ = 3)
• An example of a sampling distribution of sample means: Mean = 100, Standard Error of the Mean = 1

Central Limit Theorem

Proposition 1: If sample size (n) is large enough (e.g. 100), the mean of the sampling distribution will approach the mean of the population.
(As n ↑, x̄ will approach µ)

Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population.
(As n ↑, curve will approach normal shape)

Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size (SEM = σ / √n). As n increases, SEM decreases.
(As n ↑, curve variability gets smaller)
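The three Central Limit Theorem propositions can be checked with a small simulation. The sketch below is illustrative only: it assumes a uniform (flat, non-normal) population scaled to the slide's µ = 100 and σ = 3, and it uses n = 9 as one of the sample sizes because 3 / √9 = 1 would reproduce the slide's Standard Error of the Mean of 1 (the slides do not state n, so that value is an inference). The function names are hypothetical.

```python
import math
import random
import statistics

MU, SIGMA = 100.0, 3.0               # population mean and SD from the slide
HALF_WIDTH = SIGMA * math.sqrt(3)    # uniform on [MU - h, MU + h] has SD = h / sqrt(3)


def draw_score(rng):
    """Draw one raw score from the assumed (uniform, non-normal) population."""
    return rng.uniform(MU - HALF_WIDTH, MU + HALF_WIDTH)


def sample_means(n, num_samples, rng):
    """Take num_samples samples of fixed size n and return the mean of each."""
    return [
        statistics.fmean([draw_score(rng) for _ in range(n)])
        for _ in range(num_samples)
    ]


rng = random.Random(2015)            # fixed seed so the run is repeatable
results = {}
for n in (1, 9, 100):
    means = sample_means(n, 5000, rng)
    # (mean of the sample means, SD of the sample means = simulated SEM)
    results[n] = (statistics.fmean(means), statistics.pstdev(means))
    print(
        f"n={n:>3}  mean of means={results[n][0]:7.2f}  "
        f"SD of means={results[n][1]:.3f}  "
        f"sigma/sqrt(n)={SIGMA / math.sqrt(n):.3f}"
    )
```

In each row the mean of the sample means should sit near 100 (Proposition 1), and the SD of the sample means should sit near σ / √n, shrinking as n grows (Proposition 3). Plotting a histogram of the means for n = 100 would show the roughly normal shape described in Proposition 2, even though the population here is flat.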