Elementary Statistics and Inference 22S:025 or 7P:025
Lecture 25

Chapter 19 – Sample Surveys

A. Most research is conducted with data from a sample of observations drawn from a population. The results of the analyses are used to make inferences about the nature of the population.

Population: the complete set of subjects (observations) in a universe.
Sample: a subset of observations from the population.

Population facts are called parameters (e.g., mean = $\mu$, std dev = $\sigma$).
Sample facts are called statistics (e.g., mean = $\bar{X}$, std dev = $S$).
Investigators estimate population parameters from statistics.

B. Bad Estimates Can Be Made of Parameters

Bad estimates result when the sample is not representative of the members of the population. The 1936 Literary Digest Poll used telephone directories and club membership lists to select a sample of 2.4 million voters. The sample did not reflect the mood of the country because many people were unemployed and had no telephone.

C. Dewey vs. Truman 1948

Big sampling bias came from quota sampling, in which the interviewer had wide choice in selecting the persons to be interviewed. The resulting sample was not representative of the population.

D. Using Chance (Probability) in Survey Work

Put the name of each survey candidate on a ticket, put the tickets in a box, and randomly select the names of the persons to be surveyed. Each person has the same chance of being selected to answer the survey (Simple Random Sampling).

Example: N = 1000 names on tickets in a box. Randomly select 100 tickets for persons to be surveyed. The tickets are drawn without replacement, and everyone has the same chance of being selected for the survey.

More sophisticated methods are now used to select representative samples of the population.
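Returning to the ticket-box example in section D (N = 1000 names, 100 drawn without replacement), the procedure can be sketched in Python. This is only an illustration: the function name `simple_random_sample`, the placeholder names `person_1` ... `person_1000`, and the seed are assumptions, not part of the lecture.

```python
import random

def simple_random_sample(names, n, seed=None):
    """Draw n tickets from the box without replacement.

    random.Random.sample gives every subset of size n the same chance,
    which is the defining property of simple random sampling.
    """
    rng = random.Random(seed)
    return rng.sample(names, n)

# Hypothetical box of N = 1000 tickets, one name per ticket.
names = [f"person_{i}" for i in range(1, 1001)]

chosen = simple_random_sample(names, 100, seed=25)

# Without replacement means no name can be drawn twice.
print(len(chosen), len(set(chosen)))  # prints "100 100"
```

Because the draw is without replacement, the 100 selected names are always distinct, and each of the 1000 persons has the same 100/1000 chance of being in the sample.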
Multistage Cluster Sampling: the USA is divided into 4 geographical regions; within each region, representative population centers are developed; within each population center there are wards; and within each ward there are precincts.

The advantage of multistage sampling is that the interviewer has no discretion at all as to whom to interview. There is a definite procedure for selecting the representative sample, and it involves the planned use of chance (probability) in selecting the sample.

E. Do Probability Sampling Methods Work?

Since 1948 the error in estimating the winner in presidential elections has been very small, because impartial probability methods have been used to select the sample of likely voters.

F. A Closer Look at the Gallup Poll

In the 1984 election, the Gallup Organization tried to screen its survey respondents to include only those who were likely to vote. They screened out the nonvoters and the undecided. Each respondent placed his or her candidate preference in a sealed envelope and dropped it into a box, so the choice was unknown to the interviewer.

G. Telephone Surveys

Cheaper and just as effective, since nearly every home has a phone.

Sample picked:
Random area code (e.g., 319)
Random exchange within the area code (e.g., 430)
Bank (e.g., 31)
Digits (e.g., 99)
Call: 319-430-3199

H. Chance Errors and Bias

Imagine a box with a very large number of tickets, each marked with a "one" or a "zero." Select a sample of tickets without replacement. Then:

percentage of 1's in sample = percentage of 1's in the box + chance error
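The box model in section H can be illustrated with a short simulation. The sketch below is hedged: the box composition (10,000 tickets, 60% of them ones), the sample sizes, and the seed are all assumptions chosen for illustration, and `chance_error` is a hypothetical helper name.

```python
import random

def chance_error(box, n, rng):
    """Draw n tickets without replacement from a 0/1 box.

    Returns (sample percentage of 1's, box percentage of 1's, chance error),
    where chance error = sample percentage - box percentage.
    """
    sample = rng.sample(box, n)
    sample_pct = 100 * sum(sample) / n
    box_pct = 100 * sum(box) / len(box)
    return sample_pct, box_pct, sample_pct - box_pct

rng = random.Random(0)            # fixed seed so the run is repeatable
box = [1] * 6000 + [0] * 4000     # assumed box: 60% ones

for n in (100, 400, 1600):        # assumed sample sizes
    sample_pct, box_pct, err = chance_error(box, n, rng)
    print(f"n={n}: sample {sample_pct:.1f}% = box {box_pct:.1f}% + chance error {err:+.1f}")
```

Rerunning with different seeds shows the chance error varying from sample to sample.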
Questions we will address about chance errors:
How big are they likely to be?
How do they depend on the size of the sample? On the size of the population?
How big does the sample have to be to keep the chance errors under control?

In complicated samples, the equation has to take bias into account:

Estimate = parameter + bias + chance error

Exercise Set A (pp. 349-351): #1, 2, 3, 4, 5

I. Review Exercises (pp. 351-353): #2, 5, 7, 9, 10

#10. $n = 100$, $P(H) = \tfrac{1}{2}$, avg of the box $= \tfrac{1}{2}$, SD of the box $= \tfrac{1}{2}$

$$E(\text{Heads}) = n \cdot \text{avg} = 100\left(\tfrac{1}{2}\right) = 50$$

$$SE(\text{Heads}) = \sqrt{n} \cdot \text{SD} = \sqrt{100}\left(\tfrac{1}{2}\right) = 5$$

[Figure: number of heads, axis marked at 45, 50, 55]

I would pick 45-55 because this range is the most likely to contain the number of heads in 100 tosses of the coin: it runs from one SE below the expected value to one SE above, so about 68.27% of the time the sum falls in this range. If the end points are counted, the range becomes 44.5 to 55.5 and the chance is about 73%.

#7 (p. 352). A coin is tossed 1000 times. There are two options:
i) To win $1.00 if the number of heads is between 490 and 510.
ii) To win $1.00 if the percentage of heads is between 48% and 52% (480-520 heads).
Which option is better?

Box: H = 1, T = 0, so avg of the box $= \tfrac{1}{2}$ and SD of the box $= \tfrac{1}{2}$.

$$E(\text{Heads}) = n \cdot \text{avg of the box} = 1000\left(\tfrac{1}{2}\right) = 500$$

$$SE(\text{Heads}) = \sqrt{n} \cdot \text{SD of the box} = \sqrt{1000}\left(\tfrac{1}{2}\right) = 31.62\left(\tfrac{1}{2}\right) = 15.81$$

[Figure: number of heads, SE = 15.81, axis marked at 480, 490, 500, 510, 520]

Converting to standard units with $Z = \dfrac{X - \text{Mean}}{SE}$:

$$\frac{490 - 500}{15.81} = -0.63, \qquad \frac{510 - 500}{15.81} = 0.63$$

$$\frac{480 - 500}{15.81} = -1.265, \qquad \frac{520 - 500}{15.81} = 1.265$$

The chance that the number of heads is between 480 and 520 is greater than the chance that it is between 490 and 510, so option (ii) is better.
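The normal-approximation arithmetic in Review Exercises #10 and #7 can be checked with a few lines of Python. This is a sketch, not the textbook's method of looking up a normal table: the helper names `normal_cdf` and `chance_between` are assumptions, and the areas are computed from the standard library's `math.erf` with no continuity correction unless the end points are shifted by hand.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def chance_between(lo, hi, n, avg=0.5, sd=0.5):
    """Normal approximation to the chance that the number of heads
    in n tosses falls between lo and hi (box: H = 1, T = 0)."""
    expected = n * avg             # E(Heads) = n * avg of the box
    se = math.sqrt(n) * sd         # SE(Heads) = sqrt(n) * SD of the box
    z_lo = (lo - expected) / se    # Z = (X - Mean) / SE
    z_hi = (hi - expected) / se
    return normal_cdf(z_hi) - normal_cdf(z_lo)

# Exercise #10: 100 tosses, range 45-55, then with end points counted.
ex10 = chance_between(45, 55, 100)
ex10_endpoints = chance_between(44.5, 55.5, 100)

# Exercise #7: 1000 tosses, option (i) versus option (ii).
option_i = chance_between(490, 510, 1000)
option_ii = chance_between(480, 520, 1000)

print(f"#10: {ex10:.4f} (45-55), {ex10_endpoints:.4f} (44.5-55.5)")
print(f"#7:  option i {option_i:.3f}, option ii {option_ii:.3f}")
```

Under these assumptions, option (i) comes out at roughly 47% and option (ii) at roughly 79%, which agrees with the conclusion that option (ii) is the better bet.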