Chapter 2 - University of Washington

Warm Up
• Researchers collected data on 500 pregnant women who were hospitalized in car accidents. They recorded which trimester of pregnancy each woman was in, and found that the majority were in the second trimester.
  – Is this an observational study or an experiment?
  – What is the sample?
  – What population does the sample come from?
  – What response is measured?

Chapter 2: Samples, Good and Bad
Roddy Theobald
STAT 220, Summer 2012

Motivating Example #1
Question: Are these reviews representative of all opinions about this restaurant?

Motivating Example #2
• In the 1936 presidential election, Franklin D. Roosevelt faced Alfred Landon
• To predict the outcome of the election, The Literary Digest undertook one of the most extensive surveys ever conducted
• The magazine sent postcards to roughly 10,000,000 registered voters across the United States
  – At the time, this was about one quarter of the nation’s voters
• They received postcards back from over 2,000,000 people
  – 1,293,669 said they supported Landon
  – 972,897 said they supported Roosevelt
• From the original article: “The Poll represents the most extensive straw ballot in the field -- the most experienced in view of its twenty-five years of perfecting -- the most unbiased in view of its prestige -- a Poll that has always previously been correct.”
• Sure enough, the 1936 election turned out to be one of the most lopsided elections in history… except it was Roosevelt who won, carrying 46 of the 48 states
• George Gallup (inventor of the Gallup Poll) correctly predicted that Roosevelt would win, using a much smaller poll of 50,000 registered voters
• Question: How did the smaller poll generate a more accurate prediction?

Bias
• The two motivating examples are both examples of biased studies.
• The design of a statistical study is biased if it systematically favors certain outcomes
• One way that a study can be biased is by selecting a sample that is not representative of the population
  – Key question: Did every individual in the population have an equal chance of being in the sample? If not (and you are not using a sophisticated weighting scheme), your sample may be biased

Biased sampling methods
• Sample surveys do not have to be deliberately biased to produce biased results!
• For the purposes of this course, we will focus on two sampling methods that are extremely common but often lead to biased results
  – Voluntary response samples
  – Convenience samples

Voluntary response samples
• A voluntary response sample chooses itself by responding to a general appeal
  – Write-in or call-in opinion polls are examples of voluntary response samples
• Voluntary response samples tend to be biased because the people who volunteer to respond are not necessarily representative of the entire population
• Rule of thumb: People with strong opinions are probably more likely to respond to requests for their opinion!
• There is the potential for voluntary response bias in The Literary Digest presidential poll as well
• Perhaps supporters of Landon were more likely to return the postcard than supporters of Roosevelt?

Convenience samples
• Selection of whichever individuals are easiest to reach is called convenience sampling
  – Interviewing people in the mall
  – Going door-to-door in your neighborhood
  – Surveying people in your school, church, etc.
• Convenience samples are often biased because the people who are easiest to reach are not necessarily representative of the whole population
• To select the people to survey for their poll, The Literary Digest first sent postcards to all their subscribers
• Then they randomly selected registered voters from lists of automobile owners and telephone subscribers
• So, what could have gone wrong with this survey?

Examples
• Suppose that I am interested in whether undergraduate students at the University of Washington support the recent tuition increases. I randomly select 40 undergraduate students from this class and require them to respond to a survey on the course website
  – Could this sample be biased?
  – If so, is it a voluntary response sample or a convenience sample?
• During the show American Idol, viewers are asked to call in and vote for different performers. Suppose the population of interest is all people who are watching American Idol that night.
  – Could the resulting sample be biased?
  – If so, is it a voluntary response sample or a convenience sample?

Selecting an unbiased sample
• There are many ways to avoid bias in selecting a sample.
• The most straightforward is simple random sampling
• A simple random sample (SRS) of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected
  – This is not the only unbiased sampling technique

In plain English
• Returning to the tuition example: suppose that I am interested in whether undergraduate students at the University of Washington support the recent tuition increases
• I have enough money to sample and give a survey to 100 students
• If every single group of 100 undergraduate students at UW is equally likely to be selected as my sample, then I have taken a simple random sample of size 100

Selecting an SRS
• The key to selecting an SRS is randomization
• For the tuition example
  – Generate a numbered list of all 40,000 undergraduate students at UW
  – Generate 100 distinct random numbers between 1 and 40,000
  – Require the 100 students who correspond to those numbers to complete the survey
• Every group of 100 students is equally likely to be selected with this method, so this is an SRS
• Be careful: A simple random sample of the wrong population is also biased!

Simple random sample?
• Suppose I want to take a survey of students in this class. Are any of the following an SRS?
  – I ask everyone sitting in the first four rows of the classroom
  – I ask a random sample of ¼ of the women and then ask a random sample of ¼ of the men
  – I ask every 4th person who walks into class

From the news
• The American Community Survey is an annual survey administered by the U.S. Census Bureau that collects data from roughly 3 million people per year
• U.S. Representative Daniel Webster is sponsoring a bill to end the ACS, arguing that “in the end this is not a scientific survey. It’s a random survey.”
  – If there’s one thing I want you to take away from this lecture, it is that a random survey is a scientific survey!
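
The three steps for selecting an SRS in the tuition example can be sketched in a few lines of Python. This is only an illustration, not part of the course materials: the function name select_srs and the seed value are hypothetical, and the 40,000 ID numbers stand in for a real numbered list of UW undergraduates. The sketch draws without replacement, so the 100 numbers are all different.

```python
import random

def select_srs(population_size, sample_size, seed=None):
    """Return sample_size distinct ID numbers drawn from 1..population_size."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so every set of
    # sample_size IDs is equally likely to be the one selected.
    return rng.sample(range(1, population_size + 1), sample_size)

# Hypothetical numbered list: student IDs 1 through 40,000.
selected_ids = select_srs(40_000, 100, seed=220)

# The students holding these ID numbers would be asked to complete the survey.
print(sorted(selected_ids))
```

Fixing the seed only makes the example repeatable. In a real survey the draw would be made once, and the random numbers would have to be matched against an actual list of the right population.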
Return to Warm Up
• Researchers collected data on 500 pregnant women who were hospitalized in car accidents. They recorded which trimester of pregnancy each woman was in, and found that the majority were in the second trimester.
• My wife is in the second trimester of pregnancy. Should we be worried?

Homework #1, Part 2
• Read Chapter 2 (ignore sections on selecting random digits)
• Complete problems 2.1, 2.3, 2.5, 2.6, 2.7, 2.8, 2.13, and 2.18

Problem 2.3
• You work for a local newspaper that has recently reported on a bill that would make it easier to create charter schools in the state. You report to the editor that 201 letters have been received on the issue, of which 171 oppose the legislation. “I’m surprised that most of our readers oppose the bill. I thought it would be quite popular,” says the editor. Are you convinced that the majority of the readers oppose the bill? How would you explain the statistical issue to the editor?
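
One way to explore the voluntary response issue behind Problem 2.3 is a small simulation. Everything in this sketch is an assumption made up for illustration: a readership of 20,000 in which only 40% oppose the bill, and response rates under which opponents are ten times as likely as supporters to write a letter. None of these numbers come from the problem or from real data.

```python
import random

rng = random.Random(2012)  # arbitrary seed so the run is repeatable

# Hypothetical readership: 8,000 opponents and 12,000 supporters (40% oppose).
readers = ["oppose"] * 8_000 + ["support"] * 12_000

# Assumed chance that a reader of each type writes a letter to the editor.
response_rate = {"oppose": 0.10, "support": 0.01}

# Voluntary response sample: each reader decides independently whether to write.
letters = [r for r in readers if rng.random() < response_rate[r]]

# Simple random sample of 200 readers, for comparison.
srs = rng.sample(readers, 200)

def share_opposed(sample):
    """Fraction of the sample that opposes the bill."""
    return sample.count("oppose") / len(sample)

true_share = share_opposed(readers)
letter_share = share_opposed(letters)
srs_share = share_opposed(srs)

print(f"Readers opposed:        {true_share:.0%}")
print(f"Letter writers opposed: {letter_share:.0%} of {len(letters)} letters")
print(f"SRS of 200 opposed:     {srs_share:.0%}")
```

Under these assumed rates, the letters overstate opposition by a wide margin even though most readers support the bill, while the much smaller SRS lands close to the true share. The same logic applies to the question of how the smaller 1936 poll could beat The Literary Digest.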