Collecting Data
Understanding Random Sampling

Objectives:
• To learn the basic principles of collecting an unbiased sample.
• To learn to recognize flaws in biased sampling.

Intro…
Do you know what it means when something occurs randomly?
Randomly select a number from the next slide. Ready…

1   2   3   4

Question: What would you expect to happen when we collect data on this simple task?

How do we gather data?
• Surveys
• Opinion polls
• Interviews
• Studies
   • Observational
      • Retrospective (past)
      • Prospective (future)
   • Experiments

Population
• Population – the entire group of individuals we want information about
• Census – a complete count of the entire population

Why would we not use a census all the time?
1) Not always accurate
2) Very expensive
3) Perhaps impossible
4) If using destructive sampling, you would destroy the population
   • Breaking strength of soda bottles
   • Lifetime of flashlight batteries
   • Safety ratings for cars

Sample
• A part of the population that we examine in order to gather information
• Used to generalize information about a population

Sampling design – the method used to choose the sample from the population
Sampling frame – a list of every individual in the population

Simple Random Sample (SRS)
Consists of n individuals from the population chosen in such a way that
• every individual has an equal chance of being selected
• every set of n individuals has an equal chance of being selected

SRS
Advantages
• Unbiased
• Easy
Disadvantages
• Large variance
• May not be representative
• Must have a sampling frame (list of the population)

Systematic random sample
• Select the sample by following a systematic approach
• Randomly select where to begin

Systematic Random Sample
Advantages
• Unbiased
• Ensures that the sample is distributed across the population
• More efficient, cheaper, etc.
Disadvantages
• Large variance
• Can be confounded by a trend or cycle
• Formulas are complicated

Identify the sampling design
A local restaurant manager wants to survey customers about the service they receive.
Each night the manager randomly chooses a number between 1 and 10. He then gives a survey to that customer, and to every 10th customer after them, to fill out before they leave.
Systematic random sampling

Bias
• An error that favors certain outcomes
Note: We can never draw conclusions from biased data. Throw it out and start over!

Voluntary response
• People choose to respond
• Usually only people with very strong opinions respond
• Produces biased results

Convenience sampling
• Ask people who are easy to ask
• Produces biased results

Source of bias?
Suppose that you want to estimate the total amount of money spent by students on textbooks each semester at Rice. You collect register receipts from students as they leave the bookstore during lunch one day.
Convenience sampling – an easy way to collect data

1970 Draft Lottery and the Role of Randomization
In that first draft lottery (conducted on December 1, 1969), a large, deep, cylindrical bowl was filled with 366 dates, one for each day of the year (including February 29, of course). The dates were placed inside small capsules (balls about the size of a pecan), added to the bowl, and then mixed. After mixing, the capsules were selected, one by one, and assigned a draft priority. Draft registrants whose birthdays matched the first 100 or so dates selected were likely to be called for induction.
However, the bowl's small diameter and its height (nearly arm's length) made the mixing less than random, because each month's dates had been added sequentially in calendar order: January's capsules were dumped in first, followed by February's, and so on until December.

[Table: Set of data for the 1970 draft lottery]
[Figure: 1970 draft number by day of year]
[Figure: Mean draft number by month]

How did the nonrandomness of the draft affect the casualties (deaths) during the Vietnam War? This was recently studied by Paul Sommers in "The Writing on the Wall", Chance, Vol, 1, 2003, p35-38.
He examined the names of the casualties on the Vietnam Memorial (available online at thewall-usa.com) together with other sources and found the number of casualties by birth month:

[Table: number of casualties by birth month]

Selecting an SRS
For the AP exam: “Knowledgeable users of statistics need to be able to perform your sample exactly using the described method.”
Methods: we can
• “pick samples from a hat”,
• use a random number generator, or
• use a table of random digits
to derive our sample.

SRS by picking out of a hat
Say that the items in the hat are “mixed thoroughly”, and state whether or not slips of paper are replaced back in the hat (yes if stratified sampling).

Random digit table
• Each entry is equally likely to be any of the 10 digits
• Digits are independent of each other

Suppose your population consisted of these 20 people:

 1) Aidan     6) Fred     11) Kathy    16) Paul
 2) Bob       7) Gloria   12) Lori     17) Shawnie
 3) Chico     8) Hannah   13) Matthew  18) Tracy
 4) Doug      9) Israel   14) Nancy    19) Uncle Sam
 5) Edward   10) Jung     15) Opus     20) Vernon

Use the following random digits to select a sample of five from these people.
We will need to use double-digit random numbers (01 to 20), ignoring any number greater than 20. Start with Row 1 and read across. Stop when five people are selected.

Row 1:  45 18 05 13 71 20 15 58 01 57 03 89 93 43 50 63

45  Ignore.
18  Tracy
05  Edward
13  Matthew
71  Ignore.
20  Vernon
15  Opus

So my sample would consist of: Tracy, Edward, Matthew, Vernon, and Opus.
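
The random digit table procedure above can be sketched in Python. This is only an illustration under stated assumptions, not part of the original slides: the function name `select_from_digit_row` and its parameters are hypothetical, and the digits are Row 1 from the example, typed in as pairs. The sketch treats labels 01 through 20 as valid (so 20 selects Vernon), ignores every other pair, and also skips a label that has already been used, which is an added assumption for sampling without replacement.

```python
import random


def select_from_digit_row(digits, population, sample_size, width=2):
    """Read fixed-width labels from a row of random digits and return
    the first `sample_size` distinct individuals they select.

    Hypothetical helper for illustration; labels run from 1 to
    len(population), matching the numbered list on the slide.
    """
    # Keep only digit characters so the row can be typed with spaces.
    clean = "".join(ch for ch in digits if ch.isdigit())
    used_labels = set()
    chosen = []
    for start in range(0, len(clean) - width + 1, width):
        label = int(clean[start:start + width])
        # Ignore labels outside 1..N and labels already selected (assumption).
        if 1 <= label <= len(population) and label not in used_labels:
            used_labels.add(label)
            chosen.append(population[label - 1])
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("row of digits ran out before the sample was complete")


people = [
    "Aidan", "Bob", "Chico", "Doug", "Edward",
    "Fred", "Gloria", "Hannah", "Israel", "Jung",
    "Kathy", "Lori", "Matthew", "Nancy", "Opus",
    "Paul", "Shawnie", "Tracy", "Uncle Sam", "Vernon",
]
row_1 = "45 18 05 13 71 20 15 58 01 57 03 89 93 43 50 63"

print(select_from_digit_row(row_1, people, 5))

# The random number generator method from the list of methods:
# random.sample draws without replacement, and its output varies per run.
print(random.sample(people, 5))
```

Writing the procedure out this way makes the AP requirement concrete: anyone given the same labels, the same row, and the same rules (pair width, which labels to ignore, when to stop) should arrive at exactly the same sample.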