Ch12 Sample Surveys
Population and sample
Ideas for a good sample survey
Sampling methods:
• Simple Random Sampling
• Stratified Sampling
• Cluster Sampling
• Multistage Sampling
• Systematic Sampling
Population and sample
• The population is the entire group of individuals that we
want information about.
• A sample is a part of the population that we actually
examine in order to gather information.
• We’d like to know about an entire population, but
examining all of them is usually impractical, if not
impossible. So we settle for examining a sample.
Population and sample
• In a population, there are usually parameters of interest
whose values are unknown.
• We use sample estimators to estimate the values of those
parameters.
• The sample estimators are called sample statistics.
Population and sample
Name                 Population Parameter   Sample Statistic
Mean                 µ                      x̄
Standard deviation   σ                      s
Correlation          ρ                      r
Proportion           p                      p̂
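As a small illustration of the sample statistics named above, the following sketch computes x̄, s, r, and p̂ from a made-up sample. The data values, the pass mark of 70, and the helper name `pearson_r` are assumptions chosen only for this example.

```python
import math
import statistics

# Hypothetical sample of 6 students: hours studied and exam score.
hours = [2.0, 3.5, 1.0, 4.0, 5.5, 3.0]
scores = [65.0, 78.0, 55.0, 82.0, 93.0, 71.0]

def pearson_r(xs, ys):
    """Sample correlation r between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x_bar = statistics.mean(scores)    # sample mean, estimates mu
s = statistics.stdev(scores)       # sample standard deviation (n - 1 divisor), estimates sigma
r = pearson_r(hours, scores)       # sample correlation, estimates rho
p_hat = sum(score >= 70 for score in scores) / len(scores)  # sample proportion, estimates p

print(x_bar, s, r, p_hat)
```

Each value is computed from the sample only; the corresponding population parameter stays unknown.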
Sample survey
• A sample survey is a study that asks questions of a
sample drawn from some population in the hope of
learning something about the entire population.
• If the sample is the whole population, such a study is
called a census.
• Often a sample is a true subset of the population.
• The sampling frame is a list of individuals from which the
sample is drawn.
• The number of units in a sample is called the sample
size.
Sample survey
• Sampling methods that, by their nature, tend to over- or
under- emphasize some characteristics of the population
are said to be biased.
There is usually no way to fix a biased sample and no
way to salvage useful information from it.
• Examples of biased sampling:
In order to know about the quality of the food at a
cafeteria, we ask students on their way out of that
cafeteria.
The majority of a poll taken at a statistics support Web
site (with 12,357 responses) said they enjoy doing
statistics homework. So we are quite sure that most
Statistics students feel this way, too.
Sample survey
• Basic principles of getting good samples
Randomization: Select a sample randomly.
For example, assign numbers to members of the
population. Use a random number generator to
select your sample.
Sample size is important. We need a large enough
sample so that we can get fair representation.
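The randomization recipe above (number the members of the population, then let a random number generator pick) might be sketched as follows. The function name `simple_random_sample`, the population size of 300, the sample size of 30, and the seed are assumptions for illustration.

```python
import random

def simple_random_sample(population_size, n, seed=None):
    """Number the population 1..population_size and draw n distinct numbers at random."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    numbers = range(1, population_size + 1)
    return sorted(rng.sample(numbers, n))  # sampling without replacement

# Hypothetical use: pick 30 of 300 numbered individuals.
chosen = simple_random_sample(300, 30, seed=12)
print(chosen)
```

Because `random.sample` draws without replacement and treats every member alike, every group of 30 numbers is equally likely to be chosen.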
Sampling method
• Simple Random Sample (SRS): A sample in which
every group of n individuals has the same chance of
being selected.
• Stratified Sample: Usually used for large population
sizes.
1) First we divide the entire population into groups
(with common characteristics). Such groups are
called strata.
2) Within each stratum we use simple random sampling
method.
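The two stratified-sampling steps above could look like this in code. This is only a sketch: the function name `stratified_sample`, the strata names, and the split of 50 first-class and 250 other passengers are hypothetical.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of sizes[name] members within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

# Step 1: divide the population into strata (assumed split of 300 passengers).
strata = {
    "first_class": [f"F{i}" for i in range(1, 51)],
    "other": [f"E{i}" for i in range(1, 251)],
}

# Step 2: simple random sampling within each stratum.
sample = stratified_sample(strata, {"first_class": 5, "other": 25}, seed=1)
print(sample)
```

Every stratum is guaranteed to appear in the sample, which a single simple random sample of the whole population cannot promise.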
Sampling method
• Cluster sampling:
1) First we divide the population into some clusters or
groups.
2) Then we randomly select a few clusters and perform
a census in the clusters selected.
• Comparison between Stratified and Cluster sampling:
Strata are homogeneous, but differ from one another.
Clusters are heterogeneous and resemble the overall
population.
• Multistage sampling: Combination of several
sampling methods.
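For comparison with stratified sampling, here is a minimal sketch of the cluster sampling steps described above: randomly pick a few clusters, then take a census of each picked cluster. The function name `cluster_sample`, the six seat positions, and the 50 rows are assumptions made for the example.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """Randomly select k clusters, then include every member of the selected clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)                 # random choice of clusters
    surveyed = [m for name in chosen for m in clusters[name]]  # census within each
    return chosen, surveyed

# Hypothetical plane: 50 rows, 6 seat positions; each seat position is one cluster.
positions = ["left window", "left center", "left aisle",
             "right aisle", "right center", "right window"]
clusters = {pos: [(row, pos) for row in range(1, 51)] for pos in positions}

chosen, surveyed = cluster_sample(clusters, 1, seed=7)
print(chosen, len(surveyed))
```

Note the contrast with stratified sampling: here randomness decides which groups are used at all, and nobody is sampled from the groups that are not chosen.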
Systematic sampling:
» Sometimes we draw a sample by selecting individuals
systematically.
˃ For example, you might survey every 10th person on
an alphabetical list of students.
» To make it random, you must still start the systematic
selection from a randomly selected individual.
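A sketch of systematic sampling with a random start, as described above. The function name `systematic_sample` and the list of 300 student labels are hypothetical.

```python
import random

def systematic_sample(items, step, seed=None):
    """Take every step-th item, starting from a randomly chosen position in the first step."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random start among positions 0..step-1
    return items[start::step]

# Hypothetical alphabetical list of 300 students; survey every 10th one.
students = [f"student_{i:03d}" for i in range(1, 301)]
picked = systematic_sample(students, 10, seed=3)
print(picked)
```

The random start is the only random ingredient; once it is fixed, the rest of the sample is determined by the list order.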
Sampling method
• Illustration: SRS
• Illustration: Stratified sampling
• Illustration: Cluster sampling
Sampling method
• Example:
1) An airline company wants to survey a random
sample of the 300 passengers on a flight from San
Francisco to Tokyo. They could
i. From the boarding list, randomly choose 5
people flying first class and 25 of the other
passengers. (Stratified sampling)
ii. Randomly generate 30 seat numbers and survey
the passengers who sit there. (SRS)
iii. Randomly select a seat position (right window,
right center, right aisle, etc.) and survey all the
passengers sitting in those seats. (Cluster sampling)
iv. Pick every 10th passenger as people board the
plane. (Systematic sampling)
Sampling method
• Example:
2) If the airline company wants to survey a random
sample of all the passengers on flights from San
Francisco to Tokyo in the past year, they could
i. first randomly select 10 flights from each month;
ii. for each selected flight, randomly generate 30
seat numbers and survey the passengers who sit
there.
(Multistage sampling)
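The two stages of this airline example might be sketched as below. The function name `multistage_sample`, the flight labels, the 28 flights per month, and the 300 seats per plane are assumptions; only the 10 flights per month and 30 seats per flight come from the example.

```python
import random

def multistage_sample(flights_by_month, flights_per_month, seats_per_flight,
                      seats_on_plane, seed=None):
    """Stage 1: random flights within each month. Stage 2: random seats on each chosen flight."""
    rng = random.Random(seed)
    picks = []
    for month, flights in flights_by_month.items():
        for flight in rng.sample(flights, flights_per_month):           # stage 1
            seats = rng.sample(range(1, seats_on_plane + 1), seats_per_flight)  # stage 2
            picks.extend((month, flight, seat) for seat in seats)
    return picks

# Hypothetical schedule: 28 flights in each of 12 months, labelled "MM-DD".
flights_by_month = {m: [f"{m:02d}-{d:02d}" for d in range(1, 29)]
                    for m in range(1, 13)}

picks = multistage_sample(flights_by_month, flights_per_month=10,
                          seats_per_flight=30, seats_on_plane=300, seed=5)
print(len(picks))
```

Stage 1 treats the months as strata, and stage 2 is a simple random sample of seats within each selected flight, so the design combines two of the methods above.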
The January 2005 Gallup Youth Survey telephoned a random
sample of 1,028 U.S. teens aged 13-17 and asked these
teens to name their favorite movie from 2004. Napoleon
Dynamite had the highest percentage with 8% of teens
ranking it as their favorite movie. Which is true?
I. The population of interest is U.S. teens aged 13-17.
II. 8% is a statistic and not the actual percentage of all U.S. teens
who would rank this movie as their favorite.
III. This sampling design should provide a reasonably accurate
estimate of the actual percentage of all U.S. teens who would
rank this movie as their favorite.
A. I only
B. II only
C. III only
D. I, II, and III
» It isn’t sufficient to just draw a sample and start asking
questions. Before you set out to survey, ask yourself:
˃ What do I want to know?
˃ Am I asking the right respondents?
˃ Am I asking the right questions?
˃ What would I do with the answers if I had them; would
they address the things I want to know?
These questions may sound obvious, but there are a number of
pitfalls to avoid.
» Know what you want to know.
˃ Understand what you hope to learn and from whom you
hope to learn it.
» Use the right frame.
˃ Be sure you have a suitable sampling frame.
» Tune your instrument.
˃ The survey instrument itself can be the source of errors.
» Ask specific rather than general questions.
» Ask for quantitative results when possible.
» Be careful in phrasing questions.
˃ A respondent may not understand the question or
may understand the question differently than the way
the researcher intended it.
» Even subtle differences in phrasing can make a
difference.
» Be careful in phrasing answers.
˃ It’s often a better idea to offer choices rather than inviting
a free response.
The best way to protect a survey from
unanticipated measurement errors is to
perform a pilot survey.
A pilot is a trial run of a survey you
eventually plan to give to a larger group.
» Sample Badly with Volunteers:
˃ In a voluntary response sample, a large group of
individuals is invited to respond, and all who do
respond are counted.
+ Voluntary response samples are almost always biased, and so
conclusions drawn from them are almost always wrong.
˃ Voluntary response samples are often biased toward
those with strong opinions or those who are strongly
motivated.
˃ Since the sample is not representative, the resulting
voluntary response bias invalidates the survey.
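A small simulation can make voluntary response bias concrete. Everything in this sketch is assumed for illustration: the population of 100,000 students, the 30% who enjoy homework, and the response rates of 20% and 2%.

```python
import random

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical population: True means "enjoys statistics homework".
population = [True] * 30_000 + [False] * 70_000
true_p = sum(population) / len(population)

# Voluntary response: assume those who enjoy homework are ten times
# more likely to answer an open web poll than those who do not.
RESPOND_IF_ENJOY = 0.20
RESPOND_IF_NOT = 0.02
volunteers = [enjoys for enjoys in population
              if rng.random() < (RESPOND_IF_ENJOY if enjoys else RESPOND_IF_NOT)]
voluntary_p = sum(volunteers) / len(volunteers)

# Simple random sample of 1,000 from the same population.
srs = rng.sample(population, 1_000)
srs_p = sum(srs) / len(srs)

print(f"true p = {true_p:.2f}")
print(f"voluntary response: n = {len(volunteers)}, p-hat = {voluntary_p:.2f}")
print(f"SRS: n = {len(srs)}, p-hat = {srs_p:.2f}")
```

Under these assumptions the voluntary response sample is several times larger than the SRS yet lands much farther from the true proportion, because a large sample size does not repair a biased selection method.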
» Sample Badly, but Conveniently:
˃ In convenience sampling, we simply include the
individuals who are convenient.
+ Unfortunately, this group may not be
representative of the population.
˃ Convenience sampling is not only a problem for
students or other beginning samplers.
+ In fact, it is a widespread problem in the business
world—the easiest people for a company to sample
are its own customers.
» Sample from a Bad Sampling Frame:
˃ An SRS from an incomplete sampling frame
introduces bias because the individuals included may
differ from the ones not in the frame.
» Undercoverage:
˃ Many of these bad survey designs suffer from
undercoverage, in which some portion of the
population is not sampled at all or has a smaller
representation in the sample than it has in the
population.
˃ Undercoverage can arise for a number of reasons, but
it’s always a potential source of bias.
» Watch out for nonrespondents.
˃ A common and serious potential source of bias for
most surveys is nonresponse bias.
˃ No survey succeeds in getting responses from
everyone.
+ The problem is that those who don’t respond may
differ from those who do.
+ And they may differ on just the variables we care
about.
» Don’t bore respondents with surveys that go on and on
and on and on…
˃ Surveys that are too long are more likely to be refused,
reducing the response rate and biasing all the results.
» Work hard to avoid influencing responses.
˃ Response bias refers to anything in the survey design
that influences the responses.
» Make sure the question wording is neutral.
˃ Many surveys, especially those conducted by special-interest
groups, present one side of an issue before
the question itself.
A chemistry professor who teaches a large
lecture class surveys the students who
attend his class on how he can make the
class more interesting to get more students
to attend. This survey method suffers from
A. voluntary response bias
B. nonresponse bias
C. response bias
D. undercoverage
E. none of the above
Which statement about bias is true?
I. Bias results from randomization and will always
be present.
II. Bias results from samples that do not represent
the population.
III. Bias is usually reduced when sample size is
larger.
A. I only
B. II only
C. III only
D. I and III only
E. I, II, and III
Suggested exercises from the textbook:
Chapter 12: 2, 4, 8, 11, 15, 17, 19, 20, 23,
24, 34, 35, 36