stratified random sample

+ Homework
Please Complete the Following
READ
p. 215 - 219
COMPLETE
(p.227)19, 21, 23, 25, 27-29,
31, 33, 35
CHECK
your answers in the back of the book
Upcoming
2nd Chance MC Chapter 3 – this Fri 10/26 & next Mon 10/29 during lunch
Bonus Test Chapter 3 – this Fri 10/26 after school
Chapter 4 Test Tues 11/6
+ Last Night’s Homework - Evens
Check your Answers
4.12
(a) Number the 33 complexes from 01 to 33 alphabetically. Go to the random number
table and pick a starting point. Record two-digit numbers, skipping any that aren’t between 01
and 33 or are repeats, until you have 3 unique numbers between 01 and 33. (b) Starting at line
117 we read off the following numbers: 38 (ignore) 16 79 (ignore) 85 (ignore) 32 62 (ignore) 18.
So we have picked: Fairington (16), Waterford Court (32) and Fowler (18).
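The table-reading procedure in 4.12 can be sketched in code. This is only an illustration, not part of the textbook solution: the function name `read_labels` and its parameters are made up for this example, and the digit string holds just the two-digit numbers quoted in the answer above rather than the full line of the table.

```python
def read_labels(digits, width, low, high, count):
    """Read fixed-width labels from a string of random digits.

    Skips labels outside low..high and repeats, and stops once
    `count` unique labels have been collected.
    """
    # Drop spaces so the digits are read as one continuous run.
    digits = "".join(ch for ch in digits if ch.isdigit())
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if low <= label <= high and label not in chosen:
            chosen.append(label)
        if len(chosen) == count:
            break
    return chosen


# Two-digit numbers quoted in the 4.12 answer (line 117 of the table).
line_117 = "38 16 79 85 32 62 18"
picked = read_labels(line_117, width=2, low=1, high=33, count=3)
print(picked)  # [16, 32, 18]
```

The same function with `width=5, low=1, high=55914` mirrors the gravestone numbering described in 4.14.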
4.14
(a) Number the gravestones from 00001 to 55914. Go to the random number table
and pick a starting point. Record 5-digit numbers, skipping any that aren’t between 00001 and
55914 or are repeats, until you have 395 unique numbers between 00001 and 55914. (b) Starting
at line 127 we read off the following numbers: 43909 99477 (ignore) 25330 64359 (ignore) 40085.
So the first three gravestones picked are those numbered 43909, 25330
and 40085.
4.16
(a) False—if it were true, then after looking at 39 digits, we would know whether or not
the 40th digit was a 0. (b) True—there are 100 pairs of digits 00 through 99, and all are equally
likely. (c) False— 0000 is just as likely as any other string of four digits.
4.18
(a) To obtain an SRS, every tree would need to have an equal chance of being
included in the sample. It is not practical to even identify every tree in the park. (b) This
sampling method is biased because these trees are unlikely to be representative of the
population. Trees along the main road are more likely to be damaged by cars and people and
may be more susceptible to infestation. (c) The scientists can be confident that the actual
percentage of pine trees in the area that are infected by the pine beetle is near 35% although
there is always some error associated with using sampling to estimate population parameters.
+
Chapter 4: Designing Studies
Section 4.1
Sampling and Surveys
Day 3 / 11
+ Section 4.1
Designing Samples
Learning Objectives
After this section, you should be able to…

DISTINGUISH between simple random samples, stratified random
samples, and cluster samples

IDENTIFY advantages and disadvantages of each random sampling
method

EXPLAIN how undercoverage, nonresponse and question wording
can lead to bias in sample surveys
+ Paired Activity: Rolling Down the River
When is a stratified random sample useful?
1. Complete the packet. Be deliberately random and clearly identify all 4 steps.
2. Plot your mean yield data on the board.
3. Answer the questions on p.3.
4. Repeat the process with the addition of a new irrigation system.
5. When is it more useful to use stratified sampling?
6. When is it more useful to use a SRS?
+
Stratified Random Sample
SRS refers only to a simple random sample.

The basic idea of sampling is straightforward: take a SRS from
the population and use your sample results to gain information
about the population. Sometimes there are statistical advantages
to using more complex sampling methods.

One common alternative to a SRS involves sampling important
groups (called strata) within the population separately. These
“sub-samples” are combined to form one stratified random
sample.
Definition:
To select a stratified random sample, first classify the
population into groups of similar individuals, called
strata. Then choose a separate SRS in each stratum
and combine these SRSs to form the full sample.
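As a rough sketch of the definition, the two steps (classify into strata, then take a separate SRS in each stratum) might look like this in Python. The function name `stratified_sample`, the student list, and the per-grade sample sizes are all hypothetical and chosen only for illustration.

```python
import random


def stratified_sample(population, stratum_of, sizes, seed=None):
    """Take a separate SRS inside each stratum and combine them.

    population : list of individuals
    stratum_of : function mapping an individual to its stratum label
    sizes      : dict {stratum label: SRS size for that stratum}
    """
    rng = random.Random(seed)
    # Step 1: classify the population into strata.
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    # Step 2: choose a separate SRS in each stratum, then combine.
    sample = []
    for label, size in sizes.items():
        sample.extend(rng.sample(strata[label], size))
    return sample


# Hypothetical school: 400 students, 100 per grade (invented data).
students = [(f"student{i}", 9 + i % 4) for i in range(400)]
sample = stratified_sample(
    students,
    stratum_of=lambda s: s[1],
    sizes={9: 20, 10: 20, 11: 20, 12: 20},
    seed=1,
)
print(len(sample))  # 80 students, 20 from each grade
```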

+
Stratified Random Sample
When is a stratified random sample more useful than a SRS?
When the individuals in each stratum are less varied than the population as a whole
In other words, when you have identifiable groups of similar individuals

Name three examples of when a stratified random sample would be useful

Student opinion at a high school – strata would be grade level
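A small simulation can illustrate why strata that are "less varied than the population as a whole" help. The sketch below uses an invented field of 100 plots in which yield depends mostly on the row. It compares how much the sample mean varies under an SRS of 10 plots versus a stratified sample of one plot per row. All numbers are made up, so only the comparison matters, not the values.

```python
import random
import statistics

rng = random.Random(2024)

# Hypothetical field: 10 rows (strata) of 10 plots. Yield depends mostly
# on the row, so plots within a row are similar. All numbers are invented.
field = {row: [10 * row + rng.uniform(-1, 1) for _ in range(10)]
         for row in range(1, 11)}
all_plots = [y for plots in field.values() for y in plots]


def srs_mean():
    # SRS of 10 plots from the whole field.
    return statistics.mean(rng.sample(all_plots, 10))


def stratified_mean():
    # One randomly chosen plot from each row, 10 plots in total.
    return statistics.mean([rng.choice(plots) for plots in field.values()])


srs_spread = statistics.stdev([srs_mean() for _ in range(2000)])
strat_spread = statistics.stdev([stratified_mean() for _ in range(2000)])
print(f"SD of SRS means:        {srs_spread:.2f}")
print(f"SD of stratified means: {strat_spread:.2f}")
```

Under these assumptions the stratified means cluster much more tightly around the true mean, because every sample is forced to include each row.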
+
Other Sampling Methods
Although a stratified random sample can sometimes
give more precise information about a population
than a SRS, both sampling methods are hard to use
when populations are large and spread out over a
wide area.
In that situation, we’d prefer a method that selects
groups of individuals that are “near” one another.
Definition:
To take a cluster sample, first divide the population
into smaller groups. Ideally, these clusters should
mirror the characteristics of the population. Then
choose an SRS of the clusters. All individuals in the
chosen clusters are included in the sample.
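The cluster sample definition can be sketched the same way: the random choice is made among clusters, and everyone in a chosen cluster is kept. The function name `cluster_sample` and the city-block data below are hypothetical and serve only as an illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Choose an SRS of clusters and keep every individual in them.

    clusters : dict {cluster label: list of individuals}
    """
    rng = random.Random(seed)
    # The SRS is taken over cluster labels, not over individuals.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [person for label in chosen for person in clusters[label]]
    return chosen, sample


# Hypothetical city: 50 blocks of 12 households each (invented data).
blocks = {b: [f"block{b}-house{h}" for h in range(1, 13)]
          for b in range(1, 51)}
chosen_blocks, sample = cluster_sample(blocks, n_clusters=5, seed=1)
print(chosen_blocks, len(sample))
```

Only 5 blocks need to be visited, which is the practical appeal when the population is spread over a wide area.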
+
Example: Sampling at a School Assembly
Describe how you would use the following sampling methods
to select 80 students to complete a survey.

(a) Simple Random Sample

(b) Stratified Random Sample

(c) Cluster Sample

+
Check Your Understanding
The manager of a sports arena wants to learn more about the
financial status of the people who are attending an NBA basketball
game. He would like to give a survey to a representative sample of
the more than 20,000 fans in attendance. Ticket prices for the game
vary a great deal: seats near the court cost over $100 each, while
seats in the top rows of the arena cost $25 each. The arena is
divided into 30 numbered sections, from 101 to 130. Each section
has rows of seats labeled with letters from A (nearest to the court) to
ZZ (top row of the arena).

Why might it be difficult to give the survey to an SRS of 200 fans?

Which would be a better way to take a stratified random sample of
fans: using the lettered rows or numbered sections as strata? Why?

Which would be a better way to take a cluster sample of fans: using
the lettered rows or numbered sections as clusters? Why?

+
Bias
ERROR
Favors certain outcomes
Anything that causes the data to be wrong! It might be attributed to the
researchers, the respondent, or to the sampling method!
There are many sources of bias
+
Voluntary response
People choose to respond
Usually only people with very strong opinions respond
An example would be the surveys in magazines that ask readers to mail in responses.
Other examples are call-in shows, American Idol, etc.
Remember – the way to determine voluntary response is: Self-selection!!
+
Convenience sampling
Ask people who are easy to ask
Produces biased results
The data obtained by a convenience sample will be biased – however this
method is often used for surveys & results reported in newspapers and magazines.
An example would be stopping friendly-looking people in the mall to survey.
Another example is the surveys left on tables at restaurants - a convenient method!
+
Undercoverage
Some groups within the population are left out of the sampling process
Suppose you take a sample by randomly selecting names from the phone book –
some groups will not have the opportunity of being selected.
People with unlisted phone numbers – usually high-income families
People without phone numbers – usually low-income families
People with ONLY cell phones – usually young adults
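To see how undercoverage can push an estimate in one direction, here is a small simulation under loudly stated assumptions: the incomes are invented, and the only mechanism modeled is that lower-income households are less likely to appear in the phone book. It is a sketch of the idea, not a model of any real survey.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical household incomes (in $1000s); all numbers are invented.
# Assumption: lower-income households are less likely to be listed.
population = []
for _ in range(10_000):
    income = rng.uniform(10, 150)
    listed = rng.random() < (0.3 if income < 40 else 0.9)
    population.append((income, listed))

true_mean = statistics.mean([income for income, _ in population])

# Sampling frame = phone book only: unlisted households can never be chosen.
frame = [income for income, listed in population if listed]
estimates = [statistics.mean(rng.sample(frame, 100)) for _ in range(500)]
avg_estimate = statistics.mean(estimates)

print(f"true mean income:          {true_mean:.1f}")
print(f"average phone-book result: {avg_estimate:.1f}")
```

Even though each sample is chosen at random from the phone book, the results are consistently too high, because the group that was left out differs from the rest of the population.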
+
Nonresponse
Occurs when an individual chosen for the sample can’t be contacted or refuses to cooperate
People are chosen by the researchers, BUT refuse to participate.
NOT self-selected! This is often confused with voluntary response!
Because of huge telemarketing efforts in the past few years,
telephone surveys have a MAJOR problem with nonresponse!
Telephone surveys: 70% nonresponse
One way to help with the problem of nonresponse is to make follow-up
contact with the people who are not home when you first contact them.
+
Response bias
Occurs when the behavior of the respondent or interviewer causes bias in the sample
Wrong answers
Response bias occurs when for some reason (interviewer’s or respondent’s fault)
you get incorrect answers.
Suppose we wanted to survey high school students on drug abuse and
we used a uniformed police officer to interview each student in our
sample – would we get honest answers?
+
Wording of the Questions
Wording can influence the answers that are given
Connotation of words
Use of “big” words or technical words
Questions must be worded as neutrally as possible to avoid influencing the response.
The level of vocabulary should be appropriate for the population you are surveying
+
Source of Bias?
1) Before the presidential election of 1936, which pitted
FDR against Republican Alf Landon, the
magazine Literary Digest predicted that Landon
would win the election in a 3-to-2 victory, based on a
survey of 10 million people. George Gallup
surveyed only 50,000 people and predicted
that Roosevelt would win. The Digest’s survey came from
magazine subscribers, car owners, telephone directories, etc.
Undercoverage – since the Digest’s survey comes from car owners, etc.,
the people selected were mostly from high-income
families and thus mostly Republican!
(other answers are possible)
+
2) Suppose that you want to estimate the
total amount of money spent by students
on textbooks each semester at FSU. You
collect register receipts for students as
they leave the bookstore during lunch one
day.
Convenience sampling – easy way to
collect data
or
Undercoverage – students who buy
books from on-line bookstores are
not included.
3) To find the average value of a home
in Plano, one averages the price of
homes that are listed for sale with a
realtor.
Undercoverage – leaves out homes
that are not for sale or homes that
are listed with different realtors.
(other answers are possible)
+
AP EXAM HINTS
It is NOT enough to say that bias exists….
A student MUST
1) explain that the bias pushes the results one
way or the other
2) specify the direction!!
However, unless you are explicitly ASKED TO
IDENTIFY the type of bias present, DO NOT!!!!
Just focus on the box above, state that the results
are biased and state the direction.
+
Inference for Sampling
The purpose of a sample is to give us information about a
larger population.

The process of drawing conclusions about a population on the
basis of sample data is called inference.
Why should we rely on random sampling?
1) To eliminate bias in selecting samples from the list of
available individuals.
2) The laws of probability allow trustworthy inference about the
population.
• Results from random samples come with a margin of
error that sets bounds on the size of the likely error.
• Larger random samples give better information about the
population than smaller samples.

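The claim that larger random samples give better information can be checked with a quick simulation. In this sketch the population proportion `TRUE_P` is an assumed value chosen for illustration, and each response is drawn independently, which approximates sampling from a very large population.

```python
import random
import statistics

rng = random.Random(7)
TRUE_P = 0.35  # assumed population proportion, for illustration only


def sample_proportion(n):
    # Proportion of "yes" answers in one random sample of size n.
    return sum(rng.random() < TRUE_P for _ in range(n)) / n


spreads = {}
for n in (25, 100, 400, 1600):
    estimates = [sample_proportion(n) for _ in range(1000)]
    spreads[n] = statistics.stdev(estimates)
    print(f"n = {n:5d}  SD of sample proportions = {spreads[n]:.3f}")
```

Each fourfold increase in sample size roughly halves the spread of the estimates, which is why the likely error shrinks as the sample grows.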
+
Sample Surveys: What Can Go Wrong?
Most sample surveys are affected by errors in addition to
sampling variability.

Good sampling technique includes the art of reducing all
sources of error.
Definition
Undercoverage occurs when some groups in the population
are left out of the process of choosing the sample.
Nonresponse occurs when an individual chosen for the sample
can’t be contacted or refuses to participate.
A systematic pattern of incorrect responses in a sample survey
leads to response bias.
The wording of questions is the most important influence on
the answers given to a sample survey.

+
Section 5.1 & 5.2
Designing Samples and Experiments
Homework
2005 AP Open Response Question 5
Read pp 360-364 – quiz tomorrow on all of 5.2 through p. 364; may use notes
p.364: 5.39-5.41
p.366: 5.43-5.44
+
AP EXAM HISTORICAL PERSPECTIVE
Open Response Questions - Chapter 5
1997   #2        2003B  #3a           2007   #2, 5a
1998   #3        2003B  #4abd         2007B  #3
1999   #3        2004   #2, 3d, 5b    2008   #2
2000   #5        2004B  #2, 6c        2008B  #4a
2001   #4        2005   #1bc, 5ac     2009   #3
2002   #2        2005B  #3            2009B  #4, 6a
2002B  #3        2006   #5            2010B  #2
2003   #4        2006B  #5, 6f        2011B  #2