How does a sample in statistics represent a population given certain

Introduction to Statistics
How do I understand statistics as a process for
making inferences about population parameters
based on a random sample from that population?
How do I recognize the purposes of and difference
among statistical data gathering methods?
What is/are statistics?
• Statistics is a way of reasoning, along with a
collection of tools and methods, designed
to help us understand the world.
• Statistics are particular calculations made
from data.
Population Data or Sample Data?
• Population data is used when you are
gathering data from every individual of
interest.
• Ex: Asking the entire football team a question
• Sample data is used when you are
gathering data from some of the
individuals of interest.
• Ex: Asking only the offensive line a question and applying
the answer to the entire football team
Population Data or Sample Data?
The US Government takes a census of its citizens
every 10 years to gather information.
A. Population
B. Sample
Population Data or Sample Data?
You want to know what sports teens prefer
so you send out a survey to all the students
in your high school.
A. Population Data
B. Sample Data
Population Data or Sample Data?
You want data on the shoe size of all Mt.
Tabor students, so you interview every
student at school.
A. Population
B. Sample
Population Data or Sample Data?
You want to know how long people in
Winston-Salem visited the beach last
summer, so you polled 50 random
people at the Dixie Classic Fair.
A. Population
B. Sample
Population Data or Sample Data?
You want to know the average GPA of Mt. Tabor
students, so you ask all of the students in all of
your classes.
A. Population
B. Sample
Parameter vs. Statistic
• A statistic is a descriptive measure computed
from a sample of data.
• A parameter is a descriptive measure computed
from an entire population of data.
• Inferential statistics enables you to make an
educated guess about a population parameter
based on a statistic computed from a sample
randomly drawn from that population.
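The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the class heights are made-up random numbers, and the names `population`, `sample`, `parameter_mean`, and `statistic_mean` are chosen for this example.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (in inches) of all 40 students in a class.
population = [rng.randint(58, 76) for _ in range(40)]

# Parameter: computed from every individual in the population.
parameter_mean = mean(population)

# Statistic: computed from a random sample drawn from that population.
sample = rng.sample(population, 10)
statistic_mean = mean(sample)

print(f"Population mean (parameter): {parameter_mean:.2f}")
print(f"Sample mean (statistic):     {statistic_mean:.2f}")
```

The two numbers are usually close but not equal. Inferential statistics is about how far the statistic can be trusted as a guess for the parameter.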
Parameter or statistic?
You want to know the mean income of the people
who subscribe to People magazine, so you
question 100 subscribers.
A. Parameter
B. Statistic
Parameter or statistic?
You want to know the average height of the
students in this math class, so you have
everyone in the class write their height on a
sheet of paper.
A. Parameter
B. Statistic
• A committee on community relations in
a college town plans to survey local
businesses about the importance of
students as customers. From telephone
book listings, the committee chooses
150 businesses at random. Of these, 73
return the questionnaire mailed by the
committee.
• What is the population for this sample
survey?
• What is the sample?
Correlation: The degree to which two or more
measurements on the same group of elements show a
tendency to vary together.
Causation: The degree to which one causes the other to
happen.
Correlation does not imply causation
Example: There is correlation between the number of
people wearing shorts and temperature. More people
wearing shorts doesn’t cause higher temperatures!
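One common way to measure how strongly two measurements vary together is the Pearson correlation coefficient r, which ranges from -1 to 1. The sketch below computes it for the shorts example. The temperature and shorts counts are made-up numbers for illustration, and `pearson_r` is a helper written for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up data: daily high temperature (°F) and the number of
# people seen wearing shorts on that day.
temperature = [55, 60, 68, 75, 82, 90]
shorts_count = [2, 5, 9, 14, 20, 27]

r = pearson_r(temperature, shorts_count)
print(f"r = {r:.3f}")
```

A value of r close to 1 says only that the two lists rise together. The calculation is symmetric in the two lists, so it cannot say which one, if either, causes the other.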
Ways to Gather Data
• Survey – a questionnaire used to collect interesting data on a
certain topic from a sample of people.
• EX: You want to find out how many students in your class had
a summer job.
• EX: The government wants to determine average household
income in the United States.
• EX: You want to know if tattoos have an influence on a
person’s GPA.
Ways to Gather Data
• Observational Study – we observe individuals and
measure variables of interest but do not attempt to
influence the responses. Observational studies may
show a correlation between variables, but they cannot
establish causation on their own.
• EX: A study of child care enrolled 1364 infants in 1991
and planned to follow them through their sixth year in
school. In 2003, the researchers published an article
finding that “the more time children spent in child care
from birth to age four-and-a-half, the more adults
tended to rate them, both at age four-and-a-half and at
kindergarten, as less likely to get along with others, as
more assertive, as disobedient, and as aggressive.”
Ways to Gather Data
• Experiment – we deliberately impose some treatment on
(that is, do something to) individuals in order to observe their
responses. Experiments can carry more convincing evidence
of a cause and effect relationship.
• EX: “Take the Pepsi Challenge” – in the 80’s Pepsi had a huge
marketing scheme that had people do a blind taste test to see
which soda they preferred – Pepsi or Coke.
• EX: Does vitamin C reduce the chance of getting a common
cold?
Which method would you choose?
You want to know the average GPA of a
football player at school this year.
A. Survey
B. Observational Study
C. Experiment
Which method was used?
The Gallup Poll questions a sample of
about 1500 adult U.S. residents to
determine national opinion on a variety
of issues.
A. Survey
B. Observational Study
C. Experiment
Which method would you choose?
Does working with computers improve
student performance in school?
A. Survey
B. Observational Study
C. Experiment
Which method is used?
A kindergartener is given a choice: eat one
marshmallow immediately, or wait 5 minutes and get
2 marshmallows. Years later, the kindergartener's
choice was used to determine whether delaying
gratification has an effect on SAT scores.
A. Survey
B. Observational Study
C. Experiment
Which method is used?
Medical records were used to determine if there is a
correlation between inducing labor and autism in
children.
A. Survey
B. Observational Study
C. Experiment
Sampling
• When conducting a survey, experiment, or
observational study, it is almost impossible
to survey everyone in a population so
people use various sampling methods to
gather information.
• One major concern about any sampling method
is whether it is a biased or unbiased way
to gather information.
Sampling Methods
• Random sampling: when everyone in a population has an
equal chance of being chosen for the sample.
• Stratified sampling: when the population is first divided into
similar categories and the number of members to sample from
each category is determined, usually in proportion to its size.
• Systematic sampling: when you choose members of the
population by a fixed rule (for example, assign numbers to
the population and then choose every 5th person
to participate).
• Cluster sampling: when you randomly put the population
into clusters, then choose a cluster randomly, and then
randomly choose people in that cluster to participate.
Example: selecting 10 animals from 25 dogs, 15 cats, and 10 rabbits
• Random sampling: when everyone in a population has an
equal chance of being chosen for the sample.
• Ex: Randomly select 10 from all 50 animals.
• Stratified sampling: when the population is first divided into
similar categories and the number of members to sample from
each category is determined.
• Ex: Select 5 from the 25 dogs, 3 from the 15 cats,
and 2 from the 10 rabbits.
• Systematic sampling: when you choose members of the
population by a fixed rule (for example, assign numbers
to the population and then choose every 5th person to
participate).
• Ex: Give every animal a random number and then
choose every 5th number.
• Cluster sampling: when you randomly put the population into
clusters, then choose a cluster randomly, and then
randomly choose people in that cluster to participate.
• Ex: Randomly put the animals into 2 groups of 25, choose a
group, and then choose 10 from that selected group.
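The four methods in the animal example can be sketched in Python as follows. This is a minimal illustration under the slide's own descriptions, not a standard library for sampling. The `(species, id)` tuples and all variable names are made up for the example.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# The 50 animals from the example: 25 dogs, 15 cats, 10 rabbits.
animals = (
    [("dog", i) for i in range(25)]
    + [("cat", i) for i in range(15)]
    + [("rabbit", i) for i in range(10)]
)

# Random sampling: every animal has an equal chance of selection.
random_sample = rng.sample(animals, 10)

# Stratified sampling: sample each species in proportion to its
# share of the population (10 out of 50, so 1/5 of each group).
stratified_sample = []
for species in ("dog", "cat", "rabbit"):
    group = [a for a in animals if a[0] == species]
    stratified_sample += rng.sample(group, len(group) * 10 // len(animals))

# Systematic sampling: put the animals in a random order,
# then take every 5th one.
ordered = animals[:]
rng.shuffle(ordered)
systematic_sample = ordered[4::5]

# Cluster sampling (as described above): randomly split into
# 2 groups of 25, pick one group, then pick 10 animals from it.
shuffled = animals[:]
rng.shuffle(shuffled)
clusters = [shuffled[:25], shuffled[25:]]
chosen_cluster = rng.choice(clusters)
cluster_sample = rng.sample(chosen_cluster, 10)

for name, s in [("random", random_sample), ("stratified", stratified_sample),
                ("systematic", systematic_sample), ("cluster", cluster_sample)]:
    print(name, len(s))
```

Only the stratified sample is guaranteed to contain 5 dogs, 3 cats, and 2 rabbits. The other three methods give each species that share only on average.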
Which sampling method is used in
the scenario below?
A Gallup poll surveyed 1,018 adults by
telephone in each of the 6 regions of the
country, and 22% of them reported that they
smoked cigarettes within the past week.
A. Random
B. Stratified
C. Systematic
D. Cluster
Which sampling method is used in
the scenario below?
A principal goes to one classroom in each
department and chooses two students from
each class to participate in a school climate
survey.
A. Random
B. Stratified
C. Systematic
D. Cluster
Which sampling method is used in
the scenario below?
WSFCS sends out a survey to parents by
generating a list of student numbers from
PowerSchool.
A. Random
B. Stratified
C. Systematic
D. Cluster
Biased Questions
• Some questions may use language that people can
associate with emotions:
• How much of your time do you waste on Facebook?
• Do you prefer the wonderful math class or boring
Shakespeare Class?
• Some questions may refer to a majority or supposed
authority:
• Would you agree with the NCAE that teachers should be
paid more for earning their master’s degree?
• Some questions may be phrased awkwardly:
• Do you disagree with people who oppose the ban on
smoking in public places?
Sampling Bias
• Occurs when one or more subgroups of a population are either
overrepresented or underrepresented when conducting a
survey or experiment. To avoid it, selection must be random and fair.
• Voluntary: People voluntarily turn in a survey
• Convenience: Questioner stays in one place
• Exclusion: Only asking certain members
• Under representation: Not getting 1/6 or at least 30 people
• Non-randomness: Calling the first five people on every page of
the phonebook.
• Self-selection: People choose groups
• Lack of double-blinding: Sampler knows which group/product
the person is selecting
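The effect of a convenience sample can be sketched with a small simulation. Everything here is a made-up assumption for illustration: a school of 600 students, 100 of whom are athletes who play far more sports per week than the rest, and a questioner who stays in the gym and so meets only athletes.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical weekly hours of sports played by each student.
athletes = [rng.randint(8, 15) for _ in range(100)]
others = [rng.randint(0, 4) for _ in range(500)]
population = athletes + others

true_mean = mean(population)

# Convenience sample: only athletes are ever asked.
convenience_sample = rng.sample(athletes, 30)

# Random sample: every student has an equal chance of being asked.
random_sample = rng.sample(population, 30)

print(f"True mean:          {true_mean:.2f}")
print(f"Convenience sample: {mean(convenience_sample):.2f}")
print(f"Random sample:      {mean(random_sample):.2f}")
```

Because athletes are overrepresented, the convenience sample overstates the school-wide average no matter which 30 athletes are picked.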
Biased or Unbiased – be prepared to
defend your response.
A person asks, “Do you prefer delicious
pancakes or cold soggy cereal?”
A. Biased
B. Unbiased
Biased or Unbiased – be prepared to
defend your response.
Asking people shopping at a farmer’s market if
they think locally grown fruit and vegetables
are healthier than supermarket fruits and
vegetables.
A. Biased
B. Unbiased
Biased or Unbiased – be prepared to
defend your response.
A survey about whether or not teachers who
earn their master’s degrees should be paid
more is sent out to all teachers in NC.
A. Biased
B. Unbiased
Activity:
• Martha wants to construct a survey that shows
which sports students at her school like to play
the most.
• List the goal of the survey.
• What sample of the population should she interview?
• How should she administer the survey?
• Create a data collection sheet that she can use to
record her results.
Errors in Summarizing Data
• No causation of effect: The effect could be
caused by something other than what is being
studied. (correlation only)
• (Ex: Frogs with no legs are deaf.)
• No causation of accurate population: Applying
your results to the population incorrectly.
• (Ex: Just because 85% of this class likes math, it
doesn’t mean that 85% of all students at this
school like math.)
www.tylervigen.com shows correlations that are
not causation.
What is needed to determine
causation for the population?

                        Random Selection            No Random Selection
Random Assignment       Causality; population       Causality; only to sample
No Random Assignment    No causality; population    No causality; no results!
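Random selection and random assignment are two separate steps, which the following sketch keeps apart. The population of 200 student ID numbers, the group sizes, and the variable names are all hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: 200 student ID numbers.
population = list(range(1, 201))

# Random selection: decides WHO takes part in the study.
# It is what lets results be applied to the whole population.
participants = rng.sample(population, 40)

# Random assignment: decides WHICH GROUP each participant joins.
# It is what lets a difference between groups be read as causal.
shuffled = participants[:]
rng.shuffle(shuffled)
treatment = shuffled[:20]
control = shuffled[20:]

print("Treatment group:", sorted(treatment))
print("Control group:  ", sorted(control))
```

Skipping the first step limits conclusions to the sample. Skipping the second step leaves only a correlation.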
Resources used:
• "Next: Introduction to Data and Measurement Issues Surveys
and Samples." CK-12 Foundation. N.p., n.d. Web. 21 Aug.
2013.
• Yates, Daniel S., David S. Moore, and Daren S. Starnes. The
Practice of Statistics: TI-83/84/89 Graphing Calculator Enhanced.
New York: W.H. Freeman, 2008. Print.
• Greg Fisher – Mount Tabor High School
• Christina Holst – Parkland High School
• Wendy Bartlett – Parkland High School