chapter 1 statistics

Part III
Gathering Data
Chapter 11
Understanding Randomness

Random


An event is random if we know what
outcomes could happen but not which
particular values did or will happen
Random Numbers

Truly random numbers are “hard to get”

Pseudorandom: numbers produced by a computer algorithm
that only look random
Table of random digits
Pick a number from the next slide
1 2 3 4
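
One way to see what "pseudorandom" means is to seed a generator twice with the same value: the "random" digits repeat exactly. A minimal sketch using Python's standard `random` module; the seed 11 and the helper name `random_digits` are arbitrary choices for illustration, not part of the slides.

```python
import random

def random_digits(seed, count):
    """Return `count` pseudorandom digits (0-9) from a generator seeded with `seed`."""
    rng = random.Random(seed)  # the seed value is an arbitrary assumption
    return [rng.randint(0, 9) for _ in range(count)]

first = random_digits(seed=11, count=20)
second = random_digits(seed=11, count=20)

# Same seed, same "random" digits: the sequence is fixed by the algorithm.
print(first == second)  # prints True
```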
Simulation


A simulation consists of a collection of things that
happen at random. It is used to model real-world
relative frequencies using random numbers.
Component

Situation that is repeated in the simulation. Each component
has a set of possible outcomes
Outcome

An individual result of a simulated component of a simulation
Trial

The sequence of events that we are pretending will take place
Step-by-step page 295
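
The simulation vocabulary can be sketched in code. The scenario is a hypothetical one chosen for illustration (it is not the textbook's step-by-step example): the component is one roll of a fair die, an outcome is the face that comes up, and a trial is the sequence of rolls needed to see the first 6.

```python
import random

rng = random.Random(295)  # arbitrary seed so the run is repeatable

def roll_die():
    """Component: one roll of a fair die. The returned face is the outcome."""
    return rng.randint(1, 6)

def run_trial():
    """Trial: keep rolling until a 6 appears; report how many rolls it took."""
    rolls = 1
    while roll_die() != 6:
        rolls += 1
    return rolls

# Repeat the trial many times and summarize the results.
results = [run_trial() for _ in range(10_000)]
average_rolls = sum(results) / len(results)
print(f"Average number of rolls to get a 6: {average_rolls:.2f}")
```

With many trials the average should settle near 6, the long-run value for a fair die.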
Chapter 12
Sample Surveys

Idea 1: Examine a part of the whole


Carefully select a smaller group from the
population (Sample)
A sample that does not represent the
population in some important way is said
to be biased

Idea 2: Randomize


Randomizing protects us from the influences
of all the features of our population, even
the ones that we may not have thought
about.
It is the best defense against bias: each
individual is given a fair, random
chance of selection.

Idea 3: It’s the sample size


The fraction of the population that you have
sampled doesn’t matter. It’s the sample size itself
that’s important.
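
A small simulation can make this idea concrete. The sketch below uses assumed numbers (populations of 10,000 and 1,000,000, half of each having some trait, samples of 100). It draws many random samples from each population and compares how much the sample proportion varies.

```python
import random
import statistics

rng = random.Random(12)  # arbitrary seed

def spread_of_sample_proportions(population_size, sample_size, repetitions=1000):
    """Standard deviation of the sample proportion across repeated random samples.

    Individuals are numbered 0..population_size-1; the first half are assumed
    to have the trait of interest (true proportion 0.5).
    """
    with_trait = population_size // 2
    proportions = []
    for _ in range(repetitions):
        sample = rng.sample(range(population_size), sample_size)
        proportions.append(sum(1 for i in sample if i < with_trait) / sample_size)
    return statistics.stdev(proportions)

small_pop = spread_of_sample_proportions(10_000, 100)     # samples 1% of the population
large_pop = spread_of_sample_proportions(1_000_000, 100)  # samples 0.01% of the population
print(f"spread, population of 10,000:    {small_pop:.3f}")
print(f"spread, population of 1,000,000: {large_pop:.3f}")
```

Both spreads should come out close to 0.05, even though the first sample covers a hundred times larger fraction of its population.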
Census

A sample that consists of the entire population.

Difficult to complete: often impractical and too expensive
Populations are not static; they change while the census is taken
Can be more complex than taking a sample
Populations and parameters

Population parameter

A numerical value that is part of a
model for a population. We want to estimate
these parameters from sampled data.
Sampling


When selecting a sample, we want it to be
representative; that is, the statistics we
compute from the sample should reflect the
corresponding population parameters accurately.
Simple Random Sample (SRS)


A sample in which each combination of elements
(of the chosen sample size) has an equal chance
of being selected
Sampling Frame

A list of individuals from which the sample is
drawn
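
A minimal sketch of drawing an SRS from a sampling frame with Python's standard `random.sample`. The frame of 50 student IDs and the sample size of 5 are hypothetical values used only for illustration.

```python
import random

rng = random.Random(98)  # arbitrary seed

# Hypothetical sampling frame: a list of 50 student IDs.
sampling_frame = [f"student_{i:02d}" for i in range(1, 51)]

# random.sample draws without replacement, and every set of 5 names
# is equally likely, which is what an SRS of size 5 requires.
srs = rng.sample(sampling_frame, 5)
print(srs)
```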
Other Sampling Designs

Stratified random sampling


A sampling design in which the population
is divided into homogeneous subsets called
strata, and random samples are drawn
from each stratum.
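
Stratified sampling can be sketched as "one SRS per stratum". The strata (class years), their sizes, and the 10% sampling fraction below are assumptions made up for this example.

```python
import random

rng = random.Random(109)  # arbitrary seed

# Hypothetical population, already divided into strata by class year.
strata = {
    "freshman": [f"F{i}" for i in range(40)],
    "sophomore": [f"S{i}" for i in range(30)],
    "junior": [f"J{i}" for i in range(20)],
    "senior": [f"R{i}" for i in range(10)],
}

def stratified_sample(strata, fraction):
    """Draw a simple random sample of the given fraction from every stratum."""
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, size)
    return sample

sample = stratified_sample(strata, fraction=0.1)
for name, chosen in sample.items():
    print(name, chosen)
```

Every stratum is guaranteed to appear in the sample, in proportion to its size.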
Cluster Sampling

Random samples are drawn not directly
from the population, but from groups called
clusters. (Chosen for convenience, practicality, or cost)
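
A sketch of one common form of cluster sampling: whole clusters are chosen at random and everyone in a chosen cluster is included. The dorm floors and their sizes are hypothetical.

```python
import random

rng = random.Random(116)  # arbitrary seed

# Hypothetical clusters: 8 dorm floors with 6 residents each.
clusters = {
    f"floor_{f}": [f"floor_{f}_resident_{r}" for r in range(1, 7)]
    for f in range(1, 9)
}

# Choose 2 whole clusters at random, then include everyone in them.
chosen_floors = rng.sample(sorted(clusters), 2)
cluster_sample = [person for floor in chosen_floors for person in clusters[floor]]
print(chosen_floors)
print(len(cluster_sample))  # 2 floors x 6 residents = 12 people
```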

Systematic Sample

Sample drawn by selecting individuals
systematically from a sampling frame
(e.g., every 10th person)
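
A sketch of a systematic sample that takes every 10th person from a list. The frame of 100 people is hypothetical. Choosing the starting point at random is a common practice assumed here; the slide itself does not specify it.

```python
import random

rng = random.Random(124)  # arbitrary seed

# Hypothetical sampling frame of 100 people, in list order.
sampling_frame = [f"person_{i:03d}" for i in range(1, 101)]

step = 10                    # select every 10th person
start = rng.randrange(step)  # random starting point among the first 10
systematic_sample = sampling_frame[start::step]
print(systematic_sample)
```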
Multistage Sample

A sample drawn by combining different sampling methods
How to Sample Badly

Sample badly with volunteers

Voluntary response bias invalidates a survey
Sample badly because of convenience

Convenience sampling: simply include the
individuals who are at hand
Sample from a bad sampling frame
Undercoverage

Some portion of the population is not sampled at
all or has a smaller representation in the sample
than it has in the population.
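
The effect of undercoverage can be illustrated with made-up numbers. In this sketch a population of 1,000 is assumed, 300 of whom are missing from the sampling frame and hold different opinions from the rest. Samples drawn from the bad frame miss the true proportion no matter how many times we sample.

```python
import random

rng = random.Random(147)  # arbitrary seed

# Hypothetical population: 1 = supports a proposal, 0 = does not.
listed = [1] * 420 + [0] * 280    # 700 people who appear in the sampling frame
unlisted = [1] * 60 + [0] * 240   # 300 people missing from the frame
population = listed + unlisted

true_proportion = sum(population) / len(population)

def average_estimate(frame, sample_size=100, repetitions=1000):
    """Average sample proportion over many simple random samples from `frame`."""
    estimates = [sum(rng.sample(frame, sample_size)) / sample_size
                 for _ in range(repetitions)]
    return sum(estimates) / repetitions

print(f"true proportion:        {true_proportion:.2f}")
print(f"sampling everyone:      {average_estimate(population):.2f}")
print(f"sampling the bad frame: {average_estimate(listed):.2f}")
```

Samples from the full population should average close to the true 0.48, while samples from the incomplete frame should average close to 0.60.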


Nonresponse bias
Response bias

Influence on the responses arising from the design
of the survey, such as the wording of the questions.
Look for biases before the survey. There is no
way to recover from a biased sample or a
survey that asks biased questions.
Sampling Variability

The difference in results from sample to sample that
arises because the samples are drawn at random
Exercises

Page 325

#8

#14

#15
Chapter 13
Experiments

Investigative Study

Observational Studies

Researchers don’t assign choices
No manipulation of the factors
Retrospective study

Observational study in which the researcher
identifies the subjects and then collects data on
their previous conditions or behaviors
Prospective Study

Observational study in which the researcher
identifies or selects the subjects and then
follows their future outcomes
Experiment


Random assignment of subjects to treatments.
Explanatory variable: Factor
Response variable: Measurement
Experimental units

Subjects
Participants
Factor (manipulate)

A variable whose levels are controlled by the experimenter
Levels of the factor
Treatments

All the combinations of the factors with their respective levels
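
Listing treatments as "all combinations of factor levels" is a Cartesian product. A short sketch with two hypothetical factors (the factor names and levels are made up for illustration):

```python
from itertools import product

# Hypothetical experiment with two factors and their levels.
factors = {
    "fertilizer": ["none", "half dose", "full dose"],
    "watering": ["daily", "weekly"],
}

# Each treatment is one combination of a level from every factor.
treatments = [dict(zip(factors, levels)) for levels in product(*factors.values())]
for t in treatments:
    print(t)
print(len(treatments))  # 3 levels x 2 levels = 6 treatments
```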
The Four Principles of
Experimental Design

1 - Control


We need to control sources of variation
other than the factors being studied
by making the conditions as similar as
possible for all treatment groups.
2 - Randomize

Assign the subjects randomly to the
treatments to equalize the effects of
unknown variation

3 - Replicate


Apply the treatments to several subjects.
4 - Block

Group similar subjects into blocks by identifiable
attributes that can affect the outcome of the
experiment, then randomize within each block
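
The randomize, replicate, and block principles can be sketched together as random assignment carried out separately inside each block. The subjects, the age-group blocks, and the two treatment names below are hypothetical. With a single block this reduces to a completely randomized design.

```python
import random

rng = random.Random(13)  # arbitrary seed

def assign_within_blocks(blocks, treatments):
    """Randomly assign subjects to treatments separately inside each block.

    Subjects in a block are shuffled, then dealt to the treatments in turn,
    so every treatment is applied to several subjects in every block.
    """
    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)
        for position, subject in enumerate(shuffled):
            assignment[subject] = treatments[position % len(treatments)]
    return assignment

# Hypothetical subjects, blocked by an attribute that may affect the response.
blocks = {
    "under 40": [f"Y{i}" for i in range(1, 7)],
    "40 and over": [f"O{i}" for i in range(1, 7)],
}
assignment = assign_within_blocks(blocks, ["new treatment", "control"])
for subject, treatment in sorted(assignment.items()):
    print(subject, treatment)
```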
Designing an Experiment

Step-by-Step Page 335

Control Treatment


Baseline treatment level to provide basis for
comparison.
Blinding

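There are two main classes of individuals who can
affect the outcome of the experiment:

Those who could influence the results (subjects, treatment administrators)
Those who evaluate the results
Single blinding: everyone in one of these classes is blinded
Double blinding: everyone in both classes is blinded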

Placebos

A null ("fake") treatment that looks like the real one, used to
make sure that the effect of the treatment is not due to the
placebo effect.
Blocking

By blocking we isolate the variability due to the
differences between the blocks so that we can see
the differences due to the treatment more clearly.
Confounding

When the levels of one factor are associated with
the levels of another factor in such a way that their
effects cannot be separated, we say that these two
factors are confounded.
Exercises

Page 351

#10

#12