P-Values and Confidence Intervals

Extending Hypothesis Testing
p-values & confidence intervals
So far:
• how to state a question in the form
of two hypotheses (null and
alternative),
• how to assess the data,
• how to answer the question by
using a statistic and an associated
measure of the probability of
observing our statistic, given the
current state or null hypothesis.
Next:
We will use the p-value to:
– make inferences about the
population
– assign a level of confidence
Review the Steps
Phase 1: State the Question
1. Evaluate and describe the data
2. Review assumptions
3. State the question-in the form of hypotheses
Phase 2: Decide How to Answer the Question
4. Decide on a summary number-a statistic-that
reflects the question
5. How could random variation affect that statistic?
6. State a decision rule, using the statistic, to
answer the question
Detailed Steps (cont)
Phase 3: Answer the Question
7. Calculate the statistic
8. Make a statistical decision
9. State the substantive conclusion
Phase 4: Communicate the Answer to the
Question
10. Document our understanding with text,
tables, or figures
Clarify & Generalize the steps
Step 2: Assumptions:
– Representative: Is the observed data
representative of the population?
– Independence: Are the observations
(responses of interest) independent?
– Size: Is the size of the sample large enough
to make generalizations to the population at
large?
Size Assumption
So, how large is “large enough”?
Rule-of-thumb: N large enough to expect to
see five of each of the two outcomes
Both of the following must be true:
p0 n > 5
(1 – p0) n > 5
In the CPR study
p0 = 0.06 and n = 278, so:
p0 • n = 0.06 • 278 = 16.68 > 5
(1 – p0) • n = (1 – 0.06) • 278 = 261.32 > 5
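The rule-of-thumb can be sketched in Python. This is a minimal sketch, not part of the course materials: `size_ok` is a hypothetical helper name, and the default cut-off of 5 simply restates the rule above.

```python
def size_ok(p0: float, n: int, minimum: float = 5.0) -> bool:
    """Rule of thumb: under the null proportion p0, expect more than
    `minimum` of each of the two outcomes in a sample of size n."""
    expected_events = p0 * n            # e.g. expected survivors
    expected_non_events = (1 - p0) * n  # e.g. expected non-survivors
    return expected_events > minimum and expected_non_events > minimum


# CPR study: hypothesized proportion p0 = 0.06, sample size n = 278
print(size_ok(0.06, 278))  # True
```

Note that the function takes the hypothesized proportion p0, not the observed proportion.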
Common Mistakes
• Using the observed proportion rather than
the hypothesized proportion
• Comparing the observed number of events
of interest to five
• Why Not? We always operate under the
assumption that the null hypothesis is true
– so use the null proportion!
Step 2 is particularly important:
• If the data do not meet the assumptions,
then the statistical tests applied to test the
hypothesis will not be valid
• Only proceed to steps 3 – 10 if the
assumptions are met
Step 4
• For the CPR example, we used a specific
statistic, the proportion p
• The statistic and decision rules can be
more generally defined and applied to all
situations for testing a proportion.
Review CPR Simulation
H0: The population survival proportion is
0.06 or less. Choose this if the observed
proportion p ≤ 0.083 (x = 23 survivors or fewer).
HA: The population survival proportion is
larger than 0.06. Choose this if the observed
proportion p ≥ 0.086 (x = 24 or more survivors).
Recall for the CPR simulations, the results looked
similar to a normal distribution
Applied vs. Theoretical
• The smooth curve is the theoretical
distribution of a normal curve under the null
hypothesis
• Centered on the population value (p0 =
0.06) with proportions farther away from this
center being less likely to occur
• Use the theoretical distribution to determine
if our observed proportion is different from
our assumed proportion
General Test Statistic
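$$ z = \frac{\text{observed } p - \text{hypothesized } p}{\text{standard error of the hypothesized } p} \;\rightarrow\; z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}} $$
where \hat{p} is the observed proportion, p_0 is the assumed proportion, and the denominator is the standard error of p_0.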
Why p0?
• Calculate the test statistic under the
assumption that the null hypothesis is true.
• We are not concerned about how the
variability of the observed data will affect our
hypothesis testing result
• We believe the null hypothesis and the
variability in the observed data should be
assumed to be the same as the variability
under the null hypothesis.
Z-score
• Using the z-score allows us to use a
decision rule based on the standard
normal distribution, rather than the
proportion, p.
• The standard normal distribution is written N(0,1): mean 0 and standard deviation 1
• The cut-off for the decision rule does not
change for different values of p, n, and p0.
For α = 0.05, the critical z value is 1.645 (5% of the
N(0,1) values are greater than 1.645)
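The cut-off can be looked up in software instead of a table. A minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

alpha = 0.05
# Upper-tail cut-off: the z value with a fraction alpha of N(0,1) above it
z_critical = NormalDist().inv_cdf(1 - alpha)
print(round(z_critical, 3))  # 1.645
```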
General Decision Rule
H0: proportion p ≤ p0.
Choose this if z ≤ zcritical (equivalently, p-value ≥ α).
HA: proportion p > p0.
Choose this if z > zcritical (equivalently, p-value < α).
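The z-based and p-value-based forms of the rule lead to the same choice. A minimal sketch for the one-sided (upper-tail) case, assuming Python's standard library; `choose_hypothesis` is a hypothetical helper name.

```python
from statistics import NormalDist


def choose_hypothesis(z: float, alpha: float = 0.05) -> str:
    """One-sided rule for HA: p > p0. Returns "H0" or "HA"."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha)  # cut-off on the z scale
    p_value = 1 - std_normal.cdf(z)             # upper-tail probability
    by_z = z > z_critical
    by_p = p_value < alpha
    # Both comparisons express the same rule on two different scales
    print(f"z={z}: by z -> {by_z}, by p-value -> {by_p}")
    return "HA" if by_p else "H0"


choose_hypothesis(3.09)  # far in the upper tail
choose_hypothesis(1.00)  # well inside the null distribution
```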
Clarify Steps w/ CPR Example
1. Evaluate and describe the data
• We observed n = 278 CPR patients who
received instructions by phone, of whom x = 29
survived to hospital discharge.
• The characteristic of interest is survival
proportion, p = 29/278 = 0.104.
• The intent is to compare the outcomes in this
study to a p0 = 0.06 survival rate presumed to be
typical.
2. Review assumptions
There are three assumptions:
• Representativeness: From the design of the study, it
is clear that subjects are representative of cardiac-arrest
victims in cities with a quick-response emergency
system.
• Independence: The response of one cardiac-arrest
victim does not depend on the response of others. The
subjects are independent.
• Sufficient size: Since p0 • n = 0.06 • 278 = 16.68 > 5
and (1 – p0) • n = (1 – 0.06) • 278 = 261.32 > 5, this
assumption is valid.
3. State the question—in the form
of hypotheses
The intent is to show that phone-CPR is
superior to doing nothing.
Thus, the alternative hypothesis is that there
are higher than 6% survival rates:
H0: p ≤ 0.06
HA: p > 0.06.
4. Decide on a summary number—a statistic—
that reflects the question
We’ll use the z-score:
$$ z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}} $$
5. How could random variation
affect that statistic?
If the null hypothesis is true (p = p0), then z is
expected to be close to zero.
Since the assumptions are met, z is
approximately normally distributed, N(0,1).
Large values of z reflect higher survival
proportions and thus favor the alternative
hypothesis.
6. State a decision rule, using the
statistic, to answer the question
General – Choose to believe (at α = 0.05):
H0: Choose this if p-value ≥ α
HA: Choose this if p-value < α
For CPR Example, for an α = 0.05:
H0: p ≤ 0.06 Choose this if p-value ≥ 0.05
HA: p > 0.06. Choose this if p-value < 0.05
7. Calculate the statistic
$$ z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}} = \frac{0.104 - 0.06}{\sqrt{\dfrac{0.06 (1 - 0.06)}{278}}} = \frac{0.044}{0.0142} = 3.09 $$
Recall that a z-value to the right of 3 is unlikely.
In fact, the associated p-value is p = 0.0010 (we’ll talk
about calculating p-values later).
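The calculation can be reproduced in Python. A minimal sketch, assuming only the standard library; `one_sided_z_test` is a hypothetical helper name. It takes the rounded value 0.104 used on the slide; the unrounded 29/278 gives a slightly larger z (about 3.11).

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(p_hat: float, p0: float, n: int):
    """z statistic and upper-tail p-value for HA: p > p0."""
    se0 = sqrt(p0 * (1 - p0) / n)       # standard error under the null
    z = (p_hat - p0) / se0
    p_value = 1 - NormalDist().cdf(z)   # P(Z > z) for Z ~ N(0,1)
    return z, p_value


z, p_value = one_sided_z_test(p_hat=0.104, p0=0.06, n=278)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")  # z = 3.09, p-value = 0.0010
```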
8. Make a statistical decision
• Reject the null hypothesis since p-value <
0.05.
• The observed value of the summary
statistic is larger than what is expected by
chance alone.
9. State the substantive conclusion
We conclude that the survival proportion is
larger than 0.06.
10. Document our understanding with
text, tables, or figures
Does dispatcher-instructed bystander-administered CPR
improve the chances of survival?
Without this intervention it is presumed that the survival
probability will be unchanged (at 6%).
From this study, which used n = 278 patients, we observed
p = 0.1040 (x = 29 survived until hospital discharge).
The observed rate was compared to the hypothesized rate
using the z test statistic.
We reject the hypothesis p ≤ 0.06 in favor of the alternative
hypothesis that the survival probability is larger than 6%
(z = 3.09, p-value = 0.0010).
Universal Decision Rule
H0: null-hypothesis.
Choose this if p-value ≥ α (usually 0.05).
HA: alternative-hypothesis.
Choose this if p-value < α (usually 0.05).
How do we determine p-values?
• p-values can be determined from standard
normal tables, such as Table A.1 in the
Statistical Sleuth.
• Tedious and you need to be careful what
the table gives as the proportion – it could
be the opposite of what you are looking
for!
• Use a calculator
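The warning about tables applies to software too: a normal CDF returns the lower-tail proportion, while a test of HA: p > p0 needs the upper tail. A minimal sketch, assuming Python's `statistics.NormalDist`:

```python
from statistics import NormalDist

z = 3.09
lower_tail = NormalDist().cdf(z)  # P(Z <= z): what a CDF or many tables give
upper_tail = 1 - lower_tail       # P(Z > z): the p-value for HA: p > p0
print(f"{lower_tail:.4f} {upper_tail:.4f}")  # 0.9990 0.0010
```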
Calculation note:
• Software might return a p-value as “0” or
“0.000”. A p-value of exactly zero is not possible;
the value has only been rounded
• Determine the number of decimal places
the calculator reports (when it will return a
0 value)
• Then report “p < 0.001” or “p < 0.0001”
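One way to apply this reporting convention automatically is sketched below. `format_p_value` is a hypothetical helper, and the default of three decimal places is an assumption, not a course requirement.

```python
def format_p_value(p_value: float, decimals: int = 3) -> str:
    """Report a p-value without ever printing it as zero."""
    threshold = 10 ** -decimals  # smallest value the display can show
    if p_value < threshold:
        return f"p < {threshold:.{decimals}f}"
    return f"p = {p_value:.{decimals}f}"


print(format_p_value(0.00004))    # p < 0.001
print(format_p_value(0.0010, 4))  # p = 0.0010
```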
Confidence Intervals
• Often, researchers want to use a less rigid
approach to hypothesis testing by
estimating the parameter and placing
upper and lower bounds (or limits) on the
estimate.
• The interval is called a confidence interval.
• The confidence interval approach allows
us to make statements about a population
parameter without referring to hypotheses
• Also gives a range of values that reflects
our degree of certainty.
Definitions
• Inference: An inference is a conclusion that
patterns observed in the data are present in
the broader population.
• Statistical Inference: A statistical inference
is an inference justified by a probability
model (distribution) linking the data to the
broader population.
• Parameter: A parameter is an unknown
numerical value describing a feature of a
distribution.
More Definitions
• Statistic: A statistic is any value that can
be calculated from the observed data.
• Estimate: An estimate is a statistic used
as a guess (or estimate) of a parameter.
General Definition
estimate ± (reliability coefficient) × (standard error)
Estimating a parameter with an interval
involves three components:
• The point estimate.
• The standard error of the estimate. This
describes how much variability we expect.
• A reliability coefficient. This describes our
degree of certainty.
Estimate
• Calculate the observed proportion:
p̂ = x / n
• In the CPR case p̂ = 29 / 278 = 0.104.
Standard Error
• The standard error we use here is different
from that used in hypothesis testing.
Recall that earlier we were in the mind-set
of hypothesis testing.
• Here we are not doing hypothesis testing.
We’re just estimating a confidence
interval based upon the observed data, so the
standard error uses the observed proportion.
Standard Error of p-hat
$$ SE_{\hat{p}} = \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}} $$
Note that the standard error of the estimate gets
smaller as n gets larger.
We expect less variability in an estimate if we use
more data to make the estimate.
CPR Example
• For n = 278 and p̂ = 0.104, the associated
standard error is:
$$ SE_{\hat{p}} = \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}} = \sqrt{\frac{0.104 (1 - 0.104)}{278}} = 0.0183 $$
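The same standard error in Python, as a minimal sketch; `se_p_hat` is a hypothetical helper name. The second call illustrates that the standard error shrinks as n grows (quadrupling n halves it).

```python
from math import sqrt


def se_p_hat(p_hat: float, n: int) -> float:
    """Standard error of an observed proportion (used for confidence intervals)."""
    return sqrt(p_hat * (1 - p_hat) / n)


print(round(se_p_hat(0.104, 278), 4))      # 0.0183
print(round(se_p_hat(0.104, 4 * 278), 4))  # four times the data, half the SE
```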
Reliability coefficient
• The reliability coefficient reflects how sure
we want to be:
– 95% sure
– 90% sure
– 99% sure
• Based on the standard normal for those
proportions
Reliability Coefficients Commonly Used
• For 90% confidence, use z = 1.645.
• For 95% confidence, use z = 1.96.
• For 99% confidence, use z = 2.575.
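These coefficients are quantiles of the standard normal distribution and can be computed directly. A minimal sketch, assuming Python's `statistics.NormalDist`; `reliability_coefficient` is a hypothetical helper name. Software gives 2.576 for 99% confidence; the 2.575 above is a common table-based rounding of the same value.

```python
from statistics import NormalDist


def reliability_coefficient(confidence: float) -> float:
    """Two-sided z multiplier: leaves (1 - confidence) / 2 in each tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(level, round(reliability_coefficient(level), 3))
```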
Confidence Interval
$$ \hat{p} \pm \left[ z_{(1 - \alpha/2)} \times SE_{\hat{p}} \right] $$
$$ 0.104 \pm 1.96 \, (0.0183) $$
$$ (0.068, \; 0.140) $$
Using a sentence:
In the first case study there were 29
survivors (out of n = 278 studied) yielding
a 95% confidence interval on the
population survival proportion of [0.068,
0.140].
That is, “We’re 95% confident that the
survival proportion is between 0.068 and
0.140.”
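The interval can be computed in one step. A minimal sketch, assuming only the standard library; `proportion_interval` is a hypothetical helper name for the estimate ± z × SE formula (often called a Wald interval).

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat: float, n: int, confidence: float = 0.95):
    """Confidence interval: estimate ± reliability coefficient × standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # reliability coefficient
    se = sqrt(p_hat * (1 - p_hat) / n)                  # SE from the observed data
    return p_hat - z * se, p_hat + z * se


lower, upper = proportion_interval(0.104, 278)
print(f"[{lower:.3f}, {upper:.3f}]")  # [0.068, 0.140]
```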
Is there a 100% CI?
Yes, it is [0,1]
But this is a silly answer: it makes no
useful statement about the population
proportion.
This is the same for all proportions!
Using the 10 Steps
The Changes to the 10 Steps are minimal:
• 3. State the question (CI)
• 4. Decide on a summary statistic that reflects the
question (CI formula).
• 5. How could random variation affect that
statistic? (If the assumptions are met, then this
interval will cover the population proportion 95%
of the time )
• 6. Determine the reliability coefficient
and standard error to be used in the CI
• 7. Calculate the interval
• 8. Compare the interval to comparison
value (If there is a comparison value,
does the interval include it?)
• 9. State the substantive conclusion:
Something like: “We estimate the population
proportion of … to be [lower, upper] … with 95%
confidence …” perhaps “… which does not
include the hypothesized value of ….”
• 10. Document our understanding with text
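The claim in step 5, that the interval covers the population proportion about 95% of the time, can be checked by simulation. A rough sketch under stated assumptions: the true proportion 0.10, the 2000 repetitions, and the seed are all hypothetical choices for illustration, and `coverage` is not a function from the course.

```python
import random
from math import sqrt


def coverage(p_true: float, n: int, reps: int = 2000,
             z: float = 1.96, seed: int = 1) -> float:
    """Fraction of simulated 95% intervals that contain p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Simulate one study: n independent subjects, each an event w.p. p_true
        x = sum(rng.random() < p_true for _ in range(n))
        p_hat = x / n
        se = sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - z * se <= p_true <= p_hat + z * se:
            hits += 1
    return hits / reps


# Hypothetical population with p = 0.10 and the CPR sample size n = 278
print(coverage(0.10, 278))
```

The result should land near 0.95, though the normal approximation means it will not match exactly.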
Summary
• We have looked at several methods to assess
and describe data and underlying populations.
• We can use simulations, z-scores, p-values, or
confidence intervals about an estimate to make
conclusions about observed data and broader
populations.
• Next, we’ll look at sample size and precision of
estimates and the design of a study to estimate
population proportions.