day2 - probability

7/3/2013
Biostatistics in Dentistry
Rules of Probability
Probability
• A measure of “how likely” it is that an event
will occur
• It is a value defined to be between 0 and 1
– A higher value indicates more likely
– An impossible event has probability = 0
– A certain event has probability = 1
[Figure: probability scale running from 0 (impossible) to 1 (certain); values nearer 0 are less likely, values nearer 1 are more likely]
Probability as long-term frequency
• n independent trials of an experiment
• Probability is the relative frequency of the
event over many trials
(# of trials in which the event occurs) / n → probability of event,
as n gets very large (n → ∞)
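The long-term frequency idea can be illustrated with a small simulation. This is only a sketch: the function name `relative_frequency`, the event probability 0.2, and the fixed seed are assumptions chosen for illustration, not part of the slides.

```python
import random

def relative_frequency(p_event, n_trials, seed=0):
    """Simulate n_trials independent trials; return (# trials with event) / n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    occurrences = sum(1 for _ in range(n_trials) if rng.random() < p_event)
    return occurrences / n_trials

# The relative frequency should settle near the true probability as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(0.2, n))
```

With small n the relative frequency can be far from 0.2; with large n it should be close.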
Probability based on equally-likely
outcomes
• Divide the sample space (all possible
outcomes) into equally-likely outcomes
• Probability is the proportion of the equally-likely outcomes that indicate the event occurs
Probability = (number of outcomes in which the event occurs) / (total number of possible outcomes)
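A minimal sketch of the equally-likely-outcomes formula. The die example and the helper name `probability` are hypothetical illustrations, not taken from the slides.

```python
from fractions import Fraction

def probability(event_outcomes, sample_space):
    """Proportion of equally-likely outcomes in which the event occurs."""
    sample_space = set(sample_space)
    favorable = set(event_outcomes) & sample_space
    return Fraction(len(favorable), len(sample_space))

# Hypothetical illustration: one roll of a fair six-sided die.
die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
p_even = probability(even, die)
print(p_even)  # 1/2
```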
Example: NHANES I data
The “experiment” consists of choosing a random person from the pool of
people described in this table of participants from the NHANES I study.
Define the events:
CHD = person chosen has history of heart disease
Perio = person chosen is classified as having periodontitis
                              History of heart disease (CHD)
Periodontal classification    No       Yes     Total
Healthy                       3809     153     3962
Gingivitis                    2458     145     2603
Periodontitis                 1915     177     2092
Edentulous                    2280     391     2671
Total                         10462    866     11328
We can then calculate the probabilities of the events:
P(CHD) = 866/11328 = 0.076
P(Perio) = 2092/11328 = 0.185
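The two probabilities can be recomputed from the cell counts of the NHANES I table. This is a sketch only; the dictionary layout and variable names are assumptions made for illustration.

```python
# NHANES I counts per periodontal class: (no CHD history, CHD history)
counts = {
    "Healthy": (3809, 153),
    "Gingivitis": (2458, 145),
    "Periodontitis": (1915, 177),
    "Edentulous": (2280, 391),
}

total = sum(no + yes for no, yes in counts.values())      # everyone in the table
p_chd = sum(yes for _, yes in counts.values()) / total    # CHD column total / total
p_perio = sum(counts["Periodontitis"]) / total            # Periodontitis row total / total

print(total, round(p_chd, 3), round(p_perio, 3))  # 11328 0.076 0.185
```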
Combinations of events
Many events of interest are a combination of two or more events.
Suppose we are doing a survey asking dental-related information
A = event person surveyed flosses regularly
B = event person surveyed is a smoker
AND, intersection, ∩
An event denoted by “A and B” means an event that is thought to occur only if both A and B occur.
Example: “A and B” = event person surveyed is a smoker who flosses regularly
OR, union, ∪
An event denoted by “A or B” means an event that is thought to occur if either A or B or both occur.
Example: “A or B” = event person surveyed is a smoker, or is a flosser (or both).
Example: NHANES I data
Define the events:
CHD = person chosen has history of heart disease
Perio = person chosen is classified as having periodontitis
We can then calculate the probabilities of the events, using the counts in the NHANES I table:
P(CHD and Perio) = 177/11328 = 0.016
P(CHD or Perio) = (153 + 145 + 177 + 391 + 1915)/11328 = 0.245
Addition Rule
P(CHD or Perio) = (153 + 145 + 177 + 391 + 1915)/11328
= (153 + 145 + 177 + 391)/11328 + (1915 + 177)/11328 – 177/11328
= P(CHD) + P(Perio) – P(CHD and Perio)
The 177 people with both CHD and periodontitis are counted in both P(CHD) and P(Perio), so they are subtracted once.
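As a check, the direct count and the addition rule can be compared using exact fractions. This is a sketch; the variable names are assumptions, and the counts are those of the NHANES I table.

```python
from fractions import Fraction

total = 11328
n_chd = 153 + 145 + 177 + 391   # everyone with CHD history (866)
n_perio = 1915 + 177            # everyone with periodontitis (2092)
n_both = 177                    # CHD history and periodontitis

# Direct count of people with CHD, periodontitis, or both
p_either_direct = Fraction(153 + 145 + 177 + 391 + 1915, total)

# Addition rule: P(CHD) + P(Perio) - P(CHD and Perio)
p_either_rule = Fraction(n_chd, total) + Fraction(n_perio, total) - Fraction(n_both, total)

print(p_either_direct == p_either_rule, round(float(p_either_rule), 3))  # True 0.245
```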
Addition Rule
P(A or B) = P(A) + P(B) – P(A and B)
[Venn diagram: the region “A or B” equals circle A plus circle B minus the overlap “A and B”]
Special case:
If P(A and B) = 0, then
P(A or B) = P(A) + P(B)
[Venn diagram: circles A and B do not overlap]
Example:
Application of a certain anesthetic carries the risk
of two main complications: headache and
euphoria. Suppose that
P(H) = .20, P(E) = .08, and P(H and E) = .05
The probability of at least one complication is:
P(H or E) = P(H) + P(E) - P(H and E)
= .20 + .08 - .05
= .23
Opposite event
Denote the opposite of event A by ~A.
P(~A) = probability that event A does not happen
= 1 – P(A)
Examples:
• Probability of not getting a headache:
P(~H) = 1 – P(H) = 1 – .20 = .80
• Probability of no complications at all:
P(~(H or E)) = 1 – P(H or E) = 1 – .23 = .77
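The addition rule and the opposite-event rule from the anesthesia example can be written as two small helpers. This is a sketch; the function names `p_or` and `p_not` are assumptions for illustration.

```python
def p_or(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

def p_not(p_a):
    """Opposite event: P(~A) = 1 - P(A)."""
    return 1 - p_a

# Values given in the anesthesia example
p_h, p_e, p_h_and_e = 0.20, 0.08, 0.05

p_h_or_e = p_or(p_h, p_e, p_h_and_e)
print(round(p_h_or_e, 2))         # 0.23  at least one complication
print(round(p_not(p_h), 2))       # 0.8   no headache
print(round(p_not(p_h_or_e), 2))  # 0.77  no complications at all
```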
Conditional Probability
Change the “experiment” by limiting the population
What is the probability that the next person with periodontitis sampled has CHD?
                              History of heart disease (CHD)
Periodontal classification    No       Yes     Total
Periodontitis                 1915     177     2092
P(CHD|Perio) = 177/2092 ≈ 0.085
Note: the denominator is limited to those with periodontitis
Conditional Probability
Formula derivation
P(CHD|Perio) = 177/2092
= (177/11328) / (2092/11328)
= P(CHD and Perio) / P(Perio)
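The derivation can be checked numerically: dividing within the periodontitis row gives the same answer as dividing the two whole-population probabilities. This is a sketch with assumed variable names, using exact fractions to avoid rounding.

```python
from fractions import Fraction

total = 11328
n_perio = 2092           # people with periodontitis
n_chd_and_perio = 177    # people with periodontitis and CHD history

# Restrict the population to the periodontitis row
p_direct = Fraction(n_chd_and_perio, n_perio)

# Definition: P(CHD and Perio) / P(Perio)
p_from_definition = Fraction(n_chd_and_perio, total) / Fraction(n_perio, total)

print(p_direct == p_from_definition, round(float(p_direct), 3))  # True 0.085
```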
Conditional probability definition
P(A|B) = P(A and B) / P(B)
Example: Anesthesia complications
What is the probability that a patient who is suffering the headache complication will experience the euphoria complication?
P(E|H) = P(E and H) / P(H) = .05 / .20 = .25
Notes on Conditional Probability
P(A|B) ≠ P(A and B)
P(CHD | Edent)
Probability that a randomly chosen person from those that are edentulous will have CHD
391/2671 = 0.146
P(CHD and Edent)
Probability that a randomly chosen person from the entire population will be both edentulous and have CHD
391/11328 = 0.035
Notes on Conditional Probability
P(A|B) ≠ P(B|A)
P(Edent | CHD)
Probability that a randomly chosen person from those that have CHD will be edentulous
391/866 = 0.452
P(CHD | Edent)
Probability that a randomly chosen person from those that are edentulous will have CHD
391/2671 = 0.146
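The three quantities that are easy to confuse share the same numerator (391 people who are edentulous and have CHD) and differ only in the denominator. A sketch with assumed variable names, using the NHANES I counts:

```python
n_total = 11328   # everyone in the table
n_edent = 2671    # edentulous
n_chd = 866       # history of CHD
n_both = 391      # edentulous and history of CHD

p_chd_given_edent = n_both / n_edent   # denominator: edentulous people only
p_edent_given_chd = n_both / n_chd     # denominator: people with CHD only
p_chd_and_edent = n_both / n_total     # denominator: the entire population

print(round(p_chd_given_edent, 3))  # 0.146
print(round(p_edent_given_chd, 3))  # 0.452
print(round(p_chd_and_edent, 3))    # 0.035
```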
Multiplying both sides of the conditional probability definition by P(B) gives
Multiplication Rule:
For any two events A and B
P(A and B) = P(A|B) x P(B)
Example: S. mutans and Caries
A 1989 study of schoolchildren in Siena, Italy found
17% of the children to have plaque colonized by S. mutans.
Among those colonized, 95% had active
caries. In those not colonized, 72% had active caries.
What is the probability of having active caries in this
cohort?
• P(SM) = .17
• P(AC | SM) = .95
• P(AC | ~SM) = .72
P(AC) = ?
R. Gasparini, et al., Eur J Epid, Vol. 5, No. 2 (Jun., 1989), pp. 189-192
Example: Multiplication Rule:
Given:
P(SM) = .17, P(AC | SM) = .95, P(AC | ~SM) = .72
Need to find: P(AC)
Note that: P(AC) = P(AC and SM) + P(AC and ~SM)
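[Venn diagram: event AC overlapping event SM, so AC splits into the part inside SM and the part outside SM]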
Example: Multiplication Rule (continued)
Can use the Multiplication Rule to find P(AC and SM) and
P(AC and ~SM):
P(AC and SM) = P(AC|SM) x P(SM) = .95 x .17 = .16
P(AC and ~SM) = P(AC|~SM) x P(~SM) = .72 x (1 - .17) = .60
P(AC) = P(AC and SM) + P(AC and ~SM)
= .16 + .60
= .76
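The same calculation as a short function: apply the multiplication rule to each piece, then add the pieces. This is a sketch; the name `total_probability` and its parameter names are assumptions for illustration.

```python
def total_probability(p_b, p_a_given_b, p_a_given_not_b):
    """P(A) = P(A|B) x P(B) + P(A|~B) x P(~B)."""
    p_a_and_b = p_a_given_b * p_b                # multiplication rule
    p_a_and_not_b = p_a_given_not_b * (1 - p_b)  # multiplication rule with ~B
    return p_a_and_b + p_a_and_not_b

# S. mutans example: P(SM) = .17, P(AC|SM) = .95, P(AC|~SM) = .72
p_ac = total_probability(0.17, 0.95, 0.72)
print(round(p_ac, 2))  # 0.76
```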
Conditional probability and independence
Note that in the complications example, P(E) =
.08 while P(E|H) = .25, so it appears that the
likelihood of E depends on whether H is true.
Two events A, B are independent if and only if:
P(A|B) = P(A)
The multiplication rule implies
Two events A, B are independent if and only if:
P(A and B) = P(A) x P(B)
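The product criterion can be checked directly when the probabilities are stated exactly, as in the slides. This is a sketch: the helper name `is_independent` and the tolerance are assumptions, and with probabilities estimated from data a formal statistical test would be needed instead.

```python
import math

def is_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """True if P(A and B) equals P(A) x P(B), up to floating-point tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, rel_tol=1e-9, abs_tol=tol)

# Complications example: P(H) = .20, P(E) = .08, P(H and E) = .05
# P(H) x P(E) = .016, which differs from .05, so H and E are not independent.
print(is_independent(0.20, 0.08, 0.05))  # False
```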
Example: Anesthesia complications
What is the probability that both of two
successive patients get headaches?
H1 and H2 denote the events that the first and
second patients get headaches. Assume that
two different patients are independent.
P(H1 and H2) = P(H1) × P(H2) = .20 × .20 = .04
Example: Anesthesia complications
What is the probability that at least one of the
two successive patients gets a headache?
P(H1 or H2) = P(H1) + P(H2) - P(H1 and H2)
= .20 + .20 - .04 = .36
Alternatively, can think of “at least one” as the
opposite of “both do not get headaches”.
1-P(~H1 and ~H2) = 1 - P(~H1) x P(~H2)
= 1 - .80 x .80 = .36
Application: diagnostic testing
• Diagnostic tests are often imperfect indicators of
disease.
• A common method of evaluating accuracy of
diagnostic tests is by estimating two important
measures:
• Sensitivity = P(test positive | disease)
• Specificity = P(test negative | no disease)
• It is important to estimate both of these measures,
as it is quite possible that one can be good at the
expense of the other.
Example: caries diagnosis
A study* assessed 50 sites from teeth exfoliated or removed for orthodontic
reasons, via digital radiography and via a histological examination (gold
standard). The outcome of interest was carious lesions involving the dentine.
                 histology
radiography      lesion    no lesion    total
lesion           13        3            16
no lesion        6         28           34
total            19        31           50
Sensitivity = 13/19 = 68%
Specificity = 28/31 = 90%
Interpretation: The radiography is good at not producing false
positives, and fair at identifying the true lesions.
*Dias da Silva, PR, et al. (2010) Dentomaxillofacial Radiology, 39, 362-367.
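Sensitivity and specificity follow directly from the four cells of the radiography versus histology table. This is a sketch; the cell names (`tp`, `fp`, `fn`, `tn`) are standard shorthand introduced here, not notation from the slides.

```python
# Histology is the gold standard; radiography is the test being evaluated.
tp = 13  # radiography lesion,    histology lesion    (true positive)
fp = 3   # radiography lesion,    histology no lesion (false positive)
fn = 6   # radiography no lesion, histology lesion    (false negative)
tn = 28  # radiography no lesion, histology no lesion (true negative)

sensitivity = tp / (tp + fn)  # P(test positive | disease)    = 13/19
specificity = tn / (tn + fp)  # P(test negative | no disease) = 28/31

print(round(sensitivity, 2), round(specificity, 2))  # 0.68 0.9
```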
Application: Relative Risk
• Relative Risk (RR) is a measure used to describe the
association of a disease with an exposure
• It is the ratio of the risk of disease in those who are
exposed, compared to the risk of disease in those
who are not exposed
• RR = P(Disease | Exposed) / P(Disease | not Exposed)
Example: NHANES II data (longitudinal follow up)
• Disease is CHD within 10 years of exam
• Exposure is periodontal status at exam
                                         10 year CHD incidence
Periodontal classification               no       yes      Total    Relative Risk (compared to healthy participants)
Healthy                      Count       3622     187      3809
                             %           95.1     4.9      100      4.9/4.9 = 1.00
Gingivitis                   Count       2308     150      2458
                             %           93.9     6.1      100      6.1/4.9 = 1.24
Periodontitis                Count       1657     258      1915
                             %           86.5     13.5     100      13.5/4.9 = 2.74
Edentulous                   Count       1823     457      2280
                             %           80.0     20.0     100      20.0/4.9 = 4.08
Note: the relative risks are computed from the unrounded risks, e.g. (258/1915) / (187/3809) = 2.74, so they can differ slightly from the ratio of the rounded percentages.
Interpretation: Someone with periodontitis is 2.74 times as likely to
suffer CHD within 10 years as someone with healthy gums.
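The relative risks in the table can be recomputed from the counts. This is a sketch; the dictionary layout and the helper name `risk` are assumptions, and the healthy group is used as the unexposed reference, as in the table.

```python
# (10-year CHD cases, group total) per periodontal classification
groups = {
    "Healthy": (187, 3809),
    "Gingivitis": (150, 2458),
    "Periodontitis": (258, 1915),
    "Edentulous": (457, 2280),
}

def risk(cases, n):
    """Risk of disease in a group: P(Disease | group)."""
    return cases / n

baseline = risk(*groups["Healthy"])  # risk in the reference (unexposed) group

# RR = P(Disease | Exposed) / P(Disease | not Exposed)
relative_risk = {name: risk(*cell) / baseline for name, cell in groups.items()}

for name, rr in relative_risk.items():
    print(name, round(rr, 2))
```

Rounded to two decimals, this reproduces the table's 1.00, 1.24, 2.74 and 4.08.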