
Nanjing University of Science & Technology
Pattern Recognition:
Statistical and Neural
Lonnie C. Ludeman
Lecture 7
Sept 23, 2005
Review 1: Classifier Framework
[Figure: block diagram of the general classifier framework, which may be optimum]
Review 2: Classifier Performance Measures
1. A posteriori probability (maximize)
2. Probability of error (minimize)
3. Bayes average cost (minimize)
4. Probability of detection (maximize with fixed probability of false alarm)
   (Neyman-Pearson rule)
5. Losses (minimize the maximum)
Review 3: MAP and MPE Classification Rule

Form 1: decide C1 if

    p( x | C1 ) P(C1) > p( x | C2 ) P(C2)

and decide C2 otherwise.

Form 2: decide C1 if

    p( x | C1 ) / p( x | C2 ) > P(C2) / P(C1)

and decide C2 otherwise. The left-hand side is the likelihood ratio;
the right-hand side is the threshold.
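As a minimal sketch of Form 2 (the densities, priors, and test points below are assumed for illustration, not taken from the lecture):

```python
import math

def map_decide(x, p1, p2, P1, P2):
    """Form 2: decide C1 when the likelihood ratio p(x|C1)/p(x|C2)
    exceeds the threshold P(C2)/P(C1); otherwise decide C2."""
    return "C1" if p1(x) / p2(x) > P2 / P1 else "C2"

# Hypothetical class-conditional densities for x > 0.
p1 = lambda x: math.exp(-x)              # p(x | C1)
p2 = lambda x: 2.0 * math.exp(-2.0 * x)  # p(x | C2)

# With equal priors the threshold is 1, so the boundary is x = ln 2.
print(map_decide(0.1, p1, p2, 0.5, 0.5))  # below ln 2: decides C2
print(map_decide(3.0, p1, p2, 0.5, 0.5))  # above ln 2: decides C1
```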
Topics for Lecture 7
1. Bayes Decision Rule – Introduction (2-class case)
2. Bayes Decision Rule – Derivation (2-class case)
3. General Calculation of Probability of Error
4. Calculation of Bayes Risk
Motivation
[Figure: a basket of eggs, some good and some bad; you must select one without knowing whether it is good or bad]
Possible Decision Outcomes:
1. Decide a good egg is good: no problem (cost = 0)
2. Decide a good egg is bad: throw it away (cost = 1)
3. Decide a bad egg is bad: throw it away (cost = 0.1)
4. Decide a bad egg is good: catastrophe! (cost = 100)
1. Bayes Classifier – Statistical Assumptions (Two-Class Case)

Known:
  Class-conditional probability density functions of the observed
  pattern vector x:
      C1: x ~ p(x | C1)
      C2: x ~ p(x | C2)
  A priori probabilities:
      P(C1), P(C2)
Bayes Classifier – Cost Definitions

Define the costs associated with the decisions: C11, C12, C21, C22,
where Cij is the cost of deciding class Ci when the true class is Cj.
Bayes Classifier - Risk Definition
Risk is defined as the average cost
associated with making a decision.
R = Risk = P(decide C1 | C1) P(C1) C11
+ P(decide C1 | C2) P(C2) C12
+ P(decide C2 | C1) P(C1) C21
+ P(decide C2 | C2) P(C2) C22
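This definition translates directly into code; a small sketch (the decision probabilities below are hypothetical, not from the lecture):

```python
def bayes_risk(P_decide, prior, cost):
    """Risk = sum over decided class i and true class j of
    P(decide Ci | Cj) * P(Cj) * Cij."""
    return sum(P_decide[(i, j)] * prior[j] * cost[(i, j)]
               for i in (1, 2) for j in (1, 2))

# Hypothetical classifier: right 90% of the time on C1, 80% on C2,
# equal priors, 0/1 costs (so the risk equals P(error)).
P_decide = {(1, 1): 0.9, (2, 1): 0.1, (1, 2): 0.2, (2, 2): 0.8}
prior = {1: 0.5, 2: 0.5}
cost = {(1, 1): 0, (1, 2): 1, (2, 1): 1, (2, 2): 0}
print(bayes_risk(P_decide, prior, cost))  # ≈ 0.15
```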
Bayes Classifier – Optimum Decision Rule

The Bayes decision rule selects regions R1 and R2, for deciding C1 and C2
respectively, to minimize the Risk, which is the average cost associated
with making a decision.

One can prove (details in the book) that the Bayes decision rule is a
Likelihood Ratio Test (LRT): decide C1 if

    p( x | C1 ) / p( x | C2 ) > (C22 - C12) P(C2) / [ (C11 - C21) P(C1) ] = NBAYES

and decide C2 otherwise. (This form assumes the usual case C12 > C22 and
C21 > C11, i.e., each error costs more than the corresponding correct
decision, so that NBAYES > 0.)
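A one-line sketch of the threshold computation (the function name is ours; the numeric check uses the cost and prior values from the example later in this lecture):

```python
def bayes_threshold(P1, P2, C11, C12, C21, C22):
    """NBAYES = (C22 - C12) P(C2) / ((C11 - C21) P(C1)).
    Meaningful as an LRT threshold when C12 > C22 and C21 > C11."""
    return (C22 - C12) * P2 / ((C11 - C21) * P1)

# Values from the lecture's example: P(C1)=1/3, P(C2)=2/3, C12=3, C21=2.
print(bayes_threshold(1/3, 2/3, 0, 3, 2, 0))  # ≈ 3
```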
Bayes Classifier - Calculation of Risk
Bayes Classifier – Special Case

    C11 = C22 = 0 (cost of 0 for correct classification)
    C12 = C21 = 1 (cost of 1 for incorrect classification)

Then the Bayes decision rule is equivalent to the
Minimum Probability of Error decision rule.
With these costs, the Bayes rule

    decide C1 if p( x | C1) / p( x | C2) > (C22 - C12) P(C2) / [ (C11 - C21) P(C1) ] = NBAYES

reduces to

    decide C1 if p( x | C1) / p( x | C2) > (1 - 0) P(C2) / [ (1 - 0) P(C1) ] = P(C2) / P(C1) = NMPE

(deciding C2 otherwise in both cases), which is the MPE rule of Review 3.
Bayes Decision Rule – Example

Given the following statistical information:

    p(x | C1) = exp(-x) u(x),      P(C1) = 1/3
    p(x | C2) = 2 exp(-2x) u(x),   P(C2) = 2/3

Given the following cost assignment:

    C11 = 0, C22 = 0, C12 = 3, C21 = 2

(a) Determine the Bayes decision rule (minimum risk).
(b) Simplify the test to the observation space.
(c) Calculate the Bayes risk for the Bayes decision rule.
Bayes Example – Solution is an LRT

Decide C1 if p( x | C1) / p( x | C2) > NBAYES, and C2 otherwise, where

    NBAYES = (C22 - C12) P(C2) / [ (C11 - C21) P(C1) ]
           = (0 - 3)(2/3) / [ (0 - 2)(1/3) ] = 3

and the likelihood ratio is

    p( x | C1) / p( x | C2) = exp(-x) u(x) / [ 2 exp(-2x) u(x) ]
                            = (1/2) exp(x)  for x > 0
Bayes Example – Solution in Different Spaces

(a) For x > 0, the Bayes decision rule in the likelihood ratio space is:

    decide C1 if (1/2) exp(x) > 3, and C2 otherwise.

(b) For x > 0, the equivalent decision rule in the observation space is
seen to be:

    decide C1 if x > ln(6), and C2 otherwise.
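A quick sketch confirming that the two forms of the rule agree (the function names are ours; the constants come from the example):

```python
import math

LR_THRESHOLD = 3.0          # NBAYES from the example
X_THRESHOLD = math.log(6)   # equivalent boundary in the observation space

def decide_lr_space(x):
    """Compare the likelihood ratio (1/2) exp(x) against NBAYES."""
    return "C1" if 0.5 * math.exp(x) > LR_THRESHOLD else "C2"

def decide_obs_space(x):
    """Equivalent test directly on the observation x."""
    return "C1" if x > X_THRESHOLD else "C2"

# The two spaces give the same decision for every x > 0.
for x in (0.5, 1.0, 1.8, 3.0):
    assert decide_lr_space(x) == decide_obs_space(x)
print(X_THRESHOLD)  # ln 6 ≈ 1.79
```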
Bayes Example – Calculation of Bayes Risk

(c) First compute the conditional probabilities of error:

    P(error | C1) = P(decide C2 | C1) = ∫ (over R2) p( x | C1 ) dx
                  = ∫ (from 0 to ln(6)) exp(-x) dx
                  = 1 - exp(-ln(6)) = 1 - 1/6 = 5/6
Bayes Example – Calculation of Bayes Risk (cont.)

    P(error | C2) = P(decide C1 | C2) = ∫ (over R1) p( x | C2 ) dx
                  = ∫ (from ln(6) to ∞) 2 exp(-2x) dx
                  = exp(-2 ln(6)) = 1/36
Bayes Example – Calculation of Bayes Risk (cont.)

    Risk = 0 + P(decide C2 | C1) P(C1) C21 + P(decide C1 | C2) P(C2) C12 + 0
         = (5/6)(1/3)(2) + (1/36)(2/3)(3)
         = 5/9 + 1/18

    Risk = 11/18 units/decision
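The arithmetic can be checked numerically; a short sketch using the closed-form error probabilities derived above:

```python
import math

# Conditional error probabilities from the example (boundary x = ln 6).
P_err_C1 = 1 - math.exp(-math.log(6))  # ∫ from 0 to ln(6) of exp(-x) dx = 5/6
P_err_C2 = math.exp(-2 * math.log(6))  # ∫ from ln(6) to ∞ of 2 exp(-2x) dx = 1/36

# Risk = P(decide C2|C1) P(C1) C21 + P(decide C1|C2) P(C2) C12
risk = P_err_C1 * (1/3) * 2 + P_err_C2 * (2/3) * 3
print(risk)  # ≈ 11/18 ≈ 0.611
```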
3. General Calculation of Probability of Error

[Figure: the same decision rule viewed in three spaces]

  Pattern space:           observe x;          region R1 → decide C1, region R2 → decide C2
  Feature space:           compute y = g(x);   region F1 → decide C1, region F2 → decide C2
  Likelihood ratio space:  compute L(x) = p(x | C1) / p(x | C2) and compare
                           with the threshold N:
                           L(x) > N → decide C1, L(x) < N → decide C2
Probability of Error – Observation Space

    P(error) = P(error | C1) P(C1) + P(error | C2) P(C2)

    P(error | C1) = ∫ (over R2) p( x | C1 ) dx
    P(error | C2) = ∫ (over R1) p( x | C2 ) dx
Probability of Error – Feature Space

    P(error) = P(error | C1) P(C1) + P(error | C2) P(C2)

    P(error | C1) = ∫ (over F2) p( y | C1 ) dy
    P(error | C2) = ∫ (over F1) p( y | C2 ) dy
Probability of Error – Likelihood Ratio Space

    P(error) = P(error | C1) P(C1) + P(error | C2) P(C2)

    P(error | C1) = ∫ (from -∞ to N) p( l | C1 ) dl
    P(error | C2) = ∫ (from N to ∞) p( l | C2 ) dl
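As a sketch, these integrals can be evaluated numerically for the worked example earlier in the lecture (the midpoint-rule integrator and the truncation of the infinite tail at x = 30 are our choices, not part of the lecture):

```python
import math

def p1(x):  # p(x | C1) from the example, for x > 0
    return math.exp(-x)

def p2(x):  # p(x | C2) from the example, for x > 0
    return 2.0 * math.exp(-2.0 * x)

def integrate(f, a, b, n=100_000):
    """Midpoint-rule numerical integration of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

N = math.log(6)  # decision boundary in the observation space
P_err_C1 = integrate(p1, 0.0, N)   # R2 = (0, ln 6); exact value 5/6
P_err_C2 = integrate(p2, N, 30.0)  # R1 = (ln 6, ∞), tail truncated at 30
P_error = P_err_C1 * (1/3) + P_err_C2 * (2/3)
print(P_err_C1, P_err_C2, P_error)  # ≈ 0.8333, 0.0278, 0.2963
```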
End of Lecture 7