Slides9 - Zhangxi Lin - Texas Tech University

Lecture Notes 9
Prediction Limits
Zhangxi Lin
ISQS 7342-001
Texas Tech University
Note: Most slides in this file are sourced from SAS® Course Notes
Section 3.1
Profit Variability
Random Profit Consequences
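Primary Decision
$\mathrm{Profit}_i = Y_i - \mathrm{costs}_i$, where $Y_i$ is random and $\mathrm{costs}_i$ is deterministic
[Figure: distribution of profit; horizontal axis y with 0 marked]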
Conditional Profits
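Primary Decision
$\mathrm{Profit}_i = Y_i - \mathrm{costs}_i$, where $Y_i$ is random and $\mathrm{costs}_i$ is deterministic
[Figure: conditional profit distribution; horizontal axis y with 0 marked]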
Expected Profit Consequence
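Primary Decision
$\mathrm{Profit}_i = Y_i - \mathrm{costs}_i$, where $Y_i$ is random and $\mathrm{costs}_i$ is deterministic
$\mathrm{EPC}_i = E(Y_i) - \mathrm{costs}_i = \hat{p}(x_i) \cdot \hat{D}(x_i) - \mathrm{costs}_i$
[Figure: primary outcome and secondary outcome panels, with density $d(y \mid x_i)$; horizontal axes y with 0 marked]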
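As a minimal sketch of the expected profit consequence formula above: the function below evaluates EPC_i from an estimated response probability, an estimated conditional amount, and a cost. The function name, the three example cases, and the cost of 0.68 per solicitation are illustrative assumptions, not values taken from the course notes.

```python
def expected_profit_consequence(p_hat, d_hat, cost):
    """EPC_i = p_hat(x_i) * D_hat(x_i) - costs_i."""
    return p_hat * d_hat - cost

# Hypothetical cases: (estimated response probability, estimated amount given response).
cases = [(0.05, 20.0), (0.10, 5.0), (0.02, 15.0)]
cost = 0.68  # assumed cost per solicitation, for illustration only

epc = [expected_profit_consequence(p, d, cost) for p, d in cases]
for (p, d), value in zip(cases, epc):
    print(f"p_hat={p:.2f}  D_hat={d:5.2f}  EPC={value:+.2f}")
```

Only the first case has a positive EPC here, so under these assumed numbers it is the only one expected to return more than it costs.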
Predicted Profit Plots
[Figure: overall average profit (left axis, 0.00 to 0.20) and scaled total profit (right axis, $4,000 to $16,000) versus % selected (10 to 90); N = 96,367. Curve: Σ EPC_i]
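A curve of this kind can be sketched as a cumulative sum of EPC_i over the top-ranked cases. The sketch below assumes that "% selected" means the top share of cases ranked by EPC_i (highest first), and that overall average profit is the total divided by all N cases, selected or not. The function name and the ten EPC values are hypothetical.

```python
def cumulative_profit_curve(epc, percents=(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)):
    """Return (percent selected, total profit, overall average profit) tuples."""
    ranked = sorted(epc, reverse=True)   # assumed selection order: highest EPC first
    n = len(ranked)
    curve = []
    for pct in percents:
        k = (n * pct) // 100             # number of cases selected (rounded down)
        total = sum(ranked[:k])
        curve.append((pct, total, total / n))  # average is taken over all n cases
    return curve

# Hypothetical EPC values for ten cases.
epc = [1.0, 0.5, 0.2, -0.1, -0.3, 0.8, -0.2, 0.1, 0.0, -0.5]
curve = cumulative_profit_curve(epc)
for pct, total, average in curve:
    print(f"{pct:3d}% selected  total={total:5.2f}  overall avg={average:5.3f}")
```

With these numbers the total rises while the selected cases have positive EPC and falls once negative-EPC cases are included, which is the shape the plot describes.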
Predicted and Observed Profit Plots
[Figure: overall average profit (left axis, 0.00 to 0.20) and scaled total profit (right axis, $4,000 to $16,000) versus % selected (10 to 90); N = 96,367. Curves: Σ EPC_i and Σ OP_i (training)]
Predicted and Observed Profit Plots
[Figure: overall average profit (left axis, 0.00 to 0.20) and scaled total profit (right axis, $4,000 to $16,000) versus % selected (10 to 90); N = 96,367. Curves: Σ EPC_i, Σ OP_i (training), and Σ OP_i (validation)]
Predicted and Observed Profit Plots
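• Total profit is a sum of independent random variables that are not identically distributed
• Lyapunov conditions: a central limit theorem still holds for such sums
• Independence: $\mathrm{Var}\left(\sum_i \mathrm{Profit}_i\right) = \sum_i \mathrm{Var}(\mathrm{Profit}_i)$
[Figure: overall average profit (left axis, 0.00 to 0.20) and scaled total profit (right axis, $4,000 to $16,000) versus % selected (10 to 90); N = 96,367. Curves: Σ EPC_i, Σ OP_i (training), and Σ OP_i (validation)]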
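The additivity of variances for independent, non-identically distributed terms can be checked exactly on a small case. The sketch below makes a simplifying assumption that is not in the slides: each Y_i equals a fixed amount d_i with probability p_i and 0 otherwise. It enumerates every outcome to get the exact variance of the sum and compares it with the sum of the individual variances. All names and numbers are hypothetical.

```python
from itertools import product

def exact_sum_variance(ps, ds):
    """Exact Var(sum of Y_i) by enumerating all 2^n response patterns."""
    mean = 0.0
    second_moment = 0.0
    for outcome in product((0, 1), repeat=len(ps)):
        prob = 1.0
        total = 0.0
        for responded, p, d in zip(outcome, ps, ds):
            prob *= p if responded else 1.0 - p   # independence: probabilities multiply
            total += d * responded
        mean += prob * total
        second_moment += prob * total ** 2
    return second_moment - mean ** 2

def sum_of_variances(ps, ds):
    """Sum of Var(Y_i) = p_i * (1 - p_i) * d_i^2 for the fixed-amount case."""
    return sum(p * (1.0 - p) * d * d for p, d in zip(ps, ds))

ps = [0.1, 0.3, 0.5]    # hypothetical response probabilities (not identical)
ds = [10.0, 5.0, 2.0]   # hypothetical fixed amounts
print(exact_sum_variance(ps, ds), sum_of_variances(ps, ds))
```

The two numbers agree up to floating-point rounding, which is the property the tolerance limits rely on.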
Beyond Expectations: Variability in Profit
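$\mathrm{Profit}_i = Y_i - \mathrm{costs}_i$
$\mathrm{EPC}_i = E(Y_i) - \mathrm{costs}_i = \hat{p}(x_i) \cdot \hat{D}(x_i) - \mathrm{costs}_i$
$\mathrm{Var}(\mathrm{Profit}_i) = \mathrm{Var}(Y_i) = E(Y_i^2) - (E Y_i)^2$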
Beyond Expectations: Variability in Profit
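$\mathrm{Profit}_i = Y_i - \mathrm{costs}_i$
$E(\mathrm{Profit}_i) = E(Y_i) - \mathrm{costs}_i = \hat{p}(x_i) \cdot \hat{D}(x_i) - \mathrm{costs}_i$
$\mathrm{Var}(\mathrm{Profit}_i) = \mathrm{Var}(Y_i) = \hat{p}_i \cdot [\hat{E}(D_i^2) - \hat{D}_i^2 \cdot \hat{p}_i]$
The second moment $E(D_i^2)$ is the term that still needs to be estimated.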
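The variance expression can be derived in a few steps. The sketch below assumes that Y_i is the product of a response indicator B_i, which is Bernoulli with probability p_i, and the amount D_i received given a response. The symbol B_i is introduced here only for illustration and does not appear in the slides.

```latex
\begin{aligned}
Y_i &= B_i D_i, \qquad B_i \sim \mathrm{Bernoulli}(p_i) \\
E(Y_i) &= p_i \, E(D_i) \\
E(Y_i^2) &= p_i \, E(D_i^2) \qquad \text{since } B_i^2 = B_i \\
\mathrm{Var}(Y_i) &= E(Y_i^2) - (E Y_i)^2
  = p_i \, E(D_i^2) - p_i^2 \, [E(D_i)]^2
  = p_i \left[ E(D_i^2) - [E(D_i)]^2 \, p_i \right]
\end{aligned}
```

Replacing p_i, E(D_i), and E(D_i^2) with their estimates gives the form on the slide. The costs are deterministic, so they shift the mean but do not change the variance.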
Some Second Moment Estimates
Distribution   Estimate of $E(D_i^2)$
Normal*        $\hat{\sigma}^2 + \hat{D}_i^2$
Poisson        $\hat{D}_i + \hat{D}_i^2$
Gamma          $\hat{D}_i^2 \cdot (1 + 1/\hat{\sigma}_{\mathrm{shape}})$
Lognormal      $\hat{D}_i^2 \cdot \exp(\hat{\sigma}^2)$
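Each entry follows from the identity E(D^2) = Var(D) + [E(D)]^2 together with the standard variance of the distribution written in terms of its mean. The lines below sketch that step, writing the mean as D_i, the gamma shape parameter as σ_shape to match the table, and the log-scale variance of the lognormal as σ^2. This reading of the table's parameters is an assumption.

```latex
\begin{aligned}
E(D_i^2) &= \mathrm{Var}(D_i) + [E(D_i)]^2 \\
\text{Normal:}\quad \mathrm{Var}(D_i) &= \sigma^2
  &&\Rightarrow\; E(D_i^2) = \sigma^2 + D_i^2 \\
\text{Poisson:}\quad \mathrm{Var}(D_i) &= D_i
  &&\Rightarrow\; E(D_i^2) = D_i + D_i^2 \\
\text{Gamma:}\quad \mathrm{Var}(D_i) &= D_i^2 / \sigma_{\mathrm{shape}}
  &&\Rightarrow\; E(D_i^2) = D_i^2 \,(1 + 1/\sigma_{\mathrm{shape}}) \\
\text{Lognormal:}\quad \mathrm{Var}(D_i) &= D_i^2 \,(e^{\sigma^2} - 1)
  &&\Rightarrow\; E(D_i^2) = D_i^2 \, e^{\sigma^2}
\end{aligned}
```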
Some Profit Variance Estimates
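Distribution   Estimate of $\mathrm{Var}(\mathrm{Profit}_i)$
Normal*        $\hat{p}_i \cdot \hat{D}_i^2 \, [\, 1 - \hat{p}_i + \hat{\sigma}^2/\hat{D}_i^2 \,]$
Poisson        $\hat{p}_i \cdot \hat{D}_i^2 \, [\, 1 - \hat{p}_i + 1/\hat{D}_i \,]$
Gamma          $\hat{p}_i \cdot \hat{D}_i^2 \, [\, 1 - \hat{p}_i + 1/\hat{\sigma}_{\mathrm{shape}} \,]$
Lognormal      $\hat{p}_i \cdot \hat{D}_i^2 \, [\, 1 - \hat{p}_i + \exp(\hat{\sigma}^2) - 1 \,]$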
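The variance estimates in this table can be obtained by plugging a second-moment estimate into p_i [E(D_i^2) - D_i^2 p_i]. The following is a minimal sketch of that computation. The function names, the string keys for the distributions, and the example numbers are all hypothetical.

```python
import math

def second_moment(dist, d_hat, sigma2=None, shape=None):
    """Estimate of E(D_i^2) given the estimated mean d_hat."""
    if dist == "normal":
        return sigma2 + d_hat ** 2
    if dist == "poisson":
        return d_hat + d_hat ** 2
    if dist == "gamma":
        return d_hat ** 2 * (1.0 + 1.0 / shape)
    if dist == "lognormal":
        return d_hat ** 2 * math.exp(sigma2)
    raise ValueError(f"unknown distribution: {dist}")

def profit_variance(p_hat, d_hat, ed2):
    """Var(Profit_i) = p_hat * (E(D_i^2) - d_hat^2 * p_hat)."""
    return p_hat * (ed2 - d_hat ** 2 * p_hat)

# Hypothetical inputs: 5% response probability, mean amount 20, Poisson target.
var_poisson = profit_variance(0.05, 20.0, second_moment("poisson", 20.0))
# Hypothetical inputs: 10% response probability, mean amount 10, gamma shape 2.
var_gamma = profit_variance(0.10, 10.0, second_moment("gamma", 10.0, shape=2.0))
print(var_poisson, var_gamma)
```

For the Poisson case the table form gives 0.05 · 400 · [1 - 0.05 + 1/20] = 20, and for the gamma case 0.10 · 100 · [1 - 0.10 + 1/2] = 14, matching the generic formula.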
Profit Plots with Tolerance Limits
Tolerance limits: $\sum_i \mathrm{EPC}_i \pm 2 \sqrt{\sum_i \mathrm{Var}(\mathrm{Profit}_i)}$
[Figure: overall average profit (left axis, 0.00 to 0.20) and scaled total profit (right axis, $4,000 to $16,000) versus % selected (10 to 90); N = 96,367. Curves: Σ EPC_i with tolerance limits, and two Σ OP_i curves]
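The tolerance limits are the summed EPC_i plus or minus two times the square root of the summed profit variances. A minimal sketch, with a hypothetical function name and made-up EPC and variance values for three selected cases:

```python
import math

def total_profit_limits(epc, variances, z=2.0):
    """Return (lower, total, upper) for sum(EPC_i) +/- z * sqrt(sum(Var(Profit_i)))."""
    total = sum(epc)
    half_width = z * math.sqrt(sum(variances))  # variances add for independent cases
    return total - half_width, total, total + half_width

# Hypothetical per-case values for the selected cases.
epc = [0.32, -0.18, 0.50]
variances = [9.0, 4.0, 3.0]

lower, total, upper = total_profit_limits(epc, variances)
print(f"total={total:.2f}  limits=({lower:.2f}, {upper:.2f})")
```

The multiplier 2 is taken from the slide. Reading it as an approximate 95% normal interval is an interpretation that rests on the central limit argument made earlier in the notes.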
Profit Plots with Tolerance Limits
[Figure: overall average profit (left axis, 0.00 to 0.20) and scaled total profit (right axis, $4,000 to $16,000) versus % selected (10 to 90); N = 96,367. Curves: Σ EPC_i, Σ OP_i (training), Σ OP_i (validation), and Σ OP_i (score)]
1998 KDD-Cup Results
Rank   Total Profit   Overall Avg. Profit
 1.    $14,712        $0.153
 2.    $14,662        $0.152
 3.    $13,954        $0.145
 4.    $13,825        $0.143
 5.    $13,794        $0.143
 6.    $13,598        $0.141
 7.    $13,040        $0.135
 8.    $12,298        $0.128
 9.    $11,423        $0.119
10.    $11,276        $0.117
11.    $10,720        $0.111
12.    $10,706        $0.111
13.    $10,112        $0.105
14.    $10,049        $0.104
15.    $9,741         $0.101
16.    $9,464         $0.098
17.    $5,683         $0.059
18.    $5,484         $0.057
19.    $1,925         $0.020
20.    $1,706         $0.018
“Solicit everyone” model: total profit $10,560; avg. profit $0.110
Prediction Limits: The Good
• Quantifies uncertainty in expected profit estimates
• Lends perspective to model comparisons
• Gives insight into model fit
Prediction Limits: The Bad
• Does not account for model variability
• Skewed by outlying predictions
Model Variability
• Same Model Specification
• Same Training Data
• Different Parameter Initialization
[Figure: overall average profit (0.00 to 0.26) versus % selected (10 to 90). Curves: Σ EPC_i]
Prediction Limits: The Ugly
• Requires scaling adjustments for sampling
• Surprises analysts/management
Scaling Prediction Limits (More CLT)
N = 963,670
• Overall Average Profit: limits scale by 1/√N
• Total Profit: limits scale by √N
[Figure: overall average profit (left axis, 0.00 to 0.20) and scaled total profit (right axis, $40,000 to $160,000) versus % selected (10 to 90)]
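The √N and 1/√N scaling can be illustrated by comparing the two population sizes that appear in these slides, 96,367 and 963,670. The sketch below makes a strong simplifying assumption that is not in the source: every case has the same profit variance. The function name and the variance value of 4.0 are hypothetical.

```python
import math

def limit_half_widths(n, var_per_case, z=2.0):
    """Half-widths of the limits for total profit and for overall average profit."""
    total_hw = z * math.sqrt(n * var_per_case)  # grows like sqrt(N)
    average_hw = total_hw / n                   # shrinks like 1 / sqrt(N)
    return total_hw, average_hw

small_total, small_avg = limit_half_widths(96_367, 4.0)
large_total, large_avg = limit_half_widths(963_670, 4.0)

print(f"total-profit limits widen by a factor of {large_total / small_total:.3f}")
print(f"average-profit limits shrink by a factor of {large_avg / small_avg:.3f}")
```

A tenfold larger population multiplies the expected total by 10 but widens its limits only by √10, so the limits become narrower relative to the total. The limits on the overall average narrow by 1/√10.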