Using Value of Information to Learn and Classify under Hard Budgets

Russell Greiner, Daniel Lizotte, Aloak Kapoor
Dept of Computing Science, University of Alberta
Omid Madani
Yahoo! Research
http://www.cs.ualberta.ca/~greiner/BudgetLearn
(UAI’03; UAI’04; COLT’04; ECML’05)
Original Task:
• Need classifier for diagnosing cancer
• Given:
  – pool of patients whose
    • subtype IS known
    • feature values are NOT known
  – cost c(Xi) of purchasing feature Xi
  – hard budget for purchasing feature values
• Produce:
  – classifier to predict subtype of novel instance, based on values of its features
  – … learned using only the features purchased

Process:
• Initially, learner R knows NO feature values
• At each time, R can purchase the value of a feature of an instance, at cost c(Xi)
  – choice based on results of prior purchases
  – … until exhausting the fixed budget
• Then R produces the classifier
• Purchasing a feature value…
  – alters accuracy of the classifier
  – decreases the remaining budget
• Quality: accuracy of the classifier obtained (0/1 validation error)

Challenge: at each time, what should R purchase? Which feature of which instance?
[Figure: data matrix over features X1…X4 and label Y; purchased entries revealed, all other entries still "?". A skeleton of the purchase loop is sketched below.]
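To make the process concrete, here is a minimal sketch of the purchase-then-fit loop; the names pool.reveal, choose, and fit are my own placeholders, not from the poster:

# A minimal skeleton of the budgeted-learning loop (all names hypothetical).
def budgeted_learn(pool, costs, budget, choose, fit):
    """Purchase feature values until the budget is exhausted, then fit.

    pool:   instances with known labels but hidden feature values
    choose: policy mapping (observations so far, remaining budget)
            -> (instance, feature) to purchase next
    fit:    builds a classifier from whatever values were purchased
    """
    observed = {}                                # (instance, feature) -> value
    while budget > 0:
        inst, feat = choose(observed, budget)    # contingent on past purchases
        if costs[feat] > budget:
            break
        observed[(inst, feat)] = pool.reveal(inst, feat)   # pay and observe
        budget -= costs[feat]
    return fit(observed)

The key design point, which the rest of the poster explores, is the choose policy: it may condition on everything seen so far and on the remaining budget.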
Simpler Task:
• Determine which coin has highest P(head)
• … based on results of only 20 flips
• Which coin??

• Bayesian Framework:
  – Coin Ci drawn from Beta(ai, bi)
• MDP:
  – State = (a1, b1, …, ak, bk, r)
  – Action = "Flip coin i"
  – Reward = 0 if r > 0; else maxi { ai / (ai + bi) }
  – solve for optimal purchasing policy
• NP-hard
⇒ Develop tractable heuristic policies that perform well
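As one illustration of the MDP above, the sketch below computes the optimal expected reward by brute-force dynamic programming over Beta-posterior states. This is my own toy rendering (exponential in the budget, so toy sizes only), not the poster's algorithm:

# Coins MDP: each coin's P(head) has a Beta(a, b) posterior; flipping a coin
# updates its posterior; when the budget hits 0, reward = best posterior mean.
from functools import lru_cache

@lru_cache(maxsize=None)
def value(state, r):
    """Optimal expected reward; state = tuple of (a, b) pairs, r = flips left."""
    if r == 0:
        return max(a / (a + b) for a, b in state)     # commit to the best coin
    best = 0.0
    for i, (a, b) in enumerate(state):
        p_head = a / (a + b)                          # predictive prob. of heads
        heads = state[:i] + ((a + 1, b),) + state[i + 1:]
        tails = state[:i] + ((a, b + 1),) + state[i + 1:]
        ev = p_head * value(heads, r - 1) + (1 - p_head) * value(tails, r - 1)
        best = max(best, ev)                          # best coin to flip next
    return best

print(value(((1, 1), (1, 1), (2, 1)), 4))   # three coins, budget of 4 flips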
Heuristic Policies:
• Round Robin
  – Flip C1, then C2, then …
• Biased Robin
  – Flip Ci; if heads, flip Ci again; else move on to Ci+1
• Greedy Loss Reduction
  – Loss1(Ci) = expected loss of flipping Ci once
  – Flip C* = argmini { Loss1(Ci) }, once
• Single Feature Lookahead (k)
  – SFL(Ci, k) = expected loss of spending k flips on Ci
  – Flip C* = argmini { SFL(Ci, k) }, once
• Randomized SFL
  – Flip Ci with probability proportional to exp( −SFL(Ci, k) ), once
(Sketches of these policies appear below.)
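A hedged sketch of how these policies could be scored, assuming the beta-binomial predictive distribution. The exact loss the poster uses may differ, so read sfl_value here as "expected best posterior mean after committing k flips to coin i" (the policies then maximize it rather than minimize a loss); the temperature tau in the randomized variant is my own knob:

import math, random

def beta_binomial_pmf(j, k, a, b):
    """P(exactly j heads in k flips) when P(head) ~ Beta(a, b)."""
    return math.comb(k, j) * math.exp(
        math.lgamma(a + j) + math.lgamma(b + k - j) - math.lgamma(a + b + k)
        - math.lgamma(a) - math.lgamma(b) + math.lgamma(a + b))

def sfl_value(state, i, k):
    """Expected best posterior mean after k flips of coin i.
    Greedy Loss Reduction is the special case k = 1."""
    a, b = state[i]
    others = max(x / (x + y) for j, (x, y) in enumerate(state) if j != i)
    return sum(beta_binomial_pmf(j, k, a, b)
               * max(others, (a + j) / (a + b + k))
               for j in range(k + 1))

def round_robin(t, state):
    return t % len(state)                        # C1, C2, ..., cycling

def biased_robin(last_i, last_was_head, state):
    return last_i if last_was_head else (last_i + 1) % len(state)

def sfl_policy(state, k):
    return max(range(len(state)), key=lambda i: sfl_value(state, i, k))

def randomized_sfl(state, k, rng=random, tau=0.1):
    # sample coin i with weight exp(sfl_value / tau), a softmax over scores
    w = [math.exp(sfl_value(state, i, k) / tau) for i in range(len(state))]
    return rng.choices(range(len(state)), weights=w, k=1)[0]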
Regret:
[Figure: regret rA of algorithm A vs the optimal policy's regret r*, as a function of the budget]
• A is an APPROXIMATION algorithm iff A's regret is at most a constant factor worse than optimal (for any budget, #coins, …)
• NOT approximation algorithms:
  – Round Robin
  – Random
  – Greedy
  – Interval Estimation
Results (coins):
[Figure: error vs number of flips for round-robin, biased-robin, greedy, and look-ahead (depth 80) policies, with the minimum achievable error shown; panels: Beta(1,1), n=10, b=10; Beta(1,1), n=10, b=40; Beta(10,1), n=10, b=40]
• Obvious approach (round robin) is NOT good!
• Contingent policies work best
• Important to know/use the remaining budget
Back to the Original Task:
• Use NaïveBayes classifier, as…
  – it handles missing data
  – no feature interaction
  – only O(N) parameters to estimate
• Use "analogous" heuristic policies (a sketch follows below)
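To show why missing data is painless here, a minimal naive-Bayes sketch (binary features and Laplace smoothing assumed; all class and method names are mine): unpurchased feature values simply contribute no likelihood factor, and the model keeps only per-feature counts, hence O(N) parameters.

from collections import defaultdict

class BudgetedNaiveBayes:
    def __init__(self):
        # Laplace-smoothed counts: (class, feature, value) -> count
        self.counts = defaultdict(lambda: 1)
        self.class_counts = defaultdict(lambda: 1)

    def observe_label(self, label):
        self.class_counts[label] += 1

    def observe(self, label, feature, value):
        """Incorporate one purchased feature value for a labeled instance."""
        self.counts[(label, feature, value)] += 1

    def predict(self, observed):
        """MAP class given only the feature values we actually have."""
        total = sum(self.class_counts.values())
        best, best_p = None, -1.0
        for y, cy in self.class_counts.items():
            p = cy / total
            for feat, val in observed.items():     # unpurchased features: no factor
                seen = self.counts[(y, feat, val)]
                norm = self.counts[(y, feat, 0)] + self.counts[(y, feat, 1)]
                p *= seen / norm
            if p > best_p:
                best, best_p = y, p
        return best

nb = BudgetedNaiveBayes()
nb.observe_label("+"); nb.observe("+", 0, 1)
nb.observe_label("-"); nb.observe("-", 0, 0)
print(nb.predict({0: 1}))   # features never purchased are simply left out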
Issues:
• Round-robin (the standard approach) is still bad
• SingleFeatureLookahead: how far ahead? What k in SFL(k)?

Results (Original Task; UCI Mushroom dataset):
[Figure: 0/1 misclassification error vs learning budget (0–500) for RR, BR, SFL (depth 25), RSFL (depth 25), Greedy, and All Data]
• Each +class instance is "the same", …
So far:
• the LEARNER (researcher) has to pay for features… but
• the CLASSIFIER (the "MD") gets ALL feature values… for free!
• Typically, the MD also has constraints [… capitation …]

Extended model: hard budget for BOTH learner & classifier
• Eg: spend bL = $10,000 to learn a classifier that can spend only bC = $50 / patient…
• Classifier = "Active Classifier" [GGR, 2002]
  – a policy (decision tree)
  – sequentially gathers info about the instance, until rendering a decision
  – (must make the decision by depth bC)
• Learner…
  – spends bL gathering information, forming a posterior distribution P(·) [using the naïve Bayes assumption]
  – uses a Dynamic Program to find the best cost-bC policy for P(·)
• Double dynamic program!
• Too slow ⇒ use heuristic policies (a sketch of the inner DP appears below)
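For concreteness, here is a hedged sketch of the inner dynamic program only: the best expected accuracy achievable by an active classifier that may spend at most bC more, under a toy naive-Bayes posterior. All numbers and names below are my own illustration, not the poster's model (the outer problem, choosing what the learner buys, is handled by the heuristic policies above):

# Toy model: binary class Y, three binary features, naive Bayes likelihoods.
from functools import lru_cache

PRIOR = {0: 0.5, 1: 0.5}
LIKE = {0: [0.2, 0.7, 0.5], 1: [0.8, 0.3, 0.5]}   # P(X_i = 1 | Y = y)
COST = [1, 1, 2]                                   # cost of observing each X_i

def posterior(evidence):
    """P(Y | observed evidence); evidence = tuple of (feature, value)."""
    w = {}
    for y, p in PRIOR.items():
        for i, v in evidence:
            p *= LIKE[y][i] if v == 1 else 1 - LIKE[y][i]
        w[y] = p
    z = sum(w.values())
    return {y: p / z for y, p in w.items()}

@lru_cache(maxsize=None)
def best_policy_value(evidence, budget):
    """Expected accuracy of the best policy that spends <= budget more."""
    post = posterior(evidence)
    value = max(post.values())            # option 1: stop, predict MAP class
    seen = {i for i, _ in evidence}
    for i, c in enumerate(COST):          # option 2: buy another feature
        if i in seen or c > budget:
            continue
        p1 = sum(post[y] * LIKE[y][i] for y in post)   # P(X_i = 1 | evidence)
        ev = (p1 * best_policy_value(evidence + ((i, 1),), budget - c) +
              (1 - p1) * best_policy_value(evidence + ((i, 0),), budget - c))
        value = max(value, ev)
    return value

print(best_policy_value((), 2))   # best expected accuracy with bC = 2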
Results (Extended Task):
• Glass – Identical Feature Costs
  [Figure: 0/1 misclassification error vs learning budget for RR, BR, SFL (depth 25), RSFL (depth 25), Greedy, and All Data, at fixed learner and classifier budgets]
• Heart Disease – Different Feature Costs (bC = 7)
  [Figure: 0/1 misclassification error vs learning budget for RR, BR, SFL (depth 25), RSFL (depth 25), Greedy, Selector, and All Data]
Issues:
• Round-robin still bad… very bad…
• Randomized SFL is best
• (Deterministic) SFL is "too focused"
Related Work:
• Not a standard Bandit problem:
  – pure exploration for "b" steps, then a single exploitation
• Not on-line learning:
  – no "feedback" until the end
• Not PAC-learning:
  – fixed #instances; NOT "polynomial"
• Not standard experimental design
• Related to simple active learning, but general Budgeted Learning is different