Minimax Pathology

Mitja Luštrek (1), Ivan Bratko (2) and Matjaž Gams (1)
(1) Jožef Stefan Institute, Department of Intelligent Systems
(2) University of Ljubljana, Faculty of Computer and Information Science
2005-12-08

Plan of the talk
- What is the minimax pathology
- Past work on the pathology
- A real-valued minimax model
- Why is minimax not pathological
- Why is minimax beneficial

What is the minimax pathology
Conventional wisdom: the deeper one searches a game tree, the better one plays; there is no shortage of practical confirmation.
Theoretical analyses: minimaxing amplifies the error of the heuristic evaluation function, therefore the deeper one searches, the worse one plays. Pathology!

The pathology illustrated
[Figure sequence: a game tree below the current position with true final values at the leaves; static heuristic values with error at the searched depth; backed-up heuristic values, which should be more trustworthy but have a larger error instead; and static heuristic values with a smaller error.]

Past work on the pathology

The discovery
First discovered by Nau [1979]; discovered independently a year later by Beal [1980].
Beal's minimax model:
1. uniform branching factor;
2. position values are losses or wins;
3. the proportion of losses for the side to move is constant;
4. position values within a level are independent of each other;
5. the error is the probability of mistaking a loss for a win or vice versa and is independent of the level of a position.
None of the assumptions looks terribly unrealistic, yet the pathology is there.
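To make the claim concrete, here is a minimal Monte Carlo sketch of Beal's model (my own illustration, not Beal's analysis or code; all names and constants are mine): leaf values are independent losses and wins with the constant loss proportion for branching factor 2, a constant static error flips each value with probability 0.1, and both the true and the corrupted values are backed up to the root. The error at the root should grow with the depth of search, which is the pathology.

```python
# A minimal sketch of Beal's two-valued model (illustrative; constants are my choice).
import random
from math import sqrt

B = 2                   # uniform branching factor
Q = (3 - sqrt(5)) / 2   # P(loss for the side to move); Q = (1 - Q)^2 keeps it constant
P_ERR = 0.1             # static error: P(a loss is mistaken for a win or vice versa)
TREES = 2000            # Monte Carlo trees per depth

def backup(level):
    # One minimax step: a node is a win for its mover iff some child is a loss for its mover.
    return [not all(level[i:i + B]) for i in range(0, len(level), B)]

def root_error(depth):
    mistakes = 0
    for _ in range(TREES):
        true_vals = [random.random() >= Q for _ in range(B ** depth)]   # True = win
        heur_vals = [v if random.random() >= P_ERR else not v for v in true_vals]
        while len(true_vals) > 1:
            true_vals, heur_vals = backup(true_vals), backup(heur_vals)
        mistakes += true_vals[0] != heur_vals[0]
    return mistakes / TREES

for d in range(1, 9):
    print(f"depth {d}: two-value error at the root = {root_error(d):.3f}")
```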
Attempts at an explanation
Researchers tried to find a flaw in Beal's model by attacking its assumptions.
1. Uniform branching factor: a geometrically distributed branching factor prevents the pathology [Michon, 1983]; in chess endgames, an asymmetrical branching factor causes the pathology [Sadikov, 2005].
2. Position values are losses or wins: multiple values do not help [Bratko & Gams, 1982; Pearl, 1983]; multiple values used in a simple game -> pathological [Nau, 1982, 1983]; multiple/real values used to construct a realistic model -> not pathological [Scheucher & Kaindl, 1998; Luštrek, 2004].
3. The proportion of losses for the side to move is constant: in models where it is applicable, it was agreed to be necessary [Beal, 1982; Bratko & Gams, 1982; Nau, 1982, 1983].
4. Position values within a level are independent of each other: nearby positions are similar and thus have similar values; most researchers agreed that this is the answer, or at least a part of it [Beal, 1982; Bratko & Gams, 1982; Pearl, 1983; Nau, 1982, 1983; Schrüfer, 1986; Scheucher & Kaindl, 1998; Luštrek, 2004].
5. The error is independent of the level of a position: varying error cannot account for the absence of the pathology [Pearl, 1983]; Nau [1982, 1983] used varying error in a game and it did not help; varying error is a part of the answer, the other part being node-value dependence [Scheucher & Kaindl, 1998].
Despite some disagreement, node-value dependence seems to be the most widely supported explanation. But is it really necessary? Is there no simpler, more fundamental explanation? We believe there is!

A real-valued minimax model

Why multiple/real values?
Multiple values are necessary in games whose final outcome is multi-valued (Othello, tarok), and they are used by humans and game-playing programs alike.
They seem unnecessary in games where the outcome is a loss, a win or perhaps a draw (chess, checkers). But:
- in a losing position against a fallible and unknown opponent, the outcome is uncertain;
- in a winning position, a perfect two-valued evaluation function will not lose, but it may never win, either.
Multiple values are required to model uncertainty and to maintain a direction of play towards an eventual win.
If only the positions along the quickest or most probable path to victory were evaluated as won, even two values would enable optimal play. Strange: won positions off that path would be evaluated as lost. Not quite what the researchers using two values had in mind, but still ...?

A real-valued minimax model
Aims to be a real-valued version of Beal's model:
1. uniform branching factor;
2. position values are real numbers;
3. if the real values are converted to losses and wins, the proportion of losses for the side to move is constant;
4. position values within a level are independent of each other;
5. the error is normally distributed noise and is independent of the level of a position.
The crucial difference is assumption 5.

Assumption 5
[Illustration: a two-value error turns a loss into a win or vice versa; a real-value error shifts a value, e.g. from 0.31 to 0.74.]
Beal's assumption 5: the static P(loss ↔ win) is constant with the depth of search.
Our assumption 5: the magnitude of the static real-value noise is constant with the depth of search.
(Static = applied at the lowest level of search.)

Building of a game tree
[Figure sequence:]
- true values are distributed uniformly in [0, 1] at the leaves;
- true values are backed up level by level;
- the search is cut off at some depth;
- heuristic values at that depth = true values + normally distributed noise;
- heuristic values are backed up towards the root.

What we do with our model
Monte Carlo experiments:
- generate 10,000 sets of true values;
- generate 10 sets of heuristic values per set of true values per depth of search;
- measure the error at the root: real-value error = the average difference between the true value and the heuristic value; two-value error = the frequency of mistaking a loss for a win or vice versa;
- compare the error at the root when searching to different depths.
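A self-contained sketch of this experiment follows (my own illustration, not the authors' code; names and constants are mine, and the tree size and number of trials are scaled down so it runs in seconds). The loss/win threshold c_b used for the two-value error is explained on the next slide; for branching factor 2 it works out to (3 − √5)/2 ≈ 0.382 under the assumption that the leaves are levels where the maximizing player is to move.

```python
# A scaled-down sketch of the Monte Carlo experiment (illustrative, not the authors' code).
import random

B = 2                      # uniform branching factor
DEPTH = 8                  # depth of the full game tree (root = level 0, leaves = level 8)
SIGMA = 0.1                # standard deviation of the static real-value noise
C_B = (3 - 5 ** 0.5) / 2   # threshold c_b for b = 2 (constant proportion of losses)
TREES = 300                # sets of true values (scaled down from 10,000)
NOISE_SETS = 10            # sets of heuristic values per set of true values

def backup_to_root(values, level):
    # Back values up from the given level to the root; even levels are MAX nodes, odd are MIN.
    while level > 0:
        level -= 1
        op = max if level % 2 == 0 else min
        values = [op(values[i:i + B]) for i in range(0, len(values), B)]
    return values[0]

errors = {d: [0.0, 0] for d in range(1, DEPTH + 1)}   # [real-value sum, two-value count]

for _ in range(TREES):
    # True values: uniform in [0, 1] at the leaves, then backed up to every level.
    true_levels = [None] * (DEPTH + 1)
    true_levels[DEPTH] = [random.random() for _ in range(B ** DEPTH)]
    for lvl in range(DEPTH - 1, -1, -1):
        kids = true_levels[lvl + 1]
        op = max if lvl % 2 == 0 else min
        true_levels[lvl] = [op(kids[i:i + B]) for i in range(0, len(kids), B)]
    true_root = true_levels[0][0]

    for d in range(1, DEPTH + 1):
        for _ in range(NOISE_SETS):
            # Heuristic values at the search frontier = true values + normally distributed noise.
            frontier = [v + random.gauss(0, SIGMA) for v in true_levels[d]]
            heur_root = backup_to_root(frontier, d)
            errors[d][0] += abs(true_root - heur_root)
            errors[d][1] += (true_root >= C_B) != (heur_root >= C_B)

n = TREES * NOISE_SETS
for d in range(1, DEPTH + 1):
    print(f"depth {d}: real-value error {errors[d][0] / n:.3f}, "
          f"two-value error {errors[d][1] / n:.3f}")
```

With the static real-value noise held constant, both error measures at the root should shrink as the search deepens, in line with the graphs that follow.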
Conversion of real values to losses and wins
To measure the two-value error, real values must be converted to losses and wins: a value above a threshold means a win, a value below it a loss.
At the leaves: the proportion of losses for the side to move = c_b (because it must be the same at all levels); the real values are distributed uniformly in [0, 1]; therefore the threshold = c_b.
At higher levels: minimaxing on real values is equivalent to minimaxing on two values; therefore the threshold is also c_b.
[Figure sequence: backing up the real values and then applying the threshold gives the same losses and wins as applying the threshold at the leaves and backing up the two values; a small numeric check of this equivalence is sketched after the graphs below.]

Why is minimax not pathological

Error at the root / constant static real-value error
Static real-value error: normally distributed noise with standard deviation 0.1.
[Graph: real-value and two-value error at the root as a function of the depth of search (0 to 10).]

Static two-value error / constant static real-value error
Static real-value error: normally distributed noise with standard deviation 0.1.
[Graph: static two-value error as a function of the depth of search (0 to 10).]

Static real-value error / constant static two-value error
Static two-value error: 0.1.
[Graph: static real-value error as a function of the depth of search (0 to 10).]

Error at the root / constant static two-value error
Two-value error at the lowest level of search: 0.1.
[Graph: two-value error at the root in our real-value model and in Beal's model as a function of the depth of search (0 to 10).]
After a small tweak of Beal's model, we get a perfect match.

Conclusions from the graphs
When the static real-value error is constant: the static two-value error decreases with the depth of search; no pathology.
When the static two-value error is constant: the static real-value error increases with the depth of search; pathology.
Which static error should be constant?
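Before answering, here is the small check promised above of the conversion step behind the two-value curves (my own sketch, with names and constants of my choosing): because min and max are monotone, backing real values up and then thresholding at c_b labels every node exactly as thresholding the leaves first and backing the two values up.

```python
# Sanity check: minimaxing real values commutes with thresholding (illustrative sketch).
import random

B, DEPTH = 2, 6
T = (3 - 5 ** 0.5) / 2     # threshold c_b for b = 2

def backup_real(children, parent_level):
    op = max if parent_level % 2 == 0 else min     # even levels are MAX nodes
    return [op(children[i:i + B]) for i in range(0, len(children), B)]

def backup_two(children, parent_level):
    # True = "at least c_b for MAX": a MAX node is True iff some child is, a MIN node iff all are.
    agg = any if parent_level % 2 == 0 else all
    return [agg(children[i:i + B]) for i in range(0, len(children), B)]

for _ in range(100):
    leaves = [random.random() for _ in range(B ** DEPTH)]
    reals, twos = leaves, [v >= T for v in leaves]
    for lvl in range(DEPTH - 1, -1, -1):
        reals, twos = backup_real(reals, lvl), backup_two(twos, lvl)
    assert (reals[0] >= T) == twos[0]
print("thresholding after minimaxing matches minimaxing the thresholded values")
```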
Should real- or two-value static error be constant?
We have explained why real values are necessary. Real-value error most naturally represents the fallibility of the heuristic evaluation function.
Game-playing programs do not use two-valued evaluation functions, but if they did, they would often make mistakes in uncertain positions close to the threshold and rarely in certain positions far from it.

Verification in trioomph
[Graph: average difference as a function of the level (0 to 5).]

Average difference?
A simple solution, but perhaps not the best one. A table of all possible combinations of errors might give some insight.

Two-value error larger at higher levels
Some simplifications: branching factor = 2; node values in [0, 1]; only one type of error is considered (wins mistaken for losses); levels are taken two at a time to avoid even/odd differences.
X ... true real value of a node
F(x) = P(X < x) ... distribution function of the true real value
e ... real-value error
X − e ... heuristic real value
t ... threshold
Two-value error: P(X > t and X − e < t) = P(t < X < t + e) = F(t + e) − F(t)

We need to show that the two-value error at higher levels is larger than at lower levels:
F_{i-2}(t + e) − F_{i-2}(t) > F_i(t + e) − F_i(t),
i.e. the difference in F between the points t and t + e is larger at higher levels, which means that F is steeper there.
Example: uniform distribution at the leaves, depth = 10:
F_10(x) = x
F_8(x) = 4x^2 − 4x^3 + x^4
(a min node over two uniform values has distribution function 2x − x^2, and a max node over two such values has (2x − x^2)^2 = 4x^2 − 4x^3 + x^4).
F_8(x) is steeper than F_10(x) between x = a and x = b.

In general, the two-value error at higher levels is larger than at lower levels if
F(a) = 0.1624 < F(t) < 0.7304 = F(b).
Why can we expect this condition to hold?
- F(t) is the proportion of losses, which is constant;
- a constant proportion of losses is achieved by each player having just enough advantage after their move that the opponent can balance it out after theirs;
- when one side's advantage is too large, it grows even larger at each successive level;
- therefore F(t) can be expected to be neither very large nor very small.
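A quick numeric check of that interval (my own sketch, using the example distribution functions above): finding where the slope of F_8 exceeds the slope of F_10 (which is 1) recovers a ≈ 0.1624 and b ≈ 0.7304.

```python
# Where is F8(x) = 4x^2 - 4x^3 + x^4 steeper than F10(x) = x? (illustrative check)
def f8_prime(x):
    return 8 * x - 12 * x ** 2 + 4 * x ** 3      # derivative of F8

def crossing(lo, hi, eps=1e-9):
    # Bisection for the single point in (lo, hi) where f8_prime(x) = 1.
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if (f8_prime(lo) - 1) * (f8_prime(mid) - 1) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

a = crossing(0.0, 0.5)   # the slope rises past 1 somewhere below 0.5
b = crossing(0.5, 1.0)   # ... and falls back below 1 somewhere above 0.5
print(f"F8 is steeper than F10 for {a:.4f} < x < {b:.4f}")   # about 0.1624 and 0.7304
```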
Two-value error sufficiently larger at higher levels
We have shown that the two-value error at higher levels is larger than at lower levels. Is it larger enough?
Baseline: when searching to the maximum depth, we compute the backed-up two-value error at every level; p_i ... the error at level i. When searching to depth d, the static two-value error at depth d is compared with p_d.
The two-value error at level i exceeds the baseline when F_i(t) and F_i(t + e) satisfy a polynomial inequality (derived for branching factor 2, taking two levels at a time).
Pathology occurs only when F(t + e) is close to 1; since F(t) is not expected to be close to 1, this would require a very large error.

Minimax is not pathological
Real-valued evaluations are necessary for successful game playing.
If the static real-value error is constant, the static two-value error is larger when searching to smaller depths. This happens because at smaller depths, node values are closer to the threshold separating losses from wins. The pathology is thus eliminated.
Our explanation for the lack of pathology is a necessary consequence of a real-valued minimax model and requires no additional assumptions.
What happens if we use a number of discrete values instead of real values (which is more realistic)?

Why is minimax beneficial

Preamble
Only real values are considered here. We must compare searches to different depths of the same tree; the shape of the tree does not matter: what matters is that each minimaxing step reduces the error, so more steps mean a smaller error.
Some simplifications: a constant difference between the values of sibling nodes; branching factor = 2.

One minimaxing step / Ten minimaxing steps
[Figures: the error after one minimaxing step and after ten minimaxing steps.]

Conclusion
Past theoretical analyses showed minimax to be pathological, and the explanations that followed introduced unnecessary complications.
If the error is modeled in the way a real-valued model suggests, the pathology disappears.
Real values also lend themselves well to an explanation of why minimax is beneficial.

Thank you. Questions?