1 Introduction and Examples

Sequencing Problems
Definition
A sequencing problem is one that involves finding a sequence of steps that
transforms an initial system state to a pre-defined goal state for that system.
Sequencing Problem Example: Rubik’s Cube
Initial State of Rubik’s Cube
Figure 1: Rubik’s Cube Initial State
Sequencing Problem Example: Rubik’s Cube
Rubik’s Cube Goal State
Figure 2: Rubik’s Cube Goal State
Sequencing Problem Example: 8-Puzzle
8-Puzzle Initial and Goal State
Figure 3: 8-Puzzle
Sequencing Problem Example: Maze Navigation
Initial State: Maze Entrance, Goal State: Maze Exit
Figure 4: Maze Navigation
Sequencing Problem Example: Blocks World
Blocks World: Raising, Translating, and Setting Boxes
Figure 5: Blocks World

2 Modeling Sequencing Problems

State Models
State Model Definition
A state model is a constraint model SM = (V, D, C) whose model solutions
are called (legal) states. We let S denote the set of legal states for some state
model.
State Model Example: 8-Puzzle
• Variables. xij , i = 1, 2, 3, j = 1, 2, 3, denotes the tile located at row i
and column j.
• Domains. dom(xij ) = {1, . . . , 8, e}, where ‘e’ denotes empty space.
• Constraints. alldifferent(x).
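This model can be sketched directly in code; the state representation (a dict from cells to values) and the goal-state layout chosen below are assumptions for illustration, since the figures are not reproduced here.

```python
# Sketch of the 8-puzzle state model: variables x[i, j], domains
# {1, ..., 8, 'e'}, and the single alldifferent constraint.

DOMAIN = {1, 2, 3, 4, 5, 6, 7, 8, 'e'}

def is_legal_state(x):
    """x maps (i, j), i, j in {1, 2, 3}, to a tile or 'e'; a state is
    legal iff the nine variables take all nine domain values
    (i.e. alldifferent holds)."""
    values = [x[i, j] for i in (1, 2, 3) for j in (1, 2, 3)]
    return set(values) == DOMAIN

# Assumed goal layout: tiles 1..8 in row-major order, empty cell last.
goal = {(1, 1): 1, (1, 2): 2, (1, 3): 3,
        (2, 1): 4, (2, 2): 5, (2, 3): 6,
        (3, 1): 7, (3, 2): 8, (3, 3): 'e'}
assert is_legal_state(goal)

bad = dict(goal)
bad[(1, 1)] = 2          # tile 2 appears twice: violates alldifferent
assert not is_legal_state(bad)
```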
State Graph Models
State Graph Model Definition
Let SM = (V, D, C) be a state model, and S the set of legal model states. A
state-graph model for S has the following components.
• a designated initial state s0 ∈ S
Figure 6: State s
Figure 7: State 1 of τ (s)
• one or more designated goal states sg ∈ G ⊆ S
• a state transition function τ : S → subset(S) that maps each state
to a subset of next states. An environmental factor that produces a
transition from a state s to a state in τ (s) is called an action.
• Optional: each state transition may have an associated cost.
The Purpose of Using State Graph Models
Find a solution path (preferably the shortest or least costly) from s0 to a goal
state sg ∈ G.
State Transition Function Example: 8-Puzzle
Constraint Models for Sequencing Problems
Finding a Solution for SM = (V, D, C) and (S, τ, s0 , G)
• Assume the solution path consists of a sequence of states s0 , . . . , sd−1 of
length d.
Figure 8: State 2 of τ (s)
Figure 9: State 3 of τ (s)
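The transitions pictured in Figures 6 through 9 can be sketched as a successor function; the dict-of-cells state representation is an assumption for illustration.

```python
def successors(x):
    """tau for the 8-puzzle: each next state slides one tile that is
    orthogonally adjacent to the empty cell into the empty cell.
    A state x maps (row, col), with row, col in {1, 2, 3}, to 1..8 or 'e'."""
    ei, ej = next(p for p, v in x.items() if v == 'e')   # locate the empty cell
    out = []
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = ei + di, ej + dj
        if (ni, nj) in x:                                # stay on the board
            nxt = dict(x)
            nxt[ei, ej], nxt[ni, nj] = nxt[ni, nj], 'e'  # slide the tile
            out.append(nxt)
    return out
```

The position of the empty cell determines the size of τ (s): two successors from a corner, three from an edge cell (as in Figures 7 through 9, where τ (s) has three states), and four from the center.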
• Variables. d copies of V : V0 , V1 , . . . , Vd−1 , where an assignment over Vi
encodes state si of the solution path.
• Domains. d copies of Di , i = 1, . . . , m: Di0 , Di1 , . . . , Di(d−1) , where Dij
is the domain of the i-th variable of Vj .
• State Constraints. d copies of C: C0 , C1 , . . . , Cd−1 , where Ci ensures
that an assignment over Vi encodes a valid state si .
• Initial Constraint. The assignment over V0 encodes the initial state s0 .
• Transition Constraints. For all i = 1, . . . , d − 1, si ∈ τ (si−1 ).
• Goal Constraints. si ∈ G for some i = 0, . . . , d − 1.
• Termination Constraints. For all i = 0, . . . , d − 2, if si ∈ G, then
si+1 = si .
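As a sketch of how these constraints fit together, the following hypothetical checker validates a candidate sequence of states; the functions `tau` and `is_goal` and the integer toy states are illustrative assumptions, and a repeated goal state is taken to satisfy the transition constraint (so that the termination constraints can be met).

```python
def is_solution_path(states, s0, tau, is_goal, d):
    """Check a candidate sequence s_0, ..., s_{d-1} against the constraints."""
    if len(states) != d or states[0] != s0:            # initial constraint
        return False
    for i in range(1, d):                              # transition constraints
        repeat_goal = is_goal(states[i - 1]) and states[i] == states[i - 1]
        if states[i] not in tau(states[i - 1]) and not repeat_goal:
            return False
    if not any(is_goal(s) for s in states):            # goal constraint
        return False
    return all(states[i + 1] == states[i]              # termination constraints
               for i in range(d - 1) if is_goal(states[i]))

# Toy example: states are integers, tau(s) = {s + 1}, goal state is 2.
tau = lambda s: [s + 1]
is_goal = lambda s: s == 2
assert is_solution_path([0, 1, 2, 2], 0, tau, is_goal, 4)
assert not is_solution_path([0, 2, 2, 2], 0, tau, is_goal, 4)  # 2 not in tau(0)
```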
3 A∗ Search
Proximity Heuristics
Proximity Heuristic Definition
Let (S, τ, s0 , G) be a state-graph model. A proximity heuristic is a function h : S → R, where h(s) estimates the distance (in terms of number of
remaining transitions or in terms of the sum of the transition costs of each remaining transition) from s to a goal state. We use the term “heuristic” since h
is not guaranteed to be accurate.
Informed Search
We say that a search is informed if it uses additional problem information, such
as a proximity heuristic, to assist in finding a solution. Enabling computers to
learn such heuristics in an unsupervised setting is an ongoing area of research.
This is similar to the problem of learning implied constraints within a constraint-programming setting.
Proximity Heuristic Example: 8-Puzzle
Two Heuristics h1 and h2
• h1 (s) counts the number of tiles that have been moved from their goal-state
position.
• h2 (s) = m1 + m2 + · · · + m8 , where mi is the number of moves needed to
get tile i back to its goal-state position.
Figure 10: h1 (s) = 7, and h2 (s) = 3 + 4 + 2 + 0 + 2 + 4 + 2 + 4 = 21
Best First Search Using Proximity Heuristic h
Mark s0 as having been visited. Initialize priority queue Q = {s0 } to contain
the initial state. While Q is nonempty:
1. Remove from Q the state s for which h(s) is minimum.
2. Let C = τ (s) − M , where M is the set of already-marked states.
3. If C contains a goal state sg , then return the path from s0 to sg .
4. Update h(ŝ) for states ŝ ∈ M , if necessary.
5. Mark each state of C as having been visited and add C to Q.
6. Mark s as having been explored.
Return null since no goal state was reached.
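A minimal Python sketch of this procedure, assuming hashable states; `heapq` serves as the priority queue Q, and the optional heuristic-update step is omitted.

```python
import heapq

def best_first_search(s0, tau, is_goal, h):
    """Best-first search: repeatedly expand the state of minimum h.
    Returns a path from s0 to a goal state, or None."""
    parent = {s0: None}                    # marked states, with back-pointers
    queue = [(h(s0), s0)]
    while queue:
        _, s = heapq.heappop(queue)        # state with minimum h
        for c in tau(s):                   # C = tau(s) - M
            if c in parent:
                continue                   # already marked
            parent[c] = s
            if is_goal(c):                 # goal found: rebuild path s0 -> c
                path = [c]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            heapq.heappush(queue, (h(c), c))
    return None                            # no goal state was reached

# The worked example that follows (states a..i): the edges and h-values
# below are read off the example's queue trace; unlisted edges are omitted.
graph = {'a': ['b', 'd'], 'b': [], 'd': ['e', 'h'],
         'e': [], 'h': ['g'], 'g': ['i'], 'i': []}
h = {'a': 8, 'b': 7, 'd': 5, 'e': 4, 'h': 2, 'g': 1, 'i': 0}
path = best_first_search('a', lambda s: graph[s], lambda s: s == 'i', h.get)
# path == ['a', 'd', 'h', 'g', 'i']
```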
Best First Search Example
s0 = a, sg = i, and h(s) measures the alphabetical distance from the letter
label of s to the letter i. For example, h(s0 ) = 8 since a is 8 letters away from
i.
Q = {(a, 8)}
Q = {(d, 5), (b, 7)}
Q = {(h, 2), (e, 4), (b, 7)}
Q = {(g, 1), (e, 4), (b, 7)}
Q = {(e, 4), (b, 7)}
Figure 11: Initial State
Figure 12: a marks b and d
Figure 13: d marks e and h
Figure 14: h marks g
Figure 15: g marks i: Goal Reached!
A∗ Search: a Balance of Past and Future
Total Path Cost Heuristic
A∗ search refers to a best-first search in which the proximity heuristic h(n) is
replaced by the total path cost heuristic f (n) = g(n) + h(n), where g(n)
is the path cost from the initial state to n, and h(n) is an estimate of the cost of
reaching a goal state from n.
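A minimal sketch of A∗ under the same assumptions (hashable states); here `tau(s)` is assumed to yield (state, cost) pairs, and the goal test is applied when a state is popped from the queue rather than when it is generated, which is what guarantees an optimal path.

```python
import heapq
from itertools import count

def a_star(s0, tau, is_goal, h):
    """A* search. tau(s) yields (next_state, cost) pairs; h estimates the
    remaining cost. Returns (path, cost) or None. Ties on f are broken in
    favor of larger g."""
    tie = count()                               # avoids comparing states
    g = {s0: 0}
    parent = {s0: None}
    queue = [(h(s0), 0, next(tie), s0)]         # entries (f, -g, tie, state)
    while queue:
        f, _, _, s = heapq.heappop(queue)
        if f > g[s] + h(s):                     # stale entry: a cheaper path
            continue                            # to s was found after this push
        if is_goal(s):                          # rebuild path s0 -> s
            path = [s]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1], g[s]
        for c, cost in tau(s):
            gc = g[s] + cost
            if c not in g or gc < g[c]:
                g[c] = gc
                parent[c] = s
                heapq.heappush(queue, (gc + h(c), -gc, next(tie), c))
    return None

# Hypothetical weighted graph with an admissible h (the true remaining
# costs are a:4, b:3, c:1, d:0).
edges = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': [('d', 1)], 'd': []}
h = {'a': 4, 'b': 3, 'c': 1, 'd': 0}
# a_star('a', lambda s: edges[s], lambda s: s == 'd', h.get)
# -> (['a', 'b', 'c', 'd'], 4)
```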
Admissible and Consistent Heuristics
• h(n) is called admissible iff h(n) is no greater than the actual minimal
cost of moving from n to a goal state.
• h(n) is called consistent or monotone iff h(n) ≤ cost(n, n′ ) + h(n′ ) for
every successor node n′ ∈ τ (n).
Examples of Admissible Heuristics
8-Puzzle
• h1 (s) counts the number of tiles that have been moved from their goal-state
position.
• h2 (s) = m1 + m2 + · · · + m8 , where mi is the number of moves needed to
get tile i back to its goal-state position.
Both heuristics are monotone (why?).
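Both heuristics can be sketched as follows, reading mi as the number of moves tile i would need on an otherwise empty board (its Manhattan distance to its goal cell); the dict-of-cells state representation is an assumption for illustration.

```python
def h1(s, goal):
    """Number of tiles out of position (the empty cell is not a tile)."""
    return sum(1 for p in s if s[p] != 'e' and s[p] != goal[p])

def h2(s, goal):
    """Sum over tiles of the Manhattan distance to the tile's goal cell,
    i.e. m_i computed as if the other tiles were absent."""
    where = {v: p for p, v in goal.items()}          # tile -> goal cell
    return sum(abs(i - where[v][0]) + abs(j - where[v][1])
               for (i, j), v in s.items() if v != 'e')
```

Note that h2 dominates h1 (every misplaced tile contributes at least 1 to h2), and both are lower bounds on the true number of moves, hence admissible.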
Finding Shortest Driving Distance from one Location to Another
h(n) is the Euclidean distance between n and the final destination. This heuristic
is also monotone (why?).
Consistent Heuristics are Admissible
Theorem 1: Every consistent heuristic is admissible.
Proof of Theorem 1. Suppose n is a goal state. Then we may assume
that h(n) = 0, and hence the value is admissible. Now suppose h(n′ ) has an
admissible value for every node n′ that is within k or fewer steps (i.e. actions)
from a goal state, for some k ≥ 0. Let n be a node that is k + 1 steps from a
goal state, and let n′ be a successor of n along a path that is optimal from n to
a goal state. Then by consistency we have
h(n) ≤ cost(n, n′ ) + h(n′ ) ≤ cost(n, n′ ) + (minimum cost of reaching a goal from n′ )
= the minimum cost of reaching a goal state from n. Therefore, h(n) is admissible.
Admissible Heuristics Lead to Optimal Solutions
Theorem 2
A best-first search that uses f (n) = g(n) + h(n) as ordering heuristic, where
h(n) is admissible, will find a goal state along an optimal path, provided action
costs are all positive-valued, and ties are broken according to g(n).
Proof that Admissible Heuristics Lead to Optimal Solutions
Proof of Theorem 2. It suffices to prove that, if P is an optimal path
from the initial state to a goal node n, then the parent of n is removed from Q
before n. Let m ≠ n be the node of P that is furthest from the root, and is
removed from Q before n is discovered. We know m exists, since the root node
is the first node of P , and is the first node removed from Q. We must show that
f (m) ≤ f (n), in which case m will be removed from the queue before n (since
g(m) < g(n)). Hence the successors of m will be considered, and any successors
that have not already been placed in the queue will then be placed. But since
m ≠ n is the node furthest from the root that belongs to P and was placed in the
queue, it follows that n must be a successor of m, and hence the optimal path
to n will have been recorded.
Proof that Admissible Heuristics Lead to Optimal Solutions
Completing the Proof of Theorem 2: f (m) ≤ f (n)
• f (m) = g(m) + h(m) ≤ g(m) + (g(n) − g(m)), since h(m) is admissible,
and g(n) − g(m) is the cost in moving from m to goal state n.
• But g(n) = f (n), since h(n) = 0 (n is a goal state).
• Therefore, f (m) = g(m) + h(m) ≤ g(m) + (g(n) − g(m)) = g(n) = f (n).
Informedness
Informedness Definition
Suppose h1 and h2 are two admissible heuristics. We say h2 is more informed
than h1 iff h2 (n) ≥ h1 (n) for every node n ∈ S.
Theorem 3
Given a search problem, let C ∗ denote the minimum cost from the initial state
to a goal state. If admissible heuristic h2 is more informed than admissible h1 ,
then A∗ search using h2 visits at most those nodes n with f (n) < C ∗ that
are visited when using h1 . In other words, if an A∗ search using h2 visits a
node n that was not visited when using h1 , then it must be the case that
f (n) = g(n) + h2 (n) = C ∗ .
Proof of Theorem 3
We know that for both searches, the initial state s0 is added to Q. Now
suppose n is added to Q when using f2 . Suppose, by way of induction, that the
parent m of n is added to Q in both searches. Then since f1 (m) ≤ f2 (m) < C ∗ ,
and since m is popped from Q using f2 , it will also be popped using f1 . Hence,
n will be added to Q using f1 .
Now, since best-first search is used, every node n for which f (n) < C ∗
will be popped from the queue before the goal node is popped. Moreover,
if f2 (n) = g(n) + h2 (n) < C ∗ , then also f1 (n) = g(n) + h1 (n) < C ∗ , since
h1 (n) ≤ h2 (n). Hence, if node n is popped when f2 is used, then it will also be
popped when f1 is used. Therefore, if n is popped using f2 , but not using f1 ,
then we must have f2 (n) = C ∗ .
Methods for Generating Admissible Heuristics
Problem Relaxations
• Problem P2 is a relaxed version of problem P1 , provided both problems
share the same initial state, state space and goal-testing function, but
the action set for problem P2 contains the action set for problem P1 .
Thus, if the minimum cost of reaching a goal in the relaxed problem can
be computed, then this cost serves as a lower bound for the cost in the
original problem (since the original problem perhaps cannot use all of the
actions that were used in the relaxed problem).
• Example of 8-Puzzle Relaxation. Any two adjacent tiles may be
swapped (even if neither is the empty space).
Methods for Generating Admissible Heuristics
Other Methods
• Maximizing over a set of heuristics. If h1 , . . . , hk are all admissible
heuristics, then so is max(h1 , . . . , hk )(n).
• Pattern Databases. These databases contain a set of state patterns,
so that each state of the state space will fit at least one of the patterns.
Then associated with each pattern is a value of the cost in moving from
this pattern to a pattern that corresponds with a goal state. Used by
Richard Korf of UCLA for finding optimal solutions to Rubik’s Cube.
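The maximization method is a one-liner; a sketch:

```python
def max_heuristic(*heuristics):
    """If h_1, ..., h_k are all admissible, so is their pointwise maximum,
    and it is at least as informed as each individual h_i."""
    return lambda n: max(h(n) for h in heuristics)

# Illustrative combination of two admissible estimates.
hm = max_heuristic(lambda n: n, lambda n: 2)
```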
Exercises
1. Give an example that shows that not every admissible heuristic is monotone.
2. A robot stands on square (1, 1) of an N × N grid, facing due north. The
goal of the robot is to reach square (N, N ). The robot has three possible
actions: turn left, turn right, and move forward (either north, south, east,
or west) to an adjacent square. However, throughout the grid there are
walls that prevent the robot from moving forward from one square to
an adjacent one (there are also walls around the entire grid boundary).
Specify the state space for this problem, the goal-testing function, and the
set of actions. What is the size of the state space? Provide a bound on
the branching factor b, and a bound on the minimum goal depth d.
3. For the 4-puzzle, show that the state space can be divided into two sets of
equal size, such that no state from one set is adjacent to a state from the
other set. Note: this property also holds for the 8-puzzle, and shows that
in general not all states from a space are reachable from an initial state.
Extra credit: prove it is also true for the 8-puzzle.
4. For the n-queens problem, prove that the state space has at least (n!)^(1/3)
states. Hint: derive a lower bound on the branching factor by considering
the maximum number of squares that a queen can attack in any column.
5. Given three jugs, one of 12 gallons, 8 gallons, and 3 gallons, and a water
faucet, you are allowed to fill the jugs and/or empty them into one another,
or onto the ground. The goal is to measure out exactly one gallon. Hint:
every action must have the effect of either entirely emptying or filling one
of the jugs. For example, if you pour from the 8-gallon jug, then it must
either empty, or it must fill at least one other jug.
6. Consider the state space consisting of the numbers 1 to 15. The two
possible actions on a state i are to multiply it by 2, or multiply it by 2
and add 1. Assuming 1 is the initial state and 11 is the goal state, show
the order of the search for breadth-first and for bounded depth-first (with
depth bound of 3). Assume the 2k action has precedence over the 2k + 1 action.
7. Provide a sequence of state spaces S1 , . . . , Sn , . . . for which iterative
deepening search finds the goal state in O(n^2) steps, while depth-first
search finds the goal state in O(n) steps.
8. n checkers occupy squares (1, 1) through (1, n) of an n × n grid (i.e. they
are at the bottom row of the grid). The checkers must be moved to the
top row, but in reverse order. In other words, checker i must be moved to
square (n − i + 1, n). On each step each checker can move one square up,
down, left, right, or stay put. If a checker does not move, then it may be
jumped (vertically or horizontally) by one other checker that is adjacent
to it. Two checkers cannot occupy the same square. Calculate the size of
the state space as a function of n and estimate the branching factor as a
function of n.
9. For the previous problem, provide a nontrivial admissible heuristic hi that
estimates the number of steps needed for the i-th checker to reach its
goal square (n − i + 1, n). Will h1 (n) + · · · + hn (n) be admissible? What
about mini (hi (n))? maxi (hi (n))?
10. In what way can breadth-first search be considered a special case of best-first search? Explain.
11. Prove that the set of nodes considered during an A∗ search is a subset of
nodes considered during a breadth-first search.