A∗ Example
Reach the Green Spot with the Straight-Line Distance as h
CS-E4800 Artificial Intelligence
Jussi Rintanen
Department of Computer Science
Aalto University
January 19, 2017
Heuristics by Relaxation
Pattern Databases (Culberson & Schaeffer 1998)
h(s) looks at a subset of all features (state variables) ⇒ abstracted states
8-puzzle and 15-puzzle: a subset of tiles
h(s) = actual goal distance for the abstraction of s
Heuristic 1: h(s) = number of tiles in a non-target location
Heuristic 2: h(s) = sum of the Manhattan distances of the tiles to their target locations
[Figure: an 8-puzzle state and its abstraction to tiles 1, 2, 4 and 5; the goal distance of the state is estimated by the goal distance in the abstraction.]
On the right, tiles 3, 6, 7 and 8 are considered absent.
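The two heuristics can be sketched in Python; the state representation (a dict mapping each tile to its (row, col) position, with tile t's target at divmod(t − 1, 3)) is my own choice, not from the slides.

```python
def misplaced_tiles(state):
    """Heuristic 1: number of tiles not in their target location."""
    return sum(1 for tile, pos in state.items() if pos != divmod(tile - 1, 3))

def manhattan(state):
    """Heuristic 2: sum of the Manhattan distances of tiles to their targets."""
    total = 0
    for tile, (r, c) in state.items():
        gr, gc = divmod(tile - 1, 3)
        total += abs(r - gr) + abs(c - gc)
    return total

# In the goal state both heuristics give 0.
goal = {t: divmod(t - 1, 3) for t in range(1, 9)}
```

Both are admissible: every misplaced tile needs at least one move, and every move changes one tile's Manhattan distance by at most one.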
Rubik’s Cube
43’252’003’274’489’856’000 ≈ 4.3 · 10^19 states
Proved to be always solvable in at most 20 moves
Optimal solution (Korf 1997) by
Iterative Deepening A∗
Pattern Databases: look at
corners only
hmax (Bonet and Geffner)
S0 represents the initial state.
Si, i ≥ 1, contains the literals not falsified by the actions possible in Si−1.

1: i := 0;
2: S0 := all literals true in the initial state;
3: repeat
4:   B := {a ∈ A | prec(a) ∪ Si is satisfiable};
5:   Si+1 := {l ∈ Si | ¬l ∉ eff(a) for all a ∈ B};
6:   i := i + 1;
7: until Si = Si−1;
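A minimal Python sketch of this construction, exercised on the tractor example presented below. The data format is my own choice: a literal is a string such as "T1" or "-T1", and an action is a (precondition, effects) pair of literal sets.

```python
def neg(lit):
    """Complement of a literal: "T1" <-> "-T1"."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def hmax_sets(s0, actions):
    """Yield S0, S1, ... until a fixpoint: Si+1 keeps the literals of Si
    not falsified by any action whose precondition is consistent with Si."""
    S = set(s0)
    while True:
        yield set(S)
        # B: actions possibly applicable (precondition does not contradict S)
        B = [(p, e) for (p, e) in actions if all(neg(l) not in S for l in p)]
        # keep l only if no action in B makes its complement true
        S_next = {l for l in S if all(neg(l) not in e for (_, e) in B)}
        if S_next == S:
            return
        S = S_next

def hmax(s0, actions, goal_literals):
    """Smallest j such that Sj does not contradict the goal literals."""
    for j, S in enumerate(hmax_sets(s0, actions)):
        if all(neg(l) not in S for l in goal_literals):
            return j
    return float("inf")

# Tractor example from these slides:
s0 = {"T1", "-T2", "-T3", "-A1", "-A2", "A3", "-B1", "-B2", "B3"}
tractor = [
    ({"T1"}, {"T2", "-T1"}),                      # T12
    ({"T2"}, {"T1", "-T2"}),                      # T21
    ({"T2"}, {"T3", "-T2"}),                      # T23
    ({"T3"}, {"T2", "-T3"}),                      # T32
    ({"T2", "A2"}, {"T1", "A1", "-T2", "-A2"}),   # A21
    ({"T3", "A3"}, {"T2", "A2", "-T3", "-A3"}),   # A32
    ({"T2", "B2"}, {"T1", "B1", "-T2", "-B2"}),   # B21
    ({"T3", "B3"}, {"T2", "B2", "-T3", "-B3"}),   # B32
]
```

For the goal A1 ∧ B1 this yields hmax = 4, while the actual distance is 8.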
A Simple Modeling Language
(Boolean) State variables X = {x1 , . . . , xn }
Actions (p, e) ∈ A consist of
precondition p = prec((p, e))
effects e = eff((p, e))
satisfying p ∪ e ⊆ {x1 , . . . , xn , ¬x1 , . . . , ¬xn }.
Initial state I : X → {0, 1}
Goal G is a formula over X
(Cost function C : A → R: We mostly use uniform costs:
C (a) = 1 for all a ∈ A)
hmax (Bonet and Geffner)
Definition
Construct sets S0 , S1 , S2 , . . . starting from s. A lower
bound on the distance from s to G is
hmax (s) = j
such that Sj ∪ {G } is satisfiable and either j = 0 or
Sj−1 ∪ {G } is unsatisfiable.
hmax (Bonet and Geffner)
Meaning of sets Si for hmax
Tractor example
Tractor moves:
from 1 to 2: T12 = ⟨{T1}, {T2, ¬T1}⟩
from 2 to 1: T21 = ⟨{T2}, {T1, ¬T2}⟩
from 2 to 3: T23 = ⟨{T2}, {T3, ¬T2}⟩
from 3 to 2: T32 = ⟨{T3}, {T2, ¬T3}⟩
Tractor pushes A:
from 2 to 1: A21 = ⟨{T2, A2}, {T1, A1, ¬T2, ¬A2}⟩
from 3 to 2: A32 = ⟨{T3, A3}, {T2, A2, ¬T3, ¬A3}⟩
Tractor pushes B:
from 2 to 1: B21 = ⟨{T2, B2}, {T1, B1, ¬T2, ¬B2}⟩
from 3 to 2: B32 = ⟨{T3, B3}, {T2, B2, ¬T3, ¬B3}⟩
Initial state: S0 = {T1, ¬T2, ¬T3, ¬A1, ¬A2, A3, ¬B1, ¬B2, B3}
hmax
Tractor example: value of each state variable according to St (T: only the positive literal remains, F: only the negative literal, TF: neither remains, so both values are possible):

      t=0  t=1  t=2  t=3  t=4
T1     T   TF   TF   TF   TF
T2     F   TF   TF   TF   TF
T3     F    F   TF   TF   TF
A1     F    F    F    F   TF
A2     F    F    F   TF   TF
A3     T    T    T   TF   TF
B1     F    F    F    F   TF
B2     F    F    F   TF   TF
B3     T    T    T   TF   TF
Actual distance of A1 ∧ B1 is 8; hmax gives 4.

IDA∗: Iterative Deepening A∗
IDA∗ needs less memory than A∗
Depth-first searches up to a specified bound
Visited states not remembered across searches
+ Little memory needed (linear in path length)
– May visit states multiple times
– When there are many different f-values, the depth bound increases slowly, and the number of visited states can be quadratic compared to A∗.
Iterative Deepening A∗
1. Start with bound h(A)
2. Visit states below the bound and their successors
3. Identify the lowest f-value above the previous bound
4. Make that f-value the new bound
5. The next search covers E, F, G, H, I
[Figure: a search tree rooted at A with successors B, C, D and descendants E–I; the levels h(A), f(B), f(C), f(D) mark successive bounds.]
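Transcribed into Python, IDA∗ and its bounded depth-first search look as follows; this is a sketch, and the successors/cost/is_goal interface is my own choice.

```python
import math

SOLVED = "SOLVED"

def ida_star(s0, h, successors, cost, is_goal):
    bound = h(s0)
    while True:                       # "until 1 = 0" in the pseudocode
        t = dfs(s0, 0, bound, h, successors, cost, is_goal)
        if t == SOLVED:
            return SOLVED
        if t == math.inf:
            return "NOTSOLVED"
        bound = t                     # lowest f-value above the old bound

def dfs(s, g, bound, h, successors, cost, is_goal):
    f = g + h(s)
    if f > bound:
        return f
    if is_goal(s):
        return SOLVED
    lowest = math.inf
    for s2 in successors(s):
        t = dfs(s2, g + cost(s, s2), bound, h, successors, cost, is_goal)
        if t == SOLVED:
            return SOLVED
        if t < lowest:                # track the lowest f above the bound
            lowest = t
    return lowest
```

Only the current path is kept in memory, which is what gives IDA∗ its linear space consumption.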
1: procedure IDA∗;
2:   bound := h(s0);
3:   repeat
4:     t := DFS(s0, 0, bound);
5:     if t = SOLVED then return SOLVED;
6:     if t = ∞ then return NOTSOLVED;
7:     bound := t;
8:   until 1 = 0;

1: procedure DFS(s : state, g : real, bound : real)
2:   f := g + h(s);
3:   if f > bound then return f;
4:   if s is a goal state then return SOLVED;
5:   min := ∞;
6:   for each successor s′ of s do
7:     f′ := DFS(s′, g + cost(s, s′), bound);
8:     if f′ = SOLVED then return SOLVED;
9:     if f′ < min then min := f′;
10:  end
11:  return min;

Suboptimal Search Algorithms
A∗ and IDA∗ guarantee optimality, but are very expensive and impractical with large state spaces.
Other algorithms represent trade-offs between runtime and solution quality:
greedy best-first search: no quality guarantees
WA∗ (Weighted A∗): solution at most w times optimal

Greedy Best-First Search
A∗ defines f (n) = g (n) + h(n)
Greedy Best-First Search defines f (n) = h(n), with the cost so far ignored
Always expands the path with the lowest estimated cost-to-go
Does not expand promising but under-developed paths
Goal states are usually reached with less search
Terminates when reaching a goal state
Greediness can lead to a (very) poor solution; the heuristic here is straight-line distance.
(Picture from Thayer & Ruml search tutorial)
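A∗, WA∗, and greedy best-first search differ only in the priority f they assign to a path, so one generic loop covers all three. This is a sketch; the problem interface (successors yielding (state, cost) pairs) is my own choice.

```python
import heapq

def best_first(start, h, successors, is_goal, priority):
    """Generic best-first search; priority(g, h_value) gives f.
    successors(s) yields (next_state, step_cost) pairs."""
    tie = 0                                   # tie-breaker for the heap
    frontier = [(priority(0.0, h(start)), tie, 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        _, _, g, s, path = heapq.heappop(frontier)
        if is_goal(s):
            return g, path
        for s2, c in successors(s):
            g2 = g + c
            if g2 < best_g.get(s2, float("inf")):
                best_g[s2] = g2
                tie += 1
                heapq.heappush(frontier,
                               (priority(g2, h(s2)), tie, g2, s2, path + [s2]))
    return None

# A*:     best_first(s0, h, succ, goal, lambda g, hv: g + hv)
# WA*:    best_first(s0, h, succ, goal, lambda g, hv: g + 2.0 * hv)
# Greedy: best_first(s0, h, succ, goal, lambda g, hv: hv)
```

On a toy graph with a cheap detour, the A∗ priority finds the optimal path while the greedy priority commits to the first promising successor and returns a worse one.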
WA∗: Weighted A∗
A∗ defines f (n) = g (n) + h(n)
WA∗ defines f (n) = g (n) + w · h(n), where w ≥ 1 (with A∗ as the special case w = 1)
More weight is put on the estimated cost-to-go than on the estimated cost of the whole path
The solution is guaranteed to be at most w times optimal; in practice it is often close to optimal
Often far less search than with A∗
Search effort does not always decrease as w increases (anomalies are sometimes observed in practice)

Dynamic programming (DP)
Example: Edit distance
Edit distance between two strings, e.g. between ”intention” and ”execution”
Exponentially many insert-delete sequences
Efficiently solvable with dynamic programming

The principle of avoiding solving the same subproblem several times is known as dynamic programming.
It can turn an exponential problem into a polynomial one.
It is crucial in many areas, including sequential decision-making (Markov decision processes).

Build an (n, m) array with the edit distances between the n first letters of ”intention” and the m first letters of ”execution”:
If the new letters are the same, copy the value from the top-left diagonal neighbor. Otherwise take the minimum of the top and left neighbors and add 1.
        e   x   e   c   u   t   i   o   n
    0   1   2   3   4   5   6   7   8   9
i   1   2   3   4   5   6   7   6   7   8
n   2   3   4   5   6   7   8   7   8   7
t   3   4   5   6   7   8   7   8   9   8
e   4   3   4   5   6   7   8   9  10   9
n   5   4   5   6   7   8   9  10  11  10
t   6   5   6   7   8   9   8   9  10  11
i   7   6   7   8   9  10   9   8   9  10
o   8   7   8   9  10  11  10   9   8   9
n   9   8   9  10  11  12  11  10   9   8

The edit distance between ”intention” and ”execution” is the bottom-right entry, 8.
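The table can be reproduced with a few lines of Python implementing the rule above (insertions and deletions only, no substitutions):

```python
def edit_distance(a, b):
    """Insert/delete edit distance between strings a and b by dynamic
    programming: equal letters copy the diagonal value, otherwise
    1 + min(top, left)."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i          # delete the i first letters of a
    for j in range(m + 1):
        D[0][j] = j          # insert the j first letters of b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                D[i][j] = D[i - 1][j - 1]
            else:
                D[i][j] = 1 + min(D[i - 1][j], D[i][j - 1])
    return D[n][m]

# edit_distance("intention", "execution") returns 8
```

Each of the n·m cells is filled once in constant time, so the runtime is O(nm) instead of the exponential number of insert-delete sequences.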
Use Search, or something else?
A priori, efficient algorithms may be possible even if the search space seems exponential
Search is justified by NP-hardness (Edit Distance is not NP-hard!)
Procedure:
1. Attempt to prove NP-hardness, and attempt to find a poly-time algorithm.
2. If neither works out, it is always OK to use search, if search-based tools are readily available and they turn out to be effective.
Problem-Solving and Planning
Which actions to take to satisfy a given objective?
Basic state-space search solves problems such
that...
There is exactly one initial state
All actions are deterministic
Can take only one action at a time
Environment is static: only actions change the state
Objective: reach one of a number of goal states
Cost measure: sum of action costs
Conclusions
Many basic AI problems boil down to search
A∗ is a standard optimal search method
A∗ is both complete and optimal
A∗ only expands nodes required to prove optimality
A∗ a basis of numerous other algorithms: ARA∗ ,
Block A∗ , D∗ , Field D∗ , FSA∗ , GAA∗ , IDA∗ , LPA∗ ,
SMA∗ , Theta∗ , Anytime A∗ , Realtime A∗ , Anytime
Dynamic A∗ , ...
Reusing solutions of sub-problems can reduce time
complexity crucially
General Modeling Languages
Abstract model and specification language
Algorithm(s) for solving any model expressed in
the specification language
Schematic Actions and Grounding
Number of state variables can be high (1000s)
Often lots of structurally similar actions
In most applications one can identify objects
Instead of enumerating all state variables and
actions, they can be represented schematically.
The process of expanding the schematic representation into all its concrete instances is called grounding
Widely used in modeling languages in AI,
computer-aided verification, CS in general
Schematic Actions and Grounding
Example
All actions for moving a car from one location to
another can be represented schematically as
PARAMETERS: r : R, l1 : L, l2 : L
PRECONDITION: at(r , l1 )
EFFECTS: at(r , l2 ), ¬at(r , l1 )
Schematic Actions and Grounding
Example
Consider cars R = {A, B, C , D} in locations L = {1, 2, 3}.
We want to use state variables
at(A,1)  at(A,2)  at(A,3)
at(B,1)  at(B,2)  at(B,3)
at(C,1)  at(C,2)  at(C,3)
at(D,1)  at(D,2)  at(D,3)
These can be obtained schematically from at(r,l) with
r ∈ R and l ∈ L
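Grounding can be sketched in Python as a Cartesian product over the parameter types. The dictionary action format and the action name move are my own illustrative choices.

```python
from itertools import product

R = ["A", "B", "C", "D"]     # cars, as listed in the variable grid above
L = [1, 2, 3]                # locations

# Ground state variables at(r, l)
variables = [f"at({r},{l})" for r, l in product(R, L)]

# Ground actions from the schema with parameters r : R, l1 : L, l2 : L,
# precondition at(r, l1) and effects at(r, l2), not at(r, l1)
actions = [
    {"name": f"move({r},{l1},{l2})",
     "prec": [f"at({r},{l1})"],
     "eff": [f"at({r},{l2})", f"not at({r},{l1})"]}
    for r, l1, l2 in product(R, L, L) if l1 != l2
]
```

With 4 cars and 3 locations this yields 12 ground variables and 24 ground actions, which is exactly the blow-up that grounding trades for the compactness of the schema.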
General-Purpose Modeling Languages
Languages for modeling and problem solving
without programming
The course uses NDL. (Another language, PDDL, is also widely used.)
Problem class: deterministic, single initial state,
full information, single agent
State variables: bool, enums, numbers, relations
Similar but more extended languages include
non-deterministic and stochastic environments
metric time; actions and events have durations
concurrency (actions may temporally overlap)
...
NDL: Types
NDL includes different types:
enums: named objects
integers, reals
bounded integers in a finite interval [lb,ub]
Types are given names with type, and can involve set operations:

type truck = { truck1, truck2 };
type package = { obj1, obj2, obj3 };
type city = { city1, city2 };
type airport = { ap1, ap2 };
type airplane = { apn1 };
type allobjects = truck U package U airplane;
NDL: Actions
Actions have three parts:
1. parameters: how the action can be grounded
2. precondition
3. effect
action loadTruck(pkg : package, trk : truck, loc : location)
at(trk) = loc & at(pkg) = loc
=>
at(pkg) := trk;
action unloadTruck(pkg : package, trk : truck, loc : location)
at(trk) = loc & at(pkg) = trk
=>
at(pkg) := loc;
NDL: State variables
State variables are defined by decl, with values of any type (bool, int, real, enums):

type package = { p1, p2 };
type city = { city1, city2 };
type vehicle = { trk1, trk2 };
type objects = package U vehicle;
type day = { mon, tue, wed, thu, fri, sat, sun };
type location = city U vehicle;

decl in[objects] : location;
decl today : day;
NDL: numeric expressions
Arithmetic operations
+ - *
Numeric expressions
numeric literals (e.g. -1 0 1 123123123)
statevar
numexpr arithop numexpr
( numexpr )
NDL: Boolean expressions (formulas)
Relation operations: = != < > <= >=
Boolean formulas:
0, 1
statevar
numexpr relop numexpr
not fma
fma1 & fma2
fma1 | fma2
forall v : type fma
forsome v : type fma
( fma )

NDL actions: Effects
We will use only simple effects: assignments as in programming languages.
Effects:
x := expr;
forall v : type eff;

NDL: Initial state
By default, Boolean state variables have the value 0 (false) and numeric variables have the value 0.
Defaults are overridden by a sequence of assignments preceded by initial.
The RHS of the assignments must be numeric constants or enums.

initial
distance(n6,n8) := 11;
distance(n4,n8) := 13;
distance(n4,n6) := 11;
today := monday;

NDL: Goal
The goal is specified by a Boolean formula prefixed with goal.

goal at(p0,n4) & at(p1,n5) & at(p2,n6) & at(p3,n2);
NDL Modeling: Static Information
Many problems require representing static, non-changing information.
This may be hard-coded in the actions, or represented as part of the initial state description (and through grounding it will be hard-coded in the actions).

Representing a graph:

action goFromTo(node1 : node, node2 : node)
nowAt = node1
& (edge(node1,node2) | edge(node2,node1))
=>
nowAt := node2;

initial
edge(a,b) := 1;
edge(b,c) := 1;
edge(b,d) := 1;
edge(d,e) := 1;

Rush Hour in NDL
[Figure: a 6×6 Rush Hour board with the red car R, to be driven out through the exit on the right.]
Rush Hour: Types, state variables
Goal: get the red car out through the exit at the right
Cars only move forward and backward
PSPACE-complete for arbitrarily large grids
Objects: X,Y coordinates in the range 0 to 5

type coord = [0,5];

Indicate for each cell whether it contains a car; the location of a car is
the leftmost cell of horizontal cars
the bottommost cell of vertical cars
Indicate for each cell whether it is empty:

decl carV2[coord,coord] : bool;
decl carV3[coord,coord] : bool;
decl carH2[coord,coord] : bool;
decl carH3[coord,coord] : bool;
decl empty[coord,coord] : bool;
Rush Hour: Actions
Move a horizontal car of size 2 from (x,y) to (x+1,y):
action moveH2right(x : [0,3], y : coord)
carH2(x,y) & empty(x+2,y)
=>
carH2(x,y) := 0;
carH2(x+1,y) := 1;
empty(x,y) := 1;
empty(x+2,y) := 0;
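The semantics of such a ground action can be sketched in Python; the state representation (a dict of Boolean ground variables) is my own choice.

```python
def move_h2_right(state, x, y):
    """Apply moveH2right(x, y): check the precondition, then perform
    the four assignments; return the new state, or None if inapplicable."""
    if not (state.get(("carH2", x, y)) and state.get(("empty", x + 2, y))):
        return None                       # precondition violated
    s2 = dict(state)
    s2[("carH2", x, y)] = False           # carH2(x,y) := 0
    s2[("carH2", x + 1, y)] = True        # carH2(x+1,y) := 1
    s2[("empty", x, y)] = True            # empty(x,y) := 1
    s2[("empty", x + 2, y)] = False       # empty(x+2,y) := 0
    return s2
```

Note that the car occupies cells (x,y) and (x+1,y), so only the vacated left cell and the newly covered cell two to the right change.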
Rush Hour: Initial state
The “hardest” Rush Hour instance:

initial
empty(0,0) := 1;
empty(0,1) := 1;
carH2(0,2) := 1;
carV2(0,4) := 1;
carH3(0,5) := 1;
carV2(1,1) := 1;
empty(1,3) := 1;
carH2(1,4) := 1;
carH2(2,0) := 1;
carV2(2,2) := 1;
carH2(2,3) := 1;
empty(3,1) := 1;
empty(3,2) := 1;
carV2(3,5) := 1;
carH2(4,0) := 1;
carH2(4,1) := 1;
empty(4,2) := 1;
carV3(4,5) := 1;
empty(5,2) := 1;
carV3(5,5) := 1;

Rush Hour: Goal
The puzzle is solved when the target car is at (4,3):

goal carH2(4,3);

Here we assume there is no size-2 horizontal car to the right of the target car; such a car would need to be driven through the exit as well.
To distinguish the target car from the other cars we would need variables carH2 and targetcarH2, as well as actions for exiting cars from the grid.

Solving Rush Hour in practice
The instance with the longest shortest solution needs 93 moves (Collette, Raskin, Servais, 2007)
Small state space: this “hardest” instance has 24132 reachable states, and is solved very quickly, e.g. by breadth-first search (< 1 ms)
The hardest instances are very difficult for humans
Sample solution
Madagascar 0.99999 25/02/2015 09:46:27 amd64 1-core
Parser: 168 ground actions and 149 state variables
Simplified: 90 ground actions and 93 state variables
PLAN FOUND: 120 steps
STEP 0: moveh2left(2,3,1,3) moveh2left(4,1,3,5) movev3down(4,5,4,3,2) movev3down(5,5,4,3,2)
STEP 1: movev2down(3,5,4,3) movev3down(5,4,3,2,1)
STEP 2: moveh3right(0,5,1,2,3) movev2down(3,4,3,2)
STEP 3: moveh2right(1,4,2,3) moveh3right(1,5,2,3,4) movev2up(0,4,3,5)
STEP 4: moveh2left(1,3,0,2) moveh3right(2,5,3,4,5)
STEP 5: movev2up(2,2,1,3)
STEP 6: moveh2left(3,1,2,4)
STEP 7: movev3down(4,4,3,2,1)
STEP 8: moveh2right(2,4,3,4)
...
STEP 69: moveh2left(1,1,0,2) movev2down(3,3,2,1) movev3down(4,3,2,1,0)
STEP 70: movev2down(2,3,2,1)
STEP 71: moveh2right(0,3,1,2)
STEP 72: moveh2right(1,3,2,3)
STEP 73: moveh2right(2,3,3,4)
STEP 74: moveh2right(3,3,4,5)
189 actions in the plan.
total time 1.76 preprocess 0.00
total size 2.102 GB