AI Planning
The planning problem
 Inputs:
1. A description of the world state
2. The goal state description
3. A set of actions
 Output:
A sequence of actions that, when applied to the initial state, transforms the world into the goal state
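As a concrete reading of this input/output contract, here is a minimal type sketch in Python (the names State, Goal, Action, Planner are illustrative, not from the slides):

```python
from typing import Callable, FrozenSet, List, Optional, Tuple

# A world-state description and a goal description: sets of ground literals,
# written here as plain strings such as "On(B,Table)".
State = FrozenSet[str]
Goal = FrozenSet[str]

# An action: a name plus precondition, add, and delete sets (STRIPS-style,
# anticipating the representation introduced on the next slides).
Action = Tuple[str, FrozenSet[str], FrozenSet[str], FrozenSet[str]]

# A planner maps (initial state, goal, available actions) to a sequence of
# action names that transforms the initial state into a goal state, or None.
Planner = Callable[[State, Goal, List[Action]], Optional[List[str]]]
```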
An example – Blocks world
 Blocks on a table
 Can be stacked, but only one block on top of
another
 A robot arm can pick up a block and move it to another position
– On the table
– On another block
 Arm can pick up only one block at a time
– Cannot pick up a block that has another one on it
STRIPS Representation
 State is a conjunction of positive ground
literals
On(B, Table) Λ Clear(A)
 Goal is a conjunction of positive ground
literals
Clear(A) Λ On(A,B) Λ On(B, Table)
 STRIPS Operators
– Conjunction of positive literals as preconditions
– Conjunction of positive and negative literals as
effects
More on action schema
 Example: Move (b, x, y)
– Precondition:
Block(b) Λ Clear(b) Λ Clear(y) Λ On(b,x) Λ
(b ≠ x) Λ (b ≠ y) Λ (y ≠ x)
– Effect:
¬Clear(y) Λ ¬On(b,x) Λ Clear(x) Λ On(b,y)
(the negative literals form the delete list; the positive literals form the add list)
 An action is applicable in any state that
satisfies its precondition
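A minimal sketch of this representation in Python (illustrative names, not from the slides): a state is a set of ground literals, an operator carries a precondition, an add list, and a delete list, and it is applicable exactly when its precondition literals are all contained in the state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operator:
    name: str
    precond: frozenset   # positive literals that must hold
    add: frozenset       # literals made true (add list)
    delete: frozenset    # literals made false (delete list)

    def applicable(self, state: frozenset) -> bool:
        # Applicable in any state that satisfies (contains) the precondition.
        return self.precond <= state

def move(b, x, y):
    """A ground instance of Move(b, x, y); assumes b, x, y are already distinct."""
    return Operator(
        name=f"Move({b},{x},{y})",
        precond=frozenset({f"Block({b})", f"Clear({b})", f"Clear({y})", f"On({b},{x})"}),
        add=frozenset({f"Clear({x})", f"On({b},{y})"}),
        delete=frozenset({f"Clear({y})", f"On({b},{x})"}),
    )

state = frozenset({"Block(A)", "Block(B)", "On(A,Table)", "On(B,Table)",
                   "Clear(A)", "Clear(B)"})
print(move("A", "Table", "B").applicable(state))   # True
```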
STRIPS assumptions
 Closed World assumption
– Unmentioned literals are false (no need to list them explicitly)
 STRIPS assumption
– Every literal not mentioned in the “effect” of an
action remains unchanged
 Atomic Time (actions are instantaneous)
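Both assumptions are visible in how a successor state is computed: the state lists only the literals that are true (closed world), and only the literals mentioned in the effect change (STRIPS assumption). A small sketch:

```python
def result(state: frozenset, precond: frozenset,
           add: frozenset, delete: frozenset) -> frozenset:
    """Successor state: everything not in the add/delete lists stays unchanged."""
    assert precond <= state, "action is not applicable in this state"
    return (state - delete) | add

# Example: Move(A, Table, B) in a tiny two-block world.
s0 = frozenset({"On(A,Table)", "On(B,Table)", "Clear(A)", "Clear(B)"})
s1 = result(s0,
            precond=frozenset({"Clear(A)", "Clear(B)", "On(A,Table)"}),
            add=frozenset({"On(A,B)", "Clear(Table)"}),
            delete=frozenset({"Clear(B)", "On(A,Table)"}))
print(sorted(s1))  # ['Clear(A)', 'Clear(Table)', 'On(A,B)', 'On(B,Table)']
```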
STRIPS expressiveness
 Literals are function-free: e.g., Move(Block(x), y, z) is not allowed
 Operators can be propositionalized (turned into ground actions):
Move(b,x,y) with 3 blocks and the table can be expressed as
48 purely propositional actions (see the quick count below)
 No disjunctive goals: On(B, Table) V On(B, C)
 No conditional effects: On(B, Table) if ¬On(A, Table)
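The 48 in the propositionalization example comes from filling in the schema's variables before any of the inequality preconditions prune anything: 3 choices of block b times 4 positions for x times 4 positions for y. A quick, purely illustrative enumeration:

```python
from itertools import product

blocks = ["A", "B", "C"]
positions = blocks + ["Table"]

# Every ground instantiation of Move(b, x, y): b over the blocks,
# x and y over blocks or the table (many are ruled out later by b != x, etc.).
ground_actions = [f"Move({b},{x},{y})"
                  for b, x, y in product(blocks, positions, positions)]
print(len(ground_actions))  # 48
```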
Planning algorithms
 Planning algorithms are search procedures
 Which state to search?
– State-space search
 Each node is a state of the world
 Plan = path through the states
– Plan-space search
 Each node is a set of partially-instantiated operators
and a set of constraints
 Plan = node
State search
 Search the space of situations (world states), which are
connected by operator instances (= actions)
 The sequence of actions = plan
 We have both preconditions and effects
available for each operator, so we can try
different searches: Forward vs. Backward
Planning: Search Space
[Figure: part of the blocks-world search space, with block configurations as states connected by move actions]
Forward state-space search (1)
 Progression
 Initial state: initial state of the problem
 Actions:
– Applied to a state if all the preconditions are
satisfied
– Successor state is built by updating the current state
with the add and delete lists
 Goal test: state satisfies the goal of the
problem
Progression (forward search)
ProgWS(world-state, goal-list, PossibleActions, path)
1. If world-state satisfies all goals in goal-list, then return path.
2. Else Act = choose an action whose precondition is true in world-state
   a) If no such action exists, then fail
   b) Else return ProgWS( result(Act, world-state), goal-list,
      PossibleActions, concatenate(path, Act) )
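A runnable Python sketch of this procedure, using the set-of-literals state representation from the earlier sketches; the nondeterministic "choose" becomes iteration with backtracking, and a visited set is added so the sketch terminates on finite state spaces:

```python
def prog_ws(state, goals, actions, path, visited=None):
    """Forward (progression) search.  state/goals are frozensets of literals;
    actions is a list of (name, precond, add, delete) tuples; path is a list
    of action names.  Returns a plan or None."""
    if visited is None:
        visited = set()
    if goals <= state:                        # state satisfies all goals
        return path
    if state in visited:                      # already expanded: prune
        return None
    visited.add(state)
    for name, precond, add, delete in actions:
        if precond <= state:                  # action is applicable here
            nxt = (state - delete) | add      # apply delete and add lists
            plan = prog_ws(nxt, goals, actions, path + [name], visited)
            if plan is not None:
                return plan
    return None                               # dead end: backtrack
```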
Forward search in the Blocks world
[Figure: part of the forward search tree for the blocks world]
Forward state-space search (2)
 Advantages
– No functions in the declarations of goals ⇒ the
search space is finite
– Sound
– Complete (if the algorithm used to do the search is
complete)
 Limitations
– Irrelevant actions are explored ⇒ not efficient
– Need a heuristic or pruning procedure
Backward state-space search (1)
 Regression
 Initial state: goal state of the problem
 Actions:
– Choose an action A that
 Is relevant: it has one of the goal literals in its effect set
 Is consistent: it does not negate another goal literal
– Construct new search state
 Remove all positive effects of A that appear in goal
 Add all preconditions of A, unless they already appear
 Goal test: initial world state contains
remaining goals
Regression (backward search)
RegWS(initial-state, current-goals, PossibleActions, path)
1. If initial-state satisfies all of current-goals, then return path.
2. Else Act = choose an action whose effect matches one of current-goals
   a. If no such action exists, or the effects of Act contradict some of current-goals, then fail
   b. G = (current-goals – goals-added-by(Act)) + preconds(Act)
   c. If G contains all of current-goals, then fail
   d. Return RegWS(initial-state, G, PossibleActions, concatenate(Act, path))
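A matching Python sketch of regression, with the relevance, consistency, and no-progress checks from steps (a) and (c); as with the forward sketch, "choose" becomes iteration with backtracking (a visited set over goal sets would be needed to guarantee termination in general):

```python
def reg_ws(initial_state, current_goals, actions, path):
    """Backward (regression) search over (name, precond, add, delete) actions.
    Returns a list of action names or None."""
    if current_goals <= initial_state:        # initial state satisfies the goals
        return path
    for name, precond, add, delete in actions:
        if not (add & current_goals):         # not relevant: adds no current goal
            continue
        if delete & current_goals:            # inconsistent: would negate a goal
            continue
        new_goals = (current_goals - add) | precond   # regress the goal set
        if current_goals <= new_goals:        # no progress: skip this choice
            continue
        plan = reg_ws(initial_state, new_goals, actions, [name] + path)
        if plan is not None:
            return plan
    return None
```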
Backward state-space search (2)
 Advantages
– Considers only relevant actions ⇒ much smaller
branching factor
 Limitations
– Still needs a heuristic to be efficient
Comparing ProgWS and RegWS
 Both algorithms are
– sound (they always return a valid plan)
– complete (if a valid plan exists they will find one)
 Running time is O(b^n),
where b = branching factor,
n = number of “choose” operators
Efficiency of Backward Search
[Figure: backward search from the goal branching over many relevant actions a1, a2, a3, …, a50 that do not regress toward the initial state]
 Backward search can also have a very large
branching factor
– E.g., many relevant actions that don’t regress
toward the initial state
 As before, deterministic implementations can
waste lots of time trying all of them
Lifting
 Can reduce the branching factor of backward
search if we partially instantiate the operators
– this is called lifting
[Figure: operator foo(x,y) with precondition p(x,y) and effect q(x). Regressing the goal q(a) through the ground actions foo(a,a), foo(a,b), foo(a,c), … gives one branch per instance; with lifting, the single partially instantiated operator foo(a,y), with precondition p(a,y), covers all of them.]
The Search Space is Still Too Large
 Backward-search generates a smaller search space than
Forward-search, but it still can be quite large
 Suppose a, b, and c are independent, d must precede
all of them, and d cannot be executed
– We’ll try all possible orderings of a, b, and c before
realizing there is no solution
[Figure: the search tree trying every ordering of a, b, and c before d and the goal]
A ground version of the STRIPS algorithm
Blocks world: STRIPS operators
 Pickup(x)
– Pre: on(x, Table), clear(x), ae
– Del: on(x, Table), ae
– Add: holding(x)
 Putdown(x)
– Pre: holding(x)
– Del: holding(x)
– Add: on(x, Table), ae
 UnStack(x,y)
– Pre: on(x, y), ae
– Del: on(x, y), ae
– Add: holding(x), clear(y)
 Stack(x, y)
– Pre: holding(x), clear(y)
– Del: holding(x), clear(y)
– Add: on(x, y), ae
(ae = “arm empty”)
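The same four operators written as ground-action builders in Python, in the (name, precond, add, delete) tuple format used by the earlier search sketches (ae is the arm-empty literal):

```python
def pickup(x):
    return (f"Pickup({x})",
            frozenset({f"on({x},Table)", f"clear({x})", "ae"}),    # precond
            frozenset({f"holding({x})"}),                          # add
            frozenset({f"on({x},Table)", "ae"}))                   # delete

def putdown(x):
    return (f"Putdown({x})",
            frozenset({f"holding({x})"}),
            frozenset({f"on({x},Table)", "ae"}),
            frozenset({f"holding({x})"}))

def unstack(x, y):
    return (f"UnStack({x},{y})",
            frozenset({f"on({x},{y})", "ae"}),
            frozenset({f"holding({x})", f"clear({y})"}),
            frozenset({f"on({x},{y})", "ae"}))

def stack(x, y):
    return (f"Stack({x},{y})",
            frozenset({f"holding({x})", f"clear({y})"}),
            frozenset({f"on({x},{y})", "ae"}),
            frozenset({f"holding({x})", f"clear({y})"}))

# Ground actions for blocks A-D, plus the example problem from the next slide.
blocks = "ABCD"
actions = ([pickup(b) for b in blocks] + [putdown(b) for b in blocks] +
           [stack(x, y) for x in blocks for y in blocks if x != y] +
           [unstack(x, y) for x in blocks for y in blocks if x != y])
initial = frozenset({"on(A,Table)", "on(C,B)", "on(B,Table)", "on(D,Table)",
                     "clear(A)", "clear(C)", "clear(D)", "ae"})
goal = frozenset({"on(A,C)", "on(D,A)"})
# e.g. prog_ws(initial, goal, actions, []) with the forward-search sketch above
```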
STRIPS Planning
[Figure: initial configuration (C on B; A, B, D on the table) and goal configuration (D on A on C)]
 Current state:
– on(A,table), on(C, B), on(B,table), on(D,table), clear(A),
clear(C), clear(D), ae.
 Goal
– on(A,C), on(D, A)
STRIPS Planning
Plan: (empty)
Goal stack: on(A,C), on(D,A)
– on(A,C): push Stack(A,C)
– Stack(A,C) needs holding(A), clear(C)
– holding(A): push Pickup(A)
– Pickup(A) needs on(A,Table), clear(A), ae
Current state: on(A,table), on(C,B), on(B,table), on(D,table), clear(A), clear(C), clear(D), ae.
[Figure: goal D on A on C; current configuration C on B, with A, B, D on the table]
STRIPS Planning
Plan: Pickup(A)
Goal stack: on(A,C), on(D,A)
– on(A,C): Stack(A,C) still pending
– holding(A): apply Pickup(A)
   Pre: on(A,Table), clear(A), ae
   Del: on(A,Table), ae
   Add: holding(A)
Current state (after Pickup(A)): holding(A), on(C,B), on(B,table), on(D,table), clear(A), clear(C), clear(D).
[Figure: goal D on A on C; C on B, with B and D on the table and A held by the arm]
STRIPS Planning
Plan: Pickup(A)
Goal stack: on(A,C), on(D,A)
– on(A,C): apply Stack(A,C)
   Pre: holding(A), clear(C)
   Del: holding(A), clear(C)
   Add: on(A,C), ae
Current state (after Stack(A,C)): on(A,C), on(C,B), on(B,table), on(D,table), clear(A), clear(D), ae.
[Figure: goal D on A on C; A on C on B, with B and D on the table]
STRIPS Planning
Plan: Pickup(A), Stack(A,C)
Goal stack: on(A,C), on(D,A)
– on(D,A): push Stack(D,A)
– Stack(D,A) needs holding(D), clear(A)
– holding(D): push Pickup(D)
– Pickup(D) needs on(D,Table), clear(D), ae
Current state: on(A,C), on(C,B), on(B,table), on(D,table), clear(A), clear(D), ae.
[Figure: goal D on A on C; A on C on B, with B and D on the table]
STRIPS Planning
Plan: Pickup(A), Stack(A,C), Pickup(D)
Goal stack: on(A,C), on(D,A)
– on(D,A): Stack(D,A) still pending
– Stack(D,A) needs holding(D), clear(A), which both now hold
Current state (after Pickup(D)): on(A,C), on(C,B), on(B,table), holding(D), clear(A), clear(D).
[Figure: goal D on A on C; A on C on B, with B on the table and D held by the arm]
STRIPS Planning: Getting it Wrong!
Plan: (empty)
Goal stack: on(A,C), on(D,A)
– on(D,A) chosen first: push Stack(D,A)
– Stack(D,A) needs holding(D), clear(A)
– holding(D): push Pickup(D)
– Pickup(D) needs on(D,Table), clear(D), ae
Current state: on(A,table), on(C,B), on(B,table), on(D,table), clear(A), clear(C), clear(D), ae.
[Figure: goal D on A on C; current configuration C on B, with A, B, D on the table]
STRIPS Planning: Getting it Wrong!
Plan: Pickup(D)
Goal stack: on(A,C), on(D,A)
– on(D,A): apply Stack(D,A)
Current state (after Stack(D,A)): on(A,table), on(C,B), on(B,table), on(D,A), clear(C), clear(D), ae.
[Figure: goal D on A on C; D on A and C on B, with A and B on the table]
STRIPS Planning: Getting it Wrong!
Plan: Pickup(D), Stack(D,A)
Goal stack: on(A,C), on(D,A)
Now what?
– We chose the wrong goal first
– A is no longer clear: stacking D on A destroys the preconditions of the actions needed to achieve on(A,C)
– We either have to backtrack, or we must undo the previous actions
Current state: on(A,table), on(C,B), on(B,table), on(D,A), clear(C), clear(D), ae.
[Figure: goal D on A on C; D on A and C on B]
Limitation of state-space search
 Linear planning or Total order planning
 Example
– Initial state: all the blocks are clear and on the
table
– Goal: On(A,B) Λ On(B,C)
– If the search achieves On(A,B) first, it then needs to
undo it in order to achieve On(B,C)
 Have to go through all the possible
permutations of the subgoals
Search through the space of plans
 Nodes are partial plans, links are plan refinement
operations and a solution is a node (not a path).
 POP creates partial-order plans following a “least
commitment” principle.
Total Order Plans vs. Partial Order Plans
[Figure: the six total-order plans for putting on two socks and two shoes (every ordering from Start to Finish that keeps Left Sock before Left Shoe and Right Sock before Right Shoe), and the single partial-order plan that represents all of them, with causal links labelled “Left Sock on”, “Right Sock on”, “Left Shoe on”, “Right Shoe on”.]
P.O. plans in POP
 Plan = (A, O, L), where
– A is the set of actions in the plan
– O is a set of temporal orderings between actions
– L is a set of causal links linking actions via a literal
 A causal link Ap --Q--> Ac means that Ac has a precondition Q
that is established in the plan by Ap.
 Example: move-a-from-b-to-table --(clear b)--> move-c-from-d-to-b
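A minimal sketch of these three components as Python data structures (names are illustrative; steps are referred to by identifiers so that orderings and links are just sets of tuples):

```python
from dataclasses import dataclass
from typing import NamedTuple

class CausalLink(NamedTuple):
    producer: str      # Ap: the step that establishes the condition
    condition: str     # Q:  the protected literal
    consumer: str      # Ac: the step that has Q as a precondition

@dataclass
class PartialPlan:
    steps: set         # A: step identifiers in the plan
    orderings: set     # O: pairs (before, after) of step identifiers
    links: set         # L: CausalLink triples

# The causal link from the slide:
link = CausalLink(producer="move-a-from-b-to-table",
                  condition="clear(b)",
                  consumer="move-c-from-d-to-b")
```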
Threats to causal links
 A step At threatens the causal link Ap --Q--> Ac if:
1. At has ¬Q as an effect, and
2. At could come between Ap and Ac, i.e., O is
consistent with Ap < At < Ac
Threat Removal
 Threats must be removed to prevent a plan
from failing
 Demotion adds the constraint At < Ap to
prevent clobbering, i.e. push the clobberer
before the producer
 Promotion adds the constraint Ac < At to
prevent clobbering, i.e. push the clobberer
after the consumer
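A sketch of the threat test and the two repairs, treating a causal link as a (producer, condition, consumer) triple, writing ¬Q as the string "~" + Q, and assuming an `effects` map from step identifier to its effect literals (all of these conventions are mine, not the slides'):

```python
from itertools import permutations

def consistent(orderings, steps):
    """True if the ordering constraints admit at least one total order of steps.
    Brute force over permutations: fine for tiny plans; a real planner would
    use a topological sort / cycle check instead."""
    return any(all(order.index(a) < order.index(b) for a, b in orderings)
               for order in permutations(steps))

def threatens(at, link, orderings, steps, effects):
    """At threatens Ap --Q--> Ac if it can delete Q and can fall between them."""
    ap, q, ac = link
    if at in (ap, ac) or ("~" + q) not in effects[at]:
        return False
    # At could come between Ap and Ac iff adding Ap < At < Ac stays consistent.
    return consistent(orderings | {(ap, at), (at, ac)}, steps)

def resolve(at, link, orderings, steps):
    """Return the consistent repairs: demotion (At < Ap) and/or promotion (Ac < At)."""
    ap, _, ac = link
    repairs = []
    for extra in ((at, ap), (ac, at)):          # demotion, then promotion
        new_orderings = orderings | {extra}
        if consistent(new_orderings, steps):
            repairs.append(new_orderings)
    return repairs                               # empty list: the threat cannot be removed
```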
Initial (Null) Plan
 Initial plan has
– A = {A0, A∞}
– O = {A0 < A∞}
– L = {}
 A0 (Start) has no preconditions and has all
facts in the initial state as effects.
 A∞ (Finish) has the goal conditions as
preconditions and no effects.
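Building this null plan concretely, in a dict-based form that also records each step's preconditions and effects and the agenda of open preconditions (a representation chosen for the POP sketch that follows, not prescribed by the slides):

```python
def null_plan(initial_state, goal):
    """The initial plan: A = {Start, Finish}, O = {Start < Finish}, L = {}.
    Start has the initial-state literals as effects and no preconditions;
    Finish has the goal literals as preconditions and no effects."""
    return {
        "steps": {"Start", "Finish"},
        "orderings": {("Start", "Finish")},
        "links": set(),
        "effects": {"Start": set(initial_state), "Finish": set()},
        "preconds": {"Start": set(), "Finish": set(goal)},
        "agenda": {(q, "Finish") for q in goal},   # open preconditions to achieve
    }
```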
POP algorithm
POP((A, O, L), agenda, PossibleActions):
1. If agenda is empty, return (A, O, L).
2. Pick (Q, An) from agenda.
3. Ad = choose an action that adds Q.
   a. If no such action exists, fail.
   b. Add the causal link Ad --Q--> An to L and the ordering Ad < An to O.
   c. If Ad is new, add it to A.
4. Remove (Q, An) from agenda. If Ad is new, for each
   of its preconditions P add (P, Ad) to agenda.
5. For every action At that threatens any causal link Ap --Q--> Ac in L:
   a. Choose to add At < Ap or Ac < At to O.
   b. If neither choice is consistent, fail.
6. POP((A, O, L), agenda, PossibleActions)
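For concreteness, a runnable sketch of this loop over ground actions, using the dict plan built by `null_plan` above. Two simplifications to flag: step identifiers are action names (so a ground action can appear at most once in a plan), and every nondeterministic "choose" becomes iteration with backtracking. Here `actions` maps a name to a (preconds, effects) pair, with a negative effect written "~p" (a single effects set instead of the separate add/delete lists used in the state-space sketches).

```python
import copy
from itertools import product
from collections import defaultdict

def acyclic(orderings):
    """True if the (before, after) pairs contain no cycle."""
    succ = defaultdict(set)
    for a, b in orderings:
        succ[a].add(b)
    def reaches(a, goal, seen):
        return any(c == goal or (c not in seen and reaches(c, goal, seen | {c}))
                   for c in succ[a])
    return not any(reaches(x, x, set()) for x in list(succ))

def threats(plan):
    """All (At, link) pairs where At could delete the link's condition between its ends."""
    return [(at, (ap, q, ac))
            for at in plan["steps"] for (ap, q, ac) in plan["links"]
            if at not in (ap, ac) and "~" + q in plan["effects"][at]
            and acyclic(plan["orderings"] | {(ap, at), (at, ac)})]

def pop(plan, actions):
    if not plan["agenda"]:
        return plan                                          # 1. nothing left to achieve
    q, consumer = sorted(plan["agenda"])[0]                  # 2. pick an open precondition
    producers = ([s for s in plan["steps"] if q in plan["effects"][s]] +
                 [a for a in actions
                  if a not in plan["steps"] and q in actions[a][1]])   # 3. choose Ad
    for prod in producers:
        p = copy.deepcopy(plan)
        if prod not in p["steps"]:                           # 3c/4: Ad is new
            pre, eff = actions[prod]
            p["steps"].add(prod)
            p["preconds"][prod], p["effects"][prod] = set(pre), set(eff)
            p["orderings"] |= {("Start", prod), (prod, "Finish")}
            p["agenda"] |= {(x, prod) for x in pre}
        p["links"].add((prod, q, consumer))                  # 3b: causal link + ordering
        p["orderings"].add((prod, consumer))
        p["agenda"].discard((q, consumer))                   # 4. precondition handled
        # 5. resolve every threat by demotion (At < Ap) or promotion (Ac < At)
        repairs_per_threat = [[(at, ap), (ac, at)] for at, (ap, _, ac) in threats(p)]
        for repair in product(*repairs_per_threat):
            orderings = p["orderings"] | set(repair)
            if acyclic(orderings):
                solution = pop(dict(p, orderings=orderings), actions)   # 6. recurse
                if solution is not None:
                    return solution
    return None

# e.g. pop(null_plan(initial, goal), actions) with the null-plan sketch above
```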
Analysis
 POP can be much faster than the state-space
planners because it doesn’t need to backtrack
over goal orderings (so less branching is
required).
 Although it is more expensive per node, and
makes more choices than RegWS, the reduction
in branching factor makes it faster, i.e., n is larger
but b is smaller!
More analysis
 Does POP make the least possible amount
of commitment?
 Lifted POP: uses operators instead of ground actions
– Unification is then required
POP in the Blocks world
 PutOn(x,y)
– Pre: Cl(x), Cl(y), On(x,z)
– Effects: On(x,y), Cl(z), ~Cl(y), ~On(x,z)
 PutOnTable(x)
– Pre: On(x,z), Cl(x)
– Effects: On(x,Table), Cl(z), ~On(x,z)
POP in the Blocks world
[Figures: step-by-step POP construction of a blocks-world plan, spread over four slides]
Example 2
 A0: Start
– Effects: At(Home), Sells(SM,Banana), Sells(SM,Milk), Sells(HWS,Drill)
 A∞: Finish
– Preconditions: Have(Drill), Have(Milk), Have(Banana), At(Home)
 Buy(y,x)
– Pre: At(x), Sells(x,y)
– Effect: Have(y)
 Go(x,y)
– Pre: At(x)
– Effects: At(y), ~At(x)
POP Example
[Figure: the resulting partial-order plan. Start provides At(H), Sells(SM,M), Sells(SM,B). Go(x3,SM), with x3 = H, achieves At(SM). Buy(M,x1), with x1 = SM, and Buy(B,x2), with x2 = SM, consume At(SM) and the Sells facts and achieve Have(M) and Have(B) for Finish.]