12 planning

Logistics
• Travel
Wed class led by Mausam
• Week’s reading
R&N ch17
• Project meetings
© Daniel S. Weld
Weekly Course Outline
1. Overview, agents, problem spaces
2. Search
3. Knowledge Representation
4.
5. Learning
6. Planning (and CSPs)
7. Uncertainty
8. Planning under uncertainty
9. Reinforcement learning
10. Topics
11. Project Presentations & Contest
Previously
• Constraint satisfaction
Backtracking policies & var-ordering heuristics
• The planning problem
Simplifying assumptions
• Searching world states
Regression
• Compilation to SAT, CSP, ILP, BDD
• Graphplan
Basics
…
Graphplan
• Phase 1 - Graph Expansion
Necessary (but insufficient) conditions for plan existence
Local consistency of plan-as-CSP
• Phase 2 - Solution Extraction
Variables
• action execution at a time point
Constraints
• goals, subgoals achieved
• no side-effects between actions
The Plan Graph
[Figure: plan graph drawn as a sequence of levels, level 0 through level 6 and beyond.]
Mutual Exclusion
• Actions A,B exclusive (at a level) if
A deletes B’s precond, or
B deletes A’s precond, or
A & B have inconsistent preconds
• Propositions P,Q inconsistent (at a level) if
all ways to achieve P exclude all ways to achieve Q
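The two tests above can be written down almost literally. The following is a minimal Python sketch, not a reference implementation: the `Action` record, the function names, and the example actions `A`, `B`, `C` with propositions `x`, `y`, `z` are all hypothetical and chosen only for illustration. It covers exactly the three action rules listed on the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset = frozenset()      # preconditions
    delete: frozenset = frozenset()   # delete effects

def actions_exclusive(a, b, prop_mutex=frozenset()):
    """Action-level test, following the three rules on the slide.
    prop_mutex holds frozenset pairs of propositions that are
    inconsistent at the preceding proposition level."""
    if a.delete & b.pre or b.delete & a.pre:
        return True
    return any(frozenset((p, q)) in prop_mutex for p in a.pre for q in b.pre)

def props_inconsistent(p, q, achievers, action_mutex):
    """Proposition-level test: every achiever of p excludes every achiever of q.
    achievers maps a proposition to the names of the actions that add it;
    action_mutex holds frozenset pairs of action names."""
    return all(frozenset((a, b)) in action_mutex
               for a in achievers[p] for b in achievers[q])

# Hypothetical actions for illustration only.
A = Action("A", pre=frozenset({"x"}), delete=frozenset({"y"}))
B = Action("B", pre=frozenset({"y"}))
C = Action("C", pre=frozenset({"z"}))

print(actions_exclusive(A, B))                           # A deletes B's precondition
print(actions_exclusive(A, C))                           # no interference
print(actions_exclusive(A, C, {frozenset(("x", "z"))}))  # inconsistent preconditions
```

If P and Q share an achiever, the pair of that action with itself is never in `action_mutex`, so `props_inconsistent` correctly reports that they can be achieved together.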
Searching for a solution
If goals are present & non-mutex:
Choose action to achieve each goal
Add preconditions to next goal set
Recurse!
Immediate Outline
• Graphplan
Example
Relation to CSP
Reachability analysis & heuristic generation
Do you need the whole planning graph?
Temporal planning
• Planning under uncertainty
Dinner Date
Initial Conditions: (:and (cleanHands) (quiet))
Goal: (:and (noGarbage) (dinner) (present))
Actions:
(:operator carry :precondition
  :effect (:and (noGarbage) (:not (cleanHands))))
(:operator dolly :precondition
  :effect (:and (noGarbage) (:not (quiet))))
(:operator cook :precondition (cleanHands)
  :effect (dinner))
(:operator wrap :precondition (quiet)
  :effect (present))
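As a rough check of this example, the sketch below encodes the four operators in Python, expands one action level from the initial conditions, and applies the "deletes a precondition" mutex rules. The dictionary layout, the `noop-` naming, and the helper names are assumptions made for illustration. The empty preconditions of carry and dolly are read as "no preconditions".

```python
from itertools import combinations, product

# name -> (preconditions, add effects, delete effects), as on the slide
OPS = {
    "carry": (set(), {"noGarbage"}, {"cleanHands"}),
    "dolly": (set(), {"noGarbage"}, {"quiet"}),
    "cook": ({"cleanHands"}, {"dinner"}, set()),
    "wrap": ({"quiet"}, {"present"}, set()),
}
INIT = {"cleanHands", "quiet"}
GOALS = {"noGarbage", "dinner", "present"}

def expand(props):
    """One expansion step: applicable operators plus a no-op per proposition."""
    acts = {n: op for n, op in OPS.items() if op[0].issubset(props)}
    acts.update({"noop-" + p: ({p}, {p}, set()) for p in props})
    return acts

def action_mutexes(acts):
    """Pairs in which one action deletes a precondition of the other.
    Initial propositions are never mutex, so the 'inconsistent preconds'
    rule cannot fire at this first level."""
    pairs = set()
    for a, b in combinations(sorted(acts), 2):
        (pa, _, da), (pb, _, db) = acts[a], acts[b]
        if da & pb or db & pa:
            pairs.add(frozenset((a, b)))
    return pairs

def achievers(acts, p):
    return [n for n, (_, add, _) in acts.items() if p in add]

acts = expand(INIT)
mutex = action_mutexes(acts)

# Goals are pairwise non-mutex if some pair of achievers is not exclusive.
goals_pairwise_ok = all(
    any(frozenset((a, b)) not in mutex
        for a in achievers(acts, p) for b in achievers(acts, q))
    for p, q in combinations(sorted(GOALS), 2)
)

# Solution extraction at the first action level: one achiever per goal,
# with no two chosen actions exclusive.
choices = [
    combo for combo in product(*(achievers(acts, g) for g in sorted(GOALS)))
    if all(frozenset(pair) not in mutex
           for pair in combinations(sorted(set(combo)), 2))
]

print(sorted(sorted(m) for m in mutex))
print(goals_pairwise_ok, choices)
```

Under these assumptions the exclusive pairs are carry/cook, carry/noop-cleanHands, dolly/wrap, and dolly/noop-quiet. The goals are present and pairwise non-mutex, yet `choices` is empty, so no one-step plan exists.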
Planning Graph
[Figure: planning graph. Level 0 (props): cleanH, quiet. Level 1 (actions): carry, dolly, cook, wrap. Level 2 (props): noGarb, cleanH, quiet, dinner, present. Levels 3 (actions) and 4 (props) are labeled but not yet expanded.]
Are there any exclusions?
[Figure: the same planning graph (levels 0 to 2), to be examined for mutex pairs.]
Do we have a solution?
[Figure: the same planning graph (levels 0 to 2), with the goal propositions noGarb, dinner, and present at level 2.]
Extend the Planning Graph
[Figure: the planning graph extended through level 4. Level 3 repeats the actions carry, dolly, cook, wrap; level 4 again contains noGarb, cleanH, quiet, dinner, present.]
Searching Backwards
[Figure: the extended planning graph (levels 0 to 4), searched backwards from the goals at level 4.]
One (of 4) Possibilities
[Figure: the extended planning graph (levels 0 to 4), showing one candidate choice of supporting actions.]
One (of 4) Possibilities
[Figure: the extended planning graph (levels 0 to 4), showing one candidate choice of supporting actions.]
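The extend-then-search-backwards loop on the dinner-date example can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Graphplan as published: it omits mutex propagation and memoization and checks only direct interference between the chosen actions. The interference test also covers deleting another action's add effect, which is a standard Graphplan rule but is not among the three listed on the Mutual Exclusion slide. All function names are hypothetical.

```python
from itertools import combinations, product

# name -> (preconditions, add effects, delete effects)
OPS = {
    "carry": (set(), {"noGarbage"}, {"cleanHands"}),
    "dolly": (set(), {"noGarbage"}, {"quiet"}),
    "cook": ({"cleanHands"}, {"dinner"}, set()),
    "wrap": ({"quiet"}, {"present"}, set()),
}
INIT = {"cleanHands", "quiet"}
GOALS = {"noGarbage", "dinner", "present"}

def expand(props):
    """Action layer reachable from a proposition layer (operators + no-ops)."""
    acts = {n: op for n, op in OPS.items() if op[0].issubset(props)}
    acts.update({"noop-" + p: ({p}, {p}, set()) for p in props})
    return acts

def interferes(a, b):
    """One action deletes a precondition or an add effect of the other."""
    (pa, aa, da), (pb, ab, db) = a, b
    return bool(da & (pb | ab) or db & (pa | aa))

def extract(goals, layer, layers):
    """Choose an action for each goal, then recurse on the preconditions."""
    if layer == -1:
        return [] if goals.issubset(INIT) else None
    acts = layers[layer]
    options = [[n for n, (_, add, _) in acts.items() if g in add]
               for g in sorted(goals)]
    for combo in product(*options):
        chosen = sorted(set(combo))
        if any(interferes(acts[a], acts[b]) for a, b in combinations(chosen, 2)):
            continue
        subgoals = set().union(*(acts[a][0] for a in chosen))
        rest = extract(subgoals, layer - 1, layers)
        if rest is not None:
            return rest + [[a for a in chosen if not a.startswith("noop-")]]
    return None

def plan(max_layers=5):
    props, layers = set(INIT), []
    for _ in range(max_layers):
        acts = expand(props)
        layers.append(acts)
        props = props | set().union(*(add for _, add, _ in acts.values()))
        if GOALS.issubset(props):          # goals present: try extraction
            steps = extract(GOALS, len(layers) - 1, layers)
            if steps is not None:
                return steps
    return None

print(plan())  # [['wrap'], ['cook', 'dolly']]
```

With one action layer the search fails, because carry interferes with cook and dolly with wrap. After one extension it returns a two-step plan: wrap first, then cook and dolly in parallel. Which of the possible solutions is returned first depends on the enumeration order, which here follows the order of `OPS`.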
Graphplan Solution Extraction as a Constraint Network
[Figure: constraint network with variables Present (annotated "not dolly & wrap"), Dinner (annotated "not carry & cook"), and NoGarb.]
Speed / Memory Ratios
Do & Kambhampati
http://citeseer.nj.nec.com/do00solving.html
Variable-Ordering
• Consider normal GP solution extraction
• Can you think of a good VO heuristic?
[Figure: propositions P and Q in the planning graph.]
Level of Proposition
• DVO (dynamic variable ordering)
What if two props have the same number of supporting actions?
• Prefer props added latest
Most constrained
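One way to read this ordering rule is as a two-part sort key: fewest supporting actions first, with ties broken in favor of the proposition that entered the graph at the highest level. The Python sketch below assumes that reading. The function name and the numbers in the example are hypothetical and only illustrate the tie-break.

```python
def order_subgoals(goals, supporters, lev):
    """Fewest supporting actions first; among ties, latest-added prop first."""
    return sorted(goals, key=lambda p: (len(supporters[p]), -lev[p]))

# Hypothetical supporters and levels, chosen only to show the tie-break.
supporters = {"P": ["a1", "a2"], "Q": ["a3", "a4"], "R": ["a5"]}
lev = {"P": 1, "Q": 3, "R": 2}

print(order_subgoals(["P", "Q", "R"], supporters, lev))  # ['R', 'Q', 'P']
```

R comes first because it has a single supporter. P and Q tie on two supporters, and Q wins the tie because it was added later (level 3 versus level 1).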
[Figure: propositions P and Q in the planning graph.]
Immediate Outline
• Graphplan
Example
Relation to CSP
Reachability analysis & heuristic generation
Do you need the whole planning graph?
Temporal planning
• Planning under uncertainty
Heuristics for World-Space Search
• Any ideas?
Optimistic Projection of Achievability
Grid Problem
[Figure: planning graph for the grid problem, with prop lists at levels 0, 1, 2 and action lists at levels 0, 1. Props shown: At(0,0), At(0,1), At(1,0), At(1,1), Key(0,1), ~Key(0,1), Have_key. Actions shown: noop, Move(0,0,0,1), Move(0,0,1,0), Move(0,1,1,1), Move(1,0,1,1), Pick_key(0,1). Mutex pairs are marked with x. An inset grid (axes 0 to 2) marks the initial state and the goal state.]
Serial PG: a PG in which every pair of non-noop actions is marked mutex
lev(S): index of the first level where all props in S appear non-mutexed.
If there is no such level, then
• If the graph has been grown to level off, then ∞
• Else k+1 (k is the current length of the graph)
Cost of a Set of Literals
[Figure: the grid-problem planning graph again (prop lists at levels 0, 1, 2; action lists at levels 0, 1; mutex pairs marked with x).]
Heuristics:
• Sum: $h(S) = \sum_{p \in S} lev(\{p\})$
• Set-Level: $h(S) = lev(S)$ (admissible)
• Partition-k
• Adjusted Sum
• Combo
• Set-Level with memos
lev(p): index of the first level at which p comes into the planning graph
lev(S): index of the first level where all props in S appear non-mutexed.
If there is no such level, then
• If the graph has been grown to level off, then ∞
• Else k+1 (k is the current length of the graph)
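The lev(p) values and the Sum heuristic can be computed on the grid problem with a short Python sketch. The action encoding is an assumption (only names appear in the figure), and the goal {At(1,1), Have_key} is hypothetical. Mutexes and delete effects are ignored, so the sketch computes lev({p}) only. The Set-Level value lev(S) needs mutex information; here it is only bounded from below by the largest individual level.

```python
# name -> (preconditions, add effects); delete effects are ignored here
ACTIONS = {
    "Move(0,0,0,1)": ({"At(0,0)"}, {"At(0,1)"}),
    "Move(0,0,1,0)": ({"At(0,0)"}, {"At(1,0)"}),
    "Move(0,1,1,1)": ({"At(0,1)"}, {"At(1,1)"}),
    "Move(1,0,1,1)": ({"At(1,0)"}, {"At(1,1)"}),
    "Pick_key(0,1)": ({"At(0,1)", "Key(0,1)"}, {"Have_key"}),
}

def first_levels(init, actions):
    """lev(p): index of the first proposition level at which p appears."""
    lev = {p: 0 for p in init}
    k = 0
    while True:
        new = {q for pre, add in actions.values() if pre.issubset(lev)
               for q in add if q not in lev}
        if not new:          # graph has leveled off
            return lev
        k += 1
        lev.update({q: k for q in new})

def h_sum(S, lev):
    """Sum heuristic: add up the individual levels (infinite if unreachable)."""
    return sum(lev.get(p, float("inf")) for p in S)

lev = first_levels({"At(0,0)", "Key(0,1)"}, ACTIONS)
goal = {"At(1,1)", "Have_key"}       # hypothetical goal for illustration

print(h_sum(goal, lev))              # 4
print(max(lev[p] for p in goal))     # 2, a lower bound on lev(goal)
```

Both goal props first appear at level 2, so the Sum heuristic gives 2 + 2 = 4, treating the two subgoals as independent.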
Adjusting the Sum Heuristic
• Start with the Sum heuristic and adjust it to take subgoal interactions into account
Negative interactions in terms of “degree of interaction”
Positive interactions in terms of co-achievement links
• Ignore negative interactions when accounting for positive interactions (and vice versa)
$H_{AdjSum2M}(S) = length(RelaxedPlan(S)) + \max_{p,q \in S} \delta(p,q)$
where $\delta(p,q) = lev(\{p,q\}) - \max\{lev(p), lev(q)\}$ (degree of negative interaction)
[Kambhampati AAAI 2000]
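To make the arithmetic concrete, here is a small Python sketch of the adjusted value, given a relaxed-plan length and level tables that have already been computed. The function names, the table layout, and all the numbers are hypothetical. Extracting the relaxed plan and computing pairwise levels from a planning graph are not shown.

```python
from itertools import combinations

def delta(p, q, lev, lev_pair):
    """Degree of negative interaction: lev({p,q}) - max(lev(p), lev(q))."""
    return lev_pair[frozenset((p, q))] - max(lev[p], lev[q])

def h_adj_sum_2m(S, relaxed_plan_length, lev, lev_pair):
    """Relaxed-plan length plus the worst pairwise interaction in S."""
    worst = max((delta(p, q, lev, lev_pair)
                 for p, q in combinations(sorted(S), 2)), default=0)
    return relaxed_plan_length + worst

# Hypothetical values: p and q first appear at levels 1 and 2,
# but first appear together without a mutex at level 4.
lev = {"p": 1, "q": 2}
lev_pair = {frozenset(("p", "q")): 4}

print(delta("p", "q", lev, lev_pair))               # 2
print(h_adj_sum_2m({"p", "q"}, 3, lev, lev_pair))   # 5
```

With these numbers the pair interacts negatively by 4 - max(1, 2) = 2 levels, so a relaxed plan of length 3 is adjusted to 5.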
McDermott’s Grid World
Immediate Outline
• Graphplan
Example
Relation to CSP
Reachability analysis & heuristic generation
Do you need the whole planning graph?
Temporal planning
• Planning under uncertainty
Observation 1
[Figure: plan graph over successive levels with propositions p, q, r (and the negations ¬q, ¬r) and actions A, B.]
Propositions monotonically increase (always carried forward by no-ops)
Observation 2
[Figure: the same plan graph, with propositions p, q, r (and the negations ¬q, ¬r) and actions A, B.]
Actions monotonically increase
Observation 3
[Figure: propositions p, q, r repeated across successive levels, with action A.]
Proposition mutex relationships monotonically decrease
Observation 4
[Figure: action A with propositions p, q, and actions B, C with propositions r, s, repeated across successive levels.]
Action mutex relationships monotonically decrease
New Representation
[Figure: propositions and actions, each annotated with the level at which it first appears: p0, A1, q2, B3, C3, r4, s4, D5, …]
Mutex Relationships
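[Figure: the same annotated graph (p0, A1, q2, B3, C3, r4, s4, D5, …) with mutex relationships drawn between propositions and between actions; the numeric labels 7 and 8 also appear.]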
Plan Graph
[Figure: the annotated graph (p0, A1, q2, B3, C3, r4, s4, D5, …); the numeric labels 6 and 7 also appear.]
Props & actions: start level → start time
Mutex relations: end level → end time
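Because propositions and actions only ever appear and mutex relations only ever disappear, a single number per node and per mutex pair is enough to answer level-by-level queries. The Python sketch below illustrates that idea. The class and method names are hypothetical. The start levels follow the labels in the figure, while the mutex end levels assigned to B/C and r/s are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class CompactGraph:
    start: dict = field(default_factory=dict)      # node -> first level it appears
    mutex_end: dict = field(default_factory=dict)  # pair -> last level it is mutex

    def present(self, x, level):
        """A node is in the graph at every level from its start level on."""
        return x in self.start and level >= self.start[x]

    def mutex(self, x, y, level):
        """A pair is mutex from the level both are present up to its end level."""
        end = self.mutex_end.get(frozenset((x, y)))
        return (end is not None and end >= level
                and self.present(x, level) and self.present(y, level))

g = CompactGraph(
    start={"p": 0, "A": 1, "q": 2, "B": 3, "C": 3, "r": 4, "s": 4, "D": 5},
    # Hypothetical end levels, for illustration only.
    mutex_end={frozenset(("B", "C")): 7, frozenset(("r", "s")): 8},
)

print(g.present("q", 1), g.present("q", 2))         # False True
print(g.mutex("B", "C", 5), g.mutex("B", "C", 9))   # True False
```

Storing start and end levels instead of one node copy per level keeps memory proportional to the number of distinct nodes and mutex pairs rather than to the depth of the graph.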
TGP: Doing Time
[Smith & Weld IJCAI-99]
Actions
Real duration
Parallel
Effects occur at end
Preconditions hold throughout
Affected propositions undefined during execution
[Figure: action A with precondition precond1 and effects eff1, eff2.]
TGP Graph Expansion
[Figure: TGP graph-expansion diagram over the annotated graph (p0, A1, q2, B3, C3, r4, s4, D5, …), with steps labeled New Proposition, New Action, New Support, Add New Prop Mutex, Add New Action Mutex, Terminate Prop Mutex, Terminate Action Mutex.]
Temporal Planning
• Graphplan too complex
• Can we modify state-space search?
Suppose regression
TP4: Search Space
[Haslum & Geffner ECP 2001]
$S = (E, F, t)$
• $E = \{p_i\}$: $p_i$ is required to be true at $t$
• $F = \{(a_i, \delta_i)\}$: action $a_i$ is planned to start at $t - \delta_i$
• $t$: time stamp
The initial state $S_0$ = ?
$S_0 = (G, \mathrm{NULL})$
The final state $S$ = ?
$\{S = (E, \mathrm{NULL}) \mid E \subseteq I_p\}$
TP4: Action Application
• $E = \{p_i\}$: $p_i$ is required to be true at $t$
• $F = \{(a_i, \delta_i)\}$: action $a_i$ is planned to start at $t - \delta_i$
• $t$: time stamp
$S = (E, \{(a_j, \delta_j)\}, t)$
A set of actions, A, is applicable in S if
-- every action in A is compatible with every other action in A;
-- $\forall p \in E$, $\exists a \in A$ such that $p \in add(a)$ and a is compatible with every $b \in F$
TP4: branching rule
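Choose a set of actions SE applicable in S,
define: $F_{new} = \{(a, dur(a)) \mid a \in SE\}$
$\delta_{adv} = \min\{\delta \mid (a, \delta) \in F \cup F_{new},\ a \neq \text{no-op}\}$
then
$E' = \{pre(a) \mid (a, \delta) \in F \cup F_{new}\}$
$F' = \{(a, \delta - \delta_{adv}) \mid (a, \delta) \in F \cup F_{new},\ \delta > \delta_{adv}\}$
$S' = (E', F')$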
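The branching rule can be traced on a toy state with the Python sketch below. It follows the rule as written on this slide, collecting the preconditions of every action in F ∪ Fnew, and should not be taken as the exact formulation in the TP4 paper. Several things are assumptions made for illustration: the function name, the `noop` name prefix, giving no-ops duration 0, and the example actions with their durations and preconditions. The sketch also assumes SE has already been checked for applicability and that at least one action is not a no-op.

```python
def tp4_successor(F, SE, dur, pre):
    """One regression step: returns (E', F') for the chosen action set SE.

    F   : set of (action, delta) pairs carried over from the current state
    SE  : actions chosen to establish the propositions required at time t
    dur : action -> duration;  pre : action -> set of preconditions
    """
    pending = set(F) | {(a, dur[a]) for a in SE}                  # F ∪ F_new
    adv = min(d for a, d in pending if not a.startswith("noop"))  # delta_adv
    E_next = set().union(*(pre[a] for a, _ in pending))
    F_next = {(a, d - adv) for a, d in pending if d > adv}
    return E_next, F_next

# Hypothetical actions: a takes 3 time units, b takes 5, and a no-op keeps p.
dur = {"a": 3, "b": 5, "noop-p": 0}
pre = {"a": {"x"}, "b": {"y"}, "noop-p": {"p"}}

E_next, F_next = tp4_successor(set(), {"a", "b", "noop-p"}, dur, pre)
print(sorted(E_next), sorted(F_next))  # ['p', 'x', 'y'] [('b', 2)]
```

Here time moves back by 3, the shortest duration among the real actions. Action a starts at the new time point and drops out of F', while b remains with 2 time units still to go before its start.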