Open-Loop Planning

Open-Loop Planning as
Satisfiability
Henry Kautz
AT&T Labs
A Core Computation Problem
Focus: Classical state-space planning, extended
with parallel actions
Core problem - similar computational issues arise
in many other models
• Reactive plans
• Planning with uncertainty and utilities
• Continuous processes
• Metric time
2
State-space Planning
Find a sequence of operators that transform an
initial state to a goal state
State = complete truth assignment to a set of
variables (fluents)
Goal = partial truth assignment (set of states)
Operator = a partial function State → State
• specified by three sets of variables:
precondition, add list, delete list
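The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the representation used by any particular planner: the names `Operator`, `apply`, and `satisfies` are hypothetical, a state is modeled as the set of fluents that are true, and the goal is assumed to contain positive literals only.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

State = FrozenSet[str]  # the fluents that are true; every other fluent is false


@dataclass(frozen=True)
class Operator:
    name: str
    pre: FrozenSet[str]     # precondition
    add: FrozenSet[str]     # add list
    delete: FrozenSet[str]  # delete list

    def apply(self, state: State) -> Optional[State]:
        """Return the successor state, or None where the partial function is undefined."""
        if not self.pre <= state:
            return None
        return (state - self.delete) | self.add


def satisfies(state: State, goal: FrozenSet[str]) -> bool:
    # Assumption: the goal lists only fluents that must be true.
    return goal <= state


fly = Operator("fly(NY,LA)",
               pre=frozenset({"at(NY)"}),
               add=frozenset({"at(LA)"}),
               delete=frozenset({"at(NY)"}))
s0 = frozenset({"at(NY)"})
s1 = fly.apply(s0)
```

Applying `fly` a second time returns `None`, because its precondition no longer holds in `s1`.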
3
Parallelism
Operators may be applied in parallel when all
orderings are well defined and equivalent
(Op1 || Op2)(s) = Op2(Op1(s)) = Op1(Op2(s))
A special form of non-linear plans
• Only allows parallel actions, not parallel
action sequences
• Easy to serialize
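The parallelism condition can be checked directly from its definition: both orderings must be defined and must yield the same state. The sketch below assumes set-based STRIPS operators; the names `Op`, `apply_op`, and `parallel_ok`, and the small loading example, are hypothetical.

```python
from typing import FrozenSet, NamedTuple, Optional


class Op(NamedTuple):
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def apply_op(op: Op, state: FrozenSet[str]) -> Optional[FrozenSet[str]]:
    """Apply a STRIPS operator, or return None if its precondition fails."""
    if not op.pre <= state:
        return None
    return (state - op.delete) | op.add


def parallel_ok(a: Op, b: Op, state: FrozenSet[str]) -> bool:
    """True when a-then-b and b-then-a are both defined and give the same state."""
    after_a, after_b = apply_op(a, state), apply_op(b, state)
    if after_a is None or after_b is None:
        return False
    ab, ba = apply_op(b, after_a), apply_op(a, after_b)
    return ab is not None and ab == ba


s = frozenset({"at(plane,NY)", "at(p1,NY)", "at(p2,NY)"})
load1 = Op(frozenset({"at(plane,NY)", "at(p1,NY)"}),
           frozenset({"in(p1,plane)"}), frozenset({"at(p1,NY)"}))
load2 = Op(frozenset({"at(plane,NY)", "at(p2,NY)"}),
           frozenset({"in(p2,plane)"}), frozenset({"at(p2,NY)"}))
fly = Op(frozenset({"at(plane,NY)"}),
         frozenset({"at(plane,LA)"}), frozenset({"at(plane,NY)"}))
```

Here the two load actions commute, while `fly` deletes a precondition of `load1`, so that pair cannot run in parallel.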
4
Abundance of Negative
Complexity Results
I. Domain-independent planning: PSPACE-complete
(Chapman 1987; Bylander 1991; Backstrom 1993)
II. Domain-dependent planning: NP-complete
(Chenoweth 1991; Gupta and Nau 1992)
III. Approximate planning: NP-complete
(Selman 1994)
5
Practice
Until recently, domain-independent planning
systems could only generate very short plans
– 5 steps
Practical systems minimize or eliminate search
• Search control rules
(Sacerdoti 1975, Slaney 1996, Bacchus 1996)
• Pre-compiling entire state-space
(Agre & Chapman 1987, Williams & Nayak 1997)
Scaling remains problematic when state space
is large or not well understood!
How far can one push domain-independent planning?
– AIPS Planning Competition - 50 step plans
6
Planning as Inference
• Planning as first-order theorem proving
(Green 1969)
computationally infeasible
• STRIPS (Fikes & Nilsson 1971)
very hard
• Partial-order planning (modal truth criteria)
(Tate 1977, Chapman 1985, McAllester 1991, Smith & Peot 1993)
can be more efficient, but still hard
(Minton, Bresina, & Drummond 1994)
• SATPLAN: planning as propositional reasoning
7
Approach
SAT encodings are designed so that plans
correspond to satisfying assignments
Use recent efficient satisfiability procedures
(systematic and stochastic) to solve
Evaluate performance on benchmark
instances
8
Outline
1. Modeling and Solving
Planning Problems as SAT
2. Improved Encodings using
Graph Analysis
3. Improved Encodings using
Compiled Control Knowledge
4. Beyond SAT: Planning with
Resources and Optimization
9
Part 1: Modeling and Solving
Planning Problems as SAT
10
SAT Encodings
Propositional CNF: no variables or quantifiers
Sets of clauses specified by axiom schemas
• fully instantiated before problem-solving
Discrete time, modeled by integers
• state predicates: indexed by time at which
they hold
• action predicates: indexed by time at
which action begins
– each action takes 1 time step
– many actions may occur at the same step
11
Encoding Conventions
• Actions imply preconditions and effects
fly(x,y,i) ⊃ at(x,i) & route(x,y) & at(y,i+1)
• Conflicting actions cannot occur at same time (A
deletes a precondition of B)
fly(x,y,i) & y≠z ⊃ ¬fly(x,z,i)
• If something changes, an action must have
caused it (Explanatory Frame Axioms)
at(x,i) & ¬at(x,i+1) ⊃ ∃y . route(x,y) & fly(x,y,i)
• Initial and final states hold
at(NY,0) & ... & at(LA,9) & ...
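As an illustration of these conventions, the sketch below instantiates the axiom schemas for a tiny single-plane domain and finds models by brute-force enumeration. It is a hedged example rather than the SATPLAN encoder: the function names, the three-city route map, and the literal representation `(atom, polarity)` are assumptions, and two axioms not shown on the slide are added so that models correspond exactly to plans (a delete effect for `fly`, and a frame axiom for fluents that become true).

```python
from itertools import product


def encode(cities, routes, init, goal, horizon):
    """Return CNF clauses; each literal is a pair (atom, polarity)."""
    at = lambda x, i: ("at", x, i)
    fly = lambda x, y, i: ("fly", x, y, i)
    clauses = []
    for i in range(horizon):
        for (x, y) in routes:
            clauses.append([(fly(x, y, i), False), (at(x, i), True)])       # action implies precondition
            clauses.append([(fly(x, y, i), False), (at(y, i + 1), True)])   # action implies add effect
            clauses.append([(fly(x, y, i), False), (at(x, i + 1), False)])  # delete effect (assumed)
            for (x2, z) in routes:                                          # conflicting actions
                if x2 == x and z != y:
                    clauses.append([(fly(x, y, i), False), (fly(x, z, i), False)])
        for x in cities:
            leaving = [(fly(a, b, i), True) for (a, b) in routes if a == x]
            arriving = [(fly(a, b, i), True) for (a, b) in routes if b == x]
            # Explanatory frame axioms: a change must be caused by some action.
            clauses.append([(at(x, i), False), (at(x, i + 1), True)] + leaving)
            clauses.append([(at(x, i), True), (at(x, i + 1), False)] + arriving)
    clauses += [[(at(x, 0), x == init)] for x in cities]  # complete initial state
    clauses.append([(at(goal, horizon), True)])           # goal holds at the final step
    return clauses


def models(clauses):
    """Enumerate satisfying assignments by brute force (fine for tiny instances only)."""
    atoms = sorted({a for c in clauses for a, _ in c})
    for values in product([False, True], repeat=len(atoms)):
        m = dict(zip(atoms, values))
        if all(any(m[a] == pol for a, pol in c) for c in clauses):
            yield m


def plan_from(model):
    """Read the true action variables back as a time-ordered plan."""
    steps = sorted((a[3], a[1], a[2]) for a, v in model.items() if v and a[0] == "fly")
    return [f"fly({x},{y},{i})" for i, x, y in steps]


cnf = encode(["NY", "CHI", "LA"], [("NY", "CHI"), ("CHI", "LA")], "NY", "LA", horizon=2)
plans = [plan_from(m) for m in models(cnf)]
```

With two time steps the formula has a single model, which reads back as the plan `fly(NY,CHI,0)`, `fly(CHI,LA,1)`.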
12
Modeling Tricks
Can often dramatically reduce size of problem by
modeling techniques
move(x,y,z,i) requires n^4 vars
pickup(x,y,i), putdown(x,z,i) requires 2n^3 vars
State-based encodings: eliminate all action
variables (“compile away”)
at(x,i) ⊃ at(x,i+1) ∨ ∃y . route(x,y) & at(y,i+1)
at(x,i) & x≠y ⊃ ¬at(y,i)
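The variable counts quoted for the move versus pickup/putdown formulations can be reproduced with simple arithmetic. The sketch assumes n objects and a plan horizon that grows with n, which is how a three-argument action indexed by time reaches n^4 ground variables; the function name is hypothetical.

```python
def ground_action_vars(n_objects: int, horizon: int) -> dict:
    """Count ground action variables for two ways of modeling a block move."""
    single = n_objects ** 3 * horizon     # move(x,y,z,i): three object arguments per step
    split = 2 * n_objects ** 2 * horizon  # pickup(x,y,i) and putdown(x,z,i): two each
    return {"move": single, "pickup+putdown": split}


counts = ground_action_vars(n_objects=10, horizon=10)
```

For 10 objects and 10 steps this gives 10,000 variables for `move` against 2,000 for the split encoding.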
13
Solution to a Planning Problem
A solution is specified by any model (satisfying
truth assignment) of the conjunction of the
axioms describing the initial state, goal state,
and operators
Easy to convert back to a STRIPS-style plan
14
SATPLAN
[Diagram: problem description, axiom schemas, and plan length → instantiate → instantiated propositional clauses → SAT engine(s) → satisfying model → interpret (using the mapping from instantiation) → plan]
15
SAT Algorithms
Systematic Search
• DP (Davis Putnam Logemann Loveland)
backtrack search + unit propagation
• satz (Chu Min Li) - variable selection by forward
checking: max unit props
• relsat (Bayardo) - dependency directed
backtracking: add new clauses at dead-ends
Local Search
• Inspired by Min-Conflicts algorithm
(Adorf, Johnston, Minton, Philips, & Laird)
• GSAT (Selman), Walksat (Selman, Kautz & Cohen)
greedy local search + noise to escape minima
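The systematic side of this list can be illustrated with a compact DP-style procedure: backtracking search plus unit propagation. This is a teaching sketch, not satz or relsat; clauses are assumed to be lists of nonzero integers in the usual signed-literal convention, and the branching rule (lowest-numbered free variable) is a placeholder for the heuristics named above.

```python
def unit_propagate(clauses, assignment):
    """Assign literals forced by unit clauses; return None on a conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # clause already satisfied
            free = [l for l in clause if abs(l) not in assignment]
            if not free:
                return None  # every literal is false
            if len(free) == 1:
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return assignment


def dpll(clauses, assignment=None):
    """Return a satisfying assignment as {variable: bool}, or None if unsatisfiable."""
    assignment = unit_propagate(clauses, assignment or {})
    if assignment is None:
        return None
    variables = sorted({abs(l) for c in clauses for l in c})
    free = [v for v in variables if v not in assignment]
    if not free:
        return assignment
    for value in (True, False):  # branch on the first free variable
        result = dpll(clauses, {**assignment, free[0]: value})
        if result is not None:
            return result
    return None


cnf = [[1, 2], [-1, 3], [-2, -3]]
model = dpll(cnf)
```

On this formula, branching on variable 1 as true lets unit propagation fix the remaining variables without any backtracking.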
16
Planning Benchmark Test Set
Extension of Graphplan test set
blocks world - up to 18 blocks, 10^19 states
logistics - complex, highly-parallel transportation
domain.
Logistics.d:
• 2,165 possible actions per time slot
• 10^16 legal configurations (2^2000 states)
• optimal solution contains 74 distinct actions over
14 time slots
Problems of this size never previously handled
by state-space planning systems
17
Scaling Up Logistics Planning
[Figure: log solution time (log scale, 0.01 to 10000) for Graphplan, DP, DP/Satz, and Walksat on rocket.a, rocket.b, log.a, log.b, log.c, and log.d]
18
Unpredictability of Systematic
Search
[Figure: log solution time (log scale, 0.01 to 10000) for DP/Satz and Walksat on rocket.a, rocket.b, log.a, log.b, log.c, and log.d]
19
Randomized Restarts
Solution: randomize the systematic solver
• Add noise to the heuristic branching
(variable choice) function
• Cutoff and restart search after a fixed
number of backtracks
In practice: rapid restarts with low cutoff can
dramatically improve performance
(Gomes 1996, Gomes, Kautz, and Selman 1997, 1998)
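The two ingredients listed above, randomized branching and a backtrack cutoff with restarts, can be sketched as follows. This is an illustrative toy, not the randomized Satz used in the experiments: the function names, the default cutoff, and the way backtracks are counted are all assumptions.

```python
import random


class Cutoff(Exception):
    """Raised when a single run exceeds its backtrack budget."""


def propagate(clauses, assignment):
    """Unit propagation; returns None on a conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue
            free = [l for l in clause if abs(l) not in assignment]
            if not free:
                return None
            if len(free) == 1:
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return assignment


def search(clauses, assignment, rng, budget):
    """Backtracking search with random variable and value choice."""
    assignment = propagate(clauses, assignment)
    if assignment is None:
        return None
    free = sorted({abs(l) for c in clauses for l in c} - set(assignment))
    if not free:
        return assignment
    var = rng.choice(free)          # noise in the branching choice
    values = [True, False]
    rng.shuffle(values)
    for value in values:
        result = search(clauses, {**assignment, var: value}, rng, budget)
        if result is not None:
            return result
        budget[0] -= 1              # one more backtrack used
        if budget[0] < 0:
            raise Cutoff
    return None


def solve_with_restarts(clauses, cutoff=20, max_restarts=50, seed=0):
    """Return a model, or None if a run proves unsatisfiability or all restarts hit the cutoff."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        try:
            return search(clauses, {}, rng, [cutoff])
        except Cutoff:
            continue  # abandon this run and restart with fresh random choices
    return None


cnf = [[1, 2], [-1, 3], [-2, -3]]
model = solve_with_restarts(cnf)
```

A fixed cutoff makes the procedure incomplete on hard unsatisfiable instances; a common remedy, not shown here, is to let the cutoff grow across restarts.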
20
Increased Predictability
[Figure: log solution time (log scale, 0.01 to 10000) for Satz and Satz/Rand on rocket.a, rocket.b, log.a, log.b, log.c, and log.d]
21
What SATPLAN Shows
General propositional theorem provers can
compete with state of the art specialized
planning systems
• New, highly tuned variations of DP
surprisingly powerful
– result of sharing ideas and code in large SAT/CSP research
community
– specialized engines can catch up, but by then new general
techniques have appeared
• Radically new stochastic approaches to
SAT can provide very low exponential
scaling
– 2+ orders of magnitude speedup on hard benchmark
problems
22
Why SATPLAN Works
More flexible than forward or backward chaining
• Systematic: most unit propagation at
most highly constrained states
• Stochastic: iterative repair
Space for time tradeoff
• Less overhead, since variables do not have to be
instantiated during search
Randomized algorithms less likely to get
trapped along bad paths
23
Part 2: Improved Encodings by
Graph Analysis: The
BLACKBOX Planner
24
Graphplan
Planning as graph search (Blum & Furst 1995)
Set new paradigm for planning
Like SATPLAN...
• Two phases: instantiation of propositional
structure, followed by search
Unlike SATPLAN...
• Interleaves instantiation and pruning of plan graph
• Employs specialized search engine
Graphplan - better instantiation
SATPLAN - better search
25
Graph Pruning
Graphplan instantiates in a forward direction,
pruning unreachable nodes
• conflicting actions are mutex
• if all actions that add two facts are mutex, the facts
are mutex
• if the preconditions for an action are mutex, the
action is unreachable!
In logical terms: limited application of negative
binary propagation
• given: ¬P ∨ ¬Q,
P ∨ R ∨ S ∨ ...
• infer: ¬Q ∨ R ∨ S ∨ ...
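The mutex rules above can be sketched as two small functions over one layer of a plan graph. This is a simplified illustration rather than Graphplan's implementation: the data layout (a dictionary from each fact to the actions that add it, and a set of mutex action pairs) and the action names are assumptions.

```python
from itertools import combinations, product


def mutex_facts(adders, action_mutex):
    """Facts f and g are mutex when every pair of actions adding them is mutex."""
    pairs = set()
    for f, g in combinations(sorted(adders), 2):
        if all(frozenset((a, b)) in action_mutex
               for a, b in product(adders[f], adders[g])):
            pairs.add(frozenset((f, g)))
    return pairs


def reachable(preconditions, fact_mutex):
    """An action is unreachable if any two of its preconditions are mutex."""
    return not any(frozenset(pair) in fact_mutex
                   for pair in combinations(sorted(preconditions), 2))


# Hypothetical layer: driving away deletes a precondition of loading, so the two conflict.
action_mutex = {frozenset(("load(pkg,truck)", "drive(truck,B)"))}
adders = {"in(pkg,truck)": ["load(pkg,truck)"],
          "at(truck,B)": ["drive(truck,B)"]}
fact_mutex = mutex_facts(adders, action_mutex)
unload_at_b_ok = reachable({"in(pkg,truck)", "at(truck,B)"}, fact_mutex)
```

In this fragment the package cannot be in the truck while the truck is at B after one step, so an unload at B is pruned from the next action layer.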
26
The Plan Graph
[Diagram: alternating layers of fact and action nodes (Facts, Actions, Facts, ...); edges mark preconditions, add effects, delete effects, and mutually exclusive pairs]
27
Bridging Paradigms
Both SATPLAN and Graphplan are disjunctive
planners (Kambhampati 1996)
Can the best features of each be
combined?
IJCAI Challenge in Bridging Plan Synthesis Paradigms
(Kambhampati 1997)
Our response: blackbox
28
Translation of Plan Graph
[Diagram: Act1 has preconditions Pre1 and Pre2; Act1 and Act2 both add Fact and are mutually exclusive]
Fact ⊃ Act1 ∨ Act2
Act1 ⊃ Pre1 ∧ Pre2
¬Act1 ∨ ¬Act2
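The translation pattern on this slide (a fact implies one of its adders, an action implies its preconditions, mutex actions exclude each other) can be written as a short clause generator. The sketch is illustrative and does not reproduce the blackbox translator: literals are plain strings with `~` for negation, and the function name and argument layout are hypothetical.

```python
def translate(adders, preconditions, mutexes):
    """Turn one plan-graph fragment into CNF clauses over string literals."""
    clauses = []
    for fact, acts in adders.items():
        clauses.append([f"~{fact}"] + list(acts))   # fact implies some adding action
    for act, pres in preconditions.items():
        for pre in pres:
            clauses.append([f"~{act}", pre])        # action implies each precondition
    for a, b in mutexes:
        clauses.append([f"~{a}", f"~{b}"])          # mutex actions cannot both occur
    return clauses


clauses = translate(adders={"Fact": ["Act1", "Act2"]},
                    preconditions={"Act1": ["Pre1", "Pre2"]},
                    mutexes=[("Act1", "Act2")])
```

The implication from Act1 to the conjunction of its preconditions becomes one binary clause per precondition, which keeps the output in CNF.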
29
Improved Encodings
Translations of Logistics.a:
STRIPS → Axiom Schemas → SAT
(Medic system, Weld et al. 1997)
• 3,510 variables, 16,168 clauses
• 24 hours to solve
STRIPS → Plan Graph → SAT
• 2,709 variables, 27,522 clauses
• 5 seconds to solve!
30
Limited Inference
SATPLAN used only a single general theorem prover
What role can limited (polytime) reasoning
algorithms play?
Two kinds of limited deduction
• Planning specific (mutex computation)
• General polytime simplification
– apply to all CNF formulas, may or may not be
designed with planning in mind
31
General Limited Inference
Generated wff can be further simplified by
consistency propagation techniques
Compact (Crawford & Auton 1996)
• unit propagation: is Wff inconsistent by resolution
against unit clauses?
O(n)
• failed literal rule: is Wff + { P } inconsistent by unit
propagation?
O(n^2)
• binary failed literal rule: is Wff + { P ∨ Q }
inconsistent by unit propagation?
O(n^3)
Complements domain specific limited inference
Discovers hidden local structure!
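The first two simplification rules can be sketched as follows. This is a minimal single-pass illustration, not the Compact simplifier: clauses are assumed to be lists of signed integers, and the function names are hypothetical. A literal is "failed" when asserting it leads to a conflict under unit propagation, so its negation can be fixed.

```python
def unit_propagate(clauses, assignment):
    """Assign literals forced by unit clauses; return None on a conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue
            free = [l for l in clause if abs(l) not in assignment]
            if not free:
                return None
            if len(free) == 1:
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return assignment


def failed_literals(clauses):
    """Return (values set by unit propagation, values forced by the failed literal rule)."""
    base = unit_propagate(clauses, {})
    if base is None:
        raise ValueError("formula is inconsistent by unit propagation alone")
    forced = {}
    for var in sorted({abs(l) for c in clauses for l in c}):
        if var in base:
            continue
        for value in (True, False):
            # If assuming var=value fails, var must take the opposite value.
            if unit_propagate(clauses, {**base, var: value}) is None:
                forced[var] = not value
    return base, forced


cnf = [[1, 2], [1, -2], [3, 4]]
base, forced = failed_literals(cnf)
```

Each test is one unit propagation, which is where the extra factor of n over plain unit propagation comes from; a fuller simplifier would re-propagate after every forced value and repeat until nothing changes.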
32
General Limited Inference
Percent vars set by:
Problem   Vars     unit prop   failed lit   binary failed
bw.a      2452     10%         100%         100%
bw.b      6358     5%          43%          99%
bw.c      19158    2%          33%          99%
log.a     2709     2%          36%          45%
log.b     3287     2%          24%          30%
log.c     4197     2%          23%          27%
log.d     6151     1%          25%          33%
33
Blackbox
[Diagram: STRIPS → plan graph (mutex computation) → translator → CNF → simplifier → CNF → general stochastic / systematic SAT engines → solution]
34
Staged Inference
[Diagram: stages from abstract problem specification to solution: domain-specific model (polytime domain-specific inference), encoding scheme, general language encoding (polytime general inference), combinatorial CORE (full general inference, NP-complete), solution]
35
Intuition
Many real-world problems not tractable, but are
nearly so
• polytime inference takes advantage of special
kinds of structure
• structure may be visible at the level of a
domain specific representation, or only after
the problem is encoded
• small number of practical methods for
combinatorial core
– can be highly optimized
36
Blackbox Results
[Figure: log-scale axis (0.01 to 10000) comparing Graphplan, BB-walksat, BB-rand-sys, and Handcoded-walksat on rocket.a, rocket.b, log.a, log.b, log.c, and log.d]
37
Part 3: Improved Encodings:
Compiling Control Knowledge
38
Kinds of Control Knowledge
About domain itself
• a truck is only in one location
About good plans
• do not remove a package from its destination location
About how to search
• plan air routes before land routes
39
Expressing Knowledge
Such information is traditionally incorporated in
the planning algorithm itself
– or in a special programming language
Instead: use additional declarative axioms
– (Bacchus 1995; Kautz 1998; Chen, Kautz, & Selman 1999)
• Problem instance: operator axioms +
initial and goal axioms + control axioms
• Control knowledge constraints on
search and solution spaces
• Independent of any search engine
strategy
40
Axiomatic Control Knowledge
State Invariant: A truck is at only one location
at(truck,loc1,i) & loc1 ≠ loc2 ⊃
¬at(truck,loc2,i)
Optimality: Do not return a package to a location
at(pkg,loc,i) & ¬at(pkg,loc,i+1) & i<j ⊃
¬at(pkg,loc,j)
Simplifying Assumption: Once a truck is loaded, it
should immediately move
¬in(pkg,truck,i) & in(pkg,truck,i+1) &
at(truck,loc,i+1) ⊃
¬at(truck,loc,i+2)
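Because control knowledge is stated as axiom schemas, it is instantiated into clauses the same way as the operator axioms. The sketch below grounds the state invariant (a truck is at only one location) for every truck, time step, and pair of distinct locations; the function name, the string literal format, and the example objects are assumptions made for illustration.

```python
from itertools import combinations


def one_location_clauses(trucks, locations, horizon):
    """Ground the invariant: a truck is never at two different locations at one time."""
    return [
        [f"~at({t},{l1},{i})", f"~at({t},{l2},{i})"]
        for t in trucks
        for i in range(horizon + 1)
        for l1, l2 in combinations(locations, 2)  # each unordered pair once
    ]


clauses = one_location_clauses(["truck1"], ["depot", "airport", "office"], horizon=2)
```

The clauses are simply appended to the problem instance, so the same axioms work unchanged with either a systematic or a stochastic SAT engine.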
41
Adding Control Kx to SATPLAN
[Diagram: problem specification axioms + control knowledge axioms → instantiated clauses → SAT simplifier → SAT "core" → SAT engine]
As control knowledge increases, the core shrinks!
42
Tradeoffs of Control Knowledge
If the planning domain is inherently intractable,
how can any amount of control knowledge
make planning tractable?
• by reducing solution quality
• optimal planning - NP-Hard
• non-optimal - (maybe) Polynomial
Issue: speed / quality tradeoff
Case study: Control Knowledge in TLPLAN and
BlackBox
• TLPLAN (Bacchus 1996): simple forward-chaining search with strong control rules
43
Intuition
Greater raw search power of SAT
approach should give better quality
solutions than planners entirely
dependent on control knowledge
44
TLPlan
Temporal Logic
Control Formula
45
Temporal Logic for Control
( at(obj1, loc1) =>
at(obj1, loc1) )
I. Rules involve only static information
II. Rules depend on the current state
III. Rules depend on the current state and
require dynamic user-defined predicates
46
Category I Control Rules
[Figure: initial and goal positions of object a among airports SFO, NYC, and ORL]
Do NOT unload an object from an airplane unless the
object is at its goal destination
47
Pruning the Planning Graph
Category I Rules
[Diagram: plan graph layers (Facts, Actions, Facts, ...)]
48
Effect of Graph Pruning
[Figure: number of nodes (0 to 10000), original versus pruned plan graph, for log-a, log-b, log-c, and log-d]
49
Category II Control Rules
[Figure: object a and airports SFO, ORL, and NYC]
Do NOT move an airplane if there is an object in the
airplane that needs to be unloaded at that location.
50
Control by Adding Constraints
Control Rules
(in(pkg,p) ∧ at(p,ORL) ⊃ next(at(p,ORL)))
Constraint Clauses
(x_i ∧ y_i ⊃ y_{i+1})
[Diagram: the constraint clauses are added to the planning formula]
51
Blackbox with Control Knowledge
(Logistics domain)
[Figure: time in seconds (log scale, 1 to 10000) for blackbox, blackbox(I), and blackbox(I&II) on log-a, log-b, log-c, log-d, and log-e]
52
Blackbox with Control Knowledge
(Tire-World domain)
[Figure: time in seconds (0 to 120) for blackbox, blackbox(I), and blackbox(I&II) on tire-a and tire-b]
53
Comparison between Blackbox and TLPlan
(Parallel Plan Length)
[Figure: parallel plan length (0 to 35) for TLPlan and Blackbox on log-c, log-d, log-e, log-1, and log-2]
54
Comparison between Blackbox and TLPlan
(Running Time)
[Figure: time in seconds (0 to 80) for TLPlan and Blackbox(I&II) on log-a, log-b, log-c, log-d, and log-e]
55
Comparison
TLPlan (without Control):
Intractable.
TLPlan (with Control):
fastest, but limited parallelism
Blackbox (without Control):
slower, high parallelism
Blackbox (with Control):
faster, high parallelism
56
Summary
Easy to encode domain-specific knowledge in
the planning as satisfiability framework
• Key to order-of-magnitude scaling
• Propositional logic, temporal logic, ...
• Can be applied before/after SAT encoding
Can control time / quality tradeoff
• Power of underlying SAT engines gives option of
finding higher quality solutions
Heuristics are independent from the SAT engine
• Can use same axioms for radically different
problem solvers
57
How to Generate Control Kx
Introspection
• Try to capture “obvious” inferences that are hard to
deduce
EBL (Minton, Kambhampati)
• Generalize trace of previous problem solving
Static analysis (Smith, Etzioni, Knoblock, Peot)
• Analyze operators
Inductive Logic Programming (Huang, Selman, Kautz)
• Find rules that hold for a set of previous high-quality
solution plans
58
Part 4: Beyond SAT: Planning
with Resources and
Optimization
59
Conclusions
• Propositional approaches to Open-Loop
planning using general SAT engines are highly
competitive with specialized planning
algorithms
• Synergy with Plan Graph approaches
• Can effectively employ purely declarative
control knowledge
• Promising direction: generalization to domains
with numeric information
• Biggest limitation: domains where number of
objects is too large to instantiate
– “lifted SAT” - limited quantification?
60