VEIL: Visual exploration and incremental utility elicitation

Heuristic Search Planners
Planning as heuristic search

Use standard search techniques, e.g. A*, best-first,
hill-climbing etc.

Attempt to extract heuristic state evaluator
automatically from the Strips encoding of the domain

Here, generate relaxed problem by assuming action
preconditions are independent
USC INFORMATION SCIENCES INSTITUTE
2
Recap: A* search

Best-first search using node evaluation
f(n) = g(n) + h(n)
where
g(n) = accumulated cost
h(n) = estimate of future cost

For A*, h(.) should never overestimate the cost. In
this case, the solution will be optimal. Then h is
called an admissible heuristic.
USC INFORMATION SCIENCES INSTITUTE
3
Derive cost estimate from a relaxed
planning problem


Ignore the deletes on actions
BUT – still NP-hard, so approximate:

For individual propositions p:
d(s, p) = 0 if p is true in s
= 1 + min(d(s, pre(a))) otherwise
[min over actions a that add p]
USC INFORMATION SCIENCES INSTITUTE
4
Cost of a conjunction

How to compute d(s,pre(a)) or d(s,G) ?

Different options:


Additive: d(s, P) = sum d(s, p) over p in P
Max: d(s, P) = max d(s, p)

Then h(s) = d(s, G)

Can compute d(.,.) in polynomial time
USC INFORMATION SCIENCES INSTITUTE
5
Admissibility and information

Is h+ (additive version) admissible?
How about h-max?
USC INFORMATION SCIENCES INSTITUTE
6
Admissibility and information II

If h+ is not admissible, why would we use it rather
than h-max?
USC INFORMATION SCIENCES INSTITUTE
7
HSP algorithm overview

Hill-climbing search + restarts if plateau for too long

Some ad hoc choices for the planning competition

Hill-climbing search is not complete
USC INFORMATION SCIENCES INSTITUTE
8
HSP2 overview

Best-first search, using h+

Based on WA* - weighted A*:
f(n) = g(n) + W * h(n).
If W = 1, it’s A* (with admissible h).
If W > 1, it’s a little greedy – generally finds solutions
faster, but not optimal.

In HSP2, W = 5
USC INFORMATION SCIENCES INSTITUTE
9
Experiments

Does ok compared with IPP (Graphplan
derivative) and Blackbox.
USC INFORMATION SCIENCES INSTITUTE
10
Regression search

Motivation for HSPr




HSP and HSP2 spend up to 80% of their time computing the
evaluation function.
Slow to generate nodes compared to other heuristic search
systems.
Search backwards from goal, then re-use cost estimates
from s0 to the goal, since we always have a single start state
s0.
Common wisdom: regression planning is good because the
branching factor is much lower
USC INFORMATION SCIENCES INSTITUTE
11
HSPr problem space

States are sets of atoms (correspond to sets of states
in original space)

initial state s0 is the goal G

Goal states are those that are true in s0

Still use h+. h+(s) = sum g(s0, p)
USC INFORMATION SCIENCES INSTITUTE
12
Mutexes in HSPr

Problem: many of the regressed goal states are
‘impossible’ – prune them with mutexes

E.g in blocksworld (on(c,d), on(a,d), ..) is probably
unreachable.
USC INFORMATION SCIENCES INSTITUTE
13
Mutexes in HSPr

First definition:
A set M of pairs R = {p, q} is a mutex set if
(1) R is not true in s0
(2) for every op o that adds p,
o deletes q

Sound, but too weak.
USC INFORMATION SCIENCES INSTITUTE
14
Mutexes in HSPr, take 2

Better definition:
A set M of pairs R = {p, q} is a mutex set if
(1) R is not true in s0
(2) for every op o that adds p,
either o deletes q
or o does not add q, and for some precond r of o,
{r, q} is in M.
Recursive definition allows for some interaction of the
operators
USC INFORMATION SCIENCES INSTITUTE
15
Computing mutex sets

Start with some set of potential mutex pairs

Delete any that don’t satisfy (1) and (2) above

Keep going until you don’t delete any more

Initial set? – could be all pairs (usually too expensive)
USC INFORMATION SCIENCES INSTITUTE
16
Initial set of potential mutexes

Ma = { {p, q} | some action adds p and deletes q}

Mb = { {r, q} | {p, q} is in Ma, some action adds p,
and has r in the precondition}

Initial set = Ma u Mb

Mutex set derived from Ma u Mb is M*
USC INFORMATION SCIENCES INSTITUTE
17
HSPr algorithm



WA* search using h+(s0) and M*
W = 5 as before
Prune states that contain pairs in M*
USC INFORMATION SCIENCES INSTITUTE
18
Experiments comparing HSP2 and HSPr

Sometimes HSPr does better, sometimes HSP2 does
better. Why?

Two reasons (per B & G):
 Still have spurious states
 Since HSP2 recomputes the estimate in each
state, it actually has more information
USC INFORMATION SCIENCES INSTITUTE
19
Evidence for spurious states

Re-run HSPr using mutex set derived from all
possible pairs.

No difference in most domains

Improvement in tire-world domain (with complex
interactions)

Slows down in logistics domain
USC INFORMATION SCIENCES INSTITUTE
20
Branching factor

Varies widely from instance to instance. (Always
seems greater in forward chaining though)

Performance of HSP2 vs HSPr doesn’t seem to
correlate with branching factor
 Other factors dominate, e.g. informedness of
heuristic
USC INFORMATION SCIENCES INSTITUTE
21
Derivation of heuristics

h+ has problems when there are positive or negative
interactions

Can efficient heuristics better capture the
interactions?

H^2 – use the cost of the most expensive pair of
goals
 Still admissible, more informative than hmax, still
cheap
 Room for domain-dependent options?
USC INFORMATION SCIENCES INSTITUTE
22
Comparing HSPr and Graphplan

Both search forwards in relaxed space, then
backwards

Planning graph encodes an admissible heuristic:
hg(s) = j if j is the first level where s appears without
a mutex

Graphplan encodes IDA* efficiently as solution
extraction – but this makes it hard to use other
search algorithms.
USC INFORMATION SCIENCES INSTITUTE
23
Overall

Planning as heuristic search: HSP family are elegant,
quite efficient for domain-independent, and use clear
principles of search

Simple algorithms and relatively thorough analysis –
make it easy to consider lots of extensions
USC INFORMATION SCIENCES INSTITUTE
24
Ways to extend

Improving automatically generated heuristics

More flexible action representations


Probably easier to encode in forwards than backwards
search
Principles and format for encoding domaindependent heuristics

Both the estimate function and other control
USC INFORMATION SCIENCES INSTITUTE
25