Partial-Order Causal-Link Planning with Preferences

Partial-Order Causal-Link Planning with
Preferences
Fabian Ginter
Master thesis at the Institute of Artificial Intelligence
March 14th, 2012
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
1 / 21
Content
• Motivation
• Task of the master thesis
• Definition of a planning problem with preferences
• Solving planning problems with preferences
– Approaches to resolve preferences
– Selection methods for preferences
• Evaluation
• Summary
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
2 / 21
Motivation
Solving problem “travel from Ulm to Munich”:
• Example for P1: 1 Task, Costs 100e, Time 3h, 0 Transfers
Ulm
Car
Munich
• Example for P2: 2 Tasks, Costs 130e, Time 2.5h, 1 Transfer
Ulm
Taxi
Station
Train
Munich
How to measure the quality of those plans?
• Typical quality measurements of a plan are costs like money,
time or number of tasks.
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
3 / 21
Motivation
We can clearly measure which plan is cheaper, faster or has more
tasks, but what about other properties?
• P1: Car is at Munich in the final state
Ulm
Car
Munich
• P2: Car is at home in the final state
Ulm
Taxi
Station
Train
Munich
→ Preferences can be used to describe properties of plans to
support each user individually
→ Preferences are state features, called Soft-Goals, that are
optional
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
4 / 21
Motivation
Features provided by preferences:
• Quality of plans can be measured by taking Soft-Goals into
account
• Soft-Goals are optional whereas “normal” goals must be
fulfilled in order to find a solution (which might be too
restrictive, since humans tend to define more goals than
possibly satisfiable)
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
5 / 21
Task of the master thesis
• Extend Panda2
– Integrate and handle Soft-Goals, which correspond to
“at-end” preferences in PDDL3.0
– Develop different strategies to handle Soft-Goals
– Implement planning graph
• Evaluation of the implementation
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
6 / 21
Planning problem with preferences
Definition of a planning problem: π = hV , A, sinit , sgoal , Prefsi
• V is a finite set of state variables
• A is
–
–
–
a finite set of actions where:
a = hpre, add, deli ∈ A where pre, add, del ⊆ V
a is applicable in a state s ∈ 2V iff pre ⊆ s
If a is applied in s, it leads to (s \ del) ∪ add
• sinit ∈ 2V and sgoal ⊆ V
Preferences: Can be arbitrary formulae over literals which can be
transformed into “Disjunctive Normal Form” (DNF)
• Prefs is a finite set of preferences
• {disjunct 1 , . . . , disjunct n } ∈ Prefs represents a formula in
DNF, where disjunct j ⊆ V , j = 1, . . . , n
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
7 / 21
Planning problem with preferences
Action sequence a is a solution iff a(sinit ) fulfills sgoal
Possible quality measurement called “Benefit”:
P
• Ben(a) =
benefit(p)
p∈Prefs
a(sinit ) fulfills p
Used quality measurement called “NetBenefit”:
P
P
• NetBen(a) =
benefit(p) −
costs(a)
p∈Prefs
a(sinit ) fulfills p
Partial-Order Causal-Link Planning with Preferences
a∈a
March 14th, 2012
8 / 21
Solving planning problems with preferences
• Based on a fringe which contains a set of partial plans
• Planselection selects a partial plan out of the fringe
• Partial plan has flaws, which need to be addressed in order to
get a plan
• Flawselection decides which flaw of the partial plan is
addressed
→ Results in new partial plans that are branched into the
fringe
Illustration of the search space:
•
•
•
•
•
•
•
•
•
•
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
9 / 21
Approaches to resolve preferences
• Developed and implemented two different strategies:
– “Split strategy”
– “No-split strategy”
• Basis for both strategies are preferences in DNF
→ Presentation of strategies on next slides
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
10 / 21
Approaches to resolve preferences — “Split strategy”
→ Basic idea: Split DNF into its disjuncts
First selection of the preference p (example):
p = ψ11 ∨ (ψ21 ∧ ψ22 )
p1 = ψ11
p2 = ψ21 ∧ ψ22
Each subsequent selection of the preference:
p1 = ψ11
p2 = ψ21 ∧ ψ22
Supporters
2
Partial-Order Causal-Link Planning with Preferences
Supporters
1
3
March 14th, 2012
11 / 21
Approaches to resolve preferences — “No-split strategy”
→ Basic idea: Stay flexible and work on different disjuncts
p = ψ11 ∨ (ψ21 ∧ ψ22 )
Supporters
2
1
3
• Literals are protected according to the used flaw selection
strategy which will be explained on the next slides
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
12 / 21
Selection methods for preferences
How do we decide which literal of which preference we protect
next?
→ Basic idea: LCFR (Least Cost Flaw Repair)
"
!#
LCFRDNF - ◦ (P) = min min
i
ϕj
◦ |supporter(P, fi (ψjk ))|
ψjk
with i = 1, . . . , |Prefs(P)| and ◦ ∈
nY X
o
,
, min
pi ∈ Prefs, pi = ϕ1 ∨ . . . ∨ ϕn , ϕj = ψj1 ∧ . . . ∧ ψjk , j = 1, . . . , n
→ 3 Strategies:
• LCFRDNF -
Q
• LCFRDNF -
P
• LCFRDNF - min
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
13 / 21
Selection methods for preferences — LCFRDNF -
Q


Y
(P) = min min  |supporter(P, fi (ψjk ))|

LCFRDNF -
Y
ϕj
i
ψjk
Example: Plan P has a preference p
p = ψ11 ∨ (ψ21 ∧ ψ22 )
Supporters
Calculated values:
2
1∗3=3
Q
→ LCFRDNF - = 2 → ϕ1 is selected
→ Literal with fewest supporters of ϕ1 is ψ11
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
14 / 21
Selection methods for preferences — LCFRDNF -
P


X
(P) = min min 
|supporter(P, fi (ψjk ))|

LCFRDNF -
X
ϕj
i
ψjk
Example: Plan P has a preference p
p = ψ11 ∨ (ψ21 ∧ ψ22 )
Supporters
Calculated values:
2
1+3=4
P
→ LCFRDNF = 2 → ϕ1 is selected
→ Literal with fewest supporters of ϕ1 is ψ11
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
15 / 21
Selection methods for preferences — LCFRDNF - min
"
!#
LCFRDNF - min(P) = min min min |supporter(P, fi (ψjk ))|
ϕj
i
ψjk
Example: Plan P has a preference p
p = ψ11 ∨ (ψ21 ∧ ψ22 )
Supporters
Calculated values:
2
1
3
→ LCFRDNF - min = 1 → ψ21 is selected.
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
16 / 21
Evaluation
Evaluation compares:
• Solve “normal” goals first vs. solve Soft-Goals first
• “Split strategy” vs. “No-split strategy”
Q
P
• LCFRDNF - min vs. LCFRDNF - vs. LCFRDNF • “A* NetBenefit” vs. “Uniform NetBenefit”
→ 24 resulting configurations
Used planning domains and problems:
• 200 randomly generated planning problems
→ 24 ∗ 200 = 4800 test cases
• 6 user assistance planning problems modeled by Claudia Bolch
→ 24 ∗ 6 = 144 test cases
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
17 / 21
Evaluation — Which kind of goals should be solved first?
Solve Soft-Goals first
Solve “normal” goals first
Random-Problems:
Random-Problems:
• 2400 tests done
• 2400 tests done
• 6 solutions found
• 2400 solutions found
→ Only 0.25% solved
User-Assistance-Problems:
→ 100% solved
User-Assistance-Problems:
• 72 tests done
• 72 tests done
• 49 solutions found
• 63 solutions found
→ It’s more important to find a solution than solving preferences
→ Therefor “normal” goals must be solved first
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
18 / 21
Evaluation — Approaches to resolve preferences
“Split strategy”
●
●
●
●
●
●
2000
2000
●
●
●
●
●
●
●
●
●
1500
●
●
●
●
●
●
●
net−benefit
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
1000
net−benefit
1500
●
●
●
●
1000
“No-split strategy”
●
●
500
500
●
●
●
●
●
●
●
●
●
●
●
A*,S,m
0
0
●
●
●
●
A*,S,*
A*,S,+
U,S,m
U,S,*
U,S,+
configuration
A*,no−S,m
A*,no−S,*
A*,no−S,+
U,no−S,m
U,no−S,*
U,no−S,+
configuration
→ “No-split strategy” dominates “Split strategy”
→ But: “Split strategy” has room for improvement
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
19 / 21
Evaluation — Selection methods for preferences
LCFRDNF -
LCFRDNF - min
●
●
●
●
●
●
P
●
●
●
●
●
●
2000
net−benefit
●
●
1000
net−benefit
●
●
●
●
●
●
1500
1500
●
●
1000
1500
2000
2000
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
A*,S,m
A*,no−S,m
U,S,m
configuration
U,no−S,m
0
●
500
500
●
0
500
●
0
net−benefit
LCFRDNF -
●
●
●
●
●
●
●
●
●
●
1000
Q
A*,S,*
A*,no−S,*
U,S,*
configuration
U,no−S,*
A*,S,+
A*,no−S,+
U,S,+
U,no−S,+
configuration
→ As expected: LCFRDNF - min is dominated by other strategies
→ Reason: LCFRDNF - min is not aware of DNF
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
20 / 21
Summary
• Planning algorithm was extended to enable the use of
preferences
– Developed and implemented different methods to resolve
preferences
– Developed and implemented different methods for
soft-flaw selection
– Evaluated combinations of different strategies
• Implemented the planning graph (used for heuristic
estimation)
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
21 / 21
Appendix — Planning algorithm
Algorithm 1: Preference-based POCL planning algorithm.
Input : The fringe Fringe = {Pinit }.
Output : A plan or fail.
1 best-nB := −∞
2 best-P := fail
3 while Fringe 6= ∅ and “no timeout” do
4
P = planSel(Fringe)
5
6
7
8
9
10
11
Fringe = Fringe \ {P}
if flawsh (P) = ∅ and netBen(P) > best-nB then
best-nB := netBen(P)
best-P := P
f = flawSel(flawsh (P) ∪ flawss (P))
Fringe = Fringe ∪ resolveFlaw(P, f )
return best-P
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
22 / 21
Appendix — Example: Planning Graph
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
23 / 21
Appendix — Semantics of preferences in PDDL3.0
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
24 / 21
Appendix — More details: “Split strategy”
First selection of the preference pi in plan P:
P : pi = ϕ1 ∨ · · · ∨ ϕn
P1 : pi0 = ϕ1
...
igno
re(p
Pn : pi0 = ϕn
i)
Pn+1 : fi ∈
/ flaws(P)
Each following selection of the simplified preference p 0 :
P : p 0 = ϕj
P1 : ψjk protected
...
Pn : ψjk protected
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
25 / 21
Appendix — Room for improvement of “Split strategy”
• Fringe contains partial plans
• “Split strategy” splits preference and branches the resulting
partial plans into the fringe
P : pi = ϕ1 ∨ · · · ∨ ϕn
P1 : pi0 = ϕ1
...
Pn : pi0 = ϕn
• Currently Panda2 is not able to differentiate between the
partial plans P1 , . . . , Pn
→ Random selection of a partial plan among P1 , . . . , Pn
→ But: Preference was selected because of a specific disjunct
ϕj thus the partial plan Pj with 1 ≤ j ≤ n should be preferred
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
26 / 21
Appendix — More details: “No-split strategy”
Branching of generated partial plans if preference p in partial plan
P is selected:
P : p = ϕ1 ∨ · · · ∨ ϕj
first
(ϕj )
fir
st(
pi )
P1 : ψjk protected . . . Pn : ψjk protected Pn+1 : fi ∈
/ flaws(Pn+1 ) Pn+2 : pi0 = pi \ ϕj
Partial-Order Causal-Link Planning with Preferences
March 14th, 2012
27 / 21

Download Report

Partial-Order Causal-Link Planning with Preferences

Paperzz.com

Your Paperzz