Strong-Cyclic Planning When Fairness is Not a Valid Assumption

Alberto Camacho Sheila A. McIlraith
Department of Computer Science
University of Toronto, Canada
{acamacho,sheila}@cs.toronto.edu
KnowProS
July 10, 2016
Take Home Message
Motivation
Soundness of standard strong-cyclic solutions to Fully Observable
Non-Deterministic (FOND) planning problems is guaranteed only when
the fairness assumption holds.
Approach
We introduce L-fairness, a concept that generalizes the
classical fairness assumption.
Contribution
FOND+ class of planning problems. Soundness of solutions is
predicated on the L-fairness assumption.
Identify a class of FOND+ solutions that are also solutions to
1-primary normative fault-tolerant planning problems.
We present different algorithms to solve FOND+ problems.
Camacho and McIlraith: Strong-Cyclic Planning When Fairness is Not a Valid Assumption
Non-Deterministic Planning
Non-Deterministic Planning Domain $D = \langle F, S, A, T \rangle$
$F$: finite set of propositions
$S \subseteq 2^F$: finite set of states
$A$: set of actions $a = \langle Pre_a, Eff_a \rangle$
Preconditions $Pre_a$
Non-deterministic effects $Eff_a = \langle Eff_a^1, \ldots, Eff_a^n \rangle$
$T : S \times A \to 2^S$: transition function
If $s' \in T(s, a)$ then $s' = Prog(s, a, Eff_a^i)$ for some $Eff_a^i \in Eff_a$
We write a state transition as $(s, a, s')$
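The domain definition above can be sketched in Python. This is only an illustration under assumptions the slides do not state: each effect is taken to be a pair of add and delete sets, and Prog is taken to delete first and then add. The names `Action`, `prog` and `transitions` are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple

State = FrozenSet[str]
# Assumption: an effect is an (add set, delete set) pair of propositions.
Effect = Tuple[FrozenSet[str], FrozenSet[str]]


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]     # Pre_a
    effects: List[Effect]   # Eff_a = <Eff_a^1, ..., Eff_a^n>


def prog(s: State, eff: Effect) -> State:
    """Progress state s through one effect: delete, then add (assumed semantics)."""
    add, delete = eff
    return (s - delete) | add


def transitions(s: State, a: Action) -> Set[State]:
    """T(s, a): one successor state per non-deterministic effect of a."""
    if not a.pre <= s:
        return set()  # a is not applicable in s
    return {prog(s, eff) for eff in a.effects}


# pick-up-block(A, B), encoded from the Blocksworld example in these slides.
pick_up = Action(
    "pick-up-block(A,B)",
    frozenset({"handempty", "on-block(A,B)"}),
    [
        (frozenset({"holding(A)"}), frozenset({"handempty"})),
        (frozenset({"on-table(A)"}), frozenset({"on-block(A,B)"})),
    ],
)
s0 = frozenset({"on-block(A,B)", "on-table(B)", "handempty"})
successors = transitions(s0, pick_up)
```

Here `successors` holds two states, one per effect, which is exactly what makes the action non-deterministic.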
In our paper, we address two classes of non-deterministic planning
problems:
Fully Observable Non-Deterministic (FOND) Planning
Fault-Tolerant Planning
FOND Planning
FOND Planning Problem $P = \langle D, s_0, S_G \rangle$
$D = \langle F, S, A, T \rangle$ is a non-deterministic planning domain
$s_0 \in S$ is the initial state
$S_G \subseteq S$ is a set of goal states
Solutions are policies, or mappings from states into actions.
weak solutions
strong solutions
strong-cyclic solutions
Solutions to a FOND Problem (cf. [Cimatti et al., 2003])
Weak Solutions
Weak solutions are plans that may achieve the goal: at least one execution reaches it, but none is guaranteed to.
Strong Solutions
Strong solutions guarantee goal achievement in all executions.
Strong-Cyclic Solutions
Strong-Cyclic solutions guarantee goal achievement, provided that all
executions are fair.
An execution $\sigma$ is unfair when a state-action pair $(s, a)$ appears infinitely
often in $\sigma$, but the transition $(s, a, s')$ occurs only a finite number of times for
some outcome $s' \in T(s, a)$.
Executions that are not unfair are said to be fair.
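The three solution concepts can be told apart on an explicit state graph. The sketch below is a minimal illustration, not the symbolic procedure of [Cimatti et al., 2003]; it assumes the transition function is given as a dictionary from (state, action) pairs to successor sets, and the names `reachable` and `classify` are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

State = Hashable
Trans = Dict[Tuple[State, str], Set[State]]   # (s, a) -> T(s, a)
Policy = Dict[State, str]                     # s -> a


def reachable(policy: Policy, trans: Trans, start: State, goals: Set[State]) -> Set[State]:
    """States visited when executing the policy from `start`; goals are terminal."""
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        if s in goals or s not in policy:
            continue
        for t in trans[(s, policy[s])]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def classify(policy: Policy, trans: Trans, s0: State, goals: Set[State]) -> str:
    states = reachable(policy, trans, s0, goals)
    # Strong: backward induction over states whose outcomes are all
    # already known to reach the goal in finitely many steps.
    safe = states & goals
    changed = True
    while changed:
        changed = False
        for s in states - safe:
            if s in policy and trans[(s, policy[s])] <= safe:
                safe.add(s)
                changed = True
    if s0 in safe:
        return "strong"
    # Strong-cyclic: the goal stays reachable from every visited state.
    if all(reachable(policy, trans, s, goals) & goals for s in states):
        return "strong-cyclic"
    # Weak: at least one execution from s0 reaches the goal.
    if states & goals:
        return "weak"
    return "not a solution"


trans = {
    ("s0", "retry"): {"s0", "g"},     # may loop back to s0
    ("s0", "safe"): {"g"},            # always reaches the goal
    ("s0", "risky"): {"g", "dead"},   # may end in a dead end
}
print(classify({"s0": "retry"}, trans, "s0", {"g"}))  # strong-cyclic
print(classify({"s0": "safe"}, trans, "s0", {"g"}))   # strong
print(classify({"s0": "risky"}, trans, "s0", {"g"}))  # weak
```

The `retry` policy reaches the goal only if the looping outcome does not occur forever, which is precisely the fairness assumption.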
Fault-Tolerant Planning
Fault-Tolerant Planning Problem $P = \langle D, s_0, S_G, F, \kappa \rangle$
$D = \langle F, S, A, T \rangle$ is a non-deterministic planning domain
$s_0 \in S$ is the initial state
$S_G \subseteq S$ is a set of goal states
$F$ is an exception model
$\kappa$ is an integer parameter
$F : \bigcup_{a \in A} Eff_a \to \mathbb{N}$ is an exception model:
$F(e) > 0$ when the effect $e$ is faulty
$F(e) = 0$ when the effect $e$ is normative
If $|\{e \in Eff_a \mid F(e) = 0\}| = 1$ for all $a \in A$, then the problem is
1-primary
cf. [Jensen et al., 2004, Domshlak, 2013]
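The 1-primary condition is a simple count over each action's effects. A minimal sketch follows, assuming a hypothetical encoding in which effects are named by strings and the exception model is a dictionary; `is_one_primary` is not a function from any existing planner.

```python
from typing import Dict, List


def is_one_primary(effects_of: Dict[str, List[str]], F: Dict[str, int]) -> bool:
    """True when every action has exactly one effect e with F(e) = 0."""
    return all(
        sum(1 for e in effects if F[e] == 0) == 1
        for effects in effects_of.values()
    )


# Hypothetical domain: a1 has one normative and one faulty effect,
# a2 is deterministic with a single normative effect.
effects_of = {"a1": ["a1.e1", "a1.e2"], "a2": ["a2.e1"]}
F = {"a1.e1": 0, "a1.e2": 1, "a2.e1": 0}
print(is_one_primary(effects_of, F))  # True
```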
Solutions to Fault-Tolerant Planning Problems
κ-admissible Executions
A state-effect execution $(s_0, e_0, \ldots, s_i, e_i, \ldots)$ is $\kappa$-admissible when
$\sum_i F(e_i) \le \kappa$.
Solutions are κ-Plans
A policy is a κ-plan when all κ-admissible executions are finite and reach
the goal.
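For a finite execution, κ-admissibility amounts to summing the exception values of the effects that occurred. The sketch below assumes the execution is given as the list of its effects and that the exception model is a dictionary; both the encoding and the name `is_kappa_admissible` are illustrative.

```python
from typing import Dict, Sequence


def is_kappa_admissible(effects: Sequence[str], F: Dict[str, int], kappa: int) -> bool:
    """True when the exception values along the execution sum to at most kappa."""
    return sum(F[e] for e in effects) <= kappa


# Hypothetical exception model: "ok" is normative, "slip" is a fault of weight 1.
F = {"ok": 0, "slip": 1}
run = ["ok", "slip", "ok", "slip"]
print(is_kappa_admissible(run, F, kappa=2))  # True: two faults are tolerated
print(is_kappa_admissible(run, F, kappa=1))  # False: the run contains two faults
```

A κ-plan only has to succeed on the executions for which this check holds.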
Motivation
[Figure: block A stacked on block B.]
Blocksworld domain:
Initial state: {on-block(A, B), on-table(B), handempty}
Actions:
pick-up-block(?b,?from):
Pre = {handempty, on-block(?b,?from)}
Eff 1 = {holding(?b) ∧ ¬handempty}
Eff 2 = {on-table(?b) ∧ ¬on-block(?b,?from)}
put-block-on-table(?b)
Pre = {holding(?b)}
Eff = {on-table(?b) ∧ ¬holding(?b)}
put-on-block(?b1,?b2)
Pre = {handempty ∧ clear(?b2)}
Eff = {on-block(?b1,?b2) ∧ ¬handempty}
Goal condition: {on-table(A)}
[Figure: Blocksworld configurations of blocks A and B. Goal achievement is predicated on fairness.]
[Figure: Blocksworld configurations of blocks A and B. Goal achievement is not predicated on fairness.]
Desired Solutions
Guarantees vs. no guarantees of occurrence:
solutions should not rely on effects whose occurrence is not
guaranteed
Normative vs. faulty behaviour:
solutions need to achieve the goal when the system manifests its
normative behaviour
Outline
1 Background in Non-Deterministic Planning
2 The Model: FOND+
3 Algorithms to solve FOND+
4 Experimental Results
5 Conclusions
L-fair Executions
For a labeling function $L : S \times A \times S \to \{F, U\}$, we say that an execution
in state $s_0$ is L-unfair when there exists a state-action pair $(s, a)$ such
that
$(s, a)$ appears infinitely often, and
there exists a transition $(s, a, s')$ such that $L(s, a, s') = F$ and
$(s, a, s')$ occurs a finite number of times.
Executions that are not L-unfair are said to be L-fair.
Note that fairness, as defined by [Cimatti et al., 2003], is the particular
case of L-fairness in which $L$ assigns F to all transitions.
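L-fairness is a property of infinite executions, so it cannot be checked on an arbitrary trace. The sketch below restricts attention to executions that eventually repeat a finite loop of transitions forever, an assumption made here for illustration only. Under that assumption a pair (s, a) appears infinitely often exactly when it is in the loop, and a transition occurs infinitely often exactly when it is in the loop. The name `is_l_fair` and the dictionary encodings are hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple

State = Hashable
Transition = Tuple[State, str, State]   # (s, a, s')


def is_l_fair(
    loop: List[Transition],
    trans: Dict[Tuple[State, str], Set[State]],
    label: Dict[Transition, str],
) -> bool:
    """L-fairness of an execution that repeats `loop` forever.

    An empty loop stands for a finite execution, which is trivially L-fair.
    """
    repeated = set(loop)
    for (s, a, _) in loop:
        for t in trans[(s, a)]:
            # An F-labeled outcome of a repeated (s, a) that never recurs
            # makes the execution L-unfair.
            if label[(s, a, t)] == "F" and (s, a, t) not in repeated:
                return False
    return True


trans = {("s0", "a"): {"s0", "g"}}
loop = [("s0", "a", "s0")]   # action a keeps returning to s0 forever

all_fair = {("s0", "a", "s0"): "F", ("s0", "a", "g"): "F"}
goal_unfair = {("s0", "a", "s0"): "F", ("s0", "a", "g"): "U"}
print(is_l_fair(loop, trans, all_fair))     # False: the F outcome g never occurs
print(is_l_fair(loop, trans, goal_unfair))  # True: nothing forces outcome g
```

With `goal_unfair`, the endless loop is an L-fair execution that never reaches the goal, so a policy relying on that outcome is not a FOND+ solution.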
Planning With Unfair Non-Determinism
FOND+ Planning Problem $P = \langle D, s_0, S_G, L \rangle$
$D = \langle F, S, A, T \rangle$ is a non-deterministic planning domain
$s_0 \in S$ is the initial state
$S_G \subseteq S$ is a set of goal states
$L : S \times A \times S \to \{F, U\}$ is a labeling function
Solutions
Solutions to a FOND+ problem $P = \langle D, s_0, S_G, L \rangle$ are policies that
guarantee goal achievement, predicated on the assumption that all
executions of $D$ in $s_0$ are L-fair.
Classes of FOND+ Solutions
Strictly Fair
A solution π to a FOND+ problem is strictly fair when all transitions t
produced by L-fair plan executions have L(t) = F.
Strictly Unfair
A solution π to a FOND+ problem is strictly unfair when all transitions
t produced by L-fair plan executions have L(t) = U.
Mixed
A solution π to a FOND+ problem is mixed when it is neither strictly fair
nor strictly unfair.
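The three classes can be read off the labels of the transitions a solution may produce. The sketch below assumes that the policy is already known to be a FOND+ solution and is defined on every reachable non-goal state, and it approximates "transitions produced by L-fair plan executions" by all transitions reachable under the policy from the initial state. Both points are assumptions of this illustration, and `solution_class` is a hypothetical name.

```python
from typing import Dict, Hashable, Set, Tuple

State = Hashable


def solution_class(
    policy: Dict[State, str],
    trans: Dict[Tuple[State, str], Set[State]],
    label: Dict[Tuple[State, str, State], str],
    s0: State,
    goals: Set[State],
) -> str:
    """Classify a FOND+ solution by the labels of its reachable transitions."""
    labels: Set[str] = set()
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        if s in goals:
            continue  # executions stop at the goal
        a = policy[s]
        for t in trans[(s, a)]:
            labels.add(label[(s, a, t)])
            if t not in seen:
                seen.add(t)
                stack.append(t)
    if labels <= {"F"}:
        return "strictly fair"
    if labels <= {"U"}:
        return "strictly unfair"
    return "mixed"


trans = {("s0", "a"): {"s0", "g"}, ("s0", "b"): {"g"}}
label = {("s0", "a", "s0"): "F", ("s0", "a", "g"): "F", ("s0", "b", "g"): "U"}
print(solution_class({"s0": "a"}, trans, label, "s0", {"g"}))  # strictly fair
print(solution_class({"s0": "b"}, trans, label, "s0", {"g"}))  # strictly unfair
```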
FOND+ and Fault-Tolerant Planning
Normative Solutions
A FOND+ solution π is normative when, in each state s reachable by π:
there exists a plan execution in s that reaches the goal and such
that all transitions t have L(t) = F, and
exactly one outcome of π(s) in s produces a transition t with
L(t) = F.
Normative Solutions are Fault-Tolerant
Normative solutions to a FOND+ problem $P = \langle D, s_0, S_G, L \rangle$ are also
1-primary normative solutions to fault-tolerant planning problems
$P' = \langle D, s_0, S_G, F, \kappa \rangle$ such that $F(e) = 0$ (resp. $F(e) > 0$) when $e$ produces
a transition $(s, a, s')$ with $L(s, a, s') = F$ (resp. $L(s, a, s') = U$).
Normative FOND+ solutions are robust to any number of faults
during execution, unlike standard fault-tolerant solutions.
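The two conditions that make a solution normative can be checked together: if each reachable state has exactly one F-labeled outcome, the only candidate all-F execution is the chain of those outcomes, and it must reach the goal without looping. The sketch below assumes the policy is already a FOND+ solution defined on every reachable non-goal state; `is_normative` and the dictionary encodings are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

State = Hashable


def is_normative(
    policy: Dict[State, str],
    trans: Dict[Tuple[State, str], Set[State]],
    label: Dict[Tuple[State, str, State], str],
    s0: State,
    goals: Set[State],
) -> bool:
    # Collect the states reachable by the policy from s0.
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        if s in goals:
            continue
        for t in trans[(s, policy[s])]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    # From every reachable state, follow the unique F-labeled outcome.
    for start in seen - goals:
        cur, visited = start, set()
        while cur not in goals:
            if cur in visited:
                return False  # the all-F execution loops and misses the goal
            visited.add(cur)
            a = policy[cur]
            fair = [t for t in trans[(cur, a)] if label[(cur, a, t)] == "F"]
            if len(fair) != 1:
                return False  # zero or several F-labeled outcomes
            cur = fair[0]
    return True


trans = {("s0", "a"): {"s1", "s0"}, ("s1", "b"): {"g", "s0"}}
label = {
    ("s0", "a", "s1"): "F", ("s0", "a", "s0"): "U",
    ("s1", "b", "g"): "F", ("s1", "b", "s0"): "U",
}
print(is_normative({"s0": "a", "s1": "b"}, trans, label, "s0", {"g"}))  # True
```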
Algorithm to Find Strictly Fair Solutions
For a FOND+ problem P, the algorithm consists of two steps:
Step 1: P is relaxed into a FOND problem $P' = \langle D', s_0, S_G \rangle$.
D' is like D, but the actions applicable in a given state s are
restricted to those actions a that only yield transitions (s, a, s') labeled
with L(s, a, s') = F.
Step 2: A sound and complete strong-cyclic FOND planner, e.g. PRP
[Muise et al., 2012], is used to search for a strong-cyclic solution to
P', which is returned as a strictly fair solution to P.
Theorem
The algorithm is sound and complete.
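The first step of this algorithm is a filter on applicable actions. The sketch below works on an explicit transition table, which is an assumption made for illustration (a planner such as PRP works on a compact domain description, and no PRP interface is shown here). The name `restrict` is hypothetical; passing "U" instead of "F" gives the analogous restriction for strictly unfair solutions.

```python
from typing import Dict, Hashable, Set, Tuple

State = Hashable
Trans = Dict[Tuple[State, str], Set[State]]
Label = Dict[Tuple[State, str, State], str]


def restrict(trans: Trans, label: Label, keep: str) -> Trans:
    """Keep (s, a) only when every transition (s, a, s') is labeled `keep`."""
    return {
        (s, a): succs
        for (s, a), succs in trans.items()
        if all(label[(s, a, t)] == keep for t in succs)
    }


trans = {
    ("s", "a"): {"s", "g"},   # both outcomes labeled F
    ("s", "b"): {"g"},        # single outcome labeled U
    ("s", "c"): {"g", "d"},   # mixed labels: dropped by both restrictions
}
label = {
    ("s", "a", "s"): "F", ("s", "a", "g"): "F",
    ("s", "b", "g"): "U",
    ("s", "c", "g"): "F", ("s", "c", "d"): "U",
}
fair_domain = restrict(trans, label, "F")     # input for a strong-cyclic planner
unfair_domain = restrict(trans, label, "U")   # input for a strong planner
```

The restricted table is then handed to an off-the-shelf FOND planner, which never sees the labels.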
Algorithm to Find Strictly Unfair Solutions
For a FOND+ problem P, the algorithm consists of two steps:
Step 1: P is relaxed into a FOND problem $P' = \langle D', s_0, S_G \rangle$.
D' is like D, but the actions applicable in a given state s are
restricted to those actions a that only yield transitions (s, a, s') labeled with
L(s, a, s') = U.
Step 2: A sound and complete strong FOND planner, e.g.
[Jaramillo et al., 2014], is used to search for a strong solution to
P', which is returned as a strictly unfair solution to P.
Theorem
The algorithm is sound and complete.
Algorithm to Find Normative Solutions
Three basic steps (also in PRP):
Step 1: Search for a plan in the all-outcomes determinization of the problem
(i.e. ignore non-determinism).
Step 2: Select a state that results from non-determinism, and search for a plan to the
Goal or to a previously resolved state.
Step 3: Repeat Step 2 until convergence.
[Figure: a plan Init, a1, S1, a2, S2, a3, Goal found in Step 1, with "?" marking unexplored non-deterministic outcomes; in Step 2 one of them is expanded into state S3.]
The difference from PRP lies in the open list of states.
In PRP: First-In, Last-Out
In our algorithm: states produced by normative effects are
explored first.
Theorem
The algorithm is sound and complete.
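The change to the open list can be pictured as two stacks, with the normative one always served first. This is a sketch of the idea only, not code from PRP or from the authors' planner; the class name `NormativeFirstOpenList` is hypothetical, and keeping First-In, Last-Out order inside each group is an assumption.

```python
from typing import Hashable, List


class NormativeFirstOpenList:
    """Open list that pops states produced by normative effects first."""

    def __init__(self) -> None:
        self._normative: List[Hashable] = []
        self._faulty: List[Hashable] = []

    def push(self, state: Hashable, normative: bool) -> None:
        (self._normative if normative else self._faulty).append(state)

    def pop(self) -> Hashable:
        # Serve normative states while any remain; each group is a stack.
        if self._normative:
            return self._normative.pop()
        return self._faulty.pop()

    def __bool__(self) -> bool:
        return bool(self._normative or self._faulty)


open_list = NormativeFirstOpenList()
open_list.push("x", normative=False)
open_list.push("y", normative=True)
open_list.push("z", normative=False)
order = []
while open_list:
    order.append(open_list.pop())
print(order)  # ['y', 'z', 'x']
```

A plain stack would have returned z, y, x; the only difference is that the normative state y jumps the queue.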
Objectives of the Experiments
Two Main Objectives:
Test the efficiency of one of our algorithms
Evaluate characteristics (planner run time and policy size) of
normative solutions
Procedure:
Compute Normative solutions to FOND+ problems
Compute Strong-Cyclic solutions to FOND problems, using PRP
planner [Muise et al., 2012]
Blocksworld Problems
Blocksworld problems from [Muise et al., 2012], with actions:
pick-up-block(?b,?from):
Pre = {handempty, on-block(?b,?from)}
Eff 1 = {holding(?b) ∧ ¬handempty}
Eff 2 = {on-table(?b) ∧ ¬on-block(?b,?from)}
put-block-on-table(?b)
Pre = {holding(?b)}
Eff = {on-table(?b) ∧ ¬holding(?b)}
put-on-block(?b1,?b2)
Pre = {handempty ∧ clear(?b2)}
Eff 1 = {on-block(?b1,?b2) ∧ ¬handempty}
Eff 2 = {on-table(?b1) ∧ ¬handempty}
In FOND+ problems we consider:
Eff 1 is a normative effect
Eff 2 is a faulty effect
Results
           Strong-Cyclic        Normative
problem    run-time    size     run-time    size
p2         0           3        0           3
p3         0.002       5        0.016       5
p4         0.020       11       0.048       11
p5         0.070       27       0.178       27
p6         0.110       39       0.296       39
p7         0.114       32       0.270       32
p8         0.150       26       0.356       26
p9         0.278       46       0.664       46
p10        0.336       49       0.782       49
p11        0.522       120      1.936       97
p12        0.626       97       1.840       119.5
p13        0.682       57       1.810       57
p14        3.794       1117     37.10       1123
p15        1.500       278      7.814       278
Summary and Future Work
Strong-cyclic planning does not guarantee goal achievement in
problems where the fairness assumption is not valid
We introduced L-fairness and the FOND+ model
We identified a connection between FOND+ and 1-primary normative
fault-tolerant planning
We introduced algorithms to search for FOND+ solutions
Future Work:
Further investigate and formalise connections between FOND+ and
fault-tolerant planning
More extensive experiments
Questions?
code, benchmarks, and slides available soon:
http://www.cs.toronto.edu/~acamacho
References
Cimatti, A., Pistore, M., Roveri, M., and Traverso, P. (2003).
Weak, strong, and strong cyclic planning via symbolic model checking.
Artificial Intelligence, 147:35–84.
Domshlak, C. (2013).
Fault tolerant planning: Complexity and compilation.
Hertle, A., Dornhege, C., Keller, T., Mattmüller, R., Ortlieb, M., and Nebel, B. (2014).
An Experimental Comparison of Classical, FOND and Probabilistic Planning.
In Proc. of 37th International Conference on Artificial Intelligence (KI 2014), Prague.
Jaramillo, A. C., Fu, J., Ng, V., Bastani, F. B., and Yen, I.-L. (2014).
Fast strong planning for FOND problems with multi-root directed acyclic graphs.
International Journal on Artificial Intelligence Tools, 23(06):1460028.
Jensen, R. M., Veloso, M. M., and Bryant, R. E. (2004).
Fault tolerant planning: Toward probabilistic uncertainty models in symbolic
non-deterministic planning.
pages 335–344.
Little, I. and Thiébaux, S. (2007).
Probabilistic planning vs. replanning.
ICAPS Workshop on IPC: Past, Present and Future.
Muise, C., McIlraith, S. A., and Beck, J. C. (2012).
Improved Non-deterministic Planning by Exploiting State Relevance.
In ICAPS, pages 172–180.