Strong-Cyclic Planning When Fairness is Not a Valid Assumption

Alberto Camacho, Sheila A. McIlraith
Department of Computer Science, University of Toronto, Canada
{acamacho,sheila}@cs.toronto.edu
KnowProS, July 10, 2016

Take Home Message

Motivation: Standard strong-cyclic solutions to Fully Observable Non-Deterministic (FOND) planning problems are guaranteed to be sound only when the fairness assumption holds.

Approach: We introduce L-fairness, a more general concept that subsumes the classical fairness assumption.

Contribution:
- The FOND+ class of planning problems, in which soundness of solutions is predicated on the L-fairness assumption.
- A class of FOND+ solutions that are also solutions to 1-primary normative fault-tolerant planning problems.
- Several algorithms for solving FOND+ problems.

Camacho and McIlraith: Strong-Cyclic Planning When Fairness is Not a Valid Assumption 2 / 21

Non-Deterministic Planning

Non-Deterministic Planning Domain D = ⟨F, S, A, T⟩
- F: finite set of propositions
- S: finite set of states, S ⊆ 2^F
- A: set of actions a = ⟨Pre_a, Eff_a⟩
    - preconditions Pre_a
    - non-deterministic effects Eff_a = ⟨Eff_a^1, ..., Eff_a^n⟩
- T : S × A → 2^S: transition function
    - if s′ ∈ T(s, a), then s′ = Prog(s, a, Eff_a^i) for some Eff_a^i ∈ Eff_a
    - we write a state transition as (s, a, s′)

In our paper, we address two classes of non-deterministic planning problems:
- Fully Observable Non-Deterministic (FOND) Planning
- Fault-Tolerant Planning

FOND Planning

FOND Planning Problem P = ⟨D, s_0, S_G⟩
- D = ⟨F, S, A, T⟩ is a non-deterministic planning domain
- s_0 ∈ S is the initial state
- S_G ⊆ S is the set of goal states

Solutions are policies, i.e. mappings from states to actions:
- weak solutions
- strong solutions
- strong-cyclic solutions

Solutions to a FOND Problem (cf. [Cimatti et al., 2003])

Weak Solutions: Weak solutions are plans that can achieve the goal, but offer no guarantee of doing so.

Strong Solutions: Strong solutions guarantee goal achievement in all executions.

Strong-Cyclic Solutions: Strong-cyclic solutions guarantee goal achievement, provided that all executions are fair.

An execution σ is unfair when a state-action pair (s, a) appears infinitely often in σ, but the transition (s, a, s′) occurs only finitely many times for some outcome s′ ∈ T(s, a). Executions that are not unfair are said to be fair.

Fault-Tolerant Planning (cf. [Jensen et al., 2004, Domshlak, 2013])

Fault-Tolerant Planning Problem P = ⟨D, s_0, S_G, F, κ⟩
- D = ⟨F, S, A, T⟩ is a non-deterministic planning domain
- s_0 ∈ S is the initial state
- S_G ⊆ S is the set of goal states
- F is an exception model
- κ is an integer parameter

The exception model is a function F : ⋃_{a∈A} Eff_a → ℕ such that:
- F(e) > 0 when the effect e is faulty
- F(e) = 0 when the effect e is normative

If |{e | F(e) = 0, e ∈ Eff_a}| = 1 for all a ∈ A, then the problem is 1-primary.

Solutions to Fault-Tolerant Planning Problems

κ-admissible Executions: A state-effect execution (s_0, e_0, ..., s_i, e_i, ...) is κ-admissible when Σ_i F(e_i) ≤ κ.

Solutions are κ-Plans: A policy is a κ-plan when all κ-admissible executions are finite and reach the goal.
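The κ-admissibility condition can be illustrated with a minimal sketch. It assumes a finite execution prefix (the definition above also covers infinite executions) and an exception model stored as a dictionary; the effect names and the helper names `fault_count` and `is_kappa_admissible` are hypothetical and not taken from the slides.

```python
from typing import Hashable, Mapping, Sequence


def fault_count(effects: Sequence[Hashable],
                exception_model: Mapping[Hashable, int]) -> int:
    """Sum F(e_i) over the effects of a finite execution prefix."""
    return sum(exception_model[e] for e in effects)


def is_kappa_admissible(effects: Sequence[Hashable],
                        exception_model: Mapping[Hashable, int],
                        kappa: int) -> bool:
    """True when the accumulated fault value does not exceed kappa."""
    return fault_count(effects, exception_model) <= kappa


# Hypothetical exception model: 0 marks a normative effect, > 0 a faulty one.
F = {"pick-ok": 0, "pick-slip": 1, "put-ok": 0}

# One faulty effect followed by two normative ones.
trace = ["pick-slip", "pick-ok", "put-ok"]

admissible_with_one_fault = is_kappa_admissible(trace, F, kappa=1)
admissible_with_no_faults = is_kappa_admissible(trace, F, kappa=0)
```

Under these assumptions the prefix is admissible for κ = 1 but not for κ = 0, since it contains exactly one faulty effect of value 1.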
Motivation

Blocksworld domain:
- Initial state: {on(A, B), ontable(B), handempty}
- Actions:
    pick-up-block(?b, ?from):
        Pre   = {handempty, on-block(?b, ?from)}
        Eff_1 = {holding(?b) ∧ ¬handempty}
        Eff_2 = {on-table(?b) ∧ ¬on-block(?b, ?from)}
    put-block-on-table(?b):
        Pre = {holding(?b)}
        Eff = {on-table(?b) ∧ ¬holding(?b)}
    put-on-block(?b1, ?b2):
        Pre = {handempty ∧ clear(?b2)}
        Eff = {on-block(?b1, ?b2) ∧ ¬handempty}
- Goal condition: {on-table(A)}

[Figure: two block-stacking policies for blocks A and B. In the first, goal achievement is predicated on fairness; in the second, goal achievement is not predicated on fairness.]

Desired Solutions

- Guarantees vs. no guarantees of occurrence: solutions should not rely on an effect whose occurrence is not guaranteed.
- Normative vs. faulty behaviour: solutions need to achieve the goal when the system manifests its normative behaviour.

Outline
1. Background in Non-Deterministic Planning
2. The Model: FOND+
3. Algorithms to solve FOND+
4. Experimental Results
5. Conclusions

L-fair Executions

For a labeling function L : S × A × S → {F, U}, we say that an execution starting in state s_0 is L-unfair when there exists a state-action pair (s, a) that appears infinitely often, and a transition (s, a, s′) with L(s, a, s′) = F that occurs only finitely many times. Executions that are not L-unfair are said to be L-fair.

Note that fairness, as defined by [Cimatti et al., 2003], is the particular case of L-fairness in which L assigns F to all transitions.
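The definition of L-fairness can be checked mechanically on executions of a restricted shape. The sketch below assumes a "lasso" execution, i.e. a finite prefix followed by a cycle that repeats forever, so that only the cycle determines what occurs infinitely often. The dictionary encodings of T and L, the state names, and the function name `is_l_fair_lasso` are illustrative assumptions, not part of the slides.

```python
from typing import Dict, List, Set, Tuple

Transition = Tuple[str, str, str]  # (s, a, s')


def is_l_fair_lasso(cycle: List[Transition],
                    T: Dict[Tuple[str, str], Set[str]],
                    L: Dict[Transition, str]) -> bool:
    """Check L-fairness of an execution that repeats `cycle` forever.

    The execution is L-unfair if some (s, a) in the cycle has an outcome
    labeled "F" that never occurs in the cycle (hence only finitely often).
    """
    taken = set(cycle)
    pairs = {(s, a) for (s, a, _) in cycle}
    for (s, a) in pairs:
        for s_next in T[(s, a)]:
            if L[(s, a, s_next)] == "F" and (s, a, s_next) not in taken:
                return False
    return True


# Hypothetical domain: action "a" in s0 either stays in s0 or reaches goal g.
T = {("s0", "a"): {"s0", "g"}}
loop_forever = [("s0", "a", "s0")]

# Classical fairness: every transition is labeled F.
all_fair = {("s0", "a", "s0"): "F", ("s0", "a", "g"): "F"}
# The goal-reaching outcome carries no guarantee of occurrence.
goal_unfair = {("s0", "a", "s0"): "F", ("s0", "a", "g"): "U"}

fair_under_classical = is_l_fair_lasso(loop_forever, T, all_fair)
fair_under_labeling = is_l_fair_lasso(loop_forever, T, goal_unfair)
```

With every transition labeled F, looping in s0 forever is unfair, so the policy still counts as a solution under the fairness assumption. Once the goal-reaching outcome is labeled U, the same infinite loop is L-fair, and a policy that allows it no longer guarantees goal achievement.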
Planning With Unfair Non-Determinism

FOND+ Planning Problem P = ⟨D, s_0, S_G, L⟩
- D = ⟨F, S, A, T⟩ is a non-deterministic planning domain
- s_0 ∈ S is the initial state
- S_G ⊆ S is a set of goal states
- L : S × A × S → {F, U} is a labeling function

Solutions: Solutions to a FOND+ problem P = ⟨D, s_0, S_G, L⟩ are policies that guarantee goal achievement, predicated on the assumption that all executions of D starting in s_0 are L-fair.

Classes of FOND+ Solutions

Strictly Fair: A solution π to a FOND+ problem is strictly fair when all transitions t produced by L-fair plan executions have L(t) = F.

Strictly Unfair: A solution π to a FOND+ problem is strictly unfair when all transitions t produced by L-fair plan executions have L(t) = U.

Mixed: A solution π to a FOND+ problem is mixed when it is neither strictly fair nor strictly unfair.

FOND+ and Fault-Tolerant Planning

Normative Solutions: A FOND+ solution π is normative when, in each state s reachable by π:
- there exists a plan execution from s that reaches the goal using only transitions t with L(t) = F, and
- exactly one outcome of applying π(s) in s yields a transition t with L(t) = F.

Normative Solutions are Fault-Tolerant: Normative solutions to a FOND+ problem P = ⟨D, s_0, S_G, L⟩ are also 1-primary normative solutions to fault-tolerant planning problems P′ = ⟨D, s_0, S_G, F, κ⟩ such that F(e) = 0 (resp. F(e) > 0) when e produces a transition (s, a, s′) with L(s, a, s′) = F (resp. L(s, a, s′) = U).

Unlike standard fault-tolerant solutions, normative FOND+ solutions are robust to any number of faults occurring during execution.
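The correspondence between the labeling L and an exception model F can be sketched as follows. This is a simplification under stated assumptions: labels are attached to effects rather than to individual transitions, every faulty effect gets the value 1 (any positive value would satisfy the condition above), and the effect identifiers and function names are hypothetical.

```python
from typing import Dict, List


def exception_model_from_labels(effect_labels: Dict[str, str]) -> Dict[str, int]:
    """Map label "F" to F(e) = 0 (normative) and label "U" to F(e) = 1 (faulty)."""
    return {e: 0 if label == "F" else 1 for e, label in effect_labels.items()}


def is_one_primary(effects_by_action: Dict[str, List[str]],
                   exception_model: Dict[str, int]) -> bool:
    """True when every action has exactly one effect e with F(e) = 0."""
    return all(
        sum(1 for e in effects if exception_model[e] == 0) == 1
        for effects in effects_by_action.values()
    )


# Hypothetical effect identifiers and labels, loosely following the
# Blocksworld actions of the Motivation slide.
effects_by_action = {
    "pick-up-block": ["pick-up-block/Eff1", "pick-up-block/Eff2"],
    "put-block-on-table": ["put-block-on-table/Eff"],
}
effect_labels = {
    "pick-up-block/Eff1": "F",
    "pick-up-block/Eff2": "U",
    "put-block-on-table/Eff": "F",
}

model = exception_model_from_labels(effect_labels)
one_primary = is_one_primary(effects_by_action, model)
```

With these labels each action has exactly one normative effect, so the derived problem is 1-primary. Labeling both effects of pick-up-block with U would violate the condition.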
Algorithm to Find Strictly Fair Solutions

For a FOND+ problem P, the algorithm consists of two steps:
1. P is relaxed into a FOND problem P′ = ⟨D′, s_0, S_G⟩. D′ is like D, but the actions applicable in a given state s are restricted to those actions a that only yield transitions (s, a, s′) labeled with L(s, a, s′) = F.
2. A sound and complete strong-cyclic FOND planner (e.g. PRP [Muise et al., 2012]) is used to search for a strong-cyclic solution to P′, which is returned as a strictly fair solution to P.

Theorem: The algorithm is sound and complete.

Algorithm to Find Strictly Unfair Solutions

For a FOND+ problem P, the algorithm consists of two steps:
1. P is relaxed into a FOND problem P′ = ⟨D′, s_0, S_G⟩. D′ is like D, but the actions applicable in a given state s are restricted to those actions a that only yield transitions (s, a, s′) labeled with L(s, a, s′) = U.
2. A sound and complete strong FOND planner (e.g. [Jaramillo et al., 2014]) is used to search for a strong solution to P′, which is returned as a strictly unfair solution to P.

Theorem: The algorithm is sound and complete.

Algorithm to Find Normative Solutions

Three basic steps (also in PRP):
- Step 1: Search for a plan in the all-outcomes determinization of the problem (i.e. ignore non-determinism).
- Step 2: Select a state that results from non-determinism, and search for a plan to the Goal or to a previously resolved state.
- Step 3: Repeat Step 2 until convergence.

[Figure: search graph over the states Init, S1, S2, S3 and Goal, with actions a1, a2 and a3.]

The difference with PRP lies in the open list of states:
- In PRP: First-In, Last-Out.
- In our algorithm: states produced by normative effects are explored first.

Theorem: The algorithm is sound and complete.
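One way to picture the modified open list is as a priority queue in which states produced by normative effects are always popped before states produced by faulty effects. The sketch below is not PRP's or the authors' implementation: the class name, the boolean `normative` flag, and the choice to keep first-in, last-out order within each group are assumptions made for illustration.

```python
import heapq
from itertools import count
from typing import List, Tuple


class NormativeFirstOpenList:
    """Open list that prefers states produced by normative effects.

    Within each group (normative or faulty), the most recently pushed
    state is popped first, mimicking a first-in, last-out discipline.
    """

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._tick = count()

    def push(self, state: str, normative: bool) -> None:
        rank = 0 if normative else 1  # normative states sort first
        # Negated insertion index: later pushes win ties within a rank.
        heapq.heappush(self._heap, (rank, -next(self._tick), state))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


open_list = NormativeFirstOpenList()
open_list.push("S1", normative=False)
open_list.push("S2", normative=True)
open_list.push("S3", normative=False)
open_list.push("S4", normative=True)

order = [open_list.pop() for _ in range(len(open_list))]
```

In this example the pop order is S4, S2, S3, S1: both normative states come out before either faulty one, and each group is emptied most-recent-first. Because the insertion indices are unique, the state names themselves are never compared by the heap.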
Objectives of the Experiments

Two main objectives:
- Test the efficiency of one of our algorithms
- Evaluate characteristics (planner run time and policy size) of normative solutions

Procedure:
- Compute normative solutions to FOND+ problems
- Compute strong-cyclic solutions to FOND problems, using the PRP planner [Muise et al., 2012]

Blocksworld Problems

Blocksworld problems from [Muise et al., 2012], with actions:
    pick-up-block(?b, ?from):
        Pre   = {handempty, on-block(?b, ?from)}
        Eff_1 = {holding(?b) ∧ ¬handempty}
        Eff_2 = {on-table(?b) ∧ ¬on-block(?b, ?from)}
    put-block-on-table(?b):
        Pre = {holding(?b)}
        Eff = {on-table(?b) ∧ ¬holding(?b)}
    put-on-block(?b1, ?b2):
        Pre   = {handempty ∧ clear(?b2)}
        Eff_1 = {on-block(?b1, ?b2) ∧ ¬handempty}
        Eff_2 = {on-table(?b1) ∧ ¬handempty}

In the FOND+ problems we consider:
- Eff_1 is a normative effect
- Eff_2 is a faulty effect

Results

             Strong-Cyclic         Normative
problem      run-time    size      run-time    size
p2           0           3         0           3
p3           0.002       5         0.016       5
p4           0.020       11        0.048       11
p5           0.070       27        0.178       27
p6           0.110       39        0.296       39
p7           0.114       32        0.270       32
p8           0.150       26        0.356       26
p9           0.278       46        0.664       46
p10          0.336       49        0.782       49
p11          0.522       120       1.936       97
p12          0.626       97        1.840       119.5
p13          0.682       57        1.810       57
p14          3.794       1117      37.10       1123
p15          1.500       278       7.814       278

Summary and Future Work

- Strong-cyclic planning does not guarantee goal achievement in problems where the fairness assumption is not valid
- We introduced L-fairness and the FOND+ model
- We identified a connection between FOND+ and 1-primary normative fault-tolerant planning
- We introduced algorithms to search for FOND+ solutions

Future Work:
- Further investigate and formalise connections between FOND+ and fault-tolerant planning
- More extensive experiments

Questions?

Code, benchmarks, and slides available soon: http://www.cs.toronto.edu/~acamacho

References

Cimatti, A., Pistore, M., Roveri, M., and Traverso, P. (2003). Weak, strong, and strong cyclic planning via symbolic model checking. Artificial Intelligence, 147:35–84.

Domshlak, C. (2013). Fault tolerant planning: Complexity and compilation.

Hertle, A., Dornhege, C., Keller, T., Mattmüller, R., Ortlieb, M., and Nebel, B. (2014). An Experimental Comparison of Classical, FOND and Probabilistic Planning. In Proc. of 37th International Conference on Artificial Intelligence (KI 2014), Prague.

Jaramillo, A. C., Fu, J., Ng, V., Bastani, F. B., and Yen, I.-L. (2014). Fast strong planning for FOND problems with multi-root directed acyclic graphs. International Journal on Artificial Intelligence Tools, 23(06):1460028.

Jensen, R. M., Veloso, M. M., and Bryant, R. E. (2004). Fault tolerant planning: Toward probabilistic uncertainty models in symbolic non-deterministic planning. pages 335–344.

Little, I. and Thiébaux, S. (2007). Probabilistic planning vs. replanning. ICAPS Workshop on IPC: Past, Present and Future.

Muise, C., McIlraith, S. A., and Beck, J. C. (2012). Improved Non-deterministic Planning by Exploiting State Relevance. In ICAPS, pages 172–180.