Evolutionary Testing of Autonomous Software Agents Cu D. Nguyen, Anna Perini, and Paolo Tonella Simon Miles, Mark Harman, and Michael Luck Fondazione Bruno Kessler Via Sommarive, 18 38050 Trento, Italy Department of Computer Science Kingʼs College London Strand, London WC2R 2LS, UK 1 AAMAS, Budapest 2009 Outline • Introduction • Approaches • Experiment and result discussion • Conclusion 2 Introduction Autonomous software agents are increasingly used Testing to build confidence in their operations is crucial ! 3 Introduction Agent autonomy makes testing harder • Agents make decisions for themselves based on their goals, intentions, and beliefs • Can behave differently in response to the same input 4 Introduction Agent autonomy makes testing harder • Agents make decisions for themselves based on their goals, intentions, and beliefs • Can behave differently in response to the same input Autonomous agents operate in an open environment with high variety of situations4 Introduction Agent autonomy makes testing harder • Agents make decisions for themselves based on their goals, intentions, and beliefs • Can behave differently in response to the same input Testing requires: • adequate output evaluations • techniques that produce wide range of contexts & can search for the most demanding test cases Autonomous agents operate in an open environment with high variety of situations4 Background • Testing is to find faults • We focus on agent level • We evaluate the exhibited performance of autonomous agents, not the underlying autonomy mechanism 5 Our approach (1) Use stakeholder’ requirements related to quality (e.g. efficiency) to judge autonomous agents. Stakeholder Efficiency Keep the airport clean Robustness Cleaner Agent Legends softgoal Goal 6 Actor dependence Our approach (1) Use stakeholder’ requirements related to quality (e.g. efficiency) to judge autonomous agents. Represent these requirements as quality functions Assess the agents under test Drive the evolutionary generation Stakeholder Efficiency ε Keep the airport clean d>ε Robustness time time Cleaner Agent Legends softgoal Goal 6 Actor dependence Our approach (2) Use quality functions in fitness measures to drive the evolutionary generation • • Fitness of a test case tells how good the test case is Evolutionary testing searches for test cases having the best fitness values. Fitness example: distance to be crashed 7 Our approach (2) Use quality functions in fitness measures to drive the evolutionary generation • • Fitness of a test case tells how good the test case is Evolutionary testing searches for test cases having the best fitness values. Fitness example: distance to be crashed d>ε ε time 7 Our approach (3) Use statistical methods to measure test case fitness • • Test outputs of a test case can be different • Statistical output data are used to calculate the fitness A test case execution is repeated a number of times (or in parallel) 8 Evolutionary procedure Evaluation final results outputs Test execution & Monitoring Agent inputs Generation & Evolution initial test cases (random, or existing) 9 Experiments Autonomous cleaning agent ‣ explore locations of important ‣ ‣ ‣ ‣ ‣ objects look for waste and bring them to the closest bin maintain battery charge avoid obstacles by changing course when necessary find the shortest path to reach a specific location stop when no movement is possible or running out of battery 10 Experiments Autonomous cleaning agent ‣ explore locations of important ‣ ‣ ‣ ‣ ‣ objects look for waste and bring them to the closest bin maintain battery charge avoid obstacles by changing course when necessary find the shortest path to reach a specific location stop when no movement is possible or running out of battery 10 Fitness measurement Same input environment, different outputs 11 Fitness measurement Same input environment, different outputs Cumulative box-plots of the distance of executions converge Convergence of cumulative boxplot max "#(&% "#(% quartile 3 (75%) min "#'% cumulative quartile 1 (25%) "#'&% "#$&% "#$% "#"&% "% !"#"&% !"#$% $% '% (% )% 11 &% *% +% ,% -% $"% $$% $'% $(% Number of executions $)% $&% $*% $+% $,% $-% '"% '$% ''% '(% "#"&% Fitness measurement "% !"#"&% $% '% (% )% &% *% +% ,% -% $"% $$% $'% $(% $)% $&% $*% $+% $,% $-% !"#$% Fitness (b) Cumulative box-plots for TC2 Figure 6.9: Cumulative box-plots for two test cases Then, the fitness function is defined as follows: min(D) + w1 ∗ quartile1(D) + w3 ∗ quartile3(D) if min(D) > ε, f= min(D) − ε if min(D) ≤ ε, +∞ if the agent cannot move and suspend safely. quartile 3 (75%) quartile 1 (25%) ε min 124 Search objective: bringing the box down to the threshold ε ,i.e. leading the agent to hit obstacles 12 '"% Result & discussion evolutionary testing 0.11 0.1 0.09 0.08 Safety Fitness 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 10 20 30 40 50 60 70 80 90 100 110 Generation 13 Result & discussion evolutionary testing 0.11 0.1 0.09 0.08 Safety Fitness 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 10 20 30 40 50 60 70 80 90 100 110 Generation 13 Result & discussion random testing 0.11 0.11 0.1 0.1 0.09 0.09 0.08 0.08 0.07 0.07 average of fitness Safety Fitness evolutionary testing 0.06 0.05 0.06 0.05 0.04 0.04 0.03 0.03 0.02 0.02 0.01 0.01 0 0 10 20 30 40 50 60 70 80 90 100 110 10 20 30 40 50 Generation 60 70 80 90 Time • evolutionary testing found better test cases than random testing • and is more effective in detecting faults 13 100 110 Conclusion • Autonomous agent testing is hard • • Non-deterministic outputs • • Use quality requirements as evaluation criteria • • Is more effective compared to random testing Variability of the world setting • Evolutionary testing Use them to guide the evolutionary generation of test input Is cost-effective, requires almost no additional cost 14
© Copyright 2026 Paperzz