Pervasive Self-Regeneration through Concurrent Model-Based Execution

Brian Williams (PI)
Paul Robertson
MIT Computer Science and Artificial Intelligence Laboratory
Approved for Public Release, Distribution Unlimited

Outline
• What we are trying to do.
• Demonstration scenario and test bed.
• Implications of successful results.
• Technical approach.
• What is new.
• Anticipated challenges.
• Technology transition.
• Looking forward: next steps.

7/20/04 Approved for Public Release, Distribution Unlimited

What we are trying to do
• Why software fails:
  – Software assumptions about the environment become invalid because of changes in the environment.
  – Software changes introduce incompatibilities.
  – Software is attacked by a hostile agent.
• What can be done when software fails:
  – Recognize that a failure has occurred.
  – Diagnose what has failed, and why.
  – Find an alternative way of achieving the intended behavior.

Building upon a proven technology base
• By extending RMPL to support software failure, we can extend robustness in the face of hardware failures to robustness in the face of software failures.
• Many of the same issues pertain:
  – Detection of faulty behavior.
  – Diagnosis of the fault given faulty behavior.
  – Reconfiguration of the software to achieve intended behavior using different software components.
  – Selection among alternatives to maximize utility.

Deliverables
• Model-based programming tools
  – A language for modeling (RMPL):
    • The process in terms of desired state evolutions;
    • The components; and
    • The environment.
• Model-based executives that provide:
  – Safe optimal dispatch.
  – Redundant methods.
  – Continuous monitoring and diagnosis.
  – Regeneration and optimization.

Architectural overview
Desiderata: languages that are
• Suspicious
• Monitor intentions and plans
• Self-Adaptive
• Exploits and generates contingencies
• State and Fault Aware
• Anticipatory
  – “Model-predictive languages”
  – Plans and verifies into the future
  – Predicts future states
  – Plans contingencies
[Figure labels: Model-based Embedded Programs; S; Model; Continuous Mode/State Estimation; Continuous Reactive Commanding; Obs; Cntrl; S; Plant]

Basic assumptions for our approach
• Objectives for the RMPL language
  – Low overhead
    • The effort required of the programmers to achieve robustness should be small compared to the effort required to implement the base capability.
  – Pervasive
    • The technology should be applied not only to major components but to all components in the system.
  – Incremental
    • Increased robustness can be achieved incrementally by adding greater modeling.

Innovative claims
• Features of our approach:
  – Fault-aware processes.
  – Fault-adaptive.
  – Model-based.
  – Synthesizes a fault-adaptive process to achieve state evolutions.
  – Reasons from MODELS of correct and faulty behavior of supporting service components.
  – Constructs novel recovery actions in the face of novel faults.
• Specific approaches:
  – Dynamic selection from redundant methods.
  – Self-optimization: select the optimal candidates.
  – Continuous monitoring.
  – Incremental addition of robustness by adding monitoring procedures incrementally.
  – Synthesis of repair procedures from models.

Demonstration scenario
End to End Self-regeneration of Command & Control Systems:
• Robot must plan and execute motion to one or more targets.
• Robot must perform designated tasks at selected destinations.
• Robot utilizes various sensors and actuators in achieving its task.
• Robot utilizes various software components in interpreting its sensor data and manipulating its actuators.
• Changing environment will cause software components to fail. Failed software components will be detected and diagnosed. Alternate configurations of the software will be found that can maintain mission objectives while maximizing utility, in real time.
• Fault injection testing: We will inject faults into the test bed including (1) environment changes, (2) incompatibilities, (3) software attacks.

Test bed summary
• Robot scenario involves:
  – Path planning and execution.
  – Goal selection with risks and rewards.
  – Visual and other sensors and actuators utilized in navigation and task execution.
  – Reacts to:
    • Failures in the software.
      – Exploiting redundant methods.
      – By re-planning to maintain optimality.
    • Discoveries in the environment.
      – Obstacles.
      – Suitability of using selected sensors given terrain, lighting, etc.
    • Attack.

Rover test bed
Consists of a reconfigurable environment with one ATRV2 and three ATRV-JRs. Allows real-world testing of planning and execution software.

Rover test bed setup
• Sensors give information on motion and environment.
• Onboard PC allows for real-time computation and command processing.
[Figure labels: GPS receiver; Stereo camera; Inclinometer; Laser range scanner]
[Figure labels: Compass; Antennas for wireless LAN; Sonar sensors; Wheel encoders (odometry); Differential drive; SICK LMS 200 laser scanner; Stereo camera; Sonar control board; 802.11a wireless network adapter; Ethernet card; rFLEX controller; Firewire card; ttyR ports; Right motor; Serial port; Left motor; rFLEX screen; Sonar sensors; Onboard PC; Inclinometer]

Implications of successful results
• Robotic systems that can operate autonomously to achieve goals in a complex and changing environment.
  – Modeling environment
• Software that detects and works around “bugs” resulting from incompatible software changes.
  – Modeling software components
• Software that detects and recovers from software attacks.
  – Modeling attack scenarios
• Software that automatically improves as better software components and models are added.
• Higher level of command and control for robotic missions.

Military roles for autonomous robots
• Reconnaissance
  – Rover makes its way to designated places and reports back, for example with pictures.
• Search and Rescue
  – Rover enters a dangerous area and reports back on the presence of wounded people.
• Mapping
  – Rover explores (a building), producing a map annotated with pictures in support of ground forces.
• Munitions Delivery
  – Rover goes to designated locations and delivers munitions (like the Predator UAV).

Technical approach
• We will extend the technology developed for execution, hardware fault detection, diagnosis, and reconfiguration, successfully used in Deep Space One, so that it applies to software fault awareness and reconfiguration.
• Software components look like hardware components, but reconfiguration is less restricted.
• A greater emphasis on environment modeling is necessary because most software faults will result from environmental changes.

Systems Should be Suspicious
Traditional commanding is open loop.
Robust systems should be fully state aware, and should use specifications to dynamically monitor.
Sensor interpretation:
• Nominal:
  – Command confirmation
  – Fault detection
• Failure:
  – Fault isolation
  – Fault diagnosis
[Figure labels: Model; Continuous Estimation of Modes and State; Real Time]

Commanding Involves Repairing a Correct State
Nominal commanding is traditionally pre-determined.
Robust systems should select the best action in context.
For long-lived systems, failure is the rule:
• Must navigate around failures.
• Must operate with varying resources and capabilities.
• Should select best means amongst multiple alternatives.
[Figure labels: Model; Continuous Mode/State Estimation; Continuous Reactive Commanding]

How Do We Guide Self-Regenerative Systems? By Interacting Directly with State
• Embedded programs interact with sensors and actuators.
  – Programmers map between sensors/actuators and states.
• Model-based programs interact with state.
  – Model-based executives map between sensors/actuators and states.
[Figure labels: Embedded Program; Model-based Embedded Program; Obs; Cntrl; S; Services]

Model-based Programs Interact Directly with State
RMPL constructs:
• State assertion
• State query
• Conditional execution
• Preemption
• Iteration
• Concurrent execution
[Figure labels: Model-based Embedded Programs; State goals; State estimates; Model-based Executive; Plant Model; Mode Estimation; Reactive Commanding; Obs; Cntrl; Plant; S; ŝ. Mode estimation panel: Current Belief State, X0, X1, ..., XN-1, XN, Fault Occurs. Reactive commanding panel: X0, X1, ..., XN-1, XN, Reconfigure, First Action, least cost reachable goal state]

Model-based Autonomy Architecture
[Figure, five successive builds. Labels common to all builds: Mission Manager; Ground System; Planner/Scheduler; Real-Time Execution; RAX Manager; Planning Experts (incl. Navigation); Fault Monitors; Flight H/W. Builds 1 to 3 also show Executive and Model-based Fault Protection; Remote Agent appears from build 2 onward, and Goal States appears in build 2. Build 4 shows Procedural Executive, Model-based Fault Protection, and Low-Level Fault Protection. Build 5 shows Procedural Executive, Model-based Executive, and High-Level Fault Protection.]

Remote Agent Experiment, May 1999
May 17-18 experiment: High-level Fault Protection
• Generate plan for course correction and thrust.
• Diagnose camera as stuck on.
  – Power constraints violated, abort current plan and replan.
• Perform optical navigation.
• Perform ion propulsion thrust.
May 21 experiment: Low-level Fault Protection
• Diagnose faulty device and
  – Repair by issuing reset.
• Diagnose switch sensor failure.
  – Determine harmless, and continue plan.
• Diagnose thruster stuck closed and
  – Repair by switching to alternate method of thrusting.
• Back to back planning.
RA was a toolbox; we want a seamless language.

Technology transition
• The structure of our solution facilitates transition.
  – Implements a self-regenerative system using language + executive, as opposed to a set of regeneration utilities.
  – Easily wraps self-regeneration around existing components, through a clean separation of the “executive” and “plant.”
  – Has supported a long history of successful technology transition.
• Papers and other publications.

Looking forward: next steps
• Reasoning about complex software models.
• Coordination while preserving privacy of subsystem information.
• Extending the technology to work with complex distributed self-regenerative systems.
  – Regeneration requires peer to peer coordination of systems.
  – Subtle faults that are distributed across multiple systems.
  – Faults that are detected in systems over which we do not have direct control, so that fault resolution must be negotiated.

Appendix A

Model-based Execution Kernels as Stochastic Optimal Controllers
[Figure labels: Goal States; Model; Deductive Controller (mode estimation, mode reconfiguration); Observations(t); commands(t); s'(t); s(t); Plant; f; g]

Operators and programmers reason through systemwide interactions to:
• isolate faults
• diagnose causes
(Diagnosis)

OPSAT
Generate Most-likely Candidate Diagnoses:
• Conflict-directed A* (< 10 states visited)
Test Against Model and Observables:
• Incremental Satisfiability (avg. < 10% off ideal)
• Learn from counter examples (conflicts) to guide generation.
[Figure labels: Conflict-directed A*; Optimal feasible modes; ISAT; Checked kernel; Conflict]

Compare Most Likely Candidate to Observations
[Figure labels: Helium tank; Oxidizer tank; Fuel tank; Main Engines. Annotations: Flow1 = zero; Pressure1 = nominal; Pressure2 = nominal; Acceleration = zero]

Isolate Conflicting Modes
[Figure labels: Helium tank; Oxidizer tank; Fuel tank; Main Engines. Annotation: Flow1 = zero]
A conflict, C, is an assignment to a subset of the mode variables that is inconsistent with the model and observations.
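
The generate-and-test loop on the OPSAT and conflict slides can be illustrated with a small sketch. This is a simplified best-first enumeration that uses conflicts to skip candidates, not the conflict-directed A* or incremental satisfiability engine named on the slides. The two-valve model, the mode probabilities, and every function name are hypothetical and chosen only for illustration.

```python
import itertools
import math

# Hypothetical prior probability of each mode, per component (illustrative values).
MODES = {
    "valve1": {"open": 0.9, "stuck_closed": 0.1},
    "valve2": {"open": 0.9, "stuck_closed": 0.1},
}


def find_conflict(assignment, observations):
    """Toy stand-in for the model check (an assumption, not the real test).

    Assumed model: flow through a valve is positive iff the valve is open.
    Returns None if the assignment is consistent with the observations,
    otherwise a conflict: a partial mode assignment that contradicts them.
    """
    for valve, observed_flow in observations.items():
        predicted = "positive" if assignment[valve] == "open" else "zero"
        if predicted != observed_flow:
            return {valve: assignment[valve]}
    return None


def resolves(assignment, conflict):
    """A candidate resolves a conflict if it differs on at least one variable."""
    return any(assignment[var] != mode for var, mode in conflict.items())


def most_likely_diagnosis(modes, observations):
    """Test candidates by decreasing probability, skipping known conflicts."""
    components = sorted(modes)
    candidates = []
    for combo in itertools.product(*(modes[c].items() for c in components)):
        assignment = {c: mode for c, (mode, _) in zip(components, combo)}
        probability = math.prod(p for _, p in combo)
        candidates.append((probability, assignment))
    candidates.sort(key=lambda pair: pair[0], reverse=True)

    conflicts = []
    tested = 0
    for probability, assignment in candidates:
        if not all(resolves(assignment, c) for c in conflicts):
            continue  # contains a known conflict, so it cannot be consistent
        tested += 1
        conflict = find_conflict(assignment, observations)
        if conflict is None:
            return assignment, probability, tested
        conflicts.append(conflict)
    return None, 0.0, tested


if __name__ == "__main__":
    obs = {"valve1": "zero", "valve2": "positive"}
    print(most_likely_diagnosis(MODES, obs))
```

In this toy run the all-nominal candidate is tested first and yields a conflict on `valve1`; the next candidate that still assigns `valve1 = open` is skipped without being tested, which is the pruning effect the slides attribute to learning from conflicts.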

Generate Next Most Likely Candidate
[Figure labels: Helium tank; Oxidizer tank; Fuel tank; Main Engines. Annotation: Flow1 = zero]
Every consistent mode assignment must differ from the conflict for at least one mode variable.

Testing Candidate Detects Another Conflict
[Figure labels: Helium tank; Oxidizer tank; Fuel tank; Main Engines. Annotations: Pressure1 = nominal; Pressure2 = nominal; Acceleration = zero]

Generate Next Most Likely Candidate
[Figure labels: Helium tank; Oxidizer tank; Fuel tank; Main Engines. Annotations: Pressure1 = nominal; Pressure2 = nominal; Acceleration = zero]
Consistent mode assignment must differ from both conflicts.

Consistent: Most Likely Diagnosis
[Figure labels: Helium tank; Oxidizer tank; Fuel tank; Main Engines. Annotations: Pressure1 = nominal; Flow1 = zero; Pressure2 = nominal; Flow2 = positive; Acceleration = zero]

Operators and programmers reason through systemwide interactions to:
• isolate faults
• diagnose causes
(Diagnosis)
• repair
• reconfigure
(Configuration Management)

Conflicts Focus MR
Goal: Achieve Thrust
Find least cost modes that entail the current goal.
• A conflict, C, is an assignment to a subset of the mode variables that entails the negation of the goal.
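
Mode reconfiguration (MR) as stated above, finding least cost modes that entail the goal while conflicts rule out candidates, can be sketched in the same style. The two-engine model, the mode costs, and all names below are hypothetical; the search is a plain cost-ordered enumeration with conflict pruning, offered as an illustration rather than the executive's actual algorithm.

```python
import itertools

# Hypothetical commandable modes and their costs (illustrative values only).
MODE_COSTS = {
    "main_engine": {"off": 0, "on": 2},
    "backup_engine": {"off": 0, "on": 5},
}


def goal_conflict(assignment, failed):
    """Toy goal check for "Achieve Thrust" (an assumed model).

    Thrust is achieved iff some engine that has not failed is on.
    Returns None if the assignment entails the goal, otherwise a conflict:
    the partial assignment over working engines that entails no thrust.
    """
    working = [c for c in assignment if c not in failed]
    if any(assignment[c] == "on" for c in working):
        return None
    return {c: assignment[c] for c in working}


def contains(assignment, conflict):
    """True if the assignment agrees with the conflict on all its variables."""
    return all(assignment[var] == mode for var, mode in conflict.items())


def least_cost_configuration(mode_costs, failed):
    """Test candidates by increasing cost, skipping those with a known conflict."""
    components = sorted(mode_costs)
    candidates = []
    for combo in itertools.product(*(mode_costs[c] for c in components)):
        assignment = dict(zip(components, combo))
        cost = sum(mode_costs[c][mode] for c, mode in assignment.items())
        candidates.append((cost, assignment))
    candidates.sort(key=lambda pair: pair[0])

    conflicts = []
    tested = 0
    for cost, assignment in candidates:
        if any(contains(assignment, c) for c in conflicts):
            continue  # already known to entail the negation of the goal
        tested += 1
        conflict = goal_conflict(assignment, failed)
        if conflict is None:
            return assignment, cost, tested
        conflicts.append(conflict)
    return None, None, tested


if __name__ == "__main__":
    print(least_cost_configuration(MODE_COSTS, failed={"main_engine"}))
```

With the main engine marked as failed, the first candidate (everything off) produces the conflict `backup_engine = off`, which also eliminates the cheaper "main engine on" candidate without testing it, leaving the backup engine as the least cost configuration that achieves thrust.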

Models are Concurrent, Constraint Encodings of Partially Observable Markov Decision Processes
• One automaton for each device.
• Communication through shared variables.
Valve automaton, constraints per mode:
• vlv = open => [Outflow]S = [Inflow]S and [Outflow]R = [Inflow]R
• vlv = stuck open => [Outflow]S = [Inflow]S and [Outflow]R = [Inflow]R
• vlv = closed => Outflow = 0
• vlv = stuck closed => Outflow = 0
[Figure labels: modes Open, Closed, Stuck open, Stuck closed, Unknown; commands Open, Close; transition annotation Cost 5, Prob .9]

Operators and programmers reason through systemwide interactions to:
• isolate faults
• diagnose causes
• monitor
• confirm commands
• track goals
(Estimating Modes)

Mode Estimation
Find most likely reachable states consistent with observations.
Enumerated by decreasing probability.
Every component transitions at each time step.
[Figure labels: Left engine on; Observe “no thrust”]

Operators and programmers reason through systemwide interactions to:
• monitor
• track goals
• confirm commands
• isolate faults
• diagnose faults
(Estimating Modes)
• repair
• reconfigure
• execute
• avoid failures
• change control policies
(Controlling Modes)

Model-based Reactive Planning
1. Find least cost state that entails the current goal and is reachable from the current state.
2. Find first action that moves towards this state.
[Figure labels: Configuration goals; Mode Est.; Mode Reconf.; Reactive Planning; current state; goal state; Model; Command. Burton Model-based Executive, IJCAI97]

Anticipated Challenges
• Software is harder than hardware:
  – Hardware recovers by leveraging backups, while software leverages alternative methods.
  – Models of software are more complex to specify.
  – While hardware’s topology is static, software’s topology dynamically changes.
• Performance
  – Software systems result in larger state spaces, making real-time response a challenge.