
7/22/04 Report Back:
Performance Analysis Track
Dr. Carol Smidts
Wes Deadrick
Track Members
• Carol Smidts (UMD) – Track Chair
– Integrating Software into PRA
• Ted Bennett and Paul Wennberg (Triakis)
– Empirical Assurance of Embedded Software Using Realistic
Simulated Failure Modes
• Dolores Wallace (GSFC)
– System and Software Reliability
• Bojan Cukic (WVU)
– Compositional Approach to Formal Models
• Kalynnda Berens (GRC)
– Software Safety Assurance of Programmable Logic
– Injecting Faults for Software Error Evaluation of Flight Software
• Hany Ammar (WVU)
– Risk Assessment of Software Architectures
Agenda
• Characterization of the Field
• Problem Statement
• Benefits of Performance Analysis
• Future Directions
• Limitations
• Technology Readiness Levels
Characterization of the Field
• Goal: Prediction and Assessment of Software
Risk/Assurance Level (Mitigation optimization)
• System Characteristics of interest
– Risk (Off-nominal situations)
– Reliability, availability, maintainability = Dependability
– Failures - general sense
• Performance Analysis Techniques - modeling
and simulation, data analysis, failure analysis,
design analysis focused on criticality
Problem Statement
• Why should NASA do performance analysis? - We care
if things fail!
• Successfully conducting SW and System Performance
Analysis gives us the data necessary to make informed
decisions in order to improve performance and overall
quality
• Performance analysis permits:
– Ability to determine if/when a system meets requirements
– Risk reduction and quantification
– Application of new knowledge to future systems
– A better understanding of the processes by which systems are developed, enabling NASA to exercise continual improvement
Benefits of Performance Analysis
• Reduced development and operating costs
• Manage and optimize current processes, yielding more efficient and effective processes
– Defined and repeatable process – reduced time to do same
volume of work
• Reduces risk and increases safety and reliability
• Better software architecture designs
• More maintainable systems
• Enables NASA to handle more complex systems in the future
• Puts the responsibility where it belongs from an organizational perspective - focuses accountability
Future Directions for Performance
Analysis
• Automation of modeling and data collection –
increased efficiency and accuracy
• A more useful, better reliability model
– useful = user friendly (enabling the masses, not just the domain experts) and increased usability of the data (learn more from what we have)
– better = greater accuracy and predictability
• Define and follow repeatable methods/processes
for data collection and analysis including:
– education and training
– use of simulation
– gold nugget = accurate and complete data
Future Directions for Performance
Analysis (Cont.)
• Develop a method for establishing accurate
performance predictions earlier in life cycle
• Evolve to refine system level assessment
– factor in the human element
• Establish and define an approach to performing
trade-off of attributes – reliability, etc.
• Need for early guidance on criticality of
components
• Optimize a defect removal model
• Methods and metrics for calculating/defending
return on investment of conducting performance
analysis
Why Not?
• Standard traps - Obstacles
– Uncertainty about scalability
– User friendliness
– Lack of generality
– “Not invented here” syndrome
• Costs and benefits
– Difficult to assess and quantify
– Long term project benefit tracking
recommended
Technology Readiness Level
• Integrating Software into PRA – Taxonomy (7)
• Test-Based Approach for Integrating SW in PRA (3)
• Empirical Assurance of Embedded Software Using
Realistic Simulated Failure Modes (5)
• Maintaining system and SW test consistency (8)
• System Reliability (3)
• Software Reliability (9)
• Compositional Approach to Formal Models (2)
• Software Safety Assurance of Programmable Logic (2)
• Injecting Faults for Software Error Evaluation of Flight
Software (9)
• Risk Assessment of Software Architectures (5)
Research Project Summaries
Integrating Software Into PRA
Dr. Carol Smidts, Bin Li
Objective:
• PRA is a methodology to assess the risk of
large technological systems
• The objective of this research is to extend the current classical PRA methodology to account for the impact of software on mission risk
Integrating Software Into PRA
(Cont.)
Achievements
1. Developed a software related failure mode
taxonomy
2. Validated the taxonomy on multiple projects
(ISS, Space Shuttle, X38)
3. Proposed a step-by-step approach to integration into the classical PRA framework, with quantification of input and functional failures.
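
A minimal sketch (Python) of the kind of quantification this enables, folding software failure modes into a fault-tree top-event calculation; all probabilities and the independence assumption are illustrative, not project values:

    # Illustrative only: hypothetical basic-event probabilities.
    p_hw = 1e-4        # hardware basic event (e.g. valve fails)
    p_sw_input = 2e-5  # software input failure (bad/missing input)
    p_sw_func = 5e-5   # software functional (logic) failure

    # Software contribution: input OR functional failure,
    # assuming independent events (a simplifying assumption).
    p_sw = 1 - (1 - p_sw_input) * (1 - p_sw_func)

    # Top event: function lost if hardware OR software fails.
    p_top = 1 - (1 - p_hw) * (1 - p_sw)
    print(f"software contribution: {p_sw:.2e}")
    print(f"top event probability: {p_top:.2e}")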
Problem
Most embedded SW faults found at integration test are traceable to requirements and interface misunderstandings
[Diagram: two development loops. System loop: model, simulate, prototype the SYSTEM with executable specifications (ES); analyze/test/V&V. Software loop: interpret requirements; design/debug SW; build; integration testing; analyze/test/verify. A disconnect exists between the system and software development loops.]
Approach
• Develop & simulate entire system design using
executable specifications (ES)
• Verify total system design with suite of tests
• Simulate controller hardware
• Replace controller ES with simulated HW
running object (flight) software
• Test SW using system verification tests
When SW passes all system verification tests, it
has correctly implemented all of the tested
requirements
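
A minimal sketch (Python) of this test-reuse idea, assuming a trivial thermostat controller; the names and the test suite are hypothetical placeholders, not Triakis APIs:

    # Hypothetical example: the same system verification suite runs
    # against the executable specification (ES) and the flight SW.
    def es_controller(temp_c):
        # executable specification: heater on below 20 C
        return "HEATER_ON" if temp_c < 20.0 else "HEATER_OFF"

    def flight_controller(temp_c):
        # stand-in for object (flight) code on simulated HW
        return "HEATER_ON" if temp_c < 20.0 else "HEATER_OFF"

    VERIFICATION_TESTS = [(15.0, "HEATER_ON"), (25.0, "HEATER_OFF")]

    def run_suite(controller):
        return all(controller(t) == want for t, want in VERIFICATION_TESTS)

    assert run_suite(es_controller)      # verify total system design
    assert run_suite(flight_controller)  # same tests against flight SW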
Empirical Assurance of Embedded SW
Using Realistic Simulated Failure Modes
• Problem: FMEA Limitations
Mini-AERCam
– Expensive & time-consuming
– List of possible failure modes extensive
– Focuses on prioritized subset of failure modes
• Approach: Test SW w/sim’d Failures
– Create pure virtual simulation of Mini-AERCam
HW & flight environment running on PC
– Induce realistic component/subsystem failures
– Observe flight SW response to induced failures
Can we improve coverage by testing SW response to simulated failures?
Compare results with the project-sponsored FMEA, FTA, etc.: number of failure modes evaluated, number of issues uncovered, and effort involved.
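
A minimal sketch (Python) of inducing a simulated component failure and observing the flight software's response; the stuck-gyro failure mode and the safing check are illustrative assumptions:

    class SimulatedGyro:
        # pure virtual component; fail_stuck() induces a failure mode
        def __init__(self):
            self.stuck = None
        def fail_stuck(self, value):
            self.stuck = value
        def read(self, true_rate):
            return self.stuck if self.stuck is not None else true_rate

    def flight_sw_step(reading, last_reading):
        # hypothetical fault detection: flag a frozen sensor
        return "SAFE_MODE" if reading == last_reading else "NOMINAL"

    gyro = SimulatedGyro()
    gyro.fail_stuck(0.42)                      # induce the failure
    r1, r2 = gyro.read(0.10), gyro.read(0.12)  # both reads return 0.42
    print(flight_sw_step(r2, r1))              # observe response: SAFE_MODE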
Software and System
Reliability
Dolores Wallace, Bill Farr, Swapna Gokhale
• Addresses the need to evaluate and assess the reliability and availability of large, complex, software-intensive systems by predicting (with associated confidence intervals):
– The number of software/system faults,
– Mean time to failure and restore/repair,
– Availability,
– Estimated release time from testing.
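
As an illustration of this kind of prediction, a minimal sketch (Python) fitting the simple Goel-Okumoto NHPP model m(t) = a(1 - e^(-bt)) by maximum likelihood; SMERFS^3 implements richer models (e.g. Enhanced Schneidewind), and the failure times below are made up:

    import math

    t = [5, 12, 20, 27, 33, 40, 46, 51, 55, 58]  # failure times (hours)
    n, T = len(t), 100.0                         # observation window [0, T]

    # MLE for m(t) = a(1 - e^(-bt)): a = n / (1 - e^(-bT)),
    # and b solves g(b) = 0; solve by bisection.
    def g(b):
        return n / b - sum(t) - n * T * math.exp(-b * T) / (1 - math.exp(-b * T))

    lo, hi = 1e-6, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    b = (lo + hi) / 2
    a = n / (1 - math.exp(-b * T))

    print(f"estimated total faults a = {a:.1f}, remaining = {a - n:.1f}")
    print(f"current MTTF ~ {1 / (a * b * math.exp(-b * T)):.0f} hours")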
2003 & 2004 Research
2003 (Software Based)
• Literature search completed
• New models were selected: 1) Enhanced Schneidewind
(includes risk assessment and trade-off analysis) and 2)
Hypergeometric Model
• Incorporated the new software models into the established
public domain tool SMERFS^3
• Applied the new models on a Goddard software project
• Made the latest version of SMERFS^3 available to the
general public
2004 (System Based)
• Conducted similar research effort for System Reliability and
Availability
• Will enhance SMERFS^3 and validate the system models
on a Goddard data set
A Compositional Approach to Validation of Formal Models
of Formal Models
Dejan Desovski, Bojan Cukic
• Problem
– Significant number of faults in real systems can
be traced back to specifications.
– Current methodologies of specification
assurance have problems:
• Theorem Proving: Complex
• Model Checking: State explosion problems
• Testing: Incomplete
• Approach
– Combine them!
Software Fault Injection Process
Kalynnda Berens, Dr. John Crigler, Richard Plastow
• Standardized approach to test systems with COTS and
hardware interfaces
• Provides a roadmap of where to look to determine what
to test
[Process flowchart: Start; Obtain Source Code and Documentation; Identify Interfaces and Critical Sections; Estimate Effort Required; decision "Sufficient time and funds?" (Yes); Test Case Generation; Fault Injection Testing; Error/Fault Research; Importance Analysis; Select Subset; Feedback to FCF Project; Document Results, Metrics, Lessons Learned; End]
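
A minimal sketch (Python) of the fault-injection step on a COTS/hardware interface; the wrapped driver call and the injected fault modes are hypothetical:

    def cots_read_pressure():
        # stand-in for a COTS driver call at a hardware interface
        return 101.3

    def inject_fault(func, mode):
        # wrap an interface call and corrupt its behavior
        def wrapped(*args, **kwargs):
            if mode == "timeout":
                raise TimeoutError("injected interface timeout")
            value = func(*args, **kwargs)
            if mode == "corrupt":
                return -value  # crude value corruption
            return value
        return wrapped

    faulty_read = inject_fault(cots_read_pressure, "timeout")
    try:
        faulty_read()
    except TimeoutError as err:
        print("caller must tolerate:", err)  # observe error handling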
Programmable Logic at NASA
Kalynnda Berens, Jacqueline Somos
• Issues
– Lack of good assurance of PLCs and PLDs
– Increasing complexity = increasing problems
– Usage and Assurance Survey - SA involved in less than 1/3 of
the projects; limited knowledge
• Recommendations
– Trained SA for PLCs
– PLDs – determine what is complex; use process assurance (SA
or QA)
• Training Created
– Basic PLC and PLD training aimed at SA
– Process assurance for hardware QA
Year 2 of Research
• What are industry and other government agencies doing for assurance and verification?
– An intensive literature search of white papers, manuals,
standards, and other documents that illustrated what various
organizations were doing.
– Focused interviews with industry practitioners. Interviews were
conducted with assurance personnel (both hardware and
software) and engineering practitioners in various industries,
including biomedical, aerospace, and control systems.
– Meeting with FAA representatives. Discussions with FAA representatives led to a more thorough understanding of their approach and the pitfalls they have encountered along the way.
• Position paper, with recommendations for NASA Code Q
Current Effort
• Implement some of the recommendations
– Develop coursework to educate software and
hardware assurance engineers
– Three courses
• PLCs for Software Assurance personnel
• PLDs for Software Assurance personnel
• Process Assurance for Hardware QA
– Guidebook
• Other recommendations
– For Code Q to implement if desired
– Follow-up CSIP to try software-style assurance on
complex electronics
Severity Analysis Methodology
Hany Ammar, Katerina Goseva-Popstojanova, Ajith Guedem, Kalaivani Appukutty, Walid AbdelMoez, and Ahmad Hassan
• We have developed a methodology to assess severity of
failures of components, connectors, and scenarios based on
UML models
• This methodology has been applied to NASA's Earth Observing System (EOS)
Requirement Risk Analysis Methodology
• We have developed a methodology for assessing requirements-based risk using normalized dynamic complexity and severity of failures. It can be used in the DDP process developed at JPL.
[Figure: requirements matrix relating requirements R1 … Rm and their scenarios S1, S2, S3 to failure modes FM1, FM2, …, FMn; each cell holds a risk factor, e.g. the risk factor of scenario S1 in failure mode FM2.]
• According to Dr. Martin Feather's DDP Process, "The Requirements matrix maps the impacts of each failure mode on each requirement."
• Requirements are mapped to UML use cases and scenarios
• A failure mode refers to the way in which a scenario fails to achieve its requirement
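
A worked sketch (Python) of the risk-factor computation behind the matrix, taking the risk factor of a scenario in a failure mode as normalized dynamic complexity x severity; the values are illustrative:

    # Illustrative values: normalized dynamic complexity per scenario
    # and severity per failure mode (both in [0, 1]).
    complexity = {"S1": 0.7, "S2": 0.4, "S3": 0.9}
    severity = {"FM1": 0.25, "FM2": 0.95}

    risk = {(s, fm): c * sev
            for s, c in complexity.items()
            for fm, sev in severity.items()}

    # risk factor of scenario S1 in failure mode FM2
    print(risk[("S1", "FM2")])  # 0.665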
What to Read
• Key works in the field
• Tutorials
• Web sites
– Will be completed at a later time