Safety Assessment Techniques

Development of safety critical CPS: some applicable dependability concepts & safety assessment techniques from the aeronautic domain
Christel Seguin
ONERA/DCSD
Lecture scope: Cyber Physical Systems & Dependability
• CPS: "complex engineering systems that rely on the integration of physical, computation, and communication processes to function"
• Holistic view addressed in the lecture
Direction - Conférence
Presentation scope: Cyber Physical Systems & Dependability
• Dependability concepts [Avizienis-al2004]: "ability to deliver service that can justifiably be trusted"
• It encompasses cyber security and safety
Presentation scope: Cyber Physical Systems & Dependability
• Dependability practices are application dependent => lessons learnt from aeronautics
  • Certification process
  • Safety assessment methods & tools
General dependability concepts
Generic system definition
• System = a set of interacting items, forming an integrated whole
• Examples of various complexity: air traffic control, aircraft + pilot, flight-control system, computers, sensors, actuators ...
  • aircraft: A380, Rafale, B787
  • aircraft systems: flight control, hydraulic, electrical, flight warning, …
  • equipment: flight control computers, flight warning computers, …
A leading « simple » example: A320 hydraulic system
• Architecture overview:
  • About 20 components of 8 classes: reservoir, pumps, pipes, valves, controllers
  • Safety barriers: 3 redundant independent lines, valves for load management and fault containment
[Figure: architecture of the three hydraulic lines (green, yellow, blue). Component labels: eng1 (Engine #1), eng2, elec1 (from electrical system side #1), elec2, rsvg, rsvy, rsvb, EDPg, EDPy, EMPy, EMPb, RAT, PTU, PVg, PVy, PVb, Pdistg, Pdisty, Pdistb, NPdistg, NPdisty, NPdistb. Legend: rsv = Reservoir, EDP = Engine Driven Pump, EMP = Electrical Motor Pump, RAT = Ram Air Turbine, PTU = Power Transfer Unit, PV = Priority Valve, Pdist = Priority distribution, NPdist = Non-Priority distribution; suffixes g, y, b = green, yellow, blue.]
A more complex example: Remotely Piloted Aircraft
• RPAS: a challenging system mixing organizational, human and technical concerns
[Figure: RPAS overview. Labels: UAV, Avionics, Perception, Control, Communication, On ground Pilot, Air Traffic Control, Other users]
System from dependability perspective
Failure: deviation of the service provided by the considered system with respect to the expectations.
Failure rate: the probability of failure per unit of time of items in operation.
Failure mode: way in which a failure appears (e.g. fail-silent, erroneous value, …).
Fault: cause of a potential failure.
Undesirable event: any adverse event or situation that could be due to the considered system and its potential failures. Also called Failure Condition.
[Figure: bathtub curve of the failure rate]
Hydraulic example
• Nominal function: hydraulic power delivery
• Failure: no delivery of hydraulic power.
• Expected failure rate of "no delivery of hydraulic power" is less than 10^-9 per flight hour.
• Failure modes:
  • Total loss of delivery of hydraulic power (loss of the three lines)
  • Partial loss of delivery of hydraulic power (loss of one line)
  • Loss of delivery of hydraulic power on one pipe
  • Loss of valve control
  • Spurious (untimely) valve closure
• Faults
  • For pipe loss:
    • Primary (intrinsic) cause: pipe wear
    • Secondary (extrinsic) cause: pipe exposed to fluid at too high a pressure
  • For spurious valve closure:
    • Controller design error
Dependability properties
• Dependability = "ability to deliver service that can justifiably be trusted" [Avizienis-al2004]
• It encompasses:
  • Reliability: continuity of correct service
  • Availability: readiness for correct service
  • Maintainability: ability to undergo modifications and repair
  • Safety: absence of catastrophic consequences on humans, equipment and the environment
  • Confidentiality: absence of unauthorized disclosure of information
  • ...
Dependability measures - 1
• Reminder:
  • Reliability: continuity of correct service
  • Availability: readiness for correct service
  • Maintainability: ability to undergo modifications and repair
• Mathematical definitions for a system S
  • R(t) = Prob(S non faulty during [0, t])
    function decreasing from 1 to 0 for t in [0, +∞[
  • A(t) = Prob(S non faulty at t)
    if the system cannot be repaired, R(t) = A(t)
  • M(t) = 1 - Prob(S non repaired during [0, t])
    function increasing from 0 to 1 for t in [0, +∞[
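The definitions above do not fix any probability law. As a minimal sketch, assuming constant failure and repair rates (the exponential law, an assumption not stated on the slide), R(t) and M(t) take the closed forms below. The rate values `LAMBDA` and `MU` are hypothetical and chosen only for illustration.

```python
import math

def reliability(t, failure_rate):
    """R(t) = exp(-lambda * t): probability of no failure during [0, t]."""
    return math.exp(-failure_rate * t)

def maintainability(t, repair_rate):
    """M(t) = 1 - exp(-mu * t): probability that the repair is done within [0, t]."""
    return 1.0 - math.exp(-repair_rate * t)

def steady_state_availability(failure_rate, repair_rate):
    """Limit of A(t) for large t for a repairable item: mu / (lambda + mu)."""
    return repair_rate / (failure_rate + repair_rate)

LAMBDA = 1e-4  # hypothetical failure rate, per hour
MU = 0.5       # hypothetical repair rate, per hour

for t in (0.0, 1_000.0, 10_000.0):
    print(f"t={t:>8.0f} h  R(t)={reliability(t, LAMBDA):.4f}  "
          f"M(t)={maintainability(t, MU):.4f}")
print(f"steady-state availability = {steady_state_availability(LAMBDA, MU):.6f}")
```

Under these assumptions R decreases from 1 towards 0 and M increases from 0 towards 1, as the slide states for the general case.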
Some other reliability attributes
• Development Assurance Level (DAL):
  • Level of care taken in the specification and the design of one item
  • Qualitative indicator: A, B, C, D, E
Standard safety assessment process for systems of civil aircraft
Overview of means to build safety
Source: Dassault Aviation
Processes: certification / safety assessment / V&V
Safety assessment / safety analysis process (ARP4754)
Safety Plan / Safety Program Plan
• Certification:
  • Certification preparation process
    - Aircraft airworthiness requirement assignment
    - CRI for novelties and associated means of compliance
    - Certification plan
  • Certification readiness
  • Final certification
  • Acceptable means of compliance deliverables: safety dossier (safety synthesis / safety case, FHA, SSA, ASA, PRA, etc.)
• Function and implementation oriented safety assessment:
  • Functional Hazard Assessment (FHA): Failure Condition identification and allocation of safety objectives (qualitative and quantitative). Aircraft level FHA and system level FHA
  • Multi system and system safety assessment
    - Multi systems (aircraft level): PASA/ASA (Preliminary and final Aircraft Safety Assessment)
    - System level: PSSA/SSA (Preliminary and final System Safety Assessment)
  • Particular Risk Analysis (PRA): e.g. engine burst, bird impact
  • Human Error Analysis (HEA): crew error, maintenance error
  • Common Cause Analysis (CCA)
  • Common Mode Analysis (CMA)
  • Zonal Safety Analysis (ZSA): installation review
• Safety Validation/Verification and safety assurance process
Legend: certification activity; safety assessment/analysis activity; V/V and assurance process activities
Parts addressed in the lecture
[Figure: the ARP4754 process diagram of the previous slide, repeated]
Safety requirements for aircraft systems - 1
• Another general definition of dependability: "ability to avoid service failures that are more frequent and more severe than is acceptable"
• Meaning of severe? frequent? acceptable? It depends on the kind of system!
Safety requirements for aircraft systems - 2
• Interpretation of the definition when considering safety of civil aircraft: kind of service failures = Failure Condition (FC) =
  • a condition with an effect on the aircraft and its occupants, both direct and consequential,
  • caused or contributed to by one or more failures,
  • considering relevant adverse operational or environmental conditions.
• In terms of commercial airplane airworthiness, a Failure Condition is classified in accordance with the severity of its effects as defined in FAA AC 25.1309-1A or JAR AMJ 25.1309.
Classification table for the failure conditions of systems in civil aircraft
(for each severity class: effects description, DAL, acceptable frequency of FC)
• catastrophic
  effects: prevent continued safe flight and landing: aircraft loss and loss of crew and passengers
  DAL: A
  acceptable frequency: FC occurrence < 10^-9 per flight hour + no single failure leads to the FC
• hazardous
  effects: large reduction in safety margins or functional capabilities, or physical distress or high crew workload, or serious or fatal injuries to a relatively small number of passengers
  DAL: B
  acceptable frequency: < 10^-7 per flight hour
• major
  effects: significant reduction in safety margin or functional capabilities, or significant increase in crew workload, or discomfort to occupants possibly including injuries
  DAL: C
  acceptable frequency: < 10^-5 per flight hour
• minor
  effects: no significant reduction in aircraft safety; may include slight reduction in safety margin or functional capabilities, slight increase in crew workload, some inconvenience to the occupants
  DAL: D
  acceptable frequency: no objectives
• no safety effect
  DAL: E
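The table can be read as a lookup from severity class to objectives. The sketch below transcribes it into a Python dictionary and adds a hypothetical helper, `meets_objective`, which is not part of any standard: it only checks an estimated probability and the order of the smallest cut set against the table.

```python
# Severity classes transcribed from the table above.
# max_rate is per flight hour; None means no quantitative objective is given.
CLASSIFICATION = {
    "catastrophic":     {"dal": "A", "max_rate": 1e-9, "no_single_failure": True},
    "hazardous":        {"dal": "B", "max_rate": 1e-7, "no_single_failure": False},
    "major":            {"dal": "C", "max_rate": 1e-5, "no_single_failure": False},
    "minor":            {"dal": "D", "max_rate": None, "no_single_failure": False},
    "no safety effect": {"dal": "E", "max_rate": None, "no_single_failure": False},
}

def meets_objective(severity, estimated_rate, smallest_cut_order):
    """Hypothetical check of one failure condition against the table.

    estimated_rate: estimated probability of the FC per flight hour.
    smallest_cut_order: number of failures in the smallest combination
    that leads to the FC (1 means a single failure is enough).
    """
    rule = CLASSIFICATION[severity]
    if rule["max_rate"] is not None and estimated_rate >= rule["max_rate"]:
        return False
    if rule["no_single_failure"] and smallest_cut_order < 2:
        return False
    return True

print(meets_objective("catastrophic", 5e-10, 3))
print(meets_objective("catastrophic", 5e-10, 1))
```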
Hazard Classification for CPS?
• Classification rules such as the ones used for civil aircraft are a must
• They depend strongly on the concept of operation of the system, and they do not exist for the newest systems
  • E.g.: there is not yet a predefined classification of failure conditions for remotely piloted aircraft
    • A UAV crash on an area without population is not catastrophic
    • The severity also depends on the aircraft energy (weight, speed, ...)
• More research is required to help define safety assessment procedures proportionate to the operational risk
  • Cf. for instance http://easa.europa.eu/newsroom-andevents/news/easa-presents-new-regulatory-approach-remotelypiloted-aircraft-rpas
Example of FC and related safety requirements
"Total loss of hydraulic power is classified Catastrophic, the probability rate of this failure condition shall be less than 10^-9/FH. No single event shall lead to this failure condition."
• In practice, how to find all meaningful safety requirements for aircraft items (functions, systems, equipment)?
FHA - Principles
• Functional Hazard Assessment (FHA): a systematic, comprehensive examination of functions to identify and classify FCs of those functions according to their severity
• Process:
  1. identification of all the functions associated with the system under study (internal functions and exchanged functions)
  2. identification and description of FCs associated with these functions, considering single and multiple failures in normal and degraded environments
  3. determination of the effects of the FC
  4. classification of FC effects on the aircraft (cat, haz, maj, min, no safety effect)
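The four steps above produce one table row per failure condition. A minimal sketch of such a row as a Python data structure follows; the class and field names are illustrative and are not taken from ARP 4761. The "catastrophic" classification of the total loss comes from the lecture, while the severity given to the partial loss is a hypothetical placeholder.

```python
from dataclasses import dataclass

# Severity classes, ordered from most to least severe.
SEVERITIES = ("catastrophic", "hazardous", "major", "minor", "no safety effect")

@dataclass(frozen=True)
class FailureCondition:
    function: str     # step 1: function under study
    description: str  # step 2: failure condition of that function
    effect: str       # step 3: effect of the failure condition
    severity: str     # step 4: classification of the effect

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity class: {self.severity}")

def worst_severity(rows):
    """Most severe class found among the rows of an FHA table."""
    return min((row.severity for row in rows), key=SEVERITIES.index)

FHA_TABLE = [
    FailureCondition("hydraulic power delivery", "total loss of hydraulic power",
                     "loss of the three lines", "catastrophic"),
    # Hypothetical severity, for illustration only.
    FailureCondition("hydraulic power delivery", "partial loss of hydraulic power",
                     "loss of one line", "major"),
]

print(worst_severity(FHA_TABLE))
```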
FHA – Model of FHA table extracted from ARP 4761
FHA and CPS
• FHA input is the overall functional system breakdown
  • It can be applied to CPS
• However, for highly integrated systems like CPS, it is also meaningful to review combinations of multiple failures or integration / control failures
  • Cf. STAMP http://sunnyday.mit.edu/STAMP-publications.html
  • Cf. Bowtie methods http://www.caa.co.uk/default.aspx?catid=2786&pagetype=90
PASA / PSSA – ASA/SSA
• A Preliminary Aircraft / System Safety Assessment (PASA/PSSA):
  • systematic examination of a proposed architecture(s) to determine how failures could cause the Failure Conditions identified by the FHA.
  • Outputs:
    • consolidation of the safety requirements allocation, or
    • need for alternative protective strategies (e.g., partitioning, built-in-test, monitoring, independence and safety maintenance task intervals, etc.).
• An Aircraft / System Safety Assessment (ASA/SSA):
  • systematic, comprehensive evaluation of the implemented aircraft and system(s) to show that relevant safety requirements are satisfied.
  • Outputs: judgement on the compliance of the design with the safety requirements as defined in the PASA and PSSA.
How to perform safety assessment? Models and tools for failure propagation analysis are defined in ARP 4761.
Methods for failure propagation analysis
Analysis of failure propagation: FMEA principles
• Failure Modes and Effects Analysis (FMEA): inductive analysis of local and global effects of all component failures
• Example: effect of a leakage in the green reservoir
  • Local effect: loss of fluid
  • Global effect: loss of green power
[Figure: simplified hydraulic architecture. Labels: eng1, EDPg, distg, rsvg, EMPg, elec, eng2, rsvy, EDPy, disty, RAT, distb, rsvb, EMPb, elec]
FMEA of the hydraulic example
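The inductive reasoning of an FMEA can be sketched in Python: inject each single failure in turn and record its local and global effects. The model below is a hypothetical simplification, not the lecture's model: each line is reduced to a series chain of reservoir, pump and distribution, and the second pumps, PTU, RAT and valves are ignored.

```python
# Hypothetical simplified model: any failed item stops its whole line.
LINES = {
    "green":  ["rsvg", "EDPg", "distg"],
    "yellow": ["rsvy", "EDPy", "disty"],
    "blue":   ["rsvb", "EMPb", "distb"],
}

def lost_lines(failed):
    """Names of the lines that no longer deliver power given the failed items."""
    return [name for name, items in LINES.items()
            if any(item in failed for item in items)]

def fmea_rows():
    """One row per single failure, with its local and global effects."""
    rows = []
    for items in LINES.values():
        for item in items:
            lost = lost_lines({item})
            rows.append({
                "item": item,
                "local_effect": f"loss of {item} output",
                "global_effect": "loss of " + ", ".join(lost) + " power",
                "total_loss": len(lost) == len(LINES),
            })
    return rows

for row in fmea_rows():
    print(row["item"], "|", row["local_effect"], "|", row["global_effect"])
```

In this simplified model no single failure causes a total loss, which is the property the three redundant lines are meant to provide.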
Fault Tree Analysis principles
• Fault Tree Analysis (FTA): a tree decomposing a top level event (a system failure) to exhibit all its root causes
• Mathematical model: a Boolean formula
• Computation on the model:
  • extraction of the minimal combinations of atomic faults leading to the top level event
  • computation of the probability of occurrence of the top event, knowing the probabilities of the tree leaves
[Figure: fault tree for "unannunciated loss of wheel braking"]
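Both computations can be sketched in Python on a small tree. The example is hypothetical (three series lines, as a stand-in for the hydraulic system) and the leaf probabilities are invented. It assumes independent leaves and computes the probability exactly by enumerating all leaf states, which is only practical for small trees; real tools use dedicated algorithms.

```python
from itertools import product

def minimize(sets):
    """Keep only the sets that contain no other set of the family."""
    unique = set(sets)
    minimal = [s for s in unique if not any(other < s for other in unique)]
    return sorted(minimal, key=lambda s: (len(s), sorted(s)))

def cut_sets(node):
    """Minimal cut sets of a tree given as a leaf name or (gate, child, ...)."""
    if isinstance(node, str):
        return [frozenset([node])]
    gate, *children = node
    families = [cut_sets(child) for child in children]
    if gate == "or":
        combined = [cs for family in families for cs in family]
    elif gate == "and":
        combined = [frozenset().union(*combo) for combo in product(*families)]
    else:
        raise ValueError(f"unknown gate: {gate}")
    return minimize(combined)

def top_event_probability(mcs, p):
    """Exact probability of the top event, assuming independent leaves."""
    leaves = sorted(set().union(*mcs))
    total = 0.0
    for values in product([False, True], repeat=len(leaves)):
        failed = {leaf for leaf, v in zip(leaves, values) if v}
        if any(cs <= failed for cs in mcs):
            weight = 1.0
            for leaf, v in zip(leaves, values):
                weight *= p[leaf] if v else 1.0 - p[leaf]
            total += weight
    return total

# Hypothetical tree: total loss = all three lines lost,
# each line lost if any of its three items fails.
TREE = ("and",
        ("or", "rsvg", "EDPg", "distg"),
        ("or", "rsvy", "EDPy", "disty"),
        ("or", "rsvb", "EMPb", "distb"))
MCS = cut_sets(TREE)
P = {leaf: 1e-3 for cs in MCS for leaf in cs}  # invented leaf probabilities
print(len(MCS), "minimal cut sets, smallest order", len(MCS[0]))
print("P(top event) =", top_event_probability(MCS, P))
```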
Drawbacks of the classical safety assessment approaches
• Fault Tree, FMEA give failure propagation paths without referring explicitly to a commonly agreed system architecture / nominal behavior =>
  – Misunderstanding between safety analysts and designers
  – Potential discrepancies between working hypotheses
• Exhaustive consideration of all failure propagations becomes more and more difficult, due to:
  – increased interconnection between systems,
  – integration of functions that often are performed jointly across multiple systems,
  – increased interrelations between hardware and software.
Model based safety assessment rationales
• Goals
  • Propose formal failure propagation models closer to design models
  • Develop tools to
    • assist model construction
    • analyze complex models automatically
  • For various purposes
    • FTA, FMEA, Common Cause Analysis, Human Error Analysis, …
    • from the earliest phases of the system development
• Approaches
  • Extend design models (Simulink, SysML, AADL...) with failure modes, or build dedicated failure propagation models (Figaro, AltaRica, Slim...)
  • Transform into analyzable formalisms (Boolean formulae, automata, ...)
  • Develop specialized analysis tools
A leading example: the basic block component
• Let Block be a basic system component that
  • receives
    • one Boolean input I,
    • an activation signal A and
    • a resource signal R;
  • produces
    • a Boolean output O.
• Block nominally performs the following transfer law:
  • O is true iff I, A and R are true.
• Block may fail.
  • In this case, the output O is false.
• Initially, the block performs the nominal function.
[Figure: Block with inputs I, A, R, output O and event fail]
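Mode automaton of a Boolean block: graphical view and concrete syntax
Graphical view: mode (ok=true, O=(I and A and R)) --fail--> mode (ok=false, O=false)
Concrete syntax:
node Block
  flow
    O : Bool : out;        /* output */
    I, A, R : Bool : in;   /* input, activation and resource signals */
  state
    ok : Bool;             /* true while the block is not failed */
  event
    fail;
  trans
    ok |- fail -> ok := false;   /* fail can only occur while ok holds */
  assert
    O = (I and A and R and ok);  /* nominal law when ok, false otherwise */
  init
    ok := true;
edon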
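For readers unfamiliar with AltaRica, the same mode automaton can be mirrored in Python. This is only an illustrative analogue of the node above, not a translation produced by any tool; the class and method names are chosen for this sketch.

```python
class Block:
    """Python analogue of the AltaRica node Block (illustrative only)."""

    def __init__(self):
        self.ok = True  # init: ok := true

    def fail(self):
        """Transition 'ok |- fail -> ok := false' (guarded by ok)."""
        if not self.ok:
            raise RuntimeError("guard 'ok' does not hold: block already failed")
        self.ok = False

    def output(self, i, a, r):
        """Assertion O = (I and A and R and ok)."""
        return i and a and r and self.ok


block = Block()
print(block.output(True, True, True))   # nominal mode, all inputs true
block.fail()
print(block.output(True, True, True))   # failed mode: output forced to false
```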
Kripke structure derived from the mode automaton of the Boolean block
• Kripke structure = (configurations, data assignment in configurations, relations between configurations)
• Runs of a mode automaton are paths of the derived Kripke structure that start from one possible initial configuration
• The fail event leads from 8 possible initial configurations to 8 possible final configurations
  Initial configurations:
    ok=true, I=A=R=O=true
    ok=true, I=A=true, R=O=false
    ......
    ok=true, I=A=R=O=false
  --fail-->
  Final configurations:
    ok=false, I=A=R=true, O=false
    ok=false, I=A=true, R=O=false
    ......
    ok=false, I=A=R=O=false
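The configurations can be enumerated mechanically: every valuation of ok, I, A and R, with O fixed by the assertion. The sketch below does this in Python. The `fail_successors` function rests on an assumption of this sketch: the inputs I, A and R are left free by the environment, so any configuration with ok=false may follow the fail event.

```python
from itertools import product

def configurations():
    """All valuations of (ok, I, A, R); O is determined by the assertion."""
    configs = []
    for ok, i, a, r in product([True, False], repeat=4):
        o = i and a and r and ok  # assert O = (I and A and R and ok)
        configs.append({"ok": ok, "I": i, "A": a, "R": r, "O": o})
    return configs

ALL = configurations()
INITIAL = [c for c in ALL if c["ok"]]     # init: ok := true
FAILED = [c for c in ALL if not c["ok"]]  # configurations after fail

def fail_successors(config):
    """Configurations reachable by 'fail' (assumes unconstrained inputs)."""
    if not config["ok"]:
        return []  # guard 'ok' does not hold
    return list(FAILED)

print(len(INITIAL), "initial configurations,", len(FAILED), "final configurations")
```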
Internal operations on mode automata
• Parallel composition: free product of mode automata
  • preserves all states, variables, transitions, assertions
  • interleaving parallelism (only one transition at a time)
  • Ex: two parallel Boolean blocks (Block 1 // Block 2)
[Figure: product automaton with four modes]
  mode 1 (initial): block1.ok=block2.ok=true
    block1.O=(block1.I and block1.A and block1.R)
    block2.O=(block2.I and block2.A and block2.R)
  mode 2: block1.ok=false, block2.ok=true
    block1.O=false
    block2.O=(block2.I and block2.A and block2.R)
  mode 3: block1.ok=true, block2.ok=false
    block1.O=(block1.I and block1.A and block1.R)
    block2.O=false
  mode 4: block1.ok=block2.ok=false
    block1.O=false
    block2.O=false
  transitions: mode 1 --fail1--> mode 2, mode 1 --fail2--> mode 3, mode 2 --fail2--> mode 4, mode 3 --fail1--> mode 4
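The free product with interleaving can be sketched in Python by representing a mode of the composition as the pair (block1.ok, block2.ok) and letting exactly one block fire its fail transition per step. The function names are illustrative and not part of any AltaRica tool.

```python
EVENTS = {"fail1": 0, "fail2": 1}  # event name -> index of the block it affects

def successors(mode):
    """Interleaving: exactly one block takes its 'fail' transition at a time."""
    result = {}
    for event, index in EVENTS.items():
        if mode[index]:  # guard: the block is still ok
            nxt = list(mode)
            nxt[index] = False
            result[event] = tuple(nxt)
    return result

def reachable(initial=(True, True)):
    """Modes and transitions reachable from the initial mode."""
    seen, frontier, edges = {initial}, [initial], []
    while frontier:
        mode = frontier.pop()
        for event, nxt in successors(mode).items():
            edges.append((mode, event, nxt))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen, edges

def outputs(mode, inputs1, inputs2):
    """(block1.O, block2.O); each inputs tuple is (I, A, R) of one block."""
    ok1, ok2 = mode
    return (all(inputs1) and ok1, all(inputs2) and ok2)

MODES, EDGES = reachable()
print(len(MODES), "modes,", len(EDGES), "transitions")
```

The product has four modes and four transitions, as in the figure; since the blocks are not connected, the failure of one does not affect the output of the other.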
Internal operations on mode automata
• Interconnection: mapping an input of an automaton to an output of another automaton
  • preserves all states, variables, transitions, assertions
  • introduces new assertions: Block2.I = Block1.O for all pairs of connected interfaces
  • interleaving parallelism (only one transition at a time)
  • ! allowed only if variables are not circularly defined
  • Ex: two series blocks (Block 1 feeding Block 2)
[Figure: product automaton with four modes]
  mode 1 (initial): block1.ok=block2.ok=true
    block1.O=block2.I=(block1.I and block1.A and block1.R)
    block2.O=(block2.I and block2.A and block2.R)
  mode 2: block1.ok=false, block2.ok=true
    block1.O=block2.I=false
    block2.O=false
  mode 3: block1.ok=true, block2.ok=false
    block1.O=block2.I=(block1.I and block1.A and block1.R)
    block2.O=false
  mode 4: block1.ok=block2.ok=false
    block1.O=block2.I=false
    block2.O=false
  transitions: mode 1 --fail1--> mode 2, mode 1 --fail2--> mode 3, mode 2 --fail2--> mode 4, mode 3 --fail1--> mode 4
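The effect of the added assertion Block2.I = Block1.O can be checked with a few lines of Python. This is a sketch with illustrative function names; it evaluates the outputs of the series composition in each of the four modes, with all external inputs set to true.

```python
def block_output(i, a, r, ok):
    """Assertion of one block: O = (I and A and R and ok)."""
    return i and a and r and ok

def series_outputs(i1, a1, r1, a2, r2, ok1, ok2):
    """Outputs (block1.O, block2.O) of two blocks connected in series."""
    o1 = block_output(i1, a1, r1, ok1)
    i2 = o1  # new assertion introduced by the interconnection
    o2 = block_output(i2, a2, r2, ok2)
    return o1, o2

for ok1, ok2 in [(True, True), (False, True), (True, False), (False, False)]:
    o1, o2 = series_outputs(True, True, True, True, True, ok1, ok2)
    print(f"block1.ok={ok1}, block2.ok={ok2}: block1.O={o1}, block2.O={o2}")
```

Unlike the parallel case, a failure of Block 1 now forces block2.O to false even though Block 2 itself is still ok.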
Safety Assessment Techniques
• Interactive simulation = user-driven exploration of the Kripke structure
  → play simple combinations of failures (in the style of FMEA)
Safety Assessment Techniques
• OCAS Fault-Tree generation
• The fault tree can be exported to other tools (Simtree, Arbor, ...) to compute minimal cut sets and probabilities
Safety Assessment Techniques
• OCAS Sequence Generator
  • Automatic generation of failure sequences that lead to the observation of the failure conditions
  • Limit on the number of failures to be considered
/*
orders(MCS('hydrau_total_loss.O.true')) =
orders   product-number
3        35
4        56
total    91
end
*/
products(MCS('hydrau_total_loss.O.true'))=
{'EDPg.fail_loss', 'EMPb.fail_loss', 'disty.fail_loss'}
{'EDPg.fail_loss', 'EMPb.fail_loss', 'rsvy.fail_loss'}
{'EDPg.fail_loss', 'Elec2.fail_loss', 'disty.fail_loss'}
{'EDPg.fail_loss', 'Elec2.fail_loss', 'rsvy.fail_loss'}
...
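The idea of bounded generation can be sketched in Python: enumerate failure combinations by increasing size up to a limit, and keep those that trigger the failure condition and contain no smaller one. The model is a hypothetical simplification (one pump on the green and yellow lines, two redundant pumps on the blue line, no PTU, valves or power sources), so it does not reproduce the 35 and 56 products listed above.

```python
from collections import Counter
from itertools import combinations

# Hypothetical simplified model: line name -> (series items, redundant pumps).
LINES = {
    "green":  (["rsvg", "distg"], ["EDPg"]),
    "yellow": (["rsvy", "disty"], ["EDPy"]),
    "blue":   (["rsvb", "distb"], ["EMPb", "RAT"]),
}
COMPONENTS = sorted(c for series, pumps in LINES.values() for c in series + pumps)

def line_lost(failed, series, pumps):
    """A line is lost if a series item fails or all of its pumps fail."""
    return any(c in failed for c in series) or all(p in failed for p in pumps)

def total_loss(failed):
    """Failure condition: all three lines are lost."""
    return all(line_lost(failed, series, pumps) for series, pumps in LINES.values())

def minimal_cut_sets(max_order):
    """Minimal failure combinations of at most max_order failures."""
    found = []
    for order in range(1, max_order + 1):  # increasing size guarantees minimality
        for combo in combinations(COMPONENTS, order):
            failed = frozenset(combo)
            if total_loss(failed) and not any(m <= failed for m in found):
                found.append(failed)
    return found

MCS = minimal_cut_sets(max_order=4)
ORDERS = Counter(len(cs) for cs in MCS)
for order in sorted(ORDERS):
    print(order, ORDERS[order])
print("total", len(MCS))
```

Because the smallest order found is 3, this simplified model satisfies the "no single failure" requirement attached to a catastrophic failure condition.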
MBSA application for CPSD
• Application range: from detailed CPS architecture to their CONOPS
[Figure: control architecture of one ONERA medium size UAV]
[Figure: procedure to ensure separation of aircraft trajectories in controlled airspace]
Safety Assessment techniques and CPS
• CPS are complex integrated systems
  • MBSA is better suited to CPS than FTA / FMEA
• CPS support new operation concepts (CONOPS)
  • Safety assessment of the CONOPS is needed to identify accurate system safety requirements
  • Easier with MBSA
• Open issue:
  • efficient safety assessment of the largest CPS (e.g. very large networks of sensors).
Conclusion
• CPS are systems:
  • general dependability concepts remain applicable to CPS
• CPS uses are versatile whereas dependability standards are application dependent
  • Focus on the practices in aviation, where safety is a must
  • Key ideas / process steps can be adapted to CPS of other domains
  • => need for convergence of safety standards to build a dependability culture of CPS
• CPS are complex systems
  • The most advanced safety assessment techniques are needed to support analysis of highly integrated systems
  • => need for more research to address very large, highly reconfigurable systems
Bibliography
[Avizienis-al2004] "Basic Concepts and Taxonomy of Dependable and Secure Computing", Algirdas Avizienis, Jean-Claude Laprie, Brian Randell, and Carl Landwehr, IEEE Transactions on Dependable and Secure Computing, vol. 1, no. 1, January-March 2004
[ARP4754-2010] "Certification considerations for highly-integrated or complex aircraft systems", Aerospace Recommended Practice 4754, SAE
[ARP4761-1996/2013?] "Guidelines and methods for conducting the safety assessment process on civil airborne systems and equipment", Aerospace Recommended Practice 4761, SAE
[Bieber-Seguin2013] "Safety Analysis of the Embedded Systems with the AltaRica Approach", in book "Industrial Use of Formal Methods: Formal Verification", Editor JL Boulanger, DOI: 10.1002/9781118561829.ch3