Reliability Prediction in Systems with Correlated Component Failures - An Approach Using Copulas
Philipp Limbourg1,3, Hans-Dieter Kochs1, Klaus Echtle2, Irene Eusgeld2
1 Information Logistics, Univ. of Duisburg-Essen, Germany
2 Dependability of Computing Systems, Univ. of Duisburg-Essen, Germany
3 Corresponding author, [email protected], +49-203-379-3621
I. INTRODUCTION
In this work, we investigate how spatial dependencies may influence the reliability of a fault-tolerant system and how such dependencies may be modeled in an easy-to-communicate way. One of the basic abstractions in reliability modeling is the invariance of system reliability to the physical location of a component. Failures are either considered to hit only one component at a time, or the common cause is explicitly modeled. While this is a good approximation and simplification in microscale computing, the view may be too simple as system integration proceeds (e.g. into the nanoscale). Due to the high integration of components, failure modes such as electrostatic discharges are likely to influence several neighboring components at a time.
Our approach proposes the use of copulas for the representation of spatial dependencies. Being simple in their usage, copulas have proven to be a good tool for dependency modeling without changing the underlying system model. They are particularly attractive if the common causes are not known or are not intended to be modeled explicitly.
However, the simplifying s-independence assumption is not necessarily a good approximation if the redundant components are small and closely packed together. In highly integrated circuits, both conditions are met. Components are so densely packed that a single failure cause may very well strike several neighboring components at once. Therefore, the spatial location of logically related components cannot be neglected and may play a vital role in modeling. Reliability predictions relying on the independence of all components run the danger of being overconfident. Explicit modeling of common causes, on the other hand, may lead to a highly complex reliability model. Copulas are proposed as a middle course to adapt common reliability models to dependent faults without explicitly modeling their causes. Using copulas, we illustrate in section V on a simple dependency model that the predicted system failure probability may vary by orders of magnitude as dependencies rise.
II. SYSTEM
A. Logical model
The system under consideration is a redundant majority voter with n1 redundant input units I1,…,In1 and n2 redundant voting units V1,…,Vn2. The system therefore consists of n = n1+n2 failure-prone components C1,…,Cn. Assuming a Boolean reliability model, each component Ci may be described by a Boolean state xi, with xi = 1 representing Ci as working and xi = 0 as failed. The system state vector x describes the combined state of the components x1,…,xn. Taking the step from deterministic to stochastic models, the random variable Xi describes the state of component Ci with reliability P(Xi = 1) and failure probability P(Xi = 0), and X denotes the joint distribution of X1,…,Xn.
The system behavior may be described by a Boolean
function S(x). S(x) describes the state of the system
(failed/working) given the state vector of all components. In
analogy, S(X) denotes the distribution of the system state
given the component states. According to [1], S(X) can be obtained by:

$$P(S(\mathbf{X}) = 1) = E(S(\mathbf{X})) = \sum_{\mathbf{x}} P(\mathbf{X} = \mathbf{x})\, S(\mathbf{x}) \quad (1)$$
The system function of the majority voting unit can be described as a Boolean system function. Let C1,…,Cn1 be the input units and Cn1+1,…,Cn1+n2 the voting units. Then the system fails either if the majority of all input units fail or if the majority of all voting units fail. S(x) is therefore given as:
$$S(\mathbf{x}) = \begin{cases} 0 & \text{if } \sum_{i=1}^{n_1} x_i < \frac{n_1+1}{2} \;\lor\; \sum_{i=n_1+1}^{n_1+n_2} x_i < \frac{n_2+1}{2} \\ 1 & \text{else} \end{cases} \quad (2)$$
The calculation of the system reliability therefore reduces to the calculation of the joint probability of the system state vector X from the component state probabilities. Under the s-independence assumption this is
$$P(\mathbf{X} = \mathbf{x}) = \prod_{i=1}^{n} P(X_i = x_i) \quad (3)$$
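As a small illustration (our sketch, not part of the paper), the system function (2) and the s-independent evaluation of (1) via the product rule (3) can be written in a few lines of Python; the parameters n1, n2 and the component failure probability below are arbitrary example values.

```python
# Minimal sketch (not from the paper): majority-voter system function S(x), eq. (2),
# and the s-independent system reliability, eqs. (1) and (3), by exhaustive enumeration.
from itertools import product

def S(x, n1, n2):
    """Boolean system function: returns 1 (working) or 0 (failed) for a state vector x."""
    inputs_working = sum(x[:n1])            # number of working input units
    voters_working = sum(x[n1:n1 + n2])     # number of working voting units
    if inputs_working < (n1 + 1) / 2 or voters_working < (n2 + 1) / 2:
        return 0                            # a majority of inputs or voters has failed
    return 1

def reliability_independent(p_fail, n1, n2):
    """P(S(X) = 1) under s-independence: sum P(X = x) * S(x) over all 2^n state vectors."""
    n = n1 + n2
    rel = 0.0
    for x in product((0, 1), repeat=n):
        p_x = 1.0
        for xi in x:                        # product rule (3)
            p_x *= (1.0 - p_fail) if xi == 1 else p_fail
        rel += p_x * S(x, n1, n2)
    return rel

# Example: 3 input units, 3 voters, component failure probability 0.01 (illustrative values)
print(1.0 - reliability_independent(0.01, 3, 3))   # system failure probability
```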
In case of interdependence between the different components, the product rule (3) does not hold any longer. If engineers want to specify knowledge for modeling the joint distribution, other ways of modeling X must be explored. In the following sections, we will investigate the use of
copulas [2] for dependency modeling in reliability
prediction. Based on the copula approach, we will show how
the degree of dependence between components may affect
the reliability of the system and thus play a vital role in
design decisions.
B. Spatial model
We assume two possible types of spatial layout for the components. The first, an abstract, scalable architecture, allows comparing different degrees of redundancy. The components are assumed to form two linear 1-D arrays (as depicted in Figure 1). The first array contains up to n1 input components, the second array is formed by up to n2 voters. By varying the parameters n1 and n2, the impact of redundancy in the input units and voters can easily be investigated. However, this layout does not represent a realistic topology. The second example, a 3-3 majority voter, represents a more realistic layout (Figure 2). Direct unit and conductor neighborhoods are marked by grey bars.
III. COPULAS
Copulas are a way of specifying joint distributions if only the marginal probabilities are known. In terms of system reliability, this can be interpreted as inferring the distribution of the system state vector from the component states. The significant characteristic of probabilistic modeling with copulas is the separation of the component distributions (the marginals) and the dependencies. This yields the advantage that copulas can be used without modification of the system model. By specifying copula parameters, engineers may predefine which distributions are correlated without needing to define the basic independent events leading to a common cause. The main application area of copulas is financial risk prediction [3, 4], where model inputs are regularly known to be correlated by complex mechanisms not included in the model. The decoupling of margins and copula allows separate parameterization and eases the propagation through the system model. The popularity of copulas in reliability prediction is still limited. However, they can be a valuable tool for reliability prediction with scarce data [5].
[Figure 1: Majority voting unit with correlated neighbors (2-dimensional array).]
[Figure 2: Possible dependencies in a 3-3 majority voter. The bars indicate possible dependencies caused by a direct spatial neighborhood.]
A. Mathematical framework
Formally, an n-dimensional copula can be defined as a multivariate distribution function C(u) with uniformly distributed marginal distributions u = (u1,…,un) in [0,1] and the following properties [2]:
C: [0,1]^n → [0,1]
C is grounded: C(u) = 0 if any ui = 0
C has margins Ci which satisfy C(1,…,1,ui,1,…,1) = ui
C is n-increasing
Given this definition, if F1,…,Fn are distribution functions, C(F1(x1),…,Fn(xn)) is a multivariate distribution function with margins F1,…,Fn. Sklar's theorem [6] makes copulas usable for our purposes, as it separates the marginal distributions from the dependencies.
Sklar's Theorem: Let F be an n-dimensional distribution function with margins F1,…,Fn. Then F has a copula representation:

$$F(x_1, \dots, x_n) = C(F_1(x_1), \dots, F_n(x_n)) \quad (4)$$
If F1,…,Fn are continuous, then C is unique. Otherwise C is uniquely determined on Ran(F1) × … × Ran(Fn), where Ran(Fi) denotes the range of Fi. Thus, by fixing the copula and the margins, a multivariate distribution can be defined.
B. Copula types
Different copulas and copula families are in common use. The extremal copula functions, the Fréchet-Hoeffding bounds C+ and C−, constrain the set of possible copulas. They represent perfect and opposite dependence:
$$C^-(\mathbf{u}) = \max\!\left(1 - n + \sum_{i=1}^{n} u_i,\; 0\right) \quad (5)$$

$$C^+(\mathbf{u}) = \min(u_1, \dots, u_n) \quad (6)$$
By far the most frequently applied copula is the (often unintentionally used) product copula. It represents the independence assumption and is defined as:
$$C_P(\mathbf{u}) = \prod_{i=1}^{n} u_i \quad (7)$$
Figure 3 shows two-dimensional copulas resulting from perfect dependence, opposite dependence and independence of the variables.
[Figure 3: Two-dimensional copula functions: a) perfect dependence, b) opposite dependence, c) independence [1].]
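For illustration (our sketch, not part of the paper), the bounds (5), (6) and the product copula (7) are easy to evaluate pointwise; every copula value lies between the two bounds.

```python
# Minimal sketch (not from the paper): the Fréchet-Hoeffding bounds (5), (6) and the
# product copula (7), evaluated at a single point u in [0,1]^n.
import numpy as np

def copula_lower(u):      # C^-(u) = max(1 - n + sum(u_i), 0), eq. (5)
    u = np.asarray(u, dtype=float)
    return max(1.0 - len(u) + u.sum(), 0.0)

def copula_upper(u):      # C^+(u) = min(u_1, ..., u_n), eq. (6)
    return float(np.min(u))

def copula_product(u):    # C_P(u) = prod(u_i), eq. (7), the independence copula
    return float(np.prod(u))

u = [0.9, 0.8]            # illustrative point
print(copula_lower(u), copula_product(u), copula_upper(u))   # 0.7 <= 0.72 <= 0.8
```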
To be usable in reliability prediction, it is necessary to use copulas which can be parameterized in a simple-to-communicate way. In this context, the most popular copulas are the Archimedean copulas, the Gaussian copula and the Student t-copula [3]. All include the product copula as a special case. Archimedean copulas, which are often used in the bivariate case, do not readily extend to the multivariate case. The Gaussian copula used in this work has the big advantage of easy communicability: its only dependency parameters form the correlation matrix of the inputs.
The Gaussian copula is defined as:

$$C_G(\mathbf{u}) = \Phi_{\rho}(\Phi^{-1}(u_1), \dots, \Phi^{-1}(u_n)) \quad (8)$$

with Φ⁻¹ being the inverse standard normal distribution function (µ = 0, σ = 1) and Φ_ρ a multivariate standard normal distribution function with correlation matrix ρ. The concept of the Gaussian copula is to map the dependency structure onto a multivariate Gaussian distribution. The input marginal probabilities u1,…,un are converted to quantiles of a Gaussian distribution using Φ⁻¹. In a second step, the cumulative probability is calculated by Φ_ρ using the dependency parameters ρ.
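A minimal numerical sketch of (8), assuming SciPy's norm.ppf and multivariate_normal.cdf; this is our illustration, not the authors' code, and the clipping of u is only a guard against infinite quantiles at the boundary values 0 and 1.

```python
# Minimal sketch (not from the paper): evaluating the Gaussian copula of eq. (8)
# with SciPy; rho is the correlation matrix of the components.
import numpy as np
from scipy.stats import norm, multivariate_normal

def gaussian_copula(u, rho):
    """C_G(u) = Phi_rho(Phi^-1(u_1), ..., Phi^-1(u_n))."""
    u = np.clip(np.asarray(u, dtype=float), 1e-12, 1.0 - 1e-12)  # guard against infinite quantiles
    z = norm.ppf(u)                                              # Phi^-1 applied to each margin
    mvn = multivariate_normal(mean=np.zeros(len(u)), cov=rho)
    return float(mvn.cdf(z))                                     # Phi_rho of the transformed margins

# Example: two margins at 0.9 with correlation 0.5 (illustrative values)
rho = np.array([[1.0, 0.5],
                [0.5, 1.0]])
print(gaussian_copula([0.9, 0.9], rho))
```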
C. Propagation using copulas
Having specified the copula parameters and the component failure probabilities, it is possible to obtain the probability of a component state vector as:

$$P(\mathbf{X} \le \mathbf{x}) = F(\mathbf{x}) = C(F_1(x_1), \dots, F_n(x_n)) \quad (9)$$
In the Boolean case, Fi(xi) reduces to a simple step function:
$$F_i(x_i) = \begin{cases} P(X_i \le 1) = 1 & \text{if } x_i = 1 \\ P(X_i = 0) & \text{if } x_i = 0 \end{cases} \quad (10)$$
The probability P(X = x) can be obtained using the
inclusion-exclusion principle.
$$P(\mathbf{X} = \mathbf{x}) = P(\mathbf{X} \le \mathbf{x}) - P(\mathbf{X} < \mathbf{x}) = F(\mathbf{x}) - P(\mathbf{X} < \mathbf{x}) \quad (11)$$

$$= \sum_{j_1 \in \{1,2\}} \cdots \sum_{j_n \in \{1,2\}} (-1)^{j_1 + \dots + j_n}\, C(u_{1,j_1}, \dots, u_{n,j_n}) \quad (12)$$
$$u_{i,2} = F_i(x_i), \qquad u_{i,1} = \begin{cases} F_i(x_i - 1) & \text{if } x_i = 1 \\ 0 & \text{else} \end{cases} \quad (13)$$
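The inclusion-exclusion sum of (11)-(13) can be sketched as follows (our illustrative implementation, not the authors' code); the copula argument is any callable C(u), for example the gaussian_copula function from the sketch above.

```python
# Minimal sketch (not from the paper): P(X = x) for Boolean components from an
# arbitrary copula via the inclusion-exclusion sum of eqs. (11)-(13).
from itertools import product

def state_vector_probability(x, p_fail, copula):
    """x: Boolean state vector, p_fail[i] = P(X_i = 0), copula: callable u -> C(u)."""
    n = len(x)
    u_hi = [1.0 if x[i] == 1 else p_fail[i] for i in range(n)]   # u_{i,2} = F_i(x_i), eq. (13)
    u_lo = [p_fail[i] if x[i] == 1 else 0.0 for i in range(n)]   # u_{i,1}, eq. (13)
    prob = 0.0
    for corner in product((0, 1), repeat=n):                     # 1 -> upper, 0 -> lower argument
        u = [u_hi[i] if corner[i] == 1 else u_lo[i] for i in range(n)]
        if min(u) == 0.0:
            continue                           # C is grounded: C(u) = 0 if any u_i = 0
        sign = (-1) ** (n - sum(corner))       # (-1)^(number of lower arguments)
        prob += sign * copula(u)
    return prob

# Example (illustrative): P(X = (1, 0, 1)) with the Gaussian copula sketch from above,
# p_fail = [0.01, 0.01, 0.01] and a 3 x 3 correlation matrix rho:
# state_vector_probability((1, 0, 1), [0.01] * 3, lambda u: gaussian_copula(u, rho))
```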
IV. MODELING DEPENDENCY
We assume that only direct neighborhoods can have an influence on the failure correlation. In this simplified approach, we do not model interactions of components in indirect proximity. In the first layout, each input unit can therefore interact with the neighboring input units, and each voter with the adjacent voters. Voters and input units may interact if they are in the same row (Figure 1). Therefore, we assume the correlation matrix:
$$\rho: n \times n, \qquad \rho = \begin{pmatrix} \rho' & \rho'' \\ \rho''^{\mathsf T} & \rho''' \end{pmatrix} \quad (14)$$
ρ is composed of three correlation matrices with ρ’ defining
the input-input correlations, ρ’’ the input-voter correlations
and ρ’’’ the voter-voter correlations.
$$\rho': n_1 \times n_1, \qquad \rho'_{i,j} = \begin{cases} 1 & \text{if } i = j \\ c & \text{if } |i-j| = 1 \text{ (vertical neighbor)} \\ 0 & \text{else} \end{cases} \quad (15)$$

$$\rho'': n_1 \times n_2, \qquad \rho''_{i,j} = \begin{cases} c & \text{if } i = j \text{ (horizontal neighbor)} \\ 0 & \text{else} \end{cases} \quad (16)$$
$$\rho''': n_2 \times n_2, \qquad \rho'''_{i,j} = \begin{cases} 1 & \text{if } i = j \\ c & \text{if } |i-j| = 1 \text{ (vertical neighbor)} \\ 0 & \text{else} \end{cases} \quad (17)$$

The factor c ∈ [0,1] defines how strongly the inputs are correlated (negative correlations, though mathematically possible, have no meaning in reliability prediction). In section V, the influence of the factor c on the system reliability is investigated.
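As an illustration of (14)-(17) (our sketch, not part of the paper), the grid-layout correlation matrix can be assembled as follows; n1, n2 and c are free parameters.

```python
# Minimal sketch (not from the paper): the block correlation matrix of eqs. (14)-(17)
# for the abstract n1-n2 grid layout; c is the neighbor correlation.
import numpy as np

def grid_correlation_matrix(n1, n2, c):
    rho1 = np.eye(n1)                          # rho' (input-input), eq. (15)
    rho3 = np.eye(n2)                          # rho''' (voter-voter), eq. (17)
    for i in range(n1 - 1):
        rho1[i, i + 1] = rho1[i + 1, i] = c    # c for vertical neighbors, |i - j| = 1
    for i in range(n2 - 1):
        rho3[i, i + 1] = rho3[i + 1, i] = c
    rho2 = np.zeros((n1, n2))                  # rho'' (input-voter), eq. (16)
    for i in range(min(n1, n2)):
        rho2[i, i] = c                         # c for horizontal neighbors, i = j
    return np.vstack([np.hstack([rho1, rho2]),
                      np.hstack([rho2.T, rho3])])   # block structure of eq. (14)

# Example: 3 input units, 3 voters, neighbor correlation c = 0.1 (illustrative values)
print(grid_correlation_matrix(3, 3, 0.1))
```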
The second layout has a more complex correlation scheme. All components having adjacent edges are considered to be correlated. Further extensions could, e.g., include the correlation as a function of the adjacent edges' length. If component conductors are crossed or neighbored, the correlation is assumed to be higher. Therefore, input units 2 and 3 are set to correlate with c^0.9:
[Equation (18): the resulting 6 × 6 correlation matrix ρ for the 3-3 voter, with 1 on the diagonal, c for directly neighbored components, c^0.9 between input units 2 and 3, and 0 elsewhere.]
V. RESULTS & CONCLUSION
A. 2-dimensional array
We investigate how the correlation parameter c influences different redundant configurations. The failure probabilities of 3-3, 5-3 and 5-5 voters are shown for component failure probabilities of p = 0.01, p = 0.001 and p = 0.0001. Figure 4, Figure 5 and Figure 6 show the probability of failure (log scale) depending on the correlation parameter c. It can be observed that the failure probability rises by several orders of magnitude. Especially if the degree of redundancy is high (5-5 majority), an increase of the correlation from 0 to 0.2 already raises the failure probability by orders of magnitude.
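The experiment behind Figures 4-6 can be made concrete by combining the sketches above (S(x), the grid correlation matrix, the Gaussian copula and the inclusion-exclusion propagation). This is our illustrative reconstruction, not the authors' original code; the values of p and c below are example choices, and only the qualitative behavior, not the exact curves, should be expected from it.

```python
# Minimal end-to-end sketch (not from the paper): system failure probability as a
# function of c, reusing S(), grid_correlation_matrix(), gaussian_copula() and
# state_vector_probability() from the sketches above.
from itertools import product

def system_failure_probability(p, n1, n2, c):
    n = n1 + n2
    rho = grid_correlation_matrix(n1, n2, c)        # eqs. (14)-(17)
    def copula(u):
        return gaussian_copula(u, rho)              # eq. (8)
    p_fail = [p] * n
    pf = 0.0
    for x in product((0, 1), repeat=n):             # sum P(X = x) over all failed system states
        if S(x, n1, n2) == 0:
            pf += state_vector_probability(x, p_fail, copula)
    return pf

for c in (0.0, 0.1, 0.2, 0.3):                      # illustrative correlation values
    print(c, system_failure_probability(0.01, 3, 3, c))
```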
B. Complex topology
Figure 7 shows the system failure probability (log scale) with respect to c. Similar to the first example, p was fixed to p = 0.01, p = 0.001 and p = 0.00001. The influence of c is lower than in the first example, but it still has a strong influence on the predicted failure probability. Correlations of about c = 0.1 can already have a severe influence on the failure probability.
C. Conclusion
The results indicate that if the possibility of spatially dependent common cause faults exists, dependency modeling is required for a realistic reliability estimate. Reliability analysis on a higher level is therefore only feasible if not only the individual component failure probabilities are known, but also the dependency scheme. The results may also have an impact on design decisions, showing that a higher reliability may be achieved either by reducing the individual failure probability or by reducing the failure correlations. Copulas are a very transparent way to model these dependencies. Being capable of working with arbitrary marginals, their range of application goes beyond Boolean models. Especially multi-state systems [7] and continuous systems [8] could be expedient fields for their application. As the system reliability is highly sensitive to small dependencies, it may be convenient to carry out an uncertainty analysis on c to gain confidence in the system reliability estimate.
REFERENCES
[1] W. G. Schneeweiss, Boolean Functions with Engineering Applications and Computer Programs. New York, USA: Springer, 1989.
[2] R. B. Nelsen, An Introduction to Copulas. New York: Springer, 1999.
[3] P. Embrechts, F. Lindskog, and A. McNeil, "Modelling dependence with copulas and applications to risk management," in Handbook of Heavy Tailed Distributions in Finance, S. Rachev, Ed. Amsterdam, NL: Elsevier, 2003, pp. 329-384.
[4] A. Prampolini, "Modelling default correlation with multivariate intensity processes," 2001.
[5] S. Ferson, J. Hajagos, D. Berleant, J. Zhang, W. T. Tucker, L. Ginzburg, and W. Oberkampf, Dependence in Dempster-Shafer Theory and Probability Bounds Analysis. Albuquerque: Sandia National Laboratories, 2004.
[6] A. Sklar, "Fonctions de répartition à n dimensions et leurs marges," Publ. Inst. Statist. Univ. Paris, vol. 8, pp. 229-231, 1959.
[7] H. Pham, Reliability Engineering. Springer, 2003.
[8] M. Finkelstein, "Simple repairable continuous state systems of continuous state components," presented at Mathematical Methods in Reliability (MMR) 2004, Santa Fe, USA, 2004.
[Figure 4: 2-d array: system failure probability (log scale) vs. dependency c, p = 0.01 (3-3, 5-3 and 5-5 redundancy).]
[Figure 5: 2-d array: system failure probability (log scale) vs. dependency c, p = 0.001 (3-3, 5-3 and 5-5 redundancy).]
[Figure 6: 2-d array: system failure probability (log scale) vs. dependency c, p = 0.00001 (3-3, 5-3 and 5-5 redundancy).]
[Figure 7: Complex topology: system failure probability (log scale) vs. dependency c (p = 0.01, p = 0.001, p = 0.00001).]