GRAPHITE TESTING FOR NUCLEAR APPLICATIONS: THE SIGNIFICANCE OF TEST SPECIMEN VOLUME AND GEOMETRY AND THE STATISTICAL SIGNIFICANCE OF TEST SPECIMEN POPULATION
STP 1578, 2014 / available online at www.astm.org / doi: 10.1520/STP157820130122

Stephen F. Duffy¹ and Ankurben Parikh²

Quality Control Using Inferential Statistics in Weibull-based Reliability Analyses
Reference
Duffy, Stephen F. and Parikh, Ankurben, "Quality Control Using Inferential Statistics in Weibull-based Reliability Analyses," Graphite Testing for Nuclear Applications: The Significance of Test Specimen Volume and Geometry and the Statistical Significance of Test Specimen Population, STP 1578, Nassia Tzelepi and Mark Carroll, Eds., ASTM International, West Conshohocken, PA, 2014, pp. 1–18, doi:10.1520/STP157820130122.³
ABSTRACT
Design codes and fitness-for-service protocols have recognized the need to
characterize the tensile strength of graphite as a random variable through the
use of probability density functions. Characterizing a probability density function requires more tests than are typically needed to simply define an average value for tensile strength. ASTM standards and the needs of nuclear design codes should dovetail on this issue. The two-parameter Weibull distribution (an extreme-value distribution) is adopted for the tensile strength of this material. The failure data from bend tests or tensile tests are used to determine the Weibull modulus (m) and Weibull characteristic strength (σθ). To determine estimates of the true Weibull distribution parameters, maximum likelihood estimators are used. The
quality of the estimated parameters relative to the true distribution parameters
depends fundamentally on the number of samples taken to failure. The statistical
concepts of confidence intervals and hypothesis testing are presented pertaining
to their use in assessing the goodness of the estimated distribution parameters.
The inferential statistics tools enable the calculation of likelihood confidence
rings. The concept of how the true distribution parameters lie within a likelihood
ring with a specified confidence is presented. A material acceptance criterion is defined here, and the criterion depends on establishing an acceptable probability of failure of the component under design, as well as an acceptable level of confidence associated with the estimated distribution parameters determined using failure data from a specific type of strength test.

Manuscript received August 15, 2013; accepted for publication March 10, 2014; published online July 18, 2014.
¹ Ph.D., P.E., Cleveland State Univ., Cleveland, OH 44115, United States of America.
² Cleveland State Univ., Cleveland, OH 44115, United States of America.
³ ASTM Symposium on Graphite Testing for Nuclear Applications: The Significance of Test Specimen Volume and Geometry and the Statistical Significance of Test Specimen Population on Sept 19–20, 2013 in Seattle, WA.
Copyright © 2014 by ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959.
Keywords
graphite, Weibull, confidence bounds, likelihood ratio rings
Nomenclature
H0 = null hypothesis
H1 = alternative hypothesis
L = likelihood function
𝓛 = natural log of the likelihood function
m = Weibull modulus
m̃ = estimated Weibull modulus
n = sample size
Pf = probability of failure
T = test statistic
α = significance level
β = probability of a Type II error
γ = confidence level
Θ0 = vector of all the maximum likelihood estimator parameter estimates
Θ0^c = vector of all point estimates that are not maximum likelihood estimator parameter estimates
σθ = Weibull characteristic strength
σ̃θ = estimated Weibull characteristic strength
Introduction
This work presents the mathematical concepts behind statistical tools that, when
combined properly, lead to a simple quality control program for components fabricated from graphite. The data on mechanistic strength (which is treated as a random
variable) should be used to accept or reject a graphite material for a given application.
The two-parameter Weibull distribution is used to characterize the tensile strength of
graphite. The Weibull distribution is an extreme-value distribution, and this facet
makes it the preferred distribution for characterizing a material’s minimum strength.
Estimates of the true Weibull distribution parameters should be determined
using maximum likelihood estimators (MLEs). The quality of the estimated parameters relative to the true distribution parameters depends fundamentally on the
number of samples taken to failure. The statistical concepts of confidence intervals
and hypothesis testing are employed to assess quality. Quality is defined by how
close the estimated parameters are to the true distribution parameters. The quality
of the distribution parameters can have a direct effect on whether a certain grade or
type of graphite material is acceptable for a given application. Both inferential statistics concepts (i.e., confidence intervals and hypothesis testing) enable the calculation of likelihood confidence rings. Work showing how the true distribution
parameters lie within a likelihood ring with a specified confidence is presented here.
The size of the ring has direct bearing on the quality of the estimated parameters.
One must specify and associate an acceptable level of confidence with the estimated distribution parameters. This work shows how to construct likelihood ratio
confidence rings that establish an acceptance region based on a given level of confidence. Material performance curves are presented that are based on an acceptable
component probability of failure. Combining the two elements (i.e., the material
performance curve and a likelihood ratio ring) allows the design engineer to determine whether a material is suited for the component design at hand. The result is a
simple approach to a quality assurance criterion.
Point Estimates and Confidence Bounds
Data related to the tensile strength of graphite can be generated through the use of
tensile tests outlined in ASTM C565 [1], ASTM C749 [2], ASTM C781 [3], and
ASTM D7775 [4]. Bend tests are preferred for their simplicity, and flexural test procedures are outlined in ASTM C651 [5]. Given data on the tensile strength of
graphite, the first step is to ascertain values for the Weibull distribution parameters
using this information. MLEs, outlined in ASTM D7846 [6], are used to compute
point estimates. The next question is, have we estimated parameters to the best of
our ability? This is directly related to the fundamental question asked repeatedly,
“How many samples must be tested?” The typical answer to this question seems to
be about 30. However, the appropriate question to ask is, “How many samples must
be tested to establish a given confidence level for component reliability?” The work
outlined here answers this question quantitatively, utilizing interval estimates along
with hypothesis testing. The methods outlined here are currently being implemented in the ASME Boiler and Pressure Vessel Code [7].
Confidence intervals are used to indicate the potential variability in estimated
parameters. Every time a sample is taken from the same population, a point estimate of the distribution parameters can be calculated. Successive samples produce
different point estimate values, and thus the point estimates are treated as random
variables. Thus interval estimates bracketing the true distribution parameters are as
necessary as point estimates. If the interval that brackets the true distribution parameter contains the estimated parameter, then the estimate is consistent with the
true value. Increasing the sample size will always narrow the interval bounds and
provide point estimates that approach the true distribution parameters. Interval
bounds on parameter estimates represent the range of values for the distribution
parameters that are both reasonable and plausible.
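As a concrete illustration of computing point estimates, the sketch below fits the two Weibull parameters to a set of failure stresses by maximum likelihood. This is a minimal stand-in for the formal ASTM D7846 procedure: `weibull_mle` is a hypothetical helper that solves the likelihood equation for the modulus m by bisection and then recovers the characteristic strength in closed form.

```python
import math
import random

def weibull_mle(data):
    """Maximum likelihood point estimates (m, sigma_theta) for a
    two-parameter Weibull sample.  The modulus m solves the usual
    MLE condition; sigma_theta then follows in closed form."""
    n = len(data)
    xmax = max(data)
    xs = [x / xmax for x in data]          # rescale to avoid overflow
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n

    def g(m):
        # monotone-increasing MLE condition for the modulus m
        s = sum(x ** m for x in xs)
        sl = sum((x ** m) * lx for x, lx in zip(xs, logs))
        return sl / s - 1.0 / m - mean_log

    lo, hi = 1e-3, 1.0
    while g(hi) < 0.0:                     # bracket the root
        hi *= 2.0
    for _ in range(200):                   # bisect to convergence
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    m = 0.5 * (lo + hi)
    sigma = xmax * (sum(x ** m for x in xs) / n) ** (1.0 / m)
    return m, sigma

# Example: simulated failure stresses from a known population
# (hypothetical values m = 17, sigma_theta = 400 MPa)
random.seed(42)
data = [random.weibullvariate(400.0, 17.0) for _ in range(50)]
m_hat, s_hat = weibull_mle(data)
```

With 50 simulated specimens the estimates land reasonably close to the true pair; with only 10 they scatter considerably, which is exactly the variability that interval estimates are meant to quantify.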
Inferences and Hypothesis Testing
As noted above, there is a need to know whether a sample is large enough that the
point estimates of the distribution parameters are in the same statistical neighborhood as the true population distribution parameters. The techniques for making
this kind of assessment utilize inferential statistics. The type of inference focused on
here is the bounds on the true population parameters (which are never known)
given a particular sample. Here a particular type of bound that is referred to as a
likelihood ratio confidence bound is employed.
The basic issue is this: consider an infinitely large population with a known frequency distribution but unknown distribution parameters. A small sample taken from that infinitely large population provides only diminished knowledge of the overall population, so the sample will generate a frequency distribution that differs from that of the parent population, in the sense that the distribution parameters estimated from the small sample (a subset) will not be the same as the parameters of the parent population. The population and the sample can be
characterized by the same frequency distribution, but the two will have different
distribution parameters. As the sample size increases, the frequency distribution
associated with the sample more closely resembles that of the parent population
(i.e., the estimated parameters approach the true distribution parameters of the parent population).
Hypothesis testing is used to establish whether the true distribution parameters
lie close to the point estimates. Two statistical hypotheses are proposed concerning
the estimated parameter values. The first, H1, is referred to as the alternative hypothesis. The latter hypothesis, H0, is referred to as the null hypothesis. Both
hypotheses are then tested with the samples taken from the parent population. The
goal of the analyst is to decide whether there is enough evidence (data) to refute the
null hypothesis H0. That decision is made based on the value of a test statistic.
Here that test statistic is the ratio of two likelihood functions whose probability
is known under the assumption that H0, the null hypothesis, is true. If the test statistic takes on a value rarely encountered using the data collected, then the test statistic indicates that the null hypothesis is unlikely and H0 is rejected. The value of
the test statistic at which the rejection is made defines a rejection region. The probability that the test statistic falls into the rejection region by chance is referred to as the significance level, denoted by α. The significance level is defined as the probability of mistakenly rejecting a hypothesis when the hypothesis is valid.
Rejecting Hypotheses
Making a decision regarding a hypothesis is associated with a statistical event with
an attending probability, so an ability to assess the probability of making incorrect
decisions is required. Fisher [8] established a method for quantifying the amount of
evidence required in order for an event to be deemed unlikely to occur by chance.
He originally defined this quantity as the significance level. Significance levels are
different than confidence levels, but the two are related.
The significance level and the confidence level are functionally related through
the following simple expression, with the confidence level denoted by γ:

(1)  γ = 1 − α
The confidence level is associated with a range, or more specifically with bounds or
an interval, within which a true population parameter resides. The confidence level
and, through the equation above, the significance level are chosen a priori based on
the design application at hand.
Given a significance level α defined by Eq 1, a rejection region can be established. This is known as the critical region for the test statistic selected. For our purposes, the observed tensile strength data for a graphite material are used to determine whether the computed value of the test statistic associated with a hypothesis (not the parameter estimates) lies within or outside the rejection region. The amount of data helps define the size of the critical region. If the test statistic is within the rejection region, then we say the hypothesis is rejected at the 100α % significance level. If α is quite small, then the probability of rejecting the null hypothesis when it is true can be made quite small.
Type I and Type II Errors
Consider that for a true distribution parameter θ there is a need to test the null hypothesis that θ = θ0 (in this context θ0 is a stipulated value) against the alternative that θ ≠ θ0 at a significance level α. Under these two hypotheses, a confidence interval can be constructed that contains the true population parameters with a probability of γ = (1 − α). In addition, this interval also contains the value θ0.
Mistakes can be made in rejecting the null hypothesis given above. In hypothesis testing, two types of errors are possible.
1. Type I error: the rejection of the null hypothesis (H0) when it is true. The probability of committing a Type I error is denoted by α.
2. Type II error: the failure to reject the null hypothesis (H0) when the alternative hypothesis (H1) is true. The probability of committing a Type II error is denoted by β.
In either situation, judgment of the null hypothesis H0 is incorrect.
Now consider the situations in which correct decisions have been made. In the
first case, the null hypothesis is not rejected and the null hypothesis is true. The probability of making this choice is γ = (1 − α). This is the same probability associated with the confidence interval for the true population distribution parameters discussed above. For the second situation, the probability of correctly rejecting the null hypothesis is the statistical complement of a Type II error, that is, (1 − β). In statistics this is known as the power of the test of the null hypothesis. This quantity is used to determine the sample size. Maximizing the probability of making a correct decision requires high values of (1 − α) and (1 − β).
So decision protocols must be designed so as to minimize the probability of either type of error in an optimal fashion. The probability of a Type I error is controlled by making α a suitable number, say, 1 in 10 or 1 in 20, or something smaller depending on the consequence of making a Type I error. Minimizing the probability of making a Type II error is not straightforward. β depends on the alternative hypothesis, on the sample size n, and on the true value of the distribution parameters tested. As discussed in the next section, the alternative hypothesis is greatly influenced by the test statistic chosen to help quantify the decision.
Hence when hypothesis testing is applied to the distribution parameters, a
statement of equality is made in the null hypothesis H0. Achieving statistical significance for this is akin to accepting that the observed results (the point estimates of
the distribution parameters) are plausible values if the null hypothesis is not
rejected. The alternative hypothesis does not in general specify particular values for
the true population parameters. However, as shown in a section that follows, the alternative hypothesis helps us establish bounds on the true distribution parameters.
This is important when a confidence ring is formulated for a distribution parameter pair. The size of the ring can be enlarged or reduced based on two controllable parameters, the significance level α and the sample size n.
The Likelihood Ratio Test Statistic
The test statistic used to influence decision making regarding the alternative hypothesis is a ratio of the natural log of two likelihood functions. In simple terms,
one likelihood function is associated with a null hypothesis, and the other is associated with an alternative hypothesis. For a probability density function with a single
distribution parameter, the general approaches to testing the null and alternative
hypotheses are defined, respectively, as

(2)  H0 : θ = θ0

(3)  H1 : θ = θ1

Note that in the expression for the alternative hypothesis H1, the fact that θ equals θ1 implies that θ is not equal to θ0, and the alternative hypothesis is consistent with the discussion in the previous section. As they are used here, the hypotheses describe two complementary notions regarding the distribution parameters, and these notions compete with each other. In this sense the hypotheses can be better described mathematically as

(4)  H0 : θ ∈ Θ0 = (1θ0, 2θ0, …, rθ0)

(5)  H1 : θ ∈ Θ0^c = (1θ1, 2θ1, …, rθ1)
where r corresponds to the number of parameters in the probability density function. Conceptually θ0 and θ1 are scalar values, whereas Θ0 and its complement Θ0^c are vectors of distribution parameters. A likelihood function associated with each hypothesis can be formulated,

(6)  L0 = ∏(i=1 to n) f(xi | θ ∈ Θ0)

for the null hypothesis and

(7)  L1 = ∏(i=1 to n) f(xi | θ ∈ Θ0^c)

for the alternative hypothesis. The likelihood function L0 associated with the null hypothesis is evaluated using the maximum likelihood parameter estimates.
The sample population (i.e., graphite failure data) is assumed to be characterized by a two-parameter Weibull distribution. There are methods to test the validity
of this assumption. However, the material strength is characterized by a random
variable, so it makes sense to use a minimum extreme-value distribution such as the
Weibull distribution. Because this is a proof-of-concept effort focused on likelihood
ratio rings, goodness-of-fit tests that can be used to discriminate between alternative
underlying population distributions are left to others to pursue. A vector of distribution parameters whose components are the MLE parameter estimates is identified as
(8)  (1θ0, 2θ0) = (θ̃1, θ̃2) = (m̃, σ̃θ)

where:
m̃ = maximum likelihood estimate of the Weibull modulus, and
σ̃θ = maximum likelihood estimate of the characteristic strength.

Now

(9)  H0 : θ ∈ Θ0 = (m̃, σ̃θ)

that is, Θ0 contains the MLE parameter estimates, and

(10)  H1 : θ ∈ Θ0^c

with Θ0^c representing a vector of point estimates that are not MLE parameter estimates. In essence, we are testing the null hypothesis that the true distribution parameters are equal to the MLE parameter estimates, with an alternative hypothesis that the true distribution parameters are not equal to the MLE parameter estimates.
The likelihood functions are now expressed as
(11)  L̃0 = ∏(i=1 to n) f(xi | m̃, σ̃θ)

(12)  L1 = ∏(i=1 to n) f(xi | m, σθ)

A test statistic is introduced that is defined as the natural log of the ratio of the likelihood functions,

(13)  T = −2 ln(L1 / L̃0)
The Neyman–Pearson lemma [9] states that this likelihood ratio test is the most
powerful test statistic available for testing the null hypothesis. We can rewrite this
last expression as
(14)  T = 2(𝓛̃ − 𝓛)

where

(15)  𝓛̃ = ln L̃0 = ln[ ∏(i=1 to n) f(xi | m̃, σ̃θ) ]

and

(16)  𝓛 = ln L1 = ln[ ∏(i=1 to n) f(xi | m, σθ) ]
The natural log of the likelihood ratio of a null hypothesis to an alternative hypothesis is our test statistic, and its distribution can be determined in the limit as the sample size approaches infinity. The test statistic is then used to form decision regions where the null hypothesis can be accepted or rejected. A convenient result attributable to Wilks [10] indicates that as the sample size n approaches infinity, the test statistic T will be asymptotically χ²-distributed for a nested composite hypothesis. If one hypothesis can be derived as a limiting sequence of another, we say that the two hypotheses are nested. In our case the sample (rx1, rx2, …, rxn), representing the rth sample, is drawn from a Weibull distribution under H0. These same samples are used in the alternative hypothesis H1, and because their parent distribution is assumed to be a Weibull distribution under both hypotheses, the two hypotheses are nested and conditions are satisfied for the application of Wilks's theorem.
The test statistic is designed in such a way that the probability of a Type I error does not exceed α, a value that we control. Thus the probability of a Type I error is fixed, and we search for the test statistic that maximizes (1 − β), where again β is the probability of a Type II error. Where inferences are being made on parameters from a population characterized by a two-parameter Weibull distribution, the degree of freedom for the χ² distribution is one, and the values of the χ² distribution are easily calculated. One can compute the test statistic T and compare it to the χ² value corresponding to a desired significance level to define a rejection region. This is outlined in the next section. The value of the ratio of the two likelihood functions defined above (L1/L̃0) approaches 1 in the optimal critical region (i.e., the value of the test statistic T should be small). This is a result of minimizing α and maximizing (1 − β). The ratio is high in the complementary region. A high ratio corresponds to a high probability of a correct decision under H0. The likelihood ratio test implies that the null hypothesis should be rejected if the value of the ratio is too small. How small is too small depends on the significance level of the test (i.e., on what probability of Type I error is considered tolerable).
Lower values of the likelihood ratio mean that the observed result is much less
likely to occur under the null hypothesis than under the alternative hypothesis.
Higher values of the likelihood ratio mean that the observed outcome is as likely, or nearly as likely, to occur under the null hypothesis as under the alternative, and the null hypothesis cannot be rejected.
The likelihood ratio test and its close relationship to the v2 test can be used to
determine what sample size will provide a reasonable approximation of the true
population parameters.
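The test statistic of Eqs 13-16 and its comparison against the χ² critical value can be sketched as follows. The function names are illustrative, and the one-degree-of-freedom χ² quantile is obtained from the identity χ²(1) = Z², using the standard normal inverse CDF rather than a dedicated statistics library.

```python
import math
import random
from statistics import NormalDist

def weibull_loglik(data, m, sigma):
    """Log likelihood of a two-parameter Weibull sample."""
    return sum(math.log(m / sigma) + (m - 1.0) * math.log(x / sigma)
               - (x / sigma) ** m for x in data)

def lr_statistic(data, m_hat, s_hat, m0, s0):
    """T = 2(L~ - L): twice the log-likelihood drop from the
    MLE point (m_hat, s_hat) to a candidate point (m0, s0)."""
    return 2.0 * (weibull_loglik(data, m_hat, s_hat)
                  - weibull_loglik(data, m0, s0))

def chi2_crit(confidence):
    """Chi-square (1 dof) quantile via chi2(1) = Z**2."""
    return NormalDist().inv_cdf(0.5 * (1.0 + confidence)) ** 2

# Example: the statistic is large for parameter pairs far from
# the data-generating values (here the true parameters stand in
# for the MLE; values m = 17, sigma_theta = 400 are hypothetical)
random.seed(1)
data = [random.weibullvariate(400.0, 17.0) for _ in range(200)]
t_far = lr_statistic(data, 17.0, 400.0, 5.0, 300.0)
```

Candidate pairs with T below `chi2_crit(γ)` cannot be rejected at significance level α = 1 − γ; those pairs are exactly the interior of the likelihood ratio ring constructed in the next section.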
The Likelihood Ratio Ring
The likelihood ratio confidence bounds are based on the inequality
(17)  T = 2(𝓛̃ − 𝓛) = −2 ln[ L(m, σθ) / L(m̃, σ̃θ) ] ≤ χ²α,1

The equality in Eq 17 can be expressed as

(18)  L(m, σθ) = L(m̃, σ̃θ) exp(−χ²α,1 / 2)

Here, m̃ and σ̃θ are maximum likelihood estimates of the distribution parameters based on the data obtained from a sample. These parameter estimates are random variables (they vary from sample to sample), as are the test statistic T and χ². Equation 17 stipulates a relationship between random variables. The true distribution parameters m and σθ are fixed values, but they are unknown to us unless the population is completely sampled.
However, if α is designated, then a value for χ² (i.e., a realization) is established. Once this realization is established for χ², a realization for the test statistic T can be established through Eq 17. For a given significance level, confidence bounds m′ and σ′θ can be computed that satisfy Eq 18 (i.e., these bounds satisfy the following expression):

(19)  L(m′, σ′θ) − L(m̃, σ̃θ) exp(−χ²α,1 / 2) = 0

With a given value of m′, a pair of values can be found for σ′θ. This procedure is repeated for a range of m′ values until there are enough values to produce a smooth ring. These parameters map a contour ring in a plane perpendicular to the log likelihood axis (see Fig. 1). A change in the significance level results in a different-sized ring. From the geometry in Fig. 1 we can see that the true distribution parameters that are unknown to us will lie within the ring.

FIG. 1 Log likelihood frequency plot of L(m, σθ) with likelihood confidence ring and associated test statistic T.
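The ring-tracing procedure of Eq 19 can be sketched in code: for each trial modulus m′, bisection locates the two σ′θ values at which the log likelihood falls χ²α,1/2 below its maximum. `ring_points` is a hypothetical helper written for this illustration, not code from the paper, and the MLE in the example is found by a crude profile-likelihood grid search.

```python
import math
import random
from statistics import NormalDist

def loglik(data, m, s):
    """Two-parameter Weibull log likelihood."""
    return sum(math.log(m / s) + (m - 1.0) * math.log(x / s)
               - (x / s) ** m for x in data)

def ring_points(data, m_hat, s_hat, confidence=0.90, n_grid=41):
    """Trace the likelihood ratio ring: (m', sigma') pairs
    satisfying Eq 19 at the given confidence level."""
    crit = NormalDist().inv_cdf(0.5 * (1.0 + confidence)) ** 2
    target = loglik(data, m_hat, s_hat) - 0.5 * crit
    n = len(data)

    def sigma_roots(m):
        # for fixed m the log likelihood is unimodal in sigma,
        # peaking at the closed-form conditional MLE
        peak = (sum(x ** m for x in data) / n) ** (1.0 / m)
        if loglik(data, m, peak) < target:
            return []                      # m' lies outside the ring
        roots = []
        for a, b in ((0.2 * peak, peak), (peak, 5.0 * peak)):
            for _ in range(80):            # bisect on each flank
                mid = 0.5 * (a + b)
                fa = loglik(data, m, a) - target
                fm = loglik(data, m, mid) - target
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
        return roots

    pts = []
    for i in range(n_grid):
        m = m_hat * (0.4 + 1.2 * i / (n_grid - 1))
        pts.extend((m, s) for s in sigma_roots(m))
    return pts

# Example: a 90 % ring for a simulated dataset (hypothetical
# m = 17, sigma_theta = 400, n = 30); crude MLE grid search
rng = random.Random(7)
data = [rng.weibullvariate(400.0, 17.0) for _ in range(30)]

def _profile(m):
    s = (sum(x ** m for x in data) / len(data)) ** (1.0 / m)
    return loglik(data, m, s), s

m_hat = max((0.5 * k for k in range(4, 80)), key=lambda m: _profile(m)[0])
s_hat = _profile(m_hat)[1]
pts = ring_points(data, m_hat, s_hat, confidence=0.90)
```

Plotting `pts` in the m, σθ plane reproduces the kind of contour sketched in Fig. 1; raising the confidence level enlarges the ring, as Fig. 3 illustrates.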
Aspects of Likelihood Confidence Rings
In order to present aspects of the likelihood confidence rings, Monte Carlo simulation is utilized to obtain test data. Using Monte Carlo simulation allows us the convenience of knowing what the true distribution parameters are for a particular
dataset. Here, it is arbitrarily assumed that the Weibull modulus is 17 and the Weibull characteristic strength is 400 MPa.
Figure 2 shows the likelihood confidence ring for a 90 % confidence level and a
sample size of 10, along with the true distribution parameters and the estimated distribution parameters. If the true distribution parameter pair were unknown, we
would be 90 % confident that the true distribution parameters were within the ring.
If the Monte Carlo simulation process were continued nine more times (i.e., if we
were in possession of ten simulated datasets), then on average one of those datasets
would produce a likelihood confidence ring that did not contain the true distribution parameter pair.
In Fig. 3 the effect of holding the sample size fixed and varying the confidence
level is presented. The result is a series of nested likelihood confidence rings. Here
FIG. 2 Confidence ring contour for a sample size of 10 (m = 17, σθ = 400).
we have one dataset and multiple rings associated with increments of the confidence level from 50 % to 95 %. Note that as the confidence level increases, the size
of the likelihood confidence ring expands. For a given number of test specimens in
a dataset, the area encompassed by the likelihood confidence ring expands as we
become more and more confident that the true distribution parameters are contained in the ring.
FIG. 3 Dependence of likelihood confidence rings on γ for a sample size of 30 (m = 17, σθ = 400).
FIG. 4 Likelihood confidence rings for sample sizes ranging from 10 to 100 (m = 17, σθ = 400).
The next figure, Fig. 4, depicts the effect of varying the sample size and holding
the confidence level fixed at c ¼ 90 %. The sample size was increased from n ¼ 10 to
n ¼ 100. Note that all the likelihood confidence rings encompass the true distribution parameters used to generate each sample. In addition, the area within the rings
grows smaller as the sample size increases. As the sample size increases, we gain information on the population and thereby reduce the region that could contain the
true distribution parameters for a given level of confidence.
Figure 5 depicts a sampling procedure in which the size of the sample is held
fixed (i.e., n ¼ 10) and the sampling process and ring generation have been repeated
100 times. For a fixed confidence level of 90 %, one would expect that ten rings
would not encompass the true distribution parameters. Indeed that is the case. The
90 likelihood confidence rings that encompassed the true distribution parameters
are outlined in blue. The ten likelihood confidence rings that did not contain the
distribution parameters are outlined in dark orange.
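The repeated-sampling experiment of Fig. 5 can be reproduced in outline: draw a sample from a known Weibull population, fit the MLEs, and check whether the true pair lies inside the ring, which by Eq 17 reduces to testing T ≤ χ²α,1 at the true parameters. The helpers are illustrative stand-ins (the MLE solver is a crude version of the ASTM D7846 estimator), the paper's one-degree-of-freedom χ² convention is followed, and with small samples the observed coverage need not match the nominal level exactly.

```python
import math
import random
from statistics import NormalDist

def loglik(data, m, s):
    """Two-parameter Weibull log likelihood."""
    return sum(math.log(m / s) + (m - 1.0) * math.log(x / s)
               - (x / s) ** m for x in data)

def profile_mle(data):
    """Crude bisection MLE for (m, sigma_theta); a stand-in for
    the formal ASTM D7846 estimator."""
    n = len(data)
    xmax = max(data)
    xs = [x / xmax for x in data]          # rescale to avoid overflow
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n

    def g(m):
        s = sum(x ** m for x in xs)
        sl = sum((x ** m) * lx for x, lx in zip(xs, logs))
        return sl / s - 1.0 / m - mean_log

    lo, hi = 1e-3, 1.0
    while g(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0.0 else (lo, mid)
    m = 0.5 * (lo + hi)
    return m, xmax * (sum(x ** m for x in xs) / n) ** (1.0 / m)

def ring_covers_truth(n, m_true, s_true, confidence, rng):
    """One Monte Carlo replicate: does the ring built from a
    sample of size n contain the true parameter pair?"""
    data = [rng.weibullvariate(s_true, m_true) for _ in range(n)]
    m_hat, s_hat = profile_mle(data)
    t = 2.0 * (loglik(data, m_hat, s_hat) - loglik(data, m_true, s_true))
    return t <= NormalDist().inv_cdf(0.5 * (1.0 + confidence)) ** 2
```

Summing `ring_covers_truth` over 100 replicates with n = 10 and γ = 0.9 counts how many rings capture the true pair, the quantity tallied in Fig. 5.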
Confidence Rings and Material Acceptance
The material acceptance approach outlined here depends on several things. First
one must have the ability to compute the probability of failure of the component
under design. This probability is designated (Pf)component and is quantified using a
FIG. 5 100 likelihood confidence rings. For all rings, n = 10 and γ = 0.9 (m = 17, σθ = 400).
hazard rate format—that is, the probability of failure is expressed as a fraction with
a numerator of 1. The method for computing this quantity is available in the ASME
Boiler and Pressure Vessel Code [7].
The component probability of failure is modeled assuming the underlying strength is characterized by a two-parameter Weibull distribution. Thus a component probability of failure curve can be depicted in an m, σθ graph as shown in Fig. 6. Points along the curve represent parameter pairs corresponding to a specified probability of failure. This curve is referred to as a material performance curve. We overlay this graph with point estimates of the Weibull distribution parameters obtained from tensile strength data that a typical material supplier might provide. Point estimates from these data that plot to the right of the material performance curve represent a lower probability of failure. Conversely, point estimates to the left of this curve are associated with a higher probability of failure. Thus the material performance curve defines two regions of the m, σθ space, an acceptable performance region and a rejection region relative to a specified component probability of failure.
The material performance curve is easily married to a likelihood confidence
ring (discussed in previous sections). This allows the component fabricator to
decide whether the material supplier is providing a material with high enough
FIG. 6 Generic material performance curve.
quality predicated on the component design and the failure data. Keep in mind that
parameter estimates are estimates of the true distribution parameters of the population, values that are never known in real-life applications. However, through the
use of the likelihood confidence ring method we can define a region in close proximity to the estimated parameter pair, knowing with some level of assurance that the true distribution parameters are contained within that region. If that region in its entirety falls to the right of the material performance curve, the component fabricator can accept the material with a known level of quality (i.e., the significance level). Not surprisingly, we define this procedure as the quality acceptance criterion.
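The acceptance criterion can be stated as a predicate over the ring: accept only if every parameter pair on the likelihood confidence ring falls on the safe side of the material performance curve. The probability-of-failure expression below is the simplified two-parameter Weibull form for a uniformly stressed uniaxial specimen, a stand-in for the full ASME BPVC component calculation; the function names and numbers are illustrative.

```python
import math

def pf_uniaxial(m, sigma_theta, applied_stress):
    """Simplified Weibull probability of failure for a uniformly
    stressed specimen (stand-in for the ASME BPVC calculation)."""
    return 1.0 - math.exp(-((applied_stress / sigma_theta) ** m))

def accept_material(ring_pts, applied_stress, pf_allow):
    """Quality acceptance criterion: every (m, sigma_theta) pair
    on the confidence ring must meet the allowable failure
    probability, i.e. the whole ring sits right of the curve."""
    return all(pf_uniaxial(m, s, applied_stress) <= pf_allow
               for m, s in ring_pts)

# Illustrative ring points (hypothetical values), checked against
# a design stress of 200 MPa and an allowable Pf of 1 in 10,000
safe_ring = [(17.0, 400.0), (14.0, 420.0), (20.0, 380.0)]
```

If the performance curve slices through the ring, as in Fig. 7, `accept_material` returns False, and the designer must either relax the allowable probability of failure (shift the curve) or reduce the confidence level so the ring shrinks.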
We have combined the two concepts, the likelihood confidence ring and the
material performance curve, in one figure (Fig. 7). Here the material performance
curve given in Fig. 6 is overlain with the likelihood confidence ring from Fig. 2. This
is a graphical representation of the quality assurance process. Rings that reside completely to the right of the material performance curve would represent acceptable
materials. Those rings to the left would represent unacceptable materials and would
be rejected. In the specific case presented, the material performance curve cuts
through the likelihood confidence ring. In this case there are certain regions of the
likelihood confidence ring that produce a safe design space, and there is a region of
the likelihood confidence ring that produces an unsafe design space. In this situation we know the distribution parameters, and they are purposely to the right of the
material performance curve. But given the sample size, the ring did not reside
entirely in the safe region. Moreover, in normal designs we never know the true
FIG. 7 Material performance curve with likelihood confidence ring contour, n = 10 (m = 17, σθ = 400).
distribution parameters, so we do not know where the true distribution parameter
pair resides inside the likelihood confidence ring.
When the likelihood confidence ring resides totally to the left of the performance curve, the choice to reject the material is quite clear. When the likelihood
confidence ring lies completely to the right of the material performance curve,
then once again, the choice is quite clear: accept the material. When the material
performance curve slices through the likelihood confidence ring, we can shift
the material performance curve to the left, as depicted in Fig. 8. This shift represents
a reduction of component reliability or, alternatively, an increase in the component
probability of failure. Alternatively, the confidence level associated with the likelihood confidence ring can be reduced so that the ring shrinks enough to lie completely to the right of the material performance curve. This is depicted in Fig. 9.
An interesting aspect of this approach is that the likelihood confidence rings appear to give a good indication of which side of the material performance curve the true distribution parameters lie on. If the material performance curve slices through a likelihood confidence ring for a specified confidence level, then as the ring size is diminished the ring becomes tangent to one side of the curve or the other. When this paper was written, our experience was that the side of the material performance curve to which the ring becomes tangent matches the side on which the true distribution parameters lie. We emphasize that this observation is anecdotal.
STP 1578 On Graphite Testing for Nuclear Applications

FIG. 8 Two parallel material performance curves with likelihood confidence ring (m = 17, σθ = 400).

FIG. 9 Material performance curves with likelihood confidence rings for changing values of γ (m = 17, σθ = 400).

FIG. 10 Material performance curves with likelihood confidence rings for changing values of γ (m = 6, σθ = 350).

An example in which the true distribution parameters were chosen to the left of the material performance curve is depicted in Fig. 10. The true distribution parameters are known because we are conducting a Monte Carlo simulation exercise to produce the failure data. As the confidence level decreases in Fig. 10, the rings become tangent to the curve on the rejection side.
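The Monte Carlo generation of failure data mentioned above can be sketched with inverse-CDF sampling from the two-parameter Weibull distribution; the parameter values mirror Fig. 10 (m = 6, σθ = 350), while the function name and sample size are our own illustrative choices:

```python
import numpy as np

def weibull_failure_data(m, sigma_theta, n, rng):
    # Invert the Weibull CDF  P_f = 1 - exp(-(sigma/sigma_theta)**m):
    #   sigma = sigma_theta * (-ln(1 - U))**(1/m),  U ~ Uniform(0, 1)
    u = rng.uniform(size=n)
    return sigma_theta * (-np.log1p(-u)) ** (1.0 / m)

rng = np.random.default_rng(42)
sample = weibull_failure_data(6.0, 350.0, 30, rng)
```

Because the true (m, σθ) pair is chosen up front, each simulated sample can be fit and its likelihood confidence ring compared against a performance curve whose correct accept/reject outcome is known in advance.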
Summary
This effort focused on graphite materials and the details of calculating point estimates for the Weibull distribution parameters that characterize tensile strength. One can easily generate point estimates from failure data using maximum
likelihood estimators. More information regarding the population (i.e., more failure
data) always improves the quality of point estimates; the question becomes how
much data is sufficient given the application. The work outlined here speaks directly
to this issue.
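The sample-size question can be illustrated with a short Monte Carlo sketch, assuming only the two-parameter Weibull model; `mle_spread`, the trial count, and the parameter values are our own choices. Repeating the fit over many simulated samples shows the scatter of the Weibull modulus estimate shrinking as n grows:

```python
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(11)
M_TRUE, S_TRUE = 17.0, 400.0

def mle_spread(n, trials=200):
    """Standard deviation of the maximum likelihood estimate of the
    Weibull modulus over repeated samples of size n."""
    m_hats = []
    for _ in range(trials):
        d = weibull_min.rvs(M_TRUE, scale=S_TRUE, size=n, random_state=rng)
        m_hat, _, _ = weibull_min.fit(d, floc=0)  # MLE with location fixed at 0
        m_hats.append(m_hat)
    return float(np.std(m_hats))

# More failure data -> tighter point estimates -> smaller confidence rings.
spread_small, spread_large = mle_spread(10), mle_spread(50)
```

The same shrinkage is what drives the likelihood confidence ring toward the MLE point as the number of specimens increases.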
Hypothesis testing and the relationship it maintains with parameter estimation were outlined. A test statistic was adopted that allows one to map out an acceptance region in the (m, σθ) parameter space. The theoretical support for the equations used to generate the likelihood rings was outlined. Inferential statistics allowed us to generate confidence bounds on the true distribution parameters utilizing the test data at hand. These bounds are dependent on the size of the sample used to calculate the point estimates. The effort focused on a particular type of confidence bound known as the likelihood confidence ring.
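As a compact restatement in standard notation (consistent with Wilks's large-sample result [10], not a transcription of the paper's own equations), the γ-level likelihood confidence ring is the boundary of the acceptance region

```latex
T(m, \sigma_\theta)
  = 2\left[\ln L\!\left(\hat{m}, \hat{\sigma}_\theta\right)
         - \ln L\!\left(m, \sigma_\theta\right)\right]
  \le \chi^2_{2,\,\gamma},
\qquad
L(m, \sigma_\theta)
  = \prod_{i=1}^{n} \frac{m}{\sigma_\theta}
    \left(\frac{\sigma_i}{\sigma_\theta}\right)^{m-1}
    \exp\!\left[-\left(\frac{\sigma_i}{\sigma_\theta}\right)^{m}\right]
```

where L is the two-parameter Weibull likelihood of the n failure stresses σᵢ, (m̂, σ̂θ) are the maximum likelihood estimates, and χ²₂,γ is the γ quantile of the chi-square distribution with two degrees of freedom, one per estimated parameter.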
Component reliability curves were discussed. The concepts of the likelihood
confidence rings and the component probability of failure curves were combined
graphically. This combination gives rise to a material qualification process. This process combines information regarding the reliability of the component and the parameter estimates to assess the quality of the material.
References

[1] ASTM C565-93(2010)e1: Test Methods for Tension Testing of Carbon and Graphite Mechanical Materials, Annual Book of ASTM Standards, ASTM International, West Conshohocken, PA, 2010.

[2] ASTM C749-08(2010)e1: Test Method for Tensile Stress-Strain of Carbon and Graphite, Annual Book of ASTM Standards, ASTM International, West Conshohocken, PA, 2010.

[3] ASTM C781-08: Practice for Testing Graphite and Boronated Graphite Materials for High-Temperature Gas-Cooled Nuclear Reactor Components, Annual Book of ASTM Standards, ASTM International, West Conshohocken, PA, 2008.

[4] ASTM D7775-11e1: Guide for Measurements on Small Graphite Specimens, Annual Book of ASTM Standards, ASTM International, West Conshohocken, PA, 2011.

[5] ASTM C651-11: Test Method for Flexural Strength of Manufactured Carbon and Graphite Articles Using Four-Point Loading at Room Temperature, Annual Book of ASTM Standards, ASTM International, West Conshohocken, PA, 2011.

[6] ASTM D7846-13: Reporting Uniaxial Strength Data and Estimating Weibull Distribution Parameters for Advanced Graphites, Annual Book of ASTM Standards, ASTM International, West Conshohocken, PA, 2013.

[7] ASME, “Article HHA-II-3000, Section III, Division 5, High Temperature Reactors, Rules for Construction of Nuclear Facility Components,” ASME Boiler and Pressure Vessel Code, ASME, New York, 2013.

[8] Fisher, R. A., “Theory of Statistical Estimation,” Proc. Cambridge Philos. Soc., Vol. 22, 1925, pp. 700–725.

[9] Neyman, J. and Pearson, E., “On the Problem of the Most Efficient Tests of Statistical Hypotheses,” Philos. Trans. R. Soc. London Series A, Vol. 231, 1933, pp. 289–337.

[10] Wilks, S. S., “The Large Sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses,” Ann. Math. Stat., Vol. 9, No. 1, 1938, pp. 60–62.