
Monte Carlo method for finding run length distribution
• Motivation: run length (RL) distributions are difficult to calculate
analytically for most detection methods (CUSUM and EWMA
included). Yet it is critical to know the RL distribution
because it is directly related to the design of a detection method.
- For CUSUM and EWMA, ARL0 ≠ 1/α and ARL1 ≠ 1/(1-β).
Why?
• For complicated detection methods, people often use Monte
Carlo simulation to approximate the RL distributions
and compute the ARLs.
ISEN 614 Advanced Quality Control (Anomaly and Change Detection)
Dr. Yu Ding
• Basic idea:
• Example: RL distribution and ARL
Monte Carlo method: comments
• The Monte Carlo method works in principle for any kind of
detection method. Its major shortcoming is its high, sometimes
unaffordable, computational demand.
• Both m and N should be large enough.
- m should be large because otherwise you may not be able to
record a valid Lj (imagine Lj = 100 but m = 50). A value of m = 5,000
should be large enough.
- It is usually not easy to know a priori exactly how large N
should be to ensure the required accuracy of the estimates.
Typically, set N = 10,000.
• The MATLAB command "randn" generates N(0,1) samples. To
generate N(µ,σ²) samples, first generate N(0,1) samples, then multiply them by σ
and add µ, i.e., µ + randn*σ.
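The simulation procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the course's own implementation: it takes m to be the length of each simulated sequence, N to be the number of simulated sequences, and Lj to be the run length recorded for sequence j, which is how the comments above use these symbols. The detector is a two-sided tabular CUSUM with reference value k and threshold h; the values k = 0.5 and h = 4 and the function name `simulate_run_lengths` are illustrative choices, not taken from the slides.

```python
import numpy as np

def simulate_run_lengths(shift=0.0, k=0.5, h=4.0, m=5000, N=10000, seed=0):
    """Simulate N run lengths of a two-sided tabular CUSUM on N(shift, 1) data.

    k and h are assumed (illustrative) CUSUM design parameters.
    """
    rng = np.random.default_rng(seed)
    # m + 1 marks a sequence that never signaled within its m observations.
    run_lengths = np.full(N, m + 1)
    for j in range(N):
        x = shift + rng.standard_normal(m)   # mu + randn * sigma with sigma = 1
        c_plus = c_minus = 0.0
        for t in range(m):
            c_plus = max(0.0, c_plus + x[t] - k)     # upper CUSUM statistic
            c_minus = max(0.0, c_minus - x[t] - k)   # lower CUSUM statistic
            if c_plus > h or c_minus > h:
                run_lengths[j] = t + 1               # L_j: time of the first signal
                break
    return run_lengths

if __name__ == "__main__":
    rl0 = simulate_run_lengths(shift=0.0, N=2000)    # in-control data
    rl1 = simulate_run_lengths(shift=1.0, N=2000)    # mean shifted by one sigma
    print("estimated ARL0:", rl0.mean())
    print("estimated ARL1:", rl1.mean())
```

The empirical distribution of the returned values approximates the RL distribution, and their average approximates the ARL. A noticeable share of values equal to m + 1 would indicate that m is too small for the chosen design.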
Detection using discrete data
• What we have discussed so far mainly concerns continuous data.
Many of these methods are also applicable to detection in
discrete data (also called attribute data).
• Detection based on discrete data is very
important in practice, especially in applications such as
health care or security surveillance, for example:
- counts of mortality
- counts of ER visits
- counts of accidents
• But because discrete data usually follow a distribution that is not
normal, some modifications are needed.
Distribution for discrete data
• Discrete data typically follow one of three distributions: the
hypergeometric, the binomial, and the Poisson
distribution.
• Three scenarios:
Hypergeometric: Given a lot of N items, among which D items are
defective, what is the probability of getting x defective items in a
random sample of n items?
Binomial: Given a lot of items (lot size unknown) in which each item
is defective with probability p, what is the probability
of getting x defective items in a random sample of n items?
Poisson: Given a lot of items (lot size unknown), suppose that a
sample of a fixed size (the sample size itself is not given) contains,
on average, λ defective items. What is the probability of
observing x defective items in a random sample of the same size?
• Probability mass functions for the three scenarios:
• One distribution can be used to approximate another: the binomial can approximate the hypergeometric, and the Poisson can approximate the binomial.
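For reference, the probability mass functions of the three scenarios can be written as follows. This assumes the usual textbook parameterization, using the symbols of the scenarios above (N, D, n, x, p, λ):

```latex
\begin{align*}
\text{Hypergeometric:}\quad & p(x) = \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}},
  && \max(0,\, n-(N-D)) \le x \le \min(n, D) \\
\text{Binomial:}\quad & p(x) = \binom{n}{x} p^{x} (1-p)^{n-x},
  && x = 0, 1, \ldots, n \\
\text{Poisson:}\quad & p(x) = \frac{e^{-\lambda}\lambda^{x}}{x!},
  && x = 0, 1, 2, \ldots
\end{align*}
```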
Supplement material for self-study
- Suppose that you are given N items -- this is a FINITE population.
- Among them, D (D ≤ N) items fall into a class of special interest,
for example, defective or non-conforming items.
- Take a random sample of n (n ≤ N) items from the population
without replacement,
- Define by x the number of items in the sample that fall into the
class of interest.
⇒ Then x follows a hypergeometric distribution.
Its population mean and variance are
$$\mu = \frac{nD}{N}, \qquad \sigma^2 = \frac{nD}{N}\left(1 - \frac{D}{N}\right)\left(\frac{N-n}{N-1}\right)$$
You have received an order of 100 robotic resistance
spot welders and plan to inspect 10 of them to check if
they meet the peak current specifications. If 5 out of the
100 welders do not meet specs, what is the distribution
for the number in your sample that will not meet specs?
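One way to work this example numerically is sketched below, assuming the standard hypergeometric probability mass function with N = 100, D = 5, and n = 10 read off from the problem statement. The function name `hypergeom_pmf` is an illustrative choice.

```python
from math import comb

def hypergeom_pmf(x, N, D, n):
    """P(x items of interest in a sample of n drawn without replacement)."""
    if x < max(0, n - (N - D)) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

N, D, n = 100, 5, 10   # lot size, nonconforming welders, sample size
for x in range(min(n, D) + 1):
    print(f"P(x = {x}) = {hypergeom_pmf(x, N, D, n):.4f}")

# Mean and variance from the formulas for the hypergeometric distribution.
mean = n * D / N
variance = (n * D / N) * (1 - D / N) * ((N - n) / (N - 1))
print(f"mean = {mean:.3f}, variance = {variance:.4f}")
```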
ISEN 614 Advanced Quality Control (Anomaly and Change Detection) Spring 2008
Bernoulli trials: A sequence of n independent trials, where the
outcome of each trial is either a “success” or a “failure”.
Examples of Bernoulli Trials
• Toss coin (outcome: success = "head", failure = "tail")
• Coming to class (outcome: success = "on time",
failure = "late")
• Play slot machine (outcome: success = "win", failure =
"lose")
• Quality inspection (outcome: success = "conforming",
failure = "nonconforming")
Binomial distribution: If the probability of a failure on any trial is a
constant, p, then the number of failures, x, in n Bernoulli trials has the
binomial distribution
$$p(x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \qquad x = 0, 1, \ldots, n$$
Its population mean and variance are
$$E(x) = np, \qquad \mathrm{Var}(x) = np(1-p)$$
Assumptions of the binomial distribution:
(1) Constant probability of failure p, so that the probability of success is
also constant;
(2) Two mutually exclusive outcomes;
(3) All trials statistically independent.
• You have just received 10 new robotic resistance spot
welders and plan to inspect all ten of them to check if
they meet the peak current specifications. If the
probability that any given welder does not meet specs is
0.05, what is the distribution for the number that will not
meet specs in your sample?
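A short sketch of this example, assuming the number of nonconforming welders follows a binomial distribution with n = 10 and p = 0.05 as stated in the problem. The function name `binom_pmf` is an illustrative choice.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(x failures in n independent trials with failure probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.05   # welders inspected, probability of not meeting specs
for x in range(n + 1):
    print(f"P(x = {x}) = {binom_pmf(x, n, p):.6f}")

print(f"E(x) = {n * p:.3f}, Var(x) = {n * p * (1 - p):.4f}")
```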
Similarity between the hypergeometric and binomial distributions
[Figure: two plots of p(x) versus x, one for the hypergeometric distribution and one for the binomial distribution.]
• For the hypergeometric distribution, we take a sample of n items from a
population of N items without replacement; the number of
defective items x in the sample follows a hypergeometric distribution.
• If we instead take the sample of n from the population
of N items with replacement, the corresponding x follows a
binomial distribution.
• This is because, with replacement, each time we pull an item out
of the population:
- the outcome is binary (defective or nondefective);
- the probability of a defective item is constant, namely D/N;
- the outcome is independent of the previous outcomes.
⇒ This is exactly the scenario for the binomial distribution as we
described it.
• What if we sample without replacement? Consider
a population of N items, among which D
are defective:
- Prob(1st item is defective) = $\frac{D}{N}$
- Prob(2nd item is defective) = $\frac{D}{N-1}$ if the first draw is not defective; $\frac{D-1}{N-1}$ if the first draw is defective
- Prob(3rd item is defective) = $\frac{D}{N-2}$ if neither draw is defective; $\frac{D-1}{N-2}$ if either one draw is defective; $\frac{D-2}{N-2}$ if both draws are defective
• In the hypergeometric scenario (without replacement), the outcome
of each item is not independent of the previous items, nor is the
probability of getting a defective item constant.
• The only difference between the binomial and hypergeometric
scenarios is whether or not we replace each item. This
difference becomes negligible when N is large and n << N.
Under that circumstance,
Binomial dist ≈ Hypergeometric dist
• Conclusion: The binomial scenario is the hypergeometric
scenario with sampling done with replacement, or with sampling from an
infinitely large population.
• Rule of thumb:
- Use the binomial if you are told the value of p, or if n/N ≤ 0.1.
- Use the hypergeometric if D and N are given and n/N is
greater than 0.1.
• Approximation: when n/N ≤ 0.1, we can use a binomial
distribution to approximate a hypergeometric distribution, with p
estimated by $\hat{p} = D/N$.
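The effect of the ratio n/N on this approximation can be illustrated with a small sketch. The lot (N = 100, D = 5) is borrowed from the welder example, while the list of sample sizes and the helper name `max_abs_diff` are illustrative assumptions. The script reports the largest pointwise gap between the hypergeometric probabilities and the binomial probabilities computed with p̂ = D/N.

```python
from math import comb

def hypergeom_pmf(x, N, D, n):
    """Hypergeometric probability of x items of interest in a sample of n."""
    if x < max(0, n - (N - D)) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """Binomial probability of x items of interest in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def max_abs_diff(N, D, n):
    """Largest gap between the two pmfs over x = 0, ..., n, using p_hat = D/N."""
    p_hat = D / N
    return max(abs(hypergeom_pmf(x, N, D, n) - binom_pmf(x, n, p_hat))
               for x in range(n + 1))

N, D = 100, 5
for n in (5, 10, 20, 50):   # assumed sample sizes, chosen for illustration
    print(f"n/N = {n / N:.2f}: max |hypergeometric - binomial| = "
          f"{max_abs_diff(N, D, n):.4f}")
```

In this setting the gap grows as n/N increases, which is the behavior the rule of thumb is meant to guard against.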
• The Poisson distribution is used when inspecting a single unit that may have
multiple defects. Examples are:
- surface flaws on a refrigerator;
- potholes in a section of highway;
- machine breakdowns in a fixed time interval.
• In the earlier inspection scenario, the single unit is a random sample of
a fixed size.
• Define λ = expected average number of defects per unit;
x = the random number of defects on an actual (inspected)
unit.
Then x follows a Poisson distribution
$$p(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \qquad x = 0, 1, \ldots$$
with mean and variance
$$\mu = \lambda, \qquad \sigma^2 = \lambda$$
An automobile manufacturer has been experiencing an excessive
number of paint defects (scratches, blemishes, bubbles, etc.) on the
passenger-side fender. The mean number of defects per fender is 0.5
(i.e., one defect every two cars). Assume a Poisson distribution for the number of
defects per fender.
What is the probability of an individual fender having no defects?
What is the probability of a fender having more than two defects?
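Both questions can be checked with a few lines of Python, assuming the Poisson probability mass function with λ = 0.5 as stated in the problem. The function name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x defects on a unit) when the mean number of defects per unit is lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 0.5   # mean number of paint defects per fender

# P(no defects) = p(0)
p_no_defects = poisson_pmf(0, lam)

# P(more than two defects) = 1 - [p(0) + p(1) + p(2)]
p_more_than_two = 1.0 - sum(poisson_pmf(x, lam) for x in range(3))

print(f"P(x = 0) = {p_no_defects:.4f}")
print(f"P(x > 2) = {p_more_than_two:.4f}")
```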
Glass bottles are formed by pouring molten glass into a mold. The
molten glass is prepared in a furnace lined with firebrick. As the
firebrick wears, small pieces of brick are mixed into the molten
glass and finally appear as defects (called "stones") in the bottle.
If we can assume that stones occur randomly at the rate of 0.001
per bottle, what is the probability that a bottle selected at random
will contain at least one such defect?
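A worked sketch of this example, assuming the number of stones per bottle is Poisson with λ = 0.001:

```latex
\begin{align*}
P(x \ge 1) &= 1 - p(0) = 1 - e^{-\lambda} \\
           &= 1 - e^{-0.001} \approx 0.0009995
\end{align*}
```

For such a small λ the answer is essentially λ itself, since $1 - e^{-\lambda} \approx \lambda$ when λ is close to zero.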
• In the binomial distribution, if we count the number of defective
items in a sample of n, the maximum possible number is n.
• In the Poisson distribution, the number of potential defects on a unit
could be infinite (as if n → ∞), but the probability of occurrence
(p) of each defect is very small.
• In the limit n → ∞ and p → 0 with np held at a finite, nonzero constant,
Binomial → Poisson with λ = np.
An automobile manufacturer has been experiencing an excessive number of
paint defects (scratches, blemishes, bubbles, etc.) in the passenger side
fender. The mean number of defects per fender is 0.5 (i.e., one defect every
two cars). Use Binomial distribution to calculate the probability of finding
x = 0, 1, 2, … defects on a fender and compare the results with that using
Poisson distribution.
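A possible way to carry out this comparison is sketched below. The problem does not specify n, so the values n = 10, 100, 1000 are assumptions made for illustration; in each case p is set to λ/n so that np = 0.5. The helper name `max_abs_diff` is also illustrative.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Binomial probability of x defects among n possible defect locations."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x defects with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

def max_abs_diff(n, lam, x_max=5):
    """Largest gap between Binomial(n, lam/n) and Poisson(lam) for x = 0..x_max."""
    p = lam / n
    return max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam))
               for x in range(x_max + 1))

lam = 0.5   # mean number of defects per fender
for n in (10, 100, 1000):   # assumed numbers of trials, not given in the problem
    row = ", ".join(f"{binom_pmf(x, n, lam / n):.4f}" for x in range(4))
    print(f"Binomial n = {n:4d}: {row}  (max diff {max_abs_diff(n, lam):.5f})")
print("Poisson          : " + ", ".join(f"{poisson_pmf(x, lam):.4f}" for x in range(4)))
```

As n grows with np fixed, the binomial probabilities move toward the Poisson ones.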
Hypergeometric versus Binomial:
H: sample without replacement from a finite population.
B: sample with replacement, or sample from an infinitely large population.
Binomial versus Poisson:
B: a finite, constant number n of trials.
P: infinitely many possible places/times of occurrence, with a very small and constant probability of occurrence at each place.
Attribute control chart: p-chart
• Control charts for discrete data are often called attribute control
charts. Two rudimentary attribute charts are the p-chart and the u-chart
(cross-reference the control chart table).
• p-chart: we are interested in detecting a change in the nonconforming rate (or
defective rate) of a sample. In fact, Example 1.1 uses discrete
data, and it includes both the actual number of defective items
and the defective rate in a sample.
• The nonconforming rate of each sample can be calculated as p = d/n, where d is the number of nonconforming items in a sample of n items.
A plot of p versus the sample index is called a p-chart.
• Control limits for a p-chart
• If p is unknown, need to estimate it from historical training data
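A minimal sketch of how such limits might be computed is given below. It assumes the usual Shewhart form for a binomial proportion (center line p, limits at p ± L·sqrt(p(1−p)/n)), truncation of the limits to the interval [0, 1], and a pooled estimate of p from historical samples of equal size. The function names and the sample counts are hypothetical.

```python
from math import sqrt

def estimate_p(defective_counts, n):
    """Pooled estimate of p from historical samples, each of size n."""
    return sum(defective_counts) / (len(defective_counts) * n)

def p_chart_limits(p, n, L=3.0):
    """Return (LCL, CL, UCL) for the nonconforming rate of a sample of size n."""
    sigma = sqrt(p * (1 - p) / n)    # standard deviation of d/n under Binomial(n, p)
    lcl = max(0.0, p - L * sigma)    # assumed convention: a rate cannot be negative
    ucl = min(1.0, p + L * sigma)    # assumed convention: a rate cannot exceed 1
    return lcl, p, ucl

# Hypothetical historical data: defective counts in samples of n = 100 items.
history = [10, 8, 12, 9, 11]
p_hat = estimate_p(history, n=100)
lcl, cl, ucl = p_chart_limits(p_hat, n=100, L=3.0)
print(f"p_hat = {p_hat:.3f}, LCL = {lcl:.3f}, CL = {cl:.3f}, UCL = {ucl:.3f}")
```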
• p-chart: its parameter L should be chosen according to the
requirement on the α error, but people often use L = 3 for simplicity.
• Earlier, we showed in Example 1.1 a control chart for d, the
actual number of nonconforming items. We can also set up a p-chart for the nonconforming rate.
• Revisit Example 1.1: a p-chart, producing the same results.
Attribute control chart: u-chart
• u-chart: we are interested in detecting a change in the average
number of defects per inspection unit (or per random sample of fixed
size).
• Denote the number of defects per inspection unit by u. This u
(previously denoted x) follows a Poisson distribution:
$$p(u) = \frac{e^{-\lambda}\lambda^{u}}{u!}, \qquad u = 0, 1, \ldots$$
• In actual inspection, there could be n inspection units in a sample.
Then (using the properties of a Poisson distribution), the average number of defects per unit over the n units has mean λ and variance λ/n.
• Control limits for a u-chart
• If λ is not known, estimate it from historical data
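Assuming the usual Shewhart form (center line plus or minus L standard deviations), which is consistent with the L and n used in Example 2.7, the u-chart limits can be written as:

```latex
\begin{align*}
UCL &= \lambda + L\sqrt{\lambda / n} \\
CL  &= \lambda \\
LCL &= \max\left(0,\ \lambda - L\sqrt{\lambda / n}\right)
\end{align*}
```

Truncating the lower limit at zero is an assumed convention, since a defect count cannot be negative. When λ is unknown, a natural estimate is the total number of defects in the historical data divided by the total number of inspection units.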
u-chart: Example 2.7
• Example 2.7: A manufacturer wishes to set up a control chart for
inspecting gas water heaters. Defects in workmanship and visual
quality features are checked in this inspection. Over the last 22
working days, 176 water heaters were inspected and a total of 924
defects were reported. If the manufacturer wishes to use two water
heaters as an inspection unit for detecting any abnormality, how should
the control limits be set up?
• Please note that in this example
an inspection unit = two water heaters,
but each sample consists of one inspection unit.
So n = 1, not 2, here.
• First, estimate the in-control parameter:
$$\hat{\lambda} = \frac{924}{176}\ (\text{defects per heater}) \times 2\ (\text{heaters per inspection unit}) = 10.5$$
• Using L = 3 (3-sigma control limits) and noting that n = 1,
$$UCL = 10.5 + 3\sqrt{10.5} \approx 20.22, \qquad CL = 10.5, \qquad LCL = 10.5 - 3\sqrt{10.5} \approx 0.78$$
So roughly speaking, when more than 20 defects are
observed in an inspection unit in this water heater inspection, it is
an indication that the underlying manufacturing process has
become significantly worse than its designed variability.
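The arithmetic of Example 2.7 can be checked with a short script. This is a sketch assuming the Shewhart form λ ± L·sqrt(λ/n) for the u-chart limits, with the lower limit truncated at zero. The function name `u_chart_limits` is an illustrative choice.

```python
from math import sqrt

def u_chart_limits(lam, n=1, L=3.0):
    """Return (LCL, CL, UCL) for the average defects per inspection unit."""
    half_width = L * sqrt(lam / n)
    return max(0.0, lam - half_width), lam, lam + half_width

total_defects = 924
total_heaters = 176
heaters_per_unit = 2   # one inspection unit = two water heaters

# Defects per heater, converted to defects per inspection unit.
lam_hat = total_defects / total_heaters * heaters_per_unit

lcl, cl, ucl = u_chart_limits(lam_hat, n=1, L=3.0)
print(f"lambda_hat = {lam_hat:.2f}")
print(f"LCL = {lcl:.2f}, CL = {cl:.2f}, UCL = {ucl:.2f}")
```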
Attribute control chart: other variants
• Two variants of the p-chart and u-chart: people sometimes choose to
monitor the total number of defective items (= n*p) or the total
number of defects (= n*u) in a sample of size n. Control charts
established for these two statistics are called the np-chart and the c-chart
(not nu-chart), respectively.
• Earlier, for Example 1.1, we presented a chart for d, which
is in fact an np-chart. It produces the same detection result as the p-chart.
• An np-chart (or c-chart) is nothing but a p-chart (or u-chart) with
everything (the statistic, the control limits) multiplied by a factor of n.
Therefore, the detection results from an np-chart (or c-chart) and a p-chart (or u-chart) will be the same. So we skip the details
of the np-chart and c-chart.
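The equivalence can be illustrated with a small sketch. The counts, the sample size n = 50, and the in-control rate p0 = 0.1 are hypothetical values chosen for illustration, and the limits assume the usual Shewhart form with the lower limit truncated at zero.

```python
from math import sqrt

def p_chart_signals(counts, n, p0, L=3.0):
    """Flag samples whose nonconforming rate d/n falls outside the p-chart limits."""
    sigma = sqrt(p0 * (1 - p0) / n)
    lcl, ucl = max(0.0, p0 - L * sigma), p0 + L * sigma
    return [not (lcl <= d / n <= ucl) for d in counts]

def np_chart_signals(counts, n, p0, L=3.0):
    """Flag samples whose count d falls outside the np-chart limits."""
    sigma = sqrt(n * p0 * (1 - p0))   # n times the p-chart standard deviation
    lcl, ucl = max(0.0, n * p0 - L * sigma), n * p0 + L * sigma
    return [not (lcl <= d <= ucl) for d in counts]

# Hypothetical numbers of nonconforming items in samples of size n = 50.
counts = [4, 6, 5, 12, 3, 7, 13, 5]
n, p0 = 50, 0.1

print("p-chart signals: ", p_chart_signals(counts, n, p0))
print("np-chart signals:", np_chart_signals(counts, n, p0))
```

Both charts flag the same samples, because the statistic and the limits of the np-chart are those of the p-chart multiplied by n.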