Discrete Data Analysis

The table below summarizes the methods used to analyze discrete data.

1 Proportion:
• Normal Approximation: p̂ ± Z × √(p̂(1 − p̂)/n)
  Conditions: large n (sample size); p not too close to 0 or 1; np > 10 and n(1 − p) > 10.
• Poisson Approximation (Poisson Confidence Interval)
  Conditions: large n (sample size); small proportion defective (p < 0.10).
• Exact Binomial Test
• Sample Size

Comparing 2 Proportions (and 2-way tables):
• Normal Approximation: (p̂1 − p̂2) ± Z × √(p̂(1 − p̂) × (1/n1 + 1/n2))
• χ² (Chi-square)

More than 2 Proportions:
• χ² (Chi-square)

Normal Approximation (One Proportion)

Use this approximation when the sample size is large, the number of defects in the sample is greater than 10 (np > 10), and the number of good parts in the sample is greater than 10 (n(1 − p) > 10).

p̂ = (# Defects)/n

A two-sided confidence interval for the proportion (p) that is defective in the population is given by the equation:

p̂ ± Z × √(p̂(1 − p̂)/n)

The result provides the lower and upper limits of a range of all plausible values for the proportion defective in that population.

Z is a value from the normal distribution for the required level of confidence:

2-Sided Confidence Level   1-Sided Confidence Level   Z
80.00%                     90.00%                     1.282
90.00%                     95.00%                     1.645
95.00%                     97.50%                     1.960
98.00%                     99.00%                     2.326
99.00%                     99.50%                     2.576

Comparing Two Proportions

If evaluating two different sample sets with proportion-defective data, the confidence interval for the difference in proportion defective between the two sample sets is given by:

(p̂1 − p̂2) ± Z × √(p̂(1 − p̂) × (1/n1 + 1/n2))

Where, if ki = # of defects in the ith sample and ni = sample size of the ith sample:

p̂1 = k1/n1
p̂2 = k2/n2
p̂ = (k1 + k2)/(n1 + n2)

The result provides the lower and upper limits of a range of all plausible values of the difference between the proportions defective in the populations.

If "0" is included within the range of plausible values, then there is not strong evidence that the proportions of defects in the two populations are different.
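The handbook performs these calculations by hand or in Minitab. As an illustrative cross-check only, here is a minimal Python sketch of both intervals (scipy is assumed; the defect counts shown are hypothetical, not from the handbook):

import math
from scipy.stats import norm

def one_prop_ci(k, n, conf=0.95):
    # p-hat +/- Z * sqrt(p-hat * (1 - p-hat) / n); needs np > 10 and n(1-p) > 10
    p = k / n
    z = norm.ppf(1 - (1 - conf) / 2)     # e.g. 1.960 for a 95% two-sided interval
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def two_prop_ci(k1, n1, k2, n2, conf=0.95):
    # (p1-hat - p2-hat) +/- Z * sqrt(p-hat * (1 - p-hat) * (1/n1 + 1/n2))
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)            # pooled p-hat, as defined above
    z = norm.ppf(1 - (1 - conf) / 2)
    half = z * math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) - half, (p1 - p2) + half

print(one_prop_ci(20, 100))              # hypothetical: 20 defects in 100 parts
print(two_prop_ci(20, 100, 12, 120))     # if 0 falls inside, no strong evidence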
χ² Test for Independence

The χ² test for independence tests the null hypothesis (Ho) that two discrete variables are independent.

• Data "relating" two discrete variables are used to create a contingency table. For each of the cells in the contingency table, the "observed frequency" is compared to the "expected frequency" in order to test for independence.
• The expected frequency in each cell must be at least five (5) for the χ² test to be valid.
• For continuous data, it is best to test for dependency, or correlation, by using scatter plots and regression analysis.
Example of χ² Analysis

1. There are 2 variables to be studied, height and weight. The null hypothesis Ho is that "weight" is independent of "height."
2. For each variable, 2 conditions (categories) are defined:
   Weight: < 140 lbs, > 140 lbs
   Height: < 5'6", > 5'6"
3. The data has been accumulated as shown below:

                      Weight below 140 lbs   Weight above 140 lbs   Totals (Rows)
Height below 5'6"     20                     11                     31
Height above 5'6"     13                     22                     35
Totals (Columns)      33                     33                     N = 66
Manual χ² Analysis

1. Compute fexp for each cell ij: fexp ij = (Row Total)i × (Column Total)j / N, where N is the total of all fobs for all 4 cells. (For our example, N = 66 and fexp 2,1 = (35 × 33)/66 = 1155/66 = 17.5.)
2. Calculate χ²calc, where χ²calc = Σ[(fobs − fexp)²/fexp] = 4.927.
3. Calculate the degrees of freedom, df = (Number of Rows − 1)(Number of Columns − 1). For our example, df = (2 − 1) × (2 − 1) = 1.
4. Determine χ²crit from the χ² table for the degrees of freedom and confidence level desired (usually 5% risk). For 1 df and 5% α risk, χ²crit = 3.841.
5. If χ²calc > χ²crit, then reject Ho and accept Ha, i.e., that weight depends on height.
6. In this example, 4.927 > 3.841, so we reject Ho.
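For readers working outside Minitab, here is a minimal Python sketch of the same test on the height/weight table (scipy is assumed; this code is not part of the original handbook):

import numpy as np
from scipy.stats import chi2, chi2_contingency

observed = np.array([[20, 11],
                     [13, 22]])          # rows: height; columns: weight

stat, p, df, expected = chi2_contingency(observed, correction=False)
print(stat, df, p)                       # ~4.927, 1, ~0.026
print(expected)                          # the fexp values for each cell
print(chi2.ppf(0.95, df))                # chi-square critical value, 3.841
# stat > 3.841 (p < 0.05): reject Ho; weight depends on height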
Using the 1 Sample t test

• Run Stat > Basic Statistics > 1-Sample t. In the dialog box, identify the variable or variables to be tested.
• Select the test to be performed, "Confidence Interval" or "Test Mean".
• If "Confidence Interval" is to be calculated at other than a value of 95%, change to the appropriate number.
• If "Test Mean" is selected, identify the desired mean to be tested (the mean of the null hypothesis) and, in the alternative box, select the alternative hypothesis which is appropriate for the analysis. This will determine the test used for the analysis (one tailed or two tailed).
• If graphic output is needed, select the graphs button and choose among "Histogram", "Dotplot" and "Boxplot" output.
• Click OK to run the analysis.

Analyzing the test results.
• If running the "Confidence Interval" option, Minitab will calculate the "t" statistic and a confidence interval for the data.
• If using the "Test Mean" option, Minitab will provide descriptive statistics for the tested distribution(s), the "t" statistic and a "p" value.
• The graphic outputs will all have a graphical representation of the confidence interval of the mean, shown by a red line with a dot at the mean for the sample population.
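As a rough non-Minitab equivalent of the "Test Mean" option, here is a minimal Python sketch (the data values are hypothetical, not from the handbook):

from scipy import stats

sample = [86.1, 85.9, 86.4, 86.0, 86.5, 86.2, 85.8, 86.7]   # hypothetical data
t_stat, p_value = stats.ttest_1samp(sample, popmean=86.0)   # "Test Mean" = 86.0
print(t_stat, p_value)    # small p-value: the mean differs from the target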
Confidence Intervals
A confidence interval is a range of plausible values for a population parameter, such as
the mean of the population, µ.
For example, a test of 8 units might give an average efficiency of 86.2%. This is the
most likely estimate of the efficiency of the entire population. However, observations
vary, so the true population efficiency might be somewhat higher or lower than 86.2%. A
95% confidence interval for the efficiency might be (81.2%, 91.2%). 95% of the intervals
constructed in this manner will contain the true population parameter.
The confidence interval for the mean of one sample is:

X̄ ± t × (σ̂/√n)
“t” comes from the t tables (Page 65) with n-1 degrees of freedom and with the desired
level of confidence.
The confidence interval for the difference in the means of 2 samples, if the variances of the 2 samples are assumed to be equal, is:

(x̄1 − x̄2) ± t × sp × √(1/n1 + 1/n2)

"t" comes from the t tables (Page 65) with n1 + n2 − 2 degrees of freedom, and with the desired level of confidence. sp is the pooled standard deviation:
sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)
In MINITAB, confidence intervals are calculated using the "1 Sample" and "2 Sample" t methods, above. In the text output shown below, the 95% confidence interval for the difference between the mean of Manu_a and the mean of Manu_b is 6.65 to 8.31. Since this interval does not include 0, it points to accepting Ha, that the means are different.
95% CI for mu Manu_a - mu Manu_b: ( 6.65, 8.31)
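As an illustration only, here is a minimal Python sketch of the pooled-variance interval above (the data values are hypothetical; the handbook's Manu_a/Manu_b raw data is not reproduced):

import numpy as np
from scipy import stats

a = np.array([93.1, 92.4, 94.0, 93.6, 92.9])   # hypothetical sample 1
b = np.array([85.8, 86.3, 85.2, 86.0, 85.5])   # hypothetical sample 2

n1, n2 = len(a), len(b)
sp = np.sqrt(((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1))
             / (n1 + n2 - 2))                   # pooled standard deviation
t = stats.t.ppf(0.975, n1 + n2 - 2)             # 95% two-sided, n1+n2-2 df
diff = a.mean() - b.mean()
half = t * sp * np.sqrt(1 / n1 + 1 / n2)
print(diff - half, diff + half)                 # interval for mu1 - mu2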
“t” Test
A “t” test tests the hypothesis that the means of two distributions are equal. It can be
used to demonstrate a shift of the mean after a process change. If there has
been a change to a process, and it must be determined whether or not the mean
of the output was changed, compare samples before and after the change using
the “t” test.
• Your ability to detect a shift (or change) is improved by increasing the size of your samples, by increasing the size of the shift (or change) that you are trying to detect, or by decreasing the variation (see Sample Size, pages 27-28).
• There are two tests for means, a One Sample t test and a Two Sample t
test.
• The “one sample t test” “Stat > Basic Statistics > 1-Sample t”
compares a single distribution average to a target or hypothesized value.
• The “two sample t test” “Stat > Basic Statistics > 2-Sample t” analyzes
the means of two separate distributions.
Using the 2 Sample t test
1. Pull samples in a random manner from the distributions whose means are being
evaluated. In Minitab, the data can be in separate columns or in a single column
with a subscript column.
2. Determine the Null Hypothesis Ho and Alternative Hypothesis Ha (Less than,
Equal to, or Greater than).
3. Confirm if variances are similar using “F” test or Homogeneity of Variance (page
30).
4. Run "Stat > Basic Statistics > 2-Sample t". In the dialog box, select "Samples in One Column" and identify the data column and subscript column, or select "Samples in Different Columns" and identify both columns.
5. In the alternative box, select the alternative hypothesis which is appropriate for the
analysis. This will determine the test used for the analysis (one tailed or two
tailed).
6. If the variances are similar, check the “Assume Equal Variances” box.
7. If graphic output is needed, select the graphs button and choose between “dotplot”
and “boxplot” output.
8. Click Ok to run analysis.
Analyzing the test results.
Minitab will provide a calculation of descriptive statistics for each distribution, provide
a Confidence Interval statement (page 32) and provide a statement of the t test as
a test of the difference between two means. The output will provide a “t” statistic,
a “p” value and the degrees of freedom statistic. To use the “t” distribution table
on page 65, the “t” statistic and the degrees of freedom are required. Analysis
can be made using that table or the “p” value.
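As an illustrative non-Minitab equivalent of the 2-Sample t test, here is a minimal Python sketch (scipy is assumed; the data values are hypothetical):

from scipy import stats

before = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3]    # hypothetical pre-change data
after = [10.9, 11.2, 10.7, 11.0, 11.3, 10.8]   # hypothetical post-change data

# equal_var=True corresponds to checking "Assume Equal Variances"
t_stat, p_value = stats.ttest_ind(before, after, equal_var=True)
print(t_stat, p_value)    # small p-value: reject Ho, the means are different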
Using Minitab to perform χ² Analysis

Minitab can be used to analyze data using χ² with two different procedures: Stat>Tables>Chi Square Test and Stat>Tables>Cross Tabulation. Chi Square Test analyzes data which is in a table. Cross Tabulation analyzes data which is in columns with subscripted categories. Since Minitab commonly needs data in columns to graph, Cross Tabulation is the preferred method for most analyses.

Minitab: Stat>Tables>Chi Square Test
1. Create the table shown in the example in
Minitab.
2. Run Stat>Tables>Chi Square Test. In the
dialog box select the columns containing the
tabular data, in this case, “C2” and “C3”. Click
“OK” to run.
3. In the Session window, the table that is created will show the expected value for each of the data cells under the actual data for the cell, plus the χ² calculation for each cell: χ²calc = Σ[(fobs − fexp)²/fexp].
4. The Chi Square calculation is χ²calc = 4.927. The p value for this test is 0.026.
5. The degrees of freedom, df = (Number of Rows − 1)(Number of Columns − 1), is shown: df = 1.
6. Determine χ²crit from the χ² table for the degrees of freedom and confidence level desired (usually 5% risk): χ²crit = 3.841.
7. Since χ²calc > χ²crit, reject Ho.

Because the data is in tabular form in Minitab, no other analysis can be done.
Minitab: Stat>Tables>Cross Tabulation
If additional analysis of data is desired, including
any graphical analysis, the Stat>Tables>Cross
Tabulation procedure is preferred. This
procedure uses data in the common Minitab
column format. Note that the data is in a single column and the factors or variables being
considered are shown as subscripted values. In this graphic, the data is in column C6 and
the appropriate subscripts are in column C4 and C5.
1. Run Stat>Tables>Cross Tabulation. In the dialog box select the columns identifying
the factors or variables in the “Classification
Variables” box.
2. Click Chi Square Analysis and select “Above
and expected count”.
3. Select the column containing the response data
in the “Frequencies in” box, in this case, “data”.
4. Click Run.
5. The output in the session window is very
similar to the output for Stat>Tables>Chi
Square Test, except that it does not show
the Chi Square calculation for each cell.
6. Analysis of the test is done as before, either by using the generated p value or by using the calculated χ² and degrees of freedom and entering the tables with that information to find χ²crit.
Poisson Approximation
Use this approximation when the sample size is large and the probability of
defects (p) in the sample is less than 0.10.
In such a situation:
p̂ = k/n

Where: k = number of defects; n = number of sample parts.
The confidence interval for this proportion defective can be found using
the Poisson distribution.
Testing Equality of Variances
The “F” test is used to compare the variances of two distributions. It tests the
hypothesis, Ho, that the variances of two distributions are equal. It is performed by
forming a ratio of two variances from two samples and comparing the ratio with a value
in the “F” distribution table. The “F” test can be used to demonstrate that the variance
has been increased or decreased after a process change. Since “t” tests and
“ANOVA” need to know if population variance is the same or different, this test is also
a prerequisite for doing other types of hypothesis testing. In Minitab, this test is done
as “Homogeneity of Variance”.
The “F” test is also used during the ANOVA process to confirm or reject hypotheses
about the equality of averages of several populations.
Performing an F Test
1. Pull samples in a random manner from the two distributions for which you are
comparing the variances. Prior to running the test confirm sample distribution
normality for each sample (page 17).
2. Compute the "F" statistic, Fcalc = s1²/s2². The "F" statistic should always be calculated so that the larger variance is in the numerator.
3. Calculate the degrees of freedom for each sample. Degrees of freedom = ni − 1, where ni is the sample size for the ith sample, i.e., n1 − 1 and n2 − 1.
4. Specify the risk level that you can tolerate for making an error in your decision (usually set at 5%).
5. Use the “F” distribution table (p 59 - 60) to determine Fcrit for the degrees of
freedom in your samples and for the risk level you have chosen.
6. Compare Fcalc to Fcrit. If Fcalc < Fcrit., the null hypothesis, Ho, which implies that
the variances from both distributions are equal, cannot be rejected. If Fcalc>Fcrit ,
reject the null hypothesis and conclude that the samples have different variances.
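As an illustration only, here is a minimal Python sketch of the manual F test above (scipy is assumed; the data values are hypothetical):

import numpy as np
from scipy.stats import f

x1 = np.array([4.1, 3.9, 4.4, 4.0, 4.2, 3.8])  # hypothetical sample 1
x2 = np.array([4.0, 4.6, 3.5, 4.9, 3.2, 4.4])  # hypothetical sample 2

v1, v2 = x1.var(ddof=1), x2.var(ddof=1)
f_calc = max(v1, v2) / min(v1, v2)             # larger variance in numerator
df_num = (len(x1) if v1 > v2 else len(x2)) - 1
df_den = (len(x2) if v1 > v2 else len(x1)) - 1
f_crit = f.ppf(0.95, df_num, df_den)           # 5% risk
print(f_calc, f_crit)                          # reject Ho if f_calc > f_crit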
To find the Poisson-based confidence interval for the proportion defective:
1. Determine the desired confidence level (80%, 90% or 95%).
2. Find the lower and upper confidence interval factors for that level of confidence for the number of failures found in the sample.
3. Divide these factors by the actual sample size used.
4. The result of the two calculations gives the range of plausible values for the proportion of the population that is defective.

Example: k = 2; n = 200 (2 defects in 200 sampled parts or CTQ outputs)

Then: p̂ = 2/200 = 0.0100

For the 95% 2-sided confidence interval, the table gives:
Lower confidence factor = 0.619
Upper confidence factor = 7.225

CI = (0.619/200, 7.225/200) = (0.0031, 0.0361)
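The handbook reads these factors from a table. They appear to correspond to chi-square percentiles with 2(k+1) degrees of freedom, divided by 2; that correspondence is an inference, shown here as a minimal Python sketch only:

from scipy.stats import chi2

k, n, conf = 2, 200, 0.95            # 2 defects in 200 parts, 95% confidence
alpha = 1 - conf
low = chi2.ppf(alpha / 2, 2 * (k + 1)) / 2        # ~0.619 for k = 2
high = chi2.ppf(1 - alpha / 2, 2 * (k + 1)) / 2   # ~7.225 for k = 2
print(low / n, high / n)                          # ~(0.0031, 0.0361)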
Using Homogeneity of Variance (For MINITAB Analysis)
1. Homogeneity of Variance will allow analysis of multiple population variances
simultaneously. It will also allow analysis of “non-normal” distributions. Data from
all sample groups must be “stacked” in a single column with the samples
identified with a separate “subscript” or “factor” column.
2. In Minitab, use STAT>ANOVA>HOMOGENEITY OF VARIANCE. In the dialog
box, identify the single “response” column and a separate “Factors” column or
columns.
3. Analysis of the test will be done using the “p value.” If the data is Normal (See
Normality, page 15), use Bartlett's Test. Use Levene's Test when the data come
from continuous, but not necessarily normal distributions.
4. The computations for the homogeneity of variance test require that at least one
cell contains a non-zero standard deviation. Normally, it is possible to compute a
standard deviation for a factor if it contains at least two observations.
5. Two standard deviations are necessary to calculate Bartlett’s and Levene’s test
statistics.
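As an illustration outside Minitab, scipy provides both tests named above; a minimal sketch with hypothetical subgroup data:

from scipy.stats import bartlett, levene

g1 = [4.1, 3.9, 4.4, 4.0, 4.2]   # hypothetical stacked subgroups
g2 = [4.0, 4.6, 3.5, 4.9, 3.2]
g3 = [4.3, 4.1, 4.5, 4.2, 4.0]

print(bartlett(g1, g2, g3))      # use when the data are normal
print(levene(g1, g2, g3))        # use for continuous, not-necessarily-normal data
# p < 0.05: at least one subgroup variance differs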
Hypothesis Statements (cont.)

F Test - Compares the variances of two distributions.
H0 - The sample variances tested are statistically the same: σ²1 = σ²2
HA - The sample variances tested are not equal: σ²1 ≠ σ²2

Homogeneity of Variance - Compares the variances of multiple distributions.
H0 - The sample variances tested are statistically the same: σ²1 = σ²2 = σ²3 = ... = σ²k
HA - At least one of the sample variances tested is not equal.

Bartlett's Test - Tests normal distributions.
Levene's Test - Tests non-normal distributions.
What is a p-value?
Statistical definitions of p-value:
The observed level of significance.
The chance of claiming a difference if there is no difference.
The smallest value of alpha that will result in rejecting the null hypothesis.
How do I use it?
If p < Alpha, then the difference is statistically significant. Reject the null hypothesis and declare that there is a difference.
Think of (1 − p) as the degree of confidence that there is a difference.
Example: p = .001, so (1 − p) = .999, or 99.9%. You can think of this as 99.9% confidence that there is a difference.
χ² - Compares more than two proportions:
Ho: p1 = p2 = p3 = ... = pn
Ha: At least one of the equalities does not hold
χ² also tests the hypothesis that two discretely measured variables operate independently of each other:
Ho: Independent (There is no relationship between the populations)
Ha: Dependent (There is a relationship between the populations)

The Transfer Function: Y = f(X)
The Output (Y, the Effect) is a function of the Inputs (X's, the Root Causes). What is the mathematical relationship between the "Y" and the "X's"?

If Outputs are Continuous:
• If Inputs are Continuous: Regression; Analysis of Covariance
• If Inputs are Discrete: ANOVA; t Tests; F Tests; Confidence Intervals; DOE
If Outputs are Discrete:
• If Inputs are Continuous: Logistic Regression
• If Inputs are Discrete: Logistic Regression; χ²; Confidence Intervals for Proportions; DOE
ANOVA
ANOVA, ANalysis Of VAriance, is a technique used to determine the statistical significance of the relationship between a dependent variable ("Y") and one or more independent variable(s) or factors ("X's").
ANOVA should be used when the independent variables (X’s) are categorical (not
continuous). Regression Analysis (Pages 43 - 45) is a technique for performing a
similar analysis with continuous independent variables.
ANOVA determines if the differences between the averages of the levels are greater than the expected variation. It answers the question: "Is the signal between levels greater than the noise within levels?"
ANOVA allows the investigator to compare several means simultaneously with the
correct overall level of risk.
Basic Assumptions for using ANOVA
• Equal Variances (or close to the same) for each subgroup.
• Independent and normally distributed observations.
• Data must represent the population variation.
• Acceptable Gage R&R
• ANOVA tests for equality of means is fairly robust to the assumption of normality
for moderately large sample sizes, so normality is often not a major concern.
ANOVA - One Way
The One Way ANOVA enables the investigation of a single factor at multiple levels
with a continuous dependent variable. The primary investigation question is “Do any of
the populations of “Y” stemming from the levels of “X” have different means?”
MINITAB will do this analysis either with the data in table form, with data for each level of X in separate columns (STAT>ANOVA>ONE WAY (UNSTACKED)), or with all the data in a single column and the factor levels identified by a separate subscript column (STAT>ANOVA>ONE WAY). For the data below, use "One-Way (Unstacked)" for data in columns c1-c3 and "One-Way" for data in columns c4-c5.
• In the dialog box for "One-Way (Unstacked)", identify each of the columns containing the data.
• In the dialog box for "One-Way", identify the column containing the Response (Y) and the Factor (X) as appropriate.
• For both analyses, if graphic analysis is desired, select the "Graphs" button and select between "Dotplots" and "Boxplots".
• Click OK to run. For analysis, see page 41 (and the sketch below).
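As an illustration only, here is a minimal Python sketch of a one-way ANOVA, corresponding to the "unstacked" layout above (scipy is assumed; the data values are hypothetical):

from scipy.stats import f_oneway

level1 = [12.1, 11.8, 12.4, 12.0]   # hypothetical response data, one list per
level2 = [13.0, 12.7, 13.3, 12.9]   # level of X (the "unstacked" layout)
level3 = [12.2, 12.5, 11.9, 12.3]

f_stat, p_value = f_oneway(level1, level2, level3)
print(f_stat, p_value)              # p < 0.05: at least one level mean differs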
Stating the Hypothesis HO and HA
Ho
The starting point for a hypothesis test is the “null” hypothesis - Ho. Ho
is the hypothesis of sameness, or no difference.
Example: The population mean equals the test mean.
Ha
The second hypothesis is Ha - the “alternative” hypothesis. It represents
the hypothesis of difference.
Example: The population mean does not equal the test mean.
•You usually want to show that there is a difference (Ha).
•Start by assuming equality (Ho).
•If the data show they are not equal, then they must be different (Ha).
Hypothesis Statements

1 Sample t - Compares a single distribution to a target or hypothesized value.
H0 - The sample tested equals the target: µ0 = Target
Ha - The sample tested is not equal to the target, or is greater than/less than the target: µ0 ≠ Target; µ0 > Target; µ0 < Target

2 Sample t - Compares the means of two separate distributions.
H0 - The samples tested are statistically the same: µ0 = µ1
Ha - The samples tested are not equal, or one mean is greater than/less than the other: µ0 ≠ µ1; µ0 > µ1; µ0 < µ1
Hypothesis Testing
Since all data are variable, an observed change could be due to chance and may not
be repeatable. Hypothesis testing determines if the change could be due to chance
alone, or if there is strong evidence that the change is real and repeatable.
In order to show that a change is real and not due to chance alone, first assume there
is no change (Null Hypothesis, HO). If the observed change is larger than the change
expected by chance, then the data are inconsistent with the null hypothesis of no
change. We then “reject” the null hypothesis of no change and accept the alternative
hypothesis, HA.
The null hypothesis might be that two suppliers provide parts with the same average
flatness (HO:µ1=µ2, the mean for supplier 1 is the same as the mean for supplier 2). In
this case, the alternative hypothesis is that average flatness is not equal (HA: µ1≠µ2).
                            Real World
                            µ1 = µ2              µ1 ≠ µ2
Decision   µ1 = µ2          Correct Decision     Type 2 Error (β)
           µ1 ≠ µ2          Type 1 Error (α)     Correct Decision

If the means are equal and your decision is that they are equal (top left box), then you made the correct decision.
If the means are not equal and your decision is that they are not equal (bottom right box), then you made the right decision.
If the means are equal but your decision is that they are not equal (bottom left box), then you made a Type 1 error. The probability of this error is alpha (α).
If the means are not equal but your decision is that they are equal (top right box), then you made a Type 2 error. The probability of this error is beta (β).
Steps in Hypothesis Testing
1. Define the problem; state the objective of the test.
2. Define the Null and Alternate Hypotheses.
3. Decide on the appropriate statistical hypothesis test: Variance (Page 30); Mean (t Test - Pages 31-32); Frequency of Occurrence (Discrete - χ², Pages 35-36).
4. Define the acceptable α and β risk.
5. Define the sample size required (Pages 27-28).
6. Develop the sampling plan and collect samples.
7. Calculate the test statistic from the data.
8. Compare the calculated test statistic to the predicted test statistic for the risk levels defined.
   • If the calculated test statistic is larger than the predicted test statistic, the statistic indicates a difference.
ANOVA - Two Way

Two-Way ANOVA evaluates the effect of two separate factors on a single response. Each cell (combination of independent variables) must contain an equal number of observations (the design must be balanced). See General Linear Model (Page 42) for unbalanced data sets. In the data set on the right, Strength is
the response (Y) and Chem and Fabric are the
separate factors (X1 and X2). To analyze the
significance of these factors on Y, run
STAT>ANOVA>TWO WAY. In the dialog box,
identify the Response (Y), "Strength." In the "Row Factor" box, identify the first of the two factors (X) for analysis. In the "Column Factor" box, identify the second "vital X". Select the "Display Means" box for each factor to obtain confidence interval and means analysis.
Select “STORE RESIDUALS” and then “STORE
FITS”.
If graphical analysis of the ANOVA data is
desired, select the “Graphs” button and choose
one, or all of the four diagnostic graphs available.
This analysis does not produce F and p-values, since you cannot specify whether the effects are fixed or random. Use Balanced ANOVA (Page 36) to perform a two-way analysis of variance, specify fixed or random effects, and display the F and p-values when you have balanced data. If you have unbalanced data and random effects, use General Linear Model (Page 42) with Options to display the appropriate test results.
It can be seen from the SS column
that the “error SS” is very small
relative to the other terms. In the
graphic Confidence interval analysis
it is clear that both factors are
statistically significant, since some
of the confidence intervals do not
overlap.
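As an illustration outside Minitab, here is a minimal Python sketch of a two-way ANOVA using statsmodels (the data values are hypothetical, not the handbook's Strength data):

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.DataFrame({                 # hypothetical balanced data set
    "Strength": [42, 44, 50, 52, 41, 43, 55, 57],
    "Chem":     ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Fabric":   ["F1", "F1", "F2", "F2", "F1", "F1", "F2", "F2"],
})

model = smf.ols("Strength ~ C(Chem) + C(Fabric)", data=df).fit()
print(anova_lm(model))              # SS, df, F and p-value for each factor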
ANOVA - Balanced
The Balanced ANOVA allows the analysis of process data with two or more factors. As
with the Two Way ANOVA, Balanced ANOVA allows analysis of the effect of multiple
factors, at multiple levels simultaneously. A factor (B) is “nested” within another factor (A)
if the level of B appears with only a single level of A. Two factors are “crossed” if every
level of one factor appears with every level of the other factor. The data for individual
levels of factors must be balanced: “each combination of independent variables (cell)
must have an equal number of observations”. See General Linear Model (Page 38) for
analysis of unbalanced designs. Guidelines for normality and variance remain the same as shown on page 38.
Figure 1 shows how some of the factors and
data might look in the MINITAB worksheet.
Note there are five (5) data points for each
combination of the three factors.
To analyze for significance of these factors (Xij) on the response variable (Y), run STAT>ANOVA>BALANCED ANOVA. In the dialog box (Figure 2), identify the "Y" variable in the Response box and identify the factors in the "Model" box. Note that the pipes ["Shift \"] indicate that the model analyzed is to include factor interactions. Select "Storage" to store "residuals" and "fits" for later analysis. Select "Options" and select "Display means..." to display information about data means for each factor and level.
Figure 3 is the primary output of this analysis. There is no significant graphic analysis
for the balanced ANOVA. See page 41 for analysis of this output.
Calculating Sample Size

To calculate the actual sample size without the table, or to program a spreadsheet to calculate sample size, use this equation:

n = 2 × (Zα/2 + Zβ)² / (δ/σ)²

α      α/2     Zα/2      β      Zβ
.20    .10     1.282     .20    0.842
.10    .05     1.645     .10    1.282
.05    .025    1.960     .05    1.645
.01    .005    2.576     .01    2.326

Example: α = .10, β = .01, δ/σ = .3

n = 2 × (1.645 + 2.326)² / (.3)² = 350
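As an illustration only, here is a minimal Python sketch of the same equation (scipy is assumed):

from scipy.stats import norm

def sample_size(alpha, beta, delta_over_sigma):
    # n = 2 * (Z_alpha/2 + Z_beta)^2 / (delta/sigma)^2, per level of the factor
    z_a = norm.ppf(1 - alpha / 2)   # two-sided alpha
    z_b = norm.ppf(1 - beta)
    return 2 * (z_a + z_b) ** 2 / delta_over_sigma ** 2

print(sample_size(0.10, 0.01, 0.3))  # ~350, the worked example above
print(sample_size(0.05, 0.10, 1.0))  # ~21 for 5% alpha, 10% beta, d/s = 1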
Continuous Data Analysis

Sample Size Determination

When using sampling to analyze processes, sample size must be consciously selected based on the allowable α and β risk, the smallest amount of true difference (δ) that you need to observe for the change to be of practical significance, and the variation of the characteristic being measured (σ). As variation decreases or sample size increases, it is easier to detect a difference.

Steps to defining sample size:
1. Determine the smallest true difference to be detected, the gap (δ).
2. Confirm the process variation (σ) of the processes to be evaluated.
3. Calculate δ/σ.
4. Determine the acceptable α and β risk.
5. Use the chart on page 58 to read the sample size required for each level of the factor tested.

[Diagrams: "Today" vs. "Desired" distributions, each showing the process variation (σ) and the gap delta (δ), for δ/σ ≅ 1 and δ/σ ≅ 2.]

For example, assume the direction of the effect is unknown, but you need to see a delta/sigma (δ/σ) of 1.0 in order to say the change is important. For an α risk of 5% and a β risk of 10%, we would need to use 21 samples. Remember that we would need 21 at each level of the factor tested. If, for the same δ, σ were reduced so that δ/σ were 2, only 5 samples would be required. In general, the smaller the shift (δ/σ) you are trying to detect, and/or the lower the tolerable risk, the greater the number of samples required.

Sample size sensitivity is a function of the standard error of the mean (σ/√n). Smaller samples are less sensitive than larger samples.

Interpreting the ANOVA Output

The first table lists the factors and levels. In the table shown there are three "factors": "Region", "Shift" and "WorkerEx". There are three levels each for "Region" and "Shift"; the values assigned for the Region and Shift levels are "1, 2 & 3". "WorkerEx" is a two-level factor and has level values of "1 & 2".

The second table is the ANOVA output. The columns are as defined below.

Source   The source shows the identified factors from the model, showing both the single-factor information (i.e., Region) and the interaction information (i.e., Region*Shift).
DF       Degrees of freedom for the particular factor. Region and Shift have 3 levels and 3 − 1 = 2 df; WorkerEx has 2 levels and 2 − 1 = 1 df.
SS       Factor "Sum of Squares" is a measure of the variation of the sample means of that factor.
MS       Factor "Mean Square" is the SS divided by the DF.
F        The Fcalc value is the MS of the factor divided by the MS of the Error term. In the case of Region, F = 90.577 ÷ 3.325 = 27.24. If using Fcrit to analyze for significance, enter the table with DF degrees of freedom and α = .05. Compare Fcalc to Fcrit; if Fcalc is greater than Fcrit, the factor is significant.
P        The calculated P value, the observed level of significance. If P < .05, the factor is statistically significant at the 95% level of confidence.

Note: The relative size of the error SS to the total SS indicates the percent of variation left unexplained by the model. In this case, the unexplained variation is 39.16% of the total variation in this model. The "s" of this unexplained variation is the square root of the MS of the Error term (3.325); in this case the "within" group variation has a sigma of 1.82. If this remaining variation does not enable the process to achieve the desired performance state, look for additional factors.
General Linear Model
The General Linear Model (GLM) can handle “unbalanced” data - such as data sets with
missing observations. Where the Balanced ANOVA required the number of observations
to be equal in each “factor/level” grouping, GLM can work around this limitation.
The data must be “full rank” (enough data to estimate terms in the model). But you don’t
have to worry about this, because Minitab will tell you if your data isn’t full rank!
In the data set shown in Figure 1, note that there is only one data point in "Rot1", the response column, for the factor combination "Temp1" level 10 / "Oxygen1" level 10 (rows 8 & 9), and only two data points for "Temp1" level 16 / "Oxygen1" level 6 (row 14). In such a case, "Balanced ANOVA" would not run, because the requirement of equal observations would require three data points in each cell (factor and level combination).

Run STAT>ANOVA>GENERAL LINEAR MODEL. In the dialog box, identify the response variable in the "Response" box and the factors in the "Model" box. Use the pipe (shifted "\") to include interactions in the analysis.

Figure 2 is the primary output of this analysis. There is no graphic analysis of this output.

Interpretation:
Temp1 is a significant X variable, because it explains 62% of the total variation (528.04/850.4). (Temp1 also has a p-value < 0.05, indicating that it is statistically significant.)
Neither Oxygen1 nor the interaction between Oxygen and Temperature appears significant.
The unexplained variation represents 30.95% ((263.17 ÷ 850.4) × 100), and the estimate of the within-subgroup variation is 5.4 (the square root of 29.24).

Analysis and Improve Tools

                 X Discrete                     X Continuous
Y Discrete       • Tables (Cross tab)           • Logistic regression
                 • Chi Square                   • Discriminant Analysis
                 • Confidence intervals         • CART (Classification and
                   for proportions                Regression Trees)
                 • Pareto
Y Continuous     • Confidence intervals         • Linear regression
                 • t test                       • Multiple regression
                 • ANOVA                        • Stepwise Regression
                 • Homogeneity of Variance      • DOE response surface
                 • GLM
                 • DOE (factorial fit)

Logistic Regression, Discriminant Analysis and CART (Classification and Regression Trees) are advanced topics not taught in Six Sigma Training. The following references may be helpful:

Breiman, Friedman, Olshen and Stone; Classification and Regression Trees; Chapman and Hall, 1984.
Hosmer and Lemeshow; Applied Logistic Regression; Wiley, 1989.
Minitab Help - Additional information about Discriminant Analysis.
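As an illustration outside Minitab, a linear model with interaction can be fitted to unbalanced data with statsmodels; a minimal sketch (the data values and unequal cell counts are hypothetical):

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.DataFrame({                 # hypothetical unbalanced data set
    "Rot":    [4, 6, 8, 10, 3, 5, 15, 18, 20, 9, 11],
    "Temp":   [10, 10, 10, 10, 16, 16, 16, 16, 16, 10, 10],
    "Oxygen": [2, 2, 6, 6, 2, 2, 6, 10, 10, 10, 10],
})

model = smf.ols("Rot ~ C(Temp) * C(Oxygen)", data=df).fit()
print(anova_lm(model))              # runs even though cell counts are unequal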
Pareto Diagrams

Stat>Quality Tools>Pareto Chart

When analyzing categorical defect data, it is useful to use the Pareto chart to visualize the relative defect frequency. A Pareto Chart is a frequency-ordered column chart. The analysis can either analyze raw defect data, such as "scratch, dent, etc.", or it can analyze count data such as is made available from Assembly Line Defects reports. The graphic on the left is from count data.

[Pareto Chart for Defects: Missing Screws (count 274, 64.8%), Missing Clips (59, 13.9%), Leaky Gasket (43, 10.2%), Defective Housing (19, 4.5%), Incomplete Part (10, 2.4%), Others (18, 4.3%); cumulative % 64.8, 78.7, 88.9, 93.4, 95.7, 100.0.]

Set up the worksheet with two columns, the first with the defect cause descriptor and the second with the count or frequency of occurrences. In the "PARETO CHART" dialog box, select "Chart DEFECTS TABLE". Link the cause descriptor to the "LABELS IN" box and the counts to the "FREQUENCY" box. Click OK. For more information, see the Minitab context-sensitive help in the Pareto dialog box.

To interpret the Pareto, look for a sharp gradient to the categories, with 80% of counted defects attributable to 20-30% of the identified categories. If the Pareto is flat, with all categories linked to approximately the same number of defects, try to restate the question to redefine the categorical splits.
Cause and Effect Diagrams

Fishbone Diagrams: Stat>Quality Tools>Cause & Effect

When working with the Advocacy team to define the potential factors (X's), it is often helpful to use a "Cause and Effect Diagram" or "Fishbone" to display the factors. The arrangement helps in the discovery of potential interactions between factors (X's).

[Example Cause-and-Effect Diagram for "Exhaust Quality", with branches for Measurements (Micrometers, Microscopes, Inspectors), Materials (Alloys, Lubricants, Suppliers), Men (Shifts, Supervisors, Training, Operators), Methods (Speed, Brake, Engager Angle), Environment (Condensation, Moisture %), and Machines (Lathes, Bits, Sockets).]
Use Minitab worksheet columns to record
descriptors for the factors identified during the team
brainstorming session. Group the descriptors in
columns by categories such as the 5M’s. Once the
factors are all recorded, open the Minitab
“Stat>Quality Tools>Cause and Effect” dialog
box.
The dialog box will have the 5M’s and Environment shown as default categories of factors. If using
these categories, link the worksheet columns of categorized descriptors to the dialog box categories.
If the team has elected to use other Category names, replace the default names and link the
appropriate columns. Click OK.
To interpret the Cause and Effect Diagram, look for places where a factor in one category could also
be included in another category. Question the Advocacy team about priority or significance of the
factors in each category. Then prioritize the factors as a whole. For the most significant factors, ask
the team where there is the potential for changes in one factor to influence the actions of another
factor. Use this information to plan analysis work.
Regression Analysis

Regression can be used to describe the mathematical relationship between the response variable and the vital few X's, if you have continuous data for your X's. Also, after the "vital few variables" have been isolated, solving a regression equation can be used to determine what tolerances are needed on the "vital few variables" in order to assure that the response variable is within a desired tolerance.

Regression analysis can find a linear fit between the response variable Y and the vital few input variables X1 and X2:

Y = B0 + B1X1 + B2X2 + error

(Start with a scatter diagram to examine the data.)

This linear equation can be used to decide what tolerances must be maintained on X1 and X2 in order to hold a desired tolerance on the variable Y:

ΔY = B1ΔX1 + B2ΔX2
Regression analysis can be done using several of the MINITAB tools.
Stat>Regression>Fitted Line Plot is explained on Page 20. This section will discuss
Regression>Regression.
Data must be paired in the MINITAB worksheet. That is, one measurement from each
input factor (x) is paired with the response data (Y) for that particular measurement point.
Plot the data first using Minitab Stat>Plot.
Analyze the data using Stat>Regression>Regression. In the dialog box, indicate the Response (Y) in the Response box and the expected factors (X's) in the Predictors box. Select the Storage button and in that dialog box select Fits and Residuals. Click OK twice to run the analysis. The output will appear as shown in the figure to the right.
The full regression equation is shown at the top of the output. Predictor influence can be evaluated using the p column in the first table. Analysis of the second table is done in similar fashion to the ANOVA analysis on page 41. Note that R² (adj) is similar to R² but is modified to reflect the number of terms in the regression. If there are many terms in the model, and the sample size is small, then R² (adj) can be much lower than R², and you may be over-fitting. In this example, the total sample size is large (n = 560), so R² and R² (adj) are similar.
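As an illustration outside Minitab, here is a minimal Python sketch of the multiple regression Y = B0 + B1X1 + B2X2 + error (the data values are hypothetical):

import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({                 # hypothetical paired measurements
    "Y":  [23.1, 25.4, 27.8, 30.2, 31.9, 35.0, 36.8, 39.5],
    "X1": [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5],
    "X2": [10, 12, 11, 14, 13, 16, 15, 18],
})

fit = smf.ols("Y ~ X1 + X2", data=df).fit()
print(fit.params)                   # B0, B1, B2
print(fit.rsquared, fit.rsquared_adj)
print(fit.pvalues)                  # predictor influence, as in the p column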
Time Series Plot

Graph>Time Series Plot

The time series plot is useful as a diagnostic tool. Use it to analyze data collection processes, non-normal data sets, etc. In GRAPH VARIABLES, identify any number of variables (Y) from the worksheet you wish to look at over time. Minitab assumes the values are entered in the order they occurred. Enter one column at a time. Minitab will automatically sequence to the next graph for each column. The X axis is the time axis and is set by selecting the appropriate setting in TIME SCALE. Each time series plot will display on a separate graph. In FRAME, ANNOTATE and OPTIONS, you can change chart axes, display multiple charts, etc. In analyzing the time series plot, look for a story: look for trends, sudden shifts, a regular cycle, extreme values, etc. If any of these exist, they can be used as a lead into problem solving.

Stepwise Regression

Stepwise Regression is useful to search for leverage factors in a data set with many factors (x's) and a response variable (Y). The tool can analyze up to 100 factors. While this enables the analysis of baseline data for potential vital X's, be careful not to draw conclusions about the significance of X's without first confirming with a DOE.
To use Stepwise regression, the data needs to be entered in Minitab with each
variable in a separate column and each row representing a single data point. Next
select Stat>Regression>Stepwise.
In the dialog box, identify the column containing the response (Y) data in the
Response box. In the Predictor box, identify the columns containing the factors
(X’s) you want Minitab to use. If their F-statistic falls below the value in the “F to
remove” text box under Options (Default = 4), Minitab removes them. By selecting
the Options button, you can change the Fcrit value for adding and removing factors
from the selection and also reduce the number of steps of analysis the tool goes
through before asking for your input.
Minitab will prioritize the leverage “X” variables and run the first regression step on
the factor with the greatest influence. It continues to add variables as long as the “t”
value is greater than the SQRT of the identified F statistic limit (Default = 4). The
Minitab output includes
1) the constant and the factor coefficients for the significant terms.
2) the "t" value for the factors included.
3) the "s" for the unexplained variation based on the current model.
4) the R² for the current model.
If you have chosen "1 step between pauses", Minitab will then ask if you wish to run more. Type "yes" and "enter". Continue this procedure until MINITAB won't calculate any more. At that point, you will have identified your potential "leverage X's".
Output

In this output, there are five potential predictors identified by stepwise regression. The steps are shown by the numbered columns and include the regression information for the included factors. The information in column 1 represents the regression equation information if only "Form" is used. In column 5, the regression equation information includes five factors, but the s is .050 and the R² is only 25%. In all probability, the analyst will choose to gather information including additional factors during the next runs.

Box-Cox Transformation

Stat>Control Charts>Box-Cox Transformation

[Box-Cox plot for skewed data: StDev vs. lambda with a 95% confidence interval; last iteration info: lambda Low 0.056, Est 0.113, Up 0.170.]
BOX-COX TRANSFORMATION is a
useful tool for finding a transformation
that will make a data set closer to a
normal distribution. Once confirmed
that the distribution is non-normal, use
Box Cox to find an appropriate
transformation. Box-Cox provides an
exponent used in the transformation
called lambda, “λ”.
The transformed data is the original data raised to the power of λ. Subgroup data
can be in columns or across rows. In the dialog box, indicate how DATA ARE
ARRANGED and where located. If data is subgrouped and subgroups are in rows,
identify configuration. To store transformed data, select STORE TRANSFORMED
DATA IN and indicate new location.
The Box-Cox transformation can be useful for correcting non-normality in process
data, and for correcting problems due to unstable process variation. Under most
conditions, it is not necessary to correct for non-normality unless the data are highly
skewed. It may not be necessary to transform data which are used in control charts,
because control charts work well in situations where data are not normally
distributed.
Note: You can only use this procedure with positive data.
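As an illustration outside Minitab, scipy estimates the same λ; a minimal sketch with simulated skewed data (not from the handbook):

import numpy as np
from scipy import stats

skewed = np.random.default_rng(1).lognormal(0.0, 0.8, 200)  # positive, skewed
transformed, lam = stats.boxcox(skewed)   # data raised to the power lambda
print(lam)                                # the estimated exponent, lambda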
Box Plot

GRAPH>BOXPLOT
The boxplot is useful for comparing
multiple distributions (Continuous Y and
discrete X).
In the GRAPH section of the dialog box, fill
in the column(s) you want to show for Y
and if a column is used to identify various
categories of X, i.e., subgroup coding, etc.
Click FRAME Button to give you the
options of setting common axes or multiple
graphs on the same page. To generate
multiple plots on a single page, select
FRAME>MULTIPLE GRAPHS>OVERLAY GRAPHS... Click ATTRIBUTES to allow you
to change individual box colors. Click OK
The box represents the middle 50% of the distribution. The horizontal line is the median (the middlemost value). The whiskers each represent a region sized at 1.5 × (Q3 − Q1), where (Q3 − Q1) is the region spanned by the box. Interpretation can be that the box represents the hump of the distribution and the whiskers represent the tails. Asterisks represent points which fall outside the lower or upper limits of expected values.
Interval Plot

GRAPH>INTERVAL PLOT

Useful for comparison of multiple distributions. Shows the spread of data around the mean by plotting standard error bars or confidence intervals.

The default form of the plot provides error bars extending one standard error (standard deviation / square root of n) above and below a symbol at the mean of the data.
Y variable: Select the column to be plotted on the y-axis.
Group variable: Select the column containing the groups (or categories). This variable is
plotted on the x-axis.
Type of interval plot
Standard error: Choose to display standard error bars where the error bars extend one
standard error away from the mean of each subgroup.
Multiple: Enter a positive number to be used as the multiplier for standard errors (1 is the
default).
Confidence interval: Choose to display confidence intervals instead of standard error bars. The confidence intervals assume a normal distribution for the data and use t-distribution critical values.
Regression with Curves (Quadratic) & Interactions

When analyzing multiple-factor relationships, it is important to consider if there is potential for quadratic (curved) relationships and interactions. Normal graphic analysis techniques and regression do not allow analysis of the effects of interrelated factors. To accomplish this, the data must be analyzed in an orthogonal array (see Page 49). In order to create an orthogonal array with continuous data, the factor (x) data must be centered. Do this as follows:

1. The data to be analyzed needs to be in columns, with the response in one column and the values of the factors paired with the response and recorded in separate columns.
2. Use Stat>DOE>Define Custom RS Design
In the dialog box, identify the columns containing the factor settings.
3. Next, analyze the model using Stat>DOE>Analyze RS Design.
Identify the column containing the response data
Check: Analyze Data using Coded Units.
4. Click on Storage and select Fits and Residuals for later regression diagnostics.
Click OK. Click on Graphs and select the desired graphs for analysis diagnostics.
The initial analysis will include all terms in the potential equation including full
quadratic.
Analysis of the output will be similar to that for
Regression>Regression (Page 43).
5. Where elements are
insignificant revert to the
Stat>DOE>Analyze RS
Design >Terms dialog box
to eliminate. In the case of
this example, the equation
can be analyzed as a linear
relationship, so select
“Linear” in the “Include the
following terms box”. Note
that this removes all the
interaction and quadratic
terms. Re-run the
regression. Once an
appropriate regression
analysis, including leverage
factors has been obtained, validate the adequacy of the model by using the
regression diagnostic plots Stat>Regression>Residuals Plots (Page 22).
Once an appropriate regression equation has been determined, remember this
analysis was done with centered data for the factors. The centering will have to be
reversed in order to make the equation useful from a practical standpoint. To
create a graphic of the model, use Stat>DOE>RSPlots (Page 52). From this
dialog box a contour plot of the results can be created.
Level: Enter the level of confidence for the intervals. The default confidence coefficient is
95%.
One-Variable Regression

STAT>REGRESSION>FITTED LINE PLOT

In the STAT>REGRESSION>FITTED LINE PLOT dialog box, identify the Response Variable (Y). Identify one (1) Predictor (X). Select TYPE OF MODEL (Linear, Quadratic or Cubic). Click on STORAGE. Select RESIDUALS and FITS. If you need to transform data, use OPTIONS and select Transformation. In OPTIONS, select DISPLAY CONFIDENCE BANDS and DISPLAY PREDICTION BANDS. Click OK.

The output from the fitted line plot contains an equation which relates your predictor (input variable) to your response (output variable):

Y = b + mX + error

A plot of the data will indicate whether or not a linear relationship between X and Y is a sensible approximation.

[Regression Plot: Abrasion (roughly 200-700) vs. Hardness (roughly 660-760), with the fitted line, 95% CI bands and 95% PI bands.]

These observations are modeled by the equation:

Y = 2692.80 − 3.16067X,  R-Sq = 0.784

Confidence Bands are 95% confidence limits for data means. Prediction Bands are limits for 95% of individual data points.

The R-sq is the square of the correlation coefficient. It is also the fraction of the variation in the output (response) variable that is explained by the equation. What is a good value? It depends: chemists may require an R-sq of .99; we may be satisfied with an R² of .80.

Residual Plots

Stat>Regression>Residual Plots

Any time a model has been created for an X/Y relationship, through ANOVA, DOE or Regression, the quality of that model can be evaluated by analysis of the error in the equation. When doing the REGRESSION (Pages 37-38) or the FITTED LINE PLOT (above), be sure to select store "FITS" and "RESIDUALS" in the "STORAGE" dialog box. If the fit is good, the error should be normally distributed with an average of zero, and there should be no pattern to the error over the range.

Use Residual Plots (below) to plot the residuals vs. predicted values (Fits) and determine if there are additional patterns in the data. In the "RESIDUAL PLOTS" dialog box, identify the column where the residuals are stored in the "Residuals" box and the fits storage column in the "Fits" box.

The output includes a normal plot of residuals, a histogram of residuals, an Individuals chart of residuals, and a scatter plot of residuals versus fits.

[Residual Plots output for Weld Temp fits: Normal Plot of Residuals, I Chart of Residuals (3.0SL = 0.9631, −3.0SL = −0.9631, X̄ = 0.000), Histogram of Residuals, and Residuals vs. Fits.]

Analysis of the Normal Plot should show a relatively straight line if the residuals are normally distributed. The I chart should be analyzed as a control chart. The histogram should be a bell-shaped distribution. The residuals vs. fits scatter plot should show no pattern, with a constant spread over the range.

Binary Logistic Regression

In binary logistic regression the predicted value (Y) will be the probability p(d) of an event, such as a success or failure occurring. The predicted values will be bounded between zero and one (because they are probabilities).

Example: Predict the success or failure of winning a contract based on the response cycle time to a request for proposal and the proposal team leader.

The probability of an event, π(x) or Y, is not linear with respect to the X's. The change in π(x) for a unit change becomes progressively smaller as π(x) approaches zero or one. Logistic regression develops a function to model this. π(x)/(1 − π(x)) is the odds; the Logit is the log of the odds. Ultimately, the transfer function being developed will solve for π(x):

π(x) = e^(β0 + β1x) / (1 + e^(β0 + β1x))

To analyze the binary logistic problem, use STAT>REGRESSION>BINARY LOGISTIC REGRESSION. The data set used for "Response" will be discrete and binary (Yes/No; Success/Failure). In the "Model" dialog box, enter all factors to be analyzed. In the "Factors" dialog box, enter those factors which are discrete. Use the "Storage" button and select "Event probability". This will store the calculated event probability for each unique value of the function.

Analyze the Session Window output:
1. Analyze the hypothesis test for the model as a whole. Check for a p value indicating model significance.
2. Check for statistical significance of the individual factors separately, using the P value.
3. Check the Odds Ratios for the individual predictor levels.
4. Use the Confidence Interval to confirm significance. Where the confidence interval includes 1.0, the odds are not significant.
5. Evaluate the model for Goodness of Fit. Use Hosmer-Lemeshow if there is a "continuous" "X" in the model.
6. Assess the Measures of Association. Note that "% Concordant" is a measure similar to R²; a higher value here indicates a better predictive model.

[Example session output (abridged): Link Function: Logit. Response Information for variable Bid: Yes 113, No 110, Total 223. Logistic Regression Table: Constant (Coef 7.410, StDev 1.670, Z 4.44, P 0.000); Index (Coef −8.530, StDev 1.799, Z −4.74, P 0.000, Odds Ratio 0.00, 95% CI 0.00 to 0.01); Brand Sp Yes (Coef 1.2109, StDev 0.3005, Z 4.03, P 0.000, Odds Ratio 3.36, 95% CI 1.86 to 6.05). Log-Likelihood = −134.795; test that all slopes are zero: G = 39.513, DF = 2, P-Value = 0.000. Goodness-of-Fit tests: Pearson χ² = 187.820 (DF 116, P 0.000); Deviance = 224.278 (DF 116, P 0.000); Hosmer-Lemeshow = 7.138 (DF 7, P 0.415). Table of Observed and Expected Frequencies (see Hosmer-Lemeshow Test for the Pearson Chi-Square statistic). Measures of Association between the Response Variable and Predicted Probabilities: Pairs Concordant 9060 (72.9%), Discordant 3260 (26.2%), Ties 110 (0.9%), Total 12430 (100.0%); Somers' D 0.47, Goodman-Kruskal Gamma 0.47, Kendall's Tau-a 0.23.]
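As an illustration outside Minitab, here is a minimal Python sketch of a binary logistic fit using statsmodels (the bid outcomes and cycle times are hypothetical, not the handbook's data set):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({                  # hypothetical bid outcomes (1 = won)
    "won":   [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0],
    "cycle": [5, 12, 4, 6, 14, 8, 3, 13, 7, 10, 9, 9],
})

fit = smf.logit("won ~ cycle", data=df).fit()
print(fit.params)                               # beta0, beta1
print(np.exp(fit.params["cycle"]))              # odds ratio for cycle time
print(fit.predict(pd.DataFrame({"cycle": [6]})))  # event probability pi(x)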
Descriptive Statistics

Stat>Basic Statistics>Descriptive Statistics

[Graphical Summary for Variable C1 (N = 500): histogram with superimposed normal curve and box plot; Anderson-Darling Normality Test: A-Squared 0.235, P-Value 0.790; Mean 69.3824, StDev 9.8612, Variance 97.2442, Skewness −5.0E−03, Kurtosis 3.89E−02; Minimum 38.9111, 1st Quartile 62.5858, Median 69.6848, 3rd Quartile 75.8697, Maximum 99.6154; 95% Confidence Interval for Mu: 68.5160 to 70.2489; 95% Confidence Interval for Sigma: 9.2856 to 10.5136; 95% Confidence Interval for Median: 68.6347 to 70.8408.]
The Descriptive Statistics>Graphs>Graphical
Summary graphic provides a histogram of the data
with a superimposed normal curve, a normality
check, a table of descriptive statistics, a box plot of
the data and confidence interval plots for mean and
median.
In the Descriptive Statistics dialog box select the
variables for which you wish to create the descriptive
statistics. If choosing a stacked variable with a
category column, check the BY VARIABLE box and
indicate the location of the category identifier.
Use the Graphs button to open the graphs dialog
box. In the graphs dialog box, select “Graphical
Summary”.
When using this tool to interpret normality, confirm
the p value and evaluate the shape of the
histogram. Remember that the p value is “The probability of claiming the data are not normal if the
data are truly from a normal distribution”, a type I error. A high p-value would therefore be consistent
with a normal distribution. A low p-value would indicate non-normality. When evaluating the shape
of the histogram graphically, determine: Is it bimodal? Is it skewed? If yes, investigate potential
causes for the non-normality. Improve if possible or analyze groups separately. If no special cause is
found for the non-normality, the distribution may be non-normal naturally and you may need to
transform the data (page 22) prior to calculating your Z.
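As an illustration outside Minitab, here is a minimal Python sketch producing the same kind of summary and normality check (the data is simulated, not the handbook's C1 column):

import numpy as np
from scipy import stats

data = np.random.default_rng(7).normal(70, 10, 500)       # simulated C1
print(data.mean(), data.std(ddof=1))
print(np.percentile(data, [25, 50, 75]))                  # quartiles and median
ad = stats.anderson(data, dist="norm")                    # Anderson-Darling
print(ad.statistic, ad.critical_values)                   # A-squared vs. critical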
Normal Plot

STAT>BASIC STATISTICS>NORMALITY TEST

[Normal Probability Plot: probability (.001 to .999) vs. the data; example with Average 70.0000, StDev 10.0000, N 500; Anderson-Darling Normality Test: A-Squared 0.418, P-Value 0.328.]

Identify the variable you will be testing in the Variable box. Click OK (use the default Anderson-Darling test).

A normal probability plot is a graphical method to help you determine whether your data is normally distributed. To graphically analyze your data, look at the plotted points relative to the sloped line. A normal distribution will yield plotted points which closely hug the line. Non-normal data will generally show points which significantly stray from the line.

The test statistics displayed on the plot are A-squared and the p-value. The A-squared value is an output of a test for normality. Focus your analysis on the p value. The p value is "the probability of claiming the data are not normal if the data are truly from a normal distribution", a type I error. A high p-value would therefore be consistent with a normal distribution. A low p-value would indicate non-normality. Use the appropriate type I error probability for judging this result.

Design for Six Sigma - Tolerance Analysis

Tolerance Analysis is a design method used to determine the impact that individual parts of a system have on the overall requirement for that system. Most often, Tolerance Analysis is applied to dimensional characteristics in order to see the impact the dimensions have on the final assembly in terms of a gap or interference. In this application, a tolerance loop may be used to illustrate the relationship.

Purpose
To graphically show the relationships of multiple parts in a system which result in a desired technical requirement in terms of a gap or interference.

Process
1. Generate a layout drawing of your assembly. A hand sketch is all that is required.
2. Clearly identify the gap in the most severe condition.
3. Select a DATUM or point from which to start your loop. (It is easier to start the loop at one of the interfaces of the gap.)
4. Use drawing dimensions as vectors to connect the two sides of the gap.
5. Assign a sign convention (+/−) to the vectors.

Vector Assignment
[Tolerance loop diagram: from the Datum, vector A (+) spans one side; vectors B1, B2, B3 and B4 span the other side, ending at the Gap.]

Assign a positive (+) vector when:
• An increase in the dimension increases the gap.
• An increase in the dimension reduces the interference.
Assign a negative (−) vector when:
• An increase in the dimension reduces the gap.
• An increase in the dimension increases the interference.

In the diagram above, the relationship can be explained as:

GAP = A − B1 − B2 − B3 − B4

Because the relationship can be explained using only + and − signs, the equation is considered LINEAR, and can be analyzed using a method known as Root Sum of Squares (RSS) analysis.
Design for Six Sigma Tolerance Analysis (continued)
Histogram
GRAPH>HISTOGRAM
The histogram is useful to look
graphically at the distribution of data.
In the GRAPH VARIABLES box select each variable you wish to graph individually.

Using RSS, the statistics for the GAP can be explained as follows. The bar above a term designates its mean value and "s" designates its standard deviation:

$\overline{GAP} = \overline{A} - \overline{B_1} - \overline{B_2} - \overline{B_3} - \overline{B_4}$

$s_{gap} = \sqrt{s_A^2 + s_{B_1}^2 + s_{B_2}^2 + s_{B_3}^2 + s_{B_4}^2}$
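As a concrete illustration of the RSS calculation above, the short Python sketch below computes the gap mean and RSS standard deviation for the loop GAP = A - B1 - B2 - B3 - B4. The part means, standard deviations and sign assignments are hypothetical values chosen for illustration, not data from the handbook.

import math

# Signed vectors from the tolerance loop (+1 opens the gap, -1 closes it).
# Part means and standard deviations are hypothetical illustration values.
parts = {
    "A":  {"sign": +1, "mean": 50.00, "std": 0.030},
    "B1": {"sign": -1, "mean": 12.00, "std": 0.015},
    "B2": {"sign": -1, "mean": 12.00, "std": 0.015},
    "B3": {"sign": -1, "mean": 12.00, "std": 0.015},
    "B4": {"sign": -1, "mean": 12.00, "std": 0.015},
}

# Gap mean: signed sum of the part means (the linear loop equation).
gap_mean = sum(p["sign"] * p["mean"] for p in parts.values())

# Gap standard deviation: root sum of squares of the part sigmas.
gap_std = math.sqrt(sum(p["std"] ** 2 for p in parts.values()))

print(f"Gap mean = {gap_mean:.3f}, Gap sigma (RSS) = {gap_std:.4f}")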
Click the OPTIONS button to change the histogram displayed:
• Type of Histogram - Frequency (default), Percent, or Density
• Type of Intervals - Midpoint (default) or Cutpoint
• Definition of Intervals - Automatic (default) or manual definition
See Minitab Help for an explanation of how to use these options. Click HELP in the HISTOGRAM>OPTIONS dialog box.
Given these equations, the impact that each individual part has
on the entire system can be analyzed. In order to perform this
analysis, follow these steps:
Gather and prepare the required data:
• gap nominals
• gap specification limits
• process data for each component:
  - process mean
  - process s.st (short-term standard deviation)
  - process s.lt (long-term standard deviation)
• process data assumptions if data are not available (try 'expert data sources'):
  - process s.lt estimates when data are not available
  - s.st and s.lt from capability data
  - Z-shift assumptions (long term to short term) when only one is known: multiply s.st by 1.6 as a variance inflation factor; multiply s.st by more than 1.6 for a process that has less control long term (divide s.lt by the same factor if long-term historical data is known)
Click FRAME button to give the options for setting common axes or multiple
graphs.
Click ATTRIBUTES button to access options for changing graphic colors or fill
type.
Scatter Plot
GRAPH>PLOT
The scatter plot is a useful tool for
understanding the relationship between two
variables.
Linear Tolerance Spreadsheet
Once the data has been collected, it can be analyzed using the
Tolanal.xls spreadsheet described on the next page. The
Tolanal.xls spreadsheet can be found on the GEA website, under
Six Sigma, Forms & Tools.
[Figure: Scatter plot comparing NEW vs. EXISTING values (scale 8.5 to 9.7).]
The Tolanal.xls spreadsheet performs its analysis using the Root Sum of Squares method, and should only be applied to LINEAR relationships (e.g., Y = X1 + X2 - X3). Non-linear relationships require more detailed analysis using advanced DFSS tools such as Monte Carlo or the ANALYSIS.XLS spreadsheet. Contact a Master Blackbelt for support.
In the GRAPH VARIABLES box select each X
and Y variable you wish to plot. MINITAB will
create individual plots for each pair of variables
selected. In the Six Sigma method, the
selected Y should be the dependent variable
and the selected X the independent variable.
Select as many combinations as you wish.
Click the OPTIONS button to add jitter to
the graph. Where there are multiple data points with the same value, this will allow
each data point to be seen.
Click FRAME button to give the options for setting common axes or multiple graphs.
Click ATTRIBUTES button to access options for changing Graphic colors or fill type.
Click ANNOTATION button to access options for changing the appearance of the data
points or to add titles, data labels or text.
Results: If Y changes as X changes, there is a potential relationship. Use the graph to
check visually for linearity or non-linearity.
The Planning Questions

Every problem solving task is focused on finding out something. The investigation will be more effective if it is planned. Planning is appropriate for Gage R&R, characterizing the process, analyzing the process for differences (hypothesis testing), design of experiments, or confirmation run analysis. In short, it is appropriate in every phase of the MAIC or MAD process.
This investigative process is called "reverse loading" because the process begins with a question focusing on what is desired at the end of the process. Plan, then execute, using the critical questions:
1) What do you want to know? (Know)
2) How do you want to see what it is that you need to know? (See)
3) What type of tool will generate what it is that you need to see? (Tool)
4) What type of data is required of the selected tool? (Data)
5) Where can you get the required type of data? (Where)

Tolerancing Analysis - Linear Spreadsheet

The linear spreadsheet workflow:
1. Input the technical requirements.
2. Input target dimensions and vector direction.
3. Input short- and long-term σs of the part dimensions.
The spreadsheet then identifies the major contributors to system variation.
Input your process data into the spreadsheet:
· input gap technical requirements
· input interface dimension 'nominals' as the baseline case
· input design vectors from the loop diagram
· input process σ's (long and short term, from available data)
Analyze the initial output:
· Gap: Min/Max of constraints (info only) vs. Z.upper and Z.lower for the 'Gap' (CTQ)
· Parts: RSS-σ.st contribution by each part; target the high RSS% first
Modify the input variables to optimize the Z-gap. Change:
· part nominals vs. means (implies a process shift... tooling sensitive?)
· component σ.st (caution here... if you reduce it, you are making 'big' assumptions)
· Z-shift factor (change the σ.lt, using actual data or assumptions)
· target CTQ specifications (if not constrained... negotiate with the Customer)
Review your output for the optimized Z-gap condition:
· If the initial Z-gap is very high, you can move part nominals or 'open' the variation (don't penalize yourself by constraining processes too much).
· If you cannot get the Z-gap to your goals, re-design should be considered.
· Understand ALL implications of your changes to any of the input variables.
Establish your tolerances based on %RSS contribution, sensitivity to Z.gap and the desired sigma level of the particular contributing dimension; know the effect on your NOMINAL design:
· The highest %RSS contributors will have the most impact. Iterate by moving the nominal by 1.0 σ.st in both directions... continue iterating to 2 σ.st, 4 σ.st, 5 σ.st, etc.
· Understand and weigh your risks in increasing a tolerance (effect on nominals, subsequent operations, etc.)... how is the process managed and what are the controls? Who supplies?
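To make the Z-gap analysis concrete, here is a minimal Python sketch of what the spreadsheet computes for the gap CTQ: Z.upper and Z.lower from the gap mean, the RSS sigma and the gap specification limits, plus each part's %RSS contribution. All numbers are hypothetical illustration values, not output of Tolanal.xls.

import math

# Hypothetical gap statistics from the RSS stack-up (see the RSS sketch earlier).
gap_mean = 2.000
part_sigmas = {"A": 0.030, "B1": 0.015, "B2": 0.015, "B3": 0.015, "B4": 0.015}
gap_sigma = math.sqrt(sum(s ** 2 for s in part_sigmas.values()))

# Hypothetical gap specification limits (the technical requirement).
lsl, usl = 1.85, 2.15

z_upper = (usl - gap_mean) / gap_sigma   # Z to the upper spec limit
z_lower = (gap_mean - lsl) / gap_sigma   # Z to the lower spec limit
print(f"Z.upper = {z_upper:.2f}, Z.lower = {z_lower:.2f}")

# %RSS contribution: each part's share of the gap variance.
for name, s in part_sigmas.items():
    print(f"{name}: %RSS = {100 * s**2 / gap_sigma**2:.1f}%")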
Copyright 1995 Six Sigma Academy, Inc.
Design Of Experiments
Baselining data collection is considered passive observation. The process is monitored
and recorded without intentional changes or tweaking. In Designed Experiments,
independent variables (Factors) are actively manipulated and recorded and the effect on
the dependent variable (Response) is observed. Designed experiments are used to:
• Determine which factors (X’s) have the greatest impact on the response (Y).
• Quantify the effects of the factors (X’s) on the response (Y).
• Prove the factors (X’s) you think are important really do affect the process.
Six Sigma Process Report

Analysis of Continuous Data
[Report 1: Executive Summary example - Process Performance histogram with LSL and USL and the Actual (LT) and Potential (ST) distribution curves; Process Demographics (Date, Reported by, Project, Department, Process, Characteristic, Units, Upper Spec: 22, Lower Spec: 8, Nominal: 15, Opportunity); Process Benchmarks (Sigma and PPM, Actual (LT) vs. Potential (ST)); and a PPM convergence chart (scale 1000 to 1,000,000).]

Orthogonality
Since our goal in experimentation is to determine the effect each factor has on the
response independent of the effects of other factors, experiments must be designed so as
to be horizontally and vertically balanced. An experimental array is vertically balanced if
there are an equal number of high and low values in each column. The array is
horizontally balanced if for each level within each factor we are testing an equal number of
high and low values from each of the other factors. If we have a balanced design in this
manner, it is Orthogonal. Standard generated designs are orthogonal. When modifying or
fractionating standard designs be alert to assure maintenance of orthogonality.
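As a sketch of what "balanced" means, the Python snippet below builds a 2^3 full factorial in coded units and checks vertical balance (each column sums to zero) and orthogonality (the dot product of any two columns is zero), the two conditions described above. This is an illustration, not Minitab output.

from itertools import product

# 2^3 full factorial in coded units (-1 = low, +1 = high).
design = [list(run) for run in product([-1, 1], repeat=3)]

# Vertical balance: each factor column has as many highs as lows.
for j in range(3):
    col = [run[j] for run in design]
    assert sum(col) == 0, f"column {j} is not balanced"

# Orthogonality: every pair of columns has a zero dot product.
for a in range(3):
    for b in range(a + 1, 3):
        dot = sum(run[a] * run[b] for run in design)
        assert dot == 0, f"columns {a} and {b} are confounded"

print("Design is balanced and orthogonal:", design)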
Replication
Duplicating experimental runs after resetting the independent variables is called replication. It is commonly used to assure generalization of results over longer term conditions. When using MINITAB for experimental designs, replications can be programmed during the design creation.
Randomization
Running experimental trials in a random sequence is a common, recommended practice
that assures that variables that change over time have an equal opportunity to affect all the
runs. When possible, randomizing should be used for designed experimental plans. It is
the default setting when MINITAB generates the design, but can be deselected using the
OPTIONS button.
Blocking
A block is a group of "homogeneous units". It may be a group of units made at "the same time", such as a block by shift or lot, or it may be a group of units made from "the same material", such as a raw material lot or manufacturer. When blocking an experiment, you are adding a factor to the design; i.e., a full factorial 2^4 experiment with blocking will actually analyze as a 2^(5-1) experiment. When analyzing processes subject to multiple shifts, multiple raw material flows, etc., blocking by those conditions is recommended.
[Report 2: Process Capability example - Xbar and S charts by subgroup (X=12.44, 3.0SL=15.74, -3.0SL=9.133; S=1.691, 3.0SL=4.342, -3.0SL=0.000); Capability Indices: Mean 15.0000 (ST) / 12.4368 (LT), StDev 1.8917 (ST) / 3.7003 (LT), Z.Bench 3.5205 (ST) / 2.1692 (LT), Z.Shift 1.3513, Yield 99.9785 / 98.4966, PPM 215.402 / 15034.0; process tolerance bars against the 8 to 22 specification; and Process Demographics fields (Date, Reported By, Project, Department, Process, Characteristic, Units, Upper Spec, Lower Spec, Nominal, Opportunity, Data Source, Time Span).]

The Six Sigma Process Report, "Six Sigma>Process Report", displays data to enable the analysis of continuous process data. The default reports are the Executive Summary (Report 1) and the Process Capability Report (Report 2).
1. Identify the configuration of the Y data, and
the location of the useful data. (columns or
rows).
2. Identify the CTQ specifications or the location
of the demographic information.
3. If detailed demographic information is to be
used, select the Demographics button. Either
enter the data for the information (shown at
the left) in the dialog box or reference a
spreadsheet column with this information
listed as shown.
[Figure: Example worksheet columns - response data (e.g., Hardness in Brinell: 42, 38, 40) alongside a demographics column listing, in order: Date, Reported By, Project, Department, Process, Characteristic, Units, Upper Spec, Lower Spec, Nominal, Opportunity, Data Source, Time Span.]
To use this tool effectively, the response data (Y) must be collected in rational subgroups of two (2) or more data points. In addition to the Y data, a demographics column may be added to provide the demographic information on the right side of Report 1. If used, the demographics column must be entered in the exact order shown in the figure. Once the data is entered, create the report by calling "Six Sigma>Process Report".
Repetition
Completing a run more than once without resetting the independent variables is called repetition. It is commonly used to minimize the effect of measurement variation and to analyze factors affecting short-term variation in the response.
4. When the report is generated with only this
information, the default reports will be shown.
If additional reports are desired, they can be
accessed through the “Reports” button.
Executive Summary - The top left graphic displays the predicted distribution based on the data. MINITAB assumes normal data and will display a normal curve whether the data is normal or not. The lower left graphic displays the expected PPM defect rates as subgroups are added to the prediction; when this curve stabilizes (levels off), enough data has been taken. The Process Benchmarks show the reported Z Benchmark scores and PPM (defects in both tails are combined) (Page 8).
Capability Study - The control charts provide an excellent means for diagnosing the rational subgrouping process. Use normal techniques for analysis of this chart (Page 54). The capability indices on the right provide tabular results of the study. The bar diagrams at the bottom of the report show comparative graphics of the short term and long term process predictions.
Factorial Designs
Factorial Designs are primarily used to analyze the effects of two or more factors and their interactions. Based on the level of risk acceptable, experiments may be either full factorial, looking at every factor combination, or fractional factorial, looking at a fraction of the factor combinations. Fractional factorial experiments are an economical way to screen for vital X's, since they only look at a fraction of the factor combinations, but their results may be misleading because of confounding: the mixing of the effect of one factor with the effect of a second factor or interaction. In planning a fractional factorial experiment, it is important to know the confounding patterns and confirm that they will not prevent achievement of the goals of the DOE.
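To illustrate confounding, the Python sketch below builds a 2^(3-1) half fraction using the generator C = AB and shows that the C column is identical to the AB interaction column, so the effect of C is aliased with the AB interaction. This is an illustrative construction, not Minitab's design engine.

from itertools import product

# Full factorial base design in factors A and B (coded -1/+1).
base = [list(run) for run in product([-1, 1], repeat=2)]

# Half-fraction generator: C = A*B.
design = [[a, b, a * b] for a, b in base]

for a, b, c in design:
    print(f"A={a:+d}  B={b:+d}  C={c:+d}  (AB={a*b:+d})")

# The C column equals the AB column, so C is confounded with AB.
assert all(c == a * b for a, b, c in design)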
Normality of Data
Data from many processes can be approximated by a normal distribution. Additionally, the Central Limit Theorem states that characteristics which are the average of individual values are likely to have an approximately normal distribution. Prior to characterizing your project Y, it is valuable to analyze the data for normality to confirm whether the data follow a normal distribution. If there is strong evidence that the data do not follow a normal distribution, then predictions of future performance should not be made using the normal distribution.
Normal Probability Plot
Use "Stat>Basic Stats>Normality Test" (Fig 1) (Page 21) or "Stat>Basic Stat>Descriptive Statistics" (Fig 2) (Page 21) with "Graphs>Graphical Summary" checked. If using "Normality Test", the default is "Anderson-Darling". Use that test for most investigations. Use other tests with caution; for example, Kolmogorov-Smirnov is actually a less sensitive test.
[Fig 1: Normal probability plot (Average: 70.0000, StDev: 10.0000, N: 500) with Anderson-Darling Normality Test results A-Squared: 0.418, P-Value: 0.328.]
[Fig 2: Descriptive Statistics graphical summary for variable "Normal". Anderson-Darling Normality Test: A-Squared 0.418, P-Value 0.328. Mean 70.0000, StDev 10.0000, Variance 100.000, Skewness -5.0E-02, Kurtosis 0.393445, N 500. Minimum 29.824, 1st Quartile 63.412, Median 69.977, 3rd Quartile 76.653, Maximum 103.301. 95% Confidence Interval for Mu: 69.121 to 70.879; for Sigma: 9.416 to 10.662; for Median: 69.021 to 70.737.]
[Fig 3: Xbar/R Chart for Mystery - Means chart (X=100., 3.0SL=113.) with many points flagged "1" as out of control; Ranges chart (R=22.83, 3.0SL=48.2).]
The test statistic for primary use in analyzing the test results is the P value. The null hypothesis, Ho, states that the process is normal, so if the p value < .05, then there is evidence that the data do not follow a normal distribution. If the process shows non-normality, either there are special causes of variation that cause the non-normality, or the common cause variation is not normal. Analyze first for special cause.
Use Stat>Control Charts (Fig 3) (Page 49) or Plot>Time Series Plot (Page 24) to look for "out of control" points or drifts of the process over time. Try to determine the cause of those points and separate, or stratify, the data using that knowledge. If the levels of X's have been captured, use graphics to aid in visualizing the process stratified by the X's. If the data can be stratified and within the strata the data is normal, the process can be characterized at the individual levels and perhaps characterized using the Product Report (page 15). The discovery of a special cause contributing to non-normality may lead to improving the process. If the common cause variation is non-normal, it may be possible to transform the data to an approximately normal distribution. MINITAB provides such a tool in "Stat>Control Charts>Box-Cox Transformation" (Page 24). Additional notes on data transformation can be found in the Quality Handbook; Juran Chap 22.
[Fig 1: CREATE FACTORIAL DESIGN dialog box. Fig 2: Designs selection dialog box.]
To create a Factorial Experiment using MINITAB, select STAT>DOE>CREATE FACTORIAL DESIGN. In the dialog box (Fig 1) select the Number of Factors and then the Designs button. If the number of factors allows both a fractional and full factorial design, the Designs dialog box (Fig 2) will show the available selections, including both full and fractional designs. Resolution, which is a measure of confounding, is shown by each displayed design. While in this dialog box, identify the number of replicates and blocks to be used in the design. Select OK to return to the initial dialog box. Select Options. In the Options dialog box select Randomize Run if planned. Finally, select the Factors button and, in that dialog box, name the factors being studied and the factor experimental levels. Click OK twice to generate the completed design. The design will be generated on the MINITAB worksheet as shown in Fig 3. An analysis of the design, including the design Resolution and confounding, will be generated in the MINITAB Session Window.
Now run the experiment and collect the data. Record the run data in a new column, in the same row as that run's factor settings.
[Fig 3: Generated factorial design on the MINITAB worksheet.]
DOE Analysis
Analysis of DOE’s includes both graphical and tabular information. Once the data for the
experimental runs has been collected and entered in the MINITAB worksheet, analyze
with STAT>DOE>ANALYZE FACTORIAL DESIGN. In the ANALYZE FACTORIAL
DESIGN dialog box, identify the column(s) with the response data in the Responses
box. Select the GRAPHS button. In the GRAPHS dialog box, select PARETO for the
effects plots and change ALPHA (α level of significance) to .05. Click OK twice. Note
that we have not used the other options buttons at this time. Leave Randomize at default
settings. The initial analysis provides a session window output and a Pareto graph.
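For intuition about what the effects analysis computes, here is a minimal Python sketch of estimating a main effect from a 2-level design: the effect of a factor is the mean response at its high setting minus the mean response at its low setting. The design and response values are hypothetical, not the PCReact data.

# Runs of a 2^2 full factorial in coded units, with hypothetical responses.
runs = [
    {"A": -1, "B": -1, "y": 52.0},
    {"A": +1, "B": -1, "y": 61.0},
    {"A": -1, "B": +1, "y": 55.0},
    {"A": +1, "B": +1, "y": 66.0},
]

def main_effect(runs, factor):
    """Mean response at the high level minus mean response at the low level."""
    hi = [r["y"] for r in runs if r[factor] == +1]
    lo = [r["y"] for r in runs if r[factor] == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

for f in ("A", "B"):
    print(f"Main effect of {f}: {main_effect(runs, f):+.1f}")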
Characterizing the Process - Rational Subgrouping
To separate the measurement of Z.ST and Z.LT and understand fully how the process operates, capture data in such a way as to see both short term variation, inherent to the technology being used, and long term variation, which reflects the variation induced by outside influences. The process of collecting data in this manner is called "Rational Subgrouping". Analyzing rational subgroups allows analysis of "centering vs. spread" and "control vs. technology."
[Figure: Pareto Chart of the Effects (response is PCReact, Alpha = .05). Factors: A: Feedrate, B: Catalyst, C: Agitate, D: Tempera..., E: Concentr.... Effects B, D, BD, DE and E exceed the significance line; the remaining effects (CE, A, BC, AB, BE, AE, AD, AC, CD, C) do not.]
Steps to characterize a process using Rational Subgroups
1. Work with operational advocacy team to define the factors (X’s) suspected as
influential in causing output variation (Y). Confirm which of these factors are
operationally controllable and which are environmental. Prioritize and understand the
cycle time for sensing the identified factors. Be sure to question the effect of
elements of all the 5M’s of process variation:
Machine - Technology; Maintenance; Setup
Materials - Batch/Lot/Coil Differences
Method - MTS; Workstation layout; Operator method
Manpower - Station Rotation; Shift Changes; skill levels
Measurement - R&R; Calibration effects
Environment – Weather; Job Site or shop
Analysis of the DOE requires both graphic and model analysis; however, the model should be generated and analyzed before full graphic analysis can be completed. An analysis of the Fit Model in the MINITAB Session window shows the amount of effect and the model coefficients. Most important, though, is the ANOVA table. This table may show the significant factors or interactions (see Balanced ANOVA, Page 40). In this case, the F score is shown as "**" and there are no "p" values. This indicates that the model as defined is too complex to be analyzed with the number of data points taken. The model needs to be simplified, and the Pareto graphic is a helpful tool for that. Note that effects B, D and E and interactions BD and DE show as significant effects. The remaining non-significant effects can be eliminated.
2. Define a data collection plan that captures data within each subgroup over a period of time short enough that only the variation inherent to the technology occurs. Subgroup size can be two (2) or more; two measured data points are the minimum necessary to see subgroup variation. Larger subgroups provide greater sensitivity to process changes, so the choice of subgroup size must balance the needs of the business against the need for process understanding. This variation is called "common cause" and represents the best the process can achieve. In planning data collection, use of The Planning Questions (Page 19) is helpful.
3. Define the plan to allow for collection of the subgroups over a long period of time
which allows the elements of long term variation and systematic effects of potentially
important variables to influence the subgroup results. Do not tweak, or purposely
adjust the process, but rather recognize that the process will drift over time and plan
the data collection accordingly.
Rerun STAT>DOE>ANALYZE FACTORIAL DESIGN. This time select the TERMS option button. In the dialog box, deselect the terms not shown as significant in the Pareto. Click OK. Select STORAGE and select RESIDUALS and FITS. Click OK twice. The resulting ANOVA table shows the significance of the factors, and the model coefficients are provided. Next, run STAT>DOE>FACTORIAL PLOTS. Select and set up each of the plots (MAIN EFFECTS, INTERACTIONS and CUBE) as follows: identify the response column in the RESPONSES box, and select only the significant factors to be included in the plot. Click OK twice to generate the plots. Confirm the significance of effects and interactions graphically using the MAIN EFFECTS and INTERACTIONS plots. Use the CUBE PLOT to identify the factor levels for achieving the most desirable response.
4. Capture data and analyze data using Control Charts (Page 53 - 55) and 6 Sigma
Process Report (Page 18) during the data collection period. Stay close to the
process and watch for data shifts and causes for the shifts. Capture data
documenting the levels of the identified vital X’s. This data may be helpful in
analyzing the causes of process variation. During data collection it may be helpful to
maintain a control chart or some other visual means of sensing process shift.
[Figures: Main Effects Plot for PCReact and Interaction Plot for PCReact over the factors Catalyst, Temperat... and Concentr...; Cube Plot of means for PCReact with corner means 47.0, 80.0, 64.5, 94.0, 66.0, 55.5, 62.0 and 53.0 across Catalyst (1, 2), Temperat... (140, 180) and Concentr... (3, 6).]
5. Capture sufficient subgroups of data to allow for multiple changes in all the
identified vital X’s and also to allow for a stable estimate of the mean and variation in
the output variable (Y). See 6 Sigma Process Report (Page 18) for explanation of
graphic indicator of estimation stability.
Six Sigma Product Report
The Six Sigma Product Report “Six Sigma>Product Report” is used to
calculate and aggregate Z values from discrete data and data from multiple
normal processes.
Enter “# defects”, “# units” and “# opportunities” data in separate columns in
MINITAB. When Z shift is included in the calculation (1.5 default) the
reported Z bench is short term. If zero is entered, the reported Z.bench is
long term.
Defect count - Enter the actual defects recorded in the sample population.
If using defect data from Continuous Process Study, use PPM for long term.
If this report is a rollup of subordinate processes, use the defect count from
the subordinate process totals.
Units - Enter the actual number of parts included in the sample population
evaluated. If using data from Continuous Process Study, use 1,000,000. If
this report is a rollup of subordinate processes, use the actual number of
parts included in the sample population evaluated.
Opportunities - At the lowest level, use one (1) for the number of opportunities. One (1) is the number of CTQ's characterized at the lowest level of analysis. If this report is a rollup of subordinate processes, use the total number of opportunities accounted for in the subordinate processes.
Characteristics (Optional) - Enter the test name for the Characteristic, CTQ or subprocess.
[Report 7: Product Performance - see the example table below.]
Shift - Process ZSHIFT can be entered three ways. If the report is an aggregate of a number of continuous data based studies, for example a part with multiple CTQ's, the ZSHIFT data can be entered in the worksheet as a separate column and referred to in the Product Report dialog box. A fixed ZSHIFT of 1.5 is the default and will be used if nothing is specified. A ZSHIFT of zero (0) will produce a report that shows only the long-term results.
As the levels are rolled up, the data from the totals in the subordinate processes become line items in the higher level breakdown. In the chart below, the process reported includes data from 12 subprocesses. Door Assy South includes a process with six (6) CTQ's characterized.
Analyzing the report
The far right hand column of the report shows the Z.Bench for the individual
processes and for the Cumulative Z.Bench. The number at the bottom of the
DPO column, in this case 0.081917, reports the P (d), probability of a defect
at the end of the line.
Characteristic       Defs    Units  Opps  TotOpps   DPU    DPO       PPM     ZShift  ZBench
CG Case              46332   66636    3   199908   0.695  0.231767  231767   1.500   2.233
C83                   2174   66636    1    66636   0.033  0.032627   32627   1.500   3.344
Sealed System High     554   66636    2   133272   0.008  0.004157    4157   1.500   4.139
Sealed System Low     3540   66636    3   199908   0.053  0.017708   17708   1.500   3.604
C84                   3643   66636    1    66636   0.055  0.054667   54667   1.500   3.101
C85                   1947   66636    1    66636   0.029  0.029223   29223   1.500   3.392
Door Assy South      37052   66636    6   399816   0.556  0.092673   92673   1.500   2.824
C86                    811   66636    1    66636   0.012  0.012174   12174   1.500   3.752
Plastics             14869   66636    1    66636   0.223  0.223144  223144   1.500   2.262
C87                   2901   66636    1    66636   0.044  0.043534   43534   1.500   3.211
C90                   1544   66636    1    66636   0.023  0.023166   23166   1.500   3.492
C91                   4721   66636    1    66636   0.071  0.070852   70852   1.500   2.969
Total               120089                1465992         0.081917   81917   1.500   2.892
Central Composite Design (CCD)
Response Surface analysis is a type of Designed Experiment that allows investigation of non-linear relationships. It is a tool for fine tuning process optimization once the region of optimal process conditions is known. Using the CCD type RS Design, you will be designing an experiment that tests each factor at five levels; the design can also be used to augment a factorial experiment that has already been completed. The CCD design will include FACTORIAL points, STAR points and CENTER points.
Start by running STAT>DOE>CREATE RS DESIGN. Select CENTRAL COMPOSITE from the design type choices in the dialog box. Identify the number of factors to be studied and click the DESIGN button. In the DESIGN dialog box, select the experiment design desired, including the blocks. Click OK and then select the FACTORS button. In that dialog box identify the factors and their high and low factorial settings and click OK. Randomize Runs is found in the OPTIONS dialog box. Click OK to generate the design. The design will be placed on a new worksheet. Collect data for each of the scheduled trials defined by the design. Note that there will be multiple runs at the center point of the factors, and there will be star points for each factor beyond the factor ranges identified in the design.
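The structure of a CCD can be sketched in a few lines of Python: factorial corners, axial (star) points at ±alpha, and repeated center points, giving five levels per factor. The alpha value and number of center points below are common illustrative choices, not prescriptions from this handbook.

from itertools import product

# Central Composite Design point generator for k factors in coded units:
# factorial corners (+/-1), star points (+/-alpha on each axis), center points.
def ccd_points(k, alpha, n_center):
    corners = [list(p) for p in product([-1.0, 1.0], repeat=k)]
    stars = []
    for i in range(k):
        for s in (-alpha, alpha):
            pt = [0.0] * k
            pt[i] = s
            stars.append(pt)
    centers = [[0.0] * k for _ in range(n_center)]
    return corners + stars + centers

# Two factors, rotatable alpha = sqrt(2), five center points (a common choice).
for run in ccd_points(2, 2 ** 0.5, 5):
    print(run)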
Analyze the data using STAT>DOE>ANALYZE RS DESIGN. In the dialog box identify the response column. Leave Use Coded Units selected and choose the appropriate setting for the USE BLOCKS box, depending on the plan. Click OK and run. The resulting output is a combination of the Regression output (Page 43) and the ANOVA output (Page 41). The regression output analyzes how the individual factors and interactions fit the model. The ANOVA table will analyze the type of relationship and also the total fit of the model. If the "Lack of Fit" error is significant, another model may be appropriate. Simplify the model for terms and regression complexity as appropriate.
See DOE Analysis (Page 51). Rerun STAT>DOE>ANALYZE RS DESIGN and select the TERMS button. Before rerunning the simplified analysis, select STORAGE and select FITS and RESIDUALS.
Continue simplification and tabular analysis to attempt to find a simple model that explains a large portion of the variation. Confirm regression fit quality using Residual Plots (Page 22). The terms in the ANOVA table should show significance, except that the "Lack of Fit" term should become insignificant (p>.05). Next run STAT>DOE>RS PLOTS. Select either CONTOUR or SURFACE plot and SETUP for the selection. In the SETUP dialog box, confirm that the appropriate factors are included for the plot, noting that each plot will show only one factor pair. Check that the plot is displayed using UNCODED units and run. Use the graphic generated to visually analyze for the optimal factor settings, or use the model coefficients and solve for the optimal settings mathematically.
[Figure: Contour Plot of the response surface for strength as a function of Volume (25 to 31.5) and Composition (75 to 95), with strength contours from 15 to 40.]
Control Charts
Control charts are a practical tool for detecting product and/or process performance changes (in X-bar and R) over time in relation to historical performance. Since they are a rigorous maintenance tool, control charts should be used as an alternative to closed loop process control, such as mechanical sensing and process adjustment.
Common- and special-cause variation can be seen in rationally subgrouped samples:
• Common-cause variation is characterized by steady state, stable process variation (captured by the within-subgroup variation).
• Special-cause variation is characterized by outside assignable causes acting on the process variation (captured by the between-subgroup variation).
• Control chart analysis signals when the steady state process variation has been influenced by outside assignable causes.
Variables Control Charts
Variable Control Charts are used in pairs. One chart characterizes the variation of
subgroup averages, and the other chart characterizes the variation of the spread of the
subgroups.
Individual Charts (X/Moving Range): These charts are excellent for tracking long term
variation changes. Because they use a single measurement for each data point, they are
not a tool of choice where measurement variation is involved, such as with part
dimensions. They work well with temperatures, pressures, concentration, etc.
Subgroup Charts (XBar R or Xbar S): These charts are excellent for tracking changes
in short term variation as well as variation over time. They require multiple measurements
(two or more) in each subgroup. Using rational subgroup techniques with this chart
enables graphic analysis of both short term variation changes (Range or S) and long term
variation (X Bar chart). This chart is the chart of choice where measurement variation is
involved. It is also an excellent tool for tracking processes during baselining or
rebaselining, since it assists in pointing to special cause influence on results. Because
there is usually no change in temperature, pressures or concentration in the short term,
they are not used for that type of measurement.
Rolled Throughput Yield
Rolled Throughput Yield (YRT) is the probability of completing all the
opportunities in a process without a defect.
As such, it is a tool which can focus the investigation when narrowing
down the problem from a larger business problem.
In a process which has 18 stations, each with 5 opportunities, and
DPO = 0.001, the YRT is .9139, calculated as follows:
$Y_{RT} = \left((Yield_{Station})^{\#Opportunities}\right)^{\#Stations}$

$Y_{RT} = (.999^5)^{18} = (.995)^{18} = .91389$

In addition to the straight multiplication method of calculating YRT,

$Y_{RT} = Y_1 \times Y_2 \times Y_3 \times \ldots \times Y_N$

where Y1, Y2, Y3, ..., YN are the yields of the individual stations or operations in a process, YRT can also be estimated using the Poisson approximation:

$Y_{RT} = e^{-DPU}$  and conversely  $DPU \cong -\ln(Y_{RT})$
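A quick check of these YRT formulas in Python, using the example above (18 stations, 5 opportunities each, DPO = 0.001); the numbers are the document's own example.

import math

dpo = 0.001               # defects per opportunity
opps_per_station = 5
stations = 18

station_yield = (1 - dpo) ** opps_per_station   # yield of one station
y_rt = station_yield ** stations                # rolled throughput yield
print(f"Y.RT = {y_rt:.5f}")                     # approximately 0.91389

# Poisson approximation: Y = exp(-DPU), so DPU is roughly -ln(Y).
dpu_total = dpo * opps_per_station * stations
print(f"Poisson estimate = {math.exp(-dpu_total):.5f}")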
Attribute Charts
Attribute Control Charts are a single chart. The common difference between these charts
is whether they track proportion, a ratio, or defects, a count.
Proportion Defective Charts (p charts): This chart tracks proportion. The data point plotted is the ratio (number of defectives) / (number of pieces inspected). In using proportion defective charts, the number of pieces in a sample can vary, and the control limits for the chart will vary based on that sample size.
Number Defective Charts (np charts): This chart tracks the defective count. The data point plotted is the number of defectives in a sample. Because the data point is a number relative to a sample size, it is important that the sample size be relatively constant between samples. The sample size should be defined so that the average number of defectives is at least five in order for this chart to be effective.
In setting up Control Charts, use the Planning Questions (Page 19) first. Those questions
along with these notes will help define the type of chart needed. Use SETCIM, MINITAB,
SPQ (Supplier Process Quality) or other electronic methods for long term charting.
Normalized Average Yield
Normalized Average Yield (YNA) is the average yield of one
opportunity. It answers the question “What is the probability that the
output of this process will meet the output requirements?” The YNA
of a process is an average defect rate, and can be used for
comparing processes with differing levels of complexity.
$Y_{NA} = (Y_{RT})^{1/\#Opportunities}$

Normalized Average Yield (YNA) is the probability of good product, so if we calculate 1 - YNA, we can find the probability of a defect, P(d). With this we can find the Z.LT score for a process.

$P(d) = 1 - Y_{NA}$
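Continuing the YRT example in code, here is a minimal sketch of getting from YNA to a Z.LT score, using Python's NormalDist for the inverse normal-table lookup; the YRT and opportunity count are the document's example values.

from statistics import NormalDist

y_rt = 0.91389
total_opportunities = 18 * 5     # stations x opportunities per station

y_na = y_rt ** (1 / total_opportunities)   # normalized average yield
p_defect = 1 - y_na                        # probability of a defect
z_lt = NormalDist().inv_cdf(1 - p_defect)  # Z whose right-tail area is P(d)

print(f"Y.NA = {y_na:.6f}, P(d) = {p_defect:.6f}, Z.LT = {z_lt:.2f}")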
DPU / DPO
DPU
DPU is the number of defects per unit produced. It’s an average. This
means that on the average, each unit produced will have so many defects.
Interpreting Variables Control Charts
DPU gives us an index of quality generated by the effects of process,
material, design, environmental and human factors. Keep in mind that
DPU measures symptoms, not problems. (It’s the Y, not the X’s).
DPU = (# Defects) / (# units)
[DPU is the average number of defects in a unit]
DPU forms the foundation for Six Sigma. From DPU and a knowledge of
the opportunities, we can calculate the long term capability of the process.
Opportunity
An opportunity is anything you measure, test or inspect. It may be a part,
product or service CTQ. It can be each of the elements of an assembly or
subassembly.
[Figure: Control chart zones - UCL and LCL with zones A, B and C above and below the center line X, and Rules 1-5 illustrated. Note: A, B, and C represent the plus and minus one, two and three sigma zones from the overall process average.]
DPO
DPO is the number of defects per opportunity. It is a probability.
$Total\ Opportunities = \#Units \times \frac{\#Opportunities}{Unit}$

[DPO is the probability of a defect on any one CTQ or step of a process]

$DPO = \frac{\#Defects}{Total\ Opportunities} = \frac{DPU}{Opportunities\ per\ unit}$

$Yield = 1 - DPO$

DPO is the foundation for determining the Z value when using discrete data. To find Z, given DPU, convert DPU to DPO. Then look up the P(d) for DPO in the body of the Z table and convert to a Z score (page 7).
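A small Python sketch of the DPU-to-Z conversion just described, using NormalDist in place of the printed Z table; the defect counts are hypothetical.

from statistics import NormalDist

# Hypothetical counts.
defects, units, opps_per_unit = 120, 500, 4

dpu = defects / units                       # defects per unit
dpo = dpu / opps_per_unit                   # defects per opportunity
p_defect = dpo                              # probability of a defect
z_lt = NormalDist().inv_cdf(1 - p_defect)   # Z whose right-tail area is P(d)

print(f"DPU={dpu:.3f}, DPO={dpo:.3f}, Z.LT={z_lt:.2f}")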
A lack of control ("out of control") is indicated when one or more of the following rules apply to your chart data:
1. A single point above or below a control limit.
2. Two out of three consecutive points on the same side of the mean, in Zone A or beyond.
3. Four out of five consecutive points on the same side of the mean, in Zone B or beyond.
4. At least eight consecutive points on the same side of the mean, in Zone C (related run tests: 10 of 11 or 12 of 14 points on the same side of the mean).
5. Seven points in a row trending up, or seven points in a row trending down.
6. Fourteen points sequentially alternating up and down.
7. Fourteen points in a row in Zone C, on both sides of the mean.
8. Eight points in a row alternating in Zone B or beyond.
Control Chart Constants
Variables Control Chart Control Limit Constants

 n     A2      A3      D3      D4      B3      B4      d2      c4
 1    2.660   3.760    -       -       -       -       -       -
 2    1.880   2.659    0      3.267    0      3.267   1.128   0.7979
 3    1.023   1.954    0      2.575    0      2.568   1.693   0.8862
 4    0.729   1.628    0      2.282    0      2.266   2.059   0.9213
 5    0.577   1.427    0      2.115    0      2.089   2.326   0.9400
 6    0.483   1.287    0      2.004   0.03    1.970   2.534   0.9515
 7    0.419   1.182   0.076   1.924   0.118   1.882   2.704   0.9594
 8    0.373   1.099   0.136   1.864   0.185   1.815   2.847   0.9650
 9    0.337   1.032   0.184   1.816   0.239   1.761   2.970   0.9693
10    0.308   0.975   0.223   1.777   0.284   1.716   3.078   0.9727

(For n = 1, the A2 column value 2.660 is the E2 constant used with the Individual X / Moving Range chart.)

Average/Range Chart

$\bar{\bar{X}} = \frac{\bar{X}_1 + \bar{X}_2 + \ldots + \bar{X}_k}{k}$, where $\bar{X} = \sum_{i=1}^{n} X_i / n$

$\bar{R} = \frac{R_1 + R_2 + \ldots + R_k}{k}$

$UCL_X = \bar{\bar{X}} + A_2 \bar{R}$ and $LCL_X = \bar{\bar{X}} - A_2 \bar{R}$

$UCL_R = D_4 \bar{R}$ and $LCL_R = D_3 \bar{R}$

Individual X / Moving Range Chart

$\bar{X} = \frac{X_1 + X_2 + \ldots + X_k}{k}$

$R_m = |X_{i+1} - X_i|$ and $\bar{R}_m = \frac{R_1 + R_2 + \ldots + R_{k-1}}{k - 1}$

$CL_X = \bar{X} \pm E_2 \bar{R}_m$

$UCL_{Rm} = D_4 \bar{R}_m$ and $LCL_{Rm} = D_3 \bar{R}_m$
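A minimal Python sketch of the Xbar/R control limit calculation using the constants table above (A2 = 0.577, D3 = 0 and D4 = 2.115 for n = 5); the subgroup data are hypothetical.

# Hypothetical subgroups of size n = 5.
subgroups = [
    [12.1, 11.8, 12.3, 12.0, 11.9],
    [12.4, 12.2, 11.7, 12.1, 12.0],
    [11.9, 12.0, 12.2, 12.3, 11.8],
]

# Constants for n = 5 from the table above.
A2, D3, D4 = 0.577, 0.0, 2.115

xbars = [sum(s) / len(s) for s in subgroups]
ranges = [max(s) - min(s) for s in subgroups]
xbarbar = sum(xbars) / len(xbars)
rbar = sum(ranges) / len(ranges)

print(f"Xbar chart: CL={xbarbar:.3f}, UCL={xbarbar + A2 * rbar:.3f}, "
      f"LCL={xbarbar - A2 * rbar:.3f}")
print(f"R chart:    CL={rbar:.3f}, UCL={D4 * rbar:.3f}, LCL={D3 * rbar:.3f}")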
np Charts
np = # defective for each subgroup

$\overline{np} = \frac{\sum np}{k}$, for all k subgroups

$UCL_{np} = \overline{np} + 3\sqrt{\overline{np}\,(1 - \bar{p})}$ and $LCL_{np} = \overline{np} - 3\sqrt{\overline{np}\,(1 - \bar{p})}$

p Charts
np = number of defectives; n = subgroup size; N = total number of defectives for all subgroups

$p = \frac{np}{n}$ and $\bar{p} = \frac{N}{\sum n}$

$UCL_p = \bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$ and $LCL_p = \bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
• A desirable system will have a Gage R&R <10% and Categories of Discrimination >5.
• The system is acceptable if Gage R&R is >10% but <20% and Categories of Discrimination = 5.
• If Gage R&R is >20% but <30% and Categories of Discrimination = 4, the decision about acceptability will be based on the importance of measuring the characteristic and the business cost.
• If Gage R&R is >30%, or the Categories of Discrimination < 4, the measurement system is not considered acceptable and needs to be improved.
MINITAB Analysis Outputs
MINITAB provides a tabular and graphical output. The tabular output has three tables: the first is an ANOVA table (see ANOVA Interpretation; Page 37), the second provides the raw calculated results of the study, and the third provides the percent contribution results. Interpretation of Gage R&R results is focused on the third table, which displays the "% Contribution" and "% Study Variation" figures that are interpreted as Gage R&R. If you have included a tolerance range with the "Options" button, this table will also report a "% Tolerance" result.
The Number of Distinct Categories is also provided. This number indicates how many classifications can be reliably distinguished given the observed process variation.
The graphical analysis provides
several important graphic tools.
• The control chart should appear out of control. Operator-to-operator variation defines the control limits. If the gage has adequate sensitivity beyond its own noise, more than 50% of the points will be outside the control limits. If this is not the case, the system is inadequate to detect part-to-part variation.
• The range chart should be in control, showing consistency between the operators. If there are only two or three distinct ranges recorded, it may indicate a lack of gage resolution.
• The column chart shows the graphic picture of the data provided in table three of the tabular report. The graphics on the right show various interaction patterns that may be helpful in troubleshooting a problem measurement system.
(1) Measurement Systems Analysis Reference Manual; ©AIAG 1994
Gage R&R (1)
What it is:
Gage R&R is a means for checking the measurement system (gage plus
operator) to gain a better understanding of the variation and sources from the
measurement system.
$Gage\ R\&R = 5.15\,\sigma_m$, or $5.15\sqrt{EV^2 + AV^2}$

where $\sigma_m$ = Measurement System standard deviation
Components of Measurement Error
• Repeatability = Equipment Variation (EV): The variation in
measurements attributable to one measurement instrument when used
several times by one appraiser to measure the identical characteristic
on the same part.
• Reproducibility = Appraisal Variation (AV): The variation in
measurements attributable to different appraisers using the same
measurement instrument to measure the same characteristic on the
same part.
Precontrol
Why use it?
Provides an ongoing visual means of on-the-floor process control.
What does it do?
• Gives operators decision rules for continuing or stopping production.
• Rules are based on the probability that the population mean has shifted.
How do I do it?
1. Establish control zones:
How to do the gage R&R study:
1. Determine how the gage is going to be used; i.e., product acceptance or process control. The gage must have resolution 10X finer than the process variation it is intended to measure (i.e., measurement of parts with process variation of .001 requires a gage with .0001 resolution).
2. Select approximately ten parts which represent the entire expected range of the process variation, including several beyond the normally acceptable range. Code (blind) the parts.
3. Identify two or three Gage R&R participants from the people who actually do the measurement. Have them each measure each part two or three times. The measurements should be done with samples randomized and blinded.
4. Record the results on a MINITAB worksheet as follows:
a) One column - coded part numbers (PARTS)
b) One column - appraiser number or name (OPER)
c) One column - recorded measurement (RESP)
5. Analyze using MINITAB by running "Stat>Quality Tools>Gage R&R":
a) In the initial dialog box choose the ANOVA method.
b) Identify the appropriate columns for "PARTS", "OPERATOR", and "MEASUREMENT Data".
c) If you wish to include the analysis for process tolerance, select the "OPTIONS" button. This is only to be used if the gage is for pass/fail decisions only, not for process control.
d) If you wish to show demographic information on the graphic output, including gage number, etc., select the "Gage Information" button.
[Precontrol zones: Red below -3.0s; Yellow (.07) between -3.0s and -1.5s; Green (.86) between -1.5s and +1.5s around µ; Yellow (.07) between +1.5s and +3.0s; Red above +3.0s.]
2. When five parts in a row are "green", the process is qualified.
3. Sample two consecutive parts on a periodic basis.
4. Decision rules for operators:
A. If the first part is green, no action is needed; continue to run.
B. If the first part is yellow, then check a second part:
» If the second part is green, no action is needed.
» If the second part is yellow on the same side, then adjust.
» If the second part is yellow on the opposite side, stop and call the support engineer.
C. If any part is red, stop and call the support engineer.
5. After correcting and restarting a process, you must achieve 5 consecutive "green" samples to re-qualify.
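The operator decision rules above map naturally to a small function; here is a Python sketch, with zone classification based on the ±1.5s and ±3.0s boundaries. The zone labels and thresholds follow the diagram; the sample values, mean and sigma are hypothetical.

def zone(x, mean, sigma):
    """Classify a measurement into a precontrol zone."""
    d = (x - mean) / sigma
    if abs(d) > 3.0:
        return "red"
    if abs(d) > 1.5:
        return "yellow+" if d > 0 else "yellow-"
    return "green"

def decision(first, second=None):
    """Precontrol run rules for two consecutive sampled parts."""
    if first == "red" or second == "red":
        return "stop, call support engineer"
    if first == "green":
        return "continue to run"
    if second is None:
        return "check a second part"         # first part was yellow
    if second == "green":
        return "no action needed"
    if second == first:
        return "adjust"                      # both yellow on the same side
    return "stop, call support engineer"     # yellow on opposite sides

z1 = zone(10.8, mean=10.0, sigma=0.5)        # hypothetical parts
z2 = zone(9.1, mean=10.0, sigma=0.5)
print(z1, z2, "->", decision(z1, z2))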
Project Closure
At Closure, the project must be positioned so that the changes made to the process are
sustainable over time. Doing so requires the completion of a number of tasks.
1. The improvement must be fully implemented, with leverage factors identified and
controlled. The process must have been re-baselined to confirm the degree of
improvement.
2. Process owners must be fully trained and running the process, controlling the
leverage factors and monitoring the Response (Y).
3. Required Quality Plan and Control Procedures, drawings, documents, policies, generated reports or institutionalized rigor must be completed:
• Workstation Instructions
• Job Descriptions
• Preventive Maintenance Plan
• Written Policy or controlled ISO documents
• Documented training procedures
• Periodic internal audits or review meetings
4. The project History Binder must be completed which records key information about
the project work in hard copy. Where MINITAB has been used for analysis, hard
copies of the generated graphics and tables should be included.
• Initial baseline data
• Gage R & R calculations
• Statistical characterization of the process
• DOE (Design of Experiments)
• Hypothesis testing
• Any data from Design Change Process activities (described on the next page),
Failure Modes and Effects Analysis (FMEA), Design for Six Sigma (DFSS),
etc.
• Copies of engineering part and tooling drawing changes showing “Z” score
values on the drawings.
• Confirmation run data
• Financial data (costs and benefits)
• Final decision on improvement and conclusions
• All related quality system documents
• A scorecard (with frequency of reporting)
• Documented control plan
Data Validity Studies
Non Measurement data is that which is not the result of a measurement
using a gage.
Examples:
• Finance data (T&L cost; Cost & Benefits; Utility Costs; Sales, etc.)
• Sales Data (Units sold; Items purchased, etc.)
• HR data (Employee Information; medical service provider
information)
• Customer Invoice Data
Samples of data should be selected to assure they represent the
population. A minimum of 100 data points is desirable. The data is then
analyzed for agreement by comparing each data point (as reported by
the standard reporting mechanism) to its true observed value.
The validity of the data is reported as % Agreement.
$\%\ Agreement = \left(\frac{Number\ of\ Agreements}{Number\ of\ Observations}\right) \times 100$
% Agreement should be very good. Typically this measure is much
greater than 95%.
% Agreement for Binary (Pass/Fail) Data
Calculate % Agreement in a similar manner to non-measurement data, except using the following equation.
$\%\ Agreement = \left(\frac{Number\ of\ Agreements}{Number\ of\ Opportunities}\right) \times 100$
Where the number of opportunities is found by the following equations.
n = total number of assessments per sample
s = number of samples
5. All data entries must be complete in PROTRAK
• Response Variable Z scores at initial Baselining
• Response Variable Z scores at Re-baselining,
• Project Definition
• Improvements Made
• Accomplishments, Barriers and Milestones for all project phases
• Tools used for all project phases.
If n is odd:  $\#Opportunities = s \times \frac{n^2 - 1}{4}$

If n is even:  $\#Opportunities = s \times \frac{n^2}{4}$
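A small Python sketch of the binary % Agreement calculation using the opportunity-count formulas above; the agreement count, number of samples and assessments per sample are hypothetical.

def num_opportunities(n, s):
    """Agreement opportunities for s samples with n assessments each."""
    if n % 2 == 1:
        return s * (n * n - 1) // 4
    return s * n * n // 4

agreements = 92                       # hypothetical agreement count
opps = num_opportunities(n=3, s=50)   # 3 assessments on each of 50 samples
print(f"Opportunities = {opps}, % Agreement = {100 * agreements / opps:.1f}%")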
6. Costs and Benefits for the project must be reconfirmed with site finance.
• Overall % Agreement = agreement rate for all opportunities.
7. Investigate potential transfer opportunities where project lessons learned can be
applied to other business processes.
• Repeatability % Agreement = compare the assessments of one operator over multiple assessment opportunities. (Fix this problem first.)
• Reproducibility % Agreement = compare assessments of the same part from operator to operator.
8. Submit closure package for signoff through the site approval channels.
Fulfillment & Span
Fulfillment ≡ providing what the customer wants, when the customer wants it.
Fulfillment is a highly segmented metric and typically does not follow a normal
distribution. Because the data is non-normal some of the traditional 6 Sigma
tools should not be used (such as the 6 Sigma Process Report). Therefore,
Median and Span will be used to measure Fulfillment.
Median ≡ the middle value in a data set
Span ≡ the difference between two values in the data set (e.g., 1/99 Span = the difference between the 99th percentile and the 1st percentile)
We don’t want our decision to be influenced by a single data point.
Therefore, the Span calculation is dependent on the sample size. Larger
data sets will have a wider span. Following are corporate guidelines on the
Span calculation:
Sample Size    Span
100-500        10/90 Span
500-5000       5/95 Span
>5000          1/99 Span
Example:
• A sample of 100 delivery times has a high value of 40 days.
• If that one value had instead been 30 days, the 1/99 span would change by 10 days.
• The 10/90 span is not affected by what happens to that highest point.
In order to analyze a fulfillment process, the data should be segmented by the
variables that may affect the process. Each segment of data should be
compared to identify if the segmenting factor had an influence on the Median
and the Span. Mood’s Median test is a tool that can be used to identify
significant differences in Median. Factors that are identified as having an
influence on Span and Median, should be evaluated further through designed
experimentation.
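A minimal Python sketch of the Median and Span calculation, picking the percentile pair from the sample-size guidelines above; the delivery-time data are randomly generated for illustration, and a simple nearest-rank percentile is assumed.

import random
import statistics

def percentile(sorted_xs, p):
    """Nearest-rank percentile (p in 0-100) of a sorted list."""
    k = max(0, min(len(sorted_xs) - 1, round(p / 100 * (len(sorted_xs) - 1))))
    return sorted_xs[k]

def span(xs):
    """Span using the corporate guideline for the sample size."""
    n = len(xs)
    lo, hi = (10, 90) if n <= 500 else (5, 95) if n <= 5000 else (1, 99)
    s = sorted(xs)
    return (lo, hi), percentile(s, hi) - percentile(s, lo)

random.seed(1)
times = [random.gauss(12, 4) for _ in range(300)]   # hypothetical delivery times
(lo, hi), sp = span(times)
print(f"Median = {statistics.median(times):.1f} days, "
      f"{lo}/{hi} Span = {sp:.1f} days")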
Sample Size

[Table: Required sample size per group as a function of the standardized difference δ/σ (0.2 to 4.0, in steps of 0.1) for each combination of α (20%, 10%, 5%, 1%) and β (20%, 10%, 5%, 1%). Sample sizes range from 1 (large δ/σ with loose α and β) up to about 1200 (δ/σ = 0.2 at α = 1%, β = 1%); for example, the α = 20%, β = 20% column reads 225, 100, 56, 36, 25, 18, 14, 11 for δ/σ = 0.2 through 0.9. The entries are consistent with the two-sample formula n = 2((Z(α/2) + Z(β)) / (δ/σ))², rounded up.]
F Distribution (α = .05)

Numerator Degrees of Freedom

Denom DF      1       2       3       4       5       6       7       8       9      10
    1      161.40  199.50  215.70  224.60  230.20  234.00  236.80  238.90  240.50  241.90
    2       18.51   19.00   19.16   19.25   19.30   19.33   19.35   19.37   19.38   19.40
    3       10.13    9.55    9.28    9.12    9.01    8.94    8.89    8.85    8.81    8.79
    4        7.71    6.94    6.59    6.39    6.26    6.16    6.09    6.04    6.00    5.96
    5        6.61    5.79    5.41    5.19    5.05    4.95    4.88    4.82    4.77    4.74
    6        5.99    5.14    4.76    4.53    4.39    4.28    4.21    4.15    4.10    4.06
    7        5.59    4.74    4.35    4.12    3.97    3.87    3.79    3.73    3.68    3.64
    8        5.32    4.46    4.07    3.84    3.69    3.58    3.50    3.44    3.39    3.35
    9        5.12    4.26    3.86    3.63    3.48    3.37    3.29    3.23    3.18    3.14
   10        4.96    4.10    3.71    3.48    3.33    3.22    3.14    3.07    3.02    2.98
   11        4.84    3.98    3.59    3.36    3.20    3.09    3.01    2.95    2.90    2.85
   12        4.75    3.89    3.49    3.26    3.11    3.00    2.91    2.85    2.80    2.75
   13        4.67    3.81    3.41    3.18    3.03    2.92    2.83    2.77    2.71    2.67
   14        4.60    3.74    3.34    3.11    2.96    2.85    2.76    2.70    2.65    2.60
   15        4.54    3.68    3.29    3.06    2.90    2.79    2.71    2.64    2.59    2.54
   16        4.49    3.63    3.24    3.01    2.85    2.74    2.66    2.59    2.54    2.49
   17        4.45    3.59    3.20    2.96    2.81    2.70    2.61    2.55    2.49    2.45
   18        4.41    3.55    3.16    2.93    2.77    2.66    2.58    2.51    2.46    2.41
   19        4.38    3.52    3.13    2.90    2.74    2.63    2.54    2.48    2.42    2.38
   20        4.35    3.49    3.10    2.87    2.71    2.60    2.51    2.45    2.39    2.35
   21        4.32    3.47    3.07    2.84    2.68    2.57    2.49    2.42    2.37    2.32
   22        4.30    3.44    3.05    2.82    2.66    2.55    2.46    2.40    2.34    2.30
   23        4.28    3.42    3.03    2.80    2.64    2.53    2.44    2.37    2.32    2.27
   24        4.26    3.40    3.01    2.78    2.62    2.51    2.42    2.36    2.30    2.25
   25        4.24    3.39    2.99    2.76    2.60    2.49    2.40    2.34    2.28    2.24
   26        4.23    3.37    2.98    2.74    2.59    2.47    2.39    2.32    2.27    2.22
   27        4.21    3.35    2.96    2.73    2.57    2.46    2.37    2.31    2.25    2.20
   28        4.20    3.34    2.95    2.71    2.56    2.45    2.36    2.29    2.24    2.19
   29        4.18    3.33    2.93    2.70    2.55    2.43    2.35    2.28    2.22    2.18
   30        4.17    3.32    2.92    2.69    2.53    2.42    2.33    2.27    2.21    2.16
   40        4.08    3.23    2.84    2.61    2.45    2.34    2.25    2.18    2.12    2.08
   60        4.00    3.15    2.76    2.53    2.37    2.25    2.17    2.10    2.04    1.99
  120        3.92    3.07    2.68    2.45    2.29    2.17    2.09    2.02    1.96    1.91
   ∞         3.84    3.00    2.60    2.37    2.21    2.10    2.01    1.94    1.88    1.83

Z - An Important Measure
Z Short Term

$Z_{ST} = \frac{SL - Target}{s_{ST}}$

Z.ST describes how the process performs at any given moment in time. It is referred to as "instantaneous capability," "short-term capability" or "process entitlement". It is used when referring to the "SIGMA" of a process. It is the process capability if everything is controlled so that only background noise (common cause variation) is present. This metric assumes the process is centered and the data were gathered in accordance with the principles and spirit of a rational subgrouping plan (p. 14). The "Target" assumes that each subgroup average is aligned to this number, so that all subgroup means are artificially centered on this number. The s.st used in this equation can be estimated by the square root of the Mean Square Error term in the ANOVA table. Since the data is centered, Z.ST can be calculated from either one of the Specification Limits (SL).
Z Long Term

$Z_{LT} = \min\left(Z_{LT_{USL}},\ Z_{LT_{LSL}}\right)$, where $Z_{LT_{USL}} = \frac{USL - \mu}{\sigma_{LT}}$ and $Z_{LT_{LSL}} = \frac{\mu - LSL}{\sigma_{LT}}$
Z Shift

$Z_{SHIFT} = Z_{ST} - Z_{LT}$
Z.LT describes the sustained reproducibility of a process. It is also called "long-term capability." It reflects all of the sources of operational variation: the influence of common cause variation, dynamic nonrandom process centering error, and any static offset present in the process mean. This metric assumes the data were gathered in accordance with the principles and spirit of a "rational sampling" plan (p. 14). This equation is applicable to all types of tolerances. It is used to estimate the long-term process "PPM."
Z.SHIFT describes how well the process being measured is controlled over time. It reflects the difference between the short term and long term capability. It focuses on the dynamic nonrandom process centering error and any static offset present in the process mean. Interpretation of the Z.shift is only valid when following the principles of rational subgrouping (p. 14).
Z Benchmark

While the Z values above are all calculated in reference to a single spec limit, Z Benchmark is the Z score of the summation of the probabilities of defects in both tails of the distribution. To find it, sum the probability of a defect at the Lower Spec Limit (P.LSL) and the probability of a defect at the Upper Spec Limit (P.USL). Look up the sum of the combined probabilities in a normal table to find the corresponding Z value.

$Z_{Benchmark} = Z\ score\,(P_{USL} + P_{LSL})$
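A short Python sketch tying these Z definitions together for a two-sided tolerance, with NormalDist supplying the normal-table lookups; the specification limits, mean and sigmas are hypothetical illustration values.

from statistics import NormalDist

nd = NormalDist()

# Hypothetical process data.
usl, lsl, target = 22.0, 8.0, 15.0
mu, s_st, s_lt = 12.44, 1.89, 3.70

z_st = min(usl - target, target - lsl) / s_st        # short term, centered on target
z_lt = min((usl - mu) / s_lt, (mu - lsl) / s_lt)     # long term, worst single tail

# Z benchmark: combine defect probabilities from both tails.
p_defect = (1 - nd.cdf((usl - mu) / s_lt)) + nd.cdf((lsl - mu) / s_lt)
z_bench = nd.inv_cdf(1 - p_defect)

print(f"Z.ST={z_st:.2f}, Z.LT={z_lt:.2f}, Z.shift={z_st - z_lt:.2f}, "
      f"Z.bench={z_bench:.2f}")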
The Standard Normal Curve

** Area under the curve = 1, the center is 0 **

The Z value is a measure of process capability and is often referred to as the "sigma of the process." A Z = 1 indicates a process for which the performance limit falls one standard deviation from the mean. If we calculate the standard normal deviate for a given performance limit and discover that Z = 2.76, the probability of a defect (P(d)) is the probability of a point lying beyond the Z value of 2.76 (from the table below, .00289).

[Figure: Standard normal curve centered on the mean µ with the point of inflection at 1σ (Z = 1); a performance limit drawn at Z = 2.76 shades the tail area beyond it, the probability of a defect. Total area = 1.]

Table of Area Under the Normal Curve
This table lists the tail area to the right of Z.
Copyright 1995 Six Sigma Academy, Inc.
 Z     Area           Z     Area           Z     Area           Z     Area
0.00  .500000000     1.51  .065521615     3.02  .001263795     4.53  .000002999
0.05  .480061306     1.56  .059379869     3.07  .001070234     4.58  .000002369
0.10  .460172290     1.61  .053698886     3.12  .000904215     4.63  .000001867
0.15  .440382395     1.66  .048457216     3.17  .000762175     4.68  .000001469
0.20  .420740315     1.71  .043632958     3.22  .000640954     4.73  .000001153
0.25  .401293634     1.76  .039203955     3.27  .000537758     4.78  .000000903
0.30  .382088486     1.81  .035147973     3.32  .000450127     4.83  .000000705
0.35  .363169226     1.86  .031442864     3.37  .000375899     4.88  .000000550
0.40  .344578129     1.91  .028066724     3.42  .000313179     4.93  .000000428
0.45  .326355105     1.96  .024998022     3.47  .000260317     4.98  .000000332
0.50  .308537454     2.01  .022215724     3.52  .000215873     5.03  .000000258
0.55  .291159644     2.06  .019699396     3.57  .000178601     5.08  .000000199
0.60  .274253121     2.11  .017429293     3.62  .000147419     5.13  .000000154
0.65  .257846158     2.16  .015386434     3.67  .000121399     5.18  .000000118
0.70  .241963737     2.21  .013552660     3.72  .000099739     5.23  .000000091
0.75  .226627465     2.26  .011910681     3.77  .000081753     5.28  .000000070
0.80  .211855526     2.31  .010444106     3.82  .000066855     5.33  .000000053
0.85  .197662672     2.36  .009137469     3.87  .000054545     5.38  .000000041
0.90  .184060243     2.41  .007976235     3.92  .000044399     5.43  .000000031
0.95  .171056222     2.46  .006946800     3.97  .000036057     5.48  .000000024
1.00  .158655319     2.51  .006036485     4.02  .000029215     5.53  .000000018
1.05  .146859086     2.56  .005233515     4.07  .000023617     5.58  .000000014
1.10  .135666053     2.61  .004527002     4.12  .000019047     5.63  .000000010
1.15  .125071891     2.66  .003906912     4.17  .000015327     5.68  .000000008
1.20  .115069593     2.71  .003364033     4.22  .000012305     5.73  .000000006
1.25  .105649671     2.76  .002889938     4.27  .000009857     5.78  .000000004
1.30  .096800364     2.81  .002476947     4.32  .000007878     5.83  .000000003
1.35  .088507862     2.86  .002118083     4.37  .000006282     5.88  .000000003
1.40  .080756531     2.91  .001807032     4.42  .000004998     5.93  .000000002
1.45  .073529141     2.96  .001538097     4.47  .000003968     5.98  .000000001
1.50  .066807100     3.01  .001306156     4.52  .000003143     6.03  .000000001
F Distribution (α = .05)

Columns: numerator degrees of freedom. Rows: denominator degrees of freedom.

Denom              Numerator Degrees of Freedom
 DF       12     15     20     24     30     40     60    120      ∞
  1   243.90 245.90 248.00 249.10 250.10 251.10 252.20 253.30 254.30
  2    19.41  19.43  19.45  19.45  19.46  19.47  19.48  19.49  19.50
  3     8.74   8.70   8.66   8.64   8.62   8.59   8.57   8.55   8.53
  4     5.91   5.86   5.80   5.77   5.75   5.72   5.69   5.66   5.63
  5     4.68   4.62   4.56   4.53   4.50   4.46   4.43   4.40   4.36
  6     4.00   3.94   3.87   3.84   3.81   3.77   3.74   3.70   3.67
  7     3.57   3.51   3.44   3.41   3.38   3.34   3.30   3.27   3.23
  8     3.28   3.22   3.15   3.12   3.08   3.04   3.01   2.97   2.93
  9     3.07   3.01   2.94   2.90   2.86   2.83   2.79   2.75   2.71
 10     2.91   2.85   2.77   2.74   2.70   2.66   2.62   2.58   2.54
 11     2.79   2.72   2.65   2.61   2.57   2.53   2.49   2.45   2.40
 12     2.69   2.62   2.54   2.51   2.47   2.43   2.38   2.34   2.30
 13     2.60   2.53   2.46   2.42   2.38   2.34   2.30   2.25   2.21
 14     2.53   2.46   2.39   2.35   2.31   2.27   2.22   2.18   2.13
 15     2.48   2.40   2.33   2.29   2.25   2.20   2.16   2.11   2.07
 16     2.42   2.35   2.28   2.24   2.19   2.15   2.11   2.06   2.01
 17     2.38   2.31   2.23   2.19   2.15   2.10   2.06   2.01   1.96
 18     2.34   2.27   2.19   2.15   2.11   2.06   2.02   1.97   1.92
 19     2.31   2.23   2.16   2.11   2.07   2.03   1.98   1.93   1.88
 20     2.28   2.20   2.12   2.08   2.04   1.99   1.95   1.90   1.84
 21     2.25   2.18   2.10   2.05   2.01   1.96   1.92   1.87   1.81
 22     2.23   2.15   2.07   2.03   1.98   1.94   1.89   1.84   1.78
 23     2.20   2.13   2.05   2.01   1.96   1.91   1.86   1.81   1.76
 24     2.18   2.11   2.03   1.98   1.94   1.89   1.84   1.79   1.73
 25     2.16   2.09   2.01   1.96   1.92   1.87   1.82   1.77   1.71
 26     2.15   2.07   1.99   1.95   1.90   1.85   1.80   1.75   1.69
 27     2.13   2.06   1.97   1.93   1.88   1.84   1.79   1.73   1.67
 28     2.12   2.04   1.96   1.91   1.87   1.82   1.77   1.71   1.65
 29     2.10   2.03   1.94   1.90   1.85   1.81   1.75   1.70   1.64
 30     2.09   2.01   1.93   1.89   1.84   1.79   1.74   1.68   1.62
 40     2.00   1.92   1.84   1.79   1.74   1.69   1.64   1.58   1.51
 60     1.92   1.84   1.75   1.70   1.65   1.59   1.53   1.47   1.39
120     1.83   1.75   1.66   1.61   1.55   1.50   1.43   1.35   1.25
  ∞     1.75   1.67   1.57   1.52   1.46   1.39   1.32   1.22   1.00
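The same critical values can be generated directly rather than read from the table; a sketch assuming Python with scipy (not part of the original Toolkit):

```python
from scipy.stats import f

# Upper 5% critical value of F with 12 numerator and 10 denominator df
f_crit = f.ppf(1 - 0.05, dfn=12, dfd=10)
print(f"{f_crit:.2f}")  # 2.91, matching the 12-column, df = 10 row above
```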
Chi-Square Distribution

Rows: degrees of freedom (df). Columns: α (upper-tail probability).

 df    .995     .990     .975     .950     .900     .750     .500
  1  .000039  .000160  .000980  .003930  .015800  .101500  .455000
  2   0.010    0.020    0.051    0.103    0.211    0.575    1.386
  3   0.072    0.115    0.216    0.352    0.584    1.213    2.366
  4   0.207    0.297    0.484    0.711    1.064    1.923    3.357
  5   0.412    0.554    0.831    1.145    1.610    2.675    4.351
  6   0.676    0.872    1.237    1.635    2.204    3.455    5.348
  7   0.989    1.239    1.690    2.167    2.833    4.255    6.346
  8   1.344    1.646    2.180    2.733    3.490    5.071    7.344
  9   1.735    2.088    2.700    3.325    4.168    5.899    8.343
 10   2.156    2.558    3.247    3.940    4.865    6.737    9.342
 11   2.603    3.053    3.816    4.575    5.578    7.584   10.341
 12   3.074    3.571    4.404    5.226    6.304    8.438   11.340
 13   3.565    4.107    5.009    5.892    7.042    9.299   12.340
 14   4.075    4.660    5.629    6.571    7.790   10.165   13.339
 15   4.601    5.229    6.262    7.261    8.547   11.036   14.339
 16   5.142    5.812    6.908    7.962    9.312   11.912   15.338
 17   5.697    6.408    7.564    8.672   10.085   12.792   16.338
 18   6.265    7.015    8.231    9.390   10.865   13.675   17.338
 19   6.844    7.633    8.907   10.117   11.651   14.562   18.338
 20   7.434    8.260    9.591   10.851   12.443   15.452   19.337
 21   8.034    8.897   10.283   11.591   13.240   16.344   20.337
 22   8.643    9.542   10.982   12.338   14.041   17.240   21.337
 23   9.260   10.196   11.688   13.091   14.848   18.137   22.337
 24   9.886   10.856   12.401   13.848   15.659   19.037   23.337
 25  10.520   11.524   13.120   14.611   16.473   19.939   24.337
 26  11.160   12.198   13.844   15.379   17.292   20.843   25.336
 27  11.808   12.879   14.573   16.151   18.114   21.749   26.336
 28  12.461   13.565   15.308   16.928   18.939   22.657   27.336
 29  13.121   14.256   16.047   17.708   19.768   23.567   28.336
 30  13.787   14.953   16.791   18.493   20.599   24.478   29.336
7 Basic QC Tools - Ishikawa

The seven basic QC tools are the simplest, quickest tools for
structured problem solving. In many cases these tools will define
the appropriate area in which to focus to solve quality problems.
They are an integral part of the Six Sigma DMAIC process toolkit.

• Brainstorming: Allows generation of a high volume of ideas quickly.
  Generally used integrally with the advocacy team when identifying the
  potential X's.
• Pareto: Helps to define the potential vital few X's. The Pareto links
  data to problem causes and aids in making data-based decisions; see the
  sketch after this list (Page 23).
• Histogram: Displays frequency of occurrence of various categories in
  chart form; can be used as a first cut at the mean, variation, and
  distribution of data. An important part of process data analysis (Page 18).
• Cause & Effect / Fishbone Diagram: Helps identify potential problem
  causes and focus brainstorming (Page 23).
• Flowcharting / Process Mapping: Displays actual steps of the process.
  Provides a basis for examining potential areas of improvement.
• Scatter Charts: Shows the relationship between two variables (Page 18).
• Check Sheets: Capture data in a format that facilitates interpretation.

[Figures: Fishbone (Ishikawa) Diagram; Pareto Chart]
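As a quick illustration of the Pareto tool described above, the sketch below builds a Pareto chart from hypothetical defect counts (assumes Python with matplotlib; the categories and counts are made up for illustration):

```python
import matplotlib.pyplot as plt

# Hypothetical defect counts by category (illustration only)
counts = {"Scratches": 52, "Dents": 31, "Misalignment": 12, "Paint": 7, "Other": 4}

# Sort categories descending so the "vital few" appear first
items = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
labels = [k for k, _ in items]
values = [v for _, v in items]
total = sum(values)
cumulative = [sum(values[: i + 1]) / total * 100 for i in range(len(values))]

fig, ax = plt.subplots()
ax.bar(labels, values)            # defect counts, largest first
ax.set_ylabel("Defect count")

ax2 = ax.twinx()                  # cumulative-percent line on a second axis
ax2.plot(labels, cumulative, marker="o", color="black")
ax2.set_ylabel("Cumulative %")
ax2.set_ylim(0, 100)

plt.title("Pareto Chart")
plt.show()
```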
Practical Problem Statement

A major cause of futile attempts to solve a problem is a poor up-front statement of
the problem. Define the problem using available facts and the planned improvement.

1. Write an initial "as is" problem statement. This statement describes the
problem as it exists now. It is a statement of what "hurts" or what "bugs" you, and
it should contain data-based measures of the hurt. For example:
As Is: "The response time for 15% of our service calls is more than 24 hours."

2. Be sure the problem statement meets the following criteria:
• Is as specific as possible
• Contains no potential causes
• Contains no conclusions or potential solutions
• Is sufficiently narrow in scope
The most common mistake in developing a problem statement is stating the problem
at too high a level, or too broadly, for effective investigation. Use the
Structure Tree (Page 25), Pareto (Page 25) or Rolled Throughput Yield analysis
(Page 14) to break the problem down further.

3. Avoid the following in wording problem statements:

Avoid                   Ineffective Problem Statement      Effective Problem Statement
Questions               "How can we reduce the downtime    "Assembly Line downtime currently
                        on the Assembly Line?"             runs 15% of operating hours."
The word "lack"         "We lack word processing           "Material to be typed is backlogged
                        software."                         by five days."
Solution masquerading   "We need to hire another           "50% of the scheduled day's shipments
as a problem            warehouse shipping clerk."         are not being pulled on time."
Blaming people          "File Clerks aren't doing          "Files cannot be located within the
instead of processes    their jobs."                       allowed 5 minutes after requested."

4. Determine if you have identified the correct level to address the problem.
Ask: "Is my 'Y' response variable (output) defined at a level at which it can
be solved by direct interaction with its independent variables (the X's, or inputs)?"

5. Determine if correcting the "Y" response variable will result in the desired
improvement in the problem as stated.

6. Describe the "desired state," a description of what you want to achieve by solving
the problem, as objectively as possible. As with the "as is" statement, be sure the
"desired state" is in measurable, observable terms. For example:
Desired State: "The response time for all our service calls is less than 24 hours."
Chi-Square Distribution (continued)

Rows: degrees of freedom (df). Columns: α (upper-tail probability).

 df    .250     .100     .050     .025     .010     .005     .001
  1   1.323    2.706    3.841    5.024    6.635    7.879   10.828
  2   2.773    4.605    5.991    7.378    9.210   10.597   13.816
  3   4.108    6.251    7.815    9.348   11.345   12.838   16.266
  4   5.385    7.779    9.488   11.143   13.277   14.860   18.467
  5   6.626    9.236   11.070   12.832   15.086   16.750   20.515
  6   7.841   10.645   12.592   14.449   16.812   18.548   22.458
  7   9.037   12.017   14.067   16.013   18.475   20.278   24.322
  8  10.219   13.362   15.507   17.535   20.090   21.955   26.125
  9  11.389   14.684   16.919   19.023   21.666   23.589   27.877
 10  12.549   15.987   18.307   20.483   23.209   25.188   29.588
 11  13.701   17.275   19.675   21.920   24.725   26.757   31.264
 12  14.845   18.549   21.026   23.337   26.217   28.300   32.909
 13  15.984   19.812   22.362   24.736   27.688   29.819   34.528
 14  17.117   21.064   23.685   26.119   29.141   31.319   36.123
 15  18.245   22.307   24.996   27.488   30.578   32.801   37.697
 16  19.369   23.542   26.296   28.845   32.000   34.267   39.252
 17  20.489   24.769   27.587   30.191   33.409   35.718   40.790
 18  21.605   25.989   28.869   31.526   34.805   37.156   42.312
 19  22.718   27.204   30.144   32.852   36.191   38.582   43.820
 20  23.828   28.412   31.410   34.170   37.566   39.997   45.315
 21  24.935   29.615   32.671   35.479   38.932   41.401   46.797
 22  26.039   30.813   33.924   36.781   40.289   42.796   48.268
 23  27.141   32.007   35.172   38.076   41.638   44.181   49.728
 24  28.241   33.196   36.415   39.364   42.980   45.558   51.179
 25  29.339   34.382   37.652   40.646   44.314   46.928   52.620
 26  30.434   35.563   38.885   41.923   45.642   48.290   54.052
 27  31.528   36.741   40.113   43.194   46.963   49.645   55.476
 28  32.620   37.916   41.337   44.461   48.278   50.993   56.892
 29  33.711   39.087   42.557   45.722   49.588   52.336   58.302
 30  34.800   40.256   43.773   46.979   50.892   53.672   59.703
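Critical values from either half of this table can also be computed directly; a sketch assuming Python with scipy (not part of the original Toolkit):

```python
from scipy.stats import chi2

# Upper-tail critical value for alpha = .05 with df = 1
# (the degrees of freedom of a 2x2 contingency table)
chi2_crit = chi2.isf(0.05, df=1)
print(f"{chi2_crit:.3f}")  # 3.841, matching the df = 1, alpha = .05 entry
```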
Normal Distribution

Tail area to the right of Z. Rows give Z to one decimal; columns add the second decimal.

 Z     0.00      0.01      0.02      0.03      0.04      0.05      0.06      0.07      0.08      0.09
0.0  5.00E-01  4.96E-01  4.92E-01  4.88E-01  4.84E-01  4.80E-01  4.76E-01  4.72E-01  4.68E-01  4.64E-01
0.1  4.60E-01  4.56E-01  4.52E-01  4.48E-01  4.44E-01  4.40E-01  4.36E-01  4.33E-01  4.29E-01  4.25E-01
0.2  4.21E-01  4.17E-01  4.13E-01  4.09E-01  4.05E-01  4.01E-01  3.97E-01  3.94E-01  3.90E-01  3.86E-01
0.3  3.82E-01  3.78E-01  3.75E-01  3.71E-01  3.67E-01  3.63E-01  3.59E-01  3.56E-01  3.52E-01  3.48E-01
0.4  3.45E-01  3.41E-01  3.37E-01  3.34E-01  3.30E-01  3.26E-01  3.23E-01  3.19E-01  3.16E-01  3.12E-01
0.5  3.09E-01  3.05E-01  3.02E-01  2.98E-01  2.95E-01  2.91E-01  2.88E-01  2.84E-01  2.81E-01  2.78E-01
0.6  2.74E-01  2.71E-01  2.68E-01  2.64E-01  2.61E-01  2.58E-01  2.55E-01  2.51E-01  2.48E-01  2.45E-01
0.7  2.42E-01  2.39E-01  2.36E-01  2.33E-01  2.30E-01  2.27E-01  2.24E-01  2.21E-01  2.18E-01  2.15E-01
0.8  2.12E-01  2.09E-01  2.06E-01  2.03E-01  2.01E-01  1.98E-01  1.95E-01  1.92E-01  1.89E-01  1.87E-01
0.9  1.84E-01  1.81E-01  1.79E-01  1.76E-01  1.74E-01  1.71E-01  1.69E-01  1.66E-01  1.64E-01  1.61E-01
1.0  1.59E-01  1.56E-01  1.54E-01  1.52E-01  1.49E-01  1.47E-01  1.45E-01  1.42E-01  1.40E-01  1.38E-01
1.1  1.36E-01  1.34E-01  1.31E-01  1.29E-01  1.27E-01  1.25E-01  1.23E-01  1.21E-01  1.19E-01  1.17E-01
1.2  1.15E-01  1.13E-01  1.11E-01  1.09E-01  1.08E-01  1.06E-01  1.04E-01  1.02E-01  1.00E-01  9.85E-02
1.3  9.68E-02  9.51E-02  9.34E-02  9.18E-02  9.01E-02  8.85E-02  8.69E-02  8.53E-02  8.38E-02  8.23E-02
1.4  8.08E-02  7.93E-02  7.78E-02  7.64E-02  7.49E-02  7.35E-02  7.21E-02  7.08E-02  6.94E-02  6.81E-02
1.5  6.68E-02  6.55E-02  6.43E-02  6.30E-02  6.18E-02  6.06E-02  5.94E-02  5.82E-02  5.71E-02  5.59E-02
1.6  5.48E-02  5.37E-02  5.26E-02  5.16E-02  5.05E-02  4.95E-02  4.85E-02  4.75E-02  4.65E-02  4.55E-02
1.7  4.46E-02  4.36E-02  4.27E-02  4.18E-02  4.09E-02  4.01E-02  3.92E-02  3.84E-02  3.75E-02  3.67E-02
1.8  3.59E-02  3.52E-02  3.44E-02  3.36E-02  3.29E-02  3.22E-02  3.14E-02  3.07E-02  3.01E-02  2.94E-02
1.9  2.87E-02  2.81E-02  2.74E-02  2.68E-02  2.62E-02  2.56E-02  2.50E-02  2.44E-02  2.39E-02  2.33E-02
2.0  2.28E-02  2.22E-02  2.17E-02  2.12E-02  2.07E-02  2.02E-02  1.97E-02  1.92E-02  1.88E-02  1.83E-02
2.1  1.79E-02  1.74E-02  1.70E-02  1.66E-02  1.62E-02  1.58E-02  1.54E-02  1.50E-02  1.46E-02  1.43E-02
2.2  1.39E-02  1.36E-02  1.32E-02  1.29E-02  1.26E-02  1.22E-02  1.19E-02  1.16E-02  1.13E-02  1.10E-02
2.3  1.07E-02  1.04E-02  1.02E-02  9.90E-03  9.64E-03  9.39E-03  9.14E-03  8.89E-03  8.66E-03  8.42E-03
2.4  8.20E-03  7.98E-03  7.76E-03  7.55E-03  7.34E-03  7.14E-03  6.95E-03  6.76E-03  6.57E-03  6.39E-03
2.5  6.21E-03  6.04E-03  5.87E-03  5.70E-03  5.54E-03  5.39E-03  5.23E-03  5.09E-03  4.94E-03  4.80E-03
2.6  4.66E-03  4.53E-03  4.40E-03  4.27E-03  4.15E-03  4.02E-03  3.91E-03  3.79E-03  3.68E-03  3.57E-03
2.7  3.47E-03  3.36E-03  3.26E-03  3.17E-03  3.07E-03  2.98E-03  2.89E-03  2.80E-03  2.72E-03  2.64E-03
2.8  2.56E-03  2.48E-03  2.40E-03  2.33E-03  2.26E-03  2.19E-03  2.12E-03  2.05E-03  1.99E-03  1.93E-03
2.9  1.87E-03  1.81E-03  1.75E-03  1.70E-03  1.64E-03  1.59E-03  1.54E-03  1.49E-03  1.44E-03  1.40E-03
3.0  1.35E-03  1.31E-03  1.26E-03  1.22E-03  1.18E-03  1.14E-03  1.11E-03  1.07E-03  1.04E-03  1.00E-03
3.1  9.68E-04  9.35E-04  9.04E-04  8.74E-04  8.45E-04  8.16E-04  7.89E-04  7.62E-04  7.36E-04  7.11E-04
3.2  6.87E-04  6.64E-04  6.41E-04  6.19E-04  5.98E-04  5.77E-04  5.57E-04  5.38E-04  5.19E-04  5.01E-04
3.3  4.84E-04  4.67E-04  4.50E-04  4.34E-04  4.19E-04  4.04E-04  3.90E-04  3.76E-04  3.63E-04  3.50E-04
3.4  3.37E-04  3.25E-04  3.13E-04  3.02E-04  2.91E-04  2.80E-04  2.70E-04  2.60E-04  2.51E-04  2.42E-04
3.5  2.33E-04  2.24E-04  2.16E-04  2.08E-04  2.00E-04  1.93E-04  1.86E-04  1.79E-04  1.72E-04  1.66E-04
3.6  1.59E-04  1.53E-04  1.47E-04  1.42E-04  1.36E-04  1.31E-04  1.26E-04  1.21E-04  1.17E-04  1.12E-04
3.7  1.08E-04  1.04E-04  9.97E-05  9.59E-05  9.21E-05  8.86E-05  8.51E-05  8.18E-05  7.85E-05  7.55E-05
3.8  7.25E-05  6.96E-05  6.69E-05  6.42E-05  6.17E-05  5.92E-05  5.68E-05  5.46E-05  5.24E-05  5.03E-05
3.9  4.82E-05  4.63E-05  4.44E-05  4.26E-05  4.09E-05  3.92E-05  3.76E-05  3.61E-05  3.46E-05  3.32E-05
4.0  3.18E-05  3.05E-05  2.92E-05  2.80E-05  2.68E-05  2.57E-05  2.47E-05  2.36E-05  2.26E-05  2.17E-05
4.1  2.08E-05  1.99E-05  1.91E-05  1.82E-05  1.75E-05  1.67E-05  1.60E-05  1.53E-05  1.47E-05  1.40E-05
4.2  1.34E-05  1.29E-05  1.23E-05  1.18E-05  1.13E-05  1.08E-05  1.03E-05  9.86E-06  9.43E-06  9.01E-06
4.3  8.62E-06  8.24E-06  7.88E-06  7.53E-06  7.20E-06  6.88E-06  6.57E-06  6.28E-06  6.00E-06  5.73E-06
4.4  5.48E-06  5.23E-06  5.00E-06  4.77E-06  4.56E-06  4.35E-06  4.16E-06  3.97E-06  3.79E-06  3.62E-06
4.5  3.45E-06  3.29E-06  3.14E-06  3.00E-06  2.86E-06  2.73E-06  2.60E-06  2.48E-06  2.37E-06  2.26E-06
4.6  2.15E-06  2.05E-06  1.96E-06  1.87E-06  1.78E-06  1.70E-06  1.62E-06  1.54E-06  1.47E-06  1.40E-06
4.7  1.33E-06  1.27E-06  1.21E-06  1.15E-06  1.10E-06  1.05E-06  9.96E-07  9.48E-07  9.03E-07  8.59E-07
4.8  8.18E-07  7.79E-07  7.41E-07  7.05E-07  6.71E-07  6.39E-07  6.08E-07  5.78E-07  5.50E-07  5.23E-07
4.9  4.98E-07  4.73E-07  4.50E-07  4.28E-07  4.07E-07  3.87E-07  3.68E-07  3.50E-07  3.32E-07  3.16E-07
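This table connects directly to the long-term "PPM" language used with Z.LT and Z.SHIFT earlier. A minimal sketch (assuming Python with scipy, and using the conventional 1.5-sigma shift from general Six Sigma practice rather than a value stated in this Toolkit):

```python
from scipy.stats import norm

# A 6-sigma short-term process degraded by the conventional 1.5 Z.SHIFT
z_lt = 6.0 - 1.5
ppm = norm.sf(z_lt) * 1e6   # long-term defects per million
print(f"{ppm:.1f} PPM")     # ~3.4 PPM; compare the Z = 4.5 entry above
```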
Defining a Six Sigma Project

SPECIFIC
The issue is clearly defined to the lowest level of cause and effect. The project
should have a 'response variable' (Y) with specifications and constraints (i.e.,
cycle time for returned parts, washer base width). It should be bound by clearly
defined goals. If it looks big, it is: a poorly defined project will require
greater scoping time and will take longer to complete than one that is clearly
defined.

CUSTOMER FOCUSED
The Project Y should be clearly linked to a specific customer want or need, and
can result in improved customer perception or consumer satisfaction (Customer
WOW): on-time delivery, billing accuracy, call answer rate.

VALUE-ADDED
Financially justifiable; directly impacts a business metric that returns value:
PPM, reliability, yield, pricing errors, field returns, factory yield, overtime,
transportation, warehousing, availability, SCR, rework, under-billing and scrap.

MEASURABLE
The 'response variable' (Y) must have reasonable historical DATA, or you must
have the ability to capture a reliable data stream. Having a method for measuring
the vital X's is also essential for in-depth process analysis with data. Discrete
data can be effectively used for problem investigation, but 'variable' (continuous)
data is better. Projects based on unreliable data have unreliable results.

LOCALLY ACTIONABLE
The selected project should be one which can be addressed by the accepted "local"
organization. Adequate support is needed to ensure successful project completion
and permanent change to the process. It is difficult to "manage improvements in
Louisville from the field."

A well defined problem is the first step in a successful project!
Six Sigma Problem Solving Processes

Step  Description                             Focus  Tools                                    Deliverables
Define
 A    Identify Project CTQs                   Y      VOC; Process Map; CAP                    Project CTQs (1)
 B    Develop Team Charter                           Project CAP                              Approved Charter (2)
 C    Define Process Map                             Y=f(x) Process Map                       High Level Process Map (3)
Measure
 1    Select CTQ Characteristics              Y      VOC; QFD; FMEA                           Project Y (4)
 2    Define Performance Standards            Y      VOC, Blueprints                          Performance Standard for Project Y (5)
 3    Measurement System Analysis             Y&X    Continuous Gage R&R; Test/Retest,        Data Collection Plan & MSA (6),
                                                     Attribute R&R                            Data for Project Y (7)
Analyze
 4    Establish Process Capability            Y      Capability Indices                       Process Capability for Project Y (8)
 5    Define Performance Objectives           Y      Team, Benchmarking                       Improvement Goal for Project Y (9)
 6    Identify Variation Sources              X      Process Analysis, Graphical Analysis,    Prioritized List of all Xs (10)
                                                     Hypothesis Tests
Improve
 7    Screen Potential Causes                 X      DOE-Screening                            List of Vital Few Xs (11)
 8    Discover Variable Relationships         X      Factorial Designs                        Proposed Solution (13)
 9    Establish Operating Tolerances          X      Simulation                               Piloted Solution (14)
Control
 10   Define & Validate Measurement System    X, Y   Continuous Gage R&R, Test/Retest,        MSA
      on X's in Actual Application                   Attribute R&R
 11   Determine New Process Capability        X, Y   Capability Indices                       Process Capability Y, X
 12   Implement Process Control               X      Control Charts; Mistake Proofing;        Sustained Solution (15),
                                                     FMEA                                     Project Documentation (16)
Normal Distribution (continued)

  Z    0.00      0.01      0.02      0.03      0.04      0.05      0.06      0.07      0.08      0.09
 5.0  3.00E-07  2.85E-07  2.71E-07  2.58E-07  2.45E-07  2.32E-07  2.21E-07  2.10E-07  1.99E-07  1.89E-07
 5.1  1.80E-07  1.71E-07  1.62E-07  1.54E-07  1.46E-07  1.39E-07  1.31E-07  1.25E-07  1.18E-07  1.12E-07
 5.2  1.07E-07  1.01E-07  9.59E-08  9.10E-08  8.63E-08  8.18E-08  7.76E-08  7.36E-08  6.98E-08  6.62E-08
 5.3  6.27E-08  5.95E-08  5.64E-08  5.34E-08  5.06E-08  4.80E-08  4.55E-08  4.31E-08  4.08E-08  3.87E-08
 5.4  3.66E-08  3.47E-08  3.29E-08  3.11E-08  2.95E-08  2.79E-08  2.64E-08  2.50E-08  2.37E-08  2.24E-08
 5.5  2.12E-08  2.01E-08  1.90E-08  1.80E-08  1.70E-08  1.61E-08  1.53E-08  1.44E-08  1.37E-08  1.29E-08
 5.6  1.22E-08  1.16E-08  1.09E-08  1.03E-08  9.78E-09  9.24E-09  8.74E-09  8.26E-09  7.81E-09  7.39E-09
 5.7  6.98E-09  6.60E-09  6.24E-09  5.89E-09  5.57E-09  5.26E-09  4.97E-09  4.70E-09  4.44E-09  4.19E-09
 5.8  3.96E-09  3.74E-09  3.53E-09  3.34E-09  3.15E-09  2.97E-09  2.81E-09  2.65E-09  2.50E-09  2.36E-09
 5.9  2.23E-09  2.11E-09  1.99E-09  1.88E-09  1.77E-09  1.67E-09  1.58E-09  1.49E-09  1.40E-09  1.32E-09
 6.0  1.25E-09  1.18E-09  1.11E-09  1.05E-09  9.88E-10  9.31E-10  8.78E-10  8.28E-10  7.81E-10  7.36E-10
 6.1  6.94E-10  6.54E-10  6.17E-10  5.81E-10  5.48E-10  5.16E-10  4.87E-10  4.59E-10  4.32E-10  4.07E-10
 6.2  3.84E-10  3.61E-10  3.40E-10  3.21E-10  3.02E-10  2.84E-10  2.68E-10  2.52E-10  2.38E-10  2.24E-10
 6.3  2.11E-10  1.98E-10  1.87E-10  1.76E-10  1.66E-10  1.56E-10  1.47E-10  1.38E-10  1.30E-10  1.22E-10
 6.4  1.15E-10  1.08E-10  1.02E-10  9.59E-11  9.02E-11  8.49E-11  7.98E-11  7.51E-11  7.06E-11  6.65E-11
 6.5  6.25E-11  5.88E-11  5.53E-11  5.20E-11  4.89E-11  4.60E-11  4.32E-11  4.07E-11  3.82E-11  3.59E-11
 6.6  3.38E-11  3.18E-11  2.98E-11  2.81E-11  2.64E-11  2.48E-11  2.33E-11  2.19E-11  2.06E-11  1.93E-11
 6.7  1.82E-11  1.71E-11  1.60E-11  1.51E-11  1.42E-11  1.33E-11  1.25E-11  1.17E-11  1.10E-11  1.04E-11
 6.8  9.72E-12  9.13E-12  8.57E-12  8.05E-12  7.56E-12  7.10E-12  6.66E-12  6.26E-12  5.87E-12  5.52E-12
 6.9  5.18E-12  4.86E-12  4.56E-12  4.28E-12  4.02E-12  3.77E-12  3.54E-12  3.32E-12  3.12E-12  2.93E-12
 7.0  2.75E-12  2.58E-12  2.42E-12  2.27E-12  2.13E-12  2.00E-12  1.87E-12  1.76E-12  1.65E-12  1.55E-12
 7.1  1.45E-12  1.36E-12  1.28E-12  1.20E-12  1.12E-12  1.05E-12  9.88E-13  9.26E-13  8.69E-13  8.15E-13
 7.2  7.64E-13  7.16E-13  6.72E-13  6.30E-13  5.90E-13  5.54E-13  5.19E-13  4.86E-13  4.56E-13  4.28E-13
 7.3  4.01E-13  3.76E-13  3.52E-13  3.30E-13  3.09E-13  2.90E-13  2.72E-13  2.55E-13  2.39E-13  2.24E-13
 7.4  2.10E-13  1.96E-13  1.84E-13  1.72E-13  1.62E-13  1.51E-13  1.42E-13  1.33E-13  1.24E-13  1.17E-13
 7.5  1.09E-13  1.02E-13  9.58E-14  8.98E-14  8.41E-14  7.87E-14  7.38E-14  6.91E-14  6.47E-14  6.06E-14
 7.6  5.68E-14  5.32E-14  4.98E-14  4.66E-14  4.37E-14  4.09E-14  3.83E-14  3.58E-14  3.36E-14  3.14E-14
 7.7  2.94E-14  2.76E-14  2.58E-14  2.42E-14  2.26E-14  2.12E-14  1.98E-14  1.86E-14  1.74E-14  1.63E-14
 7.8  1.52E-14  1.42E-14  1.33E-14  1.25E-14  1.17E-14  1.09E-14  1.02E-14  9.58E-15  8.97E-15  8.39E-15
 7.9  7.85E-15  7.35E-15  6.88E-15  6.44E-15  6.02E-15  5.64E-15  5.28E-15  4.94E-15  4.62E-15  4.32E-15
 8.0  4.05E-15  3.79E-15  3.54E-15  3.31E-15  3.10E-15  2.90E-15  2.72E-15  2.54E-15  2.38E-15  2.22E-15
 8.1  2.08E-15  1.95E-15  1.82E-15  1.70E-15  1.59E-15  1.49E-15  1.40E-15  1.31E-15  1.22E-15  1.14E-15
 8.2  1.07E-15  9.99E-16  9.35E-16  8.74E-16  8.18E-16  7.65E-16  7.16E-16  6.69E-16  6.26E-16  5.86E-16
 8.3  5.48E-16  5.12E-16  4.79E-16  4.48E-16  4.19E-16  3.92E-16  3.67E-16  3.43E-16  3.21E-16  3.00E-16
 8.4  2.81E-16  2.62E-16  2.45E-16  2.30E-16  2.15E-16  2.01E-16  1.88E-16  1.76E-16  1.64E-16  1.54E-16
 8.5  1.44E-16  1.34E-16  1.26E-16  1.17E-16  1.10E-16  1.03E-16  9.60E-17  8.98E-17  8.40E-17  7.85E-17
 8.6  7.34E-17  6.87E-17  6.42E-17  6.00E-17  5.61E-17  5.25E-17  4.91E-17  4.59E-17  4.29E-17  4.01E-17
 8.7  3.75E-17  3.51E-17  3.28E-17  3.07E-17  2.87E-17  2.68E-17  2.51E-17  2.35E-17  2.19E-17  2.05E-17
 8.8  1.92E-17  1.79E-17  1.68E-17  1.57E-17  1.47E-17  1.37E-17  1.28E-17  1.20E-17  1.12E-17  1.05E-17
 8.9  9.79E-18  9.16E-18  8.56E-18  8.00E-18  7.48E-18  7.00E-18  6.54E-18  6.12E-18  5.72E-18  5.35E-18
 9.0  5.00E-18  4.68E-18  4.37E-18  4.09E-18  3.82E-18  3.57E-18  3.34E-18  3.13E-18  2.92E-18  2.73E-18
 9.1  2.56E-18  2.39E-18  2.23E-18  2.09E-18  1.95E-18  1.83E-18  1.71E-18  1.60E-18  1.49E-18  1.40E-18
 9.2  1.31E-18  1.22E-18  1.14E-18  1.07E-18  9.98E-19  9.33E-19  8.73E-19  8.16E-19  7.63E-19  7.14E-19
 9.3  6.67E-19  6.24E-19  5.83E-19  5.46E-19  5.10E-19  4.77E-19  4.46E-19  4.17E-19  3.90E-19  3.65E-19
 9.4  3.41E-19  3.19E-19  2.98E-19  2.79E-19  2.61E-19  2.44E-19  2.28E-19  2.14E-19  2.00E-19  1.87E-19
 9.5  1.75E-19  1.63E-19  1.53E-19  1.43E-19  1.34E-19  1.25E-19  1.17E-19  1.09E-19  1.02E-19  9.56E-20
 9.6  8.94E-20  8.37E-20  7.82E-20  7.32E-20  6.85E-20  6.40E-20  5.99E-20  5.60E-20  5.24E-20  4.90E-20
 9.7  4.58E-20  4.29E-20  4.01E-20  3.75E-20  3.51E-20  3.28E-20  3.07E-20  2.87E-20  2.69E-20  2.52E-20
 9.8  2.35E-20  2.20E-20  2.06E-20  1.93E-20  1.80E-20  1.69E-20  1.58E-20  1.48E-20  1.38E-20  1.29E-20
 9.9  1.21E-20  1.13E-20  1.06E-20  9.90E-21  9.26E-21  8.67E-21  8.11E-21  7.59E-21  7.10E-21  6.64E-21
10.0  6.22E-21  5.82E-21  5.44E-21  5.09E-21  4.77E-21  4.46E-21  4.17E-21  3.91E-21  3.66E-21  3.42E-21
t-Distribution

Entries are t values; column headings give the cumulative probability 1 − α (upper-tail probability α).

 df    .600   .700   .800   .900   .950    .975    .990    .995
  1   0.325  0.727  1.376  3.078  6.314  12.706  31.821  63.657
  2   0.289  0.617  1.061  1.886  2.920   4.303   6.965   9.925
  3   0.277  0.584  0.978  1.638  2.353   3.182   4.541   5.841
  4   0.271  0.569  0.941  1.533  2.132   2.776   3.747   4.604
  5   0.267  0.559  0.920  1.476  2.015   2.571   3.365   4.032
  6   0.265  0.553  0.906  1.440  1.943   2.447   3.143   3.707
  7   0.263  0.549  0.896  1.415  1.895   2.365   2.998   3.499
  8   0.262  0.546  0.889  1.397  1.860   2.306   2.896   3.355
  9   0.261  0.543  0.883  1.383  1.833   2.262   2.821   3.250
 10   0.260  0.542  0.879  1.372  1.812   2.228   2.764   3.169
 11   0.260  0.540  0.876  1.363  1.796   2.201   2.718   3.106
 12   0.259  0.539  0.873  1.356  1.782   2.179   2.681   3.055
 13   0.259  0.538  0.870  1.350  1.771   2.160   2.650   3.012
 14   0.258  0.537  0.868  1.345  1.761   2.145   2.624   2.977
 15   0.258  0.536  0.866  1.341  1.753   2.131   2.602   2.947
 16   0.258  0.535  0.865  1.337  1.746   2.120   2.583   2.921
 17   0.257  0.534  0.863  1.333  1.740   2.110   2.567   2.898
 18   0.257  0.534  0.862  1.330  1.734   2.101   2.552   2.878
 19   0.257  0.533  0.861  1.328  1.729   2.093   2.539   2.861
 20   0.257  0.533  0.860  1.325  1.725   2.086   2.528   2.845
 21   0.257  0.532  0.859  1.323  1.721   2.080   2.518   2.831
 22   0.256  0.532  0.858  1.321  1.717   2.074   2.508   2.819
 23   0.256  0.532  0.858  1.319  1.714   2.069   2.500   2.807
 24   0.256  0.531  0.857  1.318  1.711   2.064   2.492   2.797
 25   0.256  0.531  0.856  1.316  1.708   2.060   2.485   2.787
 26   0.256  0.531  0.856  1.315  1.706   2.056   2.479   2.779
 27   0.256  0.531  0.855  1.314  1.703   2.052   2.473   2.771
 28   0.256  0.530  0.855  1.313  1.701   2.048   2.467   2.763
 29   0.256  0.530  0.854  1.311  1.699   2.045   2.462   2.756
 30   0.256  0.530  0.854  1.310  1.697   2.042   2.457   2.750
 40   0.255  0.529  0.851  1.303  1.684   2.021   2.423   2.704
 60   0.254  0.527  0.848  1.296  1.671   2.000   2.390   2.660
120   0.254  0.526  0.845  1.289  1.658   1.980   2.358   2.617
  ∞   0.253  0.524  0.842  1.282  1.645   1.960   2.326   2.576
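As with the other tables, these t values can be generated directly; a sketch assuming Python with scipy (not part of the original Toolkit):

```python
from scipy.stats import t

# Two-sided 95% confidence uses the .975 column; e.g. df = 10
t_crit = t.ppf(0.975, df=10)
print(f"{t_crit:.3f}")  # 2.228, matching the df = 10, .975 entry above
```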
Six Sigma Toolkit - Index

Analysis and Improve Tools Selection Matrix · · · 26
ANOVA
   ANOVA / ANOVA One Way · · · 41
   ANOVA Two Way · · · 42
   ANOVA - Balanced · · · 43
   Interpreting the ANOVA Output · · · 44
Calculating Sample Size (Equation for manual Calculation) · · · 28
Characterizing the Process - Rational Subgrouping · · · 16
Control Chart Constants · · · 59
Control Charts · · · 57-58
Data Validity Studies/% Agreement on Binary (Pass / Fail) Data · · · 10
Defining a Six Sigma Project · · · 4
Definition of Z · · · 8
Design for Six Sigma
   Loop Diagrams · · · 50
   Tolerancing Analysis · · · 51-52
Discrete Data Analysis · · · 35
DOE
   Design of Experiments · · · 53
   Factorial Designs · · · 54
   DOE Analysis · · · 55
DPU / DPO · · · 13
Gage R & R · · · 11-12
General Linear Model · · · 45
Hypothesis Statements · · · 29
Hypothesis Testing · · · 30-31
Minitab Graphics
   Histogram / Scatter Plot · · · 20
   Descriptive Statistics / Normal Plot · · · 21
   One Variable Regression / Residual Plots · · · 22
   Boxplot / Interval Plot · · · 23
   Time Series Plot / Box-Cox Transformation · · · 24
   Pareto Diagrams / Cause & Effect Diagrams · · · 25
Normal Approximation · · · 36
χ2 Test (Test for Independence) · · · 37-38
Poisson Approximation · · · 39
Normality of Data · · · 17
Planning Questions · · · 19
Practical Problem Statement · · · 5
Precontrol · · · 60
Project Closure · · · 61
Regression Analysis
   Regression · · · 46
   Stepwise Regression · · · 47
   Regression with Curves (Quadratic) and Interactions · · · 48
   Binary Logistic Regression · · · 49
Response Surface - CCD · · · 56
Rolled Throughput Yield · · · 14
Sample Size Determination · · · 27
Seven Basic Tools · · · 6
Six Sigma Problem Solving Processes · · · 3
Six Sigma Process Report · · · 18
Six Sigma Product Report · · · 15
Stable Ops and 6 Sigma · · · 9
t Test (Testing Means) (1 Sample t; 2 Sample t; Confidence Intervals) · · · 33-34
Tables
   Determining Sample Size · · · 62
   F Test · · · 63-64
   χ2 Test · · · 65-66
   Normal Distribution · · · 67-68
   t Test · · · 69
Testing Equality of Variance (F test; Homogeneity of Variance) · · · 32
The Normal Curve · · · 7
The Transfer Function · · · 40
The Toolkit - A Six Sigma Resource
The material in this Toolkit is a combination of material
developed by the GEA Master Black Belts and Dr. Mikel Harry
(The Six Sigma Academy, Inc.). Worksheets, statistical tables
and graphics are outputs of MINITAB for Windows Version
12.2, Copyright 1998, Minitab, Inc. It is intended for use as a
quick reference for trained Black Belts and Green Belts.
More detailed information is available from the Quality Coach
Website, SSQC.ge.com.
If you need more GEA Six Sigma Information, visit the GE
Appliances Six Sigma Website at
http://genet.appl.ge.com/sixsigma
For information on GE Corporate Certification Testing, go to
the Green Belt Training Site via the GE Appliances Six Sigma
Website.
For information about other GE Appliances Six Sigma
Training, contact a member of the GEA Six Sigma Training
Team:
• Jeff Keller - Ext 7649
Email: [email protected]
• Irene Ligon - Ext 4562
Email: [email protected]
• Broadcast Group eMail:
[email protected]
GE Appliances Copyright 2001
Revision 4.5 - September 2001
GLOSSARY OF SIX SIGMA TERMS

1. α - Alpha risk - Probability of falsely accepting the alternative (HA) of
difference
2. ANOVA - Analysis of Variance
3. β - Beta risk - Probability of falsely accepting the null hypothesis (H0)
of no difference
4. χ2 - Tests for an independent relationship between two discrete variables
5. δ - Difference between two means
6. DOE - Design of Experiments
7. DPU - Defects per unit
8. e^(-DPU) - Rolled throughput yield
9. F-Test - Used to compare the variances of two distributions
10. g - Number of subgroups
11. FIT - The point estimate of the mean response for each level of the
independent variable
12. H0 - Null hypothesis
13. HA - Alternative hypothesis
14. LSL - Lower spec limit
15. µ - Population mean
16. µ̂ - Sample mean
17. n - Number of samples in a subgroup
18. N - Number in the total population
19. P Value - If the calculated value of p is lower than the alpha (α) risk,
then reject the null hypothesis and conclude that there is a difference.
Often referred to as the "observed level of significance."
20. Residual - The difference between the observed values and the Fit;
the error in the model
21. σ - Population standard deviation
22. Σ - Summation
23. σ̂ (s) - Sample standard deviation
24. Stratify - Divide or arrange data in organized classes or segments,
based on known characteristics or factors
25. SS - Sum of squares
26. t-Test - Used to compare the means of two distributions
27. Transfer Function - Prediction Equation - Y = f(x)
28. USL - Upper spec limit
29. X̄ - Mean
30. X̿ - Mean of the means
31. Z - Transforms a set of data such that µ = 0 and σ = 1
32. ZLT - Z long term
33. ZST - Z short term
34. ZSHIFT - ZST − ZLT
GE Appliances
Six Sigma Toolkit
Rev 4.5 9/2001
GE Appliances Proprietary