Discrete Data Analysis The table below summarizes the methods used to analyze discrete data. Comparing 2 Proportions 1 Proportion Normal Approximation ˆ ˆ pˆ ± Z × p(1 − p) (p ˆ− 1 (and 2 way tables) n Poisson Confidence Interval Use this approximation when the sample size is large and the number of defects in the sample is greater than 10 (np>10), and the number of good parts in the sample is greater than 10 (n(1-p)>10). pˆ = # Defects Z × pˆ (1 − pˆ ) × 1 + 1 n n Exact Binomial Test Sample Size A two sided confidence interval for the proportion (p) that are defective in the population is given by the equation: ˆ ˆ pˆ ± Z × p (1 − p ) 2 n The resultant will provide the lower and upper limits of a range of all plausible values for the proportion of defects of that population. 2 Comparing Two Proportions If evaluating two different sample sets with proportion defective data, the confidence interval for the difference in proportion defectives between the two sample sets is given by: χ (Chi-square) 2 • Large n (sample size) • p not too close to 0 or 1 • np>10 and n(1-p)>10 • Large n (sample size) • small proportion defective (p<0.10) z is a value from the normal distribution, for the required level of confidence. Z 1.282 1.645 1.960 2.326 2.576 35 One Proportion pˆ ) ± 1 More than 2 Proportions Poisson Approximation Normal Approximation 2-Sided Confidence Level 80.00% 90.00% 95.00% 98.00% 99.00% 1 - Sided Confidence Level 90.00% 95.00% 97.50% 99.00% 99.50% ( pˆ1 − pˆ 2 ) ± Z × pˆ (1 − pˆ ) × 1 n + 1 1 n 2 Where: If Ki = # of defects in the ith sample & ni = sample size of the ith sample k pˆ 1 = 1 n1 k pˆ 2 = 2 n2 (k + k ) pˆ = 1 2 ( n1 + n2 ) The resultant will provide the lower and upper limits of a range of all plausible values of the difference between the proportion defective in the populations. If “0” is included within the range of plausible values, then there is not strong evidence that the proportions of defects in the two populations are different. 36 χ2 -- Test • Using the 1 Sample t test • Run Stat > Basic Statistics > 1-Sample t. In the dialog box, identify the variable or variables to be tested. for Independence 2 The χ Test for independent relationship tests the null hypothesis (Ho,) that two discrete variables are independent. • Data “relating” two discrete variables are used to create a contingency table. For each of the cells in the contingency table the “observed frequency” is compared to the “expected frequency” in order to test for independence. • The expected frequencies in each cell must be at least five (5) for the χ test to be valid. • For continuous data, it is best to test for dependency, or correlation, by using scatter plots and regression analysis. 2 Example of χ Analysis 2 1. 2. There are 2 variables to be studied, height and weight. The null hypothesis Ho is that “weight” is independent of “height.” For each variable 2 conditions (categories) are defined. 3. The data has been accumulated as shown below: Height below 5’6” 20 11 31 Height above 5’6” 13 22 35 Totals Rows 33 33 N=66 Manual χ 2 1. 2. 3. 4. 5. 6. Compute fexp for each cell ij: fexp ij = (Row Total)i x (Column Total)j / N. N is total of all fobs for all 4 cells. (For our example, N = 66 and fexp1,2 = (33*35)/66 = 1155 / 66 = 17.5). 2 2 2 Calculate χ calc, where χ calc = Σ[[(fobs - fexp) / fexp ] = 4.927 Calculate the degrees of freedom df = (Number of Rows - 1)(Number of Columns -1). For our example, df = (2-1) * (2-1) = 1. 2 2 Determine χ crit from the χ table for the degrees of freedom and confidence 2 level desired (usually 5% risk). For 1 df and 5% α risk, χ crit = 3.841 2 2 If χ calc> χ crit, then reject Ho and accept Ha, i.e. that weight depends on height. In this example, we reject Ho. Using Minitab to perform χ Analysis • If “Test Mean” is selected, identify the desired mean to be tested, the mean of the null hypothesis and in the alternative box,select the alternative hypothesis which is appropriate for the analysis. This will determine the test used for the analysis (one tailed or two tailed). • If graphic output is needed, select the graphs button and choose among “Histograms”, “Dotplot” and “Boxplot” output. • Click Ok to run analysis. Analyzing the test results. • If running the “Confidence Interval” option, Minitab will calculate the “t” statistic and will calculate a confidence interval for the data. • If using the “Test Mean” option, Minitab will provide descriptive statistics for the tested distribution(s), the “t” statistic and a “p” value. Confidence Intervals A confidence interval is a range of plausible values for a population parameter, such as the mean of the population, µ. For example, a test of 8 units might give an average efficiency of 86.2%. This is the most likely estimate of the efficiency of the entire population. However, observations vary, so the true population efficiency might be somewhat higher or lower than 86.2%. A 95% confidence interval for the efficiency might be (81.2%, 91.2%). 95% of the intervals constructed in this manner will contain the true population parameter. The confidence interval for the mean of one sample is Minitab can be used to analyze data using χ with two different processes. Stat>Tables>Chi Square Test and Stat>Tables>Cross Tabulation. Chi Square Test analyzes data which is in a table. Cross Tabulation analyzes data which is in columns with subscripted categories. Since Minitab commonly needs data in columns to graph, Cross Tabulation is a preferred method for most analysis. 2 X ±t× σˆ n “t” comes from the t tables (Page 65) with n-1 degrees of freedom and with the desired level of confidence. Confidence Interval for the difference in the means of 2 samples, if the variances of 2 samples are assumed to be equal is: ( x 1 − x 2 ) ± t*s p 1 + 1 n n 1 2 “t” comes from the t tables (Page 65) with n1+n2-2 degrees of freedom, and with the desired level of confidence. SP is the pooled standard deviation: S 2 37 • If “Confidence Interval” is to be calculated on other than an a value of 95%, change to the appropriate number. • The graphic outputs will all have a graphical representation of the confidence interval of the mean, shown by a red line with a dot at the mean for the sample population Weight: < 140 lbs, > 140 lbs Height: < 5’6”, > 5’6” Weight below 140 LBS Weight Above 140 LBS Totals Columns • Select the test to be performed, “Confidence Interval” or “Test Mean”. P = [(n 1 − 1 )s 2 1 + (n 2 − 1 )s 2 2 ] (n 1 + n 2 − 2 ) In MINITAB, confidence Intervals are calculated using “1 Sample” and “2 Sample” t methods, above. In the text output shown below, the 95% confidence interval for the difference between the mean of Manu_a and the mean of Manu_b is 6.65 to 8.31. This statement points to accepting Ha, that the means are different. 95% CI for mu Manu_a - mu Manu_b: ( 6.65, 8.31) 34 “t” Test A “t” test tests the hypothesis that the means of two distributions are equal. It can be used to demonstrate a shift of the mean after a process change. If there has been a change to a process, and it must be determined whether or not the mean of the output was changed, compare samples before and after the change using the “t” test. • Your ability to detect a shift (or change) is improved by increasing the size of your samples and by increasing the size of the shift (or change) that you are trying to detect, or by decreasing the variation (See Sample size; page 27 28). • There are two tests for means, a One Sample t test and a Two Sample t test. • The “one sample t test” “Stat > Basic Statistics > 1-Sample t” compares a single distribution average to a target or hypothesized value. • The “two sample t test” “Stat > Basic Statistics > 2-Sample t” analyzes the means of two separate distributions. Using the 2 Sample t test 1. Pull samples in a random manner from the distributions whose means are being evaluated. In Minitab, the data can be in separate columns or in a single column with a subscript column. 2. Determine the Null Hypothesis Ho and Alternative Hypothesis Ha (Less than, Equal to, or Greater than). 3. Confirm if variances are similar using “F” test or Homogeneity of Variance (page 30). 4. Run “Stat > Basic Statistics > 2-Sample t” In the dialog box, select “Samples in One Column” and identify the data column and subscript column, or “Samples in Different Columns” and identify both columns. 5. In the alternative box, select the alternative hypothesis which is appropriate for the analysis. This will determine the test used for the analysis (one tailed or two tailed). 6. If the variances are similar, check the “Assume Equal Variances” box. 7. If graphic output is needed, select the graphs button and choose between “dotplot” and “boxplot” output. 8. Click Ok to run analysis. Analyzing the test results. Minitab will provide a calculation of descriptive statistics for each distribution, provide a Confidence Interval statement (page 32) and provide a statement of the t test as a test of the difference between two means. The output will provide a “t” statistic, a “p” value and the degrees of freedom statistic. To use the “t” distribution table on page 65, the “t” statistic and the degrees of freedom are required. Analysis can be made using that table or the “p” value. Minitab: Stat>Tables>Chi Square 1. Create the table shown in the example in Minitab. 2. Run Stat>Tables>Chi Square Test. In the dialog box select the columns containing the tabular data, in this case, “C2” and “C3”. Click “OK” to run. 3. In the Session window, the table that is created will show the expected value for each of the data cells under the actual data for the cell, plus the c2 calculation , χ2calc = Σ[(fobs - fexp)2/ fexp]. 4. The Chi Square calculation is χ2calc = 4.927. The p value for this test of 0.026. 5. The degrees of freedom df = (Number of Rows 1)(Number of Columns -1) is shown, df = 1. 6. Determine χ2crit from the χ2 table for the degrees of freedom and confidence level desired (usually 5% risk). χ2crit=.3.841. 7. Since Χ2 calc >Χ2crit, reject Ho. Because the data is in tabular form in Minitab, no other analysis can be done. Minitab: Stat>Tables>Cross Tabulation If additional analysis of data is desired, including any graphical analysis, the Stat>Tables>Cross Tabulation procedure is preferred. This procedure uses data in the common Minitab column format. Note that the data is in a single column and the factors or variables being considered are shown as subscripted values. In this graphic, the data is in column C6 and the appropriate subscripts are in column C4 and C5. 1. Run Stat>Tables>Cross Tabulation. In the dialog box select the columns identifying the factors or variables in the “Classification Variables” box. 2. Click Chi Square Analysis and select “Above and expected count”. 3. Select the column containing the response data in the “Frequencies in” box, in this case, “data”. 4. Click Run. 5. The output in the session window is very similar to the output for Stat>Tables>Chi Square Test, except that it does not show the Chi Square calculation for each cell. 6. Analysis of the test are done as before, either by using the generated p value or by using the calculated χ2 and degrees of freedom and entering the tables with that information to find a χ2 crit. 33 38 Poisson Approximation Use this approximation when the sample size is large and the probability of defects (p) in the sample is less than 0.10. In such a situation: pˆ = k Where: k=number of defects n= number of sample parts n The confidence interval for this proportion defective can be found using the Poisson distribution. Testing Equality of Variances The “F” test is used to compare the variances of two distributions. It tests the hypothesis, Ho, that the variances of two distributions are equal. It is performed by forming a ratio of two variances from two samples and comparing the ratio with a value in the “F” distribution table. The “F” test can be used to demonstrate that the variance has been increased or decreased after a process change. Since “t” tests and “ANOVA” need to know if population variance is the same or different, this test is also a prerequisite for doing other types of hypothesis testing. In Minitab, this test is done as “Homogeneity of Variance”. The “F” test is also used during the ANOVA process to confirm or reject hypotheses about the equality of averages of several populations. Performing an F Test 1. Pull samples in a random manner from the two distributions for which you are comparing the variances. Prior to running the test confirm sample distribution normality for each sample (page 17). 2. Compute the “F” statistic, Fcalc = σ σ . The “F” statistic should always be calculated so that the larger variance is in the numerator. 2 1 2 2 3. Calculate the degrees of freedom for each sample. Degrees of freedom = ni-1, where ni is the sample size for the ith sample, i.e., n1-1 & n2-1. 4. Specify the risk level that you can tolerate for making an error in (usually set at 5%.) your decision 5. Use the “F” distribution table (p 59 - 60) to determine Fcrit for the degrees of freedom in your samples and for the risk level you have chosen. 6. Compare Fcalc to Fcrit. If Fcalc < Fcrit., the null hypothesis, Ho, which implies that the variances from both distributions are equal, cannot be rejected. If Fcalc>Fcrit , reject the null hypothesis and conclude that the samples have different variances. 1. Determine the desired confidence level (80%; 90% or 95%). 2. Find the lower and upper confidence interval factors for that level of confidence for the number of failures found in the sample. 3. Divide these factors by the actual sample size used. 4. The resultant of the two calculations gives the range of plausible values for the proportion of the population that is defective. Example: k=2; n=200 (2 defects in 200 sampled parts or CTQ outputs) Then: pˆ = 2 200 = .0100 CI=(.619/200, 7.225/200) =(.0031,.0361) 39 And the 95% 2-sided confidence Interval is: Where: Lower confidence factor = .619 Upper confidence factor = 7.225 Using Homogeneity of Variance (For MINITAB Analysis) 1. Homogeneity of Variance will allow analysis of multiple population variances simultaneously. It will also allow analysis of “non-normal” distributions. Data from all sample groups must be “stacked” in a single column with the samples identified with a separate “subscript” or “factor” column. 2. In Minitab, use STAT>ANOVA>HOMOGENEITY OF VARIANCE. In the dialog box, identify the single “response” column and a separate “Factors” column or columns. 3. Analysis of the test will be done using the “p value.” If the data is Normal (See Normality, page 15), use Bartlett's Test. Use Levene's Test when the data come from continuous, but not necessarily normal distributions. 4. The computations for the homogeneity of variance test require that at least one cell contains a non-zero standard deviation. Normally, it is possible to compute a standard deviation for a factor if it contains at least two observations. 5. Two standard deviations are necessary to calculate Bartlett’s and Levene’s test statistics. 32 The Transfer Function Hypothesis Statements (cont.) F Test - Compares the variances of two Distributions H0 - The sample variances tested are statistically the same σ20=σ21 HA - The sample variances tested are not equal σ20≠σ21 Homogeneity of Variance - Compares the variances of multiple Distributions Y= H0 - The sample variances tested are statistically the same σ20=σ21=σ3=…=σk HA - At least one of the sample variances tested are not f (X) equal σ20≠σ21≠σ21 Copyright 1995 Six Sigma Academy, Inc. Bartletts Test - Tests Normal Distributions Levene’s Test - Tests non-normal distributions Χ2 Χ Ho : p1 = p2 = p3 = ... = pn What is a p-value? Statistical definitions of p-value: The observed level of significance. The chance of claiming a difference if there is no difference. The smallest value of alpha that will result in rejecting the null hypothesis. How do I use it? 31 If p < Alpha, then the difference is statistically significant. Reject the null hypothesis and declare that there is a difference. Think of (1 - p) as the degree of confidence that there is a difference. Example: p = .001, so (1 - p) = .999, or 99.9%. . You can think of this as 99.9% confidence that there is a difference Discrete Continuous Ha : At least one of the equalities does not hold • Output Inputs If Outputs are : Ha : Dependent (There is a relationship between the populations) • Root Causes Tests the hypothesis that two discretely measured variables operate independently of each other Ho : Independent (There is no relationship between the populations) • • • Effect If Inputs are Continuous: Regression Analysis of Covariance If Inputs are Continuous: Logistic Regression If Inputs are Discrete: ANOVA t Tests; F Tests Confidence Intervals DOE If Inputs are Discrete: Logistic Regression Χ2 Confidence Intervals- Proportions DOE What Whatis isthe themathematical mathematicalrelationship relationship between betweenthe the“Y” “Y”and andthe the“X’s” “X’s” 40 ANOVA ANOVA,, ANalysis Of VAriance is a technique used to determine the statistical significance of the relationship between a dependent variable (“Y”) and a single or multiple independent variable(s) or factors (“X’s”). ANOVA should be used when the independent variables (X’s) are categorical (not continuous). Regression Analysis (Pages 43 - 45) is a technique for performing a similar analysis with continuous independent variables. ANOVA determines if the differences between the averages of the levels is greater than the expected variation. It answers the question: “Is the signal between levels greater than the noise within levels?” ANOVA allows the investigator to compare several means simultaneously with the correct overall level of risk. Basic Assumptions for using ANOVA • Equal Variances (or close to the same) for each subgroup. • Independent and normally distributed observations. • Data must represent the population variation. • Acceptable Gage R&R • ANOVA tests for equality of means is fairly robust to the assumption of normality for moderately large sample sizes, so normality is often not a major concern. ANOVA - One Way The One Way ANOVA enables the investigation of a single factor at multiple levels with a continuous dependent variable. The primary investigation question is “Do any of the populations of “Y” stemming from the levels of “X” have different means?” MINITAB will do this analysis either with the data in table form, with data for each level of X in separate columns (STAT>ANOVA>ONE WAY (UNSTACKED) ) or with all the data in a single column and the factor levels identified by a separate subscript column (STAT>ANOVA>ONE WAY). For the data below, use “One-Way c1-c3 and “One Way” for data in columns c4-c5. •In the dialog box, for “One Way (Unstacked)”, identify each of the columns containing the data. •In the dialog box for “One-way”, identify the column containing the Response (Y) and the Factor (X) as appropriate. •For both analyses, if graphic analysis is desired select the “Graphs” button and select between “Dotplots” and “Boxplots” . •Click OK to run. For analysis, see page 41. 41 Stating the Hypothesis HO and HA Ho The starting point for a hypothesis test is the “null” hypothesis - Ho. Ho is the hypothesis of sameness, or no difference. Example: The population mean equals the test mean. Ha The second hypothesis is Ha - the “alternative” hypothesis. It represents the hypothesis of difference. Example: The population mean does not equal the test mean. •You usually want to show that there is a difference (Ha). •Start by assuming equality (Ho). •If the data show they are not equal, then they must be different (Ha). Hypothesis Statements 1 Sample t - Compares a single distribution to a target or hypothesized value. – H0 - The sample tested equals the target µ0=Target Ha - The sample tested is not equal to the target or greater than/less than the target. µ0≠Target µ0>Target µ0<Target 2 Sample t - Compares the means of two separate distributions – H0 - The samples tested are statistically the same µ0=µ1 – Ha - The sample tested is not equal to the target or greater than/less than the target. µ0≠ µ1 µ0> µ1 µ0< µ1 30 Hypothesis Testing Since all data are variable, an observed change could be due to chance and may not be repeatable. Hypothesis testing determines if the change could be due to chance alone, or if there is strong evidence that the change is real and repeatable. In order to show that a change is real and not due to chance alone, first assume there is no change (Null Hypothesis, HO). If the observed change is larger than the change expected by chance, then the data are inconsistent with the null hypothesis of no change. We then “reject” the null hypothesis of no change and accept the alternative hypothesis, HA. The null hypothesis might be that two suppliers provide parts with the same average flatness (HO:µ1=µ2, the mean for supplier 1 is the same as the mean for supplier 2). In this case, the alternative hypothesis is that average flatness is not equal (HA: µ1≠µ2). Real Real World World µµ11=µ =µ22 µµ11=µ =µ22 µµ11≠µ ≠µ22 Correct Decision Type 2 Error β Type 1 Error α Correct Decision Decision Decision µµ11≠µ ≠µ22 If the means are equal and your decision is that they are equal (top left box), then you made the correct decision. If the means are not equal and your decision is that they are not equal(bottom right box), then you made the right decision. If the means are equal but your decision is that they are not equal (Bottom left box), then you made a type 1 error. The probability of this error is alpha (α) If the means are not equal but your decision is that they are equal (top right box), then you made a Type 2 error. The probability of this error is beta (β). . Steps in Hypothesis Testing 1. Define the problem; state the objective of the Test. 2. Define the Null and Alternate Hypotheses. 3. Decide on the appropriate statistical hypothesis test; Variance (Page 30); Mean (t Test - Page 31 - 32); Frequency of Occurrence (Discrete - Χ2 Page 35 - 36). Anova Two Way Two way ANOVA evaluates the effect of two separate factors on a single response can be evaluated. Each cell(combination of independent variables) must contain an equal number of observations (must be balanced). See General Linear Model (Pages 42) for unbalanced Data sets. In the data set on the right, Strength is the response (Y) and Chem and Fabric are the separate factors (X1 and X2). To analyze the significance of these factors on Y, run STAT>ANOVA>TWO WAY. In the dialog box, identify the Response (Y), “Strength.” In the “Row Factor” box, identify the first of two factors (X) for analysis. In the “Column Factor”, Identify the second “vital X”. Select the “Display Means” box for each factor to gain Confidence interval and means analysis. Select “STORE RESIDUALS” and then “STORE FITS”. If graphical analysis of the ANOVA data is desired, select the “Graphs” button and choose one, or all of the four diagnostic graphs available. This analysis does not produce F and p-values, since you can not specify whether the effects are fixed or random. Use Balanced ANOVA (Page 36) to perform a two-way analysis of variance, specify fixed or random effects, and display the F and p-values when you have balanced data. If you have unbalanced data and random effects, use General Linear Model (Page 42) with Options to display the appropriate test results. It can be seen from the SS column that the “error SS” is very small relative to the other terms. In the graphic Confidence interval analysis it is clear that both factors are statistically significant, since some of the confidence intervals do not overlap. 4. Define the acceptable α and β risk. 5. Define the sample size required. (Page 27 - 28) 6. Develop the sampling plan and collect samples. 7. Calculate the test statistic from the data. 8. Compare the calculated test statistic to a predicted test statistic for the risk levels defined. • If calculated is larger than the predicted test statistic, the statistic indicates difference. 29 42 Calculating Sample Size ANOVA - Balanced The Balanced ANOVA allows the analysis of process data with two or more factors. As with the Two Way ANOVA, Balanced ANOVA allows analysis of the effect of multiple factors, at multiple levels simultaneously. A factor (B) is “nested” within another factor (A) if the level of B appears with only a single level of A. Two factors are “crossed” if every level of one factor appears with every level of the other factor. The data for individual levels of factors must be balanced: “each combination of independent variables (cell) must have an equal number of observations”. See General Linear Model (Page 38) for analysis of unbalanced designs. Guidelines for normality and variance remain same as shown on page 38. Figure 1 shows how some of the factors and data might look in the MINITAB worksheet. Note there are five (5) data points for each combination of the three factors. To analyze for significance of these factors(Xij) on the response variable (Y), run STAT>ANOVA>BALANCED ANOVA. In the dialog box (Figure 2) , Identify the “Y” variable in the Response box and identify the factors in the “Model” box. Note that the “pipes [“Shift \”] indicate the model analyzed is to include factor interactions. Select “Storage” to store “residuals” and “fits” for later analysis. Select “Options” and select “Display means...” to display information about data means for each factor and level. Figure 1 Figure 3 is the primary output of this analysis. There is no significant graphic analysis for the balanced ANOVA. See page 41 for analysis of this output. PIPE To calculate the actual sample size without the table, or to program a spreadsheet to calculate sample size, use this equation. ( z α / 2 + z β )2 n = 2× (δ / σ )2 Zβ α α/2 Zα/2 β .20 .10 1.282 .20 0.842 .10 .05 1.645 .10 1.282 .05 .025 1.960 .05 1.645 .01 .005 2.576 .01 2.326 Example: α = .10, β = .01, δ/σ=.3 Figure 2 43 Figure 3 (1 . 645 + 2 . 326 )2 n = 2× = 350 2 (. 3 ) 28 Continuous Data Analysis Interpreting the ANOVA Output Sample Size Determination When using sampling to analyze processes, sample size must be consciously selected based on the allowable α and β risk, the smallest amount of true difference (δ) that you need to observe for the change to be of practical significance and the variation of the characteristic being measured (σ). As variation decreases or sample size increases it is easier to detect a difference. Steps to defining sample size 1. 2. 3. 4. 5. Determine the smallest true difference to be detected, the gap ( δ ). Confirm the process variation ( σ ) of the processes to be evaluated. Calculate δ/σ. Determine acceptable α and β risk. Use chart on page 58 to read the sample size required for each level of the factor tested. Today Desired Today Desired The first table lists the factors and levels. In the table shown there are three “factors”, “Region”, “Shift” and “WorkerEx”. There are three levels each for “Region” and “Shift”. The values assigned for the Region and Shift levels are “1,2 &3”. “WorkerEx” is a two level factor and has level values of “1&2”. The second table is the ANOVA output. The columns are as defined below. Source T T variation(σ σ) gap delta (δ δ) variation(σ σ) gap delta(δ δ) δ ≅1 σ δ ≅2 σ SS MS F For example -Assume the direction of the effect is unknown, but you need to see a delta sigma (δ/σ) of 1.0 in order to say the change is important. For an α risk of 5% and a β risk of 10%, we would need to use 21 samples. Remember that we would need 21 at each level of the factor tested. If for the same δ, σ were reduced so that δ/σ were 2, only 5 samples would be required. In general, the smaller the shift (δ/σ) you are trying to detect, and/or the lower the tolerable risk, the greater the number of samples required. Sample size sensitivity is a function of the standard error of the mean (σ samples are less sensitive than larger samples. 27 DF n ). Smaller P The source shows the identified factors from the model, showing both the single factor information (i.e., Region) and the interaction information (i.e., Region*Shift) Degrees of Freedom for the particular factor. Region and shift have 3 levels and 3-1=2 df, and workerex has 2 levels and 2-1=1 df. Factor “Sum of Squares” is a measure of the variation of the sample means of that factor. Factor “Mean Square” is the SS divided by the DF. The F calc value is the MS of the factor divided by the MS of the Error term. In the case of Region, F=90.577÷3.325=27.24. If using F crit to analyze for significance, enter table with DF degrees of freedom and α=.05. Compare F calc to F crit. If F calc is greater than F crit , the factor is significant. The calculated P value, the observed level of significance. If P<.05, the factor is statistically significant at the 95% level of confidence. Note: The relative size of the error SS to total SS indicates the percent of variation left unexplained by the model. In this case, the unexplained variation is 39.16% of the total variation in this model. The “s” of this unexplained variation is the square root of the MS of the Error term (3.325). In this case the “within” group variation has a sigma of 1.82. If this remaining variation does not enable the process to achieve the desired performance state, look for additional factors. 44 General Linear Model The General Linear Model (GLM) can handle “unbalanced” data - such as data sets with missing observations. Where the Balanced ANOVA required the number of observations to be equal in each “factor/level” grouping, GLM can work around this limitation. The data must be “full rank” (enough data to estimate terms in the model). But you don’t have to worry about this, because Minitab will tell you if your data isn’t full rank! Analysis and Improve Tools Y Discrete In the data set shown in Figure 1, note that there is only one data point in “Rot1”, the response column for factor “Temp1”- level 10 / ”Oxygen1”- level 10(rows 8 & 9), and only two data points for “Temp1” - level 16 / “Oxygen1” Level 6 (Row 14). In such case, “Balanced ANOVA” would not run because the requirement of equal observation would require three data points in each cell (factor and level combination). Figure 1 Figure 2 is the primary output of this analysis. There is no graphic analysis of this output. Discrete • • • X Tables (Cross tab) Chi Square Confidence intervals for proportions Pareto • • • • • • • Continuous Run STAT>ANOVA>GENERAL LINEAR MODEL. In the Dialog box, identify the response variable in the “Response” box and the factors in the “Model” box. Use the pipe (shifted “\”) to include interactions in the analysis. • Continuous • • Logistic regression Discriminant Analysis CART (Classification and Regression Trees) • • • • Confidence intervals t test ANOVA Homogeneity of Variance GLM DOE (factorial fit) Linear regression Multiple regression Stepwise Regression DOE response surface Logistic Regression, Discriminant Analysis and CART (Classification and Regression Trees) are advanced topics not taught in Six Sigma Training. The following references may be helpful. Figure 2 Interpretation: Temp1 is a significant X variable, because it explains 62% of the total variation (528.04/850.4). (Temp1 also has a p-value < 0.05, indicating that it is statistically significant) Breiman, Friedman, Olshen and Stone; Classification and Regression Trees; Chapman and Hall, 1984 Hosmer and Lemeshow; Applied Logistic Regression; Wiley, 1989 Minitab Help - Additional information about Discriminate Analysis Neither Oxygen1 nor the interaction between Oxygen and Temperature appears significant. The unexplained variation represents 30.95% ((263.17÷850.4)*100) and the estimate of the within subgroup variation is 5.4 (square root of 29.24). 45 26 Pareto Diagrams Regression Analysis Stat>Quality Stat>Quality Tools>Pareto Chart Regression can be used to describe the mathematical relationship between the response variable and the vital few X’s, if you have continuous data for your X’s. Also, after the “vital few variables” have been isolated, solving a regression equation can be used to determine what tolerances are needed on the “vital few variables” in order to assure that the response variable is within a desired tolerance. Pareto Chart for Defects When analyzing categorical defect data , it is useful to use the Pareto chart to visualize the 400 relative defect frequency. A Pareto Chart is a 80 300 frequency ordered column chart. The analysis can 60 either analyze raw defect data, such as “scratch, 200 40 dent, etc”, or it can analyze count data such as is made available from Assembly Line Defects 100 20 reports. The graphic on the left is from count data. 0 0 Set up the worksheet with two columns, the first Defect with the defect cause descriptor and the second 274 59 43 19 10 18 Count with count or frequency of occurrences. In the 64.8 13.9 10.2 4.5 2.4 4.3 Percent Cum % 64.8 78.7 88.9 93.4 95.7 100.0 “PARETO CHART” dialog box, select “Chart DEFECTS TABLE”. Link the cause descriptor to the “LABELS IN” box and the counts to the “FREQUENCY” box. Click OK. For more information, see Minitab Context sensitive help in the Pareto Dialog box. Count Percent 100 ws cre gS sin M is Mis g sin Cli ps yG ak Le k as et D ve cti efe i us Ho Inc te ple om rt Pa ers Oth To interpret the pareto, look for a sharp gradient to the categories with 80% of counted defects attributable to 20-30% of the identified categories. If Pareto is flat with all categories linked to approximately the same number of defects, try to restate the question to redefine the categorical splits. Cause and Effect Diagrams Fishbone Diagrams: Stat>Quality Stat>Quality Tools>Cause & Effect Cause-and-Effect Diagram Measurements Micrometers Microscopes Materials Alloys Lubricants Inspectors When working with the Advocacy team to define the potential factors (X’s), it is often helpful to use a “Cause and Effect Diagram” or “Fishbone” to display the factors. The arrangement helps in the discovery of potential interactions between Factors (X’s). Men Suppliers Shifts Supervisors Training Operators Exhaust Quality Speed Brake Condensation Moisture% Environment Engager Angle Methods Lathes Bits Sockets Machines Use Minitab worksheet columns to record descriptors for the factors identified during the team brainstorming session. Group the descriptors in columns by categories such as the 5M’s. Once the factors are all recorded, open the Minitab “Stat>Quality Tools>Cause and Effect” dialog box. The dialog box will have the 5M’s and Environment shown as default categories of factors. If using these categories, link the worksheet columns of categorized descriptors to the dialog box categories. If the team has elected to use other Category names, replace the default names and link the appropriate columns. Click OK. To interpret the Cause and Effect Diagram, look for places where a factor in one category could also be included in another category. Question the Advocacy team about priority or significance of the factors in each category. Then prioritize the factors as a whole. For the most significant factors, ask the team where there is the potential for changes in one factor to influence the actions of another factor. Use this information to plan analysis work. 25 Regression analysis can find a linear fit between the response variable Y and the vital few input variables X1 and X2. Y = B + B X + B X + error 0 1 1 2 2 (Start with a scatter diagram to examine the data.) This linear equation can be used to decide what tolerances must be maintained on X1 and X2 in order to hold a desired tolerance on a Variable Y. ∆Y = B ∆X + B ∆X 1 1 2 2 Regression analysis can be done using several of the MINITAB tools. Stat>Regression>Fitted Line Plot is explained on Page 20. This section will discuss Regression>Regression. Data must be paired in the MINITAB worksheet. That is, one measurement from each input factor (x) is paired with the response data (Y) for that particular measurement point. Plot the data first using Minitab Stat>Plot. Analyze the data Stat>Regression>Regression. In the dialog box indicate the Response (Y) in the Response box and the expected factors (X’s) in the Predictors box. Select the Storage button and in that dialog box select Fits and Residuals. Click OK twice to run the analysis. The output will appear as shown in the figure to the right. using The full regression equation is shown at the top of the output. Predictor influence can be evaluated using the p column in the first table. Analysis of the second table is done in similar fashion to the ANOVA analysis on page 41. Note that R2 (ADJ) is similar to R2 but is modified to reflect the number of terms in the regression. If there are many terms in the model, and the sample size is small, then R2 (ADJ) can be much lower than R2, and you may be over-fitting. In this example, the total sample size is large (n=560), so R2 and R2 (ADJ) are similar. 46 Stepwise Regression Time Series Plot Graph>Time Series Plot Stepwise Regression is useful to search for leverage factors in a data set with many factors (x’s) and a response variable (Y). The tool can analyze up to 100 factors. But, while this enables the analysis of Baseline data for potential Vital X’s, be careful not to draw conclusions about significance of X’s without first confirming with a DOE. The time series plot is useful as a diagnostic tool. Use it to analyze data collection processes, non-normal data sets, etc. In GRAPH VARIABLES: Identify any number of variables (Y) from the worksheet you wish to look at over time. Minitab assumes the values are entered in the order they occurred. Enter one column at a time. Minitab will automatically sequence to the next graph for each column. The X axis is the time axis and is set by selecting the appropriate setting in TIME SCALE. Each Time series plot will display on a separate graph. In FRAME,ANNOTATE and OPTIONS, you can change chart axes, display multiple charts, etc. In analyzing the Time Series Plot, look for a story. Look for trends, sudden shifts, a regular cycle, extreme values, etc. If any of these exist, they can be used as a lead into problem solving. To use Stepwise regression, the data needs to be entered in Minitab with each variable in a separate column and each row representing a single data point. Next select Stat>Regression>Stepwise. In the dialog box, identify the column containing the response (Y) data in the Response box. In the Predictor box, identify the columns containing the factors (X’s) you want Minitab to use. If their F-statistic falls below the value in the “F to remove” text box under Options (Default = 4), Minitab removes them. By selecting the Options button, you can change the Fcrit value for adding and removing factors from the selection and also reduce the number of steps of analysis the tool goes through before asking for your input. Minitab will prioritize the leverage “X” variables and run the first regression step on the factor with the greatest influence. It continues to add variables as long as the “t” value is greater than the SQRT of the identified F statistic limit (Default = 4). The Minitab output includes Box-Cox Transformation 1) the constant and the factor coefficients for the significant terms. 2) the “t” value for the Factors included. 3) the “s” for the unexplained variation based on the current model. 4) the R2 for the current model. If you have chosen “1 step between pauses”, Minitab will then ask if you wish to run more. Type “yes” and “enter”. Continue this procedure until MINITAB won’t calculate any more. At that point, you will have identified your potential “leverage X’s”. Stat>Control Stat>Control Charts>Box-Cox Transformation B o x-C o x P lo t fo r S ke w e d 95% C onfidenc e Inter val 60 Last Iter ation Info Lambda Low 0.056 E st 0.113 Up 0.170 50 40 Output 30 S tD e v In this output, there are five potential predictors identified by stepwise regression. The steps are shown by the numbered columns and include the regression information for included factors. The information in column 1 represents the regression equation information if only “Form” is used. In column 5, the regression equation information includes five factors, but the s is .050 and the R2 is only 25%. In all probability, the analyst will choose to gather information including additional factors during next runs. 20 10 0 -5 -4 -3 -2 -1 0 1 Lam bda 2 3 4 5 S tD ev 2.784 2.782 2.783 BOX-COX TRANSFORMATION is a useful tool for finding a transformation that will make a data set closer to a normal distribution. Once confirmed that the distribution is non-normal, use Box Cox to find an appropriate transformation. Box-Cox provides an exponent used in the transformation called lambda, “λ”. The transformed data is the original data raised to the power of λ. Subgroup data can be in columns or across rows. In the dialog box, indicate how DATA ARE ARRANGED and where located. If data is subgrouped and subgroups are in rows, identify configuration. To store transformed data, select STORE TRANSFORMED DATA IN and indicate new location. The Box-Cox transformation can be useful for correcting non-normality in process data, and for correcting problems due to unstable process variation. Under most conditions, it is not necessary to correct for non-normality unless the data are highly skewed. It may not be necessary to transform data which are used in control charts, because control charts work well in situations where data are not normally distributed. Note: You can only use this procedure with positive data. 47 24 Box Plot Regression with Curves (Quadratic) & Interactions GRAPH>BOXPLOT The boxplot is useful for comparing multiple distributions (Continuous Y and discrete X). In the GRAPH section of the dialog box, fill in the column(s) you want to show for Y and if a column is used to identify various categories of X, i.e., subgroup coding, etc. Click FRAME Button to give you the options of setting common axes or multiple graphs on the same page. To generate multiple plots on a single page, select FRAME>MULTIPLE GRAPHS>OVERLAY GRAPHS... Click ATTRIBUTES to allow you to change individual box colors. Click OK The box represents the middle 50% of the distribution. The horizontal line is the median (the middlemost value) The whiskers each represent a region sized at 1.5*(Q3-Q1), the region shown by the box). Interpretation can be that the box represents the hump of the distribution and the whiskers represent the tails. Asterisks represent points which would fall outside the lower or upper limits of expected values. Interval Plot GRAPH>INTERVAL PLOT thickness 129.25 129.15 Useful for comparison of multiple distributions. Shows the spread of data around the mean by plotting standard error bars or confidence intervals. 129.05 The default form of the plot provides error bars extending one standard error (standard 128.95 deviation/square root of n) above and below existing new type a symbol at the mean of the data. Y variable: Select the column to be plotted on the y-axis. Group variable: Select the column containing the groups (or categories). This variable is plotted on the x-axis. Type of interval plot Standard error: Choose to display standard error bars where the error bars extend one standard error away from the mean of each subgroup. Multiple: Enter a positive number to be used as the multiplier for standard errors (1 is the default). Confidence interval: Choose to display confidence intervals instead of standard error bars. The confidence intervals assume a normal distribution for the data and use tdistribution critical values. When analyzing multiple factor relationships, it is important to consider if there is potential for quadratic (curved) relationships and interactions. Normal graphic analysis techniques and Regression do not allow analysis of the effects of interrelated factors. To accomplish this, the data must be analyzed in an orthogonal array (See Page 49). In order to create an orthogonal array with continuous data, the factor (x) data must be centered. Do this as follows: 1. The data to be analyzed need to be is columns, with the response in one column and the values of the factors paired with the response and recorded in separate columns. 2. Use Stat>DOE>Define Custom RS Design In the dialog box, identify the columns containing the factor settings. 3. Next, analyze the model using Stat>DOE>Analyze RS Design. Identify the column containing the response data Check: Analyze Data using Coded Units. 4. Click on Storage and select Fits and Residuals for later regression diagnostics. Click OK. Click on Graphs and select the desired graphs for analysis diagnostics. The initial analysis will include all terms in the potential equation including full quadratic. Analysis of the output will be similar to that for Regression>Regression (Page 43). 5. Where elements are insignificant revert to the Stat>DOE>Analyze RS Design >Terms dialog box to eliminate. In the case of this example, the equation can be analyzed as a linear relationship, so select “Linear” in the “Include the following terms box”. Note that this removes all the interaction and quadratic terms. Re-run the regression. Once an appropriate regression analysis, including leverage factors has been obtained, validate the adequacy of the model by using the regression diagnostic plots Stat>Regression>Residuals Plots (Page 22). Once an appropriate regression equation has been determined, remember this analysis was done with centered data for the factors. The centering will have to be reversed in order to make the equation useful from a practical standpoint. To create a graphic of the model, use Stat>DOE>RSPlots (Page 52). From this dialog box a contour plot of the results can be created. Level: Enter the level of confidence for the intervals. The default confidence coefficient is 95%. 23 48 Binary Logistic Regression One-Variable Regression In binary logistic regression the predicted value (Y) will be probabilities p(d) of an event such as success or failure occurring. The predicted values will be bounded between zero and one (because they are probabilities). STAT>REGRESSION>FITTED LINE PLOT Regression Plot In the STAT>REGRESSION>FITTED LINE PLOT dialog box, identify Response Variable (Y). Identify one (1) Predictor (X). Select TYPE OF MODEL (Linear, Quadratic or Cubic). Click on STORAGE. Select RESIDUALS and FITS. If you need to transform data, use OPTIONS and select Transformation. In select OPTIONS, DISPLAY Hardness CONFIDENCE BANDS and DISPLAY PREDICTION BANDS. Click OK. The output from the fitted line plot contains an equation which relates your predictor (input variable) to your response (output variable). A plot of the data will indicate whether or not a linear relationship between x and y is a sensible approximation. These observations are modeled by the equation : Y = 2692.80 - 3.16067X R-Sq = 0.784 700 600 Abrasion Example: Predict the success or failure of winning a contract based on the response cycle time to a request for proposal and the proposal team leader. The probability of an event, π(x) or Y, is not linear with respect to the Xs. The change in π(x) for a unit change becomes progressively smaller as π(x) approaches zero or one. Logistic regression develops a function to model this. π ( x) (1 − π ( x)) is the odds. The Logit is the Log of the odds. Ultimately the transfer function being developed will solve for π(x). 300 β + β 1x e0 95% CI ) 5. Evaluate the Model for 700 710 Count 113 110 223 2 P StDev 1.670 1.799 Z 4.44 0.000 -4.74 0.000 1.2109 0.3005 4.03 0.000 3 740 750 760 Stat>Regression>Residual Stat>Regression>Residual Plots 4 Odds Ratio 95% CI Lower Upper 0.00 0.00 0.01 3.36 1.86 6.05 Weld Temp Fits I Chart of Residuals Normal Plot of Residuals Residual 1 Residual 0.5 Log-Likelihood = -134.795 Test that all slopes are zero: G = 39.513, DF = 2, P-Value = 0.000 0.0 -0.5 Chi-Square 187.820 224.278 7.138 DF 116 116 7 P 0.000 0.000 0.415 -2 5 -1 0 -3.0SL=-0.9631 1 2 0 10 20 30 Observation Number Residuals vs. Fits 10 5 0.0 Any time a model has been created for an X/Y relationship, through ANOVA, DOE, Regression, the quality of that model can be evaluated by analysis of the error in the equation. When doing the REGRESSION (Page 37-38), or the FITTED LINE PLOT (above), be sure to select store “FITS” and “RESIDUALS” in the “STORAGE” dialog box. If the fit is good, the error should be normally distributed with an average of zero and there should be no pattern to the error over the range. Then in the “RESIDUAL PLOTS” dialog box, identify the column where the residuals are stored -0.5 -1.0 Residual 6 X=0.000 Normal Score -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 Summary Measures Somers' D Goodman-Kruskal Gamma Kendall's Tau-a 0 Histogram of Residuals 0 (Between the Response Variable and Predicted Probabilities) Percent 72.9% 26.2% 0.9% 100.0% 3.0SL=0.9631 0.5 Table of Observed and Expected Frequencies: (See Hosmer-Lemeshow Test for the Pearson Chi-Square Statistic) Number 9060 3260 110 12430 1 -1 -1.0 Pairs Concordant Discordant Ties Total 730 Residual Plots (Event) Coef 7.410 -8.530 Method Pearson Deviance Hosmer-Lemeshow 720 Use Residual Plots (Below) to plot the residuals vs predicted values ( Fits) and determine if there are additional patterns in the data. Logistic Regression Table Predictor Constant Index Brand Sp Yes 690 The R-sq is the square of the correlation coefficient. It is also the fraction of the variation in the output (response) variable that is explained by the equation. What is a good value? It depends... chemists may require an R-sq of .99. We may be satisfied with an R2 of .80. Frequency 3. Check the Odds ratios for the individual predictor levels. 4. Use the Confidence Interval to confirm significance. Where confidence interval includes 1.0, the odds are not significant. Value Yes No Total 680 Y = b + mx + error Logit Response Information Variable Bid 670 Confidence Bands are 95% confidence limits for data means. Prediction Bands are limits for 95% of individual data points. Binary Logistic Regression Link Function: 95% PI 660 To analyze the binary logistic problem, use STAT>REGRESSION>BINARY LOGISTIC REGRESSION. The data set used for “Response” will be Discrete and Binary (Yes/No;Success/Failure). In the “Model” dialog box, enter all factors to be analyzed. In the “Factors” dialog box, enter those factors which are discrete. Use the “Storage” button and select “Event probability”. This will store the Calculated Event probability for each unique value of the function. Analyze the Session Window output. 1. Analyze the Hypothesis Test for the model as a whole. Check for a p value indicating model significance. 2. Check for statistical significance of the individual factors separately. Use the P Value. Regression 200 β 0+ β 1x (1+ 400 Residual π ( x) = e 500 4 5 6 7 8 9 10 Fit in the “Residuals” box and the fits storage column in the “Fits” box. 0.47 0.47 0.23 Goodness of fit. Use Hosmer- Lemeshow if there is a “continuous” “X” in the model. 6. Assess the measures of Association. Note “% Concordant” is a measure 49 similar to R2. A higher value here indicates a better predictive model. The output includes a normal plot of residuals, a histogram of residuals, an Individuals Chart of Residuals and a scatter plot of Residuals versus Fits. Analysis of the Normal Plot should show a relatively straight line if the residuals are normally distributed. The I chart should be analyzed as a control chart. The histogram should be a bell-shaped distribution. The residuals vs fits scatter plot should show no pattern, with a constant spread over the range. 22 Descriptive Statistics Stat>Basic Stat>Basic Statistics>Descriptive Statistics Descriptive Statistics Variable: C1 Anderson-Darling Normality Test A-Squared: 0.235 P-Value: 0.790 42 52 62 72 82 Mean StDev Variance Skewness Kurtosis N 92 Minimum 1st Quartile Median 3rd Quartile Maximum 95% Confidence Interval for Mu 69.3824 9.8612 97.2442 -5.0E-03 3.89E-02 500 38.9111 62.5858 69.6848 75.8697 99.6154 95% Confidence Interval for Mu 68.5160 70.2489 69 70 71 95% Confidence Interval for Sigma 9.2856 10.5136 The Descriptive Statistics>Graphs>Graphical Summary graphic provides a histogram of the data with a superimposed normal curve, a normality check, a table of descriptive statistics, a box plot of the data and confidence interval plots for mean and median. In the Descriptive Statistics dialog box select the variables for which you wish to create the descriptive statistics. If choosing a stacked variable with a category column, check the BY VARIABLE box and indicate the location of the category identifier. 95% Confidence Interval for Median 68.6347 70.8408 95% Confidence Interval for Median Use the Graphs button to open the graphs dialog box. In the graphs dialog box, select “Graphical Summary”. When using this tool to interpret normality, confirm the p value and evaluate the shape of the histogram. Remember that the p value is “The probability of claiming the data are not normal if the data are truly from a normal distribution”, a type I error. A high p-value would therefore be consistent with a normal distribution. A low p-value would indicate non-normality. When evaluating the shape of the histogram graphically, determine: Is it bimodal? Is it skewed? If yes, investigate potential causes for the non-normality. Improve if possible or analyze groups separately. If no special cause is found for the non-normality, the distribution may be non-normal naturally and you may need to transform the data (page 22) prior to calculating your Z. Normal Plot • Normal Probability Plot • .999 Probability .99 .95 .80 .50 .20 .05 .01 .001 26 36 46 56 66 76 86 96 Tolerance Analysis is a design method used determine the impact that individual parts of a system have on the overall requirement for that system. Most often, Tolerance Analysis is applied to dimensional characteristics in order to see the impact the dimensions have on the final assembly in terms of a gap or interference. In this application, a tolerance loop may be used to illustrate the relationship. Purpose To graphically show the relationships of multiple parts in a system which result in a desired technical requirement in terms of a gap or interference. Process 1. Generate a layout drawing of your assembly. A hand sketch is all that is required. 2. Clearly identify the gap in the most severe condition. 3. Select a DATUM or point from which to start your loop. (It is easier to start the loop at one of the interfaces of the Gap.) 4. Use drawing dimensions as vectors to connect the two sides of the gap. STAT>BASIC STATISTICS>NORMALITY TEST 106 Normal Average: 70.0000 StDev: 10.0000 N: 500 Design for Six SigmaTolerance Analysis Anderson-Darling Normality Test A-Squared: 0.418 P-Value: 0.328 5. Assign sign convention (+/-) to vectors Identify the variable you will be testing in the Variable box. Click OK (Use default Anderson Darling test). A Normal probability plot is a graphical method to help you determine whether your data is normally distributed. To graphically analyze your data, look at the plotted points relative to the sloped line. A normal distribution will yield plotted points which closely hug the line. Nonnormal data will generally show points which significantly stray from the line. Vector Assignment Datum A + B1 B2 B3 B4 Gap The test statistics displayed on the plot are Asquared and p-value. The A-squared value is an output of a test for normality. Focus your analysis on the p value. The p value is “The probability of claiming the data are not normal if the data are truly from a normal distribution”, a type I error. A high p-value would therefore be consistent with a normal distribution. A low p-value would indicate non-normality. Use the appropriate type I error probability for judging this result. Assign a positive (+) vector when • An increase in the dimension increases the gap. • An increase in the dimension reduces the interference. Assign a negative (-) vector when • An increase in the dimension reduces the gap. • An increase in the dimension increases the interference. In the diagram above, the relationship can be explained as: GAP = A - B1 - B2 - B3 - B4 Because the relationship can be explained using only + & - signs, the equation is considered LINEAR, and can be analyzed using a method known as Root Sum of Squares (RSS) analysis. 21 50 Design for Six Sigma Tolerance Analysis (continued) Histogram GRAPH>HISTOGRAM The histogram is useful to look graphically at the distribution of data. Using RSS, the statistics for the GAP can be explained as follows. The bar (−) above the terms designates mean values & “S” designates Standard Deviation. ___ _ __ __ __ __ GAP = A - B1 - B2 - B3 - B4 S gap = (s 2 A +s +s +s +s 2 2 2 2 B1 B2 B3 B4 In the GRAPH VARIABLES Box select each variable you wish to graph individually. ) Click the OPTIONS button to change the Histogram displayed: • Type of Histogram - Frequency (Default); Percent; Density • Type of Intervals - Midpint (Default ) or Cutpoint Definition of intervals - Automatic (Default ) or manual definition • See Minitab Help for explanation of how to use these options. Click HELP in the HISTOGRAM>OPTIONS Dialog Box. Given these equations, the impact that each individual part has on the entire system can be analyzed. In order to perform this analysis, follow these steps: Gather and prepare the required data • gap nominals gap specification limits • • process data for each component ü process mean ü process s.st ü process s.lt • process data assumptions, if data not available: try 'expert data sources' ü or....process s.lt estimates when data is not available ü s.st, s.lt from capability data ü Z-shift assumptions, long term-to-short term, when one is known ♦ multiply s.st by 1.6 for a variance inflation factor multiply s.st by 1.6+ for a process that has less control long term ♦ ( divide s.lt by the above factor if long-term historical data is known) Click FRAME button to give the options for setting common axes or multiple graphs. Click ATTRIBUTES button to access options for changing graphic colors or fill type. Scatter Plot GRAPH>PLOT The scatter plot is a useful tool for understanding the relationship between two variables. 9.7 Linear Tolerance Spreadsheet Once the data has been collected, it can be analyzed using the Tolanal.xls spreadsheet described on the next page. The Tolanal.xls spreadsheet can be found on the GEA website, under Six Sigma, Forms & Tools. EXISTING 9.5 9.3 9.1 8.9 8.7 8.5 8.5 The Tolanal.xls spreadsheet performs its analysis using Root Sum of Squares method, and should only be applied to LINEAR relationships (i.e. Y=X1+X2-X3). Non-linear relationships require more detailed analysis using advanced DFSS tools such as Monte Carlo or the ANALYSIS.XLS spreadsheet. Contact a Master Blackbelt for support. 8.7 8.9 9.1 9.3 9.5 9.7 In the GRAPH VARIABLES box select each X and Y variable you wish to plot. MINITAB will create individual plots for each pair of variables selected. In the Six Sigma method, the selected Y should be the dependent variable and the selected X the independent variable. Select as many combinations as you wish. NEW Click the OPTIONS button to add jitter to the graph. Where there are multiple data points with the same value, this will allow each data point to be seen. Click FRAME button to give the options for setting common axes or multiple graphs. Click ATTRIBUTES button to access options for changing Graphic colors or fill type. Click ANNOTATION button to access options for changing the appearance of the data points or to add titles, data labels or text. Results: If Y changes as X changes, there is a potential relationship. Use the graph to check visually for linearity or non-linearity. 51 20 Tolerancing Analysis - The Planning Questions Linear Spreadsheet The Principle of Reverse Loading 1. Input the technical requirements Know See Tool Data Where 2. Input target dimensions and vector direction Every problem solving task is focused on finding out something. The investigation will be more effective if it is planned. Planning is appropriate for Gage R&R, characterizing the process, analyzing the process for difference (hypothesis testing), design of experiments or confirmation run analysis. In short it is appropriate in every phase of the MAIC or MAD process. This investigative process is called “reverse loading” because the process begins with a question focusing on what is desired at the end of the process. Plan 1) What do you want to know? 2) How do you want to see what it is that you need to know? Critical Questions 3) What type of tool will generate what it is that you need to see? 4) What type of data is required of the selected tool ? 5) Where can you get the required type of data? Execute 3. Input short and long term σs of part dims Spreadsheet identifies major contributors to system variation Input your process data into the spreadsheet · input gap technical requirements · input interface dimension 'nominals' as the baseline case · input design vectors from loop diagram · input process σ's (long and short term from available data) Analyze the initial output · Gap: Min/Max of constraints (info only) vs. Z.upper and Z.lower for 'Gap' (CTQ) · Parts: RSS-σ.st contribution by each part: target the high RSS% first Modify the input variables to optimize the Z-gap ..change: · Part nominals vs. means (implies process shift... tooling sensitive?) · Component σ.st (caution here... if you reduce, you are making 'big' assumptions ) · Z-shift factor (change the σ.lt, using actual data or assumptions) · Target CTQ specifications (if not constrained... negotiate with Customer) Review your output for the optimized Z-gap condition · If initial Z-gap is very high, you can move part nominals or 'open' the variation (don't penalize yourself by constraining processes too much) · If you cannot get Z-gap to your goals, re-design should be considered · Understand ALL implications of your changes to any of the input variables Establish your tolerances based on %RSS contribution, sensitivity to Z.gap and desired sigma' level of the particular contributing dimension; know the effect on your NOMINAL design · highest %RSS contributors will have most impact. Iterate by moving the nominal by 1.0 σ.st in both directions.... continue iterating to 2*s.st.., 4*s.st.., 5*s.st.., etc. · understand and weigh your risks by increasing tolerance (effect on nominals, subsequent operations, etc)... how is the process managed and what are the controls? Who supplies? Copyright 1995 Six Sigma Academy, Inc. 19 52 Design Of Experiments Six Sigma Process Report Baselining data collection is considered passive observation. The process is monitored and recorded without intentional changes or tweaking. In Designed Experiments, independent variables (Factors) are actively manipulated and recorded and the effect on the dependent variable (Response) is observed. Designed experiments are used to: • Determine which factors (X’s) have the greatest impact on the response (Y). • Quantify the effects of the factors (X’s) on the response (Y). • Prove the factors (X’s) you think are important really do affect the process. Analysis of Continuous data Report 1: Executive Summary Process Performance LSL USL Date: August 1 Reported by: Dick Kel Project: Wine Qua Department: Mountain Process: Aging Characteristic: Perceive 5 10 15 20 1,000,000 Orthogonality Actual (LT) Potential (ST) Units: Likert S Upper Spec: 22 Lower Spec: 8 Nominal: 15 Opportunity: Award 100,000 10,000 Process Benchmarks 1000 Since our goal in experimentation is to determine the effect each factor has on the response independent of the effects of other factors, experiments must be designed so as to be horizontally and vertically balanced. An experimental array is vertically balanced if there are an equal number of high and low values in each column. The array is horizontally balanced if for each level within each factor we are testing an equal number of high and low values from each of the other factors. If we have a balanced design in this manner, it is Orthogonal. Standard generated designs are orthogonal. When modifying or fractionating standard designs be alert to assure maintenance of orthogonality. Actual (LT) Potential (ST) 100 Sigma Replication Duplicate experimental runs more than once after resetting the independent variables is called replication. It is commonly used to assure generalization of results over longer term conditions. When using MINITAB for experimental designs, Replications can be programmed during the design creation. Randomization Running experimental trials in a random sequence is a common, recommended practice that assures that variables that change over time have an equal opportunity to affect all the runs. When possible, randomizing should be used for designed experimental plans. It is the default setting when MINITAB generates the design, but can be deselected using the OPTIONS button. Blocking A block is a group of “homogeneous units”. It may be a group of units made at “the same time”, such as a block by shift or lot, or it may be a group of units made from “the same material” such as raw material lot or manufacturer. When blocking an experiment, you are adding a factor to the design; i.e., in a full factorial 24 experiment with blocking, the actual design will analyze as a 25-1 experiment. When analyzing processes subject to multiple shift or multiple raw material flow environments, etc, blocking by those conditions is recommended. 53 3.52 15034.0 215.402 PPM 1 1 2 3 4 5 6 7 8 Xbar and S Chart Capability Indices 16 15 14 13 12 11 10 9 3.0SL=15.74 Subgroup 5 4 3 2 1 0 1 2 3 4 5 6 7 15.0000 12.4368 -3.0SL=9.133 1.8917 3.7003 3.7003 8 -3.0SL=0.000 I 8 I I 22 I Specification Date Reported By Project Department Process Casting Characteristic Units Upper Spec Lower Spec Nominal Opportunity Data Source Time Span 06/31/96 Data Trace I 22 99.9785 98.4966 Yiel PPM 215.402 15034.0 Cp Cpk Pp Ppk 1. Identify the configuration of the Y data, and the location of the useful data. (columns or rows). 1.22 0.78 1.13 0.72 Data Source: Time Span: 2. Identify the CTQ specifications or the location of the demographic information. 06/31/96 Duke Brewster Shoe Cast Brake Division 3. If detailed demographic information is to be used, select the Demographics button. Either enter the data for the information (shown at the left) in the dialog box or reference a spreadsheet column with this information listed as shown. Hardness Brinell 42 38 40 01/01/96 To use this tool effectively, the response data (Y) must be collected in rational subgroups of two (2) or more data points. In addition to the Y data, a demographics column may be added to provide the demographic information on the right side of Report 1. The demographics column must be entered in exact order shown if used. See figure. Once the data is entered, create the report by calling “Six Sigma>Process Report”. P.LSL 0.000108 0.015033 P.Total 0.000215 0.015034 Process Tolerance 6.2592 18.6145 I I I I 2.0454 4.6756 2.1692 Z.Bench 3.5205 2.1692 1.3513 1.3513 Z.Shift P.USL 0.000108 0.000001 Actual (LT) Capability 8 LT Mean StDev Z.USL Z.LSL S=1.691 Process Tolerance 9.2773 20.7227 I I I ST X=12.44 3.0SL=4.342 Specification Completing a run more than once without resetting the independent variables is called repetition. It is commonly used to minimize the effect of measurement and to analyze factors affecting short-term variation in the response. 2.17 (Z.Bench) 10 Potential (ST) Capability Repetition The Six Sigma Process Report, “Six Sigma>Process Report” displays data to enable the analysis of continuous process data. The default reports are the Executive Summary (Report 1)and the Process Capability Report (Report 2). Process Demographics Actual (LT) Potential (ST) - 4. When the report is generated with only this information, the default reports will be shown. If additional reports are desired, they can be accessed through the “Reports” button. Bin # 1057a-9942 Executive Summary - Top left graphic displays the predicted distribution based on data. MINITAB assumes normal data and will display a normal curve whether the data is normal or not. The lower left hand graphic displays the expected PPM defect rates as subgroups are added to the prediction. When this curve stabilizes (levels off) enough data has been taken. The Process Benchmarks show the reported Z Benchmark scores and PPM (Defects in both tails are combined) (Page 8). Capability Study - The control charts provide an excellent means for diagnosing the rational subgrouping process. Use normal techniques for analysis of this chart (Page 54). The capability indices on the right provide tabular results of the study. The bar diagrams at the bottom of the report show comparative graphics of the short term and long term process predictions. 18 Normality of Data Factorial Designs Factorial Designs are primarily used to analyze the effects of two or more factors and their interactions. Based on the level of risk acceptable, experiments may be either full factorial, looking at each factor combination , or fractional factorial, looking at a fraction of the factor combinations. Fractional Factorial experiments are an economical way to screen for vital X’s. They only look at a fraction of the factor combinations. Their results may be misleading because of confounding, the mixing of the effect of one factor with the effect of a second factor or interaction. In planning a fractional factorial experiment, it is important to know the confounding patterns, and confirm that they will not prevent achievement of the goals of the DOE. Probability Data from many processes can be approximated by a normal distribution. Additionally, The Central Limit Theorem states that characteristics which are the average of individual values are likely to have an approximately normal distribution. Prior to characterizing your project Y, it is valuable to analyze the data for normality to confirm whether the data does follow a normal distribution. If there is strong evidence that the data do not follow a normal distribution, then predictions of future performance should not be made using the normal distribution. Normal Probability Plot Use “Stat>Basic Stats>Normality Test” (Fig 1) (Page 21) or “Stat>Basic Stat>Descriptive Statistics” (Fig 2) (Page 21) .999 .99 with “Graphs>Graphical Summary” checked. .95 If using “Normality Test”, the default is .80 “Anderson-Darling”. Use that test for most .50 .20 investigations. Use other tests with caution. .05 For example, Kolmogorov-Smirnov is actually a .01 less sensitive test. .001 26 36 46 56 66 76 86 96 106 Normal Anderson-Darling Norm A-Squared: 0.418 P-Value: 0.328 Average: 70.0000 StDev: 10.0000 N: 500 Fig 1 Descriptive Statistics Variable: Normal Anderson-Darling Normality Test 0.418 A-Squared: 0.328 P-Value: 35.0 47.5 60.0 72.5 85.0 70.0000 10.0000 100.000 -5.0E-02 0.393445 500 Mean StDev Variance Skewness Kurtosis N 97.5 29.824 63.412 69.977 76.653 103.301 Minimum 1st Quartile Median 3rd Quartile Maximum 95% Confidence Interval for Mu 95% Confidence Interval for Mu 70.879 69.121 69 70 71 95% Confidence Interval for Sigma 10.662 9.416 95% Confidence Interval for Median 95% Confidence Interval for Median 70.737 69.021 Fig 2 Xbar/R Chart for Mystery 15 14 13 12 11 10 90 Means 80 70 60 50 1 1 1 11 11 1 111 1 11 11 1 1111 1111111111111111111111 1111111 1 1 11111111111 11111111111111111111111 111 1111 1 1 1 1 111 Subgroup 0 50 100 1 50 40 30 20 Ranges1 0 3.0SL=48.2 R=22.83 - Fig 3 17 3.0SL=113. X=100. - The test statistic for primary use in analyzing the test results is the P value. The null hypothesis, Ho,states that the process is normal, so if the p value <.05, then there is evidence that the data do not follow a normal distribution. If the process shows non-normality, either there are special causes of variation that cause the nonnormality, or the common cause variation is not normal. Analyze first for special cause. Use Stat>Control Charts (Fig 3)(Page 49)or Plot>Time Series Plot (Page 24) to look for “out of control” points or drifts of the process over time. Try to determine the cause of those points and separate, or stratify, the data using that knowledge. If the levels of X’s have been captured, use graphics to aid in visualizing the process stratified by the X’s. If the data can be stratified and within the strata the data is normal, the process can be characterized at the individual levels and perhaps characterized using the Product Report (page 15). The discovery of a special cause contributing to nonnormality may lead to improving the process. If the common cause variation is non-normal, it may be possible to transform the data to an approximately normal distribution. MINITAB provides such a tool in “Stat>Control Charts>Box-Cox Transformation” (Page 24). Additional notes on data transformation can be found in the Quality Handbook; Juran Chap 22. Fig 1 Fig 2 To create a Factorial Experiment using MINITAB, select STAT>DOE>CREATE FACTORIAL DESIGN. In the dialog box (Fig 1) select the Number of Factors and then the Designs Button. If the number of factors allows both a fractional and full factorial design, the Designs dialog box (Fig 2) will show the available selections including both full and fractional designs. Resolution, which is a measure of confounding is shown by each displayed design. While in this dialog box identify the number of replicates and blocks to be used in the design. Select OK to return to the initial dialog box. Select Options. In the Options dialog box select Randomize Run if planned. Finally, select the Factors button and in that dialog box, name the factors being studied and the factor experimental levels. Click OK twice to generate the completed design. The design will be generated on the MINITAB worksheet as shown in Fig 3. An analysis of the design, including the design Resolution and confounding will be generated in the MINITAB Session Window. Now run the experiment and collect the data. Record run data in a new column in same row as run factor settings. ` Fig 3 54 Characterizing the Process - DOE Analysis Rational Subgrouping Analysis of DOE’s includes both graphical and tabular information. Once the data for the experimental runs has been collected and entered in the MINITAB worksheet, analyze with STAT>DOE>ANALYZE FACTORIAL DESIGN. In the ANALYZE FACTORIAL DESIGN dialog box, identify the column(s) with the response data in the Responses box. Select the GRAPHS button. In the GRAPHS dialog box, select PARETO for the effects plots and change ALPHA (α level of significance) to .05. Click OK twice. Note that we have not used the other options buttons at this time. Leave Randomize at default settings. The initial analysis provides a session window output and a Pareto graph. To separate the measurement of Z.ST and Z.LT and understand fully how the process operates, capture data in such a way to see both short term variation, inherent to the technology being used, and long term variation, which reflects the variation induced by outside influences. The process of collecting data in such a manner is called “Rational Subgrouping”. Analyzing Rational Subgroups allows analysis of “centering vs. spread” and “control vs. technology.” Pareto Chart of the Effects (response is PCReact, Alpha = .05) A: B: C: D: E: B D BD Steps to characterize a process using Rational Subgroups 1. Work with operational advocacy team to define the factors (X’s) suspected as influential in causing output variation (Y). Confirm which of these factors are operationally controllable and which are environmental. Prioritize and understand the cycle time for sensing the identified factors. Be sure to question the effect of elements of all the 5M’s of process variation: Machine - Technology; Maintenance; Setup Materials - Batch/Lot/Coil Differences Method - MTS; Workstation layout; Operator method Manpower - Station Rotation; Shift Changes; skill levels Measurement - R&R; Calibration effects Environment – Weather; Job Site or shop Feedrate Catalyst Agitate Tempera Concentr DE E CE A BC AB BE AE AD AC CD C 0 10 20 Analysis of the DOE requires both graphic and model analysis, however, the model should be generated and analyzed before full graphic analysis can be completed. An analysis of the Fit Model in the MINITAB Session window shows the amount of effect and the model coefficients. Most important though is the ANOVA table. This table may show the significant factors or interactions (See Balanced ANOVA Page 40). In this case, the F score is shown as “**” and there are no “p” values. This indicates that the model as defined is too complex to be analyzed with the amount of data points taken. The model needs to be simplified. The Pareto graphic is a helpful tool for that. Note that effects B,D and E and interactions BD and DE show as significant effects. Remaining non-significant effects can be eliminated. 2. Define a data collection plan over time that captures data within each subgroup taken over a period of time short enough that only the variation inherent to the technology occurs. Subgroup size can be anything greater than two (2). Two measured data points are necessary to see subgroup variation. Larger subgroups provide greater sensitivity to the process changes, so the choice of subgroup size must be made to balance the needs of the business and the need for process understanding. This variation is called “common cause” and represents the best the process can achieve. In planning data collection use of The Planning Questions (Page 19) is helpful. 3. Define the plan to allow for collection of the subgroups over a long period of time which allows the elements of long term variation and systematic effects of potentially important variables to influence the subgroup results. Do not tweak, or purposely adjust the process, but rather recognize that the process will drift over time and plan the data collection accordingly. Rerun STAT>DOE>ANALYZE FACTORIAL DESIGN . This time select the TERMS option button. In the dialog box, deselect the terms not shown as significant in the Pareto. Click OK. Select STORAGE and select RESIDUALS and FITS. Click OK twice. The resulting ANOVA table shows the significance of the factors and the model coefficients are provided. Next, Run STAT>DOE>FACTORIAL PLOTS. Select and setup each of the plots, MAIN EFFECTS, INTERACTIONS and CUBE as follows. Identify the response column in the RESPONSES box. Select only the significant factors to be included in the plot. Click OK twice to generate the plots. Confirm the significance of effects and interactions graphically using MAIN EFFECTS and INTERACTIONS plots. Use the CUBE PLOT to identify the select factor levels for achieving the most desirable response. 4. Capture data and analyze data using Control Charts (Page 53 - 55) and 6 Sigma Process Report (Page 18) during the data collection period. Stay close to the process and watch for data shifts and causes for the shifts. Capture data documenting the levels of the identified vital X’s. This data may be helpful in analyzing the causes of process variation. During data collection it may be helpful to maintain a control chart or some other visual means of sensing process shift. Interaction Plot for PCReact Main Effects for PCReact 1 2 0 14 0 18 14 0 3 6 180 Cube Plot - Means for PCReact 6 3 Catalyst 75 47.0 80.0 2 PCReact 70 1 80 65 64.5 94.0 Temperat 180 60 emperat 140 5. Capture sufficient subgroups of data to allow for multiple changes in all the identified vital X’s and also to allow for a stable estimate of the mean and variation in the output variable (Y). See 6 Sigma Process Report (Page 18) for explanation of graphic indicator of estimation stability. 66.0 55.5 6 55 Catalyst Temperat Concentr Concentr Concentr 55 40 62.0 53.0 3 1 2 Catalyst 16 Six Sigma Product Report The Six Sigma Product Report “Six Sigma>Product Report” is used to calculate and aggregate Z values from discrete data and data from multiple normal processes. Enter “# defects”, “# units” and “# opportunities” data in separate columns in MINITAB. When Z shift is included in the calculation (1.5 default) the reported Z bench is short term. If zero is entered, the reported Z.bench is long term. Defect count - Enter the actual defects recorded in the sample population. If using defect data from Continuous Process Study, use PPM for long term. If this report is a rollup of subordinate processes, use the defect count from the subordinate process totals. Units - Enter the actual number of parts included in the sample population evaluated. If using data from Continuous Process Study, use 1,000,000. If this report is a rollup of subordinate processes, use the actual number of parts included in the sample population evaluated. Opportunities - At the lowest level, Report 7: Product Performance use one (1) for the number of opportunities. One (1) is the number of CTQ’s characterized at the lowest level of analysis. If this report is a rollup of subordinate processes, use the total number of opportunities accounted for in the subordinate process. Characteristics (Optional) Enter the test name for the Characteristic, CTQ or subprocess. Shift - Process ZSHIFT can be entered three ways. If the report is an agregate of a number of continuous data based studies, for example, a part with multiple CTQ’s, the ZSHIFT data can be entered in the worksheet as a separate column and refered to in the Product Report dialog box. A fixed ZSHIFT of 1.5 is the default and will be used if nothing is specified. A ZSHIFT of zero (0) will produce a report that shows only the long-term results. As the levels are rolled up, the data from the totals in the subordinated processes will become line items in the higher level breakdown. In the chart above, the process reported includes data from 12 subprocesses. Door assembly South includes a process that included six (6) CTQ’s characterized. Analyzing the report The far right hand column of the report shows the Z.Bench for the individual processes and for the Cumulative Z.Bench. The number at the bottom of the DPO column, in this case 0.081917, reports the P (d), probability of a defect at the end of the line. Defs Units Opps TotOpps DPU DPO PPM 46332 66636 3 199908 0.695 0.231767 231767 1.500 2.233 2174 66636 1 66636 0.033 0.032627 32627 1.500 3.344 Sealed SystemHigh 554 66636 2 133272 0.008 0.004157 4157 1.500 4.139 Sealed SystemLow 3540 66636 3 199908 0.053 0.017708 17708 1.500 3.604 C84 3643 66636 1 66636 0.055 0.054667 54667 1.500 3.101 C85 1947 66636 1 66636 0.029 0.029223 29223 1.500 3.392 37052 66636 6 399816 0.556 0.092673 92673 1.500 2.824 811 66636 1 66636 0.012 0.012174 12174 1.500 3.752 14869 66636 1 66636 0.223 0.223144 223144 1.500 2.262 C87 2901 66636 1 66636 0.044 0.043534 43534 1.500 3.211 C90 1544 66636 1 66636 0.023 0.023166 23166 1.500 3.492 C91 4721 66636 1 66636 0.071 0.070852 70852 1.500 2.969 0.081917 81917 1.500 2.892 CG Case C83 Door Assy South C86 Plastics Total 120089 1465992 Central Composite Design (CCD) Response Surface analysis is a type of Designed Experiment that allows investigation of non-linear relationships. It is a tool for fine tuning process optimization once the region of optimal process conditions is known. Using the CCD type RS Design, you will be designing an experiment that tests each factor at five levels, and an experiment which can be used to augment a factorial experiment that has been completed. The CCD design will include FACTORIAL points, STAR points and CENTER points. Start by Running STAT>DOE>CREATE RS DESIGN . Select CENTRAL COMPOSITE from the design type choices in the dialog box. Identify the number of factors to be studied and click the DESIGN button. In the DESIGN dialog box, select the experiment design desired, including the blocks. Click OK and then select the FACTORS button. In that dialog box identify the factors and their high and low factorial settings and click OK. Randomize runs is found in the OPTIONS dialog box. Click OK to generate the design. The design will be placed on a new worksheet. Collect data for each of the scheduled trials defined by the design. Note that there will be multiple points run at the centerpoint of each factor and there will be star points for each factor beyond the factor ranges identified in the design. ZShift ZBench Analyze the data using STAT>DOE>ANALYZE RS DESIGN. In the dialog box identify the response column. Leave the Use Coded Units selected and choose the appropriate setting for the USE BLOCKS box, depending on plan. Click OK and run. The resulting output is a combination of the Regression Output (Page 43) and the ANOVA output (PAGE 41). The regression output analyzes how the individual factors and interactions fit the model. The ANOVA table will analyze the type of relationship and also the total fit of the model. If “Lack of Fit” error is significant, another model may be appropriate. Simplify the model for terms and regression complexity as appropriate. See DOE Analysis (Page 51). Rerun STAT>DOE>ANALYZE RS DESIGN and select the TERMS button. Before rerunning the simplified analysis, select STORAGE and select FITS and RESIDUALS. Continue simplification and tabular analysis to attempt to find a simple model that explains a large portion of the variation. Confirm regression fit quality using Residual Plots (Page 22). The terms in the ANOVA Matrix should show significance, except that “Lack of Fit” term should become insignificant (p>.05). Next run STAT>DOE>RSPLOTS. Select either CONTOUR or SURFACE plot and SETUP for the selection. In the SETUP dialog box, confirm that the appropriate factors are included for the plot, noting that each plot will have only the factor pair shown. Check that the plot is displayed using UNCODED units and run. Use the graphic generated to visually analyze for optimal factor setting or use the model coefficients and solve for the optimal settings mathematically. Contour Plot of strength 30 35 40 95 Composition Characteristic Response Surface strength 30 85 20 24.5 25.5 26.5 27.5 28.5 29.5 Volume 30.5 31.5 75 15 40 25 26 27 28 Volume 29 30 31 75 85 95 Compositio 56 Control Charts Control charts are a practical tool for detecting product and/or process performance changes in and R over time in relation to historical performance. Since they are a rigorous maintenance tool, control charts should be used as an alternate to closed loop process control, such as mechanical sensing and process adjustment. Common and special-caused variation can be seen in rationally subgrouped samples: • common-cause variation characterized by steady state stable process variation (captured by the within subgroup variation). • special-cause variation characterized by outside assignable causes on the process variation (captured by the “between subgroup variation). • Control Chart Analysis signals when the steady state process variation has been influenced by outside assignable causes. Variables Control Charts Variable Control Charts are used in pairs. One chart characterizes the variation of subgroup averages, and the other chart characterizes the variation of the spread of the subgroups. Individual Charts (X/Moving Range): These charts are excellent for tracking long term variation changes. Because they use a single measurement for each data point, they are not a tool of choice where measurement variation is involved, such as with part dimensions. They work well with temperatures, pressures, concentration, etc. Subgroup Charts (XBar R or Xbar S): These charts are excellent for tracking changes in short term variation as well as variation over time. They require multiple measurements (two or more) in each subgroup. Using rational subgroup techniques with this chart enables graphic analysis of both short term variation changes (Range or S) and long term variation (X Bar chart). This chart is the chart of choice where measurement variation is involved. It is also an excellent tool for tracking processes during baselining or rebaselining, since it assists in pointing to special cause influence on results. Because there is usually no change in temperature, pressures or concentration in the short term, they are not used for that type of measurement. Rolled Throughput Yield Rolled Throughput Yield (YRT) is the probability of completing all the opportunities in a process without a defect. As such, it is a tool which can focus the investigation when narrowing down the problem from a larger business problem. In a process which has 18 stations, each with 5 opportunities, and DPO = 0.001, the YRT is .9139, calculated as follows: # Opportunities Y = ((Yield) ) RT Y = (.999 ) = (.995) = .91389 RT # Stations Station 5 18 18 In addition to the straight multiplication method of calculating YRT YRT= Y1 x Y2 x Y3 ......x YN Where Y1, Y2, Y3....YN are yields of individual stations or operations in a process. YRT can also be estimated using the Poisson Approximation Y =e − DPU And conversely RT DPU ≅ −Ln(Y ) rt Attribute Charts Attribute Control Charts are a single chart. The common difference between these charts is whether they track proportion, a ratio, or defects, a count. Proportion Defective Charts (P charts): This chart tracks proportion. The data point plotted is the ratio Number of defects/Number of Pieces Inspected . In using proportion defective charts, the number of pieces in a sample can vary, and the control limit for the chart will vary based on that sample size. Number Defective Charts (nP Charts): This chart tracks defect count. The data point plotted is the number of defects in a sample. Because the data point is a number relative to a sample size, it is important that the sample size be relatively constant between samples. The sample size should be defined so that the average number of defects is at least five in order for this chart to be effective. In setting up Control Charts, use the Planning Questions (Page 19) first. Those questions along with these notes will help define the type of chart needed. Use SETCIM, MINITAB, SPQ (Supplier Process Quality) or other electronic methods for long term charting. 57 Normalized Average Yield Normalized Average Yield (YNA) is the average yield of one opportunity. It answers the question “What is the probability that the output of this process will meet the output requirements?” The YNA of a process is an average defect rate, and can be used for comparing processes with differing levels of complexity. Y = (Y ) NA 1 Opportunities RT Normalized Average Yield (YNA), is the probability of good product, so if we calculate 1-YNA, we can find the Probability of a defect, P(d). With this we can find the Z.LT score for a process. P(d) = 1 − YNA 14 DPU / DPO DPU DPU is the number of defects per unit produced. It’s an average. This means that on the average, each unit produced will have so many defects. Interpreting Variables Control Charts UCL A DPU gives us an index of quality generated by the effects of process, material, design, environmental and human factors. Keep in mind that DPU measures symptoms, not problems. (It’s the Y, not the X’s). B C DPU = (# Defects) / (# units) [DPU is the average number of defects in a unit] X DPU forms the foundation for Six Sigma. From DPU and a knowledge of the opportunities, we can calculate the long term capability of the process. C B LCL Opportunity An opportunity is anything you measure, test or inspect. It may be a part, product or service CTQ. It can be each of the elements of an assembly or subassembly. A Rule 1Rule 2 Rule 3 Rule 4 Rule 5 Note: A, B, and C represent plus and minus one, two Note: A,sigma B, andzones C represent plus and process minus one, two and three from the overall average. and three sigma zones from the overall process average. DPO A lack of control (“out of control”) is indicated when one or more of the following rules apply to your chart data: DPO is the number of defects per opportunity. It is a probability. Total Opportunities =# Units * # Opportunities Unit [DPO is the probability of a defect on any one CTQ or step of a process] Defects per Opportunit y (DPO) = #Defects DPU = Opportunit ies Opportunit ies Units * unit Yield = 1-DPO DPO is the foundation for determining the Z value when using discrete data. To find Z, given DPU, convert DPU to DPO. Then look up the P(d) for DPO in the body of the Z table. Convert to Z score (page 7). 13 unit 1. A single point above or below a control limit 2. Two out of three consecutive points are on the same side of the mean, in Zone A or beyond 10 / 11 points above Mean 12 / 14 points above Mean 3. Four out of five consecutive points are on the same side of the mean, in Zone B or beyond 4. At least eight consecutive points are on the same side of the mean, in Zone C. 5. 7 points in a row trending up or 7 points in a row trending down 6. 14 points sequentially alternating up then down then up, etc.. 7. 14 points in a row in Zone C on both sides of the mean. 8. 8 points in a row alternating in Zone B or beyond. 58 Control Chart Constants Analysis Criteria. Variables Control Chart Control Limit Constants n 1 2 3 4 5 6 7 8 9 10 A2 2.660 1.880 1.023 0.729 0.577 0.483 0.419 0.373 0.337 0.308 A3 3.760 2.659 1.954 1.628 1.427 1.287 1.182 1.099 1.032 0.975 D3 0 0 0 0 0 0.076 0.136 0.184 0.223 Average/Range Chart X= ( X1 + X 2 + . . .+ X k ) , where X = k ( R + R2 + . . .+ Rk ) R= 1 k D4 3.267 2.575 2.282 2.115 2.004 1.924 1.864 1.816 1.777 B3 0 0 0 0 0.03 0.118 0.185 0.239 0.284 B4 3.267 2.568 2.266 2.089 1.970 1.882 1.815 1.761 1.716 d2 1.128 1.693 2.059 2.326 2.534 2.704 2.847 2.970 3.078 c4 0.7979 0.8862 0.9213 0.9400 0.9515 0.9594 0.9650 0.9693 0.9727 Individual X /Moving Range Chart n Xi å i =1 n UCLx = X + A2 R and LCLx = X − A2 R UCLR = D4 R and LCLR = D3 R X= Rm ( X1 + X 2 +...+ X k ) k = ( X i +1 − X i ) , and Rm = (R1 + R2 +...+Rk −1) k −1 CLx = X ± E2 Rm UCLRm = D4 Rm and LCLRm = D3 Rm np Charts np = # defective for each subgroup np np = , for all k subgroups k ( ) np(1 − p ) UCLnp = np + 3 np 1 − p UCLnp = np − 3 p Charts np = number of defectives n= subgroup size N= total number defectives for all subgroups p= LCLp = p − 3 Desirable system will have a Gage R&R Discrimination >5. • The system is acceptable if Gage R&R >10%, but < 20% and discrimination categories =5. • If Gage R&R is > 20%, but < 30% and Categories of Discrimination =4, the decision about acceptability will be based on importance of measuring the characteristic and business cost. <10% and Categories of • If Gage R&R is >30%, or the Categories of Discrimination < 15% 4, the measurement system is not considered acceptable and needs to be improved. MINITAB Analysis Outputs MINITAB provides a tabular and graphical output. The tabular output has three tables; the first an ANOVA table (see ANOVA Interpretation; Page 37). The second table provides raw calculated results of the study and the third table provides the percent contribution results. Interpretation of Gage R&R results is focused on the third table. The third table displays the “% Contribution” and “% Study Variation”. “% Contribution” and “% Study Variation” figures are interpreted as Gage R&R. If you have included a tolerance range with the “Options” button, this table will also report a “% Tolerance” result. The Number of Distinct Categories is also provided. This number indicates how many classifications can be reliably distinguished given the observed process variation. The graphical analysis provides several important graphic tools. • np np and p = n N UCLp = p + 3 • p(1 − p) n p(1 − p) n • The control chart should appear out of control. Operator to operator variation defines control limits. If the gage has adequate sensitivity beyond its own noise, more than 50% of the points will be outside the control limits. If this is not the case, the system is inadequate to detect part-topart variations. • The range chart should be in control, showing consistency between the operators. If there are only two or three distinct ranges recorded, it may indicate lack of gage resolution. The column chart shows the graphic picture of data provided in table three of the tabular report. The graphics on the right show various interaction patterns that may be helpful in troubleshooting a problem measurement system. (1) Measurement Systems Analysis Reference Manual; ©AIAG 1994 59 12 Gage R&R (1) Precontrol What it is: Gage R&R is a means for checking the measurement system (gage plus operator) to gain a better understanding of the variation and sources from the measurement system. * 515 . or 5.15* EV 2 + AV 2 Gage R&R= m σ Where σm =Measurement System standard deviation Components of Measurement Error • Repeatability = Equipment Variation (EV): The variation in measurements attributable to one measurement instrument when used several times by one appraiser to measure the identical characteristic on the same part. • Reproducibility = Appraisal Variation (AV): The variation in measurements attributable to different appraisers using the same measurement instrument to measure the same characteristic on the same part. Why use it? Provide ongoing visual means of on-the-floor process control. What does it do? • • Gives operators decision rules for continuing or stopping production. Rules are based on probability that population mean has shifted. How do I do it? 1. Establish control zones: -3.0 s How to do the gage R&R study. 1. Determine how the gage is going to be used; i.e., Product Acceptance or Process Control. Gage must have resolution 10X finer than the process variation it is intended to measure. (i.e., measurement of parts with process variation of .001 requires a gage with .0001 resolution) 2. Select approximately ten parts which represent the entire expected range of the process variation, including several beyond the normally acceptable range. Code (blind) the parts. 3. Identify two or three Gage R&R participants from the people who actually do the measurement. Have them each measure each part two or three times. The measurements should by done with samples randomized and blinded. 4. Record results on a MINITAB worksheet as follows: a) One Column - Coded Part Numbers (PARTS) b) One Column - Appraiser number or name (OPER) c) One Column - Recorded Measurement (RESP) 5. Analyze using MINITAB by running “Stat>Quality Tools>GageR&R” a) In the initial dialog box choose ANOVA method. b) Identify the appropriate columns for “PARTS”, “OPERATOR”, and “MEASUREMENT Data “. c) If you wish to include the analysis for process tolerance, select the “OPTIONS” button. This is only to be used if the gage is for pass fail decisions only, not for process control. d) If you wish to show demographic information on the graphic output, including gage number, etc, select the “Gage 11 Information” button. Red Yellow .07 -1.5 s µ Green .86 +1.5 s +3.0 s Yellow .07 Red 2. When five parts in a row are “green”, the process is qualified. 3. Sample two consecutive parts on a periodic basis. 4. Decision rules for operators: A. If first part is green, no action needed, continue to run. B. If first part is yellow, then check a second part. » If second part is green, no action needed. » If second part is yellow on same side, then adjust » If second part is yellow on opposite side, stop, call support engineer. C. If any part is red, stop, call support engineer. 5. After correcting and restarting a process, must achieve 5 consecutive “green” samples to re-qualify. 60 Project Closure At Closure, the project must be positioned so that the changes made to the process are sustainable over time. Doing so requires the completion of a number of tasks. 1. The improvement must be fully implemented, with leverage factors identified and controlled. The process must have been re-baselined to confirm the degree of improvement. 2. Process owners must be fully trained and running the process, controlling the leverage factors and monitoring the Response (Y). 3. Required Quality Plan and Control Procedures, drawings, documents, policies, generated reports or institutionalized rigor must be completed . • • • • • • Workstation Instructions Job Descriptions Preventive Maintenance Plan Written Policy or controlled ISO documents Documented training procedures Periodic internal audits or review meetings. 4. The project History Binder must be completed which records key information about the project work in hard copy. Where MINITAB has been used for analysis, hard copies of the generated graphics and tables should be included. • Initial baseline data • Gage R & R calculations • Statistical characterization of the process • DOE (Design of Experiments) • Hypothesis testing • Any data from Design Change Process activities (described on the next page), Failure Modes and Effects Analysis (FMEA), Design for Six Sigma (DFSS), etc. • Copies of engineering part and tooling drawing changes showing “Z” score values on the drawings. • Confirmation run data • Financial data (costs and benefits) • Final decision on improvement and conclusions • All related quality system documents • A scorecard (with frequency of reporting) • Documented control plan Data Validity Studies Non Measurement data is that which is not the result of a measurement using a gage. Examples: • Finance data (T&L cost; Cost & Benefits; Utility Costs; Sales, etc.) • Sales Data (Units sold; Items purchased, etc.) • HR data (Employee Information; medical service provider information) • Customer Invoice Data Samples of data should be selected to assure they represent the population. A minimum of 100 data points is desirable. The data is then analyzed for agreement by comparing each data point (as reported by the standard reporting mechanism) to its true observed value. The validity of the data is reported as % Agreement. ö÷ × 100 % Agreement = æç Number of Agreements Number of Observatio ns è ø % Agreement should be very good. Typically this measure is much greater than 95%. % Agreement for Binary (Pass / Fail) Data Calculate % Agreement in similar manner to Non Measurement, except using the following equation. ö ×100 % Agreement = æç Number of Agreements Number of Opportunities ÷ø è Where the number of opportunities is found by the following equations. n = total number of assessments per sample s = number of samples 5. All data entries must be complete in PROTRAK • Response Variable Z scores at initial Baselining • Response Variable Z scores at Re-baselining, • Project Definition • Improvements Made • Accomplishments, Barriers and Milestones for all project phases • Tools used for all project phases. If n is odd, then If n is even, then æ −1ö ÷ # Opportunit ies = s × ç n ç 4 ÷ è ø 2 æ 2ö # Opportunit ies = s × ç n ÷ ç 4 ÷ è ø 6. Costs and Benefits for the project must be reconfirmed with site finance. • Overall % Agreement = Agreement rate for all opportunities 7. Investigate potential transfer opportunities where project lessons learned can be applied to other business processes. • Repeatability % Agreement = Compare the assessments for one operator over multiple assessment opportunities. (Fix this problem first) • Reproducibility % Agreement = Compare assessments of the same part from operator to operator. 8. Submit closure package for signoff through the site approval channels. 61 10 Sample Size Fulfillment & Span Fulfillment º Providing what the customer wants when the customer wants it Fulfillment is a highly segmented metric and typically does not follow a normal distribution. Because the data is non-normal some of the traditional 6 Sigma tools should not be used (such as the 6 Sigma Process Report). Therefore, Median and Span will be used to measure Fulfillment. Median º the middle value in a data set Span º The difference between two values in the data set (e.g. 1/99 Span = the difference between the 99th percentile and the 1st percentile) We don’t want our decision to be influenced by a single data point. Therefore, the Span calculation is dependent on the sample size. Larger data sets will have a wider span. Following are corporate guidelines on the Span calculation: Sample Size 100-500 500-5000 >5000 Span 10/90 Span 5/95 Span 1/99 Span Example Ø A sample of 100 delivery times has a high value of 40 days Ø If that one value had instead been 30 days, the 1/99 span would change by 10 days Ø 10/90 span is not affected by what happens to that highest point In order to analyze a fulfillment process, the data should be segmented by the variables that may affect the process. Each segment of data should be compared to identify if the segmenting factor had an influence on the Median and the Span. Mood’s Median test is a tool that can be used to identify significant differences in Median. Factors that are identified as having an influence on Span and Median, should be evaluated further through designed experimentation. 9 α = 20% α = 5% α = 1% α = 10% β 20% 10% 5% 1% 20% 10% 5% 1% 20% 10% 5% 1% 20% 10% 5% 1% δ/σ 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 225 100 56 36 25 18 14 11 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 9 7 6 5 5 4 4 3 3 2 13 11 9 8 7 6 5 5 4 4 17 14 12 10 9 8 7 6 5 5 26 22 18 15 13 12 10 9 8 7 12 10 9 7 6 5 5 4 4 3 17 14 12 10 9 8 7 6 5 5 22 18 15 13 11 10 8 7 7 6 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2 2 2 2 2 1 1 1 1 1 3 3 3 2 2 2 2 2 2 2 4 4 4 3 3 3 3 2 2 2 7 6 5 5 5 4 4 4 3 3 3 3 3 2 2 2 2 2 2 1 4 4 4 3 3 3 3 2 2 2 3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 1 1 1 1 1 1 1 3 3 3 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 1 1 1 1 1 1 1 328 428 651 146 190 289 82 107 163 53 69 104 36 48 72 27 35 53 21 27 41 16 21 32 309 137 77 49 34 25 19 15 428 541 789 190 241 350 107 135 197 69 87 126 48 60 88 35 44 64 27 34 49 21 27 39 392 174 98 63 44 32 25 19 525 234 131 84 58 43 33 26 650 289 162 104 72 53 41 32 919 408 230 147 102 75 57 45 584 260 146 93 65 48 36 29 744 331 186 119 83 61 46 37 891 1202 396 534 223 300 143 192 99 134 73 98 56 75 44 59 32 26 22 19 16 14 12 11 10 9 16 13 11 9 8 7 6 5 5 4 21 17 15 12 11 9 8 7 6 6 26 21 18 15 13 12 10 9 8 7 37 30 26 22 19 16 14 13 11 10 23 19 16 14 12 10 9 8 7 6 30 25 21 18 15 13 12 10 9 8 36 29 25 21 18 16 14 12 11 10 48 40 33 28 25 21 19 17 15 13 5 5 4 4 4 3 3 3 3 3 8 7 7 6 5 5 5 4 4 4 4 4 3 3 3 3 2 2 2 2 5 5 4 4 4 3 3 3 3 2 6 6 5 5 5 4 4 4 3 3 9 8 8 7 6 6 5 5 5 4 6 5 5 4 4 4 3 3 3 3 7 7 6 6 5 5 4 4 4 4 9 8 7 7 6 6 5 5 5 4 12 11 10 9 8 8 7 7 6 6 2 2 2 2 2 2 2 2 1 1 1 4 3 3 3 3 3 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 1 1 1 3 3 3 2 2 2 2 2 2 2 2 4 4 4 3 3 3 3 3 3 2 2 3 2 2 2 2 2 2 2 2 2 1 3 3 3 3 3 2 2 2 2 2 2 4 4 3 3 3 3 3 3 2 2 2 5 5 5 4 4 4 4 4 3 3 3 62 α=.05 F Distribution Z - An Important Measure Numerator Degrees of Freedom Denom DF 1 2 3 4 5 6 7 8 9 10 1 161.40 199.50 215.70 224.60 230.20 234.00 236.80 238.90 240.50 241.90 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 2 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 3 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 4 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 5 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 6 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 7 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 8 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 9 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 10 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 11 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 12 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 13 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60 14 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 15 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 16 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45 17 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 18 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38 19 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35 20 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32 21 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30 22 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27 23 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25 24 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24 25 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22 26 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25 2.20 27 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19 28 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22 2.18 29 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16 30 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08 40 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99 60 3.92 3.07 2.68 2.45 2.29 2.17 2.09 2.02 1.96 1.91 120 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83 ∞ Z Short Term Z .ST = Z.st describes how the process performs at any given moment in time. It is referred to as “instantaneous capability,” “short-term capability” or “process entitlement”. It is used when referring to the “SIGMA” of a process. It is the process capability if everything is controlled so that only background noise (common cause variation) is present. This metric assumes the process is centered and the data were gathered in accordance to the principals and spirit of a rational subgrouping plan (p. 14). The “Target” assumes that each subgroup average is aligned to this number, so that all subgroup means are artificially centered on this number. The sst used in this equation can be estimated by the square root of the Mean Square Error term in the ANOVA Table. Since it is centered data, it can be calculated from either one of the Specification Limits (SL). Z Long Term Minimum of Z. LTUSL = or Z. LTLSL = USL − µ σ LT µ − LSL σ LT Z Shift Z.SHIFT = Z.ST − Z.LT Z Benchmark Z.LT describes the sustained reproducibility of a process. It is also called “long-term capability.” It reflects all of the sources of operational variation, the influence of common cause variation, dynamic nonrandom process centering error, and any static off-set present in the process mean. This metric assumes the data were gathered in accordance to the principals and spirit of a “rational sampling” plan (p. 14). This equation is applicable to all types of tolerances. It is used to estimate the long-term process “PPM.” Z.SHIFT describes how well the process being measured is controlled over time. It reflects the difference between the short term and long term capability. It focuses on the dynamic nonrandom process centering error, and any static off-set present in the process mean. Interpretation of the Z.shift is only valid when following the principles of rational subgrouping (p.14) While the Z values above are all calculated in reference to a single spec limit, Z Benchmark is the Z score of the summation of the probabilities of defects in both tails of the distribution. To find, sum the Probability of defect at the Lower Spec Limit (PLSL ) and the Probability of defect at the Upper Spec Limit (PUSL ). Look up the sum of the combined probabilities in a normal table to find the corresponding Z value. Z. Benchmark 63 SL − Target sST = Z score ( PUSL + P LSL ) 8 The Standard Normal Curve F Distribution Numerator Degrees of Freedom ** Area under the curve = 1, the center is 0 ** Mean The Z value is a measure of process capability and is often referred to as the “sigma of the process.” A Z = 1 indicates a process for which the performance limit falls one standard deviation from the mean. If we calculate the standard normal deviate for a given performance limit and discover that Z = 2.76, the probability of a defect (P(d)) is the probability of a point lying beyond the Z value of 2.76. Denom DF µµ Point of Inflection Probability of a Defect Performance Limit Example = .00289 1σ Z=1 Total Area = 1 Z Units of Measure Z=0 Z=1 This table lists the tail area to the right of Z. Copyright 1995 Six Sigma Academy, Inc. 7 Z Area Z Area Z Area Z Area 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70 0.75 0.80 0.85 0.90 0.95 1.00 1.05 1.10 1.15 1.20 1.25 1.30 1.35 1.40 1.45 1.50 .500000000 .480061306 .460172290 .440382395 .420740315 .401293634 .382088486 .363169226 .344578129 .326355105 .308537454 .291159644 .274253121 .257846158 .241963737 .226627465 .211855526 .197662672 .184060243 .171056222 .158655319 .146859086 .135666053 .125071891 .115069593 .105649671 .096800364 .088507862 .080756531 .073529141 .066807100 1.51 1.56 1.61 1.66 1.71 1.76 1.81 1.86 1.91 1.96 2.01 2.06 2.11 2.16 2.21 2.26 2.31 2.36 2.41 2.46 2.51 2.56 2.61 2.66 2.71 2.76 2.81 2.86 2.91 2.96 3.01 .065521615 .059379869 .053698886 .048457216 .043632958 .039203955 .035147973 .031442864 .028066724 .024998022 .022215724 .019699396 .017429293 .015386434 .013552660 .011910681 .010444106 .009137469 .007976235 .006946800 .006036485 .005233515 .004527002 .003906912 .003364033 .002889938 .002476947 .002118083 .001807032 .001538097 .001306156 3.02 3.07 3.12 3.17 3.22 3.27 3.32 3.37 3.42 3.47 3.52 3.57 3.62 3.67 3.72 3.77 3.82 3.87 3.92 3.97 4.02 4.07 4.12 4.17 4.22 4.27 4.32 4.37 4.42 4.47 4.52 .001263795 .001070234 .000904215 .000762175 .000640954 .000537758 .000450127 .000375899 .000313179 .000260317 .000215873 .000178601 .000147419 .000121399 .000099739 .000081753 .000066855 .000054545 .000044399 .000036057 .000029215 .000023617 .000019047 .000015327 .000012305 .000009857 .000007878 .000006282 .000004998 .000003968 .000003143 4.53 4.58 4.63 4.68 4.73 4.78 4.83 4.88 4.93 4.98 5.03 5.08 5.13 5.18 5.23 5.28 5.33 5.38 5.43 5.48 5.53 5.58 5.63 5.68 5.73 5.78 5.83 5.88 5.93 5.98 6.03 .000002999 .000002369 .000001867 .000001469 .000001153 .000000903 .000000705 .000000550 .000000428 .000000332 .000000258 .000000199 .000000154 .000000118 .000000091 .000000070 .000000053 .000000041 .000000031 .000000024 .000000018 .000000014 .000000010 .000000008 .000000006 .000000004 .000000003 .000000003 .000000002 .000000001 .000000001 Z = 2.76 Table of Area Under the Normal Curve 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 40 60 120 ∞ 12 15 20 24 30 40 60 α=.05 120 ∞ 243.90 245.90 248.00 249.10 250.10 251.10 252.20 253.30 254.30 19.41 19.43 19.45 19.45 19.46 19.47 19.48 19.49 19.50 8.74 8.70 8.66 8.64 8.62 8.59 8.57 8.55 8.53 5.91 5.86 5.80 5.77 5.75 5.72 5.69 5.66 5.63 4.68 4.62 4.56 4.53 4.50 4.46 4.43 4.40 4.36 4.00 3.94 3.87 3.84 3.81 3.77 3.74 3.70 3.67 3.57 3.51 3.44 3.41 3.38 3.34 3.30 3.27 3.23 3.28 3.22 3.15 3.12 3.08 3.04 3.01 2.97 2.93 3.07 3.01 2.94 2.90 2.86 2.83 2.79 2.75 2.71 2.91 2.85 2.77 2.74 2.70 2.66 2.62 2.58 2.54 2.79 2.72 2.65 2.61 2.57 2.53 2.49 2.45 2.40 2.69 2.62 2.54 2.51 2.47 2.43 2.38 2.34 2.30 2.60 2.53 2.46 2.42 2.38 2.34 2.30 2.25 2.21 2.53 2.46 2.39 2.35 2.31 2.27 2.22 2.18 2.13 2.48 2.40 2.33 2.29 2.25 2.20 2.16 2.11 2.07 2.42 2.35 2.28 2.24 2.19 2.15 2.11 2.06 2.01 2.38 2.31 2.23 2.19 2.15 2.10 2.06 2.01 1.96 2.34 2.27 2.19 2.15 2.11 2.06 2.02 1.97 1.92 2.31 2.23 2.16 2.11 2.07 2.03 1.98 1.93 1.88 2.28 2.20 2.12 2.08 2.04 1.99 1.95 1.90 1.84 2.25 2.18 2.10 2.05 2.01 1.96 1.92 1.87 1.81 2.23 2.15 2.07 2.03 1.98 1.94 1.89 1.84 1.78 2.20 2.13 2.05 2.01 1.96 1.91 1.86 1.81 1.76 2.18 2.11 2.03 1.98 1.94 1.89 1.84 1.79 1.73 2.16 2.09 2.01 1.96 1.92 1.87 1.82 1.77 1.71 2.15 2.07 1.99 1.95 1.90 1.85 1.80 1.75 1.69 2.13 2.06 1.97 1.93 1.88 1.84 1.79 1.73 1.67 2.12 2.04 1.96 1.91 1.87 1.82 1.77 1.71 1.65 2.10 2.03 1.94 1.90 1.85 1.81 1.75 1.70 1.64 2.09 2.01 1.93 1.89 1.84 1.79 1.74 1.68 1.62 2.00 1.92 1.84 1.79 1.74 1.69 1.64 1.58 1.51 1.92 1.84 1.75 1.70 1.65 1.59 1.53 1.47 1.39 1.83 1.75 1.66 1.61 1.55 1.50 1.43 1.35 1.25 1.75 1.67 1.57 1.52 1.46 1.39 1.32 1.22 1.00 64 Chi-Square Distribution α df 1 2 3 4 5 .995 .990 .975 .950 .900 .750 7 Basic QC Tools - Ishikawa .500 .000039 .000160 .000980 .003930 .015800 .101500 .455000 0.010 0.020 0.051 0.103 0.211 0.575 1.386 0.072 0.115 0.216 0.352 0.584 1.213 2.366 0.207 0.297 0.484 0.711 1.064 1.923 3.357 0.412 0.554 0.831 1.145 1.610 2.675 4.351 6 7 8 9 10 0.676 0.989 1.344 1.735 2.156 0.872 1.239 1.646 2.088 2.558 1.237 1.690 2.180 2.700 3.247 1.635 2.167 2.733 3.325 3.940 2.204 2.833 3.490 4.168 4.865 3.455 4.255 5.071 5.899 6.737 5.348 6.346 7.344 8.343 9.342 11 12 13 14 15 2.603 3.074 3.565 4.075 4.601 3.053 3.571 4.107 4.660 5.229 3.816 4.404 5.009 5.629 6.262 4.575 5.226 5.892 6.571 7.261 5.578 6.304 7.042 7.790 8.547 7.584 8.438 9.299 10.165 11.036 10.341 11.340 12.340 13.339 14.339 16 17 18 19 20 5.142 5.697 6.265 6.844 7.434 5.812 6.408 7.015 7.633 8.260 6.908 7.564 8.231 8.907 9.591 7.962 8.672 9.390 10.117 10.851 9.312 10.085 10.865 11.651 12.443 11.912 12.792 13.675 14.562 15.452 15.338 16.338 17.338 18.338 19.337 21 22 23 24 25 8.034 8.643 9.260 9.886 10.520 8.897 9.542 10.196 10.856 11.524 10.283 10.982 11.688 12.401 13.120 11.591 12.338 13.091 13.848 14.611 13.240 14.041 14.848 15.659 16.473 16.344 17.240 18.137 19.037 19.939 20.337 21.337 22.337 23.337 24.337 26 27 28 29 30 11.160 11.808 12.461 13.121 13.787 12.198 12.879 13.565 14.256 14.953 13.844 14.573 15.308 16.047 16.791 15.379 16.151 16.928 17.708 18.493 17.292 18.114 18.939 19.768 20.599 20.843 21.749 22.657 23.567 24.478 25.336 26.336 27.336 28.336 29.336 65 The seven basic QC tools are the simplest, quickest tools for structured problem solving. In many cases these tools will define the appropriate area in which to focus to solve quality problems. They are an integral part of the Six Sigma DMAIC process toolkit. • Brainstorming: Allows generation of a high volume of ideas quickly. Generally used integrally with the advocacy team when identifying the potential X’s. • Pareto: Helps to define the potential vital few X’s. The pareto links data to problem causes and aids in making data based decisions (Page 23). • Histogram: Displays frequency of occurrence of various categories in chart form, can be used as first cut at mean, variation, distribution of data. An important part of process data analysis. (Page 18). • Cause & Effect / Fishbone Diagram: Helps identify potential problem causes and focus brainstorming. (Page 23). • Flowcharting / Process Mapping: Displays actual steps of process. Provides basis for examining potential areas of improvement. • Scatter Charts: (Page 18). • Check Sheets: Capture data in a format that facilitates interpretation. Shows relationship between two variables. Fishbone (Ishikawa) Diagram Pareto Chart 6 Practical Problem Statement A major cause of futile attempts to solve a problem is poor, up front statement of the problem. Define the problem using available facts, and planned improvement. 1. Write an initial “as is” problem statement condition. This statement describes the problem as it exists now. It is a statement of what “hurts” or what “bugs” you. The statement should contain data based measures of the hurt. For example: As Is: “The response time for 15% of our service calls is more than 24 hours. 2. Be sure the problem statement meets the following criteria: •Is as specific as possible •Contains no potential causes •Contains no conclusions or potential solutions •Is sufficiently narrow in scope The most common mistake in developing a Problem Statement is the problem is stated at too high a level or is too broad for effective investigation. Use the Structure Tree (Page 25), Pareto (Page 25) or Rolled Throughput Yield analysis (Page 14) to break the problem down further. 3. Avoid the following in wording problem statements: Avoid Ineffective Problem Statement Effective Problem Statement Questions “How can we reduce the downtime on the Assembly Line.” “Assembly Line downtime currently runs 15% of operating hours. “ The word “lack” “We lack word processing software” “Material to be typed is backlogged by five days.” Solution masquerading as a problem “We need to hire another warehouse shipping clerk.” “50% of the scheduled day’s shipments are not being pulled on time.” Blaming people instead of processes “File Clerks aren’t doing their jobs.” “Files cannot be located within the allowed 5 minutes after requested.” 4. Determine if you have identified the correct level to address the problem. Ask: “Is my “Y” response variable (Output) defined at a level at which it can be solved by direct interaction with it’s independent variables (X’s) Inputs? 5. Determine if correcting the “Y” response variable will result in the desired improvement in the problem as stated. 6. Describe the “desired state”, a description of what you want to achieve by solving the problem, as objectively as possible. As with the “as is” statement, be sure the “desired state” is in measurable observable terms. For example: Desired State: “The response time for all our service calls is less than 24 hours.” 5 Chi-Square Distribution α df .250 .100 .050 .025 .010 2.706 3.841 5.024 6.635 4.605 5.991 7.378 9.210 6.251 7.815 9.348 11.345 7.779 9.488 11.143 13.277 9.236 11.070 12.832 15.086 .005 .001 7.879 10.597 12.838 14.860 16.750 10.828 13.816 16.266 18.467 20.515 1 2 3 4 5 1.323 2.773 4.108 5.385 6.626 6 7 8 9 10 7.841 9.037 10.219 11.389 12.549 10.645 12.017 13.362 14.684 15.987 12.592 14.067 15.507 16.919 18.307 14.449 16.013 17.535 19.023 20.483 16.812 18.475 20.090 21.666 23.209 18.548 20.278 21.955 23.589 25.188 22.458 24.322 26.125 27.877 29.588 11 12 13 14 15 13.701 14.845 15.984 17.117 18.245 17.275 18.549 19.812 21.064 22.307 19.675 21.026 22.362 23.685 24.996 21.920 23.337 24.736 26.119 27.488 24.725 26.217 27.688 29.141 30.578 26.757 28.300 29.819 31.319 32.801 31.264 32.909 34.528 36.123 37.697 16 17 18 19 20 19.369 20.489 21.605 22.718 23.828 23.542 24.769 25.989 27.204 28.412 26.296 27.587 28.869 30.144 31.410 28.845 30.191 31.526 32.852 34.170 32.000 33.409 34.805 36.191 37.566 34.267 35.718 37.156 38.582 39.997 39.252 40.790 43.312 43.820 45.315 21 22 23 24 25 24.935 26.039 27.141 28.241 29.339 29.615 30.813 32.007 33.196 34.382 32.671 33.924 35.172 36.415 37.652 35.479 36.781 38.076 39.364 40.646 38.932 40.289 41.638 42.980 44.314 41.401 42.796 44.181 45.558 46.928 46.797 48.268 49.728 51.179 52.620 26 27 28 29 30 30.434 31.528 32.620 33.711 34.800 35.563 36.741 37.916 39.087 40.256 38.885 40.113 41.337 42.557 43.773 41.923 43.194 44.461 45.722 46.979 45.642 46.963 48.278 49.588 50.892 48.290 49.645 50.993 52.336 53.672 54.052 55.476 56.892 58.302 59.703 66 Normal Distribution Defining a Six Sigma Project Z Z 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 5.00E-01 4.60E-01 4.21E-01 3.82E-01 3.45E-01 3.09E-01 2.74E-01 2.42E-01 2.12E-01 1.84E-01 4.96E-01 4.56E-01 4.17E-01 3.78E-01 3.41E-01 3.05E-01 2.71E-01 2.39E-01 2.09E-01 1.81E-01 4.92E-01 4.52E-01 4.13E-01 3.75E-01 3.37E-01 3.02E-01 2.68E-01 2.36E-01 2.06E-01 1.79E-01 4.88E-01 4.48E-01 4.09E-01 3.71E-01 3.34E-01 2.98E-01 2.64E-01 2.33E-01 2.03E-01 1.76E-01 4.84E-01 4.44E-01 4.05E-01 3.67E-01 3.30E-01 2.95E-01 2.61E-01 2.30E-01 2.01E-01 1.74E-01 4.80E-01 4.40E-01 4.01E-01 3.63E-01 3.26E-01 2.91E-01 2.58E-01 2.27E-01 1.98E-01 1.71E-01 4.76E-01 4.36E-01 3.97E-01 3.59E-01 3.23E-01 2.88E-01 2.55E-01 2.24E-01 1.95E-01 1.69E-01 4.72E-01 4.33E-01 3.94E-01 3.56E-01 3.19E-01 2.84E-01 2.51E-01 2.21E-01 1.92E-01 1.66E-01 4.68E-01 4.29E-01 3.90E-01 3.52E-01 3.16E-01 2.81E-01 2.48E-01 2.18E-01 1.89E-01 1.64E-01 4.64E-01 4.25E-01 3.86E-01 3.48E-01 3.12E-01 2.78E-01 2.45E-01 2.15E-01 1.87E-01 1.61E-01 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.59E-01 1.36E-01 1.15E-01 9.68E-02 8.08E-02 6.68E-02 5.48E-02 4.46E-02 3.59E-02 2.87E-02 1.56E-01 1.34E-01 1.13E-01 9.51E-02 7.93E-02 6.55E-02 5.37E-02 4.36E-02 3.52E-02 2.81E-02 1.5 39E01 1.31E-01 1.11E-01 9.34E-02 7.78E-02 6.43E-02 5.26E-02 4.27E-02 3.44E-02 2.74E-02 1.52E-01 1.29E-01 1.09E-01 9.18E-02 7.64E-02 6.30E-02 5.16E-02 4.18E-02 3.36E-02 2.68E-02 1.49E-01 1.27E-01 1.08E-01 9.01E-02 7.49E-02 6.18E-02 5.05E-02 4.09E-02 3.29E-02 2.62E-02 1.47E-01 1.25E-01 1.06E-01 8.85E-02 7.35E-02 6.06E-02 4.95E-02 4.01E-02 3.22E-02 2.56E-02 1.45E-01 1.23E-01 1.04E-01 8.69E-02 7.21E-02 5.94E-02 4.85E-02 3.92E-02 3.14E-02 2.50E-02 1.42E-01 1.21E-01 1.02E-01 8.53E-02 7.08E-02 5.82E-02 4.75E-02 3.84E-02 3.07E-02 2.44E-02 1.40E-01 1.19E-01 1.00E-01 8.38E-02 6.94E-02 5.71E-02 4.65E-02 3.75E-02 3.01E-02 2.39E-02 1.38E-01 1.17E-01 9.85E-02 8.23E-02 6.81E-02 5.59E-02 4.55E-02 3.67E-02 2.94E-02 2.33E-02 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.28E-02 1.79E-02 1.39E-02 1.07E-02 8.20E-03 6.21E-03 4.66E-03 3.47E-03 2.56E-03 1.87E-03 2.22E-02 1.74E-02 1.36E-02 1.04E-02 7.98E-03 6.04E-03 4.53E-03 3.36E-03 2.48E-03 1.81E-03 2.17E-02 1.70E-02 1.32E-02 1.02E-02 7.76E-03 5.87E-03 4.40E-03 3.26E-03 2.40E-03 1.75E-03 2.12E-02 1.66E-02 1.29E-02 9.90E-03 7.55E-03 5.70E-03 4.27E-03 3.17E-03 2.33E-03 1.70E-03 2.07E-02 1.62E-02 1.26E-02 9.64E-03 7.34E-03 5.54E-03 4.15E-03 3.07E-03 2.26E-03 1.64E-03 2.02E-02 1.58E-02 1.22E-02 9.39E-03 7.14E-03 5.39E-03 4.02E-03 2.98E-03 2.19E-03 1.59E-03 1.97E-02 1.54E-02 1.19E-02 9.14E-03 6.95E-03 5.23E-03 3.91E-03 2.89E-03 2.12E-03 1.54E-03 1.92E-02 1.50E-02 1.16E-02 8.89E-03 6.76E-03 5.09E-03 3.79E-03 2.80E-03 2.05E-03 1.49E-03 1.88E-02 1.46E-02 1.13E-02 8.66E-03 6.57E-03 4.94E-03 3.68E-03 2.72E-03 1.99E-03 1.44E-03 1.83E-02 1.43E-02 1.10E-02 8.42E-03 6.39E-03 4.80E-03 3.57E-03 2.64E-03 1.93E-03 1.40E-03 3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 1.35E-03 9.68E-04 6.87E-04 4.84E-04 3.37E-04 2.33E-04 1.59E-04 1.08E-04 7.25E-05 4.82E-05 1.31E-03 9.35E-04 6.64E-04 4.67E-04 3.25E-04 2.24E-04 1.53E-04 1.04E-04 6.96E-05 4.63E-05 1.26E-03 9.04E-04 6.41E-04 4.50E-04 3.13E-04 2.16E-04 1.47E-04 9.97E-05 6.69E-05 4.44E-05 1.22E-03 8.74E-04 6.19E-04 4.34E-04 3.02E-04 2.08E-04 1.42E-04 9.59E-05 6.42E-05 4.26E-05 1.18E-03 8.45E-04 5.98E-04 4.19E-04 2.91E-04 2.00E-04 1.36E-04 9.21E-05 6.17E-05 4.09E-05 1.14E-03 8.16E-04 5.77E-04 4.04E-04 2.80E-04 1.93E-04 1.31E-04 8.86E-05 5.92E-05 3.92E-05 1.11E-03 7.89E-04 5.57E-04 3.90E-04 2.70E-04 1.86E-04 1.26E-04 8.51E-05 5.68E-05 3.76E-05 1.07E-03 7.62E-04 5.38E-04 3.76E-04 2.60E-04 1.79E-04 1.21E-04 8.18E-05 5.46E-05 3.61E-05 1.04E-03 7.36E-04 5.19E-04 3.63E-04 2.51E-04 1.72E-04 1.17E-04 7.85E-05 5.24E-05 3.46E-05 1.00E-03 7.11E-04 5.01E-04 3.50E-04 2.42E-04 1.66E-04 1.12E-04 7.55E-05 5.03E-05 3.32E-05 4.0 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 3.18E-05 2.08E-05 1.34E-05 8.62E-06 5.48E-06 3.45E-06 2.15E-06 1.33E-06 8.18E-07 4.98E-07 3.05E-05 1.99E-05 1.29E-05 8.24E-06 5.23E-06 3.29E-06 2.05E-06 1.27E-06 7.79E-07 4.73E-07 2.92E-05 1.91E-05 1.23E-05 7.88E-06 5.00E-06 3.14E-06 1.96E-06 1.21E-06 7.41E-07 4.50E-07 2.80E-05 1.82E-05 1.18E-05 7.53E-06 4.77E-06 3.00E-06 1.87E-06 1.15E-06 7.05E-07 4.28E-07 2.68E-05 1.75E-05 1.13E-05 7.20E-06 4.56E-06 2.86E-06 1.78E-06 1.10E-06 6.71E-07 4.07E-07 2.57E-05 1.67E-05 1.08E-05 6.88E-06 4.35E-06 2.73E-06 1.70E-06 1.05E-06 6.39E-07 3.87E-07 2.47E-05 1.60E-05 1.03E-05 6.57E-06 4.16E-06 2.60E-06 1.62E-06 9.96E-07 6.08E-07 3.68E-07 2.36E-05 1.53E-05 9.86E-06 6.28E-06 3.97E-06 2.48E-06 1.54E-06 9.48E-07 5.78E-07 3.50E-07 2.26E-05 1.47E-05 9.43E-06 6.00E-06 3.79E-06 2.37E-06 1.47E-06 9.03E-07 5.50E-07 3.32E-07 2.17E-05 1.40E-05 9.01E-06 5.73E-06 3.62E-06 2.26E-06 1.40E-06 8.59E-07 5.23E-07 3.16E-07 67 SPECIFIC Issue is clearly defined to the lowest level of cause and effect. The project should have a ‘response variable’ (Y) with specifications and constraints (i.e. cycle time for returned parts, washer base width). It should be bound by clearly defined goals. If it looks big, it is. A poorly defined project will require greater scoping time and will have a longer completion time than one that is clearly defined. CUSTOMER FOCUSED The Project Y should be clearly linked to a specific customer want or need - can result in improved customer perception or consumer satisfaction (Customer WOW): on time delivery, billing accuracy, call answer rate. VALUE-ADDED Financially justifiable - directly impacts a business metric that returns value: PPM, reliability, yield, pricing errors, field returns, factory yield, overtime, transportation, warehousing, availability, SCR, rework, under billing and scrap. MEASURABLE The ‘response variable’ (Y) must have reasonable historical DATA, or you must have the ability to capture a reliable data stream. Having a method for measuring vital X’s is also essential for in-depth process analysis with data. Discreet data can be effectively used for problem investigation, but ‘variable’ (continuous) data is better. Projects based on unreliable data have unreliable results. LOCALLY ACTIONABLE The selected project should be one which can be addressed by the accepted “local” organization Adequate support is needed to ensure successful project completion and permanent change to the process. It is difficult to “manage improvements in Louisville from the field” A well defined problem is the first step in a successful project! 4 Six Sigma Problem Solving Processes Step Description Define A Identify Project CTQs B C Develop Team Charter Define Process Map Measure 1 Select CTQ Characteristics 2 Define Performance Standards 3 Measurement System Analysis Analyze 4 Establish Process Capability 5 Define Performance Objectives 6 Identify Variation Sources Improve 7 Screen Potential Causes 8 Discover Variable Relationships 9 Establish Operating Tolerances Control 10 Define & Validate Measurement System on X’s in Actual Application 11 Determine New Process Capability 12 Implement Process Control 3 Focus Tools Y VOC; Process Map; CAP Project CAP Y=f(x) Process Map Normal Distribution Deliverables Project CTQs (1) Approved Charter (2) High Level Process Map (3) Z 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 5.0 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 3.00E-07 1.80E-07 1.07E-07 6.27E-08 3.66E-08 2.12E-08 1.22E-08 6.98E-09 3.96E-09 2.23E-09 2.85E-07 1.71E-07 1.01E-07 5.95E-08 3.47E-08 2.01E-08 1.16E-08 6.60E-09 3.74E-09 2.11E-09 2.71E-07 1.62E-07 9.59E-08 5.64E-08 3.29E-08 1.90E-08 1.09E-08 6.24E-09 3.53E-09 1.99E-09 2.58E-07 1.54E-07 9.10E-08 5.34E-08 3.11E-08 1.80E-08 1.03E-08 5.89E-09 3.34E-09 1.88E-09 2.45E-07 1.46E-07 8.63E-08 5.06E-08 2.95E-08 1.70E-08 9.78E-09 5.57E-09 3.15E-09 1.77E-09 2.32E-07 1.39E-07 8.18E-08 4.80E-08 2.79E-08 1.61E-08 9.24E-09 5.26E-09 2.97E-09 1.67E-09 2.21E-07 1.31E-07 7.76E-08 4.55E-08 2.64E-08 1.53E-08 8.74E-09 4.97E-09 2.81E-09 1.58E-09 2.10E-07 1.25E-07 7.36E-08 4.31E-08 2.50E-08 1.44E-08 8.26E-09 4.70E-09 2.65E-09 1.49E-09 1.99E-07 1.18E-07 6.98E-08 4.08E-08 2.37E-08 1.37E-08 7.81E-09 4.44E-09 2.50E-09 1.40E-09 1.89E-07 1.12E-07 6.62E-08 3.87E-08 2.24E-08 1.29E-08 7.39E-09 4.19E-09 2.36E-09 1.32E-09 Y VOC; QFD;FMEA Project Y (4) Y VOC, Blueprints Performance Standard for Project Y (5) Data Collection Plan & MSA (6), Data for Project Y (7) 6.0 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 1.25E-09 6.94E-10 3.84E-10 2.11E-10 1.15E-10 6.25E-11 3.38E-11 1.82E-11 9.72E-12 5.18E-12 1.18E-09 6.54E-10 3.61E-10 1.98E-10 1.08E-10 5.88E-11 3.18E-11 1.71E-11 9.13E-12 4.86E-12 1.11E-09 6.17E-10 3.40E-10 1.87E-10 1.02E-10 5.53E-11 2.98E-11 1.60E-11 8.57E-12 4.56E-12 1.05E-09 5.81E-10 3.21E-10 1.76E-10 9.59E-11 5.20E-11 2.81E-11 1.51E-11 8.05E-12 4.28E-12 9.88E-10 5.48E-10 3.02E-10 1.66E-10 9.02E-11 4.89E-11 2.64E-11 1.42E-11 7.56E-12 4.02E-12 9.31E-10 5.16E-10 2.84E-10 1.56E-10 8.49E-11 4.60E-11 2.48E-11 1.33E-11 7.10E-12 3.77E-12 8.78E-10 4.87E-10 2.68E-10 1.47E-10 7.98E-11 4.32E-11 2.33E-11 1.25E-11 6.66E-12 3.54E-12 8.28E-10 4.59E-10 2.52E-10 1.38E-10 7.51E-11 4.07E-11 2.19E-11 1.17E-11 6.26E-12 3.32E-12 7.81E-10 4.32E-10 2.38E-10 1.30E-10 7.06E-11 3.82E-11 2.06E-11 1.10E-11 5.87E-12 3.12E-12 7.36E-10 4.07E-10 2.24E-10 1.22E-10 6.65E-11 3.59E-11 1.93E-11 1.04E-11 5.52E-12 2.93E-12 Process Capability for Project Y (8) Improvement Goal for Project Y (9) Prioritized List of all Xs (10) 7.0 7.1 7.2 7.3 7.4 7.5 7.6 7.7 7.8 7.9 2.75E-12 1.45E-12 7.64E-13 4.01E-13 2.10E-13 1.09E-13 5.68E-14 2.94E-14 1.52E-14 7.85E-15 2.58E-12 1.36E-12 7.16E-13 3.76E-13 1.96E-13 1.02E-13 5.32E-14 2.76E-14 1.42E-14 7.35E-15 2.42E-12 1.28E-12 6.72E-13 3.52E-13 1.84E-13 9.58E-14 4.98E-14 2.58E-14 1.33E-14 6.88E-15 2.27E-12 1.20E-12 6.30E-13 3.30E-13 1.72E-13 8.98E-14 4.66E-14 2.42E-14 1.25E-14 6.44E-15 2.13E-12 1.12E-12 5.90E-13 3.09E-13 1.62E-13 8.41E-14 4.37E-14 2.26E-14 1.17E-14 6.02E-15 2.00E-12 1.05E-12 5.54E-13 2.90E-13 1.51E-13 7.87E-14 4.09E-14 2.12E-14 1.09E-14 5.64E-15 1.87E-12 9.88E-13 5.19E-13 2.72E-13 1.42E-13 7.38E-14 3.83E-14 1.98E-14 1.02E-14 5.28E-15 1.76E-12 9.26E-13 4.86E-13 2.55E-13 1.33E-13 6.91E-14 3.58E-14 1.86E-14 9.58E-15 4.94E-15 1.65E-12 8.69E-13 4.56E-13 2.39E-13 1.24E-13 6.47E-14 3.36E-14 1.74E-14 8.97E-15 4.62E-15 1.55E-12 8.15E-13 4.28E-13 2.24E-13 1.17E-13 6.06E-14 3.14E-14 1.63E-14 8.39E-15 4.32E-15 List of Vital Few Xs (11) Proposed Solution (13) Piloted Solution (14) 8.0 8.1 8.2 83 8.4 8.5 8.6 8.7 8.8 8.9 4.05E-15 2.08E-15 1.07E-15 5.48E-16 2.81E-16 1.44E-16 7.34E-17 3.75E-17 1.92E-17 9.79E-18 3.79E-15 1.95E-15 9.99E-16 5.12E-16 2.62E-16 1.34E-16 6.87E-17 3.51E-17 1.79E-17 9.16E-18 3.54E-15 1.82E-15 9.35E-16 4.79E-16 2.45E-16 1.26E-16 6.42E-17 3.28E-17 1.68E-17 8.56E-18 3.31E-15 1.70E-15 8.74E-16 4.48E-16 2.30E-16 1.17E-16 6.00E-17 3.07E-17 1.57E-17 8.00E-18 3.10E-15 1.59E-15 8.18E-16 4.19E-16 2.15E-16 1.10E-16 5.61E-17 2.87E-17 1.47E-17 7.48E-18 2.90E-15 1.49E-15 7.65E-16 3.92E-16 2.01E-16 1.03E-16 5.25E-17 2.68E-17 1.37E-17 7.00E-18 2.72E-15 1.40E-15 7.16E-16 3.67E-16 1.88E-16 9.60E-17 4.91E-17 2.51E-17 1.28E-17 6.54E-18 2.54E-15 1.31E-15 6.69E-16 3.43E-16 1.76E-16 8.98E-17 4.59E-17 2.35E-17 1.20E-17 6.12E-18 2.38E-15 1.22E-15 6.26E-16 3.21E-16 1.64E-16 8.40E-17 4.29E-17 2.19E-17 1.12E-17 5.72E-18 2.22E-15 1.14E-15 5.86E-16 3.00E-16 1.54E-16 7.85E-17 4.01E-17 2.05E-17 1.05E-17 5.35E-18 9.0 9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9 10.0 5.00E-18 2.56E-18 1.31E-18 6.67E-19 3.41E-19 1.75E-19 8.94E-20 4.58E-20 2.35E-20 1.21E-20 6.22E-21 4.68E-18 2.39E-18 1.22E-18 6.24E-19 3.19E-19 1.63E-19 8.37E-20 4.29E-20 2.20E-20 1.13E-20 5.82E-21 4.37E-18 2.23E-18 1.14E-18 5.83E-19 2.98E-19 1.53E-19 7.82E-20 4.01E-20 2.06E-20 1.06E-20 5.44E-21 4.09E-18 2.09E-18 1.07E-18 5.46E-19 2.79E-19 1.43E-19 7.32E-20 3.75E-20 1.93E-20 9.90E-21 5.09E-21 3.82E-18 1.95E-18 9.98E-19 5.10E-19 2.61E-19 1.34E-19 6.85E-20 3.51E-20 1.80E-20 9.26E-21 4.77E-21 3.57E-18 1.83E-18 9.33E-19 4.77E-19 2.44E-19 1.25E-19 6.40E-20 3.28E-20 1.69E-20 8.67E-21 4.46E-21 3.34E-18 1.71E-18 8.73E-19 4.46E-19 2.28E-19 1.17E-19 5.99E-20 3.07E-20 1.58E-20 8.11E-21 4.17E-21 3.13E-18 1.60E-18 8.16E-19 4.17E-19 2.14E-19 1.09E-19 5.60E-20 2.87E-20 1.48E-20 7.59E-21 3.91E-21 2.92E-18 1.49E-18 7.63E-19 3.90E-19 2.00E-19 1.02E-19 5.24E-20 2.69E-20 1.38E-20 7.10E-21 3.66E-21 2.73E-18 1.40E-18 7.14E-19 3.65E-19 1.87E-19 9.56E-20 4.90E-20 2.52E-20 1.29E-20 6.64E-21 3.42E-21 Y&X Continuous Gage R&R; Test/Retest, Attribute R&R Y Capability Indices Y X Team, Benchmarking Process Analysis, Graphical Analysis, Hypothesis Tests X DOE-Screening X Factorial Designs X Simulation Continuous Gage R&R, Test/Retest, Attribute R&R MSA X, Y Capability Indices X Control Charts; Mistake Proofing; FMEA Process Capability Y, X Sustained Solution (15), Project Documentation (16), X, Y 68 t-Distribution 1-α α Six Sigma Toolkit - Index df .600 .700 .800 .900 .950 .990 .995 1 2 3 4 5 0.325 0.289 0.277 0.271 0.267 0.727 0.617 0.584 0.569 0.559 1.376 1.061 0.978 0.941 0.920 3.078 1.886 1.638 1.533 1.476 6.314 12.706 31.821 63.657 2.920 4.303 6.965 9.925 2.353 3.182 4.541 5.841 2.132 2.776 3.747 4.604 2.015 2.571 3.365 4.032 6 7 8 9 10 0.265 0.263 0.262 0.261 0.260 0.553 0.549 0.546 0.543 0.542 0.906 0.896 0.889 0.883 0.879 1.440 1.415 1.397 1.383 1.372 1.943 1.895 1.860 1.833 1.812 2.447 2.365 2.306 2.262 2.228 3.143 2.998 2.896 2.821 2.764 3.707 3.499 3.355 3.250 3.169 11 12 13 14 15 0.260 0.259 0.259 0.258 0.258 0.540 0.539 0.538 0.537 0.536 0.876 0.873 0.870 0.868 0.866 1.363 1.356 1.350 1.345 1.341 1.796 1.782 1.771 1.761 1.753 2.201 2.179 2.160 2.145 2.131 2.718 2.681 2.650 2.624 2.602 3.106 3.055 3.012 2.977 2.947 16 17 18 19 20 0.258 0.257 0.257 0.257 0.257 0.535 0.534 0.534 0.533 0.533 0.865 0.863 0.862 0.861 0.860 1.337 1.333 1.330 1.328 1.325 1.746 1.740 1.734 1.729 1.725 2.120 2.110 2.101 2.093 2.086 2.583 2.567 2.552 2.539 2.528 2.921 2.898 2.878 2.861 2.845 21 22 23 24 25 0.257 0.256 0.256 0.256 0.256 0.532 0.532 0.532 0.531 0.531 0.859 0.858 0.858 0.857 0.856 1.323 1.321 1.319 1.318 1.316 1.721 1.717 1.714 1.711 1.708 2.080 2.074 2.069 2.064 2.060 2.518 2.508 2.500 2.492 2.485 2.831 2.819 2.807 2.797 2.787 26 27 28 29 30 0.256 0.256 0.256 0.256 0.256 0.531 0.531 0.530 0.530 0.530 0.856 0.855 0.855 0.854 0.854 1.315 1.314 1.313 1.311 1.310 1.706 1.703 1.701 1.699 1.697 2.056 2.052 2.048 2.045 2.042 2.479 2.473 2.467 2.462 2.457 2.779 2.771 2.763 2.756 2.750 40 60 120 ∞ 69 .975 0.255 0.254 0.254 0.253 0.529 0.527 0.526 0.524 0.851 0.848 0.845 0.842 1.303 1.296 1.289 1.282 1.684 1.671 1.658 1.645 2.021 2.000 1.980 1.960 2.423 2.390 2.358 2.326 2.704 2.660 2.617 2.576 Index Analysis and Improve Tools Selection Matrix· · · · · · · · · · · · · · · · · · · ANOVA ANOVA / ANOVA One Way· · · · · · · · · · · · · · · · · · · · · · · · · ANOVA Two Way · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · ANOVA - Balanced · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Interpreting the ANOVA Output· · · · · · · · · · · · · · · · · · · · · · · Calculating Sample Size (Equation for manual Calculation) · · · · · · · · · · Characterizing the Process - Rational Subgrouping · · · · · · · · · · · · · Control Chart Constants· · · · · · · · · · · · · · · · · · · · · · · · · · · · Control Charts· · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Data Validity Studies/% Agreement on Binary (Pass / Fail) Data· · · · · · · Defining a Six Sigma Project · · · · · · · · · · · · · · · · · · · · · · · · · Definition of Z · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Design for Six Sigma Loop Diagrams· · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Tolerancing Analysis · · · · · · · · · · · · · · · · · · · · · · · · · · · · Discrete Data Analysis· · · · · · · · · · · · · · · · · · · · · · · · · · · · · · DOE Design of Experiments· · · · · · · · · · · · · · · · · · · · · · · · · · · Factorial Designs · · · · · · · · · · · · · · · · · · · · · · · · · · · · · DOE Analysis · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · DPU / DPO · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Gage R & R · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · General Linear Model · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Hypothesis Statements Hypothesis Testing · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Minitab Graphics Histogram / Scatter Plot · · · · · · · · · · · · · · · · · · · · · · · · · · Descriptive Statistics / Normal Plot · · · · · · · · · · · · · · · · · · · · One Variable Regression / Residual Plots · · · · · · · · · · · · · · · · · Boxplot / Interval Plot· · · · · · · · · · · · · · · · · · · · · · · · · · · · · Time Series Plot / Box-Cox Transformation · · · · · · · · · · · · · · · · Pareto Diagrams / Cause & Effect Diagrams · · · · · · · · · · · · · · · · Normal Approximation· · · · · · · · · · · · · · · · · · · · · · · · · · · · χ2 Test (Test for Independence) · · · · · · · · · · · · · · · · · · · · · Poisson Approximation · · · · · · · · · · · · · · · · · · · · · · · · · · · Normality of Data · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Planning Questions · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Practical Problem Statement · · · · · · · · · · · · · · · · · · · · · · · · · Precontrol · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Project Closure· · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Regression Analysis Regression · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Stepwise Regression · · · · · · · · · · · · · · · · · · · · · · · · · · · Regression with Curves (Quadratic) and Interactions· · · · · · · · · · · · Binary Logistic Regression· · · · · · · · · · · · · · · · · · · · · · · · · Response Surface - CCD · · · · · · · · · · · · · · · · · · · · · · · · · · · · Rolled Throughput Yield · · · · · · · · · · · · · · · · · · · · · · · · · · · · Sample Size Determination · · · · · · · · · · · · · · · · · · · · · · · · · · Seven Basic Tools · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Six Sigma Problem Solving Processes · · · · · · · · · · · · · · · · · · · · Six Sigma Process Report · · · · · · · · · · · · · · · · · · · · · · · · · · · Six Sigma Product Report · · · · · · · · · · · · · · · · · · · · · · · · · · · Stable Ops and 6 Sigma · · · · · · · · · · · · · · · · · · · · · · · · · · · t Test (Testing Means) (1 Sample t; 2 Sample t; Confidence Intervals)· · · · Tables · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Determining Sample Size· · · · · · · · · · · · · · · · · · · · · · · · · F Test · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · χ2 Test · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Normal Distribution · · · · · · · · · · · · · · · · · · · · · · · · · · · · · t Test · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · Testing Equality of Variance (F test; Homogeneity of Variance) · · · · · · · · The Normal Curve · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · The Transfer Function 26 41 42 43 44 28 16 59 57-58 10 4 8 50 51-52 35 53 54 55 13 11 - 12 45 30-31 29 20 21 22 23 24 25 36 37-38 39 17 19 5 60 61 46 47 48 49 56 14 27 6 3 18 15 9 33-34 62 63-64 65-66 67-68 69 32 7 40 The Toolkit - A Six Sigma Resource The material in this Toolkit is a combination of material developed by the GEA Master Black Belts and Dr. Mikel Harry (The Six Sigma Academy, Inc.). Worksheets, statistical tables and graphics are outputs of MINITAB for Windows Version 12.2, Copyright 1998, Minitab, Inc. It is intended for use as a quick reference for trained Black Belts and Green Belts. More detail information is available from the Quality Coach Website SSQC.ge.com. If you need more GEA Six Sigma Information, visit the GE Appliances Six Sigma Website at http://genet.appl.ge.com/sixsigma For information on GE Corporate Certification Testing, go to the Green Belt Training Site via the GE Appliances Six Sigma Website. For information about other GE Appliances Six Sigma Training, contact a member of the GEA Six Sigma Training Team • Jeff Keller - Ext 7649 Email: [email protected] • Irene Ligon - Ext 4562 Email:[email protected] • Broadcast Group eMail: [email protected] GE Appliances Copyright 2001 Revision 4.5 - September 2001 GLOSSARY OF SIX SIGMA TERMS α - alpha risk - Probability of falsely accepting the alternative (HA) of difference 2. ANOVA - Analysis of Variance 3. β - Beta risk - Probability of falsely accepting the null hypothesis (H0 ) of no difference 4. χ2 - Tests for independent relationship between two discrete variables 5. δ - Difference between two means 6. DOE - Design of Experiments 7. DPU - Defects per unit 8. e-DPU - Rolled throughput yield 9. F- Test - Used to compare the variances of two distributions 10. g - number of subgroups 11. FIT - The point estimate of the mean response for each level of the independent variable. 12. H0 - Null hypothesis 13. HA - Alternative hypothesis 14. LSL - Lower spec limit 15. µ - Population mean 16. µ - Sample mean 17. n - number of samples in a subgroup 18. N - Number in the total population 19. P Value - If the calculated value of p is lower than the alpha (α) risk, then reject the null hypothesis and conclude that there is a difference. Often referred to as the “observed level of significance”. 20. Residual - The difference between the observed values and the Fit, the error in the model 21. σ - Population standard deviation 22. Σ - Summation 23. σˆ (s) - Sample standard deviation 24. Stratify - Divide or arrange data in organized classes or segments, based on known characteristics or factors. 24. SS - Sum of squares 25. t-Test - Used to compare the means of two distributions 26. Transfer Function - Prediction Equation - Y=f(x) 27. USL - Upper spec limit 28. X - mean 29. X - mean of the means 30. Z - Transforms a set of data such that µ=0 and σ=1 31. ZLT - Z Long term 32. ZST - Z short term 33. ZSHIFT - ZST - ZLT 1. g GE Appliances Six Sigma Toolkit Rev 4.5 9/2001 GE Appliances Proprietary
© Copyright 2026 Paperzz