MINI PAPER
Candy Exercise – Simple Data Analysis
by Bob Mitchell

While brainstorming ideas for the Kansas City AQC division booth activity, the Statistics Division officers wanted to identify a hands-on activity in which exhibit hall visitors could participate in data generation and analysis. The goal was to demonstrate the power of basic statistical tools and Statistical Thinking in a fun, entertaining manner, something that could be further developed as a teacher lesson in the ‘Virtual Academy’ module of our re-designed website. The Virtual Academy is an on-line e-Learning basic statistics training tool targeted at K-12 students. It is currently undergoing dramatic changes to incorporate lots of motion, animation, sounds and bright colors: tools to tickle the senses of GenY future Statistics Division members.

Borrowing from the basic statistics module taught in Six Sigma BB and GB training, we decided to demonstrate hypothesis testing, signal-to-noise ratios, and basic quality tools to develop our process knowledge of candy packaging. For example, known class exercises used in some Six Sigma training involve studying bag fill weight and color variation within and between Mars Inc.’s M&M candy varieties (plain, peanut, almond, crispy, peanut butter), or bag fill and flavor variation of Mars Inc.’s Skittles (original fruit, tropical, wild berry, sour, mint).

One of the Statistics Division officers’ spouses owns a candy store. Hearing about our search for a booth activity, this spouse suggested that we examine the hypothesis that, of the five Willy Wonka Runts flavors (cherry, strawberry, orange, banana, watermelon), people like the banana pieces least. This candy store owner orders in bulk at the end of each month. It is her observation that more people tend to remove the banana pieces before their purchase; at the end of the month she has a disproportionate amount of yellow Runts remaining in the bin.
Two different hypotheses were developed for our Kansas City AQC booth activity:

Null hypothesis Ho#1: People like all Runts flavors equally.
    No preference: Red = Pink = Orange = Yellow = Green
Null hypothesis Ho#2: The bag fill process (by weight) is stable.
    Short-term: Bag 1 = Bag 2 = Bag 3
    Long-term: Lot 1 = Lot 2 = Lot 3

The ASQ Inspection Division provided the calibrated electronic scale.

Exercise details:
1. Booth visitor selects and weighs a bag of Runts candies.
2. Booth visitor opens the bag and sorts the candy by color (flavor).
3. Booth visitor tastes each flavor.
4. Record bag weight and color count by box # and bag.
5. Record individual flavor preference.
6. Record nominal empty bag weight.
7. Record nominal candy weight by color.

Data: The Runts exercise data were analyzed in Minitab. In order to promote discovery and learning we are providing this Minitab workbook for download from the Statistics Division website http://www.asqstatdiv.org/documents/special/runts.xls. We invite and encourage you to analyze the data and offer your insights by posting your analysis and conclusions to the Runts Discussion Page that we created on the Statistics Division website http://www.asqstatdiv.org/discussiongroups.htm.

Sample data set:

Box  Full Bag  Empty   Color counts (six columns)   Total  Preference
1    60.1      1.4     10    6    9    9    7    2    43     orange
1    55.7      1.4      4    7   13    8    3   10    45     yellow
1    57.8      *        2   11   16    2    7   10    48     red
1    60.5      *        3   11   15    4    4   12    49     orange
1    56.0      *        2   14   14    6    4    5    45     red
1    53.1      *        6   11   11    1   11    4    44     orange
1    59.4      *        2    6   11    5   11    9    44     red
1    57.9      *        4   15   11    5    7    6    48     red
1    58.0      *        5   10   15    5    4    9    48     yellow
1    57.1      *        5   13   17    0    4   10    49     red

(Each row is one bag. The six count columns are the piece counts for Red, Blue, Orange, Yellow, Pink and Green; the color order of these columns is not recoverable from this copy, but each row sums to Total. An asterisk marks a value that was not recorded.)

Summary data:

Color   Flavor      Ind Weight  Box1 Freq  Box2 Freq  Box3 Freq  Box4 Freq  Flavor Total  Flavor Preference
Red     Cherry      1.24        140        195        195        174        704           19
Blue    Raspberry   1.20        102        219        198        145        664           11
Orange  Orange      1.10        223        157        216        239        835           14
Yellow  Banana      1.30        154         96         98         98        446           25
Pink    Strawberry  1.07        274        197        177        242        890           13
Green   Watermelon  1.80        124        126        107        107        464            6

Rule of Thumb: “When p-value is low, Ho must go”.

NOTE 1: We have a 6th color (flavor): blue raspberry. Wonka recently introduced “Chewy” Runts candies, and launched a new flavor with the chewy product line.

NOTE 2: Unlike M&Ms and Skittles, Runts candies have distinct shapes. Each color/flavor/shape has its own weight distribution.

The chi-square test of flavor preference (see Data Analysis: Flavor Preference) gives a p-value of 0.000 to three decimals; therefore we reject the null hypothesis that there is no flavor preference. In fact, the data suggest that people might actually prefer the banana flavor (Observed > Expected)! This is the opposite of the store owner’s casual observation. I am reminded of a quote from Don Wheeler, “All data out of context are meaningless”. The store owner stocks Runts original (hard) candy; the AQC booth activity used the Runts “chewy” variety. While consumers may prefer the banana flavor, feedback suggests that the banana-flavored pieces in the original Runts are much harder than the other flavors. Is it this hardness characteristic that people dislike? (I sense another study.) Green (watermelon) appears to be the least preferred variety of chewy Runts.
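The chi-square statistic behind this conclusion can be recomputed from the Flavor Total and Flavor Preference columns of the summary data. The following is a minimal Python sketch, not the Minitab procedure used for the article. The function names are illustrative, and the p-value uses a closed-form upper-tail formula that is valid only for 5 degrees of freedom, which is the case for this 6 x 2 table.

```python
import math

# Observed counts from the summary data: (flavor, pieces counted, preference votes)
observed = [
    ("Cherry", 704, 19),
    ("Raspberry", 664, 11),
    ("Orange", 835, 14),
    ("Banana", 446, 25),
    ("Strawberry", 890, 13),
    ("Watermelon", 464, 6),
]


def chi_square_independence(rows):
    """Return (statistic, degrees of freedom, expected counts) for an r x 2 table."""
    col_totals = [sum(r[1] for r in rows), sum(r[2] for r in rows)]
    grand_total = sum(col_totals)
    chi_sq = 0.0
    expected = []
    for name, *counts in rows:
        row_total = sum(counts)
        # Expected count = row total * column total / grand total
        exp_row = [row_total * c / grand_total for c in col_totals]
        chi_sq += sum((o - e) ** 2 / e for o, e in zip(counts, exp_row))
        expected.append((name, exp_row))
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return chi_sq, dof, expected


def chi_square_sf_df5(x):
    """Upper-tail probability of a chi-square variable with exactly 5 degrees of freedom."""
    tail = math.erfc(math.sqrt(x / 2))
    series = math.sqrt(x) + x ** 1.5 / 3
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * series


chi_sq, dof, expected = chi_square_independence(observed)
p_value = chi_square_sf_df5(chi_sq)

print(f"Chi-Sq = {chi_sq:.3f}, DF = {dof}, P-Value = {p_value:.6f}")
for name, (exp_sum, exp_pref) in expected:
    print(f"{name:<11} expected sum {exp_sum:7.2f}, expected preference {exp_pref:5.2f}")
```

Most of the statistic comes from the banana preference cell, where 25 votes were observed against roughly 10 expected, which is why the banana flavor stands out in the discussion above.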
Data Analysis: Flavor Preference

Use Chi-Square to test independence.

Chi-Square Test: Flavor Sum, Flavor Preference
Expected counts are printed below observed counts

Flavor        Sum       Preference   Total
Cherry        704       19           723
              707.45    15.55
Raspberry     664       11           675
              660.48    14.52
Orange        835       14           849
              830.74    18.26
Banana        446       25           471
              460.87    10.13
Strawberry    890       13           903
              883.58    19.42
Watermelon    464       6            470
              459.89    10.11
Total         4003      88           4091

Chi-Sq = 0.017 + 0.764 + 0.019 + 0.853 + 0.022 + 0.995 + 0.480 + 21.820 + 0.047 + 2.125 + 0.037 + 1.671 = 28.849
DF = 5, P-Value = 0.000

Interesting side point: Though this study indicates that people may prefer the chewy banana flavor, it has the lowest frequency from Wonka.

[Figure: Analysis of Means for Color Preference. Subgroups: 1 Yellow, 2 Red, 3 Orange, 4 Pink, 5 Blue, 6 Green. Center line P = 0.1667; decision limits 2.6SL = 0.2689 and -2.6SL = 0.06448. Overall 0.05 probability level used.]

Bag Fill Consistency

Use ANOVA to test differences by box (22 bags per box).

One-way ANOVA: Bag Wt. (Lot 3014TL2) versus Box

Analysis of Variance for Bag Wt.
Source   DF   SS       MS     F      P
Box       3     8.58   2.86   1.02   0.388
Error    84   235.73   2.81
Total    87   244.31

Individual 95% CIs For Mean Based on Pooled StDev
Box   N    Mean     StDev
1     22   56.886   1.890
2     22   56.945   1.676
3     22   56.255   1.796
4     22   56.336   1.272
Pooled StDev = 1.675

P-value > 0.05; cannot reject the null hypothesis that bag weights are the same.

[Figure: Boxplots of Bag Wt. by Box (means are indicated by solid circles).]

The bag fill weights appear statistically equivalent; but is the process variation stable over time?

The caution here is that we do not know the order of bag fill at the candy manufacturer. The time series order presented below is the order of booth visitors. Control limits that define the natural variation of the true bag fill process may very well be quite different; but our observation tells us that the bag weight is rather consistent, bag to bag.

[Figure: I and MR Chart for Bag Weight. Individual values: UCL = 61.81, Mean = 56.61, LCL = 51.41. Moving range: UCL = 6.388, R = 1.955, LCL = 0.]

Another look at the same data, but segregated by box, shows a somewhat different story:

[Figure: I and MR Chart for Bag Weight by Box (boxes 1 to 4). Limits printed on the chart: individual values UCL = 60.10, Mean = 56.34, LCL = 52.57; moving range UCL = 4.621, R = 1.414, LCL = 0.]

We theorize that total bag weight is affected by the flavor (shape, weight) distribution. So what is the relative color/flavor distribution by box?

[Figures: P-charts of Color Distribution Stratified by “Box”, one chart per color, plotting Color/Total against Packet for boxes 1 to 4. The limits printed on each chart are listed below; the center lines equal the Box 4 proportions in the summary data (for example, Red 174/1005 = 0.1731).]

Color    LCL        P         UCL
Red      3.47E-05   0.1731    0.3462
Blue     0          0.1443    0.3050
Orange   0.04304    0.2378    0.4326
Yellow   0          0.09751   0.2332
Pink     0.04519    0.2408    0.4364
Green    0          0.1065    0.2476

[Figure: Control Chart of Average Weight per Piece (Stratified by Box), plotted against Bag # for boxes 1 to 4. UCL = 1.387, Mean = 1.235, LCL = 1.082.]

The chart above, “Average Weight per Piece”, gives us some insight that bag fill variation is controlled by the relative frequencies of Runts colors.

Conclusion

Our intention for the AQC booth activity was to demonstrate the applicability of basic statistics and quality tools in everyday activities. Our hope is that this brief, fun look at passive data analysis will motivate the use of graphical displays of data towards deeper process understanding and discovery. Our plan is to continue development of the Virtual Academy to present statistics in a fun and stimulating learning environment for K-12 education.

To be certain, we could have launched into more sophisticated statistical data analysis, but that was not our goal.
Again, you are invited to download the dataset and post your analysis and conclusions on the Discussion Page we have provided on the Statistics Division website (www.asqstatdiv.org). ASQ STATISTICS DIVISION NEWSLETTER, Vol. 21, No. 3
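As one possible starting point for such an analysis, the bag-weight figures quoted under Bag Fill Consistency can be cross-checked from the published summary statistics alone. The following is a minimal Python sketch under stated assumptions, not the Minitab procedure used for the article. The function names are illustrative, and the factors 2.66 and 3.267 are the standard control chart constants for moving ranges of two consecutive points, assumed here rather than stated in the article. The ANOVA p-value is not computed because it needs an F distribution, which the Python standard library does not provide.

```python
import math

# Per-box summary statistics from the one-way ANOVA output: (box, n, mean, st. dev.)
BOXES = [
    (1, 22, 56.886, 1.890),
    (2, 22, 56.945, 1.676),
    (3, 22, 56.255, 1.796),
    (4, 22, 56.336, 1.272),
]


def anova_from_summary(groups):
    """Rebuild a one-way ANOVA table from group sizes, means and standard deviations."""
    total_n = sum(n for _, n, _, _ in groups)
    grand_mean = sum(n * mean for _, n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for _, n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for _, n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
        "pooled_sd": math.sqrt(ms_within),
    }


def imr_limits(center, mr_bar):
    """Individuals and moving-range chart limits for moving ranges of span 2."""
    e2 = 2.66   # assumed standard factor: 3 / d2, with d2 = 1.128
    d4 = 3.267  # assumed standard upper factor for the moving range
    return {
        "i_ucl": center + e2 * mr_bar,
        "i_lcl": center - e2 * mr_bar,
        "mr_ucl": d4 * mr_bar,
        "mr_lcl": 0.0,
    }


anova = anova_from_summary(BOXES)
overall = imr_limits(56.61, 1.955)  # mean and average moving range, all bags
by_box = imr_limits(56.34, 1.414)   # values printed on the chart by box

print(f"Box   SS = {anova['ss_between']:.2f}, MS = {anova['ms_between']:.2f}")
print(f"Error SS = {anova['ss_within']:.2f}, MS = {anova['ms_within']:.2f}")
print(f"F = {anova['f']:.2f}, pooled StDev = {anova['pooled_sd']:.3f}")
print(f"All bags: I chart {overall['i_lcl']:.2f} to {overall['i_ucl']:.2f}, "
      f"MR UCL {overall['mr_ucl']:.3f}")
print(f"By box:   I chart {by_box['i_lcl']:.2f} to {by_box['i_ucl']:.2f}, "
      f"MR UCL {by_box['mr_ucl']:.3f}")
```

Small differences in the last digit against the published output are expected, since the inputs here are already rounded.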