Some Analytical Chemistry of Potato Chips
Lessons on Sampling and ANOVA in SAS and JMP
Eric Cai

*How much sodium dost a potato crisp hast?

[Figure: sodium chloride (NaCl) and potato chips. Images courtesy of Poyraz 72 and Evan-Amos via Wikimedia.]
*Shakespearean online translator courtesy of LingoJam by Joseph Rocca.

Objectives
• Estimate the weight percentage of sodium in a bag of potato chips
• Obtain a confidence interval for the estimated weight percentage
• Need to minimize the cumulative uncertainty in the final result
  – Minimize the width of the confidence interval

[Figure: Bag of Potato Chips, with the sampled chips numbered 1 to 4]

How to minimize uncertainty?
• Use precise instruments
• Measure many aliquots
• Minimize the variation between the samples

[Figure: Bag of Potato Chips, chips 1 to 4, annotated with "Variation in Weight Percentage Between Chips" and "Variation in Weight Percentage Within a Chip"]

Raw Data – Wide Format

          Aliquot 1   Aliquot 2   Aliquot 3
Chip 1    0.324%      0.311%      0.352%
Chip 2    0.455%      0.467%      0.448%
Chip 3    0.420%      0.463%      0.424%
Chip 4    0.447%      0.377%      0.398%

Desired Data – Long Format
• One row per aliquot, with the chip label (Chip 1, Chip 1, Chip 1, Chip 2, ..., Chip 4) in one column and the weight percentage in another
• Needed for analysis in both SAS and JMP
* enter the raw data;
data sodium1;
    input chip1 chip2 chip3 chip4;
    datalines;
0.324 0.455 0.420 0.447
0.311 0.467 0.463 0.377
0.352 0.448 0.424 0.398
;
run;

* transpose the data;
* convert the weight percentages from a vertical display to a horizontal display;
proc transpose data = sodium1
               out = sodium2
               name = sample
               prefix = aliquot;
    var chip:;
run;

* show the transposed data;
proc print data = sodium2;
run;

Long, but still wide

sample   aliquot1   aliquot2   aliquot3
chip1    0.324      0.311      0.352
chip2    0.455      0.467      0.448
chip3    0.420      0.463      0.424
chip4    0.447      0.377      0.398

* sodium2 needs to be transposed once more for all weight percentages to be in one column;
* the BY statement requires sodium2 to be sorted by sample, which it already is;
proc transpose data = sodium2
               out = sodium3 (rename = (col1 = weight_percentage))
               name = subsample;
    var aliquot:;
    by sample;
run;

* show sodium3 - it is now ready for analysis;
proc print data = sodium3;
run;

Transformed Data – Long Format

sample   subsample   weight_percentage
chip1    aliquot1    0.324
chip1    aliquot2    0.311
chip1    aliquot3    0.352
chip2    aliquot1    0.455
chip2    aliquot2    0.467
chip2    aliquot3    0.448
chip3    aliquot1    0.420
chip3    aliquot2    0.463
chip3    aliquot3    0.424
chip4    aliquot1    0.447
chip4    aliquot2    0.377
chip4    aliquot3    0.398

PROC TRANSPOSE × 2: Wide to Long

[Figure: the wide raw data table, the intermediate table (sodium2), and the final long table (sodium3) shown side by side]

See the November 2015 issue of the VanSUG newsletter about PROC TRANSPOSE by Dilinuer Kuerban.

Visualize the Data

[Figure: the weight percentage data grouped by chip, annotated with the following]
• Group-specific means: sample means within each group (chip)
• Grand mean: sample mean of all data
• Within-group variation
• Between-group variation

Compare the 2 sources of variation
• Analysis of Variance (ANOVA)
  – Linear regression with categorical predictors
  – Partition the variation in a continuous variable by a categorical factor
  – Use sums of squares to quantify the variation
  – Each sum of squares is a sum of squared deviations of the data from an average
• Scale (divide) each sum of squares by its number of degrees of freedom
• Between-group variation vs. within-group variation

* use ANOVA to partition and compare the 2 sources of variation;
* sodium3 is the long-format data set created above;
proc anova data = sodium3;
    class sample;
    model weight_percentage = sample;
run;

• You can also use PROC GLM to implement ANOVA. ANOVA is one special case of general linear models.
• PROC ANOVA should only be used when there are equal numbers of observations for every combination of the classification factors.
  – There are many exceptions to this!

[Figure: image courtesy of Cdang via Wikimedia]

There is much more variation in the weight percentage of sodium between the chips than within the chips!

JMP
• Software from SAS Institute
• Point-and-click
• Has an underlying scripting language
• Statistics
• Machine learning
• Industrial statistics
• Go to JMP demonstration!

[Figure: Bag of Potato Chips, with the sampled chips numbered 1 to 4]

There is a trade-off! More measurements are needed!

Thank you, JMP staff!
• Louis Valente, Manager of Global Field Enablement for JMP
• Mark Bailey, Principal Analytical Training Consultant for JMP
• Arati Mejdal, Global Social Media Manager for JMP Software
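
Worked example: partitioning the variation by hand

The sum-of-squares partition described in the ANOVA slides can be checked by hand on the 12 measurements from the raw data table. This is only a sketch under the usual one-way layout (4 chips, 3 aliquots each). The notation is an assumption introduced here, not taken from the slides: y_ij is the weight percentage of aliquot j from chip i, ȳ_i· is the mean for chip i, and ȳ_·· is the grand mean. All values are in weight percent and are rounded.

```latex
\begin{align*}
\bar{y}_{1\cdot} &= 0.3290, \quad
\bar{y}_{2\cdot} = 0.4567, \quad
\bar{y}_{3\cdot} = 0.4357, \quad
\bar{y}_{4\cdot} = 0.4073, \quad
\bar{y}_{\cdot\cdot} = 0.4072 \\[4pt]
SS_{\text{between}} &= \sum_{i=1}^{4} 3\,\bigl(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\bigr)^{2}
  \approx 0.02812, \qquad df_{\text{between}} = 4 - 1 = 3 \\[4pt]
SS_{\text{within}} &= \sum_{i=1}^{4} \sum_{j=1}^{3} \bigl(y_{ij} - \bar{y}_{i\cdot}\bigr)^{2}
  \approx 0.00477, \qquad df_{\text{within}} = 4\,(3 - 1) = 8 \\[4pt]
SS_{\text{total}} &= SS_{\text{between}} + SS_{\text{within}} \approx 0.03289 \\[4pt]
F &= \frac{SS_{\text{between}} / 3}{SS_{\text{within}} / 8}
  \approx \frac{0.009373}{0.0005965} \approx 15.7
\end{align*}
```

The between-chip mean square is roughly 16 times the within-chip mean square, which agrees with the conclusion that the chips differ from each other far more than the aliquots within a chip do. These hand-computed figures should be compared against the PROC ANOVA output rather than taken as a substitute for it.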