The sensR package Main purposes of sensR The sensR package: Difference and similarity testing Statistical tests of sensory discrimation and similarity data Per B Brockhoff Power and sample size computations for discrimination and similarity tests DTU Compute Section for Statistics Technical University of Denmark [email protected] Thurstonian analyses via d0 (d-prime) estimation Improved confidence intervals via profile likelihood methods Linking ”one sample at a time” Thurstonian analysis to more generic statistical analysis (regression and anova) August 17 2015 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 c Per B Brockhoff (DTU) 1 / 48 The sensR package The sensR package c Per B Brockhoff (DTU) plot ROC AUC us on The sensR package an ell mp isc co M d0 dprime compare dprime test dprime table posthoc Official site: www.cran.r-project.org/packages=sensR eo ari s n ati o str Illu Tr an sfo rm ati rescale psyfun psyinv psyderiv pd2pc pc2pd s on le mp Sa & Po we r d 0, CI ,t est s discrimPwr d.primePwr discrimSS d.primeSS twoACpwr 2 / 48 The sensR package siz e Main functions in sensR discrim AnotA samediff twoAC betabin glm clm clmm The sensR package: Difference and similarity testingDTU Sensometrics 2015 findcr clm2twoAC SDT discrimR discrimSim samdiffSim The sensR package: Difference and similarity testingDTU Sensometrics 2015 3 / 48 Reference manual (51 pages) News / change-log Vignettes: Statistical methodology for sensory discrimination tests and its implementation in sensR Examples for papers Supporting package: ordinal for ratings data, A-not A with sureness, 2-AC etc. c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 4 / 48 The sensR package The sensR package The sensR package Obtaining sensR The sensR package Citation citation("sensR") ## ## To cite the sensR-package in publications use: ## ## Christensen, R. H. B. & P. B. Brockhoff (2015). sensR - An ## R-package for sensory discrimination. R package version 1.4-5. ## http://www.cran.r-project.org/package=sensR/. ## ## A BibTeX entry for LaTeX users is ## ## @Misc{, ## title = {sensR---An R-package for sensory discrimination}, ## author = {R. H. B. Christensen and P. B. Brockhoff}, ## year = {2015}, ## note = {R package version 1.4-5. http://www.cran.r-project.org/package=sensR/}, ## } 1 Download and install R from http://www.cran.r-project.org/ 2 Open R and run install.packages("sensR") 3 Tell R that you want to use it with library(sensR) 4 Analyze your data — try for example discrim(10, 15, method="triangle") c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 The sensR package 5 / 48 The sensR package X X X X X X X X X X X Regression analysis X X Replicated X X Likelihood CI Simulation Similarity test at io in Sample size Difference test X X X (X) X (X) Power d0 estimation X X X X X X n D X X X X X X rim 6 / 48 The sensR package Publications related to sensR (and ordinal) Duo-Trio, Triangle, Tetrad 2-AFC, 3-AFC A-not A Same-Different 2-AC A-not A w. Sureness isc The sensR package: Difference and similarity testingDTU Sensometrics 2015 The sensR package Protocols supported in sensR c Per B Brockhoff (DTU) c Per B Brockhoff (DTU) X X X X X X X X X X The sensR package: Difference and similarity testingDTU Sensometrics 2015 Brockhoff , P. B., & Christensen, R. H. B. (2010). Thurstonian models for sensory discrimination tests as generalized linear models. Food Quality and Preference, 21, 330–338. Christensen, R. H. B., & Brockhoff, P. B. (2009). Estimation and Inference in the Same Different Test. Food Quality and Preference, 20, 514–524. Christensen, R. H. B., Cleaver, G., & Brockhoff, P. B. (2011). Statistical and Thurstonian models for the A-not A protocol with and without sureness. Food Quality and Preference, 22, 542-549. Christensen, R. H. B., Lee, H-S. & Brockhoff , P. B. (2012). Estimation of the Thurstonian Model for the 2-AC Protocol. Food Quality and Preference, 24, 119-128. Christensen, R. H. B., & Brockhoff, P. B. (2013). Analysis of sensory ratings data with cumulative link models. J. Soc. Fr. Stat. & Rev. Stat. App., 154(3), 58-79. Christensen, R.H.B., Ennis, J.M., Ennis, D.M. & Brockhoff, P.B.(2014). Paired preference data with a no-preference option - Statistical tests for comparison with placebo data. Food Quality and Preference, 32, 48-55. John M Ennis, Rune HB Christensen (2014). Precision of measurement in Tetrad testing. Food Quality and Preference, Vol 32, 98-106. Ennis, J.M. & Christensen, R.H.B.(2015). A Thurstonian comparison of the Tetrad and Degree of Difference tests. Food Quality and Preference, Vol 40, 263-269. T. Næs, P.B. Brockhoff and O. Tomic, (2010). Statistics for Sensory and Consumer Science, John Wiley & Sons, Chapter 7. 7 / 48 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 8 / 48 Why discrimination testing? Outline Discrimination testing - what can we use it for? Outline 1 The sensR package 2 Basic sensory difference testing 3 Five basic discrimination protocols 4 Proportion discriminators 5 The Thurstonian model 6 Power and sample size computations 7 Power and the five protocols Sensory discrimination testing is used to Show difference or similarity between products Detect unwanted product changes in product development Guide ingredient substitution (e.g. health initiatives) Detect production stability Measure consumer sensitivity Substantiate claims (“75% of consumers can tell the difference”) Dispute claims (“our product is at least as good as yours”) c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 Basic sensory difference testing 9 / 48 Difference testing with the Triangle protocol c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 Basic sensory difference testing Example 1: Ingredient substitution Difference testing with the Triangle protocol Example 1: Ingredient substitution In a calorie reduced yoghurt natural sweetener from from the Stevia plant is used. Unfortunately, consumers complained about a bitter after taste. The company decided to change to a different supplier of Stevia sweetener and now wants to know if consumers can actually tell the difference between the current production and the new yoghurt. It is decided to use a Triangle test with 100 consumers. The experiment was performed: Sensory discrimination protocol: Triangle Sample size: N = 100, no. correct answers: X = 45 Do consumers perceive the current and new yoghurts as significantly different? Each subject is presented with 3 samples, two of which are equal What are high and low values of X? Question: which sample is most different from the two others? What is the alternative hypothesis, HA ? X c Per B Brockhoff (DTU) Y What is the null hypothesis, H0 ? X The sensR package: Difference and similarity testingDTU Sensometrics 2015 10 / 48 11 / 48 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 12 / 48 Basic sensory difference testing Difference testing with the Triangle protocol Basic sensory difference testing What can we do to test our hypotheses? Difference testing with the Triangle protocol What is the expected proportion of correct answers? Which range of values of pc are plausible given 45 correct answers out of 100? How many correct answers would we expect if H0 is true and pc = 1/3? We don’t know the true pc What would an extreme observation be if H0 is true? Could the true pc be 0.40? How extreme is 45 correct out of 100 if products are not different? Could the true pc be 0.33? We can estimate pc : p̂c = X/N = 0.45 Could the true pc be 0.90? Which values for the true pc are in fact plausible? We need a confidence interval! c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 Basic sensory difference testing 13 / 48 Difference testing with the Triangle protocol 14 / 48 Duo-Trio, Triangle, Tetrad, 2-AFC, and 3-AFC pc : The expected proportion of correct answers p̂c : The observed proportion of correct answers H0 : Products are not different, pc = 1/3 HA : Products are different, pc > 1/3 xc = 45 is the critical value: P r(X ≥ 45) ≤ 0.05 if products are not different p-value: The probability of observing 45 correct answers or more if the products are not different (H0 is true) is p = 0.01 Plausible values of pc are (0.35; 0.55) based on a 95% confidence interval The sensR package: Difference and similarity testingDTU Sensometrics 2015 The sensR package: Difference and similarity testingDTU Sensometrics 2015 Five basic discrimination protocols Conclusions from Example 1 c Per B Brockhoff (DTU) c Per B Brockhoff (DTU) 15 / 48 Duo-Trio 1 reference sample XR and 2 test samples X and Y Which sample is most similar to the reference? Triangle: 3 samples (X, X 0 , Y ) which sample is most different from the two others? Tetrad: 4 samples (X, X 0 , Y , Y 0 ) Group the samples based on similarity? 2-AFC: 2 samples X and Y which sample has the strongest/weakest sensory magnitude? 3-AFC: 3 samples (X, X 0 , Y ) which sample has the strongest/weakest sensory magnitude? c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 16 / 48 Five basic discrimination protocols Five basic discrimination protocols Duo-Trio, Triangle, Tetrad, 2-AFC, and 3-AFC Protocol Duo-Trio 2-AFC Triangle Tetrad 3-AFC Samples 3 2 3 4 3 Alternatives 2 2 3 3 3 p0 1/2 1/2 1/3 1/3 1/3 Which protocol should I choose? Different criteria: Type Unspecified Specified Unspecified Unspecified Specified Specified or unspecified problem? No. samples to taste in each trial Statistical power Recommendations: The Triangle test is often preferred over the Duo-Trio test due to its higher power. Table: Four protocols with binomial answers. The Tetrad test is more powerful than the Triangle test, but requires that 4 instead of 3 samples be assessed in each trial. p0 : The probability of a correct answer by guessing The 2-AFC test is often preferred over the 3-AFC test because it has similar power, but fewer samples in each trial. Unspecified: General product differences Specified: Attribute specific differences (sweetness, saltiness, etc.) c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 17 / 48 c Per B Brockhoff (DTU) Proportion discriminators The sensR package: Difference and similarity testingDTU Sensometrics 2015 18 / 48 Proportion discriminators Which proportion of subjects can actually tell the difference between the products? Example 2: Proportion discriminators in a Triangle test Assume two kinds of subjects: Discriminators and Guessers Discriminators can tell the difference and always choose the right sample. Guessers can only guess and will sometimes choose the right sample. If N = 100 in a Triangle test and 10% can tell the difference, how many correct samples can I expect? We denote the proportion discriminators with pd . Is this model realistic? c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 19 / 48 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 20 / 48 Proportion discriminators Proportion discriminators Proportion discriminators and proportion correct Example 3: What is the proportion of discriminators? From pd to pc : Tetrad, Triangle, 3-AFC: pc = 2 1 + pd 3 3 Duo-Trio, 2-AFC: pc = 1 1 + pd 2 2 What is the proportion of discriminators in the sweetener example? Examples in R From pc to pd : Tetrad, Triangle, 3-AFC: pd = c Per B Brockhoff (DTU) pc − 1/3 2/3 Duo-Trio, 2-AFC: pd = What proportion of correct answers should we expect if we had used the Duo-Trio protocol instead? pc − 1/2 1/2 The sensR package: Difference and similarity testingDTU Sensometrics 2015 c Per B Brockhoff (DTU) 21 / 48 The sensR package: Difference and similarity testingDTU Sensometrics 2015 The Thurstonian model The Thurstonian model Gridgeman’s paradox Gridgeman’s paradox 1.0 Masuoka et al. (1995) Delwiche & O’Mahony (1996) Rousseau & O’Mahony (1997) Beer Choc. pudding Yoghurt no. tests 45 108 720 240 240 108 156 180 Triangle 21 42 363 104 99 50 106 105 3-AFC 32 62 539 161 148 75 145 152 Triangle 3−AFC 0.8 Proportion discriminators Product Bitter solutions Onion dip Salt solutions 1.0 Triangle 3−AFC Proportion correct Study Byer & Abrams (1953) Stillman (1993) Tedje et al. (1994) 22 / 48 0.6 0.4 0.2 0.0 0.8 0.6 0.4 0.2 0.0 a b c d e Studies f g h a b c d e f g h Studies Why does 3-AFC consistently give more correct answers than Triangle? c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 23 / 48 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 24 / 48 The Thurstonian model The Thurstonian model The Thurstonian model, 3-alternatives Variability in Thurstonian models d': sensory difference d': sensory difference A B A a1 a2 b a1 Normal distributions with equal variance d0 = (µ2 − µ1 )/σ — a signal-to-noise measurement Each decision rule (protocol) leads to a specific relation between d0 and pc Is the answer correct for Triangle? for 3-AFC? c Per B Brockhoff (DTU) B The sensR package: Difference and similarity testingDTU Sensometrics 2015 25 / 48 a2 b Variability in sensory intensity comes from Variability in stimulus (sample-sample variation, temperature, etc.) Variability within the subject (fatigue, learning, presentation order) c Per B Brockhoff (DTU) The Thurstonian model The sensR package: Difference and similarity testingDTU Sensometrics 2015 26 / 48 The Thurstonian model Psychometric functions Gridgeman’s paradox resolved 2.5 Triangle 3−AFC 1.0 2−AFC Thurstonian d−prime 2.0 0.9 3−AFC 0.8 Duo−trio pc 0.7 Triangle 0.6 1.5 1.0 0.5 0.5 0.0 0.4 a b 0.3 c d e f g h Studies 0 1 2 3 4 5 Triangle and 3-AFC agree for all studies when d0 is used as a measure of sensory difference. 6 d' c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 27 / 48 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 28 / 48 The Thurstonian model The Thurstonian model Examples of d-prime Example 4: d-prime estimation obvious difference (d' = 4) What is d-prime in the sweetener example? confusable (d' = 1) How low or high could we reasonably expect the true d-prime (δ) to be? Almost indistinguishable (d' = 0.2) c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 29 / 48 c Per B Brockhoff (DTU) Power and sample size computations The sensR package: Difference and similarity testingDTU Sensometrics 2015 30 / 48 Power and sample size computations Why plan sensory difference tests? Type I and Type II error rates H0 is true HA is true Because Accept H0 X Type II error (β) Reject H0 Type I error (α) X Five parameters involved when designing a sensory difference test: Too many tests is expensive and a waste of resources Sample size Too few tests is useless and a waste of resources Power (1 − β) Significance level (α) Sensory difference that you want to detect (d0 , pc , or pd ) Test protocol (Triangle, Duo-Trio, Tetrad, 2-AFC, 3-AFC) Usually we want a power between 80% and 90%. c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 31 / 48 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 32 / 48 Power and sample size computations Power and sample size computations What is the power of a sensory difference test really? Example 5: Power estimation Power is the probability of finding a difference (if it is really there) What is the power of detecting d0 = 2 in a Triangle test with α = 0.05 and n = 15? of finding a difference if HA is true of rejecting H0 if HA is true of correctly rejecting H0 What are the null and alternative hypotheses? that the number of correct answers is at least as large as the critical value What is the critical value for the test? that the p-value from a sensory difference test is less than α c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 33 / 48 c Per B Brockhoff (DTU) Power and sample size computations Power versus pd Some new technical equipment has been bought to your company to replace an old machine used in the production of your calorie reduced yoghurt. The producer of the new machine promises that your yoghurt will remain exactly as before, but upon tasting the result, you strongly believe there is a subtle but detectable difference. You decide to use the Duo-Trio test and have a budget for 100 consumers. d0 = 1? 1.0 0.8 Power To make sure you don’t get complaints from your costumers, you decide to test whether consumers can tell the difference between the yoghurts made with the old and the new equipment. 0.6 2 alternatives 3 alternatives 0.4 0.2 0.0 0.0 How many consumers would you need to achieve 80% power? The sensR package: Difference and similarity testingDTU Sensometrics 2015 0.1 0.2 0.3 0.4 0.5 pd If you are convinced that the difference is primarily a difference in smoothness, then what can you do? c Per B Brockhoff (DTU) 34 / 48 Power and the five protocols Example 6: Power and sample size estimation What is the probability that you can detect a difference of The sensR package: Difference and similarity testingDTU Sensometrics 2015 More alternatives means more power 35 / 48 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 36 / 48 Power and the five protocols Power and the five protocols Power versus d-prime Power versus sample size pd = 0.25 1.0 1.0 Power 0.8 n = 100 Power 3 alternatives 0.8 0.6 0.2 2 alternatives 0.4 0.2 Duo−Trio Triangle 2−AFC 3−AFC Tetrad 0.4 0.6 0.0 10 30 50 70 90 Sample size 0.0 0.0 0.5 1.0 1.5 130 150 2.0 Power is a zig-zag function of sample size. This is due to the discreteness of the binomial distribution Smoother behavior for 3 alternatives d−prime Specified tests are more powerful than unspecified tests c Per B Brockhoff (DTU) 110 The sensR package: Difference and similarity testingDTU Sensometrics 2015 37 / 48 c Per B Brockhoff (DTU) Power and the five protocols The sensR package: Difference and similarity testingDTU Sensometrics 2015 38 / 48 Power and the five protocols Reject the next consumer? First exact and stable exact sample sizes Power can decrease with increasing sample size! ● 0.80 ● ● 0.80 ● 0.78 ● ● Power ● 0.76 ● 0.72 ● 0.70 0.74 ● ● ● ● ● ● ● ● ● ● ● 100 ● ● ● ● 0.70 ● ● ● ● ● 0.72 ● ● 0.76 ● ● ● ● ● ● ● 0.74 Power ● 0.78 ● ● 105 110 Sample size 115 120 ● 100 105 110 Sample size 115 First exact sample size (conventional): 120 The smallest sample size that has the required power Stable exact: The smallest sample size that gives the required power Sample sizes provide guidelines — not conclusive requirements. c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 The smallest sample size that has the required power such that no greater sample size has a power smaller than that required. 39 / 48 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 40 / 48 Power and the five protocols Power and the five protocols c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 Power and the five protocols c Per B Brockhoff (DTU) X Regression analysis X X X X X X X X X X X X X X X X X X The sensR package: Difference and similarity testingDTU Sensometrics 2015 42 / 48 Publications related to sensR (and ordinal) siz e plot ROC AUC us on dprime compare dprime test dprime table posthoc M d0 isc co ell mp an eo ari s n ati o str Illu Tr an sfo rm ati rescale psyfun psyinv psyderiv pd2pc pc2pd Brockhoff , P. B., & Christensen, R. H. B. (2010). Thurstonian models for sensory discrimination tests as generalized linear models. Food Quality and Preference, 21, 330–338. s on le mp Sa & Po we r d 0, CI ,t est s discrimPwr d.primePwr discrimSS d.primeSS twoACpwr X X Power and the five protocols Main functions in sensR discrim AnotA samediff twoAC betabin glm clm clmm X X Replicated at io rim c Per B Brockhoff (DTU) 41 / 48 X X Likelihood CI X X X (X) X (X) Simulation X X X X X X isc Lets see if we can replicate this number with sensR. Sample size Similarity test X X X X X X in The ISO standard for the Triangle test describes that for pd = 0.10, α = 0.05 and β = 0.001, the required (first exact) sample size is 1181. Power Difference test Duo-Trio, Triangle, Tetrad 2-AFC, 3-AFC A-not A Same-Different 2-AC A-not A w. Sureness n d0 estimation Protocols supported in sensR D Example 7: First-exact and stable-exact sample sizes findcr clm2twoAC SDT discrimR discrimSim samdiffSim The sensR package: Difference and similarity testingDTU Sensometrics 2015 43 / 48 Christensen, R. H. B., & Brockhoff, P. B. (2009). Estimation and Inference in the Same Different Test. Food Quality and Preference, 20, 514–524. Christensen, R. H. B., Cleaver, G., & Brockhoff, P. B. (2011). Statistical and Thurstonian models for the A-not A protocol with and without sureness. Food Quality and Preference, 22, 542-549. Christensen, R. H. B., Lee, H-S. & Brockhoff , P. B. (2012). Estimation of the Thurstonian Model for the 2-AC Protocol. Food Quality and Preference, 24, 119-128. Christensen, R. H. B., & Brockhoff, P. B. (2013). Analysis of sensory ratings data with cumulative link models. J. Soc. Fr. Stat. & Rev. Stat. App., 154(3), 58-79. Christensen, R.H.B., Ennis, J.M., Ennis, D.M. & Brockhoff, P.B.(2014). Paired preference data with a no-preference option - Statistical tests for comparison with placebo data. Food Quality and Preference, 32, 48-55. John M Ennis, Rune HB Christensen (2014). Precision of measurement in Tetrad testing. Food Quality and Preference, Vol 32, 98-106. Ennis, J.M. & Christensen, R.H.B.(2015). A Thurstonian comparison of the Tetrad and Degree of Difference tests. Food Quality and Preference, Vol 40, 263-269. T. Næs, P.B. Brockhoff and O. Tomic, (2010). Statistics for Sensory and Consumer Science, John Wiley & Sons, Chapter 7. c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 44 / 48 Details about the four basic protocols Details about the four basic protocols The Triangle test The Duo-Trio test 3 samples (X, X 0 , Y ) OR (X, Y , Y 0 ) 1 reference sample XR and 2 test samples X and Y 6 presentation orders XX 0 Y , XY X 0 , Y XX 0 , Y Y 0 X, Y XY 0 , XY Y 0 4 presentation orders XR XY , XR Y X, YR Y X, YR XY Unspecified test Unspecified test Question: Which sample is most similar to the reference? Question: which sample is most different from the two others? Constant or variable reference? c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 45 / 48 Details about the four basic protocols c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 46 / 48 Details about the four basic protocols The 2-AFC test The 3-AFC test 2 samples X and Y 3 samples (X, X 0 , Y ) OR (X, Y , Y 0 ) 2 presentation orders XY and Y X 6 presentation orders XX 0 Y , XY X 0 , Y XX 0 , Y Y 0 X, Y XY 0 , XY Y 0 Specified test Specified test Question: which sample has the strongest/weakest sensory mangnitude? Question: which sample has the strongest/weakest sensory mangnitude? Example: Which sample is the sweetest? Example: Which sample is the sweetest? c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 47 / 48 c Per B Brockhoff (DTU) The sensR package: Difference and similarity testingDTU Sensometrics 2015 48 / 48
© Copyright 2026 Paperzz