Similarity, Equivalence Testing, and Discrimination Theory

Sensory Evaluation of Food

Part of the book series: Food Science Text Series ((FSTS))

Abstract

This chapter discusses equivalence testing and how the analyses of difference tests are modified to guard against Type II error (missing a true difference). Concepts of test power and required sample sizes are discussed and illustrated. An alternative approach to equivalence, namely interval testing, is introduced along with the concept of paired one-sided tests. Two theoretical approaches to measuring the size of a difference are introduced: discriminator theory (also called guessing models) and the signal detection or Thurstonian models.

Difference testing methods constitute a major foundation for sensory evaluation and consumer testing. These methods attempt to answer fundamental questions about stimulus and product similarity before descriptive or hedonic evaluations are even relevant. In many applications involving product or process changes, difference testing is the most appropriate mechanism for answering questions concerning product substitutability.

—D. M. Ennis (1993)

An erratum to this chapter can be found online at http://dx.doi.org/10.1007/978-1-4419-6488-5_20.

References

  • ASTM. 2008a. Standard guide for sensory claim substantiation. Designation E-1958-07. Annual Book of Standards, Vol. 15.08. ASTM International, West Conshohocken, PA, pp. 186–212.

  • ASTM. 2008b. Standard practice for estimating Thurstonian discriminal differences. Designation E-2262-03. Annual Book of Standards, Vol. 15.08. ASTM International, West Conshohocken, PA, pp. 253–299.

  • Amerine, M. A., Pangborn, R. M. and Roessler, E. B. 1965. Principles of Sensory Evaluation of Food, Academic Press, New York, pp. 437–440.

  • Antinone, M. A., Lawless, H. T., Ledford, R. A. and Johnston, M. 1994. The importance of diacetyl as a flavor component in full fat cottage cheese. Journal of Food Science, 59, 38–42.

  • Baird, J. C. and Noma, E. 1978. Fundamentals of Scaling and Psychophysics. Wiley, New York.

  • Bi, J. 2005. Similarity testing in sensory and consumer research. Food Quality and Preference, 16, 139–149.

  • Bi, J. 2006a. Sensory Discrimination Tests and Measurements. Blackwell, Ames, IA.

  • Bi, J. 2006b. Statistical analyses for R-index. Journal of Sensory Studies, 21, 584–600.

  • Bi, J. 2007. Similarity testing using paired comparison method. Food Quality and Preference, 18, 500–507.

  • Byer, A. J. and Abrams, D. 1953. A comparison of the triangle and two-sample taste test methods. Food Technology, 7, 183–187.

  • Delwiche, J. and O’Mahony, M. 1996. Flavour discrimination: An extension of the Thurstonian “paradoxes” to the tetrad method. Food Quality and Preference, 7, 1–5.

  • Ennis, D. M. 1990. Relative power of difference testing methods in sensory evaluation. Food Technology, 44(4), 114,116–117.

  • Ennis, D. M. 1993. The power of sensory discrimination methods. Journal of Sensory Studies, 8, 353–370.

  • Ennis, D. M. 2008. Tables for parity testing. Journal of Sensory Studies, 23, 80–91.

  • Ennis, D. M. and Ennis J. M. 2010. Equivalence hypothesis testing. Food Quality and Preference, 21, 253–256.

  • Ennis, D.M. and Mullen, K. 1986. Theoretical aspects of sensory discrimination. Chemical Senses, 11, 513–522.

  • Ennis, D. M. and O’Mahony, M. 1995. Probabilistic models for sequential taste effects in triadic choice. Journal of Experimental Psychology: Human Perception and Performance, 21, 1–10.

  • Ferdinandus, A., Oosterom-Kleijngeld, I. and Runneboom, A. J. M. 1970. Taste testing. MBAA Technical Quarterly, 7(4), 210–227.

  • Finney, D. J. 1971. Probit Analysis, Third Edition. Cambridge University, New York.

  • Frijters, J. E. R. 1979. The paradox of the discriminatory nondiscriminators resolved. Chemical Senses, 4, 355–358.

  • Frijters, J. E. R., Kooistra, A. and Vereijken, P. F. G. 1980. Tables of d’ for the triangular method and the 3-AFC signal detection procedure. Perception and Psychophysics, 27(2), 176–178.

  • Gacula, M. C., Singh, J., Altan, S. and Bi, J. 2009. Statistical Methods in Food and Consumer Research. Academic and Elsevier, Burlington, MA.

  • Green, D.M. and Swets, J. A. 1966. Signal Detection Theory and Psychophysics. Wiley, New York.

  • Lawless, H. T. 2010. A simple alternative analysis for threshold data determined by ascending forced-choice method of limits. Journal of Sensory Studies, 25, 332–346.

  • Lawless, H. T. and Schlegel, M. P. 1984. Direct and indirect scaling of taste—odor mixtures. Journal of Food Science, 49, 44–46.

  • Lawless, H. T. and Stevens, D. A. 1983. Cross-adaptation of sucrose and intensive sweeteners. Chemical Senses, 7, 309–315.

  • Macmillan, N. A. and Creelman, C. D. 1991. Detection Theory: A User’s Guide. University Press, Cambridge.

  • MacRae, A. W. 1995. Confidence intervals for the triangle test can give reassurance that products are similar. Food Quality and Preference, 6, 61–67.

  • Meilgaard, M., Civille, G. V. and Carr, B. T. 2006. Sensory Evaluation Techniques, Fourth Edition. CRC, Boca Raton.

  • Morrison, D. G. 1978. A probability model for forced binary choices. American Statistician, 32, 23–25.

  • O’Mahony, M. A. 1979. Short-cut signal detection measures for sensory analysis. Journal of Food Science, 44(1), 302–303.

  • O’Mahony, M. and Odbert, N. 1985. A comparison of sensory difference testing procedures: Sequential sensitivity analysis and aspects of taste adaptation. Journal of Food Science, 50, 1055.

  • O’Mahony, M., Masuoka, S. and Ishii, R. 1994. A theoretical note on difference tests: Models, paradoxes and cognitive strategies. Journal of Sensory Studies, 9, 247–272.

  • Schlich, P. 1993. Risk tables for discrimination tests. Food Quality and Preference, 4, 141–151.

  • Stillman, J. A. and Irwin, R. J. 1995. Advantages of the same-different method over the triangular method for the measurement of taste discrimination. Journal of Sensory Studies, 10, 261–272.

  • Thurstone, L. L. 1927. A law of comparative judgment. Psychological Review, 34, 273–286.

  • Ura, S. 1960. Pair, triangle and duo-trio test. Reports of Statistical Application Research. Japanese Union of Scientists and Engineers, 7, 107–119.

  • USFDA. 2001. Guidance for Industry. Statistical Approaches to Bioequivalence. U.S. Department of Health and Human Services, Food and Drug Administration, Center for Drug Evaluation and Research (CDER). http://www.fda.gov/cder/guidance/index.htm

  • Viswanathan, S., Mathur, G. P., Gnyp, A. W. and St. Pierre, C. C. 1983. Application of probability models to threshold determination. Atmospheric Environment, 17, 139–143.

  • Wellek, S. 2003. Testing Statistical Hypotheses of Equivalence. CRC (Chapman and Hall), Boca Raton, FL.

5.1 Appendix: Non-Central t-Test for Equivalence of Scaled Data

Bi (2007) described a similarity test for two means, as might come from scaled data such as acceptability ratings, descriptive panel data, or quality control panel data. The test statistic is $T_{\textrm{AH}}$, named after the original authors of the test, Anderson and Hauck. If we have two means, $M_1$ and $M_2$, from two groups of panelists with N panelists per group, and a pooled standard deviation estimate, s, the test proceeds as follows:

$$T_{{\textrm{AH}}} = \frac{{M_1 - M_2 }}{{s\sqrt {2/N} }}$$
(5.13)

The estimate s can be based on the two sample standard deviations, $S_1$ and $S_2$. With N panelists in each group, the pooled variance is

$$s^2 = \frac{{S_1^2 + S_2^2 }}{2}$$
(5.14)

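and we must also estimate a non-centrality parameter, δ,

$$\delta = \frac{{\Delta _{\textrm{o}} }}{{s\sqrt {2/N} }}$$
(5.15)

where $\Delta_{\textrm{o}}$ is the allowable difference, that is, the largest difference between the means that would still be accepted as equivalent.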

The calculated p-value is then

$$p = t_\nu (\left| {T_{{\textrm{AH}}} } \right| - \delta ) - t_\nu ( - \left| {T_{{\textrm{AH}}} } \right| - \delta )$$
(5.16)

where $t_\nu(\cdot)$ is the cumulative probability of the central t-distribution with ν = 2(N–1) degrees of freedom, evaluated at the value in parentheses. If p is less than our cutoff, usually 0.05, then we can conclude that the difference is within the acceptable interval and we have equivalence.
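The calculation in Eqs. 5.13 to 5.16 can be sketched in a few lines of Python. This is a minimal illustration, not a validated routine. The function names and the example ratings are hypothetical. The sketch assumes that s is the pooled standard deviation for two groups of equal size, with s² = (S1² + S2²)/2, and it obtains the central t cumulative probability by simple numerical integration so that no statistical package is required.

```python
import math


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """Cumulative probability of the central t-distribution at x.

    Integrates the t density from 0 to |x| with Simpson's rule
    (steps must be even) and then uses the symmetry about zero.
    """
    if x == 0.0:
        return 0.5
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))

    def pdf(t: float) -> float:
        return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))

    upper = abs(x)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def anderson_hauck_p(m1: float, m2: float, s1: float, s2: float,
                     n: int, delta0: float) -> float:
    """p-value of Eq. 5.16 for two groups of n panelists each."""
    # Assumption: pooled standard deviation for equal group sizes.
    s = math.sqrt((s1 ** 2 + s2 ** 2) / 2)
    se = s * math.sqrt(2 / n)          # standard error of M1 - M2
    t_ah = (m1 - m2) / se              # Eq. 5.13
    delta = delta0 / se                # Eq. 5.15
    df = 2 * (n - 1)
    return t_cdf(abs(t_ah) - delta, df) - t_cdf(-abs(t_ah) - delta, df)


if __name__ == "__main__":
    # Hypothetical acceptability ratings: 50 panelists per group,
    # allowable difference of 0.5 scale points.
    p = anderson_hauck_p(m1=6.5, m2=6.2, s1=1.5, s2=1.6, n=50, delta0=0.5)
    print(f"p = {p:.4f}; equivalent at 0.05: {p < 0.05}")
```

In routine work the t-distribution function of a statistical package would replace `t_cdf`. Equivalence is concluded when the returned p falls below the chosen cutoff.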

For paired data, the situation is even simpler, but in order to calculate your critical value, you need a calculator for critical points of the non-central F-distribution, as found in various statistical packages.

To apply this, perform a simple dependent samples (paired data) t-test. Determine the maximum allowable difference in terms of the scale difference and normalize this by stating it in standard deviation units. The obtained value of t is then compared to the critical value as follows:

$$C = \sqrt F$$
(5.17)

where F is the lower-tail critical value (the α quantile) of the non-central F-distribution with 1 and N–1 degrees of freedom and a non-centrality parameter of $N\varepsilon^2$, and $\varepsilon$ is the size of the critical difference in standard deviation units. The non-centrality parameter is the square of $\sqrt{N}\varepsilon$, the non-centrality of the paired t statistic when the true difference is exactly $\varepsilon$ standard deviations. If you do not have easy access to a calculator for the critical values of a non-central F, a very useful table is given in Gacula et al. (2009), where the obtained value of t may be compared directly to the critical value for an alpha level of 0.05 and various levels of $\varepsilon$ (Appendix Table A.30, pp. 812–813 in Gacula et al., 2009). The absolute value of the obtained t-value must be less than the critical value C to fall in the range of significant similarity or equivalence.

Worked examples can be found in Bi (2005) and Gacula et al. (2009).
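When neither a non-central F calculator nor the published table is at hand, the critical value C for the paired test can be approximated numerically. The sketch below is an illustration under stated assumptions, not the procedure of any particular package. It assumes that F is the lower α quantile and that the non-centrality parameter is Nε². It then uses the fact that the square of a non-central t variable with N–1 degrees of freedom and non-centrality √N·ε follows that non-central F-distribution, so C is the α quantile of |T|. All function names and example values are hypothetical.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal cumulative distribution function


def prob_abs_t_below(c: float, df: int, mu: float, steps: int = 1000) -> float:
    """P(|T| < c) for a non-central t variable T with non-centrality mu.

    Writes T = (Z + mu) / U with U = sqrt(chi-square_df / df), then
    integrates over the density of U with Simpson's rule (steps even).
    """
    log_norm = (math.log(2.0) + (df / 2) * math.log(df / 2)
                - math.lgamma(df / 2))

    def integrand(u: float) -> float:
        if u <= 0.0:
            return 0.0
        log_density = log_norm + (df - 1) * math.log(u) - df * u * u / 2
        return math.exp(log_density) * (_PHI(c * u - mu) - _PHI(-c * u - mu))

    upper = 1.0 + 8.0 / math.sqrt(df)  # U has negligible mass beyond this
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3


def equivalence_critical_value(n: int, eps: float, alpha: float = 0.05) -> float:
    """Critical value C of Eq. 5.17, found by bisection on P(|T| < C) = alpha."""
    df = n - 1
    mu = math.sqrt(n) * eps  # assumption: F non-centrality is mu**2 = n * eps**2
    lo, hi = 0.0, 1.0
    while prob_abs_t_below(hi, df, mu) < alpha:
        hi *= 2.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if prob_abs_t_below(mid, df, mu) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def is_equivalent(t_obs: float, n: int, eps: float, alpha: float = 0.05) -> bool:
    """Equivalence is concluded when |t| is below the critical value C."""
    return abs(t_obs) < equivalence_critical_value(n, eps, alpha)


if __name__ == "__main__":
    # Hypothetical paired test: 20 panelists, allowable difference of
    # 0.5 standard deviation units, obtained t of 0.40.
    c = equivalence_critical_value(n=20, eps=0.5)
    print(f"C = {c:.3f}; equivalent: {is_equivalent(0.40, 20, 0.5)}")
```

A value computed this way should be checked against the published table or a statistical package before it is relied on.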

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Lawless, H., Heymann, H. (2010). Similarity, Equivalence Testing, and Discrimination Theory. In: Sensory Evaluation of Food. Food Science Text Series. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-6488-5_5
