Energetic Masking and Masking Release

  • John F. Culling
  • Michael A. Stone
Chapter
Part of the Springer Handbook of Auditory Research book series (SHAR, volume 60)

Abstract

Masking is of central interest in the cocktail party problem, because interfering voices may be sufficiently intense or numerous to mask the voice to which the listener is attending, rendering its discourse unintelligible. The definition of energetic masking is problematic, but it may be considered to consist of effects by which an interfering sound disrupts the processing of the speech signal in the lower levels of the auditory system. Maskers can affect speech intelligibility by overwhelming its representation on the auditory nerve and by obscuring its amplitude modulations. A release from energetic masking is obtained by using mechanisms at these lower levels that can recover a useful representation of the speech. These mechanisms can exploit differences between the target and masking speech such as in harmonic structure or in interaural time delay. They can also exploit short-term dips in masker strength or improvements in speech-to-masker ratio at one or other ear.

Keywords

Better-ear listening; Binaural unmasking; Dip listening; Equalization-cancelation; Fundamental frequency difference; Modulation masking; Onset-time differences; Spatial release from masking

Notes

Compliance with Ethics Requirements

John Culling has no conflicts of interest.

Michael Stone has no conflicts of interest.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Psychology, Cardiff University, Cardiff, UK
  2. Manchester Centre for Audiology and Deafness, School of Health Sciences, University of Manchester, Manchester, UK
