Intelligibility of HE-AAC Coded Japanese Words with Various Stereo Coding Modes in Virtual 3D Audio Space

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5954)

Abstract

In this paper, we investigated the influence of stereo coding on Japanese speech localized in virtual 3-D space. We encoded localized speech using the joint stereo and parametric stereo modes of the HE-AAC encoder. First, we tested the subjective quality of speech localized at various azimuths on the horizontal plane relative to the listener, using standard MUSHRA tests, and compared the quality obtained with the different stereo encoding modes. The joint stereo mode showed significantly higher MUSHRA scores than the parametric stereo mode at azimuths of ±45 degrees. Next, we conducted Japanese word intelligibility tests using the Japanese Diagnostic Rhyme Tests. Test speech was first localized at 0 and ±45 degrees and compared with localized speech with no coding. Parametric stereo-coded speech scored lower when localized at -45 degrees, but all other coded speech showed no difference from speech with no coding. Test speech was then localized in front of the listener, while competing noise was localized at various angles. The two stereo coding modes were tested at bit rates of 56, 32, and 24 kbps. In most cases, intelligibility under these conditions was as good as that of speech with no encoding, at all noise azimuths. This shows that stereo coding has almost no effect on intelligibility in the bit rate range tested.
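
As an illustration of how scores from a two-alternative rhyme test are typically computed, the sketch below applies the conventional chance-adjusted formula, 100 × (correct − incorrect) / total. It is an assumption that this is the scoring used in the paper; the abstract does not state it. The function name `drt_score` and the response counts are hypothetical and are not data from the study.

```python
def drt_score(correct: int, incorrect: int) -> float:
    """Chance-adjusted score (percent) for a two-alternative rhyme test.

    Pure guessing gives about 0, perfect identification gives 100.
    Assumed formula; the abstract does not specify the scoring rule.
    """
    total = correct + incorrect
    if total == 0:
        raise ValueError("no responses to score")
    return 100.0 * (correct - incorrect) / total


# Hypothetical response counts (correct, incorrect) per condition.
responses = {
    "no coding": (110, 10),
    "joint stereo, 24 kbps": (108, 12),
}

# Score each condition so coded and uncoded speech can be compared.
scores = {name: drt_score(c, w) for name, (c, w) in responses.items()}

for name, score in scores.items():
    print(f"{name}: {score:.1f}")
```

Subtracting the incorrect responses corrects for guessing: a listener who answers at random in a two-choice test is right about half the time, which maps to a score near zero rather than 50.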

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kobayashi, Y., Kondo, K., Nakagawa, K. (2010). Intelligibility of HE-AAC Coded Japanese Words with Various Stereo Coding Modes in Virtual 3D Audio Space. In: Ystad, S., Aramaki, M., Kronland-Martinet, R., Jensen, K. (eds) Auditory Display. CMMR ICAD 2009. Lecture Notes in Computer Science, vol 5954. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12439-6_12

  • DOI: https://doi.org/10.1007/978-3-642-12439-6_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12438-9

  • Online ISBN: 978-3-642-12439-6

  • eBook Packages: Computer Science, Computer Science (R0)
