Advances in Acoustic Noise Tracking for Robust In-Vehicle Speech Systems

Chapter in: Advances for In-Vehicle and Mobile Systems

Abstract

Speech systems work reasonably well under homogeneous acoustic conditions but become fragile in practical applications involving real-world environments (e.g., in-car, broadcast news, digital archives), where the audio stream contains multi-environment characteristics. To date, most approaches to environmental noise in speech systems rest on assumptions about the noise rather than on exploring and characterizing its nature. In this chapter, we present our recent advances in the formulation and development of an in-vehicle environmental sniffing framework previously presented in [1,2,3,4]. The system comprises several components that detect, classify, and track acoustic environmental conditions. The first goal of the framework is to seek out detailed information about the environmental characteristics instead of just detecting environmental change points. The second goal is to organize this knowledge effectively so that intelligent decisions can direct subsequent speech processing systems. After presenting our proposed in-vehicle environmental sniffing framework, we consider future directions and discuss supervised versus unsupervised noise clustering, and closed-set versus open-set noise classification.

This work was supported in part by DARPA through SPAWAR under Grant No. N66001-002-8906, from SPAWAR under Grant No. N66001-03-1-8905, and in part by NSF under Cooperative Agreement No. IIS-9817485.

References

  1. M. Akbacak, J. H. L. Hansen, “Environmental Sniffing: Noise Knowledge Estimation for Robust Speech Systems,” IEEE ICASSP-2003: International Conference Acoustics Speech & Signal Processing, vol. 2, pp. 113–116, Hong Kong, April 2003.

  2. M. Akbacak, J. H. L. Hansen, “Environmental Sniffing: Robust Digit Recognition for an In-Vehicle Environment”, Interspeech-Eurospeech-2003, pp. 2177–2180, Geneva, Switzerland, September 2003.

  3. M. Akbacak, J. H. L. Hansen, “Environmental Sniffing: Noise Knowledge Estimation for Robust Speech Systems”, IEEE Trans. Speech & Audio Proc, October 2005.

  4. J. H. L. Hansen, X. Zhang, M. Akbacak, U. Yapanel, B. Pellom, W. Ward, Chapter 2, DSP in Mobile and Vehicle Systems, H. Abut, J.H.L. Hansen and K. Takeda (Editors) Springer, 2005.

  5. R. Bakis, S. Chen, P. Gopalakrishnan, R. Gopinath, S. Maes, and L. Polymenakos, “Transcription of Broadcast News: System Robustness Issues and Adaptation Techniques”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 711–714, April 1997.

  6. U. Jain, M. A. Siegler, S. J. Doh, E. Gouvea, J. Huerta, P. J. Moreno, B. Raj, and R. M. Stern, “Recognition of Continuous Broadcast News with Multiple Unknown Speakers and Environments”, Proceedings of the ARPA Workshop on Speech Recognition Technology, pp. 61–66, February 1996.

  7. R. Bakis, S. Chen, P. Gopalakrishnan, R. Gopinath, S. Maes, L. Polymenakos, and M. Franz, “Transcription of Broadcast News Shows with the IBM Large Vocabulary Speech Recognition System”, Proceedings of DARPA Speech Recognition Workshop, pp. 67–72, February 1997.

  8. M. A. Siegler, U. Jain, B. Raj, and R. M. Stern, “Automatic Segmentation, Classification and Clustering of Broadcast News Audio”, Proceedings of DARPA Speech Recognition Workshop, pp. 97–99, February 1997.

  9. J. S. Lim, “Speech Enhancement”, Prentice Hall, Englewood Cliffs, NJ, 1983.

  10. J. H. L. Hansen, Speech Enhancement, Encyclopedia of Electrical and Electronics Engineering, John Wiley & Sons, vol. 20, pp. 159–175, 1999.

  11. J. G. Fiscus, “A Post Processing System to yield reduced error rates: Recognizer Output Voting Error Reduction (ROVER)”, IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 347–54, 1997.

  12. S. Chen and P. S. Gopalakrishnan, “Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion”, Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, pp. 127–132, February 1998.

  13. B. Zhou and J. H. L. Hansen, “Unsupervised Audio Stream Segmentation and Clustering via the Bayesian Information Criterion”, Proc. of Inter. Conf. on Spoken Language Processing ICSLP-2000, vol. 3, pp. 714–717, October 2000.

  14. G. Zhou, J. H. L. Hansen, and J. F. Kaiser, “Nonlinear Feature Based Classification of Speech under Stress”, IEEE Trans. on Speech & Audio Processing, vol. 9, no. 2, pp. 201–216, March 2001.

  15. Y. Gong, “Speech Recognition in Noisy Environments: A Survey”, Speech Communication, vol. 16, pp. 261–91, 1995.

  16. C. J. Leggetter and P. C. Woodland, “Maximum Likelihood Linear Regression for speaker adaptation of continuous density hidden Markov models”, Computer Speech and Language, vol. 9, no. 2, pp. 171–185, April, 1995.

  17. M. Gales and S. Young, “Robust Continuous Speech Recognition using Parallel Model Combination”, IEEE Transactions on Speech and Audio Processing, vol. 4, pp. 352–359, September 1996.

  18. H. Hermansky, “Perceptual linear predictive (PLP) analysis of speech”, Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738–1752, 1990.

  19. R. Sarikaya and J. H. L. Hansen, “High Resolution Speech Feature Parameterization for Monophone Based Stressed Speech Recognition”, IEEE Signal Processing Letters, vol. 7, no. 7, pp. 182–185, July 2000.

  20. R. Sarikaya and J. H. L. Hansen, “Robust detection of Speech Activity in the Presence of Noise”, International Conference on Spoken Language Processing, vol. 4, pp. 1455–1458, December 1998.

  21. P. Angkititrakul, J. H. L. Hansen, S. Baghaii, “Cluster-dependent Modeling and Confidence Measure Processing for In-Set/Out-of-Set Speaker Identification”, Interspeech-2004/ICSLP-2004: Inter. Conf. Spoken Language Processing, Jeju Island, South Korea, October 2004.

Copyright information

© 2007 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Akbacak, M., Hansen, J.H.L. (2007). Advances in Acoustic Noise Tracking for Robust In-Vehicle Speech Systems. In: Abut, H., Hansen, J.H.L., Takeda, K. (eds) Advances for In-Vehicle and Mobile Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-45976-9_10

  • DOI: https://doi.org/10.1007/978-0-387-45976-9_10

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-33503-2

  • Online ISBN: 978-0-387-45976-9

  • eBook Packages: Engineering, Engineering (R0)
