Advances in Acoustic Noise Tracking for Robust In-Vehicle Speech Systems

Akbacak, Murat; Hansen, John H. L.

doi:10.1007/978-0-387-45976-9_10

Abstract

Speech systems work reasonably well under homogeneous acoustic conditions but become fragile in practical applications involving real-world environments (e.g., in-car, broadcast news, digital archives) where the audio stream contains multi-environment characteristics. To date, most approaches to environmental noise in speech systems rest on assumptions about the noise rather than on exploring and characterizing its nature. In this chapter, we present our recent advances in the formulation and development of an in-vehicle environmental sniffing framework previously presented in [1,2,3,4]. The system comprises components that detect, classify, and track acoustic environmental conditions. The first goal of the framework is to seek out detailed information about the environmental characteristics instead of just detecting environmental change points. The second goal is to organize this knowledge effectively so that intelligent decisions can direct subsequent speech processing systems. After presenting the proposed framework, we consider future directions and discuss supervised versus unsupervised noise clustering and closed-set versus open-set noise classification.

This work was supported in part by DARPA through SPAWAR under Grant No. N66001-002-8906, from SPAWAR under Grant No. N66001-03-1-8905, and in part by NSF under Cooperative Agreement No. IIS-9817485.

References

[1] M. Akbacak, J. H. L. Hansen, “Environmental Sniffing: Noise Knowledge Estimation for Robust Speech Systems”, IEEE ICASSP-2003: International Conference Acoustics Speech & Signal Processing, vol. 2, pp. 113–116, Hong Kong, April 2003.
[2] M. Akbacak, J. H. L. Hansen, “Environmental Sniffing: Robust Digit Recognition for an In-Vehicle Environment”, Interspeech-Eurospeech-2003, pp. 2177–2180, Geneva, Switzerland, September 2003.
[3] M. Akbacak, J. H. L. Hansen, “Environmental Sniffing: Noise Knowledge Estimation for Robust Speech Systems”, IEEE Trans. Speech & Audio Proc., October 2005.
[4] J. H. L. Hansen, X. Zhang, M. Akbacak, U. Yapanel, B. Pellom, W. Ward, Chapter 2, DSP in Mobile and Vehicle Systems, H. Abut, J. H. L. Hansen and K. Takeda (Editors), Springer, 2005.
[5] R. Bakis, S. Chen, P. Gopalakrishnan, R. Gopinath, S. Maes, and L. Polymenakos, “Transcription of Broadcast News — System Robustness Issues and Adaptation Techniques”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 711–714, April 1997.
[6] U. Jain, M. A. Siegler, S. J. Doh, E. Gouvea, J. Huerta, P. J. Moreno, B. Raj, and R. M. Stern, “Recognition of Continuous Broadcast News with Multiple Unknown Speakers and Environments”, Proceedings of the ARPA Workshop on Speech Recognition Technology, pp. 61–66, February 1996.
[7] R. Bakis, S. Chen, P. Gopalakrishnan, R. Gopinath, S. Maes, L. Polymenakos, and M. Franz, “Transcription of Broadcast News Shows with the IBM Large Vocabulary Speech Recognition System”, Proceedings of DARPA Speech Recognition Workshop, pp. 67–72, February 1997.
[8] M. A. Siegler, U. Jain, B. Raj, and R. M. Stern, “Automatic Segmentation, Classification and Clustering of Broadcast News Audio”, Proceedings of DARPA Speech Recognition Workshop, pp. 97–99, February 1997.
[9] J. S. Lim, “Speech Enhancement”, Prentice Hall, Englewood Cliffs, NJ, 1983.
[10] J. H. L. Hansen, “Speech Enhancement”, Encyclopedia of Electrical and Electronics Engineering, John Wiley & Sons, vol. 20, pp. 159–175, 1999.
[11] J. G. Fiscus, “A Post Processing System to yield reduced error rates: Recognizer Output Voting Error Reduction (ROVER)”, IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 347–54, 1997.
[12] S. Chen and P. S. Gopalakrishnan, “Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion”, Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, pp. 127–132, February 1998.
[13] B. Zhou and J. H. L. Hansen, “Unsupervised Audio Stream Segmentation and Clustering via the Bayesian Information Criterion”, Proc. of Inter. Conf. on Spoken Language Processing ICSLP-2000, vol. 3, pp. 714–717, October 2000.
[14] G. Zhou, J. H. L. Hansen, and J. F. Kaiser, “Nonlinear Feature Based Classification of Speech under Stress”, IEEE Trans. on Speech & Audio Processing, vol. 9, no. 2, pp. 201–216, March 2001.
[15] Y. Gong, “Speech Recognition in Noisy Environments: A Survey”, Speech Communication, vol. 16, pp. 261–91, 1995.
[16] C. J. Leggetter and P. C. Woodland, “Maximum Likelihood Linear Regression for speaker adaptation of continuous density hidden Markov models”, Computer Speech and Language, vol. 9, no. 2, pp. 171–185, April 1995.
[17] M. Gales and S. Young, “Robust Continuous Speech Recognition using Parallel Model Combination”, IEEE Transactions on Speech and Audio Processing, vol. 4, pp. 352–359, September 1996.
[18] H. Hermansky, “Perceptual linear predictive (PLP) analysis of speech”, Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738–1752, 1990.
[19] R. Sarikaya and J. H. L. Hansen, “High Resolution Speech Feature Parameterization for Monophone Based Stressed Speech Recognition”, IEEE Signal Processing Letters, vol. 7, no. 7, pp. 182–185, July 2000.
[20] R. Sarikaya and J. H. L. Hansen, “Robust detection of Speech Activity in the Presence of Noise”, International Conference on Spoken Language Processing, vol. 4, pp. 1455–1458, December 1998.
[21] P. Angkititrakul, J. H. L. Hansen, S. Baghaii, “Cluster-dependent Modeling and Confidence Measure Processing for In-Set/Out-of-Set Speaker Identification”, Interspeech-2004/ICSLP-2004: Inter. Conf. Spoken Language Processing, Jeju Island, South Korea, October 2004.

Author information

Authors and Affiliations

Center for Robust Speech Systems, University of Texas at Dallas, Richardson, Texas, USA
Murat Akbacak & John H. L. Hansen

Editor information

Editors and Affiliations

San Diego State University, San Diego, California, USA
Hüseyin Abut
Sabanci University, Turkey
Hüseyin Abut
Center for Robust Speech Systems (CRSS) Department of Electrical Engineering, Erik Jonsson School of Engineering & Computer Science, University of Texas at Dallas, Richardson, TX, USA
John H. L. Hansen
Department of Media Science, Nagoya University, Nagoya, Japan
Kazuya Takeda

About this chapter

Cite this chapter

Akbacak, M., Hansen, J.H.L. (2007). Advances in Acoustic Noise Tracking for Robust In-Vehicle Speech Systems. In: Abut, H., Hansen, J.H.L., Takeda, K. (eds) Advances for In-Vehicle and Mobile Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-45976-9_10

DOI: https://doi.org/10.1007/978-0-387-45976-9_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-33503-2
Online ISBN: 978-0-387-45976-9
eBook Packages: Engineering, Engineering (R0)