Preprocessing of Independent Vector Analysis Using Feed-Forward Network for Robust Speech Recognition

Oh, Myungwoo; Park, Hyung-Min

doi:10.1007/978-3-642-24958-7_43

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7063)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

This paper describes an algorithm that uses independent vector analysis (IVA) with a feed-forward network as preprocessing for robust speech recognition. In the framework of IVA, a feed-forward network can be used as a separating system to achieve successful separation of highly reverberated mixtures. For robust speech recognition, we apply cluster-based missing feature reconstruction to the log-spectral features of the separated speech during the extraction of mel-frequency cepstral coefficients. The algorithm identifies corrupted time-frequency segments as those with low signal-to-noise ratios, calculated from the log-spectral features of the separated speech and the observed noisy speech. The corrupted segments are filled in by bounded estimation based on the presumably reliable log-spectral features and on knowledge of the pre-trained log-spectral feature clusters. Experimental results demonstrate that the proposed method significantly enhances recognition performance in noisy environments.
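
The masking and reconstruction steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes that the log-spectral features are natural logarithms of band powers, that noise power is the observed power minus the separated power, that clusters are Gaussian with diagonal covariance, and that a missing value is the posterior-weighted cluster mean capped at the observed value. The function names, the 0 dB threshold, and the parameter names are all hypothetical.

```python
import numpy as np


def reliability_mask(log_sep, log_obs, snr_threshold_db=0.0, eps=1e-10):
    """Return a boolean (frames x bands) mask, True where a cell looks reliable.

    Assumption: inputs are natural-log band powers, and the noise power is
    approximated by (observed power - separated power).
    """
    p_sep = np.maximum(np.exp(log_sep), eps)
    p_noise = np.maximum(np.exp(log_obs) - p_sep, eps)
    snr_db = 10.0 * np.log10(p_sep / p_noise)
    return snr_db >= snr_threshold_db


def bounded_cluster_reconstruction(log_obs, mask, means, variances, priors):
    """Fill unreliable cells using diagonal-Gaussian clusters (simplified).

    means, variances: (clusters x bands); priors: (clusters,).
    Cluster posteriors are computed from the reliable bands of each frame.
    Each missing band is set to the posterior-weighted cluster mean, capped
    at the observed value, which acts as an upper bound on the clean feature.
    """
    out = log_obs.copy()
    for t in range(log_obs.shape[0]):
        reliable = mask[t]
        if reliable.all():
            continue
        log_post = np.log(priors)
        if reliable.any():
            diff = log_obs[t, reliable] - means[:, reliable]
            var = variances[:, reliable]
            log_lik = -0.5 * np.sum(diff ** 2 / var + np.log(2.0 * np.pi * var), axis=1)
            log_post = log_post + log_lik
        # Normalize in the log domain for numerical stability.
        post = np.exp(log_post - log_post.max())
        post /= post.sum()
        estimate = post @ means
        missing = ~reliable
        out[t, missing] = np.minimum(estimate[missing], log_obs[t, missing])
    return out
```

In use, `reliability_mask` would be called on the log-spectral features of the separated and observed signals, and the resulting mask passed to `bounded_cluster_reconstruction` together with cluster parameters trained on clean speech; the reconstructed features would then go through the usual cepstral transform. A full cluster-based method would typically use full covariance matrices and a proper bounded MAP estimate rather than the capped mean used here.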

Author information

Authors and Affiliations

Department of Electronic Engineering, Sogang University, #1 Shinsu-dong, Mapo-gu, Seoul, 121-742, Republic of Korea
Myungwoo Oh & Hyung-Min Park

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800, Dongchuan Road, 200240, Shanghai, China
Bao-Liang Lu & Liqing Zhang
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
James Kwok

About this paper

Cite this paper

Oh, M., Park, HM. (2011). Preprocessing of Independent Vector Analysis Using Feed-Forward Network for Robust Speech Recognition. In: Lu, BL., Zhang, L., Kwok, J. (eds) Neural Information Processing. ICONIP 2011. Lecture Notes in Computer Science, vol 7063. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24958-7_43

DOI: https://doi.org/10.1007/978-3-642-24958-7_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24957-0
Online ISBN: 978-3-642-24958-7
eBook Packages: Computer Science, Computer Science (R0)
