Abstract
This paper describes a preprocessing algorithm for robust speech recognition based on independent vector analysis (IVA) using a feed-forward network. In the IVA framework, a feed-forward network can be used as a separating system to achieve successful separation of highly reverberant mixtures. For robust speech recognition, we apply cluster-based missing feature reconstruction to the log-spectral features of the separated speech during the extraction of mel-frequency cepstral coefficients. The algorithm identifies corrupted time-frequency segments as those with low signal-to-noise ratios, calculated from the log-spectral features of the separated speech and the observed noisy speech. The corrupted segments are filled by bounded estimation based on the remaining, possibly reliable, log-spectral features and on knowledge of pre-trained log-spectral feature clusters. Experimental results demonstrate that the proposed method significantly enhances recognition performance in noisy environments.
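The two reconstruction steps described above (flagging low-SNR time-frequency cells, then filling them with a bounded, cluster-based estimate) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0 dB threshold, the use of natural-log power spectra, and the nearest-cluster-mean rule (a crude stand-in for a proper bounded estimate under a trained cluster model) are all assumptions made here for illustration.

```python
import math

def reliability_mask(sep_logspec, obs_logspec, snr_threshold_db=0.0, floor=1e-10):
    """Flag each time-frequency cell as reliable (True) or corrupted (False).

    Assumption: both inputs hold natural-log power spectra, frame by frame.
    The noise power is approximated as observed power minus separated power.
    """
    mask = []
    for sep_frame, obs_frame in zip(sep_logspec, obs_logspec):
        row = []
        for s, y in zip(sep_frame, obs_frame):
            speech_pow = math.exp(s)
            noise_pow = max(math.exp(y) - speech_pow, floor)  # avoid log of <= 0
            snr_db = 10.0 * math.log10(speech_pow / noise_pow)
            row.append(snr_db >= snr_threshold_db)
        mask.append(row)
    return mask

def reconstruct_frame(obs_frame, mask_frame, cluster_means):
    """Fill corrupted cells of one frame from the best-matching cluster.

    Assumption: the cluster is chosen by squared distance over the reliable
    cells only, and each filled value is bounded above by the observed value,
    since additive noise can only raise the observed log-spectral energy.
    """
    reliable = [k for k, ok in enumerate(mask_frame) if ok]

    def distance(mean):
        return sum((obs_frame[k] - mean[k]) ** 2 for k in reliable)

    best = min(cluster_means, key=distance)
    return [y if ok else min(best[k], y)
            for k, (y, ok) in enumerate(zip(obs_frame, mask_frame))]
```

In this sketch the upper bound is what distinguishes bounded estimation from plain cluster-mean substitution: a filled value never exceeds what was actually observed in that cell. The reconstructed log-spectral frames would then be passed on to the usual cepstral feature extraction.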
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oh, M., Park, HM. (2011). Preprocessing of Independent Vector Analysis Using Feed-Forward Network for Robust Speech Recognition. In: Lu, BL., Zhang, L., Kwok, J. (eds) Neural Information Processing. ICONIP 2011. Lecture Notes in Computer Science, vol 7063. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24958-7_43
DOI: https://doi.org/10.1007/978-3-642-24958-7_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24957-0
Online ISBN: 978-3-642-24958-7
eBook Packages: Computer Science (R0)