Abstract
We describe an adaptation of independent component analysis (ICA) and missing feature theory (MFT)-based automatic speech recognition (ASR) to the recognition of two simultaneous continuous speech signals. We previously reported the utility of this system for isolated word recognition, but the performance of MFT-based ASR depends on the configuration, such as the acoustic model, so the system needs to be evaluated under more general conditions. The system first separates the sound sources with ICA. It then estimates the spectral distortion in the separated sounds to generate missing feature masks (MFMs). Finally, MFT-based ASR recognizes the separated sounds. We estimate spectral distortion in the time-frequency domain in terms of feature vectors and generate MFMs accordingly. We tested both isolated word recognition and continuous speech recognition with cepstral and spectral features. The resulting system outperformed the baseline robot audition system by 13 points and 6 points, respectively, with spectral features.
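The pipeline sketched in the abstract can be illustrated with a toy example. This is not the authors' implementation; the thresholding rule, the unit-variance Gaussian scorer, and all variable names are illustrative assumptions. It shows only the last two steps: generating a hard MFM by comparing a separated spectrum against an estimate of the residual distortion, and scoring features MFT-style so that unreliable dimensions do not contribute.

```python
import numpy as np

def generate_mfm(separated_spec, distortion_est, threshold=0.5):
    """Hard missing feature mask (illustrative): a time-frequency cell is
    reliable (1.0) when the estimated distortion is small relative to the
    separated energy, unreliable (0.0) otherwise."""
    ratio = distortion_est / (separated_spec + 1e-12)  # avoid division by zero
    return (ratio < threshold).astype(float)

def masked_score(features, means, mask):
    """Marginalisation-style likelihood score in the spirit of MFT-based ASR:
    only dimensions marked reliable contribute (diagonal unit-variance
    Gaussian assumed purely for simplicity)."""
    diff2 = (features - means) ** 2
    return -0.5 * np.sum(mask * diff2)

# Toy data: four spectral features; the third is heavily distorted.
spec = np.array([1.0, 2.0, 0.5, 3.0])
dist = np.array([0.05, 0.10, 2.0, 0.10])
mask = generate_mfm(spec, dist)
print(mask)  # the distorted third cell is masked out: [1. 1. 0. 1.]
```

Under this sketch, the masked dimension has no effect on the score, which is the core idea of marginalisation-based MFT recognition: distorted evidence is ignored rather than allowed to corrupt the acoustic match.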
© 2007 Springer Berlin Heidelberg
Cite this paper
Takeda, R., Yamamoto, S., Komatani, K., Ogata, T., Okuno, H.G. (2007). Evaluation of Two Simultaneous Continuous Speech Recognition with ICA BSS and MFT-Based ASR. In: Okuno, H.G., Ali, M. (eds) New Trends in Applied Artificial Intelligence. IEA/AIE 2007. Lecture Notes in Computer Science(), vol 4570. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73325-6_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73322-5
Online ISBN: 978-3-540-73325-6