
Evaluation of Two Simultaneous Continuous Speech Recognition with ICA BSS and MFT-Based ASR

  • Conference paper
New Trends in Applied Artificial Intelligence (IEA/AIE 2007)

Abstract

We describe an adaptation of independent component analysis (ICA) and missing feature theory (MFT)-based ASR to the recognition of two simultaneous continuous speech signals. We have previously reported the utility of this system for isolated word recognition, but the performance of MFT-based ASR depends on its configuration, such as the acoustic model, so the system needs to be evaluated under more general conditions. The system first separates the sound sources with ICA, then estimates spectral distortion in the separated sounds to generate missing feature masks (MFMs), and finally recognizes the separated sounds with MFT-based ASR. Spectral distortion is estimated in the time-frequency domain in terms of feature vectors, from which the MFMs are generated. We tested both isolated word recognition and continuous speech recognition with cepstral and spectral features. With spectral features, the resulting system outperformed the baseline robot audition system by 13 points and 6 points, respectively.
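The abstract outlines a three-stage pipeline: ICA-based blind source separation, MFM estimation from spectral distortion, and MFT-based recognition. The paper's actual implementation (frequency-domain ICA and its distortion estimate) is not reproduced here; the Python sketch below only illustrates the general shape of such a pipeline, substituting FastICA on instantaneous mixtures and a toy energy-ratio reliability rule. All thresholds, function names, and parameters are illustrative assumptions.

```python
import numpy as np
from scipy.signal import stft
from sklearn.decomposition import FastICA

def separate_sources(mixtures):
    """Blind source separation on synchronized microphone signals.
    mixtures: (n_samples, n_mics) array. Instantaneous-mixing FastICA is a
    simplification of the frequency-domain ICA BSS used in the paper."""
    ica = FastICA(n_components=mixtures.shape[1], random_state=0)
    return ica.fit_transform(mixtures)            # (n_samples, n_sources)

def missing_feature_mask(separated, mixture, fs=16000, floor_db=-20.0):
    """Toy MFM: a time-frequency bin is marked reliable (1) when the separated
    source retains most of the mixture energy there, unreliable (0) otherwise.
    The -20 dB floor is an arbitrary illustrative threshold, not the paper's."""
    _, _, S = stft(separated, fs=fs, nperseg=512)
    _, _, M = stft(mixture, fs=fs, nperseg=512)
    ratio_db = 20.0 * (np.log10(np.abs(S) + 1e-10) - np.log10(np.abs(M) + 1e-10))
    return (ratio_db > floor_db).astype(float)

# Example with synthetic two-channel data standing in for real recordings
# of two simultaneous talkers captured by two microphones.
rng = np.random.default_rng(0)
mics = rng.standard_normal((16000, 2))
sources = separate_sources(mics)
mask = missing_feature_mask(sources[:, 0], mics[:, 0])
# 'mask' would accompany the spectral features handed to an MFT-capable
# decoder, which ignores or down-weights the bins flagged as unreliable.
```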






Editor information

Hiroshi G. Okuno, Moonis Ali


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Takeda, R., Yamamoto, S., Komatani, K., Ogata, T., Okuno, H.G. (2007). Evaluation of Two Simultaneous Continuous Speech Recognition with ICA BSS and MFT-Based ASR. In: Okuno, H.G., Ali, M. (eds) New Trends in Applied Artificial Intelligence. IEA/AIE 2007. Lecture Notes in Computer Science, vol 4570. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73325-6_38


  • DOI: https://doi.org/10.1007/978-3-540-73325-6_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73322-5

  • Online ISBN: 978-3-540-73325-6

  • eBook Packages: Computer Science (R0)
