Robust Ego Noise Suppression of a Robot

Ince, Gökhan; Nakadai, Kazuhiro; Rodemann, Tobias; Tsujino, Hiroshi; Imura, Jun-Ichi

doi:10.1007/978-3-642-13022-9_7

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6096)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

This paper describes an architecture that enables a robot to perform automatic speech recognition even while it is moving. The system consists of three blocks: (1) a multi-channel noise reduction block comprising consecutive stages of microphone-array-based sound localization, geometric source separation and post-filtering, (2) a single-channel template subtraction block and (3) a speech recognition block. In this work, we specifically investigate a missing-feature-theory-based automatic speech recognition (MFT-ASR) approach in block (3) that uses spectrotemporal elements derived from (1) and (2) to measure the reliability of the audio features and to generate masks that filter out unreliable speech features. We evaluate the proposed technique on a robot using word error rates. Furthermore, we present a detailed analysis of recognition accuracy to determine optimal parameters. The proposed MFT-ASR implementation attains significantly higher recognition performance than both the single-channel and multi-channel noise reduction methods.
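
As a rough illustration of how blocks (2) and (3) might interact, the sketch below subtracts an ego-noise template from one magnitude-spectrum frame and then derives a hard reliability mask from the result. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names `subtract_template` and `hard_mask`, the spectral floor, the ratio-based reliability measure and the threshold value are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def subtract_template(noisy: Sequence[float], template: Sequence[float],
                      floor: float = 0.01) -> List[float]:
    """Subtract an ego-noise template from a magnitude spectrum, bin by bin.

    Bins that would drop below floor * noisy are clamped to that value,
    a common guard against negative magnitudes (assumed, not from the paper).
    """
    if len(noisy) != len(template):
        raise ValueError("spectrum and template must have the same length")
    return [max(x - n, floor * x) for x, n in zip(noisy, template)]


def hard_mask(noisy: Sequence[float], enhanced: Sequence[float],
              threshold: float = 0.5) -> List[int]:
    """Mark a bin reliable (1) when enough of its magnitude survives.

    Reliability is taken here as enhanced / noisy; this ratio and the
    threshold are illustrative assumptions only.
    """
    mask = []
    for x, s in zip(noisy, enhanced):
        reliability = s / x if x > 0 else 0.0
        mask.append(1 if reliability >= threshold else 0)
    return mask


# One toy frame with four frequency bins (made-up numbers).
noisy = [4.0, 2.0, 1.0, 8.0]
template = [1.0, 1.5, 1.0, 2.0]

enhanced = subtract_template(noisy, template)
mask = hard_mask(noisy, enhanced)
print(enhanced)  # [3.0, 0.5, 0.01, 6.0]
print(mask)      # [1, 0, 0, 1]
```

In this toy frame the second and third bins are dominated by the template, so they are flagged unreliable and a missing-feature recognizer would ignore them when scoring.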

Author information

Authors and Affiliations

Honda Res. Inst. Japan Co., Ltd., 8-1 Honcho, Wako-shi, Saitama, 351-0188, Japan
Gökhan Ince, Kazuhiro Nakadai & Hiroshi Tsujino
Honda Res. Inst. Europe GmbH, Carl-Legien Strasse 30, 63073, Offenbach, Germany
Tobias Rodemann
Dept. of Mech. and Env. Informatics, Grad. School of Information Science and Eng., Tokyo Inst. of Tech., 2-12-1-W8-1, O-okayama, Meguro-ku, Tokyo, 152-8552, Japan
Gökhan Ince, Kazuhiro Nakadai & Jun-Ichi Imura

Editor information

Editors and Affiliations

Dept. of Computing and Numerical Analysis, University of Cordoba, Campus Universitario de Rabanales, Einstein Building, 3rd floor, 14071, Cordoba, Spain
Nicolás García-Pedrajas
Dept. of Computer Science and Artificial Intelligence, ETS de Ingenierias Informática y de Telecomunicación, University of Granada, 18071, Granada, Spain
Francisco Herrera
School of Computing, University of the West of Scotland, PA1 2BE, Paisley, UK
Colin Fyfe
Dept. Computer Science and Artificial Intelligence, ETS de Ingenierias Informática y de Telecomunicación, University of Granada, 18071, Granada, Spain
José Manuel Benítez
Department of Computer Science, Texas State University-San Marcos, 601 University Drive, TX 78666-4616, San Marcos, USA
Moonis Ali

About this paper

Cite this paper

Ince, G., Nakadai, K., Rodemann, T., Tsujino, H., Imura, JI. (2010). Robust Ego Noise Suppression of a Robot. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6096. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13022-9_7

DOI: https://doi.org/10.1007/978-3-642-13022-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13021-2
Online ISBN: 978-3-642-13022-9
eBook Packages: Computer Science, Computer Science (R0)
