Abstract
Knowing how many users are in the environment, and where they are located, is essential for natural and efficient Human-Robot Interaction (HRI). However, estimating multiple Directions-of-Arrival (multi-DOA) on a mobile robotic platform is more challenging, because any proposed solution must account for the mobility of the service robot. This constraint must be balanced against the performance of the DOA estimation, specifically the number of users the system can detect, which is usually limited by the number of microphones used. In this contribution, a small, lightweight, and easily carried hardware system (based on a 3-microphone triangular array) is used, and a fast multi-DOA estimator is proposed that can estimate the directions of more users than the number of microphones employed.
Acknowledgments
The authors thank CONACYT for its support through project 81965, and PAPIIT-UNAM for its support through project IN115710-3.
Copyright information
© 2014 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Rascon, C., Pineda, L. (2014). Multiple Direction-of-Arrival Estimation for a Mobile Robotic Platform with Small Hardware Setup. In: Kim, H., Ao, SI., Amouzegar, M., Rieger, B. (eds) IAENG Transactions on Engineering Technologies. Lecture Notes in Electrical Engineering, vol 247. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6818-5_16
DOI: https://doi.org/10.1007/978-94-007-6818-5_16
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6817-8
Online ISBN: 978-94-007-6818-5
eBook Packages: Engineering, Engineering (R0)