Multiple Direction-of-Arrival Estimation for a Mobile Robotic Platform with Small Hardware Setup

  • Caleb Rascon
  • Luis Pineda
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 247)


Knowledge of how many users are in the environment, and where they are located, is essential for natural and efficient Human-Robot Interaction (HRI). However, carrying out the estimation of multiple Directions-of-Arrival (multi-DOA) on a mobile robotic platform poses a greater challenge, as the mobility of the service robot must be considered when proposing a solution. This needs to be balanced against the performance of the DOA estimation, specifically the number of users the system can detect, which is usually limited by the number of microphones used. In this contribution, a small, lightweight, and easily carried hardware system (based on a triangular 3-microphone array) is used, and a fast multi-DOA estimator is proposed that is able to estimate more users than the number of microphones employed.
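The paper's estimator itself is not reproduced here, but the principle it builds on, inferring direction from inter-microphone time delays, can be sketched for a single microphone pair. All names, the sampling rate, and the 0.21 m spacing below are illustrative assumptions, not the authors' actual setup:

```python
import numpy as np

def estimate_tdoa(sig_a, sig_b, fs):
    """Estimate the delay (seconds) of sig_b relative to sig_a via cross-correlation."""
    corr = np.correlate(sig_b, sig_a, mode="full")
    lag = np.argmax(corr) - (len(sig_a) - 1)  # lag in samples
    return lag / fs

def tdoa_to_doa(tdoa, mic_distance, c=343.0):
    """Convert a time difference of arrival to a direction angle in degrees.

    Far-field assumption: sin(theta) = c * tdoa / d.
    """
    s = np.clip(c * tdoa / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(s))

# Synthetic demo: a broadband source that reaches microphone B
# five samples later than microphone A.
fs = 16000
rng = np.random.default_rng(0)
src = rng.standard_normal(2048)
delay = 5
mic_a = src
mic_b = np.concatenate([np.zeros(delay), src[:-delay]])

tdoa = estimate_tdoa(mic_a, mic_b, fs)
doa = tdoa_to_doa(tdoa, mic_distance=0.21)
```

With three microphones in a triangle, the same computation over each of the three pairs yields redundant angle estimates that can be fused to resolve front-back ambiguity; estimating more simultaneous sources than microphones, as the paper proposes, requires additionally separating the delay evidence contributed by each source.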


HRI · Lightweight · Microphone array · Mobile · Multiple direction of arrival · Reverberation · Service robot



The authors would like to thank CONACYT for its support through project 81965, and PAPIIT-UNAM through project IN115710-3.



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Universidad Nacional Autónoma de México, Mexico City, México
