Multi-Sound-Source Localization Using Machine Learning for Small Autonomous Unmanned Vehicles with a Self-Rotating Bi-Microphone Array

  • Short Paper
Journal of Intelligent & Robotic Systems

Abstract

While vision-based localization techniques have been widely studied for small autonomous unmanned vehicles (SAUVs), sound-source localization capabilities have not been fully enabled for them. This paper presents two novel approaches for SAUVs to perform three-dimensional (3D) multi-sound-source localization (MSSL) using only the inter-channel time difference (ICTD) signal generated by a self-rotating bi-microphone array. The two approaches are based on two machine learning techniques, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Random Sample Consensus (RANSAC), respectively, and their performance was tested and compared in both simulations and experiments. The results show that both approaches can correctly identify the number of sound sources, along with their 3D orientations, in a reverberant environment.
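To make the RANSAC idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes, as a simplification that the abstract does not state, that each source traces a sinusoid y = a·sin(θ) + b·cos(θ) in the ICTD signal as the array rotates through angle θ, and that each frame is dominated by a single source. The sequential "fit, remove inliers, repeat" loop, the function names, the thresholds (`tol`, `iters`, `min_inliers`), and the synthetic data are all hypothetical choices made for illustration.

```python
import math
import random


def fit_two_points(p, q):
    """Solve y = a*sin(t) + b*cos(t) exactly through two (t, y) samples."""
    (t1, y1), (t2, y2) = p, q
    det = math.sin(t1) * math.cos(t2) - math.cos(t1) * math.sin(t2)
    if abs(det) < 1e-9:  # angles (nearly) 0 or pi apart: system is singular
        return None
    a = (y1 * math.cos(t2) - y2 * math.cos(t1)) / det
    b = (y2 * math.sin(t1) - y1 * math.sin(t2)) / det
    return a, b


def residual(model, sample):
    """Absolute error between a sample and the model's prediction at its angle."""
    a, b = model
    t, y = sample
    return abs(a * math.sin(t) + b * math.cos(t) - y)


def ransac_sinusoid(samples, tol, iters, rng):
    """Return the candidate model with the most inliers, plus those inliers."""
    best_model, best_inliers = None, []
    for _ in range(iters):
        model = fit_two_points(*rng.sample(samples, 2))
        if model is None:
            continue
        inliers = [s for s in samples if residual(model, s) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


def find_sources(samples, tol=0.01, iters=200, min_inliers=10, seed=0):
    """Peel off one sinusoid at a time until too few points support a model."""
    rng = random.Random(seed)
    remaining, models = list(samples), []
    while len(remaining) >= min_inliers:
        model, inliers = ransac_sinusoid(remaining, tol, iters, rng)
        if model is None or len(inliers) < min_inliers:
            break
        models.append(model)
        remaining = [s for s in remaining if residual(model, s) > tol]
    return models


def make_demo_samples():
    """Synthetic, noise-free traces for two sources plus a few gross outliers."""
    truth = [(0.8, 0.5), (0.5, 2.0)]  # (amplitude, phase) pairs, made up
    samples = []
    for i in range(60):
        t = 2 * math.pi * i / 60
        amp, phase = truth[i % 2]  # alternate which source dominates a frame
        samples.append((t, amp * math.sin(t - phase)))
    samples += [(0.3 + k, 1.2 if k % 2 else -1.2) for k in range(6)]
    return samples, truth


if __name__ == "__main__":
    samples, _ = make_demo_samples()
    for a, b in find_sources(samples):
        print("amplitude %.3f, phase %.3f rad" % (math.hypot(a, b), math.atan2(-b, a)))
```

In this toy setup the number of models returned stands in for the number of sources, and each model's amplitude and phase stand in for the orientation parameters. How those map to actual 3D angles depends on the array geometry, which the abstract does not specify.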

Availability of data and materials

The data that support the findings of this study are available from the corresponding author, Deepak Gala, upon reasonable request.

Funding

All authors confirm that there has been no significant financial support for this work that could have influenced its outcome.

Author information

Contributions

Deepak Gala developed the theoretical formalism, performed the analysis and numerical simulations, and planned the experiments. He also wrote the first draft of the manuscript, prepared relevant materials, and analyzed the results. Nathan Lindsay contributed to the design, prototyping, and integration of hardware components for the experimental platform, and conducted the experiments and data collection. Liang Sun, as the research advisor of the first and second authors, initiated the research work presented in the paper, developed the research plan for methodologies, simulations, experiments, analysis, and data collection, and provided guidance for research discussions. All authors read and approved the revised manuscript.

Corresponding author

Correspondence to Deepak Gala.

Ethics declarations

Ethical Approval

No ethical approval was deemed necessary.

Consent to Participate

All authors voluntarily agreed to participate in this research study.

Consent for Publication

All authors confirm:

– that the work described has not been published before (except in the form of an abstract or as part of a published lecture, review, or thesis);

– that it is not under consideration for publication elsewhere;

– that its publication has been approved by all co-authors;

– that its publication has been approved (tacitly or explicitly) by the responsible authorities at the institution where the work is carried out.

All authors give their consent for information about themselves to be published in the Journal of Intelligent & Robotic Systems. All authors transfer their exclusive rights to the presented paper, including the right to publish the paper in the Journal of Intelligent & Robotic Systems.

Competing interests

All authors confirm that there are no known conflicts of interest associated with this publication.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gala, D., Lindsay, N. & Sun, L. Multi-Sound-Source Localization Using Machine Learning for Small Autonomous Unmanned Vehicles with a Self-Rotating Bi-Microphone Array. J Intell Robot Syst 103, 52 (2021). https://doi.org/10.1007/s10846-021-01481-4

  • DOI: https://doi.org/10.1007/s10846-021-01481-4
