A Review on Sound Source Localization Systems

Desai, Dhwani; Mehendale, Ninad

doi:10.1007/s11831-022-09747-2

Review article
Published: 05 May 2022

Volume 29, pages 4631–4642 (2022)

Archives of Computational Methods in Engineering

Abstract

Sound source localization (SSL) systems estimate the direction of a sound source, an essential capability for robots and humanoids. Over the past two decades, researchers have worked to optimize SSL techniques and improve their accuracy. In this review, we group the proposed SSL techniques into four main types. The first uses conventional algorithms such as Generalized Cross Correlation (GCC), Multiple Signal Classification (MUSIC), and Time Difference of Arrival (TDOA) with multiple-microphone array configurations. The second applies the same conventional algorithms (GCC, MUSIC, TDOA) to binaural signal processing. The third and fourth types emerged in the last decade with the rise of Convolutional Neural Network (CNN) algorithms: the third uses CNNs with multiple-microphone array configurations, and the fourth, the most recent, uses CNNs for binaural signal processing. The review covers SSL techniques based on multiaural and binaural signals, using both conventional algorithms and CNNs, and gives an overview of SSL systems in terms of the number of microphones used, the layout of the microphone array, the localization algorithm, and localization in 3D space (azimuth, elevation, and distance). We found that, among all the SSL techniques reviewed, CNN-based approaches are the emerging and most optimized ones; the lowest error rate reported, 0.1%, was achieved by an SSL system using a CNN.
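To make the conventional, correlation-based family concrete, the following is a minimal sketch of how a two-microphone system might estimate a time difference of arrival with GCC and convert it to an azimuth. It is not taken from the review: the PHAT weighting, the far-field (plane-wave) geometry, the speed of sound of 343 m/s, the 0.2 m microphone spacing, and the names `gcc_phat` and `tdoa_to_azimuth` are all assumptions chosen for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at roughly 20 °C


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (seconds) of `sig` relative to `ref` with GCC-PHAT."""
    n = len(sig) + len(ref)  # zero-pad so the correlation is not circular
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible lags.
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs


def tdoa_to_azimuth(tau, mic_distance, c=SPEED_OF_SOUND):
    """Far-field azimuth in degrees, measured from the array broadside."""
    sin_theta = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))


# Hypothetical test signal: white noise reaching the second microphone
# 5 samples later than the reference microphone.
fs = 16000            # sampling rate in Hz (assumed)
mic_distance = 0.2    # microphone spacing in metres (assumed)
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
delay = 5
sig = np.concatenate((np.zeros(delay), ref[:-delay]))

tau = gcc_phat(sig, ref, fs, max_tau=mic_distance / SPEED_OF_SOUND)
azimuth = tdoa_to_azimuth(tau, mic_distance)
```

With a single microphone pair, the sign of the angle only separates left from right of the broadside; resolving front-back ambiguity, elevation, or distance would need more microphones or additional cues, which is where the array layouts and binaural features surveyed in the review come in.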

Data availability

Not applicable.

Code availability

Not applicable.

Acknowledgements

The authors would like to thank the University of Mumbai for the funding and K. J. Somaiya College of Engineering for the facilities and infrastructure provided.

Funding

Funding was received from the University of Mumbai under the Minor Research Project scheme ATD/ICD/2019-20/762 (project number 1070, dated 17 March 2020).

Author information

Authors and Affiliations

K. J. Somaiya College of Engineering, Mumbai, India
Dhwani Desai & Ninad Mehendale

Contributions

Conceptualization was done by Ninad Mehendale (NM). The formal analysis was performed by Dhwani Desai (DD) and NM. The original draft of the manuscript was prepared by DD. Review and editing were done by NM. Visualization work was carried out by DD and NM.

Corresponding author

Correspondence to Ninad Mehendale.

Ethics declarations

Conflict of interest

The authors, D. Desai and N. Mehendale, declare that they have no conflict of interest.

Ethical approval

All authors consciously assure that the manuscript fulfills the following statements: 1) This material is the authors’ own original work, which has not been previously published elsewhere. 2) The paper is not currently being considered for publication elsewhere. 3) The paper reflects the authors’ own research and analysis in a truthful and complete manner. 4) The paper properly credits the meaningful contributions of co-authors and co-researchers. 5) The results are appropriately placed in the context of prior and existing research.

Consent to participate

This article does not contain any studies with animals or humans performed by any of the authors. Informed consent was not required as there were no human participants. All the necessary permissions were obtained from the Institute Ethical Committee and the concerned authorities.

Consent for publication

The authors have obtained all necessary consents for publication from participants wherever required.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Desai, D., Mehendale, N. A Review on Sound Source Localization Systems. Arch Computat Methods Eng 29, 4631–4642 (2022). https://doi.org/10.1007/s11831-022-09747-2

Received: 05 October 2021
Accepted: 11 April 2022
Published: 05 May 2022
Issue Date: November 2022
DOI: https://doi.org/10.1007/s11831-022-09747-2
