Real-time multiple sound source localization and counting using a soundfield microphone

Original Research

Abstract

This work presents a method for localizing and counting multiple sound sources based on a relaxed sparsity of speech signals. A soundfield microphone is adopted to avoid the redundancy and complexity of a conventional microphone array. After an effective measure is established, the relaxed sparsity of speech signals is investigated. This relaxed sparsity supports the broader assumption that “single-source” zones always exist among the soundfield microphone signals, an assumption validated by statistical analysis. Based on the detection of these “single-source” zones, the proposed method jointly estimates the number of active sources and their corresponding directions of arrival (DOAs) by applying a peak-searching approach to the normalized histogram of the estimated DOAs. The cross distortions caused by multiple simultaneously active sources are addressed by estimating the DOA only within these “single-source” zones. Evaluations show that the proposed method achieves higher accuracy in DOA estimation and source counting than existing techniques. The proposed method also has higher efficiency and lower complexity, which makes it suitable for real-time applications.
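
The histogram peak-searching step described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name `count_and_localize`, the bin width, the 3-tap circular smoothing, the peak threshold, and the minimum peak separation are all assumptions chosen for illustration. The sketch assumes the per-zone DOA estimates (in degrees) have already been obtained from the detected “single-source” zones.

```python
import numpy as np

def count_and_localize(doa_deg, bin_width=5, min_height=0.05, min_sep_deg=20):
    """Hypothetical helper: jointly estimate source count and DOAs
    from per-zone DOA estimates given in degrees."""
    n_bins = 360 // bin_width
    hist, edges = np.histogram(np.mod(doa_deg, 360.0),
                               bins=n_bins, range=(0.0, 360.0))
    hist = hist / max(hist.sum(), 1)  # normalized histogram of estimated DOAs

    # Assumed smoothing: 3-tap moving average that wraps around 0/360 degrees.
    smooth = (np.roll(hist, 1) + hist + np.roll(hist, -1)) / 3.0

    # Circular local maxima that exceed the (assumed) height threshold.
    is_peak = ((smooth > np.roll(smooth, 1))
               & (smooth >= np.roll(smooth, -1))
               & (smooth >= min_height))
    candidates = np.flatnonzero(is_peak)

    # Visit the strongest peaks first and drop any peak that lies
    # closer than min_sep_deg (circular distance) to one already kept.
    order = candidates[np.argsort(smooth[candidates])[::-1]]
    centers = edges[:-1] + bin_width / 2.0
    kept = []
    for idx in order:
        angle = float(centers[idx])
        if all(min(abs(angle - k), 360.0 - abs(angle - k)) >= min_sep_deg
               for k in kept):
            kept.append(angle)
    return len(kept), sorted(kept)

# Usage with synthetic estimates clustered around two assumed directions.
rng = np.random.default_rng(0)
estimates = np.concatenate([rng.normal(60.0, 3.0, 500),
                            rng.normal(200.0, 3.0, 500)])
n_sources, doas = count_and_localize(estimates)
print(n_sources, doas)
```

The number of retained peaks serves as the source count and the peak positions as the DOA estimates, so both quantities come from a single pass over the histogram. The resolution of the result is limited by the chosen bin width.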

Keywords

Multiple source localization; Direction of arrival estimation; Sparsity; Soundfield microphone

Notes

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61231015, 61201197), the Specialized Research Fund for the Doctoral Program of Higher Education of the People’s Republic of China (No. 20121103120017), and the Beijing Postdoctoral Research Foundation.

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Speech and Audio Signal Processing Lab, School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, China