Abstract
This paper presents two new algorithms, both based on the least-squares support vector machine (LS-SVM), for mapping the time-differences-of-arrival (TDOAs) measured from microphone pairs to the sound source direction-of-arrival (DOA) and location in room environments. Least squares (LS) has been widely used in TDOA-based algorithms for sound source DOA estimation and localization to map the measured TDOAs to the source DOA or location. The drawback of the LS mapping is that its performance degrades significantly in some scenarios. To combat this problem, an LS-SVM regression-based algorithm for the nonlinear mapping is proposed, which outperforms the LS-based algorithm in noisy reverberant rooms. Conventional approaches to sound source localization usually assume that the microphones are ideal and that their locations are known a priori, assumptions that may not hold in practice. The microphone arrays therefore need to be calibrated carefully before use, but calibrating a microphone array perfectly is not an easy task. In this paper, we also propose an LS-SVM-based algorithm for sound source localization that has the advantage of not requiring microphone array calibration. The performance of the proposed algorithms is validated by simulation results in noisy reverberant environments.
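To make the idea of a learned nonlinear TDOA-to-DOA mapping concrete, the following is a minimal sketch of standard LS-SVM regression, not the authors' implementation. Everything specific in it is an assumption made for illustration: the far-field propagation model, the three microphone-pair baselines, the RBF kernel, the hyperparameters `gamma` and `sigma`, the noise level, and the helper names `simulate_tdoas`, `lssvm_fit` and `lssvm_predict`. The abstract does not state which features, kernel or parameters the paper uses.

```python
import numpy as np

C_SOUND = 343.0  # speed of sound in m/s (assumed value)

def simulate_tdoas(theta_deg, baselines, noise_std=0.0, rng=None):
    """Far-field TDOAs in seconds; one row per angle, one column per pair."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    unit = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # unit vectors towards the source
    tau = unit @ baselines.T / C_SOUND
    if noise_std > 0:
        tau = tau + rng.normal(0.0, noise_std, tau.shape)
    return tau

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LS-SVM regression system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate f(x) = sum_i alpha_i k(x, x_i) + b at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Hypothetical setup: three microphone-pair baseline vectors in metres.
rng = np.random.default_rng(0)
baselines = np.array([[0.2, 0.0], [0.0, 0.2], [0.2, 0.2]])

# Training set: known source angles with slightly noisy TDOAs (scaled to milliseconds).
train_theta = np.linspace(-60.0, 60.0, 41)
X_train = simulate_tdoas(train_theta, baselines, noise_std=2e-6, rng=rng) * 1e3
b, alpha = lssvm_fit(X_train, train_theta, gamma=1e3, sigma=0.3)

# Map noise-free TDOAs from unseen angles back to DOA estimates in degrees.
test_theta = np.array([-45.0, 0.0, 30.0])
X_test = simulate_tdoas(test_theta, baselines) * 1e3
pred = lssvm_predict(X_train, b, alpha, X_test, sigma=0.3)
print(np.round(pred, 2))
```

Because the training pairs here come from a simulated source at known angles, the sketch never uses the microphone positions at prediction time, which loosely mirrors the calibration-free motivation stated in the abstract. In a real room the TDOAs would come from a time-delay estimator rather than from a geometric model, and the hyperparameters would need to be tuned.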
Acknowledgement
The authors would like to thank the four anonymous reviewers for their helpful comments, which improved the presentation of the manuscript.
Cite this article
Chen, H., Ser, W. Sound Source DOA Estimation and Localization in Noisy Reverberant Environments Using Least-Squares Support Vector Machines. J Sign Process Syst 63, 287–300 (2011). https://doi.org/10.1007/s11265-009-0423-7
DOI: https://doi.org/10.1007/s11265-009-0423-7