Abstract
To meet the demands for high precision and strong robustness in speaker tracking systems, this paper proposes a new particle filter algorithm for noise with unknown statistical characteristics. The proposed algorithm estimates and corrects the statistical characteristics of the unknown noise online with an improved Sage-Husa estimator, and produces the optimal distribution function with an unscented Kalman filter. Speaker tracking based on audio-visual fusion is then implemented within the framework of the new algorithm. Experimental results show that the proposed method improves the accuracy and robustness of the speaker tracking system.
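To illustrate the idea of estimating noise statistics online, the sketch below applies the standard Sage-Husa recursion for the measurement noise variance inside a scalar Kalman filter. It is not the improved estimator, the unscented Kalman filter, or the particle filter of this paper. The random-walk model, the function name `sage_husa_kalman`, and all parameter values (`q`, `r0`, `b`, `x0`, `p0`) are assumptions chosen for illustration only.

```python
import random


def sage_husa_kalman(measurements, q=1e-5, r0=1.0, b=0.99, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a Sage-Husa estimate of the measurement noise.

    Assumed model (illustrative only): x_k = x_{k-1} + w_k, z_k = x_k + v_k,
    with known process variance q and unknown measurement variance R.
    Returns a list of (state_estimate, r_estimate) pairs, one per measurement.
    """
    x, p, r = x0, p0, r0
    history = []
    for k, z in enumerate(measurements):
        # Time update: the assumed state transition is the identity.
        p_pred = p + q
        # Innovation: measurement minus predicted measurement.
        e = z - x
        # Fading-memory weight d_k = (1 - b) / (1 - b^(k+1)), with 0 < b < 1.
        d = (1.0 - b) / (1.0 - b ** (k + 1))
        # Sage-Husa recursion for R (the measurement matrix H is 1 here).
        r = (1.0 - d) * r + d * (e * e - p_pred)
        # Guard added for this sketch: the recursion can go non-positive.
        r = max(r, 1e-6)
        # Measurement update using the current estimate of R.
        gain = p_pred / (p_pred + r)
        x = x + gain * e
        p = (1.0 - gain) * p_pred
        history.append((x, r))
    return history


if __name__ == "__main__":
    random.seed(0)
    # Synthetic data: constant true state 1.0, noise standard deviation 0.5.
    zs = [1.0 + random.gauss(0.0, 0.5) for _ in range(2000)]
    x_hat, r_hat = sage_husa_kalman(zs)[-1]
    print(f"state estimate: {x_hat:.3f}, estimated R: {r_hat:.3f}")
```

The forgetting factor `b` controls how quickly old innovations are discounted. Values close to 1 give a smoother but slower estimate of the noise variance.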
Acknowledgments
This research was supported by the National Natural Science Foundation of China (61263031) and the Natural Science Foundation of Gansu Province of China (1010RJZA046).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cao, J., Li, J., Li, W. (2013). Speaker Tracking Based on Audio-Visual Fusion with Unknown Noise. In: Sun, Z., Deng, Z. (eds) Proceedings of 2013 Chinese Intelligent Automation Conference. Lecture Notes in Electrical Engineering, vol 256. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38466-0_25
DOI: https://doi.org/10.1007/978-3-642-38466-0_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38465-3
Online ISBN: 978-3-642-38466-0
eBook Packages: Engineering, Engineering (R0)