TUT Acoustic Source Tracking System 2006

  • Pasi Pertilä
  • Teemu Korhonen
  • Tuomo Pirinen
  • Mikko Parviainen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4122)


This paper documents the acoustic source tracking system developed by TUT for the 2006 CLEAR evaluation campaign. The described system performs 3-D single-person tracking based on audio data received from multiple spatially separated microphone arrays. The evaluation focuses on the meeting room domain.

The system consists of four distinct stages. The first stage is time delay estimation (TDE) between microphone pairs inside each array. Based on the TDE results, direction of arrival (DOA) vectors are calculated for each array using a confidence metric. Source localization is then performed using a selected combination of DOA estimates. Finally, the location estimate is tracked using a particle filter to reduce noise. The system is capable of locating a speaker 72% of the time with an average accuracy of 25 cm.
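
As an illustration of the first two stages only, the following Python sketch estimates the time delay between one microphone pair and converts it to an arrival angle. It is not the authors' implementation: the abstract does not specify the delay estimator, so plain cross-correlation peak picking is assumed here, and the conversion assumes a far-field source and a single two-microphone pair rather than the 3-D DOA vectors and confidence metric mentioned above. The function names `estimate_delay` and `delay_to_angle`, the speed of sound value, and the example signal are all hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_delay(x, y, max_lag):
    """Return the lag in samples at which y best matches x.

    A positive result means y is a delayed copy of x.
    Assumption: plain cross-correlation, searched over [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate x[n] with y[n + lag] over the overlapping samples.
        score = 0.0
        for n in range(len(x)):
            m = n + lag
            if 0 <= m < len(y):
                score += x[n] * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag, sample_rate, mic_distance):
    """Convert a sample delay to a far-field arrival angle in degrees.

    The angle is measured from the broadside direction of the pair.
    """
    # Path difference divided by microphone spacing gives sin(angle).
    ratio = SPEED_OF_SOUND * lag / (sample_rate * mic_distance)
    ratio = max(-1.0, min(1.0, ratio))  # clamp against noise and rounding
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # Hypothetical test signal: a short pulse, delayed by 3 samples.
    x = [0.0] * 10 + [1.0, -2.0, 3.0] + [0.0] * 10
    y = [0.0] * 3 + x[:-3]
    lag = estimate_delay(x, y, max_lag=5)
    print("estimated lag:", lag)  # prints "estimated lag: 3"
    print("angle (deg):", delay_to_angle(lag, 44100, 0.2))
```

In a reverberant meeting room a weighted estimator would likely be more robust than plain cross-correlation, and the later stages (combining DOA estimates across arrays and particle filtering) are not covered by this sketch.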


False Alarm Rate, Audio Data, Microphone Array, Multiple Object Track, Sampling Importance Resampling
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Pasi Pertilä (1)
  • Teemu Korhonen (1)
  • Tuomo Pirinen (1)
  • Mikko Parviainen (1)
  1. Tampere University of Technology, P.O. Box 553, 33101 Tampere, Finland
