Machine Vision and Applications, Volume 24, Issue 6, pp 1149–1165

Analysis of object description methods in a video object tracking environment

  • Pedro Carvalho
  • Telmo Oliveira
  • Lucian Ciobanu
  • Filipe Gaspar
  • Luís F. Teixeira
  • Rafael Bastos
  • Jaime S. Cardoso
  • Miguel S. Dias
  • Luís Côrte-Real
Original Paper


A key issue in video object tracking is how objects are represented and how effectively that representation discriminates between different objects. Several techniques have been proposed, but none is generally accepted. While analyses and comparisons of these individual methods have been presented in the literature, their evaluation as part of a global solution has been overlooked. The appearance model for the objects is one component of a video object tracking framework: it depends on previous processing stages and affects those that follow. These interdependencies should therefore be taken into account when analysing the performance of object description techniques. We propose an integrated analysis of object descriptors and appearance models through their comparison in a common object tracking solution. The goal is to contribute to a better understanding of object description methods and their impact on the tracking process. Our contributions are threefold: we propose a novel descriptor evaluation and characterisation paradigm; we perform the first integrated analysis of state-of-the-art description methods in a people-tracking scenario; and we put forward some ideas for appearance models to use in this context. This work provides foundations for future tests, and the proposed assessment approach contributes to the informed selection of the techniques most adequate for a given tracking application context.


Keywords: Computer vision, Descriptors, Appearance models, Tracking assessment, Video object tracking



The authors would like to thank the Fundação para a Ciência e a Tecnologia (FCT) - Portugal - and the European Commission for financing part of this work through the grants SFRH/BD/31259/2006, SFRH/BD/73667/2010 and Fundo Social Europeu (FSE). The work was partially supported by the projects QREN 7900 LUL (Living Usability Lab), a co-promotion R&D project funded by European Structural Funds for Portugal (FEDER) through COMPETE as part of the National Strategic Reference Framework (QREN) and managed by Agência de Inovação (ADI); and QREN 13852 AAL4ALL (Ambient Assisted Living for All), co-financed by the European Community Fund FEDER through COMPETE – Programa Operacional Factores de Competitividade (POFC).



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Pedro Carvalho (1), corresponding author
  • Telmo Oliveira (1)
  • Lucian Ciobanu (1)
  • Filipe Gaspar (2)
  • Luís F. Teixeira (3)
  • Rafael Bastos (4)
  • Jaime S. Cardoso (1)
  • Miguel S. Dias (5)
  • Luís Côrte-Real (1)

  1. INESC TEC (formerly INESC Porto) and Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
  2. ADETTI-IUL/ISCTE-Lisbon University Institute, Lisbon, Portugal
  3. Departamento de Engenharia Informática, Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
  4. eyenov, ADETTI-IUL/ISCTE-Lisbon University Institute, Lisboa, Portugal
  5. Microsoft Language Development Center and ISCTE-Lisbon University Institute, Lisbon, Portugal
