Journal of Signal Processing Systems

Volume 65, Issue 1, pp 49–62

Automatic Detection of Object of Interest and Tracking in Active Video

Article

Abstract

We propose a novel method for automatic detection and tracking of the Object of Interest (OOI) in videos actively acquired with non-calibrated cameras. The approach exploits the object-centered property of Active Video, which allows the tracker to initialize itself. We first use a color-saliency weighted Probability-of-Boundary (cPoB) map for keypoint filtering and salient region detection. Tracking between consecutive frames is then performed by Successive Classification and Refinement (SCR): a strong classifier trained on-the-fly with AdaBoost classifies keypoints, and a subsequent Linear Programming step solves a maximum-similarity problem to reject outliers. Experiments demonstrate the importance of Active Video during data collection and confirm that the new approach can automatically detect and reliably track the OOI in videos.
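The matching-and-refinement step outlined above lends itself to a compact illustration. The sketch below is a schematic Python interpretation, not the authors' implementation: keypoints already classified as belonging to the OOI in two consecutive frames are paired by maximizing total descriptor similarity, posed here as a linear assignment problem and solved with scipy.optimize.linear_sum_assignment rather than the paper's exact Linear Programming formulation, after which weakly matched pairs are rejected as outliers. The function name match_keypoints and the min_similarity threshold are illustrative assumptions.

# Hedged sketch of maximum-similarity matching with outlier rejection.
# Descriptor extraction and the on-the-fly AdaBoost keypoint classifier
# described in the abstract are assumed to have run already; this only
# illustrates the pairing step between two consecutive frames.

import numpy as np
from scipy.optimize import linear_sum_assignment


def match_keypoints(desc_prev, desc_curr, min_similarity=0.7):
    """Pair OOI keypoint descriptors of frame t-1 with those of frame t.

    desc_prev: (M, D) descriptors kept in the previous frame.
    desc_curr: (N, D) descriptors kept in the current frame.
    Returns index pairs (i, j) whose cosine similarity exceeds the threshold.
    """
    # Cosine similarity between every pair of descriptors.
    a = desc_prev / (np.linalg.norm(desc_prev, axis=1, keepdims=True) + 1e-12)
    b = desc_curr / (np.linalg.norm(desc_curr, axis=1, keepdims=True) + 1e-12)
    similarity = a @ b.T  # shape (M, N)

    # Maximum total similarity == minimum total cost assignment.
    rows, cols = linear_sum_assignment(-similarity)

    # Outlier rejection: drop assignments with low similarity.
    keep = similarity[rows, cols] >= min_similarity
    return list(zip(rows[keep], cols[keep]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prev = rng.normal(size=(8, 64))
    curr = prev[rng.permutation(8)] + 0.05 * rng.normal(size=(8, 64))
    print(match_keypoints(prev, curr))

Casting the pairing as a one-to-one assignment enforces global consistency: a keypoint cannot be matched twice, and low-similarity assignments are discarded, which is the role the outlier-rejection stage plays in the abstract.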

Keywords

Saliency detection · Feature matching · Visual attention · Tracking · Active video

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Vision and Media Lab, School of Computing Science, Simon Fraser University, Burnaby, Canada