Automatic Detection of Object of Interest and Tracking in Active Video

Journal of Signal Processing Systems

Abstract

We propose a novel method for automatic detection and tracking of an Object of Interest (OOI) in videos actively acquired by non-calibrated cameras. The proposed approach benefits from the object-centered property of Active Video and facilitates self-initialization in tracking. We first use a color-saliency weighted Probability-of-Boundary (cPoB) map for keypoint filtering and salient region detection. Successive Classification and Refinement (SCR) is then used for tracking between two consecutive frames: a strong classifier trained on-the-fly by AdaBoost classifies keypoints, and a subsequent Linear Programming step solves a maximum similarity problem to reject outliers. Experiments demonstrate the importance of Active Video during the data collection phase and confirm that our new approach can automatically detect and reliably track the OOI in videos.
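
The on-the-fly keypoint classification step can be illustrated with a minimal AdaBoost sketch. This is not the authors' implementation (the notes indicate they used a MATLAB toolbox); it assumes decision stumps as weak learners, hypothetical 2-D keypoint features, and labels of +1 for object keypoints and -1 for background keypoints. All function names and the toy data are illustrative only.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (1, -1):
                # Weighted error: total weight of the misclassified samples.
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol if xi[f] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, x):
    _, f, thr, pol = stump
    return pol if x[f] >= thr else -pol

def adaboost(X, y, rounds=5):
    """Train a strong classifier as a weighted vote of decision stumps."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def classify(ensemble, x):
    """Sign of the weighted vote: +1 (object) or -1 (background)."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical keypoint features (e.g. two normalized descriptor values).
X = [(0.9, 0.2), (0.8, 0.4), (0.7, 0.9), (0.85, 0.6),   # object keypoints
     (0.1, 0.3), (0.2, 0.8), (0.3, 0.5), (0.15, 0.1)]   # background keypoints
y = [1, 1, 1, 1, -1, -1, -1, -1]

model = adaboost(X, y)
predictions = [classify(model, x) for x in X]
```

In a tracking setting, a classifier of this kind would be retrained as new frames arrive, and its per-keypoint decisions would then be passed to the outlier-rejection stage.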

Notes

  1. The term “tracking” here refers to one of the traditional camera motions in filming, whereas in other parts of this paper it refers to the action of following a moving OOI, as the term is generally used in the Computer Vision literature.

  2. We use the MATLAB code from http://research.graphicon.ru/machine-learning/gml-adaboost-matlab-toolbox.html.

  3. We implement the Online Feature Selection (OFS) tracker in MATLAB.

References

  1. Avidan, S. (2004). Support vector tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(8), 1064–1072.

  2. Avidan, S. (2005). Ensemble tracking. In Proceedings of computer vision and pattern recognition (pp. 494–501).

  3. Avidan, S., & Shamir, A. (2007). Seam carving for content-aware image resizing. ACM Transactions on Graphics, 26(3). doi:10.1145/1276377.1276390.

  4. Baker, S., & Matthews, I. (2004). Lucas-Kanade 20 years on: A unifying framework. International Journal of Computer Vision, 56(1), 221–255.

  5. Barber, C. B., Dobkin, D. P., & Huhdanpaa, H. (1995). The quickhull algorithm for convex hulls. ACM Transactions on Mathematical Software, 22, 469–483.

  6. Bay, H., Tuytelaars, T., & Gool, L. V. (2006). SURF: Speeded up robust features. In Proceedings of European conference on computer vision (pp. 404–417).

  7. Carpenter, R. (1977). Movements of the eyes. London: Pion.

  8. Cheng, Y. (1995). Mean shift, mode seeking, and clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17, 790–799.

  9. Chouinard, J. Y., Fortier, P., & Gulliver, T. A. (Eds.) (1996). Information theory and applications II, 4th Canadian workshop. Lac Delage, Québec, Canada, May 28–30, 1995. Selected papers. Lecture notes in computer science (Vol. 1133). Springer.

  10. Collins, R. T. (2003). Mean-shift blob tracking through scale space. In Proceedings of computer vision and pattern recognition.

  11. Collins, R. T., & Liu, Y. (2003). On-line selection of discriminative tracking features. In Proceedings of international conference on computer vision (pp. 346–352).

  12. Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 603–619.

  13. Comaniciu, D., Ramesh, V., & Meer, P. (2003). Kernel-based object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, 564–577.

  14. Enkelmann, W. (2001). Video-based driver assistance: From basic functions to applications. International Journal of Computer Vision, 45(3), 201–221.

  15. Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 119–139.

  16. Ghanbari, M. (1999). Video coding: An introduction to standard codecs. Stevenage: Institution of Electrical Engineers.

  17. Huang, J., & Li, Z. N. (2009). Automatic detection of object of interest and tracking in active video. In Proceedings of Pacific rim conference on multimedia (pp. 368–380).

  18. Huang, J., & Li, Z. N. (2009). Image trimming via saliency region detection and iterative feature matching. In Proceedings of international conference on multimedia expo (pp. 1322–1325).

  19. Intille, S. S., Davis, J. W., & Bobick, A. F. (1997). Real-time closed-world tracking. In Proceedings of computer vision and pattern recognition (pp. 697–703).

  20. Itti, L., & Koch, C. (1999). A comparison of feature combination strategies for saliency-based visual attention systems. In Proceedings of SPIE. Human vision and electronic imaging IV. (HVEI’99) (Vol. 3644, pp. 473–482). San Jose: SPIE.

  21. Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254–1259.

  22. Julesz, B. (1995). Dialogues on perception. Cambridge: MIT Press.

  23. Kadir, T., & Brady, M. (2001). Saliency, scale and image description. International Journal of Computer Vision, 45(2), 83–105.

  24. Kim, Z. (2008). Real time object tracking based on dynamic feature grouping with background subtraction. In Proceedings of computer vision and pattern recognition (pp. 1–8).

  25. Liu, D., Hua, G., & Chen, T. (2008). Videocut: Removing irrelevant frames by discovering the object of interest. In Proceedings of European conference on computer vision (Vol. I, pp. 441–453).

  26. Lu, Y., & Li, Z. N. (2008). Automatic object extraction and reconstruction in active video. Pattern Recognition, 41(3), 1159–1172.

  27. Mahadevan, V., & Vasconcelos, N. (2008). Background subtraction in highly dynamic scenes. In Proceedings of computer vision and pattern recognition (pp. 1–8).

  28. Martin, D. R., Fowlkes, C. C., & Malik, J. (2004). Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5), 530–549.

  29. Niebur, E., & Koch, C. (1998). Computational architectures for attention. In R. Parasuraman (Ed.), The attentive brain (pp. 163–186). MIT Press.

  30. Rother, C., Bordeaux, L., Hamadi, Y., & Blake, A. (2006). Autocollage. ACM Transactions on Graphics, 25(3), 847–852.

  31. Rother, C., Kumar, S., Kolmogorov, V., & Blake, A. (2005). Digital tapestry. In Proceedings of computer vision and pattern recognition (pp. 589–596).

  32. Rubner, Y., Tomasi, C., & Guibas, L. J. (2000). The earth mover’s distance as a metric for image retrieval. International Journal of Computer Vision, 40(2), 99–121.

  33. Simakov, D., Caspi, Y., Shechtman, E., & Irani, M. (2008). Summarizing visual data using bidirectional similarity. In Proceedings of computer vision and pattern recognition (pp. 1–8).

  34. Sizintsev, M., Derpanis, K. G., & Hogue, A. (2008). Histogram-based search: A comparative study. In Proceedings of computer vision and pattern recognition (pp. 1–8).

  35. Stauffer, C., & Grimson, W. E. L. (2000). Learning patterns of activity using real-time tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 747–757.

  36. Tsotsos, J. K., Culhane, S. M., Wai, W. Y. K., Lai, Y., Davis, N., & Nuflo, F. (1995). Modeling visual attention via selective tuning. Artificial Intelligence, 78(1–2), 507–545.

  37. Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. In Proceedings of computer vision and pattern recognition (Vol. I, pp. 511–518).

  38. Yin, Z., & Collins, R. T. (2008). Object tracking and detection after occlusion via numerical hybrid local and global mode-seeking. In Proceedings of computer vision and pattern recognition (pp. 1–8).

  39. You, W., Jiang, H., & Li, Z. N. (2008). Real-time multiple object tracking in smart environments. In Proceedings of international conference on robotics and biomimetics (pp. 818–823).

  40. Zhu, S., & Ma, K. K. (2000). A new diamond search algorithm for fast block-matching motion estimation. IEEE Transactions on Image Processing, 9(2), 287–290.

  41. Zivkovic, Z. (2004). Improved adaptive Gaussian mixture model for background subtraction. In Proceedings of international conference on pattern recognition (Vol. 2, pp. 28–31).

Acknowledgements

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada under the grant RGP36726.

Author information

Corresponding author

Correspondence to Jiawei Huang.

About this article

Cite this article

Huang, J., Li, ZN. Automatic Detection of Object of Interest and Tracking in Active Video. J Sign Process Syst 65, 49–62 (2011). https://doi.org/10.1007/s11265-010-0540-3

  • DOI: https://doi.org/10.1007/s11265-010-0540-3
