Active Structured Learning for High-Speed Object Detection

  • Christoph H. Lampert
  • Jan Peters
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5748)


High-speed, smooth, and accurate visual tracking of objects in arbitrary, unstructured environments is essential for robotics and human motion analysis. However, building a system that can adapt to arbitrary objects and a wide range of lighting conditions is a challenging problem, especially if hard real-time constraints apply, as in robotics scenarios. In this work, we introduce a method for learning a discriminative object tracking system based on the recent structured regression framework for object localization. Using a kernel function that allows fast evaluation on the GPU, the resulting system can process video streams at speeds of 100 frames per second or more.

Consecutive frames in high-speed video sequences are typically very redundant, so for training an object detection system it is sufficient to have training labels for only a subset of all images. We propose an active learning method that selects training examples in a data-driven way, thereby minimizing the number of training labels required. Experiments on realistic data show that active learning is superior to previously used methods for dataset subsampling for this task.


Active Learning, Object Detection, Structure Regression, Compatibility Function, Active Learning Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Christoph H. Lampert (1)
  • Jan Peters (1)
  1. Max Planck Institute for Biological Cybernetics, Tübingen, Germany
