Learning Efficient Linear Predictors for Motion Estimation

  • Jiří Matas
  • Karel Zimmermann
  • Tomáš Svoboda
  • Adrian Hilton
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4338)


A novel object representation for tracking is proposed. The tracked object is represented as a constellation of spatially localised linear predictors that are learned from a single training image. In the learning stage, sets of pixels whose intensities allow optimal least-squares prediction of the transformations are selected as the support of each linear predictor.
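As a rough illustration of the least-squares learning step, the sketch below fits a linear map from intensity differences on a fixed support set to 2D translations, using synthetically shifted samples of one training image. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names, the restriction to integer translations, the sampling scheme, and the fact that the support is supplied by the caller rather than selected for optimality are all hypothetical simplifications.

```python
import numpy as np

def learn_linear_predictor(image, support, max_shift=3, n_samples=500, seed=0):
    """Fit H so that t is approximately H @ (I(support + t) - I(support)).

    image   : 2D float array, the single training image
    support : (n, 2) int array of (row, col) pixel coordinates (assumed given)
    The support must lie at least max_shift pixels inside the image border.
    """
    rng = np.random.default_rng(seed)
    ref = image[support[:, 0], support[:, 1]].astype(float)
    # Hypothetical training set: random integer translations of the support.
    shifts = rng.integers(-max_shift, max_shift + 1, size=(n_samples, 2))
    diffs = np.empty((n_samples, len(support)))
    for k, t in enumerate(shifts):
        moved = support + t
        diffs[k] = image[moved[:, 0], moved[:, 1]] - ref
    # Least-squares solution of diffs @ H.T = shifts.
    H_T, *_ = np.linalg.lstsq(diffs, shifts.astype(float), rcond=None)
    return H_T.T, ref

def predict_shift(H, ref, image, support):
    """Predict the translation from intensities observed on the support."""
    observed = image[support[:, 0], support[:, 1]].astype(float)
    return H @ (observed - ref)
```

Once learned, prediction is a single matrix-vector product per predictor, which is consistent with the efficiency emphasised in the abstract.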

The approach comprises three contributions: learning object-specific linear predictors, explicitly dealing with the trade-off between predictor precision and computational complexity, and selecting a view-specific set of predictors suitable for global object motion estimation. Robustness to occlusion is achieved by a RANSAC procedure.
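To make the role of RANSAC concrete, the sketch below estimates a global motion from the local displacements reported by many predictors, treating predictors on occluded parts of the object as outliers. It is only a sketch under assumptions: the abstract does not state the motion model or the RANSAC details, so the 2D affine model, the three-point minimal sample, the tolerance, and all function names here are hypothetical choices.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine map A with dst approximately A @ [x, y, 1]."""
    X = np.hstack([src, np.ones((len(src), 1))])
    A_T, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return A_T.T

def ransac_affine(src, dst, n_iters=200, tol=1.0, seed=0):
    """Robustly fit an affine motion to predictor outputs.

    src : (n, 2) predictor reference points
    dst : (n, 2) reference points plus each predictor's local displacement
    Returns the affine map refitted on the largest consensus set, and the
    boolean inlier mask.
    """
    rng = np.random.default_rng(seed)
    X = np.hstack([src, np.ones((len(src), 1))])
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Minimal sample: three correspondences determine an affine map.
        idx = rng.choice(len(src), size=3, replace=False)
        A = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(X @ A.T - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("no consensus set with at least three inliers")
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

Predictors whose support is occluded tend to report displacements inconsistent with the consensus motion, so they fall outside the inlier set and do not influence the final estimate.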

The learned tracker is very efficient, achieving a frame rate generally higher than 30 frames per second despite being implemented in Matlab.


Keywords

Motion Estimation, Object Motion, Linear Predictor, Learning Stage, Displacement Estimation





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jiří Matas (1, 2)
  • Karel Zimmermann (1)
  • Tomáš Svoboda (1)
  • Adrian Hilton (2)
  1. Center for Machine Perception, Czech Technical University, Prague, Czech Republic
  2. Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, England
