Learning to Recognize Activities from the Wrong View Point

  • Ali Farhadi
  • Mostafa Kamali Tabrizi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5302)


Appearance features are good at discriminating activities in a fixed view, but behave poorly when the aspect changes. We describe a method to build features that are highly stable under change of aspect. It is not necessary to have multiple views to extract our features. Our features make it possible to learn a discriminative model of an activity in one view, and spot that activity in another view, for which one might possess no labeled examples at all. Our construction uses labeled examples to build activity models, and unlabeled, but corresponding, examples to build an implicit model of how appearance changes with aspect. We demonstrate our method with challenging sequences of real human motion, where discriminative methods built on appearance alone fail badly.


Keywords: Descriptive Feature · Transfer Learning · Random Projection · Appearance Feature · Human Activity Recognition



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Ali Farhadi (1)
  • Mostafa Kamali Tabrizi (2)
  1. Computer Science Department, University of Illinois at Urbana-Champaign, USA
  2. Institute for Studies in Theoretical Physics and Mathematics, Iran
