Seeing is Worse than Believing: Reading People’s Minds Better than Computer-Vision Methods Recognize Actions

  • Andrei Barbu
  • Daniel P. Barrett
  • Wei Chen
  • Narayanaswamy Siddharth
  • Caiming Xiong
  • Jason J. Corso
  • Christiane D. Fellbaum
  • Catherine Hanson
  • Stephen José Hanson
  • Sébastien Hélie
  • Evguenia Malaia
  • Barak A. Pearlmutter
  • Jeffrey Mark Siskind
  • Thomas Michael Talavage
  • Ronnie B. Wilbur
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8693)

Abstract

We had human subjects perform a one-out-of-six class action recognition task from video stimuli while undergoing functional magnetic resonance imaging (fMRI). Support-vector machines (SVMs) were trained on the recovered brain scans to classify actions observed during imaging, yielding average classification accuracy of 69.73% when tested on scans from the same subject and of 34.80% when tested on scans from different subjects. An apples-to-apples comparison was performed with all publicly available software that implements state-of-the-art action recognition on the same video corpus with the same cross-validation regimen and same partitioning into training and test sets, yielding classification accuracies between 31.25% and 52.34%. This indicates that one can read people’s minds better than state-of-the-art computer-vision methods can perform action recognition.
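
The two evaluation regimens described above (within-subject and cross-subject classification of scans into one of six action classes) can be illustrated with a minimal scikit-learn sketch. This is not the authors' pipeline: the data below are synthetic placeholders standing in for preprocessed voxel features, and the fold counts, classifier, and regularization constant are assumptions made only for illustration.

```python
# Minimal sketch (not the authors' code) of within-subject and cross-subject
# SVM classification of fMRI scans into one of six action classes.
import numpy as np
from sklearn.model_selection import KFold, LeaveOneGroupOut, cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_subjects, scans_per_subject, n_voxels, n_classes = 8, 48, 2000, 6

# Placeholder feature matrix: one row per scan, one column per voxel feature.
X = rng.normal(size=(n_subjects * scans_per_subject, n_voxels))
y = rng.integers(0, n_classes, size=n_subjects * scans_per_subject)  # action labels
subjects = np.repeat(np.arange(n_subjects), scans_per_subject)       # subject of each scan

clf = LinearSVC(C=1.0, max_iter=10000)

# Within-subject regimen: cross-validate over one subject's scans, average over subjects.
within = [
    cross_val_score(clf, X[subjects == s], y[subjects == s],
                    cv=KFold(n_splits=8, shuffle=True, random_state=0)).mean()
    for s in range(n_subjects)
]
print(f"within-subject accuracy: {np.mean(within):.4f}")

# Cross-subject regimen: train on all but one subject, test on the held-out subject.
cross = cross_val_score(clf, X, y, groups=subjects, cv=LeaveOneGroupOut())
print(f"cross-subject accuracy:  {cross.mean():.4f}")
```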

Keywords

action recognition, fMRI

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Andrei Barbu (1)
  • Daniel P. Barrett (2)
  • Wei Chen (3)
  • Narayanaswamy Siddharth (4)
  • Caiming Xiong (5)
  • Jason J. Corso (6)
  • Christiane D. Fellbaum (7)
  • Catherine Hanson (8)
  • Stephen José Hanson (8)
  • Sébastien Hélie (2)
  • Evguenia Malaia (9)
  • Barak A. Pearlmutter (10)
  • Jeffrey Mark Siskind (2)
  • Thomas Michael Talavage (2)
  • Ronnie B. Wilbur (2)

  1. MIT, Cambridge, USA
  2. Purdue University, West Lafayette, USA
  3. SUNY Buffalo, Buffalo, USA
  4. Stanford University, Stanford, USA
  5. University of California at Los Angeles, Los Angeles, USA
  6. University of Michigan, Ann Arbor, USA
  7. Princeton University, Princeton, USA
  8. Rutgers University, Newark, USA
  9. University of Texas at Arlington, Arlington, USA
  10. National University of Ireland Maynooth, Co. Kildare, Ireland