Recognition by Enhanced Bag of Words Model via Topographic ICA

  • Min Jing
  • Hui Wang
  • Kathy Clawson
  • Sonya Coleman
  • Shuwei Chen
  • Jun Liu
  • Bryan Scotney
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8867)


The Bag-of-Words (BoW) model has been increasingly applied in computer vision: local features are first mapped to a codebook produced by a clustering method and then represented by a histogram of the words. One drawback of the BoW model is that the orderless histogram ignores the valuable spatial relationships among the features. In this study, we propose a novel framework based on topographic independent component analysis (TICA), which enables geometrically nearby feature components to be grouped together, thereby bridging the semantic gap in the BoW model. In addition, the compact features obtained from TICA help to build an efficient codebook. Furthermore, we introduce a new closeness measurement based on the Neighbourhood Counting Measure (NCM) to improve k-Nearest-Neighbour classification. Preliminary results on the KTH and TRECVID data demonstrate that the proposed TICA/NCM approach increases the recognition accuracy and improves the efficiency of the BoW model.
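The baseline BoW pipeline described in the abstract (cluster local features into a codebook, then summarise a video as a histogram of words) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain k-means written in NumPy, random vectors stand in for real local descriptors, and the function names `build_codebook` and `bow_histogram` are hypothetical. It includes neither the TICA feature step nor the NCM-based classifier.

```python
import numpy as np

def build_codebook(features, k, iters=20, seed=0):
    """Cluster local descriptors with plain k-means; the centroids act as the 'words'."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct descriptors chosen at random.
    centroids = features[rng.choice(len(features), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance from every descriptor to every centroid.
        d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return centroids

def bow_histogram(features, codebook):
    """Assign each descriptor to its nearest word and return normalised word counts."""
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    counts = np.bincount(words, minlength=len(codebook)).astype(float)
    return counts / counts.sum()

# Synthetic stand-ins for local descriptors (assumption: 8-dimensional features).
rng = np.random.default_rng(1)
train = rng.normal(size=(200, 8))   # descriptors pooled from training videos
video = rng.normal(size=(40, 8))    # descriptors from one video to encode

codebook = build_codebook(train, k=5)
hist = bow_histogram(video, codebook)

# Reversing the descriptor order leaves the histogram unchanged, which is
# the "orderless" property the abstract identifies as a drawback.
hist_reversed = bow_histogram(video[::-1], codebook)
```

The final two lines make the limitation concrete: the representation depends only on which words occur and how often, so any spatial or temporal arrangement of the features is lost.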


Keywords: bag of words, topographic ICA, neighbourhood counting measurement, action recognition





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Min Jing (1)
  • Hui Wang (1)
  • Kathy Clawson (2)
  • Sonya Coleman (1)
  • Shuwei Chen (1)
  • Jun Liu (1)
  • Bryan Scotney (1)

  1. University of Ulster, UK
  2. Middlesbrough College, UK
