Journal of Real-Time Image Processing, Volume 3, Issue 3, pp 163–176

Real-time human action recognition on an embedded, reconfigurable video processing architecture

  • Hongying Meng
  • Michael Freeman
  • Nick Pears
  • Chris Bailey
Special Issue


In recent years, automatic human action recognition has been widely researched within the computer vision and image processing communities. Here we propose a real-time, embedded vision solution for human action recognition, implemented on an FPGA-based ubiquitous device. There are three main contributions in this paper. Firstly, we have developed a fast human action recognition system with simple motion features and a linear support vector machine classifier. The method has been tested on a large, public human action dataset and achieved competitive performance for the temporal template class of approaches, which include “Motion History Image” based techniques. Secondly, we have developed a reconfigurable, FPGA-based video processing architecture. One advantage of this architecture is that the system processing performance can be reconfigured for a particular application, with the addition of new or replicated processing cores. Finally, we have successfully implemented a human action recognition system on this reconfigurable architecture. With a small number of human actions (hand gestures), this stand-alone system operates reliably at 12 frames/s, with an 80% average recognition rate using limited training data. This type of system has applications in security systems, man–machine communications and intelligent environments.
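As a rough illustration of the temporal template idea mentioned in the abstract, the sketch below shows the textbook Motion History Image update (a moving pixel is set to a maximum value tau, a still pixel decays by one per frame) followed by a linear classifier decision over the flattened template. It is not the authors' implementation: the function names, the frame-difference threshold, the value of tau and the toy weights are all assumptions made for the example, and the weights here are hand-set rather than trained by an SVM.

```python
# Illustrative sketch only. Names, thresholds and weights are assumptions,
# not values taken from the paper.

def update_mhi(mhi, prev_frame, frame, tau=10, diff_threshold=25):
    """Apply one Motion History Image step to grayscale frames (lists of rows).

    A pixel whose intensity changed by more than diff_threshold is treated as
    moving and set to tau; every other pixel decays by 1, never below 0.
    """
    new_mhi = []
    for mhi_row, prev_row, cur_row in zip(mhi, prev_frame, frame):
        new_row = []
        for h, p, c in zip(mhi_row, prev_row, cur_row):
            if abs(c - p) > diff_threshold:
                new_row.append(tau)            # recent motion: refresh to maximum
            else:
                new_row.append(max(0, h - 1))  # no motion: history fades
        new_mhi.append(new_row)
    return new_mhi


def flatten(mhi, tau=10):
    """Turn the template into a feature vector scaled to the range [0, 1]."""
    return [h / tau for row in mhi for h in row]


def classify_linear(features, weights, biases):
    """Return the label with the highest linear score w . x + b."""
    scores = {
        label: sum(w * x for w, x in zip(ws, features)) + biases[label]
        for label, ws in weights.items()
    }
    return max(scores, key=scores.get)


# Tiny 2x2 example: one pixel moves in the first step, another in the second.
frames = [
    [[0, 0], [0, 0]],
    [[100, 0], [0, 0]],
    [[100, 0], [0, 100]],
]
mhi = [[0, 0], [0, 0]]
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)

# Hypothetical hand-set weights for two made-up gesture labels.
weights = {"wave": [1, 0, 0, 0], "push": [0, 0, 0, 1]}
biases = {"wave": 0.0, "push": 0.0}
label = classify_linear(flatten(mhi), weights, biases)
```

The per-pixel update needs only a comparison, an assignment and a decrement, and the classification step is a set of dot products, which suggests why this class of method is a plausible fit for a fixed hardware pipeline.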


  • Human motion recognition
  • Reconfigurable architectures
  • Embedded computer vision
  • FPGA
  • Machine learning



Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  • Hongying Meng (1)
  • Michael Freeman (1)
  • Nick Pears (1)
  • Chris Bailey (1)
  1. Department of Computer Science, University of York, York, UK
