
Real-time human action recognition on an embedded, reconfigurable video processing architecture

  • Special Issue
  • Journal of Real-Time Image Processing


In recent years, automatic human action recognition has been widely researched within the computer vision and image processing communities. Here we propose a real-time, embedded vision solution for human action recognition, implemented on an FPGA-based ubiquitous device. This paper makes three main contributions. First, we have developed a fast human action recognition system that uses simple motion features and a linear support vector machine classifier. The method has been tested on a large, public human action dataset and achieves competitive performance for the temporal template class of approaches, which includes “Motion History Image” based techniques. Second, we have developed a reconfigurable, FPGA-based video processing architecture. One advantage of this architecture is that the system's processing performance can be reconfigured for a particular application by adding new or replicated processing cores. Finally, we have implemented a human action recognition system on this reconfigurable architecture. With a small number of human actions (hand gestures), this stand-alone system operates reliably at 12 frames/s, with an 80% average recognition rate using limited training data. Systems of this type have applications in security systems, man–machine communication and intelligent environments.
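
To make the temporal template idea concrete, the following is a minimal sketch of a Motion History Image update followed by a linear classifier decision. It is not the authors' implementation: the update rule is the commonly cited Bobick and Davis formulation, and the function names, the frame-differencing threshold, the value of `tau`, the one-vs-rest class selection, and the toy weights are all assumptions chosen for illustration.

```python
# Illustrative sketch only. Names, thresholds and weights are hypothetical
# and are not taken from the paper.

def update_mhi(mhi, prev_frame, frame, tau=10, diff_threshold=25):
    """Apply one Motion History Image step to a 2-D grid of integers.

    A pixel whose grey level changed by more than diff_threshold is set
    to tau; every other pixel decays by 1, never dropping below 0.
    """
    new_mhi = []
    for h_row, p_row, f_row in zip(mhi, prev_frame, frame):
        new_row = []
        for h, p, f in zip(h_row, p_row, f_row):
            if abs(f - p) > diff_threshold:
                new_row.append(tau)            # motion detected: reset to tau
            else:
                new_row.append(max(0, h - 1))  # no motion: history fades
        new_mhi.append(new_row)
    return new_mhi


def flatten(mhi):
    """Turn the 2-D MHI into a 1-D feature vector."""
    return [value for row in mhi for value in row]


def linear_svm_scores(features, weights, biases):
    """Compute w_k . x + b_k for each class k (one weight vector per class)."""
    return [
        sum(w * x for w, x in zip(w_k, features)) + b_k
        for w_k, b_k in zip(weights, biases)
    ]


def classify(features, weights, biases):
    """Return the index of the highest-scoring class (one-vs-rest, assumed)."""
    scores = linear_svm_scores(features, weights, biases)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy usage: three 2x2 frames, motion first at the top-left pixel and
# then at the bottom-right pixel.
frames = [
    [[0, 0], [0, 0]],
    [[100, 0], [0, 0]],
    [[100, 0], [0, 100]],
]
mhi = [[0, 0], [0, 0]]
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur, tau=3)

toy_weights = [[1, 0, 0, 0], [0, 0, 0, 1]]  # hypothetical trained weights
toy_biases = [0, 0]
label = classify(flatten(mhi), toy_weights, toy_biases)
```

Both steps need only per-pixel comparisons, decrements and a dot product per class, which suggests why simple motion features paired with a linear classifier are attractive for an embedded, real-time target.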



The authors would like to thank DTI and Broadcom Ltd. for their financial support of this research.

Author information


Corresponding author

Correspondence to Hongying Meng.

About this article

Cite this article

Meng, H., Freeman, M., Pears, N. et al. Real-time human action recognition on an embedded, reconfigurable video processing architecture. J Real-Time Image Proc 3, 163–176 (2008).
