Hand Gesture Recognition System Based in Computer Vision and Machine Learning

  • Paulo Trigueiros
  • Fernando Ribeiro
  • Luís Paulo Reis
Part of the Lecture Notes in Computational Vision and Biomechanics book series (LNCVB, volume 19)


Hand gesture recognition is a natural form of human-computer interaction and a very active area of research in computer vision and machine learning. It has many possible applications, giving users a simpler and more natural way to communicate with robots and system interfaces without the need for extra devices. The primary goal of gesture recognition research applied to Human-Computer Interaction (HCI) is therefore to create systems that can identify specific human gestures and use them to convey information or control devices. To that end, vision-based hand gesture interfaces require fast and extremely robust hand detection and real-time gesture recognition. This paper presents a solution for real-time gesture recognition that relies on machine learning algorithms and is generic enough to be applied in a wide range of human-computer interfaces. In the experiments carried out, the system achieved an accuracy of 99.4% in hand posture recognition and an average accuracy of 93.72% in dynamic gesture recognition. To validate the proposed framework, two applications were implemented. The first is a real-time system able to help a robotic soccer referee judge a game. The prototype combines a vision-based hand gesture recognition system with a formal language definition, the Referee CommLang, into what is called the Referee Command Language Interface System (ReCLIS). The second is a real-time system able to interpret Portuguese Sign Language. Sign languages are not standard or universal, and their grammars differ from country to country. Although the implemented prototype was trained to recognize only the vowels, it is easily extended to recognize the rest of the alphabet, making it a solid foundation for the development of any vision-based sign language recognition user interface system.


Hand posture recognition; Hand gesture recognition; Computer vision; Machine learning; Remote robot control; Human-computer interaction



The authors wish to thank all members of the Laboratório de Automação e Robótica (LAR) at the University of Minho, Guimarães. The authors would also like to thank everyone who contributed to the hand feature data acquisition phase, without whom it would have been very difficult to carry out this study. Special thanks also go to the Polytechnic Institute of Porto, the ALGORITMI Research Centre and the LIACC Research Center for the opportunity to develop this research work.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Paulo Trigueiros (1, 2, 4), corresponding author
  • Fernando Ribeiro (2, 4)
  • Luís Paulo Reis (3, 4, 5)
  1. Instituto Politécnico do Porto (IPP), Porto, Portugal
  2. DEI/EEUM, Departamento de Electrónica Industrial, Escola de Engenharia, Universidade do Minho, Guimarães, Portugal
  3. DSI/EEUM, Departamento de Sistemas de Informação, Escola de Engenharia, Universidade do Minho, Guimarães, Portugal
  4. Centro Algoritmi, Universidade do Minho, Guimarães, Portugal
  5. LIACC, Laboratório de Inteligência Artificial e Ciência de Computadores, Porto, Portugal
