Volume 41, Issue 2, pp 161–182

Nearest neighbour classification of Indian sign language gestures using kinect camera



People with speech disabilities often communicate in sign language, which makes it difficult for them to interact with people who do not know it. There is a need for an interpretation system that can act as a bridge between signers and non-signers. A functional, unobtrusive Indian sign language recognition system was implemented and tested on real-world data. A vocabulary of 140 symbols was collected from 18 subjects, totalling 5041 images. The vocabulary consisted mostly of two-handed signs drawn from a wide repertoire of technical and daily-use words. The system was implemented using the Microsoft Kinect, so surrounding light conditions and object colour have a negligible effect on its efficiency. The proposed method offers a novel, low-cost and easy-to-use application for Indian Sign Language recognition using the Microsoft Kinect camera. In the fingerspelling category of our dataset, we achieved recognition rates above 90% for 13 signs and 100% for 3 signs; overall, 16 distinct letters (A, B, D, E, F, G, H, K, P, R, T, U, W, X, Y, Z) were recognised with an average accuracy of 90.68%.
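As a rough illustration of the nearest neighbour classification named in the title, and of how per-sign recognition rates can be averaged into a single figure, consider the minimal sketch below. It is not the authors' implementation: the abstract does not specify the features or the distance measure, so the two-dimensional feature vectors, the Euclidean distance, and the helper names `nearest_neighbour` and `per_class_accuracy` are all assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbour(train, query, k=1):
    """Return the majority label among the k training samples closest to query.

    train is a list of (feature_vector, label) pairs; with k=1 this is the
    plain nearest neighbour rule.
    """
    ranked = sorted(train, key=lambda item: euclidean(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def per_class_accuracy(train, test):
    """Fraction of correctly classified test samples, computed per label."""
    correct, total = Counter(), Counter()
    for features, label in test:
        total[label] += 1
        if nearest_neighbour(train, features) == label:
            correct[label] += 1
    return {label: correct[label] / total[label] for label in total}


# Toy data: hypothetical 2-D features standing in for real gesture descriptors.
train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"),
         ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
test = [((0.05, 0.1), "A"), ((0.95, 1.0), "B"), ((0.6, 0.6), "A")]

rates = per_class_accuracy(train, test)
# Average of the per-class rates, analogous to averaging over the 16 letters.
average = sum(rates.values()) / len(rates)
print(rates, average)
```

In this toy example the third test sample lies closer to the "B" training points and is misclassified, so the per-class rates differ and the reported average is the mean of those rates rather than the fraction of all test samples classified correctly.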


Indian sign language recognition; multi-class classification; gesture recognition.



Copyright information

© Indian Academy of Sciences 2016

Authors and Affiliations

  1. Department of Computer Science and Engineering, Indian Institute of Technology Jodhpur, Rajasthan, India
