Robust hand gesture recognition system based on a new set of quaternion Tchebichef moment invariants

Abstract

Hand gesture recognition is a challenging task due to the complexity of hand movements and the variability of the same gesture when performed by different subjects. Recent technologies, such as the Kinect sensor, provide new opportunities by capturing both RGB and depth (RGB-D) images, which offer highly discriminative information for efficient hand gesture recognition. For feature extraction, traditional methods process the RGB and depth information independently. In this paper, we propose a robust hand gesture recognition system based on a new feature extraction method that fuses RGB images and depth information simultaneously, using quaternion algebra to provide a more robust and holistic representation. We introduce, for the first time, a novel type of feature extraction method, named quaternion Tchebichef moment invariants. The novelty of the proposed method lies in the direct derivation of invariants from their orthogonal moments, based on the algebraic properties of the discrete Tchebichef polynomials. The quaternion-based approach processes the four components holistically, for a robust and efficient hand gesture recognition system. The experimental and theoretical results demonstrate that the approach is very effective for hand gesture recognition and is robust against geometric distortion, noisy conditions and complex backgrounds compared with the state of the art, indicating that it could be highly useful for many computer vision applications.
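
As an illustration of the holistic representation described above, the following minimal sketch packs an RGB-D pixel into a single quaternion and implements the Hamilton product. The channel assignment (depth as the real part; R, G, B as the i, j, k parts) and the function names `rgbd_to_quaternion` and `hamilton_product` are assumptions made for illustration; the abstract does not specify the encoding used by the authors.

```python
import numpy as np

def rgbd_to_quaternion(rgb, depth):
    """Stack an (H, W, 3) RGB image and an (H, W) depth map into an
    (H, W, 4) array of quaternion components [w, x, y, z] = [D, R, G, B].
    The ordering is an assumption, not taken from the paper."""
    rgb = np.asarray(rgb, dtype=float)
    depth = np.asarray(depth, dtype=float)
    return np.concatenate([depth[..., None], rgb], axis=-1)

def hamilton_product(p, q):
    """Hamilton product of quaternion arrays (components on the last axis)."""
    pw, px, py, pz = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    qw, qx, qy, qz = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    return np.stack([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ], axis=-1)

# Usage: a 2x2 RGB-D patch multiplied by the unit pure quaternion
# mu = (i + j + k) / sqrt(3), applied to every pixel by broadcasting.
rgb = np.arange(12, dtype=float).reshape(2, 2, 3)
depth = np.ones((2, 2))
q_img = rgbd_to_quaternion(rgb, depth)
mu = np.array([0.0, 1.0, 1.0, 1.0]) / np.sqrt(3.0)
mixed = hamilton_product(q_img, mu)
```

Because the product mixes the components, each imaginary part of `mixed` depends on both the depth and the colour channels, which is what distinguishes this encoding from processing the channels independently.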

Acknowledgements

The authors gratefully acknowledge the Laboratory of Intelligent Systems and Applications (LSIA) for its support of this work.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

Author information

Correspondence to Ilham Elouariachi.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Proof of proposition 1

With the help of Eqs. (15) and (15), the translated version of Tchebichef polynomials can be expressed as:

$$\begin{aligned} {t}_n(x-x_0;N)&=\sum _{i=0}^{n}A_{n,i}(x-x_0)^i\\ &=\sum _{i=0}^{n}\sum _{s=0}^{i}\binom{i}{s}A_{n,i}(-1)^{i-s}x^s(x_0)^{i-s}. \end{aligned}$$
(33)

By substituting Eq. (15) into Eq. (33), we obtain the relationship between the translated version and traditional Tchebichef polynomials, as follows:

$$\begin{aligned} {t}_n(x-x_0;N)=\sum _{i=0}^{n}\sum _{s=0}^{i}\sum _{u=0}^{s}\binom{i}{s}A_{n,i}B_{s,u}(-1)^{i-s}(x_0)^{i-s}\,{t}_u(x;N). \end{aligned}$$
(34)
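
Eq. (34) can be checked numerically. The sketch below is a minimal illustration under stated assumptions: it builds the unscaled discrete Tchebichef polynomials from their standard three-term recurrence (the paper's own normalization may differ), takes A as the matrix of monomial coefficients of the polynomials and B as its inverse, and compares both sides of Eq. (34) on the grid x = 0, ..., N-1. The function names are hypothetical.

```python
import numpy as np
from math import comb
from numpy.polynomial import polynomial as P

def tchebichef_monomial_coeffs(max_order, N):
    """A[n, i] = coefficient of x**i in the unscaled t_n(x; N)."""
    A = np.zeros((max_order + 1, max_order + 1))
    A[0, 0] = 1.0
    if max_order >= 1:
        A[1, :2] = [1.0 - N, 2.0]  # t_1(x) = 2x - N + 1
    for n in range(1, max_order):
        # (n+1) t_{n+1} = (2n+1)(2x - N + 1) t_n - n (N^2 - n^2) t_{n-1}
        nxt = (2 * n + 1) * (1.0 - N) * A[n]
        nxt[1:] += (2 * n + 1) * 2.0 * A[n, :-1]  # multiplication by 2x
        nxt -= n * (N**2 - n**2) * A[n - 1]
        A[n + 1] = nxt / (n + 1)
    return A

def translated_coeffs(n, x0, A, B):
    """c[u] such that t_n(x - x0) = sum_u c[u] * t_u(x), following Eq. (34)."""
    c = np.zeros(n + 1)
    for i in range(n + 1):
        for s in range(i + 1):
            w = comb(i, s) * A[n, i] * (-1) ** (i - s) * x0 ** (i - s)
            c[: s + 1] += w * B[s, : s + 1]
    return c

# Usage: N = 8 samples, order n = 3, shift x0 = 2.
N, n, x0 = 8, 3, 2
A = tchebichef_monomial_coeffs(n, N)
B = np.linalg.inv(A)  # assumed: x**s = sum_u B[s, u] * t_u(x)
x = np.arange(N)
lhs = P.polyval(x - x0, A[n])
c = translated_coeffs(n, x0, A, B)
rhs = sum(c[u] * P.polyval(x, A[u]) for u in range(n + 1))
print(np.allclose(lhs, rhs))
```

The identity relies only on A and B being mutually inverse, so it should hold for any scaling of the polynomials; only the numerical values of the coefficients change.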

In a similar way, we also have:

$$\begin{aligned} {t}_m(y-y_0;N)=\sum _{j=0}^{m}\sum _{t=0}^{j}\sum _{v=0}^{t}\left( {\begin{array}{c}j\\ t\end{array}}\right) A_{m,j} B_{t,v}(-1)^{j-t}(y_0)^{j-t}{t}_v(y;N). \end{aligned}$$
(35)

Consequently, the \(\mathrm{QTM}_{n,m}^t\) of a translated image \(f^t(x,y)\) can be written in terms of the \(\mathrm{QTM}_{n,m}\) of the original image \(f(x,y)\) as:

$$\begin{aligned} \mathrm{QTM}_{n,m}^t=&\sum _{i=0}^{n}\sum _{j=0}^{m}\sum _{s=0}^{i}\sum _{t=0}^{j}\sum _{u=0}^{s}\sum _{v=0}^{t}\binom{i}{s}\binom{j}{t}A_{n,i}A_{m,j}B_{s,u}B_{t,v}\\ &\times (-1)^{i-s+j-t}x_0^{i-s}y_0^{j-t}\,\mathrm{QTM}_{u,v}. \end{aligned}$$
(36)

Hence, the QTM of an image translated by a vector \((x_0,y_0)\) can be expressed as a linear combination of the QTMs of the original image, which completes the proof.

Proof of proposition 2

The distorted version of Tchebichef polynomials can be expressed as follows:

$$\begin{aligned} {t}_n(a_{1,1}x+a_{1,2}y;N)&=\sum _{i=0}^{n}A_{n,i}(a_{1,1}x+a_{1,2}y)^i\\ &=\sum _{i=0}^{n}A_{n,i}\sum _{s=0}^{i}\binom{i}{s}(a_{1,1}x)^{i-s}(a_{1,2}y)^s\\ &=\sum _{i=0}^{n}\sum _{s=0}^{i}\binom{i}{s}A_{n,i}(a_{1,1})^{i-s}(a_{1,2})^s x^{i-s}y^s. \end{aligned}$$
(37)

Similarly, we have:

$$\begin{aligned} {t}_m (a_{2,1}x+a_{2,2}y;N)=\sum _{j=0}^{m}\sum _{t=0}^{j}\left( {\begin{array}{c}j\\ t\end{array}}\right) A_{m,j} (a_{2,1})^{j-t}(a_{2,2})^tx^{j-t}y^t. \end{aligned}$$
(38)

Consequently, using Eqs. (37) and (38), the \(\mathrm{QTM}_{n,m}^d\) of a deformed image \(f^d(x,y)\) can be written in terms of the \(\mathrm{QTM}_{n,m}\) of the original image \(f(x,y)\) as:

$$\begin{aligned} \mathrm{QTM}_{n,m}^d=&\sum _{i=0}^{n}\sum _{j=0}^{m}\sum _{s=0}^{i}\sum _{t=0}^{j}\sum _{u=0}^{i+j-s-t}\sum _{v=0}^{s+t}\binom{i}{s}\binom{j}{t}A_{n,i}A_{m,j}B_{i+j-s-t,u}B_{s+t,v}\\ &\times (a_{1,1})^{i-s}(a_{1,2})^{s}(a_{2,1})^{j-t}(a_{2,2})^{t}\,\mathrm{QTM}_{u,v}. \end{aligned}$$
(39)

Therefore, the proof is completed.
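
The relation in Eq. (39) can likewise be checked numerically. The sketch below is an illustration under assumptions not stated in this appendix: moments are taken as plain unnormalized sums over the pixel grid, the four quaternion components are handled channel by channel (the moment is linear in the image), no Jacobian factor is applied, and the deformed moment is evaluated by applying the deformed polynomials of Eqs. (37) and (38) to the original image. All function names are hypothetical.

```python
import numpy as np
from math import comb
from numpy.polynomial import polynomial as P

def monomial_coeffs(max_order, N):
    """A[n, i] = coefficient of x**i in the unscaled t_n(x; N)."""
    A = np.zeros((max_order + 1, max_order + 1))
    A[0, 0] = 1.0
    if max_order >= 1:
        A[1, :2] = [1.0 - N, 2.0]  # t_1(x) = 2x - N + 1
    for n in range(1, max_order):
        # (n+1) t_{n+1} = (2n+1)(2x - N + 1) t_n - n (N^2 - n^2) t_{n-1}
        nxt = (2 * n + 1) * (1.0 - N) * A[n]
        nxt[1:] += (2 * n + 1) * 2.0 * A[n, :-1]
        nxt -= n * (N**2 - n**2) * A[n - 1]
        A[n + 1] = nxt / (n + 1)
    return A

def moments(f, A, order):
    """M[u, v, c] = sum_{x,y} t_u(x) t_v(y) f[x, y, c] for u, v <= order."""
    grid = np.arange(f.shape[0])
    T = np.stack([P.polyval(grid, A[u]) for u in range(order + 1)])
    return np.einsum('ux,vy,xyc->uvc', T, T, f)

def deformed_moment_direct(f, A, n, m, a):
    """Sum of t_n(a11 x + a12 y) t_m(a21 x + a22 y) f[x, y] over the grid."""
    N = f.shape[0]
    x, y = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
    tn = P.polyval(a[0][0] * x + a[0][1] * y, A[n])
    tm = P.polyval(a[1][0] * x + a[1][1] * y, A[m])
    return np.einsum('xy,xyc->c', tn * tm, f)

def deformed_moment_eq39(M, A, B, n, m, a):
    """Right-hand side of Eq. (39), built from the original moments M."""
    out = np.zeros(M.shape[-1])
    for i in range(n + 1):
        for s in range(i + 1):
            for j in range(m + 1):
                for t in range(j + 1):
                    w = (comb(i, s) * comb(j, t) * A[n, i] * A[m, j]
                         * a[0][0] ** (i - s) * a[0][1] ** s
                         * a[1][0] ** (j - t) * a[1][1] ** t)
                    p, q = i + j - s - t, s + t  # powers of x and y
                    out += w * np.einsum('u,v,uvc->c', B[p, :p + 1],
                                         B[q, :q + 1], M[:p + 1, :q + 1])
    return out

# Usage: random 8x8 four-channel image, orders n = m = 2.
N, n, m = 8, 2, 2
f = np.random.default_rng(0).random((N, N, 4))
a = [[1.1, 0.2], [-0.1, 0.9]]     # hypothetical deformation matrix
A = monomial_coeffs(n + m, N)     # orders up to n + m are needed
B = np.linalg.inv(A)
M = moments(f, A, n + m)
print(np.allclose(deformed_moment_direct(f, A, n, m, a),
                  deformed_moment_eq39(M, A, B, n, m, a)))
```

Note that the deformed moment of order (n, m) draws on original moments up to order n + m in each variable, which is why the coefficient matrices are built to that order.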

About this article

Cite this article

Elouariachi, I., Benouini, R., Zenkouar, K. et al. Robust hand gesture recognition system based on a new set of quaternion Tchebichef moment invariants. Pattern Anal Applic (2020). https://doi.org/10.1007/s10044-020-00866-9

Keywords

  • Hand gesture recognition
  • Tchebichef moments
  • Moment invariants
  • Quaternion algebra
  • Complex background
  • RST invariants