Two Phase Classification for Early Hand Gesture Recognition in 3D Top View Data

  • Aditya Tewari
  • Bertram Taetz
  • Frederic Grandidier
  • Didier Stricker
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10072)

Abstract

This work classifies top-view hand gestures observed by a Time of Flight (ToF) camera using Long Short-Term Memory (LSTM) neural networks. We demonstrate a performance improvement from a two-phase classification: the number of classes to be separated in each phase is reduced, and the output probabilities are combined. The modified system architecture achieves an average cross-validation accuracy of 90.75% on a 9-gesture dataset, an improvement over the single all-class LSTM approach. The networks are trained to predict the class label continuously during the sequence. We introduce a frame-based gesture prediction that uses gesture probabilities accumulated per frame of the video sequence. This eliminates the latency incurred when the gesture is predicted only at the end of the sequence, as is usually the case with majority-voting-based methods.
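
The abstract does not state how the two phases' output probabilities are combined or how the per-frame probabilities are accumulated. The following is only a minimal sketch under two assumptions: a product rule, P(gesture) = P(group) * P(gesture | group), for the combination, and a running sum with an argmax after every frame for the accumulation. The function names, the grouping, and the gesture labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def combine_two_phase(
    group_probs: Dict[str, float],
    within_probs: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Combine phase-1 (group) and phase-2 (gesture within group) outputs.

    Assumed rule: P(gesture) = P(group) * P(gesture | group), renormalised.
    """
    combined: Dict[str, float] = {}
    for group, members in within_probs.items():
        for gesture, p_within in members.items():
            combined[gesture] = group_probs[group] * p_within
    total = sum(combined.values())
    if total == 0:
        raise ValueError("all combined probabilities are zero")
    return {gesture: p / total for gesture, p in combined.items()}


def running_predictions(frame_probs: Sequence[Dict[str, float]]) -> List[str]:
    """Accumulate per-frame probabilities and emit a label after every frame.

    Assumed rule: running sum per gesture, argmax of the sums at each frame.
    """
    accumulated: Dict[str, float] = {}
    labels: List[str] = []
    for probs in frame_probs:
        for gesture, p in probs.items():
            accumulated[gesture] = accumulated.get(gesture, 0.0) + p
        labels.append(max(accumulated, key=accumulated.get))
    return labels


# Hypothetical three-gesture example with two groups.
frame = combine_two_phase(
    {"swipe": 0.8, "static": 0.2},
    {"swipe": {"swipe_left": 0.75, "swipe_right": 0.25}, "static": {"fist": 1.0}},
)
labels = running_predictions(
    [{"a": 0.4, "b": 0.6}, {"a": 0.7, "b": 0.3}, {"a": 0.8, "b": 0.2}]
)
```

Because a label is available after every frame, a consumer can act on the prediction before the sequence ends, which is the latency advantage over end-of-sequence majority voting that the abstract describes.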

Keywords

Driver assistance, Hand gesture, LSTM networks, Hand features, Neural networks

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Aditya Tewari (1, 2)
  • Bertram Taetz (1)
  • Frederic Grandidier (2)
  • Didier Stricker (1)
  1. Augmented Vision, Technische Universität Kaiserslautern, Kaiserslautern, Germany
  2. IEE S.A., Contern, Luxembourg
