A Hybrid CRF/HMM for One-Shot Gesture Learning

Adaptive Biometric Systems

Abstract

This chapter deals with the characterization and recognition of human gestures in videos. We propose a global characterization of gestures that we call the Gesture Signature. The Gesture Signature describes the location, velocity, and orientation of the global motion of a gesture, as deduced from optical flow. The proposed hybrid CRF/HMM model combines the modelling ability of hidden Markov models with the discriminative ability of conditional random fields. We applied this hybrid system to the recognition of gestures in videos in the context of one-shot learning, where only one sample gesture per class is available to train the system. In this rather extreme context, the proposed framework achieves promising performance, which suggests that it could be applied to other biometric recognition tasks.
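As a rough illustration of what a global motion descriptor of this kind could look like, the Python sketch below reduces a dense optical-flow field for a single frame to a location, a velocity, and an orientation. It is not the chapter's definition of the Gesture Signature, which the abstract does not spell out. The function name `global_motion_descriptor`, the magnitude-weighted centroid, the mean flow vector, and the `min_mag` threshold are all assumptions made for illustration.

```python
import math


def global_motion_descriptor(flow, min_mag=0.5):
    """Summarize one frame of dense optical flow (hypothetical helper).

    flow: list of rows, each row a list of (dx, dy) displacement tuples,
          one per pixel.
    min_mag: assumed threshold; pixels moving less than this are ignored.

    Returns a dict with the magnitude-weighted centroid of the moving
    pixels ("location"), the norm of their mean displacement ("velocity"),
    and the angle of that mean displacement in radians ("orientation"),
    or None when no pixel exceeds the threshold.
    """
    total_weight = 0.0
    cx = cy = 0.0
    sum_dx = sum_dy = 0.0
    moving = 0

    for y, row in enumerate(flow):
        for x, (dx, dy) in enumerate(row):
            mag = math.hypot(dx, dy)
            if mag < min_mag:
                continue  # treat near-static pixels as background
            total_weight += mag
            cx += mag * x
            cy += mag * y
            sum_dx += dx
            sum_dy += dy
            moving += 1

    if moving == 0:
        return None

    mean_dx = sum_dx / moving
    mean_dy = sum_dy / moving
    return {
        "location": (cx / total_weight, cy / total_weight),
        "velocity": math.hypot(mean_dx, mean_dy),
        "orientation": math.atan2(mean_dy, mean_dx),
    }


# Toy 3x3 flow field in which a single pixel moves by (3, 4).
still = (0.0, 0.0)
frame = [
    [still, still, still],
    [still, still, (3.0, 4.0)],
    [still, still, still],
]
descriptor = global_motion_descriptor(frame)
```

Computed frame by frame, descriptors of this kind would give one observation vector per frame, which is the sort of sequence a model such as an HMM or a CRF takes as input.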

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Belgacem, S., Chatelain, C., Paquet, T. (2015). A Hybrid CRF/HMM for One-Shot Gesture Learning. In: Rattani, A., Roli, F., Granger, E. (eds) Adaptive Biometric Systems. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-24865-3_4

  • DOI: https://doi.org/10.1007/978-3-319-24865-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24863-9

  • Online ISBN: 978-3-319-24865-3

  • eBook Packages: Computer Science, Computer Science (R0)
