Transfer Learning Decision Forests for Gesture Recognition

Abstract

Decision forests are an increasingly popular tool in computer vision problems. Their advantages include high computational efficiency, state-of-the-art accuracy and multi-class support. In this paper, we present a novel method for transfer learning which uses decision forests, and we apply it to recognize gestures and characters. We introduce two mechanisms into the decision forest framework in order to transfer knowledge from the source tasks to a given target task. The first one is mixed information gain, which is a data-based regularizer. The second one is label propagation, which infers the manifold structure of the feature space. We show that both mechanisms are important for achieving higher accuracy. Our experiments demonstrate improvements over traditional decision forests on the ChaLearn Gesture Challenge and MNIST data sets. The results also compare favorably with those of other state-of-the-art classifiers.

Editors: Isabelle Guyon, Vassilis Athitsos, and Sergio Escalera

Acknowledgements

We would like to thank Zicheng Liu and Julio Jacobo-Berlles for their feedback and assistance.

Author information

Corresponding author

Correspondence to Norberto A. Goussies.

Appendix A

We prove Theorem 1. First, we prove the lower bound \(\mathcal{H}({P}) - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \left( 1 + \frac{1 - p_{\mathbf {y}}}{K p_{\mathbf {y}}} \right) \le {\mathbb {E}}(\mathcal{H}(S_K))\).

By definition of the empirical entropy and linearity of the expectation, we have:

$$\begin{aligned} {\mathbb {E}}(\mathcal{H}(S_K)) = - {\mathbb {E}}\left[ \sum _{\mathbf {y}\in \mathcal{Y}} \hat{p}_{S_K}(\mathbf {y}) \log (\hat{p}_{S_K}(\mathbf {y})) \right] = - \sum _{\mathbf {y}\in \mathcal{Y}} {\mathbb {E}}\left[ \hat{p}_{S_K}(\mathbf {y}) \log (\hat{p}_{S_K}(\mathbf {y})) \right] \end{aligned}$$

Using the definitions of the empirical histogram \(\hat{p}_{S_K}(\mathbf {y})\) and the expectation:

$$\begin{aligned} - \sum _{\mathbf {y}\in \mathcal{Y}} {\mathbb {E}}\left[ \hat{p}_{S_K}(\mathbf {y}) \log (\hat{p}_{S_K}(\mathbf {y})) \right] = - \sum _{\mathbf {y}\in \mathcal{Y}} \sum _{j=0}^K {P}\left( \hat{p}_{S_K}(\mathbf {y}) = \frac{j}{K}\right) \frac{j}{K} \log \frac{j}{K} \end{aligned}$$

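Assuming that the samples are i.i.d., the count \(K \hat{p}_{S_K}(\mathbf {y})\) follows a binomial distribution with parameters \(K\) and \(p_{\mathbf {y}}\), so (with the convention \(0 \log 0 = 0\)):

$$\begin{aligned} = - \sum _{\mathbf {y}\in \mathcal{Y}} \sum _{j=0}^K \binom{K}{j} p_{\mathbf {y}}^j (1-p_{\mathbf {y}})^{K-j} \frac{j}{K} \log \frac{j}{K} \end{aligned}$$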

Note that, in this equation, \(p_{\mathbf {y}}\) is the true probability of label \(\mathbf {y}\) under the distribution \({P}\). The \(j = 0\) term vanishes, and \(\frac{j}{K} \binom{K}{j} = \binom{K-1}{j-1}\) for \(j \ge 1\). Shifting the index \(j \rightarrow j+1\) and factoring out \(p_{\mathbf {y}}\), we obtain the following, where the inner weights are the probabilities that exactly \(j\) of \(K-1\) i.i.d. samples carry label \(\mathbf {y}\):

$$\begin{aligned}&= - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \sum _{j=0}^{K-1} \binom{K-1}{j} p_{\mathbf {y}}^j (1-p_{\mathbf {y}})^{K-1-j} \log \frac{j+1}{K} \\&= - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \sum _{j=0}^{K-1} {P}\left( (K-1)\, \hat{p}_{S_{K-1}}(\mathbf {y}) = j\right) \log \frac{j+1}{K} \end{aligned}$$

Applying Jensen’s inequality for the convex function \(-\log (x)\) to the inner expectation, and using the fact that the mean of this binomial count is \((K-1) p_{\mathbf {y}}\), we obtain:

$$\begin{aligned}&\ge - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \left( \sum _{j=0}^{K-1} {P}\left( (K-1)\, \hat{p}_{S_{K-1}}(\mathbf {y}) = j\right) \frac{j+1}{K} \right) \\&= - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \frac{(K-1)p_{\mathbf {y}} + 1}{K}\\&= - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \left( p_{\mathbf {y}} + \frac{1 - p_{\mathbf {y}}}{K}\right) \\&= - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \left( p_{\mathbf {y}} \left( 1 + \frac{1 - p_{\mathbf {y}}}{K p_{\mathbf {y}}}\right) \right) \\&= - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log p_{\mathbf {y}} - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \left( 1 + \frac{1 - p_{\mathbf {y}}}{K p_{\mathbf {y}}} \right) \\&= \mathcal{H}({P}) - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \left( 1 + \frac{1 - p_{\mathbf {y}}}{K p_{\mathbf {y}}} \right) \end{aligned}$$
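As a sanity check on the bound just derived (a worked case added for illustration, assuming \(p_{\mathbf {y}} > 0\) for every \(\mathbf {y}\in \mathcal{Y}\)), take \(K = 1\). A single sample always yields a one-point histogram, so \({\mathbb {E}}(\mathcal{H}(S_1)) = 0\), and the right-hand side of the chain above also evaluates to zero:

$$\begin{aligned} \mathcal{H}({P}) - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \left( 1 + \frac{1 - p_{\mathbf {y}}}{p_{\mathbf {y}}} \right) = \mathcal{H}({P}) - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \frac{1}{p_{\mathbf {y}}} = \mathcal{H}({P}) - \mathcal{H}({P}) = 0 \end{aligned}$$

For larger \(K\), the inequality \(\log (1 + x) \le x\) shows that the correction term shrinks at least as fast as \(1/K\):

$$\begin{aligned} \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log \left( 1 + \frac{1 - p_{\mathbf {y}}}{K p_{\mathbf {y}}} \right) \le \sum _{\mathbf {y}\in \mathcal{Y}} \frac{1 - p_{\mathbf {y}}}{K} = \frac{|\mathcal{Y}| - 1}{K} \end{aligned}$$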

Now we prove the upper bound \({\mathbb {E}}(\mathcal{H}(S_K)) \le \mathcal{H}({P})\).

By definition of the empirical entropy and linearity of the expectation, we have:

$$\begin{aligned} {\mathbb {E}}(\mathcal{H}(S_K)) = - {\mathbb {E}}\left[ \sum _{\mathbf {y}\in \mathcal{Y}} \hat{p}_{S_K}(\mathbf {y}) \log (\hat{p}_{S_K}(\mathbf {y})) \right] = - \sum _{\mathbf {y}\in \mathcal{Y}} {\mathbb {E}}\left[ \hat{p}_{S_K}(\mathbf {y}) \log (\hat{p}_{S_K}(\mathbf {y})) \right] \end{aligned}$$

Applying Jensen’s inequality for the convex function \( x \log x\), we obtain the following:

$$\begin{aligned} \le - \sum _{\mathbf {y}\in \mathcal{Y}} {\mathbb {E}}\left[ \hat{p}_{S_K}(\mathbf {y}) \right] \log ({\mathbb {E}}\left[ \hat{p}_{S_K}(\mathbf {y}) \right] ) \end{aligned}$$

Since \({\mathbb {E}}\left[ \hat{p}_{S_K}(\mathbf {y}) \right] = p_{\mathbf {y}}\), we have:

$$\begin{aligned} = - \sum _{\mathbf {y}\in \mathcal{Y}} p_{\mathbf {y}} \log (p_{\mathbf {y}}) = \mathcal{H}({P}) \end{aligned}$$
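The two inequalities derived in this appendix can be checked numerically. The sketch below is an illustration only and not part of the original chapter: it evaluates the expected empirical entropy exactly from the binomial expression used in the proof (natural logarithms, convention \(0 \log 0 = 0\)) and compares it with both bounds. The function names are hypothetical, and the check assumes i.i.d. samples and strictly positive class probabilities.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution p."""
    return -sum(q * math.log(q) for q in p if q > 0)


def expected_empirical_entropy(p, K):
    """Exact E[H(S_K)] for K i.i.d. samples, via the binomial sum in the proof.

    The terms j = 0 and j = K contribute zero (0 log 0 = 0 and log 1 = 0),
    so the inner sum only runs over j = 1, ..., K - 1.
    """
    total = 0.0
    for q in p:
        for j in range(1, K):
            pmf = math.comb(K, j) * q**j * (1 - q) ** (K - j)
            total -= pmf * (j / K) * math.log(j / K)
    return total


def lower_bound(p, K):
    """H(P) minus the correction term sum_y p_y log(1 + (1 - p_y) / (K p_y))."""
    correction = sum(q * math.log(1 + (1 - q) / (K * q)) for q in p if q > 0)
    return entropy(p) - correction


def check_bounds(p, K, tol=1e-12):
    """True if lower_bound <= E[H(S_K)] <= H(P), up to a floating-point tolerance."""
    e = expected_empirical_entropy(p, K)
    return lower_bound(p, K) - tol <= e <= entropy(p) + tol
```

For example, `check_bounds([0.5, 0.3, 0.2], 10)` evaluates both inequalities for a three-class distribution with ten samples. For two equiprobable classes and \(K = 2\), the two samples differ with probability \(1/2\), so the expected empirical entropy is \(\frac{1}{2} \log 2\), strictly below \(\mathcal{H}({P}) = \log 2\).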

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Goussies, N.A., Ubalde, S., Mejail, M. (2017). Transfer Learning Decision Forests for Gesture Recognition. In: Escalera, S., Guyon, I., Athitsos, V. (eds) Gesture Recognition. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-57021-1_15

  • DOI: https://doi.org/10.1007/978-3-319-57021-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57020-4

  • Online ISBN: 978-3-319-57021-1

  • eBook Packages: Computer Science, Computer Science (R0)
