Iterative Reference Driven Metric Learning for Signer Independent Isolated Sign Language Recognition

  • Fang Yin
  • Xiujuan Chai
  • Xilin Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9911)


Sign language recognition (SLR) is an interesting but difficult problem. One of the biggest challenges comes from complex inter-signer variations. To address this problem, this paper learns a generic model that is robust to different signers. The generic model contains a group of sign references and a corresponding distance metric. The references are constructed from signer-invariant representations of each sign class. Motivated by the fact that probe samples should have high similarity with the references of their own class, we aim to learn a distance metric that pulls samples closer to the references of their true sign classes and pushes them away from the references of false sign classes. Given a group of references, such a distance metric can be learned with our proposed Reference Driven Metric Learning (RDML). Furthermore, to obtain more appropriate references, iterative RDML (iRDML) updates the references and the distance metric alternately. The effectiveness and efficiency of the proposed method are evaluated extensively on several public databases for both SLR and human motion recognition tasks.
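The alternation between metric learning and reference updating described above can be illustrated with a minimal sketch. It is not the formulation from the paper: it assumes fixed-length feature vectors (real sign data are variable-length sequences), a Mahalanobis-style metric M = LᵀL, a hinge loss with a hypothetical `margin`, and a distance-weighted mean as the reference update. All function names and parameters here are illustrative assumptions.

```python
import numpy as np

def sq_dists(X, R, L):
    """Squared distance under M = L.T @ L from every sample to every reference."""
    diff = (X[:, None, :] - R[None, :, :]) @ L.T   # shape (n, classes, k)
    return np.sum(diff ** 2, axis=-1)              # shape (n, classes)

def learn_metric(X, y, R, L, margin=1.0, lr=0.01, steps=50):
    """Assumed hinge loss: the own-class reference should be closer than
    every other reference by at least `margin`."""
    n = len(X)
    idx = np.arange(n)
    diff = X[:, None, :] - R[None, :, :]           # (n, classes, d)
    own = diff[idx, y]                             # (n, d), sample minus own reference
    for _ in range(steps):
        D = sq_dists(X, R, L)
        viol = (margin + D[idx, y][:, None] - D) > 0
        viol[idx, y] = False                       # a class never competes with itself
        w = viol.astype(float)
        pull = (own * w.sum(axis=1)[:, None]).T @ own        # pulls true references closer
        push = np.einsum('ic,icd,ice->de', w, diff, diff)    # pushes false references away
        L = L - lr * 2.0 * L @ (pull - push) / n             # gradient step on L
    return L

def update_references(X, y, R, L):
    """Assumed update: mean of class samples, weighted by closeness under the metric."""
    D = sq_dists(X, R, L)[np.arange(len(X)), y]
    w = 1.0 / (1.0 + D)
    return np.stack([np.average(X[y == c], axis=0, weights=w[y == c])
                     for c in range(len(R))])

def irdml_sketch(X, y, n_classes, rounds=3):
    """Alternate between learning the metric and re-estimating the references."""
    R = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
    L = np.eye(X.shape[1])
    for _ in range(rounds):
        L = learn_metric(X, y, R, L)
        R = update_references(X, y, R, L)
    return R, L

def predict(X, R, L):
    """Assign each sample to the class of its nearest reference."""
    return np.argmin(sq_dists(X, R, L), axis=1)

# Synthetic data: two informative dimensions, two noisy "signer variation" dimensions.
rng = np.random.default_rng(0)
centers = np.array([[3.0, 0, 0, 0], [0, 3.0, 0, 0], [-3.0, -3.0, 0, 0]])
y = np.repeat(np.arange(3), 60)
X = centers[y] + rng.normal(size=(180, 4)) * np.array([0.5, 0.5, 2.0, 2.0])
R, L = irdml_sketch(X, y, n_classes=3)
acc = np.mean(predict(X, R, L) == y)
print(f"training accuracy: {acc:.2f}")
```

Classification in this sketch is nearest-reference matching under the learned metric, so the cost at test time grows with the number of sign classes rather than the number of training samples.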


Keywords: Sign language recognition; Signer independent; Inter-signer variations; Metric learning; Human motion recognition



This work was partially supported by the 973 Program under contract No. 2015CB351802, the Natural Science Foundation of China under contracts Nos. 61390511 and 61472398, Microsoft Research Asia, and the Youth Innovation Promotion Association CAS.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. Cooperative Medianet Innovation Center, Beijing, China
