Chimpanzee Faces in the Wild: Log-Euclidean CNNs for Predicting Identities and Attributes of Primates

  • Alexander Freytag
  • Erik Rodner
  • Marcel Simon
  • Alexander Loos
  • Hjalmar S. Kühl
  • Joachim Denzler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9796)


In this paper, we investigate how to predict attributes of chimpanzees such as identity, age, age group, and gender. We build on convolutional neural networks, which yield significantly better results than previous state-of-the-art approaches based on hand-crafted recognition pipelines. In addition, we show how to further increase the discriminative power of CNN activations by applying the Log-Euclidean framework on top of bilinear pooling. We finally introduce two curated datasets consisting of chimpanzee faces with detailed meta-information to stimulate further research. Our results can serve as the foundation for automated large-scale animal monitoring and analysis.
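The pipeline sketched in the abstract combines second-order (bilinear) pooling of CNN activations with the Log-Euclidean framework: local features are pooled into a symmetric positive definite matrix, and a matrix logarithm maps that matrix into a Euclidean space where linear classifiers apply. The following is a minimal NumPy sketch of this idea, not the authors' implementation; the function name, the regularization constant `eps`, and the choice of averaging the outer products are illustrative assumptions.

```python
import numpy as np

def log_euclidean_bilinear_pooling(features, eps=1e-5):
    """Illustrative sketch: bilinear pooling of local CNN activations,
    followed by a matrix logarithm (Log-Euclidean mapping).

    features : array of shape (n_locations, d), one d-dimensional
               activation vector per spatial position.
    Returns the upper triangle of the log-matrix as a feature vector
    of length d * (d + 1) // 2.
    """
    n, d = features.shape
    # Bilinear (second-order) pooling: average of outer products.
    m = features.T @ features / n
    # Regularize so the matrix is strictly positive definite.
    m = m + eps * np.eye(d)
    # Matrix logarithm via eigendecomposition (m is symmetric PD):
    # log(m) = V diag(log(w)) V^T
    w, v = np.linalg.eigh(m)
    log_m = (v * np.log(w)) @ v.T
    # The result is symmetric, so the upper triangle suffices.
    iu = np.triu_indices(d)
    return log_m[iu]
```

The resulting vector can be fed to any linear classifier; distances between such vectors correspond to the Log-Euclidean metric on the underlying SPD matrices.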





The authors thank Dr. Tobias Deschner for providing the images which were used to build the C-Tai dataset, Laura Aporius and Karin Bahrke for collecting and annotating the images which were used to build the C-Zoo dataset, and the Zoo Leipzig for providing permission for image collection. The images used for creating the C-Zoo dataset were collected as part of the SAISBECO project funded by the Pact for Research and Innovation between the Max Planck Society and the Fraunhofer-Gesellschaft. Part of this research was supported by grant RO 5093/1-1 of the German Research Foundation (DFG) and by a grant from the Robert-Bosch-Stiftung.

Supplementary material

Supplementary material 1: 419026_1_En_5_MOESM1_ESM.pdf (PDF, 233 KB)



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Alexander Freytag (1, 2)
  • Erik Rodner (1, 2)
  • Marcel Simon (1)
  • Alexander Loos (3)
  • Hjalmar S. Kühl (4, 5)
  • Joachim Denzler (1, 2, 5)
  1. Computer Vision Group, Friedrich Schiller University Jena, Jena, Germany
  2. Michael Stifel Center Jena, Jena, Germany
  3. Fraunhofer Institute for Digital Media Technology, Ilmenau, Germany
  4. Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
  5. German Centre for Integrative Biodiversity Research (iDiv), Leipzig, Germany
