
Representation learning with deep extreme learning machines for efficient image set classification


Efficient and accurate representation of a collection of images that belong to the same class is a major research challenge for practical image set classification. Existing methods either make prior assumptions about the data structure or perform heavy computations to learn the structure from the data itself. In this paper, we propose an efficient image set representation that makes no prior assumptions about the structure of the underlying data. We learn the nonlinear structure of image sets with deep extreme learning machines, which are very efficient and generalize well even with a limited number of training samples. Extensive experiments on a broad range of public datasets for image set classification show that the proposed algorithm consistently outperforms state-of-the-art image set classification methods in terms of both speed and accuracy.
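To make the idea of an extreme learning machine concrete, the following is a minimal sketch of a generic single-layer ELM autoencoder, not the algorithm proposed in the paper. It assumes random, untrained hidden weights and output weights solved in closed form by ridge regression, which is the standard ELM recipe. Classifying a query set by the class model with the lowest reconstruction error is an assumption made only for illustration. All function names, hyperparameters, and the synthetic data are hypothetical, and a deep variant would stack several such layers.

```python
import numpy as np


def train_elm_autoencoder(X, n_hidden=40, reg=1e-3, seed=0):
    """Fit a one-layer ELM autoencoder on the rows of X (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn at random and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden activations, shape (n_samples, n_hidden)
    # Output weights in closed form: beta = (H^T H + reg * I)^-1 H^T X
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    return W, b, beta


def reconstruction_error(X, model):
    """Mean squared error between X and its reconstruction by the model."""
    W, b, beta = model
    H = np.tanh(X @ W + b)
    return float(np.mean((H @ beta - X) ** 2))


def classify_set(query_set, models):
    """Return the label whose model reconstructs the query set best."""
    errors = {label: reconstruction_error(query_set, m) for label, m in models.items()}
    return min(errors, key=errors.get)


def demo():
    """Train one model per synthetic class and classify held-out sets."""
    rng = np.random.default_rng(42)
    dim, rank = 20, 3
    # Hypothetical data: each class lies on its own random 3-D subspace.
    bases = {label: rng.standard_normal((rank, dim)) / np.sqrt(dim) for label in ("a", "b")}

    def sample_set(label, n):
        return rng.standard_normal((n, rank)) @ bases[label]

    models = {label: train_elm_autoencoder(sample_set(label, 200)) for label in bases}
    # Map each true label to the label predicted for a fresh set of 50 samples.
    return {label: classify_set(sample_set(label, 50), models) for label in bases}
```

Calling `demo()` returns a dictionary that maps each true class label to the label predicted for a held-out set. Because only a linear system is solved per class, training cost is dominated by one matrix product and one solve, which is the source of the efficiency usually attributed to ELMs.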






This work was supported by the Australian Research Council (ARC) Grant DP110102399 and UWA Research Collaboration Award 2014.

Author information



Corresponding author

Correspondence to Muhammad Uzair.


About this article


Cite this article

Uzair, M., Shafait, F., Ghanem, B. et al. Representation learning with deep extreme learning machines for efficient image set classification. Neural Comput & Applic 30, 1211–1223 (2018).



  • Extreme learning machine
  • Image set classification
  • Representation learning
  • Face recognition