Online Dictionary Learning for Approximate Archetypal Analysis

  • Jieru Mei
  • Chunyu Wang
  • Wenjun Zeng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11207)


Archetypal analysis is an unsupervised learning approach that represents data as convex combinations of a set of archetypes. The archetypes generally correspond to extremal points of the dataset and are learned by requiring them to be convex combinations of the training data. Despite its interpretability, the method is slow. We propose a variant of archetypal analysis that scales gracefully to large datasets. The core idea is to decouple the binding between data and archetypes and require them to be unit normalized. Geometrically, the method learns a convex hull inside the unit sphere and represents each data point by its projection onto the closest surface of the convex hull. By minimizing the representation error, the method pushes the surfaces of the convex hull close to the regions of the sphere where the data reside. The vertices of the convex hull are the learned archetypes. We apply the method to human faces and poses to validate its effectiveness for reconstruction and classification.
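
The geometric picture in the abstract can be illustrated with a small sketch of the encoding step alone: given a fixed set of unit-normalized archetypes, find the convex combination closest to a unit-normalized data point. This is not the authors' implementation. The solver (projected gradient descent with a Euclidean projection onto the simplex), the function names `project_to_simplex` and `encode`, the iteration count, and the random toy data are all assumptions made for illustration, and the dictionary update that learns the archetypes is not shown.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    n = v.size
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0                  # shifted cumulative sums
    ks = np.arange(1, n + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]  # last index satisfying the condition
    theta = css[rho] / (rho + 1.0)            # shift that makes the result sum to 1
    return np.maximum(v - theta, 0.0)


def encode(x, D, n_iter=500):
    """Approximately solve min_a ||x - D a||^2 with a on the simplex.

    Hypothetical helper: columns of D play the role of archetypes.
    """
    k = D.shape[1]
    a = np.full(k, 1.0 / k)                   # start at the centre of the simplex
    step = 1.0 / np.linalg.norm(D, 2) ** 2    # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)              # gradient of 0.5 * ||x - D a||^2
        a = project_to_simplex(a - step * grad)
    return a


# Toy data (assumed, not from the paper): 4 archetypes in 5 dimensions.
rng = np.random.default_rng(0)
D = rng.normal(size=(5, 4))
D /= np.linalg.norm(D, axis=0, keepdims=True)  # unit-normalize each archetype
x = rng.normal(size=5)
x /= np.linalg.norm(x)                         # unit-normalize the data point

a = encode(x, D)
x_hat = D @ a                                  # point of the convex hull used to represent x
```

Because `a` is non-negative and sums to one, `x_hat` is a convex combination of unit vectors and therefore lies inside the unit sphere, which matches the description of a convex hull learned inside the sphere.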


Keywords: Archetypal analysis, Convex hull, Sparsity

Supplementary material

474178_1_En_30_MOESM1_ESM.pdf (64 KB)



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Microsoft Research Asia, Beijing, China
