Copula Archetypal Analysis

  • Dinu Kaufmann
  • Sebastian Keller
  • Volker Roth
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9358)


We present an extension of classical archetypal analysis (AA). It is motivated by the observation that classical AA is not invariant under strictly monotone increasing transformations. Establishing such an invariance is desirable since it makes AA independent of the chosen scale of measurement: representing a data set in meters or log(meters) should lead to approximately the same archetypes. We achieve this invariance by introducing a semi-parametric Gaussian copula, which also makes AA more robust against outliers and missing values. Furthermore, our framework can deal with mixed discrete/continuous data, which is arguably the most widely encountered type of data in real-world applications. Since the proposed extension takes the form of a preprocessing step, updating existing classical AA models is especially effortless.
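The invariance argument can be illustrated with a small sketch. It is not the estimator used in the paper, whose details the abstract does not give; it assumes a plain rank-based normal-scores transform as a stand-in for the copula preprocessing step. The function name `normal_scores` and the toy measurements are hypothetical. Because the transform depends only on the ranks of the values, a feature recorded in meters and the same feature recorded in log(meters) map to identical scores, and an extreme value influences the result only through its rank.

```python
import math
from statistics import NormalDist


def normal_scores(values):
    """Map one feature to Gaussian scores using only the ranks of its values.

    Illustrative assumption: rank / (n + 1) followed by the standard normal
    quantile function. Tied values receive their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current run of tied values.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


# Hypothetical toy feature, once in meters and once in log(meters).
meters = [1.0, 2.5, 4.0, 120.0, 7.5]
log_meters = [math.log(v) for v in meters]

scores_m = normal_scores(meters)
scores_log = normal_scores(log_meters)

# A strictly increasing transformation leaves the ranks, and hence the
# scores, unchanged.
print(scores_m == scores_log)
```

A classical AA routine could then be run on the transformed features in place of the raw ones. In this sketch, replacing the value 120.0 by 8.0 yields the same scores, which is one way to see why a rank-based step dampens the effect of outliers.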



This work was partially supported by the Swiss National Science Foundation, project 200021_146178: Copula Distributions in Machine Learning.



Copyright information

© Springer International Publishing Switzerland 2015

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland
