Abstract
We present an extension of classical archetypal analysis (AA), motivated by the observation that classical AA is not invariant under strictly increasing transformations of the variables. Such invariance is desirable because it makes AA independent of the chosen scale of measurement: representing a data set in meters or in log(meters) should lead to approximately the same archetypes. We achieve this invariance by introducing a semi-parametric Gaussian copula, which also makes AA more robust against outliers and missing values. Furthermore, our framework can handle mixed discrete/continuous data, a type of data frequently encountered in real-world applications. Since the proposed extension takes the form of a preprocessing step, existing classical AA models are easy to update.
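The invariance argument can be illustrated with a small sketch. It is not the estimation procedure of the paper; it assumes a much simpler rank-based stand-in (a normal-scores transform) in which each variable is replaced by the standard normal quantile of its scaled rank. The function name `normal_scores` and the toy data are hypothetical. Because ranks do not change under strictly increasing maps, the transformed values for meters and log(meters) coincide, and an extreme observation is mapped to a bounded score.

```python
import math
from statistics import NormalDist


def normal_scores(values):
    """Map one variable to Gaussian scores via its ranks (illustrative only).

    Tied values receive their average rank. Ranks are divided by n + 1 so
    that the argument of the normal quantile function stays inside (0, 1).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the run
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


# Toy measurements in meters, including one outlier and one tie.
meters = [1.2, 3.5, 0.7, 120.0, 3.5, 8.1]
log_meters = [math.log(x) for x in meters]

scores_m = normal_scores(meters)
scores_log = normal_scores(log_meters)

# Both representations yield the same scores, since log preserves ranks.
print(scores_m == scores_log)
```

Any classical AA routine could then be run on the transformed columns instead of the raw ones, which is the sense in which the extension acts as a preprocessing step. Handling of missing values and discrete variables, which the abstract attributes to the semi-parametric copula, is outside the scope of this sketch.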
Acknowledgements
This work was partially supported by the Swiss National Science Foundation, project 200021_146178: Copula Distributions in Machine Learning.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kaufmann, D., Keller, S., Roth, V. (2015). Copula Archetypal Analysis. In: Gall, J., Gehler, P., Leibe, B. (eds) Pattern Recognition. DAGM 2015. Lecture Notes in Computer Science, vol 9358. Springer, Cham. https://doi.org/10.1007/978-3-319-24947-6_10
DOI: https://doi.org/10.1007/978-3-319-24947-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24946-9
Online ISBN: 978-3-319-24947-6
eBook Packages: Computer Science, Computer Science (R0)