Abstract
Archetype and archetypoid analysis are extended to shapes, with the objective of finding representative shapes. Archetypal shapes are pure (extreme) shapes. We focus on the case where the shape of an object is represented by a configuration matrix of landmarks. Because shape space is not a vector space, we work in the tangent space, the linearized space about the mean shape. Each observation is then approximated by a convex combination of archetypes, which are themselves convex combinations of observations in the data set, or of archetypoids, which are actual observations. As in the usual multivariate case, these tools can contribute to the understanding of shapes, since they lie somewhere between clustering and matrix factorization methods. A new simplex visualization tool is also proposed to provide a picture of the archetypal analysis results. We also propose new algorithms for performing archetypal analysis with missing data, together with their extension to incomplete shapes. A well-known data set is used to illustrate the methodologies developed, and the proposed methodology is applied to an apparel design problem in children.
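The approximation step mentioned in the abstract, writing one observation as a convex combination of a fixed set of archetypes or archetypoids, can be illustrated with a minimal sketch. This is not the algorithm used in the paper: it assumes the archetypes are already known, treats each shape as a plain coordinate vector (for instance, vectorised tangent-space coordinates), and uses projected gradient descent as a simple stand-in solver. The function names `project_to_simplex` and `convex_weights` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    n = v.size
    u = np.sort(v)[::-1]                       # entries in decreasing order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, n + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)


def convex_weights(x, Z, n_iter=2000):
    """Weights a (a >= 0, sum(a) = 1) minimising ||x - a @ Z||^2.

    x : (m,) observation (assumed here to be a coordinate vector).
    Z : (k, m) array whose rows are the archetypes or archetypoids.
    """
    k = Z.shape[0]
    a = np.full(k, 1.0 / k)                    # start at the simplex centre
    # Step size 1/L, with L the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * np.linalg.norm(Z @ Z.T, 2) + 1e-12)
    for _ in range(n_iter):
        grad = 2.0 * (a @ Z - x) @ Z.T         # gradient of the squared error
        a = project_to_simplex(a - step * grad)
    return a


# Toy example (made-up numbers): three "archetypes" in the plane.
Z = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
inside = convex_weights(np.array([0.2, 0.3]), Z)   # close to [0.5, 0.2, 0.3]
outside = convex_weights(np.array([2.0, 0.0]), Z)  # close to [0.0, 1.0, 0.0]
```

A point inside the convex hull is reproduced exactly by its weights, whereas a point outside the hull is mapped to the nearest point of the hull, which is why the weights can be read as degrees of membership in each extreme shape.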
Acknowledgements
We would like to thank the Biomechanics Institute of Valencia for providing us with the data set.
Additional information
This paper has been partially supported by the Spanish Ministerio de Economía y Competitividad Project (DPI2013-47279-C2-1-R).
Cite this article
Epifanio, I., Ibáñez, M.V. & Simó, A. Archetypal shapes based on landmarks and extension to handle missing data. Adv Data Anal Classif 12, 705–735 (2018). https://doi.org/10.1007/s11634-017-0297-7
DOI: https://doi.org/10.1007/s11634-017-0297-7
Keywords
- Statistical shape analysis
- Archetype analysis
- Archetypoid analysis
- Anthropometric data
- Children’s wear
- Missing data