Archetypal shapes based on landmarks and extension to handle missing data

  • Regular Article
  • Published in: Advances in Data Analysis and Classification

Abstract

Archetype and archetypoid analysis are extended to shapes, with the objective of finding representative shapes. Archetypal shapes are pure (extreme) shapes. We focus on the case where the shape of an object is represented by a configuration matrix of landmarks. As shape space is not a vector space, we work in the tangent space, the linearized space about the mean shape. Each observation is then approximated by a convex combination of either actual observations (archetypoids) or archetypes, which are themselves convex combinations of observations in the data set. As in the usual multivariate case, these tools can contribute to the understanding of shapes, since they lie somewhere between clustering and matrix factorization methods. A new simplex visualization tool is also proposed to provide a picture of the archetypal analysis results. We also propose new algorithms for performing archetypal analysis with missing data, together with their extension to incomplete shapes. A well-known data set is used to illustrate the methodologies developed, and the proposed methodology is applied to a children's apparel design problem.
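To make the convex-combination step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the shapes have already been projected to the tangent space and vectorised, so that each observation and each archetype is an ordinary row vector. The function name `convex_weights`, the `penalty` value and the toy data are all hypothetical. The sum-to-one constraint is imposed softly by adding a heavily weighted row to a non-negative least squares problem, a common device in the archetypal analysis literature.

```python
import numpy as np
from scipy.optimize import nnls


def convex_weights(x, Z, penalty=200.0):
    """Approximate x by a convex combination of the rows of Z.

    Solves min ||x - alpha @ Z||^2 subject to alpha >= 0, sum(alpha) = 1.
    The sum-to-one constraint is enforced softly through one extra,
    heavily weighted equation (penalty is an assumed tuning value).
    """
    k = Z.shape[0]
    # Data equations stacked on top of the penalised constraint row.
    A = np.vstack([Z.T, penalty * np.ones((1, k))])
    b = np.concatenate([x, [penalty]])
    alpha, _ = nnls(A, b)
    # Rescale to remove any small residual violation of the constraint.
    return alpha / alpha.sum()


# Hypothetical toy data: three "archetypes" at the corners of a triangle
# and one observation lying inside it.
Z = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = np.array([0.2, 0.3])

alpha = convex_weights(x, Z)  # mixture coefficients of x
x_hat = alpha @ Z             # reconstruction of x from the archetypes
```

In a full analysis this step would alternate with an update of the archetypes themselves; with archetypoids, the rows of `Z` would be restricted to actual observations.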

Acknowledgements

We would like to thank the Biomechanics Institute of Valencia for providing us with the data set.

Author information

Corresponding author

Correspondence to Irene Epifanio.

Additional information

This paper has been partially supported by the Spanish Ministerio de Economía y Competitividad, Project DPI2013-47279-C2-1-R.

Electronic supplementary material

Supplementary material 1 (PDF, 161 KB) is available with the online version of the article.

About this article

Cite this article

Epifanio, I., Ibáñez, M.V. & Simó, A. Archetypal shapes based on landmarks and extension to handle missing data. Adv Data Anal Classif 12, 705–735 (2018). https://doi.org/10.1007/s11634-017-0297-7

  • DOI: https://doi.org/10.1007/s11634-017-0297-7
