A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data

  • Original Paper
  • Statistical Methods & Applications

Abstract

This paper proposes a clustering procedure for samples of multivariate functions in \((L^2(I))^{J}\), with \(J\ge 1\). The method is based on a k-means algorithm in which the distance between curves is measured with a metric that generalizes the Mahalanobis distance to Hilbert spaces, accounting for the correlation and the variability across all components of the functional data. The procedure is studied in simulation and compared with k-means based on other distances typically adopted for clustering multivariate functional data. The simulations show that the k-means algorithm with the generalized Mahalanobis distance gives the best clustering performance in terms of both the mean and the standard deviation of the number of misclassified curves. Finally, the method is applied to two case studies, concerning ECG signals and growth curves, which confirm and strengthen the simulation results.
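The abstract does not state the distance formula, so the following is only a minimal sketch of the general idea, under stated assumptions. Curves are assumed to be sampled on a common grid, with the \(J\) components concatenated into one vector per observation. The squared distance is taken to be \(d^\top (C + I/p)^{-1} d\), where \(C\) is the pooled sample covariance and \(p\) is a regularization parameter; in the eigenbasis of \(C\) this equals \(\sum_k \langle d, v_k\rangle^2 / (\lambda_k + 1/p)\). The function names, the fixed pooled covariance, and the `init` argument are all hypothetical choices made for this illustration, not the authors' implementation.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mean(rows):
    """Pointwise mean of a list of discretized curves."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def regularized_cov(rows, p):
    """Pooled sample covariance with 1/p added on the diagonal (assumed form)."""
    mu, n, d = mean(rows), len(rows), len(rows[0])
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    for i in range(d):
        cov[i][i] += 1.0 / p
    return cov

def maha_sq(x, y, cov_reg):
    """Squared Mahalanobis-type distance (x - y)^T cov_reg^{-1} (x - y)."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    return sum(di * wi for di, wi in zip(diff, solve(cov_reg, diff)))

def kmeans_maha(curves, k, p=10.0, init=None, max_iter=100, seed=0):
    """k-means with the distance above; the covariance is estimated once."""
    cov_reg = regularized_cov(curves, p)
    if init is None:
        init = random.Random(seed).sample(range(len(curves)), k)
    centers = [curves[i][:] for i in init]
    labels = None
    for _ in range(max_iter):
        new = [min(range(k), key=lambda c: maha_sq(x, centers[c], cov_reg))
               for x in curves]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for c in range(k):
            members = [x for x, lab in zip(curves, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = mean(members)
    return labels, centers

# Toy data: eight curves t + shift on a five-point grid, two groups of shifts.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
shifts = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
curves = [[t + s for t in grid] for s in shifts]
labels, centers = kmeans_maha(curves, k=2, init=[0, 7])
print(labels)
```

Adding \(1/p\) to the diagonal keeps the inverse well defined when the sample covariance is rank deficient, which is typical when there are fewer curves than grid points. Estimating the covariance once from the pooled sample is a simplification; re-estimating it within clusters at each iteration would be a different design.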

Correspondence to Andrea Martino.

Cite this article

Martino, A., Ghiglietti, A., Ieva, F. et al. A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data. Stat Methods Appl 28, 301–322 (2019). https://doi.org/10.1007/s10260-018-00446-6

  • DOI: https://doi.org/10.1007/s10260-018-00446-6
