Clustering of functional data in a low-dimensional subspace

  • Regular Article
Advances in Data Analysis and Classification

Abstract

To find optimal clusters of functional objects in a lower-dimensional subspace of the data, a sequential method called tandem analysis is often used, although this approach is problematic. A new procedure is developed that simultaneously finds optimal clusters of functional objects and an optimal subspace for clustering. The method is based on the k-means criterion for functional data and seeks the subspace that is maximally informative about the clustering structure in the data. An efficient alternating least-squares algorithm is described, and the proposed method is extended to a regularized version. Analyses of artificial and real data examples demonstrate that the proposed method gives correct and interpretable results.
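
As a rough illustration of the k-means criterion for functional data mentioned in the abstract, the sketch below clusters curves sampled on a common, evenly spaced grid, using a trapezoidal approximation of the integrated squared distance. It is not the article's subspace method, which is not described in this preview: the function names (`l2_sq`, `mean_curve`, `functional_kmeans`), the grid, and the toy curves are all assumptions made for illustration, and the subspace search and regularization are omitted entirely.

```python
def l2_sq(f, g, dt):
    """Approximate the integrated squared difference of two curves
    sampled on a common, evenly spaced grid (trapezoidal rule)."""
    sq = [(a - b) ** 2 for a, b in zip(f, g)]
    return dt * (sum(sq) - 0.5 * (sq[0] + sq[-1]))


def mean_curve(curves):
    """Pointwise mean of a non-empty list of sampled curves."""
    n = len(curves)
    return [sum(vals) / n for vals in zip(*curves)]


def functional_kmeans(curves, init_centers, dt, n_iter=100):
    """Lloyd-style alternation (hypothetical helper, not the article's code):
    assign each curve to its nearest centre in the approximate L2 sense,
    then recompute each centre as the mean curve of its members."""
    centers = [list(c) for c in init_centers]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: l2_sq(x, centers[j], dt))
            for x in curves
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [x for x, lab in zip(curves, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centers[j] = mean_curve(members)
    # k-means loss: total approximate squared L2 distance to assigned centres
    loss = sum(l2_sq(x, centers[lab], dt) for x, lab in zip(curves, labels))
    return labels, centers, loss


# Toy data (assumed): four constant curves on the grid 0.0, 0.1, ..., 1.0
grid = [i / 10 for i in range(11)]
curves = [[level for _ in grid] for level in (0.0, 0.1, 1.0, 1.1)]
labels, centers, loss = functional_kmeans(
    curves, init_centers=[curves[0], curves[2]], dt=0.1
)
```

In a tandem analysis, the same clustering step would be run on low-dimensional scores obtained beforehand, for example from a functional principal component analysis, rather than on the curves themselves; the abstract's point is that the article instead estimates the subspace and the clusters jointly.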

Author information

Corresponding author

Correspondence to Michio Yamamoto.

About this article

Cite this article

Yamamoto, M. Clustering of functional data in a low-dimensional subspace. Adv Data Anal Classif 6, 219–247 (2012). https://doi.org/10.1007/s11634-012-0113-3

  • DOI: https://doi.org/10.1007/s11634-012-0113-3
