Nonparametric Hierarchical Clustering of Functional Data

Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 527)

Abstract

In this paper, we address the problem of curve clustering. We propose a nonparametric method that partitions the curves into clusters and discretizes the dimensions of the curve points into intervals. The cross-product of these partitions forms a data grid, which is obtained using a Bayesian model selection approach that makes no assumptions about the curves. Finally, we propose a post-processing technique that reduces the number of clusters in order to improve the interpretability of the clustering. It consists of optimally merging the clusters step by step, which corresponds to an agglomerative hierarchical classification whose dissimilarity measure is the variation of the criterion. Interestingly, this measure is none other than the sum of the Kullback-Leibler divergences between the cluster distributions before and after the merges. The practical interest of the approach for exploratory analysis of functional data is presented and compared with an alternative approach on an artificial and a real-world data set.
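The merging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each cluster is summarized by a histogram of point counts over the grid cells, and it assumes the cost of a merge is the count-weighted sum of the Kullback-Leibler divergences from each cluster's distribution to the distribution of the merged cluster. The function names (`kl`, `merge_cost`, `agglomerate`) and the weighting are hypothetical choices made for illustration; the chapter's Bayesian criterion itself is not reproduced here.

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; 0 * log(0) is treated as 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def merge_cost(a, b):
    """Assumed dissimilarity: count-weighted KL from each cluster to their union."""
    merged = a + b
    q = merged / merged.sum()
    return a.sum() * kl(a / a.sum(), q) + b.sum() * kl(b / b.sum(), q)

def agglomerate(histograms, target):
    """Greedily merge the cheapest pair of clusters until `target` clusters remain."""
    clusters = {i: np.asarray(h, dtype=float) for i, h in enumerate(histograms)}
    members = {i: [i] for i in clusters}
    history = []  # (kept cluster id, absorbed cluster id, merge cost)
    while len(clusters) > target:
        keys = sorted(clusters)
        cost, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for n, i in enumerate(keys) for j in keys[n + 1:]
        )
        clusters[i] = clusters[i] + clusters.pop(j)
        members[i] = members[i] + members.pop(j)
        history.append((i, j, cost))
    return members, history

# Toy data: four clusters described by counts over three grid cells.
toy = [[10, 0, 0], [9, 1, 0], [0, 0, 10], [0, 1, 9]]
members, history = agglomerate(toy, target=2)
```

On the toy data, the two pairs of clusters with nearly identical histograms are merged first, because their merge costs are far smaller than those of any cross pair. Recording the cost of each merge in `history` gives the heights one would use to draw a dendrogram.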

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Marc Boullé (1)
  • Romain Guigourès (1, 2)
  • Fabrice Rossi (2)

  1. Orange Labs, Lannion, France
  2. SAMM EA 4543, Université Paris 1, Paris, France
