Model-based regression clustering for high-dimensional data: application to functional data

Emilie Devijver
Regular Article


Finite mixture regression models are useful for modeling the relationship between a response and predictors arising from different subpopulations. In this article, we consider high-dimensional predictors and a high-dimensional response, and propose two procedures to cluster observations according to the link between predictors and response. To reduce the dimension, we use the Lasso estimator, which accounts for sparsity, and a maximum likelihood estimator penalized by the rank, which accounts for the matrix structure. To choose the number of components and the sparsity level, we construct a collection of models by varying these two parameters and select a model from this collection with a non-asymptotic criterion. We extend both procedures to functional data, where predictors and responses are functions, using a wavelet-based approach. For each situation, we provide algorithms, and we apply and evaluate our methods on both simulated and real datasets to understand how they behave in practice.
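The clustering principle summarized above can be illustrated on a toy low-dimensional instance: a minimal EM algorithm for a two-component mixture of simple linear regressions, where observations are clustered according to which regression link they follow. This is only a sketch; the Lasso and rank penalties and the wavelet step of the paper are omitted, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two subpopulations with opposite regression slopes.
n = 200
x = rng.uniform(-1, 1, n)
z = rng.integers(0, 2, n)                    # true (hidden) cluster labels
slopes_true = np.array([2.0, -2.0])
y = slopes_true[z] * x + 0.1 * rng.standard_normal(n)

# EM for a 2-component mixture of linear regressions (no intercept).
K = 2
pi = np.full(K, 0.5)                         # mixture proportions
beta = np.array([1.0, -1.0])                 # initial slopes
sigma2 = np.full(K, 1.0)                     # component noise variances

for _ in range(50):
    # E-step: responsibilities from the Gaussian regression densities.
    resid = y[:, None] - x[:, None] * beta[None, :]
    logdens = -0.5 * resid**2 / sigma2 - 0.5 * np.log(2 * np.pi * sigma2)
    logw = np.log(pi) + logdens
    logw -= logw.max(axis=1, keepdims=True)  # stabilize before exponentiating
    gamma = np.exp(logw)
    gamma /= gamma.sum(axis=1, keepdims=True)

    # M-step: weighted least squares and weighted variance per component.
    nk = gamma.sum(axis=0)
    pi = nk / n
    beta = (gamma * (x * y)[:, None]).sum(axis=0) / \
           (gamma * (x**2)[:, None]).sum(axis=0)
    resid = y[:, None] - x[:, None] * beta[None, :]
    sigma2 = (gamma * resid**2).sum(axis=0) / nk

labels = gamma.argmax(axis=1)
# Agreement with the true partition, up to label switching.
acc = max(np.mean(labels == z), np.mean(labels != z))
print(round(float(acc), 2))
```

In the high-dimensional setting of the paper, the M-step for each component would additionally be penalized (Lasso or rank penalty), and the number of components and sparsity level would be chosen by model selection over a collection of such fits.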


Keywords: Model-based clustering · Regression · High-dimension · Functional data

I am indebted to Jean-Michel Poggi and Pascal Massart for suggesting that I study this problem, and for stimulating discussions. I am also grateful to Jean-Michel Poggi for carefully reading the manuscript and making many useful suggestions. I thank Yves Misiti and Benjamin Auder for their help in speeding up the code. I also thank the referees for very interesting improvements and suggestions, and the editors for their help in writing this paper. I am also grateful to Irène Gijbels for carefully proofreading the manuscript.



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

1. Inria Select, Université Paris Sud, Orsay Cedex, France
