Robust clustering for functional data based on trimming and constraints
Many clustering algorithms for data consisting of curves or functions have recently been proposed. However, contamination in the sample of curves can degrade the performance of most of them. In this work we propose a robust, model-based clustering method that relies on an approximation to the “density function” for functional data. The robustness follows from the joint application of data-driven trimming, which reduces the effect of contaminated observations, and constraints on the variances, which prevent spurious clusters in the solution. The algorithm is designed to perform clustering and outlier detection simultaneously by maximizing a trimmed “pseudo” likelihood. The proposed method has been evaluated and compared with other existing methods through a simulation study. The proposed methodology performs better when a fraction of contaminating curves is added to an uncontaminated sample. Finally, the method is applied to a real data set that has previously been considered in the literature.
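The two ingredients named above, trimming and variance constraints, can be illustrated with a minimal sketch. It is not the algorithm of the paper. It assumes that a log pseudo-density value is already available for each curve under each cluster, and the function names `trimmed_assignment` and `clamp_variances` are hypothetical. The variance step is a crude stand-in that only enforces a bound `c` on the ratio of largest to smallest variance; a real constrained estimator would choose the truncation more carefully.

```python
import math

def trimmed_assignment(log_dens, alpha):
    """Assign each curve to its best cluster and trim the worst-fitting ones.

    log_dens: list of rows, one per curve, each holding the (assumed given)
              log pseudo-density of that curve under every cluster.
    alpha:    trimming proportion in [0, 1).
    Returns a list of labels; -1 marks a trimmed curve (flagged as outlier).
    """
    n = len(log_dens)
    best = [max(row) for row in log_dens]                 # best fit per curve
    labels = [row.index(max(row)) for row in log_dens]    # argmax cluster
    n_trim = math.floor(n * alpha)                        # number to discard
    # Curves with the smallest best-fit values are the least plausible ones.
    order = sorted(range(n), key=lambda i: best[i])
    for i in order[:n_trim]:
        labels[i] = -1
    return labels

def clamp_variances(variances, c):
    """Illustrative simplification: raise small variances so max/min <= c."""
    floor = max(variances) / c
    return [max(v, floor) for v in variances]

# Four curves, two clusters; the third curve fits neither cluster well.
scores = [[-1.0, -5.0], [-4.0, -0.5], [-30.0, -40.0], [-2.0, -3.0]]
labels = trimmed_assignment(scores, alpha=0.25)
constrained = clamp_variances([0.01, 1.0, 4.0], c=10)
```

In this toy input the poorly fitting third curve is the one that gets trimmed, and the tiny variance 0.01, which could otherwise support a spurious cluster, is lifted until the variance ratio respects the bound.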
Keywords: Functional data analysis, Clustering, Robustness, Trimming, Functional principal components analysis
Mathematics Subject Classification: 62G35, 62H30, 68T10
We would like to thank the Associate Editor and two anonymous reviewers for their helpful suggestions and comments. This work was partly done while DR and JO visited the Departamento de Estadística e I.O., Universidad de Valladolid, Spain, with support from Conacyt, Mexico (DR as visiting graduate student, JO through Projects 169175 Análisis Estadístico de Olas Marinas, Fase II and 234057 Análisis Espectral, Datos Funcionales y Aplicaciones), CIMAT, A.C. and the Universidad de Valladolid. Their hospitality and support are gratefully acknowledged. Research by LA G-E and A M-I was partially supported by the Spanish Ministerio de Economía y Competitividad, grant MTM2017-86061-C2-1-P, and by Consejería de Educación de la Junta de Castilla y León and FEDER, grant VA005P17.