Abstract
We propose a new method for detecting outliers in multivariate functional data. We exploit the joint use of two different depth measures and generalize the outliergram to the multivariate functional framework, with the aim of detecting and discarding both shape and magnitude outliers. The main application is the robustification of reference samples composed of G different known groups, to be used, for example, in classification procedures. By means of a simulation study, we assess the method's performance in comparison with other outlier detection methods. Finally, we consider a real dataset: we classify data by minimizing a suitable distance from the centers of the reference groups, and we compare the performance of supervised classification on test sets when the algorithm is trained on the original dataset and on the robustified one, respectively.
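As a rough illustration of the idea, the sketch below computes the two quantities on which the univariate outliergram is built, the modified band depth (MBD) and the modified epigraph index (MEI), and measures how far each curve falls below the parabola that bounds MBD in terms of MEI. The abstract does not name the two depth measures, so treating them as MBD and MEI is an assumption. The function names are hypothetical, the code handles a single functional component on a common grid with no ties, and it is not the multivariate procedure proposed in the paper.

```python
def pointwise_ranks(curves):
    """ranks[i][t] = 1-based rank of curve i among all curves at grid point t.

    Assumes all curves share the same grid and that there are no ties.
    """
    n, n_points = len(curves), len(curves[0])
    ranks = [[0] * n_points for _ in range(n)]
    for t in range(n_points):
        order = sorted(range(n), key=lambda i: curves[i][t])
        for r, i in enumerate(order, start=1):
            ranks[i][t] = r
    return ranks


def modified_band_depth(curves):
    """MBD with bands formed by pairs of curves (J = 2), via pointwise ranks.

    At each grid point, a value of rank r lies inside
    (r - 1) * (n - r) + (n - 1) of the n * (n - 1) / 2 pairwise bands.
    """
    n, n_points = len(curves), len(curves[0])
    n_pairs = n * (n - 1) / 2
    return [
        sum((r - 1) * (n - r) + (n - 1) for r in row) / (n_points * n_pairs)
        for row in pointwise_ranks(curves)
    ]


def modified_epigraph_index(curves):
    """MEI: average proportion of curves lying on or above the given curve."""
    n, n_points = len(curves), len(curves[0])
    return [
        sum(n - r + 1 for r in row) / (n_points * n)
        for row in pointwise_ranks(curves)
    ]


def outliergram_gap(mbd, mei):
    """Vertical distance between each (MEI, MBD) point and the bounding parabola.

    Curves that never cross the others have gap 0; a large gap suggests an
    atypical shape. The cut-off used to flag outliers is left to the user.
    """
    n = len(mbd)
    a0 = a2 = -2.0 / (n * (n - 1))
    a1 = 2.0 * (n + 1) / (n - 1)
    return [a0 + a1 * e + a2 * n * n * e * e - d for d, e in zip(mbd, mei)]


# Illustrative data (made up): four flat curves and one oscillating curve.
curves = [[float(level)] * 4 for level in range(4)] + [[-1.0, 4.0, -1.0, 4.0]]
gaps = outliergram_gap(modified_band_depth(curves), modified_epigraph_index(curves))
most_atypical = max(range(len(curves)), key=lambda i: gaps[i])
```

In this toy sample the oscillating curve has the largest gap, which is the behaviour a shape-outlier rule relies on; magnitude outliers would need a separate check, for instance one based on low depth values.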
Cite this article
Ieva, F., Paganoni, A.M. Component-wise outlier detection methods for robustifying multivariate functional samples. Stat Papers 61, 595–614 (2020). https://doi.org/10.1007/s00362-017-0953-1
DOI: https://doi.org/10.1007/s00362-017-0953-1