Functional outlier detection by a local depth with application to NO\(_{x}\) levels

  • Carlo Sguera
  • Pedro Galeano
  • Rosa E. Lillo
Original Paper

Abstract

This paper proposes methods for detecting outliers in functional data sets. The task of identifying atypical curves is carried out using the recently proposed kernelized functional spatial depth (KFSD). KFSD is a local depth that can be used to order the curves of a sample from the most to the least central. Since outliers are usually among the least central curves, we present a probabilistic result that allows us to select a threshold value for KFSD such that curves with depth values lower than the threshold are detected as outliers. Based on this result, we propose three new outlier detection procedures. The results of a simulation study show that our proposals generally outperform a battery of competitors. We apply our procedures to a real data set consisting of daily curves of emission levels of nitrogen oxides (NO\(_{x}\)), since identifying abnormal NO\(_{x}\) levels is of interest for taking the necessary environmental policy actions.
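
The depth-then-threshold idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian kernel on the discretized L2 distance between curves and the commonly stated sample form of a kernelized spatial depth. The function names, the bandwidth `sigma`, the toy curves, and the fixed cutoff `threshold` are hypothetical choices made for illustration. In particular, the cutoff is picked by hand here, whereas the paper derives it from a probabilistic result.

```python
import math


def sq_l2_distance(x, y, dt):
    """Squared L2 distance between two curves on a common grid (Riemann sum)."""
    return dt * sum((a - b) ** 2 for a, b in zip(x, y))


def gaussian_kernel(x, y, sigma, dt):
    """Gaussian kernel between two discretized curves (assumed kernel choice)."""
    return math.exp(-sq_l2_distance(x, y, dt) / sigma ** 2)


def kernel_spatial_depth(x, sample, sigma, dt):
    """Sample kernelized spatial depth of curve x with respect to `sample`.

    Computes 1 - (1/n) * || sum_i u_i ||, where u_i is the unit vector from
    the feature-space image of y_i to that of x, using kernel evaluations only.
    """
    n = len(sample)
    k_xx = gaussian_kernel(x, x, sigma, dt)
    terms = []  # (curve, k(x, y), feature-space distance between x and y)
    for y in sample:
        k_xy = gaussian_kernel(x, y, sigma, dt)
        k_yy = gaussian_kernel(y, y, sigma, dt)
        dist_sq = k_xx + k_yy - 2.0 * k_xy
        if dist_sq > 0.0:  # skip curves that coincide with x in feature space
            terms.append((y, k_xy, math.sqrt(dist_sq)))
    total = 0.0
    for y_i, k_xi, d_i in terms:
        for y_j, k_xj, d_j in terms:
            k_ij = gaussian_kernel(y_i, y_j, sigma, dt)
            total += (k_xx + k_ij - k_xi - k_xj) / (d_i * d_j)
    return 1.0 - math.sqrt(max(total, 0.0)) / n


def flag_low_depth(depths, threshold):
    """Indices of curves whose depth falls below the given threshold."""
    return [i for i, d in enumerate(depths) if d < threshold]


# Toy data (hypothetical): nine vertically shifted sine curves plus one
# curve shifted far away, all observed on 21 equally spaced points in [0, 1].
grid = [i / 20 for i in range(21)]
dt = 0.05
sample = [[math.sin(2 * math.pi * t) + 0.1 * k for t in grid]
          for k in range(-4, 5)]
sample.append([math.sin(2 * math.pi * t) + 5.0 for t in grid])

depths = [kernel_spatial_depth(x, sample, sigma=1.0, dt=dt) for x in sample]
flagged = flag_low_depth(depths, threshold=0.15)  # hand-picked cutoff
print(flagged)
```

The bandwidth matters: if `sigma` is very small relative to the distances between curves, all kernel values between distinct curves are close to zero and every curve receives nearly the same depth, so the ordering becomes uninformative.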

Keywords

  • Functional depths
  • Functional outlier detection
  • Kernelized functional spatial depth
  • Nitrogen oxides
  • Smoothed resampling

Notes

Acknowledgments

The authors would like to thank the editor-in-chief, the associate editor, and an anonymous referee for their helpful comments. This research was partially supported by the Spanish Ministry of Science and Innovation, grant ECO2011-25706, and by the Spanish Ministry of Economy and Competition, grant ECO2012-38442.

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Department of Statistics, Universidad Carlos III de Madrid, Getafe, Spain
