Abstract
For more than a century, air pollution has been one of the most important environmental problems in cities. It threatens human health and is responsible for many deaths worldwide every year. This paper addresses functional outlier detection. Functional data analysis is a novel mathematical tool for recognizing outliers, and it is applied here to the emissions of a coal-fired power plant. Two methods are used: the functional highest density region (HDR) boxplot and the functional bagplot. Functional bagplots were developed from bivariate bagplots, which are applied to the first two robust principal component scores. Both methods were used to detect outliers in pollutant emission curves over time. These curves were built, for each pollutant, by smoothing the discrete records available from an air quality monitoring station. In this research, both methods are tested on pollutant emissions recorded over a long period of time in an urban area. The emissions are first treated as vectors whose components are the pollutant concentration values of each observation. Although pollutant emissions are recorded discretely, both methods treat them as curves and identify outliers by comparing curves rather than vectors. The concept of an outlier thus moves from a point to a curve, with functional depth as the indicator of the distance between curves. The study concerns pollutant emissions from a coal-fired power plant on the outskirts of Oviedo, the capital of the Principality of Asturias in northern Spain. The strengths of the functional methods are also explained.
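The pipeline outlined in the abstract (smooth discrete records into curves, reduce each curve to its first two principal component scores, then flag curves that fall in a low-density region of the score plane) can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions: a small Fourier basis stands in for the unspecified smoothing step, ordinary SVD-based PCA stands in for the robust principal components, a Gaussian kernel density estimate with Scott's rule supplies the density, and all function names and parameters (`smooth_curves`, `pc_scores`, `hdr_outliers`, `coverage`) are hypothetical.

```python
import numpy as np


def smooth_curves(t, raw, n_harmonics=3, period=24.0):
    """Least-squares fit of each row of `raw` to a small Fourier basis on grid `t`.

    Assumption: a Fourier basis is used only for illustration.
    """
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k * t / period
        cols += [np.sin(w), np.cos(w)]
    basis = np.column_stack(cols)  # shape (len(t), 2 * n_harmonics + 1)
    coefs, *_ = np.linalg.lstsq(basis, raw.T, rcond=None)
    return (basis @ coefs).T  # smoothed curves, one per row


def pc_scores(curves, n_components=2):
    """Scores of mean-centred curves on the leading principal components.

    Assumption: ordinary (non-robust) PCA via SVD replaces robust PCA.
    """
    centred = curves - curves.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T


def kde_at_points(scores):
    """Gaussian product-kernel density estimate evaluated at every score."""
    n, d = scores.shape
    # Scott's rule bandwidth, one value per score dimension.
    h = scores.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))
    z = (scores[:, None, :] - scores[None, :, :]) / h
    kernel = np.exp(-0.5 * (z ** 2).sum(axis=2))
    norm = n * np.prod(h) * (2.0 * np.pi) ** (d / 2)
    return kernel.sum(axis=1) / norm


def hdr_outliers(curves, coverage=0.99):
    """Indices of curves whose scores lie outside the `coverage` HDR."""
    density = kde_at_points(pc_scores(curves))
    # Curves with density below the (1 - coverage) empirical quantile
    # fall outside the estimated highest density region.
    threshold = np.quantile(density, 1.0 - coverage)
    return np.where(density < threshold)[0]
```

Calling `hdr_outliers(smooth_curves(t, raw))` on an array `raw` with one row per day and one column per sampling time returns the indices of the flagged curves. Because the cut-off is an empirical quantile of the estimated densities, this sketch flags roughly a fraction `1 - coverage` of the curves whenever the densities are not all equal, so a flagged curve is a candidate for inspection rather than a confirmed anomaly.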
Acknowledgements
The authors wish to acknowledge the computational support provided by the Department of Mathematics at the University of Oviedo, as well as pollutant data from the Santa Marina air quality automated monitoring station supplied by the Section of Industry and Energy from the Government of Asturias (Spain). We would like to thank Anthony Ashworth for his revision of the English grammar and spelling of the manuscript.
Additional information
Responsible editor: Marcus Schulz
About this article
Cite this article
Sánchez-Lasheras, F., Ordóñez-Galán, C., García-Nieto, P.J. et al. Detection of outliers in pollutant emissions from the Soto de Ribera coal-fired power plant using functional data analysis: a case study in northern Spain. Environ Sci Pollut Res 27, 8–20 (2020). https://doi.org/10.1007/s11356-019-04435-4
DOI: https://doi.org/10.1007/s11356-019-04435-4