Wavelets-based clustering of air quality monitoring sites

  • Sónia Gouveia
  • Manuel G. Scotto
  • Alexandra Monteiro
  • Andres M. Alonso


This paper aims to provide a variance/covariance profile for a set of 36 monitoring stations measuring hourly concentrations of ozone (O₃) and nitrogen dioxide (NO₂), collected over the period 2005–2013 in mainland Portugal. The resulting individual profiles are embedded in a wavelet decomposition-based clustering algorithm in order to identify groups of stations exhibiting similar profiles. The cluster analysis identifies three groups of stations: urban, suburban/urban/rural, and a third group containing all but one of the rural stations. The results clearly indicate a geographical pattern among urban stations, distinguishing those located in the Lisbon area from those located in Oporto/North. Furthermore, for urban stations, the intra-diurnal and daily time scales exhibit the highest variance. This is due to the more intense chemical activity occurring in areas with high NO₂ emissions, which is responsible for the high variability of daily profiles. These chemical processes also explain why NO₂ and O₃ are more strongly negatively cross-correlated at suburban and urban sites than at rural stations. Finally, the clustering analysis also identifies sites whose classification according to environment/influence type needs revision.
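The general idea of clustering stations by their wavelet variance profiles can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: a plain Haar DWT pyramid stands in for whatever wavelet transform the paper uses, only per-scale variances of a single pollutant are computed (no O₃/NO₂ covariances), average-linkage agglomerative clustering is one arbitrary choice of algorithm, and the four "stations" are synthetic sinusoids rather than monitoring data. All function and variable names are hypothetical.

```python
import math


def haar_wavelet_variances(x, levels):
    """Per-scale wavelet variance estimates from a Haar DWT pyramid."""
    approx = list(x)
    variances = []
    for j in range(1, levels + 1):
        n_pairs = len(approx) // 2
        if n_pairs == 0:
            raise ValueError("series too short for the requested levels")
        detail = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2)
                  for i in range(n_pairs)]
        approx = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2)
                  for i in range(n_pairs)]
        # Dividing the mean DWT detail energy by 2**j puts every level
        # on the same (per-observation) variance scale.
        variances.append(sum(d * d for d in detail) / (n_pairs * 2 ** j))
    return variances


def euclidean(p, q):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def agglomerative(profiles, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per profile."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(euclidean(profiles[i], profiles[j])
                            for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # merge the closest pair
    labels = [0] * len(profiles)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


# Synthetic hourly series (assumed, not real data): two with a fast
# 8-hour cycle and two with a slow 128-hour cycle.
n = 256
series = [
    [10 * math.sin(2 * math.pi * t / 8) for t in range(n)],
    [9 * math.sin(2 * math.pi * t / 8) for t in range(n)],
    [10 * math.sin(2 * math.pi * t / 128) for t in range(n)],
    [9 * math.sin(2 * math.pi * t / 128) for t in range(n)],
]
profiles = [haar_wavelet_variances(s, levels=4) for s in series]
labels = agglomerative(profiles, n_clusters=2)
print(labels)
```

In this toy setting, the two fast-cycle series concentrate their variance at the finest scales and the two slow-cycle series do not, so the profiles separate into two groups. A fuller treatment would add cross-scale covariances between pollutants to each profile before computing distances.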


Air quality monitoring stations; Ozone; Nitrous oxide; Wavelets; Classification; Clustering



This work was supported by Portuguese Funds through FCT - Foundation for Science and Technology, in the context of the projects UID/CEC/00127/2013 and Incentivo/EEI/UI0127/2014 (IEETA/UA, Instituto de Engenharia Electrónica e Informática de Aveiro) and UID/MAT/04106/2013 (CIDMA/UA, Centro de I&D em Matemática e Aplicações). S. Gouveia acknowledges the postdoctoral grant by FCT (ref. SFRH/BPD/87037/2012), financed through the POPH - QREN programme (European Social Fund and National funds). Andres M. Alonso acknowledges the support of CICYT (Spain) Grants ECO2011-25706 and ECO2012-38442. The authors also gratefully acknowledge the Portuguese Environmental Agency for providing the air quality monitoring data.

Compliance with ethical standards

Conflict of interests

The authors declare that they have no conflict of interest.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Sónia Gouveia (1)
  • Manuel G. Scotto (2)
  • Alexandra Monteiro (3)
  • Andres M. Alonso (4)
  1. Instituto de Engenharia Electrónica e Informática de Aveiro (IEETA) and Centro de I&D em Matemática e Aplicações (CIDMA), Universidade de Aveiro, Aveiro, Portugal
  2. CEMAT, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
  3. Centre for Environmental Marine Studies (CESAM) and Department of Environment and Planning, Universidade de Aveiro, Aveiro, Portugal
  4. Department of Statistics and INEACU, Universidad Carlos III de Madrid, Madrid, Spain
