Abstract

Many urban areas have deployed sensor networks that monitor multiple air quality variables. The measurements from these sensors can be treated individually, as time series, or collectively. Treated collectively, a variable monitored by a network of sensors can be transformed into a map that carries the same information in visual rather than numerical form. Once transformed, the maps can be used as images in standard machine learning tasks, especially clustering and outlier detection. Air quality is one of the main concerns in urban areas. In this work, the measurements of 12 monitoring stations recording the ozone concentration in Madrid (Spain) are first transformed into daily maps. For this purpose, a methodology is proposed for converting numerical information from a geographically distributed network of sensors into grey-scale maps. These maps are then searched for outliers (extreme episodes) using density-based spatial clustering of applications with noise (DBSCAN). The sensitivity of the extreme-episode search to the map-construction methodology is also investigated.
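The pipeline summarised in the abstract (station readings, then daily grey-scale maps, then a density-based search for outlying days) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the interpolation scheme, so inverse-distance weighting is assumed here; the station coordinates and ozone values are synthetic; and the names `grey_map` and `dbscan`, the 8 x 8 grid, the 0 to 200 grey-level range, and the `eps` and `min_pts` values are all hypothetical choices. DBSCAN is written out with the standard library only so that the example is self-contained.

```python
import math
import random

def grey_map(stations, values, size=8, vmin=0.0, vmax=200.0, power=2.0):
    """Interpolate station readings onto a size x size grid by inverse-distance
    weighting (an assumed scheme) and rescale to grey levels 0..255."""
    pixels = []
    for row in range(size):
        for col in range(size):
            x, y = (col + 0.5) / size, (row + 0.5) / size  # pixel centre
            num = den = 0.0
            for (sx, sy), value in zip(stations, values):
                weight = 1.0 / (math.hypot(x - sx, y - sy) ** power + 1e-9)
                num += weight * value
                den += weight
            # Fixed vmin/vmax keep grey levels comparable across days.
            level = (num / den - vmin) / (vmax - vmin)
            pixels.append(round(255 * min(1.0, max(0.0, level))))
    return tuple(pixels)  # flattened map, one grey level per pixel

def dbscan(points, eps, min_pts):
    """Plain DBSCAN with Euclidean distance; label -1 marks noise."""
    n = len(points)
    # Each neighbourhood includes the point itself (distance 0).
    neighbours = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:  # core point: keep expanding
                queue.extend(neighbours[j])
    return labels

# Synthetic data (not from the paper): 12 stations in a unit square,
# 60 ordinary days plus 2 injected high-ozone days at the end.
rng = random.Random(0)
stations = [(rng.random(), rng.random()) for _ in range(12)]
normal_days = [[rng.gauss(60, 5) for _ in stations] for _ in range(60)]
extreme_days = [[rng.gauss(160, 5) for _ in stations] for _ in range(2)]

maps = [grey_map(stations, day) for day in normal_days + extreme_days]
labels = dbscan(maps, eps=250.0, min_pts=3)
outliers = [day for day, label in enumerate(labels) if label == -1]
print("days labelled as noise:", outliers)
```

In this toy setting the two injected days lie far from the dense group of ordinary maps and have too few neighbours to form a cluster of their own, so DBSCAN labels them as noise. On real measurements, the grid resolution, grey-level range, `eps`, and `min_pts` would all need tuning, and the abstract notes that the outcome is sensitive to how the maps are built.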

Keywords

DBSCAN, Outlier detection, Air quality, Madrid

Notes

Acknowledgment

The research leading to these results has received funding from the Spanish Ministry of Economy and Competitiveness (MINECO) through the grant FPA2016-80994-C2-1-R, and through the “Unidad de Excelencia María de Maeztu”: CIEMAT - FÍSICA DE PARTÍCULAS grant MDM-2015-0509.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas, Madrid, Spain
