Self-organizing maps applied to the analysis and identification of characteristics related to air quality monitoring stations and its pollutants

  • Original Article
  • Neural Computing and Applications

Abstract

Addressing the growing problem of air pollution requires innovative regulations and practical solutions to reduce and control its impact. Numerous studies have recommended multivariate statistical methods for identifying the connections and characteristics of atmospheric pollutants, which can provide valuable information about their generation, dispersion, and contribution to the deterioration of air quality. This study examines the air quality in Salvador, Bahia, using the Self-Organizing Maps (SOM) technique. The data span 2011 to 2016 and come from air quality monitoring stations. The dataset includes hourly measurements of pollutants such as SO2, CO, O3, and particulate matter, together with meteorological variables (wind speed, wind direction, the standard deviation of wind direction, ambient temperature, relative humidity, and rainfall). The SOM analysis identifies significant clusters, revealing associations between high concentrations of specific pollutants and environmental variables. For example, clusters with elevated SO2 concentrations are observed in areas that suggest the presence of local pollution sources. Validation with Principal Component Analysis strengthens these findings, which are essential for developing air quality management policies because they highlight areas of concern and offer insights for mitigation strategies. The study demonstrates the effectiveness of the SOM technique in environmental analysis and emphasizes the importance of domain knowledge in interpreting air pollution patterns comprehensively.
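To make the clustering idea concrete, the following is a minimal sketch of a self-organizing map in Python with NumPy. It is not the authors' implementation: the grid size, learning-rate and neighbourhood schedules, the helper names (`train_som`, `best_matching_units`, `quantization_error`, `zscore`), and the synthetic three-variable data standing in for SO2, CO, and wind speed are all assumptions chosen for illustration.

```python
import numpy as np

def zscore(data):
    """Standardise each column so variables with large units do not dominate."""
    return (data - data.mean(axis=0)) / data.std(axis=0)

def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM with the online Kohonen update rule."""
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    # Initialise prototype vectors from randomly chosen samples (an assumption).
    weights = data[rng.integers(0, n_samples, rows * cols)].astype(float)
    # Grid coordinates of each unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps = epochs * n_samples
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(n_samples):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                   # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighbourhood radius
            x = data[idx]
            # Best-matching unit: the prototype closest to the sample.
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
            # Gaussian neighbourhood measured on the map grid, not in data space.
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def best_matching_units(data, weights):
    """Return the index of the closest prototype for every sample."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def quantization_error(data, weights):
    """Mean distance between each sample and its best-matching prototype."""
    bmus = best_matching_units(data, weights)
    return float(np.mean(np.linalg.norm(data - weights[bmus], axis=1)))

# Synthetic stand-in for hourly records (hypothetical columns: SO2, CO, wind speed).
rng = np.random.default_rng(1)
background = rng.normal([5.0, 0.3, 3.0], [1.0, 0.05, 0.5], size=(200, 3))
episode = rng.normal([40.0, 0.3, 1.0], [5.0, 0.05, 0.3], size=(50, 3))
X = zscore(np.vstack([background, episode]))

som = train_som(X)
bmus = best_matching_units(X, som)
qe = quantization_error(X, som)
```

In a workflow of this kind, the trained prototypes in `som` would typically be grouped further (for example with k-means or hierarchical clustering) and inspected variable by variable, so that units with high values of one pollutant can be related to the meteorological variables mapped to the same region of the grid.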

Data availability

The datasets generated and/or analyzed during the current study are available in the Mendeley Data repository, https://data.mendeley.com/datasets/rg7pfvgwv8/1.

Acknowledgements

The authors wish to acknowledge the financial support of the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). We sincerely thank CETREL S.A. for providing us access to this valuable dataset.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) - Finance Code 001.

Contributions

ELRC, TB, ELA, and MACF contributed to conceptualization and methodology. ELRC, TB, LAD, ELA, and MACF helped in data curation, writing—original draft preparation, and writing—reviewing and editing. ELA and MACF helped in collecting, analyzing, writing, and editing data.

Corresponding author

Correspondence to Marcelo A. C. Fernandes.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Consent for publication

All authors agreed with the content and gave explicit consent to submit.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Costa, E.L.R., Braga, T., Dias, L.A. et al. Self-organizing maps applied to the analysis and identification of characteristics related to air quality monitoring stations and its pollutants. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09793-w

  • DOI: https://doi.org/10.1007/s00521-024-09793-w
