A novel ensemble machine learning method for accurate air quality prediction

Emeç, M.; Yurtsever, M.

doi:10.1007/s13762-024-05671-z


Original Paper
Published: 06 May 2024


International Journal of Environmental Science and Technology


Abstract

Air pollution remains a major problem that causes health issues worldwide. Industrial development, increased vehicle traffic, and energy production degrade air quality by releasing harmful gases and particles into the atmosphere, which can lead to respiratory diseases, cardiovascular problems, and other health complications. Predicting air quality is a crucial step in safeguarding human health and informing environmental policies. Many cities employ measurement instruments and data collection systems to monitor and forecast air quality. These data can be analyzed using machine learning models to predict future air pollution levels. This article examines the performance of a new stacking ensemble model for estimating PM2.5, based on air quality datasets from major cities such as Beijing and Istanbul. The model combines predictions from various machine learning models. In the initial stage of the study, the performance of models commonly used in the literature, such as multi-layer perceptron, support vector regression, and random forest, was evaluated. These models were assessed for their ability to predict PM2.5 using metrics such as mean absolute error (MAE), root mean squared error (RMSE), and R-squared (R²), which measure how close the model predictions are to the actual data. The stacking ensemble model examined in this study yielded the best results for PM2.5 predictions, with an MAE of 6.67, an RMSE of 8.80, and an R² of 0.91. In conclusion, the stacking ensemble model offers a promising approach to air pollution prediction, achieving better results than traditional machine learning models.
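The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library, and the observed and predicted values are made up for illustration; they are not taken from the Beijing or Istanbul datasets, and the function names are hypothetical rather than the authors' code.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE does."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """R-squared: share of the variance in y_true explained by y_pred."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative (made-up) PM2.5 concentrations, not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R2   = {r2(observed, predicted):.3f}")
```

Lower MAE and RMSE indicate predictions closer to the observations, while R² approaches 1 as the predictions explain more of the observed variance. MAE and RMSE are expressed in the same units as the target variable.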



Author information

Authors and Affiliations

IT Department, Istanbul University, 34116, Fatih, Istanbul, Turkey
M. Emeç
Faculty of Economics and Administrative Sciences, Department of Management Information Systems, Izmir Democracy University, 35140, Karabaglar, Izmir, Turkey
M. Yurtsever


Corresponding author

Correspondence to M. Emeç.

Ethics declarations

Conflict of interest

There are no conflicts of interest, and all the authors are interested in publishing the manuscript.

Ethical approval

This article contains no studies with human participants or animals performed by the authors.

Additional information

Editorial responsibility: Mohamed F. Yassin.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Emeç, M., Yurtsever, M. A novel ensemble machine learning method for accurate air quality prediction. Int. J. Environ. Sci. Technol. (2024). https://doi.org/10.1007/s13762-024-05671-z


Received: 13 June 2023
Revised: 26 December 2023
Accepted: 21 April 2024
Published: 06 May 2024
DOI: https://doi.org/10.1007/s13762-024-05671-z
