Abstract
Owing to rapid global economic expansion and industrialization, PM2.5 (particles smaller than 2.5 µm in aerodynamic diameter) pollution has become a key environmental issue. High PM2.5 levels directly affect public health and social development. In this paper, ambient PM2.5 concentrations, along with meteorological data, are forecast in Anhui Province, China, using three time series models: random forest (RF), the prophet forecasting model (PFM), and autoregressive integrated moving average (ARIMA). The results indicate that the RF model outperformed PFM and ARIMA in predicting PM2.5 concentrations, with a cross-validated coefficient of determination (R2), RMSE, and MAE of 0.83, 10.39 µg/m3, and 6.83 µg/m3, respectively. PFM achieved intermediate results (R2 = 0.71, RMSE = 13.90 µg/m3, and MAE = 9.05 µg/m3), while ARIMA performed worst of the three (R2 = 0.64, RMSE = 15.85 µg/m3, and MAE = 10.59 µg/m3). These findings indicate that the RF model was the most effective of the three methods for predicting PM2.5 in this study and could be applied to other regions.
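The three scores reported above can be illustrated with a minimal sketch. It assumes R2 is computed as one minus the ratio of residual to total sum of squares; the abstract does not state which definition the authors used, and the function name and toy values are hypothetical.

```python
import math

def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values.

    Assumption: R2 = 1 - SS_res / SS_tot. This is undefined when all
    observed values are identical (SS_tot = 0), which is not handled here.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)                          # root mean squared error
    mae = sum(abs(r) for r in residuals) / n              # mean absolute error
    return r2, rmse, mae

# Toy values in µg/m3, invented purely for illustration (not data from the study).
observed = [35.0, 50.0, 42.0, 61.0, 28.0]
predicted = [38.0, 47.0, 45.0, 55.0, 30.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.2f}, RMSE={rmse:.2f} µg/m3, MAE={mae:.2f} µg/m3")
```

RMSE and MAE carry the units of the concentration itself, which is why the abstract reports them in µg/m3, while R2 is dimensionless. RMSE penalizes large errors more heavily than MAE, so a model's RMSE is never smaller than its MAE.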
Data availability
Not applicable.
Acknowledgements
The authors thank Prof. Dr. Muhammad Zaffar Hashmi for his kind and valuable suggestions.
Author information
Contributions
Ahmad Hasnain: conceptualization, methodology, data curation, formal analysis, writing (original draft), writing (review and editing), validation, visualization. Muhammad Zaffar Hashmi: supervision, conceptualization, resources, writing (review and editing). Sohaib Khan: validation, investigation, data curation, writing (review and editing). Uzair Aslam Bhatti: supervision, investigation, conceptualization, data curation, writing (review and editing). Xiangqiang Min: data curation, formal analysis, writing (review and editing). Yin Yue: data curation, validation, writing (review and editing). Yufeng He: data curation, writing (review and editing). Geng Wei: data curation, writing (review and editing), validation. All authors read and approved the final manuscript.
Ethics declarations
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent to publish
Not applicable.
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Hasnain, A., Hashmi, M.Z., Khan, S. et al. Predicting ambient PM2.5 concentrations via time series models in Anhui Province, China. Environ Monit Assess 196, 487 (2024). https://doi.org/10.1007/s10661-024-12644-9
DOI: https://doi.org/10.1007/s10661-024-12644-9