An ensemble NLSTM-based model for PM2.5 concentrations prediction considering feature extraction and data decomposition

Zhang, Rui; Awang, Norhashidah

doi:10.1007/s11869-023-01385-2

Published: 21 June 2023

Volume 16, pages 1969–1987, (2023)
Air Quality, Atmosphere & Health

Abstract

Fine particulate matter (PM2.5) is a hazardous air pollutant with an aerodynamic diameter of 2.5 μm or less, which can lead to severe health impacts such as cardiovascular disease, respiratory illnesses, and various types of cancer. Accurate forecasting of PM2.5 concentrations is therefore crucial for public health and policy-making. However, because PM2.5 behaves stochastically, achieving high prediction accuracy and efficiency remains a challenge. To address this challenge, this study proposes a hybrid deep learning model that combines principal component analysis (PCA), the discrete stationary wavelet transform (DSWT), and a Nested LSTM (NLSTM) neural network to predict PM2.5 concentrations. The proposed model aims to leverage the strengths of each technique to achieve better accuracy and efficiency in PM2.5 forecasting. Specifically, PCA is employed as the feature extraction method to reduce the dimensionality of the data and improve computing efficiency. DSWT is then used to decompose the reduced-dimensional data into several sub-signals that are more regular and stable, so that the NLSTM network can learn each sub-signal separately. Finally, the predicted values of the sub-signals are reconstructed to obtain the final PM2.5 forecast. The proposed model is validated using daily air pollutant and meteorological data collected in Taiyuan, China, from January 1, 2016, to December 31, 2020. The long-term, medium-term, and short-term forecast results demonstrate that the proposed model achieves better accuracy and efficiency than existing models. Overall, the proposed hybrid deep learning model provides a promising solution for accurate and efficient forecasting of PM2.5 concentrations, and the findings of this study have important implications for public health and environmental policy.
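The decompose, predict, and reconstruct idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the wavelet family, the decomposition level, or the network settings, so the sketch assumes an additive undecimated Haar decomposition (the "à trous" scheme, in which the sub-signals sum back to the original series), uses a persistence forecaster as a stand-in for the NLSTM network, and omits the PCA step. The function names `haar_a_trous`, `naive_forecast`, and `ensemble_forecast` and the sample values are hypothetical.

```python
def haar_a_trous(series, levels):
    """Additive undecimated Haar decomposition (a trous scheme).

    Returns [detail_1, ..., detail_levels, approx_levels]. Every part has
    the same length as the input, and the parts sum back to the input.
    Assumption: periodic boundary handling (indices wrap around).
    """
    n = len(series)
    approx = [float(v) for v in series]
    parts = []
    for j in range(levels):
        step = 2 ** j  # spread the filter taps instead of downsampling
        smoother = [0.5 * (approx[t] + approx[(t - step) % n]) for t in range(n)]
        # The detail is whatever the smoothing removed at this scale.
        parts.append([a - s for a, s in zip(approx, smoother)])
        approx = smoother
    parts.append(approx)
    return parts


def naive_forecast(sub_signal):
    """Placeholder one-step forecaster (persistence), standing in for NLSTM."""
    return sub_signal[-1]


def ensemble_forecast(series, levels=2):
    """Forecast each sub-signal separately, then sum the forecasts."""
    parts = haar_a_trous(series, levels)
    return sum(naive_forecast(p) for p in parts)


# Made-up daily PM2.5 values, for illustration only.
pm25 = [35.0, 42.0, 58.0, 61.0, 47.0, 39.0, 52.0, 66.0]

parts = haar_a_trous(pm25, levels=2)
rebuilt = [sum(values) for values in zip(*parts)]
# The sub-signals reconstruct the original series by simple summation.
assert all(abs(a - b) < 1e-9 for a, b in zip(pm25, rebuilt))

print(ensemble_forecast(pm25))
```

Summing the per-sub-signal forecasts is valid here only because this particular decomposition is additive; with a general stationary wavelet transform, the reconstruction step would use the inverse transform instead.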

Data availability

Air pollutant data can be obtained from https://www.aqistudy.cn/historydata/, and meteorological data can be obtained from the National Meteorological Science Data Center (https://data.cma.cn/).

Author information

Authors and Affiliations

School of Mathematical Sciences, Universiti Sains Malaysia, 11800 USM, Penang, Malaysia
Rui Zhang & Norhashidah Awang
Department of Science, Taiyuan Institute of Technology, Taiyuan, 030008, China
Rui Zhang

Contributions

Both authors contributed to the study’s conception and design. Data collection and analysis were performed by Rui Zhang. The first draft of the manuscript was written by Rui Zhang. Norhashidah Awang supervised, reviewed, and edited the manuscript. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Norhashidah Awang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, R., Awang, N. An ensemble NLSTM-based model for PM2.5 concentrations prediction considering feature extraction and data decomposition. Air Qual Atmos Health 16, 1969–1987 (2023). https://doi.org/10.1007/s11869-023-01385-2

Received: 25 November 2022
Accepted: 09 June 2023
Published: 21 June 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s11869-023-01385-2
