PM$_{2.5}$ forecasting based on transformer neural network and data embedding

Limperis, Jordan; Tong, Weitian; Hamza-Lup, Felix; Li, Lixin

doi:10.1007/s12145-023-01002-x

Research
Published: 17 May 2023

Earth Science Informatics, volume 16, pages 2111–2124 (2023)

Abstract

Forecasting time series data is challenging because of the temporal and multivariate dependencies in the data. In this paper, we present a new approach named TPPM25 (Transformer-based Prediction of PM$_{2.5}$) for forecasting PM$_{2.5}$, a key air quality indicator. It is based on the state-of-the-art Transformer neural network and various data embedding techniques. By performing attention calculations among features over time steps, TPPM25 mimics cognitive attention: it selectively enhances essential parts of the input data while diminishing other parts. TPPM25 is able to effectively capture temporal relations to multiple influencing meteorological features. Experiments demonstrate its effectiveness in comparison with a cutting-edge ensemble deep learning model from Zhang et al. (Inf Sci 544:427–445, 2021). TPPM25 outperforms Zhang et al.’s model under the same experimental setting on a well-researched benchmark dataset. Whereas Zhang et al.’s model is restricted to univariate PM$_{2.5}$ prediction, TPPM25 bypasses this restriction and further improves prediction accuracy when more influencing meteorological features are considered. Moreover, TPPM25 maintains high prediction accuracy over longer periods of time than the Long Short-Term Memory (LSTM) and Bidirectional LSTM models.
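
The attention calculation over time steps mentioned in the abstract can be illustrated with a minimal sketch of generic scaled dot-product self-attention. This is not the TPPM25 implementation, whose code is not public. The function names, the identity query/key/value projections, and the toy input window are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(steps):
    """Scaled dot-product self-attention with identity projections.

    `steps` is a list of time steps, each a list of feature values.
    Queries, keys, and values are all the raw inputs; this is an
    assumption to keep the sketch short, since a trained Transformer
    learns separate projection matrices for each.
    """
    d = len(steps[0])
    outputs, weights = [], []
    for query in steps:
        # Similarity of this time step to every time step in the window.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in steps
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted mix of all time steps: high-weight steps are enhanced,
        # low-weight steps are diminished.
        outputs.append(
            [sum(w_j * value[i] for w_j, value in zip(w, steps)) for i in range(d)]
        )
    return outputs, weights


# Hypothetical window: 4 hourly steps x 3 normalized features.
window = [
    [0.2, 0.1, 0.5],
    [0.4, 0.1, 0.4],
    [0.9, 0.3, 0.2],
    [0.8, 0.2, 0.1],
]
context, attn = self_attention(window)
```

Each row of `attn` is a probability distribution over the time steps in the window, so each row of `context` is a convex combination of the input steps.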

Data Availability

The datasets generated during and/or analysed during the current study are not publicly available but are available from the corresponding author on reasonable request.

Code Availability

The code generated during and/or analysed during the current study is not publicly available but is available from the corresponding author on reasonable request.

References

Abduljabbar RL, Dia H, Tsai PW (2021) Unidirectional and bidirectional lstm models for short-term traffic prediction. J Adv Transp 2021. https://doi.org/10.1155/2021/5589075
Altaf B, Yu L, Zhang X (2018) Spatio-temporal attention based recurrent neural network for next location prediction. In: 2018 IEEE International conference on big data (Big Data). pp 937–942, https://doi.org/10.1109/BigData.2018.8622218
Ameer S, Shah MA, Khan A et al (2019) Comparative analysis of machine learning techniques for predicting air quality in smart cities. IEEE Access 7:128325–128338. https://doi.org/10.1109/ACCESS.2019.2925082
Baker Effendi S, van der Merwe B, Balke WT (2020) Suitability of graph database technology for the analysis of spatio-temporal data. Future Internet 12(5):78. https://doi.org/10.3390/fi12050078
Bermejo U, Almeida A, Bilbao-Jayo A et al (2021) Embedding-based real-time change point detection with application to activity segmentation in smart home time series data. Expert Syst Appl 185:115641. https://doi.org/10.1016/j.eswa.2021.115641
Butland BK, Samoli E, Atkinson RW et al (2019) Measurement error in a multi-level analysis of air pollution and health: a simulation study. Environ Health 18(1):1–10. https://doi.org/10.1186/s12940-018-0432-8
Chai T, Draxler RR (2014) Root mean square error (rmse) or mean absolute error (mae)? – arguments against avoiding rmse in the literature. Geosci Model Dev Discuss 7(1):1525–1534. https://doi.org/10.5194/gmd-7-1247-2014
Choromanski K, Likhosherstov V, Dohan D, et al (2020) Rethinking attention with performers. arXiv preprint arXiv:2009.14794
Dai H, Huang G, Zeng H, et al (2022) PM$_{2.5}$ volatility prediction by XGBoost-MLP based on GARCH models. J Clean Prod 356:131898. https://doi.org/10.1016/j.jclepro.2022.131898
Danaci E, Alkaya AF, Gültekin OG (2020) An empirical analysis of swarm intelligence techniques on atm cash withdrawal forecasting. In: Intelligent and fuzzy techniques in big data analytics and decision making. pp 1235–1242. https://doi.org/10.1007/978-3-030-23756-1_145
David H (1979) Robust estimation in the presence of outliers. In: Robustness in statistics. pp 61–74. https://doi.org/10.1016/B978-0-12-438150-6.50011-X
Deb K, Pratap A, Agarwal S et al (2002) A fast and elitist multiobjective genetic algorithm: Nsga-ii. IEEE Trans Evol Comput 6(2):182–197. https://doi.org/10.1109/4235.996017
Devlin J, Chang MW, Lee K, et al (2019) Bert: Pre-training of deep bidirectional transformers for language understanding. pp 4171–4186. https://doi.org/10.18653/v1/n19-1423
EPA (2006) Air quality guidelines: global update 2005: particulate matter, ozone, nitrogen dioxide, and sulfur dioxide. World Health Organization
Grigsby J, Wang Z, Qi Y (2021) Long-range transformers for dynamic spatiotemporal forecasting. arXiv preprint arXiv:2109.12218
Hall JV, Brajer V, Lurmann FW (2010) Air pollution, health and economic benefits–lessons from 20 years of analysis. Ecol Econ 69(12):2590–2597. https://doi.org/10.1016/j.ecolecon.2010.08.003
He K, Zhang X, Ren S, et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 770–778, https://doi.org/10.1109/CVPR.2016.90
Kampa M, Castanas E (2008) Human health effects of air pollution. Environ Pollut 151(2):362–367. https://doi.org/10.1016/j.envpol.2007.06.012
Kazemi SM, Goel R, Eghbali S, et al (2019) Time2vec: Learning a vector representation of time. arXiv preprint arXiv:1907.05321 https://doi.org/10.48550/arXiv.1907.05321
Kumar U, Jain V (2010) Arima forecasting of ambient air pollutants (O3, NO, NO2 and CO). Stoch Environ Res Risk Assess 24(5):751–760. https://doi.org/10.1007/s00477-009-0361-8
Lee KH, Chen X, Hua G, et al (2018) Stacked cross attention for image-text matching. In: Proceedings of the European Conference on Computer Vision (ECCV). pp 201–216, https://doi.org/10.1007/978-3-030-01225-0_13
Lei C, Xu X, Ma Y et al (2022) Full coverage estimation of the PM concentration across china based on an adaptive spatiotemporal approach. IEEE Trans Geosci Remote Sens 60:1–14. https://doi.org/10.1109/TGRS.2022.3213797
Li T, Hua M, Wu X (2020) A hybrid CNN-LSTM model for forecasting particulate matter (PM$_{2.5}$). IEEE Access 8:26933–26940. https://doi.org/10.1109/ACCESS.2020.2971348
Li T, Shen H, Yuan Q et al (2022) A locally weighted neural network constrained by global training for remote sensing estimation of PM2.5. IEEE Trans Geosci Remote Sens 60:1–13. https://doi.org/10.1109/TGRS.2021.3074569
Li X, Feng Y, Liang H (2017) The impact of meteorological factors on PM2.5 variations in hong kong. In: IOP Conference Series: Earth and Environmental Science. p 012003, https://doi.org/10.1088/1755-1315/78/1/012003
Li Y, Chen Q, Zhao H, et al (2015) Variations in PM10, PM$_{2.5}$ and PM1.0 in an urban area of the sichuan basin and their relation to meteorological factors. Atmosphere 6(1):150–163. https://doi.org/10.3390/atmos6010150
Liang X, Li S, Zhang S, et al (2016) PM$_{2.5}$ data reliability, consistency, and air quality assessment in five chinese cities. J Geophys Res Atmos 121(17):10–220. https://doi.org/10.1002/2016JD024877
Liu Y, Paciorek CJ, Koutrakis P (2009) Estimating regional spatial and temporal variability of PM2.5 concentrations using satellite data, meteorology, and land use information. Environ Health Perspect 117(6):886–892. https://doi.org/10.1289/ehp.0800123
Lou C, Liu H, Li Y, et al (2017) Relationships of relative humidity with PM2.5 and PM10 in the yangtze river delta, china. Environ Monit Assess 189(11):1–16. https://doi.org/10.1007/s10661-017-6281-z
Ma J, Shou Z, Zareian A, et al (2019) Cdsa: cross-dimensional self-attention for multivariate, geo-tagged time series imputation. arXiv preprint arXiv:1905.09904
Niu M, Zhang Y, Ren Z (2023) Deep learning-based pm2.5 long time-series prediction by fusing multisource data: A case study of beijing. Atmosphere 14(2). https://doi.org/10.3390/atmos14020340
Pui DY, Chen SC, Zuo Z (2014) PM2.5 in china: Measurements, sources, visibility and health effects, and mitigation. Particuology 13:1–26. https://doi.org/10.1016/j.partic.2013.11.001
Qi Y, Li Q, Karimian H et al (2019) A hybrid model for spatiotemporal forecasting of PM2.5 based on graph convolutional neural network and long short-term memory. Sci Total Environ 664:1–10. https://doi.org/10.1016/j.scitotenv.2019.01.333
Rossel RV, Webster R (2012) Predicting soil properties from the australian soil visible-near infrared spectroscopic database. Eur J Soil Sci 63(6):848–860. https://doi.org/10.1111/j.1365-2389.2012.01495.x
Shang Z, Deng T, He J et al (2019) A novel model for hourly PM$_{2.5}$ concentration prediction based on cart and eelm. Sci Total Environ 651:3043–3052. https://doi.org/10.1016/j.scitotenv.2018.10.193
Shen S, Yao Z, Gholami A, et al (2020) Powernorm: Rethinking batch normalization in transformers. In: Proceedings of the 37th international conference on machine learning. pp 8741–8751
Siami-Namini S, Tavakoli N, Namin AS (2019) The performance of lstm and bilstm in forecasting time series. In: 2019 IEEE International conference on big data (Big Data). pp 3285–3292, https://doi.org/10.1109/BigData47090.2019.9005997
Singh P, Narasimhan TL, Lakshminarayanan CS (2019) Deepair: air quality prediction using deep neural network. In: TENCON 2019-2019 IEEE Region 10 Conference (TENCON). pp 869–873, https://doi.org/10.1109/TENCON.2019.8929470
Song J, Wang J, Lu H (2018) A novel combined model based on advanced optimization algorithm for short-term wind speed forecasting. Appl Energy 215:643–658. https://doi.org/10.1016/j.apenergy.2018.02.070
Tay Y, Dehghani M, Abnar S, et al (2020) Long range arena: A benchmark for efficient transformers. ArXiv abs/2011.04006
Vaswani A, Shazeer N, Parmar N, et al (2017) Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. pp 6000–6010, https://doi.org/10.5555/3295222.3295349
Wu Cl, Song Rf, Peng Zr, et al (2022) Prediction of air pollutants on roadside of the elevated roads with combination of pollutants periodicity and deep learning method. Build Environ 207:108436. https://doi.org/10.1016/j.buildenv.2021.108436
Xu Y, Xue W, Lei Y, et al (2020) Spatiotemporal variation in the impact of meteorological conditions on PM2.5 pollution in china from 2000 to 2017. Atmos Environ 223:117215. https://doi.org/10.1016/j.atmosenv.2019.117215
Zerveas G, Jayaraman S, Patel D, et al (2021) A transformer-based framework for multivariate time series representation learning. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining. pp 2114–2124. https://doi.org/10.1145/3447548.3467401
Zhang S, Chen Y, Zhang W et al (2021) A novel ensemble deep learning model with dynamic error correction and multi-objective ensemble pruning for time series forecasting. Inf Sci 544:427–445. https://doi.org/10.1016/j.ins.2020.08.053
Zhao R, Gu X, Xue B, et al (2018) Short period PM$_{2.5}$ prediction based on multivariate linear regression model. PLoS ONE 13(7):1–15. https://doi.org/10.1371/journal.pone.0201011
Zhou C, Chen J, Wang S (2018) Examining the effects of socioeconomic development on fine particulate matter (PM$_{2.5}$) in china’s cities using spatial regression and the geographical detector technique. Sci Total Environ 619:436–445. https://doi.org/10.1016/j.scitotenv.2017.11.124
Zhou H, Zhang S, Peng J et al (2021) Informer: Beyond efficient transformer for long sequence time-series forecasting. Proc AAAI Conf Artif Intell 35(12):11106–11115. https://doi.org/10.1609/aaai.v35i12.17325
Zhu Y, Ma Y, Liu B et al (2022) Retrieving the vertical distribution of PM2.5 mass concentration from lidar via a random forest model. IEEE Trans Geosci Remote Sens 60:1–9. https://doi.org/10.1109/TGRS.2021.3102059

Acknowledgements

All authors are supported by the Office of Research, Georgia Southern University.

Funding

All authors are supported by the Office of Research, Georgia Southern University.

Author information

Authors and Affiliations

Department of Computer Science, Georgia Southern University, 1332 Southern Dr, Statesboro, 30458, GA, USA
Jordan Limperis, Weitian Tong, Felix Hamza-Lup & Lixin Li

Contributions

All authors contributed to the study conception and design. Data collection, experiments, and analysis were performed by Jordan Limperis and Weitian Tong. The first draft of the manuscript was written by Jordan Limperis and Weitian Tong. All authors (i.e., Jordan Limperis, Weitian Tong, Felix Hamza-Lup, and Lixin Li) commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Weitian Tong.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Communicated by: H. Babaie.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Limperis, J., Tong, W., Hamza-Lup, F. et al. PM$_{2.5}$ forecasting based on transformer neural network and data embedding. Earth Sci Inform 16, 2111–2124 (2023). https://doi.org/10.1007/s12145-023-01002-x

Received: 15 February 2023
Accepted: 15 March 2023
Published: 17 May 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s12145-023-01002-x
