Abstract
Forecasting time series data is challenging because of the temporal and multivariate dependencies in the data. In this paper, we present a new approach named TPPM25 (Transformer-based Prediction of PM\(_{2.5}\)) for forecasting PM\(_{2.5}\), a key air quality indicator. It is based on the Transformer neural network and several data embedding techniques. By computing attention among features over time steps, TPPM25 mimics cognitive attention: it selectively emphasizes the essential parts of the input data and diminishes the rest. This allows TPPM25 to capture temporal relations with multiple influencing meteorological features. We demonstrate its effectiveness by comparing it with a cutting-edge ensemble deep learning model from Zhang et al. (Inf Sci 544:427–445, 2021). TPPM25 outperforms Zhang et al.’s model under the same experimental setting on a well-researched benchmark dataset. Whereas Zhang et al.’s model is restricted to univariate PM\(_{2.5}\) prediction, TPPM25 is not, and its prediction accuracy improves further when more influencing meteorological features are considered. Moreover, TPPM25 maintains high prediction accuracy over longer periods of time than the Long Short-Term Memory (LSTM) and Bidirectional LSTM models.
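The attention calculation mentioned in the abstract can be illustrated with a minimal sketch of scaled dot-product attention, the core operation of the Transformer (Vaswani et al. 2017). This is not the authors' implementation, which is not publicly available. The function names, the toy input, and the omission of learned query/key/value projections and data embeddings are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of equal-length vectors.

    Each row is one time step and each column is one feature.
    A real Transformer first applies learned linear projections to
    obtain queries, keys, and values; that step is omitted here.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this time step to every time step, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted average of the value vectors: high-weight steps are enhanced.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical toy input: 3 time steps, 2 already-scaled features
# (for example PM2.5 and temperature). The values are made up.
series = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(series, series, series)
```

Each row of `weights` sums to 1, so every output vector is a weighted average of the inputs in which the time steps most similar to the query contribute most. This is the sense in which attention "enhances" some parts of the input and "diminishes" others.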
Data Availability
The datasets generated during and/or analysed during the current study are not publicly available but are available from the corresponding author on reasonable request.
Code Availability
The code generated during and/or analysed during the current study is not publicly available but is available from the corresponding author on reasonable request.
References
Abduljabbar RL, Dia H, Tsai PW (2021) Unidirectional and bidirectional lstm models for short-term traffic prediction. J Adv Transp 2021. https://doi.org/10.1155/2021/5589075
Altaf B, Yu L, Zhang X (2018) Spatio-temporal attention based recurrent neural network for next location prediction. In: 2018 IEEE International conference on big data (Big Data). pp 937–942, https://doi.org/10.1109/BigData.2018.8622218
Ameer S, Shah MA, Khan A et al (2019) Comparative analysis of machine learning techniques for predicting air quality in smart cities. IEEE Access 7:128325–128338. https://doi.org/10.1109/ACCESS.2019.2925082
Baker Effendi S, van der Merwe B, Balke WT (2020) Suitability of graph database technology for the analysis of spatio-temporal data. Future Internet 12(5):78. https://doi.org/10.3390/fi12050078
Bermejo U, Almeida A, Bilbao-Jayo A et al (2021) Embedding-based real-time change point detection with application to activity segmentation in smart home time series data. Expert Syst Appl 185:115641. https://doi.org/10.1016/j.eswa.2021.115641
Butland BK, Samoli E, Atkinson RW et al (2019) Measurement error in a multi-level analysis of air pollution and health: a simulation study. Environ Health 18(1):1–10. https://doi.org/10.1186/s12940-018-0432-8
Chai T, Draxler RR (2014) Root mean square error (rmse) or mean absolute error (mae)? – arguments against avoiding rmse in the literature. Geosci Model Dev Discuss 7(1):1525–1534. https://doi.org/10.5194/gmd-7-1247-2014
Choromanski K, Likhosherstov V, Dohan D, et al (2020) Rethinking attention with performers. arXiv preprint arXiv:2009.14794
Dai H, Huang G, Zeng H, et al (2022) PM\(_{2.5}\) volatility prediction by XGBoost-MLP based on GARCH models. J Clean Prod 356:131898. https://doi.org/10.1016/j.jclepro.2022.131898
Danaci E, Alkaya AF, Gültekin OG (2020) An empirical analysis of swarm intelligence techniques on atm cash withdrawal forecasting. In: Intelligent and fuzzy techniques in big data analytics and decision making. pp 1235–1242. https://doi.org/10.1007/978-3-030-23756-1_145
David H (1979) Robust estimation in the presence of outliers. In: Robustness in statistics. pp 61–74. https://doi.org/10.1016/B978-0-12-438150-6.50011-X
Deb K, Pratap A, Agarwal S et al (2002) A fast and elitist multiobjective genetic algorithm: Nsga-ii. IEEE Trans Evol Comput 6(2):182–197. https://doi.org/10.1109/4235.996017
Devlin J, Chang MW, Lee K, et al (2019) Bert: Pre-training of deep bidirectional transformers for language understanding. pp 4171–4186. https://doi.org/10.18653/v1/n19-1423
EPA (2006) Air quality guidelines: global update 2005: particulate matter, ozone, nitrogen dioxide, and sulfur dioxide. World Health Organization
Grigsby J, Wang Z, Qi Y (2021) Long-range transformers for dynamic spatiotemporal forecasting. arXiv preprint arXiv:2109.12218
Hall JV, Brajer V, Lurmann FW (2010) Air pollution, health and economic benefits–lessons from 20 years of analysis. Ecol Econ 69(12):2590–2597. https://doi.org/10.1016/j.ecolecon.2010.08.003
He K, Zhang X, Ren S, et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 770–778, https://doi.org/10.1109/CVPR.2016.90
Kampa M, Castanas E (2008) Human health effects of air pollution. Environ Pollut 151(2):362–367. https://doi.org/10.1016/j.envpol.2007.06.012
Kazemi SM, Goel R, Eghbali S, et al (2019) Time2vec: Learning a vector representation of time. arXiv preprint arXiv:1907.05321. https://doi.org/10.48550/arXiv.1907.05321
Kumar U, Jain V (2010) Arima forecasting of ambient air pollutants (O3, NO, NO2 and CO). Stoch Environ Res Risk Assess 24(5):751–760. https://doi.org/10.1007/s00477-009-0361-8
Lee KH, Chen X, Hua G, et al (2018) Stacked cross attention for image-text matching. In: Proceedings of the European Conference on Computer Vision (ECCV). pp 201–216, https://doi.org/10.1007/978-3-030-01225-0_13
Lei C, Xu X, Ma Y et al (2022) Full coverage estimation of the PM concentration across china based on an adaptive spatiotemporal approach. IEEE Trans Geosci Remote Sens 60:1–14. https://doi.org/10.1109/TGRS.2022.3213797
Li T, Hua M, Wu X (2020) A hybrid CNN-LSTM model for forecasting particulate matter (PM\(_{2.5}\)). IEEE Access 8:26933–26940. https://doi.org/10.1109/ACCESS.2020.2971348
Li T, Shen H, Yuan Q et al (2022) A locally weighted neural network constrained by global training for remote sensing estimation of PM2.5. IEEE Trans Geosci Remote Sens 60:1–13. https://doi.org/10.1109/TGRS.2021.3074569
Li X, Feng Y, Liang H (2017) The impact of meteorological factors on PM2.5 variations in hong kong. In: IOP Conference Series: Earth and Environmental Science. p 012003, https://doi.org/10.1088/1755-1315/78/1/012003
Li Y, Chen Q, Zhao H, et al (2015) Variations in PM10, PM\(_{2.5}\) and PM1.0 in an urban area of the sichuan basin and their relation to meteorological factors. Atmosphere 6(1):150–163. https://doi.org/10.3390/atmos6010150
Liang X, Li S, Zhang S, et al (2016) PM\(_{2.5}\) data reliability, consistency, and air quality assessment in five chinese cities. J Geophys Res Atmos 121(17):10–220. https://doi.org/10.1002/2016JD024877
Liu Y, Paciorek CJ, Koutrakis P (2009) Estimating regional spatial and temporal variability of PM2.5 concentrations using satellite data, meteorology, and land use information. Environ Health Perspect 117(6):886–892. https://doi.org/10.1289/ehp.0800123
Lou C, Liu H, Li Y, et al (2017) Relationships of relative humidity with PM2.5 and PM10 in the yangtze river delta, china. Environ Monit Assess 189(11):1–16. https://doi.org/10.1007/s10661-017-6281-z
Ma J, Shou Z, Zareian A, et al (2019) Cdsa: cross-dimensional self-attention for multivariate, geo-tagged time series imputation. arXiv preprint arXiv:1905.09904
Niu M, Zhang Y, Ren Z (2023) Deep learning-based pm2.5 long time-series prediction by fusing multisource data: A case study of beijing. Atmosphere 14(2). https://doi.org/10.3390/atmos14020340
Pui DY, Chen SC, Zuo Z (2014) PM2.5 in china: Measurements, sources, visibility and health effects, and mitigation. Particuology 13:1–26. https://doi.org/10.1016/j.partic.2013.11.001
Qi Y, Li Q, Karimian H et al (2019) A hybrid model for spatiotemporal forecasting of PM2.5 based on graph convolutional neural network and long short-term memory. Sci Total Environ 664:1–10. https://doi.org/10.1016/j.scitotenv.2019.01.333
Rossel RV, Webster R (2012) Predicting soil properties from the australian soil visible-near infrared spectroscopic database. Eur J Soil Sci 63(6):848–860. https://doi.org/10.1111/j.1365-2389.2012.01495.x
Shang Z, Deng T, He J et al (2019) A novel model for hourly PM\(_{2.5}\) concentration prediction based on cart and eelm. Sci Total Environ 651:3043–3052. https://doi.org/10.1016/j.scitotenv.2018.10.193
Shen S, Yao Z, Gholami A, et al (2020) Powernorm: Rethinking batch normalization in transformers. In: Proceedings of the 37th international conference on machine learning. pp 8741–8751
Siami-Namini S, Tavakoli N, Namin AS (2019) The performance of lstm and bilstm in forecasting time series. In: 2019 IEEE International conference on big data (Big Data). pp 3285–3292, https://doi.org/10.1109/BigData47090.2019.9005997
Singh P, Narasimhan TL, Lakshminarayanan CS (2019) Deepair: air quality prediction using deep neural network. In: TENCON 2019-2019 IEEE Region 10 Conference (TENCON). pp 869–873, https://doi.org/10.1109/TENCON.2019.8929470
Song J, Wang J, Lu H (2018) A novel combined model based on advanced optimization algorithm for short-term wind speed forecasting. Appl Energy 215:643–658. https://doi.org/10.1016/j.apenergy.2018.02.070
Tay Y, Dehghani M, Abnar S, et al (2020) Long range arena: A benchmark for efficient transformers. arXiv preprint arXiv:2011.04006
Vaswani A, Shazeer N, Parmar N, et al (2017) Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. pp 6000–6010, https://doi.org/10.5555/3295222.3295349
Wu Cl, Song Rf, Peng Zr, et al (2022) Prediction of air pollutants on roadside of the elevated roads with combination of pollutants periodicity and deep learning method. Build Environ 207:108436. https://doi.org/10.1016/j.buildenv.2021.108436
Xu Y, Xue W, Lei Y, et al (2020) Spatiotemporal variation in the impact of meteorological conditions on PM2.5 pollution in china from 2000 to 2017. Atmos Environ 223:117215. https://doi.org/10.1016/j.atmosenv.2019.117215
Zerveas G, Jayaraman S, Patel D, et al (2021) A transformer-based framework for multivariate time series representation learning. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining. pp 2114–2124. https://doi.org/10.1145/3447548.3467401
Zhang S, Chen Y, Zhang W et al (2021) A novel ensemble deep learning model with dynamic error correction and multi-objective ensemble pruning for time series forecasting. Inf Sci 544:427–445. https://doi.org/10.1016/j.ins.2020.08.053
Zhao R, Gu X, Xue B, et al (2018) Short period PM\(_{2.5}\) prediction based on multivariate linear regression model. PLoS ONE 13(7):1–15. https://doi.org/10.1371/journal.pone.0201011
Zhou C, Chen J, Wang S (2018) Examining the effects of socioeconomic development on fine particulate matter (PM\(_{2.5}\)) in china’s cities using spatial regression and the geographical detector technique. Sci Total Environ 619:436–445. https://doi.org/10.1016/j.scitotenv.2017.11.124
Zhou H, Zhang S, Peng J et al (2021) Informer: Beyond efficient transformer for long sequence time-series forecasting. Proc AAAI Conf Artif Intell 35(12):11106–11115. https://doi.org/10.1609/aaai.v35i12.17325
Zhu Y, Ma Y, Liu B et al (2022) Retrieving the vertical distribution of PM2.5 mass concentration from lidar via a random forest model. IEEE Trans Geosci Remote Sens 60:1–9. https://doi.org/10.1109/TGRS.2021.3102059
Acknowledgements
All authors are supported by the Office of Research, Georgia Southern University.
Funding
All authors are supported by the Office of Research, Georgia Southern University.
Author information
Contributions
All authors contributed to the study conception and design. Data collection, experiments, and analysis were performed by Jordan Limperis and Weitian Tong. The first draft of the manuscript was written by Jordan Limperis and Weitian Tong. All authors (i.e., Jordan Limperis, Weitian Tong, Felix Hamza-Lup, and Lixin Li) commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Communicated by: H. Babaie.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Limperis, J., Tong, W., Hamza-Lup, F. et al. PM\(_{2.5}\) forecasting based on transformer neural network and data embedding. Earth Sci Inform 16, 2111–2124 (2023). https://doi.org/10.1007/s12145-023-01002-x
DOI: https://doi.org/10.1007/s12145-023-01002-x