Air Quality, Atmosphere & Health, Volume 12, Issue 4, pp 411–423

Deep learning PM2.5 concentrations with bidirectional LSTM RNN

  • Weitian Tong
  • Lixin Li
  • Xiaolu Zhou
  • Andrew Hamilton
  • Kai Zhang


A better understanding of the spatiotemporal distribution of PM2.5 (particulate matter with a diameter of less than 2.5 micrometers) concentrations in a continuous space-time domain is critical for risk assessment and epidemiologic studies. Existing spatiotemporal interpolation algorithms usually rest on strong assumptions: they restrict the interpolation model to forms with explicit and simple mathematical descriptions, and thus neglect many hidden yet critical influencing factors. In this study, we developed a novel deep-learning-based spatiotemporal interpolation model whose main component is a bidirectional Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN). Our model automatically takes into account both spatial and temporal hidden influencing factors. To the best of our knowledge, this is the first application of a bidirectional LSTM RNN to the spatiotemporal interpolation of air pollutant concentrations. We evaluated our method using a dataset of daily PM2.5 measurements taken in 2009 over the contiguous southeast region of the USA. The results demonstrate good performance of our model. We also conducted simulations to explore the properties of spatiotemporal correlations. In particular, we found that the temporal correlation is stronger than the spatial correlation.
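To make the term "bidirectional LSTM" concrete, the following is a minimal sketch of the general technique in plain Python. It is not the architecture or code used in the study. The class `LSTMCell`, the functions `run` and `bidirectional_lstm`, the hidden size, the random (untrained) weights, and the toy input values are all assumptions introduced for illustration. The sketch only shows the defining idea: one LSTM reads the sequence forward in time, a second reads it backward, and the two hidden states are concatenated at each time step, so every step sees both earlier and later observations.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """A standard LSTM cell with randomly initialised, untrained weights."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight row per gate and hidden unit; a row spans [x, h_prev, bias].
        self.w = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hidden + 1)]
                   for _ in range(n_hidden)]
            for gate in self.GATES
        }

    def step(self, x, h_prev, c_prev):
        z = list(x) + list(h_prev) + [1.0]  # trailing 1.0 multiplies the bias

        def pre(gate, j):
            return sum(w * v for w, v in zip(self.w[gate][j], z))

        h, c = [], []
        for j in range(self.n_hidden):
            i = sigmoid(pre("input", j))       # how much new content to write
            f = sigmoid(pre("forget", j))      # how much old memory to keep
            o = sigmoid(pre("output", j))      # how much memory to expose
            g = math.tanh(pre("candidate", j)) # proposed new content
            c_j = f * c_prev[j] + i * g
            c.append(c_j)
            h.append(o * math.tanh(c_j))
        return h, c


def run(cell, seq):
    """Run a cell over a sequence and return the hidden state at every step."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    out = []
    for x in seq:
        h, c = cell.step(x, h, c)
        out.append(h)
    return out


def bidirectional_lstm(seq, n_hidden=3, seed=0):
    rng = random.Random(seed)
    n_in = len(seq[0])
    fwd = LSTMCell(n_in, n_hidden, rng)  # reads past -> future
    bwd = LSTMCell(n_in, n_hidden, rng)  # reads future -> past
    forward_states = run(fwd, seq)
    # Reverse the input, run, then reverse the outputs to realign with time.
    backward_states = run(bwd, seq[::-1])[::-1]
    return [f + b for f, b in zip(forward_states, backward_states)]


# Hypothetical toy input: five time steps of [scaled day index, scaled reading].
toy_sequence = [[0.0, 0.31], [0.25, 0.42], [0.5, 0.38], [0.75, 0.55], [1.0, 0.47]]
states = bidirectional_lstm(toy_sequence)
print(len(states), len(states[0]))  # prints: 5 6
```

Each of the five output vectors has six entries: three from the forward pass and three from the backward pass. In a trained model these concatenated states would feed a further layer that produces the interpolated value; that part is omitted here because the abstract does not describe it.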


Spatiotemporal interpolation; Air pollution; Deep neural network; Bidirectional LSTM (Long Short-Term Memory) RNN (Recurrent Neural Network)




Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. Department of Computer Science, Georgia Southern University, Statesboro, USA
  2. Department of Geology and Geography, Georgia Southern University, Statesboro, USA
  3. Department of Epidemiology, Human Genetics and Environmental Sciences, The University of Texas Health Science Center at Houston, Houston, USA
