Abstract
Air quality data (observational and numerical) were used to produce hourly spot concentration forecasts of ozone (O3), particulate matter of diameter 2.5 μm or less (PM2.5), and nitrogen dioxide (NO2), up to 48 h ahead, for six stations across Canada: Vancouver, Edmonton, Winnipeg, Toronto, Montreal, and Halifax. Using numerical data from an air quality model (GEM-MACH15) as predictors, forecast models for pollutant concentrations were built using multiple linear regression (MLR) and multi-layer perceptron neural networks (MLPNN). A relatively new method, the extreme learning machine (ELM), was also used to overcome the limitations of linear methods as well as the large computational demand of MLPNN. In operational forecasting, the continual arrival of new data necessitates frequent model updating. This type of learning (online sequential learning) is straightforward for MLR and ELM but not for MLPNN. The forecast performance of the online sequential MLR (OSMLR) and online sequential ELM (OSELM), together with stepwise MLR, all updated daily, was compared with that of MLPNN updated seasonally and of the benchmark climatology model. OSELM, combining relatively inexpensive frequent model updating with nonlinear modeling capability, tended to outperform the other models in mean absolute error and correlation. Compared to the linear models, the nonlinear models (OSELM and MLPNN) often had worse bias (mean error) and more severe underprediction of extreme events.
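To illustrate why online sequential updating is inexpensive for MLR and ELM, the following is a minimal sketch and not the authors' implementation. It assumes a recursive least squares update applied one sample at a time with a ridge-regularized start (the study updates daily, and the original OSELM formulation begins from an initial batch), a tanh hidden layer, and synthetic data. The class names `RecursiveLeastSquares` and `OnlineSequentialELM` and all parameter values are hypothetical.

```python
import math
import random


class RecursiveLeastSquares:
    """Linear readout whose weights are updated one sample at a time."""

    def __init__(self, n_features, ridge=1e-3):
        # P tracks the inverse of (ridge * I + sum of x x^T) seen so far.
        # Assumption: ridge-regularized start instead of an initial batch.
        self.P = [[1.0 / ridge if i == j else 0.0 for j in range(n_features)]
                  for i in range(n_features)]
        self.beta = [0.0] * n_features

    def predict(self, x):
        return sum(b * xi for b, xi in zip(self.beta, x))

    def update(self, x, y):
        err = y - self.predict(x)  # prediction error before the update
        Px = [sum(p * xj for p, xj in zip(row, x)) for row in self.P]
        denom = 1.0 + sum(xi * pxi for xi, pxi in zip(x, Px))
        n = len(x)
        # Sherman-Morrison rank-one update of P (P is symmetric).
        for i in range(n):
            for j in range(n):
                self.P[i][j] -= Px[i] * Px[j] / denom
        # The gain vector is Px / denom; move the weights along it.
        for i in range(n):
            self.beta[i] += Px[i] / denom * err


class OnlineSequentialELM:
    """Fixed random tanh hidden layer followed by an online linear readout."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Hidden weights are drawn once and never trained (the ELM idea).
        self.W = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.readout = RecursiveLeastSquares(n_hidden + 1, ridge)

    def _hidden(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bias)
             for row, bias in zip(self.W, self.b)]
        return h + [1.0]  # constant term for the output bias

    def predict(self, x):
        return self.readout.predict(self._hidden(x))

    def update(self, x, y):
        self.readout.update(self._hidden(x), y)


# Synthetic nonlinear target (y = x^2); not air quality data.
rng = random.Random(1)
stream = [(x, x * x) for x in (rng.uniform(-1, 1) for _ in range(500))]

linear = RecursiveLeastSquares(n_features=2)  # inputs: [x, 1]
oselm = OnlineSequentialELM(n_inputs=1, n_hidden=20)
for x, y in stream:
    linear.update([x, 1.0], y)
    oselm.update([x], y)


def mse(predict):
    return sum((predict(x) - y) ** 2 for x, y in stream) / len(stream)


print("online linear MSE:", mse(lambda x: linear.predict([x, 1.0])))
print("online ELM MSE:   ", mse(lambda x: oselm.predict([x])))
```

Each update costs a fixed amount of work that depends only on the number of predictors or hidden nodes, not on how much data has already been seen. By contrast, an MLPNN is typically refit with iterative nonlinear optimization, which is consistent with it being updated only seasonally in the comparison described above.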
Acknowledgments
Jonathan Baik kindly sent us the data. This research was supported by the Natural Sciences and Engineering Research Council of Canada via a Discovery Grant to W. Hsieh.
Cite this article
Peng, H., Lima, A.R., Teakles, A. et al. Evaluating hourly air quality forecasting in Canada with nonlinear updatable machine learning methods. Air Qual Atmos Health 10, 195–211 (2017). https://doi.org/10.1007/s11869-016-0414-3
DOI: https://doi.org/10.1007/s11869-016-0414-3