Selection of the data time interval for the prediction of maximum ozone concentrations

Kocijan, Juš; Gradišar, Dejan; Stepančič, Martin; Božnar, Marija Zlata; Grašič, Boštjan; Mlakar, Primož

doi:10.1007/s00477-017-1468-y

Original Paper
Published: 13 October 2017

Volume 32, pages 1759–1770, (2018)

Stochastic Environmental Research and Risk Assessment

Juš Kocijan¹˒² (ORCID: orcid.org/0000-0002-1221-946X),
Dejan Gradišar¹,
Martin Stepančič¹,
Marija Zlata Božnar³,
Boštjan Grašič³ &
…
Primož Mlakar³


Abstract

This paper highlights the problem of selecting the step length, called the data time interval, for the one-step-ahead prediction of ozone. The problem is examined through a case-study-based comparison of two approaches for predicting the maximum daily values of tropospheric ozone. The first approach is a 1-day-ahead prediction; the second predicts the maximum values through a multi-step-ahead iteration of 1-h predictions. Gaussian-process modelling is used for this comparison, in particular evolving Gaussian-process models that are updated on-line with the incoming measurement data. Models of this kind have been used successfully in the past for the prediction of ozone pollution. The contribution of this paper is an assessment of the way in which the maximum ozone values are predicted. The daily maximum ozone values forecast by a model based on 1-day-ahead predictions are compared with those obtained from iterated 1-h-ahead predictions of ozone made at predetermined hours of the day. The forecast results favour the on-line model based on hourly predictions when the forecast is made closer to the occurrence of the real ozone maximum, and favour the daily predictions when the predictions are made on a daily basis.
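The difference between the two approaches can be illustrated with a minimal sketch. It is not the authors' implementation: a linear autoregressive model fitted by least squares stands in for the evolving Gaussian-process model, the hourly series is synthetic, and the names `fit_ar` and `iterate_forecast` are hypothetical. The hourly model is iterated 24 steps ahead, feeding each prediction back as an input, and the maximum of the iterated trajectory is taken as the daily maximum; the daily model predicts the next daily maximum in one step.

```python
import numpy as np

def fit_ar(series, order):
    """Least-squares fit of y[t] = w[:-1] . y[t-order:t] + w[-1] (stand-in model)."""
    X = np.array([series[t - order:t] for t in range(order, len(series))])
    X = np.hstack([X, np.ones((len(X), 1))])  # bias column
    w, *_ = np.linalg.lstsq(X, series[order:], rcond=None)
    return w

def iterate_forecast(history, w, steps):
    """Multi-step-ahead prediction by feeding one-step predictions back as inputs."""
    order = len(w) - 1
    window = list(history[-order:])
    preds = []
    for _ in range(steps):
        nxt = float(np.dot(w[:-1], window) + w[-1])
        preds.append(nxt)
        window = window[1:] + [nxt]  # slide the regressor window forward
    return np.array(preds)

# Synthetic hourly "ozone" with a daily cycle (assumption, not measured data).
rng = np.random.default_rng(0)
hours = np.arange(24 * 30)
ozone = 60 + 40 * np.sin(2 * np.pi * (hours - 9) / 24) + rng.normal(0, 2, hours.size)
train, last_day = ozone[:-24], ozone[-24:]

# Approach 2: iterate 1-h predictions over the next 24 h and take the maximum.
w_hour = fit_ar(train, order=24)
hourly_path = iterate_forecast(train, w_hour, steps=24)
iterated_max = hourly_path.max()

# Approach 1: a single 1-day-ahead prediction on the series of daily maxima.
daily_max = train.reshape(-1, 24).max(axis=1)
w_day = fit_ar(daily_max, order=3)
direct_max = iterate_forecast(daily_max, w_day, steps=1)[0]

observed_max = last_day.max()
print(f"observed {observed_max:.1f}, iterated {iterated_max:.1f}, direct {direct_max:.1f}")
```

In this sketch the data time interval is simply the sampling step of the series passed to `fit_ar` (1 h or 1 day); the hourly variant can be re-run later in the day with fewer remaining steps.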


Acknowledgements

The authors acknowledge the financial support from the Slovenian Research Agency (Projects Nos. L2-5475, L2-8174 and P2-0001). The Slovenian Environment Agency provided part of the data.

Author information

Authors and Affiliations

Jožef Stefan Institute, Jamova cesta 39, 1000, Ljubljana, Slovenia
Juš Kocijan, Dejan Gradišar & Martin Stepančič
University of Nova Gorica, Vipavska 13, 5000, Nova Gorica, Slovenia
Juš Kocijan
MEIS d.o.o., Mali Vrh pri Šmarju 78, 1293, Šmarje-Sap, Slovenia
Marija Zlata Božnar, Boštjan Grašič & Primož Mlakar

Corresponding author

Correspondence to Juš Kocijan.

Appendix: Performance measures

The following are performance measures used in the study.

The root-mean-square error (RMSE):
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(E(\hat{y}_i)-y_i\right)^2}, \tag{9}$$
where $y_i$ and $\hat{y}_i$ are the observation and the prediction in the $i$-th step, respectively, $E(\cdot)$ denotes the expectation, i.e., the mean value, of the random variable, and $N$ is the number of observations used.
The standardised mean-squared error (SMSE):
$$\mathrm{SMSE}=\frac{1}{N}\frac{\sum_{i=1}^{N}\left(E(\hat{y}_i)-y_i\right)^2}{\sigma_y^2}, \tag{10}$$
where $\sigma_y^2$ is the variance of the observations.
Pearson's correlation coefficient (PCC):
$$\mathrm{PCC}=\frac{\sum_{i=1}^{N}\left(E(\hat{y}_i)-E(\hat{\mathbf{y}})\right)\left(y_i-E(\mathbf{y})\right)}{N\sigma_y\sigma_{\hat{y}}}, \tag{11}$$
where $E(\hat{\mathbf{y}})$ and $E(\mathbf{y})$ are the expectations, i.e., the mean values, of the vector of predictions and the vector of observations, respectively, and $\sigma_y$, $\sigma_{\hat{y}}$ are the standard deviations of the observations and the predictions, respectively.
The mean fractional bias (MFB):
$$\mathrm{MFB}=\frac{1}{N}\sum_{i=1}^{N}\frac{E(\hat{y}_i)-y_i}{\frac{1}{2}\left(E(\hat{y}_i)+y_i\right)}. \tag{12}$$
The fraction of the modelled values within a factor of two of the observations (FAC2):
$$\mathrm{FAC2}=\frac{1}{N}\sum_{i=1}^{N} n_i \quad \mathrm{with} \quad n_i= \begin{cases} 1 & \mathrm{for}\ 0.5\le \left|\frac{E(\hat{y}_i)}{y_i}\right|\le 2,\\ 0 & \mathrm{else}. \end{cases} \tag{13}$$

RMSE and SMSE are frequently used measures of the accuracy of the predictions' mean values; both are 0 in the case of a perfect model. SMSE is the standardised measure: the mean-squared error is normalised by the variance of the observations, so a value of 1 corresponds to a trivial model that always predicts the mean of the observations. PCC is a measure of association and is not sensitive to bias. Its value is between $-1$ and $+1$, with perfectly positively linearly correlated values resulting in a value of 1. MFB is a measure that bounds the maximum bias and gives additional weight to underestimations and less weight to overestimations. For non-negative values it lies between $-2$ and $+2$, with the value 0 in the case of a perfect model. FAC2 indicates the fraction of the data that satisfies the condition in Eq. (13). Its value is between 0 and 1, with a perfect model resulting in a value of 1.
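The five measures translate directly into a few lines of NumPy. The following is a minimal sketch, not code from the study: the function names are illustrative, `y_hat` is assumed to hold the predictive means $E(\hat{y}_i)$, and the variance and standard deviations are assumed to be the population versions (division by $N$), which matches the factor $N$ in Eq. (11).

```python
import numpy as np

def rmse(y, y_hat):
    """Root-mean-square error, Eq. (9)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

def smse(y, y_hat):
    """Standardised mean-squared error, Eq. (10); np.var divides by N."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean((y_hat - y) ** 2) / np.var(y))

def pcc(y, y_hat):
    """Pearson's correlation coefficient, Eq. (11)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    cov = np.mean((y_hat - y_hat.mean()) * (y - y.mean()))
    return float(cov / (y.std() * y_hat.std()))

def mfb(y, y_hat):
    """Mean fractional bias, Eq. (12)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean((y_hat - y) / (0.5 * (y_hat + y))))

def fac2(y, y_hat):
    """Fraction of predictions within a factor of two of the observations, Eq. (13)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ratio = np.abs(y_hat / y)
    return float(np.mean((ratio >= 0.5) & (ratio <= 2.0)))

# Illustrative values only (not data from the study).
obs = np.array([80.0, 100.0, 120.0, 140.0])
pred = np.array([90.0, 95.0, 130.0, 120.0])
for name, fn in [("RMSE", rmse), ("SMSE", smse), ("PCC", pcc), ("MFB", mfb), ("FAC2", fac2)]:
    print(f"{name}: {fn(obs, pred):.3f}")
```

Note that `smse` returns 1 for a model that always predicts the mean of the observations, and `mfb` and `fac2` are undefined when an observation (or the sum in the MFB denominator) is zero, so such samples would need to be excluded beforehand.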

About this article

Cite this article

Kocijan, J., Gradišar, D., Stepančič, M. et al. Selection of the data time interval for the prediction of maximum ozone concentrations. Stoch Environ Res Risk Assess 32, 1759–1770 (2018). https://doi.org/10.1007/s00477-017-1468-y

Published: 13 October 2017
Issue Date: June 2018
DOI: https://doi.org/10.1007/s00477-017-1468-y
