Abstract
We present a method to forecast pollution episodes with a bivariate response. The method simultaneously estimates the concentrations of two pollutants, using historical data. It is based on a location–scale model where the means and the standard deviations are approximated by kernel smoothers in additive models, while the variance–covariance matrix is obtained from the residuals of the previous models. The method provides not only an estimation of the concentration of both pollutants over time but also uncertainty regions covering a specific percentage of the data. The suitability of the model was tested with both simulated and real data (specifically \(\hbox {SO}_2\) and \(\hbox {NO}_x\) concentrations from a coal-fired power station). The results have proved highly satisfactory in both cases. The percentage of data covered by the uncertainty region, its area and a new loss function, a variant of the pinball loss function, were used as metrics to evaluate the performance of the model.
References
Abhilash MSK, Thakur A, Gupta D, Sreevidya B (2018) Time series analysis of air pollution in Bengaluru using ARIMA model. In: Perez G, Tiwari S, Trivedi M, Mishra K (eds) Ambient communications and computer systems. Advances in intelligent systems and computing, vol 696. Springer, Singapore
Antanasijević D, Pocajt V, Perić-Grujić A, Ristić M (2018) Multiple-input-multiple-output general regression neural networks model for the simultaneous estimation of traffic-related air pollutants. Atmospheric Pollut Res 9:388–397
Azid IA, Ripin ZM, Aris MS, Ahmad AL, Seetharamu KN, Yusoff RM (2000) Predicting combined-cycle natural gas power plant emissions by using artificial neural networks. In: TENCON proceedings. Intelligent systems and technologies for the new millennium (Cat. No.00CH37119), Kuala Lumpur, Malaysia, vol 3, pp 512–517
Eastoe EF (2008) A hierarchical model for non-stationary multivariate extremes: a case study of surface-level ozone and NOx data in the UK. Environmetrics 20:428–444
Ferretti G, Piroddi L (2001) Estimation of \(\text{ NO}_x\) emissions in thermal power plants using neural networks. J Eng Gas Turbines Power Trans Asme 132(2):465–471
Garcia JM, Teodoro F, Cerdeira R, Coelho LMR, Prashant K, Carvalho MG (2016) Developing a methodology to predict PM10 concentrations in urban areas using generalized linear models. Environ Technol 37(18):2316–2325
García Nieto PJ, Sánchez-Lasheras F, García-Gonzalo E, de Cos Juez FJ (2018) PM10 concentration forecasting in the metropolitan area of Oviedo (Northern Spain) using models based on SVM, MLP, VARMA and ARIMA: a case study. Sci Total Environ 621:753–761
Genest C, Rivest L (1993) Statistical inference procedures for bivariate Archimedean copulas. J Am Stat Assoc 88:1034–1043
Gilson M, Dahmen D, Moreno-Bote R, Insabato A, Helias M (2019) The covariance perceptron: a new paradigm for classification and processing of time series in recurrent neuronal networks. BioRxiv. https://doi.org/10.1101/562546
Giorgio C, Scanagatta M (2016) Air pollution prediction via multi-label classification. Environ Model Softw 80:259–264
Hastie T, Tibshirani R (1990) Generalized additive models. Chapman and Hall/CRC Monographs on Statistics and Applied Probability, London
Hsu KJ (1992) Time series analysis of the interdependence among air pollutants. Atmospheric Environ Part B Urban Atmosphere 26:491–503
Ibrahim MZ, Roziah Z, Marzuki I, Muhd SL (2009) Forecasting and time series analysis of air pollutants in several area of Malaysia. Am J Environ Sci 5(5):625–632
Kadiyala A, Kumar A (2019) Vector time series models for prediction of air quality inside a public transportation bus using available software. Environ Prog Sustain Energy 33(22):337–341
Kreuzer A, Valle LD, Czado C (2019) A Bayesian non-linear state space copula model to predict air pollution in Beijing. arXiv:1903.08421
Martínez-Silva I, Roca-Pardiñas J, Ordóñez C (2016) Forecasting SO2 pollution incidents by means of quantile curves based on additive models. Environmetrics 27(3):147–157
Muñoz E, Martín ML, Turias IJ, Jimenez-Come MJ, Trujillo FJ (2014) Prediction of PM10 and SO2 exceedances to control air pollution in the Bay of Algeciras, Spain. Stoch Environ Res Risk Assess 28(6):1409–1420
Nelsen RB (1999) An introduction to copulas. Springer, New York
Perez P, Trier A, Reyes J (2000) Prediction of PM2.5 concentrations several hours in advance using neural networks in Santiago, Chile. Atmospheric Environ 34:1189–1196
Roca-Pardiñas J, Ordóñez C (2019) Predicting pollution incidents through semiparametric quantile regression models. Stoch Environ Res Risk Assess 33(3):673–685
Roca-Pardiñas J, González Manteiga W, Febrero-Bande M, Prada-Sánchez JM, Cadarso-Suárez C (2004) Predicting binary time series of SO2 using generalized additive models with unknown link function. Environmetrics 15(7):729–742
Sheather SJ, Jones MC (1991) A reliable data-based bandwidth selection method for kernel density estimation. J R Stat Soc Ser B (Methodol) 53(3):683–690
Siew LY, Ching LY, Wee PMJ (2008) ARIMA and integrated ARFIMA models for forecasting air pollution index in Shah Alam Selangor. Malays J Anal Sci 12(1):257–263
Snezhana PK, Krassi VR, Todor V, Silviya BP (2012) Using copulas to measure association between air pollution and respiratory diseases. Int J Environ Ecol Eng 6(11):703–708
Yu K, Lu Z (2004) Local linear additive quantile regression. Scand J Stat 31:333–346
Zhanqiong H, Sriboonchitta S, Jing D (2013) Modeling dependence dynamics of air pollution: time series analysis using a copula based GARCH type model. In: Huynh VN, Kreinovich V, Sriboonchitta S, Suriya K (eds) Uncertainty analysis in econometrics with applications. Advances in intelligent systems and computing, vol 200. Springer, Berlin
Acknowledgements
Javier Roca-Pardiñas acknowledges financial support by the Grant MTM2017-89422-P (MINECO/AEI/FEDER, UE). He also acknowledges financial support from the Xunta de Galicia (Centro singular de investigación de Galicia accreditation 2019-2022) and the EU (ERDF), Ref. ED431G2019/06. Óscar Lado-Baleato is funded by a predoctoral grant from the Galician Government (Plan I2C)-Xunta de Galicia.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: flexible additive transformed model estimation
To estimate the additive models in Eq. (3), we used a backfitting algorithm based on local polynomial kernel smoothers. For notational simplicity, in this section Y denotes the response variable and \({\mathbf {X}}=(X_1, \ldots , X_p)\) the vector of covariates. In this regression framework, we consider the transformed additive model
$$\begin{aligned} E[Y \mid {\mathbf {X}}] = H(\eta ) \end{aligned}$$(18)
where H is a known link function and \(\eta =\alpha + \sum _{j=1}^p f_{j}(X_{j})\) is the systematic component. To guarantee identifiability, we assume that \(E[f_j(X_j)]=0\) for \(j=1,\ldots ,p\). To estimate the model given in (18) we propose an iterative algorithm based on the Newton–Raphson procedure (which extends the ACE, Alternating Conditional Expectation, algorithm; Hastie and Tibshirani 1990) in combination with a backfitting approach (Yu and Lu 2004).
Let \(\{\mathbf {X}_i, Y_{i}\}_{i=1}^n\) be an independent random sample of \((\mathbf {X}, Y)\). Fitting (18) amounts to a minimization problem in \(\alpha\) and \(f_1,\ldots ,f_p\) that must be solved iteratively. Here we have used a modified Newton–Raphson algorithm, the steps of which are as follows:
Initialize. Compute \({\hat{\alpha }}=H^{-1} ({\bar{Y}})\), with \({\bar{Y}}=n^{-1}\sum _{i=1}^n Y_i\), and set the initial estimates \({\hat{f}}_{1}^0=\cdots ={\hat{f}}_{p}^0=0\) and \({\hat{\eta }}_i^0={\hat{\alpha }}\) for \(i=1,\ldots ,n\).
Step 1. Construct the linearized response \({{\tilde{Y}}}\) and the weights W from the current predictor \({\hat{\eta }}_i^0\), the derivative \(H'(\eta )=\partial H / \partial \eta\) evaluated at \({\hat{\eta }}_i^0\), and \({\hat{\sigma }}_i^2\), an estimate of \(Var(Y_i| {\hat{\mu }}_i^0)\), where \({\hat{\mu }}_i^0=H({\hat{\eta }}_i^0)\). The estimate \({\hat{\sigma }}_i^2\) can be obtained by fitting an additive model to \((Y_i-H\left( {{\hat{\eta }}_i^0}\right) )^2\).
Step 2. Fit an additive model to \({{\tilde{Y}}}\) weighted by W and compute the updates \({\hat{f}}_j\), for \(j=1,\ldots ,p\). At this step, we have used an inner backfitting algorithm based on local polynomial kernel smoothers:
Step 2.1: Cycle \(j=1,\ldots ,p\), calculating the partial residuals
$$\begin{aligned} R_i^j={{\tilde{Y}}}_i - {\hat{\alpha }} - \sum _{k=1}^{j-1}{{\hat{f}}_{k}(X_{ik})} - \sum _{k=j+1}^p{{\hat{f}}_{k}^0(X_{ik})} \end{aligned}$$
and compute the updates \({\hat{f}}_j (X_{ij})\) for \(i=1,\ldots ,n\), where the local linear kernel estimate of \(f_j\) at a location x is given by \({\hat{f}}_j (x)={\hat{a}}\), with \(({\hat{a}}, {\hat{b}})\) being the minimizers of
$$\begin{aligned} \sum _{i=1}^n W_i \left( { R_i^j - a-b(X_{ij}-x)}\right) ^2 K\left( \frac{X_{ij} - x}{h_j}\right) \end{aligned}$$(19)
where \(K(\cdot )\) denotes a kernel function (a symmetric density) and \(h_j>0\) is the smoothing parameter. At this step, the estimates \({\hat{f}}_j\) must be recentred by setting \({\hat{f}}_j (X_{ij})={\hat{f}}_j (X_{ij})- n^{-1}\sum _{l=1}^n {\hat{f}}_j (X_{lj})\).
Here, we have used a Gaussian kernel, and the bandwidth \(h_j\) is recalculated in each iteration using the cross-validation criterion
$$\begin{aligned} CV=\sum _{i=1}^n{ W_i \left( R_i^j - {\hat{f}}_j^{(-i)}(X_{ij})\right) ^2} \end{aligned}$$(20)
where \({\hat{f}}_j^{(-i)}(X_{ij})\) is the leave-one-out estimator at \(X_{ij}\), obtained from the sample without the i-th data vector.
Step 2.2: Repeat Step 2.1, replacing \({\hat{f}}_{j}^0 (X_{ij})\) by \({\hat{f}}_{j} (X_{ij})\) for \(j=1,\ldots ,p\) and \(i=1,\ldots ,n\), until the convergence criterion
$$\begin{aligned} \frac{ \sum _{i=1}^n \left( { {\hat{f}}_{j}(X_{ij})-{\hat{f}}_j^0(X_{ij})}\right) ^2}{ \sum _{i=1}^n \left( { {\hat{f}}_{j}^0(X_{ij})}\right) ^2+0.001} \le \varepsilon \quad \text {for all } \quad j=1,\ldots ,p \end{aligned}$$
is reached.
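The inner backfitting of Steps 2.1 and 2.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`local_linear`, `smooth`, `cv_bandwidth`, `backfit`), the three-point bandwidth grid, and the use of the weighted mean of \({{\tilde{Y}}}\) as intercept are hypothetical choices made for the example. The local linear fit uses the closed-form solution of the weighted normal equations for the criterion in (19).

```python
import numpy as np

def local_linear(x0, x, r, w, h):
    """Value a-hat minimising sum_i w_i (r_i - a - b (x_i - x0))^2 K((x_i - x0)/h)."""
    d = x - x0
    k = w * np.exp(-0.5 * (d / h) ** 2)        # Gaussian kernel times weights W_i
    s0, s1, s2 = k.sum(), (k * d).sum(), (k * d * d).sum()
    t0, t1 = (k * r).sum(), (k * d * r).sum()
    det = s0 * s2 - s1 ** 2
    if det < 1e-12 * s0 * s2:                  # nearly singular: local constant fit
        return t0 / s0
    return (s2 * t0 - s1 * t1) / det           # intercept of the weighted LS line

def smooth(x, r, w, h, leave_one_out=False):
    """Evaluate the smoother at every sample point, optionally leaving point i out."""
    out = np.empty(len(x))
    for i in range(len(x)):
        wi = w.copy()
        if leave_one_out:
            wi[i] = 0.0                        # drop the i-th observation
        out[i] = local_linear(x[i], x, r, wi, h)
    return out

def cv_bandwidth(x, r, w, grid):
    """Pick the bandwidth in `grid` minimising the weighted leave-one-out CV of (20)."""
    scores = [np.sum(w * (r - smooth(x, r, w, h, True)) ** 2) for h in grid]
    return grid[int(np.argmin(scores))]

def backfit(X, y_tilde, w, eps=1e-4, max_iter=50):
    """Weighted backfitting with recentring and the stopping rule of Step 2.2."""
    n, p = X.shape
    alpha = np.average(y_tilde, weights=w)     # assumption: intercept = weighted mean
    f = np.zeros((n, p))
    for _ in range(max_iter):
        f_old = f.copy()
        for j in range(p):
            # partial residuals: updated f_k for k < j, previous f_k for k > j
            resid = y_tilde - alpha - f.sum(axis=1) + f[:, j]
            grid = np.std(X[:, j]) * np.array([0.2, 0.4, 0.8])  # hypothetical grid
            h = cv_bandwidth(X[:, j], resid, w, grid)
            fj = smooth(X[:, j], resid, w, h)
            f[:, j] = fj - fj.mean()           # recentre so the estimate has mean zero
        num = ((f - f_old) ** 2).sum(axis=0)
        den = (f_old ** 2).sum(axis=0) + 0.001
        if np.all(num / den <= eps):
            break
    return alpha, f

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(60, 2))
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.standard_normal(60)
    alpha, f = backfit(X, y, np.ones(60))
    print(alpha, f.shape)
```

Updating `f` in place gives the Gauss–Seidel ordering of the partial residuals in Step 2.1. The leave-one-out loop costs \(O(n^2)\) per candidate bandwidth, so a binned approximation may be preferable for large samples.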
Step 3. Repeat Steps 1 and 2 with \({\hat{\eta }}_i^0\) replaced by \({\hat{\eta }}_i= {\hat{\alpha }} + \sum _{j=1}^p {\hat{f}}_j(X_{ij})\) for \(i=1,\ldots ,n\), until the change in \(\textit{MSE}({\hat{\eta }},Y)\) between consecutive iterations falls below \(\varepsilon\), where \(\varepsilon\) is a small threshold and the mean squared error is defined as \(\textit{MSE}({\hat{\eta }},Y)=n^{-1}\sum _{i=1}^n W_i(Y_i-H({\hat{\eta }}_i))^2\).
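The outer loop (Initialize and Steps 1 to 3) can be sketched as follows. This is a hedged illustration, not the authors' code, and it rests on several assumptions. The linearized response and weights take the standard local-scoring form \({{\tilde{Y}}}_i={\hat{\eta }}_i^0+(Y_i-H({\hat{\eta }}_i^0))/H'({\hat{\eta }}_i^0)\) and \(W_i=H'({\hat{\eta }}_i^0)^2/{\hat{\sigma }}_i^2\). The variance \({\hat{\sigma }}_i^2\) is held constant rather than fitted by an additive model. The bandwidths are fixed rather than cross-validated. The stopping rule is the relative change in MSE. The names `loclin` and `local_scoring` are hypothetical.

```python
import numpy as np

def loclin(x, r, w, h):
    """Weighted local linear smoother (Gaussian kernel) evaluated at all sample points."""
    d = x[None, :] - x[:, None]                    # d[i, l] = x_l - x_i
    k = w[None, :] * np.exp(-0.5 * (d / h) ** 2)
    s0, s1, s2 = k.sum(1), (k * d).sum(1), (k * d * d).sum(1)
    t0, t1 = (k * r).sum(1), (k * d * r).sum(1)
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)

def local_scoring(X, y, H, dH, H_inv, h, eps=1e-3, max_outer=20, max_inner=20):
    n, p = X.shape
    alpha = H_inv(y.mean())                        # Initialize: alpha = H^{-1}(mean of Y)
    f = np.zeros((n, p))
    eta = np.full(n, alpha)
    sigma2 = np.full(n, np.var(y))                 # assumption: constant variance
    mse_old = None
    for _ in range(max_outer):
        # Step 1: linearized response and weights (assumed local-scoring form)
        y_tilde = eta + (y - H(eta)) / dH(eta)
        w = dH(eta) ** 2 / sigma2
        # Step 2: weighted backfitting on the linearized response
        alpha = np.average(y_tilde, weights=w)     # assumption: intercept is re-estimated
        for _inner in range(max_inner):
            f_old = f.copy()
            for j in range(p):
                resid = y_tilde - alpha - f.sum(axis=1) + f[:, j]
                fj = loclin(X[:, j], resid, w, h[j])
                f[:, j] = fj - fj.mean()           # recentre each component
            num = ((f - f_old) ** 2).sum(axis=0)
            den = (f_old ** 2).sum(axis=0) + 0.001
            if np.all(num / den <= eps):
                break
        # Step 3: update the predictor and check the change in weighted MSE
        eta = alpha + f.sum(axis=1)
        mse = np.mean(w * (y - H(eta)) ** 2)
        if mse_old is not None and abs(mse_old - mse) / mse_old <= eps:
            break
        mse_old = mse
    return alpha, f, eta

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(-2, 2, size=(200, 2))
    eta_true = 0.4 * np.sin(X[:, 0]) + 0.2 * X[:, 1]
    y = np.exp(eta_true) + 0.05 * rng.standard_normal(200)
    # exponential H chosen only for illustration; H, H' and H^{-1} are passed explicitly
    alpha, f, eta = local_scoring(X, y, np.exp, np.exp, np.log, h=[0.4, 0.4])
    print(np.corrcoef(eta, eta_true)[0, 1])
```

Passing `H`, its derivative and its inverse as arguments keeps the sketch independent of any particular link. With the identity link the weights are constant and the outer loop reduces to a single weighted backfitting pass.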
Cite this article
Roca-Pardiñas, J., Ordóñez, C. & Lado-Baleato, O. Nonparametric location–scale model for the joint forecasting of \(\hbox {SO}_{{2}}\) and \(\hbox {NO}_{{x}}\) pollution episodes. Stoch Environ Res Risk Assess 35, 231–244 (2021). https://doi.org/10.1007/s00477-020-01901-1
DOI: https://doi.org/10.1007/s00477-020-01901-1