Predicting trace gas concentrations using quantile regression models

Abstract

Quantile regression methods are evaluated for computing predictions and prediction intervals of NOx concentrations measured in the vicinity of the power plant in As Pontes (Spain). For these data, methods based on median regression yielded smaller prediction errors than methods based on mean regression. A new method for constructing prediction intervals, which combines median regression with bootstrapping of the prediction error, is proposed. For the NOx data, this new method provides better coverage than classical and bootstrap prediction intervals based on mean regression, and than simpler prediction intervals based on quantile regression. A simulation study illustrates the features of the proposed method that lead to better prediction intervals for these particular NOx concentration data, as well as for other environmental datasets that do not meet the assumptions of homoscedasticity and normality of the error distribution.
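The general idea of pairing a median regression fit with a bootstrap of the prediction error can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure proposed in the article, whose details are not given in the abstract. It assumes a single covariate, fits the median line by brute force (an optimal quantile regression line can always be chosen to pass through two data points when the covariate takes at least two distinct values), and uses a plain residual bootstrap that treats the errors as exchangeable. The function names `fit_quantile_line` and `bootstrap_prediction_interval` are hypothetical.

```python
import random
from itertools import combinations


def pinball_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def fit_quantile_line(xs, ys, tau=0.5):
    """Fit y = a + b*x by minimising the pinball loss (tau=0.5: median regression).

    Brute-force sketch: tries every line through two data points,
    which is enough to find a minimiser, but only practical for small samples.
    """
    points = list(zip(xs, ys))
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line is not a valid candidate
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        loss = sum(pinball_loss(y - a - b * x, tau) for x, y in points)
        if best is None or loss < best[0]:
            best = (loss, a, b)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


def empirical_quantile(values, p):
    """Simple order-statistic quantile of a list of numbers."""
    s = sorted(values)
    idx = min(len(s) - 1, max(0, int(p * len(s))))
    return s[idx]


def bootstrap_prediction_interval(xs, ys, x_new, level=0.9, n_boot=200, seed=0):
    """Median prediction at x_new plus a residual-bootstrap prediction interval.

    Assumption (not from the article): residuals are resampled as if they
    were exchangeable, in the spirit of bootstrap intervals for mean regression.
    """
    rng = random.Random(seed)
    a, b = fit_quantile_line(xs, ys)
    fitted = [a + b * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    prediction = a + b * x_new

    errors = []
    for _ in range(n_boot):
        # Rebuild a bootstrap sample and refit the median line on it.
        y_star = [f + rng.choice(residuals) for f in fitted]
        a_star, b_star = fit_quantile_line(xs, y_star)
        # Simulated future observation minus its bootstrap prediction.
        future = prediction + rng.choice(residuals)
        errors.append(future - (a_star + b_star * x_new))

    alpha = 1.0 - level
    lower = prediction + empirical_quantile(errors, alpha / 2)
    upper = prediction + empirical_quantile(errors, 1 - alpha / 2)
    return prediction, lower, upper
```

Calling `bootstrap_prediction_interval(xs, ys, x_new)` returns the median prediction together with the lower and upper interval limits. Because the interval limits come from empirical quantiles of the bootstrapped prediction errors, the interval need not be symmetric around the prediction, which is one reason such constructions can suit skewed error distributions. A bootstrap scheme that does not assume exchangeable errors would be needed to handle heteroscedastic data properly.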

Acknowledgments

This study was supported by Project MTM2013-41383P from the Spanish Ministry of Economy and Competitiveness, as well as the European Regional Development Fund (ERDF). Support from the IAP network StUDyS from the Belgian Science Policy is also acknowledged. M. Conde-Amboage was supported by FPU grant AP2012-5047 from the Spanish Ministry of Education. Comments and suggestions from two referees are gratefully acknowledged.

Author information

Corresponding author

Correspondence to César Sánchez-Sellero.

About this article

Cite this article

Conde-Amboage, M., González-Manteiga, W. & Sánchez-Sellero, C. Predicting trace gas concentrations using quantile regression models. Stoch Environ Res Risk Assess 31, 1359–1370 (2017). https://doi.org/10.1007/s00477-016-1252-4

Keywords

  • Quantile regression
  • NOx concentration
  • Prediction errors
  • Prediction intervals
  • Bootstrapping
  • Median regression