
Intercomparison of machine learning methods for statistical downscaling: the case of daily and extreme precipitation


Statistical downscaling of Global Climate Models (GCMs) allows researchers to study local climate change effects decades into the future. A wide range of statistical models have been applied to downscaling GCMs, but recent advances in machine learning have not been systematically compared with traditional approaches. In this paper, we compare five Perfect Prognosis (PP) approaches: Ordinary Least Squares, Elastic-Net, and Support Vector Machine, along with two machine learning methods, Multi-task Sparse Structure Learning (MSSL) and Autoencoder Neural Networks. In addition, we introduce a hybrid Model Output Statistics and PP approach by modeling the residuals of Bias Correction Spatial Disaggregation (BCSD) with MSSL. We analyze metrics that evaluate each method’s ability to capture daily anomalies, large-scale climate shifts, and extremes. Generally, we find that the PP methods perform inconsistently in predicting daily anomalies and extremes as well as monthly and annual precipitation. However, results suggest that L1 sparsity constraints aid in reducing error through internal feature selection. The MSSL+BCSD coupling, when compared with BCSD alone, improved daily, monthly, and annual predictability but decreased performance at the extremes. Hence, these results suggest that applying state-of-the-art machine learning methods directly to statistical downscaling does not improve on simpler, longstanding approaches.
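The two ideas named in the abstract, L1-driven feature selection and fitting a second model to the residuals of a baseline, can be illustrated with a minimal sketch. This is not the paper's MSSL or BCSD implementation: plain lasso regression (solved by proximal gradient descent) stands in for MSSL, a random series stands in for the baseline output, and all data, names (`lasso_ista`, `make_demo`), and parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink towards zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, alpha, n_iter=500):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by proximal gradient descent."""
    n, p = X.shape
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * alpha)
    return w

def make_demo(seed=0, n=400, p=20):
    """Synthetic data (assumption): only predictors 0 and 3 drive the residual."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))        # stand-in for large-scale predictors
    baseline = rng.standard_normal(n)      # stand-in for a baseline downscaled series
    w_true = np.zeros(p)
    w_true[[0, 3]] = [1.5, -2.0]
    obs = baseline + X @ w_true + 0.1 * rng.standard_normal(n)
    return X, baseline, obs

X, baseline, obs = make_demo()

# Hybrid idea: model what the baseline misses, then add the correction back.
residual = obs - baseline
w = lasso_ista(X, residual, alpha=0.1)
hybrid = baseline + X @ w

# The L1 penalty sets most coefficients exactly to zero (internal feature selection).
selected = np.flatnonzero(np.abs(w) > 1e-8)
rmse_baseline = np.sqrt(np.mean((obs - baseline) ** 2))
rmse_hybrid = np.sqrt(np.mean((obs - hybrid) ** 2))  # in-sample, for illustration only

print("selected predictors:", selected)
print("RMSE baseline:", rmse_baseline, "RMSE hybrid:", rmse_hybrid)
```

The error comparison here is in-sample on synthetic data, so it says nothing about real downscaling skill; it only shows the mechanics of the residual coupling and of sparse coefficient selection.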






The MERRA-2 climate reanalysis datasets used in this work were provided by the Global Modeling and Assimilation Office at NASA’s Goddard Space Flight Center. The CPC Unified Gauge-Based Analysis was provided by the NOAA Climate Prediction Center.


This work was funded by NSF CISE Expeditions in Computing award 1029711, NSF CyberSEES award 1442728, and NSF BIGDATA award 1447587.

Author information

Correspondence to Thomas Vandal.


About this article


Cite this article

Vandal, T., Kodra, E. & Ganguly, A.R. Intercomparison of machine learning methods for statistical downscaling: the case of daily and extreme precipitation. Theor Appl Climatol 137, 557–570 (2019). https://doi.org/10.1007/s00704-018-2613-3
