
Temporal downscaling of precipitation from climate model projections using machine learning

Original Paper | Stochastic Environmental Research and Risk Assessment

Abstract

Increased greenhouse gas concentrations in the atmosphere have led to significant climate warming and to changes in precipitation and temperature characteristics. These trends, which are expected to continue, will affect water infrastructure and create a need to update associated planning and design policies. The potential effects of climate change can be addressed, in part, by incorporating outputs of climate model projections into the statistical assessments used to develop the Intensity Duration Frequency (IDF) curves relied on in engineering design and analysis. Climate model projections are available at fixed temporal and spatial resolutions, so model results often need to be downscaled from a coarser to a finer grid spacing (spatial downscaling) and/or from a larger to a smaller time step (temporal downscaling). Machine Learning (ML) models are among the methods used for both tasks, though they are applied more frequently to spatial downscaling; fewer studies explore temporal downscaling. In this study, multiple ML models are evaluated for temporally downscaling precipitation time series (available at 3-h time steps) generated by several regional climate models of the North American Regional Climate Change Assessment Program (NARCCAP) under a high-carbon-emission projection. The temporally downscaled time series for 2-h, 1-h, 30-min, and 15-min durations are intended for subsequent statistical analysis to generate current- and future-climate IDF curves for Maryland. The behavior of the ML models is explored by assessing performance in predicting large target response quantities, identifying systematic trends in errors, investigating input/output relationships using response functions, and applying conventional performance metrics.



Availability of data and material

The data used in this study are from (1) the Climate Data Online database (Hourly, 15-min, Daily summaries databases) of the NOAA National Center for Environmental Information (NCEI), https://www.ncdc.noaa.gov/cdo-web/search, and (2) North American Regional Climate Change Assessment Program database of UCAR, https://www.narccap.ucar.edu/.

Abbreviations

ANN: Artificial Neural Network
AOGCM: Atmosphere–Ocean General Circulation Model
BT: Boosted Trees
CDO: Climate Data Online
GBT: Gradient Boosting Trees
GCM: General Circulation Model or Global Climate Model
GPR: Gaussian Process Regression
GRNN: Generalized Regression Neural Network
IDF: Intensity Duration Frequency
IMSE: Integrated Mean Square Error
KNN: K-Nearest Neighbors
LSSVM: Least-Square SVM
MAE: Mean Absolute Error
ML: Machine Learning
MLR: Multiple Linear Regression
MSE: Mean Squared Error
NARCCAP: North American Regional Climate Change Assessment Program
NCEI: National Center for Environmental Information
NOAA: National Oceanic and Atmospheric Administration
RCM: Regional Climate Model
RCP: Representative Concentration Pathway
RF: Random Forest
RI: Reference Index
RMSE: Root Mean Squared Error
SN: Signal-to-Noise
SVM: Support Vector Machine
SVR: Support Vector Regression
UCAR: University Corporation for Atmospheric Research
WANN: Wavelet ANN
WLSSVM: Wavelet-LSSVM

References

  • Agel L, Barlow M, Qian J-H et al (2015) Climatology of daily precipitation and extreme precipitation events in the northeast United States. J Hydrometeorol 16:2537–2557
  • Al Kajbaf A, Bensi M (2020) Application of surrogate models in estimation of storm surge: a comparative assessment. Appl Soft Comput 91:106184
  • Alam MS, Elshorbagy A (2015) Quantification of the climate change-induced variations in Intensity–Duration–Frequency curves in the Canadian Prairies. J Hydrol 527:990–1005
  • American Meteorological Society (2022) Intensity–duration–frequency curve. Gloss Meteorol
  • Beuchat X, Schaefli B, Soutter M, Mermoud A (2011) Toward a robust method for subdaily rainfall downscaling from daily data. Water Resour Res. https://doi.org/10.1029/2010WR010342
  • Botchkarev A (2019) A new typology design of performance metrics to measure errors in machine learning regression algorithms. Interdisc J Inf Knowl Manage 14:45–76. https://doi.org/10.28945/4184
  • Breiman L (1996) Bagging predictors. Mach Learn 24:123–140
  • Breiman L (2001) Random forests. Mach Learn 45:5–32
  • Burian SJ, Durrans SR, Tomić S et al (2000) Rainfall disaggregation using artificial neural networks. J Hydrol Eng 5:299–307
  • Burian SJ, Durrans SR, Nix SJ, Pitt RE (2001) Training artificial neural networks to perform rainfall disaggregation. J Hydrol Eng 6:43–51
  • Cheng M-Y, Firdausi PM, Prayogo D (2014) High-performance concrete compressive strength prediction using Genetic Weighted Pyramid Operation Tree (GWPOT). Eng Appl Artif Intell 29:104–113
  • Chou J-S, Chiu C-K, Farfoura M, Al-Taharwa I (2011) Optimizing the prediction accuracy of concrete compressive strength based on a comparison of data-mining techniques. J Comput Civ Eng 25:242–253
  • Coulibaly P, Dibike YB, Anctil F (2005) Downscaling precipitation and temperature with temporal neural networks. J Hydrometeorol 6:483–496
  • Dibike YB, Coulibaly P (2006) Temporal neural networks for downscaling climate variability and extremes. Neural Netw 19:135–144
  • Diez-Sierra J, Del Jesus M (2019) Subdaily rainfall estimation through daily rainfall downscaling using random forests in Spain. Water 11:125
  • Diez-Sierra J, del Jesus M (2020) Long-term rainfall prediction using atmospheric synoptic patterns in semi-arid climates with statistical and machine learning methods. J Hydrol 586:124789
  • Durrans SR, Burian SJ, Nix SJ et al (1999) Polynomial-based disaggregation of hourly rainfall for continuous hydrologic simulation. JAWRA J Am Water Resour Assoc 35:1213–1221
  • Foresee FD, Hagan MT (1997) Gauss-Newton approximation to Bayesian learning. In: Proceedings of international conference on neural networks (ICNN'97). IEEE, pp 1930–1935
  • Gandomi AH, Yun GJ, Alavi AH (2013) An evolutionary approach for modeling of shear strength of RC deep beams. Mater Struct 46:2109–2119
  • Harilal N, Singh M, Bhatia U (2021) Augmented convolutional LSTMs for generation of high-resolution climate change projections. IEEE Access 9:25208–25218
  • Hu H, Ayyub BM (2019) Machine learning for projecting extreme precipitation intensity for short durations in a changing climate. Geosciences 9:209
  • Jain AK, Mao J, Mohiuddin KM (1996) Artificial neural networks: a tutorial. Computer 29:31–44
  • Kannan S, Ghosh S (2011) Prediction of daily rainfall state in a river basin using statistical downscaling from GCM output. Stoch Environ Res Risk Assess 25:457–474
  • Kim S, Kisi O, Seo Y et al (2016) Assessment of rainfall aggregation and disaggregation using data-driven models and wavelet decomposition. Hydrol Res 48:99–116
  • Kumar J, Brooks B-GJ, Thornton PE, Dietze MC (2012) Sub-daily statistical downscaling of meteorological variables using neural networks. Procedia Comput Sci 9:887–896
  • Kumar B, Chattopadhyay R, Singh M et al (2021) Deep learning–based downscaling of summer monsoon rainfall data over Indian region. Theor Appl Climatol 143:1145–1156
  • Kunkel KE, Karl TR, Brooks H et al (2013) Monitoring and understanding trends in extreme storms: state of knowledge. Bull Am Meteorol Soc 94:499–514
  • Leathers DJ, Brasher SE, Brinson KR, Hughes C, Weiskopf S (2020) A comparison of extreme precipitation event frequency and magnitude using a high-resolution rain gage network and NOAA Atlas 14 across Delaware. Int J Climatol 40(8):3748–3756
  • Leclerc G, Schaake JC (1972) Derivation of hydrologic frequency curves. Report 142, Mass. Inst. of Technol., Cambridge, 151 pp
  • Levenberg K (1944) A method for the solution of certain non-linear problems in least squares. Q Appl Math 2:164–168. https://doi.org/10.1090/qam/10666
  • MacKay DJ (1992) Bayesian interpolation. Neural Comput 4:415–447
  • Marquardt DW (1963) An algorithm for least-squares estimation of nonlinear parameters. J Soc Ind Appl Math 11:431–441
  • MathWorks (2018) MATLAB documentation
  • MathWorks (2021a) What is a neural network? https://www.mathworks.com/discovery/neural-network.html. Accessed 8 Nov 2021
  • MathWorks (2021b) Bayesian regularization backpropagation - MATLAB trainbr. https://www.mathworks.com/help/deeplearning/ref/trainbr.html. Accessed 4 Nov 2021
  • MathWorks (2021c) Hyperparameter optimization in Regression Learner App - MATLAB & Simulink. https://www.mathworks.com/help/stats/hyperparameter-optimization-in-regression-learner-app.html. Accessed 7 Nov 2021
  • Mearns L, McGinnis S, Arritt R et al (2007) North American Regional Climate Change Assessment Program dataset. Approximately 40 TB
  • Menne MJ, Durre I, Vose RS et al (2012) An overview of the global historical climatology network-daily database. J Atmos Ocean Technol 29:897–910. https://doi.org/10.1175/JTECH-D-11-00103.1
  • Mirhosseini G, Srivastava P, Fang X (2014) Developing rainfall intensity-duration-frequency curves for Alabama under future climate scenarios using artificial neural networks. J Hydrol Eng 19:04014022
  • NOAA Climate Data Online (CDO), National Climatic Data Center (NCDC). https://www.ncdc.noaa.gov/cdo-web/. Accessed 26 Jul 2021
  • Noor M, Ismail T, Chung E-S et al (2018) Uncertainty in rainfall intensity duration frequency curves of peninsular Malaysia under changing climate scenarios. Water 10:1750
  • Nourani V, Farboudfam N (2019) Rainfall time series disaggregation in mountainous regions using hybrid wavelet-artificial intelligence methods. Environ Res 168:306–318
  • Ojha CSP, Kumar-Goyal M, Adeloye AJ (2010) Downscaling of precipitation for lake catchment in arid region in India using linear multiple regression and neural networks. Open Hydrol J 4:122–136
  • Ormsbee LE (1989) Rainfall disaggregation model for continuous hydrologic modeling. J Hydraul Eng 115:507–525
  • Sharifi E, Saghafian B, Steinacker R (2019) Downscaling satellite precipitation estimates with multiple linear regression, artificial neural networks, and spline interpolation techniques. J Geophys Res Atmos 124:789–805
  • Tripathi S, Srinivas VV, Nanjundiah RS (2006) Downscaling of precipitation for climate change scenarios: a support vector machine approach. J Hydrol 330:621–640
  • Vogt M, Remmen P, Lauster M et al (2018) Selecting statistical indices for calibrating building energy models. Build Environ 144:94–107
  • Willmott CJ, Matsuura K (2005) Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res 30:79–82
  • Wu S-Y (2015) Changing characteristics of precipitation for the contiguous United States. Clim Change 132:677–692
  • Zhang J, Murch RR, Ross MA et al (2008) Evaluation of statistical rainfall disaggregation methods using rain-gauge information for West-Central Florida. J Hydrol Eng 13:1158–1169


Acknowledgements

The authors gratefully acknowledge the support of the Maryland Department of Transportation State Highway Administration (MDOT SHA) under Statewide Planning and Research (SPR) Task Number SHA/UM/5-36 and the Maryland Water Resources Research Center (US Geological Survey Award #G21AP10629).

Author information


Contributions

AAK: Conceptualization, Formal analysis, Data curation, Visualization, Writing (original draft); MB: Conceptualization, Supervision, Visualization, Writing (review & editing); KLB: Conceptualization, Supervision, Visualization, Writing (review & editing).

Corresponding author

Correspondence to Michelle Bensi.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Parameterization of the models

1.1 Artificial neural network

To develop an ANN model, several parameters must be specified to determine the architecture of the network, including: (1) the type of network, (2) the number of neurons and hidden layers, (3) the number of epochs, (4) the training algorithm and transfer function, and (5) the data division. In this study, the network is a multilayer feedforward back-propagation network, which consists of an input layer, hidden layer(s), and an output layer (Jain et al. 1996; MathWorks). The ANN architecture was determined through an iterative process. A single hidden layer is used; the number of neurons in that layer was tested across the range 7 to 12 and ultimately set at ten based on the performance of the model in terms of Mean Squared Error (MSE). The training algorithm used in this study is Levenberg–Marquardt (Levenberg 1944; Marquardt 1963). This method, also known as damped least-squares, is primarily applied to least-squares curve-fitting problems. Bayesian regularization back-propagation (MacKay 1992; Foresee and Hagan 1997; MathWorks) is used in developing the network to mitigate overfitting; in this framework, the weight and bias values are updated using Levenberg–Marquardt optimization. By default, the transfer function is sigmoid in the hidden layer and linear in the output layer. The input data for the ANN model are standardized by centering the predictor and target values using their mean and standard deviation.
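The architecture described above can be sketched in scikit-learn as a rough analogue on synthetic data (the study itself used MATLAB's neural network toolbox; scikit-learn offers no Levenberg–Marquardt or Bayesian-regularization trainer, so L-BFGS with an L2 penalty stands in here, and the predictors and target are hypothetical):

```python
# Illustrative analogue of the ANN setup: one hidden layer with ten neurons,
# sigmoid hidden activation, linear output, standardized inputs and targets.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                      # hypothetical predictors
y = X @ np.array([1.0, -0.5, 0.3, 0.8]) + 0.1 * rng.normal(size=500)

# Standardize predictors and target by their mean and standard deviation
x_scaler, y_scaler = StandardScaler(), StandardScaler()
Xs = x_scaler.fit_transform(X)
ys = y_scaler.fit_transform(y.reshape(-1, 1)).ravel()

ann = MLPRegressor(hidden_layer_sizes=(10,),   # one hidden layer, ten neurons
                   activation="logistic",      # sigmoid hidden layer
                   solver="lbfgs",             # stand-in for Levenberg-Marquardt
                   alpha=1e-3,                 # L2 penalty against overfitting
                   max_iter=2000,
                   random_state=0)
ann.fit(Xs, ys)

# Map predictions back to the original target scale and compute training MSE
pred = y_scaler.inverse_transform(ann.predict(Xs).reshape(-1, 1)).ravel()
mse = float(np.mean((pred - y) ** 2))
```

In MATLAB terms, the equivalent training functions would be `trainlm` (Levenberg–Marquardt) and `trainbr` (Bayesian regularization), as cited in the text.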

1.2 Ensemble methods (Boosted Trees and Random Forest)

RF and BT are both ensemble methods that use decision trees as the base learner (Breiman 1996, 2001). Modeling choices for these two models were selected as the options that yielded the best performance in terms of the smallest MSE. The parameters were initially optimized via a Bayesian optimization algorithm (MathWorks). Because the optimized parameters did not provide the highest accuracy across all of the analyses performed in this study, they were used as a starting point for finding the parameter set that worked best for different parts of the analysis and when applied to the NARCCAP models; further manual experiments were performed to refine the selections.

For the BT model, the following parameters were defined:

  • minimum leaf size: 1

  • number of variables to sample: 4

  • ensemble aggregation method: Least-squares boosting

  • number of learning cycles: 100

  • learning rate: 0.09.

For the RF model, the following parameters were defined:

  • minimum leaf size: 30

  • ensemble aggregation method: Bootstrap aggregation

  • number of learning cycles: 30.

The BT and RF models do not require data standardization.

1.3 Support Vector Regression

Modeling choices for the SVR model were selected as the options that yielded the best performance in terms of MSE. In an SVR model, key choices in the modeling process include (1) the kernel function, (2) the kernel scale, and (3) standardization. The kernel function is Gaussian (radial basis function); it was chosen because it provided higher accuracy than the other kernels considered, including polynomial and linear. The kernel scale is set to "auto," which allows the software (MathWorks 2018) to determine an appropriate value using a heuristic procedure. The SVR model is set up to standardize the data by centering the predictor and target values using their mean and standard deviation.

As a sensitivity case, a series of analyses was performed in which the SVR modeling selections were optimized via a Bayesian optimization algorithm. The computational expense of the optimized model was significantly higher, yet it did not always perform better (in terms of MSE). For this reason, the non-optimized model was selected and used.
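A minimal sketch of this SVR configuration in scikit-learn (an assumption-laden analogue, not the authors' MATLAB implementation): an RBF kernel with a heuristic kernel scale, plus standardization of both predictors and target. `gamma="scale"` is scikit-learn's heuristic and is only loosely analogous to MATLAB's "auto" kernel scale; the data are synthetic.

```python
# Hedged sketch of the SVR setup: Gaussian (RBF) kernel, heuristic kernel
# scale, and standardized predictors and target.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.compose import TransformedTargetRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))                       # hypothetical predictors
y = np.exp(-X[:, 0] ** 2) + 0.3 * X[:, 2] + 0.05 * rng.normal(size=300)

# Standardize X inside the pipeline and y via the target transformer
svr = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(),
                            SVR(kernel="rbf", gamma="scale")),
    transformer=StandardScaler())
svr.fit(X, y)

mse = float(np.mean((svr.predict(X) - y) ** 2))     # training-set MSE
```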


About this article


Cite this article

Kajbaf, A.A., Bensi, M. & Brubaker, K.L. Temporal downscaling of precipitation from climate model projections using machine learning. Stoch Environ Res Risk Assess 36, 2173–2194 (2022). https://doi.org/10.1007/s00477-022-02259-2

