Temporal downscaling of precipitation from climate model projections using machine learning

Kajbaf, Azin Al; Bensi, Michelle; Brubaker, Kaye L.

doi:10.1007/s00477-022-02259-2

Original Paper
Published: 09 July 2022

Volume 36, pages 2173–2194 (2022)

Stochastic Environmental Research and Risk Assessment

Abstract

Increased greenhouse gas concentration in the atmosphere has led to significant climate warming and changes in precipitation and temperature characteristics. These trends, which are expected to continue, will affect water infrastructure and raise the need to update associated planning and design policies. The potential effects of climate change can be addressed, in part, by incorporating outputs of climate model projections into statistical assessments to develop the Intensity Duration Frequency (IDF) curves used in engineering design and analysis. The results of climate model projections are available at fixed temporal and spatial resolutions. Model results often need to be downscaled from a coarser to a finer grid spacing (spatial downscaling) and/or from a larger to a smaller time-step (temporal downscaling). Machine Learning (ML) models are among the methods used for spatial and temporal downscaling of climate model outputs. These methods are more frequently used for spatial downscaling; fewer studies explore temporal downscaling. In this study, multiple ML models are evaluated to temporally downscale precipitation time-series (available at 3-h time steps) generated by several regional climate models of the North American Regional Climate Change Assessment Program (NARCCAP) under a high-carbon-emission projection. The temporally downscaled time-series for 2-h, 1-h, 30-min, and 15-min durations are intended for subsequent statistical analysis to generate current- and future-climate IDF curves for Maryland. In this study, the behavior of the ML models is explored by assessing performance in predicting large target response quantities, identifying systematic trends in errors, investigating input/output relationships using response functions, and leveraging conventional performance metrics.

Availability of data and material

The data used in this study are from (1) the Climate Data Online database (Hourly, 15-min, and Daily summaries databases) of the NOAA National Centers for Environmental Information (NCEI), https://www.ncdc.noaa.gov/cdo-web/search, and (2) the North American Regional Climate Change Assessment Program database of UCAR, https://www.narccap.ucar.edu/.

Abbreviations

ANN: Artificial Neural Network
AOGCM: Atmosphere–Ocean General Circulation Model
BT: Boosted Trees
CDO: Climate Data Online
GBT: Gradient Boosting Trees
GCM: General Circulation Model or Global Climate Model
GPR: Gaussian Process Regression
GRNN: Generalized Regression Neural Network
IDF: Intensity Duration Frequency
IMSE: Integrated Mean Square Error
KNN: K-Nearest Neighbors
LSSVM: Least-Squares SVM
MAE: Mean Absolute Error
ML: Machine Learning
MLR: Multiple Linear Regression
MSE: Mean Squared Error
NARCCAP: North American Regional Climate Change Assessment Program
NCEI: National Centers for Environmental Information
NOAA: National Oceanic and Atmospheric Administration
RCM: Regional Climate Model
RCP: Representative Concentration Pathway
RF: Random Forest
RI: Reference Index
RMSE: Root Mean Squared Error
SN: Signal-to-Noise
SVM: Support Vector Machine
SVR: Support Vector Regression
UCAR: University Corporation for Atmospheric Research
WANN: Wavelet ANN
WLSSVM: Wavelet-LSSVM

References

Agel L, Barlow M, Qian J-H et al (2015) Climatology of daily precipitation and extreme precipitation events in the northeast United States. J Hydrometeorol 16:2537–2557
Al Kajbaf A, Bensi M (2020) Application of surrogate models in estimation of storm surge: a comparative assessment. Appl Soft Comput 91:106184
Alam MS, Elshorbagy A (2015) Quantification of the climate change-induced variations in Intensity–Duration–Frequency curves in the Canadian Prairies. J Hydrol 527:990–1005
American Meteorological Society (2022) intensity–duration–frequency curve. Gloss Meteorol
Beuchat X, Schaefli B, Soutter M, Mermoud A (2011) Toward a robust method for subdaily rainfall downscaling from daily data. Water Resour Res. https://doi.org/10.1029/2010WR010342
Botchkarev A (2019) A new typology design of performance metrics to measure errors in machine learning regression algorithms. Interdisc J Inf Knowl Manage 14:45–76. https://doi.org/10.28945/4184
Breiman L (1996) Bagging predictors. Mach Learn 24:123–140
Breiman L (2001) Random forests. Mach Learn 45:5–32
Burian SJ, Durrans SR, Tomić S et al (2000) Rainfall disaggregation using artificial neural networks. J Hydrol Eng 5:299–307
Burian SJ, Durrans SR, Nix SJ, Pitt RE (2001) Training artificial neural networks to perform rainfall disaggregation. J Hydrol Eng 6:43–51
Cheng M-Y, Firdausi PM, Prayogo D (2014) High-performance concrete compressive strength prediction using Genetic Weighted Pyramid Operation Tree (GWPOT). Eng Appl Artif Intell 29:104–113
Chou J-S, Chiu C-K, Farfoura M, Al-Taharwa I (2011) Optimizing the prediction accuracy of concrete compressive strength based on a comparison of data-mining techniques. J Comput Civ Eng 25:242–253
Coulibaly P, Dibike YB, Anctil F (2005) Downscaling precipitation and temperature with temporal neural networks. J Hydrometeorol 6:483–496
Dibike YB, Coulibaly P (2006) Temporal neural networks for downscaling climate variability and extremes. Neural Netw 19:135–144
Diez-Sierra J, Del Jesus M (2019) Subdaily rainfall estimation through daily rainfall downscaling using random forests in Spain. Water 11:125
Diez-Sierra J, del Jesus M (2020) Long-term rainfall prediction using atmospheric synoptic patterns in semi-arid climates with statistical and machine learning methods. J Hydrol 586:124789
Durrans SR, Burian SJ, Nix SJ et al (1999) Polynomial-based disaggregation of hourly rainfall for continuous hydrologic simulation. JAWRA J Am Water Resour Assoc 35:1213–1221
Foresee FD, Hagan MT (1997) Gauss-Newton approximation to Bayesian learning. In: Proceedings of international conference on neural networks (ICNN’97). IEEE, pp 1930–1935
Gandomi AH, Yun GJ, Alavi AH (2013) An evolutionary approach for modeling of shear strength of RC deep beams. Mater Struct 46:2109–2119
Harilal N, Singh M, Bhatia U (2021) Augmented convolutional LSTMs for generation of high-resolution climate change projections. IEEE Access 9:25208–25218
Hu H, Ayyub BM (2019) Machine learning for projecting extreme precipitation intensity for short durations in a changing climate. Geosciences 9:209
Jain AK, Mao J, Mohiuddin KM (1996) Artificial neural networks: A tutorial. Computer 29:31–44
Kannan S, Ghosh S (2011) Prediction of daily rainfall state in a river basin using statistical downscaling from GCM output. Stoch Environ Res Risk Assess 25:457–474
Kim S, Kisi O, Seo Y et al (2016) Assessment of rainfall aggregation and disaggregation using data-driven models and wavelet decomposition. Hydrol Res 48:99–116
Kumar J, Brooks B-GJ, Thornton PE, Dietze MC (2012) Sub-daily statistical downscaling of meteorological variables using neural networks. Procedia Comput Sci 9:887–896
Kumar B, Chattopadhyay R, Singh M et al (2021) Deep learning–based downscaling of summer monsoon rainfall data over Indian region. Theor Appl Climatol 143:1145–1156
Kunkel KE, Karl TR, Brooks H et al (2013) Monitoring and understanding trends in extreme storms: state of knowledge. Bull Am Meteorol Soc 94:499–514
Leathers DJ, Brasher SE, Brinson KR, Hughes C, Weiskopf S (2020) A comparison of extreme precipitation event frequency and magnitude using a high-resolution rain gage network and NOAA Atlas 14 across Delaware. Int J Climatol 40(8):3748–3756
Leclerc G, Schaake JC (1972) Derivation of hydrologic frequency curves. Report 142, Mass. Inst. of Technol., Cambridge, 151 pp.
Levenberg K (1944) A method for the solution of certain non-linear problems in least squares. Q Appl Math 2:164–168. https://doi.org/10.1090/qam/10666
MacKay DJ (1992) Bayesian interpolation. Neural Comput 4:415–447
Marquardt DW (1963) An algorithm for least-squares estimation of nonlinear parameters. J Soc Ind Appl Math 11:431–441
MathWorks (2018) MATLAB documentation
MathWorks (2021a) What is a neural network? https://www.mathworks.com/discovery/neural-network.html. Accessed 8 Nov 2021
MathWorks (2021b) Bayesian regularization backpropagation: MATLAB trainbr. https://www.mathworks.com/help/deeplearning/ref/trainbr.html. Accessed 4 Nov 2021
MathWorks (2021c) Hyperparameter optimization in Regression Learner app: MATLAB & Simulink. https://www.mathworks.com/help/stats/hyperparameter-optimization-in-regression-learner-app.html. Accessed 7 Nov 2021
Mearns L, McGinnis S, Arritt R, et al (2007) North American Regional Climate Change Assessment Program dataset. Approximately 40 TB
Menne MJ, Durre I, Vose RS et al (2012) An overview of the global historical climatology network-daily database. J Atmos Ocean Technol 29:897–910. https://doi.org/10.1175/JTECH-D-11-00103.1
Mirhosseini G, Srivastava P, Fang X (2014) Developing rainfall intensity-duration-frequency curves for Alabama under future climate scenarios using artificial neural networks. J Hydrol Eng 19:04014022
NOAA Climate Data Online (CDO). National Climatic Data Center (NCDC). https://www.ncdc.noaa.gov/cdo-web/. Accessed 26 Jul 2021
Noor M, Ismail T, Chung E-S et al (2018) Uncertainty in rainfall intensity duration frequency curves of peninsular Malaysia under changing climate scenarios. Water 10:1750
Nourani V, Farboudfam N (2019) Rainfall time series disaggregation in mountainous regions using hybrid wavelet-artificial intelligence methods. Environ Res 168:306–318
Ojha CSP, Kumar-Goyal M, Adeloye AJ (2010) Downscaling of precipitation for lake catchment in arid region in India using linear multiple regression and neural networks. Open Hydrol J 4:122–136
Ormsbee LE (1989) Rainfall disaggregation model for continuous hydrologic modeling. J Hydraul Eng 115:507–525
Sharifi E, Saghafian B, Steinacker R (2019) Downscaling satellite precipitation estimates with multiple linear regression, artificial neural networks, and spline interpolation techniques. J Geophys Res Atmos 124:789–805
Tripathi S, Srinivas VV, Nanjundiah RS (2006) Downscaling of precipitation for climate change scenarios: a support vector machine approach. J Hydrol 330:621–640
Vogt M, Remmen P, Lauster M et al (2018) Selecting statistical indices for calibrating building energy models. Build Environ 144:94–107
Willmott CJ, Matsuura K (2005) Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res 30:79–82
Wu S-Y (2015) Changing characteristics of precipitation for the contiguous United States. Clim Change 132:677–692
Zhang J, Murch RR, Ross MA et al (2008) Evaluation of statistical rainfall disaggregation methods using rain-gauge information for West-Central Florida. J Hydrol Eng 13:1158–1169

Acknowledgements

The authors gratefully acknowledge the support of the Maryland Department of Transportation State Highway Administration (MDOT SHA) under Statewide Planning and Research (SPR) Task Number SHA/UM/5-36 and the Maryland Water Resources Research Center (US Geological Survey Award #G21AP10629).

Author information

Authors and Affiliations

Department of Civil and Environmental Engineering, University of Maryland, College Park, MD, 20742, USA
Azin Al Kajbaf, Michelle Bensi & Kaye L. Brubaker

Contributions

AAK: Conceptualization, Formal analysis, Data curation, Visualization, Writing (original draft); MB: Conceptualization, Supervision, Visualization, Writing (review & editing); KLB: Conceptualization, Supervision, Visualization, Writing (review & editing).

Corresponding author

Correspondence to Michelle Bensi.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Parameterization of the models

1.1 Artificial neural network

To develop an ANN model, several parameters must be specified to determine the architecture of the network, including: (1) the type of network, (2) the number of hidden layers and neurons, (3) the number of epochs, (4) the training algorithm and transfer function, and (5) the data division. In this study, the network is a multilayer feedforward back-propagation network, which consists of an input layer, hidden layer(s), and an output layer (Jain et al. 1996; MathWorks). The ANN architecture is determined through an iterative process. A single hidden layer with ten neurons is used. The number of neurons in the hidden layer was tested across the range 7 to 12 and ultimately set to ten based on the performance of the model in terms of Mean Squared Error (MSE). The training algorithm used in this study is Levenberg–Marquardt (Levenberg 1944; Marquardt 1963). This method, also known as damped least-squares, is primarily applied to least-squares curve-fitting problems. Bayesian regularization back-propagation (MacKay 1992; Foresee and Hagan 1997; MathWorks) is used in developing the network to mitigate overfitting. In this framework, the weight and bias values are updated according to Levenberg–Marquardt optimization. By default, the transfer function is sigmoid in the hidden layer and linear in the output layer. The input data in the ANN model are standardized by centering and scaling the predictor and target values using their mean and standard deviation.
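The architecture described above can be illustrated with a minimal sketch, which is not the authors' MATLAB implementation. It shows only the standardization step and a forward pass through one hidden layer of ten sigmoid neurons followed by a linear output. The logistic form of the sigmoid, the use of the sample standard deviation, the three-input layout, the random placeholder weights, and the precipitation values are all assumptions made for illustration; training with Levenberg–Marquardt and Bayesian regularization is not shown.

```python
import math
import random

def standardize(values):
    """Center by the mean and scale by the sample standard deviation (assumed)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values], mean, std

def sigmoid(z):
    """Logistic sigmoid; the text says only 'sigmoid', so this form is assumed."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer with sigmoid activations, then a single linear output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

rng = random.Random(0)
n_inputs, n_hidden = 3, 10  # ten hidden neurons as in the text; three inputs is hypothetical

# Placeholder (untrained) weights and biases, for illustration only.
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

# Hypothetical 3-h precipitation totals (mm), standardized before entering the network.
raw = [0.0, 1.2, 4.5, 0.3, 7.8, 2.1]
z, mean, std = standardize(raw)
y_std = forward(z[:n_inputs], w_hidden, b_hidden, w_out, b_out)
```

Because the target values are also standardized, a prediction such as `y_std` would be mapped back to physical units with the mean and standard deviation of the training targets.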

1.2 Ensemble methods (Boosted Trees and Random Forest)

RF and BT are both ensemble methods that use decision trees as the base learner (Breiman 1996, 2001). Modeling choices for these two models were selected based on the options that yielded the lowest MSE. The modeling selections were initially optimized via a Bayesian optimization algorithm (MathWorks). The optimized parameters did not provide the highest accuracy across all the analyses performed in this study. However, they served as a starting point for finding the set of parameters that worked best for different parts of the analysis and when applied to NARCCAP models. Further manual experiments were performed to refine the parameter selections.

For the BT model, the following parameters were defined:

minimum leaf size: 1
number of variables to sample: 4
ensemble aggregation method: Least-squares boosting
number of learning cycles: 100
learning rate: 0.09.

For the RF model, the following parameters were defined:

minimum leaf size: 30
ensemble aggregation method: Bootstrap aggregation
number of learning cycles: 30.

The BT and RF models do not require data standardization.
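To make the role of the BT settings concrete, the following is a minimal sketch of least-squares boosting using the number of learning cycles (100) and the learning rate (0.09) listed above. It is a simplification under stated assumptions, not the authors' model: each learner is a one-split regression stump on a single predictor rather than a full tree with minimum leaf size 1 and four sampled variables, the data are synthetic, and all function names are hypothetical.

```python
import random

def fit_stump(x, residuals):
    """Least-squares regression stump: one threshold, one mean per side."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    rs = [residuals[i] for i in order]
    n, total = len(xs), sum(rs)
    best_gain, best = -1.0, (float("inf"), total / n, total / n)
    left_sum = 0.0
    for k in range(1, n):
        left_sum += rs[k - 1]
        if xs[k] == xs[k - 1]:
            continue  # no threshold separates equal predictor values
        left_mean = left_sum / k
        right_mean = (total - left_sum) / (n - k)
        # Minimizing squared error is equivalent to maximizing this quantity.
        gain = k * left_mean ** 2 + (n - k) * right_mean ** 2
        if gain > best_gain:
            best_gain = gain
            best = (0.5 * (xs[k - 1] + xs[k]), left_mean, right_mean)
    return best

def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean

def ls_boost(x, y, n_cycles=100, learning_rate=0.09):
    """Each cycle fits a stump to the current residuals and adds a shrunken copy."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_cycles):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * predict_stump(stump, xi)
                for pi, xi in zip(pred, x)]
    return base, learning_rate, stumps

def predict(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic data for illustration only.
rng = random.Random(0)
x = [i / 10 for i in range(60)]
y = [xi ** 2 + rng.gauss(0.0, 0.5) for xi in x]

model = ls_boost(x, y)
mse_baseline = mse(y, [sum(y) / len(y)] * len(y))
mse_boosted = mse(y, [predict(model, xi) for xi in x])
```

A small learning rate means each cycle corrects only a fraction of the remaining residual, which is why it is paired with a relatively large number of cycles. Bootstrap aggregation, used for the RF model, differs in that the trees are fitted independently to resampled data and their predictions are averaged.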

1.3 Support Vector Regression

Modeling choices for the SVR model were selected based on the options that yielded the best performance in terms of \(MSE\). In an SVR model, the key factors that can be specified in the modeling process include (1) the kernel function, (2) the kernel scale, and (3) standardization. The kernel function is Gaussian (radial basis function). The Gaussian kernel is chosen in the current study because it provided the highest accuracy among the kernels tested, which also included polynomial and linear kernels. The kernel scale is set to “auto”, which allows the software (MathWorks 2018) to determine an appropriate value using a heuristic procedure. The SVR model is set up to standardize the data by centering and scaling the predictor and target values using their mean and standard deviation.
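The three kernel families mentioned above can be sketched as follows. This is an illustrative sketch, not the software's implementation: the convention of dividing predictor differences by the kernel scale before applying the Gaussian kernel, the polynomial form and degree, the fixed scale of 1.0 (the "auto" heuristic is not reproduced), and the sample points are all assumptions.

```python
import math

def linear_kernel(u, v):
    """Inner product of the two predictor vectors."""
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(u, v, degree=2):
    """Polynomial kernel; the (1 + u.v)^degree form and degree 2 are assumed."""
    return (1.0 + linear_kernel(u, v)) ** degree

def gaussian_kernel(u, v, kernel_scale=1.0):
    """Gaussian (radial basis function) kernel on scaled predictor differences."""
    sq_dist = sum(((a - b) / kernel_scale) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist)

def gram_matrix(points, kernel):
    """Pairwise kernel values, the quantity an SVR solver works with."""
    return [[kernel(p, q) for q in points] for p in points]

# Hypothetical standardized predictor vectors.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
gram = gram_matrix(points, gaussian_kernel)
```

With the Gaussian kernel, similarity decays smoothly with distance and equals one only for identical points, so a larger kernel scale makes distant observations look more alike and yields a smoother fitted function.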

As a sensitivity case, a series of analyses was performed in which the SVR modeling selections were optimized via a Bayesian optimization algorithm. However, the optimized model was significantly more computationally expensive and did not always perform better (in terms of \(MSE\)). For this reason, the non-optimized model was selected and used.

About this article

Cite this article

Kajbaf, A.A., Bensi, M. & Brubaker, K.L. Temporal downscaling of precipitation from climate model projections using machine learning. Stoch Environ Res Risk Assess 36, 2173–2194 (2022). https://doi.org/10.1007/s00477-022-02259-2

Accepted: 27 May 2022
Published: 09 July 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s00477-022-02259-2
