Abstract
Increased greenhouse gas concentrations in the atmosphere have led to significant climate warming and changes in precipitation and temperature characteristics. These trends, which are expected to continue, will affect water infrastructure and raise the need to update associated planning and design policies. The potential effects of climate change can be addressed, in part, by incorporating outputs of climate model projections into statistical assessments to develop the Intensity Duration Frequency (IDF) curves used in engineering design and analysis. The results of climate model projections are available at fixed temporal and spatial resolutions. Model results often need to be downscaled from a coarser to a finer grid spacing (spatial downscaling) and/or from a larger to a smaller time step (temporal downscaling). Machine Learning (ML) models are among the methods used for spatial and temporal downscaling of climate model outputs. These methods are more frequently used for spatial downscaling; fewer studies explore temporal downscaling. In this study, multiple ML models are evaluated for temporally downscaling precipitation time-series (available at 3-h time steps) generated by several regional climate models of the North American Regional Climate Change Assessment Program (NARCCAP) under a high-carbon-emission projection. The temporally downscaled time-series for 2-h, 1-h, 30-min, and 15-min durations are intended for subsequent statistical analysis to generate current- and future-climate IDF curves for Maryland. The behavior of the ML models is explored by assessing performance in predicting large target response quantities, identifying systematic trends in errors, investigating input/output relationships using response functions, and leveraging conventional performance metrics.
Availability of data and material
The data used in this study are from (1) the Climate Data Online database (Hourly, 15-min, Daily summaries databases) of the NOAA National Centers for Environmental Information (NCEI), https://www.ncdc.noaa.gov/cdo-web/search, and (2) the North American Regional Climate Change Assessment Program database of UCAR, https://www.narccap.ucar.edu/.
Abbreviations
- ANN: Artificial Neural Network
- AOGCM: Atmosphere–Ocean General Circulation Model
- BT: Boosted Trees
- CDO: Climate Data Online
- GBT: Gradient Boosting Trees
- GCM: General Circulation Model or Global Climate Model
- GPR: Gaussian Process Regression
- GRNN: Generalized Regression Neural Network
- IDF: Intensity Duration Frequency
- IMSE: Integrated Mean Square Error
- KNN: K-Nearest Neighbors
- LSSVM: Least-Square SVM
- MAE: Mean Absolute Error
- ML: Machine Learning
- MLR: Multiple Linear Regression
- MSE: Mean Squared Error
- NARCCAP: North American Regional Climate Change Assessment Program
- NCEI: National Centers for Environmental Information
- NOAA: National Oceanic and Atmospheric Administration
- RCM: Regional Climate Model
- RCP: Representative Concentration Pathway
- RF: Random Forest
- RI: Reference Index
- RMSE: Root Mean Squared Error
- SN: Signal-to-Noise
- SVM: Support Vector Machine
- SVR: Support Vector Regression
- UCAR: University Corporation for Atmospheric Research
- WANN: Wavelet ANN
- WLSSVM: Wavelet-LSSVM
References
Agel L, Barlow M, Qian J-H et al (2015) Climatology of daily precipitation and extreme precipitation events in the northeast United States. J Hydrometeorol 16:2537–2557
Al Kajbaf A, Bensi M (2020) Application of surrogate models in estimation of storm surge: a comparative assessment. Appl Soft Comput 91:106184
Alam MS, Elshorbagy A (2015) Quantification of the climate change-induced variations in Intensity–Duration–Frequency curves in the Canadian Prairies. J Hydrol 527:990–1005
American Meteorological Society (2022) intensity–duration–frequency curve. Gloss Meteorol
Beuchat X, Schaefli B, Soutter M, Mermoud A (2011) Toward a robust method for subdaily rainfall downscaling from daily data. Water Resour Res. https://doi.org/10.1029/2010WR010342
Botchkarev A (2019) A new typology design of performance metrics to measure errors in machine learning regression algorithms. Interdisc J Inf Knowl Manage 14:45–76. https://doi.org/10.28945/4184
Breiman L (1996) Bagging predictors. Mach Learn 24:123–140
Breiman L (2001) Random forests. Mach Learn 45:5–32
Burian SJ, Durrans SR, Tomić S et al (2000) Rainfall disaggregation using artificial neural networks. J Hydrol Eng 5:299–307
Burian SJ, Durrans SR, Nix SJ, Pitt RE (2001) Training artificial neural networks to perform rainfall disaggregation. J Hydrol Eng 6:43–51
Cheng M-Y, Firdausi PM, Prayogo D (2014) High-performance concrete compressive strength prediction using Genetic Weighted Pyramid Operation Tree (GWPOT). Eng Appl Artif Intell 29:104–113
Chou J-S, Chiu C-K, Farfoura M, Al-Taharwa I (2011) Optimizing the prediction accuracy of concrete compressive strength based on a comparison of data-mining techniques. J Comput Civ Eng 25:242–253
Coulibaly P, Dibike YB, Anctil F (2005) Downscaling precipitation and temperature with temporal neural networks. J Hydrometeorol 6:483–496
Dibike YB, Coulibaly P (2006) Temporal neural networks for downscaling climate variability and extremes. Neural Netw 19:135–144
Diez-Sierra J, Del Jesus M (2019) Subdaily rainfall estimation through daily rainfall downscaling using random forests in Spain. Water 11:125
Diez-Sierra J, del Jesus M (2020) Long-term rainfall prediction using atmospheric synoptic patterns in semi-arid climates with statistical and machine learning methods. J Hydrol 586:124789
Durrans SR, Burian SJ, Nix SJ et al (1999) Polynomial-based disaggregation of hourly rainfall for continuous hydrologic simulation. JAWRA J Am Water Resour Assoc 35:1213–1221
Foresee FD, Hagan MT (1997) Gauss-Newton approximation to Bayesian learning. In: Proceedings of international conference on neural networks (ICNN’97). IEEE, pp 1930–1935
Gandomi AH, Yun GJ, Alavi AH (2013) An evolutionary approach for modeling of shear strength of RC deep beams. Mater Struct 46:2109–2119
Harilal N, Singh M, Bhatia U (2021) Augmented convolutional LSTMs for generation of high-resolution climate change projections. IEEE Access 9:25208–25218
Hu H, Ayyub BM (2019) Machine learning for projecting extreme precipitation intensity for short durations in a changing climate. Geosciences 9:209
Jain AK, Mao J, Mohiuddin KM (1996) Artificial neural networks: A tutorial. Computer 29:31–44
Kannan S, Ghosh S (2011) Prediction of daily rainfall state in a river basin using statistical downscaling from GCM output. Stoch Environ Res Risk Assess 25:457–474
Kim S, Kisi O, Seo Y et al (2016) Assessment of rainfall aggregation and disaggregation using data-driven models and wavelet decomposition. Hydrol Res 48:99–116
Kumar J, Brooks B-GJ, Thornton PE, Dietze MC (2012) Sub-daily statistical downscaling of meteorological variables using neural networks. Procedia Comput Sci 9:887–896
Kumar B, Chattopadhyay R, Singh M et al (2021) Deep learning–based downscaling of summer monsoon rainfall data over Indian region. Theor Appl Climatol 143:1145–1156
Kunkel KE, Karl TR, Brooks H et al (2013) Monitoring and understanding trends in extreme storms: state of knowledge. Bull Am Meteorol Soc 94:499–514
Leathers DJ, Brasher SE, Brinson KR, Hughes C, Weiskopf S (2020) A comparison of extreme precipitation event frequency and magnitude using a high-resolution rain gage network and NOAA Atlas 14 across Delaware. Int J Climatol 40(8):3748–3756
Leclerc G, Schaake JC (1972) Derivation of hydrologic frequency curves. Report 142, Mass. Inst. of Technol., Cambridge, 151 pp.
Levenberg K (1944) A method for the solution of certain non-linear problems in least squares. Q Appl Math 2:164–168. https://doi.org/10.1090/qam/10666
MacKay DJ (1992) Bayesian interpolation. Neural Comput 4:415–447
Marquardt DW (1963) An algorithm for least-squares estimation of nonlinear parameters. J Soc Ind Appl Math 11:431–441
MathWorks (2018) Matlab Documentation.
MathWorks (2021a) What Is a Neural Network? https://www.mathworks.com/discovery/neural-network.html. Accessed 8 Nov 2021
MathWorks (2021b) Bayesian regularization backpropagation - MATLAB trainbr. https://www.mathworks.com/help/deeplearning/ref/trainbr.html. Accessed 4 Nov 2021
MathWorks (2021c) Hyperparameter Optimization in Regression Learner App - MATLAB & Simulink. https://www.mathworks.com/help/stats/hyperparameter-optimization-in-regression-learner-app.html. Accessed 7 Nov 2021
Mearns L, McGinnis S, Arritt R, et al (2007) North American Regional Climate Change Assessment Program dataset. Approximately 40 TB
Menne MJ, Durre I, Vose RS et al (2012) An overview of the global historical climatology network-daily database. J Atmos Ocean Technol 29:897–910. https://doi.org/10.1175/JTECH-D-11-00103.1
Mirhosseini G, Srivastava P, Fang X (2014) Developing rainfall intensity-duration-frequency curves for Alabama under future climate scenarios using artificial neural networks. J Hydrol Eng 19:04014022
NOAA CDO Climate Data Online (CDO) - The National Climatic Data Center’s (NCDC) Climate Data Online (CDO) provides free access to NCDC’s archive of historical weather and climate data in addition to station history information. | National Climatic Data Center (NCDC). https://www.ncdc.noaa.gov/cdo-web/. Accessed 26 Jul 2021
Noor M, Ismail T, Chung E-S et al (2018) Uncertainty in rainfall intensity duration frequency curves of peninsular Malaysia under changing climate scenarios. Water 10:1750
Nourani V, Farboudfam N (2019) Rainfall time series disaggregation in mountainous regions using hybrid wavelet-artificial intelligence methods. Environ Res 168:306–318
Ojha CSP, Kumar-Goyal M, Adeloye AJ (2010) Downscaling of precipitation for lake catchment in arid region in India using linear multiple regression and neural networks. Open Hydrol J 4:122–136
Ormsbee LE (1989) Rainfall disaggregation model for continuous hydrologic modeling. J Hydraul Eng 115:507–525
Sharifi E, Saghafian B, Steinacker R (2019) Downscaling satellite precipitation estimates with multiple linear regression, artificial neural networks, and spline interpolation techniques. J Geophys Res Atmos 124:789–805
Tripathi S, Srinivas VV, Nanjundiah RS (2006) Downscaling of precipitation for climate change scenarios: a support vector machine approach. J Hydrol 330:621–640
Vogt M, Remmen P, Lauster M et al (2018) Selecting statistical indices for calibrating building energy models. Build Environ 144:94–107
Willmott CJ, Matsuura K (2005) Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res 30:79–82
Wu S-Y (2015) Changing characteristics of precipitation for the contiguous United States. Clim Change 132:677–692
Zhang J, Murch RR, Ross MA et al (2008) Evaluation of statistical rainfall disaggregation methods using rain-gauge information for West-Central Florida. J Hydrol Eng 13:1158–1169
Acknowledgements
The authors gratefully acknowledge the support of the Maryland Department of Transportation State Highway Administration (MDOT SHA) under Statewide Planning and Research (SPR) Task Number SHA/UM/5-36 and the Maryland Water Resources Research Center (US Geological Survey Award #G21AP10629).
Author information
Contributions
AAK: Conceptualization, Formal analysis, Data curation, Visualization, Writing-original draft; MB: Conceptualization, Supervision, Visualization, Writing (review & editing); KLB: Conceptualization, Supervision, Visualization, Writing (review & editing).
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Appendix: Parameterization of the models
1.1 Artificial neural network
To develop an ANN model, several parameters must be specified to determine the architecture of the network, including: (1) the type of network, (2) the number of hidden layers and neurons, (3) the number of epochs, (4) the training algorithm and transfer function, and (5) the data division. In this study, the network is a multilayer feedforward back-propagation network, which consists of an input layer, hidden layer(s), and an output layer (Jain et al. 1996; MathWorks). The ANN architecture is determined through an iterative process. A single hidden layer with ten neurons is used. The number of neurons in the hidden layer was tested across the range 7 to 12 and was ultimately set to ten based on the performance of the model in terms of Mean Squared Error (MSE). The training algorithm used in this study is Levenberg–Marquardt (Levenberg 1944; Marquardt 1963). This method, also known as damped least-squares, is primarily applied to least-squares curve-fitting problems. Bayesian regularization back-propagation (MacKay 1992; Foresee and Hagan 1997; MathWorks) is used in developing the network to mitigate overfitting. In this framework, the weight and bias values are updated according to Levenberg–Marquardt optimization. By default, the transfer function is sigmoid in the hidden layer and linear in the output layer. The input data in the ANN model are standardized by centering and scaling the predictor and target values using their mean and standard deviation.
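The standardization step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study relies on MathWorks software); the function names and the sample values are hypothetical, and the use of the sample standard deviation (N − 1 denominator) is an assumption, since the text does not say which estimator is used.

```python
from statistics import fmean, stdev


def fit_standardizer(values):
    """Return (mean, std) of a 1-D sequence; std falls back to 1.0 if it is zero."""
    mu = fmean(values)
    sigma = stdev(values, mu)  # sample standard deviation (assumption)
    return mu, (sigma if sigma > 0 else 1.0)


def standardize(values, mu, sigma):
    """Center and scale values to zero mean and unit standard deviation."""
    return [(v - mu) / sigma for v in values]


def destandardize(z_values, mu, sigma):
    """Map standardized values (e.g. model predictions) back to original units."""
    return [z * sigma + mu for z in z_values]


# Hypothetical 3-h precipitation depths (mm), for illustration only.
precip_3h = [0.0, 1.2, 4.8, 0.6, 0.0, 9.4]
mu, sigma = fit_standardizer(precip_3h)
z = standardize(precip_3h, mu, sigma)
restored = destandardize(z, mu, sigma)
```

Because the targets are standardized as well as the predictors, predictions made in standardized space would need to be mapped back with the inverse transform (here `destandardize`) before being compared with observed precipitation.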
1.2 Ensemble methods (Boosted Trees and Random Forest)
RF and BT are both ensemble methods that use decision trees as the learner algorithm (Breiman 1996, 2001). Modeling choices for these two models were selected based on the options that yielded the lowest MSE. The modeling selections were initially optimized via a Bayesian optimization algorithm (MathWorks). The optimized parameters did not provide the highest accuracy across all the analyses performed in this study. However, they were used as a starting point to find the set of parameters that worked best for different parts of the analysis and when applied to NARCCAP models. Further manual experiments were performed to refine parameter selections.
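For reference, the MSE criterion used for model selection, together with the RMSE and MAE listed among the abbreviations, can be written as a short Python sketch. The function names and the sample numbers are hypothetical and serve only to illustrate the standard definitions.

```python
from math import sqrt


def mse(observed, predicted):
    """Mean Squared Error: average of squared differences."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root Mean Squared Error: square root of the MSE, in the units of the data."""
    return sqrt(mse(observed, predicted))


def mae(observed, predicted):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical observed and predicted precipitation depths (mm).
observed = [2.0, 0.0, 5.0, 1.0]
predicted = [1.0, 0.0, 3.0, 1.0]

mse_value = mse(observed, predicted)    # 1.25
rmse_value = rmse(observed, predicted)  # about 1.118
mae_value = mae(observed, predicted)    # 0.75
```

Since MSE squares each error, a few large misses dominate the score, whereas MAE weights all errors linearly.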
For the BT model, the following parameters were defined:
- minimum leaf size: 1
- number of variables to sample: 4
- ensemble aggregation method: Least-squares boosting
- number of learning cycles: 100
- learning rate: 0.09
For the RF model, the following parameters were defined:
- minimum leaf size: 30
- ensemble aggregation method: Bootstrap aggregation
- number of learning cycles: 30
The BT and RF models do not require data standardization.
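To make the roles of the learning rate and the number of learning cycles concrete, the following is a minimal pure-Python sketch of least-squares boosting. It is a simplification under stated assumptions, not the configuration used in the study: the weak learner is a single-split stump on one predictor rather than a full decision tree with sampled variables, and all function names and data values are hypothetical.

```python
def fit_stump(x, residuals, min_leaf=1):
    """Find the single split on x that minimizes squared error of the residuals."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    rs = [residuals[i] for i in order]
    n = len(xs)
    total = sum(rs)
    best = None  # (score, threshold, left_mean, right_mean)
    left_sum = 0.0
    for k in range(1, n):
        left_sum += rs[k - 1]
        if xs[k] == xs[k - 1] or k < min_leaf or n - k < min_leaf:
            continue  # no valid split between equal values or below leaf size
        right_sum = total - left_sum
        # Maximizing this score is equivalent to minimizing the squared error.
        score = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if best is None or score > best[0]:
            threshold = (xs[k - 1] + xs[k]) / 2
            best = (score, threshold, left_sum / k, right_sum / (n - k))
    if best is None:  # no valid split: predict the mean residual everywhere
        mean = total / n
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, xi):
    threshold, left_value, right_value = stump
    return left_value if xi <= threshold else right_value


def fit_lsboost(x, y, n_cycles=100, learning_rate=0.09, min_leaf=1):
    """Each cycle fits a stump to the current residuals and adds a shrunken copy."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_cycles):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals, min_leaf)
        stumps.append(stump)
        pred = [pi + learning_rate * predict_stump(stump, xi)
                for pi, xi in zip(pred, x)]
    return base, learning_rate, stumps


def predict_lsboost(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)


# Hypothetical predictor/target pairs, for illustration only.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0.1, 0.3, 0.2, 1.5, 1.7, 1.6, 4.0, 4.2]

model = fit_lsboost(x, y, n_cycles=100, learning_rate=0.09, min_leaf=1)
fitted = [predict_lsboost(model, xi) for xi in x]
mean_y = sum(y) / len(y)
baseline_mse = sum((yi - mean_y) ** 2 for yi in y) / len(y)
boosted_mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y)
```

A small learning rate such as 0.09 means each cycle corrects only a fraction of the remaining residual, which is why many cycles are needed. Bootstrap aggregation, by contrast, fits each tree independently on a resampled copy of the data and averages the predictions.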
1.3 Support Vector Regression
Modeling choices for the SVR model were selected based on the options that yielded the best performance in terms of MSE. In an SVR model, key factors that can be specified in the modeling process include (1) the kernel function, (2) the kernel scale, and (3) standardization. The kernel function is Gaussian (radial basis function). The Gaussian kernel is chosen in the current study because it provided higher accuracy than the other kernels tested, including polynomial and linear. The kernel scale is set to “auto”, which allows the software (MathWorks 2018) to determine an appropriate value using a heuristic procedure. The SVR model is set up to standardize the data by centering and scaling the predictor and target values using their mean and standard deviation.
As a sensitivity case, a series of analyses were performed in which the SVR modeling selections were optimized via a Bayesian optimization algorithm. However, the optimized model was significantly more computationally expensive and did not always perform better (in terms of MSE). For this reason, the non-optimized model was selected and used.
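The effect of the kernel scale on the Gaussian kernel can be illustrated with a short Python sketch. It assumes the common form in which the predictors are divided by the kernel scale before the kernel is evaluated, giving exp(−‖x − z‖² / s²). Other parameterizations (for example, with 2σ² in the denominator) exist, and the value chosen by the "auto" heuristic is not reproduced here. The function name and the vectors are hypothetical.

```python
from math import exp


def gaussian_kernel(x, z, kernel_scale=1.0):
    """Gaussian (RBF) kernel exp(-||x - z||^2 / kernel_scale^2) (assumed form)."""
    if kernel_scale <= 0:
        raise ValueError("kernel_scale must be positive")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return exp(-sq_dist / kernel_scale ** 2)


# Hypothetical standardized predictor vectors, for illustration only.
a = [0.0, 0.0]
b = [1.0, 0.0]

k_same = gaussian_kernel(a, a)                    # identical points give 1.0
k_near = gaussian_kernel(a, b, kernel_scale=1.0)  # exp(-1)
k_wide = gaussian_kernel(a, b, kernel_scale=2.0)  # exp(-0.25)
```

A larger kernel scale makes distant points look more similar and yields a smoother fitted function. This is also one reason standardization matters for SVR: without it, predictors with large numeric ranges would dominate the squared distance.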
Cite this article
Kajbaf, A.A., Bensi, M. & Brubaker, K.L. Temporal downscaling of precipitation from climate model projections using machine learning. Stoch Environ Res Risk Assess 36, 2173–2194 (2022). https://doi.org/10.1007/s00477-022-02259-2
DOI: https://doi.org/10.1007/s00477-022-02259-2