Abstract
Effective drought prediction can help mitigate some of the effects of drought. Machine learning algorithms are increasingly used to develop drought prediction models because of their efficiency and accuracy. This study explored the ability of several machine learning models, based on penalized linear regression and on decision tree (DT)-based ensemble methods, to predict drought conditions represented by the Standardized Precipitation–Evapotranspiration Index (SPEI) in Northeast China. We compared the forecasting performance of two penalized linear regression models, ridge regression (RR) and lasso regression (LR), with that of the ordinary least squares (OLS) regression model. The AdaBoost and Random Forests (RF) models were also used to explore whether ensemble methods improve forecasting performance. The SPEI was forecast at timescales of 3, 6, 12, and 24 months with these models, and the resulting indices were used to predict short-term and long-term drought conditions. The results indicated that the penalized linear regression models provided better predictions and that the ensemble methods consistently outperformed the DT model. Overall, the LR models were the best models for forecasting the SPEI at the different timescales in Northeast China.
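The difference between OLS, ridge, and lasso regression named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic autoregressive series merely stands in for an SPEI record, and the function names, the number of lags, and the penalty values are all assumptions chosen for illustration. A single coordinate-descent routine covers the three models, since OLS is the unpenalized case, ridge adds a squared (L2) penalty, and lasso adds an absolute-value (L1) penalty.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def fit_penalized(X, y, l1=0.0, l2=0.0, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xw||^2 + l1*||w||_1 + (l2/2)*||w||^2.

    l1 = l2 = 0 gives OLS, l2 > 0 gives ridge, l1 > 0 gives lasso.
    No intercept is fitted, so X and y are assumed to be roughly zero-mean,
    and every column of X is assumed to contain a non-zero value.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, l1) / (z[j] + l2)
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def lagged_design(series, n_lags):
    """Predict each value from the n_lags values before it (most recent first)."""
    X = [series[t - n_lags:t][::-1] for t in range(n_lags, len(series))]
    y = [series[t] for t in range(n_lags, len(series))]
    return X, y


def rmse(X, y, w):
    """Root-mean-square error of the linear prediction Xw against y."""
    errors = [yi - sum(x * wj for x, wj in zip(row, w)) for row, yi in zip(X, y)]
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


# Synthetic zero-mean AR(1) series (hypothetical stand-in for monthly SPEI).
rng = random.Random(0)
series, value = [], 0.0
for _ in range(240):
    value = 0.6 * value + rng.gauss(0.0, 0.5)
    series.append(value)

X_all, y_all = lagged_design(series, n_lags=3)
split = int(0.8 * len(X_all))  # chronological split: train on the earlier part

for name, penalty in [("OLS", {}), ("ridge", {"l2": 0.1}), ("lasso", {"l1": 0.02})]:
    weights = fit_penalized(X_all[:split], y_all[:split], **penalty)
    test_error = rmse(X_all[split:], y_all[split:], weights)
    print(name, [round(c, 3) for c in weights], round(test_error, 3))
```

Ridge shrinks all coefficients towards zero, whereas lasso can set some of them exactly to zero, which is why lasso also acts as a variable selector. In practice the penalty strengths would be tuned, for example by cross-validation, rather than fixed as in this sketch.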
Acknowledgements
This work was supported by the National Science Foundation of China (Grants Nos. 51679142 and 51709173).
Additional information
Responsible Editor: F. Mesinger.
Cite this article
Li, Z., Chen, T., Wu, Q. et al. Application of penalized linear regression and ensemble methods for drought forecasting in Northeast China. Meteorol Atmos Phys 132, 113–130 (2020). https://doi.org/10.1007/s00703-019-00675-8
DOI: https://doi.org/10.1007/s00703-019-00675-8