Abstract
In this paper, two data mining techniques (decision trees and association rules) were applied to discover associations between several thresholds of monthly precipitation (MP) at the Tabriz and Kermanshah synoptic stations (located in Iran) and the de-trended sea surface temperature (SST) of the Black, Mediterranean and Red Seas. The modeling had two major steps: classifying the de-trended SST data and selecting the most effective groups, and then extracting the hidden predictive information contained in the data. Decision trees, which can identify the traits of a data set that are useful for classification, were used for the first step, and association rules were employed to extract the hidden predictive information from the large observed data set. To examine the accuracy of the rules, confidence and lift measures were calculated and compared for different precipitation thresholds at different lag times. The computed measures confirm the reliable performance of the proposed hybrid data mining method in forecasting extreme precipitation events when higher threshold values are considered. The results show a relative correlation between the de-trended SSTs of the Mediterranean, Black and Red Seas and the maximum MP at the Tabriz and Kermanshah synoptic stations: the confidence between the 35% threshold of MP values and the de-trended SST of the seas is higher than 70% for Tabriz and 60% for Kermanshah. It was also shown that the geographical location of the stations and the distribution of the precipitation data affect the measures of the rules and the forecasting outcomes.
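The confidence and lift measures mentioned above can be illustrated with a minimal sketch. For a rule "SST falls in a given class, so MP exceeds the threshold `lag` months later", confidence is the share of antecedent months in which the consequent also holds, and lift is that confidence divided by the overall frequency of the consequent. The function name `rule_measures`, the `lag` parameter, and the toy boolean series are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
from typing import Dict, Sequence


def rule_measures(antecedent: Sequence[bool],
                  consequent: Sequence[bool],
                  lag: int = 0) -> Dict[str, float]:
    """Support, confidence and lift of the rule antecedent -> consequent.

    antecedent[t] is paired with consequent[t + lag], so lag=1 asks whether
    this month's SST class is followed by next month's precipitation event.
    """
    if lag < 0 or len(antecedent) != len(consequent):
        raise ValueError("series must have equal length and lag must be >= 0")
    # Align the two monthly series according to the lag.
    a = list(antecedent[:len(antecedent) - lag])
    c = list(consequent[lag:])
    n = len(a)
    n_a = sum(a)                                   # months where antecedent holds
    n_c = sum(c)                                   # months where consequent holds
    n_both = sum(x and y for x, y in zip(a, c))    # months where both hold
    if n == 0 or n_a == 0 or n_c == 0:
        raise ValueError("rule measures are undefined for empty events")
    support = n_both / n
    confidence = n_both / n_a
    lift = confidence / (n_c / n)
    return {"support": support, "confidence": confidence, "lift": lift}


# Toy, made-up monthly indicators (NOT data from the study):
# sst_in_class[t]: de-trended SST fell in the selected class in month t
# mp_above[t]:     monthly precipitation exceeded the chosen threshold
sst_in_class = [True, True, True, True, False, False, False, False, True, False]
mp_above = [True, True, True, False, False, False, True, False, True, False]

same_month = rule_measures(sst_in_class, mp_above)
one_month_lag = rule_measures(sst_in_class, mp_above, lag=1)
print(same_month)
print(one_month_lag)
```

A lift above 1 indicates that the precipitation event is more frequent when the SST condition holds than it is overall, while a lift near or below 1 suggests the rule carries little predictive information.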
Cite this article
Nourani, V., Sattari, M.T. & Molajou, A. Threshold-Based Hybrid Data Mining Method for Long-Term Maximum Precipitation Forecasting. Water Resour Manage 31, 2645–2658 (2017). https://doi.org/10.1007/s11269-017-1649-y
DOI: https://doi.org/10.1007/s11269-017-1649-y