Threshold-Based Hybrid Data Mining Method for Long-Term Maximum Precipitation Forecasting

This paper applies two data mining techniques, decision trees and association rules, to discover associations between several thresholds of monthly precipitation (MP) at the Tabriz and Kermanshah synoptic stations (both in Iran) and the de-trended sea surface temperature (SST) of the Black, Mediterranean and Red Seas. The modeling involved two major steps: classifying the de-trended SST data and selecting the most effective groups, and then extracting the hidden predictive information contained in the data. Decision trees, which can identify the most informative attributes of a data set for classification, were used for the first step. Association rules were used for the second, to extract hidden predictive information from the large set of observations. To examine the accuracy of the rules, confidence and lift measures were calculated and compared for different precipitation thresholds at different lag times. The computed measures confirm the reliable performance of the proposed hybrid data mining method in forecasting extreme precipitation events when higher threshold values are considered. The results also show a relative correlation between the de-trended SSTs of the Mediterranean, Black and Red Seas and the maximum MP at the Tabriz and Kermanshah synoptic stations: the confidence between the 35% threshold of MP values and the de-trended SST of the seas is higher than 70% for Tabriz and 60% for Kermanshah. The geographical location of the stations and the distribution of the precipitation data were also shown to affect the rule measures and the forecasting outcomes.
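The confidence and lift measures mentioned in the abstract have standard definitions in association rule mining, and a minimal sketch may help make them concrete. For a rule "A implies B", confidence is the share of records with A that also have B, and lift is that confidence divided by the overall share of records with B. The function name `rule_measures` and the boolean inputs are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not reproduce the authors' implementation, data, or results.

```python
import math
from typing import Dict, Sequence


def rule_measures(antecedent: Sequence[bool],
                  consequent: Sequence[bool]) -> Dict[str, float]:
    """Support, confidence and lift for the rule "antecedent -> consequent".

    Each sequence holds one boolean per record (for example, one per month).
    Illustrative assumption: antecedent[i] could mean "the de-trended SST
    fell in a given class at some lag", and consequent[i] could mean
    "monthly precipitation exceeded the chosen threshold".
    """
    if len(antecedent) != len(consequent):
        raise ValueError("antecedent and consequent must have equal length")
    n = len(antecedent)
    if n == 0:
        raise ValueError("at least one record is required")

    n_a = sum(1 for a in antecedent if a)                       # records with A
    n_b = sum(1 for b in consequent if b)                       # records with B
    n_ab = sum(1 for a, b in zip(antecedent, consequent) if a and b)

    support = n_ab / n                                          # P(A and B)
    confidence = n_ab / n_a if n_a else math.nan                # P(B | A)
    lift = confidence / (n_b / n) if n_b else math.nan          # P(B | A) / P(B)
    return {"support": support, "confidence": confidence, "lift": lift}


# Made-up example with ten records (not data from the study).
sst_in_class = [True, True, True, True, False,
                False, False, False, False, False]
precip_above = [True, True, True, False, False,
                False, True, False, False, False]
measures = rule_measures(sst_in_class, precip_above)
```

A lift above 1 indicates that the consequent occurs more often when the antecedent holds than it does overall, which is why lift is usually reported alongside confidence when rules are compared.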

Author information

Correspondence to Vahid Nourani.

Cite this article

Nourani, V., Sattari, M.T. & Molajou, A. Threshold-Based Hybrid Data Mining Method for Long-Term Maximum Precipitation Forecasting. Water Resour Manage 31, 2645–2658 (2017). https://doi.org/10.1007/s11269-017-1649-y

Keywords

  • Maximum monthly precipitation forecasting
  • SST
  • Data mining
  • Decision tree
  • Association rules