Threshold-Based Hybrid Data Mining Method for Long-Term Maximum Precipitation Forecasting

Water Resources Management

Abstract

This paper applies two data mining techniques, decision trees and association rules, to discover relationships between several thresholds of monthly precipitation (MP) at the Tabriz and Kermanshah synoptic stations (located in Iran) and the de-trended sea surface temperature (SST) of the Black, Mediterranean and Red Seas. The modeling involved two major steps: classifying the de-trended SST data and selecting the most effective groups, and then extracting the hidden predictive information contained in the data. Decision tree techniques, which can identify the most informative attributes of a data set for classification, were used for the first step, and association rules were employed to extract the hidden predictive information from the large observed data set. To examine the accuracy of the rules, confidence and lift measures were calculated and compared for different precipitation thresholds at different lag times. The computed measures confirm reliable performance of the proposed hybrid data mining method in forecasting extreme precipitation events at higher threshold values. The results also show a relative correlation between the de-trended SSTs of the Mediterranean, Black and Red Seas and the maximum MP at the Tabriz and Kermanshah synoptic stations: the confidence between the 35% threshold of MP values and the de-trended SST of the seas is higher than 70% for Tabriz and 60% for Kermanshah. It was also shown that the geographical location of the stations and the distribution of the precipitation data affect the rule measures and the forecasting outcomes.
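The confidence and lift measures mentioned in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions for a rule A → C (confidence = P(A and C) / P(A), lift = confidence / P(C)) and is not the authors' implementation. The function name `rule_measures`, the flag names and the ten-month toy series are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def rule_measures(antecedent: Sequence[bool], consequent: Sequence[bool]) -> dict:
    """Support, confidence and lift of the rule antecedent -> consequent.

    Each input holds one boolean per month: whether the antecedent
    (e.g. a lagged SST class) or the consequent (e.g. MP above a
    threshold) was observed in that month.
    """
    if len(antecedent) != len(consequent) or not antecedent:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(antecedent)
    n_a = sum(antecedent)                                   # months with A
    n_c = sum(consequent)                                   # months with C
    n_ac = sum(a and c for a, c in zip(antecedent, consequent))  # months with both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else float("nan")
    lift = confidence / (n_c / n) if n_a and n_c else float("nan")
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical toy data (not from the paper): ten months of a lagged
# "SST in a given class" flag and an "MP exceeds the threshold" flag.
sst_in_class = [True, True, False, True, False, False, True, True, False, False]
mp_exceeds = [True, True, False, False, False, True, True, True, False, False]

measures = rule_measures(sst_in_class, mp_exceeds)
print(measures)
```

A lift above 1 indicates that the precipitation threshold is exceeded more often when the SST condition holds than it is overall, which is why lift is reported alongside confidence when comparing rules across thresholds and lag times.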

Author information

Corresponding author

Correspondence to Vahid Nourani.

About this article

Cite this article

Nourani, V., Sattari, M.T. & Molajou, A. Threshold-Based Hybrid Data Mining Method for Long-Term Maximum Precipitation Forecasting. Water Resour Manage 31, 2645–2658 (2017). https://doi.org/10.1007/s11269-017-1649-y

  • DOI: https://doi.org/10.1007/s11269-017-1649-y
