Predictor selection for streamflows using a graphical modeling approach

  • Original Paper
  • Stochastic Environmental Research and Risk Assessment

Abstract

Streamflows are influenced by various hydroclimatic variables in complex ways. Accurate prediction of monthly streamflows requires a clear understanding of the dependence patterns among these influencing variables and streamflows. A graphical modeling technique, employing conditional independence, is adopted in this study to quantify the interrelationships between streamflows and a suite of available hydroclimatic variables, and to identify a reduced set of relevant variables for parsimonious model development. The nodes in the undirected graph represent relevant variables, and the strengths of the connections among the variables are learnt from the data. The graphical modeling approach is compared to the state-of-the-art method for predictor selection based on partial mutual information. For a synthetic benchmark dataset and a watershed in southern Indiana, USA, the graphical modeling approach shows more discriminating results while being computationally efficient. Along with artificial neural networks and time series models, results of the graphical model are used for formulating a variational relevance vector machine to predict monthly streamflows and perform probabilistic classification of hydrologic droughts in the watershed being studied. The parsimonious models developed for prediction at different lead times performed as well as the non-parsimonious models during both the calibration and testing periods. Drought forecasting for the study watershed at 1-month lead time was performed using only the two selected predictors, soil moisture and precipitation anomalies, and the model performance was evaluated. The graphical model shows promise as a tool for predictor selection and as an aid to parsimonious model development in statistical hydrology.
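The idea of selecting predictors through conditional independence can be illustrated with a small sketch. The abstract does not give the authors' algorithm, so what follows is only one common Gaussian formulation, assumed here for illustration: partial correlations are read off the inverse covariance (precision) matrix, and a variable is kept as a predictor when its partial correlation with streamflow exceeds a threshold. The function names, the threshold of 0.1, and the synthetic variables `x1`, `x2`, `x3` and `flow` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation between every pair of columns, given all other columns.

    Assumes the columns are roughly jointly Gaussian, so a zero partial
    correlation corresponds to conditional independence (no edge in the graph).
    """
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


def select_predictors(data, target_index, threshold=0.1):
    """Indices of columns whose edge to the target is stronger than the threshold."""
    pcorr = partial_correlations(data)
    return [
        j
        for j in range(data.shape[1])
        if j != target_index and abs(pcorr[target_index, j]) > threshold
    ]


# Synthetic illustration (hypothetical data, not the paper's benchmark).
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)                  # drives the flow directly
x2 = rng.normal(size=n)                  # drives the flow directly
x3 = x1 + 0.5 * rng.normal(size=n)       # related to the flow only through x1
flow = 1.0 * x1 + 0.8 * x2 + 0.5 * rng.normal(size=n)

data = np.column_stack([x1, x2, x3, flow])
selected = select_predictors(data, target_index=3)
marginal_x3 = np.corrcoef(x3, flow)[0, 1]

print("selected predictor columns:", selected)
print("marginal correlation of x3 with flow:", round(marginal_x3, 2))
```

In this toy setup `x3` is strongly correlated with the flow on its own, yet it carries no information beyond `x1`, so its edge to the flow vanishes once the other variables are conditioned on. That is the sense in which a conditional-independence graph can yield a smaller predictor set than a ranking by pairwise correlation.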

Acknowledgments

Studies were supported in part by the National Science Foundation under Grant AGS 1025430, and by USDA NIFA award number 2011-67019-21122. This support is gratefully acknowledged. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the USDA.

Author information

Corresponding author

Correspondence to Meenu Ramadas.

About this article

Cite this article

Ramadas, M., Maity, R., Ojha, R. et al. Predictor selection for streamflows using a graphical modeling approach. Stoch Environ Res Risk Assess 29, 1583–1599 (2015). https://doi.org/10.1007/s00477-014-0977-1

  • DOI: https://doi.org/10.1007/s00477-014-0977-1
