Abstract
Streamflows are influenced by various hydroclimatic variables in complex ways. Accurate prediction of monthly streamflows requires a clear understanding of the dependence patterns among these influencing variables and streamflows. A graphical modeling technique based on conditional independence is adopted in this study to quantify the interrelationships between streamflows and a suite of available hydroclimatic variables, and to identify a reduced set of relevant variables for parsimonious model development. The nodes in the undirected graph represent relevant variables, and the strengths of the connections among the variables are learnt from the data. The graphical modeling approach is compared to the state-of-the-art method for predictor selection based on partial mutual information. For a synthetic benchmark dataset and a watershed in southern Indiana, USA, the graphical modeling approach shows more discriminating results while being computationally efficient. Along with artificial neural networks and time series models, results of the graphical model are used for formulating a variational relevance vector machine to predict monthly streamflows and perform probabilistic classification of hydrologic droughts in the study watershed. The parsimonious models developed for prediction at different lead times performed as well as the non-parsimonious models during both the calibration and testing periods. Drought forecasting for the study watershed at 1-month lead time was performed using only the two selected predictors, soil moisture and precipitation anomalies, and the model performance was evaluated. The graphical model shows promise as a tool for predictor selection and as an aid to parsimonious model development in statistical hydrology.
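The idea of selecting predictors through conditional independence can be illustrated with a minimal sketch. It assumes a Gaussian graphical model in which edge strengths are partial correlations read from the inverse covariance matrix; the abstract does not state the exact estimator used in the study, so this is an illustration and not the authors' implementation. The function names, the threshold of 0.1, the variable names, and the synthetic data are all hypothetical.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation of every pair of columns given all other columns.

    Assumes jointly Gaussian variables, so a zero partial correlation
    corresponds to conditional independence (no edge in the graph).
    """
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


def select_predictors(data, names, target, threshold=0.1):
    """Return the variables directly connected to `target` in the graph.

    `threshold` is an illustrative cut-off on |partial correlation|,
    not a value taken from the paper.
    """
    pcorr = partial_correlations(data)
    t = names.index(target)
    return [name for j, name in enumerate(names)
            if j != t and abs(pcorr[t, j]) > threshold]


# Hypothetical synthetic data: flow depends directly on precip and soil;
# proxy is related to flow only through precip.
rng = np.random.default_rng(0)
n = 5000
precip = rng.standard_normal(n)
soil = rng.standard_normal(n)
proxy = precip + 0.5 * rng.standard_normal(n)
flow = precip + soil + 0.5 * rng.standard_normal(n)

names = ["flow", "precip", "soil", "proxy"]
data = np.column_stack([flow, precip, soil, proxy])

pcorr = partial_correlations(data)
selected = select_predictors(data, names, target="flow")
print(selected)
```

In this toy setting the proxy variable is marginally correlated with flow but carries no information once precipitation is known, so it receives no edge to the flow node and is dropped from the predictor set.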
Acknowledgments
Studies were supported in part by the National Science Foundation under Grant AGS 1025430, and by USDA NIFA award number 2011-67019-21122. This support is gratefully acknowledged. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the USDA.
Cite this article
Ramadas, M., Maity, R., Ojha, R. et al. Predictor selection for streamflows using a graphical modeling approach. Stoch Environ Res Risk Assess 29, 1583–1599 (2015). https://doi.org/10.1007/s00477-014-0977-1
DOI: https://doi.org/10.1007/s00477-014-0977-1