Data Mining for Generating Predictive Models of Local Hydrology

Applied Intelligence

Abstract

Downscaling the effects of global-scale climate variability into predictions of local hydrology has important implications for water resource management. Our research aims to identify predictive relationships that can be used to integrate solar and ocean-atmospheric conditions into forecasts of regional water flows. In recent work we developed an induction technique called second-order table compression, in which learning is viewed as a process that transforms a table of training data into a second-order table (one whose entries are sets of atomic values) with fewer rows by merging rows in consistency-preserving ways. Here, we apply second-order table compression to generate predictive models of future water inflows to Lake Okeechobee, a primary source of water supply for south Florida. We also describe SORCER, a second-order table compression learning system, and compare its performance with three well-established data mining techniques: neural networks, decision tree learning, and association rule mining. On average, SORCER gives more accurate results than the other methods, with average accuracy between 49% and 56% in predicting inflows discretized into four ranges. We discuss the implications of these results and the practical issues in assessing results from data mining models to guide decision-making.
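
The idea of merging rows into a second-order table can be illustrated with a small sketch. This is not the SORCER algorithm, whose details are not given in the abstract; it is a toy greedy procedure written under the assumption that "consistency preserving" means a merged row must never cover a training row with a different class. The attribute names, values, and class labels below are hypothetical.

```python
# Toy illustration of second-order table compression (NOT the SORCER algorithm).
# Assumption: a merge is "consistent" if the merged row covers no training row
# that carries a different class label.

def to_second_order(attrs):
    """Lift a row of atomic values to a row of singleton sets."""
    return tuple(frozenset([v]) for v in attrs)

def covers(so_row, attrs):
    """True if every atomic value falls inside the matching set entry."""
    return all(v in s for s, v in zip(so_row, attrs))

def merge(a, b):
    """Merge two second-order rows by taking the entry-wise union."""
    return tuple(x | y for x, y in zip(a, b))

def is_consistent(so_row, label, table):
    """The row may only cover training rows that share its label."""
    return all(lbl == label or not covers(so_row, attrs) for attrs, lbl in table)

def compress(table):
    """Greedily merge same-label rows while consistency is preserved."""
    rows = [(to_second_order(attrs), lbl) for attrs, lbl in table]
    changed = True
    while changed:
        changed = False
        for i in range(len(rows)):
            for j in range(i + 1, len(rows)):
                (a, la), (b, lb) = rows[i], rows[j]
                if la != lb:
                    continue
                merged = merge(a, b)
                if is_consistent(merged, la, table):
                    rows[i] = (merged, la)
                    del rows[j]
                    changed = True
                    break  # restart the scan after any successful merge
            if changed:
                break
    return rows

# Hypothetical training table: (ocean state, solar activity) -> inflow class.
table = [
    (("warm", "high"), "wet"),
    (("warm", "low"), "wet"),
    (("cold", "high"), "dry"),
    (("cold", "low"), "dry"),
    (("neutral", "high"), "wet"),
    (("neutral", "low"), "dry"),
]

compressed = compress(table)
for so_row, label in compressed:
    print([sorted(s) for s in so_row], "->", label)
```

On this toy table the six training rows compress to four second-order rows, because the two "neutral" rows cannot be merged with any other row without covering an example of the opposite class. A real system would also need a strategy for choosing among competing merges and for classifying inputs that no row covers, neither of which is attempted here.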

Cite this article

Hewett, R. Data Mining for Generating Predictive Models of Local Hydrology. Applied Intelligence 19, 157–170 (2003). https://doi.org/10.1023/A:1026005922241

  • DOI: https://doi.org/10.1023/A:1026005922241
