
Using Data Mining to Understand Drinking Water Advisories in Small Water Systems: a Case Study of Ontario First Nations Drinking Water Supplies



Although access to safe drinking water is widely assumed to be universal, small drinking water systems in many countries continue to experience an unacceptably large number of drinking water advisories (DWAs). The goal of this research is to describe novel data mining tools that identify the factors contributing to DWAs in small drinking water systems. A dataset containing information on First Nations drinking water systems in the Province of Ontario, Canada, is used for the case study. A decision tree classifier (one of the fastest and most versatile predictive modeling algorithms currently available for data mining) visually maps the relationship between system characteristics (e.g., source water, system age, and operator certification) and DWA likelihood. The developed model achieves an overall accuracy of 71% during repeated cross-validation of predictive performance and can help prioritize future expenditures aimed at proactively reducing the risk of delivering compromised water.
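To make the abstract's workflow concrete, the following is a minimal pure-Python sketch of a decision tree on categorical system characteristics, scored by repeated k-fold cross-validation. It is not the authors' model or data: the records are synthetic, and the feature names (`source`, `certified`, `age`), the risk weights in `make_synthetic`, the depth limit, and the fold and repeat counts are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most common label (ties broken by first occurrence)."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, features, depth=0, max_depth=3):
    """Grow a tree greedily, splitting on the feature with lowest weighted Gini."""
    if depth == max_depth or len(set(labels)) == 1 or not features:
        return majority(labels)
    best = None
    for f in features:
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[f], []).append(y)
        score = sum(len(g) / len(labels) * gini(g) for g in groups.values())
        if best is None or score < best[0]:
            best = (score, f)
    score, f = best
    if score >= gini(labels):  # no split reduces impurity: make a leaf
        return majority(labels)
    node = {"feature": f, "default": majority(labels), "children": {}}
    rest = [x for x in features if x != f]
    for value in set(row[f] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[f] == value]
        node["children"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx],
            rest, depth + 1, max_depth)
    return node

def predict(tree, row):
    """Walk from the root to a leaf; unseen values fall back to the node default."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree

def repeated_cv_accuracy(rows, labels, features, k=5, repeats=10, seed=0):
    """Mean held-out accuracy over `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    accs = []
    for _ in range(repeats):
        rng.shuffle(idx)
        for fold in (idx[i::k] for i in range(k)):
            held = set(fold)
            train = [i for i in idx if i not in held]
            tree = build_tree([rows[i] for i in train],
                              [labels[i] for i in train], features)
            hits = sum(predict(tree, rows[i]) == labels[i] for i in fold)
            accs.append(hits / len(fold))
    return sum(accs) / len(accs)

def make_synthetic(n=200, seed=1):
    """Synthetic systems; the risk weights are invented, not taken from the study."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = {"source": rng.choice(["ground", "surface"]),
               "certified": rng.choice(["yes", "no"]),
               "age": rng.choice(["new", "old"])}
        risk = (0.2 + 0.3 * (row["source"] == "surface")
                + 0.3 * (row["certified"] == "no"))
        rows.append(row)
        labels.append("DWA" if rng.random() < risk else "no DWA")
    return rows, labels

if __name__ == "__main__":
    rows, labels = make_synthetic()
    acc = repeated_cv_accuracy(rows, labels, ["source", "certified", "age"])
    print(f"mean repeated-CV accuracy: {acc:.2f}")
```

Repeating the k-fold split with fresh shuffles averages out the luck of any single partition, which matters for small datasets such as an inventory of small water systems. The nested dictionaries returned by `build_tree` can be read as the kind of branching diagram the abstract describes.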





The authors are grateful to RES’EAU-WaterNET, the University of Guelph, and the Natural Sciences and Engineering Research Council of Canada (NSERC) for financial support.

Author information

Correspondence to Richard Harvey.


Cite this article

Harvey, R., Murphy, H.M., McBean, E.A. et al. Using Data Mining to Understand Drinking Water Advisories in Small Water Systems: a Case Study of Ontario First Nations Drinking Water Supplies. Water Resour Manage 29, 5129–5139 (2015). https://doi.org/10.1007/s11269-015-1108-6



Keywords

  • Data mining
  • Decision tree
  • Drinking water advisory
  • First Nations
  • Water