Comparisons in Drinking Water Systems Using K-Means and A-Priori to Find Pathogenic Bacteria Genera

  • Conference paper
Information Science and Applications 2018 (ICISA 2018)

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 514))

Abstract

As water resources have become limited, cases of illness related to waterborne pathogens have increased. With this in mind, studies and investigations are needed on both alternative water sources, such as groundwater, and common water sources, such as surface water, to ensure that the water provided to consumers is safe to drink. This paper compares bacterial genera in ground and surface source waters for drinking water systems, based on 16S rRNA profiling and machine learning methods such as K-means and A-Priori. 16S rRNA profiling can be used to identify and differentiate between bacterial genera. It is important not only to identify the specific bacterial genera found in water sources, but also to examine their relative abundance in order to determine whether groundwater is a more viable drinking water source than surface water. Based on recent incidences of waterborne illness reported across South Africa, five key bacterial indicators of water quality and safety are identified, all of which can be found in both groundwater and surface water. Data captured from the collected samples is used to determine the abundance of each bacterium in each water sample in a more efficient and effective manner. The five indicators outlined for this project are E. coli (Escherichia), Legionella, Haemophilus, Bdellovibrio and Streptococcus. The dataset contains bacteria from both ground and surface waters; using dimensionality reduction techniques, its many parameters can be reduced for more efficient processing. The algorithms used are K-means, to cluster the data and allow for interpretation; the A-Priori algorithm, to find frequent itemsets from which association rules can be derived, allowing patterns to be identified; and a support vector machine (SVM), to predict the error of new data coming into a stream.
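To make the A-Priori step concrete, the following is a minimal sketch of frequent-itemset mining and rule confidence, assuming each water sample is reduced to the set of indicator genera detected in it. The sample sets, the `min_support` threshold and the function names are hypothetical illustrations, not the authors' data or implementation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in any transaction.
    current = [frozenset([item]) for item in {i for t in transactions for i in t}]
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate itemset.
        support = {c: sum(1 for t in transactions if c <= t) / n for c in current}
        level = {c: s for c, s in support.items() if s >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-itemsets, then prune any
        # candidate that has an infrequent k-subset (the A-Priori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = [c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical samples: genera detected in each water sample (made-up data).
samples = [
    {"Escherichia", "Legionella", "Streptococcus"},
    {"Escherichia", "Legionella"},
    {"Escherichia", "Bdellovibrio"},
    {"Legionella", "Streptococcus"},
    {"Escherichia", "Legionella", "Bdellovibrio"},
]

frequent = apriori(samples, min_support=0.6)
for itemset, support in sorted(frequent.items(), key=lambda kv: len(kv[0])):
    print(sorted(itemset), round(support, 2))
print(round(confidence(frequent, {"Escherichia"}, {"Legionella"}), 2))
```

A rule such as Escherichia -> Legionella is only as meaningful as the support threshold behind it; in practice the threshold would be tuned to the number of samples available.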
Using the results produced by the algorithms, it was found that the mean relative abundance of the pathogenic organisms was higher in groundwater than in surface water. The results indicate that automated, scalable water viability assessment is feasible using the proposed methods, which makes it an attractive avenue of research as the Internet of Things (IoT) develops in this domain.
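The comparison of mean relative abundance between source types, together with the K-means clustering step, can be sketched as follows. This is an illustration only: the read counts are made-up values, the two-genus feature vector is a simplification, and the deterministic farthest-point initialisation is an assumption chosen for reproducibility rather than the procedure used in the paper.

```python
def relative_abundance(counts):
    """Convert raw read counts per genus into fractions of the sample total."""
    total = sum(counts.values())
    return {genus: n / total for genus, n in counts.items()}


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, cluster label per point)."""
    # Assumption: deterministic farthest-point initialisation.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda i: sq_dist(p, centroids[i]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            updated.append(tuple(sum(d) / len(members) for d in zip(*members))
                           if members else centroids[i])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


GENERA = ("Escherichia", "Legionella")  # subset of indicators, for brevity

# Hypothetical read counts per sample (made-up values, not the paper's data).
samples = [
    ("ground", {"Escherichia": 30, "Legionella": 20, "other": 50}),
    ("ground", {"Escherichia": 25, "Legionella": 25, "other": 50}),
    ("surface", {"Escherichia": 5, "Legionella": 10, "other": 85}),
    ("surface", {"Escherichia": 10, "Legionella": 5, "other": 85}),
]

points = [tuple(relative_abundance(c)[g] for g in GENERA) for _, c in samples]
centroids, labels = kmeans(points, k=2)

# Mean relative abundance of the indicator genera for each source type.
mean_by_source = {}
for source in ("ground", "surface"):
    totals = [sum(p) for (s, _), p in zip(samples, points) if s == source]
    mean_by_source[source] = sum(totals) / len(totals)

print(labels)
print(mean_by_source)
```

With real 16S data the feature vector would cover all genera of interest, and the cluster labels would be compared against the known source type of each sample to see whether the two water sources separate.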

Author information

Corresponding author

Correspondence to Dustin van der Haar.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Moodley, T., van der Haar, D. (2019). Comparisons in Drinking Water Systems Using K-Means and A-Priori to Find Pathogenic Bacteria Genera. In: Kim, K., Baek, N. (eds) Information Science and Applications 2018. ICISA 2018. Lecture Notes in Electrical Engineering, vol 514. Springer, Singapore. https://doi.org/10.1007/978-981-13-1056-0_36

  • DOI: https://doi.org/10.1007/978-981-13-1056-0_36

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1055-3

  • Online ISBN: 978-981-13-1056-0

  • eBook Packages: Engineering, Engineering (R0)
