Abstract
Understanding the dynamics of water quality in a water body is vital for the sustainability of water resources, and investigating spatio-temporal changes in the dominant water quality parameters (WQPs) is critical for proposing appropriate treatment. Traditionally, concentrations of WQPs have been measured through intensive fieldwork. Many studies have also attempted to retrieve WQP concentrations from satellite images using regression-based methods. However, the relationship between WQPs and satellite data is too complex to be modeled accurately by simple regression-based methods. Our study develops a machine learning model for mapping the concentrations of dominant optical and non-optical WQPs such as electrical conductivity (EC), pH, temperature (Temp), total dissolved solids (TDS), silicon dioxide (SiO₂), and dissolved oxygen (DO). To this end, we developed a remote sensing framework based on extreme gradient boosting (XGBoost) and multi-layer perceptron (MLP) regressors with optimized hyperparameters (HPs) to quantify concentrations of different WQPs from Landsat-8 satellite imagery. We evaluated six years of satellite data covering the reach from Ankinghat (upstream) to Chopan (downstream), comprising 20 stations under the Central Water Commission (CWC) in the Middle Ganga Basin, to characterize the trends of dominant physico-chemical WQPs across the four clusters identified in our previous study. The XGBoost and MLP regression models, fitted between measured WQPs and the reflectance of the pixels corresponding to the sampling stations, yielded coefficients of determination (R²) in the range of 0.88–0.98 for XGBoost and 0.72–0.97 for MLP, with bands B1–B4 and their ratios giving the more consistent results.
These findings indicate that reliable models for estimating the spatio-temporal variations of physico-chemical and biological WQPs can be developed from a small number of in-situ measurements. Models generated from Landsat-8 could therefore facilitate the environmental, economic, and social management of any waterbody.
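As a rough illustration of two ingredients named in the abstract, the sketch below builds predictor features from the reflectance of bands B1–B4 and their ratios, and computes the coefficient of determination for a set of predictions. It is only a minimal sketch under stated assumptions: the abstract does not say which ratios were used, so taking every pairwise ratio is an assumption, and the function names `band_ratio_features` and `r_squared` are hypothetical. The model-fitting step (XGBoost or MLP) is deliberately left out.

```python
from itertools import combinations

# Landsat-8 bands named in the abstract as the more consistent predictors.
BANDS = ("B1", "B2", "B3", "B4")


def band_ratio_features(reflectance):
    """Return the four band reflectances plus every pairwise ratio Bi/Bj (i < j).

    Assumption: using all six pairwise ratios is illustrative only; the
    abstract does not list the ratios actually used. Denominator bands are
    assumed to have non-zero reflectance.
    """
    features = {band: reflectance[band] for band in BANDS}
    for numerator, denominator in combinations(BANDS, 2):
        name = f"{numerator}/{denominator}"
        features[name] = reflectance[numerator] / reflectance[denominator]
    return features


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up reflectance values for one station pixel, not data from the study.
    pixel = {"B1": 0.10, "B2": 0.20, "B3": 0.40, "B4": 0.50}
    for name, value in band_ratio_features(pixel).items():
        print(f"{name}: {value:.3f}")
```

Each sampling station would contribute one such feature vector per image date, paired with the measured WQP value, and `r_squared` would then be evaluated on the predictions of whichever regressor is fitted.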
Data availability
The data that support the findings of this study are available from the Central Water Commission (CWC), Middle Ganga Division (MGD I&II). However, restrictions apply to the availability of these data, which were used under license for the current study and are therefore not publicly available.
Acknowledgements
The Landsat images were downloaded from http://www.earthexplorer.usgs.gov. The authors acknowledge the Central Water Commission (CWC), Middle Ganga Division (MGD I&II), for providing the water quality data.
Author information
Contributions
Ashwitha Krishnaraj: carried out the whole work.
Dr. Ramesh Honnasiddaiah: corrected and edited part of the manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Responsible Editor: Xianliang Yi
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Krishnaraj, A., Honnasiddaiah, R. Remote sensing and machine learning based framework for the assessment of spatio-temporal water quality in the Middle Ganga Basin. Environ Sci Pollut Res 29, 64939–64958 (2022). https://doi.org/10.1007/s11356-022-20386-9
DOI: https://doi.org/10.1007/s11356-022-20386-9