Study of physicochemical parameters and wetland water quality assessment by using Shannon’s entropy

In water quality monitoring programs, balancing the information required against the information actually collected involves careful judgment and management. The present study explores a few essential aspects of water quality monitoring programs using Shannon’s entropy, with case studies on a few lakes and wetlands in North Guwahati, Assam (India). First, the loss of information in traditional water quality indices (WQIs) is addressed through entropy weighted WQIs (EWQIs), which take into account the randomness of the data sets and remove the error introduced by the subjective judgments of experts in assigning parameter weights. This concept was extended to the quantification of heavy metals. Multi-criteria decision-making methods (MCDMs) such as TOPSIS, which utilize entropy weights and rough set theory, were then introduced to give a reliable and unbiased description of the overall pollution level at each sampling location. The study should help agencies responsible for water supply and water pollution control, since it offers a tool that is easy to understand and straightforward to apply.


Introduction
Clean and safe water is of vital importance to every nation. Freshwater of adequate quality and quantity is a necessity for sustainable development. A major portion of the available water on Earth is present in the form of surface water sources. Surface water sources play a vital role in social progress and economic development, and ancient civilizations prospered along them (Priscoli 2000; Bu et al. 2010). Lakes and wetlands play a key role in the environment, providing goods and services to the local community such as staple food plants, flood control and inland fisheries (Maltby 2013), and acting as carbon sinks, water purifiers and hosts to biologically diverse ecosystems. They also contribute substantially to the recharge of groundwater aquifers (Winter 1999). In fact, the growth of a nation depends on the conservation and productive utilization of its water resources.
At present, rapid population growth, urbanization and swift industrialization have deteriorated the quality of surface water sources (Simeonov et al. 2003; Singh et al. 2004; Gradilla-Hernández et al. 2020; Teodorof et al. 2021). Surface water sources constitute the pathway through which a wide variety of living organisms are exposed to harmful elements of either anthropogenic or geological origin. Discharges from sources such as untreated domestic and industrial wastewater, agricultural run-off and leachate from solid waste dumpsites have not only harmed aquatic bodies but have also introduced numerous toxic trace elements (Sin et al. 2001; Armitage et al. 2007; Gradilla-Hernández et al. 2020). Therefore, comprehensive and accurate assessments of water quality have become indispensable for government and local administrations (Zhang et al. 2022). Consistent monitoring is essential for such assessments and for the execution of water resource management policies.
Consequently, the evaluation of water quality from monitored data has become a serious concern in recent years (Ongley 1998). Monitoring programs often generate large data sets (Dixon and Chiswell 1996; Iscen et al. 2008). These need to be transformed into simpler numerical scores which can be easily understood by policymakers as well as the local community. Water quality indices (WQIs) integrate most of these water quality parameters into a single numerical score capable of describing the water quality at a particular site and at a particular time (Kaurish and Younos 2007). WQIs reveal the water quality parameters exceeding their standards and can thus demonstrate the effectiveness of stream restoration efforts (Gazzaz et al. 2012; Dash and Kalamdhad 2021). Another essential aspect of monitoring programs is prioritizing decisions to implement policies. In such situations, multi-criteria decision-making methods (MCDMs) are usually considered. Most professionals engaged in the design and operation of monitoring programs are familiar with the symptoms of "data rich but information poor" monitoring systems, which generate large amounts of data in a discrete form but are often incapable of describing water quality trends in an area (Ward et al. 1986). It is essential to close the gap between the information required on water quality and the information gained by the monitoring systems. In recent years, several researchers have focused on the application of "Shannon entropy" in water quality assessment (Dash and Kalamdhad 2021). Shannon entropy represents the average level of information or uncertainty associated with a variable's possible outcomes. For a random variable X, Shannon's entropy is defined as
$$H(X) = -\sum_{i=1}^{n} p_i \log p_i \tag{1}$$
where "n" is the number of different outcomes and p_i is the probability of the ith outcome. The minimum value of this entropy, i.e., 0, occurs for a constant random variable. For a constant random variable, the probability of its single outcome is 1 and there is no uncertainty associated with the event. Shannon entropy attains its maximum value when all the possible outcomes have equal probability, i.e., 1/n.
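The two limiting cases described above can be checked numerically. The following is a minimal sketch, assuming natural logarithms; the function name `shannon_entropy` and the example distributions are illustrative and do not come from the study.

```python
import math

def shannon_entropy(probs, base=math.e):
    """Return H = -sum(p * log(p)) for a discrete probability distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p == 0 are skipped: p * log(p) tends to 0 as p -> 0.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Illustrative distributions (assumed values, not data from the study).
constant = shannon_entropy([1.0])                # one certain outcome
uniform = shannon_entropy([0.25] * 4)            # four equally likely outcomes
skewed = shannon_entropy([0.7, 0.1, 0.1, 0.1])   # one dominant outcome
```

Here `constant` evaluates to zero, `uniform` equals log(4), the maximum for four outcomes, and `skewed` lies between the two.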
With the advent of information theory, the term entropy also found use in the quantification, storage and communication of information. Shannon entropy is a measure of the unpredictability of a random event, or equivalently the average information derived from its occurrence. In recent years, Shannon entropy has had diverse applications in the fields of hydraulic engineering and environmental engineering, as a large number of random processes predominate in our environment (Singh 2013, 2014). In the light of Shannon entropy, several techniques have emerged, such as the development of entropy-based quality indices to address the shortcomings of conventional WQIs, the resolution of conflicts between different WQIs by entropy-based MCDMs, and the optimization of online monitoring systems to refine the information gained from monitoring programs (Singh et al. 2018a, b).
The overall aim of the present study is to identify the best possible approach for water quality assessment in order to provide effective and efficient information at optimum cost. The following objectives were defined in support of this aim: (a) to assess the surface water quality of lakes and wetlands located in North Guwahati, Assam (India); and (b) to explore the application of Shannon entropy in the water quality index and TOPSIS for the evaluation of surface water quality.

Study area
The study is based on a few important lakes and wetlands in North Guwahati, Assam (India) (Fig. 1). North Guwahati hosts numerous lakes and wetlands situated on the north bank of the Brahmaputra River. North Guwahati experiences a subtropical climate. Temperatures in the region range approximately from 8 to 40 °C. The annual average rainfall in the region ranges from 1500 to 2600 mm, with a relative humidity of 76%. The locality has, however, undergone rapid and uncontrolled development, including the establishment of industries, construction activities and the conversion of agricultural and forest land for other development purposes. Consequently, there is a need for proper monitoring of these lakes and wetlands to support strategies for their preservation and restoration, as they are of immense biological and environmental importance.

Sampling strategies and analysis
A total of 20 sampling locations were identified for the collection of surface water samples for the analysis of water quality parameters as shown in Fig. 1. Surface water samples were collected in the pre-monsoon and post-monsoon period during 2017-2018. Standard methods (APHA 2012) have been followed throughout the analysis (Table 1). A quality control procedure was maintained throughout, including recalibration of instruments. Reagents were prepared as recommended by APHA (2012). All chemicals and reagents used in the analyses were of analytical grade unless otherwise stated. Deionized water was used for all dilutions. Standard solutions were prepared by diluting the stock solutions.

Entropy weighted water quality index (EWQI)
EWQI is an enhancement over the existing traditional WQIs. The steps involved in the calculation of EWQI are as follows:
• A matrix was developed with all "m" water samples (i = 1, 2, …, m) and "n" measured parameters (j = 1, 2, …, n).
• To remove the error caused by different dimensions and units, the initial matrix was converted to the standard grade matrix Y, with elements y_ij.
• Shannon entropy was calculated for each parameter.
• The quality rating scale for each parameter was determined.
• EWQI was calculated from the entropy weights and the quality ratings.
The EWQI ranges are classified in Table 2. Water supplies in the good or excellent category would be able to sustain a high diversity of aquatic life. Additionally, the water would be fit for all forms of recreation, including those involving direct contact with the water. Based on entropy weights, the entropy weighted heavy metal contamination index (EHCI) has been proposed, with concentrations pertaining exclusively to heavy metals (Singh et al. 2019).
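The steps above can be sketched in code. This is a minimal illustration under stated assumptions rather than the exact procedure of the study:

- min-max normalisation is assumed for the standard grade matrix;
- a small offset `eps` is assumed, to avoid taking the logarithm of zero;
- weights are assumed proportional to 1 - e_j;
- the quality rating is assumed to be q_j = 100 × C_j / S_j.

These are commonly used choices in entropy weighting. Parameters such as pH and DO normally need a different rating formula, which is omitted here. The function names, sample values and standards are hypothetical placeholders.

```python
import math

def entropy_weights(samples, eps=1e-4):
    """Entropy weight of each parameter (column) of an m x n data matrix."""
    m, n = len(samples), len(samples[0])
    diversities = []
    for j in range(n):
        col = [row[j] for row in samples]
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # guard against a constant column
        # Assumed min-max normalisation, shifted by eps so log(0) never occurs.
        y = [(x - lo) / span + eps for x in col]
        total = sum(y)
        p = [v / total for v in y]
        e_j = -sum(v * math.log(v) for v in p) / math.log(m)
        diversities.append(max(0.0, 1.0 - e_j))
    total_div = sum(diversities)
    if total_div <= 1e-12:  # every column constant: fall back to equal weights
        return [1.0 / n] * n
    return [d / total_div for d in diversities]

def ewqi(samples, standards, eps=1e-4):
    """Return (weights, scores), one entropy weighted score per sample."""
    weights = entropy_weights(samples, eps)
    scores = []
    for row in samples:
        # Assumed quality rating: concentration relative to its standard, in %.
        ratings = [100.0 * c / s for c, s in zip(row, standards)]
        scores.append(sum(w * q for w, q in zip(weights, ratings)))
    return weights, scores

# Hypothetical samples: columns are EC (uS/cm), turbidity (NTU), TDS (mg/L).
samples = [
    [214.0, 2.3, 22.5],
    [480.0, 30.0, 150.0],
    [741.0, 74.7, 277.5],
]
standards = [300.0, 5.0, 500.0]  # placeholder limits, not quoted from BIS
weights, scores = ewqi(samples, standards)
```

Because the weights sum to one, a sample whose concentrations all equal their standards scores exactly 100 in this sketch, which gives a convenient reference point when reading the index.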

TOPSIS
TOPSIS is a multi-criteria decision-making method (MCDM) for ordering alternatives. It is an appropriate tool for choosing among a number of possible alternatives by determining their Euclidean distances from a desired ideal best and an undesired ideal worst. There are two types of conditions (positive and negative) in this approach. Positive conditions are those that should be increased, and negative conditions are those that need to be decreased in order to mitigate risk. The TOPSIS model given by Hwang et al. (1993) can be implemented in the following manner:
• The sampling locations (alternatives) and the parameters (criteria) were specified for the wetlands to which ranks were to be assigned according to their pollution status.
• The ratings of the locations with respect to the parameters were arranged in a matrix X, where x_ij is the value of the ith alternative for the jth criterion.
• The weight of each water quality parameter was evaluated on the basis of Shannon entropy as per Eq. (8). The entropy E_j of the jth parameter satisfies 0 ≤ E_j ≤ 1, where an index with higher entropy has greater variation. The weight of each water quality parameter was then calculated from its entropy.
• The decision matrix was normalized using the vector normalization method.
• A weighted normalized decision matrix (V) was developed.
• The ideal best (IB) and the ideal worst (IW) of the alternatives were calculated.
• The Euclidean distance of each alternative from the IB (d_i^+) and the IW (d_i^-) was calculated.
• The performance score (PS) of each alternative was calculated.
• The alternatives were finally graded according to their PS.
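The procedure above can be sketched as follows. This is an illustration of textbook entropy weighted TOPSIS, not necessarily the exact formulation used in the study. It assumes:

- weights proportional to 1 - E_j, with p_ij taken as x_ij divided by its column sum;
- vector normalisation of the decision matrix;
- the performance score d^- / (d^+ + d^-).

Pollutant concentrations are treated as negative conditions and DO as a positive condition. In this sketch the lowest score therefore marks the most polluted location; the ranking convention of the study may differ. All names and data values are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Entropy weights per column; assumes non-negative data, positive column sums."""
    m, n = len(matrix), len(matrix[0])
    diversities = []
    for j in range(n):
        col = [row[j] for row in matrix]
        total = sum(col)
        p = [x / total for x in col]
        e_j = -sum(q * math.log(q) for q in p if q > 0) / math.log(m)
        diversities.append(max(0.0, 1.0 - e_j))
    s = sum(diversities)
    return [d / s for d in diversities] if s > 0 else [1.0 / n] * n

def topsis(matrix, weights, benefit):
    """Performance score d- / (d+ + d-) for each alternative (row)."""
    n = len(weights)
    # Vector normalisation: divide each column by its Euclidean norm.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    cols = list(zip(*v))
    # Ideal best: largest value for positive conditions, smallest for negative ones.
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_plus = math.dist(row, best)
        d_minus = math.dist(row, worst)
        denom = d_plus + d_minus
        scores.append(d_minus / denom if denom else 0.5)
    return scores

# Hypothetical locations; columns are BOD5 (mg/L), turbidity (NTU), DO (mg/L).
matrix = [
    [8.0, 5.0, 9.0],
    [20.0, 40.0, 7.0],
    [33.0, 75.0, 5.8],
]
benefit = [False, False, True]  # only DO is a positive condition
weights = entropy_weights(matrix)
scores = topsis(matrix, weights, benefit)
# Indices ordered from most polluted (lowest score) to least polluted.
order = sorted(range(len(scores)), key=lambda i: scores[i])
```

In this toy data set the first location is best on every criterion and the third is worst on every criterion, so they coincide with the ideal best and the ideal worst, respectively.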

Results and discussion
The statistical summary of the observed parameters in the pre-monsoon period is shown in Tables 3 and 4. Concentrations of water quality parameters are expressed in mg/L, with the exceptions of pH, EC (µS/cm) and turbidity (NTU). The physicochemical parameters during the pre-monsoon period showed that at the sampling location LKBM15, the BOD₅ concentration exceeded the discharge standards for inland surface waters. This was primarily attributed to the discharge of untreated domestic sewage, leaves and woody debris, dead plants and animal manure (Bhateria and Jain 2016). The BOD₅ values were in the range of 7.80-33.30 mg/L. The pH values were in the range of 6.51-7.86, and the DO levels were healthy in the entire area, varying between 5.77 and 10.07 mg/L. The EC and turbidity values were, however, higher than the permissible drinking water standards prescribed by BIS in the majority of the locations, varying between 214-741 µS/cm and 2.30-74.70 NTU, respectively (BIS IS 10500: 2012). High EC indicates the abundance of cations and anions in the surface water. High turbidity is mainly caused by floating algae, plant fragments and soil washed into the water from the banks. The TDS concentrations in the surface water samples varied between 22.50 and 277.50 mg/L. The abundance of the major cations, namely Na⁺, K⁺, Ca²⁺ and Mg²⁺, was in the order Na⁺ > K⁺ > Ca²⁺ > Mg²⁺. The fluoride (F⁻) concentrations were low, suggesting that the surface water sources can be used as an alternative to groundwater sources, which have high concentrations of F⁻ in the North-East region of India. Cl⁻ and SO₄²⁻ were the most enriched anions. Fertilizer and wastewater are mainly responsible for elevated chloride concentrations in lakes. High chloride concentrations influence the lake ecology and ecosystem services such as fisheries (Dugan et al. 2017).
Elevated concentrations of Fe were observed at all the sampling locations of North Guwahati, accompanied by slightly higher concentrations of Mn and Cu in a few locations. The concentration of Fe was in the range of 0.43-6.25 mg/L, and Mn and Cu were in the range of 0.03-0.64 and 0.03-0.19 mg/L, respectively. These concentrations were in excess of their permissible drinking water standards prescribed by BIS (BIS IS 10500: 2012).
Concentrations of Pb were below the detection limit (BDL), with low concentrations of Cr and Zn in the region.
The physicochemical parameters for the post-monsoon period are shown in Table 5. In the post-monsoon period, BOD₅ ranged from 5.70 to 14.70 mg/L. These values contrast sharply with the BOD₅ values of the pre-monsoon period and demonstrate the effects of dilution and water depth on the surface water quality of North Guwahati, Assam. In comparison with the pre-monsoon period, the lakes and wetlands in the region had a higher water depth, which led to the dilution of any incoming organic pollution load. The pH values were in the range of 6.67-8.52, and the DO levels in the sampling locations were healthy, varying from 6.23 to 9.24 mg/L. The highest EC value recorded in this period was 650 µS/cm, and the EC values exceeded their permissible drinking water standards in the majority of the sampling locations. The turbidity and TDS values in the post-monsoon period were in the range of 2.7-46 NTU and 90-320 mg/L, respectively. The cations in order of abundance were Na⁺ > Ca²⁺ > K⁺ > Mg²⁺. No significant changes were observed in the fluoride and nitrate concentrations between the two periods. Cl⁻ and SO₄²⁻ were the most enriched anions. The observed values of heavy metals in the post-monsoon period are summarized in Table 6. In the post-monsoon period, Pb and Cu concentrations in excess of their prescribed BIS drinking water standards were observed in most of the sampling locations. The Pb and Cu concentrations were in the range of 0.08-0.89 and 0.08-2.35 mg/L, respectively. In a few sampling locations, Cr concentrations also exceeded their permissible drinking water limits. Furthermore, concentrations of Fe and Mn were relatively lower in the post-monsoon period in comparison with the pre-monsoon period. Fe and Mn concentrations in the post-monsoon period varied in the range of 0.01-1.50 and 0-0.48 mg/L, respectively.
These observations shed significant light on the possible effects of leaching and runoff from nearby roadside dumps and agricultural lands. In such circumstances, the leaching of heavy metals was possibly far more pronounced than the effect of dilution, contributing to excessive heavy metal concentrations.
Results demonstrate that these shallow lakes and wetlands are highly vulnerable to human-induced activities.
Influences of human activity on water quality were greater than those of natural factors (Han et al. 2020).
EWQI has been applied to evaluate the suitability of the water for drinking purposes. EWQI is a step forward from traditional WQIs, which rely on personal judgments and expert opinion to assign weights to parameters (Singh et al. 2019). The EWQI of all the sampling locations in North Guwahati for the pre-monsoon and post-monsoon periods is shown in Table 7. The spatial variability of the water quality as computed by EWQI with respect to physicochemical parameters is illustrated in Fig. 2. In the pre-monsoon and post-monsoon periods, the EWQI varied in the range of 60.98-205.45 and 55.12-135.67, respectively. The EWQI values signified that the water quality was worse in the pre-monsoon period, varying from "good" to "extremely poor," whereas the water quality in the post-monsoon period varied from "good" to "average." The highest EWQI was evaluated at LKBM11, which had high EC and turbidity values and the lowest DO level in the pre-monsoon period. The sampling location LKBM11 also had the highest TH and TA values in the pre-monsoon period. In the pre-monsoon period, 30% of the sampling locations had a water quality of "poor" or "extremely poor" and 50% of the sampling locations had a water quality pertaining to "good" or "excellent" grades. In the post-monsoon period, the highest EWQI value was evaluated at LKBM8, which had high values of EC and turbidity relative to their permissible drinking water standards (BIS IS 10500: 2012). Due to the effects of dilution and increased water depth in the post-monsoon period, the water quality of the lakes and wetlands in the region was better than in the pre-monsoon period. In the post-monsoon period, 75% of the sampling locations had water quality of "excellent" or "good." None of the sampling locations had water quality of "poor" or "extremely poor." Most of the lakes and wetlands in North Guwahati serve as a hub for the domestic activities of nearby communities.
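Mapping an index value to a quality grade is a simple threshold lookup, sketched below. Table 2 is not reproduced in this text, so the class boundaries (50, 100, 150 and 200) are an assumption. They were chosen because they place the extreme values reported above in the grades that the text names; the function name `classify` is hypothetical.

```python
import bisect

# Assumed class boundaries; the actual ranges are those of Table 2.
BOUNDS = [50, 100, 150, 200]
LABELS = ["excellent", "good", "average", "poor", "extremely poor"]

def classify(score):
    """Return the quality grade whose assumed range contains the score."""
    return LABELS[bisect.bisect_right(BOUNDS, score)]

# Extremes of the EWQI ranges reported in the text.
pre_monsoon = [classify(60.98), classify(205.45)]
post_monsoon = [classify(55.12), classify(135.67)]
```

With these assumed boundaries, the pre-monsoon extremes fall in "good" and "extremely poor" and the post-monsoon extremes in "good" and "average", matching the description above.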
EHCI allows for a more accurate assessment of water quality in terms of heavy metals (Singh et al. 2020). The EHCI of all the sampling locations in North Guwahati for the pre- and post-monsoon periods is shown in Table 8. In the pre- and post-monsoon periods, the EHCI varied in the range of 72.43-469.72 and 269.89-1591.84, respectively. In the pre-monsoon period, the highest EHCI was evaluated at the sampling location LKBM11, which had the highest concentration of Fe (6.25 mg/L) among all the sampling locations. The concentration of Fe in LKBM11 was about 20 times its permissible drinking water limit as prescribed by BIS (0.3 mg/L). Although the concentrations of the other metals in this location were well within their permissible limits, the contribution of Fe to the EHCI was 90.34%. The EHCI values in the pre-monsoon period suggested that the water quality of lakes and wetlands in North Guwahati varied from "good" to "extremely poor," with 75% of the sampling locations having water quality of "poor" or "extremely poor." Only 15% of the sampling sites had water of desirable quality. However, exceptionally high EHCI values were observed in the post-monsoon period, with the highest EHCI evaluated at LKBM1. During the post-monsoon period, LKBM1 had the highest concentration of Pb among all sampling stations, accompanied by elevated concentrations of Cu. Furthermore, the EHCI values suggested that all the sampling locations in the post-monsoon period had "extremely poor" water quality with respect to excess heavy metal concentrations. Although the EWQI values were low in the post-monsoon period, the high EHCI values provided new insight into the leaching and run-off of heavy metals from roadside dumps, roads and agricultural lands.
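The Fe exceedance quoted above, and the idea of one metal's contribution to a weighted index, can be illustrated with simple arithmetic. In this sketch only the Fe concentration (6.25 mg/L) and its limit (0.3 mg/L) come from the text. The helper names, and the weights and ratings passed to `contribution_shares`, are hypothetical and do not reproduce the 90.34% figure of the study.

```python
def exceedance_factor(concentration, limit):
    """Ratio of a measured concentration to its permissible limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return concentration / limit

def contribution_shares(weights, ratings):
    """Percentage share of each term w_j * q_j in the weighted sum."""
    terms = [w * q for w, q in zip(weights, ratings)]
    total = sum(terms)
    if total == 0:
        raise ValueError("weighted sum is zero")
    return [100.0 * t / total for t in terms]

# Fe at LKBM11 relative to the limit quoted in the text: about 20.8.
fe_factor = exceedance_factor(6.25, 0.3)

# Hypothetical weights and quality ratings for three metals.
shares = contribution_shares([0.5, 0.3, 0.2], [2000.0, 50.0, 20.0])
```

A single parameter with a very large rating dominates the weighted sum even when the remaining parameters are within their limits, which is the pattern described for Fe at LKBM11.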
In both the pre- and post-monsoon periods, EWQI described the surface water quality with respect to physicochemical parameters, while EHCI described the surface water quality with respect to heavy metals. In order to develop insight into the overall pollution levels in the lakes and wetlands of North Guwahati, the TOPSIS methodology was applied to all the water quality parameters (physicochemical and heavy metals) to develop overall ranks such that the highest TOPSIS rank in each period would indicate the most polluted sampling location. It is a useful tool in the decision-making process. The performance scores and TOPSIS ranks of the sampling locations for the pre- and post-monsoon periods are shown in Table 9. The spatial variability of the TOPSIS ranks is shown in Fig. 2a, b. During the pre-monsoon period, the sampling location LKBM11 was the most polluted in comparison with the other locations. This was consistent with the highest EWQI and EHCI values at this location in the pre-monsoon period. It also had the lowest DO concentration, accompanied by high values of EC and turbidity. The TH and TA values were also higher than their desirable concentrations and the highest among all the locations. The concentration of Fe was also highest at this location, with a value as high as 6.25 mg/L. The TOPSIS methodology, which is based on entropy weights and utilizes rough set theory, provides a reliable analysis (Table 9). In the post-monsoon period, it identified LKBM9 as the most polluted location. This was evident from the fact that the location had the second highest EWQI and a high EHCI value, indicating it as "extremely contaminated" by heavy metals. The ranks can serve as a basis on which policy making and restoration of the lakes and wetlands may be prioritized (Table 9).

Conclusion
In the present study, a few important aspects of water quality monitoring programs have been addressed in the light of Shannon's entropy. The water quality of a few important lakes and wetlands of North Guwahati has been evaluated with an improved water quality index, EWQI, which takes into account the uncertainties in the occurrence of physicochemical parameters. A similar index based on the principle of entropy weights, EHCI, has been suggested for the quantification of heavy metal contamination. Reliable entropy-based MCDMs such as TOPSIS have been employed in prioritizing decision making by ranking sampling sites based on their overall pollution levels. Based on the results, the following conclusions can be drawn from the study:
• In the post-monsoon period, EWQI values were relatively lower, reflecting the effects of dilution, while the EHCI values were relatively higher, suggesting the entry of heavy metals through leaching and run-off.
• In assessing the overall pollution level of each sampling location, TOPSIS worked as an effective method for ranking sampling sites for restoration efforts based on their pollution in individual periods. TOPSIS indicated that LKBM11 was the most polluted location in the pre-monsoon period and LKBM9 was the most polluted in the post-monsoon period.
This study will help policy makers decide how to allocate funds for the restoration of lakes. It will also help in comparing water quality at different locations, enforcing water quality standards and determining changes in water quality.
Although a few major aspects of water quality monitoring programs have been addressed in the present study, there is ample scope for improvement in future work.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

Data availability statement
All data generated or analyzed during this study are included in this published article.

Code availability
Not applicable.

Conflict of interest
The authors declare that they have no conflict of interest.

Ethical Approval
Not applicable.

Consent to Publish
Not applicable.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.