Introduction

Water quality monitoring is crucial in every community because it provides a reasonable estimate of the current state of water bodies and indicates their most appropriate use for human activities. A sound monitoring method starts with site characterization, that is, studying the underlying properties of the area with respect to its natural water dynamics. According to Hofmann et al. [1], a complete set of data on hydrology, hydromorphology, climatology, water physico-chemistry, sedimentology, and habitat diversity enables a detailed characterization of the stream landscape. Site characterization traditionally requires extensive field sampling and laboratory analysis, which is often tedious and expensive. Moreover, building a complete database for catchment characterization is difficult, and scientists and researchers most often make do with whatever data are at hand.

A simple yet uncompromising modeling technique is to delineate catchment boundaries so that the dynamics of hydrologic processes within each catchment can be modeled using remote sensing (RS) and geographic information system (GIS) tools. Such dynamics are governed partially by the temporal and spatial characteristics of inputs and outputs and by the land use conditions.

RS is one technology that has been valuable in cleanup efforts and shows promise in providing an alternative to field sampling methods [2] and in identifying land use information [3]. The temporal, spatial, and spectral advantages of RS imagery allow fast and repeated observations, which are prerequisites for environmental monitoring. Moreover, publicly available RS data are collected at regional scales and at temporal resolutions (i.e., repeat collection times) that are much more frequent than field sampling campaigns [4]. Various RS data have been utilized in many contamination assessment studies [5,6,7,8], including aerial photography, multispectral and hyperspectral images, and actual spectral signatures. The most basic uses of these remotely sensed data for environmental monitoring involve visual image interpretation [2], terrain modeling and analysis [9], and spectral analysis [10]. For terrain analysis, RS-derived products like digital elevation models (DEMs) are commonly used. The Shuttle Radar Topography Mission (SRTM), for example, has created an unparalleled data set of global elevations that is freely available for modeling and environmental applications [9]. Also, a growing number of researchers [11,12,13,14] have used different types of spectrometers, combined with varying methods of measurement, to obtain the spectral data of a given water body [15]. The physical and chemical characteristics of water can be determined from spectral signatures [16]. However, this method requires mathematical modeling to build analysis, simulation, and quantitative inverse relations for different water bodies or for the same body of water in different time frames [15]. Also, extracting water quality measurements directly from satellite imagery can allow rapid identification of impaired waters, potentially leading to faster responses by water agencies [4].

Most often, RS techniques are coupled with GIS as a tool for mapping and spatial analysis. GIS is increasingly popular in the broad spectrum of research and environmental monitoring. Its geostatistical mapping capabilities are commonly employed to determine hot spots of contaminated sites [17], distribution of contamination [18, 19], and identification of point and non-point sources of contamination [20]. Also, the capability of GIS to integrate various data from different sources and present environmental conditions makes it a valuable tool for informed decision-making.

Monitoring, protecting, and improving the quality of water resources is critical for targeting conservation efforts and improving the quality of the environment [13,14,15]. In response to the pressing challenge of managing water quality in line with the principle of sustainable development, researchers have developed a variety of algorithms and models to support water quality management. Most of the proposed methods employ RS techniques to monitor water quality in different types of inland water bodies using several satellite sensors [21].

However, such techniques are not yet openly used as tools in government-sponsored environmental monitoring programs and remain popular mainly in academic and research communities. In fact, the Environmental Management Bureau (EMB) 2014 Water Quality Report [22] noted that its water quality monitoring data are not translated into catchment-scale spatial maps, which would ideally provide valuable insights for characterizing rivers and identifying contamination risks. Barriers to adopting RS and GIS techniques for water quality management include the misperception that satellite images and other RS products entail additional costs on top of the expenses for acquiring the hardware and expertise necessary for data processing and interpretation [23]. Contrary to this impression, most published studies on oceanography and water quality monitoring use freely available RS datasets from the Sea-viewing Wide Field-of-view Sensor (SeaWiFS), the Moderate Resolution Imaging Spectroradiometer (MODIS) sensors, the Ocean Color Monitor (OCM) on OceanSAT, and the Landsat series [24]. Other constraints identified by Schaeffer et al. [23] are concerns about product accuracy, data continuity, and training support. In the Philippines, the water monitoring programs of the management agencies typically rely on traditional and laborious methods. Depending on the available laboratory facilities, instruments, transport, and human resources, all monitoring programs are restricted in some way and may collect data primarily by direct sampling or for a limited set of water quality parameters.

The goal of this study is to demonstrate how to obtain a reliable estimate of catchment profiles that can provide local measurements and can be scaled up from catchment to watershed scale or from local to national scale. We aim to show that RS-based catchment characterization can complement traditional water quality monitoring campaigns and promote the concept of sustainable development in water resources assessment and monitoring. Hence, in this paper, a new RS approach for catchment characterization that derives physical and spectral characteristics is applied to a catchment located in Tubay, Caraga, Philippines.

We implemented the proposed method using a Landsat 8 image to delineate land cover information and the SRTM DEM to extract catchment boundaries, in order to demonstrate the plausibility of using free RS products in catchment profiling. We then analyzed the potential relationship between land cover and water quality parameters, namely pH and dissolved oxygen (DO), through regression analysis in SPSS. An on-site spectral measurement was also performed to show the potential of relating spectral characteristics to water pH and DO, which could lead to empirical models for water quality estimation. These parameters are considered in this study because they are among the most commonly measured [25] and among the important parameters influencing the actual physicochemical status of a particular aquatic environment [26]. In some cases, water quality parameters affect each other [27] and show significant correlations with other parameters. The pH level, for example, was found to be highly correlated with copper [28], arsenic [28, 29], and lead [28].

Materials and methods

Study site background

The Municipality of Tubay is located in Caraga Region, the northern region of Mindanao Island, which is now one of the mining capitals of the country. The majority of the land area of Tubay has been approved for mining or exploration by large mining companies. Figure 1 shows the locations of the large-scale mining companies; given the terrain within the study area, these operations can potentially produce effluents with harmful pollutants that will flow directly to the catchment’s outlet if the drainage and waste management systems are inadequate. The EMB of the Department of Environment and Natural Resources reported that there are only two significant point pollution sources that could affect the water quality of Tubay River, namely SR Metals Inc. and Agata Mining Ventures, Inc. The EMB’s 2015 report pointed out that the wastewater discharges from these mining establishments were regulated by the agency to ensure that the treated discharges are within the effluent standards. Activities of the residents and of those in the neighboring municipalities bordering the river were also identified as possible sources of pollution (EMB, 2015). Recent water quality monitoring data of EMB, however, show that there is no regular monitoring of the levels of heavy metals within the municipality.

Fig. 1 The study area showing the Municipality of Tubay, the area coverage of approved mining permits, the location of the existing mining sites, and the sampling points

Data sources and methods

Remotely sensed data, namely the DEM from SRTM and a Landsat Operational Land Imager (OLI) image, were used to derive the catchment profile and land cover map. For the land cover map, the classified Landsat OLI image (acquired on March 29, 2017) was merged with the high-resolution land cover map from the Phil-LiDAR 2 Project of Caraga State University (CSU), which was derived from high-resolution orthophotographs covering about 38% of the area of Tubay.

This study also uses water quality data from on-site measurements conducted on March 26–28, 2017 using our prototype wireless sensor network (WSN)-based sensors for pH and DO. Before the actual water quality measurements, several measurements were made to evaluate the sensors’ performance by comparing simultaneous readings of the prototype sensors and the Horiba® probe at selected sampling points. The simultaneous measurement results for pH and DO are shown in Table 1. The linear plots of the prototype readings against the readings from the Horiba® Water Checker yielded good R² values: 0.973 for the pH sensor and 0.979 for the DO sensor.
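
The sensor comparison described above can be sketched as a simple least-squares check. The snippet below is a minimal illustration, not the procedure actually used in the study: the function name `r_squared` is our own, and the paired readings are hypothetical values, not the data in Table 1.

```python
def r_squared(x, y):
    """Coefficient of determination of the least-squares line relating y to x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("readings must not be constant")
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    return sxy ** 2 / (sxx * syy)


# Hypothetical paired pH readings (reference probe vs. prototype sensor).
reference_ph = [7.2, 7.5, 7.8, 8.1, 8.4]
prototype_ph = [7.1, 7.6, 7.7, 8.2, 8.5]
print(round(r_squared(reference_ph, prototype_ph), 3))
```

An R² close to 1 indicates that the prototype readings track the reference probe linearly; it does not by itself show the absence of a constant offset, which is why the slope and intercept of the fitted line are usually inspected as well.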

Table 1 Comparison of values taken using the developed prototype and Horiba® probe

The WSN-based water quality monitoring system is composed of integrated hardware and software components housed in a cost-effective buoy. The system consists of sensor nodes, a sink node, and a remote terminal. Each sensor node is a combination of probes, a microcontroller, and an XBee RF transceiver module. A user application program installed in each sensor node handles the sensing of pH and DO, pre-processes the data to fit the transmission requirements, and transmits the sensed data to the nearest node or the sink node. Subsequently, the sink node sends the data to the remote terminal over the Global System for Mobile Communications/General Packet Radio Service (GSM/GPRS) network. The study utilized pH probes from Atlas Scientific (https://www.atlas-scientific.com), an Arduino-based microcontroller, Zigbee technology, and GSM transceivers. The measurements gathered using the developed prototype are shown in Table 2. There was no DO measurement at SP 1 due to a transmission problem during the survey in the area.

Table 2 The flow velocity data measured on site, GIS-based water depths derived from bathymetric data, and water quality data for pH and DO using WSN-based prototype

Some of the EMB sampling point locations were also used to properly map and generate GIS layers for each water quality parameter. Not all EMB sampling points were considered because some are located outside the catchment area. An additional set of sampling points was established for this study using a handheld GPS to achieve a well-distributed network of sampling points along the river. These sampling points are designated as “SP” to distinguish them from EMB sampling points, which are named “EMB SP”. Also, flow velocities were gathered through field measurements, while bathymetric data of the river were acquired from the UP-Diliman Dream Program. The river profiles at the sampling point locations were generated from the bathymetric data using the river profiler tool in GIS. The physical river profiles taken from velocity measurements and the minimum, maximum, and average river depths at the sampling areas derived from the computer-based river profiling are also tabulated in Table 2.

This study also employed spectral measurements at the designated sampling areas to investigate their potential application to catchment characterization. We provide a detailed description of this method in Section 3.4. Hence, this study utilized three types of RS datasets to derive river profiles and characterize the catchment in the study area: the Landsat 8 OLI satellite image, the SRTM DEM, and spectral signatures of water samples. The river profiles and characteristics were derived mainly through RS and GIS techniques and then integrated and analyzed using GIS and statistical analyses. The general flow of the methods employed in this study is illustrated in Fig. 2.

Fig. 2 The methodological flow of the study

Catchment area delineation and profiling

The 90-m resolution SRTM-DEM was obtained freely from the Global Land Cover Facility website. The accuracy of this data is determined primarily by the resolution (the distance between sample points) and the data type (integer or floating point). Thus, a pre-processing step was undertaken to ensure that the input DEM contained no bad data. Using ENVI software, NaN (Not a Number) and infinity values, which can occur in floating-point, double-precision floating-point, complex, and double-precision complex data types, were masked out. The SRTM-DEM was then ready for stream and catchment delineation using GIS software.
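
The masking step can be illustrated with a short sketch. This is not the ENVI workflow itself; it only shows the underlying idea on a DEM held as a list of rows. The function name `mask_invalid` and the use of `None` as the no-data marker are assumptions made for illustration.

```python
import math


def mask_invalid(dem, nodata=None):
    """Replace NaN and infinite elevations with a no-data marker.

    Returns the cleaned grid and a boolean mask (True where the cell is valid).
    """
    cleaned, mask = [], []
    for row in dem:
        valid = [math.isfinite(value) for value in row]
        mask.append(valid)
        cleaned.append([v if ok else nodata for v, ok in zip(row, valid)])
    return cleaned, mask


# Hypothetical 2 x 3 elevation tile (metres) containing two bad cells.
tile = [
    [102.0, float("nan"), 98.5],
    [float("inf"), 101.2, 97.0],
]
clean_tile, valid_mask = mask_invalid(tile)
print(clean_tile)
print(valid_mask)
```

Keeping the boolean mask alongside the cleaned grid makes it possible to exclude the bad cells from later steps such as slope or flow-direction calculations instead of silently treating them as real elevations.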

All the hydrological parameters needed to derive a catchment boundary from the SRTM-DEM were calculated using the Hydrology toolset of the ArcGIS Spatial Analyst extension. This stage involves straightforward processes for deriving the layers that are necessary for delineating streams and catchment areas. By applying the Hydrology tools in their appropriate sequence, we were able to derive the stream networks and catchment boundaries.
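
To make the sequence of hydrologic layers more concrete, the following sketch computes a D8 flow direction (steepest downslope neighbour) and the resulting flow accumulation on a tiny grid. It is a simplified stand-in for the idea behind such tools, not the ArcGIS implementation; the function names are our own, the elevations are hypothetical, and the DEM is assumed to be already depression-filled.

```python
import math

# Row/column offsets of the eight neighbours of a cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_direction(dem):
    """Map each cell to its steepest downslope neighbour (None for sinks/outlets)."""
    rows, cols = len(dem), len(dem[0])
    downstream = {}
    for r in range(rows):
        for c in range(cols):
            best_slope, best_cell = 0.0, None
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbours are sqrt(2) cell widths away.
                    slope = (dem[r][c] - dem[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best_slope, best_cell = slope, (nr, nc)
            downstream[(r, c)] = best_cell
    return downstream


def flow_accumulation(dem, downstream):
    """Number of upstream cells draining into each cell (the cell itself excluded)."""
    acc = [[0] * len(dem[0]) for _ in dem]
    # Visit cells from highest to lowest so every upstream cell is final
    # before its contribution is passed downstream.
    for r, c in sorted(downstream, key=lambda rc: dem[rc[0]][rc[1]], reverse=True):
        target = downstream[(r, c)]
        if target is not None:
            acc[target[0]][target[1]] += acc[r][c] + 1
    return acc


# Hypothetical filled DEM (metres) sloping towards the lower-right corner.
dem = [
    [9, 8, 7],
    [8, 6, 5],
    [7, 5, 1],
]
links = d8_flow_direction(dem)
accumulation = flow_accumulation(dem, links)
for row in accumulation:
    print(row)
```

Cells whose accumulation exceeds a chosen threshold are typically treated as stream cells, which is how a stream network is extracted from the accumulation layer.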

To identify the catchment covering the study area, the whole boundary of the Municipality of Tubay, the locations of the mining companies, and the locations of the existing sampling sites of the EMB were considered. A simple intersection operation on the overlaid layers was employed to select the best catchment site. This catchment was also used to mask the satellite image to derive a subset image following the form and shape of the catchment. Also, the potential contributing area (PCA) surrounding each sampling point was delineated using the pour point tool of ArcGIS. The PCA is the same as the CapZone concept employed by Japitana and Paringit [30] to characterize the potential zones of influence on the water quality of each groundwater source. In this study, however, we did not implement the DEM shifting since the focus is on surface water. This method was adopted to best characterize the potential effect of land use on the existing water quality in the area.
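
The idea of a contributing area upstream of a pour point can be sketched as a traversal of flow links. The example below is an illustrative assumption, not the ArcGIS pour point tool: `contributing_cells`, `area_hectares`, and the tiny set of flow links are hypothetical, and the 90-m cell size is taken from the SRTM-DEM resolution mentioned earlier.

```python
def contributing_cells(downstream, pour_point):
    """All cells whose flow path passes through pour_point (pour point included)."""
    # Invert the links: for every cell, list the cells that drain directly into it.
    upstream = {}
    for cell, target in downstream.items():
        if target is not None:
            upstream.setdefault(target, []).append(cell)
    area, stack = set(), [pour_point]
    while stack:
        cell = stack.pop()
        if cell not in area:
            area.add(cell)
            stack.extend(upstream.get(cell, []))
    return area


def area_hectares(cell_count, cell_size_m=90.0):
    """Area covered by cell_count square cells, in hectares (1 ha = 10,000 m^2)."""
    return cell_count * cell_size_m ** 2 / 10_000.0


# Hypothetical flow links: (row, col) -> downstream (row, col); None marks the outlet.
flow_links = {
    (0, 0): (1, 0),
    (0, 1): (1, 1),
    (1, 0): (1, 1),
    (1, 1): (2, 1),
    (2, 0): (2, 1),
    (2, 1): None,
}
pca = contributing_cells(flow_links, (1, 1))
print(sorted(pca), area_hectares(len(pca)))
```

Placing the pour point at the catchment outlet returns every cell in the grid, which mirrors how a sampling point near the outlet can have a contributing area as large as the whole catchment.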

Land use-land cover mapping

The multi-band Landsat 8 OLI image with 30-m resolution was used for extracting land use and land cover information within the catchment. This study employed the Maximum Likelihood classifier, a supervised classification technique widely used for this purpose because of its high reliability and accuracy [31,32,33]. High accuracy in delineating the land use/land cover (LULC) is sought because the result is used further to analyze the LULC’s potential influence on the water quality in the study area.
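
The principle of Maximum Likelihood classification can be sketched for two spectral bands: each class is modeled as a Gaussian fitted to its training pixels, and a pixel is assigned to the class with the highest likelihood. The code below is a minimal sketch under stated assumptions (two bands only, equal prior probabilities, hypothetical training reflectances); it is not the software routine used in the study.

```python
import math


def fit_class(samples):
    """Mean vector and covariance terms (c00, c01, c11) of two-band training pixels."""
    n = len(samples)
    m0 = sum(s[0] for s in samples) / n
    m1 = sum(s[1] for s in samples) / n
    c00 = sum((s[0] - m0) ** 2 for s in samples) / (n - 1)
    c11 = sum((s[1] - m1) ** 2 for s in samples) / (n - 1)
    c01 = sum((s[0] - m0) * (s[1] - m1) for s in samples) / (n - 1)
    return (m0, m1), (c00, c01, c11)


def log_likelihood(pixel, mean, cov):
    """Gaussian log-likelihood of a pixel, up to an additive constant."""
    c00, c01, c11 = cov
    det = c00 * c11 - c01 * c01
    d0, d1 = pixel[0] - mean[0], pixel[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form inverse of a 2 x 2 matrix.
    maha = (c11 * d0 * d0 - 2 * c01 * d0 * d1 + c00 * d1 * d1) / det
    return -0.5 * (math.log(det) + maha)


def classify(pixel, models):
    """Label of the class with the highest likelihood (equal priors assumed)."""
    return max(models, key=lambda name: log_likelihood(pixel, *models[name]))


# Hypothetical (red, near-infrared) reflectances of training pixels.
training = {
    "water": [(0.05, 0.02), (0.06, 0.03), (0.04, 0.02), (0.05, 0.04)],
    "vegetation": [(0.04, 0.40), (0.05, 0.45), (0.06, 0.38), (0.03, 0.42)],
}
models = {name: fit_class(pixels) for name, pixels in training.items()}
print(classify((0.05, 0.03), models), classify((0.05, 0.41), models))
```

With more bands the same rule applies, but the covariance matrix becomes larger and is normally inverted with a linear algebra library rather than by hand.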

Spectral measurement and analysis

The spectral reflectance was gathered at designated sampling points along the stretch of Tubay River using an OceanOptics USB 4000 VisNIR spectrometer connected to a field-type laptop. The spectral measurements of the water samples were taken between 10:00 AM and 2:00 PM, around the local solar noon period [34], to avoid an unstable spectral response. The solar geometry changes least during the solar noon period [35, 36].

Results and discussion

Delineated catchment area and PCA

Before streams and catchment boundaries were generated, various hydrologic layers were derived first, including the filled DEM, flow direction, flow accumulation, flow length, stream link, and stream order. These were derived in straightforward processes using the ArcGIS Hydrology tools. To efficiently select the best catchment for this study, other vector layers were overlaid to analyze their spatial relationships. Here, the municipal boundary was used to ensure that the majority of the area of Tubay was accounted for. Then, the locations of the large mining companies and the permanent sampling sites of EMB Caraga were mapped to check their extent against the catchment boundaries.

As a result, a sub-catchment with an area of 22,000 ha was identified, in which the majority of the area of Tubay is situated downstream. River width was calculated automatically through photo interpretation and GIS analysis. The results showed that the Tubay main river has an average width of 97 m, while its tributaries have widths ranging from 30 to 100 m. Figure 3 shows the final outputs of the hydrology tools: the catchment area and its river network, which is colored according to river width.

Fig. 3 Map showing the derived river widths from analyzing Landsat 8 OLI and bathymetric data

LULC profiling

The LULC classification map (Fig. 4) derived from the Landsat 8 OLI image yielded an accuracy of 97.66%. The Tubay catchment boundary was used to clip the LULC map and determine the area distribution of each land cover (LC) class within the catchment. The results showed that the most dominant LC, covering about 61.5% of the catchment, is the class “Other Vegetation”. This class includes less dense forest, grasslands, and shrubland areas. Built-up areas, Dense Vegetation, and Cropland have the next largest shares, at 15.9, 12.9, and 5.7%, respectively.
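
The two quantities reported above, classification accuracy and the area share of each class, can be computed as sketched below. This is an illustrative sketch with hypothetical inputs: the confusion matrix and pixel labels are invented for the example, overall accuracy is assumed to be the accuracy measure (the text does not specify), and the function names are our own.

```python
from collections import Counter


def overall_accuracy(confusion):
    """Share of correctly classified samples (diagonal sum over total)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def class_percentages(labels):
    """Percentage of pixels per class; equals the area share for equal-sized pixels."""
    counts = Counter(labels)
    total = len(labels)
    return {name: 100.0 * count / total for name, count in counts.items()}


# Hypothetical confusion matrix (rows: reference class, columns: mapped class).
confusion = [
    [50, 2],
    [3, 45],
]
print(overall_accuracy(confusion))

# Hypothetical class labels of the pixels inside a clipped catchment.
pixels = ["Other Vegetation"] * 6 + ["Built-up"] * 2 + ["Cropland"] * 2
print(class_percentages(pixels))
```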

Fig. 4 Land cover map of Tubay catchment derived from Landsat 8 OLI image

The derived PCAs for the sampling points have areas ranging from 0.10 to 22,000 ha. The LULC distribution of the PCA at each sampling point was also derived to evaluate its relationship to water quality. Among the seven (7) sampling points, only the PCAs for sampling points 1 and 2 have a complete LC composition, while the remaining five (5) contain only one to three LC types. A regression analysis was performed on the LC distribution and the WSN-based water quality data using SPSS®, and the results are shown in Table 3. The regression model derived for pH yielded a high R² value of 0.96, in which water, other vegetation, and built-up areas have high coefficient values, indicating their strong contribution to the observed pH levels. These initial results imply that pH levels in rivers may be influenced by the presence of nearby vegetation and built-up areas. The regression model also indicates that the pH of the water surrounding a sampling point, or of the water flowing towards it, may affect its pH value. All of the predictor coefficients in the pH regression model are negative. This means that an increase in these LCs, and the bad practices associated with them, will result in a decrease in pH values, that is, making the river water more acidic. The derived regression model for DO, meanwhile, yielded an R² value of 0.994, higher than that of the pH regression model. During the regression analysis, Water was excluded as a contributing factor in the DO regression model. Among the remaining factors, cropland is the most significant based on its coefficient value, which implies that the presence of cropland areas near a sampling point may contribute to its DO concentration. It is also interesting to note that the Other Vegetation class has a positive coefficient value. This indicates that an increase in vegetation in the area will result in a higher DO concentration. However, it must be noted that decomposing vegetation materials within the stream will cause depletion of DO, and that DO is also subject to diurnal variations [37].
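
The type of regression described above can be sketched as an ordinary least-squares fit of a water quality parameter against LC percentages. The code below is a minimal illustration and not the SPSS® procedure: the helper names `solve` and `ols` are our own, the choice of two predictors is an assumption, and the LC percentages and pH values are hypothetical, not the study data.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(predictors, response):
    """Least-squares coefficients [intercept, b1, b2, ...] and the model R^2."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(response) / len(response)
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical (other vegetation %, built-up %) of seven PCAs and their pH readings.
lc_percent = [(60, 5), (55, 20), (70, 10), (40, 25), (50, 5), (65, 15), (45, 30)]
ph = [8.4, 8.0, 8.3, 7.7, 8.5, 8.1, 7.6]
coefficients, r2 = ols(lc_percent, ph)
print([round(c, 4) for c in coefficients], round(r2, 3))
```

The sign of each coefficient indicates the direction of the association with the LC class. With only seven sampling points, a model with several predictors leaves few residual degrees of freedom, so a high R² from such a fit is best read with caution.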

Table 3 Regression Model Summary for pH and DO

Spectral signature and analysis

The spectral data gathered per sampling point were averaged using spreadsheet software. Per station, ten (10) samples were measured with five (5) scans per sample. Figure 5 shows the spectral plot of the averaged values per sampling station. Based on this figure, the spectral signatures of water samples from the upper streams are separable from those of the lower streams at 600 to 800 nm. It is also interesting to note that water samples from the lower streams have higher reflectance values, ranging from 30 to 50% at 500 to 900 nm, compared with those at the upper streams, whose reflectance values range from 3 to 15%. A very high correlation was also observed at 700 to 850 nm between the spectral curves for SP 5 and SP 6, which are located in the middle stream, and those of the permanent sampling stations 5 and 6 of EMB (EMB SP 5 and EMB SP 6). These marked similarities among the spectral curves of water samples from the lower streams and from the upper streams may indicate that riverine water, depending on its location along the river stretch, may share similar physical and chemical characteristics. The results further show the potential of spectral analysis in characterizing water bodies.
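
The averaging step can be sketched as a two-level mean: scans are averaged per sample, and samples are averaged per station. The study did this in spreadsheet software; the function names and the small hypothetical spectra below (two samples, two scans, three wavelengths) are assumptions for illustration only.

```python
def mean_spectrum(spectra):
    """Wavelength-by-wavelength mean of equally long spectra."""
    length = len(spectra[0])
    if any(len(s) != length for s in spectra):
        raise ValueError("all spectra must share the same wavelength grid")
    return [sum(s[i] for s in spectra) / len(spectra) for i in range(length)]


def station_spectrum(samples):
    """Average the scans of each sample, then average the samples of the station."""
    return mean_spectrum([mean_spectrum(scans) for scans in samples])


# Hypothetical station: 2 samples x 2 scans x 3 wavelengths (reflectance fractions).
station = [
    [[0.10, 0.20, 0.30], [0.12, 0.22, 0.32]],
    [[0.08, 0.18, 0.28], [0.10, 0.20, 0.30]],
]
print([round(v, 3) for v in station_spectrum(station)])
```

When every sample has the same number of scans, as in the measurement design described above, this two-level mean equals the mean over all scans of the station.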

Fig. 5 The spectral plot at the Visible to Near Infrared (NIR) wavelength showing the average reflectance curves of water at the sampling stations

At about 868 nm, prominent peaks were observed in the spectral curves of SP 1 and SP 2, in contrast to the relatively low peaks observed at SP 5, SP 6, EMB SP 5, and EMB SP 6. SP 1 and SP 2 also have the highest pH values, at 8.16 and 8.58, respectively, while the remaining stations with low spectral peaks have pH values ranging from 7.53 to 8.10, indicating a positive correlation between pH and reflectance. On the other hand, a negative correlation is shown in the observed values for DO at 868 nm. Sampling stations depicting dips in this spectrum have low DO values in contrast to the observed values at sampling stations depicting low peaks. To evaluate these observations statistically, regression lines were derived from the scatter plots for pH and DO (as shown in Fig. 6). The regression lines confirm the positive correlation between pH and reflectance values, with an R² of 0.548, and the negative correlation between DO and reflectance values, with a corresponding R² value of 0.634.
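
The relation between reflectance at a single wavelength and a water quality parameter can be examined as sketched below: the reflectance nearest to 868 nm is extracted from each station spectrum and correlated with the measured parameter. The function names, wavelength grid, reflectances, and pH and DO values are all hypothetical and serve only to illustrate the calculation.

```python
def reflectance_at(wavelengths, reflectance, target_nm):
    """Reflectance at the sampled wavelength closest to target_nm."""
    index = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))
    return reflectance[index]


def pearson_r(x, y):
    """Pearson correlation coefficient; its sign gives the direction of the relation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical wavelength grid (nm) and station spectra (reflectance, %).
wavelengths = [860.0, 868.2, 875.0]
spectra = [
    [40.0, 46.0, 41.0],
    [38.0, 44.0, 39.0],
    [10.0, 12.0, 11.0],
    [9.0, 11.0, 10.0],
]
ph_values = [8.2, 8.5, 7.6, 7.9]   # hypothetical pH per station
do_values = [5.1, 4.8, 6.9, 6.5]   # hypothetical DO (mg/L) per station

r868 = [reflectance_at(wavelengths, s, 868.0) for s in spectra]
r_ph = pearson_r(r868, ph_values)
r_do = pearson_r(r868, do_values)
print(round(r_ph, 3), round(r_ph ** 2, 3))
print(round(r_do, 3), round(r_do ** 2, 3))
```

For a simple regression line, R² is the square of the Pearson coefficient, so the sign of the coefficient carries the direction of the relation that R² alone does not show.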

Fig. 6 The scatter plots with regression lines for comparing (a) pH and (b) DO to the reflectance values at 868 nm

Conclusions

This paper demonstrated how RS and GIS techniques can provide valuable datasets that enrich site characterization, especially in remote and inaccessible areas and when limited financial resources hinder on-site measurements. This study demonstrated the potential of maximizing the use of free RS datasets like SRTM-DEM and Landsat 8 OLI in deriving catchment profiles and characterizing a catchment of interest. Through GIS processing, the SRTM-DEM was effective in deriving the river network, the river’s width and depth, and the catchment boundaries. The techniques employed in deriving catchment profiles like river width and river depth from remotely sensed data demonstrated the practical use of these datasets as an alternative to ground measurements and surveys. The Landsat 8 OLI image, in turn, was effective in delineating LC information. LC data derived from Landsat 8 OLI also aided in understanding the influence of human activities on the water quality within the catchment. By using the LC information, we were able to evaluate the potential influence of the LC distribution on the existing water quality in the study area. The regression analysis results show that LC classes like water, other vegetation, and built-up areas are significant factors for the existing pH levels in the sampling areas. The cropland areas were found to have a strong association with the existing DO levels in the study area based on the derived regression model. The results also showed that an increase in vegetation areas could result in a higher DO concentration, as indicated by its positive coefficient, although DO levels are expected to drop during the decay of vegetation or at night.

On the other hand, the graphical and statistical analysis of the reflectance curves derived from in-situ spectral measurements also showed their potential for determining empirical models for pH and DO. This initial progress on the use of spectral signatures, however, needs more experimentation and spectral data collection so that reliable RS-based water quality models can be derived and employed in obtaining water quality maps from satellite images. Among the three RS datasets, it is clear that DEMs and satellite images can conveniently provide reliable physical river profiles without incurring a cost. River widths and depths can give useful insights into the river’s physical condition and help identify the best locations for water quality monitoring sites, for example. The Landsat images, meanwhile, have been shown to provide a good understanding of how existing anthropogenic activities associated with LC classes affect water quality. Further, this paper demonstrated how water resource management and sustainability studies can be aided by supplementing data for a comprehensive site characterization that leads to reliable and science-based decision making.