1 Introduction

FAO (2014) defines the harvesting of aquatic fauna from inland waters as “inland fisheries”. Inland fisheries provide an important source of livelihoods and food security, especially in the low and lower middle-income developing countries of Africa. In 2018, global inland fisheries harvested 12.02 million tonnes, of which 3 million tonnes (equivalent to 25%) came from Africa (FAO 2020). Accordingly, continuous monitoring of inland fisheries is urgently needed. Yet data collection is one of the international challenges facing these countries in achieving the United Nations’ 2030 Agenda for Sustainable Development (FAO 2020). Data scarcity hinders the accurate quantification of the share of inland fisheries (Lorenzen et al. 2016) and limits the ability to assess the effects of fishing activity and anthropogenic drivers (FAO 2020). Although no single method provides an accurate description of inland fisheries (Funge-Smith and Bennett 2019), results pooled from multiple approaches can significantly improve the data supply chain needed to meet the 2030 Agenda (FAO 2020).

Determining fish production for large water bodies is expensive and time-consuming, and involves intensive fieldwork that can result in the unintentional mortality of fish (Milligan 2018). As an alternative, fisheries managers rely on habitat characteristics (Milligan 2018), simple catch statistics, and empirical models to estimate a lake’s potential fish yield (Abobi and Wolff 2020). Fish yield is influenced by the lake’s morphological (such as depth, volume, area, and shoreline development), edaphic (such as total dissolved solids (TDS) and electrical conductivity (EC)), and climatic parameters (Ryder 1965). Ryder excluded the climatic variables and combined the morphological and edaphic variables into one index, the “morpho-edaphic index (MEI)” or “Ryder’s index”, defined as the ratio of TDS to mean depth, and used regression analysis to estimate the potential productivity of 34 north-temperate lakes. In Africa, Henderson and Welcome (1974) substituted EC for TDS in Ryder’s index and related it to yields from 17 tropical inland lakes to develop an empirical yield model. Afterward, Khalil (1997) developed an empirical yield model to predict the potential fish yield of Lake Borollus in Egypt by relating Ryder’s index to yields from 6 Egyptian and 16 African lakes and reservoirs.

Traditional techniques for measuring a lake’s edaphic variables require intensive collection of primary data as well as expensive laboratory tests (Shahzad et al. 2018). Moreover, unlike satellite images, in situ measurements cannot offer an instantaneous spatial distribution over the entire water body (Emam et al. 2021). Remote sensing could therefore be an effective alternative that makes monitoring more convenient, regular, and successful (Emam et al. 2019; Emam and Soliman 2020, 2021). During the last decade, especially since Landsat and Google Earth data became freely available, remote sensing has been extensively used to monitor the water quality of inland waters (Topp et al. 2020), as it has proved to be a powerful analytical method for integrating in situ water quality data with spectral reflectance measured by satellite sensors (Bugnot et al. 2018). Spectral reflectance is correlated with water quality parameters that affect the lake’s optical properties (Nas et al. 2010). The optical characteristics of water depend on parameters such as the concentration and characteristics of suspended solids, dissolved solids, and other organic matter (González-Márquez et al. 2018a). Consequently, Landsat has been widely used to develop efficient monitoring methods based on the correlation between Landsat band values and the optical characteristics of different water quality variables (Mushtaq and Lala 2016; Khalil et al. 2016; González-Márquez et al. 2018a; Ferdous et al. 2019). In Kashmir, Mushtaq and Lala (2016) derived an exponential regression model for EC retrieval from Landsat 8 for Wular Lake (a wetland) during October 2013. In Egypt, Khalil et al. (2016) developed a multiple linear regression model for salinity retrieval from Landsat 8 for Bardawil Lagoon (a Ramsar site) during December 2014. In Mexico, González-Márquez et al. (2018a) generated a linear regression model to estimate EC from Landsat 8 for Playa Colorada Bay (a Ramsar site) during spring 2016.

Although Egypt is a lower middle-income developing country, it is one of the top 25 inland fish-producing countries worldwide (FAO 2020). During 2018, Egypt harvested 2.3% (equivalent to 0.28 million tonnes) of the global inland catch, of which lakes contributed the highest percentage (69.49%), followed by the Nile River (26.29%) and rice fields (4.20%) (CAMPAS 2020a). According to FAO (2020), overall Egyptian natural fish production (from inland waters, the Mediterranean Sea, and the Red Sea) is projected to increase by 34.9% between 2018 and 2030. Continuous monitoring of the Egyptian inland fisheries, especially the threatened ones, is therefore a prerequisite.

Lake Qaroun, the third-largest lake in Egypt, is one of the most threatened Egyptian lakes due to water pollution and agrochemical contamination. The lake is the remnant of the first large freshwater reservoir (Lake Moeris; ≈ 1700 km²), formed more than 4000 years ago in the Fayum depression to accommodate 50 billion m³ of Nile flood water (Chanson 2004). The Bahr Yussef canal (16 km long) was dug to connect the Fayum depression with the River Nile (Abulnaga 2018). However, owing to a lack of maintenance, seven years of rain scarcity, and the high evaporation rate, about 1200 km² of the reservoir’s fertile area dried up and was afterward exploited for intensive farming (Abulnaga 2018). Nowadays, 3,822,836 people inhabit the Fayum depression (CAMPAS 2020b). Owing to the steep downward slope of Fayum land towards Lake Qaroun (Donia 2013), the lake annually receives 470 × 10⁶ kg of dissolved salts from agricultural drainage (Rasmy and Estefan 1983). Since the lake is endorheic, with no outlets (Williams and Mann 2014), the continuous increase in agricultural discharge and evaporation has raised the lake’s salinity. In 1906, the lake’s salinity was 10.95‰ (Anon 1997), yet by 1999 it had increased by 219.6% to reach 35‰ (Sami 2000), turning the lake into a “permanent saline inland lake”. Consequently, its catch composition changed greatly (El-Serafy et al. 2014), affecting the livelihood of the people inhabiting the area. Moreover, although the lake was declared a protected area in 1989, both natural (climatic conditions and geological aspects) and anthropogenic (discharged wastewater and seepage from the surrounding cultivated lands, lack of sustainable wastewater management, agrochemical contamination, eutrophication, heavy metals) factors have continued to adversely affect the water quality and the lake’s ecosystem (Abdel-Satar et al. 2003). In 2014, the lake contributed about 12.62% (4518 tons) of the country's total fish yield from inland lakes (CAMPAS 2016), yet this share decreased to 2.10% (832 tons) in 2018 (CAMPAS 2020a).

Considering the UN 2030 Agenda, maintaining a continuously up-to-date database for inland fisheries, especially in low and lower middle-income African developing countries, is essential for proper management. To this end, this study aimed to create a sustainable and inexpensive Landsat-based EC model that can readily notify fisheries managers of changes in the potential fish yield of Lake Qaroun in Egypt (pilot case study). Lake Qaroun has been comprehensively studied with regard to its water quality (Abdel-Satar et al. 2010; Abou El-Gheit et al. 2012; El-Sayed and Mosad 2017; Al-Afify et al. 2019) and fisheries (El-Serafy et al. 2014; Shalloof 2020). However, to the best of our knowledge, no attempt has been made to integrate geospatial technology, field measurements, mathematical computations, and an empirical fish yield model to estimate the lake’s potential fish yield. The study set the following objectives: (1) measuring the water electrical conductivity (EC; mS/cm) and water depth (Z; m) along Lake Qaroun during November 2018, (2) analyzing and updating the lake’s morphometrics for the year 2018, (3) deriving an empirical EC retrieval model from Landsat-8 for the month of November, and (4) investigating the feasibility of applying the retrieved EC model to estimate the potential fish yield of Lake Qaroun during the month of November.

2 Materials and methods

2.1 Study area

Lake Qaroun is one of the largest inland saline closed lakes in the North African Great Sahara and is the third-largest lake in Egypt. In 1989, Lake Qaroun was declared a protected area, and it has since been designated an Important Bird Area (code: EG009 in 2001) as well as a Ramsar wetland (site # 2040 in 2012). The lake lies in the northern part of El-Fayoum depression between longitudes 30° 24′ 08″ and 30° 49′ 57″ E and latitudes 29° 24′ 26″ and 29° 32′ 04.74″ N (Fig. 1). The lake is about 89 km south-west of Cairo. To the north, the area is completely covered by sand without vegetation and has been designated as an “eco-tourism development area”. To the south, the lake is surrounded by cultivated lands, fish farms, touristic resorts, and four salt extraction ponds operated by the “Egyptian Salts and Minerals Company” (EMISAL). The lake has an irregular shape with two main basins. The western basin is deeper than the eastern one (Flower et al. 2006). In the middle of the lake, there is a small island (1.5 km²) known as “Gezert El Qarn El Zahbi” (Qarn Island), which is an important site for nesting birds.

Fig. 1 Location map of study area

Water enters the lake via two main drains: El Bats and El Wadi drains. The former flows into the lake’s eastern end whereas the latter discharges into the mid-southern shore of the western basin (Fig. 1). Besides, there are fish farms and other minor drains that pour their drainage water into the lake (Authman and Abbas 2007). The lake has no outlets. Its subtropical climate is generally warm and dry (Baioumy et al. 2010), characterized by high temperature (Anwar et al. 2001), low seasonal rainfall (< 10 mm/year) (Flower et al. 2006), and a high evaporation rate.

2.2 Data assemblage

2.2.1 Field sampling

Seventeen sampling sites covering the entire study area with its different characteristics (Fig. 1) were selected to measure the electrical conductivity (EC; mS/cm) and water depth (Z; m) in Lake Qaroun on 28 November 2018. A global positioning system receiver (Garmin GPS) was used to record the coordinates of each site in the field. The GPS data were georeferenced to the “Universal Transverse Mercator” system (UTM zone 36-N, spheroid/datum WGS-84) to make them compatible with the satellite data.

A field Hydrolab (Hanna HI 9829) was used to measure EC directly in the field, and a portable echo sounder was used to measure water depths.

2.2.2 Satellite data

For morphometric analysis, a high-resolution Sentinel-2B MSI scene of Lake Qaroun (Tile number 36), dated 30 November 2018, acquired at 8:33:09 am, and with a pixel size of 10 m, was downloaded from the Copernicus Open Access Hub website (https://scihub.copernicus.eu/). The image was a Level-1C product.

For regression analysis, a cloud-free medium-resolution Landsat-8 OLI_TIRS scene of Lake Qaroun (Path/Row = 177/40), dated 28 November 2018 and with a pixel size of 30 m, was downloaded from the United States Geological Survey (USGS) (https://earthexplorer.usgs.gov/). The image was a standard Level-1 precision terrain-corrected (L1T) product. Landsat 8 carries two sensors on board, the “Operational Land Imager” (OLI) and the “Thermal Infrared Sensor” (TIRS). The OLI instrument images the earth in nine spectral bands, whereas the TIRS collects data in two thermal infrared bands. The current study used only 8 bands (B) of the OLI sensor: B1 (Coastal aerosol; 0.43–0.45 μm), B2 (Blue; 0.45–0.51 μm), B3 (Green; 0.53–0.59 μm), B4 (Red; 0.64–0.67 μm), B5 (Near Infrared; 0.85–0.88 μm), B6 (Shortwave Infrared-1; 1.57–1.65 μm), B7 (Shortwave Infrared-2; 2.11–2.29 μm), and B8 (Panchromatic band; 0.50–0.68 μm). The downloaded scene folder contained a metadata file (_MTL.txt) and a grayscale GeoTIFF image for each band. These GeoTIFF images were read into ArcGIS software as unsigned 16-bit integer Digital Numbers (DNs). The study converted the raw DNs of the OLI band data (bands 1 through 8) to reflectance using the raster calculator in ArcMap 10.1.
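The DN-to-reflectance step can be sketched in a few lines. The sketch assumes the standard USGS Level-1 rescaling for Landsat 8, ρ′ = Mρ · Qcal + Aρ, optionally divided by the sine of the sun elevation angle. The text does not state which variant or which coefficients were entered in the raster calculator, so the default coefficients below (values typically listed as REFLECTANCE_MULT_BAND_x and REFLECTANCE_ADD_BAND_x in the _MTL.txt file) and the function names are illustrative assumptions.

```python
import math

def dn_to_toa_reflectance(dn, mult=2.0e-5, add=-0.1, sun_elevation_deg=None):
    """Convert a Landsat 8 OLI digital number to top-of-atmosphere reflectance.

    mult and add stand for REFLECTANCE_MULT_BAND_x and REFLECTANCE_ADD_BAND_x
    from the scene's _MTL.txt file; the defaults are assumed typical values and
    should be replaced by the values read from the actual metadata.
    """
    reflectance = mult * dn + add
    if sun_elevation_deg is not None:
        # Optional sun-angle correction using SUN_ELEVATION from the metadata.
        reflectance /= math.sin(math.radians(sun_elevation_deg))
    return reflectance

def band_ratio(dn_numerator, dn_denominator, **kwargs):
    """Ratio of two bands' reflectances, e.g. B3/B4.

    The sun-angle divisor is identical for both bands of one scene,
    so it cancels in the ratio.
    """
    return (dn_to_toa_reflectance(dn_numerator, **kwargs)
            / dn_to_toa_reflectance(dn_denominator, **kwargs))

# Hypothetical DN values for a single water pixel (not taken from the study).
print(band_ratio(10000, 7500))
```

Because the study relies on band ratios, the choice between the corrected and uncorrected variants matters less for the ratio itself than for the single-band reflectances.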

2.3 Geospatial interpolation

The study used ArcGIS 10.1 software to digitize the boundary of Lake Qaroun at 1:10,000 scale from the Sentinel-2B MSI scene (2018). The lake’s boundary was stored in a polygon shapefile. Afterwards, the “Topo to Raster” interpolation method was used to map the spatial distribution of the EC and depth readings measured along Lake Qaroun.

To roughly double the sample size available for regression analysis, the current study used the interpolated EC raster layer to extract EC readings for an additional 19 points within the lake using the “Extract multi values to points” tool within ArcGIS software (Fig. 2).

Fig. 2 Flowchart of the methodology adopted in the present study (EC electrical conductivity (mS/cm), MEI morphoedaphic index)

2.4 Analysis

2.4.1 Morphometric analysis

For morphometric analysis, the study used the attribute table associated with the lake’s boundary shapefile to estimate the lake’s size metrics, namely surface area (A; km²), volume (V; 10⁸ m³), shoreline length (SL; km), maximum length (Lmax; km), effective maximum length (Le; km), maximum width (Wmax; km), and effective maximum width (We; km) (Fig. 2). The average width (Wa; km) and the lake’s elongation (λ) were calculated according to Wetzel and Likens (2000) and Wirth (2004), respectively. The echo sounder was used to record the average depth (Ź; m) and maximum depth (Zmax; m), whose ratio, the depth ratio (Rz) (Kalff 2002), indicates the shape of the lake’s basin. Moreover, the shoreline development index (DL) was calculated according to Mortimer (1959).
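Three of these indices can be sketched with their usual textbook definitions: Wa = A / Lmax, DL = SL / (2√(πA)), and Rz = Ź / Zmax. It is an assumption that the cited references use exactly these forms. The inputs are the values reported for Lake Qaroun in Section 3.2, and the function names are illustrative.

```python
import math

def average_width(area_km2, max_length_km):
    """Mean width: surface area divided by maximum length."""
    return area_km2 / max_length_km

def shoreline_development(shoreline_km, area_km2):
    """Shoreline length relative to the circumference of a circle of equal area."""
    return shoreline_km / (2.0 * math.sqrt(math.pi * area_km2))

def depth_ratio(mean_depth_m, max_depth_m):
    """Mean depth divided by maximum depth."""
    return mean_depth_m / max_depth_m

# Values reported in the text for Lake Qaroun (November 2018).
area, max_length, shoreline = 228.0, 41.07, 139.0
mean_depth, max_depth = 4.34, 8.93

wa = average_width(area, max_length)
dl = shoreline_development(shoreline, area)
rz = depth_ratio(mean_depth, max_depth)
print(f"Wa = {wa:.2f} km, DL = {dl:.2f}, Rz = {rz:.2f}")
```

Under these assumed definitions the results should land close to the reported values (5.55 km, 2.59, and 0.48); small last-digit differences are to be expected from rounding of the inputs.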

2.4.2 Regression analysis

Microsoft Excel 2013 software was utilized to perform the statistical analysis. The study divided the sampling dataset into training (75%) and validation (25%) sets (Fig. 2). The training dataset (n = 27) was used to develop and calibrate the model, while the validation dataset (n = 9) was used to validate and compare the performance of the model developed in the training phase.
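A 75/25 split of the 36 samples can be sketched as follows. The text does not say how samples were assigned to each set, so the seeded random shuffle here is an assumption, as are the function name and the use of station numbers as sample identifiers.

```python
import random

def split_samples(samples, train_fraction=0.75, seed=0):
    """Shuffle the samples reproducibly and split them into two lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# 17 field stations plus 19 points extracted from the interpolated raster.
station_ids = list(range(1, 37))
training, validation = split_samples(station_ids)
print(len(training), len(validation))
```

Fixing the seed keeps the split reproducible, which matters when the training and validation metrics are later compared.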

The study first tested the correlation of the reflectance data of Landsat-8 bands and band ratios (independent variables) with the measured EC readings (dependent variable). For model calibration, the study selected only the variables that showed a very strong correlation (R > 0.80) with EC values. Stepwise polynomial and multiple linear regression (MLR) models were then used to predict EC from the reflectance data of the selected Landsat variables using the following general formulas

$${\text{Y}} = {\text{a}} + {\text{b}}_{1} {\text{X}} + {\text{b}}_{2} {\text{X}}^{2} + \cdots + {\text{b}}_{{\text{n}}} {\text{X}}^{{\text{n}}}$$
$${\text{Y}} = {\text{a}} + {\text{b}}_{1} {\text{X}}_{1} + {\text{b}}_{2} {\text{X}}_{2} + \cdots + \varepsilon$$

where (Y): predicted EC; (X's): reflectance data of Landsat 8 bands; (a): intercept; (b1, b2, …, bn): regression coefficients estimated through statistical techniques; (n): degree of the polynomial model; (ε): model error. The coefficient of determination (R²), adjusted R², standard error (± SE), and significance F-ratio values were considered in the regression equations. In choosing the best regression model, the study selected the model with the highest R² and lowest error metrics (Table 1) whose independent variables and coefficients (a, b1, b2) were statistically significant (P < 0.05).
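The polynomial form above can be fitted by ordinary least squares. The sketch below solves the normal equations in pure Python; it is not the Excel procedure used in the study, and the data are synthetic values generated from a known polynomial rather than lake measurements. The `powers` argument, an illustrative name, selects which terms of the polynomial are included.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution

def fit_polynomial(x_values, y_values, powers=(0, 1, 2, 3)):
    """Least-squares coefficients for y = sum(c_p * x**p for p in powers)."""
    design = [[x ** p for p in powers] for x in x_values]
    m = len(powers)
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(m)] for i in range(m)]
    xty = [sum(r[i] * y for r, y in zip(design, y_values)) for i in range(m)]
    return solve_linear_system(xtx, xty)

def predict(coefficients, x, powers=(0, 1, 2, 3)):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** p for c, p in zip(coefficients, powers))

def r_squared(observed, fitted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Synthetic data from y = 1 + 2x^2 - 0.5x^3 (intercept, squared and cubed terms only).
terms = (0, 2, 3)
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [1.0 + 2.0 * x ** 2 - 0.5 * x ** 3 for x in xs]
coefficients = fit_polynomial(xs, ys, powers=terms)
fitted = [predict(coefficients, x, powers=terms) for x in xs]
print(coefficients, r_squared(ys, fitted))
```

For predictors confined to a narrow range, as band ratios tend to be, the normal equations can become ill-conditioned; centering the predictor or using a QR-based solver from a numerical library would be the more robust choice.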

Table 1 Statistical error metrics used to assess the performance and the accuracy of the developed EC model in both the training and validation phases

To assess the performance as well as the accuracy of the developed EC model, in both the training and validation phases, the study used both graphical and statistical metrics. Scatter graphical plots were used to display the matching between the measured and the estimated EC readings. Moreover, the study used the developed model in generating a spatial map to ensure that the noncalibrated sites throughout the lake provided reasonable EC values. Regarding the statistical metrics, the current study used the eight metrics described in Table 1. The model with the least error metrics was selected.
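Six of the eight metrics can be sketched with their commonly used definitions. Table 1 is not reproduced in this text, so the exact forms used in the study are an assumption; in particular, the sign convention of PBIAS varies between authors (here a positive value means the model underestimates). R and R² are omitted, and the sample values are hypothetical.

```python
import math

def error_metrics(observed, predicted):
    """Return common goodness-of-fit metrics for paired observations."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    willmott_den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                       for o, p in zip(observed, predicted))
    return {
        "RMSE": math.sqrt(ss_res / n),                      # root mean square error
        "MAE": sum(abs(r) for r in residuals) / n,          # mean absolute error
        "PBIAS": 100.0 * sum(residuals) / sum(observed),    # percent bias
        "MAPE": 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n,
        "ENS": 1.0 - ss_res / ss_tot,                       # Nash-Sutcliffe efficiency
        "d": 1.0 - ss_res / willmott_den,                   # Willmott's index of agreement
    }

# Hypothetical measured and estimated EC values (mS/cm), not taken from the study.
measured = [46.1, 48.3, 50.2, 51.7]
estimated = [46.9, 47.8, 50.9, 51.0]
for name, value in error_metrics(measured, estimated).items():
    print(f"{name}: {value:.3f}")
```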

Finally, to test the model’s efficiency in different years, the study used the historical EC readings—available from the monthly reports of the Egyptian Environmental Affairs Agency (EEAA 2015)—for Lake Qaroun during Nov 2015.

2.4.3 Model application

Water quality plays an important role in fish production. The availability of reliable, continuous, up-to-date limnological data can help in managing the reservoir’s fisheries efficiently. However, the high cost of field measurements limits their regularity. Accordingly, the current study examined the possibility of using the developed EC model as a decision-support tool to assist managers in making regular first-order estimates of potential fish yield (Y; tonnes) for Lake Qaroun during the month of November. The study fed the output of the developed EC model into Khalil’s empirical fish yield model (Khalil 1997) as follows

$${\mathbf{Y}}_{{\left( {{\text{kg}}/{\text{ha}}} \right)}} = 16.3757 \times {\mathbf{MEI}}^{0.4445} \quad ({\text{R}} = 0.8627)$$
$${\mathbf{MEI}} = \frac{{\text{TDS}}}{{\acute{\text{Z}}}}$$

where (Y) is the potential fish yield (kg/ha), (MEI) is the morphoedaphic index (= Ryder’s index), (TDS) is the total dissolved solids (mg/l), and (Ź) is the average depth (m) (Ryder 1965).

In Lake Qaroun, a strong positive correlation (R = 0.9) was recorded between water EC and TDS (Abdel-Satar et al. 2010). Accordingly, the current study replaced the TDS by EC and calculated the morphoedaphic index values for Lake Qaroun, using both the measured and the predicted EC values from the developed model, during November 2018. The two MEI average values were then compared, and their level of significance was examined.

Given that the area of Lake Qaroun is 56,340 acres (1 acre ≈ 0.405 ha), the study used the following formula to convert the estimated potential yield into tonnes

$${\mathrm{Y}}_{\left(\mathrm{tonnes}\right)}= \frac{\left({\mathrm{Y}}_{\left(\mathrm{kg}/\mathrm{ha}\right)} \times 0.405 \times 56340 \right)}{1000}$$
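The chain from EC and depth to a yield in tonnes can be sketched directly from the two formulas above, with EC substituted for TDS in the index as described. The function names are illustrative, and the sample input uses the lake-wide mean EC and mean depth reported in Section 3.1.

```python
def morphoedaphic_index(ec_ms_per_cm, mean_depth_m):
    """MEI with EC (mS/cm) substituted for TDS, divided by mean depth (m)."""
    return ec_ms_per_cm / mean_depth_m

def potential_yield_kg_per_ha(mei):
    """Khalil's (1997) empirical model as quoted in the text."""
    return 16.3757 * mei ** 0.4445

def yield_in_tonnes(yield_kg_per_ha, area_acres=56340, hectares_per_acre=0.405):
    """Scale a per-hectare yield to the whole lake and convert kg to tonnes."""
    return yield_kg_per_ha * hectares_per_acre * area_acres / 1000.0

# Lake-wide means reported for November 2018.
mei = morphoedaphic_index(48.98, 4.34)
lake_yield = yield_in_tonnes(potential_yield_kg_per_ha(mei))
print(f"MEI = {mei:.2f}, potential yield = {lake_yield:.0f} tonnes")
```

Because the yield model is nonlinear, a yield computed from lake-wide means will not exactly equal the average of yields computed station by station, so this figure need not match the station-based averages reported in the results.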

Moreover, the mean absolute percentage error (MAPE; %) was calculated for the potential yield estimates to examine the level of accuracy between the yield obtained using the EC measured in the field and that obtained from the developed EC model during Nov 2018 and Nov 2015.

3 Results and discussion

3.1 Descriptive statistics of field measurements

Table 2 presents the descriptive statistics of the field measurements of electrical conductivity (EC; mS/cm) and water depths (Z; m) recorded in the 17 sampling sites distributed along the eastern and western basins of Lake Qaroun during November 2018.

Table 2 Descriptive summary statistics of the electrical conductivity (EC; mS/cm) and water depths (Z; m) measured in Lake Qaroun along the eastern and western basins as well as along the entire lake during November 2018

In the present study, EC values ranged from 42.86 mS/cm (station #3; near El-Bats drain) to 49.08 mS/cm (station #6), with an average value of 46.08 ± 0.93 mS/cm, along the eastern basin and from 47.01 mS/cm (station #17; near El-Wadi drain) to 52.55 mS/cm (station #12), with an average value of 50.57 ± 0.58 mS/cm, along the western basin of Lake Qaroun during November 2018 (Table 2; Fig. 3). Our results agree with those of Goher et al. (2018) and Al-Afify et al. (2019), who reported that the station near El-Bats drain recorded the lowest EC value, followed by that near El-Wadi drain. The ranges between the minimum and maximum EC values along the eastern and western basins in the present study were 6.22 and 5.54 mS/cm, respectively (Table 2). The overall mean EC value in Lake Qaroun was 48.98 ± 0.72 mS/cm during November 2018. According to Waiser and Robarts (2009), Lake Qaroun can be regarded as mesohaline (30–70 mS/cm).

Fig. 3 Spatial distribution of the electrical conductivity (mS/cm) and depth (m) readings as obtained from GIS analysis along Lake Qaroun during November 2018

Fig. 4 Graphical scatter plots of observed against predicted values of electrical conductivity at Lake Qaroun during 2018. a in the training phase (n = 27); b in the validation phase (n = 9); c the overall pattern of the observed and predicted values of electrical conductivity along the entire 36 stations

The spatial distribution of EC along the entire lake showed that EC was lowest in the eastern part and increased gradually towards the northwestern area (Fig. 3). This could be attributed to the dilution effect of drainage water, which discharges into the lake from the southeastern side rather than the northwestern side (Abdel Wahed et al. 2015; El-Zeiny et al. 2019).

Depth measurements in Lake Qaroun indicated that station #12 (at the southwestern side) and station #2 (at the southeastern side) were the shallowest (2.2 m and 2.24 m, respectively) (Table 2; Fig. 3), whereas the deepest areas (> 6 m) were recorded at stations #7 (8.9 m), #11 (7.57 m), and #3 (6.8 m) in the northern areas of the lake (Fig. 3). The overall mean depth recorded in Lake Qaroun was 4.34 ± 0.48 m (Table 2). Our results agree with those of Elgamal et al. (2017), who mentioned that the northern side of Lake Qaroun near Qarn Island recorded the highest depth of 9.0 m and that the average depth of the lake was about 4.20 m.

3.2 Morphometric analysis

The morphometric characteristics of lakes play a paramount role in the ecological processes that support their productivity (Ezekiel et al. 2019). Accordingly, the current study analyzed the morphometry of Lake Qaroun.

Table 3 summarizes the physical characteristics of Lake Qaroun. Results obtained from analyzing the lake’s boundary shapefile in ArcMap (10.1) showed that the lake basin is aligned in a SW-NE direction, with a surface area (A) of 228 km² (22,800 ha), a maximum length (Lmax) of 41.07 km, and a maximum width (Wmax) of 9.8 km, resulting in an average width (Wa) of 5.55 km and an elongation value of 0.23 (Table 3). According to Wirth (2004), the lake becomes more elongated as the ratio approaches zero. The maximum effective length (Le) and width (We) are 89.5 and 99.5% of the maximum length and width of Lake Qaroun, respectively. These characteristics promote the proper mixing of water within the lake, which in turn enhances the circulation of oxygen required for life underwater (Ezekiel et al. 2019).

Table 3 Morphometric parameters of Lake Qaroun during November 2018

The lake level is at ≈ 47 m below sea level, which makes it the deepest area in El-Fayoum depression. Consequently, the lake receives most of the natural (subsurface flow) and artificial (agricultural) drainage of El-Fayoum depression (Abd El-Wahed et al. 2015). In the current study, the lake’s volume was estimated as 9.879 × 10⁸ m³ (Table 3). In 1999, the lake received about 338 × 10⁶ m³ of drainage water from the two main drains in addition to about 67.8 × 10⁶ m³ from groundwater (El-Shabrawy and Dumont 2009). In 2011, the amount of water reaching the lake from the two main drains reached 419.56 × 10⁶ m³ (Fouda and Fishar 2012).

GIS analysis and mathematical computation yielded a shoreline length (SL) of 139 km and a shoreline development index (DL) of 2.59 for Lake Qaroun (Table 3) indicating a subrectangular elongated lake that can support the lake’s potential for fisheries development (Ezekiel et al. 2019).

Analysis of the depth readings revealed that Lake Qaroun had a maximum depth (Zmax) of 8.93 m and a mean depth (Ź) of 4.34 m. The depth ratio (Rz) of 0.48 for Lake Qaroun (Table 3) indicated that the lake basin is conical in shape (Kalff 2002).

3.3 Regression analysis

3.3.1 Correlation

In the current study, during November 2018, EC showed very strong positive correlations (R > 0.80) with Landsat band ratios rather than with individual bands. The correlation pattern followed the order B2/B4 (0.917) > B2/B8 (0.915) > B1/B4 (0.911) > B1/B8 (0.907) > B1/B3 (0.887) > B2/B3 (0.886) > B3/B4 (0.880) > B7/B5 (0.870) > B7/B6 (0.860) > B8/B4 (0.854) > B7/B4 (0.851) > B6/B5 (0.836) > B1/B2 (0.833) > B6/B4 (0.824) > B7/B8 (0.815). This strong correlation can be attributed to the fact that band ratios can, to some extent, eliminate the influence of the atmosphere (Kutser 2012). This is why regression models based on spectral ratios have been found to be more powerful than those based on a single band (Emam 2016; Khalil et al. 2016; Deutsch et al. 2018).
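The screening step behind this ranking can be sketched as a Pearson correlation followed by a threshold filter. The band-ratio values and EC readings below are hypothetical placeholders, the function names are illustrative, and the filter keeps only positive correlations above 0.80, matching how the threshold is stated in the text.

```python
import math

def pearson_r(x_values, y_values):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    var_x = sum((x - mean_x) ** 2 for x in x_values)
    var_y = sum((y - mean_y) ** 2 for y in y_values)
    return cov / math.sqrt(var_x * var_y)

def screen_predictors(candidates, ec_values, threshold=0.80):
    """Return {name: R} for candidates with R above the threshold, strongest first."""
    scores = {name: pearson_r(values, ec_values) for name, values in candidates.items()}
    kept = {name: r for name, r in scores.items() if r > threshold}
    return dict(sorted(kept.items(), key=lambda item: item[1], reverse=True))

# Hypothetical values for five sampling points (not the study's data).
ec = [43.0, 45.0, 47.0, 49.0, 51.0]
candidates = {
    "B2/B4": [1.20, 1.24, 1.28, 1.32, 1.36],
    "B6/B4": [1.00, 1.30, 0.90, 1.20, 1.00],
}
selected = screen_predictors(candidates, ec)
print(selected)
```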

3.3.2 Model calibration

Table 4 summarizes the statistical outputs of the regression analysis performed using the above-mentioned Landsat band ratios as independent variables to predict the electrical conductivity over Lake Qaroun during November 2018. The key to selecting the best-fit regression model is to choose an appropriate regression method and independent variables that result in the highest R² (Mushtaq and Lala 2016), the lowest SE, and the most significant values (Emam 2016). In accordance with Emam (2016) and Deutsch et al. (2018), we observed that incorporating more spectral information into the model enhances the value of R² (Table 4). Accordingly, the cubic regression model for the Landsat band ratio Green “B3”/Red “B4” (R² = 0.870; adjusted R² = 0.859; ± SE = 0.853; P < 0.0001) (Table 4) was considered the best regression model. EC is sensitive in the visible spectrum (Avdan et al. 2019), and the visible red band (B4) can differentiate the reflectance of each salinity class (Gorji et al. 2020). This probably explains why B3/B4 showed the best regression results.

Table 4 Summary output of the polynomial regression statistics

Moreover, the study examined all four possible term combinations within the cubic polynomial model, with their associated error metrics and significance values (Table 5), to ensure that the model with the least prediction error was selected. Accordingly, model #2 (EC = −476.800 + 879.266 (B3/B4)² − 436.411 (B3/B4)³) was selected, as it showed significant regression coefficients (P < 0.001), the highest R², ENS, and d, and the lowest error metrics (RMSE, MAE, PBIAS, and MAPE).
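Applying model #2 to a pixel is a one-line evaluation of the polynomial. The coefficients are those quoted above; the function name and the reflectance values in the example are hypothetical.

```python
def predict_ec(green_reflectance, red_reflectance):
    """Estimate EC (mS/cm) from Landsat-8 B3 (green) and B4 (red) reflectance
    using the coefficients of model #2 as quoted in the text."""
    ratio = green_reflectance / red_reflectance
    return -476.800 + 879.266 * ratio ** 2 - 436.411 * ratio ** 3

# Hypothetical reflectance values for a single pixel (B3/B4 = 1.3).
print(round(predict_ec(0.13, 0.10), 2))
```

As with any fitted polynomial, the model is best treated as valid only within the range of band ratios seen during calibration, since a cubic can behave erratically when extrapolated.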

Table 5 Statistical error metrics used to compare and assess the performance of the four cubic models proposed in the current study during the training phase

3.3.3 Model validation

After developing the model, the study used the remaining nine samples to assess its validity in estimating EC along Lake Qaroun. Table 6 summarizes the statistical performance of the developed EC model in terms of R, R², ENS, d, RMSE, MAE, PBIAS, and MAPE for both the training and validation sets.

Table 6 Summary output for the statistical error metrics used to assess the performance of the developed model

R and R² values in both the training and validation sets were greater than 85% (Table 6), signifying a very strong correlation that succeeded in estimating 95% of EC values. In both sets, there were no significant differences (p < 0.001) between the measured and the estimated EC values. Willmott’s index (d) values in both sets were very close to one (Table 6), revealing a near-perfect match between the measured and estimated EC values. Moreover, the ENS (> 0.75) and PBIAS (< ± 10) results (Table 6) indicated very good performance.

The current results showed that EC retrieval was satisfactory according to the RMSE, which is the most important criterion of fit when the main objective of the model is prediction (Emam 2016). The RMSE values obtained for training and validation were 0.804 and 1.477, respectively (Table 6). The RMSE should ideally be less than half the standard deviation (SD) (Singh et al. 2004). Since the RMSE in the current study was less than half the SD (Table 6), the performance of the model can be considered high.

Furthermore, the MAPE values were less than 5% in both the training and validation sets (Table 6), indicating that the accuracy of the model is high and acceptable (Lewis 1982; Swanson 2015); in the validation phase, EC was overestimated by 1.23 mS/cm on average (MAE = 1.23 mS/cm).

Regarding the graphical validation, Fig. 4 illustrates how close the measured EC values were to the model predicted values—in both the training and validation sets—during 2018.

Moreover, the study applied the generated model to the Landsat image (Fig. 5) to ensure that the noncalibrated sites throughout the lake provided reasonable EC values and a reasonable spatial distribution over the entire lake. The western and northwestern areas of the lake clearly displayed the highest EC values, whereas the edges of the eastern basin recorded the lowest values (Fig. 5). Abdelmalik (2018) noted a similar distribution pattern in Lake Qaroun using an ASTER image from 2007.

Fig. 5 Electrical conductivity spatial distribution map generated from the developed model

3.3.4 Model testing

The study evaluated the model’s performance on earlier field data for the same month in 2015 (Table 6). Based on the values of ENS and PBIAS, the model showed very good performance, estimating 97% (R²) of EC values with a satisfactorily low RMSE (2.5824 mS/cm) during 2015. Based on the MAPE, the estimated EC values deviated from the measured readings by 4.793% on average, revealing acceptably high accuracy.

The validation and testing results indicate that the model developed in the current study, which uses the Landsat band ratio (B3/B4) to estimate the electrical conductivity along Lake Qaroun, is promising and can be used as a decision-support tool for tracking the lake’s electrical conductivity in the future.

Table 7 compares the model developed in the present study to estimate EC using Landsat-8 with those of previous studies. Clearly, no single model is common to all waterbodies. Ferdous and Rahman (2020) used a Landsat-8 combination of B3, B4, B2*B3, (B2*B3)/B4, and (B2*B4)/B3 as independent variables to estimate EC for Satkhira Upazilas in Bangladesh (R² = 0.76; p < 0.001). González-Márquez et al. (2018b) used the Landsat-8 band ratio (B2-B3)/(B4-B6) to estimate EC for El-Guájaro reservoir in Colombia (R² = 0.69; p < 0.05). Mushtaq and Lala (2016) used the Landsat 8 red band (B4) in their model to estimate EC for Wular Lake in Kashmir (R² = 0.615; p < 0.004). The differences in the selected independent variables (bands and band ratios) may be related to differences in image quality or may depend on the physicochemical properties of the water body (Zhao et al. 2011). Even for the same lake, different satellites resulted in different models (Table 7). Abdelmalik (2018) used an ASTER image to generate a linear regression model to estimate EC for Lake Qaroun during 2007 (R² = 0.996; p < 0.0001). The wavelengths of the bands used in his model (range 0.78–2.365 μm) were longer than those used in the current study (range 0.53–0.67 μm) for the same lake. This discrepancy could be attributed to the increase in the electrical conductivity of the lake between 2007 and 2018. Theologou et al. (2015) noted that increasing water salinity changes the reflectance values within the visible and infrared bands.

Table 7 Electrical conductivity (EC; mS/cm) model developed in the present study in comparison with other previous studies

In closing, it is worth emphasizing that monitoring EC using Landsat-8 depends on the lake’s state. The model developed in the current study is site-specific and is relevant only to other lakes whose environment resembles that of Lake Qaroun.

3.4 Model application

The quantity of fish produced in Lake Qaroun depends on the potential productivity of the lake. Table 8 presents descriptive statistics of the potential fish yield estimates (tonnes) calculated for Lake Qaroun during November 2015 and 2018 from MEI values, comparing those obtained from the Landsat-based EC model with those obtained from field measurements.

Table 8 Descriptive summary statistics of potential yield values (Y; tonnes) estimated based on morphoedaphic index (MEI) for Lake Qaroun

Abdel-Satar et al. (2010) reported that electrical conductivity varied significantly among seasons (p < 0.01) in Lake Qaroun. Accordingly, the current study focused on applying the developed model in estimating the morphoedaphic index and the potential fish yield of Lake Qaroun during the month of November.

During Nov. 2018, MEI values obtained from the field measurements ranged from 5.26 to 23.84 with an average value (± SE) of 13.52 ± 1.38, whereas those obtained from the developed model ranged from 5.20 to 23.57 with an average value (± SE) of 13.61 ± 1.39 (Table 8). There was no significant difference (P < 0.00001) between the two methods in estimating the values of MEI (Table 8), which indicates that the Landsat-based EC model can be used to estimate the lake’s potential productivity.
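
The MEI summary statistics reported above can be illustrated with a short sketch. It follows the definition given in the Introduction, with EC substituted for TDS and divided by depth, and computes the mean and standard error from field and model EC values at the same stations. The use of per-station depths, the units (EC in mS/cm, depth in m), and all numbers below are assumptions for demonstration. The empirical yield equation is not reproduced in this section, so the sketch stops at MEI.

```python
import math
import statistics


def morphoedaphic_index(ec_values, depths):
    """MEI per station: electrical conductivity divided by depth."""
    if any(d <= 0 for d in depths):
        raise ValueError("depths must be positive")
    return [ec / d for ec, d in zip(ec_values, depths)]


def mean_and_se(values):
    """Arithmetic mean and standard error (sample SD / sqrt(n))."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical station data, NOT the study's measurements.
depth_m = [2.1, 3.5, 4.8, 6.0, 8.5]
field_ec = [49.5, 48.7, 49.0, 48.2, 47.9]    # measured EC, mS/cm
model_ec = [49.1, 49.0, 48.6, 48.5, 47.5]    # Landsat-estimated EC, mS/cm

for label, ec in (("field", field_ec), ("model", model_ec)):
    mei = morphoedaphic_index(ec, depth_m)
    mean, se = mean_and_se(mei)
    print(f"{label}: MEI {min(mei):.2f}-{max(mei):.2f}, "
          f"mean {mean:.2f} +/- {se:.2f} (SE)")
```

Because both MEI series share the same depths, any difference between them comes only from the EC estimates, which is why agreement in EC carries over directly to MEI.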

Based on MEI values, the average potential fish yield estimated for Nov. 2015 (1017.01 and 987.64 tonnes from Landsat and field data, respectively) was lower than that for Nov. 2018 (1166.74 and 1163.32 tonnes from Landsat and field data, respectively) (Table 8), owing to the increase in mean electrical conductivity from 39.16 mS/cm (Nov. 2015) to 48.98 mS/cm (Nov. 2018). This result agrees with Jackson and Marmulla (2001), who reported that reservoirs with high concentrations of dissolved solids have high potential productivity. Moreover, according to the annual reports of the environmental monitoring program for Lake Qaroun, the average annual chlorophyll-a reading increased from 66.6 μg/l in 2013 to 124.13 μg/l in 2018 (https://www.eeaa.gov.eg/en-us/topics/water/lakes.aspx).

The mean absolute percentage error (MAPE) was computed to determine the accuracy of the Landsat-based EC model in estimating the potential yield of Lake Qaroun during Nov. 2015 and Nov. 2018. The MAPE results showed that the estimated potential fish yields were within 3.64% and 0.90% of the values derived from the measured readings during Nov. 2015 and Nov. 2018, respectively, an acceptably high accuracy (< 5%) (Swanson 2015).

By comparison, the actual fish yield from Lake Qaroun was 96 tonnes in Nov. 2015 (CAMPAS 2017) and fell by 26.0% over 3 years to 71 tonnes in Nov. 2018 (CAMPAS 2020a). This decrease in actual production, despite an increase in potential production, indicates that some factor is limiting the lake’s production efficiency. The decrease could be attributed to the cymothoid ectoparasite that entered the lake accidentally during fish fry transplantation (Mehanna 2020) and/or to the high densities of algal blooms reported in Lake Qaroun by Ibrahim et al. (2021).

4 Conclusion

Our results show that the EC model derived in the current study from Landsat-8 OLI data for Lake Qaroun can serve as an efficient decision-support tool, assisting managers not only in monitoring the lake’s electrical conductivity regularly during the month of November but also in making preliminary estimates of the lake’s potential yield.

The Landsat-8 band ratio B3/B4 was found to be the most prominent independent variable for EC retrieval in Lake Qaroun. The model performed very well, significantly estimating 95% of EC values with acceptably high accuracy. It is worth mentioning that the model developed in the current study is site-specific and is relevant only to other lakes whose environment resembles that of Lake Qaroun.