1 Introduction

Environmental monitoring of estuarine waterbodies is a fundamental tool for ensuring that water quality standards are met in these ecosystems. Data obtained from the monitoring network (a set of measurements of selected parameters, such as pollutant or nutrient concentrations, salinity, temperature, pH, chlorophyll, and dissolved oxygen) can be used for general management and/or restoration of water quality in these areas.

One of the most important issues in the design of a monitoring network is the number and location of the sampling stations. Their number is usually limited by the available budget, but their locations, which in the past were mainly fixed by the intuition and experience of stakeholders and decision-makers, need to be chosen systematically and scientifically in order to optimize the performance of the network. Scientific studies show that the accuracy of identifying pollution sources is highly dependent on the location of these monitoring stations [1]. Therefore, finding a set of optimal locations for the sampling points is essential to correctly characterize pollution sources (wastewater discharges, accidental spills, runoff, etc.).

In fact, as remarked by several authors [1, 2], the placement of the sampling stations can be considered the most critical factor in the design of any water quality monitoring network. The selection of these optimal sampling points has been addressed in several works, but mainly from a statistical viewpoint (a geostatistical approach combined with simulated annealing [3, 4], fuzzy logic based on a geographic information system [5], multivariate statistical techniques [6], cellular automata-Markov chain models [7], graphical optimization by interpolation via correlation coefficients and standard deviations [8, 9], Kriging variance combined with simulated annealing [10], a profile likelihood approach [11], etc.).

The aim of our research is to present a novel and effective approach to the optimal allocation of sampling points within a simulation-based optimization framework, in the spirit of previous works of the authors on river water quality monitoring systems [12,13,14]. That earlier one-dimensional setting was a much simpler problem, from both the simulation and the optimization viewpoints. Here, we formulate the problem as a two-level optimization problem: the upper level problem is the optimal choice of the sampling point locations, judged by their ability to capture the correct information on the intensity and location of possible pollution releases, and the lower level problems concern the optimal determination of these pollution sources.

Specifically, the upper level optimization problem consists of finding the sampling locations that best determine a large number of random point source pollution episodes. This problem can be formulated as an optimization problem where the objective function, which measures the global accuracy of the set of sampling stations, is given by the sum of the optimal approximation errors at the sampling points over all the source pollution cases considered. In our study, the minimization is carried out via a controlled random search procedure for global optimization [15], in order to reduce the risk of being trapped in local minima.

The (subsidiary) lower level problems concern the optimal identification (location and intensity) of the numerous random pollution sources, which is a critical step in managing the quality of estuarine waters. This problem has been much more extensively studied (see, for instance, the recent review [16] for surface water and groundwater, and the numerous references therein). However, in our two-dimensional case, this inverse problem can be mathematically ill-posed. In this respect, several authors have proved the well-posedness of the problem when data are available on the whole boundary of the domain [17, 18], or when data are known at three sampling points in the case of the whole unbounded two-dimensional space [19, 20]; the case of a finite number of sampling points in a bounded domain remains, as far as we know, an open problem. This identification problem can also be reformulated as an optimization problem where the cost function depends on the differences between the observed and the predicted values of the pollutant concentrations at the sampling points. A large variety of methods have been proposed for solving the pollution source identification problem, and they can be categorized into three main groups according to their approach: the probability-based approach (including Bayesian inference [21], the backward probability method [22], minimum relative entropy [23], and many others); the classification approach [24, 25]; and, finally, the optimization-based approach, where the differences between simulated and observed pollutant concentrations at several points (the former obtained by solving a numerical model) are minimized by means of a wide range of optimization algorithms of derivative-based, derivative-free, or hybrid type [26,27,28]. In our case, we have chosen this linked simulation-optimization option, employing the gradient-free Nelder-Mead algorithm [29] for the minimization process and a convection-diffusion-reaction equation for the simulation step.

This article is divided into five sections. This introduction is followed by a second section devoted to the rigorous formulation of the problem, whose computational implementation is detailed in Section 3. The final sections present the numerical results for a case study posed in the Ría of Vigo (NW Spain), together with some discussion and conclusions.

2 Mathematical Setting of the Problem

We consider a domain \(\Omega \subset \mathbb {R}^2\) occupied by shallow waters, for instance an estuary or a ría (river end flowing into the sea), and we are interested in determining the optimal locations of a (usually small) number N of sampling points in a water quality monitoring network, that is, we want to find the best locations \(p_i \in \Omega , \ i=1,\dots ,N,\) for the sampling points, with the only constraint that each point \(p_i\) must lie inside a desired area \(U_i \subset \Omega\) such that \(\mathrm{int}(U_j) \cap \mathrm{int}(U_k) = \emptyset , \ \forall j \not = k \in \{1,\dots ,N\}\).

We understand as the best locations a vector \(p=(p_1,\dots ,p_N) \in \prod _{i=1}^N U_i\) that allows, from the sampling data taken at those points for a time interval (0, T) — a vector function \(d(t)=(d^1(t), \dots , d^N(t)) \in [\mathcal {C}(0,T)]^N\) — to determine in the most accurate way the discharge point \(b \in \Omega\) and the discharge intensity — a function \(m(t) \in L^{\infty }(0,T)\) satisfying \(m_{min} \le m(t) \le m_{max},\) a.e. \(t \in (0,T)\) — that have caused the pollution levels collected.

So, if we denote by \(c_{(m,b)}(x,t)\) the concentration at point \(x\in \Omega\) and at time \(t \in (0,T)\) of an undesired pollutant (say, for instance, coliform bacteria Escherichia coli) coming from a discharge of intensity m at point b, then its evolution along \(\Omega \times (0,T)\) can be obtained as the solution of the following initial/boundary value problem (see, for instance, [30]):

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle {\frac{\partial c}{\partial t}+\mathbf {u} \cdot \nabla c-\beta \Delta c +\kappa c} = \displaystyle {\frac{1}{h} \, m \, \delta (x-b)} \ \ \mathrm{in } \ \Omega \times (0,T) , \\ \displaystyle {\frac{\partial c}{\partial n}}= 0 \ \ \mathrm{on } \ \Gamma \times (0,T) , \\ \displaystyle {c(x,0)} = c_0 \ \ \mathrm{in } \ \Omega , \end{array}\right. \end{aligned}$$
(1)

where \(\Gamma\) is the boundary of \(\Omega\) (assumed to be smooth enough), m(t) is the mass flow rate of E. coli discharged in b, and \(\delta (x-b)\) denotes the Dirac measure at point b. Experimentally known parameters \(\beta\) and \(\kappa\) correspond to horizontal viscosity and decay rate, respectively. Finally, h(x, t) denotes the water depth, and \(\mathbf {u}(x,t)\) is the depth-averaged horizontal velocity of water, which can be measured in situ or estimated as the solution of the classical shallow water equations.

Then, for each set of sampling points \(p \in \prod _{i=1}^N U_i\) and for each set of sampling data \(d(t) \in [\mathcal {C}(0,T)]^N\), we define the cost function:

$$\begin{aligned} J^{p,d}(m,b) = \sum\limits_{i=1}^N \int _0^T \left( \frac{c_{(m,b)}(p_i,t) - d^i(t)}{d^i(t)} \right) ^2 dt , \end{aligned}$$
(2)

which measures the difference between the pollution levels caused by a discharge of intensity m at point b, and the levels collected in the samples taken at points \(p_i\), for \(i=1,\dots ,N\). The objective is to determine the vector p so that each pollution discharge \((\bar{m}, \bar{b})\) can be recovered by numerically solving the inverse problem:

$$\begin{aligned} \left. \begin{array}{cl} \min &{} J^{p,\bar{d}}(m,b) \\ (m,b) \in L^{\infty }(0,T) \times \Omega , &{} \\ m_{min} \le m(t) \le m_{max}, \ \text{ a.e. } t \in (0,T) &{} \end{array}\right. \end{aligned}$$
(3)

where \(\bar{d}(t)\) denotes the pollution level samples, observed at monitoring points p, caused by discharge \((\bar{m}, \bar{b})\).
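As an illustration of how the cost function Eq. (2) might be evaluated once concentrations are available on a time grid, the following minimal sketch approximates the time integral with the composite trapezoidal rule. The uniform time step `dt`, the nested-list layout, and the function names are assumptions made for this example only; they are not taken from the authors' implementation.

```python
from typing import Sequence


def trapezoid(values: Sequence[float], dt: float) -> float:
    """Composite trapezoidal rule on a uniform time grid with step dt."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def relative_misfit(simulated: Sequence[Sequence[float]],
                    observed: Sequence[Sequence[float]],
                    dt: float) -> float:
    """Discrete analogue of Eq. (2).

    simulated[i][k] approximates c_{(m,b)}(p_i, t_k) and observed[i][k]
    approximates d^i(t_k). Eq. (2) divides by d^i(t), so the observed
    values are assumed to be nonzero.
    """
    total = 0.0
    for sim_i, obs_i in zip(simulated, observed):
        integrand = [((s - d) / d) ** 2 for s, d in zip(sim_i, obs_i)]
        total += trapezoid(integrand, dt)
    return total


# One sampling point, three time levels (made-up numbers).
observed = [[1.0, 2.0, 4.0]]
print(relative_misfit(observed, observed, dt=0.5))            # perfect fit
print(relative_misfit([[2.0, 4.0, 8.0]], observed, dt=0.5))   # 100% relative error
```

Because the residual is scaled by the data, a sampling point with low concentrations weighs as much in the sum as one close to the discharge.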

It is worth remarking here that, for any set of locations p, if for a discharge \((\bar{m}, \bar{b})\) we solve the problem Eq. (1) and consider the synthetic samples given by \(\bar{d}^p = (\bar{d}^{p_1}, \dots , \bar{d}^{p_N})\), with \(\bar{d}^{p_i}(t) = c_{(\bar{m},\bar{b})}(p_i,t)\), for \(i=1,\dots ,N\), then \(J^{p,\bar{d}}(\bar{m}, \bar{b})=0\) and, consequently, \((\bar{m}, \bar{b})\) is a solution of the inverse problem Eq. (3). However, we need to ensure a correct numerical resolution of this inverse problem, which strongly depends on the chosen vector p, since an unsuitable choice of the set of sampling locations might lead to an inaccurate solution, possibly very different from the theoretical solution \((\bar{m}, \bar{b})\). Our methodology aims to determine the best set p of sampling point locations, that is, one which can recognize all possible discharges.

So, in order to assess the quality of a particular set of sampling point locations \(p=(p_1,\dots ,p_N)\), we consider a (large) number M of random pollution sources (located at points \(b_j \in \Omega\) and with constant intensities \(m_j \in [m_{min}, m_{max}]\), for \(j=1, \dots ,M\)), which must be identified by the monitoring network from data given by this particular set of sampling points.

After solving problem Eq. (1) for the different pollution scenarios \((m_j, b_j), \ j=1,\dots ,M,\) we can obtain the synthetic data \(d_j^p\) corresponding to the set of sampling points p, given by:

$$\begin{aligned} d_j^{p_i}(t)=c_{(m_j,b_j)}(p_i,t), \ \text{ for } i=1,\dots ,N. \end{aligned}$$
(4)

Then, for the given p and for each \(j \in \{1,\dots ,M\}\), we solve the following intermediate optimization problem (\(\mathcal {P}_j^p\)): find \((\tilde{m}_j(p) , \tilde{b}_j(p))\) solution of problem Eq. (3) for data \(\bar{d} = d_j^p\).

Thus, we can define the function:

$$\begin{aligned} J_j(p) = J^{p,d_j^p}(\tilde{m}_j(p), \tilde{b}_j(p)) \end{aligned}$$
(5)

which quantifies the goodness of the particular set of sampling point locations p; that is, function \(J_j(p)\) measures the accuracy achieved by the set of sampling point locations \(p=(p_1,\dots ,p_N)\) when identifying the j-th pollution source \((m_j, b_j)\).

Finally, in order to find the best locations for the sampling points, we need to solve the following global optimization problem (\(\mathcal {P}\)): find \(\tilde{p}=(\tilde{p}_1,\dots ,\tilde{p}_N)\), with \(\tilde{p}_i \in U_i, \ \forall i\in \{1,\dots ,N\},\) minimizing the objective function J given by:

$$\begin{aligned} J(p) = \sum _{j=1}^M J_j(p) , \end{aligned}$$
(6)

that is, we look for the optimal set of sampling points locations that identifies in the most accurate way all the random pollution scenarios chosen at the beginning of the monitoring network design process. Last but not least, we must note that, due to the strong nonlinearities of the global problem, uniqueness for solution \(\tilde{p}\) is not expected. Nevertheless, this does not represent any difficulty for our approach, since any of these possible global minima is good enough and suitable for our purposes. So, it suffices to compute one of them.
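The nesting of Eqs. (4), (5), and (6) can be sketched in a few lines. This is only a structural illustration under loud assumptions: `forward` is a hypothetical, time-independent Gaussian plume standing in for the solution of Eq. (1), and the lower level problem is solved by a plain search over a finite candidate list, which is swapped in here for brevity in place of the Nelder-Mead method used in the paper.

```python
import math


def forward(m, b, p_i):
    """HYPOTHETICAL stand-in for solving Eq. (1): a static Gaussian plume."""
    dist2 = (p_i[0] - b[0]) ** 2 + (p_i[1] - b[1]) ** 2
    return m * math.exp(-dist2 / 4.0)


def misfit(p, data, m, b):
    """Time-independent analogue of the cost function Eq. (2)."""
    return sum(((forward(m, b, p_i) - d) / d) ** 2 for p_i, d in zip(p, data))


def inner_solve(p, data, candidates):
    """Lower level: best (m, b) in a finite list (stand-in for Nelder-Mead)."""
    return min(candidates, key=lambda mb: misfit(p, data, mb[0], mb[1]))


def upper_cost(p, scenarios, candidates):
    """Upper level objective J(p) of Eq. (6)."""
    total = 0.0
    for m_j, b_j in scenarios:
        data = [forward(m_j, b_j, p_i) for p_i in p]   # synthetic data, Eq. (4)
        m_t, b_t = inner_solve(p, data, candidates)    # problem (P_j^p)
        total += misfit(p, data, m_t, b_t)             # J_j(p), Eq. (5)
    return total


p = [(0.0, 0.0), (2.0, 0.0)]                           # two sampling points
scenarios = [(2.0, (1.0, 1.0)), (3.0, (0.5, -1.0))]    # made-up discharges
coarse = [(1.0, (0.0, 0.0)), (3.0, (2.0, 2.0))]        # made-up candidates
print(upper_cost(p, scenarios, scenarios + coarse))    # true sources reachable
print(upper_cost(p, scenarios, coarse))                # true sources not reachable
```

Every evaluation of J(p) therefore costs M forward simulations plus M inner minimizations, which is what makes the upper level problem expensive.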

For readers’ convenience, a detailed flowchart corresponding to the global process for the optimal network design can be found in Fig. 1.

Fig. 1 General flowchart for optimal design

3 Numerical Implementation

In this section, we present the full details of the computational resolution of the problem by means of a suitable discretization process, addressing the numerical resolution of the boundary value problem Eq. (1), the minimization of the intermediate optimization problems (\(\mathcal {P}_j^p\)), and the resolution of the global optimization problem (\(\mathcal {P}\)).

In particular, for solving problem Eq. (1), we consider the standard variational formulation of the problem, and apply finite element techniques for its resolution on a triangular mesh \(\Omega _h\) of the domain. To do this, we use the open-source finite element software FreeFem++ [31], through a full programming of the associated formulation. Moreover, in order to check the robustness of our approach, we have compared our results to those obtained by the 2D finite volume hydrodynamic model MIKE 21 [32], developed by the Danish Hydraulic Institute (DHI), showing good agreement from both a qualitative and a quantitative viewpoint.
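For reference, a formal sketch of such a variational formulation (not necessarily the exact form implemented by the authors) is obtained by multiplying the first equation of Eq. (1) by a test function v, integrating over \(\Omega\), applying Green's formula to the diffusion term, and using the homogeneous Neumann condition on \(\Gamma\), which cancels the boundary integral. Assuming constant \(\beta\) and \(\kappa\), and a test function v continuous at b so that the Dirac term is meaningful, this gives, for \(t \in (0,T)\):

$$\begin{aligned} \int _{\Omega } \frac{\partial c}{\partial t} \, v \, dx + \int _{\Omega } (\mathbf {u} \cdot \nabla c) \, v \, dx + \beta \int _{\Omega } \nabla c \cdot \nabla v \, dx + \kappa \int _{\Omega } c \, v \, dx = \frac{m(t)}{h(b,t)} \, v(b) , \end{aligned}$$

together with the initial condition \(c(x,0) = c_0\) in \(\Omega\). In a finite element discretization, the right-hand side reduces to evaluating the basis functions at the discharge point b.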

When solving the intermediate minimization problems (\(\mathcal {P}_j^p\)), for any particular p and for each \(j \in \{1, \dots ,M\}\), given the essentially geometric nature of the problem, the authors propose to use a direct search algorithm: the Nelder-Mead simplex method [29]. This gradient-free algorithm has been successfully used by the authors in other related environmental problems (see, for instance, their previous work [33], where a short description of the method can also be found), and presents good convergence properties in low dimensions (in our particular case, we are dealing with a three-dimensional design variable \((m_j,b_j)\) \(\in [m_{min}, m_{max}] \times \Omega \subset \mathbb {R}^3\)). In addition, the classical Nelder-Mead algorithm can be effectively modified with an oriented restart when stagnation at a non-optimal point is detected. However, since the Nelder-Mead algorithm was originally designed for unconstrained minimization problems, in order to apply it to the constrained optimization problem (\(\mathcal {P}_j^p\)) we first need to modify the cost function by adding a penalty term enforcing the constraints \((m_j,b_j) \in [m_{min}, m_{max}] \times \Omega\). This can be done in a simple and straightforward way: whenever \(m_j \not \in [m_{min}, m_{max}]\) or \(b_j \not \in \Omega\), we add a high penalty value to the original cost function, which prevents the vector \((m_j,b_j)\) from being accepted as a minimizer.
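The penalty idea described above can be sketched with a compact, simplified Nelder-Mead routine (reflection, expansion, inside contraction, shrink; no oriented restart). Everything specific here is an assumption for illustration: the penalty value `1.0e30`, the quadratic `toy_cost` standing in for the PDE-based cost, and the square standing in for \(\Omega\). Unlike the description in the text, the sketch returns the penalty alone for inadmissible points, so that the cost is never evaluated outside the admissible set.

```python
PENALTY = 1.0e30  # assumed value; the text only asks for "a high penalty value"


def penalized(cost, m_min, m_max, in_domain):
    """Wrap cost(m, bx, by) so that inadmissible points get a huge value."""
    def wrapped(x):
        m, bx, by = x
        if not (m_min <= m <= m_max) or not in_domain(bx, by):
            return PENALTY  # cost is not evaluated outside the admissible set
        return cost(m, bx, by)
    return wrapped


def nelder_mead(f, x0, step=1.0, tol=1e-12, max_iter=2000):
    """Simplified Nelder-Mead simplex search starting from x0."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x + (step if i == k else 0.0) for i, x in enumerate(x0)] for k in range(n)
    ]
    values = [f(v) for v in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if abs(values[-1] - values[0]) < tol:
            break
        worst = simplex[-1]
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflect = [2.0 * c - w for c, w in zip(centroid, worst)]
        f_r = f(reflect)
        if f_r < values[0]:                      # try to expand further
            expand = [3.0 * c - 2.0 * w for c, w in zip(centroid, worst)]
            f_e = f(expand)
            simplex[-1], values[-1] = (expand, f_e) if f_e < f_r else (reflect, f_r)
        elif f_r < values[-2]:                   # accept the reflection
            simplex[-1], values[-1] = reflect, f_r
        else:                                    # contract towards the centroid
            contract = [0.5 * (c + w) for c, w in zip(centroid, worst)]
            f_c = f(contract)
            if f_c < values[-1]:
                simplex[-1], values[-1] = contract, f_c
            else:                                # shrink towards the best vertex
                best = simplex[0]
                simplex = [best] + [
                    [0.5 * (b + x) for b, x in zip(best, v)] for v in simplex[1:]
                ]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    k = min(range(n + 1), key=lambda i: values[i])
    return simplex[k], values[k]


def toy_cost(m, bx, by):
    """HYPOTHETICAL smooth cost with minimum at m = 30, b = (1, 2)."""
    return (m - 30.0) ** 2 + (bx - 1.0) ** 2 + (by - 2.0) ** 2


def in_square(x, y):
    """HYPOTHETICAL domain test standing in for membership in Omega."""
    return 0.0 <= x <= 5.0 and 0.0 <= y <= 5.0


objective = penalized(toy_cost, 10.0, 80.0, in_square)
x_best, f_best = nelder_mead(objective, [35.0, 2.5, 2.5])
print(x_best, f_best)
```

The initial simplex must contain at least one admissible vertex; otherwise all values equal the penalty and the search stops immediately.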

Finally, for solving the global minimization problem (\(\mathcal {P}\)), we use a controlled random search procedure for global optimization [15] combined with a multi-start strategy in order to assure a better performance of the algorithm. Again, as in the intermediate optimization problems, constraints related to \(p_i \in U_i, \ i=1,\dots ,N\) need to be penalized in cost function Eq. (6) by adding a penalty term.
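To give a flavour of this kind of global search, here is a minimal sketch of a basic controlled random search in the style of Price's algorithm. It is not claimed to be the variant of [15] nor the authors' implementation: the population size `10 * (n + 1)`, the stopping tolerance, the fixed seed, and the quadratic test function are all assumptions, and the multi-start strategy is omitted. Box bounds are handled by discarding out-of-range trial points rather than by a penalty term.

```python
import random


def controlled_random_search(f, bounds, max_evals=20000, tol=1e-12, seed=0):
    """Basic controlled random search over the box given by bounds."""
    rng = random.Random(seed)
    n = len(bounds)
    pop_size = 10 * (n + 1)  # assumed population size
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    vals = [f(x) for x in pop]
    for _ in range(max_evals):
        worst = max(range(pop_size), key=lambda k: vals[k])
        best = min(range(pop_size), key=lambda k: vals[k])
        if vals[worst] - vals[best] < tol:
            break
        # Pick n + 1 distinct members; reflect the last through the
        # centroid of the first n to build a trial point.
        picks = rng.sample(range(pop_size), n + 1)
        centroid = [sum(pop[k][i] for k in picks[:-1]) / n for i in range(n)]
        trial = [2.0 * c - x for c, x in zip(centroid, pop[picks[-1]])]
        if any(not (lo <= t <= hi) for t, (lo, hi) in zip(trial, bounds)):
            continue  # discard trial points outside the box
        f_trial = f(trial)
        if f_trial < vals[worst]:
            pop[worst], vals[worst] = trial, f_trial  # replace the worst member
    best = min(range(pop_size), key=lambda k: vals[k])
    return pop[best], vals[best]


def toy_objective(x):
    """HYPOTHETICAL objective with minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


x_best, f_best = controlled_random_search(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
print(x_best, f_best)
```

Because the population is spread over the whole admissible box at the start, the search is less dependent on a single initial guess than a local simplex method.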

4 A Case Study

This section presents some numerical tests for a real-world scenario posed in the estuary Ría of Vigo (Galicia, NW Spain). This shallow water region, whose finite element mesh \(\Omega _h\) is depicted in Fig. 2, is delimited by the extremal points (measured in kilometers) \(A=( 504.5748, 4661.631)\), \(B=( 503.9068, 4679.963)\), \(C=( 532.2000, 4687.941)\), and \(D=( 530.6860, 4688.455)\), as shown in Fig. 2. Thus, the region \(\Omega\) under study extends in a northeast direction over a length of about 35 km with a maximum width of 18 km.

Fig. 2 Triangular mesh of Ría of Vigo, showing the delimiting points \(A, \, B, \, C\), and D, and the admissible areas \(U_1, \dots , U_5\) for sampling points locations

For the numerical computation of the E. coli concentrations, via the resolution of the initial/boundary value problem Eq. (1), we have taken the viscosity parameter \(\beta = 200.0\) and the decay rate \(\kappa = 4.134 \times 10^{-4}\) [34], with water depth h and horizontal velocity \(\mathbf {u}\) computed by means of our own Fortran code.

In our case, due to budget constraints, only \(N=5\) monitoring stations will be allocated. Then, in order to avoid an undesirable clustering of monitors in small areas, we have decided to divide the estuary into five vertical strips and place a monitoring station in each of them. In particular, the case shown here corresponds to the following five admissible areas: \(U_1 = \{ (x,y)\in \Omega \ : \ x \le 509.5 \}\), \(U_2 = \{ (x,y)\in \Omega \ : \ 509.5 \le x \le 515.2 \}\), \(U_3 = \{ (x,y)\in \Omega \ : \ 515.2 \le x \le 520.9 \}\), \(U_4= \{ (x,y)\in \Omega \ : \ 520.9 \le x \le 526.6 \}\), and \(U_5 = \{ (x,y)\in \Omega \ : \ 526.6 \le x \}\).
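The strip constraints above translate directly into a membership test and a penalty term of the kind added to Eq. (6). In the following minimal sketch, the penalty value and the function names are assumptions; only the x-bounds of each \(U_i\) are checked, since membership in \(\Omega\) itself would require the mesh. The coordinates used in the example are those of the initial guess for the global search reported later in this section.

```python
STRIP_EDGES = [509.5, 515.2, 520.9, 526.6]  # x-limits (km) separating U_1, ..., U_5
PENALTY = 1.0e30  # assumed value; the text only mentions "a penalty term"


def in_strip(i, x):
    """True if abscissa x satisfies the x-bounds of U_i, for i = 1, ..., 5.

    Membership in Omega itself is NOT checked here.
    """
    lo = STRIP_EDGES[i - 2] if i > 1 else float("-inf")
    hi = STRIP_EDGES[i - 1] if i <= len(STRIP_EDGES) else float("inf")
    return lo <= x <= hi


def strip_penalty(p):
    """Penalty for a vector p = (p_1, ..., p_5) of (x, y) points."""
    return sum(PENALTY for i, (x, _y) in enumerate(p, start=1) if not in_strip(i, x))


initial_guess = [
    (508.4854, 4662.537),
    (513.9167, 4672.903),
    (517.2793, 4674.887),
    (522.8691, 4679.750),
    (530.4425, 4683.326),
]
print(strip_penalty(initial_guess))        # every point lies in its own strip
print(strip_penalty(initial_guess[::-1]))  # reversed order violates the strips
```

Since the strips are closed, a point with x equal to a shared edge (for example x = 515.2) is admissible for both neighbouring areas.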

To determine the goodness of each set of sampling locations p, we employ the following \(M=9\) synthetic discharges for \(m_{min}=10.0\) and \(m_{max}=80.0\): \(m_1 = 30.0, \ b_1 = ( 513.9167, 4672.903)\), \(m_2 = 50.0 , \ b_2 = ( 509.0681, 4669.368)\), \(m_3 = 70.0 , \ b_3 = ( 522.4740, 4678.645)\), \(m_4 = 30.0 , \ b_4 = ( 517.5092, 4675.787)\), \(m_5 = 20.0 , \ b_5 = ( 511.0788, 4665.341)\), \(m_6 = 40.0 , \ b_6 = ( 510.9115, 4674.748)\), \(m_7 = 20.0 , \ b_7 = ( 520.8904, 4676.194)\), \(m_8 = 50.0 , \ b_8 = ( 526.0731, 4680.025)\), \(m_9 = 60.0 , \ b_9 = ( 529.9084, 4684.429)\), which must be identified from p by the Nelder-Mead algorithm with a multi-start approach (in our case, choosing the best result from three different initializations).

Fig. 3 Optimal monitoring station locations \(\tilde{p}\)

Fig. 4 Sub-optimal monitoring station locations \(\hat{p}\)

Table 1 Synthetic discharges identified by optimal and sub-optimal solutions

Then, applying the controlled random search procedure for the initial guess \(p_1=( 508.4854, 4662.537)\), \(p_2=( 513.9167, 4672.903)\), \(p_3=( 517.2793, 4674.887)\), \(p_4=( 522.8691, 4679.750)\), and \(p_5=( 530.4425, 4683.326)\), with a cost function value of \(J(p)=1.434 \times 10^2\), we achieved several optimal and sub-optimal solutions. For the sake of simplicity, we present here only two of them. So, we obtained the optimal solution \(\tilde{p}\) (corresponding to a cost function value \(J(\tilde{p}) = 6.175 \times 10^{-22}\)), given by \(\tilde{p}_1=( 509.5000, 4676.370)\), \(\tilde{p}_2=( 515.2000, 4675.295)\), \(\tilde{p}_3=( 515.2000, 4674.841)\), \(\tilde{p}_4=( 520.9000, 4675.953)\), and \(\tilde{p}_5=( 531.9204, 4687.832)\). We also obtained the sub-optimal solution \(\hat{p}\) (corresponding to a cost function value \(J(\hat{p}) = 7.990 \times 10^{-13}\)), given by \(\hat{p}_1=( 504.4070, 4674.276)\), \(\hat{p}_2=( 514.2178, 4672.690)\), \(\hat{p}_3=( 515.7867, 4675.043)\), \(\hat{p}_4=( 521.5240, 4676.762)\), and \(\hat{p}_5=( 530.8901, 4685.879)\). These optimal and sub-optimal locations for the sampling points are shown in Figs. 3 and 4, respectively. Finally, for both solutions, the identified locations and intensities for the synthetic discharges can be seen in Table 1.

From a straightforward analysis of the above results, we can see that both the optimal and the sub-optimal solutions are able to identify very accurately all the random synthetic discharges (employed to calibrate the goodness of the set of sampling point locations), giving exact intensities and locations in the optimal case, and almost exact results in the sub-optimal one.

We must also note that, contrary to the sub-optimal case where the five monitors are clearly separated, in the optimal distribution the monitors \(\tilde{p}_2\) and \(\tilde{p}_3\) are practically adjacent, which suggests that the number of sampling stations could perhaps be reduced from five to four without a loss of quality.
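This observation can be checked directly from the reported coordinates (in kilometers). The short sketch below, whose function name is an assumption of this example, finds the closest pair of stations in each solution: for the optimal solution it is the pair \(\tilde{p}_2\), \(\tilde{p}_3\), which share the same abscissa and are about 0.454 km apart.

```python
import math
from itertools import combinations

# Coordinates (km) of the optimal and sub-optimal solutions reported above.
optimal = [
    (509.5000, 4676.370),
    (515.2000, 4675.295),
    (515.2000, 4674.841),
    (520.9000, 4675.953),
    (531.9204, 4687.832),
]
suboptimal = [
    (504.4070, 4674.276),
    (514.2178, 4672.690),
    (515.7867, 4675.043),
    (521.5240, 4676.762),
    (530.8901, 4685.879),
]


def closest_pair(points):
    """Return (distance_km, i, j) for the closest pair, with 1-based indices."""
    return min(
        (math.dist(a, b), i + 1, j + 1)
        for (i, a), (j, b) in combinations(enumerate(points), 2)
    )


print(closest_pair(optimal))
print(closest_pair(suboptimal))
```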

5 Conclusions

This paper proposes a new technique to automate the design of the sampling points for an estuarine water quality monitoring network by means of a linked simulation-optimization algorithm. After presenting a detailed and rigorous formulation of the problem, including its full computational details, we have studied a real-world case posed in the Ría of Vigo (NW Spain), where the achieved optimal solutions show a very good ability to capture both the locations and the intensities of a large number of possible discharges in the estuary.

Moreover, although we have formulated our problem for the particular case of the concentration of E. coli in an estuary, our methodology can be readily extended, with minimal changes, to the analysis of any other water quality indicator (or indicators) in any type of 1D, 2D, or 3D domain.

The novel methodology introduced here not only represents an advance towards the scientific rationalization of the design of water quality monitoring systems, but also shows wide possibilities in other fields of application (atmospheric contamination, groundwater pollution...), and for other interests of stakeholders and decision-makers (pollution detection in minimal time, minimization of the number of monitoring stations...).