
1 Introduction

Scientists and engineers often express their models in terms of the processes that govern the dynamics and the entities that determine the system states, and they usually implement these models in a computerized simulation system. The values provided by a simulation are the model response to a certain system scenario [8]. Tuning, calibration, or adjustment of the simulator is an improvement process that seeks the set of input parameters yielding the smallest difference between the output data and the reference data set [1, 6]. Automatic tuning of a simulation requires running multiple instances of the simulator, one for each parameter combination. Consequently, the more parameters are considered, the more computing time is needed [8]. The main idea proposed in this article is an automatic tuning methodology for a simulator of a complex dynamic system that models wave displacement in rivers and channels. This approach exploits a local behavior of the system: parameters at spatially close points of the domain are equal or differ very little, which allows us to reduce the search space and, in consequence, the computational cost. We build on the research and results of previous works [2, 3]. Using our methodology, we found input scenarios that improved the quality of the simulation by up to 50% relative to the initial scenario (currently used for simulation and forecasting).

2 Description of the Simulator

The computational model used in our experiments is simulation software developed at the Laboratory of Computational Hydraulics of the National Institute of Water (INA). It computes the translation of waves along a riverbed using the Saint-Venant equations, and it implements a one-dimensional hydrodynamic model of the Paraná River for hydrologic forecasting [4, 5]. Next, we describe the key features of the computational model.

2.1 Domain Modeling Feature

The simulator represents a hydrodynamic model consisting of two filaments, each representing the path of a river (see Fig. 1). To simulate the transport of water in a filament, its route is subdivided into sections. Each section (Sc) represents a specific position along the path and is divided into subsections (Su).

Fig. 1. Discretization of the river domain. Some types of Su in the cross Sc of the domain.

The simulator requires a set of input parameter values, which determines a simulation scenario. At every Su, the Manning roughness coefficient (m) is the parameter used as the adjustment variable; it depends on the resistance offered by the main channel and the floodplain [1, 5]. We distinguish the two values as Manning of plain (\( mp \)) and Manning of channel (\( mc \)). Depending on the channel geometry of each section, more or fewer \( mp \) and \( mc \) values are needed. The different section shapes are shown in Fig. 1.

2.2 Observed Data Measured at Monitoring Stations

A monitoring or measuring station (St) is the physical place where river heights are surveyed and recorded. Each St is located in a city on the banks of the river channel. The recorded river heights are known as observed data (OD), and they are measured daily. Data for the period from 1994 to 2011 were used in the experiments carried out in this work [4].

3 Proposed Methodology

We propose a search methodology to improve the quality of a simulator by finding the best set of parameters. Our aim is to optimize the simulation over a reduced search space \( \omega \), such that \( \omega \subset \Omega \), minimizing the computing resources needed to achieve the objective, where \( \Omega \) is the whole search space with all possible combinations of the adjustment parameters and \( \omega \) is the resulting reduced space [7]. Unlike the work in [3], we propose a calibration process of successive tuning steps that obtains adjusted input parameter values from a preselected set of successive sections. Each parameter combination determines a simulation scenario, whose structure we detail below. The quality of the simulated data (\( SD \)) is measured by calculating a divergence index (\( DI \)), as we explain in Sect. 3.3.

We start the method by choosing a monitoring station \( St_{k} \) located at an initial place \( k \) on the riverbed and selecting three contiguous sections adjacent to that station. Figure 2 outlines the set of possible scenarios and the selection of the best one. This process searches for the adjusted parameter set, \( \widehat{X}, \) for station \( k \), which determines the best simulation scenario \( \widehat{{S_{k} }} \). The tuning is then repeated for the next station, extending the adjustment successively until the last station is reached, as shown in Fig. 3.

Fig. 2. Search process of the best scenario \( \widehat{{S_{k} }} \) for station \( k \).

Fig. 3. Successive tuning process.

3.1 Structure of the Input Scenario

We define the set of parameters for section \( m \), subdivided into three subsections, as the 3-tuple:

$$ Sc_{m} = \left( {mp_{m} ,mc_{m} ,mp_{m} } \right) $$
(1)

where \( mp_{m} \) represents the Manning value of plain and \( mc_{m} \) represents the Manning value of channel. We remark that Eq. (1) has two independent variables, \( mp_{m} \) and \( mc_{m} \), and that it takes the same \( mp_{m} \) value at both ends of the section. Initially, three contiguous and adjacent sections are chosen for station \( k \), and its scenario \( \widehat{{S_{k} }} \) is defined by:

$$ \widehat{{S_{k} }} = \left[ {\begin{array}{*{20}c} {Sc_{m } } \\ {Sc_{m + 1} } \\ {Sc_{m + 2} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {mp_{m } } & {mc_{m } } & {mp_{m } } \\ {mp_{m + 1} } & {mc_{m + 1} } & {mp_{m + 1} } \\ {mp_{m + 2} } & {mc_{m + 2} } & {mp_{m + 2} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {mp_{k} } & {mc_{k} } & {mp_{k} } \\ {mp_{k} } & {mc_{k} } & {mp_{k} } \\ {mp_{k} } & {mc_{k} } & {mp_{k} } \\ \end{array} } \right] $$
(2)

Because this is a physical system and the sections are close together, we assume that the three sections have the same values of \( mp \) and \( mc \) for \( St_{k} \).
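The structure in Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `make_section` and `make_station_scenario` are hypothetical, and the numeric values are placeholders rather than calibrated parameters.

```python
from typing import List, Tuple

# A section is the 3-tuple (mp, mc, mp) of Eq. (1).
Section = Tuple[float, float, float]


def make_section(mp: float, mc: float) -> Section:
    """Eq. (1): the plain value sits at both ends, the channel value in the middle."""
    return (mp, mc, mp)


def make_station_scenario(mp_k: float, mc_k: float, n_sections: int = 3) -> List[Section]:
    """Eq. (2): the contiguous sections next to a station share the same (mp, mc)."""
    return [make_section(mp_k, mc_k) for _ in range(n_sections)]


# Placeholder values, chosen only to show the shape of the data.
scenario = make_station_scenario(mp_k=0.10, mc_k=0.017)
for row in scenario:
    print(row)
```

Although the scenario has nine entries, only two of them are free, which is what keeps the search space for one station small.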

We remark that, in Eq. (2), \( mp_{k} \) and \( mc_{k} \) are independent variables. Therefore, the input scenario \( \widehat{X} \) used to start the tuning process is determined by the scenarios \( \widehat{{S_{k} }} \) corresponding to the sections \( Sc_{m} \), and by the intermediate scenarios \( \widehat{{S_{k} }}^{ + } \) corresponding to the intermediate sections \( Sc_{m}^{ + } \) located between stations \( k \) and \( k + 1 \). Equation (3) gives the structure of \( \widehat{X} \) for \( n \) stations:

$$ \widehat{X } = \left\{ {\widehat{{S_{k} }} ,\widehat{{S_{k} }}^{ + } ,\widehat{{S_{k + 1} }},\widehat{{S_{k + 1} }}^{ + } , \ldots ,\widehat{{S_{n} }} } \right\} , \; {\text{with}}\; k = 1 $$
(3)

3.2 Discretization Process Parameters

The variation ranges, \( imp \) and \( imc \), and the increment steps, \( smp \) and \( smc \), are initial values provided by INA experts. We based our experiments on these values.

$$ imp = \left[ {mp_{min} , mp_{max} } \right] = \left[ {0.1, 0.71} \right];\;smp = 0.01 $$
(4)
$$ imc = \left[ {mc_{min} , mc_{max} } \right] = \left[ {0.017, 0.078} \right];\;smc = 0.001 $$
(5)
$$ \# \widehat{S} = 61 \;\;|\;\; \frac{{mp_{max} - mp_{min} }}{smp} = \# \widehat{S} \wedge \frac{{mc_{max} - mc_{min} }}{smc} = \# \widehat{S} $$
(6)

The value \( \# \widehat{S} = 61 \) was obtained empirically by us. It represents the minimum number of scenarios that allows us to obtain improved output values when running the simulation. Increasing \( \# \widehat{S} \) could increase the accuracy but would also increase the use of computational resources. Equation (7) determines the values of each scenario \( \widehat{{S_{k\left( i \right)} }} \):

$$ \begin{aligned} \widehat{{S_{k\left( i \right)} }} & = \left[ {\begin{array}{*{20}c} {mp_{i} } & {mc_{i} } & {mp_{i} } \\ {mp_{i} } & {mc_{i} } & {mp_{i} } \\ {mp_{i} } & {mc_{i} } & {mp_{i} } \\ \end{array} } \right] \\ & = \left[ {\begin{array}{*{20}c} {\left( {smp \cdot i} \right) + mp_{ini} } & {\left( {smc \cdot i} \right) + mc_{ini} } & {\left( {smp \cdot i} \right) + mp_{ini} } \\ {\left( {smp \cdot i} \right) + mp_{ini} } & {\left( {smc \cdot i} \right) + mc_{ini} } & {\left( {smp \cdot i} \right) + mp_{ini} } \\ {\left( {smp \cdot i} \right) + mp_{ini} } & {\left( {smc \cdot i} \right) + mc_{ini} } & {\left( {smp \cdot i} \right) + mp_{ini} } \\ \end{array} } \right] \\ \end{aligned} $$
(7)

where \( i \in {\mathbb{N}} \) is the scenario number, with \( 1 \le i \le \# \widehat{S} \), and \( mp_{ini} \) and \( mc_{ini} \) are the initial values. We use these values to start the search process and to run the simulator for each scenario, in order to find the best one, as we describe in the next section.
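As a minimal sketch, the discretization of Eqs. (4) to (7) can be enumerated as follows. It assumes that \( mp_{ini} \) and \( mc_{ini} \) equal the lower bounds \( mp_{min} \) and \( mc_{min} \), which the text does not state explicitly; the function names are hypothetical, and values are rounded to three decimals only to hide floating-point noise.

```python
from typing import List, Tuple

Scenario = List[Tuple[float, float, float]]

# Ranges and steps from Eqs. (4) and (5).
# Assumption: the initial values coincide with the lower bounds of the ranges.
MP_INI, MP_MAX, SMP = 0.1, 0.71, 0.01
MC_INI, MC_MAX, SMC = 0.017, 0.078, 0.001


def scenario_count() -> int:
    """Eq. (6): both ranges must yield the same number of steps."""
    n_mp = round((MP_MAX - MP_INI) / SMP)
    n_mc = round((MC_MAX - MC_INI) / SMC)
    if n_mp != n_mc:
        raise ValueError("mp and mc ranges give different scenario counts")
    return n_mp


def scenario(i: int) -> Scenario:
    """Eq. (7): scenario i advances mp and mc by i steps from their initial values."""
    mp_i = round(SMP * i + MP_INI, 3)
    mc_i = round(SMC * i + MC_INI, 3)
    return [(mp_i, mc_i, mp_i) for _ in range(3)]


scenarios = [scenario(i) for i in range(1, scenario_count() + 1)]
print(len(scenarios), scenarios[0][0], scenarios[-1][0])
```

Because a single index \( i \) drives both parameters in Eq. (7), the number of simulator runs per station in this sketch equals \( \# \widehat{S} \) rather than its square.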

3.3 Search of the Best Scenario

The best input scenario is determined by comparing the SD series with the OD series through a divergence index implemented with the root mean square error (RMSE) estimator.

$$ DI_{k}^{y} = RMSE_{k}^{y} = \sqrt {\frac{{\sum\nolimits_{i = 1}^{N} {\left( {H_{k}^{OD,y} - H_{k}^{SD,y} } \right)_{i}^{2} } }}{N}} $$
(8)

The index \( DI_{k}^{y} \) is the RMSE of the series of simulated river heights \( H_{k}^{SD,y} \) with respect to the series of observed river heights \( H_{k}^{OD,y} \), for a station \( k \) and a year \( y \), which is the simulation time; \( N \) is the number of terms summed in Eq. (8). The best-fit scenario for station \( k \), which generates an output series \( H_{k}^{SD} \), is the one with the minimum \( DI_{k}^{y} \), (\( \hbox{min} \left( {\widehat{{DI_{k}^{y} }}} \right) \)), over all the simulations.
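The selection rule can be illustrated with a short sketch that computes the divergence index of Eq. (8) and picks the scenario with the smallest value. The function names are hypothetical, and the river heights are made-up numbers used only to exercise the code; they are not measurements from any station.

```python
import math
from typing import Dict, Sequence, Tuple


def divergence_index(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Eq. (8): RMSE between the observed (OD) and simulated (SD) height series."""
    if len(observed) == 0 or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def best_scenario(
    observed: Sequence[float],
    simulated_by_scenario: Dict[int, Sequence[float]],
) -> Tuple[int, float]:
    """Return (scenario number, DI) for the scenario with the minimum DI."""
    scores = {i: divergence_index(observed, sd) for i, sd in simulated_by_scenario.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Made-up heights for one station and two candidate scenarios.
od = [2.0, 2.5, 3.0]
runs = {1: [2.5, 3.0, 3.5], 2: [2.0, 2.5, 3.5]}
best_i, best_di = best_scenario(od, runs)
print(best_i, round(best_di, 4))
```

In this toy case scenario 2 wins because it matches two of the three observed heights exactly, while scenario 1 is off by the same amount everywhere.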

3.4 Successive Tuning Process

After obtaining the scenario that best fits St \( k \), the adjustment can be extended to a new St \( k + 1 \). We take advantage of the locality behavior of the system, which means that sections close to one another have similar adjustment parameter values or differ very little, and we set the parameters with the previously calculated adjustment values. To make this possible, the scenario \( \widehat{{S_{k} }}^{ + } \) is initialized with the values of the best scenario \( \widehat{{S_{k} }} \) immediately preceding it.

In the successive input scenarios, we keep fixed the adjusted parameter values found in the previous calibrations, so the previously adjusted scenarios of each section are used to find the current adjusted parameter values. For St \( k \), the parameter vector is: \( \widehat{{X_{k} }} = \left\{ {\widehat{{S_{k} }},\widehat{{S_{k} }}^{ + } ,\widehat{{S_{k + 1} }} ,\widehat{{S_{k + 1} }}^{ + } , \ldots , \widehat{{S_{n} }} } \right\} \)

The \( k + 1 \) input scenario, or \( k + 1 \) parameter vector, is:

$$ \widehat{{X_{k + 1} }} = \left\{ {\widehat{{S_{k} }} ,\widehat{{S_{k} }}^{ + } ,\widehat{{S_{k + 1} }},\widehat{{S_{k + 1} }}^{ + } , \ldots ,\widehat{{S_{n} }} } \right\} , \,{\text{where}}\; \widehat{{S_{k} }}^{ + } = \widehat{{S_{k} }} $$

The \( k + 2 \) input scenario, or \( k + 2 \) parameter vector, is:

$$ \widehat{{X_{k + 2} }} = \left\{ {\widehat{{S_{k} }} ,\widehat{{S_{k} }} ^{ + } ,\widehat{{S_{k + 1} }} ,\widehat{{S_{k + 1} }}^{ + } , \ldots , \widehat{{S_{n} }} } \right\} , \;{\text{where}}\, \widehat{{S_{k} }} ^{ + } = \widehat{{S_{k} }} , \widehat{{S_{k + 1} }} ^{ + } = \widehat{{S_{k + 1} }} $$

For the \( n \)-th input scenario to the simulator (the scenario that adjusts the entire domain):

$$ \widehat{{X_{n} }} = \left\{ {\widehat{{S_{k} }} ,\widehat{{S_{k} }} ^{ + } ,\widehat{{S_{k + 1} }} ,\widehat{{S_{k + 1} }} ^{ + } , \ldots ,\widehat{{S_{n} }} } \right\}, \; {\text{where}}\; \widehat{{S_{k} }} ^{ + } = \widehat{{S_{k} }} , \ldots , \widehat{{S_{n - 1} }} ^{ + } = \widehat{{S_{n - 1} }} $$
(9)
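The successive tuning loop can be sketched as below. This is a simplified illustration, not the INA simulator interface: the `divergence` callback stands in for one simulator run followed by the evaluation of Eq. (8), the `"+"` key suffix for intermediate scenarios is a hypothetical naming convention, and the target values in the toy callback are invented so that the loop has something to find.

```python
from typing import Callable, Dict, Sequence, Tuple

Params = Tuple[float, float]  # (mp, mc) shared by the sections of one scenario


def successive_tuning(
    stations: Sequence[str],
    candidates: Sequence[Params],
    initial: Params,
    divergence: Callable[[Dict[str, Params], str], float],
) -> Dict[str, Params]:
    """Tune stations one by one, keeping earlier results fixed (cf. Eq. (9))."""
    if not candidates:
        raise ValueError("at least one candidate scenario is required")
    last = len(stations) - 1
    # X holds S_k under the station name and S_k^+ under "<name>+".
    x: Dict[str, Params] = {}
    for idx, st in enumerate(stations):
        x[st] = initial
        if idx < last:
            x[st + "+"] = initial
    for idx, st in enumerate(stations):
        # Try every candidate at station st; previously tuned stations stay fixed.
        best = min(candidates, key=lambda c: divergence({**x, st: c}, st))
        x[st] = best
        if idx < last:
            x[st + "+"] = best  # locality: S_k^+ starts from the best S_k
    return x


# Toy stand-in for "run the simulator and compute DI": invented target values.
TARGET = {"ESQU": (0.2, 0.03), "LAPA": (0.3, 0.04)}


def toy_divergence(x: Dict[str, Params], st: str) -> float:
    mp, mc = x[st]
    return abs(mp - TARGET[st][0]) + abs(mc - TARGET[st][1])


tuned = successive_tuning(
    ["ESQU", "LAPA"], [(0.1, 0.02), (0.2, 0.03), (0.3, 0.04)], (0.1, 0.02), toy_divergence
)
print(tuned)
```

In this sketch the number of divergence evaluations is the number of stations times the number of candidates, instead of growing exponentially with the number of stations as a joint search over all of them would.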

4 Experimental Results

In the search for the best scenario performed on St \( k \), “Esquina” (ESQU), we found scenarios that improved the results by up to 57% relative to the initial scenario used by INA’s experts, as determined by the ratio of \( DI_{k}^{y} \left( {Fit} \right) \) to \( DI_{k}^{y} \left( {Initial} \right) \).
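One plausible way to turn that ratio into a percentage is shown below. The formula, one minus the ratio of the fitted to the initial divergence index, is an assumption about how the improvement is computed, the function name is hypothetical, and the input values are invented for illustration.

```python
def improvement_percent(di_fit: float, di_initial: float) -> float:
    """Assumed formula: relative reduction of DI with respect to the initial scenario."""
    if di_initial <= 0:
        raise ValueError("the initial divergence index must be positive")
    return (1.0 - di_fit / di_initial) * 100.0


# Invented values: a fitted DI that is half the initial DI.
print(improvement_percent(di_fit=0.25, di_initial=0.5))
```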

Table 1 summarizes the process with the top three scenarios found for the processed St \( k \). It also shows the second adjusted St, \( k + 1 \). The best scenario was found at “La Paz” (LAPA), which is a St adjacent to ESQU.

Table 1. Adjustment made in \( k \) St and \( k + 1 \) St, several years.

Table 1 also summarizes the two best scenarios found for station \( k + 1 \). Figure 4 shows a comparative graph of the observed data series (actual measurements), the initial simulated data series (original series loaded in the simulator), and the simulated data series adjusted with the best-fit scenarios at station \( k + 1 \). We can see that our method achieves better overall results over the whole series.

Fig. 4. Comparative OD, SD, Fit. Station k = LAPA, y = 2008

5 Conclusions

Our method of successive adjustment steps, guided by the locality behavior of the simulation, provided promising results: it found scenarios that improved simulation quality, which encourages us to continue our research in this direction. These scenarios provided numerical series of river heights closer to those observed at the measurement stations on the riverbed. The improvement obtained was greater than 50%. The method is simple and reduces the computational resources needed, as it is based on proportional successive increases of the initial scenario parameters, Eq. (7). We also reduced the computing time by using the adjusted parameters obtained in the previous steps to calculate the current one, Eq. (9). We continue our work on an automatic calibration methodology for the computational model, extending the methodology described in this work from a predetermined initial station to the last one at the end of the riverbed. This proposal will make use of HPC techniques to decrease execution times.