# Simultaneous perturbation stochastic approximation for tidal models

## Abstract

The Dutch continental shelf model (DCSM) is a shallow sea model of the entire continental shelf which is used operationally in the Netherlands to forecast storm surges in the North Sea. The forecasts are necessary to support the decision on the timely closure of the movable storm surge barriers to protect the land. In this study, an automated model calibration method, simultaneous perturbation stochastic approximation (SPSA), is implemented for tidal calibration of the DCSM. The method uses objective function evaluations to obtain the gradient approximations. The gradient approximation for the central difference method uses only two objective function evaluations, independent of the number of parameters being optimized. The calibration parameter in this study is the model bathymetry. A number of calibration experiments is performed. The effectiveness of the algorithm is evaluated in terms of the accuracy of the final results as well as the computational costs required to produce these results. In doing so, comparison is made with a traditional steepest descent method and also with a newly developed proper orthogonal decomposition-based calibration method. The main findings are: (1) The SPSA method gives results comparable to the steepest descent method at little computational cost. (2) The SPSA method, at little computational cost, can be used to estimate a large number of parameters.

### Keywords

Numerical tidal modeling, Parameter estimation, Simultaneous perturbation, Stochastic approximation

## 1 Introduction

Accurate sea water level forecasting is crucial in the Netherlands, mainly because large areas of the land lie below sea level. Forecasts are made to support the storm surge flood warning system. Timely water level forecasts are necessary to support the decision for closure of the movable storm surge barriers in the Eastern Scheldt and the New Waterway. Moreover, forecasting is also important for harbor management, as the size of some ships has become so large that they can only enter the harbor during the high water period. The storm surge warning service (SVSD), in close cooperation with the Royal Netherlands Meteorological Institute, is responsible for these forecasts. The surge is predicted by using a numerical hydrodynamic model, the Dutch continental shelf model (DCSM) (see Stelling 1984; Verboom et al. 1992). The performance of the DCSM regarding storm surges is influenced by its performance in forecasting the astronomical tides. Using inverse modeling techniques, tidal data can be used to improve the model results.

Most efficient optimization algorithms require a gradient of the objective function. This usually requires the implementation of the adjoint code for the computation of the gradient of the objective function. The adjoint method aims at adjusting a number of unknown control parameters on the basis of given data. The control parameters might be model initial conditions or model parameters (Thacker and Long 1988). A sizeable amount of research on adjoint parameter estimation was carried out in the last 30 years in fields such as meteorology, petroleum reservoirs, and oceanography for instance by Seinfeld and Kravaris (1982), Bennet and Mcintosh (1982), Ulman and Wilson (1998), Courtier and Talagrand (1990), Lardner et al. (1993) and Heemink et al. (2002). A detailed description of the application of the adjoint method in atmosphere and ocean problems can be found in Navon (1998).

One of the drawbacks of the adjoint method is the programming effort required for the implementation of the adjoint model. Research has recently been carried out on automatic generation of computer code for the adjoint, and adjoint compilers have now become available (see Kaminski et al. 2003). Even with the use of these adjoint compilers, this is a huge programming effort that hampers new applications of the method. Courtier et al. (1994) had proposed an incremental approach, in which the forward solution of the nonlinear model is replaced by a low resolution approximate model. Reduced order modeling can also be used to obtain an efficient low-order approximate linear model (Hoteit 2008; Lawless et al. 2008).

This paper focuses on a method referred to as the simultaneous perturbation stochastic approximation (SPSA) method. This method can be easily combined with any numerical model to do automatic calibration. For the calibration of a numerical tidal model, the SPSA algorithm requires only the water level data predicted by the given model. SPSA is a stochastic offspring of the Kiefer–Wolfowitz algorithm (Kiefer and Wolfowitz 1952), commonly referred to as the finite difference stochastic approximation (FDSA) method. The FDSA algorithm uses objective function evaluations to obtain the gradient approximations. Each individual model parameter is perturbed one at a time, and the partial derivative of the objective function with respect to each parameter is estimated by a divided difference based on the standard Taylor series approximation of a partial derivative. Each partial derivative in the gradient of the objective function requires at least one new evaluation of the objective function, so this method is not feasible for automated calibration when the number of parameters is large.

The SPSA method uses stochastic simultaneous perturbation of all model parameters to generate a search direction at each iteration. SPSA is based on a highly efficient and easily implemented simultaneous perturbation approximation to the gradient. This gradient approximation for the central difference method uses only two objective function evaluations, independent of the number of parameters being optimized. The SPSA algorithm has gathered a great deal of interest over the last decade and has been used for a variety of applications (Hutchison and Hill 1997; Spall 1998, 2000; Gerencser et al. 2001; Gao and Reynolds 2007). As a result of the stochastic perturbation, the calculated gradient is also stochastic; however, the expectation of the stochastic gradient is the true gradient (Gao and Reynolds 2007). So one would expect the performance of the basic SPSA algorithm to be similar to the performance of steepest descent.

Gradient-based algorithms converge faster than algorithms that approximate the gradient from objective function evaluations, such as SPSA, when speed is measured in terms of the number of iterations. The total cost to achieve effective convergence depends not only on the number of iterations required, but also on the cost of performing these iterations, which is typically greater in gradient-based algorithms. This cost may include a greater computational burden and the additional human effort required for determining and coding gradients.

Vermeulen and Heemink (2006) proposed a method based on proper orthogonal decomposition (POD) which shifts the minimization into lower dimensional space and avoids the implementation of the adjoint of the tangent linear approximation of the original nonlinear model. Recently, Altaf et al. (2011) applied this POD-based calibration method for the estimation of depth values and bottom friction coefficients for a very large-scale tidal model. The method has also been applied in petroleum engineering by Kaleta et al. (2011) for history matching problems. One drawback of the POD-based calibration method is its dependence on the number of parameters.

In this paper the SPSA algorithm is applied for the estimation of depth values in the tidal model DCSM of the entire European continental shelf. A number of calibration experiments is performed with both simulated and real data. The effectiveness of the algorithm is evaluated in terms of the accuracy of the final results as well as the computational costs required to produce these results. In doing so, comparison is made with a traditional steepest descent method and also with a newly developed POD-based calibration method.

The paper is organized as follows. Section 2 describes the SPSA algorithm. This section also briefly discusses the POD-based calibration approach which is used here as comparison with SPSA method. The following section briefly explains the DCSM model used in this study. Section 4 contains results from experiments with the model DCSM, to estimate the water depth. The paper concludes in Section 5 by discussing the results.

## 2 Parameter estimation using SPSA

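Consider a nonlinear dynamical system whose discrete state \(X(t_{i})\) evolves according to

$$ X(t_{i+1})=M_{i}[X(t_{i}),\gamma], $$(1)

where *M*_{i} is a nonlinear and deterministic dynamics operator that includes inputs and propagates the state from time *t*_{i} to time *t*_{i + 1}, and *γ* is the vector of uncertain parameters which needs to be determined. Suppose now that we have imperfect observations \(Y(t_{i})\in \Re^{q}\) of the dynamical system (1) that are related to the model state at time *t*_{i} through

$$ Y(t_{i})=H(X(t_{i}))+\eta(t_{i}), $$(2)

where *H* maps the model state to the observational space and *η*(*t*_{i}) is an unbiased random Gaussian error vector with covariance matrix *R*_{i}.

The parameters are estimated by minimizing the objective function *J*,

$$ J(\gamma)=\sum_{i}[Y(t_{i})-H(X(t_{i}))]^{T}R_{i}^{-1}[Y(t_{i})-H(X(t_{i}))], $$(3)

with respect to the parameters *γ* satisfying the discrete nonlinear forecast model (1).

The SPSA algorithm minimizes the objective function *J*(*γ*) using the iteration procedure

$$ \gamma^{l+1}=\gamma^{l}-a_{l}\hat{g_{l}}({\gamma^{l}}), $$(4)

where *a*_{l} is a positive step size and \(\hat{g_{l}}({\gamma^{l}})\) is a stochastic approximation of \(\nabla J(\gamma^{l})\), the gradient of the objective function with respect to *γ* evaluated at the old iterate, *γ*^{l}. If \(\hat{g_{l}}({\gamma^{l}})\) is replaced by \(\nabla J(\gamma^{l})\), then Eq. 4 represents the steepest descent algorithm.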

The stochastic gradient \(\hat{g_{l}}({\gamma^{l}})\) at iteration *l* is obtained as follows:

1. Define the *n*^{p} dimensional column vector \(\triangle_{l}\) by

   $$ \triangle_{l}=[\triangle_{l,1},\triangle_{l,2},\cdots,\triangle_{l,n^p}]^{T}, $$(5)

   where \(\triangle_{l,i}, i=1,2,\cdots,n^p\) represents independent samples from the symmetric ±1 Bernoulli distribution. This means that +1 or −1 are the only possible values that can be obtained for each \(\triangle_{l,i}\). It also means that

   $$ \triangle_{l}^{-1}=[\triangle_{l,1}^{-1},\triangle_{l,2}^{-1},\cdots,\triangle_{l,n^p}^{-1}]^{T}, $$(6)

   $$ \triangle_{l,i}^{-1}=\triangle_{l,i}, $$(7)

   and

   $$ E[\triangle_{l,i}^{-1}]=E[\triangle_{l,i}]=0, $$(8)

   where *E* denotes the expectation.

2. Define a positive coefficient *c*_{l} and obtain two evaluations of the objective function *J*(*γ*) based on the simultaneous perturbation around the current *γ*^{l}: \(J(\gamma^l+c_l\triangle_{l})\) and \(J(\gamma^l-c_l\triangle_{l})\).

3. A realization of the stochastic gradient is then calculated by using the central difference approximation as

   $$ \hat{g_{l}}({\gamma^{l}})=\frac{J(\gamma^l+c_l\triangle_{l})-J(\gamma^l-c_l\triangle_{l})}{2c_l}\triangle_{l}^{-1}. $$(9)

   Since \(\triangle_{l}\) is a random vector, \(\hat g_l\) is also a random vector. So by generating a sample of \(\triangle_{l}\), we generate a specific sample of \(\hat g_l\).

The FDSA algorithm involves computation of each component of ∇*J* by perturbing one model parameter at a time. If one does a one-sided approximation for each partial derivative involved in \(\nabla J(\gamma^{l})\), then computation of the gradient requires *n*^{p} + 1 evaluations of *J* for each iteration of the steepest descent algorithm. In contrast, the SPSA requires only two evaluations of the objective function, \(J(\gamma^l+c_l\triangle_{l})\) and \(J(\gamma^l-c_l\triangle_{l})\), at each iteration.
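The difference in cost between the two gradient approximations can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the quadratic toy objective, the helper class `CountingObjective`, and the parameter values are assumptions chosen only to count objective evaluations for the simultaneous perturbation gradient of Eq. 9 and for a one-sided finite difference gradient.

```python
import numpy as np


def spsa_gradient(J, gamma, c_l, rng):
    """One realization of the simultaneous perturbation gradient (Eq. 9)."""
    # Symmetric +/-1 Bernoulli perturbation, one entry per parameter.
    delta = rng.choice([-1.0, 1.0], size=gamma.shape)
    # Two objective evaluations, independent of the number of parameters.
    diff = J(gamma + c_l * delta) - J(gamma - c_l * delta)
    # For +/-1 entries the elementwise inverse of delta equals delta (Eq. 7).
    return diff / (2.0 * c_l) * delta


def fdsa_gradient(J, gamma, c_l):
    """One-sided finite difference gradient: n_p + 1 objective evaluations."""
    base = J(gamma)
    grad = np.empty_like(gamma)
    for k in range(gamma.size):
        step = np.zeros_like(gamma)
        step[k] = c_l  # perturb one parameter at a time
        grad[k] = (J(gamma + step) - base) / c_l
    return grad


class CountingObjective:
    """Toy quadratic misfit (hypothetical) that counts its evaluations."""

    def __init__(self, target):
        self.target = target
        self.calls = 0

    def __call__(self, gamma):
        self.calls += 1
        return float(np.sum((gamma - self.target) ** 2))


rng = np.random.default_rng(0)
target = np.arange(1.0, 8.0)  # seven hypothetical "true" parameters
gamma0 = np.zeros(7)

J_spsa = CountingObjective(target)
g_spsa = spsa_gradient(J_spsa, gamma0, 0.01, rng)

J_fdsa = CountingObjective(target)
g_fdsa = fdsa_gradient(J_fdsa, gamma0, 0.01)

print("SPSA evaluations:", J_spsa.calls)
print("FDSA evaluations:", J_fdsa.calls)
```

In a calibration setting each call to `J` would correspond to one full model simulation, which is why the evaluation count, rather than the arithmetic in the gradient formula, dominates the cost.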

### 2.1 Choice of *a*_{l} and *c*_{l}

The performance of the SPSA algorithm depends on the choice of the gain sequences *a*_{l} and *c*_{l}. These are specified here according to the guidelines given by Spall (1998). The relevant formulas for *a*_{l} and *c*_{l} are given by

$$ a_{l}=\frac{a}{(A+l+1)^{\hat \alpha}}, $$(10)

$$ c_{l}=\frac{c}{(l+1)^{\hat \beta}}, $$(11)

where a, c, A, \(\hat \alpha\) and \(\hat \beta\) are positive constants. Under suitable conditions on these sequences, the algorithm converges to a minimum of *J* in a stochastic sense (almost surely). The choice of a, c, A, \(\hat \alpha\) and \(\hat \beta\) is to some extent case dependent, and it may require some experimentation to determine good values of these parameters. Although the asymptotically optimal values of \(\hat \alpha\) and \(\hat \beta\) are 1.0 and 1/6, respectively (Chin 1997), choosing smaller values, e.g., \(\hat \alpha =0.602\) and \(\hat \beta=0.101\) (Spall 1998), appears to be more effective in practice. One recommendation for A is to set A equal to 10% of the maximum number of iterations allowed.

The value of the constant *c* should be chosen so that *c* is equal to the standard deviation of the noise in the objective function *J*. If one has a perfect (noise-free) objective function, then *c* should be chosen as a small positive number.
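As an illustration of how these gain sequences enter the iteration, the sketch below assumes the commonly used form from Spall (1998), *a*_{l} = a/(A + *l* + 1)^α̂ and *c*_{l} = c/(*l* + 1)^β̂ with *l* = 0, 1, ..., and applies it to a toy quadratic objective. The function names, the toy objective, and the values of a and c are assumptions made for this example and are not the settings used in the experiments.

```python
import numpy as np


def gain_sequences(l, a, c, A, alpha=0.602, beta=0.101):
    """Step size a_l and perturbation size c_l for iteration l = 0, 1, ..."""
    a_l = a / (A + l + 1) ** alpha
    c_l = c / (l + 1) ** beta
    return a_l, c_l


def spsa_minimize(J, gamma0, n_iter, a, c, seed=0):
    """Basic SPSA loop: gamma <- gamma - a_l * g_hat."""
    rng = np.random.default_rng(seed)
    A = 0.1 * n_iter  # 10% of the maximum number of iterations
    gamma = np.array(gamma0, dtype=float)
    for l in range(n_iter):
        a_l, c_l = gain_sequences(l, a, c, A)
        delta = rng.choice([-1.0, 1.0], size=gamma.shape)
        diff = J(gamma + c_l * delta) - J(gamma - c_l * delta)
        g_hat = diff / (2.0 * c_l) * delta  # delta^{-1} == delta for +/-1
        gamma = gamma - a_l * g_hat
    return gamma


# Toy quadratic objective with a hypothetical minimum at (1, -2, 3).
target = np.array([1.0, -2.0, 3.0])


def J(g):
    return float(np.sum((g - target) ** 2))


gamma0 = np.zeros(3)
gamma_est = spsa_minimize(J, gamma0, n_iter=500, a=0.2, c=0.01)
print("initial J:", J(gamma0))
print("final J:  ", J(gamma_est))
```

The offset A damps the first steps, so a relatively large a can be used without the early iterates overshooting; for a real model the values of a and c would have to be tuned, as noted above.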

### 2.2 Average stochastic gradient

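For the objective function *J*, the expectation of the stochastic gradient is the true gradient (Gao and Reynolds 2007), i.e.,

$$ E[\hat{g_{l}}({\gamma^{l}})]=\nabla J(\gamma^{l}). $$(12)

This suggests using an average of stochastic gradients as the search direction,

$$ \overline{\hat{g_{l}}}({\gamma^{l}})=\frac{1}{N}\sum_{j=1}^{N}\hat{g}_{l,j}({\gamma^{l}}), $$(13)

where the \(\hat{g}_{l,j}\) are computed from *N* different samples of \(\triangle_{l}\). Due to the relationship given in Eq. 12, one would hope that SPSA would have convergence properties similar to those of steepest descent in terms of the number of iterations required to reduce the objective function *J* to a certain level. In this case, SPSA could be much more efficient than the steepest descent algorithm.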
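The effect of averaging can be checked numerically. The following sketch averages *N* independent realizations of the stochastic gradient for a toy quadratic objective and compares the result with the analytically known gradient. The objective, the parameter values, and the function names are assumptions for illustration only.

```python
import numpy as np


def spsa_gradient(J, gamma, c_l, rng):
    """One realization of the simultaneous perturbation gradient."""
    delta = rng.choice([-1.0, 1.0], size=gamma.shape)
    diff = J(gamma + c_l * delta) - J(gamma - c_l * delta)
    return diff / (2.0 * c_l) * delta


def average_spsa_gradient(J, gamma, c_l, rng, n_samples):
    """Mean of n_samples independent gradients (2 * n_samples evaluations)."""
    samples = [spsa_gradient(J, gamma, c_l, rng) for _ in range(n_samples)]
    return np.mean(samples, axis=0)


# Toy quadratic objective (hypothetical) with a known gradient.
target = np.array([1.0, -2.0, 3.0, 0.5])


def J(g):
    return float(np.sum((g - target) ** 2))


gamma = np.zeros(4)
true_grad = 2.0 * (gamma - target)

rng = np.random.default_rng(1)
for n in (1, 2, 1000):
    g_bar = average_spsa_gradient(J, gamma, 0.01, rng, n)
    err = np.linalg.norm(g_bar - true_grad) / np.linalg.norm(true_grad)
    print(f"N = {n:5d}  relative error = {err:.3f}")
```

The averaged gradient approaches the true gradient as *N* grows, but each additional sample costs two more objective evaluations, so in practice a small *N* is a trade-off between gradient quality and the number of model simulations.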

### 2.3 POD-based calibration method

Vermeulen and Heemink (2006) proposed a method based on POD which shifts the minimization into lower dimensional space and avoids the implementation of the adjoint of the tangent linear approximation of the original nonlinear model. Due to the linear character of the POD-based reduced model its adjoint can be implemented easily and the minimization problem is solved completely in reduced space with very low computational cost.

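The method starts from a linearization of the nonlinear model (1) around a background trajectory, where *X*^{b} is the background state vector with the prior estimated parameter vector *γ*^{b} and \(\triangle \overline X\) is the deviation of the model from the background trajectory.

The deviation is approximated in a reduced space as \(\triangle \overline X(t_{i}) \approx P\xi(t_{i})\), where *P* = {*p*_{1}, *p*_{2}, ⋯ , *p*_{r}} is a projection matrix such that *P*^{T}*P* = *I* and *ξ* is a reduced state vector. In the reduced model, Δ*γ* is the control parameter vector, and \(\widetilde{M}_{i}\) and \(\widetilde{M}_{i}^{\gamma}\) are simplified dynamics operators which approximate the full Jacobians \(\frac{\partial M_{i}}{\partial X^{b}}\) and \(\frac{\partial M_{i}}{\partial\gamma_{k}}\), respectively. The reduced model propagates *ξ* together with Δ*γ*, so its state-transition matrix has dimension (*r* + *n*^{p}) × (*r* + *n*^{p}), with *n*^{p} being the number of estimated parameters.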

#### 2.3.1 Collection of the snapshots and POD basis

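The POD basis is built from an ensemble of snapshots generated for each individual parameter *γ*_{k}, which are collected to get a matrix *E*. The size of the ensemble *E* is *s* = *n*^{p} × *n*^{s}, where *n*^{s} is the number of snapshots collected for each individual parameter *γ*_{k}. The covariance matrix *Q* can be constructed from the ensemble *E* of the snapshots by taking the outer product \(Q = EE^{T}\). The dimension of *Q* equals the very large dimension of the model state, so direct solution of the eigenvalue problem is not feasible. To shorten the calculation time necessary for solving the eigenvalue problem for this high-dimensional covariance matrix, we define a covariance matrix *G* as an inner product \(G = E^{T}E\) and solve the *s* × *s* eigenvalue problem \(G\mathbf{z}_{i} = \lambda_{i}\mathbf{z}_{i}\), where *λ*_{i} are the eigenvalues of this eigenvalue problem. The eigenvectors **z**_{i} may be chosen to be orthonormal, and the POD modes *P* are then given by \(p_{i} = E\mathbf{z}_{i}/\sqrt{\lambda_{i}}\).

We define a measure *ψ*_{i} for the relative information to choose a low dimensional basis by neglecting modes corresponding to the small eigenvalues: \(\psi_{i} = \lambda_{i}/\sum_{j=1}^{s}\lambda_{j}\). We choose *p*_{r} (*r* < *s*) modes such that *ψ*_{1} > *ψ*_{2} > ... > *ψ*_{r} and they totally explain at least the required variance *ψ*^{e}, i.e., \(\sum_{i=1}^{r}\psi_{i} \geq \psi^{e}\). The number of modes *r* in the POD basis *P* depends on the required accuracy of the reduced model.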
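A minimal NumPy sketch of this construction is given below. It assumes the standard method-of-snapshots formulas (the small matrix *G* = *E*^{T}*E*, modes *p*_{i} = *E***z**_{i}/√*λ*_{i}, and relative importance *ψ*_{i} = *λ*_{i}/Σ*λ*_{j}); the synthetic snapshot matrix, its dimensions, and the function name `pod_basis` are hypothetical and serve only to make the example runnable.

```python
import numpy as np


def pod_basis(E, required_variance):
    """POD modes from a snapshot matrix E (n_state x s) via the s x s problem."""
    G = E.T @ E                            # small inner-product matrix
    lam, Z = np.linalg.eigh(G)             # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]          # reorder to descending
    lam, Z = lam[order], Z[:, order]
    psi = lam / lam.sum()                  # relative importance of each mode
    # Smallest r whose cumulative importance reaches the required variance.
    r = int(np.searchsorted(np.cumsum(psi), required_variance)) + 1
    r = min(r, len(lam))
    P = E @ Z[:, :r] / np.sqrt(lam[:r])    # p_i = E z_i / sqrt(lambda_i)
    return P, psi[:r]


rng = np.random.default_rng(0)
n_state, s = 500, 12

# Hypothetical snapshots: three orthonormal spatial patterns with singular
# values 10, 5 and 2, plus a small amount of noise.
U, _ = np.linalg.qr(rng.standard_normal((n_state, 3)))
V, _ = np.linalg.qr(rng.standard_normal((s, 3)))
E = U @ np.diag([10.0, 5.0, 2.0]) @ V.T
E += 1e-3 * rng.standard_normal((n_state, s))

P, psi = pod_basis(E, required_variance=0.999)
print("modes kept:", P.shape[1])
print("orthonormal basis:", np.allclose(P.T @ P, np.eye(P.shape[1])))
```

Only the *s* × *s* matrix is decomposed, so the cost of the eigenvalue problem depends on the number of snapshots rather than on the dimension of the model state.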

#### 2.3.2 Approximate objective function and its adjoint

The value of the approximate objective function \(\hat J\) is obtained by correcting the observations *Y*(*t*_{i}) for background state *X*^{b}(*t*_{i}) which is mapped on the observational space through a mapping *H* and to the reduced model state *ξ*(*t*_{i}, Δ*γ*) which is mapped to the observational space through mapping \(\hat H\), with \(\hat H = HP\).

The gradient of \(\hat J\) with respect to Δ*γ* is obtained from the adjoint of the reduced model.

Recently, Altaf et al. (2011) applied this POD-based calibration method for the estimation of depth values and bottom friction coefficients for a very large-scale tidal model. The method has also been recently applied in petroleum engineering by Kaleta et al. (2011) for history matching problems. One drawback of the POD-based calibration method is its dependence on the number of parameters.

## 3 The Dutch Continental Shelf Model

The model equations use the following notation:

- *x*, *y*: Cartesian coordinates in horizontal plane
- *t*: time coordinate
- *u*, *v*: depth-averaged current in *x* and *y* direction, respectively
- *h*: water level above reference plane
- *D*: water depth below the reference plane
- *H*: total water depth (*D* + *h*)
- *f*: coefficient for the Coriolis force
- *C*_{2D}: Chezy coefficient
- *τ*_{x}, *τ*_{y}: wind stress in *x* and *y* direction, respectively
- *ρ*_{w}: density of sea water
- *p*_{a}: atmospheric pressure
- *g*: acceleration of gravity

The harmonic description of the tide uses:

- *h*_{0}: mean water level
- *H*: total water depth
- *f*_{j}*H*_{j}: amplitude of harmonic constituent *j*
- *ω*_{j}: angular velocity of *j*
- *θ*_{j}: phase of *j*

### 3.1 Estimation of depth

The bathymetry for a model is usually derived from nautical maps. These maps usually give details of shallow rather than deep-water areas. If we use these maps to prescribe the water depth, it is reasonable to assume that this prescription of the bathymetry is erroneous. So depth can be a parameter on which the model can be calibrated. In the early years of the development of the DCSM, the changes to the bathymetry were made manually. Later, automated calibration procedures based on variational data assimilation were developed (Ten-Brummelhuis et al. 1993; Mouthaan et al. 1994). A complete description of the development of these calibration procedures for the DCSM can be found in Verlaan et al. (2005).

## 4 Numerical experiment

### 4.1 Experiment 1

The DCSM model used in this experiment covers an area of the north-east European continental shelf, i.e., 12°W to 13°E and 48°N to 62°N, as shown in Fig. 1. The resolution of the spherical grid is 1/8° × 1/12°, which is approximately 8 × 8 km. With this configuration the grid measures 201 × 173, with 19,809 computational grid points. The time step is \(\triangle{\it{t}}=10\) min.

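The model domain Ω was divided into seven subdomains Ω_{k}, *k* = 1, ⋯ , 7 (see Fig. 2). For each subdomain Ω_{k}, a correction parameter \(\gamma_{k}^{b}\) was defined that is related to the depth \(D_{n_{1},n_{2}}\) at the grid points in that subdomain. The parameters act on each subdomain Ω_{k} as a whole and leave the spatial dependence inside Ω_{k} unaltered.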

Seven observation points were included in the assimilation, two of which are located along the east coast of the UK, two along the Dutch coast and one at the Belgian coast (see Fig. 1). The truth model was run for a period of 15 days from 13 December 1997 00:00 to 27 December 1997 24:00 with the specification of water depth \(D_{n_{1},n_{2}}^{b}\) as used in the operational DCSM to generate artificial data at the assimilation stations. The first 2 days were used to properly initialize the simulations, and a set of observations *Y* of computed water levels *h* was collected for the last 13 days at intervals of ten minutes at seven selected assimilation grid points, which coincide with the points where data are observed in reality. The observations were assumed to be perfect. This assumption was made to see how close the estimate is to the truth. To get the initial adjustments \(\gamma_{k}^{b}\), 5 m was added to \(D_{n_{1},n_{2}}^{b}\) at all the grid points in domain Ω.

For the SPSA optimization algorithm, two methods were applied to calculate the stochastic gradient. In the first method, the stochastic gradient \(\hat{g_{l}}({\gamma^{l}})\) was computed according to Eq. 9. In the second method, referred to as average SPSA, the gradient was computed by Eq. 13, where the average is taken over two independent stochastic gradients.

The values of a, c, A, \(\hat \alpha\), and \(\hat \beta\) were obtained according to the guidelines given in Section 2.1. The best values were determined from several forward model simulations. The iteration cycle for the SPSA algorithm was stopped when the value of the objective function *J* did not change for the last three iterations of the minimization process (Wang et al. 2009).

The objective function *J* is plotted against the number of iterations *β* for the two implementations of the SPSA algorithm, compared with the steepest descent and the POD-based calibration method. Note that the gradient used in the steepest descent algorithm is obtained from the finite difference method using one-sided perturbation. The graph shows that both SPSA and average SPSA gave comparable results, although for average SPSA the decrease in the objective function *J* is larger at early iterations. Also, the rate of convergence of average SPSA is slightly better than that of SPSA. However, both SPSA and average SPSA need more iterations than the steepest descent method. The steepest descent algorithm converged in ten iterations as compared to 20 and 15 iterations for SPSA and average SPSA, respectively. However, the cost of a single iteration of the SPSA algorithm is far less than that of the steepest descent algorithm.

For all the algorithms, there was a significant improvement in the parameters for regions coinciding with the UK, Dutch and Belgian coasts, but there was not much improvement in the deep water regions Ω_{1} and Ω_{7}. Since the subdomains containing deep areas are less sensitive than the subdomains containing shallow areas, it is much more difficult to estimate *γ*_{k} in regions Ω_{1} and Ω_{7}.

Table 1 gives a measure (*ζ*) of the difference between the updated estimated parameters *γ*^{up} obtained after calibration with the different optimization algorithms and the true parameters *γ*^{t}. The measure is defined as the two norm of the difference between the estimated parameters *γ*^{up} obtained after optimization and the true parameters *γ*^{t}, divided by the norm of the true parameters *γ*^{t} (Gao and Reynolds 2007), i.e., \(\zeta = \|\gamma^{up}-\gamma^{t}\|_{2}/\|\gamma^{t}\|_{2}\).

Table 1 Comparison of estimated parameters to true parameters for the twin experiment

| | SPSA (%) | Average SPSA (%) | Steepest descent (%) |
|---|---|---|---|
| All parameters | 35.11 | 29.27 | 21.02 |
| Sensitive parameters | 9.95 | 6.29 | 6.49 |
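For reference, the relative error measure can be computed as in the short sketch below. The function name and the two-parameter vectors are hypothetical and are not values from the experiments; they only show how the norm ratio is turned into a percentage.

```python
import numpy as np


def relative_parameter_error(gamma_up, gamma_true):
    """zeta = ||gamma_up - gamma_true||_2 / ||gamma_true||_2, in percent."""
    gamma_up = np.asarray(gamma_up, dtype=float)
    gamma_true = np.asarray(gamma_true, dtype=float)
    misfit = np.linalg.norm(gamma_up - gamma_true)
    return 100.0 * misfit / np.linalg.norm(gamma_true)


# Hypothetical values: ||difference|| = 0.5 and ||truth|| = 5.
gamma_true = np.array([3.0, 4.0])
gamma_up = np.array([3.0, 4.5])
zeta = relative_parameter_error(gamma_up, gamma_true)
print(f"zeta = {zeta:.2f} %")
```

Restricting both vectors to the entries of the sensitive subdomains before calling the function would give the kind of split between all parameters and sensitive parameters reported in the table.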

Table 2 presents the root mean square error (RMSE) between the water levels obtained with the estimated parameters (*γ*^{up}) and the true parameters (*γ*^{t}) after iterations *β* = 5, *β* = 10, *β* = 15 and *β* = 20 of the SPSA algorithm for the calibration stations and compares it with the average SPSA and steepest descent algorithms. The RMSE for the SPSA algorithm after iteration *β* = 5 is 9.95 cm, compared to 8.92 cm and 6.05 cm for the average SPSA and steepest descent algorithms, respectively. So SPSA and average SPSA are comparable at this point. The RMSE for SPSA after ten iterations is comparable to the RMSE of the steepest descent method after only five iterations. Since one iteration of steepest descent costs eight model simulations compared to two model simulations for the SPSA algorithm, SPSA is two times more efficient than steepest descent at this point (20 versus 40 model simulations), and one would expect SPSA to be even more efficient with a larger number of parameters.

Table 2 RMSE results for the minimization process after 5th, 10th, 15th, and 20th iterations

| | SPSA (cm) | Average SPSA (cm) | Steepest descent (cm) |
|---|---|---|---|
| Initial | 22.80 | 22.80 | 22.80 |
| *β* = 5 | 9.95 | 8.92 | 6.05 |
| *β* = 10 | 5.63 | 4.09 | 2.91 |
| *β* = 15 | 4.10 | 3.27 | – |
| *β* = 20 | 3.55 | – | – |

The RMSE values with SPSA after *β* = 15 and with average SPSA after *β* = 10 are similar. At this point the computational costs of SPSA and average SPSA are also comparable. It is also clear from Table 2 that the smallest RMSE value is achieved by the steepest descent method in ten iterations.
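The cost comparisons in this section amount to simple bookkeeping of forward model simulations. The sketch below reproduces that arithmetic. It assumes that average SPSA with two independent gradients needs four simulations per iteration, which is an inference from the description of the method rather than a figure stated in the text; the function and variable names are illustrative.

```python
def model_runs(iterations, runs_per_iteration):
    """Total number of forward model simulations."""
    return iterations * runs_per_iteration


n_p = 7  # number of estimated parameters in the twin experiment

runs_per_iteration = {
    "SPSA": 2,                    # one central-difference gradient
    "average SPSA": 2 * 2,        # assumed: two independent gradients
    "steepest descent": n_p + 1,  # one-sided finite differences
}

# SPSA after ten iterations versus steepest descent after five.
spsa_10 = model_runs(10, runs_per_iteration["SPSA"])
sd_5 = model_runs(5, runs_per_iteration["steepest descent"])
print("SPSA, 10 iterations:", spsa_10)
print("steepest descent, 5 iterations:", sd_5)
print("cost ratio:", sd_5 / spsa_10)

# SPSA after 15 iterations versus average SPSA after ten.
spsa_15 = model_runs(15, runs_per_iteration["SPSA"])
avg_10 = model_runs(10, runs_per_iteration["average SPSA"])
print("SPSA, 15 iterations:", spsa_15)
print("average SPSA, 10 iterations:", avg_10)
```

Because the steepest descent cost per iteration grows with *n*^{p} while the SPSA cost stays fixed, the ratio shifts further in favor of SPSA as the number of parameters increases.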

Figure 4 shows time series of the water level *h* at the two tide gauge stations Den Helder and Southend, along the Dutch and English coasts, respectively, for the period from 18 December 1997 00:00 to 18 December 1997 24:00. These time series refer to water levels obtained from the true values of the parameters, the initial values of the parameters and the values of the parameters estimated using the SPSA algorithm, respectively. Figure 4 demonstrates that the estimation significantly reduces the misfit: the differences between the time series obtained from the estimated parameters and the true parameters are much smaller than those between the time series obtained from the initial parameters and the true parameters.

### 4.2 Experiment 2

The observations used in this experiment come from two sources:

1. water level measurement data from the Dutch DONAR database and
2. British Oceanographic Data Center offshore water level measurement data.

The observations *Y* of the computed water levels *h* are assumed to contain an error described by a white noise process with standard deviation *σ*_{m} = 0.10 m.

The model domain was divided into 12 subdomains Ω_{k}, *k* = 1, ..., 12 (see Fig. 8). The influence of the depth adjustments is quite significant, especially in shallow regions. Thus, the subdivision of the model area was made such that deep and shallow areas were separated (see Fig. 8). The data observation points are concentrated in the English Channel, so this region was divided into five subdomains to improve the results by considering the local effects of the depth in each subdomain Ω_{k}, *k* = 3, ⋯ , 7, in this area.

The objective function *J* is plotted against the number of iterations *β* for the SPSA algorithm, compared with the POD-based calibration method. The SPSA method is compared here with the POD-based calibration method for practical reasons. First, we have seen in the previous experiment that the POD-based calibration method efficiently estimated the depth values with the fastest convergence rate as compared to the SPSA and steepest descent algorithms. Second, it is not worthwhile to compute the gradient by finite differences in this large-scale model. The graph shows that both calibration methods give comparable results in terms of the reduction in the objective function *J*, although the rate of convergence of the POD-based calibration method is far better than that of SPSA.

The POD-based calibration method converged in only two iterations as compared to 14 iterations with SPSA. However, the cost of a single iteration of the POD-based calibration method is much higher and depends on the number of parameters *n*^{p} and the number of POD modes *r* used to construct the reduced model (Altaf et al. 2009). For this experiment, one iteration of the POD method required 13 initial simulations of the original nonlinear model to get the ensemble and then additional simulations of the original model to construct the POD reduced model in each iteration *β* of the optimization process. The SPSA method, on the other hand, required only two objective function evaluations to compute the gradient in each iteration *β* of the optimization procedure. For this application, the POD method is also fast since it is not necessary to use full simulations of the original model for the generation of the ensemble (Altaf et al. 2011). One disadvantage of the POD-based calibration method is that if the number of parameters is large, the size of the ensemble becomes large too, and it is usually difficult to construct a good reduced model with a large ensemble. In both experiments the SPSA algorithm converged in a similar number of iterations although the number of parameters was different. So, it is expected that the SPSA algorithm will work even with more parameters, as the cost of an SPSA iteration is independent of the number of estimated parameters.

## 5 Conclusions

In the absence of the adjoint model, the gradient is usually approximated from objective function evaluations. Each individual model parameter is perturbed one at a time, and the partial derivative of the objective function with respect to each parameter is estimated. This method is not feasible for automated calibration when a large number of parameters is estimated. The simultaneous perturbation stochastic approximation (SPSA) method uses stochastic simultaneous perturbation of all model parameters to generate a search direction at each iteration. SPSA is based on a highly efficient and easily implemented simultaneous perturbation approximation to the gradient. This gradient approximation for the central difference method uses only two objective function evaluations, independent of the number of parameters being optimized.

The SPSA algorithm is applied to calibrate the model DCSM. The DCSM is an operational storm surge model, used in the Netherlands for real-time storm surge prediction in the North Sea. A number of calibration experiments was performed with both simulated and real data. The results from the twin experiment showed that SPSA has a lower convergence rate than the steepest descent and POD-based calibration methods. The steepest descent algorithm converged in ten iterations as compared to 20 and 15 iterations for SPSA and average SPSA, respectively. However, the computational cost of a single iteration of the steepest descent and the POD-based calibration methods is much higher and depends on the number of parameters *n*^{p}. Although both the SPSA and steepest descent methods converged to a similar value of the objective function, none of the optimization algorithms achieved the expected reduction in the objective function.

The results from a very large-scale tidal model with real data showed that the SPSA algorithm gives results comparable to the POD-based calibration method. The POD-based calibration method converged in only two iterations as compared to 14 iterations with SPSA. The POD-based calibration method, though, required 13 initial simulations of the original model to get the ensemble and then extra simulations to construct the POD reduced model in each iteration *β* of the optimization process. The SPSA method, on the other hand, required only two objective function evaluations to compute an approximation of the gradient in each iteration *β* of the optimization procedure, independent of the number of estimated parameters. Thus, the SPSA algorithm proved to be a promising optimization algorithm for model calibration in cases where adjoint code is not available for computing the gradient of the objective function.

**Open Access**

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

### References

- Altaf MU, Heemink AW, Verlaan M (2009) Inverse shallow-water flow modelling using model reduction. Int J Multiscale Com Eng 7:577–596
- Altaf MU, Verlaan M, Heemink AW (2011) Efficient identification of uncertain parameters in a large scale tidal model of European continental shelf by proper orthogonal decomposition. Int J Numer Methods Fluids. doi:10.1002/fld.2511
- Bennet AF, Mcintosh PC (1982) Open ocean modeling as an inverse problem: tidal theory. J Phys Oceanogr 12:1004–1018
- Chin DC (1997) Comparative study of stochastic algorithms for system optimization based on gradient approximation. IEEE Trans Syst Man Cybern 27:244–249
- Courtier P, Talagrand O (1990) Variational assimilation of meteorological observations with the direct and adjoint shallow water equations. Tellus 42:531
- Courtier P, Thepaut JN, Hollingsworth A (1994) A strategy for operational implementation of 4d-var, using an incremental approach. Q J R Meteorol Soc 120:1367–1387
- Gao G, Reynolds AC (2007) A stochastic algorithm for automatic history matching. SPE J 12:196–208
- Gerencser L, Hill SD, Vagoo Z (2001) Discrete optimization via spsa. In: Proc. of American control conference, USA
- Heemink AW, Mouthaan EEA, Roest MRT (2002) Inverse 3D shallow water flow modeling of the continental shelf. Cont Shelf Res 22:465–484
- Hoteit I (2008) A reduced-order simulated annealing approach for four-dimensional variational data assimilation in meteorology and oceanography. Int J Numer Methods Fluids 58:1181–1199. doi:10.1002/fld.1794
- Hutchison DW, Hill SD (1997) Simulation optimization of airline delay with constraints. In: Proc. 36th IEEE conference on decision and control, San Diego, USA
- Kaleta MP, Henea RG, Jansen JD, Heemink AW (2011) Model-reduced gradient-based history matching. Comput Geosci 15:135–153
- Kaminski T, Giering R, Scholze M (2003) An example of an automatic differentiation-based modeling system. Lect Notes Comput Sci 2668:5–104
- Kiefer J, Wolfowitz J (1952) Stochastic estimation of a regression function. Ann Math Statist 23:462–466
- Lardner RW, Al-Rabeh AH, Gunay N (1993) Optimal estimation of parameters for a two dimensional hydrodynamical model of the arabian gulf. J Geophys Res Oceans 98:229–242
- Lawless AS, Nichols NC, Boess C, Bunse-Gerstner A (2008) Using model reduction methods within incremental 4dvar. Mon Weather Rev 136:1511–1522
- Leendertse J (1967) Aspects of a computational model for long-period water wave propagation. Ph.D. thesis, Rand Corporation, Memorandum RM-5294-PR, Santa Monica
- Mouthaan EEA, Heemink AW, Robaczewska KB (1994) Assimilation of ERS-1 altimeter data in a tidal model of the continental shelf. Dtsch Hydrogr Z 36(4):285–319
- Navon IM (1998) Practical and theoretical aspects of adjoint parameter estimation and identifiability in meteorology and oceanography. Dyn Atmos Oceans (Special issue in honor of Richard Pfeffer) 27:55–79
- Ray RD (1999) A global ocean tide model from topex/poseidon altimetry: Got99.2. NASA Technical Memorandum 209478
- Seinfeld JH, Kravaris C (1982) Distributed parameter identification in geophysics-petroleum reservoirs and aquifers. In: Tzafestas, SG (ed) Distributed parameter control systems. Pergamon, Oxford. pp 367–390
- Sirovich L (1987) Chaotic dynamics of coherent structures. Physica D 37:126–145
- Spall JC (1998) Implementation of the simultaneous perturbation algorithm for stochastic optimization. IEEE Trans Aerosp Electron Syst 34:817–823
- Spall JC (2000) Adaptive stochastic approximation by the simultaneous perturbation method. IEEE Trans Automat Contr 45:1839–1853
- Stelling GS (1984) On the construction of computational methods for shallow water flow problem. PhD thesis, Rijkswaterstaat Communications 35, Rijkswaterstaat
- Ten-Brummelhuis PGJ (1992) Parameter estimation in tidal flow models with uncertain boundary conditions. Ph.D. thesis, Twente University, The Netherlands
- Ten-Brummelhuis PGJ, Heemink AW, van den Boogard HFP (1993) Identification of shallow sea models. Int J Numer Methods Fluids 17:637–665
- Thacker WC, Long RB (1988) Fitting models to inadequate data by enforcing spatial and temporal smoothness. J Geophys Res 93:10655–10664
- Ulman DS, Wilson RE (1998) Model parameter estimation for data assimilation modeling: temporal and spatial variability of the bottom drag coefficient. J Geophys Res Oceans 103:5531–5549
- Verboom GK, de Ronde JG, van Dijk RP (1992) A fine grid tidal flow and storm surge model of the north sea. Cont Shelf Res 12:213–233
- Verlaan M, Mouthaan EEA, Kuijper EVL, Philippart ME (1996) Parameter estimation tools for shallow water flow models. Hydroinformatics 96:341–348
- Verlaan M, Zijderveld A, Vries H, Kroos J (2005) Operational storm surge forecasting in the Netherlands: developments in last decade. Philos Trans R Soc A 363:1441–1453
- Vermeulen PTM, Heemink AW (2006) Model-reduced variational data assimilation. Mon Weather Rev 134:2888–2899
- Wang C, Gaoming L, Reynolds AC (2009) Production optimization in closed-loop reservoir management. SPE J 14:506–523