1 Introduction

1.1 The study area

In this study, the Singapore Regional Waters (SRW) is defined as the area between 95°E–110°E and 6°S–11°N. It encompasses two strategic waterways, the Malacca Strait and the Singapore Strait, the central part of the shallow Sunda Shelf which connects the South China Sea (SCS) and the Java Sea, and part of the deep basin of the Andaman Sea (Fig. 1). It is also one of the more complex tidal regions in the world. The complexity of the tide arises primarily because this is where the tidal signals from two oceans, which differ in their dominant character (Indian, mainly semi-diurnal; Pacific, mainly diurnal), interact. The tide is further complicated by the sharply varying bottom topography toward the predominantly shallow Sunda Shelf, which separates two deep basins (South China Sea/Pacific Ocean and Andaman Sea/Indian Ocean), and by the complicated coastal geometry of narrow straits and numerous small islands. Past tidal studies in this region have been hampered by the paucity of reliable coastal stations with recent, long-term water level records. Efforts to use numerical tidal models to provide or aid the analysis of the tide in this region have also been hindered by the lack of accurate high-resolution bathymetry data. The model bathymetry in the present paper is based on the latest publicly available Admiralty Charts, which in the Singapore area were supplemented with data from recent local surveys.

Fig. 1 Map showing the Singapore Regional Waters (SRW) region (black rectangle) and the numerical tidal model domains: the Singapore Regional Model (SRM(C)), outlined by dotted red lines, and the South China Sea model (SCSM), shown by the yellow rectangle

1.2 Earlier tidal studies

The strategic importance of this region has led to numerous studies to understand the physical processes that drive and are driven by the hydrodynamics in the SRW. However, due to geo-political realities, relatively few studies encompass the region as a whole. Except for Wyrtki (1961), most of the recent work to date focuses on specific sub-areas of the region: e.g., the SCS area (Shaw and Chao 1994; Zu et al. 2008), the Singapore Strait area (Chen et al. 2005; Chan et al. 2006) and the Malacca Strait up to the Andaman Sea (AS) region (Hii et al. 2006; Ibrahim and Yanagi 2006).

The focus of most tidal studies has been the SCS (e.g. Yanagi et al. 1997; Fang et al. 1999; Zu et al. 2008). For the SCS, the relative lack of accurate information is somewhat mitigated by the availability of satellite altimetry data and the dominance of the Pacific Ocean forcing in the large open SCS. Recent studies in the SCS area focused on tidal description by either analysis of TOPEX/POSEIDON data (Yanagi et al. 1997; Hu et al. 2001) or through numerical modelling (Fang et al. 1999; Cai et al. 2006). The Riau–Lingga region which borders both the Java Sea and the Singapore Strait is a typical area where the lack of detailed bathymetry data and reliable tidal observations has not allowed detailed description of the tide. Tidal analysis of the Indonesian waters has focused on the eastern Indonesian Seas (e.g. Schiller 2004; Hatayama et al. 1996; Ffield and Gordon 1996), due to their importance in the global circulation of water. Several modelling studies address the tide in the Singapore Strait (e.g. Shankar et al. 1997; Zhang and Gin 2000; Pang and Tkalich 2003; Chen et al. 2005). The majority of these models, however, cover a small domain and apply tidal open boundary forcing that is interpolated from data from nearby coastal stations, while the dynamics of the large-scale tidal interaction would require the consideration of a much larger domain. In the Malacca Strait, most of the published studies infer the general motion of water (e.g. Ibrahim and Yanagi 2006) but do not present a detailed description of the tidal dynamics of the area.

1.3 Present tidal modelling approach

Both the analysis of observation data and numerical modelling approaches have their respective strengths and weaknesses. Analysis of observation data requires that the data provide good spatial coverage of the area and be of a reasonable duration. Numerical models typically require relatively high-resolution bathymetry information as well as good information on the tidal forcing at open model boundaries, which should lie sufficiently far from the area of interest for the lateral tidal forcing to be prescribed properly. A recent study by Zu et al. (2008) used data assimilation to combine a numerical model of the open SCS region with TOPEX/POSEIDON observation data by applying the inversion scheme of Egbert and Erofeeva (2002).

Their overall results suggest that it is possible to analyse the tidal sensitivity of a much more complex region like the SRW by combining available satellite altimetry data with a numerical model of the region. Satellite altimetry datasets of this region are typically of high and consistent quality provided that locations with effects due to land-sea transitions are excluded, as was the case in the present study. Analysing the sensitivity of the tidal representation to variation in parameters with known uncertainties (e.g. local depth information, bed friction, lateral tidal forcing at model open boundaries) provides a structured approach to reduce such uncertainties and optimise the values of the corresponding parameters in the tidal model.

For the hydrographically complex SRW, with its complex tidal interactions, a structured approach in the analysis of the sensitivities of the modelled tide to all these uncertain parameters would have great benefit, since intuitive oceanographic approaches are limited here. For that reason, the open-source tool for sensitivity analysis and simultaneous parameter optimisation OpenDA (http://www.openda.org; El Serafy et al. 2010) was used to guide the process. OpenDA provides structured and user-controlled, computer-aided sensitivity analysis. In Section 2, we describe the tidal model, the observational data, and the OpenDA approach of combining the observational data with the numerical model to analyse the sensitivity of the modelled tide in different parts of the SRW region to systematic simultaneous parameter variations.

2 Building blocks: Tidal model, tidal data and assimilation approach

This section discusses the three primary building blocks that are used in the study of the tidal sensitivity in the SRW: (1) a 2D (barotropic) depth-averaged numerical flow model, (2) a consistent set of high quality observation data, and (3) the software environment OpenDA for sensitivity analysis and parameter optimisation.

2.1 Numerical model—design, uncertainties and processing of results

The numerical model in this study solves the 2D depth-averaged shallow water flow equations. The model application is designed using a spherical, curvilinear grid and has previously been described by Kernkamp et al. (2005; see, e.g., their Figs. 9, 10 and 11) as the Malacca Strait model. The model has since become known as the Singapore Regional Model (SRM). It was initially developed to provide accurate tidal information in the Singapore Strait region of its domain (Kernkamp and Zijl 2004). The choice of the much larger model domain allows the varying tidal interactions to establish over a much larger area, avoiding the situation where the modelled tide in the area of interest is dominated strongly by the tidal prescription at nearby open boundaries. The use of a curvilinear grid reduces potential errors from representing the coastal geometry, especially when compared to a rectangular grid. The model covers the region 95°E–109°E and 4°S–10°N, stretching from northern Sumatra to the eastern coast of Borneo (in Fig. 1, its extent is the water area bounded by the red lines). The total number of grid cells in the model is approximately 38,500 and the grid cells vary smoothly in size from approximately 20 × 40 km² at the boundaries to approximately 150 × 200 m² in the interior waters near Singapore. The SRM has open-water boundaries on the Andaman Sea, the Java Sea and the South China Sea. Along these, best estimates of tidal constituents are prescribed, and the tidal water levels at the open boundaries are calculated from the following relationship:

$$ H(t) = {H_0} + \sum\limits_{{i = 1}}^K {{H_i}} {F_i}\cos \left\{ {{\omega_i}t + {{\left( {{V_0} + u} \right)}_i} - {G_i}} \right\} $$
(1)

in which:

H(t): tidal water level at time t

H₀: mean water level over a certain period

K: number of prescribed tidal constituents

i: index of a tidal constituent

Hᵢ: local tidal amplitude of tidal constituent i

Fᵢ: nodal amplitude factor

ωᵢ: angular velocity

(V₀ + u)ᵢ: astronomical phase at Greenwich

uᵢ: nodal phase factor

Gᵢ: local tidal phase of tidal constituent i
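As an illustration of Eq. 1, the following minimal Python sketch evaluates the boundary water level for a list of constituents. The function name, the dictionary layout and the amplitude and phase values are hypothetical and chosen only for illustration; they are not taken from the SRM set-up. Angles are assumed to be in degrees and angular velocities in degrees per hour, with the standard speeds of M2 and K1.

```python
import math


def tidal_level(t_hours, h0, constituents):
    """Evaluate Eq. (1): H(t) = H0 + sum_i H_i F_i cos(w_i t + (V0+u)_i - G_i).

    Each constituent is a dict with (hypothetical) keys:
      'H'     local amplitude (m)
      'F'     nodal amplitude factor (-)
      'omega' angular velocity (degrees per hour)
      'V0u'   astronomical argument (V0 + u) at Greenwich (degrees)
      'G'     local phase (degrees)
    """
    level = h0
    for c in constituents:
        arg_deg = c["omega"] * t_hours + c["V0u"] - c["G"]
        level += c["H"] * c["F"] * math.cos(math.radians(arg_deg))
    return level


# Illustrative values only; the speeds are the standard M2 and K1 speeds.
M2 = {"H": 1.0, "F": 1.0, "omega": 28.9841042, "V0u": 0.0, "G": 0.0}
K1 = {"H": 0.5, "F": 1.0, "omega": 15.0410686, "V0u": 0.0, "G": 90.0}

if __name__ == "__main__":
    for hour in range(0, 25, 6):
        print(hour, round(tidal_level(hour, 0.2, [M2, K1]), 3))
```

In this sketch the nodal factors are simply set to 1; in practice they vary slowly with the 18.6-year nodal cycle.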

Figure 2 shows the location of the open boundaries of the SRM through its boundary support points (big red circles, where tidal and mean sea level forcing are prescribed and adjusted). Eight main tidal constituents Q1, O1, P1, K1, N2, M2, S2, and K2 are prescribed at the three open sea boundaries, while direct tide generating forces are included in the interior domain. The bathymetry in the model domain ranges from a maximum of about 2,000 m in the AS to approximately 40–50 m depth in the Singapore Strait. Depth values are predominantly based on Admiralty charts, with additional data around Singapore itself from local surveys. As navigation charts, the Admiralty charts have a bias towards vessel safety (shallow areas) and typically obscure detailed features. In addition to tidal boundary forcing, the model bathymetry data can therefore be an important source of uncertainty in the reproduction of the tide in a depth-averaged numerical model.

Fig. 2 Singapore Regional Model (SRM(C)) domain showing depth contours (m), boundary support points (red circles, where tidal and mean level forcing are prescribed and varied) and the eight numbered blocks 1–8 (black dashed lines) used in the analysis of tidal open boundary forcing. The six lettered blocks A–F (blue solid lines) are used in the analysis of friction and depth variation. The diamonds denote observation locations used for optimising the tidal model representation

Intuitively, one may consider the modelled tide in each location to be represented as a linear combination of the eight tidal base vectors and their main compound harmonics, which can be uniquely identified by least squares analysis of time series from a full year simulation. Theoretically, this is not the case: the least squares analysis allows some marginal interaction between constituents, as will be explained in the discussion of the results.
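The least squares harmonic analysis mentioned above can be sketched as follows. This is a minimal, generic illustration under stated assumptions and not the analysis tool used in the study: it fits a mean level plus one cosine/sine pair per prescribed frequency through the normal equations, and the helper names (`solve_linear`, `harmonic_fit`) are hypothetical. Because all constituents are fitted jointly on a finite record, the estimated coefficients are coupled through the off-diagonal terms of the normal matrix, which is the origin of the marginal interaction between constituents.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def harmonic_fit(times, levels, omegas_deg_per_hour):
    """Least squares fit of a mean level plus one harmonic per frequency.

    Returns (mean, [(amplitude, phase_deg), ...]) such that the fitted
    signal is mean + sum_k amplitude_k * cos(omega_k * t - phase_k).
    """
    def basis(t):
        row = [1.0]
        for w in omegas_deg_per_hour:
            arg = math.radians(w * t)
            row += [math.cos(arg), math.sin(arg)]
        return row

    rows = [basis(t) for t in times]
    n = len(rows[0])
    # Normal equations (A^T A) x = A^T y; off-diagonal terms couple constituents.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, levels)) for i in range(n)]
    x = solve_linear(ata, aty)

    result = []
    for k in range(len(omegas_deg_per_hour)):
        a, b = x[1 + 2 * k], x[2 + 2 * k]
        # a cos(wt) + b sin(wt) = H cos(wt - G) with H = hypot(a, b), G = atan2(b, a)
        result.append((math.hypot(a, b), math.degrees(math.atan2(b, a)) % 360.0))
    return x[0], result


if __name__ == "__main__":
    w_m2, w_s2 = 28.9841042, 30.0  # degrees per hour
    t = [float(h) for h in range(24 * 30)]  # 30 days, hourly
    y = [0.1 + 1.0 * math.cos(math.radians(w_m2 * h - 40.0))
         + 0.3 * math.cos(math.radians(w_s2 * h - 75.0)) for h in t]
    print(harmonic_fit(t, y, [w_m2, w_s2]))
```

With a noise-free synthetic signal built from exactly the fitted frequencies, the amplitudes and phases are recovered to rounding error; with real or modelled series containing other frequencies, part of the residual is redistributed over the fitted constituents.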

A 1-year simulation with the SRM requires 12 h of total CPU time on a common desktop PC. In simple tests, OpenDA required four to five iterations to obtain a solution (Ooi et al. 2010). This implies a minimum of 60 h total computational time if such tests had been carried out with the SRM. A 3 × 3 aggregated coarser grid version of the SRM (from now on called SRMC) was created to reduce the computational time by approximately a factor of 20. In the aggregation, the depth profiles and therefore volumes are maintained, albeit at a coarser scale. The accuracy of this particular coarser grid model has previously been assessed by Kurniawan et al. (2010), who showed that a 3 × 3 coarser grid designed as described above has largely the same response characteristics and can suitably replace the finer grid model for multiple parameter variation and sensitivity analysis purposes.

2.2 Observational data—along-track data and long-term in situ data sets

The SRW region is a region with relatively few reliable in situ tide gauge data sets (Gerritsen et al. 2000, 2003; Pang et al. 2003). In addition, many of the tide gauges in SRW (reliable or otherwise) are situated in shallow areas directly along the coast or in estuaries. They do not reflect the hydrodynamics in the more open waters and are therefore less suitable to describe the tidal wave propagation and tidal interaction dynamics in the SRW. The present satellite altimetry data from the TOPEX/POSEIDON (T/P) system and its successors provides reliable, consistent open water coverage of water level (tide) information, suitable for sensitivity analysis or assimilation. Combining available reliable long-term in situ observations with satellite altimetry presents the best possible mix of open-water and coastal water level and tide observation throughout the model domain.

At coastal stations, tidal constituents were derived from available multi-year data sets while at locations obtained from satellite altimetry tidal constituents were derived by tidal analysis of level-2 altimeter along-track data sets from the RADS database (Naeije et al. 2006). Coherence analysis of tidal constants from tide gauge data and comparison to altimeter derived tidal constants resulted in a database of tidal constants in a total of 77 observation locations or stations, shown as diamonds in Fig. 2. These stations consist of a mix of International Hydrographic Organization, Maritime and Port Authority of Singapore (MPA), and University of Hawaii Sea Level Center (UHSLC; http://ilikai.soest.hawaii.edu/uhslc/data.html) observations and satellite along-track data sets. Direct comparison of model-based and “observed” tidal constants provides an objective, quantitative and reproducible norm or benchmark to assess the tidal representation of the model (Le Provost et al. 1995).

2.3 OpenDA and multiple parameter variation

The generic OpenDA data assimilation environment (El Serafy et al. 2010) provides both a range of filtering routines as well as uncertainty and sensitivity analysis routines. It also supports assimilation of recent observations to improve model forecasts and allows a user to carry out sensitivity analysis and simultaneous parameter optimisation of model parameters in a given model application. It is an updated open-source version with extended functionalities of the existing DATools system which was used successfully for, e.g. data assimilation of current and salinity profiles (El Serafy et al. 2007) and for flood forecasting purposes (Weerts et al. 2010). The parameter variation and sensitivity analysis toolsets in OpenDA have many features which are essentially user-controlled and have robust default settings. Multiple parameter optimisation and sensitivity analysis will (almost) always end correctly in the sense that, mathematically speaking, optimised parameter values are found. The key question is then how physically realistic these are. To assess the realism of the model results for optimised parameter settings, several independent measures are defined that quantify the difference between modelled tide and the observed tide before, during and after parameter optimisation. The user should use such measures to easily interpret the consistency and realism of the model results.

In the present paper, OpenDA’s semi-automated parameter estimation method Doesn’t Use Derivatives (DUD) is used. This method can be applied for structured variation of the parameters, with or without user-defined constraints on the parameters. DUD is a derivative-free algorithm for nonlinear least squares (Ralston and Jennrich 1978). It evaluates and optimises uncertain model parameters by minimising a generalised form of a least-squares or goodness-of-fit (GoF) criterion which is formulated in the time domain. The user can specify sub-regions or blocks of observations, and specify stations within these blocks, in order to vary the GoF definition during the analysis. The parameter values that correspond to the minimum value of the GoF are considered the optimum parameter values for the given analysis.

In tidal modelling, the water level is the key model variable and the built-in GoF formulation is now configured to read as:

$$ \text{GoF} = \frac{1}{2}\sum_{r=1}^{R_{\max}} \sum_{s=1}^{S_{\max}} \sum_{n=1}^{N_{\max}} w_{r,s}\, \frac{\left( H_{r,s,n}^{\text{sim}}(t) - H_{r,s,n}^{\text{obs}}(t) \right)^2}{\sigma_{\text{Hobs}}^2} $$
(2)

in which H(t) is the water level at time t, sim refers to results obtained from model simulations over the simulation period [0, T], obs refers to observed values, Nmax is the number of timesteps in the time series, Smax is the number of stations in block r, Rmax is the number of blocks for which observations are included, and σ_Hobs denotes the uncertainty assigned to the observations (here: tidal prediction values). σ_Hobs is set at 0.05 m. The weight w_r,s is set uniformly equal to 1.
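A minimal Python sketch of the GoF in Eq. 2 is given below. It is an illustration under stated assumptions and not OpenDA code: the function name and the data layout (dictionaries keyed by a hypothetical (block, station) pair, each holding a water level time series in metres) are invented for this example. The default σ_Hobs = 0.05 m and unit weights follow the settings stated above.

```python
def goodness_of_fit(sim, obs, sigma=0.05, weights=None):
    """Evaluate Eq. (2) for simulated and observed water level series.

    sim, obs: dicts mapping (block, station) -> list of water levels (m),
              with matching keys and series lengths (hypothetical layout).
    sigma:    observation uncertainty sigma_Hobs (m).
    weights:  optional dict mapping (block, station) -> w_{r,s}; default 1.
    """
    total = 0.0
    for key, obs_series in obs.items():
        w = 1.0 if weights is None else weights.get(key, 1.0)
        for h_sim, h_obs in zip(sim[key], obs_series):
            total += w * (h_sim - h_obs) ** 2 / sigma ** 2
    return 0.5 * total


if __name__ == "__main__":
    obs = {(1, "A"): [0.00, 1.00], (2, "B"): [0.50]}
    sim = {(1, "A"): [0.05, 0.95], (2, "B"): [0.50]}
    print(goodness_of_fit(sim, obs))
```

Restricting the dictionaries to the stations of selected observer blocks corresponds to varying the GoF definition during the analysis.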

To assess the sensitivity of the SRW tidal model representation to the prescribed tidal amplitudes and phases the 77 observation stations were subdivided into eight distinct blocks numbered from 1 to 8, see Fig. 2. The sensitivity of the SRW tidal representation to depth and friction changes in the Malacca Strait was studied by breaking up the strait into six distinct blocks named A to F (Fig. 2).

2.4 Evaluation criteria for assessing model representation of tide

In addition to the mathematical optimisation criterion (GoF) the user should assess the response of the model’s tide representation to the parameter variation in terms of the properties of the physics or process dynamics. The proposed method is to evaluate the differences in the modelled and observed tidal amplitudes and phases (H, G) of the available tidal constituents at the stations of the selected observer blocks and those in the whole model domain. A practical error measure for tidal constituent k is the summed vector difference (SVD) over selected observer blocks or the entire model where

$$ \text{SVD}_k = \sum_{r=1}^{R_{\max}} \sum_{s=1}^{S_{\max}} \text{VD}_{k,r,s} $$
(3)

and

$$ \text{VD}_{k,r,s} = \sqrt{ \left( H_{c,k}\cos G_{c,k} - H_{o,k}\cos G_{o,k} \right)^2 + \left( H_{c,k}\sin G_{c,k} - H_{o,k}\sin G_{o,k} \right)^2 } $$
(4)

H_c,k, G_c,k, H_o,k and G_o,k are the computed (simulated) and observed astronomical amplitudes and phases of a tidal constituent k (Le Provost et al. 1995).
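The vector difference of Eq. 4 and its sum over stations (Eq. 3) can be sketched in a few lines of Python. The function names and the tuple layout are hypothetical, and phases are assumed to be given in degrees.

```python
import math


def vector_difference(h_c, g_c_deg, h_o, g_o_deg):
    """Eq. (4): distance in the complex plane between the computed and
    observed tidal vectors (amplitude in m, phase in degrees)."""
    gc, go = math.radians(g_c_deg), math.radians(g_o_deg)
    dx = h_c * math.cos(gc) - h_o * math.cos(go)
    dy = h_c * math.sin(gc) - h_o * math.sin(go)
    return math.hypot(dx, dy)


def summed_vector_difference(stations):
    """Eq. (3) for one constituent: sum of VD over the selected stations.

    stations: iterable of (h_c, g_c_deg, h_o, g_o_deg) tuples.
    """
    return sum(vector_difference(*s) for s in stations)


if __name__ == "__main__":
    # Illustrative values only: a pure amplitude error and a pure phase error.
    print(vector_difference(1.0, 30.0, 0.8, 30.0))
    print(vector_difference(1.0, 0.0, 1.0, 90.0))
```

The measure combines amplitude and phase errors in a single length: a pure amplitude error of 0.2 m gives VD = 0.2 m, while a 90° phase error at 1 m amplitude gives VD = √2 m.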

Finally, a single criterion to evaluate the improvement in tidal representation during the process of multiple parameter variation is defined as:

$$ \%\text{IMP} = \frac{\text{InitialValue}_M - \text{FinalValue}_M}{\text{InitialValue}_M} \times 100\% $$
(5)

Here, M can be either the SVD or the GoF.
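As a worked check of Eq. 5, the short Python sketch below (with a hypothetical function name) reproduces two of the improvements quoted in Section 4: the reduction of the overall SVD from 23.53 to 20.24 m in the depth variation case, and the reduction of the diurnal SVD from 9.02 to 7.85 m in case D8.

```python
def percent_improvement(initial, final):
    """Eq. (5): relative reduction of a measure M (SVD or GoF) in percent.

    Positive values indicate improvement, negative values deterioration.
    """
    return (initial - final) / initial * 100.0


if __name__ == "__main__":
    print(round(percent_improvement(23.53, 20.24), 1))  # overall SVD, depth case
    print(round(percent_improvement(9.02, 7.85), 0))    # diurnal SVD, case D8
```

The two calls evaluate to approximately 14.0% and 13%, consistent with the values reported in the text.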

3 Design of the sensitivity experiments

3.1 Ranking of uncertainties—sequence of simulations

The starting point of the model parameter settings (tidal constants, depths, bed friction) is described in Kurniawan et al. (2010). With estimates of the local amplitudes and phases of eight tidal constituents at 21 positions along the open boundaries (see Fig. 2) and six key areas of uncertain depth and bed friction, the number of uncertain model parameters is too large for simultaneous variation of all parameters using a local search algorithm. The uncertainties in the boundary forcing are considered to provide the largest contribution to the error in the tide representation. The sensitivity analysis therefore starts with this (section A in Fig. 3), followed by the assessment of the model sensitivity to variation of depth and friction in different parts of the Malacca Strait (section B in Fig. 3), and ends with a reassessment of the tidal forcing along the open boundaries (section C in Fig. 3). IM1 to IM3 are the intermediate combined model results after each variation block and are described in more detail in subsections 4.1–4.3 below. The details of the parameters that were varied for each sensitivity analysis are given in Table 1. The number of parameters that are simultaneously varied is denoted by P; "Observer blocks used" denotes the blocks which are included in the GoF. Iter denotes the number of iterations until the optimum; for DUD, Iter ≥ P + 1. DUD ranks the results of (P + 1) simulations, removes the worst case and uses the remaining sets of results to derive the parameter setting for a new case, and so on. All the results shown in the figures and tables are the best optimisations from a set of different variations. The variation started with the amplitudes and phases of the semi-diurnal constituents at the AS and those of the diurnal constituents at the South China Sea and Java Sea (SCS&JS) because of the dominating local physics of the system: the SCS&JS are predominantly diurnal regions while the AS is a predominantly semi-diurnal region.

Fig. 3 Flowchart showing the progression in the sensitivity analysis for the SRW: a variation of open boundary forcing; b variation of bed friction and depths; c revisiting the variation of the open boundary forcing

Table 1 Comparison of sensitivity/assimilation tests showing the SRMC model parameters that were varied, the selected observer blocks, the number of parameters P, the number of iterations Iter and the %IMP in GoF reported by OpenDA

3.2 Configuration aspects of OpenDA for sensitivity analysis

For a discussion of the sensitivity of parameter optimisation to the selected observer blocks and its effect on the tidal representation, see Kurniawan et al. (2010). Essentially, the regions where tide is most directly affected by the analysed parameter variation should be selected as observer blocks (see Fig. 2).

Case SD3 in Table 1 represents the optimum result from Kurniawan et al. (2010) who found that combining blocks 1 and 2 resulted in the best observer blocks for assessing the sensitivity of an incoming tide from AS. Similar tests had been carried out for the incoming tide from the SCS&JS resulting in blocks 5, 6, 7 and 8 being selected as observer blocks for simultaneous variation of the tidal constituents at both the SCS and JS boundaries of the model. The linking of the SCS and JS boundaries for simultaneous sensitivity assessments draws on physical insight provided by Wyrtki (1961) and Zu et al. (2008), suggesting that for the SRW, the SCS and JS boundary regions are quite similar in their tidal characteristics.

A further test analysed the difference between varying amplitudes and phases uniformly along an open boundary (coupling the boundary support points (BSP)) and allowing individual variation at each BSP (uncoupling). As an example, M2 amplitudes and phases were varied along the Andaman Sea open boundary. With BSP = 5, the uncoupled and coupled cases lead to P = 10 and P = 2, respectively. As shown in the “OpenDA modifications (uncoupled)” column in Table 2, uncoupled parameter variation results in unrealistic oscillatory adjustments for both phase and amplitude (especially phase) due to the extra degrees of freedom which the case with coupling does not have. Both cases lead to an almost equal overall improvement of tidal representation in terms of SVD (Eq. 3) away from the Andaman Sea boundary. However, the first is the result of clearly unphysical local adjustments while the second is physically realistic, mathematically simpler and robust. A two-parameter variation problem is also likely to be computationally much less demanding than a problem with 10 varying parameters. The result of the test confirms that when using mathematical concepts and tools such as OpenDA for sensitivity analysis, one has to assess the results in terms of suitably chosen measures for the process dynamics. It is recommended to do so in an overall sense, but also to consider the effects in individual blocks and at the very locations where the parameter adjustments are made.
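The parameter counts in this test follow from simple bookkeeping, sketched below in Python with hypothetical function names. One amplitude and one phase are varied per constituent, either once per boundary (coupled) or once per boundary support point (uncoupled). The lower bound on the number of DUD iterations, Iter ≥ P + 1, is the one stated in Section 3.1; applying the uncoupled count to all eight constituents at all 21 support points is an extrapolation made here for illustration.

```python
def n_parameters(n_constituents, n_support_points, coupled):
    """Number of simultaneously varied parameters P.

    Each constituent contributes one amplitude and one phase, either shared
    along the boundary (coupled) or per boundary support point (uncoupled).
    """
    per_constituent = 2 if coupled else 2 * n_support_points
    return n_constituents * per_constituent


def min_dud_iterations(p):
    """Lower bound on the number of DUD iterations, Iter >= P + 1."""
    return p + 1


if __name__ == "__main__":
    for coupled in (False, True):
        p = n_parameters(1, 5, coupled)  # M2 along the Andaman Sea boundary
        print("coupled" if coupled else "uncoupled", p, min_dud_iterations(p))
    # All eight constituents at all 21 support points, fully uncoupled:
    print(n_parameters(8, 21, coupled=False))
```

For M2 along the Andaman Sea boundary this gives P = 10 (uncoupled) and P = 2 (coupled), i.e. at least 11 and 3 model runs, respectively.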

Table 2 Comparison of OpenDA tests for the M2 tidal constants along the Andaman Sea open boundary for two specifications of the boundary support points (BSP): (A) uncoupled (physically unrealistic adjustments) and (B) coupled (physically justified adjustments)

4 Results and discussion

4.1 Sensitivity of tidal representation to variation of ocean forcing

4.1.1 The tidal forcing along the Andaman Sea model boundary

To examine the sensitivity of the region to uncertainties in the incoming tide prescription from the Indian and Pacific Oceans, the amplitudes and phases for selected tidal constituents at the boundaries of the numerical model were varied. In the initial test SD3 (Kurniawan et al. 2010), only amplitudes and phases of M2 and S2 were varied at the Andaman Sea. This resulted in an improvement of 36.3% in GoF in the observer blocks (see Table 1). Physically, this translates to an improvement in the SVD for M2 and S2 (the selected parameters) in the observer blocks of 32.5% and 27.9%, respectively (see Table 3). This variation of M2 and S2 not only affects the observer blocks but the overall tide representation as the total model SVD for M2 and S2 improve by 22.1% and 15.0%, respectively. The SVD values per constituent in Fig. 4 show that while M2 and S2 significantly improve, most other tidal constituents deteriorate slightly. In a different form, this is also presented in Tables 3 and 4. This is an effect of postprocessing: in the least squares analysis, a redistribution of the least squares residual may result in some small knock-on effect of the change of M2 and S2 into the other constituents. Figure 5 shows that the improvement in the M2, S2 tidal representation does not only occur in the observer blocks (Andaman Sea and Malacca Strait) but also in blocks 3 and 4 (Singapore waters), which together show an SVD reduction by 1.65 m. This is even larger than the 1.54-m reduction in blocks 1 and 2 combined. In blocks 5–8, the effect is very small. The notion of a small knock-on effect of energy between constituents suggests a joint variation of M2, S2, N2, K2 forcing, starting from the newly established M2 and S2 amplitudes and phases in test SD3.

Table 3 Sensitivity of the semi-diurnal tidal constituents—initial SVD and %IMP in the observer region and the overall SRMC model

Fig. 4 Comparison of total SVD for all optimisation tests for each semi-diurnal and diurnal constituent (summation over all 77 stations)

Table 4 Sensitivity of the diurnal tidal constituents—initial SVD and %IMP in the observer region and the overall SRMC model

Fig. 5 Comparison of total SVD for the semi-diurnal constituents at each block for all optimisation tests (summation over M2, S2, N2 and K2 over all 77 stations)

The results of test SD7 in Table 3 show only a marginal overall improvement of 0.1% and 0.04% in the SVD values of M2 and S2, while the observer regions also show small changes: 0.5% improvement for M2, and −0.6% for S2. The minor semi-diurnal constituents N2 and K2 both show significant improvement; in the observer regions the SVD values for N2 and K2 improve by 59% and 33%, while over the whole model the N2 and K2 representation improves by 20.9% and 12.9%, respectively. Clearly, a simultaneous variation approach is essential for the whole semi-diurnal signal. Figure 5 for SD7 shows that most of the additional improvement in the summed semi-diurnal constituents occurs in blocks 1 and 2 (Andaman Sea area and Malacca Strait), little change in block 3, while block 4 (Singapore waters) even shows a slight deterioration. Again, the variation of the semi-diurnal Indian Ocean tide forcing does not propagate to the eastern model domain blocks 5–8.

4.1.2 The tidal forcing along the SCS and JS model boundaries

In test D7, the main diurnal forcing from the Pacific Ocean, defined in terms of O1, K1, Q1 and P1 along the South China Sea and Java Sea model boundaries, was varied. The whole eastern model domain (blocks 5–8) served as observer block. Table 4 shows that the SRW tide is highly sensitive to these changes. Depending on the individual constituent, the %IMP in SVD for the diurnal constituents ranges from 35.5% to 70.0% for the whole model domain, and from 44.5% to 74.2% for the eastern part (observer blocks 5–8). Figure 6 shows that the improvement also extends to blocks 2, 3 and 4, i.e. Singapore Strait and Malacca Strait. This suggests that almost the entire SRW, except for the region close to the Andaman Sea open boundary, is sensitive to the diurnal tide propagation from the SCS&JS regions.

Fig. 6 Comparison of total SVD for the diurnal constituents at each block for all optimisation tests (summation over O1, K1, Q1 and P1 over all 77 stations)

4.1.3 The minor tide contributions from the Indian and Pacific Oceans

With the uncertainties in the major tide forcing contributions significantly reduced, the minor tide forcing contributions need to be investigated: the N2, M2, S2, K2 semi-diurnal tide prescription along the South China Sea and Java Sea boundaries, and the Q1, O1, P1, K1 diurnal tide forcing along the Andaman Sea. A new baseline model called IM1 was created by applying the optimised tide forcings of SD7 and D7. As expected, the diurnal constituents show negligible changes, while the semi-diurnals are slightly better (see Fig. 4).

Case SD9 considers the semi-diurnal tide forcing along the eastern boundaries. The %IMP in GoF is 14% (Table 1). Compared to IM1, SD9 results show significant improvement for the semi-diurnals: SVD reduces from 17.3 to 15.3 m, see Fig. 5. The figure illustrates that the improvement is strong in the eastern part plus Singapore waters, and drops off in Malacca Strait. The SD9 and IM1 lines in Fig. 6 confirm that the diurnals are again only marginally affected as a result of the postprocessing (cf. discussion in paragraph 4.1.1 above).

The variation of the diurnal constituent forcing along the Andaman Sea in case D8 translates to a %IMP in GoF of 7.9%. The overall SVD of the diurnal contributions reduces from 9.02 to 7.85 m (13%), see Fig. 6. The figure shows that the improvements are largest in blocks 2 and 3 (Malacca Strait and West Singapore waters), while the changes are marginal in the eastern part of the model domain. Clearly, the Indian Ocean diurnal forcing does not have a significant effect beyond Singapore. Comparing IM1 and D8 in Fig. 5 confirms that the semi-diurnals are only marginally affected by these variations, due to the postprocessing.

The sensitivity and optimisation simulations above confirm that the Malacca Strait region up to the Singapore Strait (blocks 1 to 4) is most sensitive to the incoming tide from the Andaman Sea (Indian Ocean). Almost the whole SRW domain is sensitive to tide from the South China Sea and Java Sea (mostly Pacific Ocean), except for the region immediately adjacent to the Andaman Sea boundary.

4.2 Sensitivity of the region to Malacca Strait bathymetry (and friction)

4.2.1 Role of Malacca Strait in tidal dynamics

The uncertainties in depth and friction model data in the Malacca Strait are most likely the next largest source of tide representation errors in the model. For this particular analysis, the Malacca Strait has been divided into six blocks, A–F, see Fig. 2. Prior analysis carried out on the observation data sets showed that the area corresponding to blocks C to F is the region with the highest spring/neap tides in the entire model domain, with much tidal mixing. This suggests that blocks C to F would possibly be the ones that would be most sensitive to any depth or parameter variation. The other implication of this analysis is that significant variation in depth or friction in this region could also affect the tide characteristics at the boundaries of the SRW. In their study, Kurniawan et al. (2010) chose Block 3 as the observer region which approximately covers the same area as Blocks D to F combined, and confirmed this hypothesis.

First, a new baseline model (IM2) was created using the optimisation results of D8 and SD9. For all eight tidal constituents, IM2 results are again better than either D8 or SD9, see Fig. 4.

4.2.2 Sensitivity to bed friction in Malacca Strait

To evaluate the sensitivity of the SRW to friction changes, OpenDA was applied to vary the friction (case R9) in both grid directions simultaneously (note that the directions of U and V are the curvilinear grid directions). This resulted in a %IMP in the GoF of 40.15% in the observer blocks (E, F), which was due to an approximately 40% reduction of the original Manning bed friction coefficient in the selected blocks, see Table 1. Table 3 shows that for the semi-diurnal constituents this resulted in a positive %IMP in SVD of 36.5–39.6% in the observer blocks (E, F) and of 8.5–9.5% for the SRW. The effect on the diurnal constituents was mixed, see Table 4. In the observer blocks, the O1 and Q1 tidal representation showed a strong SVD deterioration of 80.3% and 38.7%, respectively, while the tidal constituents K1 and P1 showed an SVD improvement of 16.8% and 6.7%, respectively. Over the whole model domain, the SVDs of O1 and Q1 showed 8.3% and 2.7% deterioration, respectively, while the SVDs of K1 and P1 improved by 1.4% and 1.2%, respectively. The deviating behaviour of O1 and Q1 is not yet understood. Overall, the friction adjustment reduced the SVD from 23.5 to 22.3 m, an improvement in tidal representation of 5.3% (Fig. 4), although the areas most affected differ for diurnals and semi-diurnals (Figs. 5 and 6).

4.2.3 Sensitivity to depth variation in Malacca Strait

The general trends observed when friction was varied were closely reproduced when depth was varied in the same blocks instead (case Z9). The GoF shows a %IMP of 72.7% in blocks E and F, which was due to a 60% increase of the original depth in those blocks. Such a large change is likely an over-adjustment, caused by the assumption in the optimisation case that the depth in this region is the key source of all model errors. The semi-diurnal SVDs showed strong improvement in the observer blocks and the overall region, see Table 3. For the entire SRW, the SVD of M2 and S2 improved by 28.5% and 30.3%, respectively. For the diurnals, the same mixed trend in the %IMP in SVD observed earlier for friction was also observed, see Table 4. For the SRW as a whole, the SVD for the diurnals deteriorates. Nevertheless, the overall tidal representation error reduces from 23.53 to 20.24 m, i.e. by 14.0% (Fig. 4). Similar to the case of varying friction, Figs. 5 and 6 show that the representation of the SRW tide is most sensitive to depth variation in blocks 2 to 5.
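One way to see why a 60% depth increase is a very large change is through the long-wave (shallow-water) speed, c = sqrt(g h), which controls how fast the tide propagates through a shallow strait. The sketch below is an illustration only: it uses the frictionless long-wave approximation rather than the model equations, and the function name `wave_speed_ratio` is hypothetical.

```python
import math


def wave_speed_ratio(depth_factor: float) -> float:
    """Ratio of long-wave speeds sqrt(g * h_new) / sqrt(g * h_old).

    Gravity cancels, so only the depth factor h_new / h_old matters.
    Assumption: frictionless shallow-water wave, c = sqrt(g * h).
    """
    if depth_factor <= 0:
        raise ValueError("depth factor must be positive")
    return math.sqrt(depth_factor)


# A 60% depth increase corresponds to a depth factor of 1.6.
ratio = wave_speed_ratio(1.6)
print(f"wave speed factor: {ratio:.3f}, travel time factor: {1 / ratio:.3f}")
```

Under this approximation the depth change alone would shorten the tidal travel time through the adjusted blocks by roughly a fifth, which is consistent with reading the optimised value as compensation for other error sources.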

The results of R9 and Z9 suggest that there are some complex local blocking features affecting the tide, most likely stemming from the lack of high-resolution bottom topography data but possibly also from the obscuring of detailed bottom features in the vicinity of blocks E and F. The results in Figs. 5 and 6 also suggest that the effects of varying depth and friction are generally local, with the largest variation outside of block 3 seen only in blocks 2, 4 and 5, which border block 3. The large mixed variation in the diurnal constituents suggests that OpenDA may have initially over-optimised the boundary forcing for the diurnal tide. It was therefore decided to reassess the incoming diurnal tide at all the boundaries.

4.3 Final sensitivity analysis of SRW to incoming tide

To reassess the incoming diurnal tide, the new baseline model IM3 was created, which combined the best results of R9 and Z9 for friction and depth. The interaction of the bed friction and depth adjustments in IM3 leads to a small (1.6%) deterioration of the tide representation (Fig. 4). The first additional case, D9, assesses the sensitivity of the observer blocks and the SRW to variation of the diurnal constituents at the SCS&JS boundaries. The results of D9 are then used as the base for a second assessment, D10, which assesses the sensitivity of the SRW to variation of the diurnal constituents at the AS boundary.

Case D9 results in little change to the diurnals, although all of them improve; see the %IMP in GoF in Table 1 and the %IMP in SVD for the diurnals in Table 4. The small net change is also seen clearly in Fig. 6, where the small improvements in blocks 6–8 seem to be largely offset by a similar deterioration in blocks 3–5. The semi-diurnals also show little change in the SVD. These results indicate that it is not the incoming diurnal tide from the SCS&JS that is interacting with the depth and friction in blocks E and F. Using D9 as the new baseline, the sensitivity of the SRW was analysed with regard to variation of the diurnal Indian Ocean tide at the AS boundary. The optimum results of D10 in Table 4 show an SVD improvement of 13.3–22.7% for all the diurnal constituents in the observer blocks (1, 2). The overall model SVD for the diurnal constituents improved by between 6.5% for K1 and 11.7% for O1. Figure 6 shows that the tidal stations in the Malacca Strait (blocks 2, 3 and 4) are the most sensitive to the variation of the incoming diurnal tide from the AS boundary with depth and friction variations optimised. The result again confirms that the tide from the Andaman Sea is most sensitive to uncertainties in depth and friction in the narrow parts of the Malacca Strait toward Singapore (blocks E and F).

4.4 Overall evaluation of the sensitivity of the SRW

In the previous Sects. 4.1–4.3, the model uncertainties have been ranked according to their expected impact on tidal representation, largest first. By successive multi-parameter variation simulations using the local search algorithm DUD, and applying a mathematical goodness-of-fit criterion combined with quantitative indicators that evaluate the quality of the tidal model representation, it has been shown that the largest remaining uncertainties could be reduced significantly, and the tide representation improved. To observe the sensitivity of the SRW to all these variations, the results from the reference (Base) to D10 in Figs. 5 and 6 are compared. The large SVD improvement (46.7%) in the semi-diurnals in the entire region (ALL BLOCKS column) and the similarly large SVD improvement (48.7%) for the diurnals suggest that the tidal models of the Singapore Regional Waters are very sensitive to variations in the boundary forcing and the friction/depth in a particular sub-region. With regard to the previously well-optimised sub-region of the Singapore Strait (Block 4), it has been shown that the tide in this region is very much affected by variations and changing interactions in a much wider area.
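A goodness-of-fit criterion of the kind minimised in such a calibration is typically a weighted sum of squared differences between observed and modelled water levels. The sketch below assumes that form, with one standard deviation per observation series so that less reliable series carry less weight. It does not reproduce OpenDA's own interface, and all names and numbers (`goodness_of_fit`, the series labels, the values) are hypothetical.

```python
from typing import Dict, Sequence


def goodness_of_fit(
    observed: Dict[str, Sequence[float]],
    modelled: Dict[str, Sequence[float]],
    sigma: Dict[str, float],
) -> float:
    """Weighted least-squares cost over all observation series.

    Assumed form: J = 0.5 * sum over series and times of
    ((observed - modelled) / sigma_series) ** 2.
    """
    total = 0.0
    for name, obs in observed.items():
        mod = modelled[name]
        if len(obs) != len(mod):
            raise ValueError(f"length mismatch for series {name!r}")
        s = sigma[name]
        total += sum(((o - m) / s) ** 2 for o, m in zip(obs, mod))
    return 0.5 * total


# Hypothetical water levels (m): an accurate tide gauge and a noisier altimeter series.
observed = {"tide_gauge": [1.0, -1.0], "altimeter": [0.5, -0.5]}
modelled = {"tide_gauge": [0.9, -0.9], "altimeter": [0.3, -0.3]}
sigma = {"tide_gauge": 0.05, "altimeter": 0.10}

cost = goodness_of_fit(observed, modelled, sigma)
print(f"cost = {cost:.3f}")
```

In this example the altimeter residuals are twice as large as the tide gauge residuals, yet both series contribute equally to the cost because the altimeter uncertainty is also twice as large.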

The effect of all variations in tidal open boundary forcing, depth and friction, and the region's sensitivity to these variations, has been visualised in co-amplitude and co-phase plots, see Figs. 7 and 8. The amplitude and phase distributions of M2 and K1 are shown in three frames, representing observed, initial model results and final model results after optimisation. The time zone is GMT + 8. In addition, for better understanding, the spatial distributions of vector differences (as a measure of absolute error) before and after the optimisation are shown in Figs. 9 and 10 for M2 and K1, respectively. Figure 7 shows that for M2 the improvements are due to, for instance, the reduction in the amplitude in the central Malacca Strait (compare Fig. 7b, c with a) and to the depth/friction variation, as evidenced by the reduction in the amplitude in the southern part of the Malacca Strait toward Singapore. In absolute terms, the M2 vector differences (compare Fig. 9a with b) reduce in the regions described above by at least 0.1 m, or about 50%. Figure 8 shows that for the diurnal constituent (K1) the major improvements in the amplitudes and phases have been realised in the eastern regions, which are most directly influenced by the tide forcing prescribed along the SCS&JS boundaries. In absolute terms, Fig. 10 shows that the K1 vector differences have reduced by approximately 0.2 m to almost zero in those regions, i.e. often by much more than 50%.
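The vector difference shown in Figs. 9 and 10 combines amplitude and phase errors into a single length. The following is a minimal sketch, assuming the common definition in which each constituent at a station is treated as a phasor with amplitude A and phase g, the vector difference is the magnitude of the difference between the observed and modelled phasors, and the SVD is the sum of these differences over stations. The station values are made up for illustration.

```python
import cmath
import math
from typing import Iterable, Tuple

# (observed amplitude [m], observed phase [deg], modelled amplitude [m], modelled phase [deg])
Station = Tuple[float, float, float, float]


def vector_difference(
    amp_obs: float, phase_obs_deg: float, amp_mod: float, phase_mod_deg: float
) -> float:
    """Magnitude of the difference between observed and modelled tidal phasors."""
    obs = cmath.rect(amp_obs, math.radians(phase_obs_deg))
    mod = cmath.rect(amp_mod, math.radians(phase_mod_deg))
    return abs(obs - mod)


def summed_vector_difference(stations: Iterable[Station]) -> float:
    """Sum of the vector differences over a set of stations (one constituent)."""
    return sum(vector_difference(*s) for s in stations)


# Hypothetical stations: one with a pure amplitude error, one with a pure phase error.
stations = [
    (0.80, 120.0, 0.60, 120.0),  # 0.2 m amplitude error, no phase error
    (0.50, 45.0, 0.50, 105.0),   # equal amplitudes, 60 degree phase error
]
svd = summed_vector_difference(stations)
print(f"SVD = {svd:.3f} m")
```

The second station illustrates why a phase error alone can dominate the measure: with equal amplitudes of 0.5 m, a 60 degree phase shift gives a vector difference as large as the amplitude itself.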

Fig. 7

Spatial distribution of M2 tidal constants in SRMC grid domain showing a interpolated observed, b interpolated model results before optimisation, c interpolated model results after optimisation. Filled contours denote the magnitude of the co-amplitude (m) and contour lines are the co-phase lines (degree) at GMT + 8

Fig. 8

Same as Fig. 7 but for the K1 tide

Fig. 9

Spatial distribution of vector differences (VD; in m) of the M2 tide in the SRMC grid domain showing a interpolated model results before optimisation, b interpolated model results after optimisation

Fig. 10

Same as Fig. 9 but for the K1 tide

5 OpenDA compared to manual optimisation

In addition to the analysis of the tidal dynamics and model representation in the SRW area, OpenDA is now being applied to analyse the sensitivity of tide representation in the 2D South China Sea tidal model (SCSM). This model covers the area bounded by the yellow lines in Fig. 1. In the last decade, the tide representation of the SCSM was studied and already much improved using various techniques, but mainly using in situ tidal data of varying quality (Gerritsen et al. 2000, 2003). Figure 11 compares the results of the manual optimisation (bn21) with one preliminary OpenDA result (bn20_ODA2), with bn20 as the baseline scenario. The results in Fig. 11 show that OpenDA quickly reaches a much better result than the manual optimisation. Both methods perform similarly when improving the diurnal constituents, with an SVD reduction of 0.76 m for OpenDA compared to 0.63 m with manual optimisation. At the same time, however, OpenDA significantly improves the semi-diurnals (SVD reduction of 2.0 m) compared to the manual optimisation method (SVD reduction of only 0.09 m). These results give confidence that, if configured correctly, using measures that account for process dynamics next to the mathematical criteria, OpenDA can easily be applied to analyse and improve the understanding of the tides in other numerical tide models and other areas.

Fig. 11

Comparison of the total SVD (8 constituents) and the SVD of the 4 semi-diurnal and the 4 diurnal constituents for the SCS model optimisation tests

6 Conclusions

The results of the sensitivity studies in this paper illustrate that the tidal dynamics in a complex region like the Singapore Regional Waters can be effectively analysed by combining a numerical tide model and reliable observational data with a semi-automated data assimilation and parameter optimisation tool such as the open-source OpenDA environment. The study has shown the benefits of a systematic approach to evaluating multiple parameter variation, beginning with the parameters that are expected to have the larger, more general impact and then continuing to parameters that have a smaller and more local effect on tidal dynamics. Since not all uncertain model parameters can be included simultaneously, it is necessary to rank them according to expected uncertainty impact and to iterate after the largest uncertainties have been successively reduced. The study has also demonstrated that a tool such as OpenDA requires a combination of physical insight and understanding of the problem at hand, which is brought in by defining suitable measures to evaluate the behaviour of the key physics. The semi-automated DUD procedure embedded in OpenDA is shown to be effective in reducing the repetitive tasks involved in studying the sensitivities of a complex region like the SRW to various parameter variations. The use of satellite altimetry data in combination with reliable in situ observation data proved to be simple, since OpenDA allows the observation uncertainty to be prescribed individually for each observation time series.

As a whole, the results suggest that an ocean modeller does not need to first develop in-depth knowledge in a field such as data assimilation or automated parameter optimisation but can continue to focus on his/her own area of expertise (analysing tide in this instance) when combining OpenDA with a numerical model.

With regard to the SRW, it is shown that the region as a whole is most sensitive to tidal open boundary forcing but that sub-regions are very sensitive to variation of depth and friction, which have a wider than local impact. Future studies of the SRW should focus on more systematic variation of friction and depth in the narrowing part of the Malacca Strait toward Singapore waters. The analysis and optimisation have not only improved the insight into the tidal propagation and interaction characteristics for the region, but have also resulted in a numerical model in which many of the systematic errors have been reduced or minimised. Such a numerical model can be used with confidence to study tide-surge interactions in the SRW, for which there is a need due to the complex interaction of tides, seasonal monsoons and shorter storm surges in the region. In addition, a model with minimal systematic errors can also be combined with data assimilation techniques such as Kalman filtering, which is not feasible if the model still has serious systematic errors or bias.