1 Introduction

High-frequency (HF) radars allow one to derive two-dimensional maps of ocean surface currents over a wide coastal area by measuring the Doppler shift of electromagnetic waves undergoing Bragg-scattering (Crombie 1955; Wait 1966; Stewart and Joy 1974; Barrick 1978) of surface gravity waves whose wavelength is exactly one-half the radar wavelength. They constitute an essential component of coastal observatories (Roarty et al. 2016; Roarty et al. 2019), with a rapidly expanding network in Europe with 105 sites already deployed or in a planning stage (Rubio et al. 2017; Roarty et al. 2019). A single HF radar site (composed typically by a collocated or combined transmitting and receiving antenna) measures the component of surface current directed radially toward or away from the antenna. In areas where the coverage of two or more HF radar sites overlaps, one can deduce vector currents by using one of the most commonly adopted combination algorithms, such as the unweighted least squares fitting (UWLS) method (Lipa and Barrick 1983; Gurgel 1994; Graber et al. 1997).

The traditional approach consists in interpolating the radial currents on a common grid and inverting a linear system (possibly over-constrained if there are more than two HF radar sites) to compute the zonal and meridional velocity components from the radial currents. These horizontal current vectors are often referred to as total currents. However, spurious total vector currents are obtained along the baseline between two radars where the measurements of radial velocities are nearly aligned and the total currents thus suffer from geometric dilution of precision, as described in Chapman and Graber (1997), and along the edges of the HF radar footprint area.

The spatial coverage of a radial current field is not constant and can exhibit gaps mainly due to interferences due to, e.g., power lines, nearby antennas, metal fences, lack of Bragg scattering ocean waves, low salinity environments, radio interferences, and ionospheric and lightning effects (Mantovani et al. 2020). As a consequence, the combined total current fields also have gaps which can prevent several applications which typically require full fields such as search and rescue (O’Donnell et al. 2005; Ullman et al.Ullman et al. 2006), oil spill tracking (Abascal et al. 2009), and ecological applications (Emery et al. 2006; Helbig and Pepin 2002; Zelenke et al. 2009).

Various techniques have been proposed in the scientific literature to deduce total currents from radial ones and/or to compute interpolated fields without gaps. A commonly used method is the Open-Boundary Modal Analysis which was firstly implemented by Lekien et al. (2004) and further optimized by Kaplan and Lekien (2007). This method decomposes the flow field into irrotational and non-divergent modes (as well as boundary modes) that describe all possible current patterns inside a two-dimensional domain. This set of linearly independent current modes allows the reconstruction of a full field without gaps and to reduce noise with a small spatial scale.

Kim et al. (2007) showed how optimal interpolation can be used to fill the gaps in HF radar data. The error covariance can be either an analytical covariance model or the sampling covariance from the HF radar. A sampling covariance can have spurious negative eigenvalues but in regions with sufficient data the background variance can be adjusted to avoid this problem. The weighted and unweighted least squares fitting of radial currents can also be seen as a special case of optimal interpolation (Kim et al. 2008).

Alternatively, Yaremchuk and Sentchev (2009) used a variational method with a constrain on divergence and curl of the flow field. In a follow-up paper, Yaremchuk and Sentchev (2011) extended the variational approach by using also empirical orthogonal functions showing the advantages over local linear interpolations of the variational method and Open-Boundary Modal Analysis (OMA; Kaplan and Lekien 2007), related to their ability to reconstruct the velocity field within the gaps in data coverage, near the coastlines, and in the areas covered only by one radar site.

The penalized least squares regression method based on a three-dimensional discrete cosine transform method (Fredj et al. 2016) uses both time and space variability to predict the missing data. Missing data are thus interpolated by using not only data nearby in space, but also data from previous and subsequent time instances.

As shown by Hernández-Carrasco et al. (2018), self-organizing maps (SOMs, Kohonen 1997), extracting common flow patterns, can also be used to reconstruct missing data in HF radar data. The authors also compared the SOM method to DINEOF—Data Interpolating Empirical Orthogonal Functions—initially applied to geophysical datasets as sea surface temperature, ocean color, and surface salinity (Beckers and Rixen 2003; Alvera-Azcárate et al. 2005; Alvera-Azcárate et al. 2016), by previously extending its application to obtain total surface currents. It has been shown that the DINEOF technique presents the lowest errors in the Eulerian comparison of the velocity field and performs better than the Open-Boundary Modal Analysis and similar to SOMs in the context of Lagrangian tracking (Hernández-Carrasco et al. 2018).

Most of these techniques are based on statistical consideration in order to fill missing data gaps and reduce the noise in the current measurements. The present manuscript aims to extend these statistical approaches by using dynamical constraints. This approach heavily uses ideas from 4D-Var assimilation but without requiring the setup of a full ocean model. The objective of the present manuscript is to quantify the potential improvements of using dynamical constraints in the context of variational interpolation.

The DIVAnd method (Barth et al. 2014) is an extension of DIVA (Data Interpolating Variational Analysis, Brasseur and Haus 1991; Troupin et al. 2012; Beckers et al. 2014) for more than 2 dimensions. In this work, DIVAnd is first applied to radial current measurements using various dynamical constraints in Section 2. The data used to test this method is presented in Section 3. Results and the validation with drifter data are discussed in Section 4. The findings are presented in the conclusion in Section 5.

2 Method

The DIVA method aims to derive a continuous field from a series of measurements at discrete locations. It is commonly used to derive a gridded climatology from in situ observations (Tyberghein et al. 2012; Lauvset et al. 2016; Troupin et al. 2010). The variational inverse method minimizes a cost function which ensures that the field is relatively close to the observations but with constraints on its regularity using its gradients and Laplacian (Brasseur and Haus 1991). The method does not perform a pure interpolation, since the analyzed field should not necessarily pass through all observations because observations are affected by errors and might not be fully representative (Janjić et al. 2018). These requirements are formalized via a cost function:

$$ J(\varphi) = \sum\limits_{j=1}^{N_{d}} \mu_{j}[d_{j}-\varphi({\mathbf x}_{j})]^{2} + \| \varphi- \varphi_{b} \|^{2} $$

where φ is a scalar ocean field defined and dj are the Nd measurements of the field φ at the locations xj for a given time instance and their weights μj. φb is a background estimate of the field. The background estimate is a first guess of the field to interpolate. For example, if φ represents salinity, the background estimate can be salinity averaged over space and time. For currents, the background estimate that will be used is zero. The experiments in the following were also conducted using the mean current over the domain but this did not change the results in a significant way.

The spatial (and temporal) coherence is introduced in Eq. 1 by defining a particular norm penalizing abrupt spatial (and temporal) variations over the domain Ω:

$$ ||\varphi||^{2}={\int}_{\Omega}(\alpha_{2} \boldsymbol \nabla\boldsymbol \nabla\varphi : \boldsymbol \nabla\boldsymbol \nabla\varphi +\alpha_{1} \boldsymbol \nabla\varphi \cdot \boldsymbol \nabla\varphi + \alpha_{0} \varphi^{2}) d {\Omega} $$

where the symbol : stands for double summation and the coefficients α control the relative importance of each contribution (Laplacian, gradient, and field value) in the norm.

The variational inverse method naturally decouples basins based on topography: an observation on one side of an island, isthmus, or any type of physical boundary can not spread directly to the other side (Troupin et al. 2012). When analyzing tracers, ocean currents can also be taken into account by adding to the cost function an additional term penalizing the advection. The aim here is however to derive the current field from HF radar observations. In a second step, such a current field can be used for the analysis of tracers requiring for instance that the isolines of a tracer align with the currents field.

In this manuscript, we extend the formalism of DIVA applied to ocean currents in the context of HF radar measurements. A single HF radar site provides a radial velocity measurement map. Total vector currents can be only derived in the overlapping region where radial velocities measured from at least two radars are available. In order to maximize the amount of provided data and to minimize additional smoothing and interpolation, the adopted approach is to directly use radial currents in the analysis procedure.

A suitable defined observation operator links the (unknown) analyzed vector field to the radial velocities of the different radar sites. The cost function includes therefore the following data term:

$$ J_{\text{vel}}(\mathbf{u}) = ||u||^{2} + ||v||^{2} + {\sum}_{i=1}^{N} \frac{(\mathbf{u}_{i} \cdot \mathbf{p}_{i} - {u_{r}}_{i})^{2}}{{\epsilon^{2}_{i}}}, $$

where \(\mathbf {u} = \left (u,v\right )\) is the velocity vector, pi is the normalized vector pointing toward the corresponding HF radar site of the i-th radial observation uri (Yaremchuk and Sentchev 2009), N is the number of available radar sites, and \({\epsilon ^{2}_{i}}\) represents the noise of the measurements. The same norm defined in Eq. 2 is also used for both components of the velocity field.

In fact, without imposing any spatial (and temporal) coherence and using only the data constraint, the minimization of the cost function would be equivalent to the least squares method (Lipa and Barrick 1983; Gurgel 1994), which is commonly used to combine several radial current estimates into total current vectors (Appendix Appendix). The effect of the spatial coherence is illustrated in panel (a) of Fig. 1 where the red vector represents a hypothetical measurement (in the x-direction) and the black vectors depict the analyzed field. In this figure, the gray area represents the coastline. The DIVA method, as any method related to optimal interpolation, naturally allows extrapolation of the measurements. Such plots are useful because they visually represent the underlying background error covariance matrix (e.g., Keppenne et al.2008; Barth et al. 2014). The final analysis is in fact a linear combination of these functions represented in Fig. 1a–c.

Fig. 1
figure 1

Current maps obtained for different constraints. The red arrow represents a single current measurement. The black arrows show the current analysis over the whole domain. These extrapolated currents are derived using the considered constraints: a smooth in space, b smooth and presence of the coastline, and c smooth and low horizontal divergence

2.1 Coastline effect

It is quite common for numerical ocean models to represent the coastline as an impermeable, coastal wall as the horizontal movements for the waterfront are generally much smaller than the model resolution (Haidvogel and Beckmann 1999). The velocity component perpendicular to the coastline is thus set to zero:

$$ \mathbf{u} \cdot \mathbf{n} = 0, $$

where n is the vector normal to the coastline Ω. At the open ocean boundaries, this constraint is not activated, allowing therefore a flow through the domain.

Internally, the presented method uses the staggered Arakawa C grid (Arakawa and Lamb 1977) where current components are defined as the interface of a grid cell. If a given cell interface is between a land and an ocean grid cell, then the boundary condition requires that the normal current is zero. The same boundary condition is also applied in the OMA method (Kaplan and Lekien 2007).

There are different ways to include a constraint in the context of variational analysis, and it is common to distinguish between strong constraints and weak constraints (e.g., Ngodock et al. 2017). A solution to the minimization problem has to satisfy exactly a strong constraint while for a weak constraint, residuals related to this constraint are added to the cost function. The residuals are typically divided by a scaling parameter, and as this scaling parameter tends toward zero, the weak constraint tends to a strong constraint. Here, the constraint on the normal velocity at the coastline is added as a weak constraint to the cost function.

$$ J_{\text{bc}}(\mathbf{u}) = \frac{1}{\epsilon^{2}_{\text{bc}}} {\int}_{\partial {\Omega}} (\mathbf{u} \cdot \mathbf{n})^{2} ds. $$

Effectively, this constraint can be included as additional measurements. The parameter \(\epsilon ^{2}_{\text {bc}}\) controls how strongly this constraint is enforced. One could also include it as a strong constraint (i.e., the normal velocity has to be exactly zero at the coastline), but the implementation was simplified by using a weak constraint and choosing a very small value of \(\epsilon ^{2}_{\text {bc}}\), which is practically equivalent to a strong constraint. The optimization of this parameter is discussed in Section 4. The effect of the boundary condition constraint is quite clear from Fig. 1b as it prevents the current from flowing into the coastline.

2.2 Horizontal divergence

In a stratified ocean, it takes a considerable amount of energy to move water vertically and therefore in most cases the vertical velocity is relatively small compared to the horizontal components (even when taking the different horizontal and vertical length scales into account). If we integrate the continuity equation over the surface layer and ignore the vertical velocity, we obtain an additional dynamical constraint on the horizontal velocity:

$$ \boldsymbol \nabla \cdot (h \mathbf{u}) \simeq 0. $$

where h is the average depth of the surface mixed layer or the total water depth where total water depth is shallower and the average depth of the surface layer. As before, this constraint is included in the cost function as a weak constraint with the following form:

$$ J_{\text{div}}(\mathbf{u}) = \frac{1}{\epsilon^{2}_{\text{div}}} {\int}_{\Omega} \left( \boldsymbol \nabla \cdot (h \mathbf{u})\right)^{2} dx $$

The parameter \(\epsilon ^{2}_{\text {div}}\) controls to which degree a horizontal divergence is allowed. A similar data constraint was also used by Yaremchuk and Sentchev (2009) to reduce non-physical flow structures visible in the radial currents. This is the first constraint introduced so far which significantly couples both velocity components. In the presence of a coastline, this constraint is responsible for deflecting the currents, as shown in Fig. 1c.

It is important to point out that we do not assume that the interpolated field is divergence free but we introduce an additional parameter allowing us to control and reduce the divergence of the interpolated field. The strength of this constraint will be determined objectively later by cross-validation. If this constraint would indeed degrade the result then the corresponding epsilon parameter would have quite a large value and the RMS error relative to the cross-validation data would not change.

2.3 3D analysis with time dimension

HF radar sites are able to monitor the surface ocean at a relatively high temporal frequency. It can therefore be desirable not to analyze every time instance separately but several time instances jointly. The correlation between two successive time instances of the radial current fields is 0.93 and for the two considered HF radar sites (FORM and GALF respectively, which will be introduced later). When the time dimension is included in the estimation vector x, the previously introduced concepts remain the same, except that the operator has now also a time component and that the regularization operator now must also have a 3rd-order derivative as explained in Barth et al. (2014). By increasing the size of the estimation vector, one also significantly increases the CPU time and memory requirements. With future parallelization of the algorithm in mind, we limit the time dimension to 3 time instances (including the data the hour before and after). For every analysis, only the central time is kept in the final result. The temporal correlation length is a free parameter that has to be determined.

2.4 Coriolis force

By including the time dimension and imposing a coherence in time, one ensures that the velocity at a given time is similar to the velocity at a previous and next time instance. In the Mediterranean Sea (e.g., Vandenbulcke et al. 2017), inertial oscillations can sometimes be quite energetic being usually of comparable magnitude to the mean slope current (Salat et al. 1992; Tintoré et al. 1995). The inertial oscillations have also been observed near the coast (Millot and Crépon 1981). Moreover, the power spectra of normalized zonal and meridional components from the HF radar data of the Ibiza Channel present dominant peaks at the diurnal, inertial (approximately 19 h), and semidiurnal bands as described in Lana et al. (2016). We therefore consider first the special case where inertial oscillations are dominant, and later in this section we will address other cases. The velocity at a given time instance should rather be similar to the velocity 1 h before and 1 h after, when suitably rotated according to the Coriolis parameter f:

$$ \begin{array}{@{}rcl@{}} \frac{\partial u}{\partial t} &=& f v, \end{array} $$
$$ \begin{array}{@{}rcl@{}} \frac{\partial v}{\partial t} &=& - f u. \end{array} $$

These equations can be easily integrated in time relating the velocity at time instance t and t + Δt:

$$ \mathbf{u}(x,y,t + {\Delta} t) = \left( \begin{array}{cc} \cos({\Delta} t f) & \sin({\Delta} t f) \\ -\sin({\Delta} t f) & \cos({\Delta} t f) \end{array} \right) \mathbf{u}(x,y,t) = \mathbf{M}_{C} \mathbf{u}(x,y,t) $$

where we introduce the matrix MC representing the rotation of the current vector between two successive time instances. It is clear that the Coriolis force is just one force among many, so that this constraint should not be imposed as a strong constraint but rather as a weak constraint:

$$ J_{C}(\mathbf{u}) = \frac{1}{\epsilon^{2}_{\text{Coriolis}}} \| \mathbf{u}(x,y,t + {\Delta} t) - \mathbf{M}_{C} \mathbf{u}(x,y,t) \|^{2} $$

The strength of this constraint is controlled by the factor \(\epsilon ^{2}_{\text {Coriolis}}\) whose optimal value will be determined later.

2.5 Coriolis force and surface pressure gradient

The mean flow is not subjected to inertial oscillations. To improve the previous constraint, two options were considered. One could compute time averages and apply the previous constraint simply on the anomalies relative to this time average. Another approach could be to extend the simplified momentum (8) by a surface pressure gradient term:

$$ \begin{array}{@{}rcl@{}} \frac{\partial u}{\partial t} &=& f v - g \frac{\partial \eta}{\partial x}, \end{array} $$
$$ \begin{array}{@{}rcl@{}} \frac{\partial v}{\partial t} &=& - f u - g \frac{\partial \eta}{\partial y}, \end{array} $$

where η represents the surface elevation and g the acceleration due to gravity. These equations thus allow for a geostrophically balanced mean current. However, it is important to note that this constraint does not mean that the HF radar currents analysis is necessarily geostrophically balanced as it is introduced as a weak constraint to account for neglected forces in the momentum equation and as it allows for a non-stationary field. The cost function is extended to also include the parameter η:

$$ J(u, v, \eta) = \|u\|^{2} + \|v\|^{2} + \gamma \|\eta \|^{2} + \sum\limits_{i=1}^{N} \frac{(\mathbf{u}_{i} \cdot \mathbf{p}_{i} - {u_{r}}_{i})^{2}}{{\epsilon^{2}_{i}}}. $$
$$ J_{\text{pgrad}}(\mathbf{u}) = \frac{1}{\epsilon^{2}_{\text{pgrad}}} {\int}_{\Omega} \| \mathbf{z}(x,y,t + {\Delta} t) - \mathbf M_{\text{pgrad}} \mathbf{z}(x,y,t) \|^{2} d{\Omega}. $$

where the vector field z is defined as:

$$ \mathbf{z}(x,y,t) = \left( \begin{array}{c} u(x,y,t) \\ v(x,y,t) \\ \eta(x,y,t) \end{array} \right) $$

The operator Mpgrad (operating on a vector field and producing a vector field of the same size) is derived by discretizing the Eqs. 12 and 13 and deriving its adjoint.

Knowing the surface elevation is not required (a priori) to apply this method. It is a free parameter (as the gridded currents) determined by minimizing the cost function using the radial current observations and the considered dynamical constraints (relating the gradient of the surface elevation and the velocity). The method could in theory be extended by also including surface elevation observations which could be tested in follow-up studies. Such observations are required because we have surface current observations and know how the surface currents are related to the surface elevation.

When the Coriolis force with and without the pressure gradient is used as a constraint, the temporal consistency is enforced via the momentum equation and there is no longer a temporal derivative involved in the regularization penalty in Eq. 2. Therefore, only in the 3D analysis case, there is an explicit temporal correlation parameter involved.

3 Data

The EMODnet Bathymetry (EMODnet Bathymetry Consortium 2016) is used to delimit the coastline for the present HF radar data analysis. The original resolution of this digital topography is 1/480 degree (1/8 min) and it is sub-sampled by a factor of 16 both in longitude and latitude. The bathymetry used in the analysis has thus a resolution of 1/30 degree (approximately 2.9 km by 3.7 km in the considered region) covering the area from 0.3 W to 1.45 W and from 38.3 N to 39.4 N. From this bathymetry, a binary mask is derived to distinguish sea and land points. The Coriolis parameter f is assumed to be constant and computed based on the average latitude equal to a mean latitude of 38.78 N and the acceleration due to gravity is set to 9.81 m/s2;.

Based on the SeaDataNet climatology (Simoncelli et al. 2018) extracted at the center of the domain, the surface mixed layer h is assumed to have a depth of 50 m or the total water depth in areas shallower than 50 m.

Surface currents datasets from HF radar and surface lagrangian drifters provided from the Balearic Islands Coastal Observing and forecasting System (SOCIB, Spain, http://www.socib.es, Tintoré et al. 2013) have been used in this study. SOCIB operates a complex network of observing platforms for long-term monitoring of physical and biogeochemical processes in the Western Mediterranean Sea (Tintoré et al. 2019). The HF radar system of the Ibiza Channel (Lana et al. 2015; Lana et al. 2016) consists of two CODAR SeaSonde radial stations located in Ibiza Island (Puig des Galfí) at 38.952 N, 1.218 E (GALF), and on Formentera Island (Cap Barbaria) at 38.665 N; 1.389 E (FORM) (Fig. 2). The antennas emit at a central frequency of 13.5 MHz with a bandwidth of 90 kHz, 512-point FFT (Doppler Bins), 2-Hz sweep rate. The radial velocities are processed from the 15-min Doppler spectra averaging with 10-min output rate. At the specified operating frequency, measurement depth is approximately 0.9 m (Stewart and Joy 1974). The hourly radial velocities are obtained by applying a centered 75-min running average and cover a wide coastal area out to a range approaching 85 km. At least two radial observations were required at each range and bearing in the final radial map. The system has been working operationally since June 2012 and for this study the selected period goes from 1 to 31 October 2014 (Tintoré et al. 2020). The number of available radials for that period is shown in Fig. 3.

Fig. 2
figure 2

SOCIB HF radar system. Blue squares, the location of the FORM and GALF sites, respectively on Formentera and Ibiza islands. The arrows represent the mean currents over the period 01 October 2014 to 31 October 2014 (without interpolation)

Fig. 3
figure 3

Radials and cross-validation points for the month of October 2014

We use the CODAR SeaSonde statistics to filter out outliers in the HF radar dataset (CODAR 2016). SeaSonde processing software provides spatial and temporal standard deviations (referred to as spatial and temporal quality in the SeaSonde data files) related to the spatial and temporal averaging of the short-time radials (typically produced every 10 min) as well as the maximum and minimum velocities for the averaging.

The retained radial data had to satisfy the following conditions:

  • Spatial and temporal quality smaller than 7 cm/s

  • Difference between maximum and minimum velocities (computed over the averaging period) less than 20 cm/s

  • Current speed less than 80 cm/s, as it is the established velocity threshold for radial velocities in the Ibiza Channel

To assess the accuracy of the reconstruction, cross-validation is used (e.g., Stone 1974; Brankart and Brasseur 1996; Alvera-Azcárate et al. 2015). While in some studies cross-validation data points are individual scalar data points completely chosen at random (Beckers and Rixen 2003), it is considered preferable that the method is tested on gaps with a more realistic spatial extent (e.g., Alvera-Azcárate et al. 2009; Beckers et al. 2006). This is achieved by marking some data points as missing from the 30 current maps with the best coverage (for each of the two HF radar stations). We use the coverage maps for the 30 current maps with the least coverage to mark some data points as cross-validation data in the 30 currents maps with the best coverage. If m1 is the binary mask (indicating presence or absence of the data) corresponding to the radial currents with the best coverage and m− 1 the binary mask of the radial currents with the lowest coverage, then all measurements where m− 1 is masked are considered as cross-validation data and not used in the subsequent analysis. This procedure is repeated for the dataset with the second best coverage m2 and the second worst coverage m− 2 up the 30th best/worst coverage map.

In total, there are 639,670 radial measurements, from which 27,136 have been marked for validation, representing 4.2% of the total data coverage. The remaining dataset (95.8%) is used to compute the analyzed vector currents. The obtained analyzed vector currents are then interpolated onto the location of the independent validation dataset to assess the accuracy of the analysis. This general procedure is called cross-validation. While in some studies, the cross-validation data points are chosen at random, here for all experiments, exactly the same cross-validation data points were used to facilitate the comparison of the results.

4 Results

Various experiments have been carried out to test the influence of the different constraints individually described in Section 2. These numerical experiments are summarized in Table 1.

Table 1 Overview of the different test cases with a description of the activated constraints

In the 2D case, every time snapshot is reconstructed using only data from the same time instance. The only parameters for this case are the horizontal correlation length and the 𝜖2 parameter.

The cost function depends on the gridded velocity field (and on the surface elevation for the last case), but also on a series of parameters involved in the considered dynamical constraints. For a fixed value of these analysis parameters, the cost function is quadratic and we use an efficient solver for this case (sparse matrix inversion or conjugate gradient method) to obtain the analyzed field.

At a higher level, the analysis parameters (𝜖 and correlation length for the 2D case and additional parameters used in other experiments) are optimized using the adaptive differential evolution method (Storn and Price 1997) implemented in the Julia BlackBoxOptim package (Feldt 2019) by minimizing the RMS error computed from cross-validation.

The smallest cross-validation error was obtained using a horizontal correlation length of 2889 m and a 𝜖2 parameter controlling the strength of the data constraint in the cost function equal to 0.032. The 2D case is used as the control case for computing the relative improvement for the more complex methods presented in the following. Skill score S for case C is defined in terms of the mean square error RMS2 (Murphy 1988):

$$ \mathrm{S}(C) = 1 - \frac{ {\text{RMS}}^{2}(C) }{ {\text{RMS}}^{2}(2D) } $$

A zero skill score means that the reconstruction is as “good/bad” as the control case (here the 2D case) and a skill score of 1 means that the reconstruction matches perfectly the validation dataset, which is of course not possible in practice since also the validation dataset is affected by noise.

Figures 4 (2D control analysis) and 5 (3D analysis including Coriolis force and surface pressure gradient) show two example reconstructions for 3 October 2014, 3:00 UTC. The left panels of these figures represent the derived total currents and the center and right panels are the radial velocities, measured by the HF radar sites located in Formentera (i.e., FORM) and Ibiza (i.e., GALF), respectively. Dots are the measured HF radar radial velocities plotted on top of the reprojected analyzed currents for the same snapshot. Where the analyzed radial currents and dots have a similar (resp. different) color, the corresponding residual is small (resp. large). Such plots give an indication of the degree of the observed information retained in the analysis and allow one to detect if the analysis under- or overfits the observations.

Fig. 4
figure 4

The reconstructed current velocity for the 2D control analysis. All panels are valid for the time 03 October 2014 T03:00:00. The analyzed total currents are shown in a where the red arrow represents 0.5 m/s. The radial currents (HF radar measurements and reprojected analysis) are shown in b and c for the two HF radar sites. The differences between the HF radar measurements and the reprojected analysis are given in (d) and (e)

Fig. 5
figure 5

Same as Fig. 4 for the 3D analysis including Coriolis force and surface pressure gradient

In both Figs. 4 and 5, there are some differences between the radial velocities and the reprojected analysis currents near the edge of the coverage. It is indeed expected that the error of the radial current measurements increases far away from the corresponding HF radar site. Visually, both analysis methods give quite similar results; it is therefore necessary to evaluate the error statistics relative to the cross-validation dataset to quantify the impact of the dynamical constraints.

Table 2 shows the RMS error relative to the validation dataset and the corresponding skill score of the different experiments. The RMS errors in this table are based on the reprojected radial currents and the radial current withheld for cross-validation. As for the reference, the standard deviation of the radial currents (two stations combined) is 0.12 m/s. The first constraint considered is the boundary condition, but it did not produce any noticeable effect on the validation metric. This can be explained by the fact that the boundary conditions, per definition, only act on the coastline and thus the effect is indeed expected to be small on the validated data which is mainly located offshore. The divergence constraint did not have an impact (compared to the 2D control case) for this case. However, a more significant improvement was obtained by including the time dimension and solving the 3D variational problem. For the best results, with a skill score of 0.441, the surface pressure gradient needed to be considered explicitly as well. All parameters were optimized as before using the adaptive differential evolution method (Storn and Price 1997) and the optimal values are shown in Table 2.

Table 2 RMS errors, skill score, and the value of the optimal parameters for different experiments

In order to test the robustness of the minimization procedure, we repeated four times the optimization for the most complete case (3D_Coriolis_pgrad) with the parameters strength of the data constraints, strength of the dynamical constraints, and the normalization factor γ. We computed the standard deviation normalized by its mean value for the different minimization experiments for the optimal parameter values and the optimal RMS value. The optimal RMS error was virtually identical (with a relative deviation of less than 0.004%). The relative standard deviation of the data constraints was 4% and the relative standard deviation of the strength of the dynamical constraints and the normalization factor γ were 14% and 20% respectively.

From the reconstructed surface currents, the mean current and their temporal variability are also derived (Fig. 6). The latter is represented as ellipses whose size is related to the standard deviation and the orientation to the correlation between the zonal and meridional currents. The standard deviation is scaled down by a factor of 5 to enhance visibility. The variability is actually quite large compared to the mean current in this region. The vectors outside the area covered by both antennas are of course much less reliable. The analysis reveals a quite strong current just in front of Puig des Galfi (GALF).

Fig. 6
figure 6

Surface current statistics. Arrows represent the reconstructed mean current velocity and ellipses their temporal variability. For clarity, only one current vector for every 3×3 grid cell is shown and the red arrow represents 0.1 m/s. The black arrow represents the mean currents and the light blue ellipse is the standard deviation (reduced by a factor of 5 for visibility)

4.1 Drifter validation

Besides the validation with a subset of HF radar data, the reconstructed surface current field is also compared to velocity derived from 13 satellite-tracked surface drifters of three different types (i.e., MetOcean CODE, MD03i, ODI) deployed by SOCIB in the Ibiza Channel in September 2014 (Tintoré et al. 2014) as described in Lana et al. (2016). The configuration of the floats was designed to ensure that both measurements, HF radar and drifters, remain inter-comparable, being both representative of the upper 1-m surface current. The float configuration was designed to ensure that the measurements are representative of the upper 1m surface current, hence ensuring that they are compatible with the HF radar measurements.

  • ODI, from Albatros Marine Technologies, has a spherical shape with small diameter (0.2 m) and low weight (3 kg), with less than 50% of its body emerging. A drogue of 5 kg was attached at 0.5 m below the sea surface.

  • MetOcean CODE is a robust solution to acquire coastal and estuarine water currents within a meter of the water surface, minimizing wind drag effects (Davis 1985). The MetOcean CODE drifters had their drogue between 30 and 100 cm and low wind exposure.

  • MD03i is a cylinder-shaped drifter, which has a diameter of 0.1 m and a length of 0.32 m, where only approx. 0.08 m is above the water surface when deployed. To enhance the drag, a drogue was attached 0.5 m below the sea surface with a 0.5 m length and diameter. Due to the very small sail area above the water surface, the drifter’s path represents the current in the upper meter of the water column.

Only those drifter positions flagged as “good,” following the SOCIB’s quality procedures (Lana et al. 2015; Ruiz et al. 2018), within the HF radar coverage for total surface currents and for the analyzed period (October 2014) are retained for the drifter validation, as shown in Fig. 7. The velocity is derived from the drifter position using the difference between two successive GPS positions. As the HF radar data also represents a time averaged current measurement, the velocity is filtered by a low-pass filter with a cutoff frequency corresponding to 1 h (which corresponds to the temporal resolution of the HF radar data). The filter is modeled after a 1D diffusion. RMS values are computed per drifter. In order to obtain an overall estimate of the accuracy, we compute the RMS value using all drifter data combined for each of the reconstructions.

Fig. 7
figure 7

Drifter trajectories available inside the HF radar total footprint area during October 2014 used for validation of the reconstructed surface currents (top panel). Temporal availability of each drifter inside the area and period of interest (bottom panel)

The combined RMS errors between the analysis currents (ui,vi) and the drifter currents (uobsi,vobsi) are defined as following:

$$ \begin{array}{@{}rcl@{}} {\text{RMS}}_{c}^{2} &=& \frac{1}{N} \sum\limits_{i=1}^{N} (u_{i} - {u_{\text{obs}}}_{i})^{2} +(v_{i} - {v_{\text{obs}}}_{i})^{2} \end{array} $$
$$ \begin{array}{@{}rcl@{}} &=& {\text{RMS}}_u^{2} + {\text{RMS}}_v^{2} \end{array} $$

where the RMS error for each component is defined as:

$$ \begin{array}{@{}rcl@{}} {\text{RMS}}_u^{2} &=& \frac{1}{N} \sum\limits_{i=1}^{N} (u_{i} - {u_{\text{obs}}}_{i})^{2} \end{array} $$
$$ \begin{array}{@{}rcl@{}} {\text{RMS}}_v^{2} &=& \frac{1}{N} \sum\limits_{i=1}^{N} (v_{i} - {v_{\text{obs}}}_{i})^{2} \end{array} $$

It should be noted that different approaches are possible to compute the combined RMS velocity error. For instance, one might divide by 2N instead of N (counting the zonal and meridional components individually).

Table 3 shows the RMS errors of the two best experiments from the previous cross-validation tests and corresponding skill scores using again the 2D analysis as a control experiment. For the experiment using multiple time instances, so far only 3 time instances (i.e., 3 h) were considered. In a separate set of experiments, the effect of including more time instances was tested. The 3D reconstruction case had the smallest RMS error using 7 h in total and in the case 3D_Coriolis_pgrad the optimal number of time instances was 13. Beyond 13 time instances, the RMS error was no longer reduced.

Table 3 RMS (m/s) and skill score for all drifter data combined

The improvements of the dynamical constraints are not as large as the improvements obtained using the cross-validation HF radar data. The best results were obtained, as before, for the case 3D_Coriolis_pgrad. For the 2D control experiment, the RMS error and skill score are also computed per drifter (Table 4). The number of hourly data points ranging from 103 to 2302 is also included. The SOCIB drifter with the name md03i004_scb is quite representative in terms of skill score for the overall improvement relative to the drifters and the zonal u and meridional v components are shown in Fig. 8. Significant inertial oscillations are quite apparent from this time series (and also from the trajectory plots in Fig. 7). The overall match between the drifter currents and the HF radar currents is quite good. Some spurious variability in the 2D analysis at short time scales is visible from the analysis. The temporal coherence is improved by including the time dimension and the discussed dynamical constraints. If a posteriori smoothing of the results were sufficient, then one would have expected that the 3D experiment would be of similar accuracy which is clearly not the case as shown in Table 3.

Table 4 RMS (m/s) and skill score per drifter
Fig. 8
figure 8

Zonal (upper panel) and meridional (lower panel) of the drifter md03i004_scb

The computation time for the most complete case (using 13 hours of data for a single reconstruction) is 26 h to reconstruct 744 current maps on 6 CPUs (Intel Xeon CPU E5-2660). On average, a single current map is reconstructed in 2 min of CPU time.

The improvements resulting from the inclusion of the dynamical constraints are less important when comparing the results to the drifter data. There are some inherent differences between HF radar currents and currents from drifters such as different integration depth, errors due to geometric dilution of precision, and different time and spatial resolutions. The difference between the interpolated HF radar data and the drifter velocity is due to (i) inherent differences between radar and drifter currents and (ii) interpolation error. By testing different interpolation schemes, we can only reduce the contribution of the second error term. In principle, one can also try to reduce the random and non-systematic part of the inherent differences but not systematic error.

In particular, the longest drifter time series md03i004_scb has an RMS error of 0.0592 m/s and 0.0688 m/s (for the u and v component respectively) when comparing the HF radar currents (without interpolation) to drifter currents as computed by Lana et al. (2016). This time series (using only HF radar where they are present) is 173 h long. The best method gives an RMS error of the interpolation currents of 0.0624 m/s and 0.0592 m/s (for the u and v components respectively). The interpolated time series with matching drifter data is 233 h long. It is natural to expect that the interpolated time series would have a larger RMS error than the original (gappy) data. However, we were able to show that the interpolated time series has a similar RMS error than the original data.

We also compared our method to the results from the Open-Boundary Modal Analysis (OMA) which is a quite commonly used method to interpolate HF radar observations (Kaplan and Lekien 2007) and which it is part of the HFR Progs package (https://github.com/rowg/hfrprogs/tree/master/matlab/OMA). The OMA is based on a set of linearly independent velocity modes (189 in the case of the Ibiza Channel HF radar) that are calculated before they are fitted to the radial data. OMA considers the kinematic constraints imposed on the velocity field by the coast since OMA modes are calculated taking into account the coastline by setting a zero normal flow.

Depending on the constraints of the methodology, it can be limited in representing localized small-scale features as well as flow structures near open boundaries. Also, difficulties may arise when dealing with gappy data, especially when the horizontal gap size is larger than the minimal resolved length scale (Kaplan and Lekien 2007) or when only data from one antenna are available. In the case of large gaps, unphysically fitted currents can be obtained if the size of the gap is larger than the smallest spatial scale (6 km in our case) of the modes, since the mode amplitudes are not sufficiently constrained by the data (Kaplan and Lekien 2007).

The RMS error relative to drifter observations for the OMA method has been computed (RMSu = 0.0570 m/s, RMSv = 0.0756 m/s for all drifters combined). The method 3D_Coriolis_pgrad (RMSu = 0.0536m/s, RMSV = 0.0680 m/s) compares favorably to the OMA method once multiple time instances are considered together. When using data from different time instances, it is necessary to specify how they are related. This is done in the present manuscript using the momentum equation (including the Coriolis force and the surface pressure gradient). If for instance the Coriolis force would not be taken into account, the amplitude of the inertial oscillations would be significantly smoothed out.

5 Conclusions

The DIVA framework is extended to handle surface currents and able to deal with observations when only one component of the velocity vector is measured. In order to check the importance of dynamical constraints in gridding a velocity field, a 2D analysis is used as the control experiments for different test cases. Including boundary conditions and the constraints on small divergence did not improve the accuracy of the constructions. However, the skill score was improved when taking for every time instance the previous and the following radial maps into account (i.e., a 3D analysis). By including in the cost function a momentum balance with the Coriolis force and surface pressure gradient, we can successively improve the skill score of the analysis. The best analysis procedure was obtained when considering the Coriolis force and the surface pressure gradient. Dynamical information appears to be quite beneficial when analyzing surface currents.

The main conclusion is also supported when comparing the results to drifter data. The skill scores relative to the control experiment are not as significant as relative to the HF radar cross-validation data. However, the best analysis was again obtained by considering the Coriolis force, the surface pressure gradient, and including the time dimension in the analysis. The comparison of drifter showed that the presented method compares favorably to the reconstruction from the OMA method.

It should be noted that if the area is well covered by observations and gaps are relatively small and infrequent, then many interpolation methods are likely to provide similar results. But if large gaps exist in space and/or time, using an interpolation algorithm able to leverage a priori information about the field to interpolate (like dynamical constraints and data at different time instances as shown here) can be beneficial and can avoid excessive smoothing when interpolating over large gaps.

The source code of the analysis is available under the terms of the GPL license at the address https://github.com/gher-ulg/DIVAnd_HFRadar.jl.