1 Introduction

The Surface Water and Ocean Topography (SWOT) altimeter mission was launched on December 16, 2022, and operated in a 1-day repeat orbit in its initial phase, transitioning to a 21-day repeat orbit in the science phase in July 2023. SWOT is positioned at an altitude of 890.5 km with a 77.6° inclination (Esteban-Fernandez 2017). SWOT measures sea surface heights (SSHs) over two 50-km swaths with a 20-km nadir gap using a Ka-band Radar Interferometer (KaRIN) altimeter that employs synthetic aperture radar (SAR)-interferometry techniques. SWOT is anticipated to have significant applications in both hydrology and oceanography. Its hydrological objectives include measuring the elevations of lakes, rivers, and wetlands and estimating global discharge for rivers and water storage for lakes (Morrow et al. 2019). In oceanography, SWOT is primarily used to detect mesoscale and submesoscale ocean circulation and eddies (Fu and Ubelmann 2014; Morrow et al. 2019).

A nadir-looking radar altimeter measures only SSHs along satellite ground tracks, yielding one-dimensional SSH observations. However, systematic and random errors in SSH observations from various tracks result in spatially non-uniform SSH observations. In contrast, SWOT measures wide-swath SSHs at a nominal 2-km spatial resolution with unprecedented accuracy. Consequently, SWOT’s swath SSH observations surpass radar altimeter data in both spatial resolution and accuracy, leading to greatly improved marine gravity fields (Morrow et al. 2019; Yu et al. 2021).

Marine geoid gradients derived from altimeters play crucial roles in geodesy and geophysics. They can be utilized to derive gravity anomalies, vertical gravity gradients, marine geoid heights, and short wavelength depths (Sandwell and Smith 1997; Hwang 1998). Many researchers have determined the marine geoid gradients from nadir-looking radar altimeters (Haxby et al. 1983; Sandwell 1984; Hwang and Parsons 1995, 1996; Olgiati et al. 1995; Hwang 1998). However, little research has been directed toward determining marine geoid gradients from the ongoing SWOT mission. Jin et al. (2022) calculated the north and east components of the deflection of the vertical (i.e., geoid gradient) from the simulated SWOT observations by weighted least-squares estimation and analyzed their accuracies using the Earth Gravitational Field Model 2008 (EGM2008) by Pavlis et al. (2012). Yu et al. (2021) studied the potential of SWOT in deriving high-accuracy marine gravity field from geoid gradients and geoid heights. However, an ill-conditioned problem occurs when inverting the covariance matrices in the LSC estimation of north and east components of geoid gradients. A tentative solution by Yu et al. (2021) is using error covariance matrices of the observed geoid gradients to avoid matrix singularities.

Ill-conditioned problems often emerge in physical geodesy. In an ill-conditioned problem, minor fluctuations in input data and their corresponding errors can result in substantial changes in the parameters estimated from the data (Rummel et al. 1979). Geodetic literature suggests that ill-conditioning in normal matrices during network adjustment can lead to sizable errors in estimated parameters (Grafarend and Sansò 2012). Ill-conditioned problems are typically addressed using regularization methods, which rely on a specific regularization parameter. A standard regularization method is Tikhonov regularization (Tikhonov and Arsenin 1977). Rummel et al. (1979) showed that Tikhonov regularization is equivalent to LSC when the regularization parameter is equal to one. The application of Tikhonov regularization in LSC uses a regularization parameter to balance the weight of the 2-norm of the estimated parameters and the weight of the 2-norm of residuals of the observations (Moritz 1980). For example, Abrikosov (1999) determined the regularization parameter in the LSC regularization based on three different principles. Marchenko and Tartachynska (2003) derived gravity anomalies from altimeter data in the Black sea area using LSC and the regularization method of Abrikosov (1999). In addition, the regularization method can be combined with variance component estimation to combine different types of data. For example, Koch and Kusche (2002) determined the regularization parameter by the ratio of two variance components. Xu et al. (2006) developed an iterative mean squared error (MSE)-based method to simultaneously determine variance components and regularization parameter. Xu (2009) proposed the iterative generalized cross-validation method extended from the iterative MSE-based method developed by Xu et al. (2006) to optimize regularization parameter and the variance components.

The purpose of this paper is to mitigate the ill-conditioned problem identified in the study by Yu et al. (2021), which determined the north and east geoid gradient components using LSC from simulated SWOT SSHs. In this paper, Tikhonov regularization is applied to LSC to attain stable solutions for the two geoid gradient components on a grid, with the regularization parameter ideally determined via the L-curve method. First, we generate the high-frequency SSH signals expected to be observed by SWOT using multi-beam depths and then generate error-free SWOT SSH observations. Second, we use LSC method to determine the north and east components of geoid gradients. Given that the calculated geoid gradient components without regularization are unstable, the Tikhonov-L-curve regularization is then implemented to resolve this issue to best recover the geoid gradient components. The theoretical values of the geoid gradient components using numerical differentiations are used to assess the accuracies of the recovered gradients.

2 SWOT SSH observations and “true values” of geoid gradient components

2.1 High-wavenumber SSH components from multi-beam depths

SWOT has been launched on December 16, 2022, and is currently in its science phase. However, the observations are undergoing calibration and validation, and may not be optimal for gravity recovery. Thus, this paper uses simulated SWOT SSH observations to test the method for overcoming the ill-conditioned problem. SWOT scans SSHs with unprecedented 1.4-cm accuracy and 2 × 2 km spatial resolution by SAR-interferometry (Peral and Esteban-Fernandez 2018; Morrow et al. 2019). Therefore, it can be difficult to simulate realistic SWOT observations from the present MSS models, which do not contain high wavenumber features over the seafloor. Following the approach of Yu et al. (2021), we simulate the high-wavenumber SSH components from the bathymetric data to pick up the signals not seen in the DTU18 mean sea surface (DTU18MSS) by Andersen et al. (2018). Note that DTU18MSS uses only data from the nadir-looking altimeter measurements. The SSHs from DTU18MSS may have been smoothed by filtering that reduced seafloor signatures.

Figure 1 (a) shows depths over our study area, which are from the multi-beam measurements of the Ministry of the Interior (MOI), Taiwan (Hsiao et al. 2016; Yu et al. 2021). The MOI multi-beam depth dataset has a mean accuracy of a few meters and a spatial resolution of 500 m along the ship cruise lines. Therefore, we use the 2020 version of the General Bathymetric Chart of the Oceans (the GEBCO_2020 Grid; GEBCO Bathymetric Compilation Group 2020) to fill the gap in the MOI depth data (void zone shaded by gray). The GEBCO_2020 Grid is on a 15″ × 15″ grid with a lower accuracy than MOI data. From the combined bathymetric data, the high-wavenumber SSH components are generated as follows (Yu et al. 2021)

$$ H_{{{\text{RDM}}}} \left( {x_{p} ,y_{p} } \right) = \frac{G}{\gamma }\mathop \smallint \limits_{{y_{1} }}^{{y_{2} }} \mathop \smallint \limits_{{x_{1} }}^{{x_{2} }} \frac{{\rho \left( {h\left( {x,y} \right) - h_{{{\text{ref}}}} \left( {x,y} \right)} \right)}}{{\sqrt {(x - x_{p} )^{2} + (y - y_{p} )^{2} } }}dxdy = \frac{G}{\gamma }\left[ {\rho h_{{{\text{res}}}} } \right]*\frac{1}{r} $$
(1)

where \(H_{RDM}\) is the effect of residual depth model (RDM) on SSH, \(G\) is the gravitational constant, \(\gamma\) is normal gravity, \(\left( {x_{1} , x_{2} } \right)\) and \( \left( {y_{1} , y_{2} } \right)\) define the integration range for the RDM in the local x–y plane centered at P, \(\rho = 1.64\) \({\text{g}}\cdot{\text{cm}}^{ - 3}\) is the density contrast between the average density of the ocean bedrock \(\left( {{2}{\text{.9 g}}\cdot{\text{cm}}^{ - 3} } \right)\) and that of seawater \(\left( {{1}{\text{.03 g}}\cdot{\text{cm}}^{ - 3} } \right)\), \(h_{{{\text{res}}}} = h - h_{{{\text{ref}}}}\) is the residual depth, \(h\) is the true depth, \(h_{{{\text{ref}}}}\) is the reference depth, \(r = \sqrt {x^{2} + y^{2} }\), and * is the convolution operator.

Fig. 1
figure 1

a Depths from the MOI multi-beam depth dataset, the areas without depth measurements (void zone) are gray-shaded, b high-wavenumber SSH components, and c simulated error-free SWOT SSH observations

The true depths are the combination of the MOI multi-beam depths and the GEBCO_2020 Grid, and the reference depths are obtained by low-pass filtering of the true depths with a wavelength of 20 km. Then the residual depths are the differences between the true depths and the reference depths. The high-wavenumber SSH components derived from the residual depths are shown in Fig. 1 (b). They have the same spatial frequency as that of the residual depths and aim at compensating for the missing high-wavenumber SSHs in DTU18MSS (Yu et al. 2021). We generate a SWOT_model on a 1′ × 1′ grid via superimposing the high-wavenumber SSH components on the DTU18MSS model. Then this SWOT_model is input into the SWOT simulator (version 3.1) developed by Gaultier et al. (2016) to generate one-cycle error-free SWOT SSH observations (Fig. 1 (c)). The simulated SWOT observations over the SWOT swaths are on a 2 km × 2 km grid and those along the nadir tracks have a 2-km interval. Because our focus is to identify the best approach for regularization in the LSC estimation of north and east gradient components, we do not consider the SWOT observation errors. Subjects such as the effects of SSH errors on SWOT-estimated geoid gradients and marine gravity anomalies and the error suppression techniques have been studied by Yu et al. (2021). Note that the simulated SWOT observation errors from our previous study (Yu et al. 2021) align with the SWOT error budget. Corrections for most of these errors will be incorporated in the SWOT Level 2 ocean products (Stiles 2020). These corrections should be applied prior to extracting the geoid gradients from the SWOT SSH data. Consequently, only the residual errors persist in the SWOT SSHs, influencing the geoid gradients. Until the real SWOT data are available, estimating these residual errors remains unfeasible.

2.2 “True values” of geoid gradient components using numerical differentiations

In this section, we compute the residual north and east components of geoid gradients from the SWOT_model generated in Sect. 2.1 using numerical differentiations. The residual geoid gradient components calculated in this way are regarded as the “true (theoretical) values” and can be used to assess the accuracies of the geoid gradient components computed by the method presented in Sect. 3.2. First, we removed the Levitus dynamic ocean topography (Levitus et al. 1997) and EGM2008 geoid heights to degree and order 2160 (Pavlis et al. 2012), which is used as a reference field, from the SSHs in the SWOT_model to determine the residual geoid heights on a 1′ × 1′ grid. Then we applied numerical differentiations to obtain the residual geoid gradient components from the regularly gridded residual geoid heights as

$$ \begin{gathered} \tilde{\xi } = \frac{{\partial N_{{{\text{res}}}} }}{\partial y} \hfill \\ \tilde{\eta } = \frac{{\partial N_{{{\text{res}}}} }}{\partial x} \hfill \\ \end{gathered} $$
(2)

where \(N_{{{\text{res}}}}\) is the residual geoid heights on a 1′ × 1′ grid, \(x\) and \(y\) are the local rectangular coordinates pointing east and north, respectively, \({\tilde{\xi }}\) is the north derivative of \(N_{res}\) and is actually the residual geoid gradient in the north–south direction, and \(\tilde{\eta }\) is the residual geoid gradient in the east–west direction. The horizontal derivatives of the gridded geoid heights are computed by the routine “QD2DR” in the International Mathematical and Statistical Library (URL: https://www.imsl.com/).

3 III-conditioning posed by the dense SWOT geoid gradients and Tikhonov-L-curve regularization

3.1 SWOT along- and cross-track geoid gradients

The geoid heights were obtained from the simulated SWOT SSH observations by removing the dynamic ocean topography from Levitus et al. (1997). Besides, we removed the reference geoid heights of the EGM2008 to degree and order 2160 (Pavlis et al. 2012) to generate the residual geoid heights. Because the SWOT observations over the swaths are two-dimensional grid data, we obtained the SWOT residual geoid heights in both the along-track and cross-track directions as described by (Yu et al. 2021). To improve the accuracies of the north and east components of geoid gradients and gravity anomalies from SWOT, the use of the cross-track SSHs (hence residual geoid heights) is critical (Yu et al. 2021). The residual along- and cross-track geoid gradients are calculated by

$$ \varepsilon_{{\alpha ,{\text{res}}}} = \frac{{N_{{{\text{res}}2}} - N_{{{\text{res}}1}} }}{d} $$
(3)

where \(\varepsilon_{{\alpha ,{\text{res}}}}\) is the residual geoid gradient with azimuth \(\alpha\), \(N_{{{\text{res}}1}}\) and \(N_{{{\text{res}}2}}\) are two successive residual geoid heights in the along- or cross-track direction, and \(d\) is their distance. This paper only focuses the residual geoid gradients because they are the contributions of the SWOT mission relative to the EGM2008 reference field.

3.2 Ill-conditioned LSC estimation of north and east geoid gradient components and the L-curve solution

The residual geoid gradients from Eq. (3) are used to determine the north and east components on a regular grid by LSC as (Hwang and Parsons 1995)

$$ {\mathbf{s}} = \left[ {\begin{array}{*{20}c} \xi \\ \eta \\ \end{array} } \right] = {\mathbf{C}}_{{{\mathbf{sL}}}} \left( {{\mathbf{C}}_{{{\mathbf{LL}}}} + {\mathbf{D}}_{{\mathbf{L}}} } \right)^{ - 1} {\mathbf{L}} $$
(4)

where \({\mathbf{L}}\) contains the residual geoid gradients in an \(8^{\prime} \times 8^{\prime}\) data selection window, \({\mathbf{s}}\) contains the expected residual north \(\left( \xi \right)\) and east \(\left( \eta \right)\) components of geoid gradients at the grid point, which is the center of this data selection window, \({\mathbf{C}}_{{{\mathbf{sL}}}}\) is the cross-covariance matrix for the output \({\mathbf{s}}\) and input \({\mathbf{L}}\), \({\mathbf{C}}_{{{\mathbf{LL}}}}\) is the auto-covariance matrix for the input \({\mathbf{L}}\), and \({\mathbf{D}}_{{\mathbf{L}}}\) is the error variance matrix of \({\mathbf{L}}\).

In estimating the residual north and east geoid gradient components from the residual geoid gradients, the data selection window should not be excessively large. To ensure the quality of the estimations of the geoid gradient components, the window should contain enough contributing points. We tested different window sizes and determined that the \(8^{\prime} \times 8^{\prime}\) window is optimal. Using this size, we estimate the geoid gradient components for each grid point, with the point centered inside the window. The elements in \({\mathbf{C}}_{{{\mathbf{sL}}}}\) and \({\mathbf{C}}_{{{\mathbf{LL}}}}\) are constructed from the global covariance function of the earth’s anomalous potential in relation to the EGM2008 gravitational model by covariance propagation (Tscherning and Rapp 1974; Hwang and Parsons 1995). Based on the global covariance function, we determined the isotropic covariances of the longitudinal and transversal components of geoid gradient (Tscherning and Rapp 1974), which are then used to construct the covariances between any two geoid gradients (Hwang and Parsons 1995).

When real SWOT observations are used for gradient component estimation, \({\mathbf{D}}_{{\mathbf{L}}}\) should be constructed using a priori or a posteriori variances and covariances of the observed geoid gradients (Yu and Hwang 2022). Note that, the errors used to construct \({\mathbf{D}}_{{\mathbf{L}}}\) are the residual errors remaining in the SWOT SSHs after applying the range and geophysical corrections. In this study, \({\mathbf{D}}_{{\mathbf{L}}}\) is set as a zero matrix given our inability to accurately simulate these residual errors (see Sect. 2.1). Our experiment indicated that in the study area shown in Fig. 1, almost all of the covariance matrices of \({\mathbf{L}}\) \(\left( {{\mathbf{C}}_{{{\mathbf{LL}}}} } \right)\) in the \(8^{\prime} \times 8^{\prime}\) windows are ill-conditioned because their condition numbers are large. For example, the condition numbers of \({\mathbf{C}}_{{{\mathbf{LL}}}}\) range from 103 to 106 (see Sect. 4.1). A \({\mathbf{C}}_{{{\mathbf{LL}}}}\) with a large condition number indicates that at least one column vector in \({\mathbf{C}}_{{{\mathbf{LL}}}}\) is nearly or fully dependent on other column vectors, making \({\mathbf{C}}_{{{\mathbf{LL}}}}\) rank defect. An ill-conditioned \({\mathbf{C}}_{{{\mathbf{LL}}}}\) leads to an unstable solution of \({\mathbf{C}}_{{{\mathbf{LL}}}}^{ - 1} {\mathbf{L}}\). That is, the result is noisy and physically meaningless since the solution is sensitive to changes in the observations in \({\mathbf{L}}\).

Many techniques can be used to stabilize the ill-conditioned problem. In this paper, we use a well-known regularization algorithm, Tikhonov regularization (Tikhonov and Arsenin 1977), to accept only reasonable and stable solutions. The process is formulated as follows. Let

$$ {\mathbf{x}} = \left( {{\mathbf{C}}_{{{\mathbf{LL}}}} + {\mathbf{D}}_{{\mathbf{L}}} } \right)^{ - 1} {\mathbf{L}} $$
(5)

and

$$ {\mathbf{A}} = {\mathbf{C}}_{{{\mathbf{LL}}}} + {\mathbf{D}}_{{\mathbf{L}}} $$
(6)

we have the following observation equations

$$ {\mathbf{Ax}} = {\mathbf{L}} $$
(7)

When \({\mathbf{A}}\) is ill-conditioned, Eq. (7) is an ill-conditioned problem with an unstable solution using \({\mathbf{x}} = {\mathbf{A}}^{ - 1} {\mathbf{L}}\). Here we reformulate the solution for \({\mathbf{x}}\) in Eq. (7). First, we allow residuals for the observations in vector \({\mathbf{L}}\) in Eq. (7) as.

$${\mathbf{V}} = {\mathbf{Ax}} - {\mathbf{L}},{\text{weight matrix of }}{\mathbf{L}} = {\mathbf{I}}$$
(8)

where \({\mathbf{V}}\) is the vector containing residuals, and the weight matrix \({\mathbf{I}}\) is the identity matrix. Indeed, it is impossible to obtain error-free observations; the observations in \({\mathbf{L}}\) from the real SWOT data can be contaminated by observation errors contained in \({\mathbf{V}}\). Second, we wish to obtain a smooth solution for \({\mathbf{x}}\) by allowing \({\mathbf{x}}\) to vary by \({\mathbf{V}}_{{\mathbf{x}}}\), so that.

$${\mathbf{V}}_{{\mathbf{x}}} = {\mathbf{x}} - {\mathbf{x}}_{0} ,{\text{weight matrix}} = {\lambda }{\mathbf{I}}$$
(9)

where \({\mathbf{x}}_{0}\) is the vector of a priori values for \({\mathbf{x}}\), and \(\lambda\) is a positive number. In practice, we set \({\mathbf{x}}_{0} = \mathbf{0}\) (zero vector) because of the use of a high-degree reference field (see Sect. 3.1). The two sets of equations in Eqs. (8) and (9) contain redundant pseudo observations with respect to \({\mathbf{x}}\). The problem formulated in Eqs. (8) and (9) is equivalent to the random effects model in the least-squares adjustment literature (Schaffrin 2013). A least-squares solution for \({\mathbf{x}}\) can be achieved by minimizing the target function (using \({\mathbf{x}}_{0} = \mathbf{0}\)):

$$ \varphi \left( {\mathbf{x}} \right) = \left\| {\mathbf{V}} \right\|^{2}_{2} + \lambda \left\| {{\mathbf{V}}_{{\mathbf{x}}} } \right\|^{2}_{2} = \left\| {{\mathbf{Ax}} - {\mathbf{L}}} \right\|^{2}_{2} + \lambda \left\| {\mathbf{x}} \right\|^{2}_{2} $$
(10)

where \(\left\| \cdot \right\|_{2}\) is the 2-norm, and \(\lambda\) is a regularization parameter already defined in Eq. (9), which balances the square of the norm of the regularized solution for \({\mathbf{x}}\) versus that of the norm of the corresponding residual (Hansen and O’Leary 1993). Minimizing \(\varphi \left( {\mathbf{x}} \right)\) in Eq. (10) leads to the regularized solution of \({\mathbf{x}}\) in Eq. (7) as

$$ {\hat{\mathbf{x}}} = \left( {{\mathbf{A}}^{{\text{T}}} {\mathbf{A}} + \lambda {\mathbf{I}}} \right)^{ - 1} {\mathbf{A}}^{{\text{T}}} {\mathbf{L}} $$
(11)

Therefore, the residual north and east components of geoid gradients in Eq. (4) can be calculated by

$$ {\mathbf{s}} = {\mathbf{C}}_{{{\mathbf{sL}}}} {\hat{\mathbf{x}}} $$
(12)

In this paper, we use the L-curve method (Hansen and O’Leary 1993) to choose an appropriate regularization parameter \(\lambda\) in Eq. (11). In Eq. (10), the 2-norm of the solution \(\left\| {\mathbf{x}} \right\|_{2}\) and that of the residual \(\left\| {{\mathbf{Ax}} - {\mathbf{L}}} \right\|_{2}\) are balanced by the regularization parameter \(\lambda\) to obtain an optimal solution for \({\hat{\mathbf{x}}}\). A best balance can be achieved by the L-curve method as follows. Let the two 2-norms be defined as functions of

$$ \lambda \;{\text{as}}\;\alpha \left( \lambda \right) = \left\| {{\mathbf{Ax}} - {\mathbf{L}}} \right\|_2 $$
$$ \beta \left( \lambda \right) = \left\| {\mathbf{x}} \right\|_{2} $$
(13)

In an \(8^{\prime} \times 8^{\prime}\) data selection window, we can plot \(\alpha \left( \lambda \right)\) against \(\beta \left( \lambda \right)\) to show the relationship between \(\alpha \left( \lambda \right)\), \(\beta \left( \lambda \right)\), and \(\lambda\). In this way, one can directly see the trade-off between the best possible minimal values of \(\alpha \left( \lambda \right)\) and \(\beta \left( \lambda \right)\) and also immediately see whether one or both quantities are unreasonably large or small. The name, L-curve, expresses the fact that the curve formed by the pairs (\(\alpha \left( \lambda \right)\), \(\beta \left( \lambda \right)\)) follows the shape of L. There is a distinct L-shape corner in the L-curve. This corner separates the horizontal and vertical parts of the L-curve and balances \(\alpha \left( \lambda \right)\) and \(\beta \left( \lambda \right)\). Therefore, the regularization parameter \(\lambda\) at the L-curve corner achieves a best balance between the sizes of the two 2-norms in Eq. (13). This corner has the maximum curvature on the L-curve.

Because the plot of \(\left( {\alpha \left( \lambda \right),\;\beta \left( \lambda \right)} \right)\) in a log–log scale can emphasize the corner of the L-curve (Hansen and O’Leary 1993), we define two new functions of \(\lambda\) as

$$ \hat{\alpha } = \log \alpha $$
$$ \hat{\beta } = \log \beta $$
(14)

The curvature of the L-curve \(( {\hat{\alpha },\;\hat{\beta }} )\) is given by (Hansen and O’Leary 1993)

$$ \kappa \left( \lambda \right) = \frac{{\hat{\alpha }^{\prime} \hat{\beta }^{^{\prime\prime}} - \hat{\alpha }^{^{\prime\prime}} \hat{\beta }^{\prime} }}{{\left( {\left( {\hat{\alpha }^{\prime} } \right)^{2} + \left( {\hat{\beta }^{\prime} } \right)^{2} } \right)^{3/2} }} $$
(15)

where \(\kappa\) is a function of the regularization parameter \(\lambda\), \(\hat{\alpha }^{\prime}\), \(\hat{\alpha }^{^{\prime\prime}}\), \(\hat{\beta }^{\prime}\), and \(\hat{\beta }^{^{\prime\prime}}\) denote the first and second derivatives of \(\hat{\alpha }\) and \(\hat{\beta }\) with respect to \(\lambda\), respectively. The best regularization parameter \(\lambda\) corresponds to the maximum curvature \(\kappa \left( \lambda \right)\). The \(\lambda\) value in Eq. (11) is a scaling factor for the weight matrix of the a priori vector \({\mathbf{x}}_{0}\). In addition, the covariance matrix of \({\hat{\mathbf{x}}}\) from Eq. (11) is

$$ {{\varvec{\Sigma}}}_{{{\hat{\mathbf{x}}}}} = \left( {{\mathbf{A}}^{{\text{T}}} {\mathbf{A}} + \lambda {\mathbf{I}}} \right)^{ - 1} $$
(16)

Thus, the variances of the two geoid gradients components (after considering \({\mathbf{C}}_{\mathbf{s}\mathbf{L}}\)) will decrease with increasing \(\lambda \). However, \(\lambda \) cannot be arbitrarily large because it may result in over-smoothed gradient components.

4 Results and discussion

4.1 Effectiveness of the L-curve method

To give an example of the L-curve method, in Fig. 2, we show the variations of the 2-norms with respect to \(\lambda\) in the log–log scale at a grid point located at 113.88°E, 17.80°N. The variations follow an L-shaped curve. The curve starts (from the left) with stable \(\log \left\| {{\mathbf{Ax}} - {\mathbf{L}}} \right\|_{2}\) values and rapidly decreasing \(\log \left\| {\mathbf{x}} \right\|_{2}\) values. After reaching the corner of the curve, the changes in \(\log \left\| {\mathbf{x}} \right\|_{2}\) are gentle as the \(\log \left\| {{\mathbf{Ax}} - {\mathbf{L}}} \right\|_{2}\) values increase toward the right. The \(\lambda\) values in Fig. 2 range from \(3.24 \times 10^{ - 5}\) to 31.35 (see the crosses for selected \(\lambda\) values). The corner occurs when \(\lambda \approx 0.008\), which is marked by the red dot. This corner corresponds to the maximum curvature \(\kappa\) in Eq. (15) and is the vertex of the alphabet “L”. The regularization parameter at the corner is optimal because it simultaneously achieves a smallest possible residual norm \(\log \left\| {{\mathbf{Ax}} - {\mathbf{L}}} \right\|_{2}\) and solution norm \(\log \left\| {\mathbf{x}} \right\|_{2}\), as shown in Fig. 2.

Fig. 2
figure 2

A plot of the L-curve \(\left( {\log \left\| {{\mathbf{Ax}} - {\mathbf{L}}} \right\|_{2} ,\;\;\log \left\| {\mathbf{x}} \right\|_{2} } \right)\) at a grid point (113.88°E, 17.80°N) as a function of the regularization parameter \(\lambda\)

In some cases, only a finite set of points on the L-curve (\(\hat{\alpha }\), \(\hat{\beta }\)) are known; the L-curve becomes non-differentiable. To address this, one viable solution is to fit a cubic spline curve to these discrete points of the L-curve. This approach is advantageous because such a cubic spline is twice differentiable and preserves the local shape of the curve (Hansen and O’Leary 1993). However, challenges arise when the L-curve is erratic, exhibiting multiple corners identified by the maximum curvature (as per Eq. (15)). This can lead to uncertainties in selecting the appropriate regularization parameter (Kusche and Klees 2002). When faced with the challenge of identifying the appropriate regularization parameters for the L-curve, one could turn to the minimum distance function proposed by Belge et al. (2002) for an approximation. Moreover, there are alternative methods to consider: the discrepancy principle (Morozov 1966) and the generalized cross-validation (Golub et al. 1979; Koch and Kusche 2002) serve as other potential approaches for selecting suitable regularization parameters.

In order to investigate the effectiveness of the Tikhonov-L-curve regularization described in Sect. 3.2, we use the “true (theoretical) values” of the two geoid gradient components \(( {\tilde{\xi },\;\tilde{\eta }} )\) derived from the SWOT_model using Eq. (2) to evaluate the quality of the two components estimated by this regularization method (Eqs. (11) and (12)). For comparison, we also compute the residual geoid gradient components using the LSC formula (Eq. (4)), and the inversion in this formula is implemented by the function “inv” available in MATLAB. Figure 3 shows the results from 113°E to 122°E with a fixed latitude of 20°N. Both the residual north \(\left( \xi \right)\) and east \(\left( \eta \right)\) components of geoid gradients from the case of inversion (red lines) range from −100 microrad to 300 microrad, which are quite different from their “true values” (cyan lines). As shown in Fig. 3 (a) and (b), the use of “inv” for the ill-conditioned \({\mathbf{C}}_{{{\mathbf{LL}}}} + {\mathbf{D}}_{{\mathbf{L}}}\) matrices in Eq. (4) results in unstable solutions of \(\xi\) and \(\eta\) (red lines). For instance, compared to the “true values”, there are several anomalously large residual \(\xi\) and \(\eta\) values (absolute values) from “inv”. In contrast, the residual \(\xi\) and \(\eta\) values estimated by the Tikhonov-L-curve regularization (blue lines) are in good agreement with their “true values”. The result is achieved by the regularization parameter that well stabilizes the inverse of \({\mathbf{C}}_{{{\mathbf{LL}}}} + {\mathbf{D}}_{{\mathbf{L}}}\).

Fig. 3
figure 3

The residual a north \(\left( \xi \right)\) and b east \(\left( \eta \right)\) geoid gradient components from 113°E to 122°E with a fixed latitude of 20°N

Figure 4 shows the differences between the estimated residual north \(\left( \xi \right)\) and east \(\left( \eta \right)\) geoid gradient components and their “true values” at 20°N. The differences of \(\xi\) and \(\eta\) from the case of inversion (“inv”) range from −100 microrad to 300 microrad, which are much larger than those from the L-curve method (range from −5 microrad to 5 microrad). The root-mean-squared differences (RMSDs) between the determined residual north \(\left( \xi \right)\) and east \(\left( \eta \right)\) geoid gradient components and their “true values” at 20°N are presented in Table 1. The RMSDs of \(\xi\) and \(\eta\) determined by the case of inversion are larger than 14 microrad, while those in the case of L-curve are smaller than 1 microrad.

Fig. 4
figure 4

The differences between the determined residual a north \(\left( \xi \right)\) and b east \(\left( \eta \right)\) geoid gradient components and their “true values” at 20°N

Table 1 RMSDs between the determined residual north \(\left( \xi \right)\) and east \(\left( \eta \right)\) geoid gradient components and their “true values” at the latitude of 20°N (unit: microrad)

In the study area, the “true values” of the residual north \(\left( \xi \right)\) and east \(\left( \eta \right)\) geoid gradient components determined from the SWOT_model using Eq. (2) are shown in Fig. 5 (a) and (b), respectively. The differences between the north and east geoid gradient components computed by inversion and their “true values” are shown in Fig. 5 (c) and (d), respectively. Evident stripe structures are present in these two figures. The stripe pattern is consistent with the pattern of the condition numbers of \({\mathbf{C}}_{{{\mathbf{LL}}}} + {\mathbf{D}}_{{\mathbf{L}}}\) in Eq. (4), as shown in Fig. 6 (a). Moreover, the condition numbers are correlated with the distribution of SWOT observations (Fig. 1 (c)). The condition numbers are relatively small in the area with sparse SWOT observations, where the estimated geoid gradient components are relatively stable. Therefore, in Fig. 5 (c) and (d), the differences with sparse SWOT data are relatively small. While in the area with large condition numbers, some differences are very large. In the study area, the RMSDs of the residual \(\xi\) and \(\eta\) values determined by inversion are larger than 100 microrad, as shown in Table 2.

Fig. 5
figure 5

The “true values” of the residual a north \(\left( \xi \right)\) and b east \(\left( \eta \right)\) geoid gradient components, the differences between Fig. a and Fig. b and the residual north (left column) and east (right column) geoid gradient components determined by c, d inversion, e, f L-curve, and g, h considering error variance of geoid gradients

Fig. 6
figure 6

The distributions of a the condition numbers of \({\mathbf{A}}\) in Eq. (6), and b the regularization parameters

Table 2 RMSDs between the determined residual north \(\left( \xi \right)\) and east \(\left( \eta \right)\) geoid gradient components and their “true values” in the study area (unit: microrad)

The Tikhonov-L-curve regularization significantly improves the accuracies of the residual north and east geoid gradient components. The differences between the two components estimated by the L-curve method and their “true values” are shown in Fig. 5 (e) and (f). These differences are much smaller than those in Fig. 5 (c) and (d). Table 2 shows that the RMSDs for the north \(\left( \xi \right)\) and east \(\left( \eta \right)\) components estimated by L-curve are 0.68 and 1.03 microrad, respectively. Compared with the accuracies from the case of inversion, the L-curve method can effectively improve the accuracies of the geoid gradient components. Because there are no SWOT observations in the diamond-like gaps (Fig. 1 (c)), the differences between the geoid gradient components determined by L-curve and their corresponding “true values” (Fig. 5 (e) and (f)) in these gaps are notably pronounced. Particularly, the quality of the east geoid gradient components (Fig. 5 (f)) is much lower in these gaps compared to other areas. Therefore, in Table 2, we present the RMSDs by ignoring the values in the diamond-like gaps. The accuracy of the east \(\left( \eta \right)\) geoid gradient components is improved from 1.03 microrad to 0.75 microrad, while the improvement of the accuracy of the north \(\left( \xi \right)\) geoid gradient components is in 0.01-microrad magnitude. The RMSDs of north \(\left( \xi \right)\) components are smaller than those of east \(\left( \eta \right)\) components in both Tables 1 and 2. One reason is that the high orbit inclination of SWOT (77.6°) results in more sensitive north geoid gradient components (Yu et al. 2021).

In this study area, the condition numbers of \({\mathbf{A}}\) in Eq. (6) are shown in Fig. 6 (a). This result shows that \({\mathbf{A}}\) is ill-conditioned almost in the whole study area. In Fig. 6 (a), there are obvious stripes with low condition numbers, where the SWOT data are sparse (Fig. 1 (c)). It is the sparse data that leads to the low condition numbers because the linear dependence of the columns of \({\mathbf{A}}\) is weak. In other words, the condition numbers are positively correlated with the density of SWOT data. The distribution of the determined regularization parameters is shown in Fig. 6 (b). The stripes in Fig. 6 (b) correspond to those in Fig. 6 (a). The regularization parameters tend to increase with the roughness of the fields of the geoid gradient components (Fig. 5 (a) and (b)).

The above results show that the Tikhonov-L-curve regularization can stabilize the solutions of LSC when ill-conditioned matrices appear. Note that \({\mathbf{A}}\) in Eq. (6) is ill-conditioned because of the use of zero matrix for \({\mathbf{D}}_{{\mathbf{L}}}\). In the study area, the dense SWOT data lead to nearly ill-conditioned auto-covariance matrix of geoid gradients \(\left( {{\mathbf{C}}_{{{\mathbf{LL}}}} } \right)\). To address this ill-conditioning, Yu et al. (2021) obtained \({\mathbf{D}}_{{\mathbf{L}}}\) by error propagation from the SWOT observation errors. While their estimated geoid gradient components result in gravity anomalies consistent with shipborne observations at mgal levels (Yu et al. 2021), the accuracy of the estimated geoid gradient components has not been assessed.

In this study, we assign the precision of the SWOT SSH observations to 1.4 cm (Peral and Esteban-Fernandez 2018) and then compute the error variance matrix \({\mathbf{D}}_{{\mathbf{L}}}\) based on the error propagation theory. In this case, \({\mathbf{A}}\) in Eq. (6) is not ill-conditioned because the error variance \({\mathbf{D}}_{{\mathbf{L}}}\) in LSC works as a regularization parameter (Sadiq et al. 2010). The differences between the north and east geoid gradient components estimated in this way and their “true values” are shown in Fig. 5 (g) and (h). These differences are much larger than those of L-curve (Fig. 5 (e) and (f)). Moreover, the patterns presented in Fig. 5 (g) and (h) are consistent with the signal distributions of the “true values” of the two geoid gradient components (Fig. 5 (a) and (b)). The RMSDs for the north \(\left( \xi \right)\) and east \(\left( \eta \right)\) components estimated using error variance are 0.82 and 1.02 microrad, respectively, when ignoring the values in the diamond-like gaps (Table 2). The accuracy of the two components estimated using error variance is inferior to that of L-curve (0.67 and 0.75 microrad). The error variance \({\mathbf{D}}_{{\mathbf{L}}}\) can mitigate the ill-conditioning of \({\mathbf{A}}\), but cannot obtain the optimal estimation of the geoid gradient components. One explanation could be that error variance in LSC normally balances the signal and noise variance and works as noise suppresser, and the \({\mathbf{D}}_{{\mathbf{L}}}\) computed from 1.4-cm precision of SWOT SSHs may be too large, resulting in overly smooth estimations of geoid gradient components.

Indeed, the previous experiment, where we assigned 1.4-cm errors to SWOT SSHs, lacks rigor. First, the simulated SWOT SSH data are error-free, which means the error variance \({\mathbf{D}}_{{\mathbf{L}}}\), as computed by the error propagation theory, will compromise the accuracy of the geoid gradient components estimated using LSC. Second, SWOT observation errors are not consistent. For instance, the errors in coastal areas are typically larger than those in deep ocean regions. Each SWOT SSH observation error should be individually computed based on its specific context, leading to varying errors for different observations.

When the real SWOT data are available, \({\mathbf{A}}\) in Eq. (6) may still be ill-conditioned for three reasons. First, the expected precision of the SWOT SSH observations is exceptionally high (1.4 cm; Peral and Esteban-Fernandez 2018). As a result, errors in the SWOT SSHs can be significantly minimized after averaging multiple-cycle data. Therefore, the diagonal elements in \({\mathbf{D}}_{{\mathbf{L}}}\) generated from the real SWOT observation errors are likely to be too small, resulting in ill-conditioned \({\mathbf{A}}\). Second, this study only used one-cycle SWOT SSH data to explore the ill-conditioning of \({\mathbf{A}}\). When the real multiple-cycle SWOT data are available, the data’s density of SWOT will rise. This is because the SWOT satellite does not maintain an exact repeat orbit, potentially exacerbating the ill-conditioning of \({\mathbf{A}}\). Last, the chosen study region resides in the lower latitudes; the SWOT data density in higher latitudes will increase, further intensifying the ill-conditioning of \({\mathbf{A}}\). In such scenarios, it becomes improbable to choose appropriate errors from the real SWOT SSH data or error variance of geoid gradients \(\left( {{\mathbf{D}}_{{\mathbf{L}}} } \right)\) to act as regularization parameters and alleviate the ill-conditioning of \({\mathbf{A}}\) and then subsequently determine the optimal north and east components. Therefore, the Tikhonov-L-curve regularization is essential for choosing appropriate regularization parameters and ensures the stabilization of LSC solutions in the presence of ill-conditioned matrices throughout the entire study area.

4.2 Optimizing geoid gradient components by removing adjacent data

We found that the spatially closer the two residual geoid gradients, the stronger the linear dependence of the corresponding columns of \({\mathbf{A}}\). Therefore, it is possible to mitigate the ill-conditioning of \({\mathbf{A}}\) by removing geoid gradients that are too close to each other (called adjacent gradients below). As shown below, this is a pre-processor that further improves the accuracy of estimated geoid gradient components by the L-curve method. The selection of the distance threshold should improve the conditioning of \({\mathbf{A}}\), while ensuring that sufficient residual geoid gradients remain within the data selection window. Based on testing, a threshold of 0.755 km yields the optimal estimation of the north and east geoid gradient components. That is, the acceptable minimum distance between two residual geoid gradients is 0.755 km. This means that when computing the residual geoid gradient components at one grid point using the \(8^{\prime} \times 8^{\prime}\) data selection window (see Sect. 3.2), a new residual geoid gradient is rejected when its distance to the data already in the window is less than 0.755 km.

Figure 7 (a) shows the numbers of the geoid gradients in the \(8^{\prime} \times 8^{\prime}\) data selection windows, and Fig. 7 (b) shows the numbers of the remaining geoid gradients after removing adjacent data. Although we only simulated one-cycle SWOT data in this study, there are sufficient data in each data selection window after removing adjacent data to estimate the north and east geoid gradient components. Figure 7 (b) shows that the numbers in the diamond-like gaps of the SWOT data (Fig. 1 (c)) are smaller than 40, and in other areas, the numbers are larger than 80. In the case of removing adjacent data, the condition numbers are shown in Fig. 7 (c), which are much smaller than those in Fig. 6 (a). This result indicates that the ill-conditioning of \({\mathbf{A}}\) is mitigated by removing adjacent data. The regularization parameters in this case are shown in Fig. 7 (d). The distribution pattern in Fig. 7 (d) is correlated with the distribution of geoid gradient components in Fig. 5 (a) and (b). In general, regularization parameters are large in areas where the absolute values of geoid gradient components are large.

Fig. 7
figure 7

a The numbers of the geoid gradients, and b the numbers of the remaining geoid gradients after removing adjacent data in the \(8^{\prime} \times 8^{\prime}\) data selection windows, c the condition numbers of \(\mathbf{A}\) in Eq. (6), and d the regularization parameters in the case of removing adjacent data

We also use the profiles of the residual north \(\left( \xi \right)\) and east \(\left( \eta \right)\) geoid gradient components at 20°N to investigate the effectiveness of removing adjacent data. The differences between the geoid gradient components in the case of removing adjacent data and their “true values” are shown in Fig. 8 (red lines), and in most areas, the amplitudes of these differences are much smaller than those in the case of not removing adjacent data (blue lines) which are also the blue lines in Fig. 4. Figure 8 shows that the differences in the estimated geoid gradient components for the two cases (represented by red and blue lines) are notably larger in the eastern part than in other areas. One possible reason is the complex seafloor topography in the eastern section of the study area, which varies between −3000 m and −1000 m (Fig. 1 (a)). This leads to high-frequency signals in the north and east geoid gradient components, making them challenging to compute accurately. This same phenomenon can also be observed in Fig. 4 and Fig. 5 (c)-(h).

Fig. 8
figure 8

The differences between the determined residual a north \(\left( \xi \right)\) and b east \(\left( \eta \right)\) geoid gradient components determined by L-curve and removing adjacent data and their “true values” at the latitude of 20°N

While removing adjacent geoid gradients decreases the data quantity, it enhances the quality of the estimated geoid gradient components. The RMSDs for the case where adjacent data are removed at 20°N are detailed in Table 1. They are lower than the RMSDs observed using only the L-curve method. Both across the entire study area and in regions excluding SWOT gaps, the RMSDs from the data-removal case are smaller than those derived solely from the L-curve, as shown in Table 2. Thus, removing adjacent data serves as a preprocessing step that further refines the effectiveness of the L-curve method. A similar pre-processor is “blockmean” of the Generic Mapping Tool (GMT; see https://docs.generic-mapping-tools.org/dev/blockmean.html), which computes the mean value of the raw data within a specified grid region to avoid aliasing caused by short wavelengths. If “blockmean” is to be used as a pre-processor for SWOT-derived gradients, one must create four files. Two files contain gradients in the along-swath and cross-swath directions from ascending tracks of SWOT (Yu et al. 2021), and another two files contain such gradients from descending tracks. Each of these four files is fed to “blockmean” to create “thinned” gradients to mitigate matrix ill-conditioning. This subject is not pursued in this paper, but we believe it is worth studying for users interested in gravity recovery from SWOT altimetry. Table 2 shows that the L-curve method combined with removing close data points can achieve a sub-microrad accuracy for geoid gradient, which translates to a sub-mgal accuracy for gravity anomaly.

5 Conclusion

The dense SWOT data lead to linearly dependent column vectors in the covariance matrices of geoid gradients, which result in the ill-conditioned problem in the LSC estimation of geoid gradient components. The solutions of geoid gradient components are unstable and unrealistic due to the ill-conditioned problem. This study used the Tikhonov-L-curve regularization to overcome the ill-conditioned problem. The L-curve method was used to determine the best regularization parameter. The plot of the 2-norms of the solution and the residual in the log–log scale follows a L-shaped curve, and the two 2-norms are simultaneously minimized at the corner of the L-curve, which corresponds to the optimal regularization parameter. The L-curve corner was located by the maximum curvature of the L-curve. The “true values” of the north and east geoid gradient components from the SWOT_model are used to assess the accuracies of L-curve estimated geoid gradient components, suggesting that the L-curve regularization stabilizes the solutions of geoid gradient components which achieve better than 1-microrad accuracy.

Because the ill-conditioned covariance matrix is caused by the dense geoid gradients, removing adjacent data can further improve the accuracies of the geoid gradient components. When using the real SWOT observations, we can remove the adjacent geoid gradients according to their error variances. That is, the geoid gradient with larger error variance should be removed. The 0.755-km threshold chosen in this study may not be the best. The optimal threshold could be chosen according to the real SWOT data. Also, we could resize the data selection window when determining geoid gradient components by LSC to make the best compromise between the computing time and computational accuracy. Although based on simulated SWOT data, our results are crucial for upcoming studies planning to utilize real SWOT data, expected by the end of 2023, this research highlights the need to avoid false seafloor signatures resulting from the data compactness of SWOT.