1 Introduction

Monthly Earth gravity fields based on the observations of the Gravity Recovery And Climate Experiment [GRACE, Tapley et al. (2004)] satellite mission are an important source of information on temporal mass variations in the system Earth (Wouters et al. 2014). Monthly gravity fields are not only provided by the official GRACE Science Data System (SDS) processing centers JPL, CSR and GFZ, but also by an increasing number of independent analysis centers (ACs) worldwide.

The standard approach is to expand the gravity field in spherical harmonics and provide the weight coefficients of this expansion (L2-products) that may be transformed to global grids (L3-products) for easier use. Depending on the field of application, the grids are complemented by monthly mean values of the short-term atmosphere and ocean mass variations [so-called GAX-products, Flechtner and Dobslaw (2013)] to restore the nontidal signal content. Moreover, the L3-products are usually pre-filtered to reduce noise.

Web services (e.g., PO.DAAC,Footnote 1 ISDC,Footnote 2 or TellusFootnote 3 for L3-products) are available to download the GRACE SDS products. Time-series of monthly gravity fields, also from the alternative ACs, are collected and made available via the International Centre for Global Earth Models (ICGEM).Footnote 4

Based on these sources of information, the user has to decide which time-series of monthly gravity fields to use. For most users, the peculiarities of the different processing approaches of the individual GRACE ACs remain unclear. Therefore, there is urgent need for a unification of gravity models like it is done for the products of other space geodetic techniques by, e.g., the International GNSS Service [IGS, Dow et al. (2009)], the International Laser Ranging Service [ILRS, Pearlman et al. (2002)], the International VLBI Service [IVS, Nothnagel et al. (2017)] or the International DORIS Service [IDS, Tavernier et al. (2005)].

Noise in the monthly gravity fields is dominated by either measurement system errors (observation noise), or temporal aliasing errors caused by imperfections in the background models [see, e.g., Flechtner et al. (2016); Seo et al. (2008)]. Due to the one-dimensional observation geometry, the temporal aliasing error is manifest as north–south striping. The noise characteristics of the various solutions differ because of the different parameterizations used by the individual ACs to compensate for both error sources. In the following, we refer to the solution noise not explained by the measurement system error as analysis noise.

The role of the analysis noise is illustrated by Fig. 1, which shows the discrepancy between calibrated errors of ITSG monthly GRACE gravity fields (the ITSG formal errors are calibrated by construction due to the use of a stochastic noise model) and the baseline accuracy determined in pre-mission simulations (Jekeli and Rapp 1980; Kim 2000). The baseline accuracy based on observation error models turned out to be too optimistic by about one order of magnitude. Consequently, the noncalibrated formal errors of the individual monthly solutions are too optimistic, as well (Fig. 1). One may expect that the combination of monthly gravity fields from different ACs based on different background models and using different parameterizations reduces the analysis noise, although all gravity fields are derived from the same observations and no new information is introduced in the combination. This expectation further motivates the combination service for monthly gravity fields.

Fig. 1
figure 1

GRACE baseline accuracy and formal (AIUB, GRGS and GFZ), resp., calibrated (ITSG) errors of monthly gravity fields. Shown are degree amplitudes of formal errors that were averaged over all monthly solutions per AC (2004–2010 in case of AIUB, GRGS and ITSG, 2006–2007 in case of GFZ)

To fully take into account correlations between the individual gravity field parameters, but also between gravity field, orbit, instrument, stochastic or other model parameters, gravity fields have to be combined on the normal equation (NEQ) level. Up to now, a combination of gravity fields on the NEQ level was not possible because the individual ACs normally do not provide NEQs. This was only changed in the frame of the EGSIEM project (see Sect. 2).

It is a well-known technique when combining NEQs to define relative weights iteratively by variance component estimation [VCE, Koch and Kusche (2002)]. This approach is applied, e.g., by the IVS (Böckmann et al. 2010). The technique is recapitulated in “Appendix A.1.” But applying VCE to NEQs provided by different ACs is hampered by a basic problem. Proper stochastic models of noise in the original data are not available. The individual ACs apply different noise modeling strategies and rely on different parameters to absorb background model errors, but the inversion of the individual normal matrices does not yield realistic covariance matrices of errors of model parameters. The error estimates of the unknown parameters differ considerably between the ACs, and consequently classical VCE converges to nonoptimal results. This problem is not only encountered in the combination of GRACE gravity fields. Lerch (1989), in case of the combination of data from different satellite missions, proposed a procedure to derive relative weights based on the analysis noise. Seitz et al. (2012), in the computation of the global reference frame DTRF2008, based on terrestrial and satellite data, completely replaced VCE derived by empirical weights.

We propose an alternative weighting scheme which is based on the noise levels of the individual solutions (Sect. 5). Relative weights are derived by VCE on the solution level (Jean et al. 2018) and then consequently applied to the NEQs. The main difference between the classical VCE on the NEQ level and the alternative VCE on the solution level is that in the latter case, while correlations between the unknown parameters are lost, the error assessment does not depend on the different error modeling and absorption strategies of the individual ACs, but on the differences to a (weighted) mean of the individual solutions.

Prior to weighting, the impact of the individual NEQs on the combination has to be balanced. This is achieved by empirical factors derived from the study of pairwise combinations and has to be done independently from the VCE, since the VCE is converging robustly to the same results, independent of the a priori weights used.

In a nutshell, to generate a statistically optimal combination of monthly gravity fields provided by different ACs:

  • We exploit the normal matrices provided together with solutions themselves.

  • Ideally, the inverse of the normal matrix is the full error variance–covariance matrix of model parameters. In reality, ACs do not guarantee a proper scaling of normal matrices. Therefore, we estimate the weight factors to be used in combining the individual NEQs.

  • The most obvious technique to estimate optimal weights is VCE. Unfortunately, a direct application of this technique, as it is presented in Appendix A.1, leads to suboptimal results, because the inversion of the NEQs provided by the ACs does not yield realistic covariance matrices of errors in model parameters.

  • In order to solve this problem, we apply VCE on the solution level (described in Sect. 5.3). In this case, the errors in model parameters are assumed to be uncorrelated and the error variances of all the parameters are assumed to be the same. Unfortunately, the weights estimates cannot be applied to the NEQs just like that, because the inversion of any of the NEQs results in an error covariance matrix that suffers, among others, from an unknown scaling factor.

  • We therefore compute and apply additional empirical factors, which so to say equalize the available normal matrices. Those factors are estimated such that all the individual solutions contribute equally to the combined solution, independently of their quality. (This procedure is described in Sect. 5.2.)

The article is structured as follows: We first introduce the EGSIEM combination service for monthly gravity fields in Sect. 2 and derive measures for quality control in Sect. 3. In Sect. 4, the individual contributions of the associated analysis centers are characterized, and in Sect. 5, the combination on the normal equation level is discussed. In Sect. 6 finally, the combined gravity fields are validated by comparison with the official GRACE SDS time-series and with the individual contributions. The paper concludes with Sect. 7.

2 The EGSIEM combination service for monthly gravity fields

In the frame of the Horizon 2020 project European Gravity Service for Improved Emergency Management [EGSIEM, Jäggi et al. (2019)], the prototype of a scientific combination service for time-variable gravity fields was established. The goal of this service is to provide consistent, reliable and validated monthly gravity fields, which are combined on the NEQ level from standardized NEQs of all associated ACs. EGSIEM ACs contributing to the combination are the Astronomical Institute of the University of Bern (AIUB), the Helmholtz Centre Potsdam, German Research Centre for Geosciences (GFZ), the Groupe de Recherche de Géodésie Spatiale (GRGS) and the Institute of Geodesy of the Technical University of Graz (IfG, former Institute for Theoretical and Satellite Geodesy, ITSG).

To guarantee consistency between the individual contributions, EGSIEM standards were defined for reference frame, Earth rotation and antenna reference points on the GRACE satellites, as well as for the relativistic effects and for third-body perturbations. The EGSIEM ACs were free to use their specific processing approaches and the background force models of their choice for the static gravity field of the Earth and for tidal mass variations. Neither were the de-aliasing products for short-term atmosphere and ocean mass variations [AOD, Flechtner and Dobslaw (2013)] harmonized, because background models and de-aliasing products are not free of errors. In the combination, errors in the individual models may be reduced; therefore, a wide variety of models is beneficial.

EGSIEM-combined gravity fields are provided in spherical harmonic representation (L2-products) and as global grids (L3-products). To generate L3-products, degree 1 terms derived from Satellite Laser Ranging (SLR) are added to transform between a center of mass and a center of figure frame. Then, the monthly mean of AOD is restored to achieve full (nontidal) signal content. The AOD correction is combined from the individual monthly means provided by the ACs using the same relative weights as in the combination of the gravity fields (Jäggi et al. 2019).

For hydrological applications, monthly means of atmosphere [GAA, Flechtner and Dobslaw (2013)], ocean [GAB, Flechtner and Dobslaw (2013)] and the global isostatic adjustment (GIA) model LM17.3Footnote 5 are subtracted. For oceanographic applications, monthly means of the atmosphere, the terrestrial water storage modeled by the WaterGAP Global Hydrological Model [WGHM; Döll et al. (2003)] and GIA (evaluated at the epochs of the monthly gravity fields) are subtracted.

A variant of the DDK-filter (Kusche 2007) making use of the full, monthly covariance information is applied to filter the different versions (still in spherical harmonics representation). Therefore, the calibrated error covariances of the ITSG gravity fields were used and the characteristics of the expected hydrological or oceanographic signals were taken into account. Finally, the spherical harmonic coefficients (SHC) were transformed to global grids with \(1^\circ \)-resolution.

All EGSIEM products are representative of the time span within a given month defined by the start and end day. Short GRACE data gaps are ignored when computing the monthly means of the AOD products, assuming that users in general do not thin out their observation database according to the availability of the GRACE observation data either.

EGSIEM-combined L2 and L3 products can be downloaded from the “Data” section of the EGSIEM homepage.Footnote 6 Furthermore, mass variations derived from individual time-series as well as from the combined gravity fields can be visualized by the EGSIEM plotter.Footnote 7

The prototype combination service continues after the completion of the EGSIEM project in the frame of the Combination Service for Time-variable Gravity field models (COST-G) as a product center of the International Gravity Field Service (IGFS) under the umbrella of the International Association of Geodesy (IAG).

3 Noise assessment

We need to assess the noise levels of the individual and the combined gravity fields for quality control, and we have to independently define relative weights to consider the different noise levels in the models to be combined. Prior to combination, the monthly gravity fields provided by the individual ACs undergo strict quality control based on their signal and noise content in the spectral and spatial domains. While noise levels may vary between ACs and are taken into account in the combination by noise-based relative weights, the signal content is expected to be the same in all gravity field time-series accepted for combination. Gravity field solutions with attenuated temporal variation due to intended or accidental regularization are excluded from the combination to avoid damaging the signal content.

The signal content is evaluated by the comparison of the amplitudes of seasonal mass variations in a large number of river basins and by the study of mass trends in polar regions. The tests of the signal content are described in detail by Jean et al. (2018).

We here focus on the noise content to assess the quality of the individual and combined gravity fields and to derive relative weights for combination. To separate signal from noise, we have two possibilities:

  • Comparison with the monthly mean of different gravity fields, assuming that all gravity fields contain the same signal, but are different in noise. The noise ideally is greatly reduced in the averaging process, while the signal content remains unchanged.

  • Comparison with a signal model. As our knowledge of mass transport in the system Earth is limited, we refer to a deterministic model of mass variation containing bias, trend, annual and semiannual variations fitted by a least-squares process to the monthly mean values of the individual gravity fields. The residuals with respect to this model are called anomalies.

The transformation between spherical harmonics representations and grids is linear, and therefore it does not matter whether we compute differences to the mean or anomalies per coefficient in the spherical harmonics domain or for each grid cell of global grids in the spatial domain. Moreover, differences to the mean or anomalies may be evaluated either in geoid heights or in equivalent water heights (EWH). The results will be different in this case, because the scaling factors involved are depending on the degree (Wahr et al. 1998) and consequently the noise in the high degrees is amplified when using EWH.

Fig. 2
figure 2

RMS of degree amplitudes of anomalies (solid) and differences to the mean (dashed) for the EGSIEM-AIUB time-series, expressed in geoid heights

Figure 2 shows the root-mean-square (RMS) in geoid heights over all monthly gravity fields 2004–2010 of the degree amplitudes of differences to the mean and anomalies of the EGSIEM-AIUB contribution. The differences to the mean values are in general smaller than the anomalies, because our signal model is incomplete and does not represent nonsecular, nonseasonal variations, and because the differences to the mean do not reflect the errors that are shared by all the models.

Due to the polar orbits of the GRACE satellites and due to the related sparse observation sampling in cross-track direction, high spherical harmonic orders are especially noisy and often removed by filtering (Kusche 2007). We therefore also show degree amplitudes computed from orders \(0,\ldots ,29\) only to focus on the geophysically most meaningful part of the spectrum. The spikes visible at degrees 15, 31, 46, 61 and around 76 are related to orbit resonances. The GRACE satellites circle the Earth approximately 15.3 times per day. Spherical harmonic orders at integer multiples of 15.3 are called resonant orders and suffer from aliasing by long-periodic signal of geophysical origin (Seo et al. 2008). Whenever the degree amplitudes include a new resonant order, this causes a jump in the noise level.

Figures 3 and 4 show the RMS per grid cell over all monthly anomalies and differences to the mean in geoid heights, respectively. In the spatial representation, it is obvious that the remaining signal in the anomalies is not distributed evenly over the globe, but is concentrated over the continents in regions with strong mass variability, while anomalies over the oceans are small. No corresponding phenomenon can be detected for the differences to the mean values, which only show a slight latitude dependence due to the denser observation coverage at higher latitudes.

Fig. 3
figure 3

RMS of anomalies of the EGSIEM-AIUB time-series in the spatial domain, expressed in geoid heights

Fig. 4
figure 4

RMS of differences of the EGSIEM-AIUB time-series to the mean in the spatial domain, expressed in geoid heights

Figures 5, 6 and 7 show the noise evaluated in EWH. Apart from the general up-weighting of high degrees and consequently also the noise in the high degrees, the conclusions are the same as for the geoid heights:

  • Differences to the mean are significantly smaller than anomalies and only show a small latitude dependence.

  • Anomalies include nonsecular, nonseasonal signals, which are concentrated over land regions with strong mass variability.

Consequently, we use the differences to the mean values as our best approximation of the noise content to define the relative weights for the combination of the monthly gravity fields. The weights are determined iteratively by VCE on the solution level (Sect. 5.3). On the other hand, the RMS of the anomalies, restricted to ocean areas, is taken as an independent quality control. Note that the anomalies still contain small signals over the oceans, as can be seen when comparing the global representations of the anomalies and the differences to the mean values. The Southern Atlantic Ocean in particular has regions with significant signal contents. A region free of anomalous signals can be detected in the central part of Antarctica. We nevertheless prefer the ocean areas for quality control due to their much larger size. Moreover, the polar regions are not representative of the rest of the globe due to the much denser GRACE satellite ground tracks and consequently the observation coverage near the poles. No AOD signals were restored for the computation of the anomalies.

Fig. 5
figure 5

RMS of degree amplitudes of anomalies (solid) or differences to the mean (dashed) for the EGSIEM-AIUB time-series, expressed in EWH

Fig. 6
figure 6

RMS of anomalies of the EGSIEM-AIUB time-series in the spatial domain, expressed in EWH

Fig. 7
figure 7

RMS of differences of the EGSIEM-AIUB time-series to the mean in the spatial domain, expressed in EWH

4 Individual time-series

Contributions for combination are provided by the four EGSIEM ACs: AIUB, GFZ, GRGS and ITSG. All ACs use variants of a dynamic orbit and gravity field determination approach based on variational equations and on K-band range-rates (KRR) as the main observable. The individual approaches differ in

  • their use of either the original GPS observations or kinematic satellite orbits derived thereof,

  • the relative weighting or sampling of observables,

  • the noise model or the parameters estimated to absorb the noise and

  • the background models used for signal separation.

We therefore shortly characterize the different approaches. The descriptions of the individual approaches are based on the EGSIEM Standards documentFootnote 8 and updated by information presented at the EGSIEM final meeting.Footnote 9 The descriptions are not meant to be exhaustive, but to illustrate some of the main differences in the parameterizations. For further details, consult the provided references.

The observables used by the ACs, their sampling, the maximum number of observations and the weights applied are compiled in Table 1. GFZ (Dahle et al. 2012) and GRGS (Bruinsma et al. 2010) are directly using the original GPS carrier-phase and code observations together with the KRR observable to determine the dynamic GRACE orbits and the monthly gravity fields. AIUB and ITSG first determine kinematic satellite orbits by a precise point positioning (PPP) algorithm based on carrier phases only and then use the kinematic positions together with the epoch-specific covariance information as pseudo-observations.

Table 1 Types and numbers of GRACE observations used by the different ACs

The weights of the observables are generally based on the RMS of the corresponding residuals, i.e., 0.7 m for GPS code, 0.2 cm for GPS carrier phase (L1), 0.7 cm for the ionosphere-free linear combination (L3) of carrier phases (L1 and L2) and 0.1–0.3 \(\upmu \)m/s for KRR. In the case of ITSG, the relative weights of the different observables are determined by VCE.

All ACs observe inconsistencies between GPS and KRR observations leading to increased noise in the gravity field solutions. This problem seems to be even more serious, if kinematic orbits are used as pseudo-observations instead of the original GPS observables. Both, GFZ and GRGS, down-weight the GPS code observable, and GFZ in addition down-weights the GPS phases (see Table 1). GRGS moreover limits the resolution of the gravity field contribution determined by GPS to degree and order 40. AIUB down-weights the kinematic positions by an empirically determined factor of \(15^2\), and ITSG down-samples the pseudo-observations by a factor of 10. The reason for the inconsistencies between GPS and KRR is still under investigation.

The parameters estimated by the ACs include orbit, instrument and force model parameters (Table 2). Epochwise clock corrections (2880 per day and satellite in case of 30 s GPS sampling) and GPS phase ambiguities (typically 300–400 per day and satellite) are not listed, neither are the gravity model parameters (8277 per month in the case of a maximum degree of 90; coefficients of degrees 0 and 1 are not estimated). It is common practice to set up empirical parameters to absorb instruments noise, but the choice of parameters is not unique. GFZ estimates KRR biases, drifts and once-per-revolution (1/rev) or twice-per-revolution (2/rev) periodic variations every 90 min as originally proposed by Kim (2000). Accelerometer (ACC) biases and scale parameters in all three axes (X, Y, Z) of the instrument frame are estimated in addition with a 3 h time resolution. (An additional parameter set is estimated at the end of the arc, giving a total of nine sets per 24 h arc in this special case). GRGS also relies on a rather dense ACC bias parameterization, while it estimates ACC scale factors once per day and axis.

Table 2 Types and numbers of nuisance parameters estimated by the different ACs

AIUB applies a more conservative instrument parameterization, but estimates so-called pseudo-stochastic accelerations in the three axes—radial (R), along-track (S) and cross-track (W)—of the corotating orbital frame every 15 min. The pseudo-stochastic accelerations are estimated to compensate for not only instrument noise, but also all kinds of model deficiencies. They are constrained to zero with uncertainties of \(\sigma =3\times 10^{-9}\) m/s\(^2\) to prevent absorbing time-variable gravity signal (Meyer et al. 2016).

While all other ACs apply very simple noise models (diagonal weight matrices with uniform weight per observable), ITSG applies empirical noise modeling techniques to take correlations between observations over 3 h arcs into account (Ellmer 2018). Consequently, IFG has to deal with fully populated weight matrices. But a realistic noise model can only be achieved by a careful separation between signal and noise. Therefore, ITSG determines constrained daily variations up to a spherical harmonics degree of 40. The monthly mean of the daily estimates is restored in the monthly solution not to impair the signal content. On top of that, ITSG estimates fully populated (symmetric) \(3\times 3\) ACC scale factor matrices for each day. This measure drastically reduces the artifacts with a period of 161 days that impair the \(C_{20}\) estimate (Klinger and Mayer-Gürr 2016).

The monthly NEQs are provided by the individual ACs in the SINEXFootnote 10 format to the combination center. As additional information, the maximum degree \(l_\mathrm{max}\), the number of observations n (reduced by the number of pre-eliminated parameters), the number of unknowns u (reduced by the number of constraints applied on the pre-eliminated parameters), the geophysical constants GM (gravity constant times mass of the Earth) and R (semimajor axis of the reference ellipsoid), the tide system (zero tide/tide free) and the weighted square sum of pre-fit residuals \(\pmb l^T\pmb P\pmb l\) are provided in the header of the SINEX files. The weighted square sum of postfit residuals needed for the derivation of VCE-weights and for statistics may then be computed as

$$\begin{aligned} \pmb v^T\pmb P\pmb v=\pmb l^T\pmb P\pmb l-\pmb {\hbox {d}x}^T\pmb b \end{aligned}$$

with \(\pmb {\hbox {d}x}=\pmb x-\pmb x_0\) being the parameters improved and \(\pmb b\) being the right-hand side vector of the normal equation system. All NEQs contain a priori gravity field coefficients \(\pmb x_0\), the normal equation matrix \(\pmb N\), the right-hand side vector \(\pmb b\) and the solution vector \(\pmb x\).

To simplify the combination on the NEQ level, all but the gravity field parameters are pre-eliminated by the ACs and the individual NEQs are normalized (i.e., each observable is weighted according to Table 1). Despite all these measures, the differences in the choice of observables and in the observation sampling cause huge differences in the number of observations entering the daily normal equations, and the numbers of the pre-eliminated parameters differ significantly, too. Moreover, the various noise modeling strategies cause very different magnitudes of the formal errors. In Sect. 5, a robust combination strategy is introduced.

5 Combination of normal equations

The monthly normal matrices \(\pmb N_{i}\) of the EGSIEM ACs as well as the right-hand side vectors \(\pmb b_{i}\) are weighted and stacked to form the combined NEQ system

$$\begin{aligned} \left( \sum _{i=0}^{n_\mathrm{sol}}w_{i}\pmb N_{i}\right) \pmb {\hbox {d}x} = \sum _{i=0}^{n_\mathrm{sol}}w_{i}\pmb b_{i}. \end{aligned}$$

This section is devoted to the preparatory work needed to scale the \(n_\mathrm{sol}\) different NEQs to a common set of geophysical constants GM and R and a priori gravity field model, and to derive the relative weights \(w_i\).

5.1 Transformation to common geophysical constants, tide system and a priori gravity model

The original NEQs refer to the values of \(GM_i\) and \(R_i\) used by the individual ACs. Therefore, the individuals \(\pmb N_i\) and \(\pmb b_i\) have to be scaled to the common reference values, e.g., in the case of EGSIEM, \(GM_\mathrm{ref}=3.986004415\times 10^{14}\) m\(^3/\)s\(^2\) and \(R_\mathrm{ref}=6378136.3\,m\) (as recommended by the IUGG General Assembly 1991 in Vienna) prior to combination. The scaling factors

$$\begin{aligned} f_l = \frac{GM_i}{GM_\mathrm{ref}}\left( \frac{R_i}{R_\mathrm{ref}}\right) ^l \end{aligned}$$

depend on spherical harmonic degree l.

We define a diagonal scale matrix \(\pmb F\), with elements \(F_{jk} = 0\) for \(j\ne k\) and \(F_{jj} = f_l\) corresponding to the degree of the coefficients in \(\pmb N_i\) and \(\pmb b_i\). The individual gravity field solution \(\pmb x_i\) is scaled by \(\pmb x_i' = \pmb F\pmb x_i\). The observation vector \(\pmb y=\pmb A_i\pmb x_i=\pmb A_i'\pmb x_i'\) (where \(\pmb A_i\) are the individual design matrices, which are not provided by the ACs) is not allowed to change, which is why the rescaled design matrix must take the form \(\pmb A_i'=\pmb A_i\pmb F^{-1}\) and the individual components of the NEQ are rescaled accordingly:

$$\begin{aligned} \pmb x_{0,i}'= & {} \pmb F\pmb x_{0,i}, \end{aligned}$$
$$\begin{aligned} \pmb b_i'= & {} \pmb F^{-1}\pmb b_i, \end{aligned}$$
$$\begin{aligned} \pmb N_i'= & {} \pmb F^{-1}\pmb N_i\pmb F^{-1}, \end{aligned}$$
$$\begin{aligned} \pmb {\hbox {d}x}_i'= & {} \pmb F\pmb {\hbox {d}x}_i. \end{aligned}$$

Furthermore, the NEQs have to refer to a common tide system. The tide-free system was selected as the EGSIEM standard. In the NEQs referring to the zero-tide system, a bias of \(4.173\times 10^{-9}\) has to be added to the a priori \(C_{20}\) gravity field coefficient.

Eventually, the NEQs have to be transformed to the common a priori gravity field coefficients \(\pmb x_{0,\mathrm{ref}}\). If \(\pmb {dx}_0=\pmb x_{0,\mathrm{ref}}-\pmb x_{0,i}\), then according to Brockmann (1997) the transformed NEQ is:

$$\begin{aligned} \pmb x_{0,i}'= & {} \pmb x_{0,i}+\pmb {\hbox {d}x}_0 = \pmb x_{0,\mathrm{ref}} \end{aligned}$$
$$\begin{aligned} \pmb b_i'= & {} \pmb b_i-\pmb N_i\pmb {\hbox {d}x}_0\end{aligned}$$
$$\begin{aligned} \pmb N_i'= & {} \pmb N_i\end{aligned}$$
$$\begin{aligned} \pmb {\hbox {d}x}_i'= & {} \pmb {\hbox {d}x}_i-\pmb {dx}_0. \end{aligned}$$

The weighted square sum of \(\pmb l_i\) also has to be adapted:

$$\begin{aligned} \pmb l_i'^T\pmb P_i\pmb l_i'=\pmb l_i^T\pmb P_i\pmb l_i-2\pmb b_i\pmb {\hbox {d}x}_0+\pmb {\hbox {d}x}_0^T\pmb N_i\pmb {\hbox {d}x}_0. \end{aligned}$$

5.2 Empirical scaling to balance the impact of NEQs on the combination

A unweighted combination of the individual NEQs not necessarily results in a combined solution close to the arithmetic mean of the individual solutions. As mentioned in Sect. 1 and further outlined in Sect. 4, the individual NEQs are based on different observables, noise models and parameterizations and therefore differ in their specific degrees of freedom and in the magnitude of the formal errors. Due to these differences, the impact of the individual NEQs on an unweighted combination is almost unpredictable.

Fig. 8
figure 8

Empirical determination of factors resulting in equal contribution of two gravity fields to the pairwise combinations (01/2006)

We know, on the other hand, that each NEQ basically contains the same information representative of the same time span of GRACE observations, and the individual solutions only differ in analysis noise. The latter will be taken into account by the weights derived from the individual solutions in Sect. 5.3.

We now derive empirical factors to balance the impact of the individual NEQs. We define one NEQ (\(\pmb N_\mathrm{ref}\), \(\pmb b_\mathrm{ref}\)), chosen freely from the individual NEQs, as the reference with a fixed weight of 1. Then, we perform pairwise combinations of the reference NEQ with all other NEQs (\(\pmb N_i\), \(\pmb b_i\)):

$$\begin{aligned} (\pmb N_\mathrm{ref}+w\pmb N_i)\pmb {\hbox {d}}x=\pmb b_\mathrm{ref}+w\pmb b_i \end{aligned}$$

and vary the weight w until the RMS of the coefficient-wise differences between the combined solution of Eq. 13 and the solutions to the individual NEQs in Eq. 13 is the same (Fig. 8). The RMS is computed as follows:

$$\begin{aligned} \hbox {RMS}_i=\sqrt{\frac{\sum _{l,m=0}^{l_\mathrm{max}}(K_{lm}^\mathrm{comb}-K_{l,m}^i)^2}{n_\mathrm{coef}}}, \end{aligned}$$

where \(K_{lm}\) stands for the spherical harmonics coefficients \(C_{lm}\) and \(S_{lm}\). Consequently, in case of \(n_\mathrm{sol}\) NEQs to be combined, we end up with \(n_\mathrm{sol}-1\) empirical weights. A combination of all \(n_\mathrm{sol}\) NEQs applying the empirical weights to the corresponding NEQs results in an approximation of the arithmetic mean of all individual solutions. In the following, the empirical weights are applied as correction factors to the weights representing the different noise levels (Sect. 5.3) to derive the final weights for the monthly combinations.

5.3 Relative weights based on solution noise

According to the argumentation in Sect. 1, we define relative weights representative of the noise content on the solution level. This can be done simply by comparing the individual solutions to their arithmetic mean. An alternative procedure based on VCE, proposed by Jean et al. (2018), is more robust against outliers. The same authors also study different weighting schemes, e.g., coefficient-wise, order-wise or field-wise weights, and conclude that monthly field-wise weights determined by VCE on the solution level are best suited for the combination. We therefore determine field-wise weights.

The basic idea of VCE on the solution level is to use the individual solutions as pseudo-observations, taking into account all coefficients (in the case of field-wise weights) with equal weight. The design matrices, the weight matrix and consequently also the normal matrices will all become unity matrices of dimension \(n_\mathrm{coef}\). Introduced into the formulas of VCE (see “Appendix A.1”), the relative weights of iteration k turn out to be

$$\begin{aligned} w_{i,k}=\left( 1-\frac{w_{i,k-1}}{\sum _i{w_{i,k-1}}}\right) \frac{1}{\hbox {RMS}(\pmb {x}_i-\hat{\pmb {x}}_{k-1})^2}. \end{aligned}$$

The combination on the solution level in the iteration step k is

$$\begin{aligned} \hat{\pmb {x}}_{k}=\frac{1}{\sum _i{w_{i,k}}}\sum _i{w_{i,k}\pmb {x}_i}, \end{aligned}$$

\(w_{i,0}=1/n_\mathrm{sol}\) may serve as starting values. We base the computation of relative weights directly on the unfiltered dimensionless spherical harmonics coefficients \(\pmb x_i\), but in principle filtered versions or coefficients transformed to EWH may be considered as well (the former to decrease the impact of the noisy high-degree coefficients on the weights, the latter to increase it).

Figure 9 shows the weights determined by VCE on the solution level for the four EGSIEM AC’s monthly solutions of January 2006 and Fig. 10 the corresponding noise levels for each iteration step, assessed by the weighted STD of anomalies over ocean areas (see Sect. 3). Usually, convergence is reached after four iteration steps.

Fig. 9
figure 9

Iterative determination of noise-based relative weights by VCE on the solution level for one example month (01/2006)

Fig. 10
figure 10

Noise level per iteration of combined solution, approximated by the weighted standard deviation of anomalies over the oceans

For comparison, we computed monthly combinations on the NEQ level spanning the two years 2006–2007 based on different weighting schemes:

  • applying no weights at all,

  • defining the relative weights iteratively by standard VCE on the NEQ level, i.e., without empirical factors (labeled “NEQ-VCE” in Fig. 11),

  • applying empirical factors to balance the impact of the individual contributions (arithmetic mean on the NEQ level),

  • basing the relative weights on the solution noise by multiplying the empirical factors by weights determined by VCE on the solution level (labeled “EGSIEM-COMB” in Fig. 11).

For the individual time-series and all four combination schemes, the monthly standard deviations of anomalies over the oceans were computed to assess their noise content (Fig. 11). The quality of the combined gravity fields is mainly driven by the outstanding ITSG contribution. Comparing the combined gravity fields, only the combination taking the solution noise into account reaches the noise level of ITSG. In the rare occasion where a monthly ITSG gravity field shows a slightly increased noise level (e.g., in September 2006), the combination surpasses all individual contributions.

The arithmetic mean on the NEQ level (based on the empirical factors to balance the impact of the individual NEQs) performs slightly worse than the “EGSIEM-COMB.” This result differs from the conclusion of Sakumura et al. (2014) (studying gravity field combinations on the solution level) that the arithmetic mean of the gravity fields of different ACs performs best. Contrary to Sakumura et al. (2014), who combined time-series of very homogeneous quality, we are confronted with more diverse noise levels and therefore, as already mentioned by Jean et al. (2018), the benefit of relative weights becomes apparent.

Neither the combination of unscaled and unweighted NEQs, nor the combination based on VCE on the NEQ level can reach the quality of the combinations based on the empirical balancing factors. As mentioned before and also discussed in Sect. 6, this is explained by the different processing and especially noise modeling strategies of the individual ACs that have to be taken into account.

Fig. 11
figure 11

Noise level per month of the individual contributions (top) and combinations based on various weighting strategies (bottom): equal weight, VCE on the NEQ level, only balanced, or balanced and weighted by VCE on the solution level (EGSIEM)

Note that due to the different orbit parameterizations, the combined monthly gravity fields do not correspond to one and the same satellite orbit valid for all ACs. While the AC-specific parameters are pre-eliminated prior to combination and the correlations between the local and the gravity field parameters are kept, a solution of the pre-eliminated parameters by re-substitution of the combined gravity field coefficients would lead to, e.g., different initial state vectors and increased residuals compared to the individual solutions. As long as we have to deal with diverse parameterizations, there exists nothing like an optimal common set of orbit parameters.

6 Evaluation of combined monthly solutions

As long as no signal biases impede the combination, the field-wise weights derived by VCE on the solution level provide a robust quality indicator for the monthly gravity fields provided by the EGSIEM ACs. Together with quality indicators based on the anomalies, noise levels can be characterized and signal attenuation due to regularization can be detected.

Within the EGSIEM project, originally only monthly gravity fields for the two years 2006–2007 were re-processed according to the EGSIEM standards and combined on the NEQ level. For this time span, 24 monthly gravity fields are available from the EGSIEM ACs AIUB, GFZ, GRGS and ITSG. Toward the end of the project, it was decided to extend the time span to the years 2004–2010 to enable the evaluation by users inside and outside the EGSIEM project. Only contributions from AIUB, GRGS and ITSG were accessible for the extended time span.

The monthly relative weights based on VCE on the solution level, the empirical scaling factors to achieve the same impact of the individual contributions on the combination, the final weights resulting as the product of the VCE weights and the empirical factors, and for comparison also the weights that would result from VCE on the NEQ level are visualized in Fig. 12 for the years 2006–2007 and in Fig. 13 for the other years in 2004–2010. All weights were normalized, i.e., divided by the monthly sum of the weights to add up to one. Only the weights derived by VCE on the solution level serve as a quality indicator. For most of 2006 and all of 2007, the ITSG contribution (labeled ITSG according to the official name of the original time-series) receives the highest weights. This is also true for the extended period 2004–2010 and is in agreement with the findings of Jean et al. (2018), who attributed the superior performance of the ITSG time-series to the advanced noise modeling at ITSG.

Fig. 12
figure 12

Empirical factors to balance the impact (top), relative weights based on noise levels of solutions (second), final weights (third) and weights based on VCE on NEQ level (bottom) in the case of four contributions 2006–2007

Fig. 13
figure 13

Empirical factors to balance the impact (top), relative weights based on noise levels of solutions (second), final weights (third) and weights based on VCE on NEQ level (bottom) in the case of three contributions

The empirical factors are determined relative to a reference contribution, which is identified by a constant weight of 1 in the figures. During 2006–2007 GFZ, the only member of the SDS in the EGSIEM consortium was selected as reference, and for the years without GFZ contribution, GRGS was selected. As explained in Sect. 4, the empirical factors balance the effect of different types of observations, different parameterizations and different noise models in the individual time-series. Consequently, the final weights cannot be interpreted as quality indicators anymore.

The weights derived by VCE on the NEQ level differ significantly from the final weights used for the EGSIEM combination. The fundamental difference is the low weight assigned to the ITSG contribution. This is explained by the empirical noise model applied by ITSG that leads to realistic, i.e., significantly larger formal errors compared to the other time-series. It remains unclear why the weights derived by VCE on the NEQ level are much less favorable for GFZ than for GRGS. Both ACs base their processing on the original GPS phase observations, and their formal errors are comparable, at least for the dominating medium to high-degree SHC. We conclude that VCE on the NEQ level does not necessarily produce optimal weights if NEQs stemming from different analysis approaches have to be combined.

In the presence of signal biases, the differences between the individual contributions and their mean values include the signal biases and the weights derived by VCE on the solution level for a biased contribution are smaller than expected from noise only. In this case, the relative weights are no longer representative of the different noise levels. Consequently, small VCE-derived weights together with small noise, as illustrated, e.g., by small anomalies over the oceans, indicate signal biases. In our case, all contributions of the EGSIEM ACs passed the quality control and no signal biases could be detected.

The noise level of the combined solution is also independently evaluated by means of anomalies in the spherical harmonic and the spatial domain (as proposed in Sect. 3). The deterministic signal model used to define the anomalies (see Sect. 3) was derived from the monthly arithmetic mean values of all the time-series available at ICGEM having passed quality control according to Jean et al. (2018).

Figure 14 compares the RMS of degree amplitudes of EWH anomalies in the spherical harmonic domain to the three official RL05 time-series (evaluated for the time span 2004–2010) of the GRACE SDS ACs. Beyond degree 30, the degree amplitudes of the anomalies in general are dominated by noise [see e.g., Jean et al. (2018)]. The RMS of the anomalies of the EGSIEM-combined solutions is smaller than that of the GRACE SDS RL05 time-series.

To exclude the effect of the noisy high-order spherical harmonics coefficients which normally are attenuated by postprocessing filters [e.g., Kusche (2007)], all gravity fields were truncated at order 29. But also the truncated degree amplitudes of the anomalies that focus on the part of the spectrum essential for geophysical analysis (dashed lines in Fig. 14) are smaller in the EGSIEM-combined gravity fields than in the GRACE SDS solutions.

Fig. 14
figure 14

Comparison of EGSIEM and GRACE SDS time-series of EWH anomalies in the spectral domain. RMS of degree amplitudes either to full order (solid) or to reduced order 29 (dashed)

Figure 15 shows the RMS values of anomalies of the GRACE SDS time-series and the EGSIEM combination in the spatial domain. All gravity fields were smoothed with a 400 km Gauss filter (Wahr et al. 1998). The RMS values of the combined anomalies show significantly reduced noise stripes, which are typical for the GRACE monthly gravity fields. Also, the residual signal seems to be less affected by stripes on the continents. A similar evaluation comparing the combined gravity fields with the individual EGSIEM ACs time-series can be found in Jäggi et al. (2019).

Fig. 15
figure 15

RMS of EWH anomalies of GRACE time-series in the spatial domain, smoothed by a 400 km Gauss filter. Top: CSR-RL05, second: JPL-RL05, third: GFZ-RL05, bottom: EGSIEM combination

Fig. 16
figure 16

Comparison of EGSIEM individual contributions and combined solution monthly RMS of EWH anomalies over the oceans (smoothed by a 400 km Gauss filter)

Figure 16 compares the combined gravity fields to the individual EGSIEM ACs’ time-series. The monthly RMS values of anomalies over the oceans are computed in order to assess the noise levels of individual solutions. Note that for 2006–2007 the combined solutions include the GFZ contribution. The ITSG contribution is clearly less noisy than the other individual ACs’ time-series in this evaluation. But with the exception of very few months, the noise level of the combined gravity fields is as small as or even smaller than that of ITSG. Note that poor quality of the solutions in January 2004 is caused by data gaps and in December 2005 by the satellite swap maneuver. From August to October 2004 (gray box in Fig. 16), the quality of the monthly gravity fields is impaired by the orbit resonances.

7 Conclusions and outlook

We presented the prototype of a combination service for monthly gravity fields, which was implemented in the frame of the EGSIEM project. The monthly gravity fields provided by the associated ACs show different noise levels due to different processing approaches: The number of observations used per month varies between 500,000 and 3,000,000, the number of estimated parameters between 5000 and 50,000. Moreover, the noise modeling techniques and parameter types differ substantially.

The combination is performed on the NEQ level to correctly take into account correlations between parameters. Relative weights, representative of the different noise levels, are derived by VCE on the solution level, i.e., by iterative comparison of the individual gravity fields to their weighted mean. The intrinsic weights of the individual NEQs are removed by a robust empirical procedure balancing the impact of the individual NEQs on the pairwise combinations.

Combined gravity fields were computed from three or four ACs for the time span between 2004 and 2010. An independent evaluation of the noise levels indicates that the quality of the best individual contribution (ITSG) is achieved or even topped by the combinations in 90% of the monthly solutions. Outliers can be identified with data problems. Compared to the official GRACE SDS monthly gravity fields, the anomalies of the EGSIEM combinations that are derived to assess the noise level are smaller. The original goal to provide consistent, reliable and validated gravity fields therefore is met.

The noise level differences of the individual time-series are striking. With a more homogeneous quality of the input series, the combinations should improve substantially as well. First, experiments with the new GRACE SDS RL06 time-series indicate a big step forward in this direction. With the availability of the new GRACE L1B-RL03 observational data and the SDS RL06 gravity fields, now a final combination of all GRACE time-series becomes feasible.

The EGSIEM initiative for gravity field combination is continuing with COST-G under the umbrella of the IAG. Since it cannot be expected that the GRACE SDS ACs will reprocess the whole GRACE time-series to be in accordance with the EGSIEM standards, the COST-G standards will be adapted to only specify the signal content of the monthly gravity fields which should include nontidal oceanographic, hydrological, glaciological and GIA signal to the full extent.