Combination of GRACE monthly gravity fields on the normal equation level

A large number of time-series of monthly gravity fields derived from GRACE data provide users with a wealth of information on mass transport processes in the system Earth. The users are, however, left alone with the decision which time-series to analyze. Following the example of other well-known combination services provided by the geodetic community, the prototype of a combination service has been developed within the frame of the project EGSIEM (2015–2017) to combine the different time-series with the goal to provide a unique and superior product to the user community. Four associated analysis centers (ACs) of EGSIEM, namely AIUB, GFZ, GRGS and IfG, generated monthly gravity fields which were then combined using the different normal equations (NEQs). But the relative weights determined by variance component estimation (VCE) on the NEQ level do not lead to an optimal combined product due to the different processing strategies applied by the individual ACs. We therefore resort to VCE on the solution level to derive relative weights that are representative of the noise levels of the individual solutions. These weights are then applied in the combination on the NEQ level. Prior to combination, empirical scaling factors that are based on pairwise combinations of NEQs are derived to balance the impact of the NEQs on the combined solution. We compare the processing approaches of the different ACs and introduce quality measures derived either from the differences w.r.t. the monthly means of the individual gravity fields or w.r.t. a deterministic signal model. After combination, the gravity fields are validated by comparison to the official GRACE SDS RL05 time-series and the individual contributions of the associated ACs in the spectral and the spatial domain. While the combined gravity fields are comparable in signal strength to the individual time-series, they stand out by their low noise level. In terms of noise, they are in 90% of all months as good or better than the best individual contribution from IfG and significantly less noisy than the official GRACE SDS RL05 time-series.


Introduction
Monthly Earth gravity fields based on the observations of the Gravity Recovery And Climate Experiment [GRACE, Tapley et al. (2004)] satellite mission are an important source of information on temporal mass variations in the system Meyer ulrich.meyer@aiub.unibe.ch Earth (Wouters et al. 2014). Monthly gravity fields are not only provided by the official GRACE Science Data System (SDS) processing centers JPL, CSR and GFZ, but also by an increasing number of independent analysis centers (ACs) worldwide.
The standard approach is to expand the gravity field in spherical harmonics and provide the weight coefficients of this expansion (L2-products) that may be transformed to global grids (L3-products) for easier use. Depending on the field of application, the grids are complemented by monthly mean values of the short-term atmosphere and ocean mass variations [so-called GAX-products, Flechtner and Dobslaw (2013)] to restore the nontidal signal content. Moreover, the L3-products are usually pre-filtered to reduce noise. Web services (e.g., PO.DAAC, 1 ISDC, 2 or Tellus 3 for L3-products) are available to download the GRACE SDS products. Time-series of monthly gravity fields, also from the alternative ACs, are collected and made available via the International Centre for Global Earth Models (ICGEM). 4 Based on these sources of information, the user has to decide which time-series of monthly gravity fields to use. For most users, the peculiarities of the different processing approaches of the individual GRACE ACs remain unclear. Therefore, there is urgent need for a unification of gravity models like it is done for the products of other space geodetic techniques by, e.g., the International GNSS Service [IGS, Dow et al. (2009)], the International Laser Ranging Service [ILRS, Pearlman et al. (2002)], the International VLBI Service [IVS, Nothnagel et al. (2017)] or the International DORIS Service [IDS, Tavernier et al. (2005)].
Noise in the monthly gravity fields is dominated by either measurement system errors (observation noise), or temporal aliasing errors caused by imperfections in the background models [see, e.g., Flechtner et al. (2016); Seo et al. (2008)]. Due to the one-dimensional observation geometry, the temporal aliasing error is manifest as north-south striping. The noise characteristics of the various solutions differ because of the different parameterizations used by the individual ACs to compensate for both error sources. In the following, we refer to the solution noise not explained by the measurement system error as analysis noise.
The role of the analysis noise is illustrated by Fig. 1, which shows the discrepancy between calibrated errors of ITSG monthly GRACE gravity fields (the ITSG formal errors are calibrated by construction due to the use of a stochastic noise model) and the baseline accuracy determined in pre-mission simulations (Jekeli and Rapp 1980;Kim 2000). The baseline accuracy based on observation error models turned out to be too optimistic by about one order of magnitude. Consequently, the noncalibrated formal errors of the individual monthly solutions are too optimistic, as well ( Fig. 1). One may expect that the combination of monthly gravity fields from different ACs based on different background models and using different parameterizations reduces the analysis noise, although all gravity fields are derived from the same observations and no new information is introduced in the combination. This expectation further motivates the combination service for monthly gravity fields.
To fully take into account correlations between the individual gravity field parameters, but also between gravity field, orbit, instrument, stochastic or other model parameters, gravity fields have to be combined on the normal equation (NEQ) 1 https://podaac.jpl.nasa.gov/GRACE. 2 http://isdc.gfz-potsdam.de/grace-isdc/.
3 https://grace.jpl.nasa.gov/. 4 http://icgem.gfz-potsdam.de. Fig. 1 GRACE baseline accuracy and formal (AIUB, GRGS and GFZ), resp., calibrated (ITSG) errors of monthly gravity fields. Shown are degree amplitudes of formal errors that were averaged over all monthly solutions per AC (2004in case of AIUB, GRGS and ITSG, 2006-2007 level. Up to now, a combination of gravity fields on the NEQ level was not possible because the individual ACs normally do not provide NEQs. This was only changed in the frame of the EGSIEM project (see Sect. 2).
It is a well-known technique when combining NEQs to define relative weights iteratively by variance component estimation [VCE, Koch and Kusche (2002)]. This approach is applied, e.g., by the IVS (Böckmann et al. 2010). The technique is recapitulated in "Appendix A.1." But applying VCE to NEQs provided by different ACs is hampered by a basic problem. Proper stochastic models of noise in the original data are not available. The individual ACs apply different noise modeling strategies and rely on different parameters to absorb background model errors, but the inversion of the individual normal matrices does not yield realistic covariance matrices of errors of model parameters. The error estimates of the unknown parameters differ considerably between the ACs, and consequently classical VCE converges to nonoptimal results. This problem is not only encountered in the combination of GRACE gravity fields. Lerch (1989), in case of the combination of data from different satellite missions, proposed a procedure to derive relative weights based on the analysis noise. Seitz et al. (2012), in the computation of the global reference frame DTRF2008, based on terrestrial and satellite data, completely replaced VCE derived by empirical weights.
We propose an alternative weighting scheme which is based on the noise levels of the individual solutions (Sect. 5). Relative weights are derived by VCE on the solution level (Jean et al. 2018) and then consequently applied to the NEQs. The main difference between the classical VCE on the NEQ level and the alternative VCE on the solution level is that in the latter case, while correlations between the unknown parameters are lost, the error assessment does not depend on the different error modeling and absorption strategies of the individual ACs, but on the differences to a (weighted) mean of the individual solutions.
Prior to weighting, the impact of the individual NEQs on the combination has to be balanced. This is achieved by empirical factors derived from the study of pairwise combinations and has to be done independently from the VCE, since the VCE is converging robustly to the same results, independent of the a priori weights used.
In a nutshell, to generate a statistically optimal combination of monthly gravity fields provided by different ACs: -We exploit the normal matrices provided together with solutions themselves. -Ideally, the inverse of the normal matrix is the full error variance-covariance matrix of model parameters. In reality, ACs do not guarantee a proper scaling of normal matrices. Therefore, we estimate the weight factors to be used in combining the individual NEQs. -The most obvious technique to estimate optimal weights is VCE. Unfortunately, a direct application of this technique, as it is presented in Appendix A.1, leads to suboptimal results, because the inversion of the NEQs provided by the ACs does not yield realistic covariance matrices of errors in model parameters. -In order to solve this problem, we apply VCE on the solution level (described in Sect. 5.3). In this case, the errors in model parameters are assumed to be uncorrelated and the error variances of all the parameters are assumed to be the same. Unfortunately, the weights estimates cannot be applied to the NEQs just like that, because the inversion of any of the NEQs results in an error covariance matrix that suffers, among others, from an unknown scaling factor. -We therefore compute and apply additional empirical factors, which so to say equalize the available normal matrices. Those factors are estimated such that all the individual solutions contribute equally to the combined solution, independently of their quality. (This procedure is described in Sect. 5.2.) The article is structured as follows: We first introduce the EGSIEM combination service for monthly gravity fields in Sect. 2 and derive measures for quality control in Sect. 3. In Sect. 4, the individual contributions of the associated analysis centers are characterized, and in Sect. 5, the combination on the normal equation level is discussed. In Sect. 6 finally, the combined gravity fields are validated by comparison with the official GRACE SDS time-series and with the individual contributions. The paper concludes with Sect. 7.

The EGSIEM combination service for monthly gravity fields
In the frame of the Horizon 2020 project European Gravity Service for Improved Emergency Management [EGSIEM, Jäggi et al. (2019)], the prototype of a scientific combina-tion service for time-variable gravity fields was established. The goal of this service is to provide consistent, reliable and validated monthly gravity fields, which are combined on the NEQ level from standardized NEQs of all associated ACs. EGSIEM ACs contributing to the combination are the Astronomical Institute of the University of Bern (AIUB), the Helmholtz Centre Potsdam, German Research Centre for Geosciences (GFZ), the Groupe de Recherche de Géodésie Spatiale (GRGS) and the Institute of Geodesy of the Technical University of Graz (IfG, former Institute for Theoretical and Satellite Geodesy, ITSG).
To guarantee consistency between the individual contributions, EGSIEM standards were defined for reference frame, Earth rotation and antenna reference points on the GRACE satellites, as well as for the relativistic effects and for thirdbody perturbations. The EGSIEM ACs were free to use their specific processing approaches and the background force models of their choice for the static gravity field of the Earth and for tidal mass variations. Neither were the de-aliasing products for short-term atmosphere and ocean mass variations [AOD, Flechtner and Dobslaw (2013)] harmonized, because background models and de-aliasing products are not free of errors. In the combination, errors in the individual models may be reduced; therefore, a wide variety of models is beneficial.
EGSIEM-combined gravity fields are provided in spherical harmonic representation (L2-products) and as global grids (L3-products). To generate L3-products, degree 1 terms derived from Satellite Laser Ranging (SLR) are added to transform between a center of mass and a center of figure frame. Then, the monthly mean of AOD is restored to achieve full (nontidal) signal content. The AOD correction is combined from the individual monthly means provided by the ACs using the same relative weights as in the combination of the gravity fields (Jäggi et al. 2019).
For hydrological applications, monthly means of atmosphere [GAA, Flechtner and Dobslaw (2013)], ocean [GAB, Flechtner and Dobslaw (2013)] and the global isostatic adjustment (GIA) model LM17.3 5 are subtracted. For oceanographic applications, monthly means of the atmosphere, the terrestrial water storage modeled by the WaterGAP Global Hydrological Model [WGHM; Döll et al. (2003)] and GIA (evaluated at the epochs of the monthly gravity fields) are subtracted.
A variant of the DDK-filter (Kusche 2007) making use of the full, monthly covariance information is applied to filter the different versions (still in spherical harmonics representation). Therefore, the calibrated error covariances of the ITSG gravity fields were used and the characteristics of the expected hydrological or oceanographic signals were taken into account. Finally, the spherical harmonic coefficients (SHC) were transformed to global grids with 1 • -resolution.
All EGSIEM products are representative of the time span within a given month defined by the start and end day. Short GRACE data gaps are ignored when computing the monthly means of the AOD products, assuming that users in general do not thin out their observation database according to the availability of the GRACE observation data either.
EGSIEM-combined L2 and L3 products can be downloaded from the "Data" section of the EGSIEM homepage. 6 Furthermore, mass variations derived from individual timeseries as well as from the combined gravity fields can be visualized by the EGSIEM plotter. 7 The prototype combination service continues after the completion of the EGSIEM project in the frame of the Combination Service for Time-variable Gravity field models (COST-G) as a product center of the International Gravity Field Service (IGFS) under the umbrella of the International Association of Geodesy (IAG).

Noise assessment
We need to assess the noise levels of the individual and the combined gravity fields for quality control, and we have to independently define relative weights to consider the different noise levels in the models to be combined. Prior to combination, the monthly gravity fields provided by the individual ACs undergo strict quality control based on their signal and noise content in the spectral and spatial domains. While noise levels may vary between ACs and are taken into account in the combination by noise-based relative weights, the signal content is expected to be the same in all gravity field timeseries accepted for combination. Gravity field solutions with attenuated temporal variation due to intended or accidental regularization are excluded from the combination to avoid damaging the signal content.
The signal content is evaluated by the comparison of the amplitudes of seasonal mass variations in a large number of river basins and by the study of mass trends in polar regions.
The tests of the signal content are described in detail by Jean et al. (2018).
We here focus on the noise content to assess the quality of the individual and combined gravity fields and to derive relative weights for combination. To separate signal from noise, we have two possibilities: -Comparison with the monthly mean of different gravity fields, assuming that all gravity fields contain the same signal, but are different in noise. The noise ideally is 6 http://www.egsiem.eu. 7 http://plot.egsiem.eu. greatly reduced in the averaging process, while the signal content remains unchanged. -Comparison with a signal model. As our knowledge of mass transport in the system Earth is limited, we refer to a deterministic model of mass variation containing bias, trend, annual and semiannual variations fitted by a least-squares process to the monthly mean values of the individual gravity fields. The residuals with respect to this model are called anomalies.
The transformation between spherical harmonics representations and grids is linear, and therefore it does not matter whether we compute differences to the mean or anomalies per coefficient in the spherical harmonics domain or for each grid cell of global grids in the spatial domain. Moreover, differences to the mean or anomalies may be evaluated either in geoid heights or in equivalent water heights (EWH). The results will be different in this case, because the scaling factors involved are depending on the degree (Wahr et al. 1998) and consequently the noise in the high degrees is amplified when using EWH. Figure 2 shows the root-mean-square (RMS) in geoid heights over all monthly gravity fields 2004-2010 of the degree amplitudes of differences to the mean and anomalies of the EGSIEM-AIUB contribution. The differences to the mean values are in general smaller than the anomalies, because our signal model is incomplete and does not represent nonsecular, nonseasonal variations, and because the differences to the mean do not reflect the errors that are shared by all the models.
Due to the polar orbits of the GRACE satellites and due to the related sparse observation sampling in cross-track direction, high spherical harmonic orders are especially noisy and often removed by filtering (Kusche 2007). We therefore also show degree amplitudes computed from orders 0, . . . , 29 only to focus on the geophysically most meaningful part of the spectrum. The spikes visible at degrees 15, 31, 46, 61 and around 76 are related to orbit resonances. The GRACE satellites circle the Earth approximately 15.3 times per day. Spherical harmonic orders at integer multiples of   (Seo et al. 2008). Whenever the degree amplitudes include a new resonant order, this causes a jump in the noise level. Figures 3 and 4 show the RMS per grid cell over all monthly anomalies and differences to the mean in geoid heights, respectively. In the spatial representation, it is obvious that the remaining signal in the anomalies is not distributed evenly over the globe, but is concentrated over the continents in regions with strong mass variability, while anomalies over the oceans are small. No corresponding phenomenon can be detected for the differences to the mean values, which only show a slight latitude dependence due to the denser observation coverage at higher latitudes.
Figures 5, 6 and 7 show the noise evaluated in EWH. Apart from the general up-weighting of high degrees and consequently also the noise in the high degrees, the conclusions are the same as for the geoid heights: -Differences to the mean are significantly smaller than anomalies and only show a small latitude dependence. -Anomalies include nonsecular, nonseasonal signals, which are concentrated over land regions with strong mass variability.
Consequently, we use the differences to the mean values as our best approximation of the noise content to define the relative weights for the combination of the monthly gravity On the other hand, the RMS of the anomalies, restricted to ocean areas, is taken as an independent quality control. Note that the anomalies still contain small signals over the oceans, as can be seen when comparing the global representations of the anomalies and the differences to the mean values. The Southern Atlantic Ocean in particular has regions with significant signal contents. A region free of anomalous signals can be detected in the central part of Antarctica. We nevertheless prefer the ocean areas for quality control due to their much larger size. Moreover, the polar regions are not representative of the rest of the globe due to the much denser GRACE satellite ground tracks and consequently the observation coverage near the poles. No AOD signals were restored for the computation of the anomalies.

Individual time-series
Contributions for combination are provided by the four EGSIEM ACs: AIUB, GFZ, GRGS and ITSG. All ACs use variants of a dynamic orbit and gravity field determination approach based on variational equations and on K-band range-rates (KRR) as the main observable. The individual approaches differ in -their use of either the original GPS observations or kinematic satellite orbits derived thereof, -the relative weighting or sampling of observables, -the noise model or the parameters estimated to absorb the noise and -the background models used for signal separation.
We therefore shortly characterize the different approaches. The descriptions of the individual approaches are based on the EGSIEM Standards document 8 and updated by information presented at the EGSIEM final meeting. 9 The descriptions are not meant to be exhaustive, but to illustrate some of the main differences in the parameterizations. For further details, consult the provided references.
The observables used by the ACs, their sampling, the maximum number of observations and the weights applied are compiled in Table 1. GFZ (Dahle et al. 2012) and GRGS (Bruinsma et al. 2010) are directly using the original GPS carrier-phase and code observations together with the KRR observable to determine the dynamic GRACE orbits and the monthly gravity fields. AIUB and ITSG first determine kinematic satellite orbits by a precise point positioning (PPP) algorithm based on carrier phases only and then use the kinematic positions together with the epoch-specific covariance information as pseudo-observations. The weights of the observables are generally based on the RMS of the corresponding residuals, i.e., 0.7 m for GPS code, 0.2 cm for GPS carrier phase (L1), 0.7 cm for the ionospherefree linear combination (L3) of carrier phases (L1 and L2) and 0.1-0.3 µm/s for KRR. In the case of ITSG, the relative weights of the different observables are determined by VCE.
All ACs observe inconsistencies between GPS and KRR observations leading to increased noise in the gravity field solutions. This problem seems to be even more serious, if kinematic orbits are used as pseudo-observations instead of the original GPS observables. Both, GFZ and GRGS, downweight the GPS code observable, and GFZ in addition downweights the GPS phases (see Table 1). GRGS moreover limits the resolution of the gravity field contribution determined by GPS to degree and order 40. AIUB down-weights the kinematic positions by an empirically determined factor of 15 2 , and ITSG down-samples the pseudo-observations by a factor of 10. The reason for the inconsistencies between GPS and KRR is still under investigation.
The parameters estimated by the ACs include orbit, instrument and force model parameters (Table 2). Epochwise clock corrections (2880 per day and satellite in case of 30 s GPS sampling) and GPS phase ambiguities (typically 300-400 per day and satellite) are not listed, neither are the gravity model parameters (8277 per month in the case of a maximum degree of 90; coefficients of degrees 0 and 1 are not estimated). It is common practice to set up empirical parameters to absorb instruments noise, but the choice of parameters is not unique. GFZ estimates KRR biases, drifts and once-per-revolution (1/rev) or twice-per-revolution (2/rev) periodic variations every 90 min as originally proposed by Kim (2000). Accelerometer (ACC) biases and scale parameters in all three axes (X, Y, Z) of the instrument frame are estimated in addition with a 3 h time resolution. (An additional parameter set is estimated at the end of the arc, giving a total of nine sets per 24 h arc in this special case). GRGS also relies on a rather dense ACC bias parameterization, while it estimates ACC scale factors once per day and axis.
AIUB applies a more conservative instrument parameterization, but estimates so-called pseudo-stochastic accelerations in the three axes-radial (R), along-track (S) and crosstrack (W )-of the corotating orbital frame every 15 min. The pseudo-stochastic accelerations are estimated to compensate for not only instrument noise, but also all kinds of model deficiencies. They are constrained to zero with uncertainties of σ = 3 × 10 −9 m/s 2 to prevent absorbing time-variable gravity signal (Meyer et al. 2016).
While all other ACs apply very simple noise models (diagonal weight matrices with uniform weight per observable), ITSG applies empirical noise modeling techniques to take correlations between observations over 3 h arcs into account (Ellmer 2018). Consequently, IFG has to deal with fully populated weight matrices. But a realistic noise model can only be achieved by a careful separation between signal and noise. Therefore, ITSG determines constrained daily variations up to a spherical harmonics degree of 40. The monthly mean of the daily estimates is restored in the monthly solution not to impair the signal content. On top of that, ITSG estimates fully populated (symmetric) 3 × 3 ACC scale factor matrices for each day. This measure drastically reduces the artifacts with a period of 161 days that impair the C 20 estimate (Klinger and Mayer-Gürr 2016).
The monthly NEQs are provided by the individual ACs in the SINEX 10 format to the combination center. As addi-  tional information, the maximum degree l max , the number of observations n (reduced by the number of pre-eliminated parameters), the number of unknowns u (reduced by the number of constraints applied on the pre-eliminated parameters), the geophysical constants G M (gravity constant times mass of the Earth) and R (semimajor axis of the reference ellipsoid), the tide system (zero tide/tide free) and the weighted square sum of pre-fit residuals l l l T P P Pl l l are provided in the header of the SINEX files. The weighted square sum of postfit residuals needed for the derivation of VCE-weights and for statistics may then be computed as with dx dx dx = x x x −x x x 0 being the parameters improved and b b b being the right-hand side vector of the normal equation system. All NEQs contain a priori gravity field coefficients x x x 0 , the normal equation matrix N N N , the right-hand side vector b b b and the solution vector x x x.
To simplify the combination on the NEQ level, all but the gravity field parameters are pre-eliminated by the ACs and the individual NEQs are normalized (i.e., each observable is weighted according to Table 1). Despite all these measures, the differences in the choice of observables and in the observation sampling cause huge differences in the number of observations entering the daily normal equations, and the numbers of the pre-eliminated parameters differ significantly, too. Moreover, the various noise modeling strategies cause very different magnitudes of the formal errors. In Sect. 5, a robust combination strategy is introduced.

Combination of normal equations
The monthly normal matrices N N N i of the EGSIEM ACs as well as the right-hand side vectors b b b i are weighted and stacked to form the combined NEQ system This section is devoted to the preparatory work needed to scale the n sol different NEQs to a common set of geophysical constants G M and R and a priori gravity field model, and to derive the relative weights w i .

Transformation to common geophysical constants, tide system and a priori gravity model
The depend on spherical harmonic degree l.
We define a diagonal scale matrix F F F, with elements F jk = 0 for j = k and F j j = f l corresponding to the degree of the coefficients in N N N i and b are the individual design matrices, which are not provided by the ACs) is not allowed to change, which is why the rescaled design matrix must take the form A A A i = A A A i F F F −1 and the individual components of the NEQ are rescaled accordingly: Furthermore, the NEQs have to refer to a common tide system. The tide-free system was selected as the EGSIEM standard. In the NEQs referring to the zero-tide system, a bias of 4.173 × 10 −9 has to be added to the a priori C 20 gravity field coefficient.
Eventually, the NEQs have to be transformed to the common a priori gravity field coefficients x x x 0,ref . If dx dx dx 0 = x x x 0,ref − x x x 0,i , then according to Brockmann (1997) the transformed NEQ is: The weighted square sum of l l l i also has to be adapted:

Empirical scaling to balance the impact of NEQs on the combination
A unweighted combination of the individual NEQs not necessarily results in a combined solution close to the arithmetic mean of the individual solutions. As mentioned in Sect. 1 and further outlined in Sect. 4, the individual NEQs are based on different observables, noise models and parameterizations and therefore differ in their specific degrees of freedom and in the magnitude of the formal errors. Due to these differences, the impact of the individual NEQs on an unweighted combination is almost unpredictable. We know, on the other hand, that each NEQ basically contains the same information representative of the same time span of GRACE observations, and the individual solutions only differ in analysis noise. The latter will be taken into account by the weights derived from the individual solutions in Sect. 5.3.
We now derive empirical factors to balance the impact of the individual NEQs. We define one NEQ (N N N ref , b b b ref ), chosen freely from the individual NEQs, as the reference with a fixed weight of 1. Then, we perform pairwise combinations of the reference NEQ with all other NEQs (N N N i , b and vary the weight w until the RMS of the coefficient-wise differences between the combined solution of Eq. 13 and the solutions to the individual NEQs in Eq. 13 is the same (Fig. 8). The RMS is computed as follows: where K lm stands for the spherical harmonics coefficients C lm and S lm . Consequently, in case of n sol NEQs to be combined, we end up with n sol −1 empirical weights. A combination of all n sol NEQs applying the empirical weights to

Relative weights based on solution noise
According to the argumentation in Sect. 1, we define relative weights representative of the noise content on the solution level. This can be done simply by comparing the individual solutions to their arithmetic mean. An alternative procedure based on VCE, proposed by Jean et al. (2018), is more robust against outliers. The same authors also study different weighting schemes, e.g., coefficient-wise, order-wise or field-wise weights, and conclude that monthly field-wise weights determined by VCE on the solution level are best suited for the combination. We therefore determine fieldwise weights.
The basic idea of VCE on the solution level is to use the individual solutions as pseudo-observations, taking into account all coefficients (in the case of field-wise weights) with equal weight. The design matrices, the weight matrix and consequently also the normal matrices will all become unity matrices of dimension n coef . Introduced into the formulas of VCE (see "Appendix A.1"), the relative weights of iteration k turn out to be The combination on the solution level in the iteration step k iŝ w i,0 = 1/n sol may serve as starting values. We base the computation of relative weights directly on the unfiltered dimensionless spherical harmonics coefficients x x x i , but in principle filtered versions or coefficients transformed to EWH may be considered as well (the former to decrease the impact of the noisy high-degree coefficients on the weights, the latter to increase it). Figure 9 shows the weights determined by VCE on the solution level for the four EGSIEM AC's monthly solutions of January 2006 and Fig. 10 the corresponding noise levels for each iteration step, assessed by the weighted STD of anomalies over ocean areas (see Sect. 3). Usually, convergence is reached after four iteration steps.
For comparison, we computed monthly combinations on the NEQ level spanning the two years 2006-2007 based on different weighting schemes: -applying no weights at all, -defining the relative weights iteratively by standard VCE on the NEQ level, i.e., without empirical factors (labeled "NEQ-VCE" in Fig. 11), -applying empirical factors to balance the impact of the individual contributions (arithmetic mean on the NEQ level), -basing the relative weights on the solution noise by multiplying the empirical factors by weights determined by VCE on the solution level (labeled "EGSIEM-COMB" in Fig. 11).
For the individual time-series and all four combination schemes, the monthly standard deviations of anomalies over the oceans were computed to assess their noise content (Fig. 11). The quality of the combined gravity fields is mainly driven by the outstanding ITSG contribution. Comparing the combined gravity fields, only the combination taking the solution noise into account reaches the noise level of ITSG. In the rare occasion where a monthly ITSG gravity field shows a slightly increased noise level (e.g., in September 2006), the combination surpasses all individual contributions. The arithmetic mean on the NEQ level (based on the empirical factors to balance the impact of the individual NEQs) performs slightly worse than the "EGSIEM-COMB." This result differs from the conclusion of Sakumura et al. (2014) (studying gravity field combinations on the solution level) that the arithmetic mean of the gravity fields of different ACs performs best. Contrary to Sakumura et al. (2014), who combined time-series of very homogeneous quality, we are confronted with more diverse noise levels and therefore, as already mentioned by Jean et al. (2018), the benefit of relative weights becomes apparent.
Neither the combination of unscaled and unweighted NEQs, nor the combination based on VCE on the NEQ level can reach the quality of the combinations based on the empirical balancing factors. As mentioned before and also discussed in Sect. 6, this is explained by the different processing and especially noise modeling strategies of the individual ACs that have to be taken into account.
Note that due to the different orbit parameterizations, the combined monthly gravity fields do not correspond to one and the same satellite orbit valid for all ACs. While the AC-specific parameters are pre-eliminated prior to combination and the correlations between the local and the gravity field parameters are kept, a solution of the pre-eliminated parameters by re-substitution of the combined gravity field coefficients would lead to, e.g., different initial state vectors and increased residuals compared to the individual solutions. As long as we have to deal with diverse parameterizations, there exists nothing like an optimal common set of orbit parameters.

Evaluation of combined monthly solutions
As long as no signal biases impede the combination, the fieldwise weights derived by VCE on the solution level provide a robust quality indicator for the monthly gravity fields provided by the EGSIEM ACs. Together with quality indicators based on the anomalies, noise levels can be characterized and signal attenuation due to regularization can be detected. The monthly relative weights based on VCE on the solution level, the empirical scaling factors to achieve the same impact of the individual contributions on the combination, the final weights resulting as the product of the VCE weights and the empirical factors, and for comparison also the weights that would result from VCE on the NEQ level are visualized in Fig. 12  The weights derived by VCE on the NEQ level differ significantly from the final weights used for the EGSIEM combination. The fundamental difference is the low weight assigned to the ITSG contribution. This is explained by the empirical noise model applied by ITSG that leads to realistic, i.e., significantly larger formal errors compared to the other time-series. It remains unclear why the weights derived by VCE on the NEQ level are much less favorable for GFZ than for GRGS. Both ACs base their processing on the original GPS phase observations, and their formal errors are comparable, at least for the dominating medium to high-degree SHC. We conclude that VCE on the NEQ level does not necessarily produce optimal weights if NEQs stemming from different analysis approaches have to be combined.
In the presence of signal biases, the differences between the individual contributions and their mean values include the signal biases and the weights derived by VCE on the solution level for a biased contribution are smaller than expected from noise only. In this case, the relative weights are no longer representative of the different noise levels. Consequently, small VCE-derived weights together with small noise, as illustrated, e.g., by small anomalies over the oceans, indicate signal biases. In our case, all contributions of the EGSIEM ACs passed the quality control and no signal biases could be detected.
The noise level of the combined solution is also independently evaluated by means of anomalies in the spherical 3) was derived from the monthly arithmetic mean values of all the time-series available at ICGEM having passed quality control according to Jean et al. (2018). Figure 14 compares the RMS of degree amplitudes of EWH anomalies in the spherical harmonic domain to the three official RL05 time-series (evaluated for the time span 2004-2010) of the GRACE SDS ACs. Beyond degree 30, the degree amplitudes of the anomalies in general are dominated by noise [see e.g., Jean et al. (2018)]. The RMS of the anomalies of the EGSIEM-combined solutions is smaller than that of the GRACE SDS RL05 time-series.
To exclude the effect of the noisy high-order spherical harmonics coefficients which normally are attenuated by postprocessing filters [e.g., Kusche (2007)], all gravity fields were truncated at order 29. But also the truncated degree amplitudes of the anomalies that focus on the part of the spectrum essential for geophysical analysis (dashed lines in Fig. 14) are smaller in the EGSIEM-combined gravity fields than in the GRACE SDS solutions. Figure 15 shows the RMS values of anomalies of the GRACE SDS time-series and the EGSIEM combination in the spatial domain. All gravity fields were smoothed with a 400 km Gauss filter (Wahr et al. 1998). The RMS values of the combined anomalies show significantly reduced noise stripes, which are typical for the GRACE monthly gravity fields. Also, the residual signal seems to be less affected by stripes on the continents. A similar evaluation comparing the combined gravity fields with the individual EGSIEM ACs time-series can be found in Jäggi et al. (2019). Figure 16 compares the combined gravity fields to the individual EGSIEM ACs' time-series. The monthly RMS values of anomalies over the oceans are computed in order to assess the noise levels of individual solutions. Note that for 2006-2007 the combined solutions include the GFZ contribution. The ITSG contribution is clearly less noisy than the other individual ACs' time-series in this evaluation. But with the exception of very few months, the noise level of the combined gravity fields is as small as or even smaller than that of ITSG. Note that poor quality of the solutions in January 2004   Fig. 16), the quality of the monthly gravity fields is impaired by the orbit resonances.

Conclusions and outlook
We presented the prototype of a combination service for monthly gravity fields, which was implemented in the frame of the EGSIEM project. The monthly gravity fields provided by the associated ACs show different noise levels due to different processing approaches: The number of observations used per month varies between 500,000 and 3,000,000, the number of estimated parameters between 5000 and 50,000. Moreover, the noise modeling techniques and parameter types differ substantially.
The combination is performed on the NEQ level to correctly take into account correlations between parameters. Relative weights, representative of the different noise levels, are derived by VCE on the solution level, i.e., by iterative comparison of the individual gravity fields to their weighted mean. The intrinsic weights of the individual NEQs are removed by a robust empirical procedure balancing the impact of the individual NEQs on the pairwise combinations.
Combined gravity fields were computed from three or four ACs for the time span between 2004 and 2010. An independent evaluation of the noise levels indicates that the quality of the best individual contribution (ITSG) is achieved or even topped by the combinations in 90% of the monthly solutions. Outliers can be identified with data problems. Compared to the official GRACE SDS monthly gravity fields, the anomalies of the EGSIEM combinations that are derived to assess the noise level are smaller. The original goal to provide consistent, reliable and validated gravity fields therefore is met.
The noise level differences of the individual time-series are striking. With a more homogeneous quality of the input series, the combinations should improve substantially as well. First, experiments with the new GRACE SDS RL06 time-series indicate a big step forward in this direction. With the availability of the new GRACE L1B-RL03 observational data and the SDS RL06 gravity fields, now a final combination of all GRACE time-series becomes feasible.
The EGSIEM initiative for gravity field combination is continuing with COST-G under the umbrella of the IAG. Since it cannot be expected that the GRACE SDS ACs will reprocess the whole GRACE time-series to be in accordance with the EGSIEM standards, the COST-G standards will be adapted to only specify the signal content of the monthly gravity fields which should include nontidal oceanographic, hydrological, glaciological and GIA signal to the full extent.
The combined normal equation system of iteration step k is compiled based on all contributing normal matrices N N N i and right-hand-side vectors b b b i applying the relative weights To compute the corrections dx dx dx k to the a priori gravity model, the normal equation system is solved in each iteration. As no original observations or design matrices are available, the weighted square sum of residuals for iteration step k has to be computed from the pre-fit residuals according to To compute partial redundancies k tr(N N N i where n i is the number of original observations l l l i , the inverse of the combined normal matrix N N N k = N i=0 w i,k N N N i is needed.
Eventually, the variance components for iteration step k + 1 are derived according to and the corresponding weights are In the case of the EGSIEM combination service, all arcspecific parameters and nongravitational model parameters are pre-eliminated. The number of original observations n i has to be reduced by the number of pre-eliminated parameters.
To avoid the costly computation of tr(N N N i N N N −1 k ) and the solution of the normal equation system for each iteration step, approximate procedures are available, which are not discussed here (see, e.g., Koch and Kusche 2002).