An ensemble of eddy-permitting global ocean reanalyses from the MyOcean project

A set of four eddy-permitting global ocean reanalyses produced in the framework of the MyOcean project have been compared over the altimetry period 1993–2011. The main differences among the reanalyses used here come from the data assimilation scheme implemented to control the ocean state by inserting reprocessed observations of sea surface temperature (SST), in situ temperature and salinity profiles, sea level anomaly and sea-ice concentration. A first objective of this work includes assessing the interannual variability and trends for a series of parameters, usually considered in the community as essential ocean variables: SST, sea surface salinity, temperature and salinity averaged over meaningful layers of the water column, sea level, transports across pre-defined sections, and sea ice parameters. The eddy-permitting nature of the global reanalyses allows also to estimate eddy kinetic energy. The results show that in general there is a good consistency between the different reanalyses. An intercomparison against experiments without data assimilation was done during the MyOcean project and we conclude that data assimilation is crucial for correctly simulating some quantities such as regional trends of sea level as well as the eddy kinetic energy. A second objective is to show that the ensemble mean of reanalyses can be evaluated as one single system regarding its reliability in reproducing the climate signals, where both variability and uncertainties are assessed through the ensemble spread and signal-to-noise ratio. The main advantage of having access to several reanalyses differing in the way data assimilation is performed is that it becomes possible to assess part of the total uncertainty. Given the fact that we use very similar ocean models and atmospheric forcing, we can conclude that the spread of the ensemble of reanalyses is mainly representative of our ability to gauge uncertainty in the assimilation methods. This uncertainty changes a lot from one ocean parameter to another, especially in global indices. However, despite several caveats in the design of the multi-system ensemble, the main conclusion from this study is that an eddy-permitting multi-system ensemble approach has become mature and our results provide a first step towards a systematic comparison of eddy-permitting global ocean reanalyses aimed at providing robust conclusions on the recent evolution of the oceanic state.


Introduction
There is an increasing need for estimating the present and past three-dimensional state of the ocean in the context of ocean monitoring, climate variability assessments and predictability purposes, such as the initialization and validation of long-range (i.e. seasonal and decadal) forecasts. However, the oceans remain seriously under-sampled and observational time series are often of limited usefulness to generate the required ocean estimates and ocean change indicators due to the short periods of coverage and sparse geographical distributions. On the other hand, over the course of the past few decades, considerable advancements have been made in the development of ocean data assimilation techniques which combine ocean models, atmospheric forcing fluxes and ocean observations, and a number of ocean data assimilation systems have been developed to estimate the time-evolving, three-dimensional state of the ocean. These combinations are known as ocean reanalyses (REAs) and their production is a recent activity that started approximately at the beginning of year 2000. Since then considerable progress has been made and today REA production is an established reality in several research and operational centers where REAs use advanced multivariate data assimilation schemes that allow assimilation of most of the available types of observation. There are low resolution REAs (about 1°), spanning long time periods (typically 50 years), as well as higher resolution products (about 1°/4°), which exhibit eddy permitting capabilities and are available for shorter records (usually the altimeter period 1993-onwards).
In particular, two communities are devoting effort in exploiting existing and new ocean REAs for a variety of purposes such as quantifying improvements in quality and uncertainty, and defining indices for ocean monitoring: 1. The Ocean Re-analyses Intercomparison Projects (ORA-IP), undertaken by the GOV and CLIVAR/ GSOP communities (Balmaseda et al. 2014). Rather than following a fixed protocol, ORA-IP exploits the existing reanalysis products, taking advantage of the diversity to gain insight on how robust our knowledge of the ocean is. As part of this effort, a large suite of indices and diagnostic quantities obtained from various ocean reanalysis products are compared and evaluated using observations where available. Many papers on the ORA-IP are included in this Special Issue. 2. The EU funded MyOcean project (www.myocean.eu), among several goals, aimed at providing a series of validated eddy-permitting and/or eddy-resolving global and European Seas REAs covering the recent "altimetry era" (namely 1 January 1993-onwards) (Ferry et al. 2012). These products are not only targeted towards the climate community but also to fisheries and offshore industry, and foster intermediate and downstream services for the benefit of agencies with environmental assessment responsibilities and monitoring duties.
This work is intended to provide a description of the state-of-the-art of eddy-permitting global ocean REAs produced in the framework of the MyOcean project using, wherever possible, an ensemble approach. The work also illustrates examples of possible validation strategies with the purpose of showing the applicability of these products for a wide range of scientific investigations, and to other relevant communities interested in the assessment of oceanic conditions at global and regional scales over recent decades. The coordinated European MyOcean effort on the global REAs intercomparison aimed at providing recommendations for future REAs production by identifying the weaknesses of existing individual systems and the suitability of an ensemble approach. Furthermore, it was intended to give feedback on how to improve the ocean observing system, assimilation methods, models and surface fluxes, and how to promote interaction with the user community and to encourage the archive of individual reanalysis products in public data repositories freely available to all users (www.myocean.eu).
Thanks to satellite altimetry, sea level displacements associated with ocean eddies have been observed with a few centimeter accuracy for more than two decades, and there is evidence both from observations and modeling studies that eddies play a role in the meridional transport of heat (e.g. Souza et al. 2011;Smith et al. 2000;Valdivieso et al. 2014). There is also evidence that ocean mesoscale features have an impact on the atmospheric winds (e.g. Chelton et al. 2004;Maloney and Chelton 2006). That is why including mesoscale features in REAs with constantly increasing resolution is an important issue and contributes to improving our understanding of ocean and climate variability.
The next generation of operational climate prediction systems will implement eddy-permitting ocean models, and it is therefore urgent to assess the capability of the global ocean REAs to be able to provide good quality initial conditions for such systems. The eddy variability is still poorly represented in global ocean models, despite its acknowledged important contribution to oceanic variability and its expected impact on climate variability in the upcoming generation of coupled models which include eddying oceans.
Nevertheless, one has to keep in mind that subsurface ocean observations are scarce before the 2000s (i.e. prior to the full deployment of Argo floats) and that large uncertainties may exist in the ocean REAs, making a robust estimation of the ocean history with reliable error bars still a 1 3 major challenge. A way to have access to both the variability and uncertainty estimates of REAs is to perform a multisystem ensemble of ocean REAs. The multi-system ensemble approach used in this study is presented in Sect. 2.
This work is based on specific analysis of a series of parameters, usually considered in the community as essential ocean variables (EOV), or a proxy for these variables: sea surface temperature (SST), sea surface salinity (SSS), temperature and salinity averaged over meaningful layers of the water column, sea level, transport across pre-defined sections, and sea ice parameters. The eddy-permitting nature of the global REAs is also estimated in terms of spatial pattern and level of eddy kinetic energy (EKE). All the results are shown in Sect. 3, and Sect. 4 includes an extensive discussion on the advantages and limitations of this study.

Global eddy-permitting reanalyses: an ensemble approach
The MyOcean global ocean reanalysis activity provided a series of eddy permitting global ocean REAs at 1/4° horizontal resolution constrained by assimilation of observations and covering the recent period during which altimeter data are available (period starting with the launch of TOPEX POSEIDON and ERS-1 satellites at the end of 1992). Four available global ocean eddy-permitting REAs (from CMCC, University of Reading, Mercator Océan and ECMWF) were built by using state-of-the-art ocean data assimilation systems. They assimilate, in different ways, reprocessed observations of SST, in situ temperature and salinity profiles and sea level anomaly (SLA). All the REAs cover at least the "altimetric era" (namely 1 January 1993 until 31 December 2011), which is analysed in this work. Some REAs are also available for longer periods into the past and are regularly updated to the present. Based on observational evidence that the time scales of mesoscale variability is generally lower than 100 days (see for example Fu and Cazenave 2001) and that the only variability that we can resolve here is characterized by length scales of the order of tens of kilometers or higher we decided to use monthly means as the minimum representative time scale. Therefore, monthly averages of each dataset are used, meaning that from this analysis only seasonal to interannual variability at eddy permitting scales can be captured and discussed. The use of monthly means also has the advantage to mitigate the problem related with possible jumps introduced by incremental assimilation over a time window of several days, since time averaging smooths any discontinuities out. The numerical products used in this work are freely available for users (www.myocean.eu).
In Table 1 we summarize the different characteristics of In addition to the REAs, we include in this study two observation-only products, generated by statistical interpolation of in situ observations. These products are CORA (v3.4, Cabanes et al. 2013) andEN3 (v2a, Ingleby andHuddleston 2007), produced by Ifremer and MetOffice Hadley Center, respectively. The advantage of dealing with a multi-system ensemble is that it is possible to estimate the uncertainties associated with the reanalysis systems used and gain insight on the signal to noise ratio of the ocean state estimations. Thanks to the four available global ocean eddy-permitting REAs an ensemble mean (EM) is computed, from which the spread is used to infer the «reliability envelop», or the uncertainty, for each ocean indicator. The EM of REAs can indeed be evaluated as one single system regarding its reliability in reproducing the climate signals, where both variability and uncertainties are assessed through the ensemble spread (ES), here quantified as the root-mean-square deviation of the ensemble members from the EM, and the signal-to-noise ratio, defined as the ratio between EM and ES. The ocean state is then analysed and discussed, based on the REAs ensemble compared to the observation-onlybased products, versus the same comparison done using each single reanalysis. This evaluation can help to establish the strategy of a single versus an ensemble reanalysis system. There are many sources of uncertainty in ocean modeling (e.g. parameterized processes, initialization, atmospheric forcing and numerical implementation, etc.) that lead to differences between the true values (unknown) and the measured or modeled values of the physical properties. Data assimilation can in principle reduce the uncertainties by combining dynamical models and observations. Our reanalysis systems differ mainly by the data assimilation schemes implemented to control the ocean state (data assimilation method itself, background error covariances, observation errors, bias correction schemes, etc.) as they share a similar OGCM configuration (ORCA025 with 75 levels for three REAs and 50 levels for one reanalysis), model (NEMO, Madec 2008) and surface forcing data (ERA-Interim, Dee et al. 2011). Therefore, the spread of the ensemble of REAs is assumed to be mainly representative of our ability to gauge uncertainty in the assimilation methods. However, part of the uncertainty comes also from the assimilated observations themselves, the spin up, and the surface forcing treated in different ways (bulk formula and corrections vary from one system to the other). In summary, although this is not a fully multi-system approach because of the same ocean model, the ensemble does partly span the uncertainty linked with the ocean model parametrizations and initial conditions, even if it is not possible to assess the contribution of the individual sources of uncertainty to the ensemble spread.

Results
In this section we present the main results from the intercomparison of the eddy-permitting REAs with a focus on multi-system ensemble, compared with observed products. The main objective is to identify issues and challenges which still remain unsolved in order to be able to identify the most robust global and regional trends of climate indices, and the capability of the ensemble of these REAs to capture the large scale ocean eddy variability. As global ocean REAs are primary tools for investigating variability of ocean indicators of climate change, in the following sections we focus on trends of key variables such as heat content, sea level, sea-ice and transports.
Tables 2, 3, 4, 5, 6, 7 and 8 summarize the statistics (correlations with respect to the verification dataset and linear trends) for the different products and the different variables. All the trends are calculated with monthly mean It should be noted that the lower vertical resolution of one of the REAs does not seem to affect trends and biases of vertically integrated variables such as heat content and volume transports when compared with the higher vertical resolution products. However, a systematic study of the vertical resolution impact has not been tried and we cannot exclude that some of the upper ocean estimates might be affected.
In the longitude-latitude plots of Figs. 2, 4 and 5 we also show the signal-to-noise ratio (see Sect. 2 for definition) that is indicative of the significance of the trends.

Temperature
The EM, as well as each reanalysis product, show SST globally average values which have seasonal and interannual variability, as well as linear trends, consistent with NOAA/AVHRR data (Reynolds et al. 2007) (Fig. 1). All 1 3 products show a very high correlation index with respect to NOAA/AVHRR data. The biases of each reanalysis, and of the EM, are generally small and correlations with observations extremely high even when the seasonal cycle is removed (from 0.89 to 0.99). The EM trend, 1.60 °C/century, is also in very good agreement with the observationonly 1.79 °C/century trend (see also Good et al. 2007) and they share the same uncertainty suggesting that both the trend and its uncertainty are robust and coherent among the REAs and the satellite-derived estimates. The only notable difference among the REAs is a tendency to be warmer than the NOAA/AVHRR product. This behavior is common to all the REAs, with the exception of ORAP.5, and is mainly due to the fact that the SST calculated from REAs is a daily mean, while NOAA/AVHRR analysis estimates nighttime SST (Reynolds et al. 2007).

3
The spatial pattern of the linear trend is very well captured by the EM and the signal-to-noise ratio indicates that the consistency among the REAs is high almost everywhere (Fig. 2a). The small areas between the solid and dashed contours that represent the only regions where the value of the EM trend is smaller than the ES suggest that there is high consistency in most of the regions. The highest ES is confined in the mesoscale regions of the western continental boundaries likely related to a shift of the separation of the western boundary currents and in the Antarctic circumpolar current (ACC) region (Fig. 2c). The high performance of the REAs in estimating SST is however not surprising due to the fact that all the systems assimilate SST either directly or use SST to correct the turbulent fluxes through a restoring term (see "Appendix 1" for details and also Valdivieso et al., submitted to Clim. Dyn. ORA-IP Special Issue). Therefore also the spatial linear trends are consistent with the observations, indicating cooling patterns in the Pacific Ocean suggested to be induced by an acceleration of the trade winds, an increase of the equatorial upwelling and a spin-up of the subtropical gyre (England et al. 2014).
The time series of the global ocean temperature averaged over two layers, the top 800 m, and the top 2000 m, are presented in Fig. 3. All the REAs, with the exception of GLORYS2V3, show in general a mean temperature colder than both CORA3.4 and EN3_V2 in the upper layer. In the deeper layer CORA3.4 and all the REAs, with the only exception of GLORYS2V3, are colder than EN3_V2.
Interestingly, all the REAs except C-GLORS4 exhibit a linear trend of the global ocean heat content that is significantly higher than the two observation-only products CORA3.4 and EN3_v2. This leads to an overestimation of the EM trend which is 0.78 °C/century as opposed to 0.58 and 0.35 °C/century for the observation-only products in the upper layer (Table 3). In the deeper layer the EM trend is 0.35 °C/century as opposed to 0.26 and 0.22 °C/century for the observation-only products (Table 4).
It is not straightforward to identify all the possible causes of discrepancies among the REAs and the observation-only products. The way that data are assimilated and the model dynamics, along with different initial conditions and a poor observing network in the 1990s, are the most likely candidates to explain the differences. The observations are not abundant enough to constrain the global ocean heat content down to a depth of 800 m, and even less so to 2000 m, and the assimilation methods, depending by construction on model vertical covariances, might in some cases induce spurious drifts. Furthermore, it should be noted that the observation-based products are highly dependent on the gap filling methods, the corrections applied to the observations, etc. (see Abraham et al. 2013 for a comprehensive review of the most recent observationonly ocean heat content estimates) and therefore cannot be considered as "real" observations sensu stricto. A final explanation of the differences between the observations and REAs in observational space could be achieved through a detailed analysis of the assimilation statistics, for example analyzing the obs-model differences before and after the assimilation. However, this kind an analysis is beyond the scope of the present work and can be partially found in the validation documentation of the MyOcean products (Quality Information Documents).
The spread among the REAs and between the observation-only products for global averages is also symptomatic of the lack of observations during the first decade, especially in the southern hemisphere. As mentioned by Cheng and Zhu (2014) the evolution of the observation network during recent decades induce some artifacts in the heat . However, both the seasonal variability and the anomalies with respect to climatological values are well reproduced in general by all products, and in particular by the EM for both the vertical layers investigated (Fig. 3, right panes). The spread among the REAs is reduced by at least half when the anomalies of each reanalysis with respect to its own climatology are considered, suggesting that the interannual variability is well captured. When the anomalies are considered, both the observation-only products fall inside the ES of the REAs. In upper ocean heat content the decrease of the spread is more evident after 2000, likely due to some beneficial impact of the Argo system. However, despite the greatly improved open ocean coverage by the Argo array since 2005, a wide spread in interannual rates still remains in our estimates, as well as between observational estimates (Abraham et al. 2013).
ORAP5.0 seems to behave differently from the other REAs after 2002 showing an increasing trend instead of the flattening characteristic of the ongoing warming hiatus (England et al. 2014). In ORAP5.0 SST is used to correct the turbulent surface heat fluxes via a restoring term. From 1993 onwards, different SST datasets have been used for this restoration depending on availability of the dataset. OSTIA reanalysis is only available from 1985 to 2007, and OSTIA operational is only available from 2009 onwards. The gap year of 2008 is filled by using 1/4° Reynolds OIv2 dataset (see "Appendix 1" for details). It is clear from Fig. 1 that the ORAP5 SST before 2008 (when it uses OSTIA), are systematically colder than in the other REAs. The increasing trend in the input SST also reflects on the ocean heat content, and it also likely affects the trends in the global upper ocean heat content which does not show a plateau in either layers (Fig. 3). Whether this is a specific problem with the combination OSTIA reanalyses-OSTIA operational, or just reflects the uncertainty in SST products needs to be established.
It is worth noting, even if difficult to explain, that the EM, when compared simultaneously with both observation-only products, has correlations higher than any other single reanalysis in both layers. This happens even when the correlations are calculated without the seasonal cycle  (Tables 3, 4). Possible reasons for this behaviour are discussed in the last section.

Sea surface salinity
The evaluation of the SSS trends is more difficult due to the fact that, unlike SST, a reliable reprocessed SSS product is not available for the period 1993-2011 because of the scarcity of data. From the EM the positive SSS trend over most of the Atlantic and tropical Pacific Oceans seems robust, at least considering the signal-to-noise ratio, even if the amplitude of the trend is overestimated with respect to both the CORA3.4 and EN3_v2 data (Fig. 4). The only large region where the trend is significantly negative is the Southern Ocean south of 40°S and the North Pacific Ocean. Here the signal is consistent among the REAs and EN3_ v2 but is not visible in the CORA3.4, which however has a strong fit to climatology before the full deployment of the Argo network. At high latitudes in the Northern Hemisphere the trend is also negative but the high spread among the REAs does not allow drawing any robust conclusion in this region, which is covered by sea-ice and where the disagreement might be due to sea-ice misplacement. In general, the patterns of the regional trends are consistent with those derived from an observation-only estimate of SSS over the period 1950-2008 by Durack and Wijffels (2010) (see their Fig. 5). We note in particular increasing salinity in the Atlantic and a freshening in the extratropical Pacific and Indonesian region, consistent with an amplification of the existing inter-basin salinity contrast and perhaps corresponding with an amplification of the hydrological cycle. However, the analysis of our ensemble indicates that these trends are associated with higher uncertainties (higher spread) in the South Pacific ITCZ and in the Atlantic ITCZ, i.e. in regions of high precipitation rates (Fig. 4,. It is worth noting that the only two REAs which show high correlations with the observation-only SSS products (not shown) are those where there is a restoring to SSS. As for the SSS, the coherence among the REAs for the global ocean salinity vertically averaged through the two layers, top 800 m and top 2000 m, is very low both in terms of  . Top panel also shows signal-to-noise contours, i.e. solid (dashed) contours correspond to positive (negative) trends greater than the ES. The globally averaged trend is subtracted for each dataset in order to show the trend of the dynamic sea level rather than the total sea level and provide comparable estimates of the locally varying sea level rise trends and interannual variability (not shown). A very general but not robust conclusion is that all the REAs in the upper layer show a positive significant trend over the period considered, coherently with the observations alone even if with large spread, suggesting evidence of a salinification tendency of the surface ocean during the last two decades (to be further investigated).
In summary, the variability between all estimates is very large for the global freshwater content of the ocean and, as a consequence, estimates of the ocean's freshwater content are still affected by large uncertainties. The reasons for this behavior are several. First, the global mean sea level as observed by altimetry cannot be represented by the REAs, due to the fixed volume of the NEMO ocean model. The sea level information is projected onto the model internal variability modes, giving rise to possible spurious salinity drifts in the reanalyses which do not relax to any climatology. In addition, the overestimation of precipitation in the Tropics by ERA-Interim freshens the waters in the first decade, and is then corrected when Argo observations start to be assimilated in the second decade.

Sea level
The global mean sea level (GMSL) trend cannot be estimated by these REAs because they do not reproduce the water mass contribution to the sea level and the global steric contribution, which can be only diagnosed a posteriori. Some reanalysis products such as ORAP5.0 and GLORYS2V3 are constructed to closely match the GMSL estimated by AVISO (Ducet et al. 2000) so as to bypass the limitations of the NEMO model that, due to the Boussinesq approximation and the climatological definition of river and land-ice runoff, cannot see the temporal evolution of GMSL, neither the steric nor the eustatic part. Therefore, any match or mismatch between the observed GMSL and the model output sea surface height (SSH) should not be directly attributed to reanalysis skill. Consequently we focus our analysis only on the regional sea level variability. Good skill of the ocean REAs in reproducing the regional sea level trends over the same period, i.e. 1993-2011 is shown in Fig. 5. The EM has good skill in representing the spatial distribution of the trends when compared to AVISO, both at the global and regional scales. The global distribution of change in SSH from the EM is plausible and exhibits many realistic features seen in the AVISO altimetry product, including increasing SSH concentrated in the western tropical Pacific, central Indian ocean and Indian sector of the ACC, and to a lesser extent throughout the Atlantic. The spatial patterns of SSH trends are consistent among all the products, as indicated by the signal-to-noise ratio.

Transports
Integrated volume transports (mean and standard deviation) through some notable World Ocean Circulation Experiment (WOCE) sections from the four REAs and their EM are shown in Fig. 6. Compared to estimates based on observations (Ganachaud and Wunsch 2000 and Lumpkin and Speer 2007) all the REAs but the ORAP5.0 overestimate the mean transports in the ACC sections. ORAP5.0 does not use the CORE formulae for momentum and this can explain some of the differences in the transports. It will be  Ganachaud and Wunsch (2000) and Lumpkin and Speer (2007). Bars indicate ±1 standard deviation of the monthly variability 1 3 investigated if the use of the CORE formulae with ERA-Interim winds produces too strong stress along the ACC or, alternatively, if the estimates from inverse methods tend to underestimate the ACC transport. Across all the other sections the products give similar volume transports in good agreement with observation-only estimates.
The Atlantic meridional overturning circulation (AMOC) is a key component of the ocean circulation and of the earth's climate as its strength directly modulates the meridional heat transport (MHT) in the ocean. The RAPID-MOC array (Cunningham et al. 2007) allows a continuous monitoring of the AMOC variability since 2004. We present here direct comparisons of the AMOC estimates from the MyOcean REAs with that of the RAPID array. Figure 7 displays the AMOC at 26.5°N as represented by the four ocean REAs and their EM. The seasonal cycle is clearly visible in the observations and most of the REAs reproduce this reasonably well. The EM is 15.61 ± 2.8 Sv compared to the RAPID mean value of 17.5 ± 4.0 Sv over the 2004-2011 period. The uncertainty of the AMOC estimated from the MyOcean ensemble, displayed in Fig. 7 as the grey shaded area, is also quite realistic as the RAPID AMOC estimate is within or very close to being within the ES.
The associated correlation coefficients for the 2004-2011 period when the RAPID data are available are given in Table 5. All correlation coefficients are statistically significant at the 0.95-level and the EM correlation shows the highest correlation of all when we include the seasonal cycle, and the second highest correlation without the seasonal cycle. It is encouraging that these good correlations are only marginally due to the fact that the ocean syntheses are able to reproduce the seasonal cycle of the AMOC, since all the correlations are decreased by <10 % when the seasonal cycle is removed. However, it is worth noting that the statistics might be affected by the short time series. It is also evident that all the REAs are stable throughout the period under consideration and coherently show a decreasing trend since 2004 in agreement with the observations.
The zonal mean MHT as a function of latitude is shown in Fig. 8 for both the global and the Atlantic Oceans. Global ocean values in the Northern Hemisphere and the Atlantic only values south of 45°N, seem reasonably well reproduced by the EM, with respect to Ganachaud and Wunsch (2000) and Trenberth and Caron (2001) estimates. Also the latitudinal pattern is consistent among the REAs with the only exception being a too strong cross-equatorial heat transport diagnosed at the global scale for GLORYS2V3. In the tropical band (10°N-35°N) the MHT averaged over the global ocean and the North Atlantic is 1.42 ± 0.14 and 0.94 ± 0.09 PW, respectively. In this region all the REAs provide MHTs lower than observational mean estimates from Ganachaud and Wunsch (2000). This feature is common also to the CORE-II lower resolution models (Danabasoglu et al. 2014) and our study seems to indicate that eddy-permitting capability and data assimilation are not enough to induce higher MHT in the Atlantic Ocean. However some significant differences with respect to the results by Danabasoglu et al. (2014) emerge. For example the oceanic heat gain evident in CORE-II models between 45°N and 55°N and attributed to the incorrect path of the North Atlantic Current is shifted southward between 35°N and 45°N, likely related to the Gulf Stream path. It is also evident from our results, as well as from a broader intercomparison performed in the context of ORA-IP (Valdivieso et al.,  the latitudinal variations in the North Atlantic MHT for the REAs is flatter than in the case of forced ocean models, implying a contribution of the data assimilation in the redistribution of zonally averaged regions of warming and cooling within the North Atlantic Ocean. In the Southern Hemisphere the global average MHT of three out of four REAs is drastically different from those inferred from the atmospheric REAs (Trenberth and Caron 2001) which exhibit much larger negative values in the tropics and are everywhere much smoother than those derived from the eddy permitting REAs. The ES in the Southern Hemisphere is clearly larger than in the Northern Hemisphere, presumably because the MHT estimates are constrained by fewer observations. However, despite the larger spread, all the REAs and their EM give global average MHTs which fall inside the range given by Ganachaud and Wunsch (2000).

Sea ice
Correlations and trends from the monthly EM of the sea-ice (Table 6) compare well with the CERSAT data, both in the Arctic and Antarctic. We also calculate correlations for the sea-ice minima and maxima, defined as the minimum and maximum values during each year, and found that correlations for the maxima are always smaller in the Antarctic  (Table 8). For all the other cases the EM shows a correlation that is at least equal to the best performing reanalysis, and higher than any individual product. This feature stems from the fact that REAs are either positively (GLORYS2V3, ORAP5.0) or negatively (UR025.4) biased with respect to the CERSAT sea-ice concentration validating dataset both at interannual (Fig. 9, top panels) and seasonal time scales (Fig. 9, bottom panels), and the ensemble average tends to cancel these biases. Note however that all REAs assimilate sea-ice concentration, which is consequently well constrained, and the biases are likely to reflect the different algorithms used by sea-ice concentration producers and the different methodologies used within sea ice assimilation (see "Appendix 1" for more details). This argument is supported also by the broader intercomparison performed in the context of ORA-IP (Chevallier et al., submitted to Clim. Dyn. ORA-IP Special Issue). Note that the spread is higher during the sea-ice extent maxima in both hemispheres, due to the higher uncertainty of the sea-ice edge extension. This is confirmed by the bottom panels of Fig. 9 that show the 1993-2011 seasonal climatology for the sea-ice extent in both hemispheres.
Correlation skill scores are very high for both hemispheres except for the Antarctic maximum sea-ice extent, when scores drop below 0.9 for all products. The fact that also the correlations for sea ice maxima in the Arctic are lower than those for the minima suggests that one of the reasons may reside in the different algorithms for the detection of sea-ice edge in the assimilated datasets with respect to CERSAT. The dynamical regime in the Southern Ocean may also contribute to these poorer performances.
The negative trend in the Arctic sea-ice minima is found to be significant for all the products (Table 7) and the EM reports a value (−129.68 × 10 9 m 3 /years) in close agreement with satellite estimates (−125.52 × 10 9 m 3 ). Trends in the Antarctic are generally positive but with large uncertainty, and the REAs tend to over-estimate these trends with respect to CERSAT. Aspects of the interannual variability are also well reproduced by all products: in particular the maximum ice area of 1996 and the minimum of 2007 in the Arctic, or the high value of the Antarctic March sea ice in 2008 (Fig. 9, top panels).

Eddy kinetic energy
The eddy-permitting capability of the estimates that we present in this study allows us to address other important aspects of the "ocean climate" like large scale turbulence. One of the major reasons for producing global REAs at higher resolution than those typically used for climate purposes (most products included in the other contributions of this Special Issue) is indeed to try to resolve currents, in particular boundary currents, in a more accurate way and therefore allow more accurate estimates of mass and heat transports (see previous section) not only in a zonal mean sense but also spatially varying. We have therefore analyzed the time mean eddy variability of the REAs by means of the comparison between their EKE and that obtained from the ocean surface current analyses (OSCAR) surface velocities (Bonjean and Lagerloef 2002). EKE is defined here as the difference between the total kinetic energy and the mean kinetic energy derived using the velocity averaged in time over the period 1993-2011.
The EM shows (Fig. 10) realistic patterns of the mesoscale variability at large scale, with quite a good representation of the eddy energy associated with both the Gulf Stream and the Kuroshio pathways, the equatorial currents, and also with the highest EKE regions of the Southern Ocean, i.e. the Malvinas and the Agulhas current regions. Energy levels are realistic in the EM also in the East Australia and the Somali currents region. The EKE structure can also be used as a good indicator of the mean flow structure, and the EM shows good agreement also in terms of the representation of the longitudinal extent and the separation from the coast of the mean western boundary currents. In particular it is worth noting that the EM is not affected by a weakness commonly seen in forced simulations, even at higher resolution, which is the poor representation of the Gulf Stream/North Atlantic current system extending almost zonally across the North Atlantic instead of turning northwestward around the Grand Banks. Even the signature of the Azores Current is remarkably well captured as well as the Kuroshio longitudinal extension. However, some differences characterize each product and the zonal mean average (Fig. 11) shows that in particular ORAP5.0 lacks EKE in the tropics and at middle latitudes, suggesting lower levels of eddy variability in the subtropical gyres. This behaviour is probably due to the fact that ORAP5.0 applies a super-obs. method to the altimeter data in order to reduce the weight given to the altimeter in relation to the in situ observations. Specific analysis of each product is however beyond the scope of this work, which aims at highlighting the usefulness of the ensemble approach. When compared to a control run (basically C-GLORS4 but without data assimilation, cyan line in Fig. 11) it is clear that the amplitude of the EKE of each reanalysis is significantly higher at all latitudes between 60°S-60°N. We can therefore conclude that the EM gives a realistic representation of both the spatial pattern and the strength of the small-scale variability at the global scale, and that the improvement is largely due to the assimilation of observations which can introduce variability that otherwise might only appear in models with much higher spatial resolution.

Discussion and conclusions
The main objective of this work was to indicate benefits and weaknesses of an ensemble approach based on state-ofthe-art available eddy-permitting global ocean REAs covering the altimetric era (1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011) and differing mainly in the way data assimilation is performed. This synthesis was produced not only to facilitate and encourage the use of the REAs, but also to provide a wide range of relevant user communities with a general assessment of the state of the global ocean for a variety of ocean parameters (T, S, sea level, volume and heat transports, sea-ice, etc.) at the global scale over recent decades, and an estimate of the ocean variability and trends over the same period.
The results show that in general there is good consistency between the different REAs. Robust features are present in the MyOcean reanalysis products, which tend to show that eddy-permitting ocean REAs have become mature. An intercomparison with experiments without data assimilation was done during the MyOcean project and we can conclude that data assimilation is crucial for correctly simulating some quantities such as regional trends of sea level, and in particular the EKE. The main advantage of having access to several REAs differing in the way data assimilation is performed is that it becomes possible to sample the uncertainty due to the data assimilation methods used. This uncertainty changes a lot from one ocean parameter to another, but the main conclusion is that a multi-system ensemble of the past ocean state is able to efficiently sample the true ocean state, and provide ocean state estimates at least as accurate as the best performing reanalysis and in some cases better than any single reanalysis. This result is only slightly altered when the seasonal cycle is removed.
Regarding the multi-system approach, it has been shown to be successful in improving prediction of weather and climate over a range of scales, as well as their uncertainties. Multi-model ensembles provide more reliable seasonal forecasts and were commonly used for the assessment of uncertainty of climate change predictions e.g. as in the Assessment Report 5 of the intergovernamental panel on climate change (IPCC). In the field of seasonal forecasting it was shown (Hagedorn et al. 2005) that the multi-model superiority over single model skill is due not only to error compensations but in particular to its greater consistency and reliability. Nevertheless, this ability cannot necessarily be extrapolated to all the multi-system strategies, including ours. It is beyond the scope of the present work to provide a definitive and robust explanation of the reasons underlying the fact that the EM is, in most cases, superior to any single member. We point out however that the degree of improvement that can be achieved with a multi-model depends strongly on the choice of reference. If the reference is the 1 3 best single model it is clear that the improvement of the multi-model cannot be significant. A more fair comparison should be done always using the same single model, and not the best single model for each process or variable considered. In this way, the argument for employing a multisystem, and its better performance across the whole range of variables, becomes more justified. Error cancellation is likely to be the main reason for the multi-system performance being superior to the average single-system performance. In the case of eddy-permitting experiments, we argue for example that the eddies in any single reanalysis are likely to be out of phase with the observations, but the EM better averages the eddies out and allows the largescale fit to improve. Or, more broadly speaking, since for most of the processes that we investigate the dominant variance comes from the seasonal cycle, the ensemble partially averages out other variability, and thus has the highest correlation because it best reproduces the seasonal cycle.
It is clear that our experimental set up does not represent an "ensemble" sensu stricto. "Ensemble" has indeed a specific meaning in both data assimilation and climate simulations, i.e. a set of perturbed experiments spanning some source of uncertainty. In the case of this paper, the analysis is mainly on the sensitivity of similar modelling tools to different choices of the assimilation systems. With the objective of identifying common and robust oceanic features among the existing eddy-permitting ocean REAs the use a multi-system ensemble approach was followed, where the signal and its associated uncertainty are measured by the EM and ES, respectively. To this end, diagnostics characterizing the distribution of the different reanalyses as members of an ensemble were shown, and the "spread" was quantified as the root-mean-square deviation of the ensemble members with respect to the EM.
However, there are two main limitations in the present work: 1. The estimation of the uncertainty is quantified only by using a multi-system ensemble approach; 2. The number of members is small, due to the fact that the REAs used are expensive state-of-the-art eddy-permitting products.
Another limitation affecting the REAs used in this work is the fact that they are not completely consistent with the model physics and that the dynamical and thermodynamical budgets are not closed. This is one of the most difficult, and still unresolved issues concerning the field of data assimilation: how far is the estimate from a physically consistent solution? How can we quantify the difference between an ocean-only simulation and the equivalent assimilated experiment and decide which is the best? Maybe the question should be posed in a different way: which is the best realization for the particular application that we have in mind? If the purpose, as it is here, is to estimate the ocean state and its variability over the most recent decades, we believe that a carefully validated reanalysis which beats the ocean-only simulation in terms of bias and root mean square error (RMSE) with respect to observations, could be the best solution. This is true for any single reanalysis as well as for an EM of good quality REAs. On the other hand, if one wants to use REAs to understand the governing ocean physics, a careful analysis of the dynamical and thermodynamical differences between the two solutions differing only because of data assimilation would be required. This would imply an accurate quantification of surface budgets, which are in general not closed, even in forced simulations, where non zero air-sea heat and fresh water fluxes are common, and rarely taken into account properly when dealing with process studies. A detailed analysis of the surface heat fluxes from different ocean REAs and their EM is contained in a contribution to this special issue (Valdivieso et al., under review) and suggests that most of the reanalysis products show mean flux biases of similar sign.
Here the purpose is to investigate if these products are comparable and able to provide new insight into the ocean state and its variability. To this end, the work shows for example that the eddying nature of the ocean, represented by increased values of EKE with respect to a control simulation (i.e. with no data assimilation), is captured in a coherent way by all the REAs. The ensemble approach seems therefore to be quite appropriate to investigate the eddy variability both in terms of spatial pattern and EKE amplitude. This also suggests that the assimilation of observations in eddy-permitting systems is able to drive the ocean towards energy levels at least comparable to the observation-only estimates and not normally generated by the model resolution used here, i.e. 1/4°.
For some variables, such as the heat content, the spread among the REAs is still high due to the lack of observations during the first decade especially in the Southern Ocean, and the EM trend overestimates that observed. On the other hand, if no data are assimilated, as in the reference simulation (not shown), the heat content trends are strongly underestimated.
Not surprisingly, since the SST is either directly assimilated or nudged by all the reanalysis systems, the global mean trend and its regional spatial pattern is well captured by the EM, and the signal-to-noise indicates that the consistency among the REAs is very high with the highest spread in the mesoscale regions of the western continental boundaries and in the ACC region. On the other hand, the evaluation of the SSS trends is much more difficult due to the scarcity of data. The only conclusion that can be made is that, from the EM the positive SSS trend over most of the global ocean seems robust, at least considering the signal-to-noise ratio, and the amplitude of the trend is overestimated with respect to observation-only products. In summary, the variability between all estimates is very large for the global freshwater content of the ocean, highlighting the fact that the ocean was historically under-sampled for salinity, and demonstrating a general problem of existing ocean simulations, as well as REAs, in determining the freshwater content from observations. As a consequence, estimates of the ocean's freshwater content are affected by large uncertainties and differences in the assimilation methodology may cause artificial signals.
The volume transports at the WOCE sections show good agreement with those derived from observations (Ganachaud and Wunsch 2000). The EM of the AMOC is about 2 Sv weaker than the RAPID value (15.61 ± 2.8 Sv compared to the RAPID mean value of 17.5 ± 4.0 Sv) but the associated error estimated from the global ocean REAs is quite realistic because the RAPID AMOC estimate is most of the time very close to being, or within, the ES.
Less robust conclusions can be drawn from the MHTs, even if for this variable it is more difficult to assess the REAs against observation-only products. In general, the latitudinal pattern is different among the REAs in the Southern Hemisphere where the EM heat transport is lower than the values inferred from the smoother atmospheric REAs (Trenberth and Caron 2001), however the values still lie within the range given by Ganachaud and Wunsch (2000).
Regarding the sea-ice the ensemble approach has been shown to be adequate for the Arctic region and for the minimum concentration in the Antarctic. The higher spread during sea-ice extent maxima is evident in both hemispheres and is due to the higher uncertainty affecting sea-ice extension during the respective winter seasons. An important robust conclusion is the negative trend in the Arctic sea-ice minima, which is found to be significant for all products, which also have an EM value matching well the satellite estimates (about −130 ± 26 × 10 9 m 3 /year).
In summary, even if the ensemble approach for eddypermitting global REAs seems promising, the results also suggest that there is quite a spread among these products, especially in global indices, even if most of them are based on a similar ocean model. This uncertainty comes mainly from the data assimilation methods, but not only from these: the observations assimilated, the spin up and the surface forcings are also different. However, the intercomparison exercise is essential to increase our knowledge of the possible causes of the differences and contribute to reducing the uncertainty of the EM.
Further work is needed to fully validate this approach but, in the light of the results presented here, the methodology seems quite promising. Despite the caveats and uncertainties that we have discussed, our results provide a first step towards a systematic comparison of eddy-permitting global ocean REAs aimed at providing robust conclusions on the recent evolution of the oceanic state.
The next generation of operational forecasting systems will soon become ocean eddy-permitting and this work indicates that the global REAs have reached enough maturity to provide them with the necessary initial conditions. This work has also indicated that the eddy-permitting forced ocean models still underestimate significantly the EKE of the ocean. The assimilation of observations can reduce this weakness and introduce into the systems the right amplitude of EKE at the right locations, which might play a positive role in the MHT and in air-sea interaction.
Acknowledgments This work has received funding from the Italian Ministry of Education, University and Research and the Italian Ministry of Environment, Land and Sea under the GEMINA Project and from the European Commission Copernicus program, previously known as GMES program, under the MyOcean and MyOcean2 Projects. The EN3 subsurface ocean temperature and salinity data were collected, quality controlled, and distributed by the Met Office Hadley Centre. The altimeter products were produced by SSALTO/ DUACS and distributed by AVISO, with support from CNES. The authors thank the NOAA/OSCAR group for providing OSCAR satellite-derived current data; the IFREMER/CORIOLIS group for providing CORA objective analysis data; the IFREMER/CERSAT for providing sea ice concentration data. During the preparation of this article, our co-author Nicolas Ferry passed away. He was an enthusiastic and fundamental contributor of the MyOcean global REAs activities. This work would not have been possible without his inspiring support and continuos effort. The data assimilation community has lost a good scientist and a good friend and this paper is dedicated to him.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix 1: Contributing REAs (in alphabetical order)
More details about the different systems and their quality can be found in the MyOcean documentations (Quality Information Documents and Product User Manual freely available on the web site www.myocean.eu). Here we present only the main characteristics of each system and focus in particular on the assimilation schemes and the observations that are assimilated.

C-GLORS4
C-GLORS is the reanalysis produced at CMCC covering the period 01-January-1979 to 31-December-2012. It_consists of a weekly three-dimensional variational analysis (3DVAR), followed by a 1-week ocean general circulation model (OGCM) integration, which brings the analysis forward to the next assimilation step. The threedimensional variational data assimilation system is a global implementation (Storto et al. 2011) of OceanVar (Dobricic and Pinardi 2008). The OGCM is the version 3.2 of NEMO (Madec 2008) in its ORCA025 configuration, coupled with the Louvain La Neuve Sea-Ice model (LIM2, Fichefet and Morales Maqueda 1997) dynamical and thermodynamical sea-ice model. The rheology used is the EVP (elasto-visco-plastic). The CORE bulk-formulas forcing method (Large and Yeager 2009) has been adopted.
All the forcing fields are provided by the European Centre for Medium-Range Weather Forecast (ECMWF) ERA-Interim atmospheric reanalysis project (Simmons et al. 2007;Dee et al. 2011).
ERA-Interim radiative fluxes and wind fields have been corrected as follows: • Large-scale short-wave and downward long-wave radiation fluxes have been corrected by means of a largescale climatological correction coefficient derived by the GEWEX Surface Radiation Budget project (Garric and Verbrugge 2010); • Precipitation fields were corrected by using a climatological coefficient derived from the REMSS/PMWC dataset .
Furthermore, in order to avoid artificial drifts of the globally-averaged sea-surface height due to the unbalanced fresh water budget, the evaporation minus precipitation minus runoff has been set equal to zero at each model time-step.
A large scale bias correction (LSBC) is performed during the model integration to avoid spurious model biases and drifts. The LSBC corrects the model tendencies every 12-h using differences between model and EN3 monthly objective analyses to estimate the model bias. The differences are filtered with a low-pass filter configured to filter out time scale smaller than 3 months and spatial scale smaller than 1200 km, in order to bias correct the large scale signals only. The filtered differences are then added to the tracer tendencies.
The river runoff used in the simulation has been created by Bourdalle-Badie and Treguier (2006) and provided by MERCATOR-Ocean. It is a monthly climatology that includes 99 major rivers and coastal runoffs.
The data assimilation step is used to correct three-dimensional fields of temperature and salinity. The analysis is performed every 7 days and a standard incremental formulation for the 3DVAR scheme is adopted. Since the observations are compared to the background field closer in time to the observations within 3-hourly time slots of the weekly assimilation time-window, this scheme is usually referred to as 3DVAR/FGAT (First Guess at Appropriate Time).
The background-error covariance matrix is decomposed onto two linear terms accounting, respectively, for vertical covariances and horizontal correlations. In our scheme, vertical covariances are represented by a 1-degree resolution set of 10-mode seasonal bivariate empirical orthogonal functions (EOFs) of salinity and temperature at full model vertical resolution. Horizontal correlations are modelled by means of a four-iteration first-order recursive filter, with three-dimensional, parameter-and direction-dependent correlation length-scales. Both the vertical EOFS and the correlation length-scales were calculated from the monthly anomalies (with respect to the climatology) of a non-assimilative OGCM run for the reanalysis period.

Assimilated observations
The variational data assimilation system of C-GLORS assimilates in situ observations of temperature and salinity and satellite sea level anomalies (SLAs). All the in situ observations from moorings, ARGO floats, expandable bathy termographs (XBTs) and conductivity-temperature-depth (CTDs) are extracted from the EN3v2a dataset (Ingleby and Huddleston 2007). These data are collected, quality-checked and distributed by the UK MetOffice Hadley Center. More information on the in situ data processing is available in Ingleby and Huddleston (2007).
The dataset of sea level anomalies is the AVISO alongtrack delayed mode dataset that includes observations from ERS-1 and -2, Envisat, GFO, Jason-1 and -2 and Topex/ Poseidon. Sea level corrections are covariated with vertical profiles of temperature and salinity by means of the "dynamic height" formulation and according to the bivariate definition of the background-error vertical covariances. The mean dynamic topography (MDT) used in this simulation has been calculated from the 1993 to 1999 mean SSH (in accordance with the AVISO convention), from a variant of C-GLORS with the assimilation of in situ data only.
C-GLORS also assimilates SST observations from the NOAA high-resolution daily analyses, which uses AVHRR and (from 2002) AMSR-E radiances. These observations are assimilated during the model integration through a simple nudging scheme that corrects the net heat flux at the sea surface by means of the difference between observed and modelled SST. The strength of this relaxation was set equal to −200 W/K s (corresponding to a relaxation time scale of 12 days).
Similarly, the net freshwater flux is corrected using differences between observed and modelled sea surface salinity. The observational dataset is the EN3 monthly objective analyses (Ingleby and Huddleston 2007). The restoring time scale was set to 300 days.
The assimilation of sea-ice concentration is performed through a nudging scheme that assimilates the gridded "NASA Team" sea-ice daily analysis (Cavalieri et al. 1999), with a relaxation time scale of 5 days.
Observations pre-processing includes a background quality-check and an horizontal data thinning in order to reject altimetric observations too close in space. Furthermore, observations close to the sea-ice are rejected in order to avoid analysis increments inconsistent with the sea-ice model. The observational errors for in situ observations were initially set equal to those found by Ingleby and Huddleston (2007) and subsequently tuned via the Desroziers' method (Desroziers et al. 2005). For SLA observations, the error variance is calculated as the sum of observational (satellite-dependent), MDT, representativeness and inverse barometer correction error variances (Storto et al. 2011).

Initialization strategy
For the initialization of C-GLORS a 10-year spinup that uses repeated atmospheric conditions valid for the year 1978 and taken from ERA-40 was run. The last day of the 10-year spinup provided the initial conditions valid at 1st January 1979, from which the C-GLORS production started.

GLORYS2V3
GLORYS2V3 is the reanalysis produced by MERCA-TOR Ocean covering the period 04-December-1991 to 31-December-2012. The ocean model is based on the version 3.1 of NEMO (Madec 2008) and the configuration is ORCA025 with a resolution close to 27 km at the equator and 21 km at mid-latitudes.
The vertical grid has 75 levels, with a resolution of 1 meter near the surface and 200 meters in the deep ocean. For a better representation of the topographic floor, the "partial cells" parameterization is used. The Elastic-Viscous-Plastic rheology formulation is activated with the LIM2 ice model (namely LIM2_EVP) and the ocean-seaice coupling is done every 5 model steps.
A specific attention is given to the surface boundary conditions. They are prescribed to the model using the CORE bulk formulation. Forcing fields are provided from ERA-Interim reanalysis products (Simmons et al. 2007), interpolated on ORCA025 native grid using an Akima interpolation algorithm and corrected as follows: • Radiative fluxes (both long wave and short wave) have been corrected by means of a large-scale climatological correction coefficient derived by the GEWEX Surface Radiation Budget project (Garric and Verbrugge 2010); • ERA-Interim rainfall fluxes have been corrected by means of GPCPV2.1 rainfalls flux which allows a more realistic SSS spatial distribution; • Specific corrections have been applied at high latitudes.
In the Arctic Ocean northward 80°N: 70 % of downward ERA-Interim short wave is retained, surface air temperature has been cooled by 2 °C and 85 % of air humidity is retained. Southward 60°S in the Southern Ocean: 80 % of downward ERA-Interim short wave is retained and 110 % of downward ERA-Interim long wave is applied.
Due to the high vertical resolution near the surface, a parameterization of the diurnal cycle on the solar flux is applied. Input is the daily mean flux, which is spread over the day according to the time, and geographical position on the earth. This parameterization aims at better representing the night-time convection which takes place in the upper most layer of the ocean (Bernie et al. 2007).
The river run-off data is inferred from Dai and Trenberth (2002) and in order to avoid unrealistic GMSL drift, the global mean freshwater flux is set to 0 at each time step. In addition, a global mean freshwater seasonal is imposed in order to force the global ocean mass seasonal variation to be close to the one observed with GRACE geodetic mission.
No restoring strategy has been implemented in this configuration for the sea surface salinity or the SST.
The data assimilation method relies on a reduced order Kalman filter based on the singular evolutive extended Kalman (SEEK) filter introduced by Pham et al. (1998). This approach has been implemented in different ocean model configurations at Mercator Océan ("Système d'Assimilation Mercator version 2", SAM2). In GLO-RYS2V3, the control vector is composed of the 2D barotropic height, the 3D temperature, salinity, zonal and meridional velocity fields and finally sea ice concentration. Associated to this vector, the forecast error covariance is based on the statistics of a collection of 3D ocean state anomalies (typically a few hundred) and is seasonally variable (i.e. fixed basis, seasonally variable). This approach is similar to the Ensemble optimal interpolation (EnOI) developed by Oke et al. (2008) which is an approximation to the EnKF that uses a stationary ensemble to define background error covariances. In our case, the anomalies are high pass filtered ocean states (Hanning filter, length cutoff frequency = 1/30 days −1 ) available over the 1993-2009 time period every 3 days. These ocean states come from a reference simulation carried out with the same ocean model configuration which is a companion simulation of GLO-RYS2V3 but without data assimilation.
The last feature of the model forecast covariance employed is a localization technique which sets the covariances to zero beyond a distance defined as twice the local spatial correlation scale.
In parallel, a 3D-VAR bias correction method has been implemented to correct large-scale temperature and salinity biases when enough observations are present. The bias correction corrects the slowly varying large scale error of the model whereas SAM2 assimilation scheme corrects the smaller scales of the model forecast error.
These increments are applied with an incremental analysis update (IAU) (Bloom et al. 1996, Benkiran andGreiner 2008). The IAU is a low-pass filter, which gives a smooth model integration, without the numerical shock at the analysis time when the increment is applied on one time step. The IAU also reduces spin up effects after the analysis time but is more costly than the "classical" model correction because of an additional model integration over the assimilation window.
An important feature of GLORYS2V3 reanalysis system (implemented also in the real-time global forecasting systems) is that the analysis is performed at the middle of the 7-day assimilation cycle, and not at the end of the assimilation cycle. Doing so, future and past observations are used in the analysis, providing a better estimate of the ocean because the analysis is temporally centered, with respect to the observations used.

Assimilated observations
The assimilated data consist of satellite SST and SLA data, in situ temperature and salinity profiles, and sea ice concentration. The model innovation (observation minus model equivalent) is computed at appropriate time (FGAT, first guess at appropriate time).
The SLA is provided by MyOcean SLA TAC (CLS: collecte localisation satellites). The assimilation of SLA observations requires the knowledge of a mean SSH. The mean surface reference is an adjusted version of CNES-CLS09 MDT (Rio et al. 2011) including GOCE observations and bias correction and reflects the mean ocean surface over the 1993-1999 period (i.e. centered on 1996).
The daily Reynolds 1°/4° AVHRR-only daily SST version 2 product (see http://www.ncdc.noaa.gov/oa/climate/ research/sst/oi-daily.php) is assimilated once at the date of the analysis (i.e. the day 4 at 24H00 of the assimilation cycle).
The In Situ profiles (T, S) from the CORAv3.3 data base provided by MyOcean in situ TAC (Coriolis data centre) are assimilated (http://www.coriolis.eu.org/Science/Data-and-Products/CORA-03). This data set has been extensively quality controlled using classical "in situ" quality control procedures. XBT profiles from CORAv3.3 have been only corrected partially from fall rate equation problems. Most but not all XBT profiles for the years 2002-2008 have been corrected using Johnson et al. 2009 method (see http:// www.nodc.noaa.gov/OC5/WOD09/bt_bias_notes.html). New temperature and salinity vertical profiles from the sea mammal (elephant seals) database (Roquet et al. 2011) were assimilated to compensate for the lack of such data at high latitudes. For 2012 year, the delayed mode database was not available, so we decided to use the real-time database produced by Coriolis data center. To minimise the risk of erroneous observations being assimilated in the model, the system carries out a quality control (QC), known as "background quality control" in meteorology, on the assimilated T and S vertical profiles.
Sea Ice concentration from the IFREMER/CERSAT products (Ezraty et al. 2007) are assimilated once at the date of the analysis (i.e. the day 4 at 24H00 of the assimilation cycle).

Initialization strategy
The GLORYS2V3 started at rest, with initial climatological temperature and salinity. The date of start is 4th December 1991. The used climatology is a merge of the Levitus 98 climatology, patched with PHC2.1 for the Arctic regions, and Medatlas for the Mediterranean Sea.
Initial conditions for sea ice (ice concentration) were inferred from the NSDIC Bootstrap products for December 1991.

ORAP5.0
ORAP5.0 (Ocean ReAnalysis Pilot 5.0) is a global ocean reanalysis produced by European Centre for Medium-Range Weather Forecasts (ECMWF) as a contribution to MyOcean2. ORAP5.0 covers the period 01-January-1979 to 31-December-2012, and it has been produced using the v3.4.1 of the NEMO ocean model (Madec 2008) at a resolution of 1/4 of degree in the horizontal and 75 levels in the vertical, with variable spacing (the top level has 1 m thickness). It also includes an active seaice model (LIM2, Fichefet and Morales Maqueda 1997). ORAP5.0 surface forcing comes from ECMWF ERA-Interim atmospheric reanalysis products (Simmons et al. 2007;Dee et al. 2011), and includes the impact of surface waves in the exchange of momentum and turbulent kinetic energy (Janssen et al. 2013). A modified CORE bulk formula (Large and Yeager 2009) has been used to compute turbulent air/sea fluxes, but substantial modifications have been made regarding the drag coefficient and wind stress calculations. The monthly climatological river runoff (Dai and Trenberth 2002) is applied along the land mask.
The freshwater flux is also adjusted by constraining the global model sea-level changes to the changes given by the altimeter data, and by relaxing the sea surface salinity to monthly climatology from WOA09, with the strength set to −33.3 mm/day. In addition, a very week (with a timescale of about 20 years) global 3D relaxation to temperature and salinity climatological value from WOA09 is also applied through the water column.
The reanalysis is conducted with NEMOVAR (Mogensen et al. 2012) in its 3D-var FGAT configuration. NEMOVAR is used to assimilate subsurface temperature, salinity, sea-ice concentration and sea-level anomalies (SLA), using a 5 day assimilation window. NEMOVAR is a variational data assimilation system developed based on OPAVAR data assimilation system (Weaver et al. 2005) for the NEMO ocean model by Mogensen et al. (2012). For our analysis NEMOVAR is applied as an incremental three-dimensional variational assimilation (3D-Var) using the FGAT approach. The analysis cycle consists of a single outer iteration of 3D-Var FGAT with observational quality control (QC) and bias correction steps. In the outer loop the NEMO model is integrated forward and used for calculation of the model equivalent of each available observation at the time step closest to the observation time, after which the QC of the observations is performed. The background state and the quality-controlled observations are passed to the inner loop part of 3D-Var FGAT where the incremental cost function is minimized using an observation space conjugate gradient (RPCG) method (Gürol et al. 2014) with 40 RPCG iterations to produce the assimilation increment. In the final phase of the analysis cycle, the assimilation increment resulting from the inner-loop minimization is applied using IAU (Bloom et al. 1996) with constant weights during a second model integration spanning the same time window as for the background trajectory.
A revised scheme for calculating background error horizontal correlation length-scales has been implemented as a variant of the scheme developed by Waters et al. (2014), by which the horizontal background correlation scales are set-up by the Rossby radius of deformation. In a compromise between computational efficiency and complexity, only the latitudinal variations of the Coriolis parameter are included in the Rossby radius of deformation, which otherwise uses a constant gravity wave phase speed of 2.7 m/s. The Rossby radius is capped to a minimum value of 50 km and maximum value of 250 km near the equator for zonal correlation. The zonal scales are elongated at the equator, to represent the distanced travelled by a Kelvin wave during the length of the assimilation window.
A bias correction scheme (Balmaseda et al. 2007) has been implemented in NEMOVAR to correct temperature/ salinity biases in the extra-tropical regions, and pressure bias in the tropical regions. Total bias term contains a priori bias (offline bias), which is calculated based on a preproduction run from 2000 to 2009 with only assimilation of temperature and salinity in situ data to account for the seasonal variations, and online bias, which is updated each analysis cycle using the assimilation increments.

Assimilated observations
In situ profiles of temperature and salinity data from the quality-controlled EN3 dataset (Ingleby and Huddleston 2007) are assimilated in ORAP5.0. EN3 version 2 with XBT depth correction (Wijffels et al. 2008) is used from 1979 to 2011 (EN3_v2a_xbtc) and a standard version EN3 is used for year 2012. The observation-error standard deviations for temperature and salinity are specified according to analytical functions depending on depth only (except near coastlines, where a inflation factor will be applied), and approximately fit to the vertical profiles of the globally averaged temperature and salinity obs-err estimated in EN3. The background-error standard deviation for temperature and salinity are parameterized in terms of the vertical gradient of the background temperature fields so flow-dependent aspects of variance propagation could be captured. Observation-space assimilation diagnostics (Desroziers et al. 2005) show that background-error variance is reasonably well parameterized in ORAP5.0.
EN3 data are quality controlled by the NEMOVAR QC procedure, including duplicate check, background check and stability check. A horizontal thinning scheme is applied to CTD and XBT data with a minimum distance requirement of 25 km and time gap set to 1-day. A vertical thinning of observation to allow no more than 2 observations per model level is also implemented to ensure that data with high vertical resolution (i.e. CTD) are not given too much weight in analysis.
Satellite along-track SLA data from AVISO delayed mode dataset that includes observations from ERS-1, ERS-2, Envisat, Jason-1, Jason-2 and Topex/Poseidon. To filter out the correlation on the SLA observation error, a superobservation scheme as implemented in ORAS4 is also used in ORAP5.0 for SLA data. A grid with approximately 100 km resolution is defined (superob grid). All observations within the same day are spatially binned into this grid to create a super-observation (Mogensen et al. 2012).
To assimilate along-track SLA data from AVISO, a new method was developed which can calculate the model MDT file relative to arbitrary period. Instead of using the same reference period as the observations (1993)(1994)(1995)(1996)(1997)(1998)(1999), the MDT is estimated by averaging the sea-level from a T/S assimilation experiments for the 2000-2009 Argo period. A correction factor is then added to take into account the different reference period. The spatial mean of the sea-level background field and of the input sea-level observations is removed before assimilation, so that the residual can be then used to close the fresh-water budget and thus helping with the attribution of sea-level rise. The assimilation of sea-level trends is carried out by applying spatially uniform fresh-water fluxes derived from this residual.
The daily gridded sea ice concentration data that is provided together with SST information has been assimilated. As for SST, this comes from a combination of NOAA and OSTIA products. The background state of ocean and sea ice states are produced from coupled NEMO-LIM2 run, but the minimization of the sea ice cost-function is separated from all ocean state variables in a different loop. At the same time a latitude band and thinning algorithm (by a factor of 2) were applied for the sea ice data to reduce the data density and to speed up convergence in the cost function.

Initialization strategy
The initial conditions for the ORAP5.0 were produced in two phases: • A 12-year (1979-1990) model spin up from the cold start given by the climatology from World Ocean Atlas 2009 (WOA09, see Locarnini et al. 2010;Antonov et al. 2010), forced with ERA-Interim fluxes, and using a 3-year relaxation to the WOA climatology. • A 5-year assimilation period (1975)(1976)(1977)(1978)(1979), starting from the end of the previous spin up conditions.

UR025.4
The numerical model used in the UR025.4 reanalysis is the global ocean model ORCA025 based on the NEMO modelling framework (Madec 2008), version 3.2. There are 75 vertical depth levels with separations varying smoothly from 1 m at the surface to 200 m at the bottom. The ocean is fully coupled with the Louvain-la-Neuve Ice Model version 2 [LIM2, (Timmermann et al. 2005;Fichefet and Morales Maqueda 1997;Goosse and Fichefet 1999)]. Surface atmospheric forcing for UR025.4 is obtained from the ECMWF ERA-Interim atmospheric reanalysis (Simmons et al. 2007;Dee and Uppala 2009;Dee et al. 2011). The ERA-Interim reanalysis provides 10 m wind, 2 m air humidity and air temperature to compute 6-hourly turbulent air/sea fluxes during model integration, using the bulk formula proposed by Large and Yeager (2004). Downwelling short and long wave radiative fluxes and precipitation are also provided by ERA-Interim. The monthly climatological river runoff of Dai and Trenberth (2002) is applied along the land mask. There is no sea surface salinity restoring in UR025.4.
The assimilation system used in UR025.4 is based on the UK Met Office operational FOAM-NEMO system (Storkey et al. 2010;Martin et al. 2007). A complete description of the system is provided by Storkey et al. (2010) and references therein. Here we summarize the main characteristics.
The data assimilation methodology is an optimal interpolation (OI)-type scheme (Lorenc et al. 1991) with assimilation increments calculated using a FGAT scheme every 5 days (73 assimilation cycles per year) and introduced evenly over the period in an IAU (Bloom et al. 1996) step. For all data types (apart from sea ice), the analysis is separated into horizontal and vertical parts. The SST increments are applied over the model mixed layer down to a maximum depth of 660 m. Mesoscale SLA increments are projected to depth using a version of the Cooper and Haines (1996) scheme. Synoptic scale SLA increments are applied directly to the model SLA field. For profile data, a vertical analysis is first performed at each observation point, followed by a horizontal analysis at each model level. After 3D temperature and salinity increments have been derived, geostrophic balancing increments are applied to the baroclinic velocity field, poleward of 5° ramping down to zero at 1°. Surface salinity increments are calculated to balance sea-ice increments and mesoscale surface height increments are recalculated using hydrostatic balance.
The model and observation spatial error covariance matrices are univariate, with cross-correlations between state variables being provided by dynamical balancing relationships as described above. A number of bias correction schemes are used to deal with systematic errors in the model and observations. The scheme of Bell et al. (2004) adds a correction term to the subsurface pressure gradients in the tropics to counter errors in the wind forcing or/and deficiencies in the vertical mixing of momentum. Errors in the MDT field are corrected using the method described by Lea et al. (2008). Different sources of satellite SST data are individually calibrated using the OSTIA SST analysis system (Stark et al. 2007;Donlon et al. 2011).

Assimilated observations
UR025.4 assimilates in situ and satellite SST data, satellite sea level data, satellite sea ice concentration data, and in situ temperature and salinity profile data. The satellite SST data includes level 3 data from the advanced very high resolution radiometer (AVHRR) sensor on board the NOAA series of satellites, processed by the Pathfinder project at version 5.0 (NODC 2009), and sub-sampled level 2 data from the (advanced) along-track scanning radiometer ((A)ATSR) sensors on board the ERS-1, ERS-2 and ENVISAT satellites, re-processed using the operational processor used in 2007 to form a consistent data-set (NEODC 2011). The in situ SST data are taken from the International Comprehensive Ocean-Atmosphere Data Set (ICOADS, Worley et al. 2005;Woodruff et al. 2010). Sea-ice concentration data are from a re-processed version of the special sensor microwave/imager (SSM/I) data provided by the EUMETSAT ocean and sea-ice satellite application facility (OSI-SAF; available from http:// osisaf.met.no). These SST and sea-ice concentration datasets are the same as those used in a reanalysis of the operational SST and sea-ice analysis (OSTIA) system. After 31st December 2007, near-real time versions of the NOAA AVHRR, AATSR, in situ SST and OSI-SAF sea-ice concentration data are used for assimilation as the gridded observational analysis data are not available.
Altimeter SLA data is along-track data processed by the CLS and distributed by Aviso, which includes data from the Jason-1, Jason-2, Topex/Poseidon, ERS, and Envisat platforms at various times. The MDT field is the Rio et al. (2005) climatology.
In situ temperature and salinity observations are obtained from the UK Met office quality controlled ENACT/ ENSEMBLES dataset (Ingleby and Huddleston 2007). This dataset is largely composed of observations from the World Ocean Database 2005 and supplemented by data from the global temperature and salinity profile program (GTSPP) and Argo. As such, the dataset includes all available hydrographic observations, including those from shipboard expendable bathythermographs (XBTs) and CTD measurements, as well as observations from mooring arrays (such as TAO and PIRATA). The operational quality control system used for the FOAM assimilation products ) is used to perform a number of consistency checks on the data. These include buddy checks, track checking, testing for density inversions and thinning of the data. The version of the dataset used in UR025.4 (EN3_v2a_NoCWT_ LevitusXBTMBTCorr) differs from the previous version assimilated in UR025.3 (EN3_v2a) as it includes bias corrections for XBT and MDT data (Levitus et al. 2009).

Initialization strategy
The entire period of the UR025.4 reanalysis covers 1989-2010. Initial conditions for temperature and salinity are derived from the ENACT/ENSEMBLES climatology.