1 Introduction

The severe and rapid decline of Arctic sea ice in the last two decades (Stroeve et al. 2012; Vaughan et al. 2013) has fostered research as well as socio-economic interests in the Arctic ocean for sectors such as the maritime transport, oil and fishery industries and tourism (e.g. Hall and Saarinen 2010; Meier et al. 2014; Smith and Stephenson 2013). Interest in predicting the timing of advance and retreat of sea ice at seasonal time scales has therefore increased, as illustrated by the Sea Ice Outlook initiative (Stroeve et al. 2014).

Seasonal prediction skill largely relies on the quality of initialization. Recent years have seen an improvement in predictive capacity (e.g. Blockley and Peterson 2018; Bushuk et al. 2017; Collow et al. 2015; Wang et al. 2013) and enhanced observational efforts over the region (Jung et al. 2016). The increasing density and quality of observations for the various components of the climate system has allowed for more accurate initial conditions (ICs) for seasonal predictions. The recent improvements in predictive capacity of the Arctic sea ice are also related to advances in forecast initialization techniques to phase the models with the observed climate evolution (e.g. Bushuk et al. 2017; Collow et al. 2015; Wang et al. 2013). In particular, efforts have been made to improve the initialization of different model components such as sea ice (e.g., Blanchard-Wrigglesworth et al. 2011; Blockley and Peterson 2018; Dirkson et al. 2017), ocean (in particular the ocean’s top thermal structure; e.g., Balmaseda and Anderson 2009; Balmaseda et al. 2009), atmosphere (Infanti and Kirtman 2016) and land surface (Koster et al. 2010).

The most widely-used method is full-field initialization, in which the model begins from an observed or reanalyzed state. This method, however, is affected by the presence of large model biases which can cause a drift in the predictions as the model transitions from the observed climate towards its own attractor (e.g. Exarchou et al. 2018; Magnusson et al. 2013; Meehl et al. 2014; Sanchez-Gomez et al. 2016). Interferences between the drift and the climate signal to be predicted could hamper the quality of climate predictions. This drift needs to be corrected a posteriori under certain assumptions (e.g. that the model drift is independent of the start date or the initial state; ICPO 2011), with the downside that only direct impacts of the drift can be filtered out through simple linear methods, not the non-linear interactions and consequences of the drift. On the other side, the model drift in climate predictions has been seen as an opportunity to track the time evolution of model errors from close-to-null to the asymptotic model biases (e.g. Exarchou et al. 2018; Vannière et al. 2014). The timescales and chronology of error growth hints at the mechanism involved through a differentiation of fast (e.g. atmospheric turbulence) versus slow (e.g. deep ocean circulation) processes and identification of causes versus consequences (e.g. an ocean bias developing before the induced sea ice bias). This drift is expected to be larger when initializing from reanalyses of another data assimilation system, such as ERA-Interim or NCEP/NCAR-RE1, than in-house reanalyses, since a reanalysis is slightly biased towards the attractor of the model which was used in its production and interpolation introduces additional errors and potential incompatibilities between variables. To minimize this drift, a potential alternative method is anomaly initialization, in which the model is started from a synthetic state built from observed or reanalized anomalies added on top of the mean model climate. However, instabilities can be triggered by dynamical imbalances between the observed anomalies and the mean model climate or by non-linearities such as the dependence of density on temperature and salinity (Boer et al. 2016; Magnusson et al. 2013; Meehl et al. 2014).

A large variety of techniques has been developed to assimilate observations and/or reanalysis into in-house reanalyses with the same model (or components of the coupled model) used to perform the forecasts. These go from simple approaches, like Newtonian relaxation or nudging (Lindsay and Zhang 2006; Tietsche et al. 2013), to more sophisticated methods like the Ensemble Optimal Interpolation (EnOI, e.g. Dulière and Fichefet 2007; Smith and Stephenson 2013) and the Ensemble Kalman Filter (EnKF, e.g. Evensen 2003; Massonnet et al. 2013). One of the key advantages of the latter is that it is a multivariate method that updates all related variables simultaneously according to their statistical relationship through the model-simulated covariance matrix. For example, EnKF assimilation of SIC has been proven to have a positive impact on the representation of sea ice thickness (Massonnet et al. 2013; Mathiot et al. 2012), even though sea ice thickness is not directly assimilated.

Nowadays, data assimilation is usually applied independently to each model component which are also run separately to generate the ICs in operational prediction centers (Arribas et al. 2011; Molteni et al. 2011; Saha et al. 2010). Incompatibilities between the ICs of the different model components are prone to appear with such an approach. Those incompatibilities can be reduced by running a fully coupled climate model between data assimilation steps performed independently in each model component, namely using weakly coupled data assimilation. The paradigm is currently changing, and many forecast centers develop strongly coupled assimilation systems (Kimmritz et al. 2019; Penny and Hamill 2017; Penny et al. 2019), i.e. in which each assimilation step also occurs in couple mode so that the assimilation of observations in one model component also leads to direct updates in the other model components. This approach is technically challenging but it helps to prevent the occurrence of ICs inconsistencies and imbalances in the forecasts (Laloyaux et al. 2016), which has been shown to benefit the forecast skill (Liu et al. 2017). When present, these incompatibilities can lead to initial shocks in the prediction, which are likely to be larger in areas with substantial disparities between the observations and simulations (Balmaseda et al. 2009). Other initial imbalances have been related to diverse causes, such as the presence of spurious trends in some of the initialization products (e.g. surface winds in Pohlmann et al. 2017), the interpolation of non-native reanalyses (e.g. Laloyaux et al. 2016), or the use of a different model versions to produce the ICs and perform the forecasts (Mulholland et al. 2015). Note that preventing or reducing these shocks is expected, but it does not necessarily lead to skill improvements (Pohlmann et al. 2013). Incompatibilities between variables can also arise within a reanalysis, those incompatibilities being introduced by the data assimilation scheme itself at the analysis step (the variable updates are linearly related to each other, whereas variables can be non-linearly related). In our study, we can not distinguish between initial shocks introduced within a model component by the assimilation schemes and between model components because of independent assimilations. We will consider those errors jointly and refer to them as ICs inconsistencies.

This article investigates the origin of the different Arctic sea ice forecast biases present in the latest seasonal predictions produced with EC-Earth3.2 (Acosta Navarro et al. 2019). In particular, we are interested in the development of the model bias as a function of forecast time, the impact of incompatibilities between ICs, and their relative importance along the prediction. This article is organized as follows: In Sect. 2, we briefly describe EC-Earth3.2 system and the experimental and observational datasets used, together with the initialization methodology. Section 3 offers a detailed presentation of the different sources of forecast biases. Their evolution and contribution to the total forecast bias are discussed in Sect. 4. Finally, Sect. 5 summarizes the main conclusions.

2 Methodology

2.1 Model description and experiments

The forecasts (also referred to as predictions) and their historical counterpart used in this study were produced with EC-Earth3.2 coupled climate model (Doblas-Reyes et al. 2018; http://www.ec-earth.org/). The ocean component is the third version of NEMO (Nucleus for European Modelling of the Ocean; Madec et al. 2015) with ORCA1 configuration (about 1 degree with enhanced tropical resolution) and 75 vertical levels. The sea ice component is the Louvain-la-Neuve sea ice model (LIM3, Vancoppenolle et al. 2009) directly embedded into NEMO. For the atmosphere, the integrated forecasting system (IFS), from the ECMWF is employed, in a configuration with 91 vertical levels and a T255 horizontal resolution. All components are coupled via OASIS3 (Valcke 2006).

Using EC-Earth3.2 and aerosols, greenhouse gases and solar activity forcings from CMIP6 protocol, we performed three sets of experiments. The first experiment is a historical simulation (HIST hereafter) which consists of one single member covering the 1950-2014 period initialized from a perpetual 1950 simulation. The other two sets of experiments consist of seven month-long seasonal forecasts (PRED hereafter) initialized each year from 1993 to 2008 on May 1st and November 1st, with an ensemble size of ten members. More details about the datasets and simulations used in this article can be found in Table 1.

In PRED, the atmosphere is initialized from ERA-Interim reanalysis (Dee et al. 2011) with initial perturbations between the ten members computed using singular vectors. The ocean was initialized from the five members (repeated once to get the ten initial members for PRED) of the Ocean Reanalysis System 4 (ORAS4, Balmaseda et al. 2013). The sea ice ICs are from an in-house reanalysis (hereafter referred to as RECON) produced with the ocean-sea ice model assimilating daily SIC from ESA-CCI version 1 (hereafter referred to as ESA, Hollmann et al. 2013) via a 25-member ensemble Kalman filter (EnKF). This satellite SIC dataset is different to the one prescribed to produce ORAS4 (Reynolds et al. 2002), which we will refer to hereafter as ORAS4_ice. Only the first 10 members were used as ICs for the sea ice. Also note that it is not possible to initialize directly LIM3 from observations, as this would require comprehensive and coherent information on many sea ice variables, which are not all conveniently observed. The advantage of the EnKF method is that through the use of model covariances, it assimilates specific observations (in this case of sea ice concentrations) ensuring at the same time that the other related variables (in this case all sea ice variables) evolve consistently with them. For further details on how the EnKF is implemented in the sea ice model, we redirect the reader to Mathiot et al. (2012) and Massonnet et al. (2015). Our study focuses on 1993-2008, the period covered continuously by the ESA SIC product.

In RECON, SIC is assimilated the last day of every month. The choice of using 25-members in the EnKF was made to reach a compromise between having a large enough ensemble to sample model uncertainty and minimizing the high computational constraints. The atmospheric fields used to force RECON were taken from the Drakkar forcing set in its version 5.2 (DFS, Dussin et al. 2016). DFS is based on ERA-Interim, but it includes bias corrections in temperature and humidity in the Arctic that reduce the mean air temperature by more than \(0.6\,^{\circ }\hbox {C}\) everywhere in the Arctic except on the Baffin Bay (Dussin et al. 2016). The different members of atmospheric forcings are produced by adding to each variable random daily perturbations which are representative of the observational error. The pool of possible perturbations is computed from monthly differences between the DFS and ERA-Interim wind. Once randomly picked, monthly perturbations are interpolated linearly to obtain daily ones (Guemas et al. 2014).

Table 1 Brief description of all the simulations and datasets used in this study

2.2 Forecast bias analysis

In this study, three different biases are addressed. First, the one caused by the ICs inconsistency, that will be characterized as the SIC difference between the ORAS4_ice and RECON at lead time 0 (i.e. initialization moment). Second, the deviation of the model attractor with respect to the observational reference, that will be referred to as model bias, and is defined as the difference between HIST and RECON climatologies for each calendar day. Finally, the forecast bias, that will be defined as the difference between PRED and RECON climatologies, and is calculated for each lead time. Both the effects of the ICs inconsistency and the development of the model bias contribute to the forecast bias. The extent of these contributions depend (1) on the magnitude of the ICs inconsistency at lead time 0, (2) on the magnitude of the model bias for each calendar day and (3) on the lead time. An schematic illustration of the different biases analyzed in this manuscript is shown in Fig. 1. The effect of the ICs inconsistency is represented as a deviation from the smooth asymptotic transition of the forecast to the historical state.

To complement the analysis of the SIC biases, we also consider the Integrated Ice Edge Error (IIEE), introduced by Goessling et al. (2016). Unlike for the climatological biases of the sea ice area (SIA) or sea ice extent (SIE), the IIEE does not suffer from spatial error compensation (i.e. areas with underestimated SIC being counterbalanced by areas with overestimated SIC). The IIEE is defined as the area where the forecast and the reference disagree on the SIC being above or below 15%. The IIEE can additionally be decomposed into the absolute extent error (AEE) and the misplacement error (ME) components, which allow to further investigate the nature of the errors. The AEE represents the absolute difference in sea ice extent between the forecast and its reference, while the ME integrates sea ice extent that has been predicted at an incorrect location. The reader can find more information about these scores in Goessling et al. (2016).

To track the relative contribution of the ICs inconsistency and the model bias to the SIC forecast bias along the first month, at each forecast time, we compute spatial correlations between each different source of forecast biases and the pattern of SIC forecast bias, masking out all grid points from 80N to 90N due to the extremely low sea ice variability at those latitudes.

Fig. 1
figure 1

Schematic representation of the Arctic sea ice extent evolution for a historical simulation (green), an observation reference (dark blue), a forecast (light blue) and the initial conditions for two model components (yellow and red dots). The different biases analyzed in this study are indicated in orange: forecast bias, model bias and initial conditions inconsistency. The dashed line represents the forecast trajectory that would be followed with absence of ICs inconsistency. The vertical grey line stands for the time zero, when the forecast is initialized

Fig. 2
figure 2

Maps of the SIC difference between PRED and NSIDC for (a) May 1st and (c) November 1st for the period 1993-2008. The purple line represents PRED sea ice edge (15% SIC) climatology, while the green one represents the NSIDC sea ice edge. NSIDC is not the dataset assimilated for the production of the sea ice ICs. (b) and (d) show the differences in SIC between RECON and the observations from ESA assimilated into RECON for May 1st and November 1st, correspondingly. The purple line represents RECON sea ice edge, while the green one represents the ESA sea ice edge. In (a) and (c) blue colours represent regions with a larger sea ice in PRED than in NSIDC, while for (b) and (d) they represent regions where RECON simulates more sea ice than ESA

2.3 Assessing the sea ice initialization and verification product

To evaluate RECON and the efficiency of the EnKF assimilation technique, we consider an independent dataset from the one used to produce the ICs, namely the National Snow and Ice Data Center (NSIDC, Cavalieri et al. 1996). Since no uncertainty is provided for the NSIDC datasets, we estimate it as the absolute difference between the two NSIDC SIC products: NASA-Team (Cavalieri et al. 1996) and Bootstrap (Comiso 2017), which only differ in the algorithm employed to estimate sea ice concentrations from passive microwaves measurements. A large systematic overestimation of Arctic SIC with respect to NSIDC (PRED minus NSIDC) is already present during the first forecast day in most of the peripheral Arctic regions for the May-initialized predictions (Fig. 2a) and across the interior Arctic ocean for the November-initialized predictions (Fig. 2c). In both cases (May and November 1st, i.e. the forecasts in the first 24 hours), the largest bias (\(\sim\)50%) occurs at the Eastern side of the Greenland Sea. These large errors arise through the assimilation protocol as a consequence of large observational uncertainties over the region. Indeed, Fig. 2b, d illustrates the initial difference between RECON (from which sea ice ICs were extracted) and the observations from ESA that were assimilated into RECON, for the first forecast day. Large differences appear in the same regions where the initial forecast biases emerged. Indeed, more weight has been given to the model information than to ESA SIC data over those areas in the EnKF that produced RECON. To understand this, it is important to keep in mind how the EnKF works. The strength of assimilation relies on a subtle balance between the observational uncertainty and the simulated ensemble spread, which is assumed to represent the model uncertainty. Whenever observations are uncertain and the EnKF ensemble has relatively little spread compared to the observations uncertainties, the EnKF reanalysis remains close to the simulated model state. This happens, for example, in May and November along the East Greenland coast, where ESA data exhibit relatively large uncertainties compared to the amplitude of the model spread (Fig. 3). By contrast, large model spread with well constrained observations lead to large EnKF increments and therefore closer EnKF reanalysis to the assimilated data. This occurs in November over the Kara Sea, where the model shows a large ensemble spread while there is relatively little observational uncertainty (Fig. 3). This large ensemble spread over the Kara Sea seems to derive from the low transitional SIC over the Marginal Ice Zone (hereafter MIZ, where 15% < SIC < 85%) (Fig. 4). It appears that the wider the MIZ, the larger the ensemble spread/model uncertainty. In the ESA dataset, narrow bands of high observational uncertainties appear at the sea ice edge, which could have locally larger magnitude than other products such as NSIDC (Fig. 3). But the ESA uncertainty, when integrated spatially, compares well with other products (e.g. Fig. 3). The estimate of model and observational uncertainties play a key role in the assimilation weights given to model and observations. Overestimation or underestimation of these uncertainties can be highly detrimental to the quality of the reanalysis. It is common practice to initialize climate predictions from reanalyses which account for the observational uncertainty and therefore do not track observations very closely when they are deemed uncertain. It is also common practice to assess forecast quality directly against observational datasets without accounting for their uncertainties. These are not consistent and a prediction whose ICs do not match very closely the observations can not be expected to track very closely the observations in the course of the prediction. A more consistent approach could be adopted either by verifying predictions against a reference which accounts for observational uncertainty similarly as done in the ICs or to develop new prediction scores which account for observational uncertainties. In the remainder of the manuscript, the biases will be quantified relative to RECON, which stands as our best possible estimate of the real climate state, since it is assumed to properly balance the observational uncertainty and the model uncertainty to produce such estimate. On top of that, since RECON provides our sea ice ICs, the sea ice forecast bias is initially 0 with respect to RECON and it grows only due to the impact of the ICs inconsistency and of the drift towards the model attractor. Thus, the spurious contribution of another source of bias such as the distance between RECON and an observational dataset is eliminated. For each month, RECON evolves in free-running mode until the last day of the month, in which satellite data are assimilated by the EnKF. This introduces some discontinuity with respect to the past month trajectory. For this reason, we focus on the first 30 forecast days, i.e. May and November, before the next assimilation phase happens.

Fig. 3
figure 3

Maps of the EnKF ensemble standard deviation across members (10 members) and the SIC uncertainty for the ESA and NSIDC observational products for the period 1993–2008. For ESA it was calculated as the SIC standard deviation, while for NSIDC it was measured as the difference between the two NSIDC products (Bootstrap and NASA-Team)

Fig. 4
figure 4

SIC climatology for RECON on November 1st. The orange and purple lines represent the sea ice edge defined by the 15% and 85% SIC thresholds, correspondingly

3 Characterization of the forecast bias

Fig. 5
figure 5

Maps of the SIC difference between the ORAS4_ice and RECON (sea ice ICs) the 30th of April/31st of October (a, d), HIST and RECON (b, e) and PRED and RECON (c, f) the 1st of May/1st of November respectively averaged over the period 1993–2008. The green line represents RECON sea ice edge (15% SIC) climatology, while the purple one represents the ORAS4_ice (a, d), HIST (b, e) and PRED (c, f) sea ice edge. For all panels red colours represent areas where RECON has a larger sea ice than (a, d) ORAS4_ice, (b, e) HIST and (c, f) PRED, while blue colours indicate less sea ice in RECON

3.1 Model bias

PRED is known to experience a drift as it adjusts towards the model (biased) climatology. The rate at which PRED drifts depends on how far the model is initialized from its attractor. The analysis of HIST allows to determine this attractor and therefore the systematic biases. A fast initial adjustment of ICs inconsistencies is expected to lead to the development of the patterns shown in Fig. 5a, d (see next subsection), but probably not to the extent shown in the figures, since that would be the maximum amplitude if the ocean did not adjust to the sea ice. Indeed, part of the inconsistency is expected to be absorbed by the ocean. On longer timescales, we expect the patterns shown in Fig. 5b, e to develop. On the 1st of May, the climatological sea ice edge in HIST is displaced towards the north with respect to RECON in regions like the Barents and Bering Seas where the SIC are largely underestimated. The opposite occurs in the Labrador Sea (Fig. 5b). On the 1st of November, the HIST climatology has considerably less ice than RECON around the Arctic continental margins (Fig. 5e), in particular over the Kara Sea, and produces more sea ice in the Baffin Bay. The drift will gradually bring the predictions closer to these model biases, with regional differences in speed. Note that the systematic negative bias in the sea ice is associated with a warm bias in the atmosphere and the ocean (not shown). As these three model components are tightly coupled, it is not possible to disentangle which one originates first.

Fig. 6
figure 6

Maps of the SST difference between ORAS4 and RECON for the period 1993–2008 for (a) April 30th and (b) October 31st. The green line represents RECON sea ice edge (15% SIC) climatology, while the purple one represents the ORAS4_ice sea ice edge

3.2 Inconsistencies between the initialization data

Using data from independent origins for initialization can introduce some shocks, as the different components adjust to each other in the first several days of the forecast. We focus here on the inconsistencies of both the atmosphere and ocean ICs with those of the sea ice. There exists some notable differences between the sea ice in RECON (used as sea ice ICs) and the sea ice that was used to produce the ocean and atmospheric reanalyses (ORAS4 and ERA-Interim, respectively). The RECON reanalysis produces more ice than the ORAS4_ice observational data (Fig. 5a, d), especially along the Eastern Greenland Coast (from Iceland to Svalbard). This reflects that the ocean state in ORAS4 is not perfectly compatible with the overlaying sea ice from RECON, and ORAS4 ocean state will tend to melt the RECON sea ice, while the RECON sea ice will tend to cool down the ORAS4 ocean state.

We remind here that, as already mentioned in Sect. 2.1, it is not possible to initialize the sea ice component of EC-Earth directly with a given observational dataset of SIC, like the data used to drive ORAS4, as the initialization of the sea ice model requires providing the simultaneous state of about 50 different variables which need to be physically consistent to avoid shocks. The excess of sea ice along the Eastern Greenland coast in RECON with respect to ORAS4_ice is associated with substantially colder local temperatures in the first (Fig. 6a, b), which implies that in the predictions the relatively warm waters ingested from ORAS4 will act to melt part of the sea ice initially imposed above, as already seen in Fig. 5c, f for day 1. Similarly, colder surface conditions in ORAS4 than in RECON, such as in the Bering Sea will favour sea ice formation early in the forecast. An inconsistency between ORAS4_ice SIC and SST appears in the Labrador, Barents and Bering seas and the Sea of Okhotsk, on April 30th and in the Barents Sea on October 31st (Figs. 5a and 6a), which will be investigated in an upcoming article.

Fig. 7
figure 7

Maps of the 2-meter air temperature difference between ERA-Interim and DFS for the period 1993–2008 for (a) April 30th and (b) October 31st

Inconsistencies between the initialization products for the sea ice and the atmosphere may also lead to initial shocks. Due to the corrections in DFS compared to ERA-Interim (colder air temperature by \(0.6\,^{\circ }\hbox {C}\) in the Arctic, except on the Baffin Bay, Dussin et al. 2016), during the first forecast days we should expect an overall melting effect of the atmosphere on the sea ice (that was produced with colder atmospheric conditions). The expected patterns of response can be deduced from the comparison of the 30th April/31st October air temperature fields in ERA-Interim and DFS (Fig. 7). These show that ERA-Interim is warmer than DFS (and therefore RECON) everywhere in the Arctic except in the Canadian Archipelago, Baffin and Hudson Bays for April 30th, and except over the Bering Sea, Sea of Okhotsk and part of the Canadian Archipelago for October 31st. The contribution of both ice-ocean and ice-atmosphere inconsistencies to the development of forecast bias will be explored in Sect. 4.2.

4 Understanding how the forecast biases develop

4.1 IIEE insights on the pan-Arctic sea ice biases

Fig. 8
figure 8

IIEE, AEE and ME as computed for the forecast bias (solid line), the model inherent bias (dash line) and the ICs inconsistency (dots). IIEE, AEE and ME are absolute errors

A summary of SIC biases is provided by the IIEE, which can be decomposed into a contribution from a general over/underestimation (AEE) and a contribution from incorrect processes leading to incorrect sea ice locations (ME). In May, the PRED IIEE grows slowly (Fig. 8a) and is still less than half of the HIST values by the end of the first forecast month. Whereas the PRED AEE reaches the asymptotic HIST AEE limit in no more than 5 days and dominates the PRED IIEE for the first few days, the PRED ME grows slowly and becomes predominant only by the end of the month, without having reached yet the HIST ME asymptotic limit. The HIST IIEE is dominated in May by the ME, which is associated with an overestimated SIC in Labrador Sea/Baffin Bay and Chukchi Sea and an underestimation along the rest of the sea ice edge (Fig. 5b). The PRED SIA bias evolution is consistent with the PRED AEE one (Fig. 1 in the supplementary material) and does not allow for an identification of the substantial errors in regional ice edge locations. The IIEE in November grows rapidly during the first week or so and then levels off (Fig. 8b). The IIEE is dominated in November by the AEE, which reaches the asymptotic HIST level in about one week. The November AEE is associated with a overall pan-Arctic SICs underestimation (Fig. 1 in the supplementary material). ME also grows throughout the month, but at a much lower rate and has only reached about half of the ME HIST by the end of the month. The ME results from overestimated SIC in Labrador and Greenland seas and Hudson Bay and underestimated SIC along the rest of the sea ice edge (Fig. 5). In both May and November, the drift towards the model attractor (Fig. 5b, e) and ICs incompatibility (Fig. 5a, d) lead to an overall sea ice underestimation (i.e. red areas dominate over the blue ones in Fig. 5). During the freezing season, the overall sea ice cover underestimation is reached as fast as during the melting season although the model bias has larger amplitude. These integrated diagnostics do not provide a comprehensive description of regional biases and their compensation which would allow to understand the ME development. These require the spatial SIC maps described in the next section.

4.2 Spatial evolution of the forecast biases

Fig. 9
figure 9

Evolution of SIC forecast biases for the predictions initialized in May. The first row corresponds to the ICs inconsistency at lag 0. The second and third rows show the SIC differences between PRED initialized in May and RECON (left column) and the model bias (right column) for lead times 10 and 30 for the period 1993–2008. The sea ice edge lines follow the legend of Fig. 5

The ICs inconsistency (first row in Fig. 9) corresponds to the SIC biases which are expected to develop rapidly as an initial shock in response to an inconsistent ocean below. The spatial maps of SIC biases in May (Fig. 9) evidence that the impact of ICs inconsistency could explain large forecast biases over the Greenland Sea. The Greenland SIC forecast biases are also quite close to the model biases by May 30th in sign and pattern (Fig. 9). However, for the ICs inconsistency the pattern has a larger longitudinal extension than the one from HIST. The PRED pattern of biases seems closer to the ICs inconsistency one before the end of the month. In other regions, like the Barents and Kara Seas, the model bias dominates the forecast bias in less than 10 days. By forecast day 30, the model bias is not yet fully developed in the forecasts, in particular in the Atlantic sector, which corresponds to regions of maximum SIC model biases. As in May, both the model bias and the ICs inconsistency exhibit a sea ice deficit in the Greenland Sea in November which resembles the pattern of SIC bias in PRED (Fig. 10). In the November case, the PRED pattern of SIC biases resembles closely the HIST one as early as forecast day 10. The ICs inconsistency seems to explain the lack of sea ice in the forecast over the Baffin Bay, that is still present by the end of the month.

Fig. 10
figure 10

As Fig. 9 for PRED initialized in November

4.2.1 Estimate of SIV melt by the inconsistency between RECON and ORAS4 ICs

A robust assessment of the contribution of the ICs inconsistency to the development of forecast biases requires a quantification of the sea ice volume (SIV) that could be melted by the warmer ocean in ORAS4 than in RECON (orange line, Fig. 11a–c). To calculate that amount of SIV, we first calculate the heat difference in the mixed layer between ORAS4 and RECON at t=0, as follows:

$$\begin{aligned} Q_{ml} = \int _{A} (\varDelta T_{ml} \times C_{w} \times h_{ml} \times \rho _{w})\ da \end{aligned}$$
(1)

where \(Q_{ml}\) is the heat difference in the mixed layer between ORAS4 and RECON integrated over the region of interest A and estimated in Joules; a is the area of each grid cell in square meters; \(\varDelta T_{ml}\) is the difference in ocean temperature in the mixed layer between ORAS4 and RECON in Kelvin; \(C_{w}\) is the water heat capacity (4186 \(J/Kg \cdot K\)); \(h_{ml}\) is the mixed layer depth in meters (derived from the first time step in the forecast); and \(\rho _{w}\) is the water density (1030 \(Kg/m^{3}\)).

Likewise, the total SIV over the selected region (Fig. 11d) that would be melted if all the heat difference in the mixed layer between ORAS4 and RECON (\(Q_{ml}\)) was used for it, can be derived from the following equation:

$$\begin{aligned} SIV = (Q_{ml} / L_{i})/\rho _{i} \end{aligned}$$
(2)

where SIV is the sea ice volume (in cubic meters); \(L_{i}\) is the latent heat of fusion of ice (334000 J/Kg) and \(\rho _{i}\) is the ice density (917 \(Kg/m^{3}\)).

The ICs inconsistency will lead to a cooling of the ocean and a melting of the sea ice until a new equilibrium is reached. This heat transfer from the ocean towards the sea ice could be superimposed with an additional model drift and internal variability in each forecast. To quantify the impact of the ICs inconsistency, we hypothesize that only the ocean mixed layer (estimated in EC-Earth as the depth at which the density difference with respect to the upper 10 m is larger than a threshold of 0.01 \(Kg/m^{3}\)) contributes to sea ice changes in the first month, and estimate its potential melting effect by computing the difference in heat content between ORAS4 and RECON over this layer, which in the Greenland, Baffin and Kara seas goes down to 47, 30 and 22m respectively. This heat is converted into a SIV following Eq. 2, and will represent the maximum SIV that could be melted/created by heat content differences in the ocean mixed layer due to inconsistencies between ocean and sea ice ICs, with negative values corresponding to SIV melting. Only the ocean mixed layer is considered in this estimate because of its short timescale of interaction with the sea ice, which is supposed to be rapid enough to play a key role in the initial fast adjustment of the forecast. We focus on the Baffin Bay and Greenland and Kara seas during the November forecasts because these three regions develop the largest negative SIC biases (Fig. 5f). In the Greenland Sea, the ORAS4 ocean holds enough excess heat compared to RECON to melt about three times as much sea ice than what is melted by the end of the month in PRED (Fig. 11a, orange and blue lines), which is consistent with a key contribution of the ICs inconsistency to the local SIC forecast biases. This SIV melt estimate, however, relies on the hypothesis of a local isolated mixed layer-ice system and did not consider potential lateral and vertical heat exchanges, nor exchanges with the atmosphere. This basic heat budget nonetheless supports our argument that enough energy is available from the warm ORAS4 ocean to explain the forecast sea ice melting, although other influences cannot be excluded. In the Baffin Bay, the initial SIV loss in PRED amounts to about 2/3 of the estimate of SIV that can be melted by the warm ORAS4 by the end of the month (Fig. 11b). In the Kara Sea, ORAS4 is colder than RECON and could not contribute to melt any sea ice as it is the case in the forecast, which discards any role of the ICs inconsistency (Fig. 11c).

Fig. 11
figure 11

In blue the difference in SIV between PRED initialized in November and RECON for (a) Greenland Sea, (b) Baffin Bay and (c) Kara Sea. The orange line represents the potential SIV melting that could be explained in PRED if all the heat difference between the sea surface and the mixed layer boundary at the ICs inconsistency was used to melt sea ice. In the case of the Kara Sea, the heat difference would not melt sea ice. (d) Map of the Arctic ocean regions

4.2.2 Inconsistency between the atmosphere and sea ice ICs

Most atmospheric variables can not be inconsistent with the sea ice cover, since ERA-Interim (i.e. the atmospheric ICs) and DFS (i.e. the atmospheric surface fluxes constraining RECON) are virtually identical in polar regions, except for the temperature (the only variable for which DFS introduces substantial corrections over the Arctic with respect to ERA-Interim; Dussin et al. 2016). Note also that temperature is more prone to generate inconsistencies because its vertical profile is expected to be continuous at the surface. Inconsistencies between the atmosphere and sea ice are expected to have a lesser impact than between the ocean and sea ice since the atmospheric heat capacity is about 3-to-4 times smaller than the for the ocean and the atmosphere is also much lighter (ocean density is \(\sim\)784 times higher). To quantify the effect of the warmer atmosphere in ERA-Interim than in RECON (Fig. 7), we calculate the difference between the 2-meter temperature in ERA-Interim and RECON (i.e. DFS) on October 31st over the Kara Sea, since it is the region where the contribution of the inconsistency between the ocean and sea ice ICs to PRED melting was previously discarded. If all the difference in heat from the lowest 100 meters of the atmosphere (the typical height of a stable boundary layer in polar regions) was used to melt sea ice, only 0.2 \(km^{3}\) would be melted, which is negligible compared to the 140 \(km^{3}\) melt by the end of the first forecast month (PRED minus RECON, Sea Ice Volume for Kara Sea; Fig. 11c). This suggests that the SIC forecast bias in the Kara Sea originates in the development of the model bias. It should be noted that DFS do not have temperatures above 2-meter because it is only a surface forcing, so we are assuming that the temperature difference between ERA-Interim and DFS would be constant along the vertical.

Fig. 12
figure 12

Correlation between the ICs inconsistency the day 0 and the forecast bias the first month (light blue) and synchronous correlation between the model bias and the forecast bias the first month (dark blue) for (a) May and (b) November. Correlations were calculated using the 2D bias point to point. Dots represent the significant values at the 95% level as estimated from a one-sided student-T distribution

4.2.3 Competing effect between the ICs inconsistency and the model bias

Our final analysis seeks to quantify how much of the pan-Arctic forecast bias can be explained in each forecast day by the ICs incompatibility and the model drift. We first compute spatial correlations between the SIC differences between ORAS4_ice and RECON at the corresponding forecast time 0 in PRED and the forecast bias for each of the following 31 days. We also compute the synchronous correlation between the model bias and the mean forecast bias as a function of forecast time (Fig. 12). Moreover, we excluded the region from 80N to 90N to focus only on the regions of bias. In May, the influence of ICs inconsistency on the mean forecast bias increases during the first 5 days (Fig. 12a), which is probably the time required for the warm ocean underneath to induce a significant sea ice melting (while consuming the excess of heat from ORAS4). After 5 days, its effects seem to stabilize and correlations slowly decrease but remain significant by the end of the month. The influence (as indicated by correlation) of model bias grows steadily, and by May 25th it becomes predominant over the ICs inconsistency one. By the end of the month the model bias is largely but not yet fully developed, as evidenced by its correlation with the forecast bias \(\sim\)0.7, and as already shown in Fig. 9. In November, the contributions from both sources of biases to the mean forecast bias evolve in a similar way to May. The effect of the ICs inconsistency is, however, already maximum from the first forecast day. Furthermore, the model bias prevails over the ICs inconsistency after 19 days.

5 Conclusions

In this study, we have identified and quantified the contributions from the different sources of Arctic sea ice concentration forecast biases that arise during the first month in a set of seasonal predictions produced with EC-Earth3.2. The main novelty of this work derives from the identification of the ICs inconsistency as a non-negligible source of forecast biases, which hinders the investigation of the model biases based on predictions. In our predictions, sea ice was initialized from a forced ocean reanalysis with Ensemble Kalman Filter (EnKF) assimilation of SIC (RECON).

The main results of this article are described in the following:

  • The comparison of a historical simulation (HIST) and the initialized predictions (PRED) has allowed us to study the model drift, i.e., how the inherent model bias develops in time, and to which extent the timescales for the development of biases depend on the region. The forecast bias in the PRED initialized in May has not reached the model bias (estimated as the difference between HIST and RECON) by the end of the month (Fig. 9), in particular in regions like the Baffin Bay and Barents and Kara Seas where the largest model biases can be found. For the predictions initialized in November, on the contrary, the model inherent biases seem to be fully developed before the end of the month in basins such as Barents, Kara, Chukchi and Bering Seas (Fig. 10).

  • Another source of forecast bias has been identified in the inconsistency between the initialization products used for the ocean and sea ice. In this forecast system, the ocean ICs, taken from ORAS4, are generally too warm for the overlying sea ice conditions from RECON, which leads to extensive sea ice melting during the first forecast days.

  • For the month of November, we have quantified the potential SIV melting associated to the ocean temperature differences between RECON and ORAS4 in the three regions with the largest SIC forecast biases. These estimations suggest that the ICs inconsistency could explain the negative SIC biases in the Baffin Bay and the Greenland Sea by the end of the month, but not in the Kara Sea, where it is probably linked to the model drift towards its attractor. Incompatibilities between the sea ice and the atmosphere ICs have also been investigated, but the latter have been shown to have a negligible melting effect on the sea ice.

  • By comparing the patterns of SIC forecast biases with those of the ICs inconsistency on one hand and the model bias on the other hand, we have been able to quantify their contributions to the forecast bias as a function of time. The analysis suggests that the initialization inconsistency dominates the forecast bias during the first 25 (19) days in May (November), while the model bias dominates afterwards. However, for particular regions like the Baffin Bay or the Greenland Sea the effect of ICs inconsistency can overshadow that of the model drift beyond one month.

The effect of the ICs inconsistency in the forecast biases has prevented the examination of the mechanisms behind the development of the model biases in our seasonal forecasts. Sanchez-Gomez et al. (2016) investigated in depth the development of model systematic errors in a decadal prediction system in which the atmosphere and the ocean were consistently initialized, and identified that ocean processes (including ENSO and changes in the North Atlantic ocean circulation) induced a slow adjustment towards the model attractor. The ocean and the atmosphere are known to influence Arctic sea ice at different timescales, and in this way can contribute to the development of the local model biases. In particular, previous studies have demonstrated that atmospheric processes can have large impacts on the sea ice cover: Woods and Caballero (2016) showed how the observed moist air intrusions into the Barents Sea reduce the local SIC up to a 2%. Another example would be the storm transiting over the Arctic during the summer of 2012, which favoured the Arctic SIE low record registered in September of that year (Parkinson and Comiso 2013). Therefore, model biases affecting the trajectory or frequency of occurrence of similar storms could affect the SIC as well. Or more generally, any bias affecting the atmospheric circulation, as illustrated in the analyses of Lecomte et al. (2016), who showed that wind biases could lead to systematic errors in the timing of the Antarctic sea ice growth, and by Smedsrud et al. (2011), that analyzed the link of wind-driven sea ice exports across the Fram Strait with Arctic sea ice decline. The ocean can also induce important sea ice biases, either through a direct influence of ocean temperature biases, or via key dynamical ocean processes that are not properly represented. For instance, the northward ocean heat transport, which can be a major precursor of rapid pan-Arctic sea ice declines (Auclair and Tremblay 2018), or the inflow of Atlantic Waters into the Arctic, which is key to understand the local temperature biases (Ilıcak et al. 2016).

The results from this article suffer from a few limitations. First, the calculation of the model bias has relied on the only historical simulation that had been produced with the same version of the model as used for the forecasts. Ideally, more ensemble members would be needed to filter out the internal climate variability and thus better constrain the model bias. This recommends some caution in the interpretation of our results, in particular at the regional level, where the true bias might not be accurately estimated. On top of that, the forecasts only expanded along 16 start dates, without covering the last decade, which could influence the robustness of the results discussed in this study. Finally, the sea ice reanalysis could have assimilated some ocean surface data to reduce sea ice uncertainty. Those are plans for the development of our assimilation system.

Using RECON as the reference for PRED and HIST could be viewed as the comparison between an ocean-sea ice and coupled systems. However, we consider RECON as the most trustworthy coherent estimation of the true sea ice state for this analysis because: 1) the atmospheric surface forcings provide a strong observational constraint which can be considered sufficient on its own to generate a sea ice reanalysis (Chevallier et al. 2013; Guemas et al. 2014); 2) it additionally assimilates remotely-sensed SIC data once a month, whose information can persist until the next assimilation time step, the persistence of SIC data being a few months (Blanchard-Wrigglesworth et al. 2011; Day et al. 2014); 3) it accounts for the observational uncertainty in the data it assimilates; 4) it has been produced with the same ocean and sea ice components than the forecast model; and 5) it provides their ICs, and thus allow us to easily track the development of the biases from t=0.

Despite the different biases identified and characterized in our analysis affecting the first forecast month, this same seasonal forecast system has been proven to be skillful over the Arctic and mid-latitudes, including the representation of extreme climate events (Acosta Navarro et al. 2019). There are potential ways to avoid, or at least minimize, the initial shocks, such as initializing all system components from strongly coupled assimilation runs performed with the same model version as used for the forecasts. However, reducing the initial shock does not necessarily translate into an improvement in predictive capacity as will be illustrated in an upcoming article. Recently, Kimmritz et al. (2019) showed how strongly coupled assimilation for the sea ice and ocean components enhances the sea ice skill up to 10 months for the Barents Sea, while reduces the SIT bias in comparison with ocean-only assimilation. The future of sea ice initialization points to strongly coupled assimilation, including the joint update of the atmosphere. However, the different temporal and spatial scales of the sea ice, ocean and atmosphere represents a challenging implementation (e.g. Penny et al. 2019), at the same time that it introduces a large computational cost. We are currently developing an ocean-sea ice coupled assimilation system and we plan to present its added value on the sea ice skill in the future.