1 Introduction

Arctic sea ice has quickly retreated during recent decades (Stroeve et al. 2014), which leads to large socioeconomic and climate impacts. The rapid decline of sea ice has a direct and immediate effect on Arctic communities through the impact on shipping (Ho 2010), resource extraction (Harsem et al. 2011), fisheries and marine mammals (Stenevik and Sundby 2007; Laidre et al. 2015). The loss of sea ice can affect the deep water formation, the local and remote atmospheric and oceanic circulations, and even the extreme weather in midlatitudes (Rudels and Quadfasel 1991; Thompson et al. 2000; Liu et al. 2012; Tang et al. 2013; Cohen et al. 2014; Suo et al. 2017).

Interest in sea ice prediction on seasonal to decadal time scales is increasing. Sea ice predictions have been performed using empirical methods (e.g., Lindsay et al. 2008) or coupled global climate models (e.g., Chevallier et al. 2013; Bushuk et al. 2017). Statistical methods are limited by the fact that the relationships between the predictors and predictands may be non-stationary (Lindsay et al. 2008; Holland and Stroeve 2011). Dynamical approaches have an advantage over statistical ones, as they can represent nonlinear climate regime shifts and complex interactions between different Earth system components (e.g., Barnston et al. 1999). Dynamical approaches are therefore expected to show better performance in sea ice prediction (e.g., September SIE; Guemas et al. 2016a). Previous studies based on retrospective predictions with coupled climate models have shown that prediction of detrended pan-Arctic sea ice extent (SIE) can be skillful at 1–5 (1–11) lead months for summer (winter) months (Wang et al. 2013; Chevallier et al. 2013; Sigmond et al. 2013; Merryfield et al. 2013; Bushuk et al. 2019). Analysis of “perfect model” experiments, where the model is presumed unbiased, can provide an upper bound on the predictive capability of climate models (Day et al. 2014; Germe et al. 2014). These “perfect model” studies show that pan-Arctic SIE has potential prediction skill of up to 24 and 36 months for winter and summer, respectively. The large gap between the “perfect model” and operational prediction skill indicates that there is still room for improvement in future dynamical prediction systems (Bertino and Holland 2017; Bushuk et al. 2019).

Recently, some studies have shifted the focus of sea ice prediction from the pan-Arctic to more regional spatial scales, which are of interest to a large group of end users. They try to collect and predict local sea ice diagnostics, such as regional SIE (Krikken et al. 2016; Bushuk et al. 2017; Cruz-García et al. 2019), local probability of sea ice presence (Stroeve et al. 2015), ice retreat and advance dates (Stroeve et al. 2016; Sigmond et al. 2016). These studies have shown that the sea ice prediction skill is strongly region dependent. For example, Krikken et al. (2016) found that the detrended sea ice area in the Northeast Passage and the Kara and Barents Seas is more predictable than other regional seas, in particular for forecasts initialized in May and November. A comprehensive regional assessment by Bushuk et al. (2017) demonstrated that detrended SIE prediction is skillful at lead times of 5–11 months for winter months in the Labrador, Greenland-Iceland-Norwegian (GIN), and Barents Seas, and at lead times of 1–4 months for summer months in the Laptev, East Siberian, Chukchi, Beaufort, Okhotsk, and Bering Seas.

The mechanisms of sea ice predictability vary on different time scales, in different seasons and regions (Guemas et al. 2016a; Bertino and Holland 2017; Chevallier et al. 2019). The persistence of sea ice anomaly is an important source of predictability for sea ice prediction from daily to yearly time scales. For example, Blanchard-Wrigglesworth et al. (2011) found that the persistence of sea ice area (SIA) varies seasonally from 2 to 5 months, and longer persistence is seen in winter and summer than the other seasons. Two re-emergence mechanisms, highlighted by Blanchard-Wrigglesworth et al. (2011), can provide sources of sea ice predictability on time scales from a few months to 1 year. The re-emergence mechanism usually relies on the persistence of some sea-ice related variables. The summer-to-summer re-emergence of SIA is due to the long-lived sea ice thickness (SIT) anomalies and their impact on summer SIA, while the re-emergence of SIA anomalies from melt season to growth season is due to the persistence of SST anomalies. The advection of sea ice can provide additional predictability over simple persistence (e.g., Holland et al. 2013). Because sea ice is closely coupled with the atmosphere and the ocean, the two climate components can also provide some predictability of sea ice. The atmosphere is likely to impact the sea ice variability on subseasonal to interannual time scales, e.g., the impact of the North Atlantic/Arctic Oscillation on the Arctic region (Deser 2000), while the ocean can provide the source of sea ice predictability on interannual and longer timescales (e.g., Bitz et al. 2005; Yeager et al. 2015).

Due to the predictability provided by the intrinsic memory of sea ice and its related variables, accurate initial conditions are of importance for SIE predictions (Blanchard-Wrigglesworth et al. 2011; Guemas et al. 2016b). Current climate models used for SIE predictions are usually initialized using various atmospheric and oceanic observations, such as sea surface temperature (SST), subsurface ocean temperature and salinity, sea ice concentration (SIC), air temperature, or other data from existing reanalysis (e.g. Wang et al. 2013; Sigmond et al. 2013; Bushuk et al. 2017; Kimmritz et al. 2019). Some studies focused on the added skill of assimilating sea ice thickness data, which can slightly improve the sea ice concentration forecast and particularly benefit the prediction of summer ice extent (Yang et al. 2014; Xie et al. 2016; Blockley and Peterson 2018). Recently, Wang et al. (2019) mentioned that assimilation of SST already achieves skillful seasonal SIE prediction in the Barents, Labrador, and GIN Seas. However, to our knowledge, few studies have comprehensively examined the pan-Arctic and regional SIE prediction skill with climate models that only assimilate SST.

In this study, we focus on the dynamical prediction of pan-Arctic and regional Arctic SIE provided by the Norwegian Climate Prediction Model (NorCPM, Counillon et al. 2016), which assimilates SST anomalies with the Ensemble Kalman Filter (EnKF, Evensen 2003), an advanced data assimilation method. We aim to investigate how much skill can be achieved by solely assimilating SST anomalies for seasonal to decadal predictions. The regions we focus on are the five marginal seas near the Pacific and Atlantic Oceans, where the upper ocean heat content significantly contributes to the prediction skill of sea ice (Bushuk et al. 2017). The model and experimental designs are described in Sect. 2. The performance of NorCPM reanalysis for Arctic sea ice is evaluated in Sect. 3. Seasonal to decadal prediction skill by NorCPM is assessed in Sect. 4. Because the Barents Sea is a key region for teleconnections and feedback to atmospheric patterns (Koenigk and Brodeau 2014; Årthun et al. 2019), we further discuss possible mechanisms for SIE prediction skill in this region and the effects of SST assimilation on SIE prediction skill in Sect. 5. Section 6 is the summary and discussion.

2 Methods

2.1 Model and experimental design

NorCPM is a climate prediction system developed for seasonal-to-decadal climate predictions and long-term reanalyses (Counillon et al. 2014, 2016). It combines the Norwegian Earth System Model (NorESM, Bentsen et al. 2013) and the EnKF data assimilation method. NorESM is a global fully-coupled model for climate simulations (Bentsen et al. 2013). It is based on the Community Earth System Model version 1.0.3 (CESM1, Vertenstein et al. 2012), a successor to the Community Climate System Model version 4 (CCSM4, Gent et al. 2011). In NorESM, the ocean component is an updated version of the Miami Isopycnal Coordinate Ocean Model (MICOM, Bleck et al. 1992); the sea ice component is the Los Alamos sea ice model (CICE4, Gent et al. 2011; Holland et al. 2012); the atmosphere component is a version of the Community Atmosphere Model (CAM4-Oslo, Kirkevåg et al. 2013); the land component is the Community Land Model (CLM4, Oleson et al. 2010; Lawrence et al. 2011); the version 7 coupler (CPL7, Craig et al. 2012) is used.

The version of NorCPM and NorESM and the experimental designs used in this study are the same as in Counillon et al. (2016) and Wang et al. (2019). The reader is referred to the two papers for details; here, we only provide a brief description. NorESM is initialized from a random preindustrial stable condition and integrated from 1850 up to 2010 with CMIP5 forcing (Taylor et al. 2012). The 30-member ensemble mean of NorESM is referred to as FREE in the following. The 30-member reanalysis product of NorCPM has the same initial conditions as NorESM in 1950 and assimilates SST anomalies from Hadley Centre Sea Ice and Sea Surface Temperature dataset version 2.1 (HADISST2, Rayner et al. 2003) every month. Anomalies of the observation and model are defined relative to the 1950–2010 reference period mean. HADISST2 provides 10 realizations of monthly gridded SST over 1850–2010 with a 1° resolution. We consider the average and variance of these 10 realizations as observation and its error variance, respectively. SST data in the regions covered by sea ice are not assimilated. In terms of the EnKF implementation, we use a deterministic variant of the EnKF (DEnKF, Sakov and Oke 2008). We use the local analysis framework (Evensen 2003) in which assimilation is performed for each horizontal grid cell. The horizontal localization radius is limited to the grid cell dimensions, which vary from region to region. The ensemble spread is sustained by using the moderation technique and the pre-screening method (Sakov et al. 2012). The data assimilation only corrects the ocean state, meaning that the atmospheric and sea-ice components are only influenced dynamically in between the assimilation cycles. This reanalysis product is hereafter referred to as REANA_long and used for the study of decadal prediction. 
A second NorCPM reanalysis product used in this study has the same initial conditions as NorESM in 1980 and uses 1980–2010 as the climatological reference period for SST anomaly assimilation. This reanalysis product is hereafter referred to as REANA_short and used for the study of seasonal prediction.
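To illustrate how an SST observation can correct an unobserved ocean variable through ensemble covariances, the following is a minimal sketch of a deterministic EnKF (DEnKF) analysis step for a single scalar observation. It is not the NorCPM implementation: the function and variable names are hypothetical, the observation operator simply picks out the SST value, and localization, the moderation technique, and pre-screening are all omitted.

```python
from statistics import fmean


def denkf_update(sst_ens, hidden_ens, obs, obs_var):
    """One DEnKF step for a scalar SST observation (illustrative only).

    sst_ens    : ensemble of the observed variable (SST anomaly)
    hidden_ens : ensemble of an unobserved variable (hypothetical, e.g. subsurface temperature)
    obs        : observed SST anomaly
    obs_var    : observation error variance
    """
    m = len(sst_ens)
    sst_mean, hid_mean = fmean(sst_ens), fmean(hidden_ens)
    sst_anom = [x - sst_mean for x in sst_ens]
    hid_anom = [x - hid_mean for x in hidden_ens]

    # Ensemble (co)variances with the unbiased 1/(m-1) normalisation.
    var_sst = sum(a * a for a in sst_anom) / (m - 1)
    cov_hid = sum(a * b for a, b in zip(sst_anom, hid_anom)) / (m - 1)

    # Kalman gains for the observed and the unobserved variable.
    k_sst = var_sst / (var_sst + obs_var)
    k_hid = cov_hid / (var_sst + obs_var)

    innovation = obs - sst_mean
    # The mean receives the full Kalman update; the anomalies are updated
    # with half the gain, which is the DEnKF approximation.
    new_sst = [sst_mean + k_sst * innovation + a - 0.5 * k_sst * a
               for a in sst_anom]
    new_hid = [hid_mean + k_hid * innovation + b - 0.5 * k_hid * a
               for a, b in zip(sst_anom, hid_anom)]
    return new_sst, new_hid
```

The unobserved variable is corrected only as far as it covaries with SST in the ensemble, which is why the quality of the update depends on the ensemble statistics.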

The prediction skill is assessed based on NorCPM hindcasts (i.e. retrospective predictions). Seasonal hindcasts start on the 15th of January, April, July, and October each year during 1985–2010, and thus, there are a total of 104 hindcasts. Each hindcast consists of 9 realizations (ensemble members) and is 13 months long. Initial conditions are taken from the first 9 members out of REANA_short. Since all members are equally likely with the EnKF, this choice is purely arbitrary. Decadal hindcasts start on the 15th of November every 2 years from 1959 to 1999. There are 21 hindcasts, and each hindcast consists of 20 members and is integrated for 10 years. Thus, the hindcasts cover the years from 1960 to 2010. Initial conditions are taken from the first 20 members out of REANA_long.
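The hindcast counts quoted above follow directly from the start dates. A short check, using only the start years and months stated in the text (the variable names are illustrative):

```python
# Seasonal hindcasts: four start months per year over 1985-2010.
seasonal_starts = [(year, month)
                   for year in range(1985, 2011)
                   for month in (1, 4, 7, 10)]

# Decadal hindcasts: one start every 2 years from 1959 to 1999.
decadal_starts = list(range(1959, 2000, 2))

# Total number of individual model integrations (hindcasts x ensemble members).
seasonal_runs = len(seasonal_starts) * 9
decadal_runs = len(decadal_starts) * 20

print(len(seasonal_starts), len(decadal_starts))  # 104 21
print(seasonal_runs, decadal_runs)                # 936 420
```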

2.2 Prediction skill assessment

In this study, the modeled sea ice thickness is evaluated against CryoSat data (http://cci.esa.int/), which are available for October–April during 2002–2017. We use data from HADISST2 as the observation to verify the sea ice skill in the NorCPM reanalysis and predictions. An independent sea ice concentration product, OSI-450 (OSI SAF 2017, https://doi.org/10.15770/EUM_SAF_OSI_0008), which is included in the observational sea ice intercomparisons (Ivanova et al. 2015; Kern et al. 2019), is also used to evaluate the sea ice predictions. The variability of the detrended SIE anomalies is consistent between the two datasets (Fig. 1a), and the main results of this study do not change when the observation dataset is changed (not shown). Because OSI-450 is only available after 1979, in the following sections, we mainly show results using HADISST2 as the observation. SST and SIC from HADISST2 are regridded onto the NorCPM grid to avoid systematic biases arising from the different land-sea masks of the two grid systems. SIE in a specific region is defined as the areal sum of all grid cells where the SIC exceeds 15%. The long-term linear trend has been removed from each variable, and the anomalies mentioned in the following sections all refer to the detrended anomalies. The anomaly correlation coefficient (ACC) between NorCPM and the observation is used to assess the sea ice skill. The statistical significance of the ACC value is tested with Student’s t test. The effective number of degrees of freedom is calculated as \(N_{eff} = \frac{1 - r_{1} r_{2}}{1 + r_{1} r_{2}} N\), where N is the length of the time series, and \(r_{1}\) and \(r_{2}\) are the lag-1 autocorrelations of the two time series (Bretherton et al. 1999).
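The skill metrics described above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the code used in the study: all function names are hypothetical, grid cells are passed as flat lists, and the t statistic assumes \(N_{eff} - 2\) degrees of freedom, which is a common convention but is not stated in the text.

```python
import math


def sea_ice_extent(sic, cell_area, threshold=0.15):
    """Sum the area of all grid cells whose ice concentration exceeds 15%."""
    return sum(a for c, a in zip(sic, cell_area) if c > threshold)


def detrend(series):
    """Remove the least-squares linear trend and return the anomalies."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [y - y_mean - slope * (t - t_mean) for t, y in enumerate(series)]


def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lag1_autocorrelation(x):
    """Correlation of a series with itself shifted by one time step."""
    return correlation(x[:-1], x[1:])


def acc_with_neff(model, obs):
    """Return the ACC of detrended anomalies, N_eff, and the t statistic."""
    m, o = detrend(model), detrend(obs)
    r = correlation(m, o)
    r1, r2 = lag1_autocorrelation(m), lag1_autocorrelation(o)
    n_eff = len(m) * (1 - r1 * r2) / (1 + r1 * r2)
    dof = n_eff - 2  # assumption: t test with N_eff - 2 degrees of freedom
    if dof > 0 and abs(r) < 1:
        t_stat = r * math.sqrt(dof / (1 - r * r))
    else:
        t_stat = float("nan")  # too few effective samples for a t test
    return r, n_eff, t_stat
```

The returned t statistic would then be compared with the critical value of the t distribution at the chosen confidence level. When both series are positively autocorrelated, \(N_{eff}\) is smaller than N, so a given ACC is harder to declare significant.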

Fig. 1

a Yearly timeseries of detrended annual-mean pan-Arctic SIE from HADISST2 (black solid), OSI-450 (black dashed), REANA_long (red) and FREE (blue). Red (blue) shading represents the ensemble envelope for REANA_long (FREE). Seasonal cycles of pan-Arctic SIE (b) and area-mean SST north of 60° N (c). d ACCs between observed SIE and REANA_long (red) and between observed SIE and FREE (blue) for each month

3 Evaluation of the reanalysis product

As sea ice prediction skill depends on the initial state of the prediction system, we start with an assessment of the NorCPM reanalysis product. Because the two reanalysis products have similar features, we only show the results from REANA_long in this section. The interannual variability of annual-mean pan-Arctic SIE is better represented in REANA_long than in FREE; the ACC increases from 0.38 in FREE to 0.59 in REANA_long, and the root mean square error (RMSE) decreases (Table 1). The standard deviation of the interannual variability in FREE (\(0.08 \times 10^{6}\ \mathrm{km}^{2}\)) is much smaller than that in the observation (\(0.31 \times 10^{6}\ \mathrm{km}^{2}\), Fig. 1a). REANA_long improves the representation of interannual variability but its standard deviation is still too small (\(0.15 \times 10^{6}\ \mathrm{km}^{2}\)), which indicates that assimilating SST alone does not sufficiently synchronize the sea ice variability. It should be noted that an alternative calculation method is to first calculate the standard deviation of the interannual variability for each member in FREE or REANA_long and then take the ensemble mean of these standard deviations. This method results in a standard deviation of \(0.20 \times 10^{6}\ \mathrm{km}^{2}\) for both FREE and REANA_long, which is still smaller than the variability in the observation. The ensembles of REANA_long and FREE almost envelop the observed variability, including the extremely low annual-mean SIE in 1974 and 2006, which indicates that the ensembles have some reliability. As shown in Fig. 1a, assimilation of SST narrows the model spread.
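The difference between the two ways of computing the standard deviation can be reproduced with a toy ensemble. The sketch below uses entirely synthetic numbers (a common signal plus independent noise per member, with arbitrarily chosen amplitudes); it only illustrates why the standard deviation of the ensemble mean is smaller than the mean of the member standard deviations, and does not reproduce the values reported above.

```python
import random
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the synthetic example is repeatable
n_years, n_members = 51, 30

# Synthetic data: a shared signal plus independent noise in each member.
signal = [random.gauss(0.0, 0.1) for _ in range(n_years)]
members = [[s + random.gauss(0.0, 0.2) for s in signal]
           for _ in range(n_members)]

# Method 1: average the members first, then take the standard deviation.
ensemble_mean = [fmean(m[t] for m in members) for t in range(n_years)]
std_of_ensemble_mean = stdev(ensemble_mean)

# Method 2: standard deviation of each member, then the ensemble mean.
mean_of_member_std = fmean(stdev(m) for m in members)

print(std_of_ensemble_mean < mean_of_member_std)  # True
```

Averaging over members cancels most of the independent noise, so the first method mainly measures the variability shared by all members, while the second retains each member's internal variability.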

Table 1 ACCs and RMSEs of detrended annual-mean and monthly-mean pan-Arctic SIE between observations and REANA_long/FREE during the period of 1960–2010

Seasonal cycles of pan-Arctic SIE (Fig. 1b) and SST (Fig. 1c) are almost identical between REANA_long and FREE because we use anomaly assimilation, which leaves model biases unchanged. REANA_long and FREE overestimate (underestimate) summer SIE (SST) compared to observations, which is likely an effect of the reported too-weak snow melt in the summer season in NorESM (Bentsen et al. 2013). The overall modeled values for SIE and SST are close to the observations in winter, but model biases in winter can be as large as those during summer for specific regions (Fig. 2). FREE and REANA_long have less sea ice in the Labrador Sea and the GIN Seas in March (Fig. 2a) compared to observations, and much more sea ice in the Labrador Sea and the Barents/Kara Seas in September (Fig. 2b). The discrepancy decreases when considering the annual-mean SIC (Fig. 2c). The ACCs of pan-Arctic SIE between REANA_long/FREE and observation also show a pronounced seasonal cycle and are larger in winter than in summer (Fig. 1d). Assimilation of SST anomalies improves the ACCs by about 0.2 for each month and is particularly beneficial for August, when pan-Arctic SIE is close to its minimum.

Fig. 2

Climatology of sea ice edge (15% sea ice concentration) in March (a), September (b), and annual mean (c) during 1960–2010 from REANA_long (red lines) and HADISST2 (black lines). d The regions considered for Arctic sea ice

To assess the skill in REANA_long in more detail, we distinguish between 14 regions of the Arctic Ocean (Fig. 2d) as defined by the National Snow and Ice Data Center (NSIDC) and also used in Bushuk et al. (2017). By assimilating SST anomalies, REANA_long captures the interannual variability of annual-mean SST well, with most of the ACC values higher than 0.8 (Fig. 3a). There is also some skill in annual-mean SIC, particularly in the Arctic marginal seas (Fig. 3b). Among these regions, the ACC value in the Barents Sea is the highest, which is due to the strong relationship between the variability of sea ice and upper ocean heat content (HC) in this region (Sandø et al. 2010; Wang et al. 2019; Kimmritz et al. 2019). Overall, the regional performance of SIC in winter (Fig. 3c) is better than that in summer (Fig. 3d). In summer, sea ice variability in the Arctic shelf seas is mostly driven by the initial condition of sea ice thickness (Bushuk et al. 2017). Because of SIT data availability, we assess only the model performance for winter SIT (Fig. 4). We find that SST assimilation only slightly reduces SIT biases. The slight improvement may be associated with changes in the variability and/or mean state of the ocean component caused by the assimilation of oceanic data, but further investigation is needed to identify the exact reason.

Fig. 3

Pointwise ACCs between REANA_long and observation for annual-mean SST (a), annual-mean SIC (b), SIC in ice maximum season (c), and SIC in ice minimum season (d). Grey shading indicates that the coefficient of variation (defined as the standard deviation divided by the mean value) of SIC/SST either from REANA_long or observation is smaller than 0.01

Fig. 4

The difference of sea ice thickness between REANA_long and CryoSat (a), and between FREE and CryoSat (b) over October–April during 2002–2010. The unit is m

4 Regional Arctic SIE prediction skill

In this section, we assess the seasonal and decadal prediction skill of regional Arctic SIE. We mainly focus on the prediction skill in regions exposed to the open oceans: the Bering Sea and the Sea of Okhotsk near the Pacific sector, and the GIN, Barents, and Labrador Seas near the Atlantic sector. The Arctic shelf seas are not considered because the ACC skill of SIC there is poor even in the NorCPM reanalysis (Fig. 3). To assess the prediction skill, we compare the ACC skill of the NorCPM hindcast with that of FREE and of a persistence forecast. Whenever the skill of the hindcast exceeds that of FREE, the improvement in the initial state due to SST assimilation enhances SIE prediction skill. The persistence forecast in this study is an anomaly persistence forecast obtained by persisting the SIE anomaly of the initial year (month) in the NorCPM reanalysis product up to the target year (month) for decadal (seasonal) prediction. It differs from standard persistence, which is calculated from observations, but this metric is useful to diagnose the reason for the loss of predictability. If hindcast skill exceeds persistence, it indicates that our dynamical model correctly propagates the improvement of the initial conditions forward in time.
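The anomaly persistence forecast described above can be sketched as follows. This is a simplified illustration with hypothetical names: it assumes a single evenly spaced series of anomalies (for example, annual means), pairs each reanalysis anomaly at the start time with the observed anomaly `lead` steps later, and scores the pairs with a Pearson correlation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def persistence_skill(reanalysis_anom, observed_anom, lead):
    """ACC of an anomaly persistence forecast at a given lead (in time steps).

    The forecast for time t + lead is the reanalysis anomaly at time t,
    verified against the observed anomaly at time t + lead.
    """
    if not 0 < lead < len(reanalysis_anom):
        raise ValueError("lead must be between 1 and len(series) - 1")
    forecasts = reanalysis_anom[:-lead]
    targets = observed_anom[lead:]
    return pearson(forecasts, targets)
```

Because the forecast is built from the reanalysis rather than from observations, its skill at short leads is bounded by the skill of the reanalysis itself.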

4.1 Seasonal prediction skill

The seasonal prediction skill of SIE is evaluated based on the NorCPM hindcasts starting in January, April, July, and October every year from 1985 to 2010. As shown in Fig. 5, the ACC skill strongly depends on initial and target months. NorCPM has skill in predicting pan-Arctic SIE at lead times of roughly 1–7 months and is particularly skillful for winter target months (e.g., up to 12 lead months for January). When different Arctic regions are considered, the five marginal seas can be divided into three groups. Group one includes the Barents Sea and the Labrador Sea near the North Atlantic sector. The two regions show higher prediction skill than the other regions (Fig. 5). This can be partly attributed to the higher inherent persistence of SIE anomalies in the two regions (Fig. 6), where the mixed layer is thicker and sea ice variability is less influenced by the atmosphere (Kimmritz et al. 2019). The Labrador Sea has high SIE skill for spring target months. The SIE skill in the Barents Sea is the highest among the five regions and extends up to 10–11 months for winter-spring target months (Fig. 5). In the GIN Seas (Group two), NorCPM can only skillfully predict the following 1–3 months for hindcasts initialized in January and April. The rapid loss of SIE skill in this region may be due to the impact of sea ice export through the Fram Strait from the Arctic Basin, where SIE prediction skill is poor in this version of NorCPM. Group three includes the Bering Sea and the Sea of Okhotsk near the Pacific sector. It should be noted that the SIE skill in some summer–autumn months in these two regions is of little interest, as both NorCPM and observations are almost ice-free in summer (Fig. 2b). The good performance of SIE prediction in these regions can be partly explained by the strong constraint that SST variability places on SIE variability (Fig. 7); SST variability is directly improved by SST assimilation.

Fig. 5

ACCs between observed SIE and NorCPM seasonal hindcasts, organized as a function of target month (x-axis) and lead months (y-axis). Dots indicate that ACC values are statistically significant at the 95% confidence level. Gray shading is used when hindcasts are not available. White shading indicates ice-free either in observation or NorCPM

Fig. 6

Same as Fig. 5 but for persistence forecast. Dots indicate that the SIE skill of NorCPM hindcast exceeds that of persistence forecast

Fig. 7

ACCs between SST and SIE from NorCPM seasonal hindcasts, organized as a function of target month (x-axis) and lead months (y-axis). Dots indicate that ACC values are statistically significant at the 90% confidence level. Gray shading is used when hindcasts are not available. White shading indicates ice-free either in observation or NorCPM

Overall, the NorCPM hindcast shows better performance than the persistence forecast (indicated by dots in Fig. 6). However, for all target months in the Sea of Okhotsk and for target months from July to December in the Barents Sea, the NorCPM hindcast shows poorer skill than the persistence. Since the degradation happens long after the initialization, it may be related to the poor model dynamics or a seasonal bias in the model. In contrast, some degradation happens at the starting time of the hindcasts (e.g., in the pan-Arctic, the Bering and Barents Seas). This may be related to the assimilation shock, and the reasons why we introduce such imbalance are as follows: (a) we only update the ocean component of the earth system and there may be some adjustment in the other components (e.g. ice and atmosphere); (b) we do not assimilate SST under sea ice, which may cause spatial discontinuity.

Compared with FREE, NorCPM substantially improves regional SIE prediction skill by solely assimilating SST (indicated by dots in Fig. 8). As shown in Fig. 8, FREE has almost no skill in the GIN and Labrador Seas. For February in the Bering Sea and April in the Sea of Okhotsk, however, FREE has some skill. The mechanism for the SIE skill in FREE is referred to as “model dynamics” in the following, in contrast to the SIE skill stemming from the assimilation of SST anomalies. The better performance of FREE in some regions may be related to the mechanism being better isolated in the 30-member ensemble mean of FREE than in the 9-member ensemble mean of the hindcasts. In the Barents Sea, FREE also shows some skill for target months from February to July (Fig. 8), while the most notable improvement in the NorCPM hindcast (Fig. 5) occurs in target months from December to March. This may be related to the re-emergence of well-constrained spring SST anomalies in winter (Blanchard-Wrigglesworth et al. 2011). It indicates that the prediction skill in the Barents Sea may come from both the model dynamics (February–July) and SST assimilation (December–March). The sources of seasonal prediction skill in the Barents Sea will be further discussed in Sect. 5. It should be noted that the pan-Arctic SIE skill in August and September in FREE (Fig. 8) stems from the SIE skill in the Kara Sea and Hudson Bay (not shown), and these two regions are not the focus of this paper.

Fig. 8

Same as Fig. 5 but for FREE. Dots indicate that the SIE skill of NorCPM hindcast exceeds that of FREE

4.2 Decadal prediction skill

The decadal prediction skill of annual-mean SIE is evaluated based on the NorCPM hindcasts starting every 2 years from 1959 to 1999. The total Arctic SIE prediction has significant skill up to 10 years ahead (not shown). It indicates that NorCPM captures the long-term decreasing trend of sea ice well, which is mostly due to external forcings. In the following, we mainly focus on the detrended SIE anomalies and compare the NorCPM prediction skill with that of the anomaly persistence forecast and FREE. We also assess how well our decadal prediction maintains the skill of the reanalysis product. As shown in Fig. 9, REANA_long (green line) has relatively high skill for all considered regions, while FREE (blue line) has almost no skill except in the Bering Sea, where ACC values remain ~0.25 for all lead times. The skill of the persistence forecast (black line) decreases rapidly after a lead time of 1 year. For the NorCPM hindcast (red line), the prediction skill of pan-Arctic SIE is significant at a lead time of 1 year and better than that of FREE and the persistence forecast. However, the 1-year prediction skill shows obvious regional differences. In the two “isolated” seas (i.e., the Sea of Okhotsk and the Labrador Sea), the lead-1-year ACCs of the hindcasts are statistically significant and close to those of REANA_long, similar to the pan-Arctic case. In the GIN and Barents Seas, the skill of the lead-1-year hindcast is also significant but lower than that of REANA_long. In the Bering Sea, by contrast, the hindcast skill decreases rather quickly, and even the ACC of the lead-1-year hindcast is not significant. Beyond a lead time of 1 year, NorCPM prediction skill decreases rapidly for all considered regions and is mostly not statistically significant. The lack of detrended SIE prediction skill on interannual to decadal time scales was also found in Germe et al. (2014). On such time scales, the variability of sea ice is impacted by the variability of the atmosphere or ocean (Guemas et al. 2016a). Therefore, improving the predictability of these two climate components may improve the predictability of detrended SIE on decadal time scales.

Fig. 9

ACCs between observed SIE and NorCPM decadal hindcasts (red), REANA_long (green), FREE (blue), and persistence forecast (black). Dots indicate that ACC values are statistically significant at the 95% confidence level

It is surprising to find that two regions (i.e., the GIN Seas and the Bering Sea) show re-emerging skill. The Bering Sea shows re-emerging skill in years 6 and 10. We further examine the seasonal-mean SIE and find that the re-emerging skill of annual-mean SIE mainly comes from the SIE skill in the ice maximum (February–April) and melting (May–July) seasons (not shown). In addition, the re-emerging skill at a lead time of 10 years in the Bering Sea also occurs in the annual-mean SST. The GIN Seas show re-emerging prediction skill in years 7 and 8. The re-emerging skill is also significant when different seasons (e.g., ice minimum/maximum and ice growth/melting seasons) are considered (not shown). A similar peak of the hindcast skill at long lead times (7–9 years) was also found for detrended SST in the Nordic Seas in one of the CMIP5 models (Langehaug et al. 2017), while no significant re-emerging skill in the GIN Seas was found in Germe et al. (2014). There is a clear difference between the Bering Sea and the GIN Seas: in the Bering Sea, the re-emerging skill of the hindcast is similar to that of FREE, whereas in the GIN Seas the skill is higher than that of FREE. This suggests that the re-emerging skill in the GIN Seas at this lead time may come from the predictability carried by ocean conditions. Because the two regions are at locations where the exchange of water masses between the Arctic and Pacific/Atlantic Oceans occurs, the re-emergence might be related to either the (sub)decadal variability in the Pacific/Atlantic regions (e.g., the Atlantic meridional overturning circulation; Mahajan et al. 2011) or the variability of Arctic outflow (Proshutinsky et al. 2015). However, it should be noted that the decadal predictions start every 2 years, so we have a relatively small sample of 21 points from which to derive the correlations. Therefore, such re-emergence of skill could arise by chance through sampling variability. The reason for the re-emergence requires further investigation.

5 Sources of seasonal prediction skill in the Barents Sea

Among all the considered sub-Arctic regions in this study, the seasonal prediction skill by NorCPM in the Barents Sea is the highest (see Sect. 4.1). Besides, FREE also has skill for target months from February to July in the Barents Sea (Fig. 8). It indicates that in addition to the improved initial conditions by SST assimilation, there are other mechanisms responsible for SIE prediction skill in this region (e.g., Bushuk et al. 2017; Kimmritz et al. 2019). For example, Onarheim et al. (2015) presented skillful SIE prediction in the Barents Sea based on statistical methods using observed ocean heat transport. Therefore, we will discuss potential mechanisms for seasonal SIE prediction skill in the Barents Sea in this section.

5.1 Energy budget in the Barents Sea

The heat budget in the Barents Sea is examined to investigate what is responsible for sea ice variability and predictability in this region. As shown in Fig. 10a, SIE variability in the Barents Sea is significantly correlated with ocean heat content (HC), particularly in winter and spring (also see Bushuk et al. 2017). Negative correlations imply that warmer ocean conditions correspond to less extensive sea ice cover, as a warmer ocean inhibits the formation of sea ice. NorCPM can skillfully predict SST variability up to a 12-month lead time in the region that extends from the Iceland Basin to the Barents Sea (Wang et al. 2019). Thus, it is reasonable that NorCPM shows improved prediction skill in the Barents Sea due to the improvement of SST (Fig. 8).

Fig. 10

ACCs between NorCPM hindcast SIE and HC (a), BSO heat transport (b), solar heat flux (c), longwave radiation (d), latent heat flux (e), and sensible heat flux (f) in the Barents Sea. ACCs in (b, c) are calculated with SIE lagging the BSO heat transport and solar heat flux by 1 month. HC and heat fluxes are averaged over the areas with climatological sea ice cover for each month. Downward heat fluxes are positive. Dots indicate that ACC values are statistically significant at the 90% confidence level

Ocean heat flux through the Barents Sea Opening (BSO) plays an important role in sea ice variability in the Barents Sea (Sandø et al. 2010; Årthun et al. 2012; Keghouche et al. 2010). More heat transported through the BSO can lead to less sea ice cover in the Barents Sea. NorCPM reasonably simulates the BSO heat transport, with a mean value of 56 TW and a standard deviation of 13 TW, which is consistent with observations and model simulations in previous studies (e.g., Årthun et al. 2012). Figure 10b shows the ACCs between the SIE in the Barents Sea and the lead-1-month BSO heat transport. We chose this time lag because the largest correlation is found for SIE in the Barents Sea lagging the BSO heat transport by 1 month in the ocean model MICOM (similar to the one used in this study; Sandø et al. 2010). We also tested other time lags (e.g., lag 0 or 2 months) in NorCPM and obtained similar or lower correlations (not shown). As shown in Fig. 10b, for NorCPM seasonal hindcasts, BSO heat transport is only significantly correlated with SIE in winter-spring target months (a mean correlation of −0.4 for months from January to May). The correlations for summer target months (from July to September) are relatively low. Similar results were found for ACCs between HC and BSO heat transport (not shown). The weak correlations during the summer months may arise because the BSO heat transport is weak and SIE in the Barents Sea is less influenced by the Atlantic inflow. The water is more stratified (shallower mixed layer) and the ice is thinner (Bushuk et al. 2017); thus, SIE may be more sensitive to wind, which makes it less predictable. In addition, the sea ice cover during summer is mainly in the northern part of the Barents Sea (Fig. 2b), and there may be a long time lag between the time when the warm waters enter the BSO and the time when these waters spread into the Barents Sea (Sandø et al. 2010).
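The lagged correlation used here can be sketched as below. The data layout is an assumption made for illustration: each hindcast is a row of monthly values indexed by lead month, and the correlation is taken across hindcasts for a fixed target index, with SIE lagging the heat transport by `lag` months. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_acc(sie, transport, target, lag):
    """Correlate SIE at lead index `target` with heat transport `lag` months
    earlier, across hindcasts (one row per hindcast)."""
    if lag < 0 or target - lag < 0:
        raise ValueError("need 0 <= lag <= target")
    driver = [row[target - lag] for row in transport]
    response = [row[target] for row in sie]
    return pearson(driver, response)


def best_lag(sie, transport, target, lags=(0, 1, 2)):
    """Return the lag with the largest absolute correlation among `lags`."""
    usable = [lag for lag in lags if lag <= target]
    return max(usable, key=lambda lag: abs(lagged_acc(sie, transport, target, lag)))
```

A negative value from `lagged_acc` corresponds to the relationship described in the text: more heat transported through the opening is followed by less sea ice.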

We next examine the relationships between SIE in the Barents Sea and the surface heat fluxes (output from the atmospheric component; downward heat fluxes are positive). The heat fluxes are averaged over the areas with climatological sea ice cover for each month (Fig. 10c–f). We also examined the heat fluxes averaged over the entire Barents Sea, and the results are similar (not shown). The ACCs are calculated with SIE lagging solar heat flux by 1 month, as this lag gives the largest correlation between SIE and solar heat flux in NorCPM and was also reported by Sandø et al. (2010). As shown in Fig. 10c, summer SIE in the Barents Sea is significantly correlated with the variability of net solar heat flux at the surface (a mean correlation of −0.69 for months from April to September). This indicates that summer SIE is largely constrained by solar heat flux, and the negative correlations imply that a larger solar heat flux into the Barents Sea leads to less sea ice cover. Because both the NorCPM hindcasts and FREE use the same CMIP5 external forcing, FREE also has some SIE prediction skill for months from February to July (Fig. 8). In winter, the relationship between SIE and solar heat flux is very weak (Fig. 10c) because there is almost no sunlight. The net non-solar heat fluxes (positive downward) are also significantly correlated with SIE for most hindcasts (Fig. 10d–f), but they are likely responses to the change of SIE (if the net non-solar heat fluxes were the cause of sea ice variability, the correlations should be negative). Positive correlations mean that less sea ice cover corresponds to more non-solar heat flux into the atmosphere. This can be understood as follows: reduced sea ice cover leads to more open water and higher temperature in the Barents Sea, which in turn increases the upward longwave radiation and the latent and sensible heat fluxes.

5.2 Surface wind

In addition to the heat budget in the Barents Sea, the variability of SIE could also be driven by atmospheric fluctuations (Ingvaldsen et al. 2004; Kwok 2009; Keghouche et al. 2010; Lien et al. 2016; Olonscheck et al. 2019). For example, northerly wind anomalies can increase sea ice export from the Arctic Ocean to the Barents Sea and reduce the Atlantic inflow through the BSO. In this way, both the wind-driven ice export and the reduced Atlantic inflow can lead to more sea ice cover in the Barents Sea. The relationship between the variability of SIE and the near-surface wind is examined in the NorCPM seasonal prediction system (Fig. 11). For the ice maximum season (FMA), SIE is mostly related to near-surface wind in the southern Barents Sea. The northeasterly wind anomalies favor the southward extension of sea ice and reduce the Atlantic inflow. For the ice minimum season (ASO), easterly wind anomalies can also reduce the Atlantic inflow through the BSO. In addition, sea ice cover in this season lies mainly in the northern Barents Sea, and thus SIE is closely related to the variability of near-surface wind over the northern part of the Barents Sea. Since the near-surface wind cannot be improved by assimilating SST and is less predictable, constraining the atmospheric state to improve the SIE prediction skill remains a challenge.

Fig. 11

Regressions of 1000 hPa horizontal wind anomalies upon the Barents SIE for ice maximum season (a) and ice minimum season (b). The red line defines the borders of the Barents Sea
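A regression of a wind field upon an SIE index, as in Fig. 11, can be sketched as a least-squares slope computed independently at each grid point. The sketch below assumes the slope is expressed per unit of the SIE index (whether the index was standardized in the original analysis is not stated here), and the function name `regression_slopes` and the toy values are hypothetical.

```python
def regression_slopes(field, index):
    """Least-squares slope of each grid point's anomaly series on an index.

    field[t][k] is the anomaly at time step t and grid point k;
    index[t] is the SIE anomaly at time step t.
    Returns one slope per grid point (field units per unit of index).
    """
    n = len(index)
    if n != len(field) or n < 2:
        raise ValueError("field and index need the same length (at least 2)")
    mean_idx = sum(index) / n
    var_idx = sum((v - mean_idx) ** 2 for v in index)
    if var_idx == 0:
        raise ValueError("index has zero variance")
    slopes = []
    for k in range(len(field[0])):
        column = [row[k] for row in field]
        mean_col = sum(column) / n
        cov = sum((c - mean_col) * (v - mean_idx) for c, v in zip(column, index))
        slopes.append(cov / var_idx)
    return slopes


# Toy example (illustrative only): two grid points, four time steps.
sie_index = [1.0, 2.0, 3.0, 4.0]
u_wind = [[2.0 * s, 3.0 - 0.5 * s] for s in sie_index]
slopes = regression_slopes(u_wind, sie_index)
```

Applying the same function to the zonal and meridional components separately gives the two components of a regression vector at each grid point, which is the kind of quantity a vector regression map displays.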

6 Summary and discussion

This study investigates the seasonal and decadal prediction skill of regional Arctic SIE in NorCPM. The NorCPM hindcasts performed in this study are initialized by solely assimilating SST anomalies with the EnKF. The main focus of this study is to examine how much SIE skill can be obtained by solely assimilating SST in coupled climate models. The prediction skill is assessed based on detrended SIE anomalies. For all the considered regions, most NorCPM hindcasts outperform FREE, which highlights the importance of SST assimilation in SIE predictions. For seasonal predictions, NorCPM can skillfully predict pan-Arctic SIE up to 12 months ahead. Overall, our model shows higher skill in regions near the Atlantic sector than near the Pacific sector, consistent with the results in Bushuk et al. (2017). This may be partly due to the higher SIE persistence near the Atlantic sector and to the larger heat capacity of the deeper mixed layer in that region, which allows skill to recur in the winter season (Kimmritz et al. 2019). Among these regions, the prediction skill in the Barents Sea is the highest and extends to 10–11 months for winter target months. For decadal predictions, NorCPM has significant skill in predicting SIE in the pan-Arctic, the Sea of Okhotsk, the Labrador Sea, the Barents Sea, and the GIN Seas up to a lead time of 1 year.
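As an illustration of what assessing skill on detrended anomalies involves, the following minimal sketch removes a least-squares linear trend from a yearly series before any correlation is computed. The linear form of the trend and the function name `linear_detrend` are assumptions made for illustration; other detrending choices are possible.

```python
def linear_detrend(series):
    """Remove the least-squares linear trend (and the mean) from a series."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least 2 points to fit a trend")
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    sty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sty / stt
    # Residual = value minus the fitted line at that time step.
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


# Toy example (illustrative only): a declining series with small fluctuations.
sie_raw = [10.0, 9.6, 9.1, 8.4, 8.2, 7.5]
sie_detrended = linear_detrend(sie_raw)
```

Correlating detrended hindcast and observed series prevents a shared long-term decline from being counted as interannual prediction skill.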

Sources of seasonal prediction skill in the Barents Sea are examined in this study. As indicated in previous studies (Sandø et al. 2010; Årthun et al. 2012; Bushuk et al. 2017), the variability of winter SIE in the Barents Sea is strongly constrained by SST and upper ocean heat content and is closely related to ocean heat transport through the BSO. Thus, by improving the initial ocean conditions, NorCPM shows a clear improvement over FREE in predicting winter SIE in the Barents Sea. In addition, the variability of summer SIE in the Barents Sea is largely controlled by the variability of solar heat flux. Both NorCPM and FREE have some skill in predicting summer SIE in the Barents Sea, which may be due to the reasonable representation of the variability of solar heat flux in the CMIP5 external forcing.

This study demonstrates that by solely assimilating SST observations with the EnKF, NorCPM provides skillful Arctic regional SIE predictions. The seasonal SIE prediction skill is comparable with that of state-of-the-art dynamical prediction systems that assimilate more data, such as the National Centers for Environmental Prediction coupled Climate Forecast System version 2 (CFSv2) in Wang et al. (2013), the Canadian Seasonal to Inter-annual Prediction System (CanSIPS) in Sigmond et al. (2013), and the GFDL prediction systems in Bushuk et al. (2017). Across the five Arctic marginal seas, NorCPM performs well in SIE predictions for winter target months, when the SIE skill is largely constrained by ocean temperature. In contrast, NorCPM has relatively low SIE prediction skill for summer target months compared with GFDL. The better skill in summer months in GFDL is likely due to the additional assimilation of sea ice thickness and atmospheric data (Bushuk et al. 2017). This suggests that further assimilation of atmospheric states and sea ice data should be included in future SIE predictions.