1 Introduction

The California Current System (CCS) is a highly productive marine ecosystem, supporting primary productivity and fish catch disproportionately high for its spatial extent (Chavez and Messié 2009) and hosting a diverse array of top predators that rely on abundant prey resources in the region (Block et al. 2011). The proximate cause of elevated productivity in the CCS is the seasonal onset/intensification of northerly winds that drive coastal upwelling in the spring and summer, enriching the sunlit surface layer with nutrients from below and stimulating the growth of phytoplankton that form the base of the marine food web. These dynamics are sensitive to temporal variability on scales from daily weather to multi-decadal and secular change (Checkley and Barth 2009). In particular, leading modes of basin-scale climate variability including the El Niño-Southern Oscillation (ENSO), the Pacific Decadal Oscillation (PDO), and the North Pacific Gyre Oscillation (NPGO) exert strong control over CCS upwelling (Di Lorenzo et al. 2008; Jacox et al. 2014, 2015b), generating pronounced interannual variability in the nearshore environment (Fig. 1) and its biological communities.

Fig. 1

(Left) 1982–2009 mean SST and (right) standard deviation of 1982–2009 annual mean SST. Data are from NOAA’s 0.25° daily OISSTv2 product

Given the dynamic climate and oceanographic variability of the CCS and the profound impact of this variability on marine resources, momentum is building toward fisheries management strategies that incorporate real-time and forecasted environmental information to inform the temporal and spatial extent of fishery closures (Hobday et al. 2013; Lewison et al. 2015; Maxwell et al. 2015). For example, to predict the spatiotemporal overlap between protected and targeted species, empirical statistical relationships are used to link marine species to their preferred environment and then to predict the distribution of those species based on more widely available oceanographic observations or predictions. In Australian waters this approach has been extended to operational seasonal forecasts using dynamical climate forecast systems, with notable examples for coral reef stress on the Great Barrier Reef (Spillman et al. 2013), bycatch reduction for Southern Bluefin Tuna off east Australia (Hobday et al. 2011), and improved efficiency of the Southern Bluefin Tuna fishery in the Great Australian Bight (Eveson et al. 2015). While similar efforts have not yet been operationalized in the CCS, Kaplan et al. (2016) demonstrate one potential application using a sea surface temperature (SST) based habitat model in conjunction with downscaled seasonal forecasts (Siedlecki et al. 2016) to predict Pacific sardine distributions in the northern CCS. A second recent CCS climate forecast application aims to refine catch limits for climate-sensitive Pacific sardine by introducing short-term temperature forecasts (Tommasi et al. 2017). These results suggest that anticipating shifts into warm (productive) and cold (unproductive) states for sardine may enable increased catch without increasing the probability of stock collapse.

Considerations for applying global climate models (GCMs) to living marine resources have been detailed for both climate projections (Stock et al. 2011) and seasonal forecasts (Hobday et al. 2016). In general, the foundation of successful seasonal forecasting applications is a skillful forecast for some desired physical parameter, upon which biological responses can be added through statistical or dynamical models. A critical physical parameter in this regard is the sea surface temperature (SST), often both a primary driver and leading indicator of marine resource responses to climate variability (Ottersen et al. 2010). Dynamical skill for seasonal SST forecasts has been demonstrated in many coastal ecosystems around the world, including the CCS, despite the relatively coarse resolution of the global models used to make such predictions (Stock et al. 2015). Furthermore, the skill of seasonal forecasts tends to be improved by using multi-model ensembles like the North American Multi-Model Ensemble (NMME) rather than multi-member ensembles of a single model (Becker et al. 2014; Kirtman et al. 2014). Here, we explore the skill of seasonal SST forecasts in the CCS for individual seasonal forecast systems as well as for the NMME multi-model mean. We then document the conditions under which forecast skill emerges and elucidate the mechanisms behind it, with the intention of highlighting strengths and weaknesses of global forecast systems in the CCS and their potential application to regional downscaling and marine resource management in the region.

2 Methods

2.1 Study domain

The area of interest to this study is the CCS, off the west coast of the United States (Fig. 1). We focus on an area extending from northern Mexico to the southern tip of Vancouver Island (30–48°N) and from the coast to 300 km offshore, encompassing both the highly productive region within ~50 km of the coast and the California Current transition zone farther offshore, where nutrient and chlorophyll levels are elevated relative to the oligotrophic gyre but lower than in the nearshore region (Jacox et al. 2016a). In Sect. 3.4 we further divide the CCS into northern, central, and southern CCS regions, with divisions at Cape Mendocino (~40.5°N) and Point Conception (~34.5°N). These sub-regions capture distinct physical and biological regimes within the CCS, and are delineated according to the prevailing atmospheric forcing and ocean dynamics (Checkley and Barth 2009; Dorman and Winant 1995).
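As a minimal sketch of the latitudinal partition described above, the following assigns grid-point latitudes to sub-regions using the approximate boundaries quoted in the text. The function name and the "outside" label are illustrative assumptions, and the 300 km offshore limit is not handled here.

```python
import numpy as np

# Approximate limits quoted in the text (degrees north).
LAT_MIN, LAT_MAX = 30.0, 48.0
POINT_CONCEPTION, CAPE_MENDOCINO = 34.5, 40.5

def ccs_subregion(lat):
    """Label each latitude as southern, central, or northern CCS.

    Hypothetical helper: latitudes outside 30-48N are labeled "outside".
    """
    lat = np.asarray(lat, dtype=float)
    labels = np.array(["southern", "central", "northern"])
    # np.digitize returns 0, 1, or 2 depending on which side of the
    # two boundaries each latitude falls.
    region = labels[np.digitize(lat, [POINT_CONCEPTION, CAPE_MENDOCINO])]
    return np.where((lat >= LAT_MIN) & (lat <= LAT_MAX), region, "outside")

print(ccs_subregion([32.0, 36.0, 45.0, 50.0]))
```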

2.2 Seasonal forecasts

Seasonal SST forecast skill is tested using global coupled atmosphere-ocean-land-sea ice models participating in Phase I of the North American Multi-Model Ensemble (NMME) (Kirtman et al. 2014). For our study period (1982–2009), the NMME includes hindcasts from 14 models, summarized in Table 1. Each model produces an ensemble of forecasts initialized each month, with the number of ensemble members ranging from 6 to 24 and forecast lead times ranging from 8 to 12 months depending on the model. In most cases, forecasts are initialized on the first of the month; for those that are not, we use forecasts initialized within the 2 weeks leading up to the beginning of the month. Ensemble members are averaged to produce a mean forecast for each model, and the mean of those model means constitutes the NMME mean forecast. Output on the native grids of individual models is interpolated to a common grid with 1° resolution in longitude and latitude.
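The two-step averaging can be sketched as follows, assuming all model output has already been interpolated to the common grid. The function name and synthetic arrays are illustrative only. Averaging the model means gives each model equal weight, whereas pooling all members would favor models with larger ensembles.

```python
import numpy as np

def multi_model_mean(model_forecasts):
    """Average members within each model, then average the model means.

    model_forecasts: list of arrays, one per model, each shaped
    (n_members, n_lat, n_lon) on a common grid; n_members may differ.
    """
    # Step 1: one ensemble-mean field per model.
    model_means = [np.nanmean(f, axis=0) for f in model_forecasts]
    # Step 2: equal weight per model, regardless of ensemble size.
    return np.nanmean(np.stack(model_means), axis=0)

# Two hypothetical models with 6 and 24 members on a 3 x 4 grid.
rng = np.random.default_rng(0)
forecasts = [rng.normal(1.0, 0.5, (6, 3, 4)), rng.normal(2.0, 0.5, (24, 3, 4))]
nmme_mean = multi_model_mean(forecasts)
pooled_mean = np.concatenate(forecasts).mean(axis=0)  # member-weighted, for contrast
print(nmme_mean.mean(), pooled_mean.mean())
```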

Table 1 Brief model descriptions for 14 contributing members of Phase I of the NMME

In Sects. 3.3 and 3.4 we perform a more detailed analysis on a single member of the NMME, version 4 of the Canadian Centre for Climate Modeling and Analysis (CCCma) Coupled Climate Model (CMC2-CanCM4; hereafter CanCM4) (Merryfield et al. 2013). The focus on a single model complements the multi-model analysis and enables a more mechanistic exploration into the sources of forecast skill. We chose CanCM4 as it is among the best performers in the NMME for this region (Fig. 2), and was also the only NMME member with readily available atmospheric flux fields (wind stress, surface heat fluxes) for the full study period. CanCM4 employs the CanAM4 atmospheric model, with T63 truncation (~2.8° horizontal resolution) and 35 hybrid (sigma-pressure) vertical levels. The CanOM4 ocean model has spherical coordinates with horizontal resolution of ~1.41° longitude by 0.94° latitude and 40 vertical z-levels. The atmospheric and oceanic model grids are aligned such that 6 ocean grid cells (2 longitude by 3 latitude) lie beneath each atmospheric grid cell. CanCM4 forecasts include 10 ensemble members run for lead times of 0–11 months, and a range of atmospheric and oceanic state variables are available as part of the NMME. In our analysis, we examine several oceanic and atmospheric fields including sea surface temperature, depth of the 26.0 kg m⁻³ isopycnal, vertical velocity at 50 m depth, meridional wind stress, and surface net heat flux to elucidate mechanisms of SST variability and predictability.

Fig. 2

SST skill grids for all members of the NMME. Initialization month is on the x-axis, lead time is on the y-axis, and anomaly correlation coefficient is in color. The zero-lead forecast is for the month of initialization (e.g., the lower left corner of each grid represents a forecast of January’s monthly mean SST, initialized at the beginning of January). Gray dots indicate significant skill while white dots indicate significant skill above persistence (95% confidence level). 5 of the 14 models (COLA-RSMAS-CCSM3, both IRI models, NASA-GMAO, and NCEP-CFSv1) have been retired

2.3 Validation data

SST hindcasts are validated against monthly averages of NOAA’s 0.25° daily Optimum Interpolation Sea Surface Temperature, version 2 (OISSTv2) (Banzon et al. 2016; Reynolds et al. 2007). Stock et al. (2015) found through comparison with uninterpolated in-situ data that OISSTv2 is highly accurate in the CCS. Validation data for surface wind stress and net surface heat flux are synoptic monthly mean fields from the European Centre for Medium-Range Weather Forecasts (ECMWF) 0.7° resolution Interim reanalysis (ERA Interim) (Dee et al. 2011). High-resolution surface and subsurface variables presented for comparison in Sect. 4 derive from the University of California–Santa Cruz CCS regional ocean reanalysis (CCSRA) (Neveu et al. 2016), which uses the Regional Ocean Modeling System (ROMS) with 4-dimensional variational data assimilation, 0.1° horizontal resolution, and 42 vertical levels. CCSRA spans the period 1980–2010 and assimilates satellite SST and SSH as well as available in situ temperature and salinity measurements. It has been used extensively to study regional ocean dynamics in the CCS, particularly related to basin-scale climate variability including ENSO teleconnections of particular interest to the present study (Jacox et al. 2015a, 2016b).

2.4 Forecast skill evaluation

Forecast skill is evaluated as in Stock et al. (2015), using anomaly correlation coefficients (ACC) between forecast and observed fields. Forecast anomalies are calculated based on a lead-dependent climatology to account for model drift. We evaluate significance of forecast ACC using the method of Bretherton et al. (1999), which corrects for autocorrelation in the sample. Significance is tested relative to zero (no forecast skill) and relative to a persistence forecast, which assumes that SST anomalies from the month prior to initialization will persist across all lead times. Oceanic temperature anomalies evolve much more slowly than atmospheric anomalies, owing to the much greater thermal inertia in the ocean (Frankignoul 1985; Goddard et al. 2001). The persistence forecast is therefore often skillful and provides a baseline for evaluating model forecasts. The skill above persistence provides a measure of the added value of a dynamical forecast system. We focus on ACC as it is a common deterministic measure of skill and suits the interests of this study, which is concerned with patterns and mechanisms of predictive skill rather than a thorough evaluation of each model. For a detailed skill evaluation of all models in the NMME, including multiple deterministic and probabilistic skill metrics, see Hervieux et al. (2017).
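The skill calculation can be illustrated with a short sketch on synthetic data. Anomalies are taken relative to a climatology computed separately for each lead time, which removes a lead-dependent drift, and ACC is then computed for the model and for a persistence baseline. The effective sample size uses the commonly quoted form N(1 − r1 r2)/(1 + r1 r2), where r1 and r2 are the lag-1 autocorrelations of the two series. The Fisher z-test below is an assumption made for illustration and is not necessarily the test applied in this study. All function names and data are hypothetical.

```python
import math
import numpy as np

def anomalies(x):
    """Remove the mean over years (axis 0), separately for each lead."""
    return x - x.mean(axis=0, keepdims=True)

def acc(f, o):
    """Anomaly correlation coefficient between two 1-D series."""
    f = f - f.mean()
    o = o - o.mean()
    return float(np.sum(f * o) / np.sqrt(np.sum(f**2) * np.sum(o**2)))

def effective_n(f, o):
    """Sample size adjusted for lag-1 autocorrelation in both series."""
    r1 = acc(f[:-1], f[1:])
    r2 = acc(o[:-1], o[1:])
    return len(f) * (1.0 - r1 * r2) / (1.0 + r1 * r2)

def p_value(r, n_eff):
    """One-sided p-value for r > 0 (Fisher z-test; an assumed choice)."""
    z = math.atanh(r) * math.sqrt(max(n_eff - 3.0, 1.0))
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Synthetic hindcasts for one initialization month: 28 years x 12 leads.
rng = np.random.default_rng(1)
obs = rng.normal(size=(28, 12))
drift = np.linspace(0.0, 2.0, 12)              # bias that grows with lead
hindcast = obs + drift + 0.5 * rng.normal(size=obs.shape)
prior = obs[:, 0] + 0.5 * rng.normal(size=28)  # stand-in for the anomaly before initialization

f_anom, o_anom = anomalies(hindcast), anomalies(obs)
lead = 3
skill = acc(f_anom[:, lead], o_anom[:, lead])
persistence_skill = acc(prior, o_anom[:, lead])  # same predictor at every lead
n_eff = effective_n(f_anom[:, lead], o_anom[:, lead])
print(skill, persistence_skill, p_value(skill, n_eff))
```

In this sketch, skill above persistence at a given lead is simply `skill - persistence_skill`.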

3 Results

3.1 SST forecast skill in the CCS

In the CCS, simple persistence forecasts yield significant skill (ACC significantly greater than zero at the 95% confidence level) for lead times up to several months. This can be seen in the upper left panel of Fig. 2, where ACC is plotted as a function of initialization month and lead time. With the exception of September and December initializations, persistence forecasts have significant skill for 4 months or more, and for forecasts initialized in January, persistence offers significant forecast skill as far out as October (9-month lead time). In general, the lowest persistence forecast skill comes from forecasts initialized in late summer (July–September). This period is one of strong stratification, a shallow mixed layer, and low thermal inertia in the CCS. Late summer SST anomalies therefore have relatively weak correlation with SST anomalies in the following winter/spring. Rather, winter/spring SST anomalies are coupled to SST anomalies from the previous winter/spring, which reemerge when the surface mixed layer deepens (Alexander et al. 1999).

The NMME ensemble mean forecast demonstrates significant skill across nearly all initialization months and lead times, with the exception of some long lead (>6 months) forecasts initialized in late fall and winter (Fig. 2). Individual models exhibit skill patterns similar to that of the ensemble mean. In particular, enhanced predictability for forecasts of late winter/early spring (February–April) SST anomalies is visible as a band of high skill extending from the upper left to lower right in the panels of Fig. 2. As the late winter/early spring period is also one of minimal skill from persistence forecasts, it is where much of the dynamical skill above persistence resides. Though the degree of skill varies among models, the overall similarities suggest that different models may have similar mechanisms generating predictability, but with varying degrees of fidelity. The CMC models are notably strong performers; CMC2-CanCM4 in particular has skill comparable to and in some cases better than the ensemble mean forecast (we explore the sources of this skill in Sect. 3.3). Some of the models that exhibit relatively lower skill in the CCS (e.g., the GFDL and IRI models) have been shown to perform particularly well in other regions (Hervieux et al. 2017; Stock et al. 2015).

3.2 Forecast skill relative to basin-scale variability

We first explore the dynamics underlying seasonal evolution of CCS SST anomalies by examining their relation to the broader basin-scale variability. When SST anomalies averaged over the CCS region are correlated at zero-lag with SST anomalies throughout the Pacific basin, a clear pattern associated with ENSO emerges (Fig. 3). Not surprisingly, correlations are strongest in the vicinity of the CCS, with weaker correlations extending far afield. When the CCS mean SST anomalies are lagged relative to basin-wide anomalies, the ENSO pattern remains, though the intensity of correlations changes regionally. At 3 months lead, correlations within the CCS decline, indicating fading persistence. Conversely, SST anomalies in the Niño 3.4 region of the equatorial Pacific are more strongly correlated with CCS anomalies at 3 months lead than they are at zero lead. At 6–9 month lead times, a similar pattern continues, with the correlations continuing to decrease in the CCS while correlations between the CCS and the equatorial Pacific decline more slowly. These findings are consistent with our understanding of ENSO’s influence on the CCS, with equatorial Pacific SST anomalies leading CCS anomalies by several months (Jacox et al. 2015b), and suggest ENSO variability as a likely source of seasonal predictability in the CCS.
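A lagged correlation map of this kind can be sketched as below, assuming monthly anomaly fields are already available as arrays. The function name, array layout, and synthetic data are illustrative assumptions. Grid points with zero variance (for example, land) would yield NaN.

```python
import numpy as np

def lagged_correlation_map(basin, index, lag):
    """Correlate index[t] with basin[t - lag] at every grid point.

    basin: (n_months, n_lat, n_lon) SST anomalies.
    index: (n_months,) regionally averaged SST anomalies.
    lag:   months by which the basin-wide field leads the index.
    """
    n = len(index)
    field = basin[: n - lag]   # earlier basin-wide anomalies
    target = index[lag:]       # later regional anomalies
    fa = field - field.mean(axis=0)
    ta = target - target.mean()
    cov = np.tensordot(ta, fa, axes=(0, 0))
    return cov / np.sqrt(np.sum(ta**2) * np.sum(fa**2, axis=0))

# Synthetic check: one grid point leads the regional index by 3 months.
rng = np.random.default_rng(3)
x = rng.normal(size=336)                  # 28 years of monthly values
basin = rng.normal(size=(336, 2, 2))
basin[:, 0, 0] = x
index = np.roll(x, 3)                     # index[t] = x[t - 3]
print(lagged_correlation_map(basin, index, 3)[0, 0])
```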

Fig. 3

Correlation of Pacific basin-wide SST with CCS regionally averaged SST. Individual panels show the correlation of CCS SST with basin-wide SST 0, 3, 6, and 9 months prior. Black outlines mark the CCS and Niño 3.4 regions

The relationship between ENSO and CCS SST forecast skill can be illustrated further through a simple regression analysis. Specifically, we construct a statistical forecast using multiple linear regression, where the observed SST anomaly at a given initialization month and lead time is fit as a function of both the CCS and Niño 3.4 SST anomalies the month prior to forecast initialization. For example, a statistical forecast of June initialized in February (4-month lead time) would fit the observed June SST anomalies as a function of the CCS and Niño 3.4 SST anomalies in January. Effectively, this regression constitutes a statistical forecast of CCS anomalies based on the combination of persistence and ENSO variability. It captures much of the skill of the dynamical forecast systems in the NMME (Fig. 4), and perhaps provides an even better benchmark than persistence upon which dynamical models should try to improve. The considerable skill of statistical forecasts is well established, with some highly evolved examples having been developed and applied on multiple spatial and temporal scales (Newman 2007).
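The regression described above can be sketched for a single initialization month and lead time as follows. The function names and synthetic data are hypothetical, and the fit here is in-sample; whether cross-validation was applied is not stated in the text, so that choice is an assumption of this sketch.

```python
import numpy as np

def fit_statistical_forecast(ccs_prior, nino_prior, target):
    """Least-squares fit: target ~ a * ccs_prior + b * nino_prior + c."""
    X = np.column_stack([ccs_prior, nino_prior, np.ones(len(target))])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef

def predict(coef, ccs_prior, nino_prior):
    """Apply fitted coefficients to new predictor values."""
    return coef[0] * ccs_prior + coef[1] * nino_prior + coef[2]

# Synthetic example: June CCS anomaly from the two January anomalies.
rng = np.random.default_rng(4)
ccs_jan = rng.normal(size=28)     # CCS SST anomaly, month before initialization
nino_jan = rng.normal(size=28)    # Niño 3.4 SST anomaly, same month
ccs_jun = 0.3 * ccs_jan + 0.5 * nino_jan + 0.4 * rng.normal(size=28)

coef = fit_statistical_forecast(ccs_jan, nino_jan, ccs_jun)
forecast = predict(coef, ccs_jan, nino_jan)
print(coef, np.corrcoef(forecast, ccs_jun)[0, 1])
```

Setting the Niño 3.4 coefficient to zero reduces this to a scaled persistence forecast, which is one way to see the regression as persistence plus an ENSO term.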

Fig. 4

Forecast skill above persistence (e.g., ACC of dynamical forecast minus ACC of persistence forecast) for (left) the NMME ensemble mean forecast, (middle) a statistical forecast based on persistence and tropical SST anomalies, and (right) the difference between them. The statistical forecast is constructed using a multiple linear regression, where the observed SST anomaly at a given initialization month and lead time is fit as a function of both the CCS and Niño 3.4 SST anomalies the month prior to forecast initialization. For example, a statistical forecast of June initialized in February (4-month lead time) would fit the observed June SST anomalies as a function of the CCS and Niño 3.4 SST anomalies in January

When averaged across all lead times, forecast skill is nearly constant for all initialization months, and the NMME multi-model mean forecast skill is on par with the best individual models as well as the statistical (persistence + ENSO) forecast (Fig. 5). It has been known for some time that a multi-model ensemble mean tends to perform as well as or better than the best individual model; the ensemble mean skill stems from its greater consistency and reliability relative to individual models, as well as the cancellation of individual model errors when they are averaged (Hagedorn et al. 2005). The persistence forecast alone is comparable in skill to the worst performing individual model.

Fig. 5

Anomaly correlation coefficient for forecasts in the CCS region, averaged by (top) initialization month, (middle) lead time, and (bottom) forecast month. In other words, from a given skill grid (Fig. 2), these three panels represent means of the columns, rows, and diagonals, respectively. Skill is shown for persistence, individual models, the NMME ensemble mean, and a simple multiple linear regression using persistence plus the Niño 3.4 SST anomaly at initialization. For a detailed skill analysis of each model in the CCS and other marine ecosystems, see Hervieux et al. (2017)

Skill declines steadily as lead times increase, though the decline in skill from 3 to 11 months is small relative to the decline from 0 to 3 months (Fig. 5, middle). The NMME mean, statistical forecast, and best individual model all exhibit similar performance. However, a more nuanced picture emerges when viewing skill as a function of the month being predicted (Fig. 5, bottom). Dynamical forecast skill and skill above persistence are highest for predictions of late winter/spring, with maximum skill arising for February/March SST anomalies. Skill is relatively low for summer–fall forecasts, with minimum skill for August, and neither the ensemble mean nor any individual model is able to forecast August–October with greater skill than a persistence forecast. The statistical forecast based on persistence and ENSO variability has a seasonal cycle of forecast skill that is qualitatively similar to that for dynamical forecasts. However, the NMME mean performs better for January–May forecasts, largely due to added skill for long lead (>6 month) forecasts initialized in spring, and the statistical forecast performs slightly better for October–December forecasts, largely due to added skill for long lead forecasts initialized in winter (Fig. 4).
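The three ways of averaging a skill grid (by initialization month, lead time, and forecast month) can be sketched as below. The array layout, with lead time along rows and initialization month along columns, follows the description of Fig. 2; the function name is hypothetical. The forecast month is obtained as initialization month plus lead, modulo 12, which selects the wrapped diagonals of the grid.

```python
import numpy as np

def summarize_skill_grid(grid):
    """Average ACC by initialization month, lead time, and forecast month.

    grid[lead, init] holds ACC, with init 0 = January and lead in months.
    """
    by_init = grid.mean(axis=0)   # column means
    by_lead = grid.mean(axis=1)   # row means
    lead, init = np.indices(grid.shape)
    target = (init + lead) % 12   # calendar month being forecast
    by_target = np.array([grid[target == m].mean() for m in range(12)])
    return by_init, by_lead, by_target

# Synthetic grid in which skill depends only on the month being forecast.
lead, init = np.indices((12, 12))
grid = 0.5 + 0.3 * np.cos(2 * np.pi * ((init + lead) % 12 - 2) / 12)
by_init, by_lead, by_target = summarize_skill_grid(grid)
print(int(by_target.argmax()))    # index of the best-predicted month (2 = March)
```

In this synthetic case the column and row means are flat while the diagonal means recover the seasonal cycle, which mirrors the point that skill viewed by forecast month can reveal structure hidden in the other two averages.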

3.3 Mechanisms of SST predictability

The findings of Sects. 3.1 and 3.2 highlight two key points about seasonal forecast skill in the CCS: (1) dynamical skill above persistence is concentrated in forecasts of the first half of the year, particularly January–April, and (2) skill above persistence derives largely, though not entirely, from a predictable regional manifestation of ENSO variability. We now turn our attention to elucidating the mechanisms through which dynamical forecast systems capture ENSO-related predictability in the CCS. In order to do so, we focus on a single member of the NMME, CanCM4, which is arguably the best performing model for CCS hindcasts (Fig. 2) and serves as a test case to explore in more detail the dynamics governing predictability.

The strong relationship of skill above persistence to ENSO variability (Sect. 3.2) suggests that years of large ENSO signals (i.e., El Niño and La Niña events) may contribute disproportionately to seasonal forecast skill. Indeed, when hindcast skill above persistence is partitioned into the years following medium to strong ENSO events [i.e., when the 3-month running mean of Niño 3.4 SST anomalies, also termed the Oceanic Niño Index (ONI), exceeds a magnitude of 1] and the years associated with neutral or weakly positive/negative ENSO conditions (|ONI|<1), we find that forecast skill above persistence is associated almost entirely with the former (Fig. 6). In other words, the dynamical forecast skill above persistence for 28-year hindcasts is largely captured by using dynamical forecasts for the 10 strongest ENSO events (1983, 1987, 1988, 1989, 1992, 1998, 1999, 2000, 2003, 2008) and persistence forecasts for the other 18 years. This finding is consistent with previous studies that identify ENSO variability as the primary driver of seasonal predictability in air temperature and precipitation anomalies over the continental United States (Barnett and Preisendorfer 1987; Quan et al. 2006). However, it should be noted that there is residual skill in the dynamical forecast beyond that generated during ENSO events, particularly for long lead forecasts (cf., left and middle panels of Fig. 6).
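One way to partition skill by year type is sketched below, under stated assumptions. The ONI is computed as a 3-month running mean of monthly Niño 3.4 anomalies, and years are flagged as ENSO years when a representative ONI value exceeds a magnitude of 1. The additive split of ACC into per-year terms is an illustrative decomposition and is not necessarily the exact procedure used here. Function names and data are hypothetical.

```python
import numpy as np

def oni(nino34_anom):
    """3-month running mean of monthly Niño 3.4 SST anomalies."""
    return np.convolve(nino34_anom, np.ones(3) / 3.0, mode="valid")

def acc_contributions(f, o, is_enso):
    """Split ACC into additive parts from ENSO and all other years.

    Each year contributes f_i * o_i / sqrt(sum(f^2) * sum(o^2)), so the
    two returned parts sum to the full anomaly correlation coefficient.
    """
    f = f - f.mean()
    o = o - o.mean()
    terms = f * o / np.sqrt(np.sum(f**2) * np.sum(o**2))
    return float(terms[is_enso].sum()), float(terms[~is_enso].sum())

# Synthetic 28-year example for one initialization month and lead time.
rng = np.random.default_rng(2)
forecast = rng.normal(size=28)
observed = forecast + rng.normal(size=28)
winter_oni = rng.normal(size=28)          # hypothetical ONI value per year
is_enso = np.abs(winter_oni) > 1.0        # moderate to strong events

enso_part, other_part = acc_contributions(forecast, observed, is_enso)
print(enso_part, other_part, enso_part + other_part)
```

Applying the same split to the persistence forecast and differencing would give the contribution of each group of years to skill above persistence.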

Fig. 6

(Left) Forecast skill above persistence for CanCM4 in the CCS region. The contribution to skill above persistence by years that follow a moderate to strong El Niño or La Niña (N = 10) and by all other years (N = 18) is shown in the middle and right panels, respectively. White dots indicate significant skill above persistence (95% confidence level)

Having determined that SST forecast skill above persistence is mostly constrained to forecasts of the late winter/spring in moderate to strong ENSO events, we examine the regional forcing mechanisms driving SST anomalies during those periods. Tropical SST variability during ENSO events modifies north Pacific SST anomalies through atmospheric teleconnections [i.e., the atmospheric bridge (Alexander et al. 2002)]. Using global mixed layer models, Alexander et al. (2002) found that the atmospheric bridge drives basin-scale SST anomalies primarily through the net surface heat flux, with a weaker contribution from wind-driven Ekman transport. However, they found the contributions of surface heat flux and wind stress to be of comparable magnitude in the nearshore region of the CCS, where wind-driven coastal upwelling exerts significant control over ocean temperature variability. Regional studies confirm the importance of wind stress anomalies for driving environmental change in the CCS during ENSO events, with El Niño (La Niña) conditions typically bringing reduced (increased) upwelling intensity and consequently positive (negative) SST anomalies (Jacox et al. 2015b; Schwing et al. 2002). This atmospheric teleconnection is fast, with the CCS response lagging the tropics by a few weeks to a month, and upwelling anomalies during ENSO events typically persisting through ~April (Alexander et al. 2002; Jacox et al. 2015b). Local SST anomalies are also impacted by coastally trapped waves that propagate poleward into the CCS domain (Meyers et al. 1998), deepening the thermocline and limiting the efficacy of coastal upwelling for cooling the ocean surface. While this remote oceanic influence has been shown to contribute significantly to CCS anomalies during ENSO events (Chavez et al. 2002; Frischknecht et al. 2015), coastal waves are confined close to the coast (internal deformation radius of tens of km) and are not resolved by global climate forecast systems with horizontal resolution on the order of 1° (Alexander et al. 2002). We therefore focus our analysis on the surface wind stress and net surface heat flux, though coastal waves are discussed further in Sect. 4.

In order for a given forcing (e.g., surface wind stress) to generate skill above persistence in the SST anomaly field, it must satisfy three conditions: (1) it must exert influence over SST anomalies in the model, (2) its influence in the model must be consistent with its influence in nature, and (3) it must be predictable. In Figs. 7 and 8, we use correlation analyses to test these conditions for surface wind stress and net surface heat flux, respectively. For initialization months and lead times where ENSO-related forecast skill above persistence emerges (middle panel of Fig. 6), we correlate the forecast and observed wind stress (or net surface heat flux) anomalies with the residuals from the persistence SST forecast, which tests conditions 1 and 2. We then correlate the forecast and observed wind stress (or net surface heat flux) anomalies with each other to test condition 3. Note that we aim here to elucidate forecast skill above persistence, not forecast skill itself. We therefore use the term ‘SST residuals’ in reference to the residuals from persistence forecasts, and we examine relationships between surface forcing (wind stress or heat flux) and those residuals.

Fig. 7

(Left) Forecast and (middle) observed relationships between meridional wind stress anomalies (x-axis) and SST persistence forecast residuals (y-axis) during (top) moderate to strong ENSO events and (bottom) ENSO-neutral periods. Positive wind stress anomalies indicate anomalous poleward winds. For each initialization month, data are plotted for the lead time when ENSO-related skill above persistence emerges (i.e., in the middle panel of Fig. 6, the shortest lead time with a white dot for each initialization month). (Right) Meridional wind stress forecast skill for the same periods, as indicated by the correlation between model forecast (x-axis) and observed (y-axis) wind stress. Color indicates the value of the Niño 3.4 Index the month preceding forecast initialization. Forecasts are from CanCM4, observations are from NOAA OISSTv2 for SST and ERA Interim for wind stress

Fig. 8

As in Fig. 7, but for surface heat flux in place of wind stress. Positive surface heat flux anomalies indicate anomalous heat flux into the ocean

The key result of the analysis presented in Figs. 7 and 8 is that all three of the aforementioned conditions for generating SST predictability are satisfied only by surface wind stress and only during ENSO events. Under those constraints, wind stress anomalies are strongly correlated (r = 0.8) with monthly SST residuals in both the CanCM4 forecasts and the validation data, and the observed and forecast wind stress anomalies are strongly correlated with each other (r = 0.7). Thus, late winter/early spring wind stress anomalies are predictable during El Niño/La Niña events, and they generate a predictable response in the SST anomaly field that is not captured by a persistence forecast. Note that SST anomalies do not instantaneously respond to anomalous wind, and a 3-month trailing mean has been applied to the wind stress and surface heat flux to capture their cumulative impacts over several months.
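The three correlation checks, together with the 3-month trailing mean applied to the forcing, can be sketched as follows. The function names, dictionary keys, and synthetic series are illustrative assumptions; real inputs would be forecast and observed wind stress (or heat flux) anomalies and the corresponding residuals from persistence SST forecasts.

```python
import numpy as np

def trailing_mean(x, window=3):
    """Mean of the current and previous (window - 1) values; NaN until full."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    out[window - 1:] = np.convolve(x, np.ones(window) / window, mode="valid")
    return out

def corr(a, b):
    """Pearson correlation, ignoring positions where either input is NaN."""
    ok = ~(np.isnan(a) | np.isnan(b))
    return float(np.corrcoef(a[ok], b[ok])[0, 1])

def check_conditions(tau_fcst, tau_obs, resid_fcst, resid_obs):
    """Correlations relevant to the three conditions for predictability."""
    return {
        "model_forcing_vs_residual": corr(tau_fcst, resid_fcst),    # condition 1
        "observed_forcing_vs_residual": corr(tau_obs, resid_obs),   # condition 2 (compare with 1)
        "forcing_forecast_skill": corr(tau_fcst, tau_obs),          # condition 3
    }

# Synthetic monthly series in which smoothed wind stress drives SST residuals.
rng = np.random.default_rng(5)
tau_obs = trailing_mean(rng.normal(size=120))
tau_fcst = tau_obs + 0.3 * rng.normal(size=120)
resid_obs = 0.8 * tau_obs + 0.2 * rng.normal(size=120)
resid_fcst = 0.8 * tau_fcst + 0.2 * rng.normal(size=120)
print(check_conditions(tau_fcst, tau_obs, resid_fcst, resid_obs))
```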

In addition to identifying a predictable wind response to ENSO events as the main driver of SST forecast skill above persistence in the CCS, Figs. 7 and 8 illuminate several reasons why skill does not emerge from wind stress during ENSO-neutral periods or from surface heat fluxes under any ENSO conditions. During ENSO-neutral periods, wind stress anomalies are correlated with SST residuals quite strongly (r = 0.7) in observations and less so (r = 0.4) in CanCM4 forecasts (Fig. 7). However, there is no skill in the wind stress anomaly forecasts during these periods, and our third condition for generating predictability is not satisfied. Surface heat flux anomalies are positively correlated with SST residuals in CanCM4 forecasts during ENSO events (Fig. 8). However, the same relationship is not found in the observations, and our second condition for predictability is not satisfied. We will see later (Sect. 4.2) that the observed relationship between surface heat flux and SST anomalies has important fine scale structure within the CCS, which is not captured by the coarse resolution GCMs. Finally, in ENSO-neutral conditions, CanCM4 forecasts exhibit no relationship between surface heat flux anomalies and SST residuals, and our first condition for predictability is not satisfied.

3.4 Regional differences in forecast skill

Given that atmospheric forcing varies dramatically between regions within the CCS (Checkley and Barth 2009; Dorman and Winant 1995), it is reasonable to expect that the predictability derived from that forcing will vary as well. Furthermore, applications of seasonal SST forecasts often occur on scales smaller than the entire CCS (e.g., Kaplan et al. 2016; Tommasi et al. 2017). We therefore divide the CCS into northern, central, and southern sub-regions, with divisions at Cape Mendocino (~40.5°N) and Point Conception (~34.5°N), and evaluate SST forecast skill on these finer scales. While the patterns in the lead-initialization month forecast skill matrix are similar between sub-regions, we find a marked latitudinal gradient from relatively high forecast skill in the north to relatively low forecast skill in the south (Fig. 9). In the northern CCS, forecast skill is routinely realized with anomaly correlation coefficients greater than 0.5, and February–April forecasts are significantly better than persistence at all leads. In the southern CCS there are some instances of skill above persistence (again, in the February–April timeframe). However, when averaged across all initialization months and lead times, forecast skill in the southern CCS is no better than persistence, and is worse than a simple statistical forecast based on persistence plus Niño 3.4 SST anomalies (Table 2).

Fig. 9

(Left) 1982–2009 mean SST from OISSTv2, with CCS sub-regions outlined in black. (Right) CanCM4 SST forecast skill for northern, central, and southern CCS sub-regions. Markers are as in Fig. 2

Table 2 Anomaly correlation coefficient by CCS region averaged over all initialization months and lead times for persistence forecasts, a simple multiple linear regression using persistence plus the Niño 3.4 SST (as in Fig. 4), and CanCM4 forecasts

When forecast skill above persistence is partitioned into ENSO events and ENSO-neutral periods, our findings for the entire CCS (Fig. 6) hold true qualitatively within each CCS sub-region (Fig. 10). In particular, dynamical skill above persistence in each region is generated almost exclusively through forecasts during ENSO events. Consistent with the overall latitudinal skill gradient, the influence of ENSO on forecast skill is more pronounced to the north. The latitudinal skill gradient is also consistent with our finding that ENSO-related skill comes through the wind; ENSO drives variability predominately through anomalous wind forcing in the northern CCS, through remote ocean forcing (coastal waves) in the southern CCS, and through a combination of remote and local influences in the central CCS (Frischknecht et al. 2015; Hermann et al. 2009). The northern CCS therefore benefits most from skillful forecasts of ENSO-related wind anomalies, while SST forecast skill in the southern CCS likely suffers from the inability of global forecast systems to resolve coastal waves propagating up the west coast of North America.

Fig. 10

As in Fig. 6, but for northern, central, and southern sub-regions of the CCS

4 Discussion

4.1 SST forecast skill in the CCS

Each of the coupled climate models contributing to Phase I of the NMME exhibits significant SST forecast skill in the CCS. At short lead times (0–4 months), much of that skill can be attributed to persistence, while at longer leads skill above persistence emerges and in some cases extends for the full length of the forecast (Fig. 2). Individual models, as well as the NMME multi-model mean, are particularly skillful for forecasts of February–April, regardless of initialization month. These late winter/spring forecasts also generate the greatest skill above persistence, as they coincide with times of low skill from persistence forecasts. In the case of CanCM4, we attribute the observed skill above persistence primarily to a predictable evolution of the CCS wind (and resultant upwelling) anomalies during moderate to strong ENSO events. Wind-driven upwelling anomalies during ENSO events typically begin in the CCS around December (Jacox et al. 2015b), consistent with a rapid atmospheric teleconnection from the tropics, and the lag between wind anomalies and their expression in the SST field results in the onset of SST forecast skill around January. Upwelling anomalies persist through April/May (Jacox et al. 2015b) and continue to drive SST forecast skill, which extends slightly longer (through June or July) due to persistence of the wind-generated anomalies.

Though not explored in the present analysis, additional potential sources of predictability, especially at long lead times, include reemergence of subsurface temperature anomalies when the mixed layer deepens in winter (Alexander et al. 1999) and eastward advection of offshore anomalies into the CCS (Chikamoto et al. 2015; Stock et al. 2015). The presence of similar patterns of predictability among NMME models suggests that the skill-generating mechanisms may be the same, though captured more faithfully in some models than in others. Applying the present analysis to additional NMME members would likely prove illuminating for understanding how mechanisms of predictability vary between models, as well as the tradeoffs incurred by changing model formulations. For example, NCEP’s CFSv2 performs slightly worse than CFSv1 in our CCS hindcasts, but offers improvements over CFSv1 in nearly all other North American LMEs (Hervieux et al. 2017).

While we have focused our analysis on the conditions and mechanisms that generate predictive skill in the CCS, it is just as important for improving forecasts to understand the conditions and mechanisms that limit forecast skill. Observed meridional wind stress anomalies are strongly positively correlated (r = 0.7–0.8) with SST residuals during periods of significant forecast skill above persistence (i.e., ENSO events) as well as during periods of limited SST forecast skill (i.e., ENSO-neutral years). However, wind stress anomalies are not forecast skillfully in the latter case (Fig. 7), and inaccurate forecasts of local CCS winds propagate through to errors in SST by a chain of events in which equatorward winds that are too weak (strong) produce upwelling that is too weak (strong) and SST that is too warm (cold). Furthermore, errors in the wind may introduce SST errors through inaccurate representation of offshore upwelling driven by wind stress curl as well as mechanical mixing and entrainment. Thus, in the CCS, forecast wind anomalies may be both the primary dynamical source of seasonal SST predictability (during ENSO events when there is a predictable response in the wind) and a dominant limitation on seasonal SST predictability (during ENSO-neutral periods when wind anomalies are not predicted accurately). The difficulty of accurately forecasting winds on seasonal timescales is not surprising given the chaotic nature and short memory of the atmosphere (Goddard et al. 2001), though our findings do suggest promise at least when climate signals are large. Prior studies demonstrating ENSO variability as the primary driver of forecast skill for air temperatures and precipitation over the continental US (Barnett and Preisendorfer 1987; Quan et al. 2006) have motivated consideration of conditional forecasts based on the ENSO state (Pegion and Kumar 2013), and a similar approach may be fruitful for SST forecasts off the US west coast.
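
The first link in this chain, from alongshore wind stress to coastal upwelling, can be illustrated with textbook Ekman theory, in which the cross-shore transport per unit length of coast is τ/(ρf). The sketch below is an idealization that is not part of the analysis above: it assumes a straight north-south coastline with the ocean to the west, a representative seawater density, and hypothetical stress values, and the function names are chosen for illustration.

```python
import math

EARTH_ROTATION_RATE = 7.2921e-5  # rad/s
SEAWATER_DENSITY = 1025.0        # kg/m^3, representative value (assumption)


def coriolis_parameter(latitude_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude), in 1/s."""
    return 2.0 * EARTH_ROTATION_RATE * math.sin(math.radians(latitude_deg))


def offshore_ekman_transport(meridional_stress, latitude_deg):
    """Offshore Ekman transport (m^2/s per metre of coastline).

    Assumes a north-south coast with ocean to the west, so an equatorward
    (negative) meridional stress in N/m^2 yields a positive, upwelling-
    favourable offshore transport.
    """
    f = coriolis_parameter(latitude_deg)
    return -meridional_stress / (SEAWATER_DENSITY * f)
```

Because the transport is linear in the stress, a forecast equatorward stress that is half the observed value gives half the offshore transport at a given latitude, which is the sense in which winds that are too weak produce upwelling that is too weak.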

4.2 Implications for regional downscaling

Several difficulties arise from the relatively coarse resolution of global climate forecast systems when applied to Eastern Boundary Upwelling Systems (EBUS) like the CCS, where fine-scale dynamics play an important role. First, upwelling is poorly represented for two reasons: (1) a horizontal grid resolution of order 1° in the ocean is too coarse to resolve the cross-shore scales of coastal upwelling (Jacox et al. 2014) and the temperature variability that comes with it (Fig. 1), so that even with model winds that accurately simulate those in nature, a coarse ocean leads to upwelling that is too diffuse and consequently to muted SST anomalies, especially within a narrow band (tens of km) next to the coast (Fig. 11); and (2) typical atmospheric resolutions of 1° or more in global climate models are unable to resolve nearshore wind stress curl that can dramatically alter the cross-shore structure of upwelling and its impact on the temperature field (Capet et al. 2004; Small et al. 2015). Second, coastally trapped waves also have characteristic cross-shore scales on the order of tens of km, and are unresolved in most global climate models (Alexander et al. 2002). These waves are important drivers of variability in the CCS during ENSO events, particularly off southern California where local wind variability has relatively little impact (Frischknecht et al. 2015). The absence of this remote oceanic influence in global forecast systems is evident when comparing CanCM4 with a high resolution CCS reanalysis, which shows the southern CCS characterized by coastally intensified isopycnal depth anomalies during El Niño (Fig. 11).
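
The mismatch of scales can be made concrete with a short calculation of how many grid cells span a nearshore band. The sketch assumes a spherical Earth with roughly 111.32 km per degree of latitude; the 30 km band width and the 38°N latitude used in the usage note are hypothetical values chosen to represent "tens of km" in the central CCS.

```python
import math

KM_PER_DEGREE_LATITUDE = 111.32  # approximate, spherical Earth


def zonal_grid_spacing_km(resolution_deg, latitude_deg):
    """East-west distance (km) spanned by one grid cell at a given latitude."""
    return (resolution_deg * KM_PER_DEGREE_LATITUDE
            * math.cos(math.radians(latitude_deg)))


def cells_across_band(band_width_km, resolution_deg, latitude_deg):
    """Number of grid cells spanning a cross-shore band of the given width."""
    return band_width_km / zonal_grid_spacing_km(resolution_deg, latitude_deg)
```

With these assumptions, `cells_across_band(30, 1.0, 38)` is well below one cell, whereas `cells_across_band(30, 0.1, 38)` gives a little over three cells, so only the finer grid places more than a single point within the band.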

Fig. 11

Mean CCS response to strong El Niño events in left CanCM4 6-month lead forecasts and right the 0.1° resolution CCS reanalysis from UC Santa Cruz. Maps are composites of January–March anomalies during the three strongest El Niños in our study period (1983, 1992, 1998). Plotted variables are, from top to bottom, SST, meridional wind stress, surface net heat flux, vertical velocity at 50 m depth, and depth of the 26.0 kg m⁻³ isopycnal surface

The strengths and limitations we have outlined for global forecast system application to the CCS suggest considerable promise for dynamical downscaling in the region. Translating significant forecast skill in the broad-scale winds to a high-resolution regional model enables predictable upwelling variability to be resolved on the scales over which it occurs in nature, thereby improving representation of coastal SST anomalies. Indeed, coastal SST bias in the CCS can be largely eliminated by an order-of-magnitude increase in the ocean resolution of GCMs (from ~100 to ~10 km), in combination with a moderate increase in the atmospheric resolution (from ~200 to ~50 km) (Delworth et al. 2012). The fidelity of regional models in EBUS can be further improved by statistically downscaling the GCM winds to the regional model resolution prior to performing the dynamical downscaling (Machu et al. 2015). Similarly, in the regional model a narrow coastal band is visible in which the net surface heat flux is negative while SST anomalies are positive during strong El Niño events (Fig. 11). Warm SST anomalies in this nearshore band result from coastal wave propagation and anomalously weak coastal upwelling, making it a weaker heat sink than normal and producing negative surface heat flux anomalies that act to damp nearshore SST anomalies. These fine-scale dynamics are not captured by CanCM4 and other GCMs, nor is the poleward propagation of coastal waves that suppress the CCS thermocline during El Niño.

In dynamical downscaling experiments off the Oregon and Washington coasts, Siedlecki et al. (2016) found measurable forecast skill on timescales up to 4 months for physical and biogeochemical properties, particularly for bottom water properties on the continental shelf, but also encountered difficulties related to wind and shortwave radiation biases in NCEP’s CFS forecasts. Based on their results, they encouraged additional seasonal forecasting efforts using global forecasts coupled to regional ocean models, and the hindcast skill of CanCM4 makes it an obvious choice to force downscaled models in the CCS. However, an ensemble of downscaled runs forced by multiple NMME members (even those with low skill) will enable better characterization of the forecast uncertainty in the regional domain, and likely an ensemble mean that performs better than any individual model (DelSole et al. 2013; Tippett and Barnston 2008).
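
The benefit of averaging across ensemble members can be illustrated with a minimal sketch of an equally weighted ensemble mean scored by root-mean-square error. Equal weighting and the function names are assumptions made for illustration; the size of any improvement in practice depends on how independent the member errors are.

```python
import math


def ensemble_mean(member_forecasts):
    """Equally weighted mean across members at each forecast time."""
    n_members = len(member_forecasts)
    return [sum(values) / n_members for values in zip(*member_forecasts)]


def rmse(forecast, observed):
    """Root-mean-square error between a forecast and the observations."""
    n = len(forecast)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)
```

For instance, if two members bracket the observations with errors of opposite sign, such as forecasts of `[1.5, -1.5]` and `[0.5, -0.5]` against observations of `[1.0, -1.0]`, the errors cancel in the mean and its RMSE falls below that of either member. This is an idealized case; partially correlated errors cancel only in part.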