1 Introduction

The atmospheric extra-tropical flow is characterized by chaotic motions that are sensitive to initial conditions, and is thus predictable by operational weather forecast models only up to about 2 weeks in advance given the current observational and modelling accuracy (e.g., Lorenz 1969; Leith 1971; Tribbia and Baumhefner 2004). On the other hand, the seasonal and inter-annual variations in the extra-tropical flow are subject to ocean boundary conditions and are captured reasonably well by climate models (Kumar et al. 1996; Shukla 1998; Shukla et al. 2000; Goddard et al. 2001; Smith et al. 2012). Predictability in the intermediate range, between approximately 10 days and a month, is challenging for both weather forecast and climate models because neither the initial nor the boundary conditions exert a significant influence. However, low-frequency variability (>10 days, i.e., longer than synoptic scales but shorter than seasonal scales) tends to dominate the total variability of the sub-seasonal tropospheric circulation (Blackmon 1976) for Northern Hemisphere (NH) winter and summer. Blackmon (1976) showed further that the structure of the sub-seasonal low-frequency fluctuations is dominated by large-scale planetary waves, whereas the high-frequency (<10 days) fluctuations are dominated by synoptic waves.

Much less work has been done on the summer low-frequency variability and its predictability, despite the increased threat of persistent summer circulation anomalies associated with heat waves, floods and droughts throughout the world (e.g., Beniston 2004; Dole et al. 2011; Barriopedro et al. 2011; Coumou and Rahmstorf 2012). As in northern winter, the magnitude of the northern summer low-frequency variability is generally much larger than that of the high-frequency component in both the lower and upper troposphere (e.g., compare Fig. 1a, b with 1c, d). Understanding the predictability of the low-frequency variability in northern summer is therefore of great importance and could help in predicting summer heat waves, floods and droughts.

Fig. 1

NH summer (May–September) daily streamfunction variability, expressed as standard deviation (×10⁶ m² s⁻¹): a 250 hPa low frequency (7-day running mean), b 750 hPa low frequency, c 250 hPa high frequency (residual of 7-day running mean), and d 750 hPa high frequency. The contour interval is 1 × 10⁶ m² s⁻¹ for a, c and 0.5 × 10⁶ m² s⁻¹ for b, d. Panels: a 250 hPa low frequency, b 750 hPa low frequency, c 250 hPa high frequency, d 750 hPa high frequency

Statistical models have been used to study the predictability of low-frequency variability and have shown promising skill, comparable to that of dynamical models at much lower computational cost (e.g., Penland and Ghil 1993; Winkler et al. 2001; Newman et al. 2003; Pegion and Sardeshmukh 2011). Winkler et al. (2001) and Newman et al. (2003) illustrated that including tropical heating in a linear inverse model (LIM) for the streamfunction fields much improves the predictability of the northern winter and summer low-frequency circulation in the intermediate range (i.e., week 2 and week 3). Pegion and Sardeshmukh (2011) showed that two sophisticated general circulation models are only slightly more skillful than the LIM at week 2 and week 3 when ensemble prediction is used.

The influence of the intraseasonal tropical oscillation, or Madden–Julian oscillation (MJO), one of the dominant modes of intraseasonal variability of tropical heating, on the extratropical circulation has been discussed extensively in the literature (e.g., Zhang 2005; Zhou and Miller 2005; Cassou 2008; Lin et al. 2009). It is considered to be one of the most significant sources of intermediate range atmospheric predictability due to the quasi-periodicity of its temporal evolution (e.g., Cassou 2008; Lin et al. 2009). Most such studies examine the boreal winter extra-tropical flow, and the role of the MJO and other tropical heating variability in the boreal summer circulation and its predictability is less clear. In this study, we examine the intermediate to long range atmospheric predictability by focusing on the northern summer (May through September) circulation using a high-order vector autoregressive (VAR) model. The high-order VAR model is an extension of the LIM. It has recently been used to study El Niño Southern Oscillation (ENSO) predictability (Chapman et al. 2015) and Atlantic SST predictability (Lee et al. 2015), and has been shown by Chapman et al. (2015) to be equivalent to the linear version of the UCLA Empirical Model Reduction (Kravtsov et al. 2009). Chapman et al. (2015) show that the high-order VAR model not only takes into account the useful temporal evolution information of the target variable (e.g., SST or streamfunction), but also further whitens the residual noise of the statistical forecast model, thus improving on the performance of the LIM. In particular, we use this VAR model to investigate the characteristics and mechanisms by which tropical heating improves the NH atmospheric predictability, which have not been fully explored in the LIM studies mentioned above.

The data and statistical model used in this study are presented in the next section, followed by the predictability results without and with the tropical heating in Sects. 3 and 4, respectively. The key findings are summarized in Sect. 5.

2 Data and methods

Data used in this study include the NH streamfunction at 250 and 750 hPa and tropical diabatic heating at all longitudes between 25°S and 25°N. The NH streamfunction at the 250 and 750 hPa pressure levels is calculated from the zonal and meridional winds of the Modern-Era Retrospective Analysis for Research and Applications (MERRA; Rienecker et al. 2011). These two pressure levels are chosen to represent the predominantly baroclinic structure of the northern summer general circulation. The tropical total diabatic heating is taken directly from the MERRA archive and vertically integrated from 100 to 1000 hPa. All fields are spectrally truncated to T21 resolution, although a test using the original T80 resolution shows very similar results. The linear trend and the first three harmonics of the annual cycle (12, 6, and 3 months) are removed from all data. The extended summer months from May through September are used for more robust parameter estimation in the statistical model. The low-frequency part of the streamfunction fields is then obtained by applying a 7-day running mean to the daily data, and the daily residual from the 7-day running mean is taken as the high-frequency component. Figure 1 shows the standard deviation of the low-frequency (top panels) and high-frequency (bottom panels) streamfunction fluctuations for the northern summer months in the upper (left panels) and lower (right panels) troposphere. The dominance of the low-frequency fluctuations is apparent in Fig. 1, with well-defined Pacific and Atlantic storm tracks in the high-frequency component. This paper focuses on the predictability of the low-frequency part of the variability (top panels).
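The low/high-frequency split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_low_high` is hypothetical, and a centred 7-day window that drops the incomplete windows at both ends is an assumption, since the text does not state how the running mean is aligned.

```python
import numpy as np

def split_low_high(daily, window=7):
    """Split a daily series into a running-mean part and its residual.

    Assumes an odd, centred window; days without a full window are dropped.
    """
    daily = np.asarray(daily, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only full windows, so the output is window - 1 days shorter
    low = np.convolve(daily, kernel, mode="valid")
    half = window // 2
    # residual of the running mean, evaluated on the same (central) days
    high = daily[half:len(daily) - half] - low
    return low, high
```

Applied to each grid point (or spectral coefficient) separately, `low` would play the role of the low-frequency field and `high` that of the high-frequency residual.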

The high-order VAR model used in this study can be written as follows

$${\mathbf{x}}_{i} = \sum\limits_{j = N}^{L + N - 1} {{\mathbf{A}}_{j} \cdot {\mathbf{x}}_{i - j} } + {\mathbf{e}}_{i} ,$$
(1)

where \({\mathbf{x}}_{i}\) represents the variable x at the i-th time step and is expressed as a linear combination of its states at the previous N to L + N − 1 time steps (L orders) with a forecast lead of N time steps. \({\mathbf{A}}_{j}\) is the coefficient matrix for the j-th order and \({\mathbf{e}}_{i}\) represents the residual white noise process. This model reduces to the LIM and the Markov model when L = 1.
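To make Eq. (1) concrete, the sketch below estimates the coefficient matrices by ordinary least squares and then forms a forecast. The estimator is an assumption (the text does not say how the coefficients are computed), and the names `fit_var` and `predict` are hypothetical. The input is assumed to be an anomaly time series with zero mean.

```python
import numpy as np

def fit_var(x, order, lead):
    """Least-squares estimate of A_N ... A_{L+N-1} in Eq. (1).

    x: array of shape (time, n_vars); order = L; lead = N.
    Returns A with shape (order, n_vars, n_vars), A[k] multiplying x_{i-(lead+k)}.
    """
    x = np.asarray(x, dtype=float)
    n_time, n_vars = x.shape
    first = lead + order - 1  # earliest target index with all predictors available
    targets = x[first:]
    # one predictor block per lag j, holding x_{i-j} for every target index i
    lagged = [x[first - j:n_time - j] for j in range(lead, lead + order)]
    design = np.hstack(lagged)  # shape (n_samples, order * n_vars)
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    # rows of coef are grouped by lag; transpose each block so x_i ~ A_j @ x_{i-j}
    return coef.reshape(order, n_vars, n_vars).transpose(0, 2, 1)

def predict(A, history):
    """Forecast x_i from a history whose last row is x_{i-N}."""
    return sum(A[k] @ history[-1 - k] for k in range(A.shape[0]))
```

With `order=1` the fit reduces to a single lagged regression matrix, which corresponds to the first-order case mentioned above.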

The empirical orthogonal function (EOF) analysis is applied to the streamfunction and diabatic heating to reduce the spatial degrees of freedom (DOF) of the statistical model, from the order of 10⁵ to a relatively small number of leading modes (~40) that capture a large fraction of the total variance. It is thus possible to construct a reduced order statistical model that can capture the evolution of the large-scale circulation. The vector variable x can be decomposed as follows

$${\mathbf{x}} = {\mathbf{U}} \times \mathcal{\varSigma } \times {\mathbf{V}}^{T} ,$$
(2)

where the eigenvector matrix U represents the spatial patterns of x that are orthogonal to each other, and the eigenvalues (the squares of the diagonal of Σ) measure the relative variance explained by each eigenvector. The corresponding principal components (PCs) can be obtained as \({\mathbf{U}}^{T} \times {\mathbf{x}}\) and are used to replace x in the VAR model (Eq. 1) to reduce the DOF of the calculations.
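The truncation in Eq. (2) can be sketched with a singular value decomposition. This is an illustration under stated assumptions: the function name `eof_truncate` is hypothetical, the input is taken to be a (space, time) anomaly matrix, and no area weighting is applied because the text does not mention one.

```python
import numpy as np

def eof_truncate(field, n_modes):
    """Return the leading EOFs, their PCs, and explained-variance fractions.

    field: anomaly matrix of shape (space, time).
    """
    field = np.asarray(field, dtype=float)
    U, s, Vt = np.linalg.svd(field, full_matrices=False)
    eofs = U[:, :n_modes]           # orthonormal spatial patterns (columns of U)
    pcs = eofs.T @ field            # U^T x, shape (n_modes, time)
    var_frac = s**2 / np.sum(s**2)  # fraction of total variance per mode
    return eofs, pcs, var_frac[:n_modes]
```

The rows of `pcs` would then replace x in the VAR model, and `eofs @ forecast_pcs` maps a forecast back to physical space.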

The forecast skill of the VAR model is evaluated by cross-validation as follows: the time series of a single year (summer) is first selected as the target year for forecast, and the remaining years (summers) are used to estimate the coefficient matrix A (i.e., out-of-sample forecast). The predicted time series of the target year is then verified against the observed data. The temporal anomaly correlation between the predicted and the observed time series is calculated for each spatial grid point. This routine is repeated for all years, and the anomaly correlation is averaged across all years to obtain the forecast skill. The anomaly correlation is sometimes spatially averaged over the entire NH or over the latitude bands 0°–30°N, 30°–60°N, and 60°–90°N to obtain the domain averaged forecast skill. Since the probability distribution of the linear correlation coefficient is not Gaussian and correlations are thus not directly additive, Fisher’s Z transformation is used to calculate the averaged anomaly correlation and its confidence interval. The correlation coefficient r is first transformed to z by z = arctanh(r), where z is approximately Gaussian. The averaged z, \(z_{m}\), is then transformed back to the averaged correlation coefficient \(r_{m}\) by \(r_{m} = \tanh(z_{m})\) (see Wilks 1995 for more details).
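The averaging step described above can be sketched in a few lines. The function name `fisher_mean` is hypothetical, and the confidence interval calculation is left out.

```python
import math

def fisher_mean(correlations):
    """Average correlation coefficients via Fisher's z transformation."""
    z = [math.atanh(r) for r in correlations]  # z = arctanh(r)
    z_mean = sum(z) / len(z)                   # average in z space
    return math.tanh(z_mean)                   # transform back to a correlation
```

Because arctanh stretches values near ±1, the result generally differs from the arithmetic mean: averaging 0.9 and 0.1 this way gives a value above 0.5.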

The statistical significance of the anomaly correlation is evaluated by Student’s t test. As the DOF depends on lead-time and order (DOF = number of days in each summer − 7 − N − L − 2), the 95 % confidence level for the correlation coefficient (p = 0.05) is calculated for each DOF. A similar test is applied to the temporal regression of streamfunction and tropical heating on the PCs of tropical heating in Sect. 4.
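For reference, the textbook relation between a correlation coefficient and Student's t is given below. It is assumed here that the test takes this standard form, with ν denoting the DOF defined above and \(t_{c}\) the critical t value at the chosen level. The symbols ν, \(t_{c}\) and \(r_{c}\) are introduced only for this illustration.

$$t = r\sqrt{\frac{\nu}{1 - r^{2}}}, \qquad r_{c} = \frac{t_{c}}{\sqrt{\nu + t_{c}^{2}}}$$

A correlation is significant when its magnitude exceeds \(r_{c}\). As an illustrative count, if each May–September summer contributes 153 days, a 14-day lead (N = 14) with L = 3 gives ν = 153 − 7 − 14 − 3 − 2 = 127.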

3 VAR model prediction without tropical heating

The VAR model is first applied to the NH 250 and 750 hPa streamfunction to determine the optimal combination of parameters. The forecast skill of the VAR model depends on several parameters, including the number of EOFs retained, the order, i.e., the number of initial time steps considered (L), and the lead-time (N; for example, a 14-day or a 21-day forecast). The individual and cumulative variances explained by the leading EOFs are shown in Fig. 2. The first 5 EOFs explain ~40 % of the total variance. We consider EOF truncations ranging from 20 modes (with a cumulative variance close to 70 % of the total variance) to 60 modes (close to 90 % of the total variance). We also consider orders from 1 to 15, representing temporal memories ranging from 1 day to 15 days prior to the forecast day. Increasing the order increases the number of independent variables, which may lead to overfitting, but higher orders may also provide time evolution information that is potentially useful. The 10 to 14-day lead-time, which represents the model’s intermediate range forecast skill, is chosen as the evaluation period for determining the optimal parameters.

Fig. 2

Individual (dots) and cumulative (solid curve) variance explained by EOFs of NH summer 250 and 750 hPa streamfunction

Figure 3a shows the 250 hPa forecast skill, measured by the temporal anomaly correlation between forecast and verification fields and averaged over the entire NH and over lead-times of 10–14 days, as a function of the number of EOFs and the order used in the VAR model. For a fixed number of EOFs, the skill increases quickly with the order and reaches a maximum at the third order (L = 3). Orders higher than 3 reduce the forecast skill. The dependence of the forecast skill on the number of EOFs for a fixed order (e.g., L = 3) is relatively weak, with the maximum skill reached between 40 and 45 modes. Based on the anomaly correlations in Fig. 3a, the optimal parameters are 43 EOFs at the third order (denoted as E43L3 and marked by an ‘X’ in Fig. 3a). In contrast, the relative amplitude, measured by the ratio of the forecast amplitude to the verification amplitude, increases with both the number of EOFs and the order (Fig. 3b), indicating that the more EOFs and the higher the order retained, the better the VAR model captures the observed 250 hPa streamfunction amplitude. The forecast skill for the 750 hPa streamfunction behaves in a qualitatively similar way (Fig. 3c, d) to that for the 250 hPa streamfunction.

Fig. 3

a Temporal correlation coefficient and b relative amplitude averaged over the entire NH and 10–14-day lead for MERRA 250 hPa streamfunction. c, d Same as (a, b) but for 750 hPa. The contour interval is 0.005 for correlation coefficient and 0.025 for relative amplitude. Panels: a 250 hPa temporal correlation, b 250 hPa relative amplitude, c 750 hPa temporal correlation, d 750 hPa relative amplitude

The high-order VAR model is a natural extension of the first-order one: the latter assumes the Markov property, i.e., single-step “memory” (only one time step), whereas the former allows long-range dependence (multiple steps). Atmospheric circulation involves multiple time scales (e.g., Tsonis et al. 1999; Vyushin and Kushner 2009), so one expects the high-order VAR model to perform better than the first-order LIM. Because it takes into account the long-range dependence of the atmosphere rather than a single time step, one also expects the high-order VAR model to better satisfy the assumption of data-driven statistical models such as Eq. 1 that the residual noise is white (Chapman et al. 2015). Figure 3a shows some improvement from the first order to the third order in terms of the NH averaged temporal anomaly correlation, from approximately 0.27 to 0.34. As will be shown later, this improvement can be more substantial on the regional scale.

We further compare the forecast skill as a function of lead-time in Fig. 4 for the optimal combination of EOF truncation (43) and order (3) determined in Fig. 3a. Figure 4 shows the forecast skill of the VAR model together with the results from a LIM and from anomaly persistence, as well as the 95 % confidence level for the various lead-times. All measures of forecast skill decay with lead-time because the atmospheric circulation has limited temporal memory. Based on the 250 hPa streamfunction, the anomaly correlation for persistence drops below the 95 % level in 9 days, whereas that of the VAR model with optimal parameters drops below the 95 % level in approximately 18 days, compared to 15 days for the LIM. The improvement of the VAR model over both persistence and the LIM is slightly larger at the 750 hPa level than at the 250 hPa level. The NH averaged anomaly correlation using the VAR model remains significant at the 95 % level at day 21, whereas persistence loses its significance at day 8 and the LIM at day 17.
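The persistence baseline used in this comparison can be sketched as below: the anomaly observed N days earlier is taken as the forecast, and skill is the temporal correlation with the verifying anomaly. The function names are hypothetical, and computing the anomaly correlation as a Pearson correlation over a single continuous series is a simplifying assumption.

```python
import numpy as np

def anomaly_correlation(predicted, observed):
    """Temporal correlation between a predicted and an observed series."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    p = p - p.mean()
    o = o - o.mean()
    return float(np.sum(p * o) / np.sqrt(np.sum(p**2) * np.sum(o**2)))

def persistence_skill(series, lead):
    """Skill of forecasting x_i with x_{i-lead}, i.e., the lag autocorrelation."""
    series = np.asarray(series, dtype=float)
    return anomaly_correlation(series[:-lead], series[lead:])
```

For a red-noise series the persistence skill falls off with lead, which is the behaviour any statistical model has to beat to demonstrate added value.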

Fig. 4

a 250 hPa and b 750 hPa temporal correlation coefficient averaged over 0–90°N for lead-time 1–21 days for LIM (blue) and the third-order VAR model (red) with 43 EOFs, compared with the persistence of MERRA streamfunction (black). The width of the curves represents the confidence interval of the mean at the 95 % level. The dashed light grey curves represent the statistical significance of the correlation coefficient at 95 % level by t test

To examine the spatial distribution of the persistence and forecast skill, we show in Fig. 5 the 14-day lead anomaly correlation in the reanalysis and in the LIM and VAR model forecasts. There is little persistence at 14-day lead at both upper and lower levels (Fig. 5a, d), as the correlations are mostly negative. The LIM 14-day lead forecast shows pronounced improvements over persistence across the entire NH (Fig. 5b, e vs. 5a, d), while the VAR E43L3 forecast exhibits even higher skill (Fig. 5c, f). The improvement from increasing the order is not spatially uniform: for the 250 hPa streamfunction, the most pronounced improvement extends from the eastern Pacific to Central America and on to northern Africa and West Asia.

Fig. 5

a The 14-day lag auto-correlation indicating the persistence in MERRA NH summer 250 hPa streamfunction. Temporal correlation coefficient of NH summer 250 hPa streamfunction between MERRA reanalysis and the 14-day lead forecast using b LIM and c the third-order VAR model with 43 leading EOFs. d–f Similar to (a–c) but for 750 hPa. The contour interval is 0.1 and the stippling represents the statistical significance of the correlation at 95 % level or higher by t test. Panels: a persistence 250 hPa, b LIM 250 hPa, c VAR 250 hPa, d persistence 750 hPa, e LIM 750 hPa, f VAR 750 hPa

Overall, the third-order VAR model tends to have the highest predictive skill in the latitude band between 15°N and 45°N at 250 hPa, with maximum skill over the Arabian Peninsula. The anomaly persistence tends to be marginally significant over West Asia and northern Africa (stippled area in Fig. 5a), which may partially explain the high forecast skill there. However, the relatively high skill in the 15°N to 45°N band is clearly not due purely to persistence, but is also dynamically driven. Similar conclusions apply to the 750 hPa streamfunction prediction (Fig. 5d–f), with the region of highest prediction skill shifted to Central and East Asia (Fig. 5e, f), where persistence is relatively poor (Fig. 5d). The 14-day forecast skill is generally higher at 750 hPa than at 250 hPa.

4 Forecast skill with tropical heating

Diabatic heating consists primarily of radiative, latent and sensible heat fluxes and is the main driver of the tropical atmospheric circulation (e.g., Matsuno 1966; Gill 1980; Ling and Zhang 2013). The low-frequency component of tropical heating is considered one potential source of low-frequency predictability beyond the intrinsic limit set by atmospheric chaos (e.g., Winkler et al. 2001; Newman et al. 2003). For example, Cassou (2008) showed the linkage between the MJO and the NAO phases during NH winter. A large fraction of the observed winter intraseasonal circulation is associated with the MJO at 2–4-week lead-time (e.g., Matthews et al. 2004; Seo and Son 2012). In this section we examine the role of tropical heating in northern summer low-frequency predictability with the VAR model.

4.1 Optimizing the contribution from tropical heating

To construct the VAR model with tropical heating included, we first perform the EOF analysis on the vertically integrated diabatic heating, smoothed by the 7-day running mean filter, for the latitude band 25°S–25°N. The 5 leading EOFs explain about 15 % of the total variance, while the top 100 modes explain about 70 % of the total variance (Fig. 6). The spatial patterns of the first 20 tropical heating EOFs are shown in Fig. 7. The first EOF pattern includes anomalous heating over the eastern equatorial Pacific and cooling in the western tropical Pacific, northwest and southwest of the heating center (Fig. 7a). The second EOF pattern resembles MJO phase 1 (Cassou 2008; Seo and Son 2012), with maximum heating over the Maritime Continent and cooling over the central equatorial Pacific (Fig. 7b). The third EOF pattern is dominated by a north–south dipole, with a heating center at 15°S and a cooling center at 15°N, in the longitude band 120°W–0°, from the eastern Pacific to the Atlantic (Fig. 7c). As shown in Sect. 4.3 below, this is the mode that contributes most to the extended forecast skill for the Arabian Peninsula. The other leading EOFs, each of which explains approximately 1–2.5 % of the total variance, tend to have more localized centers than the first three modes, and their physical causes are less obvious (Fig. 7d–t), although a few of them do resemble various phases of the MJO (i.e., Fig. 7b, e, g corresponding to phases 1, 4, and 7, respectively).

Fig. 6

Individual (dots) and cumulative (solid curve) variance explained by leading EOFs of tropical heating (25°S–25°N)

Fig. 7

Spatial patterns of the leading tropical heating EOFs. Contour interval is 0.025 K day⁻¹. Panels: a tropical heating EOF1 (4.1%), b tropical heating EOF2 (3.4%), d tropical heating EOF4 (2.5%), e tropical heating EOF5 (2.1%), f tropical heating EOF6 (1.9%), g tropical heating EOF7 (1.8%), h tropical heating EOF8 (1.7%), i tropical heating EOF9 (1.6%), j tropical heating EOF10 (1.5%), k tropical heating EOF11 (1.4%), l tropical heating EOF12 (1.3%), m tropical heating EOF13 (1.2%), n tropical heating EOF14 (1.2%), o tropical heating EOF15 (1.1%), p tropical heating EOF16 (1.1%), q tropical heating EOF17 (1.1%), r tropical heating EOF18 (1.0%), s tropical heating EOF19 (0.9%), t tropical heating EOF20 (0.9%)

The VAR model with tropical heating is constructed by extending the vector x in Eq. (1) to include both the streamfunction and tropical heating PCs. The order (3) and the number of streamfunction modes (43) are kept the same as in the optimal VAR model without tropical heating. To determine the optimal number of tropical heating EOFs to include in the VAR model, we gradually increase the number of EOF modes for the diabatic heating from one to 40. The forecast skill improvement (relative to no tropical heating), averaged over lead-times of 10–21 days and spatially averaged over the entire NH (0°–90°N), the tropics (0°–30°N), and the extra-tropics (30°–90°N) at 750 and 250 hPa, is shown in Fig. 8 as a function of the number of heating EOF modes. At 250 hPa, the improvement in forecast skill first increases to a peak at 19 EOF modes for all forecast domains (solid curves in Fig. 8a) and then slowly decays. For the tropical domain (dashed curve), the forecast skill increases slightly as more EOFs are added, which tends to be true for both the upper (Fig. 8a) and lower (Fig. 8b) levels. At the lower level, however, the extra-tropical forecast skill actually drops when the tropical heating is included (dotted curve in Fig. 8b). We choose 19 modes in this study to reflect the maximum increase in forecast skill for the upper level streamfunction. The forecast skill of the tropical heating itself is comparable to that in earlier studies (e.g., Winkler et al. 2001; Newman et al. 2003) and is discussed in the next subsection.
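The extended state vector can be written in block form, which makes explicit where the heating enters the streamfunction forecast. The block notation below is introduced only for illustration: \(\boldsymbol{\psi}_{i}\) stands for the 43 streamfunction PCs and \({\mathbf{h}}_{i}\) for the 19 tropical heating PCs at time step i.

$${\mathbf{x}}_{i} = \begin{pmatrix} \boldsymbol{\psi}_{i} \\ {\mathbf{h}}_{i} \end{pmatrix}, \qquad {\mathbf{A}}_{j} = \begin{pmatrix} {\mathbf{A}}_{j}^{\psi \psi } & {\mathbf{A}}_{j}^{\psi h} \\ {\mathbf{A}}_{j}^{h\psi } & {\mathbf{A}}_{j}^{hh} \end{pmatrix}$$

With these truncations the state has 43 + 19 = 62 components and each \({\mathbf{A}}_{j}\) is 62 × 62. The off-diagonal block \({\mathbf{A}}_{j}^{\psi h}\) carries the influence of past heating on the streamfunction forecast, and the lower row of blocks is what allows the heating itself to be predicted within the same model.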

Fig. 8

Average improvements in forecast skill of a 250 hPa and b 750 hPa streamfunction for lead-time 10–21 days by including tropical heating truncated at various numbers of EOFs. The grey shading represents the confidence interval of the mean at the 95 % level

4.2 Tropical heating impacts on forecast skill

Figure 9a illustrates the forecast skill in terms of NH averaged anomaly correlation at 250 hPa as a function of forecast lead-time from 1 to 21 days, shaded by the 95 % confidence interval. The inclusion of the tropical heating extends the significant NH average forecast skill from 17 days without the tropical heating to beyond 21 days with the tropical heating in Fig. 9a. The additional information brought by tropical heating only improves the 250 hPa streamfunction forecast at lead-time beyond 11 days (14 days) for the mid-latitude (high-latitude) domain (Fig. 9b), while the tropical domain benefits almost immediately. The grey shading represents the 95 % confidence interval in the mean improvements and the improvements are statistically significant if the lower bound of the shading exceeds zero. The wide confidence interval also indicates that in some years the tropical heating improves the streamfunction forecast skill more than the averages shown here. In contrast, the 750 hPa streamfunction forecast can be improved only for the medium range and only in the tropics by including tropical heating (not shown). It takes nearly 2 weeks for tropical heating to boost the forecast of high-latitude circulation (Fig. 9b). This time scale is consistent with that of most teleconnection patterns and wave-mean flow interactions (e.g., Feldstein 2000; Son et al. 2008; Teng et al. 2013), indicating wave propagation as a likely mechanism. The mechanisms of tropical heating impacts on forecast skill will be discussed in more detail in the next subsection.

Fig. 9

a 250 hPa streamfunction temporal correlation coefficient averaged over the entire NH for lead-time 1–21 days for the third-order VAR model retaining 43 EOFs in streamfunction without (blue) or with (red) 19 EOFs in tropical heating, compared with the persistence (black). b The improvements by tropical heating averaged over different latitude bands, shaded by the corresponding 95 % confidence interval. c Similar to Fig. 8a but for tropical heating forecast skill averaged over 25°S–25°N. Panels: a streamfunction 0°–90°N forecast skill, b streamfunction forecast skill improvements, c tropical heating 25°S–25°N forecast skill

The tropical heating is also predicted simultaneously in the VAR model and its forecast skill is shown in Fig. 9c. At the initial time, the anomaly correlation between the part of the heating included in the VAR model (the first 19 EOF modes, or 30 % of the total variance) and the actual heating is only slightly above 0.5, compared with the near perfect correlation for anomaly persistence. The skill associated with persistence continues to exceed that of the VAR model forecast from day 1 to day 5 (Fig. 9c). Beyond day 5, however, the VAR model forecast skill exceeds that of persistence, and the anomaly correlation stays significant above the 95 % level until day 17. The VAR model forecast skill is superior to the anomaly persistence primarily in the Indian Ocean and the equatorial western Pacific region (not shown). These are the dominant regions of MJO activity, and the ability of the VAR model to capture the time evolution of tropical heating in these regions indicates that it might have skill in predicting the atmospheric circulation variability associated with the MJO, as can be seen in Sect. 4.3.

To determine the spatial characteristics of the streamfunction forecast skill improvements due to the tropical heating, we show in Fig. 10 the forecast skill at 14, 21, 28, and 35-day lead without (left) and with (right) tropical heating. At 14-day lead-time, the forecast of the 250 hPa streamfunction is improved mainly in the subtropics by including 19 EOFs of tropical heating in the VAR model (Fig. 10a, b). The general patterns of high forecast skill remain the same, but the skill with tropical heating (Fig. 10b) is almost uniformly higher than that without tropical heating (Fig. 10a). At 21-day lead, the region of significant forecast skill with tropical heating extends further north and covers a larger zonal band than that without tropical heating (Fig. 10c, d). The regions with improved forecast skill at intermediate range are in general agreement with, and slightly broader than, those in earlier studies using LIM (e.g., Newman et al. 2003; note that the forecast skill is not directly compared with earlier studies because of the different datasets, evaluation periods, and DOF used). At lead-times of 28 and 35 days, the forecast skill remains high and significant over the Arabian Peninsula, with little decay, in the case with the tropical heating (Fig. 10f, h), while it diminishes quickly everywhere in the case without the tropical heating (Fig. 10e, g).

Fig. 10

NH 250 hPa streamfunction forecast skill for VAR model a E43L3 (without heating), and b E43L3E19 (with heating), at 14-day lead. c, d, e, f, and g, h Similar to (a, b) but for lead-times of 21, 28, and 35 days, respectively. Panels: a no heating 14-day lead, b with heating 14-day lead, c no heating 21-day lead, d with heating 21-day lead, e no heating 28-day lead, f with heating 28-day lead, g no heating 35-day lead, h with heating 35-day lead

4.3 Mechanism of tropical heating impact on forecast skill

The forecast skill over the Arabian Peninsula for lead-time up to 35 days when tropical heating is included stands out as one prominent feature of the summertime predictability. Due to the inclusion of the tropical heating, the domain averaged forecast skill in 250 hPa streamfunction stays at ~0.4 from 21-day lead up until 33-day lead, in contrast to the continuous drop to ~0.2 in the no heating case (Fig. 11a). We explore further in this section the possible mechanisms behind this prolonged predictability due to tropical heating.

Fig. 11

a Same as Fig. 9a but for the region within 10–30°N and 30–70°E. b Same as Fig. 8a but for lead-time 21–35 days over 10–30°N and 30–70°E. c The improvements of 250 hPa streamfunction temporal correlation coefficient by each of the 20 leading EOFs of tropical heating averaged over 10–30°N and 30–70°E for lead-times 21–35 days, labelled with the corresponding type of heating. Panels: a 10°–30°N, 30°–70°E forecast skills, b cumulative improvement averaged over 21–35 days, c individual improvements averaged over 21–35 days

To determine the impacts of individual modes of tropical heating on forecast skill improvements, we show in Fig. 11b the increase in the 250 hPa streamfunction forecast skill over the Arabian Peninsula, averaged over lead-times of 21–35 days, as a function of the number of leading PCs of tropical heating included in the VAR model. The overall forecast skill improvement increases as more tropical heating PCs are added to the VAR model and reaches its maximum at 17 modes. Tropical heating PC3 and PC1 make the largest contributions to the overall improvement in the 250 hPa streamfunction forecast skill, while several other PCs reinforce it. The individual contribution from each PC can be further assessed by including only one tropical heating PC at a time in the VAR model and calculating the corresponding improvement in the 250 hPa streamfunction forecast skill over the Arabian Peninsula (Fig. 11c). Although this decomposition is only quasi-linear, the consistency between Fig. 11b and c confirms the relative importance of these tropical heating PCs. The corresponding types of heating (“Persist” and “MJO”) for these PCs are discussed below. Figure 12 shows the regional forecast skill improvements due to the tropical heating PC1 and PC3. Given that the combined impact of the tropical heating PC1 and PC3 explains more than half of the total improvement in forecast skill in this region, it is worth exploring further the behavior of these two tropical heating modes and their impacts on atmospheric circulation.

Fig. 12

The improvements of 250 hPa streamfunction temporal correlation coefficient by the first PC of the tropical heating at lead-times of a 21, b 28, and c 35 days. d–f Same as (a–c) but for the third PC of the tropical heating. Panels: a PC1 21-day, b PC1 28-day, c PC1 35-day, d PC3 21-day, e PC3 28-day, f PC3 35-day

The role of tropical heating in impacting the 250 hPa streamfunction is examined through lagged regression between NH streamfunction averaged within the latitude band 10°N–30°N and the tropical heating PC1 with PC1 leading by 0–35 days (Fig. 13a). Figure 13a shows the regression coefficient of the streamfunction onto the tropical heating PC1 normalized by the maximum amplitude of the EOF1 heating, which gives the streamfunction amplitude per 1 Kday−1 heating. Thus the variance explained by the regressed streamfunction relative to the total summer low frequency variability can be estimated by comparing Fig. 13a to Fig. 1a. This diagram shows the development of the negative (cyclonic) anomaly at the Pacific sector and its eastward propagation in the first 2 weeks, as well as the development of the positive (anticyclonic) anomaly and the ensuing eastward propagation that reaches West Asia sector after 3 weeks and maintains there afterwards. To relate this evolution in streamfunction to that of the tropical heating, we show the similar regression as in Fig. 13a but for tropical heating averaged from 5° to 20°N (Fig. 13b) and for 15°S–5°N (Fig. 13c), where the dominant centers of the EOF1 heating are located (see Fig. 7a). The evolution of the regressed tropical heating shows fairly distinct behaviors in the western (0°–180°E) and eastern (0°–180°W) tropics. The heating/cooling centers in the western tropics tend to propagate eastward, while the centers in the eastern tropic are more or less stationary. To determine the relative contribution from diabatic heating in the eastern and the western tropics, additional experiments are carried out to include in VAR model only the western and eastern portion of the tropical heating. The leading PCs of each domain are included in the VAR model one at a time to evaluate their individual impacts on the Arabian Peninsula forecast skill. 
The contribution from the PC1 tropical heating comes almost entirely from the eastern tropics. This suggests that the quasi-stationary tropical heating/cooling patterns in the eastern tropics help to extend the forecast skill to the third and fourth weeks in the Arabian Peninsula. Although the EOF1 heating pattern resembles that of the heating anomalies during an El Niño event, the forecast skill in ENSO years is not significantly different from that in non-ENSO years, implying that ENSO might not be the dominant driver of this persistent heating pattern.
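The lagged regression behind a Hovmöller diagram of this kind can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name `lagged_regression`, the argument `eof_max_amplitude` (one possible reading of the normalization by the maximum EOF1 heating amplitude), and the synthetic data are all assumptions, and the sketch treats the record as one continuous series instead of handling each summer separately.

```python
import numpy as np

def lagged_regression(field, pc, max_lag, eof_max_amplitude=1.0):
    """Regress field(t + lag) onto pc(t) for lag = 0..max_lag.

    field : (time, lon) anomalies, e.g. a latitude-band average
    pc    : (time,) predictor time series that leads the field
    eof_max_amplitude : hypothetical scaling; dividing by it expresses the
        slope per unit heating at the EOF maximum instead of per unit PC
    Returns an array of shape (max_lag + 1, lon).
    """
    field = np.asarray(field, dtype=float)
    pc = np.asarray(pc, dtype=float)
    n = pc.shape[0]
    coefs = np.empty((max_lag + 1, field.shape[1]))
    for lag in range(max_lag + 1):
        x = pc[: n - lag]              # predictor at time t
        y = field[lag:]                # field at time t + lag
        x = x - x.mean()
        y = y - y.mean(axis=0)
        coefs[lag] = x @ y / (x @ x)   # least-squares slope at each longitude
    return coefs / eof_max_amplitude

# Synthetic check: longitude index 1 responds to the PC three steps later.
rng = np.random.default_rng(0)
pc = rng.standard_normal(500)
field = 0.01 * rng.standard_normal((500, 4))
field[3:, 1] += 2.0 * pc[:-3]
coefs = lagged_regression(field, pc, max_lag=5)
```

In this synthetic case the slope at lag 3 and longitude index 1 recovers the imposed value of about 2, while the other longitudes stay near zero. Plotting `coefs` with lag on the vertical axis and longitude on the horizontal axis gives the Hovmöller layout.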

Fig. 13

Hovmöller diagram of a 10–30°N averaged MERRA 250 hPa streamfunction (×10^6 m^2 s^-1 per K day^-1), b 5–20°N and c 15°S–5°N averaged MERRA tropical heating (K day^-1) regressed on the first PC of the tropical heating with the latter leading by 0–35 days. The stippling represents the statistical significance of the regression at 95 % level or higher by t test. a 250 hPa streamfunction 10°–30°N regressed on heating PC1, b tropical heating 5°–20°N regressed on heating PC1, c tropical heating 15°S–5°N regressed on heating PC1

Figure 14 shows the corresponding streamfunction and tropical heating regressions for the third heating EOF mode. The general pattern of the streamfunction regression shows similar eastward propagation, but a quasi-stationary positive anomaly develops over the Arabian Peninsula starting from the end of week 1 and strengthens with time until the fifth week (Fig. 14a). The tropical heating regressed onto PC3 generally shows quasi-stationary heating/cooling patterns (Fig. 14b, c). Further decomposition of the tropical heating into regional domains reveals that the heating and cooling dipole over the Northeast Pacific and North America, as well as the heating centers over the Southeast Pacific and South Atlantic, are able to significantly enhance the streamfunction forecast skill in the Arabian Peninsula (not shown). Again, the heating/cooling centers that contribute to the extended forecast skill improvements are located in the eastern tropics and are quasi-stationary in time. In addition, PC12 and PC14 also represent persistent heating that reinforces the improvements (see Fig. 11c for modes labelled with “Persist” and Fig. 7l, n for the corresponding heating patterns). All the other notable contributors, including PCs 2, 5, 7, 10, and 17, are more of an MJO type of tropical heating that propagates eastward on intraseasonal time scales (modes labelled with “MJO” in Fig. 11c; see Fig. 7b, e, g, j, and q for the corresponding heating patterns).

Fig. 14

Same as Fig. 13 except for a 10–30°N averaged MERRA 250 hPa streamfunction regressed on the third PC of tropical heating and for averaged tropical heating at b 0–15°N and c 0–15°S. a 250 hPa streamfunction 10°–30°N regressed on heating PC3, b tropical heating 0°–15°N regressed on heating PC3, c tropical heating 0°–15°S regressed on heating PC3

The extended predictability of the upper tropospheric circulation over the Arabian Peninsula in summer may be largely induced by the persistent tropical heating anomalies in the eastern Pacific and the Atlantic. It is not clear why the tropical heating in the eastern Pacific and the Atlantic tends to persist for a long period of time during northern summer. One possibility is the lack of strong MJO activity in the region, which allows the anomalous heating, once formed, to last longer. Further analysis is needed to understand the cause of subseasonal persistent tropical heating anomalies. Even in the absence of tropical heating, the VAR model tends to have significant forecast skill for the 250 hPa streamfunction up to 35 days over the Arabian Peninsula (Fig. 9g). Previous studies have noted that this region is a node of a circumglobal wave train or teleconnection (e.g., Ding and Wang 2005, 2007; Saeed et al. 2011; Hoell et al. 2013; Teng et al. 2013) and is the re-entry location of the circumglobal wave train. In such cases, the anomaly induced by tropical heating or other random atmospheric processes travels around the NH and re-enters this region. Further investigation is needed to understand the link between long-lead predictability in the Arabian Peninsula and the circumglobal wave propagation, particularly in summer.

5 Summary

A data-driven statistical model, the vector auto-regressive (VAR) model, is evaluated for predicting the NH summer (May through September) low frequency variability. The high-order VAR model is able to use multiple-step temporal information to improve the forecast skill over first-order models such as the LIM, for the 250/750 hPa streamfunction as estimated using MERRA reanalysis for 1979–2013 summers. Significant forecast skill is achieved for an extended lead-time of up to 18 days with a third-order model, far longer than the statistical persistence of 9 days and significantly longer than the LIM’s 15 days on a NH average basis. Regionally, the forecast skill tends to be the strongest along a latitude band between 15°N and 45°N.
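How a higher-order VAR model uses multiple-step temporal information can be illustrated with a short sketch. The code below fits x_t = A_1 x_{t-1} + ... + A_p x_{t-p} + noise by ordinary least squares and iterates the fitted model forward; it is a generic textbook formulation under stated assumptions, not the implementation used in the study. The function names `fit_var` and `forecast_var`, the two-variable synthetic system, and the coefficient matrices `A1` and `A2` are hypothetical.

```python
import numpy as np

def fit_var(x, order):
    """Least-squares fit of a VAR(order) model to zero-mean anomalies.

    x : (time, nvar). Returns coefficient matrices of shape (order, nvar, nvar),
    where entry k multiplies x at lag k + 1.
    """
    n, m = x.shape
    # Each design row holds [x_{t-1}, x_{t-2}, ..., x_{t-order}] for target x_t.
    design = np.hstack([x[order - k : n - k] for k in range(1, order + 1)])
    target = x[order:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    # lstsq solves target = design @ coef, so each block is the transpose of A_k.
    return coef.reshape(order, m, m).transpose(0, 2, 1)

def forecast_var(coefs, history, steps):
    """Iterate the fitted model forward from the last `order` states."""
    order = coefs.shape[0]
    state = [row for row in history[-order:]]   # oldest first
    out = []
    for _ in range(steps):
        nxt = sum(coefs[k] @ state[-1 - k] for k in range(order))
        out.append(nxt)
        state.append(nxt)
    return np.array(out)

# Synthetic VAR(2) process with known (hypothetical) coefficients.
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[-0.2, 0.0], [0.1, -0.1]])
rng = np.random.default_rng(1)
x = np.zeros((5000, 2))
for t in range(2, 5000):
    x[t] = A1 @ x[t - 1] + A2 @ x[t - 2] + rng.standard_normal(2)

A_hat = fit_var(x, order=2)
fc = forecast_var(A_hat, x, steps=5)
```

With `order=1` the same fit reduces to a first-order model that uses only the most recent state, which is the sense in which the higher-order model draws on additional temporal information.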

The VAR model is also able to incorporate the tropical diabatic heating to improve the forecast skill of the atmospheric low frequency circulation for the extended ranges (>7 days). Upon including tropical heating, the forecast skill starts to improve at low latitudes at very short leads but slightly degrades in the extra-tropics. It takes 10 days or so to enhance the skill at mid-latitudes, and another 2–3 days to reach high latitudes. The improvements that appear at different latitudes at different lead times when tropical heating is introduced indicate that the extra-tropical low frequency flow can be modulated by tropical heating through planetary wave propagation.
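The idea of measuring the skill gained from an added heating predictor can be sketched with a toy example. For brevity, the sketch uses a direct lagged linear regression at a fixed lead rather than an iterated VAR forecast, and it assumes the heating index is simply appended to the predictor set. The function `lead_skill`, the synthetic circulation index `psi`, the persistent heating index `h`, and all coefficients are hypothetical and do not come from the study.

```python
import numpy as np

def lead_skill(predictors, target, lead, n_train):
    """Temporal correlation of a direct linear forecast at a given lead.

    predictors : (time, k) zero-mean predictors at the initial time
    target     : (time,) series verified `lead` steps later
    The model is trained on the first n_train pairs and scored on the rest.
    """
    X, y = predictors[:-lead], target[lead:]
    coef, *_ = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)
    pred = X[n_train:] @ coef
    return np.corrcoef(pred, y[n_train:])[0, 1]

# Toy system: a persistent heating index drives the circulation 10 steps later.
rng = np.random.default_rng(2)
n = 3000
h = np.zeros(n)
psi = np.zeros(n)
for t in range(1, n):
    h[t] = 0.95 * h[t - 1] + rng.standard_normal()
    forcing = 0.8 * h[t - 10] if t >= 10 else 0.0
    psi[t] = 0.5 * psi[t - 1] + forcing + rng.standard_normal()

skill_without = lead_skill(psi[:, None], psi, lead=10, n_train=2000)
skill_with = lead_skill(np.column_stack([psi, h]), psi, lead=10, n_train=2000)
improvement = skill_with - skill_without
```

The difference `improvement` plays the role of the correlation improvements mapped in the skill figures: it is positive here because the heating index carries information about the future circulation that the circulation itself has not yet registered.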

The VAR model reveals relatively long predictability over the Arabian Peninsula, a node of the NH summer circumglobal wave train or teleconnection, even without tropical heating. The forecast skill in this region is significantly improved by including tropical heating in the VAR model. Several leading EOFs of the tropical heating are found to collectively contribute to the extended predictability in this region. The corresponding heating EOF patterns indicate the important role of relatively persistent heating anomalies located over the equatorial Pacific and the equatorial Atlantic. The underlying dynamics may be similar to the circumglobal propagation of a Rossby wave train analyzed by Saeed et al. (2011) for the summer monsoon circulation and Hoell et al. (2013) for the winter circulation. Further investigation, such as dynamical model experiments with prescribed diabatic heating anomalies in the above-mentioned regions, is required to explain the dynamic connections between the upper tropospheric circulation and the diabatic heating anomalies over the eastern equatorial Pacific, the equatorial Atlantic, and the Indian Ocean, as well as their relative importance. These dynamic connections call for accurate simulation of the tropical heating patterns and their impacts on atmospheric low frequency circulation in numerical weather forecast models, and provide an explanation for the long-lead forecast skill in regions such as the Arabian Peninsula in numerical forecast models.

Although the local atmospheric circulation response to tropical heating is primarily linear, its propagation and interaction with extra-tropical flow could bring in nonlinearity. The statistical model used in this study is linear and thus can be potentially improved by including nonlinear terms to capture the nonlinearity of the atmospheric circulation (e.g., Kravtsov et al. 2005). This topic will be explored further in the future.