1 Introduction

Understanding the stratospheric response to an applied zonal torque is a long-standing research problem. It is important as an idealised problem that is useful for forming conceptual ideas about stratospheric dynamics, for understanding phenomena such as the Brewer–Dobson circulation (e.g. Holton et al. 1995; Shepherd 2007; Plumb 2010) and for anticipating how changes in gravity wave parameterisations will affect models of the stratosphere (e.g. Cohen et al. 2013, 2014). The response of the zonal mean circulation to an applied torque in the absence of wave feedbacks is well-understood (Eliassen 1951; Plumb 1982; Garcia 1987; Haynes et al. 1991). However, feedbacks arising from features such as planetary waves seem to strongly affect the extratropical stratospheric response to natural forcings such as the quasi-biennial oscillation (QBO) (Watson and Gray 2014), to changes in gravity wave parameterisations (e.g. Holton 1984; McLandress and McFarlane 1993; Cohen et al. 2013, 2014; Sigmond and Shepherd 2014) and to increasing \(\hbox {CO}_{2}\) concentrations (Sigmond and Shepherd 2014), and these are not well-understood.

Here we examine the stratospheric responses to applied steady extratropical zonally symmetric zonal torques in a 3D primitive equation numerical model of the middle atmosphere, which explicitly calculates feedbacks arising from the large-scale dynamics. We consider torques with a simple structure, so that the dynamics can be more easily understood. We mostly consider westerly torques, but we also show that the response is close to being equal and opposite if the torque is reversed in sign. Therefore our results are likely to apply directly to understanding easterly torques, such as those due to parameterised gravity wave drag (GWD) in atmospheric models. A westerly torque may also represent a reduction in GWD resulting from a change in the parameterisation scheme.

There are two main objectives to this work. The first is to gain a better understanding of how the interaction between the zonal mean and wave parts of the flow affects the stratospheric response to applied forcings. The second is to test the suggestion of Watson and Gray (2014) that the long-term mean response of the extratropical stratosphere to an arbitrary forcing will tend to closely resemble the structure of the stratosphere’s leading mode of low-frequency variability, the annular mode (AM) (e.g. Thompson and Wallace 1998; Kushner 2010). This may explain why the observed Northern Hemisphere (NH) responses to QBO, El Niño-Southern Oscillation (ENSO), solar cycle and volcanic influences are similar to the Northern annular mode (NAM) (e.g. Dunkerton and Baldwin 1991; Kodera 1995; Sassi et al. 2004; Labitzke 2005; Ruzmaikin et al. 2005; Watson and Gray 2014). For this second objective, the simple torques we study are useful to test whether the extratropical response to a simple forcing is also robustly NAM-like. If the suggestion of Watson and Gray (2014) is correct, then it offers a simple conceptual way to think about the action of feedbacks on long time scales in general cases.

Regarding the first objective, understanding wave-mean flow interaction in the context of the circulation response to a zonally symmetric torque could help understanding of more complex problems, such as the effect of GWD. The influence of the wave component of the flow on the zonal mean part is well-understood. Following Andrews et al. (1987), the transformed Eulerian mean (TEM) zonal momentum equation, derived using the full primitive equations of motion, is

$$\begin{aligned} \frac{\partial \overline{u}}{\partial t} + \overline{v}^{*} \left[ (a \cos \phi )^{-1} (\overline{u} \cos \phi )_{\phi } - f \right] + \overline{w}^{*} \overline{u}_{z} = D_{F} + \overline{X}. \end{aligned}$$
(1)

Here \(\overline{u}\) is the zonal mean zonal wind, \((\overline{v}^{*},\overline{w}^{*})\) is the residual meridional circulation, \(a\) is the Earth’s radius, \(f\) is the Coriolis parameter, \(\overline{X}\) is the zonal mean mechanical forcing, \(t\) is time, \(\phi\) is the latitude and \(z\) is log-pressure height. Subscripts denote partial differentiation with respect to the subscripted variable.

$$\begin{aligned} D_{F}=(\rho _{0} a \cos \phi )^{-1} \nabla \cdot \mathbf {F} \end{aligned}$$
(2)

represents wave driving of the zonal flow. \(\mathbf {F}\) is the Eliassen–Palm (EP) flux, and \(\rho _{0}\) is a reference density profile.

\(D_{F}\) is typically negative in the winter extratropical stratosphere, and this weakens \(\overline{u}\) and drives a poleward meridional circulation (Eliassen 1951; Plumb 1982; Garcia 1987; Haynes et al. 1991). Moreover, Haynes et al. (1991) showed that in the steady state limit, the meridional circulation response is zero everywhere except directly below the region where the right hand side (RHS) of Eq. 1 is non-zero, so that only one overturning cell exists directly below this region. The effect of an additional mechanical zonal torque \((\delta \overline{X})\), with the \(D_{F}\) perturbation associated with wave feedbacks \((\delta D_{F})\) also added to the RHS of Eq. 1, can similarly be predicted: if \(\delta \overline{X}+\delta D_{F}\) is negative, the effect is to weaken \(\overline{u}\) and drive a poleward residual circulation; if the sum is positive then the effect is the opposite. \(\delta \overline{X}\) may represent the zonal mean zonal acceleration due to parameterised GWD in an atmospheric model, for example. \(\delta D_{F}\) is then the contribution to the change in EP flux divergence from the resulting planetary wave response.

A full understanding of the response to an applied torque \(\delta \overline{X}\) requires understanding the wave feedbacks, which have not been clearly elucidated in previous studies. One theoretical framework is “ray theory”. Following Andrews et al. (1987) again, under the assumptions that the wave part of the circulation is small and varies quickly in space and time compared to the background zonal flow, the wave part can be considered to consist of propagating eddies. These eddies trace ray paths that bend towards regions of greater “refractive index” in the meridional plane (Matsuno 1970; Karoly and Hoskins 1982). For stationary waves, which dominate wave forcing of the NH extratropical stratosphere, the refractive index is given by

$$\begin{aligned} n^{2} = \frac{\overline{q}_{\phi }}{a \overline{u}} - \frac{s^{2}}{a^{2} \cos ^{2}\phi } - \frac{f^{2}}{4N^{2}H^{2}}, \end{aligned}$$
(3)

where \(s\) is the zonal wavenumber, \(N^{2}\) is a reference static stability, assumed to be constant, and \(H\) is the pressure scale height.

$$\begin{aligned} \overline{q}_{\phi } = 2\varOmega \cos \phi - \left[ \frac{(\overline{u} \cos \phi )_{\phi }}{a \cos \phi } \right] _{\phi } - \frac{a}{\rho _{0}} \left( \frac{\rho _{0} f^{2}}{N^{2}} \overline{u}_{z} \right) _{z} \end{aligned}$$
(4)

is the meridional potential vorticity (PV) gradient, with \(\varOmega\) the Earth’s angular velocity about its spin axis. Under the same assumptions, the group velocity of these waves is parallel to \(\mathbf {F}\), so \(n^{2}\), together with knowledge of the wave sources, can be used to predict the direction of \(\mathbf {F}\). Despite the fact that the assumptions upon which this derivation is based are not realistic in the stratosphere, where wave amplitudes and wavelengths are large, this framework has been found to be useful for understanding qualitatively the climatology of \(\mathbf {F}\) (Matsuno 1970; Karoly and Hoskins 1982) and its changes during stratospheric sudden warmings (SSWs) (e.g. Palmer 1981; Butchart et al. 1982; McIntyre 1982; O’Neill and Youngblut 1982).
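As an illustration of how Eqs. 3 and 4 might be evaluated on gridded model output, the following is a minimal sketch rather than the diagnostic code used in this study. It assumes \(\overline{u}\) is stored as an array of shape (height, latitude), that \(\rho _{0}\propto e^{-z/H}\), and that \(N^{2}=4\times 10^{-4}\,\hbox {s}^{-2}\); the function names, the finite-difference scheme (`np.gradient`) and these values are all assumptions made for the example.

```python
import numpy as np

A_EARTH = 6.371e6   # Earth's radius a (m)
OMEGA = 7.292e-5    # Earth's angular velocity (s^-1)
H_SCALE = 6950.0    # pressure scale height H (m)
N2 = 4.0e-4         # constant static stability N^2 (s^-2); assumed value


def pv_gradient(u, lat_deg, z):
    """Meridional PV gradient (Eq. 4) for u of shape (nz, nlat); z in metres."""
    phi = np.deg2rad(lat_deg)
    cos_phi = np.cos(phi)
    f = 2.0 * OMEGA * np.sin(phi)
    rho0 = np.exp(-z / H_SCALE)[:, None]  # assumed reference density profile
    # Second term of Eq. 4: d/dphi of [(u cos phi)_phi / (a cos phi)]
    inner = np.gradient(u * cos_phi, phi, axis=1) / (A_EARTH * cos_phi)
    curvature = np.gradient(inner, phi, axis=1)
    # Third term of Eq. 4: (a / rho0) d/dz [rho0 f^2 / N^2 * du/dz]
    u_z = np.gradient(u, z, axis=0)
    vertical = (A_EARTH / rho0) * np.gradient(rho0 * f**2 / N2 * u_z, z, axis=0)
    return 2.0 * OMEGA * cos_phi - curvature - vertical


def refractive_index_squared(u, lat_deg, z, s):
    """Stationary-wave refractive index squared (Eq. 3) for zonal wavenumber s."""
    phi = np.deg2rad(lat_deg)
    f = 2.0 * OMEGA * np.sin(phi)
    q_phi = pv_gradient(u, lat_deg, z)
    return (q_phi / (A_EARTH * u)
            - s**2 / (A_EARTH * np.cos(phi))**2
            - f**2 / (4.0 * N2 * H_SCALE**2))
```

The first term of Eq. 3 is singular where \(\overline{u}=0\), so in practice points near the zero-wind line would need to be masked before plotting. Values near the edges of the latitude range are also less accurate because one-sided differences are used there.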

Regarding the second objective of this study, it has been found that the leading mode of variability resembles the responses to many different forcings in the Lorenz (1963) system (that which gives rise to the famous Lorenz butterfly attractor) (Palmer 1999; Palmer and Weisheimer 2011) and in models of the troposphere (Son and Lee 2006; Ring and Plumb 2007, 2008; Branstator and Selten 2009). Watson and Gray (2014) suggested that the stratosphere may behave in an analogous way and that this could explain why the NH extratropical responses to different natural forcings are NAM-like. If the suggestion is correct then the long-term mean stratospheric response to an applied torque will also be NAM-like.

The dynamics of the Lorenz (1963) system and the troposphere are quite different to the dynamics of the stratosphere, and the experiments performed here test whether the large-scale dynamics of the stratosphere displays the same behaviour. This behaviour is not fully general in the troposphere: for example, Son and Lee (2006) found that the steady state tropospheric response to heating perturbations in a general circulation model (GCM) is not always NAM-like, and Woollings (2008) showed that the NH extratropical tropospheric response to anthropogenic forcings simulated by various climate models is often not NAM-like. Understanding the circumstances in which this behaviour is manifested in the stratosphere may shed light on why it sometimes does and sometimes does not occur in other parts of the climate system as well.

The studies of Song and Robinson (2004) and Chen and Zurita-Gotor (2008) have previously shown the response to stratospheric zonal torques in models of the troposphere and stratosphere with zonally-symmetric boundary conditions, with their main objective being to examine the tropospheric response. However, the models that were used had weak planetary waves and are unlikely to have correctly simulated the changes in stratospheric waves. Cohen et al. (2013) examined the effect of changing parameters in a momentum-conserving gravity wave parameterisation in an atmospheric primitive equation model with zonal wavenumber-2 topography. They found that the EP flux due to resolved planetary waves adjusts to cancel out most of the change in the GWD. They found that this also occurs in the steady state response to a fixed zonally symmetric zonal torque. The experiments performed here extend upon this work, and complement the recent investigation of Cohen et al. (2014) into the mechanisms behind the cancellation.

The structure of the paper is as follows. Section 2 describes the model and experimental method. Section 3 presents the steady state and transient circulation responses to a variety of simple applied torques in perpetual January (PJ) simulations. Section 3.1 shows that the extratropical steady state circulation responses to torques placed at high latitudes are often NAM-like, and the transient \(\overline{u}\) and planetary wave responses are described in Sects. 3.2 and 3.3. Section 4 examines the responses in runs with a seasonal cycle (SC), for which the long-term responses do not appear as NAM-like. The wave feedbacks are found to be qualitatively similar to those in the PJ simulations but weaker, suggesting that the strength of the feedbacks is important for determining whether the long-term response becomes NAM-like. In Sect. 5 it is argued that the wave feedbacks can be understood through the torques’ direct effect on \(\overline{q}_{\phi }\) and \(n^{2}\), and how this affects wave propagation using ray theory. This indicates that the strength of the wave feedback is largely determined by the \(\overline{u}\) climatology and wave sources at the tropopause. It is argued that differences in the \(\overline{u}\) climatology between the PJ and SC experiments explain the differences in the simulated response to torques in each case. We show in Sect. 6 that the steady-state \(\overline{u}\) responses to perturbations in tropospheric wave forcing in PJ conditions are also NAM-like, supporting the hypothesis that arbitrary forcings will tend to give a NAM-like response when wave feedbacks are strong. Extended discussion of the results is given in Sect. 7 and the main conclusions are summarised in Sect. 8.

2 Model and methods

2.1 The Stratosphere–Mesosphere Model

The UK Met Office Stratosphere–Mesosphere Model (SMM) is a global primitive equation model of the stratosphere and mesosphere. The configuration used here is identical to that used by Gray et al. (2001), with \(5^{\circ }\) resolution in latitude and longitude and 2 km resolution in log-pressure height over the domain 0.01–100 hPa (very nearly 16–80 km, with \(H=6.95\,\hbox {km}\)). At 100 hPa the geopotential height (GPH) is specified and enters into forcing terms in the zonal and meridional momentum equations. The radiative contribution to the diabatic heating rate is computed by the MIDRAD scheme (Shine 1987). Gravity wave drag is parameterised simply by Rayleigh friction terms in the zonal and meridional momentum equations, with time scale varying smoothly from 116 days below 50 km to about 2 days in the upper mesosphere.

This model allows the stratospheric dynamical response to be isolated, since wave generation by the troposphere is unchanged by the torques. Thus the behaviour of the internal middle atmosphere dynamics can be seen clearly. This is especially important for examining whether feedbacks from stratospheric dynamics tend to act to make the response NAM-like. The SMM is also computationally cheap enough for the response to many different torques to be examined under both PJ conditions and with a seasonal cycle.

The SMM has been shown to capture stratospheric variability reasonably well (e.g. Butchart et al. 1982; Fairlie et al. 1990; Fisher et al. 1993; O’Neill and Pope 1993) and has been used extensively to study stratospheric dynamics (e.g. Scaife and James 2000; Gray et al. 2001, 2004).

The Rayleigh friction does not conserve zonal momentum and this can give rise to physically unrealistic downward influences that may affect the response to applied forcings (Shepherd et al. 1996; Shepherd and Shaw 2004). Repeating selected experiments with zero or weakened Rayleigh friction north of 20N shows that the results are not strongly affected by the presence of Rayleigh friction (not shown).

2.2 Experimental method

Control runs were performed both under PJ conditions and with a seasonal cycle, defined by particular specifications of the solar radiation and 100 hPa GPH: Sects. 2.2.1 and 2.2.2 describe these “standard” control runs. To examine the responses to applied torques, runs were performed with an additional forcing term in the zonal momentum equation, as described in Sect. 2.2.3. Data were sampled at daily intervals.

The specified 100 hPa GPH was derived from a 240-year pre-industrial control run of the HadGEM2-CCS GCM, described by Watson and Gray (2014). The climatological 100 hPa GPH in HadGEM2-CCS is similar to that in ERA-40 (Uppala et al. 2005) (not shown) and this model exhibits realistic stratospheric variability (Mitchell et al. 2012; Osprey et al. 2013).

2.2.1 Perpetual January control run

The model was run for 40 years with radiative conditions set to those for January 15 in the MIDRAD scheme. The first two years were discarded in order to allow the model to adjust to the forcing conditions. The imposed 100 hPa GPH was the January climatology of that in the HadGEM2-CCS control run, with the zonally asymmetric component multiplied by a factor of 2 (Fig. 1). Climatological averaging smooths bursts of wave activity, so multiplying by this factor was found to be necessary in order for the model to exhibit irregular vacillations that are qualitatively similar to observed stratospheric variability. The wavenumber-1 and 2 amplitudes at 62.5N are 250 and 360 m respectively, which are similar to the amplitudes of specified bottom boundary single-wavenumber GPH eddies used in previous studies of stratospheric internal variability (e.g. Gray et al. 2003; Scott and Polvani 2006).

Fig. 1
a Specified 100 hPa GPH used as the bottom-boundary forcing of the SMM in most of the perpetual January simulations, and b the zonally asymmetric component

Figure 2a shows the climatological \(\overline{u}\) and the difference from the ERA-40 January climatology. The modelled polar vortex is considerably weaker than that in ERA-40 and is further equatorward, with the peak \(\overline{u}\) around 30N in the middle and upper stratosphere compared to around 60N in ERA-40. Figure 2b shows the standard deviation of \(\overline{u}\), which is up to \(\sim 10\,\hbox {ms}^{-1}\) less than that in ERA-40 in the NH extratropical mid-stratosphere, though it still peaks at high latitudes despite the vortex being further south. In the tropical stratosphere the modelled standard deviation is up to \(\sim 20\,\hbox {ms}^{-1}\) too small due to the absence of the QBO and SAO. Figure 2c shows a typical time-height section of \(\overline{u}\) at 57.5N over 365 days, illustrating the irregular vacillation cycles with \(\overline{u}\) reversals and intensifications that often propagate down to the lower stratosphere, in a qualitatively similar way to observed variability. The EP flux climatology (Fig. 2d) is qualitatively similar to that in ERA-40 (not shown). However, it is too poleward in the high-latitude lower stratosphere and is smaller, with the upward component being ~25–40 % weaker in the model than in ERA-40 in the lower stratosphere. \(D_{F}\) is represented reasonably in the model, though. The \(\overline{q}_{\phi }\) climatology (Fig. 2e) is greatest near ~30N, too far equatorward compared to ERA-40, consistent with the vortex being too far equatorward (Fig. 2a).

Fig. 2
The climate of the perpetual January SMM standard control run. a Mean \(\overline{u}\) (contours, in \(\hbox {ms}^{-1}\)) and the difference from the ERA-40 January climatology (colours) and b the same for the standard deviation of \(\overline{u}\). c Time-height section of \(\overline{u}\) at 57.5N for 1 year. d The mean NH EP flux (arrows, shown every 6 km in height) and \(D_{F}\) (colours) below 50 km. A reference arrow in the top left corner has a magnitude of \((5\times 10^{6},\,5\times 10^{4})\,\hbox {kg}/\hbox {s}^{2}\). e Mean NH \(\overline{q}_{\phi }\) (contours, in \(10^{-5}\,\hbox {s}^{-1}\)) and the difference from ERA-40 (colours). \(D_{F}\) and \(\overline{q}_{\phi }\) at 87.5N are not plotted as the differentiation error is large there

Overall, therefore, the PJ control run has a qualitatively reasonable extratropical stratosphere but quantitatively it is quite different from January observations. Since the SC control run exhibits a wintertime climatology in much closer agreement with observations (Sect. 2.2.2), this is likely to be due to the PJ boundary conditions, which are not realistic.

Figure 3 shows the regression of monthly mean \(\overline{u}\), GPH at 32 km (10 hPa) and EP flux and \(D_{F}\) onto the NAM index in the PJ control run. The NAM index is defined in a similar way to that in Watson and Gray (2014) and is the leading principal component of monthly mean pressure- and area-weighted 3D GPH between 16–48 km north of 20N. The sign convention is opposite to that of the usual definition, so that a more positive NAM index corresponds to weaker high-latitude \(\overline{u}\), for easier comparison with the \(\overline{u}\) responses to torques (Sect. 3). The \(\overline{u}\) NAM signature is a dipole with negative \(\overline{u}\) north of about 30N at 30 km and positive \(\overline{u}\) further south, with the positive \(\overline{u}\) tilting northwards with increasing height (Fig. 3a). This is qualitatively similar to that derived from observations (e.g. Kodera 1995; Kushner 2010) and to the leading EOF of \(\overline{u}\) in this control run (not shown).
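The index construction and regression described above can be sketched as follows. This is an illustrative outline only: it assumes the monthly mean GPH anomalies have already been flattened to an array of shape (time, grid point) and that the combined pressure and area weights are supplied as a single vector. The function names `leading_pc` and `regress_onto_index` are hypothetical, and applying the weights through their square root is one common convention, not necessarily the one used here.

```python
import numpy as np


def leading_pc(anom, weights):
    """Standardised leading principal component of weighted anomalies.

    anom: array (ntime, npoints); weights: array (npoints,), non-negative.
    """
    anom = anom - anom.mean(axis=0)
    # Scaling by sqrt(weights) makes the covariance matrix weight-proportional
    weighted = anom * np.sqrt(weights)
    left, svals, _ = np.linalg.svd(weighted, full_matrices=False)
    pc = left[:, 0] * svals[0]
    return pc / pc.std()


def regress_onto_index(field, index):
    """Regression slope of field (ntime, ...) per standard deviation of index."""
    idx = (index - index.mean()) / index.std()
    anom = field - field.mean(axis=0)
    return np.tensordot(idx, anom, axes=(0, 0)) / idx.size
```

The sign of a principal component returned by an SVD is arbitrary, so the index would need to be multiplied by -1 where necessary to match the sign convention adopted in the text.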

Fig. 3
The NAM in the perpetual January SMM standard control run. a Regression of monthly mean \(\overline{u}\) onto the NAM index. b Similar regressions of EP flux (arrows) and \(D_{F}\) (colours). The reference arrow in the top left of (b) represents a flux \((F^{\phi },F^{z})=(5\times 10^{5},5\times 10^{3})\,\hbox {kg}/\hbox {s}^2\). \(D_{F}\) at 87.5N is not plotted as the differentiation error is large there

The NAM signature of 32 km GPH in the SMM (not shown) is also fairly similar to the signature derived from ERA-40 in January (Watson and Gray 2014), although the GPH signature is less zonally symmetric in the model. The EP flux and \(D_{F}\) NAM signatures (Fig. 3b) exhibit poleward and divergent flux above ~35 km that turns downward near the pole, with equatorward, convergent flux below. The EP flux signature is quite different to that in ERA-40 (Watson and Gray 2014), likely due to the PJ boundary conditions.

2.2.2 Seasonal cycle control run

A 45-year run was performed with a seasonal cycle of radiative conditions. Daily mean 100 hPa GPH values from the pre-industrial control run of HadGEM2-CCS were imposed, with linear interpolation between the middle of each day. The first two years were discarded.

Figure 4a shows the climatological November–February mean \(\overline{u}\) of the SC control run and the difference from the ERA-40 climatology. The vortex has a realistic structure with maximum \(\overline{u}\) near 60–70N. It is slightly too weak in the mid-stratosphere by \(\sim 5\,\hbox {ms}^{-1}\). Figure 4b shows the standard deviation of the November–February mean \(\overline{u}\), showing that the model displays realistic variability in the extratropics, and again too little in the tropics due to the absence of the QBO and SAO. Figure 4c shows an example time-height section of the 57.5N \(\overline{u}\) for one winter in the SC control run, showing the vortex strengthening into January, followed by an SSW in early March. Figure 4d shows a similar sequence of events in the 1983–4 winter in ERA-40, indicating that the model variability is physically realistic. Overall there are \(7.5\,\hbox {SSWs/decade}\) according to the criterion of Charlton and Polvani (2007), which is similar to the frequency of about 6/decade in ERA-40, and the fraction of SSWs falling in each month November–March is also similar to that in ERA-40 (not shown). Figure 4e shows the seasonal cycle of \(\overline{u}\) at 60N and 32 km in the SC control run, along with its standard deviation and its range, alongside that at 60N and 10 hPa in ERA-40, indicating that the model seasonal cycle in the mid-stratosphere has a reasonable time evolution and variability, but \(\overline{u}\) is too weak by \(\sim 5\,\hbox {ms}^{-1}\).

Fig. 4
The climate of the SMM standard control run with a seasonal cycle. a November–February mean \(\overline{u}\) (contours, in \(\hbox {ms}^{-1}\)) and the difference from the ERA-40 November–February climatology (colours) and b same for the standard deviation of \(\overline{u}\). c Time-height section of \(\overline{u}\) at 57.5N over November–March of one winter. d same for the 60N \(\overline{u}\) in ERA-40 in the 1983–1984 winter for comparison. e Climatological seasonal cycle of \(\overline{u}\) interpolated onto \((60\hbox {N},\,32\,\hbox {km})\) in the SMM (black solid line) and the same for the \((60\hbox {N},\,10\,\hbox {hPa})\,\overline{u}\) in ERA-40 (red solid line). Dashed lines show the mean \(\overline{u}\) plus or minus one standard deviation in the SMM (black) and ERA-40 (red). Dotted lines show extreme values

In common with the PJ control run, the EP flux climatology (Fig. 5a) is in good qualitative agreement with that in ERA-40, but the flux is too small by ~25–40 %. The \(\overline{q}_{\phi }\) climatology (Fig. 5b) is similar to that in ERA-40, maximising around ~60–70N, though it is slightly too small in much of the high-latitude mid-stratosphere.

Fig. 5
Climatological EP flux and \(\overline{q}_{\phi }\) in the SMM standard control run with a seasonal cycle. a November–February mean NH EP flux (arrows, shown every 6 km in height) and \(D_{F}\) (colours) below 50 km. The reference arrow in the top left corner has a magnitude of \((5\times 10^{6},\,5\times 10^{4})\,\hbox {kg}/\hbox {s}^{2}\). b November–February mean NH \(\overline{q}_{\phi }\) (contours, in \(10^{-5}\,\hbox {s}^{-1}\)) and the difference from ERA-40 (colours). \(D_{F}\) and \(\overline{q}_{\phi }\) at 87.5N are not plotted as the differentiation error is large there

The NAM signature of November–February mean \(\overline{u}\) in this control run (not shown) is similar to that in the PJ control run (Fig. 3a).

2.2.3 Runs with applied torques

A term of the form

$$\begin{aligned} A_{0}(\phi _{0},\delta \phi ,z_{0},\delta z) \exp \left( - \left[ \frac{\phi - \phi _{0}}{\delta \phi } \right] ^{2} - \left[ \frac{z-z_{0}}{\delta z} \right] ^{2} \right) \end{aligned}$$

was added to the zonal momentum equation, where \((\phi _{0},\,z_{0})\) is the position of the torque maximum, and \(\delta \phi\) and \(\delta z\) set the meridional and vertical scales of the applied torque respectively. \(A_{0}\) is a constant chosen such that (except in experiments designed to test the effect of varying the torque magnitude) the total zonal momentum added is equal to that of a torque with \(\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km},\,\delta \phi =11^{\circ },\,\delta z=10\,\hbox {km}\) and \(A_{0}=2.5\,\hbox {m/s/day}\). This will be referred to as the “standard torque”. This torque strength was found to produce responses in quantities such as \(\overline{u}\) of the order of one standard deviation in the control runs, which is comparable to the magnitude of the influence of natural forcings such as the QBO on the vortex in the real atmosphere. These \(\delta \phi\) and \(\delta z\) values are the same as those used by Ring and Plumb (2007) and Song and Robinson (2004) respectively. A torque with oppositely signed \(A_{0}\) is also applied in the opposite hemisphere (with the sign of \(\phi _{0}\) reversed) so no net zonal momentum is added globally. (Excluding this was not found to affect the results substantially.) A variety of experiments were performed, where normally only one of the torque parameters is different from that of the standard torque, to test the sensitivity of the response to varying each individual parameter, though a few experiments have more than one parameter varied as explained in the text.
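The torque structure and the amplitude matching described above can be sketched as follows. This is a minimal illustration, not the model code. It assumes that "total zonal momentum added" means the integral of the torque weighted by \(\rho _{0}\cos \phi\) with \(\rho _{0}\propto e^{-z/H}\), evaluated over the NH on a uniform grid; that weighting, the grid and the function names are assumptions made for the example.

```python
import numpy as np


def gaussian_torque(lat_deg, z_km, phi0=60.0, z0=30.0, dphi=11.0, dz=10.0, a0=2.5):
    """Gaussian torque (m/s/day) on a (nz, nlat) grid; defaults give the standard torque."""
    lat = np.asarray(lat_deg)[None, :]
    z = np.asarray(z_km)[:, None]
    return a0 * np.exp(-((lat - phi0) / dphi) ** 2 - ((z - z0) / dz) ** 2)


def momentum_weight(lat_deg, z_km, scale_height_km=6.95):
    """Assumed weighting: reference density times cos(latitude)."""
    rho0 = np.exp(-np.asarray(z_km) / scale_height_km)[:, None]
    cos_phi = np.cos(np.deg2rad(lat_deg))[None, :]
    return rho0 * cos_phi


def matched_amplitude(lat_deg, z_km, **shape):
    """Peak amplitude giving the same weighted integral as the standard torque.

    shape may contain phi0, z0, dphi and dz for the torque being matched.
    """
    w = momentum_weight(lat_deg, z_km)
    standard = np.sum(w * gaussian_torque(lat_deg, z_km))
    unit = np.sum(w * gaussian_torque(lat_deg, z_km, a0=1.0, **shape))
    return standard / unit
```

For example, with `lat = np.arange(2.5, 90.0, 5.0)` and `z = np.arange(16.0, 82.0, 2.0)`, calling `matched_amplitude(lat, z, z0=40.0)` returns a peak amplitude larger than 2.5 m/s/day, because a torque centred higher up acts on less mass under this weighting.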

The steady state responses to torques under PJ conditions are presented in Sect. 3.1. These are defined as the time-mean differences over years 3–20 between 20-year runs with applied torques and the standard PJ control run.

Transient responses to the torques are examined in later sections. These are estimated using 90-day branch runs, performed with both PJ and seasonally varying boundary conditions, with initial conditions taken from the appropriate control run as described below. In these the torque strength was linearly increased from zero to full strength in the first ten days, then held constant. Raising the period of the increase to twenty days was found not to qualitatively change the results, nor did decreasing the period to zero apart from in the first few days when spurious oscillations in the uppermost model levels were apparent, so the results are not sensitive to varying this time.
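The linear increase of the torque strength in the branch runs amounts to multiplying the full-strength torque by a time-dependent factor. A minimal sketch is given below; the function name `ramp_factor` is hypothetical and time is measured in days from the start of the branch run.

```python
def ramp_factor(t_days, ramp_days=10.0):
    """Fraction of full torque strength applied at time t_days.

    Rises linearly from 0 to 1 over ramp_days, then stays at 1.
    A ramp period of zero switches the torque on at full strength immediately.
    """
    if ramp_days <= 0.0:
        return 1.0
    return min(max(t_days / ramp_days, 0.0), 1.0)
```

Setting `ramp_days=20.0` or `ramp_days=0.0` corresponds to the two sensitivity tests mentioned above.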

Branch runs under PJ conditions were started with initial conditions taken every 6 months from the 40-year standard PJ control run, not including the initial 2-year spin-up time. Thus 76 90-day branch runs were performed for each applied torque. The envelope of the autocorrelation function of \(\overline{u}\) at points in the extratropical stratosphere is about 0.2 or less at a lag of 6 months (not shown), so initial conditions taken 6 months apart are largely independent of each other.

SC branch runs were initiated at January 1 of each year of the SC control run, excluding the 2-year spin-up time. There were 43 90-day branch runs for each torque.

2.3 Statistical methods

Statistical significances of mean responses to torques were calculated using Monte Carlo (MC) methods. This takes into account the non-Gaussianity of the distribution of responses in our ensembles due to non-linear dynamics, which may cause methods that assume the distribution is normal to give misleading results.

Statistical significance of the steady state responses to torques in the PJ runs was calculated according to the null hypothesis that 6-month averages of the data have the same distribution in the runs with applied torques and the control run. The probability that the magnitude of each response would exceed that in the data under this null hypothesis was calculated at each grid point using an MC permutation test. A total of 1,000 surrogate control and perturbed run time series of the same length as those in the actual data were created by randomly combining 6-month averages from both the control and perturbed run data in each surrogate. Differences in the data were considered statistically significant at the 95 % level if they fell below the 2.5th percentile or exceeded the 97.5th percentile of the distribution of mean differences between the surrogate time series. Taking 6-month averages accounts for serial autocorrelation of the data; the calculated statistical significance is not very different if 12-month averages are used instead.
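A permutation test of this kind can be sketched as follows. This is an illustrative outline, not the code used for the results. It assumes the 6-month averages are stored with the time-block dimension first, and it applies the same random reshuffling to all grid points within each surrogate; the function name and that choice are assumptions.

```python
import numpy as np


def permutation_significance(control, perturbed, n_surrogates=1000, seed=0):
    """Two-sided permutation test on the difference of means.

    control: array (n_c, ...) of block averages from the control run.
    perturbed: array (n_p, ...) of block averages from the perturbed run.
    Returns the observed mean difference and a boolean mask that is True
    where the difference lies outside the 2.5th-97.5th percentile range
    of the surrogate differences.
    """
    rng = np.random.default_rng(seed)
    n_c = control.shape[0]
    pooled = np.concatenate([control, perturbed], axis=0)
    observed = perturbed.mean(axis=0) - control.mean(axis=0)
    surrogate = np.empty((n_surrogates,) + observed.shape)
    for i in range(n_surrogates):
        # Randomly reassign block averages to surrogate control/perturbed sets
        shuffled = pooled[rng.permutation(pooled.shape[0])]
        surrogate[i] = shuffled[n_c:].mean(axis=0) - shuffled[:n_c].mean(axis=0)
    lower, upper = np.percentile(surrogate, [2.5, 97.5], axis=0)
    return observed, (observed < lower) | (observed > upper)
```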

Statistical significances of the transient responses were calculated using an MC bootstrap test. At each grid point, a surrogate data sample was generated according to the null hypothesis that the mean response is zero but other moments of the true distribution of responses equal those in the data. The mean of the responses for all pairs of branch and control runs was subtracted from the response for each pair, and the results were resampled with replacement to produce surrogate ensembles of the same size as the originals. The probability of the mean of this resampled data being larger than that for the real data was estimated using 1,000 data resamplings.
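The bootstrap test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the responses for each pair of branch and control runs are stored with the ensemble-member dimension first, and the comparison is made on magnitudes, giving a two-sided test. The function name and the two-sided choice are assumptions.

```python
import numpy as np


def bootstrap_p_value(responses, n_resamples=1000, seed=0):
    """Bootstrap p-value for the null hypothesis of zero mean response.

    responses: array (n_members, ...) of branch-minus-control differences.
    Returns the fraction of resampled means, drawn from the mean-removed
    data, whose magnitude is at least that of the observed mean.
    """
    rng = np.random.default_rng(seed)
    n = responses.shape[0]
    observed = responses.mean(axis=0)
    centred = responses - observed  # enforce the zero-mean null hypothesis
    exceed = np.zeros(np.shape(observed))
    for _ in range(n_resamples):
        sample = centred[rng.integers(0, n, size=n)]
        exceed += np.abs(sample.mean(axis=0)) >= np.abs(observed)
    return exceed / n_resamples
```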

Confidence intervals of ensemble-mean quantities (in Figs. 15–17) were calculated using an MC bootstrap method in which the responses for the individual ensemble members were resampled with replacement to produce surrogate ensembles of the same size as the originals. The 95 % confidence intervals are the range between the 2.5th–97.5th percentiles of the distribution of the means of 1,000 such surrogate data samples.
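A bootstrap confidence interval of this form can be sketched as below. It is an illustrative outline only, assuming the ensemble-member dimension comes first; the function name is hypothetical.

```python
import numpy as np


def bootstrap_confidence_interval(responses, n_resamples=1000, seed=0):
    """95 % bootstrap confidence interval of the ensemble mean.

    responses: array (n_members, ...).
    Returns an array of shape (2, ...) holding the 2.5th and 97.5th
    percentiles of the means of the resampled ensembles.
    """
    rng = np.random.default_rng(seed)
    n = responses.shape[0]
    means = np.empty((n_resamples,) + responses.shape[1:])
    for i in range(n_resamples):
        # Resample ensemble members with replacement, keeping the ensemble size
        means[i] = responses[rng.integers(0, n, size=n)].mean(axis=0)
    return np.percentile(means, [2.5, 97.5], axis=0)
```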

3 Perpetual January run responses

3.1 Steady state responses

Figure 6 shows the steady state \(\overline{u}\) response to applied torques under PJ conditions. Figure 6a shows responses to westerly torques centred at different positions in the high-latitude stratosphere, Fig. 6b shows responses to westerly torques centred at \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) with different meridional and height scales, and Fig. 6c shows responses to three westerly torques centred at \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) with different magnitudes and one easterly torque.

Fig. 6
Steady state \(\overline{u}\) responses (colours) to applied torques (contours, at \(\pm 0.5,\pm 2,\pm 4,\pm 6\ldots \,\hbox {m/s/day}\), with westerly contours solid and easterly contours dashed) under perpetual January conditions. a Responses to torques centred at different locations in the high-latitude stratosphere, b the responses to torques with different meridional and vertical scales centred at (60N, 30 km), and c the responses to torques with different peak strengths centred at (60N, 30 km). A NAM-like response is displayed in most cases, with mostly easterly high-latitude \(\overline{u}\) responses to westerly torques. The number in the top left corner of each panel is the anomaly correlation north of 20N between the \(\overline{u}\) response and its NAM signature in the standard control run (Fig. 3a). Plot titles indicate the latitude \(\phi _{0}\) and height \(z_{0}\) where the torque is strongest. Unless otherwise specified in the title, the torques have meridional scale \(\delta \phi = 11^{\circ }\), vertical scale \(\delta z=10\,\hbox {km}\), and a strength such that the magnitude of the total zonal momentum added to the NH equals that of the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque with peak strength \(2.5\,\hbox {m/s/day}\). Responses are not plotted where they are not statistically significant above the 95 % level according to the test described in Sect. 2.3

Notably, applying a westerly torque at high latitudes results in an easterly \(\overline{u}\) response in most of the high-latitude stratosphere, which shows that non-linear feedbacks play a very important role in bringing about this response. The \(\overline{u}\) responses share many characteristics with the NAM signature: the responses to applied westerly torques all show a mostly easterly \(\overline{u}\) response at high latitudes and a westerly \(\overline{u}\) response in the subtropics, with the region of westerly response sloping poleward with increasing height. The anomaly correlations (Footnote 2) between the \(\overline{u}\) responses and the NAM signature (Fig. 3a) north of 20N are also written on the panels; they are larger than 0.8 in nine out of twelve cases and larger than 0.9 in seven. However, many responses have the same sign as the torques just below where the torques peak, which is a departure from the NAM signature.
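The anomaly correlation quoted on each panel can be illustrated with a minimal sketch. The definition used here (an uncentred, cos-latitude-weighted pattern correlation over all grid points north of 20N) is an assumption made for illustration and may differ from the definition given in the footnote; the function name and the synthetic test pattern are hypothetical.

```python
import numpy as np

def anomaly_correlation(response, signature, lat_deg, lat_min=20.0):
    """Pattern correlation of two (lat, height) fields north of lat_min.

    Assumptions (not taken from the paper): cos(latitude) area weights,
    and no removal of the area mean before correlating.
    """
    lat_deg = np.asarray(lat_deg, dtype=float)
    keep = lat_deg >= lat_min
    w = np.cos(np.deg2rad(lat_deg[keep]))[:, None]  # one weight per latitude row
    x = np.asarray(response, dtype=float)[keep]
    y = np.asarray(signature, dtype=float)[keep]
    num = np.sum(w * x * y)
    den = np.sqrt(np.sum(w * x * x) * np.sum(w * y * y))
    return num / den

# Synthetic stand-in for a NAM signature (not model output)
lat = np.linspace(-87.5, 87.5, 36)
z = np.linspace(0.0, 50.0, 11)
nam = np.sin(np.deg2rad(2.0 * lat))[:, None] * np.ones_like(z)[None, :]

same = anomaly_correlation(nam, nam, lat)            # identical pattern
opposite = anomaly_correlation(-2.0 * nam, nam, lat) # reversed sign, different amplitude
```

Because the measure is normalised, it is sensitive to the pattern and sign of the response but not to its amplitude.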

The responses to torques centred at \((\phi _{0}=60\hbox {N},\,z_{0}=40\,\hbox {km})\) and \((\phi _{0}=75\hbox {N},\,z_{0}=30\,\hbox {km})\) (Fig. 6a) have anomaly correlations with the NAM signature of only 0.75 and 0.44 respectively. However, it can still be seen visually that the main qualitative features of these responses are similar to those of the NAM signature, with a negative response in most of the extratropics and a positive subtropical response that tilts poleward with increasing height. The correlations may be relatively low because the peak torque magnitudes are relatively large: if the magnitudes of these torques are reduced by 60 %, the anomaly correlations rise to 0.85 and 0.70 respectively.

The response to the torque centred at \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) with peak strength 7.5 m/s/day (Fig. 6c) is less like the NAM signature in the control run. However, the NAM signature diagnosed from the run with the 7.5 m/s/day torque applied shows a better qualitative resemblance to the \(\overline{u}\) response, with positive \(\overline{u}\) between ~40–60N and ~16–50 km (not shown). This illustrates that when the applied forcing becomes large, the relationship between the response and the NAM signature becomes complicated due to the fact that the NAM signature is itself affected by the forcing.

The responses to torques centred in middle latitudes at \((\phi _{0}=30\hbox {N},\,z_{0}=30\,\hbox {km})\) and \((\phi _{0}=45\hbox {N},\,z_{0}=30\,\hbox {km})\) (not shown) are small and not NAM-like. Therefore the response only seems to be NAM-like for torques centred at high latitudes.

The steady state GPH responses to the torques in the middle stratosphere (not shown) are also NAM-like for torques that produce a NAM-like \(\overline{u}\) response. Responses on individual pressure levels have anomaly correlations of ~0.8–0.9 with the NAM signature in the latitude-longitude plane.

Overall, therefore, the steady state responses to applied torques centred at 60N or polewards closely resemble the NAM. The projection of the torque onto the NAM signature is opposite to the sign of the response of the NAM index (not shown), in contrast to what Ring and Plumb (2007) found for the tropospheric response to applied torques.

The responses to torques centred at \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) are quite linear with respect to varying the peak torque amplitude in the range 1–5 m/s/day, but non-linearity becomes substantial as the torque magnitude increases to 7.5 m/s/day (Fig. 6c). Changing the sign of the 2.5 m/s/day torque centred at (60N, 30 km) gives a \(\overline{u}\) response that is nearly equal and opposite (the centre-left panel of Fig. 6a and leftmost panel of Fig. 6c), indicating that our analysis also likely applies to easterly torques in general, such as that due to an increase in GWD, with the signs of the responses reversed.

The responses are not sensitive to the precise experimental set up. The steady state response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque is similar in runs with perpetual November, December and February radiative conditions and bottom boundary, and in PJ conditions with the eddy component of the prescribed 100 hPa GPH equal to 1.5 times that of the HadGEM2-CCS January climatology rather than two times (not shown).

3.2 Transient \(\overline{u}\) responses

Examining the NAM-like steady state responses to torques does not give any insight into how these responses come about. In order to better understand the underlying dynamics, the transient response is now examined. We focus on the response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque since this is qualitatively similar to the responses to all torques centred at 60N or polewards.

Figure 7 shows the ensemble-mean transient \(\overline{u}\) response at (60N, 30 km) to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque over the first 50 days, and the \(\pm 2\) standard error and \(\pm 2\) standard deviation ranges. The mean response is positive and the range of responses is very small up to ~day 15, in agreement with the expected direct response to the applied torque. Shortly after this the range increases substantially and not all ensemble members have a response of the same sign, due to the chaotic nature of the dynamics. The ensemble-mean response becomes negative just after day 30.
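The two ranges described above can be sketched as follows. This is a minimal illustration that treats the ensemble members as independent samples; the ensemble size and the synthetic data are placeholders, not values from the runs.

```python
import numpy as np

def ensemble_ranges(responses):
    """responses: array (n_members, n_days) of per-member responses at one point."""
    responses = np.asarray(responses, dtype=float)
    n = responses.shape[0]
    mean = responses.mean(axis=0)
    sd = responses.std(axis=0, ddof=1)  # sample standard deviation across members
    se = sd / np.sqrt(n)                # standard error of the ensemble mean
    return {
        "mean": mean,
        "se_range": (mean - 2.0 * se, mean + 2.0 * se),
        "sd_range": (mean - 2.0 * sd, mean + 2.0 * sd),
    }

# Synthetic data standing in for 100 members over 50 days (hypothetical sizes)
rng = np.random.default_rng(0)
demo = rng.normal(loc=1.0, scale=3.0, size=(100, 50))
ranges = ensemble_ranges(demo)
```

The standard-error range describes the uncertainty in the ensemble mean, whereas the standard-deviation range describes the spread of individual members, so the mean response can be well determined even when members disagree in sign.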

Fig. 7

The ensemble-mean transient \(\overline{u}\) response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque under perpetual January conditions interpolated to \((60\hbox {N},\,30\,\hbox {km})\) (solid black line), the mean \(\pm\) two standard errors (dashed black lines), the mean \(\pm\) two standard deviations (dotted black lines) and the torque (red line)

Figure 8 shows the ensemble-mean transient \(\overline{u}\) response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque as a function of latitude and height at various times. On day 1, the structure of the response resembles the torque very well near (60N, 30 km), consistent with the expected similarity between the response and the forcing on short time scales (Watson and Gray 2014). The response is already negative in the lower and middle stratosphere to the north and south of the torque position. Up to day 10, the positive response spreads northwards, but between days 10 and 15 there is a substantial zonal deceleration in the high latitude upper stratosphere, bringing the response there close to zero. A negative response develops in the high latitude upper stratosphere and descends with time from ~day 20, and a positive response develops in the subtropics to give an overall response that resembles the steady state response shown in Fig. 6 a short time after (although the response does not become entirely steady within the 90-day length of the runs).

Fig. 8

The ensemble-mean transient \(\overline{u}\) response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque under perpetual January conditions, averaged over the time periods of the branch runs indicated in the panel titles. Contours show the torque with levels at 0.05 and \(0.2\,\hbox {m/s/day}\) on day 1, at 0.25 and \(1\,\hbox {m/s/day}\) on day 5 and at 0.5 and \(2\,\hbox {m/s/day}\) from day 10 onwards. Responses are not plotted where they are not statistically significant above the 95 % level according to the test described in Sect. 2.3

3.3 Transformed Eulerian-mean diagnostics

In order to further investigate the dynamics of the stratospheric response, the transient response of each term in the TEM zonal momentum equation (Eq. 1) to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque is shown in Fig. 9. On day 1, \(\partial \overline{u}/\partial t\) (1st column) is dominated by the torque. The EP flux response (2nd column) is small, with its convergence increasing where the torque is being applied, associated with increased upward flux and equatorward flux from higher latitudes. The sum of the torque and \(D_{F}\) response (3rd column) is positive in most of the extratropics. As expected from the discussion in Sect. 1, the residual circulation response (4th column) is therefore equatorward in most of the stratosphere, with a negative associated acceleration term in Eq. 1:

$$\begin{aligned} -\overline{v}^{*} \left[ (a \cos \phi )^{-1} (\overline{u} \cos \phi )_{\phi } - f \right] - \overline{w}^{*} \overline{u}_{z}. \end{aligned}$$

This residual circulation response makes \(\partial \overline{u}/\partial t\) less than half the peak torque strength of 0.25 m/s/day near (60N, 30 km), and also causes \(\partial \overline{u}/\partial t\) to be negative in the low latitude lower and middle stratosphere.
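The acceleration term written above can be evaluated on a latitude-height grid with centred differences. The sketch below is an illustration only: the grid, the constants and the function name are assumptions, and it is not the diagnostic code used for Fig. 9.

```python
import numpy as np

A_EARTH = 6.371e6  # Earth radius (m), assumed value
OMEGA = 7.292e-5   # Earth rotation rate (rad/s), assumed value

def residual_circulation_acceleration(u, v_star, w_star, lat_deg, z_m):
    """-v*[(a cos(phi))^-1 (u cos(phi))_phi - f] - w* u_z, fields shaped (n_lat, n_z)."""
    phi = np.deg2rad(np.asarray(lat_deg, dtype=float))
    cosphi = np.cos(phi)[:, None]
    f = (2.0 * OMEGA * np.sin(phi))[:, None]            # Coriolis parameter
    d_ucos_dphi = np.gradient(u * cosphi, phi, axis=0)  # (u cos(phi))_phi
    du_dz = np.gradient(u, z_m, axis=1)                 # u_z
    return -v_star * (d_ucos_dphi / (A_EARTH * cosphi) - f) - w_star * du_dz

# Resting atmosphere with a uniform 1 m/s equatorward residual flow in the NH
lat = np.array([30.0, 45.0, 60.0, 75.0])
z = np.array([20e3, 30e3, 40e3])
u0 = np.zeros((lat.size, z.size))
v_eq = -np.ones_like(u0)
w0 = np.zeros_like(u0)
acc = residual_circulation_acceleration(u0, v_eq, w0, lat, z)
```

In this idealised case only the Coriolis term survives, so an equatorward residual flow in the Northern Hemisphere gives a negative (easterly) acceleration, consistent with the sign described in the text.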

Fig. 9

In columns from left to right are the ensemble-mean transient responses to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque under perpetual January conditions of: \(\partial \overline{u}/\partial t\), the EP flux (arrows) and \(D_{F}\) (colours), the sum of \(D_{F}\) and the applied torque, and the residual circulation (arrows) and associated acceleration (colours). Data are averaged over a different time period of the branch runs on each row. \(\partial \overline{u}/\partial t\) data are multiplied by a scaling factor as indicated in the plot titles. Contours in the first and third columns show the torque, with contours at 0.05 and \(0.2\,\hbox {m/s/day}\) in the first row, at 0.4 and \(1.6\,\hbox {m/s/day}\) in the second row and at 0.5 and \(2\,\hbox {m/s/day}\) from the third row onwards. In the second and fourth columns, arrows in the top left corner of each panel indicate the size of an arrow that represents the EP flux or residual circulation value written in brackets alongside. The EP flux and residual circulation response is only plotted where either the \(\phi\)- or \(z\)-component is statistically significant above the 95 % level, and are plotted every 6 km in height. Stippling in the first, second and fourth columns shows where responses in \(\partial \overline{u}/\partial t,\,D_{F}\) and the acceleration associated with the residual circulation are statistically significant above the 95 % level respectively (Sect. 2.3). \(D_{F}\) responses at 87.5N are not plotted as the differentiation error is large there

On days 6–10, \(\partial \overline{u}/\partial t\) is larger due to the torque growing linearly in the first 10 days. The EP flux response is also much larger, with a stronger convergence response near (60N, 30 km) that is more than half the strength of the torque. The residual circulation response is qualitatively similar to that on day 1 and stronger.

Over days 11–15, \(\partial \overline{u}/\partial t\) becomes negative in the extratropical upper stratosphere due to the strengthening of the EP flux convergence response. The EP flux convergence response in the middle stratosphere now also cancels most of the direct effect of the torque. As a consequence, the residual circulation response has changed qualitatively so that it is poleward in the upper stratosphere, with an equatorward return flow in the lower stratosphere.

The negative \(\partial \overline{u}/\partial t\) response is largest over days 16–20, when it also extends to the high-latitude lower stratosphere. After day 20, the extratropical \(\partial \overline{u}/\partial t\) is smaller, and the EP flux and residual circulation responses evolve slowly to become more like the steady state responses (not shown). The \(\overline{u}\) response in the high-latitude lower stratosphere becomes negative in this period (Fig. 8). In the subtropics, the \(\overline{u}\) response becomes positive and similar to the NAM signature in this region (Fig. 3a), and this is associated with the Coriolis force acting on the poleward residual circulation response (Fig. 9).

The EP flux response after day 20 resembles the NAM signature (Fig. 3b) only in some respects. Both exhibit a convergent equatorward EP flux in the mid-latitude stratosphere and a poleward flux in the uppermost extratropical stratosphere. However, the NAM signature does not display the convergent upward EP flux in the high-latitude lower stratosphere shown in the response.

Figure 10 shows time series of the ensemble-mean responses of the terms in Eq. 1 at (60N, 30 km) to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque. This shows that up to about day 15 the \(\partial \overline{u}/\partial t\) response here is driven by the torque and resisted by responses in the EP flux and residual circulation. After ~day 25, the EP flux convergence response tends to be larger than the torque, so the \(\partial \overline{u}/\partial t\) response tends to be negative, with the residual circulation response contributing positively. So it can be seen that the EP flux convergence response drives the easterly \(\overline{u}\) acceleration in the high-latitude lower stratosphere after ~day 25. These features of the responses are generally statistically significant (see stippling in Fig. 9). The local Rayleigh friction contribution is very small (not shown).

Fig. 10

Ensemble-mean transient responses of \(\partial \overline{u}/\partial t\) (black), the torque (green), \(D_{F}\) (orange), the sum of \(D_{F}\) and the torque (red) and the acceleration associated with the residual circulation (blue) to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque under perpetual January conditions as a function of time since initiation of each branch run, interpolated to (60N, 30 km). The \(D_{F}\) response becomes larger than the torque after ~day 25, and this drives the \(\overline{u}\) response negative

The response of the terms in the TEM zonal momentum equation to other torques centred at 60N and polewards (not shown) is qualitatively similar, with the EP flux responding to approximately cancel the applied torque and converge more in the upper stratosphere from about day 10, bringing about deceleration of \(\overline{u}\). The EP flux convergence responses to the \((\phi _{0}=60\hbox {N},\,z_{0}=40\,\hbox {km})\) and \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km},\,\delta z=2.5\,\hbox {km})\) torques do not become larger than the torques - in these cases the acceleration associated with the residual circulation response opposes the torque at all times, and the EP flux convergence response just becomes temporarily large enough for the \(\partial \overline{u}/\partial t\) response to become negative.

In summary, under PJ conditions the responses to torques are strongly affected by feedbacks from the wave part of the circulation. For torques placed at 60N or polewards, these feedbacks cause an EP flux convergence response that opposes the direct effect of the torque. This convergence response can temporarily become larger than the torque, and it causes the \(\overline{u}\) response to become negative in the polar lower stratosphere. Altogether this brings about NAM-like \(\overline{u}\) and GPH responses after a few weeks.

4 Responses in runs with a seasonal cycle

In this section, the response to a torque in the SC runs is examined and compared to that in PJ runs. Figure 11 shows the ensemble-mean transient \(\overline{u}\) response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque in SC branch runs up to day 60. The response in the first 25 days goes through approximately the same stages as that under PJ conditions in the first 15 days (Fig. 8), with acceleration initially at high latitudes and a negative \(\overline{u}\) response developing further south. However, the response remains positive in most of the polar stratosphere at later times.

Fig. 11

As in Fig. 8 but for the transient \(\overline{u}\) response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque in runs with a seasonal cycle. The response does not become as NAM-like as in the perpetual January runs or easterly at high latitudes

The \(\overline{u}\) response also does not become clearly NAM-like within 90 days. There is some qualitative similarity between the days 41–60 \(\overline{u}\) response and the NAM signature (cf. Fig. 3a), with a meridional dipole in the response and with the subtropical part sloping poleward with increasing height. However, the vertical structures of the response and the NAM signature are quite different at high latitudes. The subtropical part of the \(\overline{u}\) response is also larger than the high-latitude part, with the reverse being true for the NAM signature. The anomaly correlation between the days 41–60 \(\overline{u}\) response and the NAM signature of January–February mean \(\overline{u}\) in the standard control run is only \(-0.34\). \(\partial \overline{u}/\partial t\) is small by this time (not shown), so it seems unlikely that the response could become more NAM-like in the duration of a winter. Therefore the response is much less NAM-like in the SC experiments than in the PJ experiments.

Figure 12 shows the ensemble-mean transient response of the terms in the TEM zonal momentum equation to the same torque. Again this shows the same sequence of stages as for the PJ response (Fig. 9), with the torque initially causing \(\partial \overline{u}/\partial t\) to be greater, and a more equatorward middle and lower stratospheric residual circulation. The EP flux convergence increases in the region where the torque is strongest, and later it increases in the upper stratosphere, causing \(\overline{u}\) to decelerate there. However, the EP flux response is weaker and this sequence again unfolds more slowly than in the PJ case. It takes until days 16–20 for the EP flux response to largely cancel out the direct effect of the torque in the high-latitude middle stratosphere and to cause \(\partial \overline{u}/\partial t\) to become negative in some places north of 60N. Under PJ conditions these events happen during days 11–15. This difference may be because the zonal mean flow and EP flux co-evolve and respond to changes in each other—the feedbacks are weaker in the SC runs, which may make the time-derivatives of each smaller, so the sequence of changes proceeds more slowly. The EP flux convergence response in the high-latitude upper stratosphere also never becomes as large as it does under PJ conditions, and in the lower stratosphere north of ~70N it never becomes large enough to make \(\partial \overline{u}/\partial t\) negative, though it does temporarily become larger than the torque and make \(\partial \overline{u}/\partial t\) negative at (60N, 30 km) when averaged over days 26–60. This weaker EP flux response appears to be the reason why the \(\overline{u}\) response does not turn negative in the high-latitude lower and middle stratosphere in the SC runs, and hence why it does not appear NAM-like.

Fig. 12

As in Fig. 9 but for the transient responses to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque in runs with a seasonal cycle. Contours in the first and third columns showing the torque are plotted slightly differently to those in Fig. 9, with contours at 0.05 and 0.2 m/s/day in the first row, at 0.45 and 1.8 m/s/day in the second row and at 0.5 and 2 m/s/day from the third row onwards. The responses are qualitatively similar to those shown in Fig. 9, but the EP flux response is weaker

The responses to the same torques that were used in the PJ experiments were examined. The responses to torques centred at 60N or poleward are qualitatively similar to the response to the \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque (not shown), and reversing the sign of the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque approximately reverses the response. The responses are also qualitatively similar for branch runs begun on October 22 rather than January 1, except that the EP flux response is weaker and the sequence unfolds more slowly (not shown), likely because wave forcing at the lower boundary is weaker at this time.

5 Understanding the wave response

To summarise Sects. 3.3 and 4, the transient responses to westerly torques centred in the high-latitude stratosphere in the PJ and SC runs each proceed through three stages (with the response to an easterly torque being similar but opposite in sign):

  1. The torque causes acceleration of \(\overline{u}\) and induces EP flux convergence where the torque is strongest, and it drives a residual circulation that is equatorward in the stratosphere.

  2. The EP flux becomes more upward and convergent in the extratropical upper stratosphere, causing \(\overline{u}\) here to decelerate and the residual circulation response to become poleward in the upper stratosphere.

  3. The \(\overline{u}\) acceleration reduces to zero in the high-latitude stratosphere. In the case of the PJ runs it becomes temporarily strongly negative, due to EP flux convergence increasing, resulting in a very NAM-like \(\overline{u}\) response. The EP flux response is weaker in the SC runs, and this produces a very different long-term mean response that is less NAM-like.

It can be understood from the discussion in Sect. 1, and references therein, why the \(\partial \overline{u}/\partial t\) response tends to have the same sign as the sum of the torque and \(D_{F}\) response, and why the residual circulation responds such that the associated acceleration contribution tends to have the opposite sign (Figs. 9, 12). A complete explanation of the response must account for the change in the EP flux, however. In what follows we argue that the EP flux response, and the differences in the response between the PJ and SC experiments, can be understood using ray theory. This implies that these differences are largely due to differences in the \(\overline{u},\,\overline{q}_{\phi }\) and planetary wave climatologies in the two cases.

Figure 13a shows the transient response of the refractive index for stationary waves squared (\(n^{2}\), given by Eq. 3 using reference temperature \(T_{s}=240\,\hbox {K}\) and scale height \(H=6.95\,\hbox {km}\)) to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque after 1 day under PJ conditions, along with the zonal wavenumber-1 component of the EP flux response. We consider the different wave numbers separately because the predictions of ray theory apply to each wavenumber separately. The presented \(n^{2}\) response is a “trimmed mean” (Wilks 2006) of the responses over the ensemble members, where the top and bottom 10 values were excluded at each grid point to remove large outliers that arise after a few days at a few grid points because some ensemble members have small \(\overline{u}\) values, which makes \(n^{2}\) very sensitive to changes in \(\overline{u}\) and \(\overline{q}_{\phi }\). The important features identified in the following discussion are similar if the mean of the responses is used, and the features in the first few days can also be seen in the responses for individual ensemble members.
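The trimmed mean described above can be sketched in a few lines. This is a minimal illustration assuming the ensemble members lie along the first array axis; the function name and the synthetic sample are hypothetical.

```python
import numpy as np

def trimmed_mean(samples, n_trim=10):
    """Mean over axis 0 after dropping the n_trim smallest and largest values at each grid point."""
    samples = np.asarray(samples, dtype=float)
    n_members = samples.shape[0]
    if n_members <= 2 * n_trim:
        raise ValueError("need more than 2 * n_trim ensemble members")
    ordered = np.sort(samples, axis=0)  # sort independently at each grid point
    return ordered[n_trim:n_members - n_trim].mean(axis=0)

# 25 synthetic members at one grid point, one of them a large outlier
members = np.arange(25, dtype=float).reshape(25, 1)
members[-1, 0] = 1.0e6
robust = trimmed_mean(members, n_trim=10)  # averages the 5 central values
plain = members.mean(axis=0)               # dominated by the outlier
```

Sorting at each grid point separately means that different members may be excluded at different locations, which is what removes isolated outliers without discarding whole ensemble members.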

Fig. 13

a Trimmed-mean response of the squared refractive index for stationary waves (\(n^{2}\), colours) and zonal wavenumber-1 EP flux (arrows) to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque under perpetual January conditions after 1 day. Hatching shows where trimmed-mean \(n^{2}\) for zonal wavenumber-1 is negative in the control run. The reference arrow to the left of the plot represents EP flux (5,000, 50) \(\hbox {kg/s}^{2}\). b Response of the meridional PV gradient \(\overline{q}_{\phi }\). c Contribution to the response in \(\overline{q}_{\phi }\) associated with the “\(u_{\phi \phi }\) term” \(-\overline{u}_{\phi \phi }/a\) and d the same for the “\(u_{zz}\) term” \(-af^{2}\overline{u}_{zz}/N^{2}\), which dominate the \(\overline{q}_{\phi }\) response. These responses are consistent with EP flux convergence increasing near 60N because \(\overline{q}_{\phi }\) increases and enhances planetary wave propagation into this region

There is a very good agreement between the structure of the \(n^{2}\) change and the wavenumber-1 EP flux response expected according to ray theory between ~40–90N and ~16–40 km on day 1. Between ~50–70N, \(n^{2}\) increases, and it decreases to the north and south. Correspondingly the EP flux below and to the north and south of (60N, 30 km) is directed more into this region, and wavenumber-1 EP flux convergence increases here (not shown). There is also a large increase in \(n^{2}\) south of ~30N in the stratosphere, though this is not associated with a substantial change in the EP flux. The wavenumber-1 EP flux response is similar to the full EP flux response (Fig. 9). The wavenumber-2 EP flux response (not shown) is also qualitatively similar.

The \(D_{F}\) response is not directly predicted by ray theory, but it may be anticipated that EP flux convergence will tend to increase where wave propagation increases, for example due to dissipative processes acting on the waves.

The refractive index response \(\delta n^{2}\) is initially approximately given by

$$\begin{aligned} \delta n^{2} \approx (\delta \overline{q}_{\phi } - \delta \overline{u} \,\, \overline{q}_{\phi }/\overline{u})/(a \overline{u}), \end{aligned}$$
(5)

where \(\delta \overline{q}_{\phi }\) and \(\delta \overline{u}\) are the \(\overline{q}_{\phi }\) and \(\overline{u}\) responses respectively, derived from Eq. 3 taking \(\delta \overline{q}_{\phi }/\overline{q}_{\phi }\) and \(\delta \overline{u}/\overline{u}\) to be small, noting that \(N^{2}\) is assumed to be constant. On day 1 in the extratropical lower and middle stratosphere, the largest contribution to \(\delta n^{2}\) is associated with \(\delta \overline{q}_{\phi }\) (Fig. 13b) with the \(\delta \overline{u}\) term providing a smaller, oppositely signed contribution (not shown), as can be inferred from Fig. 8. Most of \(\delta \overline{q}_{\phi }\) comes from a change in \(\overline{u}_{\phi \phi }\), with a smaller part from a change in \(\overline{u}_{zz}\). These derivatives decrease where \(\overline{u}\) is being accelerated most rapidly by the direct effect of the torque around (60N, 30 km), and the terms \(-\overline{u}_{\phi \phi }/a\) and \(-af^{2}\overline{u}_{zz}/N^{2}\) in the expansion of Eq. 4 are associated with most of the increase in \(\overline{q}_{\phi }\) here (Fig. 13c, d). Therefore \(n^{2}\) increases here mainly due to curvature of \(\overline{u}\) becoming more negative as a “nose” is pushed out in the \(\overline{u}\) profile around (60N, 30 km) (top left panel of Fig. 8). \(\overline{u}_{\phi \phi }\) increases (so \(-\overline{u}_{\phi \phi }/a\) decreases) to the north and south of this region and \(\overline{u}_{zz}\) increases (so \(-af^{2}\overline{u}_{zz}/N^{2}\) decreases) above and below, giving negative contributions to \(\delta \overline{q}_{\phi }\).
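The linearisation leading to Eq. 5 can be written out explicitly. The steps below assume that, for stationary waves, \(\overline{u}\) and \(\overline{q}_{\phi }\) enter Eq. 3 only through the ratio \(\overline{q}_{\phi }/(a\overline{u})\), with the remaining terms unchanged to first order when \(N^{2}\) is constant; this form is inferred from Eq. 5 rather than quoted from Eq. 3.

$$\begin{aligned} n^{2}&= \frac{\overline{q}_{\phi }}{a\,\overline{u}} + (\text {terms independent of } \overline{u} \text { and } \overline{q}_{\phi }), \\ \delta n^{2}&\approx \frac{\partial n^{2}}{\partial \overline{q}_{\phi }}\,\delta \overline{q}_{\phi } + \frac{\partial n^{2}}{\partial \overline{u}}\,\delta \overline{u} = \frac{\delta \overline{q}_{\phi }}{a\,\overline{u}} - \frac{\overline{q}_{\phi }\,\delta \overline{u}}{a\,\overline{u}^{2}} = \frac{1}{a\,\overline{u}}\left( \delta \overline{q}_{\phi } - \delta \overline{u}\,\frac{\overline{q}_{\phi }}{\overline{u}}\right) . \end{aligned}$$

The two terms in the final bracket have opposite signs when \(\delta \overline{q}_{\phi }\), \(\delta \overline{u}\), \(\overline{q}_{\phi }\) and \(\overline{u}\) are all positive, which is why the \(\delta \overline{u}\) term partly offsets the \(\delta \overline{q}_{\phi }\) term.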

There is a good correspondence between the extratropical lower and middle stratospheric transient EP flux and refractive index responses in the first few days for all the torques in the PJ simulations (not shown).

In the SC simulations, the EP flux response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque in the first few days is also consistent with changes in the refractive index (not shown) caused by changes in the QG PV gradient, with the peak refractive index change after one day being about half that in the PJ runs shown in Fig. 13. This is because the climatological \(\overline{u}\) (Figs. 2a, 4a) and \(\overline{q}_{\phi }\) (Figs. 2e, 5b) are both about twice as large near (60N, 30 km) in the SC control run as in the PJ control run, and \(\delta \overline{u}\) and \(\delta \overline{q}_{\phi }\) are similar (which is expected since the direct \(\overline{u}\) response to the torques ought to be similar). So by Eq. 5, \(\delta n^{2}\) is half as large in the SC runs. Therefore it seems that the initial mean EP flux response is less strong in the SC runs because the vortex is stronger and further poleward on average.
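The halving argument can be checked numerically from Eq. 5. The magnitudes below are illustrative placeholders chosen only to have plausible orders of magnitude; they are not values diagnosed from the PJ or SC runs, and the function name is hypothetical.

```python
A_EARTH = 6.371e6  # Earth radius (m), assumed value

def delta_n2(dq, du, q, u, a=A_EARTH):
    """First-order refractive index response, following Eq. 5."""
    return (dq - du * q / u) / (a * u)

# Hypothetical background state and responses near (60N, 30 km)
u_pj, q_pj = 20.0, 4.0e-5  # "PJ-like" background u and q_phi (placeholders)
du, dq = 0.25, 1.0e-6      # same direct responses assumed in both cases

pj = delta_n2(dq, du, q_pj, u_pj)
sc = delta_n2(dq, du, 2.0 * q_pj, 2.0 * u_pj)  # background u and q_phi both doubled
ratio = sc / pj
```

Doubling both background fields leaves the ratio \(\overline{q}_{\phi }/\overline{u}\) unchanged, so the bracket in Eq. 5 is the same and the prefactor \(1/(a\overline{u})\) alone halves the response.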

On days 6–10, the PJ \(n^{2}\) response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque (Fig. 14) is largely qualitatively similar to that on day 1, though it has become positive in the stratosphere near the pole, due to the \(\overline{q}_{\phi }\) response becoming positive there (not shown). The upward EP flux response near 60N in the lower stratosphere is still qualitatively consistent with the \(n^{2}\) response and ray theory.

Fig. 14

The trimmed-mean response of the squared refractive index for stationary waves (\(n^{2}\), colours) and zonal wavenumber-1 EP flux (arrows) to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque under perpetual January conditions, averaged over days 6–10 of the branch runs. Hatching shows where trimmed-mean \(n^{2}\) for zonal wavenumber-1 is negative in the control run. The reference arrow to the left of the plot represents EP flux \((10^{5},10^{3})\,\hbox {kg/s}^{2}\). The high-latitude lower stratospheric upward EP flux response is consistent with the positive \(n^{2}\) response and ray theory

The first stage of the transient response to a westerly torque can therefore be explained by the torque directly increasing \(\overline{q}_{\phi }\) by affecting the curvature of \(\overline{u}\), so that Rossby waves propagate more into the region where the torque is applied, which increases EP flux convergence here.
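The curvature mechanism can be illustrated with a one-dimensional sketch. A Gaussian bump in latitude with an 11-degree scale stands in for the day-1 \(\overline{u}\) response; the exact functional form of the torque and the amplitude used here are assumptions, and only the \(-\overline{u}_{\phi \phi }/a\) contribution is evaluated.

```python
import numpy as np

A_EARTH = 6.371e6  # Earth radius (m), assumed value

lat_deg = np.linspace(20.0, 88.0, 137)  # 0.5-degree spacing, includes 60N and 80N
phi = np.deg2rad(lat_deg)

# Idealised "nose" in the zonal wind response centred at 60N (assumed Gaussian shape)
du = 0.25 * np.exp(-((lat_deg - 60.0) / 11.0) ** 2)

# Second meridional derivative by repeated centred differences
du_phiphi = np.gradient(np.gradient(du, phi), phi)
dq_from_curvature = -du_phiphi / A_EARTH  # the "u_phiphi term" contribution

i_centre = int(np.argmin(np.abs(lat_deg - 60.0)))
i_flank = int(np.argmin(np.abs(lat_deg - 80.0)))
centre_value = dq_from_curvature[i_centre]
flank_value = dq_from_curvature[i_flank]
```

The contribution is positive at the centre of the bump, where the curvature is negative, and negative on the flanks beyond the inflection points, matching the pattern of increased \(\overline{q}_{\phi }\) near 60N with decreases to the north and south.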

In the second stage of the response, the EP flux response becomes strongly upward and equatorward above ~35 km, after ~day 11 in the PJ simulations (Fig. 9). However, \(\delta n^{2}\) remains negative in the mid-latitude upper stratosphere, and is both positive and negative at different points in the high-latitude upper stratosphere (not shown). Therefore it is not immediately clear that the refractive index diagnostic is useful for explaining this stage. We show in the following analysis, though, that the response is still consistent with ray theory once changes in the wave propagation from below are accounted for.

Figure 15 shows the high-altitude response of \(\widetilde{D_{F}}=2\pi a^{2} \cos (\phi ) \nabla \cdot \mathbf {F}\) averaged over days 11–25 plotted against the 45–80N \(F^{z}\) response \((\delta F^{z})\) at 30 km averaged over days 6–10, for all the torques in the PJ and SC experiments. The \(\widetilde{D_{F}}\) response is integrated over 30–80N and 35–50 km. The integral of \(\delta F^{z}\) is defined as

$$\begin{aligned} \int _{45\mathrm {N}}^{80\mathrm {N}} 2 \pi a^{2} \cos (\phi ) \delta F^{z}|_{z=30 \, \mathrm {km}} \, \mathrm {d}\phi . \end{aligned}$$

This expression follows from applying Stokes’ theorem to equations 2.5 and 2.6 of Dunkerton et al. (1981) to infer the negative contribution of \(\delta F^{z}\) to the response of \(\widetilde{D_{F}}\) integrated within a closed surface of which the surface at 30 km between 45–80N forms a part. The day 11–25 period includes that of greatest \(\overline{u}\) deceleration and EP flux convergence responses in the upper stratosphere in both PJ and SC runs (Figs. 9, 12).
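The latitude integral defined above can be approximated with the trapezoidal rule on the model's latitude grid. The sketch below is an illustration under assumed inputs: the grid, the Earth radius value and the function name are not taken from the paper.

```python
import numpy as np

A_EARTH = 6.371e6  # Earth radius (m), assumed value

def integrate_fz(delta_fz, lat_deg, lat_south=45.0, lat_north=80.0):
    """Integral of 2*pi*a^2*cos(phi)*delta_fz over latitude at a fixed height."""
    lat_deg = np.asarray(lat_deg, dtype=float)
    keep = (lat_deg >= lat_south) & (lat_deg <= lat_north)
    phi = np.deg2rad(lat_deg[keep])
    values = np.asarray(delta_fz, dtype=float)[keep]
    integrand = 2.0 * np.pi * A_EARTH**2 * np.cos(phi) * values
    # Trapezoidal rule written out explicitly
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(phi)))

# Check against the analytic result for a uniform unit response
lat = np.linspace(45.0, 80.0, 3501)
numeric = integrate_fz(np.ones_like(lat), lat)
analytic = 2.0 * np.pi * A_EARTH**2 * (np.sin(np.deg2rad(80.0)) - np.sin(np.deg2rad(45.0)))
```

The \(\cos \phi \) weighting accounts for the shrinking area of latitude bands towards the pole, so a given \(\delta F^{z}\) at 80N contributes less to the integral than the same value at 45N.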

Fig. 15

The ensemble and day 11–25 mean responses in \(\widetilde{D_{F}}\) integrated over 30–80N and 35–50 km plotted against the ensemble and day 6–10 mean \(F^{z}\) responses at 30 km integrated between 45 and 80N, for all the torques in the perpetual January (red) and seasonal cycle (blue) experiments. The solid lines are the least-squares linear fits to each set of experiments. Error bars show the 2.5th–97.5th percentiles of the distribution of the mean according to an MC bootstrap estimate. Unless otherwise specified in the legend, the torques peak at \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\), have meridional scale \(\delta \phi = 11^{\circ }\) and vertical scale \(\delta z=10\,\hbox {km}\), and have a magnitude such that the total zonal momentum added to the NH equals that of the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque with peak magnitude 2.5 m/s/day. These results are consistent with enhanced upward planetary wave propagation into the high-latitude lower stratosphere leading to increased propagation into the upper stratosphere at later times with consequent greater EP flux convergence, with this wave response being qualitatively similar but weaker in runs with a seasonal cycle than in perpetual January runs
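Error bars of the kind described in the caption can be sketched with a percentile bootstrap of the ensemble mean. The resampling details here (resampling members with replacement, 2000 resamples, a fixed seed) are assumptions for illustration and are not necessarily those of the test in Sect. 2.3.

```python
import numpy as np

def bootstrap_mean_interval(values, n_boot=2000, seed=0):
    """2.5th and 97.5th percentiles of the bootstrap distribution of the mean."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row indexes one resample of the members, drawn with replacement
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    boot_means = values[idx].mean(axis=1)
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    return float(lo), float(hi)

# Synthetic per-member responses (not model output)
members = np.arange(30, dtype=float)
lo, hi = bootstrap_mean_interval(members)
const_lo, const_hi = bootstrap_mean_interval(np.full(20, 3.0))
```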

The high-altitude \(\widetilde{D_{F}}\) response during days 11–25 is more negative for torques that cause a greater \(F^{z}\) response in the middle stratosphere between days 6–10, with correlations of \(-0.96\) and \(-0.93\) across the PJ and SC experiments respectively. As argued previously, the \(F^{z}\) response between days 6–10 seems to be due to torques increasing \(\overline{q}_{\phi }\). The regression coefficient of the mean \(\widetilde{D_{F}}\) response against the mean \(F^{z}\) response is very similar in each set of simulations. This indicates that the high-altitude \(\widetilde{D_{F}}\) response in the SC experiments is weaker than in the PJ experiments primarily because the earlier \(F^{z}\) response is weaker.
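The statistics referred to here (a correlation across experiments, a least-squares regression coefficient, and a bootstrap percentile interval for an ensemble mean as in the error bars of Fig. 15) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' analysis code: the function names, the number of resamples and the percentile indexing are hypothetical choices, and the procedure used in the paper may differ in detail.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def least_squares_fit(x, y):
    """Slope and intercept of the ordinary least-squares line y = slope*x + c."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def bootstrap_mean_interval(members, n_boot=2000, seed=0):
    """2.5th and 97.5th percentiles of the bootstrap distribution of the mean.

    members : per-ensemble-member responses (hypothetical input)
    n_boot  : number of resamples with replacement (assumed value)
    """
    rng = random.Random(seed)
    n = len(members)
    means = sorted(
        statistics.fmean(rng.choices(members, k=n)) for _ in range(n_boot)
    )
    return means[int(0.025 * (n_boot - 1))], means[int(0.975 * (n_boot - 1))]
```

Comparing the slope returned by `least_squares_fit` for the PJ and SC sets separately corresponds to the comparison of regression coefficients made in the text.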

Physically, this is consistent with the explanation that the mid-stratospheric upward EP flux response in days 6–10 is associated with propagating waves with an upward group velocity, with an associated transfer of easterly momentum into the upper stratosphere at later times, resulting in deceleration of \(\overline{u}\) that is enhanced by the lower air density here. The results also suggest that the extratropical response to the \((\phi _{0}=30\hbox {N},\,z_{0}=30\,\hbox {km})\) and \((\phi _{0}=45\hbox {N},\,z_{0}=30\,\hbox {km})\) torques is weak (Fig. 6) because these torques are too far south to cause substantial enhancement of upward wave propagation at high latitudes.

In the third phase of the response, the PJ \(D_{F}\) response in the high-latitude stratosphere becomes large enough to cause the \(\overline{u}\) response to turn negative (Fig. 8), but this does not occur in the SC experiments (Fig. 11). The high-latitude \(\widetilde{D_{F}}\) response between days 21–40, which is the period when it drives the high-latitude mid-stratospheric \(\overline{u}\) response to become negative in the PJ runs (Fig. 8), is well correlated with the \(\widetilde{D_{F}}\) response in the same place in days 6–10 in both sets of runs (Fig. 17). So the stronger \(\widetilde{D_{F}}\) response over days 21–40 in the PJ runs is related to their stronger initial response.

The \(n^{2}\) response to westerly torques continues to be positive in the high-latitude lower and middle stratosphere in both the PJ and SC runs past day 10 (Fig. 16a). This is partly because \(\overline{q}_{\phi }\) is increased near where the torque is strongest and where it causes more negative curvature of \(\overline{u}\) (Fig. 16b). \(\delta n^{2}\) continues to be larger in the PJ runs than in the SC runs (not shown), and would be expected to lead to greater EP flux convergence in the extratropics in the PJ runs, as in the first stage of the response. This may be what allows the overall \(\overline{u}\) response to turn easterly in the PJ runs. \(\delta n^{2}\) differs between the PJ and SC runs in this period in part because their control climatologies are different, as in the first stage of the response, which may account for why the early and late \(\widetilde{D_{F}}\) responses are closely related (Fig. 17).

Fig. 16

a Trimmed-mean response of the squared refractive index for stationary waves \((n^{2})\) to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque under perpetual January conditions, averaged over days 21–60 of the branch runs. b Corresponding mean response of the meridional PV gradient \(\overline{q}_{\phi }\). The torque increases \(\overline{q}_{\phi }\) in the lower and middle stratosphere near ~45–60N, contributing to \(n^{2}\) being larger, encouraging wave propagation into the stratosphere in the long term

Fig. 17

The ensemble and day 21–40 mean responses in \(\widetilde{D_{F}}\) integrated over 45–80N and 16–40 km plotted against the day 6–10 mean \(\widetilde{D_{F}}\) responses in the same region, for all the torques in the perpetual January (red) and seasonal cycle experiments (blue), plotted as in Fig. 15. There is a close relationship between the \(\widetilde{D_{F}}\) responses at early and late times, and the weaker days 6–10 response in the seasonal cycle runs compared to that in the perpetual January runs is related to the weaker response in days 21–40, and hence to the \(\overline{u}\) response being NAM-like in the perpetual January runs but not in those with a seasonal cycle

In summary, the wave feedbacks are qualitatively similar in the PJ and SC experiments, with the difference just being their strength. The feedbacks are consistent with ray theory, which predicts that as a westerly torque increases \(\overline{q}_{\phi }\) at high latitudes, wave propagation into the stratosphere will increase. Ray theory indicates that the differences between the PJ and SC feedback strengths arise due to differences in the control run \(\overline{u}\) climatologies, and possibly also differences in the planetary wave climatologies. Ray theory may also explain why the PJ \(\overline{u}\) response to torques with smaller meridional and height scales is larger (Fig. 6), as these have larger direct effects on \(\overline{q}_{\phi }\), so the wave response is larger. As the response to a torque tends towards its steady state, however, changes in the mesosphere and phenomena such as wave reflection could complicate the details of the wave behaviour, and this has not been fully investigated here, though it seems unlikely to be of primary importance.

Cohen et al. (2013) argued that increased convergence of the EP flux associated with resolved waves in the region of an applied torque is necessary, because otherwise the flow would eventually become unstable due to \(\overline{q}_{\phi }\) changing sign. However, in our experiments the EP flux becomes more convergent in the region where the torques peak within a day, for torques centred at 45N or poleward (shown for the \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque in Figs. 9, 12), which implies that wave feedbacks begin counteracting the torques before instability develops. \(\overline{q}_{\phi }\) remains positive in our runs with applied torques, except very near 90N (not shown). This is consistent with the analysis of Cohen et al. (2014) and Sigmond and Shepherd (2014).

6 Comparison of the PJ response to 100 hPa GPH forcing and torques

In order to further test the idea that an arbitrary extratropical forcing will tend to give rise to an NAM-like response if planetary wave feedbacks are large enough, Fig. 18 shows the steady state \(\overline{u}\) differences between two PJ SMM runs without applied torques but with different prescribed 100 hPa GPH. The first is the “standard” PJ control run (Sect. 2.2.1). The second is a similar run with the eddy component of HadGEM2-CCS 100 hPa GPH multiplied by 1.5 rather than 2. In other words, the \(\overline{u}\) differences in Fig. 18 are due to a steady forcing by a 100 hPa GPH perturbation equal to a quarter of that shown in Fig. 1b. This could represent an increased planetary wave forcing on the vortex due to a change in the tropospheric state.

Fig. 18

The steady state \(\overline{u}\) response to increased planetary wave forcing at 100 hPa under perpetual January conditions. The number in the top left corner is the anomaly correlation north of 20N between the \(\overline{u}\) response and its NAM signature in the standard control run (Fig. 3a). Data are not plotted where they are not statistically significant above the 95 % level according to the test described in Sect. 2.3. The response is very NAM-like, similar to the responses to torques in Fig. 6, supporting the idea that arbitrary high-latitude forcings will give NAM-like steady state responses when feedbacks are strong enough

The \(\overline{u}\) response in Fig. 18 appears very like the NAM \(\overline{u}\) signature and the steady state \(\overline{u}\) responses to torques centred at 60N or poleward (Figs. 3a, 6), with an anomaly correlation of 0.91 with the NAM signature (calculated in the same way as in Sect. 3.1). The easterly high-latitude \(\overline{u}\) response in the steady state is the same sign as the expected direct effect of increased wave forcing, so wave feedbacks do not always act to make the steady state response opposite in sign to the direct effect of a forcing in the PJ simulations.
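For readers unfamiliar with the anomaly correlation used to quantify how NAM-like a response is, the following Python sketch computes a centred pattern correlation between two latitude–height fields north of 20N. It is a sketch under assumptions and not the calculation of Sect. 3.1: the cosine-latitude weighting, the absence of any vertical weighting, and the function and argument names are all hypothetical choices.

```python
import math


def anomaly_correlation(field_a, field_b, lat_deg, lat_min=20.0):
    """Centred, cos(latitude)-weighted pattern correlation of two fields.

    field_a, field_b : lists of rows (one row per level), each row aligned
                       with lat_deg (hypothetical layout)
    lat_deg          : latitudes in degrees
    lat_min          : only points at or north of this latitude are used
    """
    # Collect (weight, value_a, value_b) for every retained grid point.
    points = [
        (math.cos(math.radians(lat)), va, vb)
        for row_a, row_b in zip(field_a, field_b)
        for lat, va, vb in zip(lat_deg, row_a, row_b)
        if lat >= lat_min
    ]
    wsum = sum(w for w, _, _ in points)
    mean_a = sum(w * va for w, va, _ in points) / wsum
    mean_b = sum(w * vb for w, _, vb in points) / wsum
    cov = sum(w * (va - mean_a) * (vb - mean_b) for w, va, vb in points)
    var_a = sum(w * (va - mean_a) ** 2 for w, va, _ in points)
    var_b = sum(w * (vb - mean_b) ** 2 for w, _, vb in points)
    return cov / math.sqrt(var_a * var_b)
```

A value near 1 indicates that the two patterns have nearly the same spatial structure regardless of amplitude, which is the sense in which the response is described as NAM-like.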

Therefore the circulation response to increased planetary wave forcing in the PJ simulations is also NAM-like, and this adds further support to the idea that feedbacks will give a NAM-like response to an arbitrary forcing in the high-latitude stratosphere when they are sufficiently strong.

7 Discussion

Overall, our results are generally consistent with those of other modelling studies. The approximate cancellation of the direct effect of an applied torque placed in the stratosphere by the planetary wave response in the steady state was also seen by Cohen et al. (2013, 2014) in both primitive equation and more comprehensive models with an interactive troposphere. Sigmond and Shepherd (2014) also found that this occurred in a GCM, albeit with incomplete cancellation at high latitudes. A similar effect was also seen by McLandress and McFarlane (1993) and Manzini and McFarlane (1998) in experiments on parameterised GWD, but was not investigated in depth.

However, the easterly response of the high-latitude stratospheric \(\overline{u}\) to a westerly torque seen in the PJ experiments has not been previously reported, though there are parallels with other systems—in tropospheric experiments, a similar reversal of the \(\overline{u}\) response to broad tropical heating by wave feedbacks was found by Sun et al. (2013), and Palmer (1999) showed that the response can have the opposite sign to an applied forcing in the Lorenz (1963) system. The studies of Song and Robinson (2004) and Chen and Zurita-Gotor (2008) indicate that the stratospheric \(\overline{u}\) response has the same sign as the torque in models with weak stationary wave forcing. However, the strong EP flux response in the results presented here indicates that the stationary wave forcing is very important for bringing about the steady state responses in the SMM simulations. In an additional PJ SMM experiment, with the imposed 100 hPa GPH wave amplitude reduced to a quarter of that in the PJ experiments described in Sect. 2.2.1, so that planetary wave activity is substantially lessened, the steady state \(\overline{u}\) response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) westerly torque is also westerly at high latitudes (not shown).

The easterly responses to westerly torques in the PJ runs can be understood as being due to the torque having two opposing effects on \(\overline{u}\): to directly make the flow more westerly, and to increase the meridional PV gradient so that planetary wave propagation into the stratosphere increases, which leads to easterly acceleration. There is nothing restricting the latter effect to be smaller than the former. Wave propagation into the stratosphere stays enhanced even after the \(\overline{u}\) response has become negative in the PJ runs because the \(D_{F}\) response is broader meridionally than the torque (bottom row of Fig. 9), making the total acceleration less negative where the torque peaks, affecting the curvature of \(\overline{u}\). As a result, the response of the meridional PV gradient to the torque is always positive around ~45–60N (Fig. 16b), enhancing upward wave propagation.

Cohen et al. (2013) find in a model with substantial stationary wave forcing that an easterly torque produces an easterly \(\overline{u}\) response at high latitudes. This agrees with our results in the SC runs (Sect. 4), and is consistent with our analysis if their model simulates a weaker wave response to the applied torque than that in our PJ experiments.

The consistency we have found between the planetary wave responses to applied torques and ray theory is in accordance with previous studies that have shown that ray theory appears to be successful at explaining the tropospheric response to lower stratospheric heating (Simpson et al. 2009) and the extratropical stratospheric response to QBO forcing (Garfinkel et al. 2012). This provides evidence that ray theory is generally useful for understanding the responses to applied forcings in the stratosphere (though this does not preclude other frameworks from also being useful, such as the non-linear approach of O’Neill and Pope (1988), or the PV-based approaches of Cohen et al. (2014) and Scott and Liu (2014)).

7.1 Further tests exploring the reasons for differences between perpetual January and seasonal cycle runs

The analysis of Sect. 5 suggests that the long-term response to a torque is NAM-like in the PJ runs but less so in the SC runs because planetary wave feedbacks are stronger in the former. Additional possible explanations were also explored. For example, Son and Lee (2006) found that the tropospheric responses to heating perturbations are more AM-like when the leading circulation EOF dominates the variability more. However, we did not find that the first EOF of GPH explains very different fractions of the total variability in the PJ and SC control runs. Another possible explanation is suggested by the fluctuation-dissipation theorem, which predicts that the size of the response of the leading principal component (PC) to an applied forcing is proportional to that PC’s autocorrelation time scale (Leith 1975), but we found that this time scale was not very different between the PJ and SC control runs.
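One simple way to estimate an autocorrelation time scale from a PC time series is to find the lag at which the sample autocorrelation first falls below \(1/e\). The Python sketch below illustrates this. The e-folding definition, the function names and the synthetic AR(1) series used to exercise the code are assumptions made for illustration and are not taken from the analysis described here.

```python
import math
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at the given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return cov / var


def efolding_time(series, max_lag=100):
    """First lag (in time steps) at which the autocorrelation drops below 1/e.

    Returns None if this does not happen within max_lag steps.
    """
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) < 1.0 / math.e:
            return lag
    return None


def ar1_series(phi, n, seed=0):
    """Synthetic AR(1) series x[t] = phi*x[t-1] + noise, for testing only."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out
```

For an AR(1) process with coefficient 0.9 the theoretical e-folding time is \(-1/\ln 0.9 \approx 9.5\) steps, so `efolding_time(ar1_series(0.9, 20000))` should return a value close to 10. Applying the same estimator to the leading PC of two control runs would give a like-for-like comparison of their time scales.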

To test the effect of removing daily 100 hPa GPH variability in the SC runs whilst retaining the seasonal cycle, we also performed an additional SC experiment, but prescribed monthly-mean rather than daily-mean 100 hPa GPH, with the eddy component multiplied by a factor of 2 as in the PJ runs. The \(\overline{u}\) climatology and long-term mean response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque (not shown) were similar to those in the PJ runs, with the response being NAM-like and easterly at high latitudes. This supports the previous analysis showing that the response depends on the \(\overline{u}\) climatology, as the response changed to become like that in the PJ experiments at the same time as the \(\overline{u}\) climatology changed to become like that in the PJ control run. This experiment also shows that the presence of seasonally-varying boundary conditions does not prevent the response from becoming NAM-like.

We performed a further SC experiment with daily variability of 100 hPa GPH, but with the eddy component multiplied by a factor of 1.5 (the SMM became unstable with eddy amplitudes scaled by a factor of 2). The \(\overline{u}\) climatology and response to the standard \((\phi _{0}=60\hbox {N},\,z_{0}=30\,\hbox {km})\) torque were similar to those in simulations with no scaling of the eddy component (Sect. 4), and the \(\overline{u}\) response was not NAM-like. Therefore it does not seem that the difference between the responses to torques in SC and PJ simulations is due to the mean wave amplitude being larger in the latter. This is also consistent with the analysis showing that the response depends on the \(\overline{u}\) climatology, as neither changed much.

By the process of elimination, the presence of high-frequency variability in the imposed 100 hPa GPH in the SC runs therefore seems to be an important factor that causes the \(\overline{u}\) climatology and responses to applied torques to differ from those in the PJ runs. However, it is difficult to directly test this because it is not clear how to include realistic daily 100 hPa GPH variability in the PJ runs to see if this makes them more like the SC runs.

7.2 Sources of error in the SMM

It is emphasised that the SMM is a simplified model that is only expected to show behaviour that is qualitatively similar to the true stratospheric dynamics. In particular, imposing GPH at 100 hPa and parameterising GWD using Rayleigh friction may introduce unrealistic effects. However, Fig. 6 indicates that the \(\overline{u}\) response to torques is qualitatively similar for torques placed at different heights above the bottom boundary, indicating that the presence of the bottom boundary does not strongly affect the response. The presented responses will not include feedbacks from the tropospheric response to the stratospheric circulation change, though. Feedbacks from changes in gravity waves are also not properly represented, though it is not expected that the qualitative nature of the results presented here would change if a more realistic gravity wave parameterisation were included, as the contribution from planetary waves usually dominates the stratospheric wave drag. As noted in Sect. 2.1, greatly weakening the NH Rayleigh friction was not found to alter the qualitative nature of the stratospheric responses to the applied torques, indicating that the main results of this study are not strongly affected by including this in the SMM.

8 Conclusions

We have investigated the NH wintertime steady state and transient circulation responses to applied zonally symmetric zonal torques that were simulated by a primitive equation model of the middle atmosphere, using both idealised perpetual January (PJ) boundary conditions and imposing a more realistic seasonal cycle (SC). This is relevant for understanding how the extratropical stratosphere responds to perturbations to the zonal mean circulation made directly by gravity wave parameterisations and also external forcings, such as the QBO, ENSO, solar cycle and volcanic eruptions. Our experiments indicate that feedbacks from the wave part of the extratropical circulation strongly affect the spatial structure of the overall responses, and can even determine their sign. Therefore it seems necessary to understand these feedbacks in order to understand the extratropical stratospheric response to such forcings.

For both PJ and SC boundary conditions, westerly torques placed in the high-latitude NH stratosphere have two main effects: to directly accelerate \(\overline{u}\), and to cause greater upward EP flux. The latter is consistent with ray theory, in that the torque increases the meridional PV gradient and so enhances upward propagation of large-scale waves. This wave feedback leads to increased EP flux convergence where the torque is applied, and this temporarily drives the overall zonal acceleration to become opposite in sign to the direct effect of the torque after roughly two weeks (Sects. 3.3 and 4). This is possible because there is nothing to restrict the easterly acceleration due to the wave response to be smaller than the direct westerly acceleration of the torque.

In the PJ runs, the wave feedback can cause an easterly high-latitude long-term mean \(\overline{u}\) response to a westerly torque. This happens because the \(D_{F}\) response during and after this stage is broader meridionally than the torques we used (bottom row of Fig. 9), so the total acceleration is less negative near where the torque strength peaks, affecting the curvature of \(\overline{u}\). Consequently the response of the meridional PV gradient to the torque is always positive (Fig. 16b) and wave propagation continues to be enhanced even as the \(\overline{u}\) response reduces to zero, so the \(\overline{u}\) response becomes easterly.

The \(D_{F}\) response is roughly equal and opposite to most torques in the steady state in both the PJ and SC simulations. The EP flux convergence also increases in the upper stratosphere, decelerating the upper stratospheric \(\overline{u}\). The residual circulation response to applied torques (Sects. 3.3 and 4) appears very different to the expected response to an applied torque when the EP flux convergence is held fixed (e.g. Haynes et al. 1991), which is important for understanding how changes in gravity wave parameterisations affect the meridional circulation (Cohen et al. 2013, 2014).

In the PJ runs, the long-term responses of \(\overline{u}\) and GPH to high-latitude torques are generally NAM-like (Sect. 3.1), due to the strong wave feedbacks. These stratospheric NAM-like responses are analogous to the NAM-like responses of tropospheric models to various forcings (e.g. Son and Lee 2006; Ring and Plumb 2007, 2008; Branstator and Selten 2009) and the responses of the Lorenz (1963) system to applied forcings (Palmer 1999; Palmer and Weisheimer 2011). This indicates that stratospheric and tropospheric dynamics are similar in this regard and that this behaviour could be quite general. This broadly supports the suggestion of Watson and Gray (2014) that feedbacks cause the long-term NH extratropical stratospheric response to many forcings to be NAM-like, as long as the feedbacks are strong enough. This may explain why the observed responses of the polar vortex to the QBO, ENSO and solar cycle influences are NAM-like (e.g. Dunkerton and Baldwin 1991; Sassi et al. 2004; Labitzke 2005; Ruzmaikin et al. 2005; Watson and Gray 2014), supposing that planetary wave feedbacks are sufficiently strong in the real stratosphere. This is also supported by the fact that the response to 100 hPa GPH wave forcing in the SMM is also NAM-like, though feedbacks do not reverse the sign of the response in this case (Sect. 6). However, torques placed in middle latitudes do not produce a NAM-like response (Sect. 3.1), because it seems they do not strongly affect wave propagation into the stratosphere (Sect. 5, though we have not investigated the responses to these torques in detail). This is perhaps because their main effect is on wave breaking in middle latitudes (Cohen et al. 2014).

The long-term responses in SC runs are not very NAM-like, however, and the high-latitude \(\overline{u}\) responses are the same sign as the torque (Sect. 4). This is because the wave feedbacks are weaker than those in the PJ experiments, though qualitatively similar, consistent with ray theory and differences in the climatological \(\overline{u}\) between the SC and PJ runs (Sect. 5). The model may not simulate the overall magnitude of the feedbacks accurately, since the climatological EP flux is too weak (Sect. 2.2.2) and feedbacks associated with gravity waves and the tropospheric response were not included. This means that it is not clear from these experiments which of the PJ and SC simulations predicts the long-term responses more accurately. It would therefore also be interesting to examine the stratospheric response to torques in a more realistic model.

The overall picture that emerges is that wave feedbacks act to make the response to torques at high latitudes NAM-like on long time scales, but the total response will only become NAM-like if the feedbacks are strong enough. Therefore the effect of feedbacks on long time scales seems quite conceptually straightforward relative to the detailed physics of what happens shortly after a torque is applied.

Since the sign of the \(\overline{u}\) response differs between the PJ and SC experiments, our results indicate that models must simulate the magnitude of the feedbacks quantitatively accurately in order to simulate a long-term response that is even qualitatively accurate. This is likely to also apply to forcings other than torques. According to ray theory, biases in the \(\overline{u}\) climatology and representation of tropospheric wave sources will lead to biases in the simulated strength of wave feedbacks. For example, climatological \(\overline{u}\) and \(\overline{q}_{\phi }\) that are too large would be expected to make the wave feedbacks to given \(\overline{u}\) and \(\overline{q}_{\phi }\) perturbations too weak (Sect. 5). Comparing a model’s EP flux to that in observations may also be particularly useful for assessing a model’s ability to accurately simulate planetary waves and therefore its ability to simulate the wave feedbacks that shape the response to external forcings. Our results are reminiscent of the findings of Sigmond and Scinocca (2010), who showed that the tropospheric response to a doubling of \(\hbox {CO}_{2}\) is quite sensitive to the \(\overline{u}\) climatology.

The demonstration that the steady state stratospheric response to a forcing may have the opposite sign to the forcing (Sect. 3.1) has important implications for studies of the mechanisms by which external forcings influence the polar vortex—in principle it could be the case that the direct effect of a forcing has the opposite sign to the long-term mean response. As far as we are aware, this possibility has not been considered in any previous studies of the effect on the vortex of forcings such as the QBO, ENSO and the solar cycle. Feedbacks may greatly modify the response from what is expected based on simple arguments. It also highlights the difficulties of using diagnostics such as composite differences to understand forcing mechanisms, since these may be dominated by the effects of feedback processes (Watson and Gray 2014).

The implication that the extratropical stratospheric response to an external forcing is affected by the climatology may be relevant for understanding non-linearity in the way different forcings combine to affect the polar vortex, such as the suggested non-linear combined influence of the QBO and ENSO (Garfinkel and Hartmann 2007; Wei et al. 2007) and of the QBO and solar cycle (e.g. Labitzke 1987; Matthes et al. 2004). When one forcing affects the background circulation, this would be expected to change the circulation response to other forcings, and this effect may contribute to the reported non-linearities.

Our results provide further evidence that ray theory can be helpful for understanding the stratospheric extratropical response to external forcings, though other physical explanations may also be consistent with the results presented here. It is not altogether clear why strong wave feedbacks result in a circulation response that is similar to the NAM. Deeper analysis, beyond the scope of this work, may indicate if the NAM-like response occurs, for example, due to regime behaviour (Palmer 1999), or to NAM-like anomalies having a long decay time scale (Branstator and Selten 2009).

Our work suggests that it may be helpful to examine the transient responses to applied forcings in the troposphere to better understand whether feedbacks generally tend to make the response AM-like there, rather than just the long-term mean responses that have been the focus of previous work that we are aware of (e.g. Son and Lee 2006; Ring and Plumb 2007, 2008; Woollings 2008; Branstator and Selten 2009). In cases when the long-term response to a forcing is not AM-like, the transient response can be used to distinguish between the possibilities that the feedback processes are indeed different compared to situations when the response is AM-like or that the feedbacks are similar but simply weaker.