1 Introduction

Understanding the response of the climate system to increasing greenhouse gases is a topic of substantial scientific interest, reflecting in large part the societal concern for potential future climatic changes, but also the need to better understand the controls on past climates (e.g., Knutson et al. 2010; Walsh et al. 2016). Global climate models (GCMs) are one of the fundamental tools in developing this understanding. A common avenue in GCM development and improvement is the enhancement of model resolution, to better represent finer-scale processes and phenomena (e.g., Masumoto et al. 2004; Roberts et al. 2004; Yoshimura and Sugi 2005; Yoshimura et al. 2006; Zhao et al. 2009; Murakami and Sugi 2010; Murakami et al. 2010, 2011, 2015, 2016a; Chen and Lin 2011, 2013; Scoccimarro et al. 2011; Delworth et al. 2012; Kirtman et al. 2012; Bell et al. 2014; Vecchi et al. 2014; Roberts et al. 2015; Harris et al. 2016; Haarsma et al. 2016; Yoshida et al. 2017). In this paper we explore a suite of idealized sensitivity studies to increasing atmospheric CO2 concentrations in GCMs spanning a range of horizontal atmospheric and land resolutions. This framework allows the evaluation of the sensitivity of global temperature, precipitation, and tropical cyclone (TC) activity to changes in atmospheric resolution.

The availability of high-resolution GCMs has enabled the explicit exploration of regional climatic sensitivity, the response of extreme events to climate forcings, and the assessment of the time-evolving probability of extreme events (e.g., Zhao et al. 2009; Walsh et al. 2015; Wehner et al. 2015; Delworth et al. 2015; Haarsma et al. 2016; Jia et al. 2016; Murakami et al. 2017a, b, 2018; Van der Wiel et al. 2016a, 2017; van Oldenborgh et al. 2017; Krishnamurthy et al. 2018; Patricola and Wehner 2018; Zhang et al. 2018; Bhatia et al. 2019). Tropical cyclones (TCs) can have catastrophic impacts, particularly due to their extremely strong winds and precipitation (Pielke et al. 2008; Hsiang 2010; Mendelsohn et al. 2012; Villarini et al. 2014a). GCMs have advanced our understanding of spatial and temporal variability of TC genesis and landfall (Sugi and Yoshimura 2012; Zarzycki and Jablonowski 2014; Roberts et al. 2015; Camargo 2013; Camargo and Wing 2016; Murakami et al. 2017a, b, 2018; Baldwin et al. 2019), the association between climate oscillations and TCs (Bell et al. 2014; Chand et al. 2016; Vecchi et al. 2014; Krishnamurthy et al. 2016; Murakami et al. 2016a, b; Zhang et al. 2016), the responses of TCs to anthropogenic forcing, and provided projections for possible changes in the future (Yoshimura and Sugi 2005; Yoshimura et al. 2006; Gualdi et al. 2008; Zhao et al. 2009; Murakami and Sugi 2010; Held and Zhao 2011; Mendelsohn et al. 2012; Zhao and Held 2012; Knutson et al. 2013, 2015; Kim et al. 2014; Scoccimarro et al. 2014; Villarini et al. 2014b; Wehner et al. 2015; Yamada et al. 2017; Yoshida et al. 2017; Bhatia et al. 2018). Previous studies have found improvements in the simulation of TCs (e.g., genesis, track density and intensity) with increasing spatial resolution (e.g., Chen and Lin 2011, 2013; Zhao et al. 2009; Mendelsohn et al. 2012; Kim et al. 2014; Vecchi et al. 2014; Zarzycki and Jablonowski 2014; Wehner et al. 2015; Murakami et al. 2015, 2016a; Zhang et al. 2016, 2019; Liu et al. 2017, 2018a) and that large-scale biases in mean climate can impact the sensitivity of TCs to climate drivers (e.g., Vecchi et al. 2014; Krishnamurthy et al. 2016). In an effort to overcome the coarse resolution of most current climate models, statistical, dynamical and hybrid downscaling methods have been used to estimate the response of TCs to climate change and variability (e.g., Emanuel and Nolan 2004; Emanuel et al. 2008; Knutson et al. 2008; Bender et al. 2010; Vecchi et al. 2011, 2013; Emanuel 2013; Knutson et al. 2013; Villarini et al. 2012; Villarini and Vecchi 2012, 2013; Camargo and Wing 2016; Lee et al. 2018).

This study aims to assess the impacts of atmospheric resolution (in fully-coupled GCMs) on the response of regional and global TC activity to increased CO2. With this goal in mind, we perform a suite of perturbation experiments using three models that share identical ocean and sea ice components, and land and atmosphere components that differ only in their resolution. These models are derived from the National Oceanic and Atmospheric Administration’s Geophysical Fluid Dynamics Laboratory (NOAA/GFDL) Coupled Model version 2.1 (CM2.1; Delworth et al. 2006) and version 2.5 (CM2.5; Delworth et al. 2012), and are the Low Ocean Atmosphere Resolution version of CM2.5 (LOAR; Van der Wiel et al. 2016a), the Forecast-oriented Low Ocean Resolution version of CM2.5 (FLOR; Vecchi et al. 2014) and the high atmospheric resolution version of FLOR (HiFLOR; Murakami et al. 2015), which have, respectively, ~ 2°, ~ 0.5° and ~ 0.25° atmospheric and land horizontal grid spacings. The changes in atmospheric resolution, land model and ocean parameterizations from CM2.1 to LOAR, to FLOR and then HiFLOR result in a general improvement to the simulation of large-scale near-surface climate and modes of variability (e.g., Vecchi et al. 2014; Delworth et al. 2015; Jia et al. 2015; Yang et al. 2015; Murakami et al. 2015; Baldwin and Vecchi 2016; Zhang et al. 2016; Van der Wiel et al. 2016a; Pascale et al. 2016, 2017; Muñoz et al. 2017; Kapnick 2018; Ng et al. 2018; Wittenberg et al. 2018; Ray et al. 2018), and general improvements in seasonal prediction skill (e.g., Vecchi et al. 2014; Jia et al. 2015; Murakami et al. 2015, 2016a; Zhang et al. 2019), although for some quantities (e.g., snowpack in the Western U.S.; Kapnick et al. 2018) the seasonal prediction skill does not improve (and can degrade in places) between FLOR and HiFLOR.

The two highest resolution models used in this study (FLOR and HiFLOR) explicitly simulate TC-like vortices naturally with many of the characteristics (such as geographic distribution, track, seasonal and interannual variability) resembling those observed in nature (e.g., Vecchi et al. 2014; Murakami et al. 2015, 2016a; Zhang et al. 2016; Liu et al. 2017, 2018a, b). HiFLOR can simulate TC climatology and variability more faithfully than does FLOR (Murakami et al. 2015; Zhang et al. 2016; Liu et al. 2018b), and it exhibits improved seasonal prediction skill for TCs relative to FLOR (Murakami et al. 2016a; Zhang et al. 2019). HiFLOR is able to provide a more faithful representation of the TC intensity distribution than does FLOR, including the existence of “Major” (Saffir-Simpson Category 3–5) tropical cyclones (Murakami et al. 2015, 2016a; Bhatia et al. 2018). Our study builds on previous high-resolution modeling experiments that explore the sensitivity of TCs to CO2 increase and warming (e.g., Yoshimura and Sugi 2005; Yoshimura et al. 2006; Gualdi et al. 2008; Murakami and Sugi 2010; Sugi and Yoshimura 2012; Held and Zhao 2011; Zhao and Held 2012; Kim et al. 2014; Walsh et al. 2015), and we focus on the impact of resolution changes in the atmosphere within a family of coupled GCMs that share the same ocean and sea ice, and whose atmospheric configurations differ only in their horizontal resolution. We also complement our coupled experiment with a suite of targeted experiments, to isolate the impact of climatological SST biases, interannual variability, patterns of SST change, uniform warming and isolated effect of CO2 doubling on the TC response—and the difference in TC response across these two models.

Based on the published literature, we expected both TC-permitting GCMs to show a decrease in global TC frequency in response to CO2-induced warming, which has been seen across a broad range of high-resolution GCMs (e.g., Yoshimura and Sugi 2005; Yoshimura et al. 2006; Knutson et al. 2010; Held and Zhao 2011; Scoccimarro et al. 2011; Sugi and Yoshimura 2012; Walsh et al. 2015). However, as will be shown later in this paper, the response of global TC frequency to the climate changes driven by increased CO2 differs considerably between the two high-resolution coupled models: FLOR shows a decrease that is typical of other published GCM results, whereas HiFLOR shows either no significant change or an increase in global TC frequency, depending on the sea surface temperature (SST) climatology of the model.

In Sect. 2 we describe the models and experiments used in this study. In Sect. 3 we discuss the response of the models to the various perturbation experiments, beginning with the response of global mean temperature, then precipitation, and finally a discussion of the response of TCs. In Sect. 4 we present a summary and thoughts for further work.

2 Methods

2.1 Models

The principal tools in this study are a set of coupled ocean–atmosphere–land–sea ice GCMs developed at NOAA/GFDL, which share a number of common elements but differ principally in the horizontal resolution of their atmospheric components. The three GCMs used here are: (1) LOAR (Van der Wiel et al. 2016a), (2) FLOR (Vecchi et al. 2014), and (3) HiFLOR (Murakami et al. 2015). These three models have identical ocean and sea ice components, all with 1° × 1° spatial resolution (telescoping to 1/3° meridional resolution equatorward of 15°) derived from the ocean component of GFDL-CM2.1 (Delworth et al. 2006; Gnanadesikan et al. 2006), but with modified physical parameterizations and advection scheme as described in Vecchi et al. (2014). The three GCMs (LOAR, FLOR and HiFLOR) also have the same atmospheric and land models, including parameterizations, but are run at different spatial resolutions. All three models use version 3 of the GFDL land model (LM3; Milly et al. 2014) and GFDL’s finite volume cubed-sphere atmospheric dynamical core (FV3; Chen and Lin 2013), with the same 32 vertical levels and atmospheric physical parameterizations as in CM2.5 (Delworth et al. 2012). The lowest resolution model is LOAR, with 48 cells per side on each cubed-sphere face (C48) and an approximate horizontal resolution of 200 km at the Equator, followed by FLOR with approximately 50 km resolution (C180); HiFLOR has the highest resolution, approximately 25 km (C384), building on the model used in Chen and Lin (2011, 2013). The atmospheric dynamical timestep is adjusted to match each resolution, so it is halved between FLOR and HiFLOR and quintupled between FLOR and LOAR (the atmospheric physics timestep is also 50% longer in LOAR than in FLOR, but is the same in FLOR and HiFLOR). For analysis, the atmospheric and land model data are regridded conservatively from the cubed-sphere grids to latitude–longitude grids, with the LOAR data placed on a grid with 2.5° zonal and 2° meridional resolution, the FLOR data on a grid with 0.625° zonal and 0.5° meridional resolution, and the HiFLOR data on a grid with 0.25° resolution in both directions.
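The nominal grid spacings quoted above can be related to the cubed-sphere labels with a short arithmetic sketch. It assumes that four cube faces span the Equator, so that a C<N> grid places roughly 4N cells around it; the function name and the value used for the Earth's equatorial circumference are illustrative choices, not taken from the model code.

```python
# Nominal equatorial grid spacing of a cubed-sphere C<N> grid.
# Assumption: four cube faces span the Equator, giving 4*N cells over 360 degrees.
EARTH_CIRCUMFERENCE_KM = 40075.0  # approximate equatorial circumference (assumed)


def cubed_sphere_spacing(cells_per_side):
    """Return (degrees, km) nominal spacing at the Equator for a C<N> grid."""
    n_equator = 4 * cells_per_side
    return 360.0 / n_equator, EARTH_CIRCUMFERENCE_KM / n_equator


for name, n in [("LOAR", 48), ("FLOR", 180), ("HiFLOR", 384)]:
    deg, km = cubed_sphere_spacing(n)
    print(f"{name} (C{n}): {deg:.3f} deg, {km:.0f} km")
```

Under this assumption the three grids come out near 2°, 0.5° and 0.25°, in line with the approximate 200 km, 50 km and 25 km spacings given in the text.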

2.2 Fully-coupled experiments

To explore the response of the modeled climate system to increases in CO2, with each coupled model (LOAR, FLOR and HiFLOR, see Sect. 2.1 above) we performed a series of idealized fully-coupled model experiments. An experiment labeled “1990-Control” serves as our reference and maintains natural and anthropogenic radiative forcing and land use/land cover for 300 years at their levels for the year 1990, based on the Coupled Model Intercomparison Project Phase 5 (CMIP5, Taylor et al. 2012) historical forcing scenario. The oceans in these experiments were initialized from observed present day climatology, and the model systems drifted from this initial state due both to the model’s inherent biases and the radiative imbalance induced by this late-20th century radiative forcing.

We generated perturbation experiments from each of the 1990-Control runs, with CO2 increasing from year 101 (i.e., 100 years after initialization), and all other radiative forcing and boundary conditions fixed at the control levels. The perturbation experiments involve an idealized increase of atmospheric CO2 from 1990 control values, at a rate of 1% per year (compounded) until doubling (approximately after 70 years, so to model year 170), at which point the atmospheric CO2 concentrations are held fixed at this doubled concentration for an additional ~ 130 years (to model year 300). We label these experiments the “Transient 2 × CO2 experiments.”
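The roughly 70-year doubling time follows directly from compounding at 1% per year. The sketch below works through that arithmetic and builds the idealized concentration pathway; the reference concentration `c0` is normalized to 1 because the text does not state a value, and the convention that model year 101 still carries the control concentration is an assumption made for illustration.

```python
import math


def years_to_double(rate_per_year=0.01):
    """Years needed to double under compounded growth at the given rate."""
    return math.log(2.0) / math.log(1.0 + rate_per_year)


def co2_concentration(model_year, c0=1.0, start_year=101, rate=0.01):
    """Idealized pathway: control value, then compounded growth capped at 2*c0.

    c0 is a normalized reference concentration (hypothetical; not from the text).
    """
    if model_year < start_year:
        return c0
    return min(c0 * (1.0 + rate) ** (model_year - start_year), 2.0 * c0)


print(f"doubling time: {years_to_double():.1f} years")
for year in (100, 136, 170, 171, 300):
    print(year, round(co2_concentration(year), 3))
```

The doubling time evaluates to a little under 70 years, which is why the concentration is held fixed from about model year 170 onward.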

These experiments are described in Van der Wiel et al. (2016a), Pascale et al. (2017) and Murakami et al. (2017a, b). When exploring the impact of changing atmospheric CO2 concentrations in the fully coupled GCM, for each model we subtract the 1990-Control simulation results from those of the Transient 2 × CO2 experiment. We label this response to increasing CO2 the “Transient 2 × CO2 response”, to distinguish it from a suite of experiments described below in which various aspects of the CO2 forcing are isolated. For most analyses we explore differences between each pair of coupled simulations averaged over years 201–250 from initialization (so 31–80 years after CO2 doubling).

2.3 Nudged-SST experiments

The full coupled GCM response to increasing greenhouse gases arises due to a number of factors, including the impact of CO2 on atmosphere and land without SST changes (hereafter the “isolated impact of CO2”), the response of climate to overall ocean surface warming, and the response of climate to the spatial patterns of ocean surface warming. In addition, the Full GCM response occurs in a system with biases in SST and other variables, which may impact the character of the response. Finally, the Full GCM response potentially includes climate changes due to nonlinearity of the climate response to the superposition of climate variability and mean state change (e.g., a systematic shift in the symmetry of El Niño could lead to long-term changes in the climate system); we refer to this as the rectified impact of variability and mean warming.

To refine our understanding of the models’ responses to increasing greenhouse gases, idealized perturbation experiments were performed in which the SST of the fully-coupled model is “nudged” to different climatological SST targets, and idealized perturbations are applied to the model’s CO2 concentration. Specifically, the SST tendency equation in the ocean is modified to be:

$$\partial SST\left( {x,y,t} \right)/\partial t = \chi \left( {x,y,t} \right) + \frac{1}{\tau }\left( {SST_{T} \left( {x,y,t} \right) - SST\left( {x,y,t} \right)} \right)$$
(1)

where \(\partial SST\left( {x,y,t} \right)/\partial t\) is the time-tendency of SST that is applied at a particular location and time-step in the nudged-SST experiment, \(\chi \left( {x,y,t} \right)\) is the coupled model’s tendency term for SST based on the state of the model and its governing equations, \(\tau\) is the nudging timescale (5 days in this case), \(SST_{T}\) is the target SST that the model will be nudged towards (interpolated to the model time-step from a monthly-mean value), and \(SST\) is the sea surface temperature of the model. In the absence of model tendency, this formulation would act to bring the model’s SST towards the target with an e-folding time-scale of \(\tau\). However, because of the model SST tendency term (which involves advection, mixing and heat fluxes), the SST of the model can deviate from the target, even on multi-month timescales (though the deviations will be the largest on timescales shorter than \(\tau\)). This technique is described in Vecchi et al. (2014) and Pascale et al. (2017).
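The behavior of Eq. 1 at a single grid point can be illustrated with a toy integration. This is a minimal sketch, not the ocean model's implementation: it uses a forward-Euler step, treats the model tendency χ as a constant supplied by the caller, and the function name, time step and test values are all hypothetical.

```python
import math


def integrate_nudged_sst(sst0, target, chi, tau_days=5.0, dt_days=0.05, n_days=5.0):
    """Forward-Euler integration of dSST/dt = chi + (target - SST) / tau.

    sst0, target in K (or deg C); chi in K/day (held constant here for illustration).
    """
    sst = sst0
    for _ in range(round(n_days / dt_days)):
        tendency = chi + (target - sst) / tau_days
        sst += dt_days * tendency
    return sst


# No model tendency: a 1 K anomaly decays by roughly a factor of e after tau days.
after_tau = integrate_nudged_sst(sst0=29.0, target=28.0, chi=0.0)
print(after_tau - 28.0, math.exp(-1.0))

# Constant model tendency: SST settles at an offset of chi * tau from the target.
steady = integrate_nudged_sst(sst0=28.0, target=28.0, chi=0.1, n_days=200.0)
print(steady - 28.0)
```

The second case shows why the nudged SST need not match the target: a persistent model tendency of 0.1 K per day is balanced by the restoring term only once the SST sits about 0.5 K away from the target.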

The goal of the nudged SST setup is to give control over the evolution of SST in the coupled model, while still allowing some level of high-frequency coupling between the ocean and the atmosphere. The high-frequency coupling between ocean and atmosphere is desired, in part, because there is evidence that ocean–atmosphere coupling may be important in the evolution of TC intensity (e.g., Lin et al. 2003; Lloyd and Vecchi 2011; Vincent et al. 2014), reproducing summer monsoon precipitation in the western Pacific (Wang et al. 2005), and the rainfall-SST relationship across the tropics (Wu et al. 2008; Kirtman and Vecchi 2011). Through these “nudged SST” experiments we are able to assess the extent to which the response of the fully-coupled models involves a rectified impact of changes in variability, depends on the underlying SST climatology, and/or is connected to particular patterns in SST change. The experiments performed with each of the three models are listed in the first column of Table 1.

Table 1 Nudged-SST experiments performed with the LOAR, FLOR and HiFLOR GCMs

Two reference experiments were generated, nudging SST to different climatologies and holding radiative forcing at 1990 levels. For the first one, the target is the SST climatology from the fully-coupled 1990 Control simulation from each model, which we label “MoC”. The second nudged-SST reference experiment, which we label “ObC”, used as its target the observed monthly-mean climatology over 1986–2005 from HadISSTv1 (Rayner et al. 2003), linearly interpolated to the model time-step.

Perturbation experiments were generated relative to each reference experiment, in which the CO2 levels are doubled, and the SSTs are nudged towards the sum of the reference climatology (either observed or coupled model generated) and the climatological SST response of each model (LOAR, FLOR or HiFLOR) to CO2 doubling over years 201–250. Specifically, for each model, m, we compute the climatological SST perturbation as:

$$\widehat{SST}_{m} \left( {x,y,t} \right) = \frac{1}{50}\sum\limits_{year = 201}^{250} \left[ {SST_{m}^{{trans - 2 \times CO_{2} }} \left( {x,y,t} \right) - SST_{m}^{1990 - Control} \left( {x,y,t} \right)} \right]$$
(2)

where, at each (longitude, latitude) point (x,y), \(SST_{m}^{{trans - 2 \times CO_{2} }}\) is the monthly-mean SST for each month t (January, February…) from the Transient 2 × CO2 experiment, and \(SST_{m}^{1990 - Control}\) is the monthly-mean SST for each month t from the 1990-Control experiment of each model m. We label these perturbation experiments “ObC + full” and “MoC + full”, depending on whether we use the observed or model SST climatology as a background for the warming perturbation. In combination with ObC and MoC, these form two experiment pairs that allow us to understand the impact of each model’s SST biases on the climate response to CO2. The response is computed as the difference between the perturbed experiment (e.g., ObC + full) and the relevant reference experiment (e.g., ObC), and we label this response with the Greek letter ∆ (e.g., ∆ObC + full = ObC + full − ObC).
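The construction in Eq. 2 and its use as a nudging target can be sketched for a single grid point as follows. The data layout (a dictionary mapping model year to twelve monthly means) and the function names are assumptions made for illustration; the actual workflow operates on full model fields.

```python
def monthly_climatology_difference(transient, control, first_year=201, last_year=250):
    """Eq. 2 at one grid point: mean transient-minus-control SST per calendar month.

    transient, control: dict mapping model year -> list of 12 monthly-mean SSTs.
    """
    n_years = last_year - first_year + 1
    return [
        sum(transient[y][m] - control[y][m] for y in range(first_year, last_year + 1)) / n_years
        for m in range(12)
    ]


def perturbed_target(reference_climatology, sst_perturbation):
    """Nudging target for the '+ full' experiments: reference climatology plus perturbation."""
    return [ref + d for ref, d in zip(reference_climatology, sst_perturbation)]


# Synthetic example: the transient run is uniformly 2 K warmer than the control.
control = {y: [20.0 + 0.1 * m for m in range(12)] for y in range(201, 251)}
transient = {y: [v + 2.0 for v in control[y]] for y in control}
delta = monthly_climatology_difference(transient, control)
target = perturbed_target([19.5] * 12, delta)  # 19.5 is a made-up reference value
print([round(d, 3) for d in delta], round(target[0], 3))
```

The same perturbation is added to either the observed or the model climatology, which is what allows the ObC and MoC pairs to differ only in their background state.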

The ∆MoC + full and ∆ObC + full responses arise from the same perturbation to SST and radiative forcing, thus differences between these two responses will reveal the influence of each model’s background climatological biases on its multi-decadal response to CO2 doubling. This impact will depend on the character of each model’s SST biases (e.g., a model with minimal SST biases should have minimal impact of the SST biases on the response). Meanwhile, comparing the Full GCM response to ∆MoC + full allows us to explore the impact of nudging relative to full coupling, which includes the impact of any non-linear superposition of intrinsically generated climate variability in the full GCM onto radiatively-forced climate changes.

We generated a series of even more idealized perturbation experiments, starting from ObC of each model, in order to understand the impact of simplified perturbations in the fully coupled CO2-induced response of each model. To explore the extent to which the modeled response to CO2 depends on the pattern of SST response versus an overall warming and CO2 increase, we performed an idealized experiment that doubled the reference CO2 and added a globally-uniform 2 K warming to the SST-nudging target (we label this experiment ObC + 2K + 2 × CO2, and its response ∆ObC + 2K + 2 × CO2). Although each coupled model has its own transient climate response, which is lower than 2 K, we use a uniform 2 K warming here to be consistent with previous idealized studies (e.g., Yoshimura and Sugi 2005; Yoshimura et al. 2006; Held and Zhao 2011; Walsh et al. 2015). To explore the direct effects of warming, we generated an experiment in which we hold the CO2 at the reference level of the 1990 Control, but add a uniform 2 K warming to the SST-nudging target (we label this experiment ObC + 2K, and its response ∆ObC + 2K). To isolate the direct and semi-direct impacts of CO2 increase (i.e., those that do not arise from SST changes), for each model we generate a perturbation of the reference experiment by doubling CO2, but holding the SST target fixed (we label this experiment ObC + 2 × CO2, and its response ∆ObC + 2 × CO2). We based these idealized experiments on the ObC experiment rather than the MoC experiment for two main reasons: (1) we wanted to compare our results to the experiments performed under the US-CLIVAR Working Group on Hurricanes and Climate (Walsh et al. 2015), which had control experiments using repeating observed SST climatology, and (2) we wanted to exclude as much as possible any impacts of climatological SST differences among these three models, in order to focus on the effects of SST and CO2 perturbations.

2.4 Tropical cyclone tracker

To track TCs in the two TC-permitting GCMs (FLOR and HiFLOR), we use the tracker developed in Harris et al. (2016), with parameter settings as in Murakami et al. (2016a). The algorithm identifies model TCs by tracking high cyclonic vorticity features, with a sea-level pressure minimum, localized warming in the mid troposphere (a “warm core”) and high near-surface winds. As such, the inputs to the tracker are instantaneous 6-hourly outputs of sea level pressure, mid-tropospheric temperature, 850-hPa vorticity, and 10-m zonal and meridional winds. The storm tracker is applied to the output of FLOR and HiFLOR after regridding to the relevant latitude–longitude grid from the original cubed-sphere grid. Because of their differing resolutions, the parameter settings used here for the second step of the Harris et al. (2016) tracker are different in FLOR and HiFLOR: the minimum wind speed criterion for HiFLOR is higher than that for FLOR, at 17 m s−1 and 15.3 m s−1, respectively, and the minimum warm core temperature anomaly relative to the surrounding environment in HiFLOR is higher than that for FLOR, at 2 K relative to 1 K. The wind speed criterion in FLOR is chosen based on the suggestions of Walsh et al. (2007) of a threshold 10% below gale force (17 m s−1) for a 50 km resolution model. The warm core threshold values are selected to give similar (and comparable to observations) global-mean frequency in the control experiments (see Murakami et al. 2015, 2016a).

2.5 Large-scale factors affecting tropical cyclone activity

A number of factors in large-scale climate have been suggested as drivers of changes in tropical cyclone activity, and some of those are used in Sects. 3.2.2 and 3.3.2 to understand the TC changes in the experiments with FLOR and HiFLOR described in Sects. 2.2–2.3. Increases in vertical wind shear have been shown to limit TC activity, while increases in TC potential intensity, mid-tropospheric humidity and lower tropospheric vorticity are associated with increases in TC activity (e.g., Emanuel and Nolan 2004; Tippett et al. 2011; Emanuel 2013). These individual factors have been combined in a variety of genesis indices that aim to assess the potentially offsetting changes in individual drivers (e.g., Emanuel and Nolan 2004; Tang and Emanuel 2012b; Tippett et al. 2011; Emanuel 2013; Tang and Camargo 2014). Indices based on these quantities have been used to explore the sensitivity of TCs to climate in GCMs (e.g., Camargo et al. 2007a, b; Vecchi and Soden 2007a, b; Camargo 2013; Baldwin et al. 2019), with mixed success.

In this study we explore the magnitude of the vector difference of wind velocity at 850 hPa and 200 hPa, computed from monthly-mean output, as our measure of wind shear; the differences in climate-scale changes in shear computed from monthly-mean winds and from daily-mean winds have been found to be modest (Vecchi and Soden 2007b). Increased shear is associated with a more unfavorable environment for TCs. We also look at Bister and Emanuel (1998) potential intensity, or PI, computed from monthly-mean model output using the Fortran code made available by Prof. Kerry Emanuel (ftp://texmex.mit.edu/pub/emanuel/TCMAX/pcmin_2013.f). PI is the theoretical upper bound on tropical cyclone intensity, and increased PI reflects a more favorable environment for TC development. We also explore absolute vorticity computed from monthly-mean winds at 850 hPa; high vorticity reflects a more favorable environment for TC activity.
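Two of the quantities above reduce to simple point-wise formulas, sketched here for a single grid point. The function names are illustrative, the rotation rate is the standard value for the Earth, and the relative vorticity is taken as a given input rather than computed from gridded winds.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s (standard value)


def vertical_wind_shear(u850, v850, u200, v200):
    """Magnitude of the 200 hPa minus 850 hPa wind vector difference (m/s)."""
    return math.hypot(u200 - u850, v200 - v850)


def absolute_vorticity(relative_vorticity, latitude_deg):
    """Relative vorticity plus the Coriolis parameter f = 2 * Omega * sin(lat), in 1/s."""
    return relative_vorticity + 2.0 * OMEGA * math.sin(math.radians(latitude_deg))


# Illustrative monthly-mean winds (m/s): easterlies aloft over low-level westerlies.
print(vertical_wind_shear(u850=3.0, v850=0.0, u200=0.0, v200=4.0))
print(absolute_vorticity(relative_vorticity=1.0e-5, latitude_deg=15.0))
```

Using the vector difference rather than the difference of wind speeds means that a change in wind direction with height counts as shear even when the speeds at the two levels match.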

Mid-tropospheric drying is deleterious for TC genesis, and we explore three different metrics for humidity, each computed from monthly-mean model output. The first is relative humidity (RH) at 700 hPa; high values of RH indicate a TC-favorable environment. It has been argued that measures of mid-tropospheric moisture that account for the moisture difference between the middle troposphere and the lower troposphere/planetary boundary layer, such as saturation deficit (Emanuel 2013) and entropy deficit (Tang and Emanuel 2010, 2012a), are more relevant to understanding the sensitivity of TC activity changes to the environment than mid-tropospheric relative humidity. Increases in both mid-tropospheric saturation deficit and entropy deficit are associated with an environment less favorable for TC genesis. Therefore, we also explore saturation deficit and entropy deficit.

Emanuel (2013) suggests a non-dimensional genesis potential index (or GPI) that combines a number of TC-relevant indices to connect them to changes in TC genesis. This GPI provides a way to merge the modeled changes in quantities to assess the expected overall change in TC activity, and is of potential utility given the spatially heterogeneous changes in TC-relevant parameters shown in Fig. 6, along with the tendency of the various parameters to change in ways with opposing expected influences on TC activity in different regions. Following Emanuel (2013) we compute the GPI using monthly-mean model output as:

$$GPI = \left| \eta \right|^{3} \chi_{saturation}^{ - 4/3} { \hbox{max} }\left( {\left( {PI - 35m/s} \right),0} \right)^{2} \left( {25m/s + u_{shear} } \right)^{ - 4}$$
(3)

where \(\eta\) is the 850 hPa absolute vorticity, \(\chi_{saturation}\) is the saturation deficit at 600 hPa, \(PI\) is the Bister and Emanuel (1998) potential intensity and \(u_{shear}\) is the 850–200 hPa vertical wind shear.
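Eq. 3 can be evaluated point by point once its four inputs are available. The sketch below is a direct transcription of the formula under the assumption that PI and shear are supplied in m/s (as the 35 m/s and 25 m/s constants imply) and that the saturation deficit is positive; the function name and the sample values are hypothetical.

```python
def genesis_potential_index(abs_vorticity, saturation_deficit, potential_intensity, shear):
    """Evaluate Eq. 3 at one grid point.

    abs_vorticity: 850 hPa absolute vorticity (1/s)
    saturation_deficit: 600 hPa saturation deficit (must be > 0)
    potential_intensity, shear: m/s
    """
    pi_excess = max(potential_intensity - 35.0, 0.0)
    return (
        abs(abs_vorticity) ** 3
        * saturation_deficit ** (-4.0 / 3.0)
        * pi_excess ** 2
        * (25.0 + shear) ** (-4)
    )


base = genesis_potential_index(5.0e-5, 1.0, 70.0, 10.0)
more_shear = genesis_potential_index(5.0e-5, 1.0, 70.0, 15.0)
low_pi = genesis_potential_index(5.0e-5, 1.0, 30.0, 10.0)
print(base, more_shear / base, low_pi)
```

The index vanishes wherever PI falls below 35 m/s, and it responds to vorticity, PI and shear with different exponents, which is how it weighs changes in the individual drivers against each other.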

Tang and Emanuel (2012b) suggest an alternative non-dimensional index, the ventilation index or \(\varLambda\), which has both a theoretical and empirical connection to the probability of TC genesis. Tang and Emanuel (2012b) show that the observed probability that an atmospheric wave will become a TC is strongly modulated by the ventilation index, with large values of the ventilation index leading to vanishingly small probability of genesis, and very small values of the ventilation index leading to extremely large (almost one) probabilities of genesis. Using monthly-mean model output we compute the ventilation index as:

$$\varLambda = \frac{{u_{shear} \chi_{entropy} }}{PI}$$
(4)

where \(u_{shear}\) is the 850–200 hPa vertical wind shear, \(\chi_{entropy}\) is the entropy deficit computed as in Eqs. 2 and 3 in Tang and Emanuel (2012b), and \(PI\) is the Bister and Emanuel (1998) potential intensity. Large values of the ventilation index are associated with unfavorable conditions for TC genesis.
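Because Eq. 4 is a pure product and quotient, a small change in the ventilation index can be split into contributions from its three factors. Taking the logarithm of Eq. 4 and differentiating gives, to first order (a linearization that holds only for small fractional changes, and one offered here as an illustration rather than as a diagnostic used in this study):

$$\frac{\delta \varLambda }{\varLambda } \approx \frac{{\delta u_{shear} }}{{u_{shear} }} + \frac{{\delta \chi_{entropy} }}{{\chi_{entropy} }} - \frac{\delta PI}{PI}$$

so fractional increases in shear or entropy deficit raise \(\varLambda\) (less favorable for genesis), while a fractional increase in PI lowers it.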

Held and Zhao (2011) suggested TC-genesis weighted changes in 500 hPa pressure velocity as a way to explain the response across a set of perturbation experiments of the GFDL HiRAM model; thus we also explore the fractional changes in 500 hPa pressure velocity weighted by the monthly climatological genesis density of the corresponding control experiment.
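One way to form such a genesis-weighted fractional change is sketched below. The exact weighting and normalization used in the analysis are not spelled out in the text, so this is an assumed form: both experiments are weighted by the control genesis density, and the inputs are flat sequences over grid cells and calendar months. All names and values are hypothetical.

```python
def genesis_weighted_fractional_change(omega_ctrl, omega_pert, genesis_density):
    """Fractional change of genesis-weighted 500 hPa pressure velocity.

    omega_ctrl, omega_pert: pressure velocity (Pa/s) in control and perturbed runs
    genesis_density: control-run climatological genesis density used as weights
    """
    weighted_ctrl = sum(w * o for w, o in zip(genesis_density, omega_ctrl))
    weighted_pert = sum(w * o for w, o in zip(genesis_density, omega_pert))
    return (weighted_pert - weighted_ctrl) / weighted_ctrl


# Three sample cells; ascent (negative omega) weakens where genesis is most frequent.
change = genesis_weighted_fractional_change(
    omega_ctrl=[-0.05, -0.02, 0.03],
    omega_pert=[-0.04, -0.02, 0.03],
    genesis_density=[2.0, 1.0, 0.0],
)
print(round(change, 4))
```

Cells with zero genesis density drop out of both sums, so the measure reflects vertical motion only where the control run actually forms storms.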

Changes in the synoptic-scale variability that precedes TC genesis have been suggested as another influence of climate on TC activity. The importance of pre-TC synoptic variability was suggested by Yoshimura and Sugi (2005) and Yoshimura et al. (2006) as a potential driver of the global TC frequency response to idealized global perturbations. Li et al. (2010) argued that the response of tropical Pacific cyclone location to projected 21st century warming reflected changes in synoptic scale disturbances in the central Pacific, arising due to changes in wind shear. In order to explore the hypothesis that pre-TC synoptic scale disturbances, which we shall refer to as “TC Seeds”, are a main driver of the TC response in these models, we examine the index developed in Li et al. (2010): the variance of 3–10 day bandpass filtered 850 hPa vorticity computed for the Northern Hemisphere over the period July–October and for the Southern Hemisphere over the period December–March across the 50 years of each model experiment. In order to mitigate the potential contamination of conclusions about the relationship between changes in synoptic-scale variance and TC frequency by the TCs themselves, the variance is computed after removing the vorticity within 500 km of each TC identified by the tracker and linearly interpolating in space to fill the missing values. We note that the qualitative nature of the results presented here is not impacted if we retain the TCs in computing this index.
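The idea of a band-limited variance can be illustrated with a deliberately crude filter. The sketch below is a stand-in, not the filter of Li et al. (2010): it approximates a 3–10 day band by differencing a 3-day and an 11-day centered running mean of daily data at one grid point, and it omits the seasonal selection and the removal of vorticity near tracked TCs. Function and parameter names are hypothetical.

```python
import math


def bandpass_variance(daily_vorticity, short_window=3, long_window=11):
    """Variance of (short running mean - long running mean) of a daily series.

    A crude band-pass stand-in: the short mean damps day-to-day noise and the
    long mean removes slow variations. Both windows must be odd.
    """
    half_long = long_window // 2
    half_short = short_window // 2
    filtered = []
    for i in range(half_long, len(daily_vorticity) - half_long):
        short_mean = sum(daily_vorticity[i - half_short:i + half_short + 1]) / short_window
        long_mean = sum(daily_vorticity[i - half_long:i + half_long + 1]) / long_window
        filtered.append(short_mean - long_mean)
    mean = sum(filtered) / len(filtered)
    return sum((x - mean) ** 2 for x in filtered) / len(filtered)


# A 6-day oscillation passes the filter; a 60-day oscillation is largely removed.
synoptic = [math.sin(2.0 * math.pi * t / 6.0) for t in range(240)]
slow = [math.sin(2.0 * math.pi * t / 60.0) for t in range(240)]
print(bandpass_variance(synoptic), bandpass_variance(slow))
```

A production analysis would more likely use a properly designed band-pass filter; the running-mean difference is used here only because it keeps the example short and dependency-free.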

3 Results

3.1 Global-mean climate response

We begin by exploring the response to CO2 doubling of globally-averaged quantities in the three fully-coupled model experiment pairs (1990-Control and Transient 2 × CO2; Fig. 1). Consistent with expectations, in all three models increasing CO2 causes the net top of atmosphere (TOA) radiative imbalance to increase (i.e., more radiation enters the earth system in the net). The imbalance increases steadily as CO2 rises and then decreases more gradually once CO2 levels stabilize at 2 × CO2 (Fig. 1a). There is a tendency for the lowest (highest) resolution model to have a larger (smaller) decrease in net TOA radiative imbalance over the 130 years after CO2 doubling.

Fig. 1

Transient 2 × CO2 response from the fully-coupled models of (a) top of atmosphere (TOA) net radiation, (b) global-mean surface temperature, (c) tropical (30°S–30°N) sea surface temperature, (d) full ocean temperature, (e) precipitation (as a percent of Control experiment values), and (f) atmospheric circulation at 500 hPa (as a percent of Control experiment values). Atmospheric CO2 concentrations begin increasing by 1% per year in the perturbation experiment starting in model year 101, and are held fixed after doubling (time indicated by thin vertical line in each panel). Response is computed as the difference between the Transient 2 × CO2 and the 1990-Control experiments of each GCM. Averages are low-pass filtered using a 17-year cosine weighted filter. The 500 hPa atmospheric circulation is calculated as the spatial average of downward (positive) minus upward (negative) 500 hPa pressure velocities. Blue lines show the values for LOAR, red lines for FLOR and black lines for HiFLOR

In response to the TOA imbalance, there is a warming of the global surface (Fig. 1b), tropical ocean surface (Fig. 1c) and full ocean (Fig. 1d) in all the models, which continues after the CO2 levels are stabilized in year 171. The tropical ocean warming across the three models largely reflects the spread in global surface warming (Fig. 1b, c). The largest full-depth ocean warming is found in the highest resolution model, which also exhibits the smallest surface warming; meanwhile, the lowest resolution model exhibits the smallest full-ocean warming but the largest surface warming (Fig. 1b, d). The more rapid decrease in net TOA radiation imbalance in the LOAR and FLOR models over the 130 years (relative to HiFLOR) reflects, at least partly, their more rapid surface warming (e.g., Soden and Held 2006; Soden et al. 2008; Winton et al. 2010).

All three models show global-mean precipitation increases at a pace smaller (~ 1–2% K−1) than expected from Clausius-Clapeyron scaling or the actual increase in atmospheric moisture in these models (~ 7.5% K−1), though the higher resolution models show a slightly faster precipitation increase than does the 2° LOAR model (Fig. 1e), even with less surface warming. This fractional response is similar to that of Coupled Model Intercomparison Project Phase 3 (CMIP3, Meehl et al. 2007) and CMIP5 models (Taylor et al. 2012), which exhibit a precipitation increase smaller than the increase in atmospheric moisture. Global precipitation is constrained by the response of atmospheric radiative cooling in models (e.g., Held and Soden 2006; Vecchi and Soden 2007a), and a precipitation increase smaller than the increase in atmospheric moisture suggests a reduction in the strength of atmospheric circulation (e.g., Held and Soden 2006). All three models show a reduction of global circulation (defined as the difference between the spatial average of upward and the spatial average of downward 500 hPa pressure velocities; Fig. 1f), and in the strength of the Pacific Walker circulation (see Sect. 3.2.1), consistent with this expectation (Knutson and Manabe 1995; Vecchi and Soden 2007a); the strength of the reduction of atmospheric circulation scales with global-mean surface temperature change across these three models.
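The link between slow precipitation growth and a weaker circulation can be illustrated with a back-of-envelope calculation. The sketch below assumes the bulk scaling P ≈ M q of Held and Soden (2006), where M is a convective mass flux and q the boundary-layer moisture; the function name is hypothetical, and the inputs are the approximate rates quoted in the text rather than values diagnosed from LOAR, FLOR or HiFLOR.

```python
def implied_circulation_change(precip_pct_per_K, moisture_pct_per_K):
    """Percent change per kelvin in the mass flux M implied by P = M * q.

    Assumption: the bulk scaling P ~ M * q (Held and Soden 2006) holds,
    so M'/M = (P'/P) / (q'/q).
    """
    p_ratio = 1.0 + precip_pct_per_K / 100.0
    q_ratio = 1.0 + moisture_pct_per_K / 100.0
    return 100.0 * (p_ratio / q_ratio - 1.0)


# Illustrative inputs: precipitation at 1-2 % per K, moisture at 7.5 % per K.
for dP in (1.0, 2.0):
    dM = implied_circulation_change(dP, 7.5)
    print(f"dP = {dP:.1f} %/K  ->  implied dM = {dM:.2f} %/K")
```

Under these assumptions the implied mass flux weakens by roughly 5–6% per kelvin, which shows the sense of the argument only; the circulation index in Fig. 1f is defined differently and need not match this number.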

Estimating the transient climate response (TCR) as the global-mean surface temperature response averaged over years 161–180 (centered on the year of CO2 doubling), we find a TCR of 1.78 K in LOAR, 1.63 K in FLOR and 1.53 K in HiFLOR. The differences in global-mean surface temperature response between HiFLOR and the two other models persist over the period following CO2 stabilization: averaged over years 201–250, the surface warming is 2.35 K in LOAR, 2.26 K in FLOR and 1.85 K in HiFLOR.
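The TCR estimate described above is a window average of the perturbed-minus-control temperature difference. A minimal sketch follows, using a synthetic anomaly series; the function name and the synthetic ramp are assumptions for illustration and are not model output.

```python
import numpy as np


def window_mean_response(years, transient, control, start, end):
    """Mean of (transient - control) over model years start..end inclusive."""
    years = np.asarray(years)
    response = np.asarray(transient, dtype=float) - np.asarray(control, dtype=float)
    mask = (years >= start) & (years <= end)
    if not mask.any():
        raise ValueError("no data in the requested window")
    return response[mask].mean()


# Synthetic illustration only: a control at 288 K and a transient run that
# warms linearly while CO2 rises (years 101-171), then drifts slowly upward.
years = np.arange(101, 301)
control = np.full(years.shape, 288.0)
warming = np.where(years <= 171,
                   1.6 * (years - 101) / 70.0,
                   1.6 + 0.005 * (years - 171))
transient = control + warming

tcr = window_mean_response(years, transient, control, 161, 180)    # TCR window
late = window_mean_response(years, transient, control, 201, 250)   # stabilization window
print(f"TCR-like estimate: {tcr:.2f} K, years 201-250 mean: {late:.2f} K")
```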

One possible interpretation of the TCR differences among the three models is that they have different atmospheric climate feedback strengths. This is a reasonable hypothesis, as one may speculate that the different resolutions in the atmosphere may lead to distinct cloud, water vapor and lapse-rate responses in each model, and thus feedbacks. To evaluate this hypothesis, we computed the strength of the transient radiative feedback terms for each model using the radiative kernels of Soden et al. (2008) (Fig. 2a), and find that: (1) the three models have similar net radiative feedback magnitudes, and (2) the small spread in net feedback strength does not align with the spread in TCR, as the medium resolution (and TCR) model (FLOR) shows the largest net radiative feedback strength. The three models show systematic differences in temperature and water vapor feedbacks which are strongest in LOAR due to its highest global mean surface temperature in the control run. However, the differences in temperature and water vapor feedbacks largely offset each other through the maintenance of near-constant relative humidity (Soden and Held 2006), resulting in similar net radiative feedbacks. That is, the spread in TCR in these models cannot be explained through differences in feedback strength, i.e., equilibrium climate sensitivity, and must represent differences in oceanic heat uptake.

Fig. 2

a Average individual and total radiative feedbacks. Units are W m−2 K−1. b Relationship between the global mean TOA radiation anomalies and surface temperature anomalies during the CO2 stabilization period. The crosses represent 20-year averages and the round dots show averages of the entire CO2 stabilization period. The solid lines are the linear least squares fit to each set of crosses, whereas the dashed lines connect the round dots and the (0,0) point. The global mean CO2 radiative forcing is indicated on the y-axis. The average ocean heat uptake efficiency for the CO2 stabilization period is shown as the slope of the dashed lines. The heat uptake efficacy for the CO2 stabilization period is shown as the ratio of the global mean radiative forcing (3.5 W m−2) to the y-intercept of the solid lines

To further illustrate this point, we show the scatterplot of global mean surface temperature changes and TOA radiation during the stabilized CO2 period (Fig. 2b). The similarity in transient radiative feedbacks is reflected in the similarity in the ocean heat uptake efficacy, which is shown as the ratio of the CO2 radiative forcing (estimated as 3.5 W m−2) to the y-intercept of the solid lines in Fig. 2b (Winton et al. 2014; He et al. 2016). The efficacy is a measure of the transient radiative feedback associated with ocean heat uptake; a large efficacy corresponds to a small net transient radiative feedback and acts to slow down surface warming (Winton et al. 2010). The medium resolution (and TCR) model (FLOR) shows the lowest ocean heat uptake efficacy, which is consistent with its largest net transient radiative feedback. Therefore, we conclude that: (1) for this model family, the strengths of the individual and net feedbacks are likely controlled by the physical parameterizations that are common to the three models, and (2) the spread in TCR in these models is not due to differences in radiative feedback strength.

We suggest that the spread in TCR is best understood in terms of differences in heat uptake efficiency in these three models, as has been done in other recent studies (Raper et al. 2002; Kuhlbrodt and Gregory 2012; Winton et al. 2014; He et al. 2016). Supporting this hypothesis is the inverse relation between the full-ocean depth temperature response and the surface temperature response: the model that takes up the most (least) heat in the ocean warms the least (most) at the surface (Fig. 1b, d). The ocean heat uptake efficiency for the CO2 stabilization period is shown as the slope of the dashed lines in Fig. 2b, which connect the point (0, 0) with the points of average surface temperature change and average TOA radiation change (Winton et al. 2014; He et al. 2016). The differences in ocean heat uptake efficiency are substantial among the three models, and agree with the differences in their TCR: the largest TCR model (LOAR) shows the lowest ocean heat uptake efficiency. The role of ocean heat uptake in explaining the difference in these three models is peculiar, since the three models have exactly the same ocean and sea ice components; their fundamental difference is their atmospheric and land resolution. Most of the heat uptake by these models, and the difference in heat uptake across the models, is equatorward of 40° and above 1500 m depth, although they also show some differences in deep ocean heat uptake in the Southern Ocean. The mechanisms of the difference in heat uptake between the models are complex, and are to be explored in future work.
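The two diagnostics read off Fig. 2b can be sketched as follows, assuming one has 20-year averages of the surface temperature anomaly (x-axis) and TOA radiation anomaly (y-axis) for the stabilization period. The function name and the input numbers are hypothetical; only the 3.5 W m−2 forcing value comes from the text. With equal-length averaging windows, the mean of the 20-year averages equals the full-period average.

```python
import numpy as np

FORCING_2XCO2 = 3.5  # W m-2, the CO2 forcing value quoted in the text


def heat_uptake_metrics(dT, dN, forcing=FORCING_2XCO2):
    """Return (efficiency, efficacy) from paired 20-year averages.

    dT: surface temperature anomalies (K)
    dN: TOA net radiation anomalies (W m-2)
    efficiency: slope of the line from (0, 0) to the period-mean point
    efficacy:   forcing divided by the y-intercept of the least-squares line
    """
    dT = np.asarray(dT, dtype=float)
    dN = np.asarray(dN, dtype=float)
    slope, intercept = np.polyfit(dT, dN, 1)   # solid line in Fig. 2b
    efficiency = dN.mean() / dT.mean()         # dashed line in Fig. 2b
    efficacy = forcing / intercept
    return efficiency, efficacy


# Hypothetical values lying exactly on N = 2.8 - 1.0 * T (not model output).
dT = [1.8, 1.9, 2.0, 2.1]
dN = [1.0, 0.9, 0.8, 0.7]
efficiency, efficacy = heat_uptake_metrics(dT, dN)
print(f"efficiency = {efficiency:.3f} W m-2 K-1, efficacy = {efficacy:.3f}")
```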

3.2 Large-scale tropical responses

3.2.1 Patterns of tropical SST and rainfall change

We next explore the response of aspects of the large-scale climate state in the tropics that have been linked to TC activity changes. These are quantities, such as precipitation, vertical wind shear, and TC potential intensity, that are directly simulated by GCMs or can be readily computed from GCM output, including GCMs at resolutions too low to accurately simulate TC climatology. These large-scale responses will help set the stage for the directly-modeled TC responses discussed in Sect. 3.3.

Changes in patterns of tropical SST have been shown to be a useful proxy for changes in large-scale quantities more directly connected to TCs (e.g., Sugi et al. 2002; Vecchi and Soden 2007b, c; Xie et al. 2010), for regional TC activity changes (e.g., Sugi et al. 2002; Knutson et al. 2008; Vecchi et al. 2008; Zhao et al. 2009, 2010; Villarini et al. 2010, 2012; Murakami and Sugi 2010; Murakami and Wang 2010; Murakami et al. 2011, 2018; Zhao and Held 2012; Lin et al. 2015), and for tropical rainfall and atmospheric stability (e.g., Xie et al. 2010; Johnson and Xie 2010; Huang et al. 2013; Chadwick et al. 2014; Lin et al. 2015; Flannaghan et al. 2014). In response to transient 2 × CO2 increase, all three GCMs produce similar patterns of “relative SST,” defined as the difference between SST at a location and tropical-average (30°S–30°N) SST (Fig. 3a–c), and these patterns resemble those of other GCMs (e.g., Vecchi and Soden 2007b; Xie et al. 2010). The GCMs indicate an enhancement of warming in the equatorial tropics (similar to other GCMs; e.g., Liu et al. 2005), particularly in the eastern equatorial Pacific and in the northwestern Indian Ocean, as well as less warming than the tropical average across much of the subtropics, particularly in the Southern Hemisphere (Figs. 3a–c). The equatorial Pacific warming is the largest in the east Pacific, giving a rough “El Niño-like” structure. In the Northern tropical Atlantic, the two high-resolution models do not show the swath of relative cooling extending from northwest Africa to the Caribbean that is seen in LOAR (Fig. 3a) and many CMIP-class models (e.g., Vecchi and Soden 2007b, c; Xie et al. 2010).
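The "relative SST" definition above (local SST minus the 30°S–30°N average) can be sketched as below. The cosine-latitude area weighting and the NaN land mask are assumptions about how the tropical mean would be formed on a regular latitude-longitude grid; the function name is hypothetical.

```python
import numpy as np


def relative_sst(sst, lat, lat_bound=30.0):
    """Local SST minus the area-weighted tropical-mean SST.

    sst: 2-D array (lat, lon); NaN marks land or missing points
    lat: 1-D array of latitudes in degrees
    Assumption: cos(latitude) weights approximate grid-cell area.
    """
    sst = np.asarray(sst, dtype=float)
    lat = np.asarray(lat, dtype=float)
    in_tropics = np.abs(lat) <= lat_bound
    valid = in_tropics[:, None] & np.isfinite(sst)
    weights = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(sst)
    weights = np.where(valid, weights, 0.0)
    filled = np.where(valid, sst, 0.0)
    tropical_mean = (weights * filled).sum() / weights.sum()
    return sst - tropical_mean


# Tiny synthetic grid: two tropical rows (27 and 29 degC) and two
# extratropical rows that do not enter the tropical mean.
lat = np.array([-45.0, -15.0, 15.0, 45.0])
sst = np.array([[10.0, 10.0], [27.0, 27.0], [29.0, 29.0], [10.0, 10.0]])
rel = relative_sst(sst, lat)
print(rel)
```

Applying the same function to the perturbed and control climatologies and differencing the results gives the change in relative SST.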

Fig. 3

Response of annual tropical “relative sea surface temperature” (left panels) and tropical rainfall (right panels) to transient 2 × CO2 increase in the coupled models. Changes are scaled by the corresponding global-mean surface temperature response of each model. Upper panels (a and d) show the response of LOAR, middle panels (b and e) show the response of FLOR, and lower panels (c and f) show the response of HiFLOR. Relative SST is defined as SST at a point minus the 30°S–30°N average. Contours in panels (d–f) show the climatological rainfall from the control experiment of each model. Response averages (shading) are computed over model years 201–250, comparing the Transient 2 × CO2 increase to the 1990-Control experiment. For the left panels, units are kelvin local relative temperature change per kelvin global mean surface temperature change; for the right panels, units are mm day−1 per kelvin global surface temperature change

Studies have suggested that changes in tropical precipitation, and in particular the location of the Inter-Tropical Convergence Zone (ITCZ), could drive changes in TC activity (e.g., Merlis et al. 2013, 2016; Ballinger et al. 2015). In response to transient 2 × CO2 increase, the three GCMs all show increases in precipitation near the Equator (with particularly large increases in the Pacific), decreases in precipitation in the subtropics, and increases in the extratropics (Fig. 3d–f). These results are consistent with those of other GCMs (e.g., Held and Soden 2006; Vecchi and Soden 2007a; Xie et al. 2010; IPCC 2007; Stocker 2014). Tropical rainfall changes in these models exhibit substantial similarity to relative-SST changes with regions that warm more (less) than the tropical mean tending to have increases (decreases) in rainfall, as has been seen in other models and one would expect from SST-driven changes in atmospheric stability (e.g., Xie et al. 2010).

All three models show an eastward shift of the near-equatorial Southern Hemisphere Pacific rainfall, resembling an equatorward shift of the South Pacific Convergence Zone (e.g., Cai et al. 2012; Van der Wiel et al. 2016b). Similar to other GCMs, in response to transient CO2 increases, these models exhibit an eastward shift of equatorial Pacific rainfall (e.g., Knutson and Manabe 1995; Vecchi and Soden 2007a) and a westward shift of Indian Ocean rainfall (e.g., Vecchi and Soden 2007a, Zheng et al. 2010). A substantial difference exists in the response of the Pacific ITCZ across this model family. In the lowest resolution model (LOAR), the near-equatorial precipitation increase is the largest in the Southern Hemisphere. In FLOR, there is a more symmetric near-equatorial Pacific precipitation increase, with the precipitation increases in both hemispheres being of similar magnitudes. Meanwhile, HiFLOR shows a northern enhancement of the near-equatorial Pacific precipitation increase.

Even though their El Niño SST anomaly structures in the Pacific are substantially similar (and similar to observations), these three models have different El Niño precipitation responses in their control climates (Fig. 4). During El Niño, LOAR shows an enhancement of precipitation in the Southern Hemisphere tropical Pacific, while FLOR shows a more meridionally symmetric precipitation response, and HiFLOR shows a northward shift of the Pacific ITCZ (Fig. 4). HiFLOR shows improvement in a number of atmospheric aspects in the tropical Pacific relative to FLOR and LOAR, including a reduced climatological “double ITCZ” bias, and reduced meridional SST gradient bias, in the near-equatorial southeast Pacific in present-day simulations (Wittenberg et al. 2018); the “double ITCZ” tendency in GCMs has a substantial contribution from the atmospheric components of those models (e.g., Zhang and Wang 2006; Li and Xie 2012; Adam et al. 2016; Xiang et al. 2017). However, we cannot say that HiFLOR outperforms FLOR in modeling the regression of precipitation onto NIÑO3 shown in Fig. 4, as they have the same spatial correlation with the observed values over the tropical Pacific (0.92) and indistinguishable spatial root-mean-square errors (0.378 mm day−1 in HiFLOR and 0.375 mm day−1 in FLOR), although both high resolution models show improvements over LOAR (r = 0.84, RMSE = 0.523 mm day−1). These GCMs all show an enhanced warming along the equatorial Pacific in response to transient CO2 doubling, particularly in the eastern equatorial Pacific (Fig. 3), which is vaguely reminiscent of El Niño. There are also similarities between the Pacific El Niño precipitation signature (Fig. 4) and the Transient 2 × CO2 precipitation response (Fig. 3) in each of these models. Therefore, the distinct El Niño signature of these coupled GCMs provides a potential source of the difference in the Pacific ITCZ response to transient 2 × CO2 across the models.
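The regression maps and the skill scores quoted above can be sketched as follows. The text does not state whether the spatial correlation and RMSE are area-weighted or centered, so the cosine-latitude weighting and centered correlation used here are assumptions, as are the function names and the synthetic data.

```python
import numpy as np


def regress_onto_index(field_anom, index_anom):
    """Least-squares slope of each grid point's anomaly onto an index.

    field_anom: (time, lat, lon) anomalies; index_anom: (time,) anomalies.
    Returns a (lat, lon) map in field units per index unit.
    """
    idx = np.asarray(index_anom, dtype=float)
    fld = np.asarray(field_anom, dtype=float)
    idx = idx - idx.mean()
    fld = fld - fld.mean(axis=0)
    return np.tensordot(idx, fld, axes=(0, 0)) / np.sum(idx ** 2)


def pattern_stats(model_map, obs_map, lat):
    """Area-weighted centered pattern correlation and RMSE of two maps."""
    model_map = np.asarray(model_map, dtype=float)
    obs_map = np.asarray(obs_map, dtype=float)
    w = np.cos(np.deg2rad(np.asarray(lat, dtype=float)))[:, None] * np.ones_like(obs_map)
    w = w / w.sum()
    m_anom = model_map - np.sum(w * model_map)
    o_anom = obs_map - np.sum(w * obs_map)
    r = np.sum(w * m_anom * o_anom) / np.sqrt(
        np.sum(w * m_anom ** 2) * np.sum(w * o_anom ** 2))
    rmse = np.sqrt(np.sum(w * (model_map - obs_map) ** 2))
    return r, rmse


# Synthetic check: a field that is exactly pattern * index.
rng = np.random.default_rng(0)
index = rng.standard_normal(120)
pattern = np.array([[1.0, 2.0], [3.0, 4.0]])
field = index[:, None, None] * pattern
slope_map = regress_onto_index(field, index)
r, rmse = pattern_stats(slope_map + 1.0, pattern, lat=[-5.0, 5.0])
print(slope_map, r, rmse)
```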

Fig. 4

Regression of SST (top panels) and precipitation (bottom panels) monthly-mean anomalies onto the monthly-mean NIÑO3 SST anomaly index. Leftmost panels show the observed regressions from (a) NOAA-OISSTv2 (1982–2016; Reynolds et al. 2002) and (e) GPCP-v2.3 (1982–2016; Adler et al. 2018). Regressions are computed from years 1–101 of the 1990-Control integrations of each model for (b) and (f) LOAR, (c) and (g) FLOR, and (d) and (h) HiFLOR. NIÑO3 SST is computed as an area average of SST over (150°W–90°W, 5°S–5°N; dashed gray box shown). Units are kelvin local SST per kelvin NIÑO3 SST in the upper panels, and mm day−1 local precipitation per kelvin NIÑO3 SST in the lower panels

The overall weakening of tropical circulation (Fig. 1f) and changes in tropical precipitation (Fig. 3d–f) are reflected in the regional structure of changes in 500 hPa pressure velocity in response to transient CO2 increase (Fig. 5). In the tropical Pacific, a weakening of the zonal overturning circulation (the Walker Circulation) is manifest as anomalous ascent over the eastern and central equatorial Pacific, and anomalous descent over the Maritime Continent; the regions of the strongest anomalous ascent (descent) also correspond to regions of relative SST and precipitation increase (decrease; Fig. 3). The changes in mid-tropospheric pressure velocity are generally similar across the three models, with the notable exception of the near-equatorial Pacific changes, which reflect the precipitation changes in each model: the anomalous ascent and rainfall increase are enhanced in the Southern Hemisphere in LOAR and in the Northern Hemisphere in HiFLOR. The structure of mid-tropospheric pressure velocity changes in the TC seasons in each hemisphere (Fig. 5d–f) is very similar to that in the annual mean (Fig. 5a–c) for each model, but with the magnitude of the near-equatorial Pacific changes being larger in the warm season than in the annual mean. Changes in TC-season 500 hPa pressure velocity have been suggested as an indicator of TC activity changes (e.g., Held and Zhao 2011), with anomalous ascent (descent) related to enhanced (reduced) TC activity. The 500 hPa pressure velocity changes in these models suggest increases in TC activity in the northwestern Indian Ocean and the tropical North Atlantic, and reductions in TC activity in the Northwest Pacific. Overall, the 500 hPa changes in each hemisphere’s TC season show large areas of both anomalous ascent and descent, suggesting spatially heterogeneous TC activity changes.

Fig. 5

Response of mid-tropospheric pressure velocity in the atmosphere to transient 2 × CO2 increase in the coupled models, scaled in each panel by the corresponding global-mean surface temperature response of each model, for: (a, d) LOAR, (b, e) FLOR, and (c, f) HiFLOR. Response averages (shading) are computed over model years 201–250, comparing the Transient 2 × CO2 increase to the 1990-Control experiment. (a, b, c) Annual-mean response; (d, e, f) local summer–fall response. Units are hPa day−1 per kelvin global mean surface temperature change

3.2.2 TC genesis parameters

We now turn to the model response in quantities that are more directly connected to TC activity (see Sect. 2.5). Figure 6 shows the response of four such quantities to transient CO2 increase in the three models. The models show modest amplitude (Fig. 6j, k, l) and very similar (Fig. 7g, h) patterns of 850 hPa vorticity change.

Fig. 6

Fully-coupled annual-mean transient 2 × CO2 response of TC-relevant large-scale parameters (per kelvin global mean surface temperature response of each model); LOAR is shown in the leftmost panels, FLOR in the center panels and HiFLOR in the rightmost panels. Anomalies in the Northern Hemisphere are computed over June through November, while anomalies in the Southern Hemisphere are computed over December through May. Panels (a–c) show the response of the magnitude of the monthly-mean 850 hPa–200 hPa vector wind shear [m s−1 K−1]. Panels (d–f) show the response of the Bister and Emanuel (1998) TC potential intensity [m s−1 K−1]. Panels (g–i) show the response of the 600-hPa relative humidity [% K−1]. Panels (j–l) show the response of the magnitude of the 850 hPa absolute vorticity [10−6 s−1 K−1]. These parameters are computed as in Vecchi and Soden (2007b, c), with data regridded onto a 2° × 2° grid conservatively before computing monthly values. Average responses are computed over the years 201–250 of the model simulations

Fig. 7

Inter-model differences of the fully-coupled annual-mean transient 2 × CO2 response of TC-relevant large-scale parameters (per kelvin global mean surface temperature response of each model). The difference between FLOR and LOAR is shown in the left panels, and the difference between HiFLOR and FLOR in the right panels. Anomalies in the Northern Hemisphere are computed over June through November, while anomalies in the Southern Hemisphere are computed over December through May. Panels (a) and (b) show the inter-model difference in the response of the magnitude of the monthly-mean 850 hPa–200 hPa vector wind shear [m s−1 K−1]. Panels (c) and (d) show the inter-model difference in the response of the Bister and Emanuel (1998) TC potential intensity [m s−1 K−1]. Panels (e) and (f) show the inter-model difference in the response of the 600-hPa relative humidity [% K−1]. Panels (g) and (h) show the inter-model difference in the response of the magnitude of the 850 hPa absolute vorticity [10−6 s−1 K−1]. These parameters are computed as in Vecchi and Soden (2007b, c), with data regridded onto a 2° × 2° grid conservatively before computing monthly values. Average responses are computed over the years 201–250 of the model simulations

Overall, the changes in wind shear are very similar across the three models (Fig. 6a–c), with small differences between the shear response in the two high-resolution models (Fig. 7b). There are decreases in near-equatorial Pacific wind shear reflecting the reduction of zonal overturning, and substantial increases in wind shear in the Southern Hemisphere subtropics—particularly in the southeastern equatorial Pacific and South Atlantic, two regions without much TC activity. There are tendencies for shear increase across the tropical North Atlantic and decreases across the northern tropical Pacific, similar to the response seen in other coupled models (e.g., Vecchi and Soden 2007a; Camargo 2013). In isolation these shear changes would act to make the tropical North Atlantic less conducive to TC activity, while making the tropical Pacific more conducive. However, the models also exhibit substantial changes in the other TC-relevant parameters.

The models tend to show increases in the Bister and Emanuel (1998) TC potential intensity (PI) across many regions of substantial TC activity (Fig. 6d–f). The PI changes in the three models exhibit substantial spatial structure, with regions of increase and decrease, though the region of greatest decrease is in the Southeast Pacific, where there is limited TC activity. The patterns of PI response to transient CO2 increase in these three models are similar to the modeled response in relative SST, with areas of relative warming (cooling) tending to show increases (decreases) in PI; this tendency is also seen in other models (Vecchi and Soden 2007b; Xie et al. 2010), although these models tend to show PI increase in regions of weak relative SST change. Consistent with the differences in relative warming of the Atlantic, both FLOR and HiFLOR show a greater tendency for PI increases in the tropical North Atlantic than does LOAR. HiFLOR tends to show a stronger tendency for PI increase in many TC regions than does FLOR (Fig. 7d), which in isolation would indicate a more TC-favorable environment across the tropics in HiFLOR than in FLOR.

Although the overall structure of mid-tropospheric relative humidity change in the three models exhibits some similarity, with a larger moistening in regions of strong anomalous ascent such as the equatorial Pacific and Northwestern Indian Ocean (Fig. 6g–i), the response of mid-tropospheric relative humidity is of different sign between HiFLOR and LOAR across most of the subtropics, in contrast to the other variables (Fig. 7e, f). The lowest resolution model shows much more of a tendency for mid-tropospheric moistening, while the highest resolution model shows more of a tendency for mid-tropospheric drying, with FLOR lying in the middle. In isolation, the reduced moistening in the mid-troposphere in HiFLOR would suggest a less TC-favorable environment across the tropics than in FLOR and LOAR, although the changes in PI point in the opposite direction (Fig. 7c, d).

Because of the nonlinear sensitivity of saturation vapor pressure to temperature (the Clausius-Clapeyron relationship) and because the free troposphere is colder than the surface, both saturation deficit and entropy deficit will tend to increase with warming absent substantial increases in mid-tropospheric relative humidity. Figure 8a–f shows the response of saturation deficit and entropy deficit to transient CO2 increase, shown as the log of the ratio of the means of the inverse of saturation deficit and the inverse of entropy deficit across the three GCMs. In spite of the differences in mid-tropospheric relative humidity response across the three models, away from the equator the models all exhibit increases in both saturation deficit and entropy deficit (cool colors), suggesting an environment that, from this effect alone, should become less favorable to TC genesis across the tropics. The combined effect of warming and mid-tropospheric drying leads to larger saturation deficit and entropy deficit increases in HiFLOR than in FLOR.
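The Clausius-Clapeyron argument above can be made concrete with a short calculation. The sketch below assumes the Bolton (1980) empirical fit for saturation vapor pressure over liquid water and expresses the saturation deficit as a vapor-pressure deficit, e_s(T) × (1 − RH); the paper's own deficit definition and the temperature and humidity values used here are not taken from the source and are illustrative only.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Bolton (1980) fit for saturation vapor pressure over water, in hPa."""
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))


def saturation_deficit_hpa(temp_c, rel_humidity):
    """Vapor-pressure deficit e_s(T) * (1 - RH), with RH as a fraction."""
    return saturation_vapor_pressure_hpa(temp_c) * (1.0 - rel_humidity)


# Illustrative mid-tropospheric state: 0 degC and 50 % relative humidity.
base = saturation_deficit_hpa(0.0, 0.50)
warm_fixed_rh = saturation_deficit_hpa(2.0, 0.50)   # +2 K, RH unchanged
warm_drier = saturation_deficit_hpa(2.0, 0.48)      # +2 K, RH reduced
warm_moister = saturation_deficit_hpa(2.0, 0.52)    # +2 K, RH increased

for label, value in [("fixed RH", warm_fixed_rh),
                     ("drier", warm_drier),
                     ("moister", warm_moister)]:
    print(f"{label}: deficit changes by {100.0 * (value / base - 1.0):+.1f} %")
```

At fixed relative humidity the deficit grows at the same fractional rate as the saturation vapor pressure, so a modest relative-humidity increase only partly offsets the warming effect, while drying reinforces it.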

Fig. 8

Fully-coupled annual-mean transient 2 × CO2 response of genesis index parameters (per kelvin global mean surface temperature response of each model); LOAR is shown in the leftmost panels, FLOR in the center panels and HiFLOR in the rightmost panels. Panels (a–c) show the response of 600-hPa saturation deficit, panels (d–f) show the response of 600-hPa entropy deficit, panels (g–i) show the response of the Tang and Emanuel (2012b) ventilation index, and panels (j–l) show the response of the Emanuel (2013) genesis potential index (GPI). Values are displayed so that warm (cool) colors indicate changes in response to CO2 doubling that make the environment more (less) favorable to TC activity. For saturation deficit, entropy deficit and the ventilation index, changes are displayed as the base-10 logarithm of the ratio between the control and transient 2 × CO2 experiment, with averages computed over June–November for the Northern Hemisphere and December–May for the Southern Hemisphere; for GPI the difference between the transient 2 × CO2 experiment and the control is shown and the climatological annual sum is computed. In the lower panels the annual summed control experiment GPI is shown in contours for reference, with a contour interval of 0.1 non-dimensional units. Values are computed over years 201–250 of each experiment

The response of the Emanuel (2013) Genesis Potential Index (or GPI) to transient CO2 increase shows substantial spatial heterogeneity in all three models (Fig. 8j–l), with broad areas of both increase and decrease. The overall patterns show some similarity across the three models, with a tendency for increase in parts of the Northwest Pacific, off of the Northeast United States and the eastern tropical North Atlantic, and a tendency for reductions in the Southwest Pacific, Gulf of Mexico and off of the Southeast United States. Meanwhile there are also regions with little inter-model agreement, such as the Central North Pacific, and the South Indian Ocean. The changes in GPI are complex, and driven by the partially-offsetting influence of various factors so that at the regional scale few generalities can be drawn about the dominant factor across various locations.

Changes in the Tang and Emanuel (2012b) ventilation index are also spatially heterogeneous (Fig. 8g–i), though outside the equatorial Pacific they tend to be dominated by the increase in entropy deficit (arising in all models from the overall warming, and reinforced in HiFLOR by mid-tropospheric relative humidity changes). The structure of these ventilation index changes is similar to that seen in other coupled models (e.g., Tang and Camargo 2014). In response to transient CO2 doubling, these models show an overall tendency for the ventilation index to increase, suggesting a less favorable environment for TC genesis across the tropics.

3.3 Tropical cyclone response

We now turn to the response of TC activity explicitly simulated by the two higher resolution models, FLOR and HiFLOR. We begin by exploring the simulation and fully-coupled transient 2 × CO2 response of TC genesis density for FLOR and HiFLOR (Fig. 9). The control models recover many aspects of real-world TC genesis (Fig. 9a), though as noted in Vecchi et al. (2014) and Murakami et al. (2015, 2016) they also exhibit a number of climatological TC biases, many tied to the coupled models’ climatological SST biases. HiFLOR shows substantially more TC genesis in the northwest Pacific than FLOR and observations, but a reduced (and more realistic) tendency for TC genesis in the central north Pacific. These differences in Pacific genesis cannot be understood in terms of the model-simulated GPI (contours in Fig. 8k, l), and the region of high simulated GPI off the east coast of the United States is not reflected in enhanced genesis of model-simulated TCs in that region. Murakami et al. (2015, 2016) and Zhang et al. (2016) offer deeper discussions into the differences in climatological TC simulation by these two models. The response of TC genesis to transient CO2-induced climate change (Fig. 9d, e) differs considerably between the two models: the response of FLOR is dominated by regions of genesis decrease, while HiFLOR shows a relatively comparable area with increase and with decrease. Focusing regionally, there are some regions where the sign of the response is similar in the two models (such as the Arabian Sea, far East Pacific and parts of the Southern Indian Ocean), but many in which they are opposite. Overall, the spatial correlation of TC genesis density change in these two models is negligible. In spite of having very similar changes in large-scale TC-relevant conditions (Figs. 6, 7, 8, 9), these two models have extremely different changes in TC genesis in response to transient CO2 forcing.

Fig. 9

(a) Observed 1981–2015 mean, (b, c) 1990-Control and (d, e) response to transient 2 × CO2 of the 10° × 10° TC genesis density. Observations are from the IBTrACS database (Knapp et al. 2010). Model explicitly simulated TCs are from (b, d) FLOR and (c, e) HiFLOR. Genesis location in the models is defined as the location at which a TC identified following Harris et al. (2016) reaches the threshold wind intensity and has a mid-tropospheric warm core anomaly above the threshold for each model, while genesis location in observations is defined as the location at which a TC in IBTrACS first reaches 17 m s−1

In the following subsections we further explore the responses, and inter-model differences in responses, of TC activity in FLOR and HiFLOR. We begin by exploring the global-mean change in TC frequency, then focus on TC intensity changes and finally explore changes in regional TC activity.

3.3.1 Global mean tropical cyclone frequency

As one would expect from the maps of TC genesis change, the response of global-mean TC frequency in the fully-coupled Transient 2 × CO2 experiments differs markedly between FLOR and HiFLOR (left bars Fig. 10): FLOR shows a substantial and statistically significant decrease, and while there is a slight decrease in HiFLOR it is not statistically significant. The global-mean frequency response of the fully-coupled Transient 2 × CO2 experiment is largely recovered by ∆MoC + full (middle bars in Fig. 10), with both experiments showing a significant decrease in global frequency in FLOR, and a non-significant decrease in HiFLOR; thus, in these models the global-mean frequency response does not fundamentally arise from a rectification of interannual variability. However, the coupled model SST bias can impact the response of global-mean TC frequency to CO2 increase: for ∆ObC + full in HiFLOR, there is a statistically significant global TC frequency increase (~ 6%, rightmost red bar in Fig. 10). Global TC frequency is relatively less impacted by the SST biases in FLOR (comparing the middle and rightmost blue bars in Fig. 10).

Fig. 10

Response of global-mean TC frequency (including events of tropical storm and greater intensity) in FLOR (blue bars) and HiFLOR (red bars). Bars show the percent change in TC frequency (response divided by 1990-Control average) averaged over the period 201–250 for the Transient 2 × CO2 response (leftmost bars), and over 50 years for the ∆MoC + full and ∆ObC + full experiments (center and rightmost bars). Black lines show the 95% confidence interval on the change (confidence intervals are estimated by bootstrap resampling with replacement from the 50 years of each control and perturbation experiment to compute 10,000 samples of each 50-year averaged difference)
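The bootstrap procedure described in this caption can be sketched as below. This is a minimal illustration under stated assumptions: the control and perturbed years are resampled independently, the percent change is normalized by the original (not resampled) control mean, and the interval is taken from percentiles of the resampled differences. The function name and the synthetic Poisson counts are hypothetical and are not model output.

```python
import numpy as np


def bootstrap_percent_change_ci(control, perturbed, n_boot=10_000, level=0.95, seed=0):
    """Percent change in the mean with a percentile bootstrap interval.

    control, perturbed: 1-D arrays of annual TC counts (e.g. 50 years each).
    Returns (central_estimate, lower_bound, upper_bound) in percent.
    """
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    perturbed = np.asarray(perturbed, dtype=float)

    # Resample years with replacement, independently for each experiment.
    c_idx = rng.integers(0, control.size, size=(n_boot, control.size))
    p_idx = rng.integers(0, perturbed.size, size=(n_boot, perturbed.size))
    diffs = perturbed[p_idx].mean(axis=1) - control[c_idx].mean(axis=1)

    pct = 100.0 * diffs / control.mean()
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(pct, [alpha, 1.0 - alpha])
    central = 100.0 * (perturbed.mean() - control.mean()) / control.mean()
    return central, lower, upper


# Synthetic example: 50 years of counts with means of about 80 and 60.
rng = np.random.default_rng(42)
control_counts = rng.poisson(80, size=50)
perturbed_counts = rng.poisson(60, size=50)
central, lower, upper = bootstrap_percent_change_ci(control_counts, perturbed_counts)
print(f"change = {central:.1f} % (95% CI {lower:.1f} to {upper:.1f} %)")
```

A change would be called statistically significant in this sense when the interval excludes zero.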

We can use the idealized forcing experiments in which CO2 is increased, SST is warmed uniformly, and both CO2 and SST are increased, to interpret the differing responses of FLOR and HiFLOR global TC frequency (Fig. 11). For FLOR, the ∆ObC +2K +2× CO2 experiment shows a decrease in global TC frequency, arising from a strong TC frequency decrease in ∆ObC +2× CO2 that is offset by a smaller increase in ∆ObC + 2K; the effect of the combined forcing appears to be quite linear. Meanwhile, HiFLOR shows an increase in global TC frequency in ∆ObC +2K +2× CO2, arising from a strong tendency to increase frequency in ∆ObC + 2K and a smaller decrease in ∆ObC +2× CO2. The global TC frequency difference in ∆ObC +2K +2× CO2 of HiFLOR relative to FLOR arises from both a smaller decrease in ∆ObC +2× CO2 and a larger increase in ∆ObC + 2K (rightmost bars in Fig. 11).

Fig. 11

Response of global-mean TC frequency in the idealized forcing experiments; leftmost bars are for the FLOR model, the second set of bars for the HiFLOR model, and rightmost bars show the difference between the response of HiFLOR and FLOR. In each group, the blue bar/symbols show the response to a combined uniform 2K warming and a CO2 doubling (∆ObC +2K +2× CO2), the gray bar/symbols show the response to CO2 doubling with fixed SST (∆ObC +2× CO2), and the red bar/symbols show the response to uniform 2K warming (∆ObC + 2K). Bars show the percent change in TC frequency averaged over 50 years (relative to the ObC control). Black lines show the 95% confidence interval on the change (computed as in Fig. 10)

Recently, a multi-model comparison was performed to assess the response of TCs to CO2 increase and a uniform SST warming across a range of GCMs with resolutions between ~ 50 and ~ 100 km (Walsh et al. 2015). It found that these GCMs consistently predicted a decrease in global TC frequency in response to CO2 and warming, though the partitioning between CO2-induced and warming-induced decrease differed across the models. Meanwhile, a statistical–dynamical downscaling scheme (Emanuel et al. 2008, 2013) was applied to the output from the same models, and predicted an increase in global frequency in response to uniform warming and CO2 increases (Walsh et al. 2015), driven by the SST warming and slightly offset by the effects of CO2. The response of FLOR, both to the combined forcing and to the individual forcing of SST warming and CO2, is generally within the spread of the GCMs used in the US-CLIVAR multi-model intercomparison (Walsh et al. 2016). Meanwhile, the response of HiFLOR is outside the range spanned by the GCMs in Walsh et al. (2015) and by FLOR. An increase in global frequency of TCs in response to warming is also seen in HiFLOR experiments forced with the multi-model mean projected SST anomalies from the CMIP5 ensemble (Bhatia et al. 2018) and in the Emanuel (2013) downscaling of CMIP5 experiments; meanwhile, the 50-km model of Zhao et al. (2009) shows a decrease in global-mean frequency in response to ensemble-mean CMIP5 projected 21st century SSTs (Knutson et al. 2015). In summary, the global-mean TC frequency response in FLOR is consistent with the range of GCM results published to date (e.g., Held and Zhao 2011; Gualdi et al. 2008; Wehner et al. 2015; Yoshimura and Sugi 2005; Yoshimura et al. 2006; Knutson et al. 2013; Sugi and Yoshimura 2012; Walsh et al. 2015), but the global TC frequency sensitivity of the ~ 25 km resolution HiFLOR GCM appears inconsistent with that of those GCMs.

3.3.2 Large-scale environment and TC frequency

Do the dramatic differences in TC response between FLOR and HiFLOR reflect differences in projection of TC genesis probability due to differences in the response of large-scale conditions? Or do they reflect differences in the response of TC genesis probability to similar projected large-scale changes in each model? Or do they reflect some other factor(s)? In order to explore these questions, we have compared the tropical-mean response of TC-relevant factors, such as saturation deficit, PI, ventilation index and GPI, to the response of global TC frequency across the twelve perturbation experiments (six for each model; Fig. 12).

Fig. 12

Fractional response in global frequency of explicitly simulated TCs vs. fractional response in spatially-aggregated TC genesis indices. Orange symbols show the response of HiFLOR, gray symbols that in FLOR. The linear least-squares regression fit is indicated by the straight lines, with the fit equation and variance explained (R2) indicated; orange lines show the regression for HiFLOR points, gray for FLOR points, and blue for all data combined. Each symbol is the response of one perturbation experiment relative to the relevant control experiment; for each model the six responses shown are: fully coupled transient CO2 increase, ΔMoC + Full, ΔObC + Full, ΔObC + 2K + 2 × CO2, ΔObC + 2K and ΔObC + 2 × CO2. Fractional response of simulated global TC frequency is compared to: (a) tropical-mean Emanuel (2013) GPI response, (b) ± 10–30° averaged inverse Tang and Emanuel (2012b) ventilation index, (c) control simulation TC-genesis-weighted 500 hPa pressure velocity, and (d) Li et al. (2010) TC “seed” index after removing model simulated TCs (see Sect. 3.3.1)

To explore the hypothesis that large-scale changes in shear, potential intensity, humidity and vorticity act to modify the probability of TC genesis in a way that explains the response of global TC frequency across these experiments, we first look at Emanuel (2013) GPI (Fig. 12a). For each model the fractional change of tropical-mean GPI shows a very strong relationship to the fractional response of global TC frequency across the six perturbation experiments (gray and orange lines), suggesting that tropical-mean changes in the probability of TC genesis encapsulated in GPI could help explain the response of global TC frequency. However, GPI is not able to explain the inter-model difference in response of global TC frequency: the relationship for FLOR exhibits a systematic shift relative to that for HiFLOR, with a mean difference between the two fits of almost 11%, which represents a considerable fraction of the typical TC response (ranging between ± 15%). Correspondingly, in the fit across all twelve data points (blue line) the variance explained is substantially less than that for each model. So GPI changes could help explain why, for example, the response to uniform warming of global frequency in each model differs from that to isolated CO2 doubling, but it cannot explain why HiFLOR has a tendency for global increase relative to the response of FLOR for the same perturbations. Therefore, we look beyond GPI to help understand the inter-model spread in global genesis.
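The comparison between per-model and pooled fits can be sketched as below. This is an illustrative sketch only: the helper names `fit_and_r2` and `mean_offset_between_fits` are hypothetical, and the text does not state how the mean difference between the two fits was computed, so averaging the vertical gap between the fitted lines over a chosen range of the predictor is an assumption.

```python
import numpy as np

def fit_and_r2(x, y):
    """Ordinary least-squares line y ~ slope * x + intercept, plus variance explained."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    r2 = 1.0 - residual.var() / y.var()
    return slope, intercept, r2

def mean_offset_between_fits(fit_a, fit_b, x_min, x_max):
    """Mean vertical gap (line a minus line b) over the interval [x_min, x_max].

    fit_a, fit_b: (slope, intercept) pairs. For straight lines the mean gap
    equals the gap evaluated at the midpoint of the interval.
    """
    x_mid = 0.5 * (x_min + x_max)
    return (fit_a[0] - fit_b[0]) * x_mid + (fit_a[1] - fit_b[1])
```

Applied to each model's six responses separately and then to all twelve together, a high per-model R2 combined with a lower pooled R2 and a non-zero offset between the two lines is the signature of the systematic inter-model shift described above.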

The Tang and Emanuel (2012a, b) ventilation index (\(\varLambda\)) has substantial theoretical (e.g., Tang and Emanuel 2010, 2012a, b) and empirical (Tang and Emanuel 2012b) support as a useful index for the probability of TC genesis. However, and to our surprise, we find that tropical mean changes in the inverse of the ventilation index alone are not a useful factor to explain the global TC genesis response either within or across each model (Fig. 12b). We explored other formulations and averaging regions and seasons (including the time-median of the ventilation index, the ventilation index itself, among many others) and none showed a significant relationship to global TC frequency across these model experiments when used as a sole covariate. Therefore, the Tang and Emanuel ventilation index does not explain the difference in global TC response across these two models. However, we will return to the ventilation index below, and demonstrate that it serves as a useful covariate for global TC frequency change in these models once one accounts for another factor impacting TC genesis (the frequency of pre-TC synoptic disturbances).

Motivated by Held and Zhao (2011) we compare TC-density-weighted changes in 500 hPa pressure velocity with fractional changes in global TC frequency (Fig. 12c). Overall, the relationship between spatially aggregated genesis-weighted 500 hPa pressure velocity and global TC genesis within each model is comparable to or better than that for GPI (Fig. 12a). Furthermore, there is a tendency for the relationship to distinguish between the more positive response of HiFLOR and the more negative one of FLOR, such that the relationship across all twelve experiments is substantial (blue line, Fig. 12c). However, there is still a 7.5% gap between the fit of the HiFLOR response to genesis-weighted 500 hPa pressure velocity and that of FLOR, which is a considerable fraction of the typical response of each experiment. Further, because we did not save high-frequency 500 hPa pressure velocity data from these simulations, we are not able to remove potential contamination by the TCs themselves of the 500 hPa pressure velocity signal. We therefore view these strong relationships with some level of caution, as TCs can contaminate the signal of large-scale factors used to understand them (e.g., Swanson 2008). It is worth noting that the genesis-weighted 500 hPa pressure velocity changes across these perturbation experiments do not show a simple relationship to the response of tropics-wide overturning, indicating that genesis-weighted 500 hPa pressure velocity changes should not be interpreted as a simple consequence of the response of overall tropical circulation.

The TC genesis parameters explored through Fig. 12a–c do not explain the difference in global TC frequency response between FLOR and HiFLOR. One possible interpretation is that the inter-model difference in global TC frequency response arises due to fundamentally different TC genesis sensitivity to the same large-scale environmental changes between the two models. In fact, this was a hypothesis for differences in the interannual simulation and prediction of TC frequency across these two models (e.g., Murakami et al. 2015, 2016a; Zhang et al. 2016). However, we here explore an alternative hypothesis: that the response of global TC frequency in these models reflects differential changes in the rate of pre-TC synoptic-scale disturbances, and not only the changes in the probability of genesis of pre-TC synoptic-scale disturbances. This hypothesis is explored using the index of Li et al. (2010; see Sect. 2.5).

The response of the vorticity variance-based TC Seed index averaged between 30°S and 30°N (over both land and ocean) explains a large fraction of the variance of the response of global TC frequency, both across the various perturbation experiments for each model and between the two models (Fig. 12d). The linear fits for each model (orange and gray lines) are very similar to the linear fit across both models (blue line). The relationship supports the hypothesis that changing non-TC synoptic-scale variability is a significant factor in the response of TC frequency, by changing the frequency of vortices that can then develop into TCs. However, although the relationship between the TC Seed index and global TC frequency is encouraging, the relationships show a non-zero intercept: there is a tendency for TC frequency decrease even when the change in the TC Seed Index is zero.

It appears changes in TC Seed activity, and not just changes in the probability of TC genesis modulated by large-scale changes in climate (i.e., genesis probability indices), are an important driver of TC frequency changes in these models. However, the models show clear changes in large-scale climate conditions that would affect the probability of TC genesis given a seed. In particular, Tang and Emanuel (2012b) in their Figs. 4 and 5 show clear empirical evidence for a dependence on the ventilation index (\(\varLambda\)) of the probability that a tropical disturbance will undergo cyclogenesis. This suggests that a more appropriate model to explain changes in TC frequency would be a Binomial one, in which the expected value of TC frequency (the “successes”) depends on the product of the number of TC seeds (the “trials”) and the probability of success of each trial. In such a model, the fractional change in the expected value of TC frequency (n) will be given by

$$\frac{\Delta n}{n} = \frac{\Delta N}{N} + \frac{\Delta p}{p}$$
(5)

where N is the expected value of frequency of TC seeds (the “trials”) and p is the expected value of the probability of genesis of each seed.
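Equation 5 follows from writing the expected TC frequency as the product of the expected number of seeds and the expected genesis probability, and perturbing both factors. The short derivation below is a sketch that makes explicit one assumption not stated in the text, namely that the product of the two fractional changes is small enough to neglect:

$$n = Np \quad \Rightarrow \quad n + \Delta n = \left( N + \Delta N \right)\left( p + \Delta p \right)$$

$$\frac{\Delta n}{n} = \frac{\Delta N}{N} + \frac{\Delta p}{p} + \frac{\Delta N}{N}\frac{\Delta p}{p} \approx \frac{\Delta N}{N} + \frac{\Delta p}{p}$$

As a purely illustrative example (the numbers are not taken from the experiments), a 10% increase in seeds combined with a 15% decrease in genesis probability gives an exact change of 1.10 × 0.85 − 1 = −6.5%, against −5% from the linearized form; the neglected cross term is −1.5%.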

In order to build the most accurate model, one should likely account for the spatio-temporal variance and covariance of the means and changes in trials and probabilities. However, as an initial simple estimate, which could and should be refined in future work, we explore the extent to which changes in tropical-mean TC Seed activity and probability, inferred from global changes in the vorticity variance TC Seed index of Li et al. (2010) and the ventilation index of Tang and Emanuel (2012b), can be used to explain the inter-experiment and inter-model spread in global TC frequency response in FLOR and HiFLOR. First, we posit that the linear relationship between the TC seed index and global TC frequency in Fig. 12d is a useful estimate of the fractional change in TC seeds. That is, we posit that:

$$\frac{\Delta N}{N} \approx 3.19*\frac{\Delta Seed}{Seed}$$
(6)

where Seed is the 30°S–30°N (land and ocean) average of the vorticity variance-based TC Seed Index.

Then, based on Fig. 4 of Tang and Emanuel (2012b), we hypothesize that the probability of genesis given a seed should vary roughly with the inverse of the ventilation index. Figure 13a shows that the fractional change of the ± 10–30° averaged inverse of the ventilation index is a useful covariate to explain the residual of the fractional response of global TC frequency to the linear fit of global TC frequency to the vorticity variance-based TC Seed index (without the intercept of 5.80 included). Although the ± 10–30° average of the inverse of the ventilation index showed no useful relationship to fractional changes in global TC frequency in these experiments (Fig. 12b), once the linear relationship of global TC frequency changes to the tropical TC Seed index is removed, there is a strong relationship to the fractional changes in the spatial mean of the inverse of the ventilation index (Fig. 13a). Accordingly, thinking of a Binomial process (Eqs. 5, 6), we compare the fractional change in global TC frequency to a two-covariate model for global frequency (β) using the fractional change in spatially averaged vorticity variance-based TC Seed index and ± 10–30° spatially-averaged inverse ventilation index (\(\varLambda^{-1}\)):

$$\frac{\Delta n}{n} \approx 3.19*\frac{\Delta Seed}{Seed} + 1*\frac{{\Delta \varLambda^{ - 1} }}{{\varLambda^{ - 1} }} = \beta ,$$
(7)

as shown in Fig. 13b, one recovers practically all (variance explained 0.89) of the inter-experiment and inter-model variance of the fractional change in global TC frequency, and the fit to all of the experiments has an intercept closer to zero than the seed-only fit (Fig. 12d).
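The two-predictor estimate of Eq. 7 can be written out as a small sketch. The coefficients 3.19 and 1 are those given in the text; the function names `inverse_fractional_change` and `two_predictor_beta`, and the returned dictionary layout, are hypothetical and introduced only for illustration. Both inputs are assumed to be expressed in the same units (both as fractions or both as percentages).

```python
def inverse_fractional_change(control, perturbed):
    """Fractional change in 1/x when x goes from `control` to `perturbed`.

    (1/perturbed - 1/control) / (1/control) simplifies to control/perturbed - 1.
    """
    return control / perturbed - 1.0

def two_predictor_beta(seed_frac, inv_vent_frac, seed_coeff=3.19, vent_coeff=1.0):
    """Split the Eq. 7 estimate into its seed and genesis-probability terms.

    seed_frac: fractional change in the 30S-30N mean TC Seed index.
    inv_vent_frac: fractional change in the mean inverse ventilation index.
    """
    seed_term = seed_coeff * seed_frac
    probability_term = vent_coeff * inv_vent_frac
    return {
        "seed": seed_term,
        "genesis_probability": probability_term,
        "beta": seed_term + probability_term,
    }
```

Keeping the two terms separate makes it straightforward to see, for any one experiment, whether the seed contribution reinforces or compensates the genesis-probability contribution.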

Fig. 13

Scatter across various perturbation experiments with FLOR and HiFLOR of (a) fractional change in ± 10–30° latitude inverse Tang and Emanuel (2012a) ventilation index versus the residual of the fit of model-simulated fractional TC response to vorticity variance “seed” index; (b) two-predictor fit (using vorticity “seed” index and fractional change in ± 10–30° latitude inverse Tang and Emanuel (2012b) ventilation index) versus fractional change in global TC activity. Orange symbols show the response of HiFLOR, gray symbols that in FLOR. The linear least-squares regression fit is indicated by the straight lines, with the fit equation and variance explained (R2) indicated; orange lines show the regression for HiFLOR points, gray for FLOR points, and blue for all data combined. Each symbol is the response of one perturbation experiment relative to the relevant control experiment; for each model the six responses shown are: fully coupled transient CO2 increase, ΔMoC + Full, ΔObC + Full, ΔObC + 2K + 2 × CO2, ΔObC + 2K and ΔObC + 2 × CO2. See Sect. 3.3.1

Based on these analyses we suggest the following interpretation of the global TC frequency response in these experiments. The fractional change in global TC frequency arises due to the combined impacts of changes in the frequency of pre-TC synoptic disturbances (“seeds”) and the probability that these disturbances will undergo cyclogenesis. Figure 14 decomposes the two-predictor model for global TC frequency into its two contributions (“seed” change and inverse ventilation index-based probability) across all the experiments for FLOR and HiFLOR. Across all experiments, the global probability of cyclogenesis decreases, with the decreases in genesis probability being smallest in the isolated 2 × CO2 and isolated uniform warming experiments. In the experiments with surface warming, the synoptic variability increases in both models, although the increase is substantially larger in HiFLOR than in FLOR. In the isolated 2 × CO2 experiments the synoptic variability decreases in both models. For most FLOR experiments the change in genesis probability estimated from the inverse of the ventilation index is larger in magnitude than the fractional change in “seeds”; correspondingly, FLOR has a greater tendency for global TC decrease (or a reduced tendency for global TC frequency increase). Across these experiments the changes in pre-TC synoptic disturbances and the changes in probability of genesis (which are negative in all experiments) often partially compensate. If this interpretation is correct, then efforts to understand the controls on non-TC synoptic variability in the tropics should be of paramount importance in improving our understanding and ability to predict the climate response of global TC frequency.

Fig. 14

Fractional change in global TC frequency (blue bars) compared to the decomposition of the two-predictor model for fractional global TC frequency change (β in Eq. 7) into the contribution from change in ± 10–30° inverse ventilation index (orange bars, ΔΛ/Λ), interpreted as a fractional change in probability of TC genesis, and from changes in tropical-mean synoptic scale variability (gray bars, ΔSeed/Seed), interpreted as a fractional change in the frequency of TC precursors. Panel (a) shows the results for the FLOR experiments and panel (b) shows the results for the HiFLOR experiments

3.3.3 Global intensity

As has been shown in Murakami et al. (2015, 2016), the climatological intensity distributions of FLOR and HiFLOR are markedly different (Fig. 15). FLOR has a compressed intensity distribution, with storms being more similar to each other and weaker than observed storms. Meanwhile, HiFLOR has a broader intensity range in its TCs, and recovers the observed range more realistically, including the existence of “Major” (Saffir-Simpson Category 3–4–5) TCs (Murakami et al. 2015, 2016a; Bhatia et al. 2018, 2019).

Fig. 15

Lifetime maximum intensity exceedance probabilities for global TCs in FLOR (left column) and HiFLOR (right column). Left vertical axis labels show the base-10 logarithm of the probability that the lifetime intensity of a TC will exceed a particular wind speed for each experiment; the right vertical axis labels show the return interval in number of TCs (inverse of the probability). Panels (a) and (b) compare the fully-coupled control and transient 2 × CO2 experiments (computed over years 201–250), panels (c) and (d) compare the two nudged-SST experiments referenced to each model’s SST climatology, and panels (e) and (f) compare the five nudged-SST experiments referenced to observed SST climatology (see Sects. 2.2 and 2.3 for a description of the experiments). Nudged run probabilities computed over 50 years of simulation
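The quantities plotted on the two vertical axes can be sketched as follows. This is a minimal illustration, assuming the lifetime maximum wind speed of every tracked TC in an experiment is available as a 1-D array; the function name `exceedance_curve` and the use of a simple empirical fraction (with no smoothing or fitted distribution) are assumptions made here.

```python
import numpy as np

def exceedance_curve(lifetime_max_wind, thresholds):
    """Empirical exceedance probability, its base-10 logarithm, and return interval.

    lifetime_max_wind: one value per TC (m/s).
    thresholds: wind speeds (m/s) at which to evaluate the curve.
    The return interval is the inverse of the probability, in number of TCs.
    """
    winds = np.asarray(lifetime_max_wind, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)

    # Fraction of storms whose lifetime maximum exceeds each threshold.
    prob = (winds[:, None] > thresholds[None, :]).mean(axis=0)

    with np.errstate(divide="ignore"):
        log10_prob = np.log10(prob)       # -inf where no storm exceeds the threshold
        return_interval = 1.0 / prob      # inf where no storm exceeds the threshold
    return prob, log10_prob, return_interval
```

A return interval of 4, for example, means that on average one TC in four exceeds the given wind speed.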

In these experiments, HiFLOR and FLOR also differ in the response of the intensity distribution of TCs to fully coupled transient CO2-induced climate changes (Fig. 15a, b). In the fully-coupled experiments, FLOR shows very little change to its intensity distribution (Fig. 15a), though a slight increase in intensity. Meanwhile, in addition to its mild increase in global-mean frequency, HiFLOR shows a shift towards more intense TCs in its fully coupled transient 2 × CO2 experiment (Fig. 15b). As a result, the global number of “Major” TCs (with maximum winds in excess of 50 m/s) increases by 38% in the fully-coupled HiFLOR in response to CO2 doubling (Table 2), while in neither the control nor the warmed climate are there any Major TCs in FLOR.

Table 2 Response of TC frequency of different intensities (as labeled in Row 1) for the Transient 2 × CO2 response of HiFLOR

The idealized nudged SST forcing experiments help shed some light on the causes of the differing response of global TC intensity in FLOR and HiFLOR (Fig. 15c–f). For both FLOR and HiFLOR the MoC and MoC + full experiments largely recover the intensity distributions in the respective fully-coupled experiment: there is negligible change in the FLOR intensity distribution between MoC and MoC + full, while there is a clear shift towards increased intensity from CO2 forcing in the HiFLOR experiments. Thus, the global intensity response to CO2 doubling in the fully coupled experiments arises largely from the response to the climatological changes in SST and the CO2 increase. We remind the reader that high-frequency coupling between ocean and atmosphere is still present in the nudged-SST experiments.

However, the SST biases in each model affect the mean intensity distribution in both the control and 2 × CO2 experiments, as well as the sensitivity to CO2-driven climate changes (comparing the black and red lines in Fig. 15c, d with the black and red lines in Fig. 15e, f, respectively). FLOR shows a tendency toward even weaker storms when its SSTs are nudged toward observed climatology, while HiFLOR shows stronger storms when its SSTs are nudged toward observed climatology. More to the point, relative to the intensity distribution from ObC in each model, the ObC + full experiment shows a clear intensification of TCs both in FLOR and HiFLOR. The SST bias in HiFLOR changes the control and perturbation intensity distribution, but not the global sensitivity of intensity to warming. The sensitivity of TC activity to CO2 increase from the ObC and ObC + full couplet in HiFLOR represents an alternative estimate to that from the fully coupled experiments, and the global and regional sensitivities in these experiments are summarized in Table 3.

Table 3 Response of TC frequency of different intensities (as labeled in Row 1) for the ∆ObC + full response of HiFLOR (see Sect. 2.3 for description of experiments)

Tables 2 and 3 show that the changes in basinwide frequency of all TCs can be positive or negative depending on basin and experiment. However, the ubiquitous intensification of TCs overwhelms the few decreases in TC frequency, so that for TCs with Saffir-Simpson intensity greater than two (maximum winds > 42 m/s), the basinwide changes are positive for all basins and experiments.

Does the response of TCs to CO2-induced climate changes relative to ObC represent a more faithful estimate of the sensitivity of TCs, since the reference SST is closer to that observed? Realistic background SSTs are a necessary, but not sufficient, condition for recovering the sensitivity of TCs to climate drivers. The TC sensitivity to climate depends on recovering both the large-scale climate drivers that influence TCs, as well as the sensitivity of TCs to those drivers. Artificially correcting the mean SST does not necessarily make the underlying sensitivity of each TC to climate more realistic, nor does it ensure that the large-scale climatic changes are more correctly represented. However, correcting SST biases in these models leads to improved representation of the TC tracks and spatial distribution relative to the experiments with biased SSTs (e.g., Vecchi et al. 2014; Krishnamurthy et al. 2016; Murakami et al. 2015). Thus, to the extent that the models can correctly capture the large-scale climate drivers of TC activity, and to the extent that the model TCs correctly respond to large-scale climate changes, having more realistic TC tracks should lead to an improved assessment of the sensitivity of TCs to climate, as the model TCs will encounter more relevant large-scale climate changes.

The increase of TC intensity in both FLOR and HiFLOR can be traced to the impact of overall warming of the tropics (Fig. 15e, f, gray and cyan lines), since the direct impact of increased CO2 on the atmosphere leads to mild decreases in TC intensity in both models (Fig. 15e, f, blue lines). All HiFLOR warming experiments show an increase in the global frequency of intense storms, irrespective of background climatology (observed or simulated), and for both uniform and structured SST warming. For FLOR, the model SST biases impact the response of global TC intensity more than does the spatial structure of SST change from the fully coupled model.

3.3.4 Spatial TC distribution

Global TC frequency and intensity represent a compact measure of TC activity, but the controls on and the impacts of TCs are local. To further probe the sensitivity of TCs to CO2 forcing, we examine the modeled response of regional TC activity to 2 × CO2. We show the change in overall TC activity (Figs. 16 and 17) as well as the change in the activity of Major TCs (Saffir-Simpson categories 3, 4 and 5; i.e., storms with lifetime maximum intensity exceeding 50 m/s) from HiFLOR (Fig. 18).

Fig. 16

Response of TC density in FLOR. Shading indicates the change between the perturbation and reference climate experiments for TC density [number of TC days per season in a 10° latitude by 10° longitude box centered at each 1° interval, as in Vecchi et al. (2014)]. TC days are defined for FLOR as times when the maximum zonal wind speed of the TC exceeds 15.3 m/s, and a warm core is identified by the Harris et al. (2016) TC tracker. Blue and green shading indicates decreases in TC density, red/yellow/orange shading indicates increases in TC density. Differences are averaged over years 201–250 for the transient 2 × CO2 response (panel a), and over 50 years of model simulation for the various nudged SST experiments (panels b–f)
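The TC density metric defined in this caption can be sketched as below. This is a rough illustration under several assumptions that are not given in the text: track positions are supplied as individual fixes that each represent a known fraction of a day (for example 0.25 for 6-hourly output), the wind-speed and warm-core criteria have already been applied upstream by the tracker, the grid spans 90°S to 90°N and 0° to 359°E, and box edges are inclusive. The function name `tc_density` is hypothetical.

```python
import numpy as np

def tc_density(lats, lons, days_per_fix, n_seasons, half_width=5.0):
    """TC days per season within a 10 x 10 degree box centred on each 1-degree point.

    lats, lons: positions of qualifying TC fixes (degrees; any longitude convention).
    days_per_fix: duration represented by each fix, in days.
    n_seasons: number of seasons over which the fixes were accumulated.
    """
    grid_lat = np.arange(-90.0, 91.0, 1.0)
    grid_lon = np.arange(0.0, 360.0, 1.0)
    density = np.zeros((grid_lat.size, grid_lon.size))

    for lat, lon in zip(lats, lons):
        in_lat = np.abs(grid_lat - lat) <= half_width
        # Longitude distance that wraps correctly across the 0/360 meridian.
        dlon = np.abs((grid_lon - lon + 180.0) % 360.0 - 180.0)
        in_lon = dlon <= half_width
        # Credit this fix to every grid point whose box contains it.
        density[np.ix_(in_lat, in_lon)] += days_per_fix

    return grid_lat, grid_lon, density / n_seasons
```

The response maps would then correspond to the difference between the density fields of a perturbation experiment and its reference experiment.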

Fig. 17

As in Fig. 16, but for HiFLOR. TC days are defined for HiFLOR as times when the maximum zonal wind speed of the TC exceeds 17 m/s, and a warm core is identified by the Harris et al. (2016) TC tracker

Fig. 18

Response of Major TC density in HiFLOR. Shading indicates the change between the perturbation and reference climate experiments for TC density [number of Major TC days per season in a 10° latitude by 10° longitude box centered at each 1° interval, as in Vecchi et al. (2014)]. Major TC days are defined as times when the maximum zonal wind speed of the TC exceeds 50 m/s (Saffir-Simpson categories 3, 4 and 5) and a warm core is identified by the Harris et al. (2016) TC tracker. Blue and green shading indicates decreases in TC density, red/yellow/orange shading indicates increases in TC density. Differences are averaged over years 201–250 for the transient 2 × CO2 response (Panel a), and over 50 years of model simulation for the various nudged SST experiments (Panels b–f)

For both FLOR and HiFLOR, the global TC frequency changes (Sect. 3.3.1) emerge from the partially offsetting influences of regional increases and decreases in TC activity (Figs. 16, 17): the sensitivity of global TC frequency is not necessarily the principal driver of regional TC activity changes. The response of TC density in the two models differs, and is influenced both by the climatological biases in each model (comparing panels b with c from Figs. 16, 17, 18), and the patterns of SST change in each model (comparing panels c with d from Figs. 16, 17, 18). The response of TC activity over the North Atlantic, Central Pacific and South Pacific exhibits a strong dependence on climatological SST, with the climatological SST biases in each model leading to a damped response in TC activity relative to ∆ObC + full. In addition, the increase in TC frequency in the Atlantic in both models is tied to the pattern of SST response. We note that outside the North Atlantic, the pattern of SST change does not appear to have a first order impact on the response, with the response in Figs. 16c and 17c outside the North Atlantic largely recovered in Figs. 16d and 17d, respectively. We suggest that the relatively minor influence of SST change patterns outside the North Atlantic in these experiments may not be a general feature of all SST pattern perturbations, but may reflect the fact that the relative-SST anomalies in the North Atlantic in response to transient 2 × CO2 in these models are substantially larger than in all other TC basins (Fig. 3b, c). However, even in this pair of models that differ only in the resolution of the atmospheric and land components, a substantial source of uncertainty in the pattern of TC response arises due to atmospheric model differences: there are substantial differences in the response of the spatial structure of TC activity to identical, and uniform, SST warming (comparing Fig. 16d with Fig. 17d, and Fig. 16e with Fig. 17e).

HiFLOR shows a ubiquitous increase in the density of Major TCs in response to CO2 doubling (Fig. 18): the global increase in TC intensity of HiFLOR discussed in Sect. 3.3.3 above represents a truly global response in this model. Interestingly, there is an increase in the number of Major TCs even in regions where the potential intensity in the model decreases in response to 2 × CO2 (Fig. 11). The tendency for an increase in Major TCs in HiFLOR is largely driven by the overall warming (Fig. 18e), with the spatial pattern of SST change (Fig. 18c) and the model’s bias (Fig. 18b) acting to modify the spatial structure of the change. The isolated impact of CO2 (Fig. 18f) results in a complex pattern of Major TC activity change, which tends to partially offset the warming-induced increase in most places (e.g., western North Pacific), though it can amplify it in others (e.g., South China Sea, Southwest Pacific).

4 Summary and discussion

We explore the response to CO2 doubling in three global coupled GCMs (LOAR, FLOR, and HiFLOR) with identical ocean and sea ice components, and atmospheric and land components that differ only in their resolution. The atmospheric/land resolutions explored range from the CMIP3/CMIP5 class 2° resolution, to the TC-permitting resolutions of ~ 0.5° and ~ 0.25°, which are more typical of the upcoming CMIP6 HighResMIP protocol (Haarsma et al. 2016).

We find that these models exhibit differing rates of global-mean surface warming and full-ocean heat uptake, with the lowest-resolution model showing the most surface warming and least ocean heat uptake, and the highest-resolution model showing the least surface warming and most ocean heat uptake. The differences in TCR across the models are not due to differences in atmospheric radiative feedbacks, but to differences in ocean heat uptake. These results echo recent work highlighting the key role of heat uptake (Raper et al. 2002; Kuhlbrodt and Gregory 2012; Winton et al. 2010, 2014; He et al. 2016) in influencing the TCR.

The models exhibit very similar spatial structure of their tropical ocean surface warming (after scaling for the differences in global mean warming); the patterns of warming seen in the models are also evident in the majority of CMIP models (e.g., Liu et al. 2005; Vecchi and Soden 2007a, b, c; Xie et al. 2010; Stocker 2014). The large-scale structure of the low-latitude precipitation response to warming is also similar across the models, showing wetting of the tropics and drying in the subtropics. This broad structure is common in GCMs, and reflects the tendency of the precipitation response to mirror the mean structure of precipitation (“wet-get-wetter”; Held and Soden 2006), the poleward shifts of large-scale atmospheric circulation features in the subtropics (Lu et al. 2008), and shifts of tropical circulation (Knutson and Manabe 1995; Vecchi and Soden 2007a; Xie et al. 2010). The response of the tropical ITCZ to transient CO2 doubling differs across these models (Fig. 3), and bears strong similarity to the response of Pacific rainfall to ENSO in these models (Fig. 4).

The two TC-permitting models (FLOR and HiFLOR) show different Transient 2 × CO2 responses in global TC frequency, with FLOR showing a substantial and significant reduction, but HiFLOR showing no significant change (Fig. 10). The difference between the global TC frequency response of FLOR and HiFLOR is further enhanced when the SST biases of each model are removed from the control and perturbation (i.e., ∆ObC + full). FLOR shows a significant TC frequency decrease, but HiFLOR shows a significant global TC frequency increase (Fig. 10). The response of global TC frequency in FLOR to CO2-induced warming, uniform warming and isolated CO2 increases without warming is consistent with that of other GCMs of similar resolution (e.g., Yoshimura and Sugi 2005; Yoshimura et al. 2006; Held and Zhao 2011; Wehner et al. 2015; Walsh et al. 2015). Although HiFLOR also shows a global TC frequency increase in response to uniform warming, and a decrease in response to CO2 increase without warming, the combined impact of these perturbations in HiFLOR differs from that of published GCMs by leading to a global TC frequency increase. Furthermore, the response of global frequency in HiFLOR is outside the range of the GCMs reported in Walsh et al. (2015). A fundamental difference between FLOR and HiFLOR is the presence in HiFLOR of Category 3–5 TCs (Murakami et al. 2015, 2016b) and TC rapid intensification (Bhatia et al. 2018).

Examination of changes in large-scale parameters that have been connected to changes in the probability of TC genesis (Figs. 6, 8, 12a–c) in isolation did not suggest any clear reason as to why HiFLOR did not show a decrease in global TC frequency, while FLOR did. However, the response of pre-TC synoptic disturbances to climate perturbations is a principal driver of the distinct global TC frequency response in FLOR and HiFLOR (Fig. 12d). After the relationship between changes in TC seeds and global TC frequency is accounted for, the ventilation index of Tang and Emanuel (2012a, b), which is theoretically and empirically connected with the probability of cyclogenesis of TC seeds (Tang and Emanuel 2012a, b), shows strong agreement with the inter-experiment and inter-model spread of global-mean frequency. This suggests that TC frequency changes should be interpreted through the lens of a Binomial process, in which the total number of TCs is controlled both by the frequency of TC seeds and the probability that each TC seed will become a cyclone.
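The Binomial view can be illustrated with a minimal sketch. It assumes that each seed develops independently with a common genesis probability, so that the expected TC count is the product of the seed count and that probability, and a fractional change in TC frequency splits additively (in logarithms) into a seed part and a probability part. The function names and all numbers below are hypothetical and are not taken from the experiments.

```python
import math


def expected_tc_count(n_seeds: float, p_genesis: float) -> float:
    """Expected number of TCs for a Binomial process: E[TC] = N * p."""
    if not 0.0 <= p_genesis <= 1.0:
        raise ValueError("p_genesis must lie in [0, 1]")
    return n_seeds * p_genesis


def log_change_decomposition(n_ctl, p_ctl, n_pert, p_pert):
    """Split the log-ratio of expected TC counts into seed and probability parts.

    Returns (seed part, probability part, total); the total is the sum of
    the two parts because E[TC] = N * p.
    """
    d_seed = math.log(n_pert / n_ctl)
    d_prob = math.log(p_pert / p_ctl)
    return d_seed, d_prob, d_seed + d_prob


# Hypothetical illustration: seeds increase by 20% while the genesis
# probability drops from 0.08 to 0.07, giving a small net increase.
d_seed, d_prob, d_total = log_change_decomposition(1000, 0.08, 1200, 0.07)
print(f"seed part {d_seed:+.3f}, probability part {d_prob:+.3f}, total {d_total:+.3f}")
```

In this made-up case the two parts have opposite signs, so the sign of the net change depends on which part is larger.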

In the experiments presented here, both the change in TC seeds and the change in probability that a seed will become a cyclone contribute to changes in global TC frequency (Figs. 12d, 13). TC seed activity in both FLOR and HiFLOR decreases from isolated CO2 increase, but increases in all experiments with surface warming (Fig. 12d). Meanwhile, changes in TC genesis probability, as captured by the Tang and Emanuel (2012b) ventilation index, act to decrease TC genesis in all the experiments (Fig. 13a). For the isolated CO2 increase experiments, the TC seed frequency and genesis probability changes act constructively, and lead to a consistent decrease in global TC frequency. Meanwhile, for the experiments with warming, there is compensation between the TC seed frequency and genesis probability changes, which in some cases leads to net increases and in others to net decreases in global TC frequency (Fig. 13b). In this framework, the reduction of seed activity from isolated CO2 increase largely offsets the increase in seed activity from uniform warming in FLOR (Fig. 14a), so that the response of global TC frequency to warming and CO2 increase in FLOR is dominated by the impact of reduced genesis probability from increasing ventilation index. Meanwhile, in HiFLOR, the increase in seed activity from warming is substantially larger than the decrease in seed activity from isolated CO2 increase (Fig. 14b), and the net increase in seeds either outweighs or cancels the impact of reduced genesis probability from increasing ventilation index. The tropics-wide increase in ventilation index is largely driven by increases in entropy deficit (Fig. 11) tied to warming and the non-linear relationship between saturation specific humidity and temperature (the Clausius-Clapeyron relation), so that the mid-troposphere moistens at a reduced rate relative to the surface (e.g., Emanuel 2013). The isolated impact of CO2 increase also contributes to an increase in entropy deficit and a corresponding increase in ventilation index, resulting in a decrease in the probability of TC genesis. The tendency for HiFLOR to show a greater increase in global TC frequency than FLOR arises from HiFLOR’s larger increase (smaller decrease) in TC seed activity in response to warming (isolated CO2 increase) than FLOR.
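The dependence of the ventilation index on entropy deficit can be sketched as follows. The sketch follows the definition of Tang and Emanuel (2012a, b) as we understand it: vertical wind shear multiplied by a non-dimensional mid-level entropy deficit, divided by potential intensity. The function names, argument names and input values are hypothetical placeholders and are not model output.

```python
def entropy_deficit(s_sat_mid, s_mid, s_sat_sst, s_bl):
    """Non-dimensional entropy deficit.

    Numerator: saturation deficit of moist entropy at mid-levels.
    Denominator: air-sea disequilibrium (saturation entropy at the SST
    minus boundary-layer entropy).
    """
    disequilibrium = s_sat_sst - s_bl
    if disequilibrium <= 0:
        raise ValueError("air-sea disequilibrium must be positive")
    return (s_sat_mid - s_mid) / disequilibrium


def ventilation_index(shear_ms, chi_m, potential_intensity_ms):
    """Ventilation index = shear * entropy deficit / potential intensity."""
    if potential_intensity_ms <= 0:
        raise ValueError("potential intensity must be positive")
    return shear_ms * chi_m / potential_intensity_ms


# Hypothetical values in arbitrary entropy units
chi_control = entropy_deficit(30.0, 10.0, 60.0, 20.0)  # 20 / 40
chi_warm = entropy_deficit(36.0, 12.0, 62.0, 22.0)     # 24 / 40
vi_control = ventilation_index(10.0, chi_control, 50.0)
vi_warm = ventilation_index(10.0, chi_warm, 50.0)
```

With shear and potential intensity held fixed, the larger entropy deficit in the second case gives a larger ventilation index, which in this framework corresponds to a lower genesis probability.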

The global response of the power dissipation index (PDI, an index that combines the impact of TC frequency and intensity) in HiFLOR to transient CO2 doubling shows the same sign as, but a much larger magnitude than, the response of the Emanuel (2013) downscaling methodology applied to the large-scale environmental changes of HiFLOR to transient CO2 doubling and to changes in solar forcing (Irvine et al. 2019). The framework we suggest here, in which changes to TC seeds are the main driver of global TC frequency, presents a plausible explanation for these differences. Both HiFLOR and the Emanuel (2013) methodology show an increase in TC intensity in response to warming. However, the present implementation of the Emanuel (2013) methodology uses a spatially and temporally uniform “TC seeding” framework, so it cannot include the impact of increasing TC seed frequency in response to warming that both FLOR and HiFLOR (to a larger degree) exhibit. Although both HiFLOR and the Emanuel (2013) methodology recover an increase in TC intensity in response to warming, a key element of PDI change, we suggest that the muted response of the Emanuel (2013) methodology applied to output from HiFLOR seen in Irvine et al. (2019) reflects, at least in part, the fundamentally different responses in global TC seed frequency in the two methods, and its impact on global TC frequency, another element of PDI change in response to warming (e.g., Villarini and Vecchi 2013).
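To make explicit how PDI combines frequency and intensity, the following minimal sketch accumulates the cube of the maximum wind speed over each storm's lifetime and sums over storms, which is the usual form of the index. The 6-hourly sampling interval, the function names and the wind values are assumptions made for illustration only.

```python
SIX_HOURS_S = 6 * 3600  # assumed sampling interval of the track data


def storm_pdi(max_winds_ms, dt_seconds=SIX_HOURS_S):
    """PDI of one storm: sum of (maximum wind)^3 * dt over its lifetime."""
    return sum(v ** 3 for v in max_winds_ms) * dt_seconds


def global_pdi(storms, dt_seconds=SIX_HOURS_S):
    """Global PDI: sum of the per-storm PDI over all storms."""
    return sum(storm_pdi(winds, dt_seconds) for winds in storms)


# Hypothetical tracks (m/s), not model output
control = [[20, 35, 50, 30], [18, 25, 22]]
more_storms = control + [[20, 35, 50, 30]]                # frequency increase
stronger = [[1.1 * v for v in winds] for winds in control]  # 10% stronger winds

ratio_intensity = global_pdi(stronger) / global_pdi(control)
```

Because the wind speed enters cubed, a uniform 10% increase in wind raises PDI by about 33%, while adding storms raises it in proportion to their individual contributions.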

We wish to emphasize that the detailed interpretation we present, of considering both TC seed frequency and TC genesis probability, was developed post hoc, as an effort to reconcile what at the time were surprising and mysterious results: that HiFLOR and FLOR showed very different global TC frequency responses (Figs. 10, 11), but very similar changes in large-scale factors connected to TC genesis probability (Figs. 6, 7, 8). Therefore, in spite of the nominally strong statistical relationship between the TC seed index and global TC frequency across these experiments (variance explained > 0.7; Fig. 12d), and the even stronger relationship of the TC seed plus genesis probability model (variance explained > 0.85; Fig. 14b), this hypothesized framework must still be explored with independent models and experiments, in order to further evaluate it and potentially falsify it. Analyses are underway to evaluate the validity of this framework across a larger range of models and experiments.

Additionally, the TC seed index used here, which is based on the variance of 3–10 day filtered lower tropospheric vorticity after removing the direct effect of TCs (Li et al. 2010), is not optimally applicable to a Binomial framework, which requires a “count-based” estimate of TC seeds: the expected value of “successes” from a Binomial process is the product of the probability of success and the number of trials, whereas variance measures convolve amplitude, frequency and duration of the pre-TC synoptic disturbances. Therefore, efforts should include development of count-based estimates of pre-TC synoptic variability, which would lend themselves more cleanly to a Binomial model of global and regional TC frequency. We hypothesize that the “seed” and probability framework explored here to understand the global TC frequency response to global-scale forcing may also be applicable at more regional scales, and to understanding the response to climate variability and more regionalized forcing. Finally, although we suggest that the global response of TC seeds to climate changes is an important factor for global TC frequency, we do not offer an explanation for the modeled changes in TC seeds and leave it for future work to understand the mechanisms controlling global TC seed activity. We do note that the weakening of tropical circulation from warming and CO2 increases does not present an explanation for the TC seed response for these models across these experiments. The crucial importance of the climatic response of pre-TC synoptic variability on the response of global TC activity to climatic changes requires an expanded exploration of pre-TC synoptic variability and its response to climate variability and change.
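The distinction between a variance-based and a count-based seed measure can be illustrated with a small synthetic example. The two series below contain the same number of disturbances but differ in amplitude; the variance differs by a factor of four, whereas a simple threshold-crossing count is unchanged. The series, the threshold and the function names are hypothetical, and the counting rule is only one of many possible choices.

```python
from statistics import pvariance


def count_events(series, threshold):
    """Count upward crossings of the threshold (one per disturbance)."""
    count = 0
    above = False
    for x in series:
        if x > threshold and not above:
            count += 1
        above = x > threshold
    return count


def make_series(n_events, amplitude, length=120, duration=3):
    """Evenly spaced rectangular pulses on a zero background (synthetic)."""
    spacing = length // n_events
    if duration >= spacing:
        raise ValueError("pulses would overlap or merge")
    series = [0.0] * length
    for k in range(n_events):
        start = k * spacing
        for i in range(start, start + duration):
            series[i] = amplitude
    return series


weak = make_series(n_events=10, amplitude=1.0)
strong = make_series(n_events=10, amplitude=2.0)

print(count_events(weak, 0.5), count_events(strong, 0.5))  # same count
print(pvariance(weak), pvariance(strong))                  # different variance
```

A variance-based index would register the second series as a fourfold increase in "seed activity" even though the number of trials in the Binomial sense is identical.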

In targeted idealized experiments (e.g., Merlis et al. 2013, 2016; Ballinger et al. 2015) and observations (e.g., Zhang and Delworth 2006) there is a tendency for TC activity to follow the ITCZ. Given that HiFLOR (FLOR) showed a northward (southward) shift of the ITCZ, we expected the TC activity in HiFLOR to also show a larger Northern Hemisphere increase than FLOR (and perhaps a Southern Hemisphere decrease). However, in response to CO2 doubling, HiFLOR shows a larger increase in Southern Hemisphere TC activity than does FLOR, particularly over the Southwest Pacific and Australia, with more muted differences in the Northern Hemisphere (compare upper panels of Figs. 15 and 16). In the comprehensive GCM experiments explored here, the meridional structure of TC activity changes and ITCZ changes can differ, and need not follow the sensitivity seen in idealized modeling configurations.

HiFLOR shows a substantial increase in global TC intensity in all experiments with warming, with the global number of “Major TCs” increasing by 38% (Sect. 3.3.3; Fig. 14), and the global number of TCs with maximum lifetime wind speed exceeding 80 m/s (or 10 m/s above the cutoff for “Category 5” storms in the Saffir-Simpson scale) more than doubling (Tables 2, 3). Although these intensity increases are most pronounced in the West Pacific, the increase in major TC activity in HiFLOR is seen throughout the tropics (Fig. 17). The Bister and Emanuel (1998) potential intensity in both FLOR and HiFLOR shows increases in response to CO2 doubling across the TC main development regions, providing a potential explanation for the modeled increases in TC intensity (Fig. 6). These results echo those of Bhatia et al. (2018), looking at the response to multi-CMIP5 model projected 21st century warming with HiFLOR, which also display an increase in global TC frequency, intensity, and the rate of TC rapid intensification.
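For reference, intensity classes of this kind can be assigned from lifetime-maximum wind speed with a simple threshold lookup. The sketch below uses the 70 m/s Category 5 cutoff implied above; the remaining thresholds are the commonly quoted rounded m/s values of the Saffir-Simpson scale and are an assumption here, as is taking “Major TCs” to mean Category 3–5. The wind values are hypothetical.

```python
# (lower bound in m/s, category), strongest first; all but the Category 5
# cutoff are assumed rounded values, not taken from this study.
CATEGORY_THRESHOLDS_MS = [(70.0, 5), (58.0, 4), (50.0, 3), (43.0, 2), (33.0, 1)]


def saffir_simpson_category(vmax_ms):
    """Return the category (1-5), or 0 if below hurricane strength."""
    for threshold, category in CATEGORY_THRESHOLDS_MS:
        if vmax_ms >= threshold:
            return category
    return 0


def count_major(lifetime_max_winds_ms):
    """Number of storms reaching Category 3 or higher."""
    return sum(1 for v in lifetime_max_winds_ms if saffir_simpson_category(v) >= 3)


def count_exceeding(lifetime_max_winds_ms, threshold_ms=80.0):
    """Number of storms whose lifetime-maximum wind exceeds the threshold."""
    return sum(1 for v in lifetime_max_winds_ms if v > threshold_ms)


# Hypothetical lifetime-maximum winds (m/s)
sample = [30.0, 52.0, 75.0, 45.0, 83.0]
```

Applying `count_major` and `count_exceeding` to control and perturbed storm lists gives the kind of counts whose relative changes are quoted in the text.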

HiFLOR shows an increase in global TC frequency, while FLOR shows a decrease. The response of FLOR is consistent with most published studies with GCMs (e.g., Yoshimura and Sugi 2005; Yoshimura et al. 2006; Gualdi et al. 2008; Knutson et al. 2010, 2013; Camargo 2013; Stocker 2014; Walsh et al. 2015, 2016; Zhao et al. 2009; Sugi and Yoshimura 2012; Wehner et al. 2015), while that of HiFLOR stands in contrast to those studies. The transition in resolution from FLOR (0.5°) to HiFLOR (0.25°) led to this change in sensitivity in this model family; however, in another model family going from ~0.5° to ~0.25° did not lead to an increase in global TC frequency under warming conditions (Wehner et al. 2015). It will be interesting to explore the impact of resolution changes on the sensitivity of global TC frequency in the larger model ensemble from the CMIP6 HighResMIP experiment (Haarsma et al. 2016).

The response to CO2 doubling and warming of TC activity in both FLOR and HiFLOR depends substantially on the SST climatology, with Atlantic TC activity showing a much more dramatic increase in both models when the control SST climatology is corrected toward observations. Seasonal prediction experiments and assessments of the sensitivity of TCs to interannual modes of SST variability show that on interannual timescales, TC variations are substantially affected by climatological SST errors, with reduced errors leading to more faithful simulations and predictions of TC variability (e.g., Vecchi et al. 2014; Zhou and Xie 2015; Krishnamurthy et al. 2016; Liu et al. 2017). A potential consequence of this is that artificially correcting SST errors may lead to improved projections of regional and global TC activity in response to radiative forcing. However, even with the same SST climatology (and the same SST forcing), there is still substantial discrepancy between FLOR and HiFLOR in global and regional TC activity. Artificially correcting errors in climatological and projected SST is therefore not sufficient to eliminate uncertainty in projections of TC activity. Nevertheless, the impact of SST biases on TC sensitivity suggests that the fidelity of sensitivity estimates of TCs to climatic changes may be reduced within GCMs with SST biases.

Understanding the impact of model resolution on the response of a GCM family to CO2 is complex. This analysis attempted to create a “clean” set of model experiments in which the impact of atmospheric resolution on the response of the GCMs to CO2 doubling could be isolated from other factors. However, in changing resolution without adjusting model parameterizations, the three models developed different large-scale biases (e.g., LOAR was warmer at the surface than FLOR, which was warmer than HiFLOR), and by affecting the large-scale ocean circulation and the mean rainfall/TC activity, these biases appear to have influenced some of the simulated global and regional responses to CO2. An alternative approach would have been to adjust the parameterizations of each model, to bring the biases of the models into closer alignment. However, it is likely that adjusting model parameterizations would have also impacted the response to CO2 of the models in ways not directly connected to resolution. Experiments with a prescribed SST climatology and idealized perturbations (e.g., uniform warming, CO2 increase in isolation) provide an additional lens through which to explore the impact of resolution. However, although the idealized experiments were able to identify sensitivity of the solutions to different large-scale drivers and differences between models, and to allow comparison of the sensitivity of these models to others in the published literature (e.g., Walsh et al. 2015), their idealized nature did not provide a clear path to evaluating which model solution was most plausible, as there is no observed counterpart to such studies.

Which model sensitivity is most plausible? In assessing seasonal prediction skill, because we have observations of numerous events, we find that HiFLOR’s response to interannual SST variations outperforms that of FLOR (e.g., Murakami et al. 2015, 2016a; Zhang et al. 2016; Liu et al. 2018b). Although we can be confident that TCs in FLOR and HiFLOR have different responses to CO2 doubling, it is not yet clear whether the response of TCs to CO2 in HiFLOR is any more or less faithful than that of FLOR. Perhaps the improved representation of TCs in HiFLOR relative to FLOR, including their relationship to large-scale modes of climate variability and seasonal prediction skill (Murakami et al. 2015, 2016a; Zhang et al. 2016; Liu et al. 2018a), suggests that HiFLOR may produce a more faithful representation of the sensitivity of TCs to climate drivers. On the other hand, the mechanisms involved in the response to CO2 can differ in character from those involved in interannual variability (e.g., Lu et al. 2008), suggesting that simulation and prediction of seasonal TC anomalies may be an insufficient measure of fidelity in capturing the sensitivity of TCs to multi-decadal climate drivers. Furthermore, increased resolution need not always lead to improved predictive capability (e.g., Kapnick et al. 2018).

A broad range of assessments, connected to observable quantities, is needed to evaluate the plausibility of different model estimates of TC and rainfall sensitivity to radiative forcing. These should include evaluations of process-scale measures of TC simulation by the GCMs (e.g., Kim et al. 2018; Wing et al. 2019), assessment of the hour-to-multi-day prediction and simulation of TCs in models (e.g., Xiang et al. 2015), and evaluation of the ability to recover past seasonal, decadal, multi-decadal and centennial TC fluctuations. Similar evaluations are needed for regional rainfall, and other regional climatic impacts. We suggest tightly coupling day-to-day weather, seasonal-to-decadal weather and climate change research, including basic theory, observational analyses and model-based studies. Further, because of the importance of synoptic variability in the response of TCs to climate drivers, directly applying models used to assess climate change to weather and seasonal prediction will likely accelerate progress in reducing uncertainty in regional climate change.