Clear-air turbulence trends over the North Atlantic in high-resolution climate models

Clear-air turbulence (CAT) has a large impact on the aviation sector. Our current understanding of how CAT may increase with climate change in future is largely based on simulations from CMIP3 and CMIP5 global climate models (GCMs). However, these models have now been superseded by high-resolution CMIP6 GCMs, which for the first time have grid lengths at which individual turbulence patches may start to be resolved. Here we use a multi-model approach to quantify projected moderate CAT changes over the North Atlantic using CMIP6 models. The influence of the model resolution on CAT projections is analysed. Twenty-one CAT diagnostics are used, in order to represent uncertainties in CAT production mechanisms. Each diagnostic responds differently in time, but the majority display an increase in moderate CAT between 1950 and 2050. Although winter is historically the most turbulent season, there is strong multi-model agreement that autumn and summer will have the greatest overall relative increase in CAT frequency. By 2050, summers are projected to become as turbulent as 1950 winters and autumns. The global-mean seasonal near-surface temperature is used as a comparative metric. For every 1 °C of global near-surface warming, autumn, winter, spring, and summer are projected to have an average of 14%, 9%, 9%, and 14% more moderate CAT, respectively. Our results confirm that the aviation sector should prepare for a more turbulent future.


Introduction
Clear-Air Turbulence (CAT) is an upper-level atmospheric phenomenon that has a hazardous and expensive impact on the aviation sector. Atmospheric turbulence is the leading cause (71%) of all in-flight weather-related injuries ( Hu et al. 2021) and annually costs the United States of America US$200 million (Williams 2014). CAT usually develops in cloud-free, stably stratified atmospheric environments (Jaeger and Sprenger 2007) and is undetectable using current onboard radar equipment. CAT develops in regions of sheardriven instability and is often found around upper-level 1 jet streams. Jet streams are narrow bands of intense winds, which have a strong seasonal dependence and owe their intensity to latitudinal horizontal temperature gradients. Due to the steepening of the pole-to-equator temperature gradient in the upper troposphere and lower stratosphere, jet streams are expected to intensify in wind shear with anthropogenic climate change (Lee et al. 2019). The extra annual cost on the aviation industry to avoid CAT is £16 million (Search Technology 2000). In a future scenario, with double the preindustrial CO 2 atmospheric concentration, longer transatlantic flights would add an extra 2000 hours of annual travel and an additional 70 million kg of CO 2 in annual fuel emissions (Williams 2016). In the same CO 2 scenario, Williams and Joshi (2013) projected winter-time moderate-or-greater CAT encounters to increase by 40-170% over the North Atlantic. Building on this previous work, and using the same CO 2 concentration scenario, Williams (2017) projected moderate CAT to increase by 94% in wintertime over the North Atlantic basin. Moderate turbulence inflicts vertical accelerations of up to 0.5 g (4.9 m s −2 ) on aircraft (Lane et al. 2004). Transatlantic air travel often confronts CAT due to the presence of the mid-latitude eddy-driven jet stream over the North Atlantic. Williams and Joshi (2013) and Williams (2017) used a Coupled Model Inter-comparison Project phase 3 (CMIP3) coupled atmosphere-ocean climate model, with a grid resolution of 2.5° × 2.0°. The World Climate Research Programme (WCRP), previously called the World Group on Coupled Modelling (WGCM), first developed the Coupled Model Inter-comparison Projects in the 1990s, to evaluate and improve global climate models (GCM) and to understand future and past climate variabilities (Bock et al. 2020). The three latest generations are CMIP3, CMIP5, and CMIP6 (Bock et al. 2020). Hu et al. (2021), using CMIP5 GCMs for their control state and a regional climate model, found an increase in CAT severities across the South China Sea, with moderate turbulence increasing by 12% over 50 years. Storer et al. (2017) also found a significant rise in moderate wintertime CAT across the globe. They projected an increase of 143%, 100%, 90%, and 127% at 200 hPa over the North Atlantic, North America, North Pacific, and Europe, respectively. Storer et al. (2017) used a CMIP5 GCM, namely the Met Office Hadley Centre HadGEM2-ES model. Williams and Storer (2022) compared this GCM against ERA-Interim reanalysis data from the European Centre for Medium Range Forecasts (ECMWF) and concluded that GCMs can successfully diagnose CAT and its response to climate change, when compared to reanalysis data.
The horizontal resolution coarseness of CMIP3 and CMIP5 GCMs led to the development of the High-resolution inter-comparison project (HighResMIP), which is a subsection of CMIP6 (Haarsma et al. 2016). PRIMAVERA (PRocess-based climate sIMulation: AdVances in high-resolution modelling and European Risk Assessments), launched in 2015 to manage and collate GCMs, further aided the development of HighResMIP. HighResMIP was created for the scientific community to evaluate the dependence of model resolution on a particular phenomenon and to reduce the resolution gap between numerical weather prediction (NWP) models and GCMs (Haarsma et al. 2016). Haarsma et al. (2016) suggested that an improvement in the horizontal resolution within a model could lead to a better representation of vertical dynamics in a system. For example, they found that vertically moving small-scale gravity waves were better represented in models with a finer horizontal resolution. The new CMIP generation also uses the Shared Socioeconomic Pathways (SSP) scenarios. These global warming simulations, within CMIP6, were found to better represent northern hemisphere (NH) storm tracks and jet streams, than CMIP5's Representative Concentration Pathway (RCP) 4.5 or CMIP3's special report emission scenario SRESA1B (Harvey et al. 2020). These different emission scenarios have led to an increase in climate sensitivity for CMIP6 models.
Upper-level atmospheric turbulence is anywhere from planetary to millimetres in horizontal size, but typically impacts aviation between 100 m and 1 km (Storer et al. 2017). Sharman et al. (2014), using extensive pilot reports (PIREPS), suggest a climatology for the turbulent state of the upper atmosphere. They concluded that a median patch of turbulence was approximately 60-70 km wide horizontally and 1 km deep vertically. Several CMIP6 GCMs have a close capability to resolve this median length scale (Sect. 2; Methodology). CAT has not previously been investigated with GCMs with such a capability. Therefore, this paper explores projected moderate CAT changes in time and with anthropogenic climate change using CMIP6 HighResMIP GCMs. A multi-model approach is applied to understand the dependence of CAT projections on model resolution. This study includes all seasons, over the North Atlantic. The layout of this paper is as follows. Section 2 discusses the approach and GCMs modelled data used. The results are discussed in Sect. 3. Section 4 draws the main conclusions from these findings.

Data and methods
The Met Office Hadley Centre HadGEM3-GC3.1 model, the Max-Plank Institute MPI-ESM1-2 model, and the European community Earth systems EC-Earth-3 model are the three CMIP6 HighResMIP GCMs chosen for this upperlevel turbulence analysis. HadGEM3-GC3.1 is the latest Met Office global climate model configuration and was the UK community's submission to CMIP6. This configuration has many new improvements and systematic error corrections compared to the corresponding CMIP5 submission (HadGEM3-GC2; Williams et al. (2018). HadGEM3-GC3.1 scored a high 727 out of 1000 on the Watterson et al. (2014) basic overall climate model metric, with HadGEM3-GC3.0 and HadGEM3-GC2 averaging at 711 and 686, respectively (Williams et al. 2018). HadGEM3-GC3.1 has three model resolutions available: HH/HM which has a horizontal grid spacing of 25 km, MM which has 60 km, and LL which has 135 km. The number of ensemble members for each resolution is found in Table 1.
Twenty-seven European research organisations and universities worked together to submit the updated version of EC-Earth to CMIP6. EC-Earth's new atmosphere and ocean model projections have finer horizontal and vertical resolutions than CMIP5's EC-Earth-2 GCMs (Haarsma et al 2020). Their HighResMIP sub-models are EC-Earth-3P (71 km) and EC-Earth-3P-HR (36 km). Due to a problem with greenhouse gas concentrations within ensemble member number 1 of EC-Earth-3P (71 km), two years (2013,2014) are omitted from the EC-Earth-3P analysis in Sect. 3. The Max-Plank Institute Meteorology Earth System GCMs are the final models used within this investigation: MPI-ESM1-2-HR (67 km) and MPI-ESM1-2-XR (34 km). MPI-ESM1-2-HR had a computational cost 20 times greater than its older, coarser version (-LR). This has led to improvements in representations of teleconnections and mid-latitude dynamics (Mauritsen et al. 2019). The Max-Planck Institute sub-models only have one ensemble member (Table 1). The CMIP6 GCMs have three main tiered experiments. Of these, the historically forced coupled climate and ocean experiments (hist-1950) and the future projected forcings of 2015-2050 (highres-future) were used. Future projections (2015-2050) simulated using the SSP5-8.5 high-end impact scenario (Haarsma et al. 2016). Theses CMIP6 models were chosen due to their accessibility and available data on certain height levels, necessary to calculate our CAT indices. These models cannot resolve the thin vertical depth (1 km) of a patch of turbulence in the atmosphere. Therefore, this study focuses on the horizontal grid spacings across the GCMs (Table 1).
Due to the difficultly in resolving sub-grid scale turbulent kinetic energy (TKE), twenty-one diagnostics are used to represent CAT. These indices, first collated by Williams and Joshi (2013), have been used in previous literature to represent turbulent flow and instability, and the usage of 21 indices ensure the results are as robust as possible. Using an ensemble of these diagnostics permits diagnostic uncertainty quantification. Each index is listed in the appendix.
The assumption that energy cascades from larger scales into smaller eddies is made with many NWP models and on average, is an appropriate representation of the spatial structure of atmospheric turbulence (Koch et al. 2005), and is so applied within this study. Each index represents different mechanisms for turbulent air flow. For example, an anticyclonically curved jet stream and the CAT associated with it, could be well represented by the vorticity advection index in combination with the magnitude of vertical wind shear. The Brown index, which is a combination of absolute vorticity and flow deformation, does not perform well in a strongly anticyclonic system because it does not efficiently distinguish between anticyclonic or cyclonic flow (Knox 1997). The frontogenesis function, an index related to the amplification of inertia-gravity waves during frontogenesis and their breakdown (Lane et al. 2004), also does not perform well in anticyclonic flow (Knox 1997), but well in cyclonical bent jet streams. The commonly used CAT forecasting Graphical Turbulence Guidance (GTG) tool is made up from several of our indices. Sharman and Pearson (2017) verified, through receiving operating characteristic curve analysis, that these GTG diagnostics effectively diagnose light-or-greater CAT at high altitudes. This paper takes an ensemble across the twenty-one indices to encapsulate a range of CAT-generated situations.
The cube-rooted eddy dissipation rate (EDR) is a common quantitative measure for atmospheric turbulence, as it is directly proportional to the root-mean-square vertical acceleration of a plane (MacCready 1964). This paper follows Williams (2017) and assumes certain EDR values relate to the severity of the turbulence encounter and the upper percentile ranges in each index correspond to these dissipation rates. Light, light-to-moderate, moderate, moderate-tosevere, and severe turbulent airflow arises in the 97.0-99.1%, 99.1-99.6%, 99.6-99.8%, 99.8-99.9%, 99.9-100% percentiles of each index, respectively. The cube-rooted EDR values (m 2/3 s −1 ) for the previously defined severities, in succession, are 0.1-0.2, 0.2-0.3, 0.3-0.4, 0.4-0.5, and > 0.5 (Williams 2017). This paper mainly focuses on moderate turbulence encounters, as moderate turbulence is more common than severe turbulence and more hazardous than light turbulence. To quantify a change in time and with climate

Results
Moderate turbulence occurs within an EDR range of 0.3-0.4 m 2/3 s −1 , or within the 99.6-99.8th percentile ranges within each index. The middle (green) boxplots, across However, if considering multiple models, these values could be related to more than one severity. Median lines that reside to the far left of the ranges suggest that one or two outlying sub-models have altered the spread and led to an overlap of box plots within Fig. 1. Interestingly, often if the boxes do not overlap, the whiskers, which represent remaining points above and below 75th and 25th quartiles of the distribution, are overlayed. This is shown in Fig. 1a (vertical wind shear of the horizontal wind), where boxes do not touch but the whiskers do. Vertical wind shear is a diagnostic linked directly with the generation of CAT and is a component in several of the indices. Generally, within Fig. 1, as the CAT severity increases, the medians of the diagnosed threshold values, and their standard deviations, increase. Interestingly, flow deformation multiplied by vertical temperature gradient has a broad range of light values, compared to the other severities (Fig. 1g). This is also apparent for light threshold ranges for negative Richardson number (Fig. 1e), an interesting finding as severe turbulence arises in the upper 99.9th percentile of an index and its thresholds would be more likely to differ than light turbulence. In the rest of this paper, for simplicity, only moderate CAT events are evaluated. Moderate turbulence has a suitable spread of threshold values within Fig. 1 for a robust comparison between GCMs.

CAT variations in time for individual CAT indices
Out of the twenty-one indices, twelve show a definitive overall increase in moderate CAT in time (Fig. 2), despite some interannual variability. This increase is particularly evident in the last few decades of the hundred-year period. These diagnostics are as follows: version 1 of the North Caroline State University index (NCSU: Fig. 2h), wind speed ( Fig. 2i), residual of non-linear balance equation (Fig. 2l), vertical vorticity squared (Fig. 2m), relative vorticity advection (Fig. 2n), negative absolute vorticity advection ( (Fig. 2u). PV has a few anomalously large peaks above + 1000%. This is only for EC-Earth-3P (71 km) models in the years 2012 and 2050. Wind speed multiplied by directional shear (Fig. 2j) and magnitude of horizontal divergence (Fig. 2k) both had 6 out of 7 sub models projecting an increase in CAT. It can be extrapolated that on average, across the model resolutions, fourteen indices project an increase in CAT between 1950 and 2050. Interestingly, relative vorticity advection (Fig. 2q), negative absolute vorticity advection (Fig. 2r) and Brown index (Fig. 2s) project their greatest increase in CAT within coarser models of each sub-model, particularly EC-Earth-3P (71 km) and HadGEM3-GC3.1-LL (135 km). Here we use the term "coarser" to refer to models with horizontal grid spacing ≥ 60 km.
Variant one of Ellrod's index (T1 ; Fig 2c), which is a combination of flow deformation and wind shear (Ellrod and Knapp 1992), has one model (HadGEM3-GC3.1-MM) projecting no change to CAT. All other sub models project T1 decreasing in time. Vertical wind shear of the horizontal wind (Fig. 2a), Brown energy dissipation rate (Fig. 2b), variant two of Ellrod's index (Fig. 2d), negative Richardson number (Fig. 2e) and horizontal temperature gradient (Fig. 2f) also project a decline in moderate CAT but for all GCMs. CAT develops in regions of increased vertical wind shear instability, yet horizontal wind shear ends the period with a decrease in the number of moderate CAT events from the 1950 threshold period (Fig. 2d). Projected wind speed related CAT events at 200 hPa are expected to rise in this period, with maximums of +400% (Fig. 2i). Wind speeds across neighbouring atmospheric heights may increase at a similar or larger rate, decreasing wind shear values. Atrill et al. (2021) also project a decline in wind shear over polar regions and determined the same There are five CAT severities displayed: light (blue), light to moderate (orange), moderate (green), moderate to severe (grey) and severe (purple). The 25th and 75th percentiles bound the boxes shown, with the vertical black line through each box representing the median data point. The whiskers extend to show the rest of the distribution ◂ explanation, but the annual mean wind shear is increasing over the North Atlantic Ocean in time, as found by Lee et al. (2019). Flow deformation multiplied by vertical temperature gradient interestingly had three models projecting a rise in moderate CAT and four simulating a decline. The three sub-models projecting an increase are MPI-ESM1-2-HR, EC-Earth-3P and HadGEM3-GC3.1-LL. These are all the coarsest sub-models within their respective GCMs.
Due to all indices being closely linked and not physically independent, it is difficult to group them. One could look at the building components and group diagnostics in terms of having a vertical derivative or not. There is an almost even split across the indices, with ten and eleven diagnostics linked with either vertical and horizontal or just horizontal derivatives. Overall, there is a mixture of indices projected to increase and decrease in time in both categories. However, when including wind speed times directional shear (Fig. 2j) and magnitude of horizontal divergence (Fig. 2k) increasing in time, 82% of horizontal-only derivative related diagnostics project increases in CAT. There is a half-half mix within the vertical derivatives category (including T1).

North Atlantic seasonal CAT projections
Within this section, Fig. 3 displays the yearly percentage changes for each season, using the 1950s decade and all seasons within this period as a reference to investigate the seasonality of CAT. Figure 4 shows the same information, but with the reference period now being for each season separately rather than all seasons, so that all lines oscillate around zero in the 1950s by definition. EC-Earth and HadGEM3-GC3.1 models project an increase in winter moderate CAT in time over the North Atlantic. After the year 2030 in Fig. 3, HadGEM3-GC3.1 sub models project 100% more moderate DJF CAT events than the reference period. Despite the coarsest HadGEM3-GC3.1 model (-LL: 135 km) suggesting a dip after 2040 to + 50%, there are only marginal differences between each HadGEM3-GC3.1 sub-model for DJF in time. However, within Fig. 4, when compared to its individual 1950s reference, DJF is not increasing at the fastest rate over the period. Due to the independence of the variables and normality within the distribution, Fig. 4 meets the assumption for linear regression analysis and slopes (trends in time) are discussed. The slopes representing DJF are second steepest in three models, third steepest in two models and smallest in two models (EC-Earth-3P,  Fig. 4b, f). Wintertime over the North Atlantic is historically a period of increased upperlevel instability with the greatest risk of moderate turbulence, due to the strengthening of the meridional temperature gradient. Despite strong model agreement projecting an increase in moderate DJF CAT event, other seasons could increase at a greater rate.
The moderate CAT frequency in summer is also projected to rise at the end of the 100-year period, as shown in HadGEM3-GC3.1 and EC-Earth-3 subplots within Fig. 3. JJA lines reside initially on or below 0% throughout Fig. 3 from 1950 to 2000 but shift positively and reach around + 25% (HadGEM3-GC3.1) and + 35% (EC-Earth-3) by 2020. JJA lines fluctuate at the end of the period at a similar range to DJF/SON at the beginning of the 100-years. This could imply that current and future JJA turbulence encounters have increased to the same original rate that DJF and SON had in the 1950s. Future CAT encounters in summer could become as common as the 1950s' most turbulent seasons. This rapid change in JJA is evident in Fig. 4. In fact, the two EC-Earth-3P sub-models within Fig. 4a, b project a significant increase in summer moderate CAT in time over the North Atlantic, at a greater rate in time than comparative seasons within the figure. These EC-Earth JJA rates peak at maximums of + 69.18% (3P-HR, Fig. 3a) in 2035 and + 76.13% (3P, Fig. 4b) in 2048. The coarser HadGEM3-GC3.1 models, MM (60 km) and LL (135KM) also project a significant JJA increase with similar trend of 0.37%/year and 0.38%/year (Fig. 3f, g).
MPI-ESM1-2-XR, -HR and HadGEM3-GC3.1-HM/HH (Fig. 4c-e) project a smaller percentage change in time for JJA moderate turbulent events, with slopes of 0.12 and 3 %/ year. The MPI-ESM1-2 results (Figs. 3d, e and 4d, e) generally have no significant trends in time for DJF, MAM, or JJA and have a significant amount of inter-annual variability. This model does, however, project an obvious increase in CAT for northern hemisphere autumn (SON). There is strong multi-model agreement that CAT events in SON will increase at a rapid rate in time. After 2020 and 2040 in Fig. 3, MPI-ESM1-2-HR (67 km) and EC-Earth-3P (71 km) project a sharp increase in the number of SON CAT events, reaching maximums of + 150%. Finer-resolution counter parts of MPI-ESM1-2 and HadGEM3-GC3.1 project an increase, but not at the same rate. Within Fig. 4, the greatest increase is simulated within the Max-Plank coarser sub-model (67 km, Fig. 4d) at + 105.19% in 2027. Averaging across sub-models, MPI-ESM1-2, EC-Earth and HadGEM3-GC3.1 simulate maximums of + 95.72%, + 62.69% and + 71.96% near the end of the 100-year period. The Max-Plank and Met Office Centre Hadley sub-models project the greatest rate increase in CAT events to occur within NH autumn.
Spring over the North Atlantic is a season with characteristically the fewest moderate CAT events. When including all seasons in the 1950-59 reference, the MAM lines reside below zero (Fig. 3). This suggests that the number of DJF and SON projected moderate CAT events in the 1950s is greater than the number of MAM CAT events throughout the 100-year period. However, there is an increase displayed in the HadGEM3-GC3.1 MAM lines within Fig. 4, with one sub model projecting a greater increase than DJF at 0.35%/ year (Fig. 4f). Despite this increase, MAM is quite variable across GCMs and sub-models, for example MPI-ESM1-2-HR projects a negative slope in time (Fig. 4d). Therefore, further seasonal analysis is needed.

Moderate CAT variations with TAS
Thus far Sect. 3 has concentrated on the change in CAT with time. To isolate the relationship between global surface warming and upper-atmospheric CAT over the North Atlantic, the warming trend in each model is considered. The mean near-surface (2 m height) global temperatures in time are displayed in Fig. 5. As anticipated, these temperatures increase with time for all GCMs across all seasons. The SSP high-end projections modelled within the CMIP6 GCMs relate this warming to anthropogenic sources. All climate models. This relates to the equal and opposite seasonal differences between the northern and southern hemispheres. One may have assumed DJF and JJA to have the same global average in surface air temperature, like MAM and SON, but they differ due to differences in land mass between the hemispheres and the larger oceans to land heat capacity.
HadGEM3-GC3.1 has the largest sample size available with 11 ensemble members over the sub models against only 6 other GCM members. HADGEM3-GC3.1 models have coldest start of the period, relative to the other GCMs. This is most apparent for ensemble member number 4 within HadGEM3-GC3.1-LL model runs, with a large change in DJF near-surface air temperatures (TAS) of 4.05 °C between 1950 and 2050. MPI-ESM1-2 and EC-Earth-3 models project a similar and interchangeable increase in global temperature change. Within MPI-ESM1-2, the finer -XR sub model is overall colder than the HR model, particularly in the beginning of the period. This new generation of CMIP models has a higher climate sensitivity than previous CMIP5 generations (Harvey et al. 2020). This spread of future temperature could be linked to this heightened sensitivity. Sensitivity is defined here as an outcome that arises from the physical and dynamical chaos within climate models, and something that is not directly developed by a climate modeller. This may be one explanation as to why different ensemble runs, within the same sub model, reach temperatures at different stages in time within Fig. 5. The change in TAS for each year from 1950 is analysed in later figures, to better compare the warming trend with CAT in each sub-model and to consider the inter-annual TAS variability. Figure 6 outlines the relationships between moderate CAT percentage change (Fig. 4) and the change in globalmean seasonal TAS. The slopes of regression are a form of analysis used to determine a trend within TAS. There is good internal agreement across the HadGEM3-GC3.1 (Fig. 6e-g) models for DJF with projected CAT increasing by 8.67-9.74%/°C. The Met Office Hadley Centre mid-resolution (60 km) sub model (HadGEM3-GC3.1-MM) projects NH spring values to increase by 11.69% per degree. This is the second fastest rate after NH autumn within Fig. 6f, placing DJF with the lowest increase. Despite EC-Earth-3P (71 km) projecting a similar increase of 11.89%/°C for MAM, the remaining sub-models within MPI-ESM1-2 and EC-Earth project an insignificant increase with TAS.
There is an apparent model grid dependence for SON projections. Across all GCMs, the coarser sub-model versions project the maximum SON percentage changes, by a difference of ~ 2%/°C for HadGEM3-GC3.1 and EC-Earth-3 models and by ~ 6% between MPI-ESM1-2 rates. EC-Earth3P (71 km) and HadGEM3-GC3.1-LL (135 km) simulate increases of 14.53%/°C and 14.74%/°C, respectively. MPI-ESM1-2-HR simulated the largest projected SON rate of 19.14%/°C. However, the confidence of this trend is combatted by a large interval of ± 7.97%/°C. MPI-ESM1-2's results have a large spectrum of uncertainty with relatively large confidence interval ranges. This could, however, be related to the ensemble size of MPI-ESM1. NH Summer projections vary across Fig. 6, from HadGEM3-GC3.1 modelled data suggesting an increase at a similar or just below SON. MPI-ESM1-2 projected an increase in JJA by 6.75%/degree on average and EC-Earth-3P simulate respectively large JJA trends, projecting + 20.64%/°C and + 18.75%/°C for EC-Earth-3P-HR (36 km) and -3P (71 km). EC-Earth-3P-HR projects larger increase for summer-time moderate CAT per degree than any other season or model. This suggests a rapid increase in the number of CAT events for summer, which is a season that has historically not been as comparably turbulent.

Averaged trends with anthropogenic climate changes
To reduce uncertainty that arose from averaging over ensemble members, Fig. 7 displays Fig. 6's regression line slopes, with confidence intervals for all ensemble members. Within  Fig. 7, the boxes are colour coded in terms of resolution range. Autumn across the North Atlantic is projected to have a large increase per degree of surface warming. SON's slopes (Fig. 7c) Fig. 7c propagates closely to the median line. This is also visible for DJF trends (Fig. 7a). DJF's slopes has few multi-model disagreements, with HadGEM3-GC3.1, MPI-ESM1-2, and EC-Earth-3 on average increasing by 9.10 ± 1.83, 9.73 ± 6.26, and 7.56 ± 2.86 % per degree. This strong model agreement is not apparent within NH summer projections, with JJA on average increasing by 12.41 ± 1.39%/°C, 6.71 ± 4.76%/°C and 21.18 ± 3.02%/C for HadGEM3-GC3.1, MPI-ESM1-2 and EC-Earth-3 models. This wide variability is evident across GCMs but is not the case internally, with ensemble members residing usually in similar brackets. For example, all EC-Earth-3 ensemble member JJA results lie above the median line, ranging from 15.70 to 22.53%/C. Interestingly, NH summer is projected to increase in the number of moderate CAT events at a similar (or greater) rate than NH Autumn. NH spring over the North Atlantic has a wide spread of moderate CAT projections. On average EC-Earth-3 and HadGEM3-GC3.1 MAM slopes are increasing at a similar rate with median values of 9.64 ± 4.75 and 8.89 ± 3.31%/°C, but there is a considerable spread of 0.07 to 13.21 and 2.37 to 23.90%/°C, respectively. HadGEM3-GC3.1-MM ensemble 1 projects the greatest increase in moderate CAT per degree for NH spring (23.90%/°C). However, this comes with a high slope uncertainty of ± 8.87%/°C. Ensemble 2 of this Met Office Hadley Centre model projects an increase of only 5.77 ± 7.40%/°C. To complete these large uncertainties associated with NH spring, MPI-ESM1-2 has the smallest average increase of + 2.50 ± 7.1%/°C.
There is at least one ensemble member for each of the GCMs that projects a minimal change per degree in spring. This adds weight to a MAM future scenario with negligible moderate CAT increases. However, there are many projections with increases similar to DJF. If one takes an ensemble member and multi-model average across these GCMs, MAM is projected to increase by 8.81 ± 1.90%/°C. This quantitative median applied to the other seasons leads to an overall increase in moderate CAT events by 8.94 ± 1.54%/°C, Fig. 6 The moderate percentage change scattered against change in mean global seasonal near surface temperature (TAS). Shade of scatter relates to the year of moderate CAT events. Season defined by colour and line style, with DJF, SON, JJA, and MAM coloured blue and dashed, pink and dotted, light blue and solid, and orange and dash dotted, respectively. The line of regression slopes, with 95% confidence intervals, take an average over ensemble members 1 3 13.82 ± 1.27%/°C, and 13.69 ± 1.28%/°C for DJF, SON, and JJA, respectively. This is not a weighted average, as HadGEM3-GC3.1 makes up 11 of the 17 sampled slopes, so has a larger influence on averages across GCMs.

Summary and discussion
This paper has explored the projected moderate CAT changes with anthropogenic climate change using a sample of CMIP6 HighResMIP GCMs. This publication uses three models, the Met-Office Hadley Centre model HadGEM3-GC3.1, the Max-Plank Institute model MPI-ESM1-2, and from a collaboration of European universities and organisations, the EC-Earth-3 model. All model simulations cover the period 1950-2050 and follow the CMIP6 HighResMIP protocol. There are several different resolutions across these GCMs, and so a multi-model approach was applied to understand the dependency of CAT projections on model resolution. CAT changes in the North Atlantic in northern hemisphere winter (DJF), summer (JJA), autumn (SON), and spring (MAM) are individually analysed from Figs. 3, 4, 6 and 7. Twenty-one indices are used to represent the different dynamical scenarios for CAT production.
There were variations in the results for the different indices and climate models. Most diagnostics displayed an increase in moderate CAT from 1950 to 2050. The greatest projected increase arose in the last 30-40 years of the period of interest. For three indices, the sub-models with coarser resolutions (grid lengths ≥ 60 km) simulated the maximum increases in moderate CAT.
HadGEM3-GC3.1 and EC-Earth-3 models projected an increase in wintertime CAT across the 100-year period, with large interannual fluctuations after 2030. These models also simulated an increase in CAT with time for all seasons. In contrast, the MPI-ESM1-2 changes had no significant trends for DJF, MAM, or JJA in time. However, there was strong multi-model agreement that CAT events in SON will increase at a fast rate in time, compared to other seasons. The greatest seasonal increase for SON was projected in MPI-ESM1-2-HR by + 105% in late 2020 (Fig. 4d).The number of CAT events in summer is also projected to rise at the end of the 100-year period, as shown in the HadGEM3-GC3.1 and EC-Earth-3 subplots within Figs. 3 and 4. This implies current and future JJA turbulence encounters may have increased to the same original rate DJF and SON had in the 1950s. Overall, no dependence on model horizontal resolution was found after averaging across the indices for DJF, JJA and MAM. Global seasonal-mean near-surface temperature was used as the metric of global warming in each model, allowing the expression of projected CAT changes per degree of global warming. Within SON results, and individual diagnostics, coarser sub-models generally produce greater increases in time and with near-surface warming. We speculate that perhaps the move to high resolution climate models may result in projections of increased SON CAT being revised downward slightly. Autumn, over the North Atlantic, had good multi-model agreement on future CAT projections, with all GCMs agreeing an increase between around 10 and 20%/°C, with a median percentage increase of 13.82 ± 1.27%/°C. A high multi-model agreement also arose for DJF CAT changes. Wintertime CAT on average will increase by 8.94 ± 1.54%/°C. This differs from JJA slopes, which had a considerable spread across the GCMs. NH summer CAT projections, over the North Atlantic, on average suggest an increase of 13.69 ± 1.28%/°C, a similar average to NH autumn results. However, on average for each GCM, JJA is projected to increase in the number of CAT encounters by 21.18 ± 3.02%/°C, 12.41 ± 1.39%/°C and 6.71 ± 4.76%/°C for EC-Earth-3, HadGEM3-GC3.1 and MPI-ESM1-2. Despite CAT previously not being commonly encountered in NH spring, both HadGEM3-GC3.1 and EC-Earth-3 models projected a significant increase time and with global warming. On average, across all models, MAM had an increase in the number of moderate CAT events by 8.81 ± 1.90% per degree of global seasonal near-surface warming. This rate is similar to that found for the most turbulent season DJF.
In summary, this multi-model analysis found that moderate CAT will increase in future over the North Atlantic, for all seasons. This will have consequences for to the aviation industry, with more flights disrupted and increased damages and costs. Future work should quantify how this overall increase will impact aircraft directly and if the increase found over the North Atlantic in this study is linked to a higher density of CAT outbreaks in particular regions.

Magnitude of vertical wind shear of horizontal wind
Vertical wind shear is one of the main sources of turbulence in the upper atmosphere. Increased shear around the jet stream can increases the denominator of the Richardson number (Ri), which in turn reduces Ri number to a critical value where turbulence is produced (Colson and Panofsky 1965;Ellrod and Knapp 1992). Wind shear is on a scale large enough to be resolved by GCMs (Puemple 2016) and so is deemed suitable for this investigation.

Wind speed
Due to the turbulent nature of fast-moving flows, horizontal wind speeds over the North Atlantic Ocean are used as a metric to determine areas of increased instability and CAT. Increasing winds at certain levels also aid wind shear development, which as previously discussed is one of the main sources of CAT.

Magnitude of horizontal temperature gradient
Kelvin-Helmholtz instability (KHI), a shear-based instability, is pronounced in regions of steep or tight horizontal temperature gradients. Flow deformation can amplify these gradients and increase the probability of CAT (Overeem 2002). Therefore, the upper percentile values of this meteorological variable are suitable to represent CAT.

Magnitude of horizontal divergence
Upper-level horizontal divergence can be related to low level instability, but it can also be associated with anticyclonic shear and gravity wave generation (Lee, 2013)(. Variant one of Ellrod's index, discussed below (12), is an index often used in CAT forecasting, and some forecasters have added divergence terms within the index to account for the above CAT generation.

Magnitude of relative vertical vorticity squared
Regions of increased or strong relative vorticity develop in the presence of gravity waves and in particular in their decay and overturning, further leading to CAT and turbulent airflow. (1)

Flow deformation
Ri can be used to represent the presence of instability and turbulence. Mathematically an increase in horizontal deformation would lower Ri (assuming a constant static stability). This would lead to the production of KHI and an increase in horizontal thermal gradients. It is uncommon for a forecast turbulence index to not include a deformation component (Knox 1997).

Flow deformation times wind speed
As discussed, these two diagnostics represent the dynamical system appropriately, and so are used in combination in many previous papers (WIlliams 2016).

Magnitude of potential vorticity
There are multiple ways potential vorticity (PV) can be related to CAT, either through regions of vigorous geostrophic adjustment, or from turbulent mixing itself with the vertical mixing of potential temperatures inhibiting the spread of isentropic surfaces, leading to increased vorticity aiding CAT and PV development.

Colson-Panofsky index
This index is a combination of Ri and the vertical wind shear and was created to better estimate the vertical kinetic energy in a system. Colson and Panofsky (1965) developed this index to be proportional to the actual turbulence energy.

Brown index
The Brown index was developed through empirical analysis on PIREP differences with a synoptic scale system and takes into consideration the shearing and stretching terms in flow deformation (Overeem 2002).

Brown energy dissipation rate
This index is very similar to the Brown index but also considers vertical and horizontal wind shear.

Variant 1 Ellrod index
As discussed within the main section of the paper, this index is a combination of wind shear and flow deformation (Ellrod and Knapp 1992). It is used within most upper-level turbulence forecasting systems such as the Graphical Turbulence Guidance tool (Sharman and Pearson 2017).

Variant 2 Ellrod index
This is the second version of variant 1 but it also takes into consideration converging air flow.

Wind speed time directional shear
As the name suggests this index is the combination of wind speed and directional wind shear, this takes into consideration the angle of wind shear.

Negative Richardson number
Ri is used within several forecasting tools within the aviation sectors, such as the Graphical Turbulence Guidance tool (Sharman and Pearson, 2017) (. A reduction in environmental Ri, say from gravity waves, means that there is increased shear instability and thus overturning within the atmosphere. Increased wind shear increases the denominator within the Ri equation. Turbulence arises when Ri is small (Lv et al. 2021). Ri < 1/4 relates to dynamically unstable and turbulent flow. Therefore, we use the negative Ri value, which is correlated with a negative Brunt-Väisälä stability function and increased hydrostatic instability.

Flow deformation times vertical temperature gradient
This index is a combination of flow deformation and vertical temperature gradients. Steeper thermal gradients can aid the development of CAT generation. Combining these two metrics allows of a better representation of the systems arising.

Frontogenesis function
Increased gravity wave amplitude is associated with regions of upper-level unbalanced frontogenesis and the breakdown of these inertial gravity waves can lead to CAT (Lane et al. 2004). These regions are usually found around a troposphere fold above a jet core (Lane et al. 2004;Koch et al. 2005). This index is also linearly proportional to flow deformation (Knox 1997).

Magnitude of residual non-linear balance equation
If geostrophic balance breaks down, the Coriolis force and pressure gradient force are no longer balanced. This can develop gravity waves, and lead to CAT through their breakdown. Nonlinearity leads to the waves breaking and generation of turbulent kinetic energy. A mechanism of gravity wave breaking, or dissipation could relate to the unbalanced  (Lane et al. 2004).

Magnitude of relative vorticity advection
Strong relative vorticity relative to strong anticyclonic flow can lead to gravity wave development and geostrophic adjustment and inertial instability and eventually lead to CAT (Knox 1997). Cyclonic relative vorticity is associated with deformation and frontogenesis.

Negative absolute vorticity advection
Negative values of absolute vorticity are strongly associated with lateral gradients found in anticyclonically curved jet outflow. CAT is developed within this region and manifests from large scale flow imbalance, inertial instability and gravity wave breaking. Kaplan et al. (2004) find Ellrod's indices to be too basic with a combination of just wind shear and flow deformation, and that physically they are missing out the kinematic forcings such as PV. The NSCU index looks at separation between the horizontal pressure gradient from the total vertical velocity and finds an index of the evolving ageostrophic frontogenesis. It is suggested that this tool may be useful when forecasting severe turbulence.