The Super-large Ensemble Experiments of CAS FGOALS-g3

A super-large ensemble simulation dataset with 110 members has been produced by the fully coupled model FGOALS-g3 developed by researchers at the Institute of Atmospheric Physics, Chinese Academy of Sciences. This is the first dataset of large ensemble simulations with a climate system model developed by a Chinese modeling center. To date, it has the largest number of realizations worldwide among single-model initial-condition large ensembles. Each member includes a historical experiment (1850–2014) and an experiment (2015–99) under the very high greenhouse gas emissions Shared Socioeconomic Pathway scenario (SSP5-8.5). The dataset includes monthly and daily temperature, precipitation, and other variables, requiring storage of 275 TB. Additionally, the surface air temperature (SAT) and land precipitation simulated by the FGOALS-g3 super-large ensemble have been validated against observations, and their future changes have been projected. The ensemble captures the response of SAT and land precipitation to external forcings well, and allows the internal variability to be quantified. The availability of more than 100 realizations will help researchers to study rare events and improve the understanding of the impact of internal variability on forced climate changes.


Background
Climate change has greatly impacted the surface physics of land areas, the global monsoon, sea level change, and the lives of human beings (e.g., Church et al., 2013; Hatfield and Walthall, 2014; Loo et al., 2015; Fang et al., 2018). The signal of anthropogenic forcing in climate change is superposed on the internal variability (IV), which itself mainly originates from various physical processes such as the interactions among different climate components (atmosphere, ocean, land, etc.) as well as those among the different climate modes (e.g., Deser et al., 2020). IV is an important source of uncertainty for understanding historical climate change since it can account for a large component or even the dominant part of it, especially at regional scales (Deser et al., 2012a, b; Huang et al., 2020; Maher et al., 2021). More importantly, IV will cause large uncertainties for future regional climate projections, especially in the near term (Hawkins and Sutton, 2009; Hawkins et al., 2016).
To quantify the role of IV, the most popular approach is to produce single-model initial-condition large ensemble simulations. These ensemble simulations employ a single, fully coupled climate or earth system model under a particular radiative forcing scenario but with different initial conditions (e.g., Kay et al., 2015; Frankignoul et al., 2017; Frankcombe et al., 2018; Maher et al., 2021). The different initial fields cause the coupled model to fluctuate differently across members, which produces the ensemble spread (Deser et al., 2020). By calculating the ensemble mean and spread, the response to external forcing and the IV can be separated and robustly estimated (Frankcombe et al., 2018). As reported in IPCC AR6, large ensembles have improved our understanding of the impact of IV on forced changes and are highlighted as an important new field of progress in climate science (Zhou, 2021).
Since the era of CMIP3, in which only two coupled models carried out large-ensemble simulations [62 members in CCSM1.4 (e.g., Selten et al., 2004; Zelle et al., 2005; Drijfhout et al., 2008; Branstator and Selten, 2009) and 40 members in CCSM3 (e.g., Deser et al., 2012a)], an increasing number of research groups at modeling centers have moved in this direction. For instance, six research groups have conducted single-model initial-condition large ensemble simulations (at least 15 members) using CMIP5 coupled models in the past few years (Hazeleger et al., 2010; Jeffrey et al., 2013; Kay et al., 2015; Rodgers et al., 2015; Kirchmeier-Young et al., 2017; Maher et al., 2019). Among them, the maximum number of ensemble members is 100, conducted by only one group (the Max Planck Institute). Fast-forwarding to the latest phase of CMIP (i.e., CMIP6), more than 10 groups have now employed CMIP6 fully coupled models to conduct large-ensemble simulations, including all-forcings and single-forcing large ensembles, such as the CESM2 large ensemble simulations with 100 members under a historical/SSP3-7.0 scenario (Rodgers et al., 2021), CanESM5 (Swart et al., 2019), and EC-Earth3 (Wyser et al., 2021). However, only two groups to date, with CESM2 and MPI respectively, have conducted simulations with at least 100 ensemble members, since ensembles of such size require huge computational resources and massive storage capability.
Similar to previous studies under the framework of CMIP6, we have carried out super-large ensemble simulations using a single, fully coupled climate system model, namely the Flexible Global Ocean-Atmosphere-Land System Model, grid-point version 3 (FGOALS-g3; Li et al., 2020b). For the ensemble simulations, the external forcings were adapted from historical forcings and the very high greenhouse gas emissions Shared Socioeconomic Pathway scenario (SSP5-8.5). Here, we document the model used, the design of the super-large ensemble, the responses to external forcings, and the IVs, to provide a description of this dataset for users.
The organization of the paper is as follows: Section 2 describes the coupled model, forcing data, the designed initial values for the super-large ensemble members, and the methods. Section 3 presents validation results of the ensemble, focusing mainly on the climatology and change in surface air temperature (SAT) and land precipitation, but also the Atlantic meridional overturning circulation (AMOC). Firstly, the temporal evolution is given for examining the historical and future responses of the ensemble. Secondly, the simulated historical mean state and changes in SAT and precipitation extreme events are validated. Meanwhile, the signal-to-noise ratio (S/N) is provided to illustrate the role of IVs. And thirdly, the precipitation and low-level winds in the East Asian monsoon region are validated. In section 4, projections in the near term (2021-40), middle term (2041-60), and long term (2080-99) are provided. Section 5 provides a summary. Section 6 describes the data record. And lastly, section 7 presents some usage notes.

Introduction to the model
The Chinese Academy of Sciences (CAS) FGOALS model, version 3, has three climate system model versions for CMIP6, developed by the Laboratory of Atmospheric Sciences and Geophysical Fluid Dynamics (LASG), Institute of Atmospheric Physics (IAP), CAS. Among them, FGOALS-g3 (Li et al., 2020b) is employed in this study. In FGOALS-g3, the oceanic component is version 3 of the LASG-IAP Climate System Ocean Model (LICOM3; Lin et al., 2020); the atmospheric component is version 3 of the Grid-point Atmospheric Model of LASG-IAP (GAMIL3; Li et al., 2020a); the ice component is version 4 of the Los Alamos sea ice model (CICE4, http://climate.lanl.gov/Models/CICE); and the land component is the CAS Land Surface Model (CAS-LSM; Xie et al., 2020). The spatial resolutions of the model components are listed in section 7, and other setup details of FGOALS-g3 are described in Li et al. (2020b). The equilibrium climate sensitivity of FGOALS-g3 is 2.8 K (Li et al., 2020b).

Experimental design
The 110-member historical experiments (1850-2014) and SSP5-8.5 experiments (2015-99) are performed using FGOALS-g3, following the experimental design of CMIP6 (Eyring et al., 2016), with the forcings also taken from CMIP6. The SSP5-8.5 scenario is chosen because of the large effect of high emissions on the AMOC (e.g., Cheng et al., 2016) and regional monsoon precipitation (Moon and Ha, 2020). Only the initial values differ among the members, and they are chosen from the 2000-year FGOALS-g3 preindustrial control (piControl) simulation. Specifically, the initial values for the 110 historical experiments are taken from the last 1101 years (model years 900-2000) of the piControl simulation (Fig. 1), since the piControl experiment by FGOALS-g3 reaches a quasi-stationary state after the first 900 years of simulation, with a slight global-mean SAT linear trend of −0.015°C (100 yr)⁻¹ (Li et al., 2020b). A smaller linear trend [−0.01°C (1000 yr)⁻¹] during 900-2000 is achieved (Fig. 1a). The ocean circulation (AMOC) has no obvious drift, with a small linear trend of −0.1 Sv (1000 yr)⁻¹ (Fig. 1b). Here, the macro method (Deser et al., 2020) is applied to sample possible climate trajectories adequately, as different oceanic initial conditions strongly influence regional climate variations (Doblas-Reyes et al., 2013; Hawkins et al., 2016). In this study, we design a novel macro initialization scheme that fully considers the effects of decadal to interdecadal variabilities in the climate system, since these have been identified as possibly important terms for IVs (e.g., Dai and Bloecker, 2019). At decadal to interdecadal time scales, the leading basin-scale climate modes in the Pacific and Atlantic Ocean are the Interdecadal Pacific Oscillation (IPO) and Atlantic Multidecadal Oscillation (AMO), respectively. Additionally, the AMOC is an important driving source of the AMO (Zhang et al., 2019).
The 110 initial values for the super-large ensemble members are based on 90 pairwise combinations of the positive, negative, and neutral phases of the IPO (IPO⁺, IPO⁻, and IPO⁰, respectively) and AMO (AMO⁺, AMO⁻, and AMO⁰), and 20 different years of strong/weak AMOC values (AMOC⁺/AMOC⁻) from the piControl simulations during model years 900-2000.
A positive (+)/negative (−) AMO (IPO) phase occurs when the AMO (IPO) index is larger/smaller than 1.0/−1.0 times its standard deviation (SD). An AMO/IPO index value within ±0.5 SD defines its neutral phase (AMO⁰ or IPO⁰). The AMOC index is computed as the maximum annual meridional streamfunction over 20°-60°N and below the depth of 500 m in the North Atlantic. A positive (+)/negative (−) AMOC is defined when the AMOC index is larger/smaller than 35.5 Sv (1 Sv = 10⁶ m³ s⁻¹). According to this definition, the 110 restart years selected as initial values for the super-large ensemble are listed in Table 1. For example, restart year 1019 is in a combined AMO⁺ and IPO⁺ phase, meaning the simulated data on 1 January 1019 are used as the input initial field for one of the ensemble members. Initialized by the selected macro climate conditions, the historical simulations including 110 members are performed using the time-varying external forcings of the historical run recommended by CMIP6 (https://esgf-node.llnl.gov/search/input4mips/). Every member has its own distinct initial value. The initial value corresponds to the transient restart field on 1 January of the model year (here, 1 January is omitted in Table 1). The SSP5-8.5 runs are initialized by the historical simulation on 1 January 2015 of each member. The SSP5-8.5 run is driven by the standard SSP5-8.5 external forcings from CMIP6.
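The phase definitions above can be sketched in Python. This is a minimal illustration, not the authors' code: the function name `classify_phase`, the label strings, and the synthetic index are assumptions, and the index is assumed to be an anomaly with zero mean. Only the 1.0 SD and 0.5 SD thresholds come from the text. Years whose magnitude lies between 0.5 and 1.0 SD are left unclassified.

```python
import numpy as np

def classify_phase(index, strong=1.0, neutral=0.5):
    """Label each year of a climate index as '+', '-', '0', or '' (unclassified).

    Thresholds are expressed in units of the index's own standard deviation.
    The index is assumed to be an anomaly series (zero mean).
    """
    index = np.asarray(index, dtype=float)
    sd = index.std()
    labels = np.full(index.shape, "", dtype=object)
    labels[index > strong * sd] = "+"            # positive phase
    labels[index < -strong * sd] = "-"           # negative phase
    labels[np.abs(index) < neutral * sd] = "0"   # neutral phase
    return labels

# Hypothetical stand-in for an AMO-like index over piControl model years 900-2000.
rng = np.random.default_rng(0)
years = np.arange(900, 2001)
amo = rng.standard_normal(years.size)
amo -= amo.mean()
phases = classify_phase(amo)
positive_years = years[phases == "+"]   # candidate restart years in the '+' phase
```

Pairing the labels of two indices year by year (for example, requiring both to be "+") would then give candidate restart years for a combined phase.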
Our novel macro initialization scheme is able to fully consider the possible states of the long-term oceanic IVs (providing SAT and AMOC), including different phases of the IPO, AMO, and AMOC. For instance, the evolution of the AMOC differs completely depending on the phase of the AMO (Fig. 1c), and these different evolutions indicate that the "memory" spans about three to four decades from the initialization (Fig. 1c), which is close to that in the CESM2 large ensemble (Rodgers et al., 2021). Besides, 110 members will help to obtain more robust and precise conclusions on matters such as the forced response to external forcing. Separating the forced response of SAT to external forcing is used as an example to explain why 110 members are superior to a small number of members. Following Milinski et al. (2020), the ensemble mean of annual global SAT from all 110 members is taken as the reference "true" value of the forced response, and the ensemble means of subsets (randomly selecting 1, 5, 10, 25, 50, 75, and 100 members from the 110) are taken as estimated forced responses. Then, the root-mean-square error (RMSE) between the forced response estimated from each subset and the "true" forced response is computed (Fig. S1 in the Electronic Supplementary Materials, ESM). As shown in the figure, the RMSE becomes smaller as the ensemble becomes larger, and its spread becomes smaller too. This indicates that a larger number of ensemble members yields a more accurate quantification of the forced response, which is similar to the findings of Milinski et al. (2020).
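The subsampling test described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the procedure used to produce Fig. S1: the function name `subsample_rmse`, the number of random draws (`n_draws`), sampling without replacement, and the synthetic data (a linear trend plus white noise) are all hypothetical choices.

```python
import numpy as np

def subsample_rmse(sat, sizes, n_draws=200, seed=0):
    """RMSE of subset ensemble means against the full-ensemble mean.

    sat   : array (n_members, n_years) of annual global-mean SAT
    sizes : iterable of subset sizes to test
    Returns a dict {size: array of n_draws RMSE values}.
    """
    rng = np.random.default_rng(seed)
    n_members = sat.shape[0]
    reference = sat.mean(axis=0)   # "true" forced response from all members
    out = {}
    for k in sizes:
        rmse = np.empty(n_draws)
        for d in range(n_draws):
            # Draw k distinct members at random (assumption: no replacement).
            pick = rng.choice(n_members, size=k, replace=False)
            estimate = sat[pick].mean(axis=0)
            rmse[d] = np.sqrt(np.mean((estimate - reference) ** 2))
        out[k] = rmse
    return out

# Synthetic stand-in for 110 members over 165 years (not model output).
rng = np.random.default_rng(1)
n_members, n_years = 110, 165
forced = np.linspace(0.0, 1.2, n_years)
sat = forced + 0.15 * rng.standard_normal((n_members, n_years))
result = subsample_rmse(sat, sizes=[1, 5, 10, 25, 50, 75, 100])
median_rmse = {k: float(np.median(v)) for k, v in result.items()}
```

With data of this kind, both the median RMSE and its spread across draws shrink as the subset size grows, which is the behavior the text attributes to larger ensembles.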
Details of the outputs for the atmospheric and oceanic component models are given in Tables 2 and 3. The analysis in this study employs monthly SAT, total precipitation, meridional overturning streamfunction, wind vector fields at 850 hPa, daily precipitation, and the daily SAT maximum to describe and validate the 110 FGOALS-g3 ensemble members.

Data
To validate the temperature and precipitation, the SAT from HadCRUT5 (Morice et al., 2021), land precipitation data from the monthly analysis (version 2.3) of the Global Precipitation Climatology Project (GPCP; Adler et al., 2018), version 4 of the Climatic Research Unit (CRU) Time Series monthly high-resolution gridded multivariate climate dataset (Harris et al., 2020), and NOAA's Precipitation Reconstruction over Land (PRECL, Chen et al., 2002), are employed.
We use multiple observational datasets to evaluate the simulated extreme temperature and precipitation. They include: (1) HadEX3, which is a land-surface dataset of climate extreme indices on a 1.875° × 1.25° grid covering 1901-2018 (Dunn et al., 2020); (2) the NOAA Climate Prediction Center (CPC) Global Telecommunication System-based daily SAT over the global land area from 1979 to the present day at a 0.5° × 0.5° resolution; and (3) the global gauge-based gridded daily precipitation from the Global Precipitation Climatology Centre (GPCC Full Data Daily Product) covering 1982-2019 with a resolution of 1° × 1° (Schneider et al., 2014). A common period of 1995-2014 is used to evaluate the simulated climate extremes.

Following the definition of previous studies (e.g., Dai and Bloecker, 2019; Maher et al., 2019), and taking SAT as an example, since the external forcing is identical in all members, the transient forced response ($\mathrm{SAT}_{M,t}$) is estimated by taking the ensemble mean across members at each time step:

$$\mathrm{SAT}_{M,t} = \frac{1}{N_m}\sum_{i=1}^{N_m}\mathrm{SAT}_{i,t}, \qquad (1)$$

where $\mathrm{SAT}_{i,t}$ is from an individual member $i$ with ensemble numbers $N_m$ at time step $t$. Here, $N_m$ is 110. The estimations of the forced response for other variables are similar to $\mathrm{SAT}_{M,t}$. The IV is defined as 1 standard deviation (SD) across the ensemble members. The SD is calculated using the following formula:

$$\mathrm{SD}_t = \sqrt{\frac{1}{N_m}\sum_{i=1}^{N_m}\left(\mathrm{SAT}_{i,t}-\mathrm{SAT}_{M,t}\right)^2}. \qquad (2)$$

We also use a simple S/N analysis to assess the relative magnitudes of the forced and internally generated components of future climate change. Here, the signal is the change in the forced response between two time periods and is defined as the absolute change in the ensemble-mean value of a variable across ensembles, and noise is defined as 1 SD across ensemble members of this variable at each grid point:

$$\mathrm{S/N} = \frac{S}{N}, \qquad (3)$$

where $S = \left|\overline{\mathrm{SAT}_M}(T_2)-\overline{\mathrm{SAT}_M}(T_1)\right|$, $N$ is the SD across ensemble members, and $\overline{X}(T)$ is the time average of $X$ over a time period ($T$). It is clear that the signal is significant on the condition that S/N is larger than one.
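The ensemble-mean, spread, and S/N definitions above translate directly into array operations. The sketch below is an illustration under assumptions, not the authors' implementation: the function names, the array layout (members on axis 0, time on axis 1), and the synthetic data are hypothetical. In particular, the noise is taken here as the SD across members of each member's own change between the two periods, and the SD uses a population normalization (`ddof=0`); both are assumptions that the text does not confirm.

```python
import numpy as np

def forced_response(x):
    """Ensemble mean across members (axis 0) at each time step."""
    return x.mean(axis=0)

def internal_variability(x, ddof=0):
    """1 SD across members (axis 0). ddof=0 is an assumed normalization."""
    return x.std(axis=0, ddof=ddof)

def signal_to_noise(x, years, period1, period2, ddof=0):
    """S/N of the change between two periods (inclusive year bounds).

    x     : array (n_members, n_years, ...) ; trailing axes may be grid points
    years : array (n_years,) of calendar years
    """
    years = np.asarray(years)

    def time_mean(start, end):
        sel = (years >= start) & (years <= end)
        return x[:, sel].mean(axis=1)          # one value per member

    change = time_mean(*period2) - time_mean(*period1)
    signal = np.abs(change.mean(axis=0))       # |change in ensemble mean|
    noise = change.std(axis=0, ddof=ddof)      # assumed: spread of member changes
    return signal / noise

# Synthetic stand-in: 110 members, 1850-2099, linear warming plus white noise.
rng = np.random.default_rng(0)
years = np.arange(1850, 2100)
forced = np.linspace(0.0, 4.0, years.size)
sat = forced + 0.15 * rng.standard_normal((110, years.size))
sn_long_term = signal_to_noise(sat, years, (1995, 2014), (2080, 2099))
```

Because the time mean and the ensemble mean are both linear, the mean of the per-member changes equals the change in the ensemble mean, so the signal computed here matches the definition in the text.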
Figure caption (fragment): ... (Table S1) under the SSP5-8.5 scenario. The pink shadings show the spread of multiple CMIP6 models.
The pattern correlation coefficient (PCC) calculated in this study is the Pearson product-moment coefficient of linear correlation between two datasets. The Pearson correlation coefficient does not change when either variable undergoes a linear change (a positive scaling or a constant offset). A high correlation coefficient therefore does not mean that two fields are exactly the same; rather, it means that the two fields have the same spatial gradient.
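A short sketch makes the invariance property of the PCC concrete. The function name `pattern_correlation` and the synthetic field are assumptions for illustration. The version below is unweighted; whether the study applies area (latitude) weighting is not stated, so none is applied here.

```python
import numpy as np

def pattern_correlation(a, b):
    """Unweighted Pearson correlation between two fields on the same grid.

    Grid points that are NaN in either field (e.g., ocean points in a
    land-only dataset) are ignored.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    valid = np.isfinite(a) & np.isfinite(b)
    a = a[valid] - a[valid].mean()
    b = b[valid] - b[valid].mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2)))

# Hypothetical field on a small latitude-longitude grid.
rng = np.random.default_rng(0)
field = rng.standard_normal((10, 20))
biased = 2.0 * field + 3.0          # same pattern, different amplitude and offset
pcc_same_pattern = pattern_correlation(field, biased)
```

Here `pcc_same_pattern` equals one to within rounding even though `biased` differs from `field` at every grid point, which is why a high PCC has to be read together with a bias or magnitude measure.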
During 1939-45, due to the scarce observations at that time, a systematic warming bias exists (e.g., Chan and Huybers, 2021). Meanwhile, some studies have mentioned that observational datasets possess large uncertainty pre-1945 (Kennedy, 2014;Kennedy et al., 2019;Morice et al., 2021). Thus, we do not know whether the cold global SAT in the ensemble mean of the FGOALS-g3 super-large ensemble is real during 1850-1950. The ensemble mean simulates global warming with a magnitude of about 1.18°C (with an ensemble spread of 1.07°C-1.29°C) in 1995-2014 relative to 1850-1900, which is slightly larger than 1.1°C from HadCRUT5. During 1950-2014, the evolution of the super-large ensemble mean matches very well with that of the observation, indicating that the global SAT response to external forcing in FGOALS-g3 is very realistic.
The averages of the annual mean precipitation over the global land area from the ensemble simulations and three sets of observations are displayed in Fig. 2b. The ensemble mean of the simulated global land precipitation shows a small increasing trend, as in CRU, during 1900-2014. The ensemble mean is able to capture the observed values (CRU and GPCP), but with some underestimations. The averages of global land precipitation from CRU and GPCP fall within those of the simulated ensemble members generally, and both the simulated and observed land precipitation values possess large uncertainties. Before satellite observations became available (i.e., ~1980), the global land precipitation shows large uncertainty across the three datasets (maximum of ~0.15 mm d⁻¹). The uncertainty is about half of the ensemble spread across members (~0.3 mm d⁻¹).
The AMOC can significantly influence the climate by transporting large quantities of ocean heat poleward (e.g., Buckley and Marshall, 2016; Liu et al., 2020). Figure 2c shows the time series from 1850 to 2099 of the maximal annual mean AMOC at 26.5°N for the large ensemble members. Averaged over 2004-14, the ensemble mean of the simulated AMOC across the members is 25.88 Sv (with an ensemble spread from 21.73 Sv to 28.95 Sv), which overestimates the observed value (16.9 Sv; Smeed et al., 2018). Meanwhile, during 1850-1980, the ensemble mean AMOC keeps its amplitude of 28 Sv and displays no obvious declining trend. After 1980 and up to 2014, the AMOC presents a noticeable declining trend of about 0.7 Sv (10 yr)⁻¹. The overestimated AMOC intensity suggests that a systematic AMOC bias exists in the coupled model. By contrast, the oceanic component (LICOM3) of FGOALS-g3, forced by two different atmospheric and runoff datasets, can simulate the observed AMOC well at 26.5°N (Lin et al., 2020). Therefore, the AMOC bias in FGOALS-g3 may be related to the interaction with atmospheric or sea ice components, which needs to be studied further.

Climatology and change in SAT and precipitation
The larger cold biases (Fig. 3b) mainly lie to the north of 60°N and to the south of 60°S, as well as around high terrain in plateau or mountainous regions (e.g., the Tibetan Plateau). The global mean SAT for the ensemble mean is 0.71°C lower than that for HadCRUT5, and this is mainly due to the cold SAT around the North Pacific, the Arctic Ocean, and the Southern Ocean close to the Antarctic continent. A similar cold bias pattern also exists in the ensemble mean relative to BEST (Rohde and Hausfather, 2020), although there are uncertainties in the observed SAT at high latitudes. These cold biases also appear in several other models, such as CESM1, CSIRO, and EC-Earth3 (Jeffrey et al., 2013; Kay et al., 2015; Döscher et al., 2021).
In terms of the global mean, the ensemble mean using FGOALS-g3 has the smallest bias among these ensembles from different coupled models (Fig. S2 in the ESM), but the cold biases at high latitudes in FGOALS-g3 seem more severe than those in other models. The cold biases at high latitudes in the FGOALS-g3 ensemble may be related to the surface albedo, or downward solar radiation associated with cloud cover (e.g., Zhou et al., 2019). The bias in surface albedo in the Arctic Ocean may be associated with the bias in sea ice, and the bias at the land surface may be associated with snow parameterization (Li et al., 2020b).
The SAT changes (1995-2014 minus 1961-90) are shown for HadCRUT5 in Fig. 3c, and the ensemble mean is shown in Fig. 3d. The change in SAT also reflects the trend.
The observed change in SAT shows significant warming (>0.5°C) over the continents and the North Atlantic Ocean, and the largest warming (>1.5°C) takes place over the Arctic, while there is cooling over the Southern Ocean close to the Antarctic continent. The observed change in SAT in Fig. 3c is captured well by some individual members, except over the central-eastern tropical Pacific between 180° and 120°W, and over the Southern Ocean close to the Antarctic continent. The ensemble mean change in SAT (Fig. 3d) captures the observed change well, with a PCC of 0.86. Large changes (warming) also appear over the Arctic Ocean in the ensemble mean, and this warming should be due to external forcings since the S/N is larger than one for the ensemble mean. Over the Eurasian continent, relatively weaker warming is located over its central part in the observation, whereas weaker warming is located over its eastern parts in the ensemble mean. Meanwhile, over the central-eastern tropical Pacific and the Southern Ocean close to the Antarctic continent, the warming is larger in the ensemble mean than in the observed data. Around the subpolar North Atlantic, IVs strongly influence the change in SAT. The model fails to simulate the weaker warming in the central-eastern tropical Pacific between 180° and 120°W, or the cooling over the Southern Ocean close to the Antarctic continent, since the S/N ratios there are larger than one and the observation lies outside the range of all FGOALS-g3 super-large ensemble members.
As previously stated by Bindoff and Min (2013), observations show the phenomenon of amplified warming in the regions of high latitudes (especially around the Arctic); and here, larger changes in SAT are found at high latitudes compared with low latitudes during 1995-2014 relative to 1961-90 (Fig. 3c). This observed phenomenon is generally reproduced by the ensemble mean of the FGOALS-g3 super-large ensemble simulations, but with significant underestimation in magnitude (Fig. 3d). Previous studies have suggested a strong influence of IVs on SAT change at high latitudes (Meehl et al., 2014; Dai et al., 2015). To illustrate whether IVs can influence the observed warming magnitude around the Arctic, we present the results of the two individual members with the smallest and largest SAT change (1995-2014 relative to 1961-90) averaged over the regions north of 60°N (Figs. 3e and f, respectively). The average changes in SAT north of 60°N are 0.41°C (Fig. 3e) and 1.87°C (Fig. 3f), respectively. In the member with the largest (smallest) change, the SAT change averaged north of 60°N is significantly higher (still lower) than the change over the globe, which is 0.71°C (0.49°C) in Fig. 3f (Fig. 3e). Thus, the phenomenon of polar amplification in the observation can be captured in some FGOALS-g3 members (Fig. S3 in the ESM), but with large IVs. This weak polar amplification in FGOALS-g3 may be related to the cold bias (inducing positive feedback with surface albedo due to excess sea ice) around the Arctic as well as the strong AMOC.
Figures 4a and b show the climatological mean land precipitation in observations (CRU) averaged during 1961-90 and the bias of the ensemble mean relative to the observations. In the observations, the land precipitation belt is mainly located in the tropics, such as the monsoon regions and the Amazon. This distribution is similar to the reference values from other observational datasets (GPCP and PRECL).
The observed large-scale spatial pattern and magnitude of land precipitation are captured well by the ensemble mean; the PCC between the observation and ensemble mean is 0.81. The averaged bias of global land precipitation is 0.07 mm d⁻¹. The simulated land precipitation is clearly underestimated over land in the tropics (30°S-30°N) (Fig. 4b), and the geographical pattern of land precipitation bias is similar to that of CESM2 (Danabasoglu et al., 2020). A dry bias is located over land in South Asia, South America, and central Africa, and is related to convective and large-scale precipitation biases (Pathak et al., 2019). Additionally, the simulated land precipitation is overestimated over high terrain, such as plateau or mountain regions (e.g., the Tibetan Plateau, Andes, Rocky Mountains), and the Maritime Continent, where the bias is associated with the model resolution (Schiemann et al., 2014).
Figures 4c and d show the changes in land precipitation (1995-2014 relative to 1961-90) in the observation and ensemble mean, respectively. The observed change in land precipitation falls within the range of the ensemble members over almost all land areas (dotted in Fig. 4c), which indicates that the change in precipitation can be captured by more than one FGOALS-g3 super-large ensemble member. The super-large ensemble members cannot simulate the observed precipitation change in some places, such as the northeast corner of China, the northeast part of Greenland, and western Africa in the tropics. In the ensemble mean, the change in precipitation is affected greatly by IVs covering most of the tropics south of 60°N (no dots in Fig. 4d), except the southern branch of the Intertropical Convergence Zone (ITCZ). In the middle-to-high latitudes of the Northern Hemisphere, the change in precipitation could be associated with the change in external forcing in the ensemble mean. Still, the response is very weak compared with that over the tropical oceans.
The simulated changes in annual global SAT and land precipitation (1995-2014 minus 1961-90) are compared with those in the observational data (Fig. S4 in the ESM). The results show that the ensemble change spreads are large enough to cover the observed SAT and land precipitation change over the globe. The spread for land precipitation shows better coverage than that for the SAT over the globe.
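One simple way to test whether the ensemble spread "covers" an observed change is to check, at each grid point or for each global-mean value, whether the observation lies inside the envelope of the member changes. The sketch below assumes the minimum-to-maximum range is the relevant envelope; the function name and the numbers are hypothetical and are not taken from the dataset.

```python
import numpy as np

def within_ensemble_range(member_changes, observed_change):
    """True where the observed change lies inside the member min-max envelope.

    member_changes  : array (n_members, ...) of simulated changes
    observed_change : array broadcastable to member_changes.shape[1:]
    """
    member_changes = np.asarray(member_changes, dtype=float)
    lower = member_changes.min(axis=0)
    upper = member_changes.max(axis=0)
    return (observed_change >= lower) & (observed_change <= upper)

# Hypothetical changes for 3 members at 2 grid points.
members = np.array([[0.2, -0.1],
                    [0.5,  0.0],
                    [0.9,  0.3]])
observed = np.array([0.6, 0.4])
covered = within_ensemble_range(members, observed)
```

In this toy case the first grid point is covered and the second is not, because 0.4 exceeds the largest member value of 0.3.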

Climatology of the Asian summer monsoon
Over the Asian monsoon region, the climatological mean precipitation and 850-hPa winds during 1995-2014 in boreal summer (June-July-August) simulated by the super-large ensemble members of FGOALS-g3 are assessed by comparing with the observational and reanalysis data (Figs. 5a, c, and e). The distribution of precipitation is captured but with obvious underestimation in most of the Asian monsoon region, except the southeastern Tibetan Plateau and South China Sea (Fig. 5e). The PCCs of precipitation in the monsoon region between the 110 members and GPCP range from 0.28 to 0.32, whereas for the 850-hPa wind in (10°S-60°N, 60°-160°E) they are much higher, ranging from 0.90 to 0.92. The correlation across members between the PCCs of precipitation and those of low-level winds is close to zero. Therefore, the local biases in monsoon precipitation cannot be explained by the low-level winds and may instead be rooted in the convective parameterization schemes, treatment of topographic effects, and boundary layer processes (Li et al., 2020b).

Climatology of climate extremes
Next, we compare the annual hottest daily maximum temperature (TXx) between the ensemble members and observations from HadEX3 and CPC over the period 1995-2014. Over land, in both HadEX3 and CPC, TXx exhibits an overall latitudinal structure, with generally warmer values in the tropics and cooler values in the northern high latitudes and mountainous regions. The FGOALS-g3 ensemble reproduces the spatial distribution of TXx reasonably well, with a pattern correlation of 0.99 with CPC over land. The ensemble slightly underestimates the magnitude of TXx: averaged over global land areas, it is 33.90°C (10th-90th percentile range of 33.84°C-33.97°C) in the ensemble and 35.51°C in the CPC dataset.
Figure caption (fragment): ... (d), and (f) represent the near-term, mid-term, and long-term projections, respectively, relative to the mean of 1995-2014, in which dotted shading and arrows show where the S/N ratio is larger than one. The domain encircled by the thick gray line is higher than 2500 m, indicating the location of the Tibetan Plateau. The domain of the Asian summer monsoon is shown by the red contour, based on the definition by Wang and Ding (2008), which is composed of the East Asian, South Asian, and western North Pacific monsoons (divided by the dashed red line).
To evaluate the simulated extreme precipitation, we compare the annual maximum daily precipitation (Rx1day) between the ensemble members and observations from HadEX3 and GPCC over the period 1995-2014. Climatologically, extreme precipitation in both HadEX3 and GPCC is generally stronger in the tropics and monsoon regions than over the rest of the land areas. The FGOALS-g3 ensemble is able to reproduce the large-scale spatial distribution of extreme precipitation, with a pattern correlation of 0.90 with GPCC over land. The ensemble underestimates the magnitude of Rx1day: averaged over global land areas, it is 39.70 mm (10th-90th percentile range of 39.44-39.92 mm) in the ensemble and 49.20 mm in the GPCC dataset. Global climate models commonly underestimate the magnitude of extreme precipitation (Flato et al., 2013), which is partly related to model physics such as convection parameterization, and partly to their coarse spatial resolutions (Kopparla et al., 2013; Norris et al., 2021).
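Both extreme indices used above are annual maxima of a daily series: TXx is the annual maximum of daily maximum temperature, and Rx1day is the annual maximum of daily precipitation. A minimal sketch of that reduction follows; the function name `annual_max`, the array layout, and the sample values are assumptions for illustration, not part of the dataset's tooling.

```python
import numpy as np

def annual_max(daily, years):
    """Annual maximum of a daily series.

    daily : array (n_days, ...) ; trailing axes may be grid points
    years : array (n_days,) giving the calendar year of each day
    Returns (unique_years, maxima) with maxima of shape (n_years, ...).
    """
    daily = np.asarray(daily, dtype=float)
    years = np.asarray(years)
    unique_years = np.unique(years)
    maxima = np.stack([daily[years == y].max(axis=0) for y in unique_years])
    return unique_years, maxima

# Hypothetical daily precipitation (mm) for three days in each of two years.
precip = np.array([1.0, 5.0, 2.0, 7.0, 3.0, 4.0])
day_years = np.array([2000, 2000, 2000, 2001, 2001, 2001])
yrs, rx1day = annual_max(precip, day_years)
rx1day_climatology = rx1day.mean(axis=0)   # time mean, e.g., over 1995-2014
```

Applying the same function to daily maximum temperature gives TXx; averaging the annual values over the evaluation period gives the climatology that is compared with HadEX3, CPC, and GPCC.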

Temporal evolutions in future projections
Under the SSP5-8.5 scenario, warming is projected to increase globally (Fig. 2a and Fig. 8). In the FGOALS-g3 ensemble mean, the change in SAT averaged over the globe is 0.50°C (0.23°C-0.66°C), 1.10°C (0.93°C-1.22°C), and 2.59°C (2.46°C-2.75°C) during 2021-40, 2041-60, and 2080-99 relative to that during 1995-2014, respectively. By the end of the 21st century, the change in SAT averaged over the globe is projected to reach about 3.6°C in the ensemble mean, which is close to the magnitude (3.8°C) in the Max Planck Institute Grand Ensemble (Maher et al., 2019). The projected ensemble mean future warming over the globe in the FGOALS-g3 super-large ensemble lies within the spread of multiple CMIP6 models (Table S1 in the ESM), but near the lower bound of that spread (Fig. 2a). The spread in the FGOALS-g3 super-large ensemble is much smaller than that in multiple CMIP6 models.
Under the SSP5-8.5 scenario, the global land precipitation is projected to increase continuously during 2015-40, and then increase much more obviously thereafter. The increase in global land precipitation reflects the response to external forcing, and this can also be affected by the IVs. Maher et al. (2019) suggested that the increase in global precipitation is correlated with the increases in average SAT and CO₂ over the globe. Additionally, the IV (gray spread in Fig. 2b) can influence global land precipitation, both historically and in the future. This implies that the IVs need to be considered in future projections of precipitation. The super-large ensemble members help to quantify the IVs and therefore future projections of precipitation. The projected ensemble mean of land precipitation in the future over the globe in the FGOALS-g3 super-large ensemble lies within the spread of multiple CMIP6 models (Fig. 2b). The projected ensemble mean and spread of land precipitation in the FGOALS-g3 super-large ensemble are both smaller than those in multiple CMIP6 models.
Figure caption (fragment): ... 1995-2014 (units: mm). Note that the HadEX3 and GPCC datasets cover land only. In HadEX3, only regions where at least 50% of records are temporally complete are shown. (d-f) Projected changes in Rx1day in the near-term (d), mid-term (e), and long-term (f) periods (units: % relative to the 1995-2014 baseline). Shading shows model ensemble medians. Dots and hatching indicate at least 70% and 90% of members agree on the sign of change, respectively.
Under the SSP5-8.5 scenario, the declining trend of the AMOC is much more obvious than that during 1980-2014, and the value is about 1.2 Sv (10 yr)⁻¹ during 2015-99. In multiple CMIP6 models, a significant decline in the AMOC is also projected to appear in the 21st century (Weijer et al., 2020), largely due to the rapid warming caused by continuous emissions of CO₂ (Maher et al., 2019; Dima et al., 2021). The evolution of the spread of members resembles that of the ensemble mean AMOC, with no decline during 1850-1980 and a decline during 1980-2099. This temporal change in the behavior of the spread was also reported in CMIP5 and CMIP6 models by Cheng et al. (2016).
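A decadal trend such as the AMOC decline quoted above can be estimated with an ordinary least-squares fit of the annual series against time. The sketch below assumes that approach; the text does not state how the trend was computed, and the function name and the synthetic AMOC series are hypothetical.

```python
import numpy as np

def trend_per_decade(years, values):
    """Least-squares linear trend, expressed per 10 years."""
    slope, _intercept = np.polyfit(years, values, 1)
    return 10.0 * slope

# Synthetic ensemble-mean AMOC (Sv) declining by 0.12 Sv per year from 2015 to 2099.
years = np.arange(2015, 2100)
amoc = 25.0 - 0.12 * (years - 2015)
decline = trend_per_decade(years, amoc)   # negative value indicates weakening
```

For real output, the same function could be applied to the ensemble-mean series and to each member separately, so that the spread of member trends is available alongside the forced trend.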

Future changes in SAT and precipitation
Under the SSP5-8.5 scenario, in the near-term (2021-40), mid-term (2041-60), and long-term (2080-99) projections, the ensemble mean of the FGOALS-g3 super-large ensemble members shows continuous warming in most global areas, and the warming patterns remain almost unchanged relative to 1995-2014. The projected warming exceeds the effect of IVs. In the mid-term and long-term projections, the warming amplification over the Arctic Ocean is obvious, and the warming is projected to extend southward to the Eurasian continent. In the tropics, El Niño-like patterns are found and are consistent in the near-term, mid-term, and long-term projections. Under the SSP5-8.5 scenario, the cooling in the subpolar gyre in the North Atlantic remains the same as that during 1995-2014. The cooling is affected greatly by IVs in the near term but is beyond the IVs in the mid-term and long-term projections. The cooling is due to the significantly weakened AMOC in the mid-term and long-term projections (Fig. 2c), as indicated by previous studies (Drijfhout et al., 2012; Rahmstorf et al., 2015; Bellomo et al., 2021).
Under the SSP5-8.5 scenario, the most significant changes in precipitation are projected to occur in the tropics (a southern branch of the ITCZ) in the near term (2021-40), middle term (2041-60), and long term (2080-99). The response pattern in Fig. 8 is similar to the projection of zonally contrasting shifts in the ITCZ in CMIP6 (Mamalakis et al., 2021). In the middle-to-high latitudes of the Northern Hemisphere, the projected change in precipitation could be associated with the change in external forcing in the ensemble mean, but the precipitation response is weak. In the near term, the ensemble mean shows that the change in precipitation is affected greatly by IVs over most regions within 60°S-60°N except the equator (no dots in Fig. 9a), similar to the change during 1995-2014 relative to 1961-90 (Fig. 4d). In the middle term, the effect of IVs on the change in precipitation reduces greatly south of 60°N, since the areas with S/N > 1 extend to cover the whole equatorial belt and the Indian Ocean (Fig. 9b). In the long term, the effect of IVs on the change in precipitation reduces further south of 60°N. The IVs still have a large impact on the change in precipitation in the areas between 10° and 30°N and between 30° and 50°S, such as in the subtropical Pacific, small regions of the Indian Ocean and Atlantic Ocean, the South China Sea, land areas in southern China, and western Africa.
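The S/N threshold used above can be illustrated with a minimal sketch at a single grid point. The definition of S/N is not restated in this section, so the sketch assumes that the signal is the ensemble-mean change and the noise is the inter-member standard deviation of that change; the function name and the sample values are hypothetical.

```python
import statistics


def signal_to_noise(future_members, baseline_members):
    """S/N at one grid point under the assumed definition:
    |ensemble-mean change| divided by the inter-member standard deviation."""
    changes = [f - b for f, b in zip(future_members, baseline_members)]
    signal = statistics.mean(changes)
    noise = statistics.stdev(changes)  # sample standard deviation across members
    return abs(signal) / noise


# Hypothetical precipitation values (mm/day) for five members at two grid points.
baseline = [2.5, 2.5, 2.5, 2.5, 2.5]
forced_point = [2.9, 3.1, 3.0, 3.2, 2.8]   # consistent increase in every member
noisy_point = [2.3, 2.8, 2.6, 2.4, 2.5]    # changes of mixed sign

sn_forced = signal_to_noise(forced_point, baseline)
sn_noisy = signal_to_noise(noisy_point, baseline)
print(f"forced point S/N = {sn_forced:.2f}, noisy point S/N = {sn_noisy:.2f}")
```

Under this assumed definition, a grid point with S/N > 1 is one where the forced change exceeds the member-to-member spread, whereas S/N < 1 marks a change that remains within the range of the IVs.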

Future change in the Asian summer monsoon
In both the near-term and mid-term projections, the FGOALS-g3 ensemble mean shows drying in most of the South Asian monsoon region but wetting over the southeastern Tibetan Plateau and Bay of Bengal. For the East Asian monsoon, it presents wetting in the north but drying in the south, and for the western North Pacific monsoon, wetting dominates (Figs. 5b and d). However, most of these changes are weak relative to the strong IVs (S/N < 1). Similar to the change in precipitation, significant low-level circulation changes emerge only over South Asia and the western North Pacific (in the mid-term projection; Fig. 5d). Anticyclonic changes over South Asia due to the weakened Walker circulation under warming could offset the effect of increased moisture on precipitation (Chen and Zhou, 2015). In contrast to other places in the monsoon region, the robust strengthening of precipitation over the Tibetan Plateau begins as early as in the near-term projection (Figs. 5b, d, and f). In the long-term projection, the increase in precipitation in the wetting regions becomes more robust and the dry regions shrink.

Future change in climate extremes
Under the SSP5-8.5 scenario, TXx is projected to warm continuously over the globe, except in the North Atlantic subpolar gyre region (Figs. 6d-f), which is consistent with the projected mean temperature changes (Fig. 8). The global pattern of changes in extreme high temperature projected by the FGOALS-g3 ensemble is generally consistent with that in the multiple CMIP6 models, pointing to a faster warming over land than over ocean [see Fig. 11.11 in IPCC AR6 (Seneviratne et al., 2021)]. Averaged over global land areas, TXx is projected to warm by 0.63°C (0.54°C-0.76°C), 1.39°C (1.28°C-1.46°C), and 3.24°C (3.14°C-3.32°C) in the near-term, mid-term, and long-term periods, respectively, above the 1995-2014 level in the FGOALS-g3 ensemble.
As global warming continues in the future, extreme precipitation is projected to increase over most regions of the globe, with decreases confined to some subtropical regions. The global pattern of changes in extreme precipitation projected by the FGOALS-g3 ensemble is generally consistent with that in the multiple CMIP6 models [see Fig. 11.16 in IPCC AR6 (Seneviratne et al., 2021)]. Averaged over global land areas, Rx1day is projected to increase by 1.47% (0.56%-2.34%), 4.31% (3.30%-5.07%), and 12.24% (11.24%-13.12%) in the near-term, mid-term, and long-term periods, respectively, under the SSP5-8.5 scenario above the 1995-2014 level in the FGOALS-g3 ensemble.
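The percentage changes and the member-agreement markers described in the figure caption earlier in this section (dots for at least 70% and hatching for at least 90% of members agreeing on the sign of change) can be sketched as follows. This is a minimal sketch under stated assumptions: agreement is measured against the sign of the ensemble median, the function names are hypothetical, and the Rx1day values are invented for illustration rather than taken from the dataset.

```python
import statistics


def percent_change(future, baseline):
    """Change of a future value relative to its baseline, in percent."""
    return 100.0 * (future - baseline) / baseline


def summarize_changes(changes):
    """Return the ensemble median change and the fraction of members
    whose change has the same sign as that median (assumed convention)."""
    median = statistics.median(changes)
    same_sign = sum(1 for c in changes if c * median > 0)
    return median, same_sign / len(changes)


def agreement_marker(agreement):
    """Map a sign-agreement fraction to the strongest marker it qualifies for."""
    if agreement >= 0.9:
        return "hatching"
    if agreement >= 0.7:
        return "dots"
    return "none"


# Hypothetical Rx1day values (mm) at one grid point for ten members.
baseline_rx1day = [40.0] * 10
future_rx1day = [41.0, 42.0, 40.4, 39.6, 41.2, 40.8, 42.4, 40.2, 41.6, 40.6]

changes = [percent_change(f, b) for f, b in zip(future_rx1day, baseline_rx1day)]
median_change, agreement = summarize_changes(changes)
marker = agreement_marker(agreement)
print(f"median change = {median_change:.2f}%, agreement = {agreement:.0%}, marker = {marker}")
```

In this invented example, nine of the ten members show an increase, so the grid point would qualify for hatching under the assumed convention.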

Summary
A super-large ensemble simulation with 110 members has been carried out by using the fully coupled model FGOALS-g3 developed at the IAP, CAS. The simulation has the largest number of realizations to date among single-model initial-condition large ensembles and is regarded as a major contribution from the Chinese climate modeling community to global climate research. The simulation covers both the historical climate, starting from 1850, and a future projection up to 2099 under SSP5-8.5. The FGOALS-g3 super-large ensemble can be used for studying climate change, including the response to external forcings and the role of IVs.
FGOALS-g3 can reproduce the historical evolution of global average SAT well during 1850-2014. The large-scale spatial features of SAT are simulated well. However, there are systematic biases in some regions. The larger cold biases mainly lie to the north of 60°N (over the Arctic Ocean) and to the south of 60°S, around high terrain such as plateaus or mountainous regions.
The observed change in SAT between 1995-2014 and 1961-1990 is captured by the FGOALS-g3 ensemble but with large IVs in the North Atlantic Ocean subpolar gyre. The polar warming amplification (the warmest change) in the Arctic Ocean can be captured in some members, and large IVs exist. The ensemble mean underestimates the polar warming amplification in the high latitudes of the Northern Hemisphere (over the Arctic Ocean) during 1995-2014. This underestimation may be related to the climatological mean cold bias and excess sea ice there.
The evolution of global average historical land precipitation can be captured by the FGOALS-g3 ensemble mean. The observed distribution of precipitation over global land areas can also be captured by the ensemble mean, including the tropical land precipitation belt in the monsoon regions and the Amazon. The simulated precipitation is clearly underestimated over land in the tropics (30°S-30°N) and overestimated over high terrain such as plateaus or mountainous regions and over the Maritime Continent.
The change in the distribution of land precipitation between 1995-2014 and 1961-90 is highly uneven and subject to very large IVs. The possible increase in precipitation is located in the high latitudes.
In terms of extreme highs in SAT and precipitation during 1995-2014, the FGOALS-g3 ensemble captures the spatial features well, albeit with some underestimations. Over land, the hottest temperature exhibits an overall latitudinal structure, being generally warmer in the tropics and cooler in the northern high latitudes and mountainous regions in both observations and the FGOALS-g3 ensemble. Extreme precipitation is generally stronger in the tropics and monsoon regions than over the rest of the land areas in both observations and the FGOALS-g3 ensemble.
Under the SSP5-8.5 scenario, the patterns of change in SAT remain almost unchanged relative to 1995-2014 in the FGOALS-g3 ensemble mean in the near, middle, and long term. In the middle and long term, the polar warming amplification in the Arctic Ocean becomes more obvious. On these time scales, the obvious cooling in the North Atlantic Ocean subpolar gyre is due to the significantly weakened AMOC. The extreme high SAT is projected to warm continuously, in line with the projected mean temperature changes. The continuous warming could lead to an increase in land precipitation, and the changes in the distribution of precipitation remain almost unchanged relative to 1995-2014. Extreme precipitation is projected to increase over most regions of the globe, with decreases confined to some subtropical areas.
For the Asian monsoon, the summer precipitation and monsoonal circulation can be captured but with broadly underestimated precipitation. However, over the South China Sea and southeastern Tibetan Plateau, the summer precipitation is overestimated. Under the SSP5-8.5 scenario, summer precipitation increases over the central-western Tibetan Plateau in the near term and becomes significant and extends to almost the entire Tibetan Plateau in the long term.

Data records
The variables analyzed in this study, based on the FGOALS-g3 110-member historical and SSP5-8.5 simulations, have been uploaded to a data bank available at http://www.doi.org/10.11922/sciencedb.01332. The model outputs are in the Network Common Data Form (NetCDF), version 4, and on the native grid. These data can be processed and visualized with common programming languages (e.g., Python) and professional software such as NCAR Command Language (NCL) and Ferret. The outputs from the ocean and sea-ice components are on curvilinear grids.

Usage notes
The atmospheric and land model components of FGOALS-g3 share the same equal area-weighted grid, with 180 zonal and 80 meridional grid points. There are 26 vertical levels for the atmospheric model component. The original outputs of the ocean and sea-ice model components of FGOALS-g3 are on a tripolar grid with two poles located on Northern Hemisphere continents. The zonal and meridional grid numbers are 360 and 218, respectively. The first-order conservative interpolation method can be used to interpolate the tripolar ocean grid onto a regular 1° latitude-longitude grid. There are 30 vertical levels for the ocean model component. The horizontal resolution of CICE4 is the same as that of LICOM3, and the resolution of CAS-LSM is the same as that of GAMIL3. The horizontal resolution of FGOALS-g3 used for the super-large ensemble is comparable to that of other CMIP6 models with large ensembles (Table 4), coarser than that of CESM2 (Danabasoglu et al., 2020), and finer than that of CanESM5 (Swart et al., 2019). The numbers of vertical layers of the FGOALS-g3 oceanic and atmospheric components are fewer than those of CESM2 and CanESM5. Further details regarding each vertical level of GAMIL3 and LICOM3 can be found in Li et al. (2020a) and Lin et al. (2020), respectively.
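The idea behind first-order conservative interpolation can be illustrated with a simplified one-dimensional analogue, in which each destination cell receives the overlap-weighted average of the source cells it intersects, so that the integral of the field is preserved. This is only a sketch: the actual remapping from the tripolar grid operates on two-dimensional cell areas on the sphere, and the function name, cell edges, and values below are hypothetical.

```python
def conservative_remap_1d(src_edges, src_values, dst_edges):
    """First-order conservative remapping of cell averages between two 1-D grids.

    src_edges and dst_edges are increasing lists of cell boundaries;
    src_values holds one cell average per source cell.
    """
    dst_values = []
    for j in range(len(dst_edges) - 1):
        lo, hi = dst_edges[j], dst_edges[j + 1]
        weighted_sum = 0.0
        covered = 0.0
        for i in range(len(src_edges) - 1):
            # Length of the intersection between source cell i and destination cell j.
            overlap = min(hi, src_edges[i + 1]) - max(lo, src_edges[i])
            if overlap > 0:
                weighted_sum += src_values[i] * overlap
                covered += overlap
        # Destination cells with no source coverage are marked as missing.
        dst_values.append(weighted_sum / covered if covered > 0 else float("nan"))
    return dst_values


# Hypothetical example: three source cells of width 2 mapped onto two cells of width 3.
src_edges = [0.0, 2.0, 4.0, 6.0]
src_values = [1.0, 3.0, 5.0]
dst_edges = [0.0, 3.0, 6.0]
remapped = conservative_remap_1d(src_edges, src_values, dst_edges)

# The integral (cell average times cell width) is the same on both grids.
src_integral = sum(v * (b - a) for v, a, b in zip(src_values, src_edges, src_edges[1:]))
dst_integral = sum(v * (b - a) for v, a, b in zip(remapped, dst_edges, dst_edges[1:]))
print(remapped, src_integral, dst_integral)
```

Conservation of the integral is the property that makes this class of methods suitable for quantities such as heat content or fluxes, at the cost of some smoothing of sharp gradients.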
The outputs of the atmospheric and oceanic model components of FGOALS-g3 are listed in Tables 2 and 3, respectively. The outputs of the sea-ice and land model components are omitted here. Only the analyzed variables are listed in Table 5. The total storage is listed in Table 6. All outputs of experiments are in the form of a native grid. The data can be accessed from the website and some of them can be made
Acknowledgements. This study is supported by the National Key Program for Developing Basic Sciences (Grant No. 2020YFA0608902) and the National Natural Science Foundation of China (Grant Nos. 41976026 and 41931183). The authors also acknowledge the technical support from the National Key Scientific and Technological Infrastructure project "Earth System Science Numerical Simulator Facility" (EarthLab). Some simulations presented in this study were performed on the CAS Xiandao-1 supercomputer. The authors also acknowledge the help with model setup from Dr. Lijuan LI and the help with data processing from Mr. Kangjun CHEN.
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.