1 Introduction

Atmospheric general circulation models (GCMs) forced with a prescribed SST and perpetual equinoctial conditions, and lacking sea ice and land masses, produce an idealized representation of Earth’s atmosphere. These aquaplanet configurations reproduce the general characteristics of the zonally averaged circulation at the expense of some of the details (Hunt 1973; Goswami et al. 1984; Hayashi and Sumi 1986). The zonally symmetric, idealized conditions help isolate interactions between parameterized physical processes and the resolved large-scale circulations, providing an attractive framework for developing models and understanding (Hoskins et al. 1999). Previous work also suggests that aquaplanets respond to perturbations similarly to configurations with realistic geography (Medeiros et al. 2008), thus making them attractive for addressing climate change issues as well. A common criticism of the aquaplanet, however, is that it cannot be evaluated using observations.

In this work, we explore the statistical properties of low clouds for GCM simulations in both aquaplanet mode and standard, Earth-like configurations. The basic finding is that the simulated low clouds, if appropriately sampled, do not depend on the details of the circulation that distinguish aquaplanets from Earth-like configurations. This means that differences in the representation of low clouds among GCMs, as a function of an appropriately sampled large-scale state, are also apparent in aquaplanet simulations, provided the particular aquaplanet configuration contains data within that large-scale state. This finding shows that aquaplanets can be compared to observations, and that improvements in the representation of aquaplanet cloud regimes should improve the same regimes in more realistically forced simulations.

Four GCMs are compared here, with each producing a standard, Earth-like simulation and an aquaplanet (Table 1). The Earth-like simulations all use climatological SST and sea-ice fields and retain the seasonal cycle. The aquaplanet configuration follows the AquaPlanet Experiment conventions (Neale and Hoskins 2001), using the “QOBS” SST distribution, given as a function of latitude, ϕ, by

$$ T(\phi) = \left\{\begin{array}{ll} \frac{T_{\rm max}}{2} \left( 2 - \sin^2\left(\frac{3\phi}{2}\right) - \sin ^4 \left( \frac{3\phi} {2} \right) \right) & |\phi| \leq \frac{\pi}{3}, \\ 0 & \text{otherwise} \end{array} \right. $$

where \(T_{\rm max}\) = 27°C. Two of the GCMs are the NCAR CAM and GFDL AM, which are also used by Medeiros et al. (2008) to compare the climate response between aquaplanets and Earth-like configurations. Also included are a super-parameterized version of the NCAR CAM (SP-CAM) and the MPI ECHAM (v5.4, Roeckner et al. 2006). The ECHAM uses a T63 truncation, resulting in about 1.8° grid spacing, and 53 vertical levels, with the extra levels mostly distributed in the upper atmosphere (i.e., within the troposphere the discretization is similar to the GFDL AM). The SP-CAM uses the semi-Lagrangian dynamical core of the NCAR CAM with similar resolution to the T42 Eulerian version used here, but the parameterized physics is replaced with a two-dimensional cloud resolving model (CRM). The CRM contains 32 columns, spaced 4 km apart, with north–south orientation, and 30 levels (cf. Khairoutdinov et al. 2005; Wyant et al. 2006b).
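As an illustration of the QOBS profile given above, the following minimal sketch evaluates the SST as a function of latitude. The function and constant names are hypothetical and chosen only for this example; the formula and the value of 27°C are taken from the text.

```python
import math

T_MAX_C = 27.0  # maximum SST in deg C, as stated in the text


def qobs_sst(lat_rad, t_max=T_MAX_C):
    """Return the QOBS SST (deg C) at latitude lat_rad (radians).

    Hypothetical helper name; implements the piecewise formula above.
    """
    if abs(lat_rad) > math.pi / 3:
        return 0.0  # poleward of 60 degrees the SST is fixed at 0 deg C
    s2 = math.sin(1.5 * lat_rad) ** 2  # sin^2(3*phi/2)
    return 0.5 * t_max * (2.0 - s2 - s2 ** 2)


if __name__ == "__main__":
    for lat_deg in (0, 15, 30, 45, 60):
        print(lat_deg, round(qobs_sst(math.radians(lat_deg)), 2))
```

The profile is symmetric about the equator, peaks at the equator, and decreases smoothly to 0°C at 60° latitude.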

Table 1 List of models used here, with version and reference, along with the time interval used for the analysis and definition of the diagnostic low-cloud layer and overlap assumption used within that layer

Comparing aquaplanets with Earth-like configurations or observations requires an abstracted view of the data. Recent analyses of GCM clouds have tended toward two methods. The first is to sample regimes as a function of the environmental characteristics of the flow, such as subsidence rate or lower-tropospheric stability (e.g., Tselioudis and Jakob 2002; Bony et al. 2004). The second is to define regimes as a function of the structure of the clouds themselves (e.g., Jakob and Tselioudis 2003; Williams and Tselioudis 2007). We adopt the former approach because it is straightforward to apply and plays to the strength of the GCMs by conditioning on the aspects of the model solution that are largely resolved and less subject to biases arising from sub-gridscale processes.

The remainder of this paper is organized as follows. In Sect. 2, the ability of the GCMs to adequately reproduce the distribution of important features of the large-scale environment is explored. Section 3 shows that sampling based on these features of the large-scale environment can usefully discriminate among important cloud characteristics. Section 4 further explores the structure of the cloud regimes that emerge from the conditionally sampled data and the sensitivity of these conditional averages to both the large-scale flow (i.e., aquaplanet vs. Earth-like configurations) and the underlying model. We show that the thermodynamic structure in these atmospheric states is insensitive to the distribution from which they are sampled; Earth-like configurations and aquaplanet configurations produce the same structures for the same large-scale conditions. Different climate models, however, produce conspicuously different clouds even given similar environmental conditions. These results suggest that the aquaplanet framework is a useful tool for improving the parameterization of physical processes that are well-sampled by the circulations produced by such a model.

2 Distributions of large-scale conditions

This investigation is based on the hypothesis that two quantities can usefully encapsulate the large-scale environment: vertical velocity (ω) in the free troposphere and lower-tropospheric stability (LTS ≡ \(\theta_{700\,{\rm hPa}} - \theta_{\rm sfc}\)). These have emerged in recent years as favored constraints to distinguish cloud regimes (Klein and Hartmann 1993; Bony et al. 2004; Bony and Dufresne 2005; Wyant et al. 2006a, 2009; Medeiros et al. 2008; Su et al. 2008). The analysis is based on monthly mean data, mostly for practical reasons: monthly means reduce data volumes and should be better captured by the reanalysis. The latter may be particularly true for ω, for which the fidelity of daily reanalysis values is sometimes questioned (e.g., Trenberth et al. 2000), though there is also evidence of systematic behavior in some regimes (Stevens et al. 2007; Lenschow et al. 2007). Because cloud processes are fast, and because the analysis using monthly data is promising, there may be utility in extending this analysis to pentad or daily data.
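The LTS definition above can be sketched as follows. This is only an illustration: the function names are hypothetical, the surface potential temperature is assumed here to be computed from surface air temperature and surface pressure, and the reference pressure (1000 hPa) and the dry-air value of κ are standard choices rather than values given in the text.

```python
import numpy as np

KAPPA = 0.2857   # R_d / c_p for dry air (assumed standard value)
P0_HPA = 1000.0  # reference pressure for potential temperature (assumed)


def potential_temperature(temp_k, pres_hpa):
    """Potential temperature (K) from temperature (K) and pressure (hPa)."""
    return np.asarray(temp_k, dtype=float) * (P0_HPA / np.asarray(pres_hpa, dtype=float)) ** KAPPA


def lower_tropospheric_stability(t700_k, t_sfc_k, p_sfc_hpa):
    """LTS = theta(700 hPa) - theta(surface), in K."""
    theta_700 = potential_temperature(t700_k, 700.0)
    theta_sfc = potential_temperature(t_sfc_k, p_sfc_hpa)
    return theta_700 - theta_sfc


if __name__ == "__main__":
    # Illustrative (not observed) values for a cool, stable marine column.
    print(lower_tropospheric_stability(283.0, 295.0, 1015.0))
```

Because the inputs are converted to arrays, the same functions apply unchanged to gridded monthly mean fields.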

Histograms of these quantities for tropical oceans between 35°S and 35°N from the ERA-40 reanalysis (Uppala et al. 2005) are shown in Fig. 1. The reanalysis data used here has been interpolated to a 2.5° grid (similar to typical GCM grid spacing); monthly mean values from 1984 through 2001 are used. The distribution of ω is skewed toward negative values, with most of the upward motion being carried by rare but intense motions, while the LTS is skewed toward positive values, with locally confined regions contributing disproportionately to the tropical mean. The LTS conditioned on subsiding motion, where it is expected to be a meaningful measure of inversion strength, is also shown. Using only locations under subsidence, the LTS distribution is more symmetric, showing that subsidence is not confined to regions where surface temperature is cold and stability high. Large values of LTS are conducive to marine stratocumulus (e.g., Klein and Hartmann 1993). Low values can describe subsidence over warmer water, though in the ERA-40 data, most of the values of LTS ≤ 10 K are associated with winter season storm tracks near the edge of the tropical domain in the western Pacific (east of Japan and southeastern Australia). The peaks of both distributions are at moderate values of vertical motion and LTS, a reminder that a great deal of the tropical ocean is characterized by slight descending motion above the trade inversion: typical trade-wind conditions.

Fig. 1

Distribution of (left) vertical velocity at 500 hPa and (right) lower-tropospheric stability from ERA-40 reanalysis using tropical marine locations from monthly means for 1984–2001 (connected black circles); connected gray circles in the right panel use only points with ω 500 ≥ 10 hPa day−1. Shading shows the envelope of the four Earth-like GCM simulations, and vertical bars show the envelope from the aquaplanet simulations; only the subsidence points are used in the LTS panel. Light curves show the cumulative distribution, dashed for entire tropics and solid for subsidence only

The shading in Fig. 1 shows the envelope of all four GCMs in the Earth-like configuration. Though differing in some details, the GCMs capture the size and shape of both ω and LTS distributions. The largest discrepancies appear as more frequent moderate upward motion (ω ≈ −20 hPa day−1) and a second peak at low values of LTS (≈12 K); these are due to biases in the NCAR CAM and MPI ECHAM, respectively. Vertical lines span the range of values from different aquaplanet simulations, which all exhibit narrower distributions than the Earth-like configurations. Reduced variability in the aquaplanets is expected because they lack zonal asymmetries, for example associated with regions of very large LTS over eastern sides of ocean basins. In spite of the decreased variability, the aquaplanet distributions share the same mode as the Earth-like setting and the reanalysis, in effect isolating the most likely large-scale conditions in the tropics.

The joint distribution of ω 500 and LTS is shown in Fig. 2. Upward motion separates the deep tropics, where the moist adiabatic temperature structure links the surface with the rest of the troposphere, from suppressed conditions, where the distribution turns toward higher values of LTS. Larger values of LTS are indicative of decoupling of the surface and 700 hPa level, and the skewness toward larger LTS is dictated by this decoupling as the lower values necessary to maintain a more symmetric distribution would be conditionally unstable.

Fig. 2

Left Joint distributions of vertical motion and LTS from the same ERA-40 data as in Fig. 1. Each color interval is 10% of the total data, with the first decile masked for clarity. Middle Low-cloud fraction based on ISCCP, conditioned on ERA-40 vertical motion and LTS. Right Precipitation based on GPCP, also conditioned on ERA-40 large-scale conditions

With this picture of the circulation come expectations for varying prevalence of cloud regimes. The peak of the distribution occurs at moderate values of ω 500 and LTS, which should be indicative of trade-wind conditions and shallow cumulus convection. At higher values of LTS in strongly suppressed conditions, marine stratocumulus are expected. At locations with upward motion, deeper convection is common. The exception apparent in Fig. 2 is a lobe with low LTS and weak upward to moderate subsiding motion. Further analysis (not shown) indicates that these points are associated with the winter season storm tracks and dry continental air; they account for less than 1.5% of the monthly mean values over tropical ocean points, and appear to have very little cloud cover.

The GCM versions of Fig. 2 are shown in Figs. 3 and 4. All four Earth-like configurations in Fig. 3 share the features described above for the reanalysis, except the frequency of continental air impacting the marine locations. Each GCM distribution is narrower than the reanalysis in both dimensions, as could be expected from Fig. 1. Thus the GCMs fail to fully replicate the variability of Earth’s tropical atmosphere, including the extremes on both sides of the stability spectrum. Some of the differences may be artifacts of experimental design, in that (1) climatological SST is prescribed, eliminating interannual variability and (2) fewer simulated years are used (Table 1) than are examined for the reanalysis (18 years), but we suspect not.

Fig. 3

As in Fig. 2, but for the Earth-like configurations of the (top to bottom) NCAR CAM, SP-CAM, GFDL AM, and MPI ECHAM

Fig. 4

As in Figs. 2 and 3, but for the aquaplanet configurations of the GCMs

The aquaplanets’ distributions are narrower yet (e.g., Fig. 4), especially in LTS where the tail toward large values is absent. The peak of the distribution is similar to the Earth-like configurations, as is the range in the vertical velocity. The NCAR CAM and GFDL AM aquaplanets show a slight bimodality in their distributions: the secondary maximum in both cases occurs at LTS values typical of the deep tropics. In fact, these maxima are associated with the region along the equator, between well-separated intertropical convergence zones (ITCZs); configurations with a single ITCZ do not exhibit such bimodality. That aquaplanets fail to represent regions of unusually large values of LTS likely reflects the absence of zonal asymmetries in SST (and the land-ocean distribution); subtropical regions with strong low-level inversions over cool SST do not exist. On Earth and in the Earth-like settings, these conditions are expected to produce marine stratocumulus; the aquaplanets are unable to maintain such conditions at low latitudes. It is likely that with higher-frequency sampling, stratocumulus associated with post-frontal regions of extratropical cyclones do appear in these aquaplanets, which might allow extension of this framework to this important cloud type.

3 Conditionally averaged cloud and precipitation

The GCMs capture the dynamic and thermodynamic features of the tropical atmosphere, as measured by the distributions of ω 500 and LTS. In this section we examine the organization of cloud fraction and precipitation conditioned on these variables, using observations as well as Earth-like and aquaplanet GCM simulations.

Along with the distribution of ω 500 and LTS, Fig. 2 shows low-cloud amount from the ISCCP D2 data set (Rossow and Schiffer 1999) and precipitation from GPCP (Adler et al. 2003) (middle and right panels) sorted by vertical velocity and LTS from ERA-40. Monthly means on the same 2.5° grid as the reanalysis are used for the period of temporal overlap between ERA-40 and the ISCCP D2 data sets (i.e., 1984–2001). Within each ω 500-LTS bin, values are obtained by simple averaging accounting for the different area of grid cells. The statistical weight of each bin is determined by the distribution shown in the left panel, shown as dark contour lines in the other panels. The bins with the least data, accounting for 10% of the total data, are masked for clarity.
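The compositing procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: the function name and arguments are hypothetical, the area weights are supplied by the caller (for a regular latitude-longitude grid, the cosine of latitude would be one common choice), and the masking rule is interpreted as removing the least-populated bins whose cumulative weight does not exceed 10% of the total.

```python
import numpy as np


def conditional_mean(omega, lts, field, area_weight, omega_edges, lts_edges,
                     mask_fraction=0.10):
    """Area-weighted mean of `field` in each (omega, LTS) bin.

    Returns (mean, pdf), both with shape (n_omega_bins, n_lts_bins).
    Bins holding the smallest share of the data, up to `mask_fraction`
    of the total weight, are set to NaN in `mean`.
    """
    x, y = np.ravel(omega), np.ravel(lts)
    w = np.ravel(area_weight).astype(float)
    f = np.ravel(field).astype(float)
    bins = [np.asarray(omega_edges), np.asarray(lts_edges)]

    # Sum of weights and of weighted field values in each bin.
    wsum, _, _ = np.histogram2d(x, y, bins=bins, weights=w)
    fsum, _, _ = np.histogram2d(x, y, bins=bins, weights=w * f)

    pdf = wsum / wsum.sum()            # statistical weight of each bin
    mean = np.full(wsum.shape, np.nan)
    np.divide(fsum, wsum, out=mean, where=wsum > 0)

    # Mask the sparsest bins, accumulating from least to most populated.
    order = np.argsort(pdf, axis=None)
    cumulative = np.cumsum(pdf.ravel()[order])
    mean.flat[order[cumulative <= mask_fraction]] = np.nan
    return mean, pdf
```

The returned `pdf` corresponds to the joint distribution in the left panel, and `mean` to the conditioned cloud amount or precipitation in the other panels.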

Both the mean low-cloud amount and the precipitation vary coherently across ω 500-LTS space. The ISCCP low-cloud amount, which includes all clouds with tops below 680 hPa, increases with LTS, but is nearly independent of ω 500. When upper level clouds are present, ISCCP cannot detect low-level clouds, so the low-cloud amount is most useful for clouds beneath relatively clear skies. Large values of low-cloud amount (≥50%) are rare, accounting for only 1.8% of the total distribution, nearly all of which occur at LTS > 18 K. These are homogeneous stratocumulus found over eastern boundary current regions (cf. Klein and Hartmann 1993). The peak of the pdf is populated by low-cloud fraction of about 30%, and about 40% of the total data has mean low-cloud amount of 25–35%. Small values of low-cloud amount are also uncommon (mean low-cloud amount ≤16% accounts for 10% of the distribution), and confined to regimes with large-scale rising motion (where high clouds may obscure low-level cloudiness) or points strongly influenced by dry, continental air. The GPCP precipitation shows nearly orthogonal behavior: the most rain falls with the strongest upward motion, subsidence suppresses convection and precipitation is lowest in these regimes, and there is little evidence for an LTS dependence of precipitation. The precipitation distribution suggests that using ω 500 < 0 hPa day−1 is nearly equivalent to the common rule-of-thumb of using 4 mm day−1 to define regions of tropical convection. These complementary patterns suggest that the vertical velocity easily separates regimes dominated by boundary layer clouds (subsidence) from those associated with higher and/or deeper clouds (upward motion), and LTS organizes cloud types within the subsidence regimes from moderate cloud cover (shallow cumulus) to overcast conditions (stratocumulus).

Using the same compositing strategy for the GCMs, the low-cloud amount and precipitation are shown in Fig. 3 for the Earth-like configurations and Fig. 4 for the aquaplanets. The resultant distributions for the Earth-like configurations show the main features evident in Fig. 2. Low-cloud amount tends to be relatively moderate in the peak of the distribution and increases with increasing LTS, while rain rates correlate most strongly with ω. There are differences in the details of the GCM composites. Except for the MPI ECHAM (which has very little low-cloud cover for LTS < 14 K), the GCMs show larger low-cloud amount in regimes of large-scale rising motion compared to the ISCCP values and exhibit larger maximum cloudiness than ISCCP (Table 2). Low-cloud amount greater than 50% is more common in the NCAR CAM than ISCCP, while the GFDL AM has about the same probability of such conditions and the SP-CAM has only half the chance. Part of this disagreement is likely introduced using the diagnosed low-cloud amount from the models rather than a cloud amount derived using an ISCCP simulator (Klein and Jakob 1999; Webb et al. 2001). The SP-CAM was run with the ISCCP simulator activated; repeating the analysis using the low-cloud amount derived from the ISCCP simulator produces a similar distribution of low-cloud in subsidence regimes, but shows decreased low-cloud amount in convective regimes (with a commensurate increase in the likelihood of small values of low-cloud amount) and a smaller average low-cloud amount. These differences would likely arise in the other models as well, meaning that the ISCCP simulator should be used when high clouds are common; the focus here is on low clouds in subsidence regimes where use of the ISCCP simulator would only slightly alter the low-cloud amount.

Table 2 Aspects of the low-cloud amount in the ISCCP and GCM distributions, where \(\overline{C}\) is the mean low-cloud amount (per ω 500-LTS bin), \(\langle\overline{C}\rangle\) is the mean low-cloud amount across all bins, and \({\mathcal{P}}\) is probability from the joint pdf

For the aquaplanets, the enhanced cloud cover with increasing LTS is virtually absent, but the dependence of precipitation on vertical motion is relatively well represented (Fig. 4). The narrowness of the distribution can also be seen in Table 2: the aquaplanets have almost no bins with mean low-cloud amount less than 16% or greater than 50%. The average low-cloud cover remains, however, similar to the Earth-like configurations.

4 Conditionally averaged profiles

The vertical velocity and lower-tropospheric stability appear to be useful for separating low-cloud regimes. In this section, these parameters are used to sample the simulations to determine if such a classification separates distinct environments. Specifically, we first broadly classify tropical low clouds as shallow cumulus or stratocumulus, and second we focus on conditions observed during a field campaign. By comparing the Earth-like and aquaplanet simulations in this way, a framework for evaluating the fidelity of aquaplanet cloud representation is developed.

4.1 Trade-wind versus stratocumulus

Separating regimes in which low-level clouds dominate the cloud field can be largely accomplished at low latitudes simply by selecting regions of subsidence. Within these low-cloud dominated regions, however, the magnitude of the subsidence does little to discriminate among cloud regimes (Fig. 3). This has been a criticism of studies using subsidence as the sole organizing quantity. Figures 2 and 3 suggest cloud amount is a function of LTS in subsidence regimes, so using both ω and LTS might better separate cloud types than using one or the other. Figure 5 shows such a classification, in which low-cloud regimes have been separated using ω 500 ≥ 10 hPa day−1 and ω 700 ≥ 10 hPa day−1. This constraint is designed to be rather conservative, ensuring reasonably strong subsidence throughout the lower troposphere. A threshold value of LTS ≥ 18.55 K, the value at which the Klein and Hartmann (1993) empirical relation gives 50% marine stratus cloud fraction, is used to divide the low clouds into two regimes.
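The classification just described amounts to a pair of threshold tests, sketched below. The thresholds are those quoted in the text; the function name and the regime labels are hypothetical and used only for illustration.

```python
import numpy as np

OMEGA_MIN = 10.0        # hPa/day, subsidence threshold at 500 and 700 hPa
LTS_STRATOCU_K = 18.55  # K, threshold separating the two low-cloud regimes


def classify_low_cloud_regime(omega500, omega700, lts):
    """Label each point 'stratocumulus', 'trade_wind', or 'none'.

    Points must subside at both levels to count as a low-cloud regime;
    LTS then separates the two regimes.
    """
    omega500 = np.asarray(omega500, dtype=float)
    omega700 = np.asarray(omega700, dtype=float)
    lts = np.asarray(lts, dtype=float)
    subsiding = (omega500 >= OMEGA_MIN) & (omega700 >= OMEGA_MIN)
    low_cloud = np.where(lts >= LTS_STRATOCU_K, "stratocumulus", "trade_wind")
    return np.where(subsiding, low_cloud, "none")


if __name__ == "__main__":
    print(classify_low_cloud_regime([20, 20, 5], [15, 15, 15], [14, 20, 20]))
```

Applied to monthly mean fields, the fraction of points carrying each label gives the frequency of occurrence of the corresponding regime.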

Fig. 5

Profiles of (left to right) relative humidity, potential temperature, cloud amount, and cloud liquid water mixing ratio for the four GCMs (noted upper right in each row) for trade-wind cumulus conditions (filled blue circles) and stratocumulus conditions (empty pink circles). Circles denote the median value, horizontal bars show the interquartile range at each level. The gray curve and lighter gray shading show the associated aquaplanet results, which only have trade-wind cumulus conditions. The solid black curve shows the trade-wind cumulus conditions from the ERA-40 data set and the dashed black curve shows the ERA-40 stratocumulus conditions; no observational cloud or liquid water profiles are shown

The addition of ω at 700 hPa accounts for variations in the shape of the subsidence profile across the reanalysis and GCMs. Using just one level produces similar results to those reported here. Using both levels excludes months that marginally meet one or the other criterion, probably at the expense of underestimating the (still substantial) area covered by shallow cumulus convection.

The black curves in Fig. 5 show the profiles from the two regimes derived from the ERA-40 (solid vs. dashed). The differences in relative humidity and potential temperature suggest that these samples exhibit appreciably different thermodynamic structure, with the high-LTS regime (dashed) having a shallower boundary layer under a stronger inversion (necessarily, given the sampling criteria) and occurring over cooler surface temperatures. Figure 2 suggests a larger low-cloud amount and essentially no rain in this classification. These characteristics are in line with expectations for marine stratocumulus; we therefore refer to this hereafter as the stratocumulus regime. The other regime shows a deeper boundary layer with a weaker hydrolapse that occurs over warmer surfaces, which is expected to have little precipitation and moderate low-cloud amount. We refer to this as the trade-wind cumulus (or shallow cumulus) regime. The trade-wind regime is much more common, accounting for about 30% of the tropical ocean regions, while the stratocumulus is identified in around 5%. We note that this classification may lump transitional cloud types like cumulus topped by stratiform cloud into the trade-wind cumulus regime.

Maps of these points (Fig. 6) confirm that stratocumulus conditions are found mostly over the eastern boundary current regions, while trade-wind conditions are found further west and into the deeper tropics. The ISCCP mean low-cloud amount for the stratocumulus regime is 48% and for the trade-wind regime it is 32%. This further confirms that this simple classification scheme qualitatively separates these two low-cloud regimes.

Fig. 6

Frequency of occurrence (%) of trade-wind (blue) and stratocumulus (red) classifications from ERA-40 and the GCMs. Stippled regions contain months of both regimes; the color shown is that of the more frequent one. White regions are never classified as a low-cloud regime

The same classification system is used for each GCM in Fig. 5, but the GCMs all struggle to maintain strong LTS, even in stratocumulus regions (Fig. 6). As a consequence, the stratocumulus regime is under-represented in all the Earth-like configurations, though when present it appears in the expected places. The stratocumulus points in the GCMs tend to have a shallow PBL, and generally small cloud fraction spread across several model levels. Except for the MPI ECHAM, which has very small cloud amounts in the trade-wind cumulus regions, the cloud structure in the two regimes is not strongly differentiated by the models, in contrast to expectations based on previous studies and observations. The diagnosed low-cloud amount is more realistic, around 40% in GFDL AM, MPI ECHAM, and SP-CAM and around 60% in NCAR CAM (see Table 1 for details of the low-cloud definitions).

Figure 5 affirms the similarity of the aquaplanets and Earth-like simulations when sampled based on large-scale conditions. The trade-wind regime in the GCMs accounts for around 40% of the tropical ocean regions. The humidity and temperature profiles resemble those of the ERA-40, though it is worth noting that conditioning on LTS constrains the temperature structure of the lower troposphere, necessitating a certain level of agreement in the composite temperature profile. Each model has its own idiosyncrasies, but generally these are shared by aquaplanet and Earth-like configurations of a given model. The aquaplanets show a slight cool bias because the SST distribution is slightly different between the aquaplanet and Earth configurations. The similarity of the aquaplanet and Earth-like trade-wind regime shows that the representation of these conditions is insensitive to the ω-LTS distribution from which they are drawn.

The vertical profile of cloud amount in the GCMs’ trade-wind regime varies tremendously. While the environmental conditions are similar, the cloud fields that result are disparate. The NCAR CAM produces large cloud fraction in the lower model levels, with increasing cloud amount with height in the three layers above the surface. The GFDL AM shows much smaller cloud fraction in each layer, a clear sub-cloud layer and cloud base around 900 hPa, and maximum cloud amount just above cloud base with decreasing cloud amount with height above. Both the Earth-like and aquaplanet configurations show the cloud layer extending above 600 hPa, which is deeper than typical trade-wind cumulus layers. The SP-CAM falls between the NCAR and GFDL GCMs: it shows evidence of a distinct sub-cloud layer, but cloud base is relatively poorly defined. The MPI ECHAM shows less vertical structure and very little cloud across model levels in the trade-wind regime, but a well-defined cloud base at a reasonable pressure.

Cloud fraction is sometimes criticized as a poor metric for cloud representation, largely because the definition of a cloud is, in some sense, arbitrary. A more physical quantity is the liquid water mixing ratio, r L . The liquid water profiles in Fig. 5 mostly echo the cloud fraction in the GCMs used here. One difference is that the maximum liquid water mixing ratio is found in levels slightly above the maximum cloud amount in the trade-wind regimes of the GFDL AM and SP-CAM. A second difference is that the NCAR CAM exhibits values of r L that are comparable to, and even less than, those in the other models, in spite of its very large cloud fraction. The MPI ECHAM has very little liquid water, commensurate with its small cloud amount.

The integrated liquid water content, the liquid water path (LWP), gives a measure of the total mass of liquid water in the column, and has been observed from space using passive microwave sensors for many years. A comparison of the microwave-derived LWP (O’Dell et al. 2008) conditioned using the ERA-40 environment with that from the GCMs is shown in Table 3; in each case the LWP represents the average over clear and cloudy conditions (not the in-cloud liquid water). There is a known ≈10 g m−2 bias in the microwave retrieval (C. O’Dell, personal communication), which, if subtracted, brings the observations, the NCAR CAM, and SP-CAM into rough agreement for stratocumulus, while the GFDL AM and MPI ECHAM substantially underestimate LWP for these conditions. In the trade-wind regimes, the NCAR CAM shows larger LWP than the observations, the SP-CAM and GFDL AM have values more in line with the observations, and the MPI ECHAM again substantially underestimates LWP. As with the other quantities, the aquaplanets exhibit LWP values similar to the Earth-like configurations. The apparent disagreement among models (and among different observing systems) has also been noted recently by Li et al. (2008).
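For reference, the LWP can be approximated from a model profile of liquid water mixing ratio by hydrostatic integration over pressure, LWP ≈ (1/g) Σ r_L Δp. The sketch below is an illustration only: the function name, the use of layer-interface pressures, and the treatment of the mixing ratio as a layer-mean, grid-mean quantity are assumptions, and individual models may diagnose LWP differently.

```python
import numpy as np

G = 9.80665  # standard gravity, m s^-2


def liquid_water_path(r_l, p_interfaces_hpa):
    """Column liquid water path in g m^-2.

    r_l              : layer-mean liquid water mixing ratio (kg/kg), n layers
    p_interfaces_hpa : pressures at the n + 1 layer interfaces (hPa)
    """
    r_l = np.asarray(r_l, dtype=float)
    # Layer thickness in Pa; abs() makes the ordering of levels irrelevant.
    dp_pa = np.abs(np.diff(np.asarray(p_interfaces_hpa, dtype=float))) * 100.0
    lwp_kg_m2 = np.sum(r_l * dp_pa) / G
    return 1000.0 * lwp_kg_m2


if __name__ == "__main__":
    # One 100-hPa-thick layer holding 0.05 g/kg of liquid water (illustrative).
    print(liquid_water_path([5e-5], [950.0, 850.0]))
```

Using the grid-mean mixing ratio yields the average over clear and cloudy conditions, which is the quantity compared in Table 3.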

Table 3 Comparison of LWP for the stratocumulus, trade-wind, and RICO-like classifications

4.2 Zoom in on RICO conditions

Because differences in the large-scale models are not sensitive to the details of the circulation, in that one cannot distinguish between aquaplanets and Earth-like configurations based on conditionally sampled cloud distributions, we hypothesize that data from a specific location can provide guidance as to the fidelity of the model representation of low clouds. To explore this hypothesis, the above sampling criteria are adjusted to composite based on conditions matching those observed during a field campaign. This allows comparison of a large sample of points from a model, or other global data set, with the detailed observations obtained from field studies. Here we continue with the emphasis on trade-wind conditions by focusing on conditions similar to those during the RICO field campaign (Rauber et al. 2007), which is thought to represent typical trade-wind cloud regimes.

To extract conditions similar to those observed during RICO, the thresholds for the trade-wind classification are slightly adjusted. For simplicity, the same dynamic constraint as above is used (ω 500 ≥ 10 hPa day−1 and ω 700 ≥ 10 hPa day−1), but the thermodynamic constraint is adjusted to capture the mean (13.2 K) and standard deviation (1.88 K) of radiosondes launched during RICO (11.3 K ≤ LTS ≤ 15.1 K). Profiles are plotted in Fig. 7 for comparison to Fig. 5; the RICO-like conditions are similar to more general trade-wind conditions. The red profiles show the average relative humidity and potential temperature from radiosondes launched during the RICO field study, while the black curves show the ERA-40 conditionally sampled for RICO-like conditions. The temperature structure is very similar between the radiosondes and the reanalysis, but the relative humidity is substantially different, with the radiosondes closer to saturation from the surface to about 500 hPa and the reanalysis closer to saturation above 500 hPa. A similar humidity profile is obtained by averaging the ERA-40 data in the vicinity of the RICO study. This dry bias could arise from deficient mixing in the boundary layer scheme of the reanalysis system, as has been discussed for stratocumulus conditions by Stevens et al. (2007), but could also reflect local conditions during RICO.
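The RICO-like sampling window follows directly from the radiosonde statistics quoted above: the mean plus or minus one standard deviation, rounded to one decimal place, gives 11.3 K ≤ LTS ≤ 15.1 K. A minimal sketch, with hypothetical function names, is:

```python
import numpy as np

RICO_LTS_MEAN_K = 13.2  # mean LTS from RICO radiosondes (from the text)
RICO_LTS_STD_K = 1.88   # standard deviation of LTS (from the text)
OMEGA_MIN = 10.0        # hPa/day, same dynamic constraint as before


def rico_lts_bounds():
    """LTS window (K): mean +/- one standard deviation, to one decimal."""
    lower = round(RICO_LTS_MEAN_K - RICO_LTS_STD_K, 1)
    upper = round(RICO_LTS_MEAN_K + RICO_LTS_STD_K, 1)
    return lower, upper


def rico_like_mask(omega500, omega700, lts):
    """Boolean mask selecting points with RICO-like large-scale conditions."""
    lower, upper = rico_lts_bounds()
    omega500 = np.asarray(omega500, dtype=float)
    omega700 = np.asarray(omega700, dtype=float)
    lts = np.asarray(lts, dtype=float)
    subsiding = (omega500 >= OMEGA_MIN) & (omega700 >= OMEGA_MIN)
    return subsiding & (lts >= lower) & (lts <= upper)


if __name__ == "__main__":
    print(rico_lts_bounds())
```

The mask can be applied to any gridded monthly mean data set to extract the profiles composited for this regime.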

Fig. 7

As in Fig. 5, but using only the criteria for conditions similar to those during the RICO field study. Blue circles and bars show the Earth-like values, gray lines and shading show aquaplanet values, the black line shows reanalysis values, and the red lines show radiosonde data from RICO

As in Fig. 5, there are differences among the relative humidity structures, and these differences are exaggerated in Fig. 7 by comparing with the radiosondes. The GCMs more closely track the ERA-40 humidity structure than they do the radiosondes. The sub-cloud layer and free tropospheric relative humidity are similar among the models and reanalysis, though Figs. 5 and 7 emphasize the cloud layer, where differences are more noticeable. These discrepancies emerge in the cloud layer because the models represent the clouds quite differently, as described above; away from the clouds, the large-scale environments of the GCMs are similar to that found in the reanalysis.

The profiles of liquid condensate in RICO-like conditions mirror the results from the more general trade-wind classification. Comparing the area-averaged LWP for RICO-like points with the microwave-derived climatology (Table 3) also shows the similarity of the trade-wind regime and the RICO-like conditions, as well as the similarity of the aquaplanets to their Earth-like counterparts. The SP-CAM and GFDL AM produce the most realistic depictions of trade-wind conditions, while the NCAR CAM and MPI ECHAM have too much and too little liquid, respectively. The variability of the LWP in the SP-CAM and GFDL AM appears slightly smaller than the satellite estimates. In contrast with the discussion by Li et al. (2008), there is rough agreement between the GCMs and the satellite estimates of LWP in RICO-like conditions. This might suggest that the GCMs’ representation of liquid water is more reliable in trade-wind regimes than for the planet as a whole, though there remains more than a factor of two in the spread of these GCMs within this regime.

Figure 8 provides some insight into the cloud distribution in conditions like those during RICO. It shows the median and interquartile range of high, middle, and low clouds from ISCCP and the GCMs. As above, large-scale conditions from ERA-40 determine the ISCCP sample of RICO-like conditions during the period of overlap between the data sets (1984–2001). Because of the method used to define cloud fraction, it is likely the ISCCP data set overestimates cloud fraction in this regime (Wielicki and Parker 1992; Di Girolamo and Davies 1997; Zhao and Di Girolamo 2006). As discussed above, the diagnosed cloud amounts in GCMs can differ from results using an ISCCP simulator, which was not available for this analysis for three of the GCMs, but this regime is not expected to contain much obscuring high-level cloud, so differences should be minimal. The results confirm again that aquaplanets (gray symbols) capture the trade-wind cumulus environments of their Earth-like counterparts, including their shortcomings, though with greater high cloud amount. The SP-CAM is the only model with a high cloud amount similar to ISCCP, while the other three models have much more high cloud. Neither ISCCP nor the models show much evidence for middle-level clouds, and SP-CAM has especially little middle-level cloudiness. The low-cloud amount for the ISCCP composite is approximately 30%, with generous spread, consistent with the broader trade-wind classification. The GCMs cluster about the same value, except the MPI ECHAM, which, as in the above results, has little low-cloud amount under trade-wind conditions (the aquaplanet has more low-level cloud, but still substantially less than the other models or ISCCP). The extension of this work to include satellite simulators, for ISCCP or other remote sensors, offers potential to evaluate GCM clouds in more detail (cf. Chepfer et al. 2008).

Fig. 8

Median high, middle, and low-cloud fraction (markers) at points with RICO-like conditions from the ISCCP D2 data set (black) and GCMs in Earth-like configurations (colors) and aquaplanets (gray). Horizontal bars around each marker show the interquartile range for each sample. Dashed horizontal lines show the interquartile range of the ISCCP cloud top pressure, and the tickmarks give the median. Vertical placement of the GCM values is arbitrary
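The summary statistics plotted in Fig. 8 can be illustrated with a minimal sketch. It assumes the cloud fractions at the sampled points are already available as a plain list; the function name `median_and_iqr` and the sample values are hypothetical and are not taken from ISCCP or any of the GCMs.

```python
from statistics import median, quantiles


def median_and_iqr(cloud_fraction):
    """Return (median, 25th percentile, 75th percentile) of a sample."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q25, _, q75 = quantiles(cloud_fraction, n=4, method="inclusive")
    return median(cloud_fraction), q25, q75


# Hypothetical low-cloud fractions (0-1) at points flagged as RICO-like.
sample = [0.10, 0.20, 0.30, 0.40, 0.50]
med, q25, q75 = median_and_iqr(sample)
print(f"median={med:.2f}, interquartile range=[{q25:.2f}, {q75:.2f}]")
```

The median and interquartile range are less sensitive to a few extreme points than the mean and standard deviation, which makes them a reasonable summary for skewed cloud-fraction samples.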

Measurements made during RICO suggest that cloud fraction peaks at about 15–20% at the base of the cloud layer and decreases sharply with height toward the typical height of the trade inversion (at about 2 km) and more slowly above it. LWPs that could be inferred from the data are likely somewhat larger than 30 g m−2, roughly compatible with (but lower than) the estimates from the microwave retrievals. These inferences are based on the data displayed in Fig. 9. The left panel shows in-cloud average liquid water at different heights measured using the Gerber PVM-100 probe (Gerber et al. 1994) during the C130 RICO flights. The mean liquid water sampled within a height interval, conditioned on the presence of liquid water, increases nearly adiabatically for the first few hundred meters above cloud-base, and somewhat less rapidly thereafter. Typical cloud average values are near 0.20 g kg−1 toward the middle of the cloud layer. Cloud fraction estimates are shown in the right panel. Cloud fraction is calculated from the downward looking lidar for the surveillance legs (circles of about 200 km circumference typically flown at 4.5 km near the beginning and end of each flight), and is derived from cloud-top incidence with a maximum-overlap assumption. Near cloud-base, the estimate becomes sensitive to the threshold chosen for the cloud-top detection. Here we use the aircraft (in situ) derived cloud amounts to choose the most representative threshold. The resulting cloud amount is slightly larger than satellite-derived trade-wind cloud amount in the same location (Zhao and Di Girolamo 2007). Cloud amounts from in situ measurements at other heights are not representative, as the decision as to how high to fly above cloud-base was conditioned on both the type of measurements being made and the presence of clouds at these altitudes. Accordingly, in situ cloud amounts from other than cloud-base and surveillance legs are strongly biased (and hence are not shown). 
Other biases may arise from the impact of precipitation on liquid water measurements (which explains the sub-cloud data where clouds were not observed, but precipitation shafts were sampled); the tendency of the Gerber probe to underestimate liquid water in regions of large (precipitation-sized) drops; and flight selection biases that avoided both deep and suppressed convection. All things considered, the data do appear sufficient to constrain significant aspects of the modeled cloud water. They suggest that the MPI ECHAM may less severely underestimate trade-wind convection than a comparison with ISCCP would suggest. The data show evidence for a well-defined cloud-base structure, which further suggests that the representation of shallow cumulus by the NCAR CAM is deficient.

Fig. 9

Cloud water (left) and fraction (right) profiles from the C130 flights during RICO. The sampling included all flight legs below 2 km for which good data were available. Shown on the left is the interquartile variability (whisker), mean (gray circle) and median (black circle) of cloud-water. The right panel shows estimates of cloud fraction from the lidar (lines) using different detection thresholds (as indicated in red, with the black line being the 22 dBZ threshold). The filled circles show cloud fraction from in situ measurements near cloud base (where sampling was most random) and along the surveillance leg at 4.5 km. Cloud water measured in the sub-cloud layer is from precipitation

Figure 7 shows that the selection criteria extract generally similar conditions to those observed during RICO. Comparing cloud amount, the GCMs are mostly consistent with ISCCP (Fig. 8), but all are larger than estimates from the aircraft data in Fig. 9. An estimate of the area-averaged liquid water during the RICO flights comes from the product of the average in-cloud liquid water and cloud fraction in Fig. 9, though such an estimate is subject to all the biases listed above. That estimate would generally suggest liquid water values of around 0.02 g kg−1 at altitudes in the cloud layer. The NCAR CAM and GFDL AM exhibit similar values, while the SP-CAM and MPI ECHAM have values higher and lower, respectively. Conversely, an estimate of the in-cloud liquid water in the GCMs (r_L divided by cloud fraction) suggests the NCAR CAM underestimates the amount of water in clouds, while the other models, including the MPI ECHAM, have more realistic in-cloud water content.
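The two conversions used in this comparison can be written out as a short sketch. The function names are hypothetical, and the cloud fraction of 0.10 is an illustrative value chosen to be consistent with the in-cloud (about 0.20 g kg−1) and area-averaged (about 0.02 g kg−1) values quoted in the text; it is not a measured number. The `min_fraction` guard is likewise an assumption, added only to avoid dividing by a vanishing cloud fraction.

```python
def area_averaged_liquid(in_cloud_liquid, cloud_fraction):
    """Area-averaged liquid water = in-cloud liquid water * cloud fraction."""
    return in_cloud_liquid * cloud_fraction


def in_cloud_liquid(area_mean_liquid, cloud_fraction, min_fraction=1e-3):
    """In-cloud liquid water = area-averaged liquid water / cloud fraction.

    Returns None when the cloud fraction is too small for the ratio to be
    meaningful (the cutoff is an assumed value).
    """
    if cloud_fraction < min_fraction:
        return None
    return area_mean_liquid / cloud_fraction


# Observation-like case: 0.20 g/kg in cloud, illustrative cloud fraction 0.10.
obs_area_mean = area_averaged_liquid(0.20, 0.10)  # about 0.02 g/kg

# Model-like case: the same area mean spread over twice the cloud fraction
# implies half the in-cloud water.
model_in_cloud = in_cloud_liquid(obs_area_mean, 0.20)  # about 0.10 g/kg
print(obs_area_mean, model_in_cloud)
```

The second case shows how a model can have a reasonable area-averaged liquid water while still underestimating in-cloud water, if its cloud fraction is too large.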

This comparison of reanalysis, satellite, and in situ data with the models suggests that each model has deficiencies in the representation of trade-wind cumulus, and stresses the importance of representing different aspects of clouds. The NCAR CAM produces a reasonable average liquid water, but this water is spread over too many clouds and too many vertical levels because of the poor representation of the cloud fraction and vertical structure. The MPI ECHAM, on the other hand, has too few clouds, but they have sufficient liquid water content. The SP-CAM contains too much liquid water, but this translates to a small in-cloud water bias because of the realistic cloud fraction. The GFDL AM produces clouds that are too deep, but otherwise produces the most realistic structure, cloud fraction, and liquid water content. The diversity of biases sheds some light on the models’ differing cloud responses to climate change. Because the large-scale environments are similar among the models (and similar to observed conditions) these findings seem to confirm the idea that such weaknesses are linked to the representation of clouds, including the interactions among boundary layer turbulence, convection, and radiation.

5 Summary

We have examined the tropical boundary layer clouds in Earth-like and aquaplanet configurations of four GCMs. These clouds have been identified as a particular weakness of GCMs, and a source of divergence in estimates of climate sensitivity. When appropriately sampled, similar structures are extracted from the idealized (aquaplanet) and more Earth-like simulations: for trade-wind conditions, there is no substantive difference between the Earth-like and aquaplanet configurations. This finding shows that aquaplanets provide an appropriate framework for evaluating tropical cloud regimes in climate models.

To evaluate the GCM clouds, the simulations are sampled based on the large-scale environment, characterized by the free-tropospheric vertical velocity and lower-tropospheric stability (LTS). The vertical velocity is used to separate regions of deeper convection from suppressed conditions, while LTS sorts low cloud types under subsidence. Comparing the distributions of these quantities between the ERA-40 reanalysis and the GCMs, we find that the Earth-like configurations are generally consistent with the reanalysis. The aquaplanets used here exhibit distributions of large-scale conditions that focus on the prevailing conditions in the Earth-like setting, helping to isolate the most common tropical conditions. Using two threshold values, the boundary layer cloud regimes are separated into stratocumulus and trade-wind cumulus regimes. Composite profiles in each of these regimes allow further comparison between the models and observations.
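The sampling strategy summarized above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used for the analysis: the function names, the sign convention (positive vertical velocity meaning subsidence), the threshold values, the sample points, and the use of a simple mean for the composite are all hypothetical choices made for the example.

```python
def classify_regime(omega, lts, *, omega_subsidence, lts_split):
    """Assign one grid point to a large-scale regime.

    omega: free-tropospheric vertical velocity (positive = subsidence here)
    lts: lower-tropospheric stability, in K
    omega_subsidence, lts_split: the two thresholds (assumed values below)
    """
    if omega < omega_subsidence:
        return "convective_or_weak_subsidence"
    if lts >= lts_split:
        return "stratocumulus"
    return "trade_wind_cumulus"


def composite_mean(profiles):
    """Average a list of equal-length vertical profiles level by level."""
    return [sum(level) / len(level) for level in zip(*profiles)]


# Hypothetical thresholds and samples, for illustration only.
thresholds = {"omega_subsidence": 10.0, "lts_split": 17.0}
points = [
    {"omega": 25.0, "lts": 20.0, "cloud": [0.0, 0.6, 0.1]},
    {"omega": 20.0, "lts": 14.0, "cloud": [0.0, 0.2, 0.1]},
    {"omega": 30.0, "lts": 13.0, "cloud": [0.0, 0.4, 0.3]},
    {"omega": -40.0, "lts": 12.0, "cloud": [0.1, 0.1, 0.5]},
]

# Group the cloud-fraction profiles by regime, then composite each group.
by_regime = {}
for p in points:
    regime = classify_regime(p["omega"], p["lts"], **thresholds)
    by_regime.setdefault(regime, []).append(p["cloud"])

composites = {name: composite_mean(profs) for name, profs in by_regime.items()}
print(composites)
```

Because the classification depends only on the large-scale state at each point, the same function applies unchanged to reanalysis, Earth-like, and aquaplanet output, which is what allows the regimes to be compared across configurations.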

In all cases, the frequency of the trade-wind conditions in the GCMs is greater than in the reanalysis, while stratocumulus conditions are under-represented. Composite profiles show that the large-scale environment in each regime, including the more restrictive RICO-like sampling, is similar across the models and reanalysis. Despite the similar conditions, the clouds in both regimes reveal an unsettling assortment of characteristics among the models, bolstering the idea that the parameterized physics associated with clouds are a weakness in the simulations. The differences in the clouds are also manifest as differences in the humidity structure within the cloud layer, making the point that the weaknesses in representing clouds can have broad impacts on a model’s solution. The relative humidity profile in the RICO-like sampling of the GCMs and ERA-40 is quite different from that measured by radiosondes during RICO, but this discrepancy appears to have little effect on the cloud structure.

All of the models appear to have weaknesses in their representation of shallow cumulus convection. The NCAR CAM poorly represents the vertical structure of the trade-wind layer; additionally, it likely overestimates cloud amount and underestimates cloud water within cloud. The former bias seems to dominate, however, producing too much water overall in the trade-wind regions. In contrast, the MPI ECHAM better represents the vertical structure and in-cloud water, but underestimates cloud fraction. The GFDL AM and the SP-CAM both produce more compelling trade-wind clouds. The SP-CAM shares some of the deficiencies of the NCAR CAM, namely an insufficiently sharp distinction between the sub-cloud and cloud-base layers, which allows cloud fraction to maximize too deeply into the cloud layer. The GFDL AM has a distinct sub-cloud layer, with cloud fraction (and water) maximizing just above cloud base, but the top of the cloud layer is not well defined, suggesting frequent occurrence of deeper convection in the trade-wind regions.