Climate Dynamics, Volume 33, Issue 1, pp 141–157

A quantitative performance assessment of cloud regimes in climate models

Authors

  • K. D. Williams
    • Met Office Hadley Centre
  • M. J. Webb
    • Met Office Hadley Centre
Article

DOI: 10.1007/s00382-008-0443-1

Cite this article as:
Williams, K.D. & Webb, M.J. Clim Dyn (2009) 33: 141. doi:10.1007/s00382-008-0443-1

Abstract

Differences in the radiative feedback from clouds account for much of the variation in climate sensitivity amongst General Circulation Models (GCMs). Therefore metrics of model performance which are demonstrated to be relevant to the cloud response to climate change form an important contribution to the overall evaluation of GCMs. In this paper we demonstrate an alternative method for assigning model data to observed cloud regimes obtained from clustering histograms of cloud amount in joint cloud optical depth–cloud top pressure classes. The method removes some of the subjectivity that exists in previous GCM cloud clustering studies. We apply the method to ten GCMs submitted to the Cloud Feedback Model Intercomparison Project (CFMIP), evaluate the simulated cloud regimes and analyse the climate change response in the context of these regimes. We also propose two cloud regime metrics, one of which is specifically targeted at assessing GCMs for the purpose of obtaining the global cloud radiative response to climate change. Most of the global variance in the cloud radiative response between GCMs is due to low clouds, with 47% arising from the stratocumulus regime and 18% due to the regime characterised by clouds undergoing transition from stratocumulus to cumulus. This result is found to be dominated by two structurally similar GCMs. The shallow cumulus regime, though widespread, has a smaller contribution and reduces the variance. For the stratocumulus and transition regimes, part of the variance results from a large model spread in the radiative properties of the regime in the control simulation. Comparison with observations reveals a systematic bias for both the stratocumulus and transition regimes to be overly reflective. If this bias were corrected with all other aspects of the response unchanged, the variance in the low cloud response would reduce. The response of some regimes with high cloud tops differs between the GCMs. These regimes are simulated too infrequently in a few of the models. If the frequency in the control simulation were more realistic and changes within the regimes were unaltered, the variance in the cloud radiative response from high-top clouds would increase. As a result, use of observations of the mean present-day cloud regimes suggests that whilst improvements in the simulation of the cloud regimes would impact the climate sensitivity, the inter-model variance may not reduce. When the cloud regime metric is calculated for the GCMs analysed here, only one model is on average consistent with observations within their uncertainty (and even this model is not consistent with the observations for all regimes), indicating scope for improvement in the simulation of cloud regimes.

1 Introduction

Assessment of General Circulation Models (GCMs) against observational data is routinely undertaken, both during their development and in comparing the performance of different GCMs. Recently, there has been increasing interest in developing sets of metrics to provide quantitative information on GCM performance which may then be used in assessing overall confidence in a GCM’s simulation of future climate change (Murphy et al. 2004; Reichler and Kim 2008; Gleckler et al. 2008; Pincus et al. 2008). Much of the variation between GCMs in projections of global-mean temperature response to an external forcing is due to variations in the radiative response from clouds (Cess et al. 1990; Senior and Mitchell 1993; Webb et al. 2006; Soden and Held 2006; Ringer et al. 2006; Cash et al. 2007). Traditionally, assessment of clouds in GCMs has been based on comparing maps or global averages of cloud variables, typically total cloud amount or cloud radiative forcing (CRF; e.g. Cess et al. 1990) (Pincus et al. 2008; Gleckler et al. 2008). However, a GCM can perform well on such metrics through compensation of errors in the temporal frequency and radiative properties of different cloud types, which may result in an incorrect cloud radiative response under climate change (e.g. Webb et al. 2001).

A number of studies have provided a more detailed analysis of clouds in GCMs by compositing the cloud data into different dynamic and/or thermodynamic regimes (e.g. Bony et al. 2004; Williams et al. 2006; Norris and Weaver 2001; Tselioudis et al. 2000). An alternative approach developed by Jakob and Tselioudis (2003) is to apply a clustering algorithm to joint cloud optical depth (τ)-cloud top pressure (CTP) histograms of cloud amount produced by the International Satellite Cloud Climatology Project (ISCCP; Rossow and Schiffer 1999). Gordon et al. (2005) use a similar clustering approach but just use the ISCCP grid-box-mean cloud albedo (α), CTP and total cloud cover (TCC). Williams and Tselioudis (2007) (hereafter WT07) apply the clustering of the full τ-CTP histograms of cloud amount to comparable ISCCP simulator data (Webb et al. 2001; Klein and Jakob 1999; http://gcss-dime.giss.nasa.gov/simulator.html) from six GCMs. They show that much of the variance in the climate change cloud response across their ensemble of GCMs is due to differences in the present-day simulation of cloud regimes, and they propose that a metric be developed to assess the GCM simulation of cloud regimes.

Although the clustering methodology used by WT07 provides information to model developers regarding the cloud regimes within their model, it requires some subjective decisions. These make the method difficult for others to apply and mean that it cannot be fully automated for application to a large GCM ensemble. In this study, we address these issues by providing an alternative methodology which assigns the GCM data to the observational clusters, resulting in a common set of regimes. We use this technique with a larger ensemble of GCMs than that used by WT07 to address the following questions:
  1. How well are cloud regime properties simulated in contemporary GCMs?
  2. Which cloud regimes contribute most to the spread in equilibrium cloud radiative response to climate change amongst GCMs?
  3. To what extent is this spread due to differences in the present-day simulation of cloud regimes?
  4. Are the conclusions of WT07 robust across this larger GCM ensemble?

We also present two related metrics aimed at assessing simulated cloud regimes which focus in turn on the present-day top of atmosphere (TOA) radiative effects of the cloud regimes and on the global cloud radiative response to climate change. In developing these metrics we have attempted to retain the value of cloud regime analysis outlined by WT07, whilst making it suitably simple, objective and well documented for others to calculate for their GCMs. In the next section, the models, observational data and revised clustering methodology are presented. In Sect. 3 the simulated present-day cloud regimes are evaluated and the climate change response is discussed in Sect. 4. The proposed cloud regime error metrics are presented in Sect. 5 whilst conclusions are given in Sect. 6.

2 Models, observational data and methodology

2.1 Models and experimental design

This study uses equilibrium control and 2 × CO2 atmosphere – mixed-layer ocean (slab model) experiments from ten GCMs which have been submitted to the Cloud Feedback Model Intercomparison Project (CFMIP; McAvaney and Le Treut 2003; http://www.cfmip.net). The models used and references to their descriptions are listed in Table 1. Following WT07, 5 years of daily mean ISCCP simulator and CRF data are used from each experiment, with the exception of the two MIROC models for which only 2 years of daily data are available. Although four Met Office Hadley Centre models are included, HadGEM1 and HadCM3 have considerable structural differences, including a different dynamical core, resolution and many different or revised physical parametrizations. HadCM4 is an intermediate model containing aspects of both HadCM3 and HadGEM1, whilst HadGEM2 has many similarities to HadGEM1. The lower and higher sensitivity versions of the MIROC 3.2 model differ in two respects: the temperature range over which mixed phase clouds are simulated, and that when cloud ice melts it becomes rain in MIROC-ls whilst it becomes cloud water in MIROC-hs (Ogura et al. 2008).
Table 1 List of CFMIP models used in this study, their resolution, climate sensitivity and main reference

| Model | Atmospheric resolution | Climate sens. (K) | Main references |
|---|---|---|---|
| CCCMa AGCM4 | T47 L35 | 3.9 | von Salzen et al. (2005) |
| CCSM3 | T85 L26 | 2.9 | Collins et al. (2006) |
| ECHAM5/MPI-OM | T63 L31 | 3.4 | Roeckner et al. (2003) |
| GFDL-CM2.1 | N72 L24 | 3.4 | Delworth et al. (2006) |
| MetUM-HadCM3 | N48 L19 | 3.5 | Pope et al. (2000) |
| MetUM-HadCM4 | N48 L38 | 3.7 | Webb et al. (2001) |
| MetUM-HadGEM1 | N96 L38 | 4.6 | Martin et al. (2006) |
| MetUM-HadGEM2(N48) | N48 L38 | 4.4 | Martin et al. (2008) |
| MIROC3.2(medres)-low sens. | T42 L20 | 3.9 | K-1 Model Developers (2004) |
| MIROC3.2(medres)-high sens. | T42 L20 | 6.5 | K-1 Model Developers (2004) |

Horizontal resolution is prefixed by ‘T’ for the triangular truncation of spectral models and ‘N’ for half the number of east-west points for grid-point models (giving approximately comparable numbers). The number of atmosphere levels is prefixed by ‘L’. The climate sensitivity is calculated as the difference in the global-mean surface temperature between the 5-year periods of the 2 × CO2 and control simulations analysed here (these may be slightly different to values reported elsewhere due to different periods of the simulation being averaged). Although the standard resolution of HadGEM2 is N96, only an N48 atmosphere–mixed-layer version was available for this study.

2.2 Observational data

Daily mean joint τ-CTP histograms of cloud amount were obtained by averaging the 3-hourly ISCCP D1 product (Rossow and Schiffer 1999). This product contains τ-CTP histograms of cloud amount on a 2.5° grid (i.e. the dataset is formed of a τ-CTP histogram for each grid-point). The ISCCP D1 product and ISCCP simulator diagnostics are only available at sun-lit points, so the day-mean is the average of the 3-hourly samples during daylight hours. Several previous studies have indicated that much of the error in the present-day simulation of CRF by GCMs, and variance in cloud radiative response to climate change is in the shortwave (e.g. Bony and Dufresne 2005; Williams et al. 2006), so these sun-lit-only data still provide a useful constraint on GCMs. Daily mean CRF data are obtained from the ISCCP-FD product (Zhang et al. 2004). Five years of observational data for the period March 1985–February 1990 are used for evaluation of the GCMs.

In order to address issues of observational uncertainty, we make use of two independent datasets. ISCCP-like joint τ-CTP histograms of cloud amount have been obtained from the collection 5 dataset of the Moderate Resolution Imaging Spectroradiometer (MODIS) (King et al. 2003). Daily mean histograms for the period January 2003–December 2006 have been produced by combining the data from both the Terra and Aqua satellites with equal weighting. These are used to assess the observational uncertainty in the regimes and their frequencies. Daily mean CRF data have also been obtained from the S4G product from the Earth Radiation Budget Experiment (ERBE; Barkstrom et al. 1990). Daily clear-sky measurements are only available from ERBE at places which are identified as cloud-free by the ERBE scene identification algorithm, hence averages of those days in the month which are clear are subtracted from the daily all-sky fluxes in order to produce a daily CRF. It should be noted that unlike ISCCP-FD, the ERBE dataset contains a considerable amount of missing data, particularly over land, which may lead to some bias in the CRF of regimes in these areas. As for ISCCP, data used from ERBE is for the period March 1985–February 1990 and provides an alternative CRF for the ISCCP cloud regimes. Although MODIS and ERBE are used in this study to provide the uncertainty in the observed regime frequencies and CRF (and so are often labelled together in figures), the ERBE data provides uncertainty in the CRFs associated with the ISCCP defined regimes, not the MODIS regimes.

2.3 Methodology for obtaining cloud regimes

WT07 cluster the model data independently of the observational regimes. In this study we assign each GCM data point to an observational regime (based on the grid-box mean CTP, α and TCC, Gordon et al. 2005). The advantage of this approach is that it allows a common set of regimes to be analysed and removes subjectivity from the methodology. The details of the new method and its motivation are described below. We first review the methodology of WT07 and then suggest two methods for assigning the model data to the observed clusters, the second of which we pursue in this study.

Jakob and Tselioudis (2003) obtain cloud clusters from the ISCCP τ-CTP histograms by applying the k-means clustering algorithm (Anderberg 1973) to the spatio-temporal set of 42 element vectors which form each histogram (cloud amount in 6 τ and 7 CTP categories i.e. \([\hbox{C}_{\tau1{\rm CTP1}},\hbox{C}_{\tau2{\rm CTP1}}, \ldots ,\hbox{C}_{\tau6{\rm CTP7}}],\) where C is cloud fraction in each bin). The 42 elements are treated as being orthogonal to each other. The algorithm is an iterative process which assigns the 42 element vector at each grid-point to the cluster centroid vector with the minimum Euclidean distance (i.e. the minimum root-sum-square distance), and then re-calculates the new cluster centroid vector. WT07 apply the clustering in exactly the same way to the GCM ISCCP simulator output (except that the ISCCP simulator has 7 τ categories as it has an additional one for optical depths smaller than the minimum which ISCCP is believed to detect: 0.3). The algorithm requires the number of clusters to be specified and WT07 present a semi-objective method, based on Rossow et al. (2005), for determining this for each GCM experiment. The approach is valuable for identifying the natural cloud clusters/regimes which occur in a GCM and how those evolve in a climate change simulation. However, since each GCM experiment can have a different number of clusters, WT07 had to combine some of the clusters into ‘principal cloud regimes’ in order to compare the GCMs with each other and with observations. This requires subjective judgement which, together with the subjective decisions remaining within the clustering procedure to define the number of clusters, make the method difficult to automate and for others to apply. The alternative approach presented here is to assign the GCM output to the observed ISCCP clusters. 
This has the advantage for comparison purposes that there is a common number of regimes which can be directly compared and, once the observed regimes have been defined, there are no subjective decisions in the methodology.
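The iterative procedure described above can be sketched in a few lines. This is a minimal illustration of a generic k-means loop, not the code used by Jakob and Tselioudis (2003) or WT07. The function name `kmeans`, the requirement that the caller supplies the starting centroids, and the handling of empty clusters are all assumptions made for the example.

```python
import math

def kmeans(vectors, centroids, max_iter=100):
    """Generic k-means loop: assign each vector to the nearest centroid
    (minimum Euclidean distance), then recompute each centroid as the
    mean of its members. Starting centroids are supplied by the caller."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Toy two-element vectors; a real application would pass 42-element histograms.
demo_centroids, demo_labels = kmeans(
    [[0, 0], [0, 2], [10, 10], [10, 12]], [[0, 0], [10, 10]]
)
```

The number of clusters is fixed by the number of starting centroids, which is the choice that WT07 had to make semi-objectively for each GCM experiment.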

One way to assign the GCM data to the observed clusters would be to assign the 42 element vector at each GCM grid-point to the observed cluster centroid vector with the minimum Euclidean distance. However, this would treat each element of the vector (i.e. each τ-CTP bin) independently, making no allowances for the distances between cells in the 2D τ-CTP histogram. So if the GCM simulates stratocumulus with the correct grid-box TCC but one optical depth category too thick, it would be treated as being as different as if the grid-box had the same TCC but with high cloud of a completely different τ. As noted by WT07, most GCMs do not simulate cloud regimes with τ and CTP identical to the observations, so they may have a considerable Euclidean distance to the observed cluster histogram that they would be subjectively associated with. Consequently, the vectors at many GCM grid-points have a minimum Euclidean distance to the observed vector which is closest to the origin of the 42 orthogonal elements, even though the simulated cloud histogram may be quite different to this on any physical judgement. The cluster closest to the origin is that which has a small cloud cover in all τ-CTP bins (usually shallow cumulus).
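A small numerical illustration may help here. The sketch below uses invented cloud amounts and bin positions (none are observed values) and an assumed flattening order for the 42 elements. It shows that a low cloud one τ category too thick is exactly as far from the observed centroid as a high, thin cloud, and that it lies closer to a centroid with small cloud cover.

```python
import math

N_TAU, N_CTP = 6, 7  # 6 optical-depth bins x 7 cloud-top-pressure bins = 42 elements

def histogram_vector(cloud):
    """Build a 42-element vector from {(tau_bin, ctp_bin): cloud amount in %}.
    The flattening order (CTP-major) is an assumption for this example."""
    vec = [0.0] * (N_TAU * N_CTP)
    for (tau_bin, ctp_bin), amount in cloud.items():
        vec[ctp_bin * N_TAU + tau_bin] = amount
    return vec

# Hypothetical centroids; ctp_bin 1 is taken to be low cloud, ctp_bin 5 high cloud.
obs_stratocumulus = histogram_vector({(3, 1): 80.0})
obs_low_cover = histogram_vector({(2, 1): 20.0})  # the cluster nearest the origin

# Two hypothetical simulated grid-boxes, both with TCC = 80%.
gcm_too_thick = histogram_vector({(4, 1): 80.0})  # low cloud, one tau bin too thick
gcm_high_thin = histogram_vector({(1, 5): 80.0})  # high, thin cloud

d_thick = math.dist(gcm_too_thick, obs_stratocumulus)
d_high = math.dist(gcm_high_thin, obs_stratocumulus)
d_origin = math.dist(gcm_too_thick, obs_low_cover)
```

Here `d_thick` equals `d_high`, and `d_origin` is smaller than both, so the slightly-too-thick stratocumulus point would be assigned to the low-cover cluster.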

In this study, we use a more tolerant methodology for assigning the model data to the observed regimes. Instead of using the 42 element vector of cloud amount in each τ-CTP bin, we use the grid-box mean α, CTP and TCC. The observed regimes are generated by clustering the 42 element vectors as previously in order to utilise the additional information in the full histogram (Jakob and Tselioudis 2003; Rossow et al. 2005; WT07, etc.), and the mean α, CTP and TCC obtained for the resulting regimes. These are normalised to each be in the range 0–1 (e.g. the CTP in hPa is divided by 1,000 hPa) and form a three element vector ([α,CTP,TCC]). Each model grid-point is then assigned to the observed regime with the minimum Euclidean distance in the vector of normalised α, CTP and TCC. This approach assumes that an error in α of 10% is treated equally to an error of 100 hPa in CTP or an error of 10% in TCC, and so allows for some variation between the simulated and observed τ and CTP of each regime. One prior concern regarding the method was that when a regime contains a mix of high and low top clouds, the mean CTP will be at mid-levels and may result in a similar vector to a regime with mid-level cloud tops. In practice, α and TCC are different enough between these regimes that this does not happen.
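The tolerant assignment can be sketched as follows, using the tropical regime means from Table 3 as the observed three-element vectors. The function names and the dictionary layout are illustrative assumptions. Normalisation divides α and TCC (in %) by 100 and CTP (in hPa) by 1,000, as described above.

```python
import math

# Observed tropical regime means from Table 3: (albedo %, CTP hPa, TCC %).
TROPICAL_REGIMES = {
    "Shallow cumulus": (26.1, 652.0, 31.4),
    "Congestus": (33.9, 483.0, 81.3),
    "Thin cirrus": (21.1, 356.0, 74.0),
    "Stratocu./Cu. transition": (33.8, 784.0, 64.0),
    "Anvil cirrus": (31.3, 327.0, 94.4),
    "Deep convection": (53.2, 285.0, 97.9),
    "Stratocumulus": (44.6, 722.0, 82.4),
}

def normalise(albedo_pct, ctp_hpa, tcc_pct):
    """Scale each element to the range 0-1."""
    return (albedo_pct / 100.0, ctp_hpa / 1000.0, tcc_pct / 100.0)

def assign_regime(albedo_pct, ctp_hpa, tcc_pct, regimes=TROPICAL_REGIMES):
    """Return the name of the regime with the minimum Euclidean distance
    in normalised [albedo, CTP, TCC] space."""
    point = normalise(albedo_pct, ctp_hpa, tcc_pct)
    return min(
        regimes,
        key=lambda name: math.dist(point, normalise(*regimes[name])),
    )

# A hypothetical grid-box with low cloud that is somewhat too bright.
example = assign_regime(50.0, 700.0, 85.0)
```

With these inputs the hypothetical grid-box is still assigned to the stratocumulus regime, which is the tolerance to errors in τ that the method is designed to give.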

A small alteration to the ISCCP simulator would output CTP, α and TCC directly. This change is proposed for the next release of the ISCCP simulator (version 4). Having a vector of 3 variables rather than 42 for the assignment also reduces the amount of high temporal resolution data required proportionally, making the approach more attractive for application to large ensembles (e.g. Murphy et al. 2004). However, only the histograms were available for this study so the mean α and CTP have been calculated as cloud weighted means from the histograms assuming a central α and CTP in each τ-CTP bin (Table 2). The τ bin boundaries were converted to αs using the conversion table within the ISCCP simulator code and linearly interpolated to obtain the central αs.
Table 2 Central αs and CTPs for the ISCCP histogram bins

| α (%) | CTP (hPa) |
|---|---|
| 2.8 | 900 |
| 10.7 | 740 |
| 23.2 | 620 |
| 40.7 | 500 |
| 62.6 | 375 |
| 82.2 | 245 |
| 95.0 | 90 |

These are used with the histogram cloud amounts to calculate the grid-box mean α and CTP.
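The cloud-weighted means can be sketched as below, using the central values in Table 2. The array layout (`hist[i][j]` holding the cloud amount in CTP bin i and τ bin j, both ordered as in Table 2) and the function name are assumptions for the example, as is a histogram with seven τ columns, which matches the ISCCP simulator output rather than the six-category observations.

```python
ALBEDO_CENTRES = [2.8, 10.7, 23.2, 40.7, 62.6, 82.2, 95.0]      # %, from Table 2
CTP_CENTRES = [900.0, 740.0, 620.0, 500.0, 375.0, 245.0, 90.0]  # hPa, from Table 2

def gridbox_means(hist):
    """Return (mean albedo %, mean CTP hPa, TCC %) for one histogram,
    or None for a completely clear grid-box, where albedo is undefined.
    hist[i][j] is the cloud amount (%) in CTP bin i and tau bin j."""
    tcc = sum(sum(row) for row in hist)
    if tcc == 0:
        return None
    albedo = sum(
        amount * ALBEDO_CENTRES[j]
        for row in hist for j, amount in enumerate(row)
    ) / tcc
    ctp = sum(
        amount * CTP_CENTRES[i]
        for i, row in enumerate(hist) for amount in row
    ) / tcc
    return albedo, ctp, tcc

# Hypothetical grid-box: 40% cloud at 900 hPa and 10% at 245 hPa.
demo_hist = [[0.0] * 7 for _ in range(7)]
demo_hist[0][3] = 40.0
demo_hist[5][1] = 10.0
demo_albedo, demo_ctp, demo_tcc = gridbox_means(demo_hist)
```

The resulting three values form the vector that is normalised and compared with the observed regime means.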

Since values of α and τ cannot be defined for clear-sky, we exclude the small number of grid-boxes which are completely clear (less than 0.5% of points for the observations). Frequencies of occurrence presented in this study are relative to the total number of points containing some cloud.

The daily mean shortwave, longwave and net CRF (SCRF, LCRF and NCRF respectively) from grid-points assigned to each regime are averaged to obtain regime mean CRF components.
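A minimal sketch of the frequency and regime-mean calculations is given below. It assumes a simple unweighted count over points (the text does not state whether any area weighting is applied), and the function name and input layout are hypothetical.

```python
from collections import defaultdict

def regime_statistics(labels, ncrf):
    """labels[i]: regime name for point i, or None if the point is completely clear.
    ncrf[i]: daily mean net CRF (W m-2) at that point.
    Returns {regime: (RFO in %, regime mean NCRF)}, with the RFO relative to
    the number of points containing some cloud."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lab, value in zip(labels, ncrf):
        if lab is None:
            continue  # clear points are excluded from all statistics
        sums[lab] += value
        counts[lab] += 1
    n_cloudy = sum(counts.values())
    return {
        lab: (100.0 * counts[lab] / n_cloudy, sums[lab] / counts[lab])
        for lab in counts
    }

# Hypothetical sample: three cloudy points and one clear point.
demo_stats = regime_statistics(
    ["Stratocumulus", "Stratocumulus", "Shallow cumulus", None],
    [-60.0, -40.0, -10.0, 0.0],
)
```

The same function could be applied to SCRF or LCRF by passing the corresponding values in place of `ncrf`.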

Following WT07, we separate the globe into three regions: the tropics (defined as 20°N-20°S), the ice-free extra tropics, and regions covered by snow and ice in the control climate. This separation is required in order to split tropical deep convective clouds from extra-tropical frontal clouds which both have similar τ-CTP histograms but may differ in their climate change response, and to separate regions which may have a large change in surface albedo which can affect the CRF (e.g. Cess et al. 1990). The boundary between the ice-covered and ice-free extra-tropics is allowed to vary spatio-temporally. The ice edge used with the observational data is obtained from the ISCCP-IS product (Rossow et al. 1996). For those GCMs where sea-ice and snow fractions have been submitted to CFMIP, the boundary is determined from the model fields, otherwise (in the case of the MIROC models) the ISCCP-IS product is used.

3 Present-day cloud regimes

As reported by WT07, seven tropical cloud regimes, seven cloud regimes over the ice-free extra-tropics and six cloud regimes over regions covered by snow and ice are obtained by clustering the ISCCP observational data (using the method documented by WT07). The resulting tropical regime histograms are shown on the top row of Fig. 1 and the mean α, CTP and TCC for each regime are given in Table 3. This table provides the 3 element observational vectors which the model data are assigned to, hence the table gives the observational data required for carrying out the methodology presented here with other GCMs. A key requirement when developing a more tolerant assignment method was that when the ISCCP observational data are assigned to the observed ISCCP clusters, the resulting mean histograms should be similar. The results of assigning the tropical ISCCP data are shown in the second row of Fig. 1. It can be seen that the mean histograms resulting from the assignment are indeed similar to the original histograms from the clustering, providing confidence that the tolerant assignment method does generally assign the data to the expected regime. Throughout the remainder of the paper, results marked as ‘ISCCP’ are where the ISCCP observational data has been assigned to the ISCCP clusters. In common with previous studies using the clustering methodology, each regime has been assigned a name based on the morphological cloud type that is likely to form much of the regime for ease of reference. However, it is acknowledged that such cloud types cannot be identified from τ-CTP-TCC alone and for some of the clusters (e.g. that labelled ‘Congestus’ in Fig. 1) there is currently uncertainty regarding the dominant morphological cloud type(s) (C. Jakob, personal communication).
Fig. 1 Mean τ-CTP histograms for the tropical regimes. Shading indicates cloud amount (%) in each τ-CTP category, with the light grey indicating cloud amounts less than 2% (including 0%). Top row: results from clustering ISCCP observational data. Second row: results from assigning the ISCCP data to the ISCCP clusters (i.e. to the top row) using the tolerant assignment method. Third row: results from assigning MODIS data to the ISCCP clusters. Fourth row: average MODIS histograms from points identified as belonging to each regime from ISCCP clustering; this uses one common year’s worth of data (2004) from both observational datasets. Fifth row: results from independently clustering the MODIS data.

Table 3 Mean α, CTP and TCC for each ISCCP regime cluster

| Regime | α (%) | CTP (hPa) | TCC (%) | RFO (%) |
|---|---|---|---|---|
| Tropics | | | | |
| Shallow cumulus | 26.1 | 652 | 31.4 | 37.5 |
| Congestus | 33.9 | 483 | 81.3 | 19.5 |
| Thin cirrus | 21.1 | 356 | 74.0 | 11.9 |
| Stratocu./Cu. transition | 33.8 | 784 | 64.0 | 10.3 |
| Anvil cirrus | 31.3 | 327 | 94.4 | 9.1 |
| Deep convection | 53.2 | 285 | 97.9 | 6.4 |
| Stratocumulus | 44.6 | 722 | 82.4 | 5.2 |
| Ice-free extra-tropics | | | | |
| Shallow cumulus | 28.6 | 643 | 47.3 | 35.4 |
| Congestus | 45.7 | 607 | 93.2 | 17.0 |
| Stratocu./Cu. transition | 37.5 | 799 | 80.2 | 11.4 |
| Cirrus | 32.5 | 430 | 91.4 | 10.4 |
| Stratocumulus | 43.8 | 723 | 90.0 | 9.1 |
| Frontal | 58.1 | 393 | 97.8 | 8.3 |
| Thin cirrus | 22.0 | 389 | 71.3 | 8.3 |
| Snow/ice covered | | | | |
| Shallow cumulus | 43.3 | 582 | 35.6 | 42.3 |
| Stratocumulus | 51.0 | 740 | 74.7 | 19.1 |
| Thick mid-level | 57.6 | 620 | 77.8 | 13.9 |
| Frontal | 50.5 | 458 | 88.4 | 11.1 |
| Thin mid-level | 34.3 | 595 | 84.1 | 9.4 |
| Thin cirrus | 24.7 | 452 | 74.4 | 4.2 |

These data are used for assignment of the model data to the observed regimes. Also shown is the relative frequency of occurrence (RFO) of the regime within each region. Regimes are listed here (and throughout the paper) in order of decreasing RFO in each region.

In developing evaluation metrics, it is important to consider uncertainty in the observations (Gleckler et al. 2008; Pincus et al. 2008). Here we achieve this by using the tolerant assignment method to assign independent data from MODIS to the ISCCP clusters as though they were output from another model. Tropical results of MODIS assigned to the ISCCP clusters are shown on the third row of Fig. 1. The regime histograms are reasonably similar for the optically thicker low and high cloud regimes; however, the ‘Congestus’ and ‘Thin cirrus’ regimes have some differences. Both appear thicker in MODIS and the ‘Congestus’ regime has higher cloud tops. Similar results are obtained if MODIS daily histograms are averaged from points identified by the ISCCP clustering as belonging to each regime (fourth row of Fig. 1), providing further evidence that the tolerant assignment approach is associating the data with the correct regime.

Although comparison of MODIS and ISCCP can provide an indication of observational uncertainty, both use passive nadir-viewing instruments and have pixel sizes of 1 km or more, so suffer from some common problems. Both have trouble detecting thin clouds (particularly thin cirrus). Whilst ISCCP takes a liberal approach of assigning as thin cirrus any clouds which are detected but for which an optical depth cannot be determined, this MODIS dataset uses a conservative approach and disregards data for which an optical depth cannot be assigned. This is highlighted when the MODIS data are clustered independently following the method of WT07 (bottom row of Fig. 1). Most of the regimes are again produced with similar histograms (providing confidence that the observed regimes are not sensitive to the dataset), but a separate ‘Thin cirrus’ regime is not produced. Since GCMs produce some separate thin cirrus, we assign the model data to ISCCP (which has this regime) and use the assignment of MODIS onto ISCCP to provide the observational uncertainty. The assignment of ISCCP data onto the MODIS clusters and average ISCCP histograms for MODIS-located clusters are shown in the electronic supplementary material (ESM) to this paper (ESM 1). The other geographical regions reflect the tropical results in that ISCCP and MODIS compare reasonably well for optically thicker regimes, but there is more thin cirrus in ISCCP (ESM 2-4). Another problem for both observational datasets is where several cumulus clouds, with clear-sky between, are contained within a pixel. In these situations, the observations will be biased to indicate clouds which are thinner and more extensive than is actually the case. An estimate of the effect of this on the shallow cumulus cluster is presented by WT07. With the exception of this issue with shallow cumulus, studies comparing ISCCP with other observational datasets suggest more uncertainty in CTP than τ (e.g. Mace et al. 2006; Wang et al. 1999), particularly in multi-layer cloud situations. It will be shown below that the main model errors affecting the NCRF are biases in τ, hence we believe that these observational datasets still provide a useful constraint on the GCM simulations.

Since the tolerant assignment method allows for some variation in τ-CTP-TCC, the assigned mean histograms from the GCMs may be different to those observed. These are shown in full in ESM 2-4, but a quantitative indication of the difference from observations may be obtained by calculating the Euclidean distance between the 42 element τ-CTP vector from the GCM histogram and the observed histogram (Fig. 2a). For some of the regimes (such as tropical and extra-tropical thin cirrus), the comparatively large difference between ISCCP and MODIS means that all of the GCMs fall within observational uncertainty, whilst for others (e.g. tropical and extra-tropical transition clouds) all of the GCMs are outside of the observational range. Consistent with the findings of WT07, the newer Met Office Hadley Centre models appear to produce τ-CTP histograms most similar to observations in the tropics and extra-tropics, whilst ECHAM5 has the largest Euclidean distance to ISCCP in several of the regimes.
Fig. 2 (a) Euclidean distance of the 42 element τ-CTP vector of cloud amount from the ISCCP observational regimes. (b) Root-mean-square difference (RMSD) between geographical maps of the regime frequency at each grid point (ESM 5-7). (c) Regime relative frequency of occurrence (RFO). (d) Regime mean SCRF. (e) Regime mean LCRF. (f) Regime mean NCRF. The mean values for each regime from each model’s control are shown. ISCCP/ISCCP-FD and MODIS/ERBE observations are linked by an error bar to provide an estimate of observational uncertainty. Uncertainty due to internal variability has also been calculated using the bootstrap method described by WT07; however, this is smaller than the size of the symbols in all cases so has been omitted from the figure.

Maps of the fraction of time that each grid-point is assigned to a particular regime provide information about whether the regimes occur in the correct place (ESM 5-7). This is obviously of importance when considering regional climate, but may also help with diagnosing errors. For example, the tropical transition cloud regime extends too far westwards near the equator in several GCMs (particularly CCCMa, CCSM3 and the MIROC models) (Fig. 3). In most cases this error is associated with a local cooling of the ocean surface which, through subsequent advection of the anomaly, may contribute to an enhancement of the Pacific cold tongue when these GCMs are coupled to a full ocean model. Root-mean-square differences (RMSDs) between these maps from the GCMs and ISCCP provide a simple measure of how well the observed map is reproduced (Fig. 2b). This could be extended to include temporal variability (Gleckler et al. 2008) to evaluate, for example, cloud changes during an El-Niño event, but since this study uses mixed-layer ocean models we concentrate on the mean simulation. The mean relative frequency of occurrence (RFO) of each regime is shown in Fig. 2c. There is considerable similarity with Fig. 2b, indicating that much of the RMSD between the maps is due to errors in overall regime frequency rather than geographical displacement. Consistent with WT07, ECHAM5 can be seen to simulate the frontal cloud regime too frequently. With the exception of extra-tropical congestus (which all the GCMs simulate too infrequently), there is at least one GCM within the observational uncertainty range of RFO for each of the regimes, however, less than half of the models are generally within the observational range. The largest variation in RFO is for shallow cumulus. 
Whilst a larger spread may be expected because it is the most frequent regime, if a model simulates clouds with very little similarity (in terms of τ and CTP) to any of the observed clusters and with a low TCC, then there is a risk that the grid-point will be incorrectly assigned to the shallow cumulus regime as the TCC element of the vector for this regime differs markedly from the other regimes (Table 3). There is some evidence that this is the case for the GFDL model in the ice-free extra-tropics since the RFO of the ‘shallow cumulus’ regime is excessive and the magnitude of the regime mean SCRF and LCRF are also larger than observed (Fig. 2d, e). This would be consistent with more radiatively active clouds being incorrectly assigned to the shallow cumulus regime in this region.
Fig. 3 The fraction of time that each grid-point is assigned to the tropical transition regime.

The observational uncertainty of the regime mean SCRF, LCRF and NCRF is generally fairly small in comparison with the spread in the GCMs (Fig. 2d–f). For both the tropical and ice-free extra-tropical stratocumulus and transition regimes, there is a systematic bias for SCRF to be stronger (more negative) than observed in all of the models (except for HadGEM2 extra-tropical stratocumulus), indicating that these cloud regimes are too reflective in GCMs. Examination of the regime histograms (ESM 2-4) reveals that, with the exception of HadGEM1 and HadGEM2, this is due to the regime τ being too large rather than an excessive TCC. In contrast, several of the models have SCRFs for deep convection and frontal cloud (the most reflective cloud regimes) which are too small. The GFDL model has a particularly strong SCRF for many of the regimes, whereas the RFO of these regimes is often less than observed. This suggests that the particularly good time-mean geographical distribution of CRF in the GFDL model (Pincus et al. 2008) may disguise a compensation of errors. The LCRFs of most of the regimes in ECHAM5 are weaker than observed due to the simulated clouds generally being optically thinner than observed rather than being too low.

WT07 find that the RFO of the tropical stratocumulus regime is too high in the MIROC models, partly accounting for their higher sensitivity. Using the tolerant assignment method, the stratocumulus RFO is more realistic but the error manifests itself as shallow cumulus and transition clouds with an excessive SCRF, and with a comparatively large Euclidean distance to the observed histograms (the independent clustering method of WT07 included some of these grid-points with stratocumulus). Establishing whether the error is stratocumulus covering too large an area or cumulus/transition clouds which are too bright and with too high a grid-box cloud cover would require a more detailed analysis of the physical processes operating within this model.

4 Climate change response

Changes in each of the regime histograms, RFO and CRF components in response to doubling CO2 are given in ESM 8-11. Following WT07, the change in area-mean net CRF \((\overline{\Updelta \hbox{NCRF}})\) can be expressed as:
$$\overline{\Updelta \hbox{NCRF}} = \sum_{r=1}^{\hbox{nregimes}} \hbox{NCRF}_{r} {\Updelta \hbox{RFO}}_{r} + \sum_{r=1}^{\hbox{nregimes}} \hbox{RFO}_{r} {\Updelta \hbox{NCRF}}_{r} + \sum_{r=1}^{\hbox{nregimes}} {\Updelta{\hbox{RFO}}_{r}} {\Updelta{\hbox{NCRF}}_{r}} $$
(1)
where the first term on the right-hand-side represents the contribution to \(\overline{\Updelta \hbox{NCRF}}\) from the change in the RFO of the regime (r) (‘the ΔRFO driven component’); the second represents the change in NCRF within the regime (‘the ΔNCRF driven component’); and the third reflects a second-order co-variation of RFO and NCRF. Figure 4 shows the tropical mean \(\overline{\Updelta \hbox{NCRF}}\) decomposed into each of the terms in Eq. 1 (all regions are shown in ESM 12-14). The components are normalised by the global-mean climate sensitivity (the equilibrium surface warming due to a doubling of CO2) of the respective model to provide one measure of cloud feedback (Boer and Yu 2003; Webb et al. 2006). Also shown in Fig. 4 is the decomposition of \(\overline{\Updelta \hbox{NCRF}},\) but using the ISCCP observed values of NCRF and RFO with the respective model’s ΔRFO and ΔNCRF, i.e. what the ΔRFO driven component would be with the observed NCRF, and what the ΔNCRF driven component would be with the observed RFO. This ‘ISCCP constrained’ value is labelled \(\overline{\Updelta \hbox{NCRF}_{{\rm obs}}}\) and can be used to provide an indication of what climate change response would occur if the control regimes were simulated accurately but the changes were the same (WT07). It is important to note that if improvements in the control regimes could be achieved, the changes to the model may well alter the ΔRFO and ΔNCRF so this is only a guide. However, it provides a useful indication of the extent to which errors in the control climate affect \(\overline{\Updelta \hbox{NCRF}}.\)
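The decomposition in Eq. 1 can be checked numerically. The following is a minimal sketch, assuming RFO is held as a fraction of time and NCRF in W m−2; the function name and the two-regime values are hypothetical and are not taken from any CFMIP model. Passing observed NCRF and RFO in place of the control values gives the 'ISCCP constrained' form of the first two terms.

```python
def decompose_delta_ncrf(ncrf, rfo, d_ncrf, d_rfo):
    """Return the three right-hand-side terms of Eq. 1, each summed over regimes.

    ncrf, rfo     : control (or observed) regime-mean NCRF and RFO
    d_ncrf, d_rfo : the model's change in NCRF and RFO for each regime
    """
    rfo_driven = sum(n * dr for n, dr in zip(ncrf, d_rfo))        # NCRF_r * dRFO_r
    ncrf_driven = sum(r * dn for r, dn in zip(rfo, d_ncrf))       # RFO_r * dNCRF_r
    covariation = sum(dr * dn for dr, dn in zip(d_rfo, d_ncrf))   # dRFO_r * dNCRF_r
    return rfo_driven, ncrf_driven, covariation


# Hypothetical two-regime example (illustrative values only).
ncrf = [-10.0, -30.0]   # W m-2
rfo = [0.6, 0.4]        # fraction of time
d_ncrf = [-1.0, 2.0]    # W m-2
d_rfo = [0.05, -0.05]   # fraction of time

terms = decompose_delta_ncrf(ncrf, rfo, d_ncrf, d_rfo)

# The three terms should sum to the direct change in area-mean NCRF.
direct = (
    sum((r + dr) * (n + dn) for r, dr, n, dn in zip(rfo, d_rfo, ncrf, d_ncrf))
    - sum(r * n for r, n in zip(rfo, ncrf))
)
print(terms, sum(terms), direct)
```

In this sketch the co-variation term is unaffected by the choice of control or observed values, since it depends only on the model's ΔRFO and ΔNCRF.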
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-008-0443-1/MediaObjects/382_2008_443_Fig4_HTML.gif
Fig. 4

The change in NCRF in response to doubling CO2, normalised by the change in global-mean surface temperature, decomposed into the contribution from each term in Eq. 1. Results are shown for each tropical cloud regime and the total from all the regimes, hence the far left black bar is \(\overline{\Updelta\hbox{NCRF}}\) for the region. The small vertical lines at the end of each bar represent uncertainty estimates due to internal variability using the bootstrap method described by WT07. Also shown in the pale colours are the changes in NCRF if ISCCP observations of present-day RFO and NCRF are used with the model’s ΔNCRF and \(\Updelta\hbox{RFO}\)

When assigning to a common set of regimes, more of the climate change response will occur from changes in the RFO of regimes rather than changes in the NCRF of the regime. This can be seen by comparing Fig. 4 with Fig. 10 from WT07. In the tropics, the stratocumulus and transition regimes give some of the largest mean contributions to \(\overline{\Updelta \hbox{NCRF}}\) and have the largest variance amongst the models (bottom panel on Fig. 4). The large positive change in NCRF from the transition regime in the MIROC models and from the stratocumulus regime in CCCMa, CCSM3 and the MIROC models arises from a reduction in the RFO of the regime (i.e. the largest contribution is from the ΔRFO driven term - the dark blue bar). The pale blue bar is smaller in all of these cases, indicating that if the control NCRF was more accurately simulated in these models (specifically if the overly bright cloud error was reduced) \(\overline{\Updelta \hbox{NCRF}}\) could be reduced (all other things being equal). In addition, the variance amongst the models from the ΔRFO driven terms could be reduced. However, Fig. 4 also indicates that if the control regime RFO were more realistic, the contribution from the change in the radiative properties of the regimes (ΔNCRF driven term—red bars) may increase, and the variance of this component may also increase. In the case of stratocumulus, this is because several models simulate the regime too infrequently. When ISCCP observations are used, the increased variance also occurs in the congestus and anvil cirrus regimes since these regimes are simulated with less than half the frequency of ISCCP in most GCMs. When MODIS/ERBE observations are used (not shown), the increased variance occurs through the deep convection regime (rather than congestus/anvil cirrus) which is simulated less frequently in several GCMs than in MODIS.

Similarly to the tropics, the largest variance in the extra-tropics is from the stratocumulus regime and this is slightly reduced for the \(\overline{\Updelta \hbox{NCRF}_{{\rm obs}}}\) of this regime. Also similarly to the tropics, the variance of the congestus regime is increased between the models when the observed NCRF and RFO are used. The global response may be obtained by area weighting the response from each region (Fig. 5). In most models the global \(\overline{\Updelta \hbox{NCRF}}\) is a balance between a positive change in NCRF due to changes in the frequency of regimes and a negative change in NCRF due to changes in the radiative properties of the regimes. The former is primarily the result of regimes with a high NCRF (such as stratocumulus) reducing in frequency and being replaced by shallow cumulus with a small NCRF. The latter is mainly a result of several regimes becoming thicker and so more reflective. This is seen in stratocumulus over ice and snow covered regions, and changes in the regime histograms (ESM 10) reveal that at least some of the response is an actual change in regime τ rather than an artifact of cloud masking over ice/snow (e.g. Soden et al. 2004). It is also seen in the GFDL model with the tropical high-top cloud regimes becoming brighter (ESM 8). In CCSM3 a reduction in LCRF for the tropical high-top cloud regimes also contributes to the negative change in NCRF. Using NCRFobs and RFOobs in Eq. 1 generally reduces the variance in the ΔRFO driven contribution, but increases the variance in the contribution driven by ΔNCRF. Therefore, in contrast to the findings of WT07 for the smaller GCM ensemble, the variance in the global mean \(\overline{\Updelta \hbox{NCRF}}\) is not reduced (and is actually slightly increased) in the observationally constrained case, but the ensemble mean \(\overline{\Updelta \hbox{NCRF}}\) decreases and becomes negative. 
WT07 showed that \(\overline{\Updelta \hbox{NCRF}_{{\rm obs}}}\) may be used to illustrate the potential effect on the climate sensitivity of simulating present-day regimes more accurately. As found by WT07, the climate sensitivity of models at the upper end of the range is reduced. In this larger ensemble, however, the climate sensitivity of the models with the lowest sensitivities is also reduced, so the range and standard deviation of the climate sensitivity are similar in the constrained and unconstrained cases, but the ensemble mean sensitivity is reduced by 0.5 K.
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-008-0443-1/MediaObjects/382_2008_443_Fig5_HTML.gif
Fig. 5

Contribution of each region to the global mean \(\overline{\Updelta\hbox{NCRF}},\) normalised by the global mean surface temperature response. The means for each region are weighted by the area of that region and the shortwave component is also weighted by the fraction of data-points which are sun-lit. Also shown in the pale colours are the changes in NCRF if observations of present-day RFO and NCRF are used with the models’ ΔNCRF and ΔRFO. The ensemble mean and standard deviation are also shown
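The regional weighting described in this caption can be sketched as follows. This is an illustration only: the function names, the dictionary layout, the area weights and the response values are all hypothetical, and NCRF is assumed to be the sum of the sun-lit-weighted SCRF and the LCRF.

```python
def region_delta_ncrf(d_scrf, d_lcrf, sunlit_fraction):
    """Regional change in NCRF, with the shortwave part weighted by the
    fraction of data-points which are sun-lit."""
    return sunlit_fraction * d_scrf + d_lcrf


def global_delta_ncrf(regions):
    """Area-weighted sum of the regional responses."""
    return sum(
        reg["area_weight"]
        * region_delta_ncrf(reg["d_scrf"], reg["d_lcrf"], reg["sunlit_fraction"])
        for reg in regions
    )


# Hypothetical regional values (area weights sum to one).
regions = [
    {"name": "tropics", "area_weight": 0.45,
     "d_scrf": 0.8, "d_lcrf": -0.2, "sunlit_fraction": 1.0},
    {"name": "ice-free extra-tropics", "area_weight": 0.45,
     "d_scrf": 0.5, "d_lcrf": -0.1, "sunlit_fraction": 0.9},
    {"name": "snow/ice covered", "area_weight": 0.10,
     "d_scrf": -2.0, "d_lcrf": 0.3, "sunlit_fraction": 0.5},
]

global_response = global_delta_ncrf(regions)
print(global_response)
```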

It can be valuable to GCM developers to know which regimes contribute most to the spread between GCMs in the cloud radiative response to climate change. This can be quantified as the fractional contribution of each regime to the global variance in \(\overline{\Updelta \hbox{NCRF}}\) (Vr). Following Boer and Yu (2003) and Webb et al. (2006), this may be estimated from:
$$V_{r} = \sum_{m=1}^{\hbox{nmodels}} \frac{\overline{\Updelta \hbox{NCRF}}_{m}^{+} \, \overline{\Updelta \hbox{NCRF}}_{r m}^{+}}{\sigma^{2} (\hbox{nmodels}-1)} $$
(2)
where \(\overline{\Updelta \hbox{NCRF}}_{m}^{+}\) and \(\overline{\Updelta \hbox{NCRF}}_{r m}^{+}\) are the deviations of the individual models from the ensemble mean and \(\sigma^{2}\) is the ensemble variance of \(\overline{\Updelta \hbox{NCRF}}.\) Figure 6 gives the contribution of each regime to the global variance in \(\overline{\Updelta \hbox{NCRF}}\) following Eq. 2. The contribution of some of the regimes is negative, indicating that they are reducing the variance through anti-correlation with other regimes. Consistent with the findings of Bony and Dufresne (2005) and Webb et al. (2006), most of the variance of \(\overline{\Updelta \hbox{NCRF}}\) in this ensemble comes from the low cloud regimes. The advantage of the analysis presented here is that the cloud regimes responsible may be specifically identified as stratocumulus and transition clouds, which together contribute 65% of the global variance (47 and 18%, respectively). An additional 18% of the variance arises from the congestus regime. There is a slightly smaller and negative (−14%) contribution from shallow cumulus despite its high RFO. By analysing the contribution of each model to the regime variance, it can be seen that the MIROC models dominate the variance in the response of the stratocumulus and transition regimes whilst GFDL accounts for most of the variance in the deep convection regime response (Fig. 6). GFDL and the MIROC models are responsible for most of the variance in the shallow cumulus response which is acting to reduce the global ensemble variance.
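Eq. 2 can be illustrated with a short calculation. The sketch below assumes that each model's total response is the sum of its regime contributions and that σ² is the sample variance (divisor nmodels − 1); the function name and the three-model, two-regime values are hypothetical. Under these assumptions the fractions sum to one, and a regime which is anti-correlated with the total receives a negative fraction.

```python
from statistics import mean, variance


def variance_fractions(regime_responses):
    """Fractional contribution of each regime to the ensemble variance (Eq. 2).

    regime_responses[m][r] is the contribution of regime r to the
    response of model m; the model total is the sum over regimes.
    """
    n_models = len(regime_responses)
    n_regimes = len(regime_responses[0])
    totals = [sum(row) for row in regime_responses]
    total_mean = mean(totals)
    total_dev = [t - total_mean for t in totals]   # deviations of model totals
    sigma2 = variance(totals)                      # sample variance of the totals

    fractions = []
    for r in range(n_regimes):
        column = [row[r] for row in regime_responses]
        column_mean = mean(column)
        covariance_sum = sum(
            td * (c - column_mean) for td, c in zip(total_dev, column)
        )
        fractions.append(covariance_sum / (sigma2 * (n_models - 1)))
    return fractions


# Hypothetical ensemble: three models, two regimes.
responses = [
    [1.0, 0.2],
    [0.5, 0.4],
    [0.1, 0.3],
]
fractions = variance_fractions(responses)
print(fractions, sum(fractions))
```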
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-008-0443-1/MediaObjects/382_2008_443_Fig6_HTML.gif
Fig. 6

Black-outlined bars show the contribution to the total global variance in \(\overline{\Updelta\hbox{NCRF}}\) from each regime (%). Where the same regime is present in more than one region, the contributions have been combined (e.g. the stratocumulus regimes from all three regions have been combined). The thin and thick mid-level regimes over ice/snow covered regions have also been combined. Thin coloured bars show the contribution from each model to the regime variance (obtained by setting each model to the ensemble mean in turn and calculating the effect on the variance). The stacked bar on the right shows the overall contribution of each model to the total global variance in \(\overline{\Updelta\hbox{NCRF}}\)
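One possible reading of the per-model calculation described in this caption is sketched below. It assumes that the sample variance is used and that the ensemble mean is taken from the full ensemble before the substitution; neither detail is stated in the caption, and the function name and values are hypothetical.

```python
from statistics import mean, variance


def model_contributions(values):
    """Reduction in ensemble variance when each model in turn is set
    to the full-ensemble mean."""
    base_variance = variance(values)
    ensemble_mean = mean(values)
    contributions = []
    for m in range(len(values)):
        replaced = list(values)
        replaced[m] = ensemble_mean        # set model m to the ensemble mean
        contributions.append(base_variance - variance(replaced))
    return contributions


# Hypothetical regime responses for four models; the last is an outlier.
responses = [1.0, 2.0, 3.0, 10.0]
contrib = model_contributions(responses)
print(contrib)
```

With these illustrative values the outlying model accounts for most of the variance, which is the kind of behaviour the thin coloured bars are intended to reveal.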

In a tropical analysis of aqua-planet simulations, Medeiros et al. (2008) conclude that different climate sensitivities in the GFDL and CCSM models are due to differing responses of shallow cumulus rather than stratocumulus. Our global analysis of ten models concludes the opposite (the contribution from stratocumulus is still larger than shallow cumulus even if the MIROC models are removed). The reason for the differing result is not clear but it may well be because we are considering a larger GCM ensemble, of which the behaviour of the GFDL and NCAR models is not representative. In addition, it should be noted that the Medeiros et al. (2008) experimental design differs from that used here in that they apply uniform changes in sea surface temperature for their climate change experiment whereas we use atmosphere–mixed-layer ocean models with increased CO2. Ringer et al. (2006) show that the global mean changes in NCRF can differ between the two, possibly explaining the different CRF sensitivities in the GFDL and NCAR models presented here compared with Medeiros et al. (2008) (the global mean cloud feedbacks in the CFMIP atmosphere–mixed-layer ocean versions of the GFDL and NCAR models are very similar).

5 Cloud regime error metrics

The above analysis illustrates that examining the simulation of cloud regimes, their properties and regional distributions can provide model developers with considerable information about cloud errors in their models. A decomposed analysis of the climate change response also contributes to our understanding of the causes of different cloud responses to climate change amongst GCMs. As well as this detailed analysis, it may be useful to have a simple scalar metric which incorporates as much of this information as possible.

Figure 2 contains a large amount of information which could be used to form one or more metrics. It is envisaged that metrics based on cloud regimes would be included in a basket of measures covering many aspects of GCM performance, so for practical reasons condensing the information into one or two metrics is desirable. In order to do this it is necessary to combine different variables associated with the regimes and determine how the regimes will be weighted. The specific application for which the GCM is being assessed (e.g. ability to project global mean temperature) will determine these decisions. In this section we present two metrics based on the TOA radiative effects of the regimes. Both evaluate the same aspects of the regimes, however the weighting of the regimes in the first provides a cloud regime error metric (CREM) targeted at the radiative effects of the regimes in the present-day climate (CREMpd), whilst the weighting in the second provides a metric targeted at the climate change cloud radiative response (CREMcc), and hence this aspect of climate sensitivity. Note that the two metrics are not independent and it is not intended that both would be used within the same basket of measures.

A measure of the present-day radiative effect of each regime (r) in each geographical region is
$$ \hbox{CREMpd}_{r} = \hbox{aw} \sqrt{(\hbox{NCRF}_{r}' \hbox{RFO}_{{\rm obs}\, r})^2 + (\hbox{RFO}_{r}' \hbox{NCRF}_{{\rm obs}\, r})^2} $$
(3)
where NCRFr′ and RFOr′ are regime mean differences between the model and ISCCP observations, RFOobsr and NCRFobsr are the ISCCP observed regime mean values which are used to weight the regimes, and aw is the respective area weight for the tropics, ice-free extra tropics and ice-snow covered regions. Since regimes are not defined during the polar night (the ISCCP diagnostics are only available at sun-lit points), in calculating the NCRF, the SCRF is weighted by the fraction of daily mean data-points which are sun-lit in the region (WT07). A single global cloud regime error metric (CREMpd) can then be expressed as:
$$ \hbox{CREMpd} = \sqrt{\frac{\sum_{r=1}^{\hbox{nregimes}} \hbox{CREMpd}_{r}^{2}}{\hbox{nregimes}}} $$
(4)
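A minimal sketch of Eqs. 3 and 4 is given below. The function names, the tuple layout and the regime values are hypothetical; RFO is assumed to be held as a fraction and NCRF in W m−2, with the sun-lit weighting of SCRF already applied.

```python
import math


def crem_pd_regime(ncrf_err, rfo_err, ncrf_obs, rfo_obs, area_weight):
    """Eq. 3: present-day error for one regime in one region.

    ncrf_err, rfo_err : model minus ISCCP regime-mean NCRF and RFO
    ncrf_obs, rfo_obs : ISCCP regime-mean NCRF and RFO (the weights)
    """
    return area_weight * math.hypot(ncrf_err * rfo_obs, rfo_err * ncrf_obs)


def crem_pd(per_regime):
    """Eq. 4: root mean square of the per-regime values."""
    return math.sqrt(sum(c * c for c in per_regime) / len(per_regime))


# Hypothetical regimes: (NCRF', RFO', NCRF_obs, RFO_obs, aw)
regimes = [
    (-4.0, 0.10, -30.0, 0.25, 0.5),
    (3.0, 0.00, -20.0, 0.50, 0.5),
]
per_regime = [crem_pd_regime(*r) for r in regimes]
print(per_regime, crem_pd(per_regime))
```

A model whose regime means equal the observations has NCRF′ = RFO′ = 0 for every regime, so the metric is zero.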
As an alternative, WT07 propose that the difference between \(\overline{\Updelta\hbox{NCRF}}\) and \(\overline{\Updelta\hbox{NCRF}_{{\rm obs}}}\) could provide a metric which uses the climate change response to weight the regimes, and is therefore targeted at the climate change cloud radiative response. Whilst this metric would include much of the relevant information, it is possible to achieve a favourable result through a compensation of errors between regimes. Also, using the individual model’s climate change response may be undesirable since, for example, a regime which has a large response in most GCMs (so may be considered generally important) will be down-weighted in a model which has an unusually weak response for that regime. We therefore propose the following revised metric for each regime targeted at the cloud radiative response under climate change:
$$ \hbox{CREMcc}_{r} = \hbox{aw} \sqrt{(\hbox{NCRF}_{r}' \langle \Updelta\hbox{RFO}^{*}_{r} \rangle)^2 + (\hbox{RFO}_{r}' \langle \Updelta\hbox{NCRF}^{*}_{r} \rangle)^2} $$
(5)
As for CREMpd, NCRFr′ and RFOr′ are differences between the model and ISCCP observations, however in this case the weighting is through \({\langle}\Updelta\hbox{RFO}^{*}_{r} \rangle\) and \(\langle \Updelta\hbox{NCRF}^{*}_{r} \rangle\) which are the GCM ensemble mean 2 × CO2 equilibrium response divided by the climate sensitivity. As before, in calculating the NCRF and ΔNCRF, the SCRF is weighted by the fraction of daily mean data-points which are sun-lit in the region. The single global cloud regime error metric (CREMcc) is given by:
$$ \hbox{CREMcc} = \sqrt{\frac{\sum_{r=1}^{\hbox{nregimes}} \hbox{CREMcc}_{r}^{2}}{\hbox{nregimes}}} $$
(6)
By using the GCM ensemble-mean climate change response to normalise NCRFr′ and RFOr′ in Eq. 5, we are effectively up-weighting those regimes where GCMs agree on a large climate change response. Those for which there is little change in any GCM or the GCMs disagree on the sign of the response tend to have an ensemble mean closer to zero and so will be down-weighted. Of course, a consensus in the response amongst GCMs does not indicate that the response is correct, but we believe that this is still more useful than CREMpd for targeting the global-mean cloud radiative response. Since there are several model versions from the same centre in this study, we only include HadCM3 and HadGEM1 (which are structurally significantly different) from the Met Office Hadley Centre models and only use MIROC-ls from the two MIROC models in generating the ensemble mean. We find that removing any one model from the ensemble does not significantly alter the results presented here. The values of \(\langle\Updelta\hbox{RFO}^{*}_{r} \rangle\) and \({\langle}\Updelta\hbox{NCRF}^{*}_{r} \rangle\) are given in Table 4 to allow others to calculate the metric for other GCMs. When data from more GCMs become available, it is intended that these ensemble means will be re-calculated.
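The effect of the ensemble-mean weighting in Eq. 5 can be illustrated as follows. The two pairs of weights are the tropical stratocumulus and deep convection values from Table 4; the model errors, the area weight and the function names are hypothetical, and the RFO error is assumed to be expressed in percent to match the units of the Table 4 weights.

```python
import math

# Ensemble-mean weights for two tropical regimes, copied from Table 4:
# (<dRFO*> in % K-1, <dNCRF*> in W m-2 K-1)
TABLE4_TROPICS = {
    "Stratocumulus": (-0.248, -0.123),
    "Deep convection": (0.031, -0.099),
}


def crem_cc_regime(ncrf_err, rfo_err, d_rfo_star, d_ncrf_star, area_weight):
    """Eq. 5: error for one regime, weighted by the ensemble-mean response."""
    return area_weight * math.hypot(ncrf_err * d_rfo_star, rfo_err * d_ncrf_star)


def crem_cc(per_regime):
    """Eq. 6: root mean square of the per-regime values."""
    return math.sqrt(sum(c * c for c in per_regime) / len(per_regime))


# Hypothetical model-minus-ISCCP errors, applied identically to both regimes.
ncrf_err = -5.0   # W m-2
rfo_err = 2.0     # %
area_weight = 1.0

per_regime = {
    name: crem_cc_regime(ncrf_err, rfo_err, d_rfo, d_ncrf, area_weight)
    for name, (d_rfo, d_ncrf) in TABLE4_TROPICS.items()
}
print(per_regime, crem_cc(list(per_regime.values())))
```

For these illustrative errors, the same control bias is penalised more heavily in the stratocumulus regime than in the deep convection regime, because the ensemble-mean response of the former is larger.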
Table 4

The ensemble mean ΔRFO and ΔNCRF divided by the climate sensitivity, as used in Eq. 5

| Regime | \({\langle}\Updelta\hbox{RFO}^{*}_{r} \rangle\) (% K−1) | \({\langle}\Updelta\hbox{NCRF}^{*}_{r} \rangle\) (Wm−2 K−1) |
|---|---|---|
| Tropics | | |
|  Shallow cumulus | 0.205 | 0.059 |
|  Congestus | −0.131 | −0.507 |
|  Thin cirrus | 0.297 | −0.037 |
|  Stratocu./Cu. transition | −0.168 | −0.018 |
|  Anvil cirrus | 0.012 | 0.414 |
|  Deep Convection | 0.031 | −0.099 |
|  Stratocumulus | −0.248 | −0.123 |
| Ice-free extra-tropics | | |
|  Shallow cumulus | 0.306 | 0.054 |
|  Congestus | −0.164 | −0.579 |
|  Stratocu./Cu. transition | −0.282 | −0.501 |
|  Cirrus | −0.097 | −0.066 |
|  Stratocumulus | −0.159 | −0.337 |
|  Frontal | 0.179 | 0.450 |
|  Thin cirrus | 0.217 | 0.125 |
| Snow/ice covered | | |
|  Shallow cumulus | −1.410 | −0.046 |
|  Stratocumulus | 0.385 | −1.968 |
|  Thick mid-level | 0.908 | −1.944 |
|  Frontal | 0.726 | −1.633 |
|  Thin mid-level | −0.360 | −0.925 |
|  Thin cirrus | −0.249 | 0.026 |

This ensemble mean excludes HadCM4, HadGEM2 and MIROC-hs

The metrics for each regime (given by Eqs. 3, 5) and the total CREMpd and CREMcc (given by Eqs. 4, 6) are shown in Fig. 7. The smaller the metric for a particular GCM, the closer its control cloud regimes are to the ISCCP observations. Observational data from MODIS/ERBE have been used as though they were another model in order to provide an observational uncertainty on the metric. The errors due to the high RFO of frontal clouds in ECHAM5 and NCRF being too negative in GFDL can be clearly seen in both metrics. The overall ‘ranking’ of the models is similar in both cases, although the CREMcc metric tends to discriminate slightly more between the models which are closest to ISCCP observations. This is mainly due to the stratocumulus regime and regimes over snow and ice being up-weighted in CREMcc due to their comparatively large and consistent (in terms of sign) climate change response amongst the GCMs. Since the GCMs vary in their ability to accurately simulate these regimes which are important for the climate change response, CREMcc provides more variation between the models. HadCM3 has a poor simulation of stratocumulus and so is more heavily penalised in CREMcc, whereas CCCMa has the smallest CREMcc in the ensemble, just within the observational uncertainty (however CCCMa is not within observational uncertainty for all of the regimes). The good CREMcc for CCCMa appears to be due to a particularly good simulation of tropical and ice-free extra-tropical low cloud regimes, combined with a reasonable simulation for the other regimes. Much of the uncertainty in the observational datasets surrounds the cirrus (thick and thin), congestus and shallow cumulus regimes and it may be questioned whether the difference between ISCCP and MODIS and between ISCCP-FD and ERBE reflects the true uncertainty. However, even if the observational uncertainty is increased by 50% for all of these more uncertain regimes, CCCMa remains the only model consistent with the observations for CREMcc on average.
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-008-0443-1/MediaObjects/382_2008_443_Fig7_HTML.gif
Fig. 7

The CREMpd obtained from Eqs. 3 and 4 (top) and the CREMcc obtained from Eqs. 5 and 6 (bottom). Results for each regime are shown together with the total metric (far left). The metric would equal zero if the model regime characteristics equalled ISCCP/ISCCP-FD. The metric has been calculated for MODIS/ERBE observations (shown grey), hence the error bar provides an estimate of observational uncertainty

6 Conclusions

This study has presented an alternative method for assigning GCM ISCCP simulator data to cloud regimes obtained from clustering ISCCP observational data. The errors in τ-CTP histograms from contemporary GCMs have required the approach to be more tolerant than has been applied previously. Assigning to a common set of regimes has a number of advantages over independent clustering of the model data, including removing some of the subjective elements involved in obtaining and comparing the regimes. This allows automation of the method and permits others to unambiguously obtain the assigned cloud regimes from a GCM. The more tolerant assignment methodology requires only 3 daily mean cloud fields (in addition to CRF) rather than the 49 for the cloud clustering methodology of WT07, making it more accessible to large ensemble experiments. The diagnostics required for the analysis presented here have been included on preliminary diagnostic lists for the next phase of the Coupled Model Intercomparison Project (CMIP). This will provide the opportunity to apply the analysis to a comprehensive set of contemporary GCMs.

A number of questions were posed in the introduction to provide motivation for the study. In answer to these, our key findings are:
  1. Generally less than half, and in most cases only one or two GCMs simulate regime properties (RFO, CRF, etc.) which are within observational uncertainty. None of the models studied simulate the properties of all of the regimes well. There is a bias for the stratocumulus and transition cloud regimes to be too reflective in the GCMs, in most cases due to excessive cloud optical depths. The observational uncertainty tends to be smaller for regime CRF rather than RFO. For optically thicker regimes, particularly for low clouds, ISCCP and MODIS appear to agree reasonably well on the regime histogram and RFO, providing a tight observational constraint for GCM evaluation. For thin high clouds, these observational datasets disagree more and possibly active instruments (such as Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO), Winker et al. 2007) are required to discriminate between the GCMs.
  2. Sixty-five percent of the variance in the equilibrium cloud radiative response to climate change amongst the GCM ensemble used here is due to the stratocumulus and transition cloud regimes. This is found to be dominated by the MIROC models. In contrast with the conclusions of some previous studies, a smaller (and negative) contribution to the variance occurs from the shallow cumulus regime despite this being the most frequently simulated.
  3. Sixty-eight percent of the variance in the climate change response of the stratocumulus regime is through differing radiative ΔRFO driven contributions (i.e. the NCRF ΔRFO term). If models were able to reproduce the observed NCRF of this regime more accurately, all other things being equal, this variance would reduce. However, if the RFO of some of the high-top cloud and congestus regimes were more realistic, and the ΔNCRF of the regime remained unchanged, it would increase the variance.
  4. The principal errors identified by WT07 can still be identified using this new methodology (e.g. excessive RFO of frontal clouds in ECHAM5), although how they manifest can change (e.g. the tropical cumulus and transition regimes in the MIROC models being too optically thick, rather than the stratocumulus RFO being too high). The importance of the stratocumulus regime to the spread in climate change response, and the result that it is in part due to inaccuracies in the control simulation, are also consistent with WT07. The GFDL and CCSM3 models appear to have a larger negative ΔNCRF than the other models for high-top cloud regimes, but underestimate the RFO. As a result, when the RFO is set to the observed value, the response of these regimes increases which counteracts the reduced variance from constraining the low cloud. Hence in contrast to WT07, the present ensemble indicates that variance in the global cloud response and climate sensitivity would not be reduced if the regimes were simulated more realistically, but the ensemble mean climate sensitivity could decrease by around 0.5 K.

Two cloud regime error metrics are proposed to measure how well the control RFO and NCRF are simulated. One uses the observed RFO and NCRF to weight the regimes and hence is targeted at the present-day radiative effect of the cloud regimes. The other metric weights the regimes based on the extent to which GCMs consistently indicate a large change in the regime in their climate change response. In the ensemble analysed here, CCCMa is closest to observations for this metric, reflecting a good simulation of the stratocumulus and transition NCRF. We wish to stress that the proposed metrics are focused on the global TOA cloud radiative effect and the cloud radiative response to climate change (and hence climate sensitivity). Of course, GCMs have many more applications than providing a global temperature response, hence we propose that the metrics should be part of a basket of metrics similar to Murphy et al. (2004) or Gleckler et al. (2008), with CREMpd used in present-day evaluation of GCMs and CREMcc used as part of a basket to assess overall confidence in a GCM’s ability to simulate climate change. Further metrics based on cloud regimes could also be envisaged for other aspects of GCM performance (three-dimensional heating rates, etc.). These may be more suited to assessing cloud regimes defined with vertical profile information (Zhang et al. 2007) rather than the passive cloud-top view used here.

In addition to a top-level metric contributing to a holistic GCM assessment, detailed analysis of the cloud regimes provides model developers with considerable information on the weaknesses of the cloud simulation in their model. Analysis of the climate change response also increases our understanding of the causes of differences in cloud response between GCMs. This increased understanding, alongside evaluation, is a key aim of the second phase of CFMIP (http://www.cfmip.net) and cloud regime analysis may prove a useful technique for linking the understanding and evaluation aims of the project.

The CFMIP phase 1 data used in this study are openly available from the Program for Climate Model Diagnosis and Intercomparison (PCMDI)—see http://www.cfmip.net for more details.

Acknowledgments

This work was supported by the Joint Defra and MoD Programme (Defra) GA01101 (MoD) CBC/2B/0417_Annex C5. We thank Robert Pincus for providing the MODIS data and useful conversations around developing the CREMs. We also thank Alejandro Bodas-Salcedo, Philip Brohan, William Ingram and Mark Ringer for their comments on early drafts of the paper. We acknowledge the modelling groups for making their simulations available for analysis, PCMDI for collecting and archiving the CMIP3 and CFMIP model output, and the WCRP’s Working Group on Coupled Modelling (WGCM) for organising the model data analysis activity. The WCRP CMIP3 and CFMIP multi-model datasets are supported by the Office of Science, US Department of Energy. ISCCP and ERBE data were obtained from the NASA Langley Research Center Atmospheric Sciences Data Center.

Supplementary material

382_2008_443_MOESM1_ESM.pdf (8.4 mb)
Electronic supplementary material (PDF 8559 kb)

Copyright information

© Crown copyright 2008