1 Introduction

Modern climate science, wherein descriptive pictures of the climate system began to be complemented by quantitative theory, is only about a hundred years old. In the late nineteenth century understanding of radiative transfer, particularly at the infrared wavelengths associated with terrestrial radiation, was developing rapidly, and it became possible to formulate quantitative descriptions of the relationship between the flux of energy through the Earth system and quantities like the average surface temperature. Exemplary in this respect are Arrhenius’ 1896 calculations suggesting that changes in carbon dioxide would induce changes in the surface temperature. Arrhenius’ study was essentially an exercise in radiative transfer in which he quantified the flow of solar and terrestrial radiation through the Earth system and the roles of the various processes that influenced these transports. Greenhouse gases such as carbon dioxide and water vapor played a key role in Arrhenius’ calculations, and other factors influencing solar radiation such as clouds and surface properties were also accounted for. Although prescient in many respects, for instance with respect to the role of the carbon cycle and feedbacks associated with water vapor and surface albedo, Arrhenius did not touch on a number of issues that have come to dominate the discourse with respect to climate change. Among these are the rate of increase in ocean enthalpy and changes in the ocean circulation; changes in patterns of precipitation; the role of aerosols, both in influencing clouds and in the energy budget as a whole; and also the possibility that changes in cloudiness may enhance or offset other changes in the climate system.

An increased emphasis on precisely those issues largely left out of the early studies has marked a fundamental shift in climate science over the last forty or so years. Through the latter part of the 1960s, climate science was preoccupied with resolving controversies related to the radiative transfer that was the basis for the analysis of Arrhenius and his followers. Confusion about the nature and importance of the details of the spectroscopy of CO2 and H2O in the thermal infrared (e.g., Plass 1956; Kaplan 1960) was resolved only by the calculations of Manabe and Wetherald (1967), which showed that earlier controversies lost relevance when the vertical structure of the atmosphere is properly accounted for (cf., Pierrehumbert 2011). Their research propelled the field into the current era, wherein qualitatively new questions, in particular the role of clouds and aerosol particles, demanded a more detailed understanding and accounting of the energy flows through the climate system.

Arrhenius was perhaps the first to appreciate how small changes in flows of energy through the system can have a large impact on Earth’s climate. A doubling of the mixing ratio (commonly “concentration”) of atmospheric CO2, which is the paradigmatic example of a forcing of climate change, gives rise to a radiative perturbation of 3–4 W m−2, which is about 1 % of the solar radiation incident at the top-of-atmosphere (Footnote 1); the total radiative forcing attributed to long-lived greenhouse gases introduced through human activities over the industrial era thus far is about 3 W m−2. By way of comparison, in the tropics, the diurnal variation of incident solar radiation is more than 1,000 W m−2. The presence of a high cloud can change the outgoing long-wave radiation by 100–200 W m−2, comparable in magnitude to seasonal changes in radiative fluxes, albeit more short lived. From the perspective of climate system response, the central value of current estimates of the increase in global mean surface temperature that would result from a doubling of CO2, 3 K, is about 1 % of the global mean surface temperature, 288 K, and again, much less than geographical and temporal variability. The rather large consequences of such small changes in Earth’s energy flows, and the complexity of the system that mediates these flows, make determining the effects of changing atmospheric composition on Earth’s energy flows, through measurement or modeling, a challenging scientific problem.

In the following, we reflect on this challenge as framed by discussions at a recent ISSI workshop titled “Observing and Modelling Earth’s Energy Flows”. Our presentation is organized around three basic issues: (1) what is the status of present understanding of Earth’s energy budget; (2) how does the composition of the atmosphere, particularly clouds and aerosols, influence this budget; and (3) how can modeling help constrain a description of processes regulating the flow of energy through the climate system. These issues, particularly those aspects central to advancing understanding of the climate system, are discussed in turn below. Although the discussion presented herein benefitted greatly from presentations and discussions at the Workshop, responsibility for the material presented rests with the present authors and should not be taken as representing the views of participants in the workshop or as a workshop consensus.

2 Present Understanding of Earth’s Energy Budget

2.1 Global Energy Balance

Measurements have been indispensable to the advancing understanding of energy flows through the climate system. Current understanding of these flows is summarized in Fig. 1, which we have constructed based on the available literature (especially Trenberth et al. 2009; Kato et al. 2012; Stephens et al. 2012). The energy flows are more certain at the top of the atmosphere than at the surface, as measurements at the top of the atmosphere have benefitted greatly from advances in satellite remote sensing. Measurements from the Earth Radiation Budget Experiment (ERBE) (Ramanathan 1987) and now those from the Clouds and Earth’s Radiant Energy System (CERES) mission (Wielicki et al. 1996) have convincingly shown that Earth reflects less short-wave radiation and emits more long-wave radiation than previously thought (cf., Table 1) and have contributed to closure of the top-of-atmosphere energy budget to within a few watts per square meter.

Fig. 1

Earth’s global and annual mean top-of-atmosphere (TOA) and surface energy budget. Values are presented as a one-sigma range (which is to say the authors assign a roughly 68 % likelihood that the actual values fall within the stated range). These values are subjectively determined based on a review of the literature, and complemented by global simulations, as discussed in the Appendix.

Table 1 Estimates of Earth’s energy budget, subjectively determined based on a review of the existing literature and the best estimates of the net imbalance at the surface and top-of-atmosphere

Also contributing to the reduction of uncertainty in the balance of energy flows at the top of the atmosphere are new measurements of total solar irradiance and ocean heat uptake. The total solar irradiance determined by the Solar Radiation and Climate Experiment, 1,360.8 ± 0.5 W m−2 at solar minimum (Kopp and Lean 2011), is well below the range of previous estimates as summarized in Table 1. Confidence in these lower estimates and their associated assessment of uncertainty is gained through the identification of artifacts in the older measurements. Because the atmosphere has a relatively small heat capacity, an imbalance of energy flows at the TOA can be sustained only by an increase in ocean enthalpy, augmented to a lesser extent by melting of the cryosphere and warming of the land surface and the atmosphere. As discussed by Lyman (2011), the rate of heating of the top 700 m of the world ocean over the period from 1993 to 2008, as inferred from temperature measurements and expressed per the area of the entire planet, is 0.64 ± 0.11 W m−2 (90 % confidence interval). Sparser measurements extending to ocean depths of 3 km reported by Levitus et al. (2005) suggest that the upper ocean takes up about three-quarters of the ocean heating. Levitus et al. (2005) also estimate that the contributions of other enthalpy sinks, including the atmosphere, the land, and the melting of ice, can account for an additional 0.04 W m−2; based on these estimates the flux imbalance at the TOA is estimated to be 0.9 ± 0.3 W m−2. The uncertainty is based on the 90 % confidence interval given by Lyman and the assumption that the relative uncertainties in the deep ocean enthalpy uptake estimates and in the estimates of heating by other components of the Earth system are about 50 % (Footnote 2). For reference, in constructing the energy-balanced version of the CERES data, Loeb et al. (2009) estimated the surface enthalpy uptake to be 0.85 W m−2, similar to the 0.9 W m−2 estimated here and employed by Trenberth et al. (2009) based on a somewhat different line of reasoning. An analysis based on more recent ARGO data and a revised analysis of the CERES measurements (Loeb et al. 2012) suggests a somewhat lower central value of 0.50 ± 0.43 W m−2 for the rate of increase in Earth system enthalpy, per unit area of Earth’s surface. In summary, although there is a rather large, 6.5 W m−2, inherent uncertainty in measurements of the reflected short-wave and emitted long-wave radiation at the TOA, associated principally with uncertainty in the absolute calibration of the CERES instruments (Loeb et al. 2009), the net irradiance at the TOA is constrained to within 1 W m−2 by improved measurements of increases in ocean enthalpy.
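The central value of the imbalance quoted above can be retraced with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative, and the quoted ± 0.3 W m−2 uncertainty is not reproduced because its derivation is not spelled out here.

```python
# Central-value bookkeeping for the TOA flux imbalance (all fluxes in W m-2,
# expressed per unit area of the entire planet). Variable names are illustrative.
upper_ocean = 0.64      # heating rate of the top 700 m, 1993-2008 (from the text)
upper_fraction = 0.75   # upper ocean takes up about three-quarters of ocean heating
other_sinks = 0.04      # atmosphere, land and melting ice (from the text)

total_ocean = upper_ocean / upper_fraction   # whole-ocean heating rate
deep_ocean = total_ocean - upper_ocean       # implied contribution below 700 m
toa_imbalance = total_ocean + other_sinks    # estimate of the net TOA imbalance

print(f"total ocean uptake: {total_ocean:.2f} W m-2")
print(f"deep ocean uptake:  {deep_ocean:.2f} W m-2")
print(f"TOA imbalance:      {toa_imbalance:.2f} W m-2")
```

Rounded to one decimal place, the result agrees with the 0.9 W m−2 central value stated above.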

The surface energy budget is distributed over several terms, each of which exhibits uncertainty that is several-fold greater than the uncertainty in the net budget at TOA. As pointed out by Trenberth et al. (2009), if each of the terms in the surface energy budget is estimated individually, in isolation from the others, an imbalance can arise in the net surface flux that is as much as 20 W m−2; this is more than an order of magnitude greater than current measurement-based estimates of the rate of increase in the enthalpy of the ocean, cryosphere, and land. In absolute terms, the uncertainty is largest for the long-wave irradiance downwelling at the surface and for the latent heat flux (precipitation). Here too, new measurements and improved modeling of energy flows are beginning to improve understanding. Active remote sensors such as the cloud profiling radar flown as part of the CloudSat mission (Stephens et al. 2008) and the Cloud-Aerosol Lidar with Orthogonal Polarization flown as part of the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation mission (Winker et al. 2010) are providing improved estimates of the vertical distribution of clouds, particularly cloud base. These measurements are crucial for estimates of the downward long-wave irradiance, which are based on radiative transfer modeling given a realistic atmospheric state. Using these measurements, Kato et al. (2012) estimate a downward long-wave irradiance at the surface of 345.4 ± 6.9 W m−2, which is considerably larger than the value derived by Trenberth et al. (2009) from the residual of the surface energy balance given existing precipitation climatologies. CloudSat and surface-based measurements also suggest that existing precipitation climatologies underestimate light rain from warm clouds, which is common over the tropical ocean (Nuijens et al. 2009; Stephens et al. 2012), and which may help reconcile the differences in the energy-based versus precipitation-based estimates of the strength of the hydrological cycle. Although still large, overall the uncertainty in the surface energy budget, as presented in Fig. 1, is somewhat smaller than that presented by Stephens et al. (2012), as global modeling (including reanalysis) estimates constrained by observations are given somewhat more weight in the present analysis, particularly for the upward turbulent fluxes and the long-wave irradiance.

Advances in understanding of Earth’s energy flows become most apparent when looking beyond the globally and annually averaged estimates of these flows. Measurements by CERES instruments track day-to-day variations in Earth’s energy flows on regional scales. Measurements extending over more than a decade have made it possible to explore how these energy flows fluctuate on intradecadal time-scales. As an example, Fig. 2 shows the existing record of monthly and globally averaged radiation anomalies. The radiation anomalies are calculated using the SSF1deg (single scanner footprint at one degree) data set, which includes information only from the twice daily measurements of the CERES platform on the polar orbiting Terra (ascending node equator crossing time 10:30 am) and Aqua (ascending node equator crossing time 1:30 pm). Unlike the CERES SYN or EBAF products, the SSF1deg data set does not incorporate measurements from a shifting complement of geostationary satellites, and thus more fully represents the inherent stability of the CERES measurements themselves. Monthly anomalies were constructed by subtracting the monthly climatology of the same data set from the global and monthly averaged values. The remarkable agreement between the Aqua and Terra measurements, over the period of the record where they coincide, suggests that the observed variability is an intrinsic property of Earth’s climate system, rather than an artifact due to instrument precision or insufficient sampling. This inherent variability limits quantification of secular trends in the flow of energy through the TOA.
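The anomaly construction described above, subtracting the monthly climatology of a data set from its monthly global means, can be sketched as follows. This is a minimal illustration on synthetic data, not the CERES processing chain; the function name `monthly_anomalies` and all numbers in the synthetic series are assumptions made for the example.

```python
import numpy as np

def monthly_anomalies(values, start_month=1):
    """Remove the mean annual cycle from a series of monthly means.

    values: 1-D sequence of monthly global means covering at least one full year.
    start_month: calendar month (1-12) of the first entry.
    """
    values = np.asarray(values, dtype=float)
    # Calendar-month index (0-11) of every entry in the series.
    months = (np.arange(values.size) + start_month - 1) % 12
    # Monthly climatology: average of all Januaries, all Februaries, and so on.
    climatology = np.array([values[months == m].mean() for m in range(12)])
    return values - climatology[months]

# Synthetic ten-year record (hypothetical): offset + seasonal cycle + noise.
rng = np.random.default_rng(0)
t = np.arange(120)
seasonal = 5.0 * np.sin(2.0 * np.pi * t / 12.0)
series = 0.8 + seasonal + rng.normal(0.0, 0.65, t.size)

anom = monthly_anomalies(series, start_month=3)
print(f"standard deviation of anomalies: {anom.std():.2f} W m-2")
```

By construction the anomalies average to zero for every calendar month, so what remains is the variability about the mean annual cycle.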

Fig. 2

Anomalies in the monthly and globally averaged top of the atmosphere net radiation from the CERES SSF1deg product. Anomalies are calculated by subtracting the CERES monthly climatology from the monthly fluxes. Regression line uncertainty is estimated assuming that monthly anomalies are independent. The standard deviation of the anomalies is about 0.65 W m−2. Aqua, pm; Terra, am

This variability in Earth’s energy flows, manifest as fluctuations in the anomaly of net irradiance as large as 3 W m−2 over annual to inter-annual timescales, e.g., Fig. 2, is expected to be mirrored in other components of the climate system. By combining these trends with concurrent measurements of ocean enthalpy, Wong et al. (2006) and Trenberth and Fasullo (2010b) have begun to explore this question. By drawing attention to what ended up being insignificant (and spurious) trends over the length of the record, Trenberth and Fasullo (2010b) have given what turns out to be a false impression of irreconcilable trends in the relation between ocean enthalpy increases and the TOA energy imbalance (Footnote 3). These results unfortunately overshadowed the real revolution in such approaches, namely the recognition that the present observing system shows evidence of large inter-annual fluctuations in the flow of energy through the Earth system and that these might be trackable as they work their way through the system, for instance from anomalies in the TOA net radiation to changes in ocean heat uptake.

We disagree with the assertion of Stephens et al. (2012) that the satellite measurements such as CERES have “not significantly changed our understanding of the TOA energy balance” relative to what was known based on the ERBE data. Although we acknowledge that estimates of the surface radiation budget do not significantly differ from those of Houghton (1954) more than a half century ago, we would argue in both cases that, when viewed from the perspective of initial attempts to infer the planetary energy balance from measurements looking down and up from Mt Wilson (Abbot and Fowle 1908), or on the basis of a network of pyrheliometers over North America (e.g., Houghton 1954), progress has been tremendous, even if the small magnitude of changes in estimates of the globally and annually averaged budgets (Table 1) would seem to belie this progress. The present era of global sensing is unprecedented: high-precision instruments are tracking Earth’s energy flows on decadal timescales and regional spatial scales; an absolute accuracy of better than one part per thousand has been achieved in measurements of solar irradiance and is helping to close the TOA energy budget; networks of ocean-going floats are quantifying ocean heat uptake; and surface-based networks and profiling instruments are beginning to advance understanding of the surface energy budget. As should become increasingly clear through the course of this article, sustaining and expanding such measurements will be key to further advances in understanding of Earth’s climate and its susceptibility to change.

2.2 Radiative Forcing, Response and Climate Sensitivity

In addition to providing understanding of energy flows in the current climate, consideration of the energy budget also provides a framework for understanding climate change. This framework, which has developed over the last 30 years, rests on the assumption that changes in the globally averaged surface temperature can be linearly related to a radiative forcing, F. The constant of proportionality between the forcing and the response is called the equilibrium climate sensitivity, \(S_{\rm{eq}},\) which can be formally defined as the steady-state change in \(T_{\rm s}\) (the globally averaged near-surface air temperature) that would result from a sustained change in a radiative flux component of the Earth energy budget at the TOA (forcing), normalized to that flux change, with unit: K (W m−2)−1. That is, the equilibrium sensitivity is the proportionality constant between the steady-state change in surface temperature and the applied forcing,

$$ \Updelta T_{\rm s} =S_{\rm{eq}} F. $$

Because it codifies the sensitivity of such an essential feature of the climate state, the globally averaged surface temperature, Eq. 1 provides a powerful description of past and prospective future climate change. To the extent that other properties of the climate system scale with its value, \(S_{\rm{eq}}\) assumes an even broader significance. For these reasons, determination of \(S_{\rm{eq}}\) has evolved into a central focus of climate science (e.g., Knutti and Hegerl 2008).

Equation 1 is well suited to the interpretation of models, for which the changes in the equilibrium state can be determined on the basis of numerical simulations. However, by extending the framework that leads to Eq. 1, a related framework has been developed which allows for the determination of \(S_{\rm{eq}}\) from observations. Assuming that the forcing, F, were known, it could be related to the rate of change in the enthalpy of the Earth system, \(\dot{H}\), and the temperature change between that at t = 0, defined to correspond to a time when the system is in stationarity and the forcing is applied, and that at some later time, t:

$$ \dot{H} =F - S_{\rm{eq}}^{-1} (T_{\rm s}(t)-T_{\rm s}(0)). $$

If both the forcing and the rate of change of the planetary enthalpy are known, the equilibrium sensitivity might be determined (Gregory et al. 2002) as

$$ S_{\rm{eq}} = {\frac{T_{\rm s}(t)-T_{\rm s}(0)} {F - \dot{H}}} $$

where the forcing and the change in global temperature are relative to a prior unperturbed state (at time, t = 0) and the rate of heating of the planet \(\dot{H}\) can be inferred either from satellite measurements at the top of the atmosphere or from the change in enthalpy of the planet inferred from ocean calorimetry as discussed by Lyman (2011). However, application of Eq. 3 is limited by the fact that changes in the enthalpy and surface temperature must be measured relative to an equilibrium state that is not generally known. To sidestep this issue Forster and Gregory (2006) and Murphy et al. (2009) proposed to determine \(S_{\rm{eq}}\) over shorter time-periods, not starting with the unperturbed state, as

$$ S_{\rm{eq}} = {\frac{T_{\rm s}^{\prime}}{F^{\prime}- \dot{H}^{\prime}}} $$

where primes denote changes in the quantities over a given time-period. The utility of any such determination of sensitivity rests on the uncertainties in the several quantities on the right-hand side of the equation.
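As a concrete illustration of Eqs. (3) and (4), the sketch below evaluates the sensitivity from a temperature change, a forcing, and a planetary heating rate. The function name and the input values are hypothetical and chosen only to show the arithmetic; they are not estimates from the literature.

```python
def equilibrium_sensitivity(delta_t_s, forcing, h_dot):
    """Evaluate S_eq = dT_s / (F - Hdot), in K (W m-2)-1.

    The same expression serves Eq. (3), with changes measured from the
    unperturbed state, and Eq. (4), with changes over a shorter period.
    """
    denominator = forcing - h_dot
    if denominator == 0.0:
        raise ValueError("F - Hdot is zero; the sensitivity is undefined")
    return delta_t_s / denominator

# Hypothetical inputs, for illustration only.
delta_t_s = 0.8   # K, change in global mean surface temperature
forcing = 1.8     # W m-2, change in forcing
h_dot = 0.9       # W m-2, rate of change of planetary enthalpy

s_eq = equilibrium_sensitivity(delta_t_s, forcing, h_dot)
print(f"S_eq = {s_eq:.2f} K (W m-2)-1")
```

Multiplying such a value by the forcing for a doubling of CO2 gives the corresponding equilibrium warming, following Eq. (1).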

Central to the conceptual framework surrounding the use of Eq. (1), or its time-dependent variants, are the assumptions: (1) that \(S_{\rm{eq}}\) does not depend on the nature or magnitude of the forcing; and (2) that the forcing resulting from a given change in atmospheric composition or surface properties can be determined unambiguously. Both assumptions become increasingly problematic if the forcing-sensitivity framework is pushed too far, for instance by demanding ever more certain estimates of \(S_{\rm{eq}}\). Models have long suggested that \(S_{\rm{eq}}\) depends on both the temperature and the nature of the forcing (e.g., Hansen et al. 1997; Colman and McAvaney 2009). For instance, Hansen et al. (1997) showed that if the TOA radiative forcing is equated with the initial radiative perturbation associated with a change in CO2, the change in \(T_{\rm s}\) will differ by as much as 25 % as compared to what arises for an equal radiative perturbation arising from a change in the solar irradiance. To get around this issue, the concept of an effective, or adjusted, forcing has been developed. The adjusted radiative forcing attempts to account for a rapid, and perturbation-dependent, adjustment of the climate system to a compositional change. The familiar example, and the one that motivated the original concept of an adjusted radiative forcing, is the adjustment of the stratosphere which proceeds differently depending on whether the initial radiative perturbation is caused by changes in well-mixed greenhouse gases as compared to changes in the solar irradiance, or stratospheric aerosols (e.g., Hansen et al. 1997). However, several studies have begun to show that, in addition to the stratosphere, changes in other components of the atmosphere, for instance clouds, also adjust rapidly in ways that depend on the nature of the external perturbation and thus must be factored into the calculation of an adjusted radiative forcing. 
An elegant framework for dealing with these adjustments has been proposed by Gregory et al. (2004). In their approach, the adjusted radiative forcing is defined as the y-intercept of the regression of the TOA radiative imbalance against \(T_{\rm s}\). This definition formalizes the idea of a rapid adjustment of the system to a perturbation which determines the effective, or adjusted radiative forcing, and which is temporally well separated from the slow response of the surface temperature to this adjusted radiative forcing. Of course this definition means that the forcing is no longer independent, or external to the system and thus cannot be determined a priori, for instance, through complete knowledge of radiative transfer and compositional changes alone. And as a result, internal physical processes, such as the nature of cloud adjustment, contribute to uncertainty in estimates of the effective radiative forcing driving changes in Earth’s climate.
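The regression definition of the adjusted forcing can be illustrated with synthetic data. In the sketch below, the TOA imbalance is generated from Eq. (2) with an assumed forcing and sensitivity plus noise, and a straight-line fit of imbalance against temperature change recovers the forcing as the y-intercept. All numerical values, the exponential warming curve, and the noise level are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

f_true = 3.5   # W m-2, hypothetical adjusted forcing
s_true = 0.8   # K (W m-2)-1, hypothetical equilibrium sensitivity

years = np.arange(1, 151)
# Hypothetical warming that relaxes toward the equilibrium value s_true * f_true.
delta_t = s_true * f_true * (1.0 - np.exp(-years / 30.0))
# TOA imbalance following Eq. (2), with noise standing in for internal variability.
imbalance = f_true - delta_t / s_true + rng.normal(0.0, 0.2, years.size)

# Linear regression of imbalance on temperature change;
# np.polyfit returns the slope first and the intercept second for degree 1.
slope, intercept = np.polyfit(delta_t, imbalance, 1)
f_est = intercept        # adjusted forcing: the y-intercept
s_est = -1.0 / slope     # sensitivity: negative inverse of the slope

print(f"estimated forcing:     {f_est:.2f} W m-2")
print(f"estimated sensitivity: {s_est:.2f} K (W m-2)-1")
```

Because the intercept is inferred from the response of the system itself, any rapid adjustment (for instance of clouds) is absorbed into the estimated forcing, which is the point made in the paragraph above.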

The dependence of the forcing on the system to which the perturbation is applied makes the framework surrounding Eq. (1) less useful, as it implies that both F and \(S_{\rm{eq}}\) depend on the system. The extra uncertainty, which the fast response of the system introduces in the determination of the forcing, poses a particular challenge to observationally based attempts to estimate \(S_{\rm{eq}},\) for example through the use of either Eq. (3) or Eq. (4). By focusing on periods over which the observational record is better constrained, the numerator and denominator of Eqs. (3) and (4) become small; but in this case, because the denominator is the difference of two rather large terms, the answer becomes proportionally more sensitive to errors in either \(F\) or \(\dot{H}\) (or their associated changes in time). So, although in the past, estimates of the forcing F have generally been thought to be limited by understanding of changes in the aerosol over the period of industrialization (e.g., Gregory et al. 2002), recent work suggests that factors such as clouds, which contribute to the fast response of the system (Hansen et al. 1997; Gregory and Webb 2008; Andrews et al. 2009, 2011), may place a more fundamental limit on the quantification of the effective radiative forcing that arises as a result of compositional changes to the atmosphere.
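The growing sensitivity to errors when the denominator is small can be made concrete with a short numerical sketch. The two cases below share the same underlying sensitivity but differ in the size of the forcing and temperature change; the same absolute error in F is then applied to both. All values are hypothetical.

```python
def sensitivity(delta_t_s, forcing, h_dot):
    """S_eq = dT_s / (F - Hdot), as in Eqs. (3) and (4)."""
    return delta_t_s / (forcing - h_dot)

def relative_change(delta_t_s, forcing, h_dot, forcing_error):
    """Fractional change in the inferred S_eq caused by an error in F."""
    base = sensitivity(delta_t_s, forcing, h_dot)
    perturbed = sensitivity(delta_t_s, forcing + forcing_error, h_dot)
    return (perturbed - base) / base

forcing_error = 0.1   # W m-2, hypothetical error in the forcing

# Short period: small changes, so F - Hdot is small (hypothetical values).
short = relative_change(0.3, 1.0, 0.7, forcing_error)
# Long period: larger changes, same underlying sensitivity (hypothetical values).
long_ = relative_change(2.3, 3.0, 0.7, forcing_error)

print(f"short period: {100 * short:.0f} % change in inferred S_eq")
print(f"long period:  {100 * long_:.0f} % change in inferred S_eq")
```

In this example the same 0.1 W m−2 error shifts the short-period estimate several times more than the long-period one, simply because it is a larger fraction of the denominator.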

3 Clouds

3.1 Radiative Effects, Feedbacks, and Forcing

A framework for considering cloud feedbacks in the climate system, first formulated by Schneider (1972), introduced the idea of cloud-forcing (see also Cess 1976; Ramanathan 1987). Identifying the subset of short- and long-wave radiative fluxes associated with cloud-free scenes, and assuming that three-dimensional effects associated with neighboring clouds can be neglected, permits determining “cloud radiative forcing” as the difference between the all-sky or actual radiative flux and the contribution from scenes having skies identified as cloud-free. Here, as before, the term “cloud-forcing” is used to denote a change in radiative flux due to a change in atmospheric composition, namely clouds. However, because this terminology does not allow one to distinguish the radiative effect of the totality of clouds from the radiative perturbation that would accompany an external perturbation to cloud properties, it proves useful to distinguish between the cloud radiative effect, CRE, as the radiative effect of the background state of cloudiness, and reserve the phrase “radiative forcing” for radiative perturbations driven by externally imposed changes.

The definition of the cloud radiative effect can be made precise as follows. Let Q denote the rate of absorption of solar energy (short-wave irradiance) and E denote the rate of emission of infrared energy (long-wave irradiance), both at TOA. If an inward directed flux is defined to be positive, E at the TOA must be negative. With this sign convention and letting H denote the enthalpy of the Earth system, including components such as oceans, land and surface ice, then the rate of change of H with time (Footnote 4), \(\dot{H},\) is given by \(\dot{H} = Q + E.\) Under the assumption that three-dimensional radiative effects are negligible, the two radiative flux terms may be conceptually distinguished into components pertaining to contributions from cloudy and cloud-free regions of the planet:

$$ Q = Q_{\star}\left[ 1 - \alpha_0(1-A_{\rm c}) - \alpha_{\rm c}A_{\rm c}\right] $$
$$ E = -E_0(1-A_{\rm c}) - E_{\rm c} A_{\rm c} $$

Here \(A_{\rm c}\) denotes the fraction of the area of the planet that is, on average, cloudy; \(Q_{\star}\) denotes the average solar irradiance incident at the TOA (\(Q_{\star} \approx 340\) W m−2, e.g., Fig. 1); \(\alpha_0\) and \(\alpha_{\rm c}\) denote the effective TOA albedo of the cloud-free and cloudy scenes, respectively; and \(E_0\) and \(E_{\rm c}\) denote the emitted long-wave irradiance obtained by compositing over the cloud-free and cloudy scenes, respectively. The albedos are designated as “effective” quantities because Eq. (5) depends non-linearly on cloudiness and insolation and, thus, the effective albedos must account for the co-variability between these two quantities. So for instance, \(\alpha_{\rm c}\) is not the average cloud albedo, but rather the cloud albedo that the average cloudiness requires so that the planetary albedo, \(\alpha = \alpha_0 (1-A_{\rm c}) + \alpha_{\rm c} A_{\rm c},\) matches that observed; likewise for \(\alpha_0,\) the effective albedo of the cloud-free scenes. In general each of the terms on the right-hand side of Eqs. (5) and (6), except for \(Q_{\star},\) is a function of the state of the system, importantly the vertical distribution of temperature, \(T_0,\) the composition of the atmosphere and the surface properties. \(E_{\rm c}\) is the effective emitted irradiance of cloudy scenes that is calculated by compositing over all columns not identified as being cloud-free. It depends on the distribution of clouds, but because clouds have some transparency and the atmosphere has some opacity, it also depends on the composition and temperature of the atmosphere, and the co-variability between the two. The short- and long-wave components of the CRE, denoted by superscripts (Q) and (E) respectively, follow naturally as the difference between the all-sky radiative flux and the fluxes which would be manifest in the absence of clouds, i.e.,

$$ F^{(Q)}_{{\rm c}} = -Q_{\star}(\alpha_{\rm c} - \alpha_0) A_{\rm c} $$
$$ F^{(E)}_{{\rm c}} = -(E_{\rm c} - E_0)A_{\rm c} $$

Given our sign convention, and because \(\alpha_{\rm c}\) is generally greater than \(\alpha_0\) whereas \(E_{\rm c}\) is generally less than \(E_0,\) the short-wave CRE is negative and the long-wave CRE is positive. Both quantities increase in magnitude with cloud amount, \(A_{\rm c}.\) The net CRE, \(F_{\rm c},\) is given by \(F^{(Q)}_{\rm c} + F^{(E)}_{\rm c}.\) A secular increase in the magnitude of \(F^{(Q)}_{\rm c}\) would exert a cooling influence on the Earth system, whereas an increase in the magnitude of \(F^{(E)}_{\rm c}\) would exert a warming influence. It should be emphasized that the CRE depends not just on the properties of the cloudy fraction of the planet but on the differences between the cloudy and cloud-free portions of the planet. As reviewed by Loeb et al. (2009) the application of this concept to various data sets shows the short-wave CRE to range from −45.4 to −53.3 W m−2 and the long-wave CRE to range from 26.5 to 30.6 W m−2; the associated net CRE from these prior estimates ranges from −16.7 to −24.5 W m−2. The CERES EBAF data, upon which the TOA estimates in Fig. 1 are largely based, give a short-wave cloud radiative effect of −47.1 W m−2 and a long-wave CRE of +26.5 W m−2; overall clouds, more precisely cloudy scenes, exert a net cooling influence on the Earth system of about 20 W m−2.
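The bookkeeping of Eqs. (5) to (8) can be written out as a short sketch. The function names are illustrative, and the cloud fraction, effective albedos and emitted irradiances below are hypothetical values chosen only to give magnitudes of the same order as those quoted above; they are not taken from any data set.

```python
Q_STAR = 340.0  # W m-2, average solar irradiance incident at the TOA (from the text)

def absorbed_shortwave(a_c, alpha_0, alpha_c, q_star=Q_STAR):
    """Eq. (5): absorbed solar irradiance for cloud fraction a_c."""
    return q_star * (1.0 - alpha_0 * (1.0 - a_c) - alpha_c * a_c)

def emitted_longwave(a_c, e_0, e_c):
    """Eq. (6): emitted long-wave irradiance, negative because inward is positive."""
    return -e_0 * (1.0 - a_c) - e_c * a_c

def cloud_radiative_effect(a_c, alpha_0, alpha_c, e_0, e_c, q_star=Q_STAR):
    """Eqs. (7) and (8): short-wave, long-wave and net CRE."""
    short_wave = -q_star * (alpha_c - alpha_0) * a_c
    long_wave = -(e_c - e_0) * a_c
    return short_wave, long_wave, short_wave + long_wave

# Hypothetical effective values, for illustration only.
a_c, alpha_0, alpha_c = 0.65, 0.15, 0.36
e_0, e_c = 265.0, 225.0   # W m-2

sw, lw, net = cloud_radiative_effect(a_c, alpha_0, alpha_c, e_0, e_c)
print(f"short-wave CRE: {sw:.1f} W m-2")
print(f"long-wave CRE:  {lw:.1f} W m-2")
print(f"net CRE:        {net:.1f} W m-2")
```

Each component equals the all-sky flux minus the flux obtained by setting the cloud fraction to zero in Eq. (5) or Eq. (6), which is the definition of the CRE used in the text.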

The CRE concept has seen considerable use in the interpretation of feedbacks in the climate system. From the perspective of Eq. (1), a feedback is a change in a radiative flux that results from a change in global temperature; such a further change in radiative flux in addition to that caused by an initial forcing imposed on the climate system can enhance or diminish the temperature change induced by a given forcing (positive or negative feedback, respectively). The feedback concept has been quite useful in interpreting the contributions of different components of the climate system to \(S_{\rm{eq}}.\) It is quite straightforward to demonstrate that changes in cloud radiative effects are not the same as cloud feedbacks. To appreciate this point note that, in the limit of small changes, the strength of the short-wave cloud feedback can be derived formally from (5) and (6) as (e.g., Soden et al. 2008; Schwartz 2011)

$$ \lambda_{{\rm c}}^{(Q)} = {\frac{\partial Q}{\partial A_{\rm c}}} \; {\frac{\partial A_{\rm c}}{\partial T_{\rm s}}} + {\frac{\partial Q}{\partial \alpha_{\rm c}}} \; {\frac{\partial \alpha_{\rm c}}{\partial T_{\rm s}}} $$
$$ = - Q_{\star} \left[ (\alpha_{\rm c} - \alpha_0) {\frac{\delta A_{\rm c}}{\delta T_{\rm s}}} + A_{\rm c} {\frac{\delta \alpha_{\rm c}}{\delta T_{\rm s}}} \right]. $$

A change in the CRE from surface-temperature-induced changes in cloud amount is, however, not simply equal to the cloud feedback times the change in surface temperature, but rather includes an additional term that accounts for the change in the albedo of the cloud-free scenes with the change in surface temperature:

$$ \delta F_{{\rm c}}^{(Q)} = \lambda_{{\rm c}}^{(Q)} \delta T_{\rm s} + Q_{\star} A_c {\frac{\partial \alpha_0}{\partial T_s}} \delta T_s. $$

The relationship between the cloud feedback and the change in the CRE follows similarly for the long-wave part of the spectrum. Equation (11) explicitly includes the dependence of the CRE on factors other than cloudiness, in the present example also the cloud-free sky albedo and long-wave emission from cloud-free scenes. Thus if, for example, in a changing climate, the surface albedo or cloud-free-sky aerosol changed, the CRE, as conventionally defined, would change for reasons that have nothing to do with changes in cloud properties. If the cloud properties remained fixed, the actual cloud feedback would be zero. This situation is described by Soden et al. (2008) as a masking effect. This example is readily extended to the long-wave, where the change in the cloud-free-sky emission, \(E_0,\) can cause a change in the CRE. Because \(\delta E_0\) and \(\delta \alpha_0\) are in principle observable, the net effect of changing clouds on the response of the system to an external perturbation can, in principle, be determined from measurements. Soden et al. (2008) illustrate how radiative kernels can be used to diagnose cloud feedbacks from changes in CRE, although the kernel methods conflate cloud feedbacks with cloud mediated CO2 indirect forcing (adjustment), as discussed below. Irrespective of how a feedback is calculated, care must be taken in its interpretation, as the definition of a feedback is dependent on how one defines the system. Choosing relative humidity instead of absolute humidity as a thermodynamic coordinate, or potential temperature instead of height as a vertical coordinate, can give very different pictures of the feedbacks in the system for reasons that have nothing to do with the processes taking place in the atmosphere.
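The masking effect can be checked numerically. In the sketch below the cloud properties are held fixed while the effective albedo of the cloud-free scenes is allowed to change with temperature; the short-wave cloud feedback of Eq. (10) is then zero, yet the short-wave CRE changes by the extra term in Eq. (11). The base state and the assumed change in clear-sky albedo per kelvin are hypothetical.

```python
Q_STAR = 340.0  # W m-2, from the text

def short_wave_cre(a_c, alpha_0, alpha_c):
    """Eq. (7): short-wave cloud radiative effect."""
    return -Q_STAR * (alpha_c - alpha_0) * a_c

# Hypothetical base state.
a_c, alpha_0, alpha_c = 0.65, 0.15, 0.36

# Hypothetical responses per kelvin of surface warming.
d_a_c_dt = 0.0          # cloud fraction held fixed
d_alpha_c_dt = 0.0      # cloudy-scene albedo held fixed
d_alpha_0_dt = -0.002   # clear-sky albedo decreases, e.g. through surface changes
delta_t = 1.0           # K

# Eq. (10): short-wave cloud feedback.
cloud_feedback = -Q_STAR * ((alpha_c - alpha_0) * d_a_c_dt + a_c * d_alpha_c_dt)

# Eq. (11): change in short-wave CRE, including the clear-sky albedo term.
d_cre_eq11 = cloud_feedback * delta_t + Q_STAR * a_c * d_alpha_0_dt * delta_t

# Direct evaluation: CRE of the perturbed state minus CRE of the base state.
d_cre_direct = short_wave_cre(
    a_c + d_a_c_dt * delta_t,
    alpha_0 + d_alpha_0_dt * delta_t,
    alpha_c + d_alpha_c_dt * delta_t,
) - short_wave_cre(a_c, alpha_0, alpha_c)

print(f"cloud feedback:         {cloud_feedback:.3f} W m-2 K-1")
print(f"change in CRE, Eq. 11:  {d_cre_eq11:.3f} W m-2")
print(f"change in CRE, direct:  {d_cre_direct:.3f} W m-2")
```

The two evaluations of the CRE change agree, and both are non-zero although the cloud feedback vanishes, which is the distinction drawn in the text.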

A further subtlety of the cloud-forcing concept is that the CRE, as defined above, depends not only on the properties of the cloud-free scenes, but also on other radiation-influencing constituents of the climate system that might be correlated with the presence of clouds. Hence, it is really a cloudy scene radiative effect, i.e., the difference between the total radiative flux and that which would obtain if cloud-free conditions always prevailed. The emphasis on cloudy scenes rather than clouds is because the former admits the possibility that the atmosphere in cloudy scenes is systematically different from the atmosphere in cloud-free scenes. For instance, if the atmosphere in cloudy scenes tends to be more humid than in cloud-free scenes then this difference in humidity will, through the definition of cloud radiative effect, Eq. (8), be interpreted as an effect of clouds. This distinction has important consequences for how cloud radiative effects are calculated in models versus in measurements. In models, the radiative transfer calculation is usually performed twice, the second time with the clouds removed from the input. The first call to the radiation defines the all-sky radiative flux, the second call defines the clear-sky radiative flux, and their difference is called the cloud radiative effect. In observations one cannot remove clouds, so the clear-sky radiative flux is estimated based on scenes where no clouds are identified in the first place, rather than based on all scenes but with removal only of the clouds from those scenes in which they occur. If the atmosphere around the clouds is different (for instance more humid) in the cloudy scenes, as one might expect, this leads to differences between the two estimates of CRE. Such effects were shown by Sohn et al. (2010) to lead to systematic discrepancies between the observed long-wave CRE and that calculated by models. Such discrepancies can, however, be readily overcome by calculating the CRE in models in the same way as is done in observations, by compositing over cloud-free scenes when determining the clear-sky flux.
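The difference between the two ways of estimating the clear-sky flux can be illustrated with a toy calculation. Everything in the sketch below is hypothetical: the scene list, the linear dependence of clear-sky outgoing long-wave flux on humidity, the fixed 30 W m^-2 cloud effect, and the sign convention (long-wave CRE taken as clear-sky minus all-sky outgoing flux). It is intended only to show how a humidity contrast between cloudy and cloud-free scenes enters the observation-style estimate.

```python
# Each hypothetical scene: (is_cloudy, column humidity in arbitrary units).
# Cloudy scenes are deliberately more humid than cloud-free scenes.
SCENES = [
    (False, 20.0), (False, 22.0), (False, 24.0), (False, 26.0),
    (True, 34.0), (True, 36.0), (True, 38.0), (True, 40.0),
]


def clear_sky_olr(humidity):
    """Toy clear-sky outgoing long-wave flux (W m^-2); more humid air emits less."""
    return 300.0 - 1.0 * humidity


def all_sky_olr(is_cloudy, humidity):
    """Toy all-sky flux: a cloud lowers the outgoing flux by a fixed 30 W m^-2."""
    return clear_sky_olr(humidity) - (30.0 if is_cloudy else 0.0)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


mean_all_sky = mean(all_sky_olr(c, h) for c, h in SCENES)

# Model-style: a second radiation call on every scene with the clouds removed.
cre_model_style = mean(clear_sky_olr(h) for _, h in SCENES) - mean_all_sky

# Observation-style: clear-sky flux composited over cloud-free scenes only.
cre_obs_style = mean(clear_sky_olr(h) for c, h in SCENES if not c) - mean_all_sky

print(f"model-style long-wave CRE:       {cre_model_style:.1f} W m^-2")
print(f"observation-style long-wave CRE: {cre_obs_style:.1f} W m^-2")
```

With these invented numbers the observation-style estimate is larger, because the drier cloud-free scenes emit more than the humid cloudy scenes would if their clouds were removed. Applying the observation-style compositing to model output removes the mismatch by construction.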

In Sect. 2.2, it was pointed out that rapid adjustments by clouds can contribute to the adjusted radiative forcing associated with an external perturbation to the atmospheric composition. It proves instructive to illustrate this idea using the conceptual framework developed above. To do so, consider the simplified case in which only the long-wave emission of the atmosphere depends on the greenhouse gas concentration, so that \(E = E(\chi,T_{\rm s})\) where χ denotes the greenhouse gas concentration. In this case, the net radiative forcing that results from a perturbation in a greenhouse gas concentration, δχ, can be expressed as follows

$$ F_{\chi} = F^{(E)}_{\chi} = {\frac{\partial E}{\partial \chi}} \delta \chi. $$

To the extent that the perturbation in greenhouse gas concentrations also influences cloud amount (for reasons that will become clear shortly), there is a resulting further contribution to the radiative forcing of the greenhouse gas perturbation, so that the effective radiative forcing due to the perturbation in concentration becomes

$$ F_{\chi} = \left[ \left( {\frac{\partial Q}{\partial A_{\rm c}}} + {\frac{\partial E}{\partial A_{\rm c}}} \right) {\frac{\partial A_{\rm c}}{\partial \chi}} + {\frac{\partial E}{\partial \chi}}\right] \delta \chi. $$

The first two terms on the right-hand side introduce the idea of an indirect forcing of greenhouse gases that is mediated by clouds. The use of the word “indirect” signifies that the change in the TOA irradiance is not a direct consequence of the greenhouse gas concentration on the clear-sky emissivity, but rather results from the sensitivity of cloud amount to the long-wave emissivity of the atmosphere, i.e., a cloud adjustment. Given our description of the system through Eqs. (5) and (6), an indirect CO2 forcing follows as soon as one admits that cloudiness may depend on the concentration of atmospheric CO2 (Forster and Gregory 2006; Gregory and Webb 2008; Andrews et al. 2009). The idea that clouds may be sensitive to the concentration of greenhouse gases, and thus may rapidly adjust in ways that change the initial forcing, actually predates the idea that clouds may depend on surface temperature and hence act as a feedback (Plass 1956). This is not just a mathematical abstraction. Because the radiative cooling at the top of stratiform cloud layers, which is important to their sustenance, is sensitive to the downwelling long-wave radiative flux, which in turn depends on the long-wave opacity of the overlying atmosphere (Caldwell and Bretherton 2009; Stevens et al. 2003), it is likely that clouds respond rapidly (adjust) to perturbations in atmospheric CO2, thereby providing at least one justification for the ansatz that \(A_{\rm c} = A_{\rm c}(\chi).\)
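As a minimal sketch, the effective forcing with a cloud adjustment can be evaluated by splitting it into its cloud-mediated and direct parts. The function name and every partial derivative below are hypothetical placeholders, with signs and magnitudes chosen purely for illustration; only the 3.7 W m^-2 per CO2 doubling is a value quoted in this paper.

```python
def effective_forcing(dQ_dAc, dE_dAc, dAc_dchi, dE_dchi, delta_chi):
    """Return (indirect, direct, total) forcing for a perturbation delta_chi.

    indirect: (dQ/dA_c + dE/dA_c) * dA_c/dchi * delta_chi  (cloud adjustment)
    direct:   dE/dchi * delta_chi                          (clear-sky emission)
    """
    indirect = (dQ_dAc + dE_dAc) * dAc_dchi * delta_chi
    direct = dE_dchi * delta_chi
    return indirect, direct, indirect + direct


# Hypothetical inputs, with chi measured in units of one CO2 doubling.
indirect, direct, total = effective_forcing(
    dQ_dAc=-100.0,   # assumed short-wave sensitivity to cloud amount, W m^-2
    dE_dAc=60.0,     # assumed long-wave sensitivity to cloud amount, W m^-2
    dAc_dchi=-0.01,  # assumed change in cloud amount per doubling
    dE_dchi=3.7,     # W m^-2 per doubling
    delta_chi=1.0,
)
print(f"indirect {indirect:.2f}, direct {direct:.2f}, total {total:.2f} W m^-2")
```

The point of the split is bookkeeping: the indirect part vanishes when cloud amount does not depend on the gas concentration, and the total then reduces to the direct forcing alone.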

3.2 Cloud Amount

A determination of whether changes in cloudiness are causing changes in Earth’s energy flows depends on the ability to identify clouds, and cloud changes, unambiguously. This proves to be a challenge, which stems in part from the very nature of clouds. Clouds are, in a word, nebulous. A cloud, like an aerosol more generally, is a dispersion of particulate matter in an often turbulent flow. But, additionally and in contrast to clear-air aerosols, clouds are inherently ephemeral, as they contain a substantial amount of condensed (liquid or solid) water, the presence and amount of which are maintained by local supersaturation, and which can quickly dissipate by evaporation, converting a cloudy scene to a cloud-free scene, making it difficult to determine the boundaries or even the presence of a cloud. This situation leads inevitably to a certain arbitrariness in whether a cloud is present at a given location. And this arbitrariness can lead to large differences in quantities, such as cloud fraction, which are central to interpreting Earth’s energy flows (Stephens 1988).

It might be argued that a clear basis for defining a cloud is provided by Köhler theory, namely as the set of particles that exist in an environment that is supersaturated relative to the equilibrium supersaturation over the particle surface and for which the equilibrium state is unstable. However, such a definition is not very useful in practice, as the theory applies at best only to liquid clouds, and deliquesced aerosol in humid environments, or evaporating hydrometeors in subsaturated environments, are often optically indistinguishable from clouds defined on the basis of Köhler theory, the transition region extending over distances up to kilometers or more (Koren et al. 2007; Tackett and Di Girolamo 2009; Twohy et al. 2009; Bar-Or et al. 2010).

As a consequence of such concerns, the presence of clouds is often determined based on their radiative properties. This approach is advantageous from a practical perspective, as satellite-borne instruments can be built to be sensitive to such properties, and thus afford the opportunity for reproducible measurements with global coverage and high spatial and temporal resolution. But it also introduces the possibility that quantities such as cloud fraction will be determined, in part, by the characteristics of the instruments with which they are measured. The seriousness of this issue is illustrated with the help of Figs. 3, 4 and 5, which show that co-located measurements by multiple approaches yield results that can differ profoundly at a single time, in monthly averages, and in the seasonal pattern. These issues can be ameliorated by focusing on anomalies over longer time-periods, which as shown in Fig. 5 are less instrument dependent. When a comparison to models is of interest, instrument simulators can also help address these issues. Even so, differences among measurement techniques make the unambiguous determination of long-term trends more difficult, and potentially more sensitive to inhomogeneities in the observational record.

Fig. 3

Cloud fraction in north central Oklahoma determined by several techniques. Both panels show 3 h mean cloud fraction as a function of local standard time; upper panel gives a 1-month overview; lower panel expands a 2-day period. ARSCL (Active Remote Sensing of Clouds) product of the Department of Energy Atmospheric Radiation Measurement (ARM) Program (Clothiaux et al. 2000) is a time average based on vertically pointing lidars and millimeter cloud radars; SIRS (Solar Infrared Radiation System) product of the ARM Program (Long et al. 2006) is a time average based on downwelling short-wave irradiance within a nominal 160° field of view; TSI (Total Sky Imager) is based on the fraction of cloudy pixels within a hemispheric field of view; GOES (Geostationary Operational Environmental Satellite) is based on the average of all pixels (4 km pixel size) within 20 km of the surface measurement site (Genkova et al. 2004). Figure provided by W. Wu (Brookhaven National Laboratory 2011)

Fig. 4

Monthly mean cloud fraction in north central Oklahoma determined by several techniques, as in Fig. 3. ISCCP refers to imaging radiometers on operational weather satellites; PATMOS-X is the Pathfinder Atmospheres Extended product (Heidinger and Pavolonis 2009). Figure provided by W. Wu (Brookhaven National Laboratory 2011)

Fig. 5

Anomalies in monthly mean cloud fraction in north central Oklahoma determined by several techniques, as in Fig. 3. Trend lines (not shown) are all zero within approximately one sigma, as evaluated assuming the absence of autocorrelation. Figure provided by W. Wu (Brookhaven National Laboratory 2011)

Natural variability in cloudiness compounds these issues. CERES SSF1deg short-wave CRE data analyzed and presented in Fig. 6 show that the CERES instruments on the different satellites are in relatively good agreement in their quantification of monthly anomalies during their period of overlap. Given that these estimates are based on similar instruments whose measurements are processed in a similar way, albeit for different samples of the planetary cloudiness, it is not surprising that the correlation between the two time-series is markedly stronger than what one deduces with different instruments, even if they sample the same cloud field, as in Fig. 5. The agreement between the CERES measurements aboard the Terra and Aqua satellites suggests that natural variability is the main cause of month-to-month fluctuations, which can be as large as 1–2 W m−2. These fluctuations are responsible for the stated uncertainty of about 0.3 W m−2 dec−1 in the 95 % significance ranges in monthly global anomalies of \(Q_{\rm c}\), an uncertainty range which is just large enough to explain a nearly 0.6 W m−2 difference in the decadal trends estimated from the two instruments.

Fig. 6

Anomalies in the monthly and globally averaged top-of-atmosphere short-wave cloud radiative effect (CRE) from the CERES SSF1deg product. Cloud short-wave forcing, difference between all-sky and cloud-free downwelling irradiance at the TOA, e.g., Eq. (7), is negative. Anomalies are calculated by subtracting the CERES monthly mean cloud-forcing (average over the data set) from the individual monthly forcing. Negative slope, i.e., decrease in cloud-forcing with time, corresponds to increase in cloud reflectivity climatology from the monthly fluxes. Aqua, pm; Terra, am

Such a large uncertainty, associated with natural variability alone, suggests that even very high quality, dedicated measurements, such as those provided by CERES, will not be able to document potentially substantial changes in global cloud radiative effects on timescales shorter than half a century. From the perspective of climate change, a change in the irradiance associated with cloud-covered scenes that is comparable to a given radiative forcing over the time-period of interest can be considered substantial. Over the period 1960–2005, CO2 increased at an average rate of 1.4 ppm yr−1 (Forster et al. 2007). A 3.7 W m−2 forcing associated with a doubling of CO2 (Forster et al. 2007) would result in a trend in the energy budget of about 0.25 W m−2 dec−1, in the absence of any feedbacks (and assuming a linear increase in the forcing with time). In this context, a change in cloud properties in response to this forcing that resulted in a further change in the radiation budget \(|\dot{Q}| > 0.05\) W m−2 dec−1 would constitute an appreciable change. Based on this we argue that observationally constraining the response of the climate system to such a perturbation, and to some meaningful degree, requires an ability to detect changes in the radiation budget \(|\dot{Q}|\) of 0.05 W m−2 dec−1. The analysis of Loeb et al. (2007) suggests detection of such a trend with 90 % confidence would require 50 years of data, and this analysis is optimistic as it assumes a perfect instrument and insignificant decadal variability. A similar conclusion has been reached by Dessler (2010). Moreover, as the number of years of data required to establish a trend is proportional to \(|\dot{Q}|^{-2/3},\) where \(|\dot{Q}|\) is the magnitude of the trend (Weatherhead et al. 2000), establishing a trend even twice as large, which becomes interesting as the rate of forcing increases superlinearly with time, would still require 30 years, a timeframe which, if anything, is a lower bound as it does not factor in the effects of instrumental limitations, or issues related to the ambiguity of cloud identification, as discussed above.
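The scaling argument can be checked with a few lines of arithmetic. The sketch below assumes that the required record length varies as the inverse two-thirds power of the trend magnitude, with noise variance and autocorrelation held fixed, and anchors the scaling to the 50 years quoted above for a trend of 0.05 W m−2 dec−1; the function name and the list of trial trends are illustrative choices, not part of the cited analyses.

```python
def years_to_detect(trend, ref_trend=0.05, ref_years=50.0):
    """Record length (years) needed to detect `trend` (W m^-2 per decade).

    Scaled from a reference case, assuming the required length varies as
    |trend| ** (-2/3) with all other factors held fixed.
    """
    return ref_years * (abs(trend) / ref_trend) ** (-2.0 / 3.0)


for trend in (0.05, 0.10, 0.25):
    print(f"{trend:.2f} W m^-2 dec^-1 -> {years_to_detect(trend):.1f} years")
```

Doubling the trend shortens the required record by a factor of 2^(-2/3), roughly 0.63, which takes 50 years to a little over 31 years, consistent with the figure of about 30 years given in the text.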

4 Aerosol Radiative Effects and Forcing

4.1 Aerosol Radiative Effects

The radiative properties of cloud-free skies depend on, in addition to surface reflectivity, the gaseous composition of the atmosphere and on the spatial and temporal distribution and chemical and microphysical properties of atmospheric aerosols. Atmospheric aerosol particles also serve as cloud condensation nuclei and ice-forming nuclei that are the seed particles for the formation of cloud liquid drops and ice crystals and, thus, changes in the aerosol are expected to change cloud properties, thereby contributing to an indirect aerosol radiative forcing (Twomey 1974; Charlson et al. 1992; Albrecht 1989; Stevens and Feingold 2009). Hence, atmospheric aerosols influence, to varying degrees, the flows of energy in almost all of the constituent parts of Fig. 1. Whereas aerosol effects on the short-wave radiative properties of the atmosphere are usually emphasized (Charlson et al. 1992), aerosol effects on the emissivity of the atmosphere (Markowicz et al. 2003) or on the downwelling long-wave irradiance of thin clouds (Lubin and Vogelmann 2006) may also be important. Because human activity has long been recognized as contributing to increasing aerosol burdens, aerosols also contribute to the net radiative forcing that appears in Eq. (3). The radiative forcing of aerosol is generally appreciated to be negative. As a consequence, taking this forcing into account in estimating climate sensitivity Eq. (3) results in a greater sensitivity than would be inferred from consideration only of the greenhouse gas forcing (Gregory et al. 2002; Schwartz et al. 2010). This situation necessitates accurate knowledge of the radiative forcing associated with changes in the aerosol burden. Not only does this require an understanding of how the aerosol burden has changed as a result of human activity, but also an understanding of how Earth’s climate system adjusts to such changing aerosol burdens, thereby determining the effective radiative forcing. 
However, in both instances, this understanding is poor, and hence the magnitude of the aerosol radiative forcing is highly uncertain.

The complexity of aerosols is one reason it has proven difficult to understand how human activity has contributed to a changing aerosol burden. This complexity is expressed in terms of the heterogeneous chemical and microphysical properties and also the highly variable spatial and temporal distribution of the aerosol. The heterogeneous composition, which contrasts with the well defined molecular properties of the greenhouse gases, is a consequence of the numerous contributions to atmospheric aerosols: Primary emissions from natural and anthropogenic sources and gas-to-particle conversion resulting from atmospheric reactions of precursor gases, importantly sulfur and nitrogen oxides from combustion sources, ammonia from agriculture and animal husbandry, organics from anthropogenic sources and vegetation, and numerous other sources. Gas-to-particle conversion processes lead both to new particle formation and to growth of pre-existing particles. The resulting aerosols undergo further evolution in the atmosphere through condensation and coagulation and in cloud processing. Ultimately, the aerosol particles are removed from the atmosphere, importantly by precipitation. The optical and cloud-nucleating properties of aerosols, and thus their influences on climate and climate change, are strongly dependent on the size and chemical composition of the particles comprising the aerosol. For example, growth of particles with increasing relative humidity, which greatly increases their ability to scatter visible light, is dependent on composition.

The complexity in the spatio-temporal distribution of aerosols is hinted at even upon inspection of their long-term average global distribution undifferentiated by aerosol type, as seen in the large spatial variability of aerosol optical depth (AOD) at 550 nm averaged over the available 11 years of Multi-angle Imaging SpectroRadiometer (MISR) data, Fig. 7. Major contributions arise from windblown dust (e.g., Northern Africa, Arabian peninsula, western China) and biomass burning (e.g., central Africa, Amazonia, Indonesia). Substantial contributions from human activity, mainly combustion related, can be inferred over southeast and eastern Asia and extending into the western North Pacific. The highly industrialized regions of Europe and North America (extending to the North Atlantic) also exhibit noticeable enhancement of AOD relative to pristine continental regions and major portions of the Southern Hemisphere Ocean. The spatial heterogeneity of the distribution of these aerosols is a consequence of the heterogeneous distribution of sources, the short atmospheric residence times of these aerosols (about a week), and the intermittent removal by precipitation. Because aerosol sources have pronounced seasonality, and because sink and transport processes of all aerosols are heavily dependent on variable meteorological conditions, the distribution of aerosols shown in Fig. 7, being a long-term average, considerably understates the complexity of the spatial distribution of atmospheric aerosols. The complexity of aerosols is thus manifested not only by their varied chemical and microphysical properties, but also by their heterogeneous spatial distribution; for example, an aerosol particle above a bright surface has a different radiative effect compared to even the same particle over a darker surface. All of these considerations make it much more difficult to quantify aerosol forcing than is the case with the incremental greenhouse gases.

Fig. 7

Annually averaged aerosol optical depth climatology inferred from eleven years (2000–2010) of Multi-angle Imaging SpectroRadiometer (MISR) measurements

The short residence time of aerosol particles in the troposphere not only complicates characterization of their radiative influences, but also has implications for the climate change that would result from future changes in emissions, especially as most of the incremental aerosol arises from emissions associated with fossil fuel combustion. If at some point in the future, emissions of CO2 from combustion are substantially reduced and if this were accompanied by reduction of associated emissions of sulfur and nitrogen oxides, major precursors of light-scattering tropospheric aerosols, the result would likely be for temperatures to initially increase because of the reduction of aerosol forcing. An initially abrupt increase in temperature following an abrupt cessation of aerosol forcing has been shown in climate model studies (e.g., Brasseur and Roeckner 2005; Matthews and Caldeira 2007).

As a consequence of all these considerations, understanding of energy flows in the Earth system and changes in these flows over the industrial era is challenged by poor understanding of the effect of aerosols on cloud-free skies where one can speak of the direct aerosol radiative forcing. This challenge is even greater with respect to aerosol effects on clouds (Lohmann and Feichter 2005; Stevens and Feingold 2009) where it has become common to speak of the indirect aerosol radiative forcing, which results from modification of the radiative influences of clouds that result from changes in the aerosol environment in which they form. The discrimination of aerosol forcing into direct and indirect components structures thinking about aerosol influences on climate and is used to structure the discussion below.

4.2 Aerosol Direct Forcing

Analogous to the way in which a CRE is calculated, the direct aerosol radiative effect, DARE, is the change in the irradiance at the top of the atmosphere that results from the total aerosol present. This can be contrasted with the direct aerosol radiative forcing, DARF, which is the quantity pertinent to forcing of climate change. As is the case with greenhouse gas forcing, DARF describes only the aerosol radiative effect associated with the secular (mainly anthropogenic) change in the DARE due to incremental aerosols, as that is the externally forced contribution. A negative forcing denotes a decrease in absorbed short-wave irradiance; opposite in sign to the positive greenhouse gas forcing, and, within the framework of the forcing-response paradigm, would offset some fraction of the greenhouse gas forcing.

The DARF depends on the difference between the present day and pre-industrial aerosol loading. This difference must account for the co-variability of the aerosol with surface and circulation features, as changes in the aerosol loading over bright surfaces, or above clouds, will have a different impact, and may even have a different sign, than changes over dark surfaces, or in clear air. And because the radiative effect of aerosols depends on their intensive properties, this difference must include a characterization of aerosol properties and how they have changed between the pre-industrial and the present day. Because the aerosol loading is heterogeneous, the radiative effect of a change in the aerosol will also be heterogeneous and thus likely generate responses in the circulation as the atmosphere adjusts to different regional heating patterns. These changes can be thought of as adjustments, which further complicate an estimate of the adjusted forcing associated with the direct effect of the anthropogenic aerosol. Finally, because there are few reliable proxies for the pre-industrial aerosol loading, this must be determined using models, although attributing accumulation mode aerosols (aerosols which are formed by gas-to-particle conversion, and which operationally have diameter less than about 1 μm) to the anthropogenic perturbation provides some observational constraints. Still, this approach has its limitations; for example, it would misattribute dust aerosol from tilled soils (which have particle diameters larger than 1 μm), and likewise would misattribute background sulfate and smoke, or natural organic haze such as gave rise to the name of the Great Smoky Mountains in the eastern United States. For all of these reasons, the direct aerosol radiative forcing is a much more challenging quantity to estimate than greenhouse gas forcing.

A starting point for quantifying the DARF is an estimate of the magnitude of the present day aerosol loading. This quantity is commonly expressed by the aerosol optical depth, AOD, the vertical integral of the aerosol extinction coefficient, typically given in the mid-visible, 500 or 550 nm. AOD can, with diligence, be measured by Sun photometry during the daytime and in the absence of clouds in the path to the Sun to an accuracy of 0.01 (e.g., Holben et al. 2001; Kim et al. 2008; Michalsky et al. 2010); the aerosol contribution to path extinction is determined by accounting for the extinction due to Rayleigh scattering and to atmospheric gases (importantly ozone). However, such measurements provide no information about the vertical distribution, are limited in their spatial coverage and to cloud-free skies during the day, and are especially lacking over the oceans (Smirnov et al. 2009). Hence, the approach taken to characterizing the distribution of aerosols globally has been to determine AOD by satellite, as for example illustrated by Fig. 7. It should be stressed, however, that the satellite measurement of AOD is rather indirect and also provides little information about the vertical distribution. The AOD is inferred from the enhancement of the path radiance over that which would be obtained in the absence of the aerosol. Rigorous cloud screening, which recalls the discussion (Sect. 3.2) of what is a cloud, is also required. By consistently accounting for contributions to path radiance from the surface and from Rayleigh scattering the contribution to path radiance attributable to light scattering by the aerosol can be determined. Converting from radiance to optical depth rests on assumptions about the scattering phase function (angular distribution of light scattering) and the fraction of the extinction by the aerosol that is due to scattering versus absorption. 
Typically, look-up tables are used, with identification of aerosol type informed by the wavelength dependence of the aerosol light scattering and climatology of aerosol types. The MISR (Multi-angle Imaging SpectroRadiometer) instrument, upon which the measurements illustrated in Fig. 7 are based, takes advantage of measurements at multiple scattering angles to better constrain the phase function and in turn the AOD (Martonchik and Kahn 2009; Kahn 2011). This is particularly helpful over bright and complex surfaces, because the inferred AOD is highly sensitive to errors in estimates of surface-leaving radiance. Measurements at multiple angles also constrain inferences of aerosol type, and hence absorption, which is another large source of error in inferring AOD. Further detail on determination of AOD by satellite and associated uncertainties is provided by Kahn et al. (2010) and Kahn (2011).

Importantly, in determining the DARE, and how it has changed, it is necessary to account for the effects of clouds. The effect of cloud contamination of pixels used to determine aerosol optical depth has already been noted; such contamination would result in a gross overestimation of aerosol optical depth, and it is thus necessary to apply stringent cloud screening (Mishchenko et al. 1999). However, being overly stringent runs the risk that high-humidity regions will be misclassified as being cloudy, with the resultant effect that the swelling of particles at high relative humidity and the attendant increase in light-scattering cross-section, optical depth, and forcing is excluded from the measurements. These effects can be substantial; for sulfate aerosols, the scattering cross-section increases fourfold between 90 and 97 % relative humidity (Nemesure et al. 1995). Surface-based measurements are also subject to a similar concern as the technique requires a direct path to the Sun. More intrinsic to the radiative forcing issue, the direct radiative effect of light scattering by aerosols is greatly diminished in the presence of clouds, which prevent solar irradiance from reaching the aerosol if clouds are above the aerosol, or provide a bright underlying albedo, minimizing the effect of aerosol scattering, if clouds are below the aerosol. Clouds beneath an absorbing aerosol greatly increase the amount of absorbed solar irradiance relative to the cloud-free situation. Because it is difficult to retrieve aerosol amounts in the presence of clouds, it has been common to estimate the DARE on a global basis simply by multiplying the cloud-free DARE, determined by measurement or by modeling of the amount and optical properties of the aerosol, by the cloud-free sky fraction. Such an approach has been shown in model calculations to yield a value of DARF whose magnitude is erroneously large (Bellouin et al. 2008) by a factor of two.
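The shortcut of scaling the cloud-free DARE by the cloud-free sky fraction can be compared with an area-weighted estimate that retains a cloudy-sky term. The sketch below shows one way a discrepancy could arise, namely a positive contribution from absorbing aerosol above cloud. All numbers and function names are invented for illustration and are not taken from Bellouin et al. (2008) or any other cited study.

```python
def dare_clear_scaled(dare_clear, cloud_fraction):
    """Common shortcut: cloud-free DARE times the cloud-free sky fraction."""
    return (1.0 - cloud_fraction) * dare_clear


def dare_all_sky(dare_clear, dare_cloudy, cloud_fraction):
    """Area-weighted DARE that keeps an explicit cloudy-sky contribution."""
    return (1.0 - cloud_fraction) * dare_clear + cloud_fraction * dare_cloudy


# Hypothetical values in W m^-2: scattering dominates in cloud-free scenes,
# while absorbing aerosol above cloud gives a small positive effect.
CLOUD_FRACTION = 0.6
DARE_CLEAR = -5.0
DARE_CLOUDY = 1.0

shortcut = dare_clear_scaled(DARE_CLEAR, CLOUD_FRACTION)
weighted = dare_all_sky(DARE_CLEAR, DARE_CLOUDY, CLOUD_FRACTION)

print(f"clear-sky scaling:   {shortcut:.2f} W m^-2")
print(f"with cloudy-sky term: {weighted:.2f} W m^-2")
```

In this toy case the shortcut gives the more negative value. The two estimates coincide only when the cloudy-sky contribution is exactly zero, which is the assumption the shortcut makes implicitly.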

Increases in the amount of aerosol loading that can confidently be ascribed to anthropogenic emissions have been thought to give rise to changes in global average atmospheric radiative fluxes (aerosol forcing) that are a substantial fraction of the greenhouse gas forcing over the industrial era. As part of the fourth assessment report of the IPCC the DARF was estimated to be −0.5 ± 0.4 W m−2 (Forster et al. 2007, 90 % confidence limits). Although considerably reduced in magnitude as compared to earlier estimates, such as the value of −1.3 W m−2 initially estimated for the sulfate aerosol alone (Charlson et al. 1992), a value of DARF at the high magnitude end of the range cited by IPCC would still offset a substantial fraction of greenhouse gas warming over the industrial era (about 3 W m−2), and thereby imply a much larger climate sensitivity, as inferred on the basis of Eq. (3) with F denoting the sum of greenhouse gas and aerosol forcing. To the extent that the DARF is large, advances in satellite remote sensing make it conceivable to look for a direct signal, by looking for signatures of aerosol trends in radiative fluxes measured by passive satellite-borne instrumentation over the past decade of intensive Earth observations. Eleven years of MISR measurements (upper panel of Fig. 8) of AOD show large-scale shifts in specific regions. Large increases in AOD over northern India, southeast and eastern Asia, and around the Persian Gulf likely reflect the economic development of these regions. Over the region ranging from southwest of North America, across the Atlantic and into North Africa, the Mediterranean and central Europe, and over the maritime continent, AODs have decreased sharply, by as much as 0.1. A modest decrease is evident across the southern ocean, and there is an apparent increase over western Canada. The uncertainties associated with the retrievals of AOD from space (Kahn et al. 2010), and the susceptibility of decadal trends to the effects of inter-annual modes of variability, such as El Niño, preclude drawing confident conclusions from Fig. 8; however, the broad conclusions drawn from the figure are also supported by a more systematic analysis based on multiple platforms (Zhang and Reid 2010).

Fig. 8

Decadal trend in the annually averaged aerosol optical depth (at 550 nm) inferred from eleven years (2000–2010) of Multi-angle Imaging SpectroRadiometer (MISR) measurements (upper) and in cloud-free scene (TOA) radiative fluxes taken from CERES measurements (lower). Both the CERES and MISR instruments are aboard the Terra satellite. Only trends that are significantly different from zero at the 95 % confidence level (as estimated based on the ratio of the residual variance and mean trend assuming no autocorrelation in the yearly data) are shown. An increase in reflected short-wave irradiance, as indicated by the redder colors, is indicative of a brightening of the cloud-free skies

A concern over the interpretation of these measurements is that the marked changes in aerosol optical depth inferred from the MISR and multi-instrument aerosol measurements are not mirrored in trends in the outgoing short-wave irradiance in cloud-free scenes over the same time-period as measured by the highly calibrated CERES radiometer on the same platform, lower panel of Fig. 8. Although the irradiance in cloud-free circumstances shows trends in some regions where aerosol optical depth retrieved by MISR has been increasing, for instance over south Asia, particularly the Middle East and southeast China, and to some extent over western Canada, the relationship between changes in AOD and changes in clear-sky short-wave radiative fluxes is not striking. The largest changes (irrespective of sign) in cloud-free-sky radiation appear over land in regions where there are no discernible trends in AOD. In almost every case statistically significant trends in CERES absorbed short-wave irradiance over land are also evident in changes in land-surface properties measured over effectively the same time-period: the brightening, at TOA, of Australia; the brightening in southeastern Asia; the brightening/dimming patterns over the southwestern United States and northern Mexico as well as the brightening/dimming patterns over Argentina and South America; and the dimming of the very northern tip of Africa are all associated with consistent changes in soil moisture, whereby an increase in the outgoing short-wave irradiance for scenes identified as being cloud-free corresponds to a decrease in soil moisture and evapotranspiration (Jung et al. 2010). Notwithstanding the merit of exploring why such large apparent trends in AOD over the ocean are not seen in the CERES clear-sky short-wave irradiance, this analysis provides little support for a strong DARF.

In summary, the evolving understanding of radiative forcing, one in which the process of adjustment plays an important role; the complexity of the aerosol and its co-variability with clouds and surface features; and the role of even small amounts of absorption by aerosols over bright surfaces, suggest that back-of-the-envelope estimates of the DARF can easily be misleading. Based on this reasoning, and in consideration of more detailed calculations that suggest previous work underestimates the uncertainty in aerosol forcing (Loeb and Su 2010), it would not be surprising if the sum of the various contributions to the DARF is much closer to zero than previously thought.

4.3 Aerosol Indirect Forcing (Cloud Adjustments)

It has long been appreciated that clouds adjust to changes in tropospheric aerosols and this adjustment affects their albedo and precipitation development. Based on this understanding, it has been hypothesized that changes in the loading and properties of tropospheric aerosols may indirectly affect the radiative influences of clouds (i.e., alter the CRE) by modifying cloud properties and/or amount. These changes are referred to as aerosol indirect effects, or the indirect radiative forcing resulting from anthropogenic (or secular) changes to tropospheric aerosol loading and properties. They are analogous to the CO2 indirect forcing discussed in the context of Eq. (13) and, like the CO2 indirect forcing, are more usefully thought of as an adjustment to a compositional change of the atmosphere. Because such adjustments convolve spatially heterogeneous changes in the aerosol with changes in cloudiness, they tend to be complex and uncertain, and are only briefly touched upon here.

Broadly speaking, cloud adjustments to aerosol perturbations can be classified into adjustments of cloud intensive properties and adjustments of cloud extensive properties. Examples of adjustments of cloud intensive properties are an increase in the cloud droplet number concentration (Twomey 1974) and/or a change in the cloud droplet size distribution (Liu and Daum 2002), and in turn changes in the cloud radiative influence, that result from increased concentrations of cloud-active aerosol. Adjustments of cloud extensive properties are often called cloud lifetime effects, such as hypothesized by Albrecht (1989) and Pincus and Baker (1994), wherein changes to cloud droplet number concentrations modify the precipitation efficiency of clouds, resulting in modifications of cloud amount (cf., Stevens and Feingold 2009; Khain 2009). Research continues to explore how clouds adjust to aerosol perturbations, with interest increasingly focusing on changes to ice-forming nuclei and hence adjustments of ice or mixed-phase clouds. Because clouds interact so potently with radiation, a slight, but systematic, change in the amount or albedo of clouds globally or over large regions of the Northern Hemisphere could exert a change in the global radiation budget that is comparable in magnitude to greenhouse gas forcing. For example, in a back-of-the-envelope calculation, Charlson et al. (1992) showed that a 30 % increase in the number concentration of cloud drops in marine stratus clouds globally would, other things being equal, exert a global radiative forcing of about −1 W m−2. In the intervening years, there have been abundant climate model calculations of the global magnitude of the cloud adjustment to aerosol forcing over the industrial era, with estimates ranging from near-zero to −3 W m−2 or more (Lohmann et al. 2010), based on differing assumptions regarding the relation between aerosols and cloud properties and various other controlling factors.
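
The order of magnitude of the Charlson et al. (1992) estimate can be recovered with a few lines of arithmetic. The sketch below is not their calculation. It assumes the commonly used susceptibility relation Δα ≈ α(1 − α)/3 · Δ ln N at fixed cloud water content, and the cloud albedo, the global area fraction of marine stratus, and the above-cloud transmittance are illustrative values chosen here rather than taken from the text.

```python
import math

def twomey_forcing(rel_increase_n, cloud_albedo=0.5, cloud_fraction=0.2,
                   transmittance=0.76, solar_constant=1361.0):
    """Back-of-the-envelope global forcing (W m-2) from a relative increase
    in cloud droplet number concentration, other things being equal.

    All default parameter values are assumptions made for illustration.
    """
    # Change in ln(N) for the prescribed relative increase in droplet number.
    d_ln_n = math.log(1.0 + rel_increase_n)
    # Assumed susceptibility relation at fixed liquid water content.
    d_albedo = cloud_albedo * (1.0 - cloud_albedo) / 3.0 * d_ln_n
    # Globally averaged insolation is one quarter of the solar constant.
    mean_insolation = solar_constant / 4.0
    # Two passes through the atmosphere above the cloud (down and up).
    return -mean_insolation * transmittance**2 * cloud_fraction * d_albedo

forcing = twomey_forcing(0.30)
print(f"Forcing for a 30 % increase in droplet number: {forcing:.2f} W m-2")
```

With these assumed values the result is of order −1 W m−2. It scales linearly with the assumed cloud fraction and with the square of the assumed transmittance, which is one reason such estimates are easily shifted by the choice of inputs.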

To be sure, a wealth of observational support exists for Twomey’s hypothesis that the cloud drop number concentration increases, cloud drop radius decreases, and cloud albedo increases with increasing aerosol particle concentration. However, quantification of the net radiative forcing that can be attributable to the indirect effects of aerosols on clouds has proven elusive. That cloud drop radii are reduced and cloud albedo is increased by aerosols is clearly shown in ship tracks (e.g., Segrin et al. 2007), but the net effect of such changes is partially offset by changes in cloud water content. Regionally, reduction in cloud drop effective radius is associated with enhanced concentration of anthropogenic aerosol (e.g., Schwartz et al. 2002), but the expected increase in cloud albedo is often absent. Such lack of enhancement of cloud albedo is also likely due to a decrease in cloud liquid water path with increasing aerosol concentration, at variance with Twomey’s ansatz of other things, especially cloud water content, remaining equal. A global survey using satellite observations showed roughly equal likelihood of negative, near-zero, or positive correlation of column liquid water and column drop concentrations in liquid water clouds (Han et al. 2002). Although several studies show strong correlations between cloud amount and aerosol optical depth (e.g., Nakajima et al. 2001; Koren et al. 2010), the interpretation of such correlations is difficult, as a variety of processes (both physical and retrieval artifacts) can be expected to produce such correlations, quite independently of whether or not the aerosol is interacting with the cloud (e.g., Loeb and Schuster 2008). For example, both aerosol optical depth and cloudiness increase with humidity and, thus, it is not surprising that modeling studies might overestimate the tendency of clouds to adjust to aerosol perturbations.
By regressing the logarithm of the retrieved aerosol optical depth against the logarithm of the retrieved cloud droplet concentrations over a number of geographic regions, Quaas et al. (2005) found statistically significant slopes that range from 0.1 to 0.3 depending on location, with values over the ocean three times greater than those over land and with a global mean value of just under 0.2 (see also Quaas et al. 2009). Based on this analysis, Quaas et al. (2009) estimate quite a low aerosol indirect forcing, −0.2 ± 0.1 W m−2. As noted by Quaas et al. (2005), this uncertainty is parametric, and contributions to the uncertainty from structural effects can be expected to be substantial; hence, observational estimates cannot, on their own, establish with confidence even the sign of the effect hypothesized by Twomey, despite arguments based on simple physical considerations that it is negative. The structural uncertainty that frustrates attempts to quantify the Twomey, or Twomey-like, effects also makes it more difficult to test cloud lifetime hypotheses and all the more to quantify the resultant forcing. Moreover, to the extent that precipitation processes become involved the difficulties are compounded, in no small part because of the sensitivity of aerosol amount to wet scavenging by precipitation.
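
The kind of log-log regression described above can be sketched as follows. The data are synthetic, generated with an assumed power-law exponent of 0.2 and assumed scatter, so the example only illustrates how a slope and its parametric uncertainty are obtained. It does not reproduce the retrievals or the processing of Quaas et al., and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Synthetic "retrievals": all numbers below are assumptions for illustration.
TRUE_SLOPE = 0.2      # assumed d ln(N_d) / d ln(AOD)
n_samples = 5000
ln_aod = rng.normal(loc=np.log(0.15), scale=0.5, size=n_samples)
noise = rng.normal(loc=0.0, scale=0.3, size=n_samples)
ln_nd = np.log(80.0) + TRUE_SLOPE * (ln_aod - np.log(0.15)) + noise

# Ordinary least squares of ln(N_d) on ln(AOD); cov=True also returns the
# covariance matrix of the fitted coefficients.
coeffs, cov = np.polyfit(ln_aod, ln_nd, deg=1, cov=True)
slope, intercept = coeffs
slope_stderr = np.sqrt(cov[0, 0])

print(f"slope = {slope:.3f} +/- {slope_stderr:.3f}")
```

The standard error returned by the fit is purely parametric: it measures scatter about the assumed linear relation and says nothing about structural problems, such as humidity or retrieval artifacts driving both variables.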

In summary, although there is little doubt of the importance of aerosol-cloud interactions in influencing the amount of atmospheric aerosol, as well as cloud properties, the variety of ways in which clouds adjust to aerosol perturbations (Stevens and Feingold 2009), many of which are not possible to account for given the relatively crude description of cloud processes in climate models, lends weight to the argument that, after a full accounting, the radiative forcing attributable to cloud adjustments to aerosol perturbations is likely to be small, at least on a global scale.

5 Modeling Earth’s Energy Flows

The preceding discussion demonstrates that, irrespective of the sophistication of observing systems, models are indispensable for estimates of fundamental properties of the climate system. The forcing-feedback-response framework is useful only in so far as compositional perturbations can be associated with a radiative forcing. And if it is to have any general meaning, this radiative forcing must incorporate the perturbation-dependent fast response of the system, the adjustment. Complications posed by adjustment have long been appreciated for inhomogeneous perturbations, such as those due to aerosols, and in special cases for homogeneous forcings such as those due to greenhouse gases, i.e., stratospheric adjustment. Adjustments that involve changes in cloudiness, whether due to aerosol or greenhouse gas forcing, can be large, are often model dependent, and add significant uncertainty to ultimate estimates of the response of the climate system to compositional changes.

Uncertainties in the modeling arise because models are imperfect. And, given the singular nature of the task to which climate models are to be applied, these imperfections are difficult to quantify. Independent realizations of perturbations to the energy flows in Earth-like planets, which could be used to evaluate the reliability of climate models, do not exist. So the empiricism through which the adequacy of models could be assessed does not exist, and critical tests are invariably indirect. As an example, Hall and Qu (2006) showed that in models the surface albedo feedback, which is not observable, correlates with the relationship between seasonal variations of surface temperature and surface albedo, which is observable. The implication is that models which capture the observed relationship between seasonal changes in surface temperature and surface albedo more reliably represent the surface albedo feedback. Of course, the relationship that models show between their seasonal cycle and their response to a secular perturbation might simply be an artifact of how the models are constructed, and indeed there is evidence that relationships between quantities one desires from a model, and quantities that one can measure, may say more about the models from which such relationships are derived than they do about the physical system (e.g., Klocke et al. 2011). Nonetheless, the Hall and Qu (2006) example shows how focusing on perturbations to Earth’s energy flows might lead to the development of a critical framework for assessing the reliability of models. This idea hinges on advancements in the ability to measure Earth’s energy flows, as developed further below.

One of the key energy flows illustrated by Fig. 1 is that of short-wave radiation, which powers the climate system. The near constancy of the total solar irradiance is thought to be a prerequisite for the development of life, but this would come to nought were it not for a commensurate constancy in the planetary albedo. The constancy of the planetary albedo is evident not only on globally and annually averaged scales, but also within latitude zones. This point is illustrated by the latitudinal dependence of zonal averages from a decade of CERES measurements of the reflected short-wave radiation at the TOA, Fig. 9. Also shown are the range of annual averages of the reflected short-wave radiation, evaluated as

$$ \left\langle Q^{\uparrow}_n \right\rangle = {\frac{1}{2}} \int Q^{\uparrow}_n \cos(\varphi) {\rm d}\varphi $$

where \(Q^{\uparrow}_n\) denotes the reflected short-wave radiation as a function of latitude \(\varphi\) and year n; and the values averaged over the northern and southern hemispheres separately. The surprising feature of Earth’s climate system that is revealed in these measurements is the small inter-annual variability (the range in the yearly averages is 1.16 W m−2 and the standard deviation is 0.36 W m−2), despite the zonal average spanning more than 40 W m−2, with a root-mean-square variability (weighted by area) of nearly 9 W m−2. Also, the difference between the two hemispherically averaged values is very small, only 0.35 W m−2.
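
The area-weighted average defined above, and its hemispheric counterparts, can be evaluated numerically as in the following minimal sketch. The zonal profile used here is a made-up analytic function, not CERES data, and the function name `area_mean` is hypothetical.

```python
import numpy as np

def area_mean(q_zonal, lat_deg, lat_min=-90.0, lat_max=90.0):
    """Area-weighted mean of a zonally averaged field between two latitudes.

    For the full sphere this equals 0.5 * integral of Q(phi) cos(phi) dphi.
    """
    lat_deg = np.asarray(lat_deg, dtype=float)
    q_zonal = np.asarray(q_zonal, dtype=float)
    keep = (lat_deg >= lat_min) & (lat_deg <= lat_max)
    phi = np.deg2rad(lat_deg[keep])
    integrand = q_zonal[keep] * np.cos(phi)
    # Trapezoidal rule written out explicitly.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(phi))
    # Analytic integral of cos(phi) over the band (equals 2 for the globe).
    norm = np.sin(np.deg2rad(lat_max)) - np.sin(np.deg2rad(lat_min))
    return integral / norm

# Hypothetical zonal profile in W m-2, chosen only for illustration.
lat = np.linspace(-90.0, 90.0, 181)
q = 100.0 + 20.0 * np.sin(np.deg2rad(lat)) ** 2

q_global = area_mean(q, lat)
q_north = area_mean(q, lat, 0.0, 90.0)
q_south = area_mean(q, lat, -90.0, 0.0)
print(f"global {q_global:.2f}, north {q_north:.2f}, south {q_south:.2f} W m-2")
```

Applied to each year of observed zonal means, the same function would give the yearly global values whose range and standard deviation are quoted in the text. For the symmetric test profile used here the two hemispheric means coincide by construction.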

Fig. 9

Zonally averaged reflected short-wave radiation from ten years of CERES EBAF (energy balanced and filled) data. The figure shows the upward-directed short-wave irradiance (\(Q^{\uparrow}(\varphi)\), where \(\varphi\) denotes latitude) plotted versus the sine of latitude. The year-to-year variability is indicated by the gray shading (which spans the range of yearly values). The blue line shows the 10-year mean. The red vertical bar near the vertical axis denotes the range in the global and annual means. The two small horizontal bars near the vertical axis show the 10-year means averaged over the northern and southern hemisphere separately

Because the globally and annually averaged albedo can be, and is, readily tuned by adjusting global parameters in models, the fidelity with which it is represented is not a critical test of models. However, the ability of models to represent natural fluctuations in the energy flows about the parameter-constrained global values begins to provide a much more critical test. The simplest example of such a fluctuation is that embodied by the climatology of zonally averaged anomalies, Fig. 10. These zonal patterns help to regulate the meridional heat transport and are not directly specified through the adjustment of global parameters. Broadly speaking, climate models, here represented by a subjective selection of five climate models, are skillful in representing zonally averaged anomalies; correlations between the observed and simulated latitudinal anomaly range between 0.60 and 0.94 for the models shown. However, the departures from the observations, which are due mainly to treatment of clouds in the models, are substantial, particularly when viewed in the context of radiative forcing over the industrial era. The root-mean-square error in the zonal residuals ranges from 4 to 8 W m−2, comparable to the departure of the zonal mean from the global mean and considerably greater than a short-wave cloud feedback that would be important in the context of understanding climate feedbacks, 1–2 W m−2. The ways in which the models are wrong also vary, although some patterns emerge. In a comparison of zonal monthly mean albedo calculated with twenty climate models and ERBE observations over 1985–1990, Bender et al. (2006) found substantial positive and negative departures, not infrequently as great as 0.1, that were reproducible from year-to-year in a given model but differed substantially from model to model in space and time.
Trenberth and Fasullo (2010a) show that most models reflect too little solar radiation in the southern storm tracks and misrepresent in one fashion or another the structure of the tropical convergence zones (cf., Lin 2007). Most models reasonably represent the poleward increase of reflected short-wave radiation in the mid-latitudes of the northern hemisphere, presumably because part of this is carried by the influence of specified surface features such as the Saharan desert and Tibetan Plateau. The models also tend to accurately represent the remarkable constancy in the globally averaged values, and even the year-to-year variability within latitude bands, but almost all fail to properly capture the near equality of the hemispherically averaged values.
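
The two skill measures quoted above, a correlation and a root-mean-square error of zonal anomalies, could be computed along the following lines. This is a sketch under stated assumptions: the text does not say whether the correlation is area weighted, so cos(latitude) weighting of both measures is a choice made here, and the "observed" and "modeled" profiles are synthetic stand-ins rather than CERES or CMIP3 data.

```python
import numpy as np

def zonal_skill(q_model, q_obs, lat_deg):
    """Area-weighted correlation and RMSE of zonal anomalies about the
    respective global means (the weighting is an assumption of this sketch)."""
    w = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    w = w / w.sum()
    # Removing each field's own global mean discards what global tuning fixes.
    anom_m = q_model - np.sum(w * q_model)
    anom_o = q_obs - np.sum(w * q_obs)
    cov = np.sum(w * anom_m * anom_o)
    corr = cov / np.sqrt(np.sum(w * anom_m**2) * np.sum(w * anom_o**2))
    rmse = np.sqrt(np.sum(w * (anom_m - anom_o) ** 2))
    return corr, rmse

# Synthetic profiles in W m-2, for illustration only.
lat = np.linspace(-89.5, 89.5, 180)
phi = np.deg2rad(lat)
q_obs = 100.0 + 20.0 * np.sin(phi) ** 2
q_model = q_obs + 5.0 + 6.0 * np.sin(2.0 * phi)  # global offset plus pattern error

corr, rmse = zonal_skill(q_model, q_obs, lat)
corr_offset, rmse_offset = zonal_skill(q_obs + 5.0, q_obs, lat)
print(f"pattern error:     corr = {corr:.2f}, rmse = {rmse:.2f} W m-2")
print(f"global offset only: corr = {corr_offset:.2f}, rmse = {rmse_offset:.2f} W m-2")
```

A uniform offset leaves both measures unchanged, which is why anomaly-based measures test something that tuning of the global mean cannot fix.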

Fig. 10

Biases in zonal anomalies in annually and zonally averaged reflected short-wave radiation. Here the biases are the differences between modeled and observed estimates, so that a negative (red) anomaly corresponds to too much absorption in the model. The model estimates are taken from five CMIP3 models, each processed over the last 10 years of a slab-control simulation; the observations are taken from ten years of CERES data

Given a globally and annually averaged flow of energy into the Earth system, models would ideally also produce a model state that is consistent with what is observed. In this respect, there is also room for improvement. The pre-industrial control climate can differ substantially among models, even if the flow of energy into the system is prescribed, or tuned to match the best estimate of the observations. This is evident in Fig. 11, which shows simulated global mean temperature over the twentieth century taken from all of the relevant simulations, 58 in total, in the CMIP3 archive. Simulated temperatures at the end of the twentieth century exhibit a range of nearly 3 °C (from 12.8 to 15.5 °C). Most models are biased cold, despite being forced with a total solar irradiance that is now thought to be too large. The multi-model mean temperature is more than 0.5 °C lower than measured, an offset that is comparable to the temperature change observed over the twentieth century. From a certain perspective, the agreement is excellent; an error in temperature of 1 K out of 288 K corresponds to an error of only 0.35 %, albeit a somewhat larger one when translated into an energy flux (5.5 W m−2 or nearly 1.4 %). However, even such a small temperature error can alter the modeled climate in ways that are as great as the climate change that has occurred over the twentieth century or is projected for the twenty-first century. Such an error would seem to have implications for model projections of climate change; so it is surprising that, despite these differences, the models, individually and collectively, still represent the trend in twentieth century temperatures as accurately as they do.
This surprise is tempered by a realization that the agreement in the twentieth century temperature trend may also be a reflection of the model development process and the considerable latitude that uncertainty in the aerosol forcing gives model developers in matching the observed temperature trend (Kiehl 2007).
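
The translation of a 1 K temperature error into an energy flux quoted above can be checked with a short calculation. The sketch assumes that the surface emits as a blackbody at 288 K and linearizes the Stefan-Boltzmann law about that temperature; the function name is hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m-2 K-4

def blackbody_flux_error(temp_k, temp_error_k):
    """Relative temperature error, flux error (W m-2), and relative flux error
    for a blackbody, using the linearization dF = 4 * sigma * T**3 * dT."""
    rel_temp_error = temp_error_k / temp_k
    flux = SIGMA * temp_k**4
    flux_error = 4.0 * SIGMA * temp_k**3 * temp_error_k
    return rel_temp_error, flux_error, flux_error / flux

rel_t, d_flux, rel_flux = blackbody_flux_error(288.0, 1.0)
print(f"temperature error: {100 * rel_t:.2f} %")
print(f"flux error: {d_flux:.1f} W m-2 ({100 * rel_flux:.1f} %)")
```

Under this assumption the relative flux error is four times the relative temperature error, about 1.4 %, and the flux error is roughly 5.4 W m−2, consistent to within rounding with the value given in the text.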

Fig. 11

Change in global mean temperature anomaly relative to 1900–1950 (left axis) and global mean temperature (right axis) over the twentieth century as evaluated with climate models that participated in the 2007 IPCC Assessment. The observed change in temperature anomaly (HadCRUT3) is plotted in black. The 13.75 °C offset in the vertical scales corresponds to the 1900–1950 global mean surface temperature (Jones et al. 1999, HadCRUT3). Adapted from Tredger (2009) and IPCC AR4 Figure 9.5

Although tests like the ones just discussed are neither exhaustive nor absolute, it seems reasonable to assume that the ability of a model to represent perturbations to Earth’s energy flows that arise from natural forcings should be indicative of its ability to represent perturbations stemming from human sources (cf., Lucarini and Ragone 2011). Viewed comprehensively, an improvement in the ability to model these natural perturbations in Earth’s energy flows could reasonably be associated with more reliable projections of perturbations in Earth’s energy flows stemming from prospective future changes in atmospheric composition. Viewed over many model generations, substantial improvements in these respects should correlate with a reduction of model spread in the representation of perturbations to Earth’s energy flows that are otherwise not observable. To test the conjecture that models which better represent natural perturbations in Earth’s energy flows (for instance associated with seasons, decadal variations in solar forcings, volcanoes, or different surface boundary conditions) are also better at representing forced perturbations requires not only sustaining and advancing observations of Earth’s energy flows, but also standardized experiments within evolving intercomparison protocols, such as the coupled model intercomparison project (CMIP).

6 Concluding Remarks

Tremendous advances in observing systems now make it possible to track energy flows through the Earth system with high precision, and on temporal and spatial scales that would have been difficult to imagine just a few decades ago. More accurate measurements of the total solar irradiance and the increase in ocean enthalpy are helping to constrain the balance between net solar irradiance and outgoing long-wave irradiance to less than ±2 W m−2 at the top of the atmosphere. Precise measurements of Earth’s energy flows, as part of NASA’s CERES program, are providing new insights into how the energy flows at the top of the atmosphere vary spatially on timescales ranging from days to now more than a decade. And although, with the exception of the planetary albedo, which the satellite record has convincingly shown to be much smaller than pre-satellite estimates, the several terms in Earth’s energy budget do not appear to have changed markedly as a result of the satellite record, the confidence in this budget has been greatly enhanced by the accuracy and precision of the satellite measurements. Measurements have advanced to the point where estimates of long-term trends in the top-of-atmosphere energy budget are now limited principally by natural variability, rather than instrument precision. Active remote sensing, from both surface- and space-based platforms, is helping to constrain the surface energy budget. And seasonally resolved climatologies of aerosol optical depth are available from both surface- and space-based networks.

These measurement advances are not without their attendant controversies, nor do they yet provide a complete picture of Earth’s energy flows. In particular, satellite-based estimates of precipitation remain difficult to reconcile with estimates of downwelling surface irradiance in the long-wave, given uncertainties in other terms in the energy budget. Also, the absolute uncertainty of global climatologies of energy related quantities remains comparable to, or larger than, perturbations expected from changes in atmospheric composition resulting from human activities, precluding measurement-based quantification of changes in flux terms resulting from the anthropogenic perturbation. Furthermore, distinguishing aerosols from clouds, or complex surfaces, and quantifying aerosol forcings remain challenging, even for a well characterized aerosol perturbation. These challenges limit confidence in estimates of total secular forcing. Nonetheless, the comprehensiveness of the measurements is making it possible to track energy flows through the Earth system on space and time scales that are unprecedented, thereby offering the possibility to pose critical tests of both models and understanding of how energy flows through the Earth system.

Climate change, as quantified, for instance, by changes to the globally averaged surface temperature, can be expected to result from compositional changes to Earth’s atmosphere which perturb the flow of energy through the Earth system. The forcing-response-feedback framework, which has developed to understand such changes, posits that compositional changes in the Earth system can be associated with an effective radiative forcing, so that distinct perturbations that result in quantitatively the same effective radiative forcing will elicit the same response, as measured by globally averaged surface temperature. The magnitude of the response to a sustained forcing (per unit forcing), the climate sensitivity, is thought to be an intrinsic property of the Earth system. Notwithstanding objections that climate change can be manifest in changes that do not scale with changes in the globally averaged surface temperature, this framework relies on an ability to determine this effective radiative forcing, given some externally imposed change to the system. However, because this effective forcing depends on the fast response (or adjustments) of the system to a given perturbation, it cannot be determined from first principles. And so climate models, which previously were thought to be indispensable to an assessment of the climate sensitivity, have become equally indispensable to the estimation of the effective radiative forcing. Consequently, those poorly modeled elements of the climate system that affect the flow of energy through the system, for instance clouds and convection, have become the principal limit in quantification of not only the climate sensitivity, but also the effective forcing – especially for complex perturbations such as those associated with aerosols.

Although climate models have become indispensable, they also lack critical tests. Independent realizations of perturbations to Earth-like planets do not exist and, hence, model-based estimates of the effective radiative forcing that accompanies a compositional perturbation, or the climate sensitivity that emerges as a result of feedbacks in the Earth system, cannot be independently assessed. However, by focusing on the ability of models to represent the ways in which energy is observed to flow through the Earth system, particularly for repeatable events such as seasons, ENSOs, solar cycles, and perhaps volcanic perturbations, and as a function of surface state or latitude, it might be possible to develop a measure of reliability of climate models, somewhat analogous to what is done for weather forecasting. So doing would complement the heuristic use of models that currently proves essential to development of understanding. However, this will require sustained and expanded efforts to measure the flows of energy through the Earth system and international programs dedicated to documenting the evolving skill of models to represent these flows.