Optimal filtering for Bayesian detection and attribution of climate change
Schnur, R. & Hasselmann, K. Clim Dyn (2005) 24: 45. doi:10.1007/s00382-004-0456-3
Abstract
In the conventional approach to the detection of an anthropogenic or other externally forced climate change signal, optimal filters (fingerprints) are used to maximize the ratio of the observed climate change signal to the natural variability noise. If detection is successful, attribution of the observed climate change to the hypothesized forcing mechanism is carried out in a second step by comparing the observed and predicted climate change signals. In contrast, the Bayesian approach to detection and attribution makes no distinction between detection and attribution. The purpose of filtering in this case is to maximize the impact of the evidence, the observed climate change, on the prior probability that the hypothesis of an anthropogenic origin of the observed signal is true. Whereas in the conventional approach model uncertainties have no direct impact on the definition of the optimal detection fingerprint, in optimal Bayesian filtering they play a central role. The number of patterns retained is governed by the magnitude of the predicted signal relative to the model uncertainties, defined in a pattern space normalized by the natural climate variability. Although this results in some reduction of the original phase space, this is not the primary objective of Bayesian filtering, in contrast to the conventional approach, in which dimensional reduction is a necessary prerequisite for enhancing the signal-to-noise ratio. The Bayesian filtering method is illustrated for two anthropogenic forcing hypotheses: greenhouse gases alone, and a combination of greenhouse gases plus sulfate aerosols. The hypotheses are tested against 31-year trends for near-surface temperature, summer and winter diurnal temperature range, and precipitation. Between six and thirteen response patterns can be retained, as compared with the one or two response patterns normally used in the conventional approach.
Strong evidence is found for the detection of an anthropogenic climate change in temperature, with some preference given to the combined forcing hypothesis. Detection of recent anthropogenic trends in diurnal temperature range and precipitation is not successful, but there remains strong net evidence for anthropogenic climate change if all data are considered jointly.
Introduction
The degree to which the human-induced climate change predicted by models has been detected and verified today by observations remains a topic of debate. The cautious statement in the Second Assessment Report (Houghton et al. 1996) of the Intergovernmental Panel on Climate Change (IPCC) that “the balance of evidence indicates a discernible human influence on climate” has been reaffirmed and strengthened in the Third Assessment Report (Houghton et al. 2001, see also status reports of the International ad hoc Detection and Attribution Group, Barnett et al. 1999, and IDAG 2004). The global-scale climate change observed in recent decades is indeed difficult to attribute to natural causes. But it is less clear whether climate changes observed on the regional scale, such as rainfall patterns, droughts, storms and other extreme events, which are likely to have the strongest impact on human living conditions, can be attributed to human activity. For these data, as well as for a number of global climate change indices, the time series are too short to establish reliable natural variability statistics, which are needed for conventional statistical detection and attribution tests. Therefore, natural variability is usually estimated from long model control simulations, but this approach faces the problem of considerable differences in the simulated variability of different climate models (see Barnett 1999). These differences will be even larger for many non-temperature variables or indices. The errors of the climate models used to predict the climate change signals, against which the observed climate change data are compared, are similarly difficult to estimate. For these reasons, only a subset of the available climate change data, primarily surface temperature and zonally averaged vertical temperature data, have been applied so far for conventional statistical detection and attribution analyses (Hegerl et al. 1997; Santer et al. 1996; Tett et al. 1999; Stott et al. 2001).
To overcome these statistical data base limitations, Bayesian techniques have been proposed (Varis and Kuikka 1997, Hasselmann 1998, Risbey et al. 2000). By introducing subjectively defined probabilities, the Bayesian approach not only extends the permissible data base, but also provides a scientific foundation for investigating the implications of different subjective estimates of the errors of models and data, or of different prior beliefs in the validity of competing climate change forcing hypotheses (see Earman 1992).
As preparation for a detailed Bayesian detection and attribution analysis applied later to a more comprehensive climate change data set, we consider in this study an important technical aspect of the Bayesian approach: how should one weight the data in order to maximize the impact of the data on the probability that a given climate change forcing hypothesis is true?
In the conventional statistical approach to the detection and attribution problem, the data are projected onto a suitably chosen low-dimensional subspace in order to enhance the signal-to-noise ratio (S/N). Without a reduction in the dimensionality of the climate phase space, the signal is normally swamped by the natural variability noise. The patterns used for the projection (known as fingerprints) can be chosen such that S/N is maximized (Hasselmann 1979, 1997). However, the optimal fingerprint method of the conventional detection and attribution approach cannot be immediately carried over to the Bayesian case, as here the detection and attribution problem is formulated quite differently.
In the conventional approach, the hypothesis of a climate change signal induced by human activity (or some other given external forcing mechanism) is compared against the null hypothesis that the observed climate change is due to natural climate variability. One estimates first the probability p^{0} that a climate change signal at least as large as the observed signal could have occurred within a given observation period through natural climate variability. If p^{0} is small, a signal is said to have been detected at a significance level of 1−p^{0} (or, in an alternative terminology, p^{0}). Once detection has been established, one investigates further whether the climate change signal inferred from the observations can be attributed to the postulated forcing mechanism, i.e. whether the predicted and inferred signals are mutually consistent within the estimated error bounds of the observations and the model prediction. The errors of the predicted signal are relevant only for attribution; for the detection test, their only impact is a minor degradation of the detection skill through a distortion of the computed fingerprint pattern relative to the true optimal pattern.
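As a minimal sketch of this conventional detection step, consider a one-dimensional detection variable d obtained by projecting the data onto a fingerprint, with Gaussian natural variability of standard deviation sigma_nat. The function name and setup are ours, for illustration only:

```python
from math import erf, sqrt

def detection_p_value(d_obs, sigma_nat):
    """One-sided probability p0 that natural variability alone (Gaussian,
    standard deviation sigma_nat) produces a detection-variable value at
    least as large as the observed d_obs."""
    z = d_obs / sigma_nat
    # Upper tail of the standard normal distribution
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))
```

A small p0 (say below 0.05) corresponds to detection at the associated significance level; as the text notes, model-prediction errors do not enter this step.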
In the Bayesian approach, in contrast, there exists no formal separation between detection and attribution, and the uncertainties of the predicted signal and the natural climate variability play an equally important role for both questions. The observed climate change is treated as new evidence that modifies an initial, subjectively defined probability of the validity of a given hypothesis regarding the cause of the climate change. The evidence changes the initial probability prior to knowledge of the evidence (the “prior”) into a posterior probability (the “posterior”). The modification depends not only on the prior, but also on the ratio of the likelihood of making the relevant observations for the case that the forcing hypothesis is true or, alternatively, false (i.e. that the observations are due to natural climate variability or some other postulated forcing mechanism). The conceptual framework of the Bayesian approach therefore differs fundamentally from the conventional approach, and we shall indeed show that the solution for the optimal filtering or weighting of the observational data is quite different for the two cases.
In the following section we briefly review the basic relations required for the Bayesian analysis. In Sects. 3 and 4 we then derive the optimal data filtering results for the Bayesian case, while Sect. 5 illustrates their application for the example of global mean surface temperatures, summer and winter diurnal temperature ranges and precipitation. The principal conclusions of our analysis are summarized in the last section.
Bayes factors
We consider an ensemble of climate change data consisting of sets of individual indices and/or fields of space-time dependent variables. Following the compressed notation of Hasselmann (1998), we denote the complete set of space-time discretized climate change variables as the vector v. Thus the index i of the vector component v_{i} runs over all climate change variables, as well as over all discretized space and time coordinates of the variables, which can be defined at a set of stations or grid points.
Although we focus in the following on the set of climate change trajectories in which the time variable is absorbed in the comprehensive vector index i, all relations derived in the following apply formally also to the case in which v is regarded as a set of time-dependent variables, v = v(t), where v(t) can consist either of the original data or of a derived data set obtained from these by some time-dependent filtering operation (for example, running trends defined over some time interval). The following relations then apply at a given instant in time.
We note also that in Bayes’ approach one can consider the impact of the evidence e on the credibility of several alternative anthropogenic climate change hypotheses h_{j} simultaneously (see Smith et al. 2003 for another application of Bayes factors to different climate change hypotheses). As discussed in the following, the relevant likelihood densities l_{j} = p(e|h_{j}) can be computed in a relatively large common climate change phase space, without restriction, as in the conventional approach (see Hasselmann 1997; Hegerl et al. 1997), to a strongly reduced subspace spanned only by the signal patterns of the hypotheses being tested.
Optimal filtering in the Bayesian framework
Since the Bayesian detection and attribution analysis can be carried out, in principle, in the full observational climate-change phase space v, one may question whether data filtering or weighting techniques as in the conventional approach are needed at all. The optimal fingerprint technique in the conventional approach serves two purposes: it enhances the signal-to-noise ratio by reducing the dimension of the space in which the detection test is carried out, and it maximizes S/N by choosing a suitable filter that suppresses the noise relative to the signal.
The first motivation no longer holds for the Bayesian case. As pointed out, in the conventional analysis, the computation of the detection significance level requires integrating the natural-variability probability density over some finite phase space region, specifically over all climate change values greater than the observed climate change signal (“greater” being defined with respect to a metric determined by the inverse covariance matrix of the natural variability). The integral increases rapidly with the number of dimensions of the phase space. It is smallest for a one-dimensional phase space obtained by projecting the full phase space onto a single detection variable (in the optimal case, using a fingerprint pattern derived through a suitable rotation of the predicted signal pattern). Thus the significance level for detection can be markedly improved by decreasing the number of phase-space dimensions. In the Bayesian case, in contrast, the impact of the “evidence” e depends only on the ratio of likelihood densities, so that the dependence on the phase-space dimension cancels.
However, the second motivation, the maximization of the signal-to-noise ratio, remains. For example, in the case that a single external-forcing climate change hypothesis H is to be tested against the null hypothesis \(\bar H\), one would like to maximize the Bayes factor \(B = l/\hat l = p(e|h)/p(e|\bar h)\). We show below that this can be achieved through a suitable filtering operation. This leads also to dimensional reduction, but the reduction is in general less severe than in the conventional case. In contrast to the conventional optimal fingerprint technique, the filter depends not only on the natural variability covariance spectrum but, as is to be expected from Eq. (5), also on the covariance matrix C_{ij}^{*} of the model-prediction errors.
The dimensional reduction follows from the application of a selection/rejection criterion to the pattern components in a transformed phase space. The transformation is composed of three successive steps:
1. A rotation c = Uv to empirical orthogonal function (EOF) coordinates c defined with respect to the natural variability noise. This diagonalizes the covariance matrix C of the natural variability, \(C \to E = \mathrm{diag}(\hat\sigma_1^2, \hat\sigma_2^2, \ldots, \hat\sigma_n^2)\). In practice, this transformation will normally be combined with a truncation: one retains only those EOFs that can be distinguished from white noise in accordance with the criteria, for example, of North et al. (1982) or Preisendorfer and Overland (1982). Thus one works in a reduced computational space whose dimension is typically of the order of 10 to 20.
2. A rescaling of the EOF coordinates c through a diagonal transformation x = Sc, in components \(x_i = c_i/\hat\sigma_i\), such that the natural variability covariance matrix becomes the unit matrix. The probability distribution of the natural climate variability, which we assume to be Gaussian, is thereby transformed into the normalized isotropic form

$$\hat p(\mathbf{x}) = (2\pi)^{-n/2}\exp\left\{-\frac{1}{2}\sum_{i=1}^{n} x_i^2\right\} \qquad (10)$$
3. Finally, the coordinates are rotated again, z = Vx, such that the covariance matrix C_{ij}^{*} of the model errors is diagonalized.
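The three steps can be sketched numerically as follows. This is a hedged illustration, not the authors' code: all function and variable names are ours, and numpy's eigendecomposition stands in for whatever EOF software was actually used.

```python
import numpy as np

def transform_to_normalized_space(v, C, C_star, n_eofs):
    """Sketch of the three successive transformations: (1) EOF rotation
    diagonalizing the natural-variability covariance C, with truncation
    to the n_eofs leading EOFs; (2) rescaling so the natural-variability
    covariance becomes the unit matrix; (3) rotation diagonalizing the
    model-error covariance C_star in the normalized space."""
    # 1. EOF rotation c = Uv; keep the n_eofs leading EOFs.
    var, U = np.linalg.eigh(C)                  # eigenvalues ascending
    order = np.argsort(var)[::-1][:n_eofs]      # leading EOFs first
    var, U = var[order], U[:, order]
    c = U.T @ v

    # 2. Rescale: x_i = c_i / sigma_i, so that cov(x) = I.
    S = np.diag(1.0 / np.sqrt(var))
    x = S @ c

    # 3. Rotate to diagonalize the transformed model-error covariance.
    C_star_t = S @ U.T @ C_star @ U @ S
    sig2_star, V = np.linalg.eigh(C_star_t)
    z = V.T @ x

    T = V.T @ S @ U.T                           # full transformation matrix
    return z, sig2_star, T
```

After the transformation, the natural-variability covariance is the unit matrix and the model-error covariance is diagonal with variances `sig2_star`, which is what makes the component-wise selection criterion below possible.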
If the hypothesis of an externally forced climate change signal is true, the likelihood l that the observed climate change signal lies in an infinitesimal phase space element Δz at the point z = z^{p} is p(z^{p})Δz. If the complementary hypothesis that the observed climate change signal is due to natural climate variability is true, the likelihood \(\hat l\) that the observed climate change signal v lies in the transformed infinitesimal phase space element Δz at the point z = z^{p} is \(\hat p(z^{p})\Delta z\).
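In the fully transformed space both covariance matrices are diagonal, so the likelihood ratio factorizes over components. A hedged sketch under our own notational assumptions: under the null hypothesis the observation is standard normal, while under the forcing hypothesis it is centred on the predicted signal z_p, with the diagonalized model-error variances sig2_star added to the unit natural-variability variance.

```python
import numpy as np

def log10_bayes_factor(z_obs, z_p, sig2_star):
    """log10 of B = p(e|h) / p(e|hbar), evaluated at the observed point
    z_obs, with diagonal Gaussians in the normalized, rotated space."""
    z_obs, z_p, sig2_star = map(np.asarray, (z_obs, z_p, sig2_star))
    var_h = 1.0 + sig2_star   # natural variability (unit) + model error
    # Log density under the forcing hypothesis h
    log_l_h = -0.5 * np.sum(np.log(2 * np.pi * var_h)
                            + (z_obs - z_p) ** 2 / var_h)
    # Log density under the null hypothesis (standard normal)
    log_l_0 = -0.5 * np.sum(np.log(2 * np.pi) + z_obs ** 2)
    return (log_l_h - log_l_0) / np.log(10.0)
```

Working in log10 directly avoids the numerical overflow that the very large likelihood ratios discussed later would otherwise cause.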
We note that in both the Bayesian and the conventional case, the filtering is carried out using only prior information on the estimated natural variability, the predicted signal and, in the Bayesian case, the signal uncertainty. No reference is made to the evidence, the observed climate change, which would introduce a statistical bias.
Estimation of C_{ij}^{*}
The model error covariance matrix C_{ij}^{*} consists generally of three contributions: sampling errors due to the internal variability of the model, inherent model errors and errors arising from inter-model differences in the applied forcings. For a realistic model, the internal model climate variability is similar to the real internal climate variability. It can be estimated from the computed model variability of a long control run and/or from a number of integrations of a given climate change scenario using perturbed initial conditions (see Cubasch et al. 1994; Tett et al. 1999). The inherent model errors and forcing-related errors can be judged by intercomparisons of model runs using different models (although systematic errors common to all models cannot be detected in this manner). Since the number of model runs will normally be limited, and systematic model errors can be assessed only subjectively from general model performance, the prescription of the model error covariance matrix C^{*} will necessarily be somewhat subjective. Nevertheless, assuming that the spread in published scenario simulations using different models provides some measure of model and forcing errors (i.e. neglecting an unknown common model error), one can estimate C^{*} directly as the sum of the three contributions as follows:
From m scenario computations, we obtain m estimates v^{a}, a = 1,...,m of the climate change signal. Different Monte Carlo simulations with a single model can be counted either separately or averaged into a single run. In the first case, the relative contribution of individual model errors is underestimated, while the second case underestimates the relative contribution of the natural variability. A detailed discussion requires a separate treatment of both contributions, but this will not be pursued here. Some degree of arbitrariness in the weighting, equal or differentiated, of the errors of different models is inevitable, as different modellers will invariably have different assessments of model errors. Indeed, this is one of the motivations for applying Bayesian statistics.
This straightforward approach will normally run into the problem, however, that the number of independent model experiments is too small to span the selected EOF space. Thus, the estimate Eq. (19) of the model error covariance matrix C_{ij}^{*} will generally be degenerate and will need to be augmented by subjective assessments of the general structure of C_{ij}^{*}. These can be inferred from estimates of the model natural variability derived from long control runs, or from plausible assessments of model errors based on the models’ general performance in reproducing the present climate.
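A minimal sketch of the direct estimate described above, with our own names: `signals` holds the m (model-averaged) scenario signals v^a, and the rank deficiency noted in the text shows up as rank at most m − 1.

```python
import numpy as np

def estimate_model_error_covariance(signals):
    """Estimate C* as the sample covariance of the m scenario signals
    about their multi-model mean.  With m models and n-dimensional
    signals, the estimate has rank at most m - 1 (degenerate whenever
    m <= n), and must then be augmented by subjective structural
    assumptions, as discussed in the text."""
    V = np.asarray(signals, dtype=float)   # shape (m, n)
    m = V.shape[0]
    v_mean = V.mean(axis=0)                # multi-model mean signal
    dev = V - v_mean                       # inter-model deviations
    C_star = dev.T @ dev / (m - 1)         # sample covariance
    return v_mean, C_star
```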
Application
As a simple demonstration and test exercise, we have applied the Bayesian detection and attribution framework as developed above to the annual means of near-surface temperature (TAS) and precipitation (PREC) and the summer and winter means of diurnal temperature range (DTR JJA and DTR DJF). These variables are sufficiently well covered by both observations and climate model ensemble simulations to permit the straightforward application of the algorithm outlined in Sects. 2 and 3. The Bayesian analysis will be compared with results from the conventional detection method.
Three hypotheses are considered:

H_{1}: the observed climate change is due to changes in the greenhouse gas (GHG) forcing

H_{2}: the observed climate change is due to a combination of greenhouse gas plus sulfate aerosol (GS) forcings

H_{0}: the observed climate change can be explained by natural variability alone
Although for brevity we have termed the hypotheses H_{1} and H_{2} anthropogenic forcing hypotheses, we note that they represent in fact a superposition of anthropogenic forcing and natural variability. The effect of natural variability has been automatically incorporated in the estimation of the likelihood of observing the predicted anthropogenic climate change, which is determined both by the model uncertainty and by the natural climate variability that is superimposed on the anthropogenic signal.
Table 1 Models and ensemble integrations used to estimate the predicted climate change signal and the model error covariance matrix. Ensemble sizes are given for near-surface temperature (TAS), precipitation (PREC) and diurnal temperature range (DTR), for the greenhouse-gas-only scenario (GHG) and the greenhouse gas plus sulfate aerosol scenario (GS)
Model          Center               TAS        PREC       DTR
                                   GHG   GS   GHG   GS   GHG   GS
ECHAM3/LSG     MPIfM, Germany       4     2    1     2    1     2
ECHAM4/OPYC3   MPIfM, Germany       1     1    1     1    1     1
HADCM2         Hadley Centre, UK    4     4    4     4    4     4
HADCM3         Hadley Centre, UK    1     0    1     0    0     0
GFDL R15a      GFDL, USA            2     1    2     1    0     0
CSIRO Mk2      CSIRO, Australia     1     1    1     1    1     1
CGCM1          CCCma, Canada        1     3    1     3    1     3
Total                               14    12   11    12   8     11
Table 2 Application of the optimal filtering algorithm for Bayesian detection and attribution to the annual means of surface temperature (TAS) and precipitation (PREC), and winter (DJF)/summer (JJA) means of diurnal temperature range (DTR). See text for details
                                                     TAS      DTR DJF  DTR JJA  PREC     Net
Latest observed 31-year trend covers                 1967-97  1964-94  1964-94  1965-95
Number n of EOFs retained from control run           14       16       11       16

(a) Greenhouse gas forcing vs. natural variability
Number of patterns selected by Bayesian criterion    13       9*       6*       9
Bayes factor B_10                                    >100     0.36     0.11     0.6      >100
log10(B_10)                                          >2       -0.44    -0.96    -0.22    >2

(b) Greenhouse gas plus sulfate forcing vs. natural variability
Number of patterns selected by Bayesian criterion    13       9*       6        9
Bayes factor B_20                                    >100     0.52     0.05     0.46     >100
log10(B_20)                                          >2       -0.28    -1.33    -0.33    >2

(c) Greenhouse plus sulfate hypothesis vs. greenhouse hypothesis
Bayes factor B_21                                    2.1      1.44     0.43     0.77     1.001
log10(B_21)                                          0.3      0.16     -0.37    -0.11    0.0004
To estimate the model error covariance matrix C_{ij}^{*}, we first averaged the Monte Carlo realizations of individual model ensembles and then regarded this set of averaged realizations as the basic model ensemble from which C_{ij}^{*} was computed in accordance with the relations given in Sect. 4. As pointed out, this captures the inter-model differences, but underestimates the contribution from the natural model variability. However, as the number of Monte Carlo simulations available for any given model was generally too small for a reliable direct estimation of the uncertainty of the model predictions arising from the model’s natural variability, we estimated this contribution indirectly by assuming that the natural variability of each model could be represented by the natural climate variability \(\tilde v\), as determined from the long ECHAM3/LSG 2000-year control run. Although there is considerable disparity between the surface temperature variability of control runs of different climate models (see Barnett 1999), the control runs of the models considered here are in broad terms mutually consistent and reproduce the observed natural climate variability reasonably well, considering uncertainty in both the model simulations and the observations. The variability of the ECHAM3/LSG control run appears to somewhat underestimate the natural variability in surface temperature (see Stouffer et al. 2000); this should be kept in mind when interpreting the following results. A more complete evaluation of the variability in control runs of different models, including the variability for precipitation and DTR, for which no systematic model intercomparisons are available, is beyond the scope of this study.
Table 3 Interpretation of Bayes factors (after Kass and Raftery 1994)
log10(B_k0)   B_k0        Evidence in favour of H_k over H_0
0 to 1/2      1 to 3.2    Weak (“not worth more than a bare mention”)
1/2 to 1      3.2 to 10   Substantial
1 to 2        10 to 100   Strong
>2            >100        Decisive
Following these guidelines, a Bayes factor larger than 100, for example, indicates that there is decisive evidence in favor of one hypothesis over the other. We have used a threshold value of 100 for the Bayes factors to avoid spuriously high likelihood ratios, which are unreliable. The n-dimensional probability densities become very small, and the likelihood ratios correspondingly large, if the observation vector lies well outside the confidence ellipsoids of the model errors or of the natural variability, particularly for high phase-space dimensions. Such high likelihood ratios depend critically on the Gaussian hypothesis, which is questionable in the tails of the probability distributions.
The results show that both the GHG and the GS temperature signal can be detected with decisive evidence against natural variability. However, neither of the anthropogenic forcings can be detected in the DTR or precipitation signals. These results are consistent with the findings from an application of the conventional fingerprint detection technique to the ECHAM4/OPYC3 signals and ECHAM3/LSG natural variability (Schnur 2004, Detection of Climate Change Using Advanced Climate Diagnostics: Seasonal and Diurnal Cycle, submitted to Meteorol. Z.). While the temperature signal was found to be highly significant for both forcings, the precipitation and summer DTR signals could not be detected. In the case of winter DTR, the signal passed the conventional detection test for both forcings, but failed the subsequent attribution tests. This is also in accordance with the Bayesian result, in which detection and attribution are combined in a single analysis.
The last column in Table 2 shows the net results for all four variables jointly, obtained by multiplying the individual likelihood ratios (Bayes factors) under the assumption that the different likelihoods are independent. This overestimates the net impact of different observational data, but provides nevertheless a qualitative indication of the advantage of combining different sources of data in Bayesian detection and attribution studies. A comprehensive treatment would need to consider the correlations between different data types for model errors and natural variability in a joint analysis of all variables. Despite the low likelihood ratios for DTR and precipitation, there is still seen to be strong net evidence for detection of the GHG and GS forcings.
The ratio of the Bayes factors B_{20} and B_{10} also yields the Bayes factor B_{21}, which describes the impact of the evidence on the odds of hypothesis H_{2} over H_{1} (see Eq. 9). Thus, B_{21} represents the impact of the observations on the attribution of the observed climate change to the two alternative anthropogenic forcing mechanisms. In computing B_{21} as the ratio of B_{20} and B_{10}, a technical inconsistency can arise if different optimal patterns were selected in the computation of B_{20} and B_{10}. In this case, the likelihood for the H_{0} hypothesis, which formally cancels out in the ratio, is computed in two different phase spaces. To avoid this inconsistency, we used only those optimal patterns that fulfilled the Bayesian selection criterion (Eq. 14) in the computations of both B_{20} and B_{10}, so that the results of sections (A), (B) and (C) in Table 2 are based on the same phase space. This resulted in the omission of one optimal pattern for the cases marked with a star in Table 2.
Section (C) of Table 2 indicates that, for temperature, the likelihood for the GS hypothesis is about twice as high as the likelihood for the GHG hypothesis. This is consistent with the conventional findings, most of which support higher attribution levels for the GS than the GHG hypothesis (Hegerl et al. 1997; Barnett et al. 1999; IDAG 2004). However, a Bayes factor B_{21} of 2.1 represents only very weak evidence (Table 3). Also, for the net result for all observations, the likelihood ratio for both anthropogenic hypotheses is almost exactly unity. Compared with the conventional analysis, the explicit inclusion of model uncertainties in the Bayesian analysis tends to downgrade the higher attribution level for the combined greenhouse-gas-plus-aerosol forcing relative to greenhouse-gas-only forcing. This is consistent with the considerable scatter found between different models in the attribution significance level for the combined forcing using the conventional approach (Barnett et al. 1999). The scatter arises, in addition to uncertainties in the forcing itself, from the large uncertainties in the model response to sulfate aerosol forcing, an effect which is automatically included in the Bayesian analysis and reduces the impact of the sulfate forcing. For DTR and precipitation, B_{21} is close to unity and provides no evidence in favour of either of the two hypotheses. This is also consistent with the conventional analysis.
The left group of columns in Fig. 2 shows the results for the uninformed case of equal priors for all three hypotheses. The posterior probabilities in this case are the same as the Bayes factors, which were already listed in Table 2. The near-surface temperature data provide decisive evidence for the detection of anthropogenic climate change: the posterior probability for the natural variability hypothesis is negligible. The posterior probability ratio for the GS and GHG hypotheses is about 2 to 1. In the case of diurnal temperature range and precipitation (middle columns in the left group), the observations have the opposite impact: the posterior probability for the natural hypothesis is enhanced relative to the original value of 1/3. However, the net result considering all evidence (right column) still yields decisive evidence for an anthropogenic origin of the observed climate change, with approximately equal probabilities for the GHG and GS forcing hypotheses.
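The mapping from priors and Bayes factors to posterior probabilities can be sketched as follows. The numerical log10 Bayes factors in the example are purely illustrative assumptions, not the values of Table 2:

```python
import numpy as np

def posterior_probabilities(priors, log10_B_vs_null):
    """Posterior probabilities for hypotheses (H_0, H_1, H_2, ...) given
    prior probabilities and Bayes factors B_j0 relative to the null
    hypothesis (B_00 = 1): P(h_j | e) is proportional to P(h_j) * B_j0."""
    w = np.asarray(priors, dtype=float) * 10.0 ** np.asarray(log10_B_vs_null)
    return w / w.sum()

# Uninformed case: equal priors; illustrative Bayes factors with
# log10(B_10) = 2.0 and log10(B_20) = 2.3, i.e. B_21 = 10**0.3, about 2.
p = posterior_probabilities([1/3, 1/3, 1/3], [0.0, 2.0, 2.3])
```

With decisive evidence against the null, the posterior for H_0 becomes negligible while the H_2/H_1 ratio equals the Bayes factor B_21, as in the discussion above.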
The overall assessment remains unchanged if we start from the prior probabilities of either the “skeptic” or the “advocate” (middle and right groups of columns). The evidence for an anthropogenic origin of the observed temperature change is so large that the collective outcome for all variables is independent of the prior probabilities assigned to the three hypotheses. The climate change advocate would also find sufficient evidence in the winter DTR and precipitation data to favor the anthropogenic forcing hypotheses over natural variability for these data alone (in accordance with the conventional finding that the observed trends for these variables lie on the margin of detectability). However, the “skeptic” favors the hypothesis of natural variability for these variables. For the still weaker summer DTR signal, the natural variability hypothesis is favored by both the “skeptic” and the “advocate”.
Summary and conclusions
The Bayesian approach to the detection and attribution of anthropogenic climate change differs from the conventional analysis in four important aspects: (1) it allows the incorporation of climate change indices for which the statistical data base is insufficient to reliably support conventional signaltonoise analysis methods; (2) detection and attribution are not regarded separately, but are treated together as coupled aspects of a single problem; (3) natural climate variability and model uncertainties are similarly combined in a single joint analysis, and (4) the analysis depends only on local probability densities, defined at the specific point in phase space at which the climate change signal has actually been observed. In this work we have focused on the latter three aspects, in preparation for a more extensive later analysis addressing the first aspect.
The limitation to local probability densities is a consequence of Bayes’ theorem, which can be expressed in terms of Bayes factors. These describe the impact of the observational evidence for two competing hypotheses in terms of the ratio of the local likelihoods for obtaining the observations for each individual hypothesis. Since the dimension of the phase space cancels in the likelihood ratio, there is no need for dimensional reduction, in contrast to the conventional case, in which the signal is normally lost in the noise unless the phase space is drastically reduced to one or two fingerprint patterns.
It can nevertheless be advantageous to reduce the dimension of the phase space also in the Bayesian case in order to enhance the impact of the observations on a given hypothesis. This is achieved by removing patterns that reduce rather than enhance the Bayes factor for the hypothesis, relative to a competing hypothesis such as natural variability. The filter must, of course, be defined without knowledge of the observations. Expressed in a suitably normalized EOF pattern space, the relevant pattern selection criterion is simply that the square of the predicted signal amplitude must be at least twice as large as the logarithm of the standard deviation of the model uncertainty.
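The selection criterion as stated can be sketched as follows (our names, for illustration: z_p are the predicted signal amplitudes and sig_star the model-error standard deviations, both in the normalized, rotated pattern space):

```python
import numpy as np

def select_patterns(z_p, sig_star):
    """Retain pattern i if the squared predicted signal amplitude exceeds
    twice the logarithm of the model-uncertainty standard deviation,
    following the criterion stated in the text."""
    z_p = np.asarray(z_p, dtype=float)
    sig_star = np.asarray(sig_star, dtype=float)
    return z_p ** 2 > 2.0 * np.log(sig_star)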
As illustration, we applied the Bayesian filtering technique to a small set of climate variables, which had also been subjected to a conventional detection and attribution analysis: nearsurface temperature (TAS), winter and summer diurnal temperature range (DTR), and precipitation (PREC). Two forcing mechanisms were considered: greenhouse gas alone (GHG), and greenhouse gas plus aerosols (GS). The estimates of natural climate variability were derived from the Hamburg ECHAM3/LSG simulations. Estimates of the predicted signals and the model uncertainties required for the Bayesian analysis were obtained from sets of ensemble integrations using seven different models. The number of patterns retained in the Bayesian analysis ranged from six (summer DTR) to 13 (TAS), as opposed to two in the conventional analysis.
Using the conventional approach, a climate change signal exceeding the natural variability noise level could be detected for temperature and winter DTR, but not for summer DTR and precipitation. For temperature, both anthropogenic forcing mechanisms were consistent with the observed signal, whereas for winter DTR, the attribution results were dependent on the model control run used for the estimation of natural variability: the GS signal was marginally consistent with observations for ECHAM4/OPYC3 variability, but neither of the anthropogenic forcings was consistent when ECHAM3/LSG natural variability was used.
The Bayesian analysis yielded qualitatively similar results. The observed near-surface temperature change provided decisive evidence in favour of anthropogenic forcing versus natural climate variability, with a weak preference for GS forcing over GHG forcing. However, none of the remaining data enhanced the probabilities for the anthropogenic forcing hypotheses. This applied also to the winter DTR trends, which could be detected and at least marginally attributed to GS in the conventional analysis. The difference can presumably be attributed to the larger uncertainties of the model predictions, in particular for DTR and precipitation, which are not included formally in the conventional approach, but enter explicitly in the Bayesian analysis. In general, the weighting of the information on sulfate aerosol forcing is downgraded in the Bayesian analysis relative to the conventional approach through the inclusion of the model error structure. Thus, the large differences in the impact of sulfate forcing found for different models in conventional analyses (see Barnett et al. 1999) are automatically included and quantified in the Bayesian approach. Despite the negative impact of the observations of diurnal temperature range and precipitation on the priors for the anthropogenic forcing hypotheses, the net impact of all variables together is dominated by the near-surface temperature and decisively supports the anthropogenic climate change hypotheses, with approximately equal probabilities for greenhouse gas forcing with and without sulfate aerosols.
It is planned to apply the techniques developed and illustrated here to a comprehensive set of climate change indices that have been discussed in the literature (cf. Houghton et al. 2001) as possible evidence of anthropogenic climate change. This requires assessments of the natural variability and anticipated climate change signals for all data, including the signal uncertainties and model error structure. Due to the lack of adequate statistical information for many of these indices, these assessments will need to be supplemented in large part by subjective expert judgements.
See the IPCC Data Distribution Centre, http://ipcc-ddc.cru.uea.ac.uk/, for further data, model descriptions and references.
Acknowledgements
This work was supported by the National Oceanic and Atmospheric Administration (NOAA) Climate Change Data and Detection program and the US Department of Energy, Office of Energy Research, as part of the International ad hoc Detection Group effort, and by the European Commission QUARCC project, ENV4960250.