# Predictability in a changing climate

## Abstract

The standard framework of predictability defines a variable to be unpredictable from a set of observations if it is independent of those observations. This definition requires comparing two distributions: a forecast distribution that is conditioned on observations, and a climatological distribution that is not. However, if the system is non-stationary because of externally forced climate changes, or is characterized by a climatological distribution that is much broader than the distribution of states over the recent past, then a rigorous application of this framework gives unsatisfying answers to reasonable questions about weather and climate predictability. This paper proposes generalizations of this framework that resolve these limitations and are consistent with the definition of independence. The first generalization, which was proposed effectively by Lorenz and Leith, is to distinguish initial-value predictability from forced predictability, where the latter is defined by time variations in the climatological distribution. This paper goes a step further by introducing a new measure, called *total climate predictability*, that can be decomposed into a sum of previously known measures of forced and initial-value predictability, namely relative entropy and mutual information. The second generalization, called *generalized predictability*, provides a new approach to filtering in such a way that processes with long time scales do not contribute to predictability. This generalization is important when the system’s climatological distribution is much broader than the range of climates experienced in the recent past. These concepts are illustrated using a simple model in which all aspects of predictability can be solved exactly.

## Keywords

Predictability · Information theory

## 1 Introduction

The discovery of dynamical chaos by Lorenz (1963) constitutes one of the most seminal contributions to twentieth century science. Lorenz’s lucid description of chaos and its implications for weather predictability have hardly been surpassed. Since Lorenz’s paper, various facets of predictability have been developed in numerous papers (see Palmer and Hagedorn 2006, for a review). In general, a variable is said to be unpredictable from a set of observations if it is independent of those observations. This definition requires comparing two distributions: one that depends on observations and one that does not. These distributions are called the forecast and climatological distributions, respectively. When these two distributions are equal, then the variable is independent of the observations and therefore unpredictable. In principle, the forecast distribution can be derived from the governing dynamical equations using Liouville’s theorem or the Fokker–Planck equation. Whether the climatological distribution exists is a difficult question that is the subject of ergodic theory. In this paper, we assume the climatological distribution is equal to the forecast distribution initialized in the infinite past. In practice, these distributions can be estimated from an ensemble of forecasts, in which each member experiences the same time-dependent external forcing and evolves according to the governing physical laws, but with the forecast ensemble initialized at the present time and the climatological ensemble initialized long before the present time (Leith 1978). Although this framework provides a rigorous foundation for defining predictability, it has at least two unsatisfying elements.

The first unsatisfying element concerns externally forced variability. For instance, changes in solar insolation due to the earth’s orbit about the sun lead to a characteristic annual cycle in local climate. Because the climatological distribution is obtained from the same equations as the forecast distribution, it includes the annual cycle. As a result, the annual cycle is contained in both the forecast and climatological distributions and therefore does not contribute to predictability, which requires a *difference* in distributions. This formulation is often viewed as desirable since it implies a forecaster does not receive credit for predicting, say, that summer will be warmer than winter. However, an inescapable consequence of this formulation is that the response to other forms of external forcing also should be included in the climatological distribution, and therefore will not contribute to predictability. In particular, the response to changes in greenhouse gas concentration due to human activity should be included in both the forecast and climatology, and therefore would not contribute to predictability. Such a framework is very unsatisfying given the importance of predicting climate change. Lorenz (1975) recognized this issue and introduced “predictability of the second kind,” or what is now called boundary-value predictability or forced predictability. A framework that accommodates externally forced climate changes has been discussed in several papers (Lorenz 1970, 1975; Leith 1978; Meehl et al. 2009; Branstator and Teng 2010). The present paper goes beyond these studies by giving a mathematically explicit treatment and by proposing a new measure, called *total climate predictability*, that can be decomposed as the sum of previously proposed measures of initial-value predictability and forced predictability. Part of this framework has appeared previously in DelSole (2017).

The above formulation of initial-value and forced predictability still has certain unsatisfying elements. Specifically, the climatological distribution derived from the above framework may be too broad for certain applications. For instance, paleo-climate records show that large climate changes can occur abruptly over periods as short as a decade (Alley and Clark 1999). This fact demonstrates that the climate system can display very different behavior under nearly identical external forcing. Consequently, the system’s climatological distribution may be much broader than the distribution that describes the past few decades. Under such a broad climatology, weather would be deemed predictable for as long as the associated forecast predicts no shift in climate (possibly for years). This particular issue does not arise often in predictability studies because climate models often are selected according to their ability to simulate the climate of the past century when given the corresponding forcing. Whether this criterion leads to an overly narrow climatological distribution is unclear. In any case, fundamentally similar issues arise even in seasonal predictability studies. For instance, some droughts can last a decade or longer (“mega-droughts”; Cook et al. 2010). To the extent that these droughts occur naturally, the climatological distribution should include them, which leads to the same unsatisfying conclusion: weather would be predictable for as long as the associated forecast predicts the mega-drought to continue. Similarly, other long time-scale processes, e.g., El Niño, ocean overturning circulations, and land-ice, broaden the climatological distribution. To resolve this issue, we propose *generalized predictability*, based on a *conditional climatology*, that effectively filters out predictability on long time scales so that the predictability on short time scales can be identified. 
This formalism can define a spectrum of predictability questions distinguished by time scale.

This paper is organized as follows. Section 2 reviews the standard framework of predictability. Section 3 discusses limitations of the standard framework and describes a generalized framework that resolves them. Specifically, we define forced predictability as changes in the climatology over time, and define a conditional climatology that allows predictability on different time scales to be isolated. Section 4 proposes a new measure of predictability called *total climate predictability* that captures both initial-value and forced predictability. Section 5 illustrates the above concepts in a forced AR(1) model, where all distributions required to evaluate predictability can be written in closed form. This paper concludes with a summary of our results.

## 2 What is predictability?

Most people agree that the outcome of a (fair) coin toss is unpredictable. Why? In other words, can you explain precisely what makes you say a coin toss is unpredictable? Is it because you cannot make a prediction about it? Actually, you *can* make a prediction about it. Here it is: the probability of heads is 50%. Moreover, this prediction correctly describes the outcomes. So there you have it—a falsifiable prediction about a coin toss! Now perhaps you believe a coin toss is unpredictable because you can make only probabilistic predictions about it, as opposed to making a prediction with certainty. This also is not true: all forecasts of nature are uncertain, and the most complete description of uncertainty is a probability distribution. Hence, all forecasts of nature *should* be probabilistic. Therefore, the mere probabilistic nature of the forecast does not make a coin toss unpredictable.

The reason a coin toss is considered unpredictable is because *the outcome is independent of observations typically available to the forecaster prior to the toss*.^{1} The key word here is *independent*. More precisely, a variable is unpredictable from a given set of observations if it is independent of those observations. Any other definition of predictability would lead to situations that contradict the meaning of the term. For instance, if predictability were not defined based on statistical independence, then a variable might be declared predictable even though it is independent of observations, or be declared unpredictable even though there exists a dependence on observations.

Let the observations at time *t* be denoted as \(O_t\) (which includes observations before time *t*), and let the variable being predicted at future time \(t+\tau\) be denoted as *Y*. The variable *t* is called *initial condition time*, \(\tau\) is called *lead time* (usually positive), and \(t+\tau\) is called *verification time*. The most complete description of the future value of *Y* given observations \(O_t\) is the conditional distribution \(p_{t+\tau }(y|o_t)\). This distribution is called the *forecast* distribution.^{2} The distribution of the future value of *Y* unconditional on observations is denoted \(p_{t+\tau }(y)\). This distribution is called the *climatological* distribution of *Y*. The variable *Y* is said to be unpredictable from \(O_t\), in an *initial-value sense*, if it is independent of \(O_t\):

\[ p_{t+\tau }(y|o_t) = p_{t+\tau }(y). \tag{1} \]

Predictability defined relative to this condition is called *initial-value predictability*.

Strictly speaking, predictability at \(t+\tau\) depends on the quality, type, and spatio-temporal distribution of observations at time *t*. Thus, a variable *Y* might be deemed predictable with respect to one set of observations but unpredictable with respect to a different set of observations. For instance, many hurricanes over the ocean were unobserved during the pre-satellite era and therefore landfall was difficult to predict. After satellite data became available, many hurricanes over the ocean that previously would have been unobserved could be identified and tracked to provide skillful predictions of landfall. Predictability also may be improved simply by reducing observational errors from the existing network. This dependence of predictability on observations raises questions about the best observing system for maximizing predictability, especially whether it is advantageous for the observational network to adapt to the synoptic situation (Lorenz and Emmanuel 1998).

If a variable has no initial-value predictability, there still exists meaningful information about its future value. Specifically, the climatological distribution describes the variable’s future value. For example, the weather a year from today cannot be predicted with precision, but the season will be the same and that knowledge allows us to predict that the temperature will be in a certain range characteristic for that season. A variable is unpredictable only in the sense that available observations do not tell us anything *different* than what we already knew from the climatological distribution. This point was illustrated by the coin flip: the unconditional distribution is “50% heads, 50% tails” and accurately describes the future coin flip. Because this distribution is not altered by conditioning on current observations, the event is unpredictable.

*deterministic*, otherwise it is called *stochastic*.

Distributions derived from (2) satisfy the *Markov property*, which means that the distribution conditioned on a sequence of past states depends only on the most recent state. Accordingly, the dependence of the future state on the present is described by the *transition kernel* \(p_{t+\tau }(y|y_t)\) (also called a stochastic kernel or Markov kernel). The transition kernel is essentially a propagator obtained by integrating the governing equations. More precisely, the transition kernel is obtained from the governing equations (2) by solving the *Chapman–Kolmogorov equation* (Gardiner 1990, ch. 2). If the governing equations are deterministic and satisfy certain regularity conditions, then the Chapman–Kolmogorov equation can be written in a differential form called *Liouville’s equation*. If the governing equations are stochastic and of a certain class, then the Chapman–Kolmogorov equation can be written in a differential form called the *Fokker–Planck equation*. In either case, the transition kernel is derived by solving a partial differential equation under suitable boundary conditions and given initial condition. Alternatively, an approximate transition kernel might be estimated empirically from a long record of past observations.

*y* can be inferred from observations; many different states *Y* are compatible with the observed realization of \(o_t\). The distribution of states that are compatible with the observations \(o_t\) is described by the *analysis* distribution, denoted \(p_t(y|o_t)\) (note that the lead time is zero). The derivation of this distribution is a central goal of *data assimilation* (Jazwinski 1970).

*Y* conditioned on the present value *and* conditioned on presently available observations. Note that the integration variable \(y_t\) is a vector, but for simplicity the multivariate nature of the integration is ignored. For Markov systems, observations do not add information about the future if the initial state is known, hence

*t*. This distribution can be interpreted as a collection of initial states, called an *ensemble*, with relative frequencies proportional to the probability density. In weather and climate studies, this ensemble can be imagined as a theoretical collection of Earths, each subjected to the same external forcing. As time advances forward, each member of the ensemble evolves in accordance with the physical laws. The initial states may be visualized as a cloud of points in state space, each of which streams through state space as it follows a trajectory determined by physical laws. The distribution at any future time is obtained by integrating over all initial states.
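The relation between the analysis distribution, the transition kernel, and the forecast distribution described in this section can be sketched as follows. This is a reconstruction from the surrounding definitions, assuming only the law of total probability and the Markov property; it does not reproduce the numbering or exact form of the original equations.

\[
\begin{aligned}
p_{t+\tau}(y \mid o_t) &= \int p_{t+\tau}(y \mid y_t, o_t)\, p_t(y_t \mid o_t)\, dy_t
  && \text{(law of total probability)} \\
p_{t+\tau}(y \mid y_t, o_t) &= p_{t+\tau}(y \mid y_t)
  && \text{(Markov property)} \\
\Rightarrow \quad p_{t+\tau}(y \mid o_t) &= \int p_{t+\tau}(y \mid y_t)\, p_t(y_t \mid o_t)\, dy_t .
\end{aligned}
\]

Read this way, the forecast distribution is the transition kernel averaged over the ensemble of initial states drawn from the analysis distribution.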

Predictability of the climate system is limited by the fact that the atmosphere is *chaotic*. A chaotic system is one whose evolution is sensitive to small changes in the initial condition. This means that if the atmosphere were to come arbitrarily close to a state which it had assumed previously, the subsequent evolution would diverge wildly from the previous evolution after sufficient time. As a consequence, even the smallest uncertainties in the initial state translate into large uncertainties in the forecast after sufficient time.

For sufficiently small initial errors, predictability can be characterized by a set of Lyapunov exponents that measure error-growth rate. Atmospheric models based on the primitive equations suggest that small-amplitude perturbations in the atmosphere amplify with an average doubling time of about 1.8 days, suggesting an upper limit of predictability of about two weeks (Simmons and Hollingsworth 2002). However, atmospheric predictability depends on spatial structure, with large-scale perturbations being more predictable than small-scale perturbations, so planetary-scale waves are found to be predictable even for averages over leads of 16–46 days (Shukla 1981). Planetary waves may also interact with stratospheric waves to enhance their predictability (Tripathi et al. 2015). Beyond a few weeks, slowly varying components of the climate system, such as sea surface temperature, soil moisture, snow cover, or sea ice thickness, are still predictable owing to their slower time scale. These slower components can influence the atmosphere and hence give rise to predictability of atmospheric variables beyond a month (Charney and Shukla 1981; Shukla 1998). Because these slower components lie on the earth’s surface, this type of predictability is called *boundary forced predictability*. To the extent that atmospheric variables are independent of the slower components on long time scales, they can be treated as white noise stochastic forcing (Hasselmann 1976). In addition, coupled atmosphere-ocean systems support new mechanisms of predictability beyond those found in the separate uncoupled systems, the best established of which is the El Niño Southern Oscillation (ENSO; Philander 1990), which has a doubling time of several months (Goswami and Shukla 1991) and can be predicted a few months in advance (Barnston et al. 2012).

Although small errors can grow exponentially in chaotic systems, the growth rate slackens when errors become large. The question arises as to when the errors become so large that all predictability is lost. The limit of predictability depends on how close the forecast distribution is to the climatological distribution. Defining this limit requires defining the climatological distribution.

*transitive system* (Lorenz 1968). To understand the consequence of assumption (7), consider applying the transformation \(t \rightarrow t - \tau\) to (6) and taking the limit \(\tau \rightarrow \infty\):

*is* the climatology).

If the climatological distribution \(p_t(y)\) is independent of time, then the system is said to be stationary. A stationary climatological distribution describes the relative frequency of states associated with sampling the system randomly in time. However, the observed climate system is clearly non-stationary owing to annual and diurnal cycles (e.g., winter and summer states clearly belong to different distributions). A more realistic assumption is that the climatological distribution is *cyclostationary*, i.e., it is a *periodic* function of time. For cyclostationary systems, the climatological distribution describes the relative frequency of states associated with sampling the system at random years but conditioned on calendar day and hour. A common approach to estimating the mean of the climatological distribution is to compute the mean of each calendar day and hour over a 30-year period (but see Narapusetty et al. 2009).

*noise*. The mean of the forecast often is called *signal*. As time advances, the forecast distribution typically approaches the climatological distribution. As a result, differences between the two distributions diminish with lead time, leading to decay of initial-value predictability. The *predictability time scale* is defined as the minimum lead time at which the difference between forecast and climatological distributions falls below some threshold.
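As a minimal sketch of the predictability time scale, consider an unforced AR(1) process \(y_{t+1} = \phi y_t + w_t\) with Gaussian noise, whose forecast and climatological distributions are both Gaussian. The function names, the choice of the variance fraction \(\phi^{2\tau}\) as the measure of difference, and the 5% threshold are illustrative assumptions, not definitions taken from this paper.

```python
import math


def forecast_moments(phi, sigma_w, y0, tau):
    """Signal (mean) and noise variance of y_{t+tau} given y_t = y0
    for the AR(1) model y_{t+1} = phi * y_t + w_t, w_t ~ N(0, sigma_w^2)."""
    mean = phi ** tau * y0
    var = sigma_w ** 2 * (1.0 - phi ** (2 * tau)) / (1.0 - phi ** 2)
    return mean, var


def predictability_time_scale(phi, threshold=0.05, max_lead=10_000):
    """Smallest lead at which the fractional gap between forecast variance
    and climatological variance drops below `threshold` (assumed criterion).

    For this model the gap is 1 - var_forecast / var_clim = phi**(2 * tau).
    Returns None if the threshold is not reached within `max_lead` steps.
    """
    for tau in range(1, max_lead + 1):
        if phi ** (2 * tau) < threshold:
            return tau
    return None


if __name__ == "__main__":
    for phi in (0.5, 0.9, 0.99):
        print(phi, predictability_time_scale(phi))
```

A more persistent process (larger \(\phi\)) yields a longer time scale, and the result depends on the threshold, which is why the definition in the text leaves the threshold unspecified.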

It should be recognized that the variable being predicted may be a state variable averaged over some period of time, rather than the instantaneous value of the state variable. For instance, *weather predictability* and *seasonal predictability* are both examples of initial-value predictability, but the former concerns variables averaged over hours or days while the latter concerns variables averaged over months or seasons. *Decadal predictability* concerns the predictability of multi-year averages of state variables. Other forms of predictability are discussed in the literature (e.g., “potential predictability”), but the definitions are not always consistent and so will not be defined here.

## 3 Climatology in a changing climate

The above framework is used in many predictability studies. In this section, we show that rigorous application of this framework leads to unsatisfying conclusions if the climatological distribution varies from year to year or is overly broad compared to the climate of the recent past. We then propose extensions of the above framework that resolve these limitations.

*external forcing*. The climatological distribution of this system requires solving (9) starting from the distant past, and would be expected to vary in time in response to changes in external forcing \(f_i\). Changes in climatological distribution from year to year are called *climate changes*. Thus, climate changes due to external forcing, such as that due to solar variability or human-caused changes in greenhouse gas concentrations, would be included in the climatological distribution. Because only differences relative to the climatological distribution contribute to predictability, it follows that *climate changes do not contribute to initial-value predictability*. In fact, climate change is treated the same as annual and diurnal cycles: they are each subsumed into the climatological distribution. However, changes in the climatological distribution relative to its distribution in different years, especially those due to global warming, are of considerable interest in themselves. To distinguish questions of this type from those of initial-value predictability, Lorenz (1975) suggested a new type of predictability called *predictability of the second kind*, or what is today called *forced predictability* or *boundary-value predictability*.^{3} These terms are used even if the forcing itself is unpredictable (e.g., multi-decadal variations due to volcanic eruptions or solar variability). Assessing whether a climate forcing changes the climatological distribution of a variable is called *climate change detection*. Both kinds of predictability involve measuring differences in distributions, hence tools used to analyze one kind of predictability often are useful in the other.

The above framework for initial-value and forced predictability still may be unsatisfying. In particular, the climatological distribution derived from (7) may be much broader than is appropriate for certain problems. For instance, if the climatological distribution is obtained by initializing the system in the *infinite* past, then even orbital parameters would have a large uncertainty due to the impacts of random planetesimals over billions of years (indeed, this climatology may assign non-zero probability to a universe without Earth). Aside from this, the climate system may display a wide range of variability even for the same external forcing. For instance, geological evidence reveals that the climate can change drastically over short periods (e.g., as much as 10 \(^\circ\)C locally in 10 years; Alley et al. 2003). The existence of abrupt climate change demonstrates that the climate can have very different distributions over time periods subjected to nearly identical forcing. Because a climatological distribution describes all climates compatible with the physical laws, it can be much broader than the distribution that describes the recent past. Under a broad climatology, even weather could be deemed predictable for years, provided the associated forecast predicts no shift in climate. A framework that expunges weather predictability is unsatisfying. The above problem might seem esoteric because overly broad climatologies do not occur frequently in practice. However, this impression might be an artifact of assessing climate models on the basis of their ability to simulate the last few decades. In particular, the climatology of the past few decades may be unrealistically narrow. For instance, some models suggest that even El Niño can behave differently across multi-decadal epochs under fixed external forcing (Wittenberg 2009).

In any case, essentially the same issues occur in weather and seasonal predictability. Specifically, weather is affected by a variety of natural, long time-scale processes, e.g., El Niño, the Atlantic Meridional Overturning Circulation (AMOC) and mega-droughts. As a result, the climatological distribution that describes the joint behavior of weather and long time-scale processes is broader than the distribution that is conditioned on the state of a long time-scale process. Unfortunately, some attempts to define predictability conditioned on the state of a long time-scale process have been based on problematic modeling frameworks. For instance, one familiar approach is to run an atmospheric general circulation model (AGCM) with specified sea surface temperatures. However, the unphysical one-way forcing of an AGCM with prescribed lower boundary conditions raises questions about how results from such models relate to the fully coupled system (Barsugli and Battisti 1998). A similar criticism can be raised for any modeling framework in which one artificially constrains variables in a climate model that would ordinarily feed back to other variables (e.g., sea surface temperatures, land variables, sea ice).

*filtering out* the predictability of processes with long time scales, so that predictability on shorter time scales can be identified. The question arises as to whether this filtering can be done in a mathematically consistent way without constraining a dynamical model in an unphysical way. We propose such an approach by defining a *conditional climatology*, which is nothing more than a forecast initialized in the *finite* past, at time *s*, where \(s < t\):

\[ p_{t+\tau }(y | o_s ). \tag{10} \]

*Generalized predictability* is measured by the degree to which the forecast \(p_{t+\tau }(y| o_t )\) differs from the conditional climatology \(p_{t+\tau }(y | o_s )\). For instance, if \(t-s = 30\) years, processes with time scales much longer than 30 years would be virtually identical in the forecast and climatology and therefore not contribute to predictability in our generalized sense. In this way, predictability of processes with time scales longer than 30 years is selectively attenuated. On the other hand, processes on shorter time scales, such as atmospheric weather and El Niño, would differ between the forecast and climatology and therefore contribute to predictability. As *s* decreases, the climatology (10) is initialized further in the past and therefore broadens to describe a wider range of climate variability, increasing the generalized predictability due to processes with time scales longer than 30 years. The greater the value of \(t-s\), the greater the differences between forecast and conditional climatology, and the greater the predictability of long time-scale processes. Conversely, as *s* approaches *t*, the closer the forecast and climatology, and the smaller the predictability of long time-scale processes. The parameter *s* serves as a filter parameter that controls the time scale of processes that are included in generalized predictability.
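The filtering role of \(t-s\) can be illustrated with a toy variable that is the sum of a fast and a slow AR(1) component. The sketch below rests on several assumptions that are not from this paper: the components are independent and Gaussian, the observations reveal each component's state exactly, and the difference between the forecast \(p_{t+\tau}(y|o_t)\) and the conditional climatology \(p_{t+\tau}(y|o_s)\) is summarized by their relative entropy averaged over observations, which for this Gaussian Markov model reduces to half the log of the ratio of conditional variances.

```python
import math


def conditional_variance(components, lag):
    """Variance of a sum of independent AR(1) components `lag` steps after
    their states were observed exactly.

    components: list of (phi, stationary_variance) pairs (illustrative).
    """
    return sum(v * (1.0 - phi ** (2 * lag)) for phi, v in components)


def generalized_predictability(components, tau, gap):
    """Average relative entropy (nats) between the forecast p(y | o_t) and the
    conditional climatology p(y | o_s), where gap = t - s >= 0 and tau >= 1."""
    v_forecast = conditional_variance(components, tau)
    v_cond_clim = conditional_variance(components, tau + gap)
    return 0.5 * math.log(v_cond_clim / v_forecast)


if __name__ == "__main__":
    fast = (0.5, 1.0)     # decorrelates within a few steps
    slow = (0.999, 1.0)   # decorrelates over roughly a thousand steps
    for gap in (0, 30, 300, 3000, 100_000):
        both = generalized_predictability([fast, slow], tau=1, gap=gap)
        fast_only = generalized_predictability([fast], tau=1, gap=gap)
        print(gap, round(both, 4), round(fast_only, 4))
```

For gaps that are short relative to the slow time scale, the value for the combined variable stays close to the fast-only value, so the slow component contributes little. As the gap grows, the slow component increasingly dominates and the measure approaches its value under the unconditional climatology.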

*conditional independence*. Here, conditional independence means

\[ p_{t+\tau }(y | o_t, o_s ) = p_{t+\tau }(y | o_s ), \tag{11} \]

where the left-hand side is the distribution of *Y* given *both* \(O_t\) *and* \(O_s\). This identity differs from (1) only by \(o_s\) appearing as a conditional in both distributions. Formally, this identity expresses the fact that *Y* is *conditionally independent* of \(O_t\) given \(O_s = o_s\), for any value of \(o_s\). Recall that, by definition, \(O_t\) includes all observations up to and including time *t*, hence it includes \(O_s\). Thus, \(\{O_t, O_s\} = O_t\), and the identity (11) can be written equivalently as

\[ p_{t+\tau }(y | o_t ) = p_{t+\tau }(y | o_s ). \]

Thus, defining the climatology as a forecast initialized in the *finite* past yields a definition of predictability that is consistent with the definition of conditional independence. Taking the limit \(s \rightarrow -\infty\) recovers the traditional definition of predictability.

Generalized predictability offers a new way to identify regime-dependent predictability. To illustrate, consider mega-droughts. Presumably, weather behaves differently according to whether a drought is occurring or not. During the middle of a mega-drought, observations from one year ago effectively localize the climate in a mega-drought. Thus, for \(t-s = 1\) year, the forecast and conditional climatology both describe the same mega-drought, so predictability of the mega-drought itself is attenuated, allowing predictability of weather *during the drought* to be identified. In this way, regime-dependent weather predictability can be diagnosed by varying *t* while holding \(t-s\) constant: the parameter *t* controls the temporal location while \(t-s\) controls the time scales of interest. On the other hand, as \(s \rightarrow -\infty\), the conditional climatology broadens to describe both droughts and non-droughts, thereby allowing us to quantify predictability of the drought itself. This approach to isolating short-period predictability differs fundamentally from time-filtering. The fact that defining a conditional climatology as (10) can act to filter predictability of processes on time scales beyond \(t-s\) does not seem to have been recognized previously.

In this paper, the current climate is understood to be the relevant climatology. Accordingly, in the remainder of this paper, we suppress the conditional \(o_s\) and adopt the implicit definition (10) for the climatological distribution.

## 4 Measures of predictability in a changing climate

By definition, a variable is unpredictable if the forecast and climatological distributions are identical. It is natural, then, to quantify predictability by some measure of the *difference* in distributions. However, no single measure can satisfy all purposes: some differences are more important than others, depending on the application. For instance, a civil engineer may be interested in the change in *average* rainfall while an emergency manager may be interested in the change in *extreme* rainfall. In choosing a measure, it is important to distinguish between predictability and *utility* (Palmer 2002). Utility is a measure of the *benefit* derived from a prediction. Because benefit depends on the arbitrary user, no universal measure of utility can be defined. Nevertheless, predictability and utility are related: if an event is unpredictable, then a forecast cannot *add* utility relative to that which is available from the climatological distribution. Thus, predictability analysis determines whether a forecast can add utility beyond the climatological distribution. Another nuance is that, technically, *any* difference in distribution constitutes predictability, but the difference may be so slight as to have no utility. These considerations imply that predictability is a necessary but not sufficient condition for utility.

*relative entropy* (also called *Kullback–Leibler divergence*), which is a central quantity in information theory and arises naturally in a number of disciplines, including communication, finance, statistical mechanics, and statistics (Kullback 1968; Cover and Thomas 1991; Mackay 2003). The relative entropy between two distributions *p*(*x*) and *q*(*x*) is defined as

\[ \int p(x) \log \frac{p(x)}{q(x)} \, dx. \]

Relative entropy vanishes *if and only if* \(p= q\), otherwise it is positive. Additional reasons for preferring relative entropy will emerge below, but one particular application is worth mentioning here. A reasonable definition of the value of a forecast is the increase in wealth that results from knowing the forecast. In the absence of a credible forecast, an investor could use the climatological distribution inferred from past observations to devise a profitable investment strategy. This approach is very common in the insurance industry, for instance. The question arises as to how much an investor can increase the return by using the forecast distribution. A strikingly simple answer is that the difference in rate at which wealth is doubled by the best investment strategy equals the relative entropy between the forecast and climatological distributions (Cover and Thomas 1991).
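As a small numerical check of these properties, the sketch below evaluates the relative entropy of one univariate Gaussian relative to another, assuming the standard definition as the integral of \(p \log (p/q)\). It compares the well-known closed form for Gaussians with a direct trapezoidal integration. The function names and the example parameters are illustrative, and the result is in nats (natural logarithm) rather than bits.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def relative_entropy_gaussian(m_p, s_p, m_q, s_q):
    """Closed-form relative entropy (nats) of N(m_p, s_p^2) relative to N(m_q, s_q^2)."""
    return math.log(s_q / s_p) + (s_p ** 2 + (m_p - m_q) ** 2) / (2.0 * s_q ** 2) - 0.5


def relative_entropy_numeric(m_p, s_p, m_q, s_q, n=20001, half_width=12.0):
    """Trapezoidal estimate of the integral of p * log(p / q) on a wide grid
    centred on the mean of p."""
    lo = m_p - half_width * s_p
    step = 2.0 * half_width * s_p / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        log_p = gaussian_logpdf(x, m_p, s_p)
        log_q = gaussian_logpdf(x, m_q, s_q)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(log_p) * (log_p - log_q)
    return total * step


if __name__ == "__main__":
    # Hypothetical forecast N(1, 0.8^2) and climatology N(0, 1.5^2)
    print(relative_entropy_gaussian(1.0, 0.8, 0.0, 1.5))
    print(relative_entropy_numeric(1.0, 0.8, 0.0, 1.5))
```

Working with log densities avoids dividing by a density that underflows in the tails. Dividing the result by \(\log 2\) converts nats to bits, the unit in which the wealth-doubling interpretation is usually stated.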

*initial climatology* \(p_t(y)\):

*total climate predictability* \(M_T\),

*mutual information*, proposed by DelSole (2004), and \(M_F\) is a measure of forced predictability proposed by Branstator and Teng (2010). Thus, the previously proposed measures of predictability \(M_{IV}\) and \(M_F\) emerge naturally in our framework.

*mutual information*:

*average* initial-value predictability. This measure depends on lead time \(\tau\) but not on \(o_t\) (because \(o_t\) has been integrated out).

The above measures are close to those proposed by Branstator and Teng (2010). Specifically, Branstator and Teng (2010) estimated the climatology from “an ensemble of realizations, each beginning long before \(t = 0\) and each experiencing the same time-dependent external forcing”. This procedure is tantamount to averaging over initial conditions and therefore is equivalent to our measure \(M_F\). On the other hand, Branstator and Teng measure initial value predictability using the relative entropy between the forecast and climatology *without averaging over initial conditions*, which is equivalent to our measure \(R_{IV}\). Branstator and Teng then add these two measures together to quantify total predictability. The sum \(M_F + R_{IV}\) quantifies the predictability due to changing climatology and the *particular* forecast. That is, this measure is a function of both lead time and initial condition, and hence is specific to a particular initial condition. In contrast, our proposed measure \(M_T = M_F + M_{IV}\) is independent of initial condition and quantifies predictability, or rate of increase in wealth, *in an average sense* over all initial conditions, and is therefore analogous to familiar skill measures that are averaged over initial conditions.
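The decomposition \(M_T = M_F + M_{IV}\) can be sketched in one line of algebra. The notation below is an assumption made for this illustration: \(p_{t+\tau}(y \mid o_t)\) denotes the forecast distribution, \(p_{t+\tau}(y)\) the climatology at the verification time, \(p_t(y)\) the initial climatology, and \(p(o_t)\) the distribution of observations. The only step used is that averaging the forecast over \(o_t\) recovers the climatology at \(t+\tau\).

```latex
% Assumed notation; see the lead-in paragraph. Uses
% \int p(o_t)\, p_{t+\tau}(y \mid o_t)\, do_t = p_{t+\tau}(y).
\begin{align}
M_T &= \iint p(o_t)\, p_{t+\tau}(y \mid o_t)
       \log \frac{p_{t+\tau}(y \mid o_t)}{p_t(y)} \, dy \, do_t \\
    &= \underbrace{\iint p(o_t)\, p_{t+\tau}(y \mid o_t)
       \log \frac{p_{t+\tau}(y \mid o_t)}{p_{t+\tau}(y)} \, dy \, do_t}_{M_{IV}}
     + \underbrace{\int p_{t+\tau}(y)
       \log \frac{p_{t+\tau}(y)}{p_t(y)} \, dy}_{M_F}
\end{align}
```

In this form the first term is the mutual information between \(Y\) at \(t+\tau\) and \(O_t\), and the second is the relative entropy between the final and initial climatologies.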

1. \(M_{IV}\) and \(M_F\) are non-negative.
2. \(M_{IV}\) vanishes if *and only if* *Y* at \(t+\tau\) is independent of \(O_t\).
3. \(M_F\) vanishes if *and only if* the climatologies at *t* and \(t+\tau\) are identical.
4. \(M_{IV}\) and \(M_F\) are invariant to invertible, nonlinear transformations of *Y*.
5. \(M_{IV}\) and \(M_F\) have natural generalizations to multivariate distributions.

The fourth property implies that *Y* can be transformed nonlinearly without altering predictability. Any reasonable measure of predictability should possess this invariance property. The fifth property implies that the above measures can be generalized to an arbitrary number of variables. Remarkably, multivariate measures of predictability satisfy the same properties, including invariance to invertible, nonlinear cross-variable transformations among variables in the same random vector. Finally, each term has attractive interpretations in terms of investment strategies.

While our definition of initial-value predictability is clearly based on dependence (e.g., \(M_{IV}\) vanishes if and only if *Y* is independent of \(O_t\)), it might not be obvious that forced predictability also is based on dependence. To see that this is so, note that the concept of dependence applies not only to random variables, but also to *parameters* (Dawid 1979). Specifically, the distribution of *Y* may be derived from a statistical model involving a parameter \(\theta\). In this case, the distribution of *Y* may depend on \(\theta\), but the joint distribution between *Y* and \(\theta\) does not exist because \(\theta\) is not a random variable. The statement that *Y* is independent of \(\theta\) expresses the fact that the distribution of *Y* is the same for all values of \(\theta\). As a result of the properties discussed above, \(M_F\) is non-zero if and only if the climatological distribution depends on the parameter \(\tau\). Thus, \(M_F\) measures the degree to which the climatology changes in time. If the climatological distribution does not change when \(\tau\) is varied, then it is independent of \(\tau\). Thus, forced predictability can be interpreted as reflecting a (generalized) dependence of the climatological distribution on lead time \(\tau\).

Initial-value predictability decays monotonically with lead time even when *both forecast and climatology* evolve in time. Our proof of this property follows that of Cover and Thomas (1991) (see Sect. 2.9). To prove this property, it is necessary to modify our notation slightly. Accordingly, let \(Y_t\) denote the random variable at time *t*, and let the joint distribution between \(Y_{t+\tau}\) and \(Y_t\) under different conditionals be

A schematic of the different predictability measures is shown in Fig. 3b. Initial-value predictability decays monotonically with lead time, but total climate predictability does not because external forcing can cause changes in the climatological distribution that increase with time.

*noise-to-total ratio* because variance in the forecast distribution is associated with uncertainty or noise. Similarly, a standard result is that the relative entropy for normal distributions is
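For the univariate normal case, the standard closed forms can be sketched as follows. The function and argument names are hypothetical, natural logarithms are used (so results are in nats), and the mutual-information expression assumes jointly Gaussian variables for which the forecast variance does not depend on the initial condition.

```python
import math


def relative_entropy_normal(mean_f, var_f, mean_c, var_c):
    """Relative entropy (nats) of forecast N(mean_f, var_f) w.r.t. climatology N(mean_c, var_c)."""
    if var_f <= 0.0 or var_c <= 0.0:
        raise ValueError("variances must be positive")
    dispersion = math.log(var_c / var_f) + var_f / var_c - 1.0  # from variance change
    signal = (mean_f - mean_c) ** 2 / var_c                     # from mean shift
    return 0.5 * (dispersion + signal)


def mutual_information_normal(noise_to_total):
    """Mutual information (nats) given the ratio of forecast variance to climatological variance."""
    if not 0.0 < noise_to_total <= 1.0:
        raise ValueError("noise-to-total ratio must lie in (0, 1]")
    return -0.5 * math.log(noise_to_total)


# Identical distributions carry no predictability.
print(relative_entropy_normal(0.0, 1.0, 0.0, 1.0))
# A one-standard-deviation mean shift with unchanged variance.
print(relative_entropy_normal(1.0, 1.0, 0.0, 1.0))
# Forecast variance equal to half the climatological variance.
print(mutual_information_normal(0.5))
```

Averaging the relative entropy over Gaussian initial conditions replaces the squared mean shift by the difference between climatological and forecast variance, which reduces the expression to the mutual-information form.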

## 5 Predictability of a forced AR(1) process

This section illustrates predictability in an exactly solvable model, namely a first order autoregressive model. Such models are called *stochastic* because they contain random variables as part of their dynamics. Stochastic models have limited predictability because the transition between two states is uncertain even if the initial state is known exactly. In contrast, nonlinear chaotic systems have limited predictability because of instability with respect to initial conditions. Nevertheless, stochastic models can be difficult to distinguish from chaotic deterministic systems, hence the predictability of simple stochastic models can give insight into the predictability of deterministic systems.

*forced* AR(1) model

We assume \(| \phi | < 1\), in which case the model (35) is stable. To compute distributions of \(Y_t\), the distribution of \(W_t\) needs to be specified. Here, \(W_t\) is assumed to be Gaussian white noise with zero mean and variance \(\sigma _W^2\).

In order to make a prediction, observations of the process must exist. Observations generally are imperfect. Accounting for errors in observation is not interesting for an AR(1) process because such processes are stable and hence initial condition errors are damped with lead time (in contrast to chaotic systems). Accordingly, only the case of perfect observations is considered. Suppose, then, that we have a set of *perfect* observations \(\{ y_0, y_{-1}, y_{-2}, \dots \}\).

*conditionally independent given* \(Y_{t-1}\).

*conditioned on the initial condition* \(Y_0 = y_0\). Specifically, the mean of *Y* conditioned on the observation \(y_0\) is

*t* is therefore

*t*:

*anomaly* as the deviation from the climatological mean:

*initial anomaly* plus the climatological mean \(\mu _Y(t)\). For future reference, the variance of the climatological distribution (45) is denoted

*limit of predictability* often is defined as the time after which some measure of the difference between the conditional and unconditional distributions exceeds some (arbitrary) threshold.

*contributes no initial-value predictability* in this model. To see how this works, consider constant forcing \(f_t = k\), for which the climatological mean (46) is

*non-stationary*. The means of the forecast and climatological distributions for this case, for the same initial condition used above, are shown in Fig. 5b. As in the stationary case, the forecast distribution relaxes toward the climatological distribution. Importantly, the rate of this relaxation is the same in both cases and determined by the parameter \(\phi\). Thus, in this model, predictability is independent of \(f_t\) and depends only on \(\phi\) and \(\tau\).
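The independence from \(f_t\) can be made concrete with closed-form expressions. The sketch below again assumes the model \(Y_t = \phi Y_{t-1} + f_t + W_t\) with Gaussian white noise, a deterministic forcing, and a climatology initialized in the infinite past. The function names are hypothetical. Under these assumptions the forcing enters only the means, while the forecast and climatological variances, and hence the mutual information, depend only on \(\phi\), \(\sigma_W\), and \(\tau\).

```python
import math


def forecast_variance(phi, sigma_w, tau):
    """Variance of Y at lead tau given a perfectly known initial state."""
    return sigma_w**2 * (1.0 - phi ** (2 * tau)) / (1.0 - phi**2)


def climatological_variance(phi, sigma_w):
    """Variance of the climatological distribution (limit of infinite lead)."""
    return sigma_w**2 / (1.0 - phi**2)


def forecast_anomaly_mean(phi, tau, initial_anomaly):
    """Forecast mean minus climatological mean: the initial anomaly damped by phi**tau."""
    return phi**tau * initial_anomaly


def initial_value_predictability(phi, tau):
    """Mutual information (nats) at lead tau >= 1; the forcing f_t does not appear."""
    return -0.5 * math.log(1.0 - phi ** (2 * tau))


phi, sigma_w = 0.8, 1.0
for tau in (1, 2, 5, 10, 20):
    ratio = forecast_variance(phi, sigma_w, tau) / climatological_variance(phi, sigma_w)
    print(tau, ratio, initial_value_predictability(phi, tau))
```

The printed noise-to-total ratio rises toward one and the mutual information falls toward zero as \(\tau\) increases, which is the monotonic decay of initial-value predictability in this model.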

The above analysis assumed a deterministic forcing \(f_t\). If, instead, \(f_t\) were a stochastic process with a predictable time scale much longer than that of \(Y_t\), then it would still be *approximately* a deterministic function on the short time scales relevant for \(Y_t\). Thus, the climatology could have been defined by evaluating (44) using a finite value of *s*, chosen such that \(t-s\) is short compared to the predictable time scale of \(f_t\), but long compared to the predictable time scale of \(Y_t\). An important point is that the model for the joint variable \((Y_t,f_t)\) would have two predictable time scales: a relatively short time scale for \(Y_t\) and a much longer time scale for \(f_t\). The above analysis illustrates how the short predictable time scale of \(Y_t\) can be defined even though it is dynamically coupled to a variable \(f_t\) with a much longer predictable time scale.

## 6 Conclusion

This paper discussed limitations of the standard framework of predictability and proposed a generalized framework for resolving these limitations. In the standard framework, an event is unpredictable from a given set of observations if it is independent of those observations. This definition requires comparing two distributions: one that is conditioned on observations and one that is not. These distributions are called the forecast and climatological distributions, respectively. For transitive systems, the climatological distribution is the forecast distribution initialized in the infinite past. The framework resulting from this definition is problematic if one is interested in climate changes due to external forcing, or if the climatological distribution dictated by the governing equations is too broad (i.e., describes the current climate and other climates that are very different from the present one). In the first case, an inevitable consequence of any sensible definition of predictability is that forced variability, such as that caused by human activities or annual cycles of solar insolation, must be subsumed in the specification of the climatological distribution. As a result, predictions of forced variability cannot constitute initial-value predictability because predictability requires a *difference* in distributions. Following Lorenz (1975), a new type of predictability, which we call forced predictability, is defined based on differences in climatological distribution between different times. We propose a general measure, called *total climate predictability*, that captures both forced and initial-value predictability. This new measure has a natural decomposition into measures that have been proposed previously.
Specifically, it emerges naturally that forced predictability is measured by the relative entropy between the initial and final climatologies, as proposed by Branstator and Teng (2010), and initial-value predictability is measured by mutual information, as proposed by DelSole (2004). The new measure is invariant to nonlinear transformations of the variables and generalizes naturally to multivariate distributions. Most of these concepts are illustrated in a simple model in which all distributions can be expressed in closed form.

The above framework has many attractive features but still may be unsatisfying. In particular, it defines the climatological distribution as the forecast distribution initialized in the *infinite* past, which may be much broader than is appropriate for certain kinds of predictability questions. For instance, the existence of abrupt climate change in paleo-climate records demonstrates that very different climates can occur under nearly the same external forcing. This spectrum of climates should be described by the climatological distribution. However, if predictability studies were to use such a broad climatology, then weather would be deemed predictable for as long as the associated forecast predicts that the climate has not shifted, possibly for years. Many studies implicitly justify a narrower distribution by selecting a climate model that simulates the climate of the past few decades when given the present external forcing. Whether this criterion leads to overly narrow estimates of the climatological distributions is unclear. In any case, we propose a *generalized predictability* that provides a formalism for filtering out predictability due to long time scale processes so that predictability on short time scales can be identified. This framework is based on a conditional climatology, which is the forecast distribution initialized at a *finite* earlier time. Using such a conditional climatology filters out predictability contributions from long time-scale processes. This generalized framework follows naturally from the definition of conditional independence, and therefore retains consistency with the concept of dependence. Furthermore, generalized predictability retains the familiar property that initial-value predictability decays with lead time.

## Footnotes

1. In reality, coin flips are predictable if the flipping is performed by a carefully constructed machine. The apparent randomness of coin flips arises from the fact that humans are “sloppy flippers” (Diaconis et al. 2007).
2. This paper employs the notation that a capital letter denotes a random variable and a lower case letter denotes the values that it takes. Also, technically, *Y* is a vector containing all state variables, but distinguishing vectors from scalars leads to a cumbersome notation without adding anything to the formalism, so for simplicity the multivariate nature of the state vector is ignored.
3. This definition was not explicitly stated in Lorenz (1975), but is reasonably clear from context. For instance, to introduce predictability of the second kind, Lorenz wrote “We may inquire, for example, what would be the effect upon the climate of doubling the concentration of \(CO_2\) in the atmosphere\(\ldots\)” This question is a statement about changes in climatological distribution and is clearly within the scope of detection and attribution analysis.

## Notes

### Acknowledgements

This research was supported primarily by the National Science Foundation (AGS-1338427), the National Aeronautics and Space Administration (NNX14AM19G), and the National Oceanic and Atmospheric Administration (NA14OAR4310160). We also thank J. Shukla, Grant Branstator, and an anonymous reviewer for numerous insightful comments on an earlier version of this paper. The views expressed herein are those of the authors and do not necessarily reflect the views of these agencies. MKT was partially supported by the Office of Naval Research (N00014-12-1-0911 and N00014-16-1-2073).

## References

- Alley RB, Clark PU (1999) The deglaciation of the northern hemisphere: a global perspective. Annu Rev Earth Planet Sci 27(1):149–182. doi: 10.1146/annurev.earth.27.1.149
- Alley RB, Marotzke J, Nordhaus WD, Overpeck JT, Peteet DM, Pielke RA, Pierrehumbert RT, Rhines PB, Stocker TF, Talley LD, Wallace JM (2003) Abrupt climate change. Science 299(5615):2005–2010
- Barnston A, Tippett MK, L’Heureux ML, Li S, DeWitt DG (2012) Skill of real-time seasonal ENSO model predictions during 2002–11: Is our capability increasing? Bull Am Meteor Soc 93:ES48–ES50
- Barsugli JJ, Battisti DS (1998) The basic effects of atmosphere-ocean thermal coupling on midlatitude variability. J Atmos Sci 55:477–493
- Branstator G, Teng H (2010) Two limits of initial-value decadal predictability in a CGCM. J Clim 23:6292–6311
- Charney JG, Shukla J (1981) Predictability of monsoons. In: Lighthill J, Pearce RP (eds) Monsoon dynamics. Cambridge University, Cambridge, pp 99–109
- Cook ER, Seager R, Heim RR, Vose RS, Herweijer C, Woodhouse C (2010) Megadroughts in North America: placing IPCC projections of hydroclimatic change in a long-term palaeoclimate context. J Quat Sci 25(1):48–61. doi: 10.1002/jqs.1303
- Cover TM, Thomas JA (1991) Elements of information theory. Wiley
- Dawid AP (1979) Conditional independence in statistical theory. J R Stat Soc Ser B (Methodological) 41(1):1–31. http://www.jstor.org/stable/2984718
- DelSole T (2004) Predictability and information theory part I: measures of predictability. J Atmos Sci 61:2425–2440
- DelSole T (2017) Decadal prediction of temperature: achievements and future prospects. Curr Clim Change Rep 3(2):99–111. doi: 10.1007/s40641-017-0066-x
- Diaconis P, Holmes S, Montgomery R (2007) Dynamical bias in the coin toss. SIAM Rev 49(2):211–235. http://www.jstor.org/stable/20453950
- Gardiner CW (1990) Handbook of stochastic methods, 2nd edn. Springer, Berlin
- Goswami BN, Shukla J (1991) Predictability of a coupled ocean-atmosphere model. J Clim 4(1):3–22. doi: 10.1175/1520-0442(1991)004<0003:POACOA>2.0.CO;2
- Hasselmann K (1976) Stochastic climate models I. Theory. Tellus 28:473–485
- Jazwinski AH (1970) Stochastic processes and filtering theory. Academic, New York, p 376
- Kleeman R (2002) Measuring dynamical prediction utility using relative entropy. J Atmos Sci 59:2057–2072
- Kullback S (1968) Information theory and statistics. Dover
- Leith CE (1978) Predictability of climate. Nature 276(5686):352–355
- Lorenz EN (1963) Deterministic nonperiodic flow. J Atmos Sci 20:130–141
- Lorenz EN (1968) Climate determinism. Am. Meteor. Soc., pp 1–3
- Lorenz EN (1970) Climatic change as a mathematical problem. J Appl Meteorol 9(3):325–329. doi: 10.1175/1520-0450(1970)009<0325:CCAAMP>2.0.CO;2
- Lorenz EN (1975) Climate predictability. The physical basis of climate and climate modeling, vol 16. WMO, Geneva, Switzerland, pp 132–136
- Lorenz EN, Emanuel KA (1998) Optimal sites for supplementary weather observations: simulation with a small model. J Atmos Sci 55:399–414
- Mackay DJC (2003) Information theory, inference, and learning algorithms. Cambridge University, Cambridge
- Meehl GA, Goddard L, Murphy J, Stouffer RJ, Boer G, Danabasoglu G, Dixon K, Giorgetta MA, Greene AM, Hawkins E, Hegerl G, Karoly D, Keenlyside N, Kimoto M, Kirtman B, Navarra A, Pulwarty R, Smith D, Stammer D, Stockdale T (2009) Decadal prediction. Bull Am Meteorol Soc 90(10):1467–1485. doi: 10.1175/2009BAMS2778.1
- Narapusetty B, DelSole T, Tippett MK (2009) Optimal estimation of the climatological mean. J Clim 22:4845–4859
- Nelkin M (1992) In what sense is turbulence an unsolved problem? Science 255(5044):566–570. doi: 10.1126/science.255.5044.566. http://science.sciencemag.org/content/255/5044/566, http://science.sciencemag.org/content/255/5044/566.full.pdf
- Palmer TN (2002) The economic value of ensemble forecasts as a tool for risk assessment: from days to decades. Q J R Meteorol Soc 128(581):747–774. doi: 10.1256/0035900021643593
- Palmer TN, Hagedorn R (eds) (2006) Predictability of weather and climate. Cambridge University Press
- Philander SGH (1990) El Niño, La Niña, and the Southern Oscillation. Academic
- Shukla J (1981) Dynamical predictability of monthly means. J Atmos Sci 38:2547–2572
- Shukla J (1998) Predictability in the midst of chaos: a scientific basis for climate forecasting. Science 282:728–731
- Simmons AJ, Hollingsworth A (2002) Some aspects of the improvement in skill of numerical weather prediction. Q J R Meteorol Soc 128:647–677
- Tripathi OP, Baldwin M, Charlton-Perez A, Charron M, Eckermann SD, Gerber E, Harrison RG, Jackson DR, Kim BM, Kuroda Y, Lang A, Mahmood S, Mizuta R, Roff G, Sigmond M, Son SW (2015) The predictability of the extratropical stratosphere on monthly time-scales and its impact on the skill of tropospheric forecasts. Q J R Meteorol Soc 141(689):987–1003. doi: 10.1002/qj.2432
- Wittenberg AT (2009) Are historical records sufficient to constrain ENSO simulations? Geophys Res Lett 36. doi: 10.1029/2009GL038710

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.