# Sensitivity Analysis of Longevity and Life Disparity

• Hal Caswell
Open Access
Chapter
Part of the Demographic Research Monographs book series (DEMOGRAPHIC)

## Abstract

A basic demographic outcome, at the individual or cohort level, is longevity: the length of individual life. The most commonly encountered description of longevity is its expectation, the life expectancy.

## 4.1 Introduction

The population growth rate (λ or r) analyzed in Chap. is a population-level consequence of the individual-level vital rates. A similarly basic outcome, at the individual or cohort level, is longevity: the length of individual life. The most commonly encountered description of longevity is its expectation, the life expectancy. However, longevity is a random variable, differing among individuals (even when those individuals are subject to the same rates and hazards) because of the random vagaries of mortality and survival. Therefore, it is important to also consider its variance and higher moments. This chapter introduces the sensitivity analysis of longevity, which will be explored in more detail in Chaps. , , and .

As in Chap. , we will begin by reviewing a classic formula for the sensitivity of life expectancy in age-classified models. Then we will use matrix calculus to derive more general formulas for the moments of longevity, the distribution of age or stage at death, and the life disparity, applicable to age- or stage-classified populations.

## 4.2 Life Expectancy in Age-Classified Populations

### Notation

It is customary to denote life expectancy by symbols like $$e_x^{\mathrm {o}}$$ or e(x), but in general the symbol e plays too many roles in mathematics to be helpful for our purposes. So, when we make the transition to matrix formulations, I will use the symbol η, in various vector and scalar manifestations, to indicate longevity.

Perturbation analysis of longevity has been pursued mostly within the framework of age-classified life cycles (e.g., Canudas Romo 2003; Keyfitz 1971; Pollard 1982; Vaupel 1986; Vaupel and Canudas Romo 2003). The life expectancy at age x is given by
\displaystyle \begin{aligned} e(x) = \frac{1}{\ell(x)} \int_x^\infty \ell(s) ds {} \end{aligned}
(4.1)
where the survivorship function ℓ(x) is the probability of survival to age x.
The classical result for the sensitivity of life expectancy at birth to a change in mortality at age a is
\displaystyle \begin{aligned} {d e(0) \over d \mu(a)} = -\ell(a) e(a) . {} \end{aligned}
(4.2)
That is, the sensitivity of life expectancy at birth to a change in mortality at age a is equal to the product of the probability of survival to age a and the life expectancy at age a. In other words, e(0) is most sensitive to changes in mortality at ages to which lots of individuals survive (to experience the change in mortality) and beyond which there is lots of longevity remaining (so they can enjoy the change in mortality). The derivative is negative because increasing mortality reduces life expectancy.

The result was presented independently by Keyfitz (1971) who also referenced some earlier approaches (Wilson 1938; Irwin 1949) and by Pollard (1982). Keyfitz’s derivation was sketchy, and Pollard simply stated that the result was well-known, and gave no derivation. From a general sensitivity analysis perspective, we can derive the result using the same approach applied in Chap. to population growth rate.

### 4.2.1 Derivation

Differentiating (4.1) with respect to mortality at some specified age a gives
\displaystyle \begin{aligned} {d e(0) \over d \mu(a)} = \int_0^\infty {d \ell(s) \over d \mu(a)} ds {} \end{aligned}
(4.3)
and our problem reduces to finding the derivative of ℓ(s) with respect to μ(a). To do so, introduce a parameter θ to measure the size of the perturbation at age a, and write mortality as
\displaystyle \begin{aligned} \mu(x,\theta) = \mu(x,0) + \theta \; \delta(x-a) {} \end{aligned}
(4.4)
where δ(x − a) is the Dirac delta function.1 The derivative with respect to μ(a) is obtained by differentiating with respect to θ and evaluating the result at θ = 0.
Write survivorship as
\displaystyle \begin{aligned} \ell(x,\theta) = \exp \left[ - \int_0^x \mu(z, \theta) dz \right] \end{aligned}
(4.5)
so that
\displaystyle \begin{aligned} {d \ell(x,\theta) \over d \theta} = - \ell(x,\theta) \int_0^x {d \mu(z,\theta) \over d \theta} d z \end{aligned}
(4.6)
From (4.4) we have
\displaystyle \begin{aligned} {d \mu(z,\theta) \over d \theta} = \delta(z-a) \end{aligned}
(4.7)
so that
\displaystyle \begin{aligned} \begin{array}{rcl} {d \ell(x,\theta) \over d \theta} &\displaystyle =&\displaystyle - \ell(x,\theta) \int_0^x \delta(z-a) dz \end{array} \end{aligned}
(4.8)
\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle =&\displaystyle -\ell(x,\theta) H(x-a) \end{array} \end{aligned}
(4.9)
where H(⋅) is the unit step function. Substituting this into (4.3) and evaluating at θ = 0 gives
\displaystyle \begin{aligned} \begin{array}{rcl} {d e(0) \over d \mu(a)} &\displaystyle =&\displaystyle - \int_0^\infty \ell(s) H(s-a) ds \end{array} \end{aligned}
(4.10)
\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle =&\displaystyle - \int_a^\infty \ell(s) ds \end{array} \end{aligned}
(4.11)
which, by (4.1), is equal to (4.2), because $$\int_a^\infty \ell(s) ds = \ell(a) e(a)$$.
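As a quick numerical check of (4.2), the following sketch assumes a constant hazard μ (an illustrative choice, not taken from the text), for which ℓ(x) = exp(−μx) and e(x) = 1∕μ. Under the perturbation (4.4), survivorship beyond age a is multiplied by exp(−θ), so e(0) has a closed form as a function of θ and its derivative at θ = 0 can be compared with −ℓ(a)e(a). The function name and the parameter values are hypothetical.

```python
import math

def e0_perturbed(mu, a, theta):
    """Life expectancy at birth under a constant hazard mu plus a
    delta-function perturbation of size theta at age a (assumed setup)."""
    before = (1.0 - math.exp(-mu * a)) / mu               # integral of l(s) over [0, a)
    after = math.exp(-theta) * math.exp(-mu * a) / mu      # integral of l(s) over [a, inf)
    return before + after

mu, a, step = 0.02, 30.0, 1e-6    # illustrative values

# Central finite difference in theta, evaluated around theta = 0
numeric = (e0_perturbed(mu, a, step) - e0_perturbed(mu, a, -step)) / (2 * step)

# Right-hand side of (4.2): -l(a) e(a), with e(a) = 1/mu for a constant hazard
analytic = -math.exp(-mu * a) * (1.0 / mu)

print(numeric, analytic)
```

The two printed values should agree to within the finite-difference error.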

## 4.3 A Markov Chain Model for the Life Cycle

Age has a special status in demography because it is continuous, linear, and permits movement in only one direction and at one rate (age increases by one unit for every unit of time). All other demographic characteristics have the potential for much greater flexibility, and the operators that describe movement and development of individuals require an equal degree of flexibility. This book is devoted to matrix formulations of these problems, which have the great advantage of permitting both age and stage-classified models. The basic formulation, as far as longevity is concerned, is that of a finite-state absorbing Markov chain.

### 4.3.1 A Markov Chain Formulation of the Life Cycle

We describe the life cycle as an absorbing Markov chain. This approach was pioneered in demography by Feichtinger (1971) and Hoem (1969), and has been greatly extended in recent years (Caswell 2001, 2006, 2009; Horvitz and Tuljapurkar 2008; Tuljapurkar and Horvitz 2006; Steinsaltz and Evans 2004). Good sources for the basic theory of absorbing Markov chains are Kemeny and Snell (1976) and Iosifescu (1980).

These models will be explored in more detail in Chaps. and . The sensitivity analysis of measures of variance in longevity has been developed by Van Raalte and Caswell (2013) and Engelman et al. (2014). An important extension of Markov chain models for longevity is the incorporation of “rewards” to represent the value, in some sense, of the length of life, extending methods developed for dynamic programming (Howard 1960). The rewards include the production of offspring (Caswell 2011; van Daalen and Caswell 2015, 2017), the accumulation of income and expenditures (Caswell and Kluge 2015) and healthy longevity (Caswell and Zarulli 2018). The sensitivity analysis of these important models is derived in van Daalen and Caswell (2017).

Markov chain theory distinguishes between recurrent and transient states. A recurrent state has the property that the probability of returning to that state at least once is 1. A transient state is one for which that probability is less than 1. If a Markov chain contains transient states, it will eventually leave those states and arrive in a recurrent state or class of states, where it will remain permanently. Such a chain is called absorbing. Absorbing chains are the basic model for the demography of individuals because life is inherently transient. Any individual will, with probability one, eventually leave the set of living states and be absorbed by death.

If a Markov chain consists of a single set of recurrent states that all communicate with each other, it is said to be ergodic. The transition matrix for an ergodic chain is irreducible and primitive. Ergodic Markov chains play a limited role in demographic contexts because they cannot include mortality. Chapter will, however, present the sensitivity analysis of these models.

In demographic models, individuals move among a set of transient (i.e., living) states in their life cycle before they eventually reach an absorbing state (death). Transient states may represent age classes, developmental or life history stages, or states defined by health, employment, economic, or other kinds of status. In studying longevity, we are particularly interested in absorbing states representing death, or perhaps death classified by age or stage at death, or by cause of death. The analysis applies equally to other ways of leaving the life cycle (e.g., graduation in a model of educational states, discharge from treatment in a model of health states).

Number the stages in the life cycle so that the transient states are 1, …, s and the absorbing states are s + 1, …, s + a. Then the transition matrix of the Markov chain is
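\displaystyle \begin{aligned} {\mathbf{P}} = \left(\begin{array}{c|c} {\mathbf{U}} & {\mathbf{0}} \\ \hline {\mathbf{M}} & {\mathbf{I}} \end{array}\right) {} \end{aligned}
(4.12)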
Here, U is the s × s matrix of transition probabilities among the transient states. The a × s matrix M gives the probabilities of absorption in each of the absorbing states. The columns of P sum to one. I assume that the spectral radius (the dominant eigenvalue) of U is strictly less than one; a sufficient condition for this is that there is a non-zero probability of ultimate death for every stage.
Age-classified models are a special case with survival probabilities on the subdiagonal (and possibly in the last diagonal entry); e.g., for s = 3,
\displaystyle \begin{aligned} {\mathbf{U}} = \left(\begin{array}{ccc} 0 & 0 & 0 \\ p_1 & 0 &0 \\ 0& p_2 & p_3 \\ \end{array}\right) {} \end{aligned}
(4.13)
The age-specific survival probability is $$p_i = e^{- \mu _i}$$, with μi a mortality rate applying to age class i. The (s, s) entry of U is an age-independent survival probability for a final open-ended age class, with a remaining life expectancy of 1∕(1 − ps). If ps = 0 no one survives beyond age class s. When the age-classified model is constructed from a life table, pi = 1 − qi−1; that is, the survival of age-class 1 is the complement of the probability of death between age 0 and 1.
The mortality matrix M gives the probabilities of transition from each of the transient states to each of the absorbing states. Figure 4.1 shows some examples of life cycle formulations that can arise, including both age and stage classification in the transient states, and absorbing states classified by age at death, grouped ages at death, stage at death, or cause of death. The resulting mortality matrices are
\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{Figure 4.1a} \qquad & {\mathbf{M}}& = \left(\begin{array}{cccc} 1-P_1 & 1-P_2 & 1-P_3 & 1 \end{array}\right) \end{array} \end{aligned}
(4.14)
\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{Figure 4.1b} \qquad &{\mathbf{M}}& = \left(\begin{array}{cccc} 1-P_1 & 0 & 0 & 0 \\ 0 & 1-P_2 & 0 & 0 \\ 0 & 0 & 1-P_3 & 0\\ 0&0&0& 1-P_4 \end{array}\right) {} \end{array} \end{aligned}
(4.15)
\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{Figure 4.1c} \qquad &{\mathbf{M}}& = \left(\begin{array}{cccc} 1-P_1 & 1-P_2 & 0 & 0 \\ 0 & 0 & 1-P_3 & 1-P_4 \end{array}\right) \end{array} \end{aligned}
(4.16)
\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{Figure 4.1d} \qquad & {\mathbf{M}}& = \left(\begin{array}{cccc} q_1 & 0 & 0 & 0 \\ 0 & q_2 & 0 & 0 \\ 0 & 0 & q_3 & 0\\ 0&0&0& q_4 \end{array}\right) {} \end{array} \end{aligned}
(4.17)
\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{Figure 4.1e} \qquad & {\mathbf{M}} &= \left(\begin{array}{cccc} q_1 & q_2 & q_3& q_4\\ s_1 & s_2 & s_3 & s_4 \end{array}\right) \end{array} \end{aligned}
(4.18)
The beauty of formulating longevity as a Markov chain is that many statistics of longevity can be written in terms of the matrices U and M and sensitivity analysis can be carried out using matrix calculus.
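To make the block structure concrete, here is a minimal sketch that assembles P from U and M, assuming the column-stochastic layout in which U occupies the upper-left block, M the lower-left block, and each absorbing state maps to itself. The helper name and the survival probabilities are illustrative assumptions.

```python
def build_transition_matrix(U, M):
    """Assemble P with U (s x s) upper left, M (a x s) lower left,
    a zero block upper right and an identity block lower right."""
    a = len(M)
    top = [list(row) + [0.0] * a for row in U]
    bottom = [list(M[i]) + [float(i == j) for j in range(a)] for i in range(a)]
    return top + bottom

# Illustrative three-age-class model; these probabilities are assumptions
p = [0.9, 0.8, 0.5]
U = [[0.0, 0.0, 0.0],
     [p[0], 0.0, 0.0],
     [0.0, p[1], p[2]]]                  # open-ended last age class, as in (4.13)
M = [[1 - p[0], 1 - p[1], 1 - p[2]]]     # a single absorbing state (death)

P = build_transition_matrix(U, M)

# Every column of P should sum to one
column_sums = [sum(P[i][j] for i in range(len(P))) for j in range(len(P))]
print(column_sums)
```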

### 4.3.2 Occupancy Times

Consider an individual in transient state j. Eventual absorption is certain. But before that, the individual will occupy various transient states. The number of such visits, the occupancy time2 is the basic unit of longevity. Occupancy is particularly central in studies of health demography, where it quantifies the parts of a life spent in different health states. But, even without the added dimension of something like health, occupancy of transient states is the basis of longevity analysis.

Let νij be the number of visits to transient state i by an individual in transient state j, prior to absorption. Its expectation is given by the fundamental matrix (e.g., Kemeny and Snell 1976; Iosifescu 1980)
\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{N}} &=& \left(\begin{array}{c} E(\nu_{ij}) \end{array}\right) \end{array} \end{aligned}
(4.19)
\displaystyle \begin{aligned} \begin{array}{rcl} &=& \left( {\mathbf{I}} - {\mathbf{U}} \right)^{-1} \end{array} \end{aligned}
(4.20)
More details, and examples, for the higher moments and variances of occupancy times are given in Chaps. and .
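As a worked illustration (not part of the original derivation), consider the three-age-class matrix U of (4.13). Because I − U is lower triangular, the fundamental matrix can be written down by forward substitution:
\displaystyle \begin{aligned} {\mathbf{N}} = \left( {\mathbf{I}} - {\mathbf{U}} \right)^{-1} = \left(\begin{array}{ccc} 1 & 0 & 0 \\ p_1 & 1 & 0 \\ \displaystyle \frac{p_1 p_2}{1-p_3} & \displaystyle \frac{p_2}{1-p_3} & \displaystyle \frac{1}{1-p_3} \end{array}\right) \end{aligned}
Column j lists the expected occupancy of each age class for an individual starting in age class j. The (3, 3) entry, 1∕(1 − p3), agrees with the remaining life expectancy of the open-ended age class noted in Sect. 4.3.1, provided p3 < 1.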

### 4.3.3 Longevity

The longevity of an individual in state j can be equated to the total occupancy time of all transient states by that individual, prior to eventual absorption. Let ηj be this longevity; the expectation of ηj is the sum of the elements in column j of N. We define η1 and η2 as the vectors containing the first and second moments of longevity, respectively. Then
\displaystyle \begin{aligned} E(\boldsymbol{\eta})^{\mathsf{T}} = \boldsymbol{\eta}_1^{\mathsf{T}} = {\mathbf{1}}^{\mathsf{T}} {\mathbf{N}} {} \end{aligned}
(4.21)
Figure 4.2a shows the life expectancy for India in 1961 and Japan in 2006.
The vector of the second moments of longevity satisfies
\displaystyle \begin{aligned} \boldsymbol{\eta}_2^{\mathsf{T}} = \boldsymbol{\eta}_1^{\mathsf{T}} \left( 2 {\mathbf{N}} - {\mathbf{I}} \right) {} \end{aligned}
(4.22)
(Iosifescu 1980). The variance and standard deviation of longevity are thus
\displaystyle \begin{aligned} \begin{array}{rcl} V \left( \boldsymbol{\eta}\right) &\displaystyle =&\displaystyle \boldsymbol{\eta}_2 - \boldsymbol{\eta}_1 \circ \boldsymbol{\eta}_1 {} \end{array} \end{aligned}
(4.23)
\displaystyle \begin{aligned} \begin{array}{rcl} SD(\boldsymbol{\eta}) &\displaystyle =&\displaystyle \sqrt{V(\boldsymbol{\eta})} {} \end{array} \end{aligned}
(4.24)
where the square root is taken element-wise.
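The calculations in (4.20)–(4.24) can be sketched in a few lines. The example below assumes a three-age-class model with an open-ended last class and made-up survival probabilities, and uses plain Python lists with a small hypothetical Gauss-Jordan `inverse` helper so that it has no dependencies. Longevity is measured in age-class intervals.

```python
import math

def inverse(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

p = [0.9, 0.8, 0.5]                      # illustrative survival probabilities (assumed)
U = [[0.0, 0.0, 0.0],
     [p[0], 0.0, 0.0],
     [0.0, p[1], p[2]]]
s = len(U)
I = [[float(i == j) for j in range(s)] for i in range(s)]

N = inverse([[I[i][j] - U[i][j] for j in range(s)] for i in range(s)])    # (4.20)
eta1 = [sum(N[i][j] for i in range(s)) for j in range(s)]                 # (4.21): column sums of N
two_N_minus_I = [[2 * N[i][j] - I[i][j] for j in range(s)] for i in range(s)]
eta2 = matmul([eta1], two_N_minus_I)[0]                                   # (4.22)
V = [m2 - m1 * m1 for m1, m2 in zip(eta1, eta2)]                          # (4.23)
SD = [math.sqrt(v) for v in V]                                            # (4.24), element-wise

print(eta1, V, SD)
```

For the open-ended class the result can be checked by hand: longevity there is geometric with survival probability 0.5, so its mean is 1∕(1 − 0.5) = 2 and its variance is 0.5∕(1 − 0.5)² = 2.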

Note that V (η) and SD(η) are vectors; their elements give the variance or standard deviation of longevity for individuals in each stage, making it easy to examine variation in remaining longevity conditional on the starting age. This conditioning can be important; Edwards and Tuljapurkar (2005) have made a strong case that SD(η10), starting from age 10, is a good index to prevent infant and child mortality from obscuring patterns in old age longevity.

Figure 4.2b shows SD(η) for India and Japan. The standard deviation at birth, SD(η1) is roughly twice as great in India as in Japan, a discrepancy that remains at SD(η10). Eventually, beyond the age of 50, SD(η) becomes greater in India than in Japan.

### 4.3.4 Age or Stage at Death

If the model contains more than one absorbing state (as in all the cases but the first in Fig. 4.1), the eventual fate of an individual is uncertain. The probability distributions of the eventual absorbing state are given by the columns of the matrix
\displaystyle \begin{aligned} {\mathbf{B}} = {\mathbf{M}} {\mathbf{N}} {} \end{aligned}
(4.25)
where bij is the probability of eventual absorption in absorbing state i for an individual starting in transient state j (Iosifescu 1980).
Suppose that the absorbing stages are defined as the age (or stage) at death, as in Fig. 4.1b, d. Then M is given by Eq. (4.15) or (4.17), and the jth column of B is the probability distribution of age at death for an individual starting in age class j:
\displaystyle \begin{aligned} \boldsymbol{\psi}_j = {\mathbf{B}}(:,j) = {\mathbf{B}} {\mathbf{e}}_j . \end{aligned}
(4.26)

### 4.3.5 Life Lost and Life Disparity

When an individual dies, it loses the remaining life that it would have experienced, had it not died. This counterfactual proposition seems abstract, but we can make it concrete by asking for the expectation of that lost lifetime. An individual that dies at age x will lose, on average, an amount of life given by the life expectancy at age x. Averaging this remaining life expectancy over the distribution of age at death gives the mean life lost due to mortality. Vaupel and Canudas Romo (2003) denoted the life lost by e†. Here we define the vector η†, whose ith entry is the expected life lost due to mortality by an individual starting in age class i; it is given by
\displaystyle \begin{aligned} \left( \boldsymbol{\eta}^\dagger \right)^{\mathsf{T}} = \boldsymbol{\eta}_1^{\mathsf{T}} {\mathbf{B}} . {} \end{aligned}
(4.27)
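The following sketch evaluates B = MN and the life lost η† for a small example. It assumes three age classes with made-up survival probabilities, absorbing states indexed by age class at death, and a diagonal M with entries 1 − p_i; the `inverse` and `matmul` helpers are hypothetical utilities written for this illustration.

```python
def inverse(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

p = [0.9, 0.8, 0.5]                      # illustrative survival probabilities (assumed)
s = len(p)
U = [[0.0, 0.0, 0.0],
     [p[0], 0.0, 0.0],
     [0.0, p[1], p[2]]]
# Assumed mortality matrix: death in age class i is absorbing state i
M = [[(1 - p[i]) if i == j else 0.0 for j in range(s)] for i in range(s)]

N = inverse([[float(i == j) - U[i][j] for j in range(s)] for i in range(s)])
B = matmul(M, N)                                              # (4.25)
eta1 = [sum(N[i][j] for i in range(s)) for j in range(s)]     # (4.21)
eta_dagger = matmul([eta1], B)[0]                             # (4.27)

# Each column of B is a probability distribution over age class at death
B_column_sums = [sum(B[i][j] for i in range(s)) for j in range(s)]
print(B, eta_dagger)
```

Column j of `B` gives the distribution of age class at death for an individual starting in age class j, and `eta_dagger[j]` is the corresponding expected life lost, in age-class intervals.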

Calculations of life lost from mortality due to specific causes of death play a central role in the calculations of disability-adjusted life years (DALYs) used in calculations of the burden of diseases (e.g., Devleesschauwer et al. 2014; GBD 2016 DALYs and HALE Collaborators 2017). See Caswell and Zarulli (2018) for the relationship between DALY calculations and Markov chain methods, and for a calculation of the variance in life lost.

The life lost η† has an additional interpretation as a measure of disparity. Consider a population in which everyone dies at the same age. In such a situation, η† = 0, because at the age of death, there is no additional life expectancy. Thus η† is a measure of “life disparity;” the larger its value, the more disparity there is among individuals in age at death (Vaupel et al. 2011).

The values of life disparity in age class 1, for Japan and India, in years, are
\displaystyle \begin{aligned} \eta_1^\dag = \left\{ \begin{array}{rl} 10.1 & \ \mbox{Japan} \\ 23.9 & \ \mbox{India} \end{array} \right. \end{aligned}
(4.28)
Just as India has a much larger variance in longevity than Japan, it also has a higher life disparity.

## 4.4 Sensitivity Analysis

Our goal is to obtain expressions for the derivatives of E(η), V (η), SD(η), B, and η†, with respect to changes in age-specific mortality rates. The calculations and some results (contrasting the mortality schedules of Japan and India) are given here. More details are presented in Chaps. and . Results are presented in terms of an arbitrary vector θ of parameters on which U and M depend. In the examples, θ will be the vector μ of age-specific mortality rates.

### 4.4.1 Sensitivity of the Fundamental Matrix

The fundamental matrix N appears in many of these formulas. Its sensitivity was first obtained by Caswell (2006). Suppose that U is a function of some vector θ of parameters. Then
\displaystyle \begin{aligned} {d \mbox{vec} \, {\mathbf{N}} \over d \boldsymbol{\theta}^{\mathsf{T}}} = \left( {\mathbf{N}}^{\mathsf{T}} \otimes {\mathbf{N}} \right) {d \mbox{vec} \, {\mathbf{U}} \over d \boldsymbol{\theta}^{\mathsf{T}}} {} \end{aligned}
(4.29)
(see Chap. ).

### 4.4.2 Sensitivity of Life Expectancy

The sensitivity of the vector of life expectancy as a function of age is obtained by differentiating (4.21),
\displaystyle \begin{aligned} d \boldsymbol{\eta}_1^{\mathsf{T}} = {\mathbf{1}}^{\mathsf{T}} (d {\mathbf{N}}) \end{aligned}
(4.30)
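Applying the vec operator and Roth’s theorem, $$\mbox{vec} \, ({\mathbf{A}} {\mathbf{B}} {\mathbf{C}}) = ({\mathbf{C}}^{\mathsf{T}} \otimes {\mathbf{A}}) \, \mbox{vec} \, {\mathbf{B}}$$, gives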
\displaystyle \begin{aligned} \begin{array}{rcl} d \boldsymbol{\eta}_1 &\displaystyle =&\displaystyle \left( {\mathbf{I}} \otimes {\mathbf{1}}^{\mathsf{T}} \right) d \mbox{vec} \, {\mathbf{N}} \end{array} \end{aligned}
(4.31)
\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle =&\displaystyle \left( {\mathbf{I}} \otimes {\mathbf{1}}^{\mathsf{T}} \right) \left( {\mathbf{N}}^{\mathsf{T}} \otimes {\mathbf{N}} \right) d \mbox{vec} \, {\mathbf{U}} \end{array} \end{aligned}
(4.32)
\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle =&\displaystyle \left( {\mathbf{N}}^{\mathsf{T}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right) d \mbox{vec} \, {\mathbf{U}} . {} \end{array} \end{aligned}
(4.33)
The last step uses the fact that (A ⊗B)(C ⊗D) = (AC ⊗BD). Applying the chain rule and the first identification theorem gives the result
\displaystyle \begin{aligned} {d \boldsymbol{\eta}_1 \over d \boldsymbol{\theta}^{\mathsf{T}}} = \left( {\mathbf{N}}^{\mathsf{T}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right) {d \mbox{vec} \, {\mathbf{U}} \over d \boldsymbol{\theta}^{\mathsf{T}}} {} \end{aligned}
(4.34)

### Sensitivity to mortality

If interest focuses on changes in age-specific mortality, so that θ = μ, then the sensitivity formula expands, using the chain rule, to
\displaystyle \begin{aligned} {d \boldsymbol{\eta}_1 \over d \boldsymbol{\mu}^{\mathsf{T}}} = \left( {\mathbf{N}}^{\mathsf{T}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right) {d \mbox{vec} \, {\mathbf{U}} \over d \boldsymbol{\mu}^{\mathsf{T}}} \end{aligned}
(4.35)
This can be evaluated in several ways, depending on how the matrix U is written as a function of mortality. One approach is used in Sect. 4.4.3, and a somewhat more widely useful approach in Sect. 4.4.4.

The results for Japan and India are shown in Fig. 4.2. Life expectancy is more sensitive to changes in mortality in Japan than in India; the (absolute value of) sensitivity decreases almost linearly with age in Japan, and slightly less linearly in India (Fig. 4.2). On the other hand, life expectancy is more elastic to changes in mortality in India, and less so in Japan.

### 4.4.3 Generalizing the Keyfitz-Pollard Formula

The Keyfitz-Pollard formula for the sensitivity of life expectancy to changes in mortality rate, given in Eq. (4.2), has a clear interpretation: the sensitivity to mortality at age a depends on the probability of survival to age a and the remaining life expectancy at age a. We are now in a position to generalize this to stage-classified matrix models.

First, we derive the matrix version of the Keyfitz-Pollard result, for the sensitivity of life expectancy of age class 1, which is
\displaystyle \begin{aligned} \begin{array}{rcl} d E \left( \eta_1 \right) &\displaystyle =&\displaystyle \left( {\mathbf{e}}_1^{\mathsf{T}} \otimes {\mathbf{1}}^{\mathsf{T}} \right) d \mbox{vec} \, {\mathbf{N}} \end{array} \end{aligned}
(4.36)
\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle =&\displaystyle \left( {\mathbf{e}}_1^{\mathsf{T}} \otimes {\mathbf{1}}^{\mathsf{T}} \right) \left( {\mathbf{N}}^{\mathsf{T}} \otimes {\mathbf{N}} \right) d \mbox{vec} \, {\mathbf{U}} {} \end{array} \end{aligned}
(4.37)
Consider a population with s age classes and let μi be the mortality rate and $$p_i=\exp (-\mu _i)$$ the survival probability for age class i. The matrix U is given by (4.13), which can be written
\displaystyle \begin{aligned} {\mathbf{U}} = \displaystyle \sum_{k=1}^{s-1} \left( {\mathbf{e}}_{k+1} {\mathbf{e}}_{k}^{\mathsf{T}} \right) \; p_k \end{aligned}
(4.38)
where ek is the unit vector, of length s, with a 1 in the kth position and zeros elsewhere. Differentiating U and applying the vec operator gives
\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, {\mathbf{U}} &\displaystyle =&\displaystyle \displaystyle - \sum_{k=1}^{s-1} \left({\mathbf{e}}_k \otimes {\mathbf{e}}_{k+1} \right) \; p_k \left( d \mu_k \right) {} \end{array} \end{aligned}
(4.39)
Substitute (4.39) into (4.37) and consider a perturbation of mortality at age a; the result is
\displaystyle \begin{aligned} {d E(\eta_1) \over d \mu_a} = - \left( {\mathbf{e}}_1^{\mathsf{T}} \otimes {\mathbf{1}}^{\mathsf{T}} \right) \left( {\mathbf{N}}^{\mathsf{T}} \otimes {\mathbf{N}} \right) \left({\mathbf{e}}_a \otimes {\mathbf{e}}_{a+1} \right) \; p_a . \end{aligned}
(4.40)
This simplifies to
\displaystyle \begin{aligned} \begin{array}{rcl} {d E(\eta_1) \over d \mu_a} &\displaystyle =&\displaystyle - \left( {\mathbf{e}}_1^{\mathsf{T}} {\mathbf{N}}^{\mathsf{T}} {\mathbf{e}}_a \otimes {\mathbf{1}}^{\mathsf{T}} {\mathbf{N}} {\mathbf{e}}_{a+1} \right) p_a \end{array} \end{aligned}
(4.41)
\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle =&\displaystyle - \underbrace{E \left( \nu_{a} \right)\, p_a}_{\mathrm{survival}} \, \underbrace{E\left( \eta_{a+1} \right)}_{\mathrm{expectancy}} \qquad \mbox{age-classified} \end{array} \end{aligned}
(4.42)
In an age-classified model, νa is either 0 or 1 (you cannot occupy a year of age for more than 1 year); hence $$E \left ( \nu _{a} \right )$$ is the probability of survival to age a. Thus we have a matrix version of the Keyfitz-Pollard result: the sensitivity of life expectancy is the negative of the probability of survival to age a, times the probability of survival from a to a + 1, times the life expectancy at age a + 1.
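As a check on (4.42), take s = 3 with no open-ended age class (p3 = 0 in (4.13)), an illustrative special case. An individual in age class 1 lives one interval for certain, a second with probability p1, and a third with probability p1p2, so that
\displaystyle \begin{aligned} E(\eta_1) = 1 + p_1 + p_1 p_2 , \qquad E(\eta_2) = 1 + p_2 , \qquad E(\eta_3) = 1 . \end{aligned}
Since dpi∕dμi = −pi, direct differentiation gives
\displaystyle \begin{aligned} {d E(\eta_1) \over d \mu_1} = - p_1 - p_1 p_2 = - \underbrace{1}_{E(\nu_1)} \, p_1 \, \underbrace{\left( 1 + p_2 \right)}_{E(\eta_2)} , \qquad {d E(\eta_1) \over d \mu_2} = - p_1 p_2 = - \underbrace{p_1}_{E(\nu_2)} \, p_2 \, \underbrace{1}_{E(\eta_3)} , \end{aligned}
in agreement with (4.42).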
Now apply the same approach to a stage-classified model, in which U can be written as the product of a diagonal matrix Σ with survival probabilities on the diagonal, and a stochastic matrix G giving the transition probabilities conditional on survival:
\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{U}} &=& {\mathbf{G}} \boldsymbol{\Sigma} \end{array} \end{aligned}
(4.43)
\displaystyle \begin{aligned} \begin{array}{rcl} &=& {\mathbf{G}} \left(\begin{array}{ccc} p_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & p_s \end{array}\right) {} \end{array} \end{aligned}
(4.44)
\displaystyle \begin{aligned} \begin{array}{rcl} &=& \displaystyle {\mathbf{G}} \sum_{k=1}^s \left( {\mathbf{e}}_k {\mathbf{e}}_k^{\mathsf{T}} \right) \; p_k \end{array} \end{aligned}
(4.45)
Differentiating and applying the vec operator gives
\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{U}} = \displaystyle - \sum_{k=1}^s \left( {\mathbf{e}}_k \otimes {\mathbf{G}} {\mathbf{e}}_k \right) \; p_k \left(d \mu_k \right) \end{aligned}
(4.46)
Substitute this into (4.37) and focus on a change in mortality at stage a; the result is
\displaystyle \begin{aligned} {d E(\eta_1) \over d \mu_a} = - \left( {\mathbf{e}}_1^{\mathsf{T}} \otimes {\mathbf{1}}^{\mathsf{T}} \right) \left( {\mathbf{N}}^{\mathsf{T}} \otimes {\mathbf{N}} \right) \left( {\mathbf{e}}_a \otimes {\mathbf{G}} {\mathbf{e}}_a \right) \; p_a \end{aligned}
(4.47)
which simplifies to
\displaystyle \begin{aligned} \begin{array}{rcl} {d E \left( \eta_1 \right) \over d \mu_a} &\displaystyle =&\displaystyle - \left( {\mathbf{e}}_1^{\mathsf{T}} {\mathbf{N}}^{\mathsf{T}} {\mathbf{e}}_a \otimes {\mathbf{1}}^{\mathsf{T}} {\mathbf{N}} {\mathbf{G}} {\mathbf{e}}_{a} \right) p_a \end{array} \end{aligned}
(4.48)
\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle =&\displaystyle - E \left( \nu_{a1} \right) E \left( \boldsymbol{\eta}^{\mathsf{T}} \right) {\mathbf{G}}(:,a) p_a \end{array} \end{aligned}
(4.49)
\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle =&\displaystyle - \underbrace{E \left( \nu_{a1} \right)}_{\mathrm{occupancy}} \; \sum_{h=1}^s \underbrace{p_a g_{ha}}_{\mathrm{transitions}} \; \underbrace{E \left( \eta_h \right)}_{\mathrm{expectancy}} \qquad \mbox{stage-classified} {} \end{array} \end{aligned}
(4.50)
Equation (4.50) is the stage-classified version of Keyfitz-Pollard: the sensitivity of life expectancy to a change in mortality in stage a is the negative of the product of the expected time spent in stage a and the remaining life expectancy, calculated as an average of the life expectancy of all stages h, weighted by the probability of surviving and making the transition from a to h. This can be simplified further by noting that, for either age or stage-classified populations, G(:, a)pa = U(:, a), so that a completely general expression is
\displaystyle \begin{aligned} {d E \left( \eta_1 \right) \over d \mu_a} = - E \left( \nu_{a1} \right) E \left( \boldsymbol{\eta}^{\mathsf{T}} \right) {\mathbf{U}}(:,a) \qquad \mbox{age- or stage-classified} \end{aligned}
(4.51)
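Equation (4.51) is easy to check numerically. The sketch below assumes an age-classified model with three age classes, an open-ended last class, and made-up mortality rates, and compares the formula with central finite differences of E(η1). The helper names are hypothetical.

```python
import math

def inverse(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def life_expectancy(mu):
    """Return (U, N, eta1) for an age-classified model with p = exp(-mu)
    and an open-ended last age class."""
    p = [math.exp(-m) for m in mu]
    s = len(p)
    U = [[0.0] * s for _ in range(s)]
    for k in range(s - 1):
        U[k + 1][k] = p[k]            # survival on the subdiagonal
    U[s - 1][s - 1] = p[s - 1]        # open-ended last age class
    N = inverse([[float(i == j) - U[i][j] for j in range(s)] for i in range(s)])
    eta1 = [sum(N[i][j] for i in range(s)) for j in range(s)]
    return U, N, eta1

mu = [0.1, 0.2, 0.7]                  # illustrative mortality rates (assumed)
s = len(mu)
U, N, eta1 = life_expectancy(mu)

# (4.51): dE(eta_1)/dmu_a = -E(nu_a1) * eta1^T U(:, a)
formula = [-N[a][0] * sum(eta1[h] * U[h][a] for h in range(s)) for a in range(s)]

# Central finite differences of life expectancy in age class 1
step = 1e-6
numeric = []
for a in range(s):
    up = list(mu)
    up[a] += step
    down = list(mu)
    down[a] -= step
    numeric.append((life_expectancy(up)[2][0] - life_expectancy(down)[2][0]) / (2 * step))

print(formula, numeric)
```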

### 4.4.4 Sensitivity of the Variance of Longevity

The sensitivity of the variance in longevity is obtained by differentiating (4.23)
\displaystyle \begin{aligned} d V \left( \boldsymbol{\eta} \right) = d \boldsymbol{\eta}_2 - 2 \left( \boldsymbol{\eta}_1 \circ d \boldsymbol{\eta}_1 \right) \end{aligned}
(4.52)
and applying the vec operator (using results from Chap. on the vec of the Hadamard product), to obtain
\displaystyle \begin{aligned} d V \left( \boldsymbol{\eta} \right) = d \boldsymbol{\eta}_2 - 2 \mathcal{D}\,(\boldsymbol{\eta}_1) d \boldsymbol{\eta}_1. {} \end{aligned}
(4.53)
The derivative of η1 is already given by (4.33):
\displaystyle \begin{aligned} d \boldsymbol{\eta}_1 = \left( {\mathbf{N}}^{\mathsf{T}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right) d \mbox{vec} \, {\mathbf{U}} . {} \end{aligned}
(4.54)
The derivative of η2 is obtained by differentiating (4.22):
\displaystyle \begin{aligned} d \boldsymbol{\eta}_2^{\mathsf{T}} = 2 \left( d \boldsymbol{\eta}_1^{\mathsf{T}} \right) {\mathbf{N}} + 2 \boldsymbol{\eta}_1^{\mathsf{T}} \left( d {\mathbf{N}} \right) - d \boldsymbol{\eta}_1^{\mathsf{T}} \end{aligned}
(4.55)
Applying the vec operator to both sides and substituting (4.29) for dvec N gives
\displaystyle \begin{aligned} d \boldsymbol{\eta}_2 = \left( 2 {\mathbf{N}}^{\mathsf{T}} - {\mathbf{I}} \right) d \boldsymbol{\eta}_1 + 2 \left( {\mathbf{N}}^{\mathsf{T}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} {\mathbf{N}} \right) d \mbox{vec} \, {\mathbf{U}} {} \end{aligned}
(4.56)

Inserting (4.54) for dη1 and (4.56) for dη2 into (4.53) gives the sensitivity of the variance in remaining longevity, for any starting age or stage, to changes in U. The sensitivity of longevity to mortality is obtained by differentiating U with respect to μ.

### Derivatives of U

The derivatives of U with respect to the mortality vector μ are obtained as follows. For an age-classified model, define an age-advancement matrix
\displaystyle \begin{aligned} {\mathbf{L}} = \left(\begin{array}{ccc} 0 & 0 & 0\\ 1 & 0 & 0 \\ 0 & 1 & [1] \end{array}\right) \end{aligned}
(4.57)
(shown here for three age classes, with the optional open-ended last age class). This matrix masks the entries of the matrix $${\mathbf{1}} {\mathbf{p}}^{\mathsf{T}}$$, which contains $${\mathbf{p}}^{\mathsf{T}}$$ in each row, to obtain
\displaystyle \begin{aligned} {\mathbf{U}} = {\mathbf{L}} \circ \left( \mathbf{1} {\mathbf{p}}^{\mathsf{T}} \right) \end{aligned}
(4.58)
Differentiating and applying the vec operator gives
\displaystyle \begin{aligned} \begin{array}{rcl} d {\mathbf{U}} &\displaystyle =&\displaystyle {\mathbf{L}} \circ \left(\rule{0in}{2ex} \mathbf{1} \left( d{\mathbf{p}}^{\mathsf{T}} \right) \right) \end{array} \end{aligned}
(4.59)
\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, {\mathbf{U}} &\displaystyle =&\displaystyle \mathcal{D}\,(\mbox{vec} \, {\mathbf{L}} ) \left({\mathbf{I}} \otimes \mathbf{1} \right) d {\mathbf{p}} . {} \end{array} \end{aligned}
(4.60)
Since $${\mathbf {p}} = \exp ( - \boldsymbol {\mu } )$$,
\displaystyle \begin{aligned} d {\mathbf{p}} = - \mathcal{D}\,({\mathbf{p}}) d \boldsymbol{\mu}, \end{aligned}
(4.61)
and hence
\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{U}}= - \mathcal{D}\,(\mbox{vec} \, {\mathbf{L}} ) \left({\mathbf{I}} \otimes \mathbf{1} \right) \mathcal{D}\,({\mathbf{p}}) d \boldsymbol{\mu} \qquad \mbox{age-classified} {} \end{aligned}
(4.62)
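The age-classified expression (4.62) can be assembled literally from Kronecker products and diagonal matrices. The sketch below does so for three age classes with the open-ended class included and made-up mortality rates, and compares the result with the derivative of U = L ∘ (1pᵀ) written out entry by entry. The helpers `kron`, `diag`, `vec` and `matmul` are hypothetical utilities, and vec stacks columns.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def diag(v):
    """Diagonal matrix with the entries of v on the diagonal."""
    n = len(v)
    return [[v[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def vec(A):
    """Stack the columns of A into one list (column-major order)."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]

mu = [0.1, 0.2, 0.7]                       # illustrative mortality rates (assumed)
p = [math.exp(-m) for m in mu]
s = len(p)
L = [[0.0, 0.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]                      # (4.57), open-ended class included
I = diag([1.0] * s)
ones = [[1.0] for _ in range(s)]           # s x 1 column of ones

# (4.62): dvec U / dmu^T = -D(vec L) (I kron 1) D(p), an s^2 x s matrix
product = matmul(matmul(diag(vec(L)), kron(I, ones)), diag(p))
dvecU_dmu = [[-x for x in row] for row in product]

# Direct derivative: entry (i, j) of U is L[i][j] * p[j], and dp_j/dmu_k = -p_j if j == k
direct = [[-L[i][j] * p[k] if j == k else 0.0 for k in range(s)]
          for j in range(s) for i in range(s)]

print(dvecU_dmu)
```

Column k of the result is the vec of ∂U∕∂μk, which is non-zero only in the rows corresponding to column k of U.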
For a stage-classified model, write U = GΣ, as in (4.44), in the form
\displaystyle \begin{aligned} {\mathbf{U}} = {\mathbf{G}} \left[ {\mathbf{I}} \circ \left( \mathbf{1} {\mathbf{p}}^{\mathsf{T}} \right) \right] \end{aligned}
(4.63)
Differentiating and applying the vec operator, following the strategy of (4.60), gives
\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{U}} = - \left( {\mathbf{I}} \otimes {\mathbf{G}}\right) \mathcal{D}\,(\mbox{vec} \, {\mathbf{I}} ) \left({\mathbf{I}} \otimes \mathbf{1} \right) \mathcal{D}\,({\mathbf{p}}) d \boldsymbol{\mu} \qquad \mbox{stage-classified} {} \end{aligned}
(4.64)

Substituting (4.62) and (4.64) into the expressions for dη1 and dη2, and substituting those into (4.53) gives the sensitivity of the variance in longevity to age- or stage-specific mortality. It is possible to carry out the substitutions and to arrive at a single (large) expression for dV (η); see Chap. .
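A compact way to test the variance sensitivity is to work with the matrix differentials directly, rather than their vec forms: dN = N(dU)N, which is what (4.29) encodes, then dη1ᵀ = 1ᵀdN, then dη2, and finally (4.53). The sketch below does this for an assumed age-classified model with made-up mortality rates and compares the result with finite differences of V(η). It is an illustrative check under those assumptions, not the single large expression referred to above; helper names are hypothetical.

```python
import math

def inverse(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def moments(mu):
    """Return (U, N, eta1, V) for an age-classified model with an open-ended last class."""
    p = [math.exp(-m) for m in mu]
    s = len(p)
    U = [[0.0] * s for _ in range(s)]
    for k in range(s - 1):
        U[k + 1][k] = p[k]
    U[s - 1][s - 1] = p[s - 1]
    I = [[float(i == j) for j in range(s)] for i in range(s)]
    N = inverse([[I[i][j] - U[i][j] for j in range(s)] for i in range(s)])
    eta1 = [sum(N[i][j] for i in range(s)) for j in range(s)]
    eta2 = matmul([eta1], [[2 * N[i][j] - I[i][j] for j in range(s)] for i in range(s)])[0]
    V = [m2 - m1 * m1 for m1, m2 in zip(eta1, eta2)]
    return U, N, eta1, V

def variance_sensitivity(mu, k):
    """dV(eta)/dmu_k from matrix differentials, consistent with (4.53), (4.54), (4.56)."""
    U, N, eta1, _ = moments(mu)
    s = len(mu)
    # dU/dmu_k is minus column k of U, because dp_k/dmu_k = -p_k
    dU = [[-U[i][j] if j == k else 0.0 for j in range(s)] for i in range(s)]
    dN = matmul(matmul(N, dU), N)                                  # dN = N dU N
    d_eta1 = [sum(dN[i][j] for i in range(s)) for j in range(s)]   # 1^T dN
    term_a = matmul([d_eta1], N)[0]                                # (d eta1^T) N
    term_b = matmul([eta1], dN)[0]                                 # eta1^T dN
    d_eta2 = [2 * term_a[j] + 2 * term_b[j] - d_eta1[j] for j in range(s)]
    return [d_eta2[j] - 2 * eta1[j] * d_eta1[j] for j in range(s)]  # (4.53)

mu = [0.1, 0.2, 0.7]          # illustrative mortality rates (assumed)
step = 1e-6
analytic, numeric = [], []
for k in range(len(mu)):
    analytic.append(variance_sensitivity(mu, k))
    up = list(mu)
    up[k] += step
    down = list(mu)
    down[k] -= step
    V_up, V_down = moments(up)[3], moments(down)[3]
    numeric.append([(u - d) / (2 * step) for u, d in zip(V_up, V_down)])

print(analytic)
```

Here `analytic[k][j]` is the sensitivity of the variance of remaining longevity in age class j to mortality in age class k.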

Figure 4.2d shows the sensitivity and elasticity of the variance of longevity to changes in age-specific mortality. The variance is more sensitive to mortality changes in Japan than in India, and the sensitivities are highest at young ages. In both life tables the sensitivities are positive at early ages (≈0–20 for India, ≈0–80 for Japan) and then become negative. Before the age at which the sign changes, reductions in mortality reduce the variance; after that age, reductions in mortality increase it. See Sect. 4.4.6 for more on this.

### 4.4.5 Sensitivity of the Distribution of Age at Death

The sensitivity of the distribution of age or stage at death is obtained by differentiating (4.25) and applying the vec operator,
\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{B}} = \left( {\mathbf{N}}^{\mathsf{T}} \otimes {\mathbf{I}} \right) d \mbox{vec} \, {\mathbf{M}} + \left({\mathbf{I}} \otimes {\mathbf{M}} \right) d \mbox{vec} \, {\mathbf{N}}. \end{aligned}
(4.65)
We already know dvec N. To obtain dvec M, note that when the absorbing states are defined in terms of stage at death
\displaystyle \begin{aligned} {\mathbf{M}} = {\mathbf{I}} - \mathcal{D}\,( {\mathbf{p}}) \end{aligned}
(4.66)
and thus
\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{M}} = - \mathcal{D}\,(\mbox{vec} \, {\mathbf{I}} ) \left( {\mathbf{I}} \otimes \mathbf{1} \right) d {\mathbf{p}} \end{aligned}
(4.67)
It is revealing to write the sensitivity of B to changes in mortality using the chain rule,
\displaystyle \begin{aligned} {d \mbox{vec} \, {\mathbf{B}} \over d \boldsymbol{\mu}^{\mathsf{T}}} = \left( {\mathbf{N}}^{\mathsf{T}} \otimes {\mathbf{I}} \right) {d \mbox{vec} \, {\mathbf{M}} \over d {\mathbf{p}}^{\mathsf{T}}} {d {\mathbf{p}} \over d \boldsymbol{\mu}^{\mathsf{T}}} + \left({\mathbf{I}} \otimes {\mathbf{M}} \right) {d \mbox{vec} \, {\mathbf{N}} \over d \mbox{vec} \,^{\mathsf{T}} {\mathbf{U}}} {d \mbox{vec} \, {\mathbf{U}} \over d {\mathbf{p}}^{\mathsf{T}}} {d {\mathbf{p}} \over d \boldsymbol{\mu}^{\mathsf{T}}} \end{aligned}
(4.68)
and to recognize how many of the pieces we have already obtained.
The distribution of stage at death for individuals starting in stage j is given by column j of B; i.e., ψj = B(:, j). The sensitivity of ψj to changes in mortality is
\displaystyle \begin{aligned} {d \boldsymbol{\psi}_j \over d \boldsymbol{\mu}^{\mathsf{T}}} = \left({\mathbf{e}}_j^{\mathsf{T}} \otimes {\mathbf{I}} \right) {d \mbox{vec} \, {\mathbf{B}} \over d \boldsymbol{\mu}^{\mathsf{T}}} \end{aligned}
(4.69)
for any age or stage j of interest.
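The chain rule in (4.68) can be assembled piece by piece, as in the following sketch. It assumes the stage-classified form U = G D(p), uses the standard result d vec N = (N^T ⊗ N) d vec U for the fundamental matrix N = (I − U)^{-1}, and writes the column selector as e_j^T ⊗ I so that it maps the s² entries of vec B to the s entries of column j. The matrix G and the hazards are hypothetical.

```python
import numpy as np

def vec(A):
    return A.flatten(order="F")   # column-stacking vec operator

def absorbing_chain(G, mu):
    p = np.exp(-mu)
    I = np.eye(p.size)
    U = G @ np.diag(p)            # transient matrix, as in (4.63)
    N = np.linalg.inv(I - U)      # fundamental matrix
    M = I - np.diag(p)            # mortality matrix (4.66)
    return p, N, M, M @ N         # last item is B = M N

def dvecB_dmu(G, mu):
    p, N, M, _ = absorbing_chain(G, mu)
    s = p.size
    I = np.eye(s)
    dvecDp_dp = np.diag(vec(I)) @ np.kron(I, np.ones((s, 1)))  # d vec D(p) / d p^T
    dp_dmu = -np.diag(p)                                       # (4.61)
    dvecM_dp = -dvecDp_dp                                      # (4.67)
    dvecU_dp = np.kron(I, G) @ dvecDp_dp                       # from (4.63)
    dvecN_dvecU = np.kron(N.T, N)                              # d vec N / d vec^T U
    return (np.kron(N.T, I) @ dvecM_dp @ dp_dmu
            + np.kron(I, M) @ dvecN_dvecU @ dvecU_dp @ dp_dmu)  # (4.68)

G = np.array([[0.6, 0.0, 0.0],
              [0.4, 0.7, 0.0],
              [0.0, 0.3, 1.0]])   # hypothetical, columns sum to 1
mu = np.array([0.2, 0.1, 0.4])    # illustrative hazards
s = mu.size
J = dvecB_dmu(G, mu)

j = 0                                       # starting stage (0-based index)
e_j = np.eye(s)[:, [j]]                     # unit vector, shape (s, 1)
dpsi_dmu = np.kron(e_j.T, np.eye(s)) @ J    # sensitivity of column j of B

# Finite-difference check of the full Jacobian
h = 1e-6
J_fd = np.column_stack([
    vec(absorbing_chain(G, mu + h * e)[3] - absorbing_chain(G, mu - h * e)[3]) / (2 * h)
    for e in np.eye(s)
])
print(np.abs(J - J_fd).max(), dpsi_dmu.sum(axis=0))
```

Because each column of B is a probability distribution, the columns of the resulting sensitivity matrix should sum to zero, which gives a quick sanity check alongside the finite-difference comparison.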

### 4.4.6 Sensitivity of Life Disparity

To get the sensitivity of the vector η†, differentiate Eq. (4.27) and apply the vec operator, which gives
\displaystyle \begin{aligned} d \boldsymbol{\eta}^\dagger = {\mathbf{B}}^{\mathsf{T}} d \boldsymbol{\eta}_1 + \left( {\mathbf{I}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right) d \mbox{vec} \, {\mathbf{B}}. {} \end{aligned}
(4.70)
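A sketch of how (4.70) might be evaluated follows. It assumes that (4.27) has the form η† = B^T η1 (the form implied by the differential above), that η1 = N^T 1, and that d vec N = (N^T ⊗ N) d vec U. The age-classified example uses an open-ended final age class so that the columns of U sum to p; the hazards are made up for illustration.

```python
import numpy as np

def longevity(G, mu):
    p = np.exp(-mu)
    s = p.size
    I = np.eye(s)
    N = np.linalg.inv(I - G @ np.diag(p))   # fundamental matrix
    M = I - np.diag(p)                      # mortality matrix
    B = M @ N                               # distribution of age at death
    eta1 = N.T @ np.ones(s)                 # life expectancy by starting age
    return p, N, M, B, eta1, B.T @ eta1     # last item: assumed form of eta-dagger

def d_eta_dagger_dmu(G, mu):
    p, N, M, B, eta1, _ = longevity(G, mu)
    s = p.size
    I = np.eye(s)
    one = np.ones((s, 1))
    # d vec D(p) / d mu^T, combining (4.61) with vec D(p) = D(vec I)(I ⊗ 1) p
    dvecDp = -np.diag(I.flatten(order="F")) @ np.kron(I, one) @ np.diag(p)
    dvecU = np.kron(I, G) @ dvecDp
    dvecM = -dvecDp
    dvecN = np.kron(N.T, N) @ dvecU
    dvecB = np.kron(N.T, I) @ dvecM + np.kron(I, M) @ dvecN    # (4.65)
    deta1 = np.kron(I, one.T) @ dvecN                          # eta1 = (I ⊗ 1^T) vec N
    return B.T @ deta1 + np.kron(I, eta1[None, :]) @ dvecB     # (4.70)

s = 5
G = np.diag(np.ones(s - 1), k=-1)
G[-1, -1] = 1.0                                  # open-ended final age class (assumption)
mu = np.array([0.05, 0.01, 0.03, 0.10, 0.40])    # illustrative hazards, not real data
S = d_eta_dagger_dmu(G, mu)                      # S[i, k] = d eta_dagger_i / d mu_k

# Finite-difference check
h = 1e-6
S_fd = np.column_stack([
    (longevity(G, mu + h * e)[5] - longevity(G, mu - h * e)[5]) / (2 * h)
    for e in np.eye(s)
])
print(np.abs(S - S_fd).max())
```

In an age-classified model, mortality at ages younger than the starting age cannot affect the outcome, so the entries of S below the diagonal should vanish. The signs along each row indicate whether mortality at a given age raises or lowers disparity for that starting age.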

Evaluating this expression for the data on India and Japan, we see that the sensitivity of η† shows a pattern similar to that of the sensitivity of V (η) (Fig. 4.2), confirming that these indices are measuring similar aspects of disparity in longevity.

In particular, they show the existence of a critical age, before which reductions in mortality reduce disparity and after which they have the opposite effect. Zhang and Vaupel (2009) showed that this critical age, which they describe as separating “early” from “late” deaths, is a general property of η†. Although the details depend on which index of disparity one uses, the existence of a critical age separating positive and negative sensitivities is also a property of other measures of variation in longevity (Van Raalte and Caswell 2013). Vaupel et al. (2011) have used the critical age to decompose historical changes in lifespan disparity into components due to early and late mortality.

## 4.5 A Time-Series LTRE Decomposition: Life Disparity

The LTRE decomposition analysis in Sect. can be used to decompose time series such as these into their components. We apply it here to calculate the contributions of changes in early and late mortality to a long trajectory of changes in η†.

Suppose that some demographic outcome ξ(t) (dimension s × 1) is measured as a function of a parameter vector θ (dimension p × 1), at times 1, 2, …, T. The changes in ξ(t) over time result from the changes in the parameters,
\displaystyle \begin{aligned} \begin{array}{rcl} \Delta \boldsymbol{\xi}(t) &\displaystyle =&\displaystyle \boldsymbol{\xi}(t+1) - \boldsymbol{\xi}(t) \end{array} \end{aligned}
(4.71)
\displaystyle \begin{aligned} \begin{array}{rcl} \Delta \boldsymbol{\theta}(t) &\displaystyle =&\displaystyle \boldsymbol{\theta}(t+1) - \boldsymbol{\theta}(t) \end{array} \end{aligned}
(4.72)

The decomposition analysis for such sequences was introduced as a “regression LTRE” method in the context of ecotoxicology and response to environmental factors (e.g., Caswell 1996; Knight et al. 2009). The same approach was introduced independently by Horiuchi et al. (2008) to decompose differences between two conditions by imagining a continuous path from one to the other.

The analysis starts by considering the change in ξ over time,
\displaystyle \begin{aligned} {d \boldsymbol{\xi}(t) \over d t} = {d \boldsymbol{\xi}(t) \over d \boldsymbol{\theta}^{\mathsf{T}}(t)} {d \boldsymbol{\theta}(t) \over d t} \end{aligned}
(4.73)
If the time series is evaluated at discrete times t = 1, …, T, then to first order
\displaystyle \begin{aligned} \Delta \boldsymbol{\xi}(t) \approx {d \boldsymbol{\xi}(t) \over d \boldsymbol{\theta}^{\mathsf{T}} (t)} \Delta \boldsymbol{\theta}(t) \qquad s \times 1 \end{aligned}
(4.74)
The contributions to Δξ(t) are displayed separately in a contribution matrix
\displaystyle \begin{aligned} {\mathbf{C}}(t) = {d \boldsymbol{\xi}(t) \over d \boldsymbol{\theta}^{\mathsf{T}} (t)} \, \mathcal{D}\,[\Delta \boldsymbol{\theta}(t) ] \qquad s \times p \end{aligned}
(4.75)
The (i, j) entry of C(t) is the contribution of Δθj(t) to Δξi(t). The contributions are additive over time, so the contributions of all the changes, integrated from t1 to t2, are given by the entries of
\displaystyle \begin{aligned} {\mathbf{C}} \left( t_1, t_2 \right) = \sum_{t=t_1}^{t_2} {\mathbf{C}}(t) \end{aligned}
(4.76)
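The construction in (4.74)–(4.76) can be illustrated with a toy outcome whose derivative is known in closed form. In the sketch below, the function `xi`, its Jacobian, and the parameter trajectories are all hypothetical stand-ins (s = 2, p = 3) chosen so that the result can be compared with the exact change; the sensitivity is evaluated at θ(t), as in (4.74).

```python
import numpy as np

def xi(theta):
    """Hypothetical smooth outcome with s = 2 components and p = 3 parameters."""
    return np.array([np.sum(np.exp(-theta)),
                     theta[0] * theta[1] + theta[2] ** 2])

def dxi_dtheta(theta):
    """Analytical Jacobian d xi / d theta^T of the toy outcome, shape (2, 3)."""
    return np.vstack([-np.exp(-theta),
                      [theta[1], theta[0], 2 * theta[2]]])

def contribution_matrix(theta_t, theta_next):
    """C(t) = (d xi / d theta^T) D(Delta theta), Eq. (4.75)."""
    return dxi_dtheta(theta_t) @ np.diag(theta_next - theta_t)

# Made-up parameter trajectories over T time points
T = 50
times = np.linspace(0.0, 1.0, T)
thetas = np.column_stack([0.5 - 0.3 * times,
                          0.2 + 0.1 * times ** 2,
                          1.0 - 0.5 * times])      # shape (T, 3)

# Sum the contribution matrices over time, Eq. (4.76)
C_total = sum(contribution_matrix(thetas[t], thetas[t + 1]) for t in range(T - 1))

approx_change = C_total.sum(axis=1)          # row sums: total contribution to each xi_i
observed_change = xi(thetas[-1]) - xi(thetas[0])
print(approx_change, observed_change)
```

Because each step is a first-order approximation, the summed contributions agree with the observed change only up to an error that shrinks as the time steps become finer.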

Suppose the dependent variable is ξ = η† and the parameter vector is θ = μ.

At each time and for each age, we aggregate the contributions from early and late mortality. Let X be an indicator matrix whose entries define whether a particular entry of C(t) is to be counted as early or late:
\displaystyle \begin{aligned} x_{ij} = \left\{ \begin{array}{cl} 1 & \ \theta_j\mbox{ contributes to }\Delta \xi_i \\ 0 & \ \mbox{otherwise} \end{array} \right. \end{aligned}
(4.77)
Then
\displaystyle \begin{aligned} {\mathbf{c}}(t) = \left( {\mathbf{C}}(t) \circ {\mathbf{X}} \right) \mathbf{1}\end{aligned}
(4.78)
is a vector giving the contributions to the change in ξ from the parameters chosen in X. Defining Xearly and Xlate gives changes at time t due to early and late mortality. The LTRE analysis is then
\displaystyle \begin{aligned} {\mathbf{c}}_{\mathrm{early}}(t_1, t_2) = \sum_{t=t_1}^{t_2} {\mathbf{c}}_{\mathrm{early}} (t) \end{aligned}
(4.79)
and similarly for clate(t1, t2).
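The aggregation in (4.77)–(4.79) amounts to masking each contribution matrix and summing its rows. The sketch below uses randomly generated placeholder matrices in place of real contribution matrices, and a single fixed cutoff index `critical` to separate early from late ages; both are simplifying assumptions for illustration, since in an actual analysis the split may differ by starting age and by time.

```python
import numpy as np

def aggregate(C_t, X):
    """c(t) = (C(t) ∘ X) 1, Eq. (4.78)."""
    return (C_t * X) @ np.ones(C_t.shape[1])

s = 4                                   # number of age classes (xi and theta both s x 1)
rng = np.random.default_rng(0)
# Synthetic placeholders standing in for C(t), t = 1, ..., 10; not real data
C_series = [rng.normal(scale=0.01, size=(s, s)) for _ in range(10)]

critical = 2                            # hypothetical cutoff: columns 0..critical-1 are "early"
X_early = np.zeros((s, s))
X_early[:, :critical] = 1.0
X_late = 1.0 - X_early

c_early = sum(aggregate(C, X_early) for C in C_series)   # Eq. (4.79)
c_late = sum(aggregate(C, X_late) for C in C_series)
c_total = sum(C.sum(axis=1) for C in C_series)
print(c_early + c_late - c_total)
```

Since the early and late indicator matrices are complementary, the two aggregated contributions add up to the total contribution for every starting age.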
As an example, Fig. 4.3a, b shows a time series of life expectancy (increasing from about 40 to about 80 years between 1800 and 2010) and life disparity for Swedish females, based on data from the Human Mortality Database (2016). As in most developed countries, life disparity at birth dropped dramatically from 1850 to about 1950 (e.g., Edwards 2011; Vaupel et al. 2011). Declines at later ages were less dramatic, and remaining life disparity conditional on survival to age 50 has been almost flat (Engelman et al. 2014). How did changes in early and late mortality contribute to these patterns?

Figure 4.3c, d show the cumulative sums of the contributions cearly and clate, and their total, for ages 0 and 50. The decline in life disparity at birth was driven almost entirely by improvements in early mortality, which overshadowed a small increase in disparity generated by improvements in late-life mortality. The picture for remaining life disparity at age 50 is different: the contributions from changes in early and late-life mortality almost completely cancel each other out. These patterns, based on the details of a single time series, agree with the much more general exploration of multiple countries, using a different approach, by Vaupel et al. (2011).

The accuracy of the decomposition can be evaluated by comparing the time series calculated from the total contributions, as shown in Fig. 4.3c, d, with the observed series, as shown in Fig. 4.3b. The agreement is extremely close; the LTRE decomposition captures the end result of the historical changes from 1800 to 2010 with an error of less than 0.1%.

## 4.6 Conclusion

This chapter and Chap. contain examples of different approaches to sensitivity analysis, of longevity and population growth rate, respectively. The power and flexibility of matrix calculus methods are apparent: the models are not restricted to age- or stage-classification, the absorbing states may be a single category of death or some more diverse set, the demographic outcomes are not limited to expectations, and the independent variables, the parameters that are being perturbed, can be anything of interest. The only requirement is that a chain of functional dependence can be followed: the outcome ξ depends on U, which depends on p, which depends on μ, …and so on. Mortality might depend on health status, which might depend on income level, which might depend on education, …, and so on. The sensitivity of ξ to any of these parameters is an application of the chain rule.

## Footnotes

1. See Chap. for a description of this generalized function.

2. Because time is discrete here, the number of visits is equal to the number of time increments, which is the amount of time spent in the state. In continuous-time models, the number of visits to, and the length of time spent in, a transient state are different. The corresponding calculations for continuous-time models are given in Chap. .

## Bibliography

1. Canudas Romo, V. 2003. Decomposition methods in demography. Population Studies, Rozenberg Publishers, Amsterdam, Netherlands.
2. Caswell, H. 1996. Analysis of life table response experiments. II. Alternative parameterizations for size- and stage-structured models. Ecological Modelling 88:73–82.
3. Caswell, H. 2001. Matrix Population Models: Construction, Analysis, and Interpretation. 2nd edition. Sinauer Associates, Sunderland, MA.
4. Caswell, H. 2006. Applications of Markov chains in demography. Pages 319–334 in MAM2006: Markov Anniversary Meeting. Boson Books, Raleigh, North Carolina.
5. Caswell, H. 2009. Stage, age and individual stochasticity in demography. Oikos 118:1763–1782.
6. Caswell, H. 2011. Beyond R 0: Demographic models for variability of lifetime reproductive output. PloS ONE 6:e20809.
7. Caswell, H., and F. A. Kluge. 2015. Demography and the statistics of lifetime economic transfers under individual stochasticity. Demographic Research 32:563–588.
8. Caswell, H., and V. Zarulli. 2018. Matrix methods in health demography: a new approach to the stochastic analysis of healthy longevity and DALYs. Population Health Metrics 16:8.
9. Devleesschauwer, B., A. H. Havelaar, C. M. De Noordhout, J. A. Haagsma, N. Praet, P. Dorny, L. Duchateau, P. R. Torgerson, H. Van Oyen, and N. Speybroeck. 2014. DALY calculation in practice: a stepwise approach. International Journal of Public Health 59:571–574.
10. Edwards, R. D. 2011. Changes in world inequality in length of life: 1970–2000. Population and Development Review 37:499–528.
11. Edwards, R. D., and S. Tuljapurkar. 2005. Inequality in life spans and a new perspective on mortality convergence across industrialized countries. Population and Development Review 31:645–674.
12. Engelman, M., H. Caswell, and E. M. Agree. 2014. Why do lifespan variability trends for the young and old diverge? Demographic Research 30:1367–1396.
13. Feichtinger, G. 1971. Stochastische Modelle Demographischer Prozesse. Lecture notes in operations research and mathematical systems, Springer-Verlag, Berlin, Germany.
14. GBD 2016 DALYs and HALE Collaborators. 2017. Global, regional, and national disability-adjusted life-years (DALYs) for 333 diseases and injuries and healthy life expectancy (HALE) for 195 countries and territories, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet 390:1260–1344.
15. Hoem, J. M. 1969. Markov chain models in life insurance. Sonderdruck aus Blätter der Deutschen Gesellschaft für Versicherungsmathematik 9:91–107.
16. Horiuchi, S., J. R. Wilmoth, and S. D. Pletcher. 2008. A decomposition method based on a model of continuous change. Demography 45:785–801.
17. Horvitz, C. C., and S. Tuljapurkar. 2008. Stage dynamics, period survival, and mortality plateaus. American Naturalist 172:203–215.
18. Howard, R. A. 1960. Dynamic programming and Markov processes. Wiley, New York, New York.
19. Human Mortality Database. 2016. University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). URL www.mortality.org.
20. Iosifescu, M. 1980. Finite Markov Processes and Their Applications. Wiley, New York, New York.
21. Irwin, J. 1949. The standard error of an estimate of expectation of life, with special reference to expectation of tumourless life in experiments with mice. The Journal of Hygiene 47:188–189.
22. Kemeny, J. G., and J. L. Snell. 1976. Finite Markov Chains. Second edition. Undergraduate Texts in Mathematics, Springer-Verlag, New York, New York, USA.
23. Keyfitz, N. 1971. Linkages of intrinsic to age-specific rates. Journal of the American Statistical Association 66:275–281.
24. Knight, T., H. Caswell, and S. Kalisz. 2009. Population growth rate of a common understory herb decreases non-linearly across a gradient of deer herbivory. Forest Ecology and Management 257:1095–1103.
25. Pollard, J. H. 1982. The expectation of life and its relationship to mortality. Journal of the Institute of Actuaries 109:225–240.
26. Steinsaltz, D., and S. N. Evans. 2004. Markov mortality models: implications of quasistationarity and varying initial distributions. Theoretical Population Biology 65:319–337.
27. Tuljapurkar, S., and C. C. Horvitz. 2006. From stage to age in variable environments: life expectancy and survivorship. Ecology 87:1497–1509.
28. van Daalen, S., and H. Caswell. 2015. Lifetime reproduction and the second demographic transition: Stochasticity and individual variation. Demographic Research 33:561–588.
29. van Daalen, S. F., and H. Caswell. 2017. Lifetime reproductive output: individual stochasticity, variance, and sensitivity analysis. Theoretical Ecology 10:355–374.
30. Van Raalte, A. A., and H. Caswell. 2013. Perturbation analysis of indices of lifespan variability. Demography 50:1615–1640.
31. Vaupel, J. W. 1986. How change in age-specific mortality affects life expectancy. Population Studies 40:147–157.
32. Vaupel, J. W., and V. Canudas Romo. 2003. Decomposing change in life expectancy: a bouquet of formulas in honor of Nathan Keyfitz’s 90th birthday. Demography 40:201–216.
33. Vaupel, J. W., Z. Zhang, and A. A. van Raalte. 2011. Life expectancy and disparity: an international comparison of life table data. BMJ Open 1:e000128.
34. Wilson, E. B. 1938. The standard deviation of sampling for life expectancy. Journal of the American Statistical Association 33:705–708.
35. Zhang, Z., and J. W. Vaupel. 2009. The age separating early deaths from late deaths. Demographic Research 20:721–730.

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Authors and Affiliations

• Hal Caswell (1)

1. Biodiversity & Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands