The study of structured populations in ecology often is focused on organisms with a life history with distinct yearly events. A seminal contribution to this work was that of Leslie in the 1940’s (Leslie 1945) who represented the dynamics of age-structured populations using a discrete-time model formulated as a vector with entries representing the number of individuals at different, equally spaced, ages, and a matrix representing yearly transitions with the entries in the top row representing births and the sub-diagonal elements representing the year to year survival probabilities. Similar representations can be used to study spatially structured populations, and more commonly stage-structured populations (Caswell 2001, Lefkovitch 1965). In the latter case, entries represent yearly transition probabilities between different stages in the population. For some stage-structured populations, individuals transition back and forth between reproductive and nonreproductive states, and of great interest is the overall time individuals spend in reproductive states during their lifetime. This behavior typically occurs in organisms where successful reproduction represents a substantial expenditure of energy. Examples of organisms that do these kind of transitions are plants, with orchids a prominent example Shefferson et al. (2019). The importance of focusing on these transitions for understanding the dynamics of plant populations, especially in variable environments, has been emphasized Gremer et al. (2012), Alahuhta et al. (2017). Other organisms that move between breeding and nonbreeding states are large birds such as the Southern Fulmar Jenouvrier et al. (2015) and the California condor Meretsky et al. (2000) where individuals transition between breeding and nonbreeding states.

Although ecologists typically think of stage-structured modeling approaches as focusing on the numbers of individuals in different states, a different interpretation is possible if we break up the model into two pieces: one representing reproduction, and the other representing transitions among states not involving reproduction. This separation of the description of reproduction from the description of transitions between states by organisms (see e.g. Cushing (1998)) is a standard approach that is often useful. Interpretations of these stage-structured models usually assume that the matrix entries representing these transition rates are constant, ignoring both any effects of density dependence or any influence of environmental changes through time. Consequently, the state vector in the submodel that does not include reproduction could equivalently be considered as the probability that a single organism is at a particular state. The initial condition for this view is not the number of individuals in different states, but the probability distribution for the states that a focal individual starts in. Another feature is that organisms are not immortal, so an additional state could be included representing death.

Under this alternate interpretation, the discrete-time model can be viewed as a finite state inhomogeneous absorbing Markov chain. The different states of the Markov chain correspond to reproductive states or physical location. A new view that emerges from this interpretation is that it makes sense to consider the amount of time (number of years) that an organism spends in some subset of the states in the model. For example, an organism may move in and out of breeding states, if breeding one year moves the organism to a nonbreeding state the following year. Observations may easily be able to tell if an organism is breeding or non-breeding, or clearly flowering or nonflowering for plants. Thus it makes sense to consider how much time an individual spends in the breeding state. More generally, for a given set of states, the amount of time an individual spends in that set over its lifetime is called the occupancy time.

This intriguing view of dynamics of individuals has been explored theoretically in a recent paper by Roth and Caswell (2018). The key assumption that the probabilities of transitions between different states are constant through time produces a resulting Markov chain that is homogeneous. Then, using ideas from the theory of Markov chains they develop formulas to calculate various quantities of biological interest including the expected occupancy time of an individual in a target set as well as other aspects.

Yet, environments, and hence transition probabilities between states for an individual, are not constant through time. On shorter time scales, there is important environmental variability, such as El Ninos and the Pacific decadal oscillation, with similar cycles in other locations around the globe. Perhaps more importantly, global change is leading to dramatic longer term secular changes in the life histories of individual organisms. Thus, an important feature to include in demographic models for ecological populations is possible changes in the environment through time, representing either short or long term environmental fluctuations. In this case, the resulting demographic models lead to inhomogeneous Markov chains. In this article, we derive formulas for the distribution and moments of occupancy times for inhomogeneous Markov chains which will highlight the role that environmental variability can play. We will also demonstrate how to calculate and interpret these quantities with a specific application, given by breeder states of the Southern Fulmar bird population.

Occupancy times for inhomogeneous Markov chains

The transition probabilities among states for living organisms at any fixed time can be represented by a column sub-stochastic matrix, as no individuals are created, but death is possible. Since we allow for changes in these rates through time, we use a sequence of column sub-stochastic matrices through time to represent transition probabilities between states for individuals during their lifetime. Thus, let \(\{B(n)\}_{n\in {\mathbb {N}}_0}\) be a sequence of column sub-stochastic matrices in \({\mathbb {R}}^{d\times d}\), i.e. for all \(n\in {\mathbb {N}}_0=\{0,1,2,\dots \}\), all entries of B(n) are non-negative, and for \(j\in \{1,\dots ,d\}\), we have \(\sum _{i=1}^dB_{ij}(n)\le 1\). The matrix element \(B_{ij}(n)\) describes the probability an individual moves from state j to state i at the discrete time step from time n to time \(n+1\). Thus, the sum of the entries in the columns are less than one because we allow transition to death, which is not explicitly included as a state. Accordingly, the probabilities describing the evolution from time m to time \(n>m\) are given by the elements of the transition operator

$$\begin{aligned} \Phi (n,m) := B(n-1)\cdots B(m)\,, \end{aligned}$$

corresponding to the linear nonautonomous difference equation \(x_{n+1}=B(n)x_n\), see Kloeden and Rasmussen (2011), Pötzsche (2010).

Although the question of occupancy times can be analysed in a purely deterministic nonautonomous setting, we aim at a stochastic description in order to make the analysis of statistical quantities more convenient. Let \(\{{\bar{X}}_n\}_{n \in {\mathbb {N}}_0}\) be the corresponding inhomogeneous absorbing Markov chain on the finite state space \(S=\{1,\dots , d\}\) that starts in the probability vectorFootnote 1\(v\in [0,1]^d\), and note that we have

$$\begin{aligned} {\mathbb {P}}\{{\bar{X}}_{0}=i\}= v_i \quad \text{ for } \text{ all } i\in \{1,\dots , d\} \end{aligned}$$


$$\begin{aligned} {\mathbb {P}}\{{\bar{X}}_n = i \,|\, {\bar{X}}_m = j\} =\Phi _{ij}(n,m) \quad \text{ for } \text{ all } n>m \text{ and } i,j\in \{1,\dots ,d\}\,. \end{aligned}$$

We now extend our model to include death. We define the absorption probabilities at time \(n\in {\mathbb {N}}_0\) by \(b(n) = ({\text {Id}}-B(n))^T\mathbb {1}_d\), where \(\mathbb {1}_d = (1,\dots ,1)^T\in {\mathbb {R}}^{d}\). Here, \({\text {Id}}\) denotes the identity matrix. It is convenient to explicitly include an additional state representing death, and hence we can extend our absorbing Markov chain \(\{{\bar{X}}_n\}_{n\in {\mathbb {N}}_0}\) to obtain a full Markov chain \(\{X_n\}_{n\in {\mathbb {N}}_0}\), which is generated using the transition matrices

$$\begin{aligned} C(n) := \begin{pmatrix} B(n) &{} 0 \\ b(n)^T &{} 1 \end{pmatrix} \quad \text{ for } \text{ all } n\in {\mathbb {N}}_0\,. \end{aligned}$$

We note that the Markov chain \(\{X_n\}_{n\in {\mathbb {N}}_0}\) has values in the extended state space \(S\cup \{d+1\}\). We define the lifetime of the Markov chain \(\{X_n\}_{n\in {\mathbb {N}}_0}\) by

$$\begin{aligned} N := \min \big \{ n \in {\mathbb {N}} : X_{n} = d+1\big \}\,. \end{aligned}$$

Note that the lifetime N describes the random time that an individual enters the absorbing state \(d+1\), that is the time until death. Its probability distribution can be calculated explicitly.

Proposition 1

The probability distribution of the lifetime is given by

$$\begin{aligned} {\mathbb {P}}\{ N = n\} = b(n-1)^T \Phi (n-1,0)v \quad \text{ for } \text{ all } n\in {\mathbb {N}}\,. \end{aligned}$$


For all \(n\in {\mathbb {N}}\), we have

$$\begin{aligned} {\mathbb {P}}\{N=n\}&= \sum _{j=1}^d {\mathbb {P}}\{X_{n}=d+1, X_{n-1}=j\} \\&=\sum _{j=1}^d{\mathbb {P}}\{ X_{n}=d+1\,|\, X_{n-1}=j\}{\mathbb {P}}\{X_{n-1}=j\} \\&= \sum _{j=1}^d b_j(n-1) (\Phi (n-1,0)v)_j = b(n-1)^T\Phi (n-1,0)v\,, \end{aligned}$$

which finishes the proof of this proposition. \(\square \)

We are now interested in the random amount of time an individual spends in a subset, such as the breeding state, of the state space consisting of all possible individual states, up to a certain time.

Definition 1

For \(R\subset S\), we define the R-occupancy time at time n by

$$\begin{aligned} A_{R}(n) = \#\big \{k\in \{0,\dots , n-1\} : X_k \in R \big \}\quad \text{ for } \text{ all } n\in {\mathbb {N}}\,, \end{aligned}$$

where \(\#\) denotes the number of elements of the set, and we define \(A_R(0)=0\). The lifetime R-occupancy time

$$\begin{aligned} \tau _{ R}=A_{ R}(N) \end{aligned}$$

is the amount of time a individual spends in R over its lifetime.

In order to compute the distribution of the lifetime R-occupancy time and its moments, it is useful to consider the joint distribution of the R-occupancy time and the state of the Markov chain at time n, given by

$$\begin{aligned} p^{R}_j(a,n) := {\mathbb {P}}\{ A_{R}(n)=a, X_n=j\} \quad \text{ for } \text{ all } j\in S\,,\, n\in {\mathbb {N}}_0 \text{ and } a\in {\mathbb {N}}_0\,. \end{aligned}$$

We note that this is not a probability distribution since there is a nonzero probability of death. In fact, the total mass available at time \(n\in {\mathbb {N}}\) is given by \(\sum _{j=1}^d \sum _{a=0}^{\infty } p_j^{R}(a,n) = (1,\dots , 1) \Phi (n,0)v\), which is less than 1 in general.

The following proposition shows that this joint distribution evolves according to a partial difference equation, which is a generalisation of the discretised McKendrick–von Foerster equation for age dependent population dynamics (McKendrick 1926).

Proposition 2

The time evolution of the distribution \(p^{R}_j(a,n)\) is governed by the partial difference equation

$$\begin{aligned} p^{R}_j(a,n+1) = \sum _{k\in R} B_{jk}(n)p_k^{ R}(a-1,n)+ \sum _{k\not \in R} B_{jk}(n)p_k^{ R}(a,n) \,, \end{aligned}$$

for any \(a\in {\mathbb {N}}\) and \(n\in {\mathbb {N}}_0\), with initial conditions

and boundary conditions governed by

$$\begin{aligned} p_j^{ R}(0,n+1) = \sum _{k\not \in R}B_{jk}(n)p_k^R(0,n) \quad \text { for all } j\in S \text{ and } n\in {\mathbb {N}}_0\,. \end{aligned}$$


For the evolution equation, we have for all \(a\in {\mathbb {N}}\) and \(n\in {\mathbb {N}}_0\) that

$$\begin{aligned} p_j^{ R}(a,n+1)&= {\mathbb {P}}\{A_{ R}(n+1)=a, \, X_{n+1}=j\} \\&={\mathbb {P}}\{ A_{ R}(n+1)=a,\, X_{n+1}=j,\, X_n\in R\} \\&\qquad + {\mathbb {P}}\{ A_{ R}(n+1)=a,\, X_{n+1}=j,\, X_n\not \in R\} \\&={\mathbb {P}}\{ A_{ R}(n)=a-1,\, X_{n+1}=j,\, X_n\in R\} \\&\qquad +{\mathbb {P}}\{ A_{ R}(n)=a,\, X_{n+1}=j,\, X_n\not \in R\} \\&=\sum _{k\in R}{\mathbb {P}}\{X_{n+1}=j\,| X_n=k\}{\mathbb {P}}\{A_{ R}(n)=a-1,\, X_n=k\} \\&\qquad +\sum _{k\not \in R}{\mathbb {P}}\{X_{n+1}=j\,| X_n=k\}{\mathbb {P}}\{A_{ R}(n)=a,\, X_n=k\} \\&= \sum _{k\in R} B_{jk}(n)p_k^{ R}(a-1,n) + \sum _{k\not \in R} B_{jk}(n)p_k^{ R}(a,n)\,, \end{aligned}$$

which proves (3). The initial conditions are satisfied by definition, and the validity of the boundary conditions follows from

$$\begin{aligned} p_j^{ R}(0,n+1)&= {\mathbb {P}}\{A_{ R}(n+1)=0, \, X_{n+1}=j\}\\&={\mathbb {P}}\{ A_{ R}(n+1)=0,\, X_{n+1}=j,\, X_n\not \in R\} \\&={\mathbb {P}}\{ A_{ R}(n)=0,\, X_{n+1}=j,\, X_n\not \in R\} \\&= \sum _{k\not \in R}{\mathbb {P}}\{X_{n+1}=j\,| X_n=k\}{\mathbb {P}}\{A_{ R}(n)=0,\, X_n=k\} \\&= \sum _{k\not \in R} B_{jk}(n)p_k^{ R}(0,n)\,, \end{aligned}$$

for all \(j\in S\) and \(n\in {\mathbb {N}}\). \(\square \)

We define \(p^{ R}(a,n):=(p_1^{ R}(a,n),\dots ,p_d^{ R}(a,n))^T\) for all \(a,n\in {\mathbb {N}}_0\). Note that one can write Eq. (3) more compactly in matrix form

$$\begin{aligned} p^{R}(a,n+1) = B(n)\big (R p^{ R}(a-1,n)+ ({\text {Id}}-R)p^{ R}(a,n)\big )\, \end{aligned}$$

where we have allowed R to also stand for the diagonal binary matrix describing the subset \(R\subset S\), defined by

The following theorem explains how to calculate the distribution and moments of the lifetime R-occupancy time \(\tau _R\).

Theorem 1

For any \(R\subset S\), the distribution of the lifetime R-occupancy time \(\tau _R\) is given by

$$\begin{aligned} {\mathbb {P}}\{\tau _{ R}=a\}= \sum _{n=0}^{\infty } b(n)^T \left( Rp^{ R}(a-1,n) + ({\text {Id}}-R)p^R(a,n)\right) \end{aligned}$$

for all \(a\in {\mathbb {N}}\). The moments of \(\tau _{R}\) are given by

$$\begin{aligned} {\mathbb {E}}(\tau _{ R}^k)=\sum _{n=0}^{\infty }b(n)^T\left( m_k^{ R}(n) + R\sum _{j=1}^k {k\atopwithdelims ()j}m_{k-j}^{ R}(n)\right) \quad \text{ for } \text{ all } k\in {\mathbb {N}}\,, \end{aligned}$$

where \(m^{R}_k(n) = \sum _{a=0}^{\infty }a^k p^{ R}(a,n)\in {\mathbb {R}}^d\) for all \(k\in {\mathbb {N}}_0\). We demonstrate how to efficiently calculate the terms \(m_k^R(n)\) below in Proposition 3.


For all \(a\in {\mathbb {N}}\), we have

$$\begin{aligned} {\mathbb {P}}\{\tau _{ R}&=a\} = {\mathbb {P}}\{ A_{ R}(N)=a\} = \sum _{n=1}^{\infty } {\mathbb {P}}\{A_{ R}(n)=a, N=n\} \\&= \sum _{n=1}^{\infty } \sum _{j=1}^d {\mathbb {P}}\{A_{ R}(n)=a, X_{n-1}=j, X_{n}=d+1\} \\&= \sum _{n=1}^{\infty } \sum _{j=1}^d {\mathbb {P}}\{X_{n}=d+1\, | \, X_{n-1}=j\}{\mathbb {P}}\{A_{ R}(n)=a, X_{n-1}=j\} \\&= \sum _{n=1}^{\infty }\Big (\sum _{j\in R}{\mathbb {P}}\{X_n=d+1\,|\,X_{n-1}=j\}{\mathbb {P}}\{A_R(n-1)=a-1,X_{n-1}=j\} \\&\qquad + \sum _{j\not \in R}{\mathbb {P}}\{X_n=d+1\,|\,X_{n-1}=j\}{\mathbb {P}}\{A_R(n-1)=a, X_{n-1}=j\}\Big ) \\&=\sum _{n=0}^{\infty }b(n)^T\left( Rp^R(a-1,n)+({\text {Id}}-R)p^R(a,n)\right) . \end{aligned}$$

Note that from the second to the third line, we used conditional independence of the random variables \(X_{n}\) and \(A_R(n)\) given \(X_{n-1}\). This proves (4). To prove (5) we observe

$$\begin{aligned} {\mathbb {E}}(\tau _R^k)&=\sum _{a=1}^{\infty }a^k{\mathbb {P}}\{\tau _R=a\} \\&= \sum _{a=1}^{\infty }a^k\sum _{n=0}^{\infty }b(n)^T(Rp^R(a-1,n)+({\text {Id}}-R)p^R(a,n)) \\&= \sum _{n=0}^{\infty }b(n)^T\left( \sum _{a=1}^{\infty }a^kRp^R(a-1,n) + \sum _{a=1}^{\infty }a^k({\text {Id}}-R)p^R(a,n)\right) \\&= \sum _{n=0}^{\infty }b(n)^T\left( \sum _{a=0}^{\infty }(a+1)^kRp^R(a,n) + \sum _{a=0}^{\infty }a^k({\text {Id}}-R)p^R(a,n)\right) \\&=\sum _{n=0}^{\infty }b(n)^T\left( \sum _{a=0}^{\infty }a^kp^R(a,n) + \sum _{a=0}^{\infty }((a+1)^k-a^k)Rp^R(a,n) \right) \\&=\sum _{n=0}^{\infty }b(n)^T\left( m_k^{ R}(n) + R\sum _{j=1}^k {k\atopwithdelims ()j}m_{k-j}^{ R}(n)\right) \, . \end{aligned}$$

\(\square \)

We now demonstrate how to calculate the terms \(m^{R}_k(n) = \sum _{a=0}^{\infty }a^k p^{ R}(a,n)\) without having to solve Eq. (3). Fundamental is the observation that for a fixed \(k\in {\mathbb {N}}\), the sequence \(n\mapsto m^R_k(n)\) solves a difference equation, involving also terms \(m^R_\ell (n)\), where \(\ell <k\).

Proposition 3

Consider \(R\subset S\). Then the sequence \(n\mapsto m_0^R(n)\) satisfies the difference equation

$$\begin{aligned} m_0^R(n+1) = B(n) m_0^R(n)\,, \end{aligned}$$

For \(k\in {\mathbb {N}}\), the sequence \(n\mapsto m_k^{ R}(n)\) satisfies the difference equation

$$\begin{aligned} m_k^{ R}(n+1) = B(n)\left( m_k^{ R}(n) + R\sum _{j=1}^k {k\atopwithdelims ()j}m_{k-j}^{ R}(n)\right) \,. \end{aligned}$$

The initial conditions are given by \(m_0^R(0)=v\) and \(m_k^R(0)=0\) for all \(k\in {\mathbb {N}}\).


Firstly, the definition of \(m_0^R\) and (2) imply that \(m_0^R(n) = \Phi (n-1,0)v\), which proves that it satisfies the above difference equation. Next, for \(k\in {\mathbb {N}}\), we use the matrix form

$$\begin{aligned} p^{R}(a,n+1) = B(n)\big (R p^{ R}(a-1,n)+ ({\text {Id}}-R)p^{ R}(a,n)\big )\, \end{aligned}$$

of (3) and obtain

$$\begin{aligned} m_k^{R}(n+1)&=\sum _{a=1}^{\infty }a^k p^{R}(a,n+1) \\&=\sum _{a=1}^{\infty }a^kB(n)\big ( R p^{R}(a-1,n)+ ({\text {Id}}-R)p^{ R}(a,n)\big )\\&= B(n)\left( R\sum _{a=0}^{\infty }(a+1)^k p^{R}(a,n)+ ({\text {Id}}-R)\sum _{a=0}^{\infty }a^k p^{ R}(a,n) \right) \\&=B(n)\left( \sum _{a=0}^{\infty }a^k p^{R}(a,n)+ \sum _{j=1}^k{k\atopwithdelims ()j}R\sum _{a=0}^{\infty } a^{k-j}p^{R}(a,n) \right) \\&= B(n)\left( m_k^{ R}(n) + R\sum _{j=1}^k {k\atopwithdelims ()j}m_{k-j}^{ R}(n)\right) \,. \end{aligned}$$

This finishes the proof of this proposition. \(\square \)

The pseudo code presented in Fig. 1 describes how to efficiently implement the formulas in Theorem 1 and Proposition 3 to approximate the first K moments of the lifetime R-occupancy time for a given set of transition matrices \(\{B(n)\}_{n\in {\mathbb {N}}_0}\). The idea is to iteratively solve the set of difference equations described in Proposition 3 while simultaneously building up the sums in (5) for finitely many natural numbers \(n=0,\dots ,N\).

Fig. 1
figure 1

Pseudo code to approximate the first K moments of the lifetime R-occupancy time using the formulas from Theorem 1 and Proposition 3

We close this section by a short discussion how the analysis is simplified when looking at certain special cases.

We first consider the special case of determining the future life of an organism, \(R=S\). If we assume that \(X_n=j\) for some \(n\in {\mathbb {N}}_0\) and \(j\in S\), then this implies that \(A_S(n)=n\), and hence, \(p_j^S(n,n)={\mathbb {P}}\{X_n = j\} = (\Phi (n,0)v)_j\), and we have \(p_j^S(a,n)=0\) whenever \(a\not =n\). A consequence of this is that the partial difference Eq. (3) from Proposition 2 reduced to a (non-partial) difference equation of the form

$$\begin{aligned} p^S(n+1,n+1) = B(n)p^S(n,n)\quad \text{ for } \text{ all } n\in {\mathbb {N}}_0\,, \end{aligned}$$

with \(p^S(0,0)=v\). Formula (4) in Theorem 1 then simplifies to

$$\begin{aligned} {\mathbb {P}}\{\tau _S = a\} = b(a-1)^Tp^S(a-1,a-1) = b(a-1)^T\Phi (a-1,0)v\,, \end{aligned}$$

and the formulas in Proposition 3 simplify slightly, since the matrix R is the identity matrix and can be removed. Also note that this becomes identical to the probability distribution of the lifetime given in Proposition 1.

Now consider the special case that the transition probabilities between stages are constant through time, so the Markov chain is homogeneous, which has been treated in Roth and Caswell Roth and Caswell (2018) using a different approach. Since the matrix B(n) and the vector b(n) do not depend on \(n\in {\mathbb {N}}_0\) in this case, the probability distribution of the lifetime is given in this special case by

$$\begin{aligned} {\mathbb {P}}\{N=n\} = b^TB^{n-1}v\,, \end{aligned}$$

which is known as a phase-type distribution (Neuts 1981). The formulas in Theorem 1 and Proposition 3 do not simplify, except that the matrix B and the vector b do not depend on time.

Finally, in case the Markov chain is homogeneous and \(R=S\), the above discrete-time phase-type distribution is also the distribution of the occupancy time \(\tau _S\), and respective quantities for expectation and higher-order moments can be obtained more easily from Neuts (1981); see also Metzler and Sierra (2018), where the continuous-time case is treated via absorbing Markov processes.

Occupancy times in breeder stages of the Southern Fulmar

We illustrate our theory of occupancy times for inhomogeneous Markov chains using a stage-structured model describing the life cycle of the Southern Fulmar. The Southern Fulmar is a sea bird with colonies along the coast of Antarctica. There are four states that the bird can be in, the structure of the model is shown in Fig. 2.

Fig. 2
figure 2

Shows the four states of the Southern Fulmar model and how they are connected. Pre-breeder refers to birds which have yet to breed for the first time. Successful breeders have raised a chick that year. Failed breeders have found a mate but either failed to hatch an egg or the chick died. Finally non-breeders refers to birds who have previously bred but have no mate that year

The rates of transition between these four states depend on the ice conditions in that year. In Jenouvrier, Péron, and Weimerskirch Jenouvrier et al. (2015), the years are classified as favorable, ordinary and unfavorable. In favorable years the colony is close to the edge of the sea ice. This means foraging trips are shorter and more likely to be successful.

In Sect. 3.1, we first consider deterministic, but time-dependent, rates that describe transitions between these four states, and in Sect. 3.2, we use a random model to describe changes in the external environment.

Deterministic external environments

In Jenouvrier et al. (2015), the transition rates between the four different states have been obtained using satellite data between 1979 and 2010. This involved classification of the three different ice conditions (favorable, ordinary, unfavorable), see Fig. 3, and three different transition matrices (\(U_f\), \(U_o\), \(U_u\)) corresponding to the possible ice conditions:

$$\begin{aligned} U_f= & {} \begin{pmatrix} 0.828 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0.06624 &{}\quad 0.72912 &{}\quad 0.62244 &{}\quad 0.40176 \\ 0.02576 &{}\quad 0.18228 &{}\quad 0.24206 &{}\quad 0.15624 \\ 0 &{} 0.0186 &{}\quad 0.0455 &{}\quad 0.342 \end{pmatrix}\,, \\ U_o= & {} \begin{pmatrix} 0.9016 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0.011408 &{}\quad 0.66737 &{}\quad 0.49312 &{}\quad 0.1809 \\ 0.006992 &{}\quad 0.18823 &{}\quad 0.24288 &{}\quad 0.0891 \\ 0 &{}\quad 0.0744 &{}\quad 0.184 &{}\quad 0.63 \end{pmatrix}\,,\\ U_u= & {} \begin{pmatrix} 0.9154 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 0.002392 &{}\quad 0.4873 &{}\quad 0.25147 &{}\quad 0.0468 \\ 0.002208 &{}\quad 0.1895 &{}\quad 0.23213 &{}\quad 0.0432 \\ 0 &{}\quad 0.2632 &{}\quad 0.4464 &{}\quad 0.81 \end{pmatrix}\,. \end{aligned}$$
Fig. 3
figure 3

Shows the sequence of ice conditions between 1979 and 2010. The favorability of a year is determined by the distance between the ice edge and the colony, and the total sea ice area. The average life expectancy of a bird is around 12 years

We are interested in how many years of a bird’s life it attempts to breed. Note that a breeding attempt occurs if the bird is either in the Successful Breeder or Failed Breeder state. This means that, using the notation of the previous section, we are interested in the set \(R=\{2,3\}\) and the lifetime R-occupancy time \(\tau _R\) is the number of breeding attempts a bird makes in its life. In Fig. 4 we use Equ. (5) to calculate the expected value and co-efficient of variation for \(\tau _R\) for four different scenarios (as explained in the caption of this figure). In Fig. 5, we use Equ. (4) to calculate the distribution of \(\tau _R\) for the four scenarios.

Fig. 4
figure 4

Compares the expected value a and the co-efficient of variation b of the lifetime R-occupancy time \(\tau _R\) for different initial states. We compare the three autonomous scenarios of constant conditions with the time varying conditions described in Fig. 3

Fig. 5
figure 5

Panel a shows the full probability distribution \(a\mapsto {\mathbb {P}}\{\tau _R=a\}\) of the lifetime R-occupancy time for individuals initialised in the successful breeder state. We compare the distribution for the time-dependent scenario shown in Fig. 3 with constant conditions. Panel b shows the expected value of the lifetime R-occupancy time as a function of the year they first entered the successful breeder state

We conclude that for the time-dependent scenario shown in Fig. 3, the statistical properties of the lifetime R-occupancy time is quite similar to the scenario where all the years are ordinary. The effects of the occasional favorable and unfavorable years approximately balance each other out, but such a conclusion would not have been possible using a purely time-independent analysis of a homogeneous Markov chain.

Randomly varying external environments

The formulas and algorithm presented in Sect. 2 are valid for an arbitrary sequence of time-dependent transition matrices \(B=\{B(n)\}_{n\in {\mathbb {N}}_0}\). A particularly relevant source of such time-dependence arises from a Markov chain with transition probabilities varying randomly through time. In this subsection, we formulate this mathematically and apply it to the Southern Fulmar example.

In order to treat the case of a randomly varying environment we need to specify a probabilistic model of the environment. This is obtained by specifying a probability distribution on the set consisting of sequences of transition matrices. Below we describe a very simple model which is tailored to our Southern Fulmar example. However, the ideas described remain valid for any application and for more complicated probabilistic models. This could involve, for example, temporal correlations or an infinite set of possible transition matrices.

For the Southern Fulmar example, we have chosen a particularly simple probabilistic model by assigning probabilities \(P_f\), \(P_u\), \(P_o\in [0,1]\) for favorable, unfavorable and ordinary years, respectively. We clearly must have

$$\begin{aligned} P_f + P_u + P_o=1. \end{aligned}$$

We can generate a random sequence of transition matrices by picking the matrices \(U_f\), \(U_u\), \(U_o\) according to the probabilities \(P_f\), \(P_u\), \(P_o\) independently at each time-step. We use \(\mu \) to denote the probability measure on the set of sequences of matrices which is described above. For example we have

$$\begin{aligned} \mu \{B(0)=U_o,\, B(1)=U_f,\, B(2)=U_o\} = P_oP_fP_o. \end{aligned}$$

We note that there are now two separate sources of randomness contributing to lifetime R-occupancy time, see also Shoemaker et al. (2020), Brett (2008).

  1. 1.

    The demographic stochasticity modelled by the Markov chain, as used in the first part of the paper. Different individuals born at the same time take different demographic paths due to this level of randomness. We use \({\mathbb {E}}_{{\mathbb {P}}_B}\) to denote expectations taken over only this level of randomness. Here we emphasise the dependence on the specific sequence of matrices B.

  2. 2.

    The environmental stochasticity from the varying conditions. All individuals in the population at a given time experience the same randomly changing conditions. We use \({\mathbb {E}}_{\mu }\) to denote expectation taken over only this level of randomness.

As ecologists have emphasized Shoemaker et al. (2020), Brett (2008), demographic stochasticity is the dominant stochastic force in small populations, while environmental stochasticity is more important in larger populations, so it is important to examine the effects of both. We additionally note that climate change is expected to have large effects not only on means but on variability van der Wiel and Bintanja (2021), which is a further reason for focusing on the effect of variability on population dynamics.

The method described in Sect. 2 allows us to efficiently calculate the moments and distribution of the lifetime R-occupancy time for a specific sequence of transition matrices. This means we can obtain averages over the demographic stochasticity as described in the pseudo code in Fig. 1, so we do not need to resort to Monte Carlo simulation. However, to numerically obtain expected values over the environmental stochasticity we do have to randomly generate many sequences of matrices, apply the algorithm explained in Fig. 1 for each of the randomly generated sequences, and then take the average.

In Fig. 6 we show how the expected value and the co-efficient of variation of the lifetime R-occupancy time taken over both levels of randomness depend on the probabilities \(P_f\), \(P_u\) and \(P_o\) of the three different ice conditions. Unsurprisingly, we see in Fig. 6a that an increased probability of favorable years results in a large increase in the expected number of breeding attempts. We also see that a high probability of unfavorable years causes more relative variability in the number of breeding attempts.

Fig. 6
figure 6

a expected value and b coefficient of variation of the lifetime R-occupancy time as a function of the probabilities \((P_f, P_o, P_u)\). Here the expectation are taken over both the demographic and environmental stochasticity. For each \((P_f, P_o, P_u)\) we sample 2000 random sequences of matrices. For each random sequence of transition matrices we use the formulas in the previous section to calculate the demographic level expected value and variance. Then we average over the 2000 samples to approximate the statistics. Individuals are initialised in the pre-breeder state. It is helpful in understanding this figure to note that the three corners of the triangle correspond to constant deterministic conditions as shown in Fig. 4

It is interesting to consider how much the two levels of stochasticity contribute to the uncertainty expressed by the co-efficient of variation shown in Fig. 6b. We can decompose the variance to show the contribution of the two sources of randomness and obtain

$$\begin{aligned} \mathrm {Var}(\tau _{ R})&={\mathbb {E}}_{\mu }\Big [{\mathbb {E}}_{{\mathbb {P}}_B}\Big [(\tau _{ R}-{\mathbb {E}}_{\mu }{\mathbb {E}}_{{\mathbb {P}}_B}(\tau _{ R}))^2\Big ] \Big ]\\&= {\mathbb {E}}_{\mu }\Big [ {\mathbb {E}}_{{\mathbb {P}}_B}(\tau _{ R}^2)-2{\mathbb {E}}_{{\mathbb {P}}_B}(\tau _{ R}){\mathbb {E}}_{\mu }{\mathbb {E}}_{{\mathbb {P}}_B}(\tau _{ R})+({\mathbb {E}}_{\mu }{\mathbb {E}}_{{\mathbb {P}}_B}(\tau _{ R}))^2\Big ] \\&= {\mathbb {E}}_{\mu }\mathrm {Var}_{{\mathbb {P}}_B}(\tau _{ R}) + \mathrm {Var}_{\mu }{\mathbb {E}}_{{\mathbb {P}}_B}(\tau _{ R})\,. \end{aligned}$$

The total variance is the sum of the environmental expected value of the demographic variance and the environmental variance of the demographic expected value. In Fig. 7, we see that, for our model, almost all of the observed variance is explained by the demographic stochasticity rather than the environmental stochasticity.

Fig. 7
figure 7

Shows the contributions of the two sources of variance in the lifetime R-occupancy time. The expected value of the demographic level variance (a) is much larger than the variance of the demographic level expected value (b). The computations are done in the same way as Figure 6


We have presented extensions to the analysis of occupancy times for states in a compartmental model such as a stage-structured demographic model from the case of constant transition rates through time previously analyzed by Roth and Caswell Roth and Caswell (2018) to the important case of time-varying transition rates. For demographic models, understanding how the amount of time individuals will spend in different states as environmental conditions change, either randomly or on a longer term from global change, is an important question in conservation biology, as exemplified by the demography of the California condor Meretsky et al. (2000) and plants with dormant states Gremer et al. (2012), Alahuhta et al. (2017). We have illustrated this with specific calculations showing how random variations in the environment affect the expected time spent as a breeder in Southern Fulmar populations. Using these ideas in an applied context does bring up questions about the meaning of occupancy time.

We would like to emphasise that there are a number of interpretations of occupancy times for time-dependent compartmental processes, and this has been explored only very recently in the literature Sierra et al. (2017), Metzler (2020). Firstly, it makes a difference whether one considers only a single individual organism or particle, or whether one is interested in dynamical processes that involve a lot of organisms or particles. In this article, we focused on single organisms or particles, and we used a stochastic model. When considering many organisms or particles, the law of large numbers can be used to derive a deterministic model describing the evolution of mass accurately. In our context, the deterministic evolution is given by the difference equation \(x_{n+1}=B(n)x_n\) and solved using the transition operator from (1), via \(\Phi (n,m)v\), where v is a probability vector describing the initial mass distribution. The occupancy time \(\tau _R\) we have developed here carries over naturally to that setting, and the distribution of \(\tau _R\) has the same interpretation, meaning that in particular Theorem 1 applies for \(k=1\), giving the expected occupancy time. We note that higher-order moments for \(k>1\) do not have a meaning in this large particle limit.

For large particle systems, a different analysis is possible for time-dependent compartmental processes, in either discrete or continuous time, that also have inputs. These are modelled in the linear case either as a nonautonomous linear difference equation \(x_{n+1}=B(n)x_n + s_n\) or differential equation \(\dot{x} = B(t)x +s(t)\). The theory developed here also applies to this setting with the interpretation that the initial mass probability distribution v is determined by the input vector \(s_n\) or s(t) at time n or t, respectively. This yields a (time-dependent) occupancy time describing this quantity for all particles that enter the system at a particular time, but a different perspective is possible here and was first explored in Rasmussen et al. (2016). Instead of considering the future of all particles that come into the system at some time n or t, one can also look at for how long particles leaving the system at a particular time have spent in a certain subset, which essentially covers the past of the system. We note that for autonomous systems at equilibrium, these two approaches are the same. These two approaches yield different answers not only for nonautonomous systems, but also for autonomous systems that have not yet reached equilibrium.

As an example of where these different definitions of occupancy times for subsets of the states could be important in understanding dynamics, a discrete-time version of the carbon cycle models considered in Rasmussen et al. (2016) is illustrative. These models look at the expected time that carbon is in various pools. The approaches we have outlined in this paper could be used, for example, to determine the expected time a molecule spends in the litter pool (as opposed to the soil or vegetation pools).

We finally note that the methods we have developed here, as well as the variations discussed in the preceding paragraphs, are not only relevant for ecology, but can be applied to different settings such as systems biology, pharmacokinetics Anderson (1983), epidemiology Jacquez and Simon (1993) and biogeochemistry Sierra et al. (2017). Despite the fact that some of these applications can naturally be modeled effectively by autonomous dynamical systems, many applications such as the carbon cycle depend on changing environmental conditions, which are either of deterministic or random nature, and thus in the setting of this article.