1 Introduction

Much research has been carried out in the past 20 years on statistical methods for the analysis of recurrent events to better understand chronic disease processes in observational settings and to evaluate the effect of experimental interventions in clinical trials. Disease processes in which recurrent events are manifest are ubiquitous and include, for example, chronic obstructive pulmonary disease where individuals experience recurrent exacerbations (Grossman et al. 1998), epilepsy where seizures recur (Musicco et al. 1997), and cancer where skeletal metastases and associated clinical complications can recur over time (Hortobagyi et al. 1996).

In clinical trials it is essential that tests for treatment effects be valid such that the rejection rate under the null hypothesis is at the nominal level. It is also critically important that models and methods of estimation be formulated so that estimators are consistent for an estimand with a clear causal interpretation. Finally, standard errors must adequately reflect the sampling variation so that confidence intervals have empirical coverage rates that are compatible with the nominal level in finite samples. These criteria form the basis for the following investigation which we carry out in both the clinical trial and observational settings. We confine our attention to marginal rate-based and partially conditional rate-based analyses since these are frequently applied in practice.

Semiparametric models based on marginal rate functions (Andersen and Gill 1982) are among the most widely used for assessing treatment effects on recurrent event processes in clinical trials (Cook and Lawless 2007). Partially conditional models involve time-dependent stratification on the cumulative number of events; this is formulated like a Markov model and is sometimes referred to as the Prentice–Williams–Peterson approach, although Prentice et al. (1981) did not advocate its use in clinical trials. It is also often called the stratified Andersen–Gill approach due to its relation with the rate-based method of Andersen and Gill (1982). We use the term partially conditional model to reflect the fact that, in contrast to intensity-based models, here only part of the process history is conditioned upon. This partially conditional approach has been shown to provide some protection against extra-Poisson variation when model-based variance estimates are used (Boher and Cook 2006), and to mitigate biases induced by event-dependent censoring (Cook et al. 2009). We explore the robustness of the marginal and partially conditional model by evaluating the limiting value and variance of estimators of covariate effects when a Poisson model is misspecified through the omission of a covariate; we consider both the observational and clinical trial setting where interest lies in the effect of a treatment. Performance of these methods when the recurrent events are generated by a multistate Markov process is also considered empirically.

The remainder of the paper is organized as follows. In Sect. 2 we define and give the associated estimating equations for the multiplicative model based on the marginal rate function (Andersen and Gill 1982) as well as the partially conditional model (Prentice et al. 1981). The limiting behaviour of estimators of treatment effect are given in Sect. 3 for the marginal and partially conditional models when the events are generated by a Poisson process but a prognostic covariate is omitted. The results of empirical studies supporting the large sample theory are given in Sect. 4 where the investigation is broadened to study the setting where events are generated by a Markov process but a covariate is omitted in the marginal and partially conditional analyses. An application illustrating the various methods is given in Sect. 5 and concluding remarks are given in Sect. 6.

2 Marginal and partially conditional rate-based models

2.1 Multiplicative models based on marginal rate functions

Let \(N_i(t)\) denote the number of events occurring over [0, t] and \(\{N_i (t), 0 \le t\}\) be the right-continuous counting process for individual i in a sample of n independent individuals, \(i=1, 2, \ldots , n\). The number of events over the interval \([t, t+\Delta t)\) for individual i is then \(\Delta N_i(t)=N_i(t+\Delta t^-)-N_i(t^-)\) and \(dN_i (t) = \lim _{\Delta t \downarrow 0} \Delta N_i (t)\). We let \(X_i (t) = (X_{i1}(t), \ldots , X_{ip}(t))'\) denote a \(p\times 1\) vector of external potentially time-dependent covariates where the process \(\{X_i (t), 0 \le t\}\) is left-continuous. The process history is denoted by \({{{\mathcal {H}}}}_i (t) = \{N_i (s), X_i(s): \, 0 \le s < t \}\).

The stochastic nature of any point process can be characterized by an intensity function,

$$\begin{aligned} \lim _{\Delta t \downarrow 0} \frac{P(\Delta N_i (t) = 1| {{\mathcal {H}}}_i (t))}{\Delta t} = \lambda _i (t|{{\mathcal {H}}}_i (t) ) , \end{aligned}$$
(1)

which represents the instantaneous probability of an event at time t given the process history (Ross 1983; Taylor and Karlin 1984). Of course for a particular setting one must make model assumptions; the canonical model for recurrent events with time-dependent covariates is the modulated Poisson model (Lawless 1987; Cook and Lawless 2007, Chapter 3). The conditionally independent increment property of the modulated Poisson model implies that given \(X_i(t)\) the risk at time t does not depend on \(\{N_i(s), 0 \le s < t\}\), yielding an intensity of the form \( \lambda _i (t|{{\mathcal {H}}}_i (t)) = \rho _i (t|X_i (t))\). Multiplicative models with

$$\begin{aligned} \rho _i (t|X_i(t);\theta ) = \rho _0 (t; \alpha ) g(X_i (t);\beta ) , \end{aligned}$$
(2)

are most common, where \(\rho _0 (t; \alpha )\) is a baseline rate function indexed by \(\alpha \), \(g(X_i(t);\beta )\) is a positive valued function indexed by \(\beta \), and \(\theta = (\alpha ' , \beta ')'\). Lawless (1987) gives the partial likelihood and associated estimating equations for the semiparametric setting where \(\rho _0 (t, \alpha )\) is an arbitrary positive-valued function, and Andersen and Gill (1982) derive the large sample theory; the semiparametric model (2) is sometimes called the Andersen–Gill model. Lin et al. (2000) provide a rigorous derivation of the limiting behaviour of estimators with an emphasis on robust variance estimation.

Individuals are typically followed over a finite period of time to record the occurrence of events of interest. Let the start of the interval be denoted by 0 and A denote the planned administrative censoring time. To accommodate early withdrawal we let \(R_i\) be a non-negative random variable independent of the recurrent event and covariate process with survivor function \( P(R_i \ge t)={{\mathcal {G}}} (t)\), and let \(C_i = \min (R_i, A)\) be the effective right-censoring time for individual i; the function \(Y_i (t) = \text{ I }(t \le C_i)\) indicates whether individual i is under observation at time \(t>0\), \(i=1,\ldots , n\). Under independent and non-informative censoring (Cook and Lawless 2007), the log partial likelihood contribution for individual i having \(n_i\) events at times \(t_{i1}< \cdots < t_{i n_i}\) over \([0, C_i]\) is

$$\begin{aligned} \int _0^{\infty } Y_i(t) \left\{ \log \rho _i (t|X_i(t); \theta ) dN_i (t) - \rho _i (t|X_i(t); \theta ) dt \right\} , ~~~ i=1,\ldots , n. \end{aligned}$$
(3)

In the semiparametric setting of (2) the function \(g(x;\beta ) = \exp (x'\beta )\) is used most often. We let \(d\mu _0(t)=\rho _0(t)dt\) (\(t>0\)) so that \(d \mu _0(\cdot )\) can be viewed as an infinite dimensional parameter; differentiating the terms in (3) with respect to \(d\mu _0(t)\) we obtain the estimating equations

$$\begin{aligned} \sum _{i=1}^{n} Y_i (t)\left\{ dN_i (t) - d\mu _0 (t) \exp (X'_i (t) \beta )\right\} = 0 ,~~~ 0< t . \end{aligned}$$
(4)

The profile Breslow-type estimator \(d{\tilde{\mu }}_0 (t; \beta ) = d{\bar{N}}_{\cdot }(t)/\sum _{i=1}^n Y_{i}(t) \exp (X'_i (t) \beta )\). is the solution where \(d {\bar{N}}_{\cdot }(t) = \sum _{i=1}^{n} Y_i (t) dN_i (t)\). Differentiating (3) with respect to \(\beta \) and replacing \(d \mu _0(t)\) with \(d{\tilde{\mu }}_0 (t; \beta )\) gives the profile partial score equation

$$\begin{aligned} U (\beta ) = \sum _{i=1}^n U_i(\beta ) = \sum _{i=1}^n \int _0^{\infty } Y_i (t) \left\{ X_i (t) - \frac{S^{(1)} (\beta , t)}{S^{(0)} (\beta , t)} \right\} dN_i (t) = 0 , \end{aligned}$$
(5)

where \(S^{(r)} (\beta , t) = n^{-1} \sum _{i=1}^{n} Y_i (t) \exp (X'_i(t) \beta ) X_i(t)^{\otimes r}\) with \(X_i(t)^{\otimes 0} = 1\), \(X_i(t)^{\otimes 1}=X_i(t)\) and \(X_i(t)^{\otimes 2}=X_i(t) X'_i(t)\). Lin et al. (2000) showed that

$$\begin{aligned} \sqrt{n} ({\hat{\beta }} - \beta ^\dagger ) \rightarrow \text{ N }\left( 0, {{\mathcal {A}}}^{-1}(\beta ^\dagger ) \mathcal{B}(\beta ^\dagger ) [{{\mathcal {A}}}^{-1}(\beta ^\dagger )]' \right) , \end{aligned}$$
(6)

where \({\hat{\beta }}\) is the solution to (5), \(\mathcal{A}(\beta ) = E[-\partial U_i (\beta ) / \partial \beta ]\), \(\mathcal{B}(\beta ) = E[U_i (\beta )U'_i (\beta )]\), and \(\beta ^\dagger \) is the solution to

$$\begin{aligned} \int _0^{\infty } E\left[ Y_i (t) \left\{ X_i(t) - \frac{s^{(1)} (\beta , t)}{s^{(0)} (\beta , t)} \right\} dN_i (t)\right] = 0 \end{aligned}$$
(7)

where \(s^{(r)}(\beta , t) = E[S^{(r)}(\beta , t)]\), \(r=0, 1\), and \(E[\; \cdot \;]\) denotes an expectation taken with respect to the censoring, recurrent event and covariate processes.

2.2 Multiplicative models based on partially conditional rate functions

A common partially conditional model is obtained by specifying

$$\begin{aligned} \lim _{\Delta t \downarrow 0} \frac{P(\Delta N_{ij}(t) = 1 |N_i (t^- ) = j-1, {{\mathcal {H}}}_i (t) )}{\Delta t} = \rho _{j0}(t)g(X_i (t);\beta ) \end{aligned}$$
(8)

where \(\rho _{j0}(t)\) is an event-specific baseline rate function. The specification in (8) corresponds to a multiplicative intensity-based model for a Markov process with a common treatment effect (Prentice et al. 1981), but otherwise should be viewed as a partially conditional model because only part of the history, namely \(N_i(t^-)=j-1\), is conditioned upon.

For convenience in what follows we let \(\rho _{j0}(t)g(X_i (t);\beta ) = \rho _{ij}(t|X_i(t))\) and write simply \(\rho _{ij}(t)\) to suppress its dependence on \(X_i(t)\); \(\mu _{ij}(t) = \int _0^t \rho _{ij}(s) ds\). Because \(\{X_i(s), 0\le s \}\) is external we can conceive of conditioning on the complete covariate path \(\{ X_i(s), 0 \le s\}\) but we will ultimately focus primarily on the case of fixed covariates. We let \(Y_{ij}(t) =I(N_i (t^-) = j-1)\) indicate that individual i is at risk for their jth event at t and define \({\bar{Y}}_{ij}(t) = Y_i (t) Y_{ij}(t)\), \(i=1, \ldots , n\). We let \(dN_{ij}(t) = 1\) indicate the jth event for individual i occurs at time t, and \(dN_{ij}(t) = 0\) otherwise; \(d{\bar{N}}_{ij}(t) = {\bar{Y}}_{ij}(t) dN_{ij}(t)\) indicates that the jth event occurs at t and is observed.

If individual i is observed to experience \(n_i\) events at time \(t_{i1}< \cdots < t_{i n_i}\) over \([0,C_i]\), the estimating equation for \(\beta \) based on a sample of n independent individuals is

$$\begin{aligned} {\tilde{U}}(\beta ) = \sum _{i=1}^{n} {\tilde{U}}_i(\beta ) = \sum _{i=1}^{n} \sum _{j=1}^{n_i} \int _0^{\infty } {\bar{Y}}_{ij}(t) \left\{ X_i (t) - \frac{S_j^{(1)} (\beta , t)}{S_j^{(0)} (\beta , t)} \right\} dN_{ij}(t), \end{aligned}$$
(9)

where here \(S^{(r)}_j (\beta , t) = n^{-1} \sum _{i=1}^{n} {\bar{Y}}_{ij}(t) \exp (X'_i(t) \beta ) X_i(t)^{\otimes r}\); this is the profile pseudo-score function for the partially conditional model. Solving \({\tilde{U}}(\beta ) = 0\) yields the estimate \({\tilde{\beta }}\) which has the asymptotic distribution

$$\begin{aligned} \sqrt{n} ({\tilde{\beta }} - \beta ^\ddagger ) \rightarrow N(0, \tilde{{{\mathcal {A}}}}^{-1}(\beta ^\ddagger ) \tilde{\mathcal{B}}(\beta ^\ddagger ) [\tilde{{{\mathcal {A}}}}^{-1}(\beta ^\ddagger )]' ), \end{aligned}$$
(10)

where \(\tilde{{\mathcal {A}}}(\beta ) = E[-\partial {\tilde{U}}_i (\beta ) /\partial \beta ]\), \(\tilde{{\mathcal {B}}}(\beta ) = E[{\tilde{U}}_i (\beta ){\tilde{U}}'_i (\beta )]\) and \(\beta ^\ddagger \) is the solution to

$$\begin{aligned} \sum _{j=1}^{\infty } \int _0^{\infty } E\left[ {\bar{Y}}_{ij}(t) \left( X_i (t) - \frac{s^{(1)}_j (\beta , t)}{s^{(0)}_j (\beta , t)} \right) dN_{ij} (t) \right] = 0 \end{aligned}$$
(11)

with \(s_j^{(r)} (\beta , t) = E[S_{j}^{(r)}(\beta , t)]\), \(r=0, 1\). These expressions to calculate the bias and asymptotic robust variance are very general, and in principle could be used to evaluate the large sample behaviour of estimators for any underlying recurrent event process. Our interest however, is on the effect of omitted covariates and we explore this in detail in the next section.

3 Inference regarding treatment effects with omitted covariates

3.1 Asymptotic properties for estimators of treatment effect

Given the general theory reviewed in Sect. 2 we can now explore the limiting behaviour of treatment effect estimators under misspecified marginal and partially conditional models. Here we consider modulated Poisson processes with a binary treatment covariate X and an external potentially time-varying covariate Z(t). The true rate function is assumed to have the form

$$\begin{aligned} \rho (t|X, Z(t)) = \rho _0 (t) \, \exp (\eta X + \zeta Z(t) ), \end{aligned}$$
(12)

where \(\rho _0 (t)\) is a positive-valued baseline rate function. In the setting of a randomized trial \(X \perp Z(t)\) since Z(t) is external, but X and Z(t) may be correlated in the observational setting. When we model just the treatment indicator (i.e. we omit Z(t)), we fit the marginal rate-based model \(\rho (t|X) = \rho _0^{*}(t) \exp (\beta X)\) or the partially conditional model \(\rho _{j}(t|X) = \rho _{j0}^{*}(t)\exp (\beta X)\).

From (7) the asymptotic bias of the Andersen–Gill estimator \({\hat{\beta }}\) is \((\beta ^\dagger - \eta )\). To derive the explicit form we note that in the present setting we can evaluate (7) by first taking the expectation with respect to \(dN_{i}(t)|Y_i(t), X_i, Z_i(t)\) to obtain \(E_{dN_i(t)} \{U_i(\beta )|Y_i(t), X_i, Z_i(t)\}\) as

$$\begin{aligned} \int _0^{\infty } Y_i (t) \left\{ X_i - \frac{s^{(1)} (\beta , t)}{s^{(0)} (\beta , t)} \right\} \exp (\eta X_i + \zeta Z_i (t) ) d\mu _0 (t) \end{aligned}$$

under the assumption of conditionally independent censoring (i.e. \(R_i \perp {{\mathcal {H}}}_i(t)\)). Then taking the expectation with respect to the remaining terms gives

$$\begin{aligned} \int _0^{A} {{\mathcal {G}}}(t) \left\{ E[X_i \exp (\eta X_i + \zeta Z_i (t))] - E[\exp (\eta X_i + \zeta Z_i (t))]\cdot \frac{s^{(1)} (\beta , t)}{s^{(0)} (\beta , t)}\right\} d\mu _0 (t), \end{aligned}$$
(13)

where \(s^{(r)}(\beta , t) = E[S^{(r)}(\beta , t)] = {{\mathcal {G}}}(t) E[e^{\beta X_i} X^{\otimes r}_i ]\). When \(X_i\) is a binary treatment indicator with, say, \(P(X_i=1)=0.5\), then \(s^{(1)}(\beta , t)/s^{(0)}(\beta , t) = \exp (\beta )/(1 + \exp (\beta ))\) and substituting this into (13) and solving gives

$$\begin{aligned} \exp (\beta ^\dagger ) = \frac{\int _0^{A} {{\mathcal {G}}}(t) E[X_i \exp (\eta X_i + \zeta Z_i (t))]d\mu _0 (t)}{\int _0^{A} {{\mathcal {G}}}(t) E\left[ (1-X_i) \exp (\eta X_i + \zeta Z_i (t))\right] d\mu _0 (t)}. \end{aligned}$$
(14)

When \(Z_i (t)\) is independent of \(X_i\) as in a randomized controlled trial, \(\beta ^\dagger = \eta \) so a consistent estimate of the causal effect of treatment (\(\eta \)) is obtained even when an important (external) covariate is omitted. When Z(t) and X are correlated however, a marginal model omitting Z(t) will yield a biased estimate of the treatment effect with no easy causal interpretation. Finally note that in the case of a fixed covariate \(Z_i(t) = Z_i\) which is possibly correlated with \(X_i\), (14) can be simplified to

$$\begin{aligned} \exp (\beta ^\dagger ) = \frac{E[X_i \exp (\eta X_i + \zeta Z_i)]}{E[(1-X_i)\exp (\eta X_i + \zeta Z_i)]}. \end{aligned}$$
(15)

For the partially conditional model (11) gives

$$\begin{aligned} \sum _{j=1}^{\infty }\int _0^{\infty } \left\{ E\left[ {\bar{Y}}_{ij}(t) X_i d\mu _i(t)\right] - E\left[ {\bar{Y}}_{ij}(t)d\mu _i(t)\right] \cdot \left( \frac{s_j^{(1)}(\beta , t)}{s_j^{(0)} (\beta , t)}\right) \right\} \end{aligned}$$

which can be written as

$$\begin{aligned} \sum _{j=1}^{\infty }\int _0^{\infty } \left\{ s_j^{(1)}(t) - s_j^{(0)}(t)\cdot \left( \frac{s_j^{(1)}(\beta , t)}{s_j^{(0)} (\beta , t)}\right) \right\} dt, \end{aligned}$$
(16)

where \(s_j^{(r)} (t) = E[{\bar{Y}}_{ij}(t) \rho _i (t) X_i^{\otimes r}]\); \(r=0, 1\). Note that

$$\begin{aligned} s_{j}^{(r)}(t)= & {} E\left[ {{\mathcal {G}}}(t) \cdot P(N_i (t^- ) = j-1 |X_i, Z_i (t) ) \rho _0(t) \exp (\eta X_i + \zeta Z_i (t)) X_i^{\otimes r} \right] \\= & {} {{\mathcal {G}}}(t) \rho _0(t) \cdot E_{X_i, Z_i (t) } \left[ \frac{\exp (-\mu _i (t)) (\mu _i (t))^{j-1}}{(j-1)!} \exp (\eta X_i + \zeta Z_i (t)) X_i^{\otimes r} \right] , \end{aligned}$$

which depends on the event number j and time t even if \(Z_i(t)\) is a time-invariant covariate; the same is true for \(s_{j}^{(r)}(\beta , t)\). As \(Z_i (t)\) is an external covariate we can condition on it and think of \(\mu _i (t) = \int _0^{t} \rho _i (s) ds\) as a mean of \(N_i (t)\) given \((X_i, \{Z_i (s), 0 \le s < t\})\). Since there is no solution in closed-form, one must solve equation \(E[{\tilde{U}}(\beta )] = 0\) numerically for \(\beta ^\ddagger \). The complexity of the asymptotic calculation arises because of the extra conditioning on \(Y_{ij}(t)\) in the partially conditional model. In general, \(\beta ^\ddagger \ne \eta \), even when \(X_i\) and \(Z_i (t) \) are independent. This indicates that omitting the covariate Z(t) in the partially conditional model leads to a biased estimate of the causal treatment effect, even when Z(t) and X are independent. For the partially conditional model one conditions on the cumulative event count at t which is responsive to both treatment and other covariate effects, and hence

$$\begin{aligned} X_i \not \perp Z_i(t) |N_i(t^-) = j-1, ~~ t > 0 . \end{aligned}$$

The phenomenon of induced confounding through this conditioning is well-known in causal inference (Hernán 2010).

The model-based naive variance \({{\mathcal {A}}}^{-1}(\beta ^\dagger )\) will underestimate the variability of \({\hat{\beta }}\) under a misspecified marginal model so robust variance estimation is recommended to ensure valid inference (Lin and Wei 1989; Bernardo and Harrington 2001; Boher and Cook 2006). The explicit forms of the model-based naive and robust sandwich variances in the current setting are given in Appendix 1 and 2 for the marginal and partially conditional models respectively.

3.2 A case-study involving an omitted fixed covariate

Here we consider a case study of the effect of omitting covariates in the marginal and partially conditional models by considering a particular setting in detail. We assume a Poisson process with a Weibull rate function \(\rho (t) = \rho _0 (t) \exp (\eta X + \zeta Z)\) with \(\rho _0 (t) = \lambda \kappa (\lambda t)^{\kappa - 1}\). We let X be a binary treatment indicator with \(P(X=1) = P(X=0) = 0.5\) as before, and let Z be a fixed binary covariate with \(Z \sim \text{ Bin }(1, p_z )\); we let

$$\begin{aligned} \phi = \frac{P(Z=1|X=1)/P(Z=0|X=1)}{P(Z=1|X=0)/P(Z=0|X=0)} \end{aligned}$$

denote the odds ratio characterizing the association between X and Z where \(X \perp Z\) when \(\phi = 1\).

Fig. 1
figure 1

Limiting bias of estimates of treatment effect under the marginal model (left panel) and the partially conditional model (right panel) when omitting the covariate Z; X and Z are dependent binary covariates with odds ratio \(\phi \)

We let \(\beta = \log 0.75\), which reflects the positive effect of treatment and \(\zeta = 0\), \(\log 1.5\) or \(\log 3\) to represent the case of no, moderate or strong effects of Z on event occurrence. We let \(\kappa = 1.25\), and choose \(\lambda \) such that the expected number of observed events at \(t=1\) is 2 when X and Z are equal to 0. Without loss of generality we let the administrative censoring time be \(A = 1\) and we assume the random censoring time \(R_i\) follows an exponential distribution satisfying \(P(R_i < A) = 0.2\); this gives the effective censoring time \(C_i = \text{ min }(R_i, A)\). Under this setting, when we omit variable Z in the Andersen–Gill model, then by (15) the limiting bias of \({\hat{\beta }}\) is

$$\begin{aligned} \beta ^{\dagger } - \eta = \log \left[ \frac{e^{\zeta } P(X=1,Z=1) + P(X=1,Z=0)}{e^{\zeta } P(X=0,Z=1) + P(X=0,Z=0) }\right] \; \end{aligned}$$
(17)
Fig. 2
figure 2

Asymptotic naive and robust standard errors of estimates of treatment effect under the marginal model (left panel) and partially conditional model (right panel) omitting a binary covariate Z as a function of \(P(Z=1)\); \(\phi \) is the odds ratio of (XZ); \(\phi = 2.0\)

which is a function of the effect of Z on the outcome and the extent of the association between Z and X. Figure 1 plots the limiting bias of the treatment effect estimator under the marginal and partially conditional models as a function of the association between Z and X and the effect of Z (i.e. \(\zeta \)). The bias increases as the association between X and Z increases and as the magnitude of \(\zeta \) increases. When X and Z are independent the misspecified marginal model yields consistent estimates of the treatment effect, supporting the use of this method in randomized trials. The partially conditional model, however, yields a biased estimate of treatment effect when an important covariate is omitted even when X and Z are independent. Thus while the partially conditional model appears to be a more general model than the marginal model, it does not support robust causal inferences about treatment effects in randomized trials when recurrent event follows Poisson processes. It is also apparent from (13) and (16) that the limiting values of the marginal and partially conditional estimators are dependent on the administrative censoring time and the distribution of the random censoring time. We found there to be only a weak dependence on the random censoring rate in both frameworks so we do not report the results of these studies here.

The asymptotic naive and robust standard errors under the misspecified marginal model were also studied using (20) and (22) in Appendix 1, and under the misspecified partially conditional model using (23) and (24) in Appendix 2. Figure 2 plots the trend of asymptotic naive and robust standard errors of the treatment effect as a function of \(P(Z=1)\) when \(\phi = 2.0\). The robust standard error is larger than the naive standard error under the marginal model with the differences increasing as the effect of the covariate Z increases as expected. The robust and naive standard errors are in close agreement under the partially conditional model, in part because the extra-Poisson variation arising from the omission of Z is explained by the stratification; Boher and Cook (2006) made a similar observation based on empirical studies. The plots of the asymptotic naive and robust standard errors of the treatment effect estimators have a similar pattern for both the marginal and partially conditional models when \(\phi = 1.0\). Similar calculations were carried out for the setting in which Z|X follows a normal distribution with mean \(\theta _0 + \theta _1 X\) and variance \(\sigma ^2\); the results are shown for marginal models in Online Resource 1.

4 Empirical studies of finite sample behaviour

Here we consider an empirical study to investigate the finite sample properties of estimators of the treatment effect under the misspecified marginal and partially conditional rate-based models. In Sect. 4.1 we consider the events as generated by a Poisson process and in Sect. 4.2 we consider the case where the events are generated according to a Markov model. In both settings we examine the finite sample properties of estimators from marginal and partially conditional rate-based models in which an important covariate is omitted.

4.1 Misspecified rate-based models for Poisson processes

For the setting where events are generated by a Poisson process we use the same illustrative setting as in Sect. 3.2. We let \(\phi = 0.5, 1.0, 2.0\) and 4.0 to reflect varying strengths of the association between X and Z when Z is binary, and let \(P(Z=1) = 0.25\) or 0.50. The effect of Z on the event process is set to be \(\zeta = 0\), \(\log 1.5\) or \(\log 3.0\) to reflect no effect to a strong effect. The other parameter settings are the same as those in Sect. 3.2. We generated one thousand samples of size \(n=1000\) each. We adopt the marginal and partially conditional models with a single covariate reflecting the treatment, and investigate the empirical properties of the estimators under those misspecified models; see Table 1.

Table 1 Empirical frequency of estimates of treatment effect, when omitting covariate Z in the assumed rate function under the marginal and partially conditional rate-based models for the recurrent event following a Poisson process; X and Z are binary correlated with odds ratio \(\phi \); \(n=1000\) and \(nsim=1000\); all numbers for BIAS, ESE, ASE and ECP (\(\times 100\)) in the table

We find that when X and Z are independent there is negligible empirical bias of the estimated treatment effect under the marginal model, supporting theory that marginal model is robust and so yields a consistent estimators of the treatment effect in clinical trials. Furthermore, the average robust standard error is in close agreement with the empirical standard error of the estimates in general, while the average naive standard error underestimates the variability, especially when the effect of covariate Z is larger supporting the the need for robust standard errors. This can also be seen by comparing the empirical coverage probabilities of nominal 95% confidence interval for \({\hat{\beta }}\) based on naive and robust standard errors. Furthermore, when X and Z are independent, unlike the marginal model, the partially conditional model yields biased estimates of the treatment effect; this empirical bias is larger when the effect of Z on the event process increases. This means that the benefit of randomization is lost when we fit partially conditional models without addressing other covariate effects.

When X and Z are not independent, there is significant bias of the estimates for treatment effect under both models, and the bias increases when the association between X and Z is stronger or the effect of the omitted Z on the event process becomes larger. These findings agree with our theoretical results in Sect. 3.2. Note that under the misspecified marginal model, the robust standard errors accurately reflect the empirical variation indicating that they provide protection from the misspecification to some extent. Due to the significantly large bias of the estimates of treatment effect under the misspecified model, the empirical coverage probabilities of the 95% confidence intervals are unacceptably low when X and Z are correlated. We also note that under the misspecified partially conditional model, there is reasonable agreement between the average model-based standard errors and the average robust standard errors; this is in alignment with the theoretical results of Sect. 3.2. The results of additional simulation studies involving normally distributed Z lead to similar conclusions; see Online Resource 2 for results.

4.2 Misspecified rate-based models for Markov processes

We now consider the setting in which the events are generated by a progressive multistate process with states labeled \(0, 1, \ldots \) representing the cumulative number of events and \(\{N_i(t), 0\le t\}\) the event process as before; the state-space diagram is depicted in Figure 3. We assume that \(k \rightarrow k+1\) transitions occur according to a Markov intensity

$$\begin{aligned} \lim _{\Delta t \downarrow 0} \frac{P(N_i (t+\Delta t^{-}) = k+1|N_{i}(t^{-}) = k, X_i, Z_i)}{\Delta t} = q_k (t) \exp (\eta X_i + \zeta Z_i ), \end{aligned}$$

where \(q_k(t)\) is a baseline transition rate, \(k=0, 1, \ldots \). We let \(q_{k+1} (t) = q_k (t)e^{\alpha }\), \(k=0, \ldots , K\) so that the occurrence of each event increases the baseline rate of the next event up until the \((K+1)\)st event and set \(q_{k+1}(t) = q_k (t)\) for \(k=K+1, \ldots , K_m\) so that the risk does not increase beyond the \((K+1)\)st event; data are generated for at most \(K_m\) transition times but this is chosen to be large enough that the probability of entering the absorbing state over the planned period of observation is essentially zero.

Time-homogeneous transition intensities are obtained by letting \(q_0 (t) =q_0\). We let Q denote the \((K_m + 1) \times (K_m + 1)\) transition intensity matrix with \(Q_{jj} = -q_{j-1}\) entries on the diagonal \(Q_{j, j+1} = q_{j-1}\) above the diagonal and \(Q_{jl} = 0\) for \(l \ne j\) or \(j+1\); \(j=1, 2, \ldots , K_m + 1\). The Chapman–Kolmogorov equations then give,

$$\begin{aligned} P(s, s+t|X = 0, Z=0) = \exp (Q t), \end{aligned}$$
(18)

where \(P(s, s+t|X=0, Z=0) = P(0, t|X=0, Z=0)\) and \(P_{jl}(0, t|X=0, Z=0) = P(Z(t) = l |Z(0) = j, X=0, Z=0)\) (Cox and Miller 1965).

Fig. 3
figure 3

Multistate representation of a recurrent event process

As before we consider the case when Z is Bernoulli with \(P(Z=1) = p_z\) and the odds ratio for the association between X and Z is \(\phi \). We let \(\alpha =\log 1.05\) so there is a 5% increase in the risk of an event each time an event occurs up to \(K=5\), and let \(K_m = 20\). We determine \(q_0\) so that \(\mu (1|X=0, Z=0) = 2\) where

$$\begin{aligned} \mu (t|X=0, Z=0) = \sum _{k=0}^{K_m} k \cdot P(Z(t) = k|Z(0)=0, X=0, Z=0) \end{aligned}$$
(19)

is the expected number of events at time t given \(X=Z=0\); the other parameter settings are the same as in Sect. 4.1. We generate one thousand samples of size \(n=1000\) each from this Markov process. The marginal and partially conditional models are fitted with only a treatment indicator, and the empirical properties of the resulting estimators are summarized in Table 2.

Table 2 Empirical frequency of estimates of treatment effect, when omitting covariate Z in the assumed rate function under the marginal and partially conditional rate-based models for the recurrent event following a Markov process; X and Z are binary correlated with odds ratio \(\phi \); \(n=1000\) and \(nsim=1000\); all numbers for BIAS, ESE, ASE and ECP (\(\times 100\)) in the table; \(\alpha = \log (1.05)\)

The partially conditional model is the correct model when \(\zeta = 0\) and yields consistent estimators of treatment effect; see the column of results headed \(\zeta =0\) in Table 2. Although the marginal model ignores the state-dependent transition intensity, statistical inference for the treatment effect remains valid if the robust standard error is used when \(\zeta =0\). When \(\zeta \ne 0\) however, both the marginal and partially conditional models omitting Z are misspecified; here the resulting estimators are biased and the confidence intervals have poor empirical coverage probability. When \(X \perp Z\) and \(\zeta = \log 1.5\), the marginal and partially conditional models yield valid inferences. This does not hold with larger \(\alpha \) or when the covariate Z is normally distributed (see Online Resource 2 for more simulation results). Therefore, our empirical studies suggest that when the true model is Markov, ignoring the important confounders or even independent prognostic variables (i.e. \(X \perp Z\) at \(t=0\)) can yield estimators of treatment effect which are susceptible to misspecification. Whether valid estimates of the treatment effect can be obtained in the clinical trial setting under these two models therefore depends on how large the effect of omitted prognostic variables are as well as their distribution. This can be re-expressed by stating that inferences based on partially conditional rate-based analysis are sensitive to departures from the Markov assumption on which it is formally justified. Model assessment has a particularly useful role here and simulations and sensitivity analyses may be worthwhile to investigate the impact of model violations on the performance of estimators and tests based on marginal or partially conditional models.

5 Application to a trial in cystic fibrosis

Cystic fibrosis is a respiratory disease with airway obstruction caused by the accumulation of mucus in the lungs due to extracellular DNA; this results in recurrent pulmonary exacerbations. When delivered to the lungs in an aerosolized form, a highly purified recombinant form of DNase I called rhDNase cuts extracellular DNA, reducing the viscoelasticity of airway secretions and improving clearance. In a randomized double-blind trial 321 individuals were assigned to receive rhDNase and 324 we assigned to a placebo treatment (Fuchs et al. 1994). The primary purpose of this study was to investigate the effect of rhDNase on the suppression of exacerbations so to this end the onset times of exacerbations were recorded over the study period of approximately 169 days. In the control arm 139 individuals had at least one exacerbation, 42 had at least two exacerbations, and 18 had at least three exacerbations; in the rhDNase arm these numbers were 104, 39 and 9 respectively. The baseline forced expiratory volume (FEV) is a measure of lung function known to be highly associated with the onset of exacerbations; it was centered in the analyses that follow by subtracting the mean value and we denote it by FEVC. The data are available at the website for Cook and Lawless (2007).

Table 3 Estimates of treatment effect for cystic fibrosis trial using marginal and partially conditional models with four strata based on no events, 1 event, 2 events and \(\ge 3\) events when ignoring or controlling for the centered forced expiratory volume (FEVC)

We fit the marginal and partially conditional models with the treatment indicator alone, and when controlling for the baseline FEVC. Since only a few individuals experienced more than 3 events, four time-dependent strata were defined based on no events (\(N_i(t^-)=0\)), 1 event (\(N_i(t^-)=1\)), 2 events (\(N_i(t^-)=2\)), and \(\ge 3\) events (\(N_i(t^-) \ge 3\)). The results summarized in Table 3 reveal that the estimates and conclusions are comparable across the four analyses, but we make comments here related to the findings of the theory and empirical studies of Sections 3 and 4. First there is very close agreement between the estimates of treatment effect from the marginal analysis whether FEVC is controlled for or not—this is to be expected based on the results in Sect. 3.2. There is a slightly smaller standard error for the coefficient in the adjusted analysis as FEVC explains some of the variation in the event risk across individuals. The estimate of the treatment effect is smaller from the partially conditional (stratified) analyses decreasing from \(-0.271\) to \(-0.234\) in the models not adjusting for FEVC for example. This reduction in the estimated treatment effect is accompanied by a reduction in the robust standard error in the partially conditional analysis, and as a result the p values are virtually identical at 0.029 and 0.030 for the Wald tests. Similar findings are observed when controlling for FEVC.

For completeness we plot the semiparametric estimates of the cumulative baseline rates under the marginal (top row) and partially conditional baseline rates (bottom row) in Figure 4. While the plots are provided for each treatment group they are obtained from one fitted model for each analysis. The effect of treatment is evident graphically from the lower slope of the estimate in the rhDNase arm (top row). Moreover the estimates of the cumulative stratified baseline transition rates show the increased risk of event occurrence with each event; this is inferred by the progressively steeper estimates reflecting higher risks at any time.

Fig. 4
figure 4

Estimated cumulative marginal rate functions (top row) and cumulative stratified rate functions (bottom row) for the cystic fibrosis trial

Motivated by the suggestions we provide in Sect. 4.2, we carry out a small simulation study to mimic the cystic fibrosis data and investigate the behaviour of the estimates under the marginal and partially conditional models. Based on Figure 4, we assume the recurrent event follows a Markov process as we specified in Sect. 4.2 with \(K=2\) and \(K_m = 20\). We fit the model \(\lambda _i(t) = q_0 \exp (\eta X_i + \zeta Z_i + \alpha N_i(t^-))\) and obtain the estimates \({\hat{\eta }} = -0.228\), \({\hat{\zeta }}=-0.015\) and \({\hat{\alpha }}=0.343\). The Nelson–Aalen estimates of the cumulative baseline intensities could be obtained with the slopes providing a way of selecting \(q_0\); here we take \(q_0=0.0032\). The centered baseline forced expiratory volume approximately follows a normal distribution with mean 0 and standard deviation 26. In this simulation study we took the covariate Z to follow a similar distribution as FEVC, which is normally distributed with mean 0 and standard deviation \(\sigma _z = 20\), 26, or 30; we study settings with slightly lower and slightly higher variability. We let the effect of FEVC on the event rate be \(\zeta = -0.50\), -0.10, -0.01, 0.00, and 0.20. Using these values, we could generate the event times for \(n=645\) individuals. The empirical frequency of estimates under the marginal and partially conditional models with only the treatment indicator are summarized in Table 4. We note that when there is no effect of Z on the event process, the partially conditional model with only treatment indicator is the correct model and hence leads to consistent estimation of the treatment effect. Although the marginal model ignores the state-dependent transition intensity, statistical inference for the treatment effect is still valid when robust variance estimates are used. When \(\zeta \ne 0\), both the marginal and partially conditional models omitting Z result in biased estimates and the confidence intervals have poor empirical coverage probability.

Table 4 Empirical frequency of estimate for treatment effect, when omitting covariate Z in the assumed rate function under the marginal and partially conditional models for the recurrent event following a Markov process; Z is normally distributed with mean 0 and standard deviation \(\sigma _z\), and X and Z are independent; \(n=645\) and \(nsim=1000\); all numbers for BIAS, ESE, ASE and ECP (\(\times 100\)) in the table

6 Discussion

Marginal and partially conditional semiparametric models have received considerable attention in recent years as methods for assessing the effect of therapeutic interventions on the basis of recurrent events. The marginal rate-based model is viewed as offering a robust approach to assessing treatment effects but it is susceptible to the effects of model misspecification; while we have demonstrated this when the true event generating process is Markov, this arises whenever the basic multiplicative assumption of covariate effects is not satisfied. While the partially conditional model represents a generalization of the marginal model through the introduction of time-dependent strata, the strata are defined based on the cumulative number of events which is responsive to treatment and other risk factors which also having effect on the outcome. Conditioning on time-dependent variables which are realized post-randomization and potentially responsive to treatment has been known to be problematic for some time (Kalbfleisch and Prentice 2002). Hernán (2010) points out that analyses based on Cox regression models incorporate such conditioning implicitly through the comparison of covariate distributions among those individuals who are uncensored and event-free at each failure time post-randomization; see also Aalen et al. (2015). Here we investigate in detail the implications of conditioning on the cumulative number of events in a partially conditional model for recurrent event analyses. The findings mean that the full marginal model should be used in randomized trials since, as demonstrated here, it can yield an estimate of treatment effect with a simple causal interpretation. Careful examination of the multiplicative assumption is warranted however to ensure the assumption is reasonable.