Bivariate pseudo-observations for recurrent event analysis with terminal events

Abstract

The analysis of recurrent events in the presence of terminal events requires special attention. Several approaches have been suggested for such analyses either using intensity models or marginal models. When analysing treatment effects on recurrent events in controlled trials, special attention should be paid to competing deaths and their impact on interpretation. This paper proposes a method that formulates a marginal model for recurrent events and terminal events simultaneously. Estimation is based on pseudo-observations for both the expected number of events and survival probabilities. Various relevant hypothesis tests in the framework are explored. Theoretical derivations and simulation studies are conducted to investigate the behaviour of the method. The method is applied to two real data examples. The bivariate marginal pseudo-observation model carries the strength of a two-dimensional modelling procedure and performs well in comparison with available models. Finally, an extension to a three-dimensional model, which decomposes the terminal event per death cause, is proposed and exemplified.

Availability of data and materials

The bladder cancer data set is available through the R package survival. Regarding the LEADER data (Marso et al. 2016): De-identified individual participant data, study protocol and redacted Clinical Study Report will be available according to Novo Nordisk data sharing commitments. The data will be made available permanently after research completion and approval of product and product use in both EU and US. Data will be shared with bona fide researchers submitting a research proposal requesting access to data and for use as approved by the Independent Review Board according to the IRB Charter (see novonordisk-trials.com). Access request proposal form and the access criteria can be found at novonordisk-trials.com. The data will be made available on a specialised SAS data platform.

Code availability

R code for the bivariate marginal pseudo-observation model is available upon request to the corresponding author.

References

  • Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10(4):1100–1120

  • Andersen PK, Perme MP (2010) Pseudo-observations in survival analysis. Stat Methods Med Res 19:71–99

  • Andersen PK, Borgan Ø, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer series in statistics. Springer

  • Andersen PK, Klein JP, Rosthøj S (2003) Generalised linear models for correlated pseudo-observations, with applications to multi-state models. Biometrika 90:15–27

  • Andersen PK, Angst J, Ravn H (2019) Modeling marginal features in studies of recurrent events in the presence of a terminal event. Lifetime Data Anal 25(4):681–695

  • Binder N, Gerds TA, Andersen PK (2014) Pseudo-observations for competing risks with covariate dependent censoring. Lifetime Data Anal 20(2):303–315

  • Byar D (1980) The veterans administration study of chemoprophylaxis for recurrent stage I bladder tumours: comparisons of placebo, pyridoxine and topical thiotepa. Springer

  • Cook R, Lawless JF (1997) Marginal analysis of recurrent events and a terminating event. Stat Med 16:911–924

  • Cox DR (1972) Regression models and life-tables. J R Stat Soc Ser B (Methodol) 34(2):187–220

  • Ghosh D, Lin D (2000) Nonparametric analysis of recurrent events and death. Biometrics 56:554–562

  • Ghosh D, Lin D (2002) Marginal regression models for recurrent and terminal events. Stat Sin 12:663–688

  • Graw F, Gerds TA, Schumacher M (2009) On pseudo-values for regression analysis in competing risks models. Lifetime Data Anal 15:241–255

  • Jacobsen M, Martinussen T (2016) A note on the large sample properties of estimators based on generalized linear models for correlated pseudo-observations. Scand J Stat 43:845–862

  • Liang KY, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73(1):13–22

  • Lin D, Wei L, Yang I, Ying Z (2000) Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc Ser B 62(4):711–730

  • Liu L, Wolfe RA, Huang X (2004) Shared frailty models for recurrent events and a terminal event. Biometrics 60:747–756

  • Marso SP, Daniels GH, Brown-Frandsen K, Kristensen P, Mann JFE, Nauck MA, Nissen SE, Pocock S, Poulter NR, Ravn LS, Steinberg WM, Stockner M et al (2016) Liraglutide and cardiovascular outcomes in type 2 diabetes. N Engl J Med 375(4):311–322

  • Overgaard M (2019) Counting processes in p-variation with application to recurrent events. https://arxiv.org/pdf/1903.04296.pdf

  • Overgaard M, Parner ET, Pedersen J (2017) Asymptotic theory of generalized estimating equations based on jack-knife pseudo-observations. Ann Stat 45(5):1988–2015

  • Pavlič K, Martinussen T, Andersen PK (2019) Goodness of fit tests for estimating equations based on pseudo-observations. Lifetime Data Anal 25:189–205

  • Wei LJ, Lin DY, Weissfeld L (1989) Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc 84:1065–1073

Funding

This research was carried out as part of xxx’s PhD education, for which she received funding from Novo Nordisk A/S and Innovation Fund Denmark. xxx is supported by the Novo Nordisk Foundation Grant NNF17OC0028276.

Author information

Corresponding author

Correspondence to Julie K. Furberg.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Appendix

Theoretical details on bivariate normality of \((\hat{\beta }, \hat{\gamma })\)

According to Overgaard et al. (2017), the pseudo-observation approach of this paper produces consistent and asymptotically normal parameter estimates under essentially two conditions. The first condition is that the estimate \(\hat{\theta }\) of \(\theta = E(f(W))\) can be seen as a functional, \(\phi \), of the empirical distribution, \(F_n\), in a Banach space setting such that \(\phi \) is twice (Fréchet) differentiable with a Lipschitz continuous second order derivative and such that \(\Vert F_n\Vert \) converges at a certain rate. This condition ensures that the close approximation of the pseudo-observation, \(\hat{\theta }_i = \theta + \dot{\theta }(X_i) + \frac{1}{n-1}\sum _{j \ne i} \ddot{\theta }(X_i, X_j) + o_P(n^{-\frac{1}{2}})\) (uniformly in i), in terms of the estimator’s first and second order influence functions, \(\dot{\theta }\) and \(\ddot{\theta }\), holds. This, in turn, implies that the coarser approximation \(\hat{\theta }_i = \theta + \dot{\theta }(X_i) + o_P(1)\) also holds. The second condition is therefore that \(E(\dot{\theta }(X) \mid Z) = E(f(W) \mid Z) - \theta \), which means that the pseudo-observations carry the right information and ensures that the estimating equation is unbiased under the model. The result of Overgaard et al. (2017) is formulated for one-dimensional pseudo-observations but generalizes to multi-dimensional outcomes, in which case the requirements need to hold for each outcome separately.
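
As a heuristic illustration of why the second condition yields an unbiased estimating equation, the first order approximation can be combined with it as follows. This is only a sketch: it ignores the \(o_P(1)\) remainder and assumes that the regression model is correctly specified, i.e. \(m(\xi ; Z_i) = E(f(W_i) \mid Z_i)\), with \(m\) as in the variance expressions of this appendix.

$$\begin{aligned} E(\hat{\theta }_i \mid Z_i) \approx \theta + E(\dot{\theta }(X_i) \mid Z_i) = \theta + E(f(W_i) \mid Z_i) - \theta = E(f(W_i) \mid Z_i) = m(\xi ; Z_i). \end{aligned}$$

Hence the residual \(\hat{\theta }_i - m(\xi ; Z_i)\) has conditional mean approximately zero given \(Z_i\), which is what a generalized estimating equation requires of its outcome.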

For pseudo-observations of the Kaplan–Meier estimate \(\hat{S}(t_l)\), the conditions above hold under the assumptions of positivity, i.e. \(P(C> t_l) > 0\), and completely independent censoring, i.e. that C is independent of \((D^*, Z)\), as described by Overgaard et al. (2017) based on the work of Graw et al. (2009) and Jacobsen and Martinussen (2016). For pseudo-observations of \(\hat{\mu }(t_l)\), the conditions were established by Overgaard (2019, Example 8) under similar assumptions of positivity and completely independent censoring, here meaning that C is independent of \((N^*, D^*, Z)\), and under the additional assumption that \(N^*(t_l)\) has a little more than a finite fourth moment.
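
To make the two kinds of pseudo-observations concrete, the following is a minimal Python sketch and not the authors' R code. It assumes that the pseudo-observations are the usual leave-one-out jackknife quantities \(n\hat{\theta } - (n-1)\hat{\theta }^{(-i)}\) (as in Andersen et al. 2003), that \(\hat{S}\) is the Kaplan–Meier estimate, and that \(\hat{\mu }\) is the nonparametric estimate of the Ghosh and Lin (2000) type, which weights each increment of the recurrent event process by the survival probability just before it. The function names and the data layout (one tuple per subject) are hypothetical and chosen for illustration only.

```python
def mean_and_survival(subjects, t):
    """Return (mu_hat(t), S_hat(t)) for one sample.

    Each subject is a tuple (follow_up, died, event_times):
      follow_up   -- time of death or censoring, whichever comes first
      died        -- True if follow-up ended with the terminal event
      event_times -- recurrent event times, assumed to be <= follow_up
    """
    death_times = {x for x, died, _ in subjects if died and x <= t}
    recurrent_times = {u for _, _, ev in subjects for u in ev if u <= t}
    surv = 1.0  # Kaplan-Meier estimate just before the current time point
    mu = 0.0    # estimated expected number of recurrent events so far
    for u in sorted(death_times | recurrent_times):
        at_risk = sum(1 for x, _, _ in subjects if x >= u)
        n_events = sum(ev.count(u) for _, _, ev in subjects)
        n_deaths = sum(1 for x, died, _ in subjects if died and x == u)
        # Recurrent events at u are weighted by S_hat(u-), so mu is
        # updated before the survival estimate.
        mu += surv * n_events / at_risk
        surv *= 1.0 - n_deaths / at_risk
    return mu, surv


def bivariate_pseudo_observations(subjects, t):
    """Leave-one-out jackknife pseudo-observations of (mu_hat(t), S_hat(t))."""
    n = len(subjects)
    mu_all, s_all = mean_and_survival(subjects, t)
    pseudo = []
    for i in range(n):
        leave_one_out = subjects[:i] + subjects[i + 1:]
        mu_i, s_i = mean_and_survival(leave_one_out, t)
        pseudo.append((n * mu_all - (n - 1) * mu_i,
                       n * s_all - (n - 1) * s_i))
    return pseudo


if __name__ == "__main__":
    # Toy data (made up): nobody is censored before t = 2.
    toy = [
        (1.0, True, [0.5]),
        (3.0, True, [0.7, 1.5, 2.5]),
        (4.0, False, [1.2]),
        (5.0, False, []),
    ]
    for mu_i, s_i in bivariate_pseudo_observations(toy, 2.0):
        print(round(mu_i, 6), round(s_i, 6))
```

In the toy data no subject is censored before \(t = 2\), so each pseudo-observation pair reduces, up to floating point error, to the subject's own observed count of events by time 2 and the indicator of being alive at time 2. With censoring, the pseudo-observations are no longer restricted to these values, which is the situation the conditions above address.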

The result of Overgaard et al. (2017) is that, under regularity conditions, estimates, \(\hat{\xi } = \hat{\xi }_n\), exist that solve (4) with high probability for large n such that

$$\begin{aligned} \sqrt{n}(\hat{\xi }_n - \xi ) \end{aligned}$$

is asymptotically normal with mean 0 and variance

$$\begin{aligned} M^{-1} \Psi M^{-1}, \end{aligned}$$

where

$$\begin{aligned} M = E\left( \left( \frac{\partial m_i}{\partial \xi }\right) ^T V_i^{-1} \frac{\partial m_i}{\partial \xi } \right) \end{aligned}$$

and

$$\begin{aligned} \Psi = {\text {Var}}\left( \left( \frac{\partial m_i}{\partial \xi }\right) ^T V_i^{-1} (\theta + \dot{\theta }(X_i) - m(\xi ; Z_i)) + h(X_i)\right) \end{aligned}$$

with

$$\begin{aligned} h(x) = E\left( \left( \frac{\partial m_i}{\partial \xi }\right) ^T V_i^{-1} \ddot{\theta }(x, X_i) \right) . \end{aligned}$$

In summary, the suggested pseudo-observation approach produces consistent and asymptotically normal parameter estimates under the assumptions

  1. positivity, \(P(C> t_k) > 0\),
  2. completely independent censoring, i.e. C is independent of \((N^*, D^*, Z)\),
  3. a little more than finite fourth moment of \(N^*(t_k)\).

It is worth noting that the suggested estimate of \(\Psi \) can be expected to consistently estimate \({\text {Var}}\Big (\big (\frac{\partial m_i}{\partial \xi }\big )^T V_i^{-1} (\theta + \dot{\theta }(X_i) - m(\xi ; Z_i))\Big )\) but not \({\text {Var}}\Big (\big (\frac{\partial m_i}{\partial \xi }\big )^T V_i^{-1} (\theta + \dot{\theta }(X_i) - m(\xi ; Z_i)) + h(X_i)\Big )\). In other words, any contribution from the second order term h is not included, so the estimate of \(\Psi \), and thereby the standard errors from the sandwich variance estimator, can be expected to be biased.
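
For orientation, the usual plug-in form of a sandwich estimator for generalized estimating equations can be sketched as follows. That the estimate suggested in the main text has this form is an assumption, since the main text is not reproduced here; \(\hat{U}_i\), \(\hat{M}\) and \(\hat{\Psi }\) are notation introduced for this sketch only, with all derivatives evaluated at \(\hat{\xi }\).

$$\begin{aligned} \hat{U}_i&= \left( \frac{\partial m_i}{\partial \xi }\right) ^T V_i^{-1} (\hat{\theta }_i - m(\hat{\xi }; Z_i)), \\ \hat{M}&= \frac{1}{n}\sum _{i=1}^n \left( \frac{\partial m_i}{\partial \xi }\right) ^T V_i^{-1} \frac{\partial m_i}{\partial \xi }, \qquad \hat{\Psi } = \frac{1}{n}\sum _{i=1}^n \hat{U}_i \hat{U}_i^T, \\ \widehat{{\text {Var}}}(\hat{\xi })&= \frac{1}{n} \hat{M}^{-1} \hat{\Psi } \hat{M}^{-1}. \end{aligned}$$

Because \(\hat{\theta }_i \approx \theta + \dot{\theta }(X_i)\) to first order, an estimate of this form treats \(\hat{U}_i\) as the full contribution of subject i and contains no term corresponding to \(h(X_i)\).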

Plots from simulation of bivariate normality of \((\hat{\beta }, \hat{\gamma })\)

This appendix displays additional plots visualizing the bivariate normal distribution of \((\hat{\beta }, \hat{\gamma })\) for different parameter settings and different numbers of time points k.

\((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, 0.2, 1)\) and \(t=2\)

See Appendix Fig. 8.

Fig. 8

Plots investigating the normality assumption of \((\hat{\beta }, \hat{\gamma })\) using 1000 simulated data sets and \((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, 0.2, 1)\). The pseudo-observations are computed based on \(k=1\) with \(t=2\).

\((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, -0.2, 1)\) and \(t=2\)

See Appendix Fig. 9.

Fig. 9

Plots investigating the normality assumption of \((\hat{\beta }, \hat{\gamma })\) using 1000 simulated data sets and \((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, -0.2, 1)\). The pseudo-observations are computed based on \(k=1\) with \(t=2\).

\((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, 0.2, 0.75)\) and \(t=(1,2,3)\)

See Appendix Fig. 10.

Fig. 10

Plots investigating the normality assumption of \((\hat{\beta }, \hat{\gamma })\) using 1000 simulated data sets and \((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, 0.2, 0.75)\). The pseudo-observations are computed based on \(k=3\) with \(t=(1,2,3)\).

\((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, 0.2, 1)\) and \(t=(1,2,3)\)

See Appendix Fig. 11.

Fig. 11

Plots investigating the normality assumption of \((\hat{\beta }, \hat{\gamma })\) using 1000 simulated data sets and \((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, 0.2, 1)\). The pseudo-observations are computed based on \(k=3\) with \(t=(1,2,3)\).

\((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, -0.2, 1)\) and \(t=(1,2,3)\)

See Appendix Fig. 12.

Fig. 12

Plots investigating the normality assumption of \((\hat{\beta }, \hat{\gamma })\) using 1000 simulated data sets and \((n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, -0.2, 1)\). The pseudo-observations are computed based on \(k=3\) with \(t=(1,2,3)\).

About this article

Cite this article

Furberg, J.K., Andersen, P.K., Korn, S. et al. Bivariate pseudo-observations for recurrent event analysis with terminal events. Lifetime Data Anal (2021). https://doi.org/10.1007/s10985-021-09533-5

  • DOI: https://doi.org/10.1007/s10985-021-09533-5

Keywords

  • Recurrent events
  • Terminal events
  • Pseudo-observations
  • Simultaneous model
  • Multi-state model