1 Introduction

In biomedical studies with survival endpoints, the eventual goal is to model the time until occurrence of an event of interest, say death. In some situations, the subjects under study can experience one of several so-called ‘terminal’ events, where the occurrence of one event precludes the subsequent occurrence of any other (Haneuse and Lee 2016). This is the classic ‘competing risks’ framework (Lau et al. 2009), where the time to failure and the cause of failure/censoring indicator are recorded for each observation. However, in many cancer clinical trials, an intermediate non-terminal event, such as tumor recurrence, is often the event of interest. This non-terminal recurrence may not prevent the subject from death (the terminal event), with the terminal event dependently censoring the non-terminal event, but not vice versa. This is the semi-competing risks scenario (Fine et al. 2001).

Consider our motivating dataset generated from a randomized, multi-center, concurrently controlled clinical trial (Moertel et al. 1990, 1995) to determine the effectiveness of two adjuvant therapy regimens (‘levamisole only’, and ‘levamisole plus fluorouracil’) in improving surgical cure rates for stage III colon cancer. Clinically, a proper evaluation of the survival distribution of patients experiencing intermediate tumor recurrence following a specific treatment regimen post resection is necessary for an informed prediction of the risk of recurrence for other patients, and subsequent adaptive treatment strategies. Yet, this is challenging, given the strong positive correlation (see Fig. 1, left panel) between \(T_1\) (time to tumor recurrence) and \(T_2\) (time to death), with \(T_1 \le T_2\) (data observed only in the upper wedge), which renders the typical independent censoring assumptions in survival analysis (Oakes 1993) and competing-risks (consisting of only terminal events) invalid.

Fig. 1 Correlation plot of \(T_1\) and \(T_2\) for the colon cancer data (left panel). Illness-death model for data with semi-competing risks structure (right panel)

There are two main approaches to analyzing semi-competing risks data. The first approach considers a joint distribution for \((T_1,T_2)\) specified via a copula model in the upper wedge. Within the copula framework, Fine et al. (2001) considered a Clayton copula (Clayton 1978) with two margins, Wang (2003) considered a more general copula setting, while Lakhal et al. (2008) considered an Archimedean copula to estimate the dependency parameter. Related formulations with time-varying effects of a treatment on the marginal distribution of a non-terminal event were considered in Peng and Fine (2007) and Hsieh and Huang (2012). In addition to semiparametric transformation models (Chen 2012), more recent approaches (Zhou et al. 2016) tackled dependent censoring by first selecting a suitable copula model through an exploratory diagnostic approach (Bandeen-Roche et al. 2005), and then developing an inference procedure to simultaneously estimate the marginal survival function of cancer recurrence and an association parameter in the copula model. Recently, Peng et al. (2018) considered a semiparametric extension to handle clustered semi-competing risks data. This latent variable formulation via a joint (copula) structure leads to a hypothetical interpretation of the marginal distribution of the non-terminal event, and also complicates covariance analysis (Xu et al. 2010).

The second approach casts the semi-competing risks problem into the classic illness-death (see Fig. 1, right panel) compartment framework (Haneuse and Lee 2016), which is a special case of multi-state models (Andersen and Keiding 2002). Semi-competing risks fit naturally into the well-established illness-death paradigm (Xu et al. 2010), where a patient can transit to the terminal event either directly or via the non-terminal event, with the model completely specified via the transition intensity functions for the three distinct transitions. Here, the dependency between \(T_1\) and \(T_2\) is introduced via a shared frailty, or random effects term. While Xu et al. (2010) considered a Gamma frailty, Jiang and Haneuse (2017) extended this to a class of transformation frailty models, allowing a wider range of possible frailty distributions. Within this illness-death framework, there exist other classical (Do Ha et al. 2020; Kim et al. 2019; Lee et al. 2021) and Bayesian (Han et al. 2014; Lee et al. 2015, 2016) frailty-based formulations of semi-competing risks that handle various data complications, such as interval-censoring, intermediate missingness, and bias reduction.

In this paper, we consider the power variance function (PVF, Tweedie 1984) as our choice of the frailty, under the illness-death formulation. The PVF is easily tractable (Wienke 2010) due to the closed-form expressions of the marginal survival function, and contains the gamma, inverse Gaussian and positive stable densities as special cases. Choice of the frailty density is crucial in modeling the dependence structure, and a misspecified frailty may lead to biased results (Kiche et al. 2019). Although gamma and other frailties have been extensively considered in modeling semi-competing risks, working with a frailty family is attractive from a generalization perspective, since it allows us to ascertain whether the fit of the sub-models (that are members of the family) is satisfactory.

The remainder of this article proceeds as follows. After a brief introduction to the PVF density, Sect. 2 presents the general illness-death model, with the corresponding marginal transition rates and joint marginal survival function, leading to its formulation using the PVF frailty. Section 3 develops inference via maximum likelihood, incorporating covariates. In Sect. 4, we apply our model to the colon cancer data. The finite sample performance of the estimators of the model parameters is evaluated via a small simulation study in Sect. 5. Finally, Sect. 6 concludes with a discussion. The proof of the theorem presented, as well as detailed derivations of various important results and the likelihood construction, appear in the accompanying Supplementary Material.

2 Statistical Model

2.1 PVF Density

The PVF density, suggested by Tweedie (1984) and derived independently by Hougaard (1986), is a three-parameter family with parameters \(\mu >0\), \(\sigma >0\), and \(0 < \gamma \le 1\), with the probability density function g(z) given by:

$$\begin{aligned} g(z)= e^{-\sigma (1-\gamma )(\frac{z}{\mu }-\frac{1}{\gamma })}\frac{1}{\pi }\sum _{k=1}^{\infty }(-1)^{k+1}\frac{(\sigma (1-\gamma ))^{k(1-\gamma )}\mu ^{k\gamma }}{\gamma ^k}\frac{\Gamma (k\gamma +1)}{k!}z^{-k\gamma -1}\sin (k\gamma \pi ), \end{aligned}$$
(1)

The expectation \(\mathbb {E}[Z]=\mu\) and variance Var\([Z]=\mu ^2/\sigma\) can be derived from the Laplace transform of the PVF density (Aalen et al. 1992) given by

$$\begin{aligned} {\mathcal {L}}_Z(s)=\mathbb {E}[e^{-sZ}]=e^{\frac{\sigma (1-\gamma )}{\gamma }\left[ 1-\left( 1+\frac{\mu s}{\sigma (1-\gamma )}\right) ^\gamma \right] } \end{aligned}$$
(2)

The PVF density reduces to the gamma (\(\gamma \rightarrow 0\)) and inverse Gaussian (\(\gamma =0.5\)) distributions, respectively, while the stable density is a particular case of the PVF under some asymptotic considerations (Wienke 2010). Furthermore, the compound Poisson (cP) distribution (Aalen 1988), which can be constructed as the sum of a Poisson-distributed number of independent and identically distributed (iid) gamma random variables, can be shown to have the same Laplace transform as the PVF model under certain parameterization (Wienke 2010), except for the range of \(\gamma\), which can be negative in the cP model. Consequently, the density function coincides with the respective function in the PVF model.
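As a quick numerical check of the moments quoted above, the following minimal sketch evaluates the Laplace transform in (2), using the usual convention \({\mathcal {L}}_Z(s)=\mathbb {E}[e^{-sZ}]\), and recovers the mean and variance by finite differences at \(s=0\). The function names and the step size `h` are illustrative choices, not part of the paper; the branch for \(\gamma =0\) uses the limiting gamma form \((1+\mu s/\sigma )^{-\sigma }\).

```python
import math

def pvf_laplace(s, mu=1.0, sigma=1.0, gamma=0.5):
    """Laplace transform E[exp(-s Z)] of the PVF family, Eq. (2)."""
    if gamma == 0.0:
        # Limiting gamma-frailty case (gamma -> 0).
        return (1.0 + mu * s / sigma) ** (-sigma)
    c = sigma * (1.0 - gamma)
    return math.exp((c / gamma) * (1.0 - (1.0 + mu * s / c) ** gamma))

def pvf_moments(mu, sigma, gamma, h=1e-4):
    """Mean and variance recovered by central differences of L at s = 0."""
    l0 = pvf_laplace(0.0, mu, sigma, gamma)
    lp = pvf_laplace(h, mu, sigma, gamma)
    lm = pvf_laplace(-h, mu, sigma, gamma)
    mean = -(lp - lm) / (2.0 * h)            # E[Z] = -L'(0)
    second = (lp - 2.0 * l0 + lm) / h ** 2   # E[Z^2] = L''(0)
    return mean, second - mean ** 2

# Hypothetical parameter values: mu = 1, sigma = 2, gamma = 0.5 (inverse Gaussian).
mean, var = pvf_moments(1.0, 2.0, 0.5)
```

With these illustrative values the recovered mean and variance should be close to \(\mu =1\) and \(\mu ^2/\sigma =0.5\), in agreement with the expressions stated before Eq. (2).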

2.2 The Illness-Death Framework

Let n denote the number of subjects in our study. Let \(T_{j1}\) and \(T_{j2}\) be the times to the non-terminal and terminal events, respectively, and \(C_j\) the (right) censoring time for the jth subject, \(j=1,\ldots ,n\). If the subject dies before the occurrence of the non-terminal event, we define \(T_{j1}=\infty\). This specification (Xu et al. 2010) avoids a latent distribution of \((T_{j1}, T_{j2})\) over the region \(t_1 >t_2\). Consider also the p-dimensional covariate vector \(\mathbf {x_j}=(x_{j1}, x_{j2},\ldots , x_{jp})^{\prime }\) observed for the jth subject. We also assume that the censoring time \(C_j\) is independent of \((T_{j1}, T_{j2})\), given \(\mathbf {x_j}\). Under this setup, the observed data are denoted by \(D=(Y_{j1},Y_{j2},\delta _{j1},\delta _{j2}, {\textbf{x}}_j)\), \(j=1,\ldots ,n\), where \(Y_{j1}=\min \{T_{j1},Y_{j2}\}\), \(\delta _{j1}=I_{\{T_{j1}\le Y_{j2}\}}\), \(Y_{j2}=\min \{T_{j2},C_j\}\), and \(\delta _{j2}=I_{\{T_{j2}\le C_j\}}\). Also, when \(\delta _{j1} = 0\) and \(\delta _{j2} = 1\), we have \(Y_{j1} = Y_{j2} = T_{j2}\) and \(T_{j1} = \infty\). Since \(0 \le Y_{j1} \le Y_{j2}\), the observations lie on the upper wedge of the two-dimensional graph of \((T_{j1}, T_{j2})\). For the joint probability model of \((T_1, T_2)\), we consider an absolutely continuous density \(f(t_1, t_2), 0 \le t_1 \le t_2\) (Fig. 1, left panel). Thus,

$$\begin{aligned} \int _{0}^{\infty }\int _{t_1}^{\infty }f(t_1,t_2)dt_2dt_1=\mathbb {P}[T_1<\infty ]\le 1, \end{aligned}$$

with the balance of the probability distributed along the line at \(t_1=\infty\) with continuous density \(f_\infty (t_2)\), \(t_2>0\) (Xu et al. 2010).
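The mapping from the latent times to the observed data can be made concrete with a small sketch. The function name `observe` is hypothetical; it simply encodes the definitions of \(Y_{j1}\), \(Y_{j2}\), \(\delta _{j1}\) and \(\delta _{j2}\) given above, with \(T_{j1}=\infty\) represented by `math.inf`.

```python
import math

def observe(t1, t2, c):
    """Map latent (T1, T2, C) to observed (Y1, Y2, delta1, delta2)."""
    y2 = min(t2, c)                # time to death or censoring
    d2 = 1 if t2 <= c else 0       # death observed?
    y1 = min(t1, y2)               # recurrence, or the end of follow-up
    d1 = 1 if t1 <= y2 else 0      # recurrence observed?
    return y1, y2, d1, d2

# The four possible observation patterns (times are illustrative).
cases = [
    observe(2.0, 5.0, 10.0),        # recurrence, then death
    observe(math.inf, 3.0, 10.0),   # death without recurrence
    observe(2.0, 8.0, 4.0),         # recurrence, then censored
    observe(6.0, 8.0, 4.0),         # censored before any event
]
```

In every case the returned pair satisfies \(Y_{j1}\le Y_{j2}\), which is the upper-wedge restriction described in the text.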

Now, under the illness-death specification, a subject in the initial state, or state 0 (‘full resection of the tumor’), can transit directly to state 2 (‘death’ state), or first to state 1 (‘tumor recurrence’ state), and then to state 2. This is described in Fig. 1 (right panel). This model is completely specified by the transition or hazard functions for the three distinct transitions: a cause-specific hazard for the illness (tumor) recurrence, \(\lambda _1(t_1)\); for death, \(\lambda _2(t_2)\); and for death conditional on the time to illness recurrence, \(\lambda _{12}(t_2|t_1)\), \(0<t_1<t_2\). These transition rates are defined as:

$$\begin{aligned} \lambda _1(t_1)&=\lim _{\Delta \rightarrow 0}\frac{\mathbb {P}[T_1\in [t_1,t_1+\Delta )|T_1\ge t_1,T_2\ge t_1]}{\Delta }, \ \ t_1>0 \nonumber \\ \lambda _2(t_2)&=\lim _{\Delta \rightarrow 0}\frac{\mathbb {P}[T_2\in [t_2,t_2+\Delta )|T_1\ge t_2,T_2\ge t_2]}{\Delta }, \ \ t_2>0 \nonumber \\ \lambda _{12}(t_2|t_1)&=\lim _{\Delta \rightarrow 0}\frac{\mathbb {P}[T_2\in [t_2,t_2+\Delta )|T_1= t_1,T_2\ge t_2]}{\Delta }, \ \ 0<t_1<t_2. \end{aligned}$$
(3)

Note, in general, the rate of the terminal event following the occurrence of the non-terminal event at time \(T_1=t_1\), \(\lambda _{12}(t_2|t_1)\), can depend on both \(t_1\) and \(t_2\). However, under Markov assumptions, \(\lambda _{12}(t_2|t_1)\) depends only on \(t_2\), i.e., \(\lambda _{12}(t_2|t_1)=\lambda _{12}(t_2)\). The Markov model is most frequently used because of its simplicity (Meira-Machado et al. 2009). Intuitively, tumor recurrence can occur before the terminal event, but not vice versa, and the occurrence of the non-terminal event can influence the occurrence of the terminal event, i.e., a subject with tumor recurrence may die early. To quantify this (latent) dependency, we consider a shared frailty (or random effect) model (Wienke 2010), where the event times \(T_1\) and \(T_2\) are considered conditionally independent, given the frailty. Denoting this frailty term by a random variable \(Z>0\) with \(\mathbb {E}[Z]=1\), we define conditional transition functions analogous to \(\lambda _1(t_1)\), \(\lambda _2(t_2)\) and \(\lambda _{12}(t_2|t_1)\) as follows:

$$\begin{aligned} \lambda _1(t_1|Z)&=Z\lambda _{01}(t_1),\ \ t_1>0, \nonumber \\ \lambda _2(t_2|Z)&=Z\lambda _{02}(t_2),\ \ t_2>0, \nonumber \\ \lambda _{12}(t_2|t_1,Z)&=Z\lambda _{03}(t_2),\ \ 0<t_1<t_2, \end{aligned}$$
(4)

where \(\lambda _{0i}(\cdot )\), \(i=1,2,3\), are the baseline hazard functions that can be considered parametric, or non-parametric. The current framework considers the same value of the frailty for the three conditional transition rates. The conditional transition rate for the terminal event given that a non-terminal event has occurred, \(\lambda _{12}(t_2|t_1,Z)\), is assumed Markov, i.e., \(\lambda _{03}(\cdot )\) does not depend on \(t_1\). The conditional explanatory hazard ratio, which characterizes the dependence between \(T_1\) and \(T_2\), is given by \(\textbf{EHR}=\lambda _{12}(t_2|t_1,Z)/\lambda _2(t_2|Z)=\lambda _{03}(t_2)/\lambda _{02}(t_2)\), for \(t_2>t_1\) (Clayton 1978; Xu et al. 2010). Thus, for \(\lambda _{02}(t_2)\ne \lambda _{03}(t_2)\) (General model), the dependence of \(T_2\) on \(T_1\) is described both by the conditional (given Z) explanatory hazard ratio, \(\lambda _{03}(t_2)/\lambda _{02}(t_2)\), as well as by the common frailty Z, through its unknown parameter, which we denote by \(\theta >0\). When \(\lambda _{02}(t_2)=\lambda _{03}(t_2)\) (Restricted model), the dependence of \(T_1\) and \(T_2\) is completely specified by Z. The marginal transition rates can be expressed as a function of the Laplace transform \({\mathcal {L}}_Z(s)\) from the conditional transition rates. This result is presented in Theorem 1 below.

Theorem 1

Define the conditional transition rates \(\lambda _{i}(t_i|Z)=Z\lambda _{0i}(t_i)\), \(i=1,2\), and \(\lambda _{12}(t_2|t_1,Z)=Z\lambda _{03}(t_2)\), where \(\lambda _{0i}(\cdot )\), \(i=1,2,3\) are the baseline hazard functions, and Z is a continuous random variable (frailty), with positive support having mean 1 and finite variance. If the distribution of Z has Laplace transform, \({\mathcal {L}}_Z (s)\), then the marginal transition rates are given by

$$\begin{aligned} {\text{a}}.\;\lambda _{i} (t_{i} ) & = - \lambda _{{0i}} (t_{i} )\frac{d}{{ds}}\log ({\mathcal{L}}_{Z} (s)),\quad s = \Delta _{{01}} (t_{i} ) + \Delta _{{02}} (t_{i} ),\;i = 1,2\;{\text{and}} \\ {\text{b}}{\text{.}}\;\lambda _{{12}} (t_{2} |t_{1} ) & = - \lambda _{{03}} (t_{2} )\frac{d}{{ds}}\log \left( {\frac{d}{{ds}}{\mathcal{L}}_{Z} (s)} \right),\;s = \Delta _{{01}} (t_{1} ) + \Delta _{{02}} (t_{1} ) + \Delta _{{03}} (t_{1} ,t_{2} ). \\ \end{aligned}$$

where \(\Delta _{03}(t_1,t_2)=\Delta _{03}(t_2)-\Delta _{03}(t_1)\) and \(\Delta _{0i}(t_i) = \int _{0}^{t_i}\lambda _{0i}(t)dt\), \(i=1,2,3\), are the baseline cumulative hazard functions.

Expressions for the conditional survival functions, and the proof of Theorem 1, appear in the Supplementary Material (Sections A and B, respectively). Wienke (2010) shows that the joint survival function \(S(t_1,t_2)\) for shared frailty models can be expressed as the Laplace transform of the frailty distribution evaluated at the cumulative baseline hazard. Along the lines of Xu et al. (2010), we can prove that

$$\begin{aligned} S(t_1,t_2)= {\mathcal {L}}_Z(\Delta _{01}(t_1)+\Delta _{02}(t_1)+\Delta _{03}(t_2)-\Delta _{03}(t_1)), \ \ \ 0<t_1\le t_2 \end{aligned}$$
(5)

For the restricted model, the above expression reduces to

$$\begin{aligned} S(t_1,t_2)= {\mathcal {L}}_Z(\Delta _{01}(t_1)+\Delta _{02}(t_2)), \ \ \ 0<t_1\le t_2 \end{aligned}$$
(6)

Note, considering the Laplace transform of the gamma distribution, the expression in (6) above is the joint survival function proposed by Xu et al. (2010) for the restricted case. Once we know the joint survival function, \(S(t_1,t_2)\), we can compute a local measure of association (Clayton 1978; Oakes 1982) between \(T_1\) and \(T_2\), denoted by \(\vartheta ^\star (t_1,t_2)\), as a function of \(\theta\), defined as:

$$\begin{aligned} \vartheta ^\star (t_1,t_2)= \frac{ S(t_1,t_2)\frac{\partial ^2}{\partial t_1\partial t_2}S(t_1,t_2)}{\left( \frac{\partial }{\partial t_1}S(t_1,t_2)\right) \left( \frac{\partial }{\partial t_2}S(t_1,t_2)\right) } \end{aligned}$$
(7)

where \(\partial\) denotes the partial derivatives of the corresponding quantities. Here, \(\vartheta ^\star (t_1,t_2)=1 (>1)\) implies independence (positive association) between \(T_1\) and \(T_2\). Under the general and restricted models in (5) and (6), respectively, and assuming the PVF density, \(\vartheta ^*(t_1, t_2)\) (\(0<t_1<t_2\)) is well-defined, and takes the form (derivation available in the Supplementary Material, Section C):

$$\begin{aligned} \vartheta ^*(t_1, t_2)= {\left\{ \begin{array}{ll} 1+\theta \left( 1+\frac{\theta (\Delta _{01}(t_1)+\Delta _{02}(t_1)+\Delta _{03}(t_2)-\Delta _{03}(t_1))}{1-\gamma }\right) ^{-\gamma },&{} \quad \text { General M.} \\ 1+\theta \left( 1+\frac{\theta (\Delta _{01}(t_1)+\Delta _{02}(t_2))}{1-\gamma }\right) ^{-\gamma },&{} \quad \text { Restricted M.} \end{array}\right. } \end{aligned}$$
(8)

We have \(\lim \limits _{t_i\rightarrow \infty }\Delta _{0i}(t_i)=\infty\), \(i=1,2,3\). Note, in both cases, \(\gamma \longrightarrow 0\) implies \(\vartheta ^\star (t_1,t_2) = 1+\theta\), a constant in \((t_1,t_2)\), which corresponds to the gamma frailty. In this case, the association between \(T_1\) and \(T_2\) increases or decreases with \(\theta\). However, for \(0<\gamma <1\), \(\vartheta ^\star (t_1,t_2)\) decreases with increasing time, with \(\vartheta ^\star (t_1,t_2)\rightarrow 1\) as \(t_1\rightarrow \infty\) and \(t_2\rightarrow \infty\). This implies that, for sufficiently large times, \(T_1\) and \(T_2\) become independent, which does not happen for the gamma model. On the other hand, when \(\gamma <0\) (implying the cP model), \(\vartheta ^\star (t_1,t_2)\rightarrow \infty\) as \(t_1\rightarrow \infty\) and \(t_2\rightarrow \infty\), i.e., larger \(t_1\) and \(t_2\) lead to greater dependency.
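The three regimes of the local association measure in (8) can be illustrated numerically. In the sketch below, `s` stands for the cumulative-hazard argument (for example \(\Delta _{01}(t_1)+\Delta _{02}(t_2)\) in the restricted model); the function name and the values \(\theta =0.8\), \(\gamma =\pm 0.5\) are hypothetical.

```python
def cross_ratio(s, theta, gamma):
    """Local association of Eq. (8) as a function of the accumulated hazard s."""
    if gamma == 0.0:
        # Gamma frailty: constant association 1 + theta.
        return 1.0 + theta
    return 1.0 + theta * (1.0 + theta * s / (1.0 - gamma)) ** (-gamma)

theta = 0.8
grid = [0.0, 1.0, 10.0, 100.0]
pvf_curve = [cross_ratio(s, theta, 0.5) for s in grid]    # decreases towards 1
gamma_curve = [cross_ratio(s, theta, 0.0) for s in grid]  # constant 1 + theta
cp_curve = [cross_ratio(s, theta, -0.5) for s in grid]    # increases without bound
```

All three curves start at \(1+\theta\) when the accumulated hazard is zero; only the choice of \(\gamma\) determines whether the association fades, stays constant, or grows over time.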

2.3 The Illness-Death PVF Framework

In this section, we consider the illness-death framework for the general model; the corresponding results for the restricted model follow directly by assuming \(\lambda _{02}(\cdot )= \lambda _{03}(\cdot )\). In Theorem 1, we consider the frailty Z distributed as a PVF, with \(\mathbb {E}[Z]=\mu =1\) and \(Var(Z)=\theta =1/\sigma\) (this choice avoids the identifiability problem). Thus, the marginal transition rates are:

$$\begin{aligned} \lambda _i(t_i)&=\lambda _{0i}(t_i)\left( 1+\frac{\theta (\Delta _{01}(t_i)+\Delta _{02}(t_i))}{1-\gamma }\right) ^{\gamma -1}, \ \ t_i>0, \ \ i=1,2. \nonumber \\ \lambda _{12}(t_2|t_1)&=\lambda _{03}(t_2)\left( 1+\frac{\theta (\Delta _{01}(t_1)+\Delta _{02}(t_1)+\Delta _{03}(t_1,t_2))}{1-\gamma }\right) ^{-1} \nonumber \\&\times \left[ \theta +\left( 1+\frac{\theta (\Delta _{01}(t_1)+\Delta _{02}(t_1)+\Delta _{03}(t_1,t_2))}{1-\gamma }\right) ^{\gamma }\right] , 0<t_1< t_2. \end{aligned}$$
(9)

Note, \(\lambda _{12}(t_2|t_1)\), the marginal transition rate from ‘recurrence’ to ‘death’, depends on both \(t_1\) and \(t_2\), and is therefore not Markovian, as opposed to its corresponding conditional transition rate, unless Z is constant \((\theta =0)\). In this model, not all subjects have the same value of the frailty. Hence, the conditional transition rates are only comparable within subjects sharing frailty. This causes the interpretation of the marginal and conditional transition rates to be different (Lee and Nelder 2004).
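A minimal sketch of the marginal transition rates in (9) is given below, assuming Weibull baseline hazards \(\lambda _{0i}(t)=\alpha _i\beta _it^{\beta _i-1}\) (the form adopted later in Sect. 3), so that \(\Delta _{0i}(t)=\alpha _it^{\beta _i}\). The function names, the dictionary `eta` and its numerical values are hypothetical and serve only to illustrate the formulas.

```python
def weibull_hazard(t, alpha, beta):
    """Baseline hazard alpha * beta * t^(beta - 1)."""
    return alpha * beta * t ** (beta - 1.0)

def weibull_cumhaz(t, alpha, beta):
    """Baseline cumulative hazard alpha * t^beta."""
    return alpha * t ** beta

def marginal_rate_first(t, i, theta, gamma, eta):
    """lambda_i(t) for i in {1, 2}: first line of Eq. (9)."""
    s = weibull_cumhaz(t, *eta[1]) + weibull_cumhaz(t, *eta[2])
    u = 1.0 + theta * s / (1.0 - gamma)
    return weibull_hazard(t, *eta[i]) * u ** (gamma - 1.0)

def marginal_rate_12(t2, t1, theta, gamma, eta):
    """lambda_12(t2 | t1), 0 < t1 < t2: second line of Eq. (9)."""
    s = (weibull_cumhaz(t1, *eta[1]) + weibull_cumhaz(t1, *eta[2])
         + weibull_cumhaz(t2, *eta[3]) - weibull_cumhaz(t1, *eta[3]))
    u = 1.0 + theta * s / (1.0 - gamma)
    return weibull_hazard(t2, *eta[3]) * (theta + u ** gamma) / u

# Hypothetical (alpha_i, beta_i) pairs for the three transitions.
eta = {1: (0.2, 1.1), 2: (0.05, 1.2), 3: (0.4, 0.9)}
rate_a = marginal_rate_12(3.0, 1.0, 0.8, 0.5, eta)
rate_b = marginal_rate_12(3.0, 2.0, 0.8, 0.5, eta)
```

Here `rate_a` and `rate_b` differ although both are evaluated at \(t_2=3\), which is the non-Markov behaviour noted above; setting `theta = 0.0` returns the baseline hazards themselves.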

As stated before, the PVF density constitutes a family, which includes the Gamma, inverse Gaussian (IG) and positive stable densities (Wienke 2010). In Table 1, we present the marginal transition rates for the Gamma and IG densities; these results were obtained from Eq. (9). The marginal transition rates for the cP model (when \(\gamma <0\)) are similarly obtained. An interesting property of the cP model is that it allows for a fraction of individuals with zero frailty who never experience the event under study (Wienke 2010). The size of the non-susceptible fraction, \(p_0\), is obtained when \((t_1, t_2) \rightarrow \infty\) in \(S(t_1,t_2)\), i.e., using the Laplace transform (2) in (5) for \(\gamma <0\) and assuming \(\lim \limits _{t_i\rightarrow \infty }\Delta _{0i}(t_i)=\infty\), \(i=1,2,3\), we can show

$$\begin{aligned} p_0=\lim _{\begin{array}{c} t_1\rightarrow \infty \\ t_2\rightarrow \infty \end{array}}S(t_1,t_2) =\exp \left( \frac{1-\gamma }{\theta \gamma }\right) . \end{aligned}$$
(10)
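The limit in (10) can be checked numerically by evaluating the joint survival function at a very large accumulated hazard. The sketch below uses the Laplace transform (2) with \(\mu =1\) and \(\sigma =1/\theta\); the function names and the values \(\theta =0.8\), \(\gamma =-0.5\) are hypothetical.

```python
import math

def joint_survival(s, theta, gamma):
    """Laplace transform (2) with mu = 1 and sigma = 1/theta, evaluated at s."""
    u = 1.0 + theta * s / (1.0 - gamma)
    return math.exp((1.0 - gamma) / (theta * gamma) * (1.0 - u ** gamma))

def cure_fraction(theta, gamma):
    """Non-susceptible fraction p0 of Eq. (10); positive only when gamma < 0."""
    if gamma >= 0.0:
        # For gamma >= 0 the survival function tends to zero.
        return 0.0
    return math.exp((1.0 - gamma) / (theta * gamma))

p0 = cure_fraction(0.8, -0.5)
p0_limit = joint_survival(1e12, 0.8, -0.5)
```

For these illustrative values the closed form and the large-`s` evaluation agree to several decimal places.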
Table 1 Marginal transition rates for the Inverse Gaussian (IG) and Gamma model

The marginal transition probabilities \(\mathbb {P}_{01}(t_1)\), \(\mathbb {P}_{02}(t_2)\) and \(\mathbb {P}_{12}(t_2|t_1)\) are defined as the probability of being in state 1 at time \(t_1\) given that the previous state 0 was entered at time 0, the probability of being in state 2 at time \(t_2\) given that the previous state (state 0) was entered at time 0, and probability of being in state 2 at time \(t_2\) given that the previous state (state 1) was entered at time \(t_1\), respectively. These quantities are estimated by integrating the conditional transition rates over the possible transition times,

$$\begin{aligned}&\mathbb {P}_{0i}(t_1)=\int _0^{t_1}\lambda _{0i}(t)\left( 1+\frac{\theta s_i(t,t_1)}{1-\gamma }\right) ^{\gamma -1}\exp \left\{ \frac{1-\gamma }{\theta \gamma }\left[ 1-\left( 1+\frac{\theta s_{i}(t,t_1)}{1-\gamma }\right) ^\gamma \right] \right\} dt, \ \ i=1,2. \nonumber \\&\mathbb {P}_{12}(t_2|t_1)=\int _{t_1}^{t_2}\lambda _{03}(t) \left( 1+\frac{\theta s_3(t,t_1)}{1-\gamma }\right) ^{\gamma -1}\exp \left\{ \frac{1-\gamma }{\theta \gamma }\left[ 1-\left( 1+\frac{\theta s_3(t,t_1)}{1-\gamma }\right) ^\gamma \right] \right\} dt \end{aligned}$$
(11)

where \(s_1(t,t_1)= \Delta _{01}(t)+\Delta _{02}(t)+\Delta _{03}(t,t_1)\), \(s_2(t,t_1)=\Delta _{01}(t)+\Delta _{02}(t)\) and \(s_3(t,t_1)=\Delta _{03}(t_1,t)\). The integrals in (11) are solved numerically, and the proof of this result appears in the Supplementary Material (Section C).
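One simple way to evaluate integrals of this type is a composite Simpson rule. The sketch below does this for the direct transition from state 0 to state 2, taking \(\lambda _{02}\) as the baseline hazard in the integrand and \(s_2(t,t_1)=\Delta _{01}(t)+\Delta _{02}(t)\); this reading of (11), the function names and the exponential baselines used in the example are assumptions made for illustration, not the authors' implementation.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def prob_02(t_end, theta, gamma, cum1, haz2, cum2):
    """Probability of a direct 0 -> 2 transition by time t_end."""
    def integrand(t):
        s = cum1(t) + cum2(t)
        u = 1.0 + theta * s / (1.0 - gamma)
        laplace = math.exp((1.0 - gamma) / (theta * gamma) * (1.0 - u ** gamma))
        return haz2(t) * u ** (gamma - 1.0) * laplace
    return simpson(integrand, 0.0, t_end)

# Hypothetical exponential baselines: rate 0.3 for recurrence, 0.2 for death.
p = prob_02(3.0, 0.8, 0.5,
            cum1=lambda t: 0.3 * t,
            haz2=lambda t: 0.2,
            cum2=lambda t: 0.2 * t)
```

With a Weibull shape parameter below one the baseline hazard is unbounded at \(t=0\), so a rule that avoids the left endpoint (or a change of variable) would be needed in that case.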

3 Maximum Likelihood Inference

In this section, we develop the inferential procedures for our illness-death model via maximum likelihood (ML). Hereafter, we assume a Weibull form for the baseline hazard functions, \(\lambda _{0i}(t)=\alpha _i\beta _i t^{\beta _i-1}\), for \(\alpha _i>0\) and \(\beta _i>0\), \(i = 1,2,3\). We incorporate covariates in the conditional transition rates (4) as follows:

$$\begin{aligned} \lambda _1(t_1|Z,{\textbf{x}})=Z\lambda _{01}(t_1)\exp {(\varvec{\varphi }_1^\prime {\textbf{x}})},\ \ t_1>0 \nonumber \\ \lambda _2(t_2|Z,{\textbf{x}})=Z\lambda _{02}(t_2)\exp {(\varvec{\varphi }_2^\prime {\textbf{x}})},\ \ t_2>0 \nonumber \\ \lambda _{12}(t_2|t_1,Z,{\textbf{x}})=Z\lambda _{03}(t_2)\exp {(\varvec{\varphi }_3^\prime {\textbf{x}})},\ \ 0<t_1<t_2 \end{aligned}$$
(12)

where \({\textbf{x}}=(x_1,\ldots ,x_p)^{\prime }\) is a subject-specific vector of p covariates, \(\varvec{\varphi }_i=(\varphi _{i1},\ldots , \varphi _{ip})^{\prime }\), \(i=1,2,3\) are vectors of coefficients, and Z is the subject-specific shared frailty, distributed independently of \({\textbf{x}}\). The corresponding marginal transition rates with covariates are the same as in (9), where the baseline hazard functions, \(\lambda _{0i}(\cdot )\), and the cumulative baseline hazard functions, \(\Delta _{0i}(\cdot )\), are multiplied with \(\exp {(\varvec{\varphi }_i^{\prime }{\textbf{x}})}\) for \(i=1,2,3\).

The likelihood function is constructed via the conditional transition rates from each of the 4 possible cases, \((\delta _{j1},\delta _{j2})=(1,1),(0,1),(1,0),(0,0)\), and then integrating out Z. In this model, we assume the covariate effects are the same at all times, and the censoring time \(C_j\) is independent of (\(T_{j1},T_{j2}|{\textbf{x}}_j\)). Denote by \(\varvec{\vartheta }=(\gamma , \theta , \varvec{\eta }_1, \varvec{\eta }_2, \varvec{\eta }_3,\varvec{\varphi }_1, \varvec{\varphi }_2, \varvec{\varphi }_3)\) the parameter vector to be estimated, where \(\varvec{\eta }_i=(\alpha _i,\beta _i)\) is the parameter vector of the Weibull model for the rate \(\lambda _{0i}(\cdot )\) and \(\varvec{\varphi }_i=(\varphi _{i1}, \varphi _{i2},\ldots , \varphi _{ip})^{\prime }\) is the coefficient vector, \(i=1,2,3\). The log-likelihood function for \(\varvec{\vartheta }\) given the observed data is:

$$\begin{aligned} \ell (\varvec{\vartheta })&=\sum _{j=1}^{n}\left\{ \delta _{j1}\log \lambda _{01}(Y_{j1})+\delta _{j2}(1-\delta _{j1})\log \lambda _{02}(Y_{j2})+\delta _{j1}\delta _{j2}\log \lambda _{03}(Y_{j2})\right\} \nonumber \\&\quad +\sum _{j=1}^{n}\left\{ \delta _{j1}\varvec{\varphi }_1^{\prime }{\textbf{x}}_j+\delta _{j2}(1-\delta _{j1})\varvec{\varphi }_2^{\prime }{\textbf{x}}_j+\delta _{j1}\delta _{j2}\varvec{\varphi }_3^{\prime }{\textbf{x}}_j\right\} \nonumber \\&\quad +\sum _{j=1}^{n}\left\{ \delta _{j1}{\delta _{j2}} \log \left[ \theta +\left( 1+\frac{\theta K_1({\textbf{x}}_j)}{1-\gamma }\right) ^\gamma \right] +\delta _{j1}(\gamma -1-\delta _{j2})\log \left( 1+\frac{\theta K_1({\textbf{x}}_j)}{1-\gamma }\right) \right\} \nonumber \\& \quad +\sum _{j=1}^{n}\left\{ \frac{\delta _{j1}(1-\gamma )}{\theta \gamma }\left[ 1-\left( 1+\frac{\theta K_1({\textbf{x}}_j)}{1-\gamma }\right) ^\gamma \right] \right\} \nonumber \\&\quad +\sum _{j=1}^{n}\left\{ (\gamma -1)\delta _{j2}(1-\delta _{j1})\log \left( 1+\frac{\theta K_2({\textbf{x}}_j)}{1-\gamma }\right) \right\} \nonumber \\&\quad +\sum _{j=1}^{n}\left\{ \frac{(1-\delta _{j1})(1-\gamma )}{\theta \gamma }\left[ 1-\left( 1+\frac{\theta K_2({\textbf{x}}_j)}{1-\gamma }\right) ^\gamma \right] \right\} \end{aligned}$$
(13)

where \(K_1({\textbf{x}}_j)=(\Delta _{01}(Y_{j1})e^{\varvec{\varphi }_1^{\prime }{\textbf{x}}_j}+\Delta _{02}(Y_{j1})e^{\varvec{\varphi }_2^{\prime }{\textbf{x}}_j}+\Delta _{03}(Y_{j1},Y_{j2})e^{\varvec{\varphi }_3^{\prime }{\textbf{x}}_j})\), and \(K_2({\textbf{x}}_j)=(\Delta _{01}(Y_{j1})e^{\varvec{\varphi }_1^{\prime }{\textbf{x}}_j}+\Delta _{02}(Y_{j1})e^{\varvec{\varphi }_2^{\prime }{\textbf{x}}_j}), \;j=1,\ldots ,n\). The details on the derivation appear in the Supplementary Material (Section D). In the restricted model, note that \(\lambda _{02}(Y_{j2})=\lambda _{03}(Y_{j2})\), for \(j=1,2,\ldots ,n\). However, to guarantee that \(\varvec{EHR}=1\), we also consider \(\varvec{\varphi }_2=\varvec{\varphi }_3\). Thus, the expression in (13) is reduced to the likelihood function of the restricted model.
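To make the four-case construction concrete, the following sketch computes the contribution of a single subject by integrating out Z through the Laplace transform and its first two derivatives: \(\log {\mathcal {L}}_Z(K_2)\) for \((0,0)\), \(\log \{-{\mathcal {L}}_Z^{\prime }\}\) for \((1,0)\) and \((0,1)\), and \(\log {\mathcal {L}}_Z^{\prime \prime }(K_1)\) for \((1,1)\). It assumes the PVF frailty with mean 1 and variance \(\theta\), \(\gamma \ne 0\), and Weibull baselines; all function and argument names are hypothetical and this is not the authors' R code.

```python
import math

def log_laplace_terms(s, theta, gamma):
    """Return log L(s), log(-L'(s)) and log L''(s) for the PVF frailty."""
    u = 1.0 + theta * s / (1.0 - gamma)
    log_l = (1.0 - gamma) / (theta * gamma) * (1.0 - u ** gamma)
    log_d1 = log_l + (gamma - 1.0) * math.log(u)
    log_d2 = log_l + (gamma - 2.0) * math.log(u) + math.log(theta + u ** gamma)
    return log_l, log_d1, log_d2

def loglik_subject(y1, y2, d1, d2, theta, gamma, eta, lp):
    """One subject's log-likelihood; eta[i] = (alpha_i, beta_i), lp[i] = phi_i'x."""
    def cum(t, i):
        # Cumulative baseline hazard times exp(phi_i'x).
        return eta[i][0] * t ** eta[i][1] * math.exp(lp[i])

    def log_haz(t, i):
        # Log of the baseline hazard times exp(phi_i'x).
        return math.log(eta[i][0] * eta[i][1]) + (eta[i][1] - 1.0) * math.log(t) + lp[i]

    k2 = cum(y1, 1) + cum(y1, 2)
    k1 = k2 + cum(y2, 3) - cum(y1, 3)
    if d1 == 1 and d2 == 1:   # recurrence, then death
        return log_haz(y1, 1) + log_haz(y2, 3) + log_laplace_terms(k1, theta, gamma)[2]
    if d1 == 1:               # recurrence, then censored
        return log_haz(y1, 1) + log_laplace_terms(k1, theta, gamma)[1]
    if d2 == 1:               # death without recurrence
        return log_haz(y2, 2) + log_laplace_terms(k2, theta, gamma)[1]
    return log_laplace_terms(k2, theta, gamma)[0]   # censored before any event
```

Summing `loglik_subject` over all subjects gives the quantity to be maximized; a negated version of that sum is what would be handed to a numerical optimizer.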

The ML estimator of \(\varvec{\vartheta }\) (\(\widehat{\varvec{\vartheta }}\), say) can be obtained via numerical optimization of \(\ell (\varvec{\vartheta })\), utilizing the optim function in R. Although all parameters can be estimated this way, here we adopt a more computationally efficient profile likelihood approach for \(\gamma\), so that the maximization of \(\ell (\varvec{\vartheta })\) becomes a two-step procedure. In general, it is reasonable to expect the parameter \(\gamma\) to belong to the interval \((-\tau ,1)\), \(\tau >0\), since \(0\le \gamma \le 1\) represents the PVF model, while \(\gamma <0\) in the cP model. Hence, the first step involves fixing \(\gamma\) at values in this interval, and determining the corresponding ML estimates \(\widetilde{\theta }(\gamma )\), \(\widetilde{\varvec{\eta }}_1(\gamma )\), \(\widetilde{\varvec{\eta }}_2(\gamma )\), \(\widetilde{\varvec{\eta }}_3(\gamma )\), \(\widetilde{\varvec{\varphi }}_1(\gamma )\), \(\widetilde{\varvec{\varphi }}_2(\gamma )\), and \(\widetilde{\varvec{\varphi }}_3(\gamma )\), and the corresponding (maximized) log-likelihood function \(\ell _{\max }(\gamma )\). In the second step, the log-likelihood function \(\ell _{\max }(\gamma )\) is maximized over \(\gamma\), and \(\hat{\gamma }\) is obtained. The ML estimates of \(\theta\), \(\varvec{\eta }_1\), \(\varvec{\eta }_2\), \(\varvec{\eta }_3\), \(\varvec{\varphi }_1\), \(\varvec{\varphi }_2\), \(\varvec{\varphi }_3\) are, respectively, given by \(\hat{\theta }=\widetilde{\theta }(\hat{\gamma })\), \(\widehat{\varvec{\eta }}_1=\widetilde{\varvec{\eta }}_1(\hat{\gamma })\), \(\widehat{\varvec{\eta }}_2=\widetilde{\varvec{\eta }}_2(\hat{\gamma })\), \(\widehat{\varvec{\eta }}_3=\widetilde{\varvec{\eta }}_3(\hat{\gamma })\), \(\widehat{\varvec{\varphi }}_1=\widetilde{\varvec{\varphi }}_1(\hat{\gamma })\), \(\widehat{\varvec{\varphi }}_2=\widetilde{\varvec{\varphi }}_2(\hat{\gamma })\) and \(\widehat{\varvec{\varphi }}_3=\widetilde{\varvec{\varphi }}_3(\hat{\gamma })\).
Under suitable regularity conditions (Maller and Zhou 1996), it can be shown that the asymptotic distribution of the MLE \(\widehat{\varvec{\vartheta }}=(\hat{\theta }, \widehat{\varvec{\eta }}_1, \widehat{\varvec{\eta }}_2, \widehat{\varvec{\eta }}_3, \widehat{\varvec{\varphi }}_1, \widehat{\varvec{\varphi }}_2 , \widehat{\varvec{\varphi }}_3)\) is multivariate normal with mean vector \(\varvec{\vartheta }\) and covariance matrix \(\varvec{\Sigma }(\widehat{\varvec{\vartheta }}) = \left\{ -\,\frac{\partial ^2 \ell (\varvec{\vartheta })}{\partial \varvec{\vartheta } \partial \varvec{\vartheta }^\top } \right\} ^{-1}=\{-J(\varvec{\vartheta })\}^{-1}\), evaluated at \(\varvec{\vartheta } = \widehat{\varvec{\vartheta }}\). The elements of \(J(\varvec{\vartheta })\) are obtained numerically from the Hessian matrix returned by the optim function. An asymptotic \(100(1-\alpha )\%\) confidence interval for each parameter \(\vartheta _r\) is given by \(\bigg ({\hat{\vartheta }}_r-z_{\alpha /2}\sqrt{\widehat{{\Sigma }}^{r,r}}, {\hat{\vartheta }}_{r}+z_{\alpha /2}\sqrt{\widehat{{\Sigma }}^{r,r}}\bigg )\), where \({\widehat{\Sigma }}^{r,r}\) is the rth diagonal element of \({\widehat{\varvec{\Sigma }}}(\widehat{\varvec{\vartheta }})\) estimated at \(\widehat{{\varvec{\vartheta }}}\), for \(r=1,\ldots ,p+\dim (\varvec{\varphi })+1\), \(\dim (\cdot )\) denotes the dimension of the parametric space, and \(z_{\alpha /2}\) is the \(1-\alpha /2\) quantile of the standard normal distribution.
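The two-step profile search and the Wald-type interval can be sketched generically as follows. The inner fit is passed in as a callable `fit_given_gamma`, a hypothetical stand-in for the optimization over the remaining parameters at fixed \(\gamma\) (performed with optim in the paper); the grid, the toy inner fit and the function names are illustrative assumptions.

```python
from statistics import NormalDist

def profile_maximize(gamma_grid, fit_given_gamma):
    """Two-step profile search over gamma.

    fit_given_gamma(g) must return (estimates, maximized log-likelihood)
    for the remaining parameters with gamma held fixed at g.
    """
    best = None
    for g in gamma_grid:
        estimates, loglik = fit_given_gamma(g)       # step 1: inner fit
        if best is None or loglik > best[2]:          # step 2: maximize over gamma
            best = (g, estimates, loglik)
    return best

def wald_interval(estimate, variance, alpha=0.05):
    """Estimate +/- z_{alpha/2} * sqrt(variance), with variance taken from
    the diagonal of the inverse negative Hessian."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * variance ** 0.5
    return estimate - half, estimate + half

# Toy inner fit: a profile log-likelihood peaked at gamma = 0.2.
grid = [k / 10.0 - 0.5 for k in range(15)]
gamma_hat, est_hat, ll_hat = profile_maximize(
    grid, lambda g: ({"theta": 1.0 + g}, -(g - 0.2) ** 2))
lower, upper = wald_interval(1.0, 0.25)
```

A grid search is only one way to carry out the second step; a one-dimensional optimizer over \(\gamma\) could be substituted, with the estimates of the remaining parameters read off at the maximizing value.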

4 Application: Colon Cancer Data

We now apply the illness-death model to the dataset generated from the stage III colon cancer clinical trial. The data appear as the colon data in the R survival package (Therneau 2015). Here, patients who had curative-intent resections of cancer were assigned to one of three arms: (1) observation only, (2) Levamisole (Lev) only, and (3) Lev + Fluorouracil (5-FU). After the full surgical resection of the tumor, 929 patients were followed for 5 years or more (median follow-up = 6.5 years). The scientific objective here is to evaluate covariate and treatment effects on the rates of the terminal and non-terminal events, and the dependence between these events. After deleting subjects with incomplete data and missing observation times, we have a subset of \(n = 888\) patients with approximately 50% censoring. For each patient, we have \(t_{j1}\): time to tumor recurrence (in years), \(t_{j2}\): time to death (in years), and \(x_{j}\): treatment (Observation only, Lev, Lev + 5-FU), \(j=1,\ldots , 888\). R code for implementing our model is available on GitHub: https://github.com/bandyopd/PVF-Semicompeting.

We consider a Weibull baseline hazard, specified by \(\lambda _{0i}(t)=\alpha _i\beta _it^{\beta _i-1}\), \(i=1,2,3\), with the term incorporating covariates in (13) as \(\phi _{ij}=\exp (\varphi _{i1}x_{j1}+\varphi _{i2}x_{j2})\), \(j=1,\ldots ,n\), where the covariate (treatment) effects are defined via dummy variables as:

$$\begin{aligned} x_{1}={\left\{ \begin{array}{ll} 1, &{} \quad \text {if}\quad \text {Levamisole};\\ 0, &{} \quad \text {otherwise} \end{array}\right. } \;\;\text{ and }\; x_{2}={\left\{ \begin{array}{ll} 1, &{} \quad \text {if}\quad \text {Levamisole+5-FU};\\ 0, &{} \quad \text {otherwise}. \end{array}\right. } \end{aligned}$$
Fig. 2 Profile log-likelihood function of \(\gamma\). General model (left panel), Restricted model (right panel)

We now fit our illness-death model with the PVF frailty, along with the particular cases: Inverse Gaussian (IG, \(\gamma =0.5\)) and Gamma (\(\gamma \rightarrow 0\)). The parameter \(\gamma\) was estimated via profile likelihood; see Fig. 2 for the respective plots corresponding to the general and the restricted model. The ML estimates (MLE), standard errors (SE), and the corresponding confidence intervals (95% CI) are listed in Table 2, for both the general and restricted models. In both cases, we observe strong evidence of dependence between time to recurrence and death (revealed by the estimate of \(\theta\)). Note, for the three frailty models, \(\hat{\theta }\) is smaller in the general model, indicating that part of the dependency between \(T_1\) and \(T_2\) is captured by the varying baseline hazards. We also observe that although Lev alone has no significant effect on the risk of recurrence, the combination (Lev + 5-FU) does. In addition, neither regimen has a significant effect on the risk of death, with or without tumor recurrence, as revealed from both models.

Table 2 Analysis of Colon cancer data for the PVF, Inverse Gaussian (IG) and Gamma frailty models

Model comparisons were conducted using the popular AIC/BIC criteria, and are presented in Table 3. For the restricted model, Gamma frailty provides a (marginally) better fit, which is also reflected in Table 2 (since the estimate \(\hat{\gamma }\) is very close to zero). Note that although the log-likelihoods for PVF and Gamma were almost identical, the \(\Delta\)BIC (difference between the BICs) was more prominent than the \(\Delta\)AIC, yet it did not reach the \(\ge 10\) rule of thumb (Anderson and Burnham 2004) that would indicate strong support for the Gamma model. For the general model, both \(\Delta\)AIC and \(\Delta\)BIC were \(< 10\), implying considerably less support for the Gamma model. Although either model could be chosen, we consider the PVF/cP model in our subsequent analysis. The estimated proportion of patients remaining disease free, i.e., \(\hat{p_0}\) (given in (10)), is 0.2561, which implies that after a long period of time, approximately 25% of patients might experience neither tumor recurrence nor death from colon cancer after complete surgical resection of the tumor.
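To make the comparison concrete, the following minimal Python sketch computes AIC and BIC and their differences for two fitted models, using the standard definitions AIC \(= 2k - 2\ell\) and BIC \(= k\log n - 2\ell\), where \(\ell\) is the maximized log-likelihood and \(k\) the number of parameters. The log-likelihood values and parameter counts below are hypothetical placeholders, not the entries of Table 3; only \(n = 888\) is taken from the data description.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k * log(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2 * loglik

n = 888  # sample size of the analyzed subset

# Hypothetical (log-likelihood, number of parameters) pairs, illustration only.
# PVF carries one extra parameter (gamma) compared with the Gamma frailty model.
fits = {"PVF": (-1500.0, 14), "Gamma": (-1500.2, 13)}

delta_aic = aic(*fits["PVF"]) - aic(*fits["Gamma"])
delta_bic = bic(*fits["PVF"], n) - bic(*fits["Gamma"], n)

print("Delta AIC (PVF - Gamma):", round(delta_aic, 3))
print("Delta BIC (PVF - Gamma):", round(delta_bic, 3))
```

When the two log-likelihoods are nearly equal, the differences are driven by the penalty on the extra parameter: about 2 for AIC and about \(\log n\) for BIC, which is why \(\Delta\)BIC is the larger of the two for a sample of this size.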

Table 3 Log-likelihood, AIC and BIC for the fitted models

In Fig. 3 (left panel), we plot the conditional explanatory hazard ratio (EHR) including covariates, given by \(\text {EHR}=\frac{\lambda _{03}(t_2)}{\lambda _{02}(t_2)}\exp \left[ (\varvec{\varphi }_3^T-\varvec{\varphi }_2^T){\textbf{x}}\right] , t_2>t_1\). Here, EHR describes how the risk of death changes over time, given that the tumor recurrence occurred at time \(t_1\) (Lee et al. 2015). In particular, \(\text {EHR}>1\) implies that tumor recurrence increases the hazard of death due to colon cancer. Although EHR does not depend on \(t_1\) due to its Markov structure, its interpretation depends on the condition \(t_2> t_1\), for all \(t_1\) fixed. For better visualization, we zoom in on the EHR curve (see Fig. 3, right panel). We observe that the EHR for patients under the competing regimens (Observation only, Lev only, and Lev + 5-FU) is 1 at approximately \(t = 19.5, 33.5\) and 38 years, respectively. This implies that at those instants, tumor recurrence has no effect on the hazard of death.
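Under the Weibull baseline hazards used here, the EHR has the closed form \(\frac{\alpha _3\beta _3}{\alpha _2\beta _2}\,t^{\beta _3-\beta _2}\exp [(\varvec{\varphi }_3-\varvec{\varphi }_2)^T{\textbf{x}}]\), so the time at which it equals 1 can be solved for directly whenever \(\beta _3\ne \beta _2\). The Python sketch below evaluates the EHR and this crossing time. The function names and the parameter values are hypothetical, for illustration only, and are not the fitted values from Table 2.

```python
import math

def ehr(t, a2, b2, a3, b3, phi2, phi3, x):
    """EHR(t) = lambda_03(t) / lambda_02(t) * exp((phi3 - phi2)' x), Weibull baselines."""
    baseline_ratio = (a3 * b3 * t ** (b3 - 1)) / (a2 * b2 * t ** (b2 - 1))
    linear = sum((p3 - p2) * xi for p2, p3, xi in zip(phi2, phi3, x))
    return baseline_ratio * math.exp(linear)

def ehr_unit_time(a2, b2, a3, b3, phi2, phi3, x):
    """Time t* with EHR(t*) = 1; returns None when b3 == b2 (EHR constant in t)."""
    if b3 == b2:
        return None
    c = math.exp(sum((p3 - p2) * xi for p2, p3, xi in zip(phi2, phi3, x)))
    # Solve (a3*b3)/(a2*b2) * c * t**(b3 - b2) = 1 for t.
    return (a2 * b2 / (a3 * b3 * c)) ** (1.0 / (b3 - b2))

# Hypothetical parameter values (illustration only).
params = dict(a2=0.05, b2=1.4, a3=0.9, b3=0.6, phi2=(0.1, 0.2), phi3=(0.0, 0.1))

for label, x in (("Obs", (0, 0)), ("Lev", (1, 0)), ("Lev+5FU", (0, 1))):
    t_star = ehr_unit_time(x=x, **params)
    print(label, "EHR at 1 year:", round(ehr(1.0, x=x, **params), 3),
          "EHR = 1 at t =", round(t_star, 2))
```

Because the covariate term only rescales the curve, the three regimens share the same shape in \(t\) and differ only in where the curve crosses 1.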

Fig. 3 Estimated (conditional) hazard ratio (\(\textbf{EHR}\)) for each treatment

In Fig. 4, we plot the marginal transition rates, \(\hat{\lambda }_1(t)\) and \(\hat{\lambda }_2(t)\), stratified by treatment. We observe that in the first year, patients on the Lev + 5-FU regimen have a lower risk of tumor recurrence, compared to patients in the other two groups. This finding is consistent with Table 2, where we observe a significant effect on tumor recurrence only for the Lev + 5-FU group. However, the transition rate to death (from \(\hat{\lambda }_2(t)\)) is higher with Lev + 5-FU, although Table 2 reveals that regimen type does not have a significant effect on death.

Fig. 4 Estimated marginal transition rates for the transition to recurrence, \(\hat{\lambda }_1(t)\) (left panel) and transition to death, \(\hat{\lambda }_2(t)\) (right panel), stratified by regimen type

The estimated transition rates for death after tumor recurrence are plotted in Fig. 5. Note that this transition rate depends on the time to death \(t_2\), in addition to the time since tumor recurrence, and the plots vary with the recurrence time \(t_1\). If we compare these transition rates with the transition rates for death in Fig. 4 (right panel), the curves without tumor recurrence (Fig. 4, right panel) lie substantially lower than the curves with tumor recurrence (Fig. 5), implying strong evidence of the effect of the non-terminal event (tumor recurrence) on the terminal event (death), and of the dependence between them. We can infer that a patient with tumor recurrence has a higher risk of death than a patient without tumor recurrence. However, we also observe that as \(t_1\) increases, the transition rates to death after tumor recurrence decrease. This implies that the effect of tumor recurrence is less prominent for patients who experience recurrence near the end of the study than for patients who experience it early.

Fig. 5 Estimated marginal transition rates to death given that a patient has had tumor recurrence at \(t_1\), \(\hat{\lambda }_1(t_2|t_1)\), stratified by the explanatory variable treatment, for \(t_1=1, 3, 6\) and 9 years

Finally, Fig. 6 plots the transition probabilities presented in Sect. 2.3. From the plot of \(\hat{\mathbb {P}}_{01}(t_1)\) (for tumor recurrence, see upper left), we observe that the probability is lowest for the Lev + 5-FU group, which corroborates the finding in Table 2. For these patients, the first few post-surgical months following full tumor resection are critical, as revealed by the increasing probability of recurrence till time t, beyond which it decreases. However, the corresponding probabilities for death, i.e., \(\hat{\mathbb {P}}_{02}(t_2)\) (see upper right panel), are increasing at all time points and for all treatments. Interestingly, this probability is the largest for the Lev + 5-FU group. The plot in the lower left panel compares \(\hat{\mathbb {P}}_{01}(t_1)\) to \(\hat{\mathbb {P}}_{02}(t_1)\). We observe that a patient is more likely to experience recurrence than death within the first 2.5 years post resection, beyond which the situation reverses. Finally, Fig. 6 (lower right) plots \(\hat{\mathbb {P}}_{12}(t|t_1)\), the transition probability to death, given that the patient was in the recurrence state at time \(t_1\), for various choices of \(t_1\). Comparing with \(\hat{\mathbb {P}}_{02}(t_2)\) (upper right), we infer that a patient experiencing tumor recurrence at time \(t_1\) is more likely to progress to death, compared to a patient without recurrence. The dependence between recurrence and death is thus also manifested through the transition probabilities.

Fig. 6 Estimated marginal transition probabilities for recurrence (\(\hat{\mathbb {P}}_1(t)\), upper left), death (\(\hat{\mathbb {P}}_2(t)\), upper right), their comparison plot (with \(\hat{\mathbb {P}}_1(t)\), lower left), and death given patient had tumor recurrence at \(t_1\) (\({\hat{S}}_{12}(t|t_1)\), lower right), stratified by the explanatory variable treatment regimen. \({\hat{S}}_{12}(t|t_1)\) is plotted for various values of \(t_1\)

5 Simulation Study

In this section, we present a small simulation study to evaluate the finite sample performance of the ML estimates of the model parameters. We generate semi-competing risks data using the algorithm of Selle (2016), as follows:

  1. Generate \(Z \sim PVF(\gamma ,1,\frac{1}{\theta })\), for \(\gamma =0.5\).

  2. Simulating the first event time \(t_1\): Denote \(p_1=\mathbb {P}[\text {not having any transitions before } t_1|Z]\), and generate \(u_1\sim\) Uniform(0,1). Equating the expression for \(p_1\) to \(u_1\), we solve for \(t_1\).

  3. Simulating the second event time \(t_2\): Denote \(p_2=\) conditional probability of staying in state 1 until \(t_2\), and generate \(u_2\sim\) Uniform(0,1). Equating the expression for \(p_2\) to \(u_2\), we solve for \(t_2\).

  4. Denote \(p=\mathbb {P}[\text {the transition at time } t \text { is to state 1}]\), and generate \(u\sim\) Uniform(0,1). If \(u\le p\) then \(Y_1=t_1\), \(\delta _1=1\) and \(Y_2=t_2\), else, \(Y_1=Y_2=t_1\) and \(\delta _1=0\).

  5. Simulating the censoring time C: Generate \(u_3\sim\) Uniform(0,1). If \(u_3<0.5\), then \(C\sim\) Uniform(v, w), else \(C=w\).

  6. If \(C>Y_2\), then \(\delta _2=1\); else if \(C\ge Y_1\) and \(C\le Y_2\), then \(Y_2=C\) and \(\delta _2=0\), else \(Y_1=Y_2=C\) and \(\delta _1=\delta _2=0\).

In Step 2, the probability of not having any transitions before time t is given in (A.1); considering the conditional model with covariates given in (12), we have \(p_1=\exp (-Z[\Delta _{01}(t)e^{\varvec{\varphi }_1^\prime {\textbf{x}}}+\Delta _{02}(t)e^{\varvec{\varphi }_2^\prime {\textbf{x}}}])\). Similarly, if there is a transition to state 1, the second event time \(t_2\) is simulated in Step 3, where the conditional probability of staying in state 1 until \(t_2\) is given in (A.2). From (12), we have \(p_2=\exp (-Z[\Delta _{03}(t_1,t_2)e^{\varvec{\varphi }_3^\prime {\textbf{x}}}])\). It follows that the probability of going to state 1 at time t (Step 4) is \(p=\lambda _{01}(t)e^{\varvec{\varphi }_1^\prime {\textbf{x}}}/(\lambda _{01}(t)e^{\varvec{\varphi }_1^\prime {\textbf{x}}}+\lambda _{02}(t)e^{\varvec{\varphi }_2^\prime {\textbf{x}}})\). In Step 5, the censoring time was simulated from a mixture distribution, i.e., from a uniform distribution on (1.5, 3) with probability 0.5, and a point mass at 3 with probability 0.5. This keeps the average percentage of censored observations between 10% and 20%. The covariates \(x_{1j}\) and \(x_{2j}\) were generated, respectively, from \(\text{ Bernoulli }(0.5)\) and \(\text{ uniform }(0,1)\) distributions, for \(j=1,\ldots ,n\). The baseline hazard, once again, follows \(\text{ Weibull }(\alpha _{i}, \beta _{i})\), \(i=1,2,3\). For the restricted model, we consider the coefficient vectors \(\varvec{\varphi _{1}}=\varvec{\varphi _{2}}=\varvec{\varphi _{3}}=(1,1)\), the Weibull parameters \(\log \alpha _1=\log \alpha _2=\log \alpha _3=1\), and \(\beta _1=\beta _2=\beta _3=1\), such that \(\varvec{EHR}=1\). For the general model, we consider \(\varvec{\varphi _{1}}=\varvec{\varphi _{2}}=\varvec{\varphi _{3}}=(1,1)\), \(\log \alpha _1=\log \alpha _2=1\), \(\log \alpha _3=1.25\) and \(\beta _1=\beta _2=\beta _3=1\), leading to \(\varvec{EHR}>1\). Finally, we consider the frailty parameter \(\theta =1\).
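The six steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used for Tables 4 and 5. First, the PVF draw in Step 1 is swapped for a Gamma frailty with mean 1 and variance \(\theta\) (the \(\gamma \rightarrow 0\) case), since a PVF sampler is not described here. Second, the cumulative Weibull hazards are taken as \(\Delta _{0i}(t)=\alpha _it^{\beta _i}\), and \(\Delta _{03}(t_1,t_2)=\alpha _3(t_2^{\beta _3}-t_1^{\beta _3})\) is assumed from the Markov structure. The function names are hypothetical.

```python
import math
import random

def open_uniform(rng):
    """Draw u from the open interval (0, 1), so that log(u) is finite."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u

def simulate_subject(rng, alpha, beta, phi, x, theta, v=1.5, w=3.0):
    """Return (Y1, delta1, Y2, delta2) for one subject."""
    # Step 1 (assumption): Gamma frailty, mean 1 and variance theta, instead of PVF.
    z = rng.gammavariate(1.0 / theta, theta)
    # Covariate multipliers exp(phi_i' x) for transitions i = 1, 2, 3.
    e = [math.exp(sum(c * xi for c, xi in zip(phi[i], x))) for i in range(3)]

    # Step 2: solve p1(t1) = u1, i.e. Delta_01(t) e1 + Delta_02(t) e2 = -log(u1) / Z.
    target = -math.log(open_uniform(rng)) / z
    def total(t):
        return alpha[0] * t ** beta[0] * e[0] + alpha[1] * t ** beta[1] * e[1]
    lo, hi = 0.0, 1.0
    while total(hi) < target:      # bracket the root
        hi *= 2.0
    for _ in range(200):           # bisection on the increasing function total(t)
        mid = 0.5 * (lo + hi)
        if total(mid) < target:
            lo = mid
        else:
            hi = mid
    t1 = hi

    # Step 3: solve p2(t2) = u2 in closed form (assumed Markov cumulative hazard).
    u2 = open_uniform(rng)
    t2 = (t1 ** beta[2] - math.log(u2) / (z * alpha[2] * e[2])) ** (1.0 / beta[2])

    # Step 4: the transition at t1 is to state 1 (recurrence) with probability p.
    h1 = alpha[0] * beta[0] * t1 ** (beta[0] - 1) * e[0]
    h2 = alpha[1] * beta[1] * t1 ** (beta[1] - 1) * e[1]
    if rng.random() <= h1 / (h1 + h2):
        y1, d1, y2 = t1, 1, t2
    else:
        y1, d1, y2 = t1, 0, t1

    # Step 5: censoring time from the mixture of Uniform(v, w) and a point mass at w.
    c = rng.uniform(v, w) if rng.random() < 0.5 else w

    # Step 6: apply censoring.
    if c > y2:
        d2 = 1
    elif y1 <= c <= y2:
        y2, d2 = c, 0
    else:
        y1, y2, d1, d2 = c, c, 0, 0
    return y1, d1, y2, d2

rng = random.Random(2024)
alpha = [math.exp(1.0)] * 3        # log(alpha_i) = 1, restricted-model setting
beta = [1.0, 1.0, 1.0]
phi = [(1.0, 1.0)] * 3
data = []
for _ in range(500):
    x = (1.0 if rng.random() < 0.5 else 0.0, rng.random())  # Bernoulli(0.5), Uniform(0,1)
    data.append(simulate_subject(rng, alpha, beta, phi, x, theta=1.0))

print("share with censored death time:", sum(1 - d2 for _, _, _, d2 in data) / len(data))
```

Bisection is used in Step 2 because the equation has no closed-form solution when \(\beta _1\ne \beta _2\); with equal shape parameters, as in the settings above, \(t_1\) could also be obtained directly.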

For the simulations, we consider sample sizes \(n= 250, 500\) and 1000. For each configuration, we conduct \(N=1000\) replications to calculate the averages of the MLEs, denoted by \(\varvec{\bar{\vartheta }}\). We compute the standard deviations \(SD=\sqrt{\sum _{k=1}^N(\varvec{\hat{\vartheta }}_k-\varvec{\bar{\vartheta }})^2/N}\), where \(\varvec{\hat{\vartheta }}_k\) is the MLE vector in the kth simulation, \(k=1,\ldots ,N\); the root mean squared errors \(RMSE=\sqrt{\sum _{k=1}^N(\varvec{\hat{\vartheta }}_k-\varvec{\vartheta ^{(0)}})^2/N}\), where \(\varvec{\vartheta ^{(0)}}\) is the vector of true parameter values used to generate the data; the average standard error (MSE), i.e., the average over replications of \(SE_k=\sqrt{\text {diag}({\widehat{\varvec{\Sigma }}_k}(\widehat{\varvec{\vartheta }}_k))}\), \(k=1,\ldots ,N\), where \(\text {diag}({\widehat{\varvec{\Sigma }}_k}(\widehat{\varvec{\vartheta }}_k))\) is the diagonal of the kth estimated variance-covariance matrix; the empirical \(95\%\) coverage probabilities (CP) for the model parameters; and the bias. The simulation results, listed in Tables 4 and 5 for the general and restricted models, respectively, reveal that the MLEs are close to the true values, that the bias, RMSE and standard errors decrease as the sample size increases, and that the empirical CPs approach the nominal coverage level as the sample size increases. All of these are expected if the underlying estimation scheme is working correctly to produce consistent and asymptotically normal estimates. The simulation study was also repeated for \(\theta =0.5<1\) and \(\theta =1.15> 1\), and the results obtained were very similar.
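For a single parameter, these summaries can be computed as in the following Python sketch. The function name is hypothetical, the coverage probability is assumed to be based on Wald-type intervals (estimate \(\pm 1.96\times\) SE), and the input numbers are toy values rather than simulation output.

```python
import math

def summarize_replications(estimates, std_errors, truth, z=1.96):
    """Bias, SD, RMSE, mean SE and empirical coverage for one parameter."""
    n_rep = len(estimates)
    mean_est = sum(estimates) / n_rep
    bias = mean_est - truth
    # SD uses deviations from the average estimate, with divisor N as in the text.
    sd = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / n_rep)
    # RMSE uses deviations from the true value.
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n_rep)
    mean_se = sum(std_errors) / n_rep
    # Assumption: coverage is assessed with Wald intervals, estimate +/- z * SE.
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s <= truth <= e + z * s)
    return {"mean": mean_est, "bias": bias, "sd": sd, "rmse": rmse,
            "mean_se": mean_se, "cp": covered / n_rep}

# Toy inputs (illustration only): five replications of one parameter with truth 1.
toy_estimates = [0.9, 1.1, 1.2, 0.8, 1.05]
toy_std_errors = [0.15] * 5
summary = summarize_replications(toy_estimates, toy_std_errors, truth=1.0)
for name, value in summary.items():
    print(name, round(value, 4))
```

With the divisor \(N\) in both formulas, the quantities satisfy \(RMSE^2 = SD^2 + \text {bias}^2\), which provides a quick consistency check on tabulated results.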

Table 4 Simulation results for the general model (\(\varvec{EHR}>1\))
Table 5 Simulation results for the restricted model (\(\varvec{EHR}=1\))

6 Conclusion

In this paper, we consider modeling semi-competing risks data that arise in cancer clinical trials via the shared frailty illness-death framework. Our central contribution is to avoid the latent failure time approach (which comes with well-explored limitations), and explore the illness-death specification via the PVF frailty, which is a flexible and general class containing the gamma, inverse Gaussian and positive stable densities. The convenient Laplace transform of the PVF allows mathematically tractable representations of the hazards, transition probabilities, and survival functions. The simulation study indicates adequate finite sample performance, which improves with increasing sample size.

The R package SmoothHazard (Touraine et al. 2013) also fits an illness-death model, with either Weibull (parametric) or penalized M-splines (semi-parametric) specifications of the baseline hazard, for arbitrarily censored survival data. In contrast, our approach incorporates a dependency structure between recurrence time and death time using a frailty shared between the conditional transition rates, which acts multiplicatively on the baseline hazard rates. This dependency is characterized by the frailty parameter \(\theta\).

Our current choice of a (parametric) Weibull baseline hazard is motivated by computational stability. Future extensions may include various popular and flexible choices of the baseline hazard from the non- and semi-parametric toolbox (Ibrahim et al. 2001), such as a piecewise-constant hazard, and a study of their impact on both parameter estimation and computational gain. Depending on the dataset, the current setup can also be extended to include a higher-dimensional covariate space, with a principled selection of those covariates (Chapple et al. 2017). Furthermore, the conditional transition rate for the terminal event given that a non-terminal event has occurred was specified as a Markov model; however, a semi-Markov specification (Lee et al. 2015) is also possible. Inference may also be explored under a Bayesian paradigm. These will be considered elsewhere.