1 Introduction

Recurrent event data are frequently encountered in longitudinal studies in which a particular event of interest occurs repeatedly for a subject. Examples include cancer recurrence, women’s menstrual cycles, and machinery breakdown. Assume there are n subjects in a study who have experienced an initial event (e.g. cancer occurrence). Let \(i=1,\dots ,n\) index the subjects and \(j=0,1,\dots\) index the recurrent events for a given subject, where \(j=0\) denotes the initial event. For subject i, let \(T_{ij}\) be the gap time, i.e., the time between the \((j-1)\)th and jth events, and let \(C_i\) be the time between the beginning of the study and the end of follow-up. Then:

$$\begin{aligned} \sum _{j=1}^{m_i-1}T_{ij}\le C_i, \sum _{j=1}^{m_i}T_{ij}>C_i, \end{aligned}$$
(1)

where \(m_i\) is the number of observed gap times. When \(m_i=1\), \(T_{i1}\) is censored at \(C_i\), and we define \(\sum _{j=1}^{m_i-1}T_{ij}=0\). Otherwise, if \(m_i>1\), the first \(m_i-1\) gap times are complete and the last one is censored at \(T_{i,m_i}^{+}=C_i-\sum _{j=1}^{m_i-1}T_{ij}\). The observed gap times for subject i consist of \(\{T_{i1},\dots ,T_{i,m_i-1},T_{i,m_i}^{+}\}\). We assume the observed data are i.i.d. across the n subjects.
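In code, this observed-data structure can be sketched as follows; a minimal Python helper (the function name and interface are hypothetical, not from the paper) that splits a subject's full gap-time sequence into the complete gaps plus the final censored gap:

```python
def observe_gaps(gaps, C):
    """Return the observed gap times for one subject: the complete gaps
    T_{i1},...,T_{i,m_i-1} followed by the censored gap T+_{i,m_i},
    together with m_i, the number of observed gap times."""
    observed, total = [], 0.0
    for t in gaps:
        if total + t > C:                # this gap runs past the end of follow-up
            observed.append(C - total)   # censored residual gap T+_{i,m_i}
            return observed, len(observed)
        observed.append(t)               # complete gap
        total += t
    raise ValueError("gap sequence ended before the censoring time C")
```

For example, `observe_gaps([2.0, 3.0, 4.0], 6.0)` returns `([2.0, 3.0, 1.0], 3)`: two complete gaps and a censored gap of length 1, matching \(m_i=3\); a single gap exceeding \(C_i\) yields the \(m_i=1\) case above.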

One particular research interest in studying recurrent event data is the time trend analysis for the gap times [2,3,4]. The trend analysis is of scientific importance due to its application in measuring disease progression. For example, researchers are often interested in whether a treatment for a psychiatric patient will prolong the time to hospital readmission, since readmission frequently means the treatment is not effective [5]. A natural idea for studying the time trend is to compare the lengths of different gap times in chronological order. However, a naive comparison is not possible, since recurrent event data are subject to induced censoring [6]: the \(T_{ij}\text {s}~(j\ge 2)\) are dependently censored by \(C_i-T_{i1}-\cdots -T_{i,j-1}\). [1] tackled the induced censoring issue by comparing the marginal distributions of different gap times, and proposed to study the time trend via a hypothesis testing procedure. The null hypothesis states that there is no time trend, i.e., all the \(T_{ij}\)s have the same marginal distribution within each subject i. The standard K-sample trend test can be applied if there is no censoring; in the presence of induced censoring, however, this approach does not work. One way to circumvent this issue is to introduce the concept of comparability: a further constraint on pairs of gap times within a subject, such that the gap time pairs satisfying the constraint remain comparable (e.g. have the same conditional marginal distribution). Given subject i, for a gap time pair \((T_{ij},T_{ik})\) (\(j<k\)), \(\sum _{l=1}^k T_{il}\le C_i\) ensures that neither \(T_{ij}\) nor \(T_{ik}\) is censored.
Denote \(S_{i,jk}=\sum _{l=1}^{k}T_{il}-(T_{ij}+T_{ik})\). Given \(T_{i1},\dots ,T_{ik}\) and \(C_i\), in order for \(T_{ij}\) and \(T_{ik}\) to avoid censoring, \(T_{ij}+T_{ik}\) must not exceed \(C_i-S_{i,jk}\). According to [1]’s definition, \(T_{ij}\) and \(T_{ik}\) are comparable if \(T_{ij}\) can be fitted into \(T_{ik}\)’s observation interval and vice versa. Here the observation intervals for \(T_{ij}\) and \(T_{ik}\) are \(C_i-S_{i,jk}-T_{ik}\) and \(C_i-S_{i,jk}-T_{ij}\), respectively. For further details of the rationale, see Sect. 2 in [1]. In the absence of covariates, the comparability constraint in [1] is defined as:

$$\begin{aligned} \begin{array}{c} T_{ij}\le C_i-S_{i,jk}-T_{ij},\\ T_{ik}\le C_i-S_{i,jk}-T_{ik}. \end{array} \end{aligned}$$
(2)

If \(T_{ij}\) and \(T_{ik}\) satisfy constraint (2), then they form a comparable pair. It is worth mentioning that the comparability concept also appears in regression problems based on independent truncated observations; see, for example, [7,8,9], among others.
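Constraint (2) is simple to check in code; a small sketch (hypothetical helper, 1-based indices \(j<k\) over a subject's complete gap times):

```python
def comparable_no_covariates(T, j, k, C):
    """Check constraint (2) for the pair (T_j, T_k), with 1-based j < k.
    T holds the subject's complete gap times and C the censoring time."""
    S = sum(T[:k]) - (T[j - 1] + T[k - 1])   # S_{i,jk} = sum_{l<=k} T_l - (T_j + T_k)
    return (T[j - 1] <= C - S - T[j - 1] and
            T[k - 1] <= C - S - T[k - 1])
```

With `T = [1, 2, 3]`, the pair \((T_{1},T_{2})\) is comparable when `C = 10` but not when `C = 3`, since \(2 > 3-0-2\).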

In the presence of censoring, [1] proposed the comparability constraint and constructed a test statistic for the trend analysis:

$$\begin{aligned} U=\sum _{i=1}^n\sum _{j<k\le m_i-1}\delta _{i,jk}\text {sgn}\{(T_{ik}-T_{ij})(Z_{ik}-Z_{ij})\}, \end{aligned}$$
(3)

where \(Z_{ij}\) is a given trend measure for the jth gap time of subject i, and \(\delta _{i,jk}\) encodes the comparability constraint. One practical example of the trend measure is the dose level [?]. An assigned measure such that \(Z_{ij}\) is increasing with j can also be used, e.g. \(Z_{ij}=j\) [10]. Here \(\delta _{i,jk}\) indicates whether the comparability constraint holds for the gap time pair \(T_{ij}\) and \(T_{ik}\) of subject i: if \(T_{ij}\) and \(T_{ik}\) are comparable, then \(\delta _{i,jk}=1\); otherwise \(\delta _{i,jk}=0\). Thus only comparable pairs are selected and used in the hypothesis testing. Since the last observation \(T_{i,m_i}^+\) is always biased due to the intercept sampling problem [6], \(T_{i,m_i}^+\) is excluded from (3) as well as from the following statistical analysis.
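A direct transcription of (3) might look as follows (a sketch under the assumption that each subject is supplied as its complete gap times, with the censored gap \(T_{i,m_i}^+\) already dropped; the names and data layout are illustrative, not from the paper):

```python
def sgn(x):
    """Sign function used in (3)."""
    return (x > 0) - (x < 0)

def trend_statistic(subjects):
    """Compute U in (3). Each subject is a tuple (gaps, C, Z), where gaps
    are the complete gap times T_{i1},...,T_{i,m_i-1}, C is the censoring
    time, and Z holds the trend measures for those gaps."""
    U = 0
    for gaps, C, Z in subjects:
        for j in range(len(gaps)):
            for k in range(j + 1, len(gaps)):
                S = sum(gaps[:k + 1]) - (gaps[j] + gaps[k])   # S_{i,jk}
                # comparability constraint (2): 2*T_ij <= C - S and 2*T_ik <= C - S
                if 2 * gaps[j] <= C - S and 2 * gaps[k] <= C - S:
                    U += sgn((gaps[k] - gaps[j]) * (Z[k] - Z[j]))
    return U
```

A subject with increasing gaps contributes \(+1\) per comparable pair when \(Z_{ij}=j\), and a subject with decreasing gaps contributes \(-1\).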

In the presence of covariates, [1] adapted the comparability constraint to the accelerated failure time model and included the trend measure \(Z_{ij}\) as one component of the covariates, so that the sign of the coefficient of the trend measure can be used to determine the trend. As a result, one only needs to estimate the trend via a parameter estimation procedure. However, as shown in Sect. 2.1 of this paper, under the comparability constraint proposed by [1], the comparable gap time pairs not only have the same distribution but are also independent. Therefore, [1]’s estimation procedure under the accelerated failure time model is subject to information loss. Our simulation studies show that the comparability constraint in [1] uses only about half of the data under moderate censoring; when the censoring is heavy, the information loss becomes even worse (Tables 1 and 2 in Sect. 3).

In this paper, we propose an alternative comparability constraint under the same model as in [1]. We emphasize that, compared with [1]’s estimation procedure, our estimation procedure employs the same assumptions as in their paper. More importantly, we prove that the comparable pairs under the new comparability constraint have the same conditional distribution, but they are not conditionally independent. Thus our method is superior to [1]’s. Since our constraint is weaker, we name our method and [1]’s the weak comparability and the strong comparability, respectively. Our simulation results also show that the weak comparability recruits more comparable gap time pairs, and the variance of our estimator is smaller than that of the estimator under the strong comparability.

This paper is organized as follows: In Sect. 2, we introduce the concept of weak comparability and present the asymptotic results. Section 3 presents the simulation results. The proposed method is applied to the HIV Prevention Trial Network 052 data in Sect. 4. The paper concludes with a discussion in Sect. 5.

2 Main Results

2.1 The Strong Comparability Under the Accelerated Failure Time Model

Given subject i, for the jth gap time (\(j=1,\dots ,m_i-1\)), consider the following accelerated failure time model:

$$\begin{aligned} \log T_{ij}=\alpha _i+Z_{ij}^\top \beta +e_{ij}, \end{aligned}$$
(4)

where \(\alpha _i\) is a random intercept, \(Z_{ij}\) is a \(p\times 1\) vector of covariates whose values vary across gap times within subject i, with one component of \(Z_{ij}\) being the trend measure, and \(\beta\) is a \(p\times 1\) vector of parameters. For subject i, conditioning on \(\alpha _i\) and \(Z_{ij}\), the error terms \(e_{ij},j=1,\dots ,m_i-1\) are independent of each other and have a common distribution \(G_i\) with mean zero. If \(p=1\) and \(Z_{ij}\) is the trend measure, then model (4) provides a direct interpretation of the time trend for gap times: \(\beta =0\) means no trend, while \(\beta >0\)/\(\beta <0\) means that \(T_{ij},j=1,\dots ,m_i-1\) tend to be longer/shorter in chronological order, respectively.

Let \(e_{ij}(\beta )=\log T_{ij}-\alpha _i-Z_{ij}^\top \beta\) denote the residual of the jth gap time for subject i. [1] stated that \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) are comparable if \(e_{ij}(\beta )\) lies in the observation interval of \(e_{ik}(\beta )\) and, symmetrically, \(e_{ik}(\beta )\) lies in the observation interval of \(e_{ij}(\beta )\). The comparability constraint is given below (p. 792):

$$\begin{aligned} \begin{array}{c} e_{ij}(\beta )\le \log (C_i-S_{i,jk}-T_{ij})-(\alpha _i+Z_{ik}^\top \beta ),\\ e_{ik}(\beta )\le \log (C_i-S_{i,jk}-T_{ik})-(\alpha _i+Z_{ij}^\top \beta ). \end{array} \end{aligned}$$
(5)

Thus \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) constitute a strong comparable pair if they satisfy (5). We note that (5) is a direct extension of (2) under model (4). Taking the first inequality as an example, it is equivalent to

$$\begin{aligned} \log T_{ij}-\alpha _i-Z_{ij}^\top \beta \le \log (C_i-S_{i,jk}-T_{ij})-(\alpha _i+Z_{ik}^\top \beta ). \end{aligned}$$

In the absence of model (4), the first inequality of the comparability is

$$\begin{aligned} T_{ij}\le C_i-S_{i,jk}-T_{ij}, \end{aligned}$$

which is the same as

$$\begin{aligned} \log T_{ij}\le \log (C_i-S_{i,jk}-T_{ij}). \end{aligned}$$

Here \(C_i-S_{i,jk}-T_{ij}\) is the observation interval of \(T_{ik}\); if \(T_{ij}\) is larger than this interval, then there is no way for \(T_{ij}\) and \(T_{ik}\) to be comparable (since \(T_{ij}\) would then exceed the support of \(T_{ik}\)). However, under model (4), \(T_{ij}\) and \(T_{ik}\) cannot be compared directly, and one needs to adjust for their corresponding covariates. As a result, we subtract \(\alpha _i+Z_{ij}^\top \beta\) on the left-hand side (since it is related to \(T_{ij}\)) and subtract \(\alpha _i+Z_{ik}^\top \beta\) on the right-hand side (since it is related to \(T_{ik}\)).

2.2 The Weak Comparability Under the Accelerated Failure Time Model

It is easy to see that (5) is equivalent to:

$$\begin{aligned} \begin{array}{c} T_{ij}+T_{ij}\exp \{(Z_{ik}-Z_{ij})^\top \beta \}\le C_i-S_{i,jk},\\ T_{ik}+T_{ik}\exp \{(Z_{ij}-Z_{ik})^\top \beta \}\le C_i-S_{i,jk}. \end{array} \end{aligned}$$
(6)

In what follows, we use (6) to represent the strong comparability constraint, since the nuisance parameter \(\alpha _i\) has been eliminated. Based on (6), we propose the weak comparability constraint as follows:

$$\begin{aligned} \begin{array}{l} T_{ij}+T_{ik}\le C_i-S_{i,jk},\\ T_{ij}\exp \{(Z_{ik}-Z_{ij})^\top \beta \}+T_{ik}\exp \{(Z_{ij}-Z_{ik})^\top \beta \}\le C_i-S_{i,jk}. \end{array} \end{aligned}$$
(7)

Constraint (7) is obtained from (6) by exchanging the terms \(T_{ij}\exp \{(Z_{ik}-Z_{ij})^\top \beta \}\) and \(T_{ik}\) between the left-hand sides of the two inequalities. The intuition can be found in the geometrical shape of the constraint, which will be illustrated shortly via Fig. 1. In the following, we first show that if \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy constraint (7), then they follow the same distribution.

The assumptions that we require are as follows:

Assumption 1

Within each subject i, the transformed gap times \(\exp (e_{{ij}}(\beta ))=T_{ij}\exp (-\alpha _i-Z_{ij}^\top \beta )\) are independently distributed given \(\alpha _i\) and \(Z_{ij}\).

Assumption 2

Within each subject i, \(C_i\) is conditionally independent of the random intercept \(\alpha _i\) and the random errors \(\{e_{i1},e_{i2},\dots \}\) given \(Z_{i1},Z_{i2},\dots\).

Assumption 1 takes the covariate effect \(Z_{ij}\) and the random intercept \(\alpha _i\) into consideration; similar assumptions can also be found in [11] and [12]. Assumption 2 means that \(C_i\) is conditionally independent of \(\{T_{i1},T_{i2},\dots \}\) given \(\{Z_{i1},Z_{i2},\dots \}\).

Lemma 1

Under Assumptions 1 and 2, given a fixed real value e, if \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the strong comparability constraint (6), then we have

$$\begin{aligned} \text {Pr}\{e_{ij}(\beta )\le e\} =\text {Pr}\{e_{ik}(\beta )\le e \}, \end{aligned}$$
(8)

if \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the weak comparability constraint (7), then we will also have

$$\begin{aligned} \text {Pr}\{e_{ij}(\beta )\le e \} =\text {Pr}\{e_{ik}(\beta )\le e \}. \end{aligned}$$
(9)

Lemma 1 shows that constraint (7) can be used to find comparable pairs. When \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the strong comparability constraint, denote the joint probability density function of \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) by \(f_{i,jk}^S\), and their marginal probability density functions by \(f_{ij}^S\) and \(f_{ik}^S\), respectively. When \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the weak comparability constraint, denote the corresponding joint and marginal probability density functions by \(f_{i,jk}^W\), \(f_{ij}^W\) and \(f_{ik}^W\). It is easy to see that \(f_{i,jk}^S=f_{ij}^S\times f_{ik}^S\), which means that if \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the strong comparability constraint, then they are independent of each other. [1] showed the independence of \(T_{ij}\) and \(T_{ik}\) in the absence of covariates; for further details, see Sect. 2, subsection ‘Comparable \((t_j,t_k)\)’ in their paper. However, the same result does not hold under weak comparability: if the comparable pairs satisfy the weak comparability constraint, we do not have \(f_{i,jk}^W=f_{ij}^W\times f_{ik}^W\), so weak comparable pairs have the same distribution but are not independent. As a result, pairs satisfying the strong comparability constraint have the same distribution and are also independent of each other, which means the strong comparability automatically imposes an additional restriction that is not needed in estimation. The weak comparability constraint does not suffer from this issue. All the above results show that constraint (6) is stronger than (7).

Denote by \(\delta _{i,jk}^S(\beta )\) and \(\delta _{i,jk}^W(\beta )\) the indicator functions for the pair \((T_{ij},T_{ik})\) under the strong and weak comparability, respectively. If \(T_{ij}\) and \(T_{ik}\) satisfy (6), then \(\delta _{i,jk}^S(\beta )=1\), and \(\delta _{i,jk}^S(\beta )=0\) otherwise. Similarly, if \(T_{ij}\) and \(T_{ik}\) satisfy (7), then \(\delta _{i,jk}^W(\beta )=1\), and \(\delta _{i,jk}^W(\beta )=0\) otherwise. We can then estimate the parameter \(\beta\) by minimizing either of the following objective functions:

$$\begin{aligned} M_S(\beta )=\sum _{i=1}^n\sum _{1\le j<k\le m_i-1}\delta _{i,jk}^S(\beta )\mid e_{ik}(\beta )-e_{ij}(\beta )\mid , \end{aligned}$$
(10)
$$\begin{aligned} M_W(\beta )=\sum _{i=1}^n\sum _{1\le j<k\le m_i-1}\delta _{i,jk}^W(\beta )\mid e_{ik}(\beta )-e_{ij}(\beta )\mid . \end{aligned}$$
(11)
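For a scalar \(\beta\), minimizing (11) can be sketched by a simple grid search (hypothetical interface; note that the random intercept \(\alpha_i\) cancels in the residual differences within a subject, so it is not needed in the code):

```python
import math

def M_W(beta, subjects):
    """Objective (11). Each subject is (gaps, C, Z), where gaps are the
    complete gap times with the censored last gap already excluded."""
    total = 0.0
    for gaps, C, Z in subjects:
        for j in range(len(gaps)):
            for k in range(j + 1, len(gaps)):
                S = sum(gaps[:k + 1]) - (gaps[j] + gaps[k])   # S_{i,jk}
                d = math.exp((Z[k] - Z[j]) * beta)
                # weak comparability constraint (7)
                if (gaps[j] + gaps[k] <= C - S and
                        gaps[j] * d + gaps[k] / d <= C - S):
                    # e_ik(beta) - e_ij(beta); alpha_i cancels in the difference
                    total += abs(math.log(gaps[k]) - math.log(gaps[j])
                                 - (Z[k] - Z[j]) * beta)
    return total

def grid_minimize(subjects, grid):
    """Crude minimizer of (11) over a user-supplied grid of beta values."""
    return min(grid, key=lambda b: M_W(b, subjects))
```

For a toy subject with gaps \((1, e)\), \(Z=(1,2)\) and a large \(C\), the objective vanishes at \(\beta =1\) and the grid search recovers it; in practice a derivative-free optimizer would replace the grid.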

In the following, we show that \(\delta _{i,jk}^W(\beta )\ge \delta _{i,jk}^S(\beta )\) for any \(\beta\). To see this, let \(d=\exp \{(Z_{ik}-Z_{ij})^\top \beta \}\), \(x=T_{ij}\), \(y=T_{ik}\), and \(c=C_i-S_{i,jk}\); then (6) becomes

$$\begin{aligned} \begin{array}{c} x(1+d)\le c,\\ y(1+1/d)\le c. \end{array} \end{aligned}$$
(12)

Meanwhile, (7) is equivalent to

$$\begin{aligned} \begin{array}{c} x+y\le c,\\ dx+y/d\le c. \end{array} \end{aligned}$$
(13)

In Fig. 1, the shaded area represents the strong comparability constraint, while the hatched area represents the weak comparability constraint. It is easy to see that the hatched area is bigger, which indicates \(\delta _{i,jk}^W(\beta )\ge \delta _{i,jk}^S(\beta )\). As a result, (11) utilizes more gap time pairs and thus produces an estimate with a smaller variance than the estimate obtained from (10). In addition, a simple calculation gives the hatched area as \(c^2/(1+d)\) and the shaded area as \(c^2d/(1+d)^2\), so the hatched area is \(1+1/d\) times as large as the shaded area, where d depends on \(Z_{ij}\), \(Z_{ik}\) and \(\beta\).
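The nesting of the two regions and the stated areas can be checked numerically; a Monte Carlo sketch over the rectangle \([0,c]^2\) (illustrative code, not from the paper):

```python
import random

def region_areas(c, d, n=200_000, seed=1):
    """Monte Carlo estimates of the areas of the strong region (12),
    the weak region (13), and their intersection."""
    rng = random.Random(seed)
    strong = weak = both = 0
    for _ in range(n):
        x, y = rng.uniform(0, c), rng.uniform(0, c)
        s = x * (1 + d) <= c and y * (1 + 1 / d) <= c   # constraint (12)
        w = x + y <= c and d * x + y / d <= c            # constraint (13)
        strong += s
        weak += w
        both += s and w
    cell = c * c / n
    return strong * cell, weak * cell, both * cell
```

With \(c=2\) and \(d=1.5\) the estimates land near the closed forms \(c^2d/(1+d)^2=0.96\) and \(c^2/(1+d)=1.6\), and the intersection equals the strong area, confirming that the strong region is nested inside the weak one.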

Fig. 1

Illustration of comparability

2.3 Asymptotic Results

Suppose (10) and (11) achieve their minima at \(\hat{\beta }_n^S\) and \(\hat{\beta }_n^W\), respectively.

Following the suggestion of one of the referees, we also state some additional assumptions that are needed to establish the following theorem:

Assumption 3

There exists a matrix \(\Sigma _0(\beta )\) such that \(\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=1}^n cov(\sum _{j<k}\delta _{i,jk}^W(\beta )\text {sgn}[(Z_{ik}-Z_{ij})\{e_{ik}(\beta )-e_{ij}(\beta )\}])=\Sigma _0(\beta )\).

Assumption 4

There exists a vector \(\mu (\beta )\) such that \(\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=1}^n \sum _{j<k}\delta _{i,jk}^W(\beta )\text {sgn}[(Z_{ik}-Z_{ij})\{e_{ik}(\beta )-e_{ij}(\beta )\}]=\mu (\beta )=E\left\{ \sum _{j<k}\delta _{i,jk}^W(\beta )\text {sgn}[(Z_{ik}-Z_{ij})\{e_{ik}(\beta )-e_{ij}(\beta )\}]\right\} .\)

Assumption 5

The probability density functions for \(e_{ij}\) are continuous and bounded.

Since the two objective functions \(M_S(\beta )\) and \(M_W(\beta )\) have a similar form, in the following we will only present the asymptotic result for \(\hat{\beta }_n^W\). Denote \(\beta _0\) as the true value of \(\beta\), then we have:

Theorem 1

Under Assumptions 1 to 5, the estimator \(\hat{\beta }_n^W\) is consistent, and \({n}^{1/2}(\hat{\beta }_n^W-\beta _0)\) converges in distribution to \(N(0,\{\mu ^{\prime }(\beta _0)^\top \}^{-1}\Sigma (\beta _0)\{\mu ^{\prime }(\beta _0)\}^{-1})\), where \(\Sigma (\beta _0)\) is the covariance matrix of \(\sum _{j<k}\delta _{i,jk}^W(\beta _0)\text {sgn}[(Z_{ik}-Z_{ij})\{e_{ik}(\beta _0)-e_{ij}(\beta _0)\}]\), \(\mu ^{\prime }(\beta _0)\) is the partial derivative of

$$\begin{aligned} \mu (\beta )=E\left\{ \sum _{j<k}\delta _{i,jk}^W(\beta )\text {sgn}[(Z_{ik}-Z_{ij})\{e_{ik}(\beta )-e_{ij}(\beta )\}]\right\} \end{aligned}$$

at \(\beta _0\).

Thus \({n}^{1/2}(\hat{\beta }_n^W-\beta _0)\) is asymptotically normally distributed. We note that it is hard to quantify the efficiency gain of our estimator over the strong comparability estimator due to the sandwich structure of the variance. However, as discussed in Sect. 2.1, for subject i, each \(\delta _{i,jk}^W(\beta )\) can potentially recruit more gap time pairs than \(\delta _{i,jk}^S(\beta )\), and thus the variance under weak comparability will be smaller than that under strong comparability.

The estimation of the asymptotic covariance matrix may be difficult since the numerical computations of \(\mu ^{\prime }(\beta _0)\) and \(\Sigma (\beta _0)\) are nontrivial. Bootstrap methods can be applied in practice.

3 Simulation Study

In this section, we evaluate the finite sample properties of the methods developed in Sect. 2 via extensive simulation studies. We use the resampling approach proposed by [13] to approximate the distributions of \(\hat{\beta }_n^W\) and \(\hat{\beta }_n^S\). First we generate \(w_i\) independently from the binomial distribution \(\text {Bi}(n,1/n)\); then we minimize the perturbed objective function

$$\begin{aligned} \tilde{M}^W(\beta )=\sum _{i=1}^nw_{i}\sum _{j<k}\delta _{i,jk}^W(\beta )\mid e_{ik}(\beta )-e_{ij}(\beta )\mid , \end{aligned}$$

and repeat the procedure \(B=200\) times.
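The perturbation scheme can be sketched as follows, with the inner minimization abstracted into a user-supplied callback (hypothetical interface, not from [13]):

```python
import random
import statistics

def resample_estimates(minimize_perturbed, n, B=200, seed=7):
    """Perturbation resampling: for each of B replicates, draw weights
    w_1,...,w_n i.i.d. from Binomial(n, 1/n) and re-minimize the
    weighted objective via the supplied callback."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(B):
        # Binomial(n, 1/n) draw as a sum of n Bernoulli(1/n) trials
        w = [sum(rng.random() < 1.0 / n for _ in range(n)) for _ in range(n)]
        estimates.append(minimize_perturbed(w))
    return estimates, statistics.stdev(estimates)
```

The empirical standard deviation of the B replicate estimates then serves as the bootstrap standard error.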

We use model (4) to generate the gap times, where we set \(p=1,~\alpha _i=1\), and assume \(e_{ij}\) follows a normal distribution with mean 0 and variance 1/4. The true value of \(\beta\) is set to 0 and 0.2, respectively, and 500 simulated data sets were generated. We denote the number of subjects in each simulated data set by n, where \(n=30,60,100\). For the ith subject in the lth data set \((i=1,\dots ,n,~l=1,\dots ,500)\), we first generate 10 successive gap times and then use a constant \(C_i\) as the censoring time to select the first \(m_{i,l}\) gap times \(\{T_{i1},\dots ,T_{i,m_{i,l}}\}\), where \(\sum _{j=1}^{m_{i,l}-1}T_{ij}\le C_i\) and \(\sum _{j=1}^{m_{i,l}}T_{ij}>C_i\). We consider two trend measures in the simulation study, \(Z_{ij}=j\) and \(Z_{ij}=j^{1/2}\); the results for \(Z_{ij}=j\) are shown in Tables 1 and 2, and the results for \(Z_{ij}=j^{1/2}\) are shown in Tables 3 and 4.
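One subject of the simulated data can be generated as below (a sketch of the stated design with \(Z_{ij}=j\); the function and parameter names are ours):

```python
import math
import random

def simulate_subject(beta, C, rng, n_gaps=10, alpha=1.0, sd=0.5):
    """Generate gap times from model (4) with Z_ij = j and
    e_ij ~ N(0, 1/4), then censor at C as in (1); the last entry of the
    returned list is the censored gap T+_{i,m_i}."""
    gaps = [math.exp(alpha + j * beta + rng.gauss(0.0, sd))
            for j in range(1, n_gaps + 1)]
    observed, total = [], 0.0
    for t in gaps:
        if total + t > C:
            observed.append(C - total)   # censored residual gap
            break
        observed.append(t)
        total += t
    return observed
```

Repeating this for \(i=1,\dots ,n\) and over 500 data sets reproduces the simulation design; the complete gaps `observed[:-1]` always sum to at most \(C\).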

For the ith subject in the lth simulated data set, the observed gap times are \(\{T_{i1},\dots ,T_{i,m_{i,l}}^{+}\}\). Denote the weak and strong comparability indicators for episodes j and k (\(j<k\le m_{i,l}-1\)) in subject i by \(\delta _{i,jk}^{l,W}(\beta )\) and \(\delta _{i,jk}^{l,S}(\beta )\), respectively. We compute the following quantities to compare the efficiency of the weak and strong comparabilities:

  • Mean Total Pairs: It is defined as

    $$\begin{aligned} \frac{1}{500}\sum _{l=1}^{500}\sum _{i=1}^{n}\left( {\begin{array}{c}m_{i,l}-1\\ 2\end{array}}\right) . \end{aligned}$$

    The mean total pairs is the average total number of pairs and measures the maximum capacity allowed in estimation. Under this scenario, we assume that every pair is comparable, i.e., for \(i=1,\dots ,n\) and \(j<k\le m_{i,l}-1\), we have \(\delta _{i,jk}^{l,W}(\beta _0)=1\) and \(\delta _{i,jk}^{l,S}(\beta _0)=1\).

  • Mean Comparable Pairs: For the weak comparability, it is defined as

    $$\begin{aligned} \frac{1}{500}\sum _{l=1}^{500}\sum _{i=1}^{n}\sum _{j<k\le m_i-1}\delta _{i,jk}^{l,W}(\beta _0), \end{aligned}$$

    while for the strong comparability, it is defined as

    $$\begin{aligned} \frac{1}{500}\sum _{l=1}^{500}\sum _{i=1}^{n}\sum _{j<k\le m_i-1}\delta _{i,jk}^{l,S}(\beta _0). \end{aligned}$$

    The mean comparable pairs measures how much information the strong and weak comparability utilize. The difference in mean comparable pairs between the weak and strong comparability reflects the relative efficiency of the two constraints. If the mean total pairs equals the mean comparable pairs, then every pair is a comparable pair.
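These summary quantities reduce to simple averages over the simulated data sets; for instance, the mean total pairs (a sketch, where `m_counts[l][i]` holds \(m_{i,l}\)):

```python
from math import comb

def mean_total_pairs(m_counts):
    """Average over simulated data sets of sum_i C(m_{i,l} - 1, 2),
    i.e. the mean total number of candidate gap time pairs."""
    n_datasets = len(m_counts)
    return sum(comb(m - 1, 2) for dataset in m_counts for m in dataset) / n_datasets
```

For example, `mean_total_pairs([[3, 4], [3, 3]])` returns `3.0` (1 + 3 pairs in the first data set, 1 + 1 in the second); the mean comparable pairs replaces the binomial coefficient by the sum of the corresponding indicators.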

Table 1 Simulation results for \(\beta _0=0\), \(Z_{ij}=j\)

From Table 1, we can see that the standard deviations of the estimates under the weak comparability constraint are smaller than those under the strong version. From the table, we can also see that the mean comparable pairs under weak comparability are larger than under the strong version, which was already verified graphically in Sect. 2.1. When the censoring time is shorter (\(C_i=8\)), the difference between the standard deviations is larger. When the censoring time is longer (i.e. \(C_i=9\)), \(m_i\) is larger for each subject i (\(i=1,\dots ,n\)), and thus the mean total pairs also increases. Table 2 shows similar results for \(\beta _0=0.2\). The simulation results for \(Z_{ij}=j^{1/2}\) are shown in Tables 3 and 4 and are similar to those in Tables 1 and 2. As mentioned by one of the referees, the empirical coverage probabilities in Tables 3 and 4 seem a bit low compared to the nominal level. We therefore conducted further simulations with larger sample sizes; the results, shown in Tables 5 and 6, indicate that the empirical coverage probabilities move closer to the nominal level as the sample size increases. In addition, for the trend measure \(Z_{ij}=j^{1/2}\), the coverage probability for strong comparability tends to be lower than that for the weak version under moderate or heavy censoring.

One referee asked whether the proposed estimator performs well under other censoring mechanisms. We note that the censoring mechanism does not affect the performance, since we only utilize the uncensored gap times and select comparable pairs (under either strong or weak comparability). Based on our observation in Sect. 2.1, the comparable pairs under strong comparability are nested within those under weak comparability, so for any censoring mechanism, the standard deviation for weak comparability will be less than that for strong comparability. To illustrate this, we conducted a simulation study under type II censoring, where the number of gap times is the same for each subject; the results are shown in Table 7 and coincide with our previous findings. When the number of gap times is larger, the number of building blocks of the U-statistic within subject i (i.e. \(\mid e_{ik}(\beta )-e_{ij}(\beta )\mid\)) also becomes larger; therefore, the standard deviation decreases and the difference between the two comparabilities becomes smaller. As a further illustration, we also conducted a simulation under random censoring, where the censoring time \(C_i\) is assumed to follow an exponential distribution with mean 6; the results are shown in Table 8.

Furthermore, we also conducted a simulation in which the trend measure is misspecified, under both type I and type II censoring. For type I censoring, \(C_i\) is set to 8; for type II censoring, the number of gap times for each subject equals 4. The true trend measure is set to \(Z_{ij}=j^{1/2}\), while in the model we assume the trend measure is \(Z_{ij}=j\); the results are shown in Tables 9 and 10. From these tables, we can see that both estimators perform well.

In summary, all the results indicate that the weak comparability can utilize more data, and provide a more efficient estimate than the strong comparability.

Table 2 Simulation results for \(\beta _0=0.2\), \(Z_{ij}=j\)
Table 3 Simulation results for \(\beta _0=0\), \(Z_{ij}={j}^{1/2}\)
Table 4 Simulation results for \(\beta _0=0.2\), \(Z_{ij}={j}^{1/2}\)
Table 5 Simulation results for \(\beta _0=0\), \(Z_{ij}={j}^{1/2}\) under larger sample sizes, compared with Table 3
Table 6 Simulation results for \(\beta _0=0.2\), \(Z_{ij}={j}^{1/2}\) under larger sample sizes, compared with Table 4
Table 7 Simulation results for \(\beta _0=0\), \(Z_{ij}=j\) under type II censoring
Table 8 Simulation results for \(\beta _0=0\), \(Z_{ij}=j\) under random censoring
Table 9 Simulation results under misspecification, true \(\beta _0=0\), true \(Z_{ij}={j}^{1/2}\)
Table 10 Simulation results under misspecification, true \(\beta _0=0\), true \(Z_{ij}={j}^{1/2}\)

4 Real Data

We apply the proposed method to the HIV Prevention Trial Network 052 data [14, 15]. Beginning in April 2005, 1763 HIV-1-serodiscordant couples were enrolled into this study and randomly assigned to receive either early or delayed antiretroviral therapy. 886 participants received early therapy at enrollment, while the remaining 877 participants started therapy after two consecutive CD4+ counts fell below 250 cells per cubic millimeter or after an illness indicative of the acquired immunodeficiency syndrome developed. All the couples were followed up until 2015. On May 11, 2011, following the recommendation of an independent NIH/NIAID data and safety monitoring board, all the patients in both cohorts were provided early antiretroviral therapy; the interim results had shown a dramatic 96% reduction in HIV-1 transmission risk for the early antiretroviral therapy arm.

In this study, it is essential to maintain high levels of medication adherence to the recommended treatment regimens to achieve effective antiretroviral therapy. Here adherence means a patient’s ability to take medications as prescribed. It is known that at least 95% adherence is needed to achieve effective HIV treatment [16]. In this study, adherence was measured by pill counts, self counts, as well as the measurement of viral load [15]. Doctors’ counselling is recognized as one of the critical aspects that can affect adherence [17]; thus all the participants in the study received adherence counselling during each visit [14].

For the time interval between two consecutive visits, we calculate the ratio of the total number of pills that a participant has taken to the total number of pills that have been dispensed to this participant. We use this ratio as the measurement of adherence, and the adherence varies across different time intervals. For each participant’s visit history, we construct the recurrent event data as follows: (1) If the adherence of a participant’s first visit interval is larger than 95%, then we count the total consecutive visit days with adherence larger than 95%, and denote this consecutive visit episode as a high adherence episode. (2) If the adherence of a participant’s first visit interval is smaller than 95%, then we count the total consecutive visit days with adherence smaller than 95%, and denote this consecutive visit episode as a low adherence episode. (3) Apply (1) and (2) alternately to construct the remaining episodes; each episode serves as a gap time. (4) Define an indicator function for each episode: it equals 1 if the adherence during that episode is larger than 95%, and equals 0 otherwise. For instance, if a patient’s data are as in Table 11, then the gap times for this patient are 14, 86 (= 28 + 58), 150 (= 60 + 90), 70, 320 (= 80 + 180 + 60), and 60, with indicators equal to 0, 1, 0, 1, 0, 1. In order to maintain high efficacy of the antiretroviral therapy, an ideal pattern would be that the high adherence episodes tend to be longer and the low adherence episodes tend to be shorter. Here we use the proposed trend analysis method to assess the pattern of the adherence alternation. The model we consider is

$$\begin{aligned} \log T_{ij}=\alpha _i+Z_{ij}^{(1)}\beta _1+Z_{ij}^{(2)}\beta _2 +Z_{ij}^{(1)}\times Z_{ij}^{(2)}\beta _3+e_{ij}, \end{aligned}$$
(14)

where \(T_{ij}\) is the jth gap time for the ith participant. Let \(Z_{ij}^{(1)}\) denote the trend measure; here we set \(Z_{ij}^{(1)}=j\) or \(j^{1/2}\). Let \(Z_{ij}^{(2)}\) be an indicator function that equals 1 if the jth episode has high adherence and 0 otherwise. \(\beta _3\) is the coefficient of the interaction between the trend measure and the adherence indicator. We only focus on the 886 participants in the early antiretroviral therapy arm, since the pattern of the treatment for this group is consistent from enrollment to the end of follow-up. The gap times were measured in months. Since some participants’ data were missing, we excluded these participants, leaving 829 individuals in the analysis. The data set is analyzed under both the full follow-up time (without censoring) and an artificial censoring time (May 11, 2011). The results are shown in Table 12, and all the estimates are significant. From the table, we can see that \(\hat{\beta }_1\) is positive, which indicates that the lengths of both the higher and the lower adherence episodes become longer. Positive \(\hat{\beta }_2\) shows that the average length of the higher adherence episodes is longer than that of the lower adherence episodes. Negative \(\hat{\beta }_3\) means that the lengths of the higher and the lower adherence episodes change in opposite directions: when the length of the higher adherence episodes gets longer, the length of the lower adherence episodes gets shorter, and vice versa. The results under the full follow-up data and the censored data are similar. For both data sets, the standard deviations for \(\hat{\beta }_1\) and \(\hat{\beta }_2\) under weak comparability are smaller than those under strong comparability, though they do not differ much, because the number of gap times is large. This is also confirmed by the simulation study, where we can see that as the censoring time \(C_i\) becomes longer, the difference between weak and strong comparability becomes smaller.
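Steps (1)–(4) above amount to run-length collapsing of the visit intervals; a sketch (hypothetical helper; the adherence values below are made up to reproduce the episode pattern described for Table 11):

```python
def build_episodes(intervals, threshold=0.95):
    """Collapse consecutive visit intervals whose adherence lies on the
    same side of the threshold into episodes; returns (gap time in days,
    high-adherence indicator) tuples."""
    episodes = []
    for days, adherence in intervals:
        high = 1 if adherence > threshold else 0
        if episodes and episodes[-1][1] == high:
            episodes[-1][0] += days        # extend the current episode
        else:
            episodes.append([days, high])  # start a new episode
    return [(d, h) for d, h in episodes]

# made-up (days, adherence) visit intervals matching the example pattern
visits = [(14, 0.90), (28, 0.97), (58, 0.99), (60, 0.80), (90, 0.70),
          (70, 0.96), (80, 0.50), (180, 0.60), (60, 0.90), (60, 0.99)]
```

Applied to `visits`, the helper reproduces the gap times 14, 86, 150, 70, 320, 60 with indicators 0, 1, 0, 1, 0, 1 from the example.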

Table 11 A sample patient visit data
Table 12 HIV Prevention Trial Network 052 data results

5 Discussion

In this paper, we propose a new comparability constraint for stratified gap times under the accelerated failure time model. This constraint can be used to identify the time trend in the gap times. Compared with [1]’s comparability constraint, the proposed constraint recruits more data pairs into the estimation procedure; thus the proposed weak comparability is more efficient. The theoretical and simulation results show that our method outperforms [1]’s. We use the accelerated failure time model due to its simple interpretation; however, we plan to extend the idea to more complicated survival models (e.g., the Cox model) in the future. While in this paper we have considered pairwise comparisons of two gap times, we will also extend the idea to compare more than two gap times.