Abstract
Recurrent event data are frequently encountered in longitudinal studies where each individual may experience more than one event. Wang and Chen (Biometrics 56(3):789–794, 2000) proposed a comparability constraint to estimate the time trend for the gap times, under which the selected gap time pairs have the same conditional distribution. However, the comparable gap time pairs are also independent, so their constraint is stronger than needed for estimation and their procedure suffers information loss. Under the accelerated failure time model, we propose a new comparability constraint that overcomes this drawback: the selected gap time pairs still have the same distribution, but they need not be independent of each other. We show that the proposed constraint utilizes more gap time pairs than the strong comparability, and various simulation studies show that the resulting variance is smaller than that of Wang and Chen’s (2000) estimator. We apply the proposed method to the HIV Prevention Trials Network 052 study.
1 Introduction
Recurrent event data are frequently encountered in many longitudinal studies when a particular event of interest occurs repeatedly for a subject. Examples include cancer recurrence, women’s menstrual cycles, and machinery breakdown. Assume there are n subjects in a study who have experienced an initial event (e.g. cancer occurrence). Let \(i=1,\dots ,n\) index the subjects and \(j=0,1,\dots\) index the recurrent events for a given subject, where \(j=0\) denotes the initial event. For subject i, let \(T_{ij}\) be the gap time, i.e., the time between the \((j-1)\)th and jth events, and let \(C_i\) be the time between the beginning of the study and the end of follow-up; then:
where \(m_i\) is the number of observed gap times. When \(m_i=1\), \(T_{i1}\) is censored at \(C_i\), and we define \(\sum _{j=1}^{m_i-1}T_{ij}=0\). Otherwise, if \(m_i>1\), the first \(m_i-1\) gap times are complete and the last one is censored at \(T_{i,m_i}^{+}=C_i-\sum _{j=1}^{m_i-1}T_{ij}\). The observed gap times for subject i consist of \(\{T_{i1},\dots ,T_{i,m_i-1},T_{i,m_i}^{+}\}\). We assume the observed data are i.i.d. across the n subjects.
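As a sketch of the observed-data structure just described (complete gaps plus one censored last gap), the bookkeeping can be written as follows; the function and variable names are hypothetical, not from the paper:

```python
# Sketch: assemble {T_i1, ..., T_i,m_i-1, T+_{i,m_i}} from a subject's
# successive gap times and the follow-up limit C_i.

def observed_gap_times(gap_times, C):
    """Return (complete gaps, censored last gap T+_{i,m_i}, m_i)."""
    complete, elapsed = [], 0.0
    for t in gap_times:
        if elapsed + t > C:               # this gap runs past end of follow-up
            return complete, C - elapsed, len(complete) + 1
        complete.append(t)
        elapsed += t
    return complete, None, len(complete)  # follow-up outlasted all gaps

complete, censored, m = observed_gap_times([3.0, 2.0, 4.0], C=7.0)
# -> complete [3.0, 2.0], censored 2.0 (= 7 - 3 - 2), m = 3
```

Here \(m_i=3\): the first two gaps are fully observed and the third is cut off at the end of follow-up.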
One particular research interest in studying recurrent event data is the time trend analysis for the gap times [2,3,4]. The trend analysis is of scientific importance due to its application in measuring disease progression. For example, researchers are often interested in whether a treatment for a psychiatric patient will prolong the time to hospital readmission, since frequent readmission often means the treatment is not effective [5]. A natural idea for studying the time trend is to compare the lengths of different gap times in chronological order. However, a naive comparison is not possible, since recurrent event data are subject to induced censoring [6], which means \(T_{ij}\text {s}~(j\ge 2)\) are subject to dependent censoring by \(C_i-T_{i1}-\cdots -T_{i,j-1}\). [1] tackled the induced censoring issue by comparing the marginal distributions of different gap times, and proposed to study the time trend via a hypothesis testing procedure. The null hypothesis states that there is no time trend, i.e., all the \(T_{ij}\)s have the same marginal distribution within each subject i. The standard K-sample trend test can be applied if there is no censoring; in the presence of induced censoring, however, this approach does not work. One way to circumvent this issue is to introduce the concept of comparability, a further constraint imposed on the pairwise combinations of gap times within a subject, so that the gap time pairs satisfying the constraint remain comparable (e.g. have the same conditional marginal distribution). Given subject i, for gap time pairs \((T_{ij},T_{ik})\) (\(j<k\)), \(\sum _{l=1}^k T_{il}\le C_i\) ensures that both \(T_{ij}\) and \(T_{ik}\) will not be censored.
Denote \(S_{i,jk}=\sum _{l=1}^{k}T_{il}-(T_{ij}+T_{ik})\). Given \(T_{i1},\dots ,T_{ik}\) and \(C_i\), to avoid \(T_{ij}\) and \(T_{ik}\) being censored, \(T_{ij}+T_{ik}\) must not exceed \(C_i-S_{i,jk}\). According to [1]’s definition, \(T_{ij}\) and \(T_{ik}\) are comparable if \(T_{ij}\) can be fitted into \(T_{ik}\)’s observation interval and vice versa. Here the observation intervals for \(T_{ij}\) and \(T_{ik}\) are \(C_i-S_{i,jk}-T_{ij}\) and \(C_i-S_{i,jk}-T_{ik}\), respectively. For further details of the rationale, please see Sect. 2 in [1]. In the absence of covariates, the comparability constraint in [1] is defined as:
$$\begin{aligned} T_{ij}\le C_i-S_{i,jk}-T_{ij}\quad \text {and}\quad T_{ik}\le C_i-S_{i,jk}-T_{ik}. \end{aligned}$$
If \(T_{ij}\) and \(T_{ik}\) satisfy the constraint (2), then they will be a comparable pair. It is worth mentioning that the comparability concept also appears in regression problems based on independent truncated observations, see for example [7,8,9], among others.
In the presence of censoring, [1] proposed the comparability constraint and constructed a test statistic for the trend analysis:
where \(Z_{ij}\) is a given trend measure for the jth gap time of subject i, and \(\delta _{i,jk}\) is the comparability indicator. One practical example of the trend measure is the dose level [?]. An assigned measure such that \(Z_{ij}\) is increasing with j can also be used, such as \(Z_{ij}=j\) [10]. Here \(\delta _{i,jk}\) indicates whether the gap time pair \(T_{ij}\) and \(T_{ik}\) of subject i is comparable: if \(T_{ij}\) and \(T_{ik}\) are comparable, then \(\delta _{i,jk}=1\); otherwise, \(\delta _{i,jk}=0\). Thus only comparable pairs are selected and used in the hypothesis testing. Since the last observation \(T_{i,m_i}^+\) is always biased due to the intercept sampling problem [6], \(T_{i,m_i}^+\) is excluded from (3) as well as from the following statistical analysis.
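As a minimal sketch of the pair-selection indicator in the absence of covariates, assuming the constraint requires each gap time to fit into the other's observation interval \(C_i-S_{i,jk}\) minus itself, as described above (our reading of [1]; all names are hypothetical):

```python
def delta_pair(gaps, C, j, k):
    """Comparability indicator for (T_ij, T_ik), j < k (1-based indices
    within the complete gaps): each gap must fit into the other's
    observation interval, i.e. T_ij <= c* - T_ij and T_ik <= c* - T_ik
    with c* = C_i - S_ijk (assumed reading of the constraint)."""
    c = C - (sum(gaps[:k]) - gaps[j - 1] - gaps[k - 1])   # c* = C_i - S_ijk
    return 1 if 2 * gaps[j - 1] <= c and 2 * gaps[k - 1] <= c else 0

# gaps 2, 5, 1 with C = 9: pair (1, 3) has c* = 4; 2*2 <= 4 and 2*1 <= 4
delta_pair([2.0, 5.0, 1.0], C=9.0, j=1, k=3)   # -> 1
delta_pair([2.0, 5.0, 1.0], C=8.0, j=1, k=3)   # -> 0 (c* = 3 < 4)
```

Only pairs with indicator 1 would enter a statistic of the form in (3).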
In the presence of covariates, [1] adapted the comparability constraint to the accelerated failure time model and included the trend measure \(Z_{ij}\) as one component of the covariates, so that the sign of the coefficient of the trend measure can be used to determine the trend. As a result, one only needs to estimate the trend via a parameter estimation procedure. However, as shown in Sect. 2.1 of this paper, under the comparability constraint proposed by [1], the comparable gap time pairs not only have the same distribution but are also independent. Therefore, [1]’s estimation procedure under the accelerated failure time model is subject to information loss. Our simulation studies show that the comparability constraint of [1] uses only about half of the available pairs under moderate censoring; when the censoring is heavy, the information loss becomes even worse (Tables 1 and 2 in Sect. 3).
In this paper, we propose an alternative comparability constraint under the same model as in [1]. Compared with [1]’s estimation procedure, ours employs the same assumptions as in their paper. More importantly, we prove that the comparable pairs under the new comparability constraint have the same conditional distribution but are not conditionally independent. Thus our method is superior to [1]’s. Since our constraint is weaker, we refer to our constraint and [1]’s as the weak comparability and the strong comparability, respectively. Our simulation results also show that the weak comparability recruits more comparable gap time pairs, and the variance of our estimator is smaller than that of the estimator under the strong comparability.
This paper is organized as follows. In Sect. 2, we introduce the concept of weak comparability and establish the asymptotic results. Section 3 presents the simulation results. The proposed method is applied to the HIV Prevention Trials Network 052 data in Sect. 4. The paper concludes with a discussion in Sect. 5.
2 Main Results
2.1 The Strong Comparability Under the Accelerated Failure Time Model
Given subject i, for the jth gap time (\(j=1,\dots ,m_i-1\)), consider the following accelerated failure time model:
where \(\alpha _i\) is a random intercept, \(Z_{ij}\) is a \(p\times 1\) vector of covariates whose values vary within subject i with respect to different gap times, and one component of \(Z_{ij}\) is the trend measure, \(\beta\) is a \(p\times 1\) vector of parameters. For subject i, conditioning on \(\alpha _i\) and \(Z_{ij}\), the error terms \(e_{ij},j=1,\dots ,m_i-1\) are independent from each other and have a common distribution \(G_i\) with mean zero. If \(p=1\) and \(Z_{ij}\) is the trend measure, then model (4) provides a direct interpretation of time trend for gap times: \(\beta =0\) means no trend, \(\beta >0\)/\(\beta <0\) means that \(T_{ij},j=1,\dots ,m_i-1\) tend to be longer/shorter in chronological order, respectively.
Let \(e_{ij}(\beta )=\log T_{ij}-\alpha _i-Z_{ij}^\top \beta\) denote the residual of the jth gap time for subject i. [1] stated that if \(e_{ij}(\beta )\) lies in the observation interval of \(e_{ik}(\beta )\) and, symmetrically, \(e_{ik}(\beta )\) lies in the observation interval of \(e_{ij}(\beta )\), then \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) are comparable. The comparability constraint is given below (p. 792):
Thus \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) constitute a strong comparable pair if they satisfy (5). We want to mention that (5) is a direct extension of (2) in the presence of model (4). Take the first inequality as an example; it is equivalent to
$$\begin{aligned} \log T_{ij}-\alpha _i-Z_{ij}^\top \beta \le \log (C_i-S_{i,jk}-T_{ij})-\alpha _i-Z_{ik}^\top \beta . \end{aligned}$$
In the absence of model (4), the first inequality of the comparability is
$$\begin{aligned} T_{ij}\le C_i-S_{i,jk}-T_{ij}, \end{aligned}$$
which is the same as
$$\begin{aligned} \log T_{ij}\le \log (C_i-S_{i,jk}-T_{ij}). \end{aligned}$$
Here \(C_i-S_{i,jk}-T_{ij}\) is the observation interval of \(T_{ik}\); if \(T_{ij}\) is larger than this interval, then there is no way that \(T_{ij}\) and \(T_{ik}\) are comparable (since \(T_{ij}\) cannot be fitted into \(T_{ik}\)’s observation interval). However, under model (4), \(T_{ij}\) and \(T_{ik}\) cannot be compared directly. Therefore, one needs to adjust for the corresponding covariates; as a result, we need to subtract \(\alpha _i+Z_{ij}^\top \beta\) on the left-hand side (since it is related to \(T_{ij}\)) and subtract \(\alpha _i+Z_{ik}^\top \beta\) on the right-hand side (since it is related to \(T_{ik}\)).
2.2 The Weak Comparability Under the Accelerated Failure Time Model
It is easy to see that (5) is equivalent to:
In the following, we will use (6) to represent the strong comparability constraint, since the nuisance parameter \(\alpha _i\) is eliminated. Based on (6), we propose the weak comparability constraint as follows:
Constraint (7) is obtained by interchanging \(T_{ij}\) and \(T_{ik}\exp \{(Z_{ij}-Z_{ik})^\top \beta \}\) on the left-hand sides of the corresponding inequalities in (6). The intuition lies in the geometrical shape of the constraint, which will be illustrated shortly via Fig. 1. In the following, we first show that if \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy constraint (7), then they follow the same distribution.
The assumptions that we require are as follows:
Assumption 1
Within each subject i, the transformed gap times \(\exp (e_{{ij}}(\beta ))=T_{ij}\exp (-\alpha _i-Z_{ij}^\top \beta )\) are independently distributed given \(\alpha _i\) and \(Z_{ij}\).
Assumption 2
Within each subject i, \(C_i\) is conditionally independent of the random intercept \(\alpha _i\) and the random errors \(\{e_{i1},e_{i2},\dots \}\) given \(Z_{i1},Z_{i2},\dots\).
Assumption 1 takes the covariate effect \(Z_{ij}\) and the random intercept \(\alpha _i\) into consideration; similar assumptions can also be found in [11] and [12]. Assumption 2 means that \(C_i\) is conditionally independent of \(\{T_{i1},T_{i2},\dots \}\) given \(\{Z_{i1},Z_{i2},\dots \}\).
Lemma 1
Under assumptions 1 and 2, given a fixed real value e, if \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the strong comparability constraint (6), then we have
if \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the weak comparability constraint (7), then we will also have
Lemma 1 shows that constraint (7) can be used to find comparable pairs. When \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the strong comparability constraint, denote the joint probability density function of \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) by \(f_{i,jk}^S\), and the marginal probability density functions of \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) by \(f_{ij}^S\) and \(f_{ik}^S\), respectively. When \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the weak comparability constraint, denote the joint and marginal probability density functions by \(f_{i,jk}^W\), \(f_{ij}^W\) and \(f_{ik}^W\). It is easy to see that \(f_{i,jk}^S=f_{ij}^S\times f_{ik}^S\), which means that if \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy the strong comparability constraint, then they are independent of each other. [1] has shown the independence of \(T_{ij}\) and \(T_{ik}\) in the absence of covariates; for further details, please see Sect. 2, subsection ‘Comparable \((t_j,t_k)\)’ in their paper. However, the same result does not hold under weak comparability. That is, if the comparable pairs satisfy the weak comparability constraint, we do not have \(f_{i,jk}^W=f_{ij}^W\times f_{ik}^W\): the weak comparable pairs have the same distribution but are not independent. In other words, the strong comparability automatically imposes an additional constraint (independence) that is not needed for estimation, while the weak comparability does not suffer from this issue. All the above results show that constraint (6) is stronger than (7).
Denote \(\delta _{i,jk}^S(\beta )\) and \(\delta _{i,jk}^W(\beta )\) as the indicator functions for the pair \((T_{ij},T_{ik})\) under the strong and weak comparability, respectively. If \(T_{ij}\) and \(T_{ik}\) satisfy (6), then \(\delta _{i,jk}^S(\beta )=1\), and \(\delta _{i,jk}^S(\beta )=0\) otherwise. Similarly, if \(T_{ij}\) and \(T_{ik}\) satisfy (7), then \(\delta _{i,jk}^W(\beta )=1\), and \(\delta _{i,jk}^W(\beta )=0\) otherwise. We can then estimate the parameter \(\beta\) by minimizing either of the following objective functions:
In the following, we show that \(\delta _{i,jk}^W(\beta )\) is always at least as large as \(\delta _{i,jk}^S(\beta )\) for any \(\beta\). To see this, let \(d=\exp \{(Z_{ik}-Z_{ij})^\top \beta \}\), \(x=T_{ij}\), \(y=T_{ik}\), and \(c=C_i-S_{i,jk}\); then (6) becomes
$$\begin{aligned} x(1+d)\le c\quad \text {and}\quad y(1+d)\le cd. \end{aligned}$$
Meanwhile, (7) is equivalent to
$$\begin{aligned} xd+y/d\le c\quad \text {and}\quad x+y\le c. \end{aligned}$$
In Fig. 1, the shaded area represents the strong comparability constraint, while the hatched area represents the weak comparability constraint. It is easy to see that the hatched area is bigger, which indicates \(\delta _{i,jk}^W(\beta )\ge \delta _{i,jk}^S(\beta )\). As a result, (11) utilizes more gap time pairs and thus produces an estimate with a smaller variance than the estimate obtained from (10). In addition, a simple calculation shows that the hatched area is \(c^2/(1+d)\) while the shaded area is \(c^2d/(1+d)^2\), so the hatched area is \(1+1/d\) times as large as the shaded area, where d depends on \(Z_{ij}\), \(Z_{ik}\) and \(\beta\).
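The geometry above can be checked numerically. In the sketch below, the two regions are encoded in the \((x,y,c,d)\) notation of this subsection; the inequality forms are our reconstruction from the stated areas (the rectangle has area \(c^2d/(1+d)^2\) and the larger region \(c^2/(1+d)\)), so treat them as an illustration rather than the paper's displayed (6) and (7):

```python
import numpy as np

def in_strong(x, y, c, d):
    # assumed rectangle form: x(1+d) <= c and y(1+d) <= c*d
    return (x <= c / (1 + d)) & (y <= c * d / (1 + d))

def in_weak(x, y, c, d):
    # assumed weak form: x*d + y/d <= c and x + y <= c
    return (x * d + y / d <= c) & (x + y <= c)

rng = np.random.default_rng(0)
c, d = 6.0, 2.0
x = rng.uniform(0, c, 200_000)
y = rng.uniform(0, c, 200_000)
s, w = in_strong(x, y, c, d), in_weak(x, y, c, d)

# the strong region is nested inside the weak region
assert not np.any(s & ~w)
area_s, area_w = c * c * s.mean(), c * c * w.mean()
# Monte Carlo areas approx 8 = c^2 d/(1+d)^2 and 12 = c^2/(1+d);
# their ratio approx 1 + 1/d = 1.5
```

The nesting check mirrors the claim \(\delta _{i,jk}^W(\beta )\ge \delta _{i,jk}^S(\beta )\), and the area ratio matches \(1+1/d\).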
2.3 Asymptotic Results
Suppose (10) and (11) achieve their minima at \(\hat{\beta }_n^S\) and \(\hat{\beta }_n^W\), respectively.
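The displayed objectives (10) and (11) are not reproduced in this excerpt. Purely as an illustration, the sketch below minimizes a Gehan-type criterion built from the residual differences \(|e_{ik}(\beta )-e_{ij}(\beta )|\) mentioned in Sect. 3, restricted to weak comparable pairs; the inequality form of the weak constraint is our reconstruction from Sect. 2.2, and all names and data are hypothetical:

```python
import numpy as np

def resid_diff(t_j, t_k, z_j, z_k, beta):
    # e_ik(beta) - e_ij(beta); the random intercept alpha_i cancels out
    return np.log(t_k) - np.log(t_j) - (z_k - z_j) * beta

def delta_weak(t_j, t_k, c, z_j, z_k, beta):
    # assumed weak-constraint form in (x, y, c, d) notation
    d = np.exp((z_k - z_j) * beta)
    return (t_j * d + t_k / d <= c) and (t_j + t_k <= c)

def M_weak(beta, subjects):
    """subjects: list of (gaps, Z, C); gaps are the complete gap times."""
    total = 0.0
    for gaps, Z, C in subjects:
        for j in range(len(gaps)):
            for k in range(j + 1, len(gaps)):
                c = C - (sum(gaps[: k + 1]) - gaps[j] - gaps[k])  # C_i - S_ijk
                if delta_weak(gaps[j], gaps[k], c, Z[j], Z[k], beta):
                    total += abs(resid_diff(gaps[j], gaps[k], Z[j], Z[k], beta))
    return total

# crude grid search for a scalar beta over a candidate range (toy data)
subjects = [([2.0, 3.0, 1.5], [1, 2, 3], 12.0)]
grid = np.linspace(-1.0, 1.0, 201)
beta_hat = grid[int(np.argmin([M_weak(b, subjects) for b in grid]))]
```

Swapping `delta_weak` for a strong-constraint indicator would give the analogue of (10); in practice a proper optimizer would replace the grid search.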
Following the suggestion of one of the referees, we also state the additional assumptions needed to establish the following theorem:
Assumption 3
There exists a matrix \(\Sigma _0(\beta )\) such that \(\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=1}^n cov(\sum _{j<k}\delta _{i,jk}^W(\beta )\text {sgn}[(Z_{ik}-Z_{ij})\{e_{ik}(\beta )-e_{ij}(\beta )\}])=\Sigma _0(\beta )\).
Assumption 4
There exists a vector \(\mu (\beta )\) such that \(\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=1}^n \sum _{j<k}\delta _{i,jk}^W(\beta )\text {sgn}[(Z_{ik}-Z_{ij})\{e_{ik}(\beta )-e_{ij}(\beta )\}]=\mu (\beta )=E\left\{ \sum _{j<k}\delta _{i,jk}^W(\beta )\text {sgn}[(Z_{ik}-Z_{ij})\{e_{ik}(\beta )-e_{ij}(\beta )\}]\right\} .\)
Assumption 5
The probability density functions for \(e_{ij}\) are continuous and bounded.
Since the two objective functions \(M_S(\beta )\) and \(M_W(\beta )\) have a similar form, in the following we will only present the asymptotic result for \(\hat{\beta }_n^W\). Denote \(\beta _0\) as the true value of \(\beta\), then we have:
Theorem 1
Under Assumptions 1 to 5, the estimator \(\hat{\beta }_n^W\) is consistent, and \({n}^{1/2}(\hat{\beta }_n^W-\beta _0)\) converges in distribution to \(N(0,\{\mu ^{\prime }(\beta _0)^\top \}^{-1}\Sigma (\beta _0)\{\mu ^{\prime }(\beta _0)\}^{-1})\), where \(\Sigma (\beta _0)\) is the covariance matrix of \(\sum _{j<k}\delta _{i,jk}^W(\beta _0)\text {sgn}[(Z_{ik}-Z_{ij})\{e_{ik}(\beta _0)-e_{ij}(\beta _0)\}]\), \(\mu ^{\prime }(\beta _0)\) is the partial derivative of
at \(\beta _0\).
Thus \({n}^{1/2}(\hat{\beta }_n^W-\beta _0)\) is asymptotically normally distributed. We need to mention that it is hard to quantify the efficiency gain of our estimator over the strong comparability estimator due to the sandwich structure of the variance. However, as discussed in Sect. 2.1, for subject i, \(\delta _{i,jk}^W(\beta )\) can potentially recruit more gap time pairs than \(\delta _{i,jk}^S(\beta )\); thus the variance under weak comparability will be smaller than the variance under strong comparability.
The estimation of the asymptotic covariance matrix may be difficult since the numerical computations of \(\mu ^{\prime }(\beta _0)\) and \(\Sigma (\beta _0)\) are nontrivial. Bootstrap methods can be applied in practice.
3 Simulation Study
In this section, we evaluate the finite-sample properties of the methods developed in Sect. 2 via extensive simulation studies. We use the resampling approach proposed by [13] to approximate the distributions of \(\hat{\beta }_n^W\) and \(\hat{\beta }_n^S\). First, we generate \(w_i\) independently from the binomial distribution \(\text {Bi}(n,1/n)\); then we minimize the perturbed objective function
and repeat the procedure \(B=200\) times.
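This perturbation scheme can be sketched as follows; `subject_loss` is a hypothetical stand-in for subject i's contribution to the comparable-pair objective, and the quadratic toy loss is purely illustrative:

```python
import numpy as np

def resample_betas(subject_loss, n, grid, B=200, seed=0):
    """Minimize the perturbed objective sum_i w_i * subject_loss(i, beta)
    over a grid, B times, with weights w_i ~ Binomial(n, 1/n)."""
    rng = np.random.default_rng(seed)
    betas = []
    for _ in range(B):
        w = rng.binomial(n, 1.0 / n, size=n)      # perturbation weights
        vals = [sum(w[i] * subject_loss(i, b) for i in range(n)) for b in grid]
        betas.append(grid[int(np.argmin(vals))])
    return np.array(betas)   # its standard deviation estimates the SE

# toy example: a quadratic per-subject loss minimized at beta = 0.2
grid = np.linspace(-1.0, 1.0, 81)
betas = resample_betas(lambda i, b: (b - 0.2) ** 2, n=30, grid=grid, B=50)
```

The empirical standard deviation of the `B` perturbed minimizers approximates the standard error of the point estimate.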
We use model (4) to generate the gap times, where we set \(p=1,~\alpha _i=1\), and assume \(e_{ij}\) follows a normal distribution with mean 0 and variance 1/4. We let the true value of \(\beta\) equal 0 and 0.2, respectively, and generate 500 simulated data sets. We denote the number of subjects in each simulated data set by n, where we choose \(n=30,60,100\). For the ith subject in the lth data set \((i=1,\dots ,n,~l=1,\dots ,500)\), we first generate 10 successive gap times, then use a constant \(C_i\) as the censoring time to select the first \(m_{i,l}\) gap times \(\{T_{i1},\dots ,T_{i,m_{i,l}}\}\), where \(\sum _{j=1}^{m_{i,l}-1}T_{ij}\le C_i\) and \(\sum _{j=1}^{m_{i,l}}T_{ij}>C_i\). We consider two trend measures in the simulation study, \(Z_{ij}=j\) and \(Z_{ij}=j^{1/2}\); the results for \(Z_{ij}=j\) are shown in Tables 1 and 2, and the results for \(Z_{ij}=j^{1/2}\) are shown in Tables 3 and 4.
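The data generation just described can be sketched as follows (hypothetical function name; the returned pair is the complete gaps and the censored remainder):

```python
import numpy as np

def simulate_subject(beta0, C, n_events=10, rng=None):
    """Gap times from model (4): log T_ij = alpha_i + Z_ij*beta0 + e_ij with
    alpha_i = 1, Z_ij = j, e_ij ~ N(0, 1/4); censor at the constant C."""
    rng = rng if rng is not None else np.random.default_rng()
    j = np.arange(1, n_events + 1)
    e = rng.normal(0.0, 0.5, size=n_events)        # sd 1/2 -> variance 1/4
    T = np.exp(1.0 + j * beta0 + e)
    cum = np.cumsum(T)
    m = int(np.searchsorted(cum, C, side="right")) + 1   # m-1 complete gaps
    complete = T[: m - 1]
    censored = C - cum[m - 2] if m > 1 else C            # T+_{i, m_i}
    return complete, censored

gaps, last = simulate_subject(beta0=0.2, C=8.0, rng=np.random.default_rng(1))
# the complete gaps plus the censored remainder always add back up to C
```

With these parameter values the ten generated gaps comfortably exceed the censoring times used in the tables, so the last observed gap is always censored.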
For the ith subject in the lth simulated data set, the observed gap times are \(\{T_{i1},\dots ,T_{i,m_{i,l}}^{+}\}\). Denote the weak and strong comparability indicators for episodes j and k (\(j<k\le m_{i,l}-1\)) in subject i by \(\delta _{i,jk}^{l,W}(\beta )\) and \(\delta _{i,jk}^{l,S}(\beta )\), respectively. We compute the following quantities to compare the efficiency of the weak and strong comparabilities:
-
Mean Total Pairs: It is defined as
$$\begin{aligned} \frac{1}{500}\sum _{l=1}^{500}\sum _{i=1}^{n}\left( {\begin{array}{c}m_{i,l}-1\\ 2\end{array}}\right) . \end{aligned}$$The mean total pairs is the average total number of pairs and measures the maximum capacity allowed in estimation. Under this scenario, every pair is treated as comparable; that is, for \(i=1,\dots ,n\) and \(j<k\le m_{i,l}-1\), we set \(\delta _{i,jk}^{l,W}(\beta _0)=1\) and \(\delta _{i,jk}^{l,S}(\beta _0)=1\).
-
Mean Comparable Pairs: For the weak comparability, it is defined as
$$\begin{aligned} \frac{1}{500}\sum _{l=1}^{500}\sum _{i=1}^{n}\sum _{j<k\le m_i-1}\delta _{i,jk}^{l,W}(\beta _0), \end{aligned}$$while for the strong comparability, it is defined as
$$\begin{aligned} \frac{1}{500}\sum _{l=1}^{500}\sum _{i=1}^{n}\sum _{j<k\le m_i-1}\delta _{i,jk}^{l,S}(\beta _0). \end{aligned}$$The mean comparable pairs measures how much information the strong and weak comparability utilize. The difference in mean comparable pairs between the weak and strong comparability reflects the relative efficiency of the two constraints. If the mean total pairs equals the mean comparable pairs, then every pair is a comparable pair.
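For concreteness, the two bookkeeping quantities can be computed for a single toy data set as below (subject sizes and comparable-pair counts are hypothetical; averaging the same sums over the 500 data sets gives the tabulated values):

```python
from math import comb

# three subjects with m_i - 1 = 4, 3, 2 complete gaps
m_complete = [4, 3, 2]
total_pairs = sum(comb(m, 2) for m in m_complete)   # C(m_i - 1, 2), summed

# hypothetical numbers of pairs flagged comparable per subject
comparable_pairs = sum([5, 2, 1])
# total_pairs = 6 + 3 + 1 = 10; comparable_pairs = 8
```

The gap between the two numbers (here 10 versus 8) is exactly the information lost by the pair-selection constraint.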
From Table 1, we can see that the standard deviations of the estimates under the weak comparability constraint are smaller than those under the strong version. From the table, we can also see that the mean comparable pairs under weak comparability are larger than those under the strong version, as verified graphically in Sect. 2.1. When the censoring time is shorter (\(C_i=8\)), the difference between the standard deviations is larger. When the censoring time is longer (i.e. \(C_i=9\)), \(m_i\) is larger for each subject i (\(i=1,\dots ,n\)), and thus the mean total pairs also increases. Table 2 shows similar results for \(\beta _0=0.2\). The simulation results for \(Z_{ij}=j^{1/2}\) are shown in Tables 3 and 4 and are similar to those in Tables 1 and 2. As mentioned by one of the referees, the empirical coverage probabilities in Tables 3 and 4 seem a bit low compared to the nominal level. We therefore conducted further simulations with larger sample sizes, shown in Tables 5 and 6. As we can see, the empirical coverage probabilities get closer to the nominal level as the sample size increases. In addition, for trend measure \(Z_{ij}=j^{1/2}\), the coverage probability for strong comparability tends to be lower than that for the weak version under moderate or heavy censoring.
One referee asked whether the proposed estimator performs well under other censoring mechanisms. We would like to mention that the censoring mechanism will not affect this comparison, since we only utilize the uncensored gap times and select comparable pairs (under either the strong or the weak comparability). Based on our observation in Sect. 2.1, the comparable pairs under strong comparability are nested within those under weak comparability. So for any censoring mechanism, the standard deviation under weak comparability will be smaller than that under strong comparability. To illustrate this, we conducted a simulation study under type II censoring, where the number of gap times is the same for each subject; the results are shown in Table 7 and coincide with our previous findings. When the number of gap times is larger, the number of building blocks of the U-statistic within subject i (i.e. \(\mid e_{ik}(\beta )-e_{ij}(\beta )\mid\)) also becomes larger; therefore, the standard deviation decreases and the difference between the two comparabilities becomes smaller. As a further illustration, we also conducted a simulation under random censoring, where the censoring time \(C_i\) is assumed to follow an exponential distribution with mean 6; the results are shown in Table 8.
Furthermore, we also conducted a simulation in which the trend measure is misspecified, under both type I and type II censoring. For type I censoring, \(C_i\) is set to 8; for type II censoring, the number of gap times for each subject equals 4. The true trend measure is \(Z_{ij}=j^{1/2}\), while in the model we assume the trend measure is \(Z_{ij}=j\); the results are shown in Tables 9 and 10. From these tables, we can see that both estimators perform well.
In summary, all the results indicate that the weak comparability can utilize more data, and provide a more efficient estimate than the strong comparability.
4 Real Data
We apply the proposed method to the HIV Prevention Trials Network 052 data [14, 15]. Beginning in April 2005, the study enrolled 1763 HIV type 1 serodiscordant couples and randomly assigned them to receive either early or delayed antiretroviral therapy: 886 participants received early therapy at enrollment, while the remaining 877 participants started therapy after two consecutive CD4+ counts fell below 250 cells per cubic millimeter or after an illness indicative of the acquired immunodeficiency syndrome developed. All the couples were followed up until 2015. On May 11, 2011, on the recommendation of an independent NIH/NIAID data and safety monitoring board, all patients in both cohorts were provided early antiretroviral therapy, after the study showed a dramatic 96% reduction in the risk of HIV-1 transmission for the early antiretroviral therapy arm.
In this study, it is essential to maintain high levels of medication adherence to the recommended treatment regimens to achieve effective antiretroviral therapy. Here adherence means a patient’s ability to take medications as prescribed. It is known that at least 95% adherence is needed to achieve an effective HIV treatment [16]. In this study, adherence was measured by pill counts, self-reported counts, and viral load measurements [15]. Counselling is recognized as one of the critical factors affecting adherence [17]; thus all the participants in the study received adherence counselling during each visit [14].
For the time interval between two consecutive visits, we calculate the ratio of the number of pills a participant has taken to the number of pills dispensed to that participant. We use this ratio as the measurement of adherence, and the adherence varies across different time intervals. For each participant’s visit history, we construct the recurrent event data as follows: (1) If the adherence of a participant’s first visit interval is larger than 95%, we count the total consecutive visit days with adherence larger than 95%, and call this consecutive visit episode a high adherence episode. (2) If the adherence of a participant’s first visit interval is smaller than 95%, we count the total consecutive visit days with adherence smaller than 95%, and call this consecutive visit episode a low adherence episode. (3) Apply (1) and (2) alternately to construct the remaining episodes; each episode serves as a gap time. (4) Define an indicator for each episode, which equals 1 if the adherence during that episode is larger than 95%, and 0 otherwise. For instance, if a patient’s data are as in Table 11, then the gap times for this patient are 14, 86 (= 28 + 58), 150 (= 60 + 90), 70, 320 (= 80 + 180 + 60), and 60, with indicators 0, 1, 0, 1, 0, 1. To maintain a high efficacy of the antiretroviral therapy, an ideal pattern would be that the high adherence episodes tend to be longer while the low adherence episodes tend to be shorter. Here we use the proposed trend analysis method to assess the pattern of the adherence alternation. The model we consider is
where \(T_{ij}\) is the jth gap time for the ith participant. Let \(Z_{ij}^{(1)}\) denote the trend measure; here we set \(Z_{ij}^{(1)}=j\) or \(j^{1/2}\). Let \(Z_{ij}^{(2)}\) be an indicator that equals 1 if the jth episode has high adherence, and 0 otherwise. \(\beta _3\) is the coefficient of the interaction between the trend measure and the adherence indicator. We focus on the 886 participants in the early antiretroviral therapy arm, since the treatment pattern for this group is consistent from enrollment to the end of follow-up. The gap times were measured in months. Since some participants’ data were missing, we excluded them, leaving 829 individuals in the analysis. The data set is analyzed under both the full follow-up time (without censoring) and an artificial censoring time (May 11, 2011). The results are shown in Table 12, and all the estimates are significant. From the table, we can see that \(\hat{\beta }_1\) is positive, which indicates that the lengths of the high and low adherence episodes become longer over time. A positive \(\hat{\beta }_2\) shows that the average length of the high adherence episodes is longer than that of the low adherence episodes. A negative \(\hat{\beta }_3\) means the lengths of the high and low adherence episodes change in opposite directions: when the high adherence episodes get longer, the low adherence episodes get shorter, and vice versa. The results under the full follow-up data and the censored data are similar. For both, the standard deviations of \(\hat{\beta }_1\) and \(\hat{\beta }_2\) under weak comparability are smaller than those under strong comparability, though they do not differ much, because the number of gap times is large.
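The episode construction in steps (1)-(4) can be sketched as follows; the visit history below is hypothetical but reproduces the worked example in the text:

```python
from itertools import groupby

def episodes(intervals):
    """intervals: (days, adherence ratio) per visit interval, in order.
    Collapse consecutive intervals on the same side of 95% into episodes;
    returns (gap_times, indicators), indicator 1 = high adherence."""
    gaps, flags = [], []
    for high, group in groupby(intervals, key=lambda iv: iv[1] > 0.95):
        gaps.append(sum(days for days, _ in group))
        flags.append(1 if high else 0)
    return gaps, flags

# hypothetical visit history matching the worked example
visits = [(14, 0.80), (28, 0.97), (58, 0.99), (60, 0.90), (90, 0.85),
          (70, 0.96), (80, 0.70), (180, 0.94), (60, 0.90), (60, 1.00)]
gaps, flags = episodes(visits)
# gaps  -> [14, 86, 150, 70, 320, 60]
# flags -> [0, 1, 0, 1, 0, 1]
```

Each entry of `gaps` then serves as one gap time \(T_{ij}\) and the matching entry of `flags` as the adherence indicator \(Z_{ij}^{(2)}\).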
This is also consistent with the simulation study, where we can see that as the censoring time \(C_i\) becomes longer, the difference between weak comparability and strong comparability becomes smaller.
5 Discussion
In this paper, we propose a new version of the comparability constraint for stratified gap times under the accelerated failure time model. This constraint can be used to identify the time trend for the gap times. Compared with [1]’s comparability constraint, the proposed constraint recruits more data pairs in the estimation procedure, and thus the proposed weak comparability is more efficient. The theoretical and simulation results show that our method outperforms [1]’s. We use the accelerated failure time model for its simple interpretation; however, we plan to extend the idea to more complicated survival models (e.g., the Cox model) in the future. While this paper considers pairwise comparisons of two gap times, we will also extend the idea to compare more than two gap times.
References
Wang M-C, Chen Y (2000) Nonparametric and semiparametric trend analysis for stratified recurrence times. Biometrics 56(3):789–794
Ascher H, Feingold H (1984) Repairable systems reliability: modeling, inference, misconceptions and their causes
Lawless JF, Çiğşar C, Cook RJ (2012) Testing for monotone trend in recurrent event processes. Technometrics 54(2):147–158
Kvaløy JT, Lindqvist BH (2020) A class of tests for trend in time censored recurrent event data. Technometrics 62(1):101–115
Gaynes B, Brown C, Lux L, Ashok M, Coker-Schwimmer E, Hoffman V, Sheitman B, Viswanathan M (2015) Management strategies to reduce psychiatric readmissions. Technical Brief No. 21. (Prepared by the RTI-UNC Evidence-based Practice Center under Contract No. 290-2012-00008-I.)
Wang M-C, Chang S-H (1999) Nonparametric estimation of a recurrent survival function. J Am Stat Assoc 94(445):146–153
Bhattacharya P, Chernoff H, Yang S (1983) Nonparametric estimation of the slope of a truncated regression. Ann Stat 11(2):505–514
Lai T, Ying Z (1992) Asymptotically efficient estimation in censored and truncated regression models. Stat Sin 2(1):17–46
Efron B, Petrosian V (1999) Nonparametric methods for doubly truncated data. J Am Stat Assoc 94(447):824–834
Abelson R, Tukey J (1963) Efficient utilization of non-numerical information in quantitative analysis general theory and the case of simple order. Ann Math Stat 34(4):1347–1369
Huang Y, Chen YQ (2003) Marginal regression of gaps between recurrent events. Lifetime Data Anal 9(3):293–303
Chang S-H (2004) Estimating marginal effects in accelerated failure time models for serial sojourn times among repeated events. Lifetime Data Anal 10(2):175–190
Jin Z, Lin D, Ying Z (2006) Rank regression analysis of multivariate failure time data based on marginal linear models. Scand J Stat 33(1):1–23
Cohen M, Chen Y, McCauley M, Gamble T, Hosseinipour M, Kumarasamy N, Hakim J, Kumwenda J, Grinsztejn B, Pilotto J, Godbole S, Mehendale S (2011) Prevention of HIV-1 infection with early antiretroviral therapy. N Engl J Med 365(6):493–505
Chen Y, Masse B, Wang L, Ou S-S, Li X, Donnell D, McCauley M, Gamble T, Ribauldo H, Cohen M, Thomas R (2012) Statistical considerations for the HPTN 052 study to evaluate the effectiveness of early versus delayed antiretroviral strategies to prevent the sexual transmission of HIV-1 in serodiscordant couples. Contemp Clin Trials 33(6):1280–1286
Paterson D, Swindells S, Mohr J, Brester M, Vergis E, Squier C, Wagener M, Singh N (2000) Adherence to protease inhibitor therapy and outcomes in patients with HIV infection. Ann Intern Med 133(1):21–30
Iacob S, Iacob D, Jugulete G (2017) Improving the adherence to antiretroviral therapy, a difficult but essential task for a successful HIV treatment-clinical points of view and practical considerations. Front Pharmacol 8:831
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix
1.1 Proof of Lemma 1
Proof
Since (8) has been proved in [1], here we only provide a proof for (9). Assume that \(e_{ij}(\beta )\) and \(e_{ik}(\beta )\) satisfy constraint (7). First we consider the left part of (9):
For the first inequality of (7), notice that \(T_{ij}+T_{ik}\le C_i-S_{i,jk}\) is equivalent to \(T_{ij}\le C_i-T_{ik}-S_{i,jk}\); substituting \(T_{ij}\) with \(\exp (\alpha _i+Z_{ij}^\top \beta +e)\) gives:
The second inequality of (7) is equivalent to:
Substituting \(T_{ik}\) with \(\exp (\alpha _i+Z_{ik}^\top \beta +e)\), we have
Thus
Then we consider the right part of (9). Notice that \(T_{ij}+T_{ik}\le C_i-S_{i,jk}\) is equivalent to \(T_{ik}\le C_i-T_{ij}-S_{i,jk}\), thus:
And \(T_{ij}\exp \{(Z_{ik}-Z_{ij})^\top \beta \}+T_{ik}\exp \{(Z_{ij}-Z_{ik})^\top \beta \}\le C_i-S_{i,jk}\) is equivalent to
Substituting \(T_{ij}\) with \(\exp (\alpha _i+Z_{ij}^\top \beta +e)\), we have
Thus
Comparing equations (15) and (16), we conclude that
Thus (9) is proved. \(\square\)
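The symmetry underlying (9) can also be checked numerically. The following is a minimal sketch assuming the AFT form \(T_{ij}=\exp (\alpha _i+Z_{ij}^\top \beta +e_{ij})\) and the two inequalities of (7) as quoted in the proof; all numeric values are illustrative. Swapping the residual pair \((e_{ij},e_{ik})\) exchanges the two inequalities of (7) with each other, so the comparability indicator is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = np.array([0.5, -0.3])  # illustrative regression coefficient

def comparable(e_j, e_k, alpha, Zj, Zk, budget):
    """Check the two inequalities of constraint (7) for one gap-time pair.

    `budget` plays the role of C_i - S_{i,jk}; Zj, Zk are covariate vectors.
    """
    Tj = np.exp(alpha + Zj @ beta + e_j)
    Tk = np.exp(alpha + Zk @ beta + e_k)
    ineq1 = Tj + Tk <= budget
    ineq2 = (Tj * np.exp((Zk - Zj) @ beta)
             + Tk * np.exp((Zj - Zk) @ beta)) <= budget
    return bool(ineq1 and ineq2)

# Swapping e_ij <-> e_ik turns each inequality of (7) into the other,
# so the comparability indicator is preserved -- the content of (9).
agree = True
for _ in range(1000):
    alpha = rng.normal()
    Zj = rng.normal(size=2)
    Zk = rng.normal(size=2)
    e_j, e_k = rng.normal(size=2)
    budget = float(np.exp(rng.normal(1.0, 1.0)))
    agree &= (comparable(e_j, e_k, alpha, Zj, Zk, budget)
              == comparable(e_k, e_j, alpha, Zj, Zk, budget))
print(agree)  # True
```

The check holds exactly, not just on average: exchanging the residuals maps the left-hand side of the first inequality onto the left-hand side of the second and vice versa, which mirrors the algebraic substitutions in the proof above.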
1.2 Proof Sketch of Theorem 1
Proof
To prove the asymptotic normality of the estimator \(\hat{\beta }_n^W\), notice that \(\delta _{i,jk}^W(\beta )\) in (11) is only a constraint used to select comparable pairs, so the derivative of (11) is:
Thus minimizing (11) is equivalent to solving \(M_W^{\prime }(\beta )=0\). For simplicity, we also denote
Then \(M_W^{\prime }(\beta _0)=\sum _{i=1}^nM_{i,W}^{\prime }(\beta _0)\) is a sum of i.i.d. random vectors with \(E\{M_W^{\prime }(\beta _0)\}=0\). Thus under the regularity conditions, \(n^{-1/2}M_W^{\prime }(\beta _0)\) converges to a normal distribution. However, when considering the covariance matrix, the delta method cannot be applied directly since \(M_W^{\prime }(\beta )\) is not differentiable with respect to the unknown parameter \(\beta\). To overcome this difficulty, we first use a smooth approximation of \(M_W^{\prime }(\beta )\), namely \(\phi (\beta )=M_W^{\prime }(\beta _0)+n\mu (\beta )\), where \(\mu (\beta )=E[M_{i,W}^{\prime }(\beta )]\); we assume here that \(\mu (\beta )\) is positive definite. As \(\beta \rightarrow \beta _0\), \(\phi (\beta )\) is a local approximation of \(M_W^{\prime }(\beta )\). Denote \(D(\beta )=\phi (\beta )-M_W^{\prime }(\beta )\), so that \(M_W^{\prime }(\beta )=\phi (\beta )-D(\beta )\). Using the techniques of Lemmas 5 and 6 of [7], we can prove that \(D(\beta )\) satisfies a stochastic equicontinuity condition, namely \(D(\beta )-D(\beta _0)=\{\phi (\beta )-M_W^{\prime }(\beta )\}-\{\phi (\beta _0)-M_W^{\prime }(\beta _0)\}=o_p(n^{-1/2})\) whenever \(\beta -\beta _0=O_p(n^{-1/2})\). Then we have
The proof is completed by applying the functional delta method and the central limit theorem, after substituting \(\beta\) with \(\hat{\beta }_n^W\). \(\square\)
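The key step — \(\sqrt{n}\)-asymptotic normality despite a non-differentiable criterion — can be illustrated generically. The sketch below uses the sample median, not the paper's estimator: the median minimizes the non-smooth criterion \(\sum _i |X_i-b|\), yet \(\sqrt{n}(\hat{m}_n-\theta _0)\) is asymptotically normal with variance \(1/\{4f(\theta _0)^2\}\), which equals \(\pi /2\approx 1.571\) for standard normal data.

```python
import numpy as np

rng = np.random.default_rng(2)

# The sample median minimizes the non-smooth criterion sum_i |X_i - b|,
# yet sqrt(n)*(median - theta0) is asymptotically N(0, 1/(4 f(theta0)^2));
# for N(0,1) data this asymptotic variance is pi/2 ~= 1.571.
n, reps = 400, 4000
meds = np.median(rng.normal(size=(reps, n)), axis=1)
emp_var = n * meds.var()  # empirical variance of sqrt(n) * median
print(round(emp_var, 2))  # close to pi/2
```

The same phenomenon drives the argument above: the non-smooth estimating function is replaced by a smooth local approximation, and the remainder is negligible at the \(n^{-1/2}\) scale.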
1.3 Code
The code for this paper is available at https://bit.ly/3hI9Ciq.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Liu, P., Huang, Y., Chan, K.C.G. et al. Semiparametric Trend Analysis for Stratified Recurrent Gap Times Under Weak Comparability Constraint. Stat Biosci 15, 455–474 (2023). https://doi.org/10.1007/s12561-023-09376-8