Matched survival data in a co-twin control design

Lifetime Data Analysis


When using the co-twin control design for analysis of event times, one needs a model to address the possible within-pair association. One such model is the shared frailty model, in which the random frailty variable creates the desired within-pair association. Standard inference for this model requires independence between the random effect and the covariates. We study how violations of this assumption affect inference for the regression coefficients and conclude that substantial bias may occur. We propose an alternative way of making inference for the regression parameters by using a fixed-effects model for survival in matched pairs. Fitting this model to data generated from the frailty model provides consistent and asymptotically normal estimates of regression coefficients, regardless of whether the independence assumption is met.
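
The claim that a within-pair analysis stays consistent even when frailty and covariates are dependent can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a unit baseline hazard, no censoring, a single time-fixed covariate, and a pair-stratified Cox partial likelihood (in the spirit of Holt and Prentice 1974) as the fixed-effects model. The function names `simulate_pairs` and `fit_within_pair` are hypothetical.

```python
import numpy as np

def simulate_pairs(n_pairs, beta, rng):
    """Simulate twin pairs from a shared frailty model (illustrative assumptions)."""
    # A pair-level confounder u drives both the frailty and the covariates,
    # so the frailty and the covariates are dependent.
    u = rng.normal(size=n_pairs)
    frailty = np.exp(u)                                # shared frailty Z_i > 0
    x = u[:, None] + rng.normal(size=(n_pairs, 2))     # covariates X_i1, X_i2
    # Conditional hazard Z_i * exp(beta * X_ij) with unit baseline hazard.
    rate = frailty[:, None] * np.exp(beta * x)
    t = rng.exponential(1.0 / rate)                    # uncensored event times
    return x, t

def fit_within_pair(x, t, n_iter=25):
    """Maximise the pair-stratified partial likelihood by Newton's method."""
    first = np.argmin(t, axis=1)                       # index of twin failing first
    rows = np.arange(len(first))
    # Covariate of the first-failing twin minus that of the co-twin.
    d = x[rows, first] - x[rows, 1 - first]
    beta = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-beta * d))            # P(observed ordering | beta)
        score = np.sum(d * (1.0 - p))
        info = np.sum(d**2 * p * (1.0 - p))
        beta += score / info
    return beta

rng = np.random.default_rng(0)
x, t = simulate_pairs(20000, 0.5, rng)
print("within-pair estimate of beta:", fit_within_pair(x, t))
```

Under these assumptions the frailty cancels from each pair's likelihood contribution, which is why the estimate does not depend on how the frailty relates to the covariates.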


  • Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10:1100–1120

  • Andersen PK, Borgan Ø, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer, New York

  • Boomsma D, Busjahn A, Peltonen L (2002) Classical twin studies and beyond. Nat Rev Genet 3:872–882

  • Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Assoc 88:9–25

  • Carlsen K, Høybye MT, Dalton SO, Tjønneland A (2008) Social inequality and incidence of and survival from breast cancer in a population-based study in Denmark, 1994–2003. Eur J Cancer 44:1996–2002

  • Cox DR, Wong MY (2010) A note on the sensitivity to assumptions of a generalized linear mixed model. Biometrika 97:209–214

  • Duchateau L, Janssen P (2007) The frailty model. Springer, New York

  • Gross ST, Huber C (1987) Matched pair experiments: Cox and maximum likelihood estimation. Scand J Stat 14:27–41

  • Holt JD, Prentice RL (1974) Survival analyses in twin studies and matched pair experiments. Biometrika 61:17–30

  • Horwitz AV, Videon TM, Schmitz MF, Davis D (2003) Rethinking twins and environment: possible social sources for assumed genetic influences in twin research. J Health Soc Behav 44:111–129

  • Hougaard P (2000) Analysis of multivariate survival data. Springer, New York

  • Kogevinas M, Pearce N, Susser M, Boffetta P (1997) Social inequalities and cancer. IARC Scientific Publications, No. 138, Lyon

  • Madsen M, Andersen PK, Gerster M, Andersen A-MN, Christensen K, Osler M (2011) Education and incidence of breast cancer: does the association replicate within twin pairs? Br J Cancer 104:520–523

  • Neuhaus JM, Kalbfleisch JD (1998) Between- and within-cluster covariate effects in the analysis of clustered data. Biometrics 54(2):638–645

  • Wienke A (2011) Frailty models in survival analysis. Chapman and Hall/CRC, Boca Raton



We are grateful to Arvid Sjölander from the Karolinska Institute in Stockholm for sharing his unpublished report, “Between-within models for survival analyses”, with us.

Author information

Corresponding author

Correspondence to Per Kragh Andersen.



The proofs follow closely the lines of Gross and Huber (1987) and build on the following conditions.

  1. (A1)
    $$\begin{aligned} \int \limits _0^{\tau }\alpha _0(t)dt<\infty . \end{aligned} \tag{8}$$
  2. (A2)

    There exists a neighborhood \(\mathcal{B }\) of \(\beta _0\) such that for each of the scalar, vector or matrix random functions \(S(t,\beta )\) defined in (10)–(16) there is a deterministic function \(s(t,\beta )\) satisfying

    $$\begin{aligned} \sup _{\beta \in \mathcal{B },t\in [0,\tau ]}\Vert S(t,\beta )-s(t,\beta )\Vert \stackrel{P}{\rightarrow }0, \end{aligned} \tag{9}$$

    where the random functions are

    $$\begin{aligned} S_{L0}(s,\beta )=\frac{1}{n}\sum _{i=1}^n E[Z_i\mid \mathcal{F }_{s-}]\log \left( \frac{S_{0i}(s,\beta )}{S_{0i}(s,\beta _0)}\right) S_{0i}(s,\beta _0), \end{aligned} \tag{10}$$
    $$\begin{aligned} S_{L1}(s,\beta )=\frac{1}{n}\sum _{i=1}^n E[Z_i\mid \mathcal{F }_{s-}]\log \left( \frac{S_{0i}(s,\beta )}{S_{0i}(s,\beta _0)}\right) S_{1i}(s,\beta _0), \end{aligned} \tag{11}$$
    $$\begin{aligned} S_{L2}(s,\beta )=\frac{1}{n}\sum _{i=1}^n E[Z_i\mid \mathcal{F }_{s-}]\left\{ \log \left( \frac{S_{0i}(s,\beta )}{S_{0i}(s,\beta _0)}\right) \right\} ^2S_{0i}(s,\beta _0), \end{aligned} \tag{12}$$
    $$\begin{aligned} S_1(s,\beta )=\frac{1}{n}\sum _{i=1}^n E[Z_i\mid \mathcal{F }_{s-}]\frac{S_{0i}(s,\beta )}{S_{0i}(s,\beta _0)}S_{1i}(s,\beta ), \end{aligned} \tag{13}$$
    $$\begin{aligned} S_2(s,\beta )=\frac{1}{n}\sum _{i=1}^n E[Z_i\mid \mathcal{F }_{s-}]\frac{S_{0i}(s,\beta )}{S_{0i}(s,\beta _0)}S_{2i}(s,\beta ), \end{aligned} \tag{14}$$
    $$\begin{aligned} S_{\Delta 2}(s,\beta )=\frac{1}{n}\sum _{i=1}^n E[Z_i\mid \mathcal{F }_{s-}]S_{0i}(s,\beta _0)V_i(s,\beta ), \end{aligned} \tag{15}$$
    $$\begin{aligned} S_{\Delta 4}(s,\beta )=\frac{1}{n}\sum _{i=1}^nE[Z_i\mid \mathcal{F }_{s-}]S_{0i}(s,\beta _0)V_i(s,\beta )^{\otimes 2}. \end{aligned} \tag{16}$$
  3. (A3)

    The deterministic \(s\)-functions, i.e. the limits of (10)–(16), are bounded for \(t\in [0,\tau ]\) and \(\beta \in \mathcal{B }\), and the matrix

    $$\begin{aligned} \Sigma _\tau =\int \limits _0^\tau s_{\Delta 2}(s,\beta _0)\alpha _0(s)ds \end{aligned} \tag{17}$$

    is positive definite.

  4. (A4)

    Derivatives of the functions \(s(\cdot ,\beta )\) are limits in probability of the derivatives of the corresponding random functions, \(S(\cdot ,\beta )\).


Proof (Theorem 1). Define

$$\begin{aligned} \Delta (\beta ,t)=\frac{1}{n}\left( C_t(\beta )-C_t(\beta _0)\right) . \end{aligned} \tag{18}$$

The compensator of this process is

$$\begin{aligned} \tilde{\Delta }(\beta ,t)=\frac{1}{n}\sum _{i=1}^n\sum _{j=1}^2\int \limits _0^t\left[ (\beta -\beta _0)^T \mathbf{X}_{ij}(s)-\log \left( \frac{S_{0i}(s,\beta )}{S_{0i}(s,\beta _0)}\right) \right] d\Lambda _{ij}^{\mathcal{O }}(s) \end{aligned} \tag{19}$$

where \(\Lambda _{ij}^\mathcal{O }(t)=\int _0^t\lambda _{ij}^\mathcal{O }(s)ds\), cf. (4). The corresponding martingale, i.e. the difference between (18) and (19) is

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n\sum _{j=1}^2\int \limits _0^t\left[ (\beta -\beta _0)^T \mathbf{X}_{ij}(s)-\log \left( \frac{S_{0i}(s,\beta )}{S_{0i}(s,\beta _0)}\right) \right] dM_{ij}^{\mathcal{O }}(s) \end{aligned} \tag{20}$$

which has predictable variation process

$$\begin{aligned}&\frac{1}{n^2}\sum _{i=1}^n\sum _{j=1}^2\int \limits _0^t \left[ (\beta -\beta _0)^T \mathbf{X}_{ij}(s)-\log \left( \frac{S_{0i}(s,\beta )}{S_{0i}(s,\beta _0)}\right) \right] ^2\\&\quad \times Y_{ij}(s)\exp \left( \beta _0^T\mathbf{X}_{ij}(s)\right) E[Z_i\mid \mathcal{F }_{s-}]\alpha _0(s)ds. \end{aligned}$$

Due to the conditions stated in (8) and (9) (use (11), (12) and (14)), this converges in probability to \(0\) as \(n\rightarrow \infty \). Now, using Lenglart’s inequality (Andersen et al. 1993, Sect. II.5.2), (20) converges to \(0\) in probability as \(n\rightarrow \infty \).
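
For reference, the form of Lenglart's inequality typically used at this step can be written as follows for a local square integrable martingale \(M\) with predictable variation process \(\langle M\rangle \). The symbols \(\eta \) and \(\delta \) are introduced here for illustration and are not the paper's notation:

$$\begin{aligned} P\left( \sup _{t\in [0,\tau ]}|M(t)|>\eta \right) \le \frac{\delta }{\eta ^2}+P\left( \langle M\rangle (\tau )>\delta \right) \quad \text{for all } \eta ,\delta >0. \end{aligned}$$

Hence, if \(\langle M\rangle (\tau )\stackrel{P}{\rightarrow }0\), choosing \(\delta \) small relative to \(\eta ^2\) shows that \(\sup _{t\in [0,\tau ]}|M(t)|\stackrel{P}{\rightarrow }0\) as well.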

It further follows from (8) and (9) (use (10) and (13)) that, for \(\beta \in \mathcal{B }\) and \(n\rightarrow \infty \), \(\tilde{\Delta }(\beta ,\tau )\) and thereby also \(\Delta (\beta ,\tau )\) converge pointwise in probability to

$$\begin{aligned} f(\beta )=\int \limits _0^\tau \left[ (\beta -\beta _0)^T s_1(s,\beta _0)- s_{L0}(s,\beta )\right] \alpha _0(s)ds. \end{aligned}$$


The gradient of \(f\) is

$$\begin{aligned} \frac{\partial }{\partial \beta }f(\beta )=\int \limits _0^\tau \left[ s_1(s,\beta _0)-\frac{\partial }{\partial \beta }\left( s_{L0}(s,\beta )\right) \right] \alpha _0(s)ds \end{aligned}$$

where

$$\begin{aligned} \frac{\partial s_{L0}(s,\beta )}{\partial \beta }= \lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=1}^n E[Z_i\mid \mathcal{F }_{s-}]\frac{S_{1i}(s,\beta )}{S_{0i}(s,\beta )}S_{0i}(s,\beta _0) \end{aligned}$$

which for \(\beta =\beta _0\) equals \(s_1(s,\beta _0)\) and hence

$$\begin{aligned} \frac{\partial f(\beta )}{\partial \beta }\bigg |_{\beta =\beta _0}=0. \end{aligned}$$
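
The derivative of \(s_{L0}\) rests on a chain-rule step that can be made explicit. As a sketch, assume the usual pair-specific sums \(S_{0i}(s,\beta )=\sum _{j=1}^2Y_{ij}(s)\exp \left( \beta ^T\mathbf{X}_{ij}(s)\right) \) and \(S_{1i}(s,\beta )=\partial S_{0i}(s,\beta )/\partial \beta \). These definitions are not restated in this appendix and are an assumption here. Then

$$\begin{aligned} \frac{\partial }{\partial \beta }\log \left( \frac{S_{0i}(s,\beta )}{S_{0i}(s,\beta _0)}\right) =\frac{S_{1i}(s,\beta )}{S_{0i}(s,\beta )}, \end{aligned}$$

since \(S_{0i}(s,\beta _0)\) does not depend on \(\beta \). At \(\beta =\beta _0\) the denominator cancels against the factor \(S_{0i}(s,\beta _0)\), leaving \(\frac{1}{n}\sum _{i=1}^nE[Z_i\mid \mathcal{F }_{s-}]S_{1i}(s,\beta _0)=S_1(s,\beta _0)\), whose limit is \(s_1(s,\beta _0)\).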


Furthermore,

$$\begin{aligned} -\frac{\partial ^2 f(\beta )}{\partial \beta ^2} =\int \limits _0^\tau s_{\Delta 2}(s,\beta )\alpha _0(s)ds \end{aligned}$$

which is positive definite at \(\beta =\beta _0\), cf. (17). This ensures that \(f(\beta )\) is concave with a unique maximum at \(\beta _0\). Then the proof can be completed following the same arguments as Andersen et al. (1993, p. 498), using Andersen and Gill (1982, Theorem II.1). \(\square \)


Proof (Theorem 2). First, we use the martingale CLT (Andersen et al. 1993, Theorem II.5.1) to show that

$$\begin{aligned} n^{-1/2}U_\tau ({\beta _{0}})\mathop {\rightarrow }\limits ^\mathcal{D }\mathcal{N }(0,\Sigma _\tau ). \end{aligned} \tag{22}$$

This requires a Lindeberg condition similar to the one in Gross and Huber (1987, p. 33) and follows the lines of Andersen et al. (1993, pp. 499, 523) as well as Gross and Huber (1987), using (9) on (16). Next, a Taylor expansion of \(U_\tau (\beta )\) around \(\beta _0\) gives that

$$\begin{aligned} U_\tau (\beta )-U_\tau (\beta _0)=-\mathcal{I }(\beta ^*)\cdot (\beta -\beta _0) \end{aligned}$$

where \(\beta ^*\) is a \(p\)-vector for which the \(j\)th coordinate is on the line segment between the \(j\)th coordinates of \(\beta \) and \(\beta _0\). In particular, setting \(\beta =\widehat{\beta }\) and using \(U_\tau (\widehat{\beta })=0\), we get that

$$\begin{aligned} n^{-1/2}U_\tau (\beta _0)=\frac{1}{n}\mathcal{I }(\beta ^*) n^{1/2}\left( \widehat{\beta }-\beta _0\right) . \end{aligned}$$

In order to show (22), it must be shown that

$$\begin{aligned} \langle n^{-1/2}U_\tau ({\beta _0})\rangle \mathop {\rightarrow }\limits ^\mathcal{P }\Sigma _\tau . \end{aligned}$$

This follows from the fact that

$$\begin{aligned} \left\langle n^{-1/2} U(\beta _0)\right\rangle (\tau ) = \frac{1}{n}\sum _{i=1}^n\int \limits _0^\tau \alpha _0(s)E[Z_i\mid \mathcal{F }_{s-}] S_{0i}(s,\beta _0)V_i(s,\beta _0)ds, \end{aligned}$$

cf. (15) and (17). \(\square \)
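
The asymptotic normality argued above can be checked informally by Monte Carlo. The sketch below is an illustration under assumptions not taken from the paper: unit baseline hazard, no censoring, one time-fixed covariate, and a pair-stratified partial likelihood as the fixed-effects model. The function names are hypothetical. It standardises each estimate by the square root of the observed information and looks at the spread of the resulting values across replicates.

```python
import numpy as np

def z_score_once(n_pairs, beta0, rng):
    """One replicate: simulate pairs, fit within pairs, return a standardised estimate."""
    u = rng.normal(size=n_pairs)                       # pair-level confounder
    x = u[:, None] + rng.normal(size=(n_pairs, 2))     # covariates depend on u
    rate = np.exp(u)[:, None] * np.exp(beta0 * x)      # frailty exp(u) depends on u too
    t = rng.exponential(1.0 / rate)
    first = np.argmin(t, axis=1)
    rows = np.arange(n_pairs)
    d = x[rows, first] - x[rows, 1 - first]
    beta = 0.0
    for _ in range(25):                                # Newton iterations
        p = 1.0 / (1.0 + np.exp(-beta * d))
        beta += np.sum(d * (1.0 - p)) / np.sum(d**2 * p * (1.0 - p))
    # Observed information evaluated at the final estimate.
    p = 1.0 / (1.0 + np.exp(-beta * d))
    info = np.sum(d**2 * p * (1.0 - p))
    return (beta - beta0) * np.sqrt(info)

def standardized_estimates(n_reps, n_pairs, beta0, seed):
    """Collect standardised estimates over independent replicates."""
    rng = np.random.default_rng(seed)
    return np.array([z_score_once(n_pairs, beta0, rng) for _ in range(n_reps)])

z = standardized_estimates(n_reps=300, n_pairs=500, beta0=0.5, seed=1)
print("mean:", z.mean(), "standard deviation:", z.std())
```

If the normal approximation is adequate in this setting, the standardised values should have mean near 0 and standard deviation near 1. A histogram or normal quantile plot of `z` would give a more detailed check.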

About this article

Cite this article

Gerster, M., Madsen, M. & Andersen, P.K. Matched survival data in a co-twin control design. Lifetime Data Anal 20, 38–50 (2014).
