Abstract
When using the co-twin control design for the analysis of event times, one needs a model that addresses the possible within-pair association. One such model is the shared frailty model, in which a random frailty variable creates the desired within-pair association. Standard inference for this model requires independence between the random effect and the covariates. We study how violations of this assumption affect inference for the regression coefficients and conclude that substantial bias may occur. We propose an alternative way of making inference for the regression parameters by using a fixed-effects model for survival in matched pairs. Fitting this model to data generated from the frailty model provides consistent and asymptotically normal estimates of the regression coefficients, regardless of whether the independence assumption is met.
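The consistency claim can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a gamma frailty with mean 1, a covariate that depends on the frailty through its logarithm (so the independence assumption is violated), exponential event times without censoring, and a fixed-effects fit obtained from a Cox partial likelihood stratified on pair. The function names `simulate_pairs` and `fit_stratified_cox` are hypothetical.

```python
import math
import random


def simulate_pairs(n_pairs, beta, theta, rng):
    """Simulate uncensored pair data from a shared gamma frailty model.

    Illustrative assumption: the covariate is log(frailty) plus noise,
    so frailty and covariate are NOT independent.
    """
    pairs = []
    for _ in range(n_pairs):
        z = rng.gammavariate(1.0 / theta, theta)  # frailty: mean 1, variance theta
        members = []
        for _ in range(2):
            x = math.log(z) + rng.gauss(0.0, 1.0)
            # Exponential event time with hazard z * exp(beta * x)
            t = rng.expovariate(z * math.exp(beta * x))
            members.append((t, x))
        pairs.append(members)
    return pairs


def fit_stratified_cox(pairs, n_iter=25):
    """Newton-Raphson for the pair-stratified partial likelihood (scalar beta).

    Without censoring, each pair contributes a single term that compares
    the member failing first with the other member of the pair.
    """
    # d = covariate of the first-failing member minus covariate of the other
    diffs = []
    for (t1, x1), (t2, x2) in pairs:
        diffs.append(x1 - x2 if t1 < t2 else x2 - x1)

    beta = 0.0
    for _ in range(n_iter):
        score = 0.0
        info = 0.0
        for d in diffs:
            # Model probability that the other member had failed first
            p = 1.0 / (1.0 + math.exp(beta * d))
            score += d * p                 # derivative of -log(1 + exp(-beta * d))
            info += d * d * p * (1.0 - p)  # minus the second derivative
        beta += score / info
    return beta


if __name__ == "__main__":
    rng = random.Random(1)
    data = simulate_pairs(n_pairs=5000, beta=0.5, theta=1.0, rng=rng)
    print("stratified estimate of beta:", fit_stratified_cox(data))
```

Under these assumptions the estimate should land close to the true value 0.5 although frailty and covariate are dependent, because the pair-specific frailty cancels from every within-pair comparison.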
References
Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10:1100–1120
Andersen PK, Borgan Ø, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer, New York
Boomsma D, Busjahn A, Peltonen L (2002) Classical twin studies and beyond. Nat Rev Genet 3:872–882
Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Assoc 88:9–25
Carlsen K, Høybye MT, Dalton SO, Tjønneland A (2008) Social inequality and incidence of and survival from breast cancer in a population-based study in Denmark, 1994–2003. Eur J Cancer 44:1996–2002
Cox DR, Wong MY (2010) A note on the sensitivity to assumptions of a generalized linear mixed model. Biometrika 97:209–214
Duchateau L, Janssen P (2007) The frailty model. Springer, New York
Gross ST, Huber C (1987) Matched pair experiments: Cox and maximum likelihood estimation. Scand J Stat 14:27–41
Holt JD, Prentice RL (1974) Survival analyses in twin studies and matched pair experiments. Biometrika 61:17–30
Horwitz AV, Videon TM, Schmitz MF, Davis D (2003) Rethinking twins and environment: possible social sources for assumed genetic influences in twin research. J Health Soc Behav 44:111–129
Hougaard P (2000) Analysis of multivariate survival data. Springer, New York
Kogevinas M, Pearce N, Susser M, Boffetta P (1997) Social inequalities and cancer. IARC Scientific Publications, No. 138, Lyon
Madsen M, Andersen PK, Gerster M, Andersen AMN, Christensen K, Osler M (2011) Education and incidence of breast cancer: does the association replicate within twin pairs? Br J Cancer 104:520–523
Neuhaus JM, Kalbfleisch JD (1998) Between- and within-cluster covariate effects in the analysis of clustered data. Biometrics 54:638–645
Wienke A (2011) Frailty models in survival analysis. Chapman and Hall/CRC, Boca Raton
Acknowledgments
We are grateful to Arvid Sjölander from the Karolinska Institute in Stockholm for sharing his unpublished report, “Between-within models for survival analyses”, with us.
Appendix
The proofs follow closely the lines of Gross and Huber (1987) and build on the following conditions.

(A1)
$$\int_0^{\tau}\alpha_0(t)\,dt<\infty . \tag{8}$$

(A2)
There exists a neighborhood \(\mathcal{B}\) of \(\beta_0\) such that for each of the scalar, vector or matrix random functions \(S(t,\beta)\) defined in (10)–(16), there is a deterministic function \(s(t,\beta)\) satisfying
$$\sup_{\beta \in \mathcal{B},\,t\in [0,\tau]}\Vert S(t,\beta)-s(t,\beta)\Vert \stackrel{P}{\rightarrow} 0, \tag{9}$$
where
$$S_{L0}(s,\beta)=\frac{1}{n}\sum_{i=1}^n E[Z_i\mid \mathcal{F}_{s-}]\log \left( \frac{S_{0i}(s,\beta)}{S_{0i}(s,\beta_0)}\right) S_{0i}(s,\beta_0), \tag{10}$$
$$S_{L1}(s,\beta)=\frac{1}{n}\sum_{i=1}^n E[Z_i\mid \mathcal{F}_{s-}]\log \left( \frac{S_{0i}(s,\beta)}{S_{0i}(s,\beta_0)}\right) S_{1i}(s,\beta_0), \tag{11}$$
$$S_{L2}(s,\beta)=\frac{1}{n}\sum_{i=1}^n E[Z_i\mid \mathcal{F}_{s-}]\left\{ \log \left( \frac{S_{0i}(s,\beta)}{S_{0i}(s,\beta_0)}\right) \right\}^2 S_{0i}(s,\beta_0), \tag{12}$$
$$S_1(s,\beta)=\frac{1}{n}\sum_{i=1}^n E[Z_i\mid \mathcal{F}_{s-}]\frac{S_{0i}(s,\beta)}{S_{0i}(s,\beta_0)}S_{1i}(s,\beta), \tag{13}$$
$$S_2(s,\beta)=\frac{1}{n}\sum_{i=1}^n E[Z_i\mid \mathcal{F}_{s-}]\frac{S_{0i}(s,\beta)}{S_{0i}(s,\beta_0)}S_{2i}(s,\beta), \tag{14}$$
$$S_{\Delta 2}(s,\beta)=\frac{1}{n}\sum_{i=1}^n E[Z_i\mid \mathcal{F}_{s-}]S_{0i}(s,\beta_0)V_i(s,\beta), \tag{15}$$
$$S_{\Delta 4}(s,\beta)=\frac{1}{n}\sum_{i=1}^n E[Z_i\mid \mathcal{F}_{s-}]S_{0i}(s,\beta_0)V_i(s,\beta)^{\otimes 2}. \tag{16}$$
(A3)
The deterministic \(s\)-functions, i.e. the limits of (10)–(16), are bounded for \(t\in [0,\tau]\) and \(\beta \in \mathcal{B}\), and the matrix
$$\Sigma_\tau =\int_0^\tau s_{\Delta 2}(s,\beta_0)\alpha_0(s)\,ds \tag{17}$$
is positive definite.

(A4)
Derivatives of the functions \(s(\cdot ,\beta )\) are limits in probability of the derivatives of the corresponding random functions, \(S(\cdot ,\beta )\).
Proof
(Theorem 1) Define
The compensator of this process is
where \(\Lambda_{ij}^\mathcal{O}(t)=\int_0^t\lambda_{ij}^\mathcal{O}(s)\,ds\), cf. (4). The corresponding martingale, i.e. the difference between (18) and (19), is
which has predictable variation process
Due to the conditions stated in (8) and (9) (use (11), (12) and (14)), this converges in probability to \(0\) as \(n\rightarrow \infty \). Now, using Lenglart’s inequality (Andersen et al. 1993, Sect. II.5.2), (20) converges to \(0\) in probability as \(n\rightarrow \infty \).
It further follows from (8) and (9) (use (10) and (13)) that, for \(\beta \in \mathcal{B}\) and \(n\rightarrow \infty\), \(\tilde{\Delta}(\beta ,\tau)\) and thereby also \(\Delta(\beta ,\tau)\) converge pointwise in probability towards
Now,
with
which for \(\beta =\beta _0\) equals \(s_1(s,\beta _0)\) and hence
Furthermore,
which is positive definite in \(\beta =\beta_0\), cf. (17). This ensures that \(f(\beta)\) is concave with a unique maximum in \(\beta_0\). The proof can then be completed following the same arguments as in Andersen et al. (1993, p. 498), using Andersen and Gill (1982, Theorem II.1). \(\square\)
Proof
(Theorem 2) First, we use the martingale CLT (Andersen et al. 1993, Theorem II.5.1) to show that
This requires a Lindeberg condition similar to the one in Gross and Huber (1987, p. 33) and follows the lines of Andersen et al. (1993, pp. 499, 523) as well as Gross and Huber (1987), using (9) on (16). Next, a Taylor expansion of \(U_\tau (\beta)\) around \(\beta_0\) gives that
where \(\beta^*\) is a \(p\)-vector whose \(j\)th coordinate lies on the line segment between the \(j\)th coordinates of \(\beta\) and \(\beta_0\). In particular, since \(U_\tau(\widehat{\beta})=0\), we get that
In order to show (22), it must be shown that
This follows from the fact that
Cite this article
Gerster, M., Madsen, M. & Andersen, P.K. Matched survival data in a co-twin control design. Lifetime Data Anal 20, 38–50 (2014). https://doi.org/10.1007/s10985-013-9256-6