Generalized linear mixed models for correlated binary data with t-link

Statistics and Computing

Abstract

A critical issue in modeling binary response data is the choice of the link function. We introduce a new link based on Student's t-distribution (t-link) for correlated binary data. The t-link generalizes the common probit-normal link by adding one parameter that controls the heaviness of the tails of the link. We propose an EM algorithm for computing maximum likelihood estimates in generalized linear mixed t-link models for correlated binary data. In contrast with recent developments (Tan et al. in J. Stat. Comput. Simul. 77:929–943, 2007; Meza et al. in Comput. Stat. Data Anal. 53:1350–1360, 2009), this algorithm uses closed-form expressions at the E-step rather than Monte Carlo simulation. The proposed algorithm relies on available formulas for the mean and variance of a truncated multivariate t-distribution. The new method is illustrated with a real data set on respiratory infection in children and a simulation study.

References

  • Albert, J., Chib, S.: Bayesian analysis of binary and polychotomous response data. J. Am. Stat. Assoc. 88, 669–679 (1993)

  • Breslow, N., Clayton, D.: Approximate inference in generalized linear mixed models. J. Am. Stat. Assoc. 88, 9–25 (1993)

  • Chib, S., Greenberg, E.: Analysis of multivariate probit models. Biometrika 85, 347–361 (1998)

  • Czado, C., Santner, T.: The effect of link misspecification on binary regression inference. J. Stat. Plan. Inference 33, 213–231 (1992)

  • Delyon, B., Lavielle, M., Moulines, E.: Convergence of a stochastic approximation version of the EM algorithm. Ann. Stat. 27, 94–128 (1999)

  • Fernandez, C., Steel, M.F.: Multivariate student-t regression models: pitfalls and inference. Biometrika 86, 153–167 (1999)

  • Genz, A., Bretz, F., Hothorn, T., Miwa, T., Mi, X., Leisch, F., Scheipl, F.: mvtnorm: multivariate normal and t distribution. R package version 0.9-2 (2008). http://CRAN.R-project.org/package=mvtnorm

  • Ho, H.J., Lin, T.I., Chen, H.Y., Wang, W.L.: Some results on the truncated multivariate t distribution. J. Stat. Plan. Inference 142, 25–40 (2012)

  • Højsgaard, S., Halekoh, U., Yan, J.: The R package geepack for generalized estimating equations. J. Stat. Softw. 15, 1–11 (2005)

  • Jamshidian, M.: Adaptive robust regression by using a nonlinear regression program. J. Stat. Softw. 4, 1–25 (1999)

  • Johnson, S., Narasimhan, B.: Package cubature. R package version 1.1-1 (2011). http://cran.r-project.org/web/packages/cubature/index.html

  • Lachos, V.H., Angolini, T., Abanto-Valle, C.A.: On estimation and local influence analysis for measurement errors models under heavy-tailed distributions. Stat. Pap. 52, 567–590 (2011)

  • Lange, K.L., Sinsheimer, J.S.: Normal/independent distributions and their applications in robust regression. J. Comput. Graph. Stat. 2, 175–198 (1993)

  • Lee, Y., Nelder, J.: Double hierarchical generalized linear models. Appl. Stat. 55, 139–185 (2006)

  • Lin, T.I., Lee, J.C.: Estimation and prediction in linear mixed models with skew-normal random effects for longitudinal data. Stat. Med. 27, 1490–1507 (2008)

  • Liu, C.: Robit regression: a simple robust alternative to logistic and probit regression. In: Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives, pp. 227–238. Wiley (2004)

  • Lucas, A.: Robustness of the student t based M-estimator. Commun. Stat., Theory Methods 26, 1165–1182 (1997)

  • Matos, L.A., Prates, M.O., Chen, M.-H., Lachos, V.: Likelihood-based inference for mixed-effects models with censored response using the multivariate-t distribution. Stat. Sin. 23, 1323–1342 (2013)

  • McCulloch, C.: Maximum likelihood variance components estimation for binary data. J. Am. Stat. Assoc. 89, 330–335 (1994)

  • McCulloch, C.E.: Maximum likelihood algorithms for generalized linear mixed models. J. Am. Stat. Assoc. 92, 162–170 (1997)

  • McLachlan, G., Krishnan, T.: The EM Algorithm and Extensions. Wiley, New York (1997)

  • Meng, X., van Dyk, D.: Fast EM-type implementations for mixed effects models. J. R. Stat. Soc. B 60, 559–578 (1998)

  • Meza, C., Jaffrézic, F., Foulley, J.: Estimation in the probit normal model for binary outcomes using the SAEM algorithm. Comput. Stat. Data Anal. 53, 1350–1360 (2009)

  • Pinheiro, J.C., Liu, C.H., Wu, Y.N.: Efficient algorithms for robust estimation in linear mixed-effects models using a multivariate t-distribution. J. Comput. Graph. Stat. 10, 249–276 (2001)

  • R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2013). http://www.R-project.org

  • Robert, C., Casella, G.: Monte Carlo Statistical Methods. Springer, New York (1999)

  • Tan, M., Tian, G., Fang, H.: An efficient MCEM algorithm for fitting generalized linear mixed models for correlated binary data. J. Stat. Comput. Simul. 77, 929–943 (2007)

Acknowledgements

We thank the editor, associate editor and two referees, whose constructive comments led to a much improved presentation. Victor Lachos acknowledges support from CNPq-Brazil (Grant 305054/2011-2) and from FAPESP-Brazil (Grant 2011/17400-6). Marcos Prates would like to acknowledge the partial support of Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG-Brazil).

Author information

Corresponding author

Correspondence to Denise R. Costa.

Appendix

Proof of Proposition 1

First note that if \(\mathbf{X}\sim t_p(\boldsymbol {\mu },\boldsymbol {\varSigma },\nu)\), then we can write

$$\biggl(\frac{\nu+p}{\nu+\delta} \biggr)^rt_p(\mathbf {x}| \boldsymbol {\mu },\boldsymbol {\varSigma },\nu)=c_p(\nu,r)t_p\bigl(\mathbf {x}|\boldsymbol {\mu },\boldsymbol {\varSigma }^*, \nu+2r\bigr). $$

It follows that

$$\begin{aligned} E \biggl[ \biggl(\frac{\nu+p}{\nu+\delta} \biggr)^r\mathbf{X}^{(k)} \biggr] =&c_p(\nu,r) \frac{T_p(\mathbf{a}|\boldsymbol {\mu },\boldsymbol {\varSigma }^*,\nu+2r)}{T_{p}(\mathbf{a}|\boldsymbol {\mu },\boldsymbol {\varSigma },\nu)} \\ &\times E \bigl[\mathbf{X}^{(k)}|\mathbf{X}\leq\mathbf{a} \bigr] , \end{aligned}$$

which concludes the proof. □

Lemma 1

If U∼Gamma(α,β), then for any vector \(\mathbf{B}\in \mathbb{R}^{p}\) and a p×p positive definite matrix Σ,

$$E\bigl[\varPhi_p(\mathbf{B}\sqrt{U}|\mathbf{0},\boldsymbol {\varSigma }) \bigr]=T_p\biggl(\sqrt{\frac{\alpha}{\beta}}\mathbf{B}|\mathbf{0}, \boldsymbol {\varSigma },2\alpha\biggr). $$

Proof

If \(\mathbf{V}\sim N_p(\mathbf{0},\boldsymbol {\varSigma })\), then

$$\begin{aligned} &{E\bigl[\varPhi_p\bigl(\mathbf{B}\sqrt{U}|\mathbf{0},\boldsymbol {\varSigma }\bigr) \bigr]}\\ &{\quad=E_U \bigl[P(\mathbf{V}\leq\mathbf{B}\sqrt{u}|U=u) \bigr]} \\ &{\quad=E_U\biggl[P\biggl(\frac{\mathbf{V}}{(u\beta/\alpha)^{1/2}}\leq\sqrt{ \frac{\alpha}{\beta}}\mathbf{B}|U=u\biggr)\biggr]} \\ &{\quad=P\biggl(\mathbf{T}\leq\sqrt{\frac{\alpha}{\beta}}\mathbf{B}\biggr),} \end{aligned}$$

where \(\mathbf{T}=\frac{\mathbf{V}}{(U\beta/\alpha)^{1/2}}\) clearly has a multivariate Student's t-distribution with 2α degrees of freedom, which concludes the proof. □
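Lemma 1 can be sanity-checked numerically. The sketch below is not from the paper; it is a minimal Monte Carlo check of the univariate case p = 1, with illustrative values α = 2, β = 3 and B = 0.8, comparing the left-hand expectation against the t-distribution CDF on the right:

```python
import numpy as np
from scipy.stats import norm, t

# Lemma 1 with p = 1: E[Phi(B*sqrt(U))] = T(sqrt(alpha/beta)*B; df = 2*alpha)
# for U ~ Gamma(alpha, rate beta). All numeric values are illustrative.
alpha, beta, B = 2.0, 3.0, 0.8
rng = np.random.default_rng(0)

u = rng.gamma(shape=alpha, scale=1.0 / beta, size=200_000)  # U ~ Gamma(alpha, beta)
lhs = norm.cdf(B * np.sqrt(u)).mean()                       # Monte Carlo estimate of E[Phi(B sqrt(U))]
rhs = t.cdf(np.sqrt(alpha / beta) * B, df=2 * alpha)        # T_1(sqrt(alpha/beta) B | 0, 1, 2*alpha)

print(abs(lhs - rhs))  # should be of Monte Carlo order, ~1e-3
```

The agreement holds for any α, β > 0; the Monte Carlo error shrinks at the usual n^{-1/2} rate.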

Details of the EM Algorithm:

Treat \(\mathbf{b}=\{\mathbf{b}_{i}\}^{m}_{i=1}\), \(\mathbf{Z}=\{\mathbf{Z}_{i}\}^{m}_{i=1}\) and \(\mathbf{U}=\{{U}_{i}\}^{m}_{i=1}\) as missing data. From the definition of the latent variable \(\mathbf{Z}\), we have \(\{\mathbf{Y},\mathbf{Z}\}=\mathbf{Z}\). Then, the joint density of the complete data \(\mathbf{Y}_{com}=\{\mathbf{Y},\mathbf{Z},\mathbf{b},\mathbf{U}\}\) is

$$\begin{aligned} f(\mathbf {Y}_{com}|\boldsymbol {\theta }) =& \prod _{i=1}^{m} f(\mathbf {b}_i|u_i, \mathbf {D})f(\mathbf {Z}_i|\mathbf {b}_i,u_i,\beta)h(u_i| \nu) \\ =& \prod_{i=1}^{m} \phi_q \bigl(\mathbf {b}_i|\mathbf{0},u^{-1}_i\mathbf {D}\bigr) \phi_{n_i}\bigl(\mathbf {Z}_i|\boldsymbol {\mu }_i,u^{-1}_i \mathbf{I}_{n_i}\bigr) \\ &{} \times G(u_i|\nu/2,\nu/2). \end{aligned}$$
(9)

To complete the description of how to employ the EM-type algorithm for ML estimation of the t-GLMM, we must derive the four conditional expectations of the complete-data sufficient statistics: \(E[U_{i}|\mathbf {Y}_{i}]\), \(E[U_{i}\mathbf {Z}_{i}|\mathbf {Y}_{i}]\), \(E[U_{i}\mathbf {b}_{i}|\mathbf {Y}_{i}]\) and \(E[U_{i}\mathbf {b}_{i}\mathbf {b}^{\top}_{i}|\mathbf {Y}_{i}]\). To calculate them, we first derive the conditional predictive distribution of the missing data, which is given by:

$$\begin{aligned} f(\mathbf {b},\mathbf {Z},\mathbf{U}|\mathbf {Y},\boldsymbol {\theta }) =& f(\mathbf {Z}|\mathbf {Y},\mathbf {b},\mathbf{u}, \boldsymbol {\theta }) f(\mathbf {b}|\mathbf {Y},\mathbf{u},\boldsymbol {\theta }) \\ &{}\times f(\mathbf{u}|\mathbf {Y},\boldsymbol {\theta }) = f(\mathbf {b}|\mathbf {Y},\mathbf {Z},\mathbf{u},\mathbf {D}) \\ &{}\times f(\mathbf{u}|\mathbf {Y},\mathbf {Z},\boldsymbol {\theta }) f(\mathbf {Z}|\mathbf {Y},\boldsymbol {\theta }). \end{aligned}$$
(10)

Since f(b|Y,Z,u,θ) is proportional to (9), we obtain the following result:

$$\begin{aligned} f(\mathbf {b}|\mathbf {Y},\mathbf {Z},\mathbf{u},\boldsymbol {\theta }) =& \prod_{i=1}^{m} f(\mathbf {b}_i|\mathbf {Y}_i,\mathbf {Z}_i,u_i, \boldsymbol {\theta }) \\ =& \prod_{i=1}^{m} f(\mathbf {b}_i| \mathbf {Z}_i,u_i,\boldsymbol {\theta }) \\ =& \prod_{i=1}^{m} \phi_q \bigl(\mathbf {b}_i|\boldsymbol {\varDelta }_i\bigl(\mathbf {Z}_i- \mathbf {X}^{\top}_i\boldsymbol {\beta }\bigr),u^{-1}_i \boldsymbol {\varLambda }_i\bigr), \end{aligned}$$

where \(\boldsymbol {\varDelta }_{i}=\mathbf {D}\mathbf {W}^{\top}_{i} \boldsymbol {\varOmega }_{i}^{-1}\), \(\boldsymbol {\varLambda }_{i}=\mathbf {D}-\mathbf {D}\mathbf {W}^{\top}_{i}\boldsymbol {\varOmega }^{-1}_{i}\mathbf {W}_{i}\mathbf {D}\) and \(\boldsymbol {\varOmega }_{i}=\mathbf {W}_{i}\mathbf {D}\mathbf {W}^{\top}_{i}+\mathbf{I}_{n_{i}}\), i=1,…,m. To derive the second term on the right-hand side of (10), we use the following result from Chib and Greenberg (1998)

$$\begin{aligned} P(\mathbf {Y}_i =&y_i|\mathbf {b}_i, \mathbf {Z}_i,u_i,\boldsymbol {\theta }) \\ =& \mathbb{I}_{(\mathbf {Z}_i \in \mathbb{B}_i)} \\ =& \prod_{j=1}^{n_i} \{\mathbb{ I}_{(Z_{ij}>0)}\mathbb{I}_{(Y_{ij}=1)}+\mathbb{I}_{(Z_{ij} \le 0)} \mathbb{I}_{(Y_{ij}=0)} \}, \end{aligned}$$
(11)

which indicates that, given \(\mathbf {Z}_i\), the conditional probability of \(\mathbf {Y}_i\) is independent of \(\mathbf {b}_i\) and \(u_i\). Hence, expression (11) implies \(P(\mathbf {Y}_{i}=\mathbf {y}_{i}|\mathbf {Z}_{i},\boldsymbol {\theta })=\mathbb{I}_{(\mathbf {Z}_{i} \in \mathbb{B}_{i})}\). Since the conditional distribution of \(\mathbf {Z}_i\) given \(u_i\) and \(\boldsymbol{\theta}\) is normal and \(U_i\sim\mathrm{Gamma}(\nu/2,\nu/2)\), the marginal distribution of \(\mathbf {Z}_i|\boldsymbol{\theta}\) is \(t_{n_{i}}(\mathbf {X}_{i}\boldsymbol {\beta },\boldsymbol {\varOmega }_{i},\nu)\). Furthermore, from

$$\begin{aligned} f(\mathbf {Z}_i|\mathbf {Y}_i,\boldsymbol {\theta }) \propto & f(\mathbf {Z}_i, \mathbf {Y}_i|\boldsymbol {\theta }) \\ =& f(\mathbf {Z}_i|\boldsymbol {\theta })P(\mathbf {Y}_i=\mathbf {y}_i| \mathbf {Z}_i,\boldsymbol {\theta }) \\ =& t_{n_i}\bigl(\mathbf {Z}_i|X^{\top}_i \boldsymbol {\beta },\boldsymbol {\varOmega }_i,\nu\bigr) \mathbb{I}_{(\mathbf {Z}_i \in \mathbb{B}_i)} \end{aligned}$$

we obtain

$$\begin{aligned} f(\mathbf {Z}|\mathbf {Y}_{obs},\boldsymbol {\theta }) =& \prod _{i=1}^{m} f(\mathbf {Z}_i|\mathbf {Y}_i, \boldsymbol {\theta }) \\ =& \prod _{i=1}^{m} Tt_{n_i}\bigl(\mathbf {Z}_i|X^{\top}_i \boldsymbol {\beta },\boldsymbol {\varOmega }_i,\nu,\mathbb{ B}_i\bigr). \end{aligned}$$
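The scale-mixture representation used above (Z normal given U = u with covariance scaled by 1/u, and U ∼ Gamma(ν/2, ν/2), so that Z is marginally t with ν degrees of freedom) can itself be verified by simulation. The one-dimensional sketch below is not from the paper and uses illustrative values ν = 5, μ = 0.3, Ω = 1:

```python
import numpy as np
from scipy.stats import t

nu, mu = 5.0, 0.3                  # illustrative values; Omega = 1
rng = np.random.default_rng(1)
n = 200_000

u = rng.gamma(shape=nu / 2, scale=2.0 / nu, size=n)  # U ~ Gamma(nu/2, rate nu/2)
z = mu + rng.standard_normal(n) / np.sqrt(u)         # Z | U = u ~ N(mu, 1/u)

a = 1.0
emp = (z <= a).mean()              # empirical P(Z <= a)
exact = t.cdf(a - mu, df=nu)       # marginal t_nu CDF at a
print(abs(emp - exact))            # Monte Carlo error, ~1e-3
```

The same mixing argument underlies the truncated-t predictive distribution of Z given Y: truncation to the region B simply restricts the support without changing the mixture structure.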

We now use the prior results together with the following property: if Z|θ follows t p (μ,Σ,ν) with U∼Gamma(ν/2,ν/2), then \(E[U|\mathbf {Z}]=\frac{\nu+p}{\nu+\delta}\) (Lachos et al. 2011), where δ denotes the Mahalanobis distance. It follows that:

$$\begin{aligned} &{E[U_i|\mathbf {Y}_i]= E \bigl[E[U_i| \mathbf {Y}_i,\mathbf {Z}_i,\boldsymbol {\theta }]|\mathbf {Y}_i,\boldsymbol {\theta }\bigr]} \\ &{\phantom{E[U_i|\mathbf {Y}_i]}= E \biggl[\frac{\nu+n_i}{\nu+\delta_i}|\mathbf {Y}_i,\boldsymbol {\theta }\biggr] = \bar{ \mathbf {Z}}^0_i,} \\ &{E[U_i\mathbf {Z}_i|\mathbf {Y}_i] = E \bigl[ \mathbf {Z}_i E[U_i|\mathbf {Y}_i,\mathbf {Z}_i,\boldsymbol {\theta }]| \mathbf {Y}_i,\boldsymbol {\theta }\bigr]} \\ &{\phantom{E[U_i\mathbf {Z}_i|\mathbf {Y}_i]}= E \biggl[ \biggl(\frac{\nu+n_i}{\nu+\delta_i} \biggr)\mathbf {Z}_i| \mathbf {Y}_i,\boldsymbol {\theta }\biggr] = \bar{\mathbf {Z}}^1_i,} \\ &{E[U_i\mathbf {b}_i|\mathbf {Y}_i]= {E \bigl[E \bigl[U_i E[\mathbf {b}_i|\mathbf {Y}_i, \mathbf {Z}_i,U_i,\boldsymbol {\theta }]|\mathbf {Y}_i,\mathbf {Z}_i, \boldsymbol {\theta }\bigr]|\mathbf {Y}_i,\boldsymbol {\theta }\bigr]}} \\ &{\phantom{E[U_i\mathbf {b}_i|\mathbf {Y}_i]}=\boldsymbol {\varDelta }_i\bigl(\bar{\mathbf {Z}}^1_i-\bar{ \mathbf {Z}}^0_i\mathbf {X}_i\boldsymbol {\beta }\bigr),} \\ &{E\bigl[U_i\mathbf {b}_i\mathbf {b}^{\top}_i| \mathbf {Y}_i\bigr]}\\ &{\quad=E \bigl[E \bigl[U_i E\bigl[ \mathbf {b}_i\mathbf {b}^{\top}_i|\mathbf {Y}_i, \mathbf {Z}_i,U_i,\boldsymbol {\theta }\bigr]|\mathbf {Y}_i, \mathbf {Z}_i,\boldsymbol {\theta }\bigr]|\mathbf {Y}_i,\boldsymbol {\theta }\bigr]} \\ &{\quad= {\boldsymbol {\varLambda }_i+\boldsymbol {\varDelta }_i\bigl(\bar{ \mathbf {Z}}^2_i+\boldsymbol {\gamma }_i\boldsymbol {\gamma }_i^{\top} \bar{\mathbf {Z}}^0_i-\bar{\mathbf {Z}}^1_i \boldsymbol {\gamma }^{\top}_i-\boldsymbol {\gamma }_i\bar{ \mathbf {Z}}^{1^{\top}}_i\bigr)\boldsymbol {\varDelta }^{\top}_i},} \\ &{\mathbf {Z}_i|\mathbf {Y}_i\sim t_{n_i}(\boldsymbol {\gamma }_i, \boldsymbol {\varOmega }_i,\nu)\mathbb{I}_{\mathbb{B}_i}(\mathbf {Z}_i),} \end{aligned}$$

where \(\bar{\mathbf {Z}}^{2}_{i}=E [\frac{\nu+n_{i}}{\nu+\delta_{i}}\mathbf {Z}_{i}\mathbf {Z}^{\top}_{i}|\mathbf {Y}_{i} ]\), \(\delta_{i}=(\mathbf {Z}_{i}-\boldsymbol {\gamma }_{i})^{\top} \boldsymbol {\varOmega }^{-1}_{i}(\mathbf {Z}_{i}-\boldsymbol {\gamma }_{i})\), \(\boldsymbol {\varDelta }_{i}=\mathbf {D}\mathbf {W}^{\top}_{i} \boldsymbol {\varOmega }_{i}^{-1}\), \(\boldsymbol {\varLambda }_{i}=\mathbf {D}-\mathbf {D}\mathbf {W}^{\top}_{i}\boldsymbol {\varOmega }^{-1}_{i}\mathbf {W}_{i}\mathbf {D}\), \(\boldsymbol {\varOmega }_{i}=\mathbf {W}_{i}\mathbf {D}\mathbf {W}^{\top}_{i}+\mathbf{I}_{n_{i}}\), γ i =X i β, and \(\mathbb{B}_{i}=B_{i1}\times\cdots\times B_{in_{i}}\), where B ij is the interval (0,∞) if y ij =1 and the interval (−∞,0] if y ij =0.
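As a concrete illustration, the matrix quantities Ω_i, Δ_i, Λ_i and the conditional weight (ν + n_i)/(ν + δ_i) that drive the closed-form E-step can be assembled as below. This is a sketch under assumed shapes (W_i of size n_i × q), not the authors' implementation; computing the truncated moments \(\bar{\mathbf{Z}}^0_i\), \(\bar{\mathbf{Z}}^1_i\), \(\bar{\mathbf{Z}}^2_i\) additionally requires the truncated multivariate-t formulas of Ho et al. (2012), which are omitted here:

```python
import numpy as np

def glmm_matrices(D, W):
    """Omega_i = W_i D W_i^T + I,  Delta_i = D W_i^T Omega_i^{-1},
    Lambda_i = D - D W_i^T Omega_i^{-1} W_i D,  for W_i of shape (n_i, q)."""
    n = W.shape[0]
    Omega = W @ D @ W.T + np.eye(n)
    Delta = D @ W.T @ np.linalg.inv(Omega)
    Lam = D - Delta @ W @ D
    return Omega, Delta, Lam

def em_weight(z, gamma, Omega, nu):
    """Conditional weight E[U_i | Z_i = z] = (nu + n_i) / (nu + delta_i),
    with delta_i the Mahalanobis distance of z from gamma under Omega."""
    d = z - gamma
    delta = float(d @ np.linalg.solve(Omega, d))
    return (nu + len(z)) / (nu + delta)

def truncation_region(y):
    """B_ij = (0, inf) if y_ij = 1 and (-inf, 0] if y_ij = 0."""
    lower = np.where(y == 1, 0.0, -np.inf)
    upper = np.where(y == 1, np.inf, 0.0)
    return lower, upper
```

For instance, with Ω_i the identity and z = γ_i, the weight reduces to (ν + n_i)/ν, its maximum value; observations far from γ_i are downweighted, which is the source of the t-link's robustness.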

About this article

Cite this article

Prates, M.O., Costa, D.R. & Lachos, V.H. Generalized linear mixed models for correlated binary data with t-link. Stat Comput 24, 1111–1123 (2014). https://doi.org/10.1007/s11222-013-9423-3
