Bayesian Joint Semiparametric Mean–Covariance Modeling for Longitudinal Data

Communications in Mathematics and Statistics

Abstract

Joint parsimonious modeling of the mean and covariance is important for analyzing longitudinal data, because it improves the efficiency of parameter estimation and makes the variability easy to interpret. The main risk is that it may lead to inefficient or biased estimators when the model is misspecified. A semiparametric model is a good alternative. In this paper, a Bayesian approach is proposed for modeling the mean and covariance simultaneously by using semiparametric models and the modified Cholesky decomposition. We use a generalized prior to avoid knot selection when approximating the nonlinear part with B-splines, and propose a Markov chain Monte Carlo scheme based on the Metropolis–Hastings algorithm for computation. Simulation studies and real data analysis show that the proposed approach yields highly efficient estimators for the parameters and nonparametric parts in the mean, while providing a parsimonious estimate of the covariance structure.

Author information

Corresponding author

Correspondence to Weiping Zhang.

Additional information

This research is supported by the National Key Research and Development Plan (No. 2016YFC0800100) and the NSF of China (Nos. 11671374, 71631006).

Appendix

Here we provide a detailed derivation of the posterior full conditional distributions used in Sect. 3. For (3.1), combining the Gaussian likelihood with the normal prior on \(\beta \) gives

$$\begin{aligned}&\pi (\beta |Y,X,f_0,\Sigma )\propto f(Y|X,f_0,\beta ,\Sigma )\pi (\beta )\\&\quad \propto \mathrm{exp}\left\{ -\frac{1}{2}\sum _{i=1}^m (Y_i-X_i\beta -f_0(t_i))'\Sigma _i^{-1}(Y_i-X_i\beta -f_0(t_i))\right. \\&\left. \qquad -\frac{1}{2}(\beta -\beta _0)'\Delta _0^{-1}(\beta -\beta _0)\right\} \\&\quad = N_p\left( \Delta _1^{-1}\left( \Delta _0^{-1}\beta _0+\sum _{i=1}^mX'_i\Sigma _i^{-1}(Y_i-f_0(t_i))\right) ,\Delta _1^{-1}\right) , \end{aligned}$$

where \(\Delta _1=\Delta _0^{-1}+\sum _{i=1}^mX'_i\Sigma _i^{-1}X_i\).
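As an illustration only, the Gaussian update for \(\beta \) can be sketched in Python with NumPy. The function names, the list-of-arrays data layout, and the use of `numpy.random.Generator` are assumptions made for this sketch; they are not taken from the paper.

```python
import numpy as np

def beta_full_conditional(Y, X, f0, Sigma, beta0, Delta0):
    """Mean and covariance of the Gaussian full conditional of beta.

    Hypothetical layout: Y, X, f0, Sigma are lists over subjects i = 1..m,
    with Y[i] of shape (n_i,), X[i] of shape (n_i, p), f0[i] = f_0(t_i) of
    shape (n_i,), and Sigma[i] of shape (n_i, n_i).
    """
    Delta0_inv = np.linalg.inv(Delta0)
    precision = Delta0_inv.copy()       # accumulates Delta_1
    rhs = Delta0_inv @ beta0            # accumulates Delta_0^{-1} beta_0 + sum_i (...)
    for Yi, Xi, f0i, Si in zip(Y, X, f0, Sigma):
        Si_inv_Xi = np.linalg.solve(Si, Xi)    # Sigma_i^{-1} X_i
        precision += Xi.T @ Si_inv_Xi          # X_i' Sigma_i^{-1} X_i
        rhs += Si_inv_Xi.T @ (Yi - f0i)        # X_i' Sigma_i^{-1} (Y_i - f_0(t_i))
    cov = np.linalg.inv(precision)             # Delta_1^{-1}
    cov = 0.5 * (cov + cov.T)                  # remove round-off asymmetry
    return cov @ rhs, cov

def draw_beta(rng, Y, X, f0, Sigma, beta0, Delta0):
    """One Gibbs draw of beta from N_p(mean, Delta_1^{-1})."""
    mean, cov = beta_full_conditional(Y, X, f0, Sigma, beta0, Delta0)
    return rng.multivariate_normal(mean, cov)
```

With a very diffuse prior (large \(\Delta _0\)) and \(\Sigma _i=I\), the conditional mean reduces to the ordinary least-squares fit of \(Y_i-f_0(t_i)\) on \(X_i\), which is a convenient sanity check.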

Since \(f_0=B_0\psi _0\), for (3.2) we have

$$\begin{aligned}&\pi (\psi _0|\beta ,\gamma ,\lambda ,f_1)\propto f(Y|X,f_0,\beta ,\Sigma )p(\psi _0|\tau _0)\\&\quad \propto \mathrm{exp}\left\{ -\frac{1}{2}\sum _{i=1}^m (Y_i-X_i\beta -f_0(t_i))'\Sigma _i^{-1}(Y_i-X_i\beta -f_0(t_i))-\frac{1}{2\tau _0^2}\psi _0'\Omega \psi _0\right\} \\&\quad = N_L\left( P_0^{-1}\sum _{i=1}^mB_{0i}'\Sigma _i^{-1}(Y_i-X_i\beta ),P_0^{-1}\right) , \end{aligned}$$

where \(P_0=\sum _{i=1}^mB_{0i}'\Sigma _i^{-1}B_{0i}+\Omega /\tau _0^2\).
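The step from the exponent to the Gaussian form is a completion of the square. As a sketch, write \(R_i=Y_i-X_i\beta \) and \(b_0=\sum _{i=1}^mB_{0i}'\Sigma _i^{-1}R_i\) (shorthand introduced here for illustration, not notation from the paper), use \(f_0(t_i)=B_{0i}\psi _0\), and take the penalty as \(\psi _0'\Omega \psi _0/(2\tau _0^2)\), in line with the definition of \(P_0\):

$$\begin{aligned}&-\frac{1}{2}\sum _{i=1}^m (R_i-B_{0i}\psi _0)'\Sigma _i^{-1}(R_i-B_{0i}\psi _0)-\frac{1}{2\tau _0^2}\psi _0'\Omega \psi _0\\&\quad =-\frac{1}{2}\left( \psi _0'P_0\psi _0-2\psi _0'b_0\right) +\mathrm{const}\\&\quad =-\frac{1}{2}(\psi _0-P_0^{-1}b_0)'P_0(\psi _0-P_0^{-1}b_0)+\mathrm{const}, \end{aligned}$$

where const collects the terms that do not involve \(\psi _0\). The last line is the kernel of \(N_L(P_0^{-1}b_0,P_0^{-1})\), assuming \(\Omega \) is symmetric so that \(P_0\) is symmetric.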

By (3.3), \(\epsilon _i\sim N(0,D_i)\), and the priors (2.8), the posterior full conditional distribution of \(\gamma \) is

$$\begin{aligned}&\pi (\gamma |Y,X,\beta ,\lambda , f_0, f_1)\propto \prod _{i=1}^m f(Y_i-\mu _i|X_i,\beta ,\lambda ,f_0,f_1)\pi (\gamma ) \\&\quad \propto \prod _{i=1}^m \prod _{j=1}^{n_i}f(Y_{ij}-\mu _{ij}|X_i,\beta ,\lambda ,f_0,f_1, Y_{i1}-\mu _{i1},\ldots ,Y_{i(j-1)}-\mu _{i(j-1)})\pi (\gamma ) \\&\quad \propto \mathrm{exp}\left\{ -\frac{1}{2}\sum _{i=1}^m (Y_i-\mu _i-W_i\gamma )'D_i^{-1}(Y_i-\mu _i-W_i\gamma )\right. \\&\left. \qquad -\frac{1}{2}(\gamma -\gamma _0)'\Gamma _0^{-1}(\gamma -\gamma _0)\right\} \\&\quad = N_q\left( \Gamma _1^{-1}\left( \Gamma _0^{-1}\gamma _0+\sum _{i=1}^mW'_iD_i^{-1}(Y_i-\mu _i)\right) ,\Gamma _1^{-1}\right) , \end{aligned}$$

where \(\Gamma _1=\Gamma _0^{-1}+\sum _{i=1}^mW'_iD_i^{-1}W_i\).
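The quadratic form above treats \(Y_i-\mu _i\) as a regression on \(W_i\) with diagonal error covariance \(D_i\). The sketch below shows one way \(W_i\) could be assembled, assuming the usual modified Cholesky parameterization in which the autoregressive coefficient between occasions \(j\) and \(k<j\) is \(w_{ijk}'\gamma \). The polynomial-in-lag form of \(w_{ijk}\) and both function names are assumptions for illustration; the paper's covariates may differ.

```python
import numpy as np

def lag_design(resid, t, q):
    """Build W_i for one subject (hypothetical helper).

    Assumes phi_jk = w_jk' gamma with w_jk = (1, lag, ..., lag**(q-1)) and
    lag = t[j] - t[k]; this covariate choice is an illustration only.
    Row j of W_i is sum_{k<j} resid[k] * w_jk, so the first row is zero.
    """
    resid = np.asarray(resid, dtype=float)
    t = np.asarray(t, dtype=float)
    n = resid.shape[0]
    W = np.zeros((n, q))
    for j in range(1, n):
        lags = t[j] - t[:j]                          # lags to all earlier occasions
        w = np.vander(lags, N=q, increasing=True)    # shape (j, q): 1, lag, lag^2, ...
        W[j] = resid[:j] @ w
    return W

def innovations(resid, W, gamma):
    """epsilon_i = (Y_i - mu_i) - W_i gamma."""
    return resid - W @ gamma
```

Given \(W_i\) and the diagonal of \(D_i\), the draw of \(\gamma \) has the same structure as the Gaussian update for \(\beta \), with \((X_i,\Sigma _i)\) replaced by \((W_i,D_i)\) and the response replaced by \(Y_i-\mu _i\).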

For (3.5) and (3.6), by (3.3), \(\epsilon _i\sim N(0,D_i)\), and the priors (2.8), we directly obtain

$$\begin{aligned}&\pi (\lambda |\beta ,\gamma ,f_0,f_1)\propto f(Y-\mu | \mu ,f_0,f_1)\pi (\lambda )\\&\quad \propto \prod _{i=1}^m|D_i|^{-1/2}\mathrm{exp}\left\{ -\frac{1}{2}\sum _{i=1}^m(Y_i-\mu _i-W_i\gamma )'D_i^{-1}(Y_i-\mu _i-W_i\gamma )\right. \\&\left. \qquad -\frac{1}{2}(\lambda -\lambda _0)'\Lambda _0^{-1}(\lambda -\lambda _0)\right\} ,\\&\pi (\psi _1|\beta ,\gamma ,\lambda ,f_0)\propto f(Y-\mu |\mu ,f_0,f_1)p(\psi _1|\tau _1)\\&\quad \propto \prod _{i=1}^m|D_i|^{-1/2}\mathrm{exp}\left\{ -\frac{1}{2}\sum _{i=1}^m(Y_i-\mu _i-W_i\gamma )'D_i^{-1}(Y_i-\mu _i-W_i\gamma )\right. \\&\left. \qquad -\frac{1}{2\tau _1^2}\psi _1'\Omega \psi _1\right\} , \end{aligned}$$

respectively.
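Neither density is of a standard form, because \(D_i\) depends on \(\lambda \) and \(\psi _1\), so a Metropolis–Hastings step is needed. The sketch below evaluates the log of the unnormalized density of \(\lambda \) and performs one random-walk update. Two points are assumptions and not taken from the text above: the log-linear form \(\log d_{ij}=z_{ij}'\lambda +f_1(t_{ij})\) for the diagonal of \(D_i\), and the random-walk proposal, which is a simple stand-in and may not be the proposal the paper uses. All function and argument names are hypothetical.

```python
import numpy as np

def log_target_lambda(lam, eps, Z, f1, lam0, Lambda0_inv):
    """Log of the unnormalized full conditional of lambda.

    eps[i] = Y_i - mu_i - W_i gamma, Z[i] has rows z_ij', f1[i] = f_1(t_i).
    Assumes D_i = diag(exp(Z_i lam + f_1(t_i))) (illustrative assumption).
    """
    logp = 0.0
    for e, Zi, f1i in zip(eps, Z, f1):
        log_d = Zi @ lam + f1i
        # -1/2 log|D_i| - 1/2 eps_i' D_i^{-1} eps_i for diagonal D_i
        logp += -0.5 * np.sum(log_d + e**2 * np.exp(-log_d))
    diff = lam - lam0
    return logp - 0.5 * diff @ Lambda0_inv @ diff

def rw_mh_step(lam, log_target, step, rng):
    """One random-walk Metropolis-Hastings update; returns (value, accepted)."""
    proposal = lam + step * rng.standard_normal(lam.shape)
    log_ratio = log_target(proposal) - log_target(lam)
    if np.log(rng.uniform()) < log_ratio:
        return proposal, True
    return lam, False
```

The update for \(\psi _1\) can reuse `rw_mh_step` with a log target in which the prior term is replaced by \(-\psi _1'\Omega \psi _1/(2\tau _1^2)\). The step size would need tuning in practice.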

About this article

Cite this article

Liu, M., Zhang, W. & Chen, Y. Bayesian Joint Semiparametric Mean–Covariance Modeling for Longitudinal Data. Commun. Math. Stat. 7, 253–267 (2019). https://doi.org/10.1007/s40304-018-0138-9

  • DOI: https://doi.org/10.1007/s40304-018-0138-9
