Abstract
Joint parsimonious modeling of the mean and covariance is important for analyzing longitudinal data, because it supports efficient parameter estimation and easy interpretation of variability. The main risk is that it may yield inefficient or biased parameter estimators when the model is misspecified. A good alternative is the semiparametric model. In this paper, a Bayesian approach is proposed for modeling the mean and covariance simultaneously by using semiparametric models and the modified Cholesky decomposition. We use a generalized prior to avoid knot selection when approximating the nonlinear part with B-splines, and we propose a Markov chain Monte Carlo scheme based on the Metropolis–Hastings algorithm for computation. Simulation studies and real data analysis show that the proposed approach yields highly efficient estimators for the parameters and nonparametric parts of the mean, while providing a parsimonious estimate of the covariance structure.
Additional information
This research is supported by the National Key Research and Development Plan (No. 2016YFC0800100) and the NSF of China (Nos. 11671374, 71631006).
Appendix
Here we provide a detailed derivation of the posterior full conditional distributions used in Sect. 3. For (3.1), it is straightforward to show that
where \(\Delta _1=\Delta _0^{-1}+\sum _{i=1}^mX'_i\Sigma _i^{-1}X_i\).
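As a sketch of the form such a full conditional usually takes, and not a reproduction of the display from the article, suppose (these are assumptions, not stated in this appendix) that the parametric mean coefficient is called \(\beta\), that its prior is \(\beta\sim N(\beta_0,\Delta_0)\), and that \(y_i\mid \beta,\psi_0\sim N(X_i\beta+B_{0i}\psi_0,\Sigma_i)\). The symbols \(\beta\), \(\beta_0\) and \(y_i\) are introduced here only for illustration. Completing the square in \(\beta\) would then give

\[
\beta\mid\cdot\ \sim\ N\Big(\Delta_1^{-1}\Big\{\Delta_0^{-1}\beta_0+\sum_{i=1}^mX'_i\Sigma_i^{-1}\big(y_i-B_{0i}\psi_0\big)\Big\},\ \Delta_1^{-1}\Big),
\]

so that \(\Delta_1\) plays the role of the posterior precision matrix.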
Noting that \(f_0=B_0\psi_0\), for (3.2) we have
where \(P_0=\sum _{i=1}^mB_{0i}'\Sigma _i^{-1}B_{0i}+\Omega /\tau _0^2\).
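A Gaussian full conditional with precision matrix \(P_0\) can be sampled without forming \(P_0^{-1}\) explicitly. The following is a minimal sketch in Python with NumPy, assuming the conditional has the form \(N(P_0^{-1}b,\,P_0^{-1})\) with \(b=\sum_{i=1}^mB_{0i}'\Sigma_i^{-1}r_i\), where \(r_i\) denotes the response minus the parametric part of the mean. The function names, the residual \(r_i\) and the vector \(b\) are hypothetical and are not taken from the article.

```python
import numpy as np


def psi0_conditional_terms(B_list, Sigma_list, resid_list, Omega, tau0_sq):
    """Accumulate P0 = sum_i B_i' Sigma_i^{-1} B_i + Omega / tau0^2 and
    b = sum_i B_i' Sigma_i^{-1} r_i (b and r_i are assumed, illustrative names)."""
    P = Omega / tau0_sq
    b = np.zeros(Omega.shape[0])
    for B, S, r in zip(B_list, Sigma_list, resid_list):
        SinvB = np.linalg.solve(S, B)   # Sigma_i^{-1} B_i
        P = P + B.T @ SinvB             # adds B_i' Sigma_i^{-1} B_i
        b = b + SinvB.T @ r             # adds B_i' Sigma_i^{-1} r_i (Sigma_i symmetric)
    return P, b


def draw_gaussian_from_precision(P, b, rng):
    """Draw one sample from N(P^{-1} b, P^{-1}) for symmetric positive definite P."""
    L = np.linalg.cholesky(P)                        # P = L L'
    mean = np.linalg.solve(L.T, np.linalg.solve(L, b))  # solves P mean = b
    z = rng.standard_normal(P.shape[0])
    # L'^{-1} z has covariance L'^{-1} L^{-1} = (L L')^{-1} = P^{-1}
    return mean + np.linalg.solve(L.T, z)
```

Working with the Cholesky factor of the precision matrix is a common design choice in Gibbs samplers of this kind, because the same factor serves both the mean solve and the random draw.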
By (3.3), \(\epsilon_i\sim N(0,D_i)\), and the prior (2.8), the posterior full conditional distribution of \(\gamma\) is
where \(\Gamma _1=\Gamma _0^{-1}+\sum _{i=1}^mW'_iD_i^{-1}W_i\).
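To indicate why a normal form with precision \(\Gamma_1\) is plausible here, the following sketch rests on assumptions that are not stated in this appendix: a prior \(\gamma\sim N(\gamma_0,\Gamma_0)\), and a linear representation \(r_i=W_i\gamma+\epsilon_i\) of the mean-centred responses \(r_i\), as arises when the autoregressive coefficients of the modified Cholesky decomposition are linear in \(\gamma\). The symbols \(\gamma_0\) and \(r_i\) are introduced only for illustration. Under these assumptions, standard conjugate calculations give

\[
\gamma\mid\cdot\ \sim\ N\Big(\Gamma_1^{-1}\Big\{\Gamma_0^{-1}\gamma_0+\sum_{i=1}^mW'_iD_i^{-1}r_i\Big\},\ \Gamma_1^{-1}\Big),
\]

with \(\Gamma_1\) acting as the posterior precision matrix.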
For (3.5) and (3.6), it follows directly from (3.3), \(\epsilon_i\sim N(0,D_i)\), and the priors (2.8) that
respectively.
Cite this article
Liu, M., Zhang, W. & Chen, Y. Bayesian Joint Semiparametric Mean–Covariance Modeling for Longitudinal Data. Commun. Math. Stat. 7, 253–267 (2019). https://doi.org/10.1007/s40304-018-0138-9
DOI: https://doi.org/10.1007/s40304-018-0138-9