Robust semiparametric modeling of mean and covariance in longitudinal data

  • Original Paper
Japanese Journal of Statistics and Data Science

Abstract

Longitudinal data often suffer from heavy-tailed errors and outliers, which can significantly reduce efficiency and lead to invalid inferences. Robust techniques are essential, especially in joint mean-covariance modeling, as the estimation of the covariance matrix is more sensitive to heavy-tailed errors and outliers than the estimation of the mean. Motivated by the modified Cholesky decomposition of the covariance matrix, we propose a novel semiparametric method that uses robust techniques to simultaneously estimate the mean, autoregressive coefficients, and innovation variance. We provide a practical algorithm for this method and investigate the asymptotic properties of the mean and covariance estimators. Numerical simulations demonstrate that the proposed method is efficient and stable when the dataset is contaminated with outliers and heavy-tailed errors. The new robust technique yields statistically interpretable inferences in real data analysis, whereas traditional approaches fail to provide any acceptable inferences.
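The modified Cholesky decomposition mentioned above can be illustrated numerically. The sketch below is not the authors' estimator; it only shows the standard decomposition \(T \Sigma T^{\top} = D\) (as in Pourahmadi, 1999), where \(T\) is unit lower triangular with the negatives of the autoregressive coefficients below the diagonal and \(D\) is diagonal with the innovation variances. The function name `modified_cholesky` and the AR(1) example covariance are assumptions made for illustration.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) such that T @ sigma @ T.T == D.

    T is unit lower triangular; its below-diagonal entries are the
    negatives of the autoregressive coefficients. D is diagonal and
    holds the innovation variances. (Illustrative helper, not from the paper.)
    """
    sigma = np.asarray(sigma, dtype=float)
    L = np.linalg.cholesky(sigma)   # sigma = L @ L.T, L lower triangular
    d = np.diag(L)                  # square roots of the innovation variances
    C = L / d                       # divide column j by d[j]: unit lower triangular
    T = np.linalg.inv(C)            # inverse of a unit lower triangular matrix
    D = np.diag(d ** 2)
    return T, D

# Hypothetical example: AR(1) covariance with correlation 0.5 over 4 time points.
rho, m = 0.5, 4
idx = np.arange(m)
sigma = rho ** np.abs(idx[:, None] - idx[None, :])

T, D = modified_cholesky(sigma)
phi = -np.tril(T, k=-1)             # autoregressive coefficients
innovation_var = np.diag(D)         # innovation variances
```

For this AR(1) example, each measurement is regressed only on its immediate predecessor, so `phi` is nonzero only on the first subdiagonal. Because the entries of `phi` and the logarithms of `innovation_var` are unconstrained, they can be modeled by regression, which is what makes this parameterization convenient for joint mean-covariance modeling.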

Acknowledgements

The author Ran sincerely thanks Prof. Kano Yutaka and Dr. Morikawa Kosuke of the Graduate School of Engineering Science at Osaka University for their generosity and their considerable help during a difficult period of Ran's life.

Author information

Corresponding author

Correspondence to Mengfei Ran.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 1009 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ran, M., Yang, Y. & Kano, Y. Robust semiparametric modeling of mean and covariance in longitudinal data. Jpn J Stat Data Sci 6, 625–648 (2023). https://doi.org/10.1007/s42081-023-00204-3

  • DOI: https://doi.org/10.1007/s42081-023-00204-3
