Mixture of multivariate t nonlinear mixed models for multiple longitudinal data with heterogeneity and missing values

Wang, Wan-Lun

doi:10.1007/s11749-018-0612-4

Original Paper
Published: 17 September 2018

TEST, Volume 28, pages 196–222 (2019)

Wan-Lun Wang ORCID: orcid.org/0000-0002-0344-7954¹

Abstract

The multivariate t nonlinear mixed-effects model (MtNLMM) has been shown to be effective for analyzing multi-outcome longitudinal data that follow nonlinear growth patterns and exhibit heavy-tailed noise or potential outliers. This paper considers the problem of clustering heterogeneous longitudinal profiles within a mixture framework of MtNLMMs. A finite mixture of multivariate t nonlinear mixed models is proposed to accommodate more complex features of longitudinal data. Intermittent missing values frequently occur when multiple repeated measures are collected. Under a missing at random (MAR) mechanism, a pseudo-data version of the alternating expectation-conditional maximization (AECM) algorithm is developed to carry out maximum likelihood estimation and impute missing values simultaneously. Techniques for clustering incomplete multiple trajectories, recovering missing responses, and allocating future subjects are also investigated. The practical utility of the approach is demonstrated through a real data example from a study of 124 normal and 37 abnormal pregnant women. Simulation studies are provided to validate the proposed approach.
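The allocation idea mentioned in the abstract can be illustrated with a much simpler stand-in model. The sketch below is not the paper's method: it assumes a plain finite mixture of multivariate t densities with known parameters, leaves out the nonlinear mixed-effects structure and the AECM estimation entirely, and uses hypothetical names (`log_mvt`, `posterior_probs`, `allocate`) and made-up parameter values. It relies on one standard fact: the marginal distribution of a subset of coordinates of a multivariate t vector is again multivariate t with the same degrees of freedom. Under a MAR assumption, a subject with missing entries can therefore be assigned to a component using only the observed coordinates.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def log_mvt(y, mu, sigma, nu):
    """Log-density of a p-variate t distribution (location mu, scale sigma, df nu)."""
    p = len(y)
    low = cholesky(sigma)
    # Forward substitution solves low * z = (y - mu); z'z is the squared
    # Mahalanobis distance of y from mu.
    z = []
    for i in range(p):
        s = sum(low[i][k] * z[k] for k in range(i))
        z.append((y[i] - mu[i] - s) / low[i][i])
    maha = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(p))
    return (math.lgamma((nu + p) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * p * math.log(nu * math.pi) - 0.5 * log_det
            - 0.5 * (nu + p) * math.log1p(maha / nu))


def posterior_probs(y, weights, mus, sigmas, nus):
    """Posterior component probabilities from the observed entries of y.

    Missing entries are coded as None and are marginalized out by keeping
    only the observed sub-vector of each location and the matching
    sub-matrix of each scale matrix.
    """
    obs = [i for i, v in enumerate(y) if v is not None]
    if not obs:
        # Nothing observed: the posterior equals the mixing proportions.
        total = sum(weights)
        return [w / total for w in weights]
    y_o = [y[i] for i in obs]
    logs = []
    for w, mu, sigma, nu in zip(weights, mus, sigmas, nus):
        mu_o = [mu[i] for i in obs]
        sigma_oo = [[sigma[i][j] for j in obs] for i in obs]
        logs.append(math.log(w) + log_mvt(y_o, mu_o, sigma_oo, nu))
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def allocate(y, weights, mus, sigmas, nus):
    """Index of the component with the largest posterior probability."""
    probs = posterior_probs(y, weights, mus, sigmas, nus)
    return max(range(len(probs)), key=probs.__getitem__)


# Illustrative (made-up) two-component setting in two dimensions.
weights = [0.5, 0.5]
mus = [[0.0, 0.0], [5.0, 5.0]]
sigmas = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
nus = [4.0, 4.0]

print(allocate([0.1, -0.2], weights, mus, sigmas, nus))  # fully observed
print(allocate([4.8, None], weights, mus, sigmas, nus))  # second entry missing
```

Working on the log scale and subtracting the largest log term before exponentiating keeps the normalization stable when one component is far more likely than the others. In a fitted model the known parameters above would be replaced by estimates.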

Acknowledgements

The author would like to express her deepest gratitude to the Co-Editor, the Associate Editor and two anonymous reviewers for their insightful comments and suggestions that greatly improved this paper. This research was supported by MOST 107-2628-M-035-001-MY3 awarded by the Ministry of Science and Technology of Taiwan.

Author information

Authors and Affiliations

Department of Statistics, Graduate Institute of Statistics and Actuarial Science, Feng Chia University, Taichung, 40724, Taiwan
Wan-Lun Wang

Corresponding author

Correspondence to Wan-Lun Wang.

Electronic supplementary material

Supplementary material 1 (pdf 113 KB)

About this article

Cite this article

Wang, WL. Mixture of multivariate t nonlinear mixed models for multiple longitudinal data with heterogeneity and missing values. TEST 28, 196–222 (2019). https://doi.org/10.1007/s11749-018-0612-4

Received: 01 January 2018
Accepted: 03 September 2018
Published: 17 September 2018
Issue Date: 12 March 2019
DOI: https://doi.org/10.1007/s11749-018-0612-4
