Abstract
Motivated by data measuring progression of leishmaniosis in a cohort of US dogs, we develop a Bayesian longitudinal model with autoregressive errors to jointly analyze ordinal and continuous outcomes. Multivariate methods can borrow strength across responses and may produce better longitudinal forecasts of disease progression than univariate methods. We explore the performance of our proposed model under simulation and demonstrate that it has improved prediction accuracy over traditional Bayesian hierarchical models. We further identify an appropriate model selection criterion. We show that our method holds promise for use in the clinical setting, particularly when ordinal outcomes are measured alongside other variable types that may aid clinical decision making. This approach is especially applicable when multiple, imperfect measures of disease progression are available.
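To make the data structure concrete, the following is a minimal simulation sketch of one continuous and one ordinal outcome that share correlated autoregressive errors. It is not the authors' model or code: the latent-threshold construction of the ordinal outcome, the function name `simulate_joint_ar1`, the linear time trends, the cutpoints, and all parameter values are assumptions chosen purely for illustration.

```python
import numpy as np

def simulate_joint_ar1(n_subjects=50, n_times=6, rho=0.6, cross_corr=0.5,
                       cutpoints=(-0.5, 0.5), seed=0):
    """Simulate one continuous and one ordinal outcome per subject over time.

    Both outcomes are driven by bivariate Gaussian errors that follow a
    stationary AR(1) process in time. The ordinal outcome is obtained by
    thresholding its latent Gaussian series at fixed cutpoints.
    All trends and parameter values here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    # Cross-outcome covariance of the stationary errors (unit variances).
    sigma = np.array([[1.0, cross_corr], [cross_corr, 1.0]])
    chol = np.linalg.cholesky(sigma)
    # Innovation scale chosen so the marginal covariance stays equal to sigma.
    innov_scale = np.sqrt(1.0 - rho ** 2)

    errors = np.empty((n_subjects, n_times, 2))
    errors[:, 0, :] = rng.standard_normal((n_subjects, 2)) @ chol.T
    for t in range(1, n_times):
        innov = rng.standard_normal((n_subjects, 2)) @ chol.T
        errors[:, t, :] = rho * errors[:, t - 1, :] + innov_scale * innov

    time = np.arange(n_times)
    continuous = 1.0 + 0.2 * time + errors[:, :, 0]  # hypothetical linear trend
    latent = 0.1 * time + errors[:, :, 1]            # latent scale for the ordinal outcome
    ordinal = np.digitize(latent, cutpoints)         # categories 0, 1, 2
    return continuous, ordinal
```

Calling `simulate_joint_ar1()` returns two arrays of shape `(n_subjects, n_times)`. Because the two error series are correlated both across outcomes (`cross_corr`) and across time (`rho`), the continuous series carries information about the ordinal one, which is the sense in which a joint model can borrow strength across responses.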
Acknowledgements
Research and data collection reported in this publication were supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number R01AI139267, as well as through an award from the Masters of Foxhounds Association Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or any other party.
About this article
Cite this article
Seedorff, N., Brown, G., Scorza, B. et al. Joint Bayesian longitudinal models for mixed outcome types and associated model selection techniques. Comput Stat 38, 1735–1769 (2023). https://doi.org/10.1007/s00180-022-01280-x