Bayesian Analysis of Multivariate Matched Proportions with Sparse Response

Meyer, Mark J.; Cheng, Haobo; Knutson, Katherine Hobbs

doi:10.1007/s12561-023-09368-8

Published: 30 March 2023

Statistics in Biosciences, volume 15, pages 490–509 (2023)

Abstract

Multivariate matched proportions (MMP) data appear in a variety of contexts, including post-market surveillance of adverse events in pharmaceuticals, disease classification, and agreement between care providers. Such data consist of multiple sets of paired binary measurements taken on the same subject. While recent work proposes methods to address the complexities of MMP data, the issue of sparse response, where no or very few “yes” responses are recorded for one or more sets, remains unaddressed. The presence of sparse response sets results in the underestimation of variance components, loss of coverage, and lowered power in existing methods. Bayesian methods, which have not previously been considered for MMP data, provide a useful framework when sparse responses are present. In particular, the Bayesian probit model in combination with mean model prior specifications provides an elegant solution to the problem of variance underestimation. We examine a multivariate probit-based approach using hierarchical horseshoe-like priors along with a Bayesian functional principal component analysis (FPCA) to model the latent covariance. We show that our approach performs well on MMP data with sparse responses and outperforms existing methods. In a re-examination of a study on the system of care (SOC) framework for children with mental and behavioral disorders, we are able to provide a more complete picture of the relationships in the data. Our analysis provides additional insights into the functioning of the SOC that a previous univariate analysis missed.
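To make the sparse-response problem concrete, the following minimal sketch may help. It is not the authors' multivariate probit model; it applies the standard univariate Wald estimate for a difference of matched proportions to each set separately. The data, the set names, and the function name `paired_difference_wald` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def paired_difference_wald(pairs):
    """Return (difference, Wald standard error) for one set of matched binary pairs.

    Each pair is (first_response, second_response) recorded on the same subject.
    """
    n = len(pairs)
    b = sum(1 for x, y in pairs if x == 1 and y == 0)  # discordant: yes, then no
    c = sum(1 for x, y in pairs if x == 0 and y == 1)  # discordant: no, then yes
    diff = (b - c) / n
    # Wald variance of the difference between two correlated proportions.
    var = (b + c - (b - c) ** 2 / n) / n ** 2
    return diff, sqrt(max(var, 0.0))


# Hypothetical MMP data: 20 subjects, two sets of paired responses.
mmp_data = {
    "typical_set": [(1, 0)] * 6 + [(0, 1)] * 2 + [(1, 1)] * 4 + [(0, 0)] * 8,
    "sparse_set": [(0, 0)] * 20,  # no "yes" responses recorded at all
}

results = {name: paired_difference_wald(pairs) for name, pairs in mmp_data.items()}
for name, (diff, se) in results.items():
    print(f"{name}: difference = {diff:.3f}, Wald SE = {se:.3f}")
```

For the sparse set, the estimated standard error is exactly zero because there are no discordant pairs, so any interval built from it is degenerate. This is one simple way to see the kind of variance underestimation that sparse sets can cause in methods that rely on observed counts alone.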

References

1. Knutson KH, Meyer MJ, Thakrar N, Stein BD (2018) Care coordination for youth with mental health disorders in primary care. Clin Pediatr 57:5–10. https://doi.org/10.1177/0009922817733740
2. Klingenberg B, Agresti A (2006) Multivariate extension of McNemar’s test. Biometrics 62:921–928. https://doi.org/10.1111/j.1541-0420.2006.00525.x
3. McNemar Q (1947) Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12:153–157. https://doi.org/10.1007/BF02295996
4. Consonni G, La Rocca L (2008) Tests based on intrinsic priors for the equality of two correlated proportions. J Am Stat Assoc 103:1260–1269. https://doi.org/10.1198/01621450800000043
5. Saeki H, Tango T, Wang J (2017) Statistical inference for noninferiority of difference in proportions of clustered matched-pair data from multiple raters. J Biopharm Stat 27:70–83. https://doi.org/10.1080/10543406.2016.1148709
6. Westfall PH, Troendle JF, Pennello G (2010) Multiple McNemar tests. Biometrics 66:1185–1191. https://doi.org/10.1111/j.1541-0420.2010.01408.x
7. Xu J, Yu M (2013) Sample size determination and re-estimation for matched pair designs with multiple binary endpoints. Biom J 55:430–443. https://doi.org/10.1002/bimj.201100231
8. Lui K-J, Chang K-C (2013) Testing and estimation of proportion (or risk) ratio under the matched-pair design with multiple binary endpoints. Biom J 55:603–616. https://doi.org/10.1002/bimj.201200224
9. Lui K-J, Chang K-C (2016) Notes on testing noninferiority in multivariate binary data under the matched-pair design. Stat Methods Med Res 25:1272–1289. https://doi.org/10.1177/0962280213477022
10. Cochran WG (1950) The comparison of percentages in matched samples. Biometrika 37:256–266. https://doi.org/10.2307/2332378
11. Mantel N, Haenszel W (1959) Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 22:719–748. https://doi.org/10.1093/jnci/22.4.719
12. Jiang Y, Xu J (2017) A comparative study of matched pair designs with two binary endpoints. Stat Methods Med Res 26:2526–2542. https://doi.org/10.1177/0962280215601136
13. Agresti A (2013) Categorical data analysis, 3rd edn. Wiley, Hoboken
14. Altham PME (1971) The analysis of matched proportions. Biometrika 58:561–576. https://doi.org/10.2307/2334391
15. Broemeling LD, Gregurich MA (1996) A Bayesian alternative to the analysis of matched categorical responses. Commun Stat 25:1429–1445. https://doi.org/10.1080/03610929608831777
16. Ghosh M, Chen M-H, Ghosh A, Agresti A (2000) Hierarchical Bayesian analysis of binary matched pairs data. Stat Sin 10:647–657
17. Albert J, Chib S (1993) Bayesian analysis of binary and polychotomous response data. J Am Stat Assoc 88:669–679. https://doi.org/10.1080/01621459.1993.10476321
18. Albert J, Chib S (1995) Bayesian residual analysis for binary response models. Biometrika 82:747–759. https://doi.org/10.1093/biomet/82.4.747
19. Ruppert D, Wand MP, Carroll RJ (2003) Semiparametric regression. Cambridge University Press, Cambridge
20. Gelman A (2006) Prior distributions for variance parameters in hierarchical models. Bayesian Anal 1:513–533. https://doi.org/10.1214/06-BA117A
21. Carvalho CM, Polson NG, Scott JG (2010) The horseshoe estimator for sparse signals. Biometrika 97:465–480. https://doi.org/10.1093/biomet/asq017
22. Van der Linde A (2008) Variational Bayesian functional PCA. Comput Stat Data Anal 53:517–533. https://doi.org/10.1016/j.csda.2008.09.015
23. Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis, 3rd edn. Chapman and Hall-CRC, Boca Raton
24. Chib S, Greenberg E (1998) Analysis of multivariate probit models. Biometrika 85:347–361. https://doi.org/10.1093/biomet/85.2.347
25. Liu C (2001) Discussion. J Comput Graph Stat 10:75–81. https://doi.org/10.1198/10618600152418746
26. Zhang X, Boscardin WJ, Belin TR (2006) Sampling correlation matrices in Bayesian models with correlated latent variables. J Comput Graph Stat 15:880–896. https://doi.org/10.1198/106186006X160050
27. Webb EL, Forster JJ (2008) Bayesian model determination for multivariate ordinal and binary data. Comput Stat Data Anal 52:2632–2649. https://doi.org/10.1016/j.csda.2007.09.008
28. Goldsmith J, Kitago T (2016) Assessing systematic effects of stroke on motor control using hierarchical function-on-scalar regression. J R Stat Soc Ser C 65:215–236. https://doi.org/10.1111/rssc.12115
29. Meyer MJ, Morris JS, Gazes RP, Coull BA (2022) Ordinal probit functional outcome regression with application to computer-use behavior in rhesus monkeys. Ann Appl Stat 16:537–550. https://doi.org/10.1214/21-AOAS1513
30. Gupta AK, Nagar DK (2000) Matrix variate distributions, 2nd edn. Chapman & Hall/CRC, Boca Raton
31. Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11:89–121. https://doi.org/10.1214/ss/1038425655
32. Polson NG, Scott JG (2012) On the half-Cauchy prior for a global scale parameter. Bayesian Anal 7:887–902. https://doi.org/10.1214/12-BA730
33. Wand MP, Ormerod JT, Padoan SA, Frühwirth R (2011) Mean field variational Bayes for elaborate distributions. Bayesian Anal 6:847–900. https://doi.org/10.1214/11-BA631
34. Agresti A, Coull BA (1998) Approximate is better than ‘exact’ for interval estimation of binomial proportions. Am Stat 52:119–126. https://doi.org/10.1080/00031305.1998.10480550
35. Agresti A, Caffo B (2000) Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. Am Stat 54:280–288. https://doi.org/10.1080/00031305.2000.10474560
36. Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7:457–511. https://doi.org/10.1214/ss/1177011136
37. Costello EJ, He J-P, Sampson NA, Kessler RC, Merikangas KR (2014) Services for adolescents with psychiatric disorders: 12-month data from the National Comorbidity Survey-Adolescent. Psychiatr Serv 65:359–366. https://doi.org/10.1176/appi.ps.201100518

Acknowledgements

Partial funding for this work was provided by internal Georgetown University Summer Academic Research Grants.

Author information

Authors and Affiliations

Department of Mathematics and Statistics, Georgetown University, Washington, DC, 20057, USA
Mark J. Meyer & Haobo Cheng
Duke University School of Medicine, Durham, NC, 27710, USA
Katherine Hobbs Knutson

Corresponding author

Correspondence to Mark J. Meyer.

Ethics declarations

Conflict of interest

The authors have no potential or actual conflicts of interest to declare. No additional data were collected for this research. The original study was retrospective in nature and was approved by the Boston University Institutional Review Board [1].

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 656 kb)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Meyer, M.J., Cheng, H. & Knutson, K.H. Bayesian Analysis of Multivariate Matched Proportions with Sparse Response. Stat Biosci 15, 490–509 (2023). https://doi.org/10.1007/s12561-023-09368-8

Received: 06 November 2022
Revised: 24 February 2023
Accepted: 10 March 2023
Published: 30 March 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s12561-023-09368-8
