Bayesian Analysis of Multivariate Matched Proportions with Sparse Response

Statistics in Biosciences

Abstract

Multivariate matched proportions (MMP) data appear in a variety of contexts, including post-market surveillance of adverse events in pharmaceuticals, disease classification, and agreement between care providers. MMP data consist of multiple sets of paired binary measurements taken on the same subject. While recent work proposes methods to address the complexities of MMP data, the issue of sparse response, where no or very few “yes” responses are recorded for one or more sets, remains unaddressed. In existing methods, the presence of sparse response sets results in underestimated variance components, loss of coverage, and lowered power. Bayesian methods, which have not previously been considered for MMP data, provide a useful framework when sparse responses are present. In particular, the Bayesian probit model in combination with mean model prior specifications provides an elegant solution to the problem of variance underestimation. We examine a multivariate probit-based approach using hierarchical horseshoe-like priors along with a Bayesian functional principal component analysis (FPCA) to model the latent covariance. We show that our approach performs well on MMP data with sparse responses and outperforms existing methods. In a re-examination of a study on the system of care (SOC) framework for children with mental and behavioral disorders, we are able to provide a more complete picture of the relationships in the data. Our analysis provides additional insights into the functioning of the SOC that a previous univariate analysis missed.
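
To make the variance underestimation concrete, the following minimal sketch computes the textbook Wald estimate for a single matched-pair difference in proportions. It is not the authors' method, and the function name `matched_diff_wald` and both data sets are hypothetical, chosen only for illustration. When a response set has no "yes" responses, there are no discordant pairs and the estimated standard error collapses to zero.

```python
from math import sqrt


def matched_diff_wald(pairs):
    """Return (difference, Wald standard error) for one set of paired binary data.

    pairs: list of (first, second) tuples with 0/1 entries, one tuple per subject.
    The difference is the proportion of "yes" on the first measurement minus
    the proportion of "yes" on the second, which depends only on the
    discordant pairs.
    """
    n = len(pairs)
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)  # yes then no
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)  # no then yes
    diff = (n10 - n01) / n
    # Standard Wald variance estimate for a difference of matched proportions.
    var = ((n10 + n01) - (n10 - n01) ** 2 / n) / n ** 2
    return diff, sqrt(max(var, 0.0))


# Hypothetical data: a well-populated response set and a fully sparse one.
common = [(1, 0)] * 12 + [(0, 1)] * 5 + [(1, 1)] * 8 + [(0, 0)] * 25
sparse = [(0, 0)] * 50  # no "yes" responses recorded at all

print(matched_diff_wald(common))
print(matched_diff_wald(sparse))
```

A zero standard error for the sparse set yields a degenerate interval, one simple way in which a sparse response set can distort inference based on such estimates.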

References

  1. Knutson KH, Meyer MJ, Thakrar N, Stein BD (2018) Care coordination for youth with mental health disorders in primary care. Clin Pediatr 57:5–10. https://doi.org/10.1177/0009922817733740

  2. Klingenberg B, Agresti A (2006) Multivariate extension of McNemar’s test. Biometrics 62:921–928. https://doi.org/10.1111/j.1541-0420.2006.00525.x

  3. McNemar Q (1947) Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12:153–157. https://doi.org/10.1007/BF02295996

  4. Consonni G, La Rocca L (2008) Tests based on intrinsic priors for the equality of two correlated proportions. J Am Stat Assoc 103:1260–1269. https://doi.org/10.1198/01621450800000043

  5. Saeki H, Tango T, Wang J (2017) Statistical inference for noninferiority of difference in proportions of clustered matched-pair data from multiple raters. J Biopharm Stat 27:70–83. https://doi.org/10.1080/10543406.2016.1148709

  6. Westfall PH, Troendle JF, Pennello G (2010) Multiple McNemar tests. Biometrics 66:1185–1191. https://doi.org/10.1111/j.1541-0420.2010.01408.x

  7. Xu J, Yu M (2013) Sample size determination and re-estimation for matched pair designs with multiple binary endpoints. Biom J 55:430–443. https://doi.org/10.1002/bimj.201100231

  8. Lui K-J, Chang K-C (2013) Testing and estimation of proportion (or risk) ratio under the matched-pair design with multiple binary endpoints. Biom J 55:603–616. https://doi.org/10.1002/bimj.201200224

  9. Lui K-J, Chang K-C (2016) Notes on testing noninferiority in multivariate binary data under the matched-pair design. Stat Methods Med Res 25:1272–1289. https://doi.org/10.1177/0962280213477022

  10. Cochran WG (1950) The comparison of percentages in matched samples. Biometrika 37:256–266. https://doi.org/10.2307/2332378

  11. Mantel N, Haenszel W (1959) Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 22:719–748. https://doi.org/10.1093/jnci/22.4.719

  12. Jiang Y, Xu J (2017) A comparative study of matched pair designs with two binary endpoints. Stat Methods Med Res 26:2526–2542. https://doi.org/10.1177/0962280215601136

  13. Agresti A (2013) Categorical data analysis, 3rd edn. Wiley, Hoboken

  14. Altham PME (1971) The analysis of matched proportions. Biometrika 58:561–576. https://doi.org/10.2307/2334391

  15. Broemeling LD, Gregurich MA (1996) A Bayesian alternative to the analysis of matched categorical responses. Commun Stat 25:1429–1445. https://doi.org/10.1080/03610929608831777

  16. Ghosh M, Chen M-H, Ghosh A, Agresti A (2000) Hierarchical Bayesian analysis of binary matched pairs data. Stat Sin 10:647–657

  17. Albert J, Chib S (1993) Bayesian analysis of binary and polychotomous response data. J Am Stat Assoc 88:669–679. https://doi.org/10.1080/01621459.1993.10476321

  18. Albert J, Chib S (1995) Bayesian residual analysis for binary response models. Biometrika 82:747–759. https://doi.org/10.1093/biomet/82.4.747

  19. Ruppert D, Wand MP, Carroll RJ (2003) Semiparametric regression. Cambridge University Press, Cambridge

  20. Gelman A (2006) Prior distributions for variance parameters in hierarchical models. Bayesian Anal 1:513–533. https://doi.org/10.1214/06-BA117A

  21. Carvalho CM, Polson NG, Scott JG (2010) The horseshoe estimator for sparse signals. Biometrika 97:465–480. https://doi.org/10.1093/biomet/asq017

  22. Van der Linde A (2008) Variational Bayesian functional PCA. Comput Stat Data Anal 53:517–533. https://doi.org/10.1016/j.csda.2008.09.015

  23. Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis, 3rd edn. Chapman & Hall/CRC, Boca Raton

  24. Chib S, Greenberg E (1998) Analysis of multivariate probit models. Biometrika 85:347–361. https://doi.org/10.1093/biomet/85.2.347

  25. Liu C (2001) Discussion. J Comput Graph Stat 10:75–81. https://doi.org/10.1198/10618600152418746

  26. Zhang X, Boscardin WJ, Belin TR (2006) Sampling correlation matrices in Bayesian models with correlated latent variables. J Comput Graph Stat 15:880–896. https://doi.org/10.1198/106186006X160050

  27. Webb EL, Forster JJ (2008) Bayesian model determination for multivariate ordinal and binary data. Comput Stat Data Anal 52:2632–2649. https://doi.org/10.1016/j.csda.2007.09.008

  28. Goldsmith J, Kitago T (2016) Assessing systematic effects of stroke on motor control using hierarchical function-on-scalar regression. J R Stat Soc Ser C 65:215–236. https://doi.org/10.1111/rssc.12115

  29. Meyer MJ, Morris JS, Gazes RP, Coull BA (2022) Ordinal probit functional outcome regression with application to computer-use behavior in rhesus monkeys. Ann Appl Stat 16:537–550. https://doi.org/10.1214/21-AOAS1513

  30. Gupta AK, Nagar DK (2000) Matrix variate distributions, 2nd edn. Chapman & Hall/CRC, Boca Raton

  31. Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11:89–121. https://doi.org/10.1214/ss/1038425655

  32. Polson NG, Scott JG (2012) On the half-Cauchy prior for a global scale parameter. Bayesian Anal 7:887–902. https://doi.org/10.1214/12-BA730

  33. Wand MP, Ormerod JT, Padoan SA, Frühwirth R (2011) Mean field variational Bayes for elaborate distributions. Bayesian Anal 6:847–900. https://doi.org/10.1214/11-BA631

  34. Agresti A, Coull BA (1998) Approximate is better than ‘exact’ for interval estimation of binomial proportions. Am Stat 52:119–126. https://doi.org/10.1080/00031305.1998.10480550

  35. Agresti A, Caffo B (2000) Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. Am Stat 54:280–288. https://doi.org/10.1080/00031305.2000.10474560

  36. Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7:457–511. https://doi.org/10.1214/ss/1177011136

  37. Costello EJ, He J-P, Sampson NA, Kessler RC, Merikangas KR (2014) Services for adolescents with psychiatric disorders: 12-month data from the National Comorbidity Survey-Adolescent. Psychiatr Serv 65:359–366. https://doi.org/10.1176/appi.ps.201100518

Acknowledgements

Partial funding for this work was provided by internal Georgetown University Summer Academic Research Grants.

Author information

Corresponding author

Correspondence to Mark J. Meyer.

Ethics declarations

Conflict of interest

The authors have no potential or actual conflicts of interest to declare. No additional data were collected for this research. The original study was retrospective in nature and was approved by the Boston University Institutional Review Board [1].

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 656 kb)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Meyer, M.J., Cheng, H. & Knutson, K.H. Bayesian Analysis of Multivariate Matched Proportions with Sparse Response. Stat Biosci 15, 490–509 (2023). https://doi.org/10.1007/s12561-023-09368-8

  • DOI: https://doi.org/10.1007/s12561-023-09368-8
