Copula Approach for Developing a Biomarker Panel for Prediction of Dengue Hemorrhagic Fever

Annals of Data Science

Abstract

The choice of variable-selection method for binary classification modeling is critical for producing statistical models that are stable, interpretable, accurate in prediction, and minimally biased. This work is motivated by data on clinical and laboratory features of dengue fever infections obtained from 51 individuals enrolled in a prospective observational study of acute human dengue infections. We use an objective Bayesian method to identify important variables for dengue hemorrhagic fever (DHF) in this data set. With the variables selected by the objective Bayesian method, we fit a Gaussian copula marginal regression model that accounts for a correlated error structure, and we apply a general method of semi-parametric Bayesian inference for the Gaussian copula model to estimate the marginal distributions and the dependence structure separately. We also carry out a receiver operating characteristic (ROC) analysis of the predictive model for DHF and use it to compare our proposed model with the models of Ju and Brasier (Variable selection methods for developing a biomarker panel for prediction of dengue hemorrhagic fever. BMC Res Notes 6:365, 2013). Our results extend the previous models of DHF by suggesting that IL-10, Days Fever, Sex and Lymphocytes are the major features for predicting DHF on the basis of blood chemistries and cytokine measurements. In addition, the dependence structure among these four features (Days Fever, Lymphocytes, IL-10 and Sex) associated with disease outcomes was identified by the semi-parametric Bayesian Gaussian copula model and the Gaussian partial correlation method.
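The idea of estimating the dependence structure separately from the marginal distributions can be illustrated with a minimal Python sketch. It is not the semi-parametric Bayesian procedure used in the paper: it swaps in the simpler rank-based normal-scores estimate of a Gaussian copula correlation matrix, followed by partial correlations read off the inverse correlation matrix. The function names (`normal_scores`, `copula_correlation`, `partial_correlation`) and the synthetic data are assumptions made for illustration, not the dengue data or the authors' code.

```python
import numpy as np
from statistics import NormalDist


def normal_scores(x):
    """Map a 1-D sample to normal scores through its ranks.

    Only the ranks are used, so the result does not depend on the
    marginal distribution of x. Tied values share their average rank.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    inv_cdf = NormalDist().inv_cdf
    # Dividing by n + 1 keeps every argument strictly inside (0, 1).
    return np.array([inv_cdf(r / (n + 1)) for r in ranks])


def copula_correlation(data):
    """Normal-scores estimate of the Gaussian copula correlation matrix."""
    columns = np.asarray(data, dtype=float).T
    z = np.column_stack([normal_scores(col) for col in columns])
    return np.corrcoef(z, rowvar=False)


def partial_correlation(corr):
    """Partial correlation of each pair given all remaining variables."""
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Synthetic illustration (hypothetical data): a and b are related
# only through the shared driver x.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
a = np.exp(x + 0.5 * rng.normal(size=300))  # skewed margin
b = x + 0.5 * rng.normal(size=300)
corr = copula_correlation(np.column_stack([x, a, b]))
print(np.round(corr, 2))
print(np.round(partial_correlation(corr), 2))
```

Because only ranks enter the estimate, a strictly increasing transformation of any margin (such as the `np.exp` above) leaves the copula correlation unchanged. In this setup the partial correlation between `a` and `b` given `x` should be close to zero even though their marginal correlation is large.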

References

  1. Aas K, Czado C, Frigessi A, Bakken H (2009) Pair-copula constructions of multiple dependence. Insur Math Econ 44(2):182–198

  2. Ahmad Z (2019) The hyperbolic Sine Rayleigh distribution with application to bladder cancer susceptibility. Ann Data Sci 6:211–222. https://doi.org/10.1007/s40745-018-0165-0

  3. Bayarri MJ, Berger JO, Forte A, Garcia-Donato G (2012) Criteria for Bayesian model choice with application to variable selection. Ann Stat 40:1550–1577

  4. Brasier AR, Ju H, Garcia J, Spratt HM, Victor SS, Forshey BM, Halsey ES, Comach G, Sierra G, Blair PJ, Rocha C, Morrison AC, Scott TW, Bazan I, Kochel TJ, Venezuelan Dengue Fever Working Group (2012) A three-component biomarker panel for prediction of dengue hemorrhagic fever. Am J Trop Med Hyg 86(2):341–348

  5. Denuit M, Lambert P (2005) Constraints on concordance measures in bivariate discrete data. J Multivar Anal 93(1):40–57

  6. Garcia-Donato G, Forte A (2015) R package BayesVarSel. R Foundation for Statistical Computing, Vienna

  7. Genest C, Nešlehová J (2007) A primer on copulas for count data. ASTIN Bull 37(2):475–515. https://doi.org/10.1017/S0515036100014963

  8. Genest C, Ghoudi K, Rivest LP (1995) A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika 82(3):543–552

  9. Hoff PD (2007) Extending the rank likelihood for semiparametric copula estimation. Ann Appl Stat 1(1):265–283

  10. Joe H (1997) Multivariate models and dependence concepts. Chapman and Hall, London

  11. Ju H, Brasier AR (2013) Variable selection methods for developing a biomarker panel for prediction of dengue hemorrhagic fever. BMC Res Notes 6:365

  12. Kim D, Kim J-M (2014) Analysis of directional dependence using asymmetric copula-based regression models. J Stat Comput Simul 84(9):1990–2010

  13. Kim J-M, Jung Y-S, Sungur EA, Han K, Park C, Sohn I (2008) A copula method for modeling directional dependence of genes. BMC Bioinform 9:225

  14. Kim J-M, Jung Y-S, Choi T, Sungur EA (2011) Partial correlation with copula modeling. Comput Stat Data Anal 55(3):1357–1366

  15. Kojadinovic I, Yan J (2010) Modeling multivariate distributions with continuous margins using the copula R Package. J Stat Softw 34(9):1–20

  16. Madsen L, Fang Y (2011) Joint regression analysis for discrete longitudinal data. Biometrics 67(3):1171–1175

  17. Masarotto G, Varin C (2012) Gaussian copula marginal regression. Electron J Stat 6:1517–1549

  18. Nelsen R (2006) An introduction to copulas, 2nd edn. Springer, New York

  19. Olson D, Shi Y (2007) Introduction to business data mining. McGraw-Hill/Irwin, Boston

  20. Shi Y, Tian YJ, Kou G, Peng Y, Li JP (2011) Optimization based data mining: theory and applications. Springer, London

  21. Sklar A (1959) Fonctions de répartition à n dimensions et leurs marges (in French). Publ Inst Stat Univ Paris 8:229–231

  22. Song PX-K (2000) Multivariate dispersion models generated from Gaussian copula. Scand J Stat 27:305–320

  23. Zellner A (1986) On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In: Zellner A (ed) Bayesian inference and decision techniques: essays in honor of Bruno de Finetti. Edward Elgar Publishing Limited, Cheltenham, pp 389–399

Author information

Corresponding author

Correspondence to Yoonsung Jung.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kim, JM., Ju, H. & Jung, Y. Copula Approach for Developing a Biomarker Panel for Prediction of Dengue Hemorrhagic Fever. Ann. Data. Sci. 7, 697–712 (2020). https://doi.org/10.1007/s40745-020-00293-x

  • DOI: https://doi.org/10.1007/s40745-020-00293-x
