Maximum Accuracy Machine Learning Statistical Analysis—A Novel Approach

Cancer Drug Safety and Public Health Policy

Part of the book series: Cancer Treatment and Research (CTAR, volume 184)

Abstract

Logistic regression is a statistical tool of paramount significance in the field of epidemiology [1] and ranks as one of the most frequently published multivariable analyses for designs involving a single binary dependent variable and one or more independent variables in the fields of public health [2, 3] and medical [4] research.

References

  1. Ottenbacher KJ, Ottenbacher HR, Tooth LR, Ostir GV (2004) A review of two journals found that articles using multivariable logistic regression frequently did not report commonly recommended assumptions. J Clin Epidemiol 57:1147–1152

  2. Hayat MJ, Powell A, Johnson T, Cadwell BL (2017) Statistical methods used in the public health literature and implications for training of public health professionals. PLoS ONE. https://doi.org/10.1371/journal.pone.0179032

  3. Zardo P, Collie A (2014) Predicting research use in a public health policy environment: results of a logistic regression analysis. Implement Sci 9:142

  4. Tetrault JM, Sauler M, Wells CK, Concato J (2008) Reporting of multivariable methods in the medical literature. J Investig Med 56:954–957

  5. Kalil AC, Mattei J, Florescu DF, Sun J, Kalil RS (2010) Recommendations for the assessment and reporting of multivariable logistic regression in transplantation literature. Am J Transplant 10:1686–1694

  6. Real J, Forne C, Roso-Llorach A, Martinez-Sanchez JM (2016) Quality reporting of multivariable regression models in observational studies. Medicine 95:e3653

  7. Bagley SC, White H, Golomb BA (2001) Logistic regression in the medical literature: standards for use and reporting, with particular attention to one medical domain. J Clin Epidemiol 54:979–985

  8. Zhang YY, Zhou XB, Wang QZ, Zhu XY (2017) Quality of reporting of multivariable logistic regression models in Chinese clinical medical journals. Medicine 96:e6972

  9. Kumar R, Indrayan A, Chhabra P (2016) Evaluation of quality of multivariable logistic regression in Indian medical journals using multilevel modeling approach. Indian J Public Health 60:99–106

  10. Wright RE (2005) Logistic regression. In: Grimm LG, Yarnold PR (eds) Reading and understanding multivariate statistics. APA Books, Washington, DC

  12. Yarnold PR (1996) Discriminating geriatric and non-geriatric patients using functional status information: An example of classification tree analysis via UniODA. Educ Psychol Measur 56:656–667

  13. Linden A, Yarnold PR (2016) Using data mining techniques to characterize participation in observational studies. J Eval Clin Pract 22:839–847

  14. Linden A, Yarnold PR (2016) Using classification tree analysis to generate propensity score weights. J Eval Clin Pract 22:848–853

  15. Linden A, Yarnold PR (2016) Identifying causal mechanisms in health care interventions using classification tree analysis. J Eval Clin Pract 22:854–858

  16. Yarnold PR (1996) Characterizing and circumventing Simpson’s paradox for ordered bivariate data. Educ Psychol Measur 56:430–442

  17. Yarnold PR, Soltysik RC (2005) Optimal data analysis: guidebook with software for Windows. APA Books, Washington, DC

  18. Yarnold PR (2017) What is optimal data analysis? Optimal Data Analysis 6:26–42

  19. Yarnold PR, Bryant FB (2015) Obtaining a hierarchically optimal CTA model via UniODA software. Optimal Data Analysis 4:36–53

  20. Yarnold PR, Bryant FB (2015) Obtaining an enumerated CTA model via automated CTA software. Optimal Data Analysis 4:54–60

  21. Yarnold PR (2017) What is novometric data analysis? Optimal Data Analysis 6:26–42

  22. Yarnold PR, Soltysik RC (1991) Theoretical distributions of optima for univariate discrimination of random data. Decis Sci 22:739–752

  23. Yarnold PR, Soltysik RC (1991) Refining two-group multivariable classification models using univariate optimal discriminant analysis. Decis Sci 22:1158–1164

  24. Yarnold PR, Hart LA, Soltysik RC (1994) Optimizing the classification performance of logistic regression and Fisher’s discriminant analyses. Educ Psychol Measur 54:73–85

  25. Yarnold PR (2014) UniODA vs. ROC analysis: computing the “optimal” cut-point. Optimal Data Analysis 3:117–120

  26. Yarnold PR (2016) How many EO-CTA models exist in my sample, and which is the best model? Optimal Data Analysis 5:62–64

  27. Yarnold PR (2013) Univariate and multivariate analysis of categorical attributes with many response categories. Optimal Data Analysis 2:177–190

  28. Yarnold PR, Linden A (2016) Theoretical aspects of the D statistic. Optimal Data Analysis 5:171–174

  29. Rhodes JN, Yarnold PR (2020) Generating novometric confidence intervals in R: Bootstrap analyses to compare model and chance ESS. Optimal Data Analysis 9:172–177

  30. Linden A, Yarnold PR (2017) Minimizing imbalances on patient characteristics between treatment groups in randomized trials using classification tree analysis. J Eval Clin Pract 23:1309–1315

  31. Linden A, Yarnold PR (2016) Combining machine learning and matching techniques to improve causal inference in program evaluation. J Eval Clin Pract 22:868–874

Corresponding author

Correspondence to Charles L. Bennett.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ugarte, S. et al. (2022). Maximum Accuracy Machine Learning Statistical Analysis—A Novel Approach. In: Bennett, C., Lubaczewski, C., Witherspoon, B. (eds) Cancer Drug Safety and Public Health Policy. Cancer Treatment and Research, vol 184. Springer, Cham. https://doi.org/10.1007/978-3-031-04402-1_8

  • DOI: https://doi.org/10.1007/978-3-031-04402-1_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-04401-4

  • Online ISBN: 978-3-031-04402-1

  • eBook Packages: Medicine, Medicine (R0)
