Classification tree analysis: a statistical tool to investigate risk factor interactions with an example for colon cancer (United States)

Cancer Causes & Control

Abstract

Objective: Classification tree analysis is a potentially powerful tool for investigating multilevel interactions. Within the context of colon cancer etiology it may help identify disease pathways and evaluate important interactions of risk factors.

Methods: We apply classification tree analysis as a statistical method to investigate interactions of risk factors for colon cancer. We use data collected from a population-based case–control study of newly diagnosed cases of colon cancer (N = 4403 cases and controls).

Results: Our results indicate that, as expected, there are many factors that influence colon cancer risk, and that they interact on many levels. We find that the most important factor is the utilization of aspirin and/or non-steroidal anti-inflammatory drugs (NSAID), with those taking this medication having lower risk. Family history appears as a level two modifying factor when NSAID are not used, whereas Western diet is the second factor when NSAID are taken. The final tree has six levels, contains several modifying factors and correctly classifies case or control status for 60.8% (95% CI 59.4–62.2) of all individuals.

Conclusions: Our results suggest that risk factors work together to determine disease risk. By accounting for interactions between risk factors we become better able to dissect disease pathways and determine those risk factors that increase susceptibility to disease. Our results highlight the importance of designing studies so that interactions can be addressed.
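
As a rough check on the reported classification accuracy, the sketch below recomputes a 95% confidence interval for a proportion of 60.8% among 4403 individuals. It assumes a normal-approximation (Wald) interval and that all 4403 cases and controls form the denominator; the abstract does not state which interval method was used, so the function name `wald_ci` and the choice of method are illustrative assumptions rather than the authors' procedure.

```python
import math


def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Assumption: the abstract does not say how its interval was computed;
    the Wald interval is used here only as a plausibility check.
    """
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)  # standard error of the proportion
    return p_hat - z * se, p_hat + z * se


accuracy = 0.608     # proportion correctly classified, from the abstract
n_subjects = 4403    # cases and controls combined, from the abstract

lower, upper = wald_ci(accuracy, n_subjects)
print(f"{100 * lower:.1f}% to {100 * upper:.1f}%")  # prints "59.4% to 62.2%"
```

Under these assumptions the interval agrees with the reported 59.4–62.2, which suggests the stated accuracy and sample size are mutually consistent.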

Cite this article

Camp, N.J., Slattery, M.L. Classification tree analysis: a statistical tool to investigate risk factor interactions with an example for colon cancer (United States). Cancer Causes Control 13, 813–823 (2002). https://doi.org/10.1023/A:1020611416907

  • DOI: https://doi.org/10.1023/A:1020611416907