Using Classification and Regression Trees (CART) to Identify Prescribing Thresholds for Cardiovascular Disease


Background and Objective

Many guidelines for clinical decisions are hierarchical and nonlinear. Evaluating whether these guidelines are followed in practice requires methods that can identify such structures and thresholds. We used classification and regression trees (CART) to analyse the prescribing patterns of Australian general practitioners (GPs) for the primary prevention of cardiovascular disease (CVD). Our aim was to determine whether GPs use absolute risk (AR) guidelines, rather than individual risk factors, to inform their decisions to prescribe lipid-lowering medications.


Methods

We used administrative prescribing data linked to patient-level data from a clinical assessment and patient survey (the AusHEART Study), and assessed prescribing of lipid-lowering medications over a 12-month period for patients (n = 1903) who were not using such medications before recruitment. CART models were developed to explain prescribing practice. Trees were pruned to optimise out-of-sample performance, which was evaluated using receiver operating characteristic (ROC) curves.
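To make the threshold-finding step concrete, the following is a minimal sketch of how a single CART split on one continuous predictor could be chosen using Gini impurity. It is an illustration only, not the authors' implementation: the function names `gini` and `best_threshold` and the toy LDL values are hypothetical, and a full CART repeats this search over all predictors and recursively within each resulting node.

```python
from typing import List, Optional, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values: List[float], labels: List[int]) -> Tuple[Optional[float], float]:
    """Return (threshold, impurity reduction) for the best split 'value < threshold'."""
    n = len(values)
    parent = gini(labels)
    best_t: Optional[float] = None
    best_gain = 0.0
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(values, labels) if x < t]
        right = [y for x, y in zip(values, labels) if x >= t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Hypothetical toy data: LDL cholesterol (mmol/L) and whether a drug was prescribed.
ldl = [2.1, 2.8, 3.0, 3.4, 3.9, 4.2, 4.6, 5.1]
prescribed = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, gain = best_threshold(ldl, prescribed)
print(threshold, gain)
```

In this toy example the two groups separate cleanly, so the search returns a cut-point between the highest non-prescribed and the lowest prescribed LDL value. Real prescribing data are noisier, which is why pruning and out-of-sample evaluation are needed.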


Results

We found that individual risk factors (low-density lipoprotein, diabetes, triglycerides and a history of CVD), GP-estimated rather than Framingham AR, and sociodemographic factors (household income, education) were the predominant drivers of GP prescribing. However, sociodemographic factors and some individual risk factors (triglycerides and CVD history) became relevant only for patients with a particular profile of other risk factors. The area under the ROC curve was 0.63 (95% confidence interval [CI] 0.60–0.64).
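As a reading aid for the reported area under the ROC curve, the sketch below computes the statistic directly from its pairwise interpretation: the share of (prescribed, not prescribed) patient pairs in which the prescribed patient received the higher predicted score, with ties counted as one half. The function name `roc_auc` and the toy scores are hypothetical, and this is not the software used in the study.

```python
from typing import List


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities of prescribing and observed outcomes.
scores = [0.9, 0.7, 0.6, 0.4, 0.4, 0.2]
labels = [1, 0, 1, 1, 0, 0]
auc = roc_auc(scores, labels)
print(auc)
```

On this reading, a value of 0.5 corresponds to ranking pairs no better than chance and 1.0 to perfect ranking.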


Conclusions

There is little evidence that AR guidelines recommended by the National Heart Foundation and National Vascular Disease Prevention Alliance, or conditional individual risk eligibility guidelines from the Pharmaceutical Benefits Scheme, are adopted in prescribing practice. The hierarchy of conditional relationships between risk factors and socioeconomic factors identified by CART provides new insights into prescribing decisions. Overall, CART is a useful addition to the analyst’s toolkit when investigating healthcare decisions.

Fig. 1
Fig. 2


Notes

  1. For example, the American Heart Association (AHA) recommends using a modified Framingham equation [3]. In the UK, the National Institute for Health and Care Excellence (NICE) recommends an absolute CVD risk algorithm known as QRISK2 [4].

  2. Patients with prior exposure to medication are not preferred for this analysis, because their risk factors would be observed only after any response to treatment.

  3. Similar complications exist in clinical decision making in general [17], and in observed (as well as recommended) prescribing patterns for statins [1, 18, 19].

  4. This has been shown to be an optimal method for model selection [29].

  5. This measure identifies all the nodes at which the predictor is selected, sums the improvement in classification across these nodes, and divides by the number of tree branches [28].

  6. Bagging, or ‘bootstrap aggregating’, is a method for generating multiple versions of a tree to allow evaluation of predictor stability [31].

  7. For example, there is some evidence to suggest that compliance increases with the number of risk factors [45].
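The predictor importance measure described in note 5 can be sketched as follows. This is an illustration under stated assumptions rather than the toolbox routine cited in [28]: the function name `predictor_importance`, the input layout (one `(predictor, improvement)` pair per split node) and the toy values are hypothetical, and "number of tree branches" is read here as the number of branch (split) nodes in the tree.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def predictor_importance(splits: List[Tuple[str, float]]) -> Dict[str, float]:
    """splits holds one (predictor, improvement) pair per branch node of a fitted tree."""
    n_branches = len(splits)
    if n_branches == 0:
        return {}
    totals: Dict[str, float] = defaultdict(float)
    # Sum the improvement over every node at which each predictor is selected.
    for name, improvement in splits:
        totals[name] += improvement
    # Divide each sum by the number of branch nodes (assumed reading of note 5).
    return {name: total / n_branches for name, total in totals.items()}


# Hypothetical tree with four branch nodes.
tree_splits = [("ldl", 0.25), ("diabetes", 0.125), ("ldl", 0.25), ("income", 0.125)]
importance = predictor_importance(tree_splits)
print(importance)
```

A predictor selected at several nodes accumulates importance across them, so in this toy tree `ldl` scores higher than predictors that appear only once.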


References

  1. Bonner C, et al. General practitioners’ use of different cardiovascular risk assessment strategies: a qualitative study. Med J Aust. 2013;199(7):485–9.

  2. Jansen J, et al. General practitioners’ use of absolute risk versus individual risk factors in cardiovascular disease prevention: an experimental study. BMJ Open. 2014;4(5):e004812.

  3. Greenland P, et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: a report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines developed in collaboration with the American Society of Echocardiography, American Society of Nuclear Cardiology, Society of Atherosclerosis Imaging and Prevention, Society for Cardiovascular Angiography and Interventions, Society of Cardiovascular Computed Tomography, and Society for Cardiovascular Magnetic Resonance. J Am Coll Cardiol. 2010;56(25):e50–103.

  4. Stone NJ, et al. 2013 ACC/AHA guideline on the treatment of blood cholesterol to reduce atherosclerotic cardiovascular risk in adults: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. J Am Coll Cardiol. 2014;63(25 Pt B):2889–934.

  5. Lalor E, et al. National Vascular Disease Prevention Alliance. Guidelines for the management of absolute cardiovascular disease risk. 2012. ISBN 978–0–9872830–1–6.

  6. National Heart Foundation of Australia. Guide to management of hypertension 2008. 2010. Available at:

  7. Heeley EL, et al. Cardiovascular risk perception and evidence-practice gaps in Australian general practice (the AusHEART study). Med J Aust. 2010;192(5):254–9.

  8. Razavian M, et al. Cardiovascular risk management in chronic kidney disease in general practice (the AusHEART study). Nephrol Dial Transpl. 2012;27(4):1396–402.

  9. Zwar N, et al. GPs’ views of absolute cardiovascular risk and its role in primary prevention. Aust Fam Physician. 2005;34(6):503–4.

  10. Varian HR. Big data: new tricks for econometrics. J Econ Perspect. 2014;28(2):3–27.

  11. Knott RJ, et al. How fair is Medicare? The income-related distribution of Medicare benefits with special focus on chronic care items. Med J Aust. 2012;197:625–30.

  12. Knott RJ, et al. The effects of reduced copayments on discontinuation and adherence failure to statin medication in Australia. Health Policy. 2015;119(5):620–7.

  13. Hothorn T, et al. Party: a laboratory for recursive partytioning. 2010. Available at:

  14. Drakopoulos SA. Hierarchical choice in economics. J Econ Surv. 1994;8(2):133–53.

  15. Scott A. Identifying and analysing dominant preferences in discrete choice experiments: an application in health care. J Econ Psychol. 2002;23(3):383–98.

  16. Pharmaceutical Benefits Scheme. General statement for lipid-lowering drugs prescribed as pharmaceutical benefits. Pharmaceutical Benefits Scheme; 2014.

  17. Garg AX, et al. Effects of computerized clinical decision support systems on practitioner performance and patient outcomes: a systematic review. JAMA. 2005;293(10):1223–38.

  18. Berthold H, et al. Patterns and predictors of statin prescription in patients with type 2 diabetes. Cardiovasc Diabetol. 2009;8(1):25.

  19. Wong M, et al. Patterns of antihypertensive prescribing, discontinuation and switching among a Hong Kong Chinese population from over one million prescriptions. J Hum Hypertens. 2008;22(10):714–6.

  20. Timofeev R. Classification and regression trees (CART) theory and applications. Humboldt-Universitat zu Berlin, Wirtschaftswissenschaftliche Fakultat. 2004.

  21. Tomcikova D, et al. Epidemiology, quality improvement and outcome: risk of in-hospital mortality identified according to the typology of patients with acute heart failure: classification tree analysis on data from the Acute Heart Failure Database–main registry. J Crit Care. 2013;28:250–8.

  22. Navarro Mdel C, et al. Discriminative ability of heel quantitative ultrasound in postmenopausal women with prevalent vertebral fractures: application of optimal threshold cutoff values using classification and regression tree models. Calcif Tissue Int. 2012;91(2):114–20.

  23. Shi K-Q, et al. Risk stratification of spontaneous bacterial peritonitis in cirrhosis with ascites based on classification and regression tree analysis. Mol Biol Rep. 2012;39(5):6161–9.

  24. Breiman L, et al. Classification and regression trees. Boca Raton: CRC Press; 1984.

  25. Torgo L. Inductive learning of tree-based regression models. Universidade do Porto. Reitoria. 1999.

  26. Briand B, et al. A similarity measure to assess the stability of classification trees. Comput Stat Data Anal. 2009;53(4):1208–17.

  27. Mohammed MA, El Sayed C, Marshall T. Patient and other factors influencing the prescribing of cardiovascular prevention therapy in the general practice setting with and without nurse assessment. Med Decis Making. 2012;32(3):498–506.

  28. The Mathworks Inc. Matlab and statistics toolbox release 2015a. Natick: The Mathworks; 2015.

  29. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. IJCAI. 1995;2:1137–43.

  30. Kitsantas P, Hollander M, Li LM. Assessing the stability of classification trees using Florida birth data. J Stat Plan Inference. 2007;137(12):3917–29.

  31. Breiman L. Bagging predictors. Mach Learn. 1996;24(2):123–40.

  32. Dannegger F. Tree stability diagnostics and some remedies for instability. Stat Med. 2000;19(4):475–91.

  33. Strobl C, Malley J, Tutz G. An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychol Methods. 2009;14(4):323.

  34. Lopert R, Henry D. The Pharmaceutical Benefits Scheme: economic evaluation works… but is not a panacea. Aust Prescr. 2002;25(6):126.

  35. French SD, et al. Evaluation of a theory-informed implementation intervention for the management of acute low back pain in general medical practice: the IMPLEMENT cluster randomised trial. PLoS One. 2013;8(6):e65471.

  36. McKenzie JE, et al. Evidence-based care of older people with suspected cognitive impairment in general practice: protocol for the IRIS cluster randomised trial. Implement Sci. 2013;8(1):91.

  37. McKenzie JE, et al. Improving the care for people with acute low-back pain by allied health professionals (the ALIGN trial): a cluster randomised trial protocol. Implement Sci. 2010;5(1):86.

  38. Allen D, Harkins KJ. Too much guidance? Lancet. 2005;365(9473):1768.

  39. Prevedello LM, et al. Does clinical decision support reduce unwarranted variation in yield of CT pulmonary angiogram? Am J Med. 2013;126(11):975–81.

  40. Cholesterol Treatment Trialists’ (CTT) Collaborators. Efficacy and safety of cholesterol-lowering treatment: prospective meta-analysis of data from 90,056 participants in 14 randomised trials of statins. Lancet. 2005;366(9493):1267–78.

  41. Kearney P, et al. Efficacy of cholesterol-lowering therapy in 18,686 people with diabetes in 14 randomised trials of statins: a meta-analysis. Lancet. 2008;371(9607):117–25.

  42. Ashworth M, et al. Social deprivation and statin prescribing: a cross-sectional analysis using data from the new UK general practitioner ‘Quality and Outcomes Framework’. J Public Health (Oxf). 2007;29(1):40–7.

  43. Weitoft GR, et al. Education and drug use in Sweden—a nationwide register-based study. Pharmacoepidemiol Drug Saf. 2008;17(10):1020–8.

  44. Heeley E, et al. Disparities between prescribing of secondary prevention therapies for stroke and coronary artery disease in general practice. Int J Stroke. 2012;7(8):649–54.

  45. Latry P, et al. Adherence with statins in a real-life setting is better when associated cardiovascular risk factors increase: a cohort study. BMC Cardiovasc Disord. 2011;11(1):46.

  46. Field CA, Welsh AH. Bootstrapping clustered data. J R Stat Soc Series B Stat Methodol. 2007;69(3):369–90.



This work was supported by Monash University, the George Institute for Global Health, and the University of Melbourne.

Chris Schilling, Duncan Mortimer, Kim Dalziel, Emma Heeley, John Chalmers and Philip Clarke declare that they have no conflicts of interest.

Author contributions

CS, DM and KD conceptualized this report. All authors had input in developing the approach. CS produced multiple drafts. All authors provided input on the draft report and all read and approved the final report.

Author information

Corresponding author

Correspondence to Chris Schilling.


Appendix 1

See Table 4.

Table 4 Summary of Australian guidelines current during the study period, to inform the prescribing of lipid-lowering medication

Appendix 2

After condensing the data to obtain a single observation per patient, our CART makes no further adjustment for clustering of observations by GP. On average, each GP contributes eight patients to the dataset (minimum of one patient per GP; maximum of 16). Stability across bagged trees may be overestimated if ‘bags’ of observations are drawn from clustered data. In supplementary analyses, we evaluated the stability of the CART in 100 samples drawn using cluster-bootstrap methods [46]. Predictor counts and threshold densities were much the same under the cluster bootstrap as under the simple bootstrap on clustered data described above.
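The difference between a simple bootstrap and a cluster bootstrap can be sketched as follows. This shows one simple variant, in which whole GP clusters are resampled with replacement and every patient of a drawn GP is kept; the source does not state which of the schemes discussed in [46] was used, and the function name `cluster_bootstrap`, the row layout and the toy records are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, dict]  # (gp_id, patient_record)


def cluster_bootstrap(rows: List[Row], rng: random.Random) -> List[Row]:
    """Resample whole GP clusters with replacement, keeping each drawn GP's patients."""
    by_gp: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        by_gp[row[0]].append(row)
    gp_ids = sorted(by_gp)
    sample: List[Row] = []
    # Draw as many clusters as there are GPs; a drawn GP brings all of its patients.
    for _ in gp_ids:
        sample.extend(by_gp[rng.choice(gp_ids)])
    return sample


# Hypothetical toy data: three GPs with two, one and three patients.
patients = [("A", {"ldl": 3.1}), ("A", {"ldl": 4.4}), ("B", {"ldl": 2.9}),
            ("C", {"ldl": 5.0}), ("C", {"ldl": 3.7}), ("C", {"ldl": 4.1})]
resampled = cluster_bootstrap(patients, random.Random(0))
print(len(resampled))
```

Fitting a tree to each such resample, and then tallying which predictors and thresholds are selected, would give stability summaries that respect the within-GP correlation that a patient-level bootstrap ignores.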

Similarly, although detailed contextual data on each GP were not available, the data did contain a State location variable that identifies the GP’s geographic region. In supplementary analyses, we included this variable in the predictor set; however, it did not enter the preferred CART model shown in Fig. 1.

Cite this article

Schilling, C., Mortimer, D., Dalziel, K. et al. Using Classification and Regression Trees (CART) to Identify Prescribing Thresholds for Cardiovascular Disease. PharmacoEconomics 34, 195–205 (2016).

  • Absolute Risk
  • Individual Risk Factor
  • Pharmaceutical Benefit Scheme
  • High Total Cholesterol
  • National Heart Foundation