Cost Prediction Using a Survival Grouping Algorithm: An Application to Incident Prostate Cancer Cases

  • Original Research Article
PharmacoEconomics

Abstract

Background

Prognostic classification approaches are commonly used in clinical practice to predict health outcomes, but the same general approach has received little attention as a way to predict costs. We applied a grouping algorithm designed for large-scale data sets with multiple prognostic factors to investigate whether it improves cost prediction among older Medicare beneficiaries diagnosed with prostate cancer.

Methods

We analysed linked Surveillance, Epidemiology and End Results (SEER)-Medicare data covering 2000 through 2009 for men diagnosed with incident prostate cancer between 2000 and 2007. We split the survival data into two data sets (D0 and D1) of equal size. We trained the classifier of the Grouping Algorithm for Cancer Data (GACD) on D0 and tested it on D1. The prognostic factors included cancer stage, age, race and performance status proxies. We calculated the average difference between observed and predicted D1 costs at 5 years post-diagnosis, with and without the GACD.
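The split-half evaluation described above can be illustrated with a minimal Python sketch. It is not the GACD: the GACD learns its groups from survival data, whereas this sketch assumes the group label is already given by a hypothetical `key` function (for example, cancer stage). The record layout, the function names and the use of a simple group mean as the predictor are all assumptions made for illustration, and censoring of costs is ignored.

```python
import random
from statistics import mean


def split_half(records, seed=0):
    """Shuffle and split records into two halves (D0 for training, D1 for testing)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def mean_cost_by_group(train, key):
    """Mean observed cost per group in the training half."""
    costs = {}
    for record in train:
        costs.setdefault(key(record), []).append(record["cost"])
    return {group: mean(values) for group, values in costs.items()}


def mean_absolute_error(test, predict):
    """Average absolute difference between observed and predicted cost."""
    return mean(abs(record["cost"] - predict(record)) for record in test)


def evaluate(records, key, seed=0):
    """Return (MAE with grouping, MAE without grouping) on the test half."""
    d0, d1 = split_half(records, seed)
    overall = mean(record["cost"] for record in d0)  # ungrouped prediction
    by_group = mean_cost_by_group(d0, key)           # grouped prediction
    mae_grouped = mean_absolute_error(
        d1, lambda r: by_group.get(key(r), overall)  # fall back if group unseen in D0
    )
    mae_ungrouped = mean_absolute_error(d1, lambda r: overall)
    return mae_grouped, mae_ungrouped
```

Calling `evaluate(records, key=lambda r: r["stage"])` on a list of dictionaries with `"cost"` and `"stage"` fields returns the two errors side by side, which mirrors the "with and without grouping" comparison in the abstract.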

Results

The sample included 110,843 men with prostate cancer. The median age of the sample was 74 years, and 10% were African American. The average difference (mean absolute error [MAE]) per person between the observed and predicted total 5-year cost was US$41,525 (MAE US$41,790; 95% confidence interval [CI] US$41,421–42,158) with the GACD and US$43,113 (MAE US$43,639; 95% CI US$43,062–44,217) without the GACD. The 5-year cost prediction without grouping resulted in a sample overestimate of US$79,544,508.
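To put the two reported MAE values side by side, the short Python calculation below derives the absolute and relative reduction in MAE. Only the two MAE figures come from the abstract; the derived reductions are simple arithmetic on them and are not reported by the authors.

```python
# MAE values as reported in the Results paragraph (US$ per person).
mae_with_gacd = 41_790
mae_without_gacd = 43_639

# Derived quantities (not reported in the abstract).
absolute_reduction = mae_without_gacd - mae_with_gacd
relative_reduction = absolute_reduction / mae_without_gacd

print(f"Absolute MAE reduction: US${absolute_reduction:,}")
print(f"Relative MAE reduction: {relative_reduction:.1%}")
```

On these figures, grouping lowers the per-person MAE by US$1,849, or roughly 4.2% of the MAE obtained without grouping.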

Conclusion

The grouping algorithm developed for complex, large-scale data improves the prediction of 5-year costs. The prediction accuracy could be improved by using a richer set of prognostic factors and refining the categorical specifications.



Acknowledgments

The authors thank the staff from University of Maryland Pharmaceutical Research Computing for programming assistance with the primary data sets.

The collection of the California cancer incidence data used in this study was supported by the California Department of Public Health as part of the state-wide cancer reporting programme mandated by California Health and Safety Code Section 103885; the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program, under contract N01-PC-35136 awarded to the Northern California Cancer Center, contract N01-PC-35139 awarded to the University of Southern California and contract N02-PC-15105 awarded to the Public Health Institute; and the Centers for Disease Control and Prevention’s National Program of Cancer Registries, under agreement #U55/CCR921930-02 awarded to the Public Health Institute. The ideas and opinions expressed herein are those of the author(s), and endorsement by the California Department of Public Health, the National Cancer Institute and the Centers for Disease Control and Prevention, or their contractors and subcontractors, is not intended nor should be inferred. The authors acknowledge the efforts of the Applied Research Program, NCI; the Office of Research, Development and Information, Centers for Medicare and Medicaid Services (CMS); Information Management Services (IMS), Inc.; and SEER Program tumour registries in the creation of the SEER-Medicare database.

Author information

Corresponding author

Correspondence to Eberechukwu Onukwugha.

Ethics declarations

No funding was received for the conduct of this study. Eberechukwu Onukwugha declares consulting income from Pfizer, IMS Health and Janssen Analytics (Johnson & Johnson), as well as grant funding from Bayer, Pfizer, Sanofi-Aventis and Novartis. Ran Qi, Jinani Jayasekera and Shujia Zhou have no conflicts of interest to disclose.

Author contributions

The interpretation and reporting of these data are the sole responsibility of the authors. Eberechukwu Onukwugha contributed to the study design, analytic approach and interpretation of the analysis; and drafted and revised the manuscript with input from all co-authors. Ran Qi contributed to the study design, conduct of the analysis and interpretation of the analysis; and drafted the manuscript with input from all co-authors. Jinani Jayasekera contributed to the data analysis; and reviewed and commented on/edited all drafts of the manuscript. Shujia Zhou contributed to the study design and interpretation of the analysis; and reviewed and commented on/edited all drafts of the manuscript. Eberechukwu Onukwugha acts as the guarantor for the overall content.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 52 kb)

Cite this article

Onukwugha, E., Qi, R., Jayasekera, J. et al. Cost Prediction Using a Survival Grouping Algorithm: An Application to Incident Prostate Cancer Cases. PharmacoEconomics 34, 207–216 (2016). https://doi.org/10.1007/s40273-015-0368-6

  • DOI: https://doi.org/10.1007/s40273-015-0368-6
