A Comparative Study of Data Mining Techniques Applied to Renal-Cell Carcinomas

Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 432)

Abstract

Kidney cancer is one of the deadliest diseases and, despite enormous progress in fighting it, the best methods for prediction in kidney cancer, namely Renal-Cell Carcinoma (RCC), are not well established. One way to accelerate current knowledge about RCC is to apply Data Mining techniques to patients' personal and clinical data. It is therefore crucial to understand which techniques are most suitable for extracting knowledge about this disease. In this paper, we followed the CRISP-DM methodology to test different techniques and determine those with the best predictive performance. For this purpose, we used a dataset of 821 records of RCC patients obtained from The Cancer Genome Atlas. The present work tests different Data Mining techniques that can be used for two tasks: predicting the 5-year life expectancy of patients with renal cancer, and predicting the number of days to death for patients with a life expectancy of less than 5 years. The results showed that the best algorithm for estimating vital status at 5 years was Random Forest, with an accuracy of 87.65% and an AUROC of 0.931. For the prediction of days to death, the best performance was obtained with the k-Nearest Neighbors algorithm, with a root mean square error of 354.6 days. The work suggests that Data Mining techniques can help to understand the influence of various risk factors on the life expectancy of patients with RCC.
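The three figures of merit quoted above (accuracy and AUROC for the 5-year vital status classifier, root mean square error for the days-to-death regressor) can be illustrated with a minimal Python sketch. This is not the authors' RapidMiner workflow; the function names, the label encoding (1 for the positive class) and the toy values are assumptions introduced only for illustration, and none of the numbers come from the paper.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of records whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def auroc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive record receives a higher score than a randomly chosen
    negative record (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the target (here, days)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical toy data, NOT taken from the paper.
vital_status = [1, 0, 1, 1, 0, 0]               # assumed encoding of the class label
class_scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]   # model confidence for class 1
predicted = [1 if s >= 0.5 else 0 for s in class_scores]

days_true = [100, 400, 900]                      # observed days to death
days_pred = [130, 360, 950]                      # predicted days to death

print("accuracy:", accuracy(vital_status, predicted))
print("AUROC:   ", auroc(vital_status, class_scores))
print("RMSE:    ", rmse(days_true, days_pred))
```

Accuracy depends on the chosen decision threshold (0.5 in this sketch), whereas AUROC summarises ranking quality across all thresholds, which is one reason the two are commonly reported together for a classifier.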

Keywords

  • Renal-Cell Carcinoma
  • Data Mining
  • Survival
  • Life expectancy
  • RapidMiner

Acknowledgements

This work is funded by “FCT—Fundação para a Ciência e Tecnologia” within the R&D Units Project Scope: UIDB/00319/2020.

Author information

Corresponding author

Correspondence to Hugo Peixoto.

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Duarte, A., Peixoto, H., Machado, J. (2022). A Comparative Study of Data Mining Techniques Applied to Renal-Cell Carcinomas. In: Spinsante, S., Silva, B., Goleva, R. (eds) IoT Technologies for Health Care. HealthyIoT 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 432. Springer, Cham. https://doi.org/10.1007/978-3-030-99197-5_5

  • DOI: https://doi.org/10.1007/978-3-030-99197-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-99196-8

  • Online ISBN: 978-3-030-99197-5

  • eBook Packages: Computer Science, Computer Science (R0)