Predicting Cancer Stage from Circulating microRNA: A Comparative Analysis of Machine Learning Algorithms

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2023)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 13919)

Abstract

In recent years, serum-based tests have been under development for early cancer detection and for identifying the tissue of origin. Circulating microRNA has been shown to be a potential source of diagnostic information that can be collected non-invasively. In this study, we investigate circulating microRNAs as predictors of cancer stage. Specifically, we predict whether a sample stems from a patient with early-stage (0–II) or late-stage (III–IV) cancer. We trained five machine learning algorithms on a data set of cancers from twelve different primary sites. The results showed that cancer stage can be predicted from circulating microRNA with a sensitivity of 71.73%, a specificity of 79.97%, and positive and negative predictive values of 54.81% and 89.29%, respectively. Furthermore, we compared the best pan-cancer model with models specialized on individual cancers and found no statistically significant difference. Finally, 185 microRNAs were significant in the best-performing pan-cancer model. Comparing the five most relevant circulating microRNAs in that model with the current literature revealed some known associations with various cancers. In conclusion, the study shows the potential of circulating microRNA and machine learning algorithms to predict cancer stage and suggests that further research into their potential as a non-invasive clinical test is warranted.
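The four metrics reported in the abstract all derive from the counts of a binary confusion matrix. The following is a minimal sketch of how they could be computed, not the authors' evaluation code. It assumes that late-stage cancer is encoded as the positive class (label 1), which the abstract does not state; the function name `binary_metrics` and the example labels are hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    sensitivity: float  # TP / (TP + FN): share of positives that are detected
    specificity: float  # TN / (TN + FP): share of negatives that are detected
    ppv: float          # TP / (TP + FP): share of positive calls that are correct
    npv: float          # TN / (TN + FN): share of negative calls that are correct


def binary_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity, PPV and NPV from paired labels.

    Assumption: `positive` marks the late-stage class (hypothetical encoding).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a class or call type is absent.
        return numerator / denominator if denominator else float("nan")

    return BinaryMetrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
    )


# Hypothetical toy labels (1 = late stage, 0 = early stage), not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Sensitivity and specificity are conditioned on the true class, whereas the predictive values are conditioned on the model's call and therefore also depend on how common each class is in the evaluated samples.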

Acknowledgments

This work was supported by the University of Skövde, Sweden under grants from the Knowledge Foundation (20170302, 20200014). The computations were enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) at Chalmers University of Technology partially funded by the Swedish Research Council through grant agreement no. 2022–06725.

Author information

Corresponding author

Correspondence to Sören Richard Stahlschmidt.

7 Appendix

Table 5. Hyperparameters used during grid search

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Stahlschmidt, S.R., Ulfenborg, B., Synnergren, J. (2023). Predicting Cancer Stage from Circulating microRNA: A Comparative Analysis of Machine Learning Algorithms. In: Rojas, I., Valenzuela, O., Rojas Ruiz, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2023. Lecture Notes in Computer Science, vol 13919. Springer, Cham. https://doi.org/10.1007/978-3-031-34953-9_8

  • DOI: https://doi.org/10.1007/978-3-031-34953-9_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34952-2

  • Online ISBN: 978-3-031-34953-9

  • eBook Packages: Computer Science, Computer Science (R0)
