Abstract
In recent years, serum-based tests have been developed for early cancer detection and for identifying the tissue of origin. Circulating microRNA has been shown to be a potential source of diagnostic information that can be collected non-invasively. In this study, we investigate circulating microRNAs as predictors of cancer stage. Specifically, we predict whether a sample stems from a patient with early-stage (0-II) or late-stage (III-IV) cancer. We trained five machine learning algorithms on a data set of cancers from twelve different primary sites. The results showed that cancer stage can be predicted from circulating microRNA with a sensitivity of 71.73%, a specificity of 79.97%, and positive and negative predictive values of 54.81% and 89.29%, respectively. Furthermore, we compared the best pan-cancer model with models specialized on individual cancers and found no statistically significant difference between them. Finally, 185 microRNAs were significant in the best-performing pan-cancer model. Comparing the five most relevant circulating microRNAs in that model with the current literature showed some known associations with various cancers. In conclusion, the study showed the potential of circulating microRNA and machine learning algorithms to predict cancer stage, and thus suggests that further research into their potential as a non-invasive clinical test is warranted.
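The four figures reported in the abstract (sensitivity, specificity, positive and negative predictive value) all derive from the confusion matrix of a binary classifier. The following is a minimal sketch of those definitions, not the authors' evaluation code. It assumes that "positive" denotes late stage (III-IV) and "negative" denotes early stage (0-II), which the abstract does not state explicitly. The function name `stage_metrics` and the counts are hypothetical and are not taken from the study.

```python
def stage_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    Assumption (not stated in the abstract): the positive class is
    late stage (III-IV) and the negative class is early stage (0-II).
    """
    return {
        # Share of truly late-stage samples that are predicted late stage
        "sensitivity": tp / (tp + fn),
        # Share of truly early-stage samples that are predicted early stage
        "specificity": tn / (tn + fp),
        # Share of late-stage predictions that are correct
        "ppv": tp / (tp + fp),
        # Share of early-stage predictions that are correct
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for illustration only; they are NOT the study's data.
metrics = stage_metrics(tp=70, fp=60, tn=240, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In this illustration the positive class is the minority (100 of 400 samples), so the positive predictive value falls well below the sensitivity even though the specificity is high. This is the general pattern to expect whenever the positive class is less common than the negative one.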
Acknowledgments
This work was supported by the University of Skövde, Sweden, under grants from the Knowledge Foundation (20170302, 20200014). The computations were enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) at Chalmers University of Technology, partially funded by the Swedish Research Council through grant agreement no. 2022-06725.
7 Appendix
Table 5.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Stahlschmidt, S.R., Ulfenborg, B., Synnergren, J. (2023). Predicting Cancer Stage from Circulating microRNA: A Comparative Analysis of Machine Learning Algorithms. In: Rojas, I., Valenzuela, O., Rojas Ruiz, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2023. Lecture Notes in Computer Science, vol 13919. Springer, Cham. https://doi.org/10.1007/978-3-031-34953-9_8
DOI: https://doi.org/10.1007/978-3-031-34953-9_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34952-2
Online ISBN: 978-3-031-34953-9
eBook Packages: Computer Science, Computer Science (R0)