Abstract
Liquid biopsy is a useful, minimally invasive tool for diagnosing and monitoring cancer. Yet developing accurate methods remains very challenging, given the potentially large number of input features and the typically small size of the available datasets.
Recently, a novel feature parameterization of RNA-sequenced platelet data, which uses biological knowledge from the Kyoto Encyclopedia of Genes and Genomes (KEGG), combined with a classifier based on a Convolutional Neural Network (CNN), significantly improved classification accuracy. In this work, we take a closer look at this approach and find that similar results can be obtained using significantly smaller models. Additionally, gradient boosting achieves competitive results. Since gradient boosting has the further advantage of adding interpretability to the model, we analyze it in more detail in this work.
This work has been partially supported by Statutory Funds of Electronics, Telecommunications and Informatics Faculty, Gdansk University of Technology. This work was supported in part through the European Regional Development Fund as part of the Project entitled: Academy of Innovative Applications of Digital Technologies under Grant The Operational Programme “Digital Poland” 2014-2020 number POPC.03.02.00-00-0001/20-00. This research was supported by the SONATA grant of the National Science Centre (2018/31/D/NZ5/01263) and Medical University of Gdańsk statutory work (ST-23, 02-0023/07).
Acknowledgments
The authors would like to thank Tomasz Bączek, Myron Best, Jacek Bigda, Peter Grešner, Jacek Jassem, Tomasz Stokowy, Thomas Würdinger, and Anna Żaczek for their constant support.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Cygert, S. et al. (2021). Towards Cancer Patients Classification Using Liquid Biopsy. In: Rekik, I., Adeli, E., Park, S.H., Schnabel, J. (eds) Predictive Intelligence in Medicine. PRIME 2021. Lecture Notes in Computer Science, vol 12928. Springer, Cham. https://doi.org/10.1007/978-3-030-87602-9_21
DOI: https://doi.org/10.1007/978-3-030-87602-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87601-2
Online ISBN: 978-3-030-87602-9
eBook Packages: Computer Science, Computer Science (R0)