Adap-BDCM: Adaptive Bilinear Dynamic Cascade Model for Classification Tasks on CNV Datasets

Jiang, Liancheng; Jia, Liye; Wang, Yizhen; Wu, Yongfei; Yue, Junhong

doi:10.1007/s12539-024-00635-w

Original research article
Published: 17 May 2024

Interdisciplinary Sciences: Computational Life Sciences (2024)

Liancheng Jiang¹, Liye Jia², Yizhen Wang¹, Yongfei Wu¹ & Junhong Yue¹ (ORCID: orcid.org/0000-0002-0042-7376)

Abstract

Copy number variation (CNV) is an essential genetic driver of cancer formation and progression, which makes intelligent classification based on CNV feasible. However, current machine learning and deep learning methods face several challenges, such as designing base classifier combination schemes in ensemble methods and choosing the layers of neural networks, which often result in low accuracy. Therefore, an adaptive bilinear dynamic cascade model (Adap-BDCM) is developed to further enhance the accuracy and applicability of these methods for intelligent classification on CNV datasets. In this model, a feature selection module is introduced to mitigate the interference of redundant information, and a bilinear model based on the gated attention mechanism is proposed to extract more beneficial deep fusion features. Furthermore, an adaptive base classifier selection scheme is designed to overcome the difficulty of manually designing base classifier combinations and to enhance the applicability of the model. Lastly, a novel feature fusion scheme with an attribute recall submodule is constructed, which effectively avoids getting stuck in local optima and losing valuable information. Extensive experiments demonstrate that the Adap-BDCM model achieves the best performance in cancer classification, stage prediction, and recurrence prediction on CNV datasets. This study can assist physicians in making faster and more accurate diagnoses.
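
To make the idea of a bilinear model with a gated attention mechanism more concrete, the following is a minimal sketch only. The abstract does not specify the architecture, so this is not the authors' implementation: the low-rank (Hadamard-product) form of the bilinear interaction, the sigmoid gate, the function name `gated_bilinear_fusion`, and the matrices `U`, `V`, and `W_gate` are all assumptions chosen for illustration, with random weights standing in for learned ones.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def gated_bilinear_fusion(x, U, V, W_gate):
    """Fuse a feature vector with itself via a gated low-rank bilinear map.

    x      : (d,) input features, e.g. CNV features kept by feature selection
    U, V   : (d, k) projection matrices (hypothetical; learned in practice)
    W_gate : (d, k) gate weights (hypothetical; learned in practice)
    """
    # Low-rank bilinear interaction: element-wise product of two projections.
    bilinear = (x @ U) * (x @ V)   # shape (k,)
    # The gate lies in (0, 1) and scales each fused feature.
    gate = sigmoid(x @ W_gate)     # shape (k,)
    return gate * bilinear         # shape (k,)


# Toy usage with random inputs and weights (illustrative values only).
rng = np.random.default_rng(0)
d, k = 8, 4
x = rng.normal(size=d)
U, V, W_gate = (rng.normal(size=(d, k)) for _ in range(3))
fused = gated_bilinear_fusion(x, U, V, W_gate)
print(fused.shape)
```

In this sketch the gate can only shrink each bilinear feature, which is one simple way to suppress uninformative interactions. The model in the linked repository may implement gating and fusion differently.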

Code Availability

The source code for the Adap-BDCM model and the CNV dataset for cancer classification are archived in the GitHub repository (https://github.com/junhonga/Adap-BDCM).

Acknowledgements

This work is supported by the Fundamental Research Program of Shanxi Province (General Program) (Grant Nos. 202303021211082 and 202303021211025).

Author information

Authors and Affiliations

¹ College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030600, China
Liancheng Jiang, Yizhen Wang, Yongfei Wu & Junhong Yue
² College of Computer Science and Technology, Taiyuan Normal University, Taiyuan, 030619, China
Liye Jia

Contributions

Concept and design: Liancheng Jiang and Liye Jia; data collection and analysis: Yizhen Wang; drafting of the article: Liancheng Jiang, Liye Jia, and Junhong Yue; critical revision of the article for important content: Liancheng Jiang, Liye Jia, Yizhen Wang, Yongfei Wu, and Junhong Yue. All authors approved the final article.

Corresponding author

Correspondence to Junhong Yue.

Ethics declarations

Competing Interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Ethics statement

Not applicable.

Informed consent

Not applicable.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jiang, L., Jia, L., Wang, Y. et al. Adap-BDCM: Adaptive Bilinear Dynamic Cascade Model for Classification Tasks on CNV Datasets. Interdiscip Sci Comput Life Sci (2024). https://doi.org/10.1007/s12539-024-00635-w

Received: 21 December 2023
Revised: 18 April 2024
Accepted: 23 April 2024
Published: 17 May 2024
DOI: https://doi.org/10.1007/s12539-024-00635-w
