(CDRGI)-Cancer detection through relevant genes identification

  • S.I.: WorldCIST’20
Neural Computing and Applications

Abstract

Cancer is a genetic disease that ranks among the most lethal and aggressive diseases. Early staging can reduce the high mortality rate associated with it. Advances in high-throughput sequencing technology and the application of machine learning algorithms have led to significant progress in oncogenomics over the past few decades. Oncogenomics uses RNA sequencing and gene expression profiling to identify cancer-related genes. The high dimensionality of RNA sequencing data makes relevant-gene identification a complex, large-scale optimization problem. CDRGI presents a discrete filtering technique, based on a binary Artificial Bee Colony coupled with a Support Vector Machine, together with a two-stage cascading classifier, to identify relevant genes and detect cancer from RNA-seq data. The approach was tested on seven cancers: breast cancer, stomach cancer (STAD), colon cancer (COAD), liver cancer, lung cancer (LUSC), kidney cancer (KIRC), and skin cancer. The results show that CDRGI performs better at feature reduction while achieving better classification accuracy for the STAD, COAD, LUSC and KIRC cancer types.
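
To make the gene-selection idea in the abstract concrete, the following is a minimal sketch of a binary, bee-colony-style wrapper search over gene masks. It is not the authors' implementation: the function names, the per-gene penalty, the bit-flip neighbourhood, and the toy data are all assumptions made for illustration. To stay dependency-free, the fitness uses a leave-one-out nearest-centroid classifier as a stand-in for the Support Vector Machine named in the abstract, and the two-stage cascading classifier is not modelled.

```python
import random


def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the masked genes.

    Stand-in (assumption) for the SVM-based evaluation named in the abstract.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue  # hold out sample i
            acc = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1
        predicted = min(
            sums,
            key=lambda lab: sum(
                (X[i][j] - sums[lab][c] / counts[lab]) ** 2
                for c, j in enumerate(cols)
            ),
        )
        correct += predicted == y[i]
    return correct / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small cost per selected gene (hypothetical weighting)."""
    return nearest_centroid_accuracy(X, y, mask) - penalty * sum(mask)


def binary_abc_select(X, y, n_bees=8, n_iters=30, limit=5, seed=0):
    """Search for a boolean gene mask with employed, onlooker and scout phases."""
    rng = random.Random(seed)
    n_features = len(X[0])

    def random_mask():
        return [rng.random() < 0.5 for _ in range(n_features)]

    sources = [random_mask() for _ in range(n_bees)]
    scores = [fitness(X, y, m) for m in sources]
    trials = [0] * n_bees
    best_i = max(range(n_bees), key=scores.__getitem__)
    best_mask, best_score = list(sources[best_i]), scores[best_i]

    def try_improve(i):
        nonlocal best_mask, best_score
        candidate = list(sources[i])
        j = rng.randrange(n_features)
        candidate[j] = not candidate[j]  # neighbour = one flipped bit
        score = fitness(X, y, candidate)
        if score > scores[i]:
            sources[i], scores[i], trials[i] = candidate, score, 0
            if score > best_score:
                best_mask, best_score = list(candidate), score
        else:
            trials[i] += 1

    for _ in range(n_iters):
        for i in range(n_bees):  # employed bees: one local move per source
            try_improve(i)
        weights = [max(s, 1e-9) for s in scores]
        for _ in range(n_bees):  # onlookers: favour fitter sources
            try_improve(rng.choices(range(n_bees), weights=weights)[0])
        for i in range(n_bees):  # scouts: replace exhausted sources
            if trials[i] > limit:
                sources[i] = random_mask()
                scores[i] = fitness(X, y, sources[i])
                trials[i] = 0
                if scores[i] > best_score:
                    best_mask, best_score = list(sources[i]), scores[i]
    return best_mask, best_score


def make_toy_data(n_samples=40, n_features=6, seed=1):
    """Synthetic data (not RNA-seq): only 'gene' 0 separates the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 6.0 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, score = binary_abc_select(X, y)
    print("selected genes:", [j for j, keep in enumerate(mask) if keep])
    print("fitness:", round(score, 3))
```

The fitness function is the only part tied to a particular classifier, so a cross-validated SVM accuracy could be substituted for `nearest_centroid_accuracy` without changing the search loop. On real RNA-seq data, with tens of thousands of genes, the cost of each fitness evaluation would dominate the run time.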

Author information

Corresponding author

Correspondence to Saad Razzaq.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Al-Obeidat, F., Rocha, Á., Akram, M. et al. (CDRGI)-Cancer detection through relevant genes identification. Neural Comput & Applic 34, 8447–8454 (2022). https://doi.org/10.1007/s00521-021-05739-8

  • DOI: https://doi.org/10.1007/s00521-021-05739-8
