A Novel Computational Approach for Biomarker Detection for Gene Expression-Based Computer-Aided Diagnostic Systems for Breast Cancer

  • Protocol
Artificial Neural Networks

Part of the book series: Methods in Molecular Biology (MIMB, volume 2190)

Abstract

Cancer produces complex cellular changes. Microarrays have become crucial to identifying the genes involved in causing these changes; however, microarray data analysis is challenged by the high dimensionality of the data relative to the number of samples. This has contributed to inconsistent cancer biomarkers across gene expression studies. In addition, identification of crucial genes in cancer can be expedited through expression profiling of peripheral blood cells. We introduce a novel feature selection method for microarrays that uses a two-step filtering process to select a minimal set of genes with greater consistency and relevance, and we demonstrate that the selected gene set considerably enhances the accuracy of cancer diagnosis. The preliminary filter (the Bi-biological filter) builds gene coexpression networks for cancer and healthy conditions using a topological overlap matrix (TOM) and finds cancer-specific gene clusters using spectral clustering (SC). A second filtering step then extracts a much-reduced set of crucial genes using best-first search with a support vector machine (BFS-SVM). Finally, artificial neural network (ANN), SVM, and K-nearest neighbor classifiers are used to assess the predictive power of the selected genes and to select the most effective diagnostic system. The approach was applied to peripheral blood profiling for breast cancer, where the Bi-biological filter selected 415 biologically consistent genes, from which BFS-SVM extracted 13 highly cancer-specific genes for breast cancer identification. ANN was the superior classifier, with 93.2% classification accuracy, a 14% improvement over the study from which the data were obtained (Aaroe et al., Breast Cancer Res 12:R7, 2010).
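To make the topological overlap step more concrete, the following is a minimal sketch of how a TOM could be computed for a handful of genes. It is not the authors' implementation: it assumes the standard unsigned weighted-network form of Zhang and Horvath (2005), with adjacency |r|^beta from Pearson correlation, and the function names, the soft-threshold power `beta=6`, and the toy expression values are all illustrative assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def adjacency(expr, beta=6):
    """Soft-thresholded adjacency a_ij = |cor(i, j)|^beta, zero diagonal.

    expr holds one list of sample values per gene; beta is an assumed
    soft-threshold power, not a value taken from the chapter.
    """
    g = len(expr)
    a = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            a[i][j] = a[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return a


def topological_overlap(a):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(a)
    k = [sum(row) for row in a]  # connectivity of each gene (diagonal is 0)
    tom = [[1.0] * g for _ in range(g)]  # each gene fully overlaps itself
    for i in range(g):
        for j in range(i + 1, g):
            shared = sum(a[i][u] * a[u][j] for u in range(g) if u not in (i, j))
            tom[i][j] = tom[j][i] = (shared + a[i][j]) / (
                min(k[i], k[j]) + 1.0 - a[i][j]
            )
    return tom


# Toy data: genes 0 and 1 are perfectly correlated, gene 2 is not.
toy_expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, -1.0, 1.0, -1.0],
]
toy_tom = topological_overlap(adjacency(toy_expr))
```

In a pipeline like the one described, one such matrix would be built per condition (cancer and healthy) and then passed to a clustering step; the sketch stops at the matrix itself and uses plain lists only to stay dependency-free.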

Data: Available from NCBI Gene Expression Omnibus: accession number GEO:GSE16443.

References

  1. Aaroe J, Lindahl T, Dumeaux V et al (2010) Gene expression profiling of peripheral blood cells for early detection of breast cancer. Breast Cancer Res 12(1):R7

  2. Marteau J-B, Mohr S, Pfister M et al (2005) Collection and storage of human blood cells for mRNA expression profiling: a 15-month stability study. Clin Chem 51(7):1250–1252

  3. Fang X, Evans K, Willis RC et al (2006) High-throughput sample preparation from whole blood for gene expression analysis. J Assoc Lab Automat 11(6):381–386. https://doi.org/10.1016/j.jala.2006.10.001

  4. Fan X, Shi L, Fang H et al (2010) DNA microarrays are predictive of cancer prognosis: a re-evaluation. Clin Cancer Res 16(2):629–636

  5. Kretschmer C, Sterner-Kock A, Siedentopf F et al (2011) Identification of early molecular markers for breast cancer. Mol Cancer 10(1):15

  6. Ma S, Kosorok MR, Huang J et al (2011) Incorporating higher-order representative features improves prediction in network-based cancer prognosis analysis. BMC Med Genet 4:5

  7. Schrauder MG, Strick R, Schulz-Wendtland R et al (2012) Circulating microRNAs as potential blood-based markers for early stage breast cancer detection. PLoS One 7(1):e29770

  8. Sharma P, Sahni NS, Tibshirani R et al (2005) Early detection of breast cancer based on gene-expression patterns in peripheral blood cells. Breast Cancer Res 7(5):R634–R644

  9. Wu CFJ (1986) Jackknife, bootstrap and other resampling methods in regression analysis. Ann Stat 14(4):1261–1295

  10. Obayashi T, Hayashi S, Shibaoka M et al (2008) COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res 36(suppl 1):D77–D82

  11. Yip A, Horvath S (2007) Gene network interconnectedness and the generalized topological overlap measure. BMC Bioinformatics 8(1):22

  12. Zhang B, Horvath S (2005) A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol

  13. Blum AL, Langley P (1997) Selection of relevant features and examples in machine learning. Artif Intell 97(1–2):245–271

  14. Gilad-Bachrach R, Navot A, Tishby N (2004) Margin based feature selection - theory and algorithms. In: Proceedings of the twenty-first international conference on machine learning. ACM, Banff

  15. Ng AY, Jordan MI, Weiss Y (2001) On spectral clustering: analysis and an algorithm. Adv Neural Inf Proces Syst 14

  16. Samarasinghe S (2010) Neural networks for water system analysis: from fundamentals to complex pattern recognition. In: Hydrocomplexity: new tools for solving wicked water problems, international association of hydrological science, Paris. pp 209–213

  17. Al-Yousef A, Samarasinghe S (2011) Ultrasound based computer aided diagnosis of breast cancer: evaluation of a new feature of mass central regularity degree

  18. Dennis G, Sherman BT, Hosack DA et al (2003) DAVID: database for annotation, visualization, and integrated discovery. Genome Biol 4(5):P3

  19. Haldar S, Negrini M, Monne M et al (1994) Down-regulation of bcl-2 by p53 in breast cancer cells. Cancer Res 54(8):2095–2097

  20. Parton M, Dowsett M, Smith I (2001) Studies of apoptosis in breast cancer. BMJ 322(7301):1528–1532

  21. Feng Z, Marti A, Jehn B et al (1995) Glucocorticoid and progesterone inhibit involution and programmed cell death in the mouse mammary gland. J Cell Biol 131(4):1095–1103

  22. Graham JD, Clarke CL (1997) Physiological action of progesterone in target tissues. Endocr Rev 18(4):502–519

  23. European Molecular Biology Laboratory, EMBL-EBI (2011) European Bioinformatics Institute

  24. Simonnet H, Alazard N, Pfeiffer K et al (2002) Low mitochondrial respiratory chain content correlates with tumor aggressiveness in renal cell carcinoma. Carcinogenesis 23(5):759–768

  25. Warburg O (1956) On the origin of cancer cells. Science 123(3191):309–314

  26. Beitsch PD, Clifford E (2000) Detection of carcinoma cells in the blood of breast cancer patients. Am J Surg 180(6):446–449

  27. Annibaldi A, Widmann C (2010) Glucose metabolism in cancer cells. Curr Opin Clin Nutr Metab Care 13(4):466–470. https://doi.org/10.1097/MCO.0b013e32833a5577

Acknowledgements

This work was supported by a scholarship from Jerash University, Jordan, and by Lincoln University, New Zealand.

Author information

Corresponding author

Correspondence to Sandhya Samarasinghe.

Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Al-Yousef, A., Samarasinghe, S. (2021). A Novel Computational Approach for Biomarker Detection for Gene Expression-Based Computer-Aided Diagnostic Systems for Breast Cancer. In: Cartwright, H. (eds) Artificial Neural Networks. Methods in Molecular Biology, vol 2190. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0826-5_9

  • DOI: https://doi.org/10.1007/978-1-0716-0826-5_9

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-0825-8

  • Online ISBN: 978-1-0716-0826-5

  • eBook Packages: Springer Protocols