Abstract
Detecting and diagnosing tumor subclasses is an essential requirement for personalized medical treatment and for refining our understanding of the effects that somatic cells show towards other clinical conditions. The genome of these somatic cells exhibits the mutations and genetic variations of breast cancer cells and helps in understanding their characteristic behavior. However, analysis of such data has so far been limited to clustering, and it remains to be explored what else can be done with the data to identify tumor subcategories and the stages of subclasses. In this work, we extend earlier work on similar data (consisting of 105 breast tumor cell lines) to solve other detection and characterization problems through computation and intelligent representation learning. Most of our work consists of systematic data cleaning, analysis, and building prediction models with deep computational architectures, and we establish that the transformed data helps to better distinguish the respective categories. Our main contribution is the novel gene-subcategory interaction-based regularization (GSIAR) concept for data selection and analysis, which, alongside the prediction models, is shown to enhance the performance of classification techniques.
Ethics declarations
Conflict of interest
The author declares that he has no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by the author.
Cite this article
Sur, C. GSIAR: gene-subcategory interaction-based improved deep representation learning for breast cancer subcategorical analysis using gene expression, applicable for precision medicine. Med Biol Eng Comput 57, 2483–2515 (2019). https://doi.org/10.1007/s11517-019-02038-2
DOI: https://doi.org/10.1007/s11517-019-02038-2