A Bioinformatics Approach for Understanding Genotype–Phenotype Correlation in Breast Cancer

  • Sohiya Yotsukura
  • Masayuki Karasuyama
  • Ichigaku Takigawa
  • Hiroshi Mamitsuka


Breast cancer (BC) patients can be clinically classified into three types, ER+, PR+, and HER2+, each named after a biomarker and linked to a treatment. A serious problem is that “triple negative” (TN) patients, who do not fall into any of these three categories, have no clear treatment options. Linking TN patients to the three main clinical phenotypes is therefore very important. BC patients are usually profiled by gene expression, but the resulting patient classes (such as PAM50) are inconsistent with the clinical phenotypes. Location-specific sequence variants, on the other hand, are expected to be more predictive for detecting BC patient subgroups, since a variety of single somatic mutations have been clearly shown to be linked to the resulting tumors. However, those mutations have not necessarily been well evaluated as patterns for predicting BC phenotypes. We therefore detect patterns that can assign known phenotypes to TN BC patients, focusing on paired or more complex nucleotide/gene mutational patterns, using three machine learning methods: the limitless-arity multiple-testing procedure (LAMP), decision trees, and hierarchical disjoint clustering. Association rules obtained through LAMP reveal a patient classification scheme based on combinatorial mutations in PIK3CA and TP53, which is consistent with the obtained decision tree and with three major clusters (covering 182/208 samples); this agreement across diverse approaches supports the validity of the results. The final clusters, which contain TN patients, present sub-population features in the TN patient pool that assign clinical phenotypes to TN patients. This paper is an extended and detailed version of a pilot study conducted in Yotsukura et al. (Brief Bioinform, to appear).
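
To make the idea of testing combinatorial mutation patterns against a phenotype concrete, the following is a minimal sketch and not the authors' pipeline. It enumerates gene combinations up to a small arity, scores each with a one-sided Fisher exact test, and applies a plain Bonferroni correction. LAMP itself uses a tighter multiple-testing correction that is not implemented here. The function names, the placeholder gene GENE_X, and the toy samples are all hypothetical and exist only for illustration.

```python
from itertools import combinations
from math import comb


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null for the 2x2 table [[a, b], [c, d]].

    Rows: pattern present / absent. Columns: target phenotype / other.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / total


def score_patterns(samples, target, max_arity=2):
    """Score every gene combination up to max_arity against one phenotype.

    samples: list of (set_of_mutated_genes, phenotype_label) pairs.
    Returns (pattern, support, bonferroni_p) tuples sorted by adjusted p-value.
    """
    genes = sorted(set().union(*(muts for muts, _ in samples)))
    patterns = [frozenset(c)
                for r in range(1, max_arity + 1)
                for c in combinations(genes, r)]
    results = []
    for pat in patterns:
        # A sample carries the pattern only if it has ALL genes in the pattern.
        a = sum(1 for m, y in samples if pat <= m and y == target)
        b = sum(1 for m, y in samples if pat <= m and y != target)
        c = sum(1 for m, y in samples if not pat <= m and y == target)
        d = sum(1 for m, y in samples if not pat <= m and y != target)
        p = fisher_one_sided(a, b, c, d)
        # Bonferroni: multiply by the number of patterns tested (assumption:
        # this is a simple stand-in for LAMP's correction).
        results.append((tuple(sorted(pat)), a + b, min(1.0, p * len(patterns))))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: mutated genes per patient and a clinical label.
TOY_SAMPLES = [
    ({"PIK3CA"}, "ER+"),
    ({"PIK3CA"}, "ER+"),
    ({"PIK3CA", "GENE_X"}, "ER+"),
    ({"TP53"}, "HER2+"),
    ({"TP53", "PIK3CA"}, "HER2+"),
    ({"TP53"}, "TN"),
    ({"TP53", "GENE_X"}, "TN"),
]

if __name__ == "__main__":
    for pattern, support, p_adj in score_patterns(TOY_SAMPLES, "ER+"):
        print(pattern, support, round(p_adj, 4))
```

With so few toy samples the adjusted p-values carry no meaning. The point is only the shape of the computation: a pattern is a set of genes, its support is the number of patients carrying all of them, and significance is judged after correcting for the number of patterns tested.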


Bioinformatics approach, Genotypes, Phenotypes, Breast cancer, Correlation analysis



We thank Dr. Ajit Bharti (Boston University, USA) for his insight that a better understanding of breast cancer subtypes is needed. We would also like to thank the TCGA Data Access Committee (DAC) for giving us the opportunity to work with the data for this study.

S.Y. is supported by a Grant-in-Aid for JSPS Fellows and JSPS KAKENHI #26-381. M.K. is supported by JSPS KAKENHI #26730120. I.T. is funded by the Collaborative Research Program of the Institute for Chemical Research, Kyoto University (Grants #2014-27 and #2015-33). H.M. is partially supported by JSPS KAKENHI #24300054.


  1. S. Yotsukura, I. Takigawa, M. Karasuyama, and H. Mamitsuka, “Exploring phenotype patterns of breast cancer within somatic mutations,” Briefings in Bioinformatics. To appear. doi: 10.1093/bib/bbw040
  2. J. M. Rae, S. Drury, D. F. Hayes, V. Stearns, J. N. Thibert, B. P. Haynes, J. Salter, I. Sestak, J. Cuzick, and M. Dowsett, “CYP2D6 and UGT2B7 genotype and risk of recurrence in tamoxifen-treated breast cancer patients,” J. Natl. Cancer Inst., vol. 104, pp. 452–460, Mar 2012.
  3. R. G. Margolese, G. N. Hortobagyi, and T. A. Buchholz, “Management of metastatic breast cancer,” in Holland-Frei Cancer Medicine (D. W. Kufe, R. E. Pollock, R. R. Weichselbaum, et al., eds.), Hamilton, ON: BC Decker, 6th ed., 2003.
  4. L. R. Howe and P. H. Brown, “Targeting the HER/EGFR/ErbB family to prevent breast cancer,” Cancer Prev Res (Phila), vol. 4, pp. 1149–1157, Aug 2011.
  5. K. R. Bauer, M. Brown, R. D. Cress, C. A. Parise, and V. Caggiano, “Descriptive analysis of estrogen receptor (ER)-negative, progesterone receptor (PR)-negative, and HER2-negative invasive breast cancer, the so-called triple-negative phenotype: a population-based study from the California Cancer Registry,” Cancer, vol. 109, pp. 1721–1728, May 2007.
  6. A. Prat, C. Cruz, K. A. Hoadley, O. Diez, C. M. Perou, and J. Balmana, “Molecular features of the basal-like breast cancer subtype based on BRCA1 mutation status,” Breast Cancer Res. Treat., vol. 147, pp. 185–191, Aug 2014.
  7. B. D. Lehmann, J. A. Bauer, X. Chen, M. E. Sanders, A. B. Chakravarthy, Y. Shyr, and J. A. Pietenpol, “Identification of human triple-negative breast cancer subtypes and preclinical models for selection of targeted therapies,” J. Clin. Invest., vol. 121, pp. 2750–2767, Jul 2011.
  8. A. Prat, A. Lluch, J. Albanell, W. T. Barry, C. Fan, J. I. Chacon, J. S. Parker, L. Calvo, A. Plazaola, A. Arcusa, M. A. Segui-Palmer, O. Burgues, N. Ribelles, A. Rodriguez-Lescure, A. Guerrero, M. Ruiz-Borrego, B. Munarriz, J. A. Lopez, B. Adamo, M. C. Cheang, Y. Li, Z. Hu, M. L. Gulley, M. J. Vidal, B. N. Pitcher, M. C. Liu, M. L. Citron, M. J. Ellis, E. Mardis, T. Vickery, C. A. Hudis, E. P. Winer, L. A. Carey, R. Caballero, E. Carrasco, M. Martin, C. M. Perou, and E. Alba, “Predicting response and survival in chemotherapy-treated triple-negative breast cancer,” Br. J. Cancer, vol. 111, pp. 1532–1541, Oct 2014.
  9. D. C. Koboldt, R. S. Fulton, M. D. McLellan, H. Schmidt, J. Kalicki-Veizer, J. F. McMichael, et al., “Comprehensive molecular portraits of human breast tumours,” Nature, vol. 490, pp. 61–70, Oct 2012.
  10. J. S. Parker, M. Mullins, M. C. Cheang, S. Leung, D. Voduc, T. Vickery, S. Davies, C. Fauron, X. He, Z. Hu, J. F. Quackenbush, I. J. Stijleman, J. Palazzo, J. S. Marron, A. B. Nobel, E. Mardis, T. O. Nielsen, M. J. Ellis, C. M. Perou, and P. S. Bernard, “Supervised risk predictor of breast cancer based on intrinsic subtypes,” J. Clin. Oncol., vol. 27, pp. 1160–1167, Mar 2009.
  11. I. R. Watson, K. Takahashi, P. A. Futreal, and L. Chin, “Emerging patterns of somatic mutations in cancer,” Nat. Rev. Genet., vol. 14, pp. 703–718, Oct 2013.
  12. X. Bai, E. Zhang, H. Ye, V. Nandakumar, Z. Wang, L. Chen, C. Tang, J. Li, H. Li, W. Zhang, W. Han, F. Lou, D. Zhang, H. Sun, H. Dong, G. Zhang, Z. Liu, Z. Dong, B. Guo, H. Yan, C. Yan, L. Wang, Z. Su, Y. Li, L. Jones, X. F. Huang, S. Y. Chen, and J. Gao, “PIK3CA and TP53 gene mutations in human breast cancer tumors frequently detected by ion torrent DNA sequencing,” PLoS ONE, vol. 9, no. 6, p. e99306, 2014.
  13. S. P. Shah, A. Roth, R. Goya, A. Oloumi, G. Ha, Y. Zhao, G. Turashvili, J. Ding, K. Tse, G. Haffari, A. Bashashati, L. M. Prentice, J. Khattra, A. Burleigh, D. Yap, V. Bernard, A. McPherson, K. Shumansky, A. Crisan, R. Giuliany, A. Heravi-Moussavi, J. Rosner, D. Lai, I. Birol, R. Varhol, A. Tam, N. Dhalla, T. Zeng, K. Ma, S. K. Chan, M. Griffith, A. Moradian, S. W. Cheng, G. B. Morin, P. Watson, K. Gelmon, S. Chia, S. F. Chin, C. Curtis, O. M. Rueda, P. D. Pharoah, S. Damaraju, J. Mackey, K. Hoon, T. Harkins, V. Tadigotla, M. Sigaroudinia, P. Gascard, T. Tlsty, J. F. Costello, I. M. Meyer, C. J. Eaves, W. W. Wasserman, S. Jones, D. Huntsman, M. Hirst, C. Caldas, M. A. Marra, and S. Aparicio, “The clonal and mutational evolution spectrum of primary triple-negative breast cancers,” Nature, vol. 486, pp. 395–399, Jun 2012.
  14. A. Terada, M. Okada-Hatakeyama, K. Tsuda, and J. Sese, “Statistical significance of combinatorial regulations,” Proc. Natl. Acad. Sci. U.S.A., vol. 110, pp. 12996–13001, Aug 2013.
  15. T. Therneau, B. Atkinson, and B. Ripley, rpart: Recursive Partitioning and Regression Trees, 2011.
  16. T. Hothorn, K. Hornik, and A. Zeileis, “Unbiased recursive partitioning: A conditional inference framework,” Journal of Computational and Graphical Statistics, vol. 15, no. 3, pp. 651–674, 2006.
  17.
  18. R. Tibshirani, G. Walther, and T. Hastie, “Estimating the number of clusters in a dataset via the gap statistic,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 63, no. 2, pp. 411–423, 2000.
  19. C. Kandoth, M. D. McLellan, F. Vandin, K. Ye, B. Niu, C. Lu, M. Xie, Q. Zhang, J. F. McMichael, M. A. Wyczalkowski, M. D. Leiserson, C. A. Miller, J. S. Welch, M. J. Walter, M. C. Wendl, T. J. Ley, R. K. Wilson, B. J. Raphael, and L. Ding, “Mutational landscape and significance across 12 major cancer types,” Nature, vol. 502, pp. 333–339, Oct 2013.
  20. H. Thorvaldsdottir, J. T. Robinson, and J. P. Mesirov, “Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration,” Brief. Bioinformatics, vol. 14, pp. 178–192, Mar 2013.
  21. D. W. Huang, B. T. Sherman, and R. A. Lempicki, “Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources,” Nat. Protoc., vol. 4, no. 1, pp. 44–57, 2009.
  22. M. Kanehisa, S. Goto, Y. Sato, M. Kawashima, M. Furumichi, and M. Tanabe, “Data, information, knowledge and principle: back to metabolism in KEGG,” Nucleic Acids Res., vol. 42, pp. 199–205, Jan 2014.
  23. 2014. Online Mendelian Inheritance in Man, OMIM. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD), World Wide Web URL:
  24. R Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2013.
  25. C. O’Brien, J. J. Wallin, D. Sampath, D. GuhaThakurta, H. Savage, E. A. Punnoose, J. Guan, L. Berry, W. W. Prior, L. C. Amler, M. Belvin, L. S. Friedman, and M. R. Lackner, “Predictive biomarkers of sensitivity to the phosphatidylinositol 3’ kinase inhibitor GDC-0941 in breast cancer preclinical models,” Clin. Cancer Res., vol. 16, pp. 3670–3683, Jul 2010.
  26. L. H. Saal, K. Holm, M. Maurer, L. Memeo, T. Su, X. Wang, J. S. Yu, P. O. Malmstrom, M. Mansukhani, J. Enoksson, H. Hibshoosh, A. Borg, and R. Parsons, “PIK3CA mutations correlate with hormone receptors, node metastasis, and ERBB2, and are mutually exclusive with PTEN loss in human breast carcinoma,” Cancer Res., vol. 65, pp. 2554–2559, Apr 2005.
  27. K. Stemke-Hale, A. M. Gonzalez-Angulo, A. Lluch, R. M. Neve, W. L. Kuo, M. Davies, M. Carey, Z. Hu, Y. Guan, A. Sahin, W. F. Symmans, L. Pusztai, L. K. Nolden, H. Horlings, K. Berns, M. C. Hung, M. J. van de Vijver, V. Valero, J. W. Gray, R. Bernards, G. B. Mills, and B. T. Hennessy, “An integrative genomic and proteomic analysis of PIK3CA, PTEN, and AKT mutations in breast cancer,” Cancer Res., vol. 68, pp. 6084–6091, Aug 2008.
  28. H. G. Ahmed, M. A. Al-Adhraei, and I. M. Ashankyty, “Association between AgNORs and Immunohistochemical Expression of ER, PR, HER2/neu, and p53 in Breast Carcinoma,” Patholog Res Int, vol. 2011, p. 237217, 2011.
  29. P. de Cremoux, A. V. Salomon, S. Liva, R. Dendale, B. Bouchind’homme, E. Martin, X. Sastre-Garau, H. Magdelenat, A. Fourquet, and T. Soussi, “p53 mutation as a genetic trait of typical medullary breast carcinoma,” J. Natl. Cancer Inst., vol. 91, pp. 641–643, Apr 1999.
  30. P. Yang, C. W. Du, M. Kwan, S. X. Liang, and G. J. Zhang, “The impact of p53 in predicting clinical outcome of breast cancer patients with visceral metastasis,” Sci Rep, vol. 3, p. 2246, 2013.
  31. H. Yamashita, M. Nishio, T. Toyama, H. Sugiura, Z. Zhang, S. Kobayashi, and H. Iwase, “Coexistence of HER2 over-expression and p53 protein accumulation is a strong prognostic molecular marker in breast cancer,” Breast Cancer Res., vol. 6, no. 1, pp. 24–30, 2004.
  32. E. Biganzoli, D. Coradini, F. Ambrogi, J. M. Garibaldi, P. Lisboa, D. Soria, A. R. Green, M. Pedriali, M. Piantelli, P. Querzoli, R. Demicheli, P. Boracchi, I. Nenci, I. O. Ellis, and S. Alberti, “p53 status identifies two subgroups of triple-negative breast cancers with distinct biological features,” Jpn. J. Clin. Oncol., vol. 41, pp. 172–179, Feb 2011.
  33. S. Banerji, K. Cibulskis, C. Rangel-Escareno, et al., “Sequence analysis of mutations and translocations across breast cancer subtypes,” Nature, vol. 486, pp. 405–409, Jun 2012.
  34. C. X. Ma, T. Reinert, I. Chmielewska, et al., “Mechanisms of aromatase inhibitor resistance,” Nat. Rev. Cancer, vol. 15, pp. 261–275, May 2015.
  35. E. Cerami, J. Gao, U. Dogrusoz, B. E. Gross, S. O. Sumer, B. A. Aksoy, A. Jacobsen, C. J. Byrne, M. L. Heuer, E. Larsson, Y. Antipin, B. Reva, A. P. Goldberg, C. Sander, and N. Schultz, “The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data,” Cancer Discov, vol. 2, pp. 401–404, May 2012.
  36. J. Gao, B. A. Aksoy, U. Dogrusoz, G. Dresdner, B. Gross, S. O. Sumer, Y. Sun, A. Jacobsen, R. Sinha, E. Larsson, E. Cerami, C. Sander, and N. Schultz, “Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal,” Sci Signal, vol. 6, p. pl1, Apr 2013.
  37. M. Heiskanen, J. Kononen, M. Barlund, J. Torhorst, G. Sauter, A. Kallioniemi, and O. Kallioniemi, “CGH, cDNA and tissue microarray analyses implicate FGFR2 amplification in a small subset of breast tumors,” Anal Cell Pathol, vol. 22, no. 4, pp. 229–234, 2001.
  38. V. K. Jain and N. C. Turner, “Challenges and opportunities in the targeting of fibroblast growth factor receptors in breast cancer,” Breast Cancer Res., vol. 14, no. 3, p. 208, 2012.
  39. N. Turner, M. B. Lambros, H. M. Horlings, A. Pearson, R. Sharpe, R. Natrajan, F. C. Geyer, M. van Kouwenhove, B. Kreike, A. Mackay, A. Ashworth, M. J. van de Vijver, and J. S. Reis-Filho, “Integrative molecular profiling of triple negative breast cancers identifies amplicon drivers and potential therapeutic targets,” Oncogene, vol. 29, pp. 2013–2023, Apr 2010.
  40. S. L. Maguire, A. Leonidou, P. Wai, C. Marchio, C. K. Ng, A. Sapino, A. V. Salomon, J. S. Reis-Filho, B. Weigelt, and R. C. Natrajan, “SF3B1 mutations constitute a novel therapeutic target in breast cancer,” J. Pathol., vol. 235, pp. 571–580, Mar 2015.
  41. A. C. Vargas, J. S. Reis-Filho, and S. R. Lakhani, “Phenotype-genotype correlation in familial breast cancer,” J Mammary Gland Biol Neoplasia, vol. 16, pp. 27–40, Apr 2011.
  42. A. Langerød, H. Zhao, Ø. Borgan, J. M. Nesland, I. R. Bukholm, T. Ikdahl, R. Kåresen, A. L. Børresen-Dale, and S. S. Jeffrey, “TP53 mutation status and gene expression profiles are powerful prognostic markers of breast cancer,” Breast Cancer Res., vol. 9, no. 3, p. R30, 2007.
  43. J. Alsner, M. Yilmaz, P. Guldberg, L. L. Hansen, and J. Overgaard, “Heterogeneity in the clinical phenotype of TP53 mutations in breast cancer patients,” Clin. Cancer Res., vol. 6, pp. 3923–3931, Oct 2000.
  44. G. Ligresti, L. Militello, L. S. Steelman, A. Cavallaro, F. Basile, F. Nicoletti, F. Stivala, J. A. McCubrey, and M. Libra, “PIK3CA mutations in human solid tumors: role in sensitivity to various therapeutic approaches,” Cell Cycle, vol. 8, pp. 1352–1358, May 2009.
  45. M. N. Fletcher, M. A. Castro, X. Wang, I. de Santiago, M. O’Reilly, S. F. Chin, O. M. Rueda, C. Caldas, B. A. Ponder, F. Markowetz, and K. B. Meyer, “Master regulators of FGFR2 signalling and breast cancer risk,” Nat. Commun., vol. 4, p. 2464, 2013.
  46. B. Wappenschmidt, R. Fimmers, K. Rhiem, M. Brosig, E. Wardelmann, A. Meindl, N. Arnold, P. Mallmann, and R. K. Schmutzler, “Strong evidence that the common variant S384F in BRCA2 has no pathogenic relevance in hereditary breast cancer,” Breast Cancer Res., vol. 7, no. 5, pp. R775–779, 2005.
  47. D. Walerych, M. Napoli, L. Collavin, and G. Del Sal, “The rebel angel: mutant p53 as the driving oncogene in breast cancer,” Carcinogenesis, vol. 33, pp. 2007–2017, Nov 2012.
  48. C. Coles, A. Condie, U. Chetty, C. M. Steel, H. J. Evans, and J. Prosser, “p53 mutations in breast cancer,” Cancer Res., vol. 52, pp. 5291–5298, Oct 1992.
  49. D. A. Deming, A. A. Leystra, L. Nettekoven, C. Sievers, D. Miller, M. Middlebrooks, L. Clipson, D. Albrecht, J. Bacher, M. K. Washington, J. Weichert, and R. B. Halberg, “PIK3CA and APC mutations are synergistic in the development of intestinal cancers,” Oncogene, vol. 33, pp. 2245–2254, Apr 2014.
  50. B. Weigelt, P. H. Warne, M. B. Lambros, J. S. Reis-Filho, and J. Downward, “PI3K pathway dependencies in endometrioid endometrial cancer cell lines,” Clin. Cancer Res., vol. 19, pp. 3533–3544, Jul 2013.
  51. B. D. Lehmann, J. A. Bauer, J. M. Schafer, C. S. Pendleton, L. Tang, K. C. Johnson, X. Chen, J. M. Balko, H. Gomez, C. L. Arteaga, G. B. Mills, M. E. Sanders, and J. A. Pietenpol, “PIK3CA mutations in androgen receptor-positive triple negative breast cancer confer sensitivity to the combination of PI3K and androgen receptor inhibitors,” Breast Cancer Res., vol. 16, no. 4, p. 406, 2014.
  52. R. Arsenic, A. Lehmann, J. Budczies, I. Koch, J. Prinzler, A. Kleine-Tebbe, C. Schewe, S. Loibl, M. Dietel, and C. Denkert, “Analysis of PIK3CA mutations in breast cancer subtypes,” Appl. Immunohistochem. Mol. Morphol., vol. 22, pp. 50–56, Jan 2014.
  53. S. Loibl, G. von Minckwitz, A. Schneeweiss, S. Paepke, A. Lehmann, M. Rezai, D. M. Zahm, P. Sinn, F. Khandan, H. Eidtmann, K. Dohnal, C. Heinrichs, J. Huober, B. Pfitzner, P. A. Fasching, F. Andre, J. L. Lindner, C. Sotiriou, A. Dykgers, S. Guo, S. Gade, V. Nekljudova, S. Loi, M. Untch, and C. Denkert, “PIK3CA mutations are associated with lower rates of pathologic complete response to anti-human epidermal growth factor receptor 2 (HER2) therapy in primary HER2-overexpressing breast cancer,” J. Clin. Oncol., vol. 32, pp. 3212–3220, Oct 2014.
  54. K. A. Hoadley, C. Yau, D. M. Wolf, A. D. Cherniack, D. Tamborero, S. Ng, et al., “Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin,” Cell, vol. 158, pp. 929–944, Aug 2014.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Sohiya Yotsukura (1)
  • Masayuki Karasuyama (2)
  • Ichigaku Takigawa (3)
  • Hiroshi Mamitsuka (1)
  1. Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
  2. Department of Computer Science, Nagoya Institute of Technology, Nagoya, Japan
  3. Graduate School of Information Science and Technology, Hokkaido University, Hokkaido, Japan
