Abstract
Methods to catalog and computationally assess the mutational landscape of proteins in human cancers are desirable. One approach is to adapt evolutionary or data-driven methods developed for predicting whether a single-nucleotide polymorphism (SNP) is deleterious to protein structure and function. Where the goal is to understand the mechanism of protein activation and regulation, an alternative is to use structure-based computational methods to predict the effects of point mutations. We compare these two approaches through a case study of mutations in the kinase domains of three proteins: the anaplastic lymphoma kinase (ALK) in pediatric neuroblastoma patients, serine/threonine-protein kinase B-Raf (BRAF) in melanoma patients, and erythroblastic oncogene B 2 (ErbB2 or HER2) in breast cancer patients. We find that the structure-based method is most appropriate for developing a binary classification of several different mutations, especially infrequently occurring ones, with respect to the activation status of the given target protein. This approach is especially useful if the effects of mutations on the interactions of inhibitors with the target proteins are being sought. However, many patients will present with mutations spread across different target proteins, making structure-based models computationally demanding to implement and execute. In this situation, data-driven methods, including those based on machine learning techniques and evolutionary methods, are most appropriate for recognizing and illuminating mutational patterns. We show, however, that in the present state of the field, the two methods have very different accuracies and confidence values, and hence the optimal choice of their deployment is context-dependent.
Acknowledgements
We thank G. S. Stamatakos, N. Graf, and members of the CHIC consortium and the Radhakrishnan Laboratory for insightful discussions. The research leading to these results has received funding from the European Commission Grant FP7-ICT-2011-9-600841 and National Institutes of Health Grants U54 CA193417, U01 CA227550, and R35-GM122485 (MAL). Computational resources were provided in part by the National Partnership for Advanced Computational Infrastructure under Grant no. MCB060006 from XSEDE.
Additional information
Insight statement: One of the grand challenges in understanding cancer progression is to find mechanistic links between molecular alterations and the hallmarks of cancers. As we gather clinical data at a large scale aimed at molecular profiling of patients or patient cohorts, functionally annotating the data, or deriving mechanistic insights from the data, becomes ever more challenging. In this article, we provide an integrative framework for combining the state of the art in two different fields, namely structural biology and machine learning, to delineate hitherto unknown mechanisms and relationships in cancer genomes, an approach that has the potential to make a significant clinical impact in oncology.
Cite this article
Jordan, E.J., Patil, K., Suresh, K. et al. Computational algorithms for in silico profiling of activating mutations in cancer. Cell. Mol. Life Sci. 76, 2663–2679 (2019). https://doi.org/10.1007/s00018-019-03097-2
DOI: https://doi.org/10.1007/s00018-019-03097-2