Tumor cell sensitivity to vemurafenib can be predicted from protein expression in a BRAF-V600E basket trial setting
Genetics-based basket trials have emerged to test targeted therapeutics across multiple cancer types. However, while vemurafenib is FDA-approved for BRAF-V600E melanomas, the non-melanoma basket trial was unsuccessful, suggesting mutation status is insufficient to predict response. We hypothesized that proteomic data would complement mutation status to identify vemurafenib-sensitive tumors and effective co-treatments for BRAF-V600E tumors with inherent resistance.
Reverse Phase Protein Array (RPPA, MD Anderson Cell Lines Project), RNAseq (Cancer Cell Line Encyclopedia), and vemurafenib sensitivity (Cancer Therapeutic Response Portal) data for BRAF-V600E cancer cell lines were curated. Linear and nonlinear regression models using RPPA protein or RNAseq data were evaluated and compared based on their ability to predict BRAF-V600E cell line sensitivity (area under the dose response curve). Accuracies of all models were evaluated using hold-out testing. CausalPath software was used to identify protein-protein interaction networks that could explain differential protein expression in resistant cells. Human examination of features employed by the model, the identified protein interaction networks, and model simulation suggested anti-ErbB co-therapy would counter intrinsic resistance to vemurafenib. To validate this potential co-therapy, cell lines were treated with vemurafenib and dacomitinib (a pan-ErbB inhibitor) and the number of viable cells was measured.
Orthogonal partial least squares (O-PLS) predicted vemurafenib sensitivity with greater accuracy in both melanoma and non-melanoma BRAF-V600E cell lines than other leading machine learning methods, specifically Random Forests, Support Vector Regression (linear and quadratic kernels) and LASSO-penalized regression. Additionally, use of transcriptomic in place of proteomic data weakened model performance. Model analysis revealed that resistant lines had elevated expression and activation of ErbB receptors, suggesting ErbB inhibition could improve vemurafenib response. As predicted, experimental evaluation of vemurafenib plus dacomitinib demonstrated improved efficacy relative to monotherapies.
Conclusions: Combined, our results support that inclusion of proteomics can predict drug response and identify co-therapies in a basket setting.
Keywords: Reverse phase protein array, Orthogonal partial least squares, Protein activity, Targeted therapies, BRAF inhibitor
AUC: area under IC50 dose response curve
LASSO: least absolute shrinkage and selection operator
O-PLS: orthogonal partial least squares
RPPA: reverse phase protein array
SVR: support vector regression
In recent decades, there has been a shift to add targeted therapeutics (e.g., Herceptin) to standard cancer treatment approaches such as surgery, chemotherapy, and radiation. This is due, in part, to the emergence of large-scale DNA sequence analysis that has identified actionable genetic mutations across multiple tumor types [1, 2]. For example, mutations in the serine-threonine protein kinase BRAF are present in up to 15% of all cancers, with an increased incidence of up to 70% in melanoma. In 2011, a Phase III clinical trial for vemurafenib was conducted in BRAF-V600E melanoma patients with metastatic disease. Based on the significant improvements observed for both progression-free and overall survival, vemurafenib was subsequently FDA-approved for first-line treatment of metastatic, non-resectable melanoma.
However, conducting a clinical trial for a targeted therapeutic can be challenging due to slow patient accrual, particularly for tumor types that harbor the mutation at a low frequency. To combat this challenge, basket trials have emerged as a method where multiple tumor types harboring a common mutation are entered collectively into a single clinical trial. Unfortunately, results of the basket clinical trial of vemurafenib for non-melanoma tumors with the BRAF-V600E mutation indicated that other cancers, including colorectal, lung, and ovarian, responded poorly to vemurafenib monotherapy. However, some patients exhibited a partial response or achieved stable disease, suggesting that information beyond the presence of a genetic mutation might identify potential responders in a basket setting. Additionally, a subset of colorectal patients achieved a partial response when vemurafenib was combined with cetuximab, suggesting that the effects of vemurafenib are subject to the larger cellular network context.
To better identify patient cohorts that will respond to targeted therapeutics, precision medicine approaches have begun to use machine learning algorithms to find associations between drug sensitivity and “omic” data such as gene expression and mutational status. Consistent with the non-melanoma basket trial result, one such study found that mutation status was an imperfect predictor across multiple cancer types and drugs. While most prior studies have examined transcriptomic data to predict drug sensitivity, a few studies have examined protein expression and activation to predict response to therapies [10, 11]. A recent study showed that models built with protein expression were better able to predict sensitivity to inhibitors of the ErbB family of receptors than models built with gene expression, suggesting protein expression may be more informative.
However, the studies performed by Li et al. analyzed cell lines independent of their genomic status. This may limit the translational potential of this approach, as mutational status is a primary criterion for many targeted therapy trials due to the relative ease of developing companion diagnostics for single mutations. We hypothesize that in a basket setting, the addition of protein expression and activity will provide superior predictive power compared to mutation status alone and will lead to identification of co-therapies to improve responses for cells with inherent resistance. To address this hypothesis, we built and compared multiple machine learning models from a publicly available RPPA dataset for 26 BRAF-V600E pan-cancer cell lines and identified protein signatures predictive of sensitivity to the FDA-approved BRAF inhibitor vemurafenib. From these signatures, potential co-therapies were identified and their respective impacts on vemurafenib efficacy were tested.
Materials and methods
Cell lines and reagents
Unless otherwise stated, all reagents were purchased from ThermoFisher (Waltham, MA). Cancer Cell Line Encyclopedia lines A375, LS411N, and MDA-MB-361 were purchased from American Type Culture Collection (ATCC; Rockville, MD). Cells were maintained at 37 °C in a humidified 5% CO2 atmosphere. A375 and LS411N cells were cultured in RPMI 1640 supplemented with 1% penicillin/streptomycin and 10% heat-inactivated fetal bovine serum. MDA-MB-361 cells were cultured in RPMI 1640 supplemented with 1% penicillin/streptomycin, 15% heat-inactivated fetal bovine serum, and 0.023 IU/mL insulin (Sigma; St. Louis, MO).
Matching CCLE, RPPA, and CTRP cell data
BRAF-V600E mutational status of cancer cell lines was obtained through the CCLE portal (https://portals.broadinstitute.org/ccle, Broad Institute; Cambridge, MA). The RPPA data for the 26 BRAF-mutated cancer cell lines (Additional file 1: Table S1) were generated at the MD Anderson Cancer Center as part of the MD Anderson Cancer Cell Line Project (MCLP, https://tcpaportal.org/mclp). Of the 474 proteins reported in the level 4 data, only those detected in at least 25% of the selected cell lines were retained, resulting in 232 proteins included in the analysis. Gene-centric RMA-normalized mRNA expression data were retrieved from the CCLE portal. Data on vemurafenib sensitivity were collected as part of the Cancer Therapeutics Response Portal (CTRP; Broad Institute), and normalized area-under-IC50 curve data (IC50 AUC) were procured from the Quantitative Analysis of Pharmacogenomics in Cancer (QAPC, http://tanlab.ucdenver.edu/QAPC/).
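As an illustration of the detection filter described above, the following minimal sketch keeps only proteins measured in at least 25% of cell lines. It assumes that an undetected protein is stored as `None`; the function name, the data layout, and the toy values are hypothetical and are not taken from the MCLP files.

```python
from typing import Dict, List, Optional

def filter_detected_proteins(
    rppa: Dict[str, List[Optional[float]]],
    min_fraction: float = 0.25,
) -> Dict[str, List[Optional[float]]]:
    """Keep proteins measured (not None) in at least min_fraction of cell lines."""
    kept = {}
    for protein, values in rppa.items():
        detected = sum(v is not None for v in values)
        if values and detected / len(values) >= min_fraction:
            kept[protein] = values
    return kept

# Toy example with four hypothetical cell lines (values are made up)
toy = {
    "EGFR": [0.4, None, 1.2, -0.3],           # detected in 3 of 4 lines: kept
    "HER3-Y1289": [None, None, None, 0.8],    # detected in 1 of 4 lines (25%): kept
    "PROTEIN_X": [None, None, None, None],    # never detected: dropped
}
filtered = filter_detected_proteins(toy)
```

The threshold is inclusive here, so a protein detected in exactly 25% of lines is retained; whether the original analysis treated the boundary the same way is not stated.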
Regression algorithms to predict vemurafenib sensitivity
For the receptor-only O-PLS model, expression of AR, CMET, CMET-Y1235, EGFR, EGFR-Y1068, EGFR-Y1173, ERα, ERα-S118, HER2, HER2-Y1248, HER3, HER3-Y1289, IGFRB, PDGFRB, PR, and VEGFR2 was used to predict vemurafenib IC50 AUC, using all 26 cell lines for training. To simulate pan-ErbB inhibition for MDA-MB-361, LS411N, and A375, the RPPA values for the phosphorylated EGFR, HER2, and HER3 receptors were set to each protein’s minimum value in the original data set.
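The in silico inhibition step can be sketched as below. This is only an illustration under stated assumptions: the data are held in a nested dictionary (cell line, then feature), the phosphorylated-receptor feature names follow the list in the text, and the function name and toy values are hypothetical.

```python
from typing import Dict, Iterable

# Phosphorylated ErbB receptor features named in the text
ERBB_PHOSPHO = ("EGFR-Y1068", "EGFR-Y1173", "HER2-Y1248", "HER3-Y1289")

def simulate_erbb_inhibition(
    rppa: Dict[str, Dict[str, float]],
    target_lines: Iterable[str],
    features: Iterable[str] = ERBB_PHOSPHO,
) -> Dict[str, Dict[str, float]]:
    """Return a copy of rppa in which the given features of the target
    cell lines are set to each feature's minimum across the data set."""
    features = tuple(features)
    minima = {f: min(row[f] for row in rppa.values()) for f in features}
    simulated = {line: dict(row) for line, row in rppa.items()}  # copy, do not mutate input
    for line in target_lines:
        for f in features:
            simulated[line][f] = minima[f]
    return simulated

# Toy data: one real cell line name and one hypothetical line, made-up values
toy_rppa = {
    "A375": {"EGFR-Y1068": 0.9, "EGFR-Y1173": 0.4, "HER2-Y1248": 0.2, "HER3-Y1289": 1.1},
    "LINE_B": {"EGFR-Y1068": -0.5, "EGFR-Y1173": -0.1, "HER2-Y1248": -0.3, "HER3-Y1289": -0.7},
}
inhibited = simulate_erbb_inhibition(toy_rppa, ["A375"])
```

The modified rows would then be passed to the trained receptor-only model to obtain predicted IC50 AUC values for the "inhibited" lines.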
Heatmaps and clustering
Mean-centered and variance scaled RPPA data for training and testing set cell lines were hierarchically clustered (1-Pearson) with publicly available Morpheus software (https://software.broadinstitute.org/morpheus, Broad Institute). Resulting heatmap plots were created in GraphPad Prism software (La Jolla, California).
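For readers unfamiliar with the preprocessing and distance metric named above, here is a minimal sketch of mean-centering with variance scaling and of the 1-Pearson distance. It assumes the sample standard deviation is used for scaling, which the text does not specify; the function names are hypothetical and this is not Morpheus code.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence

def autoscale(values: Sequence[float]) -> List[float]:
    """Mean-center a feature and scale it to unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def pearson_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """1 - Pearson correlation: 0 for perfectly correlated profiles, 2 for anti-correlated."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

scaled = autoscale([1.0, 2.0, 3.0])
d_same = pearson_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
d_opposite = pearson_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Because the distance depends only on correlation, two cell lines with the same expression pattern at different overall magnitudes cluster together.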
CausalPath analysis of resistant cell lines
CausalPath software was used to identify networks of proteins from the RPPA data set that were significantly enriched in the resistant cell lines (IC50 AUC < 0.2) compared to the sensitive cell lines. For analysis of predictive protein interactions, proteins with a VIP > 1 were examined (87 of the original 232 proteins met this criterion), and significant change in the mean expression of each protein/phosphorylated protein between the two groups was determined with 10,000 permutations and an FDR of 0.2 for total and phosphorylated proteins. This relaxed discovery rate is consistent with prior use of this algorithm with a constrained subset of proteins.
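The general idea of a permutation test on group means followed by false discovery rate control can be sketched as follows. This is a generic illustration, not CausalPath's implementation: it assumes a two-sided test on the absolute difference in means and the Benjamini-Hochberg procedure, neither of which is confirmed by the text, and all names are hypothetical.

```python
import random
from statistics import mean
from typing import List, Sequence

def permutation_pvalue(group_a: Sequence[float], group_b: Sequence[float],
                       n_perm: int = 10000, seed: int = 0) -> float:
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of resistant/sensitive
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

def benjamini_hochberg(pvalues: Sequence[float], fdr: float = 0.2) -> List[bool]:
    """Return True for each hypothesis rejected at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / m:
            cutoff_rank = rank  # largest rank passing its threshold
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        rejected[idx] = rank <= cutoff_rank
    return rejected
```

In use, one p-value would be computed per protein (resistant versus sensitive lines) and the full list passed to `benjamini_hochberg` with `fdr=0.2`.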
In vitro testing of co-therapeutics
To compare the different machine learning models, each model was evaluated on all 26 cell lines using leave-one-out cross-validation. Errors for each cell line prediction were calculated, and each model was scored by the number of cell lines for which its error was smaller than that of O-PLS. A binomial t-test was performed in Prism for each model against O-PLS.
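The win-count comparison described above can be sketched as below. The settings used in Prism are not given in the text, so this assumes an exact two-sided binomial test against a null win probability of 0.5, with ties excluded from the win and loss counts; function names are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple

def count_wins(errors_ref: Sequence[float], errors_other: Sequence[float]) -> Tuple[int, int]:
    """Return (wins, losses) for the reference model, judged by absolute
    per-cell-line error; ties count for neither side."""
    wins = sum(abs(r) < abs(o) for r, o in zip(errors_ref, errors_other))
    losses = sum(abs(r) > abs(o) for r, o in zip(errors_ref, errors_other))
    return wins, losses

def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed one."""
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)]
    observed = pmf[successes]
    # small relative tolerance guards against floating-point ties
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))

wins, losses = count_wins([0.1, -0.2, 0.3], [0.2, 0.1, 0.3])
p_value = binomial_two_sided_p(wins, wins + losses)
```

With 26 cell lines and leave-one-out predictions from two models, `count_wins` would receive the two lists of 26 prediction errors.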
Tumors exhibit heterogeneous protein expression and sensitivity to vemurafenib
Orthogonal partial least squares model outperforms other regression models to predict vemurafenib sensitivity
Since the goal was to predict the continuous IC50 AUC in BRAF-mutated cell lines based on their RPPA protein expression data, we compared various types of regression models to determine the model that performed with the highest accuracy. Regression models, such as support vector regression (SVR) with linear kernels, orthogonal partial least squares regression (O-PLS), and LASSO-penalized linear regression, utilize linear relationships between the protein expression and vemurafenib sensitivity for prediction. One limitation of our data set is the relatively low number of cell lines (observations, n = 26) relative to RPPA proteins (variables, n = 232); given a data set with more variables than observations, over-fitting of the training data is always a concern. O-PLS addresses this issue by reducing the dimension to predictive and orthogonal principal components that represent linear combinations of the original protein expression cohort, while LASSO-penalized regression instead addresses the same issue by introducing an L1 regularization term that penalizes non-zero weights given to proteins in the model. While these two model types are restricted to linear relationships, Random Forests (with regression trees) and SVRs with non-linear kernels possess the ability to find non-linear interactions between proteins to predict vemurafenib sensitivity. Random Forests address overfitting via the use of an ensemble approach, making predictions by an unweighted average across multiple trees, while SVRs at least partially address overfitting by not counting training set errors smaller than a threshold ε, i.e., not penalizing predictions that are within an “ε-tube” around the correct value [21, 22].
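To make the two regularization ideas above concrete, the sketch below writes out an L1-penalized squared-error objective and an ε-insensitive loss. The scaling of the error term and the parameter names (`alpha`, `epsilon`) are assumptions; libraries differ in how they normalize these objectives, and the settings used in the study are not given in the text.

```python
from typing import Sequence

def lasso_objective(y_true: Sequence[float], y_pred: Sequence[float],
                    weights: Sequence[float], alpha: float) -> float:
    """Mean squared error plus an L1 penalty on the model weights.
    Larger alpha pushes more protein weights to exactly zero."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return mse + alpha * sum(abs(w) for w in weights)

def epsilon_insensitive_loss(y_true: Sequence[float], y_pred: Sequence[float],
                             epsilon: float) -> float:
    """SVR-style loss: residuals inside the epsilon-tube cost nothing,
    residuals outside it cost their distance to the tube."""
    return sum(max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred))

# Perfect fit, so only the penalty term remains: 0.1 * (0.5 + 0.5 + 0.0)
penalized = lasso_objective([1.0, 2.0], [1.0, 2.0], [0.5, -0.5, 0.0], alpha=0.1)
# First residual (0.05) is inside the tube; second (0.5) exceeds it by 0.4
tube_loss = epsilon_insensitive_loss([1.0, 2.0], [1.05, 2.5], epsilon=0.1)
```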
O-PLS identifies unique protein signatures that correlate with vemurafenib sensitivity
Of the 232 proteins from the RPPA dataset used in this model, 87 had VIP scores greater than 1, and were thus the most important proteins for the prediction of this model. Figure 3c illustrates these proteins with respect to their weights along p. A small subset of proteins and phosphorylated forms of proteins correlated with projection along the negative space of p, suggesting that high levels of these proteins were associated with intrinsic resistance to vemurafenib (Fig. 3c, blue). Further inspection of the expression of these proteins in both the training and testing set showed that, on average, these proteins were more highly expressed in resistant cell lines (IC50 AUC < 0.2, Fig. 3d). Included in this signature were both EGFR and a phosphorylated form of HER3 (HER3 Y1289), as well as downstream signaling proteins in the AKT pathway, such as P70S6K, suggesting that expression and activity of this family of receptors and downstream pathways correlate with increased vemurafenib resistance. Conversely, the protein signature that correlated with increased sensitivity to vemurafenib included proteins in the MAPK pathway such as NRAS, BRAF S445, MEK S217/S221, MAPK T202/Y204 (Fig. 3c yellow bars, Fig. 3d). This suggests that even among cell lines that universally possess a constitutively activating mutation in BRAF, increased activation of this pathway correlated with increased sensitivity.
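For context on the VIP > 1 cutoff, the sketch below computes variable importance in projection using the commonly cited PLS formula, in which each variable's normalized squared weights are averaged across components in proportion to the Y variance each component explains. O-PLS software may compute predictive and orthogonal VIP separately, and the text does not say which variant was used, so this is an assumption; the function name and toy numbers are hypothetical.

```python
import math
from typing import List, Sequence

def vip_scores(weights: Sequence[Sequence[float]], ssy: Sequence[float]) -> List[float]:
    """weights[a][j] is the weight of variable j on component a;
    ssy[a] is the Y sum of squares explained by component a."""
    n_vars = len(weights[0])
    total = sum(ssy)
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ss_a in zip(weights, ssy):
            norm_sq = sum(w * w for w in w_a)  # normalize each weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_vars * acc / total))
    return scores

# Two components, three variables (toy numbers)
scores = vip_scores([[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]], ssy=[3.0, 1.0])
important = [j for j, s in enumerate(scores) if s > 1.0]
```

By construction the squared VIP scores average to 1, which is why 1 is the conventional threshold for "more important than average".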
Protein expression and activity outperform gene expression for predicting vemurafenib sensitivity
ErbB receptor activation and downstream PI3K signaling are increased in vemurafenib-resistant cell lines
Inhibition of ErbB receptors enhances sensitivity of resistant cell lines to vemurafenib
From the pathway analysis, we hypothesized that increased ErbB family signaling led to intrinsic vemurafenib resistance. As receptor-level inhibition of cellular signaling is a common therapeutic approach (e.g., Herceptin), we tested whether pan-ErbB inhibition would increase vemurafenib sensitivity in the more resistant cell lines. To explore this scenario, an O-PLS model was built using the expression and activation of receptors from the RPPA dataset (16 proteins) in order to more easily simulate the impact of receptor inhibition without the confounding element of having to simulate the impact of receptor inhibition on downstream proteins. While model performance suffered (R2Y = 0.37, Q2Y = 0.12), receptors with the highest VIP scores were EGFR, HER3, and HER3 Y1289 (Fig. 5c,d). To test the hypothesis that ErbB receptor inhibition would increase vemurafenib sensitivity, inhibition was first simulated by reducing phosphorylated receptor expression in the MDA-MB-361, LS411N, and A375 cell lines to the minimum levels detected in the data set. Vemurafenib sensitivity in these three ErbB “inhibited” cell lines was then predicted using the receptor-only O-PLS model (Fig. 5e). Simulations indicated that inhibition of ErbB pathway activity would increase sensitivity to vemurafenib across the three different tumor cell lines. To experimentally validate this prediction, we treated the MDA-MB-361, LS411N, and A375 cell lines in vitro with vemurafenib, dacomitinib (a pan-ErbB receptor tyrosine kinase inhibitor), or combination treatment of vemurafenib and dacomitinib. In comparison to either monotherapy, the IC50 concentrations for both drugs decreased in the combinatorial treatment, showing increased efficacy of treatment when ErbB and BRAF were dually inhibited. Additionally, Loewe’s model values from the dose response curves indicated synergy between the two inhibitors (Fig. 5f,g, Additional file 7: Table S7).
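As background on how Loewe additivity is typically read, the sketch below computes a combination index for a single effect level: the dose of each drug used in combination divided by the dose of that drug alone that gives the same effect, summed over both drugs. Values below 1 indicate synergy. How the study computed its Loewe values is not described in the text, so the function names, the tolerance band, and the example doses are hypothetical.

```python
def loewe_combination_index(dose_a: float, dose_b: float,
                            alone_a: float, alone_b: float) -> float:
    """CI = d_A / D_A + d_B / D_B, where d are the doses used together and
    D are the single-agent doses producing the same effect (e.g., IC50)."""
    return dose_a / alone_a + dose_b / alone_b

def classify_interaction(ci: float, tolerance: float = 0.1) -> str:
    """Label the interaction; the tolerance band around 1 is an arbitrary choice."""
    if ci < 1.0 - tolerance:
        return "synergy"
    if ci > 1.0 + tolerance:
        return "antagonism"
    return "additive"

# Made-up doses: 50% inhibition needs 4.0 of drug A alone or 2.0 of drug B alone,
# but is reached with 1.0 of A plus 0.5 of B in combination.
ci = loewe_combination_index(1.0, 0.5, alone_a=4.0, alone_b=2.0)
label = classify_interaction(ci)
```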
This suggests that the inhibitors worked cooperatively to target intrinsic BRAF phosphorylation (caused by the V600E mutation), as well as upstream ErbB signaling that could activate pathways parallel to BRAF, including PI3K. The computational results shown here illustrate the utility of O-PLS modeling to predict vemurafenib sensitivity in an in vitro setting mimicking a basket trial. Additionally, the ease of interpreting the O-PLS model allowed for identification and in vitro validation of vulnerabilities in vemurafenib-resistant cell lines in order to increase the efficacy of treatment.
Using a basket trial setting of pan-cancer BRAF-V600E cell lines, we developed an O-PLS model to predict tumor cell sensitivity to vemurafenib and identified co-treatments to overcome inherent resistance. While others have identified signatures from transcriptomic or proteomic data that correlate with sensitivity in an attempt to expand vemurafenib use beyond BRAF-V600E mutations, the clinical reality is that the FDA-approved application of vemurafenib requires the detection of a BRAF-V600E mutation in advanced stage melanoma. Furthermore, the drug label warns that application of vemurafenib to BRAF wild-type tumors can increase cell proliferation in vitro. This is consistent with the move, over the past decade, to develop assays for predictive biomarkers to guide use of targeted cancer therapeutics. Use of such assays, termed “companion diagnostics”, often increases the success rates of drugs during clinical trials [27, 29]. The approved test method and guidelines are then used for future general-population administration. Despite the failures in the non-melanoma BRAF-V600E basket trial for vemurafenib, the existing FDA requirement and warning for BRAF mutation status provide a translational structure that cannot be ignored. Through our model of protein data in pan-cancer BRAF-V600E cells, vemurafenib sensitivity was accurately predicted in multiple tumor cell lines including colorectal, breast, bone, and melanoma tumors. With further refinement and expansion to clinical samples, we expect that this approach could translate to refine basket trial enrollment and improve outcomes.
One of the key findings of our work is that proteomic data outperform transcriptomic data to predict response in the basket setting. This is consistent with results obtained since the release of the RPPA expression dataset from CCLE and TCGA cohort analyses [12, 30, 31]. Those results demonstrated that in a pan-cancer model where genetic mutations are not incorporated into inclusion criteria, proteomics from RPPA outperformed RNAseq transcriptomics to predict drug sensitivity. Through the outlined model comparisons shown in our study, we observed that O-PLS performed better when protein expression and activity were used instead of RNAseq expression. Closer analysis of individual transcripts, proteins, and activated proteins suggests this is likely due to the disparities between protein and transcript expression or between protein expression and protein activation (i.e., phosphorylation). While RPPA technology is currently used in clinical trials, there are situations where other protein-based assays will be needed. Chiefly, as a lysate-based measurement, RPPA from tumor biopsies will capture the protein status of the entire tumor and microenvironment, which may mask indicators of tumor cell sensitivity. As an alternative, we suggest that when RPPA is used to identify the reduced signature of highly predictive proteins in tumor cells, clinical implementation may be more accurate with techniques that enable tumor cell-specific quantification (e.g., multi-spectral imaging for solid tumors, flow cytometry for non-solid tumors).
Our results also demonstrated that broad inclusion of protein expression and activity measurements can identify altered signaling pathways that influence drug response. For example, vemurafenib targets the BRAF signaling cascade, and model analysis of the data supported that lines with elevated sensitivity to vemurafenib had increased phosphorylation of BRAF, MEK, and MAPK proteins (Fig. 3d bolded). While melanoma patients treated with vemurafenib have shown rapid responses to the therapy, the duration of response is often short, motivating a need to identify combination treatments with vemurafenib to extend progression-free survival times. Results from our model suggest that melanoma cell lines initially sensitive to vemurafenib have elevated expression of p-MEK and p-BRAF when compared to inherently resistant cell lines. Recent clinical trial results showed significantly increased progression-free survival and overall survival in BRAF-mutant metastatic melanomas with dual BRAF and MEK inhibitors compared to BRAF inhibitor monotherapy. In contrast, the model found that cell lines with higher resistance had increased ErbB receptor-family activity and downstream PI3K signaling. Therefore, by using a method such as RPPA to expand the analysis of protein signaling beyond the targeted pathway, protein signaling activity can be better gauged and used to identify potential co-therapeutic targets in the pre-clinical setting. Additionally, through the use of models such as the O-PLS model presented here, co-treatments can be simulated to prioritize experimental testing. Specifically, we simulated dual pan-ErbB and BRAF inhibition, and validated the model prediction of a synergistic increase in sensitivity of breast, colorectal, and melanoma cell lines to vemurafenib.
While our prediction of anti-ErbB therapies was based on model analysis rather than prior knowledge, there is evidence that this synergy is clinically relevant. Our model indicated that tumor cells, including colorectal cancer cells, with increased HER3 phosphorylation exhibited increased resistance to vemurafenib. In vitro, colorectal tumor stem cells with increased HER3 expression exhibited resistance to vemurafenib in the presence of the HER3 ligand, NRG-1. Additionally, melanoma in vivo and PDX models have shown that increased ErbB family-receptor activity is associated with acquired resistance to vemurafenib. While the O-PLS model presented in this study was not used to predict acquired resistance, it did identify melanoma lines with increased ErbB signaling that led to inherent vemurafenib resistance (A375). Our model and experimental results suggested that co-treatment with an ErbB inhibitor and vemurafenib would have a synergistic effect. Cetuximab, a monoclonal antibody directed towards EGFR, has been shown to increase survival in colorectal patients. However, the BRAF-V600E colorectal patient cohort did not respond as well to cetuximab monotherapy in comparison to the wild-type BRAF cohort. Interestingly, in the vemurafenib basket clinical trial, colorectal patients were split into vemurafenib or vemurafenib/cetuximab treatment arms. The outcomes showed that the dual treatment arm had an increase in partial and stable responders, suggesting a potential synergy between these two inhibitors, similar to the synergy we observed in multiple tumor cell types.
Here, we compared the ability of leading machine learning regression algorithms to predict vemurafenib sensitivity in BRAF-V600E cell lines from RPPA data. We determined that O-PLS predicted vemurafenib response more accurately than SVR, LASSO, and Random Forests, and that the O-PLS model performed better with proteomic data than with transcriptomic data. Additionally, causal analysis identified that ErbB and PI3K signaling were upregulated in resistant cells, and that dual inhibition of ErbB receptors and BRAF increased vemurafenib sensitivity in resistant cells. Collectively, this study illustrates how an unbiased approach like O-PLS can be used to develop a model from proteomic data in a basket clinical trial setting in order to predict drug sensitivity and identify mechanisms of resistance.
We wish to thank the members of the Kreeger and Page laboratories for helpful discussions on this manuscript.
MJC, PKK, and DP conceived and designed the study. MJC and CRP curated the data and analyzed the models and validation experiments. MJC, PKK, and DP wrote the manuscript with input from all authors, and all authors have read and approved the manuscript.
Funding for this project was provided by the National Institutes of Health (1DP2CA195766, PKK), the National Library of Medicine Bioinformatics Training Grant (T15 LM007359, MJC) with additional support from the University of Wisconsin-Madison Graduate School, and the University of Wisconsin Carbone Cancer Center Support Grant (P30 CA014520, DP). Funding bodies had no role in the design of the study; the collection, analysis, and interpretation of data; or the writing of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
- 2. Slosberg ED, Kang BP, Peguero J, Taylor M, Bauer TM, Berry DA, et al. Signature program: a platform of basket trials. Oncotarget. 2018;9:21383–95. doi: https://doi.org/10.18632/oncotarget.25109.
- 8. Ding MQ, Chen L, Cooper GF, Young JD, Lu X. Precision Oncology beyond Targeted Therapy: Combining Omics Data with Machine Learning Matches the Majority of Cancer Cells to Effective Therapeutics. Mol Cancer Res. 2018;16:269–78. http://mcr.aacrjournals.org/content/16/2/269.abstract.
- 9. Kaddi CD, Coulter WH, Wang MD. Developing Robust Predictive Models for Head and Neck Cancer across Microarray and RNA-seq Data. ACM-BCB . . . ACM Conf Bioinformatics, Comput Biol Biomed. 2015;2015:393–402. doi: https://doi.org/10.1145/2808719.2808760.
- 11. Schoeberl B, Kudla A, Masson K, Kalra A, Curley M, Finn G, et al. Systems biology driving drug development: from design to the clinical testing of the anti-ErbB3 antibody seribantumab (MM-121). NPJ Syst Biol Appl. 2017;3:16034. https://doi.org/10.1038/npjsba.2016.34.
- 13. Pozdeyev N, Yoo M, Mackie R, Schweppe RE, Tan AC, Haugen BR. Integrating heterogeneous drug sensitivity data from cancer pharmacogenomic studies. Oncotarget. 2016;7:51619–25. doi: https://doi.org/10.18632/oncotarget.10010.
- 14. Witten IH, Frank E, Hall MA, Pal CJ. Data mining: practical machine learning tools and techniques. 4th ed. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.; 2016.
- 15. Babur Ö, Luna A, Korkut A, Durupinar F, Siper MC, Dogrusoz U, et al. Causal interactions from proteomic profiles: molecular data meets pathway knowledge. bioRxiv. 2018. http://biorxiv.org/content/early/2018/05/21/258855.abstract.
- 23. Jang INS, Neto EC, Guinney J, Friend SH, Margolin AA. Systematic Assessment Of Analytical Methods For Drug Sensitivity Prediction From Cancer Cell Line Data. In: Biocomputing 2014. World Scientific; 2013. p. 63–74. doi: https://doi.org/10.1142/9789814583220_0007.
- 25. Regan KE, Payne PRO, Li F. Integrative network and transcriptomics-based approach predicts genotype-specific drug combinations for melanoma. AMIA Jt Summits Transl Sci Proc. 2017;2017:247–56.
- 35. Prasetyanti PR, Capone E, Barcaroli D, D’Agostino D, Volpe S, Benfante A, et al. ErbB-3 activation by NRG-1β sustains growth and promotes vemurafenib resistance in BRAF-V600E colon cancer stem cells (CSCs). Oncotarget. 2015;6:16902–11. doi: https://doi.org/10.18632/oncotarget.4642.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.