Predicting drug response from single-cell expression profiles of tumours

Intra-tumour heterogeneity (ITH) presents a significant obstacle in formulating effective treatment strategies in clinical practice. Single-cell RNA sequencing (scRNA-seq) has evolved as a powerful instrument for probing ITH at the transcriptional level, offering an unparalleled opportunity for therapeutic intervention.

Results

Drug response prediction at the single-cell level is an emerging field of research that aims to improve the efficacy and precision of cancer treatments. Here, we introduce DREEP (Drug Response Estimation from single-cell Expression Profiles), a computational method that leverages publicly available pharmacogenomic screens from GDSC2, CTRP2, and PRISM and functional enrichment analysis to predict single-cell drug sensitivity from transcriptomic data. We validated DREEP extensively in vitro using several independent single-cell datasets with over 200 cancer cell lines and showed its accuracy and robustness. Additionally, we also applied DREEP to molecularly barcoded breast cancer cells and identified drugs that can selectively target specific cell populations.

Conclusions

DREEP provides an in silico framework to prioritize drugs from single-cell transcriptional profiles of tumours and thus helps in designing personalized treatment strategies and accelerating drug repurposing studies. DREEP is available at https://github.com/gambalab/DREEP.

View this article's peer review reports

Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data

Article Open access 16 December 2021

Predicting heterogeneity in clone-specific therapeutic vulnerabilities using single-cell transcriptomic signatures

Article Open access 16 December 2021

Predicting tumor response to drugs based on gene-expression biomarkers of sensitivity learned from cancer cell lines

Article Open access 15 April 2021

Background

Intra-tumour heterogeneity (ITH) is a recurrent issue in designing effective treatment strategies in clinical practice. It refers to heterogeneous cancer cell populations present in the same tumour that exhibit high cellular plasticity and phenotypic heterogeneity. Accumulating evidence from independent studies shows that ITH gives rise to rare sub-populations of malignant cells characterized by a different epigenetic and transcriptional state that renders them refractory to a variety of anticancer drugs, protecting the whole population from complete eradication [1,2,3,4].

Single-cell RNA sequencing (scRNA-seq) has now become a powerful tool to explore ITH at the transcriptional level [5] and thus offers an unprecedented chance to tackle it therapeutically [4, 6, 7]. In addition, large-scale pharmacogenomic screens have been performed across hundreds of molecularly characterized cancer cell lines [8,9,10,11,12,13,14]. The resulting data have provided a valuable resource to link drug sensitivity with genomic and transcriptomic features. Indeed, in the last few years, several machine-learning models have leveraged these datasets to identify molecular markers linked to in vitro drug sensitivity in order to predict drug response from -omics profiles of tumour biopsies [8, 15, 16]. However, these approaches do not explicitly account for ITH with the result that genomic and non-genomic biomarkers of drug response have been found for only a restricted number of small molecules. Nonetheless, expression-based biomarkers measured from bulk cell populations have emerged as the most powerful predictors of drug response in vitro for several cytotoxic and targeted drugs, with far more predictive power than other -omics profiles [8, 10]. To overcome these limitations, a few recent studies like scDRUG [17, 18], scDEAL [19], beyondCell [20], chemCPA [21], and NCI-DOE [22] have attempted to leverage scRNA-seq and publicly available drug sensitivity profiles to predict drug response at the single cell level. However, these studies often exhibit certain drawbacks. They may lack comprehensive in vitro validation, fail to provide user-friendly software packages, rely on complex models requiring parameter fine-tuning before practical use, or are confined to predicting the efficacy of a limited number of drugs.

To address these challenges, we introduce DREEP (Drug Response Estimation from single-cell Expression Profiles), a novel bioinformatics tool designed to augment the precision and efficacy of therapeutic strategies by addressing the innate heterogeneity of cancer cell populations. DREEP’s primary function is to assess the complexity of tumour therapeutics and recommend specific drugs tailored to target distinct cell subpopulations. DREEP can predict a cell’s vulnerability to more than 2000 drugs by simplifying the whole process with enrichment analysis, eliminating the need for complex parameter adjustments and thus making this type of analysis accessible to a broad audience.

We validate DREEP performance extensively using in vitro drug response data of over 200 cancer cell lines covering 22 distinct cancer types from publicly available single-cell and drug viability datasets. We show that our method can effectively identify recurring drug vulnerabilities across cells of diverse tumour types and establishes a clear relationship between drug response and the functional status of the cells. We show that DREEP can detect changes in drug sensitivity over time in the breast cancer cell line MCF7 and predict the emergence of drug resistance. Additionally, using molecular barcoded MDA-MB-468 cells [4], we show that our method can recognize innate drug-tolerant cell subpopulations and predict drug co-treatments that will inhibit all of them. Finally, we demonstrate that DREEP can accurately predict drug sensitivity within heterogeneous cancer cell populations of patient-derived cultures (PDCs) containing a mix of primary and metastatic cells derived from individuals with head and neck squamous cell carcinomas [23]. In conclusion, our results demonstrate that it is possible to detect differences in drug response among individual cells within the same tumour. DREEP is implemented as an open-source R package, making it easily accessible and user-friendly for the broader scientific community. It is publicly available on GitHub at the following address https://github.com/gambalab/DREEP.

Methods

DREEP implementation

To predict drug response at the single-cell level, we first used gf-icf normalization (https://github.com/gambalab/gficf) [24] to extract the top relevant genes from each cell. We then used these gene sets as input for Gene Set Enrichment Analysis (GSEA) [25] against each GDPS ranked-list to predict the sensitivity of a cell to a specific drug. Since GPDS ranked lists contain predictive biomarkers of resistance at the top and predictive biomarkers of sensitivity at the bottom for each drug, a positive enrichment score implies that the cell expresses genes that render it tolerant to the drug, while a negative enrichment score indicates the cell expresses genes that render it sensitive to the drug. Finally, by estimating the median enrichment score on a cell population and counting the number of cells with a significant enrichment score less than zero, it is possible to estimate the overall effect of a drug on that population and the percentage of cells sensitive to it. The DREEP package is implemented in R and is available on github at the following address https://github.com/gambalab/DREEP. Enrichment Scores (ES) and associated p-values were computed using the fgsea package [26]. The fgsea package utilizes an adaptive and efficient multi-level split Monte Carlo approach [27] to estimate the p-value of an ES. Subsequently, the p-values for each drug across the cells are corrected for the false discovery rate (FDR) using the Benjamini–Hochberg correction method, implemented within the function p.adjust in the base package of the R statistical environment. Finally, only ES values with an FDR < 0.1 were deemed significant (unless otherwise stated) and used to predict the effect of a drug on a single cell. The percentage of sensitive and resistant cells in a sample is finally computed using only significant ES scores.

Bulk RNA-seq dataset processing

Raw counts and associated metadata of the bulk RNA-seq data of cancer cell lines used to build GPDS signature were obtained from the depmap portal (https://depmap.org). Cell lines associated with haematopoietic tumours were excluded. Raw counts were normalized with edgeR [28] and transformed into log10 (CPM + 1). Poorly expressed genes and genes whose entropy values were in the fifth percentile were discarded. Entropy values were estimated by discretizing the expression of each gene in ten equally spaced bins using the functions included in the R package named entropy. After these filtering steps, the final expression matrix comprised 1117 cell lines and 13,849 genes.

Construction of GPDS ranked lists

Genomic Profiles of Drug Sensitivity (GPDS) ranked lists were built by computing the Pearson correlation coefficient (PCC) between the expression of each gene and the potency of a drug across the cell lines for which drug potency was measured. At the end of this process, each GPDS is a ranked list of 13,849 expression-based biomarker genes whose degree of PCC reflects their importance in predicting the effect of a small molecule. This is because the AUC value of a drug reflects its cell line growth inhibition, with lower values indicating sensitivity and higher values implying tolerance to the drug. Thus, a gene whose expression positively correlates with the AUC of a drug is predictive of resistance to that drug; conversely, a negatively correlated gene is predictive of sensitivity to the drug.

Estimation of DREEP performances

To estimate DREEP’s precision and sensitivity in predicting drug response from scRNA-seq data of cancer cell lines, we used the three cell-viability screening datasets from CTRP2, GDSC, and PRISM described above. For each cell-viability screening dataset, the corresponding “gold standard” is composed of pairs of cell lines and drugs for which the corresponding cell line is sensitive. To build the “gold standard” of each cell-viability screening dataset, we first computed the z-score percentiles from the AUC of each drug across the cell lines for which AUC was available and then defined a cell line sensitive to a drug only if its z-score was in the 5% percentile. Next, we applied DREEP to scRNA-seq data of each cell line and computed its median enrichment score for each drug. Finally, precision-recall and ROC curves on each cell-viability screening dataset were computed with the precrec R package (https://github.com/evalclass/precrec) using the corresponding “gold standard” and sorting DREEP drug/cell line pairs according to the estimated median cell line enrichment score multiplied by minus one in order to have drugs predicted to be sensitive at the top.

Cell embedding from DREEP predictions

DREEP output can be arranged in a drug response matrix $E$ of dimensions $n\times m$ where $n$ is equal to the number of cells and $m$ equal to the number of GPDS used. Each element ${E}_{i,j}$ of matrix $E$ ranges within the $\left[-1,+1\right]$ interval and it is different from zero only if the enrichment score between the relevant genes of the cell $i$ and the GPDS ranked list $j$ is significant (i.e. FDR < 0.1). Specifically, ${E}_{i,j}>0$ indicates the cell $i$ is predicted resistant to the drug $j$, while ${E}_{i,j}<0$ indicates the cell $i$ is predicted sensitive to the drug $j,$ and ${E}_{i,j}=0$ indicates that there is no supporting evidence to decide if cell $i$ is sensitive or not to the drug $j$. Next, we computed the similarity of the predicted drug response profile between two cells as the distance between two cells $x$ and $y$ with Fuzzy Jaccard Distance defined as follows:

$${\mathrm{FJD}}_{x,y}=1-\frac{ \sum_{j=1}^{m}\left|{sign(E}_{x,j})+{sign(E}_{y,j})\right|}{2*m}$$

$FJD$ is a non-negative symmetric distance matrix of dimensions $n\times n$ whose elements range within the $\left[\mathrm{0,1}\right]$ interval. Specifically, each element ${FJD}_{x,y}$ of $FJD$ quantifies how many of the drugs to which the cell $x$ is sensitive are shared with the cell $y$ and how many of the drugs the cell $x$ is resistant to being shared with the cell $y$. The more two cells share drugs that are predicted to have the same effect, the closer the ${FJD}_{x,y}$ value is to 0, and vice versa for two cells with opposite drug predictions, ${FJD}_{x,y}$ equals 1. It can be easily shown that the Fuzzy Jaccard Distance we implemented satisfies all the conditions to be defined as a metric, including the triangle inequality constraint. Finally, the cell embedding space was built using the precomputed cell-to-cell distance matrix $FJD$ as input of the umap function of the uwot R package (https://github.com/jlmelville/uwot). Cell embedding from DREEP predictions has been implemented in the function runDrugReduction of the DREEP package.

scRNA-seq datasets processing

The scRNA-seq data of the cell line pan-cancer dataset from Kinker et al. [29] was obtained from the Broad Institute’s single cell portal (SCP542). Specifically, CPM count matrix was downloaded. Before being processed, gene symbols were converted into Ensembl ID and only Ensembl ID associated with no more than one gene were retained. At the end of this pre-processing step, the final cell line pan-cancer matrix contained 16,558 genes and 53,513 cells. Raw scRNA-seq data of the MCF7 cell line exposed to bortezomib [30] was obtained from GEO (GSE114461) and only cells with at least 5,000 UMI were retained. The single-cell breast cancer atlas dataset was obtained from figshare at the following address https://doi.org/10.6084/m9.figshare.15022698. scRNA-seq data of barcoded MDA-MB-468 cells were obtained from GEO (GSE228154). Raw UMI count data of nutlin treated and untreated cells was downloaded from figshare at https://figshare.com/s/139f64b495dea9d88c70 and only cells with at least 5000 UMI and with less than 10% of UMI counts in mitochondrial genes were used. PDCs TPM counts were instead retrieved from supplementary data of the original work [18].

Drug sensitivity dataset processing

The drug sensitivity dataset of CTRP2 (Cancer Therapeutic Response Portal), GDSC (Genomics of Drug Sensitivity in Cancer), and PRISM (Profiling Relative Inhibition in Mixtures) were all obtained from the depmap portal (https://depmap.org). Only drugs for which the in vitro response was available for at least 100 cell lines out of the 1117 for which gene expression data was available were retained. After this filtering, the final drug sensitivity dataset comprised 478 drugs from the GDSC dataset, 511 drugs from the CTRP2 dataset, and 1445 drugs from the PRISM dataset. For all three cell viability datasets, drug potency was expressed in terms of AUC (area under the curve) of the corresponding dose–response graph.

The therapeutic landscape of the cell line pan-cancer dataset

We first used DREEP to predict the effect of the 2434 drugs on each of the 53,513 single-cell transcriptional profiles of the Kinker et al. dataset [29]. Then, we converted predictions from the single-cell level to the cell-line level by counting the fraction of cells predicted to be sensitive to each drug (i.e. DREEP enrichment score < 0 and P < 0.05). At the end of this process, we obtained a matrix ${A}_{198\times \mathrm{2,434}}$ where ${A}_{i,j}$ contains the percentage of cells of the cell line $i$ predicted to be sensitive to the drug $j$. Next, we set elements of $A$ less than 0.9 as being equal to 0 while all elements of $A$ greater or equal to 0.9 were set as equal to 1. Then, we hierarchically clustered rows (i.e. cell lines) and columns (i.e. drugs) of $A$ using the Jaccard distance. Jaccard distances between row pairs and column pairs of the matrix $A$ were computed using the dist function of the R statistical environment. Clustering of both, cell lines and drugs, was performed using the hclust function and resulting dendrograms cut with the cutree function of R statistical environment.

Differential drug sensitivity analysis

We used DREEP to predict the effects of 2434 drugs on 1541 sequenced cells of the barcoded MDA-MB-468 cell line [4]. Then, we performed a Mann–Whitney test and corrected p-values using the Benjamini–Hochberg method to identify drugs with differential sensitivity between afatinib-sensitive and afatinib-tolerant subpopulations. Finally, a drug was considered specific for a subpopulation if its FDR was less than 0.1 and the median enrichment score across the other subpopulations was greater than zero. This analysis has been implemented in the runDiffDrugAnalysis function of the DREEP package.

Drug combinations assay

4 × 10⁴ MDA-MB-468 cells were seeded in a 96-well plate and incubated overnight at 37 °C. The two drugs to test were prepared in different dilutions and then combined in all possible drug pairs to generate a 5 × 5 combination matrix. Then, cells were exposed to either single agent drug or to the drug pairs of the drug combination matrix, while negative controls were treated with DMSO (each treatment was performed in triplicate). Following 72-h incubation at 37 °C, cell viability was measured with CellTiter (Promega) and the absorbance was read at 490 nm with the plate reader GloMax® Discover instrument. The drug interactions and expected drug responses were calculated with the Combenefit tool [31] using the Loewe additivity model.

Drug Set Enrichment Analysis (DSEA)

DSEA was employed by using the online drugEnrichr tool [32] using the Drug Repurposing Hub Mechanism of Action table resource.

Comparison with other state-of-the-art methods

All methods were run using default parameters and when required single-cell expression was normalized using the Seurat package. Common drugs between DREEP and each method were found by simply intersecting the drug names.

PDC drug response data

PDCs drug response data were retrieved from supplementary data of the original work [18]. Specifically, in the original work, eight drugs were tested on these five PDCs. However, we only used five out of eight drugs for which cell viability was measured at least in triplicate. Raw viability measures we used can be found in Table S7 (Additional file 2).

Results

Drug Response Estimation from single-cell Expression Profiles

DREEP (Drug Response Estimation from single-cell Expression Profiles) is a bioinformatics tool that leverages results from large-scale cell-line viability screens and enrichment analysis to predict drug vulnerability from the transcriptional state of a cell. It requires a pre-defined collection of Genomic Profiles of Drug Sensitivity (GPDS); these are lists of genes ranked according to their contribution in correctly predicting the effect of a small molecule. To build GPDS, we integrated several publicly available RNA-seq datasets and cell-line viability screens, including the following datasets: GDSC [8], CTRP2 [13, 33], and PRISM [34] (Fig. 1A).

GPDS were computed by correlating the in vitro response to a small molecule from the above studies with the expression of all messenger RNA (mRNA) across all the cancer cell lines for which the drug potency was measured (see Fig. 1A and the “Methods” section). In these studies, RNA-seq represents the basal gene expression of a cell line, while the potency of a small molecule in a cell line is evaluated as the area under the curve (AUC) of the corresponding dose–response curve. The AUC value reflects cell growth inhibition after 72 h of treatment, where lower values indicate sensitivity while higher values imply resistance to the tested drug. Thus, a gene whose expression positively correlates with the AUC of a drug across many cell lines can be considered predictive of drug resistance (i.e. the more expressed the gene, the higher the concentration needed to inhibit cell growth) [33]. Vice versa, a negatively correlated gene can be considered predictive of drug sensitivity (i.e. the more expressed the gene, the lower the concentration needed to inhibit cell growth) [33].

Using this approach, we generated a collection of 2434 GPDS signatures (i.e. 478 from GDSC, 511 from CTRP, and 1445 from PRISM) and used them to predict the single-cell-level drug response with the strategy depicted in Fig. 1B. Briefly, single-cell RNA sequencing data from a sample was first normalized with the gf-icf (Gene Frequency–Inverse Cell Frequency) pipeline to rank genes in each cell according to their relevance [24, 35, 36]. Then, for each cell, we used the most relevant genes as input for Gene Set Enrichment Analysis (GSEA) [25] against each GPDS ranked list. Since resistance biomarkers are at the top and sensitivity biomarkers are at the bottom of GPDS ranked lists, a positive value of the enrichment score (ES) returned by DREEP implies that the cell expresses genes associated with drug resistance. Vice versa, a negative ES indicates the cell expresses genes correlated with drug sensitivity. At the end of this process, it is thus possible to estimate the overall effect of each drug on a cell population by estimating the median DREEP enrichment score on that population and compute the percentage of cells sensitive to the drug by simply counting the number of cells in the sample that are associated with a significant ES less than zero (see the “Methods” section).

We found 90 small molecules shared among GDSC, CTRP2, and PRISM datasets. To assess the consistency of GPDS ranked lists, we calculated Spearman correlation coefficients (SCC) between GPDS representations of the same drug but constructed from distinct drug viability datasets. Figure S1 (Additional file 1) shows a substantial similarity in GPDS rankings of the same drug even if derived from two distinct datasets, surpassing random comparisons by several folds (see the “ Methods” section).

Estimation of the predictive performance of DREEP

To validate the GPDS ranked list and demonstrate the reliability of our method as well as measure its predictive performance on in vitro data, we applied it to the Kinker et al. dataset [29] comprising 53,513 single cell transcriptional profiles from 198 tumorigenic cell lines and spanning 22 distinct cancer types. To evaluate DREEP performance, we first converted predictions from the single-cell level to the cell-line level by computing the median enrichment score for each drug (i.e. GPDS) across the cells of each individual cell line. Then, to minimize potential confounders, we evaluated the performance of the DREEP method for each sensitivity dataset independently (see the “Methods” section). We also searched for the optimal number of relevant genes to use for enrichment analysis and predict the sensitivity of a cell (i.e. 50, 100, 250, 500, and 1000 genes). Figure 1C shows the method’s overall performance in terms of precision-recall and ROC (receiver operating characteristic) curves across 198 cancer cell lines for each drug sensitivity dataset independently (i.e. CTRP2, GDSC, and PRISM). In this analysis, predicted drug/cell-line pairs are ranked according to the median enrichment score estimated by DREEP and the performance is evaluated against the corresponding gold standard (see the “Methods” section). Figure 1D reports the ROC curve’s AUC (area under the curve) for all drugs and all cell lines individually. Next, to investigate whether DREEP exhibited a prediction bias toward a class of drugs characterized by a specific Mechanism of Action (MoA), we categorized the ROC curve’s AUC of each drug in Fig. 1D (left panels) into two distinct groups: (i) accurately predicted (defined as ROC-AUC ≥ 0.75) and (ii) erroneously predicted drugs (defined as ROC-AUC ≤ 0.5, indicating random performance). This analysis yielded a total of 639 drugs whose efficacy was accurately predicted (209 from GDSC, 147 from CTRP2, and 282 from PRISM; Additional file 2: Table S1) and 135 drugs whose efficacy was erroneously predicted by DREEP (21 from GDSC, 23 from CTRP2, and 91 from PRISM; Additional file 2: Table S1). Notably, accurately predicted drugs exhibited a significant enrichment [32] (FDR < 10%) across a broad spectrum of MoAs (see the “Methods” section), spanning various biological mechanisms (Additional file 2: Table S2). In contrast, the 135 drugs predicted with low accuracy showed enrichment for a limited number of MoAs but overlapped with the MoAs of accurately predicted drugs (Additional file 2: Table S2), except for prostanoid and glucocorticoid agonist molecules. This analysis indicates that DREEP did not have a prediction bias toward a specific class of drugs, but it proved incapable of predicting the effects of these 135 small molecules, leading us to exclude them from the final version of the tool. Lastly, to assess DREEP’s ability to generalize across different cancer types, we grouped the ROC curve’s AUC values for individual cell lines (as reported in Fig. 1D, right panels) according to their respective cancer types. Figure S2 (Additional file 1) illustrates that DREEP does not reveal any significant performance drop associated with specific cancer types except for neuroblastoma where average AUC is lower than in other cancer types. Finally, we conducted an additional assessment of DREEP’s predictive performance where we constructed GPDS ranked lists using the drug’s IC50 values instead of the AUC metric. As depicted in Fig. S3 (Additional file 1), DREEP’s performance noticeably decreased across all three drug viability datasets when employing this alternate approach.

To test DREEP’s performance on an additional independent dataset, we applied it to a single-cell breast cancer dataset [5]. This dataset comprises 35,276 individual cells from 32 breast cancer cell lines covering all the main breast cancer tumour subtypes (Luminal/Her2-positive/Basal Like). Figure 1E and F show DREEP’s performance in this additional dataset.

Next, we tested the prediction performance of the DREEP method on both pan-cancer and breast cancer datasets using absolute gene expression levels instead of gf-icf normalized data. We observed a drastic reduction of the predictive performance of the method (Additional file 1: Fig. S4) using the absolute gene expression level to estimate its relevance in a cell, showing the importance of data pre-processing prior to the application of a drug prediction algorithm. As a good compromise between the required computational time and prediction accuracy, we decided to use N = 500 genes as a default value for the DREEP sensitivity predictions in all subsequent analyses in the manuscript.

Finally, we used both the single-cell pan-cancer and breast cancer datasets to conduct a comparative analysis of DREEP’s performance with four other widely used single-cell drug prediction methods, including scDRUG [17], scDEAL [19], and beyondCell [20] (see the “Methods” section). We evaluated the performance of these methods based on the shared drugs, as depicted in Fig. S5A (Additional file 1). As demonstrated in Fig. S5B-F (Additional file 1) and Table S8-10, DREEP consistently outperformed the other methods across both datasets. While the biomarker-based method beyondCell, similar to DREEP, showcased reasonable performance when using the GDSC dataset, scDEAL and scDRUG yielded instead subpar results in all comparisons. This discrepancy is likely due to these two methods being trained on high-coverage single-cell data from 10X Genomics, which inherently has fewer dropout events compared to the low-coverage single-cell datasets used in our comparisons.

Taken together, these results collectively demonstrate the validity of our GPDS ranked lists for predicting the sensitivity of cells to drugs. Furthermore, our comprehensive evaluation showcases that DREEP consistently outperforms random predictions across various scenarios and outperforms other state-of-the-art methods.

The DREEP method reconstructs the therapeutic landscape of the pan-cancer dataset

Next, we used the DREEP method to reconstruct the therapeutic landscape of the 22 cancer types in the cell line pan-cancer dataset of Kinker et al. [29]. As shown in Fig. 2A, B (and Additional file 2: Table S3), we first used DREEP to convert each cell line into a list of small molecules predicted to be effective for at least 90% of profiled cells. We then grouped the cells and the drugs into four clusters each, so that cell lines that respond similarly across the tested drugs are grouped together, and analogously drugs that have an effect on the same cell lines are grouped together (i.e. cell line clusters C1, C2 C3, and C4 and drug clusters D1, D2, D3, and D4).

These analyses revealed interesting associations between cancer vulnerabilities and drug response, highlighting numerous cell lines and cancer types predicted to be sensitive to small molecules with shared mechanisms of action (MoA) [32]. Indeed, as shown in Fig. 2A, B (and Additional file 2: Table S3 and Additional file 2: Table S4), cell lines in clusters C1 and C2 comprised 87% (13 out of 15) of head and neck cell lines, 100% of colorectal cell lines (10 out of 10), 83% of bladder cell lines (5 out of 6), 73% of breast cell lines (11 out of 15), 73% of pancreatic cell lines (8 out of 11), and 33% of lung cancer cell lines (13 out of 39). These cancers all have in common epidermal growth factor (EGF) receptor overexpression/amplification, such as EGFR and ERBB2 [37]. Indeed, this is one of the major alterations in colorectal (CRC, 60–80%) [38], lung (NSCLC, 40–80%) [39], head and neck (HNSCC, 80–90%) [40, 41], bladder [42, 43], pancreatic (PDAC, 30–89%) [44], and breast cancers [45]. For these tumour types, therapies targeting EGFR, ERBB2, or their downstream effectors such as MEK or PI3K have shown some degree of success [41, 46,47,48]. As depicted in Fig. 2A, B, cell-line clusters C1 and C2 exhibit sensitivity to drugs in the D1 cluster that is enriched for drugs whose mode of action (see the “Methods” section and Additional file 2: Table S4) is related to the inhibition of EGFR and MEK signalling pathways.

Cluster C3 includes 69% of ovarian cancer cell lines (9 out of 13) and 70% of endometrial/uterine ones (7 out of 10). Cell lines in this cluster are predicted to be sensitive to drugs in the D2 cluster, which is enriched for topoisomerase, AURORA, HDAC, and CDK inhibitors. Interestingly, ovarian and uterine cancers are usually due to inefficient DNA repair [49]. The standard of care for these cancer types is mainly based on neoadjuvant chemotherapy and several assays on cell lines identified drugs targeting apoptosis or cell cycle regulators as being effective [50,51,52,53,54,55,56,57,58,59,60,61,62].

Cluster C4 includes all skin cancer cell lines (16 out of 16), all kidney (6 out of 6), all brain (12 out of 12), and 13 lung cancer cell lines. Cell lines in cluster C4 are predicted to be sensitive to multiple families of drugs, including RAF, KIT, PDGFR, and PI3K/mTOR inhibitors. This agrees with current clinical practice which uses mTOR and BRAF inhibitors as first-line therapy for kidney cancer and melanoma, respectively [63, 64], since the c-Met/mTOR axis is frequently altered in kidney cancer [65, 66] while BRAF mutations are the main genetic drivers of melanoma [64]. Finally, RTKs are frequently altered in glioblastoma [67, 68].

To further demonstrate DREEP’s ability to identify recurring drug vulnerabilities across cancer cells of different tumour types with shared genetic backgrounds, we applied our method to the dataset from McFarland et al. [69]. This dataset encompasses 3630 untreated and 4175 nutlin-treated cells, culminating in an extensive collection of 7805 cells (Fig. 2C). Cells were sampled at two time points, 6 and 24 h post-treatment, and originated from 24 distinct cancer cell lines covering 14 different tumour types. Nutlin, a selective MDM2 inhibitor, is a negative regulator of the tumour-suppressor gene TP53, triggering apoptosis and cell cycle arrest exclusively in cancer cells retaining the wild-type TP53 form [70]. As depicted in Fig. 2C, seven of the 24 cell lines within this dataset harbour a missense mutation in the TP53 gene, while all other cell lines retain the wild-type (WT) TP53 form. Indeed, as demonstrated in Fig. S6 (Additional file 1), DREEP accurately predicted that most TP53 wild-type cells would exhibit sensitivity to nutlin treatment, while cells bearing TP53 mutations would display resistance. To validate these predictions, we first aggregated them at the cell line level by computing the median enrichment score of nutlin estimated by DREEP across individual cells within each respective cell line and then correlated these scores with the experimentally observed nutlin response from McFarland et al. [69] (see the “Methods” section). As depicted in Fig. 2D, E, this correlation held true for both untreated and nutlin-perturbed cells and across the two time points. These findings further underscore the predictive accuracy of our method in effectively characterizing drug responses.

Altogether, these results show that DREEP can find recurrent drug sensitivity patterns shared by cell lines from multiple cancer types, as well as the relationship between drug response and functional cell status.

The DREEP method recapitulates cell population drug response over time

Next, to evaluate DREEP’s ability to recognize adaptation to drug treatment over time, we applied it to the Ben-David et al. dataset [30]. As Fig. 3A shows, this dataset comprises 7440 cells from a single cell-derived clone of the MCF7 cell line exposed for 2 days to 500 nM of bortezomib, a 26S proteasome inhibitor. Specifically, cells were sequenced before treatment (t0), at 12 h and 48 h following drug exposure (t12, t48), and 96 h following drug washout and recovery (t96).

As shown in Fig. 3B, the distribution of Bortezomib ES scores estimated by DREEP at time t0, t12, and t48 exhibits a bimodal pattern, indicating the presence of a heterogeneous cell population comprising both bortezomib-tolerant and bortezomib-sensitive cells (Fig. 2E, F). By contrast, DREEP identified that at the final time point (t96), the distribution of bortezomib ES scores is unimodal, with a homogeneous cell population where most cells underwent transcriptome rewiring characterized by the expression of biomarker genes associated with bortezomib resistance (Fig. 3C, D). Finally, we were interested in evaluating DREEP’s ability to detect distinct drug-response cell subpopulations before and after bortezomib treatment. To this end, we used drug response profiles estimated by DREEP instead of transcriptional profiles to build a new cell-embedding space (see the “Methods” section). Figure 3E shows that the cell-embedding space built from DREEP sensitivity predictions is better than the mixed cell groups identified using transcriptional profiles (Fig. 3A) at separating MCF7 cells into two main clusters resembling the drug exposure over time.

In summary, these analyses demonstrate DREEP’s ability to recapitulate drug response and variability over time.

The DREEP method identifies small molecules that selectively inhibit cancer cell subpopulations

Next, we evaluated DREEP’s ability to identify innate drug-resistant cell populations and to suggest drug co-treatments. To this end, we used single-cell data of the untreated barcoded MDA-MB-468 breast cancer cell line that we recently generated [4]. This dataset includes 1541 single cells from the untreated MDA-MB-468 cell line, each associated with a molecularly expressed barcode (i.e. lineage). In the original study, the purpose of these barcodes was to enable single-cell lineage tracing within the MDA-MB-468 cell population. Subsequently, these cells underwent exposure to increasing concentrations of Afatinib, a tyrosine kinase inhibitor that targets EGFR and HER2 receptors, over an extended period exceeding 40 days. A bulk NGS approach was then used to identify the lineages (i.e. barcodes) that survived this prolonged Afatinib exposure. Finally, the surviving lineages were retrospectively mapped onto the single-cell sequencing data of the untreated barcoded MDA-MB-468 cells. This approach, known as “retrospective lineage tracing”, enabled the categorization of untreated MDA-MB-468 cells as either sensitive or tolerant to Afatinib and anti-EGFR treatments [4] (Fig. 3F).

We applied DREEP to each of the two untreated MDA-MB-468 cell subpopulations (i.e. afatinib-tolerant and afatinib-sensitive) and predicted drugs able to selectively inhibit the growth of either. We thus found 156 drugs predicted to preferentially inhibit the afatinib-tolerant subpopulation and 80 drugs (FDR < 10% and Additional file 2: Table S5) for the afatinib-sensitive subpopulation (Fig. 3G), which included 11 out of 12 EGFR inhibitors (Additional file 2: Table S5).

Interestingly, the most overrepresented classes among the drugs predicted to selectively inhibit the afatinib-tolerant subpopulation were IGF1R, HDAC, and AKT/MTOR/PI3K inhibitors (Fig. 3H). We have already shown that afatinib and IGF1R inhibitors have a strong synergistic effect on these cells [4], thus validating the DREEP predictions that IGF1R inhibitors should deplete the cell population that does not respond to afatinib in this cell line.

To experimentally validate these predictions further, we selected three HDAC inhibitors (Tenovin-6, bellinostat, and AR-42) and three AKT/MTOR/PI3K inhibitors (Uprosertib, PIK93, and buparlisib) from the list of drugs predicted to selectively inhibit the afatinib-tolerant subpopulation. We then measured experimentally if these drugs had an additive, synergistic, or antagonistic effect on MDA-MB-468 cells in combination with afatinib (Additional file 2: Table S6). As shown in Fig. 3I, the median synergy score of each of the six selected drugs tested in combination with afatinib is compatible with an additive effect [31], suggesting that the two drugs work independently on different subpopulations of cells. Finally, as Fig. 3J shows, when cells are embedded into an embedding space reconstructed from DREEP sensitivity profiles (see the “Methods” section), two main cell clusters appear that are enriched for either afatinib-tolerant or afatinib-sensitive lineages, in contrast to the mixed cell groups identified using transcriptional profiles (Fig. 3F). These results show how the DREEP method alone can recognize that there are two different cell populations in this cell line characterized by a distinct drug response profile.

Altogether, these results show that DREEP can predict drug sensitivity at the single-cell level and identify drugs able to selectively inhibit a subpopulation of cells that can be then combined to find drug combinations.

The DREEP method accurately predicts drug response in heterogeneous cancer patient cells

Cancer cell lines often display restricted transcriptomic heterogeneity, failing to mirror the full complexity of tumours. Therefore, to demonstrate the clinical applicability of DREEP, we utilized our methodology on a series of patient-derived cultures (PDCs) from both primary and metastatic sites of individuals suffering from head and neck squamous cell carcinoma [23]. These models have already been proven to effectively replicate the complexity of the corresponding matching tumour [71], with the advantage that sensitivity measurements could be consistently assessed across multiple drugs [71]. Specifically, this dataset comprises 1027 single-cell transcriptional profiles of PDCs obtained from five head and neck cancer patients whose drug response was measured in triplicate for five different small molecules at two different concentrations [18].

As illustrated in Fig. 4A, the UMAP representation of single-cell transcriptomic profiles confirmed the presence of intra-patient transcriptomic heterogeneity within PDCs, particularly when juxtaposed with inter-patient heterogeneity. Notably, in line with observations previously reported in [23, 71], patients HN120 and HN137 displayed a significant degree of heterogeneity, where primary and lymph node metastatic tumour cells delineate two transcriptionally distinct cell subpopulations.

Next, to assess the predictive performance of DREEP on this dataset, we first applied it to the single-cell profiles of each patient and then converted these predictions from the single-cell level to the patient level by computing the median enrichment score for each drug across the cells within each individual patient. As depicted in Fig. 4B, our analysis uncovered a significant correlation (Spearman’s correlation coefficient = 0.634, p-value = 3.6e − 09) between DREEP predictions and the experimentally observed drug responses for five specific drugs: Docetaxel, Doxorubicin, Epothilone B, Gefitinib, and Vorinostat (refer to the “Methods” section) [18]. Patients’ cell viability was determined using each drug’s median IC50 values [18], which were threefold lower than those estimated from ATCC head and neck cancer cell lines present in the GDSC portal (see the “Methods” section). Finally, Fig. 4C illustrates Spearman’s correlation coefficient (SCC) between DREEP predictions and the experimentally observed drug responses within each independent GPDS dataset. Notably, GDSC exhibited the strongest performance with an SCC of 0.749 (p-value = 1.6e − 05), followed by CTRP2 with an SCC of 0.591 (p-value = 0.006), and PRISM with an SCC of 0.512 (p-value = 0.008).

Discussion

Intra-tumour heterogeneity (ITH) presents a significant challenge in developing effective cancer treatments and achieving precision oncology. ITH refers to the variation of tumour cells both within and between tumours. As such, it is crucial to have methods to accurately identify the drug sensitivities of different tumour clones and determine the most effective treatment for each cancer type and the individual patient. To address this challenge, we introduced DREEP, a novel computational method that predicts drug sensitivity at the single-cell level and can propose cell population-specific treatments. To achieve this, DREEP uses a set of drug-sensitivity signatures obtained by correlating gene expression and drug sensitivity profiles across hundreds of tumorigenic cell lines. It then uses these signatures to compute an enrichment score reflecting the degree of sensitivity of a cell to a specific drug.

To test the predictive performance and reliability of DREEP, we applied it to two independent datasets: (i) the pan-cancer cell line dataset from Kinker et al. comprising 198 tumorigenic cell lines from 22 distinct cancer types [29] and (ii) the single-cell breast cancer atlas from Gambardella et al. that includes 32 breast cancer cell lines [5]. We observed that DREEP successfully identifies the most effective drugs for each cell line with an overall performance that is several folds better than random on both datasets. We also successfully used DREEP to reconstruct the single-cell variability in drug response across the 198 cell lines of the pan-cancer dataset and to identify the functional cellular activities linked to common drug response patterns among cancer types and cell lines. Our approach can offer valuable insights for understanding the complex interplay between genetic backgrounds and drug sensitivity, paving the way for more personalized and effective cancer treatments. In addition, when used to analyse single-cell data collected from the breast cancer cell line MCF7 exposed to bortezomib over time [30], DREEP easily reconstructed the interchange between the sensitive and tolerant cell populations. Finally, using single-cell data of the molecularly labelled MDA-MB-468 cell line [4], we showed that DREEP can identify innate drug-resistant cell populations and predict drug co-treatments able to deplete these cell populations. Overall, we show that DREEP produces an accurate reconstruction of the variability of drug response in cancer cell lines from single-cell transcriptomics. The DREEP R package is available at https://github.com/gambalab/DREEP.

Our tool relies heavily on the quality and predictive power of GPDS, which were trained by linking gene expression to drug response data from pharmacogenomic screens. However, these screens may possess inherent biases and limitations in covering the broad spectrum of cancer types. We applied several evaluation metrics on multiple showcases to assess DREEP’s predictive accuracy and ability to generalize across different cancer types, encompassing approximately one hundred thousand cells originating from more than 20 distinct cancer types.

Conclusions

While our results have demonstrated DREEP’s ability to generalize across various cancer types effectively, we recognize that potential biases in our training data could limit DREEP’s applicability to underrepresented cancer types. Therefore, it is crucial to interpret DREEP’s performance within the context of the available training data. We advocate for ongoing refinement and expansion of pharmacogenomic screens to improve the diversity and coverage of cancer types. Addressing these biases in future data collection efforts can enhance DREEP’s utility in advancing personalized cancer treatment strategies. Moreover, in the future, it would be crucial to evaluate how the integration of other single-cell -omics measurements, such as genomic, proteomic, and epigenetic cell states, can improve the identification of variations in drug responses among cancer cell populations. Finally, since tumour heterogeneity is linked to clonal evolution processes, it would be also important to consider the dynamics of clonal populations under therapy pressure to improve the performance of future computational methods that will be developed for personalized diagnosis and treatment of cancer patients.

Availability and requirements

Project name: DREEP.

Project home page: https://github.com/gambalab/DREEP

Operating system(s): Platform independent.

Programming language: R and C + +

Other requirements: R version ≥ 4.3

Licence: MIT License.

Any restrictions to use by non-academics: Licence needed.

Availability of data and materials

All the processed datasets and associated metadata, including scRNA-seq, bulk RNA-seq, and sensitivity profiles used in this work, have been deposited on figshare at the following address https://doi.org/10.6084/m9.figshare.23261234. The DREEP R package is available at https://github.com/gambalab/DREEP.

Change history

16 February 2024
A Correction to this paper has been published: https://doi.org/10.1186/s12916-024-03289-z

Abbreviations

AUC:: Area under the curve
CPM:: Counts per million
CTRP2:: Cancer Therapeutic Response Portal
DMSO:: Dimethyl sulfoxide
DREEP:: Drug Response Estimation from single-cell Expression Profiles
ES:: Enrichment Score
FDR:: False discovery rate
GDSC:: Genomics of Drug Sensitivity in Cancer
GPDS:: Genomic Profiles of Drug Sensitivity
ITH:: Intra-tumour heterogeneity
MoA:: Mechanism of Action
PDCs:: Patient-derived cultures
PRISM:: Profiling Relative Inhibition in Mixtures
ROC:: Receiver operating characteristic
RNA-seq:: RNA sequencing
SCC:: Spearman correlation coefficient
scRNA-seq:: Single-cell RNA sequencing
UMI:: Unique Molecular Identifier

References

Shaffer SM, Dunagin MC, Torborg SR, Torre EA, Emert B, Krepler C, et al. Rare cell variability and drug-induced reprogramming as a mode of cancer drug resistance. Nature. 2017;546(7658):431–5. Available from: https://doi.org/10.1038/nature22794%5Cn http://www.ncbi.nlm.nih.gov/pubmed/28607484.
Article CAS PubMed PubMed Central ADS Google Scholar
Sharma SV, Lee DY, Li B, Quinlan MP, Takahashi F, Maheswaran S, et al. A chromatin-mediated reversible drug-tolerant state in cancer cell subpopulations. Cell. 2010;141(1):69–80. Available from: https://doi.org/10.1016/j.cell.2010.02.027.
Article CAS PubMed PubMed Central Google Scholar
Tirosh I, Izar B, Prakadan SM, Wadsworth MH, Treacy D, Trombetta JJ, et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science (80-). 2016;352(6282):189–96. Available from: http://science.sciencemag.org.gate2.inist.fr/content/352/6282/189.abstract%5Cn. http://www.sciencemag.org/cgi/doi/10.1126/science.aad0501
Article CAS PubMed Central ADS Google Scholar
Pellecchia S, Franchini M, Viscido G, Arnese R, Gambardella G. Single cell lineage tracing reveals subclonal dynamics of anti-EGFR therapy resistance in triple negative breast cancer. bioRxiv. 2023;2023.04.04.535588. Available from: http://biorxiv.org/content/early/2023/04/06/2023.04.04.535588.abstract.
Gambardella G, Viscido G, Tumaini B, Isacchi A, Bosotti R, di Bernardo D. A single-cell analysis of breast cancer cell lines to study tumour heterogeneity and drug response. Nat Commun. 2022;13(1):1714. Available from: https://doi.org/10.1038/s41467-022-29358-6.
Article CAS PubMed PubMed Central ADS Google Scholar
Li X, Francies HE, Secrier M, Perner J, Miremadi A, Galeano-Dalmau N, et al. Organoid cultures recapitulate esophageal adenocarcinoma heterogeneity providing a model for clonality studies and precision therapeutics. Nat Commun. 2018;9(1):2983. Available from: https://doi.org/10.1038/s41467-018-05190-9.
Article CAS PubMed PubMed Central ADS Google Scholar
Saeed K, Ojamies P, Pellinen T, Eldfors S, Turkki R, Lundin J, et al. Clonal heterogeneity influences drug responsiveness in renal cancer assessed by ex vivo drug testing of multiple patient-derived cancer cells. Int J Cancer. 2019;144(6):1356–66. Available from: https://doi.org/10.1002/ijc.31815.
Article CAS PubMed Google Scholar
Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, et al. A landscape of pharmacogenomic interactions in cancer. Cell. 2016;166(3):740–54. Available from: https://doi.org/10.1016/j.cell.2016.06.017.
Article CAS PubMed PubMed Central Google Scholar
Rees MG, Seashore-Ludlow B, Cheah JH, Adams DJ, Price EV, Gill S, et al. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat Chem Biol. 2016;12(2):109–16. Available from: http://www.ncbi.nlm.nih.gov/pubmed/26656090.
Article CAS PubMed Google Scholar
Costello JC, Heiser LM, Georgii E, Gonen M, Menden MP, Wang NJ, et al. A community effort to assess and improve drug sensitivity prediction algorithms. Nat Biotechnol. 2014;32(12):1202–12. Available from: http://www.ncbi.nlm.nih.gov/pubmed/24880487.
Article CAS PubMed PubMed Central Google Scholar
Daemen A, Griffith OL, Heiser LM, Wang NJ, Enache OM, Sanborn Z, et al. Modeling precision treatment of breast cancer. Genome Biol. 2013;14(10):R110. Available from: http://www.ncbi.nlm.nih.gov/pubmed/24176112.
Article PubMed PubMed Central Google Scholar
Heiser LM, Sadanandam A, Kuo WL, Benz SC, Goldstein TC, Ng S, et al. Subtype and pathway specific responses to anticancer compounds in breast cancer. Proc Natl Acad Sci U S A. 2012;109(8):2724–9. Available from: http://www.ncbi.nlm.nih.gov/pubmed/22003129.
Article CAS PubMed ADS Google Scholar
Rees MG, Seashore-Ludlow B, Cheah JH, Adams DJ, Price EV, Gill S, et al. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat Chem Biol. 2015;2015(12):1–10. Available from: https://doi.org/10.1038/nchembio.1986.
Google Scholar
Trastulla L, Noorbakhsh J, Vazquez F, McFarland J, Iorio F. Computational estimation of quality and clinical relevance of cancer cell lines. Mol Syst Biol. 2022;18(7):e11017. Available from: https://doi.org/10.15252/msb.202211017.
Article PubMed PubMed Central Google Scholar
Wu L, Wen Y, Leng D, Zhang Q, Dai C, Wang Z, et al. Machine learning methods, databases and tools for drug combination prediction. Brief Bioinform. 2022;23(1):bbab355. Available from: https://doi.org/10.1093/bib/bbab355.
Article CAS PubMed Google Scholar
Adam G, Rampášek L, Safikhani Z, Smirnov P, Haibe-Kains B, Goldenberg A. Machine learning approaches to drug response prediction: challenges and recent progress. npj Precis Oncol. 2020;4(1):19. Available from: https://doi.org/10.1038/s41698-020-0122-1.
Article PubMed PubMed Central Google Scholar
Hsieh C-Y, Wen J-H, Lin S-M, Tseng T-Y, Huang J-H, Huang H-C, et al. scDrug: from single-cell RNA-seq to drug response prediction. Comput Struct Biotechnol J. 2023;21:150–7. Available from: https://www.sciencedirect.com/science/article/pii/S2001037022005505.
Article CAS PubMed Google Scholar
Suphavilai C, Chia S, Sharma A, Tu L, Da Silva RP, Mongia A, et al. Predicting heterogeneity in clone-specific therapeutic vulnerabilities using single-cell transcriptomic signatures. Genome Med. 2021;13(1):189. Available from: https://doi.org/10.1186/s13073-021-01000-y.
Article CAS PubMed PubMed Central Google Scholar
Chen J, Wang X, Ma A, Wang Q-E, Liu B, Li L, et al. Deep transfer learning of cancer drug responses by integrating bulk and single-cell RNA-seq data. Nat Commun. 2022;13(1):6494. Available from: https://doi.org/10.1038/s41467-022-34277-7.
Article CAS PubMed PubMed Central ADS Google Scholar
Fustero-Torre C, Jiménez-Santos MJ, García-Martín S, Carretero-Puche C, García-Jimeno L, Ivanchuk V, et al. Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data. Genome Med. 2021;13(1):187. Available from: https://doi.org/10.1186/s13073-021-01001-x.
Article CAS PubMed PubMed Central Google Scholar
Hetzel L, Böhm S, Kilbertus N, Günnemann S, Lotfollahi M, Theis F. Predicting cellular responses to novel drug perturbations at a single-cell resolution. Adv Neural Inf Process Syst. 2022;35(NeurIPS):1–19.
Google Scholar
Xia F, Allen J, Balaprakash P, Brettin T, Garcia-Cardona C, Clyde A, et al. A cross-study analysis of drug response prediction in cancer cell lines. Brief Bioinform. 2022;23(1):bbab356. Available from: https://doi.org/10.1093/bib/bbab356.
Article PubMed Google Scholar
Sharma A, Cao EY, Kumar V, Zhang X, Leong HS, Wong AML, et al. Longitudinal single-cell RNA sequencing of patient-derived primary cells reveals drug-induced infidelity in stem cell hierarchy. Nat Commun. 2018;9(1):4931. Available from: https://doi.org/10.1038/s41467-018-07261-3.
Article CAS PubMed PubMed Central ADS Google Scholar
Franchini M, Pellecchia S, Viscido G, Gambardella G. Single-cell gene set enrichment analysis and transfer learning for functional annotation of scRNA-seq data. NAR Genomics Bioinforma. 2023;5(1):1qad024. Available from: https://doi.org/10.1093/nargab/lqad024.
Article CAS Google Scholar
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102(43):15545–50. Available from: http://www.pnas.org/cgi/content/long/102/43/15545. Cited 2014 Jul 10.
Article CAS PubMed PubMed Central ADS Google Scholar
Korotkevich G, Sukhov V, Budin N, Shpak B, Artyomov MN, Sergushichev A. Fast gene set enrichment analysis. bioRxiv. 2021;60012. Available from: http://biorxiv.org/content/early/2021/02/01/060012.abstract.
Botev ZI, Kroese DP. An efficient algorithm for rare-event probability estimation, combinatorial optimization, and counting. Methodol Comput Appl Probab. 2008;10(4):471–505. Available from: https://doi.org/10.1007/s11009-008-9073-7.
Article MathSciNet Google Scholar
Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26. Available from: https://doi.org/10.1093/bioinformatics/btp616.
Kinker GS, Greenwald AC, Tal R, Orlova Z, Cuoco MS, McFarland JM, et al. Pan-cancer single-cell RNA-seq identifies recurring programs of cellular heterogeneity. Nat Genet. 2020;52(11):1208–18. Available from: https://doi.org/10.1038/s41588-020-00726-6.
Article CAS PubMed PubMed Central Google Scholar
Ben-David U, Siranosian B, Ha G, Tang H, Oren Y, Hinohara K, et al. Genetic and transcriptional evolution alters cancer cell line drug response. Nature. 2018;560(7718):325–30. Available from: https://doi.org/10.1038/s41586-018-0409-3.
Article CAS PubMed PubMed Central ADS Google Scholar
Di Veroli GY, Fornari C, Wang D, Mollard S, Bramhall JL, Richards FM, et al. Combenefit: an interactive platform for the analysis and visualization of drug combinations. Bioinformatics. 2016;32(18):2866–8. Available from: https://doi.org/10.1093/bioinformatics/btw230.
Article CAS PubMed PubMed Central Google Scholar
Kropiwnicki E, Evangelista JE, Stein DJ, Clarke DJB, Lachmann A, Kuleshov MV, et al. Drugmonizome and Drugmonizome-ML: integration and abstraction of small molecule attributes for drug enrichment analysis and machine learning. Database [Internet]. 2021;29(2021):baab017. Available from: https://doi.org/10.1093/database/baab017.
Article CAS Google Scholar
Rees MG, Brenan L, do Carmo M, Duggan P, Bajrami B, Arciprete M, et al. Systematic identification of biomarker-driven drug combinations to overcome resistance. Nat Chem Biol. 2022;18(6):615–24. Available from: https://doi.org/10.1038/s41589-022-00996-7.
Article CAS PubMed Google Scholar
Corsello SM, Nagari RT, Spangler RD, Rossen J, Kocak M, Bryan JG, et al. Discovering the anticancer potential of non-oncology drugs by systematic viability profiling. Nat Cancer. 2020;1(2):235–48. Available from: https://doi.org/10.1038/s43018-019-0018-6.
Article CAS PubMed PubMed Central Google Scholar
Gambardella G, di Bernardo D. A tool for visualization and analysis of single-cell RNA-seq data based on text mining. Front Genet. 2019;10. Available from: https://www.frontiersin.org/article/10.3389/fgene.2019.00734/full.
Slovin S, Carissimo A, Panariello F, Grimaldi A, Bouché V, Gambardella G, et al. Single-cell RNA sequencing analysis: a step-by-step overview BT - RNA bioinformatics. In: Picardi E, editor. New York, NY: Springer US; 2021. p. 343–65. Available from: https://doi.org/10.1007/978-1-0716-1307-8_19.
Tajadura-Ortega V, Gambardella G, Skinner A, Halim A, Van Coillie J, Schjoldager KT-BG, et al. O-linked mucin-type glycosylation regulates the transcriptional programme downstream of EGFR. Glycobiology. 2020. Available from: https://doi.org/10.1093/glycob/cwaa075.
Pabla B, Bissonnette M, Konda VJ. Colon cancer and the epidermal growth factor receptor: current treatment paradigms, the importance of diet, and the role of chemoprevention. World J Clin Oncol. 2015;6(5):133–41.
Article PubMed PubMed Central Google Scholar
Prabhakar CN. Epidermal growth factor receptor in non-small cell lung cancer. Transl Lung Cancer Res Vol 4, No 2 (April 19, 2015) Transl Lung Cancer Res (Molecular Genet Lung Cancer)<sup>1</sup>. 2015; Available from: https://tlcr.amegroups.com/article/view/3745.
Stransky N, Egloff AM, Tward AD, Kostic AD, Cibulskis K, Sivachenko A, et al. The mutational landscape of head and neck squamous cell carcinoma. Science (80- ). 2011;333(6046):1157–60. Available from: https://doi.org/10.1126/science.1208130.
Article CAS ADS Google Scholar
Lawrence MS, Sougnez C, Lichtenstein L, Cibulskis K, Lander E, Gabriel SB, et al. Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature. 2015;517(7536):576–82. Available from: https://doi.org/10.1038/nature14129.
Article CAS ADS Google Scholar
Chaux A, Cohen JS, Schultz L, Albadine R, Jadallah S, Murphy KM, et al. High epidermal growth factor receptor immunohistochemical expression in urothelial carcinoma of the bladder is not associated with EGFR mutations in exons 19 and 21: a study using formalin-fixed, paraffin-embedded archival tissues. Hum Pathol. 2012;43(10):1590–5. Available from: https://www.sciencedirect.com/science/article/pii/S0046817711004898.
Article CAS PubMed PubMed Central Google Scholar
Røtterud R, Nesland JM, Berner A, Fosså SD. Expression of the epidermal growth factor receptor family in normal and malignant urothelium. BJU Int. 2005;95(9):1344–50. Available from: https://doi.org/10.1111/j.1464-410X.2005.05497.x.
Article CAS PubMed Google Scholar
Nedaeinia R, Avan A, Manian M, Salehi R, Ghayour-Mobarhan M. EGFR as a potential target for the treatment of pancreatic cancer: dilemma and controversies. Curr Drug Targets. 2014;15(14):1293–301. Available from: https://app.dimensions.ai/details/publication/pub.1069181356.
Article CAS PubMed Google Scholar
Harbeck N, Penault-Llorca F, Cortes J, Gnant M, Houssami N, Poortmans P, et al. Breast cancer. Vol. 5, Nature Reviews Disease Primers. 2019.
Google Scholar
Han J, Liu Y, Yang S, Wu X, Li H, Wang Q. MEK inhibitors for the treatment of non-small cell lung cancer. J Hematol Oncol. 2021;14(1):1. Available from: https://doi.org/10.1186/s13045-020-01025-7.
Article CAS PubMed PubMed Central Google Scholar
Xie Y-H, Chen Y-X, Fang J-Y. Comprehensive review of targeted therapy for colorectal cancer. Signal Transduct Target Ther. 2020;5(1):22. Available from: https://doi.org/10.1038/s41392-020-0116-z.
Article CAS PubMed PubMed Central Google Scholar
Johnson DE, Burtness B, Leemans CR, Lui VWY, Bauman JE, Grandis JR. Head and neck squamous cell carcinoma. Nat Rev Dis Prim. 2020;6(1):92. Available from: https://doi.org/10.1038/s41572-020-00224-3.
Article PubMed Google Scholar
Mirza-Aghazadeh-Attari M, Ostadian C, Saei AA, Mihanfar A, Darband SG, Sadighparvar S, et al. DNA damage response and repair in ovarian cancer: potential targets for therapeutic strategies. DNA Repair (Amst). 2019;80:59–84. Available from: https://www.sciencedirect.com/science/article/pii/S1568786418303148.
Article CAS PubMed Google Scholar
Al-Alem LF, Baker AT, Pandya UM, Eisenhauer EL, Rueda BR. Understanding and Targeting Apoptotic Pathways in Ovarian Cancer. Cancers. 2019; 11(11):1631. https://doi.org/10.3390/cancers11111631.
Zervantonakis IK, Iavarone C, Chen H-Y, Selfors LM, Palakurthi S, Liu JF, et al. Systems analysis of apoptotic priming in ovarian cancer identifies vulnerabilities and predictors of drug response. Nat Commun. 2017;8(1):365. Available from: https://doi.org/10.1038/s41467-017-00263-7.
Article CAS PubMed PubMed Central ADS Google Scholar
Psilopatis I, Pergaris A, Giaginis C, Theocharis S. Histone deacetylase inhibitors: a promising therapeutic alternative for endometrial carcinoma. W. Thiel K, editor. Dis Markers. 2021;2021:7850688. Available from: https://doi.org/10.1155/2021/7850688.
Article CAS PubMed PubMed Central Google Scholar
Moufarrij S, Srivastava A, Gomez S, Hadley M, Palmer E, Austin PT, et al. Combining DNMT and HDAC6 inhibitors increases anti-tumor immune signaling and decreases tumor burden in ovarian cancer. Sci Rep. 2020;10(1):3470. Available from: https://doi.org/10.1038/s41598-020-60409-4.
Article CAS PubMed PubMed Central ADS Google Scholar
Yang H, Sun B, Xu K, He Y, Zhang T, Hall SRR, et al. Pharmaco-transcriptomic correlation analysis reveals novel responsive signatures to HDAC inhibitors and identifies Dasatinib as a synergistic interactor in small-cell lung cancer. eBioMedicine. 2021;69. Available from: https://doi.org/10.1016/j.ebiom.2021.103457.
Shen Q, Li J, Mai J, Zhang Z, Fisher A, Wu X, et al. Sensitizing non-small cell lung cancer to BCL-xL-targeted apoptosis. Cell Death Dis. 2018;9(10):986. Available from: https://doi.org/10.1038/s41419-018-1040-9.
Article CAS PubMed PubMed Central Google Scholar
Han Z, Liang J, Li Y, He J. Drugs and clinical approaches targeting the antiapoptotic protein: a review. Caltabiano R, editor. Biomed Res Int. 2019;2019:1212369. Available from: https://doi.org/10.1155/2019/1212369.
Article CAS PubMed PubMed Central Google Scholar
Du R, Huang C, Liu K, Li X, Dong Z. Targeting AURKA in cancer: molecular mechanisms and opportunities for cancer therapy. Mol Cancer. 2021;20(1):15. Available from: https://doi.org/10.1186/s12943-020-01305-3.
Article CAS PubMed PubMed Central Google Scholar
Umene K, Yanokura M, Banno K, Irie H, Adachi M, Iida M, et al. Aurora kinase A has a significant role as a therapeutic target and clinical biomarker in endometrial cancer. Int J Oncol. 2015;46(4):1498–506.
Article CAS PubMed PubMed Central Google Scholar
Do T-V, Xiao F, Bickel LE, Klein-Szanto AJ, Pathak HB, Hua X, et al. Aurora kinase A mediates epithelial ovarian cancer cell migration and adhesion. Oncogene. 2014;33(5):539–49. Available from: https://doi.org/10.1038/onc.2012.632.
Article CAS PubMed Google Scholar
Pérez-Fidalgo JA, Gambardella V, Pineda B, Burgues O, Piñero O, Cervantes A. Aurora kinases in ovarian cancer. ESMO Open. 2020;5(5):e000718. Available from: https://www.sciencedirect.com/science/article/pii/S2059702920326909.
Article PubMed PubMed Central Google Scholar
Shah KN, Bhatt R, Rotow J, Rohrberg J, Olivas V, Wang VE, et al. Aurora kinase A drives the evolution of resistance to third-generation EGFR inhibitors in lung cancer. Nat Med. 2019;25(1):111–8. Available from: https://doi.org/10.1038/s41591-018-0264-7.
Article CAS PubMed Google Scholar
Takai N, Desmond JC, Kumagai T, Gui D, Said JW, Whittaker S, et al. Histone deacetylase inhibitors have a profound antigrowth activity in endometrial cancer cells. Clin Cancer Res. 2004;10(3):1141–9. Available from: https://doi.org/10.1158/1078-0432.CCR-03-0100.
Article CAS PubMed Google Scholar
Leonardi CG, Falzone L, Salemi R, Zanghì A, Spandidos AD, Mccubrey AJ, et al. Cutaneous melanoma: from pathogenesis to therapy (Review). Int J Oncol. 2018;52(4):1071–80. Available from: https://doi.org/10.3892/ijo.2018.4287.
Article CAS PubMed PubMed Central Google Scholar
Hodis E, Watson IR, Kryukov GV, Arold ST, Imielinski M, Theurillat J-P, et al. A landscape of driver mutations in melanoma. Cell. 2012;150(2):251–63. Available from: https://doi.org/10.1016/j.cell.2012.06.024.
Article CAS PubMed PubMed Central Google Scholar
Ghosh AP, Sudarshan S. Genetics of renal cancer: focus on MTOR. Aging (Albany NY). 2016;8(3):421–2. Available from: https://app.dimensions.ai/details/publication/pub.1022209586.
Article PubMed PubMed Central Google Scholar
Miricescu D, Balan Gabriela D, Tulin A, Stiru O, Vacaroiu Adela I, Mihai Andrada D, et al. PI3K/AKT/mTOR signalling pathway involvement in renal cell carcinoma pathogenesis (Review). Exp Ther Med. 2021;21(5):540. Available from: https://doi.org/10.3892/etm.2021.9972.
Article CAS PubMed PubMed Central Google Scholar
Qin A, Musket A, Musich PR, Schweitzer JB, Xie Q. Receptor tyrosine kinases as druggable targets in glioblastoma: Do signaling pathways matter? Neuro-Oncology Adv. 2021;3(1):vdab133. Available from: https://doi.org/10.1093/noajnl/vdab133.
Article Google Scholar
Alexandru O, Horescu C, Sevastre A-S, Cioc C, Baloi C, Oprita A, et al. Receptor tyrosine kinase targeting in glioblastoma: performance, limitations and future approaches. Contemp Oncol Onkol. 2020;24(1):55–66. Available from: https://doi.org/10.5114/wo.2020.94726.
Article CAS Google Scholar
McFarland JM, Paolella BR, Warren A, Geiger-Schuller K, Shibue T, Rothberg M, et al. Multiplexed single-cell transcriptional response profiling to define cancer vulnerabilities and therapeutic mechanism of action. Nat Commun. 2020;11(1):4296. Available from: https://doi.org/10.1038/s41467-020-17440-w.
Article CAS PubMed PubMed Central ADS Google Scholar
Vassilev LT, Vu BT, Graves B, Carvajal D, Podlaski F, Filipovic Z, et al. In vivo activation of the p53 pathway by small-molecule antagonists of MDM2. Science (80-). 2004;303(5659):844–8. Available from: https://doi.org/10.1126/science.1092472.
Article CAS ADS Google Scholar
Chia S, Low J-L, Zhang X, Kwang X-L, Chong F-T, Sharma A, et al. Phenotype-driven precision oncology as a guide for clinical decisions one patient at a time. Nat Commun. 2017;8(1):435. Available from: https://doi.org/10.1038/s41467-017-00451-5.
Article CAS PubMed PubMed Central ADS Google Scholar

Download references

Acknowledgements

We express our gratitude to Dr. Cathal Wilson for meticulously proofreading our manuscript. His invaluable efforts have significantly contributed to enhancing the clarity and accuracy of our work.

Funding

This work was supported by the My First AIRC grant 23162.

Author information

Simona Pellecchia and Gaetano Viscido contributed equally to the work.

Authors and Affiliations

Telethon Institute of Genetics and Medicine, Naples, Italy
Simona Pellecchia, Gaetano Viscido, Melania Franchini & Gennaro Gambardella
Genomics and Experimental Medicine Program, Scuola Superiore Meridionale, Naples, Italy
Simona Pellecchia
Department of Chemical, Materials and Industrial Engineering, University of Naples Federico II, Naples, Italy
Gaetano Viscido
Department of Electrical Engineering and Information Technology, University of Naples Federico II, Naples, Italy
Melania Franchini

Authors

Simona Pellecchia
View author publications
You can also search for this author in PubMed Google Scholar
Gaetano Viscido
View author publications
You can also search for this author in PubMed Google Scholar
Melania Franchini
View author publications
You can also search for this author in PubMed Google Scholar
Gennaro Gambardella
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

SP and GV performed drug combination experiments. MF performed the comparison among the different single-cell drug prediction tools. GG supervised the work, performed part of the bioinformatics analysis, wrote the manuscript, and conceived the original idea. All authors read and approved the final manuscript.

Authors’ Twitter handles

@GennGambard.

Corresponding author

Correspondence to Gennaro Gambardella.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Figure S1.

GPDS consistency across viability datasets. Figure S2. DREEP’s ROC AUC distribution across cancer types. Figure S3. DREEP Method Performance on Pan-Cancer Dataset using IC50-Derived GPDS. Figure S4. Drug prediction performance using absolute expression to extract relevant genes of a cell. Figure S5. Performance comparison between DREEP and other single cell drug prediction tool. Figure S6. DREEP's predictions for McFarland et al. Dataset.

Additional file 2: Table S1.

Accurately and erroneously predicted drugs. Table S2. MoA enrichment analysis of accurately and erroneously predicted drugs. Table S3. Cell-lines therapeutic clusters. Table S4. MoA enrichment analysis of drug clusters. Table S5. Differential Drug Analysis of MDAMB468 cells. Table S6. Drug Combination results. Table S7. Cell Viability of patient-derived cultures (PDCs). Table S8. AUC of ROC and PPV curves for the comparison of DREEP vs BeyonCell (Figure S 5D-E). Table S9. AUC of ROC and PPV curves for the comparison of DREEP vs scDEAL (Figure S 5B-C). Table S10. AUC of ROC and PPV curves for the comparison of DREEP vs scDRUG (Figure S 5F-G).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Pellecchia, S., Viscido, G., Franchini, M. et al. Predicting drug response from single-cell expression profiles of tumours. BMC Med 21, 476 (2023). https://doi.org/10.1186/s12916-023-03182-1

Download citation

Received: 06 June 2023
Accepted: 20 November 2023
Published: 01 December 2023
DOI: https://doi.org/10.1186/s12916-023-03182-1

Predicting drug response from single-cell expression profiles of tumours

Abstract

Background

Results

Conclusions

Similar content being viewed by others

Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data

Predicting heterogeneity in clone-specific therapeutic vulnerabilities using single-cell transcriptomic signatures

Predicting tumor response to drugs based on gene-expression biomarkers of sensitivity learned from cancer cell lines

Background

Methods

DREEP implementation

Bulk RNA-seq dataset processing

Construction of GPDS ranked lists

Estimation of DREEP performances

Cell embedding from DREEP predictions

scRNA-seq datasets processing

Drug sensitivity dataset processing

The therapeutic landscape of the cell line pan-cancer dataset

Differential drug sensitivity analysis

Drug combinations assay

Drug Set Enrichment Analysis (DSEA)

Comparison with other state-of-the-art methods

PDC drug response data

Results

Drug Response Estimation from single-cell Expression Profiles

Estimation of the predictive performance of DREEP

The DREEP method reconstructs the therapeutic landscape of the pan-cancer dataset

The DREEP method recapitulates cell population drug response over time

The DREEP method identifies small molecules that selectively inhibit cancer cell subpopulations

The DREEP method accurately predicts drug response in heterogeneous cancer patient cells

Discussion

Conclusions

Availability and requirements

Availability of data and materials

Change history

16 February 2024

Abbreviations

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Authors’ Twitter handles

Corresponding author

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher’s Note

Supplementary Information

Additional file 1: Figure S1.

Additional file 2: Table S1.

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation