Patient Characteristics and Grouped Gene Expression Analysis Separates Sample Type
Table 1 lists all 19 patients with clinical annotations, including site of metastasis, biomarker expression [ER, progesterone receptor (PR), and HER2], and treatments received (two patient samples were excluded due to low read counts). The number of uniquely mapped reads was comparable across sample groups (median uniquely mapped reads: CTCs 30,962,730 ± 19,535,767; metastases 40,580,689 ± 14,659,648; PB 43,727,727 ± 22,600,318; CTCs vs PB p = 0.094; CTCs vs metastases p = 0.094; PB vs metastases p > 0.99; median coverage: CTCs 104X ± 64; metastases 139X ± 64; PB 153X ± 93; CTCs vs PB p = 0.058; CTCs vs metastases p = 0.25; PB vs metastases p = 0.78) (Supplementary Table S3). We obtained an average coverage higher than 50X for all but five of the samples, and coverage greater than 100X in 86% of the samples. The negative controls (PBS samples processed by Parsortix) yielded virtually no read counts (Supplementary Table S3).
Table 1. Clinical annotations of all patients.
Principal component analysis (PCA) separated the majority of CTCs from metastases and PB along PC1, and separated CTCs and metastases from PB along PC2 (Fig. 1A). A Venn diagram in Fig. 1B shows the intergroup comparison of gene expression in CTCs, metastases, and PB. A pairwise grouped comparison was used to identify the five most up- and downregulated genes with an expression change of at least 2-fold and an adjusted p-value < 0.05: CTCs vs PB (downregulated: YTHDC1, CREG1, CLK2, ADIPOR1, RN7SL2; upregulated: GPRC5D, LINC01376, LOC727993, TAS1R3, ARC); CTCs + metastases vs PB (downregulated: SNAP23, GALNS, SELENOS, RN7SL2, MORN1; upregulated: MGP, GDF9, MYH11, AZGP1, LUM); CTCs vs metastases (downregulated: ABCF1, TJP1, DLG5, PTPRK, H19; upregulated: GPRC5D, TMEM198, LOC727993, ARC, RNU6ATAC) (Supplementary Fig. S2). In summary, these results show that RNA-Seq can detect distinct gene expression features in enriched CTCs compared with metastases and PB.
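The selection rule described above (at least a 2-fold change, adjusted p-value < 0.05, then the five strongest genes in each direction) can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the `DEResult` record, the `top_regulated` function, and the `GENE_*` entries are hypothetical placeholders, and the input is assumed to be log2 fold changes with already adjusted p-values from an upstream differential expression tool.

```python
import math
from typing import Iterable, List, NamedTuple, Tuple


class DEResult(NamedTuple):
    """One gene from a pairwise comparison (hypothetical record layout)."""
    gene: str
    log2_fold_change: float
    adjusted_p: float


def top_regulated(
    results: Iterable[DEResult],
    n: int = 5,
    min_fold: float = 2.0,
    alpha: float = 0.05,
) -> Tuple[List[str], List[str]]:
    """Return the n most up- and n most downregulated gene names.

    A gene qualifies if its absolute change is at least `min_fold`
    and its adjusted p-value is below `alpha`.
    """
    min_log2 = math.log2(min_fold)  # a 2-fold change is 1.0 on the log2 scale
    significant = [
        r for r in results
        if r.adjusted_p < alpha and abs(r.log2_fold_change) >= min_log2
    ]
    # Strongest upregulation first
    up = sorted(
        (r for r in significant if r.log2_fold_change > 0),
        key=lambda r: r.log2_fold_change,
        reverse=True,
    )[:n]
    # Strongest downregulation (most negative) first
    down = sorted(
        (r for r in significant if r.log2_fold_change < 0),
        key=lambda r: r.log2_fold_change,
    )[:n]
    return [r.gene for r in up], [r.gene for r in down]


# Illustrative placeholder values only, not data from the study
example = [
    DEResult("GENE_A", 3.2, 0.001),
    DEResult("GENE_B", -2.5, 0.010),
    DEResult("GENE_C", 0.4, 0.0001),  # significant, but below 2-fold
    DEResult("GENE_D", 4.0, 0.200),   # large change, but not significant
    DEResult("GENE_E", 1.0, 0.049),   # exactly 2-fold, passes both cut-offs
]
up_genes, down_genes = top_regulated(example)
```

Applying both cut-offs before ranking keeps genes with a large but statistically unsupported change (such as the placeholder `GENE_D`) out of the top-five lists.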
Gene Expression of Potentially Clinically Actionable Genes Relevant to Breast Cancer
Relative to peripheral blood, CTCs showed many more differentially expressed immuno-oncology target genes (Oncomine Immune Response Assay) than did metastatic biopsies (overexpression: CTCs 131 vs metastases 15, 8.7-fold difference; downregulation: CTCs 38 vs metastases 37, 1.03-fold difference). A total of 12 overexpressed and 15 downregulated genes compared with PB were common to CTCs and metastases (Fig. 2A, Supplementary Table S4). Notably, PD-L1 expression was significantly lower in both CTCs and metastases compared with PB (CTCs vs PB p = 3.5 × 10⁻⁵, CTCs vs metastases p = 0.004, and metastases vs PB p = 0.004) (Supplementary Fig. S3).
We found concordant expression of 50/64 (78%) potentially clinically actionable target genes in CTCs and corresponding metastases (Fig. 3A). No genes were uniformly overexpressed or downregulated in all sample groups (i.e., CTCs or metastases). Only 3/64 (4.7%) genes showed statistically significantly discordant expression in CTCs vs metastases (AKT3 p = 0.018, CCND1 p = 0.025, FOXA1 p = 0.034) (Fig. 3A). Figure 3B shows representative patient samples (n = 3) for the expression of all 64 clinically actionable target genes with related clinical trials as well as targeted therapeutics. The majority of CTC and metastasis samples showed overexpression of these targetable genes compared with PB, with few exceptions (i.e., lower or weak expression, < 2-fold) (Fig. 3B) (Supplementary Fig. S5, Supplementary Table S1). These results indicated that CTCs could potentially serve as a surrogate for distant macrometastases for the identification of druggable targets.
Sequential Analysis of CTC Samples
We tracked four patients with a repeated CTC harvest at a second time point and collected the imaging studies and systemic therapies these patients received (average time between the first and second CTC harvest was 4 ± 0.8 months). Representative results for two patients are shown in Fig. 4 (the remaining data can be found in Supplementary Fig. S6). The metastatic sites profiled for these two patients were pleural effusions: ER/PR+, HER2- for patient 1 in Table 1 (Fig. 4A and B) and ER+, PR/HER2- for patient 15 in Table 1 (Fig. 4C and D). For the first patient, hormone receptor genes were downregulated at the sequential CTC assessment after tamoxifen treatment (Fig. 4B) (fold change expression compared with PB: receptors: metastasis 1.29 ± 0.94, CTCs 2.56 ± 1.96, CTCs follow-up −0.2 ± 1.48; cell cycle: metastasis −0.71 ± 2.59, CTCs 0.83 ± 1.7, CTCs follow-up −2.50 ± 2.59; EGFR signaling: metastasis −0.73 ± 2.49, CTCs −0.43 ± 1.71, CTCs follow-up −2.39 ± 3.31). The difference between CTCs and follow-up CTCs was statistically significant for receptor expression (p = 0.014) and cell cycle gene expression (p = 0.0005). The follow-up CTC sample in the second patient showed upregulation of DNA-damage repair genes under treatment with the anthracycline doxorubicin (Fig. 4D) (fold change expression compared with PB: receptors: metastasis 3.90 ± 1.48, CTCs 3.54 ± 2.98, CTCs follow-up 3.68 ± 1.38; cell cycle: metastasis 2.24 ± 2.47, CTCs 3.27 ± 2.55, CTCs follow-up −0.59 ± 3.97; DNA damage repair: metastasis −0.33 ± 0.92, CTCs 0.39 ± 1.42, CTCs follow-up 0.84 ± 1.99). For both patients, the CTC follow-up samples showed a reduction in cell cycle gene expression. These data revealed an evolution of biological features within each patient’s disease under therapeutic pressure, with implications for clinical management.
Somatic Single Nucleotide Variant (SNV) Analysis
Across all samples, we detected SNVs in 1754 genes (1608 in CTCs, 212 in metastases). At the gene level, 65/212 (31%) of the genes mutated in the metastatic biopsies were also mutated in the CTCs (Fig. 5A, Supplementary Table S5). A total of 2258 somatic mutations were found across all samples, with CTCs showing a 9.4-fold higher number of SNVs than metastases (2041 in CTCs, 217 in metastases; mean ± SD: CTCs 93 ± 231 vs metastases 13 ± 21, p = 0.01) (Fig. 5B). We detected 344 variants (17%) across all CTCs and 42 variants (1.9%) across all metastases that corresponded to SNVs found in the COSMIC (Catalogue of Somatic Mutations in Cancer) database (Supplementary Table S6). Our data showed increasing genomic complexity, represented by a higher number of SNVs at the second time point, in 3/4 patients with follow-up (mean SNVs: metastasis 4 ± 1, CTCs 35 ± 20, CTCs follow-up 110 ± 119, CTCs vs CTCs follow-up p = 0.035) (Fig. 5C and D, Supplementary Fig. S7A–C). Sanger sequencing validated 6 of 10 selected RNA-Seq variants (60%) (Supplementary Fig. S7). We ranked the top 20 most frequently mutated genes across our samples: AHNAK, ALMS1, ANKRD12, ARID1A, ARHGAP35, BEST1, BPTF, CALM2, F5, HIVEP1, MACF1, MDN1, MKI67, MUC3A, MUC12, MUC16, SOS1, TET2, WIPF1, ZFHX4 (Fig. 5E, Supplementary Fig. S9, Supplementary Table S8) and validated these genes in publicly available data sets using cBioPortal, comparing metastatic (n = 396) vs non-metastatic (n = 5158) BC samples. Seventeen of those 20 genes were mutated in BC, with VAF varying from 0.2 to 9% (Fig. 5F). We found a significant difference in SNVs between metastatic and non-metastatic cases (p = 0.009) (mean metastatic vs non-metastatic 1.94 ± 1.84 vs 1.2 ± 2.35), with up to 10-fold differences for single genes (e.g., TET2). Among 184 putative BC driver genes identified in the IntOGen-mutations platform (22), we found SNVs in 44/184 (24%) in our RNA-Seq data set (Supplementary Table S9). Of these, 34/44 (77.3%) were present only in CTC samples, 4/44 (9.1%) only in metastatic samples, and 5/44 (11.4%) in both CTCs and metastases.
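The gene-level comparison between CTCs and metastases reported above is, in essence, a set overlap. A minimal sketch of that bookkeeping is given below, assuming each sample group has been reduced to a set of mutated gene symbols. The function name `overlap_summary` and the `GENE_*` entries are hypothetical placeholders; only the counts in the last two lines are taken from the text.

```python
from typing import Dict, Set


def overlap_summary(ctc_genes: Set[str], met_genes: Set[str]) -> Dict[str, float]:
    """Summarize how mutated genes are shared between CTCs and metastases."""
    shared = ctc_genes & met_genes
    return {
        "ctc_only": len(ctc_genes - met_genes),
        "met_only": len(met_genes - ctc_genes),
        "shared": len(shared),
        # Share of metastasis-mutated genes that were also found in CTCs
        "shared_fraction_of_met": len(shared) / len(met_genes) if met_genes else 0.0,
    }


# Toy gene sets with placeholder names, not data from the study
toy_ctc = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
toy_met = {"GENE_C", "GENE_D", "GENE_E"}
summary = overlap_summary(toy_ctc, toy_met)

# Ratios recomputed from the counts reported in the text
gene_overlap_pct = round(100 * 65 / 212)  # genes mutated in metastases and also in CTCs
snv_fold = round(2041 / 217, 1)           # SNVs in CTCs relative to metastases
```

With the reported counts, the two recomputed values come to 31 (%) and 9.4 (fold), matching the figures quoted in the text.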
In summary, our analysis demonstrated a higher number of heterogeneous somatic mutations in CTCs than in macrometastatic biopsies and an increase in the number of SNVs in CTCs over time, and it revealed that RNA-Seq of CTCs can detect driver gene mutations in MBC.