Circulating Tumor Cell Transcriptomics as Biopsy Surrogates in Metastatic Breast Cancer

Background. Metastatic breast cancer (MBC) and the circulating tumor cells (CTCs) leading to macrometastases are inherently different from primary breast cancer. We evaluated whether whole transcriptome RNA-Seq of CTCs isolated via an epitope-independent approach may serve as a surrogate for biopsies of macrometastases.
Methods. We performed RNA-Seq on fresh metastatic tumor biopsies, CTCs, and peripheral blood (PB) from 19 newly diagnosed MBC patients. CTCs were harvested using the ANGLE Parsortix microfluidics system to isolate cells based on size and deformability, independent of a priori knowledge of cell surface marker expression.
Results. Gene expression separated CTCs, metastatic biopsies, and PB into distinct groups despite heterogeneity between patients and sample types. CTCs showed higher expression of immune oncology targets compared with corresponding metastases and PB. Predictive biomarker (n = 64) expression was highly concordant for CTCs and metastases. Repeat observation data post-treatment demonstrated changes in the activation of different biological pathways. Somatic single nucleotide variant analysis showed increasing mutational complexity over time.
Conclusion. We demonstrate that RNA-Seq of CTCs could serve as a surrogate biomarker for breast cancer macrometastasis and yield clinically relevant insights into disease biology and clinically actionable targets.
Supplementary Information. The online version contains supplementary material available at 10.1245/s10434-021-11135-2.

Metastatic breast cancer (MBC) is responsible for virtually all BC deaths. MBCs are often discordant in biomarker profiles when compared with the primary tumor. 1 The American Society of Clinical Oncology guidelines call for biopsies of metastases for biomarker testing to guide decision making for systemic therapy. 2,3 However, not all metastatic sites are amenable to safe percutaneous biopsy. Improved survival in MBC is in large part due to the availability of targeted therapies. 4,5 Most patients with MBC unfortunately develop treatment resistance. 6 Although molecular profiling of tumors may predict targeted therapy opportunities, approaches driven by the primary tumor or a single metastatic biopsy site may not represent the multiple non-overlapping oncogenic alterations driving biology in patients with multiple metastatic sites. 7 Circulating tumor cells (CTCs) hold significant potential as liquid biopsies obtained via minimally invasive blood draws for the real-time assessment of a patient's tumor biology and heterogeneity. 8,9 CTCs have been shown to be prognostic in MBC and are present in 52-71% of MBC patients, 10 but have not provided predictive insights for targeted therapy. Hence, CTCs have not been used extensively to guide therapy decisions. A potential issue is the selection of cell populations based on cell surface marker expression, as in the CellSearch system, the only FDA-approved method. 11 Another issue is that mere enumeration of CTCs and technological limitations have hampered the capability to interrogate CTC biology and gain insights into potentially targetable lesions. 8 Many sequencing approaches available now, including those with clinical application, focus on DNA sequencing, but there are major concerns that not all DNA mutations are expressed. 12 Refinements in RNA-Seq technology now enable detailed molecular profiling of CTCs 13 beyond gene expression, offering the potential for predicting treatment options via a liquid biopsy.
We hypothesized that molecular characterization via whole transcriptome RNA sequencing of CTCs isolated in an unbiased, marker independent fashion can capture disease heterogeneity of MBC and may serve as a surrogate for the analysis of macrometastases to identify predictive biomarkers, potentially leading to new target discovery and explaining treatment resistance.

Study Design and Patient Population
The project was designed as an observational study to evaluate whether RNA-sequencing (RNA-Seq) of CTCs can identify potential treatment targets. A total of 21 treatment naïve female MBC patients were prospectively enrolled at the Keck Medical Center and Norris Comprehensive Cancer Center at the University of Southern California (USC). Each patient underwent biopsies of macrometastases for clinical diagnostic purposes collected at baseline (prior to therapy for MBC) or upon disease progression prior to switching therapy. A baseline PB draw of 7.5 ml in an EDTA tube for CTC RNA-Seq was required for inclusion. Data from 19 patient samples passing quality criteria were included in further analysis. Four of the 19 patients with progressive disease returned for repeat PB draws after approximately 6 months of treatment to track the changes in CTC biology over time. Response to therapy was assessed based on RECIST criteria. 14 All procedures, including written patient informed consent, were approved by the Institutional Review Board (IRB HS-14-00595 and HS-11-00208) at USC. This study was compliant with the REMARK criteria. 15

Molecular Marker-Independent CTC Isolation
The Parsortix microfluidics filtration system (ANGLE plc, Surrey, United Kingdom) efficiently captures and highly enriches CTCs in a cell surface marker independent manner based on size and deformability, [16][17][18][19] reducing the number of contaminating white blood cells (WBCs) by roughly 5 orders of magnitude. The device has a Diagnostic Devices Directive CE Mark for clinical use in Europe. We have previously validated the capture efficiency of the device in our lab using breast cancer cells spiked into peripheral blood samples (Supplementary Fig. S1). A capture cassette with a critical gap of 10 microns was used to enrich CTCs. 19 Cell pellets were resuspended in 10 µl of lysis buffer (NuGEN Technologies, Inc., San Carlos, CA) and stored at -80°C for further use. Rigorous device cleaning was performed between samples. This cell surface marker independent approach allowed for the capture of heterogeneous CTC populations, including EpCAM negative cells and clusters of CTCs. 17 As processing time is critical to maximize capture efficiency, 20 total time from blood draw to CTC harvest did not exceed 2 h. As negative controls, phosphate buffered saline (PBS) samples and PB from 5 healthy female donors were processed.

Sample Preparation and Whole Transcriptome RNA-Seq and Sanger Sequencing
Either 50 ng of RNA from a metastasis or PB, isolated with a TRIzol or RiboPure kit (both Thermo Fisher Scientific, Waltham, MA), respectively, or 2 µl of CTC lysate, were used to create cDNA for sequencing library preparation using the Ovation RNA-Seq System V2 and Ovation Ultralow Library System V2 (NuGEN Technologies, San Francisco, CA). Details regarding isolation and preparation of RNA can be found in Supplementary File S1. Sequencing was done on an Illumina HiSeq 2500 (Illumina, San Diego, CA) performing 100 base pair paired-end RNA-Seq using five samples per lane. Sanger sequencing was performed by Genewiz (South Plainfield, NJ, USA) and the sequencing data were analyzed manually with 4Peaks (Nucleobytes, Aalsmeer, Netherlands) (Supplementary File S1, Supplementary Table S1). RNA-Seq data quality control and mapping were performed as previously described 21 (Supplementary File S1). For somatic SNV (single nucleotide variant) calling, the FASTQ files were processed following the Best Practices Workflow for variant calling with RNA-Seq from the Broad Institute (Supplementary File S1). The COSMIC database and 184 known driver genes in BC from the Integrative Onco Genomics database (http://www.intogen.org/mutations/) 22 were investigated for known SNVs in our data set. The driver gene analysis was done using Maftools. 23 Based on an extensive literature search, 24 we curated a list of 64 BC related genes with clinical and preclinical therapeutic, prognostic, or diagnostic implications (Supplementary Table S2), representing breast cancer relevant pathways (EGFR/RAF/MEK, IGF-1/PI3K/AKT/mTOR, WNT/NOTCH/Hedgehog/FGF/MET, DNA damage repair, cell cycle, hormone receptor signaling, tumor suppressors, and tumor immunology). The FASTQ files, as well as the corresponding read count files for each sample, were deposited in the Gene Expression Omnibus database (GSE113890).

Statistical Analysis
Statistical analyses were conducted using GraphPad Prism (San Diego, CA, USA). For differences in gene expression, two-way ANOVA was used. For SNV comparison, the Wilcoxon signed-rank test and Friedman test were used. The number of uniquely mapped reads was compared using Kruskal-Wallis and Dunn's multiple comparison tests.
We obtained an average coverage higher than 50X for all but five of the samples, and coverage greater than 100X in 86% of the samples. The negative controls (PBS samples processed by Parsortix) yielded virtually no read counts (Supplementary Table S3).
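The comparison of uniquely mapped reads across the three sample types (CTCs, metastases, PB) named under Statistical Analysis can be illustrated with a minimal, standard-library sketch of the Kruskal-Wallis test. This is not the authors' workflow, which used GraphPad Prism. The function names and the read counts are hypothetical, the p-value uses the chi-square approximation for exactly three groups (2 degrees of freedom), and Dunn's post hoc comparisons are not implemented.

```python
import math
from itertools import chain


def midranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def kruskal_wallis_3(a, b, c):
    """Tie-corrected Kruskal-Wallis H and approximate p-value for three groups."""
    groups = [a, b, c]
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = midranks(pooled)

    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction > 0:
        h /= correction

    # With 2 degrees of freedom the chi-square survival function is exp(-h / 2).
    return h, math.exp(-h / 2)


# Hypothetical uniquely mapped read counts (millions), for illustration only.
ctc_reads = [18.2, 22.5, 15.9, 20.1]
metastasis_reads = [30.4, 28.7, 33.0, 29.5]
pb_reads = [27.1, 31.8, 26.4, 30.2]
h_stat, p_value = kruskal_wallis_3(ctc_reads, metastasis_reads, pb_reads)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

For small groups an exact permutation p-value would be preferable to the chi-square approximation used here.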

Patient Characteristic and Grouped Gene Expression Analysis Separates Sample Type
Principal component analysis (PCA) showed separation of the majority of CTCs versus metastases and PB in PC1, and separation of CTCs and metastases from PB in PC2 (Fig. 1A). A Venn diagram is shown in Fig. 1B Fig. S2). In summary, these results show that RNA-Seq can detect distinct gene expression features in enriched CTCs compared with metastases and PB.

Gene Expression of Potentially Clinically Actionable Genes Relevant to Breast Cancer
CTCs showed many more differentially expressed immune oncology target genes (Oncomine Immune Response Assay) compared with peripheral blood than did metastatic biopsies (overexpression: CTCs 131 vs metastases 15, 8.7-fold difference; downregulation: CTCs 38 vs metastases 37, 1.03-fold difference). A total of 12 overexpressed and 15 downregulated genes compared with PB were in common between CTCs and metastases (Fig. 2A, Supplementary Table S4). Notably, PD-L1 expression was significantly lower in both CTCs and metastases compared with PB (CTCs versus PB p = 3.5 × 10⁻⁵, CTCs vs metastases p = 0.004, and metastases versus PB p = 0.004) (Supplementary Fig. S3).
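The fold differences quoted above follow directly from the reported gene counts. As a small check, assuming the fold difference is simply the ratio of the CTC count to the metastasis count (the helper name is hypothetical):

```python
# Counts of differentially expressed immune oncology target genes versus PB,
# as reported in the text.
overexpressed = {"CTCs": 131, "metastases": 15}
downregulated = {"CTCs": 38, "metastases": 37}


def fold_difference(counts):
    """Ratio of the CTC gene count to the metastasis gene count."""
    return counts["CTCs"] / counts["metastases"]


print(f"overexpressed: {fold_difference(overexpressed):.1f}-fold")
print(f"downregulated: {fold_difference(downregulated):.2f}-fold")
```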
We found concordant expression of 50/64 (78%) potentially clinically actionable target genes in CTCs and corresponding metastases (Fig. 3A). No genes were uniformly overexpressed or downregulated in all sample groups (i.e., CTCs or metastases). Only 3/64 (4.7%) genes showed statistically significantly discordant expression in CTCs vs metastases (AKT3 p = 0.018, CCND1 p = 0.025, FOXA1 p = 0.034) (Fig. 3A). Figure 3B shows representative patient samples (n = 3) for the expression of all 64 clinically actionable target genes with related clinical trials as well as targeted therapeutics. The majority of CTC and

Sequential Analysis of CTC Samples
We tracked four patients with repeated harvest of CTCs at a second time point, as well as obtaining the imaging studies and systemic therapies these patients received (average time between first and second CTC harvest time point was 4 ± 0.8 months). Representative results for two patients are shown in Fig. 4 (the remaining data can be found in Supplementary Fig. S6). The metastatic sites profiled for these patients were pleural effusions (ER/PR+, HER2-) (patient 1 in Table 1

DISCUSSION
We present the gene expression profiling of enriched CTCs from MBC patients with comparison to metastases and PB, all acquired prior to treatment or at disease progression prior to a new line of therapy. PCA showed that all sample groups (CTCs, metastases, and PB) separated in PC1, with most CTCs and metastases clustering together in PC2. The partial overlap with PB might be explained by findings that CTCs frequently associate with WBCs, in particular neutrophils. 25 Our lab has also previously shown that even ultra-pure CTC populations express WBC genes. 21 Both metastatic and CTC samples (expected 4-10 background leukocytes per CTC after Parsortix enrichment) likely also contain WBCs.
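To put the stated leukocyte background in perspective, the expected tumor cell fraction of an enriched sample can be estimated from the leukocyte-to-CTC ratio. This is a back-of-the-envelope sketch that counts cells only and ignores differences in RNA content per cell; the function name is hypothetical.

```python
def tumor_cell_fraction(leukocytes_per_ctc):
    """Fraction of cells that are CTCs, given background leukocytes per CTC."""
    return 1.0 / (1.0 + leukocytes_per_ctc)


# The text quotes an expected 4-10 background leukocytes per CTC.
for ratio in (4, 10):
    print(f"{ratio} leukocytes per CTC: {tumor_cell_fraction(ratio):.1%} tumor cells")
```

Under these assumptions roughly 9-20% of the cells in an enriched sample would be tumor cells, which is consistent with WBC-derived transcripts remaining detectable.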
Several genes with biological and clinical implications for BC were highly expressed in CTCs. GPRC5D has been previously associated with tamoxifen resistance. 26 TMEM198 promotes LRP6 phosphorylation in activating Wnt signaling, 27 which has been associated with CSC biology in BC. The apoptosis inhibitor ARC has been associated with chemotherapy resistance, tumorigenesis, and metastasis in the polyoma middle T-antigen (PyMT) transgenic mouse model of BC. 28 It has also been shown to lead to TP53 inactivation in TP53 WT malignancies. 29 LOC727993, a non-coding antisense RNA of the gene known as PDYN-AS1, and RNU6ATAC, a small nuclear RNA associated with U12-dependent splicing, have not previously been demonstrated to be involved with tumor biology. Our approach identified both known and novel genes associated with CTCs, suggesting that CTCs might be suitable as a discovery tool to better understand the fundamental tumor biology of metastasis.
We found highly concordant expression in potentially clinically actionable genes in corresponding CTC and metastatic samples, demonstrating the potential clinical relevance of CTCs as predictive biomarkers in BC. Nevertheless, we also observed discordant results, which could be due to various conditions: (1) a heterogeneous origin of CTCs from various metastatic sites or seeding of CTCs from the primary tumor site, (2) changes in transcriptional programs once cells "settle" in a new environment, influenced by tissue or site-specific micro-environmental cues, 30 or (3) differences in timing between the establishment of the metastasis (more remote) and the seeding of CTCs (which may be more reflective of recent genomic alterations and treatments).
Longitudinal analysis of four patients with serial CTC assessments showed changes in biological pathway activation during treatment and disease progression. We found markedly increased genetic complexity in 3 out of 4 patients over time. These results indicate that serial CTC harvest might capture changes in additional mutation burden as a cancer evolves, particularly under the selection pressure of anti-cancer therapies. 31 Periodic surveying of the mutational evolution using CTCs could thus impact clinical decision making. Additionally, CTCs might capture the mutational landscape of a patient's cancer from different metastatic sites more comprehensively than single site biopsies. 32 Compared with gene expression of potentially actionable genes, we observed a much lower concordance in our SNV analysis. As there is no standard tool or pipeline for using RNA-Seq to call SNVs, we established a workflow for the purpose. SNV calling decreases with low read depth or low allelic frequency, diminishing the sensitivity of SNV detection. 33 The lack of overlap and greater genomic complexity in CTCs could also represent the pool of heterogeneous somatic mutations from various metastatic sites and different cancer cell clones compared with individual metastatic sites. The strength of our approach is the inference of expressed mutations, given that not all DNA mutations are expressed. The finding that lncRNAs were frequently mutated in CTCs and metastases offers an interesting opportunity to further investigate the role of regulatory RNAs in metastasis. 34 As this aspect of our paper is the most speculative piece of the manuscript, we believe that better tools are needed for more sensitive SNV calling from complex samples such as circulating tumor cells. Single cell sequencing studies may shed light on this by controlling for the input of cancer vs peripheral blood mononuclear cells.
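The dependence of SNV detection on read depth and allelic frequency noted above can be illustrated with a simple binomial model. This is a minimal sketch, not the authors' calling workflow: it assumes each read independently carries the variant with probability equal to the allele fraction, ignores sequencing error and allele-specific expression, and uses a hypothetical threshold of three supporting reads.

```python
from math import comb


def detection_probability(depth, allele_fraction, min_alt_reads):
    """P(at least min_alt_reads variant reads) under Binomial(depth, allele_fraction)."""
    p_below_threshold = sum(
        comb(depth, k) * allele_fraction ** k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below_threshold


# Hypothetical depths and allele fractions, for illustration only.
for depth in (10, 50, 100):
    for allele_fraction in (0.05, 0.25):
        prob = detection_probability(depth, allele_fraction, min_alt_reads=3)
        print(f"depth {depth:>3}, allele fraction {allele_fraction:.2f}: {prob:.3f}")
```

In this model the detection probability rises with both depth and allele fraction, so variants present in a minority of cells in a leukocyte-contaminated sample are the ones most likely to be missed at low coverage.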
Analysis of the expression of IO target genes demonstrated that CTCs expressed 170/200 genes related to immune response while metastatic biopsies expressed only 52/200 such genes, suggesting an important role for the immune system in CTC biology and potential immune escape. During dissemination, CTCs are exposed to many types of stress in the blood microenvironment and to direct immune surveillance. Our results are in line with previous studies demonstrating upregulation of potential immune-escape mechanisms. 35 Several highly expressed IO genes in CTCs in our study have been shown to play important roles in immune evasion and metastatic efficiency: AKT1 can potentially suppress immunodetection by activating myeloid suppressor cells. 36 The complement component C1q might facilitate the metastatic potential of CTCs. 37 CXCL9-11 might act as a double-edged sword via paracrine and autocrine signaling or interaction with PD-L1, inhibiting or facilitating immune escape and metastatic seeding, respectively. 38 We found expression of PD-L1 to be lower in CTCs and metastases compared with PB samples in our study. Although this marker has been suggested as a potential biomarker in CTCs, 39 our study differs regarding the detection method and CTC capture platform with consideration of gene expression relative to background PB. High gene expression of PD-L1 in PBMCs is to be expected; the Human Protein Atlas shows high expression of CD274 (PD-L1) in basophils. 40 These findings have implications for the use of immune targeted drugs and warrant further investigation into immune targeting of CTCs. 41
There are several limitations of our study, such as a relatively small number of patients and CTC enrichment purity.
We applied per-patient normalization, utilizing matched white blood cells, as a pre-specified analysis plan regarding our primary research question of comparing the gene expression of CTCs vs metastases for a list of well characterized, potentially clinically actionable marker genes. Our strategy of per-patient normalization to PB signal is novel, focusing attention on genes with strong differential expression between tumor and blood by controlling for leukocyte background. 21,42 Thus, subtle differences in gene expression might not be captured with our method. Ideally, sequencing of pure cell populations, even at the single cell level, should be attempted to characterize differences in gene expression between CTCs and WBCs more stringently. 43 For gene expression results, standard normalization (reported as reads per kilobase of transcript, per million mapped reads (RPKM)) was applied but, due to the nature of our approach, we cannot rule out a certain degree of amplification bias. However, we previously utilized unspiked negative controls and extensively validated our RNA amplification strategy. 21,44 Although we successfully detected SNVs, our current data analysis pipeline does not allow for the detection of copy number variation. SNV calling from RNA-Seq is less well established compared with the DNA based method, and further validation will be needed in the future.
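The two normalization steps discussed above can be sketched as follows. The RPKM function follows the standard definition. The per-patient step is only an assumption: the text does not give the exact formula, so a log2 ratio against the matched PB sample with a hypothetical pseudocount is used here for illustration, and all function names and numbers are made up.

```python
import math


def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count / (transcript_length_bp / 1e3) / (total_mapped_reads / 1e6)


def log2_ratio_vs_pb(sample_rpkm, pb_rpkm, pseudocount=1.0):
    """log2 ratio of a CTC or metastasis RPKM to the matched PB RPKM.

    The pseudocount (an assumption) avoids division by zero for genes
    that are not detected in PB.
    """
    return math.log2((sample_rpkm + pseudocount) / (pb_rpkm + pseudocount))


# Hypothetical example: one gene in one patient's CTC and matched PB samples.
ctc_value = rpkm(read_count=1200, transcript_length_bp=2000, total_mapped_reads=20_000_000)
pb_value = rpkm(read_count=150, transcript_length_bp=2000, total_mapped_reads=25_000_000)
print(f"CTC RPKM = {ctc_value:.1f}, PB RPKM = {pb_value:.1f}")
print(f"log2 ratio vs PB = {log2_ratio_vs_pb(ctc_value, pb_value):.2f}")
```

A ratio of this kind emphasizes genes that differ strongly between tumor-derived samples and blood, which matches the stated trade-off that subtle expression differences may go undetected.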

CONCLUSIONS
RNA-Seq of Parsortix-enriched CTCs could lead to minimally invasive, real-time diagnostic strategies for precision therapeutic decision making for MBC patients. Our approach could serve as a surrogate liquid biopsy for potentially clinically actionable drug target gene expression and mutations, allowing longitudinal assessment of the evolution of a patient's cancer.
FUNDING Funding was provided by ANGLE plc (personal grant to J.E.L.). The project described was supported in part by Award Number P30CA014089 from the National Cancer Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.
DISCLOSURES The authors declare no competing interests for the presented study. Although ANGLE plc provided research funds to the authors' institution, the authors have no relevant financial disclosures. The study sponsor did not influence data acquisition, analysis or the decision to publish.
OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.