High chromosome instability identified by low-pass whole-genome sequencing assay is associated with TP53 copy loss and worse prognosis in BRCA1 germline mutation breast cancer

Background Although BRCA1 mutation is the most important susceptibility factor for breast cancer, its prognostic value is disputed. In this study, we used a novel method based on whole-genome analysis to evaluate chromosome instability (CIN) and examined the potential relationship between CIN and prognosis in breast cancer patients with germline BRCA1 mutations. Materials and methods Sanger sequencing or a 98-gene panel sequencing assay was used to screen for BRCA1 germline small mutations in 1151 breast cancer patients with high-risk factors. An MLPA assay was employed to screen for BRCA1 large genomic rearrangements in familial breast cancer patients who were negative for BRCA1 small mutations. Thirty-two samples with unique BRCA1 germline mutation patterns were further subjected to CIN evaluation by low-pass whole-genome sequencing (LPWGS). Results In total, 113 patients with germline BRCA1 mutations were identified in the cohort. CIN analysis by the LPWGS assay indicated that CIN was independent of the location and type of the BRCA1 mutation. Patients with high CIN status had shorter disease-free survival (DFS) (HR = 6.54, 95% CI 1.30–32.98, P = 0.034). TP53 copy loss was also characterized by the LPWGS assay. The rates of TP53 copy loss in the CIN-high and CIN-low groups were 85.71% (12/14) and 16.67% (3/18), respectively. Conclusion High CIN is a prognostic factor associated with shorter DFS and is independent of the germline BRCA1 mutation pattern. Higher CIN values were significantly correlated with TP53 copy loss in breast cancer patients with germline BRCA1 mutations. Our results reveal a reliable molecular parameter for identifying patients with poor prognosis among those with BRCA1-mutated breast cancer.


Introduction
BRCA1 is the most critical hereditary breast cancer susceptibility gene; it encodes a protein of the same name that functions in DNA repair processes during cellular stress. BRCA1 mutation can inactivate the homologous recombination repair (HRR) mechanism and induce chromosome instability (CIN). Although it has been widely studied, the clinical significance of germline BRCA1 mutation for breast cancer prognosis has not been clearly established, with numerous studies reporting inconsistent results [1][2][3][4][5]. CIN is an important driver of carcinogenesis because it can lead to dramatic chromosomal structural abnormalities in cancer [6,7]. CIN can be subdivided into two major categories: (1) gain and/or loss of whole chromosomes, termed numerical CIN or aneuploidy; and (2) regional alterations of chromosomes, termed structural CIN, which include amplifications, deletions, inversions, and translocations of DNA fragments [8]. In breast cancer, numerous studies have shown that CIN can predict biological behavior and clinical outcomes [9][10][11]. These findings point to the potential value of CIN in determining the prognosis of patients with germline BRCA1 mutations, which has not yet been investigated in detail.
At present, CIN evaluation in breast cancer is limited by methodology, as no standardized protocol is available. Although many technologies have been introduced, the most popular approach to evaluating CIN remains in situ hybridization, which reflects only a restricted part of the global genomic disorder [12][13][14][15].
In recent years, the development and popularization of next-generation sequencing (NGS) have provided a novel solution for global CIN assessment: low-pass whole-genome sequencing (LPWGS), which depicts global genomic alterations rather than the single chromosome segment evaluated by in situ hybridization.
Our current investigation explored the potential prognostic value of CIN in breast cancer patients with germline BRCA1 mutation as well as the possible molecular mechanism. We also established a scoring system to quantify the CIN status based on the LPWGS method.

Study population and samples
We reviewed the clinical data of patients who were admitted to the Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital) from 2008 to date. Patients with a pathological diagnosis of breast cancer and any one of the following risk factors were enrolled: (1) early-onset breast cancer with age ≤ 40 years; (2) TNBC with age ≤ 50 years; (3) bilateral breast lesions or ≥ 2 ipsilateral breast lesions; (4) male breast cancer; (5) a family history of either ≥ 1 close relative with breast cancer (age of onset ≤ 50 years), ≥ 1 close relative with ovarian cancer (any age of onset), or ≥ 2 close relatives with breast cancer, prostate cancer (Gleason score ≥ 7 or with terminal metastasis), or pancreatic cancer at any age of onset. Peripheral blood samples were collected from all participants in EDTA tubes and stored at −80 °C. Their tissue blocks were collected for further DNA extraction after review of the HE slides.
The study was approved by the Ethics Committee of the Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital). Written informed consent was obtained from all recruited subjects.

Clinical and pathological data collection
The clinicopathological information of all enrolled breast cancer patients was collected, including age at diagnosis, histologic type and grade, molecular type, and ER, PR, and HER2 status. ER, PR, and HER2 expression was determined by immunohistochemistry (IHC). Cases with a HER2 score of "2+" were further confirmed by fluorescence in situ hybridization (FISH). Follow-up information was collected either from the medical records of regular follow-up examinations or by telephone, and included the time of follow-up, post-surgery treatment, time of recurrence and metastasis, metastatic status, and date and cause of death. Disease-free survival (DFS) was analyzed.

BRCA1 germline mutations analysis
DNA samples were prepared from the peripheral blood samples. Mutation analysis was performed in three phases. First, Sanger sequencing was performed on 133 unrelated familial breast and/or ovarian cancer (FBOC) cases using a total of 32 primer pairs covering the entire coding regions and exon-intron boundaries of BRCA1. Second, a 98-gene panel sequencing assay was employed to screen for mutations in breast cancer susceptibility genes, including BRCA1, in 1018 cases. The NGS panel was based on the NEBNext Direct sequencing technology (New England Biolabs, Ipswich, MA). Third, if FBOC cases were negative for BRCA1 small mutations by the sequencing assays, large genomic rearrangements (LGRs) of BRCA1 were screened by multiplex ligation-dependent probe amplification (MLPA) using the SALSA P002 kit (MRC-Holland, Amsterdam, the Netherlands). Variant classification was performed according to the American College of Medical Genetics (ACMG) guidelines.

Tumor samples analysis by LPWGS assay
DNA was extracted from FFPE breast cancer tissues of patients with BRCA1 germline pathogenic or likely pathogenic mutations and sequenced on an Illumina X10 platform. At least 10 M paired reads were collected for each sample. The reads were mapped to the human reference genome hg19. Genomic coverage was then counted using samtools mpileup. Next, the average coverage was calculated for each 200-kb bin. The coverage of each bin was then normalized to a Z-score using the following formula:
$$\mathrm{coverage}_{\mathrm{normalized}} = \frac{\mathrm{coverage}_{\mathrm{raw}} - \operatorname{mean}(\mathrm{coverage}_{\mathrm{controls,\,raw}})}{\operatorname{stdev}(\mathrm{coverage}_{\mathrm{controls,\,raw}})} \tag{1}$$
The circular binary segmentation (CBS) algorithm from the R package DNAcopy was then used to identify significant genomic breakpoints and genomic segments with copy number changes.
The R package DNAcopy was used to analyze copy number changes. A P value < 0.05 was considered statistically significant for binary segmentation. The absolute segment value was used for further analysis. The sensitivity and specificity of UCAD were estimated by receiver operating characteristic (ROC) curves. For categorical variables, the chi-square test was used as appropriate.
Proportion trend tests were used to analyze the associations between UCAD screening positivity and clinicopathological parameters. Data are reported as means and standard deviations, medians and interquartile ranges, and hazard ratios or odds ratios with 95% confidence intervals, as appropriate. Missing data were removed from the analyses. All analyses were performed with the use of R software, version 3.4.3 (R Foundation for Statistical Computing).
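The bin-level normalization of formula (1) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `normalize_bins`, the toy coverage values, and the choice of the sample standard deviation computed per bin across control samples are all assumptions made for illustration.

```python
from statistics import mean, stdev

def normalize_bins(raw, controls):
    """Return per-bin Z-scores of one sample against a set of controls.

    raw:      per-bin average coverage of the tumor sample
    controls: list of control samples, each a list of per-bin coverage
    Assumption: mean and stdev are taken per bin across the controls.
    """
    z_scores = []
    for i, cov in enumerate(raw):
        ref = [sample[i] for sample in controls]
        mu, sd = mean(ref), stdev(ref)  # sample standard deviation (assumption)
        z_scores.append((cov - mu) / sd if sd > 0 else float("nan"))
    return z_scores

# Hypothetical toy data: three control samples, three 200-kb bins.
controls = [[100, 98, 102], [104, 100, 98], [96, 102, 100]]
tumor = [100, 150, 60]
z_scores = normalize_bins(tumor, controls)  # neutral, gained, and lost bins
```

In this toy input the first bin matches the controls, the second is strongly over-covered, and the third is strongly under-covered, so the three Z-scores are zero, positive, and negative, respectively.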

CIN evaluation and TP53 copy loss identification
To evaluate the CIN status, Z scores of coverage for each chromosomal arm were estimated using the following formula:
$$Z_{chr} = \frac{\mathrm{coverage}_{chr,\,\mathrm{tumor}} - \operatorname{mean}(\mathrm{coverage}_{chr,\,\mathrm{controls}})}{\operatorname{stdev}(\mathrm{coverage}_{chr,\,\mathrm{controls}})} \tag{2}$$
CIN scores were then summarized as $\mathrm{CIN} = \sum_{chr} L_{chr} \cdot Z_{chr}$, where $L_{chr}$ is the length of the chromosome segment and $Z_{chr}$ is the Z score of the segment. The cut-off score separating the CIN-low and CIN-high groups was set at 4000 in our research.
To evaluate whether there was copy loss of the TP53 gene, the Z score of coverage for TP53, $Z_{TP53}$, was estimated analogously to formula (2). Any sample with $Z_{TP53} \le -3$ was considered to have TP53 gene copy loss.
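The scoring rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the text does not give the unit of segment length, whether the cut-off is inclusive, or whether signed or absolute Z scores enter the sum (the methods mention that "the absolute segment value was used"), so the `absolute` flag, the `>=` comparison, and the example segments are all hypothetical.

```python
CIN_CUTOFF = 4000    # CIN-high vs CIN-low cut-off reported in the text
TP53_Z_CUTOFF = -3   # Z_TP53 <= -3 indicates TP53 copy loss

def cin_score(segments, absolute=True):
    """Sum of length * Z over segments.

    segments: iterable of (length, z) pairs; the length unit is an assumption.
    absolute: use |Z| instead of signed Z (assumption, see lead-in).
    """
    return sum(length * (abs(z) if absolute else z) for length, z in segments)

def classify(segments, z_tp53, absolute=True):
    """Apply both cut-offs to one sample and return a small summary dict."""
    score = cin_score(segments, absolute)
    return {
        "cin_score": score,
        "cin_high": score >= CIN_CUTOFF,  # inclusive cut-off is an assumption
        "tp53_copy_loss": z_tp53 <= TP53_Z_CUTOFF,
    }

# Hypothetical sample: one gained and one lost segment, plus a TP53 Z score.
example = classify([(50, 30.0), (100, -40.0)], z_tp53=-4.2)
```

With absolute values, gains and losses both add to the score; with signed values they can cancel, which is why the choice is exposed as a parameter rather than fixed.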

Statistical analysis
Univariate and multivariate survival analyses were performed with Cox regression, and DFS rates were calculated with the Kaplan-Meier method. Correlation analyses between parameters were performed with Fisher's exact test. All statistical analyses were performed using R software (version 4.0.2).
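For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched as follows. The study's analyses were run in R; this pure-Python version, the function name `kaplan_meier`, and the follow-up times are hypothetical and serve only to show how censored observations leave the risk set without lowering the survival estimate.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times:  follow-up time per patient (e.g. months)
    events: 1 if recurrence or death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]  # all patients with this time
        d = sum(tied)                            # events observed at time t
        if d > 0:
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= len(tied)                     # events and censored both leave
        i += len(tied)
    return curve

# Hypothetical follow-up data for five patients.
curve = kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0])
```

Here the estimate steps down at months 6, 12, and 20; the patient censored at month 12 and the one censored at month 30 reduce the number at risk but do not produce a step.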

BRCA1 mutation frequency in breast cancer patients with high-risk factors
A total of 1151 breast cancer patients with one or more high-risk clinical factors were enrolled for BRCA1 germline mutation screening. In total, 113 cases with BRCA1 mutations were identified, including five BRCA1 LGRs and 108 small pathogenic or likely pathogenic mutations. The overall BRCA1 mutation rate was 9.8% (113/1151) in our cohort. The highest subgroup included patients with FBRC, which
1A). These data indicated a significant correlation between the CIN-high phenotype and TP53 loss (P = 0.00021). Further analysis of individual clinicopathological parameters indicated that TP53 loss was the specific factor that distinguished the CIN-low and CIN-high groups, because no other factor showed a significant impact (Table 3).
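The reported association can be checked with a short worked example. The counts are those given in the abstract (TP53 copy loss in 12 of 14 CIN-high and 3 of 18 CIN-low cases); the function below is a minimal pure-Python sketch of a two-sided Fisher's exact test, written for illustration and not taken from the authors' R code.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    total = comb(n, row1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / total

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-7))

# Rows: CIN-high, CIN-low; columns: TP53 copy loss, no copy loss.
p_value = fisher_exact_two_sided(12, 2, 3, 15)
```

For this table the two-sided P value is approximately 0.00021, consistent with the value reported in the text.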

High CIN leads to poor survival in breast cancer patients with BRCA1 germline mutations
To identify factors that affect the prognosis of breast cancer with BRCA1 germline mutations, we sought to determine the value of CIN for survival. Both univariate (Fig. 2A and Table 3) and multivariate (Table 4) Cox regression analyses indicated that a high CIN value was associated with poor survival (P = 0.013, HR = 6.537, and P = 0.044, HR = 5.99, respectively). Kaplan-Meier analysis further revealed decreased DFS in high-CIN patients (P = 0.0094, Fig. 2B).

BRCA1 mutation profile did not impact the CIN value
The relationships between CIN status and BRCA1 mutation type (frameshift, stop-gain, splicing defect, missense, and large genomic rearrangement) and mutation position (domain) were analyzed. Among the 32 cases previously identified as having unique BRCA1 mutation profiles, no significant correlation was found between CIN status and either feature (Fig. 3A).
A schematic illustration of the mutations distributed along the BRCA1 gene, together with the CIN status and TP53 copy loss of the 32 cases, is shown in Fig. 3B. These results suggest that the BRCA1 mutation profile may not affect CIN status, whereas loss of TP53 function was associated with chromosome instability.

Discussion
In recent years, the popularization of NGS technology has provided another solution for CIN evaluation; nonetheless, NGS-based CIN evaluation still needs to be standardized. Some researchers have used gene panel sequencing assays to evaluate CIN. In 2019, Lee et al. evaluated CIN scores based on an NGS gene panel of 170 common cancer drivers and identified a relationship between this "panel-based CIN" and CEP 17 copy gain [14]. Although this method is better than FISH, it still provides only a partial view rather than the global CIN status. Herein, we report a novel method to perform CIN evaluation by LPWGS, which has the advantages of whole-genome coverage and sensitivity in detecting large DNA alterations. This method has been used in previous studies for clinical cytogenetic analysis [16,17]. Evaluation of circulating cell-free DNA by this method has also been reported and is thought to be valuable in predicting prognosis or chemotherapy resistance in non-small cell lung cancer and breast cancer [18][19][20]. Based on these reports, LPWGS is considered a reliable method for determining large segmental alterations of cancer genomes. In the present research, we applied this method to analyze CIN status in breast cancer in order to overcome the limitations of the earlier FISH-based CIN evaluation.
Since we selected patients with high-risk clinical factors, the mutation frequency of BRCA1 in our cohort was 9.8% (113/1151), which was higher than that in unselected breast cancer [4,5,10]. However, the prognosis of this BRCA1-mutated subgroup remains elusive, since various clinical trials and studies have reported inconsistent results [21][22][23][24]. Two large-scale clinical studies of Chinese breast cancer patients published in 2017 reported negative results, in that BRCA1 mutation was not significantly correlated with poor prognosis in multivariate analysis [4,5]. Thus, the identification of reliable molecular factors for sub-classifying patients with a BRCA1 mutation is of great value for developing potentially effective molecular therapy for this specific population. It has been reported that the CIN level is correlated with prognosis in breast cancer [14,25]. In our research, we demonstrated that breast cancer patients with germline BRCA1 mutations and high CIN values had shorter DFS (Fig. 2B). This suggests that CIN determined by the LPWGS method may be used as a valuable prognostic factor in BRCA1-mutated breast cancer patients. Our data also revealed a potential mechanism leading to CIN elevation in breast cancer with germline BRCA1 mutation. In our cohort, TP53 copy loss (TP53 SNVs were not included) was related to the CIN-high phenotype. Among the 32 cases, 15 had TP53 copy loss (12 of 14 cases in the CIN-high group and 3 of 18 cases in the CIN-low group, P = 0.00021). This may be explained by the nature of CIN, which is caused by a failure to repair DNA double-strand breaks (DSBs) due to germline or somatic mutation of genes that maintain genomic homeostasis, including homologous recombination repair (HRR) genes (e.g., BRCA1, BRCA2, PALB2, RAD51C, BRIP1) and genome caretaker genes such as TP53. Aberrant expression of TP53 disables the cellular response to DNA damage on multiple levels [26].
Cancer cells with TP53 copy loss cannot arrest their cell cycle, which is necessary for proper DNA damage repair, and therefore accumulate large DNA fragment alterations such as translocations, duplications, and deletions [27,28]. Consequently, TP53 copy loss leads to dramatically increased CIN and cancer malignancy.
Previous studies have found that BRCA1 mutation-related genomic instability is at least partially dependent on abnormal TP53 activation, demonstrating that TP53 participates in these signal transduction processes [29,30]. In one study, Cao et al. found that stem cells with BRCA1 deletion presented senescence characteristics and could undergo malignant transformation when TP53 copy loss occurred at the same time [31]. Thus, the BRCA1-TP53 axis ensures the stability of the somatic genome and prevents oncogenic progression. The potential mechanism of TP53 loss in breast cancer with germline BRCA1 mutation needs to be investigated.
The mutation spectrum of BRCA1 is extremely complicated. Mutation sites have been identified in almost all of its exons and introns, leading to all types of genetic alterations, including frameshift, missense, nonsense, in-frame insertions and deletions, and splice-altering mutations [32]. In the present study, we analyzed the BRCA1 mutations in all our cases and obtained the mutation spectrum. We did not find any convincing hot-spot exons for CIN-high cases, not even in the three breast cancer cluster regions (BCCRs) previously identified [33]. These results indicate that CIN was independent of BRCA1 mutation type and position. Despite the existence of numerous BRCA1 variants, a pathogenic BRCA1 mutation alone may be insufficient to drive the accumulation of chromosome instability.

Conclusion
In summary, our research established a novel whole-genome sequencing method to quantify CIN, and we found that high CIN status was correlated with poor DFS as well as with TP53 copy loss in breast cancer with germline BRCA1 mutations, whereas the distribution of mutation sites along the BRCA1 gene did not affect the CIN value. Based on these findings, the LPWGS method will be valuable for CIN evaluation, and CIN can serve as a novel prognostic factor for predicting clinical outcome in patients with BRCA1 mutations.
LGR large genomic rearrangement, DCIS ductal carcinoma in situ, ER estrogen receptor, PR progesterone receptor, HER2 human epidermal growth factor receptor 2. Note: Among these parameters, only the TP53 alteration showed a significant difference between the CIN-high and CIN-low groups.