Identification of candidate cancer predisposing variants by performing whole-exome sequencing on index patients from BRCA1 and BRCA2-negative breast cancer families

Shahi, Rajendra Bahadur; De Brakeleer, Sylvia; Caljon, Ben; Pauwels, Ingrid; Bonduelle, Maryse; Joris, Sofie; Fontaine, Christel; Vanhoeij, Marian; Van Dooren, Sonia; Teugels, Erik; De Grève, Jacques

doi:10.1186/s12885-019-5494-7

Research article
Open access
Published: 04 April 2019

BMC Cancer, volume 19, article number 313 (2019)

Rajendra Bahadur Shahi ORCID: orcid.org/0000-0002-7751-6852¹,
Sylvia De Brakeleer¹,
Ben Caljon²,
Ingrid Pauwels⁴,
Maryse Bonduelle⁵,
Sofie Joris⁴,
Christel Fontaine³,
Marian Vanhoeij³,
Sonia Van Dooren²,⁵,
Erik Teugels¹,⁴ &
Jacques De Grève¹,⁴

Abstract

Background

In the majority of familial breast cancer (BC) families, the etiology of the disease remains unresolved. To identify missing BC heritability resulting from relatively rare variants (minor allele frequency ≤ 1%), we have performed whole exome sequencing followed by variant analysis in a virtual panel of 492 cancer-associated genes on BC patients from BRCA1 and BRCA2 negative families with elevated BC risk.

Methods

BC patients from 54 BRCA1 and BRCA2-negative families with elevated BC risk and 120 matched controls were considered for germline DNA whole exome sequencing. Rare variants identified in the exome and in a virtual panel of cancer-associated genes [492 genes associated with different types of (hereditary) cancer] were compared between BC patients and controls. Nonsense, frame-shift indels and splice-site variants (strong protein-damaging variants, called PDAVs later on) observed in BC patients within the genes of the panel, which we estimated to possess the highest probability to predispose to BC, were further validated using an alternative sequencing procedure.

Results

Exome-wide and cancer-associated gene panel-wide variant analyses showed no significant difference in the average number of rare variants between BC patients and controls. However, nonsense variants in genes of the cancer-associated gene panel were more than two-fold over-represented in women with BC, and the affected genes were commonly involved in the DNA double-strand break repair process. Approximately 44% (24 of 54) of BC patients harbored 31 PDAVs, of which 11 were novel. These variants were found in genes associated with known or suspected BC predisposition (PALB2, BARD1, CHEK2, RAD51C and FANCA), in predisposing genes linked to other cancer types but not well studied in the context of familial BC (EXO1, RECQL4, CCNH, MUS81, TDP1, DCLRE1A, DCLRE1C, PDE11A and RINT1), and in genes associated with different hereditary syndromes but not yet clearly associated with familial cancer syndromes (ABCC11, BBS10, CD96, CYP1A1, DHCR7, DNAH11, ESCO2, FLT4, HPS6, MYH8, NME8 and TTC8). Exome-wide, only a few genes appeared to be enriched for PDAVs in the familial BC patients compared to controls.

Conclusions

We have identified a series of novel candidate BC predisposition variants/genes. These variants/genes should be further investigated in larger cohorts/case-control studies. Other studies including co-segregation analyses in affected families, locus-specific loss of heterozygosity and functional studies should shed further light on their relevance for BC risk.

Background

Breast cancer is the most common cancer and the leading cause of cancer deaths among women in the world [1]. About 10–20% of all BC cases occur in a familial context, with multiple family members affected across generations [2]. Familial BC susceptibility resulting from deleterious germline variations located on chromosome 17q21 was first brought to light through linkage analysis in 1990 [3]. Since then, variants ranging from highly penetrant rare variants (with a relative risk above 10-fold) in BRCA1 (OMIM 113705), BRCA2 (OMIM 600185), TP53 (OMIM 191170), PTEN (OMIM 601728), STK11 (OMIM 602216) and CDH1 (OMIM 192090) to moderately penetrant rare variants (with a relative risk of 2 to 4-fold) in CHEK2 (OMIM 604373), PALB2 (OMIM 610355) [4], BARD1 (OMIM 601593), ATM (OMIM 607585) and BRIP1 (OMIM 605882) have been reported. The exact penetrance associated with pathogenic variants in several of these genes is still under investigation. These genes were identified through linkage analysis, positional cloning and/or candidate gene sequencing [5,6,7,8,9,10,11,12,13,14,15,16,17]. Furthermore, with the advent of DNA microarray technology, many low-penetrance common variants (with a relative risk often much less than twofold) were uncovered through genome-wide association studies [18]. More recently, dramatic advances in the speed and scale of next-generation sequencing (NGS) technologies, combined with sophisticated computational algorithms and a sharp decrease in sequencing cost, have opened a path for the discovery of additional candidate BC predisposing variants. For example, variants in XRCC2 (OMIM 600375), FANCC (OMIM 613899), BLM (OMIM 604610) and PPM1D (OMIM 605100) have recently been reported through NGS technologies as candidate variants conferring BC risk [19,20,21]. In aggregate, the currently known variants with high, moderate and low penetrance in familial BC susceptibility genes account for only up to 25–50% of all high-risk BC families. This missing heritability in the remaining 50–75% of BC families [16, 22] reflects both the complexity of the BC genetic architecture and the challenges in identifying the remaining BC predisposing variants needed for delivering timely screening, preventive intervention, and precision treatment.

Several studies have revealed that variants in familial BC susceptibility genes like BRCA1, BRCA2, TP53, PALB2, CDH1, PTEN (OMIM 601728), PIK3CA (OMIM 171834), STK11 (OMIM 602216), RINT1 (OMIM 610089) and NF1 (OMIM 613113) are associated not only with BC predisposition but also with a number of other malignancies [7, 9, 23,24,25,26,27,28,29]. In the current study, we hypothesized that in BRCA1 and BRCA2-negative families with elevated BC risk, the analysis of a large array of genes previously associated with (hereditary) cancer syndromes or cancer in general could lead to the identification of additional candidate BC predisposing genes/variants. Thus, firstly, we identified all rare variants both exome-wide and cancer-associated gene panel-wide (492 genes) in 54 BC patients from BRCA1 and BRCA2-negative families with elevated BC risk and compared their incidence with that in 120 geographically matched controls. Secondly, all nonsense, frame-shift indel and splice-site variants detected in BC patients within the 492 genes of the panel, which we estimated to possess the highest probability to predispose to BC, were validated on an independent sequencing platform (Roche Junior).

Methods

Sample selection

A total of 57 BC patients and 120 controls were considered for this study. Among the BC patients (Additional file 1), 54 were from unrelated BRCA1 and BRCA2-negative families with elevated BC and/or ovarian cancer (OC) risk (i.e. families with two or more affected first-degree relatives) and with a median age at diagnosis of 51 years (range: 36–72). The remaining three BC patients were included as “blinded internal positive controls”, each harboring a known germline variant in BRCA1 (NM_007300.3:c.5096G > A), BARD1 (NM_000465.3:c.1921C > T) or PALB2 (NM_024675.3:c.1571C > G). All geographically matched unrelated controls considered in this study (patients who consulted at the same hospital for cardiac arrhythmias and were sequenced according to the same wet lab protocol) were unselected for personal or familial history of cancer. An overview of the sample preparation, sequencing, analysis and variant validation workflow is presented in ‘Additional file 2’.

Patient recruitment and blood sampling were performed according to the ethical procedures approved by the institutional ethics committee of the UZ Brussel. Peripheral blood was collected after obtaining a written informed consent for a broad genomic analysis covering also incidental findings in genes predictive for other diseases. Genomic DNA was prepared using Chemagic Magnetic Separation Module I (Chemagen) according to the manufacturer’s recommendations.

A virtual panel of cancer-associated genes

After identification of rare variants in whole exomes, we further chose to prioritize variants present in a panel of 492 genes possibly/likely associated with (hereditary) cancer [hereafter called cancer-associated gene panel (CAGP) (Table 1 and Additional file 3)]. These genes were pooled from seven gene lists: the well-known cancer susceptibility genes reported by Rahman et al. [30], various BC gene panels reported by Easton et al. [17], genes from BROCA-Cancer Risk Panel (Version 6) [31], Fanconi Anaemia pathway genes reported by Kanchi et al. [32], human DNA repair genes reported by Wood et al. [33], human cancer predisposition genes (GeneRead DNAseq Targeted Panel V2) from Qiagen and genes from the familial cancer database (FaCD, retrieved on 17/02/2015) [34]. Out of the 492 genes from our CAGP, 177 (36%) genes are contributed by at least two gene lists and are mostly known to be cancer susceptibility genes. The remaining 315 (64%) genes are private to a single gene list, mostly from Wood et al. (114 genes) and FaCD (167 genes). Some of these latter genes are not yet clearly associated with (hereditary) cancers (Table 1).
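
The pooling of several source lists into one panel, and the split into shared and private genes, can be sketched in Python as follows. This is a minimal illustration only: the list names and gene memberships are toy placeholders, not the actual contents of the seven source lists, and `build_panel` is a hypothetical helper name.

```python
from collections import Counter

def build_panel(gene_lists):
    """Pool source lists and split genes into shared (>= 2 lists) and private (1 list)."""
    counts = Counter()
    for genes in gene_lists.values():
        counts.update(set(genes))  # set(): a gene counts once per source list
    shared = {g for g, n in counts.items() if n >= 2}
    private = {g for g, n in counts.items() if n == 1}
    return shared, private

# Toy input with hypothetical memberships, for illustration only.
toy_lists = {
    "list_A": ["BRCA1", "PALB2", "CHEK2"],
    "list_B": ["PALB2", "CHEK2", "EXO1"],
    "list_C": ["TTC8"],
}
shared, private = build_panel(toy_lists)
panel = shared | private
print(len(panel), sorted(shared), sorted(private))
```

Applied to the real lists, the same counting would be expected to reproduce the reported split of 177 shared and 315 private genes (177 + 315 = 492).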

Table 1 Genes incorporated in the CAGP

Target-enrichment and next-generation sequencing

For each of the BC patients and controls, one μg of DNA was fragmented using adaptive focused acoustics (Covaris) in order to obtain fragments of approximately 250 base pairs. After DNA end repair and adenylation, oligonucleotide adapters for paired-end sequencing (Illumina) were ligated to both ends of the fragments. Two hundred nanograms of size-selected ligated DNA was PCR amplified and subsequently captured by hybridization for 65 h with the Roche SeqCap EZ Human Exome v3.0 (Roche) Capture Library. After further selection of the targeted fragments through multiple washing steps, the captured, probe-selected DNA was cluster amplified on the Illumina cBot according to the manufacturer’s protocol (Illumina), using five samples per flow cell lane in order to get sufficient DNA for the subsequent sequencing run. Sequencing was performed on a HiSeq1500 (Illumina) with a paired-end module, generating 125 base reads.

Sequence alignment, variant calling and annotation

Primary processing, including base calling, read filtering and adapter trimming, was performed using the standard Illumina pipeline. High quality reads for each sample were mapped to the human genome reference assembly GRCh37/hg19 (https://www.ncbi.nlm.nih.gov/grc/human/issues/HG-37, build 37.2, Feb 2009) using BWA-MEM [35] (http://bio-bwa.sourceforge.net/, version 0.7.10-r789) with the default settings. After marking PCR duplicates with Picard (https://broadinstitute.github.io/picard/, version 1.97), the GATK pipeline [36] (https://software.broadinstitute.org/gatk/, version 3.4-46) was followed according to the GATK Best Practices guideline for local indel realignment, base recalibration, variant calling (HaplotypeCaller), variant recalibration and variant filtration. The variants obtained thereafter were annotated with ANNOVAR [37] (http://annovar.openbioinformatics.org/, version 2015-12-14) against the refGene database and population databases (1000g2015aug_eur, 1000g2015aug_all, esp6500siv2_ea, esp6500siv2_all, exac03nontcga, snp132NonFlagged and GoNL [38]), in addition to ljb26_all, a database of variant function prediction scores. All the databases were obtained from the ANNOVAR website except GoNL (http://www.nlgenome.nl/, release 5).

Variant filtration and classification

An in-house Python script was used for variant filtration in three steps. Firstly, variants were only retained if they passed VQSLOD filtering (tranche sensitivity threshold of 99.9%) and were located in the exons or at the splice-sites (±2 bp from the exon-intron border). In addition, we required an absolute read depth of at least 10X at the variant position, at least two reads harboring the variant and a variant allele ratio between 20 and 80%, along with a minor allele frequency (MAF) ≤1% in any of the population databases (mentioned earlier). Further, we assumed that variants present in > 10% of both BC patients and controls most likely resulted from sequencing or alignment errors, or were variants common exclusively in our study population (and thus missed by the MAF restriction). These variants were therefore removed. Furthermore, missense variants were classified as “probably damaging” (pph2-prob ≥0.957), “possibly damaging” (0.453 ≤ pph2-prob ≤0.956), or “benign” (pph2-prob ≤0.452) according to PolyPhen-2 (HDIV) [39] in silico prediction scores. Secondly, exome-wide variants that passed all the filters in the first step were selected for their presence in genes of the CAGP. Lastly, frame-shift indels, nonsense and splice-site variants present in genes of the CAGP were further validated. These variants are hereafter collectively called potentially Protein Damaging Allelic Variants (PDAVs), as they have the highest probability to cause loss of protein function and thus to be associated with BC predisposition.
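
The first filtering step and the PolyPhen-2 classification can be sketched roughly as below. This is not the authors' script, which is not shown in the article. The record layout (`vqsr_pass`, `region`, `depth`, `alt_reads`, `pop_mafs`) and all function names are assumptions made for illustration. The sketch also assumes that the MAF criterion must hold in every population database in which the variant is listed, which is one possible reading of the text.

```python
def passes_first_stage(v, max_maf=0.01, min_depth=10, min_alt_reads=2,
                       min_ratio=0.20, max_ratio=0.80):
    """Apply the per-variant thresholds described in the text to one record."""
    if not v["vqsr_pass"]:                          # VQSLOD tranche filter
        return False
    if v["region"] not in {"exonic", "splicing"}:   # exons or splice sites only
        return False
    if v["depth"] < min_depth or v["alt_reads"] < min_alt_reads:
        return False
    ratio = v["alt_reads"] / v["depth"]             # variant allele ratio
    if not (min_ratio <= ratio <= max_ratio):
        return False
    # None means the variant is absent from that database (assumed rare there).
    return all(f is None or f <= max_maf for f in v["pop_mafs"])

def too_common_in_cohort(n_cases_with, n_controls_with,
                         n_cases=54, n_controls=120, cutoff=0.10):
    """Flag variants seen in more than 10% of both patients and controls."""
    return (n_cases_with / n_cases > cutoff
            and n_controls_with / n_controls > cutoff)

def classify_missense(pph2_prob):
    """Map a PolyPhen-2 (HDIV) score to the three classes used in the text."""
    if pph2_prob >= 0.957:
        return "probably damaging"
    if pph2_prob >= 0.453:
        return "possibly damaging"
    return "benign"

example = {"vqsr_pass": True, "region": "exonic", "depth": 40,
           "alt_reads": 18, "pop_mafs": [0.002, None]}
print(passes_first_stage(example), classify_missense(0.98))
```

Scores are assumed to be reported to three decimals, so the boundaries 0.452/0.453 and 0.956/0.957 leave no gaps in practice.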

Variant validation

For validation of the PDAVs obtained from the Illumina platform using a capture-based library enrichment system, an orthogonal approach using amplicon-based library enrichment on a 454 platform from Roche (Junior) was performed. Primer pairs were designed to amplify DNA fragments (amplicons) that contain the desired variants. One primer of each pair was placed in an intronic region, when possible, to avoid amplification of processed pseudogenes. In addition, a BLAST search of the target sequence was performed in order to choose only primer pairs that specifically amplify the target region while avoiding non-specific or pseudogene amplification. Furthermore, primers binding to target sequences containing SNPs with a MAF > 1% were avoided. For variant analysis, SeqNext software (JSI medical systems) was used.

Results

Exome coverage

On average, about 1.0 × 10⁸ unique good quality reads were generated per exome both for BC patients and controls. About 87% of these reads from BC patients (controls: 86%) could be aligned to the reference genome covering 94% (controls: 95%) [BC patients range: 77–96%, controls range: 90–96%] of the exome with at least 10X target bases coverage. The median of ‘mean depth coverage’ at target region was about 107X and 101X [BC patients range: 46X-295X, controls range: 64X-148X] across all the BC patients and controls, respectively (Additional file 4). Coverage in CAGP was very similar to the coverage in exome both for BC patients and controls.

Exome- and CAGP-wide variant enrichment in BC patients versus controls

Exome-wide, a total of 3,316,630 variants (average: 61,419 variants/BC patient) were called in 54 BC patients (3 internal positive controls excluded) and 7,413,256 variants (average: 61,777 variants/control) were called in 120 controls. After exhaustive variant filtering (as described in methods), 22,724 variants (average: 421 variants/BC patient) were retained in BC patients. Among them, 8153 single nucleotide variants (SNVs) were synonymous, 432 were in-frame indels, 543 were frame-shift indels, 162 were splice-site SNVs, 303 were nonsense SNVs and 5182 + 2227 + 5722 were missense SNVs (predicted as “probably damaging”, “possibly damaging” and “benign” by PolyPhen-2, respectively). Similarly, in the controls we retained 51,219 variants (average: 427 variants/control) after filtering, consisting of 17,891 synonymous SNVs, 981 in-frame indels, 1052 frame-shift indels, 420 splice-site SNVs, 768 nonsense SNVs and 11,929 + 5197 + 12,981 missense SNVs (predicted as “probably damaging”, “possibly damaging” and “benign” by PolyPhen-2, respectively). An overview of these data is presented in ‘Additional file 5’.

Subsequently, we investigated whether an exome-wide enrichment could be observed in the number of variants when comparing BC patients to controls (Student’s t-test or Welch’s t-test). No significant difference was observed in the average number of variants between BC patients and controls, either when pooling all the variant types together (BC patients: controls; 420.81: 426.83, p = 0.3071) or when separately analyzing each sub-type of variants [synonymous SNVs (150.98: 149.09, p = 0.5058), in-frame indels (8.00: 8.18, p = 0.7043), splice-site SNVs (3.00: 3.50, p = 0.1053) and the missense SNVs “probably damaging” (95.96: 99.41, p = 0.0628), “possibly damaging” (41.24: 43.31, p = 0.0866) and “benign” (105.96: 108.18, p = 0.3215)], except for frame-shift indels (10.06: 8.77, p = 0.0199) and nonsense SNVs (5.61: 6.40, p = 0.0446) (Additional file 6).
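
For readers unfamiliar with the Welch variant of the t-test, the statistic and its degrees of freedom can be computed as in the sketch below. The per-sample counts are made-up toy values, since the article reports only group averages, and `welch_t` is a hypothetical helper name. Turning the statistic into a p-value additionally requires the t distribution, for which a statistics library would normally be used.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of group a
    vb = variance(b) / len(b)   # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Toy per-sample variant counts (hypothetical, for illustration only).
cases = [9, 11, 10, 12, 8]
controls = [8, 9, 7, 10, 9, 8]
t, df = welch_t(cases, controls)
print(f"t = {t:.3f}, df = {df:.2f}")
```

Unlike Student's t-test, this form does not assume equal variances in the two groups, which is relevant here because the groups differ in size (54 versus 120).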

In the next step, we only considered the variants present in the 492 cancer-associated genes of the CAGP (see methods). In the BC patients, after filtering, we retained 240 synonymous SNVs, 8 in-frame indels, 13 frame-shift indels, 6 splice-site SNVs, 14 nonsense SNVs and 195 + 94 + 215 missense SNVs (predicted as “probably damaging”, “possibly damaging” and “benign”, respectively). In the controls we retained 589 synonymous SNVs, 21 in-frame indels, 23 frame-shift indels, 20 splice-site SNVs, 13 nonsense SNVs and 398 + 174 + 417 missense SNVs (see Additional file 5). When comparing the average number of variants in BC patients versus controls, we observed that the average number of nonsense SNVs was more than twice as high in BC patients [BC patients: controls; 0.26:0.11; ratio = 2.39; p = 0.0287 (0.0688 with Welch correction)], whereas no obvious enrichment could be observed in the other sub-types of variants (see Additional file 6).
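
The per-person averages and their ratio follow directly from the counts given above and the group sizes (54 BC patients, 120 controls). A short Python check, with `per_person_ratio` as a hypothetical helper name:

```python
N_CASES, N_CONTROLS = 54, 120

# CAGP variant counts as reported in the text: (BC patients, controls).
cagp_counts = {
    "frame-shift indels": (13, 23),
    "splice-site SNVs": (6, 20),
    "nonsense SNVs": (14, 13),
}

def per_person_ratio(case_count, control_count,
                     n_cases=N_CASES, n_controls=N_CONTROLS):
    """Return (mean per patient, mean per control, ratio of the two means)."""
    case_mean = case_count / n_cases
    control_mean = control_count / n_controls
    return case_mean, control_mean, case_mean / control_mean

for label, (cases, controls) in cagp_counts.items():
    case_mean, control_mean, ratio = per_person_ratio(cases, controls)
    print(f"{label}: {case_mean:.2f} vs {control_mean:.2f}, ratio = {ratio:.2f}")
```

For the nonsense SNVs this gives 0.26 versus 0.11 and a ratio of 2.39, matching the values quoted in the text.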

To investigate further whether specific genes are more frequently mutated in our BC patients compared to controls, we selected, exome-wide, all the genes harboring high impact mutations (PDAVs) in at least two BC patients (see Additional file 7). Among the 95 genes selected, five (FAM111B, GRAMD2, SP100, USP45 and ZNF534) can be considered candidate BC predisposing genes as they were mutated in three BC patients (out of 54) but not in any of the 120 controls (Additional file 8). Two other good candidate genes are ASPH and C17orf80, as they harbored PDAVs in five and four BC patients, respectively, and in only one control sample (Additional file 8). All PDAVs found in these 7 candidate BC predisposing genes were visually verified using the Integrative Genomics Viewer (IGV) [40].

Validation of PDAVs within the CAGP

PDAVs resulting in dramatic changes in protein structure and function have the highest chance to be associated with BC predisposition. Those PDAVs that were detected in BC patients, passed the filters, and are located within the genes of the CAGP were further validated on an independent sequencing platform (see methods) and were also reviewed manually using IGV [40]. Thirty-one out of 33 PDAVs, present in 24 out of 54 BC patients (~ 44%), passed the validation step (Table 2), of which 11 PDAVs are not reported in dbSNP147. Among the BC patients with a PDAV, eighteen harbored a PDAV in a single gene, five harbored PDAVs in two genes and one harbored PDAVs in three genes. Furthermore, all five splice-site SNVs (Additional file 9) were considered disruptive by the in silico web-based tool “Human Splicing Finder” [41] (http://www.umd.be/HSF3/, release 3.0). Three of the 26 genes harboring PDAVs in the BC patients were also found mutated in the control samples, suggesting that these genes (ABCC11, BBS10 and PDE11A) are not involved in cancer predisposition (compare Additional files 10 and 11). In addition, the pathogenic variants present in the three internal positive control samples included in this study were also identified.
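
The carrier breakdown above is consistent with the totals of 24 carriers and 31 validated PDAVs, as the following small check shows. It assumes that each patient carries one validated PDAV per affected gene, which the text implies but does not state explicitly.

```python
# Number of genes with a validated PDAV -> number of BC patients (from the text).
carriers_by_gene_count = {1: 18, 2: 5, 3: 1}
N_PATIENTS = 54

n_carriers = sum(carriers_by_gene_count.values())
# Assumption: one validated PDAV per affected gene in each patient.
n_pdavs = sum(genes * patients for genes, patients in carriers_by_gene_count.items())
carrier_fraction = n_carriers / N_PATIENTS

print(n_carriers, n_pdavs, f"{carrier_fraction:.0%}")
```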

Table 2 List of genes with the corresponding PDAVs that were validated as true positive in the corresponding BC patient

Discussion

It is expected that exome-wide NGS analysis of a germline DNA sample will reveal many variants when compared to a haploid reference genome, even when only rare variants (MAF ≤ 0.01) are taken into consideration. However, when comparing the total number of variants detected in two individuals of the same ethnicity we do not expect to find significant differences. We confirmed this assumption by comparing, with the same wet bench and dry bench approaches, the average number of variants found in persons belonging to two groups living in the same area (patients recruited in the same hospital): BC patients belonging to elevated risk BC families, and controls selected for cardiac arrhythmias but not for personal or familial history of cancer. The ratio of the average number of observed (rare) variants in both groups is very close to one for all types of variants (Fig. 1 (red) and Additional file 6) except for splice-site and nonsense variants (0.86 and 0.88, respectively), probably because of the relatively small number of splice-site and nonsense variants detected per BC patient and control. When focusing exclusively on the genes of the CAGP, similar observations were made (Fig. 1 (blue) and Additional file 6) except for the category of nonsense variants, where a more than two-fold excess was detected in BC patients compared to controls (ratio = 2.39). Although a larger sample size is required to reach statistical significance, our data suggest that the nonsense variants found in excess in the genes of the CAGP among the BC patients compared to controls (about 50% of these nonsense variants) are implicated in the molecular mechanism modulating BC risk. If the increased number of nonsense variants seen in BC patients is associated with increased cancer risk, one would expect these nonsense variants to be more frequently identified in genes functionally related to the cancer predisposition process. To verify this assumption, the PANTHER over-representation test (released 20171205) [42] was used with a false discovery rate (FDR) < 0.05. This over-representation test compares a test gene list to a reference gene list and determines whether a particular class of genes (e.g. those associated with a specific biological process) is overrepresented or underrepresented. We found that genes involved in the DNA repair process, namely inter-strand cross-link repair (FDR: 4.94E-02), double-strand break (DSB) repair via nonhomologous end joining (FDR: 4.35E-02), non-recombinational repair (FDR: 4.42E-02) and DSB repair (FDR: 8.35E-02), were overrepresented in BC patients but not in the controls. Indeed, four nonsense variants (out of 14) found in BC patients were in genes involved in the DSB repair process, while only one such variant (out of 13) was found among the controls. It remains unclear to us why the same phenomenon is not observed with the frameshift indels. It is possible that false positive indel calls masked an enrichment of the true positive frameshift indels.
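
One way to read the "about 50%" figure is as an excess fraction: if the per-person rate in controls is taken as the background rate, the share of nonsense variants in BC patients that exceeds this background is (R − 1)/R, where R is the case-to-control ratio. This interpretation is our assumption; the article does not spell out the calculation. A minimal sketch, with `excess_fraction` as a hypothetical helper name:

```python
def excess_fraction(ratio):
    """Share of the case rate that exceeds the control (background) rate."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return (ratio - 1) / ratio

# Ratio of nonsense SNVs per person in the CAGP, as reported in the text.
print(round(excess_fraction(2.39), 2))  # 0.58
```

With R = 2.39 the excess fraction is about 0.58, in line with the statement that roughly half of these nonsense variants may be related to BC risk.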

Our exome-wide analysis revealed only seven genes (ASPH, C17orf80, FAM111B, GRAMD2, SP100, USP45 and ZNF534) with high impact mutations (PDAVs) in three (or more) BC patients while comparable mutations were not found (or only once) among the control samples. None of these genes has been reported to possess cancer predisposing properties, and they were therefore not included in the CAGP. Gene Ontology (GO) annotation [43] for molecular function, biological processes and Reactome Pathways indicated that SP100 and USP45 are involved in DNA repair while ZNF534 is involved in DNA-templated regulation of transcription, making them good candidate cancer predisposing genes. No molecular function or biological process was annotated to C17orf80 and FAM111B, whereas ASPH and GRAMD2 were reported to be involved in calcium homeostasis and transport (Additional file 8).

When restricting our variant analyses performed on BC patients to the 492 genes of the CAGP, we found novel as well as known PDAVs in several genes known or suspected to be BC predisposing (Table 2 and Additional file 10). Genes participating in the DNA DSB repair process, e.g. PALB2, BARD1, CHEK2 and RAD51C, are particularly intriguing, as tumors with a defective DSB repair process can be selectively targeted by PARP (poly (ADP-ribose) polymerase) inhibitors, resulting in synthetic lethality [44,45,46]. We also found PDAVs in genes linked to DNA repair, FA or occurring in some types of cancers but not well studied in the context of familial BC (Table 2 and Additional file 10). These candidate BC predisposing genes are also worth scrutinizing further in the familial BC setting, as it is known that familial BC susceptibility genes can also predispose to multiple cancers [30]. Furthermore, we detected PDAVs in genes associated with other hereditary syndromes but not clearly related to cancer (Table 2 and Additional file 10). These genes are mostly derived from the FaCD panel, which is uncurated. PDAVs detected in the CAGP from control samples but not present in the BC patients are listed in Additional file 11.

Only about 44% of the BC patients were found to harbor a PDAV in one (and exceptionally in two or three) gene(s) of the CAGP in this study. Eleven out of 31 PDAVs detected were not reported in dbSNP147 and were therefore considered novel. We should keep in mind that these PDAVs are not necessarily BC predisposing. Therefore, their cancer predisposing attributes should be further investigated in much larger cohort/case-control studies or by performing co-segregation analyses in positive families (if sufficient families are available). Although in this study we mainly focused on candidate PDAVs found in genes of the CAGP (which only accounts for 2% of the full exome), we must remain aware that missense variants in genes of the CAGP, as well as PDAVs and missense variants outside this gene panel, may also predispose to BC. For instance, we identified 7 genes not represented in the CAGP in which PDAVs were over-represented in the BC patient cohort. Moreover, BC predisposition may not necessarily rely solely on the presence of one particular variant in the family but may result from combinatorial interactions between several variants. Indeed, it has been proposed that in the majority of BC families, BC predisposition could be polygenic in nature, with the combined contribution of several variants located in genes associated with moderate or low risk being responsible for the increased susceptibility to BC [47]. How these different variants cooperate at the molecular level to create an increased BC risk is a matter for further investigation [48].

Conclusions

On average, twice as many nonsense variants were found in BC patients as in controls when analyzing the genes from the CAGP. Moreover, GO analysis (biological process) of the genes accumulating those nonsense variants indicated that genes involved in the DSB repair process were overrepresented in the BC patients but not in controls. Comparable observations were not made for the other variant types in the CAGP, nor when considering the whole exome. Taken together, our observations might indicate that a nonsense variant found in the CAGP of a BC patient has a more than 50% chance of being associated with BC risk, while similar conclusions cannot be drawn for “probably/possibly damaging” missense or frameshift mutations. Larger case-control studies should be performed to confirm these assumptions and validate our candidates. This preliminary study in 54 BC patients from BRCA1 and BRCA2-negative BC families with elevated cancer risk identified candidate BC predisposing PDAVs (known as well as unknown) in 30 genes: PALB2, BARD1, CHEK2, RAD51C, FANCA, RINT1, EXO1, RECQL4, CCNH, MUS81, TDP1, DCLRE1A, DCLRE1C, CD96, CYP1A1, DHCR7, DNAH11, ESCO2, FLT4, HPS6, MYH8, NME8, TTC8, ASPH, C17orf80, FAM111B, GRAMD2, ZNF534, SP100 and USP45. The last seven genes of this list have not been connected to the cancer process so far. These novel candidate variants and their associated genes should be further investigated with other methods to confirm their role in BC predisposition.

Abbreviations

BC: breast cancer
CAGP: cancer-associated gene panel
DSB: double-strand break
FA: Fanconi anemia
FaCD: familial cancer database
FDR: false discovery rate
GO: Gene Ontology
GoNL: Genome of the Netherlands
IGV: Integrative Genomics Viewer
MAF: minor allele frequency
NGS: next-generation sequencing
PDAV: protein-damaging allelic variant
VQSLOD: variant quality score log-odds

References

Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65(2):87–108.
Article Google Scholar
Rahman N, Stratton MR. The genetics of breast cancer susceptibility. Annu Rev Genet. 1998;32:95–121.
Article CAS Google Scholar
Hall JM, Lee MK, Newman B, Morrow JE, Anderson LA, Huey B, King MC. Linkage of early-onset familial breast cancer to chromosome 17q21. Science. 1990;250(4988):1684–9.
Article CAS Google Scholar
Antoniou AC, Casadei S, Heikkinen T, Barrowdale D, Pylkas K, Roberts J, Lee A, Subramanian D, De Leeneer K, Fostira F, et al. Breast-cancer risk in families with mutations in PALB2. N Engl J Med. 2014;371(6):497–506.
Article Google Scholar
Miki Y, Swensen J, Shattuck-Eidens D, Futreal PA, Harshman K, Tavtigian S, Liu Q, Cochran C, Bennett LM, Ding W, et al. A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science. 1994;266(5182):66–71.
Article CAS Google Scholar
Wooster R, Neuhausen SL, Mangion J, Quirk Y, Ford D, Collins N, Nguyen K, Seal S, Tran T, Averill D, et al. Localization of a breast cancer susceptibility gene, BRCA2, to chromosome 13q12-13. Science. 1994;265(5181):2088–90.
Article CAS Google Scholar
Malkin D, Li FP, Strong LC, Fraumeni JF Jr, Nelson CE, Kim DH, Kassel J, Gryka MA, Bischoff FZ, Tainsky MA, et al. Germ line p53 mutations in a familial syndrome of breast cancer, sarcomas, and other neoplasms. Science. 1990;250(4985):1233–8.
Article CAS Google Scholar
Saal LH, Gruvberger-Saal SK, Persson C, Lovgren K, Jumppanen M, Staaf J, Jonsson G, Pires MM, Maurer M, Holm K, et al. Recurrent gross mutations of the PTEN tumor suppressor gene in breast cancers with deficient DSB repair. Nat Genet. 2008;40(1):102–7.
Article CAS Google Scholar
Hearle N, Schumacher V, Menko FH, Olschwang S, Boardman LA, Gille JJ, Keller JJ, Westerman AM, Scott RJ, Lim W, et al. Frequency and spectrum of cancers in the Peutz-Jeghers syndrome. Clinical cancer research : an official journal of the American Association for Cancer Research. 2006;12(10):3209–15.
Article CAS Google Scholar
Masciari S, Larsson N, Senz J, Boyd N, Kaurah P, Kandel MJ, Harris LN, Pinheiro HC, Troussard A, Miron P, et al. Germline E-cadherin mutations in familial lobular breast cancer. J Med Genet. 2007;44(11):726–31.
Meijers-Heijboer H, van den Ouweland A, Klijn J, Wasielewski M, de Snoo A, Oldenburg R, Hollestelle A, Houben M, Crepin E, van Veghel-Plandsoen M, et al. Low-penetrance susceptibility to breast cancer due to CHEK2(*)1100delC in noncarriers of BRCA1 or BRCA2 mutations. Nat Genet. 2002;31(1):55–9.
Rahman N, Seal S, Thompson D, Kelly P, Renwick A, Elliott A, Reid S, Spanova K, Barfoot R, Chagtai T, et al. PALB2, which encodes a BRCA2-interacting protein, is a breast cancer susceptibility gene. Nat Genet. 2007;39(2):165–7.
De Brakeleer S, De Greve J, Loris R, Janin N, Lissens W, Sermijn E, Teugels E. Cancer predisposing missense and protein truncating BARD1 mutations in non-BRCA1 or BRCA2 breast cancer families. Hum Mutat. 2010;31(3):E1175–85.
Broeks A, Urbanus JH, Floore AN, Dahler EC, Klijn JG, Rutgers EJ, Devilee P, Russell NS, van Leeuwen FE, van 't Veer LJ. ATM-heterozygous germline mutations contribute to breast cancer-susceptibility. Am J Hum Genet. 2000;66(2):494–500.
Seal S, Thompson D, Renwick A, Elliott A, Kelly P, Barfoot R, Chagtai T, Jayatilake H, Ahmed M, Spanova K, et al. Truncating mutations in the Fanconi anemia J gene BRIP1 are low-penetrance breast cancer susceptibility alleles. Nat Genet. 2006;38(11):1239–41.
Stratton MR, Rahman N. The emerging landscape of breast cancer susceptibility. Nat Genet. 2008;40(1):17–22.
Easton DF, Pharoah PD, Antoniou AC, Tischkowitz M, Tavtigian SV, Nathanson KL, Devilee P, Meindl A, Couch FJ, Southey M, et al. Gene-panel sequencing and the prediction of breast-cancer risk. N Engl J Med. 2015;372(23):2243–57.
Easton DF, Pooley KA, Dunning AM, Pharoah PD, Thompson D, Ballinger DG, Struewing JP, Morrison J, Field H, Luben R, et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature. 2007;447(7148):1087–93.
Hilbers FS, Wijnen JT, Hoogerbrugge N, Oosterwijk JC, Collee MJ, Peterlongo P, Radice P, Manoukian S, Feroce I, Capra F, et al. Rare variants in XRCC2 as breast cancer susceptibility alleles. J Med Genet. 2012;49(10):618–20.
Thompson ER, Doyle MA, Ryland GL, Rowley SM, Choong DY, Tothill RW, Thorne H, kConFab BDR, Li J, et al. Exome sequencing identifies rare deleterious mutations in DNA repair genes FANCC and BLM as potential breast cancer susceptibility alleles. PLoS Genet. 2012;8(9):e1002894.
Ruark E, Snape K, Humburg P, Loveday C, Bajrami I, Brough R, Rodrigues DN, Renwick A, Seal S, Ramsay E, et al. Mosaic PPM1D mutations are associated with predisposition to breast and ovarian cancer. Nature. 2013;493(7432):406–10.
Melchor L, Benitez J. The complex genetic landscape of familial breast cancer. Hum Genet. 2013;132(8):845–63.
Mai PL, Chatterjee N, Hartge P, Tucker M, Brody L, Struewing JP, Wacholder S. Potential excess mortality in BRCA1/2 mutation carriers beyond breast, ovarian, prostate, and pancreatic cancers, and melanoma. PLoS One. 2009;4(3):e4812.
Jones S, Hruban RH, Kamiyama M, Borges M, Zhang X, Parsons DW, Lin JC, Palmisano E, Brune K, Jaffee EM, et al. Exomic sequencing identifies PALB2 as a pancreatic cancer susceptibility gene. Science. 2009;324(5924):217.
Norton JA, Ham CM, Van Dam J, Jeffrey RB, Longacre TA, Huntsman DG, Chun N, Kurian AW, Ford JM. CDH1 truncating mutations in the E-cadherin gene: an indication for total gastrectomy to treat hereditary diffuse gastric cancer. Ann Surg. 2007;245(6):873–9.
Figer A, Kaplan A, Frydman M, Lev D, Paswell J, Papa MZ, Goldman B, Friedman E. Germline mutations in the PTEN gene in Israeli patients with Bannayan-Riley-Ruvalcaba syndrome and women with familial breast cancer. Clin Genet. 2002;62(4):298–302.
Orloff MS, He X, Peterson C, Chen F, Chen JL, Mester JL, Eng C. Germline PIK3CA and AKT1 mutations in Cowden and Cowden-like syndromes. Am J Hum Genet. 2013;92(1):76–80.
Park DJ, Tao K, Le Calvez-Kelm F, Nguyen-Dumont T, Robinot N, Hammet F, Odefrey F, Tsimiklis H, Teo ZL, Thingholm LB, et al. Rare mutations in RINT1 predispose carriers to breast and Lynch syndrome-spectrum cancers. Cancer Discov. 2014;4(7):804–15.
Friedman JM: Neurofibromatosis 1. In: GeneReviews(R). Edn. Edited by Pagon RA, Adam MP, Ardinger HH, Wallace SE, Amemiya A, Bean LJH, Bird TD, Fong CT, Mefford HC, Smith RJH et al. Seattle (WA); 1993.
Rahman N. Realizing the promise of cancer predisposition genes. Nature. 2014;505(7483):302–8.
Walsh T, Casadei S, Lee MK, Pennil CC, Nord AS, Thornton AM, Roeb W, Agnew KJ, Stray SM, Wickramanayake A, et al. Mutations in 12 genes for inherited ovarian, fallopian tube, and peritoneal carcinoma identified by massively parallel sequencing. Proc Natl Acad Sci U S A. 2011;108(44):18032–7.
Kanchi KL, Johnson KJ, Lu C, McLellan MD, Leiserson MD, Wendl MC, Zhang Q, Koboldt DC, Xie M, Kandoth C, et al. Integrated analysis of germline and somatic variants in ovarian cancer. Nat Commun. 2014;5:3156.
Wood Laboratory: Human DNA repair genes (last modified on Tuesday 15th April 2014). http://sciencepark.mdanderson.org/labs/wood/dna_repair_genes.html. Accessed 7 Apr 2016.
Sijmons RH: Identifying Patients with Familial Cancer Syndromes. In: Cancer Syndromes. edn. Edited by Riegert-Johnson DL, Boardman LA, Hefferon T, Roberts M. Bethesda (MD); 2009.
Li H. Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics. 2014;30(20):2843–51.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43(5):491–8.
Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164.
Genome of the Netherlands Consortium. Whole-genome sequence variation, population structure and demographic history of the Dutch population. Nat Genet. 2014;46(8):818–25.
Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):248–9.
Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nat Biotechnol. 2011;29(1):24–6.
Desmet FO, Hamroun D, Lalande M, Collod-Beroud G, Claustres M, Beroud C. Human splicing finder: an online bioinformatics tool to predict splicing signals. Nucleic Acids Res. 2009;37(9):e67.
Mi H, Muruganujan A, Casagrande JT, Thomas PD. Large-scale gene function analysis with the PANTHER classification system. Nat Protoc. 2013;8(8):1551–66.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat Genet. 2000;25(1):25–9.
Fong PC, Boss DS, Yap TA, Tutt A, Wu P, Mergui-Roelvink M, Mortimer P, Swaisland H, Lau A, O'Connor MJ, et al. Inhibition of poly(ADP-ribose) polymerase in tumors from BRCA mutation carriers. N Engl J Med. 2009;361(2):123–34.
Lord CJ, Ashworth A. BRCAness revisited. Nat Rev Cancer. 2016;16(2):110–20.
Greve JD, Decoster L, Shahi RB, Fontaine C, Vanacker L, Pauwels I, Denayer E, Brakeleer SD, Teugels E. Parp inhibitors. Belg J Med Oncol. 2016;10(7):263–75.
Antoniou AC, Easton DF. Models of genetic susceptibility to breast cancer. Oncogene. 2006;25(43):5898–905.
Teugels E, De Brakeleer S. An alternative model for (breast) cancer predisposition. NPJ Breast Cancer. 2017;3:13.

Acknowledgements

We thank Didier Croes for technical assistance with the HPC cluster and BRIGHTcore for providing the HPC platform. We also thank Diether Lambrechts for his valuable suggestions on the manuscript. This manuscript is dedicated to the patients and families who participated in this study.

Funding

The Wetenschappelijk Fonds Willy Gepts of the UZ Brussel, the Stichting Tegen Kanker, the Fund Armand Everaerts and the Fund Maaike Lars Trees of the Boudewijnstichting are acknowledged for their financial contributions to this study. The funding bodies had no role in study design, sample collection, data analysis, data interpretation or writing of the manuscript.

Availability of data and materials

All the relevant data are included in this published article (and its additional files).

Author information

Erik Teugels and Jacques De Grève contributed equally to this work.

Authors and Affiliations

Laboratory of Medical and Molecular Oncology (LMMO), Vrije Universiteit Brussel (VUB), Brussels, Belgium
Rajendra Bahadur Shahi, Sylvia De Brakeleer, Erik Teugels & Jacques De Grève
Brussels Interuniversity Genomics High Throughput core (BRIGHTcore) platform, Universitair Ziekenhuis Brussel (UZ Brussel) / Vrije Universiteit Brussel (VUB), Brussels, Belgium
Ben Caljon & Sonia Van Dooren
Breast Cancer Clinic, Oncologisch Centrum, Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium
Christel Fontaine & Marian Vanhoeij
Familial Cancer Clinic, Oncologisch Centrum, Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium
Ingrid Pauwels, Sofie Joris, Erik Teugels & Jacques De Grève
Centre for Medical Genetics, Reproduction and Genetics, Universitair Ziekenhuis Brussel (UZ Brussel) / Vrije Universiteit Brussel (VUB), Brussels, Belgium
Maryse Bonduelle & Sonia Van Dooren

Contributions

JDG and ET conceptualized the study. RBS performed the WES analysis and statistical testing in coordination with ET. RBS, ET and SDB performed the validation experiments. IP, SJ, CF, MV, MB, and JDG collected samples as well as written informed consent from breast cancer patients. Ben Caljon (BC) generated the exome sequencing data and SVD contributed data for controls. RBS, ET, and JDG drafted the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Erik Teugels or Jacques De Grève.

Ethics declarations

Ethics approval and consent to participate

Patient recruitment and blood sampling were performed according to the ethical procedures approved by the Institutional Ethics Committee of the UZ Brussel. All patients provided written informed consent for a broad genomic analysis, which also covered incidental findings in genes predictive for other diseases, and the study was conducted in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:

Breast cancer (BC) patients considered for the study, their age at diagnosis and the number of affected first-degree relatives in their families. (XLSX 11 kb)

Additional file 2:

Sample preparation, sequencing, analysis and variant validation workflow. (TIF 2207 kb)

Additional file 3:

A panel of cancer-associated genes with their corresponding genomic coordinates and the lists of genes from which they are extracted. (XLSX 47 kb)

Additional file 4:

Sequencing coverage details for each BC patient and control. (XLSX 23 kb)

Additional file 5:

Types and number of variants detected (exome- and CAGP-wide) before and after variant filtration in each BC patient and control. (XLSX 32 kb)

Additional file 6:

Enrichment of variant types in BC patients versus controls, exome- and CAGP-wide. (XLSX 10 kb)

Additional file 7:

List with all genes (exome-wide) presenting a PDAV in at least two BC patients and the number of control samples presenting a PDAV in the same gene. (XLSX 12 kb)

Additional file 8:

List with genes presenting a PDAV in three (or more) BC patients but not (or only once) in the control samples. GO annotations are shown for molecular function, biological processes and Reactome Pathways. (XLSX 13 kb)

Additional file 9:

Splice-site variants with their corresponding HSF and MaxEnt Scores from Human Splicing Finder 3.0. (XLSX 10 kb)

Additional file 10:

List with PDAVs detected in 54 BC patients located in genes linked to (hereditary) cancer and/or hereditary diseases. Each variant/gene is briefly commented. (DOCX 63 kb)

Additional file 11:

PDAVs found in the CAGP of the 120 control samples and their occurrence in the general population. (XLSX 12 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Shahi, R.B., De Brakeleer, S., Caljon, B. et al. Identification of candidate cancer predisposing variants by performing whole-exome sequencing on index patients from BRCA1 and BRCA2-negative breast cancer families. BMC Cancer 19, 313 (2019). https://doi.org/10.1186/s12885-019-5494-7

Received: 28 August 2018
Accepted: 20 March 2019
Published: 04 April 2019
DOI: https://doi.org/10.1186/s12885-019-5494-7
