Background

Copy-number variations (CNVs) are a major cause of Mendelian disorders [1]. CNV detection using Array-CGH and SNP-array has revolutionized the diagnostic approach and identified the molecular basis of many genetic diseases [2, 3]. Therefore, Array-CGH has become an essential routine diagnostic tool for various indications, including developmental disabilities and congenital anomalies [4]. It has replaced conventional cytogenetic methods for most conditions [2] leading to a rapid increase in the detection of new microdeletion and microduplication syndromes [5]. More recently, the advent of next-generation sequencing (NGS) led to the detection of much smaller events (down to a single exon) thanks to the development of novel tools [6, 7].

The growing number of detected CNVs, especially small ones, has highlighted the need for a comprehensive classification to assess the relationship between a CNV and a given phenotype. The American College of Medical Genetics and Genomics (ACMG) and the Clinical Genome Resource (ClinGen) have jointly proposed guidelines for standardizing the interpretation of copy number variants [8,9,10]. For example, a CNV is reported as a variant of uncertain significance (VUS) when the available evidence is insufficient to determine clinical relevance unequivocally. Such CNVs include, for example, those described with conflicting interpretations in multiple publications or databases.

Small CNVs that include few genes are difficult to characterize because fewer overlapping CNVs have been described in the literature. Consequently, they less frequently lead to a diagnosis. Thus, the increasing resolution of aCGH and SNP-array has led to the detection of more VUS. However, existing databases, such as the Database of Genomic Variants (DGV) [11] or ClinVar [12], expand periodically. These new data may add information for interpreting a given CNV, justifying a reanalysis.

VUS are a recurrent problem in medical genetics. First, a VUS does not lead to a conclusive diagnosis or appropriate genetic counseling. Management and treatment are thus difficult to define. Second, it is difficult for the patient to deal with uncertainty concerning the impact of a VUS on clinical management [13, 14]. Finally, it is also challenging for the medical geneticist to decide whether to pursue diagnostic investigations.

As CNV classification largely depends on the available literature, local initiatives have led to the regular updating of CNV classification [15, 16]. Similar work on NGS data concluded that systematic reanalysis is useful when performed more than 18 months after the original report [17, 18]. However, no clear recommendations have been published regarding the optimal interval before CNV reanalysis, and the effectiveness of systematic reinterpretation has not yet been assessed [19, 20]. Altogether, these observations highlight the need for guidelines for CNV reinterpretation. This study examined the usefulness of reinterpreting CNVs of uncertain significance.

Methods

Study design and characteristics of the cohorts

This retrospective monocentric French study covers the period from January 2010 to December 2017. The cohort was composed of all patients seen by a medical geneticist for whom array-CGH testing was performed at CHRU Nancy’s medical genetics laboratory (1641 patients, Additional file 1: Fig. S1). Only patients who agreed to participate in the study are reported. The final cohort included 259 patients with CNVs of uncertain significance (Additional file 1: Fig. S2, Additional file 1: Table S1).

DNA samples

During routine care, peripheral blood, chorionic villi, or amniotic fluid were collected from the proband and peripheral blood from their parents (when available). DNA was extracted using the QIAamp DNA Kit (QIAGEN), either manually or using the QIAcube instrument, according to the manufacturer’s instructions.

Array-based comparative genomic hybridization

aCGH was performed using 180 K oligonucleotide arrays (Agilent, Santa Clara, CA) with an average resolution of 25 kb. DNA preparation and hybridization procedures were performed according to the manufacturer’s instructions. Data were analyzed using genome build NCBI36/hg18 until 2011 and GRCh37/hg19 from 2012. If a chromosomal aberration was detected, further studies were performed using complementary methods (e.g., FISH, qPCR) depending on the finding [21] (Additional file 1: Fig. S1). Agilent CytoGenomics software was used to visualize CNVs. Finally, CNVs were interpreted using Cartagenia (Bench Lab CNV, Agilent) with the DECIPHER and DGV databases and UCSC tools. In 2016, our laboratory set an analysis threshold (500 kb in prenatal and 200 kb in postnatal testing) to reduce VUS detection and shorten the diagnostic delay.

Data re-analyses

A total of 372 CNVs were interpreted independently by two genetic biologists specializing in neurodevelopmental diseases between November 2019 and March 2020. Results were then discussed during a multidisciplinary meeting with the clinical geneticists who had prescribed the original analysis. No systematic reanalysis procedure was yet in place in our center, which explains why reclassification had not been done at patient follow-up.

NCBI36/hg18 CNV boundaries were converted to the GRCh37/hg19 build using the UCSC liftOver tool [22] (https://genome.ucsc.edu/cgi-bin/hgLiftOver). CNVs were then annotated with AnnotSV version 3.2.3 [23, 24]. In parallel, an MS Excel workbook listing all CNVs was built to allow efficient, standardized analysis accessible from any computer station. Several hyperlinks to public databases were created based on the genomic coordinates, variant type, and gene content. The following databases were interrogated: UCSC Genome Browser [22] (http://genome.UCSC.edu), the Database of Genomic Variants [11] (http://dgv.tcag.ca/), OMIM (https://omim.org/), ClinGen (https://clinicalgenome.org/) [25], ClinVar [12] (https://www.ncbi.nlm.nih.gov/clinvar/), filtering for CNVs with at least one star (classification criteria provided), and the DECIPHER database [26] (https://decipher.sanger.ac.uk/). Finally, a thorough PubMed (https://pubmed.ncbi.nlm.nih.gov/) search was performed based on (1) the genes included in the CNV and (2) the chromosome band and type of CNV (deletion/duplication).
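The coordinate-based bookkeeping described above can be sketched in a few lines of Python. This is only an illustration and not the workbook actually used in the study: the function names are hypothetical, the array coordinates are assumed to be 1-based and inclusive, and the free-text label in the fourth BED column is an assumption about how the variant type might be carried along. The UCSC `hgTracks` and PubMed `?term=` query URLs are the public search entry points of those sites.

```python
from urllib.parse import urlencode


def to_bed_line(chrom, start_1based, end_1based, label):
    """Format one CNV as a BED line.

    BED intervals are 0-based and half-open, so only the start is shifted
    by one (assumes the input coordinates are 1-based and inclusive).
    """
    return f"{chrom}\t{start_1based - 1}\t{end_1based}\t{label}"


def ucsc_link(chrom, start, end, assembly="hg19"):
    """Build a UCSC Genome Browser link centered on the CNV."""
    query = urlencode({"db": assembly, "position": f"{chrom}:{start}-{end}"})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


def pubmed_link(search_terms):
    """Build a PubMed search link from free-text terms (gene, band, CNV type)."""
    return f"https://pubmed.ncbi.nlm.nih.gov/?{urlencode({'term': search_terms})}"


# Example with the de novo 1q21.1 duplication reported in the Results.
bed_line = to_bed_line("chr1", 145818702, 147824207, "DUP")
browser_url = ucsc_link("chr1", 145818702, 147824207)
literature_url = pubmed_link("1q21.1 duplication")

print(bed_line)
print(browser_url)
print(literature_url)
```

Generating the links from the coordinates, rather than typing them by hand, keeps every reviewer looking at exactly the same interval in each database.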

Each CNV was then manually curated and classified as pathogenic, likely pathogenic, likely benign, benign, or of uncertain significance according to ACMG/ClinGen guidelines [8,9,10]. The workflow of our approach is summarized in Fig. 1.

Fig. 1

Representation of our approach during reanalysis. The workflow is based on the 2020 ACMG recommendations (see sections on the right). From the list of aCGH analyses, we extracted the CNVs classified as being of uncertain significance (left of the figure). This list allowed us to create a BED file that was submitted to the AnnotSV website. Based on the resulting file and clinical information (gender, phenotype), we built an Excel spreadsheet summarizing both files and creating links to major online databases. Briefly, the gene content and frequency in the general population were checked using UCSC and DGV. ClinGen was used to determine dosage sensitivity and OMIM to assess possible morbid genes in the region. Finally, reported associations were assessed using PubMed, DECIPHER, and ClinVar. All these data were combined in one slide presentation and analyzed using the 2020 ACMG recommendations and ClinGen expert reports, if available.

Statistical methods

Statistical analysis was performed using R software (R version 3.6.0, 2019-04-26 [27]). Percentages included only cases with known data for each feature; cases with missing data were excluded. Statistical analyses included linear regression with calculation of the coefficient of determination (R2) and comparison of the regression slopes (P < .05 was considered statistically significant).
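To make the slope comparison concrete, the sketch below fits two regression lines and computes a t statistic for the difference between their slopes. It is written in Python purely for illustration (the study itself used R), the yearly fractions are invented placeholder values and not the study data, and the pooled-variance form of the test is one common textbook variant that may differ from the exact procedure the authors applied. The P value would be read from a t distribution with the returned degrees of freedom.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x, with R2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1.0 - ss_res / syy,
        "sxx": sxx,
        "ss_res": ss_res,
        "n": n,
    }


def slope_difference_t(fit_a, fit_b):
    """t statistic and degrees of freedom for H0: both slopes are equal.

    Uses a residual variance pooled over both regressions.
    """
    df = fit_a["n"] + fit_b["n"] - 4
    pooled_var = (fit_a["ss_res"] + fit_b["ss_res"]) / df
    se_diff = sqrt(pooled_var / fit_a["sxx"] + pooled_var / fit_b["sxx"])
    return (fit_a["slope"] - fit_b["slope"]) / se_diff, df


# Hypothetical cumulative reclassification fractions per year (placeholders).
years = [1, 2, 3, 4, 5, 6, 7, 8]
all_reclassified = [0.05, 0.08, 0.13, 0.17, 0.20, 0.26, 0.29, 0.34]
upgraded_only = [0.00, 0.01, 0.01, 0.02, 0.02, 0.02, 0.03, 0.03]

fit_all = fit_line(years, all_reclassified)
fit_up = fit_line(years, upgraded_only)
t_stat, dof = slope_difference_t(fit_all, fit_up)

print(f"slope (all): {fit_all['slope']:.4f}, R2: {fit_all['r2']:.3f}")
print(f"slope (upgraded): {fit_up['slope']:.4f}, R2: {fit_up['r2']:.3f}")
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```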

Ethical issues

This study was registered on clinicaltrials.gov (NCT04575350). Authorizations from patients, or their parents, were obtained at the time of genetic analysis with a signed consent form. As patients are systematically notified that their results may change over time, no specific consent for this study was signed. Ethical approval for this study was obtained from CHRU Nancy’s local ethical committee.

Results

Of the 1641 array-CGH analyses, 259 patients (15.8%) with 372 CNVs of uncertain significance were documented (Additional file 1: Fig. S2, Additional file 1: Table S1). Patients ranged in age from the prenatal period to 72 years. Male and female patients were equally represented. Most patients were seen for malformation, intellectual disability, or autism (Additional file 1: Table S1).

After reanalysis of all cases with a reported VUS, 106 of the 259 patients (40.9%) had a revised classification. In total, 112 CNVs were downgraded in pathogenicity (30.1%; 52 to benign and 60 to likely benign), and 12 were upgraded (3.2%; 5 to likely pathogenic and 7 to pathogenic; Fig. 2, Table 1). Of the CNVs first reported as VUS, 76% were smaller than 500 kb (Additional file 1: Table S2). Notably, in later years, larger CNVs made up a greater proportion of the findings (CNVs > 500 kb: 1/46, 2% in 2010 vs. 18/48, 38% in 2017). Deletions and duplications were found in similar proportions (54% vs. 46%, respectively). Most CNVs (86.5% of those with known inheritance) were inherited.

Fig. 2

VUS with variant classification changes. A Reclassification of CNVs into each category: 52 CNVs of uncertain significance were reclassified as benign, 60 as likely benign, 5 as likely pathogenic, and 7 as pathogenic. B Cumulative reclassification rate, plotted as the cumulative fraction of reclassified variants for each year. The rate was calculated using either only CNVs reclassified as pathogenic or likely pathogenic (blue line) or all reclassified CNVs (red line). The extrapolated slopes for the change in VUS classification are 0.4% and 4.2% per year, respectively. The linear trend indicates that the reclassification rate is constant from year to year. The R2 and slope values were calculated using linear regression.

Table 1 Patients with a CNV previously described as VUS classified as pathogenic and likely pathogenic

Next, we examined the likelihood of reclassification over time. CNVs were reclassified as benign or likely benign because of new DGV information. In contrast, CNVs were reclassified as pathogenic or likely pathogenic thanks to ClinGen or new literature referenced in PubMed. All CNVs reclassified as pathogenic or likely pathogenic were classified as such by AnnotSV. The reclassification rate differs markedly depending on whether all reclassified CNVs or only CNVs reclassified as pathogenic or likely pathogenic are considered (P-LP, 0.4% change per year; all reclassified VUS, 4.2% change per year; p = 0.003, Fig. 2).

Most CNVs reclassified as benign were short (< 500 kb, n = 95, Fig. 3). However, no association was found between CNV type and class change.

Fig. 3

Variants reinterpreted in the study. Number of reclassified CNVs depending on CNV size (A), CNV type (B), patient phenotype (D), or AnnotSV ranking, shown for downgraded (benign/likely benign, B/LB) and upgraded (likely pathogenic/pathogenic, LP/P) CNVs.

The twelve CNVs reclassified as pathogenic or likely pathogenic are summarized in Table 1. Three groups are distinguished: (1) CNVs with full penetrance; (2) CNVs with incomplete penetrance; (3) neurodevelopmental predisposing factors.

A de novo Xq26.2 deletion was identified in 2010 in a 19-year-old woman with mild intellectual disability who was also heterozygous for the familial 21q21.1 duplication already known to be associated with her intellectual disability. Random X inactivation was noted (78%/22%). This Xq26.2 deletion includes the minimal critical region of Simpson-Golabi-Behmel syndrome, in particular GPC3 [28,29,30,31]. Milder forms of this X-linked syndrome have been previously reported in female patients [30]; only four female cases are now documented in the literature. Given the increased incidence of neoplasia described in this syndrome, monitoring would be needed [32].

In 2017, we identified a 1q24.3 deletion in a fetus presenting with intrauterine growth restriction (IUGR). This CNV was inherited from the mother. During our reanalysis, we found that this CNV had since been reported in association with short stature, microcephaly, brachydactyly, dysmorphic facial features, and intellectual disability [33,34,35]. Two inherited cases have been previously reported [34]. Our deletion, which contains part of the DNM3 gene as well as miR199 and miR214, both harbored within intron 14, is included in the minimal region linked to the syndrome in 2018 [34]. Moreover, the mother also presented this skeletal phenotype: she was 1.54 m tall and had brachydactyly.

Four CNVs are described with incomplete penetrance.

A small deletion of 191 kb in 4q31.23, detected in 2010, was reclassified following our work as likely pathogenic in a girl presenting with global developmental delay and pseudohypoaldosteronism. She inherited the deletion from her mother. This deletion contains NR3C2 (OMIM* 600983), encoding an aldosterone receptor. The implication of this CNV in pseudohypoaldosteronism was suspected, and conclusive evidence is now available [36,37,38,39,40]. The mother did not present symptoms, and asymptomatic patients have already been reported [39]. However, this CNV does not explain the neurodevelopmental delay.

Two copy-gain CNVs encompassing SOX3 were found in our cohort in 2015 and 2017 in two fetuses: a duplication in a male fetus presenting with spina bifida (arr [hg19] Xq27.1(139103383_139801281)x2) and a duplication-triplication in a male fetus with acrania (arr [hg19] Xq27.1(139103383-139763381)x2~3). SOX3 duplications are implicated in variable phenotypes, including myelomeningocele in both sexes, intellectual disability (of varying severity), and growth hormone deficiency (including panhypopituitarism) in males [41]. Hureaux et al. (2019) conducted a study on a fetal cohort showing that these SOX3 gene duplications are involved in neural tube closure defects [42]. To our knowledge, a SOX3 duplication-triplication has never been reported.

One of the CNVs, a de novo 1q21.1 duplication (arr [hg19] chr1:(145818702-147824207)x3 dn, SCV001480529), was identified in 2011 in a 2-year-old girl presenting with vaginal aplasia, unilateral renal agenesis, and benign myoclonic epilepsy. This duplication is larger than the classical distal 1q21.1 duplication (hg19, chr1:146577486-147394506). The extension of the recurrent duplication is of particular interest because microdeletions and microduplications of the distal 1q21.1 region have been linked, since the initial analysis, to various disorders, including Mayer–Rokitansky–Küster–Hauser syndrome (MRKH, MIM% 277000) and autism [43, 44]. MRKH is a congenital malformation characterized by impaired Müllerian duct development resulting in a missing uterus and variable degrees of upper vaginal hypoplasia. Chen and colleagues reported a woman presenting with MRKH associated with a 31.48 kb deletion (chr1:146778208-146809687) [34]. As no duplications have been reported in this region, we could not conclude that this CNV explained our patient’s malformations. However, we considered this CNV pathogenic, at least as a recurrent neurodevelopmental predisposing factor, thanks to proven haploinsufficiency sensitivity. The myoclonic epilepsy could be linked to the 1q21.1 duplication syndrome.
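The coordinate reasoning in this case (a patient duplication that extends beyond the classical distal region, which in turn covers the reported MRKH-associated deletion) can be checked with simple interval arithmetic. The Python sketch below is illustrative only: the `Interval` helper is hypothetical, all coordinates are the hg19 values quoted above, and sizes are computed assuming 1-based inclusive coordinates.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A genomic interval with 1-based, inclusive coordinates (assumption)."""
    chrom: str
    start: int
    end: int

    def size(self):
        """Length in base pairs, counting both end positions."""
        return self.end - self.start + 1

    def contains(self, other):
        """True if `other` lies entirely within this interval."""
        return (
            self.chrom == other.chrom
            and self.start <= other.start
            and other.end <= self.end
        )


patient_duplication = Interval("chr1", 145818702, 147824207)
distal_1q21_region = Interval("chr1", 146577486, 147394506)
mrkh_deletion = Interval("chr1", 146778208, 146809687)

# The patient's duplication fully covers the classical distal region...
print(patient_duplication.contains(distal_1q21_region))
# ...and the distal region covers the small deletion reported with MRKH.
print(distal_1q21_region.contains(mrkh_deletion))
# Sizes in kb, for comparison with the values given in the text.
for cnv in (patient_duplication, distal_1q21_region, mrkh_deletion):
    print(f"{cnv.chrom}:{cnv.start}-{cnv.end} {cnv.size() / 1000:.2f} kb")
```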

Six of the twelve likely pathogenic or pathogenic CNVs are neurodevelopmental disorder/autism spectrum disorder (ASD) predisposing factors, four of which explain part of the patient’s pathology (Table 1). Three were recurrent variations, reclassified thanks to detailed descriptions on ClinGen’s “Recurrent CNV” list (https://search.clinicalgenome.org/kb/gene-dosage/cnv): a 15q11.2 deletion, a 16p11.2 duplication, and a 17q12 duplication. Two other patients harbored a 2p16.2 deletion, diagnosed in 2012, including NRXN1, which was curated in 2017 with sufficient evidence of haploinsufficiency and strong proof of pathogenicity [45, 46]. One patient had a heterozygous 1.106 Mb deletion on 2q12.3q13 identified in 2014. It was recently associated with developmental delay and behavioral problems [47].

Discussion

During this 8-year study period, 106 of the 259 patients (40.9%) had a revised CNV classification. In our center, VUS had a yearly reclassification rate of 4.2%.

A clinically significant change occurred for 12 patients (4.6%).

Ten CNVs reclassified as pathogenic or likely pathogenic explained, at least partially, the patients' pathology (Table 1). These diagnoses have two main consequences: (1) clarifying the pathogenicity of the CNV allows appropriate genetic counseling; (2) a molecular diagnosis avoids lengthy and often very expensive analyses, and reducing invasive and painful procedures improves compliance and follow-up. Moreover, from an economic point of view, reanalysis is far more cost-effective than performing a new analysis. These statements need to be tempered for some patients, in particular those harboring predisposing factors (half of the patients in our study), for whom additional tests such as exome sequencing are needed.

Finally, in our study, reclassifications to pathogenic and likely pathogenic were due to the publication of new papers, highlighting the need for teams to publish cases and to work collaboratively. International cooperation and studies are crucial for VUS reinterpretation.

Among the twelve CNVs reclassified as pathogenic or likely pathogenic, six are predisposing factors for neurodevelopmental disorder/autism spectrum disorder (ASD). The complexity of interpreting these CNVs is related to the inherent difficulty in assessing their level of involvement in the patient’s phenotype and whether further genetic testing is warranted. The reason for reclassification was the publication of reports by expert groups confirming the role of predisposing factors (15q11 region, dosage ID: ISCA-37448, for example). Our work highlights the importance of the ClinGen dosage sensitivity map that has already been used for ClinVar CNV reclassification [48].

For 112 (30.1%) CNVs, a link with the patient’s phenotype was ruled out. The revised interpretation was due to an overlapping CNV found in the general population in at least one cohort. Most of these CNVs had no gene content or were small intragenic CNVs away from the coding sequence. Significant changes were linked to DGV enrichment over the years (http://dgv.tcag.ca/dgv/app/downloads?ref=GRCh37/hg19#articles_cited). As has already been frequently stated, the identification of a VUS has a significant impact on both patient and practitioner: stressful announcement, uncertainty as to what action to take, incomplete genetic counseling, or even difficulty in knowing what level of information to divulge [49, 50]. In our study, we could inform half of the patients of the non-pathogenicity of their variation.

Some CNVs were of interest. We report a new female case with Simpson-Golabi-Behmel syndrome. We highlight the possible association between a specific portion of 1q21.1 deletion and MRKH syndrome. Moreover, we further delineate the 1q24.3 deletion and highlight the role of two microRNAs (miR199 and miR214) located within intron 14 of DNM3. The role of SOX3 in neural tube defects is also reinforced by two copy number gains associated with acrania and spina bifida.

In this work, we propose an efficient strategy for CNV reanalysis with reproducibility in the analysis method and the tools used.

As all the CNVs reclassified as (likely) pathogenic were identified as such by AnnotSV, this tool showed high sensitivity in our cohort (Additional file 1: Table S3, specificity = 75%, sensitivity = 100%). Indeed, of the 372 CNVs requiring reanalysis, only 102 (20 + 82) would need manual curation. Including AnnotSV in the workflow therefore reduces the number of CNVs to be manually analyzed by a factor of almost four when only CNVs classified by AnnotSV as pathogenic or likely pathogenic are considered. However, the precision remains low: of the 102 positive cases, only 12 (11.8%) were correctly classified as (likely) pathogenic.
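The figures quoted above follow directly from the counts given in the text, as the short Python sketch below shows. The function name is hypothetical; the inputs are the 372 reanalyzed CNVs, the 102 flagged by AnnotSV as (likely) pathogenic, the 12 confirmed after manual curation, and zero missed upgrades, with manual curation taken as the reference standard.

```python
def screening_metrics(total, flagged, true_positives, false_negatives):
    """Derive screening performance from the counts reported in the text."""
    false_positives = flagged - true_positives
    true_negatives = total - flagged - false_negatives
    return {
        "sensitivity": true_positives / (true_positives + false_negatives),
        "specificity": true_negatives / (true_negatives + false_positives),
        "precision": true_positives / flagged,
        # How many times fewer CNVs need manual curation after filtering.
        "workload_reduction": total / flagged,
    }


metrics = screening_metrics(
    total=372, flagged=102, true_positives=12, false_negatives=0
)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The low precision is expected for a first-pass filter: the value of the tool here lies in missing no upgraded CNV while shrinking the list that needs expert review.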

As the rate of reinterpretation seems constant over the years, we cannot determine a specific interval for efficient reanalysis (Fig. 2). Based on previous reports analyzing the reinterpretation of NGS data [51], we started our work two years after the latest report. Consequently, we recommend reinterpreting CNVs at least every 2 years. Implementing an automatic monitoring system would also be a solution [52]. CNVs reclassified as pathogenic or likely pathogenic did not have specific characteristics. Consequently, reinterpretation should not be limited to de novo CNVs, large CNVs, or copy number losses.

The monocentric design is a limitation of this study. Moreover, our results are tempered by the high prevalence of CNVs acting as risk factors; further investigations are needed for these patients. In addition, it is unclear whether such a high rate of VUS reclassified as benign will remain stable, as the DGV database is now more comprehensive than in 2010.

Conclusions

In conclusion, based on our long-term experience of array-CGH analysis, systematic reanalysis of CNVs of uncertain significance should be considered standard practice for all genetics laboratories. Patients with a CNV of uncertain significance should have their results reinterpreted at least every two years and before further genetic testing. The clinician should warn the patient at the time of the prescription that the outcome may change depending on the state of knowledge. Moreover, as no fully automatic system is yet available, and based on ACMG guidelines, it should also be the responsibility of the clinician to prescribe such reanalysis at each follow-up consultation. The points raised in this article on the reinterpretation of array-CGH results also apply to the reinterpretation of CNVs detected by NGS analyses.