Objective

Copy number variations (CNVs) are a well-recognized cause of genetic disease through the disruption of gene dosage and/or expression. However, their contribution to hereditary hearing loss (HL) has long been underestimated and remains an important question. Recently, a better appreciation of the CNV burden in HL patients has emerged, with one study estimating that CNVs are implicated in up to 18.7% of patients in whom a genetic cause of HL was identified [1]. Efforts are underway in the field not only to diagnose patients, but also to identify genes underlying HL [2, 3, 4]. The high frequency of CNVs in Mendelian phenotypes such as HL supports a SNP-microarray analysis strategy for identifying chromosomal aberrations in known and candidate genes [5].

In this project, to assess the contribution of CNVs to the diagnostic rate of HL, we ascertained HL patients in whom a molecular genetic diagnosis could not be determined from exclusionary GJB2 (DFNB1A) screening. Our analysis established STRC (DFNB16) as a frequent cause of congenital HL [6] and identified a rare syndromic form of HL caused by a de novo deletion in the chromosome 4q35.1q35.2 region [7]. The data have also identified patients with inconspicuous SNP-microarray findings who have advanced to projects that utilize high-throughput sequencing and bioinformatics analysis [8, 9]. The most interesting and impactful results from this work have been published. We have subsequently shifted our research efforts to whole exome sequencing of our HL cohort. However, we believe these SNP-microarray data may be of retrospective interest and offer valuable information to the scientific community.

Data description

Patient recruitment

Between February 2011 and May 2013, we consecutively recruited 99 patients with suspected hereditary HL and 9 unaffected family members, and studied genomic DNA extracted from their whole blood. Index patients with suspected environmental forms of HL were excluded. Family members were included, when possible, to enhance data analysis. Prior to investigation in this research setting, the patients had undergone routine diagnostic GJB2 screening that included Sanger sequencing and duplication/deletion analysis using a multiplex ligation-dependent probe amplification approach. Patients with homozygous or compound heterozygous pathogenic GJB2 variants were excluded from the study. In parallel, clinical records were collected and reviewed; they are summarized in Data File 1 (Table 1), which also includes familial relationships, if available.

Table 1 Overview of data files/data sets

Experimental protocols

The Illumina Infinium HD assay was performed according to the manufacturer’s instructions using 200 ng genomic DNA. The Infinium HumanOmni1-Quad v1.0 SNP-microarrays (Illumina) were scanned using either the BeadArray Reader or the iScan; the scanner used is given in the last column of Data File 1 (Table 1).

Data analysis

Unprocessed raw intensity data (.idat files) were generated and are listed as Data Set 1 in Table 1. Additionally, raw and normalized green and red intensities (GSE111131_Matrix_signal_intensities.txt.gz), as well as matrix processed data (GSE111131_Matrix_Processed.txt.gz), were assembled. For our study, data were loaded into GenomeStudio v.2011.1 software, and the B allele frequency and log R ratio were analyzed using Manifest H, cnvPartition 3.2.0, and QuantiSNP 2.2 [10]. The sample sheet that contains the information necessary to match the patient IDs with the sub-array data for this analysis is included in Data File 2 (Table 1).
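For readers less familiar with the two metrics named above, the following minimal Python sketch illustrates how the B allele frequency (BAF) and log R ratio (LRR) are commonly defined for a single SNP from normalized allele intensities and per-SNP genotype cluster centers. It is an illustration of the general definitions only, not the GenomeStudio, cnvPartition, or QuantiSNP implementation. The function names, argument names, and the assumption that the cluster centers satisfy t_aa < t_ab < t_bb are ours.

```python
import math


def to_theta_r(x, y):
    """Map normalized A-allele (x) and B-allele (y) intensities to (theta, R)."""
    theta = (2.0 / math.pi) * math.atan2(y, x)  # 0 = pure A signal, 1 = pure B signal
    r = x + y                                    # total signal intensity
    return theta, r


def _interp(t, t0, t1, v0, v1):
    """Linearly interpolate between the points (t0, v0) and (t1, v1)."""
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def b_allele_freq(theta, t_aa, t_ab, t_bb):
    """BAF: theta rescaled so that the AA, AB and BB cluster centers
    (hypothetical inputs t_aa < t_ab < t_bb) map to 0, 0.5 and 1."""
    if theta < t_aa:
        return 0.0
    if theta < t_ab:
        return _interp(theta, t_aa, t_ab, 0.0, 0.5)
    if theta < t_bb:
        return _interp(theta, t_ab, t_bb, 0.5, 1.0)
    return 1.0


def expected_r(theta, t_aa, t_ab, t_bb, r_aa, r_ab, r_bb):
    """Expected intensity at theta, interpolated between cluster centers."""
    if theta <= t_aa:
        return r_aa
    if theta < t_ab:
        return _interp(theta, t_aa, t_ab, r_aa, r_ab)
    if theta < t_bb:
        return _interp(theta, t_ab, t_bb, r_ab, r_bb)
    return r_bb


def log_r_ratio(r_observed, r_expected):
    """LRR: log2 of observed over expected total intensity."""
    return math.log2(r_observed / r_expected)
```

Under these definitions, a region with two copies is expected to show LRR values near 0 and BAF values clustered around 0, 0.5, and 1, whereas a heterozygous deletion typically shows negative LRR values together with loss of the 0.5 BAF band.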

Limitations

This study was undertaken to initiate screening of a cohort of diagnostically unresolved HL patients. Of particular interest was obtaining a greater understanding of the contribution of CNVs to hereditary HL, which was underappreciated at the time of study initiation. As this was a pilot study, our intention was to screen a small cohort of 99 patients to gain insight into our primary research aims and then publish the most interesting findings separately [6, 7]. As our work advanced to include high-throughput sequencing of genes involved in HL, it became evident that many clinically relevant mutations reside beyond the resolution of the SNP-microarrays [8].

One further limitation relates to the clinical overview of the patients (Data File 1, Table 1). As this study was conducted between 2011 and 2013, any progression of HL, or any syndrome that manifested after HL was diagnosed and the clinical chart review was completed, is not included. Thus, these data may not be well suited for genome-wide association studies, but can nonetheless be included in data collections investigating other disorders, with the disclaimer that these disorders, especially adult-onset disorders in patients who were recruited as children, cannot be conclusively excluded.

Well-known technical limitations of SNP-microarrays include the inability to detect balanced translocations, copy-neutral alterations, and inversions that may nonetheless be relevant [11] for the clinical diagnosis of HL [12, 13].