Background

Approximately 1 in every 500 newborns exhibits a degree of hearing loss, and more than half of cases are associated with genetic mutations [1]. Genes responsible for hereditary hearing loss are highly heterogeneous. Recent advances in clinical genome sequencing, focusing on known deafness gene panels, have been used to efficiently detect pathogenic variants, inform appropriate clinical intervention (such as cochlear implants), and estimate prognosis, in terms of symptoms [2,3,4]. To date, more than 100 genes have been reported as associated with non-syndromic hearing loss [5]. Further, according to Online Mendelian Inheritance in Man (OMIM), hundreds of genes are associated with syndromic hearing loss. Targeted resequencing of deafness genes is cost-effective and, therefore, beneficial for diagnostic purposes [3]; however, it is not suitable for detection of very rare or novel deafness genes.

Whole exome sequencing (WES), involves sequencing of coding exons comprising approximately 2% of the whole human genome, which are estimated to contain approximately 85% of pathogenic variants associated with monogenic disease [6]. For efficient identification of pathogenic variants in patient samples by analysis of WES data, detected variants are often categorized in several groups, where those in genes previously associated with the clinical features of interest are the first priority for analysis [7, 8]. Sets of prioritized genes can be modified during analysis to increase the number of targeted genes, without resequencing the same samples. For comprehensive investigation of the genetic heterogeneity of diseases with a wide range of causative genes, such as hearing loss, and to identify novel candidate genes, WES analysis overcomes the limitations of targeted analysis and is considerably more cost-effective than whole genome sequencing (WGS) analysis.

In this study, we sought to explore the wide spectrum of genetic heterogeneity associated with hearing loss in Japan, and to discover novel candidate genes associated with hearing loss, using trio analysis of probands and their parents, and four originally developed gene groups ranked by priority (tiers), as a new strategy to filter candidate variants. Using this strategy, we successfully detected candidate pathogenic variants in 11 previously reported deafness genes in 21 families, as well as eight single candidate deafness genes in 10 families.

Methods

Editorial policies and ethical considerations

The Ethics Review Committees of the National Hospital Organization Tokyo Medical Center (approval number: R1-0703009) and all collaborating institutes approved the study procedures. All procedures were conducted after written informed consent had been obtained from each subject or their parents.

Subjects

All subjects were patients visiting the National Hospital Organization Tokyo Medical Center or collaborating hospitals. Medical histories were obtained, and clinical information, such as the results of physical, audiological, and blood tests, were collected from subjects and family members, when available. Hearing loss severity was determined according to the recommendations of the Genetic Deafness study group, using audiological tests, including pure-tone audiometry, auditory steady-state response, conditioned orientation reflex audiometry, or play audiometry, depending on the age of the patient and availability [9]. Subjects with hearing loss related to environmental factors, such as meningitis, premature birth, and rubella, were excluded.

Genetic analysis

Genomic DNA was obtained from blood samples collected from probands and their family members, mostly parents. Probands with known high prevalence deafness gene variants, and those with specific clinical features suggesting subsets of deafness genes, were filtered using the following methods. All probands were screened for GJB2 or mitochondrial m.1555A>G and m.3243A>G variants, which are frequently detected in Japanese patients with hereditary hearing loss, as described previously [10]. Probands were also screened for SLC26A4 variants when enlarged vestibular aqueduct was detected by computed tomography (CT), or when they were not examined by CT. Probands with auditory neuropathy, which manifests as normal otoacoustic emission and loss of auditory brainstem responses, were subjected to Sanger sequencing analysis of OTOF [11]. To rule out congenital cytomegalovirus infection, PCR examination for cytomegalovirus in the preserved umbilical cords of probands was conducted, when samples were available.

WES protocols have been reported previously [12]. In brief, genomic DNA extracted from blood was subjected to whole exome region capture using a Nextera Rapid Capture Exome kit (Illumina) [13] and to massively parallel sequencing using the HiSeq2500 platform (Illumina). Sequence reads were mapped onto the human reference genome (GRCh37) with a decoy sequence (hs37d5), using BWE-mem (v.0.7.5a), and variants were called using the Picard (v.1.106) and Genome Analysis Toolkit 3.4.46 (GATK) [14]. Individual variants were joint-called, together with in-house data (WES, n = 498 and WGS, n = 1037) [15] using GenotypeGVCFs. Variants were then annotated using Annovar [16]. Variants in repeat elements, low complexity regions, or considered to result from strand bias, were omitted from further analyses. Average mapping rate, read depth, and numbers of SNVs and indels, are presented in Additional file 1.

A schematic flowchart of the WES analysis conducted in this study is shown in Additional file 2. To identify candidate pathogenic changes, variants predicted to alter the encoded protein were first filtered according to minor allele frequency (MAF), as previously descried [12]. In brief, a threshold MAF of < 0.001 was applied for AD inheritance mode analysis of global public databases (Database of Single Nucleotide Polymorphisms (dbSNP) [17], East Asian population of 1000 Genomes [18], NHLBI Exome Variant Server (ESP6500), Exome Aggregation Consortium (ExAC) [19], Genome Aggregation Database (gnomAD) [20], Human Genetic Variation Database (HGVD) ver1.42 based on 1208 healthy Japanese subjects [21], and an in-house database including 1037 healthy Japanese subjects [15]; and a MAF threshold of < 0.003 applied for sporadic cases and AR inheritance mode analysis of global databases, except that a threshold of < 0.005 was used for the HGVD and in-house databases, as previously described [12]. Variants were further excluded out from candidates if all the in silico analyses (LRT, LR, Mutation Assessor, Mutation Taster, Polyphen 2-HDIV, Polyphen 2-HVAR, RadialSVM, SIFT) predicted no, benign, or tolerated effect of the variant. The effect of the splice site variants was predicted by MaxEntScan [22] and Human Splice Finder 3.0 [23] with default threshold values.

Remaining variants were prioritized in four categories before analysis of co-segregation with the disease: (1) Tier 1 genes were reported as associated with non-syndromic, syndromic hearing loss, and diseases including hearing loss as a non-characteristic symptom registered in OMIM (n = 293, gene list in [12]); (2) Tier 2 genes were associated with hearing loss in animal models by the Mouse Genome Informatics [24] or International Mouse Phenotyping Consortium [25], and not included in Tier 1 (n = 328, gene list in [12]); and (3) Tier 3 genes were expressed at > twofold higher levels in M. fascicularis cochlea than in other tissues [26] and were not included in Tier 1 or Tier 2 (n = 305; Additional file 3). Genes with high expression levels in M. fascicularis cochlea are enriched for deafness genes and may therefore contain novel candidates [26]. In total, 926 genes were categorized in Tiers 1–3. Genes not included in Tiers 1–3 were categorized as Tier 4.

Among selected variants co-segregating with hearing loss, those in Tier 1 genes were searched in the OMIM, Human Genome Mutation Database (HGMD) (last accessed March 12, 2019) and ClinVar (last accessed March 12, 2019) to determine the consistency of the clinical features of the individuals in this study, according to PP4 criterion in the ACMG guidelines [27]. Genes associated with syndromic hearing loss were excluded if they met the following criteria: (1) the variants had not been reported as pathogenic or likely pathogenic, and (2) the proband did not exhibit the characteristic symptoms of multiple organ disease caused by that gene.

Remaining candidate variants were subjected to PCR and Sanger sequencing. Primer sets used in this study are shown in Additional file 5. Representative electropherograms of variants detected in each proband are shown in Additional file 6 and Additional file 10.

Assessment of large deletion allele of STRC

A suspected homozygous deletion of STRC, mapping to chromosome 15q15.3 [28, 29], and detected in patients using IGV [30], was validated by MLPA (kit P461, MRC-Holland, Amsterdam, Netherlands), according to the manufacturer’s protocols. Copy numbers of exon 19 and the 5′ flanking region of STRC were also examined by duplicated quantitative PCR (qPCR) in subjects and their family members. Primers for copy number quantification of the exon 10 region of MYO7A (NM_000260.3) were used as a reference. Primer sets used in this study are presented in Additional file 5.

Results

Overview of subjects

Seventy-two families including 215 individuals (71 families with the proband and parents, and one with the proband and mother) were recruited for this study (Table 1). Most of the participants were Japanese, while a father with normal hearing in one family was Korean. Within the families, the majority of probands appeared to be sporadic cases (52 families, 72%). In addition, 9 (13%) and 10 (14%) families were presumed to have autosomal-dominant (AD) and autosomal-recessive (AR) inheritance modes, respectively, based on the symptoms of family members. The inheritance mode was not determined in one family, since the proband and both parents had hearing loss. The majority of probands had non-syndromic hearing loss (58 families, 81%).

Table 1 Number of families participated in this study

Detection of candidate pathogenic variants in previously known deafness genes

In WES analysis, a mean read depth of approximately 144 with > 99.9% average mapping rate, was obtained (Additional file 1). Approximately 0.4% of the targeted regions (848 out of 212,158 regions) showed insufficient read depth (< 20) consistently. Among them, 5 regions were included in Tier 1 genes (Additional file 4). In Tier 1 genes, approximately 0.67% of targeted bases (4,644 out of 697,091 bases) showed insufficient read depth consistently.

By trio analysis, 11 previously reported deafness genes were considered to be responsible for hearing loss in 21 families (Fig. 1). As described in “Methods” section, probands were prescreened for GJB2 variants, including the m.1555 A>G and m.3243 A>G variants, as well as SLC26A4 and OTOF variants, depending on their clinical features. All detected genes, genotypes, diseases, and clinical features of probands are summarized in Table 2. Additional bioinformatic data for each variant, including allele frequencies in population databases, in silico analyses, and conservation among vertebrate species, are shown in Table 3. Partial Sanger sequencing electropherograms validating each variant are presented in Additional file 6. All the variants reported in this study fulfilled at least one criterion (PM2_Supporting, absent or extremely low frequency in population databases), according to the American College of Medical Genetics and Genomics (ACMG) guidelines [27] and the modification of PM2 by ClinGen Sequence Variant Interpretation Working Group (https://www.clinicalgenome.org/site/assets/files/5182/pm2_-_svi_recommendation_-_approved_sept2020.pdf).

Fig. 1
figure 1

Frequencies of identified candidate genes associated with hearing loss in this study. Numbers in parenthesis indicate families carrying a candidate gene. Genes in red indicate those consistent with an autosomal recessive (AR) inheritance mode, and those in blue indicate genes consistent with an autosomal dominant (AD) inheritance mode. Sharp symbols (#) indicate that two deafness genes were found in different members of one family

Table 2 Clinical features of probands and genotypes of known deafness genes
Table 3 Deafness gene variants and annotation information in this study

Regarding non-syndromic deafness genes, compound heterozygous variants of MYO15A were identified in four sporadic cases (families 1470, 1540, 1479, and 1688); compound heterozygous variants of CDH23 in two sporadic cases (families 1644 and 1528); and compound heterozygous variants of PDZD7 in families 1397 and 1597; all three candidate PDZD7 variants mapped to exon 4 in regions encoding one of the PDZ domains, which are structural anchors that tether the protein to cytoskeletal components [31]. Although ADGRV1 and PDZD7 have been proposed as genes responsible for Usher syndrome type IIC [32], no candidate variants of ADGRV1 were detected among our patients. A homozygous variant of OTOF was identified in a sporadic case in family 1648. The pathogenicity of the c.5816G>A (p.Arg1939Gln) variant is established [11, 33]. As the proband had not been tested for otoacoustic emission, which is necessary to detect auditory neuropathy, this case was subjected to WES without prescreening for OTOF. Compound heterozygous variants of OTOG were identified in a sporadic case from family 739; the variants were predicted to disrupt splicing at the donor site (5′ splice site) of exon 11 and to be a nonsense mutation. Loss-of-function of both alleles of OTOG was considered to be sufficient explanation for hearing impairment.

Regarding syndromic deafness genes, two de novo variants of PTPN11 were identified in sporadic cases in families 1631 and 1543. Both detected variants, c.836A>G (p.Tyr279Cys) and c.1529A>G (p.Gln510Arg), reside in regions encoding the catalytic sites of the non-receptor type protein-tyrosine phosphatase [34], and are established pathogenic variants causing Noonan syndrome 1 (NS1) [35, 36]. The proband in family 1631 showed syndromic symptoms (short statue with subtle ocular hypertelorism, café -au-lait pigmentation, Table 2). Although evaluation of the developmental status of the proband was limited because of the age at the time of genetic test (2 years 0 month), developmental delay was not noted. The proband in family 1543 was 1 year 10 months old at the time of genetic test. No clinical features other than hearing loss were notified. Two de novo variants of SOX10 were identified in sporadic cases in families 1583 and 1651. The c.570C>A (p.Cys190Ter) variant maps to exon 3, and the transcript is predicted to be degenerated by nonsense-mediated decay (NMD) [37], whereas the other variant, c.1122del (p.Thr375ProfsTer127), maps to the last exon (exon 4) and is predicted to escape NMD. No neurologic disorders were recorded in the proband of family 1583, with the c.570C>A variant, whereas the neurologic symptoms of the proband of family 1651, with c.1122del, were consistent with Waardenburg syndrome, with neurological phenotypes (peripheral demyelinating neuropathy, central dysmyelination) associated with escape from NMD [38]. A heterozygous c.1082G>A (p.Arg361Gln) variant of EYA1, a gene responsible for Branchiootorenal syndrome 1 (BOR1), was identified in family 1636. The proband with the variant had amblyopia with refractive errors, which have not previously been reported in BOR1, while his father with the heterozygous p.Arg361Gln variant showed mild hearing loss without additional noticeable symptoms. The proband’s mother with normal hearing did not have the variant, and no other family members showed hearing loss. Compound heterozygous variants of ZNF335 were identified in the proband of family 1456. The two variants were both predicted to affect the region encoding the C2H2-type zinc finger domain; the genetic and clinical features of this family have been reported by others [39].

Assessment of homogenous large deletion spanning STRC

While detection of copy number variants (CNVs) from the results of WES is challenging using a single program [40], inspection using the Integrative Genomics Viewer (IGV) suggested that probands in families 1410, 1564, 1436, and 1700, and the mother (I-2) from family 1633, showed extremely low read depths across exon 16 and from exons 19 to 26 of STRC, in contrast to a control (III-1 of family 1470), with similar read depths covering all STRC exons (Fig. 2A–E, Additional file 7B). Moreover, this large homozygous deleted region appeared to extend to the adjacent gene (exons 8–10 of CKMT1B), as well as the entire CATSPER2 locus, in all probands (Additional file 7C). Homozygous deletion of both STRC and CATSPER2 has been reported to be associated with deafness-infertility syndrome (OMIM: 61102). The non-reduced read depths at other exons (including exons 1–15 and 27–29 of STRC, and exon 8 of CATSPER2) were likely due to multiple mapping of the sequences of the highly homologous pseudogenes, STRCP1 and CATSPER2P1 (Additional file 7D, E) [28, 41].

Fig. 2
figure 2

Clinical and genetic features of five families carrying an STRC copy number variant. AE Pedigrees of families with genotype variants in STRC or MYO6. Horizontal bars with or without sharp symbols (#) above each individual indicate that genotypes were determined by WES and qPCR or qPCR, respectively. Patient genotypes are indicated in blue. F, Estimated copy numbers of STRC exon 19 and 5′ UTR determined by qPCR. Results from four individuals with normal hearing are shown as controls. G Audiograms from patients with homozygous large deletions of the STRC region. Open circles and ⨉ symbols indicate hearing level thresholds in the right and left ears, respectively. Open triangles indicate thresholds measured in both ears. Left and right downward arrows indicate that the hearing level was below the indicated level at the respective sound frequency in right and left ears, respectively

Multiplex ligation-dependent probe amplification (MLPA) analysis of the probands from families 1410 and 1700 demonstrated homozygous deletion of a genomic region spanning from exon 8 of CKMT1B to exon 1 of CATSPER2 (Additional file 8). According to the positions of the MLPA probes, the 5′ breakpoint was predicted to be between exon 27 of PPIP5K1 (NC_000015.10:g.43851168) and exon 8 of CKMT1B (g.43890333), whereas the 3′ breakpoint was mapped between exon 1 of CATSPER2 (g.43940784) and exon 1 of PDIA3 (g.44038794). Based on the inner and outer boundaries, the deleted region was estimated to be between 50.5 and 187.6 kb. This structural variant resembled those recorded in dbVar (for example, nsv868983 (62.4 kb) and nsv3109791 (145.9 kb)) [42]. qPCR targeting of exon 19 and the 5′ UTR of STRC also demonstrated absence of these regions of STRC in patients from families 1410, 1564, 1436, and 1700 (Fig. 2F), consistent with the results of IGV and MLPA analyses (Fig. 2F, Additional file 7B and Additional file 8). In addition, heterozygous large deletion of an STRC allele in the parents of families 1410, 1436, and 1700 was also detected by qPCR (Fig. 2F); however, the copy numbers in the mother (II-5) of family 1564 were difficult to measure, making the exact genotypes predicted to carry the large STRC deletion allele ambiguous. The proband (III-3) and a sibling (III-2) in family 1410 had vision loss, in addition to hearing loss.

Intriguingly, IGV predicted homozygous deletion of STRC in the mother (I-2) of family 1633 (Additional file 7B), a family initially presumed to have an AD mode of inheritance (Fig. 2E). qPCR demonstrated that the mother (I-2) and the proband (II-1) had homozygous and heterozygous deletion of STRC, respectively, whereas the father (I-1) did not appear to have copy number loss of this gene. The trio of family 1633 was reanalyzed under the assumption that a distinct gene was responsible for hearing loss in the proband. Consequently, a de novo variant of MYO6 (c.1325G>A (p.Cys442Tyr) was identified in the proband.

Because OTOA is also known to have highly homologous pseudogene OTOAP1 especially in its exon 21–29, we searched for differences in read depths of OTOA. However, we could not detect any changes suggesting large deletion or duplication of OTOA in any probands.

Novel candidate genes associated with hearing loss

In addition to the previously known deafness genes categorized to Tier 1, eight additional genes were narrowed down as single candidates by WES analysis in a total of 10 families (Figs. 1, 3, 4). Two of these genes (SLC12A2 and BAIAP2L2, Tier 2) cause hearing loss phenotypes in mouse models, and one (HKDC1, Tier 3) is predominantly expressed in Macaca fascicularis cochlea. The other five genes (SVEP1, CACNG1, GTPBP4, PCNX2, and TBC1D8) were categorized as Tier 4 genes, with no known association with hearing loss. Genetic information for each variant is presented in Additional file 9. Partial Sanger sequencing electropherograms validating each variant are presented in Additional file 10. Association of SLC12A2 variants with hearing loss has been reported [12] and registered as DFNA78 in OMIM (619081).

Fig. 3
figure 3

Pedigrees with novel candidate genes associated with hearing loss. A BAIAP2L2; B HKDC1; C SVEP1; D CACNG1. Genotypes of candidate genes are shown in the pedigree. Audiograms of the probands, and predicted structures of gene products are shown. Horizontal bars with or without sharp symbols (#) above each individual indicate that genotypes were determined by WES and Sanger sequencing or Sanger sequencing, respectively. ASSR, auditory steady-state response; COR, conditioned orientation response; PTA, pure-tone audiometry. Positions of each candidate variant on the primary structure of the gene product are also indicated, along with the structural or functional domains of the product

Fig. 4
figure 4

Pedigrees with novel candidate genes associated with hearing loss (continued from Fig. 3). A GTPBP4; B PCNX2; C TBC1D8. Genotypes of candidate genes are shown in the pedigree. Audiograms of the probands, and positions of each candidate variant on the primary structure of the gene product are also indicated, along with the structural or functional domains of the product

A heterozygous variant of BAIAP2L2 was identified as the candidate cause for AD inheritance mode hearing loss in family 1427 (Fig. 3A). This gene encodes the membrane protein, brain-specific angiogenesis inhibitor 1-associated protein 2-like protein 2, which localizes to the plasma membrane in intestine and kidney epithelial cells [43]. Further, single-cell RNA sequencing analysis demonstrated predominant Baiap2l2 expression in hair cells in neonatal mouse cochlear epithelium [44] (Additional file 11), and mice deficient for Baiap2l2 have an increased auditory brainstem response threshold [24, 45]. The c.506T>C (p.Val169Ala) variant is predicted to reside in the IRSp53/MIM homology domain (IMD), which can bind to membranes and interact with a small GTPase (PROSITE: PRU00668) [46]. The proband with heterozygous BAIAP2L2 variant showed congenital, progressive, severe, steep sloping hearing loss without other symptoms.

Compound heterozygous HKDC1 variants were identified as the candidate cause of the sporadic hearing loss in family 1676 (Fig. 3B). This gene encodes hexokinase domain-containing 1, which catalyzes phosphorylation of glucose to generate glucose-6-phosphate [47]. A genome-wide association study (GWAS) identified HKDC1 as a risk factor for gestational hyperglycemia [48]. The missense variant found in the proband (c.1771A>C (p.Lys591Gln)) was predicted to reside in the hexokinase small subdomain 2, whereas the other compound heterozygous variant was predicted to affect splicing (c.376–2A>G). The proband had congenital, mild-to-moderate hearing loss, without other symptoms.

Compound heterozygous variants of SVEP1 were identified as the candidate cause of the sporadic hearing loss in two families: 1535 and 1555 (Fig. 3C). This gene encodes Sushi von Willebrand factor type A EGF and pentraxin domain-containing 1, which may function in cell attachment via integrin α9β1 [49]. A GWAS detected SVEP1 as a risk factor for coronary artery disease [50] and knockout of Svep1 in mice is embryonic lethal, with multiple developmental defects [25]. All four variants (c.6766C>G (p.Pro2256Ala), c.7357G>A (p.Val2453Met), c.6977C>T (p.Pro2326Leu), and c.10294T>C (p.Tyr3432His)) found in this study reside in the stretched sushi domains. The probands in families 1535 and 1555 carried the compound heterozygous variants c.[6766C>G];[7357G>A] and c.[6977C>T];[10294T>C], respectively, and had congenital, severe-to-profound non-syndromic hearing loss, without other symptoms.

A de novo heterozygous variant of CACNG1 was identified as the candidate cause of the sporadic hearing loss in family 1669 (Fig. 3D). This gene encodes voltage-dependent calcium channel gamma-1 subunit. The c.461C>T (p.Ser154Leu) variant of CACNG1 is predicted to encode a residue in the transmembrane domain of the putative protein product. Cacng1-knockout mice show dysregulated calcium transport in skeletal muscle [51]. The proband with the variant showed congenital, severe-to-profound hearing loss, without other symptoms.

A de novo heterozygous variant of GTPBP4 was identified as the candidate cause of the sporadic hearing loss in family 1696 (Fig. 4A). This gene encodes a nucleolar GTP-binding protein, and the variant (c.967C>G (p.Leu323Val)) in this gene affects the predicted GTP-binding domain. GTPBP4 mediates ribosomal RNA processing [52], suppresses schwannoma cell growth [53], and promotes colorectal carcinoma metastasis [54] in vitro. The proband (III-4) with the variant showed congenital, mild, and mid-frequency hearing loss, without other symptoms.

Compound heterozygous variants of PCNX2 were identified as the candidate cause of the AR inheritance mode hearing loss in family 1685 (Fig. 4B). This gene encodes Pecanex-like protein 2 and is frequently mutated in colorectal carcinomas with high microsatellite instability [55]. The detected variants were a nonsense change (c.4777C>T (p.Arg1593Ter)) and a missense variant (c.3505C>T (p.Arg1169Trp)), residing in the intracellular region of the plasma membrane protein. Pcnx2 deficiency modifies seizure-like behaviors in mouse [56]. The proband had congenital, progressive hearing loss, resulting in profound hearing loss at 2 years old, as well as abnormal pulmonary venous return, which was surgically treated at 1 day after birth.

A de novo heterozygous variant of TBC1D8 was identified as a candidate cause of the sporadic hearing loss in family 1575 (Fig. 4C). This gene encodes Tre-2 BUB2p and Cdc16p domain 1 family member 8, which functions as a GTPase-activator of Rab family proteins and promotes tumorigenesis of ovarian cancer [57]. TBC1D8 has also been reported to be within a susceptibility locus for osteoporosis-related traits [58]. The variant c.1997C>T (p.Ser666Leu) was predicted to reside in the putative carboxyl-terminal Rab-GTPase-TBC domain with unknown function. The proband with the variant had congenital, moderate low-frequency hearing loss, without other symptoms.

Discussion

Identification of variants in known deafness genes by WES analysis

Analysis of Tier 1 prioritized genes using WES data led to successful identification of pathogenic or likely pathogenic variants in 11 known deafness genes in 21 of 72 families, after screening of common deafness genes. Due to higher coverage of coding regions, WES is considered to detect pathogenic variants more efficiently and more cost-effectively than WGS. In addition, we narrowed down eight single genes as candidates associated with hearing loss in 10 families. Analysis of prioritized Tier 1 genes was similarly effective to targeted NGS analysis [3] and enabled efficient determination of the genes responsible for hearing loss in probands. After prescreening for GJB2, m.1555A>G, and m.3243A>G variants, as well as SLC26A4 and OTOF variants, when patient data suggested, the two most frequently identified genes in this study were STRC (DFNB16, five families) and MYO15A (DFNB3, four families). These two genes have been reported as relatively frequent causes of genetic hearing loss in Japan [59, 60] and studies in other ethnic regions [4, 61]. Subsequently, CDH23, PDZD7, and PTPN11 were detected as causative genes in two families each. Unlike CDH23 and PDZD7, which cause non-syndromic hearing loss or Usher syndrome presenting as non-syndromic hearing loss during childhood, PTPN11 is associated with NS1, which shows a variety of phenotypes in multiple organs [5]. Although two probands with PTPN11 variants had short stature, and one exhibited café-au-lait pigmentation, these clinical features had been unnoticed by the primary physicians. Our findings highlight that NS1 with no-to-mild symptoms, other than hearing loss, can be categorized as non-syndromic hearing loss in certain cases; hence PTPN11 may be a much more frequent cause of hearing loss than previously recognized.

Although a straightforward method to detect CNVs from WES data has yet to be established, homozygous deletion of STRC, which harbors a tandem homologous pseudogene sequence at its genomic locus, with potential for non-allelic homologous recombination [62], was successfully detected by combined assessment of read depths for each coding exon, MLPA, and qPCR. More extensive analyses of structural variants using several programs [63, 64], WGS [65], and long read sequencing [66] would reveal exact breakpoints of STRC CNVs.

In addition, this study demonstrated that trio WES analysis is a potent method of deciphering the reasons for discrepancies between pedigree and genetic inheritance, as shown in family 1633, where there was an initial presumption of AD inheritance, but mutations at two separate loci (STRC and MYO6) were detected. This study also demonstrates that WES analysis can be used to identify genes responsible for hearing loss and other factors suspected of influencing coexisting symptoms, to explain the clinical features in families; for example, families 1636 (EYA1 variant with amblyopia) and 1410 (STRC variant with vision loss).

Strategy to discover novel candidate deafness genes by WES analysis

Our strategy to discover novel candidate genes associated with hearing loss from Tiers 2–4 genes was based on the assumption that deafness genes would also cause auditory phenotypes in animal models [24, 25], which we categorized as Tier 2 genes, and that many genes critical for proper hearing in humans would also show predominant expression in M. fascicularis cochlea [26], which we categorized as Tier 3 genes. We identified SLC12A2, BAIAPL2, and HKDC1 as promising candidate genes warranting investigation for pathogenicity; however, identification of additional patients with variants in the same candidate genes will be critical for confirming their involvement. SVEP1 variants were detected in two families and are plausible candidates for further investigation, such as in vitro functional analysis or generation of an animal model with the identified variants knocked in. Confirmation of novel deafness genes will improve genetic tests for hearing loss.

We were unable to screen single candidate genes in 35 families, and no candidate variants emerged from WES analysis in six families. Hearing loss in these families may be attributable to pathogenic variants in untranslated regions, introns, cryptic splice sites, promoter or enhancer regions, intergenic regions, multigenic causes, or chromosomal arrangements, including CNVs, or unidentified environmental factors. We also aware that 5 exonic regions n Tier 1 genes showed insufficient read depth. Variants on these exons may also have been failed to be detected. In addition, our in silico filtering strategy did not use REVEL scores recommended by Hearing Loss Expert Panel guidelines [67]. In fact, our filtering strategy is considered very stringent; variants were filtered out only when all the in silico analyses (see “Methods” section) predicted no, benign, or tolerated effect. As a result, two candidate variants on our list showed low REVEL scores (PDZD7:c.503G>C, REVEL = 0.123, (Table 3) and PCNX2:c.3505C>G, REVEL = 0.139 (Additional file 9)). Although we cannot exclude out the possibility of filtering out pathogenic variants based on in silico prediction, it is considered quite unlikely.

Another possibility is that we may have missed causative genes due to discrepancies between the typical clinical features caused by the gene and those observed in our probands. For example, Tier1 genes included KDM6A, a gene responsible for syndromic hearing loss (Kabuki syndrome 2; OMIM: 300827). Variants of this gene were not considered as candidates when the proband had non-syndromic hearing loss; however, we cannot exclude the possibility that these variants can be associated with very mild or normal phenotypes, except for hearing loss. As we experienced in the case with known pathogenic variant of PTPN11 in family 1543, clinical features of several diseases such as Noonan syndrome show wide spectrum of symptoms including non-syndromic hearing loss, and these atypical features in patients could have been overlooked and affected the diagnostic yield. It is also possible that symptoms other than hearing loss are late-onset and overlooked at the time of genetic test. These are the limitations of this study to detect Tier 1 genes associated with hearing loss using WES analysis.

Conclusions

WES analysis using a tier system to prioritize genetic analysis is an efficient method to identify pathogenic variants of known deafness genes, as well as novel candidate deafness genes. Further analyses, including accumulation of variants and clinical features of patients, will expand perspectives on hereditary hearing loss.