Introduction

Worldwide, infertility affects 8–12% of reproductive-aged couples, and male factors account for approximately 50% of cases [1, 2]. Primary spermatogenic failure is a major cause of male infertility [3]. With the advent of next-generation sequencing technologies (e.g., whole exome sequencing, whole genome sequencing), a growing number of monogenic variants have been associated with human spermatogenic failure [4, 5].

Programmed differentiation of spermatogonial stem cells (SSCs) involves over 2,000 genes enriched in human testis [6]. However, only a small proportion of these genes have been linked to spermatogenic failure/defects [7]. The genetic causes of male factor infertility remain largely unknown.

In the present study, we retrospectively analyzed blood samples from 167 men with primary infertility using whole exome sequencing (WES) combined with gene expression analysis based on single cell sequencing data [8]. The results are expected to increase our knowledge of monogenic variants associated with spermatogenic failure/defects, and to provide new information for genetic diagnosis, genetic counseling, and individualized treatment of primary male infertility.

Materials and methods

Clinical samples

Patients with primary male infertility and altered semen parameters (e.g., azoospermia, asthenozoospermia, oligozoospermia, oligoasthenoteratozoospermia) according to the WHO guidelines (2010) were enrolled in the study at the cytogenetics laboratory of a tertiary medical center. Eligible patients had a normal karyotype and a negative AZF deletion test result. Patients with a medical history of ejaculatory duct obstructions, cryptorchidism, epididymis and vas deferens, orchitis, hypogonadism, chemotherapy or radiation therapy were excluded from the study. Venous blood samples were stored at -80 °C until use.

WES

Genomic DNA was isolated from peripheral blood leukocytes of all participants using a QIAamp mini DNA kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions and used for library construction. WES was performed at the BGI Clinical Testing Center (Shenzhen, Guangdong, China). Exons were captured using the BGI-Exome kit V4 and sequenced on a BGI-seq 500 with 100 bp paired-end reads at an average sequencing depth of 100× and an average coverage of 99.5%. Low-quality reads were removed with SOAPnuke, and the remaining reads were mapped to the human genome reference (GRCh38/hg38) with the Burrows-Wheeler aligner (BWA-MEM, version 0.7.10). Variant calling was performed using the Genome Analysis Tool Kit (GATK, version 3.3). The ANNOVAR tool was used to annotate and classify all variants.

Variant filtering and prioritization

All variants were filtered based on their frequency in public databases, including gnomAD (http://gnomad.broadinstitute.org/), the 1000 Genomes Project (http://browser.1000genomes.org), dbSNP (http://www.ncbi.nlm.nih.gov/snp) and the BGI internal database; variants with a minor allele frequency (MAF) < 0.01 were retained. Autosomal homozygous or double heterozygous variants and X-chromosomal variants were considered in this study. Prioritization of candidate variants was based on a recent review of 657 gene-disease relationships (GDR) for monogenic causes of human male infertility [4] and on the genes associated with male infertility in the Mouse Genome Informatics database (MGI: http://www.informatics.jax.org/). Online tools including SIFT (http://provean.jcvi.org/), PolyPhen2 (http://genetics.bwh.harvard.edu/pph2/), MutationTaster (http://www.mutationtaster.org/), CADD (https://cadd.gs.washington.edu/snv), and GERP (http://mendel.stanford.edu/SidowLab/downloads/gerp/) were used to predict the deleteriousness of candidate variants. The exome data of 210 normally fertile men served as controls to further screen the candidate variants.
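The filtering logic described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `Variant` record, the `select_candidates` function and the placeholder gene names are hypothetical, and "double heterozygous" is assumed here to mean two rare heterozygous variants in the same autosomal gene, with phase unknown. Only the MAF cutoff of 0.01 and the use of fertile controls come from the text.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.01  # threshold stated in the text


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal record for one annotated variant in one patient."""
    gene: str
    chrom: str       # e.g. "17" or "X"
    hgvs: str
    zygosity: str    # "hom", "het" or "hemi"
    max_maf: float   # highest MAF across the population databases


def is_rare(v: Variant) -> bool:
    """Keep variants below the MAF cutoff."""
    return v.max_maf < MAF_CUTOFF


def select_candidates(variants, control_hemizygous=frozenset()):
    """Return rare variants that fit the inheritance models considered.

    control_hemizygous: (gene, hgvs) pairs seen in fertile controls;
    X-linked variants found there are discarded (assumed representation).
    """
    rare = [v for v in variants if is_rare(v)]

    # Count rare heterozygous autosomal variants per gene.
    het_per_gene = {}
    for v in rare:
        if v.chrom != "X" and v.zygosity == "het":
            het_per_gene[v.gene] = het_per_gene.get(v.gene, 0) + 1

    kept = []
    for v in rare:
        if v.chrom == "X":
            if (v.gene, v.hgvs) not in control_hemizygous:
                kept.append(v)
        elif v.zygosity == "hom":
            kept.append(v)
        elif v.zygosity == "het" and het_per_gene[v.gene] >= 2:
            kept.append(v)
    return kept


# Placeholder data for illustration only; not variants from the study.
demo = [
    Variant("GENE_A", "1", "c.1A>G", "hom", 0.0005),
    Variant("GENE_B", "2", "c.2C>T", "het", 0.002),
    Variant("GENE_B", "2", "c.3G>A", "het", 0.001),
    Variant("GENE_C", "3", "c.4T>C", "het", 0.0001),  # single het: dropped
    Variant("GENE_D", "X", "c.5A>T", "hemi", 0.0),
    Variant("GENE_E", "4", "c.6G>C", "hom", 0.05),    # too common: dropped
]
candidates = select_candidates(demo)
```

Because exome data alone do not establish phase, two heterozygous variants in one gene remain candidates until parental or long-read data show that they lie on different alleles.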

CNV analysis using WES data

The GATK-gCNV pipeline [9] was used to call germline CNVs from the WES data of the 167 patients in the study. CNVs with a quality score (QS) > 1000 were annotated using the DECIPHER database (https://decipher.sanger.ac.uk/). To identify potential male infertility-related genes within the pathogenic CNVs, coding genes covered by pathogenic CNVs were filtered against the human candidate genes for male infertility [4].
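As a rough illustration of the two filtering steps in this subsection, the sketch below keeps CNV calls above the quality threshold and then lists the genes from a panel that each call overlaps. The `CNV` and `Gene` records, the function names, the sample label and all coordinates are hypothetical placeholders; only the QS > 1000 threshold is taken from the text, and the DECIPHER annotation step is not modeled.

```python
from typing import NamedTuple

QS_CUTOFF = 1000  # quality-score threshold stated in the text


class CNV(NamedTuple):
    sample: str
    chrom: str
    start: int
    end: int
    kind: str   # "DEL" or "DUP"
    qs: float


class Gene(NamedTuple):
    symbol: str
    chrom: str
    start: int
    end: int


def high_quality(cnvs):
    """Keep CNV calls whose quality score exceeds the cutoff."""
    return [c for c in cnvs if c.qs > QS_CUTOFF]


def overlapping_genes(cnv, panel):
    """Return panel gene symbols whose interval overlaps the CNV."""
    return [
        g.symbol
        for g in panel
        if g.chrom == cnv.chrom and g.start <= cnv.end and cnv.start <= g.end
    ]


# Placeholder calls and panel; coordinates are invented for illustration.
calls = [
    CNV("P001", "10", 1_000, 5_000, "DEL", 1500.0),
    CNV("P002", "3", 2_000, 2_500, "DUP", 800.0),   # below cutoff: dropped
]
panel = [
    Gene("GENE_A", "10", 4_500, 9_000),   # partially covered by the deletion
    Gene("GENE_B", "10", 6_000, 7_000),   # outside the deletion
]

hits = {c.sample: overlapping_genes(c, panel) for c in high_quality(calls)}
```

A partial overlap is reported as a hit on purpose, since a deletion that removes only part of a gene can still disrupt it.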

Sanger sequencing

Forward and reverse primers for PCR amplification were designed using the Primer3 program based on the human genome reference sequence GRCh38/hg38. The purified PCR product was loaded for capillary electrophoresis on an ABI 3730XL DNA analyzer (ABI PRISM, Foster City, CA, USA).

Differential expression analysis of candidate genes

Single cell sequencing data (https://ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE109037) were downloaded and subjected to quality control and filtering following the scripts provided by the submitter [8]. Gene expression analysis was performed using the Seurat package (v4.0) in R (version 4.2.3).

Results

Clinical features

Blood samples from a total of 167 patients, with a mean age ± standard deviation of 30.4 ± 4.8 years and a range of 22 to 49 years, were retrospectively analyzed. Based on medical records, the semen characteristics of these patients included non-obstructive azoospermia (NOA) (n = 58, 34.7%), asthenozoospermia (n = 27, 16.2%), oligoasthenozoospermia (n = 63, 37.7%), oligoasthenoteratozoospermia (n = 6, 3.6%), oligozoospermia (n = 10, 6.0%), and teratozoospermia (n = 3, 1.8%). All patients had a normal karyotype and no Y-chromosome infertility.
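The phenotype breakdown can be checked with a few lines of Python. This is a simple arithmetic sketch using the counts reported above; the variable names are illustrative.

```python
# Patient counts per semen phenotype, as reported in the text.
counts = {
    "non-obstructive azoospermia": 58,
    "asthenozoospermia": 27,
    "oligoasthenozoospermia": 63,
    "oligoasthenoteratozoospermia": 6,
    "oligozoospermia": 10,
    "teratozoospermia": 3,
}

total = sum(counts.values())

# Percentage of the cohort in each category, rounded to one decimal place.
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: n = {counts[name]}, {pct}%")
```

The counts add up to the 167 enrolled patients, and the rounded percentages match the reported values.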

Genetic findings

In this study, variants in genes with well-established causative roles in human male infertility were prioritized for diagnostic purposes. Based on a recent review of 657 gene-disease relationships (GDR) for monogenic causes of human male infertility [4] and a complementary literature search, we identified deleterious variants of 17 known causative genes (five X-linked and twelve autosomal), namely ACTRT1, ADAD2, AR, BCORL1, CFAP47, CFAP54, DNAH17, DNAH6, DNAH7, DNAH8, DNAH9, FSIP2, MSH4, SLC9C1, TDRD9, TTC21A, and WNK3, in 23 patients (Table 1). The mutation spectrum of these known causative genes included missense (64.1%), splice site (17.9%), frameshift (5.1%), nonsense (5.1%), start codon (2.6%), and in-frame deletion (2.6%) mutations, with a predominance of missense mutations resulting in amino acid substitution. We used the exome data from 210 normally fertile men as controls. X-linked variants that were present in both patients with male factor infertility and fertile controls were discarded. For the autosomal recessive variants DNAH17 c.12488T > G p.(Phe4163Cys), c.5278G > T p.(Asp1760Tyr), and c.3601 C > T p.(Arg1201Trp), DNAH8 c.6565 A > G p.(Met2189Val), and FSIP2 c.3365 C > G p.(Ser1122Cys), one or two heterozygous carriers were found in the control group. Because heterozygous carriers of recessive variants are expected to be unaffected, these variants were still retained as causal variants for patients with male factor infertility in this study (Table 1).

Table 1 Variants of known causative genes for male infertility detected in the study

Based on genes that have been linked to male infertility in mouse models or are recurrently affected in patients, we selected 12 candidate genes for further investigation, comprising seven X-linked and five autosomal genes, namely CHTF18, DDB1, DNAH12, FANCB, GALNT3, MAGEC1, OPHN1, RBMXL3, SCML2, UPF3A, ZMYM3, and ZNF185 (Table 2). The mutation spectrum of candidate genes in patients included missense (50%), nonsense (15.4%), frameshift (11.5%), splice site (19.2%) and intron/exon boundary deletion (3.8%) mutations. These variants were not found in the 210 fertile men, except for two DNAH12 variants, c.6286 C > T p.(Gln2096X) and c.6021-3 A > G, each of which had one heterozygous carrier. Considering the autosomal recessive inheritance pattern of the DNAH12 gene, the two variants were still considered candidate causal variants for male factor infertility.

Table 2 Variants of candidate genes for male infertility

The results of the Sanger sequencing validation for the identified variants are shown in the Supplementary Data Sheet 1.

A total of 29 CNVs were identified using the GATK-gCNV tool, and none of them were directly related to male infertility in DECIPHER (accessed on August 27th, 2024; Supplementary Data Sheet 2). In one patient (M1241), a heterozygous deletion of chromosome 10q26.3 resulted in loss of a part of the spermatogenic failure gene SYCE1. This CNV was not considered to be the cause of infertility in the patient based on the recessive inheritance pattern of SYCE1.

Differential expression of candidate genes

We explored the differential expression of candidate genes across the developmental stages of spermatogenesis by referencing the human testis single cell sequencing dataset [8]. Of the 12 candidate genes, CHTF18, DDB1 and MAGEC1 were preferentially expressed in spermatogonial stem cells (Fig. 1B, C and G). DNAH12 and GALNT3 were expressed primarily in spermatocytes and early spermatids (Fig. 1D and F). Notably, UPF3A was expressed at a high level throughout spermatogenesis except in elongating spermatids (Fig. 1K). The testicular expression profiles of these genes support their potential roles in spermatogenesis and in the pathophysiology of male infertility. The other candidate genes were expressed in only a small proportion of cells during spermatogenesis and remain to be further investigated (Fig. 1).

Fig. 1 (A) Uniform Manifold Approximation and Projection (UMAP) plot of testicular cells. The cells are colored by cell type. SPG, SPC and SPT stand for spermatogonia, spermatocytes and spermatids, respectively. (B-M) Expression patterns of the candidate genes in UMAP space

Discussion

Several previous studies have explored the genetic etiology of spermatogenic failure using WES, with a diagnostic yield ranging from 3.2% (10 out of 314 patients) to 47.6% (10 out of 21 patients) [31, 37, 38, 39, 40, 41]. In a more recent study that screened for likely pathogenic and pathogenic (LP/P) variants in a male infertility gene panel in 521 patients with idiopathic primary spermatogenic failure, a molecular diagnosis rate of 12% (n = 64) was achieved [42]. In the present study, we found that 23 out of 167 patients with primary male infertility carried candidate variants of known causative genes for spermatogenic failure, giving a diagnostic rate of 13.8%. We reasoned that the performance of WES in detecting causal mutations of spermatogenic failure might be affected by the genetic background of the patients and by the inclusion criteria (e.g., pathological type, severity). Accumulating evidence on causative genes for spermatogenic failure is expected to improve the diagnostic performance of WES.
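The yields quoted in this paragraph follow directly from the reported patient counts, as the short Python sketch below shows. The function name and dictionary labels are illustrative; the counts are those given in the text.

```python
def diagnostic_yield(diagnosed: int, cohort: int) -> float:
    """Percentage of a cohort with a molecular diagnosis, to one decimal."""
    if cohort <= 0:
        raise ValueError("cohort size must be positive")
    return round(100 * diagnosed / cohort, 1)


# (diagnosed, cohort size) pairs as quoted in the text.
studies = {
    "lowest previously reported WES yield": (10, 314),
    "highest previously reported WES yield": (10, 21),
    "gene panel study": (64, 521),
    "present study": (23, 167),
}

yields = {label: diagnostic_yield(d, n) for label, (d, n) in studies.items()}

for label, pct in yields.items():
    print(f"{label}: {pct}%")
```

To one decimal place the gene panel study comes out at 12.3%, which the text rounds to 12%. The smallest cohort (21 patients) gives the highest yield, consistent with the point that cohort composition and inclusion criteria strongly influence the figure.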

Among the 17 known causative genes for male infertility identified in this study, AR, MSH4, and TDRD9 have been reported to be associated with azoospermia [12, 23, 43, 44]. The human androgen receptor (AR) gene, located at Xq11-12, encodes a ligand-activated transcription factor that plays a key role in spermatogenesis [45]. Recently, a novel missense mutation [NM_000044.6: c.2051G > C p.(Gly684Ala)] within the ligand-binding domain of AR was identified by WES in azoospermic individuals from a Chinese family [44]. In this study, we identified a start-loss variant of AR in a patient with NOA. MSH4 (MutS homologue 4) is a member of the mammalian mismatch repair (MMR) gene family and plays a vital role in meiotic chromosome synapsis [46]. Homozygous missense [NM_002440.4:c.1913 C > T p.(Pro638Leu); c.2261 C > T p.(Ser754Leu)], homozygous deletion frameshift [c.805_812del p.(Val269Glnfs*15); c.2220_2223del p.(Lys741Argfs*2)], and compound heterozygous [c.1950G > A p.(Trp650X) and c.2179delG p.(Asp727Metfs*11); c.244G > A p.(Gly82Ser) and c.670delT p.(Leu224Cysfs*3)] variants of MSH4 have been reported in patients with NOA [37, 43]. Here, we identified a homozygous splice-site variant in a patient with NOA. The mouse Tdrd9 gene encodes an ATPase/DExH-type helicase, which has been linked to male sterility and meiotic failure [47]. A previous study identified a homozygous 4 bp deletion frameshift mutation in TDRD9 [c.720_723 del TAGT p.(Ser241Profs*4)] as the causative mutation in five azoospermic infertile men of a large consanguineous Bedouin family [23]. In our study, we identified a homozygous missense variant of TDRD9 in a patient with NOA. Together, our findings expand the mutation spectrum of these azoospermia genes.
Interestingly, biallelic deleterious mutations of CFAP54 have been shown to cause severe MMAF and NOA in humans [26]. With respect to the recently identified MMAF causative gene CFAP47 [14], one patient in our study (M1294) who was hemizygous for a missense variant of the gene had NOA. This finding may expand the phenotypic spectrum of male infertility caused by CFAP47. However, further evidence is required to confirm the relationship between CFAP47 and NOA.

Five of the 17 known causative genes, namely DNAH17, DNAH6, DNAH7, DNAH8, and DNAH9, belong to the DNAH family, members of which encode the axonemal dynein heavy chains [48]. Dyneins are components of the inner and outer dynein arms attached to the doublets of the axonemal microtubules in motile cilia and sperm flagella. Mutations in the DNAH genes are known to be associated with cilia-related disorders, such as immotile cilia syndrome [49]. Using WES, a recent study analyzed 90 Chinese patients with MMAF and identified two patients with biallelic heterozygous missense variants of DNAH8 [c.11771 C > T p.(Thr3924Met) plus c.6689 A > G p.(Lys2230Arg); c.9427 C > T p.(Arg3143Cys) plus c.12721G > A p.(Ala4241Thr)]. Meanwhile, reanalysis of the exome sequencing data of 167 MMAF patients from France, Iran, and North Africa identified a patient with a homozygous frameshift mutation [c.6962_6968del p.(His2321Profs*4)] in DNAH8 [18]. In this study, we identified two patients (M1237 and M1761) with asthenozoospermia carrying double heterozygous variants of DNAH8. In the study by Whitfield et al., five patients with asthenozoospermia were found to carry variants of DNAH17 [15]. In our study, three patients with azoospermia were found to carry double heterozygous missense mutations in DNAH17. In line with our finding, a recent study also detected multiple heterozygous damaging variants in DNAH17 in Japanese men with isolated non-obstructive azoospermia [50]. Regarding DNAH9, we noted a difference between the NOA phenotype of patient M1470 in this study and the asthenozoospermia phenotype of the patients described previously [19]. Further investigation is needed to clarify this discrepancy.

In the present study, we identified 12 candidate genes for human spermatogenic failure/defects. Based on testicular single cell sequencing data, six of the 12 candidate genes (CHTF18, DDB1, DNAH12, GALNT3, MAGEC1, and UPF3A) are differentially expressed across spermatogenic stages, suggesting distinct roles for these genes in spermatogenesis. Variants of DNAH12 were previously identified in patients with MMAF or sperm motility disorder [29, 31]. In accordance with these findings, a homozygous nonsense variant of DNAH12 was identified in a patient (M842) with asthenozoospermia in our study, adding evidence that DNAH12 is a causative gene for male infertility. However, patient M867, who carried double heterozygous splice-site variants in DNAH12, had azoospermia on routine semen analysis, which appears inconsistent with the asthenozoospermia and MMAF phenotypes of patients with homozygous or compound heterozygous variants in DNAH12 [30]. We reason that the possibility of DNAH12 variants leading to azoospermia cannot be excluded, considering that single cell sequencing data revealed its expression in spermatocytes and early spermatids, and that Dnah12−/− male mice had a dramatically decreased sperm number compared with the wild type [30]. In addition, the MMAF causative genes DNAH6 and DNAH17 have also been associated with azoospermia [50, 51]. To confirm whether DNAH12 is a causative gene for NOA, more azoospermic patients with pathogenic variants in DNAH12 will need to be identified in future studies.

Interestingly, loss of Chtf18 in mice leads to oligospermia due to defective meiotic recombination [26], but the patient in our study (M1286) who was double heterozygous for missense variants in CHTF18 was diagnosed with NOA. Upf3a knockout mice were reported to have a significant reduction in sperm count and a defect in spermatocyte progression [35], whereas patient M847 in our study, who carried a homozygous frameshift variant of UPF3A, had asthenozoospermia. These phenotypic differences may be attributed to genetic differences between humans and mouse strains. Although our findings provide important information on the roles of CHTF18 and UPF3A in human spermatogenesis, whether they are causal genes for spermatogenic failure needs to be further verified.

This study has some limitations. First, the sample size was relatively small. Second, the study was retrospective and parental samples were not available. Therefore, the maternal and paternal origins of the candidate variants identified in patients are unclear. Furthermore, the patients did not undergo testicular puncture or biopsy, so it was not possible to assess the pathological types in the testes or to examine the expression of the related genes.

Taken together, our study shows that WES is an effective tool for the genetic diagnosis of primary male infertility. Our findings deepen the understanding of the genetic factors involved in spermatogenesis and have implications for the diagnosis, clinical treatment and genetic counseling of male infertility.