Introduction

Mayer–Rokitansky–Küster–Hauser (MRKH) syndrome (MIM: %277000), also known as Müllerian agenesis [1], Müllerian aplasia, Müllerian dysgenesis, or congenital absence of the uterus and vagina, is characterized by uterovaginal aplasia in an otherwise phenotypically normal female with a normal 46,XX karyotype. MRKH syndrome affects approximately one in 5000 live female births [2] and has been reported in approximately 16% of patients with primary amenorrhea [3]. When only the reproductive organs (uterus, fallopian tubes, cervix, and the upper part of the vagina) are affected, the condition is classified as MRKH syndrome type I. Some women with MRKH syndrome also have abnormalities in other organs; in these cases, the disease is classified as MRKH syndrome type II [4]. Affected individuals often display renal, skeletal, and heart defects and hearing loss [5].

MRKH syndrome is directly caused by incomplete development of the Müllerian ducts. Genetic and/or environmental factors that control the formation and morphogenesis of the Müllerian ducts are therefore closely related to MRKH syndrome. During human embryonic development, the Müllerian ducts form just lateral to the Wolffian ducts. Both the Müllerian and Wolffian ducts develop from the intermediate mesoderm, on the surface of the mesonephric kidneys. This shared origin helps explain why MRKH syndrome is often associated with abnormalities of the renal and axial skeletal systems.

Previous studies have found that genetic perturbations can lead to MRKH syndrome. For example, PAX8, TBX6, GEN1, WNT4, WNT9B, BMP4, BMP7, HOXA10, EMX2, LHX1, GREB1L, LAMC1, and other genes are associated with MRKH syndrome [6,7,8,9,10,11,12,13,14,15,16]. Although these genetic factors have been related to the pathogenesis of MRKH syndrome, its inheritance is very complex: autosomal dominant, autosomal recessive, digenic, and oligogenic patterns have been described, as has incomplete penetrance. Some pathogenic genetic factors have been identified, but many remain unknown. The identification of additional pathogenic genes will require large-scale genetic studies across institutions in different countries.

In this study, we aimed to explore novel genetic causes of MRKH syndrome using whole-exome sequencing (WES). We recruited 10 patients with MRKH syndrome and performed WES and family genetic analysis to identify novel pathogenic genetic factors associated with the syndrome.

Materials and methods

Patients

Ten patients diagnosed with MRKH syndrome with a 46,XX karyotype were recruited at Beijing Obstetrics and Gynecology Hospital from January 2019 to May 2021. The clinical conditions and manifestations of the ten patients are presented in Table 1. Five milliliters of peripheral blood were collected from each patient for further genetic analysis.

Table 1 Clinical information of the MRKH patients

WES analysis

Genomic DNA from each patient was extracted from the peripheral blood using the QIAamp DNA Blood Kit (Qiagen, Valencia, CA, USA). WES was performed as previously described [17]. The functional effects of the variants (damaging or not) were predicted using the PolyPhen-2, SIFT, MutationTaster, LRT, and FATHMM-MKL algorithms. Candidate variants were retained if they met two criteria: (i) missense, nonsense, frameshift, or splice site variants; and (ii) variants with a minor allele frequency < 1%. Minor allele frequencies were obtained from the Genome Aggregation Database (gnomAD, http://gnomad.broadinstitute.org/), the 1000 Genomes Project (1000G, http://browser.1000genomes.org/index.html), the NHLBI Exome Sequencing Project (ESP6500), and our in-house database.
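The two filtering criteria can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Variant` record, the consequence labels, the database keys, and the rule that a variant must be rare in every database consulted are all assumptions made for illustration. The TBC1D1 entry uses the gnomAD East Asian frequency reported in the Results; the other two records are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Criterion (i): consequence classes that are retained (labels are illustrative).
KEPT_CONSEQUENCES = {"missense", "nonsense", "frameshift", "splice_site"}
# Criterion (ii): minor allele frequency must be below 1%.
MAF_CUTOFF = 0.01


@dataclass
class Variant:
    gene: str
    hgvs_c: str
    consequence: str
    # Frequency per database; None means the variant is absent from that database.
    maf: Dict[str, Optional[float]] = field(default_factory=dict)


def max_maf(variant: Variant) -> float:
    """Highest frequency seen in any database; absence counts as 0."""
    observed = [f for f in variant.maf.values() if f is not None]
    return max(observed, default=0.0)


def passes_filter(variant: Variant) -> bool:
    """Assumption: a variant must be rare in every database that reports it."""
    return variant.consequence in KEPT_CONSEQUENCES and max_maf(variant) < MAF_CUTOFF


def filter_variants(variants: List[Variant]) -> List[Variant]:
    return [v for v in variants if passes_filter(v)]


example_variants = [
    # Frequency taken from the Results section of this paper.
    Variant("TBC1D1", "c.1069G>C", "missense", {"gnomAD_EAS": 0.0002}),
    # Hypothetical records, included only to show the two rejection paths.
    Variant("GENE_A", "c.100A>G", "synonymous", {"gnomAD_EAS": 0.0001}),
    Variant("GENE_B", "c.200C>T", "missense", {"gnomAD_EAS": 0.004, "1000G": 0.03}),
]

for v in filter_variants(example_variants):
    print(v.gene, v.hgvs_c, v.consequence, max_maf(v))
```

Taking the maximum frequency across databases is the more conservative choice; a pipeline that relies on a single population frequency would instead look up one key.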

Sanger sequencing analysis

Sanger sequencing was performed to validate the identified variants and determine if each variant was inherited from a parent. The primer pairs for each gene are listed in Additional file 1: Table S1. Forward or reverse primers were used to sequence the PCR products. Sequencing was performed using an ABI 3730 automatic sequencer (Applied Biosystems, Foster City, CA, USA).

Protein structure prediction

The three-dimensional structures of wild-type and mutant proteins were predicted using the Robetta online protein structure prediction server (https://robetta.bakerlab.org/) [18]. This tool can predict the three-dimensional structure of a given amino acid sequence. Protein structure alignment was performed using Visual Molecular Dynamics 1.9.3 software (https://www.ks.uiuc.edu/Research/vmd/).

Results

WES analysis

Of the 10 women with MRKH syndrome, seven had type I MRKH syndrome and the other three had type II MRKH syndrome (Table 1). WES identified 11 variants in 90% (9/10) of the patients, supporting its use as a molecular genetic diagnostic tool for MRKH syndrome. These 11 variants involved nine genes: TBC1D1 (Fc-M-1 and Fc-M-3), KMT2D (Fc-M-2 and Fc-M-7), LIFR (Fc-M-2), HOXD3 (Fc-M-4), DLG5 (Fc-M-6), CLIP1 (Fc-M-7), GLI3 (Fc-M-8), HIRA (Fc-M-9), and GATA3 (Fc-M-10) (Table 2). All the variants were heterozygous. They comprised one frameshift variant, one stop-gained variant, and nine missense variants (Table 2). All the identified variants were absent or very rare in the gnomAD East Asian population (Table 2). Two of the 11 variants (18.2%) were classified as pathogenic according to the American College of Medical Genetics and Genomics (ACMG) guidelines. The remaining nine variants (81.8%) were classified as variants of uncertain significance (VUS).
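To make the classifications concrete, the sketch below encodes the evidence-combining rules of the 2015 ACMG/AMP guidelines in Python and applies them to the criterion sets reported in this paper. It is a simplified illustration, not the procedure used by the authors: the function names are hypothetical, and deciding which criteria apply to a variant requires expert review that no script replaces.

```python
from collections import Counter
from typing import Iterable


def _tier(code: str) -> str:
    """Map a criterion code such as 'PVS1' or 'BP1' to its strength prefix."""
    prefix = code.upper().rstrip("0123456789")
    if prefix not in {"PVS", "PS", "PM", "PP", "BA", "BS", "BP"}:
        raise ValueError(f"unrecognized ACMG criterion: {code}")
    return prefix


def classify(codes: Iterable[str]) -> str:
    """Combine ACMG/AMP evidence codes into one of five classes."""
    n = Counter(_tier(c) for c in codes)
    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps == 1 and (pm >= 3 or (pm == 2 and pp >= 2) or (pm == 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm == 1)
        or (ps == 1 and 1 <= pm <= 2)
        or (ps == 1 and pp >= 2)
        or pm >= 3
        or (pm == 2 and pp >= 2)
        or (pm == 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs == 1 and bp == 1) or bp >= 2

    # Contradictory pathogenic-side and benign-side results default to VUS.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"


# Criterion sets as reported for the variants in this study.
print(classify(["PVS1", "PM2", "PP3"]))  # truncating TBC1D1 and DLG5 variants
print(classify(["PM2", "PP3"]))          # HOXD3 p.P192R
print(classify(["PM2", "PP3", "BP1"]))   # TBC1D1 p.E357Q, GLI3 p.L299V
```

Under these rules, PVS1 with one moderate and one supporting criterion reaches the pathogenic tier, whereas one moderate plus one supporting criterion, with or without a single benign supporting criterion, stays a VUS.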

Table 2 In silico analysis of sequence variants found by WES in MRKH patients

Novel candidate genes of MRKH syndrome

TBC1D1

We identified TBC1D1 variants in two unrelated patients, Fc-M-1 and Fc-M-3 (Table 2). Fc-M-1 was diagnosed with type II MRKH syndrome (European Society of Human Reproduction and Embryology [ESHRE] classification: U5bC4V4) with full uterine agenesis, left pelvic kidney dysfunction, congenital anal atresia with vestibular fistula, ventricular septal defect, and accessory auricle (Table 1, Fig. 1A–C). Fc-M-3 was diagnosed with type I MRKH syndrome (ESHRE classification: U5bC4V4) with uterine remnants without a rudimentary cavity (Table 1 and Fig. 1D).

Fig. 1

TBC1D1 mutation in patients with MRKH. A Image of patient Fc-M-1 with a diagnosis of MRKH. No uterine echo was evident in the pelvic ultrasound (indicated by *) behind the bladder (denoted by B). B Image of patient Fc-M-1 with a diagnosis of MRKH. The right renal (R–R) region is enlarged. C Image of patient Fc-M-1 with a diagnosis of MRKH. The left renal (L–R) region was dysplastic and located in the left lower abdomen (pelvic ectopic kidney). The region was 4.0 cm in length. A cystic cavity measuring 1.8 × 1.6 cm was evident. The renogram showed that the left kidney had no function. D Pelvic ultrasound image of patient Fc-M-3 with a diagnosis of MRKH. B denotes bladder and U denotes aplastic uterus without rudimentary cavity. E TBC1D1 is highly expressed in the human uterus. The data were obtained from an online database (https://varsome.com/gene/TBC1D1). The red arrow denotes the expression level of TBC1D1 in the human uterus. F Sanger sequencing confirmation of the heterozygous TBC1D1 variant in patient Fc-M-1. The patient’s father (I-1) also carried the same heterozygous variant. The patient’s mother (I-2) harbored two wild-type (WT) alleles. The red arrow indicates the variant site (c.2553delC); MT, mutated allele. G The domains and mutation in TBC1D1. Full-length TBC1D1 is 1168 amino acids (aa) and includes the PID domain from aa 246 to 404 (blue box) and the catalytic Rab-GAP TBC domain from aa 800 to 994 (red box). The p.R854Efs*24 mutation results in a predicted 23-aa frameshift sequence followed by a premature stop codon; WT, wild-type allele. H Sanger sequencing validating the TBC1D1 variant in patient Fc-M-3. The red arrow indicates the variant site (c.1069G>C). I The wild-type (green) and p.E357Q mutant (red) protein structures for amino acids at positions 164 to 371 were predicted by RoseTTAFold and aligned using VMD software

TBC1D1 was highly expressed in the uterus (Fig. 1E). Fc-M-1 harbored a frameshift c.2553delC (p.R854Efs*24) variant inherited from her father (Fig. 1F). Therefore, the TBC1D1 c.2553delC variant was considered to be associated with MRKH syndrome within this family. The c.2553delC variant was predicted to produce a truncated p.R854Efs*24 protein, which would disrupt the Rab-GAP TBC domain of the TBC1D1 protein and lead to the loss of C-terminal sequences (Fig. 1G). The variant was absent in the gnomAD East Asian population and was classified as pathogenic (PVS1 + PM2 + PP3) according to the ACMG guidelines.
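The effect of the frameshift on the annotated domains can be worked through with a short Python sketch. It assumes the standard HGVS reading of `fs*24` (the stop codon is the 24th position counting from the first altered residue) and uses the domain coordinates given in the legend of Fig. 1G; the function names and the returned dictionary are hypothetical and serve only as illustration.

```python
import re
from typing import Dict, Tuple

# Matches short-form protein frameshift descriptions such as "p.R854Efs*24".
_FS_PATTERN = re.compile(r"^p\.[A-Z](\d+)[A-Z]fs\*(\d+)$")


def parse_frameshift(hgvs_p: str) -> Tuple[int, int]:
    """Return (first altered residue, position of the stop in the new frame)."""
    match = _FS_PATTERN.match(hgvs_p)
    if match is None:
        raise ValueError(f"not a one-letter frameshift description: {hgvs_p}")
    return int(match.group(1)), int(match.group(2))


def truncation_summary(hgvs_p: str, domains: Dict[str, Tuple[int, int]]) -> dict:
    first_altered, stop_position = parse_frameshift(hgvs_p)
    last_native = first_altered - 1      # last residue identical to wild type
    novel_residues = stop_position - 1   # altered residues before the stop
    retained = {}
    for name, (start, end) in domains.items():
        # Count only residues that still carry the native sequence.
        kept = max(0, min(end, last_native) - start + 1)
        retained[name] = kept / (end - start + 1)
    return {
        "last_native_residue": last_native,
        "novel_residues": novel_residues,
        "product_length": last_native + novel_residues,
        "domain_fraction_retained": retained,
    }


# Domain boundaries as given in the legend of Fig. 1G.
tbc1d1_domains = {"PID": (246, 404), "Rab-GAP TBC": (800, 994)}
summary = truncation_summary("p.R854Efs*24", tbc1d1_domains)
print(summary)
```

With these inputs, the predicted product keeps native sequence through residue 853, carries 23 altered residues, and ends at 876 of the 1168 residues of the full-length protein. The PID domain is intact, while only 54 of the 195 residues of the Rab-GAP TBC domain retain their native sequence.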

Fc-M-3 carried a missense c.1069G>C (p.E357Q) variant confirmed by Sanger sequencing (Fig. 1H). Because samples from the patient’s mother and father were unavailable, we could not determine how this variant was transmitted. The allele frequency of the c.1069G>C variant in the gnomAD East Asian population was 0.0002. The variant was predicted to be damaging by all five algorithms we used and was classified as a VUS (PM2 + PP3 + BP1) according to the ACMG guidelines (Table 2). p.E357Q is located in a PTB_TBC1D1_like domain (TBC1 domain family member 1 and related protein phosphotyrosine binding (PTB)), which spans amino acids 164 to 371. Wild-type (WT) and mutant protein structures for amino acids 164 to 371 were predicted using RoseTTAFold on the Robetta server. The predicted structures differed markedly in at least two regions between the protein carrying the p.E357Q variant and the WT protein (Fig. 1I), suggesting that the p.E357Q variant might affect the function of TBC1D1.

DLG5

We identified a DLG5 variant in patient Fc-M-6, who was diagnosed with type I MRKH syndrome (ESHRE classification: U5bC4V4) with primary amenorrhea and dyspareunia (Table 1). DLG5 was highly expressed in the human cervix, uterus, and vagina (Fig. 2A). Fc-M-6 harbored the DLG5 stop-gained variant c.418C>T (p.Q140*) (Table 2). This variant was confirmed by Sanger sequencing (Fig. 2B) and classified as pathogenic (PVS1 + PM2 + PP3) according to the ACMG guidelines (Table 2). The c.418C>T variant of DLG5 was predicted to produce a truncated p.Q140* protein lacking nearly all functional domains of the DLG5 protein (Fig. 2C).

Fig. 2

DLG5 variant is associated with MRKH syndrome. A DLG5 is highly expressed in the human cervix, uterus, and vagina. The data were obtained from an online database (https://varsome.com/gene/DLG5). The red arrows denote the expression level of DLG5 in the human cervix, uterus, and vagina. B Sanger sequencing validating the DLG5 variant in patient Fc-M-6. The red arrow indicates the variant site c.418C>T. C The domains and mutation in DLG5. Full-length DLG5 is 1919 amino acids long. The Q140* nonsense mutation results in a truncated protein that lacks nearly all of the functional domains; WT, wild-type allele

HOXD3

We identified a HOXD3 missense variant in patient Fc-M-4, who was diagnosed with type I MRKH syndrome (ESHRE classification: U5bC4V4) with primary amenorrhea and dyspareunia (Table 1). HOXD3 was highly expressed in the human uterus (Fig. 3A). Fc-M-4 carried a heterozygous HOXD3 variant, c.575C>G (p.P192R) (Table 2), which was confirmed by Sanger sequencing (Fig. 3B) and classified as a VUS (PM2 + PP3) according to the ACMG guidelines (Table 2). The structural prediction results revealed pronounced structural changes between the HOXD3-WT protein and the mutant protein (p.P192R) (Fig. 3C), and the two predicted structures aligned poorly (Fig. 3D).

Fig. 3

HOXD3 variant is associated with MRKH syndrome. A HOXD3 is highly expressed in the human uterus. The data were obtained from an online database (https://varsome.com/gene/HOXD3). The red arrow denotes the expression level of HOXD3 in the human uterus. B Sanger sequencing validating the HOXD3 variant in patient Fc-M-4. The red arrow indicates the variant site c.575C>G. C The full-length wild-type (WT) HOXD3 protein and P192R mutant protein structures were predicted by RoseTTAFold. D The predicted protein structures for the WT HOXD3 protein (green) and the P192R mutant protein (red) were aligned

GLI3

We also identified a GLI3 missense variant in patient Fc-M-8, who was diagnosed with type I MRKH syndrome (ESHRE classification: U5bC4V4) with primary amenorrhea and dyspareunia (Table 1). GLI3 was highly expressed in the human uterus and vagina (Fig. 4A). Fc-M-8 carried a heterozygous GLI3 variant, c.895C>G (p.L299V) (Table 2), which was confirmed by Sanger sequencing (Fig. 4B) and classified as a VUS (PM2 + PP3 + BP1) according to the ACMG guidelines (Table 2). The structural prediction results showed marked structural changes between the GLI3-WT protein and the mutant protein (p.L299V) (Fig. 4C), and the two predicted structures aligned poorly (Fig. 4D).

Fig. 4

GLI3 variant is associated with MRKH syndrome. A GLI3 is highly expressed in the human uterus and vagina. The data were obtained from an online database (https://varsome.com/gene/GLI3). The red arrows denote the expression level of GLI3 in the human uterus and vagina. B Sanger sequencing validating the GLI3 variant in patient Fc-M-8. The red arrow indicates the variant site c.895C>G. C Structures of amino acids 1–683 of the wild-type (WT) GLI3 protein and the L299V mutant protein were predicted by RoseTTAFold. D The predicted structures (amino acids 1–683) of the WT GLI3 protein (green) and the L299V mutant protein (red) were aligned

Other genes associated with MRKH syndrome

We also identified variants of KMT2D, LIFR, CLIP1, HIRA, and GATA3. All variants were classified as VUS according to the ACMG guidelines (Table 2). All variants were confirmed using Sanger sequencing (Additional files 3 and 4: Figure S2A and S3A; some data not shown). Some of the mutant proteins that harbored the variants were analyzed by protein structure prediction (Additional files 3 and 4: Figure S2B, C, S3B, and C).

Two variants of KMT2D (c.2992C>G; P998A, and c.1754C>T; P585L) were found in two unrelated patients. The P998A variant was found in patient Fc-M-2 (Table 2), who was diagnosed with type I MRKH syndrome with primary amenorrhea and bilateral uterine remnants, without a rudimentary cavity (Table 1). The P585L variant was carried by patient Fc-M-7 (Table 2), who was diagnosed with type II MRKH syndrome with bilateral uterine remnants without a rudimentary cavity, congenital cleft palate, and bilateral fallopian tubal dysplasia (Table 1).

Discussion

Using WES and genetic analysis, this study identified several novel genetic variants that may lead to MRKH syndrome. Next, we discuss these novel genes involved in the pathogenesis of MRKH syndrome.

TBC1D1

TBC1D1 encodes a Rab-GTPase-activating protein and is involved in regulating the trafficking of GLUT4 storage vesicles to the cell surface [19]. Previous studies have found that heterozygous mutation of TBC1D1 is associated with congenital anomalies of the kidneys and urinary tract (CAKUT) [20, 21]. The TBC1D1 mutation may promote the pathogenesis of CAKUT through its role in glucose homeostasis [20]. Patient Fc-M-1 harboring the TBC1D1 truncating variant found in this study also had CAKUT; the left pelvic kidney did not function and the right kidney was enlarged as a compensatory response. Type II MRKH syndrome is usually complicated by abnormalities in the urinary system. Therefore, the study findings suggest that attention should be paid to whether patients with type II MRKH syndrome, especially those with urinary system abnormalities, carry genetic variants related to CAKUT. A previous study by our group has also shown that sequence variants related to CAKUT could be associated with another complex reproductive tract malformation, which is related to the Herlyn–Werner–Wunderlich syndrome [17].

DLG5

Dlg5 is required for epithelial tube maintenance in the mouse brain and kidney. Dlg5 gene knockout mice exhibit hydrocephalus and renal cysts [22]. Heterozygous sequence variants of DLG5 are associated with ureteropelvic junction obstruction or renal agenesis [23]. Gene expression data of DLG5 in humans in the present study (Fig. 2A) showed that DLG5 was highly expressed in the tissues of the reproductive tract, including the cervix, uterus, and vagina. Therefore, the DLG5 protein may play an important role in the development of the reproductive tract. In this study, patient Fc-M-6 with MRKH syndrome harbored a rare DLG5 truncating variant, Q140* (Fig. 2C), which is classified as a pathogenic variant according to ACMG guidelines. Q140* lacks almost all functional domains of the DLG5 protein, so the variant may lead to the loss of function of the protein. As few patients with MRKH syndrome were included in this study, only one DLG5 variant was found. We expect our follow-up research or other research groups to identify DLG5 variants in unrelated patients with MRKH syndrome and provide more evidence of the association of mutations in this gene with MRKH syndrome.

KMT2D

Previous studies have reported that KMT2D mutations can lead to Kabuki syndrome [24, 25]. KMT2D gene variants are also related to CAKUT and renal agenesis [26,27,28,29]. In the present study, patients Fc-M-2 and Fc-M-7 each harbored a KMT2D variant, and each also carried a variant in a second gene. Fc-M-2 carried a variant of the LIFR gene (Table 2), which is related to the pathogenesis of CAKUT [30]. Fc-M-7 harbored a variant in the CLIP1 gene (Table 2), which is also a candidate gene for CAKUT [29]. These findings suggest that digenic inheritance may be involved in Fc-M-2 and Fc-M-7. That KMT2D, LIFR, and CLIP1 are all associated with CAKUT also indicates that perturbations of renal development-related genes may affect the normal development of the reproductive tract, including the uterus and vagina [17, 31].

Other genes

We also found several genes, including HOXD3, GLI3, HIRA, and GATA3, which may be associated with MRKH syndrome. HOXD3 is predominantly expressed in the uterus and kidney (https://varsome.com/gene/HOXD3), suggesting important roles in reproductive and urinary tract development. GLI3 is highly expressed in the uterus (https://varsome.com/gene/GLI3). Variants in GLI3 have also been associated with CAKUT or renal agenesis [23, 26, 32]. HIRA encodes a histone chaperone and is considered the primary candidate gene in DiGeorge syndrome. Deletion or duplication of the chromosomal locus 22q11.21, which contains the HIRA gene, has been associated with MRKH syndrome [33, 34]. GATA3 is expressed in Wolffian ducts at the time of their emergence in the embryonic intermediate mesoderm [35]. GATA3 mutations cause hypoparathyroidism, deafness, and renal dysplasia syndrome as well as CAKUT [26, 36, 37]. Female genital tract malformations, such as uterus didelphys with septate vagina and septate uterus, have also been observed in patients with hypoparathyroidism, deafness, and renal dysplasia syndrome harboring GATA3 mutations [38]. Considering the close developmental relationship of the Müllerian and Wolffian ducts, mutations in the HOXD3, GLI3, HIRA, and GATA3 genes might lead to the abnormal fusion of the caudal parts of the Müllerian ducts needed for the normal development of the uterus and vagina, resulting in MRKH syndrome.

This study has two main limitations. First, the sample size is relatively small. MRKH syndrome is a rare disease with an incidence of 1/5000, and few patients with this disease present to our hospital for treatment, so the number who could be recruited was limited. Despite the small sample size, we sought to identify the pathogenic genetic factors in each patient. More patients will be recruited in the future, and we will examine whether the candidate genes identified here are also mutated in newly enrolled patients. Second, most of the variants found in this study are VUS, mainly because they have not yet been rigorously analyzed in functional experiments. We will continue to study the variants of interest to clarify the molecular mechanisms of their pathogenesis.

Conclusion

Genetic variants, especially in the TBC1D1 gene, are related to the pathogenesis of MRKH syndrome. This study provides new insights into the etiology of MRKH syndrome and the data are a new molecular genetic reference for the development of the reproductive tract.