Molecular analysis and genotype-phenotype correlations in patients with classical congenital adrenal hyperplasia due to 21-hydroxylase deficiency from southern Poland — experience of a clinical center

Purpose The prevalence of CYP21A2 gene variants and genotype-phenotype correlations vary among populations. The aim of this study was to characterize CYP21A2 gene variants in adult patients with classical congenital adrenal hyperplasia (CCAH) from southern Poland and to analyze genotype-phenotype correlations.
Materials/Methods A total of 48 patients (30 women and 18 men) with CCAH were included in the study. Patients were divided into two clinical subgroups, namely, salt-wasting (SW; 38 patients) and simple virilizing (SV; 10 patients). Genetic analysis by MLPA (multiplex ligation-dependent probe amplification) was performed in all patients. In dubious cases, the analysis was complemented by Sanger sequencing. Genotypes were classified into five groups (depending on the residual in vitro enzymatic activity), namely, null, A, B, C, and D, and correlated with the clinical picture.
Results Molecular defects were investigated and identified in 48 patients. The most common variant in the studied group was I2G, followed by whole or partial gene copy deletion, and I172N. One novel variant, c.[878G>T] (p.Gly293Val), was found. In nine patients, a non-concordance between genotype and phenotype was observed. Genotype-phenotype correlations measured by positive predictive value (PPV) were as follows: 100% in group null, 90.5% in group A, and 66.7% in group B.
Conclusions The frequencies of CYP21A2 variants in the studied cohort were similar to those previously reported in other countries of the region. There was a good correlation between genotype and phenotype in the null and A groups, the correlation being considerably lower in group B.


Introduction
Congenital adrenal hyperplasia (CAH) is an endocrine disorder caused by mutations in genes encoding enzymes involved in adrenal steroidogenesis. Based on the type of enzyme block, several CAH types can be distinguished. The most common type (which accounts for about 95-99% of cases) is related to mutation in the CYP21A2 gene, encoding 21-hydroxylase, and may result in different clinical forms, namely, classical, including salt-wasting (SW) and simple virilizing (SV), and nonclassical (NCCAH), the latter manifesting at a later age and characterized by milder symptoms [1-3]. The incidence of CAH is estimated at about 1:15,000 and differs among populations [4]. In some ethnic groups, the reported prevalence is higher (for example, in the population of La Réunion, it is 1:2141 [5], and among Yupik Eskimos, it is 1:280 [6]). In the Caucasian population, the prevalence of the disease varies from 1 in 5000 to 1 in 23,000 live births [7]. According to the latest data, in the USA, the incidence of CAH varies from 1:9941 to 1:28,661 live births [8]. The CYP21A2 gene is located on the short arm of chromosome 6 (6p21.3) [9]. The inactive pseudogene, CYP21A1P, shares 98% sequence homology with the active CYP21A2 gene and contains variants which, when incorporated in the active gene, can lead to the loss of its function. The majority of pathogenic variants (about 90-95%) occurring within CYP21A2 originate from this pseudogene. Within the active gene, there are also de novo variants causing only a small portion of inherited cases of CAH [10]. Different variants in the CYP21A2 gene can lead to a variable degree of loss of 21-hydroxylase activity, which can result in various clinical presentations. To date, over 200 variants with a pathogenic role have been described [11,12].
The majority of pathogenic variants in the CYP21A2 gene are large conversions, large deletions, or one of nine small variants, as follows: p.Gly111Valfs*21 (classically designated as "del8bp" or "Δ8bp"), exon 6 ("E6") cluster (p.[Ile237Asn; Val238Glu; Met240Lys]), p.Leu308Phefs*6 ("F306+T"), p.Gln319Ter ("Q318X"), p.Arg357Trp ("R356W"), p.Ile173Asn ("I172N"), p.Pro31Leu, p.Val282Leu, p.Pro454Ser, and c.293-13A/C>G ("I2G") [13]. Molecular defects in Polish patients with CAH have not been analyzed so far. Therefore, the aim of our study was to identify the spectrum of variants in adult patients from southern Poland with classical congenital adrenal hyperplasia (CCAH) due to 21-hydroxylase deficiency and to analyze genotype-phenotype correlations.

Study population
Forty-eight adult patients diagnosed with CCAH due to 21-hydroxylase deficiency, treated at the Department of Endocrinology, University Hospital Medical College, Krakow, Poland, between 2015 and 2019 were enrolled in this study. In the study group, there were 30 women aged 28.20 (± 12.12) years and 18 men aged 27.70 (± 8.97) years. In total, 38 patients (79.17%) had the SW form and 10 (20.83%) the SV form. The CCAH diagnosis was made based on the clinical and physical examination and retrospective analysis of the patients' medical history. Data such as age at onset of the disease, electrolyte disturbances, genital appearance, previous urological surgery, evidence of hyperandrogenism, hormonal data including 17-hydroxyprogesterone (17-OHP), ACTH, plasma renin activity (PRA), aldosterone levels, and treatment scheme were taken into consideration. Patients who presented symptoms of adrenal crisis as neonates were classified in the SW group. Patients diagnosed in early childhood, but without salt-wasting form symptoms, were assigned to the SV group.
This study is in compliance with the 1964 Declaration of Helsinki and its later amendments. All patients signed an informed consent for participation in the study; an additional informed consent for the genetic analysis was also obtained. The study was approved by the Ethics Committee of the Jagiellonian University Medical College: KBET/225/B/2013.

Statistical analysis
Continuous data with normal distribution are presented as mean value and standard deviation (SD), and non-normal variables are reported as median and interquartile range (IQR) (Me [Q1;Q3]). Categorical data are presented as percentages. Due to the small size of the groups, continuous data were compared using the Kruskal-Wallis test. Categorical data were compared using the chi-square test. A P-value below 0.05 was considered statistically significant. To obtain a visual representation of global patterns within the data, correspondence analysis was implemented. Data were analyzed using Statistica 13.0.

Molecular analysis
DNA was extracted from peripheral blood samples using the NucleoSpin Blood kit (Macherey-Nagel Inc.), according to the manufacturer's protocol. Genetic analysis based on MLPA (multiplex ligation-dependent probe amplification), with the use of the probemix SALSA MLPA P050 CAH from MRC Holland, was performed according to the manufacturer's recommendations, using 50 ng of the isolated DNA per sample. The SALSA MLPA Probemix from MRC Holland, which enables detection of large rearrangements and seven of the most common point mutations in one reaction mix at the same time, was used as the first step of molecular testing of our studied group. In dubious cases, it was complemented by Sanger sequencing, based on the method published by Xu et al. [14]. For the long-range PCR reaction, Amplus polymerase (EurX Sp. z o.o.) was used according to the manufacturer's recommendations, with 200 ng DNA per 25 μl reaction volume and betaine (Sigma-Aldrich) added to the reaction mixture at a final concentration of 1M. The reaction was performed on an Eppendorf Mastercycler realplex thermocycler with an annealing temperature of 61°C and 20 sec added to each elongation step beginning from cycle 21. After agarose gel verification, the remaining reaction mixture was purified with NucleoSpin PCR Clean-up (Macherey-Nagel). The sequencing PCR reaction was performed with Big-Dye Terminator v3.1 (ThermoFisher Applied Biosystems) and 120 ng of the purified PCR product in 10 μl reaction volume. The sequencing conditions were in accordance with the manufacturer's recommendations, with an annealing temperature of 55 °C. Products were purified by ethanol precipitation, and pellets were resuspended in 20 μl nuclease-free water for capillary electrophoresis (ABI 3500, Applied Biosystems).
Four patients were excluded from further analysis because the distribution of variants on individual alleles could not be determined without molecular analysis of the patients' parents, who were not available for testing. Genotypes were categorized into five groups according to the published residual in vitro activity of 21-hydroxylase [15,16], namely, groups null, A, B, C, and D, and then compared with the clinical presentation (except for group C, which was excluded from further analyses). Group null, with 0% residual enzyme function, included patients with alterations in both alleles of CYP21A2 causing a total absence of enzymatic activity (deletions, gene conversions, F306+T, del8bp, cluster E6, R356W, and Q318X). Group A (0-1% residual 21-hydroxylase activity) comprised patients who were I2G homozygotes or compound heterozygotes carrying I2G and another variant. Patients homozygous for the I172N variant (approximately 2% residual 21-hydroxylase activity), or heterozygous for it in combination with null, A, or B group variants, were assigned to group B, with 1-2% residual enzyme activity. Genotype group C (patients with moderately impaired 21-hydroxylase activity and about 20-60% preserved residual enzymatic function) comprised homozygotes or heterozygotes for the milder variants. Patients carrying variants of unknown in vitro impact on residual enzymatic activity were assigned to genotype group D. The global representation of these classes with different CYP21A2 variants is depicted in Fig. 1. The presented variants are numbered according to reference sequences NM_000500.9 (at DNA level) and NP_000491.4 (at protein level). Additionally, common genetic variants are referred to by their classical designations to maintain coherence with other literature sources.
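The grouping rule described above can be sketched in a few lines of Python. This is an illustration only, not the procedure used in the study: it assumes the common convention that the milder of the two alleles determines the genotype group, and the assignment of V282L, P31L, and P454S to group C is an assumption (the text refers only to "milder variants"). The names `VARIANT_CLASS`, `genotype_group`, and `"unknown_variant"` are hypothetical.

```python
# Severity classes ordered from most severe to least severe.
SEVERITY_ORDER = ["null", "A", "B", "C"]

# Variant-to-class map built from the group definitions in the text.
VARIANT_CLASS = {
    "deletion": "null", "conversion": "null", "F306+T": "null",
    "del8bp": "null", "E6 cluster": "null", "R356W": "null", "Q318X": "null",
    "I2G": "A",
    "I172N": "B",
    # Assumption: these are the "milder variants" meant for group C.
    "V282L": "C", "P31L": "C", "P454S": "C",
}


def genotype_group(allele1, allele2):
    """Return the genotype group for a pair of allele variants.

    Any variant missing from VARIANT_CLASS is treated as having unknown
    in vitro impact, which places the genotype in group D. Otherwise the
    milder allele (the one later in SEVERITY_ORDER) decides the group.
    """
    classes = [VARIANT_CLASS.get(v) for v in (allele1, allele2)]
    if None in classes:
        return "D"
    return max(classes, key=SEVERITY_ORDER.index)


print(genotype_group("deletion", "Q318X"))           # both alleles null
print(genotype_group("I2G", "deletion"))             # I2G with a null allele
print(genotype_group("I172N", "I2G"))                # I172N with a group A allele
print(genotype_group("unknown_variant", "deletion")) # unknown impact
```

Under these assumptions, the four calls return null, A, B, and D, respectively, matching the group definitions given in the text.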

Results
Among the 44 patients with CCAH retained for analysis (88 alleles), a CYP21A2 gene variant was detected on every allele (100% of the 88 alleles analyzed). The CYP21A2 gene variants found in our study group are shown in Fig. 2.
Genetic variant distribution among the 88 alleles according to genotype groups null, A, B, and D is presented in Table 1.

Genotype-phenotype correlations
All genotypes were classified into four groups, as follows: null, group A, group B, and group D. Six patients were assigned to group null, 21 patients presented genotype A, and 11 had genotype B, while six patients were classified as genotype D.
Patients assigned to groups null and A were assumed to present a SW phenotype. Patients in genotype group B (with suspected sufficient residual 21-hydroxylase activity) were predicted to have the SV phenotype. Patients in group C were hypothesized to have the NCCAH clinical manifestation (but they were excluded from the study). Severe genotypes (null and A) demonstrated a good correlation with the expected phenotype, with positive predictive value (PPV) of 100 and 90.5%, respectively, whereas the less severe genotype B demonstrated a lower correlation (with PPV 66.7%). Nine patients presented a different phenotype from what had been expected (seven in the SW group and two in SV).
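As a worked illustration of the PPV figures, the value for a genotype group can be taken as the share of patients in that group who show the predicted phenotype. The sketch below assumes this definition. The count of 19 concordant patients in group A is not stated in the text; it is the number implied by the reported 90.5% among 21 patients. The function name `ppv` is hypothetical.

```python
def ppv(concordant, total):
    """Positive predictive value in percent, rounded to one decimal place.

    concordant: patients in the genotype group with the predicted phenotype
    total:      all patients in the genotype group
    """
    if total <= 0:
        raise ValueError("total must be a positive number of patients")
    return round(100.0 * concordant / total, 1)


# Group null: six patients, all with the predicted SW phenotype.
print(ppv(6, 6))    # 100.0
# Group A: 21 patients; 19 concordant is inferred from the reported 90.5%.
print(ppv(19, 21))  # 90.5
```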
There were no differences in anthropometric data among the groups. In our study, patients with genotypes null, A, and B were overweight, with a BMI of 27.2 ± 8.0, 27.3 ± 5.9, and 28.3 ± 6.8 kg/m2, respectively. Age at menarche was also comparable. Blood glucose values were higher in group A, with a trend towards statistical significance (P-value 0.056). The clinical and molecular data according to genotypes are illustrated in Table 2.

Novel variant
One novel variant was found in one allele in the study group, namely, c.[878G>T] (p.Gly293Val). This variant has not yet been described in the literature. However, another variant at this position, c.878G>A (p.Gly293Asp), has been associated with CCAH and shown to result in residual enzyme activity of < 1% [17].
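The relation between the cDNA position and the affected codon can be checked with simple arithmetic. The sketch below assumes standard HGVS coding numbering (c.1 is the A of the start codon) and uses only the part of the standard genetic code needed here; it does not look up the reference sequence, and the helper names are hypothetical.

```python
# Partial standard genetic code: only the codons needed for this check.
CODON_TO_AA = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
    "GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu",
}


def codon_position(cdna_pos):
    """Map a 1-based coding position to (codon number, base within codon 1-3)."""
    codon_number = (cdna_pos - 1) // 3 + 1
    base_in_codon = (cdna_pos - 1) % 3 + 1
    return codon_number, base_in_codon


def substitute(codon, base_in_codon, new_base):
    """Return the codon with the base at the given 1-based position replaced."""
    i = base_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


codon_number, base = codon_position(878)
gly_codons = [c for c, aa in CODON_TO_AA.items() if aa == "Gly"]

# Effect of G>T and G>A at that base for every possible glycine codon.
g_to_t = {c: CODON_TO_AA[substitute(c, base, "T")] for c in gly_codons}
g_to_a = {c: CODON_TO_AA[substitute(c, base, "A")] for c in gly_codons}

print(codon_number, base)  # 293 2
print(g_to_t)
print(g_to_a)
```

Position 878 falls on the second base of codon 293. A G>T change at the second base of any glycine codon yields valine, in agreement with p.Gly293Val, while G>A yields aspartate only for GGT and GGC, which is consistent with the previously reported p.Gly293Asp.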
We used bioinformatics tools to predict the effect of the variant detected in our study on protein function. The identified variant, c.[878G>T], has been predicted to be deleterious by PROVEAN (scored −8.38 at a cutoff of −2.5) [18] and damaging by SIFT (scored 0.000 at a cutoff of 0.05) [19]. Also, according to the Bayes classifier applied in MutationTaster, the identified variant has been predicted to be disease-causing [20]. The clinical severity of the new variant could be deduced from the patient's phenotype since he had a severe alteration (deletion of most of the gene) of the other allele.
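The way the two score cutoffs translate into a prediction can be sketched as follows. This is a minimal illustration of the thresholding step only, not of how PROVEAN or SIFT compute their scores. It assumes that scores at or below the cutoff count as deleterious (PROVEAN) or damaging (SIFT); the function names are hypothetical.

```python
def provean_call(score, cutoff=-2.5):
    """Classify a PROVEAN score: at or below the cutoff is called deleterious."""
    return "deleterious" if score <= cutoff else "neutral"


def sift_call(score, cutoff=0.05):
    """Classify a SIFT score: at or below the cutoff is called damaging."""
    return "damaging" if score <= cutoff else "tolerated"


# Scores reported in the text for c.[878G>T] (p.Gly293Val).
print(provean_call(-8.38))  # deleterious
print(sift_call(0.000))     # damaging
```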

Discussion
This study identifies the spectrum and frequencies of CYP21A2 variants as well as genotype-phenotype correlations in a group of 48 adult patients with CCAH due to 21-hydroxylase deficiency treated in the Department of Endocrinology at the University Hospital in Krakow, Poland. To the best of our knowledge, our study is the first published report on the spectrum and frequency of CYP21A2 genetic alterations in the Polish population. As the distribution of CYP21A2 variants differs between individual populations, the results of the study may be a valuable tool in genetic counseling not only in Polish patients with CCAH but also in populations of the entire European area.
The most common genetic variant in the studied group was I2G, followed by whole or partial gene copy deletion. In a study of 155 CAH patients from southern Germany (92 SW and 52 SV), I2G was also reported as the most common genetic variant [13]. A high frequency of this alteration was also observed by authors from Croatia [21], Turkey [22], India [23], Cuba [24], and China [25,26]. In previous studies from Latin American countries (Argentina [27] and Brazil [28]) and from Portugal [29], p.Val282Leu, which is the most frequent variant in NCCAH, was identified as the most common genetic alteration. Because only CCAH patients were enrolled in the present study, the latter variant accounted for only 3.41% of cases. These differences in genotype frequencies between countries may result from the heterogeneity or homogeneity of the studied populations and from different proportions of particular types of CAH in the published series. The most common genetic alterations reported in several previous studies are summarized in Table 3.
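For readers who wish to reproduce the percentage quoted for p.Val282Leu, a per-allele frequency can be computed as below. This is a sketch under an assumption: the text does not state the underlying count, and a count of 3 among the 88 analyzed alleles is simply the value consistent with the reported 3.41%. The function name `allele_frequency` is hypothetical.

```python
def allele_frequency(variant_alleles, total_alleles):
    """Variant frequency in percent of all analyzed alleles, to two decimals."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return round(100.0 * variant_alleles / total_alleles, 2)


# Assumption: 3 of 88 alleles, the count consistent with the reported 3.41%.
print(allele_frequency(3, 88))  # 3.41
```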
In our study group, a good correlation between genotype and phenotype was observed in group null (patients with alterations in both alleles resulting in 0% residual enzyme activity). Only in seven patients with SW and in two with SV was there no concordance between genotype and phenotype. Genotype-phenotype correlations measured by PPV were as follows: 100% in group null, 90.5% in group A, and 66.7% in group B. Results of previous studies confirm a good correlation between the genotype and the observed CAH phenotype [26,28,30-33]. Previous studies reported 100% concordance in null, 80-96% in A, and about 50-87% in B genotypes [13,27,34,35]. In a large cohort study by New et al., based on 1507 families with CAH, genotype-phenotype non-concordance was observed in 50% of cases [36]. A recent German-Austrian study, which enrolled the largest European cohort of CAH patients from 28 centers (538 CAH cases), reported a poor correlation in the less severe genotypes B (46%) and C (58%) [37].
In four patients of our study, extensive rearrangements were detected, and numerous pathogenic variants were found. These patients have been excluded from further analysis since it was impossible to determine the allelic distributions of these variants without performing an evaluation of the patients' parents, who were unavailable for genetic testing. In the latter group, one novel variant has been identified.
In our study group, there were more female than male patients. This is in agreement with previous data, according to which a substantial proportion of male patients remain undiagnosed [36]. Our study included adult patients in whom the diagnosis of CCAH had been made years earlier, when neonatal screening was not available in Poland. This higher proportion of women is typical for countries where neonatal screening has not been routinely used. We believe that this trend may be reversed in the future given that neonatal screening for CAH was introduced in Poland in 2016 [38].
In all groups null, A, B, and D, the patients' BMI was > 25 kg/m2. The tendency for obesity in CCAH accords with the data reported in previous studies [39,40]. Fasting blood glucose concentration was higher in group A, with a trend towards statistical significance. One of the largest studies on cardiometabolic complications also demonstrated higher frequencies of obesity and diabetes (mainly type 2) in CAH patients [41].
One novel variant was found in one allele in the study group, namely, c.[878G>T] (p.Gly293Val). By means of bioinformatics tools, the identified variant has been predicted to be pathogenic.
The patient had a severe alteration (deletion of most of the gene) on another allele. The clinical severity of the new variant could be deduced from the patient's phenotype; the diagnosis of SW in this patient was established in the neonatal period, when he presented symptoms of adrenal crisis and required glucocorticoid and mineralocorticoid therapy.
However, the small size of the study group is a limitation of this study, and the observed non-concordance between genotype and phenotype requires further investigation.
Because neonatal screening has been widely introduced in most countries, the number of patients diagnosed with CCAH is expected to increase in the future. The results of the present study, and particularly the description of the novel variant, may contribute to a better understanding of the disease. Moreover, the presented data can be useful for the prediction of phenotype based on genotype and may be helpful not only in genetic counseling but also in making treatment decisions. The practical implication of the data is that special attention must be paid to patients whose genotype indicates no or very low 21-hydroxylase activity, as adrenal crisis may occur in these patients when corticoid doses are insufficient or during concomitant acute disease.

Conclusions
The majority of cases in our study were characterized by a strong genotype-phenotype correlation. Variant frequencies of affected alleles were similar to those previously reported for other countries of the region. The most common genetic variant in the study cohort was I2G, followed by large deletions and I172N. One novel variant, c.[878G>T] (p.Gly293Val), has been identified and characterized by means of bioinformatics tools.

Funding
This study was supported by funds from the subsidies of the Ministry of Science and Higher Education: No. K/ZDS/004513.

Declarations
Ethics approval
The study was approved by the Ethics Committee of the Jagiellonian University Medical College, Krakow, Poland: KBET/225/B/2013.

Consent to participate
Informed consent was obtained from all individual participants included in the study.

Conflict of interest
The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.