Background

Cystic fibrosis (CF) (ORPHA: 586; OMIM: 219700) is a rare genetic disease of autosomal recessive inheritance that is most common in the Caucasian population [1].

This disorder arises from anomalies in the sequence of the CF transmembrane conductance regulator gene (CFTR) (OMIM 602421), which alter the chloride and bicarbonate transport channel regulated by cyclic adenosine monophosphate (cAMP). These alterations produce multisystemic clinical manifestations that lead to progressive deterioration in CF patients [2, 3].

Since the CFTR gene was first described in 1989 [4–6], 2107 variants have been reported in the cystic fibrosis mutation database [7], of which 431 are associated with the risk of disease [8]. The most common is c.1521_1523delCTT (p.Phe508del), which is present in more than 80% of alleles in the world population with CF and whose frequency is higher in northern European countries. In certain populations, other variants can reach higher frequencies than p.Phe508del, and some have only been described in specific territories [9–11].

Although p.Phe508del is also the most common CFTR gene variant in Spain, it is less common there than in northern European countries. Likewise, notable differences in the frequency of this sequence alteration have been described between regions of the country, as well as great heterogeneity in other CFTR changes [12, 13].

The sequence variants are classified into 7 classes [14] according to the effect they have on the amount, function or stability of CFTR in the cell membrane [15, 16]. Recent studies have considered classifying sequence alterations into 2 groups: “minimal function variants” (I, II, and VII classes), which are considered high-risk mutations and are associated with a more severe phenotype and early deterioration, and “residual function variants”, or low-risk mutations (IV, V and VI classes), which can preserve some of the CFTR function and lead to milder late-onset disease [17, 18]. Class III variants (gating mutation) can belong to both classifications, although most are found among the minimal function variants [19]. However, it should be remembered that the variability or severity of CF symptoms also seems to be explained by factors such as age, disease progression, different environmental factors, and modifying genes [20, 21].

In addition, several studies have described an association between the genotype and different clinical manifestations, mainly reproductive, pancreatic and other gastrointestinal disorders, using different methods of classifying the genotype [22–25].

In Spain, studies have been carried out to analyse the genotype–phenotype relationship in patients from clinical units, which all include specific clinical manifestations or particular sequence variants [26, 27]. On the other hand, state-level studies have been carried out with data from registries that are not specific to CF [28, 29], in which a sample of CF patients was selected to describe certain characteristics without addressing the genotype–phenotype relationship.

Recently, rare disease registries have been positioned as a fundamental instrument since they allow greater knowledge of the epidemiology and characteristics of the people affected by obtaining systematic and complete information on each of them [30–32]. Therefore, the objective of this study was to describe the mutational spectrum as well as to analyse its relationship with the different clinical manifestations of people with CF based on the information from the rare disease registry of Murcia, a region of southeastern Spain.

Methods

Study population

A cross-sectional study was carried out among people with a confirmed diagnosis of CF through 31 December 2018, who were registered in the Rare Diseases Information System of Murcia (SIER) [33]. People with CFTR-related disorders (CFTR-RDs), CF-screen positive inconclusive diagnosis (CF-SPID) and healthy carriers were excluded. Informed consent of the study population was not needed, as the SIER is subject to personal data protection regulations and registered with the Spanish Data Protection Agency (no. 2101040243, on 14 April 2010) [34]. Even so, the study was presented to the Clinical Research Ethics Committee of the International Doctoral School of the University of Murcia (no. 3376/2021), and it was approved on 6 May 2021.

Rare Diseases Information System (SIER)

The SIER, in operation since 2010, is a population-based rare disease registry (RDR) of the Region of Murcia, an Autonomous Community located in southeastern Spain with an estimated population of 1,493,898 inhabitants as of 1 January 2019, which constitutes 3.18% of the Spanish population. For the inclusion of people with a rare disease (RD), this system uses a list of selected codes from the International Classification of Diseases (ICD) and integrates information from various sources. Currently, the SIER has 47 different sources of information: clinical-administrative databases such as the regional Minimum Basic Data Set (MBDS); preexisting patient registries such as the renal disease registry; the orphan or foreign drug dispensing database and databases of people with recognized disability and dependency; notifications from patient associations; and clinical hospital units. For this study, the Regional CF Unit of the Virgen de la Arrixaca University Clinic Hospital (HCUVA) was one of the main sources contributing people with CF to the registry. Other sources that incorporated some of the study patients are shown in Table 1.

Table 1 Information sources that contribute CF patients to the SIER*

Once possible cases of RD have been incorporated into the registry, they undergo a validation process, confirming the evidence of the diagnosis once the electronic medical record of the patient has been reviewed [33].

Regarding the codes for the detection of people with CF, code 277.0 (0–9) from the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) was used until 2015, and code E84 (0–9) from the tenth version of the Spanish Clinical Modification (ICD-10-ES) was used from 2016 to 2018, with no significant differences between the two classifications.
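The code-based detection rule above can be illustrated with a minimal sketch. It is not the SIER implementation: the function name `is_cf_code` is hypothetical, and the patterns rest on the assumption that "277.0 (0–9)" means codes 277.00 to 277.09 and that "E84 (0–9)" means any E84 subcode beginning with a digit 0 to 9 after the decimal point.

```python
import re

# Assumption: "277.0 (0-9)" is read as ICD-9-CM codes 277.00 through 277.09.
ICD9_CF = re.compile(r"^277\.0[0-9]$")
# Assumption: "E84 (0-9)" is read as ICD-10-ES subcodes E84.0 through E84.9,
# optionally followed by one further digit.
ICD10_CF = re.compile(r"^E84\.[0-9][0-9]?$")


def is_cf_code(code: str, year: int) -> bool:
    """Return True if `code` would flag a possible CF case for the coding year."""
    code = code.strip().upper()
    if year <= 2015:
        # ICD-9-CM was used until 2015.
        return bool(ICD9_CF.match(code))
    if 2016 <= year <= 2018:
        # ICD-10-ES was used from 2016 to 2018.
        return bool(ICD10_CF.match(code))
    return False  # outside the study period


if __name__ == "__main__":
    print(is_cf_code("277.02", 2012))
    print(is_cf_code("E84.0", 2017))
```

A flagged record is only a possible case; as described above, each one still goes through validation against the electronic medical record.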

Data collection

The data collected from each patient included the following:

Consultation of basic patient information: Sex, age at diagnosis (< 18 years or ≥ 18 years), age on 31 December 2018, native country of the parents, diagnosis by neonatal screening, death and transplant (yes/no).

Obtaining genetic information: First, information was obtained about the variants of the CFTR gene. In addition, the CFTR2 project database [8], the database of single nucleotide polymorphisms (dbSNP) [35], the cystic fibrosis mutation database [7] and ClinVar [36] were consulted to record the type of sequence alteration that each patient presented, along with the associated nucleotide change, amino acid change and its molecular and clinical consequence. Second, the patients were classified into 2 groups according to genotype: "high-risk" if both alleles carried class I, II, III or VII variants ("minimal function mutations") and "low-risk" if at least one allele carried a class IV, V, or VI variant ("residual function mutations"). A patient with the genotype "F508del/D1270N + R74W" was classified as low-risk because the complex allele was considered to mitigate the effect of the main mutation. In contrast, a patient with the genotype "G85E/G451V + G253R" was classified as high-risk because the complex allele was considered to worsen the effect of the parent mutation [1, 17, 18].

Procurement of clinical manifestations: We collected the following clinical manifestations through 31 December 2018: respiratory and digestive symptoms, metabolic disturbances and others such as bone alterations.

Respiratory symptoms: Evidence of at least one episode of allergic bronchopulmonary aspergillosis (ABPA), one or more clinically relevant episodes of haemoptysis (> 200 ml), the presence of nasal polyps, chronic respiratory colonization by different microorganisms (Staphylococcus aureus, Burkholderia cepacia or Pseudomonas aeruginosa) and at least one documented acute infection with methicillin-resistant Staphylococcus aureus, Achromobacter xylosoxidans, or nontuberculous mycobacteria.

Lung function was evaluated using the best value of the forced expiratory volume in the first second (FEV1) recorded in 2018, normalized with respect to its theoretical value using the Global Lung Function Initiative (GLI) tool and expressed as a percentage of the predicted value. The variable was dichotomized into ≤ 90% and > 90%, which is the cut-off point used in other studies [37].

Digestive symptoms: Presence of meconium ileus at birth, rectal prolapse, intussusception, distal intestinal obstruction syndrome (DIOS), pancreatic insufficiency, recurrent acute or chronic pancreatitis and CF-related liver disease (liver disease with or without cirrhosis, including fatty liver).

Metabolic disturbances: Insulin-dependent CF-related diabetes (CFRD) and at least one CF-related episode of dehydration requiring medical attention.

Others: Bone disorders, including low bone density, osteoporosis, and digital arthropathy.
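The two-group genotype rule described under "Obtaining genetic information" can be sketched as follows. This is an illustrative sketch only: the function name `classify_genotype` and the use of Roman-numeral strings for variant classes are assumptions, and complex alleles (such as the two cases described above) were assigned by expert judgement, which this rule does not capture.

```python
# Variant classes grouped as in the text.
MINIMAL_FUNCTION = {"I", "II", "III", "VII"}   # "minimal function mutations"
RESIDUAL_FUNCTION = {"IV", "V", "VI"}          # "residual function mutations"


def classify_genotype(allele1_class: str, allele2_class: str) -> str:
    """Classify a genotype as 'high-risk' or 'low-risk' from its two allele classes."""
    classes = {allele1_class, allele2_class}
    unknown = classes - MINIMAL_FUNCTION - RESIDUAL_FUNCTION
    if unknown:
        raise ValueError(f"Unrecognized variant class(es): {sorted(unknown)}")
    # Low-risk if at least one allele carries a residual function variant.
    if classes & RESIDUAL_FUNCTION:
        return "low-risk"
    # Otherwise both alleles carry minimal function variants.
    return "high-risk"


if __name__ == "__main__":
    print(classify_genotype("II", "I"))
    print(classify_genotype("II", "V"))
```

Raising an error for an unrecognized class, rather than defaulting to one group, keeps unclassified variants visible for manual review.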

Statistical analysis

We compared the clinical and demographic variables between the two genotype groups using hypothesis tests chosen according to the type of variable and its normality, which was assessed with the Kolmogorov–Smirnov test. The absolute and relative frequencies of the clinical and demographic variables were calculated, as were the allelic frequencies of the CFTR gene variants in the studied population.

For the quantitative variables, Student's t test was used if the data were normally distributed, and the Mann–Whitney U test was used if they were not. For qualitative variables, the chi-squared or Fisher’s exact test was used when applicable.
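For qualitative variables, the choice between the chi-squared and Fisher's exact test can be sketched as below. The study itself used SPSS and does not state its criterion, so the threshold used here (Fisher's test when any expected cell count is below 5) is an assumed common convention, and all function names are hypothetical.

```python
from math import comb


def expected_counts(a: int, b: int, c: int, d: int) -> list[list[float]]:
    """Expected cell counts of the 2x2 table [[a, b], [c, d]] under independence."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[r * k / n for k in cols] for r in rows]


def choose_test(a: int, b: int, c: int, d: int, threshold: float = 5.0) -> str:
    """Pick 'fisher' if any expected count is below `threshold` (assumed rule)."""
    smallest = min(min(row) for row in expected_counts(a, b, c, d))
    return "fisher" if smallest < threshold else "chi-squared"


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    table = (3, 1, 1, 3)  # hypothetical counts, not study data
    print(choose_test(*table), fisher_exact_two_sided(*table))
```

The small tolerance in the comparison guards against floating-point ties between equally probable tables.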

Additionally, crude and adjusted odds ratio (OR) and 95% confidence interval (CI) were calculated using binary logistic regression analysis to examine associations between genotype and the clinical manifestations of the participants. There was a statistically significant association between genotype and age at diagnosis and age as of 31 December 2018 (p < 0.01). Therefore, these variables were taken into account for the adjustment of the model together with sex and native country of the parents.
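As an illustration of the crude estimate, the odds ratio and its Wald 95% CI can be computed directly from a 2×2 table; for a single binary predictor this matches what an unadjusted binary logistic regression returns. The sketch below is not the authors' SPSS procedure, the function name `crude_odds_ratio` is hypothetical, and the counts in the usage example are invented. The adjusted estimates require a multivariable model and are not reproduced here.

```python
from math import exp, log, sqrt


def crude_odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.959964):
    """Crude OR and Wald CI from a 2x2 table.

    a: high-risk with manifestation     b: high-risk without
    c: low-risk with manifestation      d: low-risk without
    z: normal quantile (1.959964 gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("All four cells must be positive for a Wald interval.")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's method).
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not study data.
    or_, lo, hi = crude_odds_ratio(20, 10, 10, 20)
    print(f"OR = {or_:.2f} (95% CI: {lo:.2f}, {hi:.2f})")
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen level.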

In addition, a sensitivity analysis was performed to verify that the patients diagnosed by neonatal screening did not lead to bias.

All tests were two-tailed, and statistical significance was set at p ≤ 0.05. Statistical analyses were performed with the IBM SPSS 25.0 statistical package (IBM Corporation, Armonk, New York, USA).

Results

There were 192 people diagnosed with CF registered in the SIER through 31 December 2018.

Of the total number of people included in the study, 53.6% were male, with a mean age ± standard deviation (SD) of 20.0 ± 15.2 years (median: 15.0, interquartile range [IQR]: 7.0–31.0), and 46.4% were female (mean ± SD: 24.5 ± 16.2 years, median: 23.0, IQR: 10.0–35.0). Adults (18 years or older) comprised 41.7% of all patients. The mean age ± SD at diagnosis was 7.8 ± 14.4 years, and the median was 0.0 years (IQR 0.0–7.5). Moreover, 16.1% of people (n = 31) were diagnosed by the neonatal screening program, which was implemented in Murcia in March 2007.

In 84.9% of the study population, the native country of the parents was Spain. In descending order of frequency, parents had other nationalities as follows: Ecuadorian (6.2%), English (2.6%), Moroccan (2.1%), and Argentine (1.0%). The remaining 3.2% included parents of French, Peruvian, Moldovan, Ukrainian, Hungarian and Bulgarian nationalities.

Regarding clinical manifestations, respiratory problems were present in 63.0% of the patients. The mean FEV1 percentage ± SD was 90.0 ± 21.4 and was inversely correlated with the age of the patients (−0.36; p < 0.01). In addition, 57.8% of the people presented infection/colonization by a bacterial pathogen at some point. The most frequently isolated microorganism was Staphylococcus aureus, with those under 18 years of age being the most likely to be infected. Among digestive manifestations, pancreatic insufficiency was the most common (56.8%). Furthermore, 8.9% of the patients presented meconium ileus as the first manifestation of the disease.

Table 2 shows the main demographic and clinical characteristics of the patients according to their genotype, which was available for 94.8% of the cases (n = 182). Patients for whom genetic information was not available (n = 10) were excluded from further statistical analyses. Of the 182 patients with genetic information, 67% were classified as having a high-risk genotype (n = 122), and 33% were classified as having a low-risk genotype (n = 60).

Table 2 Demographic and clinical characteristics according to genotype in patients with cystic fibrosis*

People with a high-risk genotype were younger (p < 0.001), with a lower mean age at diagnosis (p < 0.001) and lower mean FEV1 values (p = 0.045) with respect to the low-risk genotype. Likewise, the high-risk genotype presented a higher frequency of respiratory infections by methicillin-resistant Staphylococcus aureus (p = 0.013) and Achromobacter xylosoxidans (p = 0.034). A higher incidence of meconium ileus, pancreatic insufficiency, CF-related liver disease and CFRD was observed in the high-risk patients (p ≤ 0.01). Furthermore, 15.6% of patients with a high-risk genotype required lung or liver transplantation compared to 6.7% with a low-risk genotype, although the difference was not significant (p = 0.089).

Table 3 shows the frequency of CFTR gene variants by alleles in the 192 patients studied. The most common mutation was p.Phe508del, found in 58.3% of the patients (27.0% of them homozygous and 73.0% heterozygous) and in 37.0% of the alleles.
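The allele frequency follows from the patient-level percentages, since each homozygous carrier contributes two copies of the variant and each heterozygous carrier one. The short check below uses only the rounded percentages reported above; the function name `allele_frequency` is hypothetical.

```python
def allele_frequency(carrier_fraction: float, homozygous_share: float) -> float:
    """Allele frequency from the fraction of patients carrying a variant
    and the share of those carriers who are homozygous."""
    heterozygous_share = 1.0 - homozygous_share
    # Average number of variant copies per carrier (2 if homozygous, 1 otherwise).
    copies_per_carrier = 2 * homozygous_share + heterozygous_share
    # Each patient has two alleles, hence the division by 2.
    return carrier_fraction * copies_per_carrier / 2


if __name__ == "__main__":
    # Reported values for p.Phe508del: 58.3% carriers, 27.0% of them homozygous.
    freq = allele_frequency(0.583, 0.27)
    print(f"{freq * 100:.1f}% of alleles")
```

With the reported inputs the result rounds to 37.0%, in agreement with the allele frequency given in the text.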

Table 3 CFTR sequence variants detected in 384 alleles from 192 patients studied

In total, 76 genotypes and 49 different variants were found. In approximately 50% of the alleles, the following 3 mutations were observed: p.Phe508del, c.1624G>T (p.Gly542Ter) and c.3017C>A (p.Ala1006Glu). Other variants were found in 1.6% to 3.9% of the alleles, generally in compound heterozygosity with other "residual function mutations" or with p.Phe508del. The remaining variants were present at frequencies equal to or less than 1%.

Table 4 shows the multivariate analysis of the relationship between genotype and clinical manifestations. The high-risk genotype was significantly associated with a lower percentage of FEV1 values (OR: 5.3; 95% CI: 1.2, 24.4), a higher risk of developing Pseudomonas aeruginosa infection (OR: 7.5; 95% CI: 1.7, 33.0) and the presence of pancreatic insufficiency (OR: 28.1; 95% CI: 9.3, 84.4) (P < 0.05) compared to the low-risk genotype. No other statistically significant associations were observed.

Table 4 Multivariate analysis for the clinical manifestations of people with cystic fibrosis and their genotype*

Discussion

The present study shows CFTR sequence alterations and their relationship with the clinical manifestations of people with CF included in the rare disease registry of the Region of Murcia. Although the mutational spectrum of CFTR in Murcia was published in 2009 [38], the study included 91 patients selected from the CF unit in whom 29 different variants were described. Therefore, our study offers more complete, up-to-date and representative information on the genetics of people with CF in this geographic area. Furthermore, we have no evidence of other regional or national articles that analyse the genotype–phenotype relationship among patients included in this type of registry.

In the study population, the most frequently observed variants were p.Phe508del, p.Gly542Ter and p.Ala1006Glu. Additionally, 76 genotypes and 49 different variants were detected, supporting the great heterogeneity described in Mediterranean countries [9, 39].

The p.Phe508del variant was the most common sequence change, present in 58.3% of CF patients, a figure lower than that reported by other Spanish Autonomous Communities [40, 41] and different European Mediterranean countries [1, 11, 42–44]. In addition, its frequency by alleles (37.0%) is among the lowest described to date, due in large part to the high percentage of carriers of the variant in heterozygosity (73%) compared with the 48.1% reported recently by the registry of CF patients in Spain [13].

Moreover, the Region of Murcia is, together with Andalusia and the Balearic Islands, one of the Spanish Autonomous Communities with the highest percentage of alleles with the variant p.Gly542Ter [41, 45]. In fact, according to a study by Estivill et al. [12], this alteration is more common in Mediterranean countries, with an average frequency of 6.1%, and the highest prevalence described thus far was in the Balearic Islands (16.7%). Recent data from the Spanish CF registry suggest that 7.7% of registered patients carry the p.Gly542Ter variant [13]. In the SIER, this variant is present in 16.5% of patients and 8.1% of CFTR alleles.

The variants described above were followed in frequency by p.Ala1006Glu and c.617T>G (p.Leu206Trp), which is rare in other European countries [11, 13]. Nevertheless, other mutations that are frequent in Europe, such as c.1652G>A (p.Gly551Asp), which has a specific treatment [46], were not found in our study population.

The patients were grouped into 2 genotypes according to the consequence that the different variants have on the function and amount of CFTR protein, as proposed by previous studies [47]. However, to date, no work has used this classification to link the genotype to all the clinical manifestations included in this study.

The proportion of people classified as having the low-risk genotype was 33%. This represents a much higher percentage of patients with mild forms than that described by McKone [17] or De Boeck [47] in other countries, while studies in Mediterranean countries indicate that this figure is close to 15% [48]. In Spain, there are no studies that describe the percentage of mild forms, but the figure reported in this study supports what has been suggested by other authors such as De Gracia, who points out that the mild forms could be more frequent than what has been described to date [49].

The high percentage of people with a low-risk genotype can largely explain the observed frequency of the different clinical manifestations. An example of this is pancreatic insufficiency, which existed in 67.7% of the patients in this study. Although it has been classically described that approximately 85–90% of CF cases are associated with pancreatic insufficiency, our results suggest that this percentage applies mainly to severe forms, since pancreatic insufficiency was present in 89.2% of our cases classified as high-risk versus 30.5% of those classified as low-risk.

In addition, our results are consistent with previous studies that have described an association between genotype and respiratory and digestive symptoms in people with CF. In our study population, the most consistent findings observed were the appearance of pancreatic insufficiency with the high-risk genotype but also a lower percentage of predicted FEV1 and colonization by Pseudomonas aeruginosa.

The relationship between the severity of the mutations and pancreatic damage has been previously reported by grouping the mutations into classes, associating a higher risk of pancreatic insufficiency in those with variants that cause greater CFTR dysfunction [50,51,52,53].

Regarding colonization by microorganisms, Kerem et al. [54] and Vongthilath et al. [55] showed that chronic infection by Pseudomonas aeruginosa tends to appear in CF populations that present greater lung damage and more severe symptoms related to loss of function of CFTR. Therefore, although different authors have linked the genotype with pancreatic function and lung damage, to date, no studies have used this classification. So, we consider that the grouping used allows the entire mutational spectrum to be combined into only two well-differentiated groups (high and low risk), constituting a more useful way of approaching these studies and of knowing the prognosis of the disease in a simpler way.

Various studies also described a relationship between genetics and infection by other microorganisms [56]; however, we did not find statistically significant associations in this regard. The same occurred with complications such as CFRD and CF-related liver disease, in which no significant association was observed in the adjusted model.

Considering the limitations of the study, the relatively small population studied could make it difficult to detect potential effects. However, in our study statistically significant associations were found for different manifestations. Even so, we cannot rule out the appearance of a type II error for the variables in which no statistically significant differences were found.

Although not all clinical information was available for all patients, there were no significant differences between participants with or without information about genotype or age, so information bias is unlikely. Additionally, although this study analysed a large set of clinical manifestations and related diseases, there are some that were not addressed, such as nutritional status, infertility or oncological pathology, which may be included in future work.

The possibility that the neonatal screening diagnosis could act as a modifier of the association between clinical manifestations and genotype could be considered a limitation of our study. Nevertheless, we carried out a sensitivity analysis and verified that the associations were similar when analysing both groups separately, so it was concluded that there was no modification of the effect by screening.

Furthermore, it should be noted that not all the people with CF in the study had the same time of evolution of the disease. In fact, certain conditions, such as pancreatic insufficiency, are described as being closely related to age. Notwithstanding, a significant association was obtained for this relationship in the studied population when we adjusted the model by age and age at diagnosis.

It is worth mentioning that one of the main strengths of our study is the use of a population-based registry, which offers up-to-date and extensive information on these patients and allows us to know the frequency, distribution, evolution and needs of the patients affected by CF or other rare diseases. In addition, the SIER offers representative data on those affected by the disease, since it is estimated to have registered all of these patients owing to its high number of sources and their obligation to send their information, making it the reference registry for regional data [34]. For all these reasons, knowing the patients' characteristics and mutational spectrum, we can determine which of them would benefit from new treatments, such as highly effective CFTR-modulating therapies, which 16% of the people studied had received.

However, future studies are needed to address these aspects, taking into account the changes in the eligibility criteria of some of the treatments, the incorporation of therapies recently approved in Spain, such as the combination of tezacaftor, ivacaftor and elexacaftor, and the impact that these treatments have on the progression of CF. In addition, new studies carried out using other Spanish regional registries or in a broader population such as that of the state RD registry with the methodology proposed here, could provide more information and support the results obtained [57].

Conclusions

To our knowledge, this is the first Spanish study to describe the mutational spectrum and its association with clinical manifestations in people with CF included in a rare disease registry. The frequency of the p.Phe508del variant was one of the lowest described in Europe, and the percentage of people classified as having a low-risk genotype was higher than that described by other authors.

In addition, the high-risk genotype increased the risk of severe lung damage, pancreatic insufficiency and chronic respiratory colonization by Pseudomonas aeruginosa in comparison with the low-risk genotype.

The results obtained in this work allow for planning of the resources that health services must provide to people with CF, contributing to the development of public health strategies to move towards personalized precision medicine that helps to optimize the health care for these patients.