Background

Congenital cytomegalovirus (cCMV) infection is the most common congenital infection in the world and a leading cause of sensorineural hearing loss, intellectual disability, microcephaly, developmental delay, seizure disorders, and cerebral palsy [1]. Approximately 10% of congenitally infected infants have symptoms at birth; sequelae occur in 40–58% of these symptomatic infants and in 13.5% of asymptomatic infants [2]. The causes of these differences in outcome remain unknown and are probably multifactorial. They may include differences in CMV pathogenicity in the context of the genetic backgrounds of the fetus and mother, as well as known factors such as primary infection occurring in the first trimester of pregnancy [3]. Once CMV infects the placenta, local damage and inflammation lead to placental dysfunction, which in turn affects fetal development [4].

The activity of NK cells is controlled by the binding of CD94-NKG2 receptors to HLA-E complexed with peptides from HLA-A, -B, -C and -G alleles expressed on the cell surface [5], and polymorphic UL40 peptides from CMV shape the adaptive NKG2C NK cell populations [6]. HLA-C, HLA-E, and HLA-G are the only HLA molecules expressed in the placenta [7], and HLA-C and HLA-E prevent maternal NK cell-mediated cytotoxicity by binding killer-cell immunoglobulin-like receptors (KIRs) expressed on decidual killer cells. Some combinations of maternal KIRs and fetal HLA-C can lead to pregnancy complications [8]. HLA-E also presents these HLA peptides to CD8+ T cells to trigger specific cellular immune responses [9, 10]. Infection with CMV can trigger a CD8+ T cell response that is restricted by HLA-E and specific for a peptide from UL40 (VMAPRTLIL) [11], and that is characterized by biased TRBV14 gene usage [12]. Other viral and host factors (separate from the HLA-E binding peptides) could also influence the outcome. The specific function of the rest of the UL40 gene is unknown, but the HLA-E binding peptide also alters expression of UL18 [13], which in turn binds leukocyte Ig-like receptor 1 (LIR1) with high affinity. LIR1 also binds HLA Class I molecules. More recently, heterogeneity in the UL18/LIR1 interaction has been associated with altered control of CMV in kidney transplant recipients [14].

We hypothesize that polymorphism in this viral ligand, in the context of the particular HLA-A, -B, -C and -G alleles of children and mothers, may influence the immune response and consequently lead to different cCMV infection outcomes.

The aim of this study was to gain more insight into cCMV pathogenesis and its clinical consequences by describing the variability and distribution of the UL40₁₅₋₂₃ CMV peptide in a large cohort of children with congenital infection and mothers with an affected fetus or newborn. It is also relevant to establish whether the distribution of the different peptides in the healthy population is the same as in the CMV strains responsible for congenital infection. Therefore, to determine whether polymorphism in the UL40₁₅₋₂₃ peptide represents a risk factor, we compared its occurrence with the variability and distribution of the corresponding peptides derived from HLA-A, -B, -C and -G alleles in a healthy population.

Results

Distribution of UL40₁₅₋₂₃ peptides in congenital CMV infection

Two hundred forty-two UL40 gene sequences were analyzed. Different UL40 polymorphisms were found, particularly in the region encoding the main potential HLA-E binding peptide. Once the HLA-E binding fragment was translated and selected, 19 different UL40₁₅₋₂₃ peptides (9-mers) with different binding properties and different distributions among patients were found (Table 1). The 8th amino acid position of the peptide (22nd position in UL40) was the most variable. The VMAPRTLLL, VMAPRTLVL and VMAPRTLIL peptides accounted for 176/199 (88.4%) of all UL40₁₅₋₂₃ peptides analyzed, and they matched most HLA-A, -B, or -C alleles in the IPD-IMGT/HLA Database (Supplementary Table 1). The UL40₁₅₋₂₃ peptide VMAPRTLFL was found in only two patients and matched all HLA-G alleles. In addition to the three predominant UL40₁₅₋₂₃ peptides, another four UL40₁₅₋₂₃ peptides each matched a specific HLA Class I allele, while the remaining 11 did not match any allele, as shown in detail in Supplementary Table 1.
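The positional variability described above can be illustrated with a minimal sketch. It uses only the four 9-mers named in the text rather than the full set of 19 peptides in Table 1, so the residue sets it reports are smaller than in the real data. The function names and the `UL40_START` constant are illustrative assumptions, not part of the original analysis.

```python
# Only the four 9-mers named in the text; the full set of 19 is in Table 1.
PEPTIDES = ["VMAPRTLLL", "VMAPRTLVL", "VMAPRTLIL", "VMAPRTLFL"]

UL40_START = 15  # first UL40 residue covered by the 9-mer (positions 15-23)


def residues_per_position(peptides):
    """Return, for each peptide position, the set of residues observed there."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    return [{p[i] for p in peptides} for i in range(length)]


def most_variable_position(peptides):
    """Return (1-based peptide position, UL40 position, residues) of the most variable site."""
    per_position = residues_per_position(peptides)
    index = max(range(len(per_position)), key=lambda i: len(per_position[i]))
    return index + 1, UL40_START + index, per_position[index]


if __name__ == "__main__":
    position, ul40_position, residues = most_variable_position(PEPTIDES)
    print(position, ul40_position, sorted(residues))
```

With these four peptides the most variable site is peptide position 8, which corresponds to UL40 position 22 (15 + 8 - 1), in agreement with the text.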

Table 1 UL40₁₅₋₂₃ peptides found in CMV strains from 199 patients and HLA-E predicted binders

Distribution of HLA Class I peptides in healthy population

Each patient's CMV strain contributes a single copy of the UL40 peptide. The homologous HLA-derived peptide, however, is encoded at each HLA Class I locus, so an individual may carry two copies of a peptide in HLA-C, another two copies in HLA-A, and so on. In addition, the same individual may be heterozygous at a locus and therefore carry two different peptides (two different HLA-C peptides, for example). This must be taken into account in the statistical analysis of the 444 individuals, and it explains why the 403 individuals carrying VMAPRTLIL and the 76 carrying VMAPRTLLL in HLA-C do not add up to 444. The distribution of HLA-A, -B, -C and -G peptides in the 444 healthy individuals was as follows:

HLA-A: All individuals belonged to one of three groups: VMAPRTLLL/VMAPRTLLL, 116 (26.1%); VMAPRTLLL/VMAPRTLVL, 220 (49.5%); VMAPRTLVL/VMAPRTLVL, 108 (24.3%). HLA-C: VMAPRTLIL/VMAPRTLIL, 342 (77.0%); VMAPRTLIL/VMAPRTLLL, 61 (13.7%); VMAPRTLLL/VMAPRTLLL, 15 (3.4%); neither peptide, 26 (5.9%; these individuals carried VMAPRALLL or VMAPQALLL, corresponding to HLA-C*07:18 and HLA-C*17 alleles, respectively). HLA-B: None of these peptides was found. HLA-G: All were VMAPRTLFL.

Thus, not taking into account the copy number and locus, we found 354 individuals with VMAPRTLLL peptide; 328 with VMAPRTLVL peptide; 403 individuals with VMAPRTLIL peptide and 444 individuals with VMAPRTLFL peptide. Detailed data are shown in Table 2.
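The per-locus carrier counts can be recovered from the genotype counts reported above. The sketch below is a worked arithmetic check rather than the authors' code; the dictionary layout and function names are assumptions, and the label "other" stands for the 26 individuals whose HLA-C peptides were VMAPRALLL or VMAPQALLL.

```python
# Genotype counts per locus as reported in the text (444 healthy individuals).
HLA_A = {
    ("VMAPRTLLL", "VMAPRTLLL"): 116,
    ("VMAPRTLLL", "VMAPRTLVL"): 220,
    ("VMAPRTLVL", "VMAPRTLVL"): 108,
}
HLA_C = {
    ("VMAPRTLIL", "VMAPRTLIL"): 342,
    ("VMAPRTLIL", "VMAPRTLLL"): 61,
    ("VMAPRTLLL", "VMAPRTLLL"): 15,
    ("other", "other"): 26,  # VMAPRALLL or VMAPQALLL carriers
}


def carriers(genotypes, peptide):
    """Number of individuals with at least one copy of the peptide at this locus."""
    return sum(n for pair, n in genotypes.items() if peptide in pair)


def copies(genotypes, peptide):
    """Number of peptide copies at this locus (homozygotes count twice)."""
    return sum(n * pair.count(peptide) for pair, n in genotypes.items())


if __name__ == "__main__":
    print(sum(HLA_A.values()), sum(HLA_C.values()))  # 444 444
    print(carriers(HLA_C, "VMAPRTLIL"))  # 403
    print(carriers(HLA_C, "VMAPRTLLL"))  # 76
    print(carriers(HLA_A, "VMAPRTLVL"))  # 328
    print(carriers(HLA_A, "VMAPRTLLL"))  # 336
```

Because the 61 heterozygous individuals are counted in both HLA-C carrier groups, 403 + 76 exceeds the 418 individuals who carry either peptide. The cross-locus figure of 354 VMAPRTLLL carriers cannot be derived from these per-locus counts alone, since it depends on each individual's joint HLA-A and HLA-C genotype.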

Table 2 UL40 9-mer found in cCMV versus HLA Class I of 444 healthy individuals

Comparison between HLA Class I peptides from healthy individuals and UL40 peptides from cCMV

Only four of the 19 UL40₁₅₋₂₃ CMV peptides (VMAPRTLIL, VMAPRTLLL, VMAPRTLVL, and VMAPRTLFL) were also found in HLA Class I from healthy individuals (Table 2). Among the HLA Class I loci, the VMAPRTLIL and VMAPRTLFL peptides were found only in HLA-C and HLA-G, respectively. The peptide distribution differed significantly between UL40₁₅₋₂₃ and HLA-A, -B, -C and -G (Fig. 1). Interestingly, for the VMAPRTLIL and VMAPRTLLL peptides, the differences between UL40₁₅₋₂₃ and HLA-C were less pronounced than for the other HLA Class I loci (p = 1.561797e-18 and p = 2.260558e-02, respectively). The significance of the comparisons of peptide distribution between UL40₁₅₋₂₃ and HLA-A, -B, -C and -G, in that order, was as follows: VMAPRTLIL (p = 8.637e-77, p = 8.637e-77, p = 1.561e-18, p = 8.637e-77); VMAPRTLLL (p = 1.578e-57, p = 4.350e-11, p = 2.260e-02, p = 4.350e-11); VMAPRTLVL (p = 1.581e-40, p = 4.283e-20, p = 4.283e-20, p = 4.283e-20); other UL40₁₅₋₂₃ peptides (p = 7.646e-13, p = 7.646e-13, p = 7.646e-13, p = 1.104e-132); other HLA Class I peptides (p = 1, p = 1, p = 3.953e-4, p = 1).

Fig. 1 Peptide distribution comparisons between UL40₁₅₋₂₃ and HLA-A, -B, -C, -G. RVV = other UL40₁₅₋₂₃ peptides; RVHLA = other HLA Class I peptides

Limitations of study and assumptions

Assumptions:

  1. In cCMV, UL40₁₅₋₂₃ peptides identical to the HLA peptides of the patients may be selected; they should therefore show a distribution that does not differ significantly from that of the HLA Class I peptides.

  2. We assume that the distribution of HLA Class I peptides is the same in a healthy population and in children with congenital CMV infection.

Limitations:

  1. Ideally, the viral peptide would be compared with the HLA Class I peptide of the same patient (paired data), but HLA data for the patients could not be obtained in this retrospective study. As an approximation, we studied these peptides in a healthy population and analyzed whether their distribution is statistically similar to or different from that obtained from the patients' viruses. In further studies, it would be interesting to correlate the findings with more detailed clinical data.

  2. CMV Sanger sequencing reveals only the most abundant UL40 gene variants (> 30%). Coinfection with less frequent variants therefore cannot be ruled out.

Discussion

The CMV UL40 signal peptide contains a 9-mer sequence that is processed in the same way as endogenous HLA-E binding peptides and, consequently, can promote CD8+ T cell and NK cell responses when bound to HLA-E. A wider range of UL40₁₅₋₂₃ peptides was observed in cCMV than among the counterpart HLA peptides from healthy donors: up to 19 different UL40₁₅₋₂₃ nonamers were found in the 242 clinical samples from patients with cCMV infection. In contrast, three of them were present in all healthy donors, although differently distributed among HLA Class I loci. These three endogenous HLA Class I peptides accounted for 60.3% (VMAPRTLIL), 18% (VMAPRTLVL) and 10% (VMAPRTLLL) of the UL40₁₅₋₂₃ peptides in cCMV strains. When we compared the distribution of peptides from HLA-A, -B, -C and -G with that of UL40₁₅₋₂₃, significant differences were found.

The importance of matching between UL40₁₅₋₂₃ and HLA-derived peptides has been described previously: a strong VMAPRTLIL-UL40-specific CD8+ T cell response was observed in individuals who lack HLA-C alleles that encode this determinant [10, 15]. Selective pressure may drive the UL40₁₅₋₂₃ peptide to become more adapted to the HLA alleles of the host. The VMAPRTLFL UL40₁₅₋₂₃ peptide was found in only two CMV strains, but it is present in all HLA-G alleles, and HLA-G is expressed only by trophoblast cells. In addition, none of the UL40₁₅₋₂₃ peptides observed was found in HLA-B in the cohort of healthy individuals (VMAPRTLLL is specific to a rare HLA-B*13:117 allele). Together with the fact that HLA-A and HLA-B are not expressed by trophoblasts, this suggests a main role for HLA-C derived peptides in cCMV pathogenesis. It is interesting that the distribution of UL40₁₅₋₂₃ peptides from patients differed significantly from that of peptides derived from HLA-C of healthy individuals, even though VMAPRTLIL was the most prevalent peptide in both UL40 and HLA-C.

Moreover, a study of blood donors and kidney transplant recipients demonstrated that CMV induces strong HLA-E-restricted, UL40-specific CD8 T cell responses with potential allogeneic and/or autologous reactivity, depending on virus strain and host HLA concordance [15]. This finding may be relevant in cCMV pathogenesis. Regarding the tolerance of the fetus to allopeptides from the mother, different situations may arise. CMV strains that produce UL40₁₅₋₂₃ peptides not corresponding to the endogenous HLA-E binding peptides of children and their mothers may elicit robust HLA-E restricted T cell responses against viral ligands. Intermediate situations may arise when only the mother or only the fetus is matched. Conversely, CMV strains that produce UL40₁₅₋₂₃ peptides matching the allopeptides of both mothers and children may induce tolerance to these viral ligands. Whether this is critical for stimulating strong allogeneic and/or autologous reactivity in cCMV requires further study. In our study, a significant difference was detected between the UL40₁₅₋₂₃ peptides from cCMV disease and the peptides found in HLA Class I from healthy individuals. Although many other viral and host factors are involved in cCMV, this difference suggests that UL40₁₅₋₂₃ variation may play a role in cCMV pathogenesis. Paired CMV/HLA Class I genotyping studies in cCMV are needed to validate this proposal.

The rationale for this study was to explore how polymorphisms in the HLA-E restricted cytomegalovirus UL40 peptide are associated with the risk of CMV disease following congenital infection. Our findings suggest that UL40 peptide polymorphisms are indeed associated with disease, and they open new avenues for exploring potential biomarkers of congenital CMV disease that could help clinicians in the management of pediatric patients. In this sense, characterization of the UL40 nonamer and the HLA Class I nonamers in a patient (child and mother) could provide useful information regarding the prognosis of the disease and could facilitate clinical decisions regarding the use of antivirals or a more extensive follow-up of the patient. This approach, together with knowledge of these mechanisms and biomarkers, is in line with the principles of personalized medicine, a field that continues to develop.

Conclusions

In the present study, we characterized the profile of HLA-E binding peptides of CMV in congenital infection. Our findings suggest that a mismatch between UL40₁₅₋₂₃ peptides and the HLA Class I peptides of children and mothers might play a role in congenital CMV disease, and may account for differences in outcome, morbidity and sequelae. To our knowledge, this study represents a novel approach and describes the highest number of different UL40₁₅₋₂₃ peptides found in CMV disease. These findings provide insights for further studies on CMV/host interactions in congenital infection and could serve as the basis for the development of molecular markers of disease.

Methods

Patients and specimens

Residual clinical specimens submitted for virological diagnosis of cCMV infection from 42 Spanish hospitals between 2009 and 2019 were analyzed. The use of these samples was approved by the Ethics Committee of the “Instituto de Salud Carlos III” (CEI PI 41_2016-v2). A total of 242 CMV PCR-positive clinical samples were available for the study: 207 from 166 children younger than 6 years old diagnosed with cCMV infection and 35 from 33 pregnant or breast-feeding women. Children's ages ranged from less than 1 day to 6 years, with a median of 272 days; 38% were female. The median age of the pregnant or breast-feeding women was 31 years. No epidemiological relationship was reported among the children or among the pregnant or breast-feeding women, except for one set of newborn triplets. In this retrospective study, cCMV infection in children was diagnosed by neonatal symptoms: mainly the infant being small for gestational age (SGA, growth restriction < 10th centile), but also abnormal clinical examination (hydrops, petechiae or purpura, hepatosplenomegaly, microcephaly, hypotonia, sucking difficulties, lethargy, seizures); abnormal laboratory parameters (platelet count < 100 000/mm³, haemoglobin level < 11 g/dl, alanine aminotransferase level > 80 IU/L, conjugated bilirubin plasma level > 20 µmol/L and > 10% of total bilirubin); severe abnormality on cerebral imaging by ultrasound and/or cranial computed tomographic scan (multiple intracranial calcifications, periventricular hyperechogenicity, severe ventriculomegaly (> 15 mm)); abnormal funduscopic examination; or abnormal audiology assessment. In pregnant women, infection was diagnosed by pathological ultrasound images of the fetus, and in breast-feeding women by laboratory-confirmed prenatal primary CMV infection together with some of the neonatal symptoms described above.

The 242 clinical samples were distributed as follows: 157 urine, 24 amniotic fluid, 23 blood spots on paper, 14 blood, seven breast milk, four placenta, three sera, two CSF, two spleen biopsies, one bowel biopsy, one liver biopsy, one brain biopsy, one bone marrow biopsy, one nasopharyngeal wash and one nasopharyngeal aspirate.

Cohort of healthy individuals

A total of 444 healthy individuals were included in the study. They were unrelated Spanish European adults (50% males) drawn from HLA-genotyped healthy controls previously used in another study [16]. Detailed genotype data used in this study were provided by MF González-Escribano, Head of the Department of Immunology of Virgen del Rocio University Hospital, Seville, Spain.

DNA extraction, PCR and sequencing

CMV-positive clinical samples were directly sequenced for the UL40 gene using a nested PCR assay previously described by Garrigue I. et al. [17], with minor modifications. Briefly, DNA was extracted from 400 μL of clinical sample using the QIAsymphony system and the QIAamp DSP DNA Midi Kit (Qiagen). A CMV UL40 gene fragment (666 bp) was amplified using the Platinum SuperFi DNA Polymerase reaction kit (Invitrogen). PCR products were processed for Sanger dideoxy sequencing with BigDye v. 3.1 (Applied Biosystems) in an ABI PRISM 3100 sequencer (Applied Biosystems, California, USA) at the Genomic Department of the National Center for Microbiology. DNA sequence analysis was carried out using the Lasergene SeqMan software. Raw sequence data were aligned against the expected amplicons. Afterwards, all sequences were imported, aligned and translated with MEGA 7 software, using the UL40 gene sequence from Human Herpesvirus 5 (Merlin strain) as the reference sequence (NCBI Reference Sequence: NC_006273.2).
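The translation and selection of the 9-mer can be sketched as follows. This is not the MEGA 7 workflow used in the study; it is a minimal illustration that assumes the input is an in-frame UL40 coding sequence starting at the first codon, so that residues 15 to 23 correspond to nucleotides 43 to 69. The toy sequence is synthetic (a start codon and filler alanine codons followed by codons chosen to encode VMAPRTLIL) and is not the Merlin reference sequence.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame coding sequence; unknown codons become 'X'."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def extract_peptide(cds, first=15, last=23):
    """Return residues first..last (1-based, inclusive) of the translated sequence."""
    protein = translate(cds)
    if len(protein) < last:
        raise ValueError("sequence too short to cover the requested residues")
    return protein[first - 1:last]


# Synthetic example: 14 filler codons, then codons for V M A P R T L I L.
NONAMER_CODONS = ["GTG", "ATG", "GCC", "CCG", "CGG", "ACC", "CTG", "ATC", "CTG"]
TOY_CDS = "ATG" + "GCC" * 13 + "".join(NONAMER_CODONS) + "TAA"

if __name__ == "__main__":
    print(extract_peptide(TOY_CDS))  # VMAPRTLIL
```

In practice the amplicon would first have to be aligned to the reference so that the reading frame and codon numbering are known, which is the step the alignment software performs.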

Peptide HLA Class I allele search and prediction of MHC-restricted ligands

The peptide HLA Class I allele search was performed using the Immune Polymorphism Database (IPD-IMGT/HLA Database) and the sequence alignment tool from EMBL-EBI: https://www.ebi.ac.uk/ipd/imgt/hla/align.html. CMV UL40 peptide binding was predicted using RANKPEP (http://imed.med.ucm.es/Tools/rankpep.html) [18]. RANKPEP uses Position Specific Scoring Matrices, or profiles, built from a set of aligned peptides known to bind to HLA-E as the predictor of MHC-peptide binding. The OPT percentage is the percentile score of the predicted peptide relative to that of the consensus. The consensus is the sequence that yields the maximum score, namely the optimal score (OPT), with the selected profile.
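The general idea of profile-based scoring can be sketched with a toy example. This is not RANKPEP's implementation: its actual HLA-E profile, weighting scheme and score scale are not reproduced here. The sketch assumes a simple log-odds matrix against a uniform background with a small pseudocount, and it uses the four 9-mers named in the text as a placeholder training set. The score relative to the consensus is computed as a plain ratio, which is one reading of the OPT percentage described above.

```python
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Placeholder training set (assumption): the four 9-mers named in the text.
ALIGNED_BINDERS = ["VMAPRTLLL", "VMAPRTLVL", "VMAPRTLIL", "VMAPRTLFL"]


def build_pssm(peptides, pseudocount=0.05):
    """Return one {residue: log-odds score} dict per peptide position."""
    background = 1 / len(AMINO_ACIDS)  # uniform background frequency
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for i in range(len(peptides[0])):
        column = [p[i] for p in peptides]
        pssm.append({
            aa: log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, peptide):
    """Sum the position-specific scores of the peptide's residues."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length does not match the profile")
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def percent_opt(pssm, peptide):
    """Peptide score as a percentage of the best achievable (consensus) score."""
    optimal = sum(max(column.values()) for column in pssm)
    return 100 * score(pssm, peptide) / optimal


if __name__ == "__main__":
    profile = build_pssm(ALIGNED_BINDERS)
    for peptide in ("VMAPRTLIL", "VMAPRALLL"):
        print(peptide, round(percent_opt(profile, peptide), 1))
```

In this toy profile any peptide from the training set reaches the optimal score, while a substitution at a conserved position, as in VMAPRALLL, lowers the percentage.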

Statistical analysis of data

A descriptive analysis of the variables was performed. Fisher's exact test was used to compare peptide distributions. P-values were adjusted for multiple comparisons using the Benjamini–Hochberg method. Statistical significance was defined as p < 0.05 for all tests. All statistical analyses were performed using R software (R 3.6.2; R Foundation for Statistical Computing, Vienna, Austria).
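The two statistical steps named above can be sketched as follows. The study used R; this is an independent minimal illustration in Python using only the standard library, with hypothetical function names. The example tables are assumptions: the HLA-C carrier counts (403 and 76 of 444) come from the Results, whereas the UL40 counts (120 and 20 of 199) are back-calculated from the percentages given in the Discussion. The exact contingency tables analyzed by the authors are not stated, so the resulting p-values need not match the reported ones.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Unnormalised hypergeometric weights, kept as exact integers.
    weights = {x: comb(row1, x) * comb(row2, col1 - x)
               for x in range(lowest, highest + 1)}
    observed = weights[a]
    # Sum over all tables that are no more probable than the observed one.
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / sum(weights.values())


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Rows: UL40 in cCMV strains, HLA-C in healthy individuals (assumed layout).
    # Columns: peptide present, peptide absent.
    tables = {
        "VMAPRTLIL": (120, 199 - 120, 403, 444 - 403),
        "VMAPRTLLL": (20, 199 - 20, 76, 444 - 76),
    }
    raw = [fisher_exact_2x2(*t) for t in tables.values()]
    for name, p_raw, p_adj in zip(tables, raw, benjamini_hochberg(raw)):
        print(name, p_raw, p_adj)
```

Keeping the hypergeometric weights as integers avoids rounding problems when deciding which tables are at least as extreme as the observed one, which matters for the very small p-values that arise with several hundred observations.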