Introduction

Human astroviruses (HAstVs) are a major cause of acute gastroenteritis in children, the elderly, and immunocompromised subjects [1], and therefore constitute an important public health problem. Recently, new members of the Astroviridae family have been associated with neurologic disorders in humans, and several reports have suggested the possibility of cross-species transmission that could eventually become a zoonotic problem [2, 3].

HAstV virions are non-enveloped and contain a positive-sense, single-stranded, polyadenylated RNA genome of approximately 7 kb. The genome is organized into three open reading frames (ORFs), ORF1a, ORF1b, and ORF2, flanked by 5′ and 3′ untranslated regions (UTRs) of 80–100 nucleotides (nt) [4]. A VPg protein covalently linked to the 5′ terminus of the genome is essential for viral infectivity [5]. ORF1a encodes four nonstructural proteins, one of which is a putative serine protease (nsP1a3); ORF1b encodes the viral RNA-dependent RNA polymerase (nsP1b); and ORF2 encodes the capsid proteins [6]. It has been suggested that the proteins encoded in the C-terminal domain of ORF1a and in the complete ORF1b are involved in viral replication [7].

Regulatory 5′ and 3′ UTRs are found in single-stranded positive-sense RNA viruses (ssRNA+). These UTRs have been implicated in viral translation and replication mainly because they contain sequences and RNA secondary structures that are recognized by cellular and viral proteins involved in these processes [8,9,10,11]. The roles of several of these proteins in the regulation of both the initiation of translation and the synthesis of the viral genomic RNAs have been described (see Tables 1, 2). However, the precise mechanisms by which all the viral and cellular proteins involved exert their functions are not yet fully understood. Very little is known about the translation and replication mechanisms of HAstVs, and even less is known about how the astroviral UTRs are involved in such processes. To date, only one report exists on the RNA secondary structure of the 3′ end of HAstV-1, which is similar to those of other RNA viruses [12]. Moreover, nothing is known about the possible RNA secondary structure of the 5′ UTR of any HAstV, or about its possible role in the regulation of viral processes such as translation or replication, a role that has been well documented in other positive-stranded RNA viruses.

Table 1 Cellular proteins bound to the 5′ untranslated region of ss+RNA viruses
Table 2 Cellular proteins bound to the 3′ untranslated region of ss+RNA viruses

Positive-sense ssRNA viruses follow a general mechanism to replicate their genome. After the viral RNA genome is released into the cytoplasm, it is translated to produce non-structural viral proteins that, along with cellular factors, assemble viral replication complexes (VRCs) that participate in negative-stranded RNA synthesis. This negative strand is used as a template to synthesize a large number of new positive-sense viral RNA copies, which is followed by additional rounds of translation and replication to form new viral particles. The synthesis of positive and negative strands is asymmetric: positive-stranded viral RNA molecules outnumber negative-stranded ones, to a degree that depends on the virus [51].

Several single-stranded positive-sense animal RNA viruses, such as poliovirus, hepatitis C virus, West Nile virus, coronavirus, dengue virus, members of the calicivirus family and others, have been used as models to study the role of the UTRs in many processes during viral infection [8, 9, 21, 52]. It has been suggested that the interaction between the UTRs and cellular factors is important during the viral replication cycle. For example, the polypyrimidine tract-binding protein (PTB), a heterogeneous nuclear ribonucleoprotein (hnRNP) also known as hnRNP I, is an RNA-binding protein involved in multiple aspects of cellular mRNA metabolism, including splicing regulation, polyadenylation, 3′ end formation, translation, RNA localization and RNA stability. Additionally, the binding of PTB to viral RNAs is by far the best characterized of these interactions to date [14, 21, 24, 25, 41, 49, 50, 53, 54].

Other proteins include the La autoantigen (La), a 47 kDa nuclear RNA-binding phosphoprotein that associates with RNA polymerase III transcripts [55, 56]. La has also been associated with the transport of proteins from the nucleus and their relocalization to the cytoplasm, which has been observed in several RNA viral infections [57], and it might play a role in positive- and negative-strand RNA synthesis [15, 18, 26, 38, 49, 58]. Additionally, La could be involved in the interplay between viral and host proteins that determines the essential switch from translation to replication in the viral life cycle, as described for hepatitis C virus [59].

PCBP2 (poly(rC)-binding protein 2), also known as hnRNPE2, is one of the major cellular poly(rC)-binding proteins and appears to be multifunctional: it binds to the UTRs of several positive-sense RNA viruses and has been shown to function in both RNA translation and replication [57, 60,61,62]. Translation elongation factor 1-α (EF1-α) is an abundant, multifunctional GTP-binding protein that catalyzes the binding of aminoacyl-transfer RNAs to the ribosome, is involved in nucleocytoplasmic trafficking, and might be involved in viral replication [8, 9, 24, 63].

Finally, some members of the SR (serine/arginine-rich) protein family [64, 65] could participate in the viral replicative cycle because they often contain one or more RNA recognition motifs. They have been classically described as regulators of precursor (pre)-mRNA splicing [66, 67]. More recently, shuttling SR proteins have been described as adaptors for mRNA export and as regulators of translation in the cytoplasm [68]. In this review, we explore the sequence similarities conserved among the available 5′ and 3′ UTRs of HAstVs and describe the presence of several putative binding sites for different cellular factors. We also consider whether these RNA–protein interactions might be involved in viral functions of human astrovirus.

Comparisons of the human astrovirus 5′ and 3′ untranslated region secondary structures

The genome of RNA viruses is extremely versatile. To maximize the efficiency of the genome, RNA viruses regulate translation and RNA synthesis through a switch between these two stages, which is triggered by communication between the UTRs via both RNA–RNA and RNA–protein interactions [69]. These UTRs contain sequences recognized by cellular and viral proteins that participate in many viral functions. It has been suggested that the UTRs of several RNA viruses, such as dengue virus, human calicivirus, norovirus, West Nile virus, enterovirus, hepatitis C virus and others, are involved in translation and replication [16, 19, 21, 27, 69]. To explore a possible link between sequence conservation and functional significance in classical HAstVs, we carried out multiple sequence alignments of the 5′ UTRs of all the classical HAstV and novel astrovirus (also known as non-classical astrovirus) sequences available in GenBank [2, 3]. The analysis revealed that the 5′ UTR contains several conserved motifs that are identical among all HAstV serotypes and novel astroviruses. For example, the CCAA residues (positions 1–4) are present in both classical viruses and novel serotypes, except for the first C, which is absent in HAstV-4. The adenine present at position 5 in classical HAstVs is replaced by G/T in almost all novel astroviruses, with the exception of VA1 and AST/PS. Meanwhile, the GGGGG and GGCC motifs, localized around positions 7 and 21, respectively, depending on the serotype (indicated in bold in Fig. 1a), are present only in classical viruses. Additionally, another sequence is conserved between these two motifs: the TGGT residues (nt 13–16) are common to all classical HAstVs, except HAstV-6, where the last T residue changes to C, as well as in the VA and AST/PS genotypes. In the MLB1-3 clade (nt 8–11) and in the VA1-4 clade, this motif is present but at a different position (underlined). This positional variability might have arisen as a function of UTR length variability: MLB1-3 (14 nt), VA1-4 (36–40 nt). Another conserved motif, TTT, was found in all human astroviruses and in the VA1-4 clade; it is localized at position 25 and is absent in the MLB1-3 clade (Fig. 1b). This high degree of sequence conservation among the different classical and novel astrovirus serotypes supports the possible involvement of this region in important viral functions. The sequence similarities between both virus types may result from recombination events between human and animal strains, as previously suggested [2, 3], which suggests that they share similar mechanisms, such as replication mediated by UTR–protein interactions.
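The kind of conservation described above can be illustrated with a minimal Python sketch that reports runs of alignment columns that are identical in every sequence. This is only an illustration under stated assumptions: the three aligned sequences are invented toy strings (not the GenBank entries used for Fig. 1), and the function name `conserved_blocks` and its `min_len` parameter are hypothetical and not part of the original analysis.

```python
# Toy, hand-made alignment (hypothetical sequences, NOT real HAstV 5' UTRs).
# "-" marks an alignment gap.
TOY_ALIGNMENT = {
    "toy_1": "CCAAAGGGGGTGGT",
    "toy_2": "-CAAAGGGGGTGGC",
    "toy_3": "CCAAGGGGGGTGGT",
}


def conserved_blocks(aligned, min_len=3):
    """Return (start, end, residues) for each run of fully conserved columns.

    Columns are numbered from 1 along the alignment. A column counts as
    conserved only if every sequence carries the same residue and none
    has a gap at that column.
    """
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")

    blocks = []
    run_start = None
    # Iterate one step past the end so that a run reaching the last
    # column is also closed and reported.
    for col in range(length + 1):
        conserved = (
            col < length
            and seqs[0][col] != "-"
            and len({s[col] for s in seqs}) == 1
        )
        if conserved and run_start is None:
            run_start = col
        elif not conserved and run_start is not None:
            if col - run_start >= min_len:
                blocks.append((run_start + 1, col, seqs[0][run_start:col]))
            run_start = None
    return blocks


if __name__ == "__main__":
    for start, end, residues in conserved_blocks(TOY_ALIGNMENT):
        print(f"columns {start}-{end}: {residues}")
```

With real data, the dictionary would be filled from a multiple sequence alignment file; positions reported this way are alignment columns, which differ from genome coordinates whenever gaps are present.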

Fig. 1

Sequence alignments, secondary structure predictions and putative binding sites for cellular proteins in the 5′ untranslated region of HAstVs. a Alignment of the 5′ UTR sequences of human astroviruses 1–8 (Accession Nos. Z25771, L13745, AF141381, AY720891, DQ028633, GQ495608.1, and AF260508, respectively), except HAstV-7, for which the 5′ UTR sequence is not available. b Alignment of the available novel astrovirus sequences: MLB1-3 (Accession Nos. FJ222451, NC_016155.1, JX857870.1), VA1-VA4 (Accession Nos. FJ973620.1, GQ502193.2, JX857868.1, JX857869.1), and AST/PS (Accession No. GQ891990.1). The nucleotide positions used for the alignment were nt 1–14 for MLB1-3, nt 1–38 for VA1, nt 1–42 for VA2, nt 1–36 for VA3, nt 1–49 for VA4, and nt 1–39 for AST/PS. Nucleotides marked in red correspond to sequence differences. c Predicted RNA secondary structures of the 5′ UTR of human astroviruses 1–8, determined with the MFold software, version 3.2 [70]. For secondary structure modeling and alignments, the following nucleotide positions were used: HAstV-1 nt 1–86; HAstV-2 nt 1–82; HAstV-3 nt 1–85; HAstV-4 nt 1–84; HAstV-5 nt 1–84; HAstV-6 nt 1–83; and HAstV-8 nt 1–81. d Localization of the putative binding sites for different cellular factors in the HAstV-8 5′ UTR RNA structure. e Mobility shift assay carried out with biotin-labeled HAstV-8 5′ UTR RNA incubated with 0.50, 0.75, and 1 µg of recombinant His-PTB protein (left panel) or 0.50, 0.75, and 1 µg of BSA (middle panel), and with heterologous RNA incubated with 0.50, 0.75, and 1 µg of His-PTB (right panel). Free probe is shown in lane 1, and the formation of PTB complexes is indicated by an arrow. The RNA–protein complexes were detected with the Chemiluminescent Nucleic Acid Detection Module (Pierce) following the conditions described by the manufacturer

Based on the analysis above, and considering the proposal that the 5′ and 3′ UTRs might be required elements for viral replication and RNA synthesis [27], we set out to establish whether the 5′ UTR of HAstVs is structurally conserved, and we therefore generated its predicted secondary structure. This is of utmost importance, since no report exists on the possible RNA secondary structure of the 5′ UTR (of classical or novel astroviruses), nor on the importance of the 5′ UTR in the regulation of viral processes such as translation or replication, as has been documented for other positive-stranded RNA viruses.

The analysis of classical astroviruses revealed that their secondary structures are very similar, with free energies (∆G) ranging from −16 (HAstV-5) to −12.1 kcal/mol, except for HAstV-6, whose predicted secondary structure has a ∆G of −24.30 kcal/mol. As observed above, compared with the rest of the HAstVs, the 5′ UTR of HAstV-8 exhibits an A-to-G change at position 5, and greater sequence conservation is observed towards the 5′ ends of the UTRs than towards their 3′ ends (nt 55–80) (Fig. 1a). These features correlate with the secondary structure modeling results. In general, the 5′ UTR RNA structures showed two main loops connected by a large and a small hairpin (Fig. 1c). The main structural differences were observed in the regions with the greatest nucleotide variation. Depending on the serotype, the first loop was localized around nucleotides 8/10 (HAstV-3, 4, 5, 8) or 10/14 (HAstV-1, 6), and the second loop was localized around nucleotides 55/84 (Fig. 1c). To our knowledge, this is the first report describing the possible secondary structure of the 5′ UTR of HAstVs.
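Loop positions such as those quoted above can be read programmatically from a predicted structure. The sketch below assumes that the prediction has been converted to dot-bracket notation, where matching parentheses are base pairs and dots are unpaired nucleotides; the structure string is an invented toy example, not one of the structures in Fig. 1c, and the function names are hypothetical.

```python
# Toy dot-bracket structure with two stem-loops (hypothetical, for
# illustration only; it is NOT a predicted HAstV 5' UTR structure).
TOY_STRUCTURE = "((((....))))..(((...)))"


def pair_table(structure):
    """Map each paired position (1-based) to the position of its partner."""
    stack = []
    pairs = {}
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            partner = stack.pop()
            pairs[partner] = pos
            pairs[pos] = partner
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def hairpin_loops(structure):
    """Return (first, last) 1-based positions of each hairpin loop.

    A hairpin loop is the stretch of unpaired nucleotides enclosed by a
    base pair that contains no other base pair.
    """
    pairs = pair_table(structure)
    loops = []
    for i, j in sorted(pairs.items()):
        if i < j and j - i > 1:
            if all(k not in pairs for k in range(i + 1, j)):
                loops.append((i + 1, j - 1))
    return loops


if __name__ == "__main__":
    for first, last in hairpin_loops(TOY_STRUCTURE):
        print(f"hairpin loop at nt {first}-{last}")
```

This only locates hairpin loops; internal loops, bulges and free energies would require a dedicated RNA folding package.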

The pioneering work of Willcocks and Carter proposed the secondary structure of the HAstV-1 3′ UTR [12]. Recently, we described the predicted structures of the 3′ UTR of all HAstV sequences available to date [50]. Our analysis showed that this region is highly conserved and adopts a double hairpin-loop secondary structure linked by a single-stranded RNA spacer. Following the approach used for the 5′ UTR, we performed multiple sequence alignments of the 3′ UTRs of all classical HAstVs and non-classical astroviruses available in GenBank. The comparison revealed that the 3′ UTR contains several conserved motifs. For example, a GGGTACAGCG motif was located in both virus groups. In classical HAstVs the sequence appears around positions 6739/6680, except in HAstV-2, and in non-classical astroviruses its location depends on the clade: in MLB1-3 the motif is absent, but in clade VA1-5, including AST/PS, it appears at positions 6406/6502. The TTT/TTTTA motif was localized around positions 6743/6801 in classical HAstVs, depending on the serotype (Fig. 2a). Similarly, conserved TT motifs appear in non-classical astroviruses: in MLB1-3 they are present within the first nucleotides, and in VA1, 3, 5 and AST/PS the motif appears at different positions. Another slightly conserved sequence among the non-classical clades is the TTT motif, in which the last T changes to G/A (VA1-VA2), the middle T changes to A (VA4–VA5), or, in AST/PS, the first T changes to A (Fig. 2b). The phylogenetic analysis carried out with the 3′ UTR sequences showed that classical and novel astroviruses are divergent; for example, the VA1-VA5 and AST/PS clade is less related to the classical astroviruses than the MLB1-3 clade is, as previously reported [2, 3]. The biological significance of the sequence conservation between classical and non-classical astroviruses, particularly of the GGGTACAGCG motif shared by classical HAstVs and the VA1-VA5 and AST/PS clade, remains to be uncovered, although it could be speculated that the motif is involved in the overall replication process.

Fig. 2

Sequence alignments of the 3′ UTR and putative binding sites for cellular proteins in the 3′ untranslated region of HAstVs. a Alignment of the 3′ UTR sequences of human astroviruses 1, 3, 4, 5, 6, 7, and 8. The GenBank accession numbers are as mentioned before (HAstV-7 GenBank accession number Y08632). The following nucleotide positions were used: HAstV-1 nt 6733–6813; HAstV-3 nt 6733–6815; HAstV-4 nt 6643–6723; HAstV-5 nt 6677–6762; HAstV-6 nt 6665–6757; HAstV-7 nt 2403–2484 (nt numbering differs because the sequence is partial); HAstV-8 nt 6674–6759. Bold and shaded nucleotides correspond to sequence differences. Gaps are noted with hyphens. The GGGTACAG and TT/TTTA conserved elements are bold and underlined. b Alignment of the 3′ UTR sequences of the novel astroviruses MLB1-3, VA1-VA5 and AST/PS. The GenBank accession numbers are as mentioned before, with the addition of VA5 (Accession No. KJ656124.1). The following nucleotide positions were used for the alignment: MLB1 nt 6114–6171, MLB2 nt 6081–6119, MLB3 nt 6087–6124, VA1 nt 6489–6586, VA2 nt 6415–6531, VA3 nt 6487–6581, VA4 nt 6404–6518, VA5 nt 6401–6519, and AST/PS nt 6489–6584. Nucleotides marked in red correspond to sequence differences. c Localization of the putative binding sites for the different cellular factors in the HAstV-8 3′ UTR RNA structure. d Mobility shift assay carried out with biotin-labeled HAstV-8 3′ UTR RNA incubated with 0.50, 0.75, and 1 µg of recombinant His-hnRNPE2 protein (left panel) or 0.50, 0.75, and 1 µg of BSA (middle panel). The heterologous RNA used was the same as in Fig. 1e. Free probe is shown in lane 1, and the formation of hnRNPE2 complexes is indicated by an arrow. The RNA–protein complexes were detected with the Chemiluminescent Nucleic Acid Detection Module (Pierce) following the conditions described by the manufacturer

Several studies have shown that the integrity of the secondary structure of the UTRs of different single-stranded, positive-sense RNA viruses is very important for the recruitment of both the cellular and the viral factors that constitute the replication complex (see Tables 1, 2). This complex selects and recruits viral RNA replication templates to synthesize the minus-strand RNA [8], and it participates in other functions, such as viral viability, RNA stability, translation initiation, and intracellular localization [62, 71,72,73,74]. Thus, the structural conservation observed in all HAstVs is consistent with, and supports, the biological functions proposed for these regions [12, 73].

Putative binding sites for cellular factors of Human Astrovirus 5′ and 3′ untranslated regions

As mentioned previously, several reports have shown that cellular and viral proteins can bind to the UTRs of positive-sense RNA viruses and play important roles in many viral processes. Using various search engines, such as ESE Finder [75, 76], ESR search [77, 78], EBI-EMBL and RBPmap [79], we analyzed the available HAstV 5′ and 3′ UTRs for target sites of cellular factors.

In silico analysis of the 5′ UTR sequence showed putative binding sites for SRSF2 (GGCCTTTG) and SRSF5 (ACAGG) (Fig. 1d). Although the analysis was carried out for all classical HAstVs, for the sake of clarity the sites are depicted on the HAstV-8 secondary structure only. In the case of HAstV-1 (CCACACA) and HAstV-8 (CCAAAAG), the putative binding sites were selected from among the highest scores of the search engines used. The SRSF6 putative binding site (TGTATA) was identified in all 5′ UTRs analyzed except those of HAstV-3 and -4, which lack it. As before, the SRSF6 putative binding site in HAstV-2 (TGTGTA) also corresponds to the highest score in the ESE Finder software. The putative binding site (TCTAC) for SRSF3, previously known as SRp20, was observed in HAstV-2, 4, 5 and 8 using the highest scores from the RBPmap software, whereas other SRSF3 binding motifs were observed in HAstV-1, 3 and 6.

Putative binding sites for hnRNPE2 (TTAT) are present in all serotypes analyzed. The same holds for TIA1 (ATTTTCT), except in HAstV-3 (Figs. S1a, b). This raises the question of the role of SR proteins in the localization and infectivity of the viral particles and in the stabilization of the RNA secondary structure for the recruitment of the replication complex. More studies will be necessary to elucidate the participation of SR proteins in the replicative cycle of astrovirus [64, 66, 77,78,79].
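As a rough illustration of how the motifs listed above can be located in a UTR, the following Python sketch performs a plain exact-match scan. It is a simplification and not the method used by ESE Finder, ESR search or RBPmap, which score candidate sites rather than matching fixed strings. The motif table copies the 5′ UTR motifs quoted in the text; the test sequence is an invented toy RNA, and the names `MOTIFS_5UTR` and `scan_utr` are hypothetical.

```python
# 5' UTR motifs as quoted in the text (DNA alphabet).
MOTIFS_5UTR = {
    "SRSF2": "GGCCTTTG",
    "SRSF5": "ACAGG",
    "SRSF6": "TGTATA",
    "SRSF3": "TCTAC",
    "hnRNPE2": "TTAT",
    "TIA1": "ATTTTCT",
}

# Invented toy sequence in the RNA alphabet (NOT a real HAstV UTR).
TOY_UTR = "ACAGGUUAUCCGGCCUUUGAA"


def scan_utr(sequence, motifs=MOTIFS_5UTR):
    """Return sorted (start, end, protein, motif) tuples for exact matches.

    Positions are 1-based and inclusive. The input may use either the
    RNA (U) or the DNA (T) alphabet; it is normalized to DNA first.
    Overlapping occurrences of the same motif are all reported.
    """
    seq = sequence.upper().replace("U", "T")
    hits = []
    for protein, motif in motifs.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start + 1, start + len(motif), protein, motif))
            start = seq.find(motif, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    for start, end, protein, motif in scan_utr(TOY_UTR):
        print(f"{protein}: {motif} at nt {start}-{end}")
```

An exact-match scan misses degenerate or lower-scoring variants, such as the serotype-specific variants mentioned in the text, so it is best read as a quick cross-check of reported positions rather than as a site predictor.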

Members of the SR protein family, recently renamed SRSF (SR splicing factor) [64, 65], often contain one or more RNA recognition motifs. They have been classically described as regulators of precursor (pre)-mRNA splicing [66, 67], but SRSF proteins have also been implicated in a series of other activities, including transcriptional elongation [80], RNA transport [81, 82], enhancement of nonsense-mediated decay [28], stimulation of translation [83], maintenance of genome stability [84] and cell cycle progression [85]. Furthermore, SRSF3 is required for efficient IRES-mediated translation in poliovirus through its interaction with the PCBP2/hnRNPE2 protein, which directly binds to the poliovirus IRES. This probably results in either the direct or the indirect recruitment of the translation complex to the viral RNA through other protein–protein or protein–RNA associations [86, 87]. So far, this is the only evidence that SRSF proteins are involved in the translation of an RNA virus.

In addition to SR proteins, we found putative binding sites for PTB/hnRNP I in all viruses studied. Interestingly, in HAstV-1, -2, -3, -4, -6 and -8, the strong motif for PTB (TTCT) was localized in the double-stranded linker between the two hairpin loops that form the 5′ UTR, but this was not the case in HAstV-5 (Fig. S1a). However, we found a consensus TCTT motif located in a compromising RNA position (nt 63–66). Many reports have demonstrated the interaction of PTB (Table 1) with viral 5′ UTRs, suggesting its participation in viral translation or replication, as mentioned previously [16, 54, 88,89,90,91]. To corroborate the presence of a PTB binding site in the HAstV-8 5′ UTR (nt 61–64), mobility shift assays were performed with different amounts of recombinant PTB (rPTB). The results show that rPTB is able to interact with the strong motif in the 5′ UTR (Fig. 1e).

Likewise, the remaining HAstV 5′ UTRs contain putative binding sites for PTB/hnRNP I and hnRNPE2/PCBP2. More importantly, the putative binding sites for both cellular proteins are located in double-stranded RNA regions, which suggests a functional role for these RNA–protein interactions. The proteins could act as RNA chaperones that remodel RNA secondary structures to form RNA–protein complexes, and they could be involved in astrovirus replication by mediating contact between the ends of the viral genome, as seen in several single-stranded positive-sense animal RNA viruses. Additionally, the putative binding site found for SRSF3 in this region suggests that SRSF3 may interact with hnRNPE2, as previously reported for poliovirus.

A great deal of evidence has been published describing the interaction of cellular proteins with the 5′ UTR. For example, the 5′ UTRs of picornaviruses interact with PCBP2/hnRNPE2, La, eEF1A, SRSF3 and PTB [15, 60, 86, 89], and some of these cellular proteins have also been observed to bind to the 5′ end of Norwalk virus (La, hnRNPL, PCBP2 and PTB) [90] and feline calicivirus (PTB) [16, 54]. In the case of bovine viral diarrhea virus and hepatitis C virus, the binding of NFAR proteins (and other cellular factors) to both UTRs is responsible for mediating the contacts between the 5′ and 3′ ends of the RNA [22, 23] (Table 1). No previous study of the astrovirus 5′ UTR has suggested a possible role for it in any viral function. We think that this structure, in addition to the 3′ UTR, might be necessary for the recruitment of the viral replication complex. This complex includes the viral RNA polymerase (nsP1b) and the viral nsP1a-4 protein, which has been co-localized, in the rough endoplasmic reticulum membranes of infected cells, with the viral RNA containing a VPg covalently linked to its 5′ terminus [5, 7].

The in silico analysis of the 3′ UTRs of human astroviruses revealed putative binding sites for SRSF proteins, with affinity motifs for SRSF2 (TGTCTCTG), SRSF5 (TCAGA) and, except in HAstV-3, which lacks such a motif, SRSF6 (TACAGC). Additionally, putative binding sites were found for SRSF3 (CTCTGTT), hnRNPE2 (TTAG) and TIA1 (TTATTTT); interestingly, recognition motifs for these proteins are also present in the 5′ UTR of most HAstVs (Figs. 2c, S2a, b). The few variations in this respect are the lack of hnRNPE2, SRSF3 and TIA1 sites in HAstV-2, and the lack of SRSF6 sites in HAstV-3 and -7. As described for the 5′ UTRs, the binding sites localized in the 3′ UTRs correspond to the highest scores obtained in all the search engines used.

The possible recruitment of SRSF proteins to the 3′ UTR of astrovirus leads to the speculation that SRSF proteins could determine the localization and infectivity of the viral particles produced, probably because they function in RNA transport and nuclear export [86]. However, they could also help to stabilize the RNA secondary structure and to recruit the replication complex. In this analysis, we found putative binding sites for PTB/hnRNP I in all astroviruses studied, except HAstV-2. The TCTT motif present in the HAstV-8 3′ UTR structure (nt 6700–6704) is recognized by PTB, a protein found in infected CaCo2 cell extracts. Published data from our group showed that PTB/hnRNP I knockdown in CaCo2 cells affects HAstV-8 viral RNA replication, suggesting that PTB is required for this viral function. Additionally, UV cross-linking assays with uninfected and infected CaCo2 cell extracts showed that six other proteins, of 35, 40, 45, 50, 52, and 75 kDa, from both cell extracts bound to the 3′ UTR [50]. Preliminary results from mobility shift assays revealed that recombinant hnRNPE2 is able to interact with the 3′ UTR of HAstV-8 (TTAG, nt 6717–6720) (Fig. 2d). Experiments exploring whether SRSF or TIA1 proteins are components of the astrovirus replication complex are currently being carried out in our laboratory.

Interestingly, both human astrovirus UTRs have one PTB binding site, as observed in the preliminary in silico analysis carried out in our laboratory. Additionally, it is possible that hnRNPE2 (PCBP2) and SRSF3 are involved (Fig. 3). hnRNPE2 is capable of establishing independent protein–protein interactions with both PTB and SRSF3 [86, 92, 93]. PTB–hnRNPE2 and SRSF3–hnRNPE2 interactions might occur during astrovirus replication, as has been observed in poliovirus and hepatitis C virus [61, 86, 87]. PTB proteins could also interact with each other and with the ends of the viral RNA to form a circular ribonucleoprotein (RNP) complex, as reported previously [20, 44, 69, 74].

Fig. 3

Schematic representation of the interaction of the HAstV UTRs with cellular factors. The 5′ and 3′ UTRs of astroviral RNA bind to PTB and hnRNPE2 (a.k.a. PCBP2). Additionally, the 3′ UTR binds to five other proteins, p40, p45, p50, p52 and p75, whose identities remain to be determined. hnRNPE2 would interact with the UUAT motif (nt 31–34) in the 5′ UTR or the UUAG motif (nt 6717–6720) in the 3′ UTR

Conclusions

Our findings suggest a probable mode of action for the hnRNPs and SRSF proteins that bind to the human astrovirus UTRs, and their involvement in the viral life cycle. This mode of action resembles that of RNA chaperones, which maintain RNA structure in a conformation that favors viral RNA replication. These proteins play important roles in assembling the viral RNA replication complex, in selecting and recruiting viral RNA replication templates, and in other processes. Although it has been proposed that circularization does not depend on protein interactions with the 5′ and 3′ UTRs, multiple reports suggest that, in addition to viral proteins, the binding of cellular proteins in trans is required to promote or facilitate viral replication. The open question is how the interplay between the human astrovirus 5′ and 3′ UTRs maximizes the efficiency of the genome via either RNA–protein (PTB, hnRNPE2, and/or SRSF) or RNA–RNA interactions.