HPV18 Utilizes Two Alternative Branch Sites for E6*I Splicing to Produce E7 Protein
Human papillomavirus 18 (HPV18) E6 and E7 oncogenes are transcribed as a single bicistronic E6E7 pre-mRNA. The E6 ORF region in the bicistronic E6E7 pre-mRNA contains an intron. Splicing of this intron disrupts the E6 ORF integrity and produces a spliced E6*I RNA for efficient E7 translation. Here we report that the E6 intron has two overlapping branch point sequences (BPS) upstream of its 3′ splice site, each an identical heptamer AACUAAC, for E6*I splicing. One heptamer has its branch site adenosine (the sixth nucleotide of the heptamer) at nt 384 and the other at nt 388. E6*I splicing efficiency correlates with the expression levels of E6 and E7 proteins and depends on which branch site is selected. In general, E6*I splicing prefers the 3′ss-proximal branch site at nt 388 over the distal branch site at nt 384. Inactivation of the nt 388 branch site was found to activate a cryptic acceptor site at nt 636 for aberrant RNA splicing. Together, these data suggest that HPV18 modulates its production ratio of E6 and E7 proteins by alternative selection of the two mapped branch sites for E6*I splicing, which could be beneficial for its productive or oncogenic infection depending on the host cell environment.
Keywords: Human papillomavirus 18 (HPV18); HPV splicing; Branch point; E6, E7, E6 intron; HPV oncogenes
Human papillomaviruses (HPV) are small, non-enveloped DNA viruses and contain a double-stranded DNA genome ~8 kb in size. The HPV genome encodes eight open-reading frames (ORF), of which six (E1, E2, E4, E5, E6 and E7) are encoded from the genome early region and two (L1 and L2) are encoded from the late region (Zheng and Baker 2006). More than 200 different genotypes of HPV have been identified (Ranjeva et al. 2017). The mucosotropic HPV can be clinically classified as low-risk HPV (LR-HPV), producing benign warts, and high-risk HPV (HR-HPV), whose infections lead to development of cervical, genital and oropharyngeal cancers (Walboomers et al. 1999; de Villiers 2013). Among HR-HPV, HPV16 and HPV18 are the two most frequent genotypes detected in cervical cancer (zur Hausen 2002).
HR-HPV E6 and E7 are two viral oncoproteins that induce degradation of the tumor suppressors p53 and pRb, respectively (Vousden 1994). Unlike LR-HPV, in which E6 and E7 are transcribed from separate promoters (Zheng 2010), HR-HPV E6 and E7 are transcribed from the same promoter as a single bicistronic E6E7 pre-mRNA (Zheng and Baker 2006; Wang et al. 2011). The E6 ORF in this bicistronic pre-mRNA contains an intron (also called the E6 intron) which is subject to alternative splicing. Escape of the E6 intron from RNA splicing preserves the E6 ORF integrity and is necessary for E6 protein translation. However, splicing of the E6 intron during HR-HPV infection is highly efficient and is required for E7 protein translation (Zheng et al. 2004; Zheng and Baker 2006; Tang et al. 2006). The majority of the spliced product with a disrupted E6 ORF is E6*I, which serves as an E7 mRNA for E7 protein translation (Tang et al. 2006). How E6 intron splicing is regulated for E7 expression, and how the E6 intron escapes RNA splicing to express E6 protein during HR-HPV infection, remain unclear and are under active investigation. HPV18 E6 and E7 are transcribed from two alternative early promoters, P102/105 and P55, upstream of the E6 ORF as a single bicistronic primary E6E7 transcript (Wang et al. 2011, 2016). The E6 intron in the HPV18 E6E7 bicistronic pre-mRNA transcripts has one splice donor site (5′ splice site or 5′ss) at nt 233 and two alternative splice acceptor sites (3′ splice site or 3′ss), one at nt 416 and the other at nt 791. The alternative usage of these two 3′ splice sites results in two spliced mRNAs: the major E6*I with a 233^416 splicing junction, which serves as an E7 mRNA, and the minor E6*X with a 233^791 splicing junction, which encodes an E6^E7 fusion protein (Zheng et al. 2004; Ajiro and Zheng 2015a).
In eukaryotes, introns are removed from pre-mRNAs in the spliceosome. The spliceosome complex is composed of proteins and small nuclear RNAs (snRNAs) (Will and Lührmann 2011; Shi 2017). RNA splicing requires four intronic elements: (i) a 5′ss with a GU dinucleotide at the intron 5′ end, (ii) a 3′ss with an AG dinucleotide at the intron 3′ end, (iii) a branch point sequence (BPS), and (iv) a polypyrimidine tract (PPT) between the 3′ss and BPS. Usually, the BPS, located 15–40 nucleotides upstream of the 3′ss, comprises 5 (YUNAY) (Gao et al. 2008; Mercer et al. 2015) or 7 (YNYURAC) (Zhuang et al. 1989; Kol et al. 2005) nucleotides, with one adenosine (the A in each consensus motif) as the branch site. During pre-mRNA splicing, the 5′ss is first recognized by U1 snRNA, the BPS by U2 snRNA, and the 3′ss by U2AF. Following these recognition events and their crosstalk, the intron is excised by two trans-esterification reactions. In the first reaction, the 2′-hydroxyl group of the branch site adenosine attacks the phosphodiester bond at the 5′ss, at the intron 5′-end G, cleaving the 5′-exon from the intron. At this stage, the intron forms a lariat-intermediate structure with a 5′–2′ phosphodiester bond between the 5′ss guanosine and the branch site adenosine. In the second reaction, the 3′-hydroxyl group of the released 5′-exon attacks the phosphodiester bond at the intron 3′ss in the lariat intermediate, cleaving the intron from the 3′-exon and joining the 5′-exon to the 3′-exon. The intron is then released in a lariat conformation, debranched and degraded (Will and Lührmann 2011; Shi 2017).
Although the BPS is crucial for mRNA splicing in eukaryotes, mapping the BPS of an individual intron and understanding its regulation remain a challenge. Various approaches combined with computational bioinformatics have been attempted to enrich genome-wide circular forms of the lariats or lariat intermediates for BPS mapping (Gao et al. 2008; Taggart et al. 2012; Mercer et al. 2015), but only a few cases in selected genes have been successfully validated. A recent report indicated that almost all human introns contain multiple branchpoints and exhibit tissue-specific branchpoint usage (Pineda and Bradley 2018). Using lariat RT-PCR in combination with TA cloning and sequencing, our lab has mapped the BPS for RNA splicing of the BRAF intron 8 (Ajiro and Zheng 2015b), the BPV-1 late RNA intron 1 (Zheng et al. 2000) and the HPV16 E6 intron (Ajiro et al. 2012). In this report, we show that the HPV18 E6 intron contains two alternative branch sites at nt 384 and 388, but preferentially uses the 3′ss-proximal nt 388 adenosine as a branch site for E6E7 pre-mRNA splicing. Introduction of mutations at nt 388 and 384 blocked selection of the nt 416 3′ss for E6*I splicing, but activated the usage of a cryptic acceptor site at nt 636, leading to a novel 233^636 splicing isoform.
Materials and Methods
Computational Analysis for BPS Detection
Computational analysis of the 3′ end 100-bp region (from position 316 to 415) of the HPV18 E6 intron was carried out to identify a potential BPS with Human Splice Finder 3.1 (http://www.umd.be/HSF3/HSF.shtml) and default parameters. A BPS candidate is a seven-nucleotide motif (heptamer) with a consensus value (CV) calculated by its nucleotide similarity with the consensus sequence of the mammalian BPS (Desmet et al. 2009).
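The idea of scoring heptamers against a consensus can be illustrated with a minimal Python sketch. It is not the Human Splice Finder algorithm and does not reproduce its CV values: it simply counts how many positions of a heptamer are compatible with the YNYURAC consensus quoted in the Introduction. The function names, the 0.8 threshold and the 11-nt test sequence (nt 379–389, reconstructed from the two overlapping AACUAAC heptamers described in this article) are assumptions made for illustration.

```python
# IUPAC codes needed for the consensus YNYURAC (RNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "Y": "CU",    # pyrimidine
    "R": "AG",    # purine
    "N": "ACGU",  # any base
}

CONSENSUS = "YNYURAC"  # mammalian BPS consensus heptamer quoted in the text


def match_fraction(heptamer: str, consensus: str = CONSENSUS) -> float:
    """Fraction of positions compatible with the consensus.

    This is a simple illustrative score, NOT the HSF consensus value (CV).
    """
    if len(heptamer) != len(consensus):
        raise ValueError("heptamer and consensus must have the same length")
    hits = sum(base in IUPAC[code] for base, code in zip(heptamer, consensus))
    return hits / len(consensus)


def scan_candidates(seq: str, first_nt: int, min_fraction: float = 0.8):
    """Return (start_nt, heptamer, fraction) for windows with A at position 6.

    first_nt is the genome coordinate of seq[0]; min_fraction is an
    arbitrary illustrative cut-off, not a published threshold.
    """
    seq = seq.upper().replace("T", "U")
    k = len(CONSENSUS)
    candidates = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window[5] != "A":  # the branch site adenosine sits at position 6
            continue
        fraction = match_fraction(window)
        if fraction >= min_fraction:
            candidates.append((first_nt + i, window, fraction))
    return candidates


# nt 379-389 of the HPV18 E6 intron, reconstructed from the two overlapping heptamers.
for start, heptamer, fraction in scan_candidates("AACUAACUAAC", 379):
    print(start, heptamer, round(fraction, 2))
```

With this toy score the HPV18 heptamer matches the consensus at six of seven positions and the HPV16 heptamer AACAAAC at five of seven; the ordering agrees with the CV ranking reported in the Discussion, but the numbers themselves are not comparable to CV scores.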
An HPV18 E6 and E7 region (nt 103–967) with mutations at the BPS candidate motifs was amplified from the plasmid pZMZ84, which originated from the HeLa HPV18 genome, by overlapping PCR with the corresponding primer sets (Supplementary Table S1) and inserted into a pcDNA3 vector (Invitrogen) between the Hind III and Not I sites. The constructed plasmid pAYS2 had a single A384G mutation (mt-1); pAYS7 had a single A388G mutation (mt-2); and pAYS3 had both A384G and A388G mutations (mt-3). Additional constructs based on pFLAG-CMV-5.1 (Sigma-Aldrich) were used for in vivo RNA splicing and translation of Flag-tagged E6 and E7 proteins, with the Flag-tag at the N-terminus of E6 and at the C-terminus of E7, respectively. All plasmids containing the HPV18 E6 and E7 regions (nt 105–904) were constructed by overlapping PCR with the corresponding primer sets. The resulting plasmid pAYS8 had both wild type E6 and E7 ORFs (wt); pAYS9 had the A384G mutation (mt-1); pAYS11 had the A388G mutation (mt-2); and pAYS10 had both A384G and A388G mutations (mt-3).
In vitro RNA Transcription and Splicing
To generate the DNA templates for in vitro transcription, the HPV18 E6 ORF region (nts 123 to 500) was PCR amplified using oST247 and oSB70 (Supplementary Table S1) to introduce a T7 promoter (5′ end) and a U1 snRNP-binding site (3′ end) from plasmid pMA77 (Ajiro et al. 2016b), containing the wt sequence, or from pAYS2, pAYS3 or pAYS7, containing the mutated BPS as described above. One µg of PCR template was then used for in vitro transcription using Riboprobe System-T7 (Promega) in a 40-µL reaction containing 1× transcription buffer, 10 mmol/L DTT, 40 U RNase inhibitor, 0.5 mmol/L m7G(5´)ppp(5´)G cap analogue (New England Biolabs), 0.5 mmol/L of each rNTP (rATP, rUTP, rCTP, rGTP) and 60 U T7 RNA polymerase. The reactions were incubated at 37 °C for 2 h, followed by DNA template removal by a 30-min treatment with RNase-free DNase I (Promega). The resulting RNA was purified by ultrapure phenol:chloroform:isoamyl alcohol (25:24:1, v/v, Thermo Fisher Scientific) extraction, precipitated, and dissolved in DEPC-treated water. Alternatively, the in vitro transcribed RNA was radiolabeled by addition of 60 µCi [α-32P]-rGTP (Perkin Elmer) to a transcription reaction containing a reduced level (0.05 mmol/L) of cold rGTP. The radiolabeled RNA was gel-purified on a 6% denaturing urea-PAGE gel (National Diagnostics). In vitro splicing was carried out with 100 ng of non-labeled or 4 ng of 32P-labeled HPV18 E6 pre-mRNA in the presence of HeLa nuclear extract as previously reported (Zheng and Baker 2000). After 2 h of incubation, the spliced products were purified by phenol:chloroform:isoamyl alcohol extraction, precipitated and dissolved in DEPC-treated water. The spliced products generated from unlabeled RNA were used in lariat RT-PCR. The splicing products of radiolabeled RNA were separated on a 6% denaturing urea-PAGE gel (National Diagnostics), transferred to a filter paper and dried.
The signals were captured by Amersham Typhoon 5 laser scanner (GE Healthcare) and signal intensity of each spliced product was determined using ImageQuant software (GE Healthcare). RNA splicing efficiency was calculated as described (Zheng and Baker 2000).
RNase R Treatment and Lariat RT-PCR
To remove the linear part of lariat RNA intermediates, in vitro spliced products from cold E6 pre-mRNA (50 ng) or total RNA extracted from HeLa cells (10 µg) were treated with 40 U of RNase R (Epicentre) (Suzuki et al. 2006) in a 50-µL reaction at 37 °C for 2 h. Treated RNA was extracted with phenol:chloroform:isoamyl alcohol, precipitated, dried, and dissolved in DEPC-treated water. RNase R-digested products were used to identify the 5′–2′ bond in the circular lariat structure formed in the first step of E6 intron splicing by lariat RT-PCR (Ajiro and Zheng 2015b): first, circular RNA was reverse transcribed with 200 U of SuperScript II reverse transcriptase (Thermo Fisher Scientific) using the primer oAYS12 (R); the cDNA was then PCR-amplified with the primer pair oAYS12 (R) and oAYS13 (F1), followed by a semi-nested PCR with the primer pair oAYS12 (R) and oAYS8 (F2) (Supplementary Table S1). Amplified products were excised from agarose gels, cloned into the pCRII TOPO vector using the TOPO TA cloning kit (Thermo Fisher Scientific) and sequenced.
Transfections, RT-PCR and Western Blotting
Human osteosarcoma U2OS cells and human embryonic kidney HEK293T cells were cultivated in McCoy′s 5a or in DMEM media, respectively, supplemented with 1 × Penicillin–Streptomycin-Glutamine (Thermo Fisher Scientific) and 10% fetal bovine serum (GE Healthcare). Approximately 3 × 10⁶ cells of each line were plated in separate 60-mm dishes and transfected, using LipoD293 (SignaGen Laboratories), with 2 µg of Flag-tagged plasmid pAYS8, pAYS9, pAYS10 or pAYS11, or with an empty vector (pFLAG-CMV-5.1) as a control. At 24 h after transfection, each dish was treated with the proteasome inhibitor MG132 (Sigma-Aldrich) (Ajiro and Zheng 2015a) at a final concentration of 10 µmol/L for an additional six hours before sample collection. Total RNA was isolated with TRIzol (Roche) and total protein was extracted by addition of 250 µL (to U2OS cells) or 500 µL (to HEK293T cells) of 2.5 × SDS protein gel loading solution with 10% β-mercaptoethanol.
RNA from transfected cells was treated with TURBO DNA-free kit (Thermo Fisher Scientific) to remove plasmid DNA contamination, and reverse transcription (RT) was carried out with Moloney murine leukemia virus reverse transcriptase (M-MLV RT, Thermo Fisher Scientific) and random hexamer primers. Following RT, PCR was carried out with AmpliTaq (Thermo Fisher Scientific) and a primer pair of oZMZ252 plus oZMZ253 (Supplementary Table S1) to amplify HPV18 E6E7, and a primer pair of oZMZ269 plus oZMZ270 (Supplementary Table S1) to amplify GAPDH (as a loading control). The efficiency of splicing was determined based on amount of individual splicing products separated in an ethidium bromide-containing agarose gel and their signal intensities measured by Image Lab software (Bio-Rad).
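As a rough illustration of how band intensities can be turned into a splicing efficiency, the following Python sketch assumes the common definition of percent spliced as the spliced signal divided by the sum of spliced and unspliced signals. This formula is an assumption made here for illustration; the exact calculation used in this study is the one referenced in the Methods. The function name and the intensity values are hypothetical.

```python
def percent_spliced(spliced: float, unspliced: float) -> float:
    """Percent of signal in the spliced band (assumed definition).

    spliced and unspliced are background-corrected band intensities in
    arbitrary units, for example as exported from gel imaging software.
    """
    if spliced < 0 or unspliced < 0:
        raise ValueError("band intensities must be non-negative")
    total = spliced + unspliced
    if total == 0:
        raise ValueError("no signal in either band")
    return 100.0 * spliced / total


# Made-up intensities for illustration only; these are not data from this study.
print(percent_spliced(spliced=900.0, unspliced=100.0))
```

A refinement not shown here would be to correct ethidium bromide signals for amplicon length, since longer products bind more dye per molecule.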
Total protein samples were separated on NuPAGE 4%–12% Bis–Tris gels (Thermo Fisher Scientific), transferred to a nitrocellulose membrane and subsequently blotted with the following primary antibodies: rabbit anti-FLAG polyclonal antibody (F7425, Sigma-Aldrich), goat anti-E7 HPV18 polyclonal antibody (SC-1590, Santa Cruz Biotechnology) or mouse anti-β-tubulin monoclonal antibody (T5201, Sigma-Aldrich). Subsequently, the membrane was blotted with a secondary antibody (anti-rabbit, anti-goat or anti-mouse) conjugated with horseradish peroxidase (Sigma-Aldrich). The immunoreactive proteins were detected by enhanced chemiluminescence using SuperSignal West Pico PLUS Chemiluminescent Substrate (Thermo Fisher Scientific). The signal was captured with a ChemiDoc Touch imaging system (Bio-Rad) or on X-ray film. The membrane was stripped with Restore Plus Western Blot Stripping Buffer (Thermo Fisher Scientific) and reblotted with another primary antibody.
BPS Prediction by Computational Analysis
Lariat RT-PCR for BPS Identification
Verification of the Mapped BPS by Point Mutation and In vitro RNA Splicing
Function of the Mapped Branch Points at nt 384 and nt 388 in HPV18 E6E7 RNA Splicing and Oncoprotein Production in HEK293T and U2OS Cells
It has been well established that the E6*I RNA serves as an E7 mRNA and that E6 intron splicing disrupts the integrity of the E6 ORF and prevents production of full-length E6 protein (Zheng et al. 2004; Tang et al. 2006; Ajiro et al. 2012). We therefore examined the effect of the mapped branch sites at nt 384 and 388 on production of HPV18 E6 and E7 proteins by Western blotting, using total protein extracts from individual vector-transfected HEK293T or U2OS cells as described above. As shown in Fig. 5, increased production of viral E6 protein in both HEK293T and U2OS cells was accompanied by decreased expression of viral E7 protein. This inverse expression of E6 and E7 proteins from the same bicistronic E6E7 RNA correlated linearly with increased escape of the E6 intron from splicing in the bicistronic E6E7 RNA transcripts (Fig. 4). As expected, the majority (> 95%) of E6E7 pre-mRNA transcripts with wt branch sites were efficiently spliced into E6*I RNA (Fig. 4B, lanes 3 and 13), leading to predominant production of E7 protein in both HEK293T and U2OS cells (Fig. 5A–5C, lanes 3). The minimal amount of E6 protein expressed from the same wt vector was encoded by the residual unspliced E6E7 RNA (Fig. 5A, 5C, lanes 3). Disruption of either one of the mapped branch sites (mt-1 or mt-2) reduced the efficiency of E6*I splicing (Fig. 4B, lanes 5, 7, 15 and 17) and increased E6 but decreased E7 protein production, in particular for mt-2, which carries the A-to-G mutation in the 3′ss-proximal branch site at nt 388 (Fig. 5A–5C, compare lanes 4-5 to lanes 3 in both types of cells). Furthermore, disruption of both mapped branch sites blocked E6E7 RNA splicing (Fig. 4B, lanes 9 and 19), resulting in production of abundant E6 protein with little detectable E7 from this mt-3 vector (Fig. 5A–5C, compare lanes 6 with lanes 3-5 in both types of cells).
Altogether, these data indicate the mapped branch sites are essential for E6E7 RNA splicing to regulate the production of viral E6 and E7 proteins. In this regard, the 3′ss-proximal branch site at nt 388 is more potent than its distal branch site at nt 384 in regulation of E6 and E7 protein production.
In HR-HPV, splicing of the E6 intron from a bicistronic E6E7 pre-mRNA is a crucial step in controlling expression of the E6 and E7 oncogenes (Zheng et al. 2004). Efficient RNA splicing depends on multiple RNA cis-elements, including a functional BPS (a mammalian consensus heptamer YNYURAC) in each intron, and on cellular splicing factors (Zheng 2004; Lee and Rio 2015; Shi 2017). We had previously mapped the branch site of HPV16 E6 intron splicing to an adenosine at nt 385 (Ajiro et al. 2012). In this study, we found that the HPV18 E6 intron utilizes two alternative branch sites, one at nt 384 and the other, preferred one at nt 388, for its RNA splicing and expression of viral E6 and E7.
Although both HPV16 and HPV18 E6 introns initiate the first step of splicing by using an adenosine at the sixth position of the mapped heptamer BPS as the branch site, the mapped BPS in the HPV16 E6 intron is AACAAAC, whereas that in the HPV18 E6 intron is AACUAAC, present in two overlapping copies from nt 379 to 385 and from nt 383 to 389. Thus, the mapped BPS in the HPV18 E6 intron differs from that of the HPV16 E6 intron only at the fourth nucleotide position, U for HPV18 and A for HPV16, giving the mapped HPV18 E6 BPS a better CV score (77%) than the HPV16 E6 BPS (62%) (Ajiro et al. 2012) against the consensus sequence YNYURAC of a mammalian BPS. Although multiple adenosines are present in each mapped BPS, our point mutation studies indicate that only the adenosine at the sixth position of the mapped BPS was used as a branch site during E6 RNA splicing. These data show how accurately the cellular splicing machinery executes U2 snRNP recognition of each mapped BPS in HPV18 E6 splicing. By sliding only four nucleotides, U2 could simply switch to the alternative BPS to regulate the efficiency of E6 intron splicing.
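The relationship between the two overlapping heptamers and the A-to-G mutants described in the Methods can be made concrete with a short Python sketch. The 11-nt sequence below (nt 379–389) is reconstructed from the coordinates and heptamer given in this article rather than copied from a reference genome, and the helper names are hypothetical. The sketch only tracks which branch site positions still carry an adenosine in each construct; it does not model splicing.

```python
REGION_START = 379
# nt 379-389, reconstructed from the two overlapping AACUAAC heptamers
# (nt 379-385 and nt 383-389) described in the text.
REGION = "AACUAACUAAC"

# Branch site adenosines: the sixth position of each heptamer.
BRANCH_SITES = (384, 388)

# Mutant constructs as named in the Methods (A384G, A388G, or both).
MUTANTS = {
    "wt": [],
    "mt-1": [(384, "G")],
    "mt-2": [(388, "G")],
    "mt-3": [(384, "G"), (388, "G")],
}


def mutate(seq: str, start: int, changes) -> str:
    """Apply (nt, new_base) substitutions given in genome coordinates."""
    bases = list(seq)
    for nt, new_base in changes:
        bases[nt - start] = new_base
    return "".join(bases)


def intact_branch_sites(seq: str, start: int, sites=BRANCH_SITES):
    """Return the branch site coordinates that still carry an adenosine."""
    return [nt for nt in sites if seq[nt - start] == "A"]


for name, changes in MUTANTS.items():
    mutant_seq = mutate(REGION, REGION_START, changes)
    print(name, mutant_seq, intact_branch_sites(mutant_seq, REGION_START))
```

Because the two heptamers start four nucleotides apart, nt 384 is both the branch site of the first heptamer and the second position of the second heptamer, which is an unconstrained (N) position in the YNYURAC consensus.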
Successful mapping of the two alternative BPSs that control HPV18 E6 intron splicing also led us to define the length and composition of the PPT between the mapped BPS and the nt 416 3′ss. Depending on which branch site is selected for HPV18 E6 intron splicing, the PPT at this 3′ss is either 26 nts long if the nt 388 branch site is used or 30 nts long if the nt 384 branch site is chosen. The 26-nt PPT is interrupted by 12 purines (9 As and 3 Gs) and the 30-nt PPT by 14 purines (11 As and 3 Gs) upstream of the AG dinucleotide at the 3′ss. A PPT composed of mixed pyrimidines and purines, in particular cytosine and adenosine, is suboptimal for interaction with U2AF and other related splicing factors necessary for U2 recruitment (Berglund et al. 1998; Sickmier et al. 2006; Tavanez et al. 2012; Agrawal et al. 2016; Sutandy et al. 2018), which affects the efficiency of RNA splicing (Zamore et al. 1992; Sohail and Xie 2015). Since the 26-nt PPT has fewer adenosines than the 30-nt PPT and is thereby a relatively stronger PPT, this feature may explain why the 3′ss-proximal branch site at nt 388 is preferentially used for HPV18 E6*I splicing. The importance of the nt 388 branch site in E6*I splicing was also revealed by the finding that disruption of this proximal branch site activated a cryptic acceptor site at nt 636 and led to aberrant 233^636 splicing, which was not previously noted.
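The PPT lengths quoted above follow from simple coordinate arithmetic, sketched here in Python. The sketch assumes that the PPT is counted from the nucleotide immediately after the heptamer (the branch site is the sixth of seven positions) through nt 415, the last nucleotide before the nt 416 3′ss; this convention is an assumption chosen because it reproduces the 26-nt and 30-nt figures in the text. The purine counts are taken from the text, and the function names are hypothetical.

```python
THREE_SS = 416  # 3' splice site coordinate given in the text


def ppt_length(branch_site: int, three_ss: int = THREE_SS) -> int:
    """Length of the tract between the BPS heptamer and the 3' splice site.

    Assumes the heptamer ends one nucleotide after the branch site and the
    tract runs through three_ss - 1.
    """
    heptamer_end = branch_site + 1
    first = heptamer_end + 1
    last = three_ss - 1
    return last - first + 1


# Purine counts (A, G) per tract, as quoted in the text.
PURINES = {388: (9, 3), 384: (11, 3)}

for site, (n_a, n_g) in PURINES.items():
    length = ppt_length(site)
    pyrimidines = length - (n_a + n_g)
    print(f"branch site nt {site}: PPT {length} nt, "
          f"{n_a} A + {n_g} G, {pyrimidines} pyrimidines")
```

The four nucleotides that distinguish the two tracts are nt 386–389, which under the reconstructed heptamer sequence are UAAC, consistent with the two additional adenosines reported for the 30-nt tract.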
It has been well established that E6 intron splicing from a bicistronic E6E7 pre-mRNA creates a premature stop codon that increases the intercistronic space for efficient translation termination-reinitiation, and is thus crucial for generating E6*I RNA to translate E7 protein, while E6 intron retention is necessary for keeping the E6 ORF intact for production of E6 protein (Zheng and Baker 2006; Tang et al. 2006; Zheng 2010; Ajiro and Zheng 2014). By transfection of HEK293T and U2OS cells with individual branch site-mutated constructs, we showed that the branch site at nt 388 was more potent than the branch site at nt 384 not only for E6*I splicing but also for the production of E7 protein. We observed that efficient production of the spliced E6*I mRNA led to efficient production of E7 protein but a smaller amount of E6 protein. The reverse was true when reduced E6*I splicing led to more retention of the E6 intron. Despite the bias in branch site selection, we assume that both branch sites are useful for HPV18 infection. By alternative selection of the two mapped branch sites for splicing of the bicistronic E6E7 pre-mRNA, HPV18 is able to adjust its production ratio of the two oncogenic, multifunctional E6 and E7 proteins during the progression of productive infection, cell immortalization and transformation (Fig. 5D). This hypothesis is consistent with the recent observation that most human introns are recognized through multiple and tissue-specific BPS (Pineda and Bradley 2018). However, further studies are needed to understand which host splicing factor(s) contribute to selection of the proximal over the distal branch site for E6*I splicing, given that the expression of host splicing factors keeps changing from undifferentiated keratinocytes in the basal layer to highly differentiated keratinocytes in the spinous/granular layers (Fay et al. 2009; Mole et al. 2009; Ajiro et al. 2016a).
This research was fully supported by the Intramural Research Program of the National Institutes of Health, National Cancer Institute, Center for Cancer Research (1ZIASC010357 to ZMZ). This study is a part of the Ph.D. thesis of Brant AC, being developed at the Post-graduate program in Genetics (PGGEN) of Rio de Janeiro Federal University (UFRJ), Rio de Janeiro, Brazil and at the National Cancer Institute of USA. Part of the author′s fellowship was supported by the PDSE program of Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (Capes, Brazil) and by the National Cancer Institute of USA.
ZMZ, ACB, VM designed the experiments. ACB, VM carried out the experiments. ACB, VM, ZMZ analyzed the data. ACB, VM, ZMZ wrote the paper. ZMZ, MAMM, ACB, VM checked and finalized the manuscript. All authors read and approved the final manuscript.
Compliance with Ethical Standards
Conflict of interest
The authors declare that they have no conflict of interest.
Animal and Human Rights Statement
This article does not contain any studies with human or animal subjects performed by any of the authors.
- Ajiro M, Tang S, Doorbar J, Zheng ZM (2016b) Serine/Arginine-rich splicing factor 3 and heterogeneous nuclear ribonucleoprotein A1 regulate alternative RNA splicing and gene expression of human papillomavirus 18 through two functionally distinguishable cis elements. J Virol 90:9138–9152
- Sutandy FXR, Ebersberger S, Huang L, Busch A, Bach M, Kang HS, Fallmann J, Maticzka D, Backofen R, Stadler PF, Zarnack K, Sattler M, Legewie S, König J (2018) In vitro iCLIP-based modeling uncovers how the splicing factor U2AF2 relies on regulation by cofactors. Genome Res 28:699–713
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.