Identification of immunogenic regions within the alternative reading frame protein of hepatitis C virus (genotype 3)


DOI: 10.1007/s10096-011-1194-1

Cite this article as:
Qureshi, H., Qazi, R., Hamid, S. et al. Eur J Clin Microbiol Infect Dis (2011) 30: 1075. doi:10.1007/s10096-011-1194-1

Abstract

Hepatitis C virus (HCV) encodes ten classic proteins as well as a newly discovered alternative reading frame protein (ARFP) whose synthesis originates from the core region by a +1 frameshift. ARFP is produced by all HCV genotypes, but its function remains unknown. Although the immunogenicity of genotype 1- and 2-derived ARFP in infected hosts has been reported, no information is available for genotype 3-encoded ARFP. The HCV genotype 3 core/ARFP region was PCR amplified, cloned, and sequenced. Recombinant ARFP (rARFP) and peptides were employed in ELISAs with patient serum samples. The effect of peptides on peripheral blood mononuclear cells (PBMCs) was also studied. DNA cloning and sequencing of an HCV genotype 3 strain (PKHCV3) revealed that it encodes a 160 aa ARFP, which harbors a C-terminal extension of 36 aa. Serum from 74 of 88 patients (84%) contained rARFP-reactive antibodies. Peptide ELISAs showed that all regions of rARFP were immunogenic, with peptide F7 (DSLSPRRAGAKAGPGLSPGT) being the most immunodominant. When incubated with PBMCs from HCV-infected individuals, F7 stimulated the production of TNFα and IL10. PKHCV3 thus encodes a 160 aa ARFP, and antibodies against its entire length are found in 84% of the genotype 3-infected subjects tested. Peptide ELISAs revealed F7 to be highly immunogenic and capable of eliciting strong T-cell responses.

Introduction

Chronic hepatitis C virus (HCV) infection is one of the most common causes of liver failure and reduces the quality of life of over 170 million people worldwide. HCV is transmitted through blood or blood products and is categorized into six major genotypes, some of which differ from each other by as much as 40% at the nucleotide level. Genotypes 1, 2, and 3 have worldwide distribution, while genotypes 4, 5, and 6 are localized to certain regions. Current therapeutic options are costly, associated with unpleasant side effects, and achieve a considerably higher sustained virological response (SVR) in genotype 2- and 3-infected subjects than in those with genotype 1, which is predominant in North America. No vaccine is currently available that confers protection against HCV infection, but one or more potent small-molecule inhibitors of HCV replication are likely to reach clinics within one year [1].

HCV belongs to the family Flaviviridae and harbors a 9.6-kb positive-sense ssRNA genome that was previously thought to encode only three structural (i.e., core, E1, and E2) and seven non-structural (i.e., p7, NS2, NS3, NS4a, NS4b, NS5a, and NS5b) proteins, which are synthesized as a ∼3,000 amino-acid-long precursor and subsequently processed into individual polypeptides by viral and host proteases [2]. An eleventh protein is also produced from within the core region and is known as the alternate reading frame protein (ARFP or F-protein) [3, 4]. Synthesis of ARFP was initially reported to occur by +1 translational frameshifting within the adenine-rich region spanning codons 8–11 of the core ORF, but subsequent studies showed that synthesis of other ARFP isoforms also initiates from codon 42 by translational frameshift, or internally from codon 26 or codons 85 and 87 [5, 6, 7]. Transcriptional slippage at codons 8–11 also appears to contribute to the recoding of the core ORF [8]. All HCV genotypes encode ARFP. Full-length ARFP that starts from codons 9–11 directly associates with proteasome subunit protein α3 via a 20 aa sequence and is degraded by a ubiquitin-independent pathway [9].

ARFP is an intrinsically disordered protein with a tendency to self-aggregate and is not required for HCV replication [10, 11]. Biochemical studies have found ARFP to associate with a number of proteins, including MM1, which negatively influences the transcriptional activator function of c-Myc [12, 13]. ARFP also impacts the expression of several genes, but its own synthesis is suppressed by core [14, 15, 16]. Notably, ARFP down-regulates expression of the cell cycle inhibitor p21 and inhibits apoptosis by activating the NF-κB pathway [17]. Although the function of ARFP and its role in viral pathogenesis and disease progression remain unclear, the foregoing observations suggest that ARFP promotes cell proliferation by influencing the expression of a discrete set of genes.

Anti-ARFP antibodies are found in sera of both acutely and chronically infected individuals [18, 19, 20], and their specificity has been validated [21, 22]. Although ARFP might not be required for HCV replication, the presence of specific B- and T-cell responses against ARFP in HCV-infected subjects suggests it may be a useful immunogen [18]. Immunogenicity of genotype 1- and 2-derived ARFP in infected hosts has been reported previously, but no information is available for genotype 3-encoded ARFP [4]. Herein, we report the cloning of an extended ARFP from an HCV genotype 3 strain and demonstrate that it elicits strong humoral immune responses in the vast majority of chronically infected individuals.

Materials and methods

Ethical consent and patient selection

This study was approved by the Ethical Review Committee of Aga Khan University (ref.: 731-BBS/ERC-07). Informed consent was obtained from all subjects found to be HCV genotype 3-positive through the HCV Amplicor kit and Linear Array assay (Roche). Selected subjects were between 18 and 60 years of age and did not appear to have any other infection or disease. Approximately 5 ml of blood was drawn from each subject and sera were stored at −70°C until use. For cytokine analysis, blood was drawn from subjects who had not begun therapy and was processed immediately for isolation of peripheral blood mononuclear cells (PBMCs).

DNA cloning, expression, and purification of recombinant F-protein

Viral RNA was extracted with the High Pure Viral RNA Kit (Roche) and converted into cDNA with the MuLV kit (Invitrogen). PCR amplification of the core region was carried out with Taq DNA polymerase (Biobasic) and gene-specific primers (C-1F: 5′ATAGGGTGCTTGCGAGTGC 3′ and C-1R: 5′CCAGACGTATTCCGCCACT 3′), and the resulting 642-bp amplicon was cloned into pGEM-T (Promega). One recombinant clone (pGEM-T-C) was subjected to DNA sequencing and used as a template to amplify the ARFP-encoding region with a second primer set (F-1F: 5′GATCGGATCCGCACACTTCCTAAACCTCAAA 3′ and F-1R: 5′GATCAAGCTTCCCCGTCTTCAAGGG 3′). After amplification, the 480-bp PCR product was cloned into the BamHI and HindIII sites of pQE-30 (Qiagen) and the resulting recombinant plasmid was used to transform chemically competent M15 [pREP4] cells. M15 cells containing pQE30-ARFP were grown to an OD595 of ∼0.2 and expression was induced with 1 mM IPTG. Cells were grown at 37°C for 5 h and harvested. His6-ARFP was purified by Ni-NTA affinity chromatography under denaturing conditions according to the manufacturer’s protocol (Qiagen). To obtain native protein, purified rARFP in urea buffer was dialyzed against 500 ml of buffer containing 50 mM Tris-HCl (pH 7.5), 1 mM DTT, 0.5% NP-40, 50 mM NaCl, 0.5% sarkosyl, and 5% glycerol.


Fifteen 20-amino-acid (aa) peptides (F1–F15) with 10 aa overlap, spanning the entire length of PKHCV3 ARFP (160 aa), and five peptides spanning the first 60 aa of core (C1–C5) were custom synthesized (ChinaTech Peptide Co.). The peptide sequences are shown in Table 1. Peptides were >80% pure and were solubilized in water at a concentration of 10 mg/ml.
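
The tiling scheme described above can be checked with a short sketch. It is only an illustration: the function name `tile_peptides` and the 0-based, half-open offsets are assumptions made here, not the authors' design tool. With 20 aa windows and a 10 aa step, a 160 aa protein yields 15 windows and a 60 aa stretch yields 5.

```python
def tile_peptides(protein_length, peptide_length=20, step=10):
    """Return (start, end) offsets of overlapping peptide windows.

    Offsets are 0-based and half-open, so (60, 80) covers the 20
    residues that follow the first 60 residues of the protein.
    """
    windows = []
    start = 0
    # Keep only windows that fit entirely inside the protein.
    while start + peptide_length <= protein_length:
        windows.append((start, start + peptide_length))
        start += step
    return windows


arfp_windows = tile_peptides(160)  # corresponds to F1..F15
core_windows = tile_peptides(60)   # corresponds to C1..C5

print(len(arfp_windows), arfp_windows[6])  # seventh ARFP window (F7)
print(len(core_windows), core_windows[2])  # third core window (C3)
```

Under this convention the seventh ARFP window spans offsets 60 to 80 and the third core window spans offsets 20 to 40.
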
Table 1

Sequences of the 15 ARFP (F1–F15) and five core (C1–C5)-derived peptides

Enzyme-linked immunosorbent assays (ELISAs)

Wells were coated overnight with 0.5 μg/ml purified rARFP, blocked with 3% BSA in PBST (PBS + 0.1% Tween) for 3 h, and washed. Sera were diluted 1:2,500 in blocking buffer and 100 μl was added to each well for 1 h. Wells were washed and 100 μl of 1:10,000 anti-human HRP-linked antibody (Dako) was placed in each well for 30 min. After washing, 100 μl of TMB (Sigma) was added for color development for 20 min and the reaction was terminated with 100 μl of 1 N HCl. Plates were placed in an ELISA reader (StatFax) and absorbance was read at 450 nm. All ELISAs were carried out in duplicate and the average OD450 was calculated. The cutoff was determined as the mean + 3SD of results from three HCV-negative samples (controls) plus 0.1. A serum sample was considered positive when its absorbance exceeded the cutoff [19]. For peptide ELISAs, the same procedure was used, except that each well was coated with peptide at 2 μg/ml and that sera and secondary antibody were used at dilutions of 1:500 and 1:20,000, respectively. Sample OD values were divided by the mean + 3SD of ODs obtained from HCV-negative samples (controls) and plotted graphically using GraphPad Prism.
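
The cutoff and normalization rules above can be written out as a small sketch. The function names, the use of the sample standard deviation (the text does not say whether sample or population SD was used), and the control OD values are all assumptions made for illustration; they are not data from this study.

```python
from statistics import mean, stdev


def control_baseline(control_ods):
    """Mean + 3 SD of the HCV-negative control readings (sample SD assumed)."""
    return mean(control_ods) + 3 * stdev(control_ods)


def elisa_cutoff(control_ods, offset=0.1):
    """Cutoff for the rARFP ELISA: control baseline plus a fixed offset of 0.1."""
    return control_baseline(control_ods) + offset


def is_positive(sample_od, cutoff):
    """A sample is scored positive when its OD450 exceeds the cutoff."""
    return sample_od > cutoff


def signal_ratio(sample_od, control_ods):
    """Peptide ELISA normalization: sample OD divided by the control baseline."""
    return sample_od / control_baseline(control_ods)


# Hypothetical control readings, each already averaged over duplicate wells.
controls = [0.10, 0.12, 0.14]
cutoff = elisa_cutoff(controls)

print(round(cutoff, 3))
print(is_positive(0.50, cutoff), is_positive(0.20, cutoff))
print(round(signal_ratio(0.36, controls), 2))
```

With these made-up controls the baseline is 0.18, so the cutoff is 0.28 and a peptide ELISA reading of 0.36 normalizes to a ratio of 2.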

Antibody generation and Western blotting

Purified rARFP (10 μg) was separated on preparative 12% SDS-PAGE, and the band was excised, homogenized, mixed with Freund’s complete adjuvant (Sigma-Aldrich), and injected intraperitoneally into two BALB/c mice. Injections were given at 10-day intervals; the second, third, fourth, and fifth injections contained Freund’s incomplete adjuvant. Ascites fluid was collected after the fourth injection. Animals were euthanized 10 days after the last injection and blood was collected. For Western blotting, ARFP (100 ng) was separated on 12% SDS-PAGE and electro-transferred onto nitrocellulose membrane. The membrane was blocked in TBST containing 5% non-fat milk, incubated with primary antibody (1:400) for 1 h at room temperature, and washed. Following incubation with anti-mouse HRP-conjugated IgG (1:5,000; Abcam) for 1 h, the blot was incubated in ECL Plus detection solution (Amersham) and bands were detected by autoradiography.

PBMC isolation and cytokine analysis

Collected blood was mixed with an equal volume of RPMI supplemented with 1 mM HEPES and 200 mM L-glutamine (Sigma). The mixture was layered over 10 ml of Histopaque (Sigma) and centrifuged at 1,400 rpm for 25 min at 25°C. Uppermost layer was collected and 1 ml was added to 9 ml RPMI medium (10% autologous plasma). The buffy coat interface was carefully collected and washed with 20 ml of RPMI twice by spinning at 1,400 rpm for 10 min at 25°C. Cells were suspended in 2 ml of 10% autologous plasma and seeded at 10,000 cells per well. Stimulation with peptides F7 and C3 (10 μg/ml each) was done in duplicate and supernatants collected 48 h post-stimulation and stored at −70°C. LPS (10 pg/ml) was employed as positive control. TNFα and IL10 were quantified using the Human Th1/Th2 Cytokine Cytometric Bead Array Kit (Cat #550749, BD Biosciences), as per manufacturer’s instructions using a BD FACS Array Bioanalyzer with FCAP Array software.

Statistical analyses

Dunnett’s multiple comparison test was applied after one-way analysis of variance (ANOVA) to compare the difference between columns (p-value < 0.05 and 95% confidence interval [CI]). Mann–Whitney U-test was applied to ascertain differences between columns representing cytokine levels (p-value < 0.05 and 95% CI). Statistical analyses were carried out using GraphPad Prism.
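
As a reminder of what the Mann–Whitney comparison measures, the U statistic can be computed by direct pairwise counting, as in the sketch below. The analyses in this study were run in GraphPad Prism; this hand-rolled function, its name, and the example values are illustrative assumptions only, and the sketch does not compute a p-value.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs (a, b), a from x and b from y, in which a > b,
    with ties contributing 0.5. U_x + U_y always equals len(x) * len(y).
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical cytokine readings for two stimulation conditions.
group_a = [5, 6, 7]
group_b = [1, 2, 3]
print(mann_whitney_u(group_a, group_b))
```

When every value in one group exceeds every value in the other, as in this made-up example, one U equals the number of pairs and the other is zero, which is the most extreme separation the test can register.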

Results

To gauge the immunogenicity of ARFP encoded by HCV genotype 3, which is prevalent in Southeast Asia and the Indian subcontinent, we collected sera from 88 genotype 3-infected individuals. Viral RNA was purified from one of the samples, converted into cDNA, and employed as template to PCR-amplify the 642-bp product spanning the core region, which was cloned into pGEM-T. DNA sequencing of the insert revealed a 573-bp region which, when theoretically translated in the zero and +1 ORFs, was found to encode a 191 aa core and a 160 aa ARFP, respectively. We refer to this 573-bp sequence as PKHCV3. The DNA sequence of core along with the conceptual translations of both proteins is shown in Fig. 1a. PKHCV3 core is highly homologous to its counterparts from genotypes 1, 2, 4, 5, and 6, which, altogether, share ∼80% identity at the amino acid level (data not shown). Similar analyses with PKHCV3 ARFP revealed that its N-terminal 124 aa share a high degree of homology with five other genotype 3-derived ARFPs retrieved randomly from the GenBank database, but that it contains an additional 36 aa at its C-terminus (Fig. 1b). Whereas some genotype 3-derived ARFPs are 124 aa (e.g., NZL and 3D), others (e.g., TH85, PK1, and 3C) contain variable C-terminal extensions which are consequences of single-point mutations that convert stop codons into those representing amino acids at positions 125 (Leu), 143 (Glu and Trp), and 154 (Trp). PKHCV3 ARFP, however, contains the longest C-terminal extension observed among all genotype 3-encoded ARFPs reported to date. Comparison of the initial 124 aa of PKHCV3 ARFP with other genotype 3-encoded ARFPs showed ∼60% (74/124) identity. PKHCV3 ARFP was also compared to ARFPs from genotypes 1, 2, 4, 5, and 6 (Fig. 1c). With the exception of genotype 1 ARFP, which also contains a 36 aa C-terminal extension, all other genotypes encode 124 aa proteins. Overall, ARFPs across the six genotypes are poorly conserved, with only 30% (38/124) identity. Moreover, in the across-genotype comparison, amino acid stretches 1–10 and 28–40 appear to be somewhat conserved in ARFPs.
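
The two-frame conceptual translation can be illustrated with a minimal sketch. It uses the standard genetic code and the gene-specific portion of primer F-1F (the 21 nt following the BamHI site), which begins with the first ARFP codon, GCA. The function name `translate` is an assumption, and reading the core frame from offset 2 of this fragment is an inference from ARFP lying in the +1 frame relative to core; neither is taken from the authors' workflow.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate complete codons of `dna`, starting at 0-based `offset`."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Gene-specific part of primer F-1F; offset 0 is the ARFP (+1) frame.
fragment = "GCACACTTCCTAAACCTCAAA"

print(translate(fragment, 0))  # ARFP frame
print(translate(fragment, 2))  # core (zero) frame, one base behind ARFP

# Length bookkeeping from the text: 573 bp of core, 160 aa of ARFP.
print(573 // 3, 160 * 3)
```

The last line reproduces the arithmetic in the text: 573 bp corresponds to 191 codons, and a 160 aa ARFP corresponds to the 480-bp PCR product.
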
Fig. 1

ARFP/core sequences and alignments. a DNA sequence of the core region of PKHCV3 and its theoretical translation in zero and +1 open reading frames (ORFs). The core region is 573 bp and encodes 191 aa. The +1 ORF represents ARFP, which is 160 aa in length. The first (GCA) and stop (TAA) codons of the ARFP ORF are highlighted. b Alignment of PKHCV3 ARFP with other genotype 3 sequences retrieved randomly from the GenBank database. Sequences and their accession numbers (in brackets) are as follows: th85 (D14307.1), nzl (D17763.1), PKHCV3, pk1 (GU294484.1), 3d (D16620), and 3c (D16612.1). c Alignment of PKHCV3 ARFP with genotypes 1, 2, 4, 5, and 6 sequences retrieved randomly from the GenBank database. Sequences and their accession numbers (in brackets) are as follows: G1 (AM262510.1), G2 (AF238483.1), PKHCV3, G4 (U33436.1), G5 (U10214.1), and G6 (U33435.1). The sequence of peptide F7 is shaded

Next, we expressed the 160 aa PKHCV3 ARFP as hexahistidine-tagged recombinant protein in E. coli M15 strain and purified it to homogeneity by affinity chromatography. SDS-PAGE analysis of purified rARFP revealed the ∼19-kDa protein to be >95% pure (Fig. 2a). Approximately 100 ng of purified rARFP along with 20 μg of whole-cell extract from the M15 strain were subjected to Western blotting with anti-ARFP antibodies. This exercise demonstrated that anti-ARFP antibodies specifically recognize purified rARFP as a single band and do not cross-react with any cryptic epitopes within the E. coli M15 proteome (Fig. 2b).
Fig. 2

Recombinant ARFP and ELISAs. a SDS-PAGE analysis of affinity-purified rARFP. b Western blot probed with polyclonal anti-ARFP antibodies generated in mice. E. coli M15 strain lysate (lane 1) and purified ARFP (lane 2). c Results of ELISAs carried out with rARFP and serum samples from 88 infected individuals. Strongly reactive anti-ARFP antibodies were detected in 74 out of 88 samples. Cutoff was determined as the mean + 3SD of the results from three HCV-negative samples (controls) plus 0.1. Sample readings above the cutoff were deemed to be positive

The magnitude of immune response elicited by ARFP was then evaluated by ELISA using purified rARFP, along with sera from 88 individuals chronically infected with HCV genotype 3. Sera from three healthy HCV-negative individuals were employed as negative controls and used to define a reference baseline or cutoff point. The results of this experiment are shown graphically in Fig. 2c and demonstrate that significant anti-ARFP antibody reactivity was present in sera from 74 out of 88 (84%) chronically infected subjects.

Although the translation of ARFP has been shown to initiate from different regions, we adopted an unbiased strategy and designed a series of 15 peptides, each 20 aa long with 10 aa overlap, spanning the entire length of PKHCV3 ARFP in order to determine which of its regions are immunogenic (Table 1). Towards that end, the 28 serum samples that had exhibited the best antibody reactivity against rARFP were selected from the 74 positive samples and used in ELISAs with each of the ARFP peptides F1–F15. To minimize experimental error, all reactions were carried out in duplicate. Readings for each peptide were averaged and plotted as a bar graph (Fig. 3a). This experiment showed that, although all regions of ARFP are immunogenic, peptide F7 spanning amino acids 60–80 was the most immunodominant, followed by F14 and F15, which represent the last 30 aa of ARFP. To compare the intensity of host immune responses elicited by ARFP with those elicited by core, the same exercise was repeated with five peptides spanning amino acids 1–60 of PKHCV3 core, and the findings are shown in Fig. 3b. All core peptides were bound by antibodies from infected individuals, with peptide C3 (aa 20–40) being the most reactive, followed by peptides C2, C4, and C5, whose immunogenicity was equivalent but greater than that of F7.
Fig. 3

Peptide ELISAs. Peptides F1–F15 spanning full-length ARFP (a) and C1–C5 corresponding to amino acids 1–60 of core protein (b) were used in ELISAs together with 28 serum samples from infected individuals. Peptides F7 and C3 were found to be the most immune-reactive using one-way ANOVA, Dunnett’s multiple comparison test (p < 0.05). The error bars represent the standard error of the mean (SEM). Sample OD values were divided by mean + 3SD of the results from three HCV-negative samples (controls) and plotted graphically using GraphPad Prism

As antibodies reactive against peptides F7 and C3 were prevalent in genotype 3-infected individuals, we also evaluated their potential to stimulate TNFα and IL10 synthesis from circulating T-cells. For this purpose, blood from 12 genotype 3-infected individuals who had not received any therapy and three HCV-negative healthy subjects was collected and used to purify PBMCs. All isolated PBMCs produced TNFα when treated with lipopolysaccharide (data not shown). Incubation of PBMCs with peptide F7 resulted in the secretion of much higher amounts of both TNFα and IL10 as compared to those incubated with C3 (Fig. 4a, b). Specifically, F7-stimulated T-cells produced 6-fold and 40-fold more TNFα and IL10, respectively, in comparison to C3-stimulated cells.
Fig. 4

Stimulation of peripheral blood mononuclear cells (PBMCs). Peptides F7 and C3 were used to stimulate purified PBMCs obtained from infected subjects. a TNFα response to peptides. Stimulation of HCV-infected PBMCs (dark bars, n = 11) by F7 resulted in a higher secretion of TNFα as compared to stimulation by C3 (p < 0.05, denoted by the star), whereas the stimulation of PBMCs from healthy individuals (light bars, n = 3) by F7 and C3 was less. b IL10 response to peptides. Stimulation of HCV-infected PBMCs (n = 12) by F7 resulted in a higher secretion of IL10 as compared to stimulation by C3 (p < 0.05, denoted by the star), while the stimulation of PBMCs from healthy individuals (light bars, n = 3) by F7 and C3 was less. The error bars represent the SEM

Discussion

We report the cloning of ARFP from HCV genotype 3 strain PKHCV3 capable of encoding a 160 aa protein that harbors a 36 aa extension at its C-terminus. We find that 84% of chronically infected subjects contained antibodies which bind to recombinant PKHCV3 ARFP. All regions of this ARFP were immunogenic, especially amino acids 60–80 (peptide F7, shaded in Fig. 1) and the C-terminal 30 amino acids. Additionally, circulating memory T-cells collected from all chronically infected individuals responded to F7 stimulation by secreting large amounts of TNFα and IL10.

The majority of ARFP sequences in the GenBank database encode proteins of 124 aa. The 160 aa ARFP from PKHCV3 identified in this study is the longest reported so far from a genotype 3 strain. Beyond 124 aa, its 36 aa extension is a consequence of three single-point mutations that convert stop codons into those which represent amino acids at positions 125 (Leu), 143 (Glu and Trp), and 154 (Trp). A 36 aa C-terminal extension is also present in genotype 1-encoded ARFPs but is absent from genotypes 2, 4, 5, and 6 strains. Very little homology is observed between the C-terminal 36 aa of PKHCV3 and genotype 1 ARFPs. Across six genotypes, ARFP is not as well conserved as core, which is 80% identical at the amino acid level. ARFPs within genotype 3 strains are ∼60% identical, moderately conserved within genotypes 1, 2, and 3 (∼44% identity) [18], and poorly conserved across the various genotypes (∼30% identity). Differential rates of evolution of the two proteins are most likely due to the preferential accumulation of mutations in the wobble base of core codons, which corresponds to the second base within ARFP codons. Such targeted mutations should preserve core but accelerate ARFP evolution. Notwithstanding the lack of ARFP conservation across genotypes, amino acids 1–10 and 28–40 are two notable stretches conserved in all ARFPs, but their role in ARFP function remains unknown.
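
The wobble-base argument can be made concrete with a short sketch based on the standard genetic code. It counts, for each codon position, how many of the 192 possible single-base substitutions leave the encoded residue unchanged, and maps a base position in a core codon to its position in the overlapping +1-frame (ARFP) codon. The function names are assumptions, and treating stop-to-stop changes as synonymous is a simplification chosen here.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def synonymous_counts():
    """Count synonymous single-base substitutions at codon positions 1, 2, 3."""
    counts = [0, 0, 0]
    for codon, residue in CODON_TABLE.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == residue:
                    counts[pos] += 1
    return counts


def arfp_position(core_position):
    """Map a base position (1-3) in a core codon to the +1-frame codon position."""
    return (core_position - 2) % 3 + 1


print(synonymous_counts())  # out of 192 possible substitutions per position
print(arfp_position(3))     # core wobble base lands on ARFP position 2
```

Third-position changes are synonymous far more often than second-position changes, so a substitution that is silent in core will almost always alter the ARFP residue, which is consistent with the faster divergence of ARFP described above.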

Several investigators have reported the presence of ARFP-specific antibodies in patient sera [18, 20, 22, 23, 24, 25]. Studies employing recombinant ARFP in ELISAs found that a larger percentage of infected individuals have anti-ARFP antibodies compared to those which used one or more peptides, underscoring the importance of using full-length protein as opposed to peptides as screening antigens. Valuable data have been generated from these reports, but in light of across-genotype heterogeneity in ARFPs, more reliable information is likely to be generated with a homologous system that employs ARFP and sera from individuals infected with the same genotype. Using a homologous system, we find that the majority (84%) of chronically infected individuals contain ARFP-specific antibodies. This percentage of anti-ARFP-positive infected subjects is much higher than that previously reported [18, 20, 22, 23, 24, 25].

Although only one genotype 3-derived clone was sequenced in this study, it is notable that the 36 aa extension found in PKHCV3 appears to be present in the vast majority of genotype 3-infected patients, as evidenced by the presence of antibodies against it. This C-terminal ARFP region is more immunodominant than any other region except F7.

The highly immunogenic region which we identified within PKHCV3 ARFP is not conserved in other genotypes, implying that it is of limited use as a protective antigen. Intriguingly, we find that, while core C3 peptide generated higher antibody titers as compared with peptide F7, circulating T-cells in infected subjects were more responsive to F7. The different T-cell responses towards the two peptides may be difficult to reconcile, but it is tempting to speculate that anti-ARFP antibody titers are low because, in contrast with core, ARFP is short-lived, evolves rapidly and, perhaps, is expressed late in infection. Moreover, since we have used PBMCs from untreated infected individuals, it would be useful to know the impact of therapy, if any, on cell-mediated responses towards F7 and core peptides.

The translation of ARFP has been found to initiate from codons 8–11, 26, 42, and 85/87 [5, 6, 7] within the core nucleotide sequence via translational frameshifting, transcriptional slippage, or internally, resulting in the production of different ARFP isoforms with variable N-terminal regions. Which ARFP isoforms are produced in what stoichiometry from PKHCV3 remains to be determined, but we note the presence of antibodies against peptide F1 (aa 1–20) in all of the 28 serum samples selected for the peptide-based ELISA experiment. This observation indicates that the synthesis of some, if not all, ARFPs initiates from codons 8–11.

Taken together, our findings indicate that most of the HCV genotype 3-derived ARFPs are 160 aa in length and that all of their regions are immunogenic in infected individuals, particularly the one spanning amino acids 60–80. Further studies are needed to determine the function and role of ARFP in disease progression, particularly in the transition from chronic stage to hepatocellular carcinoma, and whether ARFP itself or host-mediated immune responses generated against it can serve as reliable biomarkers for monitoring different stages of infection, drug effectiveness, or in predicting therapeutic response.

Acknowledgments

We are grateful to Dr. Najeeha Talat (Department of Pathology & Microbiology) and Mr. Junaid Iqbal and Ms. Hina Zuberi (Department of Biological & Biomedical Sciences) for their technical assistance. This work was funded by grants from the Higher Education Commission (HEC-20-342) and Aga Khan University Research Council (1UM/70139).

Conflict of interest

All authors declare that they have no conflict of interest.

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • H. Qureshi (1)
  • R. Qazi (2)
  • S. Hamid (3)
  • S. A. Qureshi (1, 4)

  1. Department of Biological and Biomedical Sciences, Aga Khan University, Karachi, Pakistan
  2. Department of Pathology and Microbiology, Aga Khan University, Karachi, Pakistan
  3. Department of Medicine, Aga Khan University, Karachi, Pakistan
  4. Department of Biology, School of Science & Engineering, Lahore University of Management Sciences, Lahore, Pakistan
