Background

Findings in several laboratories have indicated that the transmembrane (TM) proteins of a number of RNA viruses have common structural and functional elements critical for virus entry. These include a hydrophobic region designated a "fusion peptide", usually at or near the amino-terminus generated by cleavage of a precursor protein, together with a fibrous structure defined by two antiparallel alpha helices. These general principles appear to apply to the Orthomyxoviruses, Paramyxoviruses, Retroviruses, Lentiviruses, and Filoviruses [1,2,3,4]. In some cases, such as between Ebola and Rous sarcoma viruses, there is considerable sequence identity, which facilitates a comparison between two specific viruses [4]. In other cases, even within a single virus family such as the Retroviridae, both structural modeling and more limited sequence similarity must be combined to discern the relationship [3]. The finding of close sequence or structural similarity among otherwise disparate virus families has given rise to the concept of a viral TM superfamily sharing common structural and functional motifs [4]. Recent biophysical studies of entry protein structure have reinforced this concept [5,6].

In this respect, a general model of the Arenavirus glycoproteins, drawn from extensive study of lymphocytic choriomeningitis virus (LCMV), has been presented on the basis of their overall similarity in functional organization to those of influenza and other enveloped viruses. The GP-C precursor is proteolytically cleaved near a polybasic site to yield GP-1, a globular surface glycoprotein which contains receptor-binding sites, and GP-2, a TM protein that forms the stalk of the complex via a coiled coil of amphipathic helices and is responsible for virus entry by acid-dependent membrane fusion [7,8,9,10].

We present here a detailed model of GP-2 for Lassa fever virus, an Arenavirus associated with multiple epidemics of hemorrhagic fever with high morbidity and mortality in West Africa [11,12], and for the related LCMV, which has been associated with sporadic outbreaks of human disease in Europe and North America [12]. This model demonstrates that Arenaviruses share a number of specific sequence and structural motifs with other RNA viruses in the TM superfamily. Regions of Arenavirus GP-2 can be directly related to corresponding regions of Ebola, another agent of African hemorrhagic fever, and to HIV-1. Examination of the comparable regions of TM proteins from several virus families provides evidence suggesting divergence from a common ancestor.

Results and Discussion

The detailed model of LCM and Lassa fever virus GP-2 is shown in Figure 1. As shown previously for other members of the TM superfamily, both consist of two antiparallel helices separated by a disulfide-linked apex. The sequence of GP-2 contains a highly conserved hydrophobic sequence, LAGTFTWTL (in LCM) or LLGTFTWTL (in Lassa), in the vicinity of the post-translational cleavage site, with a canonical fusion tripeptide Gly-X-Phe. However, its candidacy as a functional fusion peptide, analogous to those of influenza, measles and HIV-1, has been questioned because of its weakly hydrophobic character and because actual cleavage occurs, as shown in the models, not at the dibasic amino acids but within the hydrophobic site [13,14]. LCM virus does employ an acid-dependent fusion event to enter the cell, but virion-cell fusion is inactive above pH 6, and the Arenaviruses have never been demonstrated to exhibit cell-cell fusion. While this amino-terminus may not be a good candidate for a classical fusion peptide, its hydrophobic nature and position suggest that it may at least be the vestige of one.

Figure 1

Models of the Transmembrane Proteins GP-2 of Lassa Virus and LCMV. Projections of the model structure for GP-2 of Lassa and LCMV are shown in parallel with one another, based on the consensus of structural algorithms applied to both viral amino acid sequences. Proposed helices are shown in helical net projection with sequential amino acids connected by solid lines. Proposed disulfide linkages are indicated by double lines. Hydrophobic amino acids are shown with a solid background, neutral amino acids with a heavily outlined circle, and hydrophilic amino acids with a light circle; glycosylation sites are indicated by tridents. A heavy solid line indicates the point of proteolytic cleavage of the GP-C precursor protein to yield GP-1 and GP-2. Cysteines are highlighted by larger red circles. The surface membrane of the virus is indicated by a solid purple rectangle, the amino-terminal hydrophobic region by a yellow rectangle, and the conserved B-cell epitope by a blue rectangle. The two proposed antiparallel helices are labeled "AmphiHelix" for the extended heptad repeat, and "CPI Helix" for the charged, pre-insertion helix.

The region prior to the first helix consists first of a glycine-serine rich linker, and then a domain that is highly conserved among all Arenaviruses and contains four cysteines. Only the last of these four is conserved between the Filoviruses and Arenaviruses. We have not assigned disulfide linkages for these since there are neither data nor parallels with other viruses to permit such assignments. Since there is no disulfide cross-linkage with GP-1, these must participate in disulfide bonding within the same GP-2 protein, or in cross-linking GP-2 oligomers. The latter possibility is suggested by the kinetics of GP-2 association with experimental addition of reducing agent, indicating first a change in vitro from tetramers to dimers and then to monomers only after considerable additional reduction [8]. Whether the native multimeric form of GP-2 in the virion may be a trimer, as for the fusion glycoproteins of Retroviruses or Filoviruses, is yet to be determined.

The amino-terminal helices of both proteins consist of extended amphipathic arrays with strong heptad repeats that have been previously noted [15], and are thought to form the backbone of the coiled-coil stalk of the viral glycoprotein complex [16]. A peptide analogue of this extended heptad repeat in LCMV, GP-C 326-355, was examined by circular dichroism under different solvent conditions, as shown in Table 1. The peptide exhibited only limited helicity in aqueous solution, but 79% alpha helix in a neutral hydrophobic environment. This biophysical behavior is reminiscent of that of similar peptides derived from the corresponding sequences of Paramyxoviral or Retroviral TM proteins [2,17].

Table 1 Circular Dichroism Spectroscopy of Peptide GPC 326-355

Comparison of the sequences of Lassa and LCM over this amphipathic heptad-repeat region (below) shows 31 identical of 58 amino acids, with the principal areas of conservation of sequence at the amino- and carboxy-terminal ends of the amphipathic helix.

The middle 25 amino acids appear poorly conserved, with only 6 of 25 identical, yet the character of the substituted amino acids is generally conserved. In particular, while none of the central 4 heptad amino acids (underlined and in bold) is identical between the two viruses, in all cases the hydrophobic character of the heptad repeat is maintained.
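The identity figures quoted here are simple ratios over aligned positions. As a minimal sketch, assuming the two sequences are already aligned and of equal length (the helper name `percent_identity` is hypothetical and is not part of the original analysis), the calculation can be written as follows. The example inputs are the short conserved LCMV and Lassa sequences quoted elsewhere in the text, together with the counts reported for the heptad-repeat region.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same residue.

    Assumes the sequences are pre-aligned and of equal length (no gap
    handling). Illustrative helper only, not the authors' procedure.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Conserved amino-terminal hydrophobic sequences (LCMV vs Lassa).
print(round(percent_identity("LAGTFTWTL", "LLGTFTWTL"), 1))
# Apical epitope peptides (LCMV vs Lassa).
print(round(percent_identity("KFWYL", "KYWYL"), 1))
# Counts reported for the heptad-repeat region: 31 of 58 overall, 6 of 25 centrally.
print(round(100 * 31 / 58, 1), round(100 * 6 / 25, 1))
```

A position-by-position identity of this kind ignores conservative substitutions, which is why the text separately notes that hydrophobic character is retained where identity is low.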

The apical domain is the only region to be glycosylated, also in line with a number of TM proteins including that of HIV-1 and other Retroviruses. The apical sequence, particularly the peptide KFWYL in LCMV or KYWYL in Lassa, defines a broadly cross-reactive antibody epitope shared by these viruses [18] that is in precisely the same topographical location as the broadly reactive apical epitope (positions 598-609, LGIWGCSGKLIC) that has been finely mapped in HIV-1 [19]. Also like that in HIV-1, it is responsive to multimer conformation, and is increasingly exposed after receptor binding that results in release of the binding subunit, GP-1 [13].

The second helical region has properties similar to those of the Retroviruses and Filoviruses, in that it is highly charged (30%) and amphipathic, with its helicity possibly stabilized by multiple ion pairings of acidic and basic residues, as first noted for the corresponding region of HIV-1 [3].

Although Lassa fever and Ebola viruses represent different virus families, both helices share an unexpectedly high sequence homology. The first lies in the amino-terminal half of the extended amphipathic helix. As shown in a concentric helical wheel projection in Figure 2A, when the helices are oriented with respect to the exclusion of charge and the heptad repeats for each sequence, 9 of the 18 amino acids (50%) in each sequence may be aligned as identical or highly similar.

Figure 2

Concentric Helical Wheel Projections of Proposed Lassa and Ebola Helices. Helical wheel projections are shown for 18 amino acid segments of the proposed helices of Lassa and Ebola viruses. The projections are arranged concentrically to align the viral sequences, with the inner sequence from Lassa virus and the outer sequence from Ebola Zaire virus (Genbank U31033). Rectangles indicate identical or highly similar residues in each sequence. Heavy lines indicate the hemicylinder of charge exclusion (hydrophobic) subtending an angle of 160° in each alignment. Below each wheel projection the linear sequences are also aligned to show the identities (solid lines) and high similarities (dotted lines). A. Concentric alignment of the amino-terminal amphipathic helices ("lower amphi") of Lassa (amino acids 309-326, in wheel positions 1-18 respectively) and Ebola (555-572). B. Concentric alignment of the charged, pre-insertion helices ("CPI Helix") of Lassa (398-415) and Ebola (618-635).

The carboxy-terminal helical regions of Lassa and Ebola also have properties in common, as shown in the concentric helical wheel projection in Figure 2B. Again, orienting the helix with respect to the hemicylindrical exclusion of charge, 9 of the 18 amino acids (50%) may be aligned as identical or highly similar. Furthermore, none of the amino acid differences represents a radical change of one sequence from the other.
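The geometry behind a helical wheel projection can be illustrated briefly. In an ideal alpha helix there are 3.6 residues per turn, so successive residues are placed 100° apart around the wheel and an 18-residue segment completes exactly five turns. The following is a minimal sketch, not the procedure used to draw Figure 2: the function names are hypothetical, histidine is assumed not to count as charged, and the example sequence is the LCMV heptad-repeat peptide given in Materials and Methods rather than the Lassa or Ebola segments, whose sequences are not reproduced in the text.

```python
from __future__ import annotations

DEGREES_PER_RESIDUE = 100  # ideal alpha helix: 3.6 residues per 360-degree turn
CHARGED = set("DEKR")      # assumption: His is not treated as charged


def wheel_angles(seq: str) -> list[tuple[str, int]]:
    """Return (residue, angle in degrees) for a helical wheel projection."""
    return [(aa, (i * DEGREES_PER_RESIDUE) % 360) for i, aa in enumerate(seq)]


def largest_charge_free_arc(seq: str) -> int:
    """Widest arc of the wheel, in degrees, lying between charged positions."""
    angles = sorted({angle for aa, angle in wheel_angles(seq) if aa in CHARGED})
    if not angles:
        return 360  # no charged residues: the whole wheel is charge-free
    # Gaps between neighbouring charged positions, plus the gap wrapping past 360.
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(angles[0] + 360 - angles[-1])
    return max(gaps)


# LCMV GP-C 326-355 peptide, as quoted in Materials and Methods.
PEPTIDE = "NKAALSKFKEDVESALHLFKTTVNSLISDQ"

for residue, angle in wheel_angles(PEPTIDE[:7]):
    print(residue, angle)
print(largest_charge_free_arc(PEPTIDE))
```

Reporting the widest charge-free arc is only one simple way to quantify an amphipathic face; a hydrophobic moment calculation would be a common alternative.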

Arenaviruses therefore share with a number of other virus families a fusion/entry protein, GP-2, that appears to have the four cardinal structural features typical of proteins in the viral transmembrane entry protein superfamily. Our model of the extramembranal portion of GP-2 begins with a hydrophobic fusion peptide sequence, followed by two antiparallel extended helices, the first of which contains a strong heptad repeat sequence, which lie on either side of a disulfide-stabilized, glycosylated and strongly antigenic reverse turn. These features have apparently been maintained in spite of diversity in primary amino acid sequence within the Arenavirus family.

Conclusions

The most likely explanation for such high levels of similarity among Arenaviruses and Filoviruses would be divergence of both of these agents from a common viral ancestor. Since both virus families exhibit type variation over large areas coupled with stability among isolates within a more limited geographical area over considerable periods of time (the Arenaviruses being the more widespread), such divergence must have occurred eons ago. The potential importance of such apparent conservation in the biology of these agents is underscored by the observation that, among the corresponding peptide sequences within the TM superfamily of proteins, that of HIV-1 forms the center of a peptide analogue shown to inhibit fusion in the nanomolar range [20].

Modeling studies begun in the late 1980s have thus revealed a number of common structural and sequence motifs, subsequently shown in several cases to have homologous biological roles in infection, that were not otherwise apparent in studies of sequence homology. These models may lead to a common strategy of antiviral inhibition, preventing entry of virus into host cells, that is applicable across a broad range of very diverse virus families.

Materials and Methods

Molecular Modeling

Sequences used for this analysis were LCMV-ARM (Genbank P09991) and Lassa, Josiah (Genbank P08669), and are numbered from the initiation methionine. Detailed models of the Arenavirus GP-2 proteins were determined by the methods of Gallaher et al. previously described [3,4,21]. A consensus of several independent structural algorithms was used, and was compared for different GP-2 sequences to test the consensus. The resulting model is an average consensus of the algorithms for these two sequences. Models are projected in helical net or helical wheel projections, also as previously described.
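The text does not specify how the consensus of algorithms was formed. Purely as an illustration, one simple scheme is a per-residue majority vote over the secondary-structure strings returned by each algorithm. In the sketch below, the function name, the one-letter state codes (H for helix, C for coil) and the three example predictions are all hypothetical and do not come from the original analysis.

```python
from __future__ import annotations

from collections import Counter


def consensus_structure(predictions: list[str], min_fraction: float = 0.5) -> str:
    """Per-residue majority vote over equal-length secondary-structure strings.

    Positions where no state wins more than min_fraction of the votes
    are marked '-' to flag them as unresolved.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same residues")
    consensus = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        consensus.append(state if votes / len(predictions) > min_fraction else "-")
    return "".join(consensus)


# Hypothetical outputs of three algorithms for the same 10-residue stretch.
hypothetical = ["CHHHHHHHCC", "CCHHHHHHHC", "CHHHHHHCCC"]
print(consensus_structure(hypothetical))
```

Repeating the vote for each viral sequence and comparing the resulting strings would correspond, under these assumptions, to testing the consensus across different GP-2 sequences.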

Peptide Synthesis and Circular Dichroism

A peptide corresponding to amino acid positions 326-355 of LCMV-ARM-4 (Genbank VGXPLM), in single letter code NKAALSKFKEDVESALHLFKTTVNSLISDQ, was synthesized with an additional histidine at the N terminus by standard BOC chemistry using double coupling and HF cleavage. The peptide was purified by reverse phase HPLC on a C-18 column and its mass was confirmed by mass spectrometry. The peptide was selected because it was predicted by the Lupas algorithm [22] to have a greater than 90% probability of forming a heptad repeat in the native protein structure. Peptide samples for circular dichroism (CD) were prepared at 0.1 mg/ml concentrations in either 1 mM NaCO3, pH 7.2 (Neutral) or in 100 mM MES, pH 5.5 (Acid). In spectra recorded with TFE, the TFE was present as 45% of solution volume. CD spectra were recorded from 300 to 180 nm in 0.5 nm steps with a pathlength of 0.1 cm at 4°C. Final values were determined using the average of 15 spectra which were correlated with baseline spectra of buffer samples. A characteristic alpha helical spectrum was apparent for the peptide when placed in TFE with a positive peak at 195 nm (Θ = +43000) and a second minimum peak at 210 nm (Θ = -25000).
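The spectral averaging described above can be sketched in a few lines. This is an illustration only: it assumes that relating the averaged spectra to the buffer baseline amounts to a point-by-point subtraction, which the text does not state explicitly, and all function names are hypothetical. The wavelength grid follows the stated scan from 300 to 180 nm in 0.5 nm steps.

```python
def wavelength_grid(start_nm=300.0, stop_nm=180.0, step_nm=0.5):
    """Wavelengths scanned from start_nm down to stop_nm, inclusive."""
    n_points = int(round((start_nm - stop_nm) / step_nm)) + 1
    return [start_nm - i * step_nm for i in range(n_points)]


def mean_spectrum(scans):
    """Point-by-point average of repeated scans of equal length."""
    n_scans = len(scans)
    return [sum(values) / n_scans for values in zip(*scans)]


def baseline_corrected(sample_scans, buffer_scans):
    """Averaged sample spectrum minus averaged buffer spectrum.

    Assumption: baseline handling is a simple subtraction.
    """
    sample = mean_spectrum(sample_scans)
    baseline = mean_spectrum(buffer_scans)
    return [s - b for s, b in zip(sample, baseline)]


grid = wavelength_grid()
print(len(grid), grid[0], grid[-1])
```

With 15 sample scans per condition, `baseline_corrected` would take a list of 15 equal-length intensity lists for the peptide and a corresponding list for the buffer alone.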