Introduction

Since 1981, bell pepper plants (Capsicum annuum L. var. grossum Sendt.) showing vein yellowing and leaf roll, symptoms very similar to those caused by the recently reported pepper yellow leaf curl virus (PYLCV) [3], have been observed every year in Okinawa Prefecture, Japan. A virus resembling a luteovirus was isolated from the plants, and initial studies showed that the virus replicated in phloem tissues and was transmitted by grafting and, in a persistent manner, by the aphid vector Aphis gossypii Glover [13]. The virus was named pepper vein yellows virus (PeVYV) [13]. Here, we report the complete genome sequence of PeVYV.

Because previously purified virions were unavailable, infected bell pepper plants showing typical symptoms of PeVYV were collected again in 2009 from the site of the first isolation of PeVYV [13]. PeVYV-infected plants were maintained in a greenhouse by continuous aphid transmission to healthy bell pepper plants. Since RT-PCR using several primers for luteoviruses did not amplify cDNAs of PeVYV from infected plants, we first prepared RNA samples from 300 viruliferous A. gossypii that had been given an acquisition access period of two weeks on the infected plants. Then, randomly amplified RT-PCR products (ranging in size from 300 to 800 bp) were obtained according to the methods of Ryabov [11] and were cloned into pGEM-T Easy Vector (Promega). Inserts of 576 clones were sequenced and assembled. Of these, 293 clones formed two contigs: one contig of 4034 nucleotides (nt) showed 86.0% sequence identity to nt 224-4257 of tobacco vein distorting virus (TVDV) [10], and the other contig of 93 nt showed 93.1% identity to nt 5814-5906 of TVDV. The 5′ and 3′ terminal sequences were obtained using the 5′ and 3′ RACE systems of Invitrogen, the latter after polyadenylation of RNAs. Sequences between the two contigs were determined by RT-PCR using specific primers. The total length of the assembled sequence obtained from the aphids was 6244 nt. To rule out errors in sequence assembly, we sequenced RT-PCR products containing the junction regions of the contigs from the PeVYV-infected bell pepper plants. In addition, primers designed to amplify nt 239-1484, 1327-2754, 2524-4268, and 4194-6186 of the assembled sequence gave the expected DNA fragment sizes when RT-PCR was carried out using samples from infected plants, confirming that the assembled sequence represents the PeVYV sequence.
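The expected fragment sizes for the four confirmatory primer pairs follow directly from the reported coordinates. The sketch below is only an illustration of that arithmetic, assuming the coordinates are 1-based and inclusive; the names `AMPLICONS`, `span_length` and `amplicon_summary` are hypothetical and are not part of the original analysis.

```python
# Primer-pair target regions as reported in the text.
# Assumption: coordinates are 1-based and inclusive.
AMPLICONS = [(239, 1484), (1327, 2754), (2524, 4268), (4194, 6186)]


def span_length(start, end):
    """Length in nt of a 1-based inclusive interval."""
    return end - start + 1


def amplicon_summary(amplicons):
    """Return expected product sizes and the overlaps between neighbours."""
    sizes = [span_length(start, end) for start, end in amplicons]
    # Overlap between consecutive products: from the start of the next
    # product to the end of the previous one.
    overlaps = [
        span_length(next_start, prev_end)
        for (_, prev_end), (next_start, _) in zip(amplicons, amplicons[1:])
    ]
    return sizes, overlaps


sizes, overlaps = amplicon_summary(AMPLICONS)
print("expected product sizes (bp):", sizes)
print("overlaps between neighbouring products (nt):", overlaps)
```

Under these assumptions every overlap is positive, so the four products tile nt 239-6186 of the assembled sequence without gaps.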

Comparisons of the nucleotide sequence of PeVYV with those of other fully sequenced viruses showed that nt 1-4252 of PeVYV had the highest identity (85.1%) with nt 1-4261 of TVDV, and nt 4253-4996 of PeVYV had the highest identity (77.1%) with nt 4105-4851 of cucurbit aphid-borne yellows virus (CABYV). Nucleotides 5018-5808 of PeVYV had 51.8% identity with nt 5010-5802 of potato leafroll virus, and nt 5878-5978 of PeVYV were again 88.1% identical to nt 5815-5915 of TVDV, which are located in the 3′ NCR of TVDV. The sequence of nt 6083-6244 of PeVYV had no similarity to those of other viruses.

The 5′ NCR of PeVYV was 51 nt long, resembling the 5′ NCR of poleroviruses. The 5′ NCR of most poleroviruses is shorter than 100 nt, whereas those of luteoviruses and enamoviruses are longer than 100 nt. The 5′-terminal sequence of PeVYV, ACAAAA, is identical to that of many poleroviruses, e.g., beet mild yellowing virus [7], CABYV [6], melon aphid-borne yellows virus [12] and TVDV [10].

The RNA genome of PeVYV contained six major open reading frames (ORFs), resembling those of poleroviruses in arrangement and sequence. Properties of each ORF and its deduced amino acid sequence were as follows: (i) ORF0, encoded by nt 52-801, contains an F-box consensus sequence [9] at nt 217-231; (ii) ORF1, encoded by nt 176-2140, has a serine protease motif at nt 893-1270, and the N-terminus of a genome-linked protein [10] maps to amino acid position 403; (iii) ORF2, encoded by nt 1621-3432, encodes an RNA-dependent RNA polymerase (RdRp) and appears to be translated through a -1 frameshift at a shifty heptanucleotide at nt 1654-1660; (iv) ORF3, encoded by nt 3632-4252, encodes the coat protein (CP); (v) ORF4, encoded by nt 3663-4133, encodes a movement protein in the +1 reading frame relative to ORF3; (vi) ORF5, encoded by nt 4253-5842, encodes a readthrough domain (RTD) of ORF3 and has a proline hinge [6] at nt 4253-4339.
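The reading-frame relationships described above can be checked from the reported coordinates alone. The following is a minimal sketch, assuming 1-based inclusive coordinates; the helper names (`codon_count`, `frame_offset`) are hypothetical, and whether each span includes its stop codon is not stated in the text, so the counts are simply the number of codons spanned.

```python
# ORF coordinates as reported in the text.
# Assumption: coordinates are 1-based and inclusive.
ORFS = {
    "ORF0": (52, 801),
    "ORF1": (176, 2140),
    "ORF2": (1621, 3432),
    "ORF3": (3632, 4252),
    "ORF4": (3663, 4133),
    "ORF5": (4253, 5842),
}
SLIPPERY_SITE = (1654, 1660)  # shifty heptanucleotide reported for ORF2


def codon_count(start, end):
    """Number of codons spanned; raises if the span is not a multiple of 3."""
    length = end - start + 1
    if length % 3:
        raise ValueError(f"span {start}-{end} is not a whole number of codons")
    return length // 3


def frame_offset(orf, reference):
    """Frame of `orf` relative to `reference`: 0 = same, 1 = +1, 2 = -1."""
    return (ORFS[orf][0] - ORFS[reference][0]) % 3


codons = {name: codon_count(*coords) for name, coords in ORFS.items()}
offsets = {
    "ORF2 vs ORF1": frame_offset("ORF2", "ORF1"),
    "ORF4 vs ORF3": frame_offset("ORF4", "ORF3"),
    "ORF5 vs ORF3": frame_offset("ORF5", "ORF3"),
}

# The slippery site should lie inside the region where ORF1 and ORF2 overlap.
overlap_start, overlap_end = ORFS["ORF2"][0], ORFS["ORF1"][1]
site_in_overlap = (
    overlap_start <= SLIPPERY_SITE[0] and SLIPPERY_SITE[1] <= overlap_end
)

print(codons)
print(offsets)
print("slippery site within ORF1/ORF2 overlap:", site_in_overlap)
```

With these coordinates, ORF2 sits in the -1 frame relative to ORF1, ORF4 in the +1 frame relative to ORF3, and ORF5 in the same frame as ORF3, which agrees with the frameshift and readthrough expression strategies described in the text.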

PeVYV had an intergenic NCR of 199 nt between ORF2 and ORF3, similar to those of other poleroviruses, whereas luteoviruses and enamoviruses have an intergenic NCR of about 100 nt. These characteristics also indicate that PeVYV is a polerovirus.

The 3′ NCR of PeVYV was 402 nt long, making it longer than those of other known poleroviruses but considerably shorter than the 3′ NCR of luteoviruses (>600 nt). Secondary structure prediction of the 3′ NCR of PeVYV detected 11 stem loops (data not shown). The sequence containing the first and second stem loops, nt 5923-5977, and that containing the third and fourth stem loops, nt 5992-6046, were nearly identical, suggesting that duplication of this sequence segment had occurred in the 3′ NCR of PeVYV, as has been observed for flaviviruses [5].
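The NCR lengths quoted in the text are consistent with the reported ORF boundaries and genome length, as the short sketch below illustrates. It assumes 1-based inclusive coordinates; the function and variable names are hypothetical and serve only to make the arithmetic explicit.

```python
# Values reported in the text.
# Assumption: coordinates are 1-based and inclusive.
GENOME_LENGTH = 6244
ORF0_START = 52
ORF2_END, ORF3_START = 3432, 3632
ORF5_END = 5842
REPEAT_A = (5923, 5977)  # first and second stem loops
REPEAT_B = (5992, 6046)  # third and fourth stem loops


def gap_length(left_end, right_start):
    """Number of nt strictly between two positions."""
    return right_start - left_end - 1


def span_length(start, end):
    """Length in nt of a 1-based inclusive interval."""
    return end - start + 1


five_prime_ncr = gap_length(0, ORF0_START)                   # nt 1-51
intergenic_ncr = gap_length(ORF2_END, ORF3_START)            # nt 3433-3631
three_prime_ncr = gap_length(ORF5_END, GENOME_LENGTH + 1)    # nt 5843-6244

repeat_lengths = (span_length(*REPEAT_A), span_length(*REPEAT_B))
repeat_spacing = gap_length(REPEAT_A[1], REPEAT_B[0])
repeats_in_3prime_ncr = ORF5_END < REPEAT_A[0] and REPEAT_B[1] <= GENOME_LENGTH

print(five_prime_ncr, intergenic_ncr, three_prime_ncr)
print(repeat_lengths, repeat_spacing, repeats_in_3prime_ncr)
```

Both nearly identical segments span the same number of nucleotides and lie entirely within the 3′ NCR, which is what would be expected if one arose by duplication of the other.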

Phylogenetic trees of the RdRp, CP and RTD sequences of viruses of the family Luteoviridae also supported the assignment of PeVYV to the genus Polerovirus (Fig. 1). Analysis of RdRp and CP sequences showed that PeVYV was closely related to TVDV, whereas analysis of RTD sequences suggested that PeVYV was only distantly related to TVDV. Based on amino acid sequence identities for ORFs 0-3, the closest relative of PeVYV was TVDV (75.9-91.9%), but ORF5 of TVDV has only 25.1% amino acid identity to that of PeVYV. One of the species demarcation criteria for viruses of the family Luteoviridae is that differences in amino acid sequences of any gene product must be greater than 10% [2], a threshold that the ORF5 products of PeVYV and TVDV far exceed. In addition, a previous transmission test [13] showed that PeVYV did not infect tobacco plants. These observations indicate that PeVYV belongs to a distinct species in the genus Polerovirus.

Fig. 1

Relationship of PeVYV to members of the genera Luteovirus, Polerovirus and Enamovirus based on amino acid sequences of the RNA-dependent RNA polymerase (RdRp, a), the coat protein (CP, b) and the readthrough domain (RTD, c). The complete genome sequences of the viruses were obtained from GenBank, and pepper yellow leaf curl virus (PYLCV, accession number HM439608) and pepper yellows virus (PYV, FN600344) were also included in the CP analysis. Amino acid sequences were aligned using MUSCLE [4], and maximum-likelihood phylogenetic trees were constructed with PhyML [8] using the best-fit models of amino acid substitution selected by ProtTest [1], as judged by the Akaike information criterion. The models employed were LG+G+F for RdRp and LG+G for CP and RTD. Numbers on branches indicate bootstrap support values for 1,000 replicates, and values less than 50% are not shown. In panel a, the horizontal lines between luteoviruses and poleroviruses were shortened to one-fifth, and the line between poleroviruses and the enamovirus was shortened to one-half.
The virus abbreviations and accession numbers are as follows: BYDV-GAV, barley yellow dwarf virus-GAV, NC_004666; BYDV-MAV, barley yellow dwarf virus-MAV, NC_003680; BYDV-PAS, barley yellow dwarf virus-PAS, NC_002160; BYDV-PAV, barley yellow dwarf virus-PAV, NC_004750; BChV, beet chlorosis virus, NC_002766; BLRV, bean leafroll virus, NC_003369; BMYV, beet mild yellowing virus, NC_003491; BWYV, beet western yellows virus, NC_004756; CABYV, cucurbit aphid-borne yellows virus, NC_003688; CpCSV, chickpea chlorotic stunt virus, NC_008249; CtRLV, carrot red leaf virus, NC_006265; CYDV-RPS, cereal yellow dwarf virus-RPS, NC_002198; CYDV-RPV, cereal yellow dwarf virus-RPV, NC_004751; MABYV, melon aphid-borne yellows virus, NC_010809; PEMV, pea enation mosaic virus-1, NC_003629; PLRV, potato leafroll virus, NC_001747; RSDaV, rose spring dwarf-associated virus, NC_010806; SbDV, soybean dwarf virus, NC_003056; ScYLV, sugarcane yellow leaf virus, NC_000874; TuYV, turnip yellows virus, NC_003743; TVDV, tobacco vein distorting virus, NC_010732; WYDV-GPV, wheat yellow dwarf virus-GPV, NC_012931.

Recently, partial genome sequences of polerovirus isolates from pepper have been reported from Israel [3] (PYLCV, accession number HM439608) and from Turkey (pepper yellows virus, PYV, FN600344). The available nt sequence of PYLCV had 93.4% identity with nt 3815-4252 of PeVYV, and that of PYV was 94.7% identical to nt 2613-4242 of PeVYV. The partial ORFs (ORFs 2, 3 and 4) of PYLCV and PYV showed 93-98% amino acid sequence identity to those of PeVYV. Thus, these two viruses appear to be geographic variants of PeVYV, and together the three might constitute a new polerovirus species.