
Revision of Begomovirus taxonomy based on pairwise sequence comparisons


Viruses of the genus Begomovirus (family Geminiviridae) are emergent pathogens of crops throughout the tropical and subtropical regions of the world. By virtue of having a small DNA genome that is easily cloned, and due to the recent innovations in cloning and low-cost sequencing, there has been a dramatic increase in the number of available begomovirus genome sequences. Even so, most of the available sequences have been obtained from cultivated plants and are likely a small and phylogenetically unrepresentative sample of begomovirus diversity, a factor constraining taxonomic decisions such as the establishment of operationally useful species demarcation criteria. In addition, problems in assigning new viruses to established species have highlighted shortcomings in the previously recommended mechanism of species demarcation. Based on the analysis of 3,123 full-length begomovirus genome (or DNA-A component) sequences available in public databases as of December 2012, a set of revised guidelines for the classification and nomenclature of begomoviruses is proposed. The guidelines primarily consider a) genus-level biological characteristics and b) results obtained using a standardized classification tool, Sequence Demarcation Tool, which performs pairwise sequence alignments and identity calculations. These guidelines are consistent with the recently published recommendations for the genera Mastrevirus and Curtovirus of the family Geminiviridae. Genome-wide pairwise identities of 91 % and 94 % are proposed as the demarcation thresholds for begomoviruses belonging to different species and strains, respectively. Procedures and guidelines are outlined for resolving conflicts that may arise when assigning isolates to species and strains wherever the pairwise identity falls on or very near the demarcation threshold value.


The genus Begomovirus (family Geminiviridae) is the largest genus of plant viruses with respect to the number of species that it includes. In fact, with 288 species currently recognized by the International Committee on Taxonomy of Viruses (ICTV), it is the largest genus in all of viral taxonomy. Begomoviruses infect a wide range of dicotyledonous plants, mostly in tropical and subtropical regions of the world. Their circular, single-stranded DNA genomes can be either monopartite or bipartite (with genomic components known as DNA-A and DNA-B), with the two components of bipartite genomes sharing a common region of approximately 200 nucleotides that includes the origin of replication [1]. In the Old World (OW; Africa, Asia, Australasia and Europe), most begomoviruses are monopartite, with a few having a bipartite genome. Begomoviruses native to the New World (NW; the Americas) are almost exclusively bipartite, with only a single monopartite virus having been identified so far [2, 3]. However, a number of monopartite begomoviruses occur in the NW as a result of their recent introduction from the OW [4, 5].

Geminiviruses have characteristically twinned or “geminate” particle morphology. The capsid consists of two joined, incomplete T = 1 icosahedral heads, with 110 molecules of the capsid protein organized as 22 pentameric capsomers [6]. Geminate particles contain a single molecule of circular ssDNA that ranges from 2.5 to 3.0 kilobases (kb) [1]. Therefore, for viruses having a bipartite genome, two particles, each containing a different genomic component (DNA-A and DNA-B), are required to establish infection.

Due to their economic importance as plant pathogens and their small genomes, begomoviruses were among the first plant viruses whose complete genomes were cloned and sequenced [7, 8]. By January 2014, more than 3500 full-length begomovirus sequences had been deposited in public databases. Even during the early days of full-genome sequencing, the increasingly large numbers of begomovirus sequences being determined worldwide made it clear that these viruses are abundant and widespread, and that they display a significant degree of genetic diversity [9]. Also, it created the opportunity for the development of a sequence-based taxonomy that relied primarily on pairwise sequence comparisons [10]. Such a system has been in place for the Geminiviridae since the mid-1990s, and it has been remarkably stable. It was also widely embraced by the begomovirus community, mostly due to its simplicity and ease of use. Similar classification systems have been adopted by a number of ICTV study groups, including those concerned with the Anelloviridae and the Circoviridae.

As useful as it has been to establish and streamline taxonomic communications, begomovirus taxonomy is not without controversy. Several criticisms have been voiced in the literature (one recent example being ref. [11]) and by the ICTV Executive Committee (EC), which rejected the Geminiviridae Study Group’s taxonomic proposals for creating new begomovirus species in 2010 and 2011. The main points of contention can be summarized as follows: (i) the creation of “too many” species in the genus; (ii) the recognition of new species based solely on sequence comparisons of members, without taking into consideration the biological properties of the viruses; and (iii) the establishment of species demarcation criteria that were “too relaxed” compared to those for other genera in the family, thus leading to point i. Moreover, and as pointed out by the Geminiviridae Study Group (SG) itself in the recent Mastrevirus and Curtovirus taxonomy revisions [12, 13], pairwise sequence identities for any particular pair of sequences may be calculated in different ways and therefore can result in differences in identity scores depending on the algorithm employed. Such discrepancies have made it highly desirable to establish a standard procedure to perform pairwise alignments and to calculate identity scores in order to eliminate (or at least minimize) taxonomic uncertainties and/or misplacements.

The concerns raised by the ICTV EC regarding begomovirus taxonomy encouraged the Geminiviridae SG to perform a comprehensive re-evaluation of the species demarcation criteria for the genus Begomovirus. The results of this re-evaluation have demonstrated that the current pairwise-identity-based taxonomy is sound, that it accurately reflects the biology of begomoviruses, that it will be stable, and that it will be easy to understand and to be adopted by geminivirologists worldwide. Here, we present the specific guidelines for the classification of begomoviruses, following those recently published for the genera Mastrevirus [12] and Curtovirus [13] of the family Geminiviridae.

A comprehensive analysis of the species demarcation criteria for the begomoviruses

Since a significant proportion of begomoviruses do not have a cognate DNA-B component, this component is not considered for species demarcation. A total of 3,123 full-length begomovirus genomic sequences (or DNA-A sequences) were downloaded from the NCBI-GenBank database on 31 Dec 2012. They corresponded to viruses belonging to 283 species according to the currently accepted 89 % species demarcation criterion (for comparison, see the 9th Report of the ICTV, which lists 192 species in the genus [1]). To reduce computing time, only the oldest sequences (full-length genomes or DNA-A components) from groups of sequences that shared >99.5 % genome-wide nucleotide (nt) sequence identity were included in the analysis. To the best of our knowledge, the analysis included sequences of members of all ICTV-recognized species and unclassified begomoviruses for which at least one full-length sequence was available in GenBank at that time (for many, there were multiple sequences per virus/strain). Using this data set (1,826 sequences), a preliminary phylogeny using the neighbor-joining (NJ) method was constructed (data not shown). The purpose of the NJ phylogenetic analysis was not to construct a definitive phylogeny but rather to identify groups of most closely related sequences that could be combined for pairwise sequence comparisons and maximum-likelihood (ML) phylogenetic analyses.

Based on the NJ phylogenetic tree, 38 groups were identified, each of which contained sequences that did not obviously correspond to the same viral species but also did not obviously correspond to distinct species. This approach was employed to more easily delineate distinct groups. Some groups consisted of as few as 2-3 sequences, whereas others were represented by >30 sequences (Supplementary File S1). Pairwise sequence comparisons were carried out separately for each one of the 38 groups, using Sequence Demarcation Tool (SDT) v. 1.0 [14] with the MUSCLE [15] alignment option. In addition, ML phylogenetic trees were inferred for each group using the PHYML3.0 method implemented in MEGA 5.2 [16] with the GTR+I+G nucleotide (nt) substitution model, and branch support was tested with 3,000 bootstrap iterations.

Simulations were performed based on the results of pairwise sequence comparisons, using different cutoff values (rounded to the nearest full percentile) to delineate potential species so as to determine which sequences corresponded to virus isolates belonging to the same species, and which were isolates of distinct species. For this, we looked for the optimum cutoff value that placed each sequence into a given species without “outliers” (sequences that displayed identity levels above the cutoff value with two or more species).
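The cutoff scan described above can be sketched in a few lines. The text does not give the procedure in algorithmic form, so the sketch below rests on one possible reading, which is an assumption: at a given cutoff, a sequence counts as an outlier if it is at or above the cutoff with two sequences that are themselves below the cutoff with each other. The function names and the toy identity values are hypothetical.

```python
from itertools import combinations


def count_outliers(identity, cutoff):
    """Count sequences that bridge two groups at the given cutoff.

    identity maps frozenset({name_a, name_b}) to a percent identity and
    must contain an entry for every pair of sequences.
    """
    names = sorted({name for pair in identity for name in pair})
    outliers = 0
    for a in names:
        # Every sequence that would share a species with `a` at this cutoff.
        partners = [b for b in names
                    if b != a and identity[frozenset((a, b))] >= cutoff]
        # `a` is an outlier if two of its partners fall below the cutoff
        # with each other, i.e. `a` links two otherwise separate groups.
        if any(identity[frozenset((b, c))] < cutoff
               for b, c in combinations(partners, 2)):
            outliers += 1
    return outliers


def scan_cutoffs(identity, cutoffs=range(86, 95)):
    """Return {cutoff: number of outliers} for each candidate cutoff."""
    return {c: count_outliers(identity, c) for c in cutoffs}


# Invented toy data: A and B resemble one species, C and D another.
TOY = {
    frozenset(("A", "B")): 96,
    frozenset(("C", "D")): 95,
    frozenset(("A", "C")): 90,
    frozenset(("A", "D")): 88,
    frozenset(("B", "C")): 92,
    frozenset(("B", "D")): 89,
}

if __name__ == "__main__":
    for cutoff, n in scan_cutoffs(TOY).items():
        print(cutoff, n)
```

The cutoff with the fewest outliers across all groups would then be the candidate demarcation value; the real analysis was run per group on SDT output rather than on a hand-written dictionary.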

Analysis of all 38 groups indicated that the best nt sequence identity cutoff value to separate isolates from different species was 91 %. This value is proposed here as the new species demarcation criterion for viruses of the genus Begomovirus using the outlined methodology. Implementing this value yielded the lowest number of outlier sequences compared to any other value within the range of 86 % to 94 % nt sequence identity. The cutoff for strain demarcation is 94 %. Parameters used for comparison are crucial. It is important to note that percent nt sequence identities must be calculated from true pairwise sequence alignments, with the exclusion of sites with gap characters. Ideally, the freely available SDT software [14] should be used, as it was developed specifically for this purpose.
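As an illustration of the scoring rule only, the following minimal sketch computes percent identity from two already-aligned sequences while skipping every site that contains a gap. It does not perform the alignment itself (SDT does that with MUSCLE), and the function name and the "-" gap symbol are assumptions made for the example.

```python
def pairwise_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences, ignoring gap sites."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # sites with a gap in either sequence are excluded
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # 8 ungapped sites, 7 of them identical
    print(pairwise_identity("ACGT-ACGT", "ACGTTACCT"))
```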

Phylogenetic support was found to be robust for all new species analyzed across the 38 groups. The 91 % cutoff value is actually quite conservative, as is indicated by the trees for groups 3, 5, 7, 11, 16, 30 and 33 (Supplementary File S1). However, several groups (1, 2, 6, 27, 34 and 36; Supplementary File S1) required additional consideration because the pairwise sequence comparisons and phylogenetic results are conflicting, possibly due to recombination.

Dealing with outliers

The revised taxonomic framework yielded only a small number of outliers. Nevertheless, as the number of sequenced begomoviral genomes continues to increase, additional “conflicting” sequences will become evident. To address this problem, we propose the adoption of the approach described for viruses of the genera Mastrevirus [12] and Curtovirus [13]. In this light, the four possible conflicts are as follows:

  1. An isolate having ≥91 % identity (full-length genome or DNA-A component) to isolates assigned to two (or more) species.

  2. An isolate having ≥91 % identity to one or a few isolates from a particular species, even though it shares <91 % identity with the majority of isolates in that species.

  3. An isolate having ≥94 % identity to isolates of two (or more) strains of a given species.

  4. An isolate having ≥94 % identity to one or a few isolates from a particular strain, even though it shares <94 % identity with the majority of isolates from that strain.

The corresponding conflict-resolution criteria are as follows:

  1. The new isolate should be considered to belong to the species that includes the isolate with which it shares the highest percentage of pairwise identity (full-length genome or DNA-A component).

  2. The new isolate should be classified as belonging to the species with which it shares ≥91 % nt sequence identity with any one isolate from that species, even if it is <91 % identical to all other isolates from that species.

  3. The new isolate should be considered to belong to the strain that includes the isolate with which it shares the highest percent identity.

  4. The new isolate should be classified as belonging to the strain with which it shares ≥94 % nt sequence identity with any one isolate from that strain, even if it has <94 % identity to all other isolates from that strain.

Naturally, any working cutoff value established for viruses, particularly when rapid divergence is occurring (as appears to be the case for begomoviruses), will yield a number of outliers. By adopting these four conflict-resolution criteria, all outliers identified so far could be readily placed into an extant species group.

Exceptions to these rules can include recombinant viruses such as tomato yellow leaf curl Malaga virus (TYLCMaV) and tomato yellow leaf curl Axarquia virus (TYLCAxV), which have ≥91 % identity to both parental viruses (tomato yellow leaf curl virus, TYLCV, and tomato yellow leaf curl Sardinia virus, TYLCSV), thus leading to conflict #1 and causing the two parental species to merge into a single species, even though all isolates of the parental viruses have <91 % identity. Such recombinant viruses will have to be examined on a case-by-case basis for species assignment.
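Setting aside the recombinant exceptions just noted, the four conflict-resolution criteria reduce to a single "best hit" rule applied at the appropriate cutoff (91 % for species, 94 % for strains). The sketch below is one way to express that rule; the function name, the tuple layout and the example values are hypothetical, and the identities are assumed to have been rounded already. How exact ties should be handled is not specified in the criteria, so the sketch simply keeps the first best hit.

```python
def assign_by_best_hit(hits, cutoff):
    """Assign a new isolate to the taxon of its most similar known isolate.

    hits: list of (isolate_name, taxon_name, percent_identity) tuples, where
    taxon_name is a species (cutoff 91) or a strain (cutoff 94).
    Returns the taxon name, or None if no hit reaches the cutoff.
    """
    if not hits:
        return None
    # Criteria 1 and 3: the highest pairwise identity decides.
    best = max(hits, key=lambda hit: hit[2])
    # Criteria 2 and 4: a single isolate at or above the cutoff is enough,
    # so only the best hit needs to be compared with the cutoff.
    return best[1] if best[2] >= cutoff else None


if __name__ == "__main__":
    hits = [
        ("isolate 1", "Species X", 92),
        ("isolate 2", "Species Y", 95),
        ("isolate 3", "Species X", 89),
    ]
    print(assign_by_best_hit(hits, cutoff=91))
```

Recombinants such as TYLCMaV and TYLCAxV would pass through this function without complaint, which is exactly why they need separate, case-by-case examination.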

The new species demarcation criterion of <91 % nt sequence identity (for full-length genomes or DNA-A components) is more stringent than the previously used 89 %

At first, the higher value, at 91 %, compared to the previously implemented working cutoff of 89 %, may give the impression of a more relaxed species demarcation scenario that might delineate an even greater number of begomovirus species. However, this is not the case. Rather, the pairwise cutoff value at 91 % is a consequence of the implementation of a more robust approach (now standardized for the entire family Geminiviridae) for calculating pairwise identities: true pairwise alignments (compared to global alignment-based pairwise identities) without gaps. This proved to be more stringent than previous approaches based on multiple sequence alignments with gaps treated as a fifth character, which yielded a working cutoff of 89 %.
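A small worked calculation shows why excluding gap sites shifts the numerical cutoff upward. The numbers are invented for illustration and are not taken from the data set. Suppose an alignment of two genomes spans $L = 2700$ columns, of which $g = 60$ contain a gap and $m = 2430$ contain identical nucleotides:

```latex
\begin{align*}
I_{\text{gap as fifth character}} &= \frac{m}{L} = \frac{2430}{2700} = 90.0\,\% \\
I_{\text{gap sites excluded}} &= \frac{m}{L - g} = \frac{2430}{2640} \approx 92.0\,\%
\end{align*}
```

The same pair of sequences scores about two percentage points higher once gap sites are dropped, so a numerically higher threshold is needed to draw the line in the same place. In practice the difference also depends on whether the identities come from pairwise or multiple alignments, which this arithmetic does not capture.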

One group of begomoviruses that has been affected the most by applying the revised analysis is the “sweepovirus” group, a divergent clade of whitefly-transmitted geminiviruses that infect sweet potato and wild species in the Convolvulaceae. Previously, the group was proposed to include 17 species [17]. The new system reduces the number of species by more than half, delineating 8 species (Table 1; Fig. 1).

Fig. 1

(A) Pairwise sequence comparisons and (B) maximum-likelihood phylogenetic tree of sequences comprising the “sweepovirus” group. Sequences corresponding to the same species based on a 91 % cutoff (using the parameters described in the main text) are highlighted in the same color

Table 1 List of begomovirus species, as of January 2015. Species names are shown in bold italics, and isolate names are given in regular font. For species that do not have any known strains, only one isolate is listed, and that isolate is recognized as the “type” isolate. For species that have known strains, one isolate from each strain is shown, and the type isolate is the first one listed. Sequence accession numbers and assigned abbreviations are also listed. An expanded table including all begomovirus isolates in GenBank is available for download at the ICTV website.

Results of pairwise sequence comparisons accurately reflect the biology of begomoviruses

It has been claimed that begomoviral species are artificial because they are arbitrarily defined based on sequence alone, and therefore their biological characteristics have been ignored [11]. This is a misconception. Sequence-based taxonomy is possible only because it relies on the knowledge of the biological properties of these well-studied viruses. Therefore, sequence comparisons among related begomovirus isolates can accurately reflect differences in their biology. Several examples can be drawn upon to argue this point. One well-known example involves bean golden mosaic disease, an important disease of bean crops in Latin America. The disease is caused by at least two distinct, well-characterized begomoviruses, bean golden mosaic virus (BGMV), which occurs in Brazil and Argentina, and bean golden yellow mosaic virus (BGYMV), which occurs in Central/North America and the Caribbean Basin [18]. The symptoms of the disease are nearly indistinguishable, the whitefly vector species is the same for both pathogens, and the economic importance with respect to crop loss is comparable as well. In fact, initially, the same begomoviral etiology was suspected for the disease occurring in the two regions. However, when the causal agents from plants collected in Puerto Rico (USA) and Brazil were sequenced, the results indicated that they had substantially different genome sequences [19, 20]. Later, it was demonstrated that the two agents differed in at least one relevant biological property: tissue tropism. BGMV is phloem-restricted in beans, while BGYMV is not [20, 21]. Thus, the species cutoff based on sequence alone was accurate and reflected the biological differences between the viruses belonging to these two species.

The most obvious benefit to using the SDT-based pairwise identity analysis is that there are fewer species and strains at the interface between the cutoff and the next lower or higher percent nt sequence identity. As such, applying the proposed 91 % cutoff increases reliability owing to the robust stringency.

Why so many begomoviruses?

As noted above, the genus Begomovirus includes the largest number of species of all currently established genera, with 288 species currently recognized by the ICTV. Why so many begomoviruses? Are these species “artificial”, the result of flawed taxonomic demarcation criteria? The existence of this large number of species can be explained by natural order relationships based on the characteristics of this genus that set it apart from many other viral genera.

Begomoviruses are transmitted by members of a cryptic species complex, Bemisia tabaci (Genn.) (Hemiptera: Aleyrodidae), which is distributed worldwide and colonizes a wide array of plants belonging to species in many families [22–25]. B. tabaci has emerged as a major threat to agricultural systems in many regions of the world since the 1970s and 1980s [26, 27]. Reports of unprecedented B. tabaci infestations have characteristically resulted in outbreaks of previously undescribed begomoviruses and the apparent disappearance of others from cultivated plants [28]. Because B. tabaci colonizes so many plant species [25], it potentiates the transfer of begomoviruses between non-cultivated and cultivated hosts (which are most studied by plant virologists). While it is beyond the scope of this proposal to fully explore the hypothesis that most begomoviruses isolated from cultivated hosts likely evolved from viruses originally adapted to infecting non-cultivated hosts, this hypothesis could explain, at least in part, why there are so many more begomovirus species than are found for other virus genera where virus-host-vector interactions are more evolutionarily ancient.

Another consideration is that virologists working with ssDNA viruses have gained a powerful new tool in the form of rolling circle-amplification (RCA), a method that allows for rapid, sequence-independent sampling of virus populations. The impact of RCA in the field of geminivirology cannot be overstated (for example, see ref. [29]). Using RCA, it is possible to amplify and recover the complete genome of almost any begomovirus from minute amounts of total plant DNA extracted under suboptimal conditions [30]. Presently, tissue samples can be collected, dried, and stored for months or years at room temperature, and thousands of complete begomovirus genomes can be readily amplified using RCA following a quick DNA extraction [31, 32]. In the 1990s it would take months to clone one full-length begomovirus genome, whereas hundreds of genomes can now be cloned in a matter of weeks. Furthermore (and equally relevant), because RCA uses random primers, it reduces sequence amplification biases and enables the detection of most or all unique genome molecules present in a sample. As a result, new begomoviruses and other, often highly divergent, geminiviruses have been discovered that will probably lead to the recognition of additional genera in the family (and perhaps new families as well) [33–36]. Also, this new technology has prompted a significant increase in the numbers of novel begomoviruses that are being sought, and found, in non-cultivated plants.

Finally, it should be pointed out that the extent of diversity currently recognized within this genus (and possibly for all viruses) represents only the tip of the iceberg. Metagenomic approaches are rapidly becoming affordable and will probably lead to the discovery of viruses belonging to hundreds of new genera and families, not to mention species [37]. Their impact on geminivirus discovery has already been felt [38–42].

Different cutoff values must be used for the different genera in the family Geminiviridae

The approach implemented herein to demarcate species in the genus Begomovirus is identical to that used and approved by the ICTV for the other genera in the family [12, 13, 36]. However, for each genus, the working cutoff for species demarcation differs, even though the method applied to determine these cutoffs has been the same. For example, mastrevirus species are demarcated using a 78% cutoff. The 78% species cutoff value for the mastreviruses is demonstrated by the pairwise distance distribution plot (Fig. 2A), in which a clear valley is apparent at 78%. Such a valley is not readily evident in the equivalent plot for the begomoviruses (Fig. 2B and C), leading us to analyze this genus using groups of sequences. The analysis reported herein supported a 91% cutoff value for begomoviruses as that which best separates the species in the genus, and this is well supported by the SDT analysis.

Fig. 2

Distribution of the full-genome pairwise sequence identity scores for members of the genera (A) Mastrevirus and (B, C) Begomovirus (C corresponds to a higher resolution of the shaded region in B). Note the valley (or gap) corresponding to the 72–78 % frequencies in the Mastrevirus plot and the absence of significant valleys in the Begomovirus plot
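A distribution plot of this kind amounts to a histogram of pairwise identity scores in whole-percent bins, in which a "valley" is a run of bins holding few or no scores. The sketch below shows one minimal way to build such a histogram and to list the sparsest bins in a chosen range; the function names and the example scores are hypothetical, and half-up rounding is an assumption.

```python
import math
from collections import Counter


def identity_histogram(identities):
    """Count pairwise identity scores per whole-percent bin (half-up rounding)."""
    return Counter(math.floor(value + 0.5) for value in identities)


def sparsest_bins(hist, lo, hi):
    """Return the bins in [lo, hi] that hold the fewest pairwise scores."""
    bins = range(lo, hi + 1)
    fewest = min(hist.get(b, 0) for b in bins)
    return [b for b in bins if hist.get(b, 0) == fewest]


if __name__ == "__main__":
    scores = [70.2, 70.4, 71.0, 79.6, 80.1, 80.3, 85.0]  # invented values
    hist = identity_histogram(scores)
    print(sparsest_bins(hist, 71, 80))
```

For a genus with a clear valley, the sparsest bins cluster around the cutoff; for the begomoviruses no such cluster stands out, which is why the analysis proceeded group by group.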

Several other families and genera have species demarcation thresholds similar to that of the begomoviruses, including Parvovirus (95 %), Microviridae (80–85 %) and Sobemovirus (60–85 %). It is perhaps troubling that a uniform approach for computing species thresholds does not exist across all of viral taxonomy at this time. Currently, various study groups use different algorithms for specific genes, sets of genes, or complete genomes. Further, complete genome sequences are lacking for viruses of many species, particularly those with large genomes. In those instances, the trees represent a gene tree instead of a virus tree, which can create misconceptions about viral genome structure and lead to incorrect evolutionary inferences. It will be interesting to see if our approach may be useful for other viral families.

A step-by-step guide for classifying new begomovirus isolates as members of species or strains

To facilitate the taxonomic placement of newly discovered begomoviruses and to assist in the standardization of this procedure, the following guidelines are proposed for classification into species and strains:

  1. A BLASTn analysis of the “non-redundant nucleotide” database should be performed to identify the species whose members have sequences most similar to the new sequence. The nucleotide sequence database at the NCBI website can also be searched using the search term “txid10814 [Organism: exp] AND 2500:3500[SLEN]”, which will return all begomovirus nucleotide sequences that are between 2500 and 3500 nucleotides long.

  2. The new sequence should be added to a dataset of full-length genomes or DNA-A components created based on the BLAST results, and saved in FASTA format. All sequences must start at the same genomic coordinate (the first nucleotide after the nicking site within the conserved nonanucleotide at the origin of replication is the recommended standard).

  3. The MUSCLE option in SDT v1.2 (freely available), or any other program that uses the MUSCLE alignment algorithm with pairwise deletion of gaps, should be used to calculate identities between every pair of sequences in the dataset. If using SDT, these pairwise identities may be saved in either a column or matrix csv format that can then be viewed in a spreadsheet program such as Microsoft Excel. Percent identities must be rounded to the nearest full percentile.

  4. If the new sequence shares <91 % genome-wide pairwise identity with every other known begomovirus sequence, appropriate species and virus names should then be proposed (see below for guidelines on doing so).

  5. If the new sequence belongs to an existing species but shares <94 % genome-wide pairwise identity with all isolates described for that species, a strain name should then be proposed.
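Parts of the procedure above lend themselves to a short script. The sketch below is illustrative only and makes several assumptions that are not stated in the guidelines: the NCBI E-utilities `esearch` endpoint is used to express the database search as a URL; the conserved nonanucleotide is taken to be TAATATTAC with the nick after its seventh base; sequences are assumed to be supplied in the virion-sense orientation; and "rounded to the nearest full percentile" is read as half-up rounding. All function names are hypothetical, and no network request or alignment is performed here.

```python
import math
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
SEARCH_TERM = "txid10814 [Organism: exp] AND 2500:3500[SLEN]"  # from step 1
NONANUCLEOTIDE = "TAATATTAC"  # assumption: conserved motif at the origin
NICK_OFFSET = 7               # assumption: nick lies after the 7th base


def begomovirus_search_url(retmax=100):
    """Step 1: build (but do not send) an E-utilities query URL."""
    query = {"db": "nuccore", "term": SEARCH_TERM, "retmax": retmax}
    return ESEARCH + "?" + urlencode(query)


def rotate_to_nick_site(seq):
    """Step 2: rotate a circular genome to start just after the nick site."""
    seq = seq.upper()
    # Search the doubled sequence so a motif spanning the linear ends is found.
    pos = (seq + seq).find(NONANUCLEOTIDE)
    if pos == -1:
        raise ValueError("nonanucleotide not found (check strand orientation)")
    start = (pos + NICK_OFFSET) % len(seq)
    return seq[start:] + seq[:start]


def round_half_up(value):
    """Step 3: round to a whole percent; built-in round() would give 90 for 90.5."""
    return math.floor(value + 0.5)


def classify(identities, species_cutoff=91, strain_cutoff=94):
    """Steps 4 and 5: decide what, if anything, needs to be proposed.

    identities maps each known species name to the list of percent
    identities between the new sequence and that species' isolates.
    """
    best_species, best = None, -1
    for species, values in identities.items():
        top = max(round_half_up(v) for v in values)
        if top > best:
            best_species, best = species, top
    if best < species_cutoff:
        return ("new species", None)
    if best < strain_cutoff:
        return ("new strain", best_species)
    return ("existing strain", best_species)
```

For example, `classify({"Species A": [92.6, 91.0]})` reports a new strain of the hypothetical "Species A", because the best rounded identity (93 %) meets the species cutoff but not the strain cutoff. If a genome contains more than one copy of the motif, the sketch uses the first; a real pipeline would need to check this.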

Guidelines for naming new species that include newly discovered begomoviruses

Virus species name

This is the ICTV-accepted name of a group of begomoviruses sharing ≥91 % pairwise sequence identity for the full-length genome or DNA-A component. If the sequence has <91 % sequence identity to all begomoviruses previously classified as members of distinct species, the virus should be considered a member of a new species, and a unique name that is not currently in use for any ICTV-recognized species should be assigned. This name should follow the template “Host symptom virus” (e.g., Bean golden mosaic virus). Although it was common practice for begomoviruses, the Geminiviridae SG recommends that country, city, town, village or province names not be used in naming new viruses and new viral species (e.g., Tomato yellow leaf curl Thailand virus), as this may cause misunderstandings when a virus named after a country or city is subsequently found in other locations within that country or in other countries. (Previously accepted names using this practice will not be changed to avoid conflicts in the literature.)

Strain name

Based on ICTV guidelines, there is no practical or standardized approach for differentiating or naming strains (or any other category below the species level). In fact, item 3.3 of the ICTV Statutes states that “The ICTV is not responsible for classification and nomenclature of virus taxa below the rank of species.” Nevertheless, the Geminiviridae Study Group has adopted its own guidelines for strain differentiation and nomenclature [43], although there is no formal requirement to do so. Our new analysis indicated a sequence identity threshold of 94 % for strain demarcation.

Ideally (when knowledge is available), strains should follow a nomenclature that reflects biological differences between the members of the same species. For example, if it is established that a number of BGMV isolates comprising a distinct strain are capable of infecting a host (e.g., lima bean) that other BGMV isolates do not normally infect, it would then be appropriate to name the strain BGMV-Lima bean. Likewise, symptom severity descriptors (e.g., Tomato golden mosaic virus-Yellow vein) could also be used. In either case, such strain names should be used only when the phenotype is observed in multiple isolates of the same strain. As recommended for species names, country, city, town, village or province names should not be used in naming new strains. Strain name follows species name separated by a hyphen (“-”).

Isolate descriptor

Following the species/strain name, and within square brackets (“[ ]”), the isolate descriptor may contain any number of sub-fields separated by hyphens. Although the 9th ICTV Report’s recommendations for geminivirus nomenclature [1] suggested the use of colons (“:”) to separate sub-fields in the isolate descriptor, this can cause problems in various phylogenetic-tree-drawing programs, which, when reading phylogenetic trees in Newick format, will misinterpret numbers after the colon as representing branch length information.

The first sub-field should always be the two-letter international code of the country/territory in which the isolate was sampled (Supplementary Table S2), whereas the last sub-field should always be the year in which the isolate was last present within living tissue. If the year in which the isolate was sampled differs from the date on which it was last present within living tissue (e.g., when isolates are propagated in the laboratory), the date when the isolate was sampled should then be included as an internal sub-field. Between the first and last sub-fields, any additional descriptors can be used (for example, the laboratory code of the sample from which the isolate was obtained, the city nearest to the place where the sample was obtained, the host species from which the virus was isolated).
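The naming convention can be captured in a small helper that assembles a full isolate name and rejects colons in the descriptor. This is a sketch under assumptions: the function name, the example values and the single space before the opening bracket are illustrative choices, not part of the guidelines.

```python
def format_isolate_name(species, country_code, year, strain=None, extra=()):
    """Build 'Species-Strain [CC-extra-...-year]' with hyphen-separated sub-fields.

    country_code: two-letter code of the country/territory of sampling.
    year: year in which the isolate was last present within living tissue.
    extra: optional internal sub-fields (lab code, nearest city, host, ...).
    """
    fields = [country_code.upper(), *[str(item) for item in extra], str(year)]
    for field in fields:
        # Colons would be read as branch-length separators in Newick trees.
        if ":" in field:
            raise ValueError(f"colon not allowed in sub-field: {field!r}")
    name = species if strain is None else f"{species}-{strain}"
    return f"{name} [{'-'.join(fields)}]"


if __name__ == "__main__":
    # Hypothetical isolate of the hypothetical BGMV-Lima bean strain
    print(format_isolate_name("Bean golden mosaic virus", "br", 2010,
                              strain="Lima bean", extra=("Lab01",)))
```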


Since the 1990s, begomovirus taxonomy has been based primarily on sequence comparison methods. In this regard, it was pioneering, helped by the large number of full-length sequences available, and allowed for a robust statistical treatment of the data. Although this approach has been criticized for not taking biology into account, a closer look into the recognized species will show that biology is accurately reflected in the taxonomy. This revision demonstrated the robustness and the reliability of a sequence-based taxonomy, and this was acknowledged by the ICTV with the positive outcome of the latest taxonomy proposal and the establishment of new begomovirus species. It should be noted, however, that the ICTV Geminiviridae SG has no ulterior motivation for continuing to propose new species. Rather, the number of new species proposed is a genuine reflection of our increasingly effective methods to conceptualize the natural genetic variability of this remarkable group of viruses. By establishing clear guidelines for the analysis of full-length genomic sequences, and following standardized nomenclature for the naming of newly established species and strains, the intention of the ICTV Geminiviridae SG is that these changes will improve taxonomic communications among users while also leaving open options for further improvements in the future that will serve the geminivirus community at large.


  1. 1.

    Brown JK, Fauquet CM, Briddon RW, Zerbini FM, Moriones E, Navas-Castillo J (2012) Family Geminiviridae. In: King AMQ, Adams MJ, Carstens EB, Lefkowitz EJ (eds) Virus taxonomy. Ninth report of the international committee on taxonomy of viruses. Elsevier Academic Press, London, pp 351–373

  2. 2.

    Sanchez-Campos S, Martinez-Ayala A, Marquez-Martin B, Aragon-Caballero L, Navas-Castillo J, Moriones E (2013) Fulfilling Koch’s postulates confirms the monopartite nature of tomato leaf deformation virus: a begomovirus native to the New World. Virus Res 173:286–293

    Article  CAS  PubMed  Google Scholar 

  3. 3.

    Melgarejo TA, Kon T, Rojas MR, Paz-Carrasco L, Zerbini FM, Gilbertson RL (2013) Characterization of a new world monopartite begomovirus causing leaf curl disease of tomato in Ecuador and Peru reveals a new direction in geminivirus evolution. J Virol 87:5397–5413

  4.

    Nakhla MK, Maxwell DP, Martinez RT, Carvalho MG, Gilbertson RL (1994) Widespread occurrence of eastern Mediterranean “strain” of tomato yellow leaf curl geminivirus in tomatoes in the Dominican Republic. Plant Dis 78:926

  5.

    Duffy S, Holmes EC (2007) Multiple introductions of the old world begomovirus Tomato yellow leaf curl virus into the new world. Appl Environ Microb 73:7114–7117

  6.

    Zhang W, Olson NH, Baker TS, Faulkner L, Agbandje-McKenna M, Boulton MI, Davies JW, McKenna R (2001) Structure of the Maize streak virus geminate particle. Virology 279:471–477

  7.

    Stanley J, Gay MR (1983) Nucleotide sequence of cassava latent virus DNA. Nature 301:260–262

  8.

    Hamilton WD, Stein VE, Coutts RHA, Buck KW (1984) Complete nucleotide sequence of the infectious cloned DNA components of tomato golden mosaic virus: potential coding regions and regulatory sequences. EMBO J 3:2197–2205

  9.

    Harrison BD (1985) Advances in geminivirus research. Annu Rev Phytopathol 23:55–82

  10.

    Padidam M, Beachy RN, Fauquet CM (1995) Classification and identification of geminiviruses using sequence comparisons. J Gen Virol 76:249–263

  11.

    Van Regenmortel MH, Ackermann HW, Calisher CH, Dietzgen RG, Horzinek MC, Keil GM, Mahy BW, Martelli GP, Murphy FA, Pringle C, Rima BK, Skern T, Vetten HJ, Weaver SC (2013) Virus species polemics: 14 senior virologists oppose a proposed change to the ICTV definition of virus species. Arch Virol 158:1115–1119

  12.

    Muhire B, Martin DP, Brown JK, Navas-Castillo J, Moriones E, Zerbini FM, Rivera-Bustamante R, Malathi VG, Briddon RW, Varsani A (2013) A genome-wide pairwise-identity-based proposal for the classification of viruses in the genus Mastrevirus (family Geminiviridae). Arch Virol 158:1411–1424

  13.

    Varsani A, Martin DP, Navas-Castillo J, Moriones E, Hernandez-Zepeda C, Idris A, Zerbini FM, Brown JK (2014) Revisiting the classification of curtoviruses based on genome-wide pairwise identity. Arch Virol 159:1873–1882

  14.

    Muhire BM, Varsani A, Martin DP (2014) SDT: a virus classification tool based on pairwise sequence alignment and identity calculation. PLOS One 9:e108277

  15.

    Edgar RC (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinform 5:1–19

  16.

    Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 28:2731–2739

  17.

    Albuquerque LC, Inoue-Nagata AK, Pinheiro B, Resende RO, Moriones E, Navas-Castillo J (2012) Genetic diversity and recombination analysis of sweepoviruses from Brazil. Virol J 9:241

  18.

    Morales FJ, Jones PG (2004) The ecology and epidemiology of whitefly-transmitted viruses in Latin America. Virus Res 100:57–65

  19.

    Gilbertson RL, Faria JC, Ahlquist P, Maxwell DP (1993) Genetic diversity in geminiviruses causing bean golden mosaic disease: the nucleotide sequence of the infectious cloned DNA components of a Brazilian isolate of bean golden mosaic geminivirus. Phytopathology 83:709–715

  20.

    Gilbertson RL, Hidayat SH, Martinez RT, Leong SA, Faria JC, Morales FJ, Maxwell DP (1991) Differentiation of bean-infecting geminiviruses by nucleic acid hybridization probes and aspects of bean golden mosaic in Brazil. Plant Dis 75:336–342

  21.

    Morra MR, Petty ITD (2000) Tissue specificity of geminivirus infection is genetically determined. Plant Cell 12:2259–2270

  22.

    Brown JK, Frohlich DR, Rosell RC (1995) The sweetpotato or silverleaf whiteflies: biotypes of Bemisia tabaci or a species complex? Annu Rev Entomol 40:511–534

  23.

    De Barro PJ, Liu SS, Boykin LM, Dinsdale AB (2011) Bemisia tabaci: a statement of species status. Annu Rev Entomol 56:1–19

  24.

    Brown JK (2010) Phylogenetic biology of the Bemisia tabaci sibling species group. In: Stansly PA, Naranjo SE (eds) Bemisia: bionomics and management of a global pest. Springer, Dordrecht, pp 31–67

  25.

    Gill R, Brown JK (2010) Systematics of Bemisia and Bemisia relatives: can molecular techniques solve the Bemisia tabaci complex conundrum—a taxonomist’s viewpoint. In: Stansly PA, Naranjo SE (eds) Bemisia: bionomics and management of a global pest. Springer, Dordrecht, pp 5–29

  26.

    Brown JK, Bird J (1992) Whitefly-transmitted geminiviruses and associated disorders in the Americas and the Caribbean basin. Plant Dis 76:220–225

  27.

    Costa AS (1975) Increase in the populational density of Bemisia tabaci, a threat to widespread virus infection of legume crops in Brazil. In: Bird J, Maramorosch K (eds) Tropical diseases of legumes. Academic Press, New York, p 171

  28.

    Brown JK (2007) The Bemisia tabaci complex: genetic and phenotypic variability drives begomovirus spread and virus diversification. APSNet Featur. doi:10.1094/APSnetFeature-2007-0107

  29.

    Haible D, Kober S, Jeske H (2006) Rolling circle amplification revolutionizes diagnosis and genomics of geminiviruses. J Virol Methods 135:9–16

  30.

    Owor BE, Shepherd DN, Taylor NJ, Edema R, Monjane AL, Thomson JA, Martin DP, Varsani A (2007) Successful application of FTA Classic Card technology and use of bacteriophage phi29 DNA polymerase for large-scale field sampling and cloning of complete maize streak virus genomes. J Virol Methods 140:100–105

  31.

    Inoue-Nagata AK, Albuquerque LC, Rocha WB, Nagata T (2004) A simple method for cloning the complete begomovirus genome using the bacteriophage phi29 DNA polymerase. J Virol Methods 116:209–211

  32.

    Shepherd DN, Martin DP, Lefeuvre P, Monjane AL, Owor BE, Rybicki EP, Varsani A (2008) A protocol for the rapid isolation of full geminivirus genomes from dried plant tissue. J Virol Methods 149:97–102

  33.

    Krenz B, Thompson JR, Fuchs M, Perry KL (2012) Complete genome sequence of a new circular DNA virus from grapevine. J Virol 86:7715

  34.

    Loconsole G, Saldarelli P, Doddapaneni H, Savino V, Martelli GP, Saponari M (2012) Identification of a single-stranded DNA virus associated with citrus chlorotic dwarf disease, a new member in the family Geminiviridae. Virology 432:162–172

  35.

    Bernardo P, Golden M, Akram M, Naimuddin Nadarajan N, Fernandez E, Granier M, Rebelo AG, Peterschmitt M, Martin DP, Roumagnac P (2013) Identification and characterisation of a highly divergent geminivirus: evolutionary and taxonomic implications. Virus Res 177:35–45

  36.

    Varsani A, Navas-Castillo J, Moriones E, Hernández-Zepeda C, Idris A, Brown JK, Zerbini FM, Martin DP (2014) Establishment of three new genera in the family Geminiviridae: Becurtovirus, Eragrovirus and Turncurtovirus. Arch Virol 159:2193–2203

  37.

    Edwards RA, Rohwer F (2005) Viral metagenomics. Nat Rev Microbiol 3:504–510

  38.

    Ng TFF, Duffy S, Polston JE, Bixby E, Vallad GE, Breitbart M (2011) Exploring the diversity of plant DNA viruses and their satellites using vector-enabled metagenomics on whiteflies. PLOS One 6:e19050

  39.

    Idris A, Al-Saleh M, Piatek MJ, Al-Shahwan I, Ali S, Brown JK (2014) Viral metagenomics: analysis of begomoviruses by Illumina high-throughput sequencing. Viruses 6:1219–1236

  40.

    Candresse T, Filloux D, Muhire B, Julian C, Galzi S, Fort G, Bernardo P, Daugrois JH, Fernandez E, Martin DP, Varsani A, Roumagnac P (2014) Appearances can be deceptive: revealing a hidden viral infection with deep sequencing in a plant quarantine context. PLOS One 9:e102945

  41.

    Rosario K, Duffy S, Breitbart M (2012) A field guide to eukaryotic circular single-stranded DNA viruses: insights gained from metagenomics. Arch Virol 157:1851–1871

  42.

    Rosario K, Breitbart M (2011) Exploring the viral world through metagenomics. Curr Opin Virol 1:289–297

  43.

    Fauquet CM, Briddon RW, Brown JK, Moriones E, Stanley J, Zerbini FM, Zhou X (2008) Geminivirus strain demarcation and nomenclature. Arch Virol 153:783–821


The analysis described in this manuscript was part of the taxonomic proposal 2013.015a,bP, approved by the Executive Committee of the ICTV in July 2013 and ratified in March 2014. JKB and FMZ are past and current chairs, respectively, of the Geminiviridae Study Group of the ICTV. JNC, EM, RWB, CHZ, AI, VGM, DPM, RRB, SU and AV were members of the Geminiviridae SG during 2012–2014, when the work was performed. JNC and EM are members of the Research Group AGR-214, partially funded by Consejería de Economía, Innovación y Ciencia, Junta de Andalucía, Spain, cofinanced by FEDER-FSE. DPM and AV are supported by the National Research Foundation of South Africa.

Author information



Corresponding authors

Correspondence to Judith K. Brown or F. Murilo Zerbini.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary File S1

Pairwise sequence identity analysis and maximum-likelihood phylogenetic trees of 38 groups of closely related viruses in the genus Begomovirus. For each group, sequences assigned to the same species based on a 91% cutoff are shaded in the same color. For example, Group 1 has no sequences that conflict with the proposed threshold, whereas Group 6 has one; under the conflict-resolution criterion adopted (an isolate is assigned to the species containing the isolate with which it shares the highest degree of sequence identity), that sequence is assigned to Honeysuckle yellow vein mosaic virus (xls 607 kb)

Supplementary Table S2

Two-letter international country or territory codes (docx 28 kb)

About this article

Cite this article

Brown, J.K., Zerbini, F.M., Navas-Castillo, J. et al. Revision of Begomovirus taxonomy based on pairwise sequence comparisons. Arch Virol 160, 1593–1619 (2015).


  • Geminiviridae
  • Nomenclature
  • Sequence Demarcation Tool
  • Single-stranded DNA virus
  • Species demarcation
  • Taxonomy
  • Whitefly-transmitted viruses