Virus nomenclature below the species level: a standardized nomenclature for natural variants of viruses assigned to the family Filoviridae
- Cite this article as:
- Kuhn, J.H., Bao, Y., Bavari, S. et al. Arch Virol (2013) 158: 301. doi:10.1007/s00705-012-1454-0
The task of international expert groups is to recommend the classification and naming of viruses. The International Committee on Taxonomy of Viruses Filoviridae Study Group and other experts have recently established an almost consistent classification and nomenclature for filoviruses. Here, further guidelines are suggested to include their natural genetic variants. First, this term is defined. Second, a template for full-length virus names (such as “Ebola virus H.sapiens-tc/COD/1995/Kikwit-9510621”) is proposed. These names contain information on the identity of the virus (e.g., Ebola virus), isolation host (e.g., members of the species Homo sapiens), sampling location (e.g., Democratic Republic of the Congo (COD)), sampling year, genetic variant (e.g., Kikwit), and isolate (e.g., 9510621). Suffixes are proposed for individual names that clarify whether a given genetic variant has been characterized based on passage zero material (-wt), has been passaged in tissue/cell culture (-tc), is known from consensus sequence fragments only (-frag), or does (most likely) not exist anymore (-hist). We suggest that these comprehensive names are to be used specifically in the methods section of publications. Suitable abbreviations, also proposed here, could then be used throughout the text, while the full names could be used again in phylograms, tables, or figures if the contained information aids the interpretation of presented data. The proposed system is very similar to the well-known influenzavirus nomenclature and the nomenclature recently proposed for rotaviruses. If applied consistently, it would considerably simplify retrieval of sequence data from electronic databases and be a first important step toward a viral genome annotation standard as sought by the National Center for Biotechnology Information (NCBI). 
Furthermore, adoption of this nomenclature would increase the general understanding of filovirus-related publications and presentations and improve figures such as phylograms, alignments, and diagrams. Most importantly, it would counter the increasing confusion in genetic variant naming due to the identification of ever more sequences through technological breakthroughs in high-throughput sequencing and environmental sampling.
The International Committee on Taxonomy of Viruses (ICTV, http://www.ictvonline.org), the body tasked by the International Union of Microbiological Societies (IUMS) to make decisions on matters of virus classification and nomenclature, is responsible for the assignment of viruses to taxa (orders, families, subfamilies, genera, and species). This task is supported by the intellectual input from specialized groups, the ICTV Study Groups. Study Groups serve as advisory committees and connect the ICTV taxonomists with laboratory virologists, who investigate viruses scientifically. ICTV Study Groups and other expert groups, but not the ICTV itself, propose unique virus names and abbreviations. Fauquet et al. discussed in 2008 that “[i]t is de facto accepted by the virologists that there is no homogeneity in the demarcation criteria, nomenclature and classification below the species level, and each specialty group is establishing an appropriate system for its respective family” . Unfortunately, most ICTV Study Groups or other expert groups have not provided clear guidelines in the past, accepting strain and genetic variant names as they were suggested by different researchers in their publications rather than creating consistent nomenclature schemes that apply at least to all viruses of one family. The status quo is, therefore, that variants of particular viruses are often named according to different standards. For instance, one may be assigned a number only, whereas another may be referred to by a name, whereas yet another one may have been designated with the year of isolation. Such variety is not necessarily cause for grave concern when the number of virus variants is very limited and experts in the field are aware of them. However, their number, and in particular the number of sequences deposited in databases, has increased considerably in recent years. 
It is therefore becoming difficult for researchers to be aware of them, and in particular, to know their specific characteristics. Decreasing sequencing costs and ongoing improvements in sequencing technology have led to increased submission of genomic consensus sequences of viruses to databases without associated peer-reviewed descriptive publications and without fulfilling (yet undefined) minimum standards for sequencing and metadata. In practice, this means that crucial information about these sequences and the associated viruses is lost when, for instance, the date of isolation or the location of isolation, i.e., the “biological context” in the form of metadata, are not deposited along with the sequences. In a recent report from a National Center for Biotechnology Information (NCBI) workshop on virus genome annotation, the authors of the report concluded that “…only when information…[is] included does a viral genome sequence become something more: a sample isolated in evolutionary time and environmental context, which can be compared to others, allowing inferences between sequence, host, chronology, and geography” . The amount of genomic information for members of the family Filoviridae is unlikely to become overwhelming in the short term. However, their importance in regard to biodefense measures, and recent calls for genetic filovirus variant standardization to expedite countermeasure development , make them suitable candidates for name standardization trends started by influenzavirus, coronavirus, and rotavirus experts. Here, we propose guidelines for the establishment of a standardized nomenclature for natural genetic variants of filoviruses, and how to build designations for them from metadata in GenBank records and publications.
Review of existing systems for nomenclature below the species level
Several nomenclature schemes have been brought forward for individual virus groups. The most commonly accepted one is the nomenclature for influenzaviruses (family Orthomyxoviridae, genera Influenzavirus A/B/C), which was published in 1953 by the World Health Organization and which has since been updated several times [35, 36, 37]. According to the guidelines of this nomenclature, influenzaviruses are to be designated as
<Virus name> <antigenic type>/<host of origin if other than human>/<geographical origin>/<serial number>/<last two digits (or all four digits) of year of isolation> (<hemagglutinin subtype><neuraminidase subtype>)
Examples: influenza A virus A/duck/Germany/1868/68 (H6N1) or influenza A virus A/chicken/Vietnam/NCVD-404/2010 (H5N1)
A similar system was suggested for the naming of avian coronaviruses (order Nidovirales, family Coronaviridae, subfamily Coronavirinae, genus Gammacoronavirus, species Avian coronavirus) by Cavanagh in 2001:
<Virus name>/<host of origin>/<geographical origin>/<serial number>/<last two digits of year of isolation> (<subtype>)
Example: infectious bronchitis virus/chicken/Netherlands/D274/78
The influenzavirus nomenclature has proven very useful as it allows searching for and identifying particular influenzavirus isolates from the more than 190,000 deposited sequences. It is generally accepted within the influenzavirus research community and has the advantage that the isolate designation is mostly self-explanatory, allowing non-influenzavirus specialists to comprehend it quickly. Its disadvantages are that (a) it has become partially redundant because the three “antigenic types” of “influenza virus” have been reclassified as three different viruses belonging to three different species (influenza A/B/C virus, species Influenza A/B/C virus); (b) host designation permits the use of non-standardized animal names that lack specificity (such as “duck” → which kind of duck?); and (c) it does not distinguish between strains and variants.
In 2000, the ICTV Caliciviridae Study Group proposed a nomenclature for caliciviruses:
<Host of origin>/<genus abbreviation>/<species abbreviation>/<virus name>/<year of occurrence>/<country of origin>
Examples: Fe/VV/FCV/F9/1960/US or Ra/LV/RHDV/V-351/1987/CZ
This system has several disadvantages. For one, abbreviations are used for host organisms (Fe → felid/cat; Ra → rabbit) but it is unclear how these abbreviations are created. Second, according to the International Code of Virus Classification and Nomenclature (ICVCN), genera and species should not be abbreviated but must be italicized . Here, they are abbreviated but not italicized (Vesivirus → VV; Feline calicivirus → FCV; Lagovirus → LV; Rabbit hemorrhagic disease virus → RHDV). Complicating the matter, the species abbreviations used are identical with the virus abbreviations in circulation. Third, “F9” or “V-351” are not virus names, but rather isolate identifiers. Fourth, the system does not differentiate between strains and variants.
<Virus species name> - [<Country of origin>:<isolate identifier>:<isolate host>:<sampling year>]
Examples: Malvastrum leaf curl virus - [China:Guangxi 100:Papaya:2005] or pepper yellow vein Mali virus - [Burkina Faso:Banfora:hot pepper1:2009]
Fauquet et al. also realized that it would be beneficial to develop a system for abbreviated names for sequence alignments or phylograms:
Malvastrum leaf curl virus - [China: Guangxi 100: Papaya: 2005] → MaLCuV - [CN:Gx100:Pap:05] or pepper yellow vein Mali virus - [Burkina Faso:Banfora:hot pepper1:2009] → PepYVMLV - [BF:Ban:Hpe1:09]
Reminiscent of the influenzavirus nomenclature, this system is easily comprehensible in its unabbreviated form. However, its implementation is complicated by the requirement that virus species names be listed when in practice virus names are (and should be) used instead [4, 7, 15, 28, 29, 30] and the problem that host designation permits the use of non-standardized host names that lack specificity (such as “hot pepper” → which kind of hot pepper?). The abbreviated names are more difficult to grasp immediately and raise the question how countries, cities, and hosts should be abbreviated consistently.
The most recent comprehensive virus nomenclature was proposed in 2011 by the Rotavirus Classification Working Group in conjunction with the development of an NCBI database for rotavirus genome sequences. This nomenclature is again similar to that used for influenzaviruses:
<rotavirus group>/<species of origin>/<country of identification>/<common name>/<year of identification>/<G- and P-type>
Examples: RVA/Human-wt/ECU/Ecu534/2006/G20P or RVA/Cow-lab/GBR/PP-1/1976/G3P
The system envisions the use of particular denominators to point out missing information (“XXX”). Suffixes in the <species of origin> field point out whether a viral genome has been directly sequenced from a clinical specimen (“-wt” for “wild-type”), from tissue/cell culture-derived viruses (“-tc”), from viruses passaged through homologous hosts (“-hhp”) or from laboratory-generated or laboratory-engineered viruses that resemble viruses from nature (“-lab”) or cannot be found in nature (“-LabStr”). Contrary to other nomenclatures described here, the developers of the rotavirus nomenclature did address the need for a standardization of country abbreviations and suggested to use the unique 3-letter (“alpha-3”) country codes used by the Representation of Names of Countries (ISO 3166) as prepared by the International Organization for Standardization (see http://www.iso.org/iso/country_codes.htm or https://www.cia.gov/library/publications/the-world-factbook/appendix/appendix-d.html). A minor problem of this nomenclature is that one denominator asks for “species of origin,” yet the provided examples list actual animals (“human” instead of “Homo sapiens” for instance). Furthermore, as in other nomenclatures described above, it is unclear which vernacular name of a given animal ought to be used (for instance: should “cougar” be used, or should “puma,” “mountain lion” or “mountain cat” be chosen?). Instead, the authors provide simple examples for the “species of origin” denominator, such as “mouse” or “bat,” which could be more descriptive (which kind of “mouse” or “bat”?), given that there are over a thousand different animals assigned to a roughly equal number of different mouse and bat species.
Filovirus nomenclature below the species level
Summary of current filovirus taxonomy as endorsed by the ICTV Filoviridae Study Group and accepted by the ICTV
Current taxonomy and nomenclature | Previous taxonomy and nomenclature (Eighth ICTV Report)
Species Marburg marburgvirus | Species Lake Victoria marburgvirus
Virus 1: Marburg virus (MARV) | Virus: Lake Victoria marburgvirus (MARV)
Virus 2: Ravn virus (RAVV) | -
Species Taï Forest ebolavirus | Species Cote d’Ivoire ebolavirus [sic]
Virus: Taï Forest virus (TAFV) | Virus: Cote d’Ivoire ebolavirus [sic] (CIEBOV)
Species Reston ebolavirus | Species Reston ebolavirus
Virus: Reston virus (RESTV) | Virus: Reston ebolavirus (REBOV)
Species Sudan ebolavirus | Species Sudan ebolavirus
Virus: Sudan virus (SUDV) | Virus: Sudan ebolavirus (SEBOV)
Species Zaire ebolavirus | Species Zaire ebolavirus
Virus: Ebola virus (EBOV) | Virus: Zaire ebolavirus (ZEBOV)
Species Bundibugyo ebolavirus | -
Virus: Bundibugyo virus (BDBV) | -
Species Lloviu cuevavirus* | -
Virus: Lloviu virus (LLOV) | -
In recent years, filovirus disease outbreaks have been observed more frequently, and an ever-increasing number of isolates and genomic consensus sequences are becoming available. Technological breakthroughs in sequencing also have allowed the identification of a novel filovirus (Lloviu virus, LLOV) in the absence of replicating isolates. Until recently, filoviruses were exclusively isolated from humans (Marburg virus, MARV; Ravn virus, RAVV; Bundibugyo virus, BDBV; Ebola virus, EBOV; Sudan virus, SUDV; Taï Forest virus, TAFV), crab-eating macaques (Macaca fascicularis Raffles, 1821) (Reston virus, RESTV) and western chimpanzees (Pan troglodytes verus Schwarz, 1934) (Taï Forest virus, TAFV). However, MARV and RAVV were recently isolated from certain fruit bats, namely Egyptian rousettes (Rousettus (Rousettus) aegyptiacus E. Geoffroy, 1810), and RESTV was isolated from apparently healthy domestic pigs (Sus scrofa scrofa Linnaeus, 1758). LLOV was detected in Schreibers’s long-fingered bats (Miniopterus schreibersii Kuhl, 1817). Fragmented MARV genomes were detected in greater long-fingered bats (Miniopterus inflatus Thomas, 1903) and eloquent horseshoe bats (Rhinolophus eloquens Andersen, 1905). Fragmented EBOV genome consensus sequences were detected in tissues from deceased western lowland gorillas (Gorilla gorilla gorilla Savage, 1847) and central chimpanzees (Pan troglodytes troglodytes Blumenbach, 1775), as well as from hammer-headed fruit bats (Hypsignathus monstrosus Allen, 1861), Franquet’s epauletted fruit bats (Epomops franqueti Tomes, 1860), and little collared fruit bats (Myonycteris (Myonycteris) torquata Dobson, 1878). Conversely, several filoviruses are not available for research anymore and/or sequence information was lost, but awareness of them remains important for the understanding of historical reports.
It is foreseeable that the discovery or creation of novel filoviruses will accelerate in the near future. It is therefore of the utmost importance to retrospectively and prospectively establish a consistent, easily comprehensible nomenclature for filoviruses that not only provides crucial information such as isolation host and place and date of isolation, but also describes whether the entity in question is natural in origin or artificial, and extant or extinct/destroyed. In this article, a standardized nomenclature and guidelines for its further development are being proposed for natural filoviruses, i.e., filoviruses occurring in nature. Follow-up articles will clarify the nomenclature for laboratory and artificial/synthetic filoviruses and provide datasets on all filoviruses known with name designations following the scheme proposed here.
Filovirus variant and isolate definitions
Unfortunately, there is no universally accepted definition for the terms “strain”, “variant”, and “isolate” in the virology community, and most virologists simply copy the usage of terms from others. Here, we propose not to add to the existing confusion by constructing radically novel definitions, but rather to employ or extrapolate from the few existing definitions that have been brought forward. It is important to point out here that no matter at which classification level one examines “a virus”, one always deals with a varied population. A virus-infected cell will, after only one round of replication, already contain a population of genomes, and virions derived from these genomes will vary slightly from each other (quasispecies). Likewise, a sample taken from a virus culture or an infected animal will contain numerous virions, many of which vary slightly (quasispecies). While single-virion analysis is theoretically possible, it certainly is not done routinely right now (it would be meaningless as far as naming is concerned), and even if it were, one would still have to work with virion populations to infect animals, and of course virions are not equal to viruses. Consequently, “a virus”, “a strain”, “a variant”, or “an isolate” always refers to populations and not to single physical entities, and their descriptions are therefore based on average properties. For instance, “the sequence” of “an isolate” is a consensus sequence of the population of genomes present in the analyzed sample.
According to Van Regenmortel, a (natural) virus strain is a “variant of a given virus that is recognizable because it possesses some unique phenotypic characteristics that remain stable under natural conditions” [emphasis added by the authors] . Such “unique phenotypic characteristics” are biological properties different from the compared reference virus, such as unique antigenic properties, host range or the signs of disease it causes. Importantly, as Van Regenmortel points out, a virus variant with a simple “difference in genome sequence…is not given the status of a separate strain since there is no recognizable distinct viral phenotype” . This definition is very similar to that of Fauquet and Stanley, who argued that “strains are viruses that belong to the same species and differ in having stable and heritable biological, serological, and/or molecular characters [sic]” . These two definitions are also reflected in the words for “strain” in other languages, such as German (Stamm), which back-translate to “trunk” rather than “branch”, i.e., the word implies something fundamentally different from a reference entity despite it being directly related to it, possibly with little genomic sequence variation. A strain is therefore a genetically stable virus variant that differs from a natural reference virus (type variant) in that it causes a significantly different, observable, phenotype of infection (different kind of disease, infecting a different kind of host, being transmitted by different means etc.). “Genetically stable” means that the genomic changes associated with the phenotypic change are largely preserved over time through natural selection. The extent of genomic sequence variation is irrelevant for the classification of a variant as a strain since a distinct phenotype sometimes arises from few mutations. 
“Observable phenotype” means, for instance, that within a comparative animal experiment, it would be possible for the researcher to distinguish between the reference control virus-infected animal and the animal infected with the alleged new strain, without knowing which animal received which virus and without having any information about the differences between the two viruses. The designation of a virus variant as a virus strain would be the responsibility of international expert groups. Thus far, despite the abundant indiscriminate use of the word “strain” in the filovirus literature, natural filovirus strains according to this definition have not been reported. All described genetic variants of EBOV, for instance, cause a similar hemorrhagic fever in humans and even experimental animals and are transmitted similarly. None of the known EBOV genetic variants can be distinguished from others on clinical grounds alone. In fact, their variety seems to be limited to subtle differences in growth kinetics and plaque formation in vitro or subtle changes in the duration of disease in experimental animals, and ultimately derives from limited, but often stable, differences in genomic sequence . This also holds true for the different genetic variants of MARV, RAVV, BDBV, RESTV, and SUDV (currently, there is only one isolate of TAFV and none of LLOV). We therefore recommend abstaining from using the word “strain” in context of any natural filovirus until either a particular genetic filovirus variant is discovered that causes a difference in disease phenotype and/or until expert groups establish a clear-cut definition of what “phenotype” means and to which extent phenotypes must differ to establish a filovirus as a strain.
Genetic filovirus variants
Van Regenmortel defined a virus variant as an isolate or a set of isolates whose genomic (consensus) sequence(s) differ(s) from that of a reference virus, i.e., the term “variant” is often equivalent to “mutant”. According to Fauquet et al., a virus “variant is something that differs slightly from the norm…[i.e.,] it means a slightly different genome, symptom, or mode of transmission” [emphasis added by the authors]. According to Van Regenmortel’s definition, which we adopt, multiple genetic filovirus variants have been described during the last four and a half decades.
Definition of “natural genetic filovirus variant”
A natural genetic filovirus variant is a natural filovirus that differs in its genomic consensus sequence from that of a reference filovirus (the type virus of a particular filovirus species) by ≤10 % but is not identical to the reference filovirus and does not cause an observably different phenotype of disease (filovirus strains would be genetic filovirus variants, but most genetic filovirus variants would not be filovirus strains if a strain definition were brought forward).
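The sequence part of this definition can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of the definition: the two consensus sequences are taken to be already aligned and of equal length, and "difference" is computed as the plain fraction of mismatching alignment columns. The function names are hypothetical, and the phenotype criterion cannot be checked computationally and is therefore not covered.

```python
def fraction_different(reference: str, query: str) -> float:
    """Fraction of aligned positions at which two consensus sequences differ.

    Assumption: both sequences are pre-aligned and of equal length.
    """
    if not reference or len(reference) != len(query):
        raise ValueError("sequences must be non-empty and of equal (aligned) length")
    mismatches = sum(a != b for a, b in zip(reference.upper(), query.upper()))
    return mismatches / len(reference)


def meets_sequence_criterion(reference: str, query: str, threshold: float = 0.10) -> bool:
    """True if the query differs from the reference, but by no more than the threshold."""
    difference = fraction_different(reference, query)
    return 0 < difference <= threshold


# Toy sequences for illustration only (not real filovirus data).
reference = "ACGTACGTACGTACGTACGT"
print(meets_sequence_criterion(reference, "ACGTACGTACGTACGTACGA"))  # one mismatch in 20
print(meets_sequence_criterion(reference, reference))               # identical
```

A real analysis would need an explicit alignment step and a stated rule for gaps and ambiguous bases, which this sketch leaves open.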
Fauquet and Stanley defined a virus isolate as “a sample…that has been cultured for study”. Van Regenmortel has come to a similar conclusion and defined a virus isolate as “simply an instance of a particular virus”. We suggest adopting the latter definition for filoviruses, as advancement in sequencing now allows for the partial characterization of an instance of a virus variant in the absence of culturing.
Definition of “natural filovirus isolate”
A natural filovirus isolate is an instance of a particular natural filovirus or of a particular genetic variant. Isolates can be identical or slightly different in consensus or individual sequence from each other.
It is important to point out that the designation of a filovirus as a “genetic variant” could change with time given the accumulation of new data that justify such a change. A novel isolate of a virus may at first be grouped with a particular genetic filovirus variant based on sequence information but later reclassified as a strain after experimental infections reveal it to behave phenotypically differently from a reference variant (for instance, if the filovirus did not cause viral hemorrhagic fever in a laboratory nonhuman primate but rather caused encephalitis). It would be the decision of international expert groups to change the designation under such circumstances. It is also worth mentioning that historically the term “virus isolate” was used to designate a particular virus detected in a certain biological organism at a certain time, but not for a virus batch prepared in a laboratory from a seed culture that ultimately is derived from that detected virus. We explicitly do not recommend labelling every instance of a filovirus culture in the laboratory as a separate isolate.
General nomenclature for natural genetic filovirus variants and filovirus isolates
Ideally, filovirus taxonomy below the species level would follow an existing general scheme. Unfortunately, as described above, such a global nomenclature does not exist—but the influenzavirus, avian coronavirus, and rotavirus nomenclatures are sufficiently similar to serve as examples. We suggest following the rotavirus proposal, and propose the following general template for filoviruses, to be used in the Materials and Methods sections of manuscripts:
<virus name> <isolation host-suffix>/<country of sampling>/<year of sampling>/<genetic variant designation>-<isolate designation>
The isolation host should be provided in one word in the format “first letter of genus name.full name of species descriptor” but remain unitalicized to denote the fact that the virus was isolated from an entity and not from a taxon. For instance: “H.sapiens” (member of the species Homo sapiens), “G.gorilla” (member of the species Gorilla gorilla), “R.aegyptiacus” (member of the species Rousettus aegyptiacus). If an isolation host can only be identified to a taxon level higher than species, then the entire name of the lowest known taxon should be used. For instance: “Hipposideros” (member of the genus Hipposideros). Naming isolation hosts in this way is preferable to/more concise than using vernacular names, as these vary from language to language and one particular organism is often referred to by multiple vernacular names. See below for suffixes
The country of sampling field should contain an alpha-3 three-letter country code as outlined in ISO 3166-1 according to present country designations. For instance: “DEU” (Germany), “COG” (Republic of Congo), “UGA” (Uganda)
The year of sampling field should contain the year of sampling according to the Gregorian calendar in four digits
The genetic variant designation-isolate designation field should contain a unique genetic variant name or acronym (an abbreviation that can be pronounced) connected by a hyphen to an isolate descriptor. For instance: “Angola-1379c”, “Kikwit-9510621”
Example for the full-length designation of an isolate in the methods section of a manuscript: “Ebola virus H.sapiens-tc/COD/1995/Kikwit-9510621”
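As an illustration of how such a designation could be built from record metadata, the following Python sketch assembles the full-length name from its fields. The function and parameter names are hypothetical and not part of the proposal; the sketch assumes the host has been identified to species level and only checks the shape of the country code (three letters), not whether it is a valid ISO 3166-1 entry.

```python
def host_field(genus: str, species_descriptor: str) -> str:
    """First letter of the genus name, a period, then the full species descriptor."""
    return f"{genus[0].upper()}.{species_descriptor.lower()}"


def full_length_designation(virus_name: str, genus: str, species_descriptor: str,
                            suffix: str, country_alpha3: str, year: int,
                            variant: str, isolate: str) -> str:
    """Assemble a full-length designation for use in a methods section."""
    if len(country_alpha3) != 3 or not country_alpha3.isalpha():
        raise ValueError("country must be a three-letter (alpha-3) code")
    if not 1000 <= year <= 9999:
        raise ValueError("year of sampling must have four digits")
    host = host_field(genus, species_descriptor)
    return (f"{virus_name} {host}-{suffix}/{country_alpha3.upper()}/"
            f"{year}/{variant}-{isolate}")


print(full_length_designation("Ebola virus", "Homo", "sapiens", "tc",
                              "COD", 1995, "Kikwit", "9510621"))
# Ebola virus H.sapiens-tc/COD/1995/Kikwit-9510621
```

Keeping the fields separate until the final formatting step mirrors the idea that names are built from metadata rather than stored as opaque strings.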
Furthermore, we propose following the suggestions of Fauquet et al.  to define shorter versions of names for the convenience of writers and presenters. We suggest using the following medium-length designation for virus names in figures, such as phylograms, sequence alignments or diagrams:
The isolation host should be provided in a four-letter format “first letter of genus name.first three letters of species descriptor.” For instance: “H.sap” (member of the species Homo sapiens), “G.gor” (member of the species Gorilla gorilla), “R.aeg” (member of the species Rousettus aegyptiacus). If an isolation host can only be identified to a taxon level higher than species, then the entire name of the lowest known taxon should be used. For instance: “Hipposideros” (member of the genus Hipposideros). See below for suffixes
The country of sampling field should contain an alpha-3 three-letter country code as outlined in ISO 3166-1 according to present country designations. For instance: “DEU” (Germany), “COG” (Republic of Congo), “UGA” (Uganda)
The year of sampling field should contain the year of sampling according to the Gregorian calendar in two digits. For instance: “67”, “76”, “00”
The genetic variant designation-isolate designation should contain a unique genetic variant abbreviation connected by a hyphen to an isolate abbreviation. For instance: “Ang-1379c”, “Kik-9510621”
Example for the medium-length designation of an isolate in figures (alignments, phylograms) of a manuscript: “EBOV/Hsap/COD/95/Kik-9510621”
Finally, we propose to use the following name abbreviations within flowing text:
The genetic variant designation-isolate designation should contain a unique genetic variant abbreviation connected by a hyphen to an isolate abbreviation if the article addresses several different isolates of the same genetic variant. The isolate descriptor should be left blank if the latter is not the case. For instance: “Ang-1379c,” “Kik-9510621”; or “Ang”, “Kik”
Example for the designation of an isolate in the text of a manuscript: EBOV/Kik-9510621 (if other isolates of the same genetic variant are addressed in the same article); or EBOV/Kik (if this is the only isolate of this genetic variant addressed); or simply EBOV (if the article only addresses work with one particular genetic variant and isolate)
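The choice between these three in-text forms depends only on which genetic variants and isolates of a virus an article addresses, so it can be sketched as a small decision function. This is a hedged illustration, not part of the proposal: the function name, the `addressed` parameter, and the placeholder identifiers "Var2", "iso1", and "iso2" are hypothetical.

```python
def text_designation(virus_abbr: str, variant_abbr: str, isolate: str,
                     addressed: set) -> str:
    """Pick the shortest unambiguous in-text name.

    `addressed` is the set of (variant abbreviation, isolate) pairs of this
    virus that the article deals with.
    """
    if (variant_abbr, isolate) not in addressed:
        raise ValueError("the isolate must be among those addressed in the article")
    variants = {variant for variant, _ in addressed}
    isolates_of_variant = {iso for variant, iso in addressed if variant == variant_abbr}
    if len(variants) == 1 and len(isolates_of_variant) == 1:
        return virus_abbr                                  # only one variant and isolate
    if len(isolates_of_variant) > 1:
        return f"{virus_abbr}/{variant_abbr}-{isolate}"    # several isolates of this variant
    return f"{virus_abbr}/{variant_abbr}"                  # only isolate of this variant


only_one = {("Kik", "9510621")}
two_variants = {("Kik", "9510621"), ("Var2", "iso1")}
two_isolates = {("Kik", "9510621"), ("Kik", "iso2")}
print(text_designation("EBOV", "Kik", "9510621", only_one))      # EBOV
print(text_designation("EBOV", "Kik", "9510621", two_variants))  # EBOV/Kik
print(text_designation("EBOV", "Kik", "9510621", two_isolates))  # EBOV/Kik-9510621
```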
Filovirus GenBank records
GenBank records are indexed with regard to taxonomy, and each record must be associated with the “organism” field. Following the approach of rotavirus and adenovirus experts, we suggest the species name be used as the “organism” name on GenBank records. The virus name described above and the rest of the full-length designation described above (i.e., <isolation host-suffix>/<country of sampling>/<year of sampling>/<genetic variant designation>-<isolate designation>) should be used in the “/isolate” designation field. Using this approach, the definition line shown for nucleotide records would read “<organism name> <virus name> <isolate/genetic variant designation>”, for example, “Zaire ebolavirus Ebola virus H.sapiens-tc/COD/1995/Kikwit-9510621” [GenBank records currently cannot italicize entries, which is why the species name Zaire ebolavirus would appear without italics until this issue has been resolved].
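The mapping from designation components to record fields can be sketched as follows. This is an illustration only: it returns a plain Python dictionary and does not use any GenBank submission tool or API, and the function name and dictionary keys are hypothetical labels for the “organism” field, the “/isolate” field, and the resulting definition line.

```python
def genbank_fields(species_name: str, virus_name: str, host_with_suffix: str,
                   country_alpha3: str, year: int, variant: str, isolate: str) -> dict:
    """Derive the organism, isolate, and definition-line text for one record."""
    # The isolate field holds the virus name followed by the rest of the
    # full-length designation.
    isolate_field = (f"{virus_name} {host_with_suffix}/{country_alpha3}/"
                     f"{year}/{variant}-{isolate}")
    return {
        "organism": species_name,
        "isolate": isolate_field,
        "definition": f"{species_name} {isolate_field}",
    }


record = genbank_fields("Zaire ebolavirus", "Ebola virus", "H.sapiens-tc",
                        "COD", 1995, "Kikwit", "9510621")
print(record["definition"])
# Zaire ebolavirus Ebola virus H.sapiens-tc/COD/1995/Kikwit-9510621
```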
It is important to remember that “…the rationale behind any specific naming scheme may not stand the test of time. Yet, the metadata is [sic] a constant, and as long as the relevant metadata is [sic] included in every genome record, any naming format will be supported” . This means that future developments in filovirology may result in the demand for a modified or different nomenclature, which, however, should not be difficult to create based on the metadata collected and archived for the current system. A logical consequence of this system is therefore also the development of a database that contains the information suggested here in conjunction with metadata available from publications, other databases, or research records. In fact, as many data as possible should be added to GenBank records, and metadata ought to be updated on a constant basis. Filovirus genetic variant/isolate designations are by no means planned to replace record metadata within GenBank records, but rather are designed to be “built” from them.
Nomenclature for filoviruses characterized from passage 0 material
In 1995, Maniloff convincingly argued that isolates of viruses do not necessarily have to be available for their classification in existing taxonomical schemes, and that “no special taxon need to be considered for uncultured viruses” (such as the taxon Candidatus used in bacteriology) as long as the relationship of the uncultured virus to existing ones can be inferred unequivocally . Maniloff therefore extended to virology what has long been held in bacteriology, namely that the majority of infectious entities in nature most likely cannot be propagated in the laboratory due to their special adaptation to particular cell types and replication conditions. Furthermore, sequence information from uncultured viruses (even if they could be cultured) is strongly desired because the virus in question could quickly adapt to culture conditions and therefore mutate rapidly.
We agree with Maniloff’s proposition but argue that the sequence of at least a near-complete genome (only incomplete at its extreme 5′ and 3′ termini) of an uncultured filovirus has to be available before classification. LLOV is the only uncultured filovirus known at the time of writing for which near-complete genomic data are available, and it is also one of only two filoviruses we know of that have been sequenced before passaging in tissue/cell culture.
To differentiate uncultured or passage 0 filoviruses for which near-complete genomic data are available from those that exist in culture, we propose to follow the suggestions of the Rotavirus Classification Working Group and to add the suffix “-wt” (for “wild-type”) to their genetic variant names as outlined above.
Nomenclature for filoviruses characterized from passage X material
Filoviruses that have undergone tissue/cell culture passaging should receive names supplemented with the suffix “-tc”. The exact passaging history should be provided in GenBank metadata fields and also in the methods section of manuscripts next to the virus designation.
Nomenclature for potential filoviruses only known from genome fragments
In 2008, Voevodin and Marx introduced the term “frag-virus” for presumptive viruses known only from fragmented genomic sequence data. We agree that the amplification of short stretches of filovirus genomes and their phylogenetic placement using adequate homologous sequences derived from existing filoviruses is not sufficient to recognize truly novel viruses. For instance, amplified sequences could be experimental artifacts or result from RNA/DNA cross-contamination of samples. Near-full-length genomic sequencing and/or isolation of a replicating filovirus is essential to prevent misinterpretations of filovirus endemicity and diversity.
To distinguish presumptive filoviruses known primarily from genomic sequence fragments, we propose to add the suffix “-frag” (for “fragment”) to their genetic variant names. As genetic variant assignment could change upon further accumulation of data, the genetic variant names ought to be placed in quotation marks to denote the fact that they are considered temporary. For instance, Marburg virus R.aegyptiacus-frag/KEN/2007/“KE261” would be the designation for the virus hypothesized to exist based only on the availability of an NP gene fragment sequence (GenBank accession # GQ499199). “-frag” viruses may be reclassified as “-wt” or “-tc” viruses upon the detection and description of near-complete genomic data or virus isolation, upon which the genetic variant designation could become official (quotation marks dropped: “KE261” → KE261) or be changed.
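The “-frag” convention and its later reclassification can be sketched as two small string operations. The function names are hypothetical, and the sketch covers only the case in which the variant designation is kept when it becomes official; a changed designation would have to be supplied by hand.

```python
# Curly quotation marks flag a genetic variant name as temporary.
OPEN_QUOTE, CLOSE_QUOTE = "\u201c", "\u201d"


def frag_name(virus: str, host: str, country: str, year: int, variant: str) -> str:
    """Designation for a presumptive virus known only from sequence fragments."""
    return (
        f"{virus} {host}-frag/{country}/{year}/"
        f"{OPEN_QUOTE}{variant}{CLOSE_QUOTE}"
    )


def promote(name: str, new_suffix: str) -> str:
    """Reclassify a -frag designation as -wt or -tc and drop the quotation marks."""
    if new_suffix not in ("wt", "tc"):
        raise ValueError("a -frag virus can only become -wt or -tc")
    if "-frag/" not in name:
        raise ValueError("not a -frag designation")
    promoted = name.replace("-frag/", f"-{new_suffix}/", 1)
    return promoted.replace(OPEN_QUOTE, "").replace(CLOSE_QUOTE, "")


temporary = frag_name("Marburg virus", "R.aegyptiacus", "KEN", 2007, "KE261")
print(temporary)                 # Marburg virus R.aegyptiacus-frag/KEN/2007/“KE261”
print(promote(temporary, "wt"))  # Marburg virus R.aegyptiacus-wt/KEN/2007/KE261
```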
Nomenclature for filoviruses that have been lost
Storage of viruses is not always optimal, and viruses can become inactivated over time. Furthermore, virus-infected samples, such as formalin-fixed, paraffin-embedded, or frozen tissues, are often discarded when storage space is limited. It is therefore no surprise that once-isolated viruses have been inadvertently or deliberately destroyed. In particular, many MARV isolates obtained during the first recognized Marburg virus disease outbreaks in West Germany and Yugoslavia in 1967, such as isolates “Flak”, “Hilberger”, “Lüdicke”, or “Kliebe”, may have been lost forever. Others, such as “Popp” or “Hartz”, are still available in the form of guinea-pig-adapted or guinea-pig-passaged versions. A considerable percentage of the early filovirus literature reports experiments performed with these virus isolates. Their natural history often allows their closest still-available relatives to be inferred, thereby allowing for extrapolation of scientific data to virus isolates used today. We would like to emphasize the importance of studies done with now unavailable viruses, while urging authors to make clear to readers that the viruses used for those studies are no longer available and that results of present-day experiments with a closely related isolate may therefore not necessarily match historical results. To distinguish unavailable filoviruses, we propose to add the suffix “-hist” (for “historical”) to their genetic variant names. Medium-length designations are not necessary for “-hist” viruses, as genomic sequences are not available and therefore the construction of phylograms and alignments is impossible.
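With all four suffixes defined, a designation can also be read back into its components, for instance when filtering database records by characterization status. The following is a rough sketch: the regular expression is inferred from the two full-length examples in this article (it assumes a three-letter uppercase location code and a four-digit year) and is not an official grammar.

```python
import re

# Short descriptions paraphrasing the suffix definitions in this article.
SUFFIX_MEANING = {
    "wt": "characterized from passage 0 material",
    "tc": "passaged in tissue/cell culture",
    "frag": "known from sequence fragments only",
    "hist": "no longer available",
}

# Assumed layout: <virus> <host>-<suffix>/<country>/<year>/<variant[-isolate]>
NAME_RE = re.compile(
    r"^(?P<virus>.+?) (?P<host>[^\s/]+)-(?P<suffix>wt|tc|frag|hist)"
    r"/(?P<country>[A-Z]{3})/(?P<year>\d{4})/(?P<variant_isolate>.+)$"
)


def parse_name(name: str) -> dict:
    """Split a full-length designation into its fields; raise if it does not fit."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognized designation: {name!r}")
    fields = match.groupdict()
    fields["year"] = int(fields["year"])
    fields["status"] = SUFFIX_MEANING[fields["suffix"]]
    return fields


parsed = parse_name("Ebola virus H.sapiens-tc/COD/1995/Kikwit-9510621")
print(parsed["virus"], parsed["suffix"], parsed["status"])
```

The variant and isolate parts are deliberately left unsplit, since a “-frag” designation such as the KE261 example carries no separate isolate component.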
Usage of designations
Full-length designations are intended for the methods section of a publication, for example:
“HeLa cells in 96-well plates were infected for 1 h with Ebola virus H.sapiens-tc/COD/1995/Kikwit-9510621 (order Mononegavirales, family Filoviridae, species Zaire ebolavirus; GenBank accession no. AY354458) at an MOI of 0.5, 1, or 5. Virus was obtained from the Centers for Disease Control and Prevention, Atlanta, Georgia, USA, and had been passaged twice through grivets (species Chlorocebus aethiops) and twice through grivet kidney epithelial (Vero E6) cells before use”.
Abbreviations can then be used throughout the remainder of the text, for example:
“Here, we demonstrate that infection of rhesus monkeys with EBOV/May protects from subsequent infection with EBOV/Kik”.
We propose to limit the use of medium-length designations to phylograms and sequence alignments (and to replace them with abbreviations if space is limited).
Creating new designations
Upon discovery of a novel filovirus, it is, ideally, up to the discoverer to create an appropriate isolate designation according to the scheme proposed here. We strongly recommend (i) discontinuing the usage of patient names or patient name abbreviations for any part of the designation, as such practice is ethically problematic; (ii) avoiding the use of country names, as this has caused diplomatic problems in the past; (iii) avoiding the use of any “unusual” characters, such as those with diacritical marks, and instead adhering to the standard 26-letter Latin alphabet for the sake of database input and handling; and (iv) choosing designations that can be pronounced easily rather than designations that consist solely of numbers, as the latter are difficult to memorize. We further encourage all scientists to contact and seek the advice of the ICTV Filoviridae Study Group (http://ictvonline.org/subcommittee.asp?committee=24&se=) before publication of a novel isolate name.
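Recommendations (iii) and (iv) lend themselves to a simple automated pre-check before a new designation is submitted. The sketch below is only an illustration: the function name is hypothetical, and allowing digits alongside the 26 Latin letters is an assumption based on existing isolate designations such as 9510621. Recommendations (i) and (ii), on patient and country names, cannot be verified this way and still require human judgment.

```python
import string

# Assumption: digits are acceptable in addition to the 26-letter Latin alphabet.
ALLOWED = set(string.ascii_letters + string.digits)


def check_designation(label: str) -> list:
    """Return a list of problems found in a proposed variant or isolate label."""
    problems = []
    if not label:
        problems.append("designation is empty")
    unusual = sorted({ch for ch in label if ch not in ALLOWED})
    if unusual:
        problems.append("unusual characters: " + "".join(unusual))
    if label.isdigit():
        problems.append("consists solely of digits, which is hard to memorize")
    return problems


for candidate in ("Kikwit", "Lüdicke", "9510621"):
    print(candidate, check_designation(candidate))
```

An empty list means the label passed these two checks only; it does not imply that the designation is otherwise appropriate.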
We are indebted to Philippe le Mercier (Swiss Institute of Bioinformatics, Geneva, Switzerland) and colleagues for providing unpublished thoughts on virus nomenclature below the species level. We also thank Thomas S. Postler (New England Primate Research Center, Southborough, MA, USA) and Philip J. Kranzusch (Harvard Medical School, Boston, MA, USA) for their very useful editorial comments and suggestions.