A novel gammapolyomavirus in a great cormorant (Phalacrocorax carbo)

In this study, the complete genome of a novel polyomavirus detected in a great cormorant (Phalacrocorax carbo) was characterized. The 5133-bp-long genome of the cormorant polyomavirus has a genomic structure typical of members of the genus Gammapolyomavirus, family Polyomaviridae, containing open reading frames encoding the large and small tumor antigens, viral proteins 1, 2, and 3, and the X protein. The large tumor antigen of the cormorant polyomavirus shares 45.6–50.4% amino acid sequence identity with the homologous sequences of other gammapolyomaviruses. These data, together with results of phylogenetic analysis, suggest that this cormorant polyomavirus should be considered the first member of a new species within the genus Gammapolyomavirus, for which we propose the name “Phalacrocorax carbo polyomavirus 1”.


Introduction
Members of the family Polyomaviridae infect mammals, birds, and fish [1][2][3]. The biology of avian and mammalian polyomaviruses differs significantly. Avian polyomaviruses are not highly species-specific, and they cause acute, fulminant, and often fatal disease in susceptible hosts. In contrast, mammalian polyomaviruses show rigorous host species specificity and typically cause inapparent infections in immunocompetent hosts [2]. The family Polyomaviridae comprises eight genera. Viruses of the genera Alpha-, Beta-, and Deltapolyomavirus, as well as the recently established genera Epsilon- and Zetapolyomavirus, have been detected in mammals, whereas members of the genus Gammapolyomavirus infect birds. Polyomaviruses of fish have been assigned to the new genera Etapolyomavirus and Thetapolyomavirus [1].
Polyomaviruses are characterized by nonenveloped icosahedral particles, 40-45 nm in diameter, that enclose a circular dsDNA genome of 3962-7369 bp [2]. The polyomavirus genes are expressed in a time-dependent manner. The products of the early genes are primarily regulatory proteins (e.g., the large and small tumor antigens [LTA and STA]), and the late genes code for the viral proteins (VPs) VP1, VP2, and VP3, which are responsible for virion formation [2, 4-6]. Additional coding capacity has been noted; for example, the genome of avian polyomaviruses may encode an X protein or a VP4 protein downstream of the replication origin [2, 5-8].
Our knowledge of the genetic diversity of mammalian polyomaviruses has increased rapidly over the past 10 years, which has led to an extended taxonomic classification, with more than 100 species. Gammapolyomaviruses are represented by nine species [1-3, 7]. Goose hemorrhagic polyomavirus and budgerigar fledgling disease virus, two well-characterized, high-mortality avian polyomaviruses, have been described in a number of bird species that might play a role in their natural circulation [8,9]. Hence, it seems plausible that wild birds serve as hosts for novel gammapolyomaviruses as well as for some highly pathogenic gammapolyomaviruses associated with economic losses.
In this study, wild birds that died in 2019 at the Zoo and Botanical Garden, Budapest (Hungary), were tested for polyomaviruses. Approximately 50-100 mg of internal organ tissue was homogenized in PBS using a TissueLyzer LT instrument (QIAGEN, Hilden, Germany) and centrifuged at 10,000 × g for 5 min. Nucleic acid was extracted from a mixture of the prepared samples from each bird using a ZiXpress-32® Automated Nucleic Acid Purification Instrument and a ZiXpress-32® Viral Nucleic Acid Extraction Kit (Zinexts Life Science Corp., New Taipei City, Taiwan). Polyomavirus DNA was detected using a broad-spectrum nested PCR assay with the primer sets VP1-1f/VP1-1r and VP1-2f/VP1-2r described by Johne and co-workers [10]. Sequencing of the amplicons revealed traces of polyomavirus sequence in one of 32 specimens, which comprised kidney and liver samples from a great cormorant (Phalacrocorax carbo). The bird had been admitted to the zoo's rescue station with presumed traumatic injuries, but detailed pathological findings were not available.
Altogether, 528,035 sequence reads mapped to the novel genome with a sequencing depth of >3100. The priming site of the back-to-back PCR was determined by Sanger sequencing, resulting in lower sequencing depth in this region. The novel genome (GenBank accession number MZ666388) was found to be 5133 bp long, and its genomic structure resembles that of gammapolyomaviruses, containing the putative ORFs encoding the LTA, STA, VP1, VP2, and VP3 proteins (Fig. 1, Table 1) [1-8]. Furthermore, ORF-X was predicted upstream of VP2. We observed signatures of mRNA splicing in both the LTA and ORF-X genes (Fig. 1, Table 1). Analysis of representative complete genome sequences of members of all polyomavirus species, performed using RDP4 software, did not reveal any recombination events affecting the genome of the cormorant polyomavirus (CoPyV) [15].
Typical motifs similar to those in other avian polyomaviruses could be identified in the LTA of CoPyV, including the polyomavirus conserved region (CR1, LEELL), the hexapeptide in the J domain (HPDKGG), the pRB1-binding motif (LHAEE), the nuclear localization signal (NLS, TPPKDRAT), the zinc finger motif (CETCKAQKKDMPFRMLKRKWVGGHIDDH), and the ATPase motifs (GGVNTGKT and GAVPVNLE) [4,6,7]. VP3 started with an in-frame methionine of VP2, as part of the motif MALMPY, which conformed to the consensus motif MALXXΦ (Φ = W, F, Y) also described for other polyomaviruses [4]. The C-terminal region of the putative VP2 and VP3 proteins of CoPyV and all other gammapolyomaviruses is rich in arginine (R) and lysine (K), which may be components of functional NLSs [4]. Although NLSs have been recognized in VP1 proteins of mammalian polyomaviruses [4], an accumulation of basic amino acids is not typical for this region of avian polyomaviruses.
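The motif checks described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study; the function names, the motif labels, and the toy sequence are hypothetical, and only the motif strings and the MALXXΦ consensus are taken from the text.

```python
import re

# Consensus MALXXΦ from the text: X is any residue, Φ is W, F, or Y.
VP3_START_CONSENSUS = re.compile(r"MAL..[WFY]")

# LTA motifs quoted in the text for CoPyV; the labels are informal.
LTA_MOTIFS = {
    "CR1": "LEELL",
    "J domain hexapeptide": "HPDKGG",
    "pRB1-binding": "LHAEE",
    "ATPase motif 1": "GGVNTGKT",
    "ATPase motif 2": "GAVPVNLE",
}


def find_motifs(protein, motifs):
    """Return {label: 0-based start index} for every motif present in protein."""
    hits = {}
    for label, motif in motifs.items():
        pos = protein.find(motif)
        if pos != -1:
            hits[label] = pos
    return hits


def matches_vp3_start(segment):
    """True if a six-residue segment fits the MALXXΦ consensus."""
    return VP3_START_CONSENSUS.fullmatch(segment) is not None


if __name__ == "__main__":
    # Toy sequence for illustration only; it is not the CoPyV LTA.
    toy = "AAALEELLAAAHPDKGGAAA"
    print(find_motifs(toy, LTA_MOTIFS))
    print(matches_vp3_start("MALMPY"))
```

Exact string matching is used here because the text quotes exact motif sequences; screening other genomes for divergent variants of these motifs would need a more permissive pattern or a profile-based search.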
In the LTA-based phylogenetic tree, the CoPyV sequence clustered with gammapolyomavirus sequences (Fig. 1). In pairwise comparisons, the main coding sequences of CoPyV shared at most 62.5% nt and 66.6% aa sequence identity with those of other gammapolyomaviruses, with the highest values obtained for goose hemorrhagic polyomavirus, Adélie penguin polyomavirus, and butcherbird polyomavirus. According to the demarcation criteria for polyomaviruses, including a genetic distance of > 15% for the LTA aa sequence [3], CoPyV may be the first member of a novel species within the genus Gammapolyomavirus, for which we propose the name "Phalacrocorax carbo polyomavirus 1".
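The arithmetic behind the LTA distance criterion can be sketched as follows. This is an illustrative simplification, not the analysis pipeline of this study: the function names are hypothetical, genetic distance is approximated as 100 minus percent identity, and alignment columns containing a gap are skipped, which is only one of several conventions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def exceeds_lta_distance_threshold(identity_pct, threshold_pct=15.0):
    """True if the approximate distance (100 - identity) is above the threshold."""
    return (100.0 - identity_pct) > threshold_pct


if __name__ == "__main__":
    # 50.4% is the highest LTA aa identity reported for CoPyV in this article.
    print(exceeds_lta_distance_threshold(50.4))
```

With the highest reported LTA identity of 50.4%, the approximate distance is 49.6%, well above the 15% threshold. The distance criterion is only one of the demarcation criteria mentioned in the text, so passing this check alone would not be sufficient to define a species.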

Data availability
The sequence data are available in the GenBank database with accession number MZ666388.

Conflict of interest The authors declare no conflicts of interest.
Ethical approval The authors confirm that no ethical approval was required.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.