Introduction

In recent decades, emerging viral diseases associated with chiropterans have caused the loss of human and animal lives worldwide, mainly in Africa, Asia, and Australia [1,2,3]. Chiropterans are excellent bioindicators of environmental quality [4]; however, anthropogenic changes, such as deforestation, intensive agriculture, and extensive livestock farming, have modified the environment and unbalanced ecosystems. These changes bring wild animals, livestock, and humans into closer contact, including contact with animal species that may carry unknown putative pathogens, thereby increasing the risk of virus spillover and emerging infectious disease events [5,6,7]. In the last three decades, lethal human pathogens have emerged from bats, including henipaviruses [8, 9], ebolaviruses [10], and coronaviruses such as SARS-CoV [11], SARS-CoV-2 [12], and MERS-CoV [13]. Vampire bats are important actors in the spread of the rabies virus and are thus a source of concern to health and animal production authorities [14, 15].

Recently, 248 new virus species have been reported from bats around the world [1]; however, few of them are known to have the potential to cause severe disease. Species of concern belong to RNA virus families, such as Coronaviridae, Paramyxoviridae, Rhabdoviridae, Filoviridae, and Reoviridae [6]. Studies conducted in Brazil have screened 16 bat species and reported 576 virus species within the Adenoviridae, Astroviridae, Circoviridae, Coronaviridae, Hantaviridae, Papillomaviridae, and Parvoviridae families according to data tracked by DBatVir (http://www.mgc.ac.cn/DBatVir).

In the Neotropical region, three species of vampire bats can be found: Desmodus rotundus (E. Geoffroy Saint-Hilaire, 1810), Diphylla ecaudata (Spix, 1823), and Diaemus youngi (Jentink, 1893). Most research has focused on D. rotundus due to its wide distribution and high-density colonies [16,17,18,19,20,21]. Research on D. ecaudata is scarce [22,23,24], and to the best of our knowledge, the virus diversity and pathogen potential of D. youngi have not been reported, with the exception of herpesviruses detected in blood samples collected in French Guiana [25].

The recent and ongoing SARS-CoV-2 pandemic has put the role of bats as potential zoonotic virus reservoirs in the spotlight, leading to a race to identify biomes and hosts that could harbor the next outbreak, including South American bats [26]. Tropical and subtropical regions combine evergreen forests, rich biomes, high mammal diversity and density, pasture areas, and deforestation, and these factors make them hotspots with a high risk of emerging infectious diseases (EIDs) [27, 28].

The rapid evolution of high-throughput sequencing (HTS) technologies has led to the discovery of an ever-increasing genetic diversity of viral genomes from various sources, such as animals, plants, and environments. This approach is especially important for analyzing samples when little or no information is available on the virus diversity. The presence of D. youngi was recently reported for the first time in Rio Grande do Sul State, southern Brazil, by our research group [29]. Thus, the aim of this study was to take advantage of HTS to detect viruses in an “unbiased” manner by analyzing organ samples from D. youngi.

Materials and methods

Sample collection

In 2019, six D. youngi bats (one female and five males) were collected in the municipalities of Candelária, Restinga Seca, and São Miguel das Missões, which are located in Rio Grande do Sul State (Supplementary Fig. 1). The animals were captured with mist nets, euthanized with an overdose of 0.5% thionembutal administered intraperitoneally (1 mL), and immediately necropsied. Samples of liver, kidney, lung, heart, and intestines were harvested, placed separately in cryovials, kept at −20 °C for up to two days, transported to the Veterinary Virology Laboratory, and then stored at −80 °C until processing. Authorization to capture and collect vampire bats occurring in several municipalities in Rio Grande do Sul was obtained from the country's responsible environmental agency, the Chico Mendes Institute for Biodiversity Conservation (ICMBio), under authorization number 61537-1 (Supplementary material 1).

All organs were macerated individually and diluted to 20% (w/v) in phosphate-buffered saline (PBS) (pH 7.2). One pool containing all the samples was assembled with 100 μL of each organ. PBS was added to the pool to a final volume of 10 mL, which was then centrifuged at low speed at 2000×g for 30 min at 10 °C. The supernatant was filtered through a 0.45 μm filter to remove debris.

Viral metagenomics and HTS

The pool was ultracentrifuged on a 25% sucrose cushion at 150,000×g for 3 h at 4 °C in a Sorvall AH629 rotor. The pellet containing the viral particles was incubated for 1.5 h with DNase and RNase enzymes (Thermo Fisher Scientific, Waltham, MA, USA) [30]. Subsequently, viral RNA and DNA were isolated using TRI Reagent (Sigma Aldrich) and a standard phenol–chloroform protocol [31], respectively.

The viral DNA was enriched with the GenomePlex® Complete Whole Genome Amplification (WGA) Kit (Sigma-Aldrich, St. Louis, MO, USA), while the viral RNA was reverse-transcribed and enriched to dsDNA using the TransPlex® Complete Whole Transcriptome Amplification (WTA) Kit (Sigma-Aldrich, St. Louis, MO, USA), following the manufacturer's recommendations. The DNA products of these enrichment protocols were purified using the PCR Purification Combo Kit (Thermo Fisher Scientific). The quality and quantity of the DNA were assessed through spectrophotometry and fluorometry performed with a NanoDrop™ system (Thermo Fisher Scientific) and a Qubit™ system (Thermo Fisher Scientific), respectively, and the products were pooled in equimolar amounts to a final concentration of 0.2 ng of purified DNA. The library was prepared using the Nextera XT DNA Library Preparation Kit and sequenced on an Illumina MiSeq System with an Illumina v2 reagent kit (2 × 150 paired-end reads).

Bioinformatic analysis

The quality of the generated reads was evaluated, adapters were trimmed using FASTQ Toolkit v. 2.2.5 (BaseSpace Labs), and the data were de novo assembled using SPAdes Genome Assembler v. 3.9.0 [32]. Both tools were accessed on BaseSpace Sequence Hub (https://basespace.illumina.com). The assembled contigs were examined for similarities with known sequences through BLASTX using Blast2GO [33], and all relevant assemblies were confirmed by mapping reads to contigs using Geneious Prime software 2020.0.5 (https://www.geneious.com). Sequences with E-values ≤ 10⁻³ were classified as likely to have originated from eukaryotic viruses, bacteria, phages, or unknown sources based on the taxonomic origin of the sequence with the best E-value. Gene nucleotide and protein comparisons were performed with the BLASTN and BLASTP programs (https://blast.ncbi.nlm.nih.gov/Blast.cgi) to identify the sequences that were most closely related to the viral contigs of interest. Complete ORFs were predicted and annotated using Geneious Prime. Sequences were replaced by their reverse complements where necessary to keep all sequences on the same strand. In the annotation of circular genomes, for comparative analyses, we phased all sequences to the same start position of a selected reference protein.
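The two sequence-normalization steps described above (strand harmonization and phasing of circular genomes) can be illustrated with a minimal Python sketch. It is not the Geneious Prime procedure used in the study; the function names and the idea of phasing on an exact anchor string are assumptions made only for illustration.

```python
# Illustrative helpers; names are hypothetical and not part of the study's pipeline.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def phase_circular(genome: str, anchor: str) -> str:
    """Rotate a circular genome so that it starts at the first occurrence of `anchor`.

    The genome is extended by len(anchor) - 1 bases before searching so that an
    anchor spanning the arbitrary linearization breakpoint is still found.
    """
    if not anchor:
        raise ValueError("anchor must be a non-empty sequence")
    start = (genome + genome[: len(anchor) - 1]).find(anchor)
    if start == -1:
        raise ValueError("anchor not found; sequence cannot be phased")
    return genome[start:] + genome[:start]
```

In practice the anchor would be located by alignment to the reference protein rather than by exact string matching, and a sequence for which no anchor is found would be set aside rather than phased.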

Reference and/or representative sequences of viruses belonging to the families Anelloviridae, Genomoviridae, Smacoviridae, and Paramyxoviridae were obtained from GenBank and aligned with the sequences identified in the present study using ClustalW [34] in MEGA6 software [35]. Taxonomic analysis was conducted by considering the pairwise distance matrices (of either identity or divergence) calculated using MEGA6 according to cutoff values determined by ICTV guidelines specific for each family. Phylogenetic trees were constructed using MEGA6. For each phylogenetic tree, the best model was also calculated using MEGA6.

Detection of paramyxovirus and coronavirus by RT-PCR

All organs of the six bats were screened individually for a 494 bp fragment of the L gene of paramyxoviruses using a well-established, broadly reactive PCR protocol described previously [36], with the aim of amplifying and sequencing a fragment widely used for phylogenetic analysis. Additionally, the samples were screened individually for a 440 bp fragment of the pol gene of coronaviruses using a pan-coronavirus protocol described previously [37] to detect any coronavirus-related sequences. Primer sequences are specified in Supplementary Table 1.

Results

Overview

The DNA library generated a total of 108,036 reads that were de novo-assembled into 1,280 contigs. The contigs were compared with the GenBank nonredundant protein database through a BLASTX search conducted with an E-value cutoff of 10⁻⁵ in Blast2GO [33]. Using this approach, the vast majority (1,045/1,280; 82%) of sequences could not be classified (“no blast hit”). Host genome and bacterial and phage genomes corresponded to 15% (197/1,280) of the contigs, while the exogenous eukaryotic virus-related sequences corresponded to 3% (38/1,280) of the contigs (Fig. 1a).
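The best-hit classification underlying these proportions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout of the hits, the group labels, and the function names are hypothetical, and the study itself relied on Blast2GO rather than custom code.

```python
from collections import Counter


def classify_contigs(contig_ids, hits, max_evalue=1e-5):
    """Label each contig with the taxonomic group of its best (lowest E-value) hit.

    `hits` is an iterable of (contig_id, group, e_value) tuples, a hypothetical
    flattening of BLASTX output. Contigs without a hit at or below `max_evalue`
    are labeled "no blast hit".
    """
    best = {}
    for contig, group, evalue in hits:
        if evalue > max_evalue:
            continue  # hit is weaker than the cutoff
        if contig not in best or evalue < best[contig][1]:
            best[contig] = (group, evalue)
    return {c: best[c][0] if c in best else "no blast hit" for c in contig_ids}


def summarize(labels):
    """Return {group: (count, rounded percentage of all contigs)}."""
    counts = Counter(labels.values())
    total = len(labels)
    return {group: (n, round(100 * n / total)) for group, n in counts.items()}
```

Applied to the counts reported above, the same rounding gives 82% for 1,045/1,280, 15% for 197/1,280, and 3% for 38/1,280.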

Fig. 1

Metagenomic graphic results presenting the generated sequence distribution. a All reads obtained from this analysis; and b eukaryotic virus representativeness

The majority of eukaryotic virus sequences observed were circular DNA viruses (35/38, 92%), most of which belonged to the Anelloviridae family (26/38, 68%), while nine sequences were CRESS DNA viruses belonging to two viral families: Genomoviridae (7/38, 18%) and Smacoviridae (2/38, 5%). Three sequences (3/38, 8%) were related to single-stranded RNA (ssRNA) genomes belonging to the Paramyxoviridae family (Fig. 1b). Translated sequences similar to those of known or suspected eukaryotic viral proteins are briefly summarized in Table 1. The complete blastn/blastx results of virus-related contigs are presented in Supplementary Table 2. The sequences obtained are described in the following sections.

Table 1 Summary of virus-related contigs detected in Diaemus youngi pool of tissues

Although no coronavirus sequences were detected by HTS, a conventional “pan-coronavirus” RT-PCR protocol was performed to confirm their absence, and all animals were negative. A “pan-paramyxovirus” protocol was also performed to obtain an L gene fragment widely used for phylogenetic analysis; although the amplification results were positive, repeated attempts at Sanger sequencing failed.

Anelloviridae

A total of 26 contigs closely related to Anelloviridae members were detected (Table 1), with eight classified as Xitorquevirus; moreover, three highly divergent contigs were proposed as belonging to a new genus herein named Yodtorquevirus. The contigs ranged from 250 to 2374 nt in length. It was possible to detect two complete genomes (Fig. 2a and b) and four nearly complete genomes that displayed three complete ORFs (ORF1, ORF2, and ORF3) (Fig. 2c and d). Five additional sequences displaying the complete ORF1 gene were also detected. Considering that the complete ORF1 nucleotide sequence is required for Anelloviridae member classification, the remaining 15 Anelloviridae-related contigs that displayed only partial ORF1 or only ORF2 were excluded from the analysis.

Fig. 2

a Genomic organization of TTDyV-1 (Xitorquevirus) and b TTDyV-7 (proposed Yodtorquevirus genus); and c linearized genomic organization of TTDyV- 4 and 5 (Xitorquevirus) and d TTDyV-6 (proposed Yodtorquevirus genus). The arrows represent the directions and reading frame of each putative ORF (ORF1-ORF3). A closed green box indicates the GC-rich regions

According to the most recent ICTV guidelines, the demarcation criterion for new virus species within the family Anelloviridae is a nucleotide identity of less than 69% across the complete ORF1 nucleotide sequence (http://www.ictvonline.org/virusTaxonomy.asp). Thus, the nucleotide pairwise distance was calculated using SDT 2.1 [38]. Our nucleotide pairwise identity comparison revealed that seven species were detected (Supplementary Fig. 2). Also following the ICTV guidelines, we applied a phylogeny-based approach using the ORF1 amino acid sequences, and two distinct genera could be observed. The phylogenetic analysis was based on the translated ORF1 of the 11 TTDyVs, their most closely related sequences in GenBank, and representative members of each genus (Fig. 3). The results indicated that the isolates of the present study were grouped into two main clades representing two viral genera: Xitorquevirus and a putative new genus proposed in the present study, which we named Yodtorquevirus following the ICTV suggestion of using the Phoenician alphabet. The genomes showed similar organization in ORF1, ORF2, and ORF3. The ORF1 sequences of the Torque teno diaemus youngi virus (TTDyV) ranged from 442 to 550 aa in length, with typical arginine-rich regions at their N-termini (30% of the first 70 amino acids). A GC-rich region was found upstream of ORF2 for all analyzed sequences, while an extra GC-rich region could be observed within ORF2 in the members of the putative new genus (Fig. 2b and d). Sequence analysis indicated that the 11 TTDyVs shared ORF1 nucleotide sequence identities ranging from 26.43 to 63.91%.
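The way a pairwise identity cutoff translates into species groups can be sketched as follows. The sketch assumes that a matrix of ORF1 pairwise identities is already available (for example, exported from SDT) and that sequences are grouped by single linkage, so that any pair at or above 69% identity falls in the same species; the grouping rule, the function name, and the input layout are assumptions for illustration, not the procedure documented in the study.

```python
def group_by_identity(names, identity, threshold=69.0):
    """Group sequences into putative species by single-linkage clustering.

    `identity` maps frozenset({a, b}) to the percent identity between a and b.
    Two sequences end up in the same group if they are connected by a chain of
    pairs whose identity is >= `threshold`.
    """
    parent = {n: n for n in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if identity[frozenset((a, b))] >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())
```

Single linkage is the simplest choice; when identities sit close to the cutoff, a stricter rule (for example, requiring all pairs within a group to exceed the threshold) may give different groups.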

Fig. 3

Phylogenetic analysis performed based on the ORF1 nucleotide sequences of the 11 TTDyVs, their most closely related sequences in GenBank and representative members of each genus

Three sequences were classified as two new species (TTDyV-6 and 7) within Yodtorquevirus, while one isolate (TTDyV-1) was closely related to TTVs previously described in bats from Brazil: one from a Tadarida brasiliensis (NC_024908) collected in the same region as the present work [39] and one from Desmodus rotundus (MF541386) collected in São Paulo State (Southeast Brazil) in 2010 [20]. These three isolates belonged to the same species within the Xitorquevirus genus based on the pairwise distance (Supplementary Table 2), and the seven remaining isolates detected in the present study comprised four new species (TTDyV-2, 3, 4, and 5) within this genus.

Genomoviridae

In the present study, seven contigs closely related to Genomoviridae members were detected (Table 1). The contigs ranged from 330 to 2212 nt in length. One complete genome, 2157 nt long, was obtained (Fig. 4a). The genome organization included a spliced Rep protein ORF, a Cap ORF in the opposite orientation, and one putative ORF3 in the same orientation as Rep. The stem-loop structure was found between the 5′ ends of the two main ORFs (Fig. 4a). The remaining contigs contained partial Rep/Cap sequences and were analyzed only by means of blastn/blastx, as shown in Supplementary Table 2.

Fig. 4

a Genome map of giant panda-associated gemykrogvirus 1 isolate Diaemus youngi (GiGemyV). Genes encoding the replication-initiation protein (Rep) and capsid protein (Cap) are shown with arrows. A putative ORF3 is also represented by an arrow. The position of the nonanucleotide (TAATATATT) at the potential stem-loop structure is also indicated; and b Rep amino acid phylogenetic tree of Genomoviridae. The sequences were analyzed through the maximum likelihood method with the LG + G + I model. Analyses were conducted with 1000 bootstrap replicates. Bootstrap values higher than 50% are shown. The sequence detected in the present study is highlighted with a circle

For taxonomic classification, the genome-wide pairwise identities and the phylogenetic tree of the translated Rep were analyzed. The phylogenetic analysis included our complete genome, the best matches based on Rep in GenBank and reference sequences considering the list from the family description from the ICTV (http://talk.ictvonline.org/taxonomy/), with a total of 47 sequences in the dataset.

Phasing of complete genomes was necessary to align the complete circular genomes, and the TATA box upstream of the stem loop was set as position 1. When phasing was not possible, the sequence was excluded from the analysis. The translated sequences were aligned with ClustalW [34] using MEGA6 software. Gap opening and extension penalties were set to 5 and 1, respectively, due to the high divergence of the sequences. A translated Rep maximum likelihood tree was constructed (Fig. 4b), and genome-wide pairwise identities were calculated (one minus p distances of pairwise aligned sequences with pairwise deletion of gaps) for species classification (Supplementary Table 3). Following ICTV guidelines, 78% pairwise identity was set as the value for species demarcation. The genome was classified within the Gemykrogvirus genus, and it belonged to the same species as a virus previously associated with a giant panda (GenBank accession number MF327559) [40] based on the 94.1% genome-wide pairwise identity (Supplementary Table 3); for this reason, our genome was named giant panda-associated gemykrogvirus 1 isolate Diaemus youngi ViroVet10 (GenBank accession number MW436631).
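The identity measure described above, one minus the p-distance with pairwise deletion of gaps, can be written out as a short Python sketch. It operates on two already aligned sequences of equal length; the function names and the use of "-" as the gap character are assumptions, and treating identities equal to the threshold as "same species" is a simplification for illustration.

```python
def pairwise_identity(aln_a: str, aln_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences (100 * (1 - p-distance)).

    Columns in which either sequence has a gap are skipped (pairwise deletion),
    so only positions present in both sequences are compared.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x == gap or y == gap:
            continue  # pairwise deletion of gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


def same_species(identity_percent: float, threshold: float = 78.0) -> bool:
    """Apply a genome-wide identity cutoff for species demarcation."""
    return identity_percent >= threshold
```

With the 94.1% genome-wide identity reported here, `same_species(94.1)` evaluates to true, which is consistent with assigning the genome to the existing giant panda-associated species.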

Paramyxoviridae

In the present study, three contigs closely related to the Paramyxoviridae family were detected (Table 1). The contig lengths were 332, 402, and 464 nt. The nodes were renamed ViroVet1, ViroVet2, and ViroVet3 for readability, and their mapping to a reference genome is shown in Fig. 5a. The blastx analysis (Supplementary Table 2) showed that ViroVet2 displayed 62.41% amino acid identity (query cover of 99%) with the L protein of an unclassified bat paramyxovirus (GenBank accession number AIF74184) detected in a Pomona roundleaf bat from China in 2012 (Wu et al., 2016). ViroVet1 had 64.55% identity (query cover of 99%) with the F protein of Lophumoris jeilongvirus 1 (formerly named Mount Mabu virus 1) detected in a Mozambique rodent in 2011 (GenBank accession number YP_009666844) [41], while ViroVet3 had 81.16% identity (query cover of 44%) with the M protein of the same virus.

Fig. 5

a Mapping of bat paramyxovirus contigs ViroVet1, 2, and 3 in the complete genome of a reference bat paramyxovirus (MZ328288.1); b partial L; c partial F; and d partial M amino acid phylogenetic trees of Paramyxoviridae reference sequences and unclassified “bat paramyxovirus”. For the partial L tree, sequences of the Pneumoviridae family were included as outgroups. There were 86, 75, and 74 amino acid sequences, respectively, and 148 positions in the final dataset. The sequences were analyzed through the Maximum likelihood method with LG + G + I model. Analyses were conducted with 1000 bootstrap replicates. Bootstrap values higher than 50% are shown. The sequences detected in the present study were highlighted with circles

Based on the blastn results, the most similar sequences for each node were retrieved from the GenBank database along with reference sequences of the Paramyxoviridae family, with a final dataset of 86 sequences. Three datasets were analyzed separately. The nucleotide sequences were translated and aligned using ClustalW. A maximum likelihood phylogenetic tree was constructed for each partial protein.

The phylogenetic tree of the partial L protein (Fig. 5b) showed that ViroVet2 belongs to the Orthoparamyxovirinae subfamily but does not cluster close to any recognized genus. A cluster containing the newly recognized Jeilongvirus, the proposed Shaanvirus, the newly described Balerina virus (unclassified), and unclassified bat paramyxoviruses is the most closely related to this L fragment. The phylogeny of the ViroVet1 partial F protein and ViroVet3 partial M protein (Fig. 5c and d, respectively) showed that our fragments were most closely related to Jeilongvirus members but still divergent. It was not possible to determine whether the three fragments belonged to the same genome or to multiple genomes.

Smacoviridae

In the present study, two contigs closely related to Smacoviridae members were detected, one with 478 nt (partial Rep gene) and a full-length circular genome of 2465 nt, which was named Diaemus youngi-associated smacovirus-related virus (DiaemusSV) (Fig. 6a). This genome encoded a Rep ORF and a Cap ORF in the same orientation, which was classified as the type V organization following the CRESS DNA virus classification scheme proposed by Rosario and collaborators [42]. A putative stem loop structure was located near the 3′-end of the Rep ORF with homology to the degenerate NANNNTTAC nonanucleotide sequence motif, which is also shared by smacoviruses [43].
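Locating a degenerate nonanucleotide motif such as NANNNTTAC on a circular genome can be sketched in Python as below. This is an illustrative search, not the annotation procedure used in the study: the function name and the small IUPAC table are assumptions, only the given strand is scanned, and a motif hit alone does not establish a stem-loop, which also requires flanking inverted repeats.

```python
import re

# Minimal IUPAC table covering the symbols used in NANNNTTAC (assumption:
# other ambiguity codes are not needed for this illustration).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_motif_circular(genome: str, motif: str = "NANNNTTAC"):
    """Return (start, matched_sequence) for every motif occurrence on a circular genome.

    The genome is extended by len(motif) - 1 bases so that matches spanning the
    linearization breakpoint are found; a lookahead allows overlapping matches.
    """
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in motif.upper()) + "))")
    n = len(genome)
    extended = (genome + genome[: len(motif) - 1]).upper()
    return [(m.start(), m.group(1)) for m in pattern.finditer(extended) if m.start() < n]
```

To scan the complementary strand as well, the same function can be applied to the reverse complement of the genome.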

Fig. 6

a Genome map of Diaemus youngi-associated smacovirus-related virus. Genes encoding the replication-initiation protein (Rep) and capsid protein (Cap) are shown with arrows. The position of the nonanucleotide at the potential stem-loop structure is also indicated; and b Phylogenetic analysis of 69 translated Rep sequences of smacoviruses. The smacovirus described in the present study is highlighted with a circle. Two arrows indicate the genomes with unisense orientation. The evolutionary history was inferred by using the maximum likelihood method based on the LG model. A discrete gamma distribution was used to model evolutionary rate differences among sites (5 categories (+ G, parameter = 3.4474)). Amino acid sequences of Banana bunchy top virus (NP_604483) and Cardamom bushy dwarf virus (AHF47677) were used as outgroups representing the Nanoviridae family. All positions with less than 95% site coverage were eliminated. There were a total of 196 positions in the final dataset. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. Evolutionary analyses were conducted in MEGA6

The latest ICTV taxonomy mentions the existence of unidirectional smacoviruses, but they are not covered by the classification criteria. To the best of our knowledge, no unisense genome has yet been classified at the genus or species level. Thus, we conducted our analysis separately for each protein, and the virus was described as smacovirus-related.

A blast analysis of the complete genome showed that the most similar sequence was a unisense genome retrieved from a tracheal swab of a chicken in 2017 in the USA (GenBank accession number MN379594), and it had an identity of 88.28% in a query cover of 75%. In addition, 46 reference sequences were retrieved from GenBank, and 22 complete unisense genomes were included in the analysis. For the Rep phylogeny, amino acid sequences of Banana bunchy top virus (NP_604483) and Cardamom bushy dwarf virus (AHF47677) were added as outgroups representing the Nanoviridae family. Rep and Cap proteins were translated and aligned separately using ClustalW with gap opening and extension penalties of 5 and 1, respectively. Phylogenetic analyses were conducted using the maximum likelihood method within MEGA6.

The phylogenetic analysis of the Rep protein (Fig. 6b) showed that sequences derived from unisense genomes clustered in two separate groups, herein named smacovirus-related 1 and smacovirus-related 2, with high support values, one being highly divergent and closer to the outgroup. Our sequence, named isolate DiaemusSV, grouped within the smacovirus-related 1 cluster along with viruses associated with chicken, porcine, rodent, sewage, and other hosts (Fig. 6b).

The Rep pairwise amino acid identity among the smacovirus-related 1 ranged from 58.4 to 100% (Supplementary Table 4), with the sequence chicken-associated smacovirus (GenBank accession number MN379594) being the closest to DiaemusSV, with 95.3% identity. The identity to established genera ranged from 15.2 to 37.1%, with Porprismacovirus and Huchismacovirus, respectively.

Blast analysis of the Cap gene alone showed that its identity was highest with another chicken smacovirus detected in the USA in 2017 (GenBank accession number MN379623.1), with 70.18% identity in a query cover of 98%. Interestingly, this sequence is ambisense. The blastn/blastx analysis of the partial Rep contig (Supplementary Table 2) showed that it was similar to a CRESS genome detected in a minnow in 2017 in the USA (GenBank accession number MH617713.1) by the same study group that identified the chicken smacoviruses described above. A phylogenetic tree of the translated Cap was constructed using the respective genes of the Rep dataset and excluding the outgroup sequences. Consistent with the blast results, DiaemusSV grouped closely with a chicken smacovirus that belongs to the Porprismacovirus genus. The Cap phylogeny (Supplementary Fig. 5) showed marked incongruence with the Rep phylogeny, suggesting frequent recombination events. A recombination analysis was not conducted here because it requires the alignment of complete genomes, which is not possible between viruses with opposite senses.

Discussion

The present study analyzed the virome of D. youngi, a vampire bat that feeds on bird blood and is widespread in Latin America, although it was only recently observed for the first time in Rio Grande do Sul State, Brazil. To the best of our knowledge, this is the largest and most up-to-date study analyzing the virus diversity of the white-winged vampire bat. Recently, a study conducted in French Guiana described the diversity of herpesviruses in the same bat species [25]; interestingly, these viruses were not detected in the present work.

In the present study, we detected 38 virus-related complete genomes or partial sequences, most of which were circular DNA viruses classified as anellovirus, genomovirus, and smacovirus-related, together with RNA genomes belonging to the Paramyxoviridae family. A possible explanation for the prevalence of circular ssDNA viruses might be the so-called ‘minimal lifestyle’ observed in some animal species and environments, as suggested previously [42]. ssDNA viruses, especially those with circular genomes, seem to be widespread in some animal species and environments; torque teno viruses, for instance, are the most abundant component of the human virome [44].

Within the anelloviruses, we detected five new species within the previously proposed Xitorquevirus [20, 39], a genus not yet recognized by the taxonomic committee. Members of this group were first described from organs of a Tadarida brasiliensis bat also detected in Rio Grande do Sul State in 2013 [39]. Later, similar viruses were described in opossum (Didelphis albiventris) and bat species (D. rotundus and Carollia perspicillata) from the southeastern region of Brazil, and the Sigma genus was proposed [20]. In the present study, three anelloviruses were highly divergent and a new genus named Yodtorquevirus was proposed. Considering the distance-matrix values, we also proposed two different species: Torque teno diaemus youngi virus 6 and 7. Our findings highlight the prevalence of torque teno viruses in the white-winged vampire bat virome.

In 2018, the circular replication-initiation protein encoding single-stranded (CRESS) DNA viruses associated with eukaryotic hosts were classified by ICTV into six families: Circoviridae, Genomoviridae, Geminiviridae, and Nanoviridae and the recently added Bacilladnaviridae and Smacoviridae. Currently consisting of nine genera and 73 species, the Genomoviridae family is found in a wide range of body fluids, tissues, organisms, and environments, and to date, there is no evidence linking its members to pathogenesis. Thus, genomoviruses are usually named according to their source as “source-associated”, and this nomenclature is used herein. Genomoviruses have an ambisense genome organization with two major inversely arranged ORFs encoding the replication-associated protein (Rep) and the capsid protein (Cap). A conserved stem-loop structure required for viral replication is located between the 5′ ends of the two main ORFs.

To the best of our knowledge, this report presents the first detection of a genomovirus in white-winged vampire bats and the first detection of Gemykrogvirus in bats. Members of this family have already been detected in insectivorous and frugivorous bat species: Gemycirculavirus was detected in the intermediate roundleaf bat (Hipposideros larvatus) in China in 2016 (unpublished), in the Pacific flying fox (Pteropus tonganus) in Tonga in 2015 [45], in the Eastern bent-wing bat (Miniopterus fuliginosus) in China in 2012 [46], and in European vespertilionid bats [47]. Gemykibivirus was described in the greater horseshoe bat (Rhinolophus ferrumequinum) in China in 2013 [46] and in a Brazilian free-tailed bat (Tadarida brasiliensis) in Argentina in 2006.

The complete gemykrogvirus genome detected was surprisingly similar (94.1% genome-wide pairwise identity) to a virus previously identified in the feces of a healthy captive giant panda in 2017 in China [40]. Other gemykrogviruses related to our sequence were detected in sewage, bovines, poultry, and reindeer. Because genomoviruses are found in a wide variety of organisms and environments, the virus was likely not acquired directly from the bat food source (chicken blood) but rather from a common alimentary source found in the environment. To date, only fungi have been found to be definitive hosts of a genomovirus of the Gemycirculavirus genus [48]. Considering that our sample was a pool of organs, including intestines, we could not determine whether the virus was present in animal tissue or only in fecal matter.

Smacoviruses were first detected in wild-living chimpanzee stool [49] and later described in the feces of healthy and diarrheic pigs [50], cattle [51], poultry [52], humans [53], and insects. Although these viruses are thought to infect eukaryotes, their actual host has not yet been confirmed. To date, only the fecal archaeon Candidatus Methanomassiliicoccus intestinalis has been shown to be a candidate host [54]. However, it remains to be established whether smacoviruses infect mammalian cells and cause disease.

The Smacoviridae family is currently classified into six genera. Smacoviruses have 2.3–2.9 kb genomes that contain two major ORFs encoding Rep and Cap proteins, typically in an ambisense organization, although unisense genomes have been reported [55]. The genetic diversity observed within smacoviruses might be due to high mutation rates and intrafamilial recombination events in their genomes [56]. To the best of our knowledge, unisense genomes have not been classified or further characterized thus far, which is probably due to genome-wide alignment limitations required for analysis. Thus, we sought to analyze translated Reps and Caps separately. The phylogenetic analysis of the Rep protein showed that sequences derived from unisense genomes clustered in two separate groups with high support values, and one was highly divergent and closer to the outgroup. Considering the lack of guidelines for unisense genomes taxonomy, we classified unisense sequences, including our genome, as smacovirus-related 1 and smacovirus-related 2. The presence of smacovirus unisense genomes should be taken into consideration in further classification criteria.

Bats have recently been widely recognized as a potential source of virus spillover owing to the coronavirus pandemic caused by SARS-CoV-2, because the viruses most closely related to SARS-CoV-2 were of bat origin [12]. Notably, bats also host major mammalian paramyxoviruses, including Nipah and Hendra viruses, members of the Henipavirus genus that are responsible for highly transmissible and lethal respiratory outbreaks. Recently, three new genera of orthoparamyxoviruses have been defined by the ICTV (https://talk.ictvonline.org/taxonomy/), i.e., Jeilongvirus, Narmovirus, and Salemvirus, while hundreds of unclassified “bat paramyxoviruses” are being uploaded to the GenBank database and increasingly reported worldwide [57,58,59]. A cluster of bat paramyxoviruses has been proposed as a new genus named Shaanvirus [46], and together with the genus Jeilongvirus, they encompass a large number of bat- and rodent-associated viruses. In the present study, we detected three partial sequences of the L, M, and F genes of a bat paramyxovirus. The L gene encodes the polymerase protein; it is the most conserved gene of paramyxoviruses and is thus commonly used for classification. The phylogenetic tree of the partial L protein showed that it belongs to a highly divergent paramyxovirus within the Orthoparamyxovirinae subfamily, which could not be further classified. The M and F genes are known to be more variable, which may explain the incongruence among the phylogenies of the three fragments. Further attempts to amplify and sequence a larger fragment of the L gene and the remaining genes are necessary to elucidate the classification of this highly divergent paramyxovirus.

Conclusion

A large number of highly diverse circular ssDNA viruses were described, and a new anellovirus genus, Yodtorquevirus, and seven new torque teno species were proposed. Additionally, genomoviruses and smacovirus-related sequences were described for the first time in this bat species. A highly divergent paramyxovirus was detected; however, further studies are necessary to classify it. Currently, the zoonotic risk of these viruses appears low since their genomes do not display close phylogenetic relationships to viruses detected in humans; however, continuous surveillance is of paramount importance for prevention. Our study provides the first overview of the virome of the white-winged vampire bat and expands the known diversity of viruses in this species.