Introduction

The term “laboratory animals” generally refers to animals that are used in scientific experiments and have been domesticated for a long time and bred to meet certain scientific requirements [1]. They are often descendants of wild animals that have been artificially raised in groups and have developed susceptibility to different pathogens, so they can be responsible for outbreaks and epidemics of diseases [2]. Rabbits were one of the earliest animals used for experiments, playing a unique role in medical science research [3].

Viruses are an important cause of disease in laboratory animals [4]. Some viruses cause fatal disease or affect the animals in a way that can alter the outcome of an experiment. An even more serious issue is that some viruses have an extensive host range and can potentially cause disease in both humans and animals [5]. Examples include highly pathogenic influenza viruses [6], epidemic haemorrhagic fever viruses [7, 8], and rotaviruses [9].

In order to identify viruses that are carried by laboratory rabbits, we collected samples from rabbits at the Animal Center of Jiangsu University, China, and a viral metagenomic method was used to detect viral nucleic acid in the feces, mouth, blood and skin of these animals. Complete genome sequences were determined for some of these viruses, including members of the families Polyomaviridae, Picobirnaviridae, Parvoviridae, and Microviridae.

Materials and methods

Samples

Forty samples, including fecal (n = 10), oral (n = 10), blood (n = 10), and skin (n = 10) samples, were collected in October 2018 using cotton swabs from 10 laboratory rabbits bred at the Animal Center of Jiangsu University and stored at -80 °C. Each swab was placed in a 1.5-mL centrifuge tube with 500 μL of PBS and vortexed for 5 min at room temperature. The sample was then centrifuged for 5 min at 14,000 × g, and the supernatant was collected. The 40 sample supernatants were combined into eight pools, each composed of five samples of the same type, giving two pools per sample type. Sample collection and all experiments in the present study were performed with ethical approval from the Ethics Committee of Jiangsu University (reference number UJS2018029).
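The pooling scheme described above can be sketched as follows. This is an illustration only: the sample identifiers and the assignment of particular animals to each pool are assumptions, since the text specifies only the pool size and the sample types.

```python
SAMPLE_TYPES = ["fecal", "oral", "blood", "skin"]  # sample types named in the text
N_RABBITS = 10                                     # animals sampled in October 2018
POOL_SIZE = 5                                      # samples per pool


def build_pools(sample_types=SAMPLE_TYPES, n_rabbits=N_RABBITS, pool_size=POOL_SIZE):
    """Group samples of the same type into consecutive pools of `pool_size`."""
    pools = {}
    for sample_type in sample_types:
        # Hypothetical sample IDs such as "fecal_R01"; the naming is an assumption.
        samples = [f"{sample_type}_R{i:02d}" for i in range(1, n_rabbits + 1)]
        for start in range(0, len(samples), pool_size):
            pool_no = start // pool_size + 1
            pools[f"{sample_type}_pool{pool_no}"] = samples[start:start + pool_size]
    return pools


pools = build_pools()
for name, members in pools.items():
    print(name, members)
```

With these parameters the sketch yields eight pools of five samples each, two per sample type, matching the design stated above.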

Viral metagenomic analysis

Each supernatant pool (500 μL) was filtered through a 0.45-μm filter (Millipore) to remove eukaryotic- and bacterial-cell-sized particles. The filtrates enriched in viral particles were treated at 37 °C with a mixture of DNases (Turbo DNase from Ambion, Baseline-ZERO from Epicentre, and benzonase from Novagen) and RNase (Fermentas) to digest unprotected nucleic acid. The fecal samples were treated for 90 min, while other samples were treated for 60 min [10,11,12]. The remaining total nucleic acid was then isolated using a QIAamp Viral RNA Mini Kit (QIAGEN) according to the manufacturer’s protocol. For DNA library construction, dsDNA was synthesized from different viral templates. For dsRNA viruses, we used a reverse transcription kit (SuperScript III Reverse Transcriptase) to reverse transcribe RNA into cDNA, after which the product was denatured at 95 °C for 2 min and quickly placed on ice for at least 2 min. The DNA polymerase I large (Klenow) fragment was then added to synthesize the second strand of cDNA (dsDNA). For ssDNA viruses, ssDNA was converted to dsDNA using the Klenow reaction. The resulting dsDNA products were used to construct eight libraries, using a Nextera XT DNA Sample Preparation Kit (Illumina), and sequencing was carried out using an Illumina MiSeq platform with 250-bp paired ends with dual barcoding for each pool [13].

Bioinformatics analysis

Paired-end reads of 250 bp generated by MiSeq sequencing were debarcoded using vendor software from Illumina. An in-house analysis pipeline running on a 32-node Linux cluster was used to process the data. Clonal reads were removed, and low-sequencing-quality tails were trimmed using a Phred quality score of 10 as the threshold. Adaptors were trimmed using the default parameters in VecScreen, which uses NCBI BLASTn with specialized parameters designed for adapter removal. The cleaned reads were assembled de novo within each barcode group using the Ensemble assembler to combine them into longer contigs. The Ensemble strategy integrates the sequential use of various de Bruijn graph (DBG) and overlap-layout-consensus (OLC) assemblers by using a novel partitioned sub-assembly approach, combining results from multiple assemblers, including SOAPdenovo2, ABySS, MetaVelvet, and CAP3 (source code available at https://github.com/xutaodeng/EnsembleAssembler). The assembled contigs, along with singlets, were compared to an in-house viral proteome database using BLASTx with an E-value cutoff of <10⁻⁵ [14, 15].
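Two of the steps above, trimming of low-quality tails and filtering of BLASTx hits by E-value, can be illustrated with a minimal sketch. It is not the in-house pipeline: the 3'-end trimming rule, the Phred+33 quality encoding, and the 12-column tabular BLAST output layout (with the E-value in the eleventh column) are assumptions made for illustration.

```python
PHRED_OFFSET = 33   # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding
MIN_QUALITY = 10    # quality threshold stated in the text
MAX_EVALUE = 1e-5   # BLASTx E-value cutoff stated in the text


def trim_low_quality_tail(seq, qual, min_quality=MIN_QUALITY):
    """Remove bases from the 3' end while their Phred score is below the threshold."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_quality:
        end -= 1
    return seq[:end], qual[:end]


def filter_blast_hits(lines, max_evalue=MAX_EVALUE):
    """Keep hits from tabular BLAST output whose E-value is below the cutoff.

    Assumes the standard 12-column tabular layout, where column 11 (index 10)
    holds the E-value; lines with fewer columns are skipped.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue
        if float(fields[10]) < max_evalue:
            kept.append(fields)
    return kept


# Toy read: the last base has Phred score 2 and is trimmed; '+' (score 10) is kept.
print(trim_low_quality_tail("ACGT", "II+#"))
```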

PCR screening and genome sequencing

To assess the prevalence of the novel polyomavirus and bocavirus among laboratory rabbits, nested PCR was performed. The PCR conditions were as follows: 95 °C for 5 min; 31 cycles of 95 °C for 30 s, 50 °C (first round) or 55 °C (second round) for 30 s, and 72 °C for 40 s; and a final extension at 72 °C for 5 min. The enzyme used in the reaction system was the premixed enzyme rTaq (TaKaRa). The samples used in PCR screening included the 40 samples originally collected for library construction and another 84 samples, including oral (n = 28), blood (n = 28), and fecal (n = 28) samples, collected from 28 rabbits in the experimental animal center in May 2019. Primers used for PCR screening were designed based on the virus contigs assembled from the sequence reads in the libraries. Because bocaviruses showed a very high prevalence in these samples, another set of primers was also designed to confirm the positive results. The primer sequences for each virus are shown in Table 1.
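The cycling conditions given above can be written down as a small data structure, which also makes the programmed hold time per round easy to check. This is a sketch only: the class and function names are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # hold time in seconds


def nested_pcr_program(annealing_temp_c, cycles=31):
    """Build one round of the program; only the annealing temperature differs by round."""
    return {
        "initial_denaturation": Step(95, 5 * 60),
        "cycles": cycles,
        "cycle_steps": (Step(95, 30), Step(annealing_temp_c, 30), Step(72, 40)),
        "final_extension": Step(72, 5 * 60),
    }


def hold_time_seconds(program):
    """Sum the programmed hold times (ramp times are not included)."""
    per_cycle = sum(step.seconds for step in program["cycle_steps"])
    return (program["initial_denaturation"].seconds
            + program["cycles"] * per_cycle
            + program["final_extension"].seconds)


first_round = nested_pcr_program(50)   # 50 °C annealing in the first round
second_round = nested_pcr_program(55)  # 55 °C annealing in the second round
print(hold_time_seconds(first_round) / 60, "min per round")
```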

Table 1 Primers used in specific screening PCR for polyomaviruses and bocavirus in fecal, oral, blood, and skin samples from laboratory rabbits

After finding a novel polyomavirus sequence in an oral swab, inverse PCR was used to obtain its complete genome sequence. The inverse primers were designed based on a 381-base contig and a 1917-base contig assembled from the polyomaviral reads from the library. The inverse PCR primers used for bridging two gaps in the complete polyomavirus genome sequence are shown in Table 2. For all of the PCR steps, negative controls were included, and positive PCR bands were sequenced by the Sanger method. Representative images of a gel with DNA bands can be seen in Supplementary Material 2.

Table 2 Inverse primers used to generate the complete genome sequence of the novel polyomavirus

Phylogenetic analysis

Phylogenetic analysis was performed based on predicted amino acid sequences of viral proteins and included the closest viral relatives identified using a BLASTp search of the GenBank database, as well as representative members of related viral species or genera. Sequence alignment was performed using CLUSTAL X (version 2.1) with the default settings. Phylogenetic trees were generated using Bayesian inference (BI) in MrBayes 3.2 [16]. Run settings were adjusted for each set of sequences, and analyses were run until the average standard deviation of split frequencies was less than 0.01. A standard deviation below 0.05 was considered adequate for specific well-supported parts of the tree. Putative ORFs in the viral genome were predicted using Geneious 10.0. For polyomavirus, putative exons and introns were predicted using the NetGene2 Server at http://www.cbs.dtu.dk/services/NetGene2/.

Results

Overview of the virome

The eight libraries prepared from the 40 laboratory rabbit samples generated a total of 1,305,544 unique sequence reads (Supplementary Table S1), and the distribution of reads attributed to bacteria, fungi, viruses, etc. is shown in Fig. S1. Sequence reads were assembled de novo within each barcode group and compared to the GenBank non-redundant protein database using BLASTx. Analysis of the distribution of mammalian virus reads detected in each pool showed that 2,452 reads had sequence similarity to bocaviruses (family Parvoviridae), accounting for a major part of the total putative mammalian virus reads. As shown in Table S1, other mammalian virus sequences, in order of sequence read abundance, included coronaviruses (1,276 reads), polyomaviruses (259 reads), and picobirnaviruses (239 reads). In addition to mammalian viruses, bacterial viruses showing sequence similarity to members of the family Microviridae (3,890 reads) were also detected. The polyomavirus and picobirnavirus sequences, which showed low-level similarity to genome sequences available in the GenBank database, were then further characterized. The bocavirus from the rabbit fecal library is described in detail because it showed a high degree of sequence identity (>95%) to a bocavirus strain isolated from wolves. Coronaviruses detected in the rabbit fecal libraries showed high similarity (99% identity) to a known rabbit coronavirus strain, HKU14, belonging to the genus Betacoronavirus, so we also analyzed the rabbit coronavirus based on these virus reads.

A novel polyomavirus in oral samples

Polyomaviruses are small, non-enveloped, double-stranded circular DNA viruses that are widespread in nature and have been identified in birds, cattle, rabbits, rodents, bats, and humans [17]. They can infect numerous mammals with suppressed immune functions, causing sarcoma or cancer in multiple sites or organs in some hosts, but are not pathogenic to hosts with normal immune function, and they can also cause long-term persistent subclinical infections in immunocompetent hosts [18, 19]. The first human polyomaviruses were isolated in 1971 from immunocompromised patients [20]. The International Committee on Taxonomy of Viruses (ICTV) officially lists 102 species of polyomaviruses and divides the family into four genera: Alpha-, Beta-, Gamma-, and Deltapolyomavirus [21]. The genomes of polyomaviruses are roughly 5 kb in size and are known to code for five to nine proteins [22]. Transcription from one side of the viral origin of DNA replication (ori) results in mRNAs encoding the early, non-structural proteins [23], which can participate in the regulation of the cell cycle and induce cellular transformation or tumor formation in some cases and are therefore referred to as tumor (T) antigens. On the opposite side of the ori, three structural proteins, VP1, VP2, and VP3, are encoded by the late region. Some human viruses, including JC polyomavirus and BK polyomavirus, can also encode a multifunctional nonstructural protein in the 5′ region of the late mRNAs, which is called agnoprotein [24].

In this study, we detected a novel rabbit polyomavirus (named RabPyV) in the oral swab sample pool (rabbitoral01). In this library, two different contigs of 381 bp and 1917 bp, corresponding to the large T antigen and ORF3 of the polyomavirus, respectively, were generated. PCR screening with primers designed based on the large T antigen gene was performed on 124 samples, including the 40 samples used to make the libraries and 84 additional samples. PCR screening results indicated that 55 of the 124 (44.4%) samples were positive for the rabbit polyomavirus, including 20 oral swab samples, eight skin samples, 16 blood samples, and 13 fecal samples. At the animal level, 71.1% (27/38) of the rabbits were positive. The complete circular genome sequence of this novel polyomavirus was determined using an inverse PCR method.
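The prevalence figures quoted above follow from simple proportions of the reported totals, as the short sketch below shows. The function name is hypothetical; only the totals stated in the text are used.

```python
def prevalence(positive, total):
    """Return the percentage of positive units (samples or animals)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * positive / total


sample_level = prevalence(55, 124)  # positive samples / samples screened
animal_level = prevalence(27, 38)   # positive rabbits / rabbits screened

print(f"sample-level prevalence: {sample_level:.1f}%")
print(f"animal-level prevalence: {animal_level:.1f}%")
```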

The complete genome of the novel rabbit polyomavirus consisted of 5162 bp with an overall GC content of 37%, encoding five major proteins: the small T antigen (STAg) and the large T antigen (LTAg) in an early region on one strand, and the VP1, VP2, and VP3 proteins in a late region on the opposite strand. No agnoprotein gene was found [25]. The noncoding regulatory region between the beginning of the early region and the late region contained elements for viral DNA replication, including one copy of the consensus T-antigen-binding pentanucleotide GAGGC and two copies of its reverse complement, GCCTC [26]. An AT-rich non-coding region was also found between the ends of these two regions. The genome organization of this rabbit polyomavirus is shown in Fig. 1A. The sizes and positions of the predicted open reading frames (ORFs) of RabPyV are shown in Table S2. A more detailed analysis of the amino acid sequences of the proteins encoded by RabPyV revealed conserved sequences in functionally important regions. These proteins contained typical elements that are necessary for the life cycle of polyomaviruses [17]. The small T antigen shares an amino-terminal region with the large T antigen because these proteins initiate at the same ATG codon [26, 27]. Features such as conserved region 1 (cr1) and the conserved HPDKGG box are present in these antigens. The amino-terminal domain plays an essential role in viral replication and transformation.
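Genome features of the kind reported above (GC content and copies of the pentanucleotide GAGGC and its reverse complement) can be computed from a sequence string as in the following sketch. The toy sequence is invented for illustration and is not the RabPyV regulatory region; treating the sequence as circular reflects the circular polyomavirus genome.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_content(seq):
    """Return the percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def count_circular(seq, motif):
    """Count motif occurrences in a circular sequence, including across the join."""
    seq, motif = seq.upper(), motif.upper()
    extended = seq + seq[:len(motif) - 1]  # lets matches wrap around the origin
    return sum(1 for i in range(len(seq)) if extended.startswith(motif, i))


motif = "GAGGC"                         # T-antigen-binding pentanucleotide from the text
toy_region = "GAGGCATATGCCTCTTGCCTC"    # invented toy sequence, not real data

print(reverse_complement(motif))
print(count_circular(toy_region, motif))
print(count_circular(toy_region, reverse_complement(motif)))
```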

Fig. 1

Genome organization of the rabbit polyomavirus and phylogenetic analysis of the novel polyomavirus identified in oral samples from laboratory rabbits. (A) Genomic organization of the rabbit polyomavirus. (B) Phylogenetic tree based on the amino acid sequence of the LTAg protein. (C) Phylogenetic tree based on the amino acid sequence of VP1. (D) Phylogenetic tree based on the amino acid sequence of VP2. The polyomavirus identified in this study is labeled in red

To determine the genetic relationship between RabPyV and other related polyomaviruses, phylogenetic analysis was performed using the LTAg, VP1, and VP2 amino acid sequences of RabPyV, representative polyomaviruses of different genera, and viruses identified by a BLASTp search of the GenBank database as being related to RabPyV. Phylogenetic analysis based on LTAg amino acid sequences indicated that RabPyV is highly divergent from other polyomaviruses, and it clustered with three strains in the genus Gammapolyomavirus that had been isolated from Corvus monedula (CPyV, GenBank no. DQ192570), goose (GHPV, GenBank no. AY140984), and Eurasian bullfinch (PPyV, GenBank no. DQ192571) (Fig. 1B). Analysis of the amino acid sequences of VP1 and VP2 indicated that RabPyV clustered closely with members of the genus Betapolyomavirus (Fig. 1C and D), which included bank vole polyomavirus (BankPyV) and common vole polyomavirus (CommonPyV). The amino acid sequences of VP1 and VP2 shared the highest identity (70% and 60%, respectively) with those of CommonPyV (GenBank no. NC_028119). Trees based on different regions of the RabPyV genome showed discordant results, suggesting that this novel rabbit polyomavirus had undergone genetic recombination. To determine whether RabPyV is a recombinant, the complete genome sequences of CPyV (DQ192570), GHPV (AY140984), PPyV (DQ192571), CommonPyV (NC_028119), and BankPyV (KR612372) were aligned with that of RabPyV using CLUSTAL X, and recombination analysis was performed using the Recombination Detection Program 4.0 (RDP 4.0), which includes seven different methods: RDP, GENECONV, BootScan, MaxChi, Chimaera, SiSCan and 3Seq [28]. The results of a manual bootscan analysis suggested that RabPyV resulted from a recombination event between the major parent NC_028119 and the minor parent DQ192570, with a high degree of confidence (P = 3.662 × 10⁻⁶). The beginning breakpoint was at base 2596, and the ending breakpoint was at base 2781 in the alignment (Fig. 2).

Fig. 2

Bootscan evidence for recombination in the RabPyV genome

Picobirnaviruses in the fecal samples

Picobirnaviruses are small, non-enveloped viruses with a segmented genome of double-stranded RNA [29]. They have been detected in the feces of many different hosts, including humans, rabbits, dogs, pigs, rats, and birds [30]. Picobirnaviruses can infect hosts and cause diarrhea, and they are considered to be opportunistic gastrointestinal pathogens associated with clinical disease in humans [31, 32]. The genome of picobirnaviruses consists of two segments. The large segment is about 2.2-2.7 kb long and encodes the capsid protein, while the small one, which encodes the viral RNA-dependent RNA polymerase (RdRp), is about 1.2-1.9 kb long [33, 34].

We found a rabbit picobirnavirus (RPBV) in one of the fecal pools (rabbitfeces07). The RPBV sequence reads could be assembled into two long contigs, one of which was 1297 bp long, encoding a portion of the capsid protein, and the other one was 1305 bp long, encoding most of the RdRp protein. Because the amino acid sequence of the capsid protein was so divergent that a reliable phylogenetic tree could not be established, we used only the amino acid sequence of RdRp to perform phylogenetic analysis. Based on the RdRp of the novel RPBV, the best matches from a BLASTp search, and some available related strains from the GenBank database, a phylogenetic tree was constructed. The results indicated that this RPBV clustered with an unclassified picobirnavirus detected in feces of Portuguese wolves (GenBank no. KT934307), with which it shared 70% sequence identity (Fig. 3).
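Sequence identity values such as the one quoted above are computed from an alignment of two sequences. The following sketch shows one common convention, in which columns containing a gap in either sequence are excluded from the comparison. This convention is an assumption, since the text does not state how identity was calculated, and the toy sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment: 4 matches over 5 ungapped columns.
print(percent_identity("MKT-LV", "MRTALV"))
```

Other conventions (for example, dividing by the full alignment length or by the length of the shorter sequence) give different values for the same alignment, so the denominator should always be stated when identities are compared across studies.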

Fig. 3

Phylogenetic analysis of the picobirnavirus identified in the fecal samples of laboratory rabbits based on partial amino acid sequences of the RdRp protein. The picobirnavirus identified in this study is labeled in red

A bocavirus in the fecal samples

Bocaviruses are small, non-enveloped, linear single-stranded DNA viruses belonging to the family Parvoviridae [35]. Bocaviruses have been identified in humans and many mammals, including cats, dogs, and pigs [36]. These viruses usually infect the respiratory tract and intestines, causing diarrhea [37,38,39].

In this study, sequences showing similarity to bocaviruses were identified in all fecal and blood pools. PCR screening was then conducted with all 124 samples, showing that 59.7% of them (74/124) were positive for this bocavirus, including 26 oral swab samples, 10 skin samples, 15 blood samples, and 23 fecal samples, and all of the animals were positive. A complete genome sequence of this bocavirus, named “rabbit bocavirus” (RBoV), was generated by assembly of all 367 bocaviral reads in one fecal library (rabbitfeces07), which contained three complete ORFs after sequence assembly.

The complete genome was 5287 bp in length, with 161 bp in the 5′ UTR and 186 bp in the 3′ UTR (Fig. 4A). Phylogenetic analysis was performed based on the amino acid sequences of NS1 and VP1, and in both cases, RBoV clustered closely with a bocavirus strain isolated from Iberian wolves (GenBank no. NC_040533), with which it shared 99% and 81% amino acid sequence identity, respectively (Fig. 4B and C).

Fig. 4

Genomic organization of the rabbit bocavirus (RBoV) and phylogenetic analysis of the bocavirus identified in laboratory rabbits. (A) Genomic organization of RBoV (distance in nt). (B) Phylogenetic tree based on the amino acid sequence of NS1. (C) Phylogenetic tree based on the amino acid sequence of VP1. The bocavirus identified in this study is labeled in red

Microviruses in the fecal samples

In one fecal library (rabbitfeces07), a large number of sequence reads showed similarity to sequences from viruses belonging to the family Microviridae based on a BLASTx search. Because viruses of the family Microviridae have not previously been described in rabbit feces, the genome of this virus was also analyzed in detail. Assembly of these microviral sequence reads generated a nearly complete genome sequence 6352 bp in length that contained all of the major ORFs found in typical microviral genomes.

To assess the relationship of the rabbit microvirus to other microviruses, the major capsid protein (VP1) sequences were used for phylogenetic analysis. Reference amino acid sequences from members of the family Microviridae were obtained from the GenBank database, and the best matches from a BLASTp search were included. From this analysis, we found that the rabbit microvirus clustered with an unclassified microvirus isolate from mouse tissue (GenBank no. MH649088), with 56% amino acid sequence identity in the VP1 protein (Fig. 5).

Fig. 5

Phylogenetic analysis of the microvirus identified in fecal samples of laboratory rabbits based on partial amino acid sequences of the VP1 protein. The microvirus identified in this study is labeled in red

Coronaviruses in the fecal samples

Coronaviruses (CoVs) are enveloped, positive-sense single-stranded RNA viruses belonging to the family Coronaviridae, with genomes ranging from 26 to 32 kilobases in size that include a 5′ cap and a 3′ poly(A) tail. The untranslated regions (UTRs) at the 5′ and 3′ ends usually contain structural elements and are involved in replication and/or translation. The replicase gene, which occupies the 5′ two thirds of the CoV genome, encodes the non-structural proteins, and the remaining one third encodes four major structural proteins, including the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins [28, 40]. CoVs are classified into four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus. They can infect humans and many species of animals, causing respiratory, enteric, and nervous system diseases [41].

In our study, 1273 reads of coronavirus were detected in the fecal libraries, and these reads showed high similarity to rabbit coronavirus HKU14 (GenBank no. JN874561), a member of the genus Betacoronavirus, with 99% sequence identity. Phylogenetic analysis was performed based on RNA-dependent RNA polymerase (RdRp) and N protein amino acid sequences. Reference coronaviruses included the best matches from a BLASTp search as well as selected strains belonging to different genera. The phylogenetic tree based on the N protein indicated that this coronavirus clustered with other rabbit coronavirus strains (GenBank nos. JN874561, JN874560, JN874562), with 100%, 99.7% and 99.5% amino acid sequence identity, respectively (Fig. 6A). The phylogenetic tree based on partial RdRp protein sequences showed that this coronavirus clustered with other rabbit coronavirus strains (GenBank nos. JN874561, JN874559, JN874560), with 100%, 99.4% and 99.4% amino acid sequence identity, respectively (Fig. 6B).

Fig. 6

Phylogenetic analysis of the coronavirus identified in laboratory rabbits. (A) Phylogenetic tree based on the amino acid sequence of the N protein. (B) Phylogenetic tree based on the amino acid sequence of the RdRp protein

Discussion

The development of viral metagenomics facilitates the identification of novel viruses in animals and humans, and it has been used for characterizing viromes in mammals [42]. The viromes of laboratory animals form part of their biological background and can affect experiments using these animals. However, few studies have investigated viral populations in laboratory rabbits. In the present study, we used metagenomics to detect viral nucleic acids in the feces, mouth, blood, and skin of laboratory rabbits from an animal center.

Polyomaviruses can infect numerous mammals with suppressed immune functions and are highly adapted to grow in a particular species and tissue. Examples include JC polyomavirus (JCPyV), which was isolated from the brain of a patient who died of progressive multifocal leukoencephalopathy (PML) [20, 43], BK polyomavirus (BKPyV), which was isolated from urine [44], African elephant polyomavirus 1 (AelPyV-1) [45] from a fibroma in an African elephant, and raccoon polyomavirus (RacPyV) [46] from brain tumors in raccoons. In this study, a novel polyomavirus (RabPyV) was identified in an oral pool, and its complete genome sequence was determined. Based on the new criteria for establishment of new polyomavirus species from the ICTV Polyomavirus Study Group, RabPyV can be considered a new species of polyomavirus [21]. Phylogenetic analysis based on different regions of the RabPyV genome and recombination analysis of related complete sequences showed that this novel strain is a recombinant with CommonPyV (NC_028119) as its major parent and CPyV (DQ192570) as its minor parent. Recombination analysis based on related genomes suggested that RabPyV is a multiple recombinant between rodent-like and avian-like polyomaviruses. Phylogenetic analysis based on LTAg amino acid sequences showed that RabPyV clustered with CPyV, GHPV and PPyV, which belong to the genus Gammapolyomavirus, while phylogenetic analyses based on VP1 and VP2 both showed that RabPyV clustered closely with members of the genus Betapolyomavirus, including BankPyV and CommonPyV. CPyV, GHPV and PPyV are all bird polyomaviruses. In contrast to mammalian polyomaviruses, the polyomaviruses of birds are the causative agents of severe diseases with high mortality rates [47]. The recombinant strain RabPyV was obtained from healthy laboratory rabbits, and it remains to be determined whether the pathogenicity of this strain is higher than that of other mammalian polyomaviruses and whether it can cause cross-species infections between birds and mammals.

Picobirnaviruses have been detected in the feces of many hosts, including humans, rabbits, dogs, pigs, rats, and birds, and they may cause diarrhea [31]. A viral metagenomics study conducted in diarrheic free-ranging wolves showed for the first time that wolves are a potential reservoir for picobirnaviruses that might play a role as enteric pathogens. A phylogenetic analysis revealed that the RdRp of RPBV is closely related to that of a picobirnavirus detected in diarrheic feces from Portuguese wolves (GenBank no. KT934307). The wolf picobirnavirus strain was reported to be a possible reassortant [48], and we suspect that the picobirnavirus in the feces of wolves was not from the wolves themselves but from rabbits. Whether RPBV can cause diarrhea needs further investigation.

In this study, bocaviruses were identified in laboratory rabbits at a very high positive rate at the level of both samples and animals. All of the rabbits appeared normal, without clinical symptoms, suggesting that this bocavirus might belong to the normal viral flora of laboratory rabbits. Considering the amino acid sequence identity (99% and 81% for NS1 and VP1, respectively) between the bocavirus in this study and its closest relative (GenBank no. NC_040533), which was isolated from the feces of a wolf [49], it seems plausible that the bocavirus from the wolf feces might not have been a wolf virus but instead originated from wild rabbits eaten by wolves.

In conclusion, using metagenomic analysis, our study provides an overview of the fecal, oral, blood, and skin virome in laboratory rabbits, which included a polyomavirus, a picobirnavirus, a bocavirus, a microvirus and a coronavirus. These data will provide a genetic basis for the study of the occurrence, development, and transmission of these viruses and their association with disease. Whether these viruses can cause disease in laboratory rabbits remains to be studied.

Nucleotide sequence accession numbers

The genome sequences of the viruses described in detail in this article were deposited in the GenBank database under the accession numbers MT150088-MT150091 and MT780752-MT780753.

The raw sequence reads from the metagenomic libraries were deposited in the Sequence Read Archive of the GenBank database under the accession numbers SRX7845942, SRX7860111, SRX7860165, SRX7860217, SRX7860248, SRX7860346, SRX7845943, and SRX7860351.