Temperate phages integrate their nucleic acid into host bacterial genomes in a process termed lysogeny1,2. The phage genetic material residing in the host chromosome, referred to as a prophage, can be passed to daughter cells during cell division. Prophages may be induced under certain conditions (such as UV radiation or mitomycin C treatment), which activates the lytic cycle1,3,4,5. Thus, prophages are viewed as “dangerous molecular time bombs”6. DNA can be transferred among different hosts during phage infection through either the lytic or the lysogenic cycle1,2,3,4,5. Previous studies showed that more than 50% of marine isolates harbor prophages, which play a significant role in shaping the phenotypic traits of their hosts1,4,5.

Citromicrobium bathyomarinum strain JL354, a member of the aerobic anoxygenic phototrophic bacteria (AAPB), was isolated from surface water in the South China Sea7. The type strain C. bathyomarinum JF-1 was first isolated from deep-sea hydrothermal vent plume waters8. A complete prophage sequence was found in a preliminary genomic analysis of strain JL354. Two different phototrophic operons and a xanthorhodopsin-like protein co-exist in the same genome7,9,10. Phototrophic genes are thought to be subject to horizontal gene transfer among phototrophic microorganisms. The objectives of this study were to 1) induce the prophage from C. bathyomarinum JL354 and 2) compare its genome to those of homologous phages.


Discovery of the vB_CibM-P1 phage by induction from C. bathyomarinum JL354

A novel Mu-like phage-related gene cluster was discovered within the genomic sequence of C. bathyomarinum JL3547. To determine whether this phage was active, C. bathyomarinum JL354 in the exponential growth phase was treated with mitomycin C and flow cytometry was used to measure the induction of virus-like particles (VLPs). Within 30 hours, the quantity of VLPs increased by up to three orders of magnitude (Fig. 1), while the number of bacterial cells decreased by an order of magnitude. The induced VLPs had Myoviridae-like (i.e., T4-like) morphology, with polyhedral heads (capsids of approximately 60–100 nm) and tail fibers (Fig. 2).

Figure 1

Viral particle yield following mitomycin C induction of C. bathyomarinum JL354.

Flow cytometry counts of JL354 cells and virus-like particles were performed with (A) a mitomycin C-treated culture and (B) a control culture without mitomycin C.

Figure 2

Virus generated by mitomycin C induction of C. bathyomarinum JL354.

Scale bars, 50 nm.

To confirm whether the induced phage matched the vB_CibM-P1 prophage observed in the C. bathyomarinum JL354 genome, the induced viral DNA was re-sequenced. Although host DNA contamination was present, the read-rich region mapped to the predicted prophage position (Zheng et al., unpublished data).

Genomic structure of phage vB_CibM-P1

The complete genomic sequence of phage vB_CibM-P1 is ~38 kb, with a GC content of 66.0%, which is slightly higher than that of its host (65.0%). The vB_CibM-P1 GC content is much higher than that of bacteriophage Mu (52.0%) from E. coli MH5361 and slightly higher than that of the prophage found in Hoeflea phototrophica DFL-43 (61.7%). PHACTS, a novel tool for predicting whether the lifestyle of a phage is primarily lytic or lysogenic, predicted that vB_CibM-P1 is temperate11.
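
The GC comparison above rests on a simple base count. A minimal sketch of that calculation follows, assuming a plain nucleotide string as input; the function name and the toy sequence are illustrative only and are not taken from the phage data or from any tool used in this study.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Ambiguous bases (e.g. N) are ignored, so the denominator is A+C+G+T only.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Hypothetical toy sequence, not phage data: 8 of 10 bases are G or C.
toy = "GCGCGCATGC"
print(f"{gc_content(toy):.1f}% GC")
```

Applied to the full phage and host sequences, the same count would give the percentages compared in the text.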

In total, the vB_CibM-P1 genome contains 58 predicted open reading frames (ORFs), representing 97% of the entire genome (Table 1 and Fig. 3). Forty-three ORFs have ATG start codons, whereas 13 ORFs and 2 ORFs start with GTG and TTG codons, respectively. No tRNA genes were detected by tRNAscan-SE12. Thirty-three ORFs yielded significant hits (E value ≤ 1e-5, coverage ≥ 75%) in the GenBank database and 21 ORFs were assigned as unknown (Table 1).
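
The start-codon tally can be reproduced from any list of predicted ORF sequences. The sketch below assumes the ORFs are given 5'→3' on their coding strand; the toy ORFs are hypothetical, and only the three reported counts (43, 13 and 2) come from the text.

```python
from collections import Counter


def start_codon_usage(orfs):
    """Count the first three bases of each ORF (coding strand, 5'->3')."""
    return Counter(orf[:3].upper() for orf in orfs if len(orf) >= 3)


# Hypothetical toy ORFs, for illustration only.
toy_orfs = ["ATGAAATAA", "GTGCCCTGA", "atgGGGTAG", "TTGAAATAA"]
usage = start_codon_usage(toy_orfs)
print(usage)

# Consistency check on the counts reported in the text:
# ATG + GTG + TTG should equal the total number of predicted ORFs.
reported = {"ATG": 43, "GTG": 13, "TTG": 2}
assert sum(reported.values()) == 58
```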

Table 1 Gene Annotation List for vB_CibM-P1
Figure 3

Comparison of bacteriophage Mu (A), vB_CibM-P1 (B) and the prophage in Hoeflea phototrophica DFL-43 (C).

Pink, early expression genes; orange, heads; yellow, tails; red, GTA-like region; green, lysozyme genes; light gray, putative proteins.

Early expression region

The early expression region contains regulatory genes with associated lytic genes, followed by two structural regions encoding head and tail assembly proteins (Fig. 3, Table 1)13. The genome starts with the genes for the C-repressor and Ner (ORF 1 and ORF 2, respectively), which are involved in regulating lysogeny and lytic development. As in bacteriophage Mu, a bound C-repressor would repress the lytic cycle and favor the lysogenic cycle, whereas an unbound repressor would allow the lytic cycle to proceed13. Transposases A and B, which are also located in this region at positions 6661–9736, allow for integration (ORF 17) and transposition (ORF 18), respectively. ORF 17 encodes the phage integrase, which is most similar to the integrase in Roseomonas cervicalis ATCC 49957.

Head, tail and tail fiber assembly

The head proteins have high identity to those of classical Mu or Mu-like bacteriophages, which suggests that the heads are Myoviridae-like. Across the three phages compared (bacteriophage Mu, vB_CibM-P1 and the Mu-like prophage from Hoeflea phototrophica DFL-43), the genes encoding head assembly proteins lie within the most conserved region (Fig. 3). ORF 37 and ORF 42 are the two identifiable head proteins found in the genome (Table 1). ORF 37 is most similar to a protein of a Mu-like phage in Roseovarius sp. 217, another marine alphaproteobacterium (AAPB) found in the euphotic zone. The final head assembly protein, ORF 42, has the highest similarity (54%) with the Mu-like major head subunit of Hoeflea phototrophica DFL-4314.

ORFs 50–54 encode the major tail/tail fiber proteins; together they span 9 kb and represent 24% of the genome. However, only the tail tape measure protein appears to be similar to the Mu-like tail protein; the rest of the tail proteins have closer identity to those of Siphoviridae (e.g., Rhizobium phage 16-3)15.


The vB_CibM-P1 genome is highly mosaic in nature, but it is a novel representative of marine Mu-like phages or myoviruses. The genome has 58 predicted ORFs and 17% of the genome encodes identifiable Mu-like genes. Forty percent of the genome encoding predicted proteins has no match in the GenBank database, indicating both the novelty of this phage and how little is known about the genomes of tailed phages. Additional data are needed to confirm this variability in genome structure.

Mu-like phages, in general, have few predicted and conserved tail proteins, which is consistent with the hypothesis that tailed phages are genetic mosaics derived by multiple step-wise recombination exchanges that occur within a large gene pool16,17,18,19. Tail proteins are generally more variable due to their role as host-range determinants17,18. As in other Mu-like phages, the genes that contribute to tail morphology are quite diverse, and this diversity may result from a Red Queen process20. Under a Red Queen process, both the phage tail and the host entry receptors need to evolve at a highly consistent rate20, which could explain the high diversity and lack of conservation observed in this region.

Phylogenetic analysis of Mu-like phages

The major head gene was compared with those of many other Mu-like phages or prophages present in host genomes. The replicative nature of the genome in phage vB_CibM-P1 made it difficult to assign all sequences to a meaningful phylogeny because of their high mosaicism and variable genome structure13. Many Mu-like phages or prophages have been found in enteric bacteria (e.g., the Gamma group; Fig. 4)13. It appears that AAPB hosts containing Mu-like prophages are mainly found in aquatic environments (freshwater or marine; Alpha group, Fig. 4). The vB_CibM-P1 host C. bathyomarinum JL354 is a member of the Alpha group, with close identity to Hoeflea phototrophica DFL-43 (Fig. 4).

Figure 4

Neighbor joining phylogenetic tree based on the phage major head gene sequences.

The Mu-like prophage is commonly found in the Neisseria, Escherichia and Haemophilus genera. It has recently also been found in marine bacterial genomes. All bacteria listed had complete Mu-like prophage sequences in their genomes. Haemophilus haemolyticus M21639, NZ_AFQR00000000.1; Haemophilus influenzae 3655, NZ_AAZF00000000.1; Haemophilus ducreyi 35000HP, NC_002940.2; NZ_AQCL00000000.1; Escherichia coli 0.1288, NZ_AMVJ00000000.1; Neisseria weaveri LMG 5135, NZ_AFWQ00000000.1; Neisseria meningitidis ATCC 13091, NZ_AEEF00000000.1; Roseomonas cervicalis ATCC 49957, NZ_ADVL00000000.1; Hoeflea phototrophica DFL-43, NZ_ABIA00000000.2; Citromicrobium bathyomarinum JL354, NZ_ADAE00000000.1; Rhodopseudomonas palustris TIE-1, NC_011004.1; Oceanicola sp. S124, NZ_AFPM00000000.1; NC_009428.1; Marinomonas sp. MED121, NC_009654.1; Mariprofundus ferrooxydans PV-1, NZ_AATS00000000.1; Fulvimarina pelagi HTCC2506, NZ_AATP00000000.1. Bootstrap percentages (>50) from neighbor joining (above) and maximum likelihood (below) are shown in the tree. The scale bar represents 20% amino acid substitution.

Bacteriophage of vB_CibM-P1

Mu genome replication occurs by duplicative transposition into host chromosomes by random integration13,19,21,22. At the time of Mu genome packaging, the phage transposase cuts beyond the phage genome and can include approximately 1.8–3.0 kb of host DNA23,24. In our re-sequencing data of the induced phage vB_CibM-P1, none of the reads contained host DNA fragments at either end of the phage genome. Additional experiments are needed to determine whether our phage has transposition properties similar to those of phage Mu. Furthermore, our prophage could be induced by mitomycin C rather than by high temperature (42°C), whereas previous studies showed that phage Mu is usually induced by high temperature rather than mitomycin C13,24. All of the evidence indicates that phage vB_CibM-P1 is a novel myovirus.

Although temperate phages are widely found in marine bacteria2,4,6, this study is the first to report a temperate phage isolated from a marine aerobic anoxygenic phototrophic bacterium. With the increasing number of bacterial genomes in the GenBank database, more marine prophages are likely to be found and characterized, which will improve our understanding of the evolution and function of phage-host interactions. Comparison of phototrophic and closely related non-phototrophic genomes may provide clues to how temperate phages influence the evolution of photosynthetic genes.


Sequence analysis

The draft genome sequence of C. bathyomarinum JL354 is available under the GenBank accession number NZ_ADAE00000000. The JL354 genome was manually analyzed for the presence of a putative prophage. First, the genome was scanned for phage-related genes. When a phage-related gene cluster (ZP_06862281-ZP_06862290) was encountered, the surrounding genes were also examined. Putative prophage fragments were re-annotated by performing a PSI-BLAST search against the GenBank database for further analysis25. The beginning and end of a specific prophage genome were estimated based on the annotations of surrounding genes. The complete prophage genome was used to predict lifestyle with the Phage Classification Tool Set (PHACTS).

A tRNA search was performed in the prophage genome using tRNAscan-SE. Homologous prophages from Hoeflea phototrophica DFL-43 (NZ_ABIA00000000) and bacteriophage Mu (NC_000929) were used for comparative genomic analysis with the prophage found in C. bathyomarinum JL354.

The nearly complete major head gene (>900 bp) was used to construct a phylogenetic tree. All sequences collected from the NCBI database were aligned using Clustal X and phylogenetic trees were constructed using the neighbor-joining and maximum-likelihood algorithms in MEGA software 5.026. Tree topologies were evaluated by bootstrap resampling, with 1000 replicates for neighbor-joining and 100 replicates for maximum-likelihood.
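
The bootstrap step described above resamples alignment columns with replacement and rebuilds the tree from each pseudo-replicate. The sketch below illustrates only the resampling and a simple pairwise p-distance, in plain Python; it is not the MEGA implementation, tree construction is omitted, and the sequence names and toy alignment are hypothetical.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions with a gap ('-') in either sequence are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_alignment(aln: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    length = len(next(iter(aln.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in aln.items()}


# Hypothetical toy protein alignment, for illustration only.
toy_aln = {"phageA": "MKTAYIAK", "phageB": "MKTAYLAK", "phageC": "MRSAYLGK"}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_aln, rng) for _ in range(1000)]

# Distance between two toy sequences in the original alignment (1 of 8 sites differ).
print(p_distance(toy_aln["phageA"], toy_aln["phageB"]))
```

In a full analysis, a tree would be built from each replicate and the bootstrap percentage for a branch would be the share of replicate trees that recover it.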

Induction of the C. bathyomarinum JL354 prophage with mitomycin C

C. bathyomarinum JL354 was grown in rich organic (RO) medium8, containing 1.0 g yeast extract, 1.0 g Bacto Peptone and 1.0 g sodium acetate per liter of artificial seawater with vitamins and trace elements, at 28°C with a shaking speed of 180 rpm throughout the induction experiment. The induction process and sampling were performed according to the protocol described by Chen et al.3,27. Briefly, 10 ml of JL354 culture in exponential growth phase was transferred to 200 ml of fresh RO medium. After the subculture reached OD600 = 0.25, it was split into two flasks (100 ml in each); one was treated with mitomycin C (final concentration, 0.5 μg ml−1) and the other served as the control. After incubation for 30 min, the control and mitomycin C-treated cells were washed twice, centrifuged at 7,500 × g for 10 min and resuspended in 100 ml fresh RO broth. Samples (0.5 ml) for viral and bacterial counting were fixed with glutaraldehyde (final concentration: 2.5%) for 15 min in the dark and then stored in liquid nitrogen for flow cytometry analysis28,29.

Viral particles and bacterial count

Virus and bacterial counts were determined with an Altra Epics II flow cytometer (Beckman Coulter), which was equipped with an external quantitative sample injector (Harvard Apparatus PHD 2000). The bacterial and viral particles were identified and counted28. SYBR Green I (Molecular Probes) was used as a nucleic acid stain for bacterial identification in red fluorescence versus green fluorescence plots. Virus samples were analyzed separately from bacterial samples. Briefly, once thawed at 37°C, samples were diluted in 0.02-μm filtered TE (Tris-EDTA, pH = 8) buffer as needed, stained with the DNA dye SYBR Green I, heated for 10 min in the dark at 80°C and then cooled for 5 min prior to analysis28,29,30. Viruses were discriminated on the basis of their green DNA-dye fluorescence versus 90° angle light scatter28,29,30.
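
Converting gated event counts into particle concentrations requires the analyzed volume and the dilution factor. The sketch below shows the usual arithmetic under stated assumptions: the function name, flow rate, run time, dilution and event count are all hypothetical and are not values reported in this study.

```python
def particles_per_ml(events: int,
                     flow_rate_ul_per_min: float,
                     run_time_min: float,
                     dilution_factor: float = 1.0) -> float:
    """Estimate particle concentration in the undiluted sample.

    events               gated events (cells or VLPs) counted during the run
    flow_rate_ul_per_min delivery rate of the sample injector (assumed known)
    run_time_min         acquisition time
    dilution_factor      e.g. 100 for a 1:100 dilution in TE buffer
    """
    volume_ml = flow_rate_ul_per_min * run_time_min / 1000.0
    if volume_ml <= 0:
        raise ValueError("analyzed volume must be positive")
    return events * dilution_factor / volume_ml


# Hypothetical run: 5,000 events in 1 min at 50 ul/min from a 1:100 dilution.
conc = particles_per_ml(5000, 50.0, 1.0, 100.0)
print(f"{conc:.2e} particles per ml")
```

Any dilution introduced by the fixative or the stain would need to be folded into `dilution_factor` as well.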

Purification of induced phage

Phage particles in lysates were harvested and purified as described by Chen et al.3,27 with the following modifications. Phage lysates were treated with RNase A (final concentration: 2 μg ml−1) and DNase I (final concentration: 2 μg ml−1) at room temperature for 1 h. One liter of induced phage lysate was centrifuged at 12,000 × g for 15 min in a Hettich Rotina 38R centrifuge. Supernatants were filtered through a 0.45-μm pore size filter (type HA; Millipore) to remove host cells and cellular debris. Phage particles in the filtrate were treated with polyethylene glycol 8000 (final concentration: 100 g l−1) overnight at 4°C and precipitated by centrifugation at 12,000 × g for 90 min. Pellets were re-suspended in 6 ml of SM buffer (10 mM NaCl, 50 mM Tris, 10 mM MgSO4 and 0.1% gelatin) and incubated overnight at 4°C. The phage suspension was mixed with CsCl (final concentration, 0.6 g ml−1) and centrifuged for 24 h at 200,000 × g using an S55A rotor in a Micro-Ultracentrifuge CS-GX series centrifuge. Visible viral bands were extracted and then dialyzed (MWT = 30 kDa) twice in SM buffer overnight at 4°C. CsCl-purified phage lysates were stored at 4°C until further analysis. Virus morphology was examined by transmission electron microscopy (JEM 2100 HC).

Extraction of phage DNA

Purified phages were treated with a proteinase K cocktail (100 μg ml−1 proteinase K, 50 mM Tris, 25 mM EDTA and 1% SDS) at 55°C for 3 h. The phage DNA was extracted using phenol/chloroform/isoamyl alcohol (25:24:1 by volume)3,27. The DNA was then ethanol-precipitated, re-suspended in TE buffer (10 mM Tris, 1 mM EDTA) and re-sequenced with an Agilent 2100 Bioanalyzer (Ion Torrent PGM, Invitrogen, China). One library with an average size of 350 bp was constructed. After quality control, a total of 26,064 reads (4,314,206 bp) were used to assemble the viral genome.
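
As a rough check on sequencing depth, the read totals above can be turned into a mean read length and an upper bound on fold coverage. This sketch assumes a genome length of 38,000 bp (the text gives only "~38 kb") and treats every read as phage-derived, which overestimates coverage because some host DNA contamination was reported.

```python
READS = 26_064        # reads retained after quality control (from the text)
TOTAL_BP = 4_314_206  # total bases in those reads (from the text)
GENOME_BP = 38_000    # assumption: the "~38 kb" genome rounded to 38,000 bp

mean_read_len = TOTAL_BP / READS
max_fold_coverage = TOTAL_BP / GENOME_BP  # upper bound; ignores host-derived reads

print(f"mean read length: about {mean_read_len:.0f} bp")
print(f"coverage upper bound: about {max_fold_coverage:.0f}x")
```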