Complete nucleotide sequence and annotation of the temperate corynephage ϕ16 genome

The complete genome of ϕ16, a temperate corynephage from Corynebacterium glutamicum ATCC 21792, was sequenced and annotated (GenBank: KY250482). The electron microscopy study of ϕ16 virion confirmed that it belongs to the family Siphoviridae. The ϕ16 genome consists of a linear double-stranded DNA molecule of 58,200 bp (G+C = 52.2%) with protruding cohesive 3’-ends of 14 nt. Four major structural proteins were separated by SDS-PAGE and identified by peptide mass fingerprinting technique. Using bioinformatics analysis, 101 putative ORFs and 5 tRNA genes were predicted. Only 27 putative gene products could be assigned to known biological functions. The ϕ16 genome was divided into functional modules. Seven putative promoters and eight putative unidirectional intrinsic terminators were predicted. One site of putative «-1» programmed ribosomal frameshifting was proposed in the phage tail assembly genome region. C. glutamicum genetic tools could be broadened by exploiting the known integrase gene (gp33) and the newly identified excisionase gene (gp47), participating in site-specific recombination between ϕ16-attP/attB. Electronic supplementary material The online version of this article (doi:10.1007/s00705-017-3383-4) contains supplementary material, which is available to authorized users.

Corynebacterium glutamicum is widely used to produce commercially interesting bio-based substances [1]. Phages present a problem for the biotechnology industry and cause financial losses. Many corynephages have been isolated, but only a few of them have been completely sequenced (e.g. [2,3]). In the present study, the genome of ϕ16, a temperate corynephage from C. glutamicum (ATCC 21792), kindly provided by Dr. Trautwetter [4], was sequenced and annotated. This information could provide valuable evolutionary insights and be helpful for phage-resistant strain construction [5]. Different integrative vectors targeting different attB sites have been constructed based on known integrases of phages ϕAAU2 [6], beta [7], ϕ304L [8] and ϕ16 [9] from C. glutamicum strains. The newly identified ϕ16 excisionase, in addition to the known integrase gene, could be useful for broadening C. glutamicum genetic tools, e.g. for site-specific integration/excision of DNA fragments into bacterial chromosomes, as was demonstrated for other phage-based systems [10].
Transmission electron microscopy of the ϕ16 virion confirmed that it belongs to the family Siphoviridae, with a polyhedral head 73 nm in width and a non-contractile striated tail 336 nm in length and 14 nm in diameter (Fig. 1a), in line with the data of Dr. Trautwetter's group [4]. Subsequently, one of the putative ϕ16 gene products (gp), gp16, was identified as the tail tape measure protein (TMP). The relationship between the observed tail length (~336 nm) and the TMP size (2,151 aa), a ratio of 0.156 nm/aa, is reasonable [12].
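The quoted ratio follows directly from the two reported measurements. The short check below is only an illustration of that arithmetic; the function and variable names are ours and do not come from the original analysis.

```python
# Values reported in the text: observed tail length and TMP (gp16) size.
tail_length_nm = 336   # approximate tail length from electron microscopy
tmp_length_aa = 2151   # number of residues in the tape measure protein


def nm_per_residue(length_nm: float, residues: int) -> float:
    """Tail length accounted for by each TMP residue, in nm/aa."""
    return length_nm / residues


ratio = nm_per_residue(tail_length_nm, tmp_length_aa)
print(f"{ratio:.3f} nm/aa")  # prints "0.156 nm/aa"
```

A value of this size is close to the roughly 0.15 nm rise per residue of an α-helix, which is the usual basis for treating the TMP as a molecular ruler for tail length.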
Purified phage DNA was hydrodynamically sheared, and fragments of 2 to 5 kb in size were blunted, cloned, and sequenced by the Sanger method. A total of 346 individual DNA fragments were sequenced with an average read length of 750 ± 130 bp, giving a sequence coverage of ~6.6-fold. Gaps were closed by primer walking. The genome sequence was finalized by determining the cos sequence with a sequence run-off experiment and by comparing the nucleotide sequence with that of the ligated phage ends.
Based on homology to known phage proteins, functional domains, and mutual arrangement, putative functions were assigned to the products of 27 predicted ORFs (Supplementary Table 1). The entire genome was divided into six functional modules (Fig. 2). The DNA packaging module contains the small and large terminase subunits (gp2 and gp3) and the portal protein (gp4). The prohead protease (gp5), the major capsid and tail proteins (gp7 and gp13), the tail TMP (gp16), and the tail fiber protein (gp20) were predicted in the structural components and assembly module. In addition, four major structural proteins, namely the major capsid and tail proteins (gp7 and gp13) and two proteins of unknown function (gp6 and gp12), were detected by SDS-PAGE and identified by trypsin-based peptide mass fingerprinting (PMF) using an Ultraflex II LC-MALDI-TOF/TOF instrument (Bruker), performed according to Govorun et al. [14]. Furthermore, retention of the N-terminal Met residue in a trypsin-digested peptide from gp12, and its removal from the N-terminal peptides of gp6 and gp7, were consistent with the N-terminal processing rule [15] (Fig. 1b, c).
A putative site of -1 programmed ribosomal frameshifting (PRF) was found in the proposed tail assembly genes. It is composed of three functional elements: an internal SD (5'-GAGG-3'), a "slippery sequence" (5'-GGGGGAA-3') and an H-type pseudoknot RNA structure (Supplementary Fig. 4) [16]. The PRF was predicted to lead to the formation of a large fusion protein, gp14A.
Homologues of two known enzymes were predicted in the host lysis module: the endolysin (gp22) and the holin (gp23). The lysogeny control module was unusual: it contained two putative integrases (gp33 and gp28), the excisionase (gp47), the phage superinfection exclusion protein (gp34), and the transcriptional regulator (gp36). The nucleotide sequences of the ϕ16 int gene (corresponding to ORF33 in our annotation) and the ϕ16 attP site were deposited previously (GenBank: Y12471.1) [8]. They differ from the newly sequenced ORF33 at several positions owing to earlier sequencing errors, which resulted in differences in the structures of the corresponding gp(s) (Supplementary Fig. 5). We confirmed the ability of gp33 to provide site-specific integration of recombinant DNA into the ϕ16-attB of the C. glutamicum ATCC 21792c chromosome, which was previously shown by the Trautwetter group [9]. We also demonstrated experimentally the effective excision of integrated recombinant DNA when gp33 and gp47 are expressed simultaneously (manuscript in preparation). The experiments also showed that the second putative integrase, gp28, could not use the previously established ϕ16-attP site [8] for site-specific recombination. No other putative attP site was detected in the vicinity of ORF28 (unpublished result).
The analysis indicated that some modules of the ϕ16 genome had complete or partial homology to distinct chromosomal regions of four bacteria, leading us to hypothesize that these regions are uncharacterized prophages in bacterial genomes. Throughout large parts of the genome sequence, significant similarity was observed between ϕ16 and the hypothetical prophage of Corynebacterium pyruviciproducens ATCC BAA-1742, at both the nucleotide and the deduced protein sequence level. Significant similarity was also observed, throughout the whole genome, between ϕ16   Fig. 6). Neither protein nor nucleotide homology was observed between ϕ16 and BFK20 or P1201. However, the ϕ16 genome is not present in any bacterial chromosome in the database; therefore, the sequence of the entire ϕ16 phage genome was deposited for the first time in GenBank, under accession number KY250482.