Introduction

Strain YM16-304T (=NBRC 103263T) is the type strain of Ilumatobacter coccineum Matsumoto et al. 2013 [1]. I. coccineum YM16-304T and Ilumatobacter nonamiense YM16-303T (=NBRC 109120T) were isolated from a sand sample collected at Nonami Beach in Shimane Prefecture, Japan, and represent the second and third species of the genus Ilumatobacter [2]. Phylogenetic analysis showed that the genus Ilumatobacter branches near the presumed root of the class Actinobacteria (Figure 1), and thus may represent a new taxon outside the known family Acidimicrobiaceae, although the family accommodating this genus has not yet been decided [1,2]. Iamia majanohamensis is also located outside the family Acidimicrobiaceae, and is the sole genus and species in the family Iamiaceae [4]. Among the organisms for which whole genome sequences have been reported, the most closely related to YM16-304T is Acidimicrobium ferrooxidans DSM 10331 [5], which is phylogenetically distant from I. coccineum, with a 16S rRNA gene sequence similarity of 86%. No complete or draft genome information is currently available for the genera Ilumatobacter and Iamia. The taxon contains a number of uncultured bacteria, including putative marine sponge symbionts, and the complete genome sequence of strain YM16-304T would provide a basis for technological developments in the isolation of related uncultured actinobacteria and for a better understanding of them.

Figure 1.
Phylogenetic tree highlighting the position of I. coccineum strain YM16-304T relative to other representative type strains. The tree was constructed by the neighbor-joining method [3] based on a 1,326 bp alignment of 16S rRNA gene sequences. Corresponding INSDC accession numbers are shown in parentheses. Numbers at nodes indicate support values obtained from 1,000 bootstrap replications. Species for which complete genome sequences are available are labeled with two asterisks.

Classification and features

Strain YM16-304T is a mesophilic, neutrophilic, aerobic bacterium with features as summarized in Table 1. Growth occurs at 12–36 °C and at pH 7–8. Cells are non-motile rods (Figure 2). Gram staining was positive. Electron microscopy revealed neither flagella nor pili [1]. In agreement with this observation, the genome encodes no genes required for flagella, chemotaxis or pili.

Figure 2.
Scanning electron micrograph of I. coccineum strain YM16-304T

Table 1. Classification and general features of I. coccineum strain YM16-304T

Strain YM16-304T grows poorly even in artificial seawater medium supplemented with 0.5% peptone and 0.1% yeast extract under optimum growth conditions [1]. From the genome sequence, strain YM16-304T seems to possess either deficient or unusual pathways for the synthesis of some amino acids and other essential cellular components as outlined in the later section (Primary metabolism).

Genome sequencing information

Genome project history

I. coccineum YM16-304T was selected for sequencing because of its isolated phylogenetic position and characteristics which distinguish this strain from other described actinobacterial species. Table 2 presents the project information and its association with MIGS version 2.0 compliance [15].

Table 2. Project information

Growth conditions and DNA isolation

I. coccineum YM16-304T cells were grown in a 20 L volume at 27 °C in Difco™ Marine Broth 2216 (Becton Dickinson). DNA was isolated from 0.5 g of wet cells by manual extraction after lysis with lysozyme and SDS.

Genome sequencing and assembly

The genome of I. coccineum YM16-304T was sequenced using the conventional whole-genome shotgun sequencing method. Plasmid libraries with average insert sizes of 1.5 kb and 6.0 kb were generated in pTS1 (Nippon Gene) and pUC118 (TaKaRa) vectors, respectively, while a fosmid library with an average insert size of 38 kb was constructed in pCC1FOS (EPICENTRE) as described previously [16]. A total of 26,592 clones (18,432, 5,376 and 2,784 clones from libraries with 1.5 kb, 6.0 kb and 38 kb inserts, respectively) were sequenced from both ends of the inserts on an ABI 3730xl DNA Analyzer (Applied Biosystems). Sequence reads were trimmed at a Phred score threshold of 20 and assembled using the Phrap and CONSED assembly tools [12,13]. Gaps between contigs were closed by sequencing PCR products that bridge two neighboring contigs. Finally, each base of the genome was confirmed to have been sequenced from multiple clones, either from both directions with a Phrap quality score ≥ 70 or from one direction with a Phrap quality score ≥ 40.
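The trimming threshold can be read as an error probability: a Phred score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so a threshold of 20 corresponds to an estimated error of 1%. The sketch below illustrates that conversion together with a simple end-trimming rule. The text does not state the exact trimming algorithm, so the function names and the rule used here (strip bases below the threshold from both read ends only) are assumptions made for illustration.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to an estimated error probability."""
    return 10 ** (-q / 10)


def trim_low_quality_ends(quals, threshold=20):
    """Return (start, end) slice bounds after stripping low-quality ends.

    Hypothetical rule: bases with quality below `threshold` are removed
    from both ends of the read; interior bases are left untouched.
    """
    start, end = 0, len(quals)
    while start < end and quals[start] < threshold:
        start += 1
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return start, end


if __name__ == "__main__":
    print(phred_to_error_prob(20))
    # Made-up quality values for a short read.
    print(trim_low_quality_ends([5, 10, 25, 30, 18, 40, 12, 3]))
```

Under this rule an interior base below the threshold (the value 18 in the made-up read) is kept, which is one reason real trimmers often use windowed or cumulative criteria instead.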

Genome annotation

The complete sequence of the chromosome was analyzed using Glimmer3 [14] to predict protein-coding genes, tRNAscan-SE [17] and ARAGORN [18] to predict tRNA genes, and RNAmmer [19] to predict rRNA genes. The functions of predicted protein-coding genes were assigned manually, using the in-house genome annotation system OCSS (unpublished), by comparison with the UniProt [20], InterPro [21], HAMAP [22] and KEGG [23] databases.

Genome properties

The genome of I. coccineum YM16-304T consisted of a circular chromosome of 4,830,181 bp (Figure 3). The chromosome was predicted to contain 4,291 protein-coding genes, 46 tRNA genes and two copies of the rRNA operon. Protein functions were manually assigned based on UniProt and InterPro searches, and specific functions were predicted for 1,824 genes (42.5% of the protein-coding genes). Among the remaining predicted proteins, 520 (12.1%) were assigned to specific protein families, 1,535 (35.8%) were annotated as hypothetical proteins (showing sequence similarity to published proteins without known function), and 409 (9.5%) were annotated as hypothetical proteins (prediction only), lacking sequence similarity to published proteins. The average G+C content was 67.29%. The properties and statistics of the genome are summarized in Tables 3 and 4.
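The percentages quoted above follow directly from the gene counts. As a minimal sketch, the calculation can be reproduced as follows; the counts are taken from the text, while the dictionary keys and the function name are illustrative labels rather than terms from the annotation system.

```python
TOTAL_CDS = 4291  # predicted protein-coding genes

# Annotation categories and gene counts as reported in the text.
CATEGORY_COUNTS = {
    "specific function": 1824,
    "protein family": 520,
    "hypothetical (similar to published proteins)": 1535,
    "hypothetical (prediction only)": 409,
}


def category_percentages(counts, total):
    """Return each category's share of `total`, in percent, to one decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in category_percentages(CATEGORY_COUNTS, TOTAL_CDS).items():
        print(f"{name}: {pct}%")
```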

Figure 3.
Circular representation of the I. coccineum YM16-304T chromosome. From outside to the center: circles 1 and 2, predicted protein-coding genes on the forward and reverse strands, respectively; circle 3, tRNA genes; circle 4, rRNA operons; circle 5, G+C content; circle 6, GC skew. Predicted protein-coding genes are colored according to their assigned COG functional categories (see Table 4).
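For readers unfamiliar with the two innermost circles, G+C content is the fraction of G and C bases in a window, and GC skew is conventionally defined as (G − C)/(G + C). The sketch below computes both over non-overlapping windows. The window and step sizes used for the figure are not stated in the text, so the defaults here are assumptions, and the function names are illustrative.

```python
def gc_content(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq):
    """GC skew, (G - C) / (G + C); defined as 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def gc_profile(seq, window=10_000, step=10_000):
    """Return (start, gc_content, gc_skew) for each full window.

    Windows do not wrap around the chromosome origin, a simplification
    for a circular genome; window and step sizes are assumed values.
    """
    return [
        (start, gc_content(seq[start:start + window]), gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence, not taken from the genome.
    print(gc_profile("GGGCATAT", window=4, step=4))
```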

Table 3. Nucleotide content and gene count levels of the genome
Table 4. Number of genes associated with the 25 general COG functional categories

Primary metabolism

Strain YM16-304T lacks the dapE gene for succinyl-diaminopimelate desuccinylase (EC:3.5.1.18) in the biosynthesis pathway of lysine and diaminopimelic acids (DAPs). Instead, two candidate genes (YM304_26990 and YM304_19190) were identified for LL-DAP aminotransferase (EC:2.6.1.83, dapL), which constitutes an alternative DAP-lysine biosynthesis pathway (the DAP aminotransferase pathway [24,25]). The dapL gene is found in discrete lineages of Bacteria and Archaea, and is known to complement Escherichia coli dapD and dapE mutants, although the purified proteins favor the reverse reaction rather than the synthesis of LL-DAP [25].

Among the genes of the serine biosynthesis pathway, the serB gene for phosphoserine phosphatase (EC:3.1.3.3) was not identified by similarity searches. On the other hand, the thrH gene for phosphoserine / homoserine phosphotransferase [26] (EC:3.1.3.3, 2.7.1.39) was identified (YM304_28950). This suggests that the thrH gene product may be used for serine biosynthesis in place of the serB gene product.

Strain YM16-304T seems to possess an alternative form of the histidine biosynthesis pathway in which the hisB gene for the synthesis of L-histidinol is replaced by the hisN gene (YM304_12240), as typically found in Corynebacterium glutamicum ATCC13032 [27] and other actinomycetes. However, the hisE gene for phosphoribosyl-ATP pyrophosphohydrolase (EC:3.6.1.31), which is responsible for the second step in the histidine biosynthesis pathway, was not identified by similarity searches.
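The gap analysis in the three paragraphs above can be summarized as a small lookup: for each gene not found, is there a candidate substitute in the genome? The sketch below encodes only the observations stated in the text; the data layout and function name are hypothetical and are not part of the annotation system used in this study.

```python
# Observations transcribed from the text; the record structure is illustrative.
PATHWAY_GAPS = [
    {"pathway": "lysine/DAP", "missing": "dapE", "substitute": "dapL",
     "loci": ["YM304_26990", "YM304_19190"]},
    {"pathway": "serine", "missing": "serB", "substitute": "thrH",
     "loci": ["YM304_28950"]},
    {"pathway": "histidine", "missing": "hisB", "substitute": "hisN",
     "loci": ["YM304_12240"]},
    {"pathway": "histidine", "missing": "hisE", "substitute": None,
     "loci": []},
]


def unresolved_gaps(gaps):
    """Return the missing genes for which no candidate substitute is listed."""
    return [g["missing"] for g in gaps if g["substitute"] is None]


if __name__ == "__main__":
    print(unresolved_gaps(PATHWAY_GAPS))
```

On these inputs only hisE remains without a candidate substitute, consistent with the text.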

Metabolic reconstruction based on the annotation suggested that strain YM16-304T possesses the enzymes required for the biosynthesis of saturated fatty acids, unsaturated fatty acids, branched-chain fatty acids and carotenoids. The putative carotenoid biosynthesis pathway comprises crtE (YM304_37400), crtB (YM304_37420), crtI (YM304_37410) and crtLm (YM304_23780) gene homologs, which most probably synthesize γ-carotene from isopentenyl pyrophosphate derived from the non-mevalonate pathway [28–30]. Strain YM16-304T also possesses genes homologous to crtO (YM304_25370) and crtZ (YM304_38780), which have been suggested to be involved in the synthesis of ketolated carotenoids such as canthaxanthin and astaxanthin [30]. The actual products of this pathway need to be verified experimentally.

The annotation also suggests that strain YM16-304T possesses the enzymes required for the biosynthesis of menaquinone (vitamin K), vitamin B6, nicotinate and nicotinamide, pantothenate and CoA, lipoic acid, protoheme, mycothiol and coenzyme F420, while biosynthetic pathways for folate, thiamine, riboflavin, biotin and adenosylcobalamin (coenzyme B12) are either missing or incomplete.

Secondary metabolism

The phylogenetic analysis based on 16S rRNA gene sequences showed that the three species in the genus Ilumatobacter were closely related to some uncultured actinobacteria, including marine sponge symbionts [31]. Marine sponges are noted as a rich source of biologically active secondary metabolites, although the true producers of such compounds are suspected to be symbiotic bacteria [32–34]. However, only a small percentage of these symbiotic microorganisms are culturable [35,36], and genes involved in the synthesis of bioactive compounds, such as polyketide synthases, have often been isolated by metagenomic approaches [37,38].

The genome of strain YM16-304T seemed to encode only a limited number of secondary metabolic enzymes, i.e., two type I polyketide synthases (PKS). The genome contains no genes for type II or type III PKS, nor a gene for nonribosomal peptide synthetase.

The type I PKS genes of strain YM16-304T (YM304_13420, YM304_13410), together with the adjacent pfaD homolog (YM304_13430), most probably constitute an omega-3 polyunsaturated fatty acid (PUFA) synthase gene cluster. In some Gammaproteobacteria from marine sources, such as Photobacterium profundum strain SS9, omega-3 polyunsaturated fatty acids such as eicosapentaenoic acid (20:5n-3; EPA) and docosahexaenoic acid (22:6n-3; DHA) are known to be synthesized by a PKS system consisting of the pfaA, pfaB, pfaC and pfaD genes [39–41]. The domain organization of YM304_13420 was identical to that of the pfaA gene of P. profundum SS9. The N-terminal ketosynthase domain and the C-terminal dehydratase domains of YM304_13410 were similar to those of the pfaC gene of P. profundum, while the internal acyltransferase domain of YM304_13410 was moderately similar to that of the pfaB gene of P. profundum, representing a presumed chimeric form of PKS. As PfaB is the key enzyme determining the final product in EPA or DHA biosynthesis [42], the actual product of this PKS system needs to be clarified experimentally. Some PUFA-producing bacteria, such as Moritella marina MP-1 [39,43], were reported to require an additional gene, pfaE, encoding a phosphopantetheinyl transferase. However, the pfaE gene was not identified in strain YM16-304T. Other classes of phosphopantetheinyl transferase (e.g. YM304_08850) may substitute for the function of PfaE, similar to the case suggested in P. profundum SS9 [44].

Cell surface

Strain YM16-304T seemed to possess 13 ORFs containing the LPXTG motif (InterPro ID: IPR001899), the presumed sorting signal of cell surface proteins in Gram-positive bacteria [45]. Several cell surface proteins containing the LPXTG motif have been reported to act as adhesion factors known as microbial surface components recognizing adhesive matrix molecules (MSCRAMMs) [46]. The genome of strain YM16-304T contained an extracellular polysaccharide gene cluster (YM304_29910-YM304_30490), including a gene cluster for the synthesis of sialic acids (YM304_30300-YM304_30320), which are also crucial for cell adhesion [47]. These extracellular components might enable the bacterium to adhere to host tissues such as those of marine sponges.
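The LPXTG motif is the pentapeptide Leu-Pro-X-Thr-Gly, where X is any residue. The ORFs reported above were identified through the InterPro entry IPR001899; the sketch below is not that method but a much cruder stand-in, a plain regular-expression scan, and would be expected to over-predict because it ignores the sequence context around the motif. The function name and the example sequences are hypothetical.

```python
import re

# Leu-Pro-any-Thr-Gly; a simplification of the InterPro-based detection.
LPXTG_PATTERN = re.compile(r"LP.TG")


def find_lpxtg(protein_seq):
    """Return (0-based position, matched pentapeptide) for each LPXTG hit."""
    return [
        (m.start(), m.group())
        for m in LPXTG_PATTERN.finditer(protein_seq.upper())
    ]


if __name__ == "__main__":
    # Made-up peptide sequences for illustration only.
    print(find_lpxtg("MKKLPETGAAA"))
    print(find_lpxtg("MKKAAA"))
```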

Many marine bacteria use the Na+ cycle and require Na+ for their growth [48]. In these bacteria, Na+ is often used instead of H+ in the respiratory chain, ATP synthase, flagellar rotation and solute uptake [49]. Some bacteria can use both Na+ and H+, which expands the range of environments in which they can grow [50]. Strain YM16-304T was isolated from a sand sample collected at a beach and grows optimally in marine broth media, suggesting a marine origin. However, the gene products for the respiratory chain and ATP synthase were predicted by similarity search to be of the H+-dependent type. Na+-dependent amino acid symporters were not identified, nor were H+-dependent symporters.