Introduction

The genus Flavihumibacter was established in 2010 [1] and comprises three recognized species, Flavihumibacter petaseus T41T [1], Flavihumibacter cheonanensis WS16T [2] and Flavihumibacter solisilvae 3-3T [3], which were isolated from a subtropical rainforest soil, a shallow stream sediment and a forest soil, respectively. Flavihumibacter members are Gram-positive, rod-shaped, strictly aerobic, non-motile, yellow-pigmented bacteria. All strains contain phosphatidylethanolamine as the major polar lipid, menaquinone-7 as the major respiratory quinone, and iso-C15:0 and iso-C15:1 G as the principal fatty acids. In addition, the strains are oxidase- and catalase-positive and have a G + C content of 45.9–49.5 mol% [1–3].

To the best of our knowledge, no genomic information has been reported for members of the genus Flavihumibacter. In this study, we present the draft genome of F. solisilvae 3-3T. A polyphasic taxonomic study revealed that F. solisilvae 3-3T can utilize 33 sole carbon substrates, including 11 saccharides and 22 organic acids and amino acids [3]. Notably, this strain can utilize the aromatic compound 4-hydroxyphenylacetic acid as a sole carbon source, making it potentially applicable to environmental bioremediation [4–6]. In addition, this strain can utilize quinic acid as a sole carbon source. Quinic acid is a substrate for the synthesis of aromatic amino acids (phenylalanine, tyrosine and tryptophan) via the shikimate pathway. These aromatic amino acids are widely used as food additives, sweeteners and pharmaceutical intermediates [6, 7]. Analysis of the F. solisilvae 3-3T genome will provide the genomic basis for better understanding these mechanisms and for applying the strain more efficiently in industry and bioremediation.

Organism information

Classification and features

F. solisilvae 3-3T was isolated from forest soil of Bac Kan province in Vietnam [3]. The classification and features of F. solisilvae 3-3T are shown in Table 1. A maximum-likelihood tree was constructed based on the 16S rRNA gene sequences using MEGA 5.0 [8]. The bootstrap values were calculated based on 1,000 replications and distances were calculated in accordance with Kimura’s two-parameter method [9]. The phylogenetic tree showed that F. solisilvae 3-3T was clustered with the other Flavihumibacter members (Fig. 1).
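As an illustration of the distance measure named above, the following minimal Python sketch computes the Kimura two-parameter distance for one pair of aligned sequences, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. It is not the MEGA implementation; the function name and the choice to skip gapped or ambiguous columns are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns containing anything other than A, C, G or T in either
    sequence are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable positions")

    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy usage: one transition in ten aligned positions.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Because transitions and transversions are weighted separately, the same number of raw differences yields slightly different distances depending on their type.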

Table 1 Classification and general features of F. solisilvae 3-3T according to the MIGS recommendations [22]
Fig. 1

A 16S rRNA gene-based ML phylogenetic tree showing the phylogenetic position of F. solisilvae 3-3T. Bootstrap values (>50 %) based on 1,000 replications are shown at branch nodes. Bar, 1 substitution per 100 nucleotide positions. Sphingobacterium alimentarium DSM 22362T is used as the outgroup. The GenBank accession numbers are shown in parentheses.

Cells of F. solisilvae 3-3T (Fig. 2) are Gram-positive, aerobic, non-motile and rod-shaped. Colonies are yellow due to the production of flexirubin-type pigments [10]. F. solisilvae 3-3T grows well on NA and R2A agar (optimum), but does not grow on LB or TSA agar [3]. It can hydrolyze aesculin, gelatin, casein and tyrosine [3]. F. solisilvae 3-3T can also utilize various carbohydrate substrates (Table 1) and produces several glycosyl hydrolases, such as β-N-acetylhexosaminidase, α-galactosidase, β-glucosidase, β-galactosidase, α-fucosidase, α-mannosidase and α-glucosidase [3].

Fig. 2

A transmission electron micrograph of a F. solisilvae 3-3T cell. The bar indicates 0.5 μm.

F. solisilvae 3-3T contains iso-C15:0, iso-C15:1 G and summed feature 3 (C16:1 ω6c/C16:1 ω7c) as the principal fatty acids and MK-7 as the major respiratory quinone. The major polar lipids are PE, three unidentified aminolipids and three unidentified lipids [3].

Genome sequencing information

Genome project history

F. solisilvae 3-3T was selected for sequencing based on its taxonomic representativeness and its potential application in the food industry and in bioremediation. The genome of F. solisilvae 3-3T was sequenced at Wuhan Bio-Broad Co., Ltd, Hubei, China. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JSVC00000000. The version described in this paper is version JSVC00000000.1. A summary of the genomic sequencing project information is given in Table 2.

Table 2 Project information of F. solisilvae 3-3T

Growth conditions and genomic DNA preparation

F. solisilvae 3-3T was grown aerobically in 50 ml R2A broth at 28 °C for 24 h with shaking at 160 rpm. About 20 mg of cells were harvested by centrifugation, suspended in normal saline and then lysed using lysozyme. The DNA was obtained using the QIAamp kit according to the manufacturer’s instructions (Qiagen, Germany).

Genome sequencing and assembly

The genome of F. solisilvae 3-3T was sequenced using Illumina HiSeq 2000 technology [11] with a paired-end library strategy (300 bp insert size). TruSeq DNA Sample Preparation Kits are used to prepare DNA libraries with insert sizes of 300–500 bp for single, paired-end and multiplexed sequencing. The protocol supports shearing of 1 μg of DNA by either sonication or nebulization [12]. Sequencing generated 7,041,525 reads totaling 1,422,388,050 bp of data, with an average coverage of 250 ×. Sequence assembly and quality assessment were performed using Velvet v.1.2.10 [13]. Finally, the reads were assembled into 75 contigs (> 200 bp) with a total size of 5.41 Mbp.
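The read statistics above can be sanity-checked with simple arithmetic. The sketch below, which is not part of the authors' workflow, divides the reported base total by the draft assembly size (5,410,659 bp, as given under Genome properties) to estimate mean sequencing depth, and by the read count to estimate mean bases per read entry. The function and constant names are chosen for this example only.

```python
# Figures reported in the text.
READ_COUNT = 7_041_525
TOTAL_BASES = 1_422_388_050
ASSEMBLY_SIZE = 5_410_659  # bp, draft assembly size


def mean_coverage(total_bases: int, genome_size: int) -> float:
    """Mean sequencing depth: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size


def mean_bases_per_read(total_bases: int, read_count: int) -> float:
    """Average number of bases contributed by each counted read entry."""
    if read_count <= 0:
        raise ValueError("read count must be positive")
    return total_bases / read_count


coverage = mean_coverage(TOTAL_BASES, ASSEMBLY_SIZE)
bases_per_read = mean_bases_per_read(TOTAL_BASES, READ_COUNT)

print(f"mean coverage: {coverage:.0f}x")
print(f"mean bases per read entry: {bases_per_read:.0f}")
```

With these inputs the estimate comes out at roughly 263 ×, in the same range as the reported 250 ×. The source does not state how the reported value was derived, so the small difference may reflect rounding or read filtering. The base total corresponds to 202 bases per counted read entry.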

Genome annotation

Genome annotation was performed through the NCBI Prokaryotic Genome Annotation Pipeline, which combines the Best-Placed reference protein set and the gene caller GeneMarkS+. The WebMGA server [14] with an E-value cutoff of 1e-3 was used to assign COGs. The translated predicted CDSs were also searched against the Pfam protein families database [15]. TMHMM Server v.2.0 [16], SignalP 4.1 Server [17] and the CRISPRFinder program [18] were used to predict transmembrane helices, signal peptides and CRISPRs in the genome, respectively. Metabolic pathway analysis was performed using KEGG (Kyoto Encyclopedia of Genes and Genomes) [19].

Genome properties

The draft genome of F. solisilvae 3-3T is 5,410,659 bp in size, with a GC content of 47 %, and consists of 75 contigs. Of a total of 4,698 genes, 4,215 (89.72 %) are protein-coding genes, 439 (9.34 %) are pseudogenes and 44 (0.94 %) are RNA-encoding genes. The genome properties and statistics are shown in Table 3 and Fig. 3. Altogether, 3,137 (74.42 %) of the protein-coding genes are assigned to COG functional categories (Table 4).
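The percentages above follow directly from the gene counts. The short sketch below recomputes them as a consistency check: the three gene classes are expressed as shares of all genes, and the COG-assigned genes as a share of the protein-coding genes. The variable names are chosen for this example only.

```python
# Gene counts reported in the text.
gene_counts = {
    "protein-coding": 4215,
    "pseudogene": 439,
    "RNA": 44,
}
COG_ASSIGNED = 3137  # protein-coding genes placed in a COG category

total_genes = sum(gene_counts.values())

# Share of each gene class among all genes, in percent.
class_share = {
    name: round(100 * count / total_genes, 2)
    for name, count in gene_counts.items()
}

# Share of protein-coding genes with a COG assignment, in percent.
cog_share = round(100 * COG_ASSIGNED / gene_counts["protein-coding"], 2)

print(total_genes)
print(class_share)
print(cog_share)
```

The three classes sum to the reported total of 4,698 genes, and the recomputed shares match the values given in the text.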

Table 3 Genome statistics of F. solisilvae 3-3T
Fig. 3

A graphical circular map of the F. solisilvae 3-3T genome. From outside to center, rings 1 and 4 show protein-coding genes colored by COG categories on the forward and reverse strands; rings 2 and 3 denote genes on the forward and reverse strands; ring 5 shows the G + C content plot, and the innermost ring represents the GC skew.
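The two innermost rings of such a map plot G + C content and GC skew along the sequence. The sketch below shows the usual definitions, G + C content as (G + C)/(A + C + G + T) and GC skew as (G - C)/(G + C), evaluated over sliding windows. The source does not state which tool or window size produced Fig. 3, so the function names, the window parameters and the toy sequence are assumptions made for illustration.

```python
from typing import Iterator, Tuple


def gc_stats(seq: str) -> Tuple[float, float]:
    """Return (G+C content, GC skew) for one sequence window."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    a, t = seq.count("A"), seq.count("T")
    acgt = a + c + g + t
    gc_content = (g + c) / acgt if acgt else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return gc_content, gc_skew


def sliding_windows(
    seq: str, window: int, step: int
) -> Iterator[Tuple[int, float, float]]:
    """Yield (start, G+C content, GC skew) for each full window."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, len(seq) - window + 1, step):
        content, skew = gc_stats(seq[start:start + window])
        yield start, content, skew


# Toy usage on a short made-up sequence.
for start, content, skew in sliding_windows("GGGGCCCCATAT", window=4, step=4):
    print(start, content, skew)
```

In a real analysis the windows would be run over the concatenated contigs or per contig, and the window size would be chosen to suit the genome size.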

Table 4 Number of genes associated with general COG functional categories of F. solisilvae 3-3T genome

Insights from the genome sequence

F. solisilvae 3-3T can grow on 33 sole carbon substrates, including saccharides, organic acids and amino acids [3] (Table 1). Analysis of the genome reveals that this strain possesses putative enzymes of central carbohydrate metabolism that assimilate these carbon sources through different metabolic pathways [20]. Putative enzymes responsible for the utilization of 20 of these sole carbon sources were found in the genome (Table 5). All key enzymes of the Embden-Meyerhof-Parnas pathway (glucokinase, pyruvate kinase and 6-phosphofructokinase) and of the TCA cycle are present in F. solisilvae 3-3T. The key enzymes of the pentose phosphate pathway (glucose-6-phosphate dehydrogenase, 6-phosphogluconolactonase and 6-phosphogluconate dehydrogenase) were also found.

Table 5 Putative enzymes responsible for the utilization of different sole carbon sources in the genome of F. solisilvae 3-3T

The presence of 4-hydroxyphenylpyruvate dioxygenase (KIC95062), homogentisate 1,2-dioxygenase (KIC93392) and other related enzymes suggests that 4-hydroxyphenylacetic acid is degraded via the homogentisic acid pathway [21]. In addition, the presence of 3-dehydroquinate dehydratase (KIC93382), shikimate dehydrogenase (KIC92987), shikimate kinase (KIC93265), 3-phosphoshikimate 1-carboxyvinyltransferase (KIC94147) and chorismate synthase (KIC94148) indicates that F. solisilvae 3-3T can probably utilize quinic acid to synthesize the three aromatic amino acids (tryptophan, tyrosine and phenylalanine) via the shikimate pathway [7].
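The reasoning behind the shikimate pathway inference can be made explicit as a simple completeness check: list the consecutive enzymatic steps from 3-dehydroquinate to chorismate and confirm that each has an annotated protein. The sketch below uses only the enzymes and accessions named above. The data layout and function name are illustrative assumptions, and the check does not cover the upstream conversion of quinic acid to 3-dehydroquinate or the downstream branches to the individual amino acids.

```python
# Consecutive steps from 3-dehydroquinate to chorismate: (enzyme, product).
SHIKIMATE_STEPS = [
    ("3-dehydroquinate dehydratase", "3-dehydroshikimate"),
    ("shikimate dehydrogenase", "shikimate"),
    ("shikimate kinase", "shikimate 3-phosphate"),
    ("3-phosphoshikimate 1-carboxyvinyltransferase",
     "5-enolpyruvylshikimate 3-phosphate"),
    ("chorismate synthase", "chorismate"),
]

# Annotated proteins reported in the text (enzyme -> accession).
ANNOTATED = {
    "3-dehydroquinate dehydratase": "KIC93382",
    "shikimate dehydrogenase": "KIC92987",
    "shikimate kinase": "KIC93265",
    "3-phosphoshikimate 1-carboxyvinyltransferase": "KIC94147",
    "chorismate synthase": "KIC94148",
}


def missing_steps(steps, annotated):
    """Return the enzymes in `steps` that have no annotated protein."""
    return [enzyme for enzyme, _product in steps if enzyme not in annotated]


gaps = missing_steps(SHIKIMATE_STEPS, ANNOTATED)
if gaps:
    print("missing:", ", ".join(gaps))
else:
    for enzyme, product in SHIKIMATE_STEPS:
        print(f"{ANNOTATED[enzyme]}  {enzyme} -> {product}")
```

A check of this kind only shows that candidate genes are present; it says nothing about expression or enzyme activity, which is why the text states the conclusion as probable.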

Conclusion

To the best of our knowledge, this report provides the first genomic information for the genus Flavihumibacter. Analysis of the genome shows a high correlation between genotype and phenotype. The genome encodes many key proteins of central carbohydrate metabolism, which provide the genomic basis for the utilization of various carbon sources. In addition, analysis of the genome indicates that this strain has potential applications in the production of aromatic amino acids and in environmental bioremediation.