Introduction

Strain JSM 078169T (= DSM 21076 = KCTC 22279 = CCTCC AB 208031) is the type strain of the species Halomonas zhanjiangensis [1], one of 84 species with a validly published name in the genus Halomonas [2], family Halomonadaceae [3]. The family Halomonadaceae currently comprises thirteen genera (Aidingimonas, Carnimonas, Chromohalobacter, Cobetia, Halomonas, Halotalea, Halovibrio, Kushneria, Marinospirillum, Modicisalibacter, Candidatus Portiera, Salinicola and Zymobacter), with Halomonas being the largest genus in this family [3–6]. Members of the genus Halomonas have been isolated from various saline environments and show halophilic characteristics [7–11]. Strain JSM 078169T was originally isolated from a sea urchin (Hemicentrotus pulcherrimus) that was collected from the South China Sea. The genus name was derived from the Greek words ‘halos’ meaning ‘salt’ and ‘monas’ meaning ‘monad’, yielding the Neo-Latin word ‘halomonas’ [2]; the species epithet was derived from the Latin word ‘zhanjiangensis’, of Zhanjiang, a city in China near where the sample was collected [1]. Strain JSM 078169T was found to assimilate several mono- and disaccharides and to produce numerous acid and alkaline phosphatases, leucine arylamidase, naphthol-ASBI-phosphohydrolase and valine arylamidase [1]. There are no PubMed records that document the use of this strain in any biotechnological studies; only comparative analyses performed for the description of later members of the genus Halomonas are recorded. However, the NamesforLife [12] database reports at least 70 patents in which Halomonas spp. are referenced. Here we present a summary classification and a set of features for H. zhanjiangensis JSM 078169T, together with the description of the genomic sequencing and annotation of DSM 21076.

Classification and features

16S rRNA analysis

The original assembly of the genome did not contain longer stretches of 16S rRNA copies. Therefore, a 1,413 bp long fragment of the 16S rRNA gene was later patched into the genome sequence assembly. This almost full-length version of the 16S rRNA sequence was compared using NCBI BLAST [13,14] under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent version of the Greengenes database [15], and the relative frequencies of taxa and unidentified clones (or strains) were calculated from the BLAST scores. The most frequently occurring genus was Halomonas (74.8%), and unidentified clones or isolates represented 25.5% of the total BLAST results. Apart from sequences of representatives of the genus Halomonas, no sequences from other genera were observed in the BLAST search. The highest degree of sequence similarity was reported with H. alkaliantarctica str. CRSS.
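The frequency calculation described above can be illustrated with a minimal sketch. It assumes that each taxon's share is its summed BLAST score divided by the summed score of all hits; the exact weighting used in the analysis is not stated in the text, and the function name and the example hits below are hypothetical.

```python
from collections import defaultdict

def taxon_frequencies(hits):
    """Return each taxon's share (in percent) of the summed BLAST score.

    `hits` is an iterable of (taxon_label, score) pairs. Score-weighted
    shares are an assumption; the text only says "calculated from the
    BLAST scores".
    """
    totals = defaultdict(float)
    for taxon, score in hits:
        totals[taxon] += score
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("no scored hits to summarize")
    return {taxon: 100.0 * s / grand_total for taxon, s in totals.items()}

# Hypothetical hits for illustration only; these are not data from the study.
example_hits = [
    ("Halomonas", 2500.0),
    ("Halomonas", 2400.0),
    ("unidentified", 2450.0),
    ("Halomonas", 2450.0),
]
freqs = taxon_frequencies(example_hits)
for taxon, share in sorted(freqs.items()):
    print(f"{taxon}: {share:.1f}%")
```

With real data, the pairs would come from the parsed BLAST report, with the genus (or "unidentified") taken from the Greengenes annotation of each hit.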

Figure 1 shows the phylogenetic neighborhood of H. zhanjiangensis JSM 078169T in a tree based on 16S rRNA genes. The 1,413 bp long sequence fragment of the 16S rRNA gene differs by three nucleotides from the previously published 16S rRNA sequence (FJ429198). The tree provided a precise insight into the nomenclature and classification of members of the genus Halomonas. The phylogenetic analysis showed that strain H. zhanjiangensis JSM 078169T was most closely related to H. nanhaiensis YIM M 13059T with 98.3% sequence similarity.

Figure 1.

Phylogenetic tree highlighting the position of H. zhanjiangensis relative to the most closely related type strains of the other species within the family Halomonadaceae. All 16S rRNA gene sequences of the type strains within the genus Halomonas were included and combined with the representative 16S rRNA gene sequences of the type species of the other genera, according to the most recent release of the EzTaxon database. The tree was inferred from 1,381 aligned characters [16] under the neighbor-joining (NJ) [17] and maximum-likelihood (ML) [18] methods with 1,000 randomly selected bootstrap replicates using MEGA version 5.2 [19]. The branches are scaled in terms of the expected number of substitutions per site. Numbers adjacent to the branches are support values from 1,000 NJ bootstrap (left) and from 1,000 ML bootstrap (right) replicates [20] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [21] are labeled with one asterisk, those also listed as ‘Complete and Published’ with two asterisks [22].

Morphology and physiology

H. zhanjiangensis JSM 078169T is a Gram-negative-staining, non-sporulating, strictly aerobic (Table 1), catalase-positive, oxidase-negative and slightly halophilic bacterium that reduces nitrate [1]. Cells of JSM 078169T are short rods (0.4–0.7 µm × 0.6–1.0 µm) and motile with peritrichous flagella (not visible in Figure 2). Colonies are yellow-pigmented, flat and non-translucent with glistening surfaces and circular or slightly irregular margins, 2–3 mm in diameter after incubation on Marine Agar (MA) at 28 °C for 3–5 days. No diffusible pigments are produced. Growth occurs at 4–40 °C with an optimum at 25–30 °C, and at pH 6.0–10.5 with an optimum at pH 7.5. The salinity range suitable for growth is 1.0–20.0% (w/v) total salts with an optimum of 3.0–5.0% (w/v) total salts. No growth occurs in the absence of NaCl or with NaCl as the sole salt. Strain JSM 078169T grows on Marine Agar, which contains the following: 5.0 g peptone, 1.0 g yeast extract, 0.1 g ferric citrate, 19.45 g NaCl, 8.8 g MgCl2, 3.24 g Na2SO4, 1.8 g CaCl2, 0.55 g KCl, 0.16 g NaHCO3, 0.08 g KBr, 0.034 g SrCl2, 0.022 g H3BO3, 0.004 g sodium silicate, 0.0024 g sodium fluoride, 0.0016 g ammonium nitrate, 0.008 g disodium phosphate and 15 g agar.

Figure 2.

Scanning electron micrograph of H. zhanjiangensis JSM 078169T

Table 1. Classification and general features of H. zhanjiangensis JSM 078169T according to the MIGS recommendations [23] (published by the Genomic Standards Consortium [24]), the List of Prokaryotic names with Standing in Nomenclature [25] and the Names for Life database [12].

Chemotaxonomy

The predominant respiratory quinone is Q-9, which is consistent with the other members of the genus Halomonas [1]. The predominant fatty acids are C18:1 ω7c (48.9%), C16:0 (17.0%) and C12:0 3-OH (10.7%). The profile of major fatty acids is also similar to those of other representatives of the genus Halomonas [38–41].

Genome sequencing and annotation

Genome project history

This organism was selected for sequencing on the basis of its phylogenetic position [42,43]. Sequencing of strain JSM 078169T is part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project [44], a follow-up of the GEBA project [45], which aims to increase the sequencing coverage of key reference microbial genomes. The genome project is deposited in the Genomes OnLine Database [21] and the permanent draft genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI) using state-of-the-art sequencing technology [46]. A summary of the project information is shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA isolation

H. zhanjiangensis JSM 078169T, DSM 21076, was grown in DSMZ medium 1510 (modified medium 514 for Halomonas sp.) [47] at 28 °C. DNA was isolated from 0.5–1.0 g of cell paste using the MasterPure Gram-positive DNA purification kit (Epicentre MGP04100), following the standard protocol recommended by the manufacturer with modification st/DL for cell lysis as described by Wu et al. [45]. DNA is available through the DNA Bank Network [48].

Genome sequencing and assembly

The draft genome sequence was generated using Illumina technology [49]. An Illumina standard shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform, which generated 15,593,002 reads totaling 2,339.0 Mbp. All general aspects of library construction and sequencing performed at the JGI can be found at [50]. All raw Illumina sequence data were passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts [51]. The following steps were then performed for assembly: (1) filtered Illumina reads were assembled using Velvet [52], (2) 1–3 kbp simulated paired-end reads were created from Velvet contigs using wgsim [53], (3) Illumina reads were assembled with simulated read pairs using Allpaths-LG [54]. Parameters for the assembly steps were: (1) Velvet (velveth: 63 -shortPaired and velvetg: -very_clean yes -exportFiltered yes -min_contig_lgth 500 -scaffolding no -cov_cutoff 10), (2) wgsim (-e 0 -1 100 -2 100 -r 0 -R 0 -X 0), (3) Allpaths-LG (PrepareAllpathsInputs: PHRED_64=1 PLOIDY=1 FRAG_COVERAGE=125 JUMP_COVERAGE=25 LONG_JUMP_COV=50, RunAllpathsLG: THREADS=8 RUN=std_shredpairs TARGETS=standard VAPI_WARN_ONLY=True OVERWRITE=True). The final draft assembly contained 18 contigs in 17 scaffolds. The total size of the genome is 4.1 Mbp and the final assembly is based on 501.3 Mbp of Illumina data, which provides an average 123.5× coverage of the genome.
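The yield and coverage figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that coverage is the number of bases used in the assembly divided by the assembled genome size (4,060,520 bp, as given under Genome properties), and that reads were 150 bp long; the read length is not stated in the text and is inferred here from the reported totals. Function names are illustrative.

```python
def total_yield_mbp(read_count, read_length_bp):
    """Total sequenced bases in Mbp for a given read count and read length."""
    return read_count * read_length_bp / 1e6

def mean_coverage(bases_used_mbp, genome_size_bp):
    """Average fold coverage: bases used in the assembly / genome size."""
    return bases_used_mbp * 1e6 / genome_size_bp

# Values reported in the text.
READ_COUNT = 15_593_002
BASES_IN_ASSEMBLY_MBP = 501.3
GENOME_SIZE_BP = 4_060_520

# Assumption: 150 bp reads (not stated in the text).
ASSUMED_READ_LENGTH_BP = 150

raw_yield = total_yield_mbp(READ_COUNT, ASSUMED_READ_LENGTH_BP)
coverage = mean_coverage(BASES_IN_ASSEMBLY_MBP, GENOME_SIZE_BP)

print(f"raw yield: {raw_yield:.1f} Mbp")
print(f"mean coverage: {coverage:.1f}x")
```

Under these assumptions the computed values round to the reported 2,339.0 Mbp and 123.5× coverage.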

Genome annotation

Genes were identified using Prodigal [55] as part of the DOE-JGI genome annotation pipeline [56], followed by a round of manual curation using the JGI GenePRIMP pipeline [57]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database and the UniProt, TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes-Expert Review (IMG-ER) platform [58].

Genome properties

The assembly of the draft genome sequence consists of 17 scaffolds amounting to 4,060,520 bp, and the G+C content is 54.5% (Table 3 and Figure 3). Of the 3,739 genes predicted, 3,659 were protein-coding genes and 80 were RNA genes. The majority of the protein-coding genes (87.1%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4 and Figure 3.

Figure 3.

Graphical map of the largest scaffold of the genome. From bottom to top: genes on forward strand (colored by COG categories), genes on reverse strand (colored by COG categories), RNA genes (tRNA green, rRNA red, other RNAs black), GC content, GC skew (purple/olive).

Table 3. Genome statistics
Table 4. Number of genes associated with the general COG functional categories

Insights into the genome sequence

One complete genome sequence from a type strain of the genus Halomonas, that of H. elongata [22], is available in GenBank, and four other permanent draft genomes, of H. anticariensis, H. lutea, H. jeotgali and H. halocynthiae, are available from IMG. The genome size of H. zhanjiangensis is smaller than those of H. elongata, H. lutea and H. anticariensis (4.06–5.02 Mbp), but much larger than those of H. jeotgali and H. halocynthiae (2.85–2.88 Mbp). Analysis with the genome-to-genome distance calculator version 2.0 [59–61] revealed that all digital DNA-DNA hybridization (DDH) values calculated with the program NCBI-BLAST are much lower than 70%, which demonstrates that H. zhanjiangensis is distinct from H. elongata, H. anticariensis, H. lutea, H. jeotgali and H. halocynthiae at the species level. The distance between the type strain genomes of H. zhanjiangensis and H. elongata is 0.1845, which corresponds to a DDH value of 13.00 ± 2.99%. The distances of H. zhanjiangensis from H. anticariensis, H. lutea, H. jeotgali and H. halocynthiae are 0.1842, 0.1837, 0.1835 and 0.1849, which correspond to DDH values of 20.30 ± 2.41%, 20.30 ± 2.41%, 20.40 ± 2.41% and 20.20 ± 2.41%, respectively.
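The species-level comparison above reduces to checking each reported DDH estimate against the 70% threshold. The following minimal sketch does this conservatively, using the upper bound of each reported interval; it takes the DDH values as given in the text and does not reproduce the calculator's own distance-to-DDH model. The function and variable names are illustrative.

```python
DDH_SPECIES_THRESHOLD = 70.0  # percent, species delineation threshold used in the text

# Reported digital DDH estimates against H. zhanjiangensis: (value %, +/- %).
reported_ddh = {
    "H. elongata": (13.00, 2.99),
    "H. anticariensis": (20.30, 2.41),
    "H. lutea": (20.30, 2.41),
    "H. jeotgali": (20.40, 2.41),
    "H. halocynthiae": (20.20, 2.41),
}

def below_species_threshold(estimate, margin, threshold=DDH_SPECIES_THRESHOLD):
    """True if even the upper bound of the estimate stays below the threshold."""
    return estimate + margin < threshold

distinct = {
    name: below_species_threshold(value, margin)
    for name, (value, margin) in reported_ddh.items()
}

for name, is_distinct in distinct.items():
    verdict = "distinct species" if is_distinct else "not resolved"
    print(f"{name}: {verdict}")
```

All five comparisons fall well below the threshold even at the upper bound of the reported intervals.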

A major feature of the previously sequenced genomes from this family is the presence of large numbers of proteins for the TRAP-type C4-dicarboxylate transport systems. A total of 267 genes in the genome of H. zhanjiangensis encode proteins for carbohydrate transport and metabolism; 68 genes are related to TRAP-type C4-dicarboxylate transport systems and encode 22 large permease proteins, 24 periplasmic proteins and 22 small permease proteins. Genomic analysis of H. elongata, H. anticariensis, H. lutea, H. jeotgali and H. halocynthiae showed that they encode 58, 65, 61, 7 and 32 proteins related to TRAP-type C4-dicarboxylate transport systems, respectively. Proteins for TRAP-type C4-dicarboxylate transport systems constitute 1.86% of the total protein-coding sequences of the H. zhanjiangensis genome. In the genomes of H. elongata, H. anticariensis, H. lutea, H. jeotgali and H. halocynthiae, proteins related to TRAP-type C4-dicarboxylate transport systems account for 1.67%, 1.37%, 1.42%, 0.27% and 1.18% of the total protein-coding genes, respectively. Therefore, H. zhanjiangensis has the highest percentage of proteins related to TRAP-type C4-dicarboxylate transport systems in this group of bacteria to date.
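The 1.86% figure for H. zhanjiangensis follows directly from the counts given in the text. The short sketch below recomputes it from the three TRAP component counts and the 3,659 protein-coding genes reported under Genome properties, and compares it with the percentages reported for the other genomes; variable names are illustrative.

```python
# Counts reported in the text for H. zhanjiangensis.
trap_components = {
    "large permease": 22,
    "periplasmic": 24,
    "small permease": 22,
}
PROTEIN_CODING_GENES = 3_659

trap_total = sum(trap_components.values())
trap_share = 100.0 * trap_total / PROTEIN_CODING_GENES

# Percentages reported in the text for the comparison genomes.
reported_shares = {
    "H. elongata": 1.67,
    "H. anticariensis": 1.37,
    "H. lutea": 1.42,
    "H. jeotgali": 0.27,
    "H. halocynthiae": 1.18,
}

print(f"TRAP-related genes: {trap_total}")
print(f"share of protein-coding genes: {trap_share:.2f}%")
print("highest in group:", trap_share > max(reported_shares.values()))
```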

Among the signal transduction mechanisms, methyl-accepting chemotaxis proteins (MCPs) are transmembrane sensor proteins of bacteria. MCPs allow bacteria to detect concentrations of molecules in the extracellular matrix so that they may swim smoothly or tumble accordingly [62,63]. Various environmental conditions give rise to diversity in bacterial signaling receptors, and consequently there are many genes encoding MCPs [64]. H. zhanjiangensis has 23 MCPs, while H. elongata, H. anticariensis, H. lutea and H. jeotgali have 4, 21, 16 and 17 MCPs, respectively. MCPs are not found in the genome of H. halocynthiae. H. zhanjiangensis thus has the largest number of MCPs among the genomes of this family compared here. The analysis of bacterial genomes reveals that the family Halomonadaceae differs considerably from E. coli in the number of MCPs; the number of MCPs in Halomonadaceae is about twice that of E. coli strains.