Introduction

Strain 3410T (= DSM 43247 = ATCC 25592 = JCM 3198) is the type strain of the species Gordonia bronchialis, which is the type species of the genus. The genus Gordonia (formerly Gordona) was originally proposed by Tsukamura in 1971 [1]. The generic name Gordona was chosen to honor Ruth E. Gordon, who extensively studied ‘Mycobacterium rhodochrous’ (later included as a member of Gordona) [1]. In 1977, the genus was subsumed into the genus Rhodococcus [2], but it was revived in 1988 by Stackebrandt et al. [3]. At the time of writing, the genus contained 28 validly published species [4]. The genus Gordonia is of great interest for its bioremediation potential [5]. Some species of the genus have been used for the decontamination of polluted soils and water [6,7]. Other species were isolated from industrial waste water [8], activated sludge foam [9], automobile tire [10], mangrove rhizosphere [11], tar-contaminated oil [12], soil [13] and an oil-producing well [7]. Further industrial interest in Gordonia species stems from their use as a source of novel enzymes [14,15]. There are, however, quite a number of Gordonia species that are associated with human and animal diseases [16], among them G. bronchialis. Here we present a summary classification and a set of features for G. bronchialis 3410T, together with the description of the complete genomic sequencing and annotation.

Classification and features

Strain 3410T was isolated from the sputum of a patient with pulmonary disease (probably in Japan) [1]. Further clinical strains in Japan have been isolated from pleural fluid, tumor in the eyelid, granuloma, leukorrhea, skin tissue and pus [17]. In other cases, G. bronchialis caused bacteremia in a patient with a sequestrated lung [18] and a recurrent breast abscess in an immunocompetent patient [19]. Finally, G. bronchialis was isolated from sternal wound infections after coronary artery bypass surgery [20]. G. bronchialis shares 95.8-98.7% 16S rRNA gene sequence similarity with the other type strains of the genus Gordonia, and 95.3-96.4% with the type strains of the neighboring genus Williamsia.

Figure 1 shows the phylogenetic neighborhood of G. bronchialis 3410T in a 16S rRNA based tree. The sequences of the two 16S rRNA gene copies in the genome of G. bronchialis 3410T differ from each other by one nucleotide, and differ by up to 5 nucleotides from the previously published 16S rRNA sequence from DSM 43247 (X79287). These discrepancies are most likely due to sequencing errors in the latter sequence.
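Nucleotide differences of this kind are counted position by position over aligned sequences. The following is only a minimal sketch of such a count, not the procedure used for this genome: the function names and the toy sequences are hypothetical, and the inputs are assumed to be already aligned to equal length (a gap character paired with a base counts as a difference).

```python
def count_mismatches(seq_a: str, seq_b: str) -> int:
    """Count differing positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions that are identical."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = count_mismatches(seq_a, seq_b)
    return 100.0 * (len(seq_a) - mismatches) / len(seq_a)


# Toy sequences for illustration only; these are NOT real 16S rRNA data.
copy_1 = "ACGTACGTAC"
copy_2 = "ACGTACGTAT"
print(count_mismatches(copy_1, copy_2))   # 1
print(percent_identity(copy_1, copy_2))   # 90.0
```

The same percentage, computed over full-length alignments, corresponds to the kind of 16S rRNA gene sequence similarity values quoted for the type strains above.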

Figure 1.

Phylogenetic tree highlighting the position of G. bronchialis 3410T relative to the other type strains within the genus Gordonia. The tree was inferred from 1,446 aligned characters [21,22] of the 16S rRNA gene sequence under the maximum likelihood criterion [23] and rooted with the type strains of the neighboring genus Williamsia. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 1,000 bootstrap replicates if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [24] are shown in blue, published genomes in bold.

In a very comprehensive study, Tsukamura analyzed a set of 100 quite diverse characters for 41 G. bronchialis strains isolated from sputum of patients with pulmonary disease, including the type strain [1]. Unfortunately, this study does not present the characteristics of the type strain 3410T as such. We nevertheless first present these data, as this study gives a good overview of the species itself. In order to summarize the data here, we regard positive reactions in more than 34 strains as positive, and positive reactions in 13 or fewer strains as negative. Most characters, however, are either clearly positive (40 or 41 strains) or clearly negative (0 or 1 strains). The detailed methods are reported elsewhere [25,26].
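The summarizing rule stated above can be written as a small threshold function. This is a sketch for illustration: the function name and the label "variable" for intermediate counts are assumptions, not terms from the original study. Only the two cut-offs (more than 34, and 13 or fewer, out of 41 strains) come from the text.

```python
TOTAL_STRAINS = 41  # number of G. bronchialis strains in the study [1]


def summarize_character(n_positive: int) -> str:
    """Map the number of positive strains to a summary call.

    The label 'variable' for counts between the two cut-offs is a
    hypothetical name chosen for this sketch.
    """
    if not 0 <= n_positive <= TOTAL_STRAINS:
        raise ValueError("count must be between 0 and 41")
    if n_positive > 34:
        return "positive"
    if n_positive <= 13:
        return "negative"
    return "variable"


print(summarize_character(40))  # positive
print(summarize_character(1))   # negative
print(summarize_character(18))  # variable
```

Counts that fall between the two cut-offs are the ones reported below with explicit strain numbers rather than as a plain positive or negative result.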

G. bronchialis is Gram-positive (Table 1) and shows slight but not strong acid-fastness. A mycelium is not observed. G. bronchialis strains are non-motile and produce neither conidia nor endospores [1,3]. G. bronchialis is an obligately aerobic chemoorganotroph with an oxidative-type metabolism [3]. The cells are rod-shaped and show compact grouping (like a cord) (Figure 2), and form rough colonies with pinkish-brown pigmentation [1]. Photochromogenicity was not observed. G. bronchialis grows quite rapidly [1], with visible colonies appearing within 1–3 days [1,36]. G. bronchialis is positive for catalase and nitrate reduction, but arylsulphatase activity (3 days and 2 weeks) and degradation of salicylate and PAS were not observed [1]. Growth occurs on 0.2% sodium p-aminosalicylate and on 62.5 and 125 µg NH2OH-HCl/ml, but not on 250 or 500 µg/ml. G. bronchialis is tolerant to both 0.1 and 0.2% picric acid. G. bronchialis grows at 28°C and 37°C, but not at 45°C or 52°C [1]. G. bronchialis is positive for acetamidase, urease, nicotinamidase and pyrazinamidase, but negative for benzamidase, isonicotinamidase, salicylamidase, allantoinase, succinamidase, and malonamidase [1]. G. bronchialis utilizes acetate, succinate, malate, pyruvate, fumarate, glycerol, glucose, mannose, trehalose, inositol, fructose, sucrose, ethanol, propanol, and propylene glycol as carbon sources for growth, but not citrate, benzoate, malonate, galactose, arabinose, xylose, rhamnose, raffinose, mannitol, sorbitol, or various forms of butylene glycol (1,3-; 1,4-; 2,3-) [1]. G. bronchialis utilizes L-glutamate and acetamide as an N-C source, but not L-serine, benzamide, monoethanolamine or trimethylene diamine. Glucosamine is utilized by 18 strains [1]. As nitrogen sources, G. bronchialis utilizes L-glutamate, L-serine, L-methionine, acetamide, urea, pyrazinamide, isonicotinamide, nicotinamide and succinamide, but not benzamide or nitrite. Nitrate is utilized by 25 strains as nitrogen source [1]. G. bronchialis strains do not produce nicotinic acid. G. bronchialis strains do not grow on TCH medium (10 µg/ml) or on salicylate medium (0.05% and 0.01%) [1].

Figure 2.

Scanning electron micrograph of G. bronchialis 3410T

Table 1. Classification and general features of G. bronchialis 3410T according to the MIGS recommendations [27]

In the following, characteristics of the type strain 3410T are presented. Strain 3410T reduces nitrate and hydrolyzes urea, but it does not hydrolyze aesculin, allantoin or arbutin [37]. It decomposes (%, w/v) starch (1) and uric acid (0.5), but not hypoxanthine (0.4), tributyrin (0.1), Tween 80 (1), tyrosine (0.5) or xanthine (0.4) [37]. It grows on glycerol (1) and sodium fumarate (1) as sole carbon sources (%, w/v), but not on arbutin (1), D-cellobiose (1), N-acetyl-D-glucosamine (0.1), adipic acid (0.1), betaine (0.1), oxalic acid (0.1) or propan-1-ol (0.1) [37]. Strain 3410T grows in the presence (%, w/v) of oleic acid (0.1) and zinc chloride (0.001) [37].

In an API ZYM test, strain 3410T reacts positively for alkaline phosphatase, butyrate esterase, leucine arylamidase and naphthol-AS-BI-phosphohydrolase, but not for caprylate esterase, cystine arylamidase, β-glucosidase, myristate lipase, and valine arylamidase [13]. Complementary to the results of Tsukamura [1], strain 3410T utilizes as sole carbon source D(+) cellobiose, D(+) galactose, D(+) mannose, meso-inositol, L(+) rhamnose and sodium succinate, but not D(-) lactose, D(-) ribose, sodium benzoate or sodium citrate [38]. The use of D(+) galactose [38] appears to contrast with the results of Tsukamura reported above [1]. L-threonine and L-valine are used as sole nitrogen source by strain 3410T, but not L-asparagine, L-proline or L-serine [38]. Interestingly, Tsukamura reports that 40 out of 41 strains utilize L-serine as sole nitrogen source [1], and it is not clear whether the only negative strain in the Tsukamura study could be the type strain 3410T [1].

In the Biolog system, strain 3410T reacts positively for α-cyclodextrin, β-cyclodextrin, dextrin, glycogen, maltose, maltotriose, D-mannose, 3-methyl glucose, palatinose, L-raffinose, salicin, turanose, D-xylose, L-lactic acid, methyl succinate and N-acetyl-L-glutamic acid [12], but not for N-acetylglucosamine, amygdalin, D-arabitol, L-rhamnose, D-ribose, D-sorbitol, D-trehalose, acetic acid, α-hydroxybutyric acid, β-hydroxybutyric acid, α-ketoglutaric acid, α-ketovaleric acid, L-lactic acid methyl ester, L-malic acid, propionic acid, succinamic acid, alaninamide, L-alanine and glycerol [12]. Further carbon source utilization results are published elsewhere [8].

Drug susceptibility profiles of 13 G. bronchialis strains from clinical samples have been examined in detail [17], but they are too complex to summarize here. No significant matches with any 16S rRNA sequences from environmental genomic samples and surveys are reported at the NCBI BLAST server (November 2009).

Chemotaxonomy

The cell-wall peptidoglycan is based upon meso-diaminopimelic acid (variation A1γ). The glycan moiety of the peptidoglycan contains N-glycolylmuramic acid. The wall sugars are arabinose and galactose. Mycolic acids are present with a range of ca. 48–66 carbon atoms. The predominant menaquinone is MK-9(H2), with only low amounts of MK-9(H0), MK-8(H2), and MK-7(H2) [3,8,39–41]. Moreover, the cell envelope of G. bronchialis 3410T contains a lipoarabinomannan-like lipoglycan [42]. The same study also observed a second amphiphilic fraction with properties suggesting a phosphatidylinositol mannoside [42]. The cellular fatty acid composition (%) is C16:0 (23), tuberculostearic acid (20), C16:1cis9 (16), C16:1cis7 (11), C18:1 (10), and 10-methyl C17:0 (7). All other fatty acids are at 3% or below [8].

Genome sequencing and annotation

Genome project history

This organism was selected for sequencing on the basis of its phylogenetic position, and is part of the Genomic Encyclopedia of Bacteria and Archaea project. The genome project is deposited in the Genome OnLine Database [24] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA isolation

G. bronchialis 3410T, DSM 43247, was grown in DSMZ 535 [43] at 28°C. DNA was isolated from 1–1.5 g of cell paste using Qiagen Genomic 500 DNA Kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions with modification st/LALMP for cell lysis according to Wu et al. [44].

Genome sequencing and assembly

The genome was sequenced using a combination of Sanger and 454 sequencing platforms. All general aspects of library construction and sequencing performed at the JGI can be found on the JGI website. 454 pyrosequencing reads were assembled using the Newbler assembler version 1.1.02.15 (Roche). Large Newbler contigs were broken into 5,776 overlapping fragments of 1,000 bp and entered into the assembly as pseudo-reads. The sequences were assigned quality scores based on Newbler consensus q-scores with modifications to account for overlap redundancy and to adjust inflated q-scores. A hybrid 454/Sanger assembly was made using the parallel phrap assembler (High Performance Software, LLC). Possible mis-assemblies were corrected with Dupfinisher [45] or transposon bombing of bridging clones (Epicentre Biotechnologies, Madison, WI). Gaps between contigs were closed by editing in Consed, custom primer walk or PCR amplification. A total of 876 primer walk reactions, 12 transposon bombs, and 1 PCR shatter library were necessary to close gaps, to resolve repetitive regions, and to raise the quality of the finished sequence. The error rate of the completed genome sequence is less than 1 in 100,000. Together, all sequence types provided 51.2× coverage of the genome. The final assembly contains 52,329 Sanger and 508,130 pyrosequence reads.
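Breaking large contigs into overlapping 1,000 bp pseudo-reads can be pictured as a sliding window along each contig. The sketch below is an illustration under stated assumptions, not the JGI implementation: the function name is hypothetical, and the 500 bp step (and therefore the amount of overlap) is an assumed value, since the text gives only the fragment length.

```python
from typing import List


def shred_contig(contig: str, fragment_len: int = 1000, step: int = 500) -> List[str]:
    """Split a contig into overlapping fixed-length fragments (pseudo-reads).

    fragment_len follows the 1,000 bp stated in the text; step is an
    ASSUMED value, because the overlap is not reported.
    """
    if fragment_len <= 0 or not 0 < step <= fragment_len:
        raise ValueError("need fragment_len > 0 and 0 < step <= fragment_len")
    if len(contig) <= fragment_len:
        return [contig]  # short contigs are kept as a single pseudo-read
    last_start = len(contig) - fragment_len
    starts = list(range(0, last_start + 1, step))
    # Add a final window so the end of the contig is always covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [contig[s:s + fragment_len] for s in starts]


# Toy contig of 2,300 bp for illustration only.
toy_contig = "ACGT" * 575
fragments = shred_contig(toy_contig)
print(len(fragments))                    # 4
print({len(f) for f in fragments})       # {1000}
```

Because neighboring windows share sequence, every base of a contig is represented more than once in the pseudo-reads, which is the overlap redundancy that the adjusted quality scores have to account for.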

Genome annotation

Genes were identified using Prodigal [46] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [47]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and manual functional annotation was performed within the Integrated Microbial Genomes Expert Review (IMG-ER) platform [48].

Genome properties

The genome consists of a 5.2 Mbp long chromosome and an 81,410 bp plasmid (Table 3 and Figure 3). Of the 4,999 genes predicted, 4,944 were protein-coding genes and 55 were RNA genes; 264 pseudogenes were also identified. The majority of the protein-coding genes (69.1%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
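The gene counts above can be cross-checked with a few lines of arithmetic. In this sketch the function name is hypothetical, and the derived counts of genes with and without a putative function are approximations only, because they are back-calculated from the rounded 69.1% figure rather than taken from Table 3.

```python
def gene_summary(total_genes: int, protein_coding: int, rna_genes: int,
                 fraction_with_function: float) -> dict:
    """Check that the gene categories add up and estimate function counts."""
    if protein_coding + rna_genes != total_genes:
        raise ValueError("protein-coding and RNA genes do not sum to the total")
    # Approximate: derived from a rounded percentage, not from the table.
    with_function = round(protein_coding * fraction_with_function)
    return {
        "with_putative_function": with_function,
        "hypothetical": protein_coding - with_function,
    }


# Values as stated in the text: 4,999 genes, 4,944 protein-coding, 55 RNAs, 69.1%.
summary = gene_summary(4999, 4944, 55, 0.691)
print(summary)  # {'with_putative_function': 3416, 'hypothetical': 1528}
```

The two categories are consistent with the stated total (4,944 + 55 = 4,999), and roughly 3,400 protein-coding genes would carry a putative function under the 69.1% figure.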

Figure 3.

Graphical circular map of the chromosome and the plasmid. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.

Table 3. Genome Statistics
Table 4. Number of genes associated with the general COG functional categories