Background

Pseudomonas aeruginosa is a gram-negative, broad-host-range, opportunistic pathogen found in diverse ecological niches. It is a frequent cause of many human infectious diseases, including keratitis, burn infections, urinary tract infections (UTIs) and sepsis, as well as acute and chronic infections of the human airways. To understand the adaptation and pathogenesis of the bacterium, comprehensive investigations of the genomes and transcriptomes of P. aeruginosa strains from various sources are necessary.

Typical P. aeruginosa strains have a large genome of 6–7 Mb encoding around 6000 genes, which contributes to the versatility of the species [1, 2]. The architecture of P. aeruginosa genomes exhibits a mosaic pattern composed of a core genome (5316 core genes) and a series of accessory genes inserted sporadically, including prophages, plasmids and islets [3]. Accessory genes can be acquired by horizontal gene transfer from various sources, and they often contribute to the unique physiology, pathogenesis, or transmission capacity of the corresponding strains, as has been demonstrated in several P. aeruginosa isolates [4, 5]. Although over one thousand genomes of P. aeruginosa (deposited in NCBI GenBank) have been sequenced, only 58 complete genomes are available (as of May 2016), limiting a comprehensive understanding of this important group of opportunistic pathogens.

P. aeruginosa ATCC 27853 is commonly used in biomedical research and was initially isolated from a blood specimen in the Peter Bent Brigham Hospital (Boston, USA) in 1971 [6]. ATCC 27853 has been widely used as a model strain to survey antibiotic susceptibility since 1978 [7, 8]. So far, four draft genomes of P. aeruginosa ATCC 27853 have been sequenced [9,10,11,12], but the complete genome of the strain is still lacking, hindering the understanding of its full physiological potential.

In the present study, we sequenced and assembled the complete genome of P. aeruginosa ATCC 27853 using both PacBio’s SMRT and Illumina platforms. We then compared it with the complete genomes of two frequently used P. aeruginosa laboratory strains, P. aeruginosa PAO1 and P. aeruginosa PA14, to reveal distinct features of the ATCC 27853 genome. To advance our understanding of the physiology of the strain, specifically its morphogenesis, we performed comparative transcriptome analysis on ATCC 27853 and PAO1. These analyses revealed the presence of a large number (seven) of prophages in the ATCC 27853 genome and several unique physiological features of the strain, implying a striking ability to adapt to a variety of environmental niches and stresses.

Results

General features of the genome of P. aeruginosa ATCC 27853

A total of 1.296 Gb of raw data were produced by the PacBio platform. The error correction step produced 146,425 reads with an average length of 7564 bp and a maximum length of 39,699 bp. Corrected reads were assembled de novo, and the resulting contig was then polished and circularized using the SMRT Analysis pipeline to produce a single 6.833 Mb contig with 158× coverage. The assembly was also validated by mapping Illumina-generated reads. The GC content of the genome was 66.12%, which is comparable to that of other genomes within the P. aeruginosa species (Additional file 1: Table S1). A total of 6366 genes were predicted. Twelve rRNA genes, 66 tRNA genes and 215 tandem repeats were identified (Table 1).
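As an illustration of how a GC content figure such as the one above is derived, the following minimal sketch computes the percentage of G and C bases in a sequence. The function name and the toy sequence are hypothetical and are not taken from the ATCC 27853 assembly; ambiguous bases are assumed to be excluded from the denominator.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


if __name__ == "__main__":
    # Hypothetical toy sequence for illustration only (not real genome data).
    toy_contig = "GGCCATGCGCNN"
    print(f"GC content: {gc_content(toy_contig):.2f}%")
```

For a real assembly the same calculation would be applied to the full contig sequence read from the FASTA file.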

Phylogenetic relationship of the ATCC 27853 with other P. aeruginosa strains based on SNPs from all complete genomes

Since the 16S rRNA genes in the different strains of the P. aeruginosa species exhibit high similarity (>99%, data not shown) with low discriminating capability, single nucleotide polymorphisms (SNPs) were used to construct the phylogenetic relationship between ATCC 27853 and published strains. Using Harvest [13], we collected 269,561 SNPs from the complete genomes included. We generated the phylogenetic tree in MEGA [14] based on the maximum likelihood (ML) algorithm. It became apparent that P. aeruginosa ATCC 27853 is closely related to P. aeruginosa T38079, P. aeruginosa F9670 and P. aeruginosa S86968, all of which are clinical isolates (Fig. 1, Additional file 1: Table S1).
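To make the idea of SNP-based relatedness concrete, the sketch below counts pairwise SNP differences between aligned core-genome sequences. It is only a simplified illustration: it does not reproduce the Harvest SNP calling or the maximum likelihood tree inference in MEGA, and the strain names and sequences are hypothetical.

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences differ.

    Positions with gaps or ambiguous bases in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


def pairwise_snp_distances(alignment: dict) -> dict:
    """Return SNP counts for every pair of strains in the alignment."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy alignment, not real P. aeruginosa data.
    toy_alignment = {
        "strain_A": "ACGTACGT",
        "strain_B": "ACGTACGA",
        "strain_C": "TCGTAC-A",
    }
    for pair, dist in pairwise_snp_distances(toy_alignment).items():
        print(pair, dist)
```

Strains separated by few SNPs in the shared core regions cluster together in the resulting tree, although such counts say nothing about differences in accessory regions.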

Fig. 1

Phylogenetic relationships of the 59 currently available complete genomes of Pseudomonas aeruginosa, constructed from the SNPs identified using Harvest with 100 bootstrap replicates and the maximum likelihood (ML) criterion in MEGA. P. aeruginosa ATCC 27853 is highlighted in blue italics. Strain designations are listed in Additional file 1: Table S1

COG comparison

We compared Clusters of Orthologous Groups (COG) annotations of P. aeruginosa ATCC 27853 with those of P. aeruginosa PAO1, P. aeruginosa PA14 and P. aeruginosa LESB58 (an epidemic strain with known prophage functions) (Fig. 2 and Table 2). A total of 41 COGs are exclusively present in P. aeruginosa ATCC 27853 (Fig. 2 and Table 2), a much higher number than the unique COGs in each of the other three genomes (Fig. 2). Most of these COGs are phage and plasmid proteins, consistent with the high number of prophages (seven) identified in P. aeruginosa ATCC 27853 (below). In addition, 58 COGs in P. aeruginosa ATCC 27853 are absent in P. aeruginosa PAO1. Nineteen of these have uncharacterized or only hypothetical functions (Table 2). Several site-specific DNA methylases (COG0270 and COG0338) are also present in the list (Table 2).
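The comparison above amounts to set operations on the COG identifiers annotated in each genome. The following minimal sketch shows the logic under the assumption that each strain is represented by a set of COG IDs; the toy sets and the function name are hypothetical and do not reflect the actual annotations.

```python
def exclusive_cogs(target: set, others: list) -> set:
    """Return COGs present in the target strain and absent from all others."""
    return set(target) - set().union(*others)


if __name__ == "__main__":
    # Hypothetical toy COG sets for illustration only.
    atcc27853 = {"COG0001", "COG0002", "COG0270", "COG0338"}
    pao1 = {"COG0001", "COG0002", "COG0003"}
    pa14 = {"COG0001", "COG0338"}

    # COGs found only in the target strain (one region of the Venn diagram).
    print(sorted(exclusive_cogs(atcc27853, [pao1, pa14])))
    # COGs present in the target strain but absent from a single comparator.
    print(sorted(atcc27853 - pao1))
```

The same set differences, taken in both directions between two strains, give the counts of COGs unique to each member of a pair.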

Fig. 2

Venn diagram showing the numbers of shared and exclusive COGs among four P. aeruginosa strains: ATCC 27853, PAO1, PA14 and LESB58. The numbers of COGs unique to each strain and of those shared among two, three and all four strains, based on the COG gene annotations, are shown

Genomic islands

A total of 25 genomic islands (GIs) were identified in the genome of P. aeruginosa ATCC 27853 by IslandViewer [15] using the SIGI-HMM [16] and IslandPath-DIMOB [17] algorithms. The lengths of these GIs range from 4055 bp to 36,677 bp, and four of the GIs are associated with prophages (Table 3, and below). Some genes in the remaining GIs were assigned to functional groups including metal resistance, virulence and regulatory proteins (Table 3). Knowledge of the exact functions of these genes will require further investigation. Compared with PAO1, three GIs that are unique to P. aeruginosa ATCC 27853 contain a number of genes encoding monooxygenases, dioxygenases and hydrolases, which are likely responsible for the catabolism of aromatic compounds. Genes in these GIs were not annotated, as they only displayed high similarity to certain genes present in a handful of draft genomes of P. aeruginosa strains that lack functional annotation.

Prophages

Prophage prediction using Prophinder [18] and PHAST [19] revealed seven prophages in the genome of P. aeruginosa ATCC 27853. All of these prophages were assigned as accessory genes and are designated Prophage 1–7 (Table 4, Fig. 3). Prophage 1, which is closely related to phi CTX, is located between the genes encoding anthranilate synthase component I and component II. It is noteworthy that this prophage is observed in all available genomes of P. aeruginosa and that its genomic location (between the trpE and trpG genes) is also highly conserved, based on the PHASTER database [20]. The significance of the specific location of Prophage 1 and its effect on the physiology of the P. aeruginosa host, particularly on anthranilate biosynthesis, remain to be explored. Prophage 2 is 38,604 bp long and harbors 50 open reading frames (ORFs). It is located at genomic positions 797,729–836,333, upstream of the first phenazine biosynthesis gene cluster phz1 (see below) (Fig. 4). This prophage does not interrupt any genes involved in phenazine biosynthesis (Fig. 4). Most ORFs in this prophage encode phage components such as phage head and tail proteins, transposases and integrases (Fig. 3 and Additional file 1: Table S2). Besides these structural genes, one transcription factor belonging to the DNA-binding IclR family could be annotated. A previous study showed that this prophage shares high similarity with prophage B3, a Mu-like bacteriophage identified by Braid et al. (2004) [21]. Interestingly, prophage prediction in the complete genomes of P. aeruginosa revealed that this prophage exists in a few other P. aeruginosa strains such as NCGM2.S1, VRFPA04 and Carb01_63, but at different genomic locations and with distinctive flanking regions (Additional file 1: Figure S1).

Fig. 3

Circular genome map of P. aeruginosa ATCC 27853 showing the prophages and the Genomic Islands (GIs) predicted by IslandViewer. From the outside: circles 1 and 2 (clockwise and counterclockwise), genes on the + and - strands, respectively; circle 3, prophages; 4, Genomic Islands; 5, PAO1 genes; 6, GC content; 7, GC skew. The scale in Mbp is indicated on the innermost circle of the map

Fig. 4

Comparison of the phenazine biosynthesis gene cluster (phz1) and its flanking regions in three strains of P. aeruginosa: ATCC 27853, PAO1 and PA14. The genomic location of Prophage 2 upstream of the phz1 gene cluster in ATCC 27853 is highlighted

Prophage 3 is located at genomic positions 1,337,276–1,379,950 and has a size of 42,674 bp. Several genes that encode virulence-associated proteins and transcriptional regulators were also identified in it, such as ACG06_06430 (gene locus tag in the genome of ATCC 27853 as annotated by NCBI), which encodes a member of the LuxR family of transcription factors that modulate quorum sensing [22]. Prophage 4 is the largest predicted prophage in the ATCC 27853 genome and is composed of genes from different phages such as ES18 and D3, indicating a complicated evolutionary history. In addition to typical phage components, the predicted prophages in the genome of P. aeruginosa ATCC 27853 contain virulence factors and other functional genes, e.g. an adenylate kinase in Prophage 5.
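The reported prophage sizes can be checked directly against the reported coordinates. The short sketch below does this for Prophages 2 and 3, assuming that the size is taken as the difference between the end and start positions (an inclusive count would be one base larger); the function name is illustrative.

```python
def interval_length(start: int, end: int) -> int:
    """Return the span between two genomic coordinates (end minus start)."""
    if end < start:
        raise ValueError("end coordinate must not precede start coordinate")
    return end - start


if __name__ == "__main__":
    # Coordinates as reported in the text for ATCC 27853.
    prophages = {
        "Prophage 2": (797_729, 836_333),
        "Prophage 3": (1_337_276, 1_379_950),
    }
    for name, (start, end) in prophages.items():
        print(f"{name}: {interval_length(start, end):,} bp")
```

Both values agree with the sizes stated above (38,604 bp and 42,674 bp).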

Phenazine biosynthesis

Phenazine compounds comprise an important class of secondary metabolites and virulence factors in Pseudomonas species. All phenazines contain a dibenzo-annulated pyrazine ring and are represented by several structurally related compounds [23]. In most of the annotated P. aeruginosa genomes, two clusters of genes that encode phenazine biosynthetic pathways (Phz1 and Phz2) are present. The phenazine biosynthesis genes in ATCC 27853 and PAO1 are highly similar (98.98 to 99.70% at the nucleotide level). However, the phz1 gene cluster in ATCC 27853 is preceded by the Prophage 2 island (see above, ORFs with gene locus tags: ACG06_03785-ACG06_04040) (Fig. 4). In contrast, the orthologous phz1 gene cluster in PAO1 (genes: PA4209-PA4217, gene locus tags in PAO1 from annotations by NCBI) is preceded by a large fragment encompassing the opmD, mexI, mexH and mexG genes (genes: PA4120-PA4208, Fig. 4), which encode components of a Resistance Nodulation Division (RND) type efflux system that is proposed to pump the phenazine derivative 5-methylphenazine-1-carboxylate (5-Me-PCA) out of the cell [24]. These genes are absent in P. aeruginosa ATCC 27853. To examine whether this genomic difference affects the phenazine production pattern, we measured the production of a major phenazine compound in P. aeruginosa, pyocyanin (PYO), in the two strains cultured in LB medium at room temperature. We observed a higher level of PYO in ATCC 27853 than in PAO1 at all time points examined (Fig. 5), suggesting that the different genomic architecture flanking the phz1 gene cluster may indeed affect PYO production in P. aeruginosa strains [25, 26].

Fig. 5

Measurements of pyocyanin in P. aeruginosa ATCC 27853 and PAO1 cultured in LB media

Virulence, surface-associated, and motility proteins

We compiled a database of 369 virulence genes based on a list of conserved virulence factors of Pseudomonas species, with a primary focus on P. aeruginosa PAO1 and P. aeruginosa PA14, using the Virulence Factor Database (VFDB) [27] and the Victors Virulence Factors (PHIDIAS) (http://www.phidias.us/victors/index.php). Comparing the ATCC 27853 genome against this database revealed that 254 of these virulence genes are also present in the genome of P. aeruginosa ATCC 27853 (Table 5). A class of virulence genes that is absent in P. aeruginosa ATCC 27853 comprises the wbp genes, which encode the B-band lipopolysaccharide O antigen, with the exception of wbpX. The B-band O-antigen of lipopolysaccharide serotype O5 (such as that in PAO1) is important in conferring serum resistance in host-pathogen interactions. Its presence or absence has also been shown to influence biofilm formation of the corresponding strain, because it affects the hydrophilicity of the cell surface and consequently the interaction of the cell with different surface materials and the neighboring environment [28]. Absence of this system in P. aeruginosa ATCC 27853 probably indicates a defect in its defense against the host serum system and a biofilm formation capacity that differs from that of B+ strains such as PAO1.

Interestingly, SNP distribution analysis in the genomes of PAO1 and ATCC 27853 revealed that a large number of the non-synonymous variant sites between the two strains are concentrated in regions and genes that encode surface-associated proteins, such as flagellar components, the pyoverdine receptor, transporters, and type 4 pili (Additional file 1: Table S4 and Figure S2). Together, these genomic differences suggest potentially different surface characteristics of ATCC 27853 when compared to PAO1. We therefore cultured the two strains on LB agar supplemented with Congo red and examined their capabilities to form colony biofilms [29]. A distinctive wrinkled colony morphology was observed in ATCC 27853 but not in PAO1 (Fig. 6), suggesting a different surface pattern of ATCC 27853 compared with PAO1 and a strong capability of the strain to form biofilms. The stronger color of the ATCC 27853 biofilm compared to that of PAO1 on the Congo red-containing plate indicated a high level of exopolysaccharide matrix production in ATCC 27853, consistent with a stronger capability of the strain to form biofilms.

Fig. 6

Colony morphology of P. aeruginosa ATCC 27853 and P. aeruginosa PAO1 cultured at 25 °C on LB agar plates supplemented with Congo Red

Transcriptomes of P. aeruginosa ATCC 27853 and P. aeruginosa PAO1

The distinctive pattern of colony biofilms of ATCC 27853 and PAO1 shown above prompted us to investigate the functional genome of ATCC 27853 and compare it with that of PAO1 at that growth stage. We performed RNA-seq to obtain the complete transcriptomes of both strains cultured on LB agar at 25 °C, a condition identical to that of the colony biofilm formation described above. Cells were harvested after 48 h of incubation, and RNA was extracted and sequenced as described in Materials and Methods. Summary statistics for each RNA-seq sample, including the total number of reads and bases sequenced, genome coverage, CDS coverage and mapping ratio, are presented in the supplementary data (Additional file 1: Table S5). To conduct a genome-wide comparative gene expression analysis, orthologous genes between ATCC 27853 and PAO1 were first identified using reciprocal blastn, and the ratio of their respective expression in the two strains was calculated by DESeq (Additional file 1: Table S3) [30].
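The reciprocal best-hit logic used to pair orthologous genes can be sketched as follows. This is a minimal illustration that assumes the best blastn hit of each gene has already been extracted in both search directions; the gene identifiers are hypothetical placeholders, and no thresholds on identity or coverage are modelled.

```python
def reciprocal_best_hits(a_to_b: dict, b_to_a: dict) -> list:
    """Return gene pairs that are each other's best hit in both directions.

    a_to_b maps each gene of genome A to its best hit in genome B;
    b_to_a maps each gene of genome B to its best hit in genome A.
    """
    return sorted(
        (gene_a, gene_b)
        for gene_a, gene_b in a_to_b.items()
        if b_to_a.get(gene_b) == gene_a
    )


if __name__ == "__main__":
    # Hypothetical best-hit tables for illustration only.
    atcc_to_pao1 = {
        "ATCC_g1": "PAO1_g1",
        "ATCC_g2": "PAO1_g2",
        "ATCC_g3": "PAO1_g2",
    }
    pao1_to_atcc = {
        "PAO1_g1": "ATCC_g1",
        "PAO1_g2": "ATCC_g3",
    }
    print(reciprocal_best_hits(atcc_to_pao1, pao1_to_atcc))
```

Only pairs confirmed in both directions are retained, so a gene whose best hit prefers a different partner (ATCC_g2 in the toy data) is excluded from the expression comparison.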

One hundred thirty-seven genes with higher expression levels (log2 fold changes over 2) in ATCC 27853 than in P. aeruginosa PAO1 were identified (Fig. 7, Additional file 1: Table S3). These include several classes of genes involved in biofilm formation, such as the type IV pili biogenesis gene cluster (pilQPONM: PA5040-PA5044), which is involved in the initiation of biofilms. Genes encoding twitching motility proteins, pilGHIJK-chpABCDE (PA0408-PA0417), were expressed at a higher level in ATCC 27853 than in PAO1 (Fig. 7, Additional file 1: Table S3). pilABCDE (PA4525-PA4528), pilTU (PA0395-PA0396) and pilSR-yfiT-fimTU-pilVWXY1Y2E (PA4546-PA4556) also displayed a slightly higher expression level in ATCC 27853 than in PAO1 (Additional file 1: Table S3). Expression of a proton motive force gene (pfm) (PA2950), which encodes a protein involved in energy metabolism critical for the rotation of the flagellum in P. aeruginosa [31], was also higher in ATCC 27853 than in PAO1 (Fig. 7, Additional file 1: Table S3). Additionally, several other genes that are not directly involved in biofilm formation but have been reported to mediate the process were also expressed at a higher level in ATCC 27853 than in PAO1. These include the chaperone-usher pathway (cup) A genes (PA2128-PA2132, cupA1-A4), which were found to be required for adhesion to inert surfaces [32, 33]; the cbb3-type cytochrome c oxidase cco2 gene cluster (ccoN2O2Q2P2, PA1555-PA1557), which has been shown to promote biofilm formation under hypoxia through NO induction and its effect on cell elongation [34]; and pyeR (PA4354), which encodes a non-classical member of the ArsR family of transcriptional regulators that modulates biofilm formation in P. aeruginosa [35] (Fig. 7, Additional file 1: Table S3). All these genetic and transcriptional data support the distinct colony morphogenesis observed in ATCC 27853.

Fig. 7

The genome-wide transcriptomic profile of P. aeruginosa ATCC 27853 and PAO1. Green dots represent genes with higher relative expression levels in PAO1 and red dots indicate genes with higher relative expression levels in ATCC 27853. The blue dashed lines represent log2-fold changes in expression. Selected genes and operons with distinctive expression patterns in the two strains are indicated

On the other hand, a much larger number of genes (532 vs the 137 mentioned above) showed higher expression levels (log2 fold changes over 2) in P. aeruginosa PAO1 than in ATCC 27853 (Fig. 7, Additional file 1: Table S3). Of particular prominence is a large fragment (PA2134-PA2181) of genes encoding trehalose biosynthesis. The homologous genes of this fragment in PA14 have been demonstrated to be involved in the infection of plants [36]. Genes encoding several other virulence factors, such as lecA (encoding the galactophilic lectin LecA) and lasB (encoding the elastase LasB), were expressed at a higher level in PAO1 than in ATCC 27853 (Additional file 1: Table S3). Several quorum sensing genes that mediate virulence factor production, such as those encoding LasI, LasR, RhlI and RhlR, were also expressed at a higher level in PAO1 than in ATCC 27853 (Fig. 8).
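The thresholding used to assign genes to the two groups can be illustrated with a minimal sketch. It assumes that normalized expression values are already available for each orthologous pair and applies only the log2 fold change cutoff of 2; it does not reproduce the normalization or statistics performed by DESeq, and the pseudocount, gene names and values are hypothetical.

```python
import math


def log2_fold_change(expr_a: float, expr_b: float, pseudocount: float = 1.0) -> float:
    """Return log2 of the expression ratio A/B, with a pseudocount to avoid division by zero."""
    return math.log2((expr_a + pseudocount) / (expr_b + pseudocount))


def classify_genes(expression: dict, threshold: float = 2.0) -> tuple:
    """Split genes into those higher in strain A and those higher in strain B.

    expression maps a gene name to (value in strain A, value in strain B).
    """
    higher_in_a, higher_in_b = [], []
    for gene, (expr_a, expr_b) in expression.items():
        lfc = log2_fold_change(expr_a, expr_b)
        if lfc > threshold:
            higher_in_a.append(gene)
        elif lfc < -threshold:
            higher_in_b.append(gene)
    return sorted(higher_in_a), sorted(higher_in_b)


if __name__ == "__main__":
    # Hypothetical normalized expression values (strain A, strain B).
    toy_expression = {
        "gene_x": (799.0, 99.0),
        "gene_y": (9.0, 159.0),
        "gene_z": (100.0, 120.0),
    }
    up_in_a, up_in_b = classify_genes(toy_expression)
    print("higher in strain A:", up_in_a)
    print("higher in strain B:", up_in_b)
```

Genes whose absolute log2 fold change does not exceed the threshold, such as gene_z in the toy data, fall into neither group.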

Fig. 8

Differential expression of the genes involved in the type III and type VI secretion systems and their regulators in P. aeruginosa ATCC 27853 and PAO1. Gene locus tags in PAO1 are shown. Values following the gene locus tags of regulators indicate log2 gene expression changes in PAO1 relative to ATCC 27853 (red indicates higher expression in PAO1 than in ATCC 27853; green indicates higher expression in ATCC 27853 than in PAO1). The full list of genes displaying differential expression in the two strains and their values are provided in Additional file 1: Table S3

An interesting observation concerns the expression patterns of the genes encoding the various secretion systems in P. aeruginosa. The components of the type III secretion system (T3SS), such as genes in the psc, pcr and exs gene clusters, display remarkably higher expression levels in ATCC 27853 than in PAO1 (Figs. 7 and 8), whereas those of the type I secretion system (T1SS) display a relatively higher expression level in PAO1 than in ATCC 27853. The type VI secretion system (T6SS) includes three hemolysin co-regulated protein (Hcp) secretion islands, HSI-I, HSI-II and HSI-III. HSI-I displayed a higher relative expression level in ATCC 27853 than in PAO1, whereas HSI-II and HSI-III showed the opposite pattern, i.e., they were expressed at a higher level in PAO1 than in ATCC 27853 (Figs. 7 and 8).

Discussion

Morphogenesis in PAO1 and ATCC 27853

Surface characteristics play an important role in the morphogenesis of bacteria. P. aeruginosa is a well-established model organism for the study of biofilms [37]. Outer membrane LPS and extracellular appendages, such as flagella, type IV pili and Cup fimbriae, are involved in the initial attachment of bacteria to a surface [38]. The present comparative genomic and transcriptomic study of P. aeruginosa ATCC 27853 and PAO1 revealed distinct genetic and expression patterns of surface-associated proteins in ATCC 27853. Lack of the B-band O-antigen (A+B−) has been reported to lead to increased hydrophobicity of the cell surface and enhanced adherence to polystyrene materials [39]. Increased expression of type IV pili biosynthesis genes and flagellar motility genes also enhances bacterial adherence to various surfaces during the initiation of a biofilm. Our transcriptome analysis showed such expression patterns for these genes in ATCC 27853, which is consistent with the observed enhanced colony biofilm formation of the strain.

Three types of exopolysaccharides, alginate, Psl and Pel, play an important role in the biofilm maturation and development stage. Alginate has been proposed not to be a critical component of the extracellular polysaccharide matrix in nonmucoid P. aeruginosa strains [40]. The low expression levels of alginate biosynthesis genes in PAO1 and ATCC 27853 are consistent with the nonmucoid colony morphologies of the two strains. Previous studies demonstrated that Pel and Psl have distinct physical properties and functional roles during biofilm maturation and development [41]. The pel locus (referring to pellicle, a biofilm formed at the air-medium interface), containing the genes pelA-G, is responsible for synthesis of the glucose-rich component of the matrix, whereas the psl locus (polysaccharide synthesis locus), containing the genes pslA-O, is responsible for the mannose- and galactose-rich component, which forms a fiber-like matrix to enmesh bacterial communities [42]. Pel is required for close association of the two species in mixed-species microcolonies. In contrast, Psl is important for P. aeruginosa to form single-species biofilms. In the present study, expression of Pel biosynthesis genes was detected at a low level in both strains; however, a higher expression level of psl genes was observed in ATCC 27853 than in PAO1, indicating a role of Psl in the development of the ATCC 27853 colony biofilm. This result is also in agreement with a lower expression level of amrZ (PA3385) in ATCC 27853 than in PAO1, as the AmrZ transcriptional repressor controls switching between an alginate-producing mucoid state and a Pel-producing biofilm state through repression of psl genes [43, 44]. Another important signaling molecule whose level in the cell correlates with the capability of the bacterium to form biofilms is the second messenger c-di-GMP. However, expression of several genes encoding diguanylate cyclases and phosphodiesterases, which are involved in c-di-GMP production [45], was similar in PAO1 and ATCC 27853 in our comparative transcriptome analysis, suggesting that c-di-GMP did not play an important role in the distinctive colony biofilm formation observed in ATCC 27853 in comparison with that of PAO1.

A contribution of phenazine compounds to the biofilm development of P. aeruginosa has also been reported [24, 46,47,48]. Recently, it was found that PYO can promote biofilm development of the bacterium by binding to extracellular DNA and enhancing the formation of the extracellular matrix of biofilms [48]. A higher level of PYO production in ATCC 27853 than in PAO1 was observed in the present study. Thus, PYO may also contribute to the enhanced biofilm formation in ATCC 27853. The last step of PYO biosynthesis is the conversion of the zwitterionic intermediate 5-Me-PCA to the less charged PYO via hydroxylative decarboxylation. Interestingly, 5-Me-PCA, which is exported out of cells by the MexGHI-OpmD RND type efflux pump, was also shown to mediate the biofilm formation of P. aeruginosa PA14 [24]. It was proposed that export of 5-Me-PCA serves as a means of detoxification in P. aeruginosa, as does the conversion of this molecule to PYO, which decreases the charge of the molecule and allows the transport of the product (PYO) across the membrane without the assistance of an efflux pump [24]. Indeed, PYO was shown not to be a substrate of the MexGHI-OpmD pump. The mexGHI-opmD system is present in both PAO1 and PA14, but is lacking in ATCC 27853. Yet, a higher level of PYO is observed in ATCC 27853 than in PAO1. This suggests that ATCC 27853 may possess other means of detoxification that allow production of PYO at a high level while minimizing the potential cytotoxicity of the intermediate 5-Me-PCA. Indeed, our genomic analysis revealed a considerable number (122) of COGs that differ between ATCC 27853 and PA14: 71 COGs present in ATCC 27853 are absent in PA14, and 51 COGs present in PA14 are lacking in ATCC 27853 (Fig. 2). These interesting observations warrant a comparative molecular analysis of PYO biosynthesis in ATCC 27853, PAO1, and PA14.

Phylogenetic relations and accessory genes of P. aeruginosa ATCC 27853

In the phylogenetic tree constructed (Fig. 1), ATCC 27853 was shown to be extraordinarily closely related to three strains, P. aeruginosa T38079, P. aeruginosa F9670 and P. aeruginosa S86968. Sequences of these three strains became available only very recently in NCBI GenBank, and we included them in our phylogenetic analysis. However, this observation does not necessarily mean that these four strains are almost identical, because the SNPs used to construct the phylogenetic tree were extracted from the core genomic regions of all 59 strains whose complete genome sequences are available. The SNPs do not cover the accessory genomes, which are unique to each of the strains. Thus, the resulting relatedness of the strains in the phylogenetic tree does not reflect their associations at the complete genome level. Nevertheless, in the dataset we extracted, only 146 SNPs were identified among these four strains. Furthermore, the three strains and ATCC 27853 are assigned to the same multi-locus sequence type (MLST, https://pubmlst.org/paeruginosa/) and the same phylogenetic group based on NCBI GenBank, indicating very similar genomic contents of these four strains.

The core genome and accessory genes are the two main components of the genomes of different P. aeruginosa strains [2]. Accessory genes are associated with genomic islands and islets that contribute to the diversification of strains within the species. This is termed diversifying selection. Certain selective pressures might be responsible for the acquisition of these accessory genes and the resulting genome diversity among different strains within the same species.

With the complete genome of P. aeruginosa ATCC 27853 in hand, its accessory genes were extensively examined in the current study. Among these accessory genes, the most prominent observation was the presence of seven prophages. Prophages contained in bacterial genomes have been shown to play important roles in the physiology of the host species [49]. For example, two tandem defective phage (pyocin) islands in the P. aeruginosa PAO1 genome are determinants of the fluoroquinolone susceptibility of the strain [4]. Another study on P. aeruginosa LESB58 (Liverpool Epidemic Strain) demonstrated that the four prophages present in its genome could enhance the competitiveness of the strain in a chronic rat lung infection model [5]. The abundance of prophages in the genome of ATCC 27853 implies the complexity and strong fitness potential of the strain. However, expression of these prophages was found to be low or non-detectable in the present study based on the transcriptome data (Fig. 7). This was probably due to the rich growth medium used in this study. Elucidating the functions of the genes within these prophages, especially those encoding several transcription factors, may help to disclose the potential roles of the prophages in the fitness of ATCC 27853 in harsh, non-laboratory environments in nature and in animal hosts.

Secretion systems

Secretion systems are important for the adaptation and pathogenesis of P. aeruginosa through dedicated secretion of specific exoproteins [50]. It has been shown that type III secretion systems (T3SSs) are correlated with acute infections in P. aeruginosa, while type VI secretion systems are often associated with chronic infections and biofilm formation of the species [47]. In the present study, genes encoding the T3SS were found to be expressed at a remarkably higher level in ATCC 27853 than in PAO1 (Figs. 7 and 8). The genes encoding transcriptional activators of the T3SS, e.g. exsA, were also expressed at a higher level in ATCC 27853 than in PAO1. Interestingly, a differential expression pattern of the three Hcp islands of the T6SS was observed in these two strains: while HSI-II and HSI-III were expressed at a higher level in PAO1, HSI-I expression was higher in ATCC 27853 (Fig. 8). The three Hcp islands of P. aeruginosa have been assigned to different phylogenetic groups based on phylogenetic analysis, indicating a distinct evolutionary history of the three components [51]. This also suggests different roles of these three HSI islands during pathogenesis of P. aeruginosa. In addition, previous studies have demonstrated that the expression of these three Hcp islands of the T6SS is mediated by different regulators [47]. LasR and RhlR positively regulate the expression of the HSI-II and HSI-III gene clusters, and LasR negatively regulates the HSI-I gene cluster in P. aeruginosa [47]. This is consistent with the higher expression level of LasR and RhlR in PAO1 compared with that in ATCC 27853 (Fig. 8). These observations indicate the complex expression patterns and functional roles of these secretion systems in the physiology and pathogenicity of different P. aeruginosa strains.

Conclusions

In summary, several genomic features of P. aeruginosa ATCC 27853 were identified based on the complete genome sequence generated using Pacific Biosciences SMRT (PacBio) technology. Compared with the genomes of the other two frequently used model strains, P. aeruginosa PAO1 and PA14, three unique genomic islands are present in P. aeruginosa ATCC 27853 that contain genes possibly related to the metabolism of aromatic compounds. Seven prophages are predicted, including Prophage 2, which is located adjacent to the phz1 phenazine biosynthesis gene cluster. A survey of virulence-related genes revealed that P. aeruginosa ATCC 27853 lacks a gene cluster encoding the B-band O-antigen of LPS, which is important in the evasion of host immune responses and in biofilm formation. Transcriptome analysis revealed differential expression of several groups of genes encoding surface-associated proteins, of those involved in cellular redox metabolism, and of the type I, III and VI secretion systems, confirming that the surface characteristics of ATCC 27853 differ from those of PAO1 and suggesting unique physiological and pathogenic potential of ATCC 27853. This information provides a genetic basis for a comprehensive understanding of the physiology, pathogenicity, and virulence of the strain.

Methods

Culture of bacterial cells and genomic DNA extraction

The P. aeruginosa ATCC 27853 strain used in the present study was a gift from the Chinese University of Hong Kong (CUHK). It was cultivated in Luria-Bertani (LB) broth overnight with shaking (150 rpm) at 37 °C. Bacterial cells were harvested from 1 ml of liquid culture by centrifugation at 10,000 rpm for 10 min. Genomic DNA of P. aeruginosa ATCC 27853 was extracted using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany). The concentration and quality of the genomic DNA were determined by NanoDrop and gel electrophoresis.

Colony morphology assay

Congo red plates were prepared following the protocol described by Dietrich et al. with slight modifications [29]. Briefly, 1% tryptone and 1% agar were mixed with 40 μg/ml Congo red and 20 μg/ml Coomassie blue and poured into square petri dishes. A 10 μl aliquot of an overnight culture of P. aeruginosa inoculated from a single colony was spotted onto the square agar plates, followed by incubation at 25 °C for up to 9 days. Colony morphologies were recorded daily.

Extraction and quantification of pyocyanin

Pyocyanin from liquid cultures harvested at different time points was extracted and measured following the protocol used by Recinos et al. and Apidianakis et al. with slight modifications [26, 52]. The supernatant was collected after centrifugation at 13,000 rpm for 5 min and mixed with 0.6 volume of chloroform, followed by vortexing twice for 10 s. After centrifugation at 13,000 rpm for 5 min, the blue layer at the bottom was transferred to a new tube and mixed with 0.5 volume of 0.2 M HCl by vortexing twice for 10 s. After a further centrifugation at 13,000 rpm for 5 min, 0.1 ml of the pink layer was transferred to a 96-well plate. Absorbance was determined at 510 nm.

RNA preparation

Total RNA was extracted from triplicates of both ATCC 27853 and PAO1. A 10 μl aliquot of an overnight culture of P. aeruginosa PAO1 or ATCC 27853 in LB broth, inoculated from a single colony, was spotted on the LB agar surface and incubated at 25 °C for 2 days. Cell patches were scraped from the plates and resuspended in 1 ml LB medium. Then, 0.125 ml of ice-cold phenol/ethanol stop solution (5:95, v/v, Ambion™ water-saturated phenol at pH 6.6) was mixed with the bacterial suspension and placed on ice for 10 min to stop mRNA degradation. The mixture was subsequently centrifuged at 4800 rpm for 10 min at 4 °C. The supernatant was removed and the cell pellet was stored at −80 °C for RNA extraction. RNA was extracted using the RNeasy Mini kit (Qiagen, Hilden, Germany) following the manufacturer's instructions. The quality of the extracted RNA was verified by Agilent Bioanalyzer analysis at the Genome Research Centre of The University of Hong Kong (all RNA Integrity Numbers, RIN, were over 7). Stranded libraries for all RNA samples were constructed with Kapa Biosystems RNA library preparation chemistry at the Georgia Genomics Facility at the University of Georgia.

Sequencing and de novo assembly

Whole genome sequencing of P. aeruginosa ATCC 27853 was performed on the PacBio RS II single-molecule, real-time (SMRT) sequencing platform using a 20 kb insert library and P6-C4 chemistry (Pacific Biosciences, Menlo Park, CA) by Macrogen (Korea). Raw SMRT reads were error corrected, de novo assembled, and then polished using the SMRT Analysis workflow [53] from Pacific Biosciences. The genome was checked for circularization by self-aligning the contig and inspecting the dotplot for sticky edges (the dotplot was created in Gepard [54]). Circularization was carried out by trimming one end of the contig and then collapsing it using Minimus2 [55]. The genome of P. aeruginosa ATCC 27853 and the transcriptomes of the two strains, PAO1 and P. aeruginosa ATCC 27853, were sequenced on the Illumina NextSeq platform (Illumina, San Diego, California, USA) using a 300-cycle PE150 High Output flow cell at the Georgia Genomics Facility at the University of Georgia. DNA-seq raw reads from P. aeruginosa ATCC 27853 were aligned to the single PacBio contig and the Variant Call Format (VCF) file was generated with SAMtools [56].

Genome annotations

Automated gene calling and annotation were carried out using the National Center for Biotechnology Information (NCBI) Prokaryotic Genome Annotation Pipeline 2.0 (PGAP) [57]. We assessed and validated the annotation by comparing it with those from the Rapid Annotations using Subsystems Technology (RAST) Server [58] and from Prokka [59]. tRNA genes were predicted using tRNAscan-SE 1.3.1 [60] and rRNA genes using RNAmmer 1.2 [61]. Metabolic pathways were predicted in silico using KAAS [62]. Protein sequences of P. aeruginosa ATCC 27853 were searched against the Clusters of Orthologous Groups (COG) database using BLAST with an e-value cutoff of 1e-5 [63].

Prediction of prophage and genomic islands

Prophages in the genome of P. aeruginosa ATCC 27853 were predicted using the online tools Prophinder, with the parameters (Scanning window size: 20,50,100,200,300; Minimum nb of CDS in prophages: 20; Minimum nb of ACLAME hits: 20; Blast Eval threshold: 1e-5; Minimum DR size: 10) [18], and PHAST [19]. IslandViewer was used with two methods, SIGI-HMM and IslandPath-DIMOB [15], to predict genomic islands (GIs). Hypothetical genes in prophages or GIs annotated by the methods mentioned above were also searched with BLAST against the Pfam database, which is built on protein modules, to improve the annotations [64]. In addition, all available complete genomes of P. aeruginosa in GenBank were surveyed with PHAST to predict prophages [19].

Virulence gene prediction

In P. aeruginosa PAO1, 273 virulence genes were identified based on a conserved list of 369 virulence genes in Pseudomonas species obtained from the Virulence Factor Database (VFDB) [27], Victors Virulence Factors (PHIDIAS) (http://www.phidias.us/victors/index.php), and curation by the Pseudomonas Genome Database (PseudoCAP) [65], with a primary focus on P. aeruginosa PAO1 and P. aeruginosa PA14. These 273 virulence proteins were searched against all proteins in ATCC 27853 using BLASTp with an e-value cutoff of 1e-5. Those without a positive result in the BLAST search were considered absent from ATCC 27853. All protein sequences of ATCC 27853 were also searched against this conserved list of virulence genes with the same e-value cutoff of 1e-5.
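The presence/absence call described above can be illustrated with a minimal Python sketch. It assumes that the BLASTp results are available in the default 12-column tabular format (-outfmt 6), in which column 1 is the query ID and column 11 is the e-value. The function names and the toy gene identifiers are hypothetical and are not part of the original analysis.

```python
from typing import Iterable, List, Set


def queries_with_hits(blast_tab_lines: Iterable[str],
                      max_evalue: float = 1e-5) -> Set[str]:
    """Return query IDs with at least one hit at or below max_evalue.

    Assumes default BLAST tabular output (-outfmt 6):
    column 1 = query ID, column 11 = e-value.
    """
    hits = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits


def absent_genes(all_queries: Iterable[str],
                 blast_tab_lines: Iterable[str],
                 max_evalue: float = 1e-5) -> List[str]:
    """Queries with no hit passing the e-value cutoff, sorted by ID."""
    found = queries_with_hits(blast_tab_lines, max_evalue)
    return sorted(q for q in all_queries if q not in found)


# Toy data (hypothetical identifiers, not real results).
example_queries = ["geneA", "geneB", "geneC"]
example_blast = [
    "geneA\ttarget_1\t98.5\t300\t4\t0\t1\t300\t1\t300\t1e-150\t550",
    "geneB\ttarget_2\t28.0\t60\t40\t3\t5\t64\t10\t69\t0.5\t25",
]
print(absent_genes(example_queries, example_blast))  # ['geneB', 'geneC']
```

In this toy example, geneB has only a weak hit (e-value 0.5) and geneC has no hit at all, so both are reported as absent.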

Comparative analysis of genomes

Four draft genomes of P. aeruginosa ATCC 27853 were retrieved from GenBank (Table 1) [9,10,11,12] and compared with the complete genome obtained in the current study. Fifty-eight complete genomes of P. aeruginosa were also retrieved from GenBank and compared with ATCC 27853 using progressiveMauve with default settings [66]. Proteins present exclusively in an individual strain and those shared between two or three strains, based on Mauve and COG BLAST analysis, were counted and represented in Venn diagrams generated with the VennDiagram package in R [67]. To call single nucleotide polymorphisms (SNPs) between PAO1 and ATCC 27853, a VCF file was first generated using Parsnp from the Harvest tools [13]. The VCF file was then annotated with SnpEff using PAO1 as the reference genome [68].
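The counting step behind such Venn diagrams can be sketched in Python as follows. This is an illustration only: the study used the VennDiagram package in R, and the function name and the protein-family labels below are hypothetical placeholders.

```python
from typing import Dict, FrozenSet, Set


def venn_regions(sets: Dict[str, Set[str]]) -> Dict[FrozenSet[str], int]:
    """Count items that belong to exactly each combination of named sets.

    The key of the result is the set of strain names sharing an item;
    the value is the number of items found in exactly those strains.
    """
    counts: Dict[FrozenSet[str], int] = {}
    all_items = set().union(*sets.values())
    for item in all_items:
        members = frozenset(name for name, s in sets.items() if item in s)
        counts[members] = counts.get(members, 0) + 1
    return counts


# Toy protein-family labels per strain (placeholders, not real data).
families = {
    "ATCC 27853": {"fam1", "fam2", "fam3"},
    "PAO1": {"fam2", "fam3", "fam4"},
    "PA14": {"fam3", "fam5"},
}
regions = venn_regions(families)
for strains, n in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(strains), n)
```

Each key identifies one region of the diagram, for example the families found only in ATCC 27853 or those shared by all three strains.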

Table 1 List of the genomic features of P. aeruginosa ATCC 27853 revealed from the complete genome (this study) and those of previous publications
Table 2 COG identified in P. aeruginosa ATCC 27853 but absent in PAO1
Table 3 List of genomic islands identified in P. aeruginosa ATCC 27853
Table 4 List of prophages identified in P. aeruginosa ATCC 27853
Table 5 List of virulence genes present in PAO1 but absent or with low identity in P. aeruginosa ATCC 27853

Phylogenetic analyses

A phylogenetic analysis was performed to validate the phylogenetic position of P. aeruginosa ATCC 27853. Parsnp from the Harvest tools [13] was employed with default settings to collect single nucleotide polymorphisms (SNPs) from all currently available complete genomes of P. aeruginosa, and 269,561 SNPs were submitted for phylogenetic analysis under a maximum likelihood (ML) criterion in MEGA [14]. Parameters for this analysis included the Tamura-Nei substitution model, gamma-distributed rates among sites, the Nearest-Neighbor-Interchange (NNI) ML heuristic method for tree inference, an automatically generated initial tree (NJ method), and a bootstrap test with 100 replicates.

RNA-seq quality processing

We performed quality control (QC) on the raw Illumina RNA-Seq data using BBduk2 (BBMap short read aligner, http://sourceforge.net/projects/bbmap). Reads were culled based on a minimum average quality of 20 over a window of 7 bp. Low quality read edges were trimmed and reads containing more than two ambiguous bases were removed. Finally, read pairs were trimmed evenly and a minimum length of 60 bp was enforced.
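The read-level criteria listed above can be expressed as a small Python sketch. It is one possible reading of those criteria (a read is discarded if any 7 bp window has a mean Phred quality below 20, if it contains more than two ambiguous bases, or if it is shorter than 60 bp) and does not reproduce the internal algorithm of BBDuk. Edge trimming and even trimming of read pairs are not modelled, and all function names are hypothetical.

```python
from typing import Sequence


def min_window_mean(quals: Sequence[int], window: int = 7) -> float:
    """Lowest mean Phred score over all full-length sliding windows."""
    if not quals:
        return 0.0
    if len(quals) < window:
        return sum(quals) / len(quals)
    running = sum(quals[:window])
    lowest = running
    for i in range(window, len(quals)):
        # Slide the window by one base: add the new score, drop the oldest.
        running += quals[i] - quals[i - window]
        lowest = min(lowest, running)
    return lowest / window


def passes_qc(seq: str, quals: Sequence[int], window: int = 7,
              min_mean_q: float = 20.0, max_ambiguous: int = 2,
              min_len: int = 60) -> bool:
    """Apply the length, ambiguous-base, and window-quality criteria."""
    if len(seq) < min_len:
        return False
    if seq.upper().count("N") > max_ambiguous:
        return False
    return min_window_mean(quals, window) >= min_mean_q


good_seq = "ACGT" * 20                           # 80 bp, no ambiguous bases
good_quals = [30] * 80
bad_quals = [30] * 40 + [10] * 7 + [30] * 33     # one 7 bp low-quality stretch
print(passes_qc(good_seq, good_quals))  # True
print(passes_qc(good_seq, bad_quals))   # False
```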

RNA-seq read mapping

QC reads were mapped to their respective reference genomes in two stages. First, QC reads were aligned using BWA-MEM with default parameters [69]. The second round of read mapping was conducted using Stampy with the output from BWA-MEM (with Stampy’s --bamkeepgoodreads -M options) [70]. SAMtools and BamTools were used for format conversions, statistics, and quality assessment and control [53, 71]. IGV tools were also used to visually inspect mapping quality to ensure its accuracy [72].

Fragment counts and statistics

Fragments (our RNA-Seq data are stranded) were counted per genomic feature (gene) using featureCounts [73]. Reads that mapped with MAPQ scores below 10 were removed. Enforcing a minimum MAPQ score of 10 also excludes multi-mapped reads, although the percentage of reads in this category is low (data not shown). Multi-mapping was determined using default parameters. Read pairs were checked for proper pairing as well as proper insert size. Counting was performed for each gene based on its locus_tag. Read counts were used as input for DESeq analysis [30]. Genes with a mean normalized expression of <50 reads in all samples were considered transcriptional noise and filtered out of the analysis. In DESeq, a log2(fold-change) of ≥ 2 or ≤ −2 for each expressed gene and a p-value < 0.05 [cut-off at 5% false discovery rate (FDR)] were employed as thresholds for the statistical analysis.
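The filtering thresholds above can be summarized in a short Python sketch applied to a table of DESeq-style results. The field names and the example values are hypothetical; the sketch assumes that the noise filter applies to the mean normalized count across samples and that the significance cut-off applies to the FDR-adjusted p-value. A log2(fold-change) of 2 corresponds to a 4-fold difference in expression.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneResult:
    locus_tag: str           # gene identifier used for counting
    base_mean: float         # mean normalized read count across samples
    log2_fold_change: float  # log2 ratio between the two strains
    padj: float              # FDR-adjusted p-value


def is_differentially_expressed(g: GeneResult, min_mean: float = 50.0,
                                min_abs_lfc: float = 2.0,
                                max_padj: float = 0.05) -> bool:
    """Apply the noise filter, fold-change, and significance thresholds."""
    return (g.base_mean >= min_mean
            and abs(g.log2_fold_change) >= min_abs_lfc
            and g.padj < max_padj)


# Toy results (hypothetical locus tags and values).
results: List[GeneResult] = [
    GeneResult("gene_1", 500.0, 3.1, 0.001),   # passes all thresholds
    GeneResult("gene_2", 20.0, 5.0, 0.0001),   # below the noise filter
    GeneResult("gene_3", 800.0, -2.4, 0.01),   # passes (lower in one strain)
    GeneResult("gene_4", 300.0, 1.2, 0.001),   # fold change too small
    GeneResult("gene_5", 300.0, -3.0, 0.2),    # not significant
]
selected = [g.locus_tag for g in results if is_differentially_expressed(g)]
print(selected)  # ['gene_1', 'gene_3']
```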