Introduction

Pseudomonas syringae strains have been isolated from more than 180 host species [1] across the entire plant kingdom, including many agriculturally important crops such as bean, tomato, and cucumber, as well as kiwi, stone fruit, and olive trees. Strains are divided into more than 50 pathovars, primarily based on host specificity, disease symptoms, and biochemical profiles [2–4]. The first strain of this species was isolated from a lilac tree (Syringa vulgaris), which gave rise to its name [5]. The observed wide host range is reflected in a relatively large genetic heterogeneity among different pathovars. This is most pronounced in the complement of virulence factors, which is also assumed to be the key factor defining host specificity [6]. For successful survival and reproduction, both epiphytic and endophytic P. syringae strains deploy different sets of type III and type VI secretion system effectors, phytotoxins, exopolysaccharides (EPS), and other types of secreted molecules [6–11]. Currently, three completely sequenced P. syringae genomes have been published: pathovar syringae strain B728a, which causes brown spot disease of bean [12]; pathovar tomato strain DC3000, which is pathogenic to tomato and Arabidopsis [13]; and pathovar phaseolicola strain 1448A, the causal agent of halo blight on bean [14]. A number of incomplete genomes of varying quality are also available for other strains.

Pseudomonas syringae pv. syringae strain B64 was isolated from hexaploid wheat (Triticum aestivum) in Minnesota, USA [15]. The strain has been used in several studies, mainly addressing the phylogenetic diversity of P. syringae varieties [15–18], but never as an infection model for wheat. Sequencing the genome of strain B64 and comparing it with the other published genomes should reveal wheat-specific adaptations and give insight into virulence strategies for colonizing monocot plants.

Classification and features

Pseudomonas syringae belongs to the class Gammaproteobacteria. The detailed classification of this species is still heavily debated. Young and colleagues have proposed grouping all plant-pathogenic, oxidase-negative, fluorescent Pseudomonas strains into a single species, P. syringae, which is further subdivided into pathovars [4,19]. Several DNA hybridization studies have shown a large genetic heterogeneity among the groups; however, biochemical characteristics, with a few exceptions, did not allow these groups to be elevated to distinct species [20,21]. Currently, the species is divided into five phylogenetic clades based on MLST analysis. P. syringae pv. syringae (Pss) strains belong to group II within this nomenclature [22]. The basic characteristics of Pss B64 are summarized in Table 1, and its phylogenetic position is depicted in Figure 1.

Figure 1.

Phylogenetic tree constructed with the neighbor-joining method using an MLST approach [40] and the MEGA 5.10 software suite [41] with 1,000 bootstrap replicates. The tree features the three completely sequenced P. syringae model strains Pto DC3000, Pss B728a, and Pph 1448A, the strain Pss B64 itself, and another wheat-isolated strain, Pss SM. The model strains represent the major phylogenetic clades of P. syringae: I, II, and III, respectively. P. fluorescens Pf0-1 was used as an outgroup. The analysis confirms the placement of Pss B64 in clade II.

Table 1. Classification and the general features of P. syringae pv. syringae B64 according to the MIGS recommendations [23]

Pss B64 has physiological properties similar to those of other representatives of its genus. It can grow in complex media such as LB [42] or King’s B [43], as well as in various defined minimal media: HSC [44], MG-agar [45], PMS [46], AB-agar [47], and SRMAF [48]. Although the optimal growth temperature is 28°C, the bacterium can also replicate at 4°C. Growth is completely inhibited above 35°C. Pss B64 is capable of endophytic growth in the wheat leaf mesophyll, but does not seem to cause any symptoms unless a very high inoculation dose is applied.

The bacterium is weakly resistant to ampicillin (25 mg/L) and chloramphenicol (10 mg/L). Spontaneous rifampicin-resistant mutants can also be obtained. In addition, the genomic sequence predicts this strain to be insensitive to polymyxin B due to the presence of the arn gene cluster.

Genome sequencing information

Genome project history

The organism was selected for sequencing because it was found to carry a syringolin biosynthesis gene cluster [49]. Syringolin is a proteasome inhibitor produced by some strains of pathovar syringae. As a consequence of proteasome inactivation, a number of plant intracellular pathways are inhibited, including the entire salicylic acid-dependent defense pathway, which promotes the entry of bacteria into leaf tissue and subsequent endophytic growth [9]. Since it has so far not been possible to establish an infection model for syringolin in the model plant Arabidopsis, it was decided to explore another common research target and one of the most important crop plants, bread wheat (Triticum aestivum). The genome project has been deposited in the GenBank database (ID 180994), and the genome sequence is available under accession number ANZF00000000. The version described in this paper is the first version, ANZF01000000. The details of the project are shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA isolation

P. syringae pv. syringae strain B64 was grown in 40 mL of LB medium at 28°C and 220 rpm to an OD600 of ∼1.0. Genomic DNA was isolated from the pelleted cells using a Qiagen Genomic-tip 100/G column (Qiagen, Hilden, Germany) according to the manufacturer’s instructions.

Genome sequencing and assembly

A 3 kb paired-end library was generated and sequenced at the Functional Genomics Center Zurich on a Roche Genome Sequencer FLX+ platform. A total of 872,570 high-quality filtered reads comprising 188,465,376 bases were obtained, resulting in 31.8-fold average sequencing coverage. The reads were assembled de novo using Newbler 2.5.3. This resulted in 150 contigs combined into one 6 Mb super-scaffold and three smaller scaffolds of 5.29 kb, 2.84 kb, and 2.74 kb. The largest of the minor scaffolds constituted a ribosomal RNA operon, while the other two showed sequence similarity to non-ribosomal peptide synthetase modules. A portion of the intra-scaffold gaps was closed by Sanger sequencing of PCR products, decreasing the total number of contigs to 41 with a contig N50 value of 329.4 kb, the longest contig being 766.5 kb. Note that the GenBank record contains 42 contigs because one of the contigs was split into two parts so that the assembly starts with the dnaA gene. During gap closure it became possible to place all ribosomal operons by sequence overlap and thus to incorporate the largest of the minor scaffolds. However, it was not possible to map the remaining two minor scaffolds precisely. These must be located within two distinct remaining large gaps, but because they were not significant to the project, they were excluded from the assembly.
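The coverage figure quoted above follows from dividing the total number of sequenced bases by the genome size. The short Python sketch below reproduces that arithmetic and shows one common way of computing a contig N50. It assumes the genome size of 5,930,035 bp reported under Genome properties; the function names and the toy contig lengths are illustrative and are not taken from the original analysis pipeline.

```python
def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Average sequencing depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


def n50(lengths):
    """Length of the contig at which the running total of contig lengths,
    sorted from longest to shortest, first reaches half of the assembly size."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths supplied")


# Values from the text: 188,465,376 bases over a 5,930,035 bp genome.
coverage = fold_coverage(188_465_376, 5_930_035)
print(f"{coverage:.1f}-fold")  # 31.8-fold

# Hypothetical contig lengths, used only to illustrate the N50 calculation.
toy_contigs = [10, 8, 6, 4, 2]
print(n50(toy_contigs))  # 8
```

Applied to the real contig lengths of the assembly, the same function would be expected to return the reported N50 of 329.4 kb, although those lengths are not listed in this section.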

Genome annotation

Initial open reading frame (ORF), tRNA, and rRNA prediction and functional annotation were performed using the RAST (Rapid Annotation using Subsystem Technology) server [50]. For comparison, the genome was also annotated using Prokka [51], which uses Prodigal [52] for ORF prediction (the RAST server uses a modified version of Glimmer [53]). The start codons of all predicted ORFs were further verified manually, using the positions of potential ribosomal binding sites and BLASTP [54] alignments with homologous ORFs from other P. syringae strains as a reference. Functional annotations were also refined for every ORF using BLASTP searches against the non-redundant protein sequence database (nr) and the NCBI Conserved Domain search engine [55]. Functional category assignment and signal peptide prediction were done using the Integrated Microbial Genomes/Expert Review (IMG/ER) system [56].
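When two gene callers are compared, a practical first step is to list the ORFs on which they agree about the stop codon but disagree about the start, since these are the candidates for manual start-codon inspection. The Python sketch below illustrates that idea only; it is not the procedure used by the authors. The `Orf` record, the function names, and the coordinates are hypothetical, and parsing of actual RAST or Prokka output is deliberately left out.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    strand: str  # "+" or "-"
    left: int    # lower coordinate (1-based)
    right: int   # higher coordinate (1-based, inclusive)


def stop_key(orf: Orf):
    """Strand plus stop coordinate: the right end on "+", the left end on "-"."""
    return (orf.strand, orf.right if orf.strand == "+" else orf.left)


def start_pos(orf: Orf) -> int:
    """Start coordinate: the left end on "+", the right end on "-"."""
    return orf.left if orf.strand == "+" else orf.right


def conflicting_starts(calls_a, calls_b):
    """Pairs of ORFs that share a stop codon but differ in predicted start."""
    by_stop_b = {stop_key(o): o for o in calls_b}
    conflicts = []
    for a in calls_a:
        b = by_stop_b.get(stop_key(a))
        if b is not None and start_pos(a) != start_pos(b):
            conflicts.append((a, b))
    return conflicts


# Hypothetical predictions from two callers (coordinates are made up).
caller_a = [Orf("+", 100, 400), Orf("-", 500, 800), Orf("+", 900, 1200)]
caller_b = [Orf("+", 130, 400), Orf("-", 500, 800), Orf("+", 1000, 1300)]

for a, b in conflicting_starts(caller_a, caller_b):
    print(f"check start: {a} vs {b}")
```

In this toy input only the first ORF is flagged: the second is identical in both sets, and the third has different stop coordinates, so it is treated as two unrelated predictions.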

Genome properties

The genome of strain B64 is estimated to comprise 5,930,035 base pairs with an average GC content of 58.55% (Table 3 and Figure 2), which is similar to that observed in other P. syringae strains [12,13,53]. Of the 5,021 predicted genes, 4,947 were protein-coding, with the remainder including 4 ribosomal RNA operons and 61 tRNA genes; 78 were identified as pseudogenes. The majority of the protein-coding genes (83.65%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.
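GC content, and the GC skew plotted in the genome map of Figure 2, are simple functions of base counts. The Python sketch below shows the usual definitions, GC content as (G + C) / (A + C + G + T) and GC skew as (G - C) / (G + C), evaluated over non-overlapping windows. The function names, the window size, and the toy sequence are assumptions made for illustration; the original figure was not necessarily produced this way.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * gc / acgt


def gc_skew(seq: str) -> float:
    """(G - C) / (G + C); defined as 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed(seq: str, size: int, func):
    """Apply func to consecutive non-overlapping windows of the given size."""
    return [func(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


# Toy sequence used only for illustration.
toy = "GGGCATGCCCAT"
print(f"{gc_content(toy):.2f}%")   # 66.67%
print(windowed(toy, 6, gc_skew))   # [0.5, -0.5]
```

On a real chromosome the windows would typically span thousands of bases, and a change in the sign of the skew is commonly used as an indicator of the replication origin and terminus.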

Figure 2.

Graphical map of the chromosome. From the outside to the center: genes on the forward strand (colored by COG categories), genes on the reverse strand (colored by COG categories), RNA genes (tRNAs in green, rRNAs in red, other RNAs in black), GC content, and GC skew.

Table 3. Genome Statistics
Table 4. Number of genes associated with the 25 general COG functional categories

The genome contains a complete canonical type III secretion system and ten known effector proteins: AvrE1, HopAA1, HopI1, HopM1, HopAH1, HopAG1, HopAI1, HopAZ1, HopBA1, and HopZ3. Of these ten, the first five are present in all other sequenced P. syringae strains and thus constitute the effector core, whereas the latter five could be host determinants for wheat. Such a small number of effectors is not unusual and is also seen in other strains of clade II [22]. In addition, there are two complete type VI secretion system gene clusters and nine putative effector proteins belonging to the VgrG and Hcp1 families. The Pss B64 genome also encodes gene clusters for the biosynthesis of four phytotoxins: syringomycin, syringopeptin, syringolin, and mangotoxin. All of the above-mentioned genome components have previously been demonstrated to be involved in virulence and epiphytic fitness of P. syringae, as well as in competition with other microbial species [7–10,57–59]. Additional virulence-associated traits identified are: biosynthesis of the exopolysaccharides alginate, Psl, and levan; the surfactant syringofactin; type IV pili; large surface adhesins; the siderophores pyoverdine and achromobactin; proteases and other secreted hydrolytic enzymes; and RND-type transporters (including putative mexAB, mexCD, mexEF, and mexMN homologs [60,61]), all of which are found in other P. syringae strains. It is also notable that the inaZ gene, which encodes the ice-nucleation protein, is truncated by a frameshift, making this strain ice-negative. This contradicts the results of a previous study by Hwang and colleagues [16], in which Pss B64 was identified as ice-positive. The discrepancy could be due to an assembly error, or the frameshift could have been introduced at a later point during propagation.