Defining the Pseudomonas Genus: Where Do We Draw the Line with Azotobacter?
The genus Pseudomonas has gone through many taxonomic revisions over the past 100 years, going from a very large and diverse group of bacteria to a smaller, more refined and ordered list having specific properties. The relationship of the Pseudomonas genus to Azotobacter vinelandii is examined using three genomic sequence-based methods. First, using 16S rRNA trees, it is shown that A. vinelandii groups within the Pseudomonas close to Pseudomonas aeruginosa. Genomes from other related organisms (Acinetobacter, Psychrobacter, and Cellvibrio) are outside the Pseudomonas cluster. Second, pan genome family trees based on conserved gene families also show A. vinelandii to be more closely related to Pseudomonas than other related organisms. Third, exhaustive BLAST comparisons demonstrate that the fraction of shared genes between A. vinelandii and Pseudomonas genomes is similar to that of Pseudomonas species with each other. The results of these different methods point to a high similarity between A. vinelandii and the Pseudomonas genus, suggesting that Azotobacter might actually be a Pseudomonas.
Pseudomonas bacteria are naturally widespread in the environment. For example, the plant pathogen Pseudomonas syringae has been linked to the environmental water cycle as an ice nucleus in clouds and is found in rain, snow, lakes, and plants. Because of its abundance in the environment, the Pseudomonas genus was first characterized long ago, and over the past hundred years, it has gone through many taxonomic revisions. The number of organisms placed in the Pseudomonas group grew steadily over a period of 60 years. However, through refinement of the defining criteria, many bacteria were moved to other genera over the next 50 years [24, 36, 42, 47].
Early studies based on rRNA–DNA hybridization postulated five rRNA subdivisions in the genus, where rRNA group I, including the type species Pseudomonas aeruginosa, kept the genus name Pseudomonas. Studies on the determination and comparison of 16S rRNA sequences of Pseudomonas species resulted in the clustering of Pseudomonas into two groups: P. aeruginosa and Pseudomonas fluorescens. Later, the extensive study by Anzai and collaborators of more than 100 Pseudomonas species, based on 16S rRNA sequence comparison, suggested seven clusters within Pseudomonas sensu stricto, which also agreed in part with Palleroni's report in 1973. Although 16S rRNA analysis is still a widely accepted method, debate over the poor resolution of phylogenetic analysis with rrs gene sequences led to the idea of using other marker genes to characterize and classify Pseudomonas, such as gyrB, rpoD, oprI, oprF, and rpoB sequences [2, 8, 13, 57]. In another study, ten housekeeping genes were used to assess the phylogeny of 2,4-diacetylphloroglucinol-producing fluorescent Pseudomonas spp. Phenotypic methods, such as siderotyping, were also suggested for the classification of plant-associated Pseudomonas. Pseudomonas sensu stricto (rRNA similarity group I) could be further divided into subgroups, owing to its considerable heterogeneity, based on pathogenicity or pigment production.
Today, 202 species are assigned to Pseudomonas on the Approved Lists of Bacterial Names, where classification depends on a combination of 16S rRNA analysis, cellular fatty acid analysis, and differentiating classical physiological and biochemical tests. The genus consists of a group of medically and biotechnologically important bacteria that are inhabitants of a wide range of niches including soil and water environments, in addition to plant and animal associations. Hence, they are well known for their enormous metabolic versatility [17, 18, 47]. They are non-sporulating, aerobic Gram-negative rods that are found in biofilms or in planktonic form. Most of the pathogenic members are associated with plants, whereas several strains are pathogenic to animals.
Azotobacter vinelandii, in this context, is interesting because of the metabolic characteristics it shares with Pseudomonas. A nitrogen-fixing member of the Gammaproteobacteria, A. vinelandii is found mostly in soil environments where its nitrogen and energy metabolism is significant to agriculture. Many years ago, this organism was often used in biochemistry experiments for isolating enzymes for kinetics studies, which resulted in surprising yields and qualities. It is a free-living obligate aerobe known for its exceptionally high respiratory rate, yet it can still fix atmospheric nitrogen by using a respiratory protection mechanism. It also has distinct properties, such as a dramatic increase in the number of chromosome copies on reaching stationary phase; the formation of cysts under carbon depletion, which help the bacteria resist dehydration and in which alginate is a structural component; and the accumulation of poly-beta-hydroxybutyrate at the end of exponential growth as a carbon and energy store. Although the Azotobacter genus has been studied for over 100 years in various experiments, there is currently only one complete genome sequence available in the NCBI GenBank database: A. vinelandii DJ. There are no further ongoing projects listed for this genus among several thousand bacterial genome projects.
Azotobacter and Pseudomonas are members of the Pseudomonadaceae family. Both show significant genomic diversity and genetic adaptability across a wide range of niches. However, numerous studies show that they share many biochemical metabolic pathways such as nitrogen fixation, alginate production, and respiratory mechanisms, and they are found in similar environments [11, 58]. It was long thought that Pseudomonas species (sensu stricto) do not have nitrogen fixation abilities; however, it has recently been demonstrated that some Pseudomonas strains can fix nitrogen and that their genes related to this machinery closely resemble those of A. vinelandii [39, 58, 59]. Another similarity is alginate production in A. vinelandii; alginate is also a by-product of pathogenic P. aeruginosa infections in the lungs of cystic fibrosis patients. However, other phenotypic characteristics of Pseudomonas, such as cell morphology and motility, have been shown to differ from those of Azotobacter species. This suggests that the diversity in some phenotypic characteristics might be the outcome of their adaptive properties, since the two genera share the same set of core housekeeping genes or other conserved genes. In this context, analysis of 16S rRNA gene sequences by Rediers and collaborators showed that A. vinelandii was in the P. aeruginosa clade, sharing 96% identity with the P. aeruginosa PAO1 strain. Owing to the low resolution of 16S rRNA sequences for these genomes, they conducted a phylogenetic analysis of 25 protein-coding genes, some of which were housekeeping genes. Phylogenetic trees generated with their dataset again revealed that A. vinelandii homologues cluster within or close to the Pseudomonas group. The consensus tree from these 25 topologies placed A. vinelandii closest to P. aeruginosa PAO1, leading them to conclude that A. vinelandii belongs to the Pseudomonas genus.
Young and Park take a broader approach to this idea, considering all the morphological differences, and conclude that Azotobacter species can be transferred to Pseudomonas along with a change in the criteria used for classification.
In this article, we analyze the evolutionary relationship of the Pseudomonas genus to A. vinelandii and discuss whether or not this species is actually a Pseudomonas, using comparative genomic methods such as phylogenetic trees, pan–core genome analysis, and protein BLAST across whole genomes. Genomes from the related genera Acinetobacter, Psychrobacter, and Cellvibrio were also included in the analysis to provide better resolution. All genomes are therefore members of the order Pseudomonadales, in which Pseudomonas, Azotobacter, and Cellvibrio belong to the Pseudomonadaceae family and Acinetobacter and Psychrobacter are members of the Moraxellaceae.
Materials and Methods
Gathering Genomes and Gene Annotation
All 29 genomes used in the analysis are complete sequences downloaded from GenBank. The list consists of A. vinelandii, 17 Pseudomonas genomes (including P. aeruginosa, Pseudomonas putida, Pseudomonas entomophila, Pseudomonas mendocina, P. fluorescens, P. syringae, and Pseudomonas stutzeri), 7 Acinetobacter, 3 Psychrobacter, and one Cellvibrio genome.
List of genomes used in the comparative analysis (the colors of the groups are the same throughout all the figures)
Total size (bp)
No. of genes
Acinetobacter baumannii AYE
Acinetobacter baumannii AB307-0294
Acinetobacter baumannii AB0057
Acinetobacter baumannii ACICU
Acinetobacter baumannii ATCC 17978
Acinetobacter sp. ADP1
Acinetobacter baumannii SDF
Psychrobacter arcticus 273-4
Psychrobacter cryohalolentis K5
Psychrobacter sp. PRwf-1
Cellvibrio japonicus Ueda107
Pseudomonas mendocina ymp
Pseudomonas stutzeri A1501
Pseudomonas fluorescens Pf-5
Pseudomonas fluorescens Pf0-1
Pseudomonas fluorescens SBW25
Pseudomonas putida KT2440
Pseudomonas putida F1
Pseudomonas putida GB-1
Pseudomonas putida W619
Pseudomonas entomophila L48
Pseudomonas aeruginosa PAO1
Pseudomonas aeruginosa LESB58
Pseudomonas aeruginosa UCBPP-PA14
Pseudomonas aeruginosa PA7
P. syringae pv. tomato str. DC3000
P. syringae pv. syringae B728a
P. syringae pv. phaseolicola 1448A
Azotobacter vinelandii DJ
Phylogenetic Analysis—16S rRNA and Pan Genome Family Trees
Core and Pan Genome Analysis
16S rRNA Tree
According to the 16S rRNA phylogenetic tree in Fig. 1, A. vinelandii has a close relationship with Pseudomonas. The tree shows that the 16S rRNA sequence of this organism is very similar to the P. aeruginosa 16S rRNA sequences, even more similar than those of the other Pseudomonas species, as previously reported by Rediers et al. There are clear clusters among the different strains of the same species, with few exceptions, and as expected, genomes from the genus Pseudomonas cluster together when compared with the other Pseudomonadales, Acinetobacter and Psychrobacter, which are more distantly related. In general, the evolutionary relationships indicated in this tree are in agreement with the known biology of these organisms. Furthermore, an alignment including more 16S rRNA sequences obtained from RDP shows the same clustering results for Azotobacter (see ESM Fig. 1). All the sequences from the Azotobacter, Azorhizophilus, and Azomonas genera cluster close to the P. aeruginosa group, while other Pseudomonas species are more distant and Rhizobacter is an outgroup (note that it was not specifically selected as an outgroup in the methods).
Pan Genome Family Tree
The pan genome family tree (Fig. 2) compares the presence of gene families in each genome and derives the distances for the tree from the gene families that genomes have in common, so that genomes sharing more gene families cluster together. As expected, the Pseudomonas genus groups together and has clear clusters for each species, this time often with 100% bootstrap values at the main nodes for each species. The positions of some genomes have changed, such as Cellvibrio japonicus and P. fluorescens, but, most importantly for this work, A. vinelandii is now in a different position on this tree. In this figure, A. vinelandii does not group with any individual Pseudomonas species but with the Pseudomonas clusters as a whole, still indicating that it shares a larger fraction of conserved protein families with Pseudomonads than with the other Gammaproteobacteria.
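As a minimal sketch of the idea behind such a tree, the snippet below computes pairwise distances from presence/absence profiles of gene families. The choice of distance (the number of families present in exactly one of two genomes), the function names, and the toy family identifiers are all assumptions made for illustration; they are not taken from the tool used to produce Fig. 2.

```python
from itertools import combinations


def family_distance(families_a, families_b):
    """Count gene families present in exactly one of the two genomes."""
    return len(set(families_a) ^ set(families_b))


def distance_matrix(genomes):
    """Pairwise distances for a dict {genome_name: set_of_family_ids}."""
    names = sorted(genomes)
    dist = {(name, name): 0 for name in names}
    for a, b in combinations(names, 2):
        d = family_distance(genomes[a], genomes[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist


# Hypothetical toy data: three genomes described by invented family IDs.
toy_genomes = {
    "genome_A": {"fam1", "fam2", "fam3", "fam4"},
    "genome_B": {"fam1", "fam2", "fam3", "fam5"},
    "genome_C": {"fam1", "fam6", "fam7"},
}

dist = distance_matrix(toy_genomes)
for (a, b), d in sorted(dist.items()):
    if a < b:
        print(f"{a} vs {b}: {d}")
```

In this toy case, genome_A and genome_B differ by only two families and would be joined first by any clustering method, with genome_C attached further away.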
Core and Pan Genome Analysis
The core and pan genome analysis is another method that uses the proteomes of the genomes (Fig. 3). The plot shows how the number of gene families common to all compared genomes (the core genome) and the total number of gene families (the pan genome) change as genomes are added. The bars indicate the new genes and gene families found when each genome is compared by BLASTP against the genomes previously added to the list. Hence, every new gene family is accounted for in the pan genome, which increases with the addition of each new genome (blue line in Fig. 3), whereas the size of the core genome is reduced (red line). The order of the genomes is related to the evolutionary distance seen on the pan genome tree.
Overall, the core and pan genome plot shows that only 443 core gene families are conserved across all 29 genomes. The core genome size for just the Pseudomonas genomes is 1,706, and after A. vinelandii is added, it is reduced by 231 gene families, leading to 1,475 core gene families for the first 18 genomes. The pan genome size for all the strains adds up to 29,626 gene families. The increase in pan genome size after the addition of A. vinelandii is 1,506 gene families, and there are roughly 1,700 genes that are designated as new. After the addition of the Pseudomonas putida strains, the core genome for Pseudomonads drops steeply by 1,870 gene families and the pan genome increases sharply by 1,969 gene families. There is also a big jump of 2,275 gene families in the pan genome after the addition of Acinetobacter species, where the core genome drops by 855 gene families.
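The bookkeeping behind such a plot can be sketched as follows, assuming each genome has already been reduced to a set of gene family identifiers (in the article this grouping comes from BLASTP comparisons; here the sets are simply given). The function name and the toy data are hypothetical.

```python
def core_pan_curve(ordered_genomes):
    """Track core and pan genome sizes as genomes are added in order.

    ordered_genomes: list of (name, iterable_of_family_ids) pairs.
    Returns a list of (name, core_size, pan_size, new_families) tuples.
    """
    core = None
    pan = set()
    rows = []
    for name, families in ordered_genomes:
        families = set(families)
        new_families = families - pan      # families not seen in earlier genomes
        pan |= families                    # the pan genome can only grow
        core = set(families) if core is None else core & families  # the core can only shrink
        rows.append((name, len(core), len(pan), len(new_families)))
    return rows


# Hypothetical toy input: three genomes with invented family IDs.
toy_order = [
    ("g1", {"a", "b", "c", "d"}),
    ("g2", {"a", "b", "c", "e"}),
    ("g3", {"a", "b", "f"}),
]

curve = core_pan_curve(toy_order)
for name, core_size, pan_size, new_count in curve:
    print(f"{name}: core={core_size} pan={pan_size} new={new_count}")
```

Because the core is an intersection and the pan genome a union, the result depends on the order in which genomes are added only at intermediate steps; the final core and pan sizes are the same for any ordering.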
The BLASTMatrix shows the shared gene families between and within the compared genomes (Fig. 4). There is a high fraction of shared genes within the same species, as denoted in darker green colors on the bottom of the matrix. The genus Pseudomonas can readily be distinguished, and it is highlighted with the dark red triangle. The results closely resemble the relations in the pan genome tree where Psychrobacter and Acinetobacter have a very low homology with Pseudomonas species. Azotobacter on the other hand shares as many gene families with Pseudomonas as some of the other members of Pseudomonadaceae. More specifically, it shares between 24% and 31% of its protein families with Pseudomonas; the maximum homology is seen with P. stutzeri with 31.2%, where P. stutzeri is homologous with other Pseudomonas between 28% and 34%. Also seen from the matrix is the homology within the species which is on average 79% for P. aeruginosa, 66% for P. putida, 49% for P. fluorescens, 65% for P. syringae, and 72% for Acinetobacter baumannii. In contrast, homology across various species within Pseudomonas indicates a level between 30% and 50%.
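To make the percentages concrete, the sketch below computes a shared-family percentage for one pair of genomes. Normalising by the union of both genomes' families is an assumption made here for illustration; the article does not state the denominator, and normalising by the families of a single genome would give different values. The function name and toy identifiers are hypothetical.

```python
def shared_percentage(families_a, families_b):
    """Percentage of gene families found in both genomes, relative to
    all families found in either genome (an assumed normalisation)."""
    a, b = set(families_a), set(families_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Hypothetical toy data: 3 shared families out of 5 distinct families.
genome_x = {"fam1", "fam2", "fam3", "fam4"}
genome_y = {"fam1", "fam2", "fam3", "fam5"}

pct = shared_percentage(genome_x, genome_y)
print(f"shared: {pct:.1f}%")
```

Applying such a function to every pair of genomes fills one triangle of a matrix like the one in Fig. 4; the within-genome cells would need a separate definition.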
The results of using different comparative genomic methods to understand the phylogeny of Pseudomonas indicate a considerably close evolutionary relationship with A. vinelandii. In both the 16S rRNA tree and the pan genome family tree, A. vinelandii clusters with or within the Pseudomonas species. It is not as clear whether or not A. vinelandii should be classified as being closest to P. aeruginosa. Although they are clustered together in the 16S rRNA tree, the low resolution of 16S rRNA phylogenetic analysis has been noted by others [39, 59]. It does show, however, that their rrs genes are closer to each other than to those of other species, indicating a common ancestral gene. The 16S rRNA tree also shows that members of the Moraxellaceae family (Acinetobacter and Psychrobacter) are clustered together and the Pseudomonadaceae (Pseudomonas, Azotobacter, and Cellvibrio) are on another clade. In the pan genome family tree, on the other hand, clear clusters of each species can be seen. For example, P. fluorescens strains are in a group rather than separated as in the rRNA tree. Since this tree shows the distances of each group according to how many gene families they share, it is clear that A. vinelandii shares more gene families with the Pseudomonas group than it does with the other Pseudomonadales members used in this comparison, especially when compared with Cellvibrio, which is another genus in the same family. Since the results are restricted to only one Azotobacter genome, other supportive results are crucial in order to understand the true relationship.
The core and pan genome analysis reveals the set of conserved gene families (“core”) and the total number of gene families (“pan genome”) for the set of sequenced genomes compared. The core genome refers to the idea of a backbone genome for organisms in the same genus; as more species from the same genus are added, the core is expected to approach a stable plateau. The pan genome is, however, very flexible in size and, for some bacteria, can be quite large. For instance, the pan genome of Pseudomonas is more than ten times larger than its core genome. Figure 3 shows that the Pseudomonas core genome size does not change dramatically with the addition of A. vinelandii. Although the pan genome increases slightly, such a small change in the core genome size is not expected from an organism that is in a different genus. The sharp decrease in the core genome when P. putida is added is due to its position on the list, right after P. aeruginosa, which is clearly in another group in the phylogenetic trees. However, after 18 genomes, the addition of a new genus (in this case, an Acinetobacter species) creates a big difference in the plot, reducing the core by more than half and expanding the pan genome by about 10%, which is an obvious result of adding a different genus to the list.
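The proportions behind these statements can be checked from the counts reported in the Results, assuming that the 855-family drop is measured from the 1,475-family core of the first 18 genomes:

```latex
\begin{align*}
\text{Adding \textit{A. vinelandii}:} \quad & 1706 - 231 = 1475, & \frac{231}{1706} &\approx 0.135 \\
\text{Adding \textit{Acinetobacter}:} \quad & 1475 - 855 = 620,  & \frac{855}{1475} &\approx 0.580
\end{align*}
```

On this reading, adding A. vinelandii removes about 14% of the Pseudomonas core, whereas adding Acinetobacter removes about 58% of the remaining core.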
The BLASTMatrix also strongly agrees with the pan genome family tree and the pan–core genome plot, as they all compare gene families across the genomes. Among them, the BLASTMatrix provides more quantitative results on the similarities, such as percentages or numbers of gene families. According to this, Azotobacter has a high fraction of shared protein families with Pseudomonas, most of all with P. stutzeri, at 31%. The two also share similar fractions of protein families with other Pseudomonas, with A. vinelandii having on average 25% homology and P. stutzeri on average 30%. The relation between these two organisms might be due to the similar functions that they have in their environment, as they are both free-living, root-associated, nitrogen-fixing bacteria. However, phylogenetic analysis, based on 16S rRNA in this paper and on housekeeping genes in other works, suggests that A. vinelandii could have a closer evolutionary relationship with P. aeruginosa. Taking into account that P. aeruginosa is the type species for Pseudomonas and that it shares a common evolutionary history in the conserved genes with A. vinelandii, all of these results strongly suggest that A. vinelandii has a Pseudomonas-like backbone including the conserved genes, while the diversity among them is caused by adaptive strategies during their evolution, which come from the transfer of genetic material.
It should be noted that only complete genomes were chosen for the dataset, for the reliability of the methods. For many of the incomplete genomes in the database, 16S rRNA sequences are missing or partial, as are many protein sequences. In a study where the comparative analysis relies mostly on BLAST comparison of the proteomes, partial or poorly annotated sequences cause artifacts in the search, which leads to unreliable conclusions. Another aspect, from the taxonomic point of view, is that in theory a good classification method should be able to show the relations of organisms regardless of the amount of data. Hence, having more Pseudomonas genomes should not be the deciding factor in the discrimination of the organisms, especially at the genus level. On the other hand, having more Azotobacter genomes would make the results more reliable by reducing the effect of sequencing, assembly, and annotation errors.
The increase in the availability of genome sequences of different organisms makes comparative genomic analysis a fundamental part of research on evolutionary relations between organisms and the genetic basis of their diversity. This applies as well to Pseudomonas. Looking at the comparative genomic analysis of the nucleotide and encoded protein sequences of Pseudomonas, we can easily see the distinction between Pseudomonas species. This distinction is supported by the physical and biochemical classification that has been established over the last 100 years.
Although there is only one Azotobacter genome available, general inferences from these comparative analyses, which rely on extensively studied algorithms and tools, are worth examining. Support from similar analyses comparing these genera is also taken into account. In conclusion, we suggest that A. vinelandii has a Pseudomonas-like backbone genome where the core functions of these groups are the same. Various lines of evidence lead to this: A. vinelandii shares approximately a third of its protein families with Pseudomonas, it clusters with the whole Pseudomonas clade on the pan genome family tree rather than within the P. aeruginosa clade, and it does not cause a big drop in the core genome size. All three of these observations are consistent with the idea of having the same backbone, perhaps the same origins, but different adaptations throughout their evolution. This leads us to the question of the boundaries of new member assignments in the Pseudomonas genus. Where do we draw the line? We propose that, based on whole-genome analysis, it is possible to better differentiate members of a genus by looking at their core genome properties. The standards for classification should be set using comparative methods at both the DNA and protein levels. If this is the case, Azotobacter can be assigned to the Pseudomonas genus. In future investigations, a detailed analysis of the core genes could be made, with the genes functionally categorized, to reveal the basis of their similarity.
We would like to thank Peter Fisher Hallin, PhD, from the Center for Biological Sequence Analysis (CBS), Department of Systems Biology, The Technical University of Denmark, currently at Novozymes; Karin Lagesen from CBS, DTU and the Centre for Molecular Biology & Neuroscience and Institute of Medical Microbiology, University of Oslo; and Lars Snipen from Biostatistics, Dept. Chemistry, Biotech., and Food Sciences, Norwegian University of Life, for help in providing accurate and useful tools for this analysis, and Colleen Ussery for carefully proofreading and editing the manuscript. This research was supported in part by grant 09-067103/DSF from the Danish Council for Strategic Research.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.