Genome subtraction for the identification of potential antimicrobial targets in Xanthomonas oryzae pv. oryzae PXO99A pathogenic to rice

In pathogenic bacteria, essential proteins that are non-homologous to host plant proteins represent potential antimicrobial targets. We applied a subtractive genomics approach to identify novel antimicrobial targets in Xanthomonas oryzae pv. oryzae PXO99A, the causative agent of bacterial blight in rice. Comparative analysis was performed using BLAST, available from the NCBI. The analysis revealed that 27 essential protein sequences out of the 4,988 sequences of X. oryzae pv. oryzae PXO99A are non-homologous to Oryza sativa. Subsequent analysis of these 27 essential proteins revealed their involvement in different cellular functions such as transport activity, DNA binding, structural constituent of ribosome, cell division, translation, and plasma membrane. The 27 proteins were then analyzed for virulence and novelty, and three essential non-homologous proteins were identified as novel antimicrobial targets.


Introduction
The generation of vast genomic data from prokaryotic whole-genome projects in recent years has opened new avenues for finding novel drug targets in microbes (Buysse 2001). Genome sequences of pathogenic microbes have provided a wealth of information that now facilitates in silico identification and characterization of potential therapeutic targets and virulence factors of pathogens (Amineni et al. 2010; Dutta et al. 2006; Miesel et al. 2003). A potential therapeutic target should be an essential component of a metabolic pathway in the pathogen, should be selective enough to yield a drug that is specific to the pathogen, and should have no homolog in the host, so that the designed lead molecule acts against the pathogen but not against the host. A subtractive genomics approach combined with bioinformatics can identify optimal targets related to previously unknown cellular functions in microbes, based on an understanding of relatively similar biological processes in pathogens and hosts (Vetrivel et al. 2011; Koteswara et al. 2010; Sakharkar et al. 2004). Using this approach, a number of potential drug targets have been identified for bacterial pathogens of humans (Barh et al. 2011).
The search for antibacterial targets in bacteria pathogenic to plants has remained an untouched area of in silico research, although the large volume of data emerging from whole-genome sequencing projects of phytopathogenic bacteria offers tremendous scope for such work. Xanthomonas oryzae pv. oryzae, a gamma-proteobacterium, is an important pathogen of rice (Swings et al. 1990) causing bacterial leaf blight (Niño-Liu et al. 2006) or bacterial blight (Salzberg et al. 2008). High-yielding rice cultivars are more susceptible to the disease, which leads to wilting of seedlings, yellowing and drying of leaves, and yield loss. Besides physical disease management practices such as sanitation, seed treatment with bleaching powder (10 µg/ml) and zinc sulfate (2 %) is reported to reduce the disease, but chemical control using antibiotics has had only limited success (Rice Knowledge Bank 2009). Continuously increasing antibiotic resistance among pathogens has created the need to search for novel antimicrobial targets in pathogenic bacteria, which may lead to the development of novel antimicrobial agents.
Xanthomonas oryzae pv. oryzae strain PXO99A is virulent towards many rice varieties representing diverse genetic sources of resistance, so novel antimicrobials are needed to reduce leaf blight and increase rice yield. The complete genome sequences of different strains of X. oryzae pv. oryzae, such as PXO99A (Salzberg et al. 2008), KACC10331 and MAFF311018 (Triplett et al. 2011), have facilitated in-depth comparative genomic analyses. Genome comparison indicated that strain PXO99A contains various virulence-associated transcription activator-like effector genes and has at least 10 major chromosomal rearrangements relative to strains KACC10331 and MAFF311018 (Salzberg et al. 2008). Given the practical implications of such work, we applied a subtractive genomics approach to identify novel protein targets associated with pathogenicity in X. oryzae pv. oryzae PXO99A, with the aim of finding antimicrobial targets for developing potential antimicrobial agents against this important disease of rice.

Results and discussion
Computational approaches have been applied to identify essential genes in prokaryotes. Here, we report the identification of essential genes as potential antibacterial targets in the plant pathogenic bacterium X. oryzae pv. oryzae. The approach was based on sequence alignment of proteins downloaded from the NCBI (Table 1) and the database of essential genes (DEG). X. oryzae pv. oryzae PXO99A, O. sativa and the prokaryotic DEG contain 4,988, 21,342 and 7,643 protein sequences, respectively. Our results revealed that, out of the 4,988 proteins of X. oryzae pv. oryzae PXO99A, 406 unique sequences returned no hits, that is, they did not align with any sequence of O. sativa. This result is consistent with earlier reports (Jacobs et al. 2003; Sakharkar et al. 2004), which identified 300-400 essential genes in another bacterium, P. aeruginosa. When the 406 non-homologous sequences were aligned (two-way BLAST) against the prokaryotic essential protein sequences of DEG with an e-value cutoff of 10⁻¹⁰ to determine their essentiality, 27 sequences were found to be essential for the pathogen. Functional categorization based on the respective gene description or name suggested that these proteins can be considered unique to the pathogen and linked with essential metabolic pathways. All 27 protein sequences were related to different functional cellular properties such as transport activity, DNA binding, structural constituent of ribosome, cell division, translation, plasma membrane and membrane protein (Table 2). The KEGG GENES database, a resource for cross-species annotation of all available genomes by the KEGG orthology (KO) system, classified the 27 essential genes of X. oryzae pv. oryzae into different categories according to their involvement in different metabolic pathways (Table 3).
Metabolic pathway analysis of the essential proteins revealed that three genes are involved in oxidative phosphorylation, three in nitrogen metabolism, two in the bacterial secretion system, one in glutathione metabolism, one in arginine and proline metabolism, and one each in bacterial chemotaxis and protein export, besides several others involved in different essential pathways (Table 4). Some of these proteins contribute directly to basic primary metabolic mechanisms such as carbon fixation, phosphorylation, amino acid biosynthesis, the citric acid cycle and nitrogen metabolism. However, certain essential proteins found in the pathogen, such as those associated with ABC transporters (1), bacterial chemotaxis (1), protein export (1) and secretion systems (2), are linked in one way or another to pathogenicity, virulence, nutrient mobilization and uptake, and motility of the organism (Maranhão et al. 2009; Rodriguez and Smith 2006; Stergiopoulos et al. 2003). Virulent/non-virulent properties predicted through the support vector machine (SVM) approach revealed three virulent proteins (accession numbers: YP_001911579.1, YP_001913450.1, YP_001914963.1), leading to the assumption that these three essential proteins could have an important role in the normal functioning of the pathogen within the host. These proteins can therefore be viewed as novel targets, because they show no significant similarity to DrugBank targets, and can be used for the development of antimicrobials. These three proteins [YP_001911579.1 (TonB-dependent receptor), YP_001913450.1 (Transposase) and YP_001914963.1 (TonB-dependent outer membrane receptor)] are considered critical because of their presence in the outer membrane of the pathogen.

Identification of essential proteins
The complete set of protein sequences of X. oryzae pv. oryzae PXO99A was subjected to BLASTP (Altschul et al. 1990) against O. sativa protein sequences to identify gene products of the pathogen that are non-homologous to the host. Sequences that did not show any similarity were further subjected to BLASTP, with an e-value cutoff of 10⁻¹⁰, against all prokaryotic sequences of the DEG (Zhang et al. 2004) to screen for genes likely to be essential. The biological processes, molecular functions and cellular components of these proteins were then identified. All BLAST alignments were performed two-way.
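The subtraction step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that BLASTP results are available in tabular form (BLAST+ -outfmt 6, where column 1 is the query ID and column 11 is the e-value), that any reported hit against the host counts as similarity (no host cutoff is stated in the text), and it uses hypothetical protein IDs (protA, protB, protC) and hypothetical function names.

```python
# Sketch of genome subtraction from BLASTP tabular output (-outfmt 6).
# All IDs and function names here are hypothetical illustrations.

def parse_hits(tabular_text, evalue_cutoff=float("inf")):
    """Return the set of query IDs with at least one hit at or below the cutoff."""
    hits = set()
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= evalue_cutoff:
            hits.add(query_id)
    return hits


def subtract(all_pathogen_ids, host_hits, deg_hits):
    """Keep proteins with no host hit that also have a DEG hit."""
    non_homologous = set(all_pathogen_ids) - host_hits
    return sorted(non_homologous & deg_hits)


# Toy data: protA hits the host; protB hits DEG strongly; protC hits DEG weakly.
host_table = "protA\tOs01\t45.0\t200\t100\t2\t1\t200\t1\t200\t1e-30\t150\n"
deg_table = (
    "protB\tDEG1\t60.0\t180\t70\t1\t1\t180\t1\t180\t1e-40\t200\n"
    "protC\tDEG2\t30.0\t50\t35\t0\t1\t50\t1\t50\t1e-3\t30\n"
)

host_hits = parse_hits(host_table)                     # any host hit counts
deg_hits = parse_hits(deg_table, evalue_cutoff=1e-10)  # cutoff from the text
candidates = subtract(["protA", "protB", "protC"], host_hits, deg_hits)
print(candidates)
```

In this toy example only protB survives both filters: protA is removed as host-homologous, and protC fails the 10⁻¹⁰ cutoff against DEG. The two-way alignment mentioned in the text would require running the reverse searches as well, which this sketch does not cover.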

Metabolic pathway analysis
Metabolic pathway analysis was carried out with KAAS (KEGG Automatic Annotation Server) at KEGG to identify the pathways in which the essential proteins are involved. KAAS provides functional annotation of genes by BLAST comparisons against the manually curated KEGG GENES database. The result contains KO assignments and automatically generated KEGG pathways (Moriya et al. 2007).
The method for identification of probable antibacterial targets is described in Fig. 1.

Prediction of virulent proteins
Bacterial virulent protein sequences were predicted using VirulentPred (Garg and Gupta 2008), a prediction tool based on a bi-layer cascade support vector machine (SVM).
In the first layer, SVM classifiers were trained and optimized with different individual protein sequence features, and their outputs were cascaded to the second-layer SVM classifier to train and generate the final classifier. The prediction approaches selected for the query were amino acid composition, dipeptide composition, similarity searching, higher-order dipeptide composition, PSSM and the cascaded SVM module.
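Two of the sequence features named above, amino acid composition and dipeptide composition, have simple standard definitions that can be sketched in Python. This is only an illustration of the feature vectors (20 and 400 values, respectively), not VirulentPred's own code. The function names and the choice to ignore non-standard residues are assumptions made for this example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    seq = seq.upper()
    total = sum(seq.count(a) for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered pair of adjacent residues (400 values)."""
    seq = seq.upper()
    keys = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(keys, 0)
    n_pairs = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
            n_pairs += 1
    if n_pairs == 0:
        return [0.0] * len(keys)
    return [counts[k] / n_pairs for k in keys]


aac = amino_acid_composition("AACD")
dpc = dipeptide_composition("AACD")
print(len(aac), len(dpc))
```

Vectors of this kind would serve as inputs to first-layer classifiers in a cascade design. The classifier training itself is outside the scope of this sketch.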

Prediction of targets as novel
All protein targets of DrugBank (Wishart et al. 2008) were downloaded and aligned with the essential proteins to find significant similarity or dissimilarity, representing non-novel and novel targets, respectively. Sequences were subjected to BLASTP with an e-value cutoff of 10⁻⁴ and similarity >70 against all the experimental drug targets.
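The novelty criterion can be expressed as a small filter, sketched here in Python under stated assumptions. The text does not say how similarity was computed, so the sketch takes it as a precomputed number per hit (for instance, a percentage reported by BLAST). The Hit record, the function name and the protein IDs are hypothetical.

```python
from collections import namedtuple

# One alignment of an essential protein against a DrugBank target.
# "similarity" is assumed to be a precomputed percentage; the source
# does not specify how it was derived.
Hit = namedtuple("Hit", ["query_id", "target_id", "similarity", "evalue"])


def split_by_novelty(essential_ids, hits, evalue_cutoff=1e-4, min_similarity=70.0):
    """Return (novel, non_novel) lists of essential protein IDs.

    A protein is treated as non-novel if any hit passes both thresholds.
    """
    known = {
        h.query_id
        for h in hits
        if h.evalue <= evalue_cutoff and h.similarity > min_similarity
    }
    novel = [p for p in essential_ids if p not in known]
    non_novel = [p for p in essential_ids if p in known]
    return novel, non_novel


hits = [
    Hit("protX", "DB_target_1", 85.0, 1e-50),  # passes both thresholds
    Hit("protY", "DB_target_2", 40.0, 1e-20),  # similarity too low
    Hit("protZ", "DB_target_3", 90.0, 1e-2),   # e-value too high
]
novel, non_novel = split_by_novelty(["protX", "protY", "protZ"], hits)
print(novel, non_novel)
```

Requiring both thresholds to hold for the same hit is one reading of the criterion. A looser reading would change which proteins are labeled novel.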