Integrative immunoinformatics for Mycobacterial diseases in R platform
The sequencing of the genomes of pathogenic mycobacterial species that cause pulmonary and extrapulmonary tuberculosis, leprosy and other atypical mycobacterial infections offers immense opportunities for discovering new therapeutics and identifying new vaccine candidates. Enhanced RV, which augments Reverse Vaccinology (RV) with additional algorithms, has greater potential to reduce the likelihood of undesirable features, including allergenicity and immune cross-reactivity with the host. Construction of the MycobacRV database started from a collection of known vaccine candidates and a set of predicted vaccine candidates identified from the whole-genome sequences of 22 mycobacterial species and strains pathogenic to humans and one non-pathogenic strain, Mycobacterium tuberculosis H37Ra. The predicted vaccine candidates are adhesins and adhesin-like proteins identified with SPAAN at Pad > 0.6 and then screened for putative extracellular or surface localization with PSORTb v.3.0 at a very stringent cutoff. Subsequently, these protein sequences were analyzed with 21 publicly available algorithms to predict orthologs, paralogs, BetaWrap motifs, transmembrane domains, signal peptides, conserved domains, similarity to human proteins, T cell epitopes, B cell epitopes, discotopes and potential allergens. The Enhanced RV information was analysed on the R platform with scripts following well-structured decision trees to derive a nonredundant set of 233 most probable vaccine candidates. Additionally, the degree of conservation of potential epitopes across all orthologs was determined with reference to the M. tuberculosis H37Rv strain, the most commonly used strain in M. tuberculosis studies. Utilities for searching vaccine candidates and analysing epitope conservation across orthologs with reference to M. tuberculosis H37Rv are available in the mycobacrvR package for R, accessible from the “Download” tab of the MycobacRV webserver.
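The first screening step and the idea of epitope conservation described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the mycobacrvR code. The `Protein` record, the function names, the localization labels and the exact-substring definition of conservation are all assumptions made for this example; only the Pad > 0.6 threshold comes from the text, and the PSORTb cutoff used by the authors is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    """Hypothetical record for one predicted protein (not a mycobacrvR type)."""
    protein_id: str     # illustrative identifier
    pad: float          # SPAAN adhesin probability (Pad)
    localization: str   # predicted localization label


# Assumed labels standing in for "extracellular or surface" localization.
SURFACE_LABELS = frozenset({"Extracellular", "Cellwall"})


def select_candidates(proteins, pad_cutoff=0.6, labels=SURFACE_LABELS):
    """Keep proteins with Pad strictly above the cutoff and a surface-like label."""
    return [
        p for p in proteins
        if p.pad > pad_cutoff and p.localization in labels
    ]


def epitope_conservation(epitope, ortholog_sequences):
    """Fraction of ortholog sequences that contain the epitope verbatim.

    Exact substring matching is a simplifying assumption for this sketch.
    """
    if not ortholog_sequences:
        return 0.0
    hits = sum(1 for seq in ortholog_sequences if epitope in seq)
    return hits / len(ortholog_sequences)
```

Calling `select_candidates` on a list of `Protein` records returns the subset that passes both screens, and `epitope_conservation` returns a value between 0 and 1 for a reference epitope checked against a list of ortholog sequences. A real analysis would likely use alignment-based matching rather than exact substrings.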
MycobacRV, an immunoinformatics database of known and predicted mycobacterial vaccine candidates, has been developed and is freely available at http://mycobacteriarv.igib.res.in.
Keywords: Mycobacteria, Vaccine, Reverse Vaccinology, Enhanced RV
SR acknowledges grant support (BSC0121) from the Council of Scientific and Industrial Research (CSIR). RC thanks the Indian Council of Medical Research for a fellowship. Funding for IT infrastructure through CSIR-Institute of Genomics and Integrative Biology resources is acknowledged.
Conflict of interest
The authors declare that they have no competing interests.