Ensemble Technique for Prediction of T-cell Mycobacterium tuberculosis Epitopes

  • Original Research Article
  • Interdisciplinary Sciences: Computational Life Sciences

Abstract

An effective machine-learning model for T-cell Mycobacterium tuberculosis (M. tuberculosis) epitopes would save biologists time and effort in identifying epitopes in a targeted antigen. The existing NetMHC 2.2, NetMHC 2.3, NetMHC 3.0 and NetMHC 4.0 servers estimate the binding capacity of a peptide, but it remains a challenge for them to predict whether a given peptide is an M. tuberculosis epitope or a non-epitope. One server, CTLpred, works in this category, but it is limited to peptides of length 9 (9-mers). This work therefore proposes a direct method for predicting whether a peptide is an M. tuberculosis epitope or a non-epitope, which also overcomes the limitations of the above servers. The proposed method works with variable-length epitopes, including those longer than 9-mers. Identification of T-cell or B-cell epitopes in the targeted antigen is the main goal in designing epitope-based vaccines, immune-diagnostic tests and antibody production. It is therefore important to introduce a reliable system that may help in the diagnosis of M. tuberculosis. In the present study, computational intelligence methods are used to classify T-cell M. tuberculosis epitopes. The caret feature selection approach is used to find the set of relevant features. The ensemble model is designed by combining three models and is used to predict M. tuberculosis epitopes of variable length (7–40-mers). The proposed ensemble model achieves 82.0% accuracy, 0.89 specificity and 0.77 sensitivity, with an average accuracy of 80.61% under repeated k-fold cross-validation. The proposed ensemble model has been validated and compared with the NetMHC 2.3 and NetMHC 4.0 servers and the CTLpred T-cell prediction server.
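To make the reported metrics concrete, the following is a minimal sketch of how three classifiers' epitope/non-epitope predictions could be combined and scored. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the three models are combined, so simple majority voting is assumed here, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> list[int]:
    """Combine per-model label lists (1 = epitope, 0 = non-epitope).

    Assumption: simple majority voting; the abstract does not state
    the actual combination rule used by the ensemble.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, sensitivity (true-positive rate) and specificity (true-negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy labels, invented for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 1]
model_b = [1, 0, 0, 0, 1, 0]
model_c = [1, 1, 1, 0, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c])
metrics = binary_metrics(y_true, ensemble)
```

With an odd number of models and two classes, a majority vote can never tie, which is one reason three-model ensembles are convenient for binary problems like epitope versus non-epitope.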

Author information

Corresponding author

Correspondence to Divya Khanna.

About this article

Cite this article

Khanna, D., Rana, P.S. Ensemble Technique for Prediction of T-cell Mycobacterium tuberculosis Epitopes. Interdiscip Sci Comput Life Sci 11, 611–627 (2019). https://doi.org/10.1007/s12539-018-0309-0

  • DOI: https://doi.org/10.1007/s12539-018-0309-0
