Application of Meta Learning to B-Cell Conformational Epitope Prediction

  • Yuh-Jyh Hu
Part of the Methods in Molecular Biology book series (MIMB, volume 2131)


One of the major challenges in vaccine design is identifying B-cell epitopes in continuously evolving viruses. Various tools have been developed to predict linear or conformational epitopes, each relying on different physicochemical properties and adopting a distinct search strategy. In this chapter, we propose several ensemble meta-learning approaches for epitope prediction based on stacked generalization, cascade generalization, and meta decision trees. Through meta learning, we expect a meta learner to integrate multiple prediction models and outperform the single best-performing model. The objective of this chapter is twofold: (1) to highlight the complementary predictive strengths of different prediction tools and (2) to introduce computational models that exploit the synergy among these tools. Our primary goal is not to develop a particular classifier for B-cell epitope prediction, but to demonstrate the feasibility of applying meta learning to epitope prediction. With the flexibility of meta learning, researchers can construct various meta-classification hierarchies that are applicable to epitope prediction in different protein domains.
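To make the idea of stacking concrete, the following is a minimal sketch of a meta learner trained on the outputs of several base predictors. It is not the chapter's implementation: the three tool names, every score, and the labels are made-up toy values, and a simple perceptron stands in for whatever level-1 learner one would actually choose. It assumes the base tools are already trained, so their per-residue scores can serve directly as meta-level features.

```python
# Toy sketch of stacked generalization. The three "tools" and all scores
# below are hypothetical, made-up values, not output from real predictors.

TOOL_NAMES = ["tool_a", "tool_b", "tool_c"]

# Level-1 (meta) data: one row per residue, one column per base-tool score.
meta_features = [
    (0.9, 0.2, 0.5),
    (0.8, 0.3, 0.4),
    (0.3, 0.9, 0.6),
    (0.2, 0.8, 0.5),
    (0.2, 0.3, 0.6),
    (0.3, 0.2, 0.4),
    (0.3, 0.3, 0.5),
    (0.1, 0.1, 0.5),
]
# 1 = epitope residue, 0 = non-epitope residue (toy labels).
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def predict(weights, bias, scores):
    """Combine base-tool scores linearly and threshold the result at zero."""
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1 if z > 0 else 0


def train_perceptron(rows, targets, lr=0.1, max_epochs=1000):
    """Fit a perceptron meta learner; stop once an epoch makes no mistakes."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for scores, target in zip(rows, targets):
            error = target - predict(weights, bias, scores)
            if error != 0:
                mistakes += 1
                weights = [w + lr * error * s for w, s in zip(weights, scores)]
                bias += lr * error
        if mistakes == 0:
            break
    return weights, bias


def accuracy(predictions, targets):
    """Fraction of predictions that match the targets."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


weights, bias = train_perceptron(meta_features, labels)
meta_predictions = [predict(weights, bias, row) for row in meta_features]
meta_accuracy = accuracy(meta_predictions, labels)

# Baseline: each tool used alone with a fixed 0.5 score threshold.
single_tool_accuracies = {
    name: accuracy([1 if row[i] > 0.5 else 0 for row in meta_features], labels)
    for i, name in enumerate(TOOL_NAMES)
}

print("meta learner:", meta_accuracy)
print("single tools:", single_tool_accuracies)
```

On this toy set the two informative tools each recognize a different half of the epitope residues, so the combined model can separate the classes where no single thresholded tool can. The accuracies here are computed on the training rows only. A real evaluation would score the meta learner on held-out antigens, and the meta-level features would come from base-tool predictions on data those tools were not trained on.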

Key words

B-cell epitopes, Meta learning, Stacking, Cascade, Meta decision trees



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  1. College of Computer Science, National Chiao Tung University, Hsinchu, Taiwan
  2. Institute of Biomedical Engineering, National Chiao Tung University, Hsinchu, Taiwan
