Identification of B-cell epitopes in target antigens is a critical step in epitope-driven vaccine design, immunodiagnostic tests, and antibody production. B-cell epitopes can be linear, i.e., a contiguous amino acid sequence fragment of an antigen, or conformational, i.e., amino acids that are often not contiguous in the primary sequence but lie in close proximity within the folded 3D antigen structure. Numerous computational methods have been proposed for predicting both types of B-cell epitopes. However, the development of tools for reliably predicting B-cell epitopes remains a major challenge in immunoinformatics.
Classifier ensembles are a promising approach for combining a set of classifiers such that the overall performance of the resulting ensemble is better than the predictive performance of the best individual classifier. In this chapter, we show how to build a classifier ensemble for improved prediction of linear B-cell epitopes. The method can be easily adapted to build classifier ensembles for predicting conformational epitopes.
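As a rough illustration of the ensemble idea, the sketch below combines several base classifiers by majority vote, which is one common combination scheme. It is not the procedure described in the chapter: the three rule-based classifiers, their thresholds, and all function names are hypothetical stand-ins chosen only to keep the example self-contained.

```python
from collections import Counter
from typing import Callable, Sequence

# A base classifier maps a peptide string to 1 (epitope) or 0 (non-epitope).
Classifier = Callable[[str], int]

# Hypothetical residue groups used by the toy rules below (assumptions).
HYDROPHILIC = set("DEKRNQSTH")
CHARGED = set("DEKR")


def hydrophilic_rule(peptide: str) -> int:
    """Toy rule: predict 1 if at least half of the residues are hydrophilic."""
    if not peptide:
        return 0
    fraction = sum(aa in HYDROPHILIC for aa in peptide) / len(peptide)
    return 1 if fraction >= 0.5 else 0


def turn_rule(peptide: str) -> int:
    """Toy rule: predict 1 if the peptide contains proline or glycine."""
    return 1 if any(aa in "PG" for aa in peptide) else 0


def charged_rule(peptide: str) -> int:
    """Toy rule: predict 1 if the peptide has two or more charged residues."""
    return 1 if sum(aa in CHARGED for aa in peptide) >= 2 else 0


def majority_vote(classifiers: Sequence[Classifier], peptide: str) -> int:
    """Return the label chosen by most base classifiers (ties go to 0)."""
    votes = Counter(clf(peptide) for clf in classifiers)
    return 1 if votes[1] > votes[0] else 0


ensemble = [hydrophilic_rule, turn_rule, charged_rule]

if __name__ == "__main__":
    for pep in ["DEKRPG", "AAAAAA", "KKPAAA"]:
        print(pep, majority_vote(ensemble, pep))
```

In practice the base classifiers would be trained models rather than fixed rules, and the votes could be weighted or fed to a second-level classifier instead of being counted equally.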
Keywords: B-cell epitope prediction; Classifiers ensemble; Random forest; Epitope prediction toolkit
This work was supported in part by a grant from the National Institutes of Health (NIH GM066387) and by the Edward Frymoyer Chair of Information Sciences and Technology at Pennsylvania State University.