Immunogenetics, Volume 58, Issue 8, pp 607–613

MHC-BPS: MHC-binder prediction server for identifying peptides of flexible lengths from sequence-derived physicochemical properties

  • Juan Cui
  • Lian Yi Han
  • Hong Huang Lin
  • Zhi Qun Tang
  • Li Jiang
  • Zhi Wei Cao
  • Yu Zong Chen
Original Paper

DOI: 10.1007/s00251-006-0117-2

Cite this article as:
Cui, J., Han, L.Y., Lin, H.H. et al. Immunogenetics (2006) 58: 607. doi:10.1007/s00251-006-0117-2

Abstract

Major histocompatibility complex (MHC)-binding peptides are essential for antigen recognition by T-cell receptors and are being explored for vaccine design. Existing computational methods predict MHC-binding peptides of fixed lengths and are trained on relatively few non-binders. Methods that handle peptides of flexible lengths and are trained on more diverse sets of non-binders are therefore desirable. MHC-BPS is a web-based MHC-binder prediction server that uses support vector machines (SVMs) to predict peptide binders of flexible lengths for 18 MHC class I and 12 class II alleles from sequence-derived physicochemical properties. The SVMs were trained on 4,208∼3,252 binders and 234,333∼168,793 non-binders, and evaluated on an independent set of 545∼476 binders and 110,564∼84,430 non-binders. The binder prediction accuracies are 86∼99% for 25 alleles and 70∼80% for five alleles, and the non-binder accuracies are 96∼99% for all 30 alleles. A screen of the HIV-1 genome identifies 0.01∼5% and 5∼8% of the constituent peptides as binders for 24 and 6 alleles, respectively, including 75∼100% of the known epitopes. The method correctly predicts 73.3% of 15 epitopes newly published in the last 4 months of 2005. MHC-BPS is available at http://bidd.cz3.nus.edu.sg/mhc/.
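Screening a genome for binders of flexible lengths implies enumerating every constituent peptide over a range of lengths before classification. The following is a minimal sketch of that enumeration step only, not the server's implementation. The function name, the 8 to 10 residue length range, and the toy sequence are assumptions made for illustration; the abstract does not state the lengths MHC-BPS scans.

```python
def constituent_peptides(protein, min_len=8, max_len=10):
    """Yield (start, peptide) for every window of each length in [min_len, max_len].

    The default length range is a hypothetical choice, not taken from the paper.
    """
    for length in range(min_len, max_len + 1):
        # A sequence of n residues contains n - length + 1 windows of this length.
        for start in range(len(protein) - length + 1):
            yield start, protein[start:start + length]


# Toy 20-residue sequence (the 20 standard amino acids), not a real protein.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
peptides = list(constituent_peptides(toy_protein))
print(len(peptides), peptides[0])
```

Each enumerated peptide would then be passed to a per-allele classifier; the fraction predicted as binders corresponds to the percentages of constituent peptides quoted in the abstract.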

Keywords

MHC-binding peptide; Epitopes; SVM

Supplementary material

251_2006_117_MOESM1_ESM.doc (34 kb)
Table 1 Distribution of the binding peptides of different HLA alleles with respect to peptide length, in units of the number of amino acids
251_2006_117_MOESM2_ESM.doc (18 kb)
Table 2 Data sets and the computed binder and non-binder prediction accuracies of the SVM prediction systems for different HLA alleles developed in this work. A total of 18 MHC class I and 12 MHC class II alleles are covered. TP, TN, FP, and FN are the numbers of true positives (true binders), true negatives (true non-binders), false positives (false binders), and false negatives (false non-binders), respectively. The total numbers of binders and non-binders in a data set are TP + FN and TN + FP, respectively
251_2006_117_MOESM3_ESM.doc (5 kb)
Table 3 List of newly reported epitopes in the last 4 months of 2005 and SVM prediction results
251_2006_117_MOESM4_ESM.doc (127 kb)
Table 4 Statistics of the predicted peptide binders from the HIV-1 genome (NCBI entry NC_001802) by using our method and several other web-based prediction servers
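The TP, TN, FP, and FN counts defined for Table 2 can be turned into the two accuracies reported in the abstract. The sketch below assumes the conventional definitions, binder accuracy as TP / (TP + FN) and non-binder accuracy as TN / (TN + FP); the function names are hypothetical. The binder example uses 11 of 15, the only whole count consistent with the 73.3% reported for the 15 newly published epitopes, and the non-binder counts are made up.

```python
def binder_accuracy(tp, fn):
    """Fraction of true binders predicted as binders: TP / (TP + FN)."""
    total = tp + fn
    if total == 0:
        raise ValueError("data set contains no binders")
    return tp / total


def non_binder_accuracy(tn, fp):
    """Fraction of true non-binders predicted as non-binders: TN / (TN + FP)."""
    total = tn + fp
    if total == 0:
        raise ValueError("data set contains no non-binders")
    return tn / total


# 11 correct out of 15 epitopes reproduces the 73.3% quoted in the abstract.
print(f"{binder_accuracy(tp=11, fn=4):.1%}")
# Illustrative counts only, not taken from Table 2.
print(f"{non_binder_accuracy(tn=990, fp=10):.1%}")
```

Reporting the two accuracies separately matters here because non-binders outnumber binders by a wide margin in these data sets, so a single overall accuracy would be dominated by the non-binder class.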

Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  • Juan Cui (1, 2)
  • Lian Yi Han (1, 2)
  • Hong Huang Lin (1, 2)
  • Zhi Qun Tang (1, 2)
  • Li Jiang (1, 2)
  • Zhi Wei Cao (3)
  • Yu Zong Chen (1, 2, 3)

  1. Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Singapore, Singapore
  2. Department of Computational Science, National University of Singapore, Singapore, Singapore
  3. Shanghai Center for Bioinformatics Technology, Shanghai, People’s Republic of China
