The Protein Journal, Volume 28, Issue 6, pp 273–280

Improved Prediction of Protein Binding Sites from Sequences Using Genetic Algorithm

Article

DOI: 10.1007/s10930-009-9192-1

Cite this article as:
Du, X., Cheng, J. & Song, J. Protein J (2009) 28: 273. doi:10.1007/s10930-009-9192-1

Abstract

We undertook this project in response to the rapidly increasing number of protein structures of unknown function in the Protein Data Bank (PDB). Here, we combined a genetic algorithm with a support vector machine (GA/SVM) to predict protein–protein binding sites from sequence. In an experiment on a testing dataset comprising 50 hetero-complexes, we predicted the binding sites for 66% of the dataset. The combined classifier achieved higher sensitivity (60.17%), specificity (58.17%), accuracy (64.08%), F-measure (54.79%), and correlation coefficient (0.2502) than the support vector machine alone. These results can guide biologists in designing specific experiments for protein analysis.
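The abstract reports sensitivity, specificity, accuracy, F-measure, and correlation coefficient without giving their formulas. As a rough illustration only, the sketch below computes the commonly used confusion-matrix definitions from TP, FP, TN, and FN counts. The function name `binary_metrics` and the example counts are hypothetical and do not come from the paper. The paper's exact definitions may differ: some binding-site prediction studies report TP/(TP+FP) under the name "specificity", so that quantity is returned separately as `precision`.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Commonly used confusion-matrix metrics (assumed, not taken from the paper).

    Any ratio with a zero denominator is returned as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)          # recall on interface residues
    specificity = ratio(tn, tn + fp)          # recall on non-interface residues
    precision = ratio(tp, tp + fp)            # sometimes reported as "specificity"
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    # F-measure as the harmonic mean of precision and sensitivity
    f_measure = ratio(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient
    cc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "cc": cc,
    }


# Made-up counts for illustration; these are not results from the paper.
example = binary_metrics(tp=30, fp=20, tn=40, fn=10)
```

Under these definitions accuracy always lies between sensitivity and specificity, so percentages averaged per protein or computed with a different definition of specificity will not necessarily reproduce from a single pooled confusion matrix.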

Keywords

Protein–protein interaction sites; Genetic algorithm; Support vector machine; Protein sequence profile

Abbreviations

PDB: Protein Data Bank
FP: False positive
SVM: Support vector machine
FN: False negative
GA/SVM: Genetic algorithm and support vector machine
CC: Correlation coefficient
TP: True positive
TN: True negative
HSSP: Homology-derived secondary structure of protein

Supplementary material

10930_2009_9192_MOESM1_ESM.xls (16 kb)

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University, Hefei, China