Predicting the Disulfide Bonding State of Cysteines with Combinations of Kernel Machines
Cysteines may form covalent bonds, known as disulfide bridges, that play an important role in stabilizing the native conformation of proteins. Several methods have been proposed for predicting the bonding state of cysteines, using either local context or global protein descriptors. In this paper we introduce an SVM-based predictor that operates in two stages. The first stage is a multi-class classifier that operates at the protein level, using either standard Gaussian or spectrum kernels. The second stage is a binary classifier that refines the prediction by exploiting local context enriched with evolutionary information in the form of multiple alignment profiles. At both stages, we enriched the profile encoding with information about cysteine conservation. The prediction accuracy of the system is 85%, measured by 5-fold cross-validation on a set of 716 proteins from the September 2001 PDB Select dataset.
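As an illustration of the spectrum kernel mentioned for the first stage, the following is a minimal sketch and not the authors' implementation. It assumes the usual definition of the k-spectrum kernel as the inner product of k-mer count vectors of two sequences; the function names and the default `k = 3` are hypothetical choices made for this example only.

```python
from collections import Counter


def spectrum_features(seq: str, k: int = 3) -> Counter:
    """Count every contiguous k-mer in seq (empty if seq is shorter than k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of sequences a and b."""
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    # Iterate over the smaller feature map; k-mers absent from the other
    # sequence contribute zero because Counter returns 0 for missing keys.
    if len(fa) > len(fb):
        fa, fb = fb, fa
    return sum(count * fb[kmer] for kmer, count in fa.items())


if __name__ == "__main__":
    # Toy amino-acid strings, made up for illustration.
    print(spectrum_kernel("ACCA", "CCAC", k=2))
```

A Gram matrix built from such values could, for example, be passed to an SVM implementation that accepts precomputed kernels; how the paper combines it with the Gaussian kernel and the second-stage classifier is not specified in the abstract.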