Abstract
In this study, we present a classifier that takes an amino acid sequence as input and predicts potential DNA-binding domains using support vector machines (SVMs). Amino acid sequences with known DNA-binding domains were obtained from the Protein Data Bank (PDB). The SVM models integrate four normalized sequence features (the side-chain pKa value, the hydrophobicity index, the molecular mass of the amino acid, and the number of isolated electron pairs) together with a normalized feature derived from the evolutionary information of the amino acid sequences. The results show that DNA-binding domains can be predicted with 74.28% accuracy, 68.39% sensitivity, and 79.76% specificity, with an ROC AUC of 0.822 and a Pearson’s correlation coefficient of 0.549.
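The reported measures can all be derived from a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the counts used below are hypothetical, and it is an assumption that the Pearson's correlation coefficient is computed between binary predictions and binary labels, in which case it coincides with the phi (Matthews) coefficient.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding.
    tn/fp: non-binding residues predicted as non-binding / binding.
    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # fraction of true binding residues recovered
    specificity = tn / (tn + fp)   # fraction of non-binding residues rejected
    # Pearson correlation between two binary variables (phi coefficient).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    correlation = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "correlation": correlation,
    }

# Hypothetical counts for illustration only; these are not the paper's data.
m = binary_metrics(tp=70, fp=20, tn=80, fn=30)
print(m)
```

ROC AUC is not included because it needs the continuous SVM decision values rather than thresholded predictions.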
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, J., Wu, H., Liu, H., Zhou, H., Sun, X. (2007). Support Vector Machine for Prediction of DNA-Binding Domains in Protein-DNA Complexes. In: Li, K., Li, X., Irwin, G.W., He, G. (eds) Life System Modeling and Simulation. LSMS 2007. Lecture Notes in Computer Science, vol 4689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74771-0_21
DOI: https://doi.org/10.1007/978-3-540-74771-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74770-3
Online ISBN: 978-3-540-74771-0
eBook Packages: Computer Science, Computer Science (R0)