Abstract
In this paper, amino acid compositions are combined with physicochemical properties of the protein sequence to predict protein structural classes. The prediction model combines the nearest neighbor algorithm (NNA), mRMR (minimum redundancy, maximum relevance) feature selection, and a forward feature-search strategy. Prediction accuracy is evaluated by jackknife cross-validation. The prediction success rate improves to 68.8%, compared with 62.2% when only amino acid compositions are used. We therefore conclude that physicochemical properties contribute to protein folding, while the amino acid compositions remain the most informative features. We expect prediction accuracy to improve further as more sequence information becomes available. A web server for predicting protein structural classes is available at http://app3.biosino.org:8080/liwenjin/index.jsp.
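The pipeline named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It makes several assumptions: Euclidean distance is used for the nearest neighbor step, only amino acid composition features are computed, and the feature ranking passed to `forward_search` is supplied by the caller in place of a real mRMR ranking. All function names and the toy sequences are hypothetical.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(seq):
    """Return the 20 residue frequencies of a protein sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]

def nearest_neighbor_label(query, vectors, labels, features):
    """Label of the training vector closest to `query`.

    Assumption: squared Euclidean distance restricted to `features`.
    """
    best_label, best_dist = None, math.inf
    for vec, lab in zip(vectors, labels):
        dist = sum((query[i] - vec[i]) ** 2 for i in features)
        if dist < best_dist:
            best_label, best_dist = lab, dist
    return best_label

def jackknife_accuracy(vectors, labels, features):
    """Jackknife (leave-one-out): predict each sample from all the others."""
    hits = 0
    for k in range(len(vectors)):
        rest_vectors = vectors[:k] + vectors[k + 1:]
        rest_labels = labels[:k] + labels[k + 1:]
        predicted = nearest_neighbor_label(
            vectors[k], rest_vectors, rest_labels, features
        )
        hits += predicted == labels[k]
    return hits / len(vectors)

def forward_search(vectors, labels, ranked_features):
    """Add features in ranked order; keep the best-scoring prefix.

    `ranked_features` stands in for an mRMR ranking, which is not computed here.
    """
    best_acc, best_n = -1.0, 0
    for n in range(1, len(ranked_features) + 1):
        acc = jackknife_accuracy(vectors, labels, ranked_features[:n])
        if acc > best_acc:  # strict '>' keeps the smallest best prefix
            best_acc, best_n = acc, n
    return ranked_features[:best_n], best_acc

# Toy data (hypothetical sequences and class labels, for illustration only).
sequences = ["AAAALLLLEEEE", "AAALLLEEEKKK", "VVVVTTTTIIII", "VVVTTTIIIYYY"]
labels = ["alpha", "alpha", "beta", "beta"]
vectors = [amino_acid_composition(s) for s in sequences]

selected, accuracy = forward_search(vectors, labels, list(range(20)))
print(selected, accuracy)
```

In a fuller version, the physicochemical descriptors would be appended to each feature vector and the ranking would come from an mRMR implementation; the jackknife loop and the forward search would stay the same.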
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License ( https://creativecommons.org/licenses/by-nc/2.0 ), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Li, W., Lin, K., Feng, K. et al. Prediction of protein structural classes using hybrid properties. Mol Divers 12, 171–179 (2008). https://doi.org/10.1007/s11030-008-9093-9
DOI: https://doi.org/10.1007/s11030-008-9093-9