Prediction of protein structural classes based on feature selection technique

  • Hui Ding
  • Hao Lin
  • Wei Chen
  • Zi-Qiang Li
  • Feng-Biao Guo
  • Jian Huang
  • Nini Rao


Predicting protein structural classes helps in understanding the folding patterns, functions and interactions of proteins. In this study, we propose a feature selection-based method for accurately predicting protein structural classes. Three datasets with sequence identity lower than 25% were used to test the prediction performance of the method. In jackknife cross-validation, the method achieved overall accuracies of 92.1%, 89.7% and 84.0% on the three datasets, respectively. The proposed method is more efficient and accurate than other existing methods, and the present study offers an alternative to them for predicting protein structural classes.

Key words

protein structural class; feature selection technique; support vector machine; tetrapeptide
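The abstract gives no implementation details, so the following is only a minimal sketch of how a jackknife (leave-one-out) test over tetrapeptide features might be computed. It makes several assumptions: tetrapeptide frequencies are taken from overlapping 4-residue windows, a nearest-centroid rule stands in for the support vector machine named in the key words, and the feature selection step is omitted. All function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def tetrapeptide_counts(seq):
    """Count overlapping 4-residue windows that contain only standard residues."""
    counts = Counter()
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        if all(ch in AMINO_ACIDS for ch in window):
            counts[window] += 1
    return counts


def tetrapeptide_frequencies(seq):
    """Normalise tetrapeptide counts to frequencies (sparse dict)."""
    counts = tetrapeptide_counts(seq)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def nearest_centroid_predict(train, query):
    """Stand-in classifier (NOT an SVM): pick the class with the closest mean vector."""
    sums, sizes = {}, Counter()
    for feats, label in train:
        sizes[label] += 1
        acc = sums.setdefault(label, Counter())
        for k, v in feats.items():
            acc[k] += v
    best_label, best_dist = None, float("inf")
    for label in sorted(sums):
        centroid = {k: v / sizes[label] for k, v in sums[label].items()}
        keys = set(centroid) | set(query)
        dist = sum((centroid.get(k, 0.0) - query.get(k, 0.0)) ** 2 for k in keys)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(samples):
    """Leave each sample out once, train on the rest, and return the overall accuracy."""
    correct = 0
    for i, (feats, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_centroid_predict(train, feats) == label:
            correct += 1
    return correct / len(samples)


# Toy data only: made-up sequences with made-up class labels.
toy = [("AAAAAAAA", "class_1"), ("AAAAAAAL", "class_1"),
       ("VVVVVVVV", "class_2"), ("VVVVVVVI", "class_2")]
samples = [(tetrapeptide_frequencies(s), label) for s, label in toy]
overall_accuracy = jackknife_accuracy(samples)
```

In the jackknife test each protein is predicted by a model trained on all the others, so the overall accuracy is the fraction of proteins whose class is recovered when they are held out.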


Copyright information

© International Association of Scientists in the Interdisciplinary Areas and Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Key Laboratory for NeuroInformation of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
  2. Department of Physics, Center for Genomics and Computational Biology, College of Sciences, Hebei United University, Tangshan, China
  3. School of Information and Engineering, Sichuan Agricultural University, Yaan, China
