Protein Fold Recognition Using an Overlapping Segmentation Approach and a Mixture of Feature Extraction Models

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 8272)

Abstract

Protein Fold Recognition (PFR) is considered a critical step towards solving the protein structure prediction problem. PFR also has a profound impact on protein function determination and drug design. Despite the enhancements achieved by pattern recognition-based approaches, protein fold recognition remains unsolved and its prediction accuracy remains limited. In this study, we propose a new model based on a mixture of physicochemical and evolutionary features. We then design and develop two novel overlapping segmentation-based feature extraction methods. Our proposed methods capture more local and global discriminatory information than previously proposed approaches for this task. We investigate the impact of our approaches using the most promising attributes selected from a wide range of physicochemical-based attributes (117 attributes), which are also explored experimentally in this study. Using a Support Vector Machine (SVM), our experimental results demonstrate a significant improvement (up to 5.7%) in protein fold prediction accuracy over previously reported results in the literature.

Keywords

  • Mixture of Feature Extraction Model
  • Overlapping Segmented Distribution
  • Overlapping Segmented Auto Covariance
  • Support Vector Machine
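
The abstract and keywords name overlapping segmentation and auto covariance but do not define them here. The sketch below is only an illustration of the general idea, not the authors' implementation: a per-residue numeric profile (for example, one physicochemical attribute mapped along the sequence) is cut into overlapping windows, and auto covariance at a few lags is computed inside each window. The function names, the overlap fraction, the number of segments, and the lag range are all assumptions made for this example.

```python
from typing import List, Sequence


def overlapping_segments(values: Sequence[float], n_segments: int,
                         overlap: float) -> List[List[float]]:
    """Cut `values` into n_segments windows, each widened on both sides by
    `overlap` times the base window length (clipped at the sequence ends).
    This windowing scheme is a hypothetical choice for illustration."""
    n = len(values)
    base = n / n_segments
    pad = base * overlap
    segments = []
    for i in range(n_segments):
        start = max(0, int(round(i * base - pad)))
        end = min(n, int(round((i + 1) * base + pad)))
        segments.append(list(values[start:end]))
    return segments


def auto_covariance(values: Sequence[float], lag: int) -> float:
    """Mean product of mean-centred values that are `lag` positions apart."""
    n = len(values)
    if lag >= n:
        return 0.0  # window too short for this lag
    mean = sum(values) / n
    total = sum((values[i] - mean) * (values[i + lag] - mean)
                for i in range(n - lag))
    return total / (n - lag)


def segmented_auto_covariance(values: Sequence[float], n_segments: int = 3,
                              overlap: float = 0.5,
                              max_lag: int = 2) -> List[float]:
    """Concatenate auto covariance features (lags 1..max_lag) per window."""
    features = []
    for segment in overlapping_segments(values, n_segments, overlap):
        for lag in range(1, max_lag + 1):
            features.append(auto_covariance(segment, lag))
    return features


# Toy profile standing in for one attribute along a 12-residue sequence.
profile = [float(v) for v in range(12)]
features = segmented_auto_covariance(profile)
```

With these defaults the feature vector has `n_segments * max_lag` entries per attribute. Vectors like this, built from several attributes, are the kind of input that could then be passed to an SVM classifier.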

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Dehzangi, A., Paliwal, K., Sharma, A., Lyons, J., Sattar, A. (2013). Protein Fold Recognition Using an Overlapping Segmentation Approach and a Mixture of Feature Extraction Models. In: Cranefield, S., Nayak, A. (eds) AI 2013: Advances in Artificial Intelligence. AI 2013. Lecture Notes in Computer Science (LNAI), vol 8272. Springer, Cham. https://doi.org/10.1007/978-3-319-03680-9_4

  • DOI: https://doi.org/10.1007/978-3-319-03680-9_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03679-3

  • Online ISBN: 978-3-319-03680-9

  • eBook Packages: Computer Science, Computer Science (R0)