IAPR International Conference on Pattern Recognition in Bioinformatics

PRIB 2012: Pattern Recognition in Bioinformatics, pp 166–177

Cascading Discriminant and Generative Models for Protein Secondary Structure Prediction

  • Fabienne Thomarat,
  • Fabien Lauer &
  • Yann Guermeur

Part of the Lecture Notes in Computer Science book series (LNBI, volume 7632)

Abstract

Most state-of-the-art methods for protein secondary structure prediction are complex combinations of discriminant models. They take a local approach to prediction, which is known to limit the expected prediction accuracy. A priori, the use of generative models should make it possible to overcome this limitation. However, among the numerous hidden Markov models dedicated to this task over more than two decades, none has come close to providing comparable performance. A major reason lies in the nature of the relevant information. Indeed, it is well known that, irrespective of the model implemented, the prediction benefits significantly from the availability of evolutionary information. Currently, this knowledge is embedded in position-specific scoring matrices, which cannot be processed easily with hidden Markov models. Given this observation, the next significant advance should come from making the best of the two approaches, i.e., using a generative model on top of discriminant models. This article introduces the first hybrid architecture of this kind with state-of-the-art performance. The conjunction of the two levels of treatment makes it possible to optimize the recognition rate at both the residue level and the segment level.
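
To make the idea of "a generative model on top of discriminant models" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the discriminant level outputs, for each residue, estimates of the class membership probabilities over three states (H, E, C), and that a plain first-order hidden Markov model decoded with the Viterbi algorithm plays the role of the generative level. Dividing each posterior by the class prior to obtain a scaled likelihood is a common hybrid-model convention, assumed here rather than taken from the paper. The function name, the transition matrix, and all numerical values are hypothetical.

```python
import math

STATES = ("H", "E", "C")  # helix, strand, coil (assumed three-state alphabet)


def viterbi_on_posteriors(posteriors, transitions, priors, initial):
    """Return the most probable state path for a sequence of residues.

    posteriors:  list of dicts, one per residue, state -> estimated P(state | window)
    transitions: dict of dicts, transitions[a][b] = P(next = b | current = a)
    priors:      dict, state -> P(state); must be strictly positive
    initial:     dict, state -> P(first state)
    """
    if not posteriors:
        return ""

    def log(x):
        return math.log(x) if x > 0 else float("-inf")

    def emission(t, s):
        # Scaled likelihood: posterior divided by prior (an assumption of this sketch).
        return log(posteriors[t][s]) - log(priors[s])

    score = {s: log(initial[s]) + emission(0, s) for s in STATES}
    back = []  # back[t-1][s] = best predecessor of state s at position t
    for t in range(1, len(posteriors)):
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log(transitions[p][s]))
            new_score[s] = score[best_prev] + log(transitions[best_prev][s]) + emission(t, s)
            pointers[s] = best_prev
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


# Toy input (made-up numbers): the third residue locally favours strand.
posteriors = [
    {"H": 0.7, "E": 0.1, "C": 0.2},
    {"H": 0.6, "E": 0.2, "C": 0.2},
    {"H": 0.4, "E": 0.5, "C": 0.1},
    {"H": 0.7, "E": 0.1, "C": 0.2},
    {"H": 0.6, "E": 0.1, "C": 0.3},
]
# "Sticky" transitions: staying in a state is far more likely than switching.
transitions = {a: {b: (0.8 if a == b else 0.1) for b in STATES} for a in STATES}
priors = {s: 1.0 / 3.0 for s in STATES}
initial = {s: 1.0 / 3.0 for s in STATES}

local = "".join(max(p, key=p.get) for p in posteriors)  # residue-by-residue decision
smoothed = viterbi_on_posteriors(posteriors, transitions, priors, initial)
print(local, smoothed)
```

On this toy input, the residue-by-residue decision yields an isolated one-residue strand inside a helix, whereas the sequence-level decoding returns a single helical segment. This is one way in which a second level of treatment can act on segments rather than on individual residues.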

Keywords

  • protein secondary structure prediction
  • discriminant models
  • class membership probabilities
  • hidden Markov models



Author information

Authors and Affiliations

  1. LORIA – CNRS, INRIA, Université de Lorraine, Campus Scientifique, BP 239, 54506, Vandœuvre-lès-Nancy Cedex, France

    Fabienne Thomarat, Fabien Lauer & Yann Guermeur

Editor information

Editors and Affiliations

  1. Institute of Medical Science, University of Tokyo, 4-6-1, Shirokanedai, 108-8639, Minato-ku, Tokyo, Japan

    Tetsuo Shibuya

  2. Department of Mathematical Informatics, The University of Tokyo, 7-3-1 Hongo, 113-8654, Bunkyo-ku, Tokyo, Japan

    Hisashi Kashima

  3. Department of Computer Science, Tokyo Institute of Technology, 2-12-1 Ookayama, 152-8550, Meguro-ku, Tokyo, Japan

    Jun Sese

  4. Bioinformatics Project, National Institute of Biomedical Innovation, 7-6-8 Saito-Asagi, 567-0085, Suita, Osaka, Japan

    Shandar Ahmad

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thomarat, F., Lauer, F., Guermeur, Y. (2012). Cascading Discriminant and Generative Models for Protein Secondary Structure Prediction. In: Shibuya, T., Kashima, H., Sese, J., Ahmad, S. (eds) Pattern Recognition in Bioinformatics. PRIB 2012. Lecture Notes in Computer Science, vol 7632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34123-6_15

  • DOI: https://doi.org/10.1007/978-3-642-34123-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34122-9

  • Online ISBN: 978-3-642-34123-6

  • eBook Packages: Computer Science, Computer Science (R0)


  • Published in cooperation with The International Association for Pattern Recognition (http://www.iapr.org/)
