Faster Gradient Descent Training of Hidden Markov Models, Using Individual Learning Rate Adaptation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3264)

Abstract

Hidden Markov Models (HMMs) are probabilistic models suitable for a wide range of pattern recognition tasks. In this work, we propose a new gradient descent method for Conditional Maximum Likelihood (CML) training of HMMs, which significantly outperforms traditional gradient descent. Instead of using a fixed learning rate for every adjustable parameter of the HMM, we propose independent learning rate (step-size) adaptation for each parameter, a strategy that has proven valuable in training Artificial Neural Networks. Compared to standard gradient descent, convergence speed increases up to fivefold, while the training procedure becomes more robust, as tested on applications from molecular biology. This is accomplished without additional computational complexity or the need for parameter tuning.
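
The abstract does not state the exact adaptation rule, so the following is only a minimal sketch of the general idea of per-parameter learning rates, not the authors' algorithm. It assumes a generic sign-based rule: a parameter's rate grows while its gradient keeps the same sign and shrinks when the sign flips. The function names, the factors `up` and `down`, the rate bounds, and the toy quadratic objective are all hypothetical choices made for illustration. The sketch also treats parameters as unconstrained real numbers, whereas HMM probabilities would need a normalisation or reparameterisation step that is not shown here.

```python
def adapt_and_step(params, grads, prev_grads, rates,
                   up=1.2, down=0.5, rate_min=1e-6, rate_max=1.0):
    """One descent step with a separate learning rate per parameter.

    The factors and bounds are illustrative assumptions, not values
    taken from the paper.
    """
    new_params, new_rates = [], []
    for p, g, g_prev, r in zip(params, grads, prev_grads, rates):
        if g * g_prev > 0:
            # Same gradient sign as last step: take larger steps.
            r = min(r * up, rate_max)
        elif g * g_prev < 0:
            # Sign flipped (a minimum was overshot): take smaller steps.
            r = max(r * down, rate_min)
        new_params.append(p - r * g)
        new_rates.append(r)
    return new_params, new_rates


def minimize(grad_fn, params, rate=0.01, steps=200):
    """Run adaptive descent from a shared initial learning rate."""
    rates = [rate] * len(params)
    prev_grads = [0.0] * len(params)
    for _ in range(steps):
        grads = grad_fn(params)
        params, rates = adapt_and_step(params, grads, prev_grads, rates)
        prev_grads = grads
    return params, rates


def toy_grad(params):
    """Gradient of the toy objective f(x, y) = x**2 + 100 * y**2."""
    x, y = params
    return [2.0 * x, 200.0 * y]


# The two coordinates have very different curvature, so no single
# fixed rate suits both; each one settles on its own step size.
final_params, final_rates = minimize(toy_grad, [5.0, 1.0], rate=0.001)
```

On this badly scaled toy problem, the flat coordinate ends up with a much larger learning rate than the steep one, which is the situation where a single shared rate forces a compromise.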



Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bagos, P.G., Liakopoulos, T.D., Hamodrakas, S.J. (2004). Faster Gradient Descent Training of Hidden Markov Models, Using Individual Learning Rate Adaptation. In: Paliouras, G., Sakakibara, Y. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2004. Lecture Notes in Computer Science, vol 3264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30195-0_5

  • DOI: https://doi.org/10.1007/978-3-540-30195-0_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23410-4

  • Online ISBN: 978-3-540-30195-0

  • eBook Packages: Springer Book Archive
