Faster Gradient Descent Training of Hidden Markov Models, Using Individual Learning Rate Adaptation

  • Pantelis G. Bagos
  • Theodore D. Liakopoulos
  • Stavros J. Hamodrakas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3264)


Hidden Markov Models (HMMs) are probabilistic models suitable for a wide range of pattern recognition tasks. In this work, we propose a new gradient descent method for Conditional Maximum Likelihood (CML) training of HMMs, which significantly outperforms traditional gradient descent. Instead of using a fixed learning rate for every adjustable parameter of the HMM, we propose the use of independent learning rate/step-size adaptation, which has proven valuable as a strategy in Artificial Neural Network training. We show here that our approach performs significantly better than standard gradient descent. Convergence is up to five times faster, while at the same time the training procedure becomes more robust, as tested on applications from molecular biology. This is accomplished without additional computational complexity or the need for parameter tuning.
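The idea of giving each parameter its own step size can be illustrated with a minimal sketch. The abstract does not state the exact update rule, so the code below assumes a common sign-based scheme from the neural network literature: a parameter's learning rate grows while successive gradients agree in sign and shrinks when they disagree. The function name, the constants `up`, `down`, `eta_min`, `eta_max`, and the quadratic test objective are illustrative assumptions, not values from the paper, and the sketch ignores the normalization constraints that HMM probabilities must satisfy.

```python
import numpy as np

def adaptive_gradient_descent(grad_fn, theta0, eta0=0.001, up=1.2, down=0.5,
                              eta_min=1e-6, eta_max=1.0, n_iter=200):
    """Gradient descent with one learning rate per parameter (illustrative sketch).

    grad_fn: callable returning the gradient of the objective at theta.
    All constants are assumed defaults, not taken from the paper.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    eta = np.full_like(theta, eta0)      # individual learning rates
    prev_grad = np.zeros_like(theta)     # no sign information on the first step
    for _ in range(n_iter):
        grad = grad_fn(theta)
        # +1 where the gradient kept its sign, -1 where it flipped, 0 otherwise
        agree = np.sign(grad) * np.sign(prev_grad)
        # Same sign: the step was too cautious, so enlarge that parameter's rate
        eta = np.where(agree > 0, np.minimum(eta * up, eta_max), eta)
        # Sign flip: the step overshot, so reduce that parameter's rate
        eta = np.where(agree < 0, np.maximum(eta * down, eta_min), eta)
        theta -= eta * grad
        prev_grad = grad
    return theta, eta

if __name__ == "__main__":
    # Toy objective 0.5 * sum(curv * theta**2) with very different curvatures
    curv = np.array([1.0, 100.0])
    theta, eta = adaptive_gradient_descent(lambda t: curv * t, [1.0, 1.0])
    print("theta:", theta)
    print("final learning rates:", eta)
```

On this toy problem a single fixed learning rate must stay small enough for the steep direction, which slows progress along the flat one; per-parameter rates let each coordinate settle on its own scale.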


Hidden Markov Model; Learning Rate; Gradient Descent; Emission Probability; Gradient Descent Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Pantelis G. Bagos (1)
  • Theodore D. Liakopoulos (1)
  • Stavros J. Hamodrakas (1)
  1. Department of Cell Biology and Biophysics, Faculty of Biology, University of Athens, Panepistimiopolis, Athens, Greece
