Simulated Annealing Algorithm with Biased Neighborhood Distribution for Training Profile Models

  • Anton Bezuglov
  • Juan E. Vargas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4203)


Functional biological sequences, which typically come in families, have retained some level of similarity and function during evolution. Finding consensus regions, aligning sequences, and identifying the relationship between a sequence and a family allow inferences about the function of the sequences. Profile hidden Markov models (HMMs) are generally used to identify those relationships. A profile HMM can be trained on unaligned members of the family using conventional algorithms such as Baum-Welch, Viterbi, and their modifications. The overall quality of the alignment depends on the quality of the trained model. Unfortunately, the conventional training algorithms converge to suboptimal models most of the time. This work proposes a training algorithm that identifies many imperfect models early. The method is based on simulated annealing, an approach widely used in discrete optimization problems. The training algorithm is implemented as a component in HMMER. Its performance is discussed on protein sequence data.
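For readers unfamiliar with simulated annealing, the following is a minimal generic sketch of the technique in Python. It is not the authors' training algorithm and does not implement their biased neighborhood distribution or any HMMER interface; the function names, the toy cost function, the uniform ±1 neighbor proposal, and the geometric cooling schedule are all assumptions chosen only for illustration.

```python
import math
import random


def simulated_annealing(cost, neighbor, x0, t0=10.0, cooling=0.95,
                        steps=2000, seed=0):
    """Generic simulated annealing: try to minimize cost(x) starting at x0."""
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(steps):
        y = neighbor(x, rng)
        fy = cost(y)
        delta = fy - fx
        # Metropolis rule: always accept an improvement; accept a worse
        # candidate with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        # Geometric cooling (an assumed schedule), floored to avoid t == 0.
        t = max(t * cooling, 1e-12)
    return best, fbest


def toy_cost(i):
    """Hypothetical bumpy cost on integers: local minima at multiples of 10,
    global minimum at i == 70."""
    return abs(i - 70) + (0 if i % 10 == 0 else 5)


def step_neighbor(i, rng):
    """Unbiased proposal (assumption): move one step left or right in [0, 99]."""
    return min(99, max(0, i + rng.choice((-1, 1))))


best, fbest = simulated_annealing(toy_cost, step_neighbor, x0=5)
print(best, fbest)
```

At high temperature the search accepts many uphill moves and can leave local minima; as the temperature falls it behaves more like greedy descent. A biased neighborhood distribution, as named in the title, would replace the uniform proposal in `step_neighbor` with one that favors some candidates over others.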


Simulated Annealing, Hidden Markov Model, Multiple Alignment, Training Algorithm, Simulated Annealing Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Anton Bezuglov (1)
  • Juan E. Vargas (2)
  1. University of South Carolina, Columbia, USA
  2. Microsoft Co., Redmond, USA
