Simulated Annealing Algorithm with Biased Neighborhood Distribution for Training Profile Models
Functional biological sequences, which typically come in families, have retained some level of similarity and function during evolution. Finding consensus regions, aligning sequences, and identifying the relationship between a sequence and a family allow inferences about the function of the sequences. Profile hidden Markov models (HMMs) are generally used to identify those relationships. A profile HMM can be trained on unaligned members of the family using conventional algorithms such as Baum-Welch, Viterbi, and their modifications. The overall quality of the alignment depends on the quality of the trained model. Unfortunately, the conventional training algorithms converge to suboptimal models most of the time. This work proposes a training algorithm that identifies many imperfect models early. The method is based on the Simulated Annealing approach widely used in discrete optimization problems. The training algorithm is implemented as a component in HMMER. The performance of the algorithm is discussed on protein sequence data.
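For readers unfamiliar with the Simulated Annealing approach mentioned above, the following is a minimal generic sketch, not the training algorithm of this work. It minimizes a toy one-dimensional cost using the Metropolis acceptance rule and a neighborhood that proposes small moves more often than large ones. The names `simulated_annealing`, `biased_neighbor`, and `toy_cost`, the geometric cooling schedule, and all parameter values are assumptions chosen for illustration; the abstract does not specify the biased neighborhood distribution used for profile HMMs.

```python
import math
import random


def toy_cost(x):
    # Hypothetical multimodal cost over integers; stands in for a model score.
    return (x - 20) ** 2 / 40 + 3 * math.cos(x)


def biased_neighbor(x, rng):
    # Non-uniform (but symmetric) proposal: small steps are more likely.
    # This is an illustrative bias, not the distribution from the paper.
    steps = [-3, -2, -1, 1, 2, 3]
    weights = [1, 2, 4, 4, 2, 1]
    return x + rng.choices(steps, weights=weights)[0]


def simulated_annealing(cost, neighbor, start, t_start=5.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=50, rng=None):
    rng = rng or random.Random()
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Metropolis rule: always accept improvements; accept a worse
            # candidate with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= cooling  # geometric cooling (an assumed schedule)
    return best, best_cost


if __name__ == "__main__":
    best, best_cost = simulated_annealing(
        toy_cost, biased_neighbor, start=0, rng=random.Random(0))
    print(best, best_cost)
```

At high temperature the search accepts many uphill moves and can leave poor local minima; as the temperature falls it behaves more like greedy descent. In a profile HMM setting the state would be a set of model parameters and the cost a negative log-likelihood, but that mapping is only suggested here, not taken from the source.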
Keywords: Simulated Annealing; Hidden Markov Model; Multiple Alignment; Training Algorithm; Simulated Annealing Algorithm