Phylogenetic Hidden Markov Models

  • Adam Siepel
  • David Haussler
Part of the Statistics for Biology and Health book series (SBH)


Keywords: Hidden Markov Model; Graphical Model; Marginal Probability; Conservation Score; Elimination Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



  1. M. Alexandersson, S. Cawley, and L. Pachter. Cross-species gene finding and alignment with a generalized pair hidden Markov model. Genome Res., 13:496–502, 2003.
  2. P. F. Arndt, C. B. Burge, and T. Hwa. DNA sequence evolution with neighbor-dependent mutation. In Proceedings of the 6th International Conference on Research in Computational Molecular Biology (RECOMB’02), pages 32–38. ACM Press, New York, 2002.
  3. D. Boffelli, J. McAuliffe, D. Ovcharenko, K. D. Lewis, I. Ovcharenko, L. Pachter, and E. M. Rubin. Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science, 299:1391–1394, 2003.
  4. F. Chiaromonte, R. J. Weber, K. M. Roskin, M. Diekhans, W. J. Kent, and D. Haussler. The share of human genomic DNA under selection estimated from human-mouse genomic alignments. Cold Spring Harbor Symp. Quant. Biol., 68:245–254, 2003.
  5. Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature, 420:520–562, 2002.
  6. Rat Genome Sequencing Project Consortium. Genome sequence of the Brown Norway Rat yields insights into mammalian evolution. Nature, 428:493–521, 2004.
  7. R. Durbin, S. Eddy, A. Krogh, and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, Cambridge, 1998.
  8. J. Felsenstein. Evolutionary trees from DNA sequences. J. Mol. Evol., 17:368–376, 1981.
  9. J. Felsenstein and G. A. Churchill. A hidden Markov model approach to variation among sites in rate of evolution. Mol. Biol. Evol., 13:93–104, 1996.
  10. N. Friedman, M. Ninio, I. Pe’er, and T. Pupko. A structural EM algorithm for phylogenetic inference. J. Comp. Biol., 9:331–353, 2002.
  11. N. Goldman, J. L. Thorne, and D. T. Jones. Using evolutionary trees in protein secondary structure prediction and other comparative sequence analyses. J. Mol. Biol., 263:196–208, 1996.
  12. N. Goldman and Z. Yang. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol., 11:725–735, 1994.
  13. M. Hasegawa, H. Kishino, and T. Yano. Dating the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol., 22:160–174, 1985.
  14. D. Heckerman. A tutorial on learning with Bayesian networks. In M. I. Jordan, editor, Learning in Graphical Models. MIT Press, Cambridge, MA, 1999.
  15. J. Hein, J. L. Jensen, and C. N. S. Pedersen. Recursions for statistical multiple alignment. Proc. Natl. Acad. Sci. USA, 100:14960–14965, 2003.
  16. S. T. Hess, J. D. Blake, and R. D. Blake. Wide variations in neighbor-dependent substitution rates. J. Mol. Biol., 236:1022–1033, 1994.
  17. I. Holmes. Using guide trees to construct multiple-sequence evolutionary HMMs. Bioinformatics, 19(Suppl. 1):i147–i157, 2003.
  18. I. Holmes and W. J. Bruno. Evolutionary HMMs: A Bayesian approach to multiple alignment. Bioinformatics, 17:803–820, 2001.
  19. D. Husmeier and G. McGuire. Detecting recombination in 4-taxa DNA sequence alignments with Bayesian hidden Markov models and Markov chain Monte Carlo. Mol. Biol. Evol., 20:315–337, 2003.
  20. D. Husmeier and F. Wright. Detection of recombination in DNA multiple alignments with hidden Markov models. J. Comp. Biol., 8:401–427, 2001.
  21. J. L. Jensen and A.-M. K. Pedersen. Probabilistic models of DNA sequence evolution with context dependent rates of substitution. Adv. Appl. Prob., 32:499–517, 2000.
  22. V. Jojic, N. Jojic, C. Meek, D. Geiger, A. Siepel, D. Haussler, and D. Heckerman. Efficient approximations for learning phylogenetic HMM models from data. In Proceedings of the 12th International Conference on Intelligent Systems for Molecular Biology. UAI Press, Banff, Canada, 2004.
  23. M. I. Jordan and Y. Weiss. Graphical models: probabilistic inference. In M. Arbib, editor, The Handbook of Brain Theory and Neural Networks. MIT Press, Cambridge, MA, second edition, 2002.
  24. M. Kellis, N. Patterson, M. Endrizzi, B. Birren, and E. S. Lander. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature, 423:241–254, 2003.
  25. W. J. Kent, C. W. Sugnet, T. S. Furey, K. M. Roskin, T. H. Pringle, A. M. Zahler, and D. Haussler. The human genome browser at UCSC. Genome Res., 12:996–1006, 2002.
  26. B. Knudsen and J. Hein. RNA secondary structure prediction using stochastic context-free grammars and evolutionary history. Bioinformatics, 15:446–454, 1999.
  27. J. M. Koshi and R. M. Goldstein. Probabilistic reconstruction of ancestral protein sequences. J. Mol. Evol., 42:313–320, 1996.
  28. P. Liò and N. Goldman. Models of molecular evolution and phylogeny. Genome Res., 8:1233–1244, 1998.
  29. P. Liò, N. Goldman, J. L. Thorne, and D. T. Jones. PASSML: Combining evolutionary inference and protein secondary structure prediction. Bioinformatics, 14:726–733, 1998.
  30. B. Lucena. Dynamic programming, tree-width, and computation on graphical models. PhD thesis, Brown University, 2002.
  31. W. P. Maddison and D. R. Maddison. Introduction to inference for Bayesian networks. In M. I. Jordan, editor, Learning in Graphical Models. MIT Press, Cambridge, MA, 1999.
  32. E. H. Margulies, M. Blanchette, NISC Comparative Sequencing Program, D. Haussler, and E. D. Green. Identification and characterization of multi-species conserved sequences. Genome Res., 13:2507–2518, 2003.
  33. J. D. McAuliffe, L. Pachter, and M. I. Jordan. Multiple-sequence functional annotation and the generalized hidden Markov phylogeny. Bioinformatics, 20:1850–1860, 2004.
  34. G. McGuire, F. Wright, and M. J. Prentice. A Bayesian model for detecting past recombination events in DNA multiple alignments. J. Comp. Biol., 7:159–170, 2000.
  35. I. M. Meyer and R. Durbin. Comparative ab initio prediction of gene structures using pair HMMs. Bioinformatics, 18:1309–1318, 2002.
  36. G. J. Mitchison. A probabilistic treatment of phylogeny and sequence alignment. J. Mol. Evol., 49:11–22, 1999.
  37. K. Murphy, Y. Weiss, and M. I. Jordan. Loopy belief-propagation for approximate inference: An empirical study. In K. B. Laskey and H. Prade, editors, Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 467–476. Morgan Kaufmann, San Mateo, CA, 1999.
  38. J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo, CA, 1988.
  39. A.-M. K. Pedersen and J. L. Jensen. A dependent rates model and MCMC based methodology for the maximum likelihood analysis of sequences with overlapping reading frames. Mol. Biol. Evol., 18:763–776, 2001.
  40. A.-M. K. Pedersen, C. Wiuf, and F. B. Christiansen. A codon-based model designed to describe lentiviral evolution. Mol. Biol. Evol., 15:1069–1081, 1998.
  41. J. S. Pedersen and J. Hein. Gene finding with a hidden Markov model of genome structure and evolution. Bioinformatics, 19:219–227, 2003.
  42. A. Siepel and D. Haussler. Combining phylogenetic and hidden Markov models in biosequence analysis. J. Comp. Biol., 11(2-3):413–428, 2004.
  43. A. Siepel and D. Haussler. Computational identification of evolutionarily conserved exons. In Proceedings of the 8th International Conference on Research in Computational Molecular Biology (RECOMB’04), pages 177–186. ACM Press, New York, 2004.
  44. A. Siepel and D. Haussler. Phylogenetic estimation of context-dependent substitution rates by maximum likelihood. Mol. Biol. Evol., 21:468–488, 2004.
  45. N. Stojanovic, L. Florea, C. Riemer, D. Gumucio, J. Slightom, M. Goodman, W. Miller, and R. Hardison. Comparison of five methods for finding conserved sequences in multiple alignments of gene regulatory regions. Nucleic Acids Res., 27:3899–3910, 1999.
  46. J. W. Thomas, J. W. Touchman, R. W. Blakesley, et al. Comparative analyses of multi-species sequences from targeted genomic regions. Nature, 424:788–793, 2003.
  47. J. L. Thorne, N. Goldman, and D. T. Jones. Combining protein evolution and secondary structure. Mol. Biol. Evol., 13:666–673, 1996.
  48. M. Wainwright, T. Jaakkola, and A. Willsky. Tree-based reparameterization framework for analysis of sum-product and related algorithms. IEEE Trans. Inf. Theory, 49:1120–1146, 2001.
  49. M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference. Technical Report 649, Department of Statistics, University of California, Berkeley, 2003.
  50. S. Whelan, P. Liò, and N. Goldman. Molecular phylogenetics: State-of-the-art methods for looking into the past. Trends Genet., 17:262–272, 2001.
  51. Z. Yang. Estimating the pattern of nucleotide substitution. J. Mol. Evol., 39:105–111, 1994.
  52. Z. Yang. A space-time process model for the evolution of DNA sequences. Genetics, 139:993–1005, 1995.
  53. J. Yedidia, W. Freeman, and Y. Weiss. Bethe free energy, Kikuchi approximations, and belief propagation algorithms. Technical Report TR2001-16, Mitsubishi Electric Research Laboratories, Cambridge, MA, 2001.

Copyright information

© Springer Science+Business Media, Inc. 2005

Authors and Affiliations

  • Adam Siepel (1)
  • David Haussler (2)
  1. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, USA
  2. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, USA
