New Methods for Detecting Lineage-Specific Selection

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3909)

Abstract

So far, most methods for identifying sequences under selection based on comparative sequence data have either assumed selectional pressures are the same across all branches of a phylogeny, or have focused on changes in specific lineages of interest. Here, we introduce a more general method that detects sequences that have either come under selection, or begun to drift, on any lineage. The method is based on a phylogenetic hidden Markov model (phylo-HMM), and does not require element boundaries to be determined a priori, making it particularly useful for identifying noncoding sequences. Insertions and deletions (indels) are incorporated into the phylo-HMM by a simple strategy that uses a separately reconstructed “indel history.” To evaluate the statistical significance of predictions, we introduce a novel method for computing P-values based on prior and posterior distributions of the number of substitutions that have occurred in the evolution of predicted elements. We derive efficient dynamic-programming algorithms for obtaining these distributions, given a model of neutral evolution. Our methods have been implemented as computer programs called DLESS (Detection of LinEage-Specific Selection) and phyloP (phylogenetic P-values). We discuss results obtained with these programs on both real and simulated data sets.

This paper is presented here in abbreviated form; the complete version is available from http://www.bscb.cornell.edu/Homepages/Adam_Siepel/dless.pdf
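
The P-value idea in the abstract can be illustrated with a minimal sketch of the prior side only. It is not the DLESS or phyloP implementation. It assumes a Jukes-Cantor neutral model, under which the number of substitutions per site over a tree of total branch length `tree_length` is Poisson distributed, and it treats the observed substitution count as known, whereas the paper works with a posterior distribution over that count. All function names and parameters here are hypothetical.

```python
import math

def poisson_pmf(mean, max_count):
    """Return P(N = k) for k = 0..max_count, with N ~ Poisson(mean)."""
    pmf = [math.exp(-mean)]
    for k in range(1, max_count + 1):
        pmf.append(pmf[-1] * mean / k)
    return pmf

def convolve(a, b, max_count):
    """Distribution of the sum of two independent counts, truncated at max_count."""
    out = [0.0] * (max_count + 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            if i + j > max_count:
                break
            out[i + j] += pa * pb
    return out

def prior_over_sites(tree_length, n_sites, max_count):
    """Prior on the total number of substitutions in an element of n_sites sites.

    Assumption: sites evolve independently under Jukes-Cantor, so each site
    contributes a Poisson(tree_length) count; the element-wide prior is built
    by dynamic programming, one site-wise convolution at a time.
    """
    per_site = poisson_pmf(tree_length, max_count)
    total = [1.0] + [0.0] * max_count  # zero sites: zero substitutions
    for _ in range(n_sites):
        total = convolve(total, per_site, max_count)
    return total

def conservation_p_value(prior, observed):
    """P(total substitutions <= observed) under the neutral prior."""
    return sum(prior[: observed + 1])

if __name__ == "__main__":
    prior = prior_over_sites(tree_length=0.5, n_sites=10, max_count=60)
    print(conservation_p_value(prior, observed=1))
```

A small P-value here means the element shows fewer substitutions than neutral evolution would typically produce, which is the signature of conservation. Under a more general substitution model the per-site distribution is no longer Poisson, so `poisson_pmf` would need to be replaced by a model-specific computation, while the convolution step would stay the same.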

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Siepel, A., Pollard, K.S., Haussler, D. (2006). New Methods for Detecting Lineage-Specific Selection. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2006. Lecture Notes in Computer Science, vol 3909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732990_17

  • DOI: https://doi.org/10.1007/11732990_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33295-4

  • Online ISBN: 978-3-540-33296-1

  • eBook Packages: Computer Science, Computer Science (R0)
