Large-Scale Analyses of Positive Selection Using Codon Models



Positive selection is the mechanism of adaptation to the environment and the main source of evolutionary novelty, so finding its traces in genomes is of great interest. Over the last decade, different evolutionary models have been developed to detect positive selection at the gene level, based on divergence between species. More recently, these models have been applied to large-scale comparisons of genomes. In this chapter, we present some strengths and limitations of such genomic scans for positive selection and discuss the main recent large-scale studies, as well as relevant databases. In particular, we discuss our recent results on the impact of genome duplication in vertebrate evolution and our related database, Selectome.



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  1. Department of Ecology and Evolution, Biophore, Lausanne University, Lausanne, Switzerland
  2. Swiss Institute of Bioinformatics, Lausanne, Switzerland
