Selection on the Protein-Coding Genome

  • Protocol
Evolutionary Genomics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 856))

Abstract

Populations evolve as mutations arise in individual organisms and, through hereditary transmission, sometimes become “fixed” (shared by all individuals) in the population. Most mutations are lethal or have negative fitness consequences for the organism. Others have essentially no effect on organismal fitness and can become fixed through the neutral stochastic process known as random drift. However, mutations may also confer a selective advantage that boosts their chances of reaching fixation. Regions of genes where new mutations are beneficial, rather than neutral or deleterious, tend to evolve more rapidly because of positive selection. Genes involved in immunity and defense are a well-known example; rapid evolution in these genes presumably occurs because new mutations help organisms to prevail in evolutionary “arms races” with pathogens. In recent years, genome-wide scans for selection have enlarged our understanding of how protein-coding regions evolve in various species. In this chapter, we focus on methods for detecting selection in protein-coding genes. In particular, we discuss probabilistic models and how they have changed with the advent of genome-wide data.

Acknowledgments

C.K. is supported by the University of Veterinary Medicine Vienna. M.A. is supported by the ETH Zurich and also receives funding from the Swiss National Science Foundation (grant 31003A_127325).

Author information

Correspondence to Carolin Kosiol.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Kosiol, C., Anisimova, M. (2012). Selection on the Protein-Coding Genome. In: Anisimova, M. (eds) Evolutionary Genomics. Methods in Molecular Biology, vol 856. Humana Press. https://doi.org/10.1007/978-1-61779-585-5_5

  • DOI: https://doi.org/10.1007/978-1-61779-585-5_5

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-61779-584-8

  • Online ISBN: 978-1-61779-585-5

  • eBook Packages: Springer Protocols
