Journal of Molecular Evolution, Volume 59, Issue 1, pp 121–132

A Maximum Likelihood Method for Detecting Functional Divergence at Individual Codon Sites, with Application to Gene Family Evolution

  • Joseph P. Bielawski
  • Ziheng Yang


The tailoring of existing genetic systems to new uses is called genetic co-option. Mechanisms of genetic co-option have been difficult to study because functionally important changes are hard to identify. One way to study genetic co-option in protein-coding genes is to identify the amino acid sites that have experienced changes in selective pressure following a co-option event. In this paper we present a maximum likelihood method for measuring divergent selective pressures and identifying the amino acid sites affected by divergent selection. The method is based on a codon model of evolution and uses the nonsynonymous-to-synonymous rate ratio (ω) as a measure of selection on the protein, with ω = 1, ω < 1, and ω > 1 indicating neutral evolution, purifying selection, and positive selection, respectively. The model allows variation in ω among sites, with a fraction of sites evolving under divergent selective pressures. Divergent selection is indicated by different ω values between clades, such as between paralogous clades of a gene family. We applied the codon model to duplication followed by functional divergence of (i) the ε and γ globin genes and (ii) the eosinophil cationic protein (ECP) and eosinophil-derived neurotoxin (EDN) genes. In both cases likelihood ratio tests suggested the presence of sites evolving under divergent selective pressures. Results of the ε and γ globin analysis suggested that divergent selective pressures might be a consequence of a weakened relationship between fetal hemoglobin and 2,3-diphosphoglycerate. We suggest that empirical Bayesian identification of sites evolving under divergent selective pressures, combined with structural and functional information, can provide a valuable framework for identifying and studying mechanisms of genetic co-option. Limitations of the new method are discussed.
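As an illustration of two ideas in the abstract, the sketch below shows how a value of ω maps onto a selection regime and how a likelihood ratio test compares a null codon model with a nested alternative. It is a minimal sketch, not the authors' implementation: the function names, the tolerance, the chi-square approximation with a user-supplied number of degrees of freedom, and any log-likelihood values passed in are assumptions made for illustration only.

```python
import math


def classify_omega(omega, tol=1e-9):
    """Map a nonsynonymous/synonymous rate ratio to a selection regime.

    omega = 1 -> neutral evolution, omega < 1 -> purifying selection,
    omega > 1 -> positive selection. `tol` is a hypothetical tolerance
    for treating a floating-point value as exactly 1.
    """
    if abs(omega - 1.0) <= tol:
        return "neutral evolution"
    return "purifying selection" if omega < 1.0 else "positive selection"


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate.

    Uses closed forms only: df = 1 via erfc, even df via a finite sum.
    Other odd df are not supported in this sketch.
    """
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df > 0 and df % 2 == 0:
        # sf = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        half = x / 2.0
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch supports df = 1 or even df only")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p-value) for two nested models.

    The statistic is 2 * (lnL_alt - lnL_null); comparing it with a
    chi-square distribution with `df` degrees of freedom is an assumed
    approximation, and `df` must be supplied by the caller.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)
```

For example, `likelihood_ratio_test(-1000.0, -995.0, 2)` uses made-up log-likelihoods and yields a statistic of 10.0; whether the chi-square approximation and the chosen degrees of freedom are appropriate depends on the models being compared.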


Keywords: Maximum likelihood; Functional divergence; Codon model; ECP; EDN; Globins



Valuable discussions were contributed by Gabriela Aguileta. We thank Katherine A. Dunn and Gabriela Aguileta for constructive comments on the manuscript. This research was supported by a UK Biotechnology and Biological Sciences Research Council Grant.



Copyright information

© Springer-Verlag 2004

Authors and Affiliations

  1. Department of Biology, University College London, London, UK
  2. Department of Biology, Dalhousie University, Halifax, Canada
