A Maximum Likelihood Method for Detecting Functional Divergence at Individual Codon Sites, with Application to Gene Family Evolution
The tailoring of existing genetic systems to new uses is called genetic co-option. Mechanisms of genetic co-option have been difficult to study because functionally important changes are hard to identify. One way to study genetic co-option in protein-coding genes is to identify those amino acid sites that have experienced changes in selective pressure following a genetic co-option event. In this paper we present a maximum likelihood method for measuring divergent selective pressures and identifying the amino acid sites affected by divergent selection. The method is based on a codon model of evolution and uses the nonsynonymous-to-synonymous rate ratio (ω) as a measure of selection on the protein, with ω = 1, ω < 1, and ω > 1 indicating neutral evolution, purifying selection, and positive selection, respectively. The model allows variation in ω among sites, with a fraction of sites evolving under divergent selective pressures. Divergent selection is indicated by different ω values between clades, such as between paralogous clades of a gene family. We applied the codon model to two cases of gene duplication followed by functional divergence: (i) the ε and γ globin genes and (ii) the eosinophil cationic protein (ECP) and eosinophil-derived neurotoxin (EDN) genes. In both cases likelihood ratio tests suggested the presence of sites evolving under divergent selective pressures. Results of the ε and γ globin analysis suggested that divergent selective pressures might be a consequence of a weakened relationship between fetal hemoglobin and 2,3-diphosphoglycerate. We suggest that empirical Bayesian identification of sites evolving under divergent selective pressures, combined with structural and functional information, can provide a valuable framework for identifying and studying mechanisms of genetic co-option. Limitations of the new method are discussed.
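The three ingredients named above (interpreting ω, a likelihood ratio test between nested models, and empirical Bayes assignment of sites to classes) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names are hypothetical, the log-likelihoods and per-class site likelihoods are assumed to come from a codon-model fit performed elsewhere, and the single degree of freedom used for the chi-square approximation is an assumption that depends on which models are being compared.

```python
import math


def classify_omega(omega, tol=1e-9):
    """Interpret a nonsynonymous-to-synonymous rate ratio (omega)."""
    if abs(omega - 1.0) <= tol:
        return "neutral evolution"
    return "purifying selection" if omega < 1.0 else "positive selection"


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null) for nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Assumption: the alternative model adds exactly one free parameter.
    """
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


def site_class_posteriors(priors, site_likelihoods):
    """Empirical Bayes posterior probability of each site class at one site.

    priors: estimated proportions of the site classes (should sum to 1).
    site_likelihoods: likelihood of the site's data under each class.
    """
    joint = [p * l for p, l in zip(priors, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


if __name__ == "__main__":
    # Hypothetical log-likelihoods for a null and an alternative model.
    stat = lrt_statistic(-1000.0, -995.0)
    print("LRT statistic:", stat, "p-value (df=1):", chi2_sf_df1(stat))

    # Hypothetical three-class model; the last class is the divergent one.
    post = site_class_posteriors([0.5, 0.3, 0.2], [0.001, 0.002, 0.01])
    print("Posterior for divergent class:", post[2])
```

In this sketch a site would be flagged as evolving under divergent selective pressure when the posterior probability of the divergent class exceeds a chosen cutoff; the cutoff itself is a user decision and is not specified here.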
Keywords: Maximum likelihood, Functional divergence, Codon model, ECP, EDN, Globins
Valuable discussions were contributed by Gabriela Aguileta. We thank Katherine A. Dunn and Gabriela Aguileta for constructive comments on the manuscript. This research was supported by a UK Biotechnology and Biological Sciences Research Council Grant.