A Maximum Likelihood Method for Detecting Functional Divergence at Individual Codon Sites, with Application to Gene Family Evolution
Received: 27 June 2003; Accepted: 29 December 2003
Cite this article as: Bielawski, J.P. & Yang, Z. J Mol Evol (2004) 59: 121. doi:10.1007/s00239-004-2597-8

Abstract
The tailoring of existing genetic systems to new uses is called genetic co-option. Mechanisms of genetic co-option have been difficult to study because functionally important changes are hard to identify. One way to study genetic co-option in protein-coding genes is to identify those amino acid sites that have experienced changes in selective pressure following a genetic co-option event. In this paper we present a maximum likelihood method for measuring divergent selective pressures and identifying the amino acid sites affected by divergent selection. The method is based on a codon model of evolution and uses the nonsynonymous-to-synonymous rate ratio (ω) as a measure of selection on the protein, with ω = 1, ω < 1, and ω > 1 indicating neutral evolution, purifying selection, and positive selection, respectively. The model allows variation in ω among sites, with a fraction of sites evolving under divergent selective pressures. Divergent selection is indicated by different ω values between clades, such as between paralogous clades of a gene family. We applied the codon model to duplication followed by functional divergence of (i) the ε and γ globin genes and (ii) the eosinophil cationic protein (ECP) and eosinophil-derived neurotoxin (EDN) genes. In both cases likelihood ratio tests suggested the presence of sites evolving under divergent selective pressures. Results of the ε and γ globin analysis suggested that divergent selective pressures might be a consequence of a weakened relationship between fetal hemoglobin and 2,3-diphosphoglycerate. We suggest that empirical Bayesian identification of sites evolving under divergent selective pressures, combined with structural and functional information, can provide a valuable framework for identifying and studying mechanisms of genetic co-option. Limitations of the new method are discussed.
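The three statistical ingredients named in the abstract (interpreting ω, a likelihood ratio test between nested models, and empirical Bayes assignment of sites to classes) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the use of a chi-square approximation for the test statistic, the choice of degrees of freedom, and all numbers in the usage note are assumptions made purely for illustration; in practice the log-likelihoods and per-site class likelihoods would come from fitting the codon models to an alignment and tree.

```python
import math


def classify_omega(omega, tol=1e-9):
    """Interpret a dN/dS ratio: <1 purifying, =1 neutral, >1 positive."""
    if abs(omega - 1.0) <= tol:
        return "neutral evolution"
    return "purifying selection" if omega < 1.0 else "positive selection"


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Uses the closed-form finite series for even and odd df, so only the
    standard library is needed.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    term, total = 1.0, 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    tail = math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, approximate p-value) for nested models.

    df is the difference in free parameters; choosing it, and relying on
    the chi-square approximation, are assumptions of this sketch.
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_sf(stat, df)


def site_posteriors(class_props, site_likelihoods):
    """Empirical Bayes posterior of each site class for one codon site.

    class_props: estimated proportions of the site classes (the prior).
    site_likelihoods: likelihood of the site's data under each class.
    """
    weighted = [p * f for p, f in zip(class_props, site_likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]
```

As a usage illustration with made-up values, `likelihood_ratio_test(-1000.0, -995.0, df=2)` gives a statistic of 10.0, and `site_posteriors([0.5, 0.5], [0.2, 0.6])` assigns the second class a posterior probability of 0.75. A site would be reported as a candidate for divergent selection only when its posterior for the divergent class is high, and such a result is best read alongside structural and functional information, as the abstract suggests.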
Keywords: Maximum likelihood, Functional divergence, Codon model, ECP, EDN, Globins