Protocols for the Molecular Evolutionary Analysis of Membrane Protein Gene Duplicates

Part of the Methods in Molecular Biology book series (MIMB, volume 1851)


Gene duplication is an important process in the evolution of gene content in eukaryotic genomes. Understanding when gene duplicates contribute new molecular functions to genomes through molecular adaptation is one important goal in comparative genomics. In large gene families, however, characterizing adaptation and neofunctionalization across species is challenging, as models have traditionally quantified the timing of duplications without considering underlying gene trees. This protocol combines multiple approaches to detect adaptation in protein duplicates at a phylogenetic scale. We include a description of models for gene tree-species tree reconciliation that enable different types of inference, as well as a practical guide to their use. Although simulation-based approaches successfully detect shifts in the rate of duplication/retention, the conflation between the duplication and retention processes, the distinct trajectories of duplicates under non-, sub-, and neofunctionalization, as well as dosage effects offer hitherto unexplored analytical avenues. We introduce mathematical descriptions of these probabilities and offer a road map to computational implementation whose starting point is parsimony reconciliation. Sequence evolution information based on the ratio of nonsynonymous to synonymous nucleotide substitution rates (dN/dS) can be combined with duplicate survival probabilities to better predict the emergence of new molecular functions in retained duplicates. Together, these methods enable characterization of potentially adaptive candidate duplicates whose neofunctionalization may contribute to phenotypic divergence across species.
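
As a rough illustration of the parsimony reconciliation named above as the starting point, the following minimal Python sketch maps each gene-tree node to the last common ancestor (LCA) of the species below it and counts a duplication wherever a node maps to the same species-tree node as one of its children. It is not the chapter's implementation: the species tree, the gene names, and the function names (`species_ancestors`, `lca`, `reconcile`) are all hypothetical and chosen only for illustration, and the sketch assumes a rooted, binary gene tree.

```python
def species_ancestors(parent, node):
    """Return the path from a species-tree node up to the root (inclusive)."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def lca(parent, a, b):
    """Last common ancestor of two species-tree nodes, given a child -> parent map."""
    ancestors_a = set(species_ancestors(parent, a))
    for node in species_ancestors(parent, b):
        if node in ancestors_a:
            return node
    raise ValueError("nodes are not in the same tree")


def reconcile(gene_tree, gene_to_species, parent):
    """Return (species-tree node the gene-tree root maps to, inferred duplications).

    gene_tree is a nested 2-tuple; leaves are gene names (strings).
    """
    if isinstance(gene_tree, str):
        return gene_to_species[gene_tree], 0
    left, right = gene_tree
    m_left, d_left = reconcile(left, gene_to_species, parent)
    m_right, d_right = reconcile(right, gene_to_species, parent)
    m_here = lca(parent, m_left, m_right)
    # LCA rule: a node is a duplication if it maps to the same
    # species-tree node as at least one of its children.
    is_duplication = m_here in (m_left, m_right)
    return m_here, d_left + d_right + int(is_duplication)


# Hypothetical species tree ((A,B)AB,C)ABC, stored as child -> parent.
parent = {"A": "AB", "B": "AB", "AB": "ABC", "C": "ABC"}
# Hypothetical gene family with two copies in species A.
gene_to_species = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
gene_tree = ((("a1", "b1"), "a2"), "c1")

root_map, n_dups = reconcile(gene_tree, gene_to_species, parent)
print(root_map, n_dups)
```

In this toy example the node joining (a1, b1) with a2 maps to the same ancestral species as its left child, so it is scored as a duplication. A fuller parsimony reconciliation would also count the losses implied by each mapping, which this sketch omits.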

Key words

Gene duplication; Gene tree; Birth-death models; Molecular evolution; dN/dS



This research was supported in part by DEB-1442142 to L.M.D., DEB-1701414 to L.M.D., D.A.L., and L.R.Y., and DBI-1222940 to D.A.L. and L.L.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Geology & Geophysics, Yale University, New Haven, USA
  2. Department of Ecology and Evolution, Stony Brook University, Stony Brook, USA
  3. Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, USA
  4. Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, USA
