Abstract
Proteins have played a fundamental role throughout life’s history on Earth. Despite this biological importance, the ancient origins, early functions, and evolution of proteins can seldom be studied directly because few of these attributes are preserved across geologic timescales. Ancestral sequence reconstruction (ASR) provides a method to infer ancestral amino acid sequences and determine the evolutionary predecessors of modern-day proteins using phylogenetic tools. Laboratory application of ASR allows ancient sequences to be deduced from genetic information available in extant organisms and then experimentally resurrected to elucidate ancestral characteristics. In this article, we provide a generalized, stepwise protocol that considers the major elements of a well-designed ASR study and details potential sources of reconstruction bias that can reduce the relevance of historical inferences. We underscore key stages in our approach so that it may be broadly utilized to reconstruct the evolutionary histories of proteins.
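As a rough illustration of the core idea, inferring an ancestral sequence from extant sequences and a tree, the sketch below reconstructs a root sequence for a toy alignment. It is not the protocol described in this chapter: it uses Fitch parsimony as a simple stand-in for the likelihood-based inference that ASR studies typically rely on, and the tree, taxon names, and sequences are hypothetical.

```python
# Toy ancestral state reconstruction by Fitch parsimony.
# Assumptions (not from the chapter): the tree, taxon names, and sequences
# are hypothetical, and parsimony stands in for the likelihood-based
# methods that ASR studies normally use.

def fitch_site(tree, states):
    """Reconstruct one alignment column on a rooted binary tree.

    tree: nested 2-tuples of leaf names, e.g. (("A", "B"), ("C", "D")).
    states: dict mapping leaf name -> residue at this column.
    Returns (root_residue, minimum_number_of_substitutions).
    """
    changes = 0

    def down(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed residue
            return {states[node]}
        left, right = (down(child) for child in node)
        shared = left & right
        if shared:                         # children agree on some residue
            return shared
        changes += 1                       # disagreement costs one change
        return left | right

    root_set = down(tree)
    # Ties at the root are broken alphabetically; this is an arbitrary choice.
    return min(root_set), changes


def reconstruct_root(tree, alignment):
    """Apply fitch_site to every column of an ungapped toy alignment."""
    length = len(next(iter(alignment.values())))
    residues, total = [], 0
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        residue, changes = fitch_site(tree, column)
        residues.append(residue)
        total += changes
    return "".join(residues), total


TREE = (("taxonA", "taxonB"), ("taxonC", "taxonD"))
ALIGNMENT = {
    "taxonA": "MKVL",
    "taxonB": "MKIL",
    "taxonC": "MRVL",
    "taxonD": "MRVF",
}

ancestor, substitutions = reconstruct_root(TREE, ALIGNMENT)
print(ancestor, substitutions)
```

Note that the second column is ambiguous at the root (K or R are equally parsimonious), which the sketch resolves arbitrarily. Probabilistic methods instead report a posterior probability for each candidate residue, which is one reason they are generally preferred for real reconstructions.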
Acknowledgments
This work was supported by the National Science Foundation Emerging Frontiers Program Award No. 1724090.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Garcia, A.K., Fer, E., Sephus, C., Kacar, B. (2022). An Integrated Method to Reconstruct Ancient Proteins. In: Luo, H. (eds) Environmental Microbial Evolution. Methods in Molecular Biology, vol 2569. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2691-7_13
DOI: https://doi.org/10.1007/978-1-0716-2691-7_13
Published:
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2690-0
Online ISBN: 978-1-0716-2691-7
eBook Packages: Springer Protocols