
An Integrated Method to Reconstruct Ancient Proteins

Environmental Microbial Evolution

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2569))

Abstract

Proteins have played a fundamental role throughout life’s history on Earth. Despite their biological importance, the ancient origins, early functions, and evolution of proteins can seldom be studied directly because few of these attributes are preserved across geologic timescales. Ancestral sequence reconstruction (ASR) uses phylogenetic tools to infer ancestral amino acid sequences and thereby identify the evolutionary predecessors of modern-day proteins. In the laboratory, ASR allows ancient sequences to be deduced from genetic information available in extant organisms and then experimentally resurrected to elucidate ancestral characteristics. In this article, we provide a generalized, stepwise protocol that covers the major elements of a well-designed ASR study and details potential sources of reconstruction bias that can reduce the relevance of historical inferences. We underscore key stages in our approach so that it may be broadly used to reconstruct the evolutionary histories of proteins.
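
The core idea of inferring an ancestral sequence from extant ones can be illustrated with a toy sketch. The example below is not the protocol described in this chapter: it uses Fitch's maximum-parsimony rule on a fixed rooted binary tree, whereas ASR studies typically rely on likelihood or Bayesian methods. The tree shape, taxon names, sequences, and function names are all hypothetical and chosen only for illustration.

```python
# Toy ancestral reconstruction with Fitch parsimony (illustrative only).
# Assumptions: the tree is rooted and binary, the alignment has no gaps,
# and all sequences have the same length.

def fitch_root_set(tree, column):
    """Bottom-up Fitch pass for one alignment column.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    column : dict mapping leaf name -> residue at this alignment position
    Returns the set of most-parsimonious residues at the root of `tree`.
    """
    if isinstance(tree, str):
        return {column[tree]}
    left = fitch_root_set(tree[0], column)
    right = fitch_root_set(tree[1], column)
    common = left & right
    # Keep the intersection when the children agree on a residue, else the union.
    return common if common else left | right


def reconstruct_root(tree, alignment):
    """Infer the root sequence column by column; 'X' marks ambiguous sites."""
    length = len(next(iter(alignment.values())))
    root = []
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        states = fitch_root_set(tree, column)
        root.append(next(iter(states)) if len(states) == 1 else "X")
    return "".join(root)


# Hypothetical four-taxon tree and aligned extant sequences.
tree = (("A", "B"), ("C", "D"))
alignment = {"A": "MKVL", "B": "MKIL", "C": "MRVL", "D": "MKVL"}
print(reconstruct_root(tree, alignment))
```

Sites where parsimony cannot decide between residues are reported as `X` here; probabilistic methods instead assign each candidate residue a posterior probability, which is one reason they are generally preferred for real reconstructions.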



Acknowledgments

This work was supported by the National Science Foundation Emerging Frontiers Program Award No. 1724090.

Author information

Corresponding author

Correspondence to Betul Kacar.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Garcia, A.K., Fer, E., Sephus, C., Kacar, B. (2022). An Integrated Method to Reconstruct Ancient Proteins. In: Luo, H. (eds) Environmental Microbial Evolution. Methods in Molecular Biology, vol 2569. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2691-7_13

  • DOI: https://doi.org/10.1007/978-1-0716-2691-7_13

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2690-0

  • Online ISBN: 978-1-0716-2691-7

  • eBook Packages: Springer Protocols
