Journal of Molecular Evolution, Volume 74, Issue 1, pp 1–34

The Phylogenomic Roots of Modern Biochemistry: Origins of Proteins, Cofactors and Protein Biosynthesis

Authors

  • Gustavo Caetano-Anollés
    • Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois
  • Kyung Mo Kim
    • Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology (KRIBB)
  • Derek Caetano-Anollés
    • Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois
Article

DOI: 10.1007/s00239-011-9480-1

Cite this article as:
Caetano-Anollés, G., Kim, K.M. & Caetano-Anollés, D. J Mol Evol (2012) 74: 1. doi:10.1007/s00239-011-9480-1

Abstract

The complexity of modern biochemistry developed gradually on early Earth as new molecules and structures populated the emerging cellular systems. Here, we generate a historical account of the gradual discovery of primordial proteins, cofactors, and molecular functions using phylogenomic information in the sequence of 420 genomes. We focus on structural and functional annotations of the 54 most ancient protein domains. We show how primordial functions are linked to folded structures and how their interaction with cofactors expanded the functional repertoire. We also reveal that protocell membranes played a crucial role in early protein evolution and show that translation started with RNA and thioester cofactor-mediated aminoacylation. Our findings allow elaboration of an evolutionary model of early biochemistry that is firmly grounded in phylogenomic information and biochemical, biophysical, and structural knowledge. The model describes how primordial α-helical bundles stabilized membranes, how these were decorated by layered arrangements of β-sheets and α-helices, and how these arrangements became globular. Ancient forms of aminoacyl-tRNA synthetase (aaRS) catalytic domains and ancient non-ribosomal protein synthetase (NRPS) modules gave rise to primordial protein synthesis and the ability to generate a code for specificity in their active sites. These structures diversified, producing cofactor-binding molecular switches and barrel structures. Accretion of domains and molecules gave rise to modern aaRSs, NRPS, and ribosomal ensembles, first organized around novel emerging cofactors (tRNA and carrier proteins) and then around more complex cofactor structures (rRNA). The model explains how the generation of protein structures acted as a scaffold for nucleic acids and resulted in the crystallization of modern translation.

Keywords

Aminoacyl-tRNA synthetases; Non-ribosomal protein synthesis; Origin of life; Phylogenetic analysis; Protein domain structure; Ribonucleoprotein world

Abbreviations

aaRS: Aminoacyl-tRNA synthetase
CoA: Coenzyme A
F: Fold
FSF: Fold superfamily
FF: Fold family
nd: Node distance
PCP: Peptidyl carrier protein
r-protein: Ribosomal protein
SCOP: Structural classification of proteins

Supplementary material

Supplementary material 1: 239_2011_9480_MOESM1_ESM.doc (DOC, 1.21 MB)

Copyright information

© Springer Science+Business Media, LLC 2011