Journal of Molecular Evolution, Volume 72, Issue 1, pp 14-33

Proteome Evolution and the Metabolic Origins of Translation and Cellular Life

  • Derek Caetano-Anollés (Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois; School of Molecular and Cellular Biology, University of Illinois)
  • Kyung Mo Kim (Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois)
  • Jay E. Mittenthal (Department of Cell and Developmental Biology, University of Illinois)
  • Gustavo Caetano-Anollés (Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois), corresponding author


The origin of life has puzzled molecular scientists for over half a century. Yet fundamental questions remain unanswered, including which came first, the metabolic machinery or the encoding nucleic acids. In this study we take a protein-centric view and explore the ancestral origins of proteins. Protein domain structures in proteomes are highly conserved and embody molecular functions and interactions that are needed for cellular and organismal processes. Here we use domain structure to study the evolution of molecular function in the protein world. Timelines describing the age and function of protein domains at fold, fold superfamily, and fold family levels of structural complexity were derived from a structural phylogenomic census in hundreds of fully sequenced genomes. These timelines unfold congruent hourglass patterns in rates of appearance of domain structures and functions, functional diversity, and hierarchical complexity, and reveal a gradual build-up of protein repertoires associated with metabolism, translation, and DNA, in that order. The most ancient domain architectures were hydrolase enzymes, and the first translation domains had catalytic functions for aminoacylation and the molecular switch-driven transport of RNA. Remarkably, the most ancient domains had metabolic roles, did not interact with RNA, and preceded the gradual build-up of translation. In fact, the first translation domains also had a metabolic origin and were only later followed by specialized translation machinery. Our results explain how the generation of structure in the protein world and the concurrent crystallization of translation and diversified cellular life created further opportunities for proteomic diversification.


Keywords: Origin of life, Phylogenetic analysis, Protein domain structure, Ribonucleoprotein world, RNA world