Fascinating. Exciting. The reconstruction of the history of life on Earth represents one of the most intriguing issues in science. Even more intriguing is the attempt to understand the (very) first molecular steps leading to the primordial cells and their early evolution. Extant cells are quite complex entities made up of a myriad of different molecules that nevertheless have to act and interact in a concerted manner to ensure the survival and reproduction of cells (and multicellular organisms). At each moment of a cell's life, billions of molecules are transformed into different ones through reactions that are accelerated (catalyzed) by enzymes, most of which are proteins. Even though these proteins may encounter a plethora of different molecules during their chaotic trip within the cell, they bind only to specific molecules, their substrates, and transform them into different molecules called products (of the reaction). Although this is not true for all enzymes, in general each enzyme interacts with one substrate and gives rise to a specific product. Hence, at each moment of a cell's life billions of substrate molecules are transformed into billions of product molecules by billions of enzyme molecules. These reactions are extremely fast, and we can imagine the cell as a viscous environment where they occur in an ordered (and only apparently chaotic) fashion. The whole body of these reactions is called metabolism, a circular “entity” in the sense that molecules can be broken down (catabolism) to obtain energy and the “bricks” required to construct other molecules (anabolism) (Fig. 1). It is thus clear that an “equilibrium” between catabolic and anabolic reactions exists within a cell. Thus, the metabolism of extant cells is quite complex, but we can also consider it extremely ordered.
Figure 2 shows an example of a catabolic system (the degradation of glucose during glycolysis) and of an anabolic one (the biosynthesis of the amino acid histidine). As Fig. 2 shows, both glycolysis and histidine biosynthesis proceed through a sort of “cascade” of reactions in which the breakdown of glucose and the construction of histidine require the sequential action of different enzymes, each of which catalyzes a single step of the cascade. The set of reactions leading from the initial substrate to the final product is called a metabolic pathway. In most cases, each step of a metabolic pathway is catalyzed by a single enzyme, which (in a third of the cases) is a single protein encoded by a single gene (Holliday et al. 2011).
If we assume that the extant and very complex cells originated from much simpler ancestral cells, it is also plausible that the latter had a simpler metabolism than the extant one. This, in turn, implies that they possessed much simpler genomes, very likely made up of a few hundred genes. If this is so, the question is: why and how did primordial cells assemble and evolve their metabolic pathways? The question can be rephrased as follows: why and how did the early cells increase the number of their genes and the complexity of their genomes? The answers that we can try to give to these questions clearly depend on the conditions of the primitive Earth and on what primordial living beings looked like. However, this is one of the foggiest issues; in fact, although considerable efforts have been made to understand the emergence of the first living beings, we still do not know when and how life originated (Peretò et al. 1998). Still, it is commonly assumed that early organisms arose in and inhabited aquatic environments (oceans, rivers, ponds, etc.) rich in organic compounds formed spontaneously in the prebiotic world. This heterotrophic origin of life is frequently referred to as the Oparin–Haldane theory (Oparin 1924; Lazcano and Miller 1996). If this idea is correct, life evolved from a primordial soup containing different organic molecules (many of which are used by extant life forms). This soup of nutrient compounds was available to the early heterotrophic organisms, so they needed to carry out only a minimum of biosynthesis. Experimental support for this proposal was obtained in 1953, when Miller (1953) and Urey showed that amino acids and other organic molecules are formed under atmospheric conditions thought to be representative of those on the early Earth.
The first living systems probably stemmed directly from the primordial soup and evolved relatively fast up to a common ancestor, usually referred to as the Last Universal Common Ancestor (LUCA), an entity representing the starting point of the divergence of all the extant life forms on Earth (Fig. 3). If we assume that life arose in a prebiotic soup containing most, if not all, of the necessary small molecules, then a large availability of nutrients on the primitive Earth can be surmised, supplying both the building blocks for growth and the energy for a large number of ancestral organisms. We can imagine the existence of an “early floating living world” made up of primordial cells that might have looked like “soap bubbles” embedding one or more informational molecules and performing a limited number of metabolic reactions. These bubbles were able to divide, to interact with each other, and to fuse and share their genomes and metabolic abilities, giving rise to progressively more complex living beings. If this scenario is correct, that is, if primordial organisms were heterotrophic and had no need to develop new and improved metabolic abilities because most of the required nutrients were available, we can go back to the two questions raised above: why and how did primordial cells expand their metabolic abilities and genomes?
The answer to the first question is rather intuitive. Indeed, the increasing number of early cells thriving on the primordial soup would have led to the depletion of essential nutrients, imposing a progressively stronger selective pressure that, in turn, favored (in a Darwinian sense) those microorganisms that had become capable of synthesizing the molecules whose concentration was decreasing in the primordial soup. Hence, the origin and evolution of basic metabolic pathways represented a crucial step in molecular and cellular evolution, since it rendered the primordial cells less dependent on exogenous sources of nutrients (Fig. 4).
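The selective argument above can be illustrated with a deliberately crude numerical sketch. It is a toy model, not a reconstruction of any real primordial population: the two cell types, the function names (`simulate`, `synthesizer_fraction`), and every parameter value are hypothetical and chosen only to make the qualitative effect visible. Heterotrophs grow in proportion to the nutrient still left in the soup, whereas synthesizers make the nutrient themselves at a fixed growth cost.

```python
def simulate(steps=300, nutrient=1000.0, heterotrophs=10.0, synthesizers=0.1,
             uptake=0.01, max_growth=0.10, synthesis_cost=0.03, death=0.02):
    """Toy model of nutrient depletion; all parameter values are hypothetical.

    Returns a list of (nutrient, heterotrophs, synthesizers) tuples,
    one per time step, including the initial state.
    """
    initial_nutrient = nutrient
    history = [(nutrient, heterotrophs, synthesizers)]
    for _ in range(steps):
        # Fraction of the original soup that is still available (0 to 1).
        availability = nutrient / initial_nutrient
        # Only heterotrophs draw on the soup; the pool cannot go negative.
        consumed = min(nutrient, uptake * heterotrophs * availability)
        nutrient -= consumed
        # Heterotrophs grow only as fast as the remaining soup allows.
        heterotrophs *= 1.0 + max_growth * availability - death
        # Synthesizers are independent of the soup but pay a fixed cost.
        synthesizers *= 1.0 + (max_growth - synthesis_cost) - death
        history.append((nutrient, heterotrophs, synthesizers))
    return history


def synthesizer_fraction(state):
    """Share of the total population able to synthesize the nutrient."""
    _, heterotrophs, synthesizers = state
    return synthesizers / (heterotrophs + synthesizers)


if __name__ == "__main__":
    history = simulate()
    for step in (0, 100, 200, 300):
        nutrient, _, _ = history[step]
        fraction = synthesizer_fraction(history[step])
        print(f"step {step:3d}: nutrient={nutrient:8.1f}  "
              f"synthesizer fraction={fraction:.3f}")
```

While the soup is rich, the synthesizers are at a disadvantage because they pay for a capability they do not need; once the heterotrophs have depleted the nutrient, the balance reverses and the synthesizers take over the population. The model ignores everything else (mutation, spatial structure, nutrient replenishment, limits on synthesizer growth), so only the direction of the trend should be read from it.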
But how did the expansion of genomes occur? The following section focuses on the molecular mechanisms that guided this transition, i.e., the expansion and refinement of ancestral metabolic routes that led to the structure of the extant metabolic pathways.