Selection without replicators: the origin of genes, and the replicator/interactor distinction in etiobiology
- Wilkins, J.S., Stanyon, C. & Musgrave, I. Biol Philos (2012) 27: 215. doi:10.1007/s10539-011-9298-7
Genes are thought to have evolved from long-lived and multiply-interactive molecules in the early stages of the origins of life. However, at that stage there were no replicators, and the distinction between interactors and replicators did not yet apply. Nevertheless, the process of evolution that proceeded from initial autocatalytic hypercycles to full organisms was a Darwinian process of selection of favourable variants. We distinguish therefore between Neo-Darwinian evolution and the related Weismannian and Central Dogma divisions, on the one hand, and the more generic category of Darwinian evolution on the other. We argue that Hull’s and Dawkins’ replicator/interactor distinction of entities is a sufficient, but not necessary, condition for Darwinian evolution to take place. We conceive the origin of genes as a separation between different types of molecules in a thermodynamic state space, and employ a notion of reproducers.
Keywords: Gene · Origins of life · Etiobiology · Hypercycle · Autocatalysis · Natural selection · Neo-Darwinism · Replicator · Interactor · Reproducer · Weismann · Dawkins · Hull
Explaining the origins of the genome is a core problem for origins of life research. The modern “self-replicating” nucleotide sequence depends for its existence on a range of already-complex molecular interactions. Determining the origin of these systems is difficult.
The notion of the “information” of a genome is also problematic: what exactly is this information and how is it created? Some have argued that selection creates information (Eigen and Schuster 1979; Eigen and Winkler 1981), others that information arises from the self-organising properties of complex networks before selection takes effect (Kauffman 1993, 1995, 2001). Still others have drawn a distinction between replicators—long-lived, fecund and high-fidelity structures—and interactors—economic entities that are constituted by replicators and which support their existence and replication (Dawkins 1976; Hull 1981, 1988a, 1989; developmental systems theorists have criticized this view: Griffiths and Gray 1994, 1997; Griffiths and Neumann-Held 1999). This distinction is held to describe necessary conditions of Darwinian evolution by the neo-Darwinian consensus.
It is our contention that Darwinian evolution can occur without a genotype/phenotype or replicator/interactor division of labor, and that “information” in the context of the earliest Darwinian processes is the difference in the stability and interconnectedness of molecules involved in autocatalytic hypercycles and other reaction webs. The question of whether the “original” biotic or protobiotic regime was an RNA-world or a metabolism-world is therefore irrelevant, for the distinction between metabolic and genetic molecules did not then apply: nucleotides could be seen as metabolic “actors” while polyaminoacids and other molecular structures such as metalloproteins could be seen as “informational” structures on a continuous spectrum.
This means that selection can occur on systems that lack a distinct replicator, which means that the Hull-Dawkins Distinction and related axiomatisations of Darwinian algorithms, while sufficient conditions for Darwinian evolution, are not necessary for it. It may be an error therefore to argue against some forms of group selection on the grounds that there are no replicators for the group properties selected (see discussion in Brandon 1988, 1990). If this is correct, then the Weismann barrier is not necessary for Darwinian evolution, and recent challenges to neo-Darwinism on the basis of claimed nonuniversality of genome sequestration do not undercut the basic Darwinian process. We consider that genetic information is just a matter of relative physical stability and persistence of a class of molecules, and their involvement in a larger number of reaction cycles.
There is no convenient term to denote the research and philosophical issues surrounding the origins of life. The usual index term is “origins of life”, but this carries with it connotations covering all phases from the initial abiotic production of organic compounds such as amino acids and sugars, the origins of chirality (handedness of proteins and sugars), the origins of self-replicating molecules (usually nucleotides), the origins of cellularity, of enzyme functions, and so forth. In other words, it covers everything from the end of the bombardment of the earth’s surface to the first prokaryotes, and therefore has under its rubric a period of about 300 million years and a range of extremely diverse conditions.
A glossary of chemical terms used in this paper
Autocatalysis: Where the reaction product is itself the catalyst for the reaction that produced it.
Catalysis: A chemical reaction where a molecule activates the reaction but is not combined or consumed by it.
Chirality: A property of any molecule that is non-superimposable on its mirror image. Chiral isomers can be distinguished by the direction they rotate polarised light (see levo and dextro).
Coacervate: A small spherical droplet of organic molecules which is held together by hydrophobic forces.
Covalent bond: A chemical bond formed between atoms by the sharing of one or more electrons.
Dextro: Of molecules—rotates the plane of polarized light clockwise (dextrorotatory).
Dimerisation: formation of a structure from two monomers.
Hydrophilicity: the propensity for substances to bind water molecules.
Hydrophobicity: the propensity for substances to repel water molecules (e.g., lipids are hydrophobic).
Isomer: a molecule which has the same sequence or chemical structure as another, but which has its own shape.
Levo: Of molecules—rotates the plane of polarized light anti-clockwise (levorotatory).
Lipid: a fatty or waxy organic compound. Lipids are a main component of cell membranes.
Metalloprotein: A protein that has a metal ion bound to it which is required for the protein’s function; e.g., zinc fingers.
Monomer: A single unit that can be built up into a larger structure. Monomers in the context of this article can be amino acids, RNA bases, sugars or even polypeptides.
Oligomer: a group of monomers bound covalently together; the monomers do not have to be identical.
Peptide/polypeptide: a chain of covalently bonded amino acids of arbitrary length. The amino acids do not need to be identical.
Polyaminoacid: a chain of covalently bonded amino acids.
Polymer: a compound formed of a series of repeating units (monomers). The monomers do not have to be identical.
Polymerase: An enzyme which produces a polymer from monomers, usually applied to enzymes that make polymers of RNA or DNA.
Stoichiometry: the ratios between two or more substances (and their products) undergoing a chemical reaction; two monomers reacting to give one dimer is a 2:1 stoichiometry.
Weak forces: Non-covalent forces such as ionic interactions, hydrophilic and hydrophobic forces.
Richard Dawkins, the doyen of popular neo-Darwinist thought, wrote that evolution and life began when a molecule gained the ability to replicate itself (Dawkins 1989: 15). As the originator of the concept of a replicator—a long-lived, fecund and high fidelity entity—it is to be expected that he would envisage life depending on the existence of such molecules, and it does seem obvious that replicators are necessary. After all, all known organisms have replicators—genes—upon which they depend for continuity. One of the leading hypotheses about the origins of biotic processes is known as the RNA World hypothesis—nucleotides came first, before metabolic processes (Poole et al. 1998; Szathmáry and Smith 1997). But there are some difficulties with this view (Yockey 1992, 1995). For a start, oligonucleotides are, on their own, relatively inert molecules and require a series of mediating reactions to replicate. In modern organisms, the machinery of polymerases and enzymes is what replicates genes, using existing genes as templates on which nucleotides are assembled. Without that complex machinery, DNA and RNA are relatively inactive.
Moreover, genes are commonly thought to be “codes” for proteins. The so-called Standard Genetic Code (SGC) is a mapping of triplets of nucleotides to the amino acids they express. However, the SGC is not a universal code, despite the widely-held view that it is (Schultz and Yarus 1996). At least seventeen alternative nuclear, mitochondrial and bacterial codes exist in modern taxa (Stegmann 2004; Elzanowski et al. 2006), allowing the possibility that there was once a period in which the codical properties of nucleotides were diverse—perhaps as diverse as those other early biotic molecules. Or, it may have been that the codical role of nucleotidal molecules in living systems evolved much later than the molecules themselves (Abkevich et al. 1996; Alberti 1997; Di Giulio 1997b; Ertem and Ferris 1996). Alternative molecules exhibiting some of the properties of standard nucleotides have been discovered (Nelson et al. 2000; Nielsen 1993; Schoning et al. 2000). So the question naturally arises—how and why did these particular molecules acquire the properties that define replicators? By what process did they evolve? Was it through a Darwinian process, or, as some would have it, a “proto-Darwinian” or “pseudo-Darwinian” process that only approximated a true Darwinian evolution through variation and selection? (Nowak and Ohtsuki 2008).
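The non-universality of the SGC can be made concrete with a small sketch. The codon assignments below are standard biochemistry (NCBI translation tables 1 and 2), not claims drawn from the sources cited above; we show three codons whose readings differ between the SGC and the vertebrate mitochondrial code:

```python
# A few codons read differently by the "Standard" Genetic Code (SGC)
# and the vertebrate mitochondrial code. Assignments follow the NCBI
# translation tables; this is illustrative, not exhaustive.
STANDARD = {"UGA": "Stop", "AGA": "Arg", "AUA": "Ile"}
VERT_MITO = {"UGA": "Trp", "AGA": "Stop", "AUA": "Met"}

for codon in STANDARD:
    print(f"{codon}: SGC reads {STANDARD[codon]}, "
          f"vertebrate mitochondria read {VERT_MITO[codon]}")
```

A codon such as UGA is thus a termination signal in one code and tryptophan in another, which is exactly the kind of diversity of codical properties entertained in the text.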
These questions acquire some cogency from the recent formalisations (and subsequent consensus) of any and all Darwinian processes as the outcome of the interplay between the random variation of replicators and selection over the phenotypic ensembles they “code for” or construct (Dawkins 1976; Dennett 1995; Eldredge 1989; Hull 1988a, 1989; Hull and Wilkins 2005). It is held that the necessary and sufficient conditions for Darwinian evolution are that there exist replicators with high but not exact fidelity, and that they build interactors (or vehicles) that protect and work on behalf of their replicators in the economic realms of metabolism, ecology, and ultimately, evolution.
So, if life begins at replication, and replicators are codes, and the replicator/interactor distinction is a necessary condition for evolution, there is a conundrum. Our best hypotheses posit that early nucleotides were not high-fidelity replicators, yet the transition from non-living biomolecular reactions to living systems seems to have occurred in their absence. We need to ask when genes, functionally described, became genes, and from this to ask if indeed replicators are necessary for Darwinian evolution to take place. The view that it happened by accident strikes us as a “scientific miracle”, which is to be avoided.
Our proposal is very general: we are not outlining a full chemical schema based on the current knowledge of early biochemistry (cf. Abkevich et al. 1996; Ertem and Ferris 1996; Levy and Miller 1998; Muller 1995, 1996; Nelson et al. 2000). We believe that in the early stages of biotic processes, nothing had, to any significant degree, the properties now ascribed to genes, nor were there any replicator-like entities (cf. Griesemer 2005). More exactly, we think that several components might have been roughly equivalent in their properties but that selection, without replicators, on existing physical property differences led to the evolution of genes themselves. We are led to this conclusion by consideration of the general structure of the process of chemical reactions at the biotic transition—the hypercycle of autocatalytic molecules proposed by Eigen and his collaborators (Eigen 1993; Eigen et al. 1991; Eigen and Schuster 1979; Eigen and Winkler 1981; Eigen and Winkler-Oswatitsch 1992; Szathmáry 1997; Szathmáry and Demeter 1987; Szathmáry and Smith 1997)—together with reflections upon the conceptual role in modern neo-Darwinism of the separation of genotype from phenotype, germ from soma, gene from gene product.
In short, we challenge the fundamental necessity of the replicator/interactor distinction itself. We do not wish to deny the proper application of that distinction, for it has clear validity in most modern biotic systems. However, discussion of these borderline cases, both at biogenesis and in exotic cases in modern biology, will help us to clarify the nature of evolution and the origins of genicity.
The genotype–phenotype distinction, its ancestors, heirs and successors
A large part of the problem some have with the origins of genetic information molecules lies in the Metaphor of the Code (Oyama 1985). The idea that genes encode proteins and eventually the cellular and organismic structures and processes collectively called the phenotype is generally a useful way to conceive of the situation. Genes are “information” “about” properties of organisms and their development. The replication of genes is “transmission” of a “message”. Once this semantic metaphor is taken literally, though, questions of informational content come into play (Yockey 1992; Smith 2000), along with notions like Kolmogorov-Chaitin algorithmic complexity measures—genes become Turing programs stored in molecules that “specify” and “control” the phenotype as the output of algorithms (Dennett 1995).
A difficulty here is that undirected modifications to algorithms lead, in all probability, to degraded or dysfunctional algorithms, and it becomes very hard to conceive how more “information” or “order” can evolve from less “information” and less “order”. This belief has led many to claim that evolution is directed and not Darwinian. For example, Gregory Chaitin, though a Darwinian, regrets that he cannot formalise Darwinian evolution using his algorithmic information theory, or AIT (Chaitin 1999). Yockey, with a better grounding in biochemistry, thinks that he can accommodate genetic novelty in Chaitin’s AIT terms, but rejects the idea that genes could, through random assortment, reach the critical level of complexity from which replication can kick off Darwinian evolution.
Such conundrums and objections evaporate once the notion that genes are information is seen to be a metaphor, or at best a useful way of representing our knowledge of genes so that the analytic techniques of information theory and statistical algorithms can be used to model and understand that knowledge. What if genes were conceived to be just molecules, with only physical (that is, thermodynamic) properties like any other molecules involved in the biotic process (Waters 2000)? What if we deny genes privileged ontological standing? To be sure, as polynucleotides they have different values on several distinct measures from, say, a peptide, and that will be the core of our argument, but in the physical world of biology, there are only physical properties. Notions like “information”, “code”, and “complexity” are properties of data, propositions, mathematical formulae and other conceptual entities. To a first approximation, they are not physical properties. They are dimensionless properties.
Weismann’s division into what we now call the genotype and phenotype (following Johannsen) became fixed into the Modern Synthesis of Darwinism and Mendelism as a basic schema for evolution itself. Undirected variation of genotypes (common genetic patterns arrayed over lineages of organisms) was subjected, through its phenotypic expressions, to selection (and later, when the views of Sewall Wright and Waddington had been digested, to genetic drift and genetic assimilation as well). It became natural to say, after 1965 when the standard genetic code had been worked out, that the variation that selection (and drift, etc.) operated on was genetic variation alone, and that information could only be created by natural selection on “errors” in the replication of gamete cell lineages.
In 1974, Richard Lewontin published his enormously influential text on genetics and evolution (Lewontin 1974), one of a number of seminal works on the conceptual foundation of evolution at the time (others being Dawkins 1976; Smith 1975; Ghiselin 1974; Williams 1966). In a brief introductory chapter, Lewontin applied a concept drawn from statistical thermodynamics—the state space—to evolution. He divided biological states canonically into genetic states and phenotypic states, and described the evolutionary relations as transformation laws from genes to zygote, zygote to mature adult, adult to gamete and gamete to genes. Evolution proceeded as variations in the transforms appeared.
The Hull-Dawkins distinction
In his classic essay on selection, George Williams introduced the “cybernetic abstraction” conception of the gene (Williams 1966) and discussed the information content of genes, meaning the degree to which genes specified the somatic structure and capacities. For him, genes were any hereditary information that had a selective bias in its favour that exceeded the noise of mutation. A decade later, Dawkins (1976) introduced the notion of the replicator—components of life (or indeed any suitable process) that bear three properties of a general kind. That is, they are copied with fidelity, they are long-lived, and they are fecund, increasing their numbers of copies in a population. Genes are replicators, says Dawkins. They are copied with high fidelity, they are stable molecules, and they create many more copies of themselves, in bodies and in progeny. They do one other thing, he said. They build themselves “vehicles”—cells and organisms—to achieve these properties. In Dawkins’ hands, the vehicle is a rather passive entity.
Hull (1981, 1988a, b, 1989) extended this distinction as part of the metaphysical furniture for his “generalised model of evolution”, which, like Dawkins and Williams (1992), he expects to cover evolution in substrates other than the biological, that is, in culture and concepts. Hull replaced passive vehicles with more active interactors, and this became known as the Hull-Dawkins Distinction (HDD) (Eldredge 1989; Williams 1992). The general trend of the history of the HDD has been towards a “substrate-neutral” conception of Darwinian evolution; one that can apply equally well to selection on genes, populations, kin-groups, organisms, and perhaps even taxa, as well as giving rise to a cultural Darwinism (cf. Cziko 1995). Hull is not a genic reductionist, although plausibly both Dawkins and Williams are, and accepts with equanimity the possibility of group selection, because in the relevant sense groups might be replicators. Few distinctions of what is essentially philosophy of biology have received the general acceptance amongst working biologists of the HDD. Even Eldredge, notionally on the opposite side of the debate from Dawkins (Eldredge 1995), accepts it, although he prefers Hull’s formulation to Dawkins’.
Hull revised the vehicle concept to emphasise the ecological, or rather the economic, aspects of the interactor. Replicators generate interactors, which are, in Hull’s words, entities that interact as cohesive wholes with their environment, so that the interaction causes replication to be differential (Hull 1988b: chapter 11). Interactive structures are the phenotypes in biology, and they have a dynamic relationship with the resources of their environments. They acquire, dispose of, and sequester the materials and energy sources that are essential to the maintenance of the equilibrium of the interactor structures themselves. In the longer term these resources are critical to the creation and development of interactors like themselves. Re-generation and proliferation of interactors is called reproduction (Griesemer 1999, 2000, 2005), which, unlike replication, has only a broad and facultative degree of structural fidelity. Entities that reproduce are called reproducers by Griesemer. Children resemble one of their parents in many respects (e.g., overall body plan), but do not have the degree of close similarity to that parent that any of their discrete genetic sequences has to a genetic sequence in either their mother or father.
To further define the evolutionary process, Hull specified various terms. Selection he defined as the process resulting from the correlation between replicative fidelity and interactive success. Extinction and proliferation cause differential perpetuation of the relevant replicators. The degree of correlation between the fidelity of replication and the success of the interactive profiles that are reproduced is the selection coefficient in favour of, or against, those replicators, averaged over periods of time much longer than a single reproductive generation. A sequence of these replicator/interactor “bundles” forms, in Hull’s metaphysics, lineages. Lineages that diverge form phylogenetic trees, while lineages that recombine regularly to form networks are species. This process of replication and divergence is, of course, the process of evolution. This completes the sketch of what Dawkins has called Universal Darwinism (Dawkins 1983, 1986; cf. Hull 1988b; Kitcher 1993; Plotkin 1994; Rosenberg 1994; Williams 1970).
In the ordinary course of biology, which means to many biologists metazoan biology and includes the subdisciplines of cytology, immunology and ecology, the HDD is relatively accurate, clear and easy to specify in practice. We can often now identify with great precision the genetic replicators and the interactive phenotype. However, as Hull himself has shown (1988b, 1992) there are exotic cases that are not so easy: clonal plants where the interactors (ramets) are parts of the replicators (genets); vegetative propagation from cuttings of the somatic lineage of the organism; “naked replicators” where the genetic material is replicated differentially on the basis of its own physical properties in competition with other nucleotides; colonial organisms that differentiate genetically-heterogeneous cell lineages to form gametes; and so forth. Much argument has been expended on how to save the Weismann germ/soma distinction in these cases. In most cases a consensus is reached, but in others, the problem is quietly shelved. The HDD as a general notion has had a very important role nevertheless. It looks set to continue to do duty so long as Darwinian models of evolution are used. But at some stage, the HDD must fail to apply. In what follows, we shall attempt to give the reason for this.
The nonbiological world, excepting some virtual realms run on computers, has nothing like a replicator, and so by extension nothing like an interactor, either. Replication is thus confined to biological systems and, arguably, complex Turing machines. Consider that, at some stage in the etiology of life, as it arose from the “non-living” chemical world, replicators and interactors must have arisen, and since they are conceptual preconditions for what Kitcher (1993: 43f) calls neo-darwinian selection, it is often thought that they evolved non-Darwinianly, either through incipient or proto-Darwinian evolution, or some other process such as self-organisation (Smith and Szathmáry 1995).
So at some stage, as we track back the history of life, there are neither replicators nor interactors, but something like what Griesemer has called reproducers—entities that have low-fidelity replication and low correlation between replication of structure and interactive success (Griesemer 2005, 1999; Godfrey-Smith 2009). The lack of strong correlation with success might be due to the fact that the generated structures can attain a large number of conformations, each with varying degrees of efficiency, or because equivalent degrees of efficiency are achieved in numerous ways. At some point there is just no conceptual work done by the terms “replication” or “(ecological) interaction”. All that existed was (chemical) interaction between molecules in reactive and catalytic processes. Even the term “reproducers” in the merely autotrophic period will fail to significantly denote any class of objects. What we need to consider is the origin of the replicative category, based on speculative scenarios about the origins of life in the autocatalytic period. We also need to determine whether this means that the Darwinian explanatory schema is not applicable in this initial period.
The origin of reproducers
Origins of life scenarios are necessarily somewhat speculative. Much of the information about the initial conditions in which reproduction and metabolism arose is lost, for subsequent life (and geological subduction) has destroyed all the source material and conditions. Moreover, several candidate reactions have been proposed. It may be that there is more than one feasible pathway to the first reproducing process. More than one might have been needed, serially or in parallel. There is an extensive literature on the topic (reviewed in Di Giulio 1997a; Szathmáry 1997).
One feature common to nearly all models is the autocatalytic template molecule. This is a stable molecule that catalyses processes without being changed itself. It is autocatalytic because it either generates copies of itself or is part of a reaction web that generates copies of it. Biomolecules that ordinarily self-assemble spontaneously only at low frequencies can be generated many orders of magnitude more efficiently on a catalytic template. Weak forces attract monomers to the catalyst in the required order, and covalent bonds then assemble the polymer.
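Template-directed assembly of this kind, and its inexactness, can be caricatured in a few lines. The sketch below is purely illustrative (the monomer alphabet, template, and error rate are our inventions, not a chemical model): weak-force binding is imperfect, so each pass over the template yields either an exact copy or a structural homolog.

```python
import random

# Toy sketch of template-directed assembly. Monomers are drawn to the
# template in order, but binding is imperfect, so copies occasionally
# carry substitutions and hence differ in primary structure.
MONOMERS = "ABCD"

def template_copy(template, error_rate, rng):
    """Assemble a copy of `template`, mis-incorporating a random monomer
    with probability `error_rate` at each position."""
    copy = []
    for site in template:
        if rng.random() < error_rate:
            copy.append(rng.choice(MONOMERS))  # wrong monomer bound by weak forces
        else:
            copy.append(site)                  # correct monomer bound and ligated
    return "".join(copy)

rng = random.Random(0)
template = "ABCDABCDABCD"
copies = [template_copy(template, 0.1, rng) for _ in range(1000)]
identical = sum(c == template for c in copies)
print(f"{identical}/1000 copies are exact; the rest are homologs")
```

Even with a modest per-site error rate, a substantial fraction of the products are homologs rather than exact copies, which is the raw material for the divergence of hypercycles discussed below in the main text.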
Eigen and Schuster (1979) proposed a general schema for an autocatalytic cycle that magnifies the output in such a way that a protometabolism would result. Given a sufficient supply of source molecules, a complex hypercycle of catalytic and combinatorial reactions would result. Hofstadter (cf. Varetto 1998) renamed this a tanglecycle to include noncatalytic chemical interactions, but let us look at the simpler autocatalytic hypercycle first.
It is often unappreciated by non-chemists that the catalytic process is not exact. The polymers that are created are similar, but may differ either in their conformation—that is, in their secondary structure—or in the sequence of the monomers—that is, in their primary structure. It follows that these homologs will have different physical properties. The availability of superficial binding sites will change the external charge at different points, varying the hydrophilic or hydrophobic character of the molecules, and the sorts of interactive relations the molecule can support. Hence, some molecules will be able to bind other molecules more effectively than structurally similar homologs. Moreover, different homologs will be more or less stable in different environments, depending on the medium, the pressure and the temperature. As a result, different hypercycles evolve, each with different overall efficiencies. Some hypercycles will have component catalysts that are highly specific for catalysing other components of the cycle, and some components will be more stable than their homologs in other hypercycles.
A tanglecycle is an extension of the hypercycle, except that it is not merely catalytic. Some products of the catalysts participate in stoichiometric reactions that consume those products in the process of generating other catalytic molecules. Although the reaction graph of a tanglecycle is more complex, and its dynamics are commensurately more involved, tanglecycles do not greatly change the argument and conclusions we are presenting in this paper.
Something like a hypercycle or a tanglecycle (e.g., Kauffman’s catalytically closed cycle, Fig. 4c) can be assumed to have been the prebiotic chemical process that gave rise to what we would want to call living systems. As the products of this reaction became larger and more complex, their properties will have become able to initiate and sustain more sophisticated reactions. At some arbitrary point, living systems will have developed “genomes” and “metabolisms” that did not exist at some arbitrary earlier point. The question that presents itself as a philosophical problem, as well as a problem of prebiotic chemistry, is: how did genes evolve? This carries the subsidiary problem: did genes or metabolism come first? Proponents of genes-first versus metabolism-first scenarios put their cases forcefully. The former go under the banner of the RNA World, and both models have been widely discussed (Doolittle 1993; Poole et al. 1998; Szathmáry 1997; Blomberg 1997).
In the hypercycle, the overall rate of complete reproduction will in effect be “governed” by the slowest reaction group. At a critical point in that cycle, variants of the interaction group type that generate the most persistent structures will become more frequent in the reactor if the hypercycle of which they are a part generates them in turn. Longer-lived structures interact with more other molecules, both in reactions and in catalysis, than those which have shorter half-lives, so the likelihood of a variant becoming involved in the reaction again is higher. Since a variant is likely to generate, either by catalysis or reaction, variants of the next interaction group type, the hypercycles will tend to diverge, and more efficient hypercycles, in terms of stability and sequestration of raw material, will tend over time to predominate in the reactor. Moreover, there will be selection in favour of hypercycles that have more stable rate governors, for they will tend on average to have a stabilising effect on the wider hypercycle through their interactions; the first and most immediate benefit of interaction is to stabilise both interacting molecules. In this way, life arises as the outcome of the physical properties of molecules (Schneider and Kay 1994; Schneider and Sagan 2005).
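These dynamics can be sketched with a toy frequency model. Everything in the sketch is our simplifying assumption, not chemistry: each hypercycle is collapsed to a single net rate (the throughput of its slowest, rate-governing reaction minus the decay of its least stable components), and competition for a finite supply of source molecules is modelled by renormalising frequencies at each step.

```python
# Toy competition between hypercycles in a shared reactor. A cycle's
# net rate is throughput minus decay; with finite source molecules,
# relative frequency rather than absolute growth is what matters.

def step(freqs, net_rates, dt=0.01):
    """One step of competition for shared raw material; frequencies are
    renormalised, modelling the finite supply of source molecules."""
    mean_rate = sum(f * r for f, r in zip(freqs, net_rates))
    new = [f + dt * f * (r - mean_rate) for f, r in zip(freqs, net_rates)]
    total = sum(new)
    return [f / total for f in new]

# Cycle 0: fast rate governor but unstable components.
# Cycle 1: slightly slower governor, but longer-lived components.
net_rates = [1.0 - 0.6, 0.9 - 0.1]   # throughput - decay, per cycle
freqs = [0.5, 0.5]
for _ in range(2000):
    freqs = step(freqs, net_rates)
print(f"after selection: cycle 0 at {freqs[0]:.3f}, cycle 1 at {freqs[1]:.3f}")
```

The cycle with the more stable components comes to predominate in the reactor even though nothing in the model is a replicator: what is selected is the persistence and interconnectedness of whole interaction webs.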
Note that this is a Darwinian process of selection, even though the whole reactor and all of its components have no equivalents of the HDD. There are no replicators (the rate governors are low-fidelity and initially no different qualitatively from any other molecule) nor any ecological interactors (the whole hypercycle is the “efficiency bearer”). Each hypercycle is its own reproducer without a replicator as such. That this is so need not challenge Darwinian theory. Fisher (1930) showed that selection is still feasible in the absence of particulate Mendelian factors (that is, under blending inheritance) and under neo-Lamarckian inheritance of acquired characters (Wilkins 2001). Evolution would have had to operate with rates of novelty far higher than under non-blending Weismannian heredity in order to account for the observed stability, but operate it still could.
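Fisher’s point can be illustrated with a toy simulation (our construction; the trait, the fitness rule, and the noise level are arbitrary assumptions): offspring blend their parents’ trait values, so nothing here is a particulate replicator, yet selection still shifts the population.

```python
import random

# Selection under blending inheritance: offspring average their
# parents' trait values, plus a little noise standing in for the high
# rate of novelty the text mentions. No Mendelian factors anywhere.

def generation(pop, rng, noise=0.05):
    # viability selection: survival probability proportional to trait value
    max_t = max(pop)
    survivors = [t for t in pop if rng.random() < t / max_t]
    # blending reproduction: each offspring is the mean of two parents
    offspring = []
    while len(offspring) < len(pop):
        a, b = rng.sample(survivors, 2)
        offspring.append((a + b) / 2 + rng.gauss(0, noise))
    return offspring

rng = random.Random(1)
pop = [rng.uniform(0.1, 1.0) for _ in range(200)]
start = sum(pop) / len(pop)
for _ in range(30):
    pop = generation(pop, rng)
print(f"mean trait: {start:.2f} -> {sum(pop) / len(pop):.2f}")
```

Blending halves the heritable variance each generation, which is why a high input of novelty (here, the noise term) is needed to keep selection supplied with variation, exactly as the text observes.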
Selection of hypercycles is formally similar to Fisher’s speculations. Each component of a hypercycle has differing properties, which it shares to some degree with homologous molecules, and so homologous hypercycles will “share” rate governors and other components (assuming that the reactions are not yet compartmentalised or localised; these hypercycles occur “through” each other, and need not even be located in a single contiguous region of the reactor, for hypercycles are types of interaction webs). There is no “locus of information” in this case, nor any real sense of cybernetic control (“governance” is meant in the sense of a rate governor on a steam engine, as a process limiter). Evolution of systems that are in any real sense cybernetic will not happen until long after this period. Moreover, these rate governors need not even be nucleotides, although nucleotides have the relevant properties of stability and the ability to catalyse other reactions. Selection occurs on interaction groups in the reactor.
At some point, Szathmáry (1987, 1997) hypothesises, hypercycles developed compartmentalisation, perhaps through the spontaneous properties of lipid molecules that may be by-products of the processes they end up encapsulating. Lipids form coacervates, cell-like membranes, through colloidal interactions with the aqueous medium, and when these grow to a critical size, they split through mechanical instabilities. Since these lipids are hypothetically generated from the hypercyclic components, they may have enclosed those components, and the resulting systems would then be subjected to selection in favour of size and the ability to transport “food” molecules across the lipid membrane into the hypercycle. At this point, the entire system exhibits “metabolic” processes, of which some molecules are rate governors. These “pregenes” are not “encodings” of “phenotypic” structure, because the phenotype category is not applicable when every element of the system’s cycle is chemically interactive and none is dispensable. If long-chain, long-lasting molecules—oligonucleotides of RNA initially, according to one version—are critical to the hypercycle, then they are more effective if incorporated into the compartment, where interaction probability is highest. Protocells that have this configuration will be more frequent in the reactor, owing to their efficiency and their ability to outcompete other such structures for source molecules.
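The compartmentalised scenario can also be sketched as a toy model. All of the quantities below (the critical size, growth rule, reactor capacity, and the variation at each split) are hypothetical parameters of our own choosing: protocells grow by importing source molecules at a rate set by the efficiency of their enclosed hypercycle, and split mechanically at a critical size.

```python
import random

# Toy protocell dynamics: compartments grow in proportion to the
# efficiency of their enclosed hypercycle and split when they reach a
# critical size. Efficiency is inherited with slight variation at each
# split (low-fidelity reproduction, not replication). A fixed reactor
# capacity models the finite supply of source molecules.
CRITICAL_SIZE = 2.0
CAPACITY = 200

def simulate(steps, rng):
    # each protocell is a (efficiency, size) pair
    cells = [(rng.uniform(0.1, 1.0), 1.0) for _ in range(50)]
    for _ in range(steps):
        grown = []
        for eff, size in cells:
            size += 0.1 * eff  # uptake rate scales with efficiency
            if size >= CRITICAL_SIZE:  # mechanical instability: split in two
                e1 = max(0.0, eff + rng.gauss(0, 0.02))
                e2 = max(0.0, eff + rng.gauss(0, 0.02))
                grown.append((e1, size / 2))
                grown.append((e2, size / 2))
            else:
                grown.append((eff, size))
        # finite source molecules: random culling back to reactor capacity
        cells = rng.sample(grown, min(len(grown), CAPACITY))
    return cells

rng = random.Random(2)
cells = simulate(400, rng)
mean_eff = sum(e for e, _ in cells) / len(cells)
print(f"mean efficiency after selection: {mean_eff:.2f}")
```

Efficient compartments split more often and so come to predominate in the reactor, although no molecule inside any of them plays a dedicated genetic role: the unit being sorted is the whole encapsulated system.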
Some may question whether this is actually selection, if it lacks replicators.18 We believe that to deny that it is selection is to commit a petitio: to define selection under the HDD as the differential replication of variants, and assert that any process that is not defined in terms of a replicator cannot therefore be selection. But Darwin, in chapter IV of the Origin of Species, did not have recourse to replicators when he proposed the notion of Natural Selection. He spoke instead of “variations” for which there is a strong “hereditary tendency”. One way to read this is in terms of replicators, but another is merely to assume that a token of a variant is passed on in a closely similar state. This does not assume molecules that have a genetic role; indeed, for Darwin the “variations” are characters of the organism, or “forms”, and can apply equally to molecules that do not play a genetic role in our etiobiological picture. Selection is a sorting process of reproducing systems, which may, but need not, have anything resembling a replicator. The primitive concept in selection is reproduction, not replication. Or, better expressed, replication is a proper subset of reproduction.
A different space
One problem with partitioning biological processes into genetic and phenotypic or ecological domains as Lewontin does is that it becomes very difficult to tell what part of the transformation cycle describes a physical event—say, a cell division. Take the division of a somatic cell. Is it phenotypic (covered by the developmental laws T1)? What if it is division of the stem cells of the gamete? Presumably in that case it is covered by T4, but it is still cell division of a somatic cell, until meiosis occurs to create the gamete. The implication of Lewontin’s model is that two distinct domains, one physical and the other what Williams (1992) called the “codical”, which share no descriptors, somehow exert causal influence on each other. It is a view very reminiscent of the philosophy of mind doctrine of substance dualism, where immaterial thoughts influence the behaviour of physical organs in mysterious ways. But in a properly scientific model, the presumption must be that causes are physical chains, and so the informational properties of replicators must lie in their physical properties—structure, thermodynamic capacities, electronic charge, mechanical effects, and so on. We suggest that instead of two unconnected domains, we can visualise evolution as occurring along several physical axes, and we will attempt to suggest some ways to conceive of them.
Although metazoan evolution might occupy a certain region of high information content for genes correlating closely with a high phenotypic efficiency, the Weismannian notion of the sequestration of the germ line and its heirs do not exhaust the possibilities for all organisms. It is likely, a priori, that protobiological processes had a low information content and low phenotypic efficiency, and other quadrants of the field of relations will be occupied at other times and by different kinds of organisms (e.g., asexual and parazoan lineages). So we might conceive early evolution as a time series mapped against two very broad categories that answer to “phenotypic” space, and to “genetic” space. To do this for early molecular protobiological evolution, we need to consider what general unitary measures might apply in each space or hierarchy. For example, heredity occurs in a range of substrates, some of RNA, some of DNA, some in cytoplasm, some in neurological substrates and some, if Hull and Dawkins are to be followed, in the form of theoretical structures in semantic languages.
Formal measures of “heredity” must be construed, in the etiobiological case where there is no “genetic” material as yet, as the structural stability of some critical component molecules or set of catalysts. There is no cybernetic control as such, nor is there any clear coding, but only reaction systems that to some degree of accuracy attain similar states due to their homology. The measure of “heredity” is the structural homology, at various levels, of the (molecular) interactive properties of the system. At every point of the evolutionary cycle, information must be instantiated in physical properties—as stable or labile structures with effects on local molecules. It would be better, therefore, to specify these physical properties as orthogonal axes of a physical space rather than conceive of the informational and economic domains as conceptually distinct.
Lewontin made the interesting observation that when we talk about genes, at least in the population genetics sense that informs the evolutionary discussion, we cannot define them except as they indirectly refer to the phenotype, via its fitness for a given environment. Moreover, we cannot describe phenotypic traits in biometrics except by reference to their heritability. Effectively, the two notions of genotypes and phenotypes in classical population genetics are inverse and obverse, defined in relation to each other, which is why we redescribe them with independent state variables.
E. Thermodynamic efficiency (energy conversion efficiency): the ability of a molecule to convert free energy (roughly equivalent to Dawkins’ fecundity);
F. Similarity: the degree of difference of homologs of a molecule—how accurately copies are made (Dawkins’ fidelity); and
S. Stability: structural persistence, or durability (Dawkins’ longevity).
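As a purely illustrative sketch (our own, not part of the original model), the three variables can be treated as coordinates of a state space in which molecules, and by extension systems, are located. All names, normalisations and numerical placements below are assumptions introduced for illustration:

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class MoleculeState:
    """A point in the (E, F, S) space; each axis normalised to [0, 1]."""
    E: float  # thermodynamic (energy conversion) efficiency
    F: float  # fidelity: how accurately homologs/copies resemble one another
    S: float  # stability: structural persistence, durability


def separation(a: MoleculeState, b: MoleculeState) -> float:
    """Euclidean distance between two states in (E, F, S) space."""
    return math.sqrt((a.E - b.E) ** 2 + (a.F - b.F) ** 2 + (a.S - b.S) ** 2)


# Illustrative placements: a "ge"-like molecule sits at high fidelity and
# stability; a "ph"-like molecule sits at high energy conversion efficiency.
ge = MoleculeState(E=0.2, F=0.9, S=0.9)
ph = MoleculeState(E=0.9, F=0.3, S=0.3)
```

On this picture, a Weismannian system would show a large separation between the ge and ph ends of the lifecycle, while a protobiological hypercycle would have its components clustered close together in the same space.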
In what follows, what can be said of the system can also be said of its components, on the assumption that the system is composed of and inherits the properties of its parts (molecules have given stabilities and thermodynamic efficiencies, and the system properties are determined by its parts).
The axis or variable S is the stability of a molecule in a range of conditions. Some molecules are robust, some are fragile. At low temperatures, nucleotides are stable and resist denaturing for extremely long periods. Polypeptides are far less stable. This is a function of the bonds between their component monomers. Nucleotides are bound together through covalent bonds, while many of the bonds between amino acids are much weaker. The structure and properties of the medium are also important.
The axis or variable E represents the heat capacity of molecules. Larger molecules can remain stable longer at higher temperatures than smaller molecules, since thermal agitation is expressed as vibration and movement in the conformation. The bigger the molecule, the more degrees of freedom it is likely to have, on average.
The axis or variable F measures the variability of molecular types in that reaction cycle. Some molecules will have only a few states, and will be produced in a limited range or not at all. Others will be produced in a broad range of states, differing either in monomeric sequence or secondary structure.19
Transforming Lewontin’s figure, we map a three dimensional field formed by the independent variables E, F and S, to capture the notion that some molecules must persist long enough to catalyze and react with other molecules. In modern biological systems, the system state moves with the hypercycle between the genotypic state (ge) of high fidelity and stability, and the phenotypic state (ph) of facultativeness, or low fidelity, and general specificity for other interactants due to the large number of configurations they can attain. This broad facultativeness increases the number of ways molecules can retard and employ energy gradients in the reactor. The “metabolic region” of this space is therefore around the ph end of the cycle. “Information” resides in the causal symmetry of structure between tokens of a system at t and derived tokens at t + n. In Weismannian systems that do sequester part of the system as genetic, there will be a large separation in the state space between ge and ph ends of the lifecycle.
Asymmetries between the component types will be amplified by selection for homologs that better contribute to the fitness of the hypercycle types in which they are involved. Hypercycles that are more efficient will come to predominate in the reactor. Selection will have occurred on types of hypercycles, and indirectly on types of molecules, tending towards efficiency. Notice in passing that we do not make Wicken’s (1987) stipulation that individual hypercycles must be compartmentalised at this stage.
We should expect ge homologs to compete exclusively for the products of ph homologs, and so after selection, there should be fewer ge types in HMj. That is, selection reduces diversity in favor of the more efficient ge homologs, in the sense of having higher values for fidelity of copying and stability of structure. Homologs of ph will contribute to the energy efficiency of HMj, and those homologs that have the highest affinity for catalysis by ge molecules in HMj and the highest energy conversion efficiency will be favored through selection of the most efficient H’s in R. There will, of course, be a Pareto tradeoff between ge-affinity and energy efficiency of ph molecules, with a likely spread of values. Over time, reproduction will settle into a pattern of fidelity of some stable molecules in pre-self-replicating hypercycles that map more or less exactly onto reaction paths to other molecules that represent the most energy efficient homologs in that reactor.
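The claim that selection narrows diversity toward the more efficient types can be sketched numerically. The following toy model is our own illustrative assumption, not the authors’ model: each hypercycle type is reduced to a single efficiency value, growth is weighted by efficiency, and a fixed reactor capacity enforces competitive exclusion. No replicator entity appears anywhere in it:

```python
import random

random.seed(1)  # reproducible illustration


def select(pop, capacity, generations):
    """Differential persistence of types in a resource-limited reactor.

    Types simply grow in proportion to their efficiency, and the least
    efficient are crowded out once the reactor's capacity is reached.
    """
    for _ in range(generations):
        # Malthusian phase: each type leaves 1-3 "offspring" tokens,
        # weighted by its efficiency e in [0, 1).
        grown = [e for e in pop for _ in range(1 + int(3 * e))]
        # Logistic limit: cull back to carrying capacity.
        grown.sort(reverse=True)
        pop = grown[:capacity]
    return pop


initial = [random.random() for _ in range(100)]
final = select(initial, capacity=100, generations=20)
# Mean efficiency rises and the spread of surviving types narrows:
# sorting has occurred on types, with no replicator in the model.
```

The point of the sketch is only that differential persistence plus a resource limit suffices for the reduction in ge-type diversity described above.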
Energy conversion efficiency is the result of the number of shapes and structural bonds that molecules can attain. These enable a molecule to capture and bind or catalyze other free molecules in the medium. Interactions that are critical to the persistence of H, and which have a broad range of conformations will be preferentially selected. The outcome will be many “metabolic”, “phenotypic” molecules in HMj, tightly mapped in direct reaction pathways to a few “genetic” molecules.
The degree of genicity is the degree of involvement of an entity in the production of the terminal states of the system. If we visualize the development of an organism as a differentiation tree (in metazoans, a cellular differentiation tree), the genes are critical elements that have tokens involved in the differentiation of the system to its final state. But this distinction is neither absolute (for genes are involved in metabolic processes) nor discrete (some hereditary information resides in cytoplasmic structures, for example). In protobiology, the genetic is not yet evolved, and the distinctions we now find useful are not helpful in that period.
So, the question before us is whether the HDD is necessary in order for there to be Darwinian evolution. It is certainly true that if the general ontology of the HDD obtains (replicators, interactors, lineages, populations, ecologies and so forth) then Darwinian evolution is axiomatically derivable. These are jointly sufficient conditions for evolution to occur. But are they, as Lifson (1997) insists, necessary? We must agree with Kauffman (1993, 1995) that genomic information is not required for Darwinian evolution to proceed, for how else could genetic subsystems arise in the first place?
Neo-Darwinian evolution is a subset of Darwinian evolution
The very ways in which the HDD is defined offer us some way out of the conundrum into which we find ourselves cast. Replicators are copied “mostly” intact. Interactors are “cohesive wholes”. These are at best fuzzy predicates and at worst entirely subjective and arbitrary. The properties of being a replicator or an interactor, of being sufficiently cohesive to be a whole individual, indeed even of the process of reproduction itself, are clear enough ordinarily. But they are vague or ambiguous in exotic cases, and become quite misleading in such examples as ramets or hybrid species, colonial taxa or conjugating single-cell lineages. The simple fact is that while these categories are useful in some familiar situations they are not at all useful in others, and the degree of usefulness is a feature of our theoretical and explanatory models, not of the processes that are actually occurring in biology.
What seems at a large scale a rapid transition, even a classical phase transition from liquid to vapour, involves a sequence of graded events at some smaller scale. Certainly the apparent cybernetic and informational properties of genes are a matter of scale and the ways in which the systems to which they contribute are characterised. If the boundaries are drawn at the scale of ribosomal transcription, then the informational nature of genes is hard to discern. What we instead notice are the thermodynamic and chemical properties of several different kinds and sizes of molecules binding and breaking apart. Draw the boundary at the cellular level, and the “coding” becomes much easier to determine, although we suggest that the informational nature of the genome is best conceived at more encompassing levels than the nucleotidal.
The HDD presents a paradox—Darwinian evolution is an axiomatic derivation of a replicator/interactor lineage, and yet (Darwinian) evolution must occur before the HDD can properly be drawn. Darwinian evolution of a structure minimally requires two conditions: variation of environmentally significant properties and a correlated reproduction of those properties, which is why Lewontin’s characterisation of evolution is the more inclusive (see Kitcher 1993). The notion of a replicator in our protobiological case study is inapplicable, and Griesemer’s suggestion of “reproducer” is to be preferred.21 This applies to tokens of the whole system type, and variants.
The replicator/interactor distinction, with its historical precursors, is sometimes presented as an axiomatic precondition of Darwinism, although not by its originators. Darwinian evolution was formulated before such a distinction was proposed, and Darwinian evolution can proceed in conditions where the Weismann barrier does not apply. Genic reductionism is not possible in an autocatalytic environment such as the etiobiological.
Our proposal is that genes evolved from chemical interactors by selection that acted to enrich populations of molecules for the very properties that genes now have: fidelity, stability, and crucial involvement in the reaction chains of the living systems of which they are a part. Selection drove specialization of molecular roles. We consider that much of the discussion about where “information” resides in living systems is really a debate over what to cover by that term. Recognition of molecular information depends very much on the scale of the observed system. The notion of a singular type of hereditary unit is otiose, since frameshifts, open reading frames, exons, regulatory genes, and distributed coding regions have been discovered. We would go further to say that the notion of heredity itself disappears at the right scale, and should be understood, when the processes and actors are better known, in terms of thermodynamics and binding affinities in structured media. Talk about phenomena at one scale of resolution need not be reduced to fine grain talk (Hull 1974), so even knowing that no simple cohesive entities answer to the term “gene” we may still talk about them, on the understanding that they are relations of nucleotides to packets of interactive structures.22
While neo-Darwinian models are not universally applicable to biological evolution, they are not wrong. The emphasis on genetic fine control of phenotypic properties has been enormously fruitful, and the associated mathematical tools have been very useful in the investigation of biological phenomena. However, as the etiobiological context shows, “neo-Darwinian” ⊂ “Darwinian”.
Darwin clearly envisaged that selection operated upon types of organismic systems, which was natural enough since for him heredity was a blackbox property of living systems. In terms of the HDD, Darwin had a notion of hereditable interactors, but not of replicators as such. This is often put down to his ignorance of Mendelism, but we propose that the concept of natural selection is independent of a replicator entity as such. While it may be true that the structures upon which natural selection operates are based on replicator inheritance, at least in the restricted case of Modern, mostly metazoan (animal and vertebrate), biology, it is not a precondition for Darwinian evolution.
Reproduction—the iteration of types—is a necessary condition of evolution by selection. Of course, replicators are themselves high-fidelity types. The sense of “type” intended by Darwin implies what Lewontin has called “variational” evolution: types are modal clusters of heritable properties, no matter how they are caused. In the evolution of genicity, the causal chains that generate types of ge and ph molecules and types of hypercycles, and subsequent systems, are the physical and chemical properties of the reactions and their components. In a Modern organism, the types are caused in part by genetic stability and the critical role of nucleotides in cellular reproduction. Either way, the process of sorting and optimising reproducing types at all levels, from nucleotide–amino acid mappings to the behavioral properties of complex organisms, is a Darwinian process. And even the evolution of genicity is the outcome of optimising interactions via natural selection.
For example, in some organisms the UAG codon codes for the amino acid glutamine, rather than “stop” as it does in the standard genetic code (SGC) for mRNA (Schultz and Yarus 1996).
Peter Godfrey-Smith’s significant book (Godfrey-Smith 2009: 63f) has a critique of replicators that is very complementary to the one offered here. Godfrey-Smith conceives of selection processes in a conceptual manifold formed by the variables “fidelity of heredity” (H), “abundance of variation” (V), “continuity, or smoothness, of the fitness landscape” (C), and “dependence of reproductive differences on intrinsic character” (S). Since this is a continuous space, he, too, conceives of replication as occurring at one corner of the space, rather than there being some qualitative difference between replicators and other objects in evolution.
More exactly, informational terms refer to restricted properties of configurations of a certain class of physical systems: language-using brains. They are supervenient properties of some physical systems (van Gelder 1998; Kim 1993).
Not, as the figure might suggest (though not the caption), developmental states of a typical lifecycle.
Hull’s notion of an interactor, discussed below, is an ecological entity with economic properties.
The chief difference between our account and that of Dawkins (1976, chapter 2) is that for Dawkins, any molecule that could replicate itself is a replicator, but he seemed to suggest that other molecules were not “part” of the replicator. In the account given in this paper, molecules evolve stability through their involvement in interactions of differential rates and stability, and the entire process is protometabolic; cf. Deamer and Weber (2010).
What counts as reproduction is open for debate. We think that it is broadly when the degree of structural similarity of progeny with parents exceeds similarity by chance, but this is not sufficient. A reproducer is any system that produces systems that are better than chance in their identity conditions (sequence identity, topological relations, etc.) but similarity is a vague notion in this context, and so the critical issue is what kind of identities matter. Reproducers are a class of entities, formed by populations of objects, whose interaction networks result in the creation and growth in number of the objects and thus the size and possibly number of the entities. Those entities with extremely high levels of fidelity can be described as interactor/replicator systems; those with merely enough reproductive fidelity to hold themselves together (through homo- and hetero-dimerisation with like molecules) are not interactor/replicator systems. Note that whether the entities are bounded (e.g., by a cell membrane) will determine whether they increase in number (cells) or just get bigger (hypercycles in the chemoton [see below]).
Asexual taxa are not species on the biological species concept of Mayr (1942) and Dobzhansky (1935). Hull takes a more pluralistic approach. One of us has argued that species is a broader concept than restricted applications to the sexual world (Wilkins 2007).
Kitcher outlines the explanatory schema for neo-Darwinian selection in terms of genetic trajectories, initial distributions and frequencies of alleles, fitness of alleles in a given environment, and simple individual selection. The small capitals denote a core doctrine of the various Darwinian views of evolution. The overall argument is, in his view, a straight nomological-deductive explanation schema, as it is in Mary Williams’ and Hull’s generalised axiomatic views of evolution by selection. Kitcher also considers the views of Darwin himself as setting up a minimal Darwinism, which does not adhere to the centrality of selective mechanisms.
Unfortunately, the term “interactor” has a meaning also in chemistry, where it refers to any kind of stoichiometric or catalytic reaction between molecules. To avoid confusion, we will qualify the term where necessary.
Despite Szathmáry’s objections (Szathmáry 1988), the term “hypercycle” is here applied to any chemical cycle that is self-reproducing, and not those that are merely autocatalytic.
We are not committed to the detailed proposals of Eigen and Schuster here. Any autocatalytic process with several catalysts and reactants will exhibit the formal properties required by this argument; cf. Wicken (1987).
The assumption here is that the reactor is a Malthusian reactor, i.e., one in which the rate of reaction is higher than the rate of the provision of new oligomers that act as raw material in the interactions of the hypercycle. Reactors in which the rate of reaction is less than or equal to the rate of new material are termed hyperbolic, after the growth curve. Malthusian reactors exhibit a growth curve similar to hyperbolic reactors but form a logistic curve as the raw material is exhausted, and competitive exclusion commences (i.e., as the carrying capacity is approached).
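The two regimes in this footnote can be sketched numerically. The parameter values r and K below are illustrative assumptions, and the open-supply (“hyperbolic”, in the footnote’s loose sense) regime is sketched as simple unchecked growth:

```python
def malthusian_step(n, r, K):
    """Malthusian reactor: growth slows as raw material is exhausted,
    giving a logistic curve with carrying capacity K."""
    return n + r * n * (1 - n / K)


def open_step(n, r):
    """Open-supply reactor (the footnote's "hyperbolic" regime, sketched
    loosely here): new material keeps pace, so growth is unchecked."""
    return n + r * n


n_malthus = n_open = 1.0
for _ in range(50):
    n_malthus = malthusian_step(n_malthus, r=0.3, K=1000.0)
    n_open = open_step(n_open, r=0.3)
# n_malthus plateaus near K, where competitive exclusion would begin;
# n_open grows without bound.
```

The early portions of both curves are nearly indistinguishable; only as the carrying capacity is approached does the Malthusian reactor flatten into the logistic form described above.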
Although the efficiencies of various components will also be subjected to selection.
David Hull, in conversation with JSW, posed this objection.
This is similar to, but not derived from, Wicken’s treatment (Wicken 1985, chapter 8). A very similar approach is that of Nowak and Ohtsuki (2008), who show mathematically that selection can occur on non-HDD processes.
That is, if the conditions in R are neither severely Malthusian, nor so entirely open to hyperbolic growth as to free the system of selective coefficients.
In the context of speciation, Eldredge (1989) suggested “moremaker”, but this is a little awkward. We may quibble about some aspects of Griesemer’s definition, as Godfrey-Smith (2009) does, that the condition of “material overlap” between parent and progeny entities is too strong, and require only a causal interaction between these two entities, but that does not materially affect this paper’s argument.
We are grateful to the late and much-missed David Hull, Paul Griffiths, and Jim Griesemer for correspondence and conversation with JSW on this subject. Thanks also to the reviewer and to the editor, Kim Sterelny, for helpful comments.