Biology & Philosophy, Volume 28, Issue 2, pp 299–330

The other eukaryotes in light of evolutionary protistology


  • M. A. O’Malley
    • Department of Philosophy, University of Sydney
  • Alastair G. B. Simpson
    • Department of Biology, Life Sciences Centre, Dalhousie University
  • Andrew J. Roger
    • Department of Biochemistry and Molecular Biology, Tupper Building, Dalhousie University

DOI: 10.1007/s10539-012-9354-y

Cite this article as:
O’Malley, M.A., Simpson, A.G.B. & Roger, A.J. Biol Philos (2013) 28: 299. doi:10.1007/s10539-012-9354-y


In order to introduce protists to philosophers, we outline the diversity, classification, and evolutionary importance of these eukaryotic microorganisms. We argue that an evolutionary understanding of protists is crucial for understanding eukaryotes in general. More specifically, evolutionary protistology shows how the emphasis on understanding evolutionary phenomena through a phylogeny-based comparative approach constrains and underpins any more abstract account of why certain organismal features evolved in the early history of eukaryotes. We focus on three crucial episodes of this history: the origins of multicellularity, the origin of sex, and the origin of the eukaryote cell. Despite ongoing uncertainty about where the root of the eukaryote tree lies, and residual questions about the precise endosymbioses that have produced a diversity of photosynthesizing eukaryotes, evolutionary protistology has illuminated with considerable clarity many aspects of protist evolution. Our main message in light of evolutionary protistology is that these ‘other eukaryotes’ are in fact the organisms through which the rest of the eukaryotes should be understood.


Protists; Eukaryote tree of life; Phylogeny-based comparison; Origin of sex; Origins of multicellularity; Origin of eukaryotes


For the general Biologist, to whom for the most part the Infusorial series represents but a single scarcely noteworthy link in the grand scheme of organic nature … there yet remain in this group certain side issues of the highest interest and importance (Saville Kent 1880–1881, p. viii).

An introduction to protists as the ‘other eukaryotes’

For most philosophers and even for many biologists, the paradigmatic eukaryotes are animals and plants. This is a very narrow view of eukaryotes, and our paper will focus on what might be called the ‘other eukaryotes’. These lesser-known and mostly microscopic organisms are highly diverse and plentiful. They are of deep relevance to every other form of life (eukaryotic and prokaryotic), and involved in all of the processes that make life possible for animals and plants. These ‘other eukaryotes’ include multicellular fungi, which are rarely and only fleetingly discussed in the philosophical literature, and the myriad microbial forms that are still all but invisible to many biological and philosophical discussions. We will make the eukaryotes called protists our main object of attention both because they have so much to say regarding the rest of eukaryote biology, and because they can also deeply inform philosophical discussions about, for example, the origins of the other eukaryotic groups, including the one to which philosophers and other humans belong. In addition, the ways in which protists are studied evolutionarily may be illuminating for anyone interested in evolutionary research methodologies.

What are protists?

There is no universally accepted definition of ‘protist’ and the use of the term and its cognates has a complicated history (Rothschild 1989; Scamardella 1999). Probably the most popular contemporary definition (‘Definition 1’ here) is, at core, a phylogenetic one that identifies a paraphyletic group: Protists are anything that is a eukaryote but is not an animal, (land) plant, or (true) fungus. This is essentially the definition of the kingdom Protista under Margulis’s (1971) popular version of the ‘five-kingdom system’ for categorizing life on earth,1 although systematists today do not treat protists as a formal taxon (e.g., Adl et al. 2012).

The other way (‘Definition 2’ here) of defining the category of protist is primarily by functional or biological criteria. Under this view, protists are essentially those eukaryotes that are never multicellular. Sometimes additional subtractive phylogenetic caveats are added (e.g., by excluding lineages that belong within the true fungi, even if those fungi are unicellular). As we will discuss later, there is an interesting range of degrees of multicellularity amongst eukaryotes, so it is not always clear where to draw the line between protists and multicellular non-protists under these definitions.

In practice these two definitions identify heavily overlapping sets of organisms as protists. The undoubted protists include a wide range of mostly small eukaryotic organisms, which are either unicellular, or are colonies or filaments of at most a couple of distinct cell types. Many protists are photosynthetic, and these are also referred to as algae. There is a wide diversity of algae from both basic biological and phylogenetic perspectives. Other protists are fungi-like in their mode of nutrition and/or their reproductive and dispersal strategies, with some (e.g., oomycetes) producing elongate walled extensions that are similar but only analogous to fungal hyphae. Other non-photosynthetic forms of protists, especially motile cells, are often referred to as ‘protozoa’. Most of the famous protozoa are parasites, but free-living protozoa are very diverse and highly abundant in most ecosystems.

Organisms considered protists under the first definition include various kinds of complex macroalgae. The brown algae (Phaeophyceae) in particular include highly differentiated, multicellular forms that are similar in complexity of overall structure to land plants. Large, complex algae are also seen amongst the red and green algae, neither of which is closely related to brown algae. Conversely, biological definitions of protists (Definition 2) may admit some of the unicellular (or nearly unicellular) organisms that are phylogenetically within the fungi or animals. For example, the microsporidia are a large group of parasites that have traditionally been treated as protists although they turn out to be fungi in a phylogenetic sense (Keeling and Fast 2002). In this paper we will generally apply the first definition of protist, unless otherwise noted.

Irrespective of exactly which collections of organisms are considered to be protists, they are profoundly important to an understanding of eukaryotic life on Earth for several reasons: they represent the great bulk of the major lineages of eukaryotic life today; the evolutionary history of protists extends further back in time than that of animals, plants and fungi; these other types of eukaryotes all descend, independently, from different protistan ancestors. In fact, to the first approximation, eukaryotes and protists are one and the same thing, with the animals, land plants, and true fungi representing unusual (albeit highly successful) ‘special cases’.
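
The phylogenetic character of Definition 1 can be made concrete with a small sketch. The Python below is illustrative only: the toy tree, its rooting, and the helper names (`leaves`, `clades`, `is_monophyletic`, `smallest_clade_containing`) are assumptions made for the example, not a published phylogeny. It shows that 'every eukaryote except animals, land plants and true fungi' is not itself a clade, and that the smallest clade containing all of these protists is the whole toy eukaryote tree.

```python
# Toy rooted tree as nested tuples; leaves are strings.
# ASSUMPTION: both the topology and the root position are simplified
# placeholders chosen for illustration, not an accepted eukaryote phylogeny.
TOY_TREE = (
    (
        (("animals", "choanoflagellates"), ("fungi", "nucleariids")),
        "amoebozoans",
    ),
    (
        ("red algae", ("green algae", "land plants")),
        ("ciliates", "diatoms"),
    ),
)

# Definition 1: protists are the eukaryotes left after removing these groups.
EXCLUDED = {"animals", "land plants", "fungi"}


def leaves(node):
    """Return the set of leaf names below (and including) `node`."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def clades(node):
    """Yield the leaf set of every subtree, including single leaves."""
    yield frozenset(leaves(node))
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if `group` is exactly the leaf set of some subtree."""
    return frozenset(group) in set(clades(tree))


def smallest_clade_containing(tree, group):
    """Return the smallest clade (as a leaf set) that includes all of `group`."""
    group = frozenset(group)
    candidates = [c for c in clades(tree) if group <= c]
    return min(candidates, key=len)


protists = leaves(TOY_TREE) - EXCLUDED

if __name__ == "__main__":
    print("Protists form a clade:", is_monophyletic(TOY_TREE, protists))
    enclosing = smallest_clade_containing(TOY_TREE, protists)
    print("Left out of the enclosing clade:", sorted(enclosing - protists))
```

On this toy tree the three excluded groups are nested at different places inside the only clade that holds all protists, which is what makes the protist grouping paraphyletic rather than monophyletic.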

Why protists are often overlooked

This protist-centric view of eukaryotes of course is not the way that the majority of biologists see the biological world. For most of the history of biology it has been much more the case that protists (where recognized at all) are treated as unusual special cases of animals, plants or fungi. The epigraph captures this perception but is noted ironically by William Saville Kent, who in the later part of the nineteenth century was a great advocate of the protist world. And although some historians have scrutinized specific episodes of protist research (e.g., Churchill 1989, 2011; Reynolds 2008; Richmond 1989; Sapp 1987; Schloegel 1999; Sunderland 2011), philosophers have generally fallen in with the mainstream big-organism view that the important eukaryotes are plants and animals.

The chief reasons why protists are usually overlooked are not mysterious. They are small, and tend to influence us indirectly rather than directly. For both of these statements there are notable exceptions that confirm these general truisms. Algae are amongst the best known protists, because some are large, and because of their obvious importance in aquatic ‘food chains’. The lack of land plants in the open ocean makes it hard to conceive of a functioning marine ecosystem without including algae in an account of it (even though an account that includes algae but leaves out other microbes is actually a highly incomplete account of complex ocean ‘food webs’; see, e.g., Pernthaler 2005; Sherr and Sherr 2002). By contrast, a superficially coherent description of other parts of the biological world without protists is quite feasible, as the case of soil indicates. A general understanding of organisms doing important functional things in the soil, such as decomposition, can be achieved with only glancing reference to ‘protozoa’. This does not mean, of course, that protozoa are of little importance in soil (see Crotty et al. 2012 for details of what they do there); merely that one can imagine a functioning soil ecosystem without them. Likewise, only a few protists have been given medical superstar status. Giardia, Plasmodium, Toxoplasma gondii, and Trichomonas vaginalis are amongst the few that have wider public recognition, because they infect hundreds of millions of people. In health-focused media, they are usually described simply as ‘parasites’, in the same way intestinal worms are.

Although we have not carried out any research, we are confident that if we were to ask people on the street of any city ‘what are protists?’ they would seldom have any ungoogled answers to give. We think the situation would be very different if the question were ‘what are bacteria?’ and suspect that protists are more likely to be thought of as ‘big bacteria’ in such a context, or, perhaps more rarely nowadays, as ‘small animals and plants’. For this reason, many protistologists often refer to themselves in public situations simply as ‘microbiologists’. The fact that well-known protist taxa such as algae and protozoan ‘parasites’ are mostly studied within the separate disciplines of phycology and parasitology (respectively) has hampered the development of an autonomous discipline of protistology and its public recognition.

Protists as rule-breaking microbes

Overlooking protists impoverishes views of the diversity of eukaryotic life, and leads to skewed views of what is normal in biology. For example, several general ‘facts’ about eukaryotic cells, which were derived from considering animals, plants and fungi as typical, prove to be anything but general once protists are examined phylogenetically. Classic examples include the basic structure of the mitochondrion and the architecture of mitosis (see Hausmann et al. 2003). Ecologically, there can be substantial overlaps in the predatory behaviours, niche and size distributions of protists. Giant amoebae or dinoflagellates are able to consume multicellular organisms. Hard rules about trophic modes in ecology are often broken by protists. For example, although algae are usually identified as ‘plant-like’, many abundant algae are in fact ‘mixotrophs’ that perform both phototrophy and the predatory consumption of other microbes by phagotrophy (Stoecker 1998; Zubkov and Tarran 2008). Examples include most photosynthetic dinoflagellates and many haptophytes.

In yet more blatant rule-breaking, ‘structural analogues’ of multiple cells in multicellular organisms have been found as subcellular structures in protists (Leander 2008). Complex behaviours typically associated with multicellular nervous systems (e.g., conditioning) have been found in protists such as ciliates (Eisenstein 1997; Armus et al. 2006). At the other end of the biological spectrum, protists are more structurally and behaviourally complex than the unicellular life-forms of prokaryotes (i.e., bacteria and archaea). In fact it is hard for many protistologists to imagine why anyone would study the latter once they had encountered protists, and conceiving protists as ‘big bacteria’ is anathema.

Protists as evolutionary transition points

Protist genomes span a remarkable range of sizes, from 2.3 million base pairs (similar to the average for prokaryote genomes) to almost 100 billion base pairs, which is bigger than the genomes of many multicellular organisms (Gregory 2005).2 Evolutionarily, protists have been described from a population genetics perspective as ‘transitional organisms’ (Lynch and Conery 2003), in that their rates of mutation, and the ways in which selection and drift operate on those mutations, can be similar sometimes to prokaryotes (with evolutionarily relevant large population sizes) and other times to multicellular eukaryotes (with far smaller population sizes).

Due to this in-between status, protists can be conceived as both the evolutionary precursors of other eukaryotes and tractable analogues for broader biological issues (i.e., they function as a general group of model organisms). Many biologists and philosophers of biology want to understand animals and plants, and how those organisms evolved to be what they are. But as we have noted, plants and animals are special cases of eukaryotes, and are far outnumbered at the level of major lineages by protists and other unicellular eukaryotes. Plant and animal biologists often use protists to understand how their research organisms or features of them came into existence (e.g., Heywood and Magee 1976). The homologous features of protists, plants, fungi and animals can be used to understand the evolution of complex cells, multicellularity and sex. Protists have a great deal to offer on such topics now that protistologists have more refined evolutionary trees and insights from model and non-model protists (Montagnes et al. 2012). Often, these evolutionary situations are also examined from a more abstracted theoretical perspective that assumes certain facts relevant to the evolutionary situation (e.g., adaptive advantage) and sets out an evolutionary scenario on that basis.

Evolutionary protistology

Protistology as a discipline has had a long but tenuous existence. In part this is because many protists (or aspects of them) are studied by quite separate disciplines, such as phycology, parasitology or the outer fringes of mycology (Corliss 1989; Wolf and Hausmann 2001). Around 1970, however, a distinct field of evolutionary protistology came into being (Taylor 2003). Evolutionary protistology has primarily concerned itself with the phylogenetic relationships amongst protists and their affinities with other eukaryotes, as well as with the evolution of major features of eukaryotic cells (Patterson 1999). These include in particular the acquisition of plastids through multiple events of endosymbiosis, the properties of anaerobic mitochondrial organelles (which are found in several distinct lineages of protists as well as some other eukaryotes), processes such as gene transfer from endosymbionts or other organisms, and genome evolution in general.

While a small discipline, evolutionary protistology has wrought major advances in our understanding of life on earth. This progress has been anchored by the use of molecular phylogenetics, genomics, and other molecular methods. Evolutionary protistology has in fact played a significant role in the development of molecular phylogenetics, some aspects of genomics, and of ‘phylogenomics’—the analysis of evolutionary history at the genome level (e.g., Lang et al. 1999; Philippe et al. 2000; Bapteste et al. 2002). Molecular insights have been integrated with cell-biological data and more traditional microscopy information, and the systematic study of electron microscopy data has itself yielded significant advances in our understanding of protist biodiversity and phylogeny prior to, and overlapping with, the adoption of molecular approaches (Patterson 1999). Furthermore, the ongoing commitment to biodiversity research—discovering novel protists, or characterizing forgotten ones—has yielded particularly strong dividends (Simpson 2003; Brown et al. 2009; Lang et al. 1997; Moore et al. 2008; Corliss 2002).

The broad sweep of eukaryotic evolution

The central achievement of evolutionary protistology to date has been the reconstruction of much of the phylogenetic tree of eukaryotes, and the mapping onto its branches of some very important events in the history of eukaryotic cells. The current model apportions the vast majority of eukaryotic organisms into half a dozen or so supra-kingdom-level groups or ‘supergroups’. This moniker conveys the idea that these groupings are ‘above the level of kingdom’ in the sense that the traditional multicellular ‘kingdoms’ (animals, land plants, fungi) are mere subgroups (Simpson and Roger 2004; Adl et al. 2005, 2012).

The first supergroup to be resolved clearly was Opisthokonta. This assemblage includes both animals and fungi, plus a scattering of heterotrophic protist lineages. Within opisthokonts, animals are closely related to choanoflagellates, which are small free-living unicellular or colonial bacterivores, and only a little less closely related to a series of lineages that include a collection of animal parasites called Ichthyosporea (Steenkamp et al. 2006; Ruiz-Trillo et al. 2008; Shalchian-Tabrizi et al. 2008). Fungi are the major representatives of a second primal branch within opisthokonts, with the other members of this group being the little known nucleariid amoebae, and an obscure aggregative ‘slime mold’ organism called Fonticula (Liu et al. 2009; Brown et al. 2009). The evolutionary unity (monophyly) of Opisthokonta has been increasingly strongly supported by molecular phylogenetics and phylogenomics for nearly two decades (Baldauf and Palmer 1993; Wainright et al. 1993; Brown et al. 2009; Liu et al. 2009). These findings confirm earlier strong suspicions that animals, choanoflagellates and fungi are related—a claim that was based primarily on comparative morphology, especially the arrangement of the flagellum within the cell (Cavalier-Smith 1987a). Interestingly, increasing evidence from molecular phylogenetic studies shows that some very obscure protozoa—the apusomonads, and probably breviates and ancyromonads—are the sistergroups to opisthokonts (Cavalier-Smith and Chao 1995; Kim et al. 2006; Katz et al. 2011; Fig. 1).
Fig. 1 Eukaryotic supergroups (Adl et al. 2012; used with permission)

Amoebozoa is the second supergroup, and it is very probably related to Opisthokonta (Bapteste et al. 2002; Hampl et al. 2009; Liu et al. 2009). This is another essentially heterotrophic group, and as the name suggests, most are amoebae, usually with broad pseudopodia (dynamic cellular extrusions that are used for food capture and to enable movement across surfaces). Amoebozoa also includes some flagellate protozoan protists, as well as many ‘slime molds’, which may act as amoebae until aggregating to form a fruiting body from many formerly independent cells, or which have the ability to develop fruiting body structures without aggregation, though often via development into a large multinucleate ‘plasmodium’ (Cavalier-Smith et al. 2004; Shadwick et al. 2009).

The third supergroup is Archaeplastida, often called ‘Plantae’. Archaeplastida includes the groups of organisms that have plastids (chloroplasts) of primary endosymbiotic origin; that is, acquired directly through an ancient endosymbiosis between a eukaryotic host and a photosynthetic prokaryote, specifically a cyanobacterium or relative of one (in contrast to secondary endosymbiosis, mentioned below: Palmer 2003; Archibald 2009). There are three groups of Archaeplastida: glaucophytes, which are a very small obscure group of algae; rhodophytes or ‘red algae’, a group with several thousand described species ranging from unicells to large seaweeds; and Chloroplastida (or Viridiplantae), which are the very diverse ‘green algae’ plus the land plants. In a phylogenetic sense, the land plants are ‘merely’ a highly specialized form within one sub-sub-group of Chloroplastida. Phylogenies of plastid genes and their homologues from prokaryotes, plus comparisons of plastid genome content, have strongly supported a close relationship uniting all plastids as a single group among living cyanobacteria (Palmer 2003; Rodríguez-Ezpeleta et al. 2005). This relationship is consistent with plastids all descending from a single ancient event of primary endosymbiosis in a common ancestor of all Archaeplastida. A common endosymbiosis is also strongly supported by the presence of a common protein import mechanism in all these primary algae (Palmer 2003; McFadden and van Dooren 2004; Price et al. 2012). There is much more limited support from nuclear gene phylogenies for the monophyly of Archaeplastida, but its basic unity is now quite widely accepted.

The next supergroup is now usually called SAR or Sar, which was originally an acronym of the constituent groups (Burki et al. 2007; Adl et al. 2012). Stramenopiles (also known as heterokonts), alveolates and Rhizaria are three major lineages of protists that became widely accepted one after the other between the 1980s and early 2000s. In the last five years, multigene and phylogenomic analyses have revealed that these three groups are all closely related to one another, in spite of having no particular overarching morphological similarity (Burki et al. 2007, 2008; Hackett et al. 2007). Sar encompasses a wide range of protists, including several of the better-known groups. Foraminifera and Radiolaria, large marine amoebae with mineralized shells or skeletons, are both kinds of Rhizaria. The ciliates (probably the most familiar type of free-living protozoan for the majority of biologists) and the apicomplexan parasites (e.g., malaria-causing Plasmodium, as well as Toxoplasma and Cryptosporidium) are both kinds of alveolates. The oomycetes are the most famous kind of pseudofungi, and belong within the stramenopiles.

From a broader perspective, however, it is the photosynthetic members of Sar that are the best known: diatoms (unicellular algae with a two-part silica cell wall equivalent) and the multicellular brown algae (e.g., kelps and wracks) are just two of numerous kinds of stramenopile algae, while dinoflagellates (unicellular organisms famous for causing harmful algal blooms3) are a kind of alveolate. These photosynthetic members of Sar acquired their plastids directly or indirectly through an event of secondary endosymbiosis; that is, by the incorporation of a photosynthetic primary alga as a symbiont, and subsequent reduction to the status of an organelle (Archibald 2009). The original algal symbiont was in fact a red alga. Interestingly, the single group of algae within Rhizaria, the chlorarachniophytes, obtained their plastids through a separate event of secondary endosymbiosis involving a green algal symbiont (see Fig. 2 and the following subsection).
Fig. 2 The origin and spread of photosynthesis (Archibald 2009; used with permission)

There are two other groups of unicellular algae with secondary plastids ultimately of red algal origin (see Fig. 2). These are the haptophytes, which are particularly abundant in the ocean, and the more obscure cryptophytes. On the basis of plastid similarities, these organisms were for a time considered to be related to stramenopiles and/or alveolates (Cavalier-Smith 1999). Early phylogenomic analyses subsequently suggested that the two groups might be closely related to each other and to a small collection of mostly obscure heterotrophic forms, thereby forming a separate supergroup called Hacrobia (Patron et al. 2007). More recent studies, however, have indicated that haptophytes and cryptophytes (and their heterotrophic relatives) are unlikely to be closely related to each other, and suggest separate affinities with Sar and Archaeplastida (Burki et al. 2012).

The final supergroup is Excavata. This is a collection of unicellular protists, almost all of which are heterotrophs (i.e., protozoa). A substantial fraction of known excavates is parasitic (e.g., Giardia, Trichomonas, and the sleeping sickness parasite Trypanosoma), and many are metabolic anaerobes with mitochondrial organelles that have extensively modified biochemistries.4 There is also one group of algae—the green euglenids—whose plastid was obtained through another secondary endosymbiosis involving a green alga (see Fig. 2). Originally the Excavata grouping was proposed on the basis of structural features, namely descent from an ancestor with a groove used for suspension feeding that was supported by a particular set of cytoskeletal features (Simpson 2003). This ‘excavate’ feeding groove has been retained in about half of the major lineages within Excavata. Molecular phylogenetic studies of excavates have been complicated by the very high rates of sequence evolution that characterize the genomes of several major lineages (see Philippe et al. 2000). The unity of the group has been supported by some phylogenomic analyses that exclude the rapidly evolving lineages (Rodríguez-Ezpeleta et al. 2007; Hampl et al. 2009). However, the monophyly of Excavata, or at least the precise list of its sub-lineages, remains somewhat controversial.

The tree of eukaryotes

Given these building blocks, what is the basic structure of the tree of eukaryotes? The ultimate answer to this question depends on the position of the ‘root’ of the eukaryote tree, which turns out to be a challenging question (and one to which we return below). Setting aside the position of the root for the moment, recent phylogenomic studies have generally supported a particular three-way partitioning of the unrooted tree of eukaryotes (Hampl et al. 2009; Burki et al. 2012). One group, often referred to as ‘unikonts’ and now called Amorphea (Adl et al. 2012), consists of Opisthokonta plus Amoebozoa, and presumably apusomonads etc.; a second group consists of Archaeplastida and Sar plus haptophytes, cryptophytes and several heterotrophic lineages; while the third consists of Excavata alone. Two, or all three, of these clusters would represent genuine major clades, depending on where the root of the eukaryote tree lies.
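
The dependence of clade status on root position can be illustrated with a minimal sketch. The Python below assumes a heavily simplified unrooted tree: the internal node names (`hub`, `amorphea_node`, and so on), the two placeholder 'excavate lineage' tips, and the reduction of the second cluster to Archaeplastida plus Sar are all hypothetical conveniences, not results from any analysis. For a chosen root edge, it reports which of the three clusters come out as clades.

```python
from collections import defaultdict

# ASSUMPTION: a toy unrooted tree. Internal node names are hypothetical, the
# second cluster is reduced to two tips, and the excavate tips are placeholders.
EDGES = [
    ("hub", "amorphea_node"),
    ("hub", "second_node"),
    ("hub", "excavata_node"),
    ("amorphea_node", "Opisthokonta"),
    ("amorphea_node", "Amoebozoa"),
    ("second_node", "Archaeplastida"),
    ("second_node", "Sar"),
    ("excavata_node", "excavate lineage 1"),
    ("excavata_node", "excavate lineage 2"),
]

CLUSTERS = {
    "Amorphea": {"Opisthokonta", "Amoebozoa"},
    "Archaeplastida + Sar": {"Archaeplastida", "Sar"},
    "Excavata": {"excavate lineage 1", "excavate lineage 2"},
}


def build_graph(edges):
    """Turn an edge list into an undirected adjacency mapping."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def leaf_set(graph, node, parent):
    """Tips reachable from `node` when walking away from `parent`."""
    children = graph[node] - {parent}
    if not children:
        return frozenset([node])
    result = frozenset()
    for child in children:
        result |= leaf_set(graph, child, node)
    return result


def clades_when_rooted_on(edges, root_edge):
    """All clades (as tip sets) once the root is placed on `root_edge`."""
    graph = build_graph(edges)
    found = set()

    def walk(node, parent):
        found.add(leaf_set(graph, node, parent))
        for child in graph[node] - {parent}:
            walk(child, node)

    u, v = root_edge
    walk(u, v)  # everything on u's side of the root
    walk(v, u)  # everything on v's side of the root
    return found


def clusters_that_are_clades(root_edge):
    """Names of the clusters that are clades for the given root placement."""
    found = clades_when_rooted_on(EDGES, root_edge)
    return sorted(
        name for name, tips in CLUSTERS.items() if frozenset(tips) in found
    )


if __name__ == "__main__":
    for edge in [("hub", "excavata_node"), ("excavata_node", "excavate lineage 1")]:
        print(edge, "->", clusters_that_are_clades(edge))
```

Placing the root on a branch between clusters leaves all three as clades, whereas placing it inside one cluster breaks that cluster up and leaves the other two intact. This matches the 'two, or all three' outcome described above.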

Thus, in the last two decades a substantially articulated model of the eukaryote evolutionary tree has emerged. This work is not complete. Some proposed groupings require further verification (e.g., Excavata), while there are a number of poorly known but apparently evolutionarily important groups whose phylogenetic position is still unclear (e.g., collodictyonids and Palpitomonas—see Brugerolle et al. 2002; Yabuki et al. 2010; Zhang et al. 2012). In particular, however, two major unresolved questions remain in spite of much intensive study: the position of the root of the tree, and the history of higher-order symbioses involving red algal symbionts.

The position of the root is extremely difficult to estimate using conventional molecular phylogenetic approaches. This is primarily because for any given gene there is typically a very large evolutionary distance between eukaryotes and prokaryotic outgroups. This fosters an artefact known as long-branch attraction, which in practice tends to result in rapidly evolving lineages amongst eukaryotes being ‘attracted’ artificially to the branches of the prokaryote sequences (see Philippe et al. 2000; Bapteste et al. 2002). Several recent approaches have been used to try to overcome this problem. These include analyzing genes where the eukaryotes are closer to outgroup sequences, especially genes descended from the mitochondrial symbiont (Derelle and Lang 2012); examining individual ‘rarely changing’ genomic characters where some eukaryotes retain the ancestral prokaryote condition (highly conserved amino acids, derived gene fusions, or novel biochemical pathways: see Stechmann and Cavalier-Smith 2002, 2003; Richards and Cavalier-Smith 2005; Cavalier-Smith 2010a; Rogozin et al. 2009); and examining the numbers of inferred gene duplications and losses across genomes (Katz et al. 2012). So far these approaches yield different results, suggesting root positions between opisthokonts and everything else (Katz et al. 2012), between the Amorphea grouping and everything else (e.g., Derelle and Lang 2012), between Archaeplastida and everything else (Rogozin et al. 2009), or even deeply within Excavata (Cavalier-Smith 2010a). This is a fascinating area of study, but the position of the root is at present widely considered unsolved.

The history of higher-order endosymbiosis in eukaryotes is also complicated. Molecular phylogenies of plastid-encoded genes generally support a common ancestry within red algae for the plastids (where present) of stramenopiles, alveolates, haptophytes and cryptophytes (Yoon et al. 2002; Iida et al. 2007; Janouškovec et al. 2010). This shared ancestry, combined with the fact that they (usually) possess a chlorophyll pigment type not found in other plastids—chlorophyll c—suggests that these plastids all descend from a single event of secondary endosymbiosis. Putative shared replacements of nucleus-encoded plastid-targeted genes from the chlorophyll c-containing algae are also most simply explained by descent from a single endosymbiotic event on the host lineage side. This would suggest that stramenopiles, alveolates, haptophytes and cryptophytes were all closely related, and that all heterotrophic forms within these groups (including ciliates, for example) are descended from photosynthetic ancestors. This proposal is called the ‘chromalveolate hypothesis’ (Cavalier-Smith 1999; Keeling 2009).

Unfortunately for this elegant idea, nuclear gene phylogenies almost never recover chlorophyll c-containing groups as close relatives, which contrasts with the relative ease of determining the red algal affinities of the plastid (Baurain et al. 2010). In fact, recent phylogenomic analyses increasingly place some chlorophyll c-containing lineages, typically cryptophytes, as most closely related to Archaeplastida (Burki et al. 2012). This would make it impossible for the chlorophyll c-containing algae to stem from a common secondary endosymbiosis, because that event would then have to predate the origin of red algae, the source taxon of the symbiont. A more popular hypothesis now is that the chlorophyll c-containing plastids arose from a single secondary symbiosis, but were then moved again amongst eukaryotic lineages by at least two later eukaryote–eukaryote symbioses. Such events would constitute ‘tertiary’ endosymbioses (Archibald 2009; Fig. 2).

Methodological distinctions: phylogeny-based comparison vis-à-vis adaptationist scenario building

Although constructing an accurate tree of eukaryotes is a goal unto itself, the project also provides the basis for comparative phylogenetic approaches to understanding major evolutionary events, in which analyses of evolutionary history are carried out on the basis of characters on an existing phylogeny. A central premise of this methodology is that even in the absence of fossil data (which is extremely scarce for most protists), data from modern organisms (including but not restricted to molecular data) can be used to infer the past history of lineages and the properties ancestral to living taxa. Evolutionary reconstructions can involve determining the precursors or intermediate stages in the evolution of some complex or highly derived trait from its preservation in living relatives (Dacks and Doolittle 2001). Questions about the direction and rate of evolutionary change can be asked on the basis of phylogeny-based comparison (Pagel 1997). Alternatively, robust phylogenies can be used to infer that similar features in different organisms are evolutionarily independent convergences, rather than descended from a common ancestral feature (Leander 2008; Rundell and Leander 2010).
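
One simple way to see how data from living taxa constrain inferences about ancestors is parsimony-based ancestral state reconstruction. The sketch below implements the bottom-up pass of Fitch's parsimony algorithm in Python, which is only one of several available reconstruction methods and is swapped in here purely for illustration. The toy tree, its rooting, and the crude two-state coding of each lineage as 'unicellular' or 'multicellular' are labelled assumptions; the coding ignores real variation within these groups.

```python
# ASSUMPTION: a toy, strictly binary rooted tree written as nested tuples.
# The topology and root are simplified for illustration only.
TOY_TREE = (
    (("animals", "choanoflagellates"), ("fungi", "nucleariids")),
    "amoebozoans",
)

# ASSUMPTION: a deliberately crude character coding for each tip. Real groups
# vary (for example, some fungi are unicellular), so this is not a data set.
TIP_STATES = {
    "animals": "multicellular",
    "choanoflagellates": "unicellular",
    "fungi": "multicellular",
    "nucleariids": "unicellular",
    "amoebozoans": "unicellular",
}


def fitch(node, states):
    """Bottom-up Fitch pass on a binary tree.

    Returns (candidate states at `node`, minimum number of state changes
    required in the subtree below `node`).
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


if __name__ == "__main__":
    root_states, changes = fitch(TOY_TREE, TIP_STATES)
    print("Candidate states at the root:", root_states)
    print("Minimum number of changes:", changes)
```

Under these assumptions the root is reconstructed as unicellular and at least two changes are required, which is the kind of pattern that would be read as independent origins rather than inheritance from a common multicellular ancestor. A full analysis would add a top-down pass to resolve internal nodes and would usually compare parsimony with model-based methods.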

In this paper, we discuss phylogeny-based comparison in a way that is largely distinct from its use in hypothesis-driven adaptationist approaches to understanding evolution. For many evolutionary biologists, especially those focused on plants and animals, comparative analysis on the basis of phylogeny is the principal means by which the adaptiveness of a trait is inferred and tested (e.g., Harvey and Purvis 1991; Martins 2000). Once a trait has been identified as adaptive in such research, mechanistic ‘feasibility’ and adaptationist scenarios can be constructed to gain an understanding of why and how particular evolutionary changes occurred. Scenario-building tries to give plausible mechanistic accounts of evolved phenomena, and to identify the selective regimes under which such changes occur (Cavalier-Smith 2006). Such explanations tend to formulate a sequence of innovations that follows the ‘logic’ of intercellular interactions (e.g., Cavalier-Smith 2009; Lane and Martin 2010). Any such ‘staging’ or transition account has to utilize effectively large amounts of existing cell-biological and genome data, combine existing mechanistic knowledge of how eukaryote and prokaryote cells work, and at least gesture toward the step-by-step advantages gained in this major reorganization of the cell (Koonin 2010a). These scenarios can also require consideration of the transformation of selection pressures on the emerging cellular innovations (Lane 2011).

Hypothesis-driven scenario-building and phylogeny-based comparison are thus two very different ways of approaching the evolution of major organismal features. While these methods are both based on phylogenetic reconstructions and often combined, in their extreme forms the questions they pose in order to achieve their research goals are very different and they produce very different sorts of evolutionary explanations. Theoretical scenario-building and phylogeny-based comparison also have clear strengths and weaknesses. The comparative phylogenetic approach at its best is more closely tied to direct observations of evolved phenomena, but it does not as directly address the ultimate questions of why a particular feature evolved (Leroi et al. 1994; Losos 2011).

Although many episodes in the evolution of protists raise intriguing philosophical issues about evolutionary processes and the entities that experience them, our focus in what follows will be the science that produces that knowledge about protist evolution, and particularly the way in which evolutionary protistology is currently driven by phylogeny-based comparison, in the sense outlined above. We will examine three evolutionary accounts in the history of protists: the origins of multicellularity and sex, and of eukaryote cells themselves. These represent more or less a gradient in the way comparative phylogenetic approaches have so far been used in relation to theoretical scenario-building about evolutionary events. We will discuss reasons for the particular methodological balance in each case, and reflect further on the nature of historical science in these two different forms.

Origins of multicellularity

Understanding the various transitions to multicellularity is a topic of major importance to evolutionary biology. Historical understanding of the origins of multicellularity is closely connected to greater knowledge about how modern cells function (King 2010), and feeds into larger-scale insight about the diversity and evolutionary success of multicellular eukaryotes (Knoll 2011). Philosophers of biology interested in major transitions in evolution often pay considerable attention to the evolution of multicellularity. This is because of the bearing multicellularity has on the conceptualization of units of selection and biological individuality (Calcott and Sterelny 2011a; Godfrey-Smith 2009; Michod 2005), long popular topics in philosophy of biology. In these cases, multicellularity is often discussed as a general transition, rather than in terms of specific lineage characteristics. The range of forms of multicellularity, and how each evolved, are often less important than similarities when the focus is units of selection and how selection shifts from individual cells to groups of cells.

A slightly different strand of philosophical work, however, has emphasized the evolution of different forms of multicellularity, with the aim of reaching more general explanations. One focus might be, for example, the different paths taken to reach multicellularity, such as those taken by unicellular ancestors that became morphologically diverse animals compared to the modes of multicellularity achieved in ancestral plant lineages (Sterelny 2006). Another focus can be the elucidation of the range of mechanisms that allowed transitions to multicellularity to come about (Calcott 2011; Knoll and Hewitt 2011). In these analyses, the particular organisms involved are crucial to the discussion, but the point is again to be able to make general claims about multicellularity as a major evolutionary phenomenon.

Some of the specific adaptive scenarios for the origin of multicellularity are built on the basis of the evolution of predatory lifestyles, which began with unicellular heterotrophs and allowed the diversification of trophic levels (Stanley 1973). Multicellularity is argued to have provided an advantage to organisms able to increase their size (as collections of cells), because this would have made it difficult for any unicellular predator to ingest them (Bengtson 2002; King 2004). Other general hypotheses for multicellularity posit straightforward size advantages with room always available ‘at the top’ (Bonner 1998), or conjecture that cell-to-cell clustering brings cooperators together and limits interactions with non-cooperators (Pfeiffer and Bonhoeffer 2003). Size-related adaptive scenarios are linked to the benefits gained by sharing individually produced or obtained resources, thus creating advantages for cells in cooperative groups (Koschwanez et al. 2011; Grosberg and Strathmann 2007). More efficient absorption and storage of nutrients (Kaiser 2001) and greater buffering against environmental change (Kirschner and Gerhart 1998) are also postulated as adaptive drivers of multicellularity.

Preadaptationist or ‘exaptive’ explanations are used to account for the fact that features apparently integral to multicellularity, such as developmental processes in metazoans, are underpinned by genes now known to pre-exist those features (e.g., Marshall and Valentine 2010). Explanations of what such preadaptive advantages might have been in particular selective environments, and whether such preadaptations require simplification scenarios (in which the ancestral organism was in fact more complex but over evolutionary time reduced that complexity), are then added to the adaptationist account of multicellularity. Non-adaptive processes are rarely considered seriously in such scenarios, even though there are good reasons to reflect on genome architectures that arose during the evolution of multicellularity as products of genetic drift, gene duplication and other neutral processes (Lynch 2007; Stoltzfus 2012). Tentative as existing adaptationist accounts of multicellularity are, they nevertheless underlie the main hypotheses driving experimental evolutionary investigations of the evolution of multicellularity. In specific experiments examining an adaptive hypothesis of multicellular origins (e.g., as a response to predation), multicellularity is discussed as a general condition even though the hypothesis is necessarily tested in specific lineages (e.g., Ratcliff et al. 2012; Boraas et al. 1998).

From these more abstracted perspectives, the evolution of multicellularity can be understood on the basis of organisms that exhibit features important to becoming multicellular. Volvox and Dictyostelium, for example, are favourite candidates for these investigations. However, features in these organisms are not representative of all instances of multicellularity, and some evolutionary inferences based on these model organisms are ‘phylogenetically naïve’ (Cavalier-Smith 2002b, p. 137). Nevertheless, they help tell a broad story about the transition to multicellularity, and acknowledge that it evolved multiple times in a variety of protist lineages (e.g., Grosberg and Strathmann 2007; Kaiser 2001; Herron and Michod 2007). One thing to consider is whether multicellularity really represents a dramatic transition, especially if it evolved so many times in such diverse ways (estimated to be as many as 25 times on a loose definition of multicellularity: see Buss 1987; Grosberg and Strathmann 2007). These transitions are probably even more common than this large number. For example, recent work suggests that aggregative multicellularity arose independently in all eukaryote supergroups (Brown et al. 2012).5 The repeated achievement of multicellularity might signal that the origins of these phenomena are relatively trivial episodes in the long history of evolution. The multiplicity of forms and origins of multicellularity also indicates that it is unlikely a single adaptive story could adequately explain the origin of multicellularity, although Derelle et al. (2007) suggest that ‘eukaryotes as a whole are preadapted for multicellularity’ (p. 217). Although single adaptive explanations of multiple convergences are common, comparative phylogenetic detail about the varieties of multicellularity is required for more precise evolutionary explanation (Rokas 2008a; Ruiz-Trillo et al. 2007).

While significant progress has been made in phylogeny-based comparative efforts to understand the origin of animal and other multicellularity, the same cannot yet be said of the adaptive scenarios that accompany those findings. Hypotheses remain very tentative, with no more explanatory closure now than a decade ago. Nevertheless, the increasingly robust phylogenetic inferences about the origins of animal multicellularity, built on comparative data (i.e., genome sequences), may eventually enable the development of better grounded ‘why’ scenarios for this transition in, for example, Opisthokonta, and even why this group seems predisposed towards producing multicellular lineages (Ruiz-Trillo et al. 2007). We will expand briefly on how evolutionary protistology has approached the evolution of multicellularity in animals, even though there are several other types of multicellularity that are also of great evolutionary interest such as fungal, colonial and aggregative multicellularity (Brown et al. 2009, 2012), and supposedly ‘complex’ versus ‘simple’ forms (Knoll 2011; Rokas 2008b).

Now that many of the deep-level relationships amongst opisthokonts and Amorphea have been identified (Adl et al. 2012), very fine-grained work can be done on possible precursors to animal multicellularity, in the closest living relatives of Metazoa. Choanoflagellates are the protists that are now agreed to be the closest living relatives of animals (Ruiz-Trillo et al. 2008; Carr et al. 2008; King et al. 2008; Lang et al. 2002). Many have capacities to form colonies (Carr et al. 2010), and more still have capacities for cell-to-cell communication and adhesion. These characteristics have encouraged evolutionary biologists to see a ‘stepwise evolution of complexity’, from individual unicellular organisms through simple multicellularity to the complex multicellularity of animals (Carr et al. 2008, p. 16641; Knoll 2011; King 2004). While choanoflagellates share an ancestor with animals, these extrapolations might be misleading because choanoflagellates appear to be monophyletic. Unless coloniality can be traced with confidence back to the choanoflagellate ancestor, this suggests the convergent evolution of multicellularity sensu lato. Alternatively, earlier ancestors could also have had these colonial and communicative features. Phylogeny-based comparison works backwards beyond the common ancestor and assumes that if a succession of ancestral lineage features is found to have appeared at one point in a lineage and nowhere else, the ancestor to animals has probably been found (e.g., Shenk and Steele 1993). Features with single but more ancient origins are synapomorphies for clades, but not for the multicellular animal clade in particular.

Choanoflagellates had been talked about as evolutionarily important by natural historians of infusoria (most generally, microscopic organisms) such as Henry James-Clark6 (1826–1873) and William Saville Kent7 (1845–1908). They perceived similarities in cellular structure between choanoflagellates and choanocytes, the feeding cells of sponges (which are basal metazoans), and argued that ‘backward’ phylogeny (ancestral state reconstruction) would lead to the discovery of even more ancient ancestral forms (James-Clark 1866, 1868). When identified as the closest living relatives of the metazoan ancestor, choanoflagellates can be studied as a model for the emergence of multicellular development on the basis of shared characteristics that enable multicellularity (Fairclough et al. 2010). Observations that some choanoflagellates have a single-cell phase, followed by asynchronous cell division to form multicellular colonies (rather than forming ‘mere’ aggregations of cells), suggest that early animal multicellular development was also driven by cell division and not aggregation (because this is how extant metazoans develop). However, in order to establish claims such as this, phylogenetic inferences need to be made that organisms ancestral to choanoflagellates and animals also developed in this way (Fairclough et al. 2010). Comparisons of extant choanoflagellate genomes with those of animals allow the reconstruction of some aspects of the genome of their last common ancestor (King 2004; Ruiz-Trillo et al. 2008). Identifying ancestral genes has allowed the development of evolutionary scenarios about the important characteristics of animal multicellularity (as opposed, for example, to those involved in fungal multicellularity) and even the very origins of animal multicellularity (Degnan et al. 2009).

But because many of the features thought until recently to be characteristic exclusively of multicellularity are found in a range of choanoflagellates and other protists, and specific characteristics once thought to be exclusively metazoan are found outside them (e.g., adhesion and signaling proteins), it is clear that simple assumptions about crucial features of multicellularity in animals and other clades have to be theorized in a more sophisticated way in relation to their earlier evolutionary and organismal contexts (Suga et al. 2012; Sebé-Pedrós et al. 2010, 2012; de Mendoza et al. 2010; Abedin and King 2008, 2010; Pincus et al. 2008; Manning et al. 2008; Rokas 2008b; King et al. 2003). In other words, these features require analyses that incorporate considerations of co-option, convergence, loss and non-adaptive processes. Unicellular organisms with, for example, proteins that in multicellular organisms bind cells to extracellular matrices (cadherins, integrins and associated proteins), may have had very different functions in their ancestors (Nichols et al. 2012).8 Hypothesizing (and testing) preadaptive scenarios for such phenomena can be done in relation to extant representatives of those lineages, and functions inferred for the ancestral organisms (e.g., cadherins in choanoflagellates are postulated to be involved in phagocytosis; see Abedin and King 2008; Özbek et al. 2010).

A similarly provoking discovery is of epithelial cells in the amoebozoan slime-mold, Dictyostelium discoideum. These cells are characteristic of animal body plans and were once thought to be exclusively metazoan (Dickinson et al. 2011, 2012). There is some similarity between Dictyostelium and metazoan epithelia at the gene and protein level, but the cells function differently, perhaps because of less complex molecular machinery in the Dictyostelium variants (Dickinson et al. 2012). But important as it is to understand the functions of such organizational phenomena, and whether they are analogies or homologies, a phylogeny-based comparative approach also draws attention to the contrast between facultative and obligatory multicellularity (Dickinson et al. 2012). The capacity of organisms such as Dictyostelium to enter and leave multicellular states questions the notion of multicellularity as a major transitional leap involving the emergence of novel forms of cooperation. Instead, the emergence of multicellularity may better be explained ‘merely’ as an occasional survival strategy to deal with difficult environments, rather than a remarkable transformation in the organization of life. Evolutionary protistology is, from this perspective, offering an alternative way of thinking about evolutionary transitions. This is not what usually occurs when such events are perceived from the vantage point of multicellularity in an already full-blown metazoan form.

Origin of sex

The evolution of sex is variously described as a puzzle, a paradox, a mystery and an enigma (e.g., Hurst and Peck 1996; Hadany and Feldman 2005). Sexual reproduction is considered to be such a curious phenomenon because it appears to come at the high cost of lower reproductive output and fragmentation (via recombination) of advantageous combinations of genes (Barton and Charlesworth 1998; West et al. 1999). Theoretical concerns about the cost of sex have usually, therefore, led to searches for countervailing benefits to explain the widespread and evolutionarily persistent nature of sexual lineages. There are numerous benefit explanations on offer (Kondrashov 1993; Meirmans and Strand 2010), mostly developed on the basis of understanding large-organism biology. Sex is thought to facilitate adaptation and potentially evolvability by generating genetic novelty, to enable parasite resistance, and to purge deleterious mutations or even reduce genetic variation itself (Williams 1975; Maynard Smith 1978; Bell 1982; Hamilton et al. 1990; Hurst and Peck 1996; Gorelick and Carpinone 2009). Although these accounts are often seen as competing hypotheses for why sex is evolutionarily advantageous (and all have major shortfalls—for a drift-based explanation, see Otto 2009), in recent years there have been some suggestions that a plurality of mechanisms might be involved in the evolution of sex and thus a pluralistic explanation would be required (West et al. 1999).

Meiosis (reductive division producing haploid nuclei from diploid cells, accompanied by recombination), and the fusion of haploid nuclei (karyogamy, usually as part of syngamy, which is the fusion of gametes) are the core processes of sexual reproduction (we do not include here processes such as plasmid-mediated conjugation in prokaryotes). Its origin is a popular research focus, both from genomic and cell-biological evolutionary approaches, even though there are disagreements about which features of sexual reproduction evolved first, and even whether mitosis preceded meiosis. Adaptive hypotheses about meiosis focus on apparent advantages such as DNA repair, recombination fidelity, ploidy reduction, and epigenetic resetting (Wilkins and Holliday 2009; Egel and Penny 2007; Cavalier-Smith 2002b; Gorelick and Carpinone 2009; Michod 1993). But whatever the adaptive reasons for the evolution of meiosis, genetic variability—the standard explanation for the persistence of sex—is quite likely to have been ‘epiphenomenal’ for the early evolution of this new mode of reproduction (Cavalier-Smith 2002b; Wilkins and Holliday 2009).

It is clearly critical for these theoretical discussions that a careful distinction is made between the origin(s) and the maintenance of sex (Lenski 1999; Maynard Smith 1986). Historical reconstructions of the organisms and conditions in which sex originated need not have any bearing on the secondary evolutionary processes that maintain the phenomenon of sex once it has evolved. Phylogeny-based comparisons indicate that sex is probably widespread in protists and an ancient feature. There are relatively few groups of species in which sex has been conclusively identified so far, but they are distributed across the supergroups of eukaryotes described above (e.g., Lahr et al. 2011). Behaviours that are suggestive of sexual processes (apparent syngamy, meiotic reduction, and karyogamy) have been reported across a wide selection of protistan eukaryotes (Lahr et al. 2011). Genes specifically involved in the core sexual process of meiosis are similarly found in taxa across the eukaryote tree (Ramesh et al. 2005). Both of these observations indicate an ancient origin of sex, probably predating the last common eukaryote ancestor, and they dispose of presuppositions that early eukaryotes must have been ‘primitively’ asexual (Dunthorn and Katz 2010).

The fact that sex has not yet been directly observed in so many protist groups is primarily a function of its rarity in most of these organisms. There are relatively few protists in which there is a routine connection between sex and reproduction. Even groups of unicellular forms in which sex is well known, such as diatoms, ciliates and apicomplexan parasites, routinely go through dozens (or even thousands) of asexual generations on average between each sexual event (Dunthorn and Katz 2010; Smith et al. 2002; Mann 1993). The frequency is probably orders of magnitude lower in many other groups of protists, and directly observing sex is very much a search for a canoodle in a haystack. Also, definitive identification of sex, whether from morphological, molecular or behavioural evidence, can be problematic even in multicellular eukaryotes, because organismal and molecular data proposed as evidence of sexuality often have other explanations (Schurko et al. 2009). Comparative genomic data indicate very clearly that meiosis-specific genes are present in almost all studied eukaryotic genomes (Malik et al. 2008), and thus meiosis is reasonably presumed to be a default characteristic of eukaryotes (Ramesh et al. 2005). Many protists have recently been shown to have sexual processes, often on the basis of particular genetic signatures of recombination or sets of genes for meiosis. Giardia, Trichomonas, Leishmania, Naegleria and certain amoebozoan amoebae are some of the better studied lineages previously put in the ‘obligatory asexuals’ category, and now found to be sexual, at least sometimes (Lahr et al. 2011; Fritz-Laylin et al. 2010; Akopyants et al. 2009; Lasek-Nesselquist et al. 2009; Malik et al. 2008; Poxleitner et al. 2008; Cooper et al. 2007).

From an evolutionary protistological perspective, many of the assumptions that underpin theoretical perspectives about the disadvantages of sex are not so obvious. For example, sex is held to have a (massive) twofold cost because of only one gender bearing offspring and differential gamete size investment (anisogamy). These costs are in addition to the potential breakup of favourable gene combinations and the loss of reproductive time spent in mate-finding, mating and meiosis for obligately sexual organisms. Viewed from evolutionary protistology, however, these assumptions are unlikely to be true in the ancestral cells in which sex evolved. Facultative sexuality is the norm, with asexual reproduction typically being orders of magnitude more common. Isogamy (implying equal ‘parental investment’) is common amongst extant protists, with conjugation between equal partners also common. Finally there is often only a tenuous connection between sex itself and reproduction sensu stricto (Spiegel 2011), or no special connection at all, for which pennate diatoms are just one significant example. Sex was almost certainly facultative and not obligatory at its outset (Dacks and Roger 1999), and it is possible that the earliest sexual organisms were isogamous. At best, most theoretical evolutionary scenarios about the costs of sex could be true about only the maintenance of a sub-type of sex, in which there is obligate (or at least common) linkage of sex and reproduction, and distinct ‘male’ and ‘female’ sexes (see Lehtonen et al. 2012).
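The force of the point about facultative sexuality can be seen with some simple arithmetic. The sketch below is a deliberately crude toy model, not one drawn from the literature cited here: it assumes that each sexual event multiplies a lineage's output by a fixed factor (one minus a hypothetical cost), that asexual generations carry no such cost, and that the only other parameter is the fraction of generations in which sex occurs. The function name and both parameters are illustrative assumptions.

```python
# Toy arithmetic only: a hypothetical proportional cost is paid in the fraction of
# generations that involve sex, and nothing is paid in asexual generations.

def relative_growth_per_generation(cost_per_sexual_event, sex_frequency):
    """Geometric-mean per-generation growth relative to a purely asexual lineage.

    Over N generations the cost is paid sex_frequency * N times, so the relative
    output is (1 - cost) ** (sex_frequency * N); the N-th root gives the value below.
    """
    return (1.0 - cost_per_sexual_event) ** sex_frequency


# Obligate sex with a full twofold cost in every generation.
obligate = relative_growth_per_generation(0.5, 1.0)

# The same twofold cost, but paid only once per thousand generations.
facultative = relative_growth_per_generation(0.5, 1.0 / 1000)

print(obligate, facultative)
```

Under these assumptions the obligate case halves output every generation, whereas the facultative case loses well under a tenth of one percent per generation on average, even though the cost per sexual event is identical. If the cost per event is itself lower, as isogamy would suggest, the average burden shrinks further. The sketch ignores everything else that matters (recombination load, mate-finding, environmental triggers), so it illustrates only why costs calculated for obligately sexual, anisogamous organisms need not transfer to the ancestral condition.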

Furthermore, any adaptive value of sex in unicellular organisms is not going to be understood in a single-lifestyle interpretation, but as something that occurs infrequently in multiple-generation populations. It needs to be recalled that ‘lifecycle’ for an animal biologist refers to the history of individual organisms from birth to completion of reproduction, usually through a sexual event. In protists with known sexual cycles, the ‘life cycle’ is usually the history of many generations of individuals within a population because sexual reproduction occurs only occasionally, typically in dependence on environmental and other factors that may or may not be well understood. Another of the difficulties in producing an adaptive scenario for the evolution of sex in protists is that gametes come in a variety of forms, mating types are multiple, and meiosis works in different ways (Raikov 1995; Heywood and Magee 1976; Phadke and Zufall 2009). While phylogeny-based comparison attempts to reconstruct historically the earliest forms of sex, extrapolations about its adaptive value when conceived as a unitary phenomenon are very problematic. Cavalier-Smith goes so far as to say that selectionist explanations of evolutionary innovations such as meiosis and syngamy are just ‘secondary metaphors’ (2010b: e53, 1995). Evolved phenomena, he argues, have to be understood in mechanistic accounts of how mutations and cell biology could produce novelties ‘from within’. Regardless of whether the scenario building is adaptationist or strictly mechanistic, phylogeny-based comparison in evolutionary protistology has constrained substantially the timing of the origin of sex, and strongly indicates some of the likely properties of the sexual cycles at the time of the divergence of the major eukaryote lineages. But given the structure of the eukaryote tree, the value of phylogeny-based comparison erodes almost completely as we push back further in time to the actual origin of sex. This erosion is a feature shared with our next focal topic: the origin of the complex eukaryotic cell itself.

Origin of the eukaryote cell

The evolution of the eukaryotic cell, almost certainly from prokaryotic ancestors, clearly represents one of the most important major evolutionary transitions in the history of life on earth. Eukaryogenesis was a unique occurrence (recapping from above: all eukaryotes are related because they share a single common ancestor), but the sheer number and variety of differences between prokaryotes and eukaryotes suggests a whole series of events, potentially quite a protracted series (Roger 1999; de Duve 2007; Cavalier-Smith 2009). In principle we could find intermediates (transitional between prokaryotes and eukaryotes) in this process—either obscure living lineages, or recorded somehow in the fossil record—but this search for missing links has so far been fruitless (Embley and Martin 2006). The classic features or hallmarks of eukaryotes (such as the possession of mitochondria, a nucleus, membrane-bound organelles, cytoskeletal and endomembrane systems, mitosis and the molecular underpinnings of these capabilities) are shared by all eukaryotes and must therefore have been features of the most recent common ancestor of extant eukaryotes. The failure to find evidence of any truly intermediate organisms, and the uniqueness of eukaryogenesis, make some evolutionary biologists think that the full story of this event might never be known (e.g., Koonin 2010a).

To the extent that modern eukaryotes can inform us about this transition, even if only the later stages, it is clearly the protists that will do so the most, as they represent or reflect the ancestral condition for all eukaryotes (unicellularity), and will straddle the root of the tree, almost irrespective of where that root actually belongs. Knowing whether features of extant lineages are ancestral features that have been preserved depends, however, on having a rooted tree of eukaryotes, most especially to identify ‘deep branching’ lineages (if they exist), or at least the deepest splits in the tree. If these exact splits are not known, then precise claims cannot be made about what the earliest eukaryote cell was or was not like. As we mentioned earlier, there is no well-supported consensus on the position of the root of eukaryotes and several hypotheses remain viable because there seems to be as much evidence in support of each hypothesis as there is contradicting every one of them. Methodologically, the transition between prokaryotes and eukaryotes remains obscure because of poor signal in existing molecular data, limited to no fossil data, shallow sampling of taxa, and complex patterns of sequence change that confound phylogenetic models of sequence change (crucial to pinpointing when particular events occurred). All these difficulties, even if seen as temporary, generate considerable uncertainty and disagreement about inferences postulating particular basal groups.

Although research on the origin of the eukaryote cell uses phylogeny-based comparison, it is currently driven by hypotheses such as the hydrogen hypothesis (Martin and Müller 1998; Martin et al. 2003), the somewhat similar syntrophy hypothesis (López-García and Moreira 1999), and the protoeukaryote hypothesis (Cavalier-Smith 2002a; de Duve 2007). These hypotheses attempt to reconstruct the configuration of the first eukaryote cell, and usually give adaptive reasons for why it succeeded evolutionarily (e.g., because of the advantages bestowed by bioenergetic capacity, increased cell and genome size, or phagotrophy). Any hypothesis about the exact architecture of the ancestral eukaryote cell, and thus the genes that should be looked for in hypothesized ancestors, depends heavily on preferred accounts of what the first eukaryote cell looked like and could do (Kurland et al. 2006; Martin et al. 2003; Embley and Martin 2006; de Duve 2007; Cavalier-Smith 2006; Archibald 2011; for an overview, see O’Malley 2010). Although some theoretical accounts see the acquisition of the mitochondrion as the primary precipitating event in eukaryogenesis (Lane and Martin 2010), other scenarios prioritize the very ability to enclose another cell, noting the definitive cytoskeletal capacities this ability requires (Cavalier-Smith 2002a, b; Kurland et al. 2006). Quite where the evolution of the nucleus fits into this sequence of events is also wide open at the moment, although there are a number of scenarios that attempt to explain its origin via adaptive advantages (Martin and Koonin 2006; López-García and Moreira 2006; Cavalier-Smith 2010b).

The most significant contribution of evolutionary protistology to the debate has been in the form of testing and refuting the archezoa hypothesis, which postulated an ancestral amitochondriate eukaryote (Cavalier-Smith 1987b). Under the archezoa hypothesis, primitively amitochondriate eukaryotes may have been ancestrally lacking some other typical eukaryote features, such as a complete endomembrane system, or sex (Cavalier-Smith 1987b; Patterson and Sogin 1992). Phylogeny-based comparisons of cell biology and molecular features have demonstrated that in fact all eukaryotes have some form of mitochondria (Roger 1999; Keeling 1998; van der Giezen 2009). Comparative analyses of mitochondrial genomes also established the endosymbiont origin of the mitochondrion, and thus illuminated how a definitive characteristic of the eukaryote cell evolved (Gray and Doolittle 1982). Improved estimates of the phylogenetic tree of eukaryotes, combined with comparative genomics and transcriptomics, have increasingly confirmed that the last common ancestor of eukaryotes was complex and substantially ‘complete’ with regard to most other general features as well (Field and Dacks 2009). While the position of the root of the eukaryote tree is unresolved, the scenarios currently discussed do not envisage a notably primitive last common ancestor of eukaryotes, with the limited exception of the Eozoa hypothesis (Cavalier-Smith 2010a). This is both progress and a disappointment to those hoping to gain significant insight into the origin of complex cells from living taxa.

Despite the many refinements of the competing hypotheses about the first eukaryote cell, they have not provided definitive answers. Indeed, for many researchers such inquiries are ‘contaminated’ by speculative reasoning about alternative hypotheses, between which there may never be sufficient data to decide. For this very reason, many evolutionary protistologists are happy with ‘what’ questions but shy away from a full-blown focus on ‘why’ questions. ‘Why’ questions will almost always fail, from their perspective, to yield satisfactory answers. But from the more theoretical perspectives striving to produce plausible adaptive hypotheses, if data are patchy, signal is poor and evidence can be interpreted in multiple ways, it makes sense to try to integrate whatever weak data there are into a clear story, and then to search for confirmation and disconfirmation of this coherent narrative. Despite the huge gaps in adaptationist narratives, evolutionary protistology and comparative phylogenetic approaches have achieved considerable success in filling in key changes, indicating their timing, and hinting at or ruling out other options. While it could be thought that these successes mean the focus will continue to be gap-filling via phylogeny-based comparison rather than elaborating extensively on adaptive scenarios, this possibility is stymied by the fact that the ancestral eukaryote already possessed all of the features of a modern eukaryote (which seems likely regardless of which of the root positions is favoured). Not being able to discriminate more finely means phylogeny-based comparisons will simply not make further findings that help answer the question about the earliest eukaryote.

Philosophies of evolutionary methodology

Evolutionary protistology highlights important and sometimes neglected aspects of how evolutionary explanations are produced. Rather than aiming for the most far-reaching generalizations possible, evolutionary protistologists have emphasized tracing ancestral links in order to ‘pin’ particular evolutionary relationships and dynamics to specific nodes on the best available phylogenetic tree. This comparative phylogeny-based approach requires certain methodological priorities.

Discussions of the origin of an evolutionary novelty, such as sex, which first ask “what is its selective advantage?” put the cart before the horse. To explain an evolutionary innovation, we must answer four different sorts of questions. First is the phylogenetic question: in what kind of organism and from what precursors did the new character evolve? Secondly, we must consider the nature of the mutations that originally created the new character. Thirdly, we must discuss the developmental question of how the newly mutated genes produced it each generation. Finally comes the population genetic question of how the novel mutations specifying the new character spread throughout the population … not only by positive selection but also by genetic drift (Cavalier-Smith 1995, p. 190).

The questions Cavalier-Smith sees as the methodological foundation of understanding protist evolution (and hence eukaryote evolution) overlap with Tinbergen’s famous four questions about causation, survival value, ontogeny and evolution (1963). However, the adaptive aspects of two of Tinbergen's questions (survival value and ‘evolution’) are often left aside in evolutionary protistology, for very good reasons.

Perhaps the best way to describe this emphasis of inquiry and explanation is that comparative phylogenetic protistology approaches evolutionary questions ‘backwards’, so to speak, from the organisms themselves rather than reasoning about why innovations have persisted. Without presuming the evolutionary advantage of new features or capabilities of ancestral organisms, phylogeny-based comparisons seek an accurate representation of evolutionary history and the placement of deep-level changes. One label sometimes given to this approach is ‘evolutionary cell biology’ (King 2010); from a genomic perspective it has been called ‘comparative-genomic reconstruction of ancestral forms’ (Koonin 2010b, p. 606). Evolutionary protistologists see the reconstruction of the exact evolutionary environment of the era of innovation as much harder to do, and inevitably conjectural, even when efforts are made to specify how selection must have worked on what. This does not mean that such adaptive scenarios are seen as uninteresting, but that they are believed to ask unanswerable questions. Most of the advances of recent evolutionary protistology have been achieved while avoiding such questions, and by instead focusing on the development of knowledge about ‘what’: the organisms that evolved these features, and the ones that inherited them.

The ‘what’ questions phylogeny-based comparisons involve can be seen as falling into two connected categories: ‘what happened’, which leads to the reconstruction of evolutionary history; and ‘what it is’, which is concerned with what the evolving phenomena themselves are (e.g., the description of the genome, or cellular structures). While these questions use inferences (i.e., about evolutionary relationships and ancestral structures, and how one structure became a new one) and not just concrete observations, there is a high level of confidence these days about what is known about many important relationships and how correct these understandings are. This level of confidence is far higher than can be attributed to ‘why’ scenarios, such as why the first eukaryote cell came into being. But as Koonin notes (2010c, p. 21 [slightly paraphrased]), even though ‘asking “Why” questions in evolutionary biology is dangerous [and] trying to answer them with sweeping hypotheses is doubly dangerous … Still, this type of risk-taking has to be applauded if we wish to “understand” evolution in any meaningful way.’

Many philosophers would contribute to the applause for such scenario-building. Good historical science is often thought of as being about the identification of ‘smoking guns’ (Cleland 2001, 2002), and the evolutionary transitions themselves have been identified as such types of evidence (Calcott and Sterelny 2011b, p. 3). As far as they can, these philosophical and theoretical accounts favour abstracted historical scenarios, so that ‘merely’ tracing back biological innovations is deemed a less ‘profound’ form of historical science (Calcott and Sterelny 2011b, p. 4). From this more theoretical perspective, evolutionary phenomena should be abstracted and used to explain more generally within a larger theoretical enterprise—usually one of adaptiveness but also in regard to accounts of evolution that are concerned with higher order properties such as evolvability, complexity, modes of inheritance, and individuality.

Much of the discussion of historical science is based on case studies outside biology, and we suggest that closer examination of evolutionary biology as a historical science will help develop a more fine-grained notion of what historical explanation is in biology, and how forms of it may differ. We must emphasize, however, that the ‘what’ reconstructions (‘what happened’ and ‘what it is’) in evolutionary protistology are properly explanatory, in the sense of identifying causal trajectories at a variety of biological levels and evolutionary scales. These explanations fit fairly well what Calcott (2009) calls ‘lineage explanations’, in which population-level accounts involving selection do not appear but an explanation of the sequence of events does. Lineage explanations give an account of how one mechanism becomes another; they focus on a trajectory of change, rather than the broader processes driving those changes (Calcott 2009). And even when these explanations are developed into full-blown mechanistic and selectionist scenarios that attempt to account for what caused such changes (Cavalier-Smith 2006), very rarely are single ‘smoking guns’ found; they are much more likely to be ‘subcollections of traces [that] make their causes merely highly probable, as opposed to determining them’ (Cleland 2001, p. 989; Losos 2011).

We have shown here that some of the more theoretical accounts of important evolutionary events are very distant from the actual biology in comparison to those produced by detailed phylogenetic reconstructions. While phylogeny-based comparisons do not have the theoretical ambition of the more abstracted efforts we have noted, this hardly seems disadvantageous in the context of broad swathes of deep evolutionary time in which multiple vague scenarios function primarily as heuristics with which to organize the phylogenetic detail. The origins of sex and multicellularity are particularly good exemplifications of the payoff from focused reconstructive efforts rather than theory-building in wide-open evolutionary spaces. The evolution of the eukaryote cell, while still an open question, has been at least constrained by phylogeny-based comparison, which has thus clarified the biological scenarios that could be at work.

But in addition to our claims about phylogeny-based comparison, the most encompassing point that we hope this article communicates is the importance of understanding protists to comprehend our own evolutionary background. Far from protists being, as the epigraph coyly suggests, ‘scarcely noteworthy links in the grand scheme of organic nature’ (Saville Kent 1880–1881, p. viii), we have shown repeatedly how eukaryotes, eukaryote evolution, evolutionary transitions, and major features of our own biology need insights from protistology, and especially evolutionary protistology. The diversity, abundance and evolutionary innovativeness of protists make it no loss to view the eukaryote world from a protist-centred perspective. Indeed, what is lost by not doing so is far greater, and we think our snapshot analyses of the origins of sex, multicellularity and eukaryote cells give some indication of how protist-based knowledge and the methodology of evolutionary protistology provide alternative perspectives on evolutionary transitions, and eukaryote biology in general. The ‘other’ eukaryotes as a way of thinking about the protist world might even become a misnomer as a result of these reflections.


The ‘five kingdom system’ is often attributed to Whittaker (1969). However, his version intentionally included polyphyletic (not sharing a recent common ancestor) higher kingdoms and thus differs significantly in structure from the five-kingdom system popularized between the 1970s and 1990s. This more familiar version was proposed by Lynn Margulis (1971), who respectfully modified Whittaker’s scheme so that animals, plants and fungi were each potentially monophyletic, in effect by moving several groups to Protista from either fungi or plants.


Encephalitozoon intestinalis (a microsporidian) is the parasite with the smallest known protist genome; Gonyaulax polyedra (a dinoflagellate) has the largest reliably estimated genome (Gregory 2007). Even amongst parasitic protists (expected to have smaller genomes), there is considerable variation from just over two million base pairs to 160 million (Zubáčová et al. 2008). However, the size range of protist genomes is currently understood from very limited data (Gregory et al. 2007), and some earlier reports of extraordinarily large genomes in amoebae seem to be incorrect (Gregory 2005).


Roughly half of the dinoflagellates are photosynthetic. The remainder are free-living or parasitic heterotrophs.


Permanent anaerobiosis, the loss of the ability to use oxygen as a terminal electron acceptor during energy metabolism, has evolved a number of times in eukaryote evolution. There are particularly successful lineages of anaerobes in Excavata, but the property is not unique to this supergroup.


However, see Dickinson et al. (2012) for a suggestion that the various instances of multicellularity in Amorphea, usually considered ‘independent’, can be collapsed into a common origin. The current understanding of the deep-level diversity and phylogeny of Amorphea (e.g., Kim et al. 2006; Katz et al. 2012) makes this idea relatively unparsimonious.


HJC published as James-Clark; his actual surname was Clark even though this was his mother’s family name (his father’s was Porter).


Saville Kent also hyphenated his name occasionally; Saville was a second forename but he used it as the first part of his surname.


Integrin-related proteins have been found in bacteria as well as protists. In the former, their role is tentatively hypothesized as intracellular signaling (Chouhan et al. 2011).



We thank Mark Olson (UNAM) for detailed comments that greatly clarified our argument. MAO acknowledges funding from the Australian Research Council and University of Sydney in the form of a Future Fellowship; AGBS is supported by the Canadian Institute for Advanced Research program in Integrated Microbial Biodiversity, and a Discovery grant from the Natural Sciences and Engineering Research Council of Canada; AJR is supported by the Canada Research Chairs Program and a Discovery grant from the Natural Sciences and Engineering Research Council of Canada.

Copyright information

© Springer Science+Business Media Dordrecht 2012