Introduction

Traditional behaviors are found in many animal species (Galef and Laland 2005; Allen 2019), but only in Homo have they coalesced into integrated and shared cultural systems (e.g., Andersson et al. 2014a; Smaldino 2014; Richerson et al. 2016; Buskell et al. 2019; Read and Andersson 2019), and only Homo has come to specialize in maintaining and acting within such systems (e.g., via expanded cognitive and metacognitive functions; Sherwood et al. 2008; Csibra and Gergely 2011; Whiten and Erdal 2012; Shea et al. 2014; Sherwood and Gómez-Robles 2017; Dunstone and Caldwell 2018). The integrated cultural nature of human behavior is strikingly expressed in emblematic feats of cooperation and coordination, such as when hunting like “a highly competitive group-level predator” (Whiten and Erdal 2012), but it also permeates the human way of life entirely, down to its most minute details. For example, resources obtained using cultural hunting and foraging strategies go on to enter an intricate cultural “metabolic system” where they are processed, stored, distributed, disposed of, and turned into a wide variety of products. These products themselves are part of the operation of this cultural system, which is much wider and older than its individual human stewards, who depend on it for their survival (e.g., Boyd and Richerson 2000; Henrich and McElreath 2003). This uniquely human “cultural community” embodies an emergent ecological strategy that cannot be reduced either to its learned or its genetic components (1992).

In many ways, this sounds more like the description of an organism than of an animal social community—except perhaps for some species of social insects, whose communities can be understood as an unusual type of organism, due in part to the high degree of relatedness within colonies (Queller 2000; Queller and Strassmann 2009; Kennedy et al. 2017). In human cultural communities, however, these integrated qualities cannot be understood genetically. While clearly underpinned by genetic adaptations, the exceptional variation of behavior seen in human communities is not explained by genetic variation (e.g., Lewontin 1972; Foley and Lahr 2011). Our genetic adaptation in this regard is indirect—it permits cultural adaptation to happen.

So how could features seen as typical of (or even unique to) adapted biological organization emerge from a substrate of social learning? Are we just pushing a tempting analogy too far? The Social Protocell Hypothesis (SPH; Andersson and Törnberg 2019) proposes that this affinity between cultural and biological organization is genuine and fruitful, and that it stems from a deep similarity in how they originated and evolved. More specifically, the SPH proposes that groups of socially learned traditions were integrated into a new group-level cultural entity via an Evolutionary Transition in Individuality (ETI; see Maynard-Smith and Szathmáry 1995; Michod 1999, 2007; Leigh 2010; Clarke 2014; Hanschen et al. 2015; Szathmáry 2015). In this view, group-level cultural integration did not follow as a secondary effect of the evolution of lower-level factors that are usually seen as primary (e.g., cumulative traditions, hominin cooperation, etc.) but precisely the other way around.

ETIs are rare evolutionary transitions where new higher-level evolutionary individuals (entities equipped to undergo adaptation by natural selection as wholes; Lewontin 1970; Buss 1987; Sober and Wilson 1994) arise from cooperating groups of lower-level evolutionary individuals (Buss 1987; Szathmáry and Maynard-Smith 1995; Queller 2000; Michod and Roze 2001; Hanschen et al. 2018). Repeated ETIs have thereby produced one of life’s most familiar characteristics: its hierarchical structure. For example, groups of cooperative genes evolved into the first cellular genome, groups of bacteria-like cells evolved into the eukaryotic cell, groups of eukaryotic cells evolved into multicellular organisms, and groups of multicellular organisms evolved into social insect colonies.

Evolutionary individuality (Buss 1987; Michod 1999; Radzvilavicius and Blackstone 2018) emerges during an ETI via cycles of cooperation, conflict, and conflict mediation. Selection and adaptation thereby move to a higher level of organization, producing an integrated type of entity whose lower-level units (originally evolutionary individuals) are co-opted and turned into parts of the whole (Michod 1999). For a brief overview of ETI theory and its relation to frameworks such as Major Transitions in Evolution, see Hanschen et al. (2018).

The cultural ETI described by the SPH is proposed to have started some 2.5 Mya, associated with the emergence of hominin big game carnivory and the appearance of the genus Homo. It is proposed to have been primed by a pre-existing fortuitous combination of behavioral and ecological circumstances that imparted a basic level of evolutionary individuality (via community-level boundaries, heredity, and reproduction) to collections of unintegrated traditions maintained in growing and splitting early hominin social communities.

The transition would have taken us from “animal culture” to integrated cultural entities, equipped with irreducible systems for heredity, reproduction, and development on the cultural group level. Traditions would evolve into increasingly subordinated components of a hierarchically organized cultural whole. Following Andersson and Törnberg (2019), the hypothetical type of evolutionary individual that emerged in this transition is referred to as a sociont. Notably, hominins are not seen as part of the emerging sociont but remain separate genetic evolutionary individuals (see Footnote 1). The SPH thereby introduces a notable change of perspective since “the group” is here a group of traditions, not of hominins.

Arguably, the most fundamental question to pose about the SPH is whether—and, if so, when and to what degree—the culture of Homo shows evidence of evolutionary individuality. This question has many parts. For example, is there evidence that selection at an early point came to act collectively on combinations of animal-style traditions in hominin social communities? Were integrated systems of traditions formed as a result? Is the function and organization of Homo culture consistent with what one would expect if selection increasingly acted on cultural systems as wholes? Is there evidence that these proposed cultural organisms (like biological counterparts) evolved mechanisms that increased the extent to which they could be targets of selection? These are the questions we seek to address in this article.

We examine these issues by applying a set of criteria for assessing whether biological entities qualify as evolutionary individuals (Hanschen et al. 2018), adapting them to the cultural realm to account for the difference in substrate. We begin by outlining the SPH to explain why we think that an ETI drove human cultural evolution, and to introduce the entities used in the analysis. From this basis, we apply our criteria to judge whether, how, to what extent, and at roughly what stage they are fulfilled. The analyses are then summarized and compared with selected types of biological evolutionary individuals, and the results are assessed and compared with theoretical expectations. We conclude by discussing the results and evolutionary individuality in the context of a set of features of human culture that appear to be inconsistent with our results. This article combines schemata from several fields; to aid the reader we have provided a glossary for reference.

From Traditions to Socionts via the Social Protocell: An Overview

Pan as a Proxy for Early Hominins

We use the traditions and community dynamics of Pan (in particular the more studied common chimpanzee Pan troglodytes) as a proxy for a primordial (pre-Homo) early hominin condition. Aware of the risk of conveniently overstating similarities between Pan and early hominins (e.g., Sayers and Lovejoy 2008), our arguments rest in particular on assumed similarities in the following basic aspects of group behavior, social learning, and ecological strategy.

The diverse and broad range of traditions maintained by Pan include extractive foraging behaviors such as nut-cracking, leaf sponging, termite fishing, and ant-dipping, along with a wide variety of social conventions, food choices, and so on (see Boesch 2012 for overview). These traditions are transmitted between individual apes primarily by copying outcomes (emulation) rather than underlying processes (imitation; Tomasello et al. 1987; Tomasello 1996; Tennie et al. 2009; Whiten et al. 2009; Clay and Tennie 2018), and they may be stable and potentially long-lived (Mercader et al. 2002, 2007). Extant chimpanzees are believed to be qualitatively similar to the earliest hominins with regard to their capacity to form and maintain traditions (e.g., Whiten et al. 2003; McGrew 2010; van Schaik 2016, p. 78), and it is likely that early hominins maintained traditions at a level and of a type similar to extant wild chimpanzees (e.g., Boesch and Tomasello 1998; Whiten et al. 1999, 2003; Whiten 2005; Lycett et al. 2009; Harmand et al. 2015).

With regard to overall community organization, Pleistocene Homo appears to evolve from the basis of a Pan-like fission–fusion type of organization, through increasing refinement and the addition of new and intermediate levels of social organization (Grove et al. 2012; Layton et al. 2012), not by shifting to some radically different type of group organization. Of particular interest is the community lifecycle of Pan (Moffett 2013, pp. 239–249; Andersson and Törnberg 2019, p. 89). Communities arise and expire in irreversible, and roughly symmetric, fission events when social conflicts spiral out of control (Goodall 1986; Furuichi 1987; Feldblum et al. 2018). This becomes progressively more likely if group size increases and overburdens social cognitive mechanisms for handling conflicts and maintaining cohesion (Dunbar 1992, 1993, 1998). These events are under-researched but appear to be inherent to social features shared between Pan and Homo.

The Social Protocell

The centerpiece of the SPH is the so-called “social protocell” model, whose name derives from the protocell model of how early cells arose via an ETI in a substrate of primitive RNA molecules (Gánti 1975, 1997; Michod 1983; Szathmáry 1986; Szathmáry and Demeter 1987; Szathmáry and Maynard-Smith 1995; Norris and Raine 1998). The claim is that the evolution of the sociont would have followed a similar pathway, but in a very different substrate. We will now briefly review the argument by Andersson and Törnberg (2019).

The social protocell is a set of circumstances that, as a side effect of their organization, creates the potential for selection to act on groups of traditions contained in social communities (see Fig. 1). This condition is claimed to be incompletely present in Pan communities today, and to thereby likely have existed also in early hominins (Andersson and Törnberg 2019, pp. 90–91; see also the previous section). The condition may be unpacked in terms of a system of three group-level evolutionary meta-functions: boundaries, reproduction, and heredity (see Table 1).

Fig. 1 The SPH proposes that social communities impose a group-level lifecycle on collections of traditions, in the same way that protocells did with regard to proto-biotic RNA genes. The figure compares idealized renditions of biological protocells with their proposed social counterparts to illustrate the parallelism.

Table 1 Mechanisms behind primitive evolutionary individuality in biological and social protocells

These functions potentiate the evolution of group-level systems of coadapted traditions that would substantially expand the range of solutions achievable by social learning. Quite simply, you can do more (and different) things with an emergent system of traditions than you can with single traditions. We here refer to such adapted integrated systems of cultural components as institutional (Richerson et al. 2016). The sociont, in other words, consists of institutional organization. To illustrate, we outline “the Oldowan carnivory institution” (see Footnote 2) in Fig. 2 as a potential example of an early (ca. 2.6–1.8 Mya) institutional system of coadapted activities, each in distinct domains and contexts, and each likely supported by separately transmitted traditional behavior (e.g., Roche et al. 2009).

Fig. 2 Consider a minimal rendition of “the Oldowan carnivory institution.” The components are behavioral traditions that occupy distinctly different regimes in terms of time, location, type of behavior, and materials used. Traditions such as obtaining raw material, tool production, and carcass processing are either pointless or impossible considered separately since they are adapted to be parts of an emergent system that, in turn, produces something useful for the sociont (via the hominins) as a whole.

Fitness on the level of the social protocell would be driven by the biological fitness contributed by traditions to the hominins maintaining them. If the hominins survived and reproduced at higher rates, so would traditions contained within that social protocell (Fig. 1). But if traditions provided comparably small advantages, the fate of social protocells would be decided mostly by other factors, including the vagaries of chance. We therefore need reasons to infer that some important and widely available target, for which sophisticated institutional strategies would yield a substantial advantage, was available to early Homo but not to Pan, where an ETI was never initiated. Moreover, for evolution not to get stuck at an early point, this target must have kept yielding advantages as more and more sophisticated institutional strategies arose.

Cracking nuts or fishing for termites may provide adaptive additions to the diet, but they will hardly cause chimpanzee communities to decisively outcompete their neighbors. More generally, the rainforest resources available to Pan occur patchily and in small packages. Beyond a certain point of sophistication, the returns to increasing investments will thereby diminish. By contrast, Homo is uniquely associated with a resource that could have provided a strong and persistent competitive edge if pursued using cultural institutions, namely large carcasses. The earliest modified stone tools (the Oldowan complex) would have been especially useful for processing soft tissue on carcasses, and large game carnivory went on to become highly developed and foundational to the lifestyle of Homo during the Pleistocene—across a widening variety of habitats and supported by sophisticated cultural systems (e.g., Stiner 2002; Bickerton and Szathmáry 2011; Whiten and Erdal 2012; Gintis et al. 2015).

But how would the social protocell get us from a Pan-like early hominin state to simple institutions? The australopithecine ancestors of Homo are believed to have gradually moved from facultative hunting of small animals (similar to Pan) toward obligate large game predation, presumably via facultative scavenging (for recent reviews see Thompson et al. 2019; Pobiner 2020). Although details are contested, most agree that this path involved ascending a gradient in risk and task complexity, driven by the benefits of accessing large packages of high-quality food, but facing a range of persistent hurdles, such as a need for processing and an increased exposure to predators and pathogens. Getting further in this direction would thereby put a premium on complex, coordinated, and cooperative strategies (see Footnote 3).

As in Pan today, the social protocell would have been there all along as a side effect of component behaviors, maintained because they were adaptive in their own separate ways. Its incidental effect of enabling cultural group selection would kick in gradually as it enabled evolution in directions that otherwise could not be taken. The earliest stage could, for example, have selectively preserved institutional elaborations on Pan-like traditional strategies, which perhaps involved late access to carcasses, without opposition, and the use of only immediately available materials such as unmodified rocks and bones. The social protocell would in this way have opened up an otherwise inaccessible cultural design space (see Stankiewicz 2000) that could keep expanding in co-evolution with its underpinning capacities in Homo.

The SPH proposes that the social protocell would thereby not simply boost cultural evolution but actually trigger the formation of the first-ever non-biotic unit of selection via an ETI. Key to understanding why we think an ETI may have unfolded is that the fortuitous meta-evolutionary functions imparted by the social protocell (i.e., boundaries, reproduction, and heredity; Table 1) could themselves come under cultural control and be adaptively expanded and refined by the cultural group selection that they enabled. This can be understood theoretically as the evolution of evolutionary individuality.

Applying Evolutionary Individuality Criteria to Human Culture

Background

A variety of criteria have been proposed by researchers to define and test whether candidate entities qualify as evolutionary individuals. In the following sections we will use the most commonly applied criteria as reviewed by Hanschen et al. (2017): spatial boundaries, informational uniqueness, informational homogeneity, indivisibility, group-level adaptations, division-of-labor, and the applicability of a specific kind of multilevel selection termed multilevel selection 2. These criteria identify features that are generated by and/or enabling of group-level selection, and that are thereby likely to arise during an ETI, but unlikely to be seen otherwise (in particular together and in a highly developed state). In this way we aim to test the SPH and systematically articulate the hypothesis in an empirical context.

For each criterion we first explain its role and importance with regard to evolutionary individuality. We then interpret these criteria in terms of cultural systems in Homo. We emphasize the Plio-Pleistocene origin of the sociont via the social protocell, its early evolutionary history (primarily in the Oldowan), and, finally, we consider trends across the evolution of Homo during the Pleistocene.

Spatial/Temporal Boundaries

Description

Boundaries constrain the components of lower-level units in ETIs, keeping them from diffusing between groups and from pursuing independent agendas that require free movement. For example, during the origin of cellular life, the protocellular lipid membrane kept autocatalytic chemical networks (the lower-level units in that transition) contained inside self-replicating vesicles (see Fig. 1). Being stuck together in this manner facilitated the evolution of cooperation and eventually the integration of dispersed genetic information into a genome (Jablonka and Szathmáry 1995; Maynard-Smith and Szathmáry 1995; Durand and Michod 2010); see also “the boomerang effect” (e.g., Dugatkin 2002).

Analysis

In “The Social Protocell” section, above, we claimed that collections of traditions were stuck together within the social protocell. The boundary in this case is a lack of social bridges across which transmission can happen between communities (Table 1). The social protocell and sociont are thereby primarily bounded in a social rather than a physical space, although for Pan, and frequently (but not always) also for Homo, this social space corresponds to a physical space in the form of a territory.

Three factors that cause robust and persistent containment of traditions in Pan are:

  1. Close and persistent social contact favors the transmission of traditions. Such contact is present within communities but rarely applies between communities (e.g., Goodall 1986; Wilson and Wrangham 2003; Boesch et al. 2008; Schel et al. 2013).

  2. Enculturated individuals cannot transfer freely between communities (e.g., Nishida et al. 1979; Pusey 1979; Wrangham 1979; Wilson and Wrangham 2003).

  3. Enculturated individuals that do transfer are poor vectors of traditions, for example, due to conformity bias (Whiten et al. 2005; Van De Waal et al. 2010; Haun et al. 2012; Luncz and Boesch 2014, 2015) and rank bias (Horner et al. 2010; Kendal et al. 2015; Watson et al. 2017).

Although positive evidence is unavailable, it is quite plausible that the factors listed above would have applied similarly to early hominins in a Pliocene primordial state. These are primitive examples of what Durham (1992) labels transmission isolating mechanisms, which were subsequently expanded and institutionalized during the evolution of Homo.

One reason to suspect that the containment of culture increased rather than decreased in Homo is that the more complex culture became, the more strongly its transmission must have relied on close and persistent social intimacy. On the level of social learning, cultural components tend to be opaque (indeed, often both to role model and learner; e.g., Tostevin 2007, 2019; Premo and Tostevin 2016) and their transmission reliant on specialized “pedagogical” adaptations (Gergely and Csibra 2006; Tehrani and Riede 2008; Csibra and Gergely 2009, 2011; Kline 2015; Gärdenfors and Högberg 2017; Laland 2017). On higher levels of organization, increasingly complex and integrated hominin institutions may have diffused less and less easily (Richerson et al. 2016, p. 5). First, to be functional, all essential components of an institution must be effectively transmitted. Second, the institution, in turn, will be integrated into some specific higher-order system of institutions, which means it will likely be much less adaptive elsewhere. Third, institutions, even more than focused skills, rely on liberal amounts of tacit knowledge whose function and even existence is unknown to the agents (Polanyi 1967; see Footnote 4).

Direct empirical evidence of the timing and details is hard to come by, but interdisciplinary analysis suggests that the evolution of intercommunity boundaries has a gradual and drawn-out history across the Pleistocene (Grove et al. 2012). Layton et al. (2012) analyze data on the displacement of stone used for making artifacts (from Féblot-Augustins 1997) from the Oldowan to the Upper Paleolithic, in conjunction with estimated community sizes (Dunbar 1993; Hill and Dunbar 2003), and ethnographic as well as modeling analyses of area use. They conclude that movements of stone raw material remained mainly within small face-to-face coordinated social units (congruent with Pan communities) at least through the late Middle Paleolithic (until circa 50 kya, although simple intercommunity institutions may have emerged locally before that; e.g., Blegen 2017; Brooks et al. 2018). Exchange of lithic material may be expected to follow networks of amicable social interactions, and the spread of lithic material should therefore overlap with the transmission of culture. In late Pleistocene and Holocene human societies, social boundaries are certainly under cultural control and exhibit complex specializations (e.g., via cultural kinship, marriage, mythology, etc.; see, e.g., Read 2012). Language in particular (e.g., via dialects) is a powerful boundary mechanism, and its function as such may have been an important factor in its evolution; see Moffett (2013, pp. 229–232).

Temporal boundaries are imposed by the irreversible community-level splitting dynamic, which the SPH views as analogous to cell division (see Fig. 1). Social protocells, and later socionts, thereby have beginnings and ends.

Summary

Boundaries in a social space here play the role that physical boundaries play in biology, and, although social spaces are frequently associated with physical territories, that is not always the case. These boundaries exist in Pan and thereby plausibly in early hominins, and rather than disappearing over time, they seem to have become more and more effective as barriers to culture, more institutionalized, and more subject to cultural adaptation.

Informational Uniqueness

Description

In the biological realm, informational uniqueness essentially means that each unit has its own independent genetic makeup. Such units may exhibit heritable individual differences, which promote evolutionary individuality by enabling group-level variation that selection can act upon. In the SPH case, the heritable information is cultural rather than genetic, and we see the sociont as informationally unique to the extent that it possesses its own stable, independent, and heritable set of traditions or cultural components.

Analysis

Henrich (2004) describes and reviews evidence for four mechanisms that promote what we here call informational uniqueness; see also Boyd and Richerson (2010) and Chudek and Henrich (2011). These mechanisms suppress within-group variability and increase between-group variability in human behavior, assuming the presence of boundaries between groups. The first is conformist learning, which increases the rate of horizontal spread of favored traditions within a community and prevents established traditions from dropping out. The second is prestige-biased learning (Henrich and Gil-White 2001; Jiménez and Mesoudi 2019), which further increases between-group variation (e.g., Boyd and Richerson 1987) since it selectively disfavors ideas originating outside of the group, while it permits some internal sources of variation that can break up conformist lock-ins. The third is punishment; i.e., that the biological agents accept the cost of punishing nonconformers, which greatly amplifies the stabilizing effect of conformism (e.g., Boyd and Richerson 1992). Finally, normative conformity represents conformity for purely social reasons, regardless of whether the behavior in question is otherwise useful or not.
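The qualitative effect of the first mechanism can be illustrated with a toy calculation. The sketch below is not taken from Henrich (2004); it uses a simple conformist-bias recursion of the kind common in cultural evolution modeling, p' = p + D p (1 - p)(2p - 1), where p is the within-community frequency of one of two variants of a tradition and D is a hypothetical parameter for the strength of the majority bias. The starting frequencies, the value of D, and the function names are all assumptions chosen for illustration, and the two communities are assumed to be fully bounded (no transmission between them).

```python
def conformist_step(p, strength):
    """One generation of conformist-biased learning for a two-variant tradition.

    p is the within-community frequency of variant A. strength (0..1) is a
    hypothetical parameter for how strongly learners over-copy the majority.
    """
    return p + strength * p * (1.0 - p) * (2.0 * p - 1.0)


def run(p0, strength=0.3, generations=200):
    """Iterate the recursion from starting frequency p0 and return the result."""
    p = p0
    for _ in range(generations):
        p = conformist_step(p, strength)
    return p


# Two bounded communities that start on opposite sides of 50 percent.
community_a = run(0.6)
community_b = run(0.4)
print(f"community A: {community_a:.3f}, community B: {community_b:.3f}")
```

Under these assumptions each community drifts toward its own local majority variant, so variation within communities shrinks while the difference between them grows, which is the pattern the text describes.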

Conformism, sensu lato, is widespread among animals with social learning (de Waal 2013) and may contribute to maintaining between-group variation in chimpanzees (Whiten et al. 2005; van de Waal et al. 2010, 2013; Haun et al. 2012; van Leeuwen et al. 2012; Luncz and Boesch 2014, 2015). However, it is a weaker force in Pan than it is in humans, who conform not only to gain access to better information, but also normatively in pursuit of social benefits (e.g., Van Leeuwen et al. 2013; Haun et al. 2014). We thereby deem it likely that some form of conformism was present to some degree in early hominin communities and increased in Homo across the Pleistocene.

Chimpanzees may exhibit a bias towards learning from individuals with a high rank and/or a track record of success (Horner et al. 2010; Kendal et al. 2015; Watson et al. 2017). If present in early hominins, this may have worked as an innate evolutionary starting point for “prestige” as a derived and culturally institutionalized version, buttressed by genetic adaptations (Henrich and Gil-White 2001). The other mechanisms reviewed by Henrich (2004) are weak, different, or absent in chimpanzees, and should consequently be viewed as derived during the evolution of Homo.

We argued above that spatial/temporal boundaries likely constrained horizontal (between-group) transmission before and (increasingly) during the evolution of Homo. Such boundaries to cultural dispersal, along with community-level heredity of sets of cultural components (via community-level splits; see “The Social Protocell” section), and the factors promoting informational uniqueness described above, indicate that communities may have diverged culturally over time due to selection or drift, rather than converging due to information flow—at least if cultural inheritance was sufficiently faithful.

Fidelity is important for the stable maintenance of informational uniqueness over time since it essentially bounds the amount of information that can be maintained by selection in a population (Eigen and Schuster 1977; Shea 2009; Andersson 2011, 2013). Notably, the fidelity of social protocell inheritance does not reduce to the fidelity of social learning, which may have been very low in early Homo (e.g., Tennie et al. 2017). Social protocell inheritance is simply the continuity of systems of specialized traditions through sociont reproduction events (see “The Social Protocell” section and Fig. 1). For example, say the abilities underpinning the Oldowan carnivory institution were, in some instance, distributed somewhat unevenly across a hundred hominins in a community. Fidelity may then be understood as the likelihood that, upon division, a sufficient number of instances of each needed tradition made it into the daughter communities to cause them to also feature this institution. This likelihood may be high also if the processes behind the traditions are transmitted with very low fidelity—as long as their functions are stable, which appears to potentially be the case (“Pan as a Proxy for Early Hominins” section).
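The notion of group-level fidelity in the example above can be made concrete with a small calculation. The sketch below is only an illustration under strong simplifying assumptions that are not part of the SPH itself: community members are assigned to the two daughter communities at random (real fissions follow social ties), carriers of different traditions are distributed independently of each other, and an institution persists in a daughter if at least `min_carriers` carriers of every component tradition end up in it. The function names and the carrier counts are hypothetical.

```python
from math import comb


def prob_both_daughters_retain(n_members, n_carriers, daughter_size, min_carriers=1):
    """Probability that, after a random split, BOTH daughter communities hold
    at least min_carriers carriers of a single tradition (hypergeometric)."""
    other_size = n_members - daughter_size
    total = comb(n_members, daughter_size)
    p = 0.0
    for x in range(min_carriers, n_carriers - min_carriers + 1):
        # x carriers land in daughter A, the remaining n_carriers - x in daughter B
        if x > daughter_size or (n_carriers - x) > other_size:
            continue
        ways = comb(n_carriers, x) * comb(n_members - n_carriers, daughter_size - x)
        p += ways / total
    return p


def institution_fidelity(n_members, carriers_per_tradition, daughter_size, min_carriers=1):
    """Probability that every component tradition survives in both daughters,
    assuming (for illustration only) that traditions are carried independently."""
    p = 1.0
    for n_carriers in carriers_per_tradition.values():
        p *= prob_both_daughters_retain(n_members, n_carriers, daughter_size, min_carriers)
    return p


# Hypothetical carrier counts in a community of one hundred hominins.
carriers = {"obtaining raw material": 20, "tool production": 10, "carcass processing": 30}
print(institution_fidelity(100, carriers, daughter_size=50, min_carriers=2))
```

The point of the sketch is that this probability depends on how many individuals carry each tradition and on how the community divides, not on how faithfully any single learner copies a model, which is why group-level fidelity can be high even when individual social learning is imprecise.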

Some degree of group-level informational uniqueness may thereby be theoretically expected in an early hominin Pan-like state. There is also some support for this in field studies where substantial between-group variation in traditions has been found between chimpanzee communities (e.g., Whiten et al. 1999, 2001; Sanz and Morgan 2007; Schöning et al. 2008; Lycett et al. 2009; Boesch 2012; Koops et al. 2015; van de Waal 2018; Kaufhold and Van Leeuwen 2019; Kalan et al. 2020).

Direct verification of informational uniqueness in Pleistocene Homo is challenging to obtain. Evidence is poorly synthesized and a coherent picture is lacking (Kuhn 2020). Numerous individual studies, however, support an overall picture of ancient and persistent geographical cultural heterogeneity. Analysis of traces of butchering techniques at Bolomor Cave and Gran Dolina (Middle Pleistocene) shows evidence of persistent group-specific patterns that vary across time and space in ways that are not obviously functionally relevant (Blasco et al. 2013). Intercommunity technological variation has also been inferred archaeologically in the earliest Oldowan (at Gona; see Stout et al. 2010, 2019), and Chinese stone tool industries between 300 and 40 kya yield evidence of persistent regional cultural distinctiveness, despite their apparent simplicity (Bar-Yosef and Wang 2012; Gao 2013). Foley and Lahr (2011) moreover suggest that cultural transmission by expansion of groups best explains observed patterns of geographic cultural variation over the past 100,000 years.

Summary

Taken together, informational uniqueness on the level of communities is present to some extent in Pan, and clearly present in more recent Homo. It is also indirectly suggested to have been present, and to have increased, during the evolution of Homo.

Informational Homogeneity

Description

Informational homogeneity is maximized when all lower-level units in a biological individual carry the same genetic information. The fitness interests of the lower-level units are then aligned by the fact (and to the extent) that it makes no evolutionary difference which unit reproduces (Hamilton 1964a, b). Informational homogeneity thereby promotes group selection and provides an ideal setting for cooperation and so-called “fraternal” ETIs (Queller 1997) on the basis of kin selection (e.g., Maynard-Smith and Szathmáry 1995; Michod 1999), such as when all cells in a clonally developing multicellular organism are genetically identical. In contrast, a low degree of homogeneity promotes competition and selection at the lower level, which can be problematic for the emergence of evolutionary individuality at the higher level.

Analysis

While informational homogeneity plays a central role in fraternal ETIs, an evolutionary individual that arises via an “egalitarian” ETI (Queller 1997; such as early cells arising from different species of RNA genes, or the eukaryotic cell arising from bacteria and archaeans) is inherently informationally heterogeneous. This egalitarian type of ETI would also be the best match for the SPH since different types of traditions and cultural components, specialized in different tasks, are clearly underpinned by different sets of information. Components of culture are thereby more analogous to specialized types of genes becoming integrated into a genome than they are to clonal cells becoming integrated into a multicellular organism.

In our case, the informational homogeneity criterion is therefore neither expected to be met from the outset, nor to emerge as an outcome of the ETI. But the criterion is still important since the analysis tells us that lower-level competition will remain a problem if the scope of group selection is to keep expanding via the evolution of evolutionary individuality. During an egalitarian ETI, conflicts are resolved by the evolution of conflict modifier mechanisms (Michod and Nedelcu 2003). We should expect to find examples of this type of mechanism in a sociont, as we do in cells.

If competition between traditions in the emerging sociont is analogous to competition between genes in the emerging cell, then cellular mechanisms for managing and suppressing genetic conflict may offer guidance. The integration of independently replicating RNA replicator/interactors into a genome is arguably the most central evolutionary innovation in this regard (Maynard-Smith and Szathmáry 1993). The chromosome is a specialized monopolistic group-level replicator whose operation is based on, but not reducible to, the original lower-level replication mechanism of independent genes. It replicates its genetic units, and thereby the group-level genetic structure and proportions, in a centralized and controlled manner once every lifecycle (Jablonka and Szathmáry 1995; Maynard-Smith and Szathmáry 1995; Durand and Michod 2010; Ågren 2014). This produces a setting where parasitic genes are suppressed since they need to become part of the chromosome to gain access to replication.

The evolution of the chromosome may exhibit a suggestive parallelism with the highly structured and institutionalized enculturation process that emerged in Homo (e.g., Read and Andersson 2019, pp. 2–3); see also the discussion of related processes by Smaldino (2014, pp. 250–251). A normative canon of cultural knowledge is here transmitted in a structured and cumulative sequence, following a modified and expanded process of physiological development (childhood), using adaptations for cultural transmission that are unique to Homo (e.g., Thompson and Nelson 2011, 2016; Han and Ma 2015). Such an integration and monopolization of cultural heredity would stabilize higher-level cultural organization and make it more heritable, but it would also cause parasitic elements to be less likely to spread and disrupt the function of a sociont. The ability of cultural components to reproduce would become tied to admission into such a canon, which may require, for example, fitting functionally and logically into the prevailing system of customs and skills, and, not least, being considered part of the norm (see, e.g., contributions in Roughley and Bayertz 2019). Without such a normative centralized system, new traditions could suddenly arise to exploit some feature of the sociont or the hominins (cognitively or psychologically) to spread and disrupt the integrated function of the sociont.

Summary

The sociont does not exhibit informational homogeneity, and it is not predicted by the SPH to do so. Provisionally an example of an egalitarian ETI, it should be expected to instead exhibit derived adaptations for suppressing lower-level competition between cultural components.

Indivisibility

Description

Indivisibility means that one cannot separate the parts out from the whole and maintain the functional properties of the whole. This increases the likelihood that selection will act on the integrated unit rather than on separate parts. It is therefore indicative of evolutionary individuality if separated subunits do not maintain properties of the whole and cannot survive on their own outside the group context (Michod 1999, 2007).

One mechanism by which indivisibility can emerge is when components specialize and lose vital features that are taken over by other specialized parts; see also the “Evolutionary Division of Labor” section, below. For example, cells in differentiated multicellular organisms have specialized in varied internal functions in the organism, and lost the ability to reproduce and survive independently in this process. The same fate has befallen bacterial mitochondria and plastid endosymbionts of eukaryotic cells, and, likewise, the specialized castes of social insects. Once this has happened, the fitness of one component is dependent upon other components of the group. Indivisibility indicates a low level of conflict between lower-level units since the dependencies that make the individual indivisible act to align fitness interests on the lower level.

Analysis

Since the sociont is composed of interacting cultural components, indivisibility means that components on any level of organization are unlikely to function well outside of the group context; see also the “Informational Uniqueness” section, where some problems pertaining to intercommunity transmission of cultural subunits were discussed. As a limit case, we may readily establish that institutions in modern-day human societies make little sense on their own. We could not take the institutions of a society and create functioning societies each using subsets of the parts—banking in one, police in the other, daycare in the first, and so on. There has to be a relatively full set of complementary functional units in place. This logic permeates the entire internal hierarchy. Dividing institutional units on any level in this way incurs exactly the same set of problems, and the principle is particularly clearly expressed in technological systems, which are eminently indivisible.

This form of indivisibility is inherent to modular adapted systems, which we argued may have emerged early (circa 2.6–1.8 Mya); see the “Oldowan carnivory institution” (“The Social Protocell” section and Fig. 2). By contrast, sets of traditions contained in Pan communities are divisible in principle. Removing or adding one traditional practice—such as nut cracking or ant dipping—is unlikely to affect the function or transmission of other traditions since they lack interdependencies. The same would be true for a collection of early RNA species compartmentalized by a lipid membrane boundary. However, due to the protocellular dynamics (see Fig. 1 and Table 1) they would still not be divided regularly in practice. They would typically remain together, which would favor the evolution of dependencies, and thereby actual indivisibility.

Finally, while the division of socionts happens via the division of social groups of hominins, the former should not be confused with the latter (see “The Social Protocell” section and Fig. 2). Strictly speaking, the sociont never really divides (see “Analysis” of the “Informational Uniqueness” section). It simply persists in the daughter communities after a split if hominins carrying all necessary traditions make it across. Notably, this operation is less straightforward in the presence of division-of-labor between hominins (craft specialization). Cliques of hominins would then carry different parts of the cultural information, and additional mechanisms would be needed to ensure that hominins of all specializations make it into both daughter communities. Craft specialization is a feature of sedentary cultural communities, and despite good reasons for believing it to be generally adaptive on a group level (e.g., Henrich and Boyd 2008), it emerged only very late in prehistory.

Summary

Sets of animal traditions contained in the social protocell are divisible in principle but rarely divided in practice. Actual indivisibility would then have arisen during the evolution of the sociont, with the sociont becoming increasingly indivisible the more complex its institutional organization became.

Group-Level Adaptations

Description

Group-level adaptations provide evidence of group selection but identifying them can be challenging. Groups can have features that look like group-level adaptations but that really are properties driven by selection on the lower level that filter up to the level of a group (Shelton and Michod 2014, 2020). Williams (1966) illustrated this by way of describing how a “fleet herd of deer” is really just a “herd of fleet deer” where the group-level property may be described as a “fortuitous benefit” (Williams 1966) or a “cross-level byproduct” (Okasha 2006) of lower-level properties.

Key to telling true group-level adaptations from cross-level byproducts is to determine whether fitness has truly been “exported” from the lower level to the group level, or if the fitness of the group is simply an aggregative property of lower-level traits (Michod and Herron 2006; Michod 2007). In other words, have the lower-level units sacrificed their fitness as independent individuals in return for a greater contribution of fitness via the higher level? We may subject claims of group-level adaptation to a test by asking whether carrying implicated traits would cause the lower-level entities to suffer a reduction of fitness if they left the context of the group. Being fleet, for example, fails this test since being fleet would not be detrimental to a deer if it left the group.

Analysis

Do the components of cultural systems have properties that would cause them to have lower fitness if they left their cultural context? Are such properties linked to adaptive properties of the cultural system that they benefit from being part of? If so, we may be looking at cultural traits selected on the group level.

Richerson et al. (2016) conclude that institutions are group-level features; see also Smaldino (2014). Although their analysis is mostly set in relatively recent times, we think a similar argument can be made from very early on. If we pose the question formulated above about the Oldowan carnivory institution (Fig. 2), we find that its constituent components will certainly suffer if moved to another setting, and that they will do so because of how they are adapted to serve their roles in the institution as an integrated whole. For example, making stone tools without the knowledge of how to obtain animal carcasses would be minimally beneficial and perhaps maladaptive. This contrasts with animal traditions (such as nut cracking or termite fishing among chimpanzees) which would seem to be equally adaptive regardless of the context of other traditions. Institutions then become more and more prevalent and complex the closer we get to modern times.

Summary

Group-level adaptations are absent (or marginal and not so far detected) in Pan communities. They seem to have arisen early during the evolution of Homo, whereafter they increased in complexity, integration, and importance. Institutions such as large game hunting may be early group-level traits.

Evolutionary Division of Labor

Description

To avoid confusion, let us first state that division of labor in anthropology (and in social science generally) refers specifically to craft specialization; i.e., the division of tasks and specializations between human individuals (e.g., Kuhn and Stiner 2006). Evolutionary theory uses a more general understanding of division of labor as a division of tasks and specializations between any types of components. For example, differentiated multicellular organisms exhibit division of labor between specialized cell types, cells exhibit division of labor between organelles, and social insect colonies exhibit division of labor between castes. Since cultural components rather than hominins form the parts of the sociont, we here refer to division of labor as occurring between cultural components.

Evidence of division of labor in an entity is evidence of “near-decomposability,” which is a universal principle of organization and design (see, e.g., Simon 1962; Wimsatt 1975; Marengo and Dosi 2005; Andersson and Törnberg 2018, pp. 129–31) where specialized functions are broken down into a level hierarchy of complementary sub-functions. This modular organization greatly simplifies and structures internal organization, and is evident in all but the very simplest adapted entities. It is simultaneously an outcome and a precondition for the evolution or design of complex adaptive organization. In an evolutionary context, division of labor is thereby evidence that fitness has been exported to the level of the group, and that the new higher-level entity has gained substantial evolutionary individuality.

Analysis

Archaeology robustly reveals a trend toward deepening division of labor between components of culture, both as observed in the products of culture (e.g., complex technology; see Querbes et al. 2014; Haidle et al. 2015), and in what we refer to as institutions; see also Smaldino (2014). Under the criteria of Indivisibility and Group-level adaptations we have already described several examples of the latter, and the principle is clearly manifested in the Oldowan carnivory institution (Fig. 2), whose components are specialized in focused sub-tasks, most of which make sense only together with the other coadapted components.

This trend of diversification, hierarchization, and increasing narrowness of specialization of cultural components has continued and accelerated into the present (e.g., expressed as diversity of cultural products; see Beinhocker 2006). Homo is qualitatively different from other animals, including Pan, in this regard. The traditions maintained by chimpanzees exhibit diversification but not integration of function beyond what can be achieved cognitively in creative problem solving (within the “zone of latent solutions”; e.g., Tennie et al. 2009; Reindl et al. 2018). We take this to imply that pre-Oldowan hominin traditional repertoires (like those of Pan) likely did not exhibit division of labor.

Notably, as the sociont is the outcome of an egalitarian ETI (see “Analysis” of the “Informational Homogeneity” section), the pattern of division of labor in the sociont should more closely resemble what we see in unicellular organisms than what we see in multicellular organisms. Without informational homogeneity (see above) there is, for example, no basis for the differentiation between reproductive and somatic components that plays a key role in fraternal ETIs (e.g., single- to multicellular) by strongly suppressing lower-level selection.

Summary

Our observations lead us to conclude that cultural evolutionary division of labor arose early and deepened during the evolution of Homo, and that it is not evident in other species maintaining traditions, including Pan.

Multilevel Selection 2

Description

Damuth and Heisler (1988) seminally described a subdivision in the debate about multilevel selection in terms of two types of models: Multilevel Selection 1 (MLS1) and Multilevel Selection 2 (MLS2). They characterized these as follows (“individual” here corresponds to our use of the term “lower-level”):

The criteria for MLS1 are as follows:

  1. “Group selection” refers to the effects of group membership on individual fitness.

  2. Fitnesses are properties of individuals and group fitness is an aggregative property of individual fitnesses.

  3. Characters are values attributed to individuals (including both individual and contextual characters—see below).

  4. Populations consist of individuals, organized into groups.

  5. Explicit inferences can be made only about the changing proportions of different kinds of individuals in the whole population (the meta-population).

The criteria for MLS2 are as follows:

  1. “Group selection” refers to change in the frequencies of different kinds of groups.

  2. Fitnesses are properties of groups.

  3. Characters are values attributed to groups (including both aggregate and global characters).

  4. Populations consist of groups, composed of individuals.

  5. Explicit inferences can be made only about the changing proportions of different kinds of groups in the population.

In essence, MLS2 models correspond to groups that are well on their way to becoming evolutionary individuals (Okasha 2006), while MLS1 models correspond to groups whose members do not constitute parts of a group-level individual, and where we cannot speak of group-level adaptations. Okasha (2007) furthermore remarked that there is a characteristic temporal ordering where MLS1 may turn into MLS2, so both types of dynamics may be in place at the same time.
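The difference in bookkeeping between the two model types can be illustrated with a minimal Python sketch. It is not taken from Damuth and Heisler (1988); the toy metapopulation, the variant labels "A" and "B", and the majority rule used to assign a group-level character are all hypothetical choices made only for illustration.

```python
from collections import Counter

# Hypothetical toy metapopulation: each group is a list of lower-level
# entities (here, variants of a tradition labelled "A" or "B").
groups = [
    ["A", "A", "B"],
    ["A", "B", "B"],
    ["A", "A", "A"],
]

def mls1_frequencies(groups):
    """MLS1 bookkeeping: proportions of kinds of individuals in the
    whole metapopulation, regardless of which group they belong to."""
    counts = Counter(ind for group in groups for ind in group)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}

def group_kind(group):
    """Assumed group-level character: the majority variant in the group.
    Any other rule for classifying groups could be substituted."""
    majority, _ = Counter(group).most_common(1)[0]
    return majority + "-majority"

def mls2_frequencies(groups):
    """MLS2 bookkeeping: proportions of kinds of groups in the
    population of groups."""
    counts = Counter(group_kind(group) for group in groups)
    total = len(groups)
    return {kind: n / total for kind, n in counts.items()}

print(mls1_frequencies(groups))
print(mls2_frequencies(groups))
```

In the first function the unit being counted is the lower-level entity; in the second it is the group. Under MLS1, selection would be tracked as change in the first set of proportions, and under MLS2 as change in the second.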

Analysis

The ETI proposed by the SPH begins from an MLS1 scenario in which traditions are not integrated as parts of a higher-level system, but simply happen to be organized into groups as a by-product of early hominin group behavior (see “The Social Protocell” section). The fitness of such a group is an aggregative outcome of the effects of the different traditions since any interactions between them are minimal and likely not synergistic. As a result of these minimal interactions, characters and fitnesses may be assigned to individual traditions but not to groups thereof, except in the simple aggregate sense.

As the ETI progresses, institutional organization is argued to appear, as exemplified by the Oldowan carnivory institution (Fig. 2). Traditions here become specialized parts of larger functional systems. The adaptive functions of such institutional structure (such as contributing meat or other resources) are emergent, as they are determined by complex, nonlinear interactions between its traditional components. We may thereby speak increasingly of fitness on the cultural group level rather than on the cultural component levels, and we are more and more inclined to speak also of properties of these systems as wholes. Since traditions interact and co-occur, change on the levels of institutions and socionts can also be characterized across time and space. In other words, traditions start out as independent entities, but they end up as specialized parts of larger systems.
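The contrast between aggregative and emergent group fitness can be made concrete with a small Python sketch. The component names, the numerical contributions, and the two functional forms (a sum for independent traditions, a product for interdependent ones) are hypothetical assumptions chosen for illustration; they are not estimates and not a model proposed in the literature cited here.

```python
from math import prod  # available from Python 3.8

# Hypothetical performance contribution of each tradition (0 to 1).
contributions = {
    "tool_making": 0.8,
    "carcass_access": 0.7,
    "transport": 0.9,
}

def aggregative_fitness(present):
    """MLS1-like case: traditions act independently, so the group value
    is simply the sum of the contributions of the traditions present."""
    return sum(contributions[t] for t in present)

def emergent_fitness(present):
    """Institution-like case (assumed form): every component is needed,
    so the group value is a product that collapses to zero when any
    one component is missing."""
    return prod(contributions[t] if t in present else 0.0
                for t in contributions)

full = set(contributions)
without_tools = full - {"tool_making"}

for label, present in [("full set", full), ("no tool making", without_tools)]:
    print(label, aggregative_fitness(present), emergent_fitness(present))
```

Under these assumptions, removing tool making lowers the additive total only by that tradition's own contribution, whereas the product drops to zero. This is the sense in which a component of an integrated institution has little value outside its context, while an independent tradition keeps its value whatever else is present.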

Summary

The observations that we have made in prior sections support an interpretation where an MLS1 situation gradually turns into an MLS2 situation during the evolution of Homo. We deem that the sociont is consistent with an MLS2 framework, although, as in other egalitarian ETIs, selection on the original level is never fully eliminated.

Results

In Table 2 we summarize our findings along with biological examples for comparison. The eukaryotic cell (like the sociont) stems from an egalitarian ETI (eukaryogenesis) and is also the starting point of a fraternal ETI, namely the evolution of multicellular organisms. Colonial organisms that develop clonally (e.g., the volvocine green algae Eudorina elegans) represent an intermediate stage in this transition. The Pan/early hominin social protocell is a pre-ETI starting point exhibiting only pre-adapted evolutionary individuality. Early Homo represents the social protocell once it is undergoing an ETI. Homo represents the sociont, which is the proposed result of the cultural ETI: an integrated cultural unit that fulfills most of the individuality criteria.

Table 2 Comparative summary of the application of criteria for evolutionary individuality (see also corresponding sections)

For each applied criterion there is also a theoretical expectation that may be argued from the standpoint of the evolutionary trajectory that is invoked. For an egalitarian ETI beginning from “a protocellular situation,” we may predict as follows:

At the social protocell stage, informational uniqueness should be fulfilled at least to some extent since this is necessary for selectable variation (this is also referred to as “between-group variation”). Social protocells are also expected to not be informationally homogeneous. Many types of lower-level entities coexist stably within the social protocell as they do not compete directly (e.g., traditions aimed at different foraging tasks). A social protocell is divisible in the sense that the fitnesses of lower-level entities do not depend on the presence of other types of such entities. Although internally heterogeneous, the lower-level entities (individual traditions) are not initially organized into systems and exhibit neither division of labor nor group-level adaptations at the outset. On the same account, groups of traditions in social protocells do not exhibit MLS2. The social protocell is by definition expected to exhibit temporal and spatial boundaries.

Beyond the protocell, an ETI should lead to all these criteria being fulfilled to an increasing degree as a result of selection for evolutionary individuality. The exception is informational homogeneity, whose functional effects will be achieved by the evolution of other conflict modifiers. The multicellular organism (our example of a fraternal ETI; Table 2) should, on the other hand, be expected to also exhibit informational homogeneity (see the “Informational Homogeneity” section).

We find that our assessments match up with theoretical expectations, as well as with the biological examples that we used for comparison. Our findings are consistent with the SPH, according to which a cultural evolutionary individual emerged as the outcome of an ETI that may be described as egalitarian.

Discussion

We tested the SPH by subjecting it to a range of criteria developed to identify evolutionary individuals (Clarke 2013; Hanschen et al. 2017). Many of these criteria correspond to mechanisms that promote selection at higher levels (Clarke 2013; Hanschen et al. 2017). We examined Pan as a proxy for early hominins, evidence of culture in early Homo, and later Homo cultural communities. We found that later Homo cultural systems satisfied more individuality criteria than did those of early Homo, and that early Homo cultural systems in turn satisfied more individuality criteria than did Pan.

Taken together, our analyses indicate that evolutionary individuality arose, and subsequently increased, in cultural systems during the evolution of Homo. Since the features that we have tested for are unlikely to arise and become highly developed for other reasons, this supports the hypothesis that deep-seated similarities exist between the evolutionary provenances of human culture and biological organisms.

But the idea that human culture would have more than superficial similarities with biological organisms is clearly controversial in biology as well as social science (see, e.g., Dunn 2016, pp. 11–31 for a review). We will therefore end by discussing the evolutionary individuality of human culture in the context of four salient differences between human culture and biological organisms.

Let us first briefly remark on what sort of similarities and differences the SPH should lead us to expect. The SPH implies that human cultural systems and biological organisms represent outcomes of the same type of evolutionary process—an ETI—operating in two radically different substrates, namely socially learned behavior and biochemistry (i.e., in the spirit of “general” or “universal” evolutionary theory; see, e.g., Campbell 1974; Hull 1980; Dawkins 1983, 1992; Cziko 1995; Aldrich et al. 2008; Andersson 2008). The expectation is thereby that differences in outcome are attributable to differences in substrate rather than to fundamental differences in the evolutionary process.

First Difference: Are Cultural Communities More Like an Ecosystem?

Recent societies are more often likened to ecosystems than to organisms, and ecosystems would not meet the individuality criteria discussed in the present article (Huneman 2014). Like ecosystems, recent societies exhibit the whole range of ecological relations, including competition, neutralism, parasitism, commensalism, and amensalism (see, e.g., Sandén and Hillman 2011). This raises the question as to whether the SPH overstates the similarity between cultural communities and biological organisms.

The time frame of the sociont (and of this study) is important in this context. The SPH places the base of the ETI at circa 2.0–2.5 Mya, at which time (and earlier) face-to-face coordinated social communities, strongly bounded upward in size by cognitive capacity (e.g., Dunbar 1993; Hill and Dunbar 2003), were the top level of social and cultural organization. This seems to have remained the case until some 50–100 kya when larger and more aggregated social units arose and became dominant during the Late Pleistocene (e.g., Moffett 2013, 2019; see also Spatial/Temporal boundaries). During the Holocene, cultural and social organization kept expanding dramatically in level upon level.

The sociont coincides with ancestral cultural communities of the older and smaller style. The more recent and larger aggregates would have required institutions extending between and above the level of the sociont to handle intercommunity conflicts (Gat 2010; Wilson 2013). Embedded in such institutions, the original sociont would need to adapt to new and changed roles. These aggregated cultural units would also be much larger, and there would be fewer of them, which would inhibit group selection on levels above that of the sociont (Traulsen and Nowak 2006). If anything, selection on cultural groups would thereby have waned in importance in an increasingly fluid multilevel organization, with fewer institutional checks on non-cooperative interactions (a “wicked” system; see Andersson and Törnberg 2018; Andersson et al. 2014b).

There is no reason to think that these more recent higher- and multilevel societies would be organized in the same way as the sociont components that they once emerged from. The suitability of ecological models to recent society thereby does not contradict the suitability of an organismal model for ancient societies. The first difference thereby primarily stands out in a comparison between recent human societies and biological individuals.

Second Difference: Internally Generated Adaptive Traits

While biological adaptation mainly operates on randomly generated changes in genes and genotypes, adaptive traits in cultural communities (including heritable features) frequently arise by internal innovation. Hominin creativity, and trial-and-error on a fine level of resolution, reacts much more rapidly than selection could have worked on variation on the sociont level. This internal and goal-directed nature of some cultural change raises the question: to what extent do we need cultural group selection to explain cultural adaptation?

Let us look at cultural evolution in action in more detail. High-resolution examples are hard to find in the deep past, so we will consider the emergence of a major institution across the interface between the late Epi-Paleolithic and the earliest Neolithic. Stiner et al. (2014, 2021; see also Munro et al. 2018) have described in detail how sheep domestication arose over the course of more than a millennium at Aşıklı Höyük (AH; Central Anatolia), transitioning from management of wild populations to fully domesticated animals alternately penned and herded to distant pastures (see also Abell et al. 2019). What Stiner and colleagues describe is a multigenerational process where novel solutions, in an iterative and cascading manner, produce new problems to solve and opportunities to pursue. For example, penning sheep within the settlement reduces losses to predators, but creates additional problems, such as with pests. These cascades of change propagate through society as a whole, leading in the end not only to a new institution, but to the integration of this institution as a functional component of the internal organization of the cultural community as a whole.

Several generalizable observations can be made in this example. First, humans here engage collectively in a dynamical and creative innovation process where solving problems and pursuing opportunities generates variation in cultural components, leading to cascades of transformations, and thereby to new problems and opportunities (Andersson et al. 2014a; Lane 2016). Second, while this new institution arises as a result of human problem-solving capabilities, there is no evidence (nor reason to expect) that ideas about the outcome—an integrated system of cultural knowledge and practices making up a pastoral economy—guided the actions taken (which applies to domestication generally; see, e.g., Zeder 2012, 2015). Humans here built an institution they cannot possibly have understood, they integrated it into a larger cultural system (that they would have understood even less), and it still worked splendidly; see also Lansing’s (1987) description of Balinese rice growing communities. Third, and most importantly, the cumulative innovative steps taken did not represent the selective replacement of sociont variants in a larger population, nor is it reasonable to believe that populations of variant institutions were maintained simultaneously within the community.

One possible SPH interpretation would be to view this (at least partly) as a developmental rather than evolutionary process. That is, to see innovation as the development of societal organization via a process that in turn is based on heritable cultural information, such as via what Heyes (2018) describes as “cognitive gadgets”; see also Ardila (2018), with writing and mathematics used as examples. Compared with biological counterparts, the degree of sociont developmental plasticity would truly be exceptional, but, then again, the affordances of a cultural organism would be radically different from those of a biological organism. Mechanisms for altering phenotypic expression via a flexible developmental process have clearly been strongly selected for, and played an important role, in biological evolution (e.g., West-Eberhard 2003; Sterelny 2011; LaFreniere and MacDonald 2013). In the sociont context, Andersson and Törnberg (2019) argued that one of the major advantages of an environmentally responsive and integrated cultural system could have been to leverage the high flexibility that generally is a trademark of great ape behavior (e.g., Ungar et al. 2006; Malone et al. 2012); see also, e.g., Fogarty et al. (2015) and Fuentes (2017b). That proposition dovetails with the “variability selection hypothesis” (Potts 1998, 2012; Grove 2011b, 2011a; Maslin et al. 2014, 2015) which argues that high levels of environmental variation during the early Pleistocene would have strongly favored any ability to rapidly reconfigure one’s behavior.

If so, the main targets of sociont selection and adaptation would not have been the detailed manifestations of culture, but something more toward the image of cognitive gadgets (Ardila 2018; Heyes 2018); i.e., fundamental systems that underpin our capacity to adapt culturally by governing how we think and by harnessing the capacity of our large brains (Andersson and Törnberg 2019, p. 86). For example, the people at Aşıklı Höyük would have been in a position to embark on their transformation into a Neolithic society (once facing Holocene conditions; e.g., Richerson et al. 2001) because earlier cultural communities (and hominins) in that lineage had accumulated, across hundreds and thousands of millennia, a richer and richer system of basic cognitive tools for dealing successfully with environmental change in general.

Basic and stably present systems like these would over time be expected to coevolve with Homo to form “hybrid” systems, whose genetic and cultural elements would become so closely intertwined that clearly classifying them as either genetic or cultural is impossible. Language could provide the most salient example of such a system. Language is underpinned by genetic adaptations (e.g., Mozzi et al. 2016), but we learn languages, and they evolve also in their own right (e.g., Croft and Cruse 1987; Mufwene 2001; Greenhill et al. 2010). Language is also far more than just a means of communication. It structures the complexities of human cognition and psychology, which, essentially, makes it an operating system for our large and expensive brains. It is widely agreed that language must have evolved gradually from a protolanguage. But while a protolanguage would have been useful, or even necessary, for early Homo (e.g., Bickerton 2009; Bickerton and Szathmáry 2011; in the context outlined in Fig. 2), it is still hard to account for its original emergence in terms of benefits accruing to individual hominins. The sociont would be expected to drive the evolution of precisely this type of complex and shared cultural system that would become part of the selective environment of Homo.

Third Difference: Lower-Level Selection

Cultural evolution as selection acting on populations of variants of cultural components that arise and spread within cultural communities is a well-researched and central theme in cultural evolutionary studies (Cavalli-Sforza and Feldman 1981; Boyd and Richerson 1985, 2005; Durham 1992; Mesoudi 2007, 2011). This raises the question of how much selection still operates on lower-level cultural components, and how this squares with the notion that lower-level selection would have been suppressed during an ETI.

We have identified multiple mechanisms that inhibit lower-level selection, as many of the individuality criteria correspond to mechanisms that promote higher-level selection and/or reduce lower-level selection (Clarke et al. 2013; Hanschen et al. 2017). But egalitarian ETIs never eliminate lower-level selection completely (see also the “Analysis” of the “Multilevel Selection 2” section). Lower-level selection could still be occurring within the sociont, as it does within biological individuals that have emerged in egalitarian ETI. For example, selection on genes is not completely suppressed in cells (see, e.g., Ågren 2014).

Lower-level selection may also belong to sociont mechanisms adapted specifically to increase the capacity to respond to the environment (i.e., as part of the types of mechanisms discussed as developmental in the previous section); see also Ziman (2000). Employing selection in such a role would not lack precedents. There are two major examples of biological organs that operate as Darwinian systems based on staged and adapted implementations of “blind-variation-selective-retention” (BVSR; Campbell 1960): the adaptive immune system and the brain (e.g., Jerne 1955; Changeux 1985; Michod 1988; Edelman 1993; Fernando and Szathmáry 2010; Müller et al. 2018). The function of these organs is precisely to provide the biological organism with capabilities for responding to the environment on timescales that are too short for genetic adaptation; for example, creativity, learning, and the ability to survive the onslaught of pathogenic microorganisms with much shorter generation times.
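The logic of BVSR can be illustrated with a minimal computational sketch. The Python code below is a toy illustration only, not Campbell's (1960) formulation and not a model of the immune system, the brain, or the sociont: the function names (`bvsr`, `toy_fitness`, `toy_mutate`), the one-dimensional search space, and all parameter values are assumptions introduced here for illustration. Variants are generated without any information about which direction improves fitness (blind variation), compared with the current variant (selection), and kept only if they are at least as good (retention).

```python
import random


def bvsr(fitness, mutate, start, steps=200, seed=0):
    """Toy blind-variation-selective-retention loop (illustrative assumption)."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    current = start
    history = [fitness(current)]
    for _ in range(steps):
        candidate = mutate(current, rng)             # blind variation
        if fitness(candidate) >= fitness(current):   # selection
            current = candidate                      # retention
        history.append(fitness(current))
    return current, history


def toy_fitness(x):
    # Hypothetical environment: fitness peaks at x = 3.0.
    return -(x - 3.0) ** 2


def toy_mutate(x, rng):
    # Variation is "blind": the Gaussian step ignores the fitness function.
    return x + rng.gauss(0.0, 0.5)


best, history = bvsr(toy_fitness, toy_mutate, start=0.0)
print(f"start fitness: {history[0]:.3f}, final fitness: {history[-1]:.3f}")
```

Because only variants that are at least as good as the current one are retained, the recorded fitness never decreases, even though no single variation step is directed toward the optimum. The short timescale of such a loop, relative to change in whatever system stages it, is the feature the text attributes to the adaptive immune system and the brain.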

Fourth Difference: Boundaries and Manifestation

Biological individuals tend to be physically cohesive, and individuality criteria such as spatial/temporal boundaries and indivisibility are easy to interpret in terms of physical boundaries. The sociont, however, must be imagined largely in other spaces, such as social and ideational spaces. Are such boundaries equivalent to those in biology? We described sociont boundaries in the “Spatial/Temporal Boundaries” section but expand our discussion here with a tentative description of how a sociont would manifest itself, along with Homo, in cultural communities (see also the “Second Difference: Internally Generated Adaptive Traits” section).

On the basis of the analysis in this article, we propose that the phenotypic manifestation of the sociont may be pictured as a stationary and organized pattern of behaviors, cultural products, and environmental modifications—coincident with, and maintained by, but not identical to, a social community of hominins. It would be generated and maintained by the dynamical and parallel expression (by hominins) of cultural components, most of which are likely tacit. This emergent pattern would unfold in time and space as the expression of cultural components regulated the expression of other components (within and between brains) via social interactions, cultural products, and environmental modifications. Expressed cultural components would act by modulating hominin behavior via psychology, cognition, and metacognition.

The stationary structure of this dynamical pattern may be conceptualized schematically (in the manner of an organizational chart) as a nested hierarchy of functional subsystems, such as those for hunting, fishing, tracking, knapping, and pyrotechnology, but also strategies for things like teaching, distributing resources, resolving conflicts, and so on. This organization may be unpacked all the way into the individual brain, where culture interfaces with our psychology and cognition. In terms of extent, this system reaches only as far as the social interactions of its carriers; i.e., it has a boundary, and, since culture shapes social interactions, the nature and extent of this boundary are evolvable as a part of the system itself.

Moreover, the above description of the sociont potentially dovetails with other models of the dynamics and organization of culture within as well as between cognitive agents.

For example, Heyes (2018) describes cognitive gadgets as not only functional but also regulatory systems, acting within the brain to form adapted systems from highly domain-general innate components. The autocatalytic network model by Gabora and Steel (2017, 2020) sees learning and creativity as a result of self-organization in mental representation networks. They suggest an extension of these dynamics to the social level, which could pertain to the intra-sociont dynamics envisioned above, whose mechanisms would then be shaped by sociont evolution (or mutualistic sociont-hominin coevolution).

Models depicting emergent “group cognition” in networked human cognitive nodes, organized and mediated by culture, have been proposed by several authors (e.g., Grove and Coward 2008; Coward and Grove 2011; Gallagher 2013; Muthukrishna and Henrich 2016; Read 2020), including models where culture and its products themselves are depicted as part of an “external mind” (Clark and Chalmers 1998; Menary 2010); see Theiner (2014) for a review. Cultural niche construction focuses on complex causal feedback loops between cultural behavior and persistent environmental features (Laland and Brown 2006; Smith 2007; Laland and O’Brien 2015), and networked, recombining and cascading features in general are central in many theories of innovation in modern and ancient sociotechnical systems (see, e.g., Hughes 1986; Geels 2002; Schiffer 2005; Wimsatt and Griesemer 2007; Andersson et al. 2014a; Lane 2016; d’Errico and Colagè 2018).

The description also recalls several models of biological innovation and organization from the recent family of approaches often referred to as the “extended evolutionary synthesis” (e.g., Pigliucci and Müller 2010; Feldman et al. 2015; Jaeger et al. 2015); for its applications to culture see, for example, Andersson et al. (2014a), Fuentes (2016, 2017a), Smith et al. (2018), and Zeder (2018). Gene regulatory networks in biological development (e.g., Arthur 2011) exhibit dynamical and evolutionary similarities with sociotechnical innovation (e.g., Erwin and Krakauer 2004). Also, the extension of genes, via social interactions, to group-level adaptations in social insects (via tactile and chemical signals as well as by sensing of persistent modifications of the environment) leads to the formation of a biological organismal unit that also challenges the view of organisms as physically bounded and contiguous entities (Dorigo et al. 2000; Queller and Strassmann 2009; Kennedy et al. 2017).

Future Directions

Many important issues could not be covered in this article. For example, while evolutionary individuality suggests that competition can take place on the group level, we have not discussed how such competition would actually happen. The discussions of group-level competition provided by Smaldino (2014) and Richerson et al. (2016) largely apply to the SPH as well (even if the delimitation between genetic and cultural information is unclear in those treatments), but the issue must be expanded on in future work.

Formal modelling may help to clarify the role of group-level competition, and, in general, to investigate what assumptions are needed for the postulated links between the entities (e.g., in terms of fitnesses of traditions, hominins, and socionts) to operate as claimed, and to explore phenomena in populations and over time.

Future work should also examine derived human traits such as cooperation, altruism, and language in the context of the sociont and the proposed mutualistic partnership with Homo. This relation must be worked out theoretically, via models, via empirical examples, as well as via revisiting frameworks such as those mentioned in the previous section, including not least models of coevolution between Homo and cultural systems (Durham 1992; Herrmann et al. 2007; Smith 2007; Laland and O’Brien 2015; Fernandes and Woodley of Menie 2017; Hare 2017; Colagè and d’Errico 2018).

It would have high diagnostic value if features of culture and/or humans could be demonstrated to be adapted to evolutionary meta-functions of the sociont (see above). Generally, the prediction that culture was shaped by the same processes that generate organismic form in biology should be further explored. The concept of “organismality” (e.g., Queller and Strassmann 2009; West and Kiers 2009; Strassmann and Queller 2010) here represents an attempt to understand what it means to be an organism in the abstract, and it may serve as a useful starting point.

As argued in “The Social Protocell” section and the “Analysis” in the “Informational Uniqueness” section, we may imagine high-fidelity inheritance of “proto-institutions” as systems of functions before high-fidelity social learning enabled cumulative evolution of the processes underpinning the functions. This requires only the functions to be stable, which may be the case also if they are underpinned by processes that are constantly reinvented rather than inherited (i.e., emulated rather than imitated). Such a proto-cultural phase would require laxer assumptions about early hominin cognitive evolution than a direct move from emulation to an inheritance system based on high-fidelity imitation (see, e.g., Shea 2009).

Finally, do marginal proto-institutions exist in Pan? If they do, they could provide simple analogs for a proto-cultural stage of the evolution of human culture. Such systems may have emerged and remained marginal if combinations of stable socially learned behaviors produced adaptive effects on the community level, but without the open-endedness argued to apply to big game carnivory. Read (2012, pp. 99–104), for example, describes substantial intraspecies regional variability in collective group behavior in Pan, such as in how border patrols are organized, without evident genetic differences to explain this variation.