According to version (c) of the technological complexity thesis, complexity increases globally or on average, that is, in some lineages, and more often than not. Accordingly, following McShea (1991), an adequate test of the thesis is one that is based on a representative sample of established lines of descent for which the complexity of ancestors and descendants has been or can be measured.
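The "more often than not" criterion can be made concrete with a small sketch. The snippet below is only an illustration under stated assumptions: the complexity scores are invented, the function name `sign_test_increase` is hypothetical, and the one-sided sign test is our choice of summary here, not a procedure attributed to McShea (1991). It counts the sampled ancestor-descendant pairs in which complexity rose or fell and reports how surprising the number of increases would be if increases and decreases were equally likely.

```python
from math import comb

def sign_test_increase(pairs):
    """Summarize (ancestor_complexity, descendant_complexity) pairs.

    Returns the number of increases, the number of decreases, and a
    one-sided sign-test p-value for "increases outnumber decreases".
    """
    increases = sum(1 for a, d in pairs if d > a)
    decreases = sum(1 for a, d in pairs if d < a)
    n = increases + decreases  # ties carry no directional information
    if n == 0:
        return increases, decreases, 1.0
    # P(X >= increases) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, k) for k in range(increases, n + 1))
    return increases, decreases, tail / 2 ** n

# Hypothetical scores on some agreed measure (e.g., number of techno-units).
pairs = [(3, 5), (4, 4), (2, 6), (7, 5), (1, 2), (3, 4)]
inc, dec, p = sign_test_increase(pairs)
print(inc, dec, p)
```

Such a count is only as informative as the sample of lineages behind it, which is why the adequacy criteria discussed next matter.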
To apply these adequacy criteria to technology, we need to make them more precise. To start, what would count as an acceptable measure of technological complexity? In many presentations, cultural evolutionists have understood technologies straightforwardly as functional artifacts, such as kayaks, bows, and dogsleds (Boyd et al. 2013, p. 122). Other versions include or focus on bodies of knowledge (e.g., Mesoudi 2011b) rather than material items; and still others refer more or less consistently to behavioral patterns (e.g., Dean et al. 2014). This reflects a multiplicity of meanings that is widely acknowledged in the philosophy of technology (see, e.g., Mitcham 1978): ‘technology’ may refer to collections of artifacts, to bodies of knowledge, or to goal-directed practices, where none of these meanings holds obvious conceptual priority over the others. This multiplicity also surfaces in the various attempts to define technological complexity. As can be seen in Table 1, some definitions pertain to technological artifacts (definitions [1–2]), some to technological behavior (definitions [3–6]), and others can be applied to both artifacts and behavior (definitions [7–9]). At a first pass, any sensible operationalization of the items on the list might be considered adequate. Yet, one might also insist that some measures are better than others. For instance, it has been claimed that humans’ unique capacity for cumulative cultural evolution can be attributed to imitation, i.e., faithful replication of the actual behavior of a mentor (e.g., Tomasello and Call 1997; Tomasello 1999; Boyd and Richerson 1996; Richerson and Boyd 2008; Mesoudi et al. 2013), rather than to emulation, i.e., re-invention of behaviors as a response to being exposed to the end products (e.g., artifacts) of others’ behaviors. If this is right (we take no stand on this issue here) and if cumulative culture refers to the accumulation of technological complexity (Mesoudi 2011a, b; Boyd et al. 2013; Kempe et al. 2014; Dean et al. 2014), the most natural definition of complexity would thus be one that pertains to technological behavior (definitions [3–6]) rather than to technological artifacts (definitions [1–2]).
Table 1 Definitions of technological complexity
One might think that the latter type of complexity is a reliable proxy for the former type. But this is far from established. Artifacts are multiply realizable. For example, prehistoric projectile points were made from stone or from osseous materials, such as bone, antler and ivory. Osseous points seem at least as complex as, if not more complex than, stone points: they have the same number of techno-units (definition [1]), also when hafted; when hafted, the densities of interactions between components plausibly are identical (artifactual definition [7]). However, since osseous materials are less subject to breakage, points made of caribou antler are probably easier to produce than stone tools (Guthrie 1983) (behavioral definitions [3], [4] or [5]). More specifically, they can be produced by gradually scraping the raw material into the requisite form, and thus do not depend on preparatory work that is as extensive and risky as the work required for producing stone points.
In a similar vein, one cannot read off from an artifact the complexity of its usage, which is another type of technological behavior that is subject to imitation. For instance, because automatic gearboxes contain more components (definition [1]) and perhaps denser interactions between parts (definition [7]), they would seem more complex than manual gearboxes. But the reverse holds when we consider their use. The series of actions involved in operating a manual gearbox includes one more component than the series of actions involved in operating an automatic transmission (definition [8]), i.e., gear-shifting, and that component interacts with at least one other component in the series (definition [7]), i.e., steering.
Measures of complexity might even diverge within their own class (i.e., artifactual or behavioral). Complexity of production is not a reliable indicator of complexity of usage, as illustrated by the gearbox example. Or, to take an example from archaeology, unbarbed spears made out of one piece of wood, such as those found at Schöningen (Thieme 1997), are relatively easy to produce, but it requires a great deal of skill and knowledge about the prey and the environment to put these one-component tools to effective use. Finally, there is no reason to expect artifactual definitions [1] and [2] to be correlated. The number of techno-units of a hafted spear tells us little about the number of other tools within the spear user’s toolkit.
Such possible divergence between measures of complexity is also relevant to specifying what would count as a representative sample for testing the complexity thesis. Version (c) of the complexity thesis comes in at least as many variants as the number of definitions in Table 1. Said divergence now tells us that for a sample to be representative it ought to match the variant under consideration.
What else would make for a representative sample? Since the accumulation of complexity is supposed to be uniquely human, the sample must preferably be drawn from the population of all past and present human technologies. Supposing we have agreed on a measure of complexity, we also must get a sense of whether that population contains subpopulations or strata, each of which would need to be sampled independently for the sample to be representative. Biological taxa usually serve the purpose of stratification in biology, but unfortunately no such established strata are available for technology. So whereas we might draw a subsample from all the large branches of the tree of life, we lack a tree of technology, or any other fully comprehensive classification with non-overlapping classes, allowing us to do the same for technological evolution. Consequently, we will need to draw a fully random sample, or else define strata which we, on theoretical grounds, suspect to be homogeneous in the relevant respects. Regarding such stratification, the level of technological complexity among small-scale societies has been claimed to depend on a variety of factors: environment, resource availability, subsistence, sedentism, linear settlement, technology, storage, population, exchange, conflict, competition, social organization, territoriality, style, labor organization, craft specialization, inequality, and status differentiation (Price and Brown 1985). These factors thus might guide us in subsample selection. Ideally, our sample would include a decent amount of variation with respect to these variables. If, as will turn out later, this proves unfeasible, the next best option is to seek and sample the extremes. One might for instance sample hunter-gatherer and WEIRD (Western, Educated, Industrialized, Rich and Democratic; Henrich et al. 2010) societies, as they, even taking into account internal variability, might be taken to differ substantially in many respects, such as environment, resource availability, and sedentism. Alternatively, one might contrast samples pertaining to selectively neutral traits with samples pertaining to traits that are under intense selective pressure; functional considerations might be thought to constrain trends in complexity in the latter but not in the former.
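The idea of sampling each stratum independently can be sketched as follows. This is a minimal illustration, not a proposed protocol: the function name `stratified_sample`, the two stratum labels, and the listed technologies are all hypothetical placeholders, and in practice the strata would have to be motivated by factors such as those listed by Price and Brown (1985).

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, per_stratum, seed=0):
    """Draw up to `per_stratum` items at random from each stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = {}
    for name, members in sorted(strata.items()):
        k = min(per_stratum, len(members))  # small strata are taken whole
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical records: (technology, stratum). Labels are illustrative only.
technologies = [
    ("wooden spear", "foraging"),
    ("bone scraper", "foraging"),
    ("stone point", "foraging"),
    ("manual gearbox", "industrial"),
    ("automatic gearbox", "industrial"),
    ("computer", "industrial"),
]
sample = stratified_sample(technologies, stratum_of=lambda t: t[1], per_stratum=2)
print(sample)
```

Sampling within strata guarantees that each stratum is represented, which a single fully random draw from the pooled population does not.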
Although such stratification may avoid one of the problems to which the lack of a well-established phylogeny of technologies gives rise, it does not help us to address a second, thornier issue. The requirement to start from well-established lines of descent implies that one must consider, where possible, the developments in all, or at least an appropriate subset, of the lineages branching off from a given ancestor. To appreciate the importance of this point, consider Fig. 1, which is Steven Mithen’s (1996) fairly standard reconstruction of technological development over the history of our species. It starts with the first flaked stones documented by the archaeological record, dating to 2–3 million years ago, and continues until the emergence of the present-day computer. Now even if we were to agree with Mithen that the sequence evidences increasing complexity, and even if we assume that it represents actual ancestor-descendent relations (which evidently is contentious), the sequence does not establish the complexity thesis. For it includes only the tools that are supposedly diagnostic for the acts in what Mithen calls the drama of our past. So the Oldowan (2–1.5 Ma) (Act 2) is taken to be typified by the sharp-edged stones produced by striking one cobble (the core) with another (the hammerstone), as well as the remaining cores, which can be variously used as a chopper or scraper or something else. The emblematic feature of the early Acheulean techno-complex (ca. 1.6–0.5 Ma) (Act 3) is considered to be the bifacial handaxe, produced by more obviously shaping a core by means of both stone and soft hammers, whereas the late Acheulean (0.5–0.25 Ma) is typified by the emergence of Levallois tools, i.e., stone points the size and form of which are predetermined by careful preparation of the core. Mithen’s reconstruction thus ignores the diversity of tools within each act. For example, it ignores the fact that flaked tools continue to be abundantly present in the Acheulean, and in fact throughout the whole of prehistory and much later, e.g., among contemporary Aboriginals (Holdaway and Douglass 2011) and, until the 1960s, in the Mediterranean (Karimali 2005); it ignores other marked continuities from the Pleistocene until present times, such as the persistent use of bone implements to work animal hides (Soressi et al. 2013) and of unbarbed spears made out of one piece of wood, as at Schöningen (Thieme 1997) and among recent populations, for example in New Guinea and Australia (McCarthy 1957); it ignores the fact that later stone implements, such as some of the handstones used during the Neolithic for grinding cereals, have a production process comprising only one step (i.e., seeking a suitable piece of rock; ibid.), which makes them, per definition [8], simpler than Oldowan tools; it ignores the fact that, as already hinted at, the use of osseous materials might have actually simplified production processes; and so forth. Obviously, these examples are insufficient to undermine the complexity thesis. Yet they do illustrate that uni-linear reconstructions fail to record developments which might shift the evidential balance, and which any proper test should therefore include.
One might worry that technological evolution is highly reticulate, and therefore that the demand to start from established lines of descent is too stringent. Indeed, our ability to produce unambiguous rooted networks is currently limited (Morrison 2011). So perhaps we should acknowledge that, currently, we often lack the means to reliably infer technological ancestor-descendent relationships. This acknowledged, the next best option is to draw diachronic samples from a domain that preferably is wider in range than that of uni-linear reconstructions. The underlying assumption would be that, if the complexity thesis is true, one should observe an increase in complexity irrespective of the precise evolutionary relations between the diachronic samples.
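A minimal sketch of such a diachronic comparison is given below. Everything in it is an assumption made for illustration: the period labels, the complexity scores, and the choice of the median as a summary are ours, and the function name `diachronic_trend` is hypothetical. The invented samples deliberately include simple items in every period, mirroring the persistence of simple tools noted above.

```python
from statistics import median

def diachronic_trend(samples_by_period):
    """Summarize complexity per period and the change between periods.

    `samples_by_period` is a list of (label, scores) ordered from the
    oldest to the most recent period.
    """
    medians = [(label, median(scores)) for label, scores in samples_by_period]
    # Difference in median complexity between each pair of successive periods.
    steps = [later[1] - earlier[1] for earlier, later in zip(medians, medians[1:])]
    return medians, steps

# Hypothetical scores on one agreed measure, for three hypothetical periods.
samples = [
    ("period A", [1, 1, 2, 3]),
    ("period B", [1, 2, 2, 4]),
    ("period C", [1, 1, 3, 5, 6]),
]
medians, steps = diachronic_trend(samples)
print(medians)
print(steps)
```

Because the samples are drawn from a wide domain rather than from a single diagnostic tool type per period, persistent simple technologies pull the summary down, which is exactly the information a uni-linear reconstruction discards.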