Introduction

The language (Partee, 1984) and actions (Lashley, 1951) of humans (Homo sapiens), and the tools or tool production modes of hominins (Hominini) (Henshilwood et al., 2001; Putt et al., 2022; Schick & Toth, 1993; Stout, 2011; Wynn & Coolidge, 2016) are well recognized as showing compositionality (Janssen, 2012). Compositionality refers to the hierarchical nesting of parts into larger wholes with new individuality, identity, or meaning—or, inversely, the partitioning of such wholes into smaller parts with independence or individual meaning of their own. Words, for example, compose into sentences with new meaning (Pelletier, 1994); behaviors are made up of individual action units (Ekman & Friesen, 1978), and compositional tools are constituted from individual techno units (Oswalt, 1976).

Cognitive, generative, and biolinguists generally assume that compositionality requires (meta)cognitive organization on the part of the composer. Such hypothesized metacognition is theorized to take on the form of a language of thought, theory of mind, intentionality, reflexivity, or voluntary control (Carruthers & Chamberlain, 2000; Fodor, 1983; Frith & Frith, 2003; Frankland & Greene, 2020; Kazanina & Poeppel, 2023). Chomsky (1957), for example, is well-known for arguing that compositional constructions require some sort of mental computation. Berwick and Chomsky (2016) call this hypothesized “cognitive processor” or “internal computational system” “MERGE.” They argue that it is the “basic property of language” that underlies syntax, which is characterized by recursion. Also within this tradition, compositionality is traditionally distinguished from the associative behavior found in other animals and primates. Associative behavior was first described by Thorndike (1898), who was one of the founders of (comparative) psychology. Thorndike (1898) argued that a majority of observed stimuli-response phenomena form “series,” resulting from the “association” or combination of different independent actions, rather than “compound associations” or compositions, whereby different actions underlie new behavior. Research on combinatoriality and compositionality thus reaches far back in time.

Associative behavior today continues to be explained in terms of action-reaction schemes. Classic examples are predator alarm calls (Hollen & Radford, 2009). Such calls require a capacity for combinatoriality between actions and reactions, such as making a call when seeing a predator or reacting to the call on hearing it. Combinatoriality might be subject to (cognitive) learning, but it remains difficult to assess whether combinatoriality also requires either hierarchical organization or metacognition. The latter would involve making the call on seeing the predator so that those hearing it react to it. This might very well be what happens in the well-studied use of predator-alarm calls by vervet monkeys (Chlorocebus pygerythrus; Cheney & Seyfarth, 1990). However, animals generally display little flexibility either in the production of their alarm calls or in their reactions to them (but see Nieder & Mooney, 2019, for a nuanced distinction between innate, volitional, and learned vocalizations).

Most animal alarm calls also do not partition into sub-calls. Scholars have hypothesized that the onset of compositional language and grammaticalization must have evolved as a form of hierarchical defragmentation of earlier evolved primate calls and gestures whose communicative meaning was holistic (Arbib & Bickerton, 2010; Kuteva & Heine, 2024; Tallerman, 2007; Wray, 2002). These views have induced a shift in linguistic focus from abstract, cognitive, or ideational language compositions to the embodied compositions that characterize language as a communicative system (Gontier, 2022; Sandler, 2018).

Comparisons between human and other animal and primate communication systems now show that the latter not only contain combinatorial but also compositional structures (Amici et al., 2022; Gil, 2023; Pleyer et al., 2022; Spiess et al., 2022; Waller et al., 2022; Zuberbühler, 2020). Studying what they call “componentiality” in chimpanzee (Pan troglodytes) multimodal communication, Oña et al. (2019), for example, found that while some chimpanzee expressions, such as the hoot face and the hoot vocalization always occur jointly, other facial expressions, such as teeth-baring, can be a component in a variety of expressions that combine facial with gestural articulations. The authors define multimodal communication as the “simultaneous combinations of signals from two or more modalities (gestural, facial, vocal, and olfactory signals), and/or any signals requiring sensory integration by the receiver” (Liebal et al., 2014). Signal simultaneity and its sensory integration must somehow be coordinated and thus require hierarchical organization, but it remains unclear whether this requires metacognition.

The idea that, beyond hierarchical organization, compositional behavior requires metacognition or mental computation also is investigated outside of linguistics in fields such as psychology, anthropology, and archaeology. Research on the syntax of language has raised questions about the hierarchical organization or syntax of behavior or action, and the possibility that both are under the same or similar neurocognitive control. Psychologists have searched for a syntax of movement (Lashley, 1951), behavioral or operational logics (Simon, 1955), cognitive heuristics (Simon & Newell, 1958; Tversky & Kahneman, 1974), action grammars or rule-bound strategies of behavior (Greenfield et al., 1972; Greenfield & Schneider, 1977), operant repertoires (Skinner, 1986), or algorithms (Dennett, 1996). Anthropologists and archaeologists have analyzed hypothesized behavioral syntaxes by deducing the chaîne opératoire or operational chain or sequence (Leroi-Gourhan, 1964) of the different tool technologies found in the archaeological record. These research lines have aptly shown that compositionality is neither solely the dominion of language nor of humans. Rather, compositionality can be observed in the manufacturing modes and the actual tools produced by hominins.

Contrary to what these different fields set out to show, compositionality might not require metacognitive rule following. Instead, the cognition required for compositionality can be embodied in behavioral actions and extended into material culture (Iliopoulos & Malafouris, 2024; Uomini, 2017). This view raises questions about how embodied and extended cognition organize hierarchically into compositional structures and where to draw the distinction between compositional and combinatorial behavior.

In this paper, I attempt to provide a possible way of answering such questions. I expand research on combinatoriality and compositionality in language, tool production, and multimodal communication to examine everyday skills displayed by numerous primates. I analyze skills for combinatoriality and compositionality by integrating tenets from hierarchy theory (Pattee, 1973; Simon, 1962; Wu, 2013) and research on operational chains or logics (Leroi-Gourhan, 1964) with insights from 4E cognition theory (Barrett, 2011; Johnson, 2001; Newen et al., 2018), pragmatics and praxeology (Bourdieu, 1977; Ingold, 2000).

Examples of everyday skills are behaviors such as locomotion, nesting, eating, or grooming. In the first part of this paper, I show that skills are made up of various action units and that their praxis extends and integrates into larger sociocultural and ecological settings, all of which are subject to evolution. I propose a new definition of skills as spatiotemporally embodied, embedded, enacted, extended, and evolving community traits that are hierarchically made up of combinatorially or compositionally organized action units that underlie bioreality formation. Biorealities in turn are life-based and lived actualities.

In the second part of this paper, I introduce four hierarchies to assess skills as either combinatorial or compositional. I define combinatorial skills as skills that combine independent action units either associatively into a momentarily hierarchical set or aggregation in space or orderly into a linearly hierarchical sequence or series in time. The latter has duration; the former has not. I define compositional skills as skills whereby the action units either nest into new hierarchical constructs or action units of previously nonrelated hierarchical constructs interact in new ways. I show that these definitions enable the identification and differentiation of skills as variously accidental, teleonomic, intentional, or creative.

In the final part of the paper, I exemplify the approach by analyzing a selection of day-to-day primate skills, including locomotor, eating, nesting, and grooming skills. I conclude that compositionality is present in all these skills and is confined neither to language or tool use nor to humans.

Toward a 5E Definition of Skills

Skills are spatiotemporally embodied, embedded, enacted, extended, and evolving community traits that are hierarchically made up of combinatorially or compositionally organized action units that underlie bioreality formation. I exemplify what this definition entails with the help of the pointing skill, which is well-studied in primates for how it shows differentiation in composition, context of use, and possibly (meta)cognition.

Skills Are Hierarchically Organized from Different Action Units

Although skills are commonly identified as individual actions, which in turn are denominated by single action verbs in spoken language or action gestures in signed language, they are actually made up of different action units that require hierarchical organization in both space and time. In humans, canonical or index-finger pointing (Butterworth, 2003; Kita, 2003) is identified as a compositional gesture because it is composed of different action units that include the extension of the index finger and the curling of the other fingers into a closed, fist-like configuration, while the thumb rests either on top of or next to the curled fingers. The combination of these different actions constitutes the pointing skill.

If the thumb is located beside the curled fingers, it might grip or wrap around the first and sometimes second curling finger, but observers often do not qualify such variation as differential. Instead scholars (Colletta & Guidetti, 2012; Cooney et al., 2018; Cooperrider & Mesh, 2022; Leavens & Hopkins, 1999; Liszkowski & Rüther, 2024) discuss whether a thumb tucked inside rather than on or beside the curling fingers qualifies as canonical index-finger pointing. Most scholars agree that an additional pointing of the thumb either to the top or bottom or to the opposite side of the hand configuration would disqualify the gesture as canonical index-finger pointing, more so because this gesture is accompanied by a shift in spatial orientation of the hand. This gesture would be described either as a variation of the pointing gesture or as an altogether different gesture.

Beyond the action units that make up the pointing gesture, scholars also contemplate whether the spatial orientation of the pointing hand, the extension and directionality of the arm, or the entity pointed at should form part of the definition or identification of a gesture as pointing. Butterworth (2003), for example, defines index-finger pointing in humans as follows: “…the index finger and arm are extended in the direction of the interesting object, whereas the remaining fingers are curled under the hand, with the thumb held down and to the side.… The orientation of the hand, either palm downward or rotated so the palm is vertical with respect to the body midline, may also be significant in further differentiating subtypes of indexical pointing.” This definition goes beyond structure and takes the functional, in this case referential, aspects of the pointing gesture into account.

Skills Are Embodied, Embedded, Enacted, Extended, and Evolved Community Traits

The above shows that pointing can be studied from different perspectives: for how it is structurally composed of different parts, for how the pointing action unfolds over space and time, and for how it, as a compositional gesture, functions as a whole that in turn interacts with objects or subjects that surround it. These perspectives imply different views on how entities in the world organize hierarchically.

The composition of the pointing gesture or its interaction with other entities, such as objects or subjects in the surroundings, inevitably takes place over time and space, both of which are processes with constraints and affordances (Gibson, 1977). How the pointing gesture is formed, for example, depends on the anatomy and physiology of the pointer. How the gesture is used depends on the sociocultural and ecological setting in which it is displayed and attributed meaning.

Wild apes are rarely seen index-finger pointing. Captive apes show index-finger pointing (Krause et al., 2018; Povinelli et al., 2003), but they either point with all fingers extended or by extending their index finger while curling their other fingers. Their physiology prevents them from closing and wrapping the hand in the way humans do while pointing. In humans, the pointing action can extend the anatomy and physiology of index-finger pointing and take on different configurations, such as middle-finger pointing, lip pointing, and even foot pointing (Wilkins, 2003). Pointing actions also are observed in bonobos (Pan paniscus) during requests for genito-genital rubbing that take on the form of foot pointing and single hip shimmy gestures (Douglas & Moscovice, 2015).

The configuration, action, and the use of pointing also is subject to sociocultural learning in both humans and other primates. In humans, index-finger pointing is thought to develop around 12 months of age from precursive behavior, such as index-finger extension and inspection of the environment (Butterworth, 2003), whereafter it starts to associate and interact with other skills, such as gazing, attention sharing, attention directing, or language learning (Goldin-Meadow & Butcher, 2003). In these cases, pointing, which is (developmentally and physiologically) made up of different action units, starts to (ecologically) function as an entity that, as a whole, associates and interacts with a larger spatiotemporally situated practice.

In humans, pointing can, for example, be used as a communicative act to share attention to the object pointed at in the environment or to request the object pointed at, which are distinguished as declarative and imperative pointing (Tomasello & Call, 1997). Apes also use the pointing gesture communicatively; language-trained and home-reared apes mostly use index-finger pointing declaratively, whereas institutionalized apes point more imperatively and with all fingers extended, often toward items out of their immediate reach (Leavens et al., 2005). Beyond individual physiology and development, pointing thus depends on the ecological and sociocultural environment that constrains and affords the overall context of use.

The context of use of skills requires learning, which in turn requires cognition. The mere physiological configuration of the different action units that make up the pointing gesture, for example, must be under neurocognitive control, as all muscular movements can eventually be proven to be. But we do not know whether the action units or the pointing skill as a whole also are under metacognitive control, i.e., whether a pointer learns to follow some operational technique (Mauss, 1936), sequence (Leroi-Gourhan, 1964), logical heuristic (Simon, 1955), or rule (Tversky & Kahneman, 1974) that says to curl the fingers and to extend the index finger or to use pointing to alter the mental or physical state of an observer.

Research on cognitive control often is conducted in the neurological sciences, whereas research on metacognition is conducted in the cognitive sciences, where scholars investigate whether pointers have intent, theory of mind, reflexivity, consciousness, or knowledge of the performance and/or consequences of their actions. Often, such research gets stuck in mentalistic jargon (Byrne, 1996; Dennett, 1996; Fodor, 1983; Tomasello & Call, 1997), and in pragmatic theory, it has been criticized for decontextualizing skills from their overall context of use. Distributed cognition theories in psychology (Johnson, 2001) and praxeological theories in anthropology (Bourdieu, 1977; Ingold, 2000) provide alternative means to study skills.

Tenets of these theories today are integrated into 4E cognition theory (Newen et al., 2018). This research tradition studies cognition for how it is embodied and enacted by the organism (Varela et al., 1992), embedded or situated (Brown et al., 1989), and grounded (Barsalou, 2008) in a sociocultural context and how it extends (Clark & Chalmers, 1998) into the environment. It thus provides a hierarchical and interactional understanding of the mind-body-environmental relationship (Barrett, 2011).

4E cognition theory was primarily introduced to make better sense of what pragmatists and phenomenologists call worldviews, which are cognitive and sociocultural constructions of reality. In evolutionary epistemology (Gontier, 2018; Gontier & Bradie, 2017), however, it is used to understand actual worldbuilding. Because cognition is embodied in living organisms and because it materializes and extends into the environment, cognitive, sociocultural, and ecological niche construction theory (Bateson, 1972; Laland et al., 2003; Lewontin, 2002; Magnani, 2021; Sinha, 2024) enables an understanding of cognitive behavior as altering and building biological realities. 4E cognition theory, which was formulated within the developmental and ecological sciences, thus can be synthesized with the evolutionary sciences in an eco-evo-devo or ecological-evolutionary-developmental approach that undoes distinctions between the mind and the body, or the biotic and abiotic environment (Jablonka & Lamb, 2005).

Skills not only rely on, or show aspects of, 4E cognition but actually become embodied and enacted or performed by organisms. They become socioculturally embedded, and they extend into the environment, which becomes altered by them. For these reasons, 4E cognition theories on skills can be extended toward a 5E approach that in addition examines how skills evolve. A 5E perspective of skills understands skills as evolving community traits. Community traits are “synergistic/organizational traits that characterize the community and that result from the cumulative, transgenerational, and constructed niches resulting in turn from biological, ecological, and sociocultural, extra-genetic inheritance” (Sukhoverkhov & Gontier, 2021).

Community Traits Underlie Bioreality Formation

As community traits, skills embody, embed, and enact this extragenetic inheritance, which they extend into the biotic and abiotic environment. Ontologically, skills thereby contribute to the construction of spatiotemporal biorealities (Gontier & Bradie, 2017), which comprise the ever-changing life-based and lived actualities construed by individuals, possibly belonging to different species, and by the communities that they form.

Returning to our example, pointing can be shown to differ between and within species based on how it is used. For institutionalized apes, for example, pointing to acquire objects out of reach forms part of their lived and life-based actuality, as it forms part of the life-based and lived actuality of their caregivers to provide the objects pointed at. It forms part of their community’s rites and rituals, which are, just because they are embodied, embedded, enacted, and extended, ontologically real; they evolved in that specific context of use. Pointing toward objects also can be a means to acquire linguistic information about them, which is a community trait that underlies the lived and life-based actuality of language-trained apes as well as human language learners. Foot pointing and hip shimmies to request genito-genital rubbing are evolved community traits that underlie the biorealities of bonobo communities. Community traits like these can become subject to learning and reenactment as well as extragenetic inheritance across generations through time and as such evolve further. However, they are bound or real only to those that have evolved the means to perform, observe, and interpret the (variations in) pointing behavior.

In summary, 5E skills are hierarchically complex community traits that combine and compose from different action units that underlie bioreality formation. The following section provides a scheme to explain this hierarchical complexity.

A Hierarchical Approach to Combinatoriality and Compositionality

Problems of combinatoriality and compositionality are not only problems of skills, language, or cognition. Rather, they are problems of ontological order and organizational complexity that can be researched epistemologically by hierarchy theory (Simon, 1962). Rooted in cybernetics, general systems theory, emergence, and complexity theory, hierarchy theories today are diverse, and they are used by numerous sciences (Wu, 2013). All hierarchy theories understand the organizational complexity that characterizes worldly phenomena as resulting from units, components, or parts arranging, nesting, acting, or interacting into levels, structures, or wholes. Units and levels so identified provide descriptions of aspects of reality, and reality is minimally conceptualized as multilayered and possibly as consisting of multiple realities.

Hierarchy Theories throughout the Sciences

How part-whole divisions and constructions or how units and levels are defined ontologically is determined by the epistemological frameworks and viewpoints used. Molecular biologists, for example, investigate a molecular level of reality where they can examine how biomolecular masses, such as DNA sequences, are structurally constituted by phosphates, sugar molecules, and nitrogenous bases or they examine how mobile genetic elements so constituted can switch positions within and between genomes (Shapiro, 2022). Developmental biologists examine how genes bring forth cells, tissues, and organs during organismal life history (Gawne, McKenna & Nijhout, 2018). Physiologists examine how such anatomical structures organize into organismal systems, such as the cardiovascular and respiratory system, and how these systems interact with the environment (Noble & Noble, 2022). Neurocognitive scientists contemplate cognitive control hierarchies in the brain and running from the brain to the nervous and muscular systems (Badre & Nee, 2018). Evolutionary biologists examine how, over natural history, genes bring forth organisms, and organisms bring forth species, and how, ecologically, organisms group into populations and communities that form ecosystems (Eldredge & Salthe, 1984).

The different hierarchies propose different perspectives from which to study entities in the world. Organisms, for example, can be studied for how they evolve over time, how they develop from genes, how they function through the working of their organs, or how they interact ecologically with other organisms in the environment. Applying different hierarchies results in organisms being considered either as parts or as wholes. Part-whole identification thus depends on the perspective taken. This is no different for skills, whose analyses also rely on science-based perspective taking. The identification of the action units that make up skills can thus vary with the science used to analyze a skill as well as, as we shall see, with the overall context wherein the skill is used. In all cases, such analyses must at some point focus on how the action units identified are organized hierarchically.

Hierarchy Theories in the Behavioral, Cognitive, Linguistic, and Anthropological Sciences

The problems of combinatoriality and compositionality encountered in the behavioral (Arun, 2022; Cavicchio et al., 2018), cognitive (Penn et al., 2008), linguistic (Baggio, 2021), and anthropological sciences (Varallyay et al., 2023) also ask about hierarchical organization and are rooted in hierarchy theory. This is due to the early association between hierarchy theory and synchronic, structural-functional, and systems theoretical research (Pattee, 1973). Although often not explicitly stated as such, research on compositionality and combinatoriality can be said to follow the classic distinction between nested and unnested hierarchies. Nested and unnested hierarchies, in turn, are today also placed in diachronic and evolutionary perspectives. The following sections first introduce a new epistemological scheme to understand hierarchies, then situate combinatoriality and compositionality in this scheme.

A Hierarchy of Hierarchies: Aggregational, Linear, Nested, and Interactional Hierarchies

Hierarchical organizations can be conceptualized as aggregational, linear, nested, or interactional (Gontier, 2021). The former two hierarchies are unnested; the latter two are nested. Aggregational hierarchies are hierarchies where units arrange in an unordered spatial collection. Linear hierarchies result from units arranging over time into a series. Nested hierarchies result from units arranging into new entities or constructs over space and time. Entities or constructs so formed develop individuality that is characterized by a functionality that emerges from the units wherefrom they are constituted. Interactional hierarchies result from nested structures interacting through their units with one another.

Combinatorial and Compositional Skills in the Hierarchy of Hierarchies

Combinatorial skills can be understood as resulting either from the aggregational or linear hierarchical arrangement of action units. Combinatorial skills are aggregationally hierarchical when different action units combine into a spatial arrangement that lacks order or nestedness. Combinatorial skills are linearly hierarchical when skills show a sequential or serial, successive or consecutive arrangement of action units in time. Linearly hierarchical skills thus always result in a time series whereby single action units are ordered successively (following one another) or consecutively (following one another continuously), without the series of action units composing into a new whole (Table I).

Table I Combinatorial versus compositional skills according to a hierarchy of hierarchies

Compositional skills can be understood as resulting either from the hierarchically nested or interactional arrangement of action units in both space and time. Compositional skills are hierarchically nested when different action units constitute a new behavior that functions as a whole (over time) or when the compositional behavior can be partitioned into independent action units (in space). Compositional skills can moreover arrange in interactional hierarchies when their action units interact and become applied in new spatiotemporal settings.
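The scheme can be illustrated with a minimal sketch in Python. The sketch is an assumption added for illustration and not part of the analysis itself: the names HierarchyKind, Arrangement, classify_hierarchy, and skill_class are hypothetical, and the three boolean properties are a deliberate simplification of the definitions given above.

```python
from dataclasses import dataclass
from enum import Enum


class HierarchyKind(Enum):
    AGGREGATIONAL = "aggregational"
    LINEAR = "linear"
    NESTED = "nested"
    INTERACTIONAL = "interactional"


@dataclass(frozen=True)
class Arrangement:
    """Hypothetical summary of how a set of action units is arranged."""
    ordered_in_time: bool          # units follow one another as a series
    forms_new_whole: bool          # units nest into a construct with its own identity
    interacts_across_wholes: bool  # units of nested constructs interact in new settings


def classify_hierarchy(a: Arrangement) -> HierarchyKind:
    """Map an arrangement onto one of the four hierarchies.

    Interaction is only considered once a nested whole exists, because
    interactional hierarchies are defined as interactions between nested
    structures.
    """
    if a.forms_new_whole:
        if a.interacts_across_wholes:
            return HierarchyKind.INTERACTIONAL
        return HierarchyKind.NESTED
    if a.ordered_in_time:
        return HierarchyKind.LINEAR
    return HierarchyKind.AGGREGATIONAL


def skill_class(kind: HierarchyKind) -> str:
    """Unnested hierarchies are combinatorial; nested ones are compositional."""
    if kind in (HierarchyKind.NESTED, HierarchyKind.INTERACTIONAL):
        return "compositional"
    return "combinatorial"
```

Under these assumptions, a time-ordered series of action units that does not form a new whole is classified as linear and hence combinatorial, whereas the same units nesting into a recognizable new behavior are classified as nested and hence compositional.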

Hierarchical Analyses of Everyday Primate Skills

The following sections assess a selection of common, everyday skills shown by most primates as either combinatorially or compositionally hierarchical. The examples show that understanding skills from within this hierarchical scheme also enables us to distinguish accidental from teleonomic, intentional, and creative behaviors. The hierarchical arrangements enable such assessments without necessarily understanding such behaviors as cognitive or metacognitive. The examples focus on assessing the combinatorial and compositional complexity of the skills analyzed, rather than providing evolutionary scenarios on how the skills evolved.

Locomotor Skills

Brachiation (arm swinging), quadrupedal knuckle walking, and bipedal walking are examples of everyday primate locomotor skills (Kimura, 2002). These skills all require individual learning and vary at a biological (species) and sociocultural (community) level, which underlies the formation of different biorealities, i.e., life-based and lived actualities. For example, brachiation associates with an arboreal lifestyle, knuckle walking associates with terrestrial lifestyles, and bipedal walking links to both arboreal and terrestrial navigation.

Brachiation, knuckle walking, and bipedalism are all compositionally nested hierarchical structures. In the previous part, we defined compositional skills as hierarchically nested when different action units constitute a new behavior over time or when the compositional behavior can spatially be partitioned into independent action units. Brachiation meets these criteria because it is constituted of different action units that include the spatiotemporal alternation of striding limbs. Each of these strides, in turn, is a compositional structure that is minimally composed of a support and swing phase in continuous contact brachiation, or, in the case of ricochetal brachiation, an additional aerial phase where there is no contact of the limb with the surface brachiated upon (Chang et al., 2000). These phases combine different goal-oriented or teleonomic skills (Corning, 2014, 2019; Pittendrigh, 1958; Vane-Wright, 2014), such as reaching, grabbing, holding, swinging, and releasing the branch brachiated on, into linear hierarchical sequences that on a higher level become nested into the brachiating skill. If the different action units do not combine into the linear sequences, and if these sequences are not nested, they at most form aggregational hierarchies of individual skilled behaviors, but the compositional skill cannot emerge. For that reason, compositional skills can be considered intentional; the action units have to join and form new and recognizable constructions with identity and individuality, regardless of whether or not they are under cognitive or metacognitive control. Intentionality does not necessarily lie in the brain but in the specific hierarchical configuration of different action units. This configuration is furthermore not intrinsic to the action units but flexible and dependent upon the extrinsic context of use.
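The nesting described for brachiation can be sketched as a small data structure. The Python below is a hedged illustration under stated assumptions: Stride, REQUIRED_PHASES, OPTIONAL_PHASES, and nests_into_brachiation are hypothetical names, phases are modeled as unordered sets so that no claim is made about their timing, and the requirement of at least two alternating strides is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence

# Phase labels follow the text; treating them as plain strings is an assumption.
REQUIRED_PHASES = frozenset({"support", "swing"})
OPTIONAL_PHASES = frozenset({"aerial"})  # present only in ricochetal brachiation


@dataclass(frozen=True)
class Stride:
    """One limb stride, itself composed of phases (hypothetical model)."""
    limb: str
    phases: FrozenSet[str]

    def is_complete(self) -> bool:
        # A stride needs support and swing, and may add an aerial phase.
        allowed = REQUIRED_PHASES | OPTIONAL_PHASES
        return REQUIRED_PHASES <= self.phases <= allowed


def nests_into_brachiation(strides: Sequence[Stride]) -> bool:
    """True when complete strides alternate between limbs.

    Incomplete strides, or repeated use of the same limb, leave at most
    an aggregation of individual behaviors rather than the nested skill.
    """
    if len(strides) < 2:
        return False
    if not all(s.is_complete() for s in strides):
        return False
    return all(a.limb != b.limb for a, b in zip(strides, strides[1:]))
```

In this sketch the judgment of whether the skill emerges depends only on the configuration of the units, which mirrors the point that such intentionality need not be located in cognitive or metacognitive control.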

Knuckle and bipedal walking also are constituted of alternating limb strides that each combine with specific body postures. Knuckle walking is a skill displayed by African apes that, as defined by Wunderlich (2022), entails “…a form of quadrupedal locomotion in which forelimb weight is born on the dorsa of the middle phalanges of the hand (not actually on the ‘knuckles’ as its name implies). The hand is held in a position in which the interphalangeal joints are flexed and the metacarpophalangeal joints are extended or in a neutral position.” Focusing on the role of the wrist, Wunderlich (2022) implicates numerous other physiological and anatomical structures in knuckle-walking, such as the scaphoid dorsal concavity, scaphoid beak, capitate distal concavity, capitate waisting, capitate dorsal ridge, hamate dorsal ridge, and the hamate distal concavity. The hierarchical nestedness required to display the knuckle walking skill must be enormous.

Compositional skills can arrange in interactional hierarchies when their action units combine and become applied in new spatiotemporal settings. Knuckle-walking, for example, varies ecologically, based on the surface knuckle-walked on. Kivell and Schmitt (2009) in this regard differentiate between “2 fundamentally different biomechanical modes of knuckle-walking: an extended wrist posture in an arboreal environment (Pan) versus a neutral, columnar hand posture in a terrestrial environment (Gorilla).” Both types of knuckle walking associate with species-specific, evolved, environmental adaptations that define their everyday biorealities. These types of locomotion can be understood as creative means to solve problems of locomotion. Such creativity is not necessarily metacognitive in kind. Rather it characterizes the evolutionary process itself, because there is no necessary ontological relationship between either of these environments and the evolution of knuckle-walking. Indeed, other animals have evolved different locomotor means to navigate these surfaces.

Bipedal walking (Vaughan, 2003), typical of humans, also can be defined as a compositional skill that is constituted from alternating steps taken while maintaining an upright posture. This matches the definition given by Schmitt et al. (2022), for example, who define bipedal walking as “…an erect (nonsprawling) posture and a striding (sequenced between right and left) footfall pattern.” While individual steps may combine into random aggregations, the walking skill requires the anatomical balancing of the body in an erect posture while feet alternate over space and time, thereby establishing a (possibly random) trajectory. Merely maintaining an erect posture, for example, would qualify as a standing skill but not as a walking skill. Striding continuously with the same foot instead of alternating feet would qualify as the hopscotching skill. Simultaneously lifting both feet would qualify as the jumping skill. It is only when the different action units nest into a specific and recognizable spatiotemporal composition that the walking skill emerges. Such intentionality lies in the identifiable hierarchical composition of the skill and not necessarily in the cognitive or metacognitive control of that skill. The latter would imply walking at will, or in a certain direction, or with a specific purpose, each of which would require different control hierarchies that make scientific sense of this.

Chimpanzees, bonobos, and orangutans also display bipedal walking. Orangutans, for example, use bipedal walking to navigate on tree branches (Thorpe et al., 2009). Bonobos and chimpanzees display bipedal walking mainly in terrestrial environments; Videan & McGrew (2001) argue that, functionally, bonobos show bipedality for carrying and vigilance, whereas chimpanzees use it for display.

In hominins, walking has over the course of evolution started to interact hierarchically with numerous earthly surfaces, ranging from forest to savanna, desert, mountain, and even aquatic environments. The walking skill in this regard functions to help explore and build new biorealities. Walking, moreover, can interact hierarchically with numerous other skills, such as running, kicking, or dancing.

Eating Skills

Eating is another daily skill displayed by primates and other animals that requires learning and that demonstrates not only species-specific but also community variation, each of which can be subject to evolution. The different action units that compose the eating skill, such as swallowing and chewing or the opening and closing of the mouth, are all skills that can occur individually or aggregationally, or they can function as action units of larger combined or composed skill sets.

The act of eating, however, involves the nonrandom and to some extent linear arrangement of different action units into a compositional skill that minimally includes bringing food to the mouth (by hand or by sucking or biting on the food source) and swallowing it whole or piecemeal through chewing. These different action units might show alterations, but overall, the individual actions need to combine into a determining behavioral sequence to compose into the eating skill.

Many of the subskills that compose into the eating skill are goal-directed or teleonomic, because they involve the organized combination or orderly repetition of different actions over time. Chewing, for example, consists of a combination of teeth grinding and tongue-twisting movements that are repeated over time as a determining action sequence, which becomes embedded into a larger and nested skill set that alternates chewing with swallowing and food intake.

Control for such coordinated skill sets is traditionally attributed to a controlling agent, such as a mental state (Byrne & Whiten, 1990) if not a conscious self (Damasio, 1999). However, swallowing reflexes (Nishino, 2013), for example, can involve but do not necessarily require theory of mind or voluntary control. Intentionality lies in the hierarchically nested and embodied set of actions; teleonomy results from the serial or sequential ordering of the same or different action units over time.

Real-life situations make it so that the already nested eating skill also interacts differentially with the action units of numerous other compositional skills, such as food identification or recognition; food acquisition through hunting, scavenging, foraging, or sharing; or food vocalizations. Such interactional hierarchies underlie creative bioreality formation, because they come to define species and community-specific praxis. Well-studied examples in primates that qualify as establishing such interactional hierarchies between eating and other skills include lip smack expressions and food grunts, leaf-folding, nut-cracking, or termite-fishing. We briefly turn to these in the following paragraphs.

Most primates combine the already compositional eating skill with expressions, such as lip smacking, or vocalizations, such as food grunts (Steiner et al., 2001; Watson et al., 2015). Although these skills interact hierarchically with the eating skill, they do not actually constitute it. Eating can occur in the absence of these skills, and although food vocalizations can be specific and used only in food contexts, lip smacking can be displayed in different social contexts and signal, amongst others, submission or alliance. When food grunts do combine with the eating skill, their interaction has the potential to underlie novel sociocultural situations. Food grunts can be understood as food calls by both the utterers and the hearers, and such calls might be used for the recruitment of social allies or reproductive mates (Kalan & Boesch, 2015). Chimpanzees are also known for trying to voluntarily suppress these food grunts to avoid attracting attention and having to share food (de Waal, 1989). In cases like these, action units from different skill sets interact reticulately and compose into larger, interactional hierarchies that underlie community-based bioreality formation.

Nut-cracking, a skill displayed by West African forest chimpanzees of the Ivory Coast, Guinea, Liberia, and Sierra Leone (Boesch et al., 1994; McGrew et al., 1997), is another example of a compositionally nested skill. The skill hierarchically nests teleonomic skills (repetitive blows with a hammer stone on the nut, or picking up and fiddling with the nut until the fruit is released) into the compositional and intentional skill set (cracking the nut). Joining the nut-cracking skill with eating the nut in turn is an act of creativity, interactionally hierarchical in kind, that underlies a new spatiotemporal bioreality, one where nuts become a recognized part of a chimpanzee’s subsistence strategies. This bioreality differs from older behavioral repertoires, and it can become subject to extragenetic inheritance or cultural transmission at a community level (Boesch et al., 2019; Marshall-Pescini & Whiten, 2008; Neadle et al., 2020). Luncz et al. (2015) in this regard show that immigrating females of neighboring chimpanzee communities in the Taï National Park (Ivory Coast) adjust their nut-cracking behavior to local customs. Reported as an example of conformity bias, the study demonstrates how nut-cracking interacts not only with subsistence but also with sociopolitical strategies.

Leaf-folding, defined by Biro, Sousa & Matsuzawa (2006) as “…the use of leaves that are folded, accordion-like, inside the mouth before being dipped into water and retrieved,” is a behavior that forms part of the bioreality of chimpanzees of Bossou, Guinea, who use the skill to collect, suck, and drink water. The authors differentiate the skill from leaf sponging and leaf spooning, which are more commonly observed in other chimpanzee communities.

Termite fishing (Sanz & Morgan, 2011) is an interactional compositional skill whereby chimpanzees intentionally extract termites from their mounds to eat them. Eating termites is a creative act that also expands chimpanzee bioreality by altering subsistence strategies. The extraction of termites consists of the joining of a series of teleonomic subgoals, including the finger poking of the termite mounds, the stripping of the straws and sticks, the repetitive fishing, and the licking of the straws.

Over the course of evolution, hominins have expanded their diet by identifying more and more environmental resources as food sources (Braun et al., 2010; Mata-González et al., 2023) and by interactionally combining these food sources with, and subjecting them to, skills developed under other circumstances, such as plucking, cleaving, and fire making. Each of these skills consists of a hierarchical nesting of mostly already teleonomic skill sets into larger compositions, of which the action units start to interact creatively over time. The interactional hierarchies that result from such creative interactions in turn underlie bioreality formation.

Nesting Skills

All great apes build nests to rest or sleep in (Coolidge & Wynn, 2006; Goodall, 1962; Koops et al., 2012). The nest-building skill shows species- and community-wide variation that is subject to learning (Baldwin et al., 1981; Brownlow et al., 2001; Schuppli et al., 2016). Infants, who initially share the nest of their parents, can be observed practicing several of the individual parts of the nesting skill before they go on to build their own nests (Fruth & Hohmann, 1996; Fruth et al., 2018; Goodall, 1962; Videan, 2006; Yamanashi et al., 2020).

Nest-building is a hierarchically nested compositional skill that intentionally combines several aggregational and teleonomic skill sets to produce a final construct, the nest. Examples of teleonomic skills that compose hierarchically into the nest-building skill include repeated twig bending and breaking, leaf gathering, carrying, and heaping, and testing the nest for adequacy for shorter-term daytime or longer-term overnight resting (Fruth & Hohmann, 1994; Russon et al., 2007). Beyond the act of nest building, the nest itself can be understood as a hierarchical structure composed of three layers: a solid platform or frame, a central mattress, and a lining that is established by adding leaves and twigs (McGrew, 1992).

Nest building is a compositional skill that interacts hierarchically with the action units of other skills, such as sleeping or infant rearing. Chimpanzees, for example, also show location preference, possibly choosing trees with parasite-repelling properties as well as locations that make them less accessible to predators (Lacroux et al., 2022). Such interactions again establish community-based biorealities.

Grooming, Dental, and Hair Care

Another everyday skill found in primates is body care through grooming (Spruijt et al., 1992). Grooming includes the individual or social nurturing of hair and skin, by brushing, rubbing, picking, scratching, biting, or licking it so that old cells, dirt, or parasites are removed. Individuals often are groomed by parents, peers, or caregivers before they groom themselves or others, indicating a strong participant observational and learned, imitative component to the behavior. In bonobos and chimpanzees, grooming behavior is a learned skill that is transmitted in all five possible directions (Cavalli-Sforza & Feldman, 1981; Sukhoverkhov & Gontier, 2021). Grooming is performed vertically (from parent to offspring), obliquely (from older to younger group members), reversely (from younger to older group members), horizontally (peer-to-peer), and reticulately (between members of different species, for example, with humans). In the latter case, grooming is an act of symbiosis.

Old skin or parasites can itch and bring forth a scratch reflex, but in primate grooming, the body is visually and manually searched and systematically treated with repeated brushes, rubs, bites, pokes, and so on. According to the scheme, any of these individual skills is minimally teleonomic or goal-oriented. When several of these skills combine into a larger grooming session, they become compositional. Grooming also can be creative when new ontological associations are made. In the wild, apes can, for example, integrate sticks or leaves into their grooming routines (Boesch, 1995; Zamma, 2002). Leaf-grooming often is preceded by leaf-clipping (Badihi et al., 2023), and thus a complex series of skills becomes joined creatively into the grooming behavior.

When outcomes or effects of grooming are considered, the ontological complexity becomes phenomenal. Grooming has been shown to contribute to overall hygiene, health, and emotional well-being, and it increases social bonding and alliance formation, which in turn facilitate group cohesion (de Waal, 2007; Dunbar, 1988, 1996; Jablonski, 2021; Masataka, 2016; Schino & Aureli, 2008; Terry, 1970; Tiddi et al., 2012; Torfs et al., 2023; Zamma & Nakamura, 2015).

A well-studied gesture that does not constitute the grooming skill but that hierarchically interacts with it is the grooming-hand-clasp, and this gesture is reported to hierarchically interact with sociopolitical life (van Leeuwen et al., 2023). First described in chimpanzees (McGrew & Tutin, 1978) but also observed in bonobos (Fruth et al., 2006), individuals performing the gesture groom with one hand while they hold their partner’s hand or arm above their head. The gesture varies across communities in regard to what part of the hand or arm is held, and van Leeuwen et al. (2023) show that this variation is subject to both vertical and oblique transmission. The oblique transmission demonstrates conformity bias toward the hand-clasping of older and dominant individuals. Tracking and mapping these interactions brings forth network-like interactional hierarchies that enable a better understanding of how community-based extragenetic biorealities evolve.

Primate grooming also includes tooth-pulling and tooth-picking with grass sticks or twigs (McGrew & Tutin, 1973). Tooth-picking is a creative, interactional compositional skill that intentionally nests teleonomic grass pulling and poking skills with targeted teeth (cavity) tweaking skills. From a hierarchical point of view, the use of leaves for body grooming or toothpicks for dental hygiene is no less creative than the use of grooming to obtain social favors or to establish social alliances. In both cases, new biorealities are opened through the interactions that emerge between different compositional skill sets.

A female chimpanzee at Chimfunshi Orphanage Trust in Zambia, called Noel, was observed using grass sticks while cleaning the teeth of her dead, adopted son, Thomas (van Leeuwen et al., 2017). The researchers who observed the behavior interpreted it as evidence of postmortem care and mourning. Teeth of hominins, including H. habilis, H. erectus, H. heidelbergensis, H. neanderthalensis, and H. sapiens, also show wear-and-tear grooves indicative of systematic tooth picking with objects foreign to the mouth (Frayer & Russell, 1987; Lozano et al., 2013; Nowaczewska et al., 2021; Ungar et al., 2001). Evidence for tooth picking is mostly associated with pain relief. It also might be a first route to creative teeth modification that can include teeth recontouring, grinding, sanding, and filling. Such additional creative skills might subsequently have been attributed symbolic meaning (Nowell & Cooke, 2024).

Other creative, interactionally compositional skills related to dental hygiene are tooth, tongue, and gum cleaning (Niazi et al., 2016; Wu et al., 2001). Humans have invented and continue to invent numerous ways to clean their teeth, ranging from chewing on selective plants or herbs, such as sage (Salvia officinalis) or mint (Mentha spicata and Mentha × piperita) or root sticks, such as miswak (Salvadora persica), to manufacturing toothbrushes and dental pastes. In all cases, tooth brushing involves an interactionally compositional, creative skill, because it results from the interaction of different compositional skill sets (e.g., toothbrush construction or selective plant, herb, bark, or root foraging that becomes linked to dental hygiene), in turn composed out of combinatorial skills (e.g., the chewing and grinding skill, which can be understood as teleonomic, because it consists of goal-directed, repetitive up and down or zig-zag movements of the molars with the leaves or sticks).

There is no necessary structural or ontological relation between (knowledge of) sage, mint, or miswak and periodontal health or disease. Such linkage occurs because different ontological realms are cognitively and physically combined in creative ways. Creativity thus lies in the cognitive and physical linkage of different compositional skill sets. The result is a reduction of dental plaque made up of oral bacteria and overall better mouth hygiene. Without it, the biocultural reality of primates would be more afflicted by disease.

Hominins have come to ontologically associate their skin and body care routines with many more ecologically available resources, such as (hot) water, mud, clay, oil, milk, and cacao, and technologically produced items, such as combs, hairpins, soaps, nail clippers, and foot brushes. From an ontological point of view, none of these resources or items are necessarily or relationally connected to the body. Their linkage demonstrates creative and intentional, compositional behavior that extends cognition and underlies knowledge formation that is in turn subject to sociocultural, extragenetic transmission and modification. An example is hair combing. Compared with other primates, human head hairs or beards require much more hair care, because these hairs continue to grow (Kamberov et al., 2018; Neufeld & Conroy, 2004; Yesudian, 2011).

Hair combing, in the scheme, is a creative skill that uses finger and hand movements or material artifacts, such as a comb, to clean and style the hair. Combing is a teleonomic, linearly organized, hierarchical skill that has a clear beginning (the placing of the fingers or comb in the hair), duration (the stroking of the hair from top to bottom), and ending in time (the release of the hair from the fingers or comb). One such teleonomic combing movement, however, does not comb an entire head of hair. For that, the combing skill must be nested, using additional cognitive and physical strategies. An example is intentional course-following by going from front to back or left to right until the entire head of hair is combed, indicating nested compositionality. Using the fingers or a comb to style hair can be understood as interactionally compositional and thus as an act of creativity. This is because such behavior results from the interaction of different compositional skill sets that include control of fine finger movements to make combing movements, or the manufacture of combs, and linking either to hair care. Note that the combing skill can thus be teleonomic, compositional, or interactional depending on its context of use.

Archaeological evidence shows that combs were invented multiple times, in different shapes and sizes, with designs focusing on functional as well as symbolic use. An Egyptian ivory comb found in Kemet is considered the oldest specimen found so far. It is dated to 6000 years ago (Ashton, 2011, 2013) and shows what Petrie (1891) long ago called the “pick” style. The pick comb style is a complex artistic design wherein a single piece of raw material is divided into three different areas: a bottom part containing the teeth of the comb, a middle part serving as the handle, and a decorative top part, in this case one with horn-shaped endings. The interactions needed to produce such a functional and artistic specimen are numerous and minimally require bone extraction (cutting, splicing) and polishing skills. The imposition of design that is resonant with functional (hand) usage, or that is zoo- or anthropomorphic, shows the presence of symbolic skills that far surpass the creative skills required to make the comb. Such combs, furthermore, often were worn in the hair as hairpins and functioned as symbols of gender, status, and so on, and thereby underlay numerous new ways of worldbuilding.

Younger combs (Arriaza et al., 2014; del Río & Álvarez, 2018; Palma, 1991) hint at different designs (e.g., double edges or multilayered teeth rows) with older functionality (e.g., delousing, detangling, hair smoothening, or hair shaping), as well as manufacturing techniques applied to more malleable materials, such as grass, reeds, ribs of palm leaves, wood, and shells. Many combs also show complex threading and weaving patterns. Weaving is an inherently intentional and compositional skill and often a creative skill insofar as it interacts with design patterning applied to different raw materials, made for different purposes (nets, clothing, tapestry, baskets, etc.). In human cultures, combs also are used to style fabrics or other woven materials. Combs are also used on other species or on material artifacts, such as dolls. They can even be used as scratching tools or perhaps as weapons, depending on the size and shape of the teeth.

While these examples might well indicate a hierarchy of creative and symbolic usage of hair combs, one in need of further exploration, the point is that the very act of combing hair, either with one’s fingers or with a comb, is already an act of creativity because of the ontological interactions it requires between different compositional skill sets.

Conclusions

Skills are subject to evolution and rely on the complex interplay between individual and sociocultural praxis that extends across communities in space and time, where they help with worldbuilding or bioreality formation. This is because skills combine or compose into hierarchical ontological structures, processes, and events. In this paper, I presented a workable epistemological scheme to methodologically differentiate combinatorial from compositional skills. Combinatorial skills are either aggregational or linearly hierarchical depending on whether the action units combine associatively in space, indicating unorderly behavior, or successively or sequentially over time, thereby forming an action series, indicating teleonomy. Compositional skills are either nested or interactionally hierarchical depending on whether the action units result in new hierarchical constructions, indicating intentionality, or in new interactions between existing constructs, indicating creativity.
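As a minimal illustration only, the four-way scheme summarized above can be sketched as a small decision procedure in Python. The names `Form`, `Observation`, `classify`, and `is_compositional` are hypothetical and are not part of the scheme itself. The three observer-coded features, and the assumption that the most encompassing form takes precedence when several features are present, are simplifications made for the sake of the example.

```python
from dataclasses import dataclass
from enum import Enum


class Form(Enum):
    """The four hierarchical forms; each value names what the form indicates."""
    AGGREGATIONAL = "accidental"   # units combine associatively in space
    LINEAR = "teleonomic"          # units follow one another as an action series
    NESTED = "intentional"         # units nest into a new hierarchical construction
    INTERACTIONAL = "creative"     # existing constructs enter into new interactions


@dataclass(frozen=True)
class Observation:
    """Hypothetical observer-coded features of one skill in one context of use."""
    ordered_in_time: bool
    forms_new_construct: bool
    links_existing_constructs: bool


def classify(obs: Observation) -> Form:
    # Assumption: check the most encompassing form first.
    if obs.links_existing_constructs:
        return Form.INTERACTIONAL
    if obs.forms_new_construct:
        return Form.NESTED
    if obs.ordered_in_time:
        return Form.LINEAR
    return Form.AGGREGATIONAL


def is_compositional(form: Form) -> bool:
    # Nested and interactional forms are compositional; the other two are combinatorial.
    return form in (Form.NESTED, Form.INTERACTIONAL)


# The combing example, coded for three different contexts of use.
single_stroke = Observation(True, False, False)   # one stroke from top to bottom
whole_head = Observation(True, True, False)       # strokes nested by course-following
comb_styling = Observation(True, True, True)      # comb making linked to hair care

for label, obs in [("single stroke", single_stroke),
                   ("whole head", whole_head),
                   ("styling with a comb", comb_styling)]:
    form = classify(obs)
    print(f"{label}: {form.name.lower()} ({form.value})")
```

The same skill receives a different classification in each context of use, because the decision depends only on what an observer codes about the action units and not on any claim about the performer's mental states.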

Most primate skills are already compositional and hierarchically nest teleonomic action sequences that necessarily unfold over time, or engage in interactional hierarchies through their action units in space, where they underlie creative bioreality formation. Creativity, intentionality, teleonomy, or accidentality primarily lies in the hierarchical structure of the skill, more so than in a possible mentality of its performance. The hierarchical scheme that I propose makes no claims about metacognition and instead focuses on the different actions needed to portray a complex skill, how these actions combine hierarchically into larger skill sets, and in which spatiotemporal and ontologically reticulate contexts they are used. The approach is agnostic about whether the hierarchically organized skills result from theory of mind, mentalizing, consciousness, or free will, which are used as criteria to differentiate human from other primate and animal behaviors.

A hierarchy-based assessment of accidental, teleonomic, intentional, or creative behavior is better able to deal with real-life situations where organisms often simultaneously display a multiplicity of behavioral and cognitive skills of varying hierarchical complexity. The approach breaks tradition by showing that accidental or teleonomic combinatoriality, or intentional or creative compositionality are not the dominion of one species. Rather, primates and perhaps most animals appear capable of portraying all four forms of hierarchical organization.

Skills also prove to be very dynamic. They are neither bound to one operational sequence nor inherently anchored to one specific hierarchical form of organization. As explained with the combing example, skills can switch from being teleonomic, to intentional, to creative depending on their praxis and context of use. It is the latter that determines whether specific skills are combinatorial or compositional. Delineating the context of use wherein a skill occurs requires an operationalization of parameters through an observational stance. As noted by one referee of this manuscript, such parameters, once identified, enable falsification or empirical confirmation to a greater degree than mentalistic claims, which currently remain hypothetical constructs of the observer.

The approach described here furthermore demonstrates that ontologically, the hierarchical organization of skills can underlie momentary, timely, spatiotemporal, and reticulate bioreality formation. Understanding bioreality formation requires a relational ontological approach that focuses on how processes unfold and change over time (Whitehead, 1978). Beyond the study of cognitive or behavioral rule following, or the linear and structural operational sequences needed to produce skills, research on combinatoriality and compositionality must therefore include research on how skills originate and alter sociocultural and ecological, life-based, and lived biorealities. Most complex skills prove to result from reticulate ontological mergers rather than linear sequencing. This finding is consistent with theories that show that worldbuilding relies on ecological, sociocultural, and cognitive scaffolding (Vygotsky, 1978) and niche construction (Laland et al., 2003; Lewontin, 2002; Magnani, 2021; Sinha, 2024) and thus requires an eco-evo-devo approach.