1 Introduction

Mathematics is so deeply interwoven with our best scientific inquiries that today the hallmark of a good scientific theory is that it has mathematics as an essential part. Insofar as the aim of science is to provide understanding of the world, in modern theories mathematics can play both an explanatory and a heuristic role. Both uses can provide a deeper understanding of phenomena within the scope of interest of a particular theory or account. On the one hand, one of the basic criteria of a successful scientific explanation is that it provides an understanding of the events and phenomena in question. On the other hand, mathematics is often used as a source of concepts, in the sense that, for instance, physical concepts often involve mathematical notions. Thus, the role of mathematics in science is not limited to its explanatory power: mathematics also has heuristic functions, improving understanding by providing fruitful conceptualizations and enhancing the expressive power of the language associated with a given framework or theory.

Much of the discussion concerning the explanatory role of mathematics in natural science focuses on applications of mathematics in physics, astronomy and earth science (though the importance of biological examples is increasing). However, the repertoire of issues and arguments discussed in this debate can be enriched. In particular, as we will show, it is philosophically fruitful to examine the role of mathematics in those approaches to cognitive science that subscribe to the claim that cognition occurs at least partially outside one’s head. Quite apart from their intrinsic interest, these approaches provide a good case study of a certain use of mathematics.

The last 30 years have witnessed a rapid development of an approach according to which cognitive processes are extended beyond the brain or even the organism of the cognizer. There are many related views that stress various aspects of this extension, including embodiment (Clark, 1999; Shapiro, 2019; Varela et al., 1991), enactment (Hutto & Myin, 2012; Varela et al., 1991), situativity (Brown et al., 1989; Clancey, 1997; Robbins & Aydede, 2008), etc. We will refer to all these with the blanket term “extended cognition,” or EC. Some proponents of this approach suggest that the best way to capture the complexity of a cognitive system composed of brain, body and ecological niche is in terms of the mathematical theory of dynamical systems (DST). In order to describe cognition, EC researchers adapt specific notions from DST, including the concepts of phase space, attractor, nonlinearity and bifurcation, and employ coupled differential equations (see Hotton & Yoshimi, 2011; Silberstein & Chemero, 2012; Shapiro, 2019). These methods are often introduced by way of examples from the natural sciences, which creates the impression that the roles DST plays in EC research are the same as in neuroscience, molecular biology, genetics, earth sciences or some fields of social science—and, in particular, that these roles include explanation (for a discussion of the explanatory power of DST, see Berger, 1998; Kaplan & Bechtel, 2011; Kaplan & Craver, 2011; Ross, 2015; Saatsi, 2017). At first glance, this claim sounds reasonable, given the wide applicability of DST in the natural sciences and the omnipresence of mathematical notions in the works of EC researchers.

In this article, we analyze how DST is used by the proponents of EC to deliver scientific accounts of cognitive processes. We demythologize the alleged explanatory power of DST in EC research by showing how limited it in fact is. Even if the dynamical explanations developed in EC research are indeed credible, their explanatory power has to stem from sources other than DST. However, DST has an important heuristic role to play, too. In particular, we argue that using mathematical notions improves the expressive power of the language associated with the EC paradigm and improves our understanding of cognition. The case study of EC allows us to exhibit and analyze this important role of mathematics, which seems to be largely neglected in contemporary discussions.

In Sect. 2, we discuss the general problem of the explanatory role of mathematics in empirical science, stressing the importance of applying a particular type of argumentation in mathematical explanations. In Sect. 3, we present the features of dynamical explanations relevant for our discussion and describe two applications of DST: in neurobiology and in social psychology. In Sect. 4, we turn to cognitive science. We discuss three different uses of dynamical models proposed within the EC framework: in Baber’s (2019) account of creative tool use, in Hotton and Yoshimi (2011) and in Spivey (2007). In Sect. 5, we justify the negative claim of the paper, showing that DST does not play an explanatory role in EC research. In Sect. 6, we set out to justify the positive thesis of the paper and identify the non-explanatory but heuristically important roles that a mathematical theory such as DST can play in cognitive science. We conclude with a brief summary in Sect. 7.

2 Explanatory Power of Mathematics in Science

While there is no generally agreed-upon answer to the question of whether there are distinctively mathematical explanations in science, there is no doubt that mathematics plays a significant role in the best scientific explanations we have.Footnote 1 Depending on the approach, to explain a natural phenomenonFootnote 2 is to identify its cause or underlying mechanism,Footnote 3 or to provide a unified account that connects the phenomenon with other, seemingly unrelated, phenomena. Mathematics can enter scientific explanations at two crucial stages: during the modeling of the phenomenon and, later, in a formal analysis of the model in terms of the entities or operations of a specific mathematical theory.

The explanatory power of mathematics is thus closely linked to problems of modeling, idealization and abstraction in science. Although there is no consensus on this issue (see, for example, Jansson & Saatsi, 2019), a useful distinction has been given by Batterman (2009, 2010) and Batterman and Rice (2014), who identify two general modes of applying mathematical formalism that can be crucial to explanation. We will call these modes mapping and idealization. Although Batterman claims that there are in fact two different sources of the explanatory power of mathematics, real-life explanations vary in their abstractness, and we should rather think of a spectrum: at one end there are explanations involving a relatively accurate mapping, while at the other end there are explanations where the explanatory power of mathematics stems from a great deal of idealization.

In the case of mapping, the explanatory role of mathematics comes from a preferably exact and detailed mathematical representation of the phenomenon of interest. This kind of explanation consists in indicating a mathematical structure (topological, algebraic, probabilistic, etc.) in which the situation is represented. As Bueno and Colyvan (2011, 346) put it, a mathematical structure is chosen to “accurately capture the important structural relations of an empirical set up, and we can thus read off important facts about the empirical set up from the mathematics”. After the embedding of the physical system into an appropriate mathematical structure, some inferences are drawn about the mathematical structure purely from within mathematics. Obviously, the abstract structure used to represent certain features of the physical world has some mathematical (topological, algebraic, probabilistic, etc.) properties that can be identified and described using appropriate theorems. These might be existence theorems (or non-existence theorems), optimization theorems, theorems concerning isomorphisms, etc. The mathematical results obtained in this way are “translated back” and taken to shed light on the target physical system: by showing, for instance, the inevitability or impossibility of a certain course of events.

The second source of the explanatory power of mathematics in empirical science is idealization. In such cases, not only are some details of the phenomenon under investigation omitted but some of its features can also be deliberately misrepresented (Jones, 2005). This kind of strategy is typically used in explanations of the behavior of large-scale systems, where it is required to provide a model “which most economically caricatures the essential physics” (Goldenfeld, 1992, 33). So, while explanations that use mappings tend to seek representational accuracy by adding detail to the description, adding detail to idealization models makes them less explanatory (Batterman, 2009, 430). According to Batterman and Rice (2014), the explanatory power of such models comes from their universality, which is directly associated with the minimal set of features that allows us to describe the behavior of the system in question (or class of systems—i.e., the universality class they form together). This gives us the information that something must either happen or fail to happen as a matter of mathematical necessity, irrespective of the actual physical structure established by contingent causal and nomological connections. This, in turn, allows us to infer the independence of the explanandum from the irrelevant facts and to acquire an understanding of why the omitted or distorted features of the systems are not necessary for the stability or robustness of the phenomenon we are explaining. Lyon’s (2012) analysis of the soap-bubbles phenomenon within his framework of program explanations is a good illustration of this approach.Footnote 4 It also exhibits characteristic features of the approaches where some kind of inherent explanatory power is ascribed to mathematics. Roughly, the claim is that mathematics identifies some fundamental modality (see also Pincock’s (2015) abstract explanations or Lange’s (2017) explanations by constraints).

Typical idealization models represent large-scale structures in which the behavior of low-level objects does not enable us to explain the behavior of the whole structure. They provide successful explanations of a target phenomenon by reference to the distribution of possible outcomes, pure organization (for example, spatial and temporal relationships), or the structure or pattern of behavior of a system (Batterman & Rice, 2014; Berger, 1998; Irvine, 2015). Such models are said to provide explanations without appealing to the causal structure that produces, maintains or underlies the phenomena. Moreover, in such cases, mathematical models (functions, sets of equations, algorithms or computational methods) are often used when the underlying causal-mechanical structure that generates the phenomena to be explained is unknown but a reasonable amount is known about the phenomena themselves.Footnote 5 Construction of abstract models often starts with a specification of the model’s desired output, and the aim is to develop a model that will produce the output while also plausibly matching known features of the phenomenon (Irvine, 2015; Knuuttila, 2011). In the case of idealization models, in order to play an explanatory role, mathematics has to provide us with a list of common features that are necessary for the phenomenon of interest to occur and a backstory about why various omitted details are essentially irrelevant (Batterman & Rice, 2014). This can be done by specifying the scope of changes that could be made either to the physical constitution of the phenomenon or to the contingent laws governing the phenomenon without affecting the validity of the explanation (Jansson & Saatsi, 2019). In this sense, explanations provided by mathematics give a picture of the abstract structure of a target system or of some patterns in its behavior by clarifying spatial and temporal relationships between different variables.

As we can see, regardless of whether we are dealing with mapping or idealization, we can speak of the explanatory power of mathematics when the explanation involves (a) capturing properties of the physical system in a mathematical structure, (b) making inferences about the mathematical structure by appeal to some non-trivial mathematical arguments specific to this structure, and then (c) translating the conclusions back into physical terms. Thus the modeling phase, which consists in indicating a mathematical structure in which the phenomenon is embedded, is only the first step of the explanation. The explanatory role played by mathematics is not assured by merely introducing mathematical notions. If this were so, mathematics would be explanatory “for free”, as virtually all of physics (and chemistry, biology, economics, etc.) uses mathematics in this trivial sense. To put it briefly, using counting numbers is not applying number theory, and the mere use of such DST-specific terms as “bifurcation” does not yet mean that DST has an explanatory role in a given scientific account.

It should also be noted here that describing the process in terms of phases (a), (b) and (c) is a simplification. These phases are usually neither disjoint, nor even clearly delineated. Mathematical tools are sometimes invented for their own sake, but very often they emerge during the analysis of the physical situation in question. So we do not claim that applying mathematics amounts to using some ready-made mathematical toolbox.

For an explanation of a natural phenomenon that draws substantially on mathematics, it is essential that some mathematical argumentation be used in the inferential phase (b). Mathematical arguments often involve using mathematical theorems,Footnote 6 but need not amount to that. Some arguments consist in demonstrating that the equations proposed in the mapping phase do indeed, after some transformations, adequately describe the dependencies among the variables used to characterize the system.

This understanding of the sources of the explanatory power of mathematics can be used to give an account of the most famous examples of explanations that rely on conceptual operations on a mathematical structure in which a physical system is represented: (1) It is impossible to take a walk crossing all the bridges in Königsberg in such a way that each bridge is crossed exactly once. The explanation is provided by identifying a simple fact in graph theory, rather than by examining the physical properties of the system of bridges. Although the explanation does not appeal to any physical features, it offers an answer to the question of how changes in the number of bridges or islands would affect the target phenomenon—i.e., tourability (see Jansson & Saatsi, 2019; Pincock, 2007).Footnote 7 (2) Honeycombs have a particular geometric structure, which is explained by an optimization theorem: hexagonal tiling minimizes the total perimeter length (Hales, 2000, 2001). The bees therefore minimize the amount of wax, which gives them an evolutionary advantage.Footnote 8 The explanation also gives an account of why other polygons used to build a honeycomb would be less effective. (3) The Borsuk–Ulam theorem offers a mathematical explanation of the fact that there are always two antipodal points on the surface of the Earth where both temperature and pressure (or any other similar parameters) are equal (Baker, 2005, 2009; Baker & Colyvan, 2011).Footnote 9 (4) Bangu (2013) discusses a simple example of a game (drawing bananas from boxes), where the phenomenon to be explained is that one of the players wins overwhelmingly often. The explanation is provided by the weak law of large numbers (and also shows why, for instance, the phenomenon would not occur if the rules of the game were different).Footnote 10 Needless to say, this list is not exhaustive.Footnote 11
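To make the graph-theoretic core of example (1) concrete, consider the following sketch (our illustration in Python; the labels of the land masses are conventional, not drawn from the cited literature). It encodes the seven bridges as a multigraph and checks Euler’s condition: a connected multigraph admits a walk crossing every edge exactly once only if at most two of its vertices have odd degree. Königsberg has four odd-degree land masses, so no such walk exists.

```python
from collections import Counter

# The four land masses of eighteenth-century Königsberg and its seven bridges,
# encoded as a multigraph (a plain list of edges). Labels are conventional.
bridges = [
    ("A", "B"), ("A", "B"),   # two bridges between the island and the north bank
    ("A", "C"), ("A", "C"),   # two bridges between the island and the south bank
    ("A", "D"),               # island to the eastern land mass
    ("B", "D"), ("C", "D"),   # each bank to the eastern land mass
]

# Degree of each vertex = number of bridge endpoints touching it.
degrees = Counter()
for u, v in bridges:
    degrees[u] += 1
    degrees[v] += 1

odd_vertices = [v for v, d in degrees.items() if d % 2 == 1]

# Euler's condition: a connected multigraph has a trail using every edge exactly
# once iff it has exactly zero or two vertices of odd degree.
print(dict(degrees))        # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print(len(odd_vertices))    # 4 -> no walk crossing each bridge exactly once exists
```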

3 Dynamical Explanations

3.1 Dynamical Models

Dynamical explanation strategies make use of the theory of nonlinear dynamical systems. More specifically, they rely on a set of its analytic tools (nonlinear differential and difference equations) and geometric modeling techniques (constructions of multidimensional configuration or phase spaces) to describe the phenomena to be explained. Usually, these phenomena concern the behavior of complex macroscopic physical systems, especially when their evolution in time is important. Such systems can be characterized by a set of variables that describe their various elements or aspects of behavior. The aim is to find equations that best capture the dependencies among these variables. These modeling techniques allow us to describe and predict rates of change among system variables over time (Witherington, 2018, 41).

Dynamical explanations give an account of patterns and regularities (or lack thereof) in the behavior of the system. The mathematics of nonlinear dynamics captures nonlinear relations between the components of a system and allows us to identify stable patterns of behavior at the higher level that are associated with much variability at the individual level (Guastello & Liebovitch, 2009). In many cases, when the object of interest is long-term behavior or its temporal structure, a computer simulation is used to iterate the model’s time dynamics and study whether there is a shift in the equation’s solution that could uncover new levels of patterning.Footnote 12
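As a minimal illustration of what iterating a model’s time dynamics can reveal (a generic textbook example of our own choosing, not a model taken from the works cited above), consider the logistic map: a one-line nonlinear difference equation whose long-term behavior shifts from a fixed point to oscillation to aperiodic wandering as a single control parameter is varied.

```python
def long_term_pattern(r, x0=0.2, transient=500, keep=6):
    """Iterate the logistic map x_{t+1} = r * x_t * (1 - x_t) and return a few
    post-transient values, which reveal the long-term pattern of the dynamics."""
    x = x0
    for _ in range(transient):
        x = r * x * (1 - x)
    tail = []
    for _ in range(keep):
        x = r * x * (1 - x)
        tail.append(round(x, 4))
    return tail

# A qualitative change in the long-term behavior as the control parameter r grows:
print(long_term_pattern(2.8))   # settles onto a single fixed point
print(long_term_pattern(3.2))   # stable oscillation between two values
print(long_term_pattern(3.9))   # aperiodic ("chaotic") wandering
```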

These techniques can be used to reveal the system’s behavior under many types of circumstances and thereby to give an understanding of the general principles governing the way some phenomena unfold over time. In the past 40 years, dynamical explanations have become commonplace in many special sciences, including geology, biology, medicine, ecology, neuroscience, developmental psychology and sociology. As Berger (1998, 309) points out, they are of particular use in cases where it is possible to identify equations that describe a system but we are unable to solve those equations analytically.

3.2 Explanatory Power of Dynamical Explanations

Although many examples of dynamical explanations are known (the swinging pendulum, the motion of celestial bodies, the growth of crystals, the formation of traffic jams, etc.), there is no agreement as to whether their explanatory power comes from mapping or from idealization. One approach, especially popular among theorists of cognitive science, is to regard dynamical explanations as a species of covering-law explanations (for discussion, see Bechtel, 1998; Bechtel & Abrahamsen, 2002; Chemero, 2009; Chemero & Silberstein, 2008; Clark, 1997, 1998; van Gelder, 1995, 1998; Walmsley, 2008). According to others, dynamical explanations are a sort of causal (Strevens, 2008) or mechanistic explanation (Kaplan & Craver, 2011; Zednik, 2011). These accounts assume that the role of mathematics in dynamical explanations is mainly representational and that information about the mathematical structure can be reduced to information about causes, physical laws or complex mechanisms. The explanatory power of dynamical models stems from their representational properties, i.e., from the fact that the mapping between the model and the target system is faithful. They can capture the change over time of some physical magnitudes as a set of (differential) equations. For the proponents of the mechanistic account of explanation, the variables in the model correspond to components, activities, properties and organizational features of the mechanism that produces the phenomenon in question, and the dependencies posited between these variables in the model correspond to the causal relations between the components of the mechanism (Kaplan, 2011; Kaplan & Craver, 2011). Thus, mathematical descriptions help to explain a phenomenon by giving a quantifiable account of the causal structure of the phenomenon. From this point of view, mathematical structures and derivations within DST are explanatory insofar as they accurately represent causal relations between the entities or properties of a mechanism. These accounts allow for idealization, and thus for the omission of causal factors irrelevant to the emergence of the phenomenon being explained, but still demand that, for a model to be explanatory, it must provide an adequate representation of the causal relationships between the components of the target system (Craver & Kaplan, 2018; Strevens, 2008).

Note, however, that although dynamical explanations allow us to identify certain types of factors that contribute to the emergence of complex behaviors and, thereby, sometimes guide the search for plausible causal or mechanistic models, dynamical models as such reveal only the temporal and spatial features of a physical system, without necessarily capturing the causal processes, interactions or mechanisms that underlie the phenomenon under investigation. A dynamical model incorporates a small number of selected features of the physical system under investigation, ignoring the system’s causal history. In this sense, such models have counterfactual-supporting properties: they provide information about how a system would behave, were some of its structural features different. As a result, dynamical models are particularly often referred to as good examples of non-causal or abstract explanations (see Berger, 1998; Ross, 2015; Saatsi, 2017). Additionally, regardless of how one views the relationship between explanation and prediction, most dynamical models also have predictive power: they can correctly forecast future values of modeled magnitudes in a time series.

3.3 Two Applications of Dynamical Explanations

To see how mathematics contributes to dynamical explanations, we will discuss two examples of such explanations offered within broadly construed cognitive science: (1) Ross’s (2015) analysis of a dynamical model that explains patterns of behavior displayed by physically distinct neural systems and (2) a dynamical framework that was used by Nowak et al. (2005) to explain some properties of social change.

3.3.1 The Case of Neuroscience

Neuroscience research focuses mainly on the role of neurons, neuronal circuits and synaptic organization in information-processing. From the molecular perspective, neurotransmission is explained in terms of a neuron model consisting of voltage-gated ion channels sensitive to sodium and potassium ions. However, knowing the anatomical structure of the synaptic circuits and the electrophysiological properties of neurons is not sufficient to determine what the cell is doing and why it is doing it (McCormick, 2004). It is empirically established that cells having nearly identical currents can exhibit different dynamics (Rinzel & Ermentrout, 1989) and hence different neurocomputational properties. On the other hand, neurons that differ radically in their microstructural details are known to exhibit the same type of excitability (Izhikevich & Hoppensteadt, 1997, 33).

Dynamical systems neuroscience explains these qualitative features of neural systems independently of their functioning at the molecular level. In this case, the dynamical modeling consists in constructing coupled differential equations that capture the variability over time of various properties of molecularly different neurons (i.e., the variables in the equations are characteristics of the neuron’s behavior). The explanation of why various types of neurons can exhibit the same patterns of excitability lies in the topological equivalence of different classes of differential equations (i.e., the structure of the possible trajectories is the same). By applying dynamical systems techniques and the Ermentrout–Kopell theorem (1986), it is possible to show that all such models can be reduced to the so-called canonical model. This simple model with one variable captures the crucial characteristics of neuron excitability.

All such systems exhibit the same change in topological structure in their transition from the resting state to sustained firing. A neuron is excitable because its resting state is near a bifurcation, i.e., a transition from quiescence to periodic spiking (Izhikevich, 2007, 159). According to the Ermentrout–Kopell theorem, all such models exhibit a saddle-node on a limit cycle bifurcation and are reducible to a simple one-dimensional model of neuron spiking (the canonical model).
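For orientation, the canonical model in question is usually presented as the so-called theta model; in one standard formulation (our simplified gloss on the Ermentrout–Kopell reduction), a single phase variable $\theta$ on the circle evolves according to

$$\dot{\theta} = (1 - \cos\theta) + (1 + \cos\theta)\, I,$$

where $I$ stands for the input current. For $I < 0$ the equation has a stable resting state; as $I$ passes through zero, this rest state collides with a saddle point on the limit cycle (the saddle-node on a limit cycle bifurcation mentioned above) and the model begins to spike periodically.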

To sum up, the sameness of behavior exhibited by various physically different systems is explained by providing mathematical models for molecularly different neural systems and then using, in the derivation phase, the Ermentrout–Kopell theorem to show that each of these models can be reduced to the canonical model. Ross (2015) pointed out that the canonical model and the mathematical techniques used to obtain it also explain why one can abstract away from the molecular details of neural systems to capture their higher-level behavior. Batterman and Rice would express this by saying that it amounts to showing why such distinct systems belong to the same universality class.

3.3.2 Dynamics of Social Change

Social influence refers to any change in an individual’s thoughts, feelings or behavior that occurs as the result of interaction (real or imagined) with others (Rashotte, 2007). The notion of social influence allows researchers to understand a broad range of social phenomena, such as conformity, obedience, persuasion and peer pressure. Dynamic social impact theory (Latané, 1996) views society as a self-organizing complex system in which individuals interact and impact each other’s beliefs. In this account, the attitudes and beliefs of a single individual depend on the attitudes and beliefs of other individuals with whom he or she interacts.

Nowak et al. (2005) used these ideas to offer a dynamical model that explains an electoral pattern that appeared during the transition years, in the late 1980s and early 1990s, in post-Communist countries in Europe. The pattern was that post-Communist parties suffered a crushing defeat in the initial elections but often won subsequent elections. Roughly speaking, the aim was to explain how the distribution of opinions at a given stage of political transition influences the distribution of opinions at subsequent stages.

Nowak et al. (2005) based their explanation of the electoral pattern in societies undergoing a rapid social transition on the fact that, at a given stage, individuals adopt the opinions that they regard as the most influential. Nowak and colleagues explore the behavior of a relatively simple dynamical model that ascribes three properties to each individual: an opinion on a given topic (individuals are assumed to have one of two opinions on an issue), a degree of persuasive strength (how credible or persuasive the individual is) and a position in a social space (a decreasing function of group members’ social distance from the individual) (Nowak et al., 1990). Generally speaking, this model describes a social conflict in which two opposite groups interact, which leads to societal change. The amount of impact other people have on an individual’s attitude or belief depends on the number of people exerting and receiving the influence, the respective strength of these individuals, and their proximity to one another (Nowak et al., 2005). At every stage of an interaction, each individual is assumed to discuss the issue with other group members and assess the degree of support for each position. The assessment consists in adopting the most prevalent opinion: the opinions of those who are closest to the individual and have the greatest persuasive strength have the most influence on the individual’s opinion. The strength of influence of each opinion is expressed by the formula

$$I_{i}=\left(\sum_{j=1}^{N}\left(\frac{s_{j}}{d_{ij}^{2}}\right)^{2}\right)^{\frac{1}{2}}$$

where $I_i$ represents the total influence, $s_j$ denotes the persuasive strength of individual $j$, and $d_{ij}$ represents the distance between individuals $i$ and $j$ (Vallacher & Nowak, 2007).

The time dynamics for each individual (his or her influence on others and the change in opinion) can be calculated at a given stage. Computer simulation allows us to iterate this procedure until the system has reached an equilibrium and there are no further shifts in opinion. Routinely, two kinds of outcomes are observed at the group level: polarization of opinions and clustering, which together explain why a minority opinion can survive (Latané et al., 1994). Based on a formal analysis of the Nowak, Szamrej and Latané model, one can formulate the statistical mechanics of such models, which can then be used to describe the dynamics of social impact (Lewenstein et al., 1992).
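To convey what the iterated procedure amounts to in practice, the following schematic sketch (our own illustrative simplification; grid size, strength distribution and the deterministic update rule are arbitrary choices, not those of Nowak and colleagues) places agents on a grid and lets each of them repeatedly adopt whichever opinion exerts the greater total impact, computed with the formula above:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20                                                      # agents occupy an N x N grid
opinion = rng.choice([-1, 1], size=(N, N), p=[0.3, 0.7])    # initial minority holds opinion -1
strength = rng.uniform(0.1, 1.0, size=(N, N))               # persuasive strength s_j

# Pairwise Euclidean distances between grid positions (self-distance set to 1,
# so each agent's own strength also supports its current opinion).
xs, ys = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
pos = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
np.fill_diagonal(dist, 1.0)

def total_impact(op_flat, s_flat, which):
    """I_i = sqrt(sum_j (s_j / d_ij^2)^2), summed over those holding opinion `which`."""
    mask = (op_flat == which).astype(float)
    contrib = (s_flat * mask / dist**2) ** 2
    return np.sqrt(contrib.sum(axis=1))

for step in range(50):
    op_flat, s_flat = opinion.ravel(), strength.ravel()
    impact_pro = total_impact(op_flat, s_flat, +1)
    impact_con = total_impact(op_flat, s_flat, -1)
    new_opinion = np.where(impact_pro >= impact_con, 1, -1).reshape(N, N)
    if np.array_equal(new_opinion, opinion):                # equilibrium: no further shifts
        break
    opinion = new_opinion

print("rounds until equilibrium:", step)
print("share still holding the minority opinion:", (opinion == -1).mean())
```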

In the case of the explanation of the electoral pattern observed in post-Communist countries in Europe proposed by Nowak et al. (2005), predictions concerning spatial–temporal patterns of change were derived using computer simulation. The hypotheses were then successfully tested by comparing the model’s output against archival data about real voting patterns and entrepreneurship in Poland in the 1990s.

3.3.3 Explaining with DST

Both explanations we have described, from neuroscience and social change theory, attempt to explain why a phenomenon occurs in a variety of circumstances. In the former case, the question concerns a particular pattern of activation in microstructurally different neural systems; in the latter, a pattern of electoral choices observed in societies undergoing rapid transition. In both cases, the explanandum is the functioning of higher-level units (systems), whereas the explanans provides a mathematical representation of the system’s behavior. In these explanations mathematics enters at two stages: at the modeling phase and, more importantly, at the phase of formal model analysis. Undoubtedly, the explanatory power in these cases does not come simply from representing the phenomenon as accurately as possible. Carefully selected dependencies among the variables used to describe the system make it possible to find equations, which are then formally transformed. Through the further use of mathematical or computational techniques, an argument is given in which certain properties of the mathematical system are demonstrated. These properties explain rates of change among system variables over time and capture patterns of behavior of the higher-level system despite much variability at the lower level.

In the neuroscience example, the explanandum was the observation that the anatomical structure of a neural circuit and the electrophysiological properties of neurons do not determine the behavior of the system. Hence, in constructing the explanation, much of the detail of the physical systems was omitted at the mapping stage. Mathematical structures were identified that captured the essential relationships within the physical setups. Formal analysis of the topological properties of these structures proved that the result of development over time will be the same for all the setups.Footnote 13 Without applying mathematical results and tools from DST, it would not be possible to explain the phenomena in question in the present paradigm.Footnote 14

In the second of the examples discussed, a great deal of idealization was done at the mapping stage when the explanation was constructed. The explanandum was an electoral pattern (distribution of opinions) in countries that shared certain initial characteristics. For the explanation, interactions between only three potentially relevant factors were considered. Thus, Nowak et al.’s explanation shows that a certain minimal set of features of the physical setup makes it possible to describe the behavior of the entire system. However, there is no doubt that the actual change in the opinions of individuals is influenced by many psychological, social and economic factors. It is not at all clear that all these types of influence can indeed be captured by an individual’s susceptibility to the strength of the authority of others in his or her immediate environment. In the case of the proposed explanation, then, one can probably follow Goldenfeld (1992) in speaking of a caricatured representation of the actual physical setting.

As we can see, the dynamical model used in the computer simulation was an idealization based on unrealistic assumptions about the causes of opinion change. The explanatory power of the simulation comes from robustly generating the target phenomenon (the observed electoral pattern). Even if we agree that the dynamical model of Nowak et al. is only partially explanatory, its explanatory power stems from the fact that DST captures, in a systematic way, the dynamics of change over time that leads to the phenomenon being explained. What is more, it informs us about what would happen if the conditions were different. Thus, DST seems to play an explanatory role here. But we will argue that this is not the case in EC research, where in many cases the appeal to DST is strikingly different from how it is used in the examples described above.Footnote 15

4 Dynamical Systems Theory in Extended Cognition Research

The research program of EC is based on the idea that some aspects of the agent’s body, beyond the brain, and of the agent’s environment play a significant causal or constitutive role in cognitive processing.Footnote 16 A common assumption found in most EC theories is that the manner in which sensorimotor capacities, body and environment interact is crucial for an understanding of the nature of mind and cognition. As such, EC theories stand in stark contrast to computational and representational accounts of the mind, which regard the mind as a device to manipulate symbols and attempt to explain cognition by focusing on the organism’s internal cognitive processes.

The idea to conceive of cognitive processes in dynamical rather than computational terms was first introduced by van Gelder (1995, 1998),Footnote 17 who illustrated it with a suggestive example of Watt’s centrifugal governor, a mechanical device used to regulate the speed of a steam engine. An increase in steam pressure causes a shaft, with two flyballs attached to it, to rotate at a greater speed, which, in turn, causes a throttle valve to close a little, thereby reducing steam flow. The crucial point is that a centrifugal governor works by creating a direct coupling between the relevant parameters of the engine, which means that the link between the parameters is not mediated by any computations. Hence, we do not need the notion of an algorithm, or a Turing machine, to explain what is going on. According to van Gelder, it is the Watt governor, rather than the Turing machine, that is the appropriate model of cognition (van Gelder, 1995, 381).
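For the record, in standard presentations of this example (see van Gelder, 1995) the arm angle $\theta$ of the governor is taken to obey an equation roughly of the form

$$\frac{d^{2}\theta}{dt^{2}} = (n\omega)^{2}\cos\theta\,\sin\theta - \frac{g}{l}\sin\theta - r\,\frac{d\theta}{dt},$$

where $\omega$ is the engine speed, $n$ a gearing constant, $g$ the gravitational constant, $l$ the length of the arms and $r$ a friction coefficient; the coupling arises because $\omega$, in turn, depends on the throttle setting determined by $\theta$.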

The idea is gaining currency, and many people working in embodied cognition declare that they use DST as an explanatory tool. From the EC point of view, a particularly appealing property of DST is that it allows us to conceive of the agent’s nervous system (in particular, the brain), body and environment as coupled dynamical systems. DST also provides the tools to capture the temporal nature of cognitive processes and the tendency of many cognitive systems to self-organize. Beer (2014) proposes to distinguish between two approaches to the dynamics of cognitive systems: (i) the dynamical framework, in which specific concepts and metaphors are used to talk about the dynamicity of a system, and (ii) dynamical systems theory, where the mathematical theory is applied to characterize systems whose states change over time in a systematic way (we will call this the dynamical systems approach). On the dynamical systems approach, an explanation should consist of both a set of (differential or difference) equations that capture the behavior of a system and a dynamical analysis using the tools of DST to describe the abstract properties of the model’s mathematical structure (see Zednik, 2011). We will try to find specific theories formulated within the dynamical systems approach but, as we will show, it is not always easy to determine which of the two approaches is adopted by a given author.

In the literature on dynamical explanations in cognitive science, there are basically two strategies for introducing the idea of using DST to model the behavior of a cognitive system: the first is to give an example of an effective application of DST to explain some behavioral phenomenon, and the second is to provide a simplified example of how DST could work to account for cognition.

As to the first strategy, there are practically only two successful applications of DST in the domain of EC: namely, the account of dynamic coordination in motor behavior (Spivey, 2007; Zednik, 2011) and the model of perceptual adaptation (Beer, 2003; Warren, 2006; Wilson & Foglia, 2017). These explanations had been known long before the emergence of EC and, as such, were only incorporated into the imaginarium of EC as standard DST success stories [for pre-EC work on bimanual coordination, see Haken et al. (1985) and Kelso (1984); on perception, see Gibson (1966, 1977) and Rosen (1978)]. Therefore, it is impossible to claim that these applications of DST have contributed to solving any problems arising originally in EC research.

4.1 Creativity and Craftsmanship

Baber (2019) proposes a dynamical explanation of the creativity of a jeweler crafting a piece of work (see also Baber et al., 2019). In general, the problem is how to explain the proficiency of an agent at using a tool to achieve a desired end.

The dynamical system here consists of an agent, the tool and an object. The agent uses sensory data from the ongoing work to modify his or her actions in such a way as to move the work toward a specific goal. While working, the craftsman has to creatively solve problems by responding to “the opportunities offered within the space of constraints”. These spaces are determined by the interactivity of the tool, the type of material being processed and the proficiency of the agent. Spaces of constraints are defined by order parameters that can be affected by control parameters which, in turn, can be modified by the agent.

The craftsman has to appreciate the control parameters to continually adjust his or her actions to meet the demands of the situation. There are local fluctuations in the performance, connected to locally unexpected perturbations, but, over longer periods of time, it is possible to observe repeatable patterns of “crafting behavior” that emerge. The variability in performance can usually be described in terms of pink noise. Indeed, this DST notion can be taken to correspond to the craftsman’s level of dexterity.

4.2 Playing Scrabble

As to the second strategy, Hotton and Yoshimi (2011) introduce the idea of using DST to model cognitive processes in line with the EC approach by giving a suggestive impression of how dynamical explanation could work. They develop a formalism that can be used to compare a dynamical analysis of an agent isolated from the environment with a dynamical analysis of the agent embedded in the environment.

A large part of their paper is devoted to a detailed presentation of the notions of DST relevant to their goals. In particular, they discuss the Hopfield network as a model of a neural network that is embedded in a simple one-object environment and show how to model representational processes in such cases. Again, some mathematical facts about such networks (for instance: with constant inputs, there are only fixed-point attractors) are given and analyzed. The very basic and abstract examples they supply are intended to illustrate how their explanation could work in principle. Then they voice a “further-in-similar-fashion” credo, claiming that more complicated dynamic cognitive phenomena can be captured in essentially the same way.
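The mathematical fact at issue can be made vivid with a few lines of code. The sketch below (our own minimal illustration; the network size, the stored pattern and the corruption level are arbitrary) implements a standard binary Hopfield network with symmetric weights, zero self-connections and asynchronous updates, and with constant (here zero) external input; under these conditions each update can only decrease the network’s energy, so the state settles into a fixed-point attractor.

```python
import numpy as np

rng = np.random.default_rng(1)

# Store a single +/-1 pattern with the Hebbian rule (symmetric weights, zero diagonal).
pattern = rng.choice([-1, 1], size=16)
W = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(W, 0.0)

def energy(state):
    return -0.5 * state @ W @ state

# Start from a corrupted version of the stored pattern.
state = pattern.copy()
flipped = rng.choice(len(state), size=4, replace=False)
state[flipped] *= -1

# Asynchronous updates: pick units in random order, align each with its local field.
for sweep in range(20):
    changed = False
    for i in rng.permutation(len(state)):
        new_value = 1 if W[i] @ state >= 0 else -1
        if new_value != state[i]:
            state[i] = new_value
            changed = True
    if not changed:          # no unit wants to flip: we have reached a fixed point
        break

print("final energy:", energy(state))
print("stored pattern recovered:", bool(np.array_equal(state, pattern)))
```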

They concentrate on cases where some artifacts are used to perform better in tasks that could theoretically be carried out without external help. As an example, they discuss the game of Scrabble. It has been shown that subjects equipped with physical tiles that they can manipulate are better at producing words than subjects performing the same task “in their heads” (Maglio et al., 1999). The situation of Scrabble players is analyzed in terms of a system that consists of an embedded agent and the artifacts used. However, no detailed formal follow-up in terms of DST is given that might justify the claim that the use of mathematical vocabulary is essential.

4.3 Mental Representation

Another interesting example of adopting the second strategy is Spivey’s presentation of his approach to modeling mental representations as continuous phenomena. He employs concepts from DST, for instance the notion of a dynamical trajectory through state space that should replace the notion of a static symbolic representation (Spivey, 2007, 5).

Spivey offers an analysis of a categorization phenomenon: agents have to categorize a dish as a mug or a bowl (“the typical tall and thin cup versus the typical low and wide bowl”) while the shape of the object is changing (in 10% increments in both the vertical and horizontal dimensions). In some cases, the reaction times were longer, which is explained in terms of dynamical systems: as there are two competing attractors (cup versus bowl), the journey of the mental states through the phase space might take a different time (depending on the parameters).
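The intended picture can be made concrete with a toy system (our illustration of the general idea, not Spivey’s own model): in a one-dimensional gradient system with two point attractors, trajectories starting near the boundary between the basins of attraction take considerably longer to settle than trajectories starting deep inside a basin, which is precisely the qualitative pattern invoked to account for the longer reaction times.

```python
def settling_time(x0, dt=0.01, tol=1e-3, max_steps=100_000):
    """Euler-integrate dx/dt = x - x**3 (stable attractors at x = +1 and x = -1)
    and return the time needed to come within tol of one of the attractors."""
    x, t = x0, 0.0
    for _ in range(max_steps):
        if abs(abs(x) - 1.0) < tol:
            return round(t, 2)
        x += dt * (x - x**3)
        t += dt
    return float("inf")

# Starting deep inside a basin vs. starting near the boundary between the basins:
print(settling_time(0.8))    # settles quickly ("fast reaction")
print(settling_time(0.05))   # much longer: the state lingers near the unstable point
```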

Spivey emphasizes the dynamic and continuous character of the process: the mind is not static; it is in constant motion through the neural state space. (Another simple example discussed by Spivey is the Necker cube, whose perspective seems to change depending on how long we look at it.) There are numerous examples of such “shifting perspective phenomena”—and their description might involve the image of a neural system traversing intermediate regions of the state space, in particular approaching the attractors (Spivey, 2007, 23).

These are merely some examples of how DST is used in EC research. A natural question to ask is whether, in such cases, DST performs any explanatory function, as it does in neuroscience or social change theory. We analyze this in two subsequent sections.

5 Dynamical EC Models Lack Explanatory Power Conferred by Mathematics

We have described cases from neuroscience and social psychology where mathematics can be taken to perform some genuinely explanatory work. Can the same be said about DST in EC research? Shapiro (2019) discusses Watt’s centrifugal governor in this connection, pointing to the system of differential equations (and the mathematical characterization of the equilibrium state) as an important source of the model’s explanatory power. We grant that this is a nice example of a mathematical explanation in physics. But cognitive science and physics are different fields. According to Shapiro (2019, 167–173), DST is genuinely explanatory in EC research and, more importantly, the models proposed in embodied cognitive science work in exactly the same way as the dynamical explanations we know from physics. We disagree: in EC, mathematics does not (perhaps only for now) play a genuinely explanatory role.Footnote 18 Moreover, due to the underdevelopment of their mathematical models, the EC accounts have no special predictive power, nor do they offer counterfactual support.

In Sect. 2, we stressed the importance of mathematical theorems featuring in the derivation phase of mathematical explanations in empirical science. For instance, the behavior of predators (see footnote 6) is explained by appeal to an appropriate optimization theorem, meteorological phenomena are explained via the Borsuk–Ulam theorem, aspects of bees’ behavior are explained by the Honeycomb theorem, etc. The example from neurophysiology is similar: it invokes the Ermentrout–Kopell theorem. To repeat: we need to appeal to an argument that has some mathematical facts as its premises in order to explain a particular phenomenon. Using Batterman and Rice’s classification of explanatory modes (“common features accounts”, which largely depend on representations—i.e., mappings—versus minimal models based on the notion of a universality class), we can say that, regardless of whether the applications of DST in neuroscience or social psychology fall into the first or the second class, it is crucial that they make non-trivial use of specific mathematical arguments based either on theorems or on other types of transformations, in particular of a qualitative or mixed nature.

We take such non-trivial use of mathematical argumentation to be a necessary condition for mathematics to play an explanatory role in scientific theorizing. Dynamical models in EC fail to meet this constraint. Consider Spivey’s theory. It is couched in terms of DST, but it invokes neither a theorem nor quantitative relationships to account for the categorization phenomena it aims to explain. Rather, the conceptual apparatus of DST serves merely as a means to the creation of a suggestive narrative. The same goes for the example of Scrabble: a story is told, but no theorem is directly used in the explanans. The dynamical explanation of jewelers’ creativity given by Baber (2019) fails in exactly the same way: it features some mathematical notions, such as that of the attractor or pink noise, but it employs DST merely to describe the phenomenon in mathematical terms. No differential or difference equations and no geometric modeling techniques characteristic of dynamical explanations are used. This description—at most—fulfills some representational role, but there is no proper mapping, not to mention a proper derivation phase. No mathematical theorems (which identify the logical dependencies between concepts) are directly involved in this description. It seems that the whole reasoning might also be expressed in natural language without any loss of explanatory power. Likewise, in Hotton and Yoshimi’s work, there is a considerable gap between “DST in neural networks” and “DST in EC”. While Hopfield networks may indeed be a good model of neurophysiological phenomena (this is not an issue we are going to discuss here), there is no transition from this to the claim that DST is genuinely explanatory in EC. Such a claim cannot be justified in terms of minimal models as universality classes (Batterman & Rice, 2014): indeed, the “clustering” into universality classes would be extremely general. Such a class would consist of a very broad array of systems evolving in time. But this is not enough to provide an explanation; in particular, such a general description does not inform us about what would happen if the conditions were different. All this is connected with the fact that, in such a general setting, no theorems can be used. Incidentally, the only theorem cited in Shapiro’s Handbook of Embodied Cognition (2014) is a version of Arrow’s theorem concerning group preferences.

It is also impossible to justify the explanatoriness of DST within EC in terms of program explanations, where mathematics enables us to identify the relevance of abstract properties of the system which, though not causally efficacious themselves, nevertheless program “the instantiation of causally efficacious properties and/or entities that causally produce the explanandum” (Lyon, 2012, 567). It is a theorem that convinces us that some scenario is inevitable (or impossible, or has to unfold in some special way).Footnote 19 But due to the lack of a mathematical model, or of any quantification of the relationships between the distinguished elements of a complex system evolving over time, it would be extremely hard to identify such programming properties in the analyzed examples of DST within EC research. Instead, we find exceedingly general statements such as “systems evolving in time change”, which cannot explain anything.

Explanations by constraints are studied in great detail by Lange (2017), who stresses their modal ingredient: the explanans involves some constraints, which possess “some variety of necessity stronger than ordinary laws of nature possess” (Lange, 2017, 10). In particular, a mathematical theorem can do explanatory work because theorems put constraints on possible courses of action.Footnote 20 And again, no explanatory constraints of this kind are presented in any of the examples described in Sect. 4. An interesting story is provided, which can indeed be scientifically fruitful (we discuss its possible impact in the next section), but this is not an explanation in any of the aforementioned senses.

So, regardless of what one thinks about the nature of explanation—whether one is partial to program explanations, abstract explanations, explanations by constraints, minimal models, universality classes, etc.—we claim that DST does not play an explanatory role in EC research, at least not in the standard sense. Moreover, it does not yield predictions and it is even hard to tell whether the models can be tested. It is only a slight exaggeration, in our opinion, to say that applying DST in EC research boils down to formulating very general statements, such as “over time, systems undergo changes due to the influence of diverse factors”. That only such superficial claims are on offer has deeper reasons: in order to apply, in a reasonable way, a mathematical argument that has as its premise a theorem or equations stating quantitative relationships between variables, we need to have a mathematical model. Such an argument is applied in order, for example, to predict the behavior of the system or to describe its features (or to identify some deeper reasons behind a phenomenon). Merely using mathematical notions is not sufficient to claim that it is mathematics that plays an explanatory role (otherwise, the claim would be utterly trivial, as virtually all of physics uses mathematical terms). And, according to Botvinick (a critic of the approach), “(A)t least in some instances, highfalutin terms are applied to what appear to be fairly pedestrian concepts” (Botvinick, 2012, 79).Footnote 21

However, despite this lack of explanatory value, we think that DST plays an important heuristic role in EC research, which we discuss in the next section.

6 The Heuristic Role of Dynamical Systems Theory in Extended Cognition Research

The notion of understanding used to be absent from philosophy of science because it was standardly thought to merely denote a psychological byproduct of the scientific process. The situation has changed, though, and the phenomenon of scientific understanding is now receiving more and more attention (see, for instance, de Regt, 2017; de Regt et al., 2009; Grimm et al., 2017; Khalifa, 2017). Our claim is that providing a feeling of understanding is another important role of mathematics, and the role of DST in EC research is a good example of this.

According to our negative claim from Sect. 5, DST does not play an explanatory role in EC research. This means that if mathematics can provide understanding of cognition within the EC paradigm, the understanding involved cannot be associated with explanation.Footnote 22 In these cases mathematics plays a pragmatic or heuristic role. According to Gijsbers (2013) and Lipton (2009), for example, an X that does not explain the phenomenon under investigation can nonetheless contribute to the phenomenon’s understanding. We submit that this is what DST offers to EC research. There are two kinds of contribution to scientific understanding that deserve special attention in this connection: that resulting from enhancing a theory’s expressive abilities and that provided by toy models.

6.1 Expanding Expressive Abilities

An important feature of mathematics is that it expands our expressive abilities. According to many authors, this is the main function of mathematics in science. For instance, Melia claims that the role of mathematics is “to make more things sayable about concrete objects” (Melia, 1998, 70–71). According to Yablo, mathematical (numerical) language enables us to make claims that would otherwise be inexpressible (Yablo, 2002b, 230). Balaguer characterizes mathematics as a merely descriptive aid that facilitates the stating of facts about the physical world (Balaguer, 1998, 137). Liggins uses the term “abstract expressionism”—according to this doctrine, the usefulness of mathematics in science consists in helping us “to say things about concrete objects, which it would otherwise be more difficult, or perhaps impossible, for us to say” (Liggins, 2014, 600). (These claims were made in the context of the enhanced indispensability argument, but they are independent of the realism–antirealism debate.) We do not think that these claims are true in general—but we do think that the use of DST in EC research is a perfect example of this “resource-expanding phenomenon”.

Spivey’s account of categorization (2007) provides an apt illustration of this. Spivey uses the conceptual apparatus of DST (phase space, trajectory, basin of attraction, attractors, etc.) as well as probabilistic concepts and notions inspired by quantum mechanics, such as “pure mental state” and “probabilistic mental state”, which enable him to formulate novel claims and express new ideas. The image of the mind traversing a phase space of mental states populated by various attractors, leading to differences in reaction times, is very suggestive, and it is hard to imagine how to express such claims in a radically different conceptual system. If we decided to describe the categorization phenomena (“bowl versus mug”) in computational terms, we would point out that computational tasks vary in complexity (or that the algorithms are not optimal) and conclude that this is why reaction times differ. So we would tell a story of our “internal Turing machine solving a computationally difficult task” rather than speaking of the “mind travelling through a complex phase space”. These are two different descriptions and two different toy models. And within these two toy models, the explications of the term “process coming to a conclusion in a categorization task” will be very different.Footnote 23 So DST does, indeed, expand researchers’ expressive abilities: there are claims about cognitive systems that would be inexpressible without the aid of specific mathematical tools. The mental-phase-space toy model can provide a better understanding of cognition even if we concede that it does not represent any real cognitive process.Footnote 24

One important method of expanding expressive abilities is the use of metaphor, and we may think of mathematics as a source of metaphors of a certain type.Footnote 25 Spivey’s account is an interesting example of “metaphorical modeling”. Mathematical notions are used to express novel claims about mental phenomena, but their use should be considered metaphorical.Footnote 26 The notion of metaphor is also used frequently in the context of the so-called computer metaphor of the mind and in the context of the dynamical systems approach, and Spivey makes use of both (2007, 138). Spivey offers some metatheoretical discussion concerning the use of metaphors, indicating that metaphors are used when the target domain is too complex—and understanding can then be imported from simpler domains. The use of the notions of dynamical systems and attractor networks to provide a better understanding of the functioning of the mind is an example of this transfer (Spivey, 2007, 33).

When it comes to the problem of metaphor, it is worth mentioning Manin (2007). Indeed, the title of the book is “Mathematics as Metaphor”, and it contains an essay devoted to this particular problem. The entire book discusses the role of mathematics in intellectual culture, focusing on profound and philosophically significant ideas. According to Manin, “In order to understand how mathematics is applied to the understanding of the real world, it is convenient to subdivide it into the following three modes of functioning: model, theory, metaphor” (Manin, 2007, 14). The role of mathematics extends far beyond being merely a computational tool or a language for science. In particular, mathematics fulfills what might be called a narrative role. It has the potential to provide us with understanding at a profound level and has what we might call a “humanistic dimension”: “One cannot tell precisely what mathematics teaches us… The teaching itself is submerged in the act of re-thinking this teaching” (Manin, 2007, 28).

Using mathematical metaphors can be an epistemically fruitful procedure, but, of course, it is very different from constructing and testing models. Nevertheless, mathematical metaphors can enhance our understanding. Mathematicians are aware of the importance of intuitive, “hand-waving” arguments within their discipline. Such arguments often provide insight into a problem and convey important ideas. However, hand-waving arguments might also be misleading, and they need to be examined with great care. There is a danger of abuse of language when mathematical notions are used in an extremely informal, loose way: they “sound mathematical” but their mathematical content is not preserved. Thus, mathematical metaphors must be handled with caution.

6.2 Understanding and Toy Models

Reutlinger et al. (2018) draw a useful distinction between autonomous and embedded toy models. Embedded toy models are closely associated with some well-established theory; for example, the model of a pendulum and models of planetary motion are associated with classical mechanics. By contrast, autonomous toy models are linked to a specific class of phenomena and, consequently, can be considered in isolation from theory. A simple example is Schelling’s model of racial segregation. It is not based on any established theory, but it shows that it is possible for racial segregation to occur even in the absence of significant racist attitudes. This might seem surprising, but it is possible to mathematically define a scenario where segregation is inevitable even under quite weak assumptions. So, thanks to the toy-model scenario, we gain an understanding of how such a situation might ever arise—even if it is clear that the model does not represent reality, has no predictive power and offers no explanation in any reasonable sense of the word. Therefore, autonomous toy models provide “how-possibly” understanding. They need not and do not provide “how-actually” understanding [the distinction was made in Reutlinger et al. (2018)].
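A minimal version of Schelling’s scenario can be written down in a few lines (the sketch below is our illustrative reconstruction with arbitrary parameter values, not Schelling’s original specification): agents of two types are scattered over a grid, and an agent relocates to a random empty cell whenever fewer than about a third of its neighbors are of its own type; even under this mild preference, the population typically ends up markedly more segregated than it started.

```python
import numpy as np

rng = np.random.default_rng(2)
N, threshold = 30, 0.34            # grid size and the "mild" similarity preference

# 0 = empty cell, 1 and 2 = the two types of agents.
cells = rng.choice([0, 1, 2], size=(N, N), p=[0.10, 0.45, 0.45])

def neighbors(i, j):
    return [((i + di) % N, (j + dj) % N)
            for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]

def unhappy(i, j):
    kind = cells[i, j]
    occupied = [cells[x, y] for x, y in neighbors(i, j) if cells[x, y] != 0]
    return bool(occupied) and sum(k == kind for k in occupied) / len(occupied) < threshold

def similarity():
    """Average share of same-type agents among the occupied neighbors of each agent."""
    pairs = [cells[i, j] == cells[x, y]
             for i in range(N) for j in range(N) if cells[i, j] != 0
             for x, y in neighbors(i, j) if cells[x, y] != 0]
    return sum(pairs) / len(pairs)

print("initial neighbor similarity:", round(similarity(), 2))   # close to 0.5

for _ in range(60):                # a few rounds of relocations
    movers = [(i, j) for i in range(N) for j in range(N)
              if cells[i, j] != 0 and unhappy(i, j)]
    empties = [(i, j) for i in range(N) for j in range(N) if cells[i, j] == 0]
    for idx in rng.permutation(len(movers)):
        if not empties:
            break
        i, j = movers[idx]
        x, y = empties.pop(rng.integers(len(empties)))
        cells[x, y], cells[i, j] = cells[i, j], 0
        empties.append((i, j))

print("neighbor similarity after relocations:", round(similarity(), 2))  # typically well above the initial value
```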

An important question is whether theories have to be true, or at least approximately true, in order to provide understanding of natural phenomena. According to de Regt and Gijsbers (2017), they do not. De Regt and Gijsbers make a strong case for this claim: although false, at least by our present lights, the phlogiston theory (along with its descendant, the caloric theory), Newtonian physics, and the substance models of energy and electricity have all significantly contributed to our understanding of the relevant phenomena (de Regt & Gijsbers, 2017, 70).Footnote 27

It seems that something similar is going on in cognitive science. The dynamical models in EC research are not full-fledged scientific models comparable to models in physics or neurobiology. Beer and Williams (2015), who analyzed the relative merits of information processing and DST in accounting for the behavior of a highly idealized agent, admit that such agents are not intended to be realistic models of human cognition. However, “they serve as excellent vehicles for exploring theoretical landscapes and developing and testing new concepts and mathematical tools” (Beer & Williams, 2015, 2). Beer and Williams stress the importance of studying toy models, which do not provide a detailed description of cognition but nevertheless might be important for developing the language and conceptual framework appropriate for a rigorous explanation of human cognition (Beer & Williams, 2015, 31). So they might have an important epistemic role to play—and we think that it is natural to account for this role in terms of autonomous toy models. Reutlinger et al. (2018, 1093) identify the modal function, the heuristic function and the pedagogical function as the central epistemic functions of how-possibly understanding. We do not discuss the pedagogical function here, but both the modal and the heuristic functions can surely be identified within the dynamical systems approach to EC. Even if the veridicality condition is not fulfilled, theories using the conceptual framework of dynamical systems can give insight into possible scenarios and have heuristic value. Spivey’s theory (as well as the other examples: Beer and Williams’ account of idealized agents, Baber’s description of the dexterity of jewelers, Hotton and Yoshimi’s attempt to model an agent playing Scrabble) might be viewed as providing abstract toy models that are not explanatory but contribute to a better understanding of cognition.

7 Conclusion

Our working assumption in this paper was that, at least sometimes, mathematics plays an explanatory role in science. We discussed the possible epistemic roles of mathematics in cognitive science based on a case study of the applications of DST in EC research. In this context, we have argued for both a negative and a positive claim.

The negative claim was that DST does not play an explanatory role in EC research. So the situation is different from that of, for instance, the neuroscientific models, or even the models in social psychology, which we presented in Sect. 3. The important factors are: (1) The dynamical models proposed in EC are not mathematical models in the standard sense: they do not make predictions and cannot be tested. (2) The standard examples of mathematical explanations in science appeal to theorems: a theorem is an essential ingredient of the explanans. It is not sufficient just to present some kind of mapping—i.e., a mathematical description or representation of the problem. However, such an essential application of theorems is not present in the DST approach to EC (this was the subject of Sects. 3 and 4). (3) In EC, the mathematical vocabulary of DST is freely used. However, its mathematical content is not always preserved, and the mathematical terms are often used in a rather loose, “hand-waving” manner.

We also argued for a positive claim: the theory of dynamical systems plays an important heuristic role, being a source of new ideas and approaches. It provides toy models and metaphors, which help us to get a new, fruitful conceptual grasp of the problems, and which can be a significant theoretical contribution. So, even if it does not provide explanations in the standard sense, it can enhance our understanding—and, in this way, play an epistemically fruitful role.

We believe that these findings are relevant for the discussion concerning the EC approaches to cognitive science, demythologizing the alleged explanatory role of DST in EC research while stressing its other important roles. The case studies illustrate the fact that, besides conferring explanatory power, mathematics can also improve understanding—and, in this way, contribute to epistemic progress.