Entropy, prediction and the cultural ecosystem of human cognition

Major proponents of both Distributed Cognition and Predictive Processing have argued that the two theoretical frameworks are strongly compatible. An important conjecture supporting the union of the two frameworks is that cultural practices tend to reduce entropy (that is, to increase predictability) at all scales in a cultural cognitive ecosystem. This conjecture connects Distributed Cognition with Predictive Processing because it shows how cultural practices facilitate prediction. The present contribution introduces the following challenge to the union of Distributed Cognition and Predictive Processing: the problem of entropic cultural practices. The problem lies in the existence of multiple cultural practices that tend to increase entropy instead of reducing it. This paper discusses these entropic cultural practices and the nature of the problem at hand. Finally, the paper advances an expanded conception of cultural practices that could unite the two frameworks and explores the difficulties of committing to such a conception.


Introduction
Edwin Hutchins's ground-breaking book "Cognition in the Wild" brought forth a whole new way of thinking about cognition: Distributed Cognition, according to which all instances of cognition can be understood as emerging from processes distributed at many spatial and temporal scales (Hutchins, 1995). If the idea is on track, any system in which cognition emerges out of the interaction of its elements can be studied through this lens: from the interaction of different neurons to the interaction of areas of the brain and organs of the body, and from an individual's use of cognitive artefacts to multi-agent cultural practices that unfold in vast cognitive ecosystems (Hutchins, 2008). Thus, Distributed Cognition is compatible with the idea that cognition is embodied (Hutchins, 2009), enactive and extended (Hutchins, 2011).
The synergy with complementary theories might go even further when one looks at what Distributed Cognition can bring to the study of cultural practices. In one of his most recent contributions, Edwin Hutchins advances the claim that Distributed Cognition is also compatible with Predictive Processing (PP), a theoretical framework that conceives the brain as fundamentally a prediction engine. In his defence of the union of the two frameworks, Hutchins hypothesises that cultural practices tend to reduce entropy (i.e. to increase predictability). Combining the two views (the brain as a prediction machine and cultural practices as ways to increase predictability) holds the promise of a grand unifying scheme of cognition with explanatory power across multiple temporal and spatial scales (Hutchins, 2014; Clark, 2016).
At first, the insight that cultural practices tend to increase predictability seems to provide the sought-for connection between PP and Distributed Cognition. The problem is that not all cultural practices tend to increase predictability. Many cultural practices, ranging from playing piñata to civil disobedience and from free partying to Cartesian doubt, actually tend (and intend!) to increase entropy, that is, to decrease predictability. I call this challenge "the problem of entropic cultural practices". In this paper, I will outline this substantial challenge to the bridging of PP and Distributed Cognition, sketch what might be the only way out of said challenge while holding the central tenets of the two frameworks, and discuss the associated costs of adhering to this way out.
The paper is structured as follows: Section 2 offers an overview of Distributed Cognition. Section 3 introduces PP. Section 4 discusses 'entropic' cultural practices and shows why they pose a problem. Section 5 develops the only possible way out of "the problem of entropic cultural practices". Section 6 concludes the paper and outlines some challenges and future directions.

Distributed cognition, entropy and cultural practices
Distributed Cognition was born out of the detailed observation of the cultural practices aboard the USS Palau ship of the US Navy. Edwin Hutchins intended to conduct cognitive science research outside the laboratory, and the navigational challenges of the Palau were ideal for this enterprise. What he found is that to do justice to cognition as it happens in the wild, one needed to shift the unit of analysis beyond the skin of the individual. The computational challenges in the USS Palau were solved through a cognitive process that was widely distributed across a team of navigators and their cognitive artefacts. The insights from this fieldwork resulted in a theoretical framework that sees all instances of cognition as emerging from distributed processes (Hutchins, 1995, 2001; see also Rogers, 1997; Kirsh, 2006; Sutton, 2006). Distributed cognitive processes may include elements at many spatial and temporal scales, so the unit of analysis is not pre-determined but depends on the phenomenon under study (Hutchins, 2014). At the scale of neuroscience, the interaction of brain areas can be seen as distributed cognitive processes. The scale widens according to the cognitive process under study, encompassing, for instance, the brain and the body, the interaction of humans with cognitive artefacts, or the interactions of multiple agents. At an even larger scale, cultural practices within a given cognitive ecosystem (e.g. the emergence of language, as in Hutchins and Johnson, 2009) can also be explored through the lens of distributed cognition. Since its inception, the framework has been employed in studies of a broad range of subjects, including aviation (Roth, 2015), sports (Muntanyola-Saura & Sánchez-García, 2018), health care (Lippa et al., 2017), informatics (Furniss et al., 2019), Search and Rescue operations (Plant & Stanton, 2016) and management (Heavey & Simsek, 2017), to name but a few recent examples.
Something that sets Distributed Cognition apart from alternative but complementary embodied, embedded, enactive, and extended approaches is its particular focus on the way cultural practices permeate and shape cognition. The idea is that the different ways in which humans operate in the world constrain each other and need to be coordinated. These constraints can be bodily, mental, affective, or they can come down to the materiality of a physical environment (e.g., the arrangement of seats in an amphitheatre influences the direction that people decide to face once they sit down). This set of dynamic interactions within a particular cognitive ecosystem gives rise to situated cultural practices that organise action and cognition.
An illustrative case of a distributed cognitive process is queuing, a cultural practice in which the way that agents are arranged on a line maps the order of arrival of those agents. Such a practice is often materially scaffolded through the use of ropes creating corridors for the queue to form, or through lines on the floor indicating where the queue should start. Queuing is a form of dimensionality reduction that increases predictability. Once we see this physical structure (position in the line) as a conceptual structure (order of arrival), we can make a series of inferences (i.e. predictions), such as what the order of arrival was, or the approximate time we would need to wait before being the first in line. Note that this dimensionality reduction happens in the space shared by the participants of the practice, not inside the individual mind of any of the participants (Hutchins, 2005).
Key here is the hypothesis that cultural practices tend to decrease entropy (and thus increase predictability, because entropy is a measure of unpredictability) across all scales in a cognitive ecosystem (Hutchins, 2012, 2014). Hutchins specifically refers to information-theoretic entropy, which is a measure of the average amount of information obtained when observing that a particular state has occurred. Entropy is also equivalent to the expected value of surprisal (which in turn is the logarithm of the reciprocal of a probability, and thus higher for low-probability events; Tribus, 1961). Therefore, on average, a lower entropy means higher predictability.
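The relation between surprisal and entropy can be made concrete with a minimal sketch. The scenario and all the probabilities below are hypothetical and chosen only for illustration: an observer predicts who among eight waiting people will be served next, first in an unordered crowd and then in a queue.

```python
import math


def surprisal(p: float) -> float:
    """Surprisal in bits: the logarithm of the reciprocal of a probability."""
    return math.log2(1.0 / p)


def entropy(dist: list[float]) -> float:
    """Shannon entropy in bits: the expected value of surprisal."""
    return sum(p * surprisal(p) for p in dist if p > 0)


# Hypothetical probabilities of being served next (illustrative assumptions).
no_queue = [1 / 8] * 8            # unordered crowd: anyone is equally likely
with_queue = [0.93] + [0.01] * 7  # queue: the person at the front is almost certain

print(entropy(no_queue))    # 3.0 bits
print(entropy(with_queue))  # well below 3 bits
```

On these assumed numbers, the queue leaves the observer with far less expected surprisal than the crowd, which is the sense in which a lower entropy means higher predictability.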
Queuing is only one example and dimensionality reduction is only one of the many ways in which cultural practices tend to decrease entropy. Hutchins provides a comprehensive (but probably non-exhaustive) inventory of the ways in which cultural practices achieve this (see Hutchins, 2010a), which include:
• dimensionality reduction: the production of a conceptual structure out of complex assemblages of possibly preconceptual material via the conjunction of features (e.g. the production of a queue out of a group of humans via the conjunction of position in the line with order of arrival),
• filtering: preserving some features or elements while ignoring others (e.g. directing our attention to the white lines painted at the edge of a mountain road),
• constraint satisfaction: the simultaneous fulfilment of multiple restrictions that change the probability of different configurations of cultural practices emerging (e.g. when cycling there is a simultaneous fulfilment of the constraints of the human body, the mechanics of the bike and a rich legal and cultural code, which makes certain ways of cycling more likely to emerge),
• modulated positive feedback: recycling a (filtered) subset of the output as an input (e.g. decreasing speed when our car gets dangerously close to the white line at the edge of the road),
• superposition of structure: the projection of imagined structure onto elements of a perceived or imagined world (e.g. seeing a constellation when looking at the stars),
• mapping across conceptual spaces: combining filtering with constraint satisfaction and superposition in order to map patterns from one conceptual space to another (e.g. translations, comparisons, analogies, metaphors…),
• design: activities outside the normal workflow that attempt to create explicit representations of work practices (e.g. agreeing on a chain of command within an organisation).
Hutchins discusses how these methods of entropy reduction are present not only in relatively straightforward cultural practices such as queuing, but also in highly intellectual and complex practices such as quantum physics research. A crucial upshot of Hutchins's hypothesis is that it provides a way of connecting Distributed Cognition with PP:
"More recently, I have been using information theoretic measures to explore the hypothesis that cultural practices tend to reduce entropy (increase predictability) at all scales in a cultural cognitive ecosystem (Hutchins, 2012). This is important because a brain that is a prediction machine, as suggested by Clark in his latest work (2013), will require predictable experience." (p. 5, Hutchins, 2014)
Andy Clark had already hinted at this connection between PP and Distributed Cognition in his seminal article "Whatever next? Predictive brains, situated agents, and the future of cognitive science" (Clark, 2013), but Hutchins provides the needed conceptual link with his hypothesis about entropy and cultural practices. As we are about to see in the next section, this is a link that some authors on the PP side are eager to hold on to as well. Of course, many authors question whether theoretical unification in cognitive science is possible, or even desirable (Horst, 2016; Dale et al., 2009; Colombo & Wright, 2017). Unification is, however, a cherished goal of many PP and Distributed Cognition proponents. The unification in question goes beyond the integration of results and methods, and it comes down not to reduction, but rather to unificatory understanding and explanation (Miłkowski & Hohol, 2021). Some of the virtues of theoretical unification are that it can help with the theory crisis in psychology, remove ad hoc assumptions, and offer a systematic understanding of the whole cognitive domain (Bangu, 2017).

Predictive processing
In this section, I will offer only a brief introduction to PP. There are some competing interpretations of the framework (e.g. where PP stands in the representation wars; Williams, 2018) that are outside the scope of this paper, so I intend to stay as neutral as possible in these matters. For a detailed introduction to PP, the reader can consult Hohwy (2013) and Clark (2016), and for a recent review of the philosophical issues surrounding PP, Hohwy (2020). Here, I will follow an action-oriented version of PP, according to which the same process (active inference) subsumes perception, cognition and action. The reason to call PP a framework rather than a theory is that it guides and constrains the development of more specific theories (Michel, 2022; see also Kersten, 2022). The framework encompasses theories defending a vision of the brain as a predictive engine, the main one being active inference (for a discussion of how active inference relates to other Bayesian approaches to the brain such as hierarchical predictive coding, see Bruineberg, Kiverstein, and Rietveld, 2016; Allen and Friston, 2018; Ramstead, Kirchhoff, and Friston, 2020; Ramstead et al., 2019). Active inference originates in theoretical biology and employs information theory and dynamical systems theory to explain how organisms remain in their characteristic states via adaptive action. Central to active inference is the free-energy principle, according to which any self-organizing system that is at equilibrium with its environment must minimize its variational free energy, which is the difference between the sensory data registered by the system and the data expected under some model of how the data were generated (Friston, 2010). PP follows some simplifying mathematical assumptions under which variational free energy is equivalent to prediction error. PP conceives the brain as a hierarchical prediction (and prediction-updating) mechanism.
Such a mechanism is constantly predicting incoming sensory input. The higher the echelon in the hierarchy, the wider the spatiotemporal range of those predictions. PP conceives of a top-down system in which predictions high up in the hierarchy influence the predictions at lower levels. Therefore, predictions cascade down the hierarchy, and go sideways along a given level. The mismatch between predictions and sensory input, called prediction error, goes up the hierarchy until new, updated predictions can account for said error. This process is modulated (i.e. weighted) by how important and reliable errors are expected to be, which is estimated through 'precision' (statistically speaking, the inverse of variance), so that errors with lower assigned precision will be less likely to drive further processing up the hierarchy.
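A minimal sketch may help to fix the idea of precision-weighting. It assumes a single level with Gaussian beliefs, which is a textbook simplification rather than the full hierarchical scheme, and the function name and all numbers are hypothetical.

```python
def precision_weighted_update(prediction: float, observation: float,
                              prior_precision: float,
                              sensory_precision: float) -> float:
    """One Gaussian belief update in which the prediction error is weighted
    by the precision (inverse variance) of the sensory signal relative to
    the precision of the prior prediction."""
    error = observation - prediction
    gain = sensory_precision / (prior_precision + sensory_precision)
    return prediction + gain * error


# The same prediction error (10.0) under two hypothetical precision estimates.
reliable = precision_weighted_update(0.0, 10.0, prior_precision=1.0,
                                     sensory_precision=9.0)
unreliable = precision_weighted_update(0.0, 10.0, prior_precision=1.0,
                                       sensory_precision=1.0 / 9.0)

print(reliable, unreliable)  # the high-precision error moves the prediction far more
```

Under these assumptions, an identical error revises the prediction strongly when it is expected to be reliable and only weakly when it is not.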
The organism is engaged in an overarching process of prediction error minimization (PEM) over time. The organism minimizes error both by updating its predictions and by changing the world through action to fulfil predictions. The predictions of desired future states of affairs (i.e. goals) emerge at high levels of the hierarchy and cascade down into modality-specific (e.g. proprioceptive) predictions. Action occurs to fulfil these predictions in order to minimise prediction error. Precision also plays a role here. By biasing the degree to which error units drive further processing, the precision-weighting modulates not just perceptual predictions but also action-related predictions, transforming the likelihood of different actions.
If the aim is to provide a grand unifying vision of cognition, then PP should also be compatible with theories about culture. It is here that Distributed Cognition has a lot to offer to PP. That the two frameworks are strongly compatible is not only defended by Hutchins. Andy Clark, one of the major proponents of PP, agrees with Hutchins that it may be useful to understand cultural practices as "entropy (surprise) minimization devices operating at extended spatial and temporal scales. Action and perception then work together to reduce prediction error only against the more slowly evolving backdrop of a culturally distributed process that spawns a succession of practices and designer environments" (p. 280, Clark, 2016).
PEM happens in the context of a larger cognitive ecosystem. Within that ecosystem, cultural practices emerge out of the interaction of different prediction-optimising agents. These practices shape both the physical and the cultural environment in a way that facilitates PEM for the enculturated agents. Now, if this is on track, the mechanisms through which cultural practices increase predictability (e.g. dimensionality reduction, filtering, constraint satisfaction, etc.) are then also mechanisms through which our predictive mechanisms minimise prediction error (for other views on the connection between PP and cultural practices, see Constant et al., 2018;Fabry, 2021;Veissière et al., 2020).
In their recent book, Kirchhoff and Kiverstein offer an idea of how this would work: "the constraints that come from cultural practices influence how precision is weighted in a given context and thus how uncertainty is kept to a minimum" (p. 97, Kirchhoff and Kiverstein, 2019). Take filtering as a case in point. We are driving our car through a mountain road. It is dark and perceiving the edge of the road is difficult. Thankfully, roads are "designer environments", so if we have been properly enculturated (i.e. we have learned the appropriate pattern of precision-weighting), high precision will be assigned to the white lines that signal where the edge is. Predictions will then quickly adapt to the related error so as not to drive over the edge of the road, and uncertainty will be kept to a minimum.
Note also that when Hutchins puts entropy at the heart of cultural practices, he is implicitly assuming that there is an observer with a generative model. Entropy corresponds to the expected surprisal of a probability distribution, so for cultural practices to increase predictability (i.e., decrease entropy, or expected surprisal), there must be an observer making predictions. The clear answer in the synthesis with PP is that the prediction-optimising agent (or, possibly, a group of agents; see Constant et al., 2018) is the one with a generative model in reference to which the notion of entropy becomes meaningful. And, of course, Hutchins's idea of reducing entropy over time is quite amenable to PP. Free energy is an upper bound on surprise, so "by minimising prediction error over time, a system minimises the entropic dispersion of its states" (p. 273, Kirchhoff and Robertson, 2018).
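The claim that free energy is an upper bound on surprise can be written out in one standard form. The notation is an assumption introduced here for illustration and is not taken from the works cited: o stands for observations, s for hidden states, p for the agent's generative model and q for its approximate posterior over hidden states.

```latex
\begin{align}
F &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \\
  &= D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] - \ln p(o) \\
  &\geq -\ln p(o)
\end{align}

% Entropy is expected surprisal:
\begin{equation}
H\big[p(o)\big] = \mathbb{E}_{p(o)}\big[-\ln p(o)\big]
\end{equation}
```

The inequality holds because the Kullback-Leibler divergence is never negative. Since entropy is the expectation of the surprisal term on the right-hand side, keeping free energy low over time also bounds the long-run average of surprisal, which, under the usual ergodicity assumption, approximates the entropy of the agent's sensory states.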
Discussing niche construction and free-energy minimisation, Bruineberg et al. (2018a, b) present active inference as the coupling of a generative process and a generative model. The generative process concerns the actual structure of the world (the set of regularities) that generates the observations, and the generative model concerns how the agent expects those observations to be generated. Free energy then becomes the misfit between the two, between an agent and its environment. This perspective fits within the extended evolutionary synthesis view, according to which developmental processes including niche construction also contribute to the organism-environment complementarity (Laland et al., 2015). What we get is an agent and an environment driving each other through a fitness landscape. Extending this distinction to cultural practices, the emergent view is one in which cultural practices make environments more structured, and as a result more predictable. Cultural practices increase predictability, improving the fit between a generative process and a generative model.
Up to here, we get a nicely fitting story of how humans, in their quest for minimising prediction error, immerse themselves in cultural environments that increase predictability.

The problem of entropic cultural practices
There is a problem, however, with the story sketched so far: many cultural practices tend to increase entropy. Let us call this latter type 'entropic' cultural practices, as opposed to the standard 'negentropic' cultural practices (which tend to decrease entropy), and the challenge in question, "the problem of entropic cultural practices".
In an earlier contribution to his attempt at bridging PP and Distributed Cognition, Hutchins himself acknowledges the existence of entropic cultural practices, but this is not a topic into which he delves in depth (Hutchins, 2012). Hutchins provides an inventory of negentropic cultural practices, but no inventory of entropic cultural practices. He simply acknowledges that these exist, and that human cognition moves through cycles of disorder and reorder on all time scales. The only concrete example that he points to is that "in the conduct of a scientific investigation, accumulating disorder may lead to a productive conceptual reordering", and he notes that "a jolt of unpredictability is sometimes needed to overcome stable but inadequate conceptual structures" (p. 321, Hutchins, 2012).
Ultimately, distinguishing between entropic and negentropic cultural practices requires information theoretic measurements, but prima facie, entropic cultural practices include many instances of games, comedy and art. Playing a board game involves wilfully engaging with uncertainty about dice throws, opponents' moves and cards to be drawn. Going to a stand-up show involves laughing about unexpected punchlines (Franklin & Adams, 2011). Listening to music involves navigating unexpected musical patterns (Huron, 2006). Unpredictability might also be a key ingredient in creative activities in general (Rompay and Jol, 2016).
Another entropic practice is the 'dérive' of psychogeography. The term 'dérive' (often referred to as 'drift' or 'drifting' in English) was coined by Guy Debord, and it was one of the central practices of the Situationist movement. Dérives are unplanned journeys in which the participants drop their usual ways of moving to instead be pulled by the attractions of the terrain, resulting in unexpected encounters. A dérive involves wandering through the city not by following any preconceived plan but by spontaneously following the appeal of the immediate places around one.
These practices are unequivocally entropic, as they are "aimed at restoring value to the undecidable and radically anarchical aspect of spatial experience" (La Cecla in Schmidt di Friedberg, 2017, p. 9). Such entropy-seeking practices are not just an outlandish occurrence of artists and thinkers with too much free time. To give an example from a different cognitive ecosystem, similar practices exist in Transbaikalia, where Evenki natives celebrate an annual ritual in which they wander between secret places without any prescribed movement, often walking in circles and making loops (Safonova & Sántha, 2013).
Entropic cultural practices are by no means only present in the domains of arts and entertainment. In the socio-political domain, strikes, revolts, protests and riots are all forms of entropic cultural practices, trying to disrupt the established order. And in academic inquiry, the purposeful disruption of order is often a precursor to discovery and insight (Locke et al., 2008). In this vein, the sociologist Wright Mills describes the following practice of mixing up his research folders with the hope of eliciting fruitful associations of ideas: "You simply dump out heretofore disconnected folders, mixing up their contents, and then re-sort them. You try to do it in a more or less relaxed way… to be passively receptive to unforeseen and unplanned linkages."
From the examples above, one can see that the problem of entropic cultural practices is not simply that a given practice decreases predictability in some instances. After all, cognitive ecosystems merely increase the probability of certain patterns emerging: "Experience, training, and the design of environments can all be seen as ways to bias the probability of the dynamic formation of particular practices" (p. 13, Hutchins, 2014). The interaction between the cognitive ecosystem and the resulting pattern of activity is non-linear. In these dynamic systems, it makes no sense to fixate on regular causation (i.e. causation in the sense in which if A causes B, then A must always be followed by B), because a regularity notion of causation cannot be meaningfully defined for systems without linear interactions among their variables (for a mathematical discussion, see Wagner, 1999). When we are discussing cultural practices, we should follow a probabilistic notion of causality. Using this notion, to say that negentropic cultural practices increase predictability is to say that negentropic cultural practices make an increase in predictability more likely. The problem of entropic practices comes only when we find some cultural practices that tend to decrease predictability instead of increasing it. And as we have just seen, there are many such practices. Note that the aim here is to provide proximate rather than ultimate explanations (for a debate on this distinction in cognitive science, see Scott-Phillips et al., 2011). That is, to provide explanations that are concerned with the mechanism (e.g. PEM) that fulfils a function rather than with the function itself (e.g. the organism maintaining integrity). Evolutionary pressures and the fitness between agent and environment might provide ultimate (or why) explanations. Active inference can be seen not just as a normative theory, but as a process theory derived from variational principles (Friston et al., 2017).
In other words, active inference does not simply advance the normative principle of free energy but also detailed and physiologically plausible predictions about the processes implementing the principle. Therefore, I believe we should think of PP as offering proximate (or how) explanations of how an agent resists a natural tendency to entropy (Friston, 2010). Agents do so through prediction error minimisation. In active inference formulations of PP, it is expected free energy that is minimised by proximate mechanisms. Thus, the proximate mechanisms are temporally extended, considering future free energy that is a consequence of different action policies (Smith et al., 2022). Nevertheless, cultural practices are supposed to be a key ingredient in this large process of entropy reduction. With the problem of entropic practices, the tension comes from agents in a process of prediction error minimisation ordinarily engaging with practices designed to increase entropy. If this is a widespread phenomenon, as the picture sketched in this section seems to indicate, then the explanatory power of PP would be notably diminished. This tension, with agents and cultural practices seemingly engaged in diametrically divergent processes, challenges the union of Predictive Processing and Distributed Cognition. If the tension proved insurmountable, it might put an end to one of the most promising schemes uniting cognition and culture.

A way out
There is a certain resonance between the problem of entropic cultural practices and the 'dark room' problem: Prediction-error minimizers (us, allegedly) should find their deepest motivations fulfilled by the most utterly boring experiences, since a sure way to minimize prediction-error is just to place oneself in a highly predictable environment (such as a dark, empty room where nothing much happens).
While the dark room problem presses the point that much of human activity does not seem directed at reducing prediction error, the problem of entropic cultural practices presses the point that many cultural practices do not seem directed at reducing entropy. By exploring existing responses to the dark room scenario, we can actually start to see possible ways to overcome the problem of entropic cultural practices (for recent PP responses to the dark room problem see Van de Cruys, Friston and Clark, 2020; Seth et al., 2020). Part of the reason for this is that, interestingly enough, some of these responses appeal to a subset of entropic cultural practices, namely, games and art: "By designing and repeatedly redesigning our own environments, populating them with new books, paintings, theories, games, and practices, we humans continually move the goalposts for our own prediction-based learning" (p. 531, Clark, 2018). The idea is that by being embedded in a culture, we learn to value (i.e. assign high precision to) certain cultural practices that help us (by enabling learning) in our goal of reducing prediction error over time. It seems then that to solve the problem of entropic cultural practices, we need to bring our attention to how cognition is widely distributed both spatially and temporally (Hutchins, 1995, 2006, 2008; Kirchhoff, 2012; Bietti & Sutton, 2015; Fabry, 2017).
A first step in this direction is recognising how cultural practices enable learning. Games are a prime example: they provide us with an initially high entropy (e.g. we fail to answer the questions correctly in a trivia game), we slowly progress at reducing that entropy (e.g. we answer the questions with more and more accuracy) and we learn in the process (e.g. we improve our general knowledge). This is incredibly important because we might reduce error down a gradient that leads us to a local minimum in which we get stuck, and cultural practices serve to nudge agents out of such local minima (Bengio, 2014). In a recent review, Andersen and colleagues argue that individuals do not only chase slopes (gradients of error reduction) but build slopes as well, so that if an environment is low in uncertainty, players will create an environment that generates surprise and uncertainty, which will then be reduced through play. In other words, humans will create error-inducing environments in order to generate a slope on which to reduce error over time (Andersen et al., 2022).
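The point about local minima can be illustrated with a toy sketch. Everything in it is a hypothetical assumption made for illustration (the one-dimensional error landscape, the learning rate, the size of the nudge), and it is not a model drawn from the works cited above. Plain gradient descent settles in a shallow minimum, and a nudge that temporarily raises error allows it to reach a deeper one.

```python
def error(x: float) -> float:
    """Hypothetical error landscape with a shallow and a deep minimum."""
    return x**4 - 3 * x**2 + x


def slope(x: float) -> float:
    """Derivative of the error landscape."""
    return 4 * x**3 - 6 * x + 1


def descend(x: float, steps: int = 2000, rate: float = 0.01) -> float:
    """Plain gradient descent: always move downhill from the current position."""
    for _ in range(steps):
        x -= rate * slope(x)
    return x


stuck = descend(2.0)       # settles in the shallow minimum near x = 1.13
nudged = stuck - 1.5       # the nudge: error rises at first
escaped = descend(nudged)  # settles in the deep minimum near x = -1.30

print(error(stuck), error(nudged), error(escaped))
```

In this toy landscape, error is higher immediately after the nudge than before it, but the state reached afterwards has lower error than the one in which descent was originally stuck.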
Nevertheless, practices that enable learning do not seem to be quite what Hutchins is referring to by "a jolt of unpredictability" and "accumulating disorder". Hutchins was thinking in particular of scientific practices aimed at undermining existing theories. Educational games are structures designed for the agent to learn new (but already culturally established) patterns of precision-weighting, and the learning is generally gradual along some learning curve. This does not seem to be the case with the dérives, or with the Evenki wandering ritual, or with protests and revolts, or with the throwing around of research folders to disrupt established order. A useful distinction within entropic cultural practices is between learning practices (e.g. games) and disruptive practices (e.g. revolts). Learning practices result in a gradual reduction of prediction error along a learning curve. In contrast, disruptive entropic cultural practices cannot be easily explained away by appealing to learning. With disruptive practices, there is no straightforward way in which they enable the gradual improvement of prediction success and there are no established patterns of precision-weighting that the agents involved could learn.
Nevertheless, we can conjecture that both the learning and the disruptive types of entropic practices increase predictability over time, which is compatible both with prediction error minimisation (over time) and with the claim that cultural practices increase predictability (over time). Both types of entropic cultural practices decrease predictability in the short term and increase it in the longer term. This is certainly what happens in the case of learning, and, I will argue, it is also what happens with disruptive entropic cultural practices, such as the case of scientific investigation mentioned by Hutchins. A range of disruptive, entropic cultural practices destabilise a theoretical structure (e.g. by coming up with problems that the structure cannot assimilate), which prompts a conceptual reordering.
If the cultural-cognitive ecosystem is "seen as a constraint satisfaction system that settles into a subset of possible configurations of elements" (p. 45, Hutchins, 2014), then disruptive cultural practices are ways of disassembling existing configurations to give rise to alternative ones. This conception of disruptive cultural practices has the added advantage of providing an explanation of radical organizational change, which is a ubiquitous feature of human cultures (Mace & Jordan, 2011), and which had been previously advanced as a hard case for the PP framework (Buskell, 2020).
Disruptive practices are useful when cognitive ecosystems revolve around negentropic cultural practices that increase predictability in globally suboptimal ways. In other words, a given negentropic cultural practice manages to increase predictability, but there is a latent possibility of new negentropic cultural practices that will do a better job at increasing predictability. Then, a likely role for disruptive practices would be to destabilise the cognitive ecosystem until a new negentropic cultural practice emerges that does a better job at increasing predictability.
Such a solution raises the question of how suboptimal negentropic practices emerge in the first place. Firstly, suboptimal negentropic practices might emerge due to path dependency, a popular notion in historical sociology (Mahoney, 2000): cultural practices develop in contingent ways, out of the vicissitudes of their epoch and environment. This might lead to practices that are just 'good enough' at a local level, but not optimal at a global level (for a defence of such a "good enough principle" in linguistic practices, see Ferreira and Patson, 2007).
Secondly, negentropic practices might emerge in a particular situation in which they reduce entropy efficiently and then persist even when their efficiency has substantially decreased over time. Think about how users might stick to computer programmes they know well even when better programmes have already been developed. These suboptimal practices persist because cultural practices are both self-reinforcing (e.g. the more people respect queues, the more likely people are to form queues) and mutually reinforcing (e.g. "the ways of speaking about 'first in line,' 'next,' 'back of the line,' and so on are discursive practices that enter into relations of mutual reinforcement with the conception of the linear spatial array as a queue" p.41, Hutchins 2014), and, most importantly for our present discussion, because there might be no way of moving to a more efficient negentropic practice without first undergoing a period of decreased efficacy in terms of predictability. The role of disruptive cultural practices is precisely to induce these periods of fruitful disorder.
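The claim that a better practice may only be reachable through a period of decreased predictability can be illustrated with a toy model. The following is a minimal sketch and not part of Hutchins's or Friston's accounts: it assumes a hypothetical one-dimensional "landscape" in which each position stands for a configuration of practices and each value for its unpredictability. The names LANDSCAPE, settle and the size of the disruptive jump are illustrative assumptions.

```python
# Toy illustration (hypothetical values): each index is a configuration of
# cultural practices, each value its unpredictability (lower is better).
LANDSCAPE = [5, 4, 3, 4, 5, 6, 5, 3, 1, 0, 2]


def settle(position: int) -> int:
    """Greedy local descent: move to a neighbour only if it is strictly better.

    This stands in for self-reinforcing practices that never accept a
    temporary loss of predictability.
    """
    while True:
        neighbours = [n for n in (position - 1, position + 1)
                      if 0 <= n < len(LANDSCAPE)]
        best = min(neighbours, key=lambda n: LANDSCAPE[n])
        if LANDSCAPE[best] >= LANDSCAPE[position]:
            return position  # no neighbour improves: a (possibly local) optimum
        position = best


# Without disruption the system settles into a 'good enough' configuration.
settled = settle(0)

# A disruptive practice is modelled as a jump that raises unpredictability.
disrupted = settled + 4

# After the disruption the same greedy dynamics settle somewhere better.
resettled = settle(disrupted)

print("settled:", settled, "unpredictability:", LANDSCAPE[settled])
print("disrupted:", disrupted, "unpredictability:", LANDSCAPE[disrupted])
print("resettled:", resettled, "unpredictability:", LANDSCAPE[resettled])
```

In this sketch the greedy dynamics alone cannot leave the first configuration they settle into, because every neighbouring configuration is worse. Only the jump, which makes things temporarily less predictable, allows the system to reach the better configuration. The model is deliberately crude: it says nothing about how real disruptions are triggered or how far they carry the ecosystem.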
Moreover, when we say that cultural practices increase predictability, we should ask "for whom?" Video surveillance in a prison increases predictability for the guards. Certain disruptive entropic cultural practices (e.g. protests) might erupt when a group of individuals does not benefit from the established arrangement of cultural practices in a given cognitive ecosystem. Such practices can be productively understood as a way of increasing predictability over time for the people initiating the disruption in question.
Disruptive cultural practices operate at many levels. While riots disrupt the established state at a society-wide level, some practices might be better studied at the level of the individual. This is the case of many practices that take us "out of our comfort zone". From the examples above, throwing our research papers around in the hope of eliciting insight, or undertaking a solitary dérive, can be seen as ways of challenging our established habituality. Many forms of art might thus be explained as disruptive cultural practices. Relatedly, in a recent PP account, Letheby (p. 133, 2021) conceives serotonergic psychedelics as disrupting predictive models and notes how "such disruption can also afford an opportunity to improve these models … rendering them temporarily more flexible, plastic, and permeable by new information". And, of course, the learning-disruptive distinction is not a sharp dichotomy. Cycles of order and disorder have varying temporal signatures, and cultural practices tend to reduce entropy over timescales of varying lengths.
The disordering function of disruptive cultural practices makes sense when we think of the cognitive ecosystem as a self-organising system (as in Hutchins, 2002, 2006): "a dynamical system in which certain configurations of elements (what we know as stable practices) emerge (self-assemble) preferentially" (p. 46, Hutchins 2014). Something that Hutchins does not address is that one of the most pervasive features of self-organising systems is their predisposition to destroy their own fixed points (Maturana & Varela, 1980). This tendency of a system to destroy its own fixed points is termed autovitiation and is also a feature of the dynamics of PP in its information-theoretical formulation (Friston et al., 2012).
That human cognition tends towards instability is also acknowledged outside of the theoretical frameworks of PP and of Distributed Cognition: research ranging from multi-stability in perceptual categorisation (Theodoni et al., 2011) to fractal fluctuations at all scales of cognition (Anderson, 2000) converges on the tendency of cognitive systems to destroy their own fixed points. In an in-depth review of neuroscientific and behavioural research on the dynamics of cognitive processes, Tognoli and Kelso put the idea as follows: "The living brain never finds itself frozen for any length of time in a particular coordination state, although it might be desirable that some parts of the system dwell over longer timescales (for instance, in some memory processes) than others (say, perception)."
In PP, the appeal to autovitiation makes sense in light of the idea that the generative model used by the brain is non-linear and dynamic (with the associated tendency towards instability), which in turn results from the world being non-linear and dynamic (p. 5, Friston et al., 2012). Intuitively, the organism needs to keep moving (through state space) to adapt to a changing world: "If neuronal activity represents the causes of sensory input, then it should represent uncertainty about those causes in a way that precludes overly confident representations. This means that neuronal responses to stimuli should retain an optimal degree of instability that allows them to explore alternative hypotheses about the causes of those stimuli" (p. 3, Friston et al., 2012). It is easy to see how this relates to uncertainty optimisation and PEM. Trying to avoid uncertainty completely at all times would lead to a decrease in learning and to potential bursts of prediction error. In this context, learning and itinerant behaviour are two different ways of engaging with uncertainty. In PEM terms, the organism minimises expected entropy by engaging in both exploitative and exploratory behaviour. The same logic extends to cultural practices. While negentropic cultural practices within a cognitive ecosystem will reduce entropy immediately, entropic cultural practices within that ecosystem will increase entropy in the short term (through learning and through disruption) and nevertheless reduce entropy over time. This provides the mechanism behind autovitiation in the cognitive ecosystem. Dynamic systems showing autovitiation (i.e. momentary increases in entropy) decrease entropy over time, and disruptive practices and learning practices are the mechanism through which these dynamics are instantiated.
In the previous section, I conceived of PP as providing proximate explanations. A critic might contend that this is incompatible with framing disruptive cultural practices as increasing overall predictability, because they operate at vast timescales. Here, the distinction between proximate and ultimate explanations might be taken to correspond to a distinction between synchronic and diachronic explanations. However, as Kirchhoff and Kiverstein (2020) have recently argued, considering the way in which cognitive processes are widely distributed over time and space complicates the very distinction between proximate and ultimate explanations. When we study dynamic systems, we often need to include ultimate processes (as in distant in time or space; as is often the case with cultural practices) in proximate explanations of how a particular process works, because these processes are diachronically distributed. Note that the aim is still to explain how (not why) disruptive cultural practices reduce entropy over time (i.e. by eliciting a reordering of the cognitive system).
Of course, the above should not be confused with the claim that there are no cultural practices whatsoever that increase entropy over time. There might well be such practices (one might call them long-term entropic practices, in contrast to learning and disruptive practices, which are only entropic at short timescales). We have seen that path dependency, self-reinforcement and mutual reinforcement can all lead to the emergence of suboptimal practices. Cultural practices are arrangements that emerge to satisfy a multitude of heterogeneous constraints. In some cases, particular arrangements might emerge (and persist) that increase entropy over time.
Similarly, it is also clear that individuals might engage in some practices that are not conducive to error minimisation over time, addiction being a clear example. The question is not whether all human activity decreases error over time (surely not), but whether we can largely explain human behaviour as a process of prediction error minimisation. For instance, we might be able to explain addiction as a case of aberrant precision estimation (Miller et al., 2020). The point is that if the synthesis of PP and DC is to succeed, the explanatory scope should extend to cultural practices. The issue, as seen in the previous section, was that the apparent ubiquity of entropic cultural practices created a tension between PP and DC. Showing how most of these practices actually tend to reduce entropy over time abates this tension and raises the prospects of theoretical unification.

Conclusion, challenges and future directions
Paying proper attention to the dynamics of cognitive ecosystems and to how cultural practices operate at different scales both spatially and temporally allows us to tackle the problem of entropic practices. The conjecture that would unite PP and Distributed Cognition remains: cultural practices tend to reduce entropy. The necessary caveat is that different practices tend to reduce entropy at different timescales. Negentropic cultural practices (like the ones in Hutchins's detailed inventory) do so at relatively short timescales. Entropic cultural practices (disruptive practices and learning practices) do so at longer timescales. Together, they contribute to an increase of predictability over time in a given cognitive ecosystem. This increase in predictability over time contributes to the prediction error minimisation activities of the humans operating within said cognitive ecosystem. Moreover, this development parallels a current trend in the active inference literature, with some authors advancing accounts centred around expected free energy (Friston et al., 2015) and around free energy of the expected future (Millidge et al., 2021).
As elegant as the above elaboration might seem, the caveat about temporal scales is not altogether unproblematic. Hutchins's ethnographic and theoretical work has provided a series of processes (e.g. dimensionality reduction) through which negentropic cultural practices reduce entropy. We need to find a series of processes through which entropic cultural practices reduce entropy over time. The task is not easy, because simply inverting the known processes of negentropic cultural practices will not do. For example, increasing dimensionality by itself will increase entropy in the short term, but there is no reason to believe that such a process would result in a decrease in entropy over time. Developing a taxonomy of entropic cultural practices will require an interdisciplinary effort involving ethnographic fieldwork, information theory and empirical work.
A related concern is that claiming that cultural practices tend to reduce entropy over time raises the question of what the appropriate timescale is. And of course, it will depend on the case at hand. If one takes scientific investigation as an example, the appropriate timescale might be considerably lengthy. If we consider Averroes's criticism of the Ptolemaic system as leading to the Copernican revolution (Bakker, 2015) as an instance of an entropic practice leading to entropy reduction over time, then "over time" will mean several centuries. And of course, this is not to say that we should not expect vast timescales in some instances. The issue is that some claims about entropy reduction will not be empirically testable.
Despite these difficulties, the union of PP and Distributed Cognition holds incredible potential. The unification offers a common language for studying cognitive processes at the vastly differing scales in which they unfold in the wild. In this way, it provides insight and explanatory power. If, as Hutchins argues, cognition is distributed at scales ranging from neural responses to large groups, the unification means that we can gain a systematic understanding of a given cognitive process by considering both synaptic gain and the movement of bodies in a protest in terms of precision, prediction, and prediction error. Connecting PP and Distributed Cognition can offer us a vision of different cultural practices unravelling dynamically at different spatiotemporal scales to modulate how humans engage with entropy in their effort to minimise prediction error. Some cultural practices, ranging from queuing to star gazing, will reduce entropy from the get-go. Other practices, such as games and educational activities, will increase entropy at first so that the individuals involved can gradually progress in their attempt at reducing error. And there will also be practices in which entropy is actively sought in the hope of prompting radical rearrangements of the existing cultural patterns, with the resulting increase in predictability over time.