1 Introduction

Currently, many researchers emphasize the need to use normative concepts in the analysis of cognitive and biological functions (cf. Bickhard, 2003; 2009; Christensen, 2012; Christensen & Bickhard, 2002; Godfrey-Smith, 1993; Kitcher, 1993; Millikan, 1984; 1989). On the one hand, they argue that functions are normative and that their normativity makes it possible to explain the functioning of organisms, while also showing when that functioning is incorrect. On the other hand, the attractiveness of their approach lies in the fact that they reduce what is normative to what is descriptive. The issue of the normativity of biological functions is the subject of lively debate. However, it is an issue that is often raised in passing, on the sidelines of discussions of other problems, even though in many cases it seems to be of key importance.

Christensen (2012) emphasizes the need to include what he describes as evaluative normativity in scientific analyses of living and artificial systems. In his opinion, we must be able to say when such and such states of a given system can be described as better or worse. Christensen’s thesis is confirmed by the influential opinion of Paul Thagard, who stated that philosophy has a non-trivial role to play, because it provides science, and in particular cognitive science, with certain normative questions: “philosophy is concerned not only with how things are but also with how they should be. Philosophical theories of knowledge and morality need to go beyond descriptive theories of how people think and act by also developing normative (prescriptive) theories of how people ought to think and act” (2009, p. 238). The view of Thagard and Christensen on the one hand points to an important place for normative language in scientific debates, and on the other expresses the unspoken idea that normativity is something that (1) has an evaluative character and (2) is ascribed to objects, states of affairs, processes or functions externally.

In this paper, I will argue for the opposite. I claim that there are specific mechanisms that are normative by their nature and not because of the attribution made by the observer. In this sense, normativity is a real property that can be assigned to a given mechanism due to the function it performs in an organism, cognitive system, or, for example, a decision-making process, and which cannot be reduced to some evaluation practices.Footnote 1 In other words, the condition for ascribing normativity is that the mechanism in question is normative per se.Footnote 2 I will argue that in order to be able to explain the success or failure of an agent’s actions in the environment, attention should be paid to specific biological mechanisms (illustrated here by the example of predictive mechanisms) that determine the selection of such and such actions. This means that explaining the potential effectiveness of an agent’s actions in an uncertain environment presupposes explaining mechanisms that are normative for these actions. A mechanismFootnote 3 is normative when it plays a specific role in the explanation of such and such actions or behaviors; and it can play such an explanatory role because it plays such and such a causal role in the functioning of a given system.Footnote 4

My analyses distinguish between external normativity—that is, one that allows the observer to externally (i.e. from the perspective of the observer of a given object or mechanism) decide whether a given object or mechanism is effective or ineffective, correct or incorrect, good or bad, etc.—and internal normativity (the perspective of a system or mechanism), i.e. that which is ascribed to a given mechanism for ontic reasons (i.e. because the mechanism is as it is). This is the perspective that the biological system itself has about what counts as its proper functioning (Winning, 2020a, p. 20).Footnote 5 This approach assumes that internal normativity (understood here as a constitutive property of actions and behaviors) is a real property of a given object, mechanism or behavior, independent of the observer and his or her evaluative practices.Footnote 6 In this sense, the term “normative” applies to those properties that are constitutive for the structure and functioning of a given system (Bickhard, 2003)Footnote 7 as well as for their explanation. Internal normativity can therefore be called “constitutive normativity” or “explanatory normativity”.

Before going any further, it is necessary to justify why the notion of normativity is important for the explanation of the predictive processing framework (PP).Footnote 8 According to this framework, brains are predictive machines (entailing generative models), whose primary function is to constantly match information coming from sensory modalities to internally generated, model-based predictions explaining the nature and sources of such information (for full exposition see: Clark, 2013; 2016; Hohwy, 2013, 2014, 2020a; Piekarski, 2021; Wiese & Metzinger, 2017). The process of minimizing prediction errors, i.e. the disproportion between expectations (hypotheses about the world) based on the internal parameters of the model and the variable information reaching the model through the senses, assumes hierarchical and multi-level predictive information processing and a generative model which is (generally speaking) a statistical model of how observations are generated.Footnote 9 Supporters of PP claim that the generative model is a hierarchical Bayesian probabilistic model, which constructs and tests internal models of the external environment by implementing cognitive processes that are an approximation of Bayesian inference (Clark, 2013, p. 189). The Bayesian rule helps to identify an optimal way of updating one’s beliefs given new evidence under conditions of uncertainty. Thanks to this, prediction errors will be minimized by the model only when the model adopts the best possible hypothesis regarding the causes of the sensory signal source (perceptual inference) or when the agent makes an active inference, interfering with the causal structure of relevant states of affairs and changing the information reaching the model (cf. § 3; Friston, 2010; Pezzulo et al., 2015).
The perceptual inference process can be called abductive, but the key thing is that it is unsupervised: the inputs are not classified a priori, and the beliefs at the starting point can be randomized and then progressively come to match the statistics of the input. For this reason, the generative model can be understood as self-evidencing (Hohwy, 2014). In this regard, Jakob Hohwy states that the Bayesian rule can be perceived as a paradigm of normativity: “it prescribes optimal relative weighting of evidence and prior belief. Violations of the norm occur when too much or too little weight is given to the prior or to the evidence, leading to false inference. (…) It approximates the optimal results a system would get by complying with the Bayesian norm” (Hohwy, 2020b, p. 15).Footnote 10
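Hohwy’s “Bayesian norm” can be stated compactly. As a sketch in standard notation (not tied to any particular PP implementation), the rule prescribes how a prior belief in a hypothesis \(h\) is to be combined with the likelihood of the evidence \(e\):

```latex
P(h \mid e) \;=\; \frac{P(e \mid h)\,P(h)}{P(e)},
\qquad
P(e) \;=\; \sum_{h'} P(e \mid h')\,P(h').
```

Weighting the prior or the likelihood more or less heavily than this ratio dictates is precisely the kind of departure Hohwy describes as a violation of the norm, leading to false inference.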

The belief in the normative nature of Bayesian models is shared by many researchers (cf. Anderson, 1990; Hahn, 2014; Oaksford & Chater, 2007; Oaksford, 2014). Bayesian models are meant to be normative in the sense that human thinking is measured and evaluated in the light of the rules they formulate. In other words, Bayesian models using the Bayesian rule are meant not only to describe how cognitive and decision-making processes take place, but also to show how they should proceed.

It should be noted that the assumption about the normativity of Bayesian models is closely related to the assumption regarding the rationality of agents that think, make decisions and act by approximating the (normative) Bayesian rule. The position that links rationality with normativity can be defined as normative rationalism (Elqayam & Evans, 2011, p. 235). For this reason, there is a common view that Bayes’ theorem describes the optimal procedure for inference under uncertainty. However, this position is not free from objections (cf. Knill & Pouget, 2004; Joyce, 2004). The belief in the normative nature of the Bayesian rule approximated by generative models is also challenged by many researchers (cf. Colombo et al., 2018; Elqayam & Evans, 2011; Jones & Love, 2011). In their opinion, in science, and above all in psychology and cognitive science, the prescriptive approach should be abandoned in favor of the empirical or descriptive one.

In this paper I will defend the view according to which Bayesian PP is normative not because it allows for the formulation of rules of action and policies or because it contains such rules, but because (some of) the predictive mechanisms themselves are normativeFootnote 11 (cf. Piekarski, 2019 for a basic introduction to this topic). They condition the choice of such and such actions by the agent. To substantiate this view, I will refer to the relation that holds between a given prediction and the action conditioned by it. I will call this relation motivational. It enables the agent to act in one way and not another, depending on the belief system that the agent considers true, resp. accurate (cf. O’Brien, 2005). The motivational relation (on which motivation is founded as the need to reduce uncertainty (Anselme, 2010)) is the relationship between predictions and the actions they guide in relation to certain environmental states. This means that the agent’s motivation is shaped by the generated predictions that stem from the need to minimize prediction errors by taking into account the states of the environment as well as the possibilities of action this environment offers. I will argue that a given mechanism is normative as long as its operation (function) is to generate (normative) predictions. The mechanism thus understood makes it possible to explain the behavior of the agent in the environment in terms of its success or failure.

At this point, we should also mention normative theories in the life and physical sciences. This is important because: (1) the approach proposed here can be regarded to some extent as an application of these theories to explain the activities of living organisms. In the approach proposed here, this means that low-level physical normativity can be and is actually realized by high-level normative mechanisms (such as, for example, the predictive mechanisms described here); and (2) there is a clear link between these theories and the approach proposed here, especially with regard to the significant distinction between rational Bayesian inference and bounded or approximate Bayesian inference. Namely: in the technical literature, a normative theory is generally read as a formal specification of a process that is equipped with an objective function. In other words, the process can be understood as extremizing or optimizing a function of its states. Clear examples range from Lyapunov functions in dynamical systems theory to loss functions in optimal control. Implicit in this sort of definition is a process that can be cast as optimizing some measurable function.

This reading of normative may have a special role in the present discussions. This follows because treatments of PP implicitly invoke an objective function, namely model evidence or marginal likelihood. Mathematically, rational Bayesian decision-making then corresponds to exact Bayesian inference. However, exact Bayesian inference is mathematically intractable, which is why approximate Bayesian inference is the only realizable kind of Bayesian inference. The objective function is then known as variational free energy (VFE) or an evidence bound (cf. Winn & Bishop, 2005). This converts exact Bayesian inference into approximate Bayesian inference.Footnote 12 By introducing VFE, an intractable integration problem was converted into a tractable optimization problem; namely, minimizing VFE (Dayan et al., 1995). In short, the definitive aspect of PP—at least under FEP—is its normative aspect. This is an important observation because, on the one hand, it justifies the need to analyze predictive mechanisms, and on the other, it constitutes an additional argument against those approaches that deny the legitimacy of research on normativity: from the point of view of normativity understood in this way, the descriptive or empirical approach postulated by the aforementioned authors is also a prescriptive approach.
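The conversion can be sketched in standard variational notation (the symbols are the conventional ones, introduced here purely for illustration): for observations \(o\), hidden states \(s\), generative model \(p(o,s)\) and approximate posterior \(q(s)\), the variational free energy is

```latex
F[q] \;=\; \mathbb{E}_{q(s)}\!\big[\ln q(s) - \ln p(o,s)\big]
      \;=\; D_{\mathrm{KL}}\!\big[q(s)\,\|\,p(s \mid o)\big] \;-\; \ln p(o)
      \;\ge\; -\ln p(o).
```

Because the KL term is non-negative, \(F\) upper-bounds the negative log evidence; minimizing \(F\) over \(q\) therefore replaces the intractable integration implicit in exact inference with a tractable optimization, while driving \(q(s)\) toward the exact posterior.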

I have structured the paper as follows. In Sect. 2, I discuss the concept of the normative function and justify why I give up the teleosemantic approach to it. Instead, I propose an approach based on the actual causal role played by a given function. In Sect. 3, I describe the motivational relation that defines the normative relation between the generated predictions, the actions taken, and the (expected) states of the world. Section 4 explains the relation between the generated prediction and policy selection. I claim that the choice of a given policy is related to a specific environmental situation; the generated high-precision prediction that allows inferring the expected outcome; and an objective function, which specifies a general (normative) requirement for policies. Section 5 distinguishes counterfactual predictions from semifactual predictions. The former normalize the (relatively) effective actions, the latter the ineffective ones. The possibility of choosing between them proves the normativity of the motivational relation. In fact, the agent does not need to know which of the actions taken by it are based on counterfactual predictions and which are based on semifactual predictions, but it is always obliged to choose an action normalized by this or that prediction. In Sect. 6, I discuss the role of environmental constraints in the motivational relation and their importance in explaining the normativity of predictive mechanisms. In Sect. 7, I argue that mechanisms that perform normative functions can be the causes of certain actions and behaviors. Importantly, these are not the only causes of these actions, but they are the causes that explain (in a mechanistic manner) the success or failure of the action taken by a given organism in a specific environment and situation. They are therefore what I call “constitutive” causes. In the Conclusion, I summarize the analyses carried out and emphasize their importance.

2 The concept of the normative function

The concept of the normative function appears in the works of Millikan (1984; 1989). She links it with the concept of the etiological proper function. An object has a proper function if it derives from a lineage that owes its survival to the existence of a correlation between its distinguishing features and the effects that can be defined as functions of these features (Millikan, 1989, pp. 288–289). This means that a proper function has properties that have been selected through the mechanisms of evolutionary natural selection. In this way, functions can be assigned to such systems, organisms or artifacts which, although having their functions, are not capable of performing them. They fail to fulfill their inherent functions because of some “damage” or because of certain background conditions, for example, those that helped their predecessors keep the function in operation and are now absent. This approach to function relates firstly to how a given thing has been designed or how it acts on purpose (as opposed to what it does accidentally) and secondly, to the existence of some kind of “pattern” that can be found wherever there is a natural attribution of purpose and/or intentionality. For this reason, this approach is called normative, with the proviso that this understanding of normativity has nothing to do with some “evaluation” or “assessment”. In such an approach, the concept of a proper function should be treated as a specific measure or norm for deciding whether something is a function or not, or, crucially, whether it is a dysfunction of a given system, organism or artifact (Millikan, 1984). Normativity in this framework is understood purely historically and is associated with normality as a certain regularity of action or performance (Millikan, 1989, p. 284).

In this approach, the function of representation is the proper function of perceptual and cognitive systems. The dysfunction here consists in not fulfilling the proper function, that is, in incorrect, or ultimately erroneous representation. Misrepresentation is an important and constitutive element of the normal functioning of the organism, because “from an evolutionary perspective, it is more profitable to overrepresent certain features of the environment, rather than not representing them at all” (Bielecka, 2018, p. 185). What is significant for this view is (1) that the possible misrepresentation cannot be explained without reference to the evolutionary history of an organ or part of it; and (2) that in order to explain the function of representation by specific organs or, more broadly, organisms, it is necessary to explain the dysfunction or to define the normative conditions appropriate to the representation function.

However, what makes Millikan’s proposal attractive also determines its problematic nature. Firstly, the assumption that the explanation of the dysfunction of representational systems is based on reference to the history of the selection of their natural predecessors leads to objections formulated by Davies (2001).Footnote 13 Secondly, Millikan’s approach offers a description of the normative function of representation only from the perspective of the species, and not of individuals. In practice, this means that we can only provide minimal conditions that the representational system should meet in order to be able to represent correctly or incorrectly, but we cannot explain why a given organism or system behaved in one way or another or produced such and such a representation. The point is that talking about normativity in teleosemantics serves only to justify this and no other type of description of biological and artificial systems. In practice, this means that if we wanted to use the notion of proper function to explain the normativity of the Bayesian generative model and its predictive mechanisms, we should reduce this explanation to the conclusion that it consists in showing the normal conditionsFootnote 14 that the model must satisfy in order to (for example) incorrectly minimize prediction errors. However, such an approach is unsatisfactory, because generative models, by definition, work in such a way that they are always prone to misrepresentation. In the technical language of PP: a statistical generative model determines the posterior probabilities of the predictions it generates and constantly compares them with the incoming data. It thus updates its internal parameters, while determining new probability distributions for the specific values of the variables it takes.
The whole process consists in bringing the probability distributions of internal parameters and the assigned probability distributions of the external states of the environment as close as possible to each other. The discrepancy between the two distributions is referred to in the PP as the “Kullback-Leibler divergence” (KL divergence). It concerns the difference between posterior generative and approximate recognition distributions (cf. Bogacz, 2017; Kiefer & Hohwy, 2017). This means that misrepresentation is inherent in the very nature of the generative model.
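For discrete distributions the divergence has a simple closed form. A minimal sketch in Python (the two distributions here are invented purely for illustration; `q` stands for a hypothetical approximate recognition distribution and `p` for the corresponding posterior):

```python
import math

def kl_divergence(q, p):
    """Kullback-Leibler divergence D_KL(q || p) for discrete distributions
    given as equal-length lists of probabilities."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Hypothetical approximate recognition distribution q and posterior p
q = [0.7, 0.2, 0.1]
p = [0.6, 0.3, 0.1]

print(round(kl_divergence(q, p), 4))  # small but non-zero discrepancy
print(kl_divergence(p, p))            # a distribution diverges from itself by 0
```

The divergence is zero exactly when the two distributions coincide, which is why minimizing it brings the model’s internal distributions as close as possible to the distributions assigned to external states.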

It should be stated that ascribing the possibility of misrepresentation to given organs, systems or mechanisms, or recognizing their dysfunction (which is constitutive for the normativity of a function or mechanism), is epistemic. Therefore, it is relativized to the description formulated by an external observer in relation to his or her research interests. I argue that the causal conditions that the organism or system “armed” with the generative model must meet in order to be able to act in an uncertain environment and guarantee cognitive and non-cognitive successes to a specific organism are based on internally generated probability distributions about the world. This means that a given function is normative not only because it makes it possible to explain the possibility of dysfunction, but also because it plays a specific causal role, i.e. it influences and maintains the stability of a given system (cf. Bickhard, 2003; 2009). It is normative because the system, in order to exist and organize itself, must fulfill certain normative functions. For example, the function of the heart to pump blood serves the purpose of delivering oxygen to the brain. It is therefore normative because, if the function were not performed, the organism would cease to exist. For this reason, we will say that the function of the heart to produce acoustic effects is only a causal function and not a normative function, since it contributes neither to the realization of other processes, nor to the stability and survival of the organism. It should also be emphasized that the normative function in the sense of Millikan is possible thanks to the performance of the normative function as described here. In this sense, I will refer to explanatory normativity because of its important role in explaining the behavior of a given system, as I argue in § 6. This explanation comes down to indicating a specific normative mechanism (i.e. one that performs a normative function), which is the cause of this behavior or, in other words, makes the behavior possible. In the approach that I defend, the constitutive causes of actions and behaviors are specific mechanisms that perform normative functions. By constitutive cause I mean one that determines the logical conditions for the appearance and realization of a given action or behavior.Footnote 15

Applying these analyses to PP, it should be stated that the content of the generated prediction is normative for the selection of specific actions. Predictions are normative because they are conditions for selecting appropriate actions, resp. policies of action. This means that the actions taken by the agent have their own logical conditions that define the criteria for selecting these actions. Thanks to this, specific predictions can justify selected policies of action. Normativity understood in this way is internal, because the agent’s obligation to act in a certain way results from the very fact of the existence of a given prediction, just as, for example, the fact of the existence of a moral norm implies an obligation to observe it. The normativity of prediction is directly related to the requirement of long-term minimization of the prediction error, resp. VFE. An agent that does not minimize prediction errors in the long term will cease to exist, so it must do so across all the possible states it may find itself in. It must find an appropriate subset of states (determined by some priors and the organism’s phenotype) that will allow it to survive and effectively exchange energy with the environment (Friston & Stephan, 2007).

The above remarks should be made more precise. First of all, when I speak of the “normativity of predictions”, I refer to the predictions that are related to active inference, i.e. the domain of actions and decision making. I claim that a specific prediction or a set of them (I will describe them later in terms of counterfactual predictions) normalizes the selection of such and such actions, resp. policies of action. It is not difficult to notice that such an approach presupposes weak normativity. Weak normativity—as I understand it—can be associated with the metaethical position of motivational internalism (cf. Rosati, 2016). Motivational internalism assumes that having a motive for a given action, in our case an appropriate prediction, is sufficient to justify this action. In other words: prediction is the norm for a given action, resp. policy of action, in the sense that this action, resp. policy of action, is consistent or inconsistent with this prediction (cf. Brandom, 1994, pp. 18–20) (by consistency I understand the possibility of justifying a given action, resp. policy of action, by reference to a given prediction).Footnote 16 The weakness of this approach is that a normative prediction (or set of normative predictions) merely allows certain actions to be reconciled with it, in the sense that the agent either acts according to a given norm (prediction) or not. This means that a normative prediction (or set of normative predictions) can specify the sequence of sensory states that must be brought about in order to achieve some outcome, which means that one might pursue multiple policies designed to bring about the same outcome. This state of affairs reveals a weak normativity, since “must” is downgraded to a “could”, while duty is replaced by arbitrariness.Footnote 17

In this paper, however, I defend a stronger understanding of the normativity of prediction, which can be associated with the metaethical position of externalism. According to this approach, and contrary to the claims of internalism, having a motive, or generating a prediction for a given action, is insufficient to justify it (cf. Rosati, 2016). A motive is therefore something separate from justification (reason), although it may sometimes coincide with it. In our case, this means that a prediction is the norm for a given action, resp. policy of action, not only in the sense that this action, resp. policy of action, is consistent or inconsistent with this prediction, but primarily in the sense that this action, resp. policy of action, is realized precisely because of such and such a prediction. In other words: the agent not only acts according to the prediction, but acts because of it. The normativity of predictions in this strong sense implies that they are related to a certain claim to have them fulfilled (cf. Korsgaard, 1996, pp. 8–9). Normative predictions in this sense not only describe the way in which the agent regulates its actions (weak normativity), but also demand what the agent should do to effectively minimize prediction errors, resp. VFEFootnote 18 (strong normativity). I devote the analysis in § 5 to the justification of the strong normativity of predictions.Footnote 19

This normative property of predictions makes them good guides for action. It favors actions in situations that increase the probability of certain predictions, thereby rejecting actions that relate to situations predicted with low probability. Therefore, predictions are normative also because we can assign certain logical values to them (cf. Bickhard, 2003). In this way, they can influence the content and structure of the generative model and guide action.

Let us illustrate these analyses with a simple example: I have an important job meeting far from my house. I don’t own a car and the nearest bus stop is an hour’s walk away. The weather is important to me, because I have to take this condition into account when choosing my clothes, and in the long run, perhaps also the means of transport. If it is going to rain, for example, I will have to put on a coat and take an umbrella, and if it is going to be sunny, I can leave home without these items. So I look out the window and see the cloudy sky. I can also look at the barometer and thermometer. My previous experiences with changing weather and the information my senses provide (e.g. cloud cover, atmospheric pressure, air temperature) allow me to predict that it is about to rain and maybe there will be a storm. The accuracy of my predictions is crucial for the actions I will take (choice of clothes, means of transport, time of leaving home, etc.). I know that a change in the weather may make it difficult for me to be on time for my appointment and my clothes may get wet. In this sense, this prediction is normative for the actions which it conditions. It determines which actions should be taken if I assume the high probability of this prediction. It also points to some actions that are completely unrelated to this prediction (see § 4). If I predict that there will be a storm and want to minimize its negative effects on the achievement of the goal of reaching my destination at the appointed time in neat clothes, my optimal choice of action will be to put on an overcoat, order a taxi or postpone the meeting to another date.
However, I will not minimize these negative effects if, for example, I turn on the TV, order a pizza or go to sleep, because I will not fulfill my purpose in this way (see § 5).Footnote 20 The normativity of my predictionFootnote 21 not only normalizes what I can do, but also excludes many actions that are not relevant to that prediction if it is considered accurate or true. It is also normative in the sense that if I act in accordance with my prediction (i.e. its content), I increase the chance of success of my actions, and if I ignore it, my actions may end in failure.Footnote 22
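The weather example can be given a toy numerical form. The sketch below is not a full generative model, just a single Bayesian update plus a crude decision rule; all probabilities and the action list are invented for illustration:

```python
def posterior(prior, lik_if_true, lik_if_false):
    """P(hypothesis | evidence) by Bayes' rule for a binary hypothesis."""
    joint_true = lik_if_true * prior
    joint_false = lik_if_false * (1.0 - prior)
    return joint_true / (joint_true + joint_false)

p_rain = 0.3                          # prior belief that it will rain
p_rain = posterior(p_rain, 0.9, 0.3)  # update on seeing a cloudy sky
print(round(p_rain, 4))               # the prediction now favors rain

# The high-probability prediction "normalizes" action selection:
# actions relevant to it are licensed, irrelevant ones are excluded.
actions = {
    "take an umbrella": True,   # relevant to the rain prediction
    "order a taxi": True,
    "turn on the TV": False,    # unrelated to the prediction
    "order a pizza": False,
}
selected = [a for a, relevant in actions.items() if relevant and p_rain > 0.5]
print(selected)
```

On these made-up numbers, a single piece of evidence is enough to tip the belief toward rain, and the prediction then licenses only the rain-relevant actions.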

3 Motivational relation

I claim that a given prediction significantly influences the choice of certain actions, favoring some of them and rejecting others. Consider an example: I am driving on an unlit road at night. I can see two approaching points of light. I predict that these are the headlights of a car coming from the opposite direction. I also assume this car is in the correct lane. There is also a riskFootnote 23 that it is driving against the flow of traffic. However, I am not sure. So how can I make a decision about what to do? It is a situation in which there are numerous discrepancies between the model’s priors and the information coming from hidden causes in the environment. These prediction errors will only be minimized once the model adopts the best possible hypothesis about the causes of the sensory source with respect to the corresponding space-time vectors. For example, one level of the model (higher) will concern the possibility of recognizing light points as car lights, another (lower-lying) will refer to e.g. the detection of the edge of the perceived object, and the next level will generate predictions regarding e.g. the time at which both vehicles would collide. At each level, the model estimates how precise a given prediction error is, so that it is possible to revise the previously adopted hypotheses (cf. Friston, 2009, p. 299).Footnote 24 Appropriate perceptual hypotheses are a basis for predictions that condition my behavior as a driver. For example, the model generates a prediction according to which, if the car I am driving keeps its current direction and track, there will be a collision with the vehicle coming towards it. The error regarding the discrepancy in determining the position of the car in front of me and the possibility of a collision can be minimized in two ways: either the model will revise the priors under the influence of the prediction generated, or it will perform active inference, i.e. it will interfere with the causal structure of the world in order to minimize the error about a potential collision (cf. Friston, 2010, p. 129). By active inference I mean selective sampling of sense data so that it can be adapted to the generated predictions. In practice, this means that the agent’s priors include the assumption that it must take actions that will minimize surprise, or a given prediction error. This means that the agent must represent itself in the expected future states by performing specific actions (Friston et al., 2014; Schwartenbeck et al., 2013, p. 2).Footnote 25 What is important in these analyses is the relation between the generated prediction and the action taken (as part of active inference).

In the analyzed example with a car, I can take both actions that will minimize the prediction error effectively and those whose effectiveness is questionable. Predicting a collision, I can, for example, pull over to the roadside or stop the car. Each of these actions interferes with the causal structure of the world in the sense that they can trigger a specific reaction in the driver of the other vehicle. However, there are actions that, despite interfering with certain states of affairs, will not minimize the prediction error. Such actions are, for example, turning on the music or air conditioning, activating the windshield wipers or talking to a fellow passenger. This means that the appropriate prediction normalizes the choice of possible actions to be taken (limited by the knowledge of the model about probabilistic relationships in the world and by specific priors regarding, for example, driving behavior). A given prediction may therefore normalize both actions that will allow for effective minimization of the prediction errors and those that do not lead to minimization, although the agent may believe that e.g. listening to relaxing music in the car will allow him or her to react faster in an emergency.

Schematically, this “normalization” can be written as the following conditional:

If ‘prediction’ (condition), then ‘action’ (outcome).

I argue, however, that the relation between predictions and actions is not a typical causal relation that can be written symbolically as “If B, then A”, but a relation that I will refer to as motivational. It can be represented as a conditional of a specific form:

If B, C or D (etc.), then A, but not if E, F or G (etc.).

Predictions condition the emergence of certain actions, thus excluding others. This thesis is justified by one of the main assumptions of PP, according to which the purpose of the model is the (long-term) minimization of average prediction errors. It follows that only those actions that help achieve this goal are favored. In practice, this means that a given organism acts in a way that increases its chances of survival, i.e. reduces the degree of surprise associated with the need to act in an uncertain and unknown environment.
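The motivational conditional can be mimicked as a filter that a prediction imposes on the agent’s action repertoire: some actions are favored, the rest are excluded. The predictions, actions and groupings below are invented for the driving example and are purely illustrative.

```python
# A toy rendering of the motivational conditional: a prediction "normalizes"
# the choice of actions by admitting some options and excluding others.
# All entries are illustrative assumptions based on the driving example.

ACTIONS = {"change lane", "pull over", "stop the car",
           "turn on the radio", "talk to a passenger"}

# Which actions a given prediction favors; every other action is excluded,
# even though it would still causally interfere with the world.
NORMALIZES = {
    "imminent collision": {"change lane", "pull over", "stop the car"},
}

def admissible_actions(prediction):
    """Split the repertoire into actions favored by the prediction
    and actions it excludes."""
    favored = NORMALIZES.get(prediction, set())
    return favored, ACTIONS - favored

favored, excluded = admissible_actions("imminent collision")
```

The point of the sketch is only structural: the same repertoire is partitioned differently depending on which prediction is active, which is what the schema “If B, C or D (etc.), then A, but not if E, F or G (etc.)” expresses.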

The justification of the normative nature of predictive mechanisms also requires reference to the interaction between the organism and the environment, which, as many researchers claim (cf. Gibson, 1979; Norman, 2013), is already pre-structured. Therefore, it should be said that the normativity of prediction is determined by specific functional roles in the generative model (including selecting and guiding actions) and by reference to specific properties of the natural environment as well as socio-cultural circumstances. Because the world is “previously” structured, it can present certain reward or evidence values to the organism (cf. Friston et al., 2012). I will focus on this thread in § 6.

Predictive mechanisms are normative because they refer not only to the requirement of minimizing prediction errors and uncertainty, resp. VFE, but also to prior beliefs, preferences, or motivations that arise in relation to certain environmental states.Footnote 26 From this perspective, a possible error or misrepresentation has normative significance for the organism not only as a potential result of specific causal processes, but above all because it significantly shapes the causal transitions between states with specific content and a structured environment, thus determining the selection of appropriate actions in such a way that they correspond to the normative Bayesian rule (cf. Kiefer, 2017; Shams, Ma & Beierholm, 2005). Thus, what an organism does depends on the requirement to minimize prediction errors, on individual preferences and beliefs, and on specific properties of the environment. This means that actions are selected on the basis of some conditional potential (linked to predictions) and the relationship that the organism enters into with its environment. It is therefore appropriate to agree with Bickhard that “Such conditional relationships can branch — a single interaction outcome can function to indicate multiple further interactive potentialities — and they can iterate — completion of interaction A may indicate the potentiality of B, which, if completed, would indicate the potentiality of C, and so on” (Bickhard, 2009, p. 78). This means that the cognitive system continues to predict the form and content of sensory signals, not only by using its priors and the Bayesian rule, but also by actively acting in its environment.

Now let’s take a closer look at the motivational relation. It is not a factual conditional (if A happened, then B happened), but rather a counterfactual conditional (if A had happened, then B would have happened). Why? Because (in the PP version of this story) ignorance of the actual outcome (caused by an action) means that the probability of a given prediction is estimated (according to the Bayesian rule) only in relation to the likelihood of the observations. For this reason, the model selects the counterfactual prediction that is the most likely, in the light of the data, to explain the prediction error, i.e. one for which high precision is expected. This means that the model favors those actions (as part of active inference) that are conditioned by the most probable counterfactual prediction. This speaks to the fact that uncertainty-reducing policies have to be selected via a process of Bayesian model selection, which in turn rests upon the capacity to entertain counterfactual hypotheses like “what would happen if I did that?”. It must also be added that ignorance of the outcome is constitutive of the necessity to choose between actions that are conditioned by an appropriate prediction. Thus, the expected outcome is only more or less probable. For this reason, the entire process described here specifies conditionals that are not factual but counterfactual. “Generative models underlying perception incorporate explicitly counterfactual elements related to how sensory inputs would change on the basis of a broad repertoire of possible actions, even if those actions are not performed” (Seth, 2014, p. 97). This means that generative models encode not only the likely causes of sensory signals, but also process these signals according to a repertoire of possible actions. Based on this knowledge, they then choose between alternative policies.
Vague predictions of alternative error-minimizing actions and their expected consequences are ignored by the model in the light of reliable information based on the model’s priors and uncertain information from the input.
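The selection of the most probable counterfactual hypothesis can be illustrated as Bayesian model comparison. The hypotheses and all probabilities below are invented for the driving example; only the form of the inference (posterior proportional to prior times likelihood) follows the text.

```python
# A sketch of selecting the most probable hypothesis by Bayesian model
# comparison. Hypotheses and numbers are illustrative assumptions only.

def posterior(priors, likelihoods):
    """Posterior over hypotheses from priors P(h) and likelihoods P(data | h)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # marginal likelihood (model evidence)
    return {h: p / evidence for h, p in joint.items()}

priors = {"oncoming car in its lane": 0.90,
          "oncoming car in my lane": 0.10}
# How well each hypothesis explains the observed pair of lights.
likelihoods = {"oncoming car in its lane": 0.6,
               "oncoming car in my lane": 0.4}

post = posterior(priors, likelihoods)
best = max(post, key=post.get)  # the hypothesis that drives active inference
```

The hypothesis with the highest posterior is the one "for which high precision is expected", and it is this hypothesis that conditions the actions taken as part of active inference.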

The policy selection process can be described in more general terms as one in which agents look backward for the best explanation of the antecedent and then forward to see whether that explanation would imply the outcome (Rips, 2010, p. 212). The best explanation in PP is strictly related to abductive inference: the model abductively “infers” the possible outcome of actions with high expected precision in such a way that it presents hypotheses that best explain information coming from the environment. For example: from the action “I will pull over to the roadside” one can derive the outcome “There will be no collision with the car coming from the opposite direction”. Some authors (cf. Mackie, 1974) argue that conditionals are neither true nor false, but only serve to highlight inferences that are permissible in a given cognitive situation, not to state that something is true or false. On this view, they do not describe an ontological situation. It seems, however, that the conditionals embodied in the motivational relation occur not only in an epistemic but also in an ontic situation. Counterfactual predictions, which are among the constitutive elements of the motivational relation, are inferences based on sensory information from an environment that is already pre-structured by certain constraints. According to PP, the model does not know these constraints (it can, of course, draw inferences about them on the basis of information from the input), but they are conditions for structuring this information. For example, in the process of seeing, they guarantee the matching of appropriate elements to most natural scenes (cf. Marr & Poggio, 1976; 1979). One such natural constraint is, for example, spatial location (Marr, 1982, pp. 68–70). This means that the objects in the world that cause changes in the intensity of light are spatially located. Thus, these constraints can be understood as specific facts about the real world (Shagrir, 2010, p. 489).

4 Counterfactual predictions and policy selection

The statement that the motivational relation is ontic in nature does not yet determine its normative character. The normativity of this relation is directly related to a certain arbitrariness in the choice of actions conditioned by a given prediction. Returning to the example of the weather: if I predict that there will be a storm and at the same time I want to reach the agreed place in dry clothes, my prediction justifies, for example, my putting on a coat, ordering a taxi, etc. However, it does not justify turning on the TV or ordering take-away food. This means that such actions are unlikely to be undertaken by me, as they do not maximize the model evidence. This normative requirement to maximize the model evidence, together with the specific environmental conditions, justifies the normativity of a (highly precise) prediction, i.e. it justifies why the agent may choose among such and such possible actions, and why it will rather not choose certain other actions.

This means that the agent does not have to choose one particular action (in which case we would say that the choice is caused by a prediction), but is obliged to choose some action that is normalized by such and such counterfactual prediction.Footnote 27 The agent is obliged to choose one of the actions offered by the generated prediction in the sense that this prediction (abductively) justifies the agent’s choice. In other words, the agent interferes in the world in such a way that its actions result in the realization of one of the expected counterfactual outcomes. This obligation results from the need to minimize prediction errors, resp. VFE, i.e. the need to remain in existence (cf. Hohwy, 2020b). In this way, a prediction error will be minimized by implementing counterfactual (active) inference.Footnote 28 These observations require clarification.

Because the agent is not given any specified input data, it must independently find the appropriate patterns, dependencies and relationships against which it will plan its actions. It is a common view that in PP we are dealing with unsupervised learning algorithms. In practice, this means that, given a policy adopted by an agent and a set of environmental constraints, a generated counterfactual prediction will prescribe a course of action that the agent should take, assuming that it wants to advance the policy as best it can, given those constraints. And here is the problem: given that agents can hold multiple policies at once, with the actions related to each potentially being pragmatically in conflict, it is hard to see how any predictions prescribing such actions could be normative for the agent (at best they can be normative for the agent qua a given policy, but that does not settle which action should be performed, since the agent still needs to choose between policies). This problem can be solved by referring to the normative nature of predictions. On the one hand, they are normative because they justify the choice of a given policy, which means that they can also suggest a change of the currently implemented policy if it differs from the one that regulates, or justifies, the generated prediction. Predictions are therefore necessary because they drive the selection of action policies and at the same time oblige the agent to change policy when the outcome they expect maximizes the Bayesian model evidence.Footnote 29 On the other hand, they are normative because they can “ease” conflicts between existing policies. This issue requires a closer look.

Conflicts between policies can be related to the explore/exploit trade-off. The concept is taken from machine learning, but has a much wider application (cf. Cohen et al., 2007). Generally speaking, this trade-off concerns situations in which one chooses between what is known and can be foreseen (it may meet the agent’s expectations)—exploitation—and what is not certain, but may offer some novelty in the form of information, experience or skills—exploration. Given the goals set in these analyses, I assume that the exploration-exploitation trade-off concerns (1) the choice between acquiring new knowledge and using already existing knowledge; and (2) the choice between new, non-obvious action options and proven, well-known action strategies. The choice between exploration and exploitation can be understood as a choice between certain behavioral tendencies. The challenge is to provide a formal account of goal-directed exploration, in which agents are guided by minimizing uncertainty, resp. prediction error, and by actively learning about the world (Schwartenbeck et al., 2019). Imagine going out to a restaurant in the evening. Do we want to choose a well-known and proven place that bores us a bit, or go to another one that may positively or negatively surprise us? It is a choice between options that may have a positive or a negative outcome, which is directly related to unexpected and expected uncertainty (Yu & Dayan, 2005).

At this point, I should refer to the concept of expected free energy (EFE). EFE quantifies the VFE of various actions based on expected future outcomes (cf. Friston et al., 2015; Millidge et al., 2021). Why is this concept relevant to the issue of the normativity of prediction? Future actions, i.e. those to be conditioned by normative predictions, trigger future outcomes that have not yet been observed. Actions must therefore be selected in such a way that they minimize the EFE. The already mentioned exploration-exploitation trade-off returns here, because minimizing the EFE leads both to maximization of reward and to minimization of uncertainty.Footnote 30 By minimizing the EFE, the agent maximizes the expected outcomes in the exploitation of the environment. At the same time, the agent minimizes uncertainty about the state of the world by obtaining information from the environment (exploration). This means, to use the language of the active inference framework, that most actions have both pragmatic and epistemic aspects, which can be associated with the already mentioned exploration-exploitation dilemma (Friston et al., 2015, p. 2). The agent solves this dilemma by implementing the normative requirement to minimize the EFE.
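The role of the EFE in balancing exploitation against exploration can be sketched as follows. This is a schematic decomposition only (a pragmatic term, rendered as divergence from preferred outcomes, minus an epistemic term, rendered as expected information gain); all numbers are invented for illustration and this is not the full active-inference formalism.

```python
# Schematic per-policy expected free energy (EFE): pragmatic cost minus
# epistemic value. Policies, preferences and gains are illustrative only.
import math

def kl(p, q):
    """Kullback-Leibler divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def expected_free_energy(predicted_outcomes, preferred_outcomes, info_gain):
    # Pragmatic term: how far predicted outcomes fall from preferences.
    pragmatic = kl(predicted_outcomes, preferred_outcomes)
    # Epistemic term: expected reduction of uncertainty about the world.
    return pragmatic - info_gain

preferences = [0.9, 0.1]  # e.g. [no collision, collision]
policies = {
    # name: (predicted outcome distribution, expected information gain)
    "exploit (known route)": ([0.85, 0.15], 0.05),
    "explore (new route)":   ([0.70, 0.30], 0.40),
}
efe = {name: expected_free_energy(pred, preferences, gain)
       for name, (pred, gain) in policies.items()}
chosen = min(efe, key=efe.get)  # the policy that minimizes EFE
```

With these particular numbers the large epistemic gain makes the exploratory policy win despite its worse pragmatic profile, which displays how one objective can arbitrate the trade-off.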

Thus, the minimization of the EFE reveals another aspect of the normativity of prediction. How the agent will minimize the EFE, i.e. whether by realizing pragmatic actions (exploitation) or epistemic actions (exploration), depends on predictions about future actions and their expected (future) consequences (Smith, Friston & White, 2022, p. 10), i.e. predictions about the EFE. The EFE refers both to prior beliefs (which play the role of preferences), to higher-order representations (e.g. the agent’s model of itself), and to the phenotype of a given organism, which defines the set of states that are the least “surprising” for the agent and are consistent with its survival. In this sense, what is normative is the ability to arbitrate between policies with respect to predictions, certain sets of prior beliefs, higher-order representations and an organism’s phenotype.Footnote 31

Summing up, it should therefore be stated that the choice of a specific policy is related to (1) a specific environmental situation, which can be analyzed in terms of constraints for mechanisms (see § 7); (2) the generated high-precision prediction that guides future actions by minimizing the EFE; and (3) the VFE (objective function), which specifies a general (normative) requirement for policies, namely minimizing expected surprise on a long-term average (Friston et al., 2017). On this understanding of the choice of action policies, unsupervised learning in PP becomes self-supervised learning that is normatively constrained, on the one hand, by normative predictions and, on the other, by specific states of the world.Footnote 32

5 Counterfactual predictions vs. semifactual predictions

According to the previous analysis, the motivational relation should be written as follows:

If B, C or D (etc.), then A, but not if E, F or G (etc.).

This notation shows that, as part of active inference, the agent is obliged to choose one counterfactual prediction from among the many available. For example:

“If I change lane (B), there will probably be no collision (A)”; or

“If I change the direction of travel (C), there will probably be no collision (A)”; or

“If I turn on the radio (E), a collision will probably occur (A)”; or

“If I start a conversation with a fellow passenger (F), there will probably be a collision (A)”, etc.

Why would the model generate the prediction “If I change lane, there will probably be no collision”, rather than the prediction “If I turn on the radio, there will probably be no collision”? After all, the agent does not need to know the statistics of car accidents, their causes, or unsuccessful attempts to avoid them (it is difficult to imagine such a situation in the modern world, but it is not impossible). The latter prediction seems unreliable, yet the agent is unable to state that with certainty. Estimates of counterfactual causal relationships between events (e.g. radio on) and their outcomes (car collision) may influence the subjective impression that some alternative variants are close to reality and others are not (Kahneman & Varey, 1990). In the context of PP, this means that the agent actually chooses between counterfactual predictions (more effective) and semifactual predictions (less effective or ineffective). Counterfactual predictions can be specified as “if” conditionals and semifactual ones as “even if” conditionals. The latter are so defined by philosophers (cf. Chisholm, 1946; Goodman, 1973) because they combine a counterfactual antecedent and a factual consequent. This means that, unlike counterfactuals about what might have been, semifactual alternatives seem to suggest that the outcome is inevitable (Byrne, 2005, p. 129), when in fact different antecedent behaviors can lead to different outcomes/consequents. This is directly related to the features of counterfactual predictions, which aim at identifying actions that will actually influence the course of events in ways that matter to the agent. For example: “Even if the driver turned on the radio, he or she would not avoid a collision”. The situation is different with counterfactual predictions: “If the driver changed lane, he or she would avoid a collision”. Here the outcome suggests that it is the result of a specific action.
This means that counterfactual predictions relate to how sensory stimuli would change if the agent interacted with the world in the manner suggested by these predictions, taking into account the expected consequences of these interactions. In the case of semifactual predictions, the relation between interventions and consequences is weak or absent. Again, in an effort to avoid a collision, the agent expects to accomplish this goal by changing lane. Turning on the radio will certainly not help, because avoiding a collision is not the expected consequence or outcome of such an action.
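The counterfactual/semifactual contrast can be rendered as a difference-making test on conditional probabilities: an antecedent is counterfactually relevant when intervening on it shifts the expected outcome, and semifactual when it leaves the outcome (nearly) unchanged. The baseline, probabilities and threshold below are illustrative assumptions for the driving example, not values drawn from the literature.

```python
# Difference-making test for the counterfactual/semifactual contrast.
# All probabilities are invented for the driving example.

BASELINE_P_COLLISION = 0.6          # keep course, do nothing

p_collision_given = {
    "change lane": 0.05,            # strongly alters the outcome: "if ..."
    "turn on radio": 0.6,           # outcome unchanged: "even if ..."
}

def classify(antecedent, threshold=0.1):
    """An antecedent is counterfactual if it shifts the outcome probability
    by more than the threshold, semifactual otherwise."""
    shift = abs(p_collision_given[antecedent] - BASELINE_P_COLLISION)
    return "counterfactual" if shift > threshold else "semifactual"
```

On this sketch, "change lane" makes a difference to the collision probability while "turn on radio" does not, which is exactly the sense in which semifactual alternatives leave the outcome as it was.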

Let’s take a closer look at this. The agent, who wants to achieve the expected outcome (e.g. to avoid a collision on the road), takes actions that are conditioned by specific counterfactual predictions. Thus, if the agent is to act in such a way as to select the best policy for bringing about some outcome (in other words, so that the EFE is minimized), it must act according to the predictions that suggest actions leading to the realization of these expected outcomes. This means that the agent not only acts in accordance with these counterfactual predictions but also because of them. Why? Because the requirement to minimize the EFE implies not only specific actions, but also recognizing the minimization of the EFE as leading to actions. The driver will interact with the environment in such a way that, in his or her opinion, his or her actions will result in avoiding a collision. Thus, the actions will not only be in line with his or her predictions, but will also be taken precisely because of these predictions. In this sense, these counterfactual predictions are de facto norms of what the driver is or is not supposed to do in a given situation.

Generally speaking: the agent’s actions are therefore not only in line with the predictions, but are also explained by them. They provide an answer to the question of why the agent should take such and such actions if it wants to achieve specific goals (in our case, avoiding a collision) (cf. § 7). If this argument is correct, it justifies the claim that predictions are normative in both the weak and the strong sense, which implies the position of motivational externalism.

It should be added that counterfactual predictions assume the minimization of the prediction error by means of active inference, which in this case means the actual interference of the agent with the causal structure of the world: the lane change is the cause of avoiding a collision. Semifactual predictions assume a change in the hypothesis regarding the causes of the sensory signal source, which means that the agent minimizes the prediction error by changing, for example, its beliefs about the causal relationships between radio activation and car collisions. What I mean is that choosing a policy of action based on a semifactual prediction presupposes first changing certain parameters of the generative model, so that “collision avoidance” becomes the expected outcome of the “radio on” action. More precisely, accepting the truth of a semifactual prediction assumes a change in the Bayesian network that is the generative model. For example, the prediction “If the driver turns on the radio, he or she will avoid a collision” assumes introducing the belief that turning on the radio may have an impact on avoiding a collision on the road, which in turn changes the coherence of the model. This is because such a belief is generally not substantiated by other beliefs present in the Bayesian network. This is not the case with the counterfactual prediction “If the driver changed lane, he or she would avoid a collision”, which may be coherent with the agent’s other beliefs,Footnote 33 which in practice means that the expected consequences assumed by this prediction have a specific degree of corroboration (cf. Popper, 2005, Chap. 10) in the light of the agent’s actual knowledge.Footnote 34

Therefore, it can be concluded that counterfactual predictions normalize (relatively) effective actions, and semifactual predictions normalize ineffective actions. The possibility of choosing between counterfactual predictions (with their related actions) and semifactual predictions indicates the normativity of the motivational relation. In fact, the agent does not need to know which of its actions are based on a counterfactual prediction and which on a semifactual one, but it is always obliged to choose an action normalized by a given prediction. And this action, to repeat, is not only consistent with a given (counterfactual) prediction but, more importantly, is taken precisely because of this prediction. The necessity of choosing a given action, resp. policy of action, is conditioned by the necessity of minimizing the emerging prediction errors. In other words, the normativity of the motivational relation is grounded in the normativity of the generative model: the generative model does not simply optimize itself (increase its coherence) with respect to both action and perception, but needs to be optimized. Otherwise it will not exist (cf. Hohwy, 2020b; Ramstead et al., 2020, p. 233).Footnote 35

6 Motivation and constraints

The agent is in a causal relation with the environment. Nevertheless, as has been shown, it may also be in a motivational relation with it. The motivational relation is normative, as was justified in § 3. It is also normative in the sense that it constitutes a reference of the agent to a given object in the environment that allows the object to be perceived as valuable, i.e. as one that evokes desire, will, aversion or disgust in the agent. In other words, when faced with certain objects, the agent may feel motivated or obliged to take a certain action or to refrain from it.Footnote 36 The existence of motivational states would not be possible if a normative motivational relation between the agent and the environment had not been established. The object would also not be perceived as valuable if it did not become the pole of the motivational relation. Certain objects or states of affairs in the environment may have a special meaning for the agent, precisely as something valuable. In this sense, the agent perceives its environment not only as a source of information, but also as a place where its interests, desires or intentions are realized. However, these claims require further justification.

I have established that there is a motivational relation between a prediction and the action taken by the agent. Without it, it is impossible to explain why the agent undertook such and such action in a specific situation. Its motivational nature is related to the fact that a given prediction or (Bayesian) belief, “by itself and relative to a fixed background of desires, disposes the subject to behave in ways that would promote the satisfaction of his desires if its content were true” (O’Brien, 2005, p. 56). Therefore, what makes a prediction or belief normative for a specific action or behavior is whether it plays a significant motivational role (Sullivan-Bissett, 2017, p. 95). However, I argue that what is motivational is not only the prediction itself, but above all the relation between the prediction and the specific action given such and such environmental conditions. I claim this because the environmental conditions (understood by me in terms of constraints) co-constitute the actions directed by the predictions. What I mean is that, in the motivational relation, certain predictions, understood as probability distributions over such and such states of the world, normalize the appearance of such and such actions, thus excluding other actions (cf. § 3). Therefore, priority is given to those actions that result from the agent’s motivation in relation to specific, expected states of the world. For this very reason, I argue that attention should be paid to the key role of the environment as a cause of motivational signals.

I suggest describing the environment in terms of constraints. The concept of constraint was proposed by Pattee (1968; 1972). In his opinion, identifying the constraints in a given system ensures a better understanding of its functioning. Pattee distinguished constraints from laws. The latter are necessary and cannot be avoided. Constraints, by contrast, are often contingent and relative. Constraints, unlike the laws of nature, must be a consequence of specific material structures, such as particles, membranes or, for example, machines. These structures are static, that is, to some extent dependent on the laws of nature, but their behavior can only be explained by pointing to their time-dependent constraints. It is for this reason that Pattee refers to them as “rules” (cf. Marr, 1982, pp. 22–23). In general, constraints reduce the degrees of freedom of a given system with regard to the variability or the possibility of changing its parameters, components and behavior (Umerez & Mossio, 2013).

So let’s consider how environmental constraints affect the functioning of an organism. What an organism encounters in an environment structured by constraints constitutively influences its motivation, allowing it to reduce its uncertainty under certain conditions. Importantly, uncertainty not only has a potentially detrimental effect on the agent, but is also a motivational property (Anselme, 2010). Thus, motivation should not be treated simply as an expression of the needs of a given organism, but as a factor constituting policies aimed at seeking novelty. This is because it is directly related to the processing of information about objects in order to optimize actions (Anselme, 2010, p. 292). The point is that certain properties of the world constrain the pool of possible actions of the organism, excluding some options and pointing to others. For example: if we enclose a frog in an aquarium, the properties of such an “artificial” ecosystem will reduce the amphibian’s pool of possible behaviors. It will only be able to move within the boundaries of the aquarium and “hunt” only what is in it. The trapped frog’s potential behavior is governed by the environmental constraints of the aquarium in which it resides. It has been shown that, when a predator is introduced into such an ecosystem, a tadpole can change the shape of its body under the influence of the stress hormone, so that it is better prepared for a potential attack (Maher et al., 2013). A threat signal, i.e. an increase in uncertainty in the ecosystem, triggers a corresponding hormonal response in tadpoles.

In my interpretation of environmental constraints, information about an emerging threat motivates the organism to react in a specific way. This reaction may be, as in the case of tadpoles, a morphological change, escaping to a safe place or remaining motionless in order to prevent a predator from tracking the victim (cf. Tolledo, Sazima & Haddad, 2011). The reaction may also be a change of the car’s lane when the signal of a prediction error regarding a potential collision motivates me to act in this and no other way.

It should be emphasized that the approach to motivation presented here does not define it as a mental state in the sense of folk psychology (cf. Ravenscroft, 2019), but rather as a functional role of the mechanisms generating predictions (cf. Miller Tate, 2019). This claim needs clarification. Alex Miller Tate points out that “theory of motivation can only succeed if it shows how a single mental state (typically, a proximal intention or similar) can play the roles of action initiation, guidance, and control. And it’s far from obvious that such a supposition is reasonable; though we might begin our theorising with a folk-psychological notion of intention, there may turn out to be no one-to-one mapping between this category and the computational / neural components of motivational architecture” (Miller Tate, 2019, p. 4). This is a significant issue to consider. Miller Tate suggests adopting a framework in which motivation is understood as states or combinations of states that play the functional roles of initiation, guidance, and control. The states thus understood “[cause] the prediction of, and selective redeployment of attention towards, action-relevant proprioceptive and exteroceptive sensory signals” (Miller Tate, 2019, p. 5). At this point, my account is broadly in line with Miller Tate’s view. The main difference, however, concerns (1) attention to the normative nature of the motivational relation, which indicates those properties of the environment thanks to which the objects of perception become valuable, i.e. important because they trigger the agent’s motivation.
In this sense, one can speak of the agent’s normative dependence on the environment; and (2) the ontic nature of the motivational relation.Footnote 37 I argue that its ontological character is determined by the fact that this relation, conditioned by certain environmental constraints, is a component of the mechanism by which we can explain the actions of the agent in the environment. I will devote my further analyses to this issue.Footnote 38

7 Normative predictive mechanisms

Many researchers (cf. Gładziejewski, 2019; Harkness & Keshava, 2017; Hohwy, 2015) believe that the appropriate explanatory framework for PP is provided by mechanisms (cf. Craver, 2007; Bechtel, 2008; Kaplan, 2011). On this approach, PP is a sketch of a mechanism that will allow researchers to formulate a mechanistic explanation of specific cognitive phenomena. This means that although the brain is composed of many distinct mechanisms, these mechanisms may be unified by the fact that they fall under a common blueprint in their functional organization (Gładziejewski, 2019, p. 659). The thesis about normativity, which for me is a strong premise for adopting a realistic position in relation to PP, is grounded in the belief that mechanisms are normative insofar as they allow one to explain the normativity of specific functions (cf. Garson, 2013).Footnote 39 In this sense, I claim that they can be referred to as normative mechanisms. Garson emphasizes that if we do not refer to the normative functions performed by mechanisms, it becomes difficult to explain their dysfunctions, which may lead to talk of mechanisms responsible for dysfunctions, e.g. mechanisms that are responsible for heart attacks, malfunctioning mixers or misrepresentations. Garson’s thesis suggests that normativity is not only a pragmatically useful concept within a specific research strategy, but a concept that seems to have a specific explanatory power. Nevertheless, one can raise an objection against Garson similar to that raised against the teleosemantic approach: the concepts of function and dysfunction constitutively presuppose and mutually define each other, for there is no function without dysfunction. For this reason, it is difficult to say that this concept of function actually explains anything.

However, it can be argued, and I will defend this approach, that describing a given mechanism or function as normative means that it plays certain causal roles (and not only functional roles, as Miller Tate claims). I argue that when we speak of such roles in relation to the normative properties of predictive mechanisms and functions, we refer to them as the causes of specific actions of an organism in its environment. In other words, normativity is a predicate with which we can explain certain phenomena (i.e. actions and behaviors) in terms of the mechanisms and functions that cause them.

Craver (2012) provides a strong argument for linking the explanation of mechanisms with their functions (see also Piccinini & Craver, 2011). He points out that despite the rejection of teleological explanations by many sciences, both the physiological sciences and the neurosciences often refer to functional descriptions, and these descriptions often guide the search for mechanisms. Functional descriptions contribute to mechanistic explanations in three ways: (1) as a means of carefully pointing to appropriate etiological explanations;Footnote 40 (2) as ways of framing constitutive explanations; and (3) as ways of explaining specific items by locating them in higher-order mechanisms. What is important for us is that functional descriptions are ontic in nature, which means that they are based not on the observer’s decisions and research strategies but on the actual regularities present in the phenomena (cf. Craver, 2013).

Here I come to the conclusion of these considerations: mechanisms that serve normative functions may be the causes of specific actions and behaviors. We recognize them by identifying these and no other functions. The important point is that not all mechanisms can be constitutive causes of actions and behaviors, but only those that can be defined as normative. I thus distinguish between normative, or constitutive, causal mechanisms and merely causal mechanisms. For this reason, it cannot be said that the mechanism responsible for moving the hand is the constitutive cause of the fact that, for example, I wanted to drink coffee from a cup. It is, of course, a mechanism that co-constitutes the act of grabbing the cup and raising it to the mouth, but it would be a serious abuse to say that it is the cause of everything that might be described as drinking coffee. The case is different with the (normative) predictive mechanism, which determines the constitutive cause of an action: the feeling of thirst and fatigue increases uncertainty in the environment. If I am sleepy and thirsty, my possibilities of action are much more limited, which may give rise to situations with a greater degree of uncertainty. The counterfactual prediction that I will minimize potential uncertainty if I drink a caffeinated drink could then lead to coffee consumption. Let us go further. The appropriate high-level predictive mechanism, responsible for generating predictions that minimize the prediction errors associated with the organism’s low performance, is the cause of such and such an action, in this case drinking coffee. However, I distinguish the cause of a given action from its reason. Confusing these concepts may lead to a mistaken treatment of PP in terms of folk psychology.
In the approach that I defend, the causes of actions are specific predictive mechanisms that perform such and such normative functions with regard to the requirement of (long-term) minimization of prediction errors. These are, of course, not the only causes of these actions, but they are the causes that explain the success or failure of an action taken by a given organism in a specific environment and situation. They are therefore what I call “constitutive”.
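The coffee example can be given a schematic computational gloss. The following toy sketch is purely illustrative (all action names and uncertainty values are my hypothetical stand-ins, not part of any PP formalism): candidate counterfactual predictions are scored by the uncertainty expected to remain after the corresponding action, and the action whose prediction minimizes that expected uncertainty is selected.

```python
# Toy sketch of counterfactual action selection (illustrative only).
# Each candidate encodes "what will happen if I do this and that",
# scored by the expected residual uncertainty after acting
# (a crude stand-in for long-term prediction-error minimization).
candidates = {
    "drink_coffee": 0.2,  # caffeine: fatigue-related uncertainty drops
    "keep_working": 0.7,  # fatigue persists: uncertainty stays high
    "take_a_nap": 0.4,    # helps, but interrupts the current task
}

def select_action(candidates):
    """Return the action whose counterfactual prediction
    minimizes expected residual uncertainty."""
    return min(candidates, key=candidates.get)

print(select_action(candidates))  # -> drink_coffee
```

On this picture, it is the high-level predictive mechanism doing such scoring, not the motor mechanism executing the grasp, that the sketch casts as the constitutive cause of drinking coffee.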

It should also be added that the explanation of normative mechanisms, including predictive mechanisms, should include a description of the components and their relations, their operations, and the physical constraints that are jointly responsible for the occurrence of a given phenomenon. Importantly, not all phenomena can be explained in terms of neural mechanisms (cf. Weiskopf, 2016). Sometimes it is necessary to refer to components and operations that are also co-constituted by social and cultural constraints (Miłkowski et al., 2018, p. 9; Norman, 2013). Following Marr, I understand constraints as causal and effective, that is, as providing the necessary and sufficient conditions for the functioning of specific processes and mechanisms (Marr, 1982, pp. 111–116). On this approach, constraints are, in a sense, norms or principles that define the boundaries and rules for realizing such and such processes. Constraints can be physical, biological, social or cultural.Footnote 41 Their analysis is crucial to explaining the constitution of a given mechanism. For example, it is impossible to satisfactorily explain the mechanism of driving a car without taking into account the physical and symbolic restrictions related to road traffic (specific regulations, knowledge of road signs, etc.).

On the basis of the above analyses, it should be concluded that constraints are an important component of mechanisms and can be used to explain relevant situations, actions, phenomena or processes. The example of driving a car in urban space shows this clearly. If one does not take into account the many possible and existing constraints on driving, the explanation of this phenomenon is either trivial or schematic, and so ultimately not a good explanation (cf. Winning & Bechtel, 2018). I can now explain how constraints affect motivation. What the agent encounters in an environment structured by constraints (constitutively, and not merely causally) influences its motivation to reduce uncertainty. Certain physical, social, symbolic or cultural properties of the world constrain the pool of the agent’s possible actions, excluding some options and pointing to others.
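The role of constraints in shaping the agent’s pool of possible actions can be illustrated with a minimal sketch (all names are my hypothetical examples; the predicates stand in for physical or symbolic features of the traffic environment):

```python
# Toy sketch: environmental constraints filter the pool of possible
# actions before any prediction-based selection takes place.
all_actions = ["cross_now", "wait", "detour", "drive_through"]

# Each constraint is a predicate drawn from a physical, social or
# symbolic feature of the environment (illustrative examples).
constraints = [
    lambda a: a != "cross_now",      # red light: crossing is excluded
    lambda a: a != "drive_through",  # road closure: excluded
]

def feasible_actions(actions, constraints):
    """Keep only the actions permitted by every constraint."""
    return [a for a in actions if all(c(a) for c in constraints)]

print(feasible_actions(all_actions, constraints))  # -> ['wait', 'detour']
```

The point of the sketch is only that exclusion operates constitutively on the action pool itself, rather than as a downstream causal side effect of action execution.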

The predictive function of the model is obviously constrained, on the one hand, by its internal parameters (the structure and content of the Bayesian network and of Bayesian inference) and, on the other, by certain physiological constraints (e.g. the capacity of the visual system, the efficiency of neural processes) or chemical constraints (e.g. chemical reactions in pyramidal cells), etc., as well as by the motivational relation. The relation is motivational due to the constraints imposed on the selection of actions by the model’s predictive function—or, more precisely, the generated prediction—and by certain environmental constraints. The motivational relation, which is in a sense embodied in the relation between a prediction and certain states of the world, is the basis for the emergence of such and such motivation in the agent. Thus, the explanation of normative predictive mechanisms should refer not only to the structure of the generative model and its parameters, but also to the already structured environment as a natural constraint for the agent. Hence, the mere requirement to maximize the model’s evidence cannot constitute a sufficient justification for such and such actions or behaviors of the agent. What does constitute such justification is the existence of certain specific properties in the environment.

8 Conclusion

The choice of a given action depends on the generated prediction. The prediction selection process is based on (1) Bayesian abductive inference; (2) a policy adopted by the agent (subordinated to the requirement to minimize prediction errors, resp. VFE or EFE); and (3) internally (high-level beliefs and priors of the generative model) and externally (specific environmental constraints) regulated motivation. This means that action-normalizing predictions are selected on the basis of a certain potential implemented in the generative model of the Bayesian network and of specific relations with the environment. In practice, this means that the agent chooses among several counterfactual hypotheses of the form “what will happen if I do this and that” (cf. Seth, 2015). Modeling probabilistic action scenarios allows for planning actions and for the long-term minimization of prediction errors (Pezzulo et al., 2015, p. 24; cf. also Clark, 2019, p. 10). However, what distinguishes the approach proposed here from others in the literature is the emphasis on the explanatory role of environmental factors. An action is effective if the prediction generated by the model takes into account such and such states of the environment. This means that a precise, highly weighted counterfactual prediction must correspond to certain factual or counterfactual properties of the world.Footnote 42 In this way, high-level, precision-weighted predictions determine how agents act in the world. It must therefore be said, as I have already emphasized several times, that the actions taken by the agent are regulated not only by such and such predictions, but also by specific states or properties of the environment, which are understood here as constraints for the mechanism. As a result, predictions not only guide actions, but also shape causal transitions between states that have specific content and satisfaction conditions (e.g. mental states).
The position defended here should therefore be described as “externalist”, by which I mean that the exemplification of certain mental states is conditioned both by the parameters of the generative model (e.g. the specific excitation of neural populations in the case of low-level states) and by the environmental states and socio-cultural situatedness of agents.Footnote 43 Thus, it is the normative relation of predictions to the environment that determines or specifies the content of certain states. Why normative? As I have shown, this relation, which is constitutive of the agent’s interaction with the environment, cannot be reduced to the structure of the model, specific learning algorithms, or the very requirements of minimizing prediction errors or maximizing the coherence of the model. Additionally, it makes it possible to explain the significance of cognitive errors (e.g. representational error) from the organism’s own perspective, and not, as in the case of e.g. teleosemantics, from the perspective of an external observer. This means that a possible error or misrepresentation has a normative significance for the organism, i.e. it affects the selection and guiding of actions, and is not merely a potential effect of specific causal processes.

The analyses carried out here strengthen the thesis about the normativity of predictive mechanisms. The motivational relation between a selected prediction and an action or sequence of actions taken because of that prediction can become fixed over time. Namely: the statistical effectiveness of actions taken under such and such conditions may lead to the emergence of a specific pattern of behavior. The point is that if, in certain circumstances, a certain prediction that normalizes a certain action or sequence of actions leads to the goals intended by the agent, the agent may learn a certain pattern of action. This pattern can be called a “pattern of behavior” in the sense that it constitutes a matrix that determines how the agent should behave in such and such environmental conditions.Footnote 44 Thus, statistical effectiveness can lead to the emergence of certain rules of action, which can be reduced to the following form:

If, under environmental conditions X, the optimal action is S, then if agent A wants to act optimally, A should perform S in X.Footnote 45
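Schematically, the rule can be rendered as a conditional obligation (this formalization, including the predicate names, is my rough gloss rather than part of the rule’s original statement):

```latex
\forall A\,\forall X\;\bigl[\,\mathrm{Optimal}(S,X)\rightarrow\bigl(\mathrm{WantsOptimal}(A)\rightarrow\mathrm{Ought}(A,S,X)\bigr)\bigr]
```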

According to the approach to normative mechanisms defended here, action S is explained by indicating its constitutive cause at the level of the mechanism, i.e. a specific prediction generated by the predictive mechanism. The justification of such an action is the rule given above. Such justification can take the following form: someone performed S because, under conditions X, such an action was optimal. In this sense, indicating a rule is tantamount to giving the reason for such and such an action of the agent.