1 Introduction

Predictive processing (PP) is an explanatory framework developed within cognitive neuroscience to explain cognition, perception, and action under a single unifying principle: prediction error minimisation (PEM). The idea is that, by learning statistical regularities from its neural activations, the brain forms an inner generative model from which it generates predictions of incoming sensory information via approximate Bayesian inference. If the predictions match the input, the latter is said to be ‘explained away’. Predictions are not always accurate, however, and prediction error is thus to be expected. When there is prediction error, it is used to update the generative model to generate more accurate predictions in the future, minimising prediction error.

Usually, within the PP framework, it is argued that perceptual experience arises from the brain’s predictions. It has been claimed that PP “explains not just that we perceive but how we perceive: this idea [i.e., PEM] applies directly to key aspects of the phenomenology of perception. Moreover, it is only this idea that is needed to explain these aspects of perception” (Hohwy, 2013, p. 1). One such key aspect is the temporal structure and flow of perceptual experience. It is not simply that perceptual experiences have an objective duration, but that they also involve the experience of duration. We perceive objects and events as enduring in time. In this sense, there is an intrinsic subjective temporality to perception.

The temporal structure of perception has not gone unnoticed by PP theorists. Recently, two PP approaches have been proposed to explain the subjective temporality of perceptual experience (Hohwy et al., 2016; Wiese, 2017). Both draw from Rick Grush’s computational model of how a cognitive system might represent time, integrating it within the PP framework. Grush’s approach, called the Trajectory Estimation Model (TEM), is meant to explain at a computational level how we explicitly represent time by internally modelling the trajectories of temporally extended processes in the world (Grush, 2005). PP approaches extend TEM by subsuming it under the PEM mechanism.

Grush’s TEM, as well as the PP approaches that expand upon it, takes Edmund Husserl’s phenomenology of time-consciousness as a guiding thread. As Grush claims, the thought is that there is “something about the mechanisms [of] neural information processing that explains why our phenomenal experience explains those features of phenomenology revealed by Husserl’s analysis” (2006, p. 441). Consequently, what TEM and PP posit at a computational (and sub-personal) level aims to explain what Husserl describes at a phenomenological (hence, personal) level (cf. Grush, 2006, pp. 447–448).

The objective of this paper is to evaluate PP approaches to time-consciousness from a Husserlian perspective. More specifically, I assess whether such PP approaches are consistent with Husserl’s analyses.Footnote 1 To do so, I present some core aspects of Husserl’s phenomenological analysis of time-consciousness, introducing what I call the ‘Kantian-Brentanian’ approach to time-consciousness as a foil to the Husserlian one (Sect. 2). As I argue, Husserl’s approach is a direct response to, as well as an improvement on, the Kantian-Brentanian one. I then provide an overview of the PP framework and of the existing PP approaches to time-consciousness (Sect. 3). Finally, in Sect. 4, I argue that, given their representational commitments and the role that imagination plays in their framework, current PP views are consistent with the Kantian-Brentanian approach to the phenomenology of time-consciousness rather than with the Husserlian one, deepening the connection between PP and Kantian philosophy (see Anderson & Chemero, 2013; Swanson, 2016; Zahavi, 2018). I conclude that, from the Husserlian perspective which PP theorists are arguably drawing from, their approaches fail to account for time-consciousness in a satisfying way.

2 The phenomenology of time-consciousness

In classical phenomenology, Husserl provided what is arguably the most influential account of the experience of time and the temporal structure of perception. Consequently, there have been several attempts to model Husserl’s account by connecting it with empirical research in cognitive neuroscience (see, e.g., van Gelder, 1999; Varela, 1999; Lloyd, 2002; Grush, 2006). Recently, PP theorists have followed suit and have tried to address Husserl’s account of time-consciousness via their Bayesian framework (see Hohwy et al., 2016; Wiese, 2017). The purpose of this section is to present the Husserlian analysis of time-consciousness. In addition to the Husserlian approach, I also present what I call the ‘Kantian-Brentanian’ approach to time-consciousness. There are two reasons why I do so. First, the Husserlian approach is a direct response to Brentano’s (and implicitly to Kant’s) views. In his lectures on inner time-consciousness, Husserl (1991, §§ 3–6) develops his phenomenological approach as an improvement on Brentano’s theory. Second, as we shall see (Sect. 4), despite their aim to address the Husserlian analysis from a computational perspective, PP approaches are closer to the Kantian-Brentanian one.Footnote 2

Consider the experience of listening to a melody. As I listen to it, it appears as a set of tones unfolding one after the other; what constitutes the melody as an identifiable unity is a set of “parts” that do not occur simultaneously. In terms of the contents of experience, the constituent parts of the melody are a set of notes (e.g., C-D-E) that are given one after the other. Phenomenologically speaking, the experience of a given now-phase is linked to the experience of the just-past. It is not that when I hear D at \(t\), I am still experiencing C in the same way I experienced it at \(t-1\).Footnote 3 If that were the case, I would not experience D after C, but rather I would experience both tones as simultaneous, which is obviously not the case. What about E? If E came after D, then E would simply be experienced as happening after D. Suppose that, instead, the melody ended abruptly. I would experience the abrupt end as surprising. Such a surprise indicates that at \(t\) I was aware of D as happening, of C as just-past, and anticipated what was about-to-occur. In sum, at any given now-phase of a perceptual experience there is: (1) a perception of what is appearing in that same now-phase; (2) a sense of the just-past; and (3) an anticipation of the about-to-occur.

I now turn to present two possible ways of accounting for the tripartite structure of time-consciousness, namely, the Kantian-Brentanian account, which relies on the representational power of the imagination; and the Husserlian account, which conceives of the tripartite structure of time-consciousness as not involving any kind of representation.

2.1 The Kantian-Brentanian approach

According to Kant, the imagination is “the faculty for representing an object even without its presence in intuition” (Kant, 1787/1998, p. B151). Linked to this definition, Kant introduces a threefold synthesis that functions as the transcendental condition of all forms of cognition. The mid-point of this threefold synthesis is what Kant calls ‘the synthesis of reproduction’, which is itself a synthesis of the imagination (Kant, 1787/1998, p. A101). Following Heidegger’s (1997, § 33) interpretation, one’s sense of the just-past is a result of the synthesis of reproduction.

According to Kant:

if I draw a line in thought, or think of the time from one noon to the next, […] I must necessarily first grasp one of these manifold representations after another in my thoughts. But if I were always to lose the preceding representations […] from my thoughts and not reproduce them when I proceed to the following ones, then no whole representation and none of the previously mentioned thoughts […] could ever arise. (Kant, 1787/1998, p. A102)

Considering that this kind of synthesis is done by the imagination, it becomes clear that, by imaginatively reproducing (i.e., representing) past apprehended intuitions, the mind can have a grasp of (or produce) the past itself.

Kant’s view is consistent with Brentano’s take on temporality.Footnote 4 According to Brentano, phantasy (i.e., imagination) produces the temporal determinations that give representations their temporality (Husserl, 1991, p. 12; see also Fréchette, 2017, p. 80).Footnote 5 Regarding the past, Brentano argues that after the apprehension of sensory content, phantasy produces a new representation with the temporal determination ‘past’ which is ‘originally associated’ with the original representation. Additionally, phantasy also forms a representational expectation of the future insofar as it has a grasp of the past (Husserl, 1991, § 4). Doing so, the whole temporal horizon that includes past, present, and future is constituted by the imagination.

What defines the Kantian-Brentanian approach to time-consciousness is the idea that the past and future horizons of a given now-phase are constituted via representations of the imagination. In the example of the melody, when hearing D at \(t\), the imagination arguably represents C as something that just happened and E as something that is about to occur. These three are somehow ‘originally associated’, constituting a unified experience of the flow from C to D to E.

Note that, in this context, representations must be understood at the personal level (as opposed to the sub-personal level). Accordingly, I will refer to representations of this kind as p-representations, in contrast to s-representations, which are sub-personal (see Sect. 3.2).

2.2 The Husserlian approach

Husserl begins his 1905 lectures on the phenomenology of inner time-consciousness by addressing Brentano’s theory (Husserl, 1991, §§ 3–6), introducing his own phenomenological approach as an improvement on it. Note that Husserl’s arguments not only apply to Brentano’s theory, but to the Kantian-Brentanian approach in general.

Before presenting Husserl’s arguments, I must acknowledge that it would be a mistake to attribute a single unified analysis of time-consciousness to him.Footnote 6 According to Lanei Rodemeyer, “Husserl’s own position on the structure of inner time-consciousness shifted significantly at least once” (2022, p. 184), namely, when he introduced the idea of ‘absolute consciousness’ circa 1908–1909. After that, some of the shifts in his views are better understood as attempts at deepening previous analyses. A good example of such deepening is found in his analyses of passive syntheses, where, in contrast to his earlier focus on the form of time-consciousness, he addresses its contents, linking temporality with association and affection (see Husserl, 2001a). Here, I will nevertheless focus on the formal structure of time-consciousness, as it is the main concern of PP theorists when approaching Husserl’s analysis.Footnote 7 It is worth noting that this structure remained virtually unchanged across Husserl’s works (see Husserl, 1991, §§ 10–11; 2001b, Nr. 1; 2006, Nr. 3), and that it is also one of the main foci of most commentators (see, e.g., Zahavi, 2003, pp. 81–86; Gallagher, 2017, pp. 93; Rodemeyer, 2022).

Husserl’s argument against Brentano’s theory can be summed up in one quote: “[I]n his theory of the intuition of time Brentano does not take into consideration at all the difference between the perception of time and the phantasy of time” (Husserl, 1991, p. 17). If our perception of both the just-past and the about-to-occur were a result of a p-representational act like imagining something, there would be no difference between perceiving and, say, imagining a temporally extended object (e.g., a melody). If the sense I have of C and D while hearing E were a p-representational product of the imagination, then there would be no difference between perceiving the flow from C to D, and imagining the perception of the flow from C to D.

Furthermore, if time-consciousness were a product of the imagination, then it would be impossible to account for the fact that other reproductive experiences (e.g., recollection) are also structured as involving a sense of both the just-past and the about-to-occur. Consider a memory of a past event. From the Kantian-Brentanian perspective, the just-past within a given memory would be a product of the p-representational function of the imagination. In other words, the just-past would be the p-representation of the past. However, memory itself is a p-representation of the past. Therefore, we would end up with a p-representation of the past (i.e., the memory) of a p-representation of the past (i.e., the just-past in the memory); but this would be absurd, because a p-representation of the past p-representing the past is no different from just a p-representation of the past. For a memory to have temporal structure, it cannot be a p-representation of p-representations, but rather a representation temporally structured by something entirely different: namely, by presentations of the past and the future.Footnote 8

Additionally, Husserl claims that “Brentano does not distinguish between act and content […]. Yet we must make up our minds about which of these accounts it is to which the temporal moment should be charged” (Husserl, 1991, p. 17). It is somewhat unfair to say that Brentano’s theory does not distinguish between act and content. It seems plausible to distinguish between, say, the imaginative act that p-represents tone C as just-past, and tone C as its content. From this perspective, the temporal mode of givenness arises from the encounter between act and content. On the one hand, the act alone cannot account for the temporal mode of ‘just-past’ (or ‘about-to-occur’) since it happens in the present. On the other hand, the content alone does not account for any temporal mode since the same content (e.g., C) can be given as present, just-past, or about-to-occur. It is when an imaginative act p-represents a given content that the temporal mode of givenness arises: the content is p-represented as, e.g., ‘just-past’. From this point it follows that, as time goes by, there must be new p-representations accounting for the sinking into the past of a given content. More concretely, at \(t\) I hear C, at \(t+1\) I hear D and p-represent C as ‘just-past’, at \(t+2\) I hear E and p-represent D as ‘just-past’ and C as ‘just-just-past’, and so on. In other words, the different temporal aspects are construed via distinct p-representations. However, we experience time as continuously flowing. If the Kantian-Brentanian approach relies on discrete p-representations to account for our sense of time, it is unclear how it could go from such a discrete structure to a continuous flow. It is regarding this point and the earlier one (i.e., the just-past and the about-to-occur cannot be p-represented) that the Husserlian approach improves on the Kantian-Brentanian one.

In his analysis, Husserl distinguishes between what is given (i.e., content) and how it is given via an intentional act. According to him, we are aware of something as just-past via retention, of something as now via primal impression, and of what is about-to-occur via protention (Husserl, 2001a, pp. 610–612). Retention, primal impression, and protention are the tripartite structure of what Husserl (2006, Nr. 3) calls ‘the living present’. The idea is that we are simultaneously aware of the just-past, the now-phase, and the about-to-occur because retention, primal impression, and protention are simultaneous intentions. Therefore, at any given moment, we are aware of a specious or living present that comprises the now-phase and its immediate temporal horizon (Husserl, 2001b, Nr. 1 § 4).

There are at least two important differences between the Husserlian and the Kantian-Brentanian approaches. First, Husserl explicitly characterises both retention and protention as presentational intentions.Footnote 9 When analysing retention and protention, Husserl distinguishes them from recollection and expectation respectively (Husserl, 1991, §§ 14, 40; Husserl, 2001a, p. 138). I can always p-represent in recollection what just happened, and in expectation what I anticipate happening in the near future. However, these p-representational acts must be distinguished from retention and protention, which are implicit intentions directed toward the just-past and the about-to-occur. Where the Kantian-Brentanian approach introduces p-representational acts of the imagination, the Husserlian approach introduces presentational intentions. By doing so, Husserl avoids one of the problems of the Kantian-Brentanian approach.

At this point, given that retention and protention are presentational intentions, one might wonder about the difference between them and primal impression. Put simply, whereas what is intended in primal impression is given ‘in the flesh’, retention and protention refer respectively to what is no longer here, and what is not here yet. To make this distinction clearer, it is useful to delve into the dynamics of the living present (Fig. 1). Doing so, in turn, will disclose a second difference between the Husserlian and the Kantian-Brentanian approaches.

According to Husserl (1991, § 11), what is given in primal impression is retentionally modified continuously, which means that the contents given in primal impression are gradually adumbrated, continually carrying more of the heritage of the past. In turn, retention pre-figures protention. What is protended is somewhat (but not entirely) determined by what one has already experienced. Lastly, a protention may be either fulfilled or negated by a subsequent primal impression. The negation of a protention, in turn, alters previous retentions: when being surprised by the abrupt end of the melody, not only the protention of the continuation of the melody is negated, but also my retention of my previous empty expectations of the melody as not about to end abruptly. This last aspect accounts for the fact that, whenever I listen to the melody again, I am no longer surprised by its abrupt end.

Underlying the dynamics just described lies Husserl’s distinction between empty and intuitive intentions. Consider the difference between judging that a notebook is blue when seeing it and judging so when the notebook is nowhere to be seen (Zahavi, 2003, p. 28). In both cases, the act (i.e., judging) and its content (i.e., the blueness of the notebook) are identical. The difference lies in the mode of givenness of the object. In one case, the notebook is given intuitively (i.e., as bodily present); in the other, it is given emptily (i.e., as absent). The distinction between intuitive and empty modes of givenness is not categorical, but a matter of degree. As Zahavi clarifies, “[t]he object can be given more or less directly, that is, it can be more or less present” (2003, p. 28). This point is crucial in the context of time-consciousness. For Husserl, primal impression gives its objects intuitively. In contrast, retention and protention are closer to empty intentions. More precisely, on the one hand, retentional modification is a process in which what was initially given intuitively in primal impression continually loses its intuitive content until it eventually sinks into “an indeterminate, undifferentiated, completely obscure past” (Husserl, 2001a, p. 481), becoming an empty retention (Husserl, 1991, p. 27; see also Mishara, 1990). On the other hand, protention is an empty intention in the sense that it is not directed toward something in particular. Rather, the about-to-occur that is given in protention is better understood as a set of open possibilities that is constrained by what is pre-figured by previous intentions (Husserl, 2001a, §§ 2, 10). In contrast to retention and protention, p-representational intentions give their objects by reproducing other kinds of intentions, entailing a p-representation of their objects (Husserl, 2005, p. 372). For instance, recollection implies a reproduction of an earlier perceptual experience, effectively p-representing its object in the mode of having-been-perceived. P-representational intentions give their objects indirectly, i.e., via the reproduction of other intentions.

Neither protention nor retention is to be taken as a p-representation, because neither gives its object indirectly. In contrast with expectation, protention does not give an object in particular. And in contrast with recollection, retention is the emptying of what was given in perceptual intuition, rather than a bringing to p-representational intuition of what is already emptily retained. Husserl invites us to think of a set of modes of consciousness which, without being p-representational, do not give their object in the flesh.

Given the distinction between intuitive and empty intentions, one may describe the dynamics of protentional fulfilment and retentional modification as a continuous process of (ful)filling and emptying (Fig. 1). What is emptily intended in protention is fulfilled with the intuitive content of a primal impression, which is itself emptied via retentional modification. These continuous dynamics are meant to account for the continuous temporal flow of experience, something the Kantian-Brentanian approach struggles with.

Fig. 1 The dynamics of the tripartite structure of the living present. R stands for ‘retention’, I for ‘primal impression’, and P for ‘protention’. Even though Husserl conceptually distinguishes three distinct intentional acts as constitutive of our awareness of the living present, these distinctions must be seen as conceptual. As his analyses of the dynamics of time-consciousness reveal, retention, primal impression, and protention are related via a continuous process of fulfilment and emptying of intentions.

3 Orthodox predictive processing and time-consciousness

I now turn to the predictive processing (PP) framework. For some theorists, the prediction error minimisation (PEM) mechanism found at the core of PP can explain fundamental aspects of the phenomenology of perception (e.g., Hohwy, 2013; Clark, 2016). In this section, I introduce the PP framework and its orthodox interpretation. I also present how it has been argued that PEM can explain the temporal structure of perceptual experience as it is described in Husserlian phenomenology.

3.1 The predictive processing framework

According to the PP framework, brain functioning can be understood under a single computational principle: PEM. In contrast to standard views of brain functioning, which conceive of it as a bottom-up process (see, e.g., Gazzaniga et al., 2019, pp. 190–200), PP emphasises top-down processing without ignoring bottom-up processes. In the words of Andy Clark, “[h]ierarchical predictive processing combines the use […] of ‘top-down’ probabilistic generative models with the core predictive coding strategy of efficient encoding and transmission” (2013, p. 183). The framework claims that, by encoding learnt statistical regularities of its sensory inputs, the brain forms an inner generative model of the world from which it generates predictions (statistical estimations) of future inputs. Advocates of PP argue that, by employing its models and predictions, the brain can infer the probable causes of its incoming sensory signals (Friston, 2005). Heuristically, the hypothesis is that, since the brain has access to its predictions, models, and sensory input, it can invert the question “What can be predicted given the model and current available sensory information?” to the question “What should be encoded in the model to generate accurate predictions given the current available sensory information?”. It is by this kind of inversion that the brain gains access to what might have generated current sensory data.

Within PP, generative models are hierarchically structured (Hinton, 2007). Because of the complexity of the statistical regularities of the world, the model should encode multiple regularities at different temporal and spatial scales organized in a multi-layered and hierarchically-structured manner. In the case of temporal regularities, the highest layers of the model encode the slowest, least detailed and most abstract regularities; lower layers encode the fastest and most detailed regularities one could encounter in the world. For instance, when listening to a music album from start to end, regularities that correspond to the very fast changes that occur from one tone to the next in a given song would be encoded in one of the lowest levels of the generative model, and regularities that correspond to the slower changes that occur from one song to the next would be encoded in a higher layer of the model. As I explain below, the hierarchical structure of the generative model entails that each prediction is always contextualized by other, more abstract (i.e., far-reaching) predictions.

Since it is by means of its hierarchical generative model and its predictions that the brain infers the state of affairs in the world, its predictions should be as accurate as possible. Prediction error should be minimised. Hence, it is argued that brain functioning naturally tends toward PEM.

Note that, within PP, brain processing is understood in probabilistic terms. Heuristically, the problem that is being solved by the brain is the following: “Given both prior (learnt regularities) and current evidence (sensory information), how probable is it that this information was caused by this given object?”. From this perspective, PEM is just a matter of approximately applying Bayes’ theorem, maximizing the evidence for a given model.Footnote 10 In Bayesian terms, each level of the predictive hierarchy generates predictions that function as priors for the levels below (this is known as ‘empirical Bayes’, see Friston, 2005); it may thus be the case that what is received as prediction error at a lower level of the predictive hierarchy may be explained away in a higher level.
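For reference, Bayes’ theorem in its standard form can be written as follows. The symbols are chosen here for illustration only: \(h\) stands for a hypothesis about the cause of the sensory input and \(e\) for the current sensory evidence.

\[
p(h \mid e) = \frac{p(e \mid h)\, p(h)}{p(e)}
\]

On this reading, \(p(h)\) plays the role of the prior (learnt regularities), \(p(e \mid h)\) is the likelihood of the evidence under the hypothesis, and \(p(e)\) is the evidence for the model as a whole. This is only the textbook identity; PP accounts appeal to approximations of it rather than to its exact computation.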

Crucially, whenever prediction error is not explained away, and thus propagates up the hierarchy, the brain is said to update its generative model, accommodating it to the received prediction error, effectively minimising it. This bottom-up process is known as ‘perceptual inference’ because, via the updating of the generative model, the synaptic activity of the brain is optimized to better encode the states of the environment, entailing an optimization of perception itself (Friston, 2009, p. 295).Footnote 11
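The idea of error-driven updating can be illustrated with a deliberately minimal Python sketch. It is not a model taken from the PP literature: a single scalar estimate stands in for the hierarchical generative model, and the function name, the learning rate, and the constant input are all assumptions made for illustration.

```python
def minimise_prediction_error(observations, estimate=0.0, learning_rate=0.1):
    """Toy illustration: nudge one scalar estimate toward each observation.

    All names and parameter values are hypothetical; a real PP model would be
    hierarchical and probabilistic rather than a single number.
    """
    errors = []
    for observed in observations:
        error = observed - estimate        # prediction error for this input
        estimate += learning_rate * error  # update the estimate to reduce future error
        errors.append(abs(error))
    return estimate, errors


# A constant input of 1.0, presented 50 times (illustrative data).
estimate, errors = minimise_prediction_error([1.0] * 50)
```

With a constant input, the absolute error shrinks at every step, which is the sense in which updating the model minimises prediction error over time.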

Sometimes prediction error is unreliable. The world we live in is often ambiguous and the sensory information we receive from it may be full of noise. Consider the difference between a human-like shaped thing in a train and the same thing lying in the display window of a clothing store. Given the context, we know that the former ‘thing’ would probably be a person, whereas the latter would probably be a mannequin. However, the sensory information the brain may receive in these two cases would be quite similar. According to the PP framework, in cases like the one just described, the brain turns to precision weighting. The thought is that, given information encoded in higher layers of the generative model, the brain estimates how reliable current sensory information is. If required, it weighs down the precision of incoming prediction error, entailing more reliance on its top-down predictions.
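Precision weighting can be sketched, under simplifying assumptions, as the standard combination of two Gaussian sources of information, in which each source is weighted by its precision (inverse variance). The Gaussian assumption, the function name, and the numerical values below are illustrative and are not drawn from the works cited here.

```python
def precision_weighted_estimate(prior_mean, prior_precision,
                                sensory_value, sensory_precision):
    """Combine a top-down prediction with a sensory sample.

    Each source is weighted by its precision (inverse variance), as in the
    standard Gaussian case. This is a hypothetical helper for illustration.
    """
    total_precision = prior_precision + sensory_precision
    return (prior_precision * prior_mean
            + sensory_precision * sensory_value) / total_precision


# Same prediction (0.0) and same sensory sample (10.0) in both cases;
# only the estimated reliability of the sensory signal differs.
reliable = precision_weighted_estimate(0.0, 1.0, 10.0, 9.0)  # sensory input trusted
noisy = precision_weighted_estimate(0.0, 1.0, 10.0, 0.1)     # sensory input down-weighted
```

When the sensory precision is low, the resulting estimate stays close to the top-down prediction, which mirrors the claim that down-weighting prediction error entails more reliance on predictions.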

3.2 Orthodox predictive processing

The reference to ‘perception’ in perceptual inference suggests that PP has something to say about perceptual experience as distinct from (but related to) mere brain functioning. At this point I can introduce what I call ‘orthodox PP.’ In a nutshell, PP by itself is nothing but a computational-neuroscientific approach to brain functioning and, thus, it relies on the modelling of neural activity.

Importantly, however, there is a difference between saying that you can model the brain as encoding a generative model of the world, and that the brain generatively models the world. The former idea is, in a sense, metaphysically neutral for it does not necessarily entail representational content. The latter, in contrast, immediately suggests that the brain literally has a model (i.e., a representation) of the world. Such a representation is sub-personal—it is concerned with what the brain does, not what the whole person does. Accordingly, I call it s-representation (as distinct from p-representations, see Sect. 2.1 above).

The representational and realist interpretation of PP is part of what I call orthodox PP.Footnote 12 It is ‘orthodox’ because, by interpreting the PP models as referring to something that literally happens in the brain, orthodox PP theorists commit themselves to an orthodox view of cognition as representational.

To be sure, ‘orthodox PP’ is not a homogeneous category. A theorist can be to a lesser or greater extent orthodox. Take, for instance, the difference between Hohwy’s (2016) and Clark’s (2017) PP approaches. Both are representational, and hence orthodox, but it cannot be denied that Clark’s views on cognition are more embodied and less detached from the world than Hohwy’s. However, as both are committed to a representational view of cognition under PP, they are representatives of orthodox philosophical takes on PP.

One of the consequences of an orthodox PP view is conceiving of perception as involving p-representational content. The PP account of perceptual experience is usually introduced by appealing to the binocular rivalry phenomenon (e.g., Hohwy, 2013, Chap. 1). Binocular rivalry happens when a subject receives very different visual input through each of her eyes, such as a picture of a house through one eye and a picture of a face through the other. Instead of having a perceptual experience of a mishmash of the house and the face, the subject experiences a bi-stable switching: for a few seconds she sees only the house, then only the face, then only the house, and so on. This is explained within the PP framework by arguing that the brain cannot settle on the cause of the received visual stimuli. Because the brain has previously learnt that faces and houses are not found at the same spatial scale, it does not generate a prediction that would explain away the conflicting visual inputs by taking them to be caused by a house and a face located at the same spatial scale. Thus, what the brain does sub-personally is settle on one of the predictions (e.g., the house), making the subject visually experience a house at the personal level. After sufficient prediction error has been accumulated from the visual input coming from the image of the face, the sub-personal prediction is changed and, with it, what is being perceived at the personal level.

Under this view, perceptual experience arises from the PEM mechanism. It is not that one perceives what the brain initially predicts, but rather that the later disambiguation of sensory information by means of perceptual inference construes perceptual content. From this perspective, we do not perceive the world directly, but as it is s-represented within the brain’s generative model. As binocular rivalry is taken to suggest, we perceive what the brain internally predicts the world is like. Perception is thus conceived of as p-representational. That is why, under orthodox PP, perception is conceptualized as a controlled hallucination (Hohwy, 2013; Seth, 2021). It is as if the brain hallucinated the world by generatively modelling it, where the hallucination is controlled by incoming prediction error.

To be more precise, the ‘perception is controlled hallucination’ motto is not entirely correct within the orthodox PP framework. As noted by Jones and Wilkinson, it is better to say that “perception is constrained imagination” (2020, p. 99). Under a prominent take on the imagination within the PP literature, we can explain mental imagery by referring to the precision weighting mechanism of the predictive mind. The idea is that, since the brain can weigh down the precision of a given set of sensory stimuli, thus completely relying on its endogenously generated top-down predictions, such a predictive system must be able to generate p-representations that are completely detached from the world. Those would be the p-representations of the imagination. If such a view is right, then there is a strong relationship between imagination and perception, for an s-representation

at layer N + 1 [of the generative model] becomes capable […] of generating the sensory data (i.e., the input as it would there be represented) at layer N (the layer below) for itself. Since this story applies all the way down to layers that are attempting to predict activity in early processing areas, that means that such systems are fully capable of generating ‘virtual’ versions of the sensory data for themselves. […] Perception […] is co-emergent with something functionally akin to imagination. (Clark, 2016, p. 94)

To what extent is it correct to say that perception is ‘co-emergent’ with imagination? From one perspective, the one Clark focuses on, perception seems to imply imagination and vice versa. Perception is imagination sub-personally corrected by using sensory input as a constraint, and imagination is uncorrected perception using the precision weighting sub-personal mechanism. From another perspective, however, imagination can be seen as more basic than perception. Even if both imply each other, the building blocks of experience are the top-down s-representations endogenously generated by the generative model. Sensory information only has the function of sometimes constraining and correcting such s-representations. Even without such a constraint, the brain would still generate experiential content, namely that of the imagination. Insofar as orthodox PP is taken to claim that the brain is fundamentally a prediction machine (Clark, 2013), it seems correct to say that it implicitly suggests that the brain is fundamentally an imagination-producing machine. Without the brain’s predictions, it is unclear what role sensory information would have in the PP framework, since it is conceptualized as a constraint on prediction. In contrast, without sensory information, the brain would arguably still be fully capable of p-representing an experiential world, albeit an imaginary one.Footnote 13 The basis for the phenomenology of perception is, thus, the imagination (i.e., a p-representational form of experience), which in turn is based on predictive s-representations. Hence, this is an orthodox PP framework.Footnote 14

In general, what I call orthodox PP can be related to Kantian (see Swanson, 2016) and neo-Kantian philosophy. As Zahavi (2018) rightly points out, many of the philosophical commitments of some of the current takes on PP (mainly those that I consider orthodox) are remarkably similar to those of the German neo-Kantians, who conceived of the relationship between mind and world as representational.Footnote 15 Such a view was inherited from Kant. Similarly, Anderson and Chemero (2013) label PP as a ‘neo-neo-Kantian’ view. Interestingly, as Zahavi points out, on the one hand some neo-Kantians eventually came to realise that their representationalism was untenable and, on the other hand, Husserl’s transcendental idealism was partly motivated as a response that tried to overcome such representationalism. Such a philosophical heritage already suggests that, as I argue later, orthodox PP may have a hard time trying to explain the phenomenology of time-consciousness as Husserl analysed it. Before showing why, I now turn to introduce the orthodox PP account of time-consciousness.

3.3 The orthodox predictive processing account of time-consciousness

To date, there are two prominent PP approaches to time-consciousness: Hohwy et al.’s (2016) and Wiese’s (2017). I now turn to explain the orthodox PP (henceforth just ‘PP’, unless stated otherwise) account of time-consciousness.

Both Hohwy et al.’s (2016) and Wiese’s (2017) approaches borrow from Grush’s Trajectory Estimation Model (TEM), which is a computational model that is meant to serve as a bridge between the phenomenological account given by Husserl and cognitive neuroscience (Grush, 2006).Footnote 16

TEM builds on Grush’s (2004) emulation theory of representation, which states that the brain forms a priori estimates of the current states of processes in the world. Such estimates are compared with the incoming sensory information, and the discrepancy between them is weighted and filtered by a Kalman filter (i.e., a recursive estimator that infers the state of a process from noisy measurements) in a Bayesian fashion. From this filtering, a new kind of estimate (which Grush calls ‘a posteriori estimate’) is formed. This kind of estimate serves as an updated s-representation of a state in the world.
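
To make the two kinds of estimate concrete, the following minimal sketch runs one cycle of a scalar Kalman filter. It is an illustration under stated assumptions rather than Grush’s own formulation: the random-walk dynamics, the Gaussian noise, and all names (`Estimate`, `a_priori`, `a_posteriori`, `process_var`, `sensor_var`) are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    mean: float  # best guess about the state of the process
    var: float   # uncertainty attached to that guess


def a_priori(previous: Estimate, process_var: float) -> Estimate:
    """Estimate formed before the sensory input arrives.

    Assumes random-walk dynamics: the state is expected to stay where it
    was, while uncertainty grows by the (hypothetical) process variance.
    """
    return Estimate(previous.mean, previous.var + process_var)


def a_posteriori(prior: Estimate, observation: float, sensor_var: float) -> Estimate:
    """Estimate formed after comparing the a priori estimate with the input."""
    gain = prior.var / (prior.var + sensor_var)  # Kalman gain
    discrepancy = observation - prior.mean       # sensory residual
    return Estimate(prior.mean + gain * discrepancy, (1.0 - gain) * prior.var)


prior = a_priori(Estimate(mean=0.0, var=1.0), process_var=0.0)
posterior = a_posteriori(prior, observation=2.0, sensor_var=1.0)
```

With equal prior and sensor variances the gain is 0.5, so the a posteriori estimate lands halfway between the a priori estimate and the observation.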

TEM expands upon the emulation theory by noting that, at any instant \(t\), the brain is said to have three different kinds of temporal estimates, namely, smoothed, filtered, and predictive estimates. A smoothed estimate is a representation of a given state at \(t-1\), a filtered estimate is a representation of a given state at \(t\), and a predictive estimate is a representation of a given state at \(t+1\). Both filtered and smoothed estimates are a posteriori, whereas predictive estimates are a priori. Additionally, smoothed and predictive estimates can be iterated to represent states at \(t-n\) and \(t+n\) respectively.
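
The three estimates can be sketched for a scalar process as follows. This is only an illustration: the linear dynamics \(x_{t+1} = a\,x_t + \text{noise}\), the one-step Rauch-Tung-Striebel backward pass used to obtain the smoothed estimate, and the names `trajectory_estimates`, `a`, `q`, and `r` are assumptions introduced here, not details taken from TEM.

```python
def trajectory_estimates(m_prev, p_prev, y_t, a=1.0, q=1.0, r=1.0):
    """Return the means of (smoothed x_{t-1}, filtered x_t, predictive x_{t+1}).

    m_prev, p_prev: filtered mean and variance for x_{t-1}
    y_t:            observation received at t
    a, q, r:        assumed dynamics coefficient, process and sensor variances
    """
    # A priori estimate of x_t, formed before y_t is taken into account.
    m_pred = a * m_prev
    p_pred = a * a * p_prev + q

    # Filtered estimate of x_t (a posteriori: corrected by y_t).
    gain = p_pred / (p_pred + r)
    m_filt = m_pred + gain * (y_t - m_pred)

    # Smoothed estimate of x_{t-1} (a posteriori: revised in light of y_t).
    back_gain = p_prev * a / p_pred
    m_smooth = m_prev + back_gain * (m_filt - m_pred)

    # Predictive estimate of x_{t+1} (a priori: no input from t+1 yet).
    m_next = a * m_filt

    return m_smooth, m_filt, m_next


smoothed, filtered, predictive = trajectory_estimates(m_prev=0.0, p_prev=1.0, y_t=2.0)
```

In this sketch the observation at \(t\) revises the estimate for \(t-1\) as well as the estimate for \(t\), which is why the smoothed estimate counts as a posteriori, whereas the predictive estimate is extrapolated from the dynamics alone.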

The idea is that, at \(t\), the brain s-represents the trajectory of a process that unfolds from \(t-n\) to \(t+n\), where a 200 ms temporal window that goes from \(t-1\) to \(t+1\) is said to correspond to the living present (Grush, 2006). These three estimates are meant to explain at a computational level what is understood at the phenomenological level as retention, primal impression, and protention. By s-representing the trajectory of worldly processes, the brain effectively represents time, giving rise to the experience of the living present.

Hohwy et al. (2016) identify a problem with TEM:

For binocular rivalry, there is no change in the actual state of affairs in the world, since the stimuli to both eyes are held constant. And yet there is an internal sense of [temporal] flow in as much as the two stimuli are perceived to occur in succession. In rivalry, that is, it seems the extended moving window can move along due wholly to internal processing. […] [In this case it does not] seem sufficient or necessary to appeal to the notion of mirroring of flow in environmental causes, and yet there is experience of temporal flow. (p. 330)

Their point is that, if TEM’s account of time-consciousness relies on the s-representation of change (i.e., trajectories) and there is no change, there should not be a felt flow of time either. However, rivalry involves an experience of such a flow without there being change in the objects in the world.

Wiese (2017) identifies another limitation of TEM. Experientially speaking, when listening to a melody, we seem to be aware of more than the 200 ms long living present. Grush (2006, p. 447) distinguishes between conceptual and perceptual (p-)representations. According to him, music appreciation has to do with the former, which go well beyond the 200 ms temporal window: “[a conceptual representation] is a matter of interpreting present experience in terms of concepts of processes that span potentially large intervals” (Grush, 2006, p. 447). In contrast, a perceptual (p-)representation refers to what is given in the 200 ms long living present. The problem, Wiese (2017, pp. 6–7) suggests, is that, given Grush’s definitions, conceptual and perceptual (p-)representations are qualitatively different, even though in experience they seem to be seamlessly integrated. Accordingly, Wiese asks, “How are perceptual representations of sequences integrated with conceptual representations of sequences?” (2017, p. 7).

To solve TEM’s limitations, both Hohwy et al. and Wiese give a PP spin to it. Smoothed, filtered, and predictive estimates can be left as they are: they are still meant to account for retention, primal impression, and protention at a computational level. These estimates are conceived of as s-representations happening within the hierarchical generative model. The main idea is that a predictive system that functions in an empirical Bayes fashion will naturally tend to ‘distrust the present’ (Hohwy et al., 2016). Recall that predictions generated at the higher layers of the predictive hierarchy will function as priors for lower layers. Given that the world is constantly changing, a predictive system will constantly predict that its current lower-level predictions will not remain valid for long, since one of the most probable regularities of the world we live in is that there is constant change. Such a regularity will be encoded in the higher layers of the generative model and, therefore, will make the predictive system change its current predictions rapidly and constantly. So even if there is no actual change in the world, there will be a constant change in what is perceived. In the specific case of binocular rivalry, even if both the house and the face stay the same for an extended period of time, the agent’s brain will constantly ‘distrust’ what it ‘believes’ is causing its incoming sensory input in the present, constantly changing its current prediction from ‘face’ to ‘house’, to ‘face’, to ‘house’, etc.
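
The switching dynamics just described can be caricatured in a few lines. The sketch below is a deliberately crude toy, not Hohwy et al.’s model: the function `rivalry_percepts`, the parameters `decay` and `error_gain`, and the threshold rule for switching are all hypothetical simplifications chosen only to show how a standing expectation of change can produce alternation while the stimuli stay constant.

```python
def rivalry_percepts(steps=12, decay=0.7, error_gain=0.2):
    """Toy alternation between two hypotheses under constant stimuli."""
    percept, other = "house", "face"
    confidence = 1.0   # trust in the currently selected hypothesis
    unexplained = 0.0  # prediction error accumulated from the suppressed image
    history = []
    for _ in range(steps):
        history.append(percept)
        confidence *= decay        # 'distrust the present': trust erodes over time
        unexplained += error_gain  # the suppressed image keeps generating error
        if unexplained > confidence:
            # The rival hypothesis takes over and the bookkeeping restarts.
            percept, other = other, percept
            confidence, unexplained = 1.0, 0.0
    return history


history = rivalry_percepts()
```

Nothing in the loop represents a change in the stimuli; the alternation in `history` comes entirely from the internal bookkeeping, which is the point Hohwy et al. press against TEM.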

Additionally, within the hierarchical generative model, there is no sharp boundary between perceptual and conceptual representations (Wiese, 2017). Predictions responsible for forming conceptual p-representations would be located at the higher levels of the hierarchy, whereas predictions responsible for forming perceptual p-representations would be located at the lower levels. Given the hierarchical structure and functioning of the generative model, conceptual and perceptual p-representations are continuous, meaning that, within the PP framework, they would not be qualitatively different. In this way, the orthodox PP approach to time-consciousness avoids TEM’s limitations.

The integration of TEM in the PP framework is straightforward because both frameworks rely on the updating of current estimates/predictions via Bayesian inference. The only change would be that of replacing the Kalman filtering mechanism with PEM. Arguably, the computational process would have the same results.
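
The claim that the two mechanisms would yield the same results can be checked in the simplest possible case. The sketch below assumes a single scalar state with Gaussian prior and Gaussian sensory noise, and it models PEM as gradient descent on the sum of precision-weighted squared prediction errors; the function names, the step size `rate`, and the number of iterations are assumptions made for illustration, and nothing here shows that the equivalence carries over to hierarchical or non-linear models.

```python
def kalman_posterior_mean(prior_mean, prior_var, observation, sensor_var):
    """Closed-form Kalman update of the mean."""
    gain = prior_var / (prior_var + sensor_var)
    return prior_mean + gain * (observation - prior_mean)


def pem_posterior_mean(prior_mean, prior_var, observation, sensor_var,
                       rate=0.1, steps=200):
    """Iteratively reduce precision-weighted prediction error.

    Each step is gradient descent on
        (observation - mu)^2 / (2 * sensor_var) + (mu - prior_mean)^2 / (2 * prior_var).
    The step size must satisfy rate * (1/sensor_var + 1/prior_var) < 2 to converge.
    """
    mu = prior_mean
    for _ in range(steps):
        sensory_error = (observation - mu) / sensor_var  # bottom-up error
        prior_error = (mu - prior_mean) / prior_var      # deviation from the prior
        mu += rate * (sensory_error - prior_error)
    return mu


kalman = kalman_posterior_mean(0.0, 1.0, 2.0, 1.0)
pem = pem_posterior_mean(0.0, 1.0, 2.0, 1.0)
```

Under these assumptions both routes settle on the same posterior mean, with the Kalman update reaching it in one step and the error-minimising loop approaching it iteratively.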

In sum, Grush’s TEM can be easily integrated into PP so that the latter can account for time-consciousness. At least at the lowest level of the predictive hierarchy,Footnote 17 the brain should be able to form predictions concerning the just-past (retention, via smoothed estimates), the now-phase (primal impression, via filtered estimates), and the about-to-occur (protention, via predictive estimates).

Notice that, by following Grush (2006), both Hohwy et al. and Wiese take the Husserlian analysis of time-consciousness for granted. The structure of the living present as analysed by Husserl is what their models are meant to capture at a computational level. The question to ask now is how well these PP accounts fare concerning the Husserlian phenomenology of time-consciousness. As I argue in the next section, not very well.

4 The predictive mind cannot represent time

My view is that, when comparing the PP accounts of time-consciousness with Husserl’s phenomenological analysis presented earlier (Sect. 2), they turn out to be unsatisfactory. In a few words, the computational-level story told by PP entails a phenomenological-level story closer to the Kantian-Brentanian approach than to the Husserlian one. The problem with that is that, on the one hand, PP accounts of time-consciousness have resorted to Husserl’s analysis as an accurate description of the temporal phenomenology of perception, so if it turns out that their framework implies a Kantian-Brentanian approach on the personal level, they are missing their self-imposed mark (i.e., accounting for Husserlian time-consciousness). On the other hand, if the PP theorist decides to reject the Husserlian approach in favour of the Kantian-Brentanian one, she would still have to address the issues that the latter approach has.Footnote 18 In this section, I show why, as they currently stand, PP accounts of time-consciousness imply the Kantian-Brentanian approach rather than the Husserlian one.

Recall the difference between the Kantian-Brentanian and the Husserlian approaches to time-consciousness presented in Sect. 2.1. Whereas the former relies on the p-representational power of the imagination to constitute the past and future horizons of time-consciousness, the latter introduces presentational intentions to do so. As argued above, the Kantian-Brentanian approach has the problem that if retention and protention were p-representations, then it would be impossible to account for the phenomenological difference between them and other p-representational acts, such as recollection and expectation. Additionally, given that each intentional act of the imagination is conceived of as distinct from one another, entailing a discrete structure within time-consciousness, the Kantian-Brentanian approach may struggle with the experienced continuous flow of perceptual experience. Husserl overcomes both problems by conceiving of retention and protention as presentational intentions, as well as disclosing the continuous constitutive dynamics at play in the living present.

At a computational, sub-personal level, PP introduces s-representations that are meant to account for the rise of retention, primal impression, and protention at the experiential, personal level. Importantly, smoothed, filtered, and predictive estimates are functionally distinct from one another: each estimate is formed from different pieces of information (Grush, 2005, p. 212; Wiese, 2017, pp. 4–5, 11–12), implying that they are distinct s-representations. Given that these s-representations are meant to account for the rise of retention, primal impression, and protention at the experiential level, it seems safe to assume that PP implies that these personal-level intentions are discretely distinct from one another. From this perspective, one might wonder how PP can account for the continuous dynamics of the living present as disclosed by the Husserlian analysis. For instance, it is not entirely clear how retentional modification can be accounted for. At the experiential level, what is intuitively given in primal impression is retentionally modified continuously. In contrast, at the computational level, filtered and smoothed estimates seem to be discrete and thus distinct from each other. It is not that one estimate flows (so to speak) from one to the other. Rather, at \(t\) the brain is said to form an s-representation of \({x}_{t}\) (i.e., a filtered estimate of \(x\)), and at \(t+1\) it forms a new s-representation of \({x}_{t-1}\) (i.e., a smoothed estimate of \(x\)).Footnote 19 It is unclear how the formation of distinct discrete s-representations would give rise to a continuous flow of retentional modification at the experiential level. It seems more plausible to assume that each s-representational state generates a distinct p-representational state. As argued earlier (Sect. 3.2), PP conceives of perception as p-representational and construed on the basis of the brain’s sub-personal predictions and PEM processes. 
Consequently, a filtered estimate of \(x\) at \(t\) would form a p-representation of \(x\) at \(t\), which would be what we call primal impression. At \(t+1\), the smoothed estimate of \({x}_{t-1}\) would form a p-representation of \(x\) at \(t-1\) (i.e., a retention). Here, there would be no continuous retentional modification, but the formation of new discrete p-representations at each time-step, just as in the Kantian-Brentanian approach. In a few words, because of the discrete nature of its s-representations and the link between them and p-representations, it is unclear how PP can account for the continuous dynamics of the living present.

At this point, a PP theorist may point out the aforementioned idea according to which the brain naturally tends to ‘distrust the present’ (Sect. 3.3), which was meant to account for the felt flow of experience. Although such use of the PEM mechanism would effectively imply the constant change of predictions within the brain, arguably changing the current p-representations that make up the phenomenology of perception, the s-representations responsible for the rise of such p-representations would still be discrete, making it unclear how discrete sub-personal processes could give rise to a continuous sense of temporal flow in experience.

The above argument already suggests another issue with the PP takes on time-consciousness from a Husserlian perspective. Recall that, for PP, perception is controlled imagination (Sect. 3.2). The brain’s s-representations, regardless of whether they have already been updated in the light of prediction error or not, can give rise to p-representations of the imagination. It is when such s-representations are controlled by sensory information that we get perception, which is still construed as a p-representational state akin to imagination. In fact, perception is simply imagination plus the constraints of sensory information. In the context of time-consciousness, smoothed and predictive s-representations would give rise to retentions and protentions that themselves are p-representational states of the imagination. Indeed, a predictive s-representation responsible for forming a given protention could not have been constrained by sensory information yet: it is still an a priori s-representation. Such an s-representation would give rise to a p-representation of the future (i.e., a protention) which, given that it has not been constrained by prediction error, must be a p-representation of the imagination. Similarly, given that a filtered s-representation of \({x}_{t}\) formed at \(t\) would be distinct from a smoothed s-representation of \({x}_{t-1}\) formed at \(t+1\), and given that one cannot receive sensory information from the past, a smoothed s-representation would lack the constraint of incoming prediction error. It would thus follow that a retention formed from such a smoothed s-representation would also be a p-representation of the imagination.

Given the above argument, it is now clear that the sub-personal PP account of time-consciousness ends up entailing a Kantian-Brentanian approach at the experiential level. Indeed, at the latter level, both retentions and protentions are implicitly construed as p-representations of the imagination.

Considering PP’s commitment to representationalism and a discrete s-representation of time, it is hard to see how it could account for the presentational character of retention and protention and the continuous dynamics of the living present as analysed by Husserl. Even if PP theorists decide to subscribe to the Kantian-Brentanian approach instead of the Husserlian one (as they have done in the past), they would have to explain how to overcome its problems.

5 Conclusion

I have tried to show why I believe that orthodox PP is an inadequate framework for accounting for time-consciousness as it was analysed by Husserl. Specifically, if the claim is that the PEM mechanism is all that is needed to explain the phenomenology of perception (as Hohwy suggests it is), then time-consciousness should also be explained under that same mechanism. Here, I have distinguished between the Kantian-Brentanian and the Husserlian approaches to time-consciousness to show that the latter is a more plausible account of time-consciousness. I have also shown that the orthodox PP approach to time-consciousness implies the Kantian-Brentanian approach, despite the fact that it is explicitly trying to account for the Husserlian analysis.

To be fair, I believe that my arguments apply only to orthodox PP, which, as mentioned, inherits philosophical commitments from Kantian and neo-Kantian philosophy. This fact does not entail that PP cannot either settle for the Kantian-Brentanian approach to time-consciousness or move beyond its Kantian roots and closer to Husserlian phenomenology. However, if Husserl’s arguments against Brentano (and, more broadly, what I have called the Kantian-Brentanian approach) are accepted, then PP should probably take the second option and find a home more suited to approach the problem of how to explain time-consciousness. Such work, however, is yet to be done.