1 Introduction

Theorists of mind have long been committed to the idea that differences between perception and cognition are differences of kind (Fodor 1983; Pylyshyn 1999; Firestone and Scholl 2016); after all, anyone can appreciate the difference between seeing objects and thinking about them. However, the idea that there is no clear distinction between perceptual states and cognitive states has sparked much discussion in recent years, for its consequences for foundational epistemologies and some widely accepted theoretical frameworks are profound (Lupyan 2015; Clark 2013). Indeed, there is no clear and consensual criterion to distinguish perception from cognition, and no one has shown the specific point where one ends and the other begins; yet various debates in philosophy and psychology presuppose that perception and cognition belong to separate categories. Debates on the cognitive penetrability of perception, the epistemological role of perception, the richness of perceptual content, or the very existence of non-conceptual content largely assume that perception and cognition are different categories defining different mental phenomena. For this reason, interest in marking the border, if any, between perception and cognition has increased.

Efforts to mark such a border have followed different strategies. The representational strategy, for example, postulates that perception is distinguished from cognition because the former produces representations with non-conceptual content and iconic format, whereas the latter generates representations with conceptual content and discursive format (Carey 2009; Burge 2010, 2014; Block 2014, Block unpublished). The architectural strategy, on the other hand, proposes that perception is modular and informationally encapsulated, whereas cognition is not (Fodor 1983; Pylyshyn 1999; Raftopoulos 2009; Firestone and Scholl 2016; Mandelbaum 2018). These two strategies have, of course, detractors: the iconic and non-conceptual character of perceptual states has been deeply questioned (Quilty-Dunn 2016; Mandelbaum 2018), and so has their modular nature (Churchland 1988; Prinz 2006). In this paper I am interested in a third strategy: the stimulus-dependence strategy, which suggests that perception depends on sensory stimulation in a way that cognition does not. It seems, in fact, natural to claim that perception is stimulus-dependent: whereas perceptual representations are, in order to exist, connected to objects in the world, cognitive representations are not (Footnote 1). This is the stimulus-dependence criterion outlined by Beck (2018): roughly, whereas perception is constrained by its inputs, cognition is not. This proposal, however, immediately invites possible counterexamples: hallucinations, which seem to be non-stimulus-dependent perceptual states; and demonstrative thoughts, which seem to be stimulus-dependent cognitive states. Beck’s paper is, in fact, an in-depth discussion of the nature of these two phenomena and their possible accommodation to the stimulus-dependence criterion (Footnote 2). In this paper, I examine this view and place the phenomena of amodal completion and visual categorization under the lens of the stimulus-dependence criterion. I argue that, from this perspective, the notion of perception becomes exceedingly restricted.

This paper is structured as follows: Section 2 outlines the stimulus-dependence criterion (S-D) proposed by Beck to draw the perception/cognition border; Section 3 presents some general difficulties in taking S-D as an appropriate criterion; Section 4 analyses the nature (be it perceptual or cognitive) of the phenomenon of amodal completion placed under the lens of the S-D criterion; Section 5 does the same with visual categorization; Section 6 concludes that, considering the analysis of Sections 4 and 5 and the difficulties remarked in Section 3, the S-D criterion is too restrictive (the remaining notion of perception becomes puzzling and theoretically uninteresting), thus raising questions about the need to draw such a border.

2 The Criterion of Stimulus-Dependence

Beck (2018) makes a notable attempt to distinguish perception from cognition: perceptual states are distinguished from cognitive states by being stimulus-dependent or, more specifically, by having the function of being fully stimulus-dependent (Footnote 3). If this story is on track, then we have strong reasons to divide the mind into at least two parts: on one side perception, a stimulus-driven process, and on the other cognition, a non-stimulus-driven process. A direct and easy way to reject this view is by considering extreme cases. For example, when rubbing your eyes, you can perceptually experience flashes of light, or when closing your eyes, you can still have a perceptual experience of darkness. These tricky cases are, however, of limited theoretical interest, so I will leave them aside and focus on the more common cases. For example, you cannot see apples or hear thunder if there are no apples or thunder to be seen and heard. It is in this sense that perception is considered stimulus-dependent. As a first approximation, perceptual states depend on a causal link that originates in the distal stimulus (the actual physical stimulus) and is mediated by the proximal stimulus (what directly impinges on the perceiver’s sensory organs; in vision, for instance, the image that falls on the retina). This is an important distinction because the actual physical stimulus and the energy falling on a receptor surface do not necessarily coincide (e.g. in perceptual illusions). There is, of course, a causal link between the distal stimulus and the perceptual state, but such a causal link requires the mediation, whether veridical or not, of the proximal stimulus. So, ultimately, the occurrence of a mental state is stimulus-dependent only if it is causally sustained by present proximal stimulation (Beck 2018: 323).

To be appropriate as a mark of the perception/cognition boundary, Beck notes, the stimulus-dependence criterion should accommodate two problematic cases: hallucinations, where perception is independent of stimulus, and demonstrative thoughts, where cognition is dependent on the stimulus. His efforts are, then, focused on accommodating these disturbing cases to his account. He begins by offering a simple formulation:

S-D SIMPLE: ψ is perceptual if, necessarily, all veridical occurrences of ψ are stimulus-dependent; otherwise, ψ is cognitive (Beck 2018: 323).

The problem with this simple formulation is that hallucinations and perceptual illusions count as stimulus-independent: since the formulation refers explicitly to veridical occurrences, non-veridical perceptions fall outside it. Here is the first problem: assuming that hallucinations and illusions are perceptual, how can they be accommodated in S-D simple? To solve this problem Beck first distinguishes between exogenous and endogenous hallucinations: whereas the former are a deviation of proximal stimulation (there are stimuli, but the perceptual experience is not under the causal control of such proximal stimuli), the latter do not require proximal stimulation (everything occurs in the absence of sensory stimulation). The former case (as well as the case of perceptual illusions) only requires a slight modification of the formulation, namely removing the requirement that the occurrences be veridical; the latter case, endogenous hallucinations, is more complicated to resolve. The complete absence of stimuli leaves Beck’s account exposed, so further elaboration is needed. One solution is to put forward evidence for the non-perceptual nature of this type of hallucination; for example, one can argue that endogenous hallucinations are more similar to imagination than to genuine perception, since neither of the two is sustained by proximal stimulation. However, there are also reasons to think that at least some kinds of endogenous hallucination may be perceptual: endogenous hallucinations (as well as imaginations) are phenomenally and introspectively identical to perceptions. This makes it difficult to determine the real nature of endogenous hallucinations. However, Beck finds a way to account for endogenous hallucinations by elaborating a functional definition:

S-D FUNCTION: ψ is perceptual if, necessarily, all occurrences of ψ have the function of being stimulus-dependent; otherwise, ψ is cognitive (Beck 2018: 326).

In this new formulation, the requirement that mental occurrences be veridical is removed, thus allowing exogenous hallucinations and perceptual illusions to be accounted for. The concerns posed by endogenous hallucinations are resolved by introducing the functional definition. If the neural mechanism triggered by a given endogenous hallucination has the function of producing states that are stimulus-dependent, then such a hallucination is perceptual; otherwise it is cognitive. A further distinction is needed here between endogenous imaginative hallucinations and endogenous perceptual hallucinations: while the former are stimulus-independent and non-perceptual, the latter are stimulus-dependent and accordingly perceptual. But how can we distinguish one from the other? The answer, according to Beck, lies in neural mechanisms. If the hallucination is produced by neural mechanisms whose function is to generate stimulus-dependent outputs, then we are facing an endogenous perceptual hallucination; otherwise it is an endogenous imaginative hallucination. The former counts as perceptual, the latter as cognitive. Thus, what counts as a perceptual or an imaginative endogenous hallucination becomes, in the long run, an empirical question determined by the parts of the brain activated. In short, there may be no stimulus, but if endogenous hallucinations are carried out in typically perceptual brain areas, then they should still have the function of being stimulus-dependent and will therefore be perceptual. Conversely, if endogenous hallucinations are not carried out in typically perceptual brain areas, then they do not have the function of being stimulus-dependent and will consequently be cognitive. Beck reasons as follows:

If we have reason to believe that a mechanism has the function of producing stimulus-dependent outputs, and we also have reason to believe that a mental state is the product of that mechanism, we will have reason to count that mental state as perceptual rather than as cognitive (Beck 2018: 327).

The core issue is that, by introducing the functional formulation, where perceptual states are distinguished from cognitive states by virtue of having the function of being stimulus-dependent, Beck claims to prevent endogenous hallucinations from being a dangerous exception to his account. Simply put, to the extent that a perceptual state has the function of representing occurrences in the environment, the representation of deviant occurrences (exogenous hallucinations and illusions), or even the very absence of the occurrences in the environment (endogenous hallucinations), is tolerated by the functional formulation.

Things also get complicated when we turn to demonstrative thoughts. This is exactly the opposite of the case of hallucinations: here cognitive states depend on stimuli, and therefore stimulus-dependence cannot be taken as a reliable distinction between perception and cognition. Here a further distinction must also be drawn. There are perceptually grounded demonstrative thoughts (PGDT) and mnemonically grounded demonstrative thoughts (MGDT). Only the former are relevant, since mnemonically grounded demonstrative thoughts do not rely on current perception and therefore cannot have the function of being stimulus-dependent; they fall within cognition. Perceptually grounded demonstrative thoughts are, by contrast, stimulus-dependent, and therefore a potential hindrance for S-D function.

Now, by delving into the constitution of perceptual demonstrative thoughts, one can easily see a demonstrative element, whose reference is determined by perception and therefore maintains the function of being stimulus-dependent, and a conceptual element (or attribute), which does not have the function of being stimulus-dependent. Perceptual demonstrative thoughts are therefore partially but not fully stimulus-dependent. So, the new formulation is as follows:

S-D FULL: ψ is perceptual if, necessarily, all occurrences of all elements of ψ have the function of being stimulus-dependent; otherwise, ψ is cognitive (Beck 2018: 331).

Beck offers two reasons for the non-perceptual nature of the attributive elements of PGDTs. The first is that conceptual attributives in PGDTs are not proximally constrained. To use Beck’s example, when seeing a sandpiper, you can veridically entertain the PGDT ‘That bird is spotted’, and this occurs even when you cannot perceive the distinctive pattern of dark spots on its breast. The second reason is that conceptual attributives in PGDTs are bounded only by one’s conceptual repertoire. For example, when visually attending to a copy of The Brothers Karamazov, you can surely entertain the PGDT ‘That book is an instance of an existential masterpiece’, yet being an existential masterpiece is, beyond all doubt, far from being a perceptible property; it is rather a property that belongs to the subject’s conceptual repertoire. These reasons, according to Beck, mark the difference between perception (fully stimulus-dependent) and PGDT (partially stimulus-dependent). By adding the clause ‘all elements’ to the formulation, Beck in principle eludes the problems raised by demonstrative thoughts, thus accommodating them into his account.

3 General Difficulties in Taking S-D Full as a Criterion to Demarcate Perception from Cognition

There are, however, controversial assumptions in conceding the variations introduced by Beck to the original formulation of S-D to accommodate hallucinations and demonstrative thoughts. First, to accommodate hallucinations, S-D function rests on neuroscientific evidence about perceptual brain mechanisms, and second, to accommodate demonstrative thoughts, S-D full assumes the object-independent theory of demonstrative thoughts.

On the one hand, by introducing the functional formulation, where perceptual states are distinguished from cognitive states by virtue of having the function of being stimulus-dependent, Beck claims to prevent endogenous hallucinations from being a dangerous exception to his account. However, by taking the mechanism to be the brain area carrying out a given function, we are granting a close relation between the mechanism and its function. This would be, in principle, an acceptable statement, although this descent in the level of abstraction of the explanation (note that we have gone from perception to its function and from function to its neural mechanism) runs the risk of losing, or perhaps altering, the explanatory continuity. Imagine, however, that these concerns dissolve by appealing to explanatory bridges that successfully connect the more concrete with the more abstract levels (perhaps via some kind of successful reductionism). This being the case, taking a perceptual state (hallucinatory in this case) as a functional state, and a functional state as a certain mechanism carried out by a neuronal state, entails that a certain perceptual state can be explanatorily aligned with certain brain areas. Ultimately, following this line of reasoning forces us to focus on the brain: it appears that the whole argument will be, in the long run, upheld by appealing to neuronal mechanisms.

Therefore, for the functional formulation to make sense, the S-D criterion must depend on, and go hand in hand with, differences in the activation of typically perceptual or typically cognitive brain areas. At this point, the S-D criterion has been transformed into a neurobiological criterion (Footnote 4). Stating that perception is distinguished from cognition by being stimulus-dependent amounts to stating that they are distinguished by the neural mechanisms implicated in each one. If perception is distinguished from cognition by having the function of being stimulus-dependent, and the distinction between perceptual endogenous hallucinations (which count as perceptual and therefore are stimulus-dependent) and imagistic endogenous hallucinations (which count as cognitive and therefore are not stimulus-dependent) depends on the brain areas implicated, there is nothing to prevent us from thinking that the stimulus-dependence criterion is coextensive with neurological criteria (Footnote 5).

The natural next step is to examine the neural basis of hallucination: can we neurobiologically distinguish perceptual endogenous hallucinations from imagistic endogenous hallucinations? Although the neural mechanisms that produce hallucinations and other psychotic symptoms remain, to a large degree, unclear, recent research provides relevant results. Let me focus on one of the examples cited by Beck as a typical perceptual hallucination: Charles Bonnet Syndrome. The peculiarity of this syndrome is that, despite being caused by loss of vision due to eye disease, with no clear pathology in the brain itself and no necessary impairment to mental health other than the hallucinations, it produces very complex and vivid hallucinations. Beck relies on the study by Ffytche et al. (1998) to argue that Charles Bonnet Syndrome is an instance of perceptual endogenous hallucination, since it is correlated with activity in areas of the visual cortex closely linked to mechanisms of visual perception (rather than to mechanisms of visual imagination), in particular with cerebral activity in the ventral extrastriate visual cortex (Beck 2018: 329). More recent studies on Charles Bonnet Syndrome, however, suggest different mechanisms. Reichert et al. (2013), for example, propose a generative model to explain the syndrome (see also Powers III et al. 2016; Corlett et al. 2019). According to this model, the disorder is produced by an impairment or a disruption in the balance between top-down predictive processing, which conveys prior expectations and higher-level learnt concepts, and bottom-up processing driven by sensory input. In these cases, the system is able to internally synthesise rich representations of image content, such as objects, even in the absence of (corresponding) sensory input. Put simply, in hallucinations top-down influences tend to dominate visual processing and erroneous perceptions prevail in the face of contradictory sensory evidence (O’Callaghan et al. 2017: 68). Therefore, the content of complex hallucinations cannot be accounted for by appealing only to anatomical organisational properties of visual areas; it also requires appeal to distributed, high-dimensional, hierarchical representations that go beyond local visual features. In sum, the syndrome might arise when prior predictions exert an inordinate influence over perceptual inferences, thus creating percepts with no corresponding stimuli at all. The issue is that the very distinction between imagistic and perceptual endogenous hallucinations loses its sense when we look at the neurological basis of hallucination just presented.

On the other hand, the variation incorporated into S-D function in order to accommodate perceptual demonstrative thoughts, that is, the appeal to the stimulus-dependence of all and not only some of the elements of the mental state, is not without controversy either. Indeed, S-D full depends on the object-independent theory of demonstrative thoughts (Burge 2010), but one can instead adopt the object-dependent theory of demonstrative thoughts (Evans 1982; McDowell 1984; see Footnote 6). These theories are currently under discussion (see Crawford 2020). Let me consider them briefly.

Beck relies heavily on Burge’s theory of perceptual demonstrative thoughts, according to which perceptual demonstrative thoughts have, like perceptual states, a demonstrative element that requires the object to causally impinge on the thinker’s sensory organs, but differ from perceptual states in the attributive element. Whereas in perceptual states the attributive elements are typically perceptual, in perceptual demonstrative thoughts the attributive elements are conceptual. This is key for Beck’s account, since it marks the difference between the subject’s relation to the object in perception, whose content is purely perceptual, and in perceptual demonstrative thought, whose content is a hybrid of perception and cognition. Accordingly, the conceptual nature of the attribution in demonstrative thoughts seems to go beyond the perceptual experience of the object itself. This is, in the long run, what allows him to solve the difficulties posed by perceptual demonstrative thoughts, classifying them as cognitive rather than perceptual.

In stark contrast to Burge’s account, one can consider a theory of perceptual demonstrative thoughts in which the attributive elements, even though conceptual, fall within the perceptual scope. Indeed, given that in the absence of sensory input there is no attribution at all, it is the presence of the input that specifies the whole content of the perceptual demonstrative thought. Evans (1982) and McDowell (1984) are two prominent proponents of this view. As McDowell (1984) puts it, ‘contents … are de re, in the sense that they depend on the existence of the relevant res’ (p. 291). The idea is that, contrary to Burge, the attributives of a perceptual demonstrative thought enter into the specification of perceptual content. To take an example, young children (and perhaps even babies and animals) are perfectly capable of having perceptual demonstrative thoughts, but do not possess the capacity to conceptually attribute what such demonstrative thoughts bring to bear in their perceptual thinking (Footnote 7). This shows that the reference of demonstrative thoughts is not necessarily determined in a descriptive manner through conceptual material associated with the object, but by the very fact of the perceptual relation to it. What this kind of example seems to show is that the attribution of a perceptual demonstrative thought might be stimulus-dependent, which would leave S-D full in Beck’s account in a difficult position.

To explore which of these theories is on the right track exceeds the aim of this paper, but arguably, for the S-D full criterion to demarcate perception from cognition, the theory at stake must be that of Burge: perceptual demonstrative thought must be, unlike pure perceptual states, composed of a perceptual demonstrative element and a conceptual attributive element, the latter being what marks the difference with pure perceptual states.

The general complications just discussed in establishing S-D full as a valid criterion for determining the border between perception and cognition are, of course, not definitive. A defender of S-D might insist that the awkward cases are correctly accommodated, embracing the plausibility of the modifications imposed on the original formulation. One can, indeed, accept both: appealing to neurological mechanisms to differentiate between perceptual and imagistic endogenous hallucinations, and appealing to the object-independent theory of perceptual demonstrative thoughts. Let me, therefore, set these difficulties aside and assume that the variations imposed by the potential counterexamples (hallucinations and demonstrative thoughts) on the original formulation of S-D simple are appropriate. Suppose that S-D full is the final and suitable formulation of the perception/cognition distinction. With this assumption in hand, let me analyse the possible accommodation of other mental states that seem to be halfway between perception and cognition: in particular, amodal completion, which is prima facie perceptual yet stimulus-independent, and visual categorization, which is prima facie cognitive yet stimulus-dependent.

4 The Case of Amodal Completion

Amodal completion is the capacity to see objects as having occluded parts. For example, when we see a cat behind a picket fence, our perceptual system represents those parts of the cat that are occluded by the picket fence. Such completion is part of our ordinary perceptual experience; in the real world, we constantly represent features of perceived objects that are occluded or otherwise hidden from us (Footnote 8). This is easily achieved; even with very limited information we are able to construct an appropriate percept of what is out there (Footnote 9). However, the mechanism by which this is achieved is a matter of debate. Some researchers argue that we amodally perceive an object’s occluded features by imaginatively projecting them into the relevant regions of visual egocentric space (Nanay 2010); others think that amodal completion is characterized by genuinely visual representations (Gibson 1972); and others hold that amodal completion is represented by means of beliefs inferred from the object’s visible features as well as relevant background knowledge (Briscoe 2011).

These disputes are ongoing, but arguably, when we consider the varieties in which amodal completion is represented, each of the three possibilities has something to say (Briscoe 2011). For example, there can be situations in which we project a mental image of a perceived object’s occluded features (in the manner suggested by Nanay 2010), and these features are constrained by past experiences and beliefs recorded in the subject’s mental repertoire (Briscoe 2011). Consider, for example, Fig. 1. When an observer represents the hidden parts of the occluded animal (a horse facing to the right), the attributed features must be constrained by their past experiences and beliefs. Of course, if the observer has never seen a horse, they will have difficulty attributing properties to the occluded part of the figure (Nanay 2010: 247), but if they have previously seen horses or have acquired enough knowledge to recognize horses, then it is unlikely that the observer will represent the occluded part as hiding the front half of a cat, a bird or a person. Thus, the kind of completion illustrated in Fig. 1 (as well as the cat example above) should have an intuitively cognitive component.

Fig. 1 Illustration of a cognitive-driven amodal completion

But plausibly, there are also situations in which we form beliefs about a perceived object’s occluded features directly on the basis of the object’s visible features and other collateral information, and this occurs without first projecting a mental image of its occluded features and without the need for any extra background knowledge supplied by the perceiver. Consider Fig. 2. We see a single white surface partially occluded by a grey stripe. Our visual impression is not of four unrelated image regions on the same plane of depth, but of a grey stripe that appears to be closer in depth and hides parts of a white surface that completes behind it. In this situation, the completion seems to be stimulus-driven and not dependent on background knowledge or imaginative processes.

Fig. 2 Illustration of a stimulus-driven amodal completion

Therefore, amodal completion is a multi-faceted phenomenon, on some occasions explained by means of mental imagery and other cognitive processes, and on others explained by means of purely perceptual mechanisms. The former are cases where the S-D simple formulation is not satisfied, since the completion is, at least in part, stimulus-independent; that is, they are cases where completion is not fully stimulus-driven. However, they might satisfy the S-D function formulation. After all, we complete the right side of the horse because there is a visible part on the left side showing the back half of the horse. So, plausibly, this completion has the function of being stimulus-dependent (Footnote 10). The kind of completion achieved by genuinely visual representations is, prima facie, perception without proximal stimulation, and therefore a potential counterexample to Beck’s account. The case of perceptual amodal completion is, in fact, parallel to the case of perceptual endogenous hallucinations: neither of them satisfies S-D simple (each is perception without proximal stimulation), though plausibly both satisfy S-D function. Since satisfying S-D function requires considering the brain mechanisms involved in the completion of objects, Beck’s account requires that the brain mechanisms deployed by cognitive amodal completion be different from the brain mechanisms deployed by genuinely perceptual amodal completion. It is worth assessing the current empirical evidence in order to address these concerns.

So, what are the differences between the brain mechanisms of cognitive and perceptual amodal completion? As a first approximation, it is easy to see that completing Fig. 1 requires at least one more mechanism than completing Fig. 2: one cannot adequately complete Fig. 1 if one has never seen a horse. The completion of Fig. 2, by contrast, does not require additional knowledge beyond the picture itself. In short, while Fig. 1 involves the influence of previous recognition and categorization, Fig. 2 does not. This indicates that the brain mechanisms involved in imagistic (or cognitive) completion are different from, or at least require further processing than, the brain mechanisms involved in merely perceptual completion. Indeed, current evidence suggests that simple stimuli such as oriented bars, squares, circles, crosses, or stars might restrict the neural representation to low-level visual areas, whereas more natural scenes and actual objects such as tools, faces, or animals might recruit higher-order visual areas (for a review see Thielen et al. 2019).

To recapitulate, if we apply the S-D simple formulation to all types of amodal completion, the result is that neither cognitive (and imagistic) nor perceptual amodal completion is sustained by present proximal stimulation, and therefore neither satisfies S-D simple. If we revisit S-D function and focus on brain mechanisms, the most recent empirical evidence suggests that cognitive amodal completion requires the recruitment of higher-level visual areas, whereas perceptual amodal completion is restricted to low-level visual areas. Thus, whereas the former does not have the function of being stimulus-dependent, the latter does. Perceptual and cognitive amodal completion are therefore both instances of mental states without present proximal stimulation, but only perceptual amodal completion has the function of being stimulus-dependent. Furthermore, if we turn to S-D full, perceptual amodal completion clearly has the function of being fully stimulus-dependent (there is no conceptual attributive in this sort of completion), while cognitive amodal completion requires the use of prior conceptual information and is therefore only partially stimulus-dependent, so it does not satisfy S-D full.

Therefore, the verdict is that, in a similar way to endogenous hallucinations and in consonance with Beck’s account, the phenomenon of amodal completion has, depending on the stimuli, two faces: one perceptual, since even without proximal stimulation the completion can still have the function of being fully stimulus-dependent; and the other cognitive, since without proximal stimulation the completion does not have the function of being fully stimulus-dependent. Now, the crucial point is to evaluate the prevalence of each of these faces in our everyday perceptual lives. It can readily be argued that in the vast majority of everyday perceptual scenarios, stimuli are completed in a cognitive manner. Indeed, real perceptual experiences are not usually composed of oriented bars, squares, circles or crosses, but rather of more complex stimuli like tools, animals, faces or moving objects. That is, in natural conditions, it is very rare to amodally complete stimuli through perception driven by sensory stimulation alone. This is important because, taking the perceptual side of the amodal completion story to be a residual and rare phenomenon present only in labs, in controlled situations and with deliberately designed stimuli, one is tempted to claim that the phenomenon of amodal completion is mostly cognitive, and therefore, according to Beck’s logic, stimulus-independent. For the moment, I am only interested in the verdict that amodal completion is mostly cognitive; later we will see the vast consequences of such a verdict. Let me now apply Beck’s constraints to another very common mental phenomenon, sometimes labelled as cognitive and sometimes as perceptual: visual categorization.

5 The Case of Visual Categorization

Pictured objects and scenes can be understood in a brief glimpse: conceptual information is effortlessly available in the blink of an eye, almost instantaneously. Recent evidence shows that conceptual understanding can be reached when a novel picture is presented for as little as 13 ms (Potter et al. 2014). Whether visual identification falls on the perceptual or on the cognitive side of the story is a debated issue. Firestone and Scholl (2016), for example, argue that insofar as object (or face) recognition necessarily involves the intervention of concepts stored in long-term memory, it should count as a genuine cognitive capacity. By contrast, other researchers focus on other typical aspects of object recognition, such as swiftness (Grill-Spector and Kanwisher 2005; Potter et al. 2014) or automaticity (Mandelbaum 2018), to include it as a typical perceptual capacity. The status of visual categorization, whether perceptual or cognitive, is therefore an unsettled issue. Let me then apply the constraints of the stimulus-dependence criterion.

If we consider the stimulus-dependence strategy for marking the perception/cognition border, visual categorization clearly satisfies S-D simple. Categorizing objects is causally sustained by proximal stimulation: one cannot visually categorize an object if the object is not visually present. Hence, as all veridical occurrences of visual categorization are stimulus-dependent, visual categorization satisfies S-D simple. Let me then examine whether visual categorization satisfies S-D function. Recall that according to Beck, in order to have the function of being stimulus-dependent, the mechanisms deployed by a mental state must be correlated with activity in areas of the cortex that are closely linked to mechanisms of perception. What are the brain mechanisms activated during visual categorization? According to the most recent neuroimaging studies, visual categorization is represented in a distributed fashion throughout the brain, and multiple neural systems are involved. The highest-level areas of the visual system, particularly the inferotemporal cortex, together with the prefrontal cortex, parietal cortex, premotor and motor cortex, and other areas such as the hippocampus, medial temporal lobe, basal ganglia, cortico-striatal loops and midbrain dopaminergic system, as well as the interactions between all these neural systems, play a key role in the encoding of object categories and in category learning (for reviews see Seger and Miller 2010; Freedman and Miller 2008). While some authors associate categorical discrimination (e.g. ‘animate’ vs ‘inanimate’) with the prefrontal cortex (Freedman et al. 2003), others suggest that features at the stage of the inferotemporal cortex are already optimized for category discrimination (Kriegeskorte et al. 2008). 
It seems then that whatever the decisive area for discriminating between categories, in principle there are not enough reasons to hold that visual object categorization can be achieved without the intervention of higher-level visual areas or even other non-visual brain areas. This suggests that the mechanisms of visual categorization are not correlated with activity in areas of the cortex closely linked to mechanisms of perception, since other non-perceptually grounded brain areas are involved. So, in principle visual categorization does not satisfy S-D function, and should therefore count, according to Beck’s logic, as cognitive.

Nevertheless, there is a way of conceding that the brain mechanisms engaged during visual categorization are perceptually grounded. Let us suppose that visual categorization is achieved solely through the involvement of parts of the cortex associated with visual processing. Indeed, decades of evidence suggest that, even though it involves different brain areas located in different lobes, activity along the ventral visual stream (the “what pathway”) is sufficient to recognize and categorize objects (Ungerleider and Mishkin 1982). Although visual categorization is most likely achieved during the last sections of the ventral stream (particularly the inferotemporal cortex, involved in recognizing complex object features (Tanaka 1996), and the prefrontal cortex, involved in linking perception to memory and action (Riley and Constantinidis 2016; Haller et al. 2018)), the functional mechanisms activated along the ventral pathway are sufficient for categorizing objects perceptually. Let us, therefore, carefully and reluctantly grant that the ventral pathway (from V1 to prefrontal cortex) is a perceptual mechanism that has the function of being stimulus-dependent. This being the case, visual categorization could satisfy S-D function and therefore count as perceptual.

What about S-D full? Does visual categorization have the function of being partially or fully stimulus-dependent? Recall that the troubling case for S-D function is that of perceptually grounded demonstrative thoughts (PGDTs), thoughts with the schematic form “that X is p”. Recall also that we have assumed Burge’s account of PGDTs, to wit, that PGDTs, like perceptual states, have a demonstrative element and an attributive element. In both, the demonstrative element counts as perceptual; but in PGDTs, unlike paradigmatic perception, the attributive element counts as cognitive. PGDTs are, therefore, partially stimulus-dependent and in consequence do not count as perceptual but as cognitive (since they do not meet S-D full). Now, note that visual categorization has a schematic form similar to that of PGDTs: when an object is categorized, e.g. as a flower, the subject’s state plausibly takes the schematic form “that X is p” (“that object is a flower”), where there is a demonstrative element “that X” and an attributive element “is p”.

Recall now that Beck gives two reasons for the non-perceptual nature of the attributive elements in PGDTs. The first is that conceptual attributives in PGDTs are not proximally constrained; the second is that conceptual attributives in PGDTs are bound only by one’s conceptual repertoire. These reasons, according to Beck, mark the difference between perception (fully stimulus-dependent) and perceptually grounded demonstrative thoughts (partially stimulus-dependent). The question is: do the attributive elements of visual categorization count as perceptually or cognitively grounded? Let us apply the constraints. First, as in PGDTs, you can visually categorize an object veridically even when you cannot see some of its distinctive features. For example, you can categorize a face as your friend’s face even without seeing its distinctive freckles, and at the same time truly think: That face is my friend’s freckled face. That is, since you can visually categorize objects without seeing all of their distinctive properties, the attributive elements in visual categorization are not necessarily proximally constrained. The second constraint is also satisfied. When you visually categorize, for example, a soccer ball, you are not employing exclusively perceptual attributives (e.g. its roundness, its colour), but also conceptual attributives (this is a soccer ball and not a basketball or a tennis ball); that is, your conceptual repertoire also comes into play. As a matter of fact, the property of being a ball that is used to play soccer is not the kind of thing that one can visually perceive. So, by applying Beck’s logic, one can conclude that, as in PGDTs, the attributive element of visual categorization is not proximally constrained and draws on one’s conceptual repertoire. Thus, visual categorization has at most the function of being only partially stimulus-dependent and, consequently, it counts as cognitive.Footnote 11

At the beginning of this section, I pointed out the remarkable swiftness of visual categorization: we can recognize an object categorically when it is presented for as little as 13 ms (Potter et al. 2014).Footnote 12 Indeed, using a rapid serial visual presentation (RSVP) task, Potter et al. (2014) presented participants with a series of visual images; the subjects’ goal was to detect and categorize a specified target in the sequence of rapidly presented images. They found that observers could determine the presence or absence of a specific picture even when the pictures in the sequence were presented for just 13 ms each. These results are usually taken to argue in favor of feedforward models of visual categorization, to wit, that an initial feedforward wave of neural activity through the ventral stream is sufficient to allow identification of a complex visual stimulus. And most importantly, this evidence also suggests that perception outputs conceptualized representations in the form of basic-level categories, or in other words, that the connection between the object and the category to which it belongs must be established during perception proper (Mandelbaum 2018).

There is, however, another possible interpretation of this evidence. We can argue that conceptualization is a genuine cognitive process by holding that perception is the process that unfolds during this brief span of time (the first 13 ms). This being the case, perception becomes a very narrow process, but perhaps still a significant one. Indeed, at this point, the speed of the visual system should not surprise us; if instead of the speed of processing we attend to what is processed during this short period of time (the representation of shape, colour, and so on), it is reasonable to think that what occurs before conceptualization comes into play is what has the function of being fully stimulus-dependent, and therefore is what counts as genuine perception. Summing up, S-D full demands that perception be a pre-conceptualized representation, since conceptual attributives, the kind of attributives present in PGDTs and visual categorization, do not have the function of being fully stimulus-dependent, and therefore count as cognitive.

Thus far, the only option left to Beck’s account is that perception is what is registered pre-conceptually, that is, the intrinsic properties of perceptual experiences processed during the early levels of visual processing (again, the processing of shape, colour, size, texture, brightness or motion). But even this is at odds with current research on visual categorization. Let us look in a little more detail at what the most recent evidence shows. Studies of temporal dynamics have found overlapping signatures of low-, mid- and high-level visual representations (Groen et al. 2013; Harel et al. 2016), thus suggesting co-occurring and co-localized visual and categorical processing (Ramkumar et al. 2016). Such evidence calls into question the classical hierarchical model and the utility of the very distinction between levels of properties (Groen et al. 2017). For example, Ramkumar et al. (2016) studied the temporal relationship between visual information representation and rapid scene categorization. They used confusion matrices (matrices whose entries represent the fraction of trials on which one category was mistaken for another) to track the pattern of errors produced by visual representation or by other processes that lead to categorical choice. They found that the very same regions that represented low-level features could also explain unique variance in neural confusions that were directly related to behavioural confusions. Crucially, both visual and behavioural confusions predicted neural confusions nearly simultaneously, thus suggesting a temporal overlap between the encoding of visual features and the processing that influences behavioural choice (see also Dima et al. 2018). Something similar occurs with mid-level representations. 
Some studies suggest that object recognition can co-occur with or even precede figure-ground organization (Peterson 1994; Grill-Spector and Kanwisher 2005); others show that visual categorization based directly on low-level features, without grouping or segmentation stages, can benefit object localization and identification (Torralba and Oliva 2003); and still others suggest that visual category-responsive regions are driven not purely by low-level visual features but also by the high-level perceptual interpretation of the stimulus (Schindler and Bartels 2016). Critically, this evidence shows that the representation of low-level visual features, mid-level segmentation and the processing that informs categorical information are not sequential, but co-occur within the same cortical networks. Significantly, this evidence is consistent with Potter et al.’s (2014) results, and at odds with the idea that perception is an unconceptualized process. What remains to be clarified are the low-level contributions needed to ascribe meaning to scenes, but, as a matter of fact, the co-occurrence of these levels makes it hard to draw a clear line between the non-conceptualized and conceptualized parts of visual processing.
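
For readers unfamiliar with the notion, the following minimal Python sketch illustrates what a behavioural confusion matrix of the kind described above records. It is only an illustration under stated assumptions: the scene categories, the toy trial data and the helper name `confusion_matrix` are all hypothetical, and the sketch is not a reconstruction of the analysis pipeline of Ramkumar et al. (2016).

```python
from collections import Counter


def confusion_matrix(shown, chosen, categories):
    """Return a row-normalised confusion matrix.

    Entry [i][j] is the fraction of trials on which category i was
    shown and category j was chosen by the observer.
    """
    counts = Counter(zip(shown, chosen))
    matrix = []
    for presented in categories:
        row_total = sum(counts[(presented, c)] for c in categories)
        # Guard against categories that were never presented.
        row = [
            counts[(presented, c)] / row_total if row_total else 0.0
            for c in categories
        ]
        matrix.append(row)
    return matrix


# Hypothetical toy data: eight trials over three scene categories.
categories = ["beach", "forest", "city"]
shown = ["beach", "beach", "beach", "beach", "forest", "forest", "city", "city"]
chosen = ["beach", "beach", "beach", "city", "forest", "forest", "city", "beach"]

matrix = confusion_matrix(shown, chosen, categories)
for name, row in zip(categories, matrix):
    print(name, row)
```

The off-diagonal entries are the categorization errors. The argument in the text turns on comparing such patterns of error across visual, neural and behavioural measures over time, not on any single matrix of this kind.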

The aim of the above discussion was to analyse the nature, whether perceptual or cognitive, of visual categorization as seen through the prism of the S-D criterion. Now we have a diagnosis. Visual categorization does not fulfil the S-D full criterion, for two reasons. The first is that, in order to have the function of being stimulus-dependent, the mechanisms deployed by a mental state must be correlated with activity in areas of the cortex closely linked to mechanisms of perception, and this does not seem to be the case for visual categorization. With many qualms, though, I have conceded that visual categorization may be achieved solely through the involvement of parts of the cortex associated with visual processing, thus escaping the functional constraint. The second reason is, nevertheless, insurmountable. Visual categorization has, at most, the function of being only partially stimulus-dependent; the demonstrative element has the function of being causally sustained by present proximal stimulation, but the attributive element does not.

Finally, if visual categorization is a cognitive process that can be achieved with exposures as brief as 13 ms, then Beck’s account requires that perception be what is registered pre-conceptually. This being the case, we have a problem with the scope of the purely perceptual side of the story, since perception becomes a very narrow mental process. Furthermore, recent evidence suggests that the representation of low-level visual features, mid-level segmentation and the processing that informs categorical information are not sequential, but co-occur within the same cortical networks. These arguments clearly run against using the S-D criterion to mark the perception/cognition border, since, at least on the evidence from visual categorization, the purely perceptual part of the process is difficult to isolate.

6 The Final Verdict: By Way of Conclusion

After submitting both hallucinations and demonstrative thoughts to the scrutiny of S-D full, Beck concludes that only perceptual endogenous hallucinations are purely perceptual states; imagistic endogenous hallucinations and perceptually grounded demonstrative thoughts fall on the side of cognition. When the S-D full restrictions are extended to cases like amodal completion and visual categorization, it follows that only some rare cases of amodal completion (perceptual amodal completion) are, in principle, typically perceptual states; the other, more common and naturalistic cases of amodal completion, as well as cases of visual categorization, fall on the side of cognition. Furthermore, in my discussion of visual categorization, I have put forward some studies which show that pre-conceptualized representations have signatures that overlap with those of conceptualized ones; that is, the representation of low-level visual features, mid-level segmentation and categorical information are not sequential, but co-occur within the same cortical networks. The question now is: if these arguments are on the right track, what is it like to be in a pure perceptual state? The answer is that, according to the S-D full criterion, being in a pure perceptual state is a very rare (if not an unfeasible) situation. This makes it unsuitable to speak of perception as a natural kind, since perceptual states, when they exist at all, are predominantly combined with other mental states (cognitive, imagistic, or whatever). Of course, this is not to say that there are no stimulus-dependent mental states (in the sense of S-D simple), but rather that it is hard to find a mental state whose function is to be fully stimulus-dependent.Footnote 13

A general diagnosis can be drawn from all this. If we want to keep perception as a theoretically and scientifically useful term, we have to change our thinking about it. One possibility is to get rid of the idea of a sharp line between perception and cognition and to consider these terms not as natural kinds, but as mostly interdependent mental phenomena which contribute to a single mental state, and whose contribution varies depending on external factors (ambiguity, noise or uncertainty). If we decide to take this step, perception and cognition appear predominantly intertwined. This means that there can be pure perceptual states, pure cognitive states and mental states with different degrees of perception and cognition. Pure cognitive states are common (e.g., linguistic judgements), pure perceptual states are, as we have seen, very unusual, and the intertwined state encompasses the majority of our everyday experiential situations; it is, in sum, the prevailing state in our mental lives.

Both philosophers and experimental psychologists have long tried (with little success) to isolate, terminologically and experimentally, pure perceptual states from other non-perceptual states. In this paper, I have considered the criterion of stimulus-dependence as perhaps the most compelling way to stipulate such a division. Indeed, the idea that perceptual states, in contrast to cognitive states, are causally sustained by proximal stimulation sounds convincing. However, a closer examination uncovers a different reality. When the constraints of the S-D full criterion are applied, the notion of pure perception becomes so narrow that it is extremely difficult even to grasp it. The full version of the S-D criterion is therefore so restrictive that the differences between perception and cognition should perhaps be considered differences of degree rather than of kind.