Beyond the Platonic Brain: facing the challenge of individual differences in function-structure mapping

In their attempt to connect the workings of the human mind with their neural realizers, cognitive neuroscientists often bracket out individual differences to build a single, abstract model that purportedly represents (almost) every human being’s brain. In this paper I first examine the rationale behind this model, which I call the ‘Platonic Brain Model’. Then I argue that it should be superseded by multiple models that allow for patterned inter-individual differences. I introduce the debate on legitimate (and illegitimate) ways of mapping neural structures and cognitive functions, endorsing a view according to which function-structure mapping is context-sensitive. Building on a discussion of the ongoing debate on the function(s) of the so-called Fusiform “Face” Area, I show the necessity of indexing function-structure mappings to populations of subjects, clustered on the basis of factors such as their expertise in a given domain.

in some behavioral task. In most cases, functional ascriptions are a matter of sharp contention. However, a few functional ascriptions are taken as received wisdom.
Take the striate cortex, also known as V1. Almost everybody agrees that V1 is in the business of processing visual information, hence its alias, primary visual cortex. This is unsurprising, given that V1 receives afferent fibers from the lateral geniculate nucleus, which in turn receives fibers from the retina. It would thus be safe to say that V1 deals with visual information in most cases.
But what functional role does this primary visual cortex play in blind subjects? Since they do not process any visual information, were brain areas rigidly 'tied' to their functional destiny, we would expect that V1 simply does nothing. However, this seems not to be the case. In fact, fMRI scans reveal activity in blind subjects' V1 (and in other cortices) when they read Braille; and impairment in their performance may be obtained by perturbing V1 via transcranial stimulation (Kupers et al. 2007). A growing number of such cases suggest that several 'visual' areas are way too busy in non-sighted subjects to be actually and uniquely visual.
To account for such findings, some scholars have been ready to retract the ascription of visual functionality to V1. Notably, according to the meta-modal hypothesis (Hamilton and Pascual-Leone 2001), V1 and other purportedly 'visual' cortices still perform the same computation, e.g. shape recognition, across various modalities. Having described striking similarities in the neural activity of the 'visual' cortices of sighted and congenitally blind subjects in several circumstances, Ricciardi and Pietrini (2011) argue for a more abstract characterization of the function of these structures.
However, this 'digging deeper' strategy (which I shall examine more closely in §3.2 under the term "functional abstraction") may sometimes be unsatisfactory. In fact, in blind (as opposed to non-blind) individuals, several "visual cortices" activate during tasks such as listening to a spoken language or doing math equations, which are hardly accountable in terms of "shape recognition" (Bedny 2017).
The story might get even more complex, and more intriguing. As bizarre as it may seem, some congenitally blind individuals devise an ingenious strategy to compensate for the lack of visual input when navigating space: echolocation. They repeatedly make a click sound with their tongue and use the echo to identify obstacles and other objects in their surroundings (Downey 2016). Notably, fMRI reveals that echolocation activates areas in the vicinity of V1 (Thaler et al. 2011). Now, undoubtedly several metamodal labels can be devised to account for the activity of some visual areas in tasks such as reading Braille or echolocating. Yet, by dismissing the differences between such tasks, these labels are likely to miss something quite important: namely, that this very area is involved in doing something different in the two cases. Moreover, it is worth stressing that the area's engagement in visual processing simply cannot occur in blind subjects; nor can its engagement in echolocation occur in sighted subjects.
This simplified story was meant to emphasize that brains and cognitive abilities exhibit important differences across human beings. This is a well-known fact, of course. Yet, as often happens with facts that are too well known, I contend that its implications have received less attention than they deserve. Recall the quotes from the handbooks that open this section: the task they set for cognitive neuroscience is to "understand the relationship between the brain and mind" (Banich and Compton 2018: p. 3. Italics mine); to unravel "how mental processes […] are organized and implemented by the brain" (J. Ward: 1. Italics mine), as if one "Platonic" model of the brain might explain everything we need to know about the functional organization of every actual (and possible) human brain.
In this paper, I have three aims. First, I aim to explain the rationale behind the "Platonic Brain Model" that treats individual differences in brain functional organization as noise to be zeroed out (§2). Second, I highlight some of its limitations. To do so, I provide an overview of the extant philosophical literature on (the difficulty of) mapping cognitive functions onto neural structures (§3), and present a case study in which a Platonic Brain Model falls short of accounting for all the relevant function-structure mappings, namely the case of the Fusiform 'Face' Area (§4). Third and last, I discuss the virtues of, and begin to sketch a recipe for, a post-platonic neuroscience, in which function-structure mappings are indexed to certain populations of subjects, e.g. subjects who share a certain degree of expertise in some domain (§5).

Dealing with individual differences: the Platonic Brain Model
Individual differences in human brains are particularly evident when we consider extreme pathological cases like those hydrocephalic subjects who exhibit no sign of mental impairment (Nahm et al. 2017), or when we compare children's and adults' brains (e.g. Wilke, Schmithorst and Holland 2003). However, aided by the development of tools such as Voxel-Based Morphometry (Whitwell 2009), neuroscientists have become increasingly interested in investigating differences even outside the clinical or the developmental domain. In a now classic study, taxi drivers in London were shown to exhibit a peculiar increase and decrease in volume in the posterior and anterior hippocampus, respectively, which correlates with the amount of time spent driving a taxi (Maguire et al. 2000). The thickness of the hippocampus also seems to be positively correlated with socioeconomic status, as is that of many other cortical and subcortical structures (Farah 2017).
But individual differences are not limited to anatomy. Instead, they extend to physiology, altering the functional landscape of brain organization, sometimes profoundly. As people get older, the same tasks that used to elicit a right-lateralized activity of the prefrontal cortex tend to bring about a more bilateral activation. In other tasks, aging correlates with increased activity in the anterior regions, and diminished activity in posterior regions, even when performance remains unaltered (see De Brigard 2017 and the references therein). In people who played Pokémon videogames in their youth, as opposed to those who did not, looking at Pokémon consistently activates a portion of the occipitotemporal sulcus whose localization is consistent among subjects (Gomez et al. 2019). Similarly, merely looking at this very sentence (or at any string of letters) should suffice to engage a part of the left ventrotemporal cortex in literate subjects, hence its alias, visual word form area. But the same activity would not show up in illiterate subjects (Dehaene et al. 2010).
To be sure, psychological science never ignored individual differences. Indeed, according to Danziger (1990/1994), investigating individual differences was precisely the core business (sometimes literally!) of one of the most prominent traditions during the early decades of scientific psychology, namely the Galtonian tradition (after Francis Galton). Setting the stage for modern psychometrics, its proponents studied the relationships between indicators of different kinds (sociological, anatomical, and psychological, e.g. performance in tests) in individual subjects and groups, sometimes reaching questionable conclusions. However, throughout the history of scientific psychology, and later also in the history of neuroscience, individual differences were mainly conceived quantitatively. That is, differences between subjects are represented as different scores on several scales, e.g., IQ scores, the Big Five personality traits, the volume of some brain areas such as the posterior hippocampus.
Far less attention has been paid to qualitative differences in cognitive skillset or brain organization. Most studies in cognitive neuroscience are taken to be generalizable to almost every human brain. Subjects who are too dissimilar from the average are treated as "outliers", i.e. as exceptions or noise, whose explanation is left to clinicians.
A rationale for this framework can be found in a well-known methodological paper by Caramazza (1986). Before the rise of neuroimaging techniques that allow studying the human brain in vivo, lesion studies were the bread and butter of mind and brain sciences. Cognitive neuropsychologists studied the cognitive impairments of patients who suffered a focal brain lesion not merely for the sake of understanding a clinical case, but also to unravel the workings of healthy brains.
As a neuropsychologist, Caramazza reflects upon the inferences that can be drawn from impaired to unimpaired cognitive systems, making the underlying assumptions explicit. He states: "A crucial feature of [any model of human mind/brain] is the assumption of universality; that is, the assumption that [the model] is true of 'normal' human mind/brains in general and, therefore, of any individual normal mind/brain" (Caramazza 1986: p. 49).
Caramazza clearly does not ignore individual differences. Instead, he treats them as something to be ruled out for the sake of generality, claiming that "we are going to have to place some restrictions on what will count as 'normal human mind/brain'" (ibid.). He goes as far as to claim that "if we were not to accept the assumption of universality, we would negate the possibility of scientifically investigating the mind/brain" (ibid.).
Crucially, the assumption of universality is the cornerstone of the single-patient method, for it justifies a generalization of the model (however falsifiable) based on evidence gathered from a single patient and then extended to every "normal" subject. For Caramazza, the single-patient method is superior to the group method, given that the latter may engender a spurious clustering of patients based on syndromes rather than on etiology. Indeed, what might prima facie appear as the same impairment due to common symptoms might be caused by lesions in different sites, and these important differences are easily overlooked when averaging across patients.
However, Caramazza clearly stresses that the universality assumption also holds for group studies: "The justification for using the performance of groups of subjects in our experimental investigations is based on the assumption that the averaged performance of the group essentially reflects the performance of any individual in the reference population from which the group was drawn. Thus, any conclusions arrived at for the group of subjects tested will be assumed to be true of all individuals in the reference population. This argument is only valid if the assumption of universality is true" (Caramazza 1986: p. 50).
Let us now make a twenty-year leap forward in time. The rise of hemodynamic neuroimaging techniques has reshaped the scientific landscape. Lesion studies are no longer the favorite tool of cognitive neuroscientists (as they are now called), as researchers can now glimpse brain activity in vivo in healthy subjects. As these subjects are healthy, the syndrome-etiology worry that concerned Caramazza ceases to apply. However, the universality assumption is still there, albeit in a new guise.
At the end of a lengthy discussion aimed at formalizing and justifying the inferences drawn from neuroimaging data to function-structure mappings, Henson (2005) addresses the possible objection according to which "different developmental trajectories (based, for example, on the order of exposure to different stimuli, such as language) may result in different 'final' function-structure mappings" (Henson 2005: p. 224). His reply reads as follows: "the only solution is the nomothetic solution: to define 'normal' psychological functions-in the healthy adult, for example-and to assume that there is [a] single 'normal' mapping of these functions to the brain. This then becomes an empirical question: if the function-structure mapping changed […] so dramatically as a function of different developmental trajectories, one would not observe reliable differences in imaging data when averaging over random samples of normal individuals. The fact that neuroimaging can produce reliable and reproducible differences suggests that there is a normal (default) function-structure mapping" (Henson 2005: pp. 224-225).
Researchers working under such an assumption thus look at individual differences as a possible threat to generalizability. Two strategies are employed for ensuring generalization: (a) adopting some screen-off criteria to exclude "abnormal subjects", and (b) establishing the perimeter of normal results by averaging among data gathered from normal subjects.
Screen-off the abnormal. Usually, the "participants" section of neuroscientific articles reassures the reader that screen-off criteria have been respected. Typically, the description of participants boils down to a few details, e.g. whether they are healthy, adult, right-handed, and normal-sighted (to be able to see the stimuli in the scanner). As long as these criteria are respected, the researchers go on to describe the findings as if they pertained to a Platonic Brain, suggesting to the reader that they generalize to virtually any human being, unless she fails to meet the screen-off criteria.
Averaging. Even after the 'abnormal' brains are screened off, normal brains do come in different sizes and shapes, and with several other idiosyncratic features. Researchers have come up with several different techniques for aggregating data from multiple subjects. A common strategy, which Zina Ward (2019) dubbed the cartographic approach, consists in plotting activations from multiple subjects' brains onto a common reference space or template. But other approaches exist that employ functional criteria together with anatomical ones to align the brain activations of multiple subjects (cf. fn. 10).
Working under the assumption of the Platonic Brain Model was not unreasonable. After all, brain development is often based on somewhat similar genetic blueprints, and many experiences are similar enough across humankind to result in the same brain organization. This theoretical bet has proven fruitful not only for phylogenetically old mental activities, but also for relatively recent ones: in fact, the location of the visual word form area is stable across subjects (Dehaene et al. 2010; see Rathkopf (forthcoming) for a discussion of some implications).
Nevertheless, working under the assumption of a Platonic Brain Model comes with some theoretical costs and risks. For instance, Zina Ward (2019) shows how, when different brains are aligned with respect to some features (e.g. microanatomical properties) in neuroimaging studies, this may result in misalignment in other features (e.g. functional properties), and concludes that the choice of aligning technique depends upon the question one has in mind, i.e. there is no one-size-fits-all technique to align brains.
Since the topic has many more implications than one can account for in a single paper, I will restrict my discussion to a specific issue. Given that many neuroscientific endeavors start by hypothesizing a simple function-structure mapping (Bechtel and Richardson 2010), the following discussion will scrutinize Henson's claim that "there is a normal (default) function-structure mapping".

The quest for functional specialization
The question of whether the cortex is equipotential (i.e., its parts are functionally interchangeable) or functionally specialized (i.e., each part plays a specific function) dates back at least to the nineteenth century (Mundale 2002). Since then, scientists have been oscillating between the two positions. The longevity of the debate suggests that empirical data do not speak for themselves: they can be interpreted differently to defend one viewpoint or the other. The advocates of the functional specialization hypothesis have strenuously defended the following claim:

Functional Specialization [FS]: for each neural structure s, there is one functional description f, so that the process f occurs if and only if s activates.

[FS] entails that the relationship between mental functions f and (the activity of some) neural structures s will result in a set of biconditionals $s_x \leftrightarrow f_y$. Counterevidence can affect both directions: when a structure underpins several functions (multifunctionality), it contradicts the statement that $s_x \rightarrow f_y$. When a function is conjunctively (distributed processing) or disjunctively (degeneracy) realized by several structures or sets of structures, it contradicts $f_y \rightarrow s_x$.

Prima facie, FS is wiped out by current evidence: the literature reports several cases in which one or the other side of the biconditional fails to obtain. Such cases may be sorted into two major classes: cases of many-to-one function-structure relationships (a); and cases of one-to-many function-structure relationships, which come in two versions (b, c) (for a schematic recap, see Fig. 1). Let us briefly examine these cases in turn:

(a) Multifunctionality: one neural structure implements multiple functions.
Multifunctionality seems to be a ubiquitous property of brain structures, although some have more functions than others (Anderson and Pessoa 2011). A few examples: the insula seems to be involved in a variety of functions, ranging from sensorimotor control to the experience of disgust and its perception in others, to the processing of salience for novel stimuli (Uddin et al. 2017). Broca's area, once thought to be eminently involved in language production and/or in syntactic processing, has been reported to activate also during motor planning and music processing (Tettamanti and Weniger 2006). The left posterior lateral fusiform, corresponding to the abovementioned visual word form area, also seems to be involved in the interpretation of visual attributes of animals, as well as in tactile-visual interface (Price and Friston 2005).
(b) Distributed processing: one cognitive function relies upon the coordinated work of multiple brain structures.
No brain region operates in isolation from others: instead, each cognitive task seems to elicit the orchestrated activity of a coalition of neural regions. This principle has been discussed at length in the literature on neural correlates of specific emotions: each one correlates with a set of regions, some of which are shared by multiple emotions (Lindquist et al. 2012). More generally, distributed processing is gaining traction due to the predictive success of the multivoxel pattern analysis techniques, whose underlying assumption is that neural representations are codified by the activity of vast, sparse networks of neurons.
(c) Degeneracy: one cognitive function might be implemented by different neural underpinnings.
Another reason why functions may fail to map neatly onto a single neural structure is that they are degenerate, i.e. that they can be realized by different neural underpinnings (Noppeney et al. 2004). For instance, it has been argued that familiar words might be read either lexically (reading whole words) or phonologically (reading letter-by-letter). While both distributed processing and degeneracy represent cases of one-to-many function-structure relationships, in the former a single function maps onto a conjunction of structures, whereas in the latter it maps onto their disjunction. It is also worth noting that these three issues are not mutually exclusive. On the contrary, they intersect with one another, thereby casting multiple shadows upon FS.
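The taxonomy in (a)-(c) can be illustrated with a small Python sketch. It is only a toy illustration under assumptions that are not in the text: each observation is modeled as a pair (function, set of structures jointly realizing it), the helper name `fs_violations` is hypothetical, and the labels "emotion E", "region A", "region B", "lexical pathway" and "phonological pathway" are placeholders rather than reported findings.

```python
from collections import defaultdict

# Toy data (hypothetical): each pair is (function, structures jointly realizing it).
REALIZATIONS = [
    ("sensorimotor control", frozenset({"insula"})),
    ("salience processing", frozenset({"insula"})),
    ("language production", frozenset({"Broca's area"})),
    ("music processing", frozenset({"Broca's area"})),
    ("emotion E", frozenset({"region A", "region B"})),
    ("reading", frozenset({"lexical pathway"})),
    ("reading", frozenset({"phonological pathway"})),
]


def fs_violations(realizations):
    """Sort departures from one-to-one function-structure mapping into (a)-(c)."""
    functions_of = defaultdict(set)  # structure -> functions it takes part in
    realizers_of = defaultdict(set)  # function -> alternative realizing sets
    for function, structures in realizations:
        realizers_of[function].add(structures)
        for structure in structures:
            functions_of[structure].add(function)
    return {
        # (a) one structure, several functions
        "multifunctional": sorted(
            s for s, fs in functions_of.items() if len(fs) > 1
        ),
        # (b) one function, a conjunction of structures
        "distributed": sorted(
            f for f, sets in realizers_of.items() if any(len(x) > 1 for x in sets)
        ),
        # (c) one function, a disjunction of alternative realizers
        "degenerate": sorted(
            f for f, sets in realizers_of.items() if len(sets) > 1
        ),
    }


print(fs_violations(REALIZATIONS))
```

On this toy data set all three lists are non-empty, whereas a strictly one-to-one mapping would leave all of them empty.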
However, (a-c) do not necessarily falsify FS. Indeed, far from being a purely empirical claim about the brain, FS is best understood as a working hypothesis, a bet whose heuristic power guides the development of an integrated cognitive ontology (McCauley and Bechtel 2001). The history of psychology shows that the operations of the mind can be variously described, and multiple hypotheses exist as to how these operations can be decomposed. Moreover, although less obviously, there are several ways to partition brain structures. All things considered, in a case of putative falsification of FS given a certain set of mental and neural categories, a researcher has the option of preserving FS and instead dispensing with some of the mental or neural categories that fail to map onto each other, in favor of more brain-friendly mental categories and/or more mind-friendly neural categories.
Such an ontological refurnishing promises a huge payoff: namely, the possibility of predicting a function straight out of a structure's activation, and vice versa. The degree of success of this search for predictive power "can be embodied operationally in a cost function that reflects the prediction error in going from function to structure and back again" (Price and Friston 2005: p. 272).
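One way to make the idea of such a cost function concrete is sketched below in Python. The sketch rests on assumptions that are not Price and Friston's: a 0-1 loss, a predictor that always guesses the most frequent associate, and an unweighted sum of the two directions. The names `prediction_error` and `round_trip_cost` are hypothetical.

```python
from collections import Counter, defaultdict


def prediction_error(pairs):
    """Share of (cue, target) pairs missed when each cue predicts its most frequent target."""
    if not pairs:
        raise ValueError("need at least one observation")
    targets_by_cue = defaultdict(Counter)
    for cue, target in pairs:
        targets_by_cue[cue][target] += 1
    hits = sum(max(counts.values()) for counts in targets_by_cue.values())
    return 1 - hits / len(pairs)


def round_trip_cost(observations):
    """Sum the error from function to structure and from structure back to function."""
    forward = prediction_error(observations)
    backward = prediction_error([(s, f) for f, s in observations])
    return forward + backward


# Toy observations (function, structure), loosely modeled on the fusiform example.
MULTIFUNCTIONAL = [
    ("word perception", "left posterior lateral fusiform"),
    ("animal attribute perception", "left posterior lateral fusiform"),
    ("tactile-visual interface", "left posterior lateral fusiform"),
]

print(round(round_trip_cost(MULTIFUNCTIONAL), 3))  # 0.667
```

Here the structure is perfectly predictable from each function, but the function cannot be recovered from the structure, so the cost is above zero. A one-to-one mapping would score zero in both directions.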
For instance, some of the threats posed by (a-c) might be addressed by renegotiating the neural ontology. To begin with, a putative multifunctional structure may sometimes be subdivided into smaller structures, each with different functions. So, for instance, the seemingly multifunctional insula turns out to be decomposable into four distinct sub-regions, each one with more specific functions (Uddin et al. 2017: p. 301; McCaffrey 2015). Also, Broca's Area has been reconceptualized as consisting of two structures: "one frontotemporal language-selective network and a second that belongs to the domain-general frontoparietal [multiple-demand] network" (Fedorenko and Blank 2020: p. 270), whose distinction was largely overlooked in group studies due to their anatomical proximity (Z. Ward 2019).
Reforms of the neural ontology can also go the other way, i.e. clustering structures. Indeed, in recent times, to address the fact that virtually every function correlates with the activity of multiple and scattered brain areas, the focus of neural ontology has progressively shifted from regions to networks, where 'network' is often simply regarded as a set of brain areas (Klein 2012). More recently, De Brigard (2017) suggested that topological properties, as opposed to localization, are the relevant properties to individuate structures (and to link them to functions). However, for the purpose of this paper, we are mainly interested in a strategy for reforming cognitive ontology, which I discuss in the next sub-section.

Functional Abstraction and Neural Reuse
How to characterize the function of a neural structure has been a hot topic in many philosophical and scientific discussions. In many contexts, a distinction has been proposed between (a) functional labels that are closer to folk psychological terms, that represent a specific kind of psychological information, or that are task/domain-bounded, and transparent with regard to the behavioral outcome, e.g. "reading"; and (b) domain-neutral functional labels indicating the operation/computation a neural structure performs, irrespective of the outcome it subserves, e.g. "normalization" (Carandini and Heeger 2012). Despite their similarities, most notions introduced to account for the multi-layered nature of functional ascriptions carry some theoretical baggage with them, which makes them not perfectly overlapping. Thus, to avoid committing to any theory-laden lexicon, in the remainder of the paper I will employ the rather uncommitted notions 'surface-functions' and 'deep-functions' to refer to (a) and (b), respectively. The deliberate vagueness of these terms is meant to capture the heterogeneity of functional ascriptions among cognitive neuroscientists and philosophers. They are relational, contrastive terms, i.e. it only makes sense to speak of a surface-function in opposition to a deep(er)-function.
However interpreted, the distinction between surface-functions and deep-functions comes in handy for defenders of FS: confronted with evidence that a given structure gets activated across different functions, they often admit that multifunctionality obtains at the level of surface-functions, only to withdraw into defending FS at the level of deep-functions. In Burnston's (2016a) terms, they retreat from traditional absolutism to computational absolutism. The rationale of this strategy, which I dub functional abstraction, is expressed by Price and Friston: "structure-function relationships can be described at multiple levels […] Each level may be appropriate in a different context […] but it is more useful to label a region with a function that explains all patterns of activation" (2005: p. 268). Thus, they suggest that the surface-multifunctionality of the left posterior lateral fusiform is but the manifestation of a deep-monofunctionality, i.e. sensorimotor integration. Similarly, in the face of the surface-multifunctionality of Broca's Area, Tettamanti and Weniger (2006) propose a single underlying deep-function, i.e. hierarchical processing. The reader might recall that a similar strategy was invoked by the proponents of the meta-modal hypothesis at the beginning of this paper (§1).

Notice that, as this comparison abstracts away from many theoretical details, the terms in each column are not perfectly synonymous.
The distinction between surface-functions and deep-functions also allows preserving FS in the light of degeneracy. In the dual-mode reading example from the previous subsection, a broadly defined surface-function such as reading is carried out by means of two distinct (sets of) deep-functions, each one associated with its own neural machinery (Figdor 2010; Polger and Shapiro 2016).
The pervasiveness of surface-multifunctionality (nicely documented by Anderson et al. 2013) is predicted by, and in fact inspired, several theories that fall under the umbrella-term neural reuse (Anderson 2010). Neural reuse boils down to the claim that complex behaviors do not hinge upon modules sculpted by evolution, but rather emerge by repurposing the same old neural structures into playing new roles by gathering into different functional allegiances (Anderson 2014).
Notably, while Anderson used to present neural reuse in terms of surface-multifunctional structures that nonetheless may preserve the same deep-function, he later became skeptical that FS can be preserved even at the level of deep-functions (Anderson 2014; for a discussion see Zerilli 2019). In his most recent views, not even the deep-functions of a neural entity are strictly predetermined: the fact that some neural structures tend to play the same (both deep- and surface-)functional roles across individuals depends upon their functional biases, a weaker, dispositional counterpart of FS, which may nonetheless end up being differently exploited depending on ontogeny. In a similar vein, the current neuroscience literature increasingly recognizes that the etiology of cortical functions depends upon an interplay between intrinsic facts pertaining to cytoarchitectonics and connectivity and the experiences that sculpt its plasticity (see for instance Spunt and Adolphs 2017; de Beeck, Pillet and Ritchie 2019).

Contextualism
By shifting the focus from surface-functions to deep-functions, i.e. analyzing the inner workings of some neural structures irrespective of the behavior they contribute to, the functional import of their structural properties is likely to become more transparent, eventually to the point that we may bridge the mind-brain gap and predict functions out of structures, as was hoped by Price and Friston (2005; cf. Rathkopf 2013). However, a perfect knowledge of deep-functions will not suffice for cognitive neuroscience.
To see why, recall Price and Friston's functional abstraction of the left posterior lateral fusiform: the three surface-functions whose involvement they report (perception of words, of animal parts, and tactile-visual interface) are subsumed under the deep-functional label "sensorimotor integration". However, such an abstract functional description is hardly specific to a given structure: apart from those neural regions most proximal to sensory receptors or motor effectors, such a description applies to the whole cortex. And even if from the activity of the left posterior lateral fusiform we were to infer that some sensorimotor integration is occurring, this would leave us clueless about what this means at the level of behavior: is the subject reading words? Or is she at the zoo, focusing on some animal parts? Or perhaps she is engaged in some new task we did not expect? We cannot tell (for an extensive discussion, see Klein 2012; Burnston 2016a,b). While brain anatomy and physiology may also be interesting regardless of their contribution to behavior (Haueis 2014), cognitive neuroscience seeks to establish how they contribute to behavior. Deep-functions alone are not enough to predict behavior: surface-functions are also needed. We need them for planning and interpreting experiments, which are often made of tasks; and, a fortiori, to understand the import of cognitive neuroscience in everyday life. So, rather than aiming for deep-function labels that account for the activation of some structure across different contexts, several scholars have argued for context-sensitive function-structure mappings.
Contextualism was first introduced in philosophical debates by Klein (2012), though its rationale has been defined and defended at length by Burnston (2016b). Roughly speaking, contextualism boils down to a simple recipe: whenever one variable, i.e. a given function or structure, is not enough to predict the other, further variable(s), i.e. context(s), can be added (Fig. 3).
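The recipe can be illustrated with a minimal Python sketch. The observations, the context labels, and the helper name `ambiguous_keys` are hypothetical choices made for illustration only; the sketch merely shows that a structure which is ambiguous when taken alone can become unambiguous once it is paired with a context.

```python
from collections import defaultdict

# Toy observations (hypothetical labels): (structure, context, surface-function).
OBSERVATIONS = [
    ("MT", "motion cues available", "motion processing"),
    ("MT", "depth cues available", "depth processing"),
    ("V1", "sighted subject, visual stimulus", "visual processing"),
    ("V1", "blind subject, Braille reading", "tactile reading"),
]


def ambiguous_keys(observations, use_context):
    """Return the keys that are associated with more than one surface-function."""
    functions = defaultdict(set)
    for structure, context, function in observations:
        # Index by structure alone, or by the (structure, context) pair.
        key = (structure, context) if use_context else structure
        functions[key].add(function)
    return sorted(key for key, fs in functions.items() if len(fs) > 1)


print(ambiguous_keys(OBSERVATIONS, use_context=False))  # ['MT', 'V1']
print(ambiguous_keys(OBSERVATIONS, use_context=True))   # []
```

Whether a given notion of context actually removes the ambiguity in real data is, of course, an empirical matter that this toy example does not settle.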

Fig. 3 Contextualism allows us to disambiguate many-to-many (surface-)function-structure mappings by reducing them to one-to-one mappings indexed to a specific context

According to Burnston (2016a), computational absolutism (i.e., the quest for establishing a single deep-function for each brain area) owes its charm to ambitious promises of generalizability and projectability. Supported by a thorough discussion of the context-dependent sensitivity of the MT area (which responds either to motion, depth, or both, according to the availability of the stimuli), Burnston (2016a,b) argues that such promises are empty. While absolutists regard the ascription of an open-ended conjunction of possible (surface-)functions to an area as a bug, Burnston suggests embracing it as a feature of the mapping endeavor. In the framework that he sketches, known functional ascriptions are still generalizable in the sense that they guide the generation of hypotheses about unknown functions. But "the level of generality of each of the conjuncts in the theory is precisely a matter for empirical investigation-discovering the limits of generalizability for a certain conjunct is just as important an advance as discovering that it holds in many instances" (Burnston 2016b: pp. 546-547).
A few notions of context have been proposed in the philosophical literature. To assess their usefulness, at least in principle, one may adopt a revised version of the criterion put forward by Price and Friston (2005: p. 272): "that function should be predicted from anatomical activation and conversely that anatomical activation should predict [surface-]function [given a certain context]. This can be embodied operationally in a cost function that reflects the prediction error in going from [surface-]function [plus context] to structure and back again."
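To make the revised criterion more tangible, here is a minimal sketch of what such a cost function could look like. Everything in it is an assumption made for illustration: the toy observations, the labels, and the choice of a simple majority-vote predictor scored on the same data it was fitted to. Price and Friston do not specify this (or any particular) implementation.

```python
from collections import Counter, defaultdict

# Hypothetical toy observations: (surface_function, context, structure).
# Labels and frequencies are illustrative assumptions, not empirical data.
OBSERVATIONS = [
    ("face recognition", "non-expert", "FFA"),
    ("face recognition", "expert radiologist", "FFA"),
    ("face recognition", "expert radiologist", "FFA"),
    ("x-ray reading", "expert radiologist", "FFA"),
    ("x-ray reading", "non-expert", "other ventral cortex"),
    ("x-ray reading", "non-expert", "other ventral cortex"),
    ("word reading", "non-expert", "left fusiform"),
    ("word reading", "expert radiologist", "left fusiform"),
]


def majority_rule(pairs):
    """Map each key to the value most frequently observed with it."""
    votes = defaultdict(Counter)
    for key, value in pairs:
        votes[key][value] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


def error_rate(pairs):
    """Fraction of pairs whose value differs from the majority value of its key."""
    rule = majority_rule(pairs)
    return sum(rule[key] != value for key, value in pairs) / len(pairs)


def round_trip_cost(observations, use_context):
    """Error from function (plus context) to structure, plus the error back again."""
    forward = [((f, c) if use_context else f, s) for f, c, s in observations]
    backward = [((s, c) if use_context else s, f) for f, c, s in observations]
    return error_rate(forward) + error_rate(backward)


print(round_trip_cost(OBSERVATIONS, use_context=False))
print(round_trip_cost(OBSERVATIONS, use_context=True))
```

On these toy data, indexing the mapping to context lowers the cost without bringing it to zero: the activation of the same structure in an expert remains ambiguous between two surface-functions.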
Building on the work of McIntosh (2004), Klein introduced the notion of neural context, defined as "the overall network in which a region is participating" (2012: p. 957). As a matter of fact, an area can be activated while performing various sorts of tasks, to the point that the ongoing task is not predictable upon the observation of the area's activity in isolation. Yet, the relevant area is accompanied by different sets of areas in different cases. Actually, even the very same set of areas may implement different surface-functions, due to changes in connection strength between regions, or due to their fine-grained temporal sequence of activation, which is unlikely to be revealed by fMRI scans alone (Pessoa 2014). In a similar spirit, Viola and Zanin (2017) proposed that some physiological features of neural structures' activity, such as their oscillatory frequency, may help discriminate their current (surface-, but possibly even deep-)function. This idea is further developed by Burnston (2019: p. 12), who painstakingly describes how "physiological activity within a given brain unit - cell, population of cells, or brain area - is not dependent solely on its inputs, but is modulated by background variables" such as the Local Field Potential. Khalidi (2017) argued that function-structure relationships should also consider non-neural variables concerning the environmental-etiological context. His strategy would allow us to preserve some folk psychological distinctions even when they do not map neatly onto dedicated neural substrates. For instance, given the striking similarity in terms of neural activity between remembering the past and imagining it (or the future), a conflation between memory and imagination has been proposed. However, doing so would prevent us from distinguishing correct memories from incorrect ones (as you cannot misimagine something).
To avoid this issue, Khalidi claims that knowledge of distal etiological context must come into play when individuating (surface-)functions. Namely, when the recalled trace and some actual mental representation held in the subject's past are congruent, we speak of memory. Otherwise we speak of imagination.

Population as a context for function-structure mapping
In the light of the discussion in the previous sub-section, I can now state my theoretical commitments with regard to the debate on function-structure mapping. In general, I endorse contextualism: I do not think that cognitive neuroscience should (nor could) do without surface-functions. Quite the contrary, I think that cognitive neuroscientists should mainly care about the surface-functional properties of neural structures. However, I also think that the notion of deep-function has some merits, with the caveat that it should not be pursued as an "ultimate functional description" of a given neural structure, but, more deflationarily, as a heuristic device to unravel new surface-functions. Conversely, discovering new surface-functions may lead to revisions in deep-functional ascriptions (see §5.1).
That being said, while all the notions of context presented in the previous sub-section may prove useful to bridge structures and surface-functions despite multifunctionality and degeneracy, none of them addresses the very challenge I am addressing in this paper: namely, that different neural structures can play different surface-functions (e.g. processing visual information or subserving echo-location) in some individuals, but not in others.
Previously, individual differences in function-structure mapping were invoked to justify skepticism about the very idea of cognitive neuroscience. For instance, inspired by Karl Lashley's seminal work on cortical plasticity, Jerry Fodor remarkably claimed that "[it] is entirely possible that the nervous system of higher organisms characteristically achieves a given psychological end by a wide variety of neurological means. If so, then the attempt to pair neurological structures with psychological functions is foredoomed" (1974: p. 105).
However, pace Fodor, rejecting the (quasi-)universal ambitions of the Platonic Brain Model does not foredoom neuroscience to turn into a mere collection of idiosyncratic token-correspondences about the brain and mind of single individuals, blocking any possible generalizations. Patterned function-structure mappings can provide a middle ground between the Platonic Brain Model and Fodor's aporia. Similarly to Kim (1992), who proposed to split "jade" into the two categories "jadeite" and "nephrite", I suggest sorting individuals into distinct neuroscientifically relevant populations based on some factors that drive them to share the same functional organization. To be fair to Caramazza, he himself envisaged the possibility "to scientifically investigate some domain of natural phenomena where we could not make the assumption of universality (e.g., if there were several kinds of human minds)" (1986: 49, fn. 3). However, unlike jadeite and nephrite, human brains cannot be sorted into natural kinds based on a single, underlying essence. Quite the contrary, membership in some population is due to the joint work of several factors steering functional organization. As such, the same individual may belong to several populations at once, and their membership can change over time (see below in §5.2).
Before proposing some criteria for sorting subjects into populations, and to stress why this proposal is needed, in the next section (§4) I discuss the case of the so-called "Fusiform Face Area".

A dedicated module for face perception
Going somewhat against the tide of contemporary emphasis on distributed processing (e.g. Pessoa 2014), Nancy Kanwisher (e.g. 2010, 2017) strenuously defends the existence of cortices functionally specialized for some category of stimuli. Among them, the most notable is the so-called Fusiform Face Area: a portion of the right fusiform gyrus (in the temporal lobe) specialized in processing faces so as to determine their identities.
The hypothesis that a dedicated cognitive system exists for recognizing faces (but not other objects) circulated in psychology way before neuroscience kicked in. Early findings showed that presenting a stimulus upside-down impairs the recognition of faces far more than other objects (Yin 1969). At the same time, the identity of two juxtaposed half-faces is harder to recognize when the two halves are aligned than when they are misaligned. Interestingly, the recognition of the two halves is better when these composite faces are presented upside-down (Young, Hellawell and Hay 1987). Bodamer (1947) reported a specific syndrome in which face perception is selectively impaired, dubbing it prosopagnosia. In the following decades, double dissociations have been reported between (rare) patients who suffer only from impairment in face recognition (e.g. Rossion et al. 2003; Riddoch et al. 2008) and patients with preserved face recognition, who are nonetheless impaired with other objects (Rumiati and Humphreys 1997). However, prosopagnosic subjects present the same recognition accuracy when faces are presented upright or upside-down (Busigny and Rossion 2010). These and other findings support the idea that, in most cases, face recognition is based on holistic processing, i.e., on the relative positions of the elements of a face rather than on the detailed analysis of specific facial details.
Enter neuroscience. Inspired by evidence of a specific module for face processing, some laboratories undertook the quest for localizing it. The most notable result was achieved by Kanwisher and colleagues (1997). With an fMRI study, they found that a portion of the occipitotemporal cortex, in the right fusiform gyrus, was consistently more responsive to stimuli depicting faces than to other objects, including scrambled faces. 10 Kanwisher's team thus claimed that this area, which they dubbed FFA, is selective for processing faces, and more specifically for the holistic processing of faces, as suggested by its lesser activation with scrambled faces.
Their hypothesis was supported by other evidence. For instance, FFA's activation in subjects suffering from prosopagnosia differs from that of controls (e.g. Hadjikhani and de Gelder 2002). And acquired prosopagnosia often depends upon lesions in the FFA. Yet, in some patients FFA is spared, and the lesion is instead located in the left frontal cortex (Cohen et al. 2019). Does this speak against the one-to-one mapping between face processing and FFA?

Is face processing distributed?
Contrary to Kanwisher, some scholars held that face processing should not be narrowly localized, as it rather depends on a distributed processing of several neural structures spread across the ventral temporal cortex. In what became the proof of concept for MultiVariate Pattern Analysis, Haxby and colleagues (2001) exposed subjects to visual stimuli from six categories (including faces) during an fMRI scan. Based on a wide, non-localized activation pattern corresponding to each category from half of their dataset, they predicted the category of the stimulus of the unlabeled activations for the other half of the dataset with astonishing accuracy (> 90%). Crucially, when the prediction was repeated leaving out the voxels that were maximally responsive to each category, accuracy dropped only very slightly. The following moral was drawn: the representation of object categories (including faces) depends on a widespread coding and not on the activity of a small area.
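The logic of this split-half analysis can be sketched in a few lines. What follows is a simplified illustration on synthetic data, not a reconstruction of Haxby and colleagues' pipeline: the pattern generator, the noise level, the number of voxels, and the function names are all assumptions of mine, and the exclusion rule (dropping, for each tested category, the voxels that respond maximally to it in the first half) only approximates their procedure.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


def simulate_halves(n_categories=6, n_voxels=40, noise=0.5, seed=0):
    """Two noisy measurements of one distributed pattern per category (synthetic)."""
    rng = random.Random(seed)
    true = [[rng.gauss(0, 1) for _ in range(n_voxels)] for _ in range(n_categories)]

    def measure():
        return [[v + rng.gauss(0, noise) for v in pattern] for pattern in true]

    return measure(), measure()


def split_half_accuracy(train, test, drop_preferred=False):
    """Label each test pattern with the best-correlated training pattern."""
    n_categories, n_voxels = len(train), len(train[0])
    # Category to which each voxel responds maximally in the training half.
    preferred = [max(range(n_categories), key=lambda c: train[c][v])
                 for v in range(n_voxels)]
    correct = 0
    for c, pattern in enumerate(test):
        # Optionally ignore the voxels that are most responsive to category c.
        keep = [v for v in range(n_voxels)
                if not (drop_preferred and preferred[v] == c)]
        scores = [pearson([train[k][v] for v in keep], [pattern[v] for v in keep])
                  for k in range(n_categories)]
        correct += scores.index(max(scores)) == c
    return correct / n_categories


first_half, second_half = simulate_halves()
print(split_half_accuracy(first_half, second_half))
print(split_half_accuracy(first_half, second_half, drop_preferred=True))
```

Whenever the category information is spread across many voxels, as it is by construction in this simulation, removing the maximally responsive ones leaves enough signal for classification. Whether real cortical codes behave this way is, of course, the empirical question at stake.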
While Kanwisher grants that face recognition, like any cognitive task, may depend upon the joint activation of several areas, she also vindicates FFA's centrality by employing a "correlation is not causation" argument. In fact, since BOLD fMRI only provides indirect measures of brain activity, the signal employed by researchers (or by classifiers) to infer mental states from neural activations is not necessarily the same signal playing a causal role in the neural implementation of that mental state. Her argument is further supported by the finding that, through direct cortical stimulation of FFA, a neurosurgical patient reported seeing illusory faces on top of ordinary objects (Schalk et al. 2017).
In the next subsection, I examine some evidence showing that FFA has multiple functions in different subjects. Besides highlighting the need for a population-based contextualism (§5.1), this discussion will also suggest some criteria for sorting subjects into populations (§5.2).

A Fusiform "Flexible" Area?
While not denying that FFA is crucial for face recognition, the advocates of the so-called expertise hypothesis argue that this functional description is but the tip, however big, of an iceberg. In my lexicon, the hypothesis claims that FFA has a deep-function that is holistic processing of familiar stimuli. This deep-function is typically put at the service of face recognition because faces are a kind of stimulus with which most human beings are familiar. Still, the same deep-function could be recruited for other surface-functions. Hence, they propose to preserve the acronym "FFA", but to reinterpret it as "Fusiform Flexible Area" (Tarr and Gauthier 2000). The higher levels of accuracy registered in recognizing faces from one's own ethnicity and age may lend further support to this hypothesis (Gross 2009), and so would the recent claim that, after all, most people are good at recognizing only some faces, namely those they are familiar with (Young and Burton 2017).
During the last two decades, different research groups seeking confirmation for the expertise hypothesis designed experiments to demonstrate the involvement of FFA in other stimulus domains. Gauthier and colleagues (1999) trained some subjects to recognize some artificial objects called greebles. fMRI scans of both trained and untrained participants during a greeble recognition task revealed greater FFA activity in the trained participants. Interestingly, this engagement vanished when greebles were presented upside down, similarly to what happens with faces. While greebles proved to be scarcely significant due to their similarity to faces, other categories of stimuli have also been reported to undergo a distinctive (holistic) processing style in experts of some domain, yielding more accurate and faster recognition, and to activate FFA. These domains include cars and birds (Gauthier et al. 2000; Xu 2005); regular chess games, but not isolated chess pieces (Bilalić et al. 2011); and radiological images (Bilalić et al. 2014).
However suggestive, findings of the involvement of FFA in new surface-functions do not establish that its putative FS for faces was just a matter of expertise. Further support for this claim comes from Arcaro et al. (2017). In their study, some rhesus monkeys were raised in a face-deprived environment from birth. After 2 months, these monkeys exhibited neither the same face-centric gaze fixation patterns nor the same cortical activation patterns as controls.
Yet, still other studies suggest that the special status of faces, as well as their privileged relation with FFA, may be at least partially independent from visual experience, and instead depend upon the anatomy of FFA and top-down connections with cortices involved in social cognition (Powell, Kosakowski and Saxe 2018; Kamps et al. 2020). 11 So, is the selectivity for faces a surface-function that simply exploits FFA's deep-functional predisposition for holistic processing? Or are faces perhaps so tightly tied to that spot of cortex that they themselves influence its deep-functional workings, making it suitable for being reused in recognizing other stimuli such as chess games? As the debate is ongoing, it is prudent to stay agnostic on this question. In either case, it is undisputed that FFA is sometimes reused for processing objects other than faces. But this reuse is conditional on learning. In other words: the FFA becomes a surface-multifunctional area in some subjects with an expertise in certain domains, while in others it seemingly remains surface-monofunctionally attuned to faces. In the next section, I will leverage this observation to argue for a shift from a Platonic function-structure mapping framework toward a population-bounded contextualism.

11 For instance, Buiatti and colleagues (2019) employed EEG specifically adapted for testing 1- to 4-day-old human newborns. They found increased activity in newborns' temporooccipital cortex (their "proto-FFA") for upright face-like stimuli when compared to inverted or scrambled face-like stimuli. Moreover, van den Hurk, Van Baelen and de Beeck (2017) report that both visual and auditory stimuli of "facial actions" (e.g. kissing or whistling) activate similar areas in the ventral-temporal cortex (where the FFA stands) of sighted individuals, and that auditory face stimuli result in bilateral fusiform activations in congenitally blind

The limits of the Platonic Brain Model
Recall: one aim of a Platonic neuroscience is that of finding "a normal (default) function-structure mapping" (Henson 2005: p. 225) of the utmost generalizability. Within that framework, it seems safe to claim that the FFA is functionally specialized in processing faces. The fact that the FFA-face processing mapping could fail to obtain in congenital prosopagnosic subjects is not a nuisance for the Platonic Brain Model. After all, the model is meant to generalize to healthy adults, and prosopagnosic subjects hardly count as such.
However, the Platonic Brain Model is committed to overlooking the latent surface-multifunctionality of FFA, as it only shows up in certain individuals, i.e. expert perceivers of some category of objects like cars or radiological images.
A defender of the Platonic Brain Model may well respond that these are just idle details about surface-functions, and rather settle for specifying the deep-function of each neural structure. Thus, since neuroscientists indeed came up with a relatively solid deep-functional characterization of FFA as an area involved in holistic shape processing, why bother specifying to which domains this processing applies?
Now, while legitimate, such a reply comes with a non-negligible implication: namely, that the Platonic Brain Model becomes rather uninformative of behavior. As discussed in §3.3, deep-functions underdetermine behavior. Recall that according to Price and Friston (2005: p. 272), the hallmark of a good ontology is that "function should be predicted from anatomical activation and conversely that anatomical activation should predict function". If we take "function" to include surface-function, we can observe that, by knowing that someone's FFA is getting relevantly activated, you cannot infer whether she is just looking at some face or she is e.g. a veteran radiologist trying to diagnose some lung infection. However, we have seen (in §3.3) that contextual information may ameliorate the lack of robustness of reverse inference from structures to surface-functions. In this case, the relevant contextual information is the expertise of the subject in some specific visual domains: by knowing that she is not an expert radiologist, for instance, we may exclude that the activation of FFA in her case underlies the observation of radiological images.

Footnote 11 continued: individuals. In a recent preprint (Ratan Murty and colleagues manuscript), Kanwisher's team describes an experiment in which they found that, in congenitally blind subjects, the same area in the fusiform gyrus responds to both auditory facial stimuli (inspired by van den Hurk and colleagues' study) and haptic stimuli (3D-printed faces).
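One way to picture how population membership sharpens the reverse inference discussed above is to treat it as an application of Bayes' rule, with priors and likelihoods indexed to a population. The sketch below is only an illustration under that assumption: the task labels and all the probabilities are made-up placeholders, not estimates drawn from any study.

```python
def posterior(prior, likelihood):
    """Bayes' rule: P(task | activation) from P(task) and P(activation | task)."""
    joint = {task: prior[task] * likelihood[task] for task in prior}
    total = sum(joint.values())
    return {task: p / total for task, p in joint.items()}


# Hypothetical P(task) for a member of each population.
PRIOR = {
    "expert radiologist": {"viewing faces": 0.50, "reading x-rays": 0.30, "other": 0.20},
    "non-expert": {"viewing faces": 0.60, "reading x-rays": 0.05, "other": 0.35},
}

# Hypothetical P(FFA activation | task), again indexed to the population.
LIKELIHOOD = {
    "expert radiologist": {"viewing faces": 0.90, "reading x-rays": 0.80, "other": 0.10},
    "non-expert": {"viewing faces": 0.90, "reading x-rays": 0.10, "other": 0.10},
}

for population in PRIOR:
    inferred = posterior(PRIOR[population], LIKELIHOOD[population])
    print(population, {task: round(p, 3) for task, p in inferred.items()})
```

With these placeholder numbers, FFA activation in a non-expert points almost exclusively to face perception, whereas in an expert radiologist a substantial share of the probability goes to the reading of radiological images.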
While knowing about a structure's deep-functions is insufficient to predict its contribution to behavior, deep-functional ascriptions can play an important heuristic role. Indeed, when investigating FFA's involvement in expert perception beyond faces, researchers did expect the same deep-functional style of processing, that is, configural processing. To verify this, they checked for inversion effects in the perception of greebles (Gauthier et al. 1999) or radiological images (Bilalić et al. 2014). Based on such expectations about the deep-function of FFA, future investigations about its possible redeployment in recognizing objects from other domains will likely privilege classes of objects which can be discriminated by their overall shape rather than classes of objects that can be discriminated by paying attention to some details.
Conversely, knowledge of possible surface-functions may help to understand the deep-functions an area may play. In localizing a portion of cortex selective for Pokémon across various former players, Gomez and colleagues (2019) were not merely checking an oddity. Rather, they were wondering what deep-functional semantic property best accounted for the surface-functional stimulus specificity that portion of cortex could acquire: animacy, rectilinearity, real-world size, or eccentricity of the stimuli (i.e. whether they are typically perceived in foveal or in peripheral vision). By comparing subjects' response to Pokémon characters with other classes of objects like faces, animals, body parts, and corridors, they argued in favor of eccentricity (but see de Beeck et al. 2019: p. 794).
Other than across-subject (surface-)multifunctionality, the Platonic Brain Model is poorly equipped to account for across-subject degeneracy. Some cognitive tasks, especially those involving complex higher-cognitive faculties, admit multiple solutions: that is, subjects can solve them by leveraging alternative sets of deep-functions, resulting in the involvement of different neural circuits. Miller and colleagues (2002) report wide across-subject differences in patterns of fMRI activation of six subjects performing a retrieval task. However, they also noted a good across-time stability within the same individual, suggesting these idiosyncrasies should not be dismissed as noise. Since these idiosyncratic activations might be overlooked by pooling the activations of different subjects, they warn that "group analysis alone, particularly for higher order cognitions like episodic retrieval, may be incomplete and, in some cases, misleading" (Miller et al. 2002: 1211). Accepting that individuals accomplish tasks in different ways does not imply that no generalizability is possible. On the contrary, in a later study Miller and colleagues (2012) try to interpret this variety of activations as differences in cognitive style (i.e. whether people are mainly visualizers or verbalizers) and encoding strategies, measured with some questionnaires. They found that both factors accounted for a significant part of the variance in the similarity of brain activity between individuals. In a similar vein, Noppeney and colleagues (2006) had 17 subjects perform a verbal decision task about some auditory stimulus (spoken word or sound), primed by either congruent or incongruent visual stimuli (written word or picture). Besides some regions commonly activated across all subjects, the experimenters were able to distinguish two subgroups, based on the set of further regions activated beside the common ones. They speculate on the deep-functional meaning of different activations, although unfortunately, unlike Miller and colleagues (2012), they did not verify them with independent measures such as questionnaires.

Population of subjects as a context for function-structure mappings
According to Burnston, within a contextualist approach toward function-structure mapping, "discovering the limits of generalizability for a certain conjunct is just as important an advance as discovering that it holds in many instances" (Burnston 2016b: p. 547). Thus far, the allure of the Platonic Brain Model may have misled neuroscientists. They could either have overlooked some aspects of some structure's deep-function or failed to appreciate some of its unusual surface-functions. For instance, comparisons between sighted and blind subjects have made it possible to reveal supramodal deep-functions in areas traditionally (mis)conceived as merely visual, as well as to reveal further surface-functions that they may acquire due to plastic reorganization (Pascual-Leone and Hamilton 2001; Ricciardi and Pietrini 2011; Cecchetti et al. 2016; Bedny 2017). Relatedly, Simons, Shoda and Lindsay (2017) suggest that, to mitigate the dreaded replication crisis in psychological science, a section titled "Constraints on generality" may be added to research papers. Here researchers should answer questions concerning the generalizability they expect for their findings, including: "to which populations do you expect those findings to apply?" To answer this question, and more generally to develop a proper contextualist function-structure mapping that still admits some generalizability, criteria must be found for sorting subjects into different populations. The carvings of such populations are not self-evident. Rather, "the definition of a reference population is theory-laden" (Caramazza 1986: p. 51). So, for instance, blind subjects may be a population if we are studying vision, but other criteria will possibly drive population-clustering when we turn to studying, for instance, smell. Moreover, Hochstein (2016) reminds us that, like the rest of science, the sciences of the mind have many purposes, and that the best categorization of the mental depends on what purpose one is pursuing.
The same point holds for population-based contextualism about function-structure mappings. For instance, neurosurgeons are happy to sacrifice generalizability for the sake of accuracy, which is why, before a surgical intervention, the causal role of some specific spots of cortex is often assessed by means of direct electrical stimulation (Duffau 2017). To them, the ideal population is composed of only one individual (at a given time).
For researchers who want to retain some generality, developing an ontology of neuroscientifically relevant populations will require a significant amount of conceptual and empirical effort. Factors predictive of a certain functional organization of the brain must be identified, and their boundaries and mutual interactions scrutinized, in the light of certain aims and questions. Subjects will simultaneously belong to multiple populations, and each of these memberships is likely of interest for only a few function-structure mappings. To complicate things, we have reason to expect that these memberships will interact non-linearly, as surface-functions often compete for neural real estate: developmental studies show that the FFA's preference for faces becomes lateralized in the right hemisphere after, and possibly because of, the corresponding left-hemisphere region establishing its preference for words (Dundas, Plaut and Behrmann 2013; Dehaene-Lambertz, Monzalvo and Dehaene 2018).
That being said, I can sketch with broad strokes the contour of at least one kind of population, for which I propose the following working definition:

[Expertise-based Population] Structure S playing the deep-function F is recruited for some processing related to the surface-functional domain D in F-way if and only if a subject is an expert in D.
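The working definition above, together with the earlier remark that the same subject can belong to several populations at once, can be rendered as a small sketch. The class, the function names, and the example subjects are hypothetical, and the crisp set membership used here idealizes away the vagueness of the boundary between experts and non-experts.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    name: str
    expert_domains: set = field(default_factory=set)


def recruits(subject, domain):
    """The biconditional of the working definition: structure S is recruited
    for domain D (in F-way) if and only if the subject is an expert in D."""
    return domain in subject.expert_domains


def expertise_populations(subjects):
    """Group subjects by domain of expertise; a subject may fall in several groups."""
    groups = {}
    for subject in subjects:
        for domain in subject.expert_domains:
            groups.setdefault(domain, set()).add(subject.name)
    return groups


# Hypothetical subjects, for illustration only.
anna = Subject("Anna", {"faces", "radiological images"})
ben = Subject("Ben", {"faces"})

print(recruits(ben, "radiological images"))
print(expertise_populations([anna, ben]))
```

Since expertise is acquired, updating a subject's set of domains over time is also a simple way to represent the fact that membership in a population is not fixed once and for all.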
This definition captures the empirical insight that, in some cases, prolonged experience with some kinds of stimuli or cognitive activities does not only yield quantitative changes in brain anatomy (as in Maguire et al. 2000). Rather, when the relevant amount of expertise is achieved, expert performance may differ not only quantitatively, e.g. because it engages neural structures more or less intensively (which can be easily accounted for by the Platonic Brain Model), but also qualitatively. When this is the case, the same task may recruit different deep-functions and hence different neural structures (Roepstorff, Niewöhner, and Beck 2010; Guida et al. 2012; Bilalić 2017). The transition between non-experts and experts may occur smoothly, and sharp boundaries may thus be hard to establish (Buckner 2016). But just like vague boundaries between brain areas do not discourage scientists from speaking about brain areas (Haueis 2012), neither should vague boundaries between experts and non-experts in a certain domain discourage them from speaking about expert populations.
Being an expert in the respective domains makes the difference for the FFA's involvement in holistic perception of objects like greebles, chess games, and radiological images. Moreover, if the expertise hypothesis is correct, expertise is also at the root of FFA's surface-functional specialization in face recognition. The same logic applies equally well to other cases such as echolocators and literate subjects: it is only thanks to the acquaintance with written letters that the left fusiform gyrus of literate subjects acquires the surface-function of reading. Moreover, a sufficiently liberal notion of 'expertise' can also account for the difference between the functional vocation of the (Platonically speaking) "visual" areas in sighted as opposed to non-sighted subjects, the former being conceived as 'experts' in the (broad) domain of vision.
Other kinds of populations may be individuated based on demographic (e.g. socio-economic status, ethnicity or gender) or clinical factors (e.g. suffering from a psychopathological disease).
Admittedly, "demographical factors" include disparate things, from Socioeconomic Status (SES) to sex. A review of the neural correlates of SES (Farah 2017) shows that, on the one hand, patterned differences can be found in the neural activity underlying certain tasks based on the subjects' SES, which are not only quantitative but also qualitative. However, on the other hand, Farah highlights that, despite being a proxy measure for predicting these differences, SES per se is not necessarily producing them. On the contrary, SES is a composite index that collapses many factors like stress and literacy, whose causal relationships with brain organization can follow multiple paths. Indirect measures such as SES can be at best proxies toward the causally relevant factors driving brain organization. In other words, it is likely that, at least in some cases, function-structure mappings with a nice correlation with some demographic factor turn out to correlate even better with some underlying factors that can be accounted for in terms of expertise. 12 One may think that this line of cautionary reasoning applies to certain factors, like SES, but not to others, like sex. After all, unlike SES, an eminently social phenomenon, sex is rooted in biology. Thus, we can perhaps hope to distinguish "the female brain" and "the male brain". This line of reasoning, however, fails to appreciate that belonging to a particular sex (and gender) has several consequences that span throughout the life history of an individual, and that are highly dependent on cultural factors. In simple terms, in virtually every society of human history, being born female or male makes it much more likely to routinely perform some activities rather than others. Thus, even if differences that are driven by genetic or hormonal factors exist, they are so tightly intertwined with culturally based learning-driven factors that distinguishing them is not straightforward (Jordan-Young and Rumiati 2012; Joel et al. 2015; but cf. Del Giudice 2019).
Finally, in the light of population contextualism, many clinical uses of functional neuroimaging can be seen as inferences from structure plus function to a certain clinical population. Similarly to the Platonic Brain Model's strategy of zeroing out the contextual variable of population by checking only "normal subjects", many clinical studies aim at zeroing out the possible confounds of an externally-driven function by using a resting state design (e.g. van den Heuvel et al. 2013). Defining and diagnosing psychopathological and neurological disorders presents several challenges (e.g., recall Caramazza's concern about etiology- versus symptom-based grouping), and relevant ethical implications, that cannot be addressed in this paper. However, abandoning the Platonic Brain Model may set the stage for a more fruitful discussion. To mention but one example, think of the recent claims that, at least in some cases, Asperger Syndrome is not a syndrome at all, but just neurodiversity, not unlike other cases of biological dimorphism (for a discussion, see Jaarsma and Welin 2012). Whatever one thinks about this claim, approaching it from a post-platonic framework will probably help to take it with the seriousness that it deserves, because such a framework does not prescribe a monolithic normativity about the brain.
invoked the necessity to develop an ontology of neuroscientifically relevant notions of populations. Finally, I suggested one promising (albeit not exhaustive) criterion for sorting populations, namely expertise-based population.
My plea is on a par with those coming from an increasing number of researchers who invite us to treat inter-subject diversity as signal, not as noise (e.g. Falk et al. 2013; Thiebaut de Schotten and Shallice 2017; Seghier and Price 2018; Clark Barrett 2020). The Platonic Brain Model has served us quite well in many cases, providing a baseline for many claims about brain organization. But as cognitive neuroscience is evolving into a mature science, replacing it with the more context-sensitive framework of a post-platonic neuroscience would pay off in terms of accuracy in predicting behavior. Moreover, on the grounds of philosophical anthropology, it would allow us to break free from the tethers of a concept of human nature largely predetermined by evolution (see also Rathkopf forthcoming), and to widen the scope of neuroscience from traditional behaviors toward peculiar ones. The time is ripe to look beyond what the ideal brain typically does, toward what real brains can do in their beautiful diversity.