1 Do we Think in Language?

Ever since Plato let the stranger in the dialogue Sophistes [263e] explain that thinking is a conversation of the soul with itself, speaking and thinking have been taken to be intimately intertwined (cf. Plato & Fowler 1921). One point of contention, however, was whether we think in a natural language, as Plato, for example, seemed to have assumed (Gacea 2019), or in a mental language that is not made up of natural language, as St. Augustine claimed (Meier-Oeser 2011). In modernity, St. Augustine’s assumption that we think in a lingua mentis was in turn challenged and the idea that thought occurs in a natural language was variously reconsidered; thinking in (natural) words was claimed to obviate the need to keep the associated ideas active in mind all the time.Footnote 1 In the 1970s, the idea of a mental language was also revived (Fodor 1975), but came under attack with the advent of non-symbolic, embodied (Shapiro 2011, 2014; Chemero 2011) and non-representationalist models of mind and cognition (cf., e.g., van Gelder 1995; Clark 2015; cf. the contributions in Smortchkova et al. 2020, for a critical discussion). More recently, renewed interest in inner speech as an internalized natural language can be observed. The debate owes much to Vygotsky’s exposition of the topic (cf., e.g., Vygotsky 1978, 1986, 1987, 1999). Another point of contention concerns the function inner speech is thought to serve. Is inner speech (just as, arguably, language in general) predominantly a means of making thoughts accessible (to whomever); is it merely a means to a communicative end? Or is inner speech a means (of thinking) in itself, perhaps even a means of enhancing our reasoning abilities?

In what follows, I will argue for the latter. To that end, I will introduce the phenomenon of inner speech by briefly comparing it to overt speech in terms of form and use, a comparison that is suggested by the (Neo-)Vygotskian approach adopted here. I will propose that inner speech be defined, not as is commonly done, by recourse to phenomenology but more ‘mechanistically’ and discuss ways in which it might provide us with cognitive benefits, some of which might even go beyond those provided by overt speech. I will then turn to the question of whether inner speaking is mainly a way of making thoughts somehow accessible, or whether it may be more crucially implicated in thinking. I will argue for the latter. And while others have done so before me (cf. for more recent suggestions, e.g., Deamer 2021; Gauker 2018; Roessler 2016; Vicente and Jorba 2019; Wilkinson 2020), I will advance a somewhat novel route to this conclusion. First, and drawing on the Vygotskian idea of inner speech being internalized overt speech, I will argue that inner speaking is, occasionally at least, thinking. Second, the ‘mechanistic’ definition will allow me to, conversely, argue that thinking is, occasionally, inner speaking. I will close by summarizing the main points.

2 Inner Speech and Overt Speech

Inner speech is a familiar phenomenon, even if people differ in how much inner speech they report (Hurlburt et al. 2013, p. 1483). Many of us experience episodes of inner speech whilst thinking about a theoretical problem, rehearsing the shopping list, preparing a talk or recalling the row we had with our partner last night. Moreover, the mental life of many a fictional character is revealed to us as we are made privy to the character’s inner monologues and dialogues.

However, the study of inner speech raises a host of methodological questions (cf. Alderson-Day and Fernyhough 2015 for a discussion). It also raises pressing conceptual questions. What exactly is inner speech? What other phenomena ought it to be distinguished from? One way of approaching these questions is by comparing inner speech to overt speech and examining the manner in which it is similar to and the extent to which it differs from overt speech.

It seems to be similar to overt, social speech in that it comes in similar forms and is amenable to similar distinctions. There is—phenomenologically speaking—inner monologue and inner dialogue; maybe inner polyphony as well. There is inner speaking and signing, which is accompanied by a sense of agency, and inner hearing or auditory imagery (Gauker 2018), where one experiences oneself as being passive. There is inner reading or rehearsing, and inner writing; there are even co-thought gestures (Chu and Kita 2016). And inner speech can be experienced as being more or less goal-directed. More fully, it seems to be in the service of

  • deliberation, clarification, planning, or problem-solving;

  • self-motivation (“You can do this”), self-regulation (“Don’t do this”), self-evaluation (“Good job”), and maybe even self-entertainment;

  • keeping something in mind, rehearsing something;

  • gauging the potential effects of an utterance on an imagined audience;

  • divergent, creative thinking (as in daydreaming or mind-wandering);

and it presumably fulfils other functions as well, many, if not all, of which can also be fulfilled by overt or private speech (the latter being a form of audible self-talk; cf. Diaz and Berg 2016; Winsler et al. 2009). From this angle, inner speech looks very much like a silent version of outer speech (Martínez-Manrique and Vicente 2010). Thus, one might wonder whether all these functions are equally served by inner and overt speech. If that were so, one could venture to guess that if we engage in silent self-talk, we do so merely for reasons of social etiquette, as audible self-talk is commonly frowned upon (cf. Hatzigeorgiadis et al. 2011 for a discussion of overt and covert self-talk in sports performance). If, on the other hand, some cognitive functions were better served by inner speech than by overt speech, this additional benefit would have to accrue from differences between inner and overt speech or from the manner in which inner speech transforms our cognitive infrastructure in ontogeny (or maybe did so in phylogeny).

And indeed, inner speech differs from overt speech, most notably in the extent to which it exhibits sensory qualities such as a particular tone, prosody or accent, or is accompanied by motor sensations such as a slight contraction of the muscles of lip or tongue (or of the hands in inner sign). Also, it may involve imagistic elements (Wiley 2016) or even be multimodal (Perrone-Bertolotti et al. 2014). According to the Vygotskian approach adopted here (Vygotsky 1986), language, while being acquired in social interaction, becomes internalized in the course of development, first morphing into private speech, which is still audible but no longer other-directed, and finally turning into (inaudible) inner speech. Importantly, we internalize all kinds of social (or social-linguistic) practices, repurposing them as “means of individual psychological organization” (Vygotsky and Luria 1994, p. 138). Moreover, according to Vygotsky, speech is transformed in the process of internalization, becoming truncated or “incomplete” (Vygotsky 1986, p. 235). It is thus a well-rehearsed point in the literature by now that inner speech can be more or less condensed or expanded (Fernyhough 2004).Footnote 2 Often, inner speech is experienced as being in a more telegraphic style and lacking phonological or articulatory detail.Footnote 3 As Vygotsky put it: “in inner speech words die as they bring forth thought. Inner speech is to a large extent thinking in pure meanings. It is a dynamic, shifting unstable thing, fluttering between word and thought […]” (Vygotsky 1986, p. 249).

3 Towards a Definition

How to best define inner speech, then? Broader and narrower definitions seem possible. Some authors highlight phenomenological experience and claim that “[i]nner speech can be defined as the subjective experience of language in the absence of overt and audible articulation” (Alderson-Day and Fernyhough 2015, p. 931). And while this definition is tailor-made to capture consciously experienced episodes of inner speech, one might also define inner speaking more ‘mechanistically’ (cf. Kompa and Mueller 2020), less phenomenologically. Focusing on inner speaking (i.e., inner speech acts)—as opposed to inner reading, writing, or listening—I propose that inner speaking be defined as an inner episode that substantially engages the speech production system.

What does it take to substantially engage the speech production system, then? According to the influential model of speech production developed by Levelt and colleagues, speech production involves four levels of processing: “the activation of lexical concepts, the selection of lemmas, the morphological and phonological encoding of a word in its prosodic context, and, finally, the word’s phonetic encoding” (Levelt et al. 1999, p. 2). More precisely, the speaker has to first select a lexical concept in light of their communicative intent or goal (Levelt 1989). As a particular object or event can be referred to differently, this involves perspective taking, i.e. selecting the lexical concept that will best serve the communicative goal in light of the doxastic state of the audience (Levelt et al. 1999). In a next step, the stage of lexical selection, a lexical item, called a ‘lemma’, that specifies the syntactic properties of the word (whether it is a noun or a verb, its grammatical gender, etc.) has to be retrieved. The third step, form encoding, requires that a morpho-phonological code be retrieved, followed by the stage of phonetic encoding, and resulting in an articulatory score whose execution will then yield overt speech (cf. Figure 1 in Indefrey and Levelt 2004, p. 104; cf. also Levelt et al. 1999). If we omit the last step(s), this seems to be a pretty good (even if still rough) model of inner speech production, except for one thing that I will come back to in Sect. 5.

More specifically, then, the suggestion is that we (tentatively) define inner speaking as a mental episode that engages the speech production system at least up to the level of lemma representations. This would nicely accommodate, along the lines suggested by Vicente and Martínez-Manrique (2016), what Hurlburt and colleagues label ‘unsymbolized thinking’ (Hurlburt and Akhter 2008). As Vicente and Martínez-Manrique point out, unsymbolized thought may be taken to be the syntactically structured, “semantic content of an interrupted inner speech act” (Vicente and Martínez-Manrique 2016, p. 11; cf. also Vicente and Jorba 2019). Whether phonological or articulatory representations are also activated during inner speech episodes is a matter of some controversy (Oppenheim and Dell 2010; Loevenbruck et al. 2018; Grandchamp et al. 2019; cf. Alderson-Day and Fernyhough 2015 for a discussion).

Also, the exact mechanism underlying inner speech production is a subject of debate. According to Carruthers, for example, inner speech “is just a sensory forward model in auditory code produced by activated (but not executed) speech actions” (Carruthers 2018, p. 35). We formulate a motor plan for an utterance, thereby generating a forward model that predicts the sensory consequences of the planned utterance (what it would sound like). If the execution of the motor plan is aborted, we consciously experience the forward model as inner speech. But then, why make a prediction of what the utterance would sound like if one never even plans on uttering it out loud in the first place (cf. Gauker 2018)? Also, what is the prediction compared to in inner speech? There is no actual outcome that can be compared to either the predicted outcome or to the desired outcome (cf. Perrone-Bertolotti et al. 2014; Loevenbruck et al. 2018; Swiney 2018). The only possible comparison is between prediction and original communicative intention. But that presupposes that there always is a communicative intention determinate enough to make comparison possible. Yet what language would it be formulated in? If it were (innerly) formulated in a natural language, this would land us in a regress. Assuming that it would be in a Language of Thought (LoT, for short) is also not without its problems (more on this in Sect. 6). It seems therefore worthwhile to look for (or rather develop) an alternative account of how (and why) inner speech is produced and what accounts for the experience of inner speech. (This is a topic for another paper, however.)

Pending a fuller account of inner speech production, and in order not to prejudge the issue and exclude interesting cases of inner speech, I suggest that we stick with the lean definition put forward above. However, one might happily admit that in many cases of inner speech, phonological or articulatory representations may in fact be activated; yet there is (as far as I can see) no good reason to make this a definitional feature. Thus only the first two stages would be mandatory—no language without semantics and syntactics—while the next two stages would be engaged in a task-specific and context-sensitive manner, and only if need be. Such a lean definition might be useful if our aim is to explain the full cognitive potency of inner speech, as it makes room for less costly forms of inner speech (given that phonological and articulatory encoding makes demands on the cognitive system) and also for unconscious forms of inner speech.
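The lean definition can be pictured as a threshold on the processing levels of the speech production model. The following is a minimal sketch under stated assumptions, not an implementation of any cognitive model: the names `Stage`, `Episode`, `is_inner_speaking` and `is_overt_speech`, as well as the example episodes, are hypothetical labels introduced here for illustration, and treating the levels as a strictly ordered scale is a simplification.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    """Levels of speech production, ordered roughly as in Levelt et al. (1999).

    The numeric ordering is an illustrative assumption: an episode is
    described only by the deepest level it reaches.
    """
    NONE = 0               # speech production system not engaged
    LEXICAL_CONCEPT = 1    # selection of a lexical concept
    LEMMA = 2              # lexical selection (syntactic properties)
    FORM_ENCODING = 3      # morpho-phonological encoding
    PHONETIC_ENCODING = 4  # phonetic encoding (articulatory score)
    ARTICULATION = 5       # execution of the score, i.e. overt speech


@dataclass(frozen=True)
class Episode:
    """A hypothetical mental episode, tagged with the deepest stage reached."""
    label: str
    deepest_stage: Stage


def is_overt_speech(episode: Episode) -> bool:
    """Overt speech requires that the articulatory score is executed."""
    return episode.deepest_stage == Stage.ARTICULATION


def is_inner_speaking(episode: Episode) -> bool:
    """Lean definition: engaged at least up to the lemma level, not articulated.

    Form and phonetic encoding are optional, so they do not enter the test.
    """
    return Stage.LEMMA <= episode.deepest_stage < Stage.ARTICULATION


if __name__ == "__main__":
    examples = [
        Episode("purely imagistic episode", Stage.NONE),
        Episode("interrupted inner speech act", Stage.LEMMA),
        Episode("phonologically specified inner speech", Stage.PHONETIC_ENCODING),
        Episode("talking out loud", Stage.ARTICULATION),
    ]
    for episode in examples:
        print(f"{episode.label}: inner speaking = {is_inner_speaking(episode)}")
```

On this sketch, an episode interrupted at the lemma level already counts as inner speaking, while phonological detail is neither required nor excluded; a stricter definition that demands phonological specification would simply raise the lower bound to `Stage.FORM_ENCODING`.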

Yet there is a general worry one might have concerning this definition of inner speech. Machery (2005) argues that introspective evidence of inner speech is no evidence that thought is linguistic as the latter is a claim about the vehicles of thought, to which we have no introspective access. In a similar vein, one might object to my definition of inner speech that it has not been shown that what is experienced as inner speech always engages the speech system. By way of reply, one might point out that evidence is accumulating that inner speech recruits similar albeit not exactly the same regions as overt speech (Perrone-Bertolotti et al. 2014; Loevenbruck et al. 2018; Grandchamp et al. 2019; Geva et al. 2011; Geva 2018).Footnote 4 The motor cortex seems to be less active, for example (Jones 2009; Perrone-Bertolotti et al. 2014). And since we have to distinguish not only between inner speaking and inner listening, rehearsing, writing or reading, but also between spontaneous and task-elicited inner speech (Geva and Fernyhough 2019; Hurlburt et al. 2016) as well as between wilful and involuntary inner speech (Perrone-Bertolotti et al. 2014; Loevenbruck et al. 2018), one would expect that somewhat different areas of the brain are recruited in each case. Still, there is ample evidence by now that during episodes of inner speech, relevant parts of the language system are activated. Whenever this has been experimentally investigated, inner speech has been shown to engage similar areas of the brain as does overt speech production. And no one so far has observed a case of subjectively experienced inner speech without engagement of (some relevant parts of) the language system.Footnote 5

4 The Cognitive Benefits of Inner Speech

A certain cognitive potency of inner speech, one that goes (partly, at least) beyond that of overt speech, is commonly acknowledged. Vygotsky’s (1986) idea that in the course of development speech becomes internalized, thereby turning into a cognitive or psychological tool, has intrigued psychologists and philosophers alike. Among the possible candidates for how inner speech may prove cognitively beneficial are the following (the list is by no means exhaustive). Inner speech is said to help us:

  • plan ahead and solve problems (Vygotsky 1986; Lidstone et al. 2010).

  • manage our knowledge effectively (Gauker 2011).

  • make thoughts conscious (Carruthers 1998, 2011, 2018).

  • engage in reflexive/higher-order thinking (Bermudez 2018).

  • gauge the social effects of our utterances (Carruthers 2018).

  • train perspective-switching (Fernyhough 2009, 2016).

  • broadcast information throughout the cognitive system (Carruthers 2002, 2012).

  • reduce cognitive load (Kompa and Müller 2020).

  • gain self-knowledge and aid self-reflection (Morin 2005).

  • augment cognitive control (Gade and Paelecke 2019; Miyake et al. 2004; Granato et al. 2020).

  • enhance working memory (via the phonological loop) (Baddeley 1986).

  • ….

It is worth noting that some of these suggestions allow for unconscious inner speech, while others explicitly tie inner speech to consciousness. Some require phonologically specified inner speech, while others do not. Still others seem to require that the inner speech utterance in question not be phonologically specified, for example, if it serves to reduce cognitive load or to broadcast information throughout the cognitive system.Footnote 6

Moreover, some seem to accord inner speech a cognitive role that goes beyond that of overt speech. This would be most pronounced in cases in which inner speech is not simply used due to social etiquette (i.e., because self-directed out-loud speech is commonly frowned upon), but rather turns into an integral part of our cognitive infrastructure and becomes (part of) a cognitive mechanism in its own right. It seems to be implicated in the phonological loop, for example, as a component of working memory (Buchsbaum and D’Esposito 2019). And some argue that it is a mechanism for integrating and broadcasting information in the cognitive system (e.g., Carruthers 2002; cf. also Godfrey-Smith 2016). Also, it has been said to be implicated in cognitive control mechanisms (Granato, Borghi and Baldassare 2020; Miyake et al. 2004), and to play a role in the processing of abstract concepts (Fini et al. 2022). More generally, if we allow for unconscious inner speech (and some methods used to examine inner speech, such as dual-task studies, seem to rely on such a notion; cf., e.g., Sokolov 1972; Emerson and Miyake 2003; Miyake et al. 2004; Fini et al. 2022), one might argue that inner speech thereby assumes a cognitive function of its own.

However, one of the most foundational questions bearing on the issue of how and to what extent inner speech serves cognitive needs is, arguably, the question of how it relates to thought; i.e., whether it enhances our cognitive abilities by crucially figuring in thought processes or not.

5 Inner Speech and Thought

Very roughly, one might distinguish two positions here.

  A. Inner speaking is exclusively a way of making thoughts accessible (to whomever). There is a prior occurrent thought that is entertained independently of any inner speech act and that may then be expressed in inner speech.Footnote 7

  B. Inner speaking and thinking are more intimately intertwined. Inner speaking is, occasionally, a form of thinking; no independent, prior act of propositional thought is required.Footnote 8 And, conversely, some paradigmatic cases of thinking are cases of inner speaking.

Account A claims a primacy of thought over language. It rests on the idea that thinking is different from speaking and, initially or in its purest form, (natural) language-independent. ‘Pure’ thought is completely untainted by (natural) language. Thinking is one thing, inner speaking is something else; it is a means to an end as it gives linguistic form to our language-independent thoughts. Before we engage in inner speech acts, there is a prior propositional thought act whose content might, eventually, be expressed by means of an inner speech utterance.

Account A may be construed as a version of the so-called ‘communicative conception of language’ (Carruthers 1998) when applied to inner speech. According to an old idea that features prominently in the work of John Locke, language in general is an expedient adopted for the purpose of making our thoughts accessible to our conspecifics, or, in other words, making them communicable (Locke 1979, Essay III.II.1). Analogously, inner speech may be a means of making thoughts accessible to oneself.

Also, the current literature mostly takes inner speaking to occur in a natural (albeit possibly condensed or fragmented) language. Given this, those in favor of a language of thought (LoT) might be inclined to subscribe to A. Moreover (as discussed before), one might think it necessary to assume a prior, language-independent thought in order to explain speech production. Levelt, for example, seems to commit himself to an initial, propositional communicative intention (preverbal message) that is “cast in the propositional language of thought” (Levelt 1989, p. 73). This “preverbal message is a semantic representation that refers to some state of affairs” (ibid.). Postulating an antecedent thought couched in an LoT may seem like an elegant option. However, it raises other difficult questions; we will come back to this in Sect. 6. And note that postulating a prior thought in an LoT does not help with the problem of how (inner or outer) speech is generated as long as there is no account of how utterances in an LoT are generated.

Let me thus put forth an argument in favor of account B. I will argue, in what follows, that there are two (connected) reasons for assuming that (occasionally at least) we think in language. We think in language to the extent that we re-use internalized social-linguistic practices. And we think in language to the extent that we thereby exploit the syntactic and semantic features of natural language.

More specifically, I will first argue (or at least lend some plausibility to the claim) that some instances of inner speaking are instances of thinking.Footnote 9 Yet the problem is (as just discussed) that, in order to do so, one would have to provide an account of how an inner speech utterance is generated without requiring a prior language-independent thought (doing the ‘real’ cognitive work). What is needed is an account of how inner speech production is initiated. As of yet, no such account exists. The question of which thought processes lead up to the production of speech is surprisingly under-researched (cf. Garagnani and Pulvermüller 2013).

One might nonetheless take some, admittedly exploratory, steps in this direction. Given that inner speech utterances somehow have to be generated, there must be antecedent thought processes. In defending a version of B, one clearly ought to allow for antecedent imagistic (Gauker 2011, 2018) or affective processes. There may also be prior conceptual and representational states. Following Crane (2009), one might distinguish between a mental state bearing representational content, propositional content, or conceptual content. Whether we also allow for prior propositional states will depend on whether we think that there can be non-linguistic propositional states (e.g., imagistic propositional states) – something I will not go into here. One might also take hints from discussions on animal cognition. Bermudez (2003), e.g., suggests that non-linguistic creatures may engage in imagistic reasoning, empathetic reasoning, trial and error reasoning, analogical reasoning and reasoning involved in exercising complex bodily skills. In humans, too, processes such as these might precede (and also accompany) the production of linguistically formulated contents. Also, inner speech production may be rather spontaneous or haphazard (Dennett 1991), or a reaction to an external or internal (such as a prior inner speech utterance) stimulus, as we are all in the habit of linguistically reacting to our environment.

Yet which cases of inner speech may be cases of thinking, then? Those that play a decisive role in processes that we deem paradigmatic processes of thinking, and which are thus not merely acts of inner speech (e.g., cases of auditory imagery) but inner speech acts (Roessler 2016; Wilkinson 2020). Interestingly, we often seem to engage in inner speaking during deliberation, problem-solving or similar cognitively demanding tasks. And inner speech is not a mere by-product or a convenient expedient in these cases. Rather, it seems that the yielding of antecedent thought processes to linguistic formulation itself enables us to engage in certain forms of complex deliberative reasoning. Frankish (2018) considers the example of wondering whether or not to go to a party to which he has been invited. He silently asks himself a question (“Do I want to go to the party?”), hears his own utterance and then his language comprehension system comes up with an interpretation that is broadcast to parts of his cognitive system. Further (partly autonomous) processes predict that Henry will be at the party. He silently utters these words, which give rise to another question (“Do I want to meet Henry?”). Further (again, partly autonomous) reasoning reveals that Henry will probably want to talk about the budget cuts, which results in the decision (which may be a sort of self-commitment) that he’d rather avoid meeting Henry and will thus not go to the party (Frankish 2018, p. 234).

As soon as antecedent cognitive contents are linguistically formulated in inner speaking, they acquire a level of semantic determinacy and differentiation, of explicitness and syntactic structure (that admits of productivity), that allows them to serve as premises in theoretical and practical deliberation, be denied and affirmed, stand in all kinds of inferential relations, and become communicable and interpretable by ourselves and others.

Moreover, many paradigmatic cases of thinking such as deliberation or problem-solving strikingly resemble social-linguistic practices of argumentation, question-answer-protocols, dialogue, or joint goal-directed action more generally. On the Vygotskian account adopted here, we learn to engage in deliberative activities like these when acquiring language and by being immersed in various social(-linguistic) practices (Vygotsky 1978, p. 57). But once language is internalized, we can engage in these practices by inner speaking, resulting in paradigmatic cases of thinking, or so I would like to suggest.

Second, I would like to argue that some instances of thinking are instances of inner speaking, namely all those that exploit the syntactic and semantic features of natural language. I will thereby draw on the definition of inner speaking suggested before in order to argue as follows:

  1. Thinking, i.e. entertaining a thought, either engages at least the first two levels of the speech production system, namely selection of a lexical concept and corresponding lemma, or it does not.

  2. If it does, it is a form of inner speaking (given the definition proposed before).

  3. If it does not, the thought entertained has no semantic or syntactic features, i.e., it is neither syntactically structured nor semantically meaningful (does not invoke lexical concepts), as the speech system is where these features are being processed.

  4. Therefore: Entertaining a thought is either an instance of inner speaking or the thought entertained does not exhibit syntactic structure or invoke lexical concepts.
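The argument has the form of a dilemma, and its validity can be made explicit in first-order notation. The following is only a sketch: the predicate letters T, E, I, Syn and Lex are labels introduced here for illustration and are not part of the original formulation.

```latex
% Illustrative predicate letters (assumptions of this sketch):
%   T(x):   x is an episode of entertaining a thought
%   E(x):   x engages the speech production system at least up to the lemma level
%   I(x):   x is an instance of inner speaking
%   Syn(x): the thought entertained in x is syntactically structured
%   Lex(x): the thought entertained in x invokes lexical concepts
\begin{align*}
\text{(P1)}\quad & \forall x\,\bigl(T(x) \rightarrow (E(x) \lor \lnot E(x))\bigr) \\
\text{(P2)}\quad & \forall x\,\bigl((T(x) \land E(x)) \rightarrow I(x)\bigr) \\
\text{(P3)}\quad & \forall x\,\bigl((T(x) \land \lnot E(x)) \rightarrow (\lnot \mathit{Syn}(x) \land \lnot \mathit{Lex}(x))\bigr) \\
\text{(C)}\quad  & \forall x\,\bigl(T(x) \rightarrow (I(x) \lor (\lnot \mathit{Syn}(x) \land \lnot \mathit{Lex}(x)))\bigr)
\end{align*}
```

Since (P1) is an instance of the law of excluded middle, the conclusion follows from (P2) and (P3) by case distinction; anyone who wishes to resist (C) therefore has to reject either the definition behind (P2) or the claim in (P3).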

Note that I am not claiming that all structured thought is linguistic in nature. There might be structured thought that does not result from imposing syntactic structure on lexical items. Rules, for example, provide structure; structured thought thus only requires the application of a rule to more basic items, whatever these may be (as in a conditional—a common notion or ‘rule’ in neuropsychology; cf. Bunge and Wallis 2008). The distinction between (otherwise) structured thought and linguistically structured thought is important to keep in mind, especially if we assume an evolutionary perspective and aim to explain how complex thought might have evolved.

One way to block the conclusion is by rejecting the definition employed in premise 2. One might argue instead that inner speech has to be phonologically specified (as does, for example, Langland-Hassan 2018; cf. Bermudez 2018 for a critique). Another way to block the conclusion is by rejecting premise 3, to which the following section now turns.

6 The Language-of-Thought (LoT) Hypothesis

One might object to the first claim (that some instances of inner speaking are instances of thinking), instead arguing that thinking and inner speaking are two separate and distinct cognitive acts. And while inner speaking is performed in a natural language, thinking is performed in an LoT.

By way of reply, one might point to the fact that we are trained in these social-linguistic practices (argumentation, dialogue, question-answer-protocols, etc.) by using natural language. Why change the language, then, when turning inwards? What would be the point of training these practices by using natural language and then re-purposing them in inner deliberation and problem-solving by switching to an LoT? That does not look like a computationally very efficient strategy; wouldn’t we thereby incur unnecessary cognitive costs? Also, it raises the tricky question of how an utterance in an LoT is translated into an inner speech utterance and vice versa, and whether this can be done without loss. So unless there are other good reasons for postulating an LoT (see below), one might think that we are better off without it (in terms of cognitive economy).

With regard to the second claim (that some instances of thinking are instances of inner speaking), and in particular with regard to premise 3 in the argument, one might claim (once more) that there is an LoT that has syntax and semantics but does not engage the speech production system. Fodor (1975) famously argued that we need to postulate an LoT in order to explain, among other things, how children can acquire a natural language, and also in order to account for (certain forms of) animal cognition.

My aim in this paper is not to refute the LoT hypothesis (as abler minds have tried before me) but to put forward an alternative account that tries to do without it. However, it is worth noting that Fodor argued mostly abductively and that today, there are sensible alternative explanations that seem more compatible with the available empirical evidence. For example, insight into the extent to which children are statistical learners (cf., e.g., Rebuschat and Williams 2012; Romberg and Saffran 2010) and the advent of usage-based approaches to language acquisition (Tomasello 2003) go some way towards providing viable alternative explanations of how children manage to acquire language. Also, the notion that the cognitive accomplishments of various animals are best explained on the assumption that they avail themselves of an LoT (and that we share with them this rather basic element of cognitive infrastructure) has fallen somewhat out of fashion. Rather, comparative research examines whether (or to what extent) animals can learn to use symbols (Pika 2015); engage in rule-governed (Pika et al. 2018) or intentional (Townsend et al. 2017) communicative behavior; whether there is a rudimentary form of morpho-syntax (as is argued, e.g., by Collier et al. 2014 or Engesser et al. 2016; cf. Suzuki et al. 2021 for a review) in animal communication systems; to what extent animals can reason (cf. Wynne and Udell 2020 or Kaufmann et al. 2021 for an overview) and, at the more general level, how linguistic and other cognitive functions might have coevolved in the first place. To the best of my knowledge, no plausible story of how the LoT could possibly have evolved in non-human animals has been told thus far.

Still, a defender of the LoT hypothesis might point to empirical evidence suggesting a dissociation between linguistic and other cognitive abilities. As studies with people with aphasia make clear, some show severe deficits in various cognitive domains, while others display only little cognitive impairment. Varley and Siegal (2000), for example, discuss the case of an agrammatic aphasic, S.A., with severe difficulties in sentence and verb comprehension; performance was above chance only on tasks requiring the comprehension of spoken and written nouns.Footnote 10 Yet S.A. nonetheless performed well in several cognitive tasks requiring causal reasoning, and also in false-belief-tasks (however, other studies suggest a strong correlation between language impairment due to aphasia and performance in reasoning tasks, cf. Baldo et al. 2015). How is this to be explained on the proposed account?

First, note that all I am claiming is that some instances of thinking are instances of inner speaking, not that all thinking is inner speaking. Second, in light of cases such as these, one could go for a low-level explanation and argue that these tasks, contrary to appearance, do not require linguistically structured thought but rather, for example, conditional or associative reasoning. Given that various animals seem to engage in forms of causal reasoning (yet without clear evidence for human-like capacities; cf. Schloegel and Fischer 2017), and that pre-verbal infants have been shown to master non-verbal versions of the false-belief task (Buttelmann et al. 2009; Onishi and Baillargeon 2005), we clearly need to acknowledge that these cognitive accomplishments do not presuppose natural language mastery. Of course, one could postulate an LoT in animals and pre-verbal infants to explain these cognitive achievements. However, unless a plausible tale can be told of how an LoT might have evolved, and pending neuropsychological evidence of neural correlates of an LoT, this should be our last resort. Alternatively, one could claim that in cases such as Varley’s aphasic, inner speech is preserved to the extent required for mastering the respective tasks, while overt speech production and comprehension is disabled due to problems with phonological or articulatory encoding (primarily of verbs). In fact, there are various studies with aphasics who have impaired overt speech yet preserved inner speech, i.e. they “can say words in their head that they cannot say out loud” (Fama et al. 2019, p. 106; cf. also Fama et al. 2017; Stark, Geva and Warburton 2018). This dissociation is especially prominent, as is to be expected on the model suggested, when the problem concerns mostly speech output (Fama and Turkeltaub 2020).

7 Summary

Let me sum up. The claim that we think in language is as old a claim as it is hard to spell out in exact terms. I tried to spell it out by drawing on recent work on inner speech. To that end, I proposed that inner speaking be defined as a mental episode that substantially engages the speech production system (at least up to the level of lemma representation). And while many agree that inner speech is somehow cognitively beneficial, the extent to which (and the manner in which) it is implicated in thinking is still a matter of some controversy.

I argued, first, that inner speaking is, occasionally, a form of thinking in that it prominently figures in thought processes that result from re-using social-linguistic practices of argumentation, dialogic interaction, problem-solving, etc. in inner deliberation and reasoning. No antecedent, natural language-independent, thought is required to kick-start the inner speech utterance. As Vygotsky aptly put it: “Thought is not merely expressed in words; it comes into existence through them” (Vygotsky 1986, p. 218).

I also argued, second, that thinking is, occasionally, inner speaking, namely when it exploits linguistic features of a natural language and thereby engages the speech production system. I closed by discussing the hypothesis of an LoT and whether it serves to provide a better account of various cognitive accomplishments than the notion of an inner (or internalized) natural language.