1 Introduction

In recent years, we have seen a growing interest in theorizing about homonymy and polysemy within linguistics, philosophy, and psychology. It is fair to say that a received view has emerged within the literature on polysemy and homonymy representation. According to this view, there is a difference between how polysemes and homonyms are represented in our mental lexicons, such that the meanings associated with a polysemous expression are stored and represented together under one lexical entry, whereas the meanings associated with a homonymous expression are represented separately. It is usually argued that this is the picture that is supported by the growing body of empirical evidence coming from psycholinguistics. Psycholinguistic studies consistently show that polysemous expressions enjoy various processing advantages compared to homonyms, and the received view is generally taken to be required in order to explain these results. The aim of this paper is not only to show that this is not the case but also, and more fundamentally, that the received view falls far short of explaining the data to a sufficient degree.

Section 2 will introduce the theoretical and experimental motivations for distinguishing between homonymy and polysemy as two types of lexical ambiguity. Section 3 will introduce the core theses of the received view along with some of its implications. In Sect. 4, I make my case against the received view. Put in general terms, I argue that the received view only manages to account for parts of the data coming from psycholinguistics. There is, moreover, an essential aspect of this data with which the received view would seem to conflict. As a result, the received view is caught up in an explanatory dilemma that I dub the Continuum Puzzle. The best way of escaping this puzzle, I claim in Sect. 5, is to replace the received view with an alternative consistent with the currently available evidence. The main contribution of that section is a number of sketches of what such an alternative view may look like. Importantly, reaching a viable alternative to the received view will require us to reject the following pervasive but ill-motivated assumption: Differences in ambiguity processing and resolution can only be explained by there being some corresponding differences in the architecture of our mental lexicons. In Sect. 6, I conclude by summarizing the main takeaways from this paper.

2 Polysemy and Homonymy as Two Types of Lexical Ambiguity

Lexical ambiguity can be defined as occurring when two or more words that are distinct in meaning overlap in some respect, e.g., phonologically, graphemically, or pictorially (see Sennet 2021). This type of characterization of lexical ambiguity is familiar. However, a need for a more fine-grained characterization has surfaced within the literature on ambiguity processing, resolution, and representation. In this literature and elsewhere, it is becoming increasingly common to distinguish between polysemy and homonymy as two different types of lexical ambiguity. This division has been both theoretically and experimentally motivated.

In theoretical linguistics, the distinction between polysemy and homonymy is commonly drawn on the basis of the degree of semantic relatedness between the interpretations associated with a lexically ambiguous expression (see Eddington and Tokowicz 2015, for a review).Footnote 1 So distinguished, an ambiguous expression is said to be homonymous if it is associated with two or more semantically unrelated (or only weakly related) meanings. In contrast, an expression is said to be polysemous if it is associated with two or more semantically related meanings (commonly referred to as “senses”).Footnote 2,Footnote 3 But what exactly does it mean to say that two interpretations associated with a lexically ambiguous expression are “semantically related” and how can semantic relatedness be measured?

There are several answers to the above question (see Koskela and Murphy 2006, and Lyons 1977). One can take semantic relatedness between interpretations to be a matter of etymology, i.e., two distinct interpretations of the same expression are said to be semantically related if they share a common etymological origin. However, in the literature engaged with here, semantic relatedness between interpretations is standardly established and measured by relying on native speakers’ relatedness judgments. As such, the interpretations associated with a lexically ambiguous expression are said to be semantically related if they are judged to be so by speakers of the language. Below are some concrete examples of how semantic relatedness has been defined and measured.

For their study of polysemy processing, Brocher and colleagues (2016) characterized semantic relatedness as follows: “[T]he degree to which speakers judge the two interpretations of an ambiguous word to be semantically similar based on physical, functional, or other properties” (p. 1803). To measure semantic similarity,Footnote 4 Brocher and colleagues presented a group of test subjects with booklets containing pairs of single sentences. Each sentence pair contained an occurrence of a selected ambiguous expression but encouraged different interpretations of that expression. Participants were then asked to rate the semantic similarity between the two interpretations from 1 (not similar) to 7 (the same meaning). For a later study, Brocher and colleagues (2018) asked participants to judge similarity based on the questions: “Can the two meanings appear in similar contexts? Do they share physical or functional properties? Do they taste, smell, sound, or feel similarly? Do they behave similarly?” (p. 448). A similar method was used by Klepousniotou and colleagues (2008). The authors selected two interpretations of a range of ambiguous expressions (e.g., ‘Lamb’: Animal/Meat). They then constructed four pairs of words, of which two encouraged one of the selected interpretations (e.g., baby lamb, friendly lamb), and two the other interpretation (e.g., marinated lamb, tender lamb), for each expression. Participants were then asked to rate the degree of relatedness between the two interpretations from 1 (no or weak relatedness) to 5 (strong relatedness).

Semantic relatedness ratings from studies like the above are used to characterize expressions as either polysemes or homonyms. Based on such characterizations, researchers then investigate how polysemy, compared to homonymy, is processed and resolved. With few exceptions (Klein and Murphy 2001, 2002; Foraker and Murphy 2012), the results of experimental studies further motivate the homonymy-polysemy distinction as given above. Apart from these exceptions, psycholinguistic studies consistently show that lexical ambiguities between semantically related interpretations are processed and resolved in ways that differ from ambiguities between semantically unrelated, or only weakly related, interpretations. Most importantly, polysemes have been found to be processed with much more ease than homonyms.

To summarize some highlights, experimental studies have found that: (i) When a homonymous expression is encountered, readers quickly settle on an interpretation of that expression. In contrast, when a polysemous expression is encountered, no immediate decision is made between senses (Brocher et al. 2016; Frazier and Rayner 1990; Frisson and Pickering 2001). (ii) When a homonymous expression is encountered, readers are biased towards the dominant meaning associated with the homonym. Such bias effects are not observed, or not observed to the same extent, when readers encounter polysemes (Frisson 2009; Klepousniotou et al. 2008; Brocher et al. 2016, 2018). (iii) When readers encounter a homonymous expression, the meanings associated with the homonym compete for activation, and a meaning that is not selected decays quickly. In contrast, when readers encounter polysemous expressions, a sustained co-activation of associated senses occurs such that the related senses mutually prime each other and are sustained over a period of time (MacGregor et al. 2015). (iv) A polysemous expression is recognized as a word faster than homonymous expressions are, and polysemes thus enjoy a recognition advantage (Klepousniotou and Baum 2007; Klepousniotou et al. 2012).Footnote 5 Given this data, it seems reasonable to conclude that there are grounds for distinguishing between polysemy and homonymy.

3 The Architectural Explanation

Having established that the theoretical distinction between polysemy and homonymy has empirical support, it is reasonable to wonder what is actually reflected by speakers’ relatedness judgments and why homonyms and polysemes are processed differently. According to the received view in this field, homonyms and polysemes are processed differently because they are represented differently in speakers’ mental lexicons. Furthermore, this difference in mental representation is taken to be what is reflected by speakers’ relatedness judgments. Put somewhat metaphorically, the received view claims that it is the architecture of our mental lexicons which explains the differences between homonyms and polysemes that have been observed.

In its simplest form, the idea is that whereas the meanings associated with a homonym are stored in separate lexical entries in our mental lexicons, the different senses associated with a polyseme are stored and represented together in one lexical entry. The logic behind this conclusion is, as stated by Eddington and Tokowicz (2015), “that homonyms are two separate words that happen by chance to have the same word form, and therefore they should not be stored together” (p. 33). In contrast, polysemy is a matter of one word being associated with several different senses. To illustrate, consider the polyseme ‘mouth’. ‘Mouth’, it is argued, can denote

the whole mouth, its outside part, a part of its inside part, its whole inside part, an aperture (such as in the mouth of the cave), the part of the river that opens into an ocean (river mouth), a whole person (I have two mouths to feed), a person who speaks too much (big mouth), etc. (Vicente 2018, p. 948)

Given that these different uses of ‘mouth’ are (perhaps with some exceptions) clearly related in meaning, one might be inclined to think that these are all just different ways of using the same word. That is, ‘mouth’ when used to denote a part of a river where it meets the sea is simply a way of using the word for oral cavities but in a different sense. This is also what the received view proposes. I will henceforth refer to this way of accounting for the differences between homonyms and polysemes as the Architectural Explanation, and any theory of polysemy and homonymy representation that accepts it will be referred to as an architectural account of polysemy and homonymy.

The claim that polysemy, unlike homonymy, is a matter of one word being associated with several different senses has several consequences. The architectural picture entails that the following two statements, though similar on the surface, have very different contents.

  (1) Some words are homonyms.

  (2) Some words are polysemes.

Normally, to say that two words, w1 and w2, are homonyms is to say that w1 and w2 stand in a certain relation to each other, i.e., that some expression is such that it can be used as either w1 or w2 (but not both simultaneously). So understood, (1) says that some word is such that a single expression can be used as that word, and also as at least one other word. In contrast, given the architectural picture, to say that w1 and w2 are polysemes would be to say that w1 is associated with several senses and that w2 is too. That is not to say that w1 and w2 stand in any interesting relation to each other. Hence, (2) says that some word is associated with several distinct senses.Footnote 6 This can be illustrated by considering the expression ‘bank’.

The two words bank1 (denoting financial institutions) and bank2 (denoting lands sloping down to a river or lake) stand in a relation of homonymy; they are distinct in meaning but overlap both phonologically and graphemically. However, bank1 is also often considered polysemous between at least two senses. In one sense, as in (3), bank1 denotes financial institutions. In another sense, as in (4), bank1 denotes buildings where a financial institution is located.

  (3) I need to clear a check at the bank.

  (4) The bank burned down.

These senses are related, and the word bank1 is considered capable of expressing both. Put somewhat differently, bank1 and bank2 correspond to two distinct lexemes, BANK1 and BANK2, that require separate lexical entries in the (mental) lexicon. In contrast, the two different senses associated with bank1 belong in the same lexical entry – namely, the lexical entry for the lexeme BANK1.Footnote 7 As a consequence, the lexical ambiguity that affects polysemes now becomes very different from that which affects homonyms. Standard characterizations of lexical ambiguity, that is, characterizations in terms of an overlap between two or more distinct words, will not do justice to polysemy if polysemy is defined in this way. Proponents of architectural accounts must thus provide a story about what goes into the lexical entry of a polysemous word, a characterization of the type of ambiguity affecting polysemes, along with an explanation of how polysemes are disambiguated when encountered. This has been done in several different ways.

Architectural accounts can be said to come in two general strands: Overspecification and underspecification views of polysemy. Characteristic of overspecification views is the idea that the total meaning of a polyseme is composed of the word’s various senses, an idea often traced to Pustejovsky’s (1995) work on lexicons. It is in this sense that the meaning of a polyseme is taken to be “overspecified.” Thus, given an overspecification view of polysemy, disambiguation is a matter of selecting one sense from the complex whole. Underspecification views, often associated with work by Ruhl (1989), are instead characterized by the idea that the meaning of a polyseme is an abstract “core” representation that encompasses, or is shared by, all of the polyseme’s various senses. It is this underspecified “core-meaning” that is taken to be stored in the lexical entry of a polyseme. In processing, the core-meaning is first activated, rendering sense selection (or disambiguation) possible in context.Footnote 8

Overspecification and underspecification views thus tell two very different stories about the nature of polysemes and the ambiguity that affects them, and much of the literature on polysemy representation has been centered around debating which of these views is best supported by the empirical evidence. However, I do not intend to discuss these views in any further detail. If I am right in arguing, as I will below, that the Architectural Explanation lacks the degree of empirical support it has been assumed to enjoy, the same holds for these different ways of developing it further.

4 The Continuum Puzzle

Another important observation coming from the empirical study of ambiguity processing is that the degree to which senses associated with a polyseme are judged as related varies. For some polysemes the semantic relatedness between senses is high, but for others it is only moderate. In effect, when judgments of semantic relatedness are used to distinguish between polysemy and homonymy, we end up with a distinction that is not clear-cut, as is shown by Klepousniotou and Baum (2007). Rather, Klepousniotou and Baum claim, “[…] it seems to be a matter of a continuum from ‘pure’ homonymy to ‘pure’ polysemy […]” (2007, p. 7, my emphasis). Semantic relatedness is, after all, a matter of degree. As a result, ambiguous linguistic objects can be placed along a continuum stretching from those associated with highly related interpretations to those associated with completely unrelated interpretations.Footnote 9 Importantly, it has also been found that the degree to which two interpretations associated with an ambiguous expression are semantically related correlates with the strength of observed facilitation effects in processing (see Klepousniotou 2002; Klepousniotou and Baum 2007). All in all, we get the following picture:

  (I) Lexical ambiguity comes in two varieties: Homonymy and polysemy.

  (II) Semantic relatedness between associated interpretations distinguishes polysemy from homonymy.

  (III) Semantic relatedness is a matter of degree, and lexically ambiguous expressions can be placed along a continuum ranging from those that are ambiguous between unrelated interpretations to those that are ambiguous between highly related interpretations.

  (IV) Expressions that are ambiguous between semantically related interpretations are processed with more ease compared to expressions that are ambiguous between semantically unrelated interpretations (as outlined in Sect. 2), and the strength of these various facilitation effects correlates with the degree to which the interpretations associated with a lexically ambiguous expression are semantically related.

As we have seen, the architecturalists then add the Architectural Explanation to this picture, summarized in two theses below, in order to explain why the relevant effects on processing occur:

  (V) The different interpretations associated with a polyseme are stored in the same lexical entry.

  (VI) The different interpretations associated with a homonym are stored in separate lexical entries.

However, together, (I)-(VI) generate a puzzle for the architecturalists. The puzzle takes the form of an explanatory dilemma.

4.1 The Architecturalist’s Dilemma

A theory of polysemy and homonymy representation must do a number of things simultaneously. It must (a) provide a story about what the differences between polysemy and homonymy are and (b) explain why polysemes and homonyms are processed differently. But at the same time, the theory must (c) be able to account for and predict the continuum. That is, the theory must explain why we observe a correlation between the degree to which the interpretations associated with an ambiguous expression are judged as related and the degree to which facilitation effects occur in the processing of that expression. Architecturalists provide the Architectural Explanation as an answer to (a) and (b). However, this way of accounting for the differences between homonyms and polysemes inhibits the architecturalist’s ability to properly account for the continuum.

The problem is the following: As stated in (III), the empirical data shows that semantic relatedness is a matter of degree, instantiated along a continuum. As stated in (IV), this data also shows that the same is true of the relevant facilitation effects, and that semantic relatedness and facilitation effects are correlated. In contrast, being represented as having one lexical entry, or being represented as one word, is not a matter of degree and cannot be instantiated along a continuum. The same holds for being represented as having several lexical entries, or being represented as several words. Thus, the architecturalist’s two theses, (V) and (VI), cannot explain why some polysemes are processed with more ease than others. That is, the Architectural Explanation does not account for the continuum.

More importantly, the Architectural Explanation would seem to predict the opposite of a continuum. If polysemes and homonyms really were distinct in the way that is assumed, then we should observe a clear cut in the psycholinguistic evidence. That is, under the assumption that there is a clear-cut difference between how polysemes and homonyms are mentally represented, one would expect there to also be a clear-cut difference between how polysemes are processed vis-à-vis homonyms. A clear cut is, however, precisely what we do not find. The coarse-grained, in fact, dichotomous picture painted by architectural accounts is consequently in conflict with the psycholinguistic evidence.
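The contrast between the two predictions can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than anything drawn from the studies cited: the 1 to 7 scale merely echoes the rating scale described in Sect. 2, the cut-off value is hypothetical, and the linear function stands in for any graded relationship whatsoever.

```python
def dichotomous_facilitation(relatedness: float, threshold: float = 4.0) -> float:
    """Facilitation if it depended only on one shared entry versus separate entries.

    The threshold is a hypothetical cut-off between 'homonym' and 'polyseme'.
    """
    return 1.0 if relatedness >= threshold else 0.0


def graded_facilitation(relatedness: float, low: float = 1.0, high: float = 7.0) -> float:
    """Facilitation that grows linearly with rated relatedness (assumed shape)."""
    return (relatedness - low) / (high - low)


# Hypothetical mean relatedness ratings for five ambiguous expressions.
ratings = [1.0, 2.5, 4.0, 5.5, 7.0]

step = [dichotomous_facilitation(r) for r in ratings]
graded = [graded_facilitation(r) for r in ratings]

print(step)    # only two distinct values: a clear cut
print(graded)  # a distinct value for every rating: a continuum
```

On the dichotomous pattern, expressions rated 4.0 and 7.0 are processed alike, while expressions on either side of the cut-off differ sharply however close their ratings are. The correlation reported in (IV) looks instead like the graded pattern.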

We thus seem to be caught in a dilemma: One can insist on the Architectural Explanation, but if one does, one will fail to both predict and account for the continuum. Alternatively, one can give up on the Architectural Explanation in order to do justice to the continuum, but this will be at the cost of losing the rather neat explanation for the processing differences that the Architectural Explanation provided us with. Nevertheless, some more sophisticated architectural accounts have been designed in order to account for the continuum, and these accounts are the topic of the next section.

4.2 Continuum-Friendly Architectural Accounts?

Brocher and colleagues (2016, 2018) propose what they call a shared features model for (irregular) polysemy. The account builds on the assumption that the lexical representation of a word’s meaning is composed of a set of semantic features. For example, the lexical representation of the word cat would include the semantic features ANIMATE, FELINE, and DOMESTICATED, whereas the lexical representation of lion would lack the feature DOMESTICATED. The idea is then that the senses associated with a polyseme form a complex of overlapping lexical representations in virtue of these senses sharing some of the same semantic features. In contrast, the meanings associated with a homonymous expression constitute non-overlapping lexical representations. To illustrate, the polyseme ‘wire’ is assumed to be associated with two lexical representations: one for metal threads used for carrying electricity and one for secret listening devices. However, both lexical representations share, Brocher and colleagues propose, the semantic features THIN, CYLINDRICAL, and METAL (2018, p. 446). Thus, these two lexical representations overlap. During processing, this shared subset of semantic features will be activated, which, Brocher and colleagues argue, is what leads to reduced bias effects, reduced competition between senses, and so forth.

The picture that is painted is one according to which the lexical representation of the word wire is this complex of overlapping lexical representations, each constituting a separate sense of the word. The account is thus similar to more standard overspecification views. Still, the account differs from standard overspecification views in an important way, for it is part of the shared features model that the senses associated with a polyseme can overlap more or less (depending on the number of semantic features they share). In this way, the shared features model is supposed to manage the continuum.
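The idea that senses can overlap "more or less" can be illustrated with a small sketch. It treats each lexical representation as a set of feature labels and measures overlap with the Jaccard index. That measure is an assumption made here for illustration; Brocher and colleagues do not commit to it. The cat and lion sets come from the example above and serve only to show the arithmetic; the remaining sets use placeholder labels and are hypothetical.

```python
def feature_overlap(sense_a: set, sense_b: set) -> float:
    """Jaccard index: number of shared features divided by number of features in either set."""
    union = sense_a | sense_b
    if not union:
        return 0.0  # two empty representations share nothing
    return len(sense_a & sense_b) / len(union)


# Feature sets from the cat/lion example in the text.
cat = {"ANIMATE", "FELINE", "DOMESTICATED"}
lion = {"ANIMATE", "FELINE"}

# Hypothetical sense pairs; the single-letter feature labels are placeholders.
pairs = {
    "strongly related senses": ({"A", "B", "C", "D"}, {"A", "B", "C", "E"}),
    "weakly related senses": ({"A", "B", "C", "D"}, {"A", "F", "G", "H"}),
    "unrelated meanings": ({"A", "B"}, {"C", "D"}),
}

print("cat/lion", round(feature_overlap(cat, lion), 2))
for label, (first, second) in pairs.items():
    print(label, round(feature_overlap(first, second), 2))
```

Because the overlap score can take any value between 0 and 1, a model of this kind has room for intermediate cases in a way that a bare one-entry versus two-entries distinction does not.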

This approach has some appeal, whether or not one wants to capture the relevant notion of relatedness in terms of an overlap of semantic features. Carston (2019, 2021) provides a so-called family resemblance view according to which the mentally stored senses of a polyseme form a cluster (a “family”) of interrelated senses in the mental lexicons of speakers. While the account is similar to Brocher and colleagues’ in some respects, Carston treats relatedness as a matter of a resemblance between senses, where “resemblance” is intended to capture something similar to Sperber and Wilson’s (1986) notion of “interpretative resemblance.” Like the shared features model, Carston’s family resemblance view manages to provide some explanation as to why a continuum is observed, as the senses associated with a polyseme can resemble each other to varying degrees. Even so, neither the shared features model nor the family resemblance view ultimately succeeds in escaping the puzzle.

Brocher and colleagues (2018) conclude their paper as follows, in further support of the homonymy-polysemy distinction being a matter of degree.

Semantic similarity strongly varies across lexically ambiguous items (as is supported by our local norming studies as well as Klepousniotou et al. 2008). For irregular polysemes with weakly related senses, activation of the shared features might not be sufficiently strong to greatly contribute to lexical retrieval. Under these conditions, unshared features should also quickly and strongly be activated, making these irregular polysemes behave more like homonyms. (Brocher et al. 2018, p. 463)

However, this conclusion, paired with the Architectural Explanation, creates a tension. While Brocher and colleagues’ general approach to the homonymy-polysemy distinction is fine-grained, the distinction imposed by the Architectural Explanation remains coarse-grained. Thus, Brocher and colleagues, as well as Carston, must predict that the distinction between something’s being polysemous or homonymous hinges on some minimal, single-case difference in degrees of semantic relatedness between associated interpretations. This prediction is not problematic in and of itself (it would indeed seem to follow from (I)-(IV)), but it becomes problematic when polysemy and homonymy are assumed to be mentally represented in two “architecturally” distinct ways.

By clinging to the Architectural Explanation, both the shared features model and the family resemblance view are going to predict not only that the homonymy-polysemy distinction hinges on some minimal difference in degrees of semantic relatedness, but also that minimal, single-case differences in degrees of semantic relatedness determine whether the interpretations associated with an ambiguous expression are stored in one lexical entry or two separate ones. In other words, they predict that minimal, single-case differences in degrees of semantic relatedness determine whether an ambiguity is a matter of two words being associated with the same expression or one word being associated with several senses in the minds of individual speakers. On the face of it, it is implausible to think that the structure of our mental lexicons would be sensitive to such minimal differences. Hence, if one wishes to argue that it is, some additional motivation is required, for example, an argument which shows that the distinction between polysemy and homonymy as given by the Architectural Explanation is still necessary in order to explain why there are processing differences between polysemes and homonyms. While an argument to that conclusion is possible to make, it is essential to note that both the shared features model and the family resemblance view significantly undermine the explanatory value of the Architectural Explanation. The Architectural Explanation was provided to explain why polysemes are processed more easily than homonyms. However, on the shared features model and the family resemblance view, this job is largely, if not fully, done by there being a semantic overlap between senses or by there being a resemblance between senses.Footnote 10

Advocates of these “continuum friendly” architectural accounts are thus also caught in a dilemma. They can insist on the Architectural Explanation, but if they do, the homonymy-polysemy distinction as given by the Architectural Explanation is left close to explanatorily inert and made to hinge, in an implausible way, on minimal differences in degrees of semantic relatedness. Alternatively, they can give up on the Architectural Explanation, but at the cost of having no principled way of distinguishing homonymy from polysemy representation in the mental lexicon.

5 The Prospects of a Non-Architectural Explanation

To see where to go from here, let us quickly recapitulate: The data that we have on polysemy and homonymy representation is limited, but it clearly supports the following claims: (i) Semantic relatedness is a matter of degree, and lexically ambiguous expressions can be placed along a continuum ranging from those that are ambiguous between unrelated interpretations to those that are ambiguous between highly related interpretations. (ii) There is a notable correlation between the degree to which the interpretations associated with a lexically ambiguous expression are judged as semantically related and the degree to which facilitation effects occur in the processing of that expression. Importantly, and as I have argued, the Architectural Explanation fails to explain, and even conflicts with, this data. This being the case, a Non-Architectural Explanation of the processing data ought to be considered, that is, an explanation of why polysemes are processed with more ease than homonyms which does not make reference to the way in which our mental lexicons and their entries are structured. Entertaining such an alternative would, however, be considered controversial. The rise of the Architectural Explanation has been fueled by the now pervasive assumption that differences in ambiguity processing and resolution can only be explained by there being some corresponding differences in the architecture of our mental lexicons. For a Non-Architectural Explanation to be viable, this assumption must be rejected.

Consider the following exchange: Foraker and Murphy (2012) argue that the senses associated with polysemes are represented in separate lexical entries just as the meanings associated with homonyms are. This conclusion is questioned by Brocher and colleagues (2018) on the basis that evidence against that conclusion was already found in Foraker and Murphy’s own post-hoc testing. Foraker and Murphy found overall dominance effects for polysemes consistent with dominance effect patterns that are typical for homonyms. However, their post-hoc testing suggested that the strength of the observed dominance effects could be predicted by the degree to which senses were rated as similar. This should not be seen nor predicted, Brocher and colleagues state, if a separate entries account were correct (2018, p. 445). But why would a separate entries account like Foraker and Murphy’s be incompatible with perceived meaning similarity having effects on processing? The reason seems to be precisely the assumption that differences in ambiguity processing and resolution can only be explained by there being some corresponding differences in the architecture of our mental lexicons. However, this assumption is ill-considered, for a proponent of a separate entries account just needs to make reference to some other factor capable of influencing processing to make sense of this data. A natural contender for such a factor would seem to be the fact that the associated senses are judged as related. Devitt (2021) makes a similar point, criticizing the idea that any separate entries account must predict that polysemes are processed just like homonyms are.Footnote 11

Why predict that the sense dominance that is significant for homonyms would also be significant for polysemes? Though the mind’s path for the efficient processing of homonyms utilizes dominance, perhaps that for polysemes utilizes only those relations. Perhaps the fact that the meanings of a polyseme are related yields much better clues to its interpretation in a context than the crude fact that one meaning is dominant. I see no basis for predicting otherwise. (Devitt 2021, p. 153)

It is not clear why the Architectural Explanation should immediately be favored over an explanation along these lines. In the following, I sketch three alternative ways of explaining why processing differences occur that are neutral with respect to how polysemes are mentally represented in the lexicon.Footnote 12 All three alternatives face some challenges, but the main point is to show that these alternatives exist.

5.1 Option 1: It is due to World Knowledge

The first option is an alternative hinted at by Löhr (2021), and it says that two meanings are perceived as related because they denote very similar things. This is something that we know; it is part of our world knowledge that, say, the arm of a chair is similar to the arm of a person. Löhr writes:

It could simply be argued that the intuition of relatedness stems from the fact that the conventionalized senses of polysemic expressions tend to pick out very similar kinds of things. This intuition of similarity could be represented independently as part of world knowledge. (p. 13)

It is perfectly reasonable to assume that intuitions of similarity, stemming from our knowledge about the things ambiguous expressions can denote, could affect how these ambiguities are processed but not necessarily how they are represented in our mental lexicons. The core idea would be that an ambiguous expression is considered polysemous when the things denoted by its various senses are perceived as similar, and that this intuition of similarity is what influences the disambiguation process. We can summarize the suggested view of polysemy as follows:

KNOWLEDGE

An expression E is polysemous between two meanings, M1 and M2, for a subject S, if and only if (i) S regularly [Footnote 13] associates E with M1 and M2, (ii) M1 and M2 are distinct, and (iii) S perceives the things denoted by M1 and M2 as similar. [Footnote 14]

We can generate different versions of this alternative by explicating what “perceives as similar” really means.

While an option like this is worth exploring, it has several problems. An analysis along these lines assumes that for a subject to perceive two interpretations of an ambiguous expression as semantically related, she must perceive the things denoted as similar. However, it is doubtful that this is always the case. For example, I might regard the use of ‘mouth’ to pick out apertures as related to the use of ‘mouth’ to pick out parts of rivers where they meet the sea, but I need not think that apertures and parts of rivers are similar things at all. As a result, the proposal will seemingly fail to account for cases in which the things denoted by an ambiguous expression are not perceived as similar, but where the senses associated with that expression are still judged as related and effects on processing are present. Alternatively, the proposal risks attributing intuitions of similarity to speakers that they do not have.

Consider further the use of ‘suit’ to denote, on the one hand, a type of clothing and, on the other, executives. These two alternative meanings are clearly related, but are they so related because executives are similar to suits? It seems more plausible to think that we perceive these two meanings as related because we began calling executives ‘suits’ only because suits are what executives wear. This brings us to the second option I want to consider.

5.2 Option 2: It is due to Causal Relations

The second option is Devitt’s (2021). On this alternative, judgments of relatedness are due to causal relations between associated meanings. All senses associated with a polyseme, Devitt argues, are causally related, for “the polyseme came to have its related senses because its having one sense partly caused it to have others; ‘suit’ came to mean an executive because it meant what an executive wears” (2021, p. 148). From the start, they were related via an “association of ideas,” something which we might expect to affect processing, but not necessarily representation, Devitt goes on to argue. We can spell out this alternative characterization of polysemy as follows:

CAUSAL RELATIONS

An expression E is polysemous between two meanings, M1 and M2, for a subject S, if and only if (i) S regularly associates E with M1 and M2, (ii) M1 and M2 are distinct, and (iii) the use of E to express M1 in S’s linguistic community is causally related to S’s linguistic community’s use of E to express M2. [Footnote 15]

This option does away with the requirement that subjects must perceive the things denoted by the two meanings as similar but faces other problems. Firstly, this solution presupposes that we all have access to these causal facts, for how else could they influence our processing? However, it can be doubted that ordinary speakers generally have access to such facts. Furthermore, the solution entails that for an expression to be polysemous for some individual, the interpretations associated with that expression must in fact be causally related. Let us look at a hypothetical scenario sketched by Carston (2021) in order to see why this is problematic. Carston says:

Suppose, for instance, that a form /xyz/ denotes the beak of a certain breed of bird and also a particular kind of bracket used in assembling furniture, there being no historical connection between the two words, but it happens that the bracket has a beak-like shape and movement. (p. 110)

The question is, is this a case of homonymy or polysemy? The most reasonable answer to this question is the one that Carston gives, namely, that it will depend on the speaker (cf. Löhr 2021). ‘Xyz’ will be polysemous for an individual who perceives the two meanings as related, whereas ‘xyz’ will be homonymous for an individual who does not. The scenario is interesting to consider, as there are real-life examples of precisely the type of situation Carston describes, that is, cases in which an expression is polysemous for some speakers but homonymous for others. Consider this passage from Murphy (2010):

I assumed for years that the word ear was a polyseme that could mean ‘an organ for hearing’ or ‘a cob (of corn).’ It seemed to me that cobs of corn came to be called ears (in North America at least) because they stick out from the stalk the same way that our ears stick out from our heads. But later I found out that (a) many people do not see a similarity between the two senses, and (b) the words for hearing organs and corncobs are not etymologically related. […] On etymological grounds and in other people’s minds, ear is a homonym. [Footnote 16] (pp. 91–92)

No study, as far as I am aware, has experimented on expressions with associated meanings that are clearly related in the minds of some individuals, but unrelated in the minds of others. However, we should expect that, given an ambiguous expression E, those individuals who believe that the meanings associated with E are related will process E differently compared to those who do not think that they are. This should be expected, given what we know about how the degree to which meanings are judged as related correlates with effects on processing.

These kinds of examples pose a problem for Devitt precisely because he assumes that what distinguishes polysemy from homonymy, and explains differences in processing, is that the meanings associated with a polyseme are causally related. Hence, ambiguous expressions whose associated meanings are not causally related cannot be polysemous, even if everyone believes that their meanings are related.

If we read Devitt as saying that these causal relations are, instead, in the minds of individual speakers, Devitt’s solution is committed to the view that for two meanings, M1 and M2, to be related in the mind of a subject S, S’s having M1 must have partially caused S’s having M2 (or vice versa). Furthermore, these causal relations must still be in S’s mind somehow affecting S’s processing. But now, just because I regard the use of ‘mouth’ to pick out oral cavities as related to the use of ‘mouth’ to pick out parts of rivers where they meet the sea, why must my having the one sense of ‘mouth’ have been caused by my having the other? Maybe my having the second sense was caused simply by people telling me that ‘mouth’ means parts of rivers where they meet the sea. I can still regard the two senses as related, for I can come to believe that the reason why these parts of rivers are called ‘mouths’ is because someone once thought they were interestingly similar to our oral cavities in some respect, just as Murphy believed that cobs of corn came to be called ‘ears’ because the way they stick out from the stalk is similar to the way that our ears stick out from our heads. But then, it is my belief that the senses are so related which relates them.

5.3 Option 3: It is due to Metalinguistic Beliefs About Causal Relations

Given what was said in the previous section, a third option presents itself. According to this alternative, when an expression E is associated with two different interpretations and a subject S believes them to be causally related, this will affect how S processes occurrences of E. It could be the case that S is unaware that two interpretations associated with an expression E1 are causally related and thus processes E1 like a mere homonym. It might also be the case that S wrongly believes that two causally unrelated interpretations associated with some expression E2 are related, which will lead S to process E2 differently than E1. Importantly, it is perfectly reasonable to suppose that our metalinguistic beliefs are capable of affecting processing but not necessarily representation. In sum, the proposal is as follows.

BELIEFS

An expression E is polysemous between two meanings, M1 and M2, for a subject S, if and only if (i) S regularly associates E with M1 and M2, (ii) M1 and M2 are distinct, and (iii) S believes that the use of E to express M1 in S’s linguistic community is causally related to S’s linguistic community’s use of E to express M2. [Footnote 17]

This option avoids the problems associated with Option 1 and Option 2. It does, however, raise an immediate question: How does Murphy process occurrences of ‘ear’ now that she has learned that the two alternative meanings are unrelated and presumably no longer believes them to be related? This is an empirical question that would require investigation, but if the same facilitation effects still occur, that is a problem for Option 3. To mend it, one might propose a view that combines Option 1 and Option 3. Perhaps people like Murphy, who have long believed that the two meanings associated with ‘ear’ were causally connected, have come to perceive the ears of corn and the ears of humans as similar, and that is something which might be hard to, so to speak, unsee. This would yield a characterization of polysemy along the following lines:

KNOWLEDGE AND BELIEFS

An expression E is polysemous between two meanings, M1 and M2, for a subject S, if and only if (i) S regularly associates E with M1 and M2, (ii) M1 and M2 are distinct, and (iii) S believes that the use of E to express M1 in S’s linguistic community is causally related to S’s linguistic community’s use of E to express M2 or S perceives the things denoted by M1 and M2 as similar.

According to this alternative, what influences our judgments of relatedness and the way we process ambiguous expressions is our knowledge and our beliefs about the things ambiguous expressions can denote, as well as our beliefs about how meanings are related. This alternative keeps what was good about Option 1 but can also account for judgments of relatedness that stem from our beliefs about meaning relations. Thus, it is worthy of further attention in the literature.

5.4 The Continuum Puzzle Revisited

While these Non-Architectural Explanations avoid the original version of the Continuum Puzzle, one might worry that the puzzle arises again in another form. [Footnote 18] How do the alternatives proposed here explain why processing is facilitated in varying degrees?

Option 1 can provide a rather simple answer: The more similar the things denoted by the two alternative meanings are perceived to be, the more prominent the facilitation effects will be. Option 2 has a harder time accounting for the continuum. However, we might imagine that a solution would be to say that the closer the causal relation between two senses is, the more prominent the facilitation effects will be. Thus, no facilitation effects will occur if two senses are very distantly related. A proponent of Option 2 would then have to spell out what she means by two senses being “distantly related,” for this could be interpreted in different ways. Cashing it out in terms of the number of intervening steps would only work for a handful of cases, so she will probably want to say that two senses are distantly related when the ambiguity was established so long ago that speakers of the language have started to forget what the relevant association of ideas was really about. [Footnote 19] The challenge is perhaps most acute when it comes to Option 3. In what manner can our metalinguistic beliefs account for when and why facilitation effects occur to varying degrees? Perhaps one can appeal to how strongly held the belief is, e.g., our credences toward the proposition that M1 and M2 are related. That is possible, but the resulting picture would perhaps provide an overly intellectualized account of the issue. It can be objected that people simply do not have sophisticated metalinguistic beliefs of that kind. The issue is not one that I will be able to resolve here, but it merits further discussion. What I will say is this: investigating and discussing this particular issue puts us in a better position to understand why polysemes are processed with more ease than homonyms.

6 Conclusion

The conclusion to draw from this paper is that the evidence coming from empirical studies of ambiguity processing and resolution does not favor a distinction between polysemy and homonymy along the following lines: The different meanings associated with a homonym are stored in separate lexical entries in our mental lexicons, and homonymy is a matter of two or more words sharing some of the same word forms. The different senses associated with a polyseme are stored in the same lexical entry, and polysemy is a matter of one word having more than one sense. I engage with this discussion in particular because this type of homonymy-polysemy distinction is growing in popularity, largely on the grounds that it is favored by the evidence. Various views and conclusions have then been taken to be strengthened by the existence of polysemy so defined. This includes, in particular, various versions of overspecification and underspecification views, along with related views of linguistic meaning proper (see Vicente 2018; Löhr 2021; Recanati 2017). What this paper shows, however, is that caution is needed when we discuss what the empirical evidence either supports or does not support.

It is also important to flag one thing that I did not discuss. The distinction between regular and irregular polysemy was not of importance for this discussion, as the data coming from the psycholinguistic studies engaged with here does not leave us in any kind of position to conclude how regular polysemes in particular or irregular polysemes in particular are mentally represented. However, there may be independent reasons for thinking that regular polysemy is a phenomenon distinct from both irregular polysemy and homonymy.

Unlike irregular polysemes and homonyms, regular polysemes are systematically generated and follow general patterns. Some examples of such patterns are: The object/representational content pattern (The book is blue/The book is entertaining), the place/institution pattern (The school burned down/The school decided to ban smartphones), the animal/meat/fur pattern (The rabbit is on the road/The rabbit was delicious/The model wears rabbit), and the plant/food pattern (A field of corn/The corn was delicious). These patterns are productive and extend easily. [Footnote 20] Most appear cross-linguistically (Pethő 2001) but are sometimes instantiated by different sets of senses (Srinivasan and Rabagliati 2015).

Another interesting feature taken to be distinctive of (at least some) regular polysemes is that the different senses associated with such expressions are not mutually exclusive. This means that anaphoric relations across senses are usually allowed, as well as co-predication. The regular polysemes ‘book’, ‘lunch’, ‘rabbit’, and ‘Brazil’ illustrate this feature in the following examples.

(5) That book [content] is boring. Put it [object] on the top shelf. (Vicente and Falkum 2017)

(6) Lunch was delicious [food] but took forever [event]. (Vicente 2018)

(7) John kills [animal], eats [meat] and wears [fur] rabbits. (Löhr 2021)

(8) Brazil is a large [land] two-century-old [institution] Portuguese-speaking [people] country. (Arapinis and Vieu 2015)

Whether these features of regular polysemes support treating them as mentally represented in some special way is a question on which this paper remains neutral.