Against intentionalism: an experimental study on demonstrative reference

In this paper, we present two experimental studies on reference of complex demonstratives. The results of our experiments challenge the dominant view in philosophy of language, according to which demonstrative reference is determined by the speaker's intentions. The first experiment shows that in a context where there are two candidates for the referent—one determined by the speaker’s intention, the other by some “external” factors—people prefer to identify the referent of a demonstrative with the latter object. The external factors for which this prediction has been confirmed include the speaker’s demonstration and the descriptive content of a demonstrative. The second experiment shows that while this preference can be explained in terms of the speakers’ having different sorts of referential intentions, the relevant kind of intentions are fully opaque to the subjects. At the end of our paper, we point to some alternative accounts of demonstrative reference, including a pluralistic and hybrid approach, which can accommodate our experimental results.


Introduction
An important issue concerning demonstrative expressions is the question of what role speakers' intentions play in determining reference, i.e., the semantic value of an expression like “this”, “that”, “this F”, etc., in a context. In philosophy of language, the dominant view is that this role is central: if the referent of a demonstrative used in a context can be identified, it must be determined by some sort of intention on the speaker's part. This view, in one form or another, is taken for granted or defended by, among others, Åkerman (2015), Kaplan (1989b, Sec. 2), King (2013, 2014), Michaelson (2013), Radulescu (2019), Speaks (2016, 2017), and Stokke (2010). Let us call this view “intentionalism”.
Although demonstrative reference and the role of speaker's intentions has been a subject of much debate for a long time, there is little empirical evidence on this matter in philosophical discussions. This is surprising given that many arguments provided in this debate directly appeal to linguistic intuition and thus call for validation in terms of what actual judgments are made by a wider group of linguistically competent subjects. Moreover, it is rather uncontroversial to think that any theory of reference should have a role in explaining human communication. This means that an account of demonstrative reference needs to be compatible with the facts about how people actually use and understand demonstratives and how they make referential interpretations. These observations motivate the goal of our paper which is to verify by means of empirical methods to what extent the speaker's intentions are taken to be relevant in the process of assigning reference to demonstrative expressions by ordinary users of language. In particular, we check whether people can regard an object as the referent if this object, in light of the information provided, is not the one the speaker really intends to talk about. The results of our study give a positive answer to this question: while information about the speaker's intention is significant to the referential choices of our study participants in some cases, it is definitely not the determining factor guiding the referential interpretations. Thus, the evidence provided by our study poses a challenge to intentionalism, or at least its most typical versions. In the further part of our paper, we discuss some possible objections to this claim. Firstly, we consider a possibility that the data obtained from our study participants does not reflect genuine semantic judgments. We argue, however, that the particular strategies of resisting the semantic significance of our results are not convincing. 
Secondly, we explore the possibility of explaining the results in line with some refined versions of intentionalism which introduce different sorts of referential intentions; this consideration leads to a second experimental study. We acknowledge that the referential choices of the study participants may reflect recognition of “secondary” intentions, but the evidence provided by the second study indicates that the relevant kind of intention is not straightforwardly recognizable by them.
The structure of the paper is as follows. In Sect. 2, we outline the theoretical background of the study and our underlying motivation. In Sect. 3, we present our main experiment (Study I). Section 4 discusses the major objections to the claim that our results challenge intentionalism. A second experiment related to one of these objections (Study II) is presented in Sect. 5. Finally, in Sect. 6, we point to some non-intentionalist accounts of reference which can accommodate our experimental results. Section 7 offers a summary and rephrases the main conclusions.

Intentionalism and referential predictions
We use the term “intentionalism” to denote a variety of views on how demonstratives refer, which nevertheless share the belief that a (relevant sort of) speaker's intention is indispensable for reference. In a general form, intentionalism can be stated as follows: An individual x is the semantic value of an utterance of a demonstrative d made by a speaker S in context c iff (i) S intends to talk about x by uttering d in context c and, optionally, (ii) some conditions u hold.¹ A “pure” version of intentionalism is a theory of reference which sticks only to condition (i) (cf. Kaplan, 1989b, Sec. 2; Speaks, 2016; Radulescu, 2019). “Impure” versions introduce clause (ii) with different conditions u. This heading covers very different views on how reference is established, and impure versions of intentionalism can be arrived at in different ways. Firstly, certain constraints, which operate independently of the speaker's intentions, can be imposed on successful reference. This strategy can be exemplified by King's (2013, 2014) “coordination account”, in particular, the version of his account that he calls “Bad Intentions” (for details, see King, 2013: 296-297). According to King, x is the semantic value of a demonstrative d uttered by S iff (i) S intends x to be that value, and (ii) an attentive, competent, and reasonable audience would take x to be the object that S intends to be the value. Reimer's (1992) account is another proposal of this sort. She argues that when an utterance of a demonstrative is accompanied by a demonstration act, the semantic value must be located in the direction indicated by the demonstration.
The second strategy of impure intentionalism is to hold that among various intentions that a speaker may have while using a demonstrative, only specific kinds of intentions are reference-determining. This strategy can be exemplified by the theories of Grice and Bach, or by King's “Best Laid Plans” (2013: 298-301). For instance, Bach (1987, 1992a, 1992b) argues that a referential intention is a part of a broader communicative intention of the speaker, and it is not just “having an object in mind”. According to him, the relevant referential intention “involves intending one's audience to identify something as the referent by means of thinking of it in a certain identifiable way” (cf. Bach, 1987: 49-53).² For example, the speaker may intend the audience to recognize the referent as the object being demonstrated or the most salient in the context. In other words, Bach's condition for an intention to be reference-determining assumes that the intention appeals to some recognizable features of the referent. Similarly, the Best Laid Plans version of King's theory introduces the notion of a “controlling” intention, which is, roughly, an intention that the speaker wants their hearers foremost to recognize.
An important issue which has emerged in the discussion about intentionalism is the question of whether the theory provides adequate predictions about what a given demonstrative refers to in a context or, relatedly, what the truth-value of a statement containing that demonstrative should be. This issue has indeed raised some controversy. For instance, Kaplan (1978: 239) challenged intentionalism with his famous Carnap/Agnew example. Suppose that without turning and looking I point to the place on my wall which has been long occupied by a picture of Rudolf Carnap and say: “This is the picture of one of the greatest philosophers in the twentieth century.” However, unbeknownst to me, someone has replaced the picture of Carnap with one of Spiro Agnew and so the latter is actually the one I am pointing at while saying my words. My statement seems to be simply false, but pure intentionalism is arguably committed to the claim that it is true.³ Based on this, some have concluded that pure intentionalism is incorrect (e.g., Reimer, 1991; but, for some different analyses of this example, see King, 2014: 222-224 or Radulescu, 2019). A different example of this sort, where a referential intention does not match the contextual clues concerning the identity of the referent, is given by Gauker (2008: 363): Suppose that Harry and Sally are at a department store and Harry is trying on ties. Harry has wrapped a garish pink-and-green tie around his neck and is looking at himself in a mirror. Sally is standing next to the mirror gazing toward the tie around Harry's neck and says, “That matches your new jacket.” As a matter of fact, Sally has been contemplating in thought the tie that Harry tried on two ties back.
While Gauker insists that “that”, as uttered by Sally in the presented case of mismatch, cannot be taken to refer to the former tie, some other commentators on this example claim the opposite (e.g., Montminy, 2010: 2912-2913; for a discussion, see Åkerman, 2015: 494-496).
In our view, what all those discussions are missing is a solid empirical basis. One way of understanding a judgment that in such-and-such circumstances, a statement with a demonstrative is true, false, or neither, is to treat it as a voice of “intuition” or “linguistic competence”. This is mostly presumed in discussions about examples like those given above, but this approach is not very promising. There is much disagreement on the status of intuitions as evidence in the philosophy of language (cf. Bach, 2002; Martí, 2008; Machery et al., 2009; Cohnitz and Haukioja, 2015), and arguments based on intuitions seem to be especially problematic when these intuitions vary among philosophers. So, there is little hope that any such argument will provide a relevant contribution to the debate on demonstrative reference and the role of speakers' intentions in it. Yet, we can look at the reference or truth-value judgments from a different perspective, by treating them as predictions about what competent speakers of language would say when confronted with a given use of a demonstrative in real life. More specifically, these judgments can be understood as (attempts at) generalizations of the predicted language users' behavior. If we follow this path, then there is a natural need to test hypotheses about these judgments against empirical reality, and the empirical way of settling the issue seems to be the adequate one. In particular, we may wish to employ experimental methods in order to gather data about exactly those cases that are interesting to us. The results of empirical investigations may eventually confirm or disprove particular predictions concerning reference or the truth-value and thereby strengthen or weaken a given philosophical argument based on these predictions.
Footnote 2 continued: ... function of context. Thus, in a sense, Bach presents a view which is far from intentionalism as defined above. Still, he claims that it is the intention of the speaker that the referent will be identified by the hearer which matters and recognition of such an intention explains successful communication (2017: 71).
3 The Carnap/Agnew case is more complex from the viewpoint of such theories as the one of Bach. We will discuss it in Sect. 4.2.
In the next sections, we will present our attempts to empirically verify the judgments of intentionalism about the truth-value, understood as predictions of ordinary speakers' behavior. We target pure intentionalism, as well as a certain impure version of this view. In light of the diversity of impure intentionalism, we found it hard to test all possible positions that can be classified under this heading in a single experimental design. For this reason, we have decided to focus on one particular position in our main study, which is the Bad Intentions version of King's theory. This account gives essentially different predictions from those of pure intentionalism, on the one hand, and of the non-intentionalist accounts, on the other. The cases investigated in our study are the ones where the referential intention of the speaker cannot be recovered from the context, as the context (or, more generally, some objective factors) actually indicates that the speaker talks about a different object from the one she or he really has in mind. Because of this, King's “idealized audience” would not be in a position to grasp the speaker's true referential intention, and thus the Bad Intentions account predicts a reference failure.⁴ In turn, pure intentionalism and non-intentionalism predict successful reference, but to different objects, in such cases. King's account thus seems to be a good candidate for diagnostic experimental comparisons.⁵
4 Also, King would presumably say that the examples investigated in our study involve “conflicting intentions”. The Bad Intentions account directly predicts a reference failure in such cases (see King, 2013: 297).
5 Reimer's theory focuses on the role of the speaker's demonstration, and we wanted to test intentionalism also with regard to cases not involving demonstrations. Also, we find it problematic to test the view of Bach, or King's Best Laid Plans, since their predictions are hard to distinguish from those of non-intentionalist views which say that contextual factors, and not the intentions related to them, determine reference (for details, see Sect. 4.2).
Study I: Do intentions determine reference?

Strategy of testing intentionalism
A reasonable strategy to test intentionalism is to focus on the cases in which the referential choice of language users may potentially diverge from that predicted by intentionalism. Such cases can be naturally constructed by introducing another factor that could be reference-determining and indicates a different candidate for the referent from the one fixed by the speaker's intention, as in the previously discussed examples of Kaplan and Gauker. We selected three such factors.
The first is an act of demonstration. If a speaker makes a gesture of pointing while using a demonstrative, then the referent should be located somewhere in the direction indicated by that gesture. In light of the conversational context and some common-sense assumptions, a demonstration is usually enough to determine a unique object as the referent. Thus the speaker's act of demonstration may be regarded as a reference-determining factor which operates independently of his or her intention. This proposal is obviously rooted in some classical theories of demonstrative reference, such as the one presented by Kaplan (1989a), according to which the “true” demonstratives are semantically incomplete and require a form of demonstration in order to successfully refer (see also Roberts, 2002).
The second factor is description. That is to say, we may regard the property expressed by a demonstrative (in particular, the one expressed by the nominal phrase in a complex demonstrative, like “black cat” in “that black cat”) as reference-determining whenever there is only one object satisfying the property in the context. Although complex demonstratives, i.e., expressions of the form “this F” or “that F”, are often used when there is another F in the context, this is not an unexceptional rule. Hence, the description itself may determine a particular object provided we assume a contextual domain restriction.⁶
The third factor is salience. This is a broad category that inter alia includes demonstrating but is not limited to it. Objects may be salient in virtue of very different features, e.g., they occupy a certain spatial position with regard to the speaker, or they have previously been mentioned in the utterance (cf. Lewis, 1979: 348-350). Basically, the idea is that the referent of a demonstrative used at a given stage of a discourse is the most salient object at this stage. With regard to demonstratives, several theorists advocate accounts of this sort (e.g., Mount, 2008; see also Gauker, 2008; Newen, 1998; Wettstein, 1984). Usually it is assumed that salience is a gradable feature and we can compare different objects that gain salience from presumably different sources in terms of their levels of salience and create a ranking. Not much is said by the aforementioned theorists about how these mechanisms work in detail; however, the applications of the theory are straightforward and intuitively clear in ordinary cases. For instance, consider the previously quoted example of Gauker: Harry is searching for a tie, and at the moment he tries on a pink-and-green one Sally says, “That matches your new jacket.” The current pink-and-green tie is definitely the most salient one at the moment of Sally's utterance.
In our experimental study we used scenarios similar to Gauker's example, in which it is easy to identify the most salient object, and the maximal salience is gained in virtue of the object's location, its status in a series of events, or the conversational focus.
In total, we chose three possibly reference-determining factors: demonstration (Dem), the descriptive content (Des), and salience (Sal). Each tested condition was considered to be a mismatch case in which the speaker's intention determines one object, while Dem / Des / Sal determines another object as the referent.
We also considered it important to test the predictions of intentionalism in two different types of conditions. The first was a situation in which the object determined by the intention had the property ascribed by the speaker, and the object determined by an alternative factor (e.g., Dem) did not. So, this case was similar to Kaplan's example, where the picture actually demonstrated by the speaker did not present a great twentieth-century philosopher, in contrast to the picture the speaker had in mind. However, we also included the inverse situation in our experimental design: the object determined by the intention did not have the ascribed property and the object determined by Dem, Des or Sal had it. This enabled us to get more predictive results and test the predictions of intentionalism in a complementary way. Henceforth, let us refer to the first case as “T-condition” and to the second as “F-condition”. (Note: the capital letter T/F always indicates the state of the object determined by intention, i.e., whether the prediction of intentionalism is true/false.) Because we included the Des condition, we focused on complex demonstratives as our case study; strictly speaking, the demonstrative in each condition had the form “this F”. Naturally, further empirical research is needed with regard to other kinds of demonstratives (including those which contain “that”). Nonetheless, we believe that the most general conclusions of our study will apply to all kinds of demonstratives, including bare ones like “this”. As “this F” contains “this” as its component, it is unlikely that these two expressions are essentially different in the way in which they refer to objects. Hence, if it turns out that intentions do (or do not) determine reference in the case of complex demonstratives, we are justified in believing that they are likewise important (or unimportant) for bare demonstratives.

Predictions of intentionalism and non-intentionalism
The predictions of pure intentionalism are quite straightforward: people should tend to evaluate a statement with a demonstrative as “true” in the T-conditions and as “false” in the F-conditions. At the very least, their preference for affirmative judgments should be substantially stronger in the T-conditions than in the F-conditions. The predictions of King's Bad Intentions view (impure intentionalism) are more complex. This theory allows for the possibility of reference failure when the object fixed by the speaker's intention cannot be properly identified by a hearer. As we said at the end of Sect. 2, the cases investigated in our study seem to involve a failed reference from the viewpoint of the Bad Intentions account. This is because the true referential intention of the speaker cannot be grasped by an ideal audience, given that a factor like Dem, Des, or Sal indicates that the speaker wants to talk about a different object. A failure of reference in a subject-predicate sentence means that this sentence does not express a determinate proposition, so the study participants should opt neither for “true” nor for “false” in their evaluation. However, one cannot exclude the possibility that even if the participants recognize a reference failure, they still prefer definite answers, so they opt for falsity (it is rather improbable that they would say that the statement is “true” in such a case).⁷ In sum, King's Bad Intentions account is compatible with the pattern in which respondents choose “false”, or neither “false” nor “true”, across T-conditions and F-conditions, as both types of conditions likely involve a reference failure from the viewpoint of this account (Table 1).
It is instructive to present the predictions of the considered versions of intentionalism in a table and contrast them with those which clearly count against these accounts (Table 1). Intentionalism, in any version, is inconsistent with a pattern of results in which “true” dominates in the F-conditions and at the same time “false” dominates in the corresponding T-conditions (or with a pattern in which the preference for affirmative judgment is stronger in the F-conditions than in the T-conditions). We call the view which generates the indicated set of predictions “non-intentionalism”, as the predicted pattern of results strongly suggests that people fix reference according to an alternative factor (like Dem or Des) and not according to the speaker's intention. Also, let us note that the introduction of the F-conditions allowed us to create an experimental design yielding different sets of predictions for each theoretical approach, and thus providing a possibility of making a diagnostic comparison between these approaches. The predictions of non-intentionalism in the F-conditions differ from those of both pure intentionalism and King's Bad Intentions account.
Finally, we should make a brief comment about the cases with the Des condition. Here, one may suspect that the predictions of intentionalism depend on the view about the semantic role of the noun phrase in a complex demonstrative. There are very different opinions on what that contribution may be; nevertheless, we need to emphasize that no matter what role “F” actually plays in the semantics of “The F is G”, once we agree that intentions determine reference, we basically end up with a set of predictions from the left-hand or the centre column but not from the right-hand column.⁸

Table 1
               Pure intentionalism   Bad Intentions account   Non-intentionalism
T-conditions   T                     F/N                      F
F-conditions   F                     F/N                      T
T = true; F = false; N = neither true nor false; T-condition = a case in which the uttered statement is true according to pure intentionalism; F-condition = a case in which the uttered statement is false according to pure intentionalism
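The contrast between the three sets of predictions can be made explicit with a small lookup. The following is only an illustrative sketch: the dictionary encodes the dominant answers stated in this section, the function name `compatible_accounts` is hypothetical, and the comparison is simplified to a single dominant answer per condition (it ignores the weaker criterion based on the relative strength of the preference for affirmative judgments).

```python
# Predicted dominant answers per account, as stated in this section.
# "T" = true, "F" = false, "N" = neither true nor false.
PREDICTIONS = {
    "pure intentionalism": {"T-condition": {"T"}, "F-condition": {"F"}},
    "Bad Intentions account": {"T-condition": {"F", "N"}, "F-condition": {"F", "N"}},
    "non-intentionalism": {"T-condition": {"F"}, "F-condition": {"T"}},
}


def compatible_accounts(dominant_in_t, dominant_in_f):
    """Return the accounts whose predictions allow the observed dominant answers."""
    observed = {"T-condition": dominant_in_t, "F-condition": dominant_in_f}
    return [
        account
        for account, predicted in PREDICTIONS.items()
        if all(observed[cond] in predicted[cond] for cond in observed)
    ]


print(compatible_accounts("F", "T"))  # ['non-intentionalism']
print(compatible_accounts("F", "F"))  # ['Bad Intentions account']
print(compatible_accounts("T", "F"))  # ['pure intentionalism']
```

Because no two accounts share a pair of predicted answers, each idealized response pattern singles out at most one account, which is what makes the design diagnostic.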

Materials and design
The study had the form of a questionnaire in which participants were asked to read short fictional stories (“vignettes”) and answer some questions about them. We developed three kinds of mismatch situations that differ in their setup and plots: Dog Show, Wedding Cakes and Game of Chess (see Appendix A for the materials). The difference across setups was not predicted to be significant and served merely as a robustness check. Each vignette described a situation in which someone used “this F” as a grammatical subject of a simple statement made in a conversational setting (i.e., there was always an interlocutor to whom the utterance with a demonstrative was addressed). As a rule, the background story provided two different candidates for the semantic value: the object x that the speaker intended to talk about using “this F” and the object y that was differently determined in each version of the story. Here is a sample of the Dog Show vignette in all three conditions: Dem, Des, and Sal, where the object determined by the intention satisfied the property from the ascription (T-condition):
In sum, we employed a 2 × 3 mixed experimental design with one within-subject factor (condition: Dem, Sal, or Des) and one between-subject factor (T-condition vs. F-condition). This made six tested conditions: Dem, Des, and Sal in both T and F versions.

Participants and procedure
For Study I, 371 participants were recruited on Amazon Mechanical Turk to complete an online questionnaire. They were paid a small fee for their participation. Non-native speakers and subjects who failed the attention or comprehension check were excluded. The final sample consisted of 290 subjects (150 women, M_age = 39.6, SD = 12.7).⁹ Each participant was randomly assigned to one between-subject condition (T-condition or F-condition) and received three vignettes with different setups, one for each within-subject condition (Dem, Des and Sal cases). The within-subject conditions were counterbalanced across vignettes in such a way that every participant received one vignette for each condition without repeating vignettes. The order of presentation was randomized. For each experimental condition, three separate screens were presented with the following questions:
1. Categorical comprehension question (e.g., “Which dog is chocolate?” with two possible answers to choose from: “Labrador” and “Irish Setter”);
2. Question regarding the truth-value of a protagonist's utterance (e.g., “How would you evaluate Olivia's statement in the described situation?”) followed by three choices: TRUE, FALSE and CANNOT SAY;¹⁰
3. Question on confidence level regarding the answer to the truth-value question (e.g., “To what extent do you agree that what Olivia said was true?” in cases in which the participants answered “true” to the previous question) followed by a 5-point scale ranging from -2 (marked with “strongly disagree”) to +2 (“strongly agree”).¹¹
The participants were unable to go back to the previous vignette or the previous screen, and the story remained at the top of the screen throughout. Before the questions in the experimental conditions, a short demographic survey was presented; this questionnaire also preceded testing in our second experiment.
9 A high number of excluded participants is a result of a strict exclusion criterion: only participants who successfully answered the comprehension questions in the questionnaire were included in our final sample.
10 Different evaluations of the logical value naturally reflect different referential choices made by the study participants. Asking about the evaluation of a sentence seems to be better than direct referential-intuition questions, as the former tests more the “usage” of language (see Devitt and Porot, 2018).
11 The ones who chose CANNOT SAY in the second question received two additional questions: (1)
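The assignment scheme described above can be sketched as follows. This is a hedged illustration only: the function name `assign_participant` is hypothetical, and the sketch pairs conditions with setups by random shuffling, which guarantees one vignette per condition without repeated setups but only approximates counterbalancing in the strict sense of equal cell sizes. The study's actual assignment procedure is not specified in the text.

```python
import random

CONDITIONS = ["Dem", "Des", "Sal"]                        # within-subject factor
SETUPS = ["Dog Show", "Wedding Cakes", "Game of Chess"]   # vignette setups
VERSIONS = ["T-condition", "F-condition"]                 # between-subject factor


def assign_participant(rng):
    """Draw one between-subject version and pair each condition with a distinct setup."""
    version = rng.choice(VERSIONS)
    setups = SETUPS[:]
    rng.shuffle(setups)                      # each condition gets a different setup
    trials = list(zip(CONDITIONS, setups))
    rng.shuffle(trials)                      # randomize the order of presentation
    return version, trials


version, trials = assign_participant(random.Random(0))
print(version, trials)
```

Passing an explicit `random.Random` instance keeps the assignment reproducible for a given seed.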

Results
For each of our experimental conditions (Des, Dem and Sal), we conducted separate analyses. For the currently discussed analysis, we excluded from the data CANNOT SAY answers to our truth-judgment question. The reason for this procedure is that these answers constituted a minority of answers (about 10% across all experimental conditions). A detailed table containing all the data is included in Appendix B.
In the Dem conditions, a majority of participants provided answers that were incompatible with the predictions of intentionalism (see Fig. 1). In the T-condition, where (pure) intentionalism predicted TRUE, a majority of participants evaluated the speaker's utterance as FALSE (78.2%, binomial test (P = 0.5): p < 0.05). In the F-condition, the participants tended to judge it as TRUE while intentionalism predicted FALSE (63.8%, binomial test (P = 0.5): p < 0.05). We also observed a statistically significant difference between answers in the T-condition and the F-condition (χ² = 43.32, p < 0.001).
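For readers who want to see what a binomial test against P = 0.5 involves, here is a minimal sketch of an exact two-sided version using only the Python standard library. It is an assumption that an exact test of this form corresponds to the one reported; the function name is hypothetical, and the counts in the usage line are a toy example, not data from the study (the raw counts are in Appendix B).

```python
from math import comb


def binomial_test_two_sided(successes, n, p=0.5):
    """Exact two-sided binomial test: sum the probabilities of all outcomes
    that are no more likely than the observed one under H0."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # The small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


# Toy example: 9 answers of one kind out of 10.
print(binomial_test_two_sided(9, 10))  # 0.021484375
```

The direct computation of the probability mass function is adequate for sample sizes of a few hundred; much larger samples would call for a log-space implementation or a statistics library.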
A similar pattern of responses was observed in the Des conditions. An overwhelming majority of participants assessed the target utterance as FALSE in the T-condition (95.2%, binomial test (P = 0.5): p < 0.01). In the F-condition, the answers were more divided. A majority of participants said that the target utterance was true (56%), but this difference did not reach statistical significance. However, a comparison between T-condition and F-condition yielded a statistically significant difference (χ² = 82.61, p < 0.001), which suggests that the referential intuition of the respondents was guided more by description than by intention.
The situation is different when it comes to the scenarios with the Sal conditions. The results are, in principle, compatible with the predictions of intentionalism (67.9% of TRUE answers in T-condition and 61.2% of FALSE answers in F-condition, χ² = 20.09, p < 0.001). However, during our research, we identified several factors which strongly influence responses in this condition. (A further discussion is in Sect. 3.4.)
We also conducted a more sophisticated statistical procedure in which we combined the two dependent variables present in our study: the truth-value judgment and the confidence ratings. For each answer we computed the “logical value index” in the following way: for TRUE answers we took the confidence rating as a value, and for FALSE answers we reversed the sign of the confidence rating (i.e., “2” on the scale becomes “-2”). This procedure allowed us to carry out a more fine-grained analysis of the answers. We ran a mixed analysis of variance on the confidence ratings transformed in the described manner. The ANOVA test revealed that both of our experimental factors were highly statistically significant (T-condition vs. F-condition: F(1, 288) = 24.2, p < 0.001; Dem vs. Des vs. Sal: F(2, 576) = 28.3, p < 0.001). The interaction was also statistically significant (F(2, 576) = 80.1, p < 0.001), which suggests that different types of contextual factors influence the truth-value judgments in different ways.
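The transformation behind the “logical value index” can be written down in a few lines. This is a minimal sketch under the description given above; the function name `logical_value_index` and the string labels are illustrative choices, and the sample responses are invented purely to show the arithmetic.

```python
def logical_value_index(answer, confidence):
    """Signed confidence: the rating itself for TRUE answers,
    the rating with reversed sign for FALSE answers."""
    if not -2 <= confidence <= 2:
        raise ValueError("confidence must lie on the -2..+2 scale")
    if answer == "TRUE":
        return confidence
    if answer == "FALSE":
        return -confidence
    # The described procedure covers only TRUE and FALSE answers.
    raise ValueError(f"no index defined for answer {answer!r}")


# Invented responses, for illustration only.
responses = [("TRUE", 2), ("FALSE", 2), ("FALSE", 1)]
print([logical_value_index(a, c) for a, c in responses])  # [2, -2, -1]
```

On this coding, positive values indicate confident TRUE judgments, negative values indicate confident FALSE judgments, and values near 0 indicate low confidence in either direction.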
Two types of post hoc analyses were conducted. First, a comparison of means was performed in the T-condition and F-condition separately for each within-subject condition (Des, Dem and Sal, see Fig. 2 and Table 2). In all sub-conditions, the difference was highly statistically significant (two-sample t-test, p < 0.001). Additionally, we ran a separate 3 × 3 × 2 ANOVA in which we included different setups as an additional factor (Dog Show, Wedding Cakes, Game of Chess). We did not find any significant effect of the story setup (p > 0.05).
Fig. 2 Study I: results of combined truth-value judgment and confidence ratings. Error bars represent standard error of the mean
The second type of analysis was performed to check whether the means in each of our sub-conditions differed from those predicted by chance (see Fig. 3 and Table 3). If the participants were responding randomly, the expected mean would equal 0. All but one mean (the Sal scenario in the F-condition) significantly differed from 0 (one-sample t-test).
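The comparison against chance amounts to a one-sample t-test of the index values against a mean of 0. The sketch below computes only the test statistic and degrees of freedom with the standard library; the function name is hypothetical, and obtaining the p-value would additionally require the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu=0.0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = stdev(values) / sqrt(n)   # sample SD with n - 1 denominator
    return (mean(values) - mu) / standard_error, n - 1


# Toy values, for illustration only.
t, df = one_sample_t([1, 2, 3])
print(round(t, 4), df)  # 3.4641 2
```

The function assumes the values are not all identical; a sample with zero variance would lead to a division by zero.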

Discussion
In general, the presented results do not support intentionalism: the predictions of pure intentionalism and the Bad Intentions account have not been confirmed in Dem and Des conditions, where the pattern of responses is more consistent with the set of predictions of non-intentionalism. The situation, though, seems to be different in Sal conditions. Let us discuss our results in more detail.
Firstly, in the Dem condition, the results are fully consistent with non-intentionalism, which suggests that people firmly reject intentions as reference-determining factors whenever they stay in conflict with demonstrations. A similar effect, though a weaker one, can be observed in the Des condition. While the results here are definitely inconsistent with pure intentionalism, they are not so clear-cut from the viewpoint of a comparison between non-intentionalism and the Bad Intentions account. There is an asymmetry between responses in the Des conditions: while in the T-condition, a vast majority of respondents provided non-intentionalist evaluations (95%), "only" 56% of them gave such evaluations in the F-condition. In the latter condition, around 44% of the respondents answered FALSE. This may suggest that a certain group of people had intuitions consistent with King's theory and answered FALSE regardless of whether it was the T- or the F-condition, as they recognized reference failures in both these cases. On the other hand, this hypothesis is not supported by the data concerning the confidence levels. As we can see from Fig. 2, the subjects' responses were quite decisive in both Dem and Des conditions, which would be puzzling if the subjects in fact recognized reference failures in any of these cases. An alternative explanation is that the more ambiguous results of the F-conditions are, in general, a consequence of the more complex structure of the scenarios (in the scenarios with the F-condition, the speaker always made two mistakes: one concerning the referential act, the other the predicate ascription).
Let us now analyze the results of the Sal conditions. It seems that the results of these conditions are compatible with intentionalism. A closer inspection reveals, however, that such a conclusion is not entirely warranted. Firstly, for the Sal vignettes, we obtained very even distributions of responses, especially in the F-condition. For our compound variable (combined confidence and truth-value judgments), the mean does not differ significantly from 0 in the F-condition. It could be argued that this pattern of responses indicates that the respondents had difficulty in deciding whether the target utterance was true or false. Secondly, the results we obtained for the Sal conditions are non-robust. In an additional series of short experiments in which we used only the Dog Show setup, we established that the distribution of answers in the Sal case heavily depended on various factors, such as the order of questions, the formulation of the truth question, the amount of information given to justify the protagonist's error, the way of indicating the salient object, or the exact wording used to describe the protagonist's intention. The most likely explanation of this fact is that both salience and intentions play a role in determining the reference of complex demonstratives. In particular, the effect of salience may depend on the strength of the salience of a given object, and the level of salience necessary to overturn the speaker's intentions was simply not reached in the cases presented to the participants in our main study.
To sum up, we can formulate the conclusion that the speaker's intention is not a crucial factor in determining reference of complex demonstratives, as it is taken to be less relevant than demonstration and likely description by ordinary speakers. Still, the results suggest that intentions are sometimes relevant. Based on the results of the Sal conditions, we can conclude that people rely on the speaker's intention in their referential choice under certain circumstances. Furthermore, the results of Des conditions do not count decisively against the Bad Intentions account, which again suggests some role of the speaker's intentions in people's referential interpretations in this condition.

Do the results really challenge intentionalism?
The conclusion saying that intentionalism is wrong would be, however, too hasty. Firstly, one may cast doubt on whether the obtained data reflect anything about the semantics of complex demonstratives. Secondly, as it was pointed out, there are some versions of (impure) intentionalism which were not targeted by our study and so are left untouched by the presented criticism. This means that it is still possible that demonstrative reference is determined by some sorts of intentions. We will address these two issues in the current section.

Semantic significance of the results
One objection to the claim that our results challenge intentionalism is that the data we collected are not semantically significant, that is to say, the evaluations of the target statements made by the study participants may not concern the semantic content of these statements. But then the question arises what they actually concerned and what particular pragmatic factors interfered with the process of semantic evaluation of the target statements.
One possibility is that the study participants may focus on what the speaker means in the context presented by a vignette rather than what her or his words mean; in particular, the participants may confuse the so-called "speaker's referent" with the "semantic referent" (in the sense of Kripke, 1977). However, we do not think that this particular objection applies to our study. The objection says that the evaluation of a statement with a referential term may reflect judgments about the speaker's referent and not about the semantic value of that term. But the speaker's referent is standardly characterized as a particular object which the speaker has in mind on a given occasion. In other words, if there is something which can be identified as the speaker's referent of a demonstrative, this should be, in principle, the object which is the referent according to (pure) intentionalism. Consequently, the predictions of intentionalism overlap with the prediction that the study participants evaluated the speaker's meaning. Given that our findings are inconsistent with the intentionalist predictions, the hypothesis that the participants expressed judgments about the speaker's meaning cannot be true and thus intentionalism cannot use it for its defense.
Moreover, there are some studies whose results suggest that the influence of the speaker's meaning on folk judgments is not overwhelming, at least with respect to the truth-value judgment task of the sort used in our study. For instance, Machery et al. (2015) proposed several strategies for how to resist the potential effect of the speaker's meaning in a survey on reference of proper names and implemented them in their experimental study. As it turned out, the results did not differ much from the results of earlier studies which were open to the speaker's-referent objection. This suggests that people were not mostly guided by the speaker's meaning in the original study, either. Another piece of evidence is delivered by the study of Rostworowski and Pietrulewicz (2019) who, among other things, tested the prediction that people tend to agree with a false statement when the speaker nevertheless conversationally implicates the truth. They reported that people had a relatively small propensity to agree that such a statement was true (2019: 619-621). Importantly, all the aforementioned studies employed very similar question patterns to the one used in our questionnaire; therefore, the worry that people confuse semantic meaning with the speaker's meaning when asked to assess a given statement in terms of truth/falsity is probably overestimated.
Another possibility consistent with the conclusion that the results are semantically insignificant is that the respondents in our questionnaires took the perspective of the hearer and consequently refused to take the information about the speaker's intention into account in interpreting the sentence. That is to say, the study participants could identify themselves with the speaker's interlocutor featured in each story, who does not have access to the speaker's intention and can only rely on contextual clues (like a demonstration) or the explicit content of the statement in determining the proposition expressed by it. In particular, the way we formulated the probe question ("How would you evaluate … in the described situation?") could subtly suggest to the respondents that they should put themselves in the shoes of the speaker's interlocutor. This would mean for them that they should not include the information about the speaker's intention provided in the vignette in their interpretation and evaluation. If this hypothesis is correct, it seems that people actually evaluated the proposition which the interlocutor could reasonably take to be expressed and not the proposition which, according to intentionalism, is actually semantically expressed by the speaker. Hence, our experiment did not test the relevant judgments from the viewpoint of intentionalism.
We think that the suggested possibility is unlikely. Firstly, for a proponent of intentionalism the external factors such as demonstration, etc., are only clues to figure out the speaker's intention, and the intention itself fixes the reference. It would then be odd, from this perspective, if the study participants excluded the explicit information about the thing that is a relevant semantic fact, according to intentionalism, and based their interpretation on the factors that are in fact inessential. Secondly, if participants indeed took the perspective of the interlocutor and ignored the speaker's intention (and this was the reason why they gave evaluations inconsistent with intentionalism in the Dem and Des conditions), then a puzzle arises why they did not take the same perspective in the Sal conditions in the main study. As the study revealed, people mostly gave evaluations consistent with intentionalism in the Sal conditions, which means that they must have taken the information about the speaker's intention into account. The Sal vignettes were very similar to the ones with Dem and Des conditions (they contained similar background stories, were structured the same way, etc.) and the probe employed the same pattern of question, which makes it altogether very unlikely that people took essentially different perspectives in their interpretations across all these conditions. This means that the information about the speaker's intention was included by the study participants in the Dem and Des conditions as well, but they acknowledged that intention is less important than demonstration or description and did not treat it as a reference-determining factor this time.
Finally, in order to further investigate the possibility that the respondents understood the probe question as a question about how things looked from the perspective of the hearer, we carried out an exploratory study using somewhat different patterns of the questionnaire. We left the vignettes in the exact form they had in the original Study I, but employed new formulations of the main question suggesting to the respondents that they should evaluate the speaker's statement in more objective terms, or take into account all information provided in the vignette. We used four different formulations of the question: (i) "Given all the information, how do you evaluate X's statement?" (ii) "How would you evaluate the sentence in bold, given all the information?" (iii) "Is what X said true?" (iv) "Is what X said false?" What we discovered was that the results in the Dem and Des cases did not differ significantly from those obtained in the original version of Study I. Based on this, we can conclude that if the reformulated questions served their purpose and did not bias the respondents towards taking the hearer's perspective, such a bias was not significantly present in the original experiment, either.
In sum, we conclude that the claim that people took the hearer's perspective (and consequently ignored the information about the speaker's intention) is not persuasive. Also, intentionalism cannot be defended by claiming that people expressed their judgments about the speaker's reference. This leads to the conclusion that while one cannot for sure exclude a possibility that the study participants did not evaluate the semantic content of the sentence, the more concrete hypotheses about what kind of propositions they could evaluate instead are problematic. Hence, the objection that our results are not semantically significant does not have a solid ground.

Secondary referential intentions
The second major objection to the claim apparently supported by our experimental results (that intentions do not determine reference) points to the fact that not all versions of intentionalism have been targeted by our study and that these views may still be correct in light of the experimental data. Recall that these were positions like that of Bach or King's Best Laid Plans version of his coordination account. These views yield predictions about the cases investigated in our study which are indeed compatible with the results, as we will see in a moment. And perhaps the analysis of the experimental cases offered by the aforementioned theories can provide a more general strategy of defending intentionalism against our findings. However, we will argue that this strategy is problematic and has some limitations. The argument for the latter will be based on further empirical evidence to be presented in the next section.
As we have noted in Sect. 2, Bach recognizes different sorts of intentions of the speaker who uses a demonstrative and claims that only the intention that appeals to identification-basis features of the referent is reference-determining. It is instructive to illustrate this view with an example. For instance, in the Carnap/Agnew case, the speaker wants to talk about the picture of Carnap, but he also has another kind of intention, namely, an intention to talk about the picture in the range of his demonstration, i.e., the one currently hanging on the wall (as he presumes that this is Carnap's picture). It is the second intention which determines reference in this case, since "being a picture located in the range of the demonstration" constitutes a feature based on which a particular object can be easily identified by the audience in the presented context. Consequently, Bach's theory predicts that the speaker referred to Agnew's picture and his statement is false in the considered example. Hence, we may observe that Bach's theory gives exactly the prediction of the view according to which the demonstration in the context (presumably together with some further pragmatic factors) determines Agnew's picture as the referent. More generally, impure intentionalism in the style of Bach makes referential predictions which are in principle equivalent to the ones of non-intentionalist views. This is because it treats the intention which straightforwardly appeals to non-intentional "objective" factors (like being demonstrated etc.) as reference-determining.
In light of what has been said, it is not hard to figure out how a defender of Bach's impure intentionalism can explain the results obtained in Study I. Roughly, the speakers in our stories had a second kind of intention which determined the referent, and the study participants were guided by this second intention in their semantic evaluations, at least when it comes to the Dem and Des conditions. For convenience, we will use the notion of a "secondary" intention in order to identify the relevant reference-determining intention in each experimental case. (We borrow the term from Reimer (1992: 389); King (2013: 303) calls it a "controlling" intention.) Let us now examine in detail how this explanation works in the particular cases investigated in our study. We will first focus on the Dem condition (illustrating it, once again, with the Dog Show scenario):
Dog Show, Dem, T-condition: James and Olivia are attending a dog show. At the moment, two dogs are being presented in the ring: a red Irish Setter and a chocolate Labrador. Both dogs are running as they are doing some tricks. Olivia wants to point to the Labrador, but the dogs are moving so quickly that she accidentally points with her finger to the Setter while saying to James: "This dog is chocolate."
This case looks quite transparent for a defender of impure intentionalism: Olivia, who has an intention to talk about the Labrador, also gains another intention which serves to realise her referential goal, namely, an intention to talk about the dog which is in the direction of her demonstration right now, as she attempts to demonstrate the Labrador. Based on such a description, the audience is able to easily identify a particular dog (though not the one Olivia actually has in mind), that is, the Setter. In sum, Olivia can be attributed a secondary intention to refer to the Setter, and the study participants identified the referent according to this intention.
In a similar way, one can argue that the speaker has a secondary intention in the Des cases, though of a different kind. Let us first illustrate this case by one of our vignettes:
Game of Chess, Des, T-condition: Alice and Ralph are looking at an old set of wooden chess pieces. At the moment, Ralph is studying the figures of the king and the queen. The king is made of oak, while the queen is made of sycamore. However, Ralph confuses these two figures: he takes the king to be the queen and vice versa. Having the king in mind, he says to Alice: "This chess queen is made of oak."
In this example, Ralph wants to talk about the king and his intention is grounded in direct perception of this figure. But Ralph thinks that the perceived figure is a queen and that it can be identified as such by his audience, Alice. Thus Ralph can form an intention to refer to (whatever is) the chess queen in the set of chess pieces he is currently studying. This secondary intention is similar to the one in the Dem case in the sense that both intentions have a descriptive character, i.e., they determine an object purely based on some properties. In the Des case, the relevant property is the description from the nominal, or a certain enrichment of it.
In sum, impure intentionalism in the style of Bach can successfully explain what is going on in our experimental cases with Des and Dem conditions in a way fully consistent with the judgments of the study participants. Furthermore, one may wonder whether pure intentionalism can adopt the "strategy of secondary intentions". That is to say, a defender of pure intentionalism may agree that we can distinguish different referential intentions of the speaker and that when these intentions (unbeknown to the speaker) determine different objects on an occasion, one of them may "trump" the others. This is a position similar to the one presented by Speaks (2016). However, we believe that this strategy itself raises a crucial concern. A proponent of pure intentionalism should explain why the secondary intention actually trumps the primary one and not vice versa. This is essential for explaining our results in the Dem and Des conditions. Yet, it is ultimately hard to provide a systematic theoretical explanation of the trumping phenomenon (see Speaks, 2016, 2017). We should note here that impure intentionalism is in a better position than the pure one. Bach's claim that only certain kinds of intentions are reference-determining is justified in the framework of his theory of communication. On the other hand, if a defender of pure intentionalism adopts a secondary-intention strategy without a theoretical foundation and claims that the secondary, not the primary, intention fixes reference in the Dem and Des but not in the Sal cases, this sounds ad hoc.
Furthermore, explaining reference in terms of secondary intentions is still disputable, even if one appeals to the theory of Bach. According to him, a relevant referential intention is the one which presents an object in a way "identifiable" from the viewpoint of the audience. In other terms, the intention involves such features of the referent x which are accessible to the audience and which indicate that x should be the referent rather than anything else. But this raises the question why it is actually an intention which appeals to such properties, and not these properties themselves, which establishes that x is the referent. That is to say, if secondary intentions indicate the features of the referent based on which it can be identified anyway, the intentions themselves seem to be explanatorily redundant and, moreover, introduce extra complexity: if a simpler mechanism of referential interpretation is available, why then would people follow a more complex one, which additionally includes a step of reconstructing the speaker's intention? (For similar criticisms see Gauker, 2019, Sec. 5, or Stojnić et al., 2013.)
Finally, we have decided to shed some light on the issue of secondary intentions by using empirical means. We wanted to find out whether people recognize secondary intentions and, if yes, to what extent their evaluations of the statements in the relevant conditions can be explained by an appeal to such recognition. The results of our second study revealed that secondary intentions are generally not recognizable by the study participants. This finding is not per se a problem for intentionalism, since this view is not committed to the claim that people must actually recognize secondary intentions to the effect that they are ready to attribute them to the speakers. These intentions may guide the referential interpretations at a subconscious level. Yet, this result could be viewed as problematic in light of some additional considerations, and it further delimits the set of empirically adequate intentionalist positions.

Research hypotheses
In the second study, we concentrated on the cases in which secondary intentions could potentially play a role, i.e., the Dem and Des conditions. We wanted to test two hypotheses:
(A) People are inclined to attribute secondary intentions to the speakers.
(B) There should be a positive correlation between answering FALSE in the T-conditions (and answering TRUE in the F-conditions) and attributing secondary intentions to the speakers.
The motivation for Hypothesis (A) was to check whether the study participants recognize secondary intentions at all. In particular, we wanted to find out whether the participants attribute secondary intentions to a similar degree as primary intentions which should be clearly transparent to the participants in light of the information from the vignettes. The rationale behind the second hypothesis (B) was to test whether people who make referential interpretations not in line with the (pure) intentionalist predictions do it because they recognize secondary intentions. More specifically, the ones who prefer FALSE in the T-conditions and TRUE in the F-conditions make such evaluations because they recognize a secondary intention and thus attribute it to the speaker, which can be operationalized as Hypothesis (B).

Participants, materials and procedure
157 participants were recruited via Amazon Mechanical Turk to complete an online survey. The data from non-native speakers and subjects who failed the comprehension or attention tests were excluded. 116 participants remained (54 women, mean age = 39, SD = 12.2). The experimental design and materials of the second study were very similar to the first. We ran the study as a 2 × 2 mixed design. The number of factors in the between-subjects condition was reduced from three to two (after the exclusion of the Sal conditions). We introduced two changes to our experimental procedure. First, as we wanted to include an additional dependent variable (attributions of intentions), we introduced a screen on which questions about different intentions of the speaker were presented, e.g., "Do you agree that the following statement is true in light of the story: Ralph wanted to talk about the king." [primary intention] "Do you agree that the following statement is true in light of the story: Ralph wanted to talk about whichever figure as long as it is a queen." [secondary intention].
The participants were asked to respond to these questions on a visual analogue scale (slider) from -50 to 50, with ''Strongly disagree'' and ''Strongly agree'' as anchors (see Appendix C for the exact formulation of all questions). Such prompts should be understandable to competent language users and at the same time they captured the general spirit of the notion of a secondary intention, as the questions indicated a descriptive character of the intention.
The second modification concerned the scale in our confidence level question: it was changed from a Likert scale to a visual analogue scale. This adjustment was driven by the aim to make the responses of our participants more granular.

Results
Our results concerning the semantic judgments were replicated in the second study. The statistical procedure used to analyse categorical data was the same as in the first study (see Fig. 4). In both the Dem and Des conditions, a majority of participants said that the target utterance is false in the T-condition (Des: 71.7%, Dem: 61.7%) and that it is true in F-condition (Des: 72.5%, Dem: 82.2%).
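The categorical analysis in Study I relied on a binomial test against P = 0.5. A minimal sketch of an exact two-sided version of such a test is given below. The function name and the counts are hypothetical (the per-condition sample sizes are not stated here), and the two-sided p-value follows one common convention: summing the probabilities of all outcomes that are no more likely than the observed one.

```python
from math import comb


def binomial_test_two_sided(successes: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for H0: success probability == p.

    Hypothetical helper written for illustration only.
    """
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    probabilities = [
        comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)
    ]
    observed = probabilities[successes]
    # Small relative tolerance guards against floating-point ties.
    tail = sum(q for q in probabilities if q <= observed * (1 + 1e-7))
    return min(1.0, tail)


# Invented counts: 9 of 10 respondents choosing FALSE.
p_value = binomial_test_two_sided(9, 10)
```

A small p-value indicates that the split between TRUE and FALSE answers is unlikely under the assumption that respondents chose between the two options at random.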
The main finding of Study II is that people are inclined to recognize primary but not secondary intentions (see Fig. 6). The difference is large in the T-condition; in the F-conditions, it is somewhat less pronounced (see Table 4 for a detailed comparison). Small but statistically significant differences between the two types of contextual factors (Dem vs. Des) were observed. The differences between attributions of primary and secondary intentions were statistically significant in each sub-condition (two-sample t-test, p < 0.05).
No statistically significant correlation between attributions of primary intention and semantic judgments was observed. A statistically significant correlation between secondary-intention attribution and semantic judgment was discovered in all but one (Dem and F-condition) sub-condition (see Fig. 7). It is, however, important to note that in the T-conditions, the correlation had the opposite direction compared to the predictions of intentionalism (i.e., Hypothesis (B)).
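The kind of relationship examined here can be sketched as a correlation coefficient between intention attributions and the combined judgment variable. The text does not state which coefficient was used, so Pearson's r is an assumption of this sketch; the helper name and the data are likewise invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences.

    Hypothetical helper; the choice of Pearson's r is an assumption.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Invented data: secondary-intention attributions on the -50..50 slider
# and signed truth-value judgments for the same five respondents.
attributions = [-40, -10, 0, 20, 30]
judgments = [-30, -20, 10, 5, 35]
r = pearson_r(attributions, judgments)
```

Under Hypothesis (B), the sign of r in the T-conditions should be negative when judgments are coded so that FALSE answers are negative; a positive value there runs against the prediction.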

Discussion
As we have seen, people are not inclined to ascribe secondary intentions to the speakers in our vignettes. So, Hypothesis (A) is not confirmed. This still leaves a possibility that secondary intentions guide people's referential interpretations at a subconscious level. In light of this observation, the finding concerning Hypothesis (B) is not particularly significant; yet, for the sake of accuracy, let us note that it was not generally true that participants who identified the referent according to the non-intentionalist predictions tended to ascribe secondary intentions to a greater degree than those whose referential interpretations did not fit the aforementioned predictions. As we noted, the correlation predicted by Hypothesis (B) and the one actually observed completely differ in both T-conditions. As far as the F-conditions
The fact that neither of the two main hypotheses has been confirmed is not yet problematic for intentionalism. As we stressed, intentionalism does not require that users of language whose referential interpretations are guided by secondary intentions are aware of this fact and thus would tend to explicitly ascribe such intentions to speakers. Also, the mere asymmetry in recognitions of primary and secondary intentions can hardly be taken as evidence against intentionalism, since a defender of this view may hold that the two kinds of intentions have very different natures. However, the results may be viewed as troublesome for some defenders of intentionalism, namely, the ones who nevertheless would like secondary intentions to be of the kind recognizable by hearers, just like the primary intentions.
Furthermore, one can argue that competent language users should recognize that the appropriate reference-conditions are met when they perform a referential interpretation, and that this recognition is empirically verifiable; that is, people should somehow reflect in their behavior that they indeed track the speaker's intention and resolve reference according to it, and not some non-intentional factors. The indicated remark requires a much deeper discussion which exceeds the scope of this paper. What we can conclude for sure is that the most (any?) plausible version
Although the results do not match the earlier-formulated hypotheses (A) and (B), this pattern of responses should also be explained. The direction of correlation is positive across all sub-conditions and generally weak. One possible explanation of this fact is that it is an effect of the phenomenon, well known at least in the social sciences, called survey "satisficing" (cf. Krosnick, 1991). Simply put, some of our participants have a tendency to answer questions in a "socially desirable" way, which in this case could mean giving positive answers (both in the truth-value judgment task and the intention-attribution task in our study). This tendency, if present, would manifest itself as a weak correlation between these variables. This issue needs to be investigated further, but at the moment it seems to be the most plausible explanation for this pattern of responses. One could argue that if our account is correct, we would observe the same pattern as far as the attributions of primary intentions are concerned. We did not observe a similar pattern, but this can be easily explained by the ceiling effect: the attributions of primary intentions were too strong to generate these spurious correlations.
Fig. 7 Correlation between attributions of secondary intention and the combined truth-value judgment task

Alternative approaches to demonstrative reference
We have concluded so far that in light of our experimental results, only some particular versions of impure intentionalism are legitimate, if any. This conclusion leads to the question about the alternative: if intentionalism turns out to be wrong, what kind of a theory should we adopt instead? The aim of this paper is not to address this question, but we want to indicate what kinds of theoretical approaches find support in our results, apart from some specific versions of impure intentionalism.
Basically, our study has shown that each of the tested factors has some relevance in ordinary users' process of reference resolution, and these factors are very different: pointing gestures, descriptions contained in the expression, but also-as we suggested in our discussion in Sect. 3.4-the speakers' intentions and salience. These findings indicate that an adequate theory of demonstrative reference should regard a wide range of factors as potentially reference-determining. If this direction is correct, then the referring function of demonstratives is not uniform in the sense that there is no ''essence'' of reference-one particular factor which determines the referential connection in each case.
There are some theoretical proposals nowadays which can directly accommodate such conclusions, or at least make room for them. One proposal is by Nowak (2021) (see also Nowak and Michaelson, 2019) who rejects the presumption that there is a single notion of reference in the case of demonstratives, and so sentences containing them do not have ''canonical'' truth conditions. Rather than investigating reference simpliciter, we should focus on sorts of reference taken from different perspectives, e.g., of the communicative goal of the speaker, or the one related to what a listener is able to recover in the context. This approach is compatible with our findings as it assumes that different factors may affect people's referential interpretations of demonstratives depending on an occasion. It may happen that one sort of reference (e.g., the one related to the hearer's perspective) is more prominent to language users than another sort (like the one related to the speaker's communicative goal), which is the phenomenon reported in our study. Obviously, the details of such an explanation need to be worked out.
A second proposal is a hybrid approach similar to the account of Gauker (2008), according to which the referent of a demonstrative is the object that best satisfies various accessibility criteria. The point is that these criteria as a rule include very different factors, like salience (in the sense that the hearer can easily spot the object by looking around), prior mention, relevance (there should be a reason to talk about the object), charity (in the sense that the interpretation of a demonstrative yields a sensible interpretation of the whole sentence), etc. However, Gauker's list of accessibility criteria should be further elaborated in light of our empirical findings. Demonstrations are already on the list. This is rather uncontroversial: whenever a speaker is pointing to a certain object, this object (together with its closest environment) definitely becomes accessible. In a similar vein, we may conceive how the description works in the case of complex demonstratives. That is, the contained description is a tool for distinguishing certain object(s) by pointing out their specific feature. In other words, the description serves as an accessibility criterion, making all objects in the context which satisfy it better candidates for the referent than the ones which do not meet the description. 17 Finally, contra Gauker, we want to add the speaker's intentions to the list. We will not engage here in the dispute over whether the speaker's referential intentions are accessible or not. Gauker thinks they are not, yet some philosophers question his view (e.g., Michaelson, 2013: 30). Our hypothesis is that once we think of referential intentions in more objective terms (not merely as mental states but as relations between the speaker's mental state and an object in the world), we could be justified in thinking that they are nevertheless accessible to hearers. In particular, we may conceive referential intentions as sorts of causal relations (see Devitt, 1981). Since people are inclined to interpret events in the world as bearing causal connections to each other, they may be quite good at recognizing the speaker's intentions.
Whether or not the above hypothesis is on the right track, the speaker's intention may, in a sense, affect the ranking of accessible objects. The listener's mere supposition that the speaker has a particular object x in mind while using a demonstrative makes x accessible to the listener and thus a candidate for the referent. This would explain our results in the Sal conditions, where the explicit information about the speaker's intention was indeed treated as a relevant clue to reference by the study participants.
Altogether, the alternative approaches to demonstrative reference which seem to be adequate in light of our experimental findings are pluralistic or genuinely hybrid ones.

Summary and conclusions
In this paper, we presented two experimental studies investigating the role of the speaker's intention in determining the reference of complex demonstratives. Our main study (Study I) did not confirm the referential predictions of pure intentionalism and impure intentionalism (i.e., King's Bad Intentions). We observed that people were not guided by the speaker's intention in assigning a referent to a demonstrative; they regarded demonstrations and most likely the descriptive content of an expression to be more relevant. We next considered an objection stating that our results are semantically insignificant, an objection which appealed to the speaker's reference or the hearer's perspective, and argued that it is not persuasive. We noted, however, that some versions of impure intentionalism which introduce secondary referential intentions (e.g., Bach's theory or King's Best Laid Plans) are fully compatible with the results of Study I. An additional piece of empirical evidence, provided by our second experiment (Study II), indicated that secondary intentions are nonetheless unrecognizable by ordinary speakers. Altogether, the two experiments showed that intentionalism captures the way people understand complex demonstratives only if it incorporates essential refinements and accepts some limitations, in particular, that the relevant kind of intention acts at a hidden level. In the last section, we suggested a pluralistic or a hybrid account of reference as the alternatives.
Whether the suggested approaches are better than intentionalism depends, of course, on several theoretical issues that are not taken up in this paper. However, since empirical adequacy is one of the relevant respects in which a semantic theory should be evaluated, we believe that our studies contribute important evidence to the debate about the role of the speaker's intentions in demonstrative reference.

17 Some theorists claim that the nominal does not contribute to the semantics of complex demonstratives but only helps the hearer to realize which particular object is talked about (see Larson and Segal, 1995). In our view, this may be partially correct. The nominal makes some object(s) accessible to a hearer, but the accessibility ranking eventually determines the semantic referent, according to the presented theory. So, the nominal indirectly contributes to the semantics of complex demonstratives.
the Setter and vice versa. Having the Labrador in mind, she says to James: “This Setter is chocolate.”
Dog Show, Des, F-condition: James and Olivia are attending a dog show. At the moment, two dogs are being presented in the ring: a red Irish Setter and a chocolate Labrador. However, Olivia confuses these two breeds: she takes the Labrador to be the Setter and vice versa. Moreover, she does not recognize their colors properly and thinks that the Labrador is red. Having the Labrador in mind, she says to James: “This Setter is red.”
Dog Show, Sal, T-condition: James and Olivia are attending a dog show and admiring various breeds. Firstly, they look at a crate with a cute chocolate Labrador. Then they go further and find beautiful red Irish Setters. One of the Setters is very playful and they stay for a while, watching as the dog is chasing its own tail. At one moment, Olivia says thinking about the Labrador: “This dog is chocolate,” while James is petting the playful red Setter in front of her.
Dog Show, Sal, F-condition: James and Olivia are attending a dog show and admiring various breeds. Firstly, they look at a crate with a cute chocolate Labrador. Olivia mistakenly thinks that this kind of dog is “red”. Then she and James go further and find beautiful red Irish Setters. One of the Setters is very playful and they stay for a while, watching as the dog is chasing its own tail. At one moment, Olivia says thinking about the Labrador: “This dog is red,” while James is petting the playful red Setter in front of her.
Wedding Cakes, Dem, T-condition: Jack and Emily are on a wedding cake tasting. At the moment, two cakes are being presented to them on a rotating display: a sour cherry cake and a sweet raspberry cake. Jack tastes both cakes. He wants to point to the cherry one, but as the cakes are constantly moving on the display, he accidentally points with his finger to the raspberry cake while saying to Emily: “This cake is a bit sour.”
Wedding Cakes, Dem, F-condition: Jack and Emily are on a wedding cake tasting. At the moment, two cakes are being presented to them on a rotating display: a sour cherry cake and a sweet raspberry cake. Jack tastes both cakes. However, he cannot tell the difference in their flavors. Jack wants to point to the cherry cake, but as the cakes are constantly moving on the display, he accidentally points with his finger to the raspberry one while saying to Emily: “This cake is sweet.”
Wedding Cakes, Des, T-condition: Jack and Emily are on a wedding cake tasting. At the moment, they are trying two flavors: a sour cherry cake and a sweet raspberry cake. Jack tastes both cakes. However, he confuses the flavors: he takes the cherry cake to be a raspberry one and vice versa. Having the cherry cake in mind, he says to Emily: “This raspberry cake is a bit sour.”
Wedding Cakes, Des, F-condition: Jack and Emily are on a wedding cake tasting. At the moment, they are trying two flavors: a sour cherry cake and a sweet raspberry cake. Jack tastes both cakes. However, he confuses the flavors, taking the cherry cake to be the raspberry one and vice versa. Moreover, he does not realize that one of the cakes tastes sour. Having the cherry one in mind, he says to Emily: “This raspberry cake is sweet.”
Wedding Cakes, Sal, T-condition: Emily and Jack are on a wedding cake tasting. They are trying various flavors, one at a time. Firstly they taste a cherry cake. The cake is a bit sour, but neither of them realize it at the moment. Then the cherry cake is replaced with a raspberry one. The raspberry cake looks very delicious and is decorated with fruits and whipped cream. Emily tries the raspberry cake first and it turns out to be very sweet. At this moment, Jack says thinking about the cherry cake: “This cake is a bit sour,” while Olivia serves him a piece of the raspberry cake.
Wedding Cakes, Sal, F-condition: Emily and Jack are on a wedding cake tasting. They are trying various flavors, one at a time. First they taste a cherry cake. The cake is a bit sour, but Jack does not realize it and thinks that the cake is sweet. Then the cherry cake is replaced with a raspberry one. The raspberry cake looks very delicious and is decorated with fruits and whipped cream. Emily tries the raspberry cake first and it turns out to be very sweet. At this moment, Jack says thinking about the cherry cake: “This cake is sweet,” while Olivia serves him a piece of the raspberry cake.
Game of Chess, Dem, T-condition: Alice shows to Ralph an old set of wooden chess pieces. At this moment, Alice is holding both the chess king and the chess queen. The king is made of oak, while the queen is made of sycamore. Having noticed the difference, Ralph wants to point to the king, but Alice moves the figures so fast that he accidentally points with his finger to the queen while saying: “This chess figure is made of oak.”
Game of Chess, Dem, F-condition: Alice shows to Ralph an old set of wooden chess pieces. At this moment, she is holding both the king and the queen. The king is made of oak, while the queen is made of sycamore. However, Ralph does not realize that they are made of different kinds of wood. He wants to point to the queen, but Alice moves the figures so fast that he accidentally points with his finger to the king while saying: “This chess figure is oaken.”
Game of Chess, Des, T-condition: Alice and Ralph are looking at an old set of wooden chess pieces. At the moment, Ralph is studying the figures of the king and the queen. The king is made of oak, while the queen is made of sycamore. However, Ralph confuses these two figures, namely, he takes the king to be the queen and vice versa. Having the king in mind, he says to Alice: “This chess queen is made of oak.”
Game of Chess, Des, F-condition: Alice and Ralph are looking at an old set of wooden chess pieces. At the moment, Ralph is studying the figures of the king and the queen. The king is made of oak while the queen is, in turn, made of sycamore. However, Ralph confuses these two figures: he takes the king to be the queen and vice versa. Moreover, he doesn't realize that they are made of different kinds of wood. Having the chess queen in mind, he says to Alice: “This chess king is made of oak.”
Game of Chess, Sal, T-condition: Alice and Ralph are examining an old wooden set of chess pieces. First they look at the chess king which is an oaken figure. Then Alice puts the king aside and takes the figure of the queen. The queen is actually made of sycamore; it is large and elegantly carved and both Ralph and Alice are admiring the style of the figure. At this moment, Ralph says thinking about the king: “This chess figure is made of oak,” while Alice is examining only the queen in front of him.
Game of Chess, Sal, F-condition: Alice and Ralph are examining an old wooden set of chess pieces. First they look at the chess king which is an oaken figure. Ralph mistakenly thinks that it is made of sycamore. Then Alice puts the king aside and takes the figure of the queen. The queen is actually made of sycamore. It is large and elegantly carved and both Ralph and Alice are admiring the style of the figure. At this moment, Ralph says thinking about the king: “This chess figure is made of sycamore,” while Alice is examining only the queen in front of him.

Appendix B
See Table 5.

Appendix C
Dog Show, Dem, both T- and F-condition: (a) Do you agree that the following statement is true in light of the story: Olivia wanted to talk about the Labrador. (b) Do you agree that the following statement is true in light of the story: Olivia wanted to talk about whichever dog she pointed at.
Dog Show, Des, both T- and F-condition: (a) Do you agree that the following statement is true in light of the story: Olivia wanted to talk about the Labrador. (b) Do you agree that the following statement is true in light of the story: Olivia wanted to talk about whichever dog as long as it is a Setter.
Wedding Cakes, Dem, both T- and F-condition: (a) Do you agree that the following statement is true in light of the story: Jack wanted to talk about the cherry cake. (b) Do you agree that the following statement is true in light of the story: Jack wanted to talk about whichever cake he pointed at.
Wedding Cakes, Des, both T- and F-condition: (a) Do you agree that the following statement is true in light of the story: Jack wanted to talk about the cherry cake. (b) Do you agree that the following statement is true in light of the story: Jack wanted to talk about whichever cake as long as it is a raspberry cake.
Game of Chess, Dem, both T- and F-condition: (a) Do you agree that the following statement is true in light of the story: Ralph wanted to talk about the king. (b) Do you agree that the following statement is true in light of the story: Ralph wanted to talk about whichever figure he pointed at.
Game of Chess, Des, both T- and F-condition: (a) Do you agree that the following statement is true in light of the story: Ralph wanted to talk about the king. (b) Do you agree that the following statement is true in light of the story: Ralph wanted to talk about whichever figure as long as it is a queen.