Strategy of testing intentionalism
A reasonable strategy for testing intentionalism is to focus on cases in which the referential choices of language users may diverge from those predicted by intentionalism. Such cases can be constructed naturally by introducing another potentially reference-determining factor that indicates a different candidate referent from the one fixed by the speaker’s intention, as in the previously discussed examples of Kaplan and Gauker. We selected three such factors.
The first is an act of demonstration. If a speaker makes a gesture of pointing while using a demonstrative, then the referent should be located somewhere in the direction indicated by that gesture. In light of the conversational context and some common-sense assumptions, a demonstration is usually enough to determine a unique object as the referent. Thus the speaker’s act of demonstration may be regarded as a reference-determining factor which operates independently of his/her intention. This proposal is obviously rooted in some classical theories of demonstrative reference, such as the one presented by Kaplan (1989a), according to which the “true” demonstratives are semantically incomplete and require a form of demonstration in order to successfully refer (see also Roberts, 2002).
The second factor is description. That is to say, we may regard the property expressed by a demonstrative (in particular, the one expressed by the nominal phrase in a complex demonstrative, like “black cat” in “that black cat”) as reference-determining whenever there is only one object satisfying the property in the context. Although complex demonstratives, i.e., expressions of the form “this F” or “that F”, are often used when there is another F in the context, this rule has exceptions. Hence, the description itself may determine a particular object provided we assume a contextual domain restriction (Footnote 6).
The third factor is salience. This is a broad category that includes demonstration, among other things, but is not limited to it. Objects may be salient in virtue of very different features, e.g., they occupy a certain spatial position with regard to the speaker, or they have previously been mentioned in the utterance (cf. Lewis, 1979: 348–350). Basically, the idea is that the referent of a demonstrative used at a given stage of a discourse is the most salient object at that stage. Several theorists advocate accounts of this sort for demonstratives (e.g., Mount, 2008; see also Gauker, 2008; Newen, 1998; Wettstein, 1984). It is usually assumed that salience is a gradable feature, so that objects which gain salience from different sources can be compared in terms of their levels of salience and ranked. The aforementioned theorists say little about how these mechanisms work in detail; however, the applications of the theory are straightforward and intuitively clear in ordinary cases. For instance, consider the previously quoted example of Gauker: Harry is searching for a tie, and at the moment he tries on a pink-and-green one Sally says, “That matches your new jacket.” The pink-and-green tie is definitely the most salient one at the moment of Sally’s utterance. In our experimental study we used scenarios similar to Gauker’s example, in which it is easy to identify the most salient object, and maximal salience is gained in virtue of the object’s location, its status in a series of events, or the conversational focus.
In total, we chose three possibly reference-determining factors: demonstration (Dem), the descriptive content (Des), and salience (Sal). Each tested condition was considered to be a mismatch case in which the speaker’s intention determines one object, while Dem / Des / Sal determines another object as the referent.
We also considered it important to test the predictions of intentionalism in two different types of conditions. In the first, the object determined by the intention had the property ascribed by the speaker, and the object determined by an alternative factor (e.g., Dem) did not. This case was thus similar to Kaplan’s example, where the picture actually demonstrated by the speaker did not present a great twentieth-century philosopher, in contrast to the picture the speaker had in mind. However, we also included the inverse situation in our experimental design: the object determined by the intention did not have the ascribed property, and the object determined by Dem, Des or Sal had it. This enabled us to obtain more diagnostic results and test the predictions of intentionalism in a complementary way. Henceforth, let us refer to the first case as “T-condition” and to the second as “F-condition”. (To clarify: the capital letter T/F always indicates the status of the object determined by intention, i.e., whether the statement comes out true or false if, as intentionalism predicts, that object is the referent.)
Because we included the Des condition, we focused on complex demonstratives as our case study; strictly speaking, the demonstrative in each condition had the form “this F”. Naturally, further empirical research is needed with regard to other kinds of demonstratives (including those which contain “that”). Nonetheless, we believe that the most general conclusions of our study will apply to all kinds of demonstratives, including bare ones like “this”. As “this F” contains “this” as its component, it is unlikely that these two expressions differ essentially in the way in which they refer to objects. Hence, if it turns out that intentions do (or do not) determine reference in the case of complex demonstratives, we are justified in believing that they are likewise important (or unimportant) for bare demonstratives.
Predictions of intentionalism and non-intentionalism
The predictions of pure intentionalism are quite straightforward: people should tend to evaluate a statement with a demonstrative as “true” in the T-conditions and as “false” in the F-conditions. At the very least, their preference for affirmative judgments should be substantially stronger in the T-conditions than in the F-conditions. The predictions of King’s Bad Intentions view (impure intentionalism) are more complex. This theory allows for the possibility of reference failure when the object fixed by the speaker’s intention cannot be properly identified by a hearer. As we said at the end of Sect. 2, the cases investigated in our study seem to involve a failed reference from the viewpoint of the Bad Intentions account. This is because the true referential intention of the speaker cannot be grasped by an ideal audience when a factor like Dem, Des, or Sal indicates that the speaker wants to talk about a different object. A failure of reference in a subject-predicate sentence means that the sentence does not express a determinate proposition, so the study participants should opt for neither “true” nor “false” in their evaluation. However, one cannot exclude the possibility that even if the participants recognize a reference failure, they still prefer definite answers and so opt for falsity (it is rather improbable that they would say that the statement is “true” in such a case; see Footnote 7). In sum, King’s Bad Intentions account is compatible with a pattern in which respondents choose “false”, or neither “false” nor “true”, across T-conditions and F-conditions, as both types of conditions likely involve a reference failure from the viewpoint of this account (Table 1).
It is instructive to present the predictions of the considered versions of intentionalism in a table and contrast them with those which clearly count against these accounts:
Intentionalism—in any version—is inconsistent with a pattern of results in which “true” dominates in the F-conditions and at the same time “false” dominates in the corresponding T-conditions (or with a pattern in which the preference for affirmative judgment is stronger in the F-conditions than in the T-conditions). We call the view which generates the indicated set of predictions “non-intentionalism”, as the predicted pattern of results strongly suggests that people fix reference according to an alternative factor (like Dem or Des) and not according to the speaker’s intention. Also, let us note that the introduction of the F-conditions allowed us to create an experimental design yielding different sets of predictions for each theoretical approach, and thus providing a possibility of making a diagnostic comparison between these approaches. The predictions of non-intentionalism in the F-conditions differ from the ones of pure intentionalism and of King’s Bad Intentions account at the same time.
Finally, we should make a brief comment about the cases with the Des condition. Here, one may suspect that the predictions of intentionalism depend on the view about the semantic role of the noun phrase in a complex demonstrative. Opinions differ widely on what that role may be; nevertheless, we need to emphasize that no matter what role “F” actually plays in the semantics of “The F is G”, once we agree that intentions determine reference, we basically end up with a set of predictions from the left-hand or the centre column but not from the right-hand column (Footnote 8).
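As an illustration, the three sets of predictions described in this section can be encoded and checked against an observed pattern of dominant answers. This is a minimal sketch reconstructed from the prose only: the dictionary `PREDICTIONS`, the function `compatible_views`, and the token `"NEITHER"` (standing for a judgment that is neither “true” nor “false”) are illustrative names and do not come from the study materials.

```python
# Predicted dominant answers per view, reconstructed from the prose above.
# View labels and the "NEITHER" token are illustrative assumptions.
PREDICTIONS = {
    "pure intentionalism": {"T": {"TRUE"}, "F": {"FALSE"}},
    "Bad Intentions": {"T": {"FALSE", "NEITHER"}, "F": {"FALSE", "NEITHER"}},
    "non-intentionalism": {"T": {"FALSE"}, "F": {"TRUE"}},
}


def compatible_views(dominant_t, dominant_f):
    """Return the views whose predictions allow the observed dominant answers."""
    return [
        view
        for view, pred in PREDICTIONS.items()
        if dominant_t in pred["T"] and dominant_f in pred["F"]
    ]


print(compatible_views("FALSE", "TRUE"))   # ['non-intentionalism']
print(compatible_views("TRUE", "FALSE"))   # ['pure intentionalism']
print(compatible_views("FALSE", "FALSE"))  # ['Bad Intentions']
```

The sketch makes the diagnostic role of the F-conditions visible: a FALSE-dominated T-condition alone is compatible with two views, and only the answer in the F-condition separates them.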
Materials and design
The study had the form of a questionnaire in which participants were asked to read short fictional stories (“vignettes”) and answer some questions about them. We developed three kinds of mismatch situations that differed in their setups and plots: Dog Show, Wedding Cakes and Game of Chess (see Appendix A for the materials). The difference across setups was not predicted to be significant and served merely as a robustness check. Each vignette described a situation in which someone used “this F” as the grammatical subject of a simple statement made in a conversational setting (i.e., there was always an interlocutor to whom the utterance with a demonstrative was addressed). As a rule, the background story provided two different candidates for the semantic value: the object x that the speaker intended to talk about using “this F” and the object y that was determined differently in each version of the story. Here is a sample of the Dog Show vignette in all three conditions (Dem, Des, and Sal), where the object determined by the intention satisfied the ascribed property (T-condition):
Dog Show, Dem, T-condition: James and Olivia are attending a dog show. At the moment, two dogs are being presented in the ring: a red Irish Setter and a chocolate Labrador. Both dogs are running, as they are doing some tricks. Olivia wants to point to the Labrador, but the dogs are moving so quickly that she accidentally points with her finger to the Setter while saying to James: “This dog is chocolate.”
Dog Show, Des, T-condition: James and Olivia are attending a dog show. At the moment, two dogs are being presented in the ring: a red Irish Setter and a chocolate Labrador. However, Olivia confuses these two breeds—she takes the Labrador to be the Setter and vice versa. Having the Labrador in mind, she says to James: “This Setter is chocolate.”
Dog Show, Sal, T-condition: James and Olivia are attending a dog show and admiring various breeds. Firstly, they look at a crate with a cute chocolate Labrador. Then they go further and find beautiful red Irish Setters. One of the Setters is very playful and they stay for a while, watching as the dog is chasing its own tail. At one moment, Olivia says thinking about the Labrador: “This dog is chocolate,” while James is petting the playful red Setter in front of her.
In sum, we employed a 2 × 3 mixed experimental design with one within-subject factor (condition: Dem, Sal, or Des) and one between-subject factor (T-condition vs. F-condition). This made six tested conditions: Dem, Des, and Sal in both T and F versions.
Participants and procedure
For Study I, 371 participants were recruited on Amazon Mechanical Turk to complete an online questionnaire. They were paid a small fee for their participation. Non-native speakers and subjects who failed the attention or comprehension check were excluded. The final sample consisted of 290 subjects (150 women, mean age = 39.6, SD = 12.7) (Footnote 9).
Each participant was randomly assigned to one between-subject condition (T-condition or F-condition) and received three vignettes with different setups, one for each within-subject condition (Dem, Des and Sal cases). The within-subject conditions were counterbalanced across vignettes in such a way that every participant received one vignette for each condition without repeating vignettes. The order of presentation was randomized. For each experimental condition, three separate screens were presented with the following questions:
Categorical comprehension question (e.g., “Which dog is chocolate?” with two possible answers to choose from: “Labrador” and “Irish Setter”);
Question regarding the truth-value of a protagonist’s utterance (e.g., “How would you evaluate Olivia’s statement in the described situation?”) followed by three choices: TRUE, FALSE and CANNOT SAY (Footnote 10);
Question on confidence level regarding the answer to the truth-value question (e.g., “To what extent do you agree that what Olivia said was true?” in cases in which the participants answered “true” to the previous question) followed by a 5-point scale ranging from −2 (marked with “strongly disagree”) to +2 (“strongly agree”); see Footnote 11.
The participants were unable to go back to the previous vignette or the previous screen, and the story remained at the top of the screen throughout. Before answering the questions in the experimental conditions, the participants completed a short demographic survey; this survey also preceded testing in our second experiment.
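The assignment procedure described above can be sketched in a few lines of Python. The text does not say which counterbalancing scheme was used, so the rotating (Latin-square) pairing below is an assumption, as are the function name `assign` and the use of a seeded random generator.

```python
import random

SETUPS = ["Dog Show", "Wedding Cakes", "Game of Chess"]
CONDITIONS = ["Dem", "Des", "Sal"]


def assign(participant_index, rng):
    """Pair each setup with one condition using a rotating (Latin-square) scheme.

    Assumed scheme: the actual counterbalancing method is not specified.
    """
    shift = participant_index % len(CONDITIONS)
    rotated = CONDITIONS[shift:] + CONDITIONS[:shift]
    version = rng.choice(["T", "F"])   # between-subject factor
    trials = list(zip(SETUPS, rotated))
    rng.shuffle(trials)                # randomized order of presentation
    return version, trials


version, trials = assign(4, random.Random(0))
print(version, trials)
```

Each participant thus sees every setup once and every within-subject condition once, and across three consecutive participant indices every setup is paired with every condition.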
We conducted separate analyses for each of our experimental conditions (Des, Dem and Sal). For the analysis discussed here, we excluded CANNOT SAY answers to the truth-judgment question from the data, because these answers constituted only a small minority of responses (about 10% across all experimental conditions). A detailed table containing all the data is included in Appendix B.
In the Dem conditions, a majority of participants provided answers that were incompatible with the predictions of intentionalism (see Fig. 1). In the T-condition, where (pure) intentionalism predicted TRUE, a majority of participants evaluated the speaker’s utterance as FALSE (78.2%, binomial test (P = 0.5): p < 0.05). In the F-condition, the participants tended to judge it as TRUE while intentionalism predicted FALSE (63.8%, binomial test (P = 0.5): p < 0.05). We also observed a statistically significant difference between answers in the T-condition and the F-condition (χ² = 43.32, p < 0.001).
A similar pattern of responses was observed in the Des conditions. An overwhelming majority of participants assessed the target utterance as FALSE in the T-condition (95.2%, binomial test (P = 0.5): p < 0.01). In the F-condition, the answers were more divided. A majority of participants said that the target utterance was true (56%), but this difference did not reach statistical significance. However, a comparison between T-condition and F-condition yielded a statistically significant difference (χ² = 82.61, p < 0.001), which suggests that the referential intuition of the respondents was guided more by description than by intention.
The situation is different when it comes to the scenarios with the Sal conditions. The results are, in principle, compatible with the predictions of intentionalism (67.9% of TRUE answers in T-condition and 61.2% of FALSE answers in F-condition, χ² = 20.09, p < 0.001). However, during our research, we identified several factors which strongly influence responses in this condition. (A further discussion is in Sect. 3.4.)
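For readers who wish to see how tests of this kind work, the following sketch implements an exact two-sided binomial test against P = 0.5 and a Pearson chi-square test for a 2 × 2 table using only the Python standard library. The counts are hypothetical and are not the study’s data (the raw frequencies are in Appendix B), and the absence of a continuity correction is an assumption, since the text does not state which variant of the chi-square test was used.

```python
from math import comb, erfc, sqrt


def binomial_two_sided(k, n):
    """Exact two-sided binomial test of k successes in n trials against P = 0.5."""
    tail = max(k, n - k)
    p = 2 * sum(comb(n, i) for i in range(tail, n + 1)) / 2 ** n
    return min(1.0, p)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]], df = 1."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(stat / 2))  # survival function of chi-square with 1 df
    return stat, p


# Hypothetical counts, NOT the study's data.
# Rows: T-condition, F-condition; columns: TRUE, FALSE.
t_true, t_false = 30, 100
f_true, f_false = 80, 50

print(binomial_two_sided(t_false, t_true + t_false))
print(chi_square_2x2(t_true, t_false, f_true, f_false))
```

The binomial test asks whether one answer dominates within a single condition, whereas the chi-square test asks whether the distribution of TRUE and FALSE answers differs between the T-condition and the F-condition.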
We also conducted a more sophisticated statistical procedure in which we combined the two dependent variables present in our study: the truth-value judgment and the confidence ratings. For each answer we computed the “logical value index” in the following way: for TRUE answers we took the confidence rating as the value, and for FALSE answers we reversed the sign of the confidence rating (i.e., “2” on the scale becomes “−2”). This procedure allowed us to carry out a more fine-grained analysis of the answers. We ran a mixed analysis of variance on the confidence ratings transformed in the described manner. The ANOVA revealed that both of our experimental factors were highly statistically significant (T-condition vs. F-condition: F(1, 288) = 24.2, p < 0.001; Dem vs. Des vs. Sal: F(2, 576) = 28.3, p < 0.001). The interaction was also statistically significant (F(2, 576) = 80.1, p < 0.001), which suggests that different types of contextual factors influence the truth-value judgments in different ways.
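The transformation into the logical value index can be written out as follows. This is only a sketch of the rule as stated in the text: the function name `logical_value_index` and the sample responses are hypothetical, and because the text does not say how CANNOT SAY answers enter this analysis, leaving them uncoded (`None`) is an assumption.

```python
def logical_value_index(answer, confidence):
    """Signed confidence: rating as given for TRUE, sign reversed for FALSE."""
    if answer == "TRUE":
        return confidence
    if answer == "FALSE":
        return -confidence
    return None  # assumption: CANNOT SAY answers are left uncoded


# Hypothetical responses, NOT the study's data: (truth-value answer, rating from -2 to +2).
responses = [("TRUE", 2), ("FALSE", 2), ("FALSE", 1), ("TRUE", 1), ("CANNOT SAY", 0)]

indices = [logical_value_index(a, c) for a, c in responses]
coded = [v for v in indices if v is not None]
print(indices)                   # [2, -2, -1, 1, None]
print(sum(coded) / len(coded))   # 0.0
```

On this coding, a mean near +2 indicates confident TRUE judgments, a mean near −2 indicates confident FALSE judgments, and a mean near 0 indicates divided or hesitant responses.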
Two types of post hoc analyses were conducted. First, we compared the means of the T-condition and the F-condition separately for each within-subject condition (Des, Dem and Sal, see Fig. 2 and Table 2). In all sub-conditions, the difference was highly statistically significant (two-sample t-test, p < 0.001). Additionally, we ran a separate 3 × 3 × 2 ANOVA in which we included the story setup as an additional factor (Dog Show, Wedding Cakes, Game of Chess). We did not find any significant effect of the story setup (p > 0.05).
The second type of analysis was performed to check whether the means in each of our sub-conditions differed from those predicted by chance (see Fig. 3 and Table 3). If the participants were responding randomly, the expected mean would equal 0. All but one mean (the Sal scenario in the F-condition) differed significantly from 0 (one-sample t-test).
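The comparison against chance can be illustrated with a one-sample t statistic computed on the logical value index. The values below are hypothetical and are not the study’s data; the function name `one_sample_t` is illustrative. The sketch returns only the statistic and the degrees of freedom; a p-value would be obtained from the t distribution with n − 1 degrees of freedom, for example with a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(values) - mu0) / standard_error, n - 1


# Hypothetical index values, NOT the study's data.
sample = [2, 1, -1, 2, 1, 0, 2, -2, 1, 2]
print(one_sample_t(sample))
```

A large absolute t value indicates that the mean index is reliably different from 0, that is, that participants leaned systematically towards TRUE or towards FALSE rather than answering at random.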
In general, the presented results do not support intentionalism: the predictions of pure intentionalism and the Bad Intentions account have not been confirmed in Dem and Des conditions, where the pattern of responses is more consistent with the set of predictions of non-intentionalism. The situation, though, seems to be different in Sal conditions. Let us discuss our results in more detail.
Firstly, in the Dem condition, the results are fully consistent with non-intentionalism, which suggests that people firmly reject intentions as reference-determining factors whenever they conflict with demonstrations. A similar, though weaker, effect can be observed in the Des condition. While the results here are definitely inconsistent with pure intentionalism, they are less clear-cut when it comes to comparing non-intentionalism with the Bad Intentions account. There is an asymmetry between responses in the Des conditions: while in the T-condition a vast majority of respondents provided non-intentionalist evaluations (95%), “only” 56% of them gave such evaluations in the F-condition. In the latter condition, around 44% of the respondents answered FALSE. This may suggest that a certain group of people had intuitions consistent with King’s theory and answered FALSE regardless of whether it was the T- or the F-condition, as they recognized reference failures in both cases. On the other hand, this hypothesis is not supported by the data concerning the confidence levels. As we can see from Fig. 2, the subjects’ responses were quite decisive in both Dem and Des conditions, which would be puzzling if the subjects in fact recognized reference failures in any of these cases. An alternative explanation is that the more ambiguous results of the F-conditions are, in general, a consequence of the more complex structure of the scenarios (in the scenarios with the F-condition, the speaker always made two mistakes: one concerning the referential act, the other concerning the predicate ascription).
Let us now analyze the results of the Sal conditions. At first glance, the results of these conditions seem compatible with intentionalism. Closer inspection, however, reveals that such a conclusion is not entirely warranted. Firstly, for the Sal vignettes, we obtained very even distributions of responses, especially in the F-condition. For our compound variable (combined confidence and truth-value judgments), the mean does not differ significantly from 0 in the F-condition. It could be argued that this pattern of responses indicates that the respondents had difficulty in deciding whether the target utterance was true or false. Secondly, the results we obtained for the Sal conditions are not robust. In an additional series of short experiments in which we used only the Dog Show setup, we established that the distribution of answers in the Sal case depended heavily on various factors, such as the order of questions, the formulation of the truth question, the amount of information given to justify the protagonist’s error, the way of indicating the salient object, and the exact wording used to describe the protagonist’s intention. The most likely explanation of this fact is that both salience and intentions play a role in determining the reference of complex demonstratives. In particular, the effect of salience may depend on how salient a given object is, and the level of salience necessary to override the speaker’s intentions was simply not reached in the cases presented to the participants in our main study.
To sum up, we conclude that the speaker’s intention is not a crucial factor in determining the reference of complex demonstratives, as ordinary speakers take it to be less relevant than demonstration and, likely, description. Still, the results suggest that intentions are sometimes relevant. Based on the results of the Sal conditions, we can conclude that people rely on the speaker’s intention in their referential choice under certain circumstances. Furthermore, the results of the Des conditions do not count decisively against the Bad Intentions account, which again suggests that the speaker’s intentions play some role in people’s referential interpretations in this condition.