Automatic mental simulation in native and non-native speakers

van Zuijlen, Samuel J. A.; Singh, Sharon; Gunawan, Kevin; Pecher, Diane; Zeelenberg, René

doi:10.3758/s13421-024-01533-8

Open access
Published: 14 February 2024

Memory & Cognition

Abstract

Pictures of objects are verified faster when they match the implied orientation, shape, and color in a sentence-picture verification task, suggesting that people mentally simulate these features during language comprehension. Previous studies had an unintended correlation between match status and the required response, which may have influenced participants’ responses by eliciting strategic use of this correlation. We removed this correlation by including color-matching filler trials and investigated if the color-match effect was still obtained. In both a native sample (Experiment 1) and a non-native sample (Experiment 2), we found strong evidence for a color-match advantage on median reaction time and error rates. Our results are consistent with the view that color is automatically simulated during language comprehension as predicted by the grounded cognition framework.

Introduction

If we write our friend about the new car we bought, how does our friend know exactly what we mean when we say “car”? How is this meaning of a car represented in her mind when she comprehends language relating to an object? The world comprises many objects, and when we read or talk about them, we almost instantly understand what is meant and what the most salient features of the object are (given that we have experienced the object before). Because language comprehension plays such a big part in how humans come to understand and interact with other people, it is important to understand its mechanisms. A question that has interested cognitive scientists is whether perceptual features of objects are represented by readers when they comprehend language. According to grounded cognition theories, the features that are activated during language comprehension are based on earlier perceptual-motor experiences with the objects described in the sentences (Barsalou, 1999; Barsalou et al., 2003). On this account, people represent the meaning of language by mentally simulating the perceptual and motor processes that they would also have used if they were immersed in the real-world equivalent of what is described by the language. In the present study we investigated whether such mental simulations also underlie understanding of non-native languages.

Several studies testing native speakers have obtained evidence for visual mental simulations during language comprehension. When participants are given a verbal property verification task (e.g., “Is a banana yellow?”), their responses are influenced by visual characteristics of the properties (Borghi, 2004; Borghi et al., 2004; Morey et al., 2021; Solomon & Barsalou, 2004; Spivey & Geng, 2001; Taylor & Zwaan, 2008; Zwaan & Taylor, 2006) and by trial-to-trial switches in perceptual modality (Ambrosi et al., 2011; Connell & Lynott, 2011; Marques, 2006; Pecher et al., 2003, 2004; Van Dantzig et al., 2008; Vermeulen et al., 2007). One of the earliest studies on mental simulations during language comprehension found that participants were faster and more accurate in verifying that a pictured object (e.g., an upright nail) was mentioned after reading a sentence implying the depicted orientation (e.g., “He hammered the nail into the floor”) than after reading a sentence implying a different orientation (e.g., “He hammered the nail into the wall”) (Stanfield & Zwaan, 2001). In this so-called sentence-picture verification task, participants decide whether the object presented immediately after the sentence was mentioned in the preceding sentence or not. It seems that participants mentally simulate the content of the sentence, and that subsequent verification of the depicted object is faster and more accurate when the visual feature implied by the sentence matches that of the picture, even though that feature was not explicitly mentioned. Match effects have now been obtained for various sensory features such as shape, distance, or size (De Koning et al., 2017; Pecher, van Dantzig, Zwaan, et al., 2009; Sato et al., 2013; Winter & Bergen, 2012; Zwaan et al., 2004; Zwaan & Pecher, 2012; Zwaan et al., 2002; Zwaan et al., 2018). Note that the simulation account of the match effect differs from mental imagery, which is largely conscious and concerns itself with imagination. Mental simulation is assumed to be unconscious and to be the process that underlies conceptual processing (Pecher, van Dantzig, & Schifferstein, 2009a, 2009b; Solomon & Barsalou, 2004; Vermeulen et al., 2008; Zwaan & Pecher, 2012).

Relatively little attention has been devoted to the role of mental simulations in language comprehension of non-native speakers. Some researchers have argued that mental simulations may be less vivid when people read a non-native language. Especially for a second language that is learned later in life, in a formal setting such as school, people may have weaker links between language and sensory experiences (Foroni, 2015; Kogan et al., 2020; Norman & Peleg, 2022). The strength of mental simulations may depend on proficiency in the second language (e.g., Dijkstra & van Heuven, 2002; Monaco et al., 2019; van Heuven & Dijkstra, 2010; Zhao et al., 2019; but see Bergen et al., 2010). Empirical evidence for mental simulation in non-native speakers is relatively sparse and seems to come mainly from paradigms that aim to assess involvement of the motor system. There is some evidence that non-native speakers perform mental simulations (Dudschig et al., 2014; Wheeler & Stojanovic, 2006), although this may depend on the extent to which a person’s native language can be mapped onto the meanings of their non-native language (Ahlberg et al., 2018). In sentence-picture verification tasks the evidence for mental simulations in non-native speakers is weak at best. Chen et al. (2020) presented items that matched or mismatched in implied shape in a delayed recognition task (modelled after a study with native speakers by Pecher, van Dantzig, Zwaan, et al., 2009) to participants who were native speakers of Cantonese and non-native speakers of English and Mandarin. They found a match effect in reaction times only when sentences had been read in the participants’ native language and not in either of the two non-native languages. Norman and Peleg (2022) found a shape-match effect for native Hebrew speakers when sentences were in Hebrew but not when sentences were in their non-native language English. Ahn and Jiang (2018), on the other hand, did obtain similar shape and orientation match effects for native and non-native speakers of Korean. However, their study used different sentences in the match and mismatch conditions, which introduced a confound between condition and stimulus materials, raising questions about the validity of the results.

In the present study we investigated the mental simulation of color using the sentence-picture verification task in native and non-native speakers of English. Objects (e.g., a leaf) can take different colors (green, brown), which can be implied by a sentence (“The leaf was on the tree” vs. “The leaf was on the ground”). Although initially a mismatch advantage was obtained (Connell, 2005; 2007), later studies, using larger samples, did not report this color-mismatch advantage (Hoeben-Mannaert et al., 2017; Zwaan & Pecher, 2012). Rather, both studies reported a positive match effect of color (i.e., a match advantage), where images that matched the preceding sentence on the object and color produced faster responses than images that mismatched the color (also see De Koning et al., 2017). Together, the results across different studies suggest that people represent color during language comprehension.

Before testing whether match effects can be found for non-native speakers, we wanted to improve the paradigm by eliminating the potential for strategic responding. If sensory simulation is an integral part of language comprehension, simulations should be automatic whenever language comprehenders process the meaning of a sentence. A noticeable feature of the sentence-picture verification task is that there is a correlation between the match status (match vs. mismatch) and the required (i.e., correct) response (“yes” vs. “no”). To make this more concrete, consider studies that have investigated the color-match effect. Participants in these experiments read a sentence that is followed by an object picture and decide whether the object was mentioned in the preceding sentence. Of the critical trials (i.e., trials on which the depicted object was mentioned in the preceding sentence, thus requiring a “yes” response), typically half are color-match trials and half are color-mismatch trials. On filler trials (i.e., trials on which the depicted object was not mentioned in the preceding sentence, thus requiring a “no” response), color match is not controlled or manipulated. Because objects can have many different colors, this results in few, if any, filler trials on which the object color matches the color implied by the sentence. Consequently, the color match/mismatch status is correlated with the required response. If the color of the object in the picture matches the color implied by the sentence, there is a high probability that the object was mentioned in the sentence.^{Footnote 1} On the other hand, if the color of the object in the picture does not match the color implied by the sentence, the probability that the object was mentioned in the sentence is well below 50%.^{Footnote 2} If participants pick up on this correlation, they may use it to aid their responses in the sentence-picture verification task.
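The correlation described above can be made concrete with a small calculation. The sketch below is illustrative only: the trial counts for the "earlier" design are hypothetical (they assume 16 critical and 16 filler trials, with no filler trial matching in color), and the function name is our own. It computes the probability that "yes" is the correct response given that the picture color matches or mismatches the implied color.

```python
from fractions import Fraction


def p_yes_given_color(critical_match, critical_mismatch, filler_match, filler_mismatch):
    """Return (P(yes | color match), P(yes | color mismatch)).

    Critical trials require a "yes" response; filler trials require "no".
    """
    p_match = Fraction(critical_match, critical_match + filler_match)
    p_mismatch = Fraction(critical_mismatch, critical_mismatch + filler_mismatch)
    return p_match, p_mismatch


# Hypothetical earlier design: no filler trial happens to match in color.
earlier = p_yes_given_color(8, 8, 0, 16)
# Design with color-matching fillers: half of the fillers match in color.
balanced = p_yes_given_color(8, 8, 8, 8)

print("earlier design :", earlier)
print("balanced design:", balanced)
```

Under these assumed counts, a color match in the earlier design always signals a "yes" response and a mismatch signals "yes" only one third of the time, whereas with balanced fillers both probabilities are one half, so color match carries no information about the response.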

Ample research, using a variety of tasks, stimuli, and procedures, has shown that people are sensitive to correlations between stimuli as well as correlations between stimuli and responses (e.g., Garcia et al., 1955; Parise et al., 2012; Reber, 1967; Zeelenberg et al., 2004), even without explicit instructions to detect, learn, or use such correlations to facilitate responding. As an example, consider a well-known study by Neely et al. (1989), who investigated semantic priming effects in a lexical decision task. In a lexical decision task, participants make binary decisions about the lexical status (word vs. nonword) of the target stimulus. A characteristic of the primed lexical decision task is that there is a correlation between the relatedness of the prime and the target and the required response. If the prime and target are semantically related (e.g., cat – dog), the target is a word, because nonwords are not semantically related to words. Neely et al. assumed that participants use this correlation in the decision process. Participants will be biased to give a “word” response if they detect a relation between prime and target and they will be biased to give a “nonword” response if they do not detect a relation between prime and target. By manipulating the nonword ratio, Neely et al. showed that participants are sensitive to the correlation between relatedness of the prime and target and the lexical status of the target. The nonword ratio is the probability that the target is a nonword given that it is unrelated to the prime. If the nonword ratio is high, the absence of a relation between prime and target is highly predictive of the target being a nonword. If the nonword ratio is low, however, the absence of a relation between prime and target is less informative. Neely et al. found larger priming effects with high nonword ratios, indicating that participants are indeed sensitive to the correlation between the relatedness of the prime and target and the lexical status of the target stimulus. In addition to these more strategic decisional processes that are biased by this correlation and contribute to the semantic priming effect, researchers have argued that semantic priming is also due to automatic activation processes (e.g., den Heyer et al., 1983; McNamara, 1992; Neely, 1977; Neely et al., 1989). Because strategic processes may have the same effects on performance measures as automatic processes, it is difficult to conclude that the observed priming effects are due to automatic processes. Therefore, researchers have designed experiments that aim to eliminate the contribution of strategic processes. Consistent with an automatic activation view, semantic priming effects are also obtained when the contribution of strategic processes is prevented (e.g., Balota & Lorch, 1986; de Groot, 1983; Pecher et al., 2002). Thus, to investigate automatic conceptual processes, it is important to use procedures that eliminate more strategic contributions to an effect.
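As a brief illustration of the nonword ratio as defined above (the probability that the target is a nonword given that it is unrelated to the prime), the following sketch computes it from trial counts. The counts are hypothetical and are not taken from Neely et al. (1989); the function name is our own.

```python
def nonword_ratio(n_unrelated_word_targets, n_nonword_targets):
    """P(target is a nonword | target is unrelated to the prime).

    Nonword targets are by definition unrelated to their primes, so the
    unrelated trials are the unrelated-word trials plus all nonword trials.
    """
    n_unrelated = n_unrelated_word_targets + n_nonword_targets
    return n_nonword_targets / n_unrelated


# Hypothetical lists: same number of unrelated word targets, different nonword counts.
high_ratio = nonword_ratio(n_unrelated_word_targets=20, n_nonword_targets=80)
low_ratio = nonword_ratio(n_unrelated_word_targets=20, n_nonword_targets=20)

print(f"high nonword ratio: {high_ratio:.2f}")
print(f"low nonword ratio:  {low_ratio:.2f}")
```

With a high ratio, failing to detect a prime-target relation makes a "nonword" response a good bet; with a ratio near one half, it does not.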

In the present study, we investigated if a color-match effect in the sentence-picture verification task is also found when the correlation between the presence of a color match and the required response is eliminated. Thus, in contrast to previous studies on the match effect, the presence of a match was not predictive of the required response. Crucially, we added color-match filler (i.e., “no”) trials to the stimuli presented in the experiment. That is, even on trials where the object was not mentioned in the preceding sentence, the object color could still match the implied color of the object mentioned in the sentence (e.g., the sentence implies pink paint and a picture of a pink marshmallow is shown). In doing so, we removed the correlation between color match and the required response. A match effect in the absence of this correlation provides stronger evidence that match effects are not dependent on mental simulations that are strategically employed in the sentence-picture verification task. In our study, the presence of a color match does not inform participants about the required response. If language comprehenders, given that they process the sentence at a semantic level, automatically simulate properties such as the color of an object mentioned in a sentence, we should still obtain a color-match advantage. If, on the other hand, the color-match advantage depends on strategically employed mental simulations to aid responding in the sentence-picture verification task, no such advantage would be present because there is no correlation between color match and the required response. If under these circumstances we still find a match effect for native speakers in Experiment 1, we will then proceed to test a sample of non-native speakers in Experiment 2 using the same stimulus materials.

Experiment 1

Method

Preregistration

Predictions, method (including exclusion criteria), and planned data analyses of Experiment 1 were preregistered on the Open Science Framework (OSF) in advance of data collection (https://osf.io/r6b7j/).

Participants

A total of 371 native speakers of English were recruited for the study. The data of 300 participants were included in the final analysis (detailed information about participant exclusion is provided later). The mean reported age was 31.0 years (range 18–73), and 165 participants reported being female. They reported their country of birth/country of residence as the UK (51.4%/57.8%), South Africa (17.3%/18.1%), USA (6.8%/8.4%), Australia (3.0%/3.5%), Ireland (3.0%/3.0%), Canada (2.4%/3.0%), Germany (1.1%/0%), and 26 other countries or did not provide this information (14.3%/6.2%). Participants were recruited on Prolific and received £1.25 for their participation. Completing the experiment took approximately 8 min. The posting on Prolific was offered only to native speakers of English. Based on the effect size reported by Hoeben-Mannaert et al. (2017) for their Experiment 1 (d = 0.26), we computed the required sample size to obtain a statistical power of .95 with a two-tailed paired-samples t-test (α = .05) using G*Power (Faul et al., 2009). The required sample size amounted to 195 participants. To be on the safe side, we decided to test 300 participants. We included only participants who met all of the following criteria: (1) participants completed the experiment, (2) participants indicated that they were native speakers of English, (3) participants responded correctly on at least 80% of the sentence-picture verification trials, and (4) participants responded correctly to at least 50% of the sentence comprehension questions. The data from participants who failed to meet one or more of these criteria were excluded from the analyses. Removed participants were replaced by new ones who were tested with the same counterbalancing version.
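The sample-size calculation can be approximated without G*Power. The sketch below is not the exact noncentral-t computation that G*Power performs; it assumes a normal approximation with a small-sample correction term (often attributed to Guenther, 1981), and the function name is our own. It is meant only to show how the effect size, α, and power combine.

```python
from math import ceil
from statistics import NormalDist


def approx_paired_n(d, alpha=0.05, power=0.95):
    """Approximate N for a two-tailed paired-samples t-test.

    Normal approximation plus the correction term z_alpha^2 / 2.
    This is an approximation, not G*Power's exact noncentral-t solution.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = std_normal.inv_cdf(power)
    n = ((z_alpha + z_power) / d) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


required_n = approx_paired_n(d=0.26, alpha=0.05, power=0.95)
print(required_n)
```

For d = 0.26 this approximation yields 195, which agrees with the required sample size reported above.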

Stimulus materials and software application

The present experiment used the same critical stimuli as Hoeben-Mannaert et al. (2017). These consisted of 16 sentence pairs and 16 picture pairs. The two versions of a sentence pair, each one implying a different color, could be coupled with the two versions of a picture pair, each one in a different color, to form matching or mismatching trials. Across four list versions, for each of the 16 critical objects, one of the sentences in a pair was coupled with one of the pictures in a pair, resulting in four different combinations of sentence-picture pairs (see Table 1 for examples). Thus, each participant saw only one sentence and one picture of an object. In this manner, four counterbalanced versions were created so that, across participants, each sentence and each picture were presented equally often in the color-match and color-mismatch condition. The same set of 16 filler items was presented to all participants. Each counterbalanced version thus included 16 critical sentences paired with 16 critical pictures (i.e., eight color-match trials and eight non-match trials) and 16 sentence-picture pairs (also eight color-match trials and eight non-match trials) that were used as fillers. As shown in Table 1, on filler trials the object color matched or mismatched the implied color of the sentence, but the depicted object was not mentioned in the preceding sentence. Half of the filler trials were followed by a comprehension question with an equal number of “yes” and “no” responses (see Table 1), to ensure that participants did not merely skim the sentences. An additional set of eight sentence-picture pairs and eight comprehension questions was used for practice. We used the same practice pairs for all participants.
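One way to construct four counterbalanced lists with these properties is a simple rotation over the four sentence-picture combinations. The sketch below is an assumption about how such lists could be built, not a description of the actual assignment used in the experiment; the labels "A" and "B" for the two color versions and all names are hypothetical.

```python
# The four possible (sentence version, picture version) combinations for an item.
COMBOS = [("A", "A"), ("A", "B"), ("B", "B"), ("B", "A")]


def build_list(version, n_items=16):
    """Build one counterbalanced list by rotating combinations across items."""
    trials = []
    for item in range(n_items):
        sentence, picture = COMBOS[(item + version) % len(COMBOS)]
        trials.append({
            "item": item,
            "sentence": sentence,
            "picture": picture,
            "match": sentence == picture,  # same color version implies a color match
        })
    return trials


lists = [build_list(version) for version in range(4)]

for version, trial_list in enumerate(lists):
    n_match = sum(trial["match"] for trial in trial_list)
    print(f"list {version}: {n_match} match, {len(trial_list) - n_match} mismatch")
```

With this rotation, every list contains eight match and eight mismatch trials, and across the four lists each item appears once in each of the four sentence-picture combinations.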

Table 1 Example of experimental and filler stimuli

The experiment was programmed in Inquisit (https://www.millisecond.com/), a software application developed for online psychological testing. All pictures were royalty-free images obtained through the Google images search engine. All images were of an object in one dominant color against a neutral background. We only selected images of objects that have limited color variations (e.g., a ripe vs. an unripe tomato). Image height was 50% of the screen. See Online Supplementary Materials (OSM) for examples of pictures. All text was presented in the letter font Verdana (letter height 3% of the screen) against a white background.

Procedure

Participants selected the study on Prolific and started the experiment from their personal computer or laptop. Upon opening the experiment, participants gave informed consent and read a welcome text and instructions. Participants were instructed to respond as quickly and as accurately as possible. They responded “yes” (by pressing the M key on the keyboard) when the depicted object was mentioned in the preceding sentence. They responded “no” (by pressing the Z key on the keyboard) when the depicted object was not mentioned in the sentence. Each trial started with a fixation cross (+) presented for 1,000 ms vertically in the middle of the screen and horizontally aligned to the left, where the first character of the sentence would appear (see Fig. 1). The fixation cross was followed by a sentence. After the sentence was read and understood, the participant pressed the spacebar to proceed. Another fixation cross (+) was presented centrally for 500 ms. Following this fixation cross, an object picture was presented centrally to which participants responded “yes” or “no” using the M or Z key, respectively. On half of the filler trials the response to the picture was followed by a comprehension question. Comprehension questions required a “yes” (M key) or “no” (Z key) response. If participants made an error on picture trials or comprehension questions, the feedback message “Incorrect” was presented for 500 ms. A 1,000-ms intertrial interval followed the response of the participant (or feedback in case of an incorrect response). Trials were presented in a random order. Different random orders were generated for each participant.

Participants first completed eight practice trials followed by 32 experimental trials (16 of which required a “yes” response and 16 of which required a “no” response) presented in random order. The trial procedure was the same for practice and experimental trials. For both practice trials and experimental trials, half consisted of color-match trials and half consisted of color-mismatch trials.

The experiment ended with a closing questionnaire that asked participants for their native language, gender, and age, followed by a final thank you message.

Data analysis

Following previous studies (Hoeben-Mannaert et al., 2017; Stanfield & Zwaan, 2001; Zwaan & Pecher, 2012; Zwaan et al., 2002) and our preregistered analysis plan, statistical analyses were based on the median reaction times. For each participant and condition the median reaction time for correct responses was determined and entered in the analyses. We conducted a paired-samples t-test to compare the median reaction times in the color match and color mismatch conditions (using α = .05). A comparable analysis was performed on the mean error rates.
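The analysis steps described above can be sketched as follows. This is a minimal illustration and not the authors' analysis script: the data layout (tuples of participant, condition, RT, and accuracy), the function names, and the toy values are all hypothetical. The sketch computes per-participant median RTs for correct responses and the paired t statistic; obtaining a p value would additionally require the t distribution (e.g., from a statistics package).

```python
from math import sqrt
from statistics import mean, median, stdev


def condition_medians(trials):
    """Median RT of correct responses per participant and condition."""
    rts = {}
    for participant, condition, rt_ms, correct in trials:
        if correct:  # error trials are excluded from the RT analysis
            rts.setdefault(participant, {}).setdefault(condition, []).append(rt_ms)
    return {p: {c: median(v) for c, v in conds.items()} for p, conds in rts.items()}


def paired_t(medians):
    """Paired t statistic (mismatch minus match) and its degrees of freedom."""
    diffs = [m["mismatch"] - m["match"] for m in medians.values()]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Toy data for illustration only (not real observations).
trials = [
    ("p1", "match", 800, True), ("p1", "match", 900, True),
    ("p1", "match", 2000, False),  # incorrect response, ignored
    ("p1", "mismatch", 950, True), ("p1", "mismatch", 1050, True),
    ("p2", "match", 700, True), ("p2", "mismatch", 900, True),
    ("p3", "match", 850, True), ("p3", "mismatch", 900, True),
]

medians = condition_medians(trials)
t_value, df = paired_t(medians)
print(medians)
print(f"t({df}) = {t_value:.2f}")
```

The same two functions could be applied to error rates by replacing the median RT with the proportion of incorrect responses per participant and condition.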

Results and discussion

Based on our preregistered criteria, the data from 71 participants were excluded. Thirty-three participants indicated that their native language was one other than English. Twelve participants did not finish the experiment. Moreover, we removed the data from one participant due to a low comprehension score (below 50%),^{Footnote 3} and we removed the data from ten participants due to a low overall accuracy on the critical trials (below 80%). Finally, we removed the data from 15 participants to ensure an equal number of participants in each counterbalancing version.^{Footnote 4} We analyzed and report the data of the remaining 300 participants (56% female, mean age = 30.8 years, SD = 11.7). The data of all experiments reported in this article are available at https://osf.io/r6b7j/.

The analyses were based only on the critical trials. Figure 2 shows the mean median reaction times (RTs) for the match and mismatch condition (only trials with a correct response were included in the analyses). Participants responded faster on match trials than on mismatch trials. The 106-ms color-match advantage (871 ms vs. 976 ms) was significant, t(299) = 6.23, p < .001, d = 0.36. Moreover, participants made fewer errors on color-match trials than on color-mismatch trials (4.3% vs. 13.1%), t(299) = 10.55, p < .001, d = 0.61. Thus, we found a color-match advantage on RTs and error rates even when the color match status was uncorrelated with the required response. These results are consistent with the view that native speakers of English mentally simulated color during language comprehension.
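The reported effect sizes appear consistent with the standardized mean difference for paired designs, d = t / √N. This is our inference from the reported values rather than a formula stated in the text, and the function name below is hypothetical. A quick check:

```python
from math import sqrt


def d_from_paired_t(t, n):
    """Effect size for a paired design computed as t divided by sqrt(N)."""
    return t / sqrt(n)


d_rt = d_from_paired_t(t=6.23, n=300)      # reaction times
d_err = d_from_paired_t(t=10.55, n=300)    # error rates

print(round(d_rt, 2), round(d_err, 2))
```

Rounded to two decimals, these values reproduce the reported d = 0.36 for RTs and d = 0.61 for error rates.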

Our results are in line with previous studies reporting a color-match advantage (De Koning et al., 2017; Hoeben-Mannaert et al., 2017; Zwaan & Pecher, 2012). Our evidence in favor of a match advantage, however, stands in contrast with studies reporting a mismatch advantage (Connell, 2005; 2007). We are not aware of a good explanation for this discrepancy in results. However, several researchers have now obtained a color-match advantage in preregistered experiments (Hoeben-Mannaert et al., 2017; the current Experiment 1). Moreover, the positive color-match effect aligns with similar findings for shape (e.g., Zwaan et al., 2002, 2018; Zwaan & Pecher, 2012) and orientation (e.g., Stanfield & Zwaan, 2001; Zwaan & Pecher, 2012). We are not aware of anyone reporting a negative match effect for implied shape and orientation. Thus, it seems that native speakers mentally simulate the color of an object when comprehending language.

Experiment 1 provided additional evidence for the idea that simulations are an automatic consequence of sentence processing. In Experiment 2 we asked whether evidence for mental simulations is also found when language comprehenders read sentences in their non-native language. Given the mixed results for non-native speakers in previous studies using sentence-picture verification tasks (Ahn & Jiang, 2018; Chen et al., 2020; Norman & Peleg, 2022), we investigated if the color-match effect that we observed in Experiment 1 with native speakers would be found in a group of non-native participants. If non-native speakers of English, like native speakers, use a similar visual simulation process during language comprehension, we should find a color-match effect as we did in Experiment 1.