Introduction

Misreading is a prevalent phenomenon (e.g., Gibson et al., 2013; Mirault et al., 2018; Slattery, 2009; see Huang & Staub, 2021b, for a review) that may arise from multiple causes. Misreading of word neighbors (e.g., brunch and branch) is usually assumed to be due to misperception (Harris et al., 2021; Slattery, 2009), while misreading involving phrasal structures (e.g., swapping of the theme and goal in a ditransitive construction; Gibson et al., 2013) has been argued to involve post-perceptual inference. Reading two words out of order, or failing to notice that two words in a sentence are actually transposed, is of particular interest, as some have argued that it arises from misperception of the words’ positions (Mirault et al., 2018; Snell & Grainger, 2019), while in previous work we have maintained that rapid, post-perceptual inference may account for this phenomenon (Huang & Staub, 2021a). The current study further investigates the cause of misreading of transposed words, by exploring the effect of different modes of stimulus presentation.

Mirault et al. (2018) first demonstrated that readers misjudge as grammatical a sentence containing two transposed words (e.g., The white was cat big) more often than other ungrammatical sentences. They concluded that words are processed in parallel and positional coding of words is noisy: The simultaneous activation of adjacent words makes it possible for the second word (cat) to be recognized before the first (was), and noisy positional representation enables the second word to be assigned to the first spatial position. Thus, under their model (Snell et al., 2018), mis-ordering is a perceptual problem, as the two words are never perceived as occupying their veridical positions.

In Huang and Staub (2021a), on the other hand, we argued against parallel word processing as an explanation for the transposed-word effect. In two eye-tracking experiments, we manipulated the length and word class of transposed words, finding that transposed words of very different lengths still elicited a significant transposed-word effect, and that having an easier word following a more difficult word in the transposed sequence did not make the transposed-word effect stronger. We argued that these findings are inconsistent with the perceptual-error account. We proposed, instead, that words are recognized serially from left to right (Reichle et al., 2009), but that integration of a recognized word into a higher-level representation of sentence structure and meaning may sometimes be delayed, so that two words are available for integration simultaneously; a reader’s syntactic knowledge is then brought to bear in making an inference about word order.

The present study attempts to arbitrate between perceptual and post-perceptual accounts by deploying a paradigm in which the accounts make clearly distinct predictions. If the transposed-word effect is due to misperception of word positions arising under conditions of parallel word processing, the effect should not be present when words are presented sequentially, and are visible only one at a time, as in this case two words cannot be perceptually processed in parallel. On the other hand, an account attributing the transposed-word effect to post-perceptual inference predicts that the effect may arise even when words can only be processed sequentially.

We follow up a recent Chinese study (Liu et al., 2022) that tested these predictions by comparing the transposed-word effect under word-by-word serial visual presentation (SVP) and standard parallel visual presentation (PVP). Under SVP at the rate of 250 ms per word, Liu et al. found a significant transposed-word effect, with subjects failing to notice the ungrammaticality in 11% of the critical sentences, versus 3% of control ungrammatical sentences. Under PVP, the effect was larger than under SVP, although the error rate in the control condition was also larger (31% vs. 7%). Liu et al. concluded that parallel processing of words is not required for a transposed-word effect. The finding that the effect is larger under PVP suggests, however, that parallel word processing may make an additional contribution to the effect. Alternatively, as we discuss further below, PVP may allow faster serial word recognition via parafoveal processing (Cutter et al., 2015), which makes the time window of integration of the two words more likely to overlap (Huang & Staub, 2021a, 2021b).

Our first goal was to assess whether the Liu et al. (2022) findings replicate in English. English and Chinese differ not only in their writing systems (alphabetic vs. logographic), but also in the flexibility of their word order. For instance, SVO, SOV, and OSV are all possible orders of subject, verb, and object in Chinese, and some adverbs can appear in different positions in a sentence (Huang et al., 2009). This flexibility in word order might induce a bias to accept transposed-word sentences even under SVP, compared to a language like English with relatively fixed word order. Second, it has been argued that lexical processing demonstrates more parallelism in Chinese than in alphabetic scripts (Yan et al., 2010; Yang et al., 2009). It is thus an open question whether a transposed-word effect would be obtained with SVP in English, and it is also an open question whether the transposed-word effect would be larger under PVP. Experiment 1 aimed at replicating Liu et al. (2022) with English materials.

In addition, a potential concern with the Liu et al. (2022) study is that in the SVP condition, participants responded only after the whole sequence was presented, while in the PVP condition, they could make a response in the middle of reading the sentence, and in fact were encouraged to do so. The delay in responding in the SVP condition suggests a potential role for redintegration (Jones & Farrell, 2018). That is, participants might have, at the time of making the response, reconstructed their short-term memory into a grammatical sequence (Botvinick & Bylsma, 2005), even if they initially did detect the anomaly during incremental processing of the sentence. The transposed-word effect observed under SVP thus might have an entirely different explanation from the one observed under PVP, as it may be due to short-term-memory limitations emerging under specific task demands. In Experiment 2 of the present study, we assess whether a transposed-word effect is obtained under SVP even when participants are instructed to respond as soon as they have detected an error.

Experiment 1

Methods

Participants

A total of 70 self-reported native English speakers, with IP addresses within the USA, were paid for their online participation in the study via the Amazon Mechanical Turk platform (see Footnote 1). Those whose accuracy on either the fillers or controls was below 70% in either experimental block (N = 20) and those whose accuracy in detecting transposition errors was significantly lower than chance level (<30%) in either block (N = 1) were not included in our analysis (see Footnote 2). Thus, 49 participants were included in the data analysis in Experiment 1.
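The exclusion rule described above can be written as a small predicate. This is a minimal sketch rather than the authors' analysis code: the `BlockAccuracy` record and the function name are hypothetical, and only the two thresholds (70% and 30%) come from the text.

```python
from dataclasses import dataclass

FILLER_CONTROL_MIN = 0.70  # threshold stated in the text
TRANSPOSED_MIN = 0.30      # threshold stated in the text


@dataclass
class BlockAccuracy:
    """Proportion correct in one block (hypothetical record layout)."""
    fillers: float
    controls: float
    transposed: float


def is_excluded(blocks):
    """Return True if any block falls below either exclusion threshold."""
    for b in blocks:
        # Low accuracy on fillers or scrambled controls in either block
        if b.fillers < FILLER_CONTROL_MIN or b.controls < FILLER_CONTROL_MIN:
            return True
        # Detection of transposition errors below 30% in either block
        if b.transposed < TRANSPOSED_MIN:
            return True
    return False
```

Applied to the counts reported here, 70 recruited participants minus 20 and 1 exclusions leaves the 49 analyzed.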

Materials

We constructed eighty critical grammatical sentences, all of which were seven words in length. Ungrammatical versions were created by transposing the third and fourth words of each sentence, for example, They hardly text her or call her was modified to They hardly her text or call her. The transposed words within a sentence were between two and five letters, and differed by no more than one letter in length. Each participant saw either the grammatical or transposed version of a given item, for a total of 40 critical transposed sentences and 40 grammatical counterparts. Along with these sentences, 40 grammatical filler sentences with varying structures were included (e.g., They saw him jump from the window), as well as 40 ungrammatical controls where the third, fourth, fifth, sixth, and seventh words in a grammatical sentence were randomly scrambled (e.g., The seriously is ill lady unconscious still). These sentences were also seven words in length. Each participant went through two experimental blocks, one PVP and the other SVP, the order of which was counterbalanced. There were 80 sentences in each block (20 critical ungrammatical sentences, 20 grammatical counterparts, 20 grammatical fillers, and 20 ungrammatical controls).
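The two ways of deriving ungrammatical sentences can be illustrated with a short sketch. The function names are hypothetical, and the rule of reshuffling until the scrambled order differs from the original is an assumption; the text says only that the third through seventh words were randomly scrambled.

```python
import random


def transpose_words(sentence, first=2, second=3):
    """Swap two words (0-based indices; defaults are the third and fourth words)."""
    words = sentence.split()
    words[first], words[second] = words[second], words[first]
    return " ".join(words)


def scramble_tail(sentence, start=2, rng=None):
    """Randomly reorder the words from `start` onward (third word by default).

    Reshuffles until the order differs from the original; this retry rule is
    an assumption, not something the text specifies.
    """
    if rng is None:
        rng = random.Random()
    words = sentence.split()
    head, tail = words[:start], words[start:]
    shuffled = tail[:]
    # Skip the loop if every tail word is identical (no distinct order exists)
    while shuffled == tail and len(set(tail)) > 1:
        rng.shuffle(shuffled)
    return " ".join(head + shuffled)


print(transpose_words("They hardly text her or call her"))
```

With the defaults, `transpose_words` turns the example item into They hardly her text or call her, matching the transposed version given in the text.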

In order to investigate a question unrelated to the main issue we discuss here, we also divided the critical sentences into two types: one in which the two critical words were a verb followed by a pronoun (e.g., They hardly text her or call her), and another in which these words were a verb followed by a preposition (e.g., The boy sat on the school bus). For the former type, transposition resulted in the sentence becoming ungrammatical at the first transposed word (They hardly her), while for the latter type, the transposed sentence was not ungrammatical until the second transposed word (The boy on sat). However, we discovered after the fact that the two sentence types differed substantially in the trigram frequency of the two critical words and the preceding word; in other work, we have found that this variable predicts item variability in how frequently a transposition is overlooked (Huang & Staub, 2022). Thus, the results from this manipulation are not easily interpretable, and in the present paper we present analyses that collapse across this manipulation. Materials and data are available in our online repository https://osf.io/kcd6v/.

Design and procedures

The experiment was created using PsychoPy 3.0 (Peirce et al., 2019) and implemented online via Pavlovia.org. Each participant went through a PVP and an SVP block, the order of which was counterbalanced. Thus, there were in total four presentation lists (stimulus counterbalancing × block order). In the PVP block, participants were instructed to read the sentence at their natural pace and judge whether the sentence was a well-formed sentence by pressing one of two buttons. Each sentence was revealed all at once, on one line. In the SVP block, each sentence was presented one word at a time at a rate of 250 ms per word, and after the final word disappeared a “well-formed?” prompt would appear for 3 s. For comparison with Experiment 2, we hereafter refer to the SVP condition as ESVP, i.e., “End-of-Sentence Serial Visual Presentation.” The instructions and three practice trials were given at the beginning of each block. Each trial began with a 500-ms blank, a “hit space to start” prompt, and a 1-s fixation cross, after which the stimuli appeared.
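To make the SVP timing concrete, the sketch below computes the onset and offset of each word relative to the onset of the first word. It is not the PsychoPy implementation. It assumes that each word is replaced immediately by the next with no blank interval, which the text does not state, and the function name is hypothetical.

```python
def svp_schedule(sentence, rate_ms=250):
    """Return (word, onset_ms, offset_ms) for each word, relative to first-word onset.

    Assumes back-to-back presentation with no inter-word blank.
    """
    return [
        (word, i * rate_ms, (i + 1) * rate_ms)
        for i, word in enumerate(sentence.split())
    ]


schedule = svp_schedule("They hardly her text or call her")
for word, onset, offset in schedule:
    print(f"{word:>7}  {onset:>5} to {offset:>5} ms")
```

Under these assumptions, the third word of a seven-word sentence appears at 500 ms and the final word disappears at 1,750 ms.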

Analysis

For PVP, one trial with a response time lower than 500 ms was excluded; for ESVP, all response times (RTs) were higher than 500 ms (i.e., participants responded only after the third word had appeared). Additionally, trials on which participants made no response were removed (0.1%).

Generalized linear mixed effect models (GLMMs) were run for the accuracy data. We used the bobyqa optimizer with 200,000 iterations to improve convergence. All models were constructed with maximal random slopes unless there was a singularity or convergence issue (Barr et al., 2013), in which case the highest-level random factors and/or correlation terms were removed. When there were random interaction slopes, correlation terms were always excluded due to convergence issues. Details of each model are provided below. Model comparisons were based on χ² tests (p < .05).

Results

The left panel in Fig. 1 provides an overview of the results. Participants in both paradigms were very good at rejecting the scrambled controls, and rarely rejected grammatical sentences. However, they were even better at both rejecting controls and accepting grammatical sentences under PVP than under ESVP. A transposed-word effect clearly exists in both paradigms (as confirmed by the statistical tests presented below): error rates in the transposed conditions were substantially higher than in the scrambled control conditions. The transposed-word effect appears to be larger under PVP.

Fig. 1 Mean error rates in each condition under each paradigm for each experiment. Error bars reflect by-subject 95% confidence intervals

Our statistical models compared accuracy for the critical transposed sentences and the scrambled controls. Three main effects were entered into the GLMM: transposition (transposed vs. control ungrammatical), block order, and paradigm (PVP vs. ESVP), along with all their interactions.
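For readers who want the fixed-effects structure spelled out, the model can be sketched as follows. The notation is ours: the predictor coding is not given in the text, and the random part shown here (by-participant and by-item intercepts only) is an assumed simplification of the maximal structure the authors describe.

```latex
\[
\operatorname{logit} P(\text{correct}_{si}) =
  \beta_0 + \beta_T T_{si} + \beta_B B_{s} + \beta_P P_{si}
  + \beta_{TB} T_{si} B_{s} + \beta_{TP} T_{si} P_{si} + \beta_{BP} B_{s} P_{si}
  + \beta_{TBP} T_{si} B_{s} P_{si} + u_{s} + w_{i},
\qquad u_{s} \sim \mathcal{N}(0, \sigma_u^2),\quad w_{i} \sim \mathcal{N}(0, \sigma_w^2)
\]
```

Here $T$ codes transposition (transposed vs. control ungrammatical), $B$ block order, and $P$ paradigm (PVP vs. ESVP) for participant $s$ and item $i$. The transposition by paradigm term $\beta_{TP}$ is the one that tests whether the transposed-word effect differs between PVP and ESVP.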

The only significant effects were the main effect of transposition and its interaction with paradigm (Table 1). Model comparisons did not favor the model with the full three-way interaction over a model with only one two-way interaction (χ²(5) = 4.24, p > .05). The significant two-way interaction term suggests that the difference between the accuracy of detecting transposition errors and the accuracy of detecting scrambling errors was larger under PVP than under ESVP. When the dataset was further split into PVP and ESVP and a model with transposition as the only fixed effect was run separately on each, the effect of transposition was significant under both paradigms (zs = 4.61 and 5.52).

Table 1 Generalized linear mixed effect model (GLMM) estimates of the three effects and their interactions in Experiment 1

Discussion

The results of Experiment 1 showed that while the transposed-word effect was larger under PVP than under ESVP, it was reliably present under ESVP. Experiment 1 thus is a successful conceptual replication of Liu et al. (2022), showing that the two findings generalize to a language with less flexible word order and a different writing system.

Although a transposed-word effect under ESVP may provide evidence that parallel word processing is not necessary for a transposed-word effect under PVP (Liu et al., 2022), the ESVP paradigm that allowed responses only after the whole sequence was presented made it possible that the frequent failure to reject a transposed-word sentence was due to a late-stage redintegration of the serially perceived sequence (Jones & Farrell, 2018). A late-stage cause for the transposed-word effect under PVP, however, has been argued to be unlikely (Huang & Staub, 2021a), based on the finding of undisrupted eye movements on trials where participants fail to report the transposition error; there is no indication that readers first notice the transposition, and then later “correct” it.

To provide an even stronger case that the transposed-word effect under PVP is not necessarily due to parallel word processing, we need to demonstrate the effect while ruling out not only parallel presentation of words but also the possibility of late-stage redintegration. Experiment 2 addresses whether a transposed-word effect under SVP still can be obtained when participants can respond as soon as ungrammaticality emerges.

Experiment 2

Methods

Participants

A total of 68 participants from the same pool as Experiment 1 were paid to participate (see Footnote 3). Under the same criteria as in Experiment 1, 18 were excluded due to their low accuracy on either fillers or control sentences (< 70%). One additional participant, whose accuracy in detecting transposition errors was significantly lower than chance level (<30%) in either block, was excluded, leaving 51 participants’ data to be analyzed.

Materials

The exact same stimuli were used as in Experiment 1.

Design and procedures

As in Experiment 1, every participant went through two experimental blocks, one PVP and the other SVP. The PVP condition was identical to that in Experiment 1. In the SVP condition, each sentence was presented one word at a time and participants were instructed that they could press a rejection button at any time from the presentation of the first word up to 3 s after the disappearance of the final word. We refer to this as the SSVP condition, i.e., “Self-Terminating Serial Visual Presentation.” Participants were instructed not to make any response in the SSVP block if they thought the sentence was well formed.

Results

The analysis pipeline was the same as in Experiment 1. There were no excessively short RTs (all RTs > 500 ms). Trials where participants made no response were removed (0.1%).

The task instruction in the SSVP condition was effective in eliciting responses during the presentation of the sentence itself. As shown in Fig. 2, most “reject” responses in the transposed condition were made before 1,750 ms (i.e., while the sentence was still being presented), and 80% of the “reject” responses were made before 2,000 ms (i.e., 250 ms after the offset of the last word) in Experiment 2, compared to Experiment 1 where only 7% of the “reject” responses were made before 2,000 ms.

Fig. 2 Histogram of response times of correct rejection to transposed-word sentences relative to the onset of the first word in the two experiments

A GLMM was run, with the same fixed effects as those in Table 1. Table 2 shows the results (see also Fig. 1).

Table 2 Generalized linear mixed effect model (GLMM) estimates of the three effects and their interactions in Experiment 2

Along with the strong main effect of transposition and the two-way interaction between transposition and paradigm, there was an unexpected three-way interaction, indicating that the two-way interaction between transposition and block differed across PVP and SSVP. In the PVP dataset alone, although the interaction between transposition and block was marginally significant (z = -1.79), there was a transposed-word effect for both blocks (zs = 5.03 and 4.64). In the SSVP dataset alone, there was neither an interaction (z = 1.36) nor a main effect of block (z = 0.25), but only a main effect of transposition (z = 2.78, p < .01). In short, the transposed-word effect was present under both PVP and SSVP, and the effect was larger in the former paradigm.

In a further exploratory analysis, we combined the data from the two experiments and ran a three-way GLMM, including Transposition, SVP type (ESVP vs. SSVP), and Paradigm (PVP vs. SVP) and all their interactions.

The results showed no three-way interaction, but a significant Transposition × Paradigm interaction (Table 3). This suggests that, regardless of SVP type (ESVP or SSVP), the transposed-word effect was larger under PVP than under SVP. Finally, to directly compare the SVP types, we restricted the analysis to SVP data only, constructing a GLMM with fixed effects of Transposition, SVP type, and their interaction. The results showed that the only significant effect was Transposition (z = 4.17; zs = -1.24 and 0.38 for SVP type and the interaction; see Footnote 4).

Table 3 Generalized linear mixed effect model (GLMM) estimates of the three effects and their interactions for all data

Discussion

Experiment 2 provided a stronger test of the transposed-word effect under SVP by allowing participants to respond as soon as they noticed ungrammaticality. Participants still failed to reject transposed-word sentences on a significant proportion of trials. As a transposed-word effect was still obtained under SSVP, our results provided strong evidence that the transposed-word effect observed in SVP is not due to a late reconstructive memory process.

General discussion

The motivation for presenting visual words serially is to examine the role of parallel word processing in the transposed-word effect (Mirault et al., 2018). In Experiment 1, we confirmed that in English, as in Chinese (Liu et al., 2022), a transposed-word effect is obtained with serial presentation. Thus, this finding does not depend on the flexible word order of Chinese. In Experiment 2, we ruled out an interpretation that would attribute this effect only to a redintegration process that emerges when responding is delayed relative to the point of transposition. Thus, the present data provide strong evidence that parallel word processing is not necessary for the transposed-word effect.

We did, however, find that the transposed-word effect under PVP was larger than under SVP (both ESVP and SSVP), which is consistent with Liu et al. (2022). On the one hand, this result may suggest that parallel word processing contributes to the transposed-word effect, even if it is not necessary for this effect. But on the other hand, as suggested by Liu et al. (2022), the difference in accuracy between PVP and SVP could also be attributed to parafoveal processing under a serial-attention model. It is possible that the covert attention shift to processing Word N+1 makes integration of Word N slower. Similarly, the preview of Word N+1 may make its recognition faster. Both will make the integration stage of the two words more likely to overlap (Huang & Staub, 2021a). In contrast, when words are presented in isolation, attention can be deployed more focally and integration might finish faster, making processing more incremental, and thus more sensitive to violations. Finally, the smaller transposed-word effect in SVP could also be attributed to the reduced likelihood of noise under SVP compared to PVP. That is, under PVP, experienced readers might attribute the perceived erroneous sequence to an error in eye movements (Staub et al., 2019). In sum, while further research is required to explore why the transposed-word effect is larger under conditions of parallel word presentation, it is clear that parallel word processing is not necessary for the effect to emerge.

We did not find a robust difference between ESVP and SSVP. Our motivation to adopt an SVP paradigm that allowed immediate rejection responses was to deal with a potential confound arising from syntax-biased redintegration in short-term memory (e.g., Jones & Farrell, 2018). That is, under ESVP, participants might initially notice the transposition, but on some proportion of the trials, they might forget having noticed the transposition by the time they need to make the response, and since the perceived sequence was only one transposition away from a grammatical sentence order, they reconstruct their memory to conform to syntax (Botvinick & Bylsma, 2005). If so, it is expected that allowing participants to respond during the presentation should reduce the transposed-word effect. Contrary to this prediction, there was no significant difference, but only a numerical one. The lack of a robust difference might be because grammaticality judgments do not involve explicit recollection. Indeed, Allen et al. (2018) found that a memory advantage for sentences, compared to unstructured lists of words, was only apparent in recall tasks and harder to find in recognition tasks. However, as we noted above (footnote 4), it is also possible that there is some small difference between SSVP and ESVP that our study did not have the power to detect.

To explain the transposed-word effect under SSVP, we assume, following Gibson et al. (2013), that comprehenders correct for potential noise during communication. Instead of interpreting the perceived message literally all the time, they sometimes infer a more plausible message, based on their prior linguistic experience and their noise model. We have adopted this idea in Huang and Staub (2021a) to explain the transposed-word effect during normal reading. We argued that this rational inference must be rapid, and post-lexical syntactic/semantic integration must sometimes be delayed (rather than being perfectly incremental) in order for the eye movements to be undisrupted on those trials when a transposition was present but not noticed – the empirical finding in our eye-tracking experiments. To illustrate, consider the sentence They hardly her text or call her. If syntactic and semantic processing were perfectly incremental, processing the word her would always result in immediate difficulty. As there was no disruption in eye movements on those trials when a transposition was not explicitly detected, Huang and Staub (2021a) suggested that syntactic integration must not be perfectly incremental, but must be delayed on at least some trials. In these cases, the order of the not-yet-integrated words her and text can be corrected by a rapid, unconscious process of rational inference. However, incrementality of integration might differ across modalities. The current experiment showed that, under SVP, even when given 250 ms per word (approximately the duration of an average word inspection during normal reading; Brysbaert, 2019; Gagl et al., 2022), transposition of two words did not always trigger a rejection response. This provides further evidence of occasional non-incremental integration. While the approximately 10% error rate obtained in SSVP may seem modest, it must be noted that the current task is explicit error detection. In natural reading, non-incremental integration of successive words might occur even more frequently. While we do not oppose the thesis that language processing is highly incremental, we maintain that the existence of some non-incremental processing has consequences for both linguistic judgments and eye movements (Huang & Staub, 2021a), and might even do so for reading of normal, grammatical sentences. For instance, a recent study by Paape and Vasishth (2022) showed that explicit consideration of qualitatively different types of trials – in contrast to assuming homogeneous incrementality across all trials – provided better predictive fit to reading time and judgment data.

In addition to the above theoretical implications, our empirical data add to the literature on misreading/mishearing. Several studies have revealed how often readers fail to notice linguistic anomalies in variants of “error-detection” tasks (Gibson et al., 2013; Healy & Zangara, 2017; Staub et al., 2019). Similar failures have also been reported under SVP or in listening (Ferreira, 2003; Sanford et al., 2011; Vissers et al., 2007; Zhou et al., 2010). Whereas previous studies involved detection of semantic/thematic errors, the current experiment provides evidence regarding detection of syntactic errors under SVP. Interestingly, there is great variation in detectability of different errors across studies. Studies that explicitly compare different types of error detection might further shed light on the interaction or modularity among syntactic, semantic, and thematic processing.

In conclusion, the current study found a transposed-word effect in English even under SVP, and even when participants were free to respond during the serial presentation of the sentence. Thus, parallel word processing is clearly not necessary for the effect. There was evidence of a greater transposed-word effect under PVP than SVP, consistent with Liu et al. (2022), which can be explained under both perceptual and post-perceptual accounts. We propose that the different timing and dynamics of word recognition and integration in the two presentation paradigms might be what underlies the difference in accuracy between PVP and SVP. Future studies can investigate this by, for example, manipulating the presentation rate of words under SVP.