Advances in digital technology and the increasing availability of mobile devices, such as smartphones and tablets, are dramatically changing the ways in which people access, acquire, and consume information. With the advent of the Internet and its sophisticated search engines, mobile devices ensure that accessing information is literally as easy as lifting a finger (Sparrow, Liu, & Wegner, 2011). More than 85% of U.S. adults report that they spend more time reading news on mobile devices than through conventional print media. Digital texts are also increasingly widely used in classroom contexts and in higher education (Moehring, Schroeders, Leichtmann, & Wilhelm, 2016).

At a practical level, digital reading resources have the potential to enhance global literacy by providing more people with cheaper access to written material through mobile devices. However, before endorsing investment in such resources, it is important to systematically evaluate whether and how reading electronic material on small-screen mobile devices changes the reading process. Such evidence will inform educational decisions about the advantages and disadvantages of using mobile technology to deliver information to achieve different learning goals for readers at varying levels of proficiency, and guide the design of mobile reading applications for different contexts and audiences.

Reading comprehension is the ability to understand the meaning of text. It requires the coordination of a complex set of perceptual and cognitive processes to extract information from the ‘squiggles’ of written script, use it to retrieve the meanings of words and phrases, and integrate them with existing knowledge to construct an understanding of the text (Andrews & Reichle, 2019). Reading has been intensively studied since the earliest days of psychology (see, e.g., Huey, 1908), but relatively little attention has been paid to the format in which information is presented, despite the advent of digital technology.

The few direct comparisons of comprehension of printed material with digital text reading have yielded mixed results. Early research found that comprehension did not differ between the two formats (e.g., Mills & Weldon, 1987), but that traditional paper format was superior to digital text in terms of reading speed and readability (e.g., Gould & Grischkowsky, 1986). However, there have been many subsequent advances in digital technology. A recent systematic review identified 36 studies published since 2000 that directly compared reading of print and digital texts using objective measures of comprehension (Singer & Alexander, 2017a). Some studies reported superior comprehension for print than digital material (e.g., Mangen, Walgermo, & Brønnick, 2013; Noyes, Garland, & Robbins, 2004), while others found an advantage for digital material (e.g., Verdi, Crooks, & White, 2014), and still others found no significant differences between comprehension for the two formats (e.g., Margolin, Toland, Driscoll, & Kegler, 2013). The variability in outcomes was partially explained by text length: Shorter texts tended to yield null effects or a digital advantage but comprehension of texts longer than 500 words, or more than one page of the book or screen, was better for printed texts. Singer and Alexander (2017a) suggested that the reduced comprehension of longer digital texts may reflect the impact of scrolling, which interrupts the reading process, and increases cognitive load (Wastlund, Norlander, & Archer, 2008). Such detrimental effects may be exacerbated on small-screen mobile devices.

Research investigating the cognitive processes involved in reading on mobile devices remains scarce. The purpose and content of reading on such devices is often different from print reading because they are used principally to access information on the Internet. Reading on mobile devices is therefore often associated with an ‘online reading’ strategy in which the reader does not aim to comprehend the complete text but to achieve a particular goal by locating, evaluating, synthesizing, and communicating information (Leu et al., 2011). This requires problem-solving strategies other than comprehension to search for information and evaluate its relevance to the current goal (Coiro & Dobler, 2007; Henry, 2006). These strategies are often associated with ‘browsing patterns’ that are very different to the sequential scanning that characterizes reading of print (see, e.g., Rayner, 2009) and guided by salient hyperlinks (Fitzsimmons, Weal, & Drieghe, 2013). While hyperlinks can benefit reading by facilitating effective allocation of attention, they can also disrupt the automatic word identification and memory retrieval processes required to integrate the text’s meaning (DeStefano & LeFevre, 2007). Factors related to the ‘visual ergonomics’ of digital text on small-screen devices, such as text size, screen resolution, and luminance contrast have also been found to influence text recall (Garland & Noyes, 2004) and contribute to visual fatigue (Mangen et al., 2013). Constraints on the format of text on small-screen devices may also impair text ‘navigation’ (Mangen et al., 2013). In addition to the demands of scrolling (Wastlund et al., 2008), the need to integrate information across spatial locations that are not simultaneously visible may impair comprehension by disrupting construction of a mental model of the text (Kerr & Symons, 2006).

Apart from differences in reading strategy due to inherent properties of the text format and reader goals, broader contextual factors associated with usage of mobile devices such as smartphones also influence memory and attention (see Wilmer, Sherman, & Chein, 2017, for a review). Mobile phone multitasking is widely considered to be a major source of distraction due to alerts from social media applications, phone calls, or texts (Chen & Yan, 2016). Stothart, Mitchum, and Yehnert (2015) found that performance was disrupted by such alerts even when participants did not directly interact with the phone, suggesting that the attentional costs can arise from task-irrelevant thoughts or mind-wandering. Such distractions have the potential to cause cognitive overload that reduces the resources available for comprehension (Paas, van Gog, & Sweller, 2010).

Metacognitive factors may also contribute to differences between comprehension of printed and digital text. People’s increasing reliance on the Internet as an external, transactive memory system has been shown to encourage a tendency to remember where to access digital information rather than the information itself (Sparrow et al., 2011). Readers’ metacognitive judgements of their comprehension of digital material have also been found to be poorly calibrated. Even when judgements of comprehension were made immediately following exposure to material in both formats, the majority of a university-student sample incorrectly predicted their performance was better for the digitally presented text (Singer & Alexander, 2017b). Similarly, Ackerman and Goldsmith (2011) found that overconfidence in comprehension accuracy was consistently higher for digital than print material, and that monitoring and regulation of digital study time was more ‘erratic’. They suggested that people’s association of digital media with fast, shallow reading of short messages ‘may reduce the mobilization of cognitive resources … needed for effective self-regulation’ (p. 29).

Using misinformation effects to investigate reading comprehension

The present research was designed to directly investigate the impact of reading format on comprehension, integration, and interpretation of information by comparing sensitivity to the correction and retraction of information presented either in printed texts or in digital format on smartphones, indexed by the continued-influence effect (CIE) of misinformation. We also investigated whether the effects of format on misinformation effects differed between readers of English and Chinese.

The CIE was originally demonstrated in a misinformation paradigm in which readers are presented with fictitious reports of an event, unfolding over time (H. M. Johnson & Seifert, 1994; Wilkes & Leatherbarrow, 1988). In the present study, which used materials developed by Ecker, Hogan, and Lewandowsky (2017), these were pairs of newspaper-style articles describing an unfolding event (e.g., a fire). The initial article contains a piece of critical information (e.g., the fire was caused by arson). In critical scenarios, this information is subsequently retracted and corrected with an alternative cause (e.g., the fire was caused by a lightning strike), while for others no correction occurs. The CIE is the robust finding that people continue to report the original misinformation, often much more frequently than the alternative version, even when they acknowledge and remember the retraction (e.g., Ecker, Lewandowsky, & Apai, 2011a; Ecker, Lewandowsky, Swire, & Chang, 2011b; H. M. Johnson & Seifert, 1994). This implies that people often fail to update their original memory of a fact or event, despite successful encoding of the correction. Systematic investigations of the CIE have shown that the impact of exposure to misinformation is very difficult to modify even for neutral scenarios for which there is no inherent reason to favor one alternative over the other, and extends beyond memory to influence related inferences and beliefs (for reviews, see Lewandowsky, Ecker, Schwarz, Seifert, & Cook, 2012; Seifert, 2014). The CIE therefore provides a useful index of the extent to which readers comprehend, integrate, and evaluate information extracted from text, and whether and how these processes are influenced by reading format.

People’s sensitivity to misinformation is an increasing focus of concern in the contemporary ‘posttruth’ era where massive quantities of relatively unregulated information are available through new media and the Internet (Lewandowsky, Ecker, & Cook, 2017). Such media are increasingly accessed through mobile devices but, to our knowledge, there has been no direct investigation of the impact of reading format on misinformation effects or the CIE. If reading on small-screen devices reduces the cognitive resources available for comprehension by impairing navigation through text, or encouraging shallow processing, it may be associated with reduced sensitivity to correction and retraction of information, leading to an enhanced CIE.

The present study

To shed light on the source of any differences in the impact of misinformation as a function of reading format or language/culture, we adapted a design used by Ecker et al. (2017) to investigate whether the CIE is increased or reduced when the original misinformation is repeated in conjunction with its retraction. Some researchers have argued that repeating the original misinformation at the time it is retracted can yield a ‘familiarity backfire effect’ by strengthening the misinformation (e.g., Dechêne, Stahl, Hansen, & Wänke, 2010). Alternatively, theoretical accounts that focus on the salience of the correction propose that repeating the misinformation at the time of retraction can reduce the CIE by highlighting the conflict between the two causal explanations (e.g., Putnam, Wahlheim, & Jacoby, 2014). Detecting the conflict is argued to enhance the likelihood that the misinformation will be updated with the new correct alternative by coactivating the original memory as well as the new correct information (e.g., Kendeou, Walsh, Smith, & O’Brien, 2014). Consistent with the latter view, Ecker et al. (2017) found that explicit retractions accompanied by a reminder of the original misinformation yielded a reduced CIE relative to retractions without reminders. As well as influencing the likelihood of remembering the alternative cause and the retraction of the original misinformation, explicit reminders were associated with a reduced impact of misinformation on broader inferential judgements related to the misinformation.

Ecker et al. (2017) presented scenarios in a slide show on a computer screen. The present research used their stimulus materials but presented the passages either in a hard-copy printed format or on a mobile phone to assess whether reading format influenced memory for the passages and the inferences derived from them. The English-speaking participants, who were drawn from a similar population of Australian university students to those assessed by Ecker et al., were also assessed on reading proficiency to determine whether it moderated the effects of misinformation or reading format.

To investigate whether susceptibility to misinformation and effects of reading format generalized across language and culture, we compared the Australian student sample with a sample of Chinese readers of Mandarin. Although the English and Chinese writing systems differ on a range of fundamental dimensions, many theories of reading assume that the representation and processes involved in comprehension are universal across languages (e.g., Feldman & Moscoso del Prado Martín, 2012). Readers of both English and Chinese rely on phonological processes during the comprehension of written text (Perfetti, Zhang, & Berent, 1992), and neuroimaging research has suggested that common neural substrates underlie syntactic processing in sentence reading across the two languages (Wang et al., 2008). A recent eye-movement study of text reading found strikingly similar reading behavior in English, Chinese, and Finnish (Liversedge et al., 2016), demonstrating that the time required to encode and construct a representation of the meaning of the information conveyed in text was very similar despite the substantial differences between the three writing systems. Such findings suggest that the impact of misinformation on comprehension and memory should be similar across languages.

There may, however, be cultural differences in sensitivity to contradiction. A substantial body of cross-cultural research has documented differences in the reasoning styles of Western and East Asian cultures (see de Oliveira & Nisbett, 2017; Varnum, Grossman, Kitayama, & Nisbett, 2010, for reviews). In contrast to the analytic reasoning style that is dominant in many Western cultures, East Asian cultures have been found to rely on a more dialectical style that is characterized by a focus on context and relationships (e.g., de Oliveira & Nisbett, 2017). These two styles are associated with different responses to contradictions (e.g., Peng & Nisbett, 1999). The analytic approach requires resolution of contradictions by differentiating the alternatives to select one option as preferred, and reject the other. Within the dialectical approach, contradictions are accepted as a necessary element of a changing world rather than a logical problem to be solved, and typically dealt with by seeking a compromise or ‘middle way’ that retains the basic elements of both competing perspectives. Peng and Nisbett (1999) observed cultural differences in responses to contradictions even in American and Chinese students attending the same U.S. university: exposure to contradictory information polarized the American students’ preference for the alternative options relative to when they were presented alone, while it led Chinese students to rate both options as equivalently acceptable.

The present study investigated whether these different responses to contradiction generalized to the misinformation paradigm. If so, Chinese participants may be more likely to retain a representation of the original misinformation, and therefore show a stronger CIE. If reading on a mobile phone further reduces analytical processing, the CIE may be particularly strong in Chinese readers exposed to the mobile phone reading condition.

Two parallel experiments were conducted, one in Sydney, Australia, and the second in Shanghai, China. The Australian study used a subset of Ecker et al.’s (2017) materials, which were translated from English into Mandarin for the Shanghai study. The same general procedures were used in both studies. The methods used in the Australian study are described first, followed by a summary of minor differences specific to the Chinese sample.

Method: Australian study

Participants

The Australian sample comprised 307 undergraduate students recruited from the University of Sydney as part of a second-year psychology class exercise, of whom 60 participants were excluded as they did not consent to their data being collated or failed to complete all the experimental tasks. This left a final sample of 247 participants for analyses (see Table 1 for participant demographics).

Table 1. Summary of the characteristics of the participants randomly allocated to each format condition

Experimental design and materials

Each participant read four pairs of short newspaper-article style (70–140 words) passages from Ecker et al. (2017; see Appendix) that described an unfolding news scenario. The first article introduced the scenario (e.g., a fire) and described what happened, while the second contained updated information about the event. The first article for each scenario, which was identical across all conditions, contained a piece of critical information related to the cause of the event (e.g., “the fire had been deliberately lit”) that provided the potential target for the retraction manipulation in the second article. There were three versions of each second article corresponding to the three retraction conditions. In the control no-retraction (NR) condition, the second article did not retract, or repeat, the critical information from the first article. In the retraction with no reminder (RNR) condition, updated information was provided in the second article that contradicted the critical causal information from the first article, but did not explicitly refer back to it (e.g., “After a full investigation and review of witness reports, authorities have concluded that the fire was set off by lightning strikes”). The second retraction condition, retraction with explicit reminder (RER), presented the same updated information in the second article, but explicitly repeated the critical information from the first article before correcting it (e.g., “It was originally reported that the fire had been deliberately lit, but authorities have now ruled out that possibility. After a full investigation and review of witness reports, authorities have concluded that the fire was set off by lightning strikes”).

To reduce the likelihood that participants would be alerted to retractions, each participant read article pairs for two scenarios in the control NR condition and one scenario in each of the retraction conditions. The order of presentation of the scenarios was controlled so that the first scenario was always from the control NR condition and the two retraction conditions were separated by a control NR scenario. That is, the order of scenario conditions was always either (i) NR, RNR, NR, RER or (ii) NR, RER, NR, RNR. The assignment of scenarios to conditions was counterbalanced so that, across participants, all four scenarios occurred approximately equally often in each retraction condition and in each presentation order. There were therefore eight counterbalanced lists of passages.
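The counterbalancing scheme described above can be illustrated with a short sketch. It assumes, purely for illustration, that the eight lists arise from crossing the two stated condition orders with four rotations of the scenarios; the scenario labels and the function name `build_lists` are hypothetical and do not come from the original materials.

```python
from itertools import product

# Hypothetical scenario labels; the text names only the fire scenario.
SCENARIOS = ["scenario_A", "scenario_B", "scenario_C", "scenario_D"]

# The two condition orders stated in the text.
CONDITION_ORDERS = [
    ("NR", "RNR", "NR", "RER"),
    ("NR", "RER", "NR", "RNR"),
]


def build_lists(scenarios=SCENARIOS, orders=CONDITION_ORDERS):
    """Return one list per (condition order, scenario rotation) pair.

    Assumption: scenarios are rotated across serial positions, so each
    scenario visits every position once within each condition order.
    """
    lists = []
    for order, shift in product(orders, range(len(scenarios))):
        rotated = scenarios[shift:] + scenarios[:shift]
        # Pair each scenario with the condition at the same serial position.
        lists.append(list(zip(rotated, order)))
    return lists


counterbalanced = build_lists()
for number, passage_list in enumerate(counterbalanced, start=1):
    print(number, passage_list)
```

Under this assumed rotation, every list begins with an NR scenario, and across the eight lists each scenario serves twice in the RNR condition, twice in the RER condition, and four times in the NR condition.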

The novel addition to Ecker et al.’s (2017) design was a manipulation of the format in which the passages were read. Participants were randomly allocated to read the four articles either on paper or on their mobile phones. The stimulus materials therefore formed a mixed 2 × 3 design, where reading format (paper, mobile) was the between-subjects factor and retraction condition (NR, RNR, RER) was the within-subjects factor.

The Australian sample were also assessed on two measures of language proficiency. The Shipley Vocabulary Scale (Shipley, 1940) consists of 40 items for which participants must select the word that most closely matches the meaning of a target word from amongst four alternatives. The Author Recognition Test (ART) was originally developed by Stanovich and West (1989) to provide an index of reading experience by assessing accuracy of discriminating the names of real authors from fictitious names. Moore and Gordon (2015) used item response theory to develop the updated 50-item version employed here. The standard scoring procedure of subtracting the number of false alarms to lures from the number of correct selections was used. The two scores were highly correlated (r = .70) so they were standardized and averaged to form a single measure of overall proficiency.
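The scoring of the ART and the construction of the composite proficiency measure can be sketched as follows. This is a minimal illustration with made-up scores for five hypothetical participants; the use of the sample standard deviation for standardization is an assumption, as the text does not specify it.

```python
from statistics import mean, stdev


def art_score(hits, false_alarms):
    """ART score: correct author selections minus false alarms to lures."""
    return hits - false_alarms


def z_scores(values):
    """Standardize values using the sample mean and sample SD (assumed)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def proficiency_composite(vocab, art):
    """Average the standardized vocabulary and ART scores per participant."""
    return [(zv + za) / 2 for zv, za in zip(z_scores(vocab), z_scores(art))]


# Made-up data: vocabulary scores and (hits, false alarms) on the ART.
vocab = [28, 31, 35, 25, 33]
art = [art_score(h, f) for h, f in [(12, 1), (15, 0), (22, 2), (9, 3), (18, 1)]]
print(proficiency_composite(vocab, art))
```

Because both components are standardized before averaging, the composite has a mean of zero in the sample, and neither measure dominates simply because of its raw score range.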

Comprehension questionnaire

Memory and comprehension for the scenarios were assessed using a modified version of the questionnaire used by Ecker et al. (2017; see Appendix). For each scenario, participants completed two open-ended free recall questions (e.g., “Briefly summarise the main points of the bushfire articles”; “What was the cause of the bushfire?”), as well as three four-alternative multiple-choice questions testing fact memory (e.g., “Where did the bushfire occur?”). Then, inferences related to the critical information were assessed using three rating-scale questions designed to elicit a judgement (e.g., “The government should spend more resources to prevent arson”).

Procedure

Participants were tested in class groups that were randomly assigned to read the experimental passages either on paper or on their own mobile phones. Within each class, participants in the paper condition were given booklets containing the passages, while those in the mobile phone condition were given a link to a website where they could access the passages online. The online passages were presented in paragraph format, identical to the paper materials, using the Qualtrics survey platform. Each passage appeared on a single page/screen.

Participants first read the ethics-approved information statement before providing consent for participation in the study. At the end of the study, in accordance with requirements of the ethics committee, they were given the opportunity to exclude their data from data collation and analysis. An example pair of passages and set of comprehension questions were presented in a group format to illustrate the timing of passage presentation and the nature of the comprehension requirements. Participants then read the four passage pairs in their assigned format condition. Following Ecker et al. (2017), encoding time was controlled by presenting each article for a fixed maximum time (0.35 s per word), which allowed a comfortable, but not excessive, time to complete reading of the article. In the print condition, time limits were signalled on each participant’s computer screen. All participants could progress to the next passage before the time expired if they chose.
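As a worked illustration of the timing rule, the sketch below applies the stated rate of 0.35 s per word to the shortest and longest passage lengths given in the materials section (70 and 140 words). The function name is hypothetical.

```python
SECONDS_PER_WORD = 0.35  # rate stated in the text


def max_reading_time(word_count, rate=SECONDS_PER_WORD):
    """Maximum presentation time in seconds for a passage of word_count words."""
    return word_count * rate


# Shortest and longest passage lengths reported for the materials.
for n_words in (70, 140):
    print(n_words, "words:", round(max_reading_time(n_words), 1), "s")
```

On this rule, the maximum presentation time ranges from 24.5 s for a 70-word article to 49 s for a 140-word article.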

To provide a delay before assessing memory for the passages, participants completed a 10–15-minute questionnaire that included demographic questions about age and language history and the two measures of written language proficiency. All participants then completed the comprehension questionnaire on individual computers. Questions about the four scenarios were presented in the order in which the participant had read the passages. A 5-minute time limit was imposed for completing the set of questions for each passage.

Method: Chinese study

Participants

A total of 235 students were recruited from Shanghai Jiao Tong University. Two participants who answered the open-ended questions in English and four nonnative Chinese speaking participants were excluded from the final analyses, leaving a final sample of 229 participants (see Table 1 for participant demographics).

Experimental design and materials

All the experimental materials, including the comprehension questionnaire, were translated into Chinese by two bilingual college students using the ‘double-blind principle’ (Brislin, 1980). All aspects of the design and materials were the same as the Australian study. Participants were randomly allocated to read the four pairs of passages in either the paper or mobile format.

Procedure

The general procedure was the same as the Australian study but participants were tested in small groups rather than a practical class, and completed an unrelated visual experiment and questionnaire for 20 minutes in the interval between reading the passages and completing the comprehension questionnaire on a computer. Questions about the four scenarios were presented in a fixed order for all participants such that for some participants this order would have been different from the one in which they read the original passage pairs.

Results

Questionnaire scoring

The comprehension questionnaire responses were coded following procedures adapted from the methods described in detail by Ecker et al. (2017). In each testing location, two scorers applied the same systematic set of scoring criteria and consulted on ambiguous cases to code the five dependent variables described below for each condition. Scores for the two control NR scenarios presented to each participant were averaged, yielding three scores on each dependent variable: NR, RNR, and RER.

The general memory score was calculated for each scenario based on (i) the number of correct idea units (including themes and details from Ecker et al.’s, 2017, criteria; maximum of four) included in the participant’s open-ended response, and (ii) the number of correctly answered multiple-choice questions (maximum of three). These two components were summed and averaged to form a score ranging from 0 to 1, where 1 indicated perfect recall. The idea units coded for this score did not mention the critical cause or its alternative.

Memory for the critical information that distinguished the retraction conditions was assessed by three scores calculated using the procedures followed by Ecker et al. (2017). Separate scores of 0 or 1 were assigned according to whether the participant’s initial open-ended response referred to (i) the original critical cause from the first passage (e.g., the fire was caused by arson); (ii) the alternative cause presented in the second passage of the RNR and RER conditions (e.g., the fire was caused by lightning strike); and (iii) the retraction or change of causal information (e.g., it was initially thought the fire was caused by arson, but investigation revealed it was actually caused by lightning strike). The two latter scores were not calculated for NR passages because they did not include an alternative cause.

The inference score was calculated from the three rating-scale questions that assessed judgements related to the critical information. These ratings were summed and converted to a score ranging from 0 to 1 where a high score indexed a stronger influence of the original misinformation on inferences about the implications of the scenarios (e.g., a high rating to “How mistrustful would local residents be after the bushfire?”).
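The scoring steps described in this section can be sketched in code. The sketch makes two assumptions that are not stated in the text: that the general memory score is the proportion of the seven available points earned (one reading of "summed and averaged"), and that the rating scales run from 0 to 10 with all items already oriented so that higher ratings reflect greater reliance on the original misinformation. All function names are hypothetical.

```python
from statistics import mean

MAX_IDEA_UNITS = 4      # stated in the text
MAX_MC_CORRECT = 3      # stated in the text
N_INFERENCE_ITEMS = 3   # stated in the text
RATING_MIN, RATING_MAX = 0, 10  # assumed scale endpoints, not given in the text


def general_memory(idea_units, mc_correct):
    """Proportion of the 7 available points earned (assumed reading)."""
    return (idea_units + mc_correct) / (MAX_IDEA_UNITS + MAX_MC_CORRECT)


def inference_score(ratings):
    """Rescale the summed ratings to 0-1; higher means a stronger influence
    of the original misinformation (items assumed already oriented)."""
    if len(ratings) != N_INFERENCE_ITEMS:
        raise ValueError("expected one rating per inference item")
    span = RATING_MAX - RATING_MIN
    return (sum(ratings) - N_INFERENCE_ITEMS * RATING_MIN) / (N_INFERENCE_ITEMS * span)


def condition_scores(scenario_scores):
    """Collapse (condition, score) pairs into one score per condition,
    averaging the two NR scenarios each participant read."""
    by_condition = {}
    for condition, score in scenario_scores:
        by_condition.setdefault(condition, []).append(score)
    return {condition: mean(scores) for condition, scores in by_condition.items()}


# Made-up responses for one hypothetical participant, in presentation order.
participant = [
    ("NR", general_memory(3, 2)),
    ("RNR", general_memory(2, 3)),
    ("NR", general_memory(4, 3)),
    ("RER", general_memory(1, 1)),
]
print(condition_scores(participant))
print(inference_score([7, 4, 9]))
```

The three binary critical memory scores (original cause, alternative cause, retraction) depend on human coding of the open-ended responses, so they are not reproduced here.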

Main analyses

As summarized in Table 1, the Australian and Chinese samples were demographically similar and the random allocation to different reading formats resulted in groups that did not differ significantly on age, gender, and, for the Australian sample, reading proficiency (all ps > .05). Mean scores for both language groups on each dependent variable in each condition are presented in Table 2.

Table 2. Mean (and standard error) scores on general memory, critical memory, and inference for Australian and Chinese samples across retraction and format conditions

Omnibus analyses of variance (ANOVA) were conducted on each of the dependent variables treating retraction condition as a within-subjects factor, and format and group as between-subject factors. Follow-up contrasts were carried out to decompose significant main effects and interactions. For the retraction factor, two contrasts were tested (i) the retraction effect: comparison of the control NR condition with the average of the two retraction conditions (RNR and RER), and (ii) the reminder effect: the difference between the RNR and RER conditions. Interactions involving group were followed up by separate analyses of each language group.
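The two planned contrasts on the retraction factor can be expressed as weight vectors over the three conditions. The sketch below computes both contrast scores for one hypothetical participant; the sign convention (NR minus the retraction average, and RNR minus RER) is an assumption made for illustration, and the scores are made up.

```python
# Contrast weights over the three retraction conditions.
RETRACTION_WEIGHTS = {"NR": 1.0, "RNR": -0.5, "RER": -0.5}  # NR vs. mean(RNR, RER)
REMINDER_WEIGHTS = {"NR": 0.0, "RNR": 1.0, "RER": -1.0}     # RNR vs. RER


def contrast(scores, weights):
    """Weighted sum of condition scores for one participant."""
    return sum(weights[condition] * scores[condition] for condition in weights)


def are_orthogonal(w1, w2):
    """Two contrasts are orthogonal when their weight products sum to zero."""
    return abs(sum(w1[c] * w2[c] for c in w1)) < 1e-12


# Made-up condition scores for one hypothetical participant.
scores = {"NR": 0.8, "RNR": 0.5, "RER": 0.3}
print("retraction effect:", contrast(scores, RETRACTION_WEIGHTS))
print("reminder effect:", contrast(scores, REMINDER_WEIGHTS))
print("orthogonal:", are_orthogonal(RETRACTION_WEIGHTS, REMINDER_WEIGHTS))
```

Each set of weights sums to zero, and the two sets are orthogonal, so together they partition the two degrees of freedom of the three-level retraction factor.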

General memory scores

There was a significant main effect of retraction condition on general memory scores, F(2, 472) = 15.98, p < .001, η² = .033. Follow-up contrasts showed a significant retraction effect, with higher general memory scores for the NR than for the average of the RNR and RER conditions, F(1, 472) = 32.69, p < .001, η² = .065, but no significant difference between the latter two conditions (p = .30), that is, no significant reminder effect. The main effect of format was not significant (p > .05). However, there was a significant main effect of group because Chinese participants showed significantly higher general memory scores than did Australian participants, F(1, 472) = 22.66, p < .001, η² = .046. There was also a significant interaction between format and group, F(1, 472) = 6.78, p = .009, η² = .014. A separate analysis of the Chinese group alone showed that general memory scores were significantly higher in the paper than in the mobile condition (M = 0.603 vs. M = 0.556, respectively), F(1, 227) = 6.44, p = .012, η² = .028. In contrast, the small difference between memory scores for the paper and mobile conditions (M = 0.513 vs. M = 0.526, respectively) observed in the Australian group was not significant (F < 1).

Critical memory scores

Critical information

The critical information scores showed a significant overall effect of retraction condition, F(2, 472) = 12.06, p < .001, η² = .025, that reflected both a significant retraction effect, F(1, 472) = 6.31, p = .012, η² = .013, due to poorer memory for the critical cause from the NR scenarios than for the average of the RNR and RER conditions; and a significant reminder effect, F(1, 472) = 16.75, p < .001, η² = .034, due to superior memory in the RER than in the RNR condition. The main effect of format was not significant (F < 1), and format did not significantly interact with retraction condition, F(2, 472) = 2.34, p = .096. Paralleling the general memory scores, there was a significant main effect of group, F(1, 472) = 22.66, p < .001, η² = .046, which reflected significantly higher recall of the critical information in Chinese than Australian participants (M = 0.677 vs. M = 0.488, respectively). Group did not significantly interact with either retraction or format condition (both Fs < 1).

Alternative cause and retraction

The alternative cause was not presented in the NR passages, so these analyses were restricted to the RNR and RER conditions. Recall of the alternative cause was not significantly affected by retraction condition, F(1, 472) = 2.90, p = .089; group, F(1, 472) = 1.79, p = .181; or format (F < 1). However, memory that the original critical cause was retracted or changed was significantly affected by retraction condition, F(1, 472) = 61.87, p < .001, η² = .116, because retraction recall rate was higher in the RER condition. Format significantly interacted with retraction condition because the higher recall of the retraction in the RER than in the RNR condition was significantly larger in the mobile than in the paper condition, F(1, 472) = 6.30, p = .012, η² = .013. There was also a significant main effect of group, F(1, 472) = 4.06, p = .044, η² = .009, which, unlike the general and critical information scores, reflected higher retraction recall in the Australian than in the Chinese sample (M = 0.469 vs. M = 0.404, respectively).

In summary, memory for general details was better for the control NR passages than for the two retraction conditions, but memory for the critical causal information was poorer for the NR passages than for the retraction conditions, and poorer in the RNR than in the RER condition. Although memory for the alternative cause did not differ between the RNR and RER conditions, participants were more likely to report the retraction/change in causal information for the RER condition, which included an explicit reminder of the original information. There were no significant main effects of format on any memory measure, but the retraction memory scores showed a significantly larger reminder effect in the mobile than in the print condition. Finally, there were significant effects of language group on all memory measures except the alternative cause: Chinese participants showed better memory for both general information and the original critical cause than did the Australian participants, but were less likely to report the retraction/change in causal information.

Inference scores

Analysis of the inference scores yielded a significant main effect of retraction condition, F(2, 472) = 159.57, p < .001, ƞ2 = .253. The follow-up contrasts showed that this reflected both a significant retraction effect, F(1, 472) = 391.67, p < .001, ƞ2 = .453, and a significant reminder effect, F(1, 472) = 10.47, p = .001, ƞ2 = .022, due to lower inference scores in the retraction conditions, particularly RER. The main effect of Format was also significant, F(1, 472) = 14.46, p < .001, ƞ2 = .030, because inference scores were significantly lower for the paper than for the mobile condition. There was also a significant interaction between retraction and format condition, F(2, 472) = 4.33, p = .013, ƞ2 = .009. The follow-up contrasts indicated that this reflected a larger retraction effect in the paper than in the mobile condition, F(1, 472) = 9.72, p = .002, ƞ2 = .020, while the reminder effect did not significantly differ between formats (F < 1).
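The two follow-up contrasts can be pictured as a pair of orthogonal comparisons over the three retraction conditions. The sketch below is illustrative only: the contrast weights are an assumption about how a "retraction effect" (NR vs. the two retraction conditions) and a "reminder effect" (RNR vs. RER) would typically be coded, and the cell means are invented values in arbitrary units, not data from this study.

```python
# Assumed contrast weights over the three retraction conditions.
RETRACTION = {"NR": 1.0, "RNR": -0.5, "RER": -0.5}  # control vs. mean of retraction conditions
REMINDER = {"NR": 0.0, "RNR": 1.0, "RER": -1.0}     # retraction without vs. with reminder


def contrast_value(weights: dict, cell_means: dict) -> float:
    """Weighted sum of condition means for one contrast."""
    return sum(weights[cond] * cell_means[cond] for cond in weights)


def are_orthogonal(w1: dict, w2: dict) -> bool:
    """Two contrasts are orthogonal (with equal cell sizes) if their weights' products sum to zero."""
    return sum(w1[cond] * w2[cond] for cond in w1) == 0


# Invented inference-score means, chosen only to mirror the direction of the
# reported pattern (lower scores after retraction, lowest in RER).
example_means = {"NR": 6.0, "RNR": 3.5, "RER": 3.0}

print(are_orthogonal(RETRACTION, REMINDER))
print(contrast_value(RETRACTION, example_means))  # NR minus the mean of RNR and RER
print(contrast_value(REMINDER, example_means))    # RNR minus RER
```

Because the two sets of weights are orthogonal, the retraction and reminder effects partition the two degrees of freedom of the omnibus retraction-condition effect into independent single-df tests.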

The main effect of Group was significant, F(1, 472) = 70.94, p < .001, ƞ2 = .131, due to lower inference scores for the Australian than for the Chinese participants. Group also participated in two significant interactions on inference scores. Firstly, group significantly interacted with retraction condition, F(2, 472) = 14.36, p < .001, ƞ2 = .030, with follow-up contrasts indicating that the retraction effect was smaller in the Chinese than in the Australian group, F(1, 472) = 36.44, p < .001, ƞ2 = .072. Secondly, group interacted significantly with format, F(1, 472) = 3.96, p = .047, ƞ2 = .008, because the difference in inference scores between paper and mobile format was greater in Chinese than in Australian participants. A separate analysis of the Chinese sample alone confirmed that they showed a significant interaction between retraction and format condition, F(2, 227) = 5.01, p = .007, ƞ2 = .022, which was due to a stronger retraction effect in the print than in the mobile condition, F(1, 227) = 13.49, p < .001, ƞ2 = .056. Neither of these interactions was significant in the Australian sample (both Fs < 1.47).

In summary, inference scores were significantly reduced by retraction of the original information, demonstrating that these judgements were sensitive to participants’ perceptions of the cause of the events described in the passages. Inference scores were also significantly lower in the paper than in the mobile condition, particularly in the two retraction conditions, demonstrating a stronger CIE in the mobile condition. Inference scores were also modulated by language group: Chinese participants showed higher average inference scores than did Australian participants, particularly in the retraction conditions, indicating that the Chinese sample showed a stronger CIE that was most marked when they read the passages on a mobile phone.

Supplementary analyses

Two supplementary sets of analyses of covariance (ANCOVA) were conducted. The first assessed whether general memory performance modulated the critical memory and inference scores, while the second evaluated the contribution of proficiency to performance of the Australian sample.

General memory

The Chinese participants showed significantly better general memory for the passages. To determine whether differences in general memory ability contributed to the stronger CIE evident in critical memory and inference scores for the Chinese sample, ANCOVAs were conducted on these dependent measures, including (centred) general memory score as a covariate (see Figs. 1 and 2).

Fig. 1. Memory scores for Australian and Chinese participants across retraction and format conditions with general memory scores partialed out. From top to bottom, the panels show data for (1) critical information, (2) retraction, (3) alternative cause, and (4) inferences (error bars are within-subject standard errors).

The ANCOVA analyses of memory for the critical information showed that general memory significantly predicted scores for the critical cause, the alternative cause, and retraction (all ps < .001). There was no change in the significant effects of retraction condition and group on memory for the critical cause or retraction when general memory was controlled. However, including it as a covariate in the analysis of alternative cause revealed a significant main effect of group, F(1, 470) = 12.50, p < .001, ƞ2 = .026, that was not observed in the main analysis. This occurred because, when general memory performance was controlled, recall of the alternative information was significantly higher in the Australian than in the Chinese sample (M = 0.61 vs. M = 0.49, respectively).
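To see how a group difference can emerge only once a covariate is controlled, consider the classical covariate-adjusted mean for a single grouping factor. The sketch below is a simplified illustration, not the multi-factor analysis reported here: the function name, group labels, and data are all hypothetical, and the adjustment uses the pooled within-group regression slope evaluated at the grand mean of the covariate (which is equivalent to centring the covariate).

```python
from statistics import mean


def adjusted_group_means(groups: dict) -> dict:
    """Covariate-adjusted outcome means for a one-factor design.

    groups maps a group name to a list of (covariate, outcome) pairs.
    Each group's outcome mean is shifted along the pooled within-group
    slope to the value expected at the grand mean of the covariate.
    """
    grand_cov = mean(x for pairs in groups.values() for x, _ in pairs)
    sum_xy = 0.0  # pooled within-group cross-products
    sum_xx = 0.0  # pooled within-group covariate sum of squares
    group_means = {}
    for name, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        group_means[name] = (mx, my)
        sum_xy += sum((x - mx) * (y - my) for x, y in pairs)
        sum_xx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sum_xy / sum_xx
    return {
        name: my - slope * (mx - grand_cov)
        for name, (mx, my) in group_means.items()
    }


# Invented data: (general memory, alternative-cause recall) per participant.
# Both groups have the same raw recall mean, but group_b has higher general memory.
example = {
    "group_a": [(0.3, 0.4), (0.4, 0.5), (0.5, 0.6)],
    "group_b": [(0.5, 0.4), (0.6, 0.5), (0.7, 0.6)],
}

for name, value in adjusted_group_means(example).items():
    print(f"{name}: adjusted mean {value:.2f}")
```

In this invented example the raw recall means are identical, yet the adjusted mean is higher for the group with lower general memory, because its recall is better than its general memory would predict. This is the same logic by which a group effect on alternative-cause recall can appear only after general memory is partialed out.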

A parallel ANCOVA analysis of inference scores showed that general memory scores did not significantly moderate inference scores (F < 1). All main effects and interactions as well as follow-up contrasts remained significant when general memory scores were partialed out (ps < .05). However, the interaction between group and format was only marginally significant in the ANCOVA, F(1, 471) = 3.67, p = .056, ƞ2 = .008.

Thus, the ANCOVA analyses suggest that Chinese participants’ generally better memory for the passages extended to the original critical cause, but that their memory for the alternative cause was poorer than expected from their general memory performance. However, general memory did not significantly modulate the effects of language group on inference scores.

Reading proficiency

The second set of ANCOVAs, which were limited to the Australian sample, included participants’ standardized overall proficiency scores as a continuous covariate in analyses of each dependent variable to determine whether the memory or inference scores were modulated by reading proficiency, as assessed by the combined vocabulary and ART scores.

Proficiency predicted significant variance in all measures (all ps < .001), reflecting higher average memory for both general and critical information, and lower inference scores, in higher proficiency participants. However, controlling proficiency did not significantly modulate the pattern of effects of format or retraction condition, and the only significant interaction involving proficiency was a significantly larger increase in memory for the retraction in the RER relative to the RNR condition in higher proficiency participants, F(1, 244) = 5.16, p = .024, ƞ2 = .021.

Discussion

The continued-influence effect (CIE) of misinformation is a very robust phenomenon that has been demonstrated in many independent studies (see Lewandowsky et al., 2012; Seifert, 2014, for reviews). However, the vast majority of this research has assessed readers of English from Western cultures, typically reading material presented on computers. To the best of our knowledge, the current study is the first to demonstrate the CIE in Chinese readers, and the first to provide systematic evidence that it generalizes across reading formats.

The overall results replicated Ecker et al.’s (2017) evidence that the continued influence of retracted information on inferential reasoning was reduced when the critical misinformation was repeated at the time that it was explicitly retracted. However, average memory performance revealed some minor differences from Ecker et al.’s findings. In contrast to Ecker et al., general memory for noncritical information was higher for the NR condition than for the two retraction conditions. This may reflect the fact that there was simply more information to summarize in the retraction conditions, which contained information about the alternative cause. The time constraints imposed on answering questions about each vignette may therefore have encouraged participants to focus on the critical and alternative cause information and limit their report of other details. Consistent with this possibility, average recall of the original critical cause was significantly higher in the retraction conditions.

There was, however, no difference between general memory scores for the RNR and RER passages. Nevertheless, memory for the critical cause, memory for its retraction, and inference scores all differed significantly between these conditions. This confirms that Ecker et al.’s (2017) finding, that explicitly repeating the original misinformation reduces the CIE, generalizes from the computer format used in their study to information read in paper and mobile phone formats.

These overall findings replicate and extend evidence for the beneficial effect of reminders about the original information presented at the time of retraction and converge with Ecker et al.’s (2017) conclusion that, rather than yielding a ‘familiarity backfire effect’ (Dechêne et al., 2010), reminders facilitate the correction of misinformation and updating of mental representations of the event with a plausible alternative cause. Reminder effects were significant in the average critical memory and inference data for both language groups, and in both formats, suggesting that they reflect generalizable, relatively universal cognitive processes. However, the results also revealed that the continued influence of misinformation, particularly on inferential reasoning, was modulated by both format and language group. We discuss each of these effects, and their implications, below.

The impact of reading format on the CIE

The format in which the passages were read significantly modulated the CIE on both memory and inference scores, but in different ways. The general pattern of memory performance was similar in the paper and mobile conditions, and there were no significant main effects of format on memory for the general or critical information. However, the impact of including a reminder of the original cause on memory for the retraction was stronger in the mobile than in the paper condition due to both lower retraction memory in the RNR condition and higher retraction memory in the RER condition. This implies that readers were less likely to effectively update their memory with the alternative information when passages were read on a mobile phone, unless it was made salient by an explicit reminder of the discrepancy with the originally stated cause.

By contrast, the interactions of format with retraction condition on inference scores were due to the presentation of an alternative cause, regardless of whether it was accompanied by a reminder of the discrepancy with the original cause: The reduction in average inference scores for retraction relative to control passages was significantly smaller in the mobile than in the paper condition, but format did not significantly modulate the difference between the RNR and RER conditions. The reduced retraction effect on inference scores in the mobile condition was principally due to Chinese readers; readers of English did not show significant format effects on inference scores.

The limited impact of reading format on memory performance is consistent with previous evidence of null effects on memory and comprehension of short passages (Singer & Alexander, 2017a). However, reading on a mobile phone reduced the likelihood that readers noticed the discrepancy between the original and alternative causal information when it was not explicitly repeated in the text and, in readers of Chinese but not English, reduced the extent to which inferences related to the event were modified by contradictory information. Both effects of format were relatively modest, but it is noteworthy that they occurred under controlled conditions that reduced or eliminated many of the features of mobile text reading that are thought to impair comprehension. The texts presented in the paper and mobile conditions were identical: They did not contain any hypertext features or require scrolling, and the whole passage could be viewed simultaneously in both conditions. The observed differences between paper and mobile formats therefore seem unlikely to be due to the impact of visual ergonomic or navigational factors on cognitive load (Singer & Alexander, 2017a).

Participants were not explicitly required to disable social media applications, but they were tested in classroom or laboratory contexts under the supervision of teaching or research staff, and under time-constrained conditions. Nevertheless, the availability of these distractions has been found to influence performance even when participants cannot interact with their phone (Stothart et al., 2015) and may therefore contribute to the enhanced CIE observed in the mobile conditions of the present study. Another possible contributor to format differences is suggested by O’Rear and Radvansky’s (2019) recent evidence that many participants exposed to a misinformation paradigm reported that they did not believe the retraction. It is possible that awareness of the potential inaccuracy of information on the Internet makes people more unwilling to accept a retraction when it is encountered digitally, leading to a stronger CIE. However, the fact that the format effect was limited to the no-reminder condition suggests that it is more likely to reflect a failure to notice the contradiction than a failure to believe it: When the contradiction was highlighted by a reminder of the original cause, memory for the retraction was equivalent in the mobile phone and paper conditions. This suggests that increasing the salience of discrepant information by highlighting contradictions may encourage allocation of attentional resources and compensate for the disruptive impact of distraction on self-regulation and monitoring of mobile phone reading (Ackerman & Goldsmith, 2011; Singer & Alexander, 2017b).

The impact of language/culture on the CIE

The results for the Chinese sample confirmed the cross-language generality of both the CIE and the beneficial effects of explicit reminders of the previous information in reducing susceptibility to misinformation. Like the readers of English tested by Ecker et al. (2017) and the present Australian sample, Chinese readers showed less influence of the original misinformation when it was repeated in the second passage than when it was not. However, the results also revealed significant differences between the effects of both reading format and retraction condition on readers of English and Chinese.

Chinese readers showed significantly better memory for both the general details of the passages and the original critical cause. However, they were less likely to report the retraction or change in causal information. Including general memory scores as a covariate also revealed significantly lower memory for the alternative cause in the Chinese than in the Australian sample, providing further evidence that Chinese readers’ superior memory did not extend to the alternative cause and retraction. The stronger continued influence of misinformation on Chinese readers was most marked in inference scores, where the impact of retraction was significantly smaller than that observed in readers of English, particularly when the passages were read on a mobile phone. Australian and Chinese participants showed very similar inference ratings for the control NR passages, regardless of format, demonstrating that the inferences made in response to the original causes were similar across language/culture. However, Chinese readers showed less reduction in inference ratings when the original cause was retracted than did readers of English, suggesting that they were less likely to modify inferences related to the event when the original information was contradicted. Thus, even though Chinese readers were at least as likely as readers of English to correctly encode and remember the passages, they showed a stronger continuing influence of initial misinformation that was subsequently retracted on both what they remembered and the broader inferences they drew from it.

The differences between readers of Chinese and English in both memory and sensitivity to misinformation and format converge with the evidence of cross-cultural differences between East Asian and Western cultures in thinking and reasoning style reviewed in the Introduction. The Chinese teaching and assessment systems, and the memory demands imposed by the Chinese writing system, encourage memorization and reproduction of knowledge (e.g., Stephenson, Paine, & Meltzer, 1990), rather than focusing on the critical thinking and analysis skills emphasized in Western education (D. W. Johnson & Johnson, 1993). The present finding of superior memory for the general details of the passages in Chinese readers is consistent with previous evidence that, when they expect a memory test, Chinese learners tend to concentrate on the details of individual items to be remembered and apply a direct retrieval strategy, rather than to integrate across items and rely on the general gist memory typically used by Western learners (Cowley, 2002). Similarly, our Chinese readers remembered more details of the passages, but were apparently less likely than the readers of English to integrate the discrepancy between the original and alternative cause in their memory representation of the passages. They also showed much smaller changes in inference scores in response to retraction, and stronger effects of reading format on sensitivity to retraction.

These enhanced CIE effects on inference scores in Chinese readers correspond to the reduced sensitivity to contradiction predicted from reliance on a dialectical rather than analytic approach to inferential reasoning. According to de Oliveira and Nisbett (2017), the dialectical reasoning strategy that is characteristic of Chinese culture favours the continuity principle and attempts to reconcile contradictions by accepting multiple perspectives and searching for a ‘middle way’ between opposing propositions. In the context of the misinformation paradigm, this may mean that Chinese readers are more likely to maintain both the original and alternative causes of the event in their memory representation, so that original misinformation continues to affect reasoning even when it is explicitly retracted.

In contrast, the Australian readers of English appeared to apply an analytic approach in which logic rules were applied to contradictory propositions to select which was more valid. The temporal structure of the unfolding scenarios favoured rejection of the original cause, leading to updating of memory when it was retracted and substituted with an alternative, plausible cause, and therefore a weaker CIE.

Thus, the differences between Chinese and Australian participants appear to reflect cultural influences rather than language per se, although language and/or writing system may also contribute. Both the stimulus materials and the testing context for each of our language groups were presented in their native language, which may exaggerate cultural influences. However, distinctions between dialectical and analytic processing have been demonstrated even when participants are tested in their second language (Ji, Zhang, & Nisbett, 2004). Further research on bilingual participants is required to investigate the boundary conditions governing misinformation effects.

The results for the Australian sample also demonstrated the contribution of language proficiency to the CIE. Consistent with recent evidence that higher vocabulary (De keersmaecker & Roets, 2017) and working memory (Brydges, Gignac, & Ecker, 2018) were associated with a reduced susceptibility to ongoing effects of retracted information, the present results showed that higher scores on a composite index of vocabulary and reading experience significantly predicted superior memory and a reduced CIE on inferential reasoning. The contribution of general verbal ability to reading comprehension is not surprising. However, the fact that its impact can be observed in memory scores within restricted-range samples of university students highlights the contribution of reading comprehension to people’s sensitivity to misinformation effects. Most investigations of the CIE have been conducted on WEIRD (Henrich, Heine, & Norenzayan, 2010) samples of university students from Western cultures, limiting the generality of the findings. If the continued influence of misinformation is stronger in lower proficiency readers even within populations of above-average readers, it is likely to have an even stronger impact within more representative community samples.

In conclusion, this research demonstrates the utility of using sensitivity to misinformation and contradiction to investigate how memory, comprehension, and inferential reasoning are influenced by the format in which information is read and the language/culture of the reader. In this digital era, when mobile devices are becoming the major means by which people acquire information, understanding the cognitive and social implications of their use is an increasingly important focus of research.