For good reasons, the results reported by Diemand-Yauman et al. (2011), accompanied by the clever title, “Fortune favors the bold (and the italicized): effects of disfluency on educational outcomes,” have had a big impact, not only among researchers, but also in the public media. First, from a common-sense standpoint, why should making to-be-learned materials harder to read by virtue of a more-difficult-to-read typeface have any benefits? Second, from a research standpoint, what kind of productive processing might be triggered by disfluency? And third, could introducing disfluency be an important, if surprising, way to increase comprehension? That is, if such a manipulation were able to improve metacognitive accuracy and/or benefit learning, then it would be a valuable—not to mention a fairly simple—intervention.

As the editors of this special issue point out, in the years since the appearance of Diemand-Yauman et al.’s intriguing findings there have been many follow-up studies using a variety of methods and materials, a number of which have failed to replicate Diemand-Yauman et al.’s dramatic findings. In that context, a more thorough investigation of the potential benefits of perceptual disfluency and its possible boundary conditions is clearly desirable, and this special issue constitutes just such an investigation. The six impressive studies in this special issue test a variety of boundary conditions for the effect of disfluency on metacognitive judgments, reading time, recall, and comprehension. They examine, collectively, various factors that may moderate or mediate the effect of disfluency. They employ a variety of materials, including word lists, passages, and problem-solving tasks; they examine a range of degrees of disfluency; and they include a broad spectrum of other potentially moderating or mediating variables—even different media of presentation, such as paper versus a computer screen. We first provide brief summaries of the papers in this issue and then conclude with some broader comments on disfluency and when difficulties are and are not desirable.

The articles in this special issue

Eitel and Kuhl (this issue) focus on the suspected mechanism by which perceptual disfluency is proposed to provide its advantage: by causing slower, deeper, more effortful processing. In order to compare disfluency with another condition that also appears to lead to such processing, the authors introduce the variable of test expectancy. Participants read a passage in either fluent or disfluent font accompanied by static images; half the participants were told to expect a test later, and half were not. Eitel and Kuhl found that while high test expectancy led to improved scores on both retention and transfer tests, the disfluency manipulation had no effect. They conclude that disfluency does not automatically elicit effortful processing, but also suggest the complexity of the learning material as a potential moderating factor. Related to complexity, it is also possible that the fluent image allowed participants to overcome any disadvantage due to the fluency or disfluency of the text, suggesting yet another possible moderating factor.

The importance of testing the effects of disfluency in the presence of other variables is key to its usefulness as an educational intervention. That is, in order for introducing disfluency to be a worthwhile intervention, the effects of disfluency must be consistent and present even when other factors that normally exist in a real-world setting are also present. Test expectancy is a logical variable to test, as are other cues that typically influence metacognitive judgments and might have the potential to overshadow disfluency, such as those explored by Magreehan, Serra, Schwartz, and Narciss (this issue)—namely, different levels of fluency and item relatedness, specifically in regard to their effect on judgments of learning (JOLs) for word pairs. When item relatedness was an available cue, disfluency did not affect JOLs; also, disfluency only affected JOLs when it was manipulated within subjects. In other words, disfluency only affected metacognitive judgments when it was the only factor by which learners could determine a difference between items. In the presence of other (potentially more useful) cues, disfluency had no effect on metacognitive judgments.

The findings presented in Eitel and Kuhl (this issue) and Magreehan et al. (this issue) suggest that the possible effects of disfluency on both metacognitive judgments and learning are easily overpowered by the presence of other cues. Both focus on the explanation that disfluent text may cause readers to slow down and process it differently from fluent text. A third study, by Rummer, Schweppe, and Schwede (this issue), addresses a different possible mechanism of the effects of disfluency: distinctiveness. They presented participants with lists of aliens and their characteristics, similar to the stimuli used in Diemand-Yauman et al. (2011). Four of the five lists were presented in a consistent format (either fluent or disfluent) and the fifth list was in the opposite format. For some participants, then, the fluent list was distinctive, but for others the disfluent list was distinctive, which is an attractive feature of the authors’ experimental design. Across three experiments (one on the computer and two on paper), they found no effect of disfluency and, surprisingly, no effect of distinctiveness. In light of their results, the authors add their voice to the call for a fuller exploration of potential moderators of the disfluency effect, including the complexity of to-be-learned information and difficulty of the required task.

Applying a slightly different methodological approach, Strukelj, Scheiter, Nystrom, and Holmqvist (this issue) employed eye tracking to measure how learners interacted with fluent versus degraded text while reading expository text. They found that while fluency had no impact on recall, learners spent more time on later sentences of degraded text than on earlier sentences. They suggest that their eye-tracking results may indicate that learners require time to adapt to disfluency and that disfluent text may engage more complex processes than previously thought. Furthermore, they point out a critical issue in the literature on this topic—that what constitutes “hard to read” text is not clearly defined and, as such, difficult to replicate and measure precisely.

One particular difference across the literature regarding how disfluency is defined involves the medium of presentation. Sidi, Ophir, and Ackerman (this issue) had participants complete a problem-solving task either on the computer or on paper with the materials presented in either a disfluent or a fluent font, and they measured both success rate (Experiments 1 and 2) and confidence (Experiment 2). The results of Experiment 1 did not reveal any effects of disfluency or medium, but in Experiment 2, they observed an interaction between medium and fluency: Success rates were higher on the computer for disfluent font than for fluent font, but higher on paper for fluent font than for disfluent font. They also found that confidence did not differ between fluency conditions on the computer, but participants had higher confidence for fluent, paper-based material than for disfluent, paper-based material. Beyond providing a more nuanced view of disfluency’s effect on metacognitive judgments, this set of experiments illustrates that even very slight differences in the presentation of disfluent text may greatly impact (or eliminate) previously observed effects on memory.

Finally, in their contribution to this special issue, Lehmann, Goussious, and Seufert (this issue) explore the potential moderating factor of working memory capacity. Participants read a text passage in either fluent or disfluent font and were measured on their working memory capacity, as well as their retention, comprehension, and transfer of the material in the passage. Participants did benefit from disfluency on the retention and comprehension tests, but only if they had high working memory capacity. Such results indicate yet another potential moderating factor for the benefits of disfluency—it may be helpful, but only if learners have sufficient cognitive resources to devote to enhanced processing of the disfluent text.

Why is disfluency a difficulty, but not a desirable difficulty?

The conclusion from the impressive body of research reported in this special issue seems inescapable: The remarkable and intriguing benefits of disfluency reported by Diemand-Yauman et al. do not replicate, at least not in any general way. That conclusion is sad, in a way, given the interesting theoretical issues and practical applications that are posed by positive effects of disfluency, should they be reliable. It is tempting to say that this is one (rare) case in which common sense is accurate with respect to conditions of learning. Viewed more broadly, however, that conclusion gives common sense too much credit. Importantly, in our opinion, and something that is not commented on in the present papers, the lack of positive effects may be obscuring another potentially important finding, namely, the lack of negative effects. We have not carried out the necessary meta-analysis, but our impression is that the common-sense hypothesis that disfluency has negative effects can be soundly rejected.

So what does the lack of negative effects, as well as of positive effects, actually mean? Perhaps the simplest conclusion is that what matters with respect to comprehension of, storage of, and subsequent access to text materials happens after the perceptual encoding of what is on the paper or screen. The research by Rhodes and Castel (2008) and others on the effects of font size—in which they find that increasing font size results in higher predictions of later recall, but has no effects on actual recall—leads to a similar conclusion. At the limit, of course, to the extent that text becomes literally unreadable, common sense will prove accurate.

Such a conclusion leads to another, more basic question: What constitutes disfluency? The label of “disfluent” has been applied to font that is small (Rhodes and Castel 2008), blurred (Yue et al. 2013), poorly copied (Diemand-Yauman et al. 2011), degraded (Strukelj et al. this issue), or, in many cases, a range of different fonts. Ideally, a theory of disfluency would apply to all these variations of hard-to-read text, but that does not seem to be the case; perhaps, rather than being too eager to generalize previous findings, we must invest some time in determining what characteristics consistently create a disfluent stimulus.

Finally, if disfluency is a difficulty, why is it not a “desirable difficulty” (Bjork 1994)? The answer to that question, we think, has to do with what makes any difficulty desirable. As we have emphasized elsewhere (e.g., Bjork and Bjork 2011; Bjork 2011, 2013; Yue et al. 2013), the word “desirable” is important. There are plenty of difficulties that are never desirable, and even the conditions of learning that introduce difficulties and have been shown to enhance long-term retention and transfer, such as generation, variation, spacing, contextual interference, and using tests, rather than presentations, as learning events, can be undesirable in specific instances. The difficulties and challenges introduced by such conditions can trigger the very encoding and retrieval processes that support learning, comprehension, and remembering, but whether that happens can depend on the particular learner. As emphasized by Bjork and Bjork (2011), difficulties become undesirable if the learner “does not have the background knowledge or skills to respond to them successfully.”

Whether a certain difficulty is desirable can also depend on characteristics of the to-be-learned materials. As emphasized by McDaniel, Einstein, and their collaborators (see McDaniel and Einstein 2005, for a review) in their “material-appropriate processing” framework, how a certain difficulty impacts comprehension and later memory performance can depend on whether that difficulty exercises beneficial processes that are not already supported by characteristics of the material to be learned. Thus, for example, participants having to re-arrange sentences into their proper order can enhance performance when the material is expository text (versus a control condition in which the sentences are presented in normal order), but not when the material already has a familiar and readily understood organization (such as a fairy tale), whereas having to read text that has letters deleted from the to-be-read words (versus a control condition in which the words are intact) shows the opposite pattern.

Viewing disfluency from the perspective of transfer leads us to conclude with a speculation, namely, that to the degree that responding to disfluency exercises useful processes, any positive effects of those processes are most likely to show up on a later perceptual task, such as reading a new and hard-to-read typeface. That is, although responding to disfluent text does not appear to exercise processes that are “transfer-appropriate” (Morris et al. 1977) when the later task requires remembering the content of the text, perhaps responding to disfluent text does exercise lower-level processes that are transfer-appropriate when the later task involves something like reading a new and truly hard-to-read typeface.

Having said that, though, perhaps there are special instances in which disfluency can exercise higher-order processes, analogous to the benefits of letter deletion in the research by McDaniel, Einstein, and colleagues (e.g., Einstein et al. 1990; McDaniel et al. 1986). Having letters deleted certainly leads to disfluent reading (and does increase reading time), but it can benefit later recall under the conditions mentioned above. There is evidence (Bjork and Storm 2011) that one key to the benefits of completing words that have letters missing is participants drawing on relationships between a to-be-generated word and the surrounding text. That appears not to happen in the case of disfluency introduced by a hard-to-read typeface, but perhaps it could be made to happen. To be more specific, it would seem worth examining whether there might be benefits of disfluency when key words are presented in a form that makes them hard to read in isolation, but readable, if with a little effort, given the semantic context created by the surrounding (clear) words.

Concluding comment

The articles in this special issue are an important contribution in two respects. First, with respect to the effects of disfluency per se, the multiple experiments that are reported constitute a much-needed effort to explore the desirability (or lack thereof) of disfluency as a particular difficulty. Second, and more broadly, this special issue highlights the need to establish the conditions that separate “desirable difficulties” from simply “difficulties.”