Introduction

Consider these two facts, side-by-side:

  1. A frequently cited longitudinal study reported that 36 % of students show no significant improvement in writing or critical thinking over 4 years of college (Arum and Roksa 2011b). This is the now well-known Academically Adrift argument, a sharp critique of learning outcomes in academia.

  2. Another frequently cited study reported that an identical percentage—36 %—of college-attending survey respondents had knowingly plagiarized material for written assignments (Roig 1997).

I bring up these studies not to promote a breezy correlation-causation fallacy but to illuminate an oddity: We frequently tell students at the start of a term that they only harm their own learning when they commit academic transgressions. Yet I cannot recall witnessing—in print or live—any teacher, scholar, or administrator ever suggesting that weak capstone assessment results might be partly explained by dishonest engagement during the learning process. Data on the two phenomena often orbit each other suggestively, but that pattern seems to have set off no alarm bells.

This article explores indirect evidence of a link between academic dishonesty and poor learning results. The question at its heart is this: Do students who cheat fare poorly on national assessments because by cheating they have failed to learn? More specifically, do students who plagiarize—as a specific type of cheating—fail to hone the sorts of communication and critical thinking skills assessed in the Academically Adrift study?

To some, the hypothesis implied by the above questions will seem commonsensical. However, even though both cheating and poor national assessment performance have been heatedly and separately debated in journals, academic lists, and the opinion pages of the Chronicle of Higher Education, it is difficult to find scholars apart from Chace (2012) discussing the cheating phenomenon and the Adrift phenomenon in the same text. Even that article includes no suggestion that the two phenomena are causally linked. Researchers have already established, of course, that the quest for better grades can lead students to cheat (Bowers 1964; Miller et al. 2007; Stafford 2014). In those discussions, grades are the independent variable; cheating, the dependent one. Lang (2013) comes closest to my premise, arguing that better teaching practices can curb cheating.

This gap in the literature, plus personal encounters with educators and assessors who shrug off the idea of a connection, has convinced me that something like the following argument and review of the associated literature may be necessary. Yes, it’s true that the apparent symmetry topping this discussion ignores margins of error. It’s also true that other studies on the same trends come up with different (though neighboring) numbers. And it’s true that other factors must play a role in poor learning. Nevertheless, anyone considering the two phenomena in tandem should be forgiven for wondering whether a relationship exists between them. It is quite easy, after all, to come up with good, research-supported reasons for thinking that students who copy the work of others may show small gains in writing and critical thinking as a result of such shortcuts. If that is the case, then—endless debates over teaching philosophy aside—perhaps the best way to improve results on assessments is to increase the percentage of students who complete cognitive work honestly. (Although, as we shall see, some teaching approaches appear to improve student integrity, there are ways an institution can address academic integrity without infringing on faculty academic freedom.)

It is considerably more difficult to come up with sensible objections to a linkage between integrity and learning. Nonetheless, I have done my best to anticipate objections below, with special attention to two in particular. The first of those objections is that Academically Adrift’s “adrift” population (2300 students at 24 institutions) may have learned more than the authors uncovered through their effect-size analysis of Collegiate Learning Assessment (CLA) results. The second is that few self-reporting “cheaters” identify themselves as habitually dishonest. (Surely, one might ask, a one-time cheater does not risk losing all intrinsic benefits of a college education?) The nutshell response to those objections, to be elaborated upon below, is that they tend to answer each other.

What follows, then, is a literature review exploring some reasons we might hypothesize a relationship between cheating and lack of learning, followed by some suggestions on steps researchers might take to test such a notion—and that educators might take to improve results.

Why Dishonesty Might Be a Factor in Poor Learning Results

When we tell students that cheating will keep them from learning, we are making an assertion for which indirect evidence exists.

For instance, the same majors that fare poorly in assessment gains related to communication and critical thinking (Arum and Roksa 2011a) also self-report higher rates of overall cheating in surveys. Bowers (1964) reported that business, engineering, and education majors had among the highest rates of dishonesty (66, 58, and 52 %, respectively), compared with 39 % for humanities majors and 47 % for physical science majors. Steedle and Bradley (2012), drawing on CLA data, reported that the same three majors had the worst assessment gains after controlling for initial scores and student standing upon entry to college. The Bowers (1964) figures for cheating by major correlate significantly with the Steedle and Bradley (2012) figures for CLA performance by major despite the small sample size (r = −0.65, p < 0.03, DF = 7; see Table 1), though such a parallel must obviously be interpreted with caution: The data are separated by five decades, and the two studies classify some majors in different ways.

Table 1 Self-reported rates of cheating, compared with adjusted Collegiate Learning Assessment gains by major
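
For readers who want to check or extend this comparison, the following minimal sketch shows how such a by-major correlation can be computed. The cheating percentages are the five Bowers (1964) figures quoted above; the CLA gain values are invented placeholders (the article’s correlation is based on nine major categories, hence DF = 7), so the printed r will not match the −0.65 reported above.

```python
# Sketch: correlating self-reported cheating rates by major with adjusted CLA
# gains by major. The cheating figures come from Bowers (1964) as quoted above;
# the CLA gain values are illustrative placeholders, not Steedle and Bradley's.
from scipy.stats import pearsonr

majors = ["business", "engineering", "education", "physical science", "humanities"]
cheating_rate = [66, 58, 52, 47, 39]        # % admitting dishonesty (Bowers 1964)
cla_gain = [0.10, 0.15, 0.18, 0.30, 0.35]   # adjusted CLA gain, invented for illustration

r, p = pearsonr(cheating_rate, cla_gain)
print(f"r = {r:.2f}, p = {p:.3f}, df = {len(majors) - 2}")
```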

A similar pattern emerges in the literature on motivation in learning environments and in studies on mastery and performance goals. Even though discussions of cheating and learning tend to be segregated from each other, for instance, they draw in similar ways on motivational and goal theories. Performance-oriented environments and an emphasis on extrinsic motivation appear to cultivate cheating (Weiss et al. 1993; Newstead et al. 1996; Murdock and Anderman 2006; Anderman 2007). That is, students who go to college primarily to earn more money appear more likely to cheat than those attending for personal enrichment, and over-emphasis on grades can encourage such behavior. At the same time, it is also now widely accepted that the opposite conditions—learning or mastery goals and intrinsic motivation—promote learning (Dweck 1986; Grant and Dweck 2003; Bain 2004). Taken together, the above studies suggest that the same conditions which promote learning curb dishonesty, while those which promote dishonesty curb learning.

Experience has taught me that some educators in any audience will resist the causal link implied above. For instance, I frequently encounter educators who—often citing (and misreading) plagiarism scholar Rebecca Moore Howard—argue that even students who copy from a text are learning from the act of copying. Some readers may share this view. Still others assume writing and critical thinking are natural gifts that instruction cannot improve and so cheating cannot inhibit: Those who cannot do, fake it. Nevertheless, there are good reasons, beyond common sense, to posit a causal relationship between plagiarism and weak improvement in communication and critical thinking.

Perhaps the most compelling evidence stems from cognitive research into how the mind deals with challenges and how those strategies shape its development. In a nutshell, students may often be cheating as a thought-reduction strategy or cognitive shortcut. Those shortcuts may, in turn, leave them deprived of the reading, writing, and thinking strategies they would have developed had they worked through those mental feats honestly. The subsections that follow explain the research behind this argument.

Why Students Take Cognitive Shortcuts

Although cheating is often characterized as a strategy to get a higher grade, it may more accurately be described as a cognitive shortcut with cognitive consequences. Most people, even intellectuals, find thinking difficult and do whatever they can to avoid it (Willingham 2009), though they may lack the self-insight to realize this is happening (Dunning 2005). What appears to be thinking is often, instead, some form of memory recall. The thinker remembers a solution from a similar situation and applies it (Simon and Chase 1973; Chase and Ericsson 1981; Recht and Leslie 1988; Ericsson and Kintsch 1995; Was and Woltz 2007; Willingham 2009; Guida et al. 2012). As long as a person can solve problems or comprehend information by relying mostly on memory, such activities can even be fun. But when a challenge is difficult enough to require new thinking, frustration, anxiety, and apathy can result (Willingham 2009; Csikszentmihalyi 1990). Wilson et al. (2014) reported that subjects left to think on matters of their choosing in a distraction-free room for up to 15 min at a time found it difficult to concentrate and generally unpleasant. Indeed, some subjects would, given the option, instead pass the time administering to themselves electric shocks that they had earlier indicated they would pay to avoid. Such behavior seems bizarre and alien, but more familiar scenarios exist in which people find ways to avoid the challenge of deliberate thinking.

Consider, for instance, the difficulty of speaking a language: If you have worked with a language long enough that you no longer have to think consciously about word-choice and grammar when you speak it, as with your native language, then communicating with others can be nonthreatening or even enjoyable. However, if the language is relatively new to you, so that you have to think about each word and plan sentences in advance, the effort is often unpleasant enough that unless you are disciplined and determined to improve at the language, you might avoid situations in which it is required. (For a review of literature on language learning anxiety, see Horwitz 2001.) The same goes for reading: Articles and books that are fast, enjoyable, and engaging to those who are already immersed in the conversation and who know the subject can be slow torture for readers who know little about it. Hence, high school students who know a lot about baseball, when tested for comprehension of a passage about the sport, have been shown to outperform students who know little about the sport—even if the baseball-ignorant are rated as better readers in general (Recht and Leslie 1988).

A relevant example is suggested by an analysis of scientific grammar by Halliday (1993). Halliday found that scientific writers often transform propositions and actions into nouns, a process called nominalization. Examples of nominalizing include transforming the idea that cracks may grow at a more rapid rate into the phrase crack growth rate—and, in my previous sentence, the word nominalization itself. Nominalization saves space. It’s efficient. However, nominalization often omits information in ways that leave novices scratching their heads. Does crack growth rate refer to the speed at which the cracks grow or to the rate at which more cracks develop? Nominalization also encourages passive voice: subjects of sentences are often implied rather than explicitly stated. Yet the same expressions are often quite clear to intended audiences who already possess that omitted information. Intended readers “may have rejected all but the ‘right’ interpretation without thinking–but only because we know what it is on about already” (Halliday 1993, p. 68). In short, a passage that one reader skims with ease, another reader will struggle with, a difference due entirely to the background knowledge each can draw upon.

Because students often lack the background knowledge necessary to translate specialized texts, I have learned when teaching writing to expect undergraduates to plagiarize or patchwrite in any sentence containing a statistic. (Patchwriting, sometimes considered a kind of light plagiarism, is the practice of copying original wording and then tweaking it through deletion, addition, and substitution.) Howard et al. (2010) reported similar issues with students patchwriting technical material (p. 188). Even normally honest students find they do not understand statistical concepts well enough to paraphrase statistical claims comfortably. As a result, they tend to rely on the original wording, often wholesale. A student with a solid grasp of statistics or a willingness to find out what the terms mean might be able to paraphrase or translate that information, but students unwilling to do that difficult work (being human, that would be most of them) are left with just three other options: omit, quote, or copy.

What the above phenomenon means for the present discussion is this: Despite our exhortations to students that they should think about what they read and write, such assignments are slow, painful, difficult experiences for students not already plugged into the subject matter they are writing about—much more unpleasant than faculty might imagine (or remember). New thinking inherently involves moments of confusion and doubt, feelings that can be worked through but which many people find unsettling (Wells 2009; Miller 2013). Students may respond to such feelings by cheating as indicated above (Batane 2010), by giving up (Smith et al. 1982), by self-handicapping (Thompson and Hepburn 2003), or by finding “honest” cognitive shortcuts that enable them to feel successful despite an actual lack of progress (Chance et al. 2011). Moreover, people often mistakenly associate ease (or what some researchers call fluency) with learning or knowing, and they interpret cognitive strain as a sign of stupidity or error (Bjork 1994; Dunning 2005; Ackerman and Zalmanov 2012). As a result, cognitive shortcuts like patchwriting, which let students sustain their self-esteem and feel that they accomplished the task easily, may seem particularly alluring.

Such shortcutting behavior has manifested itself in a variety of ways in literature on cheating. Roig (1999) found that students asked to paraphrase two-sentence paragraphs were much more likely to plagiarize when attempting difficult texts (plagiarizing up to 68 % of the time) than when attempting easier texts (plagiarizing up to 19 % of the time). His results reinforce Howard (1992)’s earlier anecdotal finding that college students become more likely to patchwrite as the difficulty of the text increases (p. 239), even though scholars of reading suggest their comprehension would improve if they did paraphrase (Kletzien 2009).

In a study that points to the impacts of cheating on later performance, Chance et al. (2011) reported that students provided with a pre-test and an answer key, so they could check their answers when they were done, had significantly higher scores on those pre-tests than students who took the test with no key available. That specific finding surprised no one (not even the students). However, after the practice test, the students who had had the answer key for the pre-test became inaccurately overconfident about how they would do on later, real tests. It appears from the results as though the students failed to attribute their higher scores to peeking at the answers. Instead, they attributed their pre-test success to ability. For students who received certificates of accomplishment for their high pretest scores, the effects were even more pronounced. In other words, students may take cognitive shortcuts without later remembering they have done so and without imagining that the shortcuts they take will hurt their long-term performance, testifying honestly that they think the shortcut helped them.

The preference for shortcuts emerges even in studies not focused on cheating. Muller et al. (2008) found that students who viewed science videos that deliberately challenged their misconceptions showed significantly greater gains in learning than those who viewed standard exposition of the same principles. Yet students preferred the weaker method, finding it clearer and less confusing (Muller 2011). Muller (2011) has suggested that students failed to associate the effort and dissonance required by the effective video with learning. Instead, they assumed that because they did not feel challenged by the other video, that must mean they already knew the material. Similarly, Diemand-Yauman et al. (2011) found that when fonts are harder to read, such as 60 % grayscale Comic Sans MS, students slow down and understand the text better than they do with conventional fonts. Yet it seems likely that students, if asked, would say they prefer easier-to-read fonts, just as trainees studied by Bjork (1994) pressured trainers they were evaluating to use less cognitively demanding forms of practice. Echoing the above findings, a meta-analysis by Dunlosky et al. (2013) revealed that even though students may prefer to highlight and underline, they learn better when they engage in far more intellectually demanding activities like self-explanation and distributed practice. (Distributed practice refers to lessons and experiences that are spaced out enough that students forget material in the interim and struggle to recall it. In the long run, students who engage in distributed practice exhibit better memories of the material than do students who cram.)

Each of the above studies provides evidence that students may be more inclined to choose an approach that does not require a lot of thinking (or rethinking) than to choose one that does. Hence the advantage to cheating, and particularly to plagiarism: Copying what someone else has written gets one closer to assignment completion much more quickly and without as much cognitive effort.

It is perhaps for this reason that survey responses from self-confessed cheaters at the high school and college level reveal that students frequently cheat not to get higher grades but to save time and effort. Lathrop and Foss (2005) surveyed students in grades 7–12, inviting them to choose a theme to comment on, either “Why I don’t cheat,” “Why I cheat,” or “Why I cheat in some classes but not in others.” More than a third of the respondents—indeed, at 35 %, very close to the number in this article’s title—chose to comment on why they cheat or why they cheat in some instances (pp. 16–17). Although about 18 % of those giving reasons for cheating indicated that pressure to pass the class or earn decent grades encouraged their cheating, a plurality of the students (20.6 %) indicated that “cheating is easier than studying” (p. 16). One 10th-grade girl wrote, “Cheating or copying is just easier than doing the work honestly,” while a 12th-grade boy admitted, with a degree of self-awareness unusual for such comments, “I’m lazy. I’m fully aware that it’s hindering my education and development, but I still do it” (p. 24). Other students who were classified by Lathrop and Foss as being motivated primarily by grades nevertheless made comments consistent with the idea that cheating is often about cognitive shortcuts. “I cheat because I don’t know the information or understand all the material that was taught in class,” wrote one, while another testified, “Because I hate school; because it overwhelms me” (p. 25).

Consequences of Shortcutting: Current Thinking Is Armed by Past Thought

The above pattern has consequences for student progress. Students who plagiarize and take other cognitive shortcuts bypass experiences that might have made them better writers and more effective thinkers. Moreover, although the above section covers a range of shortcutting behaviors, there is particular reason to be concerned about plagiarism. It shortcuts the single most intellectually engaging, thought-provoking, comprehension-building activity that students will regularly encounter: writing. Scholars since Emig (1977) have argued that writing is a powerful engine for learning; this assertion is widely accepted to the point of being a truism. But if writing is such a powerful learning tool, what then must we conclude happens during the not-writing of the plagiarist? A student who peeks at a neighbor’s bubbled multiple-choice answers during an objective test may inflate her score, but her understanding of the material remains largely unaffected by whether she cheats at that moment. The bulk of her learning experience happened earlier: She either studied or didn’t. With plagiarism, however, the act of dishonesty fundamentally changes the impact of the potential learning experience, an experience usually designed not just to assess but also to stimulate learning. The student does not merely steal credit from an unaware, off-screen victim, and does not merely inflate a grade, but also skips past a learning experience that would otherwise have been engaged. The academic sins most like plagiarism in having this quality are the faking of lab data and the fabrication of library research. Both similarly bypass learning activities—and both also lead to a written or composed product.

Students who plagiarize miss out on even more than the opportunity to learn subject material. They miss out on opportunities to develop their reading, writing, and critical thinking skills as well, thanks to a cognitive vicious circle: Thinking depends on memory, while what we remember depends reciprocally on what we have thought. That is, we remember what we think about, and those memories help us think about new things. Having already thought through a problem in chess or Sudoku equips us to deal with similar problems later. Having thought through a problem in writing (such as, “how do I write a grant application?”) helps one cope with later, similar challenges. Arguing that “memory is the residue of thought” (p. 41), Willingham (2009) suggests this is why students remember a teacher’s jokes better than they remember her lessons (p. 42). Many students will think about the joke (and perhaps retell it). However, when it comes to the material, too many will robotically highlight slide printouts or scribble what is on the overhead, in one eye and out the other. If instead they think about what the instructor is saying and make connections between new material and material they have studied elsewhere, and if they wrestle with apparent contradictions in that material, then they emerge able to draw on the memories of those thoughts in new situations and to think about new problems. What they learned yesterday helps them learn tomorrow. The fact that what one has already learned enables one to learn more may be one reason why Arum and Roksa (2011a) found that students with higher GPAs and better academic preparation before college tended to show more improvement than their counterparts between the pre- and post-test exams in their study. To put it in a metaphor, when the educationally rich keep getting richer, it is partly because they have more educational capital on which to collect interest.

For plagiarists, the above cognitive findings suggest that merely copying another person’s material (even with tweaks to the wording) is unlikely to produce any gains in writing or critical thinking because every thought that students avoid having about the material in their papers holds back development of their ability to write or think critically in other contexts. Arum and Roksa (2011a), who argue that poor gains on the CLA stem from the fact that many faculty fail to assign enough writing, are likely correct when they contend that students would improve in critical thinking and composition if they read and wrote more. As Spencer (2010) has persuasively argued, plagiarists are much like airline pilots who have been studied in human factors research, whose skills have been shown to deteriorate the more they rely on flight-deck automation for decision-making. Their classmates struggling through the work honestly, by comparison, do much better over the long haul because they have their recollection of those previous thinking experiences to draw upon as they encounter new scenarios. Evidence for the power of putting ideas into one’s own words emerges in recent research on note-taking. Mueller and Oppenheimer (2014) found that students taking lecture notes by computer tended to transcribe material verbatim, while slower longhand note-takers were forced to select and summarize in their notes in order to keep up. The longhand note-takers learned more as a result. The authors conclude that the lack of processing that occurs during verbatim note-taking is “detrimental to learning” (p. 1159).

In findings more pertinent to growth in writing, a meta-analysis by Hillocks (1986) determined that teaching approaches which obligated students to think in new ways consistently improved student writing more than those in which students were provided with pre-digested dictums about composition or argumentation. Specifically, presentational teaching modes (including lecture, discussion, models, instructions, and instructor feedback) had an overall effect size of 0.02 on student writing, compared with an effect size of 0.44 for “environmental” modes that gave students materials to work with and thought-intensive tasks to accomplish (p. 200). In 2007, I participated in a session in which Hillocks demonstrated the environmental mode to a room full of teachers at the University of California, Riverside, and I remember that even experienced teachers found the activities intellectually taxing. Yet I left the room with a far better understanding of Toulmin warrants than I had entered with. In addition, Hillocks (1986) found that individualized instruction—one-on-one work with tutors—had a weak effect (0.17) (p. 200), perhaps because too many writing tutors save students from having to think through their process.
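
For readers unaccustomed to effect sizes: Hillocks reports them in the standard meta-analytic sense of a standardized mean difference, so an effect of 0.44 indicates that the average student taught in an environmental mode improved by nearly half a standard deviation more than the average comparison student. Assuming the familiar pooled-standard-deviation form, the quantity being compared is

```latex
\Delta = \frac{\bar{X}_{\text{treatment}} - \bar{X}_{\text{control}}}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} = \sqrt{\frac{(n_t - 1)\,s_t^{2} + (n_c - 1)\,s_c^{2}}{n_t + n_c - 2}}.
```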

One particularly compelling body of empirical research demonstrating the educational superiority of putting material into one’s own words comes from Webb (1982, 1985). Most educators have encountered the lore that collaborative learning can lead classes of students to greater gains. However, Webb’s studies revealed that the learning gains from teamwork accrue almost entirely to the students who explain concepts to their peers; those doing the explaining improve considerably more than the students receiving the explanations do (Webb 1982, 1985). Indeed, the beneficial impact of collaborative learning disappears almost entirely when students simply tell each other answers without explanations (Webb 1985), as often happens in the sort of study groups that Arum and Roksa (2011a) identified as not being effective.

Over the course of rephrasing information in their own words—a process that requires difficult thinking about that material—students draw new connections between the material and their previous experiences and come to know it better than they did before. Most teachers have experienced this effect themselves. The observation that we learn our subject better when we teach it than when we originally studied it applies equally well to people not employed as teachers: Students also benefit from Webb’s explanation effect. It seems reasonable to expect the same effect to trigger improvements in writing and thinking when earnest paraphrasers and summarizers explain material to their readers and to themselves. Overquoters, cheaters, patchwriters, and plagiarists are unlikely to glean that benefit.

Challenges to the Adrift and Cheating Linkages

One obvious challenge to the above premise is that the students who admit in surveys to having cheated are not saying that they cheat all the time. Many scholars who study dishonesty ask students whether they have ever cheated, without gauging frequencies. However, those who do ask how often students cheat find that the percentage of students drops off steeply as frequency increases. To cite one example, Pino and Smith (2003) surveyed Georgia Southern students, 36.9 % of whom indicated they had cheated a few times. Yet only around 10 % reported cheating regularly, with fewer than 1 % of respondents reporting 6–10 cheats a semester. Surely a student who cheats just once in 4 years does not surrender all of her accumulated learning?

However, it may be more appropriate to read student responses on this subject as a tip-of-the-iceberg index rather than as a census of dishonest activity. Anonymous student self-reports are plagued with challenges, including underreporting (Bowers 1964; Miller et al. 2008; Howard et al. 2010; Martin et al. 2009; McCabe et al. 2012, pp. 37–39), disagreement with researcher definitions of cheating (Roig and Ballew 1994; Graham et al. 1994), and a tendency to forget cheating and other shortcuts previously taken (Chance et al. 2011; Shu et al. 2011; Shu and Gino 2012; Moore and Gino 2013). Each of these issues will be explored briefly below before we turn to a second objection.

Underreporting is a well-known problem with student surveys. A useful reality check comes from the Citation Project—a long-term study into the citation practices of students, co-directed by Howard. Analyzing 1911 citations across 194 student papers, Citation Project researchers arrived at two critical findings. The first is that students engaged in more prohibited practices than they sometimes admit to in surveys: 52 % of the papers featured patchwriting; 19 % had instances of cited, copied language without quotation marks (Jamieson and Howard 2011a, p. 2). Working from a smaller sample in a report on the project’s pilot study, Howard et al. (2010) found that 78 % of the papers included misleading citations. (That is, their sources did not say what the students claimed they did, and it seems likely the cited information often came from other sources [p. 182].) The study team also found that 94 % of student papers contained information that was not cited at all despite not being common knowledge (p. 182). Second, even when students were citing accurately and using quotation marks properly, there were problems we might expect would lead to limited gains in writing and critical thinking. For instance, more than 46 % of the citations were for the first page of a source, while only 9 % were for pages after the eighth, even for book-length texts (Jamieson and Howard 2011b, p. 4), suggesting that students were seldom reading sources very deeply. Additionally, only 6 % of the citations were for summarized material (Jamieson and Howard 2011a, p. 1), with most of those summaries being brief recaps of works of fiction, rather than digests of scholarship or criticism (p. 2).

Interpreting the data, project researchers concluded students “are not writing from sources; they are writing from sentences selected from sources” (Howard et al. 2010, p. 187, emphasis in original). The Citation Project data paint a picture in which students are patching papers together from sentences accrued by scavenger hunt and then deciding whether to quote or paraphrase those sentences, frequently fumbling both. The percentages of papers composed in this manner far eclipse the percentages of students self-reporting academic dishonesty.

Why are students under-reporting their dishonest behaviors? The usual answer may not be the strongest one, and a discussion of the reasons may be important to attempts to deter cheating. For that reason, I will pursue a brief tangent here. Although some scholars suggest the gap between actual cheating and admitted cheating might be partly explained by ignorance of the rules (e.g., Dee and Jacob 2010; Roig 1997), findings from two studies suggest that students understand faculty expectations better than we often assume. Both studies (Roig and Ballew 1994; Graham et al. 1994) employed a clever methodological step worth emulating in future research. In addition to asking students to gauge the seriousness of a range of offenses, researchers also asked students to predict how faculty would answer the same questions—and then compared those predictions to faculty responses. In both studies, students seemed well aware of how faculty would answer. When they defined an act as not-cheating, it was due more to disagreement with faculty than to misunderstanding of the rules. In Roig and Ballew (1994), student predictions of what faculty would think were uncanny. Students correctly estimated, accurate to the second decimal place, how heavily faculty would weigh the 34 offenses combined. The finding that students may understand faculty standards better than they claim is noteworthy, particularly side-by-side with the discovery by Roig and Caso (2005) that cheating by students correlates significantly (r = 0.38, n = 211, p < 0.0001) with the making of fraudulent excuses (p. 490). In that study, lack of understanding ranked third on the list of most popular faked excuses, after personal illness and family emergency (p. 489).

The question remains, though, as to why students would under-report cheating in an anonymous survey. Several studies have found that students engage in “motivated forgetting” after choosing to break rules (Shu et al. 2011; Shu and Gino 2012), much as students forgot that they had relied on answer keys in the aforementioned Chance et al. (2011) study. Still other research has shown that respondents often rationalize and excuse their own behaviors. Noting that respondents indicated higher rates of cheating for their peers than for themselves, Miller et al. (2008) speculated that students might under-report even in anonymous surveys as a way to protect their egos, possibly by redefining what they have done as acceptable. Anecdotal evidence for precisely this behavior has been described by McCabe et al. (2012). The researchers reported that when they receive student responses to their regular surveys on academic dishonesty, students who mark that they have not committed a particular act often go on in the comments section to “explain that, yes, they actually did engage in that type of behavior, but they checked ‘no’ on the survey because, when they did it, it was not cheating, because…” (p. 39).

Although it may be tempting to assume students often simply misunderstand documentation rules, the above studies lead to a more complicated narrative. On the whole, students seem aware of our expectations, but when faced with a specific, unfamiliar challenge and the difficult thinking required to navigate it, many give up, granting themselves permission to take shortcuts that sometimes break rules. When confronted, the same student who is willing to cheat to escape cognitive pressure may be willing to lie and plead confusion to excuse the act, and may even buy his or her own cover story or forget the shortcut he or she had taken. Certainly true confusion exists. Anyone who has ever given students a graded academic integrity quiz knows this. However, the common faculty instinct for lenience and understanding may be fueled by overestimation of the role authentic confusion plays.

Because students completing surveys may disagree with researchers and faculty over what is dishonest, general questions about cheating behavior may be less likely to collect an accurate response than specific questions are. For instance, Hanson (1990) found that 46 % of her student respondents at University of California, Los Angeles admitted to cheating in general, yet 75 % fessed up to specific infractions during later, more focused questions. Hence, students who admit in surveys to cheating “just once or twice” may be underestimating their reliance on such strategies, not just to researchers, but also to themselves.

Perhaps more significantly, students are more likely to commit offenses if they do not think they are serious (Franklyn-Stokes 1995) and may be less likely to report them in surveys for the same reason (McCabe et al. 2012). The number of offenses that students see as trivial is large, and some scholars think the number of such activities is climbing (McCabe et al. 2001, p. 220). Although nearly 90 % of student respondents to a survey reported in McCabe (2005a) saw the submission of an essay from a “paper mill” as a serious offense, only 57 % thought that “paraphrasing/copying few sentences from Internet source without footnoting it” was a serious infraction (McCabe 2005a, Table 5). Only 36 %—there is that number again—admitted to such a practice (Table 4), which is peculiar, as it suggests some respondents think there is nothing wrong with such a shortcut—but nevertheless refuse to take it. Similarly, Graham et al. (1994) found that only 45.9 % of student respondents saw reusing papers in multiple classes as dishonest, compared with 93.5 % who saw “taking a test for someone else” as dishonest (p. 256), though the same respondents knew faculty were likely to disapprove (p. 257).

For the above reasons, it may be best to view the frequency rates from student self-reports as indirect or fuzzy indicators of a much larger phenomenon, one that might very well have serious impacts on learning even if a single isolated incident in a single student’s career would not. Indeed, the Citation Project studies and the above literature review suggest that the reported plagiarism rate may simply be the brightest headlamp of a long caravan of cognitive shortcutting behaviors, including poorly recalled or acknowledged cheating. Put simply, if roughly a third of students report knowingly plagiarizing at least once, we may want to imagine a cloud of related behaviors surrounding those acknowledged transgressions, with the admitted rate highlighting the end of the spectrum where such practices are more severe.

Learning itself will also fall naturally along a spectrum, with some students learning a lot, others learning some (though less), and still others learning relatively little. This observation brings us to the second objection alluded to at the top of this paper: that “adrift” students may have improved in critical thinking and communication, even though the methodology Arum and Roksa (2011a) used classified their improvement as insignificant. Astin (2011) has pointed out, for instance, that the authors used a methodology that cuts down on false positives (Type I errors) but risks a higher rate of false negatives (Type II errors). The researchers then drew conclusions about the negative results, even though that was where the error rate was likely to be higher given their methodology (Astin 2011). Put another way, students whose learning gains were “insignificant” were described as having not learned, when the more responsible interpretation would be that they did not improve enough for us to trust that the gains were real.
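
The trade-off Astin describes can be made concrete with a quick power calculation for a generic two-sided test. The sketch below is illustrative only; the effect size, sample size, and alpha levels are stand-ins rather than values from the CLA analysis. Its point is simply that the stricter the false-positive threshold, the larger the share of real gains that fail to reach significance.

```python
# Illustrative only: how a stricter alpha trades Type I error for Type II error.
# Parameters are hypothetical, not taken from Arum and Roksa's CLA analysis.
from scipy.stats import norm

def power_two_sided(effect_size, n, alpha):
    """Approximate power of a two-sided z-test to detect a true mean gain of
    `effect_size` standard deviations using n observations."""
    z_crit = norm.ppf(1 - alpha / 2)
    noncentrality = effect_size * (n ** 0.5)
    return 1 - norm.cdf(z_crit - noncentrality)

for alpha in (0.10, 0.05, 0.01):
    pw = power_two_sided(effect_size=0.3, n=30, alpha=alpha)
    print(f"alpha = {alpha:.2f}: power = {pw:.2f}, Type II rate = {1 - pw:.2f}")
```

Lowering alpha protects against calling noise a gain, but it also raises the odds that a student who genuinely improved gets filed under “no significant improvement.”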

The narrative surrounding the Adrift data does indeed need to be reframed: The researchers were only able to establish with any confidence that two-thirds of students improved over 4 years of college. About the other third, we cannot say. However, that is in part because those students did not do much better on the second test than they did on the first one. In trying to identify why they did not improve more, we can work through a suite of factors, all of which probably contribute to varying degrees: the student did not take the assessment seriously; the student learned skills the test does not adequately capture; there were measurement or rater errors; the student had an off day; the student may in fact have not learned much. Most experienced teachers will recognize that the final factor in that list is almost certainly part of the mix for a fair number of cases. Sometimes a student will not learn despite our best intentions. Certainly any assessment’s results will capture some of that effect, even if it is without precision. Cognitive shortcutting, particularly dishonest shortcutting, seems a likely mechanism for that not-learning. Academically Adrift’s results may be giving us a hazy glimpse of both the existence and the nature of that problem beneath the noise of the data, just as the self-reported plagiarism rate is giving us a glimpse of a larger phenomenon.

Adrift’s depressing conclusion about student learning has also been directly challenged (Glenn 2011; Haswell 2012; Redd 2012). However, the challenge that has received the most press has been from the makers of the very assessment tool that Arum and Roksa (2011a) employed. In the wake of the Adrift report, the Council for Aid to Education (2013) issued an analysis explicitly challenging the “adrift” narrative, arguing that students improve nearly three-quarters of a standard deviation in critical thinking and communication skills over four years of college (p. 3). However, largely in the name of speed and expense, the organization decided to avoid Arum and Roksa (2011a)’s longitudinal cohort approach. Instead of sampling freshmen and then sampling the same students again in their second and fourth years, the Council for Aid to Education sampled freshmen and seniors simultaneously, assuming that differences in scores between the two populations constituted learning (Council for Aid to Education 2013, p. 3).

The problem with such a move, as both Arum and Astin argued in interviews with Lederman (2013), is that attrition between year one and year four can be high enough that the increase in average score may reflect survival of the fittest more than it does learning. We have already seen how it is possible for a population to appear to improve while members of that population are invisibly left behind, particularly in the aforementioned studies by Webb (1982, 1985) showing that only those who explain concepts are likely to benefit from group learning. The group appears to improve, but the improvement is not universal.
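
The attrition concern is also easy to demonstrate with a toy simulation. In the sketch below, the true gain between the freshman and senior testing is set to zero, yet selectively losing weaker students before the senior sample produces an apparent cross-sectional “gain.” Every number here is invented for illustration; none comes from the Council for Aid to Education report.

```python
# Toy simulation of cross-sectional bias from attrition.
# True learning is fixed at zero; only selective dropout separates the samples.
import numpy as np

rng = np.random.default_rng(0)
freshmen = rng.normal(loc=1000, scale=150, size=10_000)  # hypothetical CLA-like scores

# Suppose lower-scoring students are likelier to leave before senior year.
drop_prob = np.clip(0.6 - (freshmen - 1000) / 600, 0.05, 0.95)
stayed = rng.random(10_000) > drop_prob
seniors = freshmen[stayed]  # identical scores, so any "gain" is pure selection

print(f"freshman mean: {freshmen.mean():.0f}")
print(f"senior mean after attrition (no learning): {seniors.mean():.0f}")
print(f"apparent cross-sectional gain: {seniors.mean() - freshmen.mean():.0f} points")
```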

Moreover, even to the extent that the Council for Aid to Education (2013) report may identify real growth, that fact would do nothing to refute Arum and Roksa (2011a)’s assertion that the growth is lopsided. Indeed, Arum and Roksa (2011a)’s findings have been evaluated by Igo (2011), checked by Pascarella et al. (2011), and echoed in large part by the Wabash National Study of Liberal Arts Education (Center of Inquiry in the Liberal Arts at Wabash College, n.d.). Later studies have also suggested the Adrift findings may predict successful employment after college (Roksa and Arum 2012), a pattern that suggests some validity to the measures.

Finally, readers more sympathetic to Arum and Roksa (2011a)’s methods might object that the duo conducted a thorough analysis and did not turn up cheating as a factor. However, the results of their regression did not rule out plagiarism as a contributing variable and may have quietly included it. The authors emphasized two factors that seemed to drive results. The first was the number of pages of reading and writing expected by faculty. The second was the number of hours each week spent by students in individual study. The authors expressed dismay over the high percentage of students spending fewer than 5 h studying each week: 37 %, not far off from the percentage identified as not significantly improving (Arum and Roksa 2011a, p. 69). The researchers’ survey data show that many students were avoiding courses with intensive reading and writing requirements (p. 71), a finding that echoes much of the argument above.

At the same time, it is worth noting that the 37 % of respondents who admit they study fewer than 5 h a week also seem to remain enrolled for 4 years (else they would not be part of the longitudinal data). That curious fact invites a question: If so many students are passing their classes with so little studying, how exactly are they managing that trick? One possibility, of course, is lax standards on the part of faculty. But another is that the low-studying population overlaps considerably with the shortcutting population. Bowers (1964) found, for instance, that time spent studying correlated significantly and negatively with self-reported rates of cheating (p. 80). Hanson (1990) similarly found that students studying more than 20 h a week had half the self-reported cheating rate of those studying for 6 or fewer hours per week (p. 142). It seems reasonable to suppose the non-studying population is saving time through cognitive shortcuts, including sometimes cheating.

In a related finding, Astin (1993) identified one common student archetype as the “Status Striver,” a group that in 1985 comprised 33.1 % of the first-year-student population (p. 107). Status Strivers, as described by Astin, share a lot of characteristics with the extrinsically motivated, performance-focused college students often linked to cheating in other studies (e.g., Weiss et al. 1993; Newstead et al. 1996; Murdock and Anderman 2006; Anderman 2007). They tend to major in the same sorts of fields, particularly business and accounting. Astin (1993) noted that Status Striving profiles are negative predictors for Graduate Record Examination Verbal scores (p. 202) and that Status Strivers have notably poor academic performance (p. 125).

In short, when studies like Astin (1993), Roig (1997), and Arum and Roksa (2011a) are viewed side by side, they begin to resemble the Indian parable of the blind men and the elephant. Superficial differences aside, they seem to describe a common size and texture. Establishing that the three beasts are really one elephant is challenging for individual researchers but could be relatively straightforward for some organizations, as the next section explains.

Recommendations

The recommendations below are separated into those aimed at researchers and those aimed at administrators or faculty.

Suggestions for Research

As a general rule, national surveys of students rarely address academic integrity unless the survey-builder specializes in academic dishonesty research. The Higher Education Research Institute at University of California, Los Angeles is unusual in asking students about integrity issues as part of a larger battery of questions, with 39.4 % of students reporting seeing peers cheat at least occasionally (Ruiz et al. 2010, p. 7).

Beyond that survey, however, few broad instruments include cheating as a variable. The National Survey of Student Engagement (2014) includes no questions related to cheating, for instance, though it does ask whether students came to class unprepared. Neither does NSSE ask whether respondents’ institutions have or enforce honor codes, despite the volumes of research establishing that honor code institutions have significantly fewer instances of academic dishonesty (Bowers 1964; Hanson 1990; McCabe and Treviño 1993, 1997; McCabe et al. 2001, 2002, 2012; McCabe 2005b; Dirmeyer and Cartwright 2012).

When it comes to surveys of best practices in education, leaving honor codes out of the questionnaires may represent a significant oversight. Although examples can be found of ineffective honor codes (Vandehey et al. 2007), failed codes seem to be more exception than rule and may be the result of too little attention given to building a student-led culture of integrity over time (Dirmeyer and Cartwright 2012; McCabe and Treviño 1997; McCabe et al. 2001, 2002). Multi-campus studies consistently find that honor codes reduce cheating. Bowers (1964), for instance, found that only 28 % of students at campuses with student-led honor systems reported having cheated, compared with 71 % of students at institutions with faculty-centered approaches (p. 185). Moreover, research has deflated widespread assumptions that honor code institutions have lower cheating rates simply because they tend to be small and selective. McCabe and Pavela (2000), McCabe et al. (2002), and McCabe and Pavela (2004) reported positive impacts even among large, public universities with modified honor codes. (A modified honor code enjoys some features of honor codes, but not all of them. For instance, a modified honor code might not grant students access to unproctored exams.)

Because honor-code institutions have dramatically lower cheating rates than traditional campuses, noting whether an institution has an honor code could be very useful in national studies, tests, and surveys. In particular, had the Council for Aid to Education tracked whether institutions using the CLA had honor codes, Arum and Roksa (2011a, b) might have been able to dig more deeply into what is going on with the population that does not study much. If students at honor-code institutions spend more time studying than students at other universities, even after controlling for selectivity or classification, then we would have strong evidence that the Adrift population is saving time largely through shortcuts like cheating. We would also then have a fairly compelling reason to explore honor codes for a wider range of institutions.

In Summer 2013, I proposed to the Council for Aid to Education that we merge a copy of its existing test data with a list I had been building of institutions with any of several honor-code characteristics. Regression analyses could then explore relationships among honor-code characteristics and student gains. The Council denied the request, asserting that a relationship between cheating and assessment results would not be meaningful (a response that in part prompted this article). Anyone wishing to analyze honor code impacts in a similar fashion should consider breaking up the variable. Simply noting that a campus has an “honor code” may not be informative enough because honor code features vary considerably and some of those characteristics may be more important than others. Four honor-code characteristics stand out in the literature (see Melendez 1985), and each should probably be its own dummy variable (a minimal coding sketch follows the list):

  1. Are students obligated to participate in a pledging ceremony (i.e., to sign a document in the presence of peers, pledging to follow the honor code)?

  2. Does the code require students to report observed violations or else be in violation of the code themselves?

  3. Do students have the right to take unproctored exams?

  4. Do students have judicial and/or legislative responsibilities with regard to student integrity? That is, do students oversee revisions of the honor code policies or do they hear and pass judgment on cheating cases?
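
As a sketch of what “breaking up the variable” might look like in practice, the snippet below codes the four characteristics as separate dummy variables and fits a simple regression of assessment gains on them, with a selectivity control. Every column name and value is hypothetical; the point is the structure of the model, not any real dataset or result.

```python
# Hypothetical sketch: coding honor-code features as separate dummy variables
# rather than a single has_honor_code flag. All data and column names are invented.
import pandas as pd
import statsmodels.formula.api as smf

institutions = pd.DataFrame({
    "cla_gain":    [0.42, 0.55, 0.31, 0.48, 0.62, 0.29, 0.50, 0.44],  # adjusted gain (illustrative)
    "pledge":      [1, 1, 0, 0, 1, 0, 1, 0],  # 1 = students sign a pledge in front of peers
    "duty_report": [1, 0, 0, 0, 1, 0, 0, 1],  # 1 = students must report observed violations
    "unproctored": [1, 1, 0, 0, 0, 0, 1, 0],  # 1 = unproctored exams permitted
    "student_jud": [1, 1, 0, 1, 1, 0, 0, 1],  # 1 = student judicial/legislative role
    "selectivity": [0.45, 0.30, 0.60, 0.55, 0.25, 0.70, 0.40, 0.50],  # e.g., admit rate control
})

model = smf.ols(
    "cla_gain ~ pledge + duty_report + unproctored + student_jud + selectivity",
    data=institutions,
).fit()
print(model.params)
```

A real analysis would of course need far more institutions than the eight invented rows here, along with the classification controls discussed above.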

Even a single-campus study with no surveys about student cheating practices could empower researchers interested in meta-analysis by reporting what honor code characteristics exist at that institution or in that program. (Some programs have honor code policies separate from those of parent institutions.) Single-campus studies often include such details as the size of the student body, geographic region, public or private classification, and indicators of selectivity, judging all of these variables to be important for generalization to the larger population beyond their campuses. Honor-code status should be another such reported variable.

The honor code variable may be more helpful to meta-analyses, in fact, than the rate of cheating reported to the institution’s conduct office would be. Although it is often easy to obtain counts for the number of cases of cheating reported by faculty at an institution, those numbers are likely poor indicators of how often its students cheat. An institution with a high reported cheating rate may simply be more diligent than one with a low rate, with faculty more likely to report offenses. At the same time, researchers at institutions which survey students on academic integrity might report those data as part of the institutional profile, since self-reported cheating rates appear to be more valid across large populations, despite some of the issues noted above.

Deterring Dishonesty

The available literature offers compelling evidence that honor codes have a desirable impact on student behavior. However, readers who imagine that the reality must be more complicated than that would be justified in thinking so. Initiating a successful honor code policy appears to be much more involved than simply coming up with something called a “code” and then having students sign it.

Some authors who are experienced with honor codes have contended that the campus culture has to be right before honor codes will work (Dirmeyer and Cartwright 2012). Honor codes appear to work by establishing a culture of integrity, relying on peer pressure to keep cheating rates low (Bowers 1964; McCabe et al. 1999). Bowers (1964) found that students who did not themselves object to cheating were much less likely to cheat if they believed their peers would disapprove. The reported likelihood of doing so dropped from 83 % if they expected peer support to only 49 % if they expected disapproval (p. 149). From a qualitative analysis of student comments on a survey, McCabe et al. (1999) concluded that even though honor-code and non-honor-code students feel many of the same pressures, for honor-code students, the sense of membership in a community with moral expectations tends to trump temptation (p. 231).

A comprehensive literature review on the social pressures affecting dishonest behavior (Moore and Gino 2013) has similarly suggested that a complex web of social factors—including perceptions of how common cheating is, perceptions of the likelihood of peer approval, the behavior of people with whom one identifies, and the behavior of those perceived as outsiders—drives many moral decisions. If perceived peer morality is indeed a major factor behind the success of honor code systems, then campuses may experience little success with them unless they can get the student community to buy into the idea. It is perhaps for this reason that a common characteristic of effective honor codes seems to be the inclusion of student-only judicial hearing boards (Hanson 1990, pp. 163–165).

Another ingredient critical to the success of honor codes may be the heightened sense of being observed that results when students are expected to report witnessed violations. After all, the deterrent effect of being observed is powerful enough that simply putting the image of eyes on an honor box appears to increase the likelihood people pay for their snacks (Bateson et al. 2006). Similarly, Covey et al. (1989) found that research subjects were less likely to cheat on a difficult maze test when sitting at a table with peers and a proctor than when in small cubicles with little oversight.

It is perhaps for this reason that software-based deterrence programs like Turnitin seem most effective when faculty review the program’s results for an early paper in the classroom. Batane (2010)’s study, which reviewed papers with students, achieved considerably better results than did Youmans (2011), which did not. Going over results with a class signals to students that the software is no placebo, and it shows that faculty will see the results. In a study of factors that affect deterrence of cheating, Ogilvie and Stewart (2010) concluded that software-based deterrence “is only likely to be effective if it affects students’ perceptions of the certainty of detection. […I]t is not the situational context itself that is important, but rather students’ subjective perceptions of the situation that are vital” (p. 149). (See also Walker 2010, in which students plagiarized less often after seeing Turnitin-based feedback on a first assignment.)

That both Turnitin and honor codes may depend on student perceptions of scrutiny strongly suggests that other deterrence strategies—like having students submit all of their sources or attach documentation of their writing process—may only prove effective when students see for a fact that those additional materials are being examined.

The effectiveness of honor codes underscores my opening suggestion that we might be able to improve student learning without picking apart each other’s teaching philosophies. An institution can affect results by engineering a change in campus culture while faculty retain the academic freedom to teach as they see fit. That institutional change could take the form of an honor code, a modified honor code, or even what McCabe et al. (1999) call a “quasi-honor system,” which would communicate expectations through moral socialization and take offenses more seriously (p. 231). Institutional surveys of academically dishonest behavior could track the program’s impact on honesty while assessments track its impact on learning.

Even if a campus cannot be persuaded to adopt an honor code, alternatives exist for institutions or faculty who want to curb cheating. Although some of the options that follow might involve changes to pedagogy, a wide enough range of strategies has been shown to be effective that faculty ought to be able to find one or more that complement their teaching philosophies instead of forcing dramatic revisions to them.

In particular, approaches related to honor codes seem to scale down well to the individual classroom. Merely signing a class honor pledge, for instance, appears to make a difference by reinforcing students’ memories of the rules (Shu et al. 2011, 2012). Moral reminders in general seem to help, such that having students recall as many of the Ten Commandments as possible prior to taking a test appears to curb cheating on that test (Mazar and Ariely 2006). At the same time, training on how to avoid plagiarism appears to defuse the “rational ignorance” that many students rely on to excuse dishonest choices (Dee and Jacob 2010).

Some studies have shown, however, that attempts to establish a culture of honesty can stumble on nuances. Moore and Gino (2013) concluded in a literature review that positive moral exemplars work best when selected from communities to which subjects belong, while bad examples work best if selected from outside groups. An individual’s identification with or empathy toward the person used as an example tends to govern how he or she responds to it. Bryan et al. (2013) found that students veer more toward honesty when told not to be “a cheater” than when asked not to cheat. They are more inclined to avoid the personal label than to avoid the action. Hulsart and McCarthy (2011) found that faculty role-modeling, consistency, and clarity helped to improve honesty, but that overly specific definitions of terms like cheating or plagiarism may backfire. Any attempt to build a classroom culture of integrity, therefore, should be careful about how dishonest acts are framed, illustrated, and defined.

Another kind of change to class culture might bear fruit. Lang (2013) has argued that teachers can alleviate some of the pressure to cheat by cultivating a climate of mastery goals and intrinsic motivation—a climate that (not coincidentally, I’d argue) also seems to improve learning. Faculty who are uninterested in transforming their class culture may, meanwhile, find that the relationship I have posited cuts both ways. That is, improving honesty may improve learning, but improving skills may improve honesty by bringing a wider range of challenges within the grasp of students. Based on that logic, Bain (2004) and Willingham (2009) may be powerful resources for educators interested in improving instruction and cutting down on dishonesty at the same time. Both are highly readable for non-experts. Moreover, their guidance is often applicable to a wide range of teaching styles and philosophies. Bain (2004) derived lessons for educators from case studies and interviews with outstanding college faculty. Willingham (2009) explained a set of non-controversial cognitive science principles that the author believes educators should understand to teach effectively. To the extent that assessment artifacts increasingly draw on student writing or other compositions, Hillocks (1986)’s meta-analysis of strategies for teaching writing may also be an asset. Although Hillocks’ book was written for audiences who simultaneously grasp statistics and the specialized discourse of language instruction, its discussions of environmental modes, inquiry foci, scales, and criteria offer some useful and non-intuitive tools for educators who want students to write better. All of the methods identified by Hillocks (1986) as being effective increase the likelihood that students must work through and become comfortable with difficult, active thinking.

Conclusion

Adding integrity or honor-code variables to national surveys, assessments, single-campus studies, multi-campus studies, and Common Data Sets could prove illuminating, even if the resulting data falsify the link hypothesized above. If dishonest students are learning as much as (or more than) honest ones, that would be a discovery worth investigating. At the same time, attempts to curb shortcutting by students may improve performance on learning outcomes while enriching available data. For as classes and institutions transform integrity-related practices, they will necessarily be creating conditions that can be compared against one another, either across time (longitudinally) or across space (geographically).

The integrity variable matters. If, as this article argues, dishonest cognitive shortcuts are hindering student improvement, then many of the recurring, heated debates over grading standards or how much writing to assign may be chasing red herrings. Instead of debating details of instruction, we might better help students learn by adopting honor codes or strengthening academic integrity policies—by ensuring that, when we assign brain work, students actually do it.