After being told that they would either teach or take a test, participants read a passage and then took a free recall test. This test allowed us to quantify the degree to which participants’ responses were coherently organized. Additionally, we measured participants’ focus on key ideas in the passage by using a passage in which idea units had previously been categorized as main points, important details, or unimportant details (Rawson & Kintsch, 2005). After the free recall test, participants took a short-answer test consisting of questions about important and unimportant details.
Method
Participants and design
A total of 56 undergraduate students (34 females and 22 males; average age = 20.8 years, SD = 3.27) from the University of California, Los Angeles, served as participants for course credit. Participants were asked two questions at the start of the experiment to assess their prior knowledge of the to-be-learned materials. None reported having seen the movie The Charge of the Light Brigade. Although 2 participants (both in the test-expectancy condition) reported having knowledge of the Crimean War, their performance did not differ from that of their peers; thus, their data were not excluded from our analyses.
We employed a 2 × 3 mixed design with expectancy instructions (teaching-expectancy vs. test-expectancy) manipulated between subjects and information type (main points, important details, unimportant details) manipulated within subjects. Multiple measures of performance—proportion correct recall, organization, efficiency, and retention of different types of information—were derived from the free recall and short-answer tests we administered to all participants.
Materials and measures
We employed a 1,541-word passage (acquired from Rawson & Kintsch, 2005) comparing the depiction of the Crimean War in the movie The Charge of the Light Brigade (Curtiz, 1936) with the actual historical events of that war as a demonstration of the tendency of movies to portray historical events inaccurately. We chose this passage because Rawson and Kintsch had identified 125 idea units (IUs) in the passage, including a subset of 39 that they identified as main points (8 IUs), important details (15 IUs), and unimportant details (16 IUs; for additional details, see Rawson & Kintsch, 2005, p. 74). We used the full set of 125 IUs for most analyses, and the subset of 39 IUs to analyze levels of knowledge.
We employed two measures to assess our hypothesis regarding organization of free recall. First, we measured the organization of our participants’ recall in terms of IUs by adapting the adjusted ratio of clustering measure (ARC; Roenker, Thompson, & Brown, 1971) to our purposes. The ARC was originally intended for use in the context of learning lists of items belonging to different taxonomic categories (e.g., fruits, trees), where it indicates the extent to which learners cluster their recall output in terms of the categorical memberships of the list items. Broadly speaking, the ARC measures the degree to which a learner’s recall patterns reflect the conceptual organization of the studied material. Typically, the basis for computing ARC measures is a count of the number of repetitions occurring in the recall output, where a repetition is defined as two items belonging to the same category being output sequentially (e.g., orange, banana; for a complete explanation of how to compute ARC scores, see Roenker et al., 1971). With the present materials, we considered each IU to correspond to an “item” (i.e., category member) in the ARC analysis and each paragraph within the original passage to correspond to a “category.” Thus, a repetition occurred when a participant contiguously recalled two IUs that had been presented in the same paragraph of the passage. To illustrate, if a paragraph in the original passage contained six IUs and a participant output those six units together (even if not in the same order as in the original passage), the ARC score for that participant would be higher than the ARC score for a participant who output exactly the same information scattered throughout his or her recall output. Perfect overlap between the structure of the original text passage and the participants’ output produces ARC = 1.0.
Thus, this ARC measure reflects the degree to which participants’ output of IUs corresponds to how those units were originally organized into paragraphs in the source passage.
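The adapted ARC computation can be sketched as follows. This is a minimal illustration rather than the scoring script used in the experiment: it assumes the standard formulation from Roenker et al. (1971), in which ARC = (R − E(R)) / (max R − E(R)), with E(R) = Σn_i²/N − 1 and max R = N − k. The function name and the example recall sequences are hypothetical, and each recalled IU is assumed to appear once in the output.

```python
from collections import Counter


def arc_score(recalled_categories):
    """Adjusted ratio of clustering for one recall protocol.

    recalled_categories: the category label (here, the source paragraph
    number) of each recalled idea unit, listed in output order.
    Returns None when the score is undefined (maximum equals chance).
    """
    n_total = len(recalled_categories)
    if n_total < 2:
        return None
    counts = Counter(recalled_categories)
    # Observed repetitions: adjacent outputs that share a category.
    observed = sum(
        a == b for a, b in zip(recalled_categories, recalled_categories[1:])
    )
    # Chance-expected repetitions: sum(n_i^2) / N - 1.
    expected = sum(n * n for n in counts.values()) / n_total - 1
    # Maximum possible repetitions: N - k (k = categories represented).
    maximum = n_total - len(counts)
    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)


# Hypothetical protocols: six IUs from two paragraphs.
clustered = [1, 1, 1, 2, 2, 2]  # IUs output paragraph by paragraph
scattered = [1, 2, 1, 2, 1, 2]  # same IUs, alternating paragraphs
print(arc_score(clustered))
print(arc_score(scattered))
```

The clustered protocol reaches the maximum score, whereas the scattered protocol falls below the chance level of 0, which mirrors the six-IU illustration given above.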
Latent semantic analysis (LSA; Landauer, Foltz, & Laham, 1998) was the second measure of organization employed. LSA measures the sentence-to-sentence coherence of a passage of text by first comparing the relations among terms within a sentence with a large database of terms and then comparing sequential pairs of sentences from the text (see the Sentence Comparison tool at www.lsa.colorado.edu). Perfect overlap between all sequential pairs of sentences produces LSA = 1.0. The LSA sentence-to-sentence coherence score for the source passage used in the present experiment is .22. LSA sentence-to-sentence coherence scores were computed for each participant’s recall output, and these scores were averaged across participants to obtain the test-expectancy and teaching-expectancy group means. LSA provides a measure of the coherence of recall output that is independent of the structure of the passage participants read.
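The final step of such a coherence score, averaging the similarity of sequential sentence pairs, can be sketched as follows. This is not the implementation behind the Sentence Comparison tool. It assumes that each sentence has already been mapped to a vector in some LSA semantic space and that similarity is the cosine between adjacent vectors; the function names and example vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sentence_to_sentence_coherence(sentence_vectors):
    """Mean cosine between each sentence vector and the one that follows it.

    Returns None when there are fewer than two sentences.
    """
    pairs = list(zip(sentence_vectors, sentence_vectors[1:]))
    if not pairs:
        return None
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical two-dimensional "sentence vectors" for a three-sentence output:
# the first pair is identical in direction, the second pair is orthogonal.
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(sentence_to_sentence_coherence(vectors))
```

Because the score depends only on adjacent sentences in the output, it is insensitive to whether the output follows the paragraph structure of the source passage, which is the sense in which it complements the ARC measure.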
Our short-answer test (also acquired from Rawson & Kintsch, 2005) consisted of 8 questions asking about important details of the passage and 10 asking about unimportant details. One question was presented per page in a Microsoft Word document.
Procedure
Experiment 1 consisted of four phases: study, distractor task, free recall test, and short-answer test.
Study phase
After collecting demographic information, we told participants that they would have 10 min to read a passage. They were told that they could not take notes or highlight or underline items but that they could read the passage at their own pace and return to previously read parts of the passage whenever they wanted. Additionally, participants were told that they would not have access to this passage once the 10-min reading time expired. Finally, before the passage was handed out, participants in the test-expectancy condition were told that they would later be given a test on the material in the passage, whereas participants in the teaching-expectancy condition were told that they would be teaching the material in the passage to another participant, who would then be asked to take a test on the passage.
Distractor task
Following the study phase, participants engaged in a distractor task for 25 min (a separate memory experiment using categorized word lists with no overlap with the present materials).
Test phase
Following the distractor phase, participants in the test-expectancy condition were told—consistent with their expectation—that they were now going to be tested on the previously read passage. Participants in the teaching-expectancy condition were asked to take a test—inconsistent with their expectation—because the student they were supposed to teach had failed to show up.
Free recall test
All participants first received a free recall test for the studied passage. Specifically, they were asked to type as much information from the passage as they could recall into a blank document in Microsoft Word. They were given unlimited time to recall the information in any format (e.g., paragraph form, bullet points) and were told to inform the experimenter when they were done. For each participant, the experimenter recorded the amount of time spent on this test.
Short-answer test
Immediately following the free recall test, the experimenter opened a Microsoft Word document containing the short-answer test. All participants were instructed to type their responses directly into the document, to proceed through the test at their own pace, and to inform the experimenter when they were finished.
Results
Effect sizes for comparisons of means are reported as Cohen’s d calculated using the pooled standard deviation of the groups being compared (Olejnik & Algina, 2000, Box 1, Option B). Effect sizes for analyses of variance (ANOVAs) are reported as partial ω² calculated using the formulae provided by Maxwell and Delaney (2004).
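For readers who wish to recompute the effect sizes for mean comparisons, Cohen’s d with a pooled standard deviation can be sketched as follows. This is a minimal illustration from summary statistics; the function names and the example values are hypothetical and are not taken from the present data.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(
        ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    )


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


# Hypothetical groups: means of 10 and 9, both SD = 2, 20 participants each.
print(cohens_d(10.0, 2.0, 20, 9.0, 2.0, 20))
```

With equal standard deviations the pooled value equals the common standard deviation, so the hypothetical example yields a difference of half a standard deviation.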
Free recall test
Four aspects of performance were assessed via the free recall data: amount of correct output (i.e., proportion correct recall), output efficiency, output organization, and type of information recalled.
Two independent raters, blind to conditions, scored the free recall tests; reliability was high between the two raters (α = .81). Participants were given a full point for recall of an entire idea unit, half a point for partial recall of that idea unit, and zero for no recall. Discrepancies in scoring were resolved by a third rater who was also blind to the conditions.
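The scoring scheme can be sketched as follows. This is an illustration only: the rating labels, the function names, and the assumption that proportion correct equals total points divided by the 125 IUs are ours and are not specified in the scoring description above.

```python
# Points awarded per idea unit (full, partial, or no recall).
POINTS = {"full": 1.0, "partial": 0.5, "none": 0.0}


def proportion_recalled(ratings, total_ius=125):
    """Total points divided by the number of IUs in the passage.

    ratings maps an IU identifier to 'full', 'partial', or 'none';
    IUs missing from the mapping contribute zero points.
    """
    return sum(POINTS[r] for r in ratings.values()) / total_ius


def discrepancies(rater_a, rater_b):
    """IU identifiers on which two raters disagree (for a third rater to resolve)."""
    ids = set(rater_a) | set(rater_b)
    return sorted(
        i for i in ids if rater_a.get(i, "none") != rater_b.get(i, "none")
    )


# Hypothetical ratings of one recall protocol by two raters.
rater_a = {1: "full", 2: "partial"}
rater_b = {1: "full", 2: "none", 3: "partial"}
print(proportion_recalled(rater_a))
print(discrepancies(rater_a, rater_b))
```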
Proportion of IUs correctly recalled
As is indicated in Fig. 1a, participants expecting to teach produced a greater proportion of correct IUs (M = .17, SD = .07) than did participants expecting a test (M = .13, SD = .08), and this difference was significant, t(54) = 2.08, p = .043, d = 0.56, CI of the mean difference [0.001, 0.079].
Efficiency of free recall
Participants initially instructed to prepare to teach spent slightly less time typing their output (M = 15.59 min, SD = 7.15) than did participants instructed to prepare for a test (M = 17.00 min, SD = 10.83), but this difference did not reach significance, t < 1.00. In addition to measuring the total time taken by each participant in recalling the passage, we also measured the efficiency of each participant’s recall by dividing the number of idea units recalled by the total time spent recalling. This measure (# IUs/min), which is plotted in Fig. 1b, revealed that the teaching-expectancy group recalled information more efficiently (M = 1.50, SD = 0.64) than did the test-expectancy group (M = 1.02, SD = 0.57), t(54) = 2.94, p = .005, d = 0.79, CI of the mean difference [0.151, 0.802].
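The efficiency measure and the accompanying comparison can be sketched as follows. The function names are hypothetical, and group sizes are not reported in this section: the example assumes 28 participants per group, which is consistent with the 54 degrees of freedom but remains an assumption. Because the inputs are the rounded summary statistics given above, the result approximates the reported t rather than reproducing it exactly.

```python
import math


def recall_efficiency(ius_recalled, minutes):
    """Idea units recalled per minute of recall time for one participant."""
    return ius_recalled / minutes


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic (pooled variance) and its df."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, n1 + n2 - 2


# Hypothetical participant: 21 IUs recalled in 14 min.
print(recall_efficiency(21, 14))

# Reported efficiency means and SDs; n = 28 per group is an assumption.
t_value, df = pooled_t(1.50, 0.64, 28, 1.02, 0.57, 28)
print(round(t_value, 2), df)
```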
Organization of free recall
As indicated in Fig. 1c, expecting to teach enhanced the organization of participants’ free recall. An independent samples t-test indicated a higher ARC score for the teaching-expectancy group (M = .81, SD = .18) than for the test-expectancy group (M = .67, SD = .25), t(54) = 2.42, p = .019, d = 0.65, CI of the mean difference [0.024, 0.255], suggesting that expecting to teach led participants to organize their encoding and/or their recall in a way that reflected the structure of the source passage.
In contrast, the two expectancy conditions did not differ in their mean sentence-to-sentence coherence scores. An independent samples t-test confirmed that the LSA scores for the teaching-expectancy group (M = .24, SD = .09) and the test-expectancy group (M = .22, SD = .09) did not differ from one another, t(54) = 0.89, p = .380, d = 0.24, CI of the mean difference [−0.025, 0.064], suggesting that the sentence-to-sentence coherence of recall output was similar for the two groups.
Information type
Table 1 depicts the relationship between expectancy instructions and memory for IUs of different types of information. These means are based on only the 39 idea units identified as main points (8 IUs), important details (15 IUs), and unimportant details (16 IUs). The results based on these 39 IUs appear consistent with the three analyses already reported, which utilized the full set of 125 IUs, in that expecting to teach enhanced recall of each type of information.
Table 1 Mean proportions of idea units correctly recalled on the free recall test, by type of information and instruction condition, in Experiment 1
We analyzed these data using a 2 (expectancy instruction: teaching-expectancy vs. test-expectancy) × 3 (information type: main point vs. important detail vs. unimportant detail) mixed-design ANOVA. Consistent with the previous analyses on the full set of 125 IUs, the group expecting to teach recalled numerically more idea units (M = .22, SD = .13) than the group expecting a test (M = .17, SD = .14), although this main effect fell just short of conventional significance, F(1, 54) = 3.81, MSE = .035, p = .056, partial ω² = .05. There was also a main effect of information type, F(1, 54) = 29.2, MSE = .012, p < .001, partial ω² = .17. Post hoc tests revealed that main points (M = .28, SD = .17) were better recalled than were important details (M = .21, SD = .13), t(55) = 3.12, p = .003, d = 0.47, CI of the mean difference [0.026, 0.116], which, in turn, were better recalled than unimportant details (M = .12, SD = .13), t(55) = 4.31, p < .001, d = 0.69, CI of the mean difference [0.047, 0.130]. Post hoc tests also revealed a teaching-expectancy advantage over test-expectancy for recall of main points, t(54) = 2.16, p = .035, d = 0.56, CI of the mean difference [0.007, 0.181], but not for important details, t(54) = 1.198, p = .236, d = 0.30, CI of the mean difference [−0.028, 0.111], or unimportant details, t(54) = 0.982, p = .330, d = 0.23, CI of the mean difference [−0.035, 0.109]. Importantly, however, despite the numerical suggestion of an interaction, the interaction between expectancy instruction and information type was not significant, F(1, 54) = 1.22, MSE = .012, p = .299, partial ω² = .09, a point to which we return in the Discussion section for Experiment 1.
Short-answer test
Average correct recall performance on the short-answer test for the 8 questions about important details and the 10 questions about unimportant details is shown in Fig. 1d. The apparent superior performance for participants in the teaching-expectancy group was confirmed by the results of a 2 (expectancy instruction: teaching-expectancy vs. test-expectancy) × 2 (information type: important details vs. unimportant details) mixed-design ANOVA, which revealed a significant main effect of expectancy instruction, F(1, 54) = 5.04, MSE = .104, p = .03, partial ω² = .07, indicating better overall performance for the teaching-expectancy group than for the test-expectancy group. No effect of information type emerged, however, F(1, 54) = 0.06, MSE = .023, p = .81, partial ω² = .00. Additionally, and consistent with the pattern observed in the free recall data, a significant interaction between expectancy instruction and information type was not obtained, F(1, 54) = 0.88, MSE = .023, p = .35, partial ω² = .00.
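The ANOVA effect sizes follow formulae from Maxwell and Delaney (2004) that are not reproduced in this article. As a rough check on the between-subjects effect of expectancy instruction, one commonly used expression is df_effect(F − 1) / [df_effect(F − 1) + N], with negative estimates reported as zero. Whether this is the exact expression used for the values above is an assumption, and it is not intended for the within-subjects effects. The function name is hypothetical.

```python
def partial_omega_squared(f_value, df_effect, n_total):
    """Partial omega squared for a between-subjects effect.

    Uses df_effect * (F - 1) / (df_effect * (F - 1) + N); negative
    estimates (F < 1) are truncated to zero by convention.
    """
    numerator = df_effect * (f_value - 1)
    return max(0.0, numerator / (numerator + n_total))


# Main effect of expectancy instruction on the short-answer test:
# F(1, 54) = 5.04 with N = 56 participants.
print(round(partial_omega_squared(5.04, 1, 56), 2))
```

Under this expression the short-answer main effect rounds to the reported value of .07.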
Discussion
In Experiment 1, multiple measures of participants’ responses converged to support the claim that expecting to teach promotes learning in ways that expecting a test does not. First, expecting to teach enhanced the amount and efficiency of output in free recall. Additionally, expecting to teach enhanced the match of organization of free recall output to the structure of the source passage (ARC scores), although there was not a teaching-expectancy advantage in the sentence-to-sentence coherence measure (LSA scores). Finally, expecting to teach also produced better performance on short-answer questions. These findings suggest that participants processed information differently, and more effectively, when they expected to teach than when they expected to take a test.
Experiment 1 did not provide strong support for the hypothesis that expecting to teach would specifically enhance recall of important information (as indicated by the lack of a significant interaction between expectancy instructions and information type). As can be seen in Table 1, however, the numerical pattern of our results does appear to indicate that the recall advantage of the teaching-expectancy group over the test-expectancy group was greater for main points than for the other two types of information. Specifically, the numerical advantage for expecting to teach diminished across levels of importance of information type (9%, 4%, and 3% for main points, important details, and unimportant details, respectively), and this difference was statistically significant only for main points, which had the largest effect size of the three comparisons. Thus, in Experiment 2, we further explored the effect of expecting to teach on how different types of information are processed and recalled.