Abstract
Interactive imagery, one of the most effective strategies for remembering pairs of words, involves asking participants to form mental images during study. We tested the hypothesis that the visual image is, in fact, responsible for the strategy’s memory benefit. Neither subjectively reported vividness (all experiments) nor objective imagery skill (experiments 1 and 3) could explain the benefit of interactive imagery for cued recall. Aphantasic participants, who self-reported little to no mental imagery, benefited from interactive-imagery instructions as much as controls (experiment 3). Imagery instructions did not improve memory for the constituent order of associations (AB versus BA), even when participants were told how to incorporate order within their images (experiments 1 and 2). Taken together, our results suggest that the visual format of images may not be responsible for the effectiveness of the interactive-imagery instruction; moreover, interactive imagery may not result in qualitatively different associative memories.
Introduction
One of the best-known ways to increase memory for word pairs (e.g., study APPLE-OVEN; when presented APPLE, recall OVEN) is to instruct participants to form a mental image of the two words interacting (Bower, 1970; Bower & Winzenz, 1970; Dunlosky, Hertzog, & Powell-Moman, 2005; Paivio & Yuille, 1969; Paivio & Foth, 1970; Richardson, 1985; 1998). For example, “imagine an APPLE cooked inside an OVEN, in your mind’s eye.” Participants who receive interactive-imagery instructions perform significantly better at cued recall than participants given no strategy instruction (Richardson, 1985; 1998), and achieve \(\sim 20-50\%\) higher cued recall accuracy than participants instructed to use rote repetition (Bower & Winzenz, 1970; Bower, 1970). Bower and Winzenz (1970) and Paivio and Foth (1970) found that interactive-imagery instructions could even outperform comparable verbally mediated instructions (e.g., form a sentence with both words) for concrete word pairs, although Dunlosky et al. (2005) found these instructions were comparable.
At face value, interactive-imagery instructions might cause participants to literally construct rich visual representations, directly improving memory in this way (Yates, 1966). However, this hypothesis is hard to test because visual imagery cannot be directly observed. Here we examine the effect of interactive-imagery instructions with two main approaches. First, we test the visually relevant characteristics of the imagery instruction and the individual-differences characteristics of participants. Second, we ask whether interactive imagery changes the formal nature of the representation; specifically, whether or not constituent order (knowledge that it was APPLE–OVEN, not OVEN–APPLE) is coupled with memory for the pairing itself.
Testing for visual-imagery characteristics of associations formed through interactive imagery
One way to interrogate how visual imagery functions is to exploit individual differences. There is large individual variability in the subjective experience of mental imagery (Marks, 1973; Zeman, Dewar, & Della Sala, 2015; Zeman, Milton, Della Sala, Dewar, Frayling, Gaddum, & Winlove, 2020) and objectively scored imagery/visuospatial tasks (Keogh & Pearson, 2018; Sanchez, 2019; Zeman, Della Sala, Torrens, Gountouna, McGonigle, & Logie, 2010). If the visual image itself is fundamental to the benefit of interactive imagery, one would expect that imagery instructions may benefit individuals with vivid or accurate mental imagery more than those with poor mental imagery. Alternatively, visual imagery may be epiphenomenal (Pylyshyn, 2002), implying that individual differences in mental imagery should not relate to objective memory performance. Our three experiments test the hypothesis that both mental imagery vividness and skill determine how much an individual benefits from interactive-imagery instructions.
There is considerable support for a central role of imagery in association memory. Instructions to use interactive imagery produce higher cued recall than no imagery instructions, and associations involving words higher in imageability are remembered better (Bower, 1970; Bower & Winzenz, 1970; Paivio, Smythe, & Yuille, 1968; Paivio & Yuille, 1969; Paivio, 1969; Paivio & Foth, 1970). Beyond memory for word pairs, ancient texts claim that forming vivid images can improve memory of various kinds (Foer, 2011; Gesualdo, 1592; Yates, 1966). For example, when using the method of loci, a popular technique for ordered lists, skilled memorizers report forming mental images of to-be-remembered items in various locations (e.g., Maguire, Valentine, Wilding, & Kapur, 2003).
Common advice by skilled memorizers is that vivid imagery is important for the efficacy of mnemonic strategies (e.g., Foer 2011; Konrad 2013; Müller, Konrad, Kohn, Muñoz-López, Czisch, Fernández, & Dresler, 2018). To test this idea, Sanchez (2019) measured individual differences in imagery/visuospatial skill with the Cube Comparisons Task (CCT; a mental rotation task), and the Paper Folding Task (PFT; judging the outcome of multiple folds and hole-punches of a paper) (French, Ekstrom, & Price, 1963), and examined the correlation to memory performance. In Sanchez’ (2019) study, aggregate CCT and PFT performance correlated with serial recall performance for participants who were instructed to use the method of loci, but not for participants who were given a control instruction. However, three studies did not find a significant relationship between Vividness of Visual Imagery Questionnaire (VVIQ; Marks, 1973) and successful use of the method of loci (Kliegl, Smith, & Baltes, 1990; Kluger, Oladimeji, Tan, Brown, & Caplan, 2022; McKellar, Marks and Barron, cited as in-preparation by Marks (1972)).
In light of these variable findings, we included the VVIQ (all experiments) and PFT (experiments 1 and 3) to assess subjective quality of imagery and objective imagery ability, respectively. The hypothesis that the construction of a visual image is central to the success of interactive-imagery instructions implies that either or both the VVIQ and PFT should covary with cued recall accuracy. Alternatively, interactive-imagery effects may not depend on vivid or accurate mental images or perhaps do not require any conscious experience of mental imagery at all.
To further test the hypothesis that visual imagery is vital for the benefits of interactive imagery, we tested people with aphantasia, the phenomenon of extremely low or non-existent self-reported ability to form voluntary mental images. Current interest in aphantasia originated with patient MX (Zeman et al., 2010), who, after undergoing coronary angioplasty, reported a complete inability to form mental images. MX exhibited completely intact performance on imagery/visuospatial tasks. However, closer examination of behavior and brain activity suggested MX was applying distinct verbal/symbolic strategies to complete tasks typically thought to require mental imagery. Other studies have examined larger populations of self-reported aphantasics, who rate vividness as significantly low (Zeman et al., 2015), report worse autobiographical memory, and have difficulty recognizing faces (Zeman et al., 2020). Specific to memory, Bainbridge, Pounder, Eardley, and Baker (2021) examined the ability of aphantasics to draw photographs of rooms in a house from memory. Aphantasics did not differ from controls in copying a presented image, indicating no deficit in perceptual ability. Interestingly, aphantasics remembered fewer objects than controls, but for the objects they could remember, they reproduced the spatial arrangement as well as controls. These results indicate that aphantasics have specific deficits in object, but not spatial, memory. If the visual image is the necessary mechanism by which interactive-imagery instructions increase cued recall accuracy, aphantasics should show no such advantage (experiment 3).
Interactive imagery and the formal properties of associations
We could find no formal implementation of imagery in any mathematical model of association memory. However, image-based associations could differ in their qualitative or formal characteristics, which might be meaningful from a mathematical modeling perspective. One hypothesis about the relationship between imagery and the formal characteristics of association-memory emerged while reviewing existing models, as we now elaborate.
Mathematical models make starkly different predictions about memory for the constituent order of associations (AB versus BA) (Kato & Caplan, 2017), a memory task that has only begun to be investigated experimentally. Matrix-based models (Anderson, 1970) and concatenation-based models (Hintzman, 1984; Shiffrin & Steyvers, 1997), which we now refer to as perfect-order models, encode associations with non-commutative operations, and consequently predict that order is remembered perfectly given that the association itself is intact. Convolution-based models (Kelly, Blostein, & Mewhort, 2013; Murdock, 1982; Metcalfe Eich, 1982; Plate, 1995), in contrast, are based on commutative operations that completely discard order (and see Cox & Criss 2017, 2020, and Criss & Shiffrin’s 2005 model, which also disregard order). In these models, which we now refer to as order-absent models, information for order, if present, must be provided by some other term, predicting that the ability to remember the constituent order will be unrelated to remembering the pairing itself. Kato and Caplan (2017) found no evidence for either of these predictions. In their study, word pairs were tested with cued recall, and then, an order recognition task, where participants had to recognize whether a probe was in the correct order (AB), or reversed (BA) (Greene & Tussing, 2001; Kounios, Bachman, Casasanto, Grossman, Smith, & Yang, 2003; Kounios, Smith, Yang, Bachman, & D’Esposito, 2001; Yang, Zhao, Zhu, Mecklinger, Fang, & Han, 2013). Challenging both perfect-order and order-absent models, they found a significant correlation between order recognition and cued recall performance; however, this correlation was significantly smaller than a control correlation (with associative recognition), suggesting associations are not stored with perfect order, nor are they completely order-absent.
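The commutativity contrast between these model classes can be illustrated with a toy binding demonstration (our own sketch using random vectors, not an implementation of any published model):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.standard_normal(64), rng.standard_normal(64)

# Convolution-based binding (order-absent): circular convolution,
# computed here via FFT, is commutative, so the AB and BA traces
# are identical and constituent order is discarded.
conv_ab = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real
conv_ba = np.fft.ifft(np.fft.fft(b) * np.fft.fft(a)).real
assert np.allclose(conv_ab, conv_ba)

# Matrix-based binding (perfect-order): the outer product is
# non-commutative, so the AB and BA traces differ and order is
# preserved whenever the association itself is intact.
outer_ab = np.outer(a, b)
outer_ba = np.outer(b, a)
assert not np.allclose(outer_ab, outer_ba)
```

The asserts make the contrast concrete: whatever else distinguishes specific convolution- and matrix-based models, only the latter family stores the constituent order within the associative trace itself.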
If we take imagery at face value, it seems plausible that a visual image could provide an effective means of incorporating order, such as left-to-right within the image, or top-to-bottom. This might be just the thing that participants are missing in their spontaneously adopted strategies. So, in addition to increasing memory accuracy, interactive-imagery instructions might help participants incorporate order, and render the association non-commutative like in a perfect-order model. This was our first hypothesis. The alternative hypothesis is that imagery is simply a good “hook”, engaging participants better in the task, but otherwise invoking the same associative mechanism as in conditions without imagery instructions. This hypothesis leads to the prediction that the relationship between order and the association itself will be unchanged with interactive-imagery instructions. We tested these two hypotheses in experiments 1 and 2 with order recognition subsequent to cued recall for all studied pairs in one group, and as a control, associative recognition in another group.
Summary of experiments
In all experiments, participants studied lists of eight word-pairs followed by cued recall. First, we obtained a baseline measure of memory with no strategy instructions; then participants were given imagery instructions (all experiments), or a filler instruction (experiment 1). To test the hypothesis that visual images are necessary for the memory benefit due to interactive imagery, and that individual differences in imagery ability/vividness should predict the memory benefit, vividness was assessed with the VVIQ in all experiments, and imagery skill was assessed with the PFT in experiments 1 and 3. Experiment 3 applied a stronger test of the visual-imagery hypothesis by recruiting aphantasics. In experiments 1 and 2, we also tested the hypothesis that imagery could provide a way for participants to incorporate order and generate associations that are more non-commutative (like a matrix model). Cued recall was followed by either order or associative recognition to test the relationship between constituent order and memory for the pair itself. The prediction is that imagery instructions will increase order recognition, and, moreover, its relationship to cued recall. Finally, we also include supplementary materials with additional analyses.
Experiment 1
Methods
Participants
Participants enrolled in introductory psychology courses at the University of Alberta (N = 227) participated for partial course credit. Participants were required to have learned English before the age of six, have normal or corrected-to-normal vision, and be comfortable typing. Participants chose one of 15 testing rooms in order of arrival, blind to condition. One participant was excluded from analyses for not completing the experiment within the allotted 50 min. Procedures in all experiments were approved by a University of Alberta ethical review board.
Groups
There were two main experimental groups. The imagery group (N = 113) received interactive-imagery instructions halfway through the word lists, and the control group (N = 114) received filler instructions halfway through the lists (Fig. 1). Each experimental group was further subdivided into two conditions. Following cued recall, one condition performed order recognition (N = 57 and 56 for imagery and control, respectively), and the other condition performed associative recognition (N = 56 and 58, respectively). For analyses involving only cued recall, these conditions were collapsed within the imagery group and control group. For all analyses involving recognition tasks, these conditions were separated and named, accordingly, control-order recognition, control-associative recognition, imagery-order recognition, and imagery-associative recognition.
Materials
Stimuli were the 478 nouns from the Toronto Word Pool (Friendly, Franklin, Hoffman, & Rubin, 1982), 4–8 letters long, spanning the full range of concreteness, mean (SD) = 5.32 (1.32), with frequency = 62.47 (82.45) per million (Kucera & Francis, 1967). Words were assigned to pairs and lists with the computer’s random number generator. Study pairs, cued recall, and recognition test probes were presented in uppercase, white, Courier bold font.
Procedure
The experiment was run in Python, in conjunction with the Python Experiment-Programming Library (Geller, Schleifer, Sederberg, Jacobs, & Kahana, 2007), for the first cohort of participants. Because software updates made lab computers incompatible with PyEPL, we ran the second cohort in a MATLAB port, written with the PsychToolBox experiment programming extensions (Brainard, 1997; Kleiner, Brainard, & Pelli, 2007; Pelli, 1997), and the CogToolBox Library (Fraundorf et al., 2014). As illustrated in Fig. 1, the session comprised study of word pairs, cued recall, and then order or associative recognition tests, repeated for eight study sets, with five trials of a mathematical distractor task between the study, cued recall, and recognition sets. Given that Kato and Caplan (2017) found that initial cued recall tests affected subsequent recognition tests but did not change the coupling of order with association memory, we tested every pair initially with cued recall (as in experiment 1 of Kato & Caplan, 2017) to maximize the data yield (and see page S1). Interactive-imagery instructions or control filler instructions were administered after the fourth list in a pretest (Lists 1–4)/posttest (Lists 5–8) design, allowing us to check for equal baseline performance (pre-instruction) and to get a closer estimate of the true effect of imagery instructions above baseline. Participants then completed the VVIQ and the PFT. Halfway through data collection, a section was added after the PFT, in which participants were asked to rate how often they used interactive imagery, and then to type a free-form response about their strategy use, reported on page S2.
Practice list
Participants performed one practice list, excluded from analyses, at the beginning of the session, during which they were walked through the tasks.
Study phase
For each list, participants viewed eight pairs in sequence. The two words in a pair were presented side by side, centered on the screen, for 2850 ms, with a 150-ms inter-pair blank.
Distractor
Interleaved between the study, recall, and recognition phases, participants were administered a math distractor task. Participants had to report the sum of three digits, each randomly drawn from two to eight, within 5000 ms. Participants typed their response, which was displayed on the screen; upon pressing ENTER, the color of the response digits changed to gray to show that the response had registered. A 200-ms blank inter-trial interval began after the 5000-ms response interval elapsed.
Cued recall
Each studied pair was tested once with cued recall. Direction of cued recall (forward, APPLE–?, or backward, ?–OVEN) was counterbalanced (Python version: across all lists except the practice; MATLAB version: within each list). The cue word was presented centrally with a centered response line underneath, regardless of the direction of cued recall. The letters appeared on the line as the participant typed, and the word was submitted with the ENTER key. The next cued recall trial started 750 ms later. ENTER was only accepted once more than two letters were typed, to discourage participants from speeding through. In the Python version, if participants did not press ENTER within 15,000 ms, the trial ended, was scored incorrect, and the next cued recall trial was presented. In the MATLAB version, this time-limit was removed.
Recognition
Two probe words were presented side by side centrally, as in the study phase. In order recognition, participants judged if a presented probe was intact (e.g., OVEN APPLE) or reverse (e.g., APPLE OVEN). In associative recognition, participants judged whether a presented probe was intact (e.g., OVEN APPLE) or recombined (e.g., OVEN BUTTON). Key 1 was assigned to intact and key 2 was assigned to reverse or recombined. Other keys were ignored. Recombined probes were only rearranged with other pairs within the current list, and a pair probed with an intact probe was never used to create a recombined probe. Pairs were tested in pseudo-random order. In the Python version, the number of intact and lure (reverse or recombined) probes were counterbalanced over all analyzed lists (excluding practice). In the MATLAB version, trials were counterbalanced over all lists including the practice list.Footnote 1 In the Python version, the trial was aborted after 15,000 ms. Rather than score these timed-out trials as incorrect, they were omitted from analyses (two trials in all, both in control-associative participants). To prevent missing data, the 15,000-ms timeout limit was removed in the MATLAB version. The next recognition trial started after a 750-ms blank screen.
Vividness of visual imagery questionnaire
Participants completed a computerized version of the Vividness of Visual Imagery Questionnaire (Marks, 1973), which asks participants to imagine four scenes. A description of each scene was displayed on the screen, followed by instructions to imagine four items within the scene and to rate vividness on a scale from one (perfectly vivid imagery) to five (no image at all) using the number keys. To indicate that the response had registered, the choice changed to green for 1000 ms, immediately followed by the next item. The VVIQ score was the sum of these ratings, ranging from 16 (perfectly vivid imagery) to 80 (no image formed at all).
Paper Folding Task
Participants completed a computerized version of the PFT (French et al., 1963), consisting of 20 questions increasing in difficulty. Each question was a series of images that depicted a piece of paper being folded successively and then hole-punched. The question was displayed to the left of a central vertical line, and five possible choices were displayed to the right, selected with the keys 1–5. The chosen option was highlighted in green for 1000 ms, immediately followed by the next question. Mean accuracy and response time were analyzed.
Distribution of VVIQ ratings and PFT scores
Distributions of VVIQ ratings and PFT scores aligned with previous studies (Table 1).
Analyses
To assess null effects, we include Bayesian analyses (with uniform priors) run in JASP (JASP Team, 2021). The Bayes factor is a ratio of evidence; by convention, when BF10 > 3, the effect is considered supported, and when BF10 < 0.3, the result is considered more consistent with the null. For ANOVAs, we report BFinclusion, which summarizes across all factorial models and quantifies whether models fit better with a given main effect or interaction included versus excluded. We measured order and associative recognition with \({d^{\prime }}\) = z(hit rate) − z(false alarm rate). Whenever the hit or false alarm rate was 0 or 1, half an observation was added or subtracted to avoid infinities.
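The \({d^{\prime }}\) computation with the half-observation correction can be sketched as follows (a minimal illustration; the function and variable names are ours, not taken from the analysis code):

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Compute d' = z(hit rate) - z(false-alarm rate), nudging
    rates of exactly 0 or 1 by half an observation to avoid
    infinite z-scores."""
    n_old = hits + misses                       # intact-probe trials
    n_new = false_alarms + correct_rejections   # lure-probe trials
    hit_rate = hits / n_old
    fa_rate = false_alarms / n_new
    # Half-observation correction at the boundaries
    if hit_rate == 1.0:
        hit_rate = (hits - 0.5) / n_old
    elif hit_rate == 0.0:
        hit_rate = 0.5 / n_old
    if fa_rate == 1.0:
        fa_rate = (false_alarms - 0.5) / n_new
    elif fa_rate == 0.0:
        fa_rate = 0.5 / n_new
    z = NormalDist().inv_cdf  # inverse standard-normal CDF
    return z(hit_rate) - z(fa_rate)
```

For example, a participant with 10 hits and 0 false alarms out of 10 trials each would, without the correction, have an undefined (infinite) \({d^{\prime }}\); with the correction, the rates become .95 and .05.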
Results and discussion
Cued recall
We replicated the interactive-imagery advantage for cued recall. A mixed ANOVA on cued recall accuracy (Fig. 2), with design Group (imagery, control group) × Instruction phase (pre-instruction, post-instruction), returned significant main effects of Instruction phase, F(1,225) = 110.79, MSE = 2.91, p < .001, \({\eta _{p}^{2}} = 0.33\), BFinclusion > 1000, and Group, F(1,225) = 4.92, MSE = 0.41, p = .03, \({\eta _{p}^{2}} = 0.02\), BFinclusion > 1000; however, the interaction was also significant, F(1,225) = 41.5, MSE = 1.09, p < .001, \({\eta _{p}^{2}} = 0.16\), BFinclusion > 1000. Simple effects found no difference between groups pre-instruction (p = .19, BF10 = 0.33), but significantly higher accuracy for the imagery group post-instruction (p < .001, BF10 > 1000). Additionally, for both groups, accuracy significantly increased post-instruction (both p < .001, BF10 > 33). Thus, perhaps due to practice effects, the control group moderately improved as the experiment progressed; however, the imagery group performed significantly better in the post-instruction phase, and exhibited a greater improvement from baseline compared to the control group.Footnote 2
Associative and order recognition
A mixed ANOVA on associative recognition \(d^{\prime }\) (Fig. 3), with design Group (imagery-associative recognition, control-associative recognition) × Instruction phase (pre-instruction, post-instruction) returned a non-significant main effect of Group (p = .25, BFinclusion = 612.89),Footnote 3 a significant main effect of Instruction phase, F(1,112) = 38.13, MSE = 22.79, p < .001, \({\eta _{p}^{2}} = 0.25\), BFinclusion > 1000, and a significant interaction Group × Instruction phase, F(1,112) = 21.24, MSE = 13.29, p < .001, \({\eta _{p}^{2}} = 0.17\), BFinclusion > 1000. Simple effects revealed a non-significant group difference in performance pre-instruction (p = .14, BF10 = 0.54), but the imagery-associative recognition condition performed significantly better post-instruction (p < .001, BF10 = 31.12). Additionally, the imagery-associative recognition condition improved post-instruction (p < .001, BF10 > 1000), but the control-associative recognition condition did not significantly improve (p = .16, BF10 = 0.37). These analyses indicate that imagery instructions substantially improved associative recognition performance over control instructions.
An ANOVA with the same design, on order recognition \(d^{\prime }\) (Fig. 3), returned non-significant main effects of both factors, with the Bayesian analyses favoring the null (both p > .2, BFinclusion < 0.3). The interaction Group × Instruction phase nearly reached significance, F(1,111) = 3.90, MSE = 1.61, p = .051, \({\eta _{p}^{2}} = 0.03\), although the Bayesian analysis favored the null (BFinclusion = 0.26). Nonetheless, we cautiously followed up on the interaction with simple effects. The control-order recognition group performed significantly worse post-instruction (p = .01, BF10 = 3.07), while the imagery-order recognition group did not exhibit any significant change (p = .65, BF10 = 0.16). Additionally, the group difference in performance was not significant pre-instruction (p = .06, BF10 = 0.98), or post-instruction (p = .80, BF10 = 0.21). In sum, imagery instructions did not improve order recognition performance, but may have acted against the performance decrease observed in the control-order recognition group.
The relationship among mental imagery skill, vividness, and the effectiveness of interactive-imagery instructions
Next, we asked whether any individual-differences measure could explain individual differences in memory performance (Tables S1–S3). Correlations between VVIQ ratings and cued recall accuracy were all non-significant and were supported, or nearly supported, null effects (all p > .09, BF10 < 0.45), and likewise for order recognition (all p > .15, BF10 < 0.46). VVIQ ratings significantly correlated with post-instruction associative recognition performance in the imagery-associative recognition condition, r(54) = −.44, p < .001, BF10 = 44.10, but this correlation was not significant post-instruction for the control-associative recognition group, r(56) = −.04, p = .78, BF10 = 0.17; and these correlations differed significantly (Fisher’s test, p = .024). Thus, individual differences in mental imagery vividness explained differences in associative recognition performance under interactive-imagery conditions,Footnote 4 but could not explain the interactive-imagery advantage for cued recall.
PFT accuracy exhibited significant, positive correlations with nearly all memory tasks, and not only with memory performance in the imagery group (Tables S1–S3). Although the tables show some exceptions, our results, particularly the presence of pre-instruction correlations, suggest that PFT accuracy does not specifically relate to interactive imagery, and may have either reflected a general factor such as motivation, task engagement or a distinct cognitive process such as working memory.
PFT response time was not significantly related to the memory measures apart from a significant positive correlation with post-instruction cued recall accuracy, r(111) = .27, p = .004, BF10 = 7.49, and post-instruction associative recognition performance, r(54) = .32, p = .017, BF10 = 2.74, both in the imagery group. If longer PFT response times indicate worse performance, these correlations would be counter-intuitive. A simpler interpretation is that longer PFT latencies are a consequence of greater general effort or engagement (a successful speed–accuracy trade-off) rather than mental imagery skill. Thus, the pattern argues against the idea that mental imagery accuracy or skill is required for the memory benefit.Footnote 5
The relationship of order recognition to cued recall
Figure S11 plots log-odds transformed cued recall accuracy versus both order recognition and associative recognition \({d^{\prime }}\), for both imagery and control groups. Pre-instruction, the associative recognition–cued recall correlations (imagery: r(56) = .86, p < .001, control: r(56) = .83, p < .001), were larger than the order recognition–cued-recall correlations (imagery: r(55) = .43, p < .001, control: r(54) = .46, p < .001). The difference in correlations was significant for both groups pre-instruction (Fisher’s tests, imagery: p < .001, control: p < .001). This pattern persisted post-instruction; associative recognition-cued recall correlations (imagery: r(54) = .70, p < .001, control: r(56) = .81, p < .001) were also larger than order recognition-cued recall correlations (imagery: r(55) = .31, p = .020, control: r(54) = .37, p = .005; Fisher’s test, imagery: p < .001, control: p = .005). Thus, consistent with Kato and Caplan (2017), order recognition exhibited a smaller correlation to cued recall accuracy than associative recognition.Footnote 6
Importantly, Fisher’s tests between the control- and imagery-group order recognition–cued recall correlations were not significant pre- (p = .85) or post-instruction (p = .70), nor were the tests between the associative recognition–cued recall correlations, pre- (p = .57) or post-instruction (p = .15), suggesting that imagery instructions did not affect the dependence of order or associative recognition on cued recall. This result does not support the hypothesis that imagery instructions help participants incorporate order. Instead, we have evidence for the alternative hypothesis that imagery does not change the formal characteristics of the association.Footnote 7
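The comparison of two independent correlations via Fisher’s test can be sketched as follows (an illustrative implementation using the standard large-sample r-to-z approximation; the function name is ours, not the authors’ analysis code):

```python
import math
from statistics import NormalDist

def fisher_z_test(r1, n1, r2, n2):
    """Two-tailed test of whether two independent correlations
    differ, via Fisher's r-to-z transformation."""
    z1 = math.atanh(r1)  # Fisher z-transform of each correlation
    z2 = math.atanh(r2)
    # Standard error of the difference between transformed values
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z_stat = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return z_stat, p
```

For example, applying this to the pre-instruction imagery-group correlations reported above (r = .86 with N = 58, versus r = .43 with N = 57) yields p < .001, in line with the reported Fisher’s test.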
Summary of experiment 1
Interactive-imagery instructions increased cued recall accuracy and associative recognition \({d^{\prime }}\) above baseline, and relative to the control group. Imagery instructions did not improve order recognition, or change its relationship to cued recall. Neither imagery vividness nor imagery skill predicted the effectiveness of imagery instructions.
Experiment 2
The results of experiment 1 raised an additional question. Although interactive imagery failed to improve order recognition, if participants were given a specific way to incorporate order into their image, could that improve order recognition?
We addressed this question by modifying the interactive-imagery instruction in two ways (see Fig. 1 for instructions). First, physically enacting verbal stimuli (e.g., hit the NAIL) benefits memory (enactment effects; cf. Allen, Waterman, Yang, & Jaroslawska, 2022; Engelkamp, 1991; Engelkamp, 1995; Sivashankar & Fernandes, 2021), even when the action is only imagined (Allen et al., 2022; Yang et al., 2021). We hypothesized that imagining an actor–object relationship might not only exploit this benefit but also incorporate order into the image. Second, whereas the left–right axis is generally symmetric, gravity can break the symmetry; for example, a MOUSE on top of an ELEPHANT conjures a different meaning than the ELEPHANT on the MOUSE. We thus added two imagery instructions, in which images were to comprise actor–object or top–bottom relationships, respectively.
Experiment 2 was pre-registered. All pre-registered analyses are reported. For analyses of the within-subject relationship of order/associative recognition to cued recall of pairs, see page S13.
Methods
Participants
Participants (N = 433) were recruited through Prolific (https://www.prolific.co), and compensated £6.50 for a 50-min session. Participants were required to have English as their first language, be fluent in English, and have a Prolific approval rating above 70%. Our initial pre-registered exclusion criteria included failure to pass two attention checks, and/or exceeding a specified floor or ceiling threshold for recognition performance. Instead, we excluded participants who demonstrated clear evidence of disengagement, rather than excluding participants who may have responded earnestly but performed extremely poorly or well: 13 were excluded because they re-wrote the presented probe in cued recall, suggesting they did not understand the task; three were excluded because they did not respond to any cued recall trial; seven were excluded because they responded to < 10% of recognition trials (Table 2).
Groups
Three main experimental groups (standard-imagery, actor-object, and top-bottom) were each divided into two sub-conditions: i) standard-imagery/associative recognition, ii) standard-imagery/order recognition, iii) actor-object/associative recognition, iv) actor-object/order recognition, v) top-bottom/associative recognition, vi) top-bottom/order recognition. Groups/sub-conditions were assigned with a random number generator function.
Materials and procedures
Materials and procedures were identical to experiment 1, with the following differences: (1) Experiment 2 was conducted online, with recruitment from https://www.prolific.co, hosted on Pavlovia.org. Groups were assigned with a random number generator. (2) The Paper Folding Task was omitted to save session time. (3) After the mid-session strategy instruction, participants were asked “Please explain back to us, in your own words, what we have asked you to do on the previous screen”. Short-answer responses were rated by two coders (KA and JT), blinded to group, to quantify comprehension of instructions (reported on page S9).Footnote 8 (4) After completing the VVIQ, participants rated, on a five-point scale, how frequently they incorporated mental imagery, interactivity, and order during study (page S7). (5) Participants answered a reversed-sense aphantasia question (see experiment 3 methods). Five aphantasic participants are presented as case studies in supplementary materials on page S13. (6) Two engagement checks were included; participants were presented a short message, “NOTE: Remember the number: X”, in the top-right corner of the screen, highlighted in blue against a grey background, once during the mid-session strategy instruction and again immediately after the VVIQ. Participants were asked to recall the number shortly afterward; however, because two participants indicated that their monitor cut off this number from the screen, we applied the different exclusion criteria stated above. (7) Distractor trials were held for a fixed 1000-ms period after the response was entered, regardless of response time. Additionally, there was a 5000-ms maximum time-limit and a blank 200-ms inter-trial interval. (8) Recognition trials were counterbalanced over all trials, including the practice list.
However, there were two programming errors with associative recognition: i) a single recombined trial assigned to a list appeared as an intact trial, because it could not exchange items with another pair; ii) random shuffling of recombined probes sometimes resulted in the original pairing. N = 198 participants had more intact probes than recombined probes, and these participants had an average of nine extra intact trials. However, baseline associative recognition \(d^{\prime }\) was comparable to experiment 1 (Fig. 3), suggesting mean associative recognition performance was not sensitive to this design difference. (9) Recognition trials initially had a 15,000-ms time-limit. For \(d^{\prime }\) calculations, rather than omit timed-out trials from analyses outright, a correction was applied for each one: if an intact trial timed out, 0.5 of an observation was added to hits and to misses; likewise, if a recombined/reversed trial timed out, 0.5 of an observation was added to false alarms and to correct rejections. In this way, timed-out trials pushed the overall \(d^{\prime }\) toward 0, where \({d^{\prime }}=0\) represents no memory, as if the participant were guessing. Thus, with this correction, we assume that when a trial times out, a participant has no knowledge and would have guessed if given the opportunity. A total of 23 trials timed out and were corrected in this manner. To remove the need for this estimation and obtain a response from each participant on each trial, time-limits were removed for recognition trials halfway through data collection.
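The timed-out-trial correction described above can be sketched in a few lines. This is a minimal illustration with hypothetical trial counts (the function name and example numbers are ours, not the paper's analysis code):

```python
from statistics import NormalDist

def dprime_with_timeout_correction(hits, misses, fas, crs,
                                   timeouts_intact=0, timeouts_recombined=0):
    """d' = z(hit rate) - z(false-alarm rate), where each timed-out intact
    trial adds 0.5 of an observation to hits and to misses, and each
    timed-out recombined/reversed trial adds 0.5 to false alarms and to
    correct rejections, pulling both rates toward chance and d' toward 0."""
    hits += 0.5 * timeouts_intact
    misses += 0.5 * timeouts_intact
    fas += 0.5 * timeouts_recombined
    crs += 0.5 * timeouts_recombined
    hit_rate = hits / (hits + misses)
    fa_rate = fas / (fas + crs)
    z = NormalDist().inv_cdf  # inverse standard-normal CDF
    return z(hit_rate) - z(fa_rate)

# Hypothetical participant: 15/20 hits, 5/20 false alarms, no timeouts.
d_uncorrected = dprime_with_timeout_correction(15, 5, 5, 15)
# Same counts, but four timeouts of each trial type shrink d' toward 0.
d_corrected = dprime_with_timeout_correction(15, 5, 5, 15,
                                             timeouts_intact=4,
                                             timeouts_recombined=4)
```

With the example counts above, the correction reduces d′ from roughly 1.35 toward 0 without ever flipping its sign, matching the stated assumption that timed-out trials reflect guessing.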
Distribution of VVIQ ratings
VVIQ rating distributions were comparable to experiment 1 (Table 1).
Results and discussion
Cued recall
A mixed ANOVA on cued recall accuracy (Fig. 2) with design Group (standard-imagery, actor-object, top-bottom) × Instruction phase (pre-instruction, post-instruction) returned a significant main effect of Instruction phase, F(1,430) = 71.13, MSE = 1.64, p < .001, \({\eta _{p}^{2}} = 0.14\), BFinclusion > 1000. The main effect of Group was not significant, F(2,430) = 1.15, MSE = 0.10, p = .32, \({\eta _{p}^{2}} = 0.005\), but had strong evidence in the Bayesian analysis, BFinclusion > 1000.Footnote 3 However, the Group × Instruction phase interaction was significant, F(2,430) = 24.74, MSE = 0.57, p < .001, \({\eta _{p}^{2}} = 0.10\), BFinclusion > 1000. Simple effects returned a supported null effect of Group pre-instruction (p = .19, BF10 = 0.13), but a significant effect post-instruction (p < .001, BF10 = 379.6). Follow-up t tests on the post-instruction Group difference indicated a non-significant, supported null difference between the standard-imagery and actor-object imagery groups, p = .19, BF10 = 0.29. Additionally, cued recall accuracy was significantly lower in the top-bottom imagery group than in the standard-imagery (p < .001, BF10 > 1000) and actor-object (p = .004, BF10 = 7.27) imagery groups. Simple effects also returned a significant effect of Instruction phase for the actor-object and standard-imagery groups (both p < .001, BF10 > 1000), both of which increased in performance post-instruction, but a supported null difference for the top-bottom imagery group (p = .60, BF10 = 0.11). In sum, the actor-object imagery instructions matched the robust benefits of standard interactive-imagery instructions for memory, but top-bottom instructions were ineffective.
Associative and order recognition
Broadly speaking, the results for associative recognition paralleled those for cued recall; standard and actor-object imagery instructions effectively improved performance and top-bottom instructions were ineffective. A mixed ANOVA on associative recognition \({d^{\prime }}\) (Fig. 3), with design Group [3] × Instruction phase [2], returned a significant main effect of Instruction phase, F(1,195) = 21.38, MSE = 15.34, p < .001, \({\eta _{p}^{2}} = 0.10\), BFinclusion > 1000, and a significant Group × Instruction phase interaction, F(2,195) = 7.56, MSE = 5.43, p < .001, \({\eta _{p}^{2}} = 0.07\), BFinclusion = 22.13. Simple effects indicated that associative recognition performance increased post-instruction in both the actor-object group (p = .003, BF10 = 9.65) and the standard-imagery group (p < .001, BF10 > 1000), while the top-bottom group had a supported null difference between instruction phases (p = .86, BF10 = 0.13). Simple effects with the factor Group returned a supported null difference pre-instruction (p = .34, BF10 = 0.16), but a significant difference post-instruction (p = .005, BF10 = 5.82). Follow-up t tests on the post-instruction group difference indicated that the actor-object and standard-imagery groups had a supported null difference (p = .84, BF10 = 0.21), but both groups performed significantly better than the top-bottom group (p = .017, BF10 = 3.75 and p = .003, BF10 = 9.86, respectively).
Results for order recognition diverged from the other tasks. A mixed ANOVA on order recognition \(d^{\prime }\) (Fig. 3), with design Group [3] × Instruction phase [2], returned a significant main effect of Instruction phase, F(1,232) = 12.89, MSE = 6.02, p < .001, \({\eta _{p}^{2}} = 0.053\), BFinclusion = 37.83, indicating that order recognition \({d^{\prime }}\) improved in all three groups post-instruction. A significant improvement in order recognition somewhat diverged from the null effects observed in experiment 1; however, the effect in all three groups was small in magnitude (\({d^{\prime }}\) post-minus-pre ≈ + 0.25, Fig. 3), and post-instruction performance was in the range of values from experiment 1, suggesting the effect on order recognition was small in comparison to associative recognition. Importantly, both the main effect and the interaction involving Group were supported null effects (both p > .32, BFinclusion < 0.3), indicating that emphasizing order in the imagery instructions did not improve order recognition more than standard interactive-imagery instructions.
The relationship between mental imagery vividness and the effectiveness of interactive-imagery instructions
VVIQ ratings had a supported null relationship to cued recall in all three groups and both instruction phases (all p > .15, BF10 < 0.3), replicating and extending findings from experiment 1. A single exception was found in the top-bottom imagery group pre-instruction, r(54) = −.18, p = .03, BF10 = 1.09, although a Bayesian correlation returned inconclusive evidence for this relationship (Tables S4–S6). Correlations between VVIQ ratings and both order and associative recognition were non-significant, supported null effects (all p > .36, BF10 < 0.31). The failure to replicate the correlation between VVIQ and associative recognition in experiment 1 suggests that this finding is not particularly robust, and it will not be discussed further. Thus, vividness ratings in the VVIQ could not explain the advantage of standard-imagery instructions, nor memory performance under any imagery instruction variant.
The relationship of order recognition to cued-recall
Due to low trial counts for recombined trials (see Methods), the associative recognition measures are noisy and should be interpreted with caution. However, collapsing across groups for maximal power (Fig. S15, Table 3), the OR-CR correlation was significantly lower than the AR-CR correlation, both pre- and post-instruction (p = .047 and p = .0034, respectively, Fisher’s tests), replicating experiment 1 and Kato and Caplan (2017). Next, we asked whether, for any instruction, the OR-CR correlation changed from pre- to post-instruction. These comparisons were non-significant for the top-bottom (p = .71, Fisher’s test) and actor-object groups (p = .63), but there was a significant decrease post-instruction for the standard-imagery group (p = .034). This pre- versus post-instruction difference in the standard-imagery group was largely driven by a single outlier (Fig. S15) who performed extremely poorly in cued recall but extremely well in order recognition. When this outlier was removed, the comparison was non-significant (p = .14).
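A Fisher test comparing two correlations of this kind can be sketched as below. This is a generic r-to-z sketch for two independent correlations with illustrative values (the function name and example numbers are ours, not the reported data):

```python
from math import atanh, sqrt
from statistics import NormalDist

def fisher_compare_correlations(r1, n1, r2, n2):
    """Two-sided Fisher r-to-z test for the difference between two
    correlations from independent samples of sizes n1 and n2."""
    z1, z2 = atanh(r1), atanh(r2)          # Fisher z-transform of each r
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of the z difference
    z = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided normal p value
    return z, p

# Illustrative comparison: r = .60 versus r = .20, 100 participants each.
z, p = fisher_compare_correlations(0.6, 100, 0.2, 100)
```

With these illustrative inputs, the test rejects equality of the two correlations; identical correlations give z = 0 and p = 1.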
Summary of experiment 2
Standard interactive-imagery and actor-object imagery instructions boosted cued recall and associative recognition above baseline, and compared to the top-bottom imagery instructions. Surprisingly, both imagery instructions that emphasized order had a negligible effect on order recognition, and did not affect its relationship to cued recall. Replicating experiment 1, imagery vividness did not predict the effectiveness of imagery instructions.
Experiment 3
Experiment 1 suggested that the large benefit of interactive imagery to cued recall has little to do with subjective detail or objective visual imagery skill. In experiment 3, we recruited aphantasics, who self-report an inability to form visual imagery, and non-aphantasics to complete cued recall, the VVIQ, and the PFT as in experiment 1. If the presence of visual images is required for interactive imagery, then aphantasics should show substantially less benefit from imagery instructions than non-aphantasics.
Methods
Participants
Just as in experiment 1, participants (N = 122) were enrolled in an introductory psychology class at the University of Alberta, and recruitment had the same basic restrictions. Participants who had enrolled in experiment 1 were not permitted to participate in this study. Four participants were excluded from analyses because they accessed the online link and completed the experiment twice; both sessions were excluded. One participant was excluded for providing no cued recall or math distractor responses.
Recruitment
Before the experimental session, potential aphantasics and non-aphantasics were identified via online mass-testing questionnaires administered to the University of Alberta introductory psychology students at the beginning of the Fall 2020 (N = 2357) and Winter 2021 (N = 1975) semesters. Along with many other items that were part of different studies, questionnaire participants responded yes/no to “Are you able to form mental images (i.e., pictures) in your mind’s eye?”.
Recruitment for experiment 3 was conducted after the Winter 2021 questionnaire was administered, and was restricted to participants who responded to this question in either the Fall or Winter questionnaire. We note that filling out a mass questionnaire did not guarantee that a student signed up for our experiment; participants could only sign up if they had answered the aphantasia question in mass testing. A different project code was visible to those who answered yes and no, respectively, to roughly equate recruitment rates. We then further classified the 122 who participated with the additional in-session, reversed-sense aphantasia question.
Aphantasia classification
We classified aphantasia in these 122 participants based on three different criteria, which we call “consistent”, “moderate” and “extreme” aphantasics, respectively.
The first criterion was based on consistent responses to the yes/no aphantasia question. Participants who consistently indicated being unable to form mental images in mass-testing and in-session were classified as “consistent aphantasics” (N = 25). Those who consistently indicated the opposite were “consistent non-aphantasics” (N = 34). Those who were inconsistent in their responses to this question formed a third “inconsistent-responder” group (N = 64). Because inconsistent responders changed their answers across testing sessions, we were hesitant to classify them as either aphantasic or non-aphantasic, as they might have been unsure of their status. Additionally, because the recruitment question was embedded within a much longer questionnaire, individuals may not have responded conscientiously to each questionnaire item. This provided further reason for classifying aphantasia based on multiple responses.
To be more selective, we also applied more conservative second and third criteria from Zeman et al., (2020). Of the “consistent aphantasics,” participants with in-session VVIQ totals of 73–79 (maximum 80) were considered “moderate” aphantasics (N = 7), while those rating 80/80 were considered “extreme” aphantasics (N = 3). VVIQ-criterion aphantasic participants are reported as case studies (Table 4).
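The three nested criteria can be summarized as a small decision rule. The following is an illustrative helper of our own (function name and label strings are hypothetical), assuming the VVIQ is scored here so that higher totals mean lower vividness:

```python
def classify_aphantasia(consistent_no_imagery, vviq_total):
    """Apply the three increasingly strict criteria: consistent yes/no
    responses, then VVIQ cutoffs of 73-79 (moderate) and 80/80 (extreme),
    where higher VVIQ totals indicate LOWER vividness (maximum 80)."""
    if not consistent_no_imagery:
        return "non-aphantasic or inconsistent"
    if vviq_total == 80:
        return "extreme aphantasic"
    if 73 <= vviq_total <= 79:
        return "moderate aphantasic"
    return "consistent aphantasic"

# A consistent responder with the maximum VVIQ total meets the strictest cutoff.
label = classify_aphantasia(True, 80)  # "extreme aphantasic"
```

Note that the VVIQ cutoffs only refine the consistent-responder group; a high VVIQ total alone does not classify a participant as aphantasic under these criteria.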
A strength of our procedure was that our experimental session was separated by days or weeks from the Winter mass-testing questionnaire. The in-session reversed-sense aphantasia question and VVIQ were at the end of the session. We thought this should make the constructs of aphantasia and even visual imagery less front-of-mind for participants than in previous aphantasia studies.
Mass questionnaire aphantasia prevalence rates
Next, we applied our three aphantasia classification criteria to mass questionnaire data to provide an estimate of the prevalence of aphantasia in our student population. Note that the following numbers are based solely on mass questionnaire data and not on the sub-sample tested with memory tasks in experiment 3.
We identified 772 participants who answered the aphantasia question in both the Fall and Winter mass testing sessions. Of these participants, 30 indicated being unable to form mental images in both sessions (3.9%). This approached Faw’s (2009) previously estimated rate of 2–3%.
Our conservative aphantasia classification criteria based on VVIQ cutoffs were identical to Zeman et al., (2020), who observed rates of moderate aphantasia (73 − 79/80) and extreme aphantasia (80/80) of 2.6% and 0.7%, respectively, in their mass-testing questionnaire. First, of the N = 2000 who completed the VVIQ in the Fall 2020 mass testing, 23 (0.9%) and 9 (0.4%) met these VVIQ cutoffs, respectively. Next, of the 1975 participants who responded to the VVIQ in the Winter 2021 mass-testing questionnaire, 43 (2.2%) and 26 (1.3%) met the moderate and extreme VVIQ cutoffs, respectively. In sum, the prevalence rates derived from the Fall 2020 questionnaire were considerably lower than previous observations, while the rates derived from the Winter 2021 questionnaire were closer to Zeman et al., (2020). Our extreme-cutoff sample thus appears far more highly selected than prior aphantasic samples.
Materials and procedures
Materials and procedures were identical to experiment 1 except: (1) This experiment was conducted completely online, on Pavlovia.org. The experiment was created using the PsychoPy Builder interface (Peirce et al., 2019) and translated to a PsychoJS experiment (Bridges, Pitiot, MacAskill, & Peirce, 2020). As in experiment 1, recruitment was conducted through the University of Alberta psychology research participation pool, but participants completed the experiment on their personal devices. (2) All participants were instructed to use interactive imagery half-way through the session (no control group). (3) Recognition tasks were omitted; pairs were only tested with cued recall. (4) To use the additional testing time freed up by omitting the recognition tasks, participants studied 10 lists (cf. eight in experiment 1). (5) The PFT was re-added to the design and administered after the VVIQ, as in experiment 1. (6) After the PFT, participants answered a single free-form question about their strategy use. (7) Cued recall direction (forward versus backward) was counterbalanced over all trials, including the practice list. (8) After the strategy-use question (i.e., at the end of the session), a reversed-sense version of the aphantasia recruitment question was administered: “Are you unable to form mental images (i.e., pictures) in your mind’s eye?”. (9) Distractor trials were identical to experiment 2, except that immediately after the response was entered, the screen was held for a fixed 2000-ms period (versus the 1000-ms fixed period in experiment 2).
VVIQ test–retest reliability
We analyzed test–retest reliability of the VVIQ between mass questionnaires and the in-session administration, reported on page S19.
Analysis of gender and interactive-imagery effects
We obtained data on self-reported gender for participants in experiment 3. These are reported on page S18.
Free-form strategy self-report
After the PFT, participants were asked to “describe how you studied the word pairs, whether or not that included the use of visual imagery as instructed, in a short one or two sentence response.” These responses were rated by two coders, blinded to condition, on two measures of interest. First, each response was rated as either 1) including imagery, 2) explicitly excluding imagery, or 3) leaving open the possibility of imagery without being explicit. Second, each response was rated for whether it referred to interactivity or a connection between words (yes/no). Analyses incorporating these ratings are reported on page S16.
Results and discussion
Of 122 participants, 25 were consistent aphantasics, 34 were consistent non-aphantasics, and 63 were inconsistent responders.
Self-reported vividness
Supporting the validity of our yes/no aphantasia self-identification question, consistent aphantasic responders scored significantly higher on the VVIQ (where higher scores indicate lower vividness) than the consistent non-aphantasic group (p < .001, Mann–Whitney U testFootnote 9) and the inconsistent-responder group (p < .001). The difference between inconsistent responders and consistent non-aphantasic responders nearly reached significance (p = .07). Additionally, the average VVIQ rating for consistent aphantasic responders was well above values in experiments 1 and 2 (Table 1). Visual inspection revealed a number of characteristics of the VVIQ responses. First, the inconsistent-responder group contained participants who exhibited both extremely high and extremely low vividness. Second, a sizeable number of consistent aphantasics nonetheless reported moderate amounts of vividness in the VVIQ, with ratings within the middle of the VVIQ distribution for consistent non-aphantasics. We do not think that participants were simultaneously reporting an inability to form images (aphantasia question) while reporting vivid mental images (VVIQ). Instead, consistent aphantasics who rated high vividness might have either responded carelessly or interpreted vividness in terms of the amount of detail within a non-visual representation.
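A Mann–Whitney U comparison of two ratings distributions can be sketched in pure Python. This is an illustrative normal-approximation sketch with made-up ratings (not the study's data), using average ranks for ties and no tie-variance or continuity correction:

```python
from statistics import NormalDist

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.
    Ties receive average ranks; the variance formula omits the tie
    correction, so this is only a rough sketch for illustration."""
    pooled = sorted(x + y)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # tied run occupies 1-based ranks i+1..j; assign their mean
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    r1 = sum(ranks[v] for v in x)          # rank sum for sample x
    n1, n2 = len(x), len(y)
    u1 = r1 - n1 * (n1 + 1) / 2            # U statistic for sample x
    mu = n1 * n2 / 2                       # mean of U under H0
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u1 - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, p

# Made-up VVIQ-like totals: one group rates uniformly higher than the other.
u, p = mann_whitney_u([70, 75, 80, 78, 72], [30, 35, 40, 38, 32])
```

With these made-up samples, every value in the first group outranks every value in the second, so U hits its maximum (n1 × n2 = 25) and the approximate p value is well below .05.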
Cued recall
A mixed ANOVA on cued recall accuracy (Fig. 2), with design Group (consistent aphantasic, inconsistent responders, consistent non-aphantasics) × Instruction phase (pre-instruction, post-instruction), returned a significant main effect of Instruction phase, F(1,119) = 91.02, MSE = 1.59, p < .001, \({\eta _{p}^{2}} = 0.43\), BFinclusion > 1000. However, both the main effect of Group and the Group × Instruction phase interaction were supported null effects (all p > .5, BFinclusion < 0.3), indicating that aphantasia status did not influence the benefit of interactive-imagery instructions. Additionally, the cued recall accuracy achieved after the imagery instruction in each group was comparable to that of the imagery group from experiment 1 (≈ 60%), suggesting that the imagery manipulation was successful, and all three groups from experiment 3 would presumably have scored higher than a control group, had it been included.
Paper-folding task
A one-way ANOVA on PFT accuracy with factor Group [3] returned a non-significant, supported null effect (p = .52, BFinclusion < 0.3), and likewise for PFT response time (p = .83, BFinclusion < 0.3). Thus, aphantasic participants did not exhibit worse visuospatial skill, measured objectively, and achieved scores comparable to participants in the other experiments (Table 1). These results suggest that the PFT may be added to a class of visuospatial tasks for which aphantasics are fully competent (Zeman et al., 2020), such as mental rotation (Shepard & Metzler, 1973) and the Brooks matrix spatial task (Brooks, 1967), which we revisit in the general discussion.
The relationship among mental imagery skill, vividness, and the effectiveness of interactive-imagery instructions
First, including all participants, VVIQ ratings had a supported null correlation with cued recall accuracy in both instruction phases (both p > .39, BF10 < 0.30), and both PFT accuracy and response time were positively correlated with cued recall accuracy in both instruction phases (Table S9), replicating experiment 1 with broader coverage of the range of VVIQ values.
Next, we asked whether variability within each group of participants might show different effects. With correlations computed separately for consistent aphantasics, consistent non-aphantasics and inconsistent responders, VVIQ ratings again had a supported null relationship to cued recall accuracy in both instruction phases and all groups (p > .29, BF10 < 0.36), except for inconsistent responders in the pre-instruction phase, r(61) = −.27, p = .03, BF10 = 1.42, although the Bayesian correlation was inconclusive. Importantly, VVIQ ratings did not determine the effectiveness of the interactive imagery within the group of consistent aphantasics.
PFT accuracy positively correlated with cued recall accuracy for all three groups and in both instruction phases, and PFT response time had significant positive correlations with cued recall accuracy in both the pre- and post-instruction phases. Thus, skill on this visuospatial task did not predict the effectiveness of interactive imagery even within the consistent aphantasic group.
More conservative criteria for aphantasia
Next, we applied increasingly conservative criteria for classification of aphantasics, as described in the Methods. Given the low numbers, these should be interpreted as multiple case studies. Our goal was to check if applying more strict classification criteria would show hints of increased group differences, even while reducing statistical power.
Inconsistent with this, three one-way ANOVAs with factor Group (VVIQ-criterion consistent aphantasics, non-VVIQ-criterion consistent aphantasics, inconsistent responders, consistent non-aphantasics) on PFT accuracy, PFT response time, and Change in Accuracy returned supported null effects of Group (all p > .57, BFinclusion < 0.3). Five of the ten VVIQ-criterion participants reported, unprompted, difficulty forming visual images. Eight exhibited at least a 10% increase in cued recall following the imagery instruction, with four increasing by 22.5% or more.
Eight participants explicitly reported the use of alternative strategies. It was unclear whether participant 1 was referring to mental imagery, but they described some difficulty with imagining and resorting to “memory of thinking about it”. Two participants (7 and 9) reported rote repetition, known to be a poor associative strategy (Bower & Winzenz, 1970), yet still increased substantially (+ 22.5% and + 15%). Two participants did not benefit from the imagery instruction: participant 3 exhibited a small negative change (− 2.5%), and participant 5 exhibited a substantial reduction (− 25%) in performance and, interestingly, was the only VVIQ-criterion aphantasic who reported trying to implement the imagery instructions, suggesting that strict adherence to the imagery instructions may not be beneficial to aphantasics.
Our extreme aphantasics, participants 4, 6, and 7, are of particular interest. Each reported no vividness, was perfectly consistent across multiple administrations of the aphantasia question, and described using non-imagery strategies, consistent with a complete lack of mental imagery. All three benefited from the imagery instruction (+ 10%, + 10%, and + 22.5%).
In sum, the reduction in sample size was not offset by any hint of an emerging deficit in aphantasics’ response to interactive-imagery instructions, converging with our other evidence against the centrality of visual imagery for interactive-imagery instructions.
General discussion
We replicated the positive effect of interactive-imagery instructions on cued recall (Bower & Winzenz, 1970; Bower, 1970; Paivio, 1969; Paivio & Yuille, 1969; Paivio & Foth, 1970; Richardson, 1985; 1998) compared to control instructions (experiment 1), compared to the no-instruction baseline (all experiments), and compared to the “top-bottom” variant of standard interactive-imagery instructions (experiment 2). Correlations between characteristics of a participant’s visual imagery (individual differences in visuospatial skill and vividness) and the effectiveness of interactive imagery produced supported null effects.Footnote 10 Furthermore, aphantasics showed no trace of impairment despite their self-diagnosed inability to form visual imagery (experiment 3). Thus, we found no support for the hypothesis that visual images are necessary for interactive-imagery benefits, raising the possibility of alternative explanations.
Curiously, order recognition was not improved by interactive imagery (experiment 1), nor even instructions incorporating order into the image (experiment 2). Whatever additional detail/information is afforded by interactive-imagery instructions evidently does not provide order. Moreover, the relationship between order recognition and cued recall was not influenced by instruction. These results argue against the hypothesis that imagery strategies result in formally different association memories that contain more order. Instead, our results were more consistent with the alternative hypothesis that imagery produces associations that are qualitatively the same as non-imagery conditions.
Subjective vividness does not explain imagery-instruction benefits to cued recall
In all three experiments, subjective vividness of mental imagery (VVIQ rating) did not explain the effectiveness of interactive imagery for cued recall. This was reinforced in experiment 3, where aphantasics (high VVIQ) benefited from interactive-imagery instructions as much as others (Fig. 2). All VVIQ-criterion aphantasics who benefited post-instruction reported either solely using non-imagery strategies or a combination of imagery and non-imagery strategies, evidently with no consequence for their benefit from interactive-imagery instructions. Even the three participants who reported no vividness whatsoever benefited from imagery instructions while reporting the use of imagery-free strategies. This is consistent with the observation that congenitally blind participants can effectively apply the method of loci, which is typically described as heavily dependent upon visual imagery (de Beni & Cornoldi, 1985), and with null correlations of the VVIQ with this strategy (Kliegl et al., 1990; Kluger et al., 2022).
Although the VVIQ has been widely used to assess subjective imagery vividness (Marks, 1973), and is a primary way to classify aphantasia (Zeman et al., 2015), there have been specific critiques about its content validity that may be important to consider (McKelvie, 1995; Pylyshyn, 2002). McKelvie (1995) suggested the VVIQ may not capture important dimensions of imagery experience, such as the distinction between imagery vividness and generation. Future studies should focus on qualities of visual imagery experience that the VVIQ may not adequately capture, like imagery generation.
Objective imagery skill does not relate to interactive imagery
PFT accuracy did not predict the effectiveness of the interactive-imagery instructions, but covaried with performance even before strategy instructions were given (experiments 1 and 2). Although this does not rule out the PFT as a measure of other memory processes like working memory or visuospatial ability, it weakens the argument that imagery skill determines success with interactive-imagery instructions.
Interestingly, there was a supported null difference between PFT performance in aphantasics and non-aphantasics in experiment 3, which may place the PFT in a class of visuospatial tasks that aphantasics perform without any clear deficits (Zeman et al., 2010). Both Zeman et al., (2010) and Bainbridge et al., (2021) suggested that aphantasics use symbolic/verbal strategies for visuospatial tasks. Thus, the cognitive processes required for this task may not necessarily depend on visual images, which suggests a dissociation between conscious mental imagery experience and the cognitive processes engaged when solving complex visuospatial problems. Furthermore, because the PFT could not explain the benefits of interactive imagery, its intact status in aphantasics cannot explain why aphantasics showed virtually no reduced benefit from these instructions.
Validity of aphantasia-status classified by self-report
Our three criteria for classifying aphantasia in experiment 3 (multiple consistent responses to the aphantasia recruitment question, and two VVIQ cutoffs) produced prevalence rates that approached the estimates in previous studies (see Methods), suggesting that our methods of classifying aphantasia aligned well with previous aphantasia studies. Despite this, there are broader critiques of classifying aphantasia by self-report. For example, de Vito and Bartolomeo (2016) suggested aphantasics may underestimate a latent ability to form mental images. The perceived absence of mental imagery experience may then be due to poor or altered meta-cognition rather than fundamental differences in cognitive representations. However, even if aphantasia is due to an inaccurate sense of one’s own imagery ability, our findings still show that this kind of imagery self-efficacy is immaterial to memory success following interactive-imagery instructions, again problematic for the hypothesis that interactive imagery acts through the formed image itself.
Interactive-imagery effects without visual imagery
Our findings challenge the notion that visual imagery, in any literal sense, is essential for the benefit of interactive-imagery instructions to cued recall. In other words, the subjective experience of mental imagery occurs for those who are able to form images, but is not required for later memory benefits. This resonates with Pylyshyn’s (2002) argument that the experience of mental imagery may be epiphenomenal, and not necessarily causal.
A similar story is emerging from recent research on word concreteness/imageability effects. High-imageability words are recalled better than low-imageability words (Paivio, 1969), and Hockley (1994) found better associative recognition for word pairs of higher concreteness. Paivio and colleagues explained concreteness effects in terms of the greater availability of visual image mediators for concrete/imageable than for abstract/low-imageable words, supported by findings of more frequent self-reported use of imagery strategies during the study of high-imageability word pairs (Paivio et al., 1968; Paivio and Yuille, 1969). Thus, the historical understanding of concreteness/imageability effects is functionally linked to visual imagery-related strategies like interactive imagery.
However, behavioral and neuroimaging findings have challenged the idea that concreteness effects can be explained via visual imagery. Westbury et al. (2013) and Westbury, Cribben, and Cummine (2016) showed that concreteness effects on lexical decision could be explained by non-imagery factors such as the size/density of a word's context and its emotional associations (see Fiebach & Friederici, 2004; see also Cox, Hemmer, Aue, & Criss, 2018, who found semantic diversity, alongside concreteness, to be a strong predictor of memory performance). In neuroimaging studies, one can look for memory-related activity in brain regions involved in mental imagery, such as posterior visual-processing regions, or for right-lateralized activity. However, Caplan and Madan (2016) found no imagery-like brain activity that could explain word-imageability effects on cued recall (see also Klaver et al., 2005). Rather, higher imageability was associated with more (somewhat left-dominant) hippocampal activity, which in turn apparently increased memory. Similarly, Duncan, Tompary, and Davachi (2014) found that functional connectivity between the hippocampus and the ventral tegmental area during interactive-imagery instructions predicted retrieval success; neither region is specialized for imagery.
An alternative explanation of interactive-imagery effects
Vicente and Wang (1998) emphasized that expert-memory effects depend on participants engaging with stimuli in a manner relevant to their domain of expertise. Extrapolating to non-expert domains, perhaps interactive imagery acts primarily by inspiring participants to engage with word pairs in a manner that leads to this kind of meaningful or deep processing.
But what is the nature of this deeper processing, and how does it improve memory? Some hints may be gleaned from experiment 2. Standard-imagery and actor-object imagery instructions both benefited memory. Given the high similarity between the examples given for the two instructions, both may have engaged the same mechanisms, perhaps revealing some role for motor imagery (Allen et al., 2022; Yang et al., 2021) in interactive-imagery effects. In contrast, top-bottom instructions, which ask participants to imagine a spatially organized image including both words but do not explicitly refer to the words interacting, did not change cued recall or associative recognition from baseline. Top-bottom imagery may be difficult to implement, especially for certain word pairs. For example, it is easier to conceptualize a spatially organized image of APPLE DRAGON than of ASPECT LEVEL (both of which were possible pairings in our study); however, this challenge would also apply to the standard and actor-object strategies (concreteness effects; cf. Hockley, 1994; Paivio, 1969). Alternatively, top-bottom instructions may miss a key component: explicit instructions to conceptualize an interactive, functional relationship between the items. Top-bottom imagery may resemble explicitly non-interactive "separation-imagery" instructions, in which participants are asked to form mental images of each word in isolation, which does not improve association-memory (Bower, 1970; Dempster & Rohwer, 1974; Hockley & Cristi, 1996).
In contrast, by leading participants to think about an interactive relationship between words, effective associative strategies like interactive imagery may facilitate encoding of additional item features that are pair-unique. To illustrate how this may occur, consider an associative recognition task for the pairs APPLE TEACHER and TABLE OVEN. An image (or non-visual analogue) of a TEACHER with an APPLE (an intact pair, here) may generate a stereotypical image of a crisp, red apple on a teacher's desk, whereas an image of an OVEN with an APPLE (a recombined pair, here) might bring to mind baked apples. The more a participant focuses on how the words might interact, the more detailed and pair-specific the stored representations might be (see the modelling work of Caplan, Chakravarty, & Dittmann, in press; Cox & Criss, 2017, 2020; and Benjamin, 2010). For example, Cox and Criss (2020) showed how similarity can cause the representations of two items to become correlated, by drawing attention to their common features. One intriguing possibility is that interactive imagery amplifies this very same effect by drawing the participant's attention to shared features.
Supporting the encoding of more detailed item representations, item recognition improves alongside associative memory performance when comparing interactive imagery to rote repetition (Dempster & Rohwer, 1974; Hockley & Cristi, 1996; see note 11). Such a mechanism could conceivably operate without visual imagery. This is consistent with findings that verbally mediated strategies for association-memory (e.g., forming a sentence including both words) are nearly as effective (Dunlosky et al., 2005; Hockley & Cristi, 1996).
Interactive-imagery instructions do not change model-relevant characteristics of the association
Largely replicating, and extending the boundary conditions of, Kato and Caplan (2017), order recognition correlated significantly with cued recall accuracy, but this correlation was significantly weaker than the correlation between associative recognition and cued recall (Figs. S11, S15, and S16). Despite large effects on association-memory, imagery instructions did not modulate these findings (Figs. S11, S15, and S16). Whatever additional detail or information imagery instructions afford does not improve memory for order. An interesting possibility is that order and associative information are represented differently in memory, which would explain why manipulations of association-memory do not affect memory for order. Cox and Criss (2020) suggested that order could be represented by item features distinct from associative features. In any case, our findings indicate that the challenges to perfect-order models (which predict a perfect relationship between order recognition and cued recall) and to order-absent models (which predict no relationship) are not particular to uninstructed participants, but generalize to several instructed strategies. This increases the need for models that can accommodate an intermediate degree of order information within associations.
Conclusions
Interactive-imagery instructions improve associative memory without requiring vividness, visual-imagery skill, or even the subjective sense that one can create visual imagery. The instruction may instead lead participants to conceptualize elaborate, interactive relationships, leading to storage of more distinctive features. Finally, whatever additional detail aids associative memory does not improve memory for order.
Notes
Due to a programming error, counterbalancing was slightly unbalanced for associative recognition in the MATLAB version. When one recombined trial was randomly assigned to a given list, it did not have another recombined pair with which to exchange words, and so appeared as an intact trial. This error was rare: 11 participants had one extra intact trial, and one participant had two extra intact trials.
Expanding on these findings, we also found evidence that imagery instructions were most beneficial for participants with poor baseline performance (page S4).
A non-significant effect can have strong evidence in a Bayesian analysis because JASP's implementation of Bayesian model selection refuses to consider models that include interactions without the corresponding main terms. Thus, if there is strong evidence for the interaction, JASP will also return strong evidence for the main terms included in that interaction.
The significant correlation between VVIQ ratings and post-imagery instruction associative recognition \(d^{\prime }\) was not replicated in experiment 2, thus, we do not consider this a robust finding and do not discuss it in the general discussion.
We also found no support for the idea that significant pre-instruction PFT correlations were due to high PFT scorers spontaneously adopting imagery before being instructed to do so (page S4).
When interpreting these results, one might consider the effect of testing pairs with cued recall before order recognition. Indeed, this was a major point addressed by Kato and Caplan (2017), who, in their second experiment, withheld half the pairs from cued recall testing, and in their third experiment, moved cued recall to the end of the session. In both cases, they found that the order-cued recall relationship persisted, which we also found when analyzing testing effects in our own dataset, reported on page S1.
The within-subject analyses of the OR–CR relationship for experiments 1 and 2 are reported on pages S5 and S13.
Note that in these analyses, we did not perform the chi-squared tests proposed in the pre-registration.
We tested group differences with non-parametric tests due to the skewed vividness rating distribution in the consistent aphantasia group (Fig. 4).
Additionally, correlations between post-minus-pre instruction memory performance and our visual imagery measures produced supported null effects (page S19).
Both Hockley and Cristi (1996) and Dempster and Rohwer (1974) also found that separation imagery improved item recognition, suggesting that interactivity is not required to encode more detailed item representations. However, the additional item features granted by non-interactive strategies would likely not be pair-specific, which may explain the lack of effects on associative memory.
References
Allen, R.J., Waterman, A.H., Yang, T., & Jaroslawska, A.J. (2022). Working memory in action: Remembering and following instructions. In R.H. Logie, Z. Wen, S.E. Gathercole, N. Cowan, & R.W. Engle (Eds.), Memory in science for society: There is nothing as practical as a good theory. Oxford University Press.
Anderson, J.A. (1970). Two models for memory organization using interacting traces. Mathematical Biosciences, 8, 137–160.
Bainbridge, W. A., Pounder, Z., Eardley, A. F., & Baker, C. I. (2021). Quantifying aphantasia through drawing: Those without visual imagery show deficits in object but not spatial memory. Cortex, 135, 159–172.
Benjamin, A.S. (2010). Representational explanations of process dissociations in recognition: The dryad theory of aging and memory judgments. Psychological Review, 117(4), 1055–1079.
Bower, G.H. (1970). Imagery as a relational organizer in associative learning. Journal of Verbal Learning and Verbal Behavior, 9, 529–533.
Bower, G.H., & Winzenz, D. (1970). Comparison of associative learning strategies. Psychonomic Science, 20, 119–120.
Brainard, D.H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436.
Bridges, D., Pitiot, A., MacAskill, M., & Pierce, J. (2020). The timing mega-study: comparing a range of experiment generators, both lab-based and online. PeerJ, 8.
Brooks, L.R. (1967). The suppression of visualization by reading. Quarterly Journal of Experimental Psychology, 19, 139–159.
Caplan, J.B., Chakravarty, S., & Dittmann, N.L. (in press). Associative recognition without hippocampal associations. Psychological Review.
Caplan, J.B., & Madan, C.R. (2016). Word-imageability enhances association-memory by increasing hippocampal engagement. Journal of Cognitive Neuroscience, 28(10), 1522–1538.
Cox, G.E., & Criss, A.H. (2017). Parallel interactive retrieval of item and associative information from event memory. Cognitive Psychology, 97(5), 31–61.
Cox, G.E., & Criss, A.H. (2020). Similarity leads to correlated processing: A dynamic model of encoding and recognition of episodic associations. Psychological Review, 127(5), 792–828.
Cox, G.E., Hemmer, P., Aue, W.R., & Criss, A.H. (2018). Information and processes underlying semantic and episodic memory across tasks, items, and individuals. Journal of Experimental Psychology: General, 147(4), 545–590.
Criss, A. H., & Shiffrin, R. M. (2005). List discrimination in associative recognition and implications for representation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(6), 1199–1212.
de Beni, R., & Cornoldi, C. (1985). The effects of imaginal mnemonics on congenitally total blind and on normal subjects. In D. F. Marks, & D. Russell (Eds.) Imagery, (Vol. 1 pp. 56–59). Dunedin, N.Z.: Human Performance Associates.
de Vito, S., & Bartolomeo, P. (2016). Refusing to imagine? On the possibility of psychogenic aphantasia. A commentary on Zeman et al. (2015). Cortex, 334–335.
Dempster, F.N., & Rohwer, W.D. (1974). Component analysis of the elaborative encoding effect in paired-associate learning. Journal of Experimental Psychology, 103(3), 400–408.
Duncan, K., Tompary, A., & Davachi, L. (2014). Associative encoding and retrieval are predicted by functional connectivity in distinct hippocampal area CA1 pathways. Journal of Neuroscience, 34(34), 11188–11198.
Dunlosky, J., Hertzog, C., & Powell-Moman, A. (2005). The contribution of mediator-based deficiencies to age differences in associative learning. Developmental Psychology, 41(2), 389–400.
Engelkamp, J. (1991). Imagery and enactment in paired-associate learning. In R. H. Logie & M. Denis (Eds.), Mental images in human cognition (pp. 119–128). Amsterdam: North-Holland.
Engelkamp, J. (1995). Visual imagery and enactment of actions in memory. British Journal of Psychology, 86, 227–240.
Faw, B. (2009). Conflicting intuitions may be based on differing abilities. Journal of Consciousness Studies, 16(4), 45–68.
Fiebach, C.J., & Friederici, A.D. (2004). Processing concrete words: fMRI evidence against a specific right-hemisphere involvement. Neuropsychologia, 42(1), 62–70.
Foer, J. (2011) Moonwalking with Einstein: The art and science of remembering everything. New York, NY: Penguin Press.
Fraundorf, S.H., Diaz, M., Finley, J., Lewis, M.L., Tooley, K.M., Isaacs, A.M., & Brehm, L. (2014). CogToolbox for MATLAB [computer software]. Retrieved from http://www.scottfraundorf.com/cogtoolbox.html
French, J.W., Ekstrom, R.B., & Price, L.A. (1963). Kit of reference tests for cognitive factors. Princeton, NJ: Educational Testing Service.
Friendly, M., Franklin, P.E., Hoffman, D., & Rubin, D.C. (1982). The Toronto Word Pool: Norms for imagery, concreteness, orthographic variables, and grammatical usage for 1,080 words. Behavior Research Methods, Instruments, & Computers, 14, 375–399.
Geller, A.S., Schleifer, I.K., Sederberg, P.B., Jacobs, J., & Kahana, M.J. (2007). PyEPL: a cross-platform experiment-programming library. Behavior Research Methods, 39(4), 950–958.
Gesualdo, F. (1592). Plutosofia. Padua.
Greene, R.L., & Tussing, A.A. (2001). Similarity and associative recognition. Journal of Memory and Language, 45, 573–584.
Hautus, M.J. (1995). Corrections for extreme proportions and their biasing effects on estimated values of d′. Behavior Research Methods, Instruments, & Computers, 27(1), 46–51.
Hintzman, D.L. (1984). MINERVA 2: A simulation model of human memory. Behavior Research Methods, Instruments, & Computers, 16(2), 96–101.
Hockley, W.E. (1994). Reflections of the mirror effect for item and associative recognition. Memory & Cognition, 22(6), 713–722.
Hockley, W.E., & Cristi, C. (1996). Tests of encoding tradeoffs between item and associative information. Memory & Cognition, 24, 202–216.
JASP Team (2021). JASP (Version 0.15) [computer software]. Retrieved from https://jasp-stats.org/
Kato, K., & Caplan, J.B. (2017). Order of items within associations. Journal of Memory and Language, 97, 81–102.
Kelly, M.A., Blostein, D., & Mewhort, D.J.K. (2013). Encoding structure in holographic reduced representations. Canadian Journal of Experimental Psychology, 67(2), 79–93.
Keogh, R., & Pearson, J. (2018). The blind mind: No sensory visual imagery in aphantasia. Cortex, 7, 53–80.
Klaver, P., Fell, J., Dietl, T., Schür, S., Schaller, C., Elger, C.E., & Fernández, G. (2005). Word imageability affects the hippocampus in recognition memory. Hippocampus, 15, 704–712.
Kleiner, M., Brainard, D., & Pelli, D. (2007). What’s new in psychtoolbox-3? Perception, 36 (14), 1–16.
Kliegl, R., Smith, J., & Baltes, P.B. (1990). On the locus and process of magnification of age differences during mnemonic training. Developmental Psychology, 26, 894–904.
Kluger, F. E., Oladimeji, D. M., Tan, Y., Brown, N. R., & Caplan, J. B. (2022). Mnemonic scaffolds vary in effectiveness for serial recall. Memory.
Konrad, B.N. (2013). Superhirn - Gedächtnistraining mit einem Weltmeister. Vienna: Goldegg Verlag.
Kounios, J., Bachman, P., Casasanto, D., Grossman, M., Smith, W., & Yang, R.W. (2003). Novel concepts mediate word retrieval from human episodic associative memory: evidence from event-related potentials. Neuroscience Letters, 345, 157–160.
Kounios, J., Smith, R.W., Yang, W., Bachman, P., & D’Esposito, M. (2001). Cognitive association formation in human memory revealed by spatiotemporal brain imaging. Neuron, 29, 297–306.
Kucera, H., & Francis, W. (1967) Computational analysis of present-day American English. Providence, R.I.: Brown University Press.
Maguire, E.A., Valentine, E.R., Wilding, J.M., & Kapur, N. (2003). Routes to remembering: the brains behind superior memory. Nature Neuroscience, 6(1), 90–95.
Marks, D.F. (1972). Individual differences in the vividness of visual imagery and their effect on function. In P.W. Sheehan (Ed.) The function and nature of imagery (pp. 83–106). New York: Academic Press.
Marks, D.F. (1973). Visual imagery differences in the recall of pictures. British Journal of Psychology, 64(1), 17–24.
McKelvie, S.J. (1995). The VVIQ as a psychometric test of individual differences in visual imagery vividness: A critical quantitative review and plea for direction. Journal of Mental Imagery, 19(3-4), 1–106.
Metcalfe Eich, J. (1982). A composite holographic associative recall model. Psychological Review, 89(6), 627–661.
Müller, N.C.J., Konrad, B.N., Kohn, N., Muñoz-López, M., Czisch, M., Fernández, G., & Dresler, M. (2018). Hippocampal–caudate nucleus interactions support exceptional memory performance. Brain Structure and Function, 223(3), 1379–1389.
Murdock, B.B. (1982). A theory for the storage and retrieval of item and associative information. Psychological Review, 89(6), 609–626.
Paivio, A. (1969). Mental imagery in associative learning and memory. Psychological Review, 76 (3), 241–263.
Paivio, A., & Foth, D. (1970). Imaginal and verbal mediators and noun concreteness in paired-associate learning: The elusive interaction. Journal of Verbal Learning and Verbal Behavior, 9, 384–390.
Paivio, A., Smythe, P.C., & Yuille, J.C. (1968). Imagery versus meaningfulness of nouns in paired associate learning. Canadian Journal of Psychology, 22, 427–441.
Paivio, A., & Yuille, J.C. (1969). Changes in associative strategies and paired-associate learning over trials as a function of word imagery and type of learning set. Journal of Experimental Psychology, 79(3), 458–463.
Peirce, J., Gray, J.R., Simpson, S., MacAskill, M., Höchenberger, R., Sogo, H., & Lindeløv, J.K. (2019). PsychoPy2: Experiments in behavior made easy. Behavior Research Methods, 51(1), 195–203.
Pelli, D.G. (1997). The videotoolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Plate, T.A. (1995). Holographic reduced representations. IEEE Transactions on Neural Networks, 6(3), 623–641.
Pylyshyn, Z.W. (2002). Mental imagery: In search of a theory. Behavioral and Brain Sciences, 25, 157–238.
Richardson, J.T.E. (1985). Converging operations and reported mediators in the investigation of mental imagery. British Journal of Psychology, 75, 205–214.
Richardson, J.T.E. (1998). The availability and effectiveness of reported mediators in associative learning: A historical review and an experimental investigation. Psychonomic Bulletin & Review, 5(4), 597–614.
Sanchez, C. (2019). The utility of visuospatial mnemonics is dependent on visuospatial aptitudes. Applied Cognitive Psychology, 33, 519–529.
Shepard, R.N., & Metzler, J. (1973). Mental rotation of three-dimensional objects. Science, 171, 701–703.
Shiffrin, R.M., & Steyvers, M. (1997). A model for recognition memory: REM— Retrieving Effectively From Memory. Psychonomic Bulletin & Review, 4, 145–166.
Sivashankar, Y., & Fernandes, M.A. (2021). Enhancing memory using enactment: does meaning matter in action production? Memory, 30, 147–160.
Vicente, K.J., & Wang, J.H. (1998). An ecological theory of expertise effects in memory recall. Psychological Review, 105(1), 33–57.
Westbury, C. F., Cribben, I., & Cummine, J. (2016). Imaging imageability: Behavioral effects and neural correlates of its interaction with affect and context. Frontiers in Human Neuroscience, 10, 346.
Westbury, C.F., Shaoul, C., Hollis, G., Smithson, L., Briesemeister, B.B., Hofmann, M.J., & Jacobs, A.M. (2013). Now you see it, now you don’t: on emotion, context, and the algorithmic prediction of human imageability judgments. Frontiers in Psychology, 4(991), 1–13.
Yang, J., Zhao, P., Zhu, Z., Mecklinger, A., Fang, Z., & Han, L. (2013). Memory asymmetry of forward and backward associations in recognition tasks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39(1), 253–269.
Yang, T., Allen, R. J., Waterman, A. H., Zhang, S., Su, X., & Chan, R. C. K. (2021). Comparing motor imagery and verbal rehearsal strategies in children’s ability to follow spoken instructions. Journal of Experimental Child Psychology, 203, 10533.
Yates, F.A. (1966) The art of memory. Chicago: University of Chicago Press.
Zeman, A., Della Sala, S., Torrens, L.A., Gountouna, V.E., McGonigle, D.J., & Logie, R.H. (2010). Loss of imagery phenomenology with intact visuo-spatial task performance: A case of ‘blind imagination’. Neuropsychologia, 48, 145–155.
Zeman, A., Dewar, M., & Della Sala, S. (2015). Lives without imagery - congenital aphantasia. Cortex, 73, 378–380.
Zeman, A., Milton, F., Della Sala, S., Dewar, M., Frayling, T., Gaddum, J., & Winlove, C. (2020). Phantasia - the psychological significance of lifelong visual imagery vividness extremes. Cortex, 130, 426–440.
Acknowledgements
This research was supported by the Natural Sciences and Engineering Research Council of Canada. The authors thank Debby Oladimeji and Yuwei Tan for assisting with data analyses. Parts of this work have been presented at the 2019, 2020, and 2021 Annual Meetings of the Psychonomic Society, the 2019 Banff Annual Seminar in Cognitive Science, and the 2020 Meeting of the Context and Memory Symposium. Procedures in all experiments were approved by a University of Alberta ethical review board. The pre-registration for experiment 2 can be accessed at https://osf.io/8hgux. Data and materials for all of the experiments are available online at https://osf.io/x78gp/.
Supplementary Information
ESM 1
(PDF 1.44 MB)
Cite this article
Thomas, J.J., Ayuno, K.C., Kluger, F.E. et al. The relationship between interactive-imagery instructions and association memory. Mem Cogn 51, 371–390 (2023). https://doi.org/10.3758/s13421-022-01347-6