A fundamental and much debated question in psychology is whether a specific learning experience can transfer and generalize to other tasks, situations, or processes. The concept is intuitively reasonable, since the opposite would imply that learning is specific to every task and context, and several studies have indeed shown self-reported benefits of cognitive training (Hudes et al., 2019). Still, transfer has been scientifically elusive, and empirical results have been mixed (Besson et al., 2011; Gross et al., 2012; Jaeggi et al., 2008; Klingberg, 2010; Melby-Lervåg & Hulme, 2013; Nguyen et al., 2021; Owen et al., 2010; Pahor et al., 2022; Pan & Rickard, 2018; Sandberg et al., 2014; Wang, 2020).

A common way of defining transfer is on a near-to-far continuum: transfer within the same cognitive domain is usually defined as near transfer, while transfer to another cognitive domain, e.g., from working memory to intelligence, is defined as far transfer (Barnett & Ceci, 2002; Gobet & Sala, 2022; Green et al., 2019). Far transfer has been demonstrated to a much lesser extent in previous research (Gobet & Sala, 2022).

One hypothesis for why transfer effects, especially far transfer, are rare is that engagement of specific neuronal processes in both the practiced situation and in the transfer task(s) is required, so that strengthening of these processes during training will benefit performance in transfer situations. Such an overlap might be less likely in some far transfer situations. Although studies examining neural underpinnings of behaviorally observed transfer effects are scarce (Buschkuehl et al., 2012; Salmi et al., 2018), empirical support for this account was provided in a study of working memory (updating) training, where training-induced increases of striatal activity were observed during the execution of both trained (letter memory) and untrained (n-back) updating tasks (Dahlin et al., 2008) in young but not old adults. The behavioral training effects were replicated in a subsequent study, which additionally indicated a role of the dopamine D2 system in updating training (Bäckman et al., 2011). Furthermore, a study by Salminen et al. (2016) showed that practice on a dual-n-back task led to increased performance in a letter memory task and a related increase in striatal activity in both tasks. A recent study provided additional support for the notion that learning transfer arises when trained and transfer tasks rely on overlapping brain systems and shared cognitive processes (Thibault et al., 2021). In that study, it was demonstrated that training on complex syntax processing selectively improved performance in tool use and vice versa and was associated with co-localized neural activity within the striatum.

In general, transfer is less common among older adults (Bråthen et al., 2022; Noack et al., 2009; Sandberg et al., 2014; Schmiedek et al., 2010), possibly because the typically lower levels reached on the criterion task by older adults are not sufficient for engaging critical brain structures (Dahlin et al., 2008).

In the current study, we targeted transfer effects following episodic long-term memory training. Episodic memory (Tulving, 2002) has profound importance in everyday life and is affected in normal as well as pathological aging (Boraxbekk et al., 2015; Gorbach et al., 2017; Moscovitch et al., 2016; Nyberg & Pudas, 2019; Nyberg et al., 2012, 2020; Shing et al., 2010). Episodic memories are formed by binding aspects of an event (which may include time, place, percepts, and feelings), represented in widespread cortical sites, into a coherent whole, a process dependent on the hippocampus (Barry & Maguire, 2019; Bouffard et al., 2018; Hainmueller & Bartos, 2020; Moscovitch et al., 2016; Shimamura, 2010; Yonelinas et al., 2019). Hippocampal activity has been demonstrated across a range of associative tasks, such as associations between names and faces (Pudas et al., 2013; Salami et al., 2012; Sperling et al., 2003) and also between other item pairs (e.g., occupations-faces, objects-scenes; see, e.g., Dulas et al., 2021; Rubin et al., 2017).

The classic mnemonic, the method of loci (MoL), involves visualizing items to be encoded (e.g., words or pictures) and associating them to imagined places by “placing” them, in mind, in a familiar environment, e.g., one’s home (Bower, 1970; Higbee, 2001; Yates, 1966). MoL use can greatly increase performance in trained episodic memory tasks (Dresler et al., 2017; Engvig et al., 2010; Kliegl et al., 1990; Mallow et al., 2015; McCabe, 2015; Raz et al., 2009). The efficacy of MoL has been shown to rely on the effective binding between item and spatial context (Reggente et al., 2020), and functional neural changes associated with the utilization of MoL have been observed in the hippocampus (Jones et al., 2006; Nyberg et al., 2003). Thus, tasks that require binding information might share a processing overlap in the hippocampus. Competitors in memory championships ubiquitously report use of MoL (Dresler et al., 2017; Maguire et al., 2003; Müller et al., 2018) and have been shown to disproportionally recruit (posterior) hippocampus at encoding (Maguire et al., 2003) and to have larger anterior hippocampal volumes compared to control participants (Müller et al., 2018).

Strengthening episodic memory by means of MoL instruction and practice has received considerable interest in aging research (Baltes & Kliegl, 1992; Bråthen et al., 2022; de Lange et al., 2017; Engvig et al., 2010; Gross et al., 2014; Verhaeghen & Marcoen, 1996), fueled by findings of age-related deterioration in hippocampus, and the fact that tasks requiring binding typically show the most pronounced age-related decline (Naveh-Benjamin et al., 2007), possibly partly due to reductions in recruitment of a process-specific encoding network involving binding processes (Salami et al., 2012). Commonly, younger adults increase performance to a larger degree than older adults, magnifying preexisting age differences in episodic memory training task performance (Kliegl et al., 1989; Singer et al., 2003; Verhaeghen & Marcoen, 1996), as well as transfer (Bråthen et al., 2022).

Based on the notion that transfer will occur provided that trained and untrained tasks rely on overlapping neural networks and shared cognitive processing, practice on MoL could be expected to transfer to tasks that also involve a binding component and tax hippocampal processing. However, despite pronounced training effects, transfer effects after MoL training have commonly been slim or nonexistent, especially among older adults (Brehmer et al., 2014; Gross et al., 2012; Verhaeghen & Marcoen, 1996). Similarly, in a previous study, we addressed training and transfer effects of MoL practice across the adult lifespan, by means of a smart phone training intervention where the transfer task involved associating names with faces. Despite high levels reached in the trained task, no transfer was observed in any of the age groups examined (Sandberg et al., 2021). A possible explanation for the lack of transfer effect is no or limited engagement of the associative processes that MoL is assumed to strengthen during the face-name transfer task due to the lack of explicit instruction to encourage the participants to utilize the learned mnemonic for the transfer tasks. If the strategy was in fact not applied at all in the transfer task, the cognitive (and neural) processes which were supposedly strengthened in training should not be expected to carry over to the untrained task. Indeed, our previous version of the face-name transfer task lacked an instruction to utilize MoL also in the transfer tests and allowed only a short time frame for encoding, which is known to limit spontaneous utilization of MoL (Jones et al., 2006).

The specific aim of the current study was to examine whether MoL practice (Fig. 1) would generalize to support the performance on associative transfer tasks if the participants were explicitly instructed to use MoL during the transfer tasks (Cavallini et al., 2010; Dunlosky et al., 2013) and given enough time to do so (Jones et al., 2006). The explicit instruction was expected to elicit shared cognitive processes and overlapping neural networks during trained and untrained tasks. The nature of the to-be-remembered items could also be crucial, as names are generally more difficult to remember than other types of personal information (Brédart, 2019; Cohen, 1990). To examine if the nature of the items to be associated mattered (Festini et al., 2013), we included scene-object and face-occupation tasks in addition to the face-name transfer task (Fig. 1). Finally, based on previous demonstrations of reduced transfer in older age, we examined if transfer effects varied across the adult age span. Based on previous findings of smaller or nonexistent transfer effects in older adults and on hippocampal age-related deterioration, we expected the younger adults to display larger transfer effects.

Fig. 1

Loci memory training app: instructions, training game, and transfer tests. Note. a Start screen with information about how to sign up for the study and informed consent page. b Instruction videos about how to use the MoL to encode series of pictures by placing them at locations within a “memory palace” such as one’s home. Each video ended with a memory test in which the presented pictures had to be touched in the correct order. c Function allowing participants to save memory palaces as lists in the app (e.g., “the house”) and to add locations within each palace (e.g., “on front door,” “at the bottom of the stairs,” etc.). d MoL training game: Encode sequences of pictures presented in a self-paced manner, perform a distraction task (e.g., what is 2 + 5?), and answer by tapping pictures in correct order. e Transfer tests: At each test occasion, three paired associate cued recall tests were presented (face/name, face/occupation, scene/object). Instruction was provided prior to each test, along with a reminder to use the MoL. Feedback was provided after all three transfer tests had been completed. The faces shown in (e) are separate from the actual stimulus material used in the app due to copyright regulations for the images.

Materials and Methods

Participants

Participants were recruited through ads in one of the larger daily newspapers in Sweden, through Facebook groups, through a mailing list of a national pensioners’ organization, on the project home page, and through word of mouth. Interested participants were directed to the project’s webpage, where they were informed about the study and registered their interest through a web form. Upon signing up, each participant received a welcoming email with a username and activation code, along with a brochure with detailed information about the study, the app, and contact information for the researchers.

Data from the app was transferred in encrypted form from the users’ phones to the university’s data collection platform at close intervals (after completing a survey, after completing transfer tests, and every third training trial) and in accordance with the General Data Protection Regulation (GDPR). Participants’ data were assigned a separate randomized digit-letter code in the database. The study was approved by the Regional Ethical Review Board in Umeå (ref: 2018/373–31) and the Swedish Ethical Review Authority (ref: 2020–02509).

Of the 754 people signing up as interested and receiving detailed information, 419 persons agreed to participate by signing an informed consent in the app with a personal username and code. Of those, 358 participants between 19 and 86 years (220 women, 137 men, 1 other) met the inclusion criteria (signed an informed consent, answered a pre-training questionnaire, watched all instruction videos, completed the MoL practice tests, practiced with the MoL training game to a minimum of five trials, and upon reaching level 5 (corresponding to recalling a five-item sequence), completed the first set of transfer tests). Two cases were regarded as outliers in terms of the number of trained trials based on visual inspection of boxplots for each age group, with 746 and 601 trials, respectively. These two cases were hence excluded from analysis, which resulted in 356 participants (age range 19–83 years, 219 women, 136 men, 1 other) being included in the study.

For the sake of studying possible effects of age upon training and transfer effects, we split the group into younger/middle age (20–54), middle age/older (55–69), and older (70–83) participants. The group assignment was based on longitudinal studies showing age-related episodic memory decline from around the years just before 60, escalating around 70 in population studies (Rönnlund et al., 2005), and on findings that age-related reductions in hippocampal functioning are relatively minor until the age of 70 (Salami et al., 2012). It was also in keeping with our previous study (Sandberg et al., 2021) and other studies focusing on age effects of mnemonic training (Brooks et al., 1999). The cutoff between the younger/middle-aged and middle-aged/older groups was set at 55 years to increase the chance that the youngest group consisted predominantly of people without declining episodic memory ability. The distribution across age groups was 134 younger/middle aged (20–54), 119 middle age/older (55–69), and 103 older (70–83) participants (see Table 1 for demographic data).
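The grouping rule can be written as a small function. This is only an illustrative sketch: the function name and the group labels as strings are assumptions, and because the youngest band is labeled 20–54 while the sample's age range starts at 19, the sketch assumes that every age below 55 falls into the youngest group.

```python
def age_group(age: int) -> str:
    """Map an age in years to one of the three age groups described above."""
    if age < 55:
        # Assumption: ages below the labeled 20-54 band (e.g., 19) are
        # also placed in the youngest group.
        return "younger/middle age"
    if age < 70:
        return "middle age/older"
    return "older"


# Ages at the group boundaries
groups_at_boundaries = {a: age_group(a) for a in (19, 54, 55, 69, 70, 83)}
print(groups_at_boundaries)
```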

Table 1 Demographics of the three age groups for the whole training sample

Demographics are presented in Table 1.

Materials and Procedure

Loci Memory Training App

The app was developed by a team of developers at Umeå University ICT systems and development department in close collaboration with the research group. The app was available for iPhone and Android, was distributed through App Store and Google Play, and was free of charge to download. It consisted of an informed consent form, pre-and post-training surveys, a series of videos teaching the MoL, texts about the MoL, a memory training game for MoL practice, and three parallel versions of three different transfer tests (see Fig. 1 and below).

Informed Consent

Upon downloading the app, participants were informed on its front page that it was locked and that it was part of a research project. Participants who had found the app through App Store/Google Play were directed to the project home page to sign up. Participants were then guided to an informed consent form in the app, which presented detailed information about the study: that participation meant aiming to practice with the app for 3 months, preferably 10 min each day, and trying to reach as high a level as possible in the memory training game; that participation was voluntary and that they could quit at any time; how their personal data would be stored; how to contact the researchers; and how to proceed if they wanted their personal data removed. They were instructed to read through the information carefully, to contact the researchers if they had any questions, and, if they wanted to participate, to type in the username and activation code and tick “I consent.” After they had activated the app, a short survey (see Tables 1 and 3) had to be answered.

Instruction Videos

After completion of the survey, four instruction videos had to be viewed in consecutive order to further unlock the app (Fig. 1). In the videos, an experienced mnemonics teacher, himself a memory athlete (I. Z.), gave short lectures about MoL and how to use it to encode and serially recall pictures. The first video included an introduction to memory training. In the next three videos, MoL was explained in detail and demonstrated by I. Z. Participants were instructed how to use, e.g., one’s own body, car, or home as a memory palace to memorize pictures by forming vivid mental images; how to place, in the mind’s eye, one, two, or three pictures at each locus within that memory palace; and how to re-walk the mental route at recall, visualizing each picture at its locus. These three videos each ended with a training test, where the short series of pictures that had been used in the demonstrations had to be serially recalled. These tests were included to increase the likelihood that everyone had understood the technique. Lastly, two optional videos contained general tips about mnemonic training. All videos could be re-watched at any time, and additional written information was available throughout training as well.

Completion of the four obligatory videos with associated tests unlocked the full version of the app and participants could now access the memory training game. Upon the first launch of the game, participants were asked to ensure that they had understood how to use the MoL and to otherwise watch the videos again and/or to read the written information.

Memory Training Game

In the memory training game (Fig. 1), participants were instructed to use the MoL to encode and retrieve pictures. Images (e.g., a rose, a sheep, a box, etc.) were presented one by one and remained on screen until “next” was pressed, so that time for visualization was tailored to each individual’s current level of proficiency in the MoL. All participants started the game at level 1 with a sequence length of two items. After the whole sequence had been shown, a short distraction task appeared, and then a screen with thumbnail versions of all pictures followed. The task was to touch them in the same order as they had been presented. There was no time limit for answering. If the wrong picture was touched, this was indicated by a red color, and the participant could try again up to four times. If five errors were made, the message “Too many errors, please try again” appeared.

If fewer than five errors were made on a trial, the level was passed, and the next level was unlocked, in which the sequence length increased by one item. If the level was not passed, the participant instead continued to practice on the same level. This ensured that participants always practiced on their current proficiency level. The already passed levels were kept unlocked so that difficulty could be adjusted downward at any time. Maximum level was 99 (100 items in sequence).
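The level rules described above can be summarized in a short sketch. It is not the app's actual code; the function and constant names are hypothetical, and the sketch assumes that a new level is unlocked only when the trial was played at the highest level currently unlocked.

```python
MAX_LEVEL = 99   # stated maximum level (100 items in sequence)
MAX_ERRORS = 5   # five errors on a trial means the level is not passed


def sequence_length(level: int) -> int:
    """Level 1 uses two items and each level adds one item."""
    return level + 1


def update_unlocked(highest_unlocked: int, played_level: int, errors: int) -> int:
    """Return the highest unlocked level after one trial.

    Assumption: replaying an already passed (lower) level never unlocks
    anything new; only passing the current top level does.
    """
    passed = errors < MAX_ERRORS
    if passed and played_level == highest_unlocked and highest_unlocked < MAX_LEVEL:
        return highest_unlocked + 1
    return highest_unlocked
```

Under these assumptions, `sequence_length(4)` gives the five-item sequence that has to be recalled before level 5 is unlocked, and `sequence_length(99)` gives the 100-item maximum.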

Transfer Tests

In order to test for transfer effects, we included three in-house developed transfer tests: face/name, face/occupation, and scene/object paired associates cued recall memory tests (Fig. 1). Each participant performed all three tests up to three times during the training period. The tests were triggered when participants reached certain levels in training. The first transfer test occurred when reaching level 5 (so that level 4, corresponding to memorizing a sequence five items long, had been passed in training). The second transfer test occurred when (if) reaching level 25 and the third transfer test when (if) reaching level 45. The last two levels were chosen in order to test for transfer effects after reaching a moderate and high proficiency in the MoL, respectively, based on training task results after MoL training in younger and older adults in previous studies. In a study by Baltes & Kliegl (1992), younger adults reached a performance level ranging from 20 to 29 items, and older adults reached a level ranging between 7 and 22 after 35 sessions of practice. Later studies with extensive training have shown higher levels: up to 45 for older adults in de Lange et al. (2018) and even higher for younger adults in Dresler et al. (2017). Participants were explicitly instructed in all three tests to try to use the MoL to perform the task. Prior to each transfer test, participants received written instructions and practiced the task by completing one item before proceeding (Fig. 1). The tests were performed in the same order on each occasion (1: face/name, 2: face/occupation, 3: scene/object).
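As a minimal sketch of the triggering rule (not the app's implementation; all names are hypothetical), the mapping from training level to transfer test occasion could look as follows, assuming each occasion is administered only once per participant.

```python
# Training level reached -> transfer test occasion, as described in the text
TRANSFER_TRIGGERS = {5: 1, 25: 2, 45: 3}

# Fixed order of the three tests within every occasion
TEST_ORDER = ("face/name", "face/occupation", "scene/object")


def triggered_occasion(new_level, completed_occasions):
    """Return the transfer test occasion triggered by reaching `new_level`.

    Returns None if the level triggers no test or if that occasion has
    already been completed (assumption: each occasion runs only once).
    """
    occasion = TRANSFER_TRIGGERS.get(new_level)
    if occasion is None or occasion in completed_occasions:
        return None
    return occasion
```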

In the face/name test, 20 faces with written Swedish names above them were presented sequentially without repetition in a self-paced manner. The task was to memorize each face/name combination. After completing the whole sequence, a short distractor task was presented, in which participants judged whether three numbers were in the correct order. In the recall phase, all original faces and 50% new faces were presented sequentially in a random order, without names, in a self-paced manner, thus yielding 30 items (20 old plus 10 new). The task was to pick the correct name from one of three alternatives which had the correct second letter (for example, _n____ for Anders) or, if the face was not recognized, to choose “Not seen.” A multiple-choice format was chosen over free text so that familiarity with keyboard writing would not affect performance. The second letter was chosen, rather than the first, to make guessing from the cue difficult.
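The example cue for Anders suggests a simple masking rule: every letter except the second is replaced by an underscore. The sketch below implements that reading; the function name is hypothetical, and how the app actually rendered the alternatives is not specified in the text.

```python
def second_letter_cue(word: str) -> str:
    """Mask every letter except the second one with an underscore."""
    if len(word) < 2:
        raise ValueError("the cue needs a word with at least two letters")
    return "_" + word[1] + "_" * (len(word) - 2)


print(second_letter_cue("Anders"))
```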

After completion of the face/name test, instructions were given about the next test, face/occupation, in which faces were presented together with occupations, and the test was then performed in the exact same manner as the face/name test. Lastly, instructions for scene/object test were presented, in which pictures of indoor and outdoor scenes were presented together with names of objects (e.g., spoon, vase, apple), and then the test was performed in the same manner.

For each test at each test occasion, picture/item pairs were put together by randomly selecting a picture and a word from a pool of pictures (180 faces of different ages and ethnic backgrounds—90 male and 90 female, 90 indoor and outdoor scenes) and words (90 common Swedish names—45 male and 45 female, 90 common Swedish occupations, 90 common objects—e.g., spoon, vase, apple, etc.). Name and face were matched according to sex. The items already drawn were not reused for that participant, so that the item pools decreased by one each time an item was presented. Thus, at each test occasion (Transfertest 1, Transfertest 2, Transfertest 3), the tests consisted of randomly generated series of unique picture/word combinations, from the same picture and word item pools. This process was performed separately for each participant, so that the item combinations and item pair order were random across participants.
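The pairing procedure amounts to sampling without replacement from two pools that shrink across test occasions. The following is a minimal sketch under stated assumptions: the item labels are placeholders rather than the actual stimuli, the function name is hypothetical, only the 20 studied pairs are drawn (not the additional new pictures shown at recall), and the sex matching used for face/name pairs is not modeled.

```python
import random


def draw_pairs(pictures, words, n_pairs, rng):
    """Draw random picture/word pairs, removing drawn items from both pools."""
    pairs = []
    for _ in range(n_pairs):
        picture = pictures.pop(rng.randrange(len(pictures)))
        word = words.pop(rng.randrange(len(words)))
        pairs.append((picture, word))
    return pairs


# One participant's pools for the scene/object test (placeholder labels)
rng = random.Random(0)
scenes = [f"scene_{i:02d}" for i in range(90)]
objects = [f"object_{i:02d}" for i in range(90)]

occasion_1 = draw_pairs(scenes, objects, 20, rng)
occasion_2 = draw_pairs(scenes, objects, 20, rng)
```

Because drawn items are removed from the pools, no picture or word can recur for the same participant on a later occasion, while different participants (different random states) receive different combinations.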

After all three tests had been completed, the results were presented, and the training could be continued.

Statistical Analysis

Statistical analyses were performed using IBM SPSS Statistics Package version 26. To examine differences in amount of practice between the three age groups, we performed a one-way analysis of variance (ANOVA) with age group as between-groups factor and number of practiced trials as dependent variable. The same analysis was performed with mean level reached in training as a dependent variable.
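For readers who want to see what the one-way ANOVA computes, the F statistic can be sketched in plain Python. This is not the SPSS procedure used in the study, and the numbers below are made-up toy values chosen only to make the arithmetic easy to follow; they are not study data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy numbers of training trials for three hypothetical age groups
toy_trials = [[10, 12, 14], [20, 22, 24], [30, 32, 34]]
f_value, df_between, df_within = one_way_anova_f(toy_trials)
print(f_value, df_between, df_within)
```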

To test whether background variables predicted the highest level reached in training, a multiple regression analysis with maximum level reached in training as the outcome variable was performed.

To investigate transfer effects, two mixed analyses of variance (ANOVAs) were performed—one including the participants reaching level 25 in training and one including the participants reaching level 45 in training (Fig. 2).

Fig. 2

Flow of participants

Results

Training Task

As in our previous study (Sandberg et al., 2021), the number of training trials increased as a function of age group (Fig. 3a), but the maximum sequence length did not differ between age groups (Fig. 3b). Thus, older individuals had to train more trials to reach the same MoL level as the younger adults (Fig. 3c).

Fig. 3

Training results. Note. a Mean (and 95% CI) of amount of training (number of completed training trials) for the three age groups in the total sample (n = 356: 134 young/middle age, 119 middle age/older, and 103 older). b Mean (and 95% CI) highest level (longest recalled sequence of pictures) reached in training for the three age groups in the total sample (n = 356: 134 young/middle age, 119 middle age/older, and 103 older). c Maximum level reached in training as a function of total number of training trials. Separate regression models for each age group revealed that this relation was significant for all age groups, ps < .001

A multiple regression analysis with maximum level reached in training as the outcome variable revealed that number of trials significantly predicted highest level reached in training for the total sample (Table 2). Pretest performance on the face/name transfer task was also a significant predictor, but neither age nor years of education predicted level reached in training. Computation of (unique) variance accounted for by each predictor suggested that number of trials had the largest influence on level reached in training.

Table 2 Predictors of highest level reached in MoL training task

Transfer Tasks

To reach transfer test 2, a level of proficiency corresponding to 25 correctly remembered items in the MoL game was required, and 155 participants succeeded in reaching that level (Fig. 2) (61 younger/middle age, 48 middle age/older, 46 older). The first analysis of transfer examined if the performance of these 155 individuals on the transfer tasks at Transfertest 2 was increased relative to that at Transfertest 1 using a 2 (TestOccasion [Transfertest 1, Transfertest 2]) × 3 (TestCondition [Face/Name, Face/Occupation, Scene/Object]) × 3 (AgeGroup [Young/middle age (20–54); Middle age/old (55–69); Older (70–83)]) mixed ANOVA with TestOccasion and TestCondition as within-subjects variables and AgeGroup as between-subjects factor. The critical main effect of TestOccasion was significant [F(1, 152) = 33.47, MSE = 8.89, p < 0.001, ηp² = 0.18], showing that the individuals who reached MoL level 25 significantly improved their performance on the transfer tasks (Fig. 4a). Cohen’s d was 0.18. The main effects of TestCondition [F(1.84, 280.20) = 123.51, MSE = 12.05, p < 0.001, ηp² = 0.45] and AgeGroup [F(2, 152) = 6.18, MSE = 42.71, p = 0.003, ηp² = 0.075] were also significant, but no interaction effects reached significance. Thus, a transfer effect at MoL level 25 was observed across all three transfer tasks and age groups (see Fig S1 a, b for details).
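As a consistency check, the reported partial eta squared values can be recovered from the F ratios and their degrees of freedom using the standard identity below. The worked numbers use only the values reported in this paragraph; the check itself is an addition and not part of the original analysis.

```latex
\eta_p^2
  = \frac{SS_{\mathrm{effect}}}{SS_{\mathrm{effect}} + SS_{\mathrm{error}}}
  = \frac{F \cdot df_{\mathrm{effect}}}{F \cdot df_{\mathrm{effect}} + df_{\mathrm{error}}}

% TestOccasion:  F(1, 152) = 33.47
\eta_p^2 = \frac{33.47 \cdot 1}{33.47 \cdot 1 + 152} = \frac{33.47}{185.47} \approx 0.18

% TestCondition: F(1.84, 280.20) = 123.51
\eta_p^2 = \frac{123.51 \cdot 1.84}{123.51 \cdot 1.84 + 280.20} \approx \frac{227.26}{507.46} \approx 0.45

% AgeGroup:      F(2, 152) = 6.18
\eta_p^2 = \frac{6.18 \cdot 2}{6.18 \cdot 2 + 152} = \frac{12.36}{164.36} \approx 0.075
```

All three results agree with the reported effect sizes.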

Fig. 4

Transfer test results. Note. a Mean and 95% CI for number of hits across transfer test conditions and age groups from Transfertest 1 to Transfertest 2. b Mean and 95% CI for number of hits across age groups and transfer test conditions from Transfertest 1 to Transfertest 2 and 3

Next, we consider the results from the group of participants who reached Transfertest 3 (n = 60; 24 Younger/middle age, 19 Middle age/older, and 17 Older), which required reaching a MoL level of 45 correctly remembered items. A 3 (TestOccasion [Transfertest 1, Transfertest 2, Transfertest 3]) × 3 (TestCondition [Face/Name; Face/Occupation; Scene/Object]) × 3 (AgeGroup [Younger/middle age; Middle age/older; Older]) mixed ANOVA revealed a significant main effect of TestOccasion [F(1.77, 101.08) = 14.24, MSE = 7.31, p < 0.001, ηp² = 0.20]. Subsequent post-hoc tests with LSD between test occasions (Fig. 4b) revealed a significant performance increase from Transfertest 1 to Transfertest 2 (Mean difference = 0.90, p = 0.005, Cohen’s d = 0.16), and from Transfertest 2 to Transfertest 3 (Mean difference = 0.53, p = 0.022, Cohen’s d = 0.09). The difference between Transfertest 1 and Transfertest 3 was also significant (Mean difference = 1.43, p < 0.001, Cohen’s d = 0.26). In addition, the ANOVA revealed a significant main effect of TestCondition [F(1.73, 98.60) = 63.98, MSE = 11.46, p < 0.001, ηp² = 0.53], but no main effect of AgeGroup and no interaction effects. Thus, again, the transfer effect generalized across transfer tasks and age groups also at MoL 45 (see Fig S1 c, d for details).
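The fractional degrees of freedom reported for the within-subjects effects suggest that a sphericity correction (such as Greenhouse-Geisser or Huynh-Feldt) was applied, although the text does not name one. Under that assumption, the implied correction factor can be recovered by dividing the reported degrees of freedom by their uncorrected values, as in this sketch (the function name is hypothetical).

```python
def implied_epsilon(df_effect_reported, df_error_reported,
                    n_levels, n_participants, n_groups):
    """Return the correction factor implied by each reported df.

    Uncorrected df for a within-subjects factor in a mixed design:
      effect: n_levels - 1
      error:  (n_levels - 1) * (n_participants - n_groups)
    """
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_participants - n_groups)
    return df_effect_reported / df_effect, df_error_reported / df_error


# TestCondition at level 25: F(1.84, 280.20), n = 155, three age groups
eps_condition_25 = implied_epsilon(1.84, 280.20, 3, 155, 3)
# TestOccasion at level 45: F(1.77, 101.08), n = 60, three age groups
eps_occasion_45 = implied_epsilon(1.77, 101.08, 3, 60, 3)
# TestCondition at level 45: F(1.73, 98.60), n = 60, three age groups
eps_condition_45 = implied_epsilon(1.73, 98.60, 3, 60, 3)

for label, eps in [("TestCondition, level 25", eps_condition_25),
                   ("TestOccasion, level 45", eps_occasion_45),
                   ("TestCondition, level 45", eps_condition_45)]:
    print(label, round(eps[0], 3), round(eps[1], 3))
```

For each effect, the two ratios agree to within rounding of the reported values, which is what a single multiplicative correction applied to both degrees of freedom would produce.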

Finally, we characterized the three subgroups of individuals who (i) practiced > 25 trials and did not reach transfer test 2 (n = 82), (ii) practiced > 45 trials and reached transfer test 2 but not 3 (n = 59), and (iii) reached transfer test 3 (n = 60). The best performing sub-group (iii) was younger and had more years of education and out-performed the other groups on all tasks already at transfer test 1 (Table 3). It is noteworthy that sub-group (ii) practiced significantly more on the MoL app (ca 115 trials) than the best performing sub-group (ca 70 trials), but still did not reach the critical MoL level of 45.

Table 3 Differences between participants who did and did not reach each transfer test occasion despite practicing at least the minimum amount of trials required

Comparisons with a Previous Data Set

Because the current study lacked a control group, we evaluated whether the observed results likely stem from mere retest effects by performing additional analyses using data from a previous study in which participants were tested on the face-name transfer test (Sandberg et al., 2021), comparing performance across transfer test occasions. The analyses are detailed in the supplementary material and presented in brief here.

Two analyses were performed, each comparing performance on the face-name transfer test in the previous study with that in the current study: one at test occasions 1 and 2 for those who reached Transfertest 2, and one at test occasions 1, 2, and 3 for those who reached Transfertest 3. Dependent variables were percent correct responses at each test occasion. First, a 2 (TestOccasion [Transfertest 1, Transfertest 2]) × 2 (StudyGroup [Previous; Current]) × 3 (AgeGroup [Younger/middle age; Middle age/older; Older]) mixed ANOVA was performed. Participants were 165 adults from the previous study (33 younger/middle aged, 71 middle aged/older, and 61 older) and 156 from the current study (60 younger/middle aged, 48 middle aged/older, and 47 older). The analysis revealed a significant interaction between TestOccasion and StudyGroup [F(1, 315) = 22.7, MSE = 0.16, p < 0.001], such that the participants from the current study increased their performance, while the participants from the previous study did not.

Secondly, a 3 (TestOccasion [Transfertest 1, Transfertest 2, Transfertest 3]) × 2 (StudyGroup [Previous; Current]) × 3 (AgeGroup [Younger/middle age; Middle age/older; Older]) mixed ANOVA was performed. Participants were 36 adults from the previous study (8 younger/middle aged, 18 middle aged/older, and 10 older) and 60 from the current study (24 younger/middle aged, 19 middle aged/older, and 17 older). The analysis revealed a significant interaction between TestOccasion and StudyGroup [F(2, 180) = 5.60, MSE = 0.015, p = 0.004], such that the participants from the current study increased their performance from Transfertest 1 to 2 and 1 to 3, while the participants from the previous study did not increase their performance from Transfertest 1 to 2 or 2 to 3, but in fact decreased their performance from Transfertest 1 to 3.

Discussion

The main aim of the current study was to examine whether up to 3 months of app-based MoL practice would generalize to support the performance on untrained associative transfer tasks if the participants were explicitly instructed to use MoL during the transfer tasks and given enough time to form associations. This approach was inspired by findings that explicit instructions to make use of a learned strategy increase the likelihood of obtaining transfer (Cavallini et al., 2010) and our previous observation of no transfer to an associative face-name transfer task when no explicit instructions were given, despite high levels of MoL proficiency (Sandberg et al., 2021). Strikingly, we found that individuals in all age groups who reached the proficiency of serially retrieving 25 items after MoL practice increased their performance on all three associative transfer tasks relative to their initial levels, and individuals, again in all age groups, who were even more successful on MoL training and reached the level of 45 continued to increase their performance levels on the transfer tasks even further.

A distinctive feature of our study was that transfer was measured when specific proficiency levels on the criterion task had been reached, regardless of the amount of training up to that point. Thus, the participants who performed the transfer tests were equally proficient on the training task, but they had not necessarily received the same training dose. In fact, we found that older adults in general had to complete many more training trials to reach the same MoL level as the youngest individuals (cf. Brehmer et al., 2007; Kliegl et al., 1989). Hence, the lack of an effect of age group in the analyses of the transfer tests should be qualified by the existence of age differences in training dose. Crucially, however, once the older adults had reached the same level as the younger groups through more training, they showed similar transfer effects. An app-based approach seems ideal in this context, as it accommodates individualized training regimes. Although skill level in the training task and its relation to transfer effects has not previously been investigated, our findings are in line with prior findings of larger transfer gains for individuals who improve more on the training task (Cavallini et al., 2019; Jaeggi et al., 2011; Pahor et al., 2022; Zinke et al., 2014). Future studies should focus on whether training gain or the skill level obtained in training is crucial for transfer.

It is important to stress that even though skill level might be an important factor, our previous study revealed no transfer despite marked MoL improvement when the participants were not explicitly instructed to make use of the mnemonic during the transfer task (Sandberg et al., 2021). With the current design, we cannot tease apart which aspects are crucial, and it is possible that skill level in combination with explicit instructions is of key importance. It remains unclear from the present study how participants used the MoL while solving the transfer tasks, but the explicit instruction to utilize MoL was intended to increase the likelihood that the picture-word pairs to be remembered were mentally placed at specific locations, just like the single items in training. Strengthening item-location binding during training may in turn have triggered binding processes, similar to those engaged during MoL execution, when associations were formed between the various item pairs (e.g., Reggente et al., 2020). In effect, strengthening of associative processing by MoL training, likely taxing hippocampal computation (Bråthen et al., 2018; Dulas et al., 2021; Jones et al., 2006), could have translated into more efficient processing on novel tasks for which similar neurocognitive processes should be efficient (for a related argument on the importance of neurocognitive overlap between training and transfer tasks, see Dahlin et al., 2008, and Thibault et al., 2021). Future studies should investigate how participants apply the strategy to a task dissimilar to the trained one and should target the neural underpinnings of transfer after such training. It is important to point out that all transfer tests were performed after the initial instruction about how to use the MoL. Thus, the increase in performance does not reflect strategy application alone.

According to previous definitions, our results would fall under the term near transfer (Gobet & Sala, 2022), whereas those demonstrated by Thibault et al. (2021), with transfer from complex syntax to tool use and vice versa, would arguably be far transfer. A definition based on overlap in neural processes might be more fruitful in this regard.

A limitation of the current app-based approach is that no control group was included, so we cannot rule out that some of the increases on the transfer tests reflected retest effects. However, as noted, we observed no transfer effect at all after MoL training in our previous study (Sandberg et al., 2021), which is in agreement with prior reports of little or no improvement for control groups, and even of little or no transfer (retest) effect for training groups (de Lange et al., 2017; Gross et al., 2012). Moreover, we included a training trial before each transfer test on each occasion, so the participants were familiar with the stimulus material and response mode before starting, which should limit retest effects. To evaluate whether the results likely stem from mere retest effects, we performed additional analyses including the data from our previous study, in which we observed that the participants in the current study increased their performance on the transfer tests while those in the previous study did not. The results of these analyses are presented in the supplementary material. These analyses do not replace a control group design, and the present results therefore need to be interpreted with caution and must be confirmed in future studies that include a control group.

Because data were collected entirely through the app, we did not include screening tests for dementia or cognitive impairment among the older participants, nor did we screen them for other factors such as metacognitive skills, motivation, or individual differences in learning strategies. This should be kept in mind when interpreting the results. When comparing the previous data with the current data, we cannot know whether the frequency of cognitive impairment differed between the study groups, but importantly, the comparison is made between participants of equal proficiency level in training. Future studies should investigate app-based episodic memory training in people with dementia or cognitive impairment, and whether the outcome depends on such and related factors.

Since the in-house developed transfer tests have not been previously validated, we cannot be sure that the versions used are parallel. We did, however, randomize the pictures and words so that no two tests consisted of the same item pairs in the same order. Given this procedure, we believe that item effects should not have affected the results.

Another consequence of collecting the data via an app might have been that the sample consisted of people who were familiar with smartphone technology. Familiarity with mobile devices might nonetheless have affected the older participants’ performance more than that of the younger participants. Lastly, as is the case in many training interventions, the sample was highly educated, which might limit the generalizability to the broader population.

In conclusion, our findings suggest that app-based strategy training, under certain conditions, such as a sufficient skill level in combination with explicit instructions, can enhance episodic memory beyond the specific training task. The app-based approach offers personalized training schemes tailored to individual needs, and training can be continued until desired skill levels are reached on the criterion task, which may have clinical applications. Given the lack of studies investigating the neural underpinnings of transfer, an important avenue for future research is to examine these and, based on findings in the working memory domain, to investigate whether neural overlap between trained and transfer tasks after mnemonic training is a key factor. Future research should also examine how best to stimulate the use of learned strategies when solving untrained tasks, and the boundary conditions for such intentional generalization of cognitive skills to novel conditions.