Introduction

Objective and Rationale

Visionaries foresee a future in which you can improve your mind by playing video games (Gee, 2003; McGonigal, 2011; Prensky, 2006), but rigorous scientific research is needed to determine how best to achieve this goal, or whether it can be achieved at all (Mayer, 2014, 2019; Plass et al., 2019). The aim of the present study is to determine the effectiveness of a game designed to train the executive function skill of updating, based on a cognitive theory of game-based training (Parong et al., 2020). Updating involves continually monitoring incoming information and replacing outdated or irrelevant items in working memory with new information as relevant for completing a task (Morris & Jones, 1990).

In the present study, we measure updating skill by using the n-back task, in which the participant sees a series of rapidly presented letters on a screen and must press a key each time the current letter is the same as the one presented n trials previously (e.g., 3 trials back in a 3-back task). For example, on a 3-back task, the participant might see X followed by T followed by R followed by V followed by T followed by X, and so on for perhaps 100 trials. In this case, the participant should press the key for the second T because a T appeared three trials earlier. In the 3-back task, the participant must continually update working memory with the letters that were presented 3, 2, and 1 trials previously, replacing them each time a new letter appears. The score on the n-back task is the number of hits (i.e., pressing the key when the current letter also appeared n trials earlier) minus the number of false alarms (i.e., pressing the key when the current letter did not also appear n trials earlier).
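To make the scoring rule concrete, the sketch below shows one way an n-back accuracy score (hits minus false alarms) could be computed from a letter stream and a record of key presses. This is purely illustrative; the function name and data layout are our own and are not the scoring code used in the study.

def nback_score(letters, pressed, n=3):
    """letters: the presented letters, in order.
    pressed: set of 0-based trial indices on which the key was pressed.
    Returns hits minus false alarms."""
    hits = false_alarms = 0
    for i, letter in enumerate(letters):
        is_target = i >= n and letter == letters[i - n]
        if i in pressed:
            if is_target:
                hits += 1
            else:
                false_alarms += 1
    return hits - false_alarms

# The example from the text: X, T, R, V, T, X -- only trial 4 (the second T) is a target.
print(nback_score(list("XTRVTX"), pressed={4}, n=3))  # prints 1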

Updating is recognized as a fundamental executive function skill, that is, a cognitive skill involved in monitoring and controlling one’s cognitive processing (Miyake et al., 2000). We focus on games for improving executive function because executive function has been shown to be related to academic success (Banich, 2009; Best, 2014; Miyake et al., 2000). In short, we aim to determine whether the executive function skills learned in playing a game that is designed based on cognitive principles will transfer to improvements in the same targeted skill tested outside the game environment.

The Elusive Search for Transfer of Cognitive Skill in Game-Based Training

Using computer games for the training of cognitive skills has a somewhat disappointing history characterized by strong claims based on weak evidence (Mayer, 2014; O’Neil & Perez, 2008; Plass et al., 2019; Tobias & Fletcher, 2011; Wouters & van Oostendorp, 2017). Research on game-based training of cognitive skills can be divided into three strands (Mayer et al., 2019). In the first strand, researchers found that playing commercial computer games built for entertainment (which can be called off-the-shelf games) generally did not result in improvements in cognitive skills tested outside of the game environment (Mayer, 2014, 2019). The main exception is that action video games such as Medal of Honor or Unreal Tournament have been shown to improve perceptual attention skills performed in contexts outside the game (Bediou et al., 2018). In the second strand, researchers found that playing computer games intended for cognitive training (sometimes called brain-training games) also generally did not result in improvements in cognitive skills tested outside of the game (Mayer, 2014, 2019). For example, Bainbridge and Mayer (2018) reported that playing the brain-training game Lumosity for up to 80 15-min sessions did not have substantial impacts on players’ performance on the targeted skills tested outside the game context. Kable et al. (2017) reported similar results. Hardy et al. (2015) found some positive effects, but their work has been criticized on methodological grounds and because the authors had a financial interest in the company producing Lumosity (Simons et al., 2016).

In contrast to the disappointing results with off-the-shelf games and brain-training games, in the third strand, researchers are beginning to find that playing computer games specifically designed for cognitive training based on cognitive theory (which we call theory-based games) can cause improvements in targeted cognitive skills tested outside the game environment (Anguera et al., 2013; Parong et al., 2017, 2020). For example, Parong et al. (2017, 2020) sought to build games based on cognitive principles of skill learning (Anderson & Bavelier, 2011; Ericsson, 2009; Fitts & Posner, 1967; Johnson & Priest, 2014; Posner & Keele, 1968; Rigby & Ryan, 2011; Singley & Anderson, 1989), as summarized in the left columns of Table 1. In a series of experiments, Parong et al. (2017, 2020) found replicated evidence across four experimental comparisons that playing a game aimed at training the executive function skill of shifting, All You Can ET, resulted in significantly greater pretest-to-posttest gains on non-game shifting tasks as compared to a control group. Shifting is the executive function skill of being able to rapidly and effectively change from one cognitive task to another (Miyake et al., 2000). Similarly, Anguera et al. (2013) developed a theory-based game called NeuroRacer, which was successful in improving executive function skills in older adults.

Table 1 Six components in a cognitive theory of game-based training

Given the relatively small research base, in the present study, we seek to extend this promising approach by examining a new game that we developed based on the cognitive theory of game-based training, CrushStations, which we designed to train the executive function skill of updating. In line with previous research, CrushStations was created based on six principles of game-based training as summarized in Table 1 (Parong et al., 2020). After creating and testing CrushStations, we shared the game as a free app available on Android and Apple devices. This paper reports our test of the effectiveness of the game we created in promoting the executive function skill of updating, and examines whether the effects are limited to the targeted cognitive skill rather than cognitive functioning in general.

Theory and Predictions

The focus of the current study is on what can be learned from playing a well-designed cognitive skill training game, as indicated by transfer performance on non-game tasks. Transfer refers to the effects of prior learning on new learning or performance on a novel task (Mayer, 2011). Table 2 summarizes three views of how transfer works: specific transfer, specific transfer of general skill, and general transfer.

Table 2 Three views of transfer in game-based training

As shown in the first section of Table 2, the specific transfer view holds that learning one thing will facilitate learning another thing to the extent they share common elements (Singley & Anderson, 1989; Thorndike & Woodworth, 1901). According to this view, playing CrushStations in previous sessions should help people play better in a new session, because the new session involves exercising the same skill (i.e., updating) in the same context (i.e., the game). The specific transfer view predicts that CrushStations players will show an improvement in game performance metrics—such as the highest level reached—across the four sessions of game playing (Hypothesis 1).

As shown in the second section of Table 2, the specific transfer of general skill view (Anderson & Bavelier, 2011; Sims & Mayer, 2002; Singley & Anderson, 1989; Wertheimer, 1959) holds that when a cognitive skill (e.g., updating) is learned in a game environment, there will also be an improvement in the same skill applied outside the game environment, because the same general skill can be used across contexts. According to this view, playing CrushStations will cause general improvement in updating skill that can be detected when updating skill is tested outside a game context (e.g., in an n-back task). The specific transfer of general skill view predicts that CrushStations players will show a greater pretest-to-posttest gain than a control group on tests of updating skill performed outside the game context—such as accuracy score on an n-back test involving letters (Hypothesis 2).

As shown in the third section of Table 2, the general transfer view (Mayer, 2011; Singley & Anderson, 1989) holds that when a cognitive skill is learned in a game, there will also be an improvement in other skills applied outside the game because cognitive functioning is being improved in general during game play. According to this view, playing CrushStations will improve the player’s mind, thereby causing improvements on other cognitive skills besides updating when tested outside the game context. The general transfer view predicts that CrushStations players will show a greater pretest-to-posttest gain than a control group on tests of other cognitive skills, such as visual working memory as measured by score on the visuospatial sketch pad working memory task (Hypothesis 3). We chose the visuospatial working memory task because it taps a basic cognitive skill that has been examined in game research (Mayer, 2014) but is not specifically targeted by CrushStations. Visual working memory refers to the ability to actively keep visual information in mind for a short period of time.

Method

Participants and Design

The participants were 91 undergraduate students from a public university in southern California, who completed all sessions of the study. Participants received credit towards a class requirement. The mean age was 19.26 years (SD = 1.37), with a range from 18 to 25 years. Sixty-eight participants identified as women and 23 identified as men. In a between-subjects design, 45 participants were randomly assigned to play the CrushStations game (CrushStations group) and 46 were assigned to play the Bookworm game (control group).

Materials and Apparatus

The materials consisted of a consent form, two computer-based cognitive skill tests, two computer games, a demographic questionnaire, and a postquestionnaire. The two cognitive tasks were an n-back task and a visuospatial sketch pad working memory task, created and implemented using PsyScope (Cohen et al., 1993). The two games were the custom video game, CrushStations, and a commercial word search game, Bookworm, which served as an active control. The postquestionnaire asked about the participant’s experience in playing the game by gauging their affect (i.e., feelings about the game, indicated by items 2 and 3 in Table 3), motivation (i.e., willingness to persist or exert effort in the game, indicated by items 4, 5, and 7 in Table 3), and level of challenge (indicated by item 6 in Table 3), and was administered through Qualtrics. The consent form and demographic questionnaire were administered on 8.5×11″ white paper. There was also a reading comprehension test, but we excluded it from this paper because it was too easy for our participants, resulting in a ceiling effect.

Table 3 Mean rating (and standard deviation) by two groups on seven self-report items

The n-back task was used because it is a classic updating task (Jaeggi et al., 2010), and because the custom game being used in this study, CrushStations, was designed to have cognitive demands similar to those of an n-back task. Thus, it seemed prudent to use the n-back as a cognitive measure of specific transfer of a general skill (i.e., applying the learned skill in a new context). The task included 3 blocks of trials: 2-back, 3-back, and 4-back. Each block consisted of 30 trials, with 10 targets per block. As summarized in Fig. 1, black English letters were presented one at a time in the center of a white screen for 500 ms, with a black “+” fixation point presented in the same location for 2500 ms after each letter. Participants were asked to press the space bar every time the letter on the screen matched the letter that was presented n trials back. They were given the opportunity at the beginning of each block to ask questions if they did not understand the presented instructions. The task was designed in PsyScope, which also captured responses, and took approximately 5 min to complete. The primary dependent measure from the n-back task was an accuracy score based on hits minus false alarms for the 2-back, 3-back, and 4-back tasks.

Fig. 1

A depiction of four trials from the 3-back task. One black English letter was presented at a time (500 ms), followed by a black “+” fixation point (2500 ms). In this example, participants should press the spacebar only when the second “A” appears on the screen, because it is the same letter that was shown 3 trials back (trial 4 = hit). Pressing the spacebar on any other trial in this example would be counted as a false alarm. The 2-, 3-, and 4-back blocks each consisted of 10 target trials out of 30 trials
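For readers who want a concrete picture of the block structure, the sketch below generates one 30-trial block with exactly 10 targets for a given n. It is an illustration under assumptions (the consonant alphabet and the rule for avoiding accidental repeats are our own choices), not the PsyScope script used in the study.

import random

def make_nback_block(n, n_trials=30, n_targets=10, alphabet="BCDFGHJKLMNPQRSTVWXZ"):
    """Generate a letter sequence with exactly n_targets trials whose letter
    matches the letter shown n trials earlier."""
    target_positions = set(random.sample(range(n, n_trials), n_targets))
    letters = []
    for i in range(n_trials):
        if i in target_positions:
            letters.append(letters[i - n])  # repeat the letter from n trials back
        else:
            # choose any letter that does not accidentally create a target
            options = [c for c in alphabet if i < n or c != letters[i - n]]
            letters.append(random.choice(options))
    return letters, target_positions

letters, targets = make_nback_block(n=3)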

The visuospatial sketchpad task (VSSP task) was adapted from Ma et al. (2017) and was intended to tap visual working memory. According to Baddeley and Logie’s (1999) multicomponent model, the visual working memory skill required for the VSSP task is different from the updating skill required for playing CrushStations, and therefore the VSSP task would be considered a test of general transfer. The VSSP task was broken into two blocks: a 10-trial training block (4 blacked-out spaces per trial) and a 50-trial testing block (5, 6, 7, 8, or 9 blacked-out spaces per trial, randomly presented), with more squares to be memorized increasing the difficulty of the task. As summarized in Fig. 2, in this task, a 4×4 grid was displayed on a white screen, with 4, 5, 6, 7, 8, or 9 of the grid spaces blacked out. A blank screen was displayed for 1000 ms, then another 4×4 grid was displayed with 2 of the grid spaces blacked out. Participants were asked to determine whether the blacked-out spaces in the second grid were in the same positions as they had been in the first grid. For a match, they pressed the “A” key; for a no-match, they pressed the “L” key. There were 5 separate patterns for each trial type, with one correct/matched trial and one incorrect/no-match trial for each pattern (yielding 10 trials for each condition). The task was designed and responses were collected using PsyScope, and it took approximately 5 min to complete. The primary dependent measure for the VSSP task was an accuracy score based on percent correct out of 50 trials. This test involves memory for presented visuospatial information rather than the target skill of updating.

Fig. 2

A depiction of one trial from the Visuo-Spatial Sketch Pad (VSSP) working memory task. The participant should press the “A” key if the positions of the blacked-out spaces in the second grid match positions shown in the first grid, or press the “L” key if they do not
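As a concrete illustration of the scoring rule we are assuming (a match means that both blacked-out spaces in the probe grid appeared in the studied pattern), the sketch below scores a single VSSP trial and computes percent correct over a block. The function names and data layout are hypothetical, not the study's PsyScope code.

def vssp_trial_correct(study_cells, probe_cells, response_key):
    """study_cells: set of blacked-out (row, col) cells in the first 4x4 grid.
    probe_cells: set of the 2 blacked-out cells in the second grid.
    response_key: 'A' for match, 'L' for no-match."""
    is_match = probe_cells <= study_cells  # both probe cells were in the studied pattern
    return (response_key == 'A') == is_match

def percent_correct(trial_results):
    return 100.0 * sum(trial_results) / len(trial_results)

# Example: the probe repeats two studied positions, so 'A' is the correct response.
print(vssp_trial_correct({(0, 1), (2, 2), (3, 0), (1, 3)}, {(0, 1), (2, 2)}, 'A'))  # True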

The two computer games used in this study were the custom updating training game, CrushStations, and the active control game, Bookworm. In CrushStations (exemplified in Fig. 3), the premise is that sea creatures are trapped in bubbles and a hungry octopus is trying to eat them. There are five different types of sea creatures (jellyfish, lobster, starfish, crab, and stingray) that can be trapped in five different color bubbles (green, yellow, red, pink, and purple). The trapped creatures move across the screen from right to left, and up to six bubbles are on the screen at a time. Depending on the difficulty of the level, 1 to 6 bubbles on the left side of the screen can be occluded and players are required to give a response by clicking the creature type (from 1 to 5 options) and color (from 1 to 5 options). When players provide a correct response, the creature goes free; when they provide an incorrect response, the creature is eaten by the evil octopus. Difficulty is increased by requiring delayed responses (i.e., a bubble may be occluded, but participants may have to wait to respond), increasing the number of color and creature options, and by increasing the number of items per level. Each level has its own response rules (e.g., 3 bubbles occluded with 2 colors and 2 creatures in one level; 2 bubbles occluded with 3 colors and 4 creatures in another), and increasing levels provide additional elements to make the game progressively more difficult. A sufficient percentage of correct responses is required to progress to the next level. For this experiment, the game was accessed through the developer’s website (create.nyu.edu/dream). The game also is available as an app on Apple and Android devices.

Fig. 3

A screenshot from a level of the custom video game CrushStations. Players are required to remember the type of creature and the color of the bubble they are in as they move across the screen from right to left, and select the correct combination for the occluded items. Correct answers result in freeing the sea creature; incorrect answers result in the mean octopus eating the sea creature
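To connect the game mechanic to the targeted skill, the sketch below caricatures the updating demand built into CrushStations: the attributes (creature type and bubble color) of each occluded bubble must be held in working memory and replaced as new bubbles become occluded and old ones are resolved. This is a hypothetical illustration of the demand, not the game's actual implementation.

from collections import deque

class OcclusionQueue:
    """Working-memory load imposed by the occluded bubbles (illustrative only)."""
    def __init__(self):
        self.hidden = deque()  # (creature, color) pairs for occluded bubbles, oldest first

    def bubble_enters_occlusion(self, creature, color):
        self.hidden.append((creature, color))  # a new item the player must now hold in mind

    def check_response(self, creature_guess, color_guess):
        # The oldest occluded bubble reaches the response point first and is then dropped,
        # mirroring how players must replace old items with new ones.
        creature, color = self.hidden.popleft()
        return creature_guess == creature and color_guess == color

q = OcclusionQueue()
q.bubble_enters_occlusion("crab", "purple")
q.bubble_enters_occlusion("starfish", "green")
print(q.check_response("crab", "purple"))  # True: the crab is freed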

The control game, Bookworm, is a web-based Flash game in which players must make words from adjacent letter tiles in a 7 × 7 array to feed the bookworm. This is a word search game that relies on prior vocabulary knowledge and is not intended to train updating skill. A screenshot is shown in Fig. 4. This game was also accessed through the developer’s website.

Fig. 4

A screenshot from the control video game, Bookworm. Players are required to create words from the provided letters to feed the bookworm and earn points

The demographic questionnaire solicited information concerning the participants’ gender and age. The postquestionnaire consisted of seven statements, shown in the left column of Table 3, each rated on a 5-point Likert scale from 1 (strongly disagree) to 5 (strongly agree). The statements were intended to provide preliminary information about aspects of the players’ experience in playing the game for exploratory analyses. Cronbach’s alpha for the seven items was 0.72, and we relied on face validity in constructing the items.
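For reference, the sketch below shows the standard computation of Cronbach's alpha for a participants-by-items rating matrix; the data layout is assumed, and this is not the analysis script used for the study.

import numpy as np

def cronbach_alpha(ratings):
    """ratings: 2-D array, rows = participants, columns = the seven Likert items."""
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]
    item_variances = ratings.var(axis=0, ddof=1)
    total_variance = ratings.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)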

The apparatus for this experiment consisted of up to five Apple iMac desktop computers with 20-inch screens, wired mice, and wired keyboards.

Procedure

Up to five students participated in each session, with all participants in a session randomly assigned to either the CrushStations group or the control group. The study consisted of four treatment sessions spread across 9 days. In the first treatment session, participants read and signed the informed consent form, created an individual login for the online portal, and performed the two cognitive tasks as pretest measures. There was a short pause after each task to allow for setup. Once the tasks were complete, the participants played either CrushStations or Bookworm for 30 min. The second treatment session was scheduled 2 days after the first, and the third treatment session was scheduled 5 days after the second; each consisted of playing the assigned game for 30 min. During the fourth treatment session, which was scheduled 2 days after the third, the participants played their assigned game for 30 min, then completed the two cognitive tasks as posttest measures along with the postquestionnaire. Upon completion of the study, participants were paid $30 or credited with 3 h of participation and thanked for their time. We followed guidelines for research with human subjects and obtained IRB approval.

Results

Do the Groups Differ on Basic Characteristics?

A preliminary step is to determine whether the groups had any preexisting differences. Participants in the CrushStations and Bookworm groups did not differ significantly (all ps > .05, based on t-tests) in their mean age, mean hours of reported video game play per week, or mean pretest scores. The groups also did not differ significantly (p > .05, based on a chi-square test) in the proportion of men and women. We conclude that random assignment produced groups equivalent on basic characteristics.

Do the Data Meet Requisite Assumptions for Planned Analyses?

Prior to analysis, outliers were identified separately for the n-back task and the VSSP working memory task. We removed any value that was more than 3 standard deviations above or below the mean for the dependent measures of each task. After removing outliers for the n-back task, the total number of participants was 84, with 39 in the CrushStations group and 45 in the control group. After removing outliers for the VSSP working memory task, the total number of participants was 82, with 38 in the CrushStations group and 43 in the control group.
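A minimal sketch of this trimming rule, assuming the scores are in a pandas data frame with one row per participant (the data frame and column names are hypothetical):

import pandas as pd

def remove_outliers(df, column, n_sd=3):
    """Drop rows whose value on `column` lies more than n_sd SDs from the mean."""
    mean, sd = df[column].mean(), df[column].std()
    return df[(df[column] - mean).abs() <= n_sd * sd]

# Applied separately to each task's dependent measure, e.g.:
# nback_df = remove_outliers(nback_df, "posttest_accuracy")
# vssp_df = remove_outliers(vssp_df, "posttest_pct_correct")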

With outliers removed, the normality assumption was checked. For the n-back task, the dependent measure of accuracy (i.e., hits minus false alarms) met the assumption of normality according to the Shapiro-Wilk test (ps > .05) for both the pretest and posttest measures; therefore, ANCOVA (using pretest accuracy as the covariate) is appropriate. Levene’s test of equality of variances revealed no violation of the homogeneity of variance assumption for the dependent measure (p > .05). For the VSSP working memory task, the dependent measure of percent correct did not meet the assumption of normality according to the Shapiro-Wilk test for the pretest and posttest measures; therefore, non-parametric statistics were used to analyze these data.
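These checks can be sketched as follows, continuing with the hypothetical data frames above (scipy is assumed here; the original analysis software is not part of this sketch):

from scipy import stats

# Normality of the n-back accuracy scores (Shapiro-Wilk).
_, p_pre = stats.shapiro(nback_df["pretest_accuracy"])
_, p_post = stats.shapiro(nback_df["posttest_accuracy"])

# Homogeneity of variance across the two game conditions (Levene's test).
crush = nback_df.loc[nback_df["group"] == "CrushStations", "posttest_accuracy"]
control = nback_df.loc[nback_df["group"] == "Bookworm", "posttest_accuracy"]
_, p_levene = stats.levene(crush, control)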

Specific Transfer View: Do Participants Improve at CrushStations?

According to the specific transfer view, students in the CrushStations group should show improvements in game performance across the four sessions. Table 4 shows the mean highest level reached (and standard deviation) in each session. A repeated-measures ANOVA was performed to determine whether participants improved as they played CrushStations across the four treatment sessions, as measured by the highest level they achieved in each session (out of 84 levels total). There was a significant within-subjects effect of treatment session on level achieved, F(2.356, 87.165) = 399.925, p < .001 (using the Greenhouse-Geisser correction, because Mauchly’s test of sphericity was significant at p = .004). Bonferroni-adjusted post hoc tests revealed significant differences among treatment sessions (all ps < .001), in that participants achieved a significantly higher level in each later session than in every earlier session: session 2 > session 1, session 3 > session 1, session 4 > session 1, session 3 > session 2, session 4 > session 2, and session 4 > session 3. We conclude that there is evidence for specific transfer because players tended to get better at playing the game.

Table 4 Testing the specific transfer view: CrushStations players improve on game performance across four sessions as measured by level achieved in each session
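A sketch of how this kind of analysis could be run in Python, assuming the pingouin package and a long-format data frame levels_long with one row per participant per session (the package choice and column names are our assumptions, not a report of the software actually used):

import pingouin as pg

# Repeated-measures ANOVA on highest level reached; correction=True applies the
# Greenhouse-Geisser adjustment for violations of sphericity.
aov = pg.rm_anova(data=levels_long, dv="highest_level",
                  within="session", subject="participant", correction=True)

# Bonferroni-adjusted pairwise comparisons among the four sessions.
posthoc = pg.pairwise_tests(data=levels_long, dv="highest_level",
                            within="session", subject="participant",
                            padjust="bonf")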

Specific Transfer of General Skill View: Does Playing CrushStations Increase Updating Performance (on the N-back Task)?

According to the specific transfer of general skill view, people who played 2 h of the focused updating skill game, CrushStations, should have improved their updating skill on a test outside the game context compared to those who played the control game. Table 5 shows the mean pretest and posttest scores (and standard deviations) on n-back accuracy for the two groups. An ANCOVA using pretest n-back accuracy score (out of 30 possible target trials) as a covariate revealed a significant main effect of game condition on posttest accuracy score, F(1, 81) = 5.67, p = .020, ηp² = .07, in that those who played CrushStations had a significantly better average n-back score than those playing Bookworm, yielding an effect size of d = .29. Consistent with the specific transfer of general skill view, this is the main empirical contribution of this project because it shows that 2 h of game-based training involving updating can have a significant impact on updating skill performed in a non-game context.

Table 5 Testing the specific transfer of general skill view: CrushStations players improve more than control group on target skill (n-back score) outside game context
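A minimal sketch of such an ANCOVA in Python, using statsmodels and the hypothetical nback_df data frame introduced above (variable names are illustrative):

import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Posttest accuracy predicted by game condition, adjusting for pretest accuracy.
model = smf.ols("posttest_accuracy ~ pretest_accuracy + C(group)", data=nback_df).fit()
print(anova_lm(model, typ=2))  # F and p for the group effect, controlling for pretest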

General Transfer View: Does Playing CrushStations Increase Visual Working Memory Skill (on the VSSP WM Task)?

The general transfer view holds that playing CrushStations should result in greater improvements on VSSP score as compared to a control group. Table 6 shows the mean pretest and posttest scores (and standard deviations) on the VSSP task for the two groups. An independent-samples Kruskal-Wallis test (p = .59) indicated there was no significant difference between the mean percent correct of those playing CrushStations and those playing Bookworm.

Table 6 Testing the general transfer view: CrushStations players do not improve more than control group on non-target skill (VSSP score) outside game context
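With only two groups, the Kruskal-Wallis test amounts to a rank-based comparison of the two samples; a sketch using scipy and the hypothetical vssp_df data frame from above:

from scipy import stats

crush_vssp = vssp_df.loc[vssp_df["group"] == "CrushStations", "posttest_pct_correct"]
control_vssp = vssp_df.loc[vssp_df["group"] == "Bookworm", "posttest_pct_correct"]
h_stat, p_value = stats.kruskal(crush_vssp, control_vssp)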

Do the Groups Differ on Self-report Measures?

As shown in the right side of Table 3, independent-samples t-tests on the seven postquestionnaire items revealed significant differences favoring the Bookworm group on items 3 (d = 1.10) and 7 (d = 1.24), and favoring the CrushStations group on item 6 (d = 0.64). Overall, participants enjoyed Bookworm more than CrushStations and would rather play Bookworm again, but found CrushStations to be more challenging.

Discussion

Empirical Contributions

The current study examined the effectiveness of a game designed based on cognitive principles of skill learning, CrushStations. The primary objective of this study was to test different views of transfer (i.e., specific transfer, specific transfer of general skills, and general transfer) for participants who played the training game. Consistent with the specific transfer view (Hypothesis 1), playing the focused video game CrushStations resulted in significant improvements in game performance across the four sessions. Consistent with the specific transfer of general skills view (Hypothesis 2), playing CrushStations for 2 h was effective at training for near transfer to a task that uses the same cognitive skill, updating, in a new context, namely the n-back task, as compared to an active control group. In contrast, the general transfer view (Hypothesis 3) was not supported: playing CrushStations did not result in far transfer to a task that taps a similar but different cognitive skill from the target skill of updating. Specifically, participants who played CrushStations did not show significantly greater pretest-to-posttest gains in accuracy on the VSSP task as compared to the active control group.

Theoretical Implications

Concerning theoretical implications for cognitive training with computer games, this study provides evidence that the connection between playing a video game and learning to think depends on the degree to which the game is based on the six criteria from the cognitive theory of game-based training listed in Table 1. When we designed a game based on these criteria, as in CrushStations, we found evidence for specific transfer and specific transfer of general skill, but not general transfer. This work shows that when game-based training of cognitive skills is effective, students will show improvements in the targeted skill in game and non-game contexts, but not in non-targeted skills in non-game contexts, at least not at this dose of the intervention. In contrast to many studies of brain-training games that do not find evidence of transfer of a skill targeted in the game to improvements in that same targeted skill outside the game context (Bainbridge & Mayer, 2018; Mayer, 2014; Mayer et al., 2019), we found that playing a game designed to train updating skill based on cognitive principles of game-based training produced learning that allowed players to show improvements in updating skill tested outside the game context. This work provides evidence for the value of designing game-based training that meets the six criteria in the cognitive theory of game-based training summarized in Table 1 and adds to the small evidence base showing the benefits of applying cognitive theories of skill learning to designing educational games for cognitive training (Mayer et al., 2019; Parong et al., 2017, 2020).

Practical Implications

The primary practical contribution of this study is that a focused video game can be used to train the executive function skill of updating with 2 h of appropriate gameplay in a population of young adult university students. CrushStations was originally designed for middle- and high-school-aged children; evidence that it is also an effective tool for improving updating in an older population makes the game more viable in commercial and educational settings. More broadly, this work underscores that potential game users and educational decision-makers should base their choices on empirical research evidence concerning game effectiveness (Mayer, 2016).

Limitations and Future Directions

A surprising finding is that students did not report high levels of enjoyment for CrushStations, even though enjoyment is one of the criteria in the cognitive theory of game-based training. This may be because the game was originally developed for younger players, so perhaps the game should be modified to better appeal to young adults. However, players reported being challenged, which may have encouraged greater effort.

There was a general pattern of players liking CrushStations less and wanting to play it less than the control game, but feeling more challenged by it. This suggests that developers have work to do to increase young adults’ enjoyment of and motivation for CrushStations, while maintaining an adequate level of challenge throughout the game.

In the future, this line of research should examine more incremental steps between the training task and the transfer task, that is, how different the transfer task can be while still drawing on the same underlying general skill. Another issue for future research concerns whether CrushStations would be useful with an aging population. Future research is also needed to determine the proper dosage of gameplay, that is, how many minutes, spread over how many sessions, at what interval, are needed to maximize learning.

Overall, however, it is encouraging that the third strand of research on games to train cognitive skills, in which the design of the game intervention is based on cognitive theory, has shown promise in training such an important skill as updating.