Is Training with the N-Back Task More Effective Than with Other Tasks? N-Back vs. Dichotic Listening vs. Simple Listening

Studer-Luethi, Barbara; Meier, Beat

doi:10.1007/s41465-020-00202-3

Original Research
Open access
Published: 28 December 2020

Volume 5, pages 434–448, (2021)

Journal of Cognitive Enhancement

Abstract

Cognitive training most commonly uses computerized tasks that stimulate simultaneous cognitive processing in two modalities, such as a dual n-back task with visual and auditory stimuli, or on two receptive channels, such as a listening task with dichotically presented stimuli. The present study was designed to compare a dual n-back task and a dichotic listening (DL) task with an active control condition (a simple listening task) and a no-training control condition for their impact on cognitive performance, daily life memory, and mindfulness. One hundred thirty healthy adults aged 18–55 years were randomly assigned to one of the four conditions. The training consisted of twenty 15-min sessions spread across 4 weeks. The results indicated some improvement on episodic memory tasks and a trend toward enhanced performance in an untrained working memory (WM) span task following cognitive training relative to the no-training control group. However, the only differential training effects were found for the DL training, which increased choice reaction performance and showed a trend toward higher self-reported mindfulness. Transfer to measures of fluid intelligence and memory in daily life did not emerge. Additionally, we found links between self-efficacy and n-back training performance and between emotion regulation and training motivation. Our results contribute to the field of WM training by demonstrating that our listening tasks are comparable in effect to a dual n-back task in slightly improving memory. The possibility of improving attentional control and mindfulness through dichotic listening training is promising and deserves further consideration.

Introduction

Attentional control is more important than ever in our modern everyday lives. It enables us to focus on specific tasks in the midst of a plethora of information. The possibility of enhancing it through training is attractive for a variety of groups in the population, ranging from the young to older adults. Indeed, the observation that the human brain is plastic and that cognitive processes become more efficient as a result of regular and focused mental exercise has encouraged researchers to examine the effects of targeted interventions, above all by means of computerized cognitive tasks (see, e.g., von Bastian and Oberauer, 2014). Consequently, research has elicited promising findings during the last decade by demonstrating learning effects and transfer to untrained tasks; most of these studies have applied the n-back task as a cognitive training paradigm (see Au et al., 2015; Schwaighofer et al., 2015; Soveri et al., 2017a, for recent meta-analyses). This has stirred widespread interest among researchers and the public at large (Simons et al., 2016) but has also attracted widespread criticism (e.g., Melby-Lervåg and Hulme, 2013). Many critics have focused on a lack of insight into how training tasks lead to training effects (e.g., Shipstead, Redick, and Engle, 2012a). One way to better understand such mechanisms is to investigate the differential effects of diverse training approaches. Thus, the present study compared the effects of the commonly used and extolled dual n-back task with those of a dichotic listening task, which places a heavy load on selective attention, and with active and passive control conditions.

In the dual n-back task, trainees simultaneously see and hear a series of stimuli. They are required to indicate whether each stimulus is the same as that seen or heard n items back. The task is assumed to train working memory (WM), the ability to simultaneously store and process information and hold it available for complex cognition at a given moment (Oberauer and Hein, 2012). WM has been defined as one core component of executive functions, a set of top-down mental processes needed for paying attention. Besides WM, the two other core executive functions are inhibition and shifting (Miyake et al., 2000).

WM has been linked to a number of important skills, such as attentional control, reasoning, and general intellectual capacity (Engle, 2018; Kane et al., 2007; Shipstead et al. 2012b; Wongupparaj et al., 2015). Indeed, training with the n-back task has been shown not only to enhance performance in the trained task but also to generalize to untrained tasks of WM and attention (Lilienthal et al., 2013; Pergher et al., 2018; Studer-Luethi et al., 2016), here referred to as near transfer, and higher-order cognition (Jaeggi et al., 2008; Jaušovec and Jaušovec, 2012; Klingberg, 2010; Soveri et al., 2017b), here referred to as far transfer. Even though far transfer to general intelligence has been reported most often in response to dual n-back training (Au et al., 2015; Blacker et al., 2017), these effects seem to be smaller and more inconsistent than the more consistently observed near transfer effects (Soveri et al., 2017a).

In the forced-choice dichotic listening (DL) task, trainees are presented with auditory words via headphones; one word is played to the right ear, and a different word is simultaneously played to the left ear. The participant is instructed to direct attention to one of the ears and decide on the category of the presented word (e.g., natural vs. artificial). The inputs in each ear cross over to the contralateral cerebral hemisphere, while the ipsilateral inputs are automatically inhibited (Tallus et al., 2015). The task is assumed to train attentional capacity by obliging trainees to direct their attentional focus to one source of information while inhibiting the other (cf. Rothen and Meier, 2018). The DL task thus places high demands on the core executive function of inhibition, which is the capacity to obstruct dominant responses and to suppress the influence of interfering information (e.g., Bexkens et al., 2015). The DL task has been applied to assess impairments in attention, working memory, and executive functions (Hugdahl, 2011) and has been found to be beneficial for participants with auditory, verbal, or neurological impairments (Helland et al., 2018; McCullagh and Palmer, 2017; Osisanya and Adewunmi, 2018). Apart from that, little research has been done with this task, but some evidence has indicated improvements in auditory attention and attentional control after 4 weeks of DL training (Soveri et al., 2013). A more recent study demonstrated increased post-training attentional control at the neuronal level but no behavioral improvements (Tallus et al., 2015).

In response to some inconsistencies in cognitive training results, some studies focused on the potential modulatory roles of individual personal and motivational differences (e.g., Jaeggi et al., 2014; Studer-Luethi et al., 2016; Zhao et al., 2018). While most researchers agree on the relevance of individual, motivational, and emotional factors, findings on this topic are rather inconclusive (see, e.g., Borella et al., 2017; Katz et al., 2014; Linares et al., 2019; Maraver et al., 2016). It seems worthwhile to include personal and motivational factors in cognitive intervention designs to bring more clarity about possible links.

But what are the mechanisms through which training-induced improvements occur (see, e.g., Meiran et al., 2019)? The mismatch model of cognitive plasticity predicts that a rise in demand on cognitive processes results in increased resources associated with cognitive functioning (Lindenberger, 2014). When the training tasks continually and sufficiently challenge the upper limits of attention and memory, trainees’ cognitive abilities will increase in various cognitive tasks. Related to this model is the phenomenon of the dual-task practice advantage, which suggests an advantage of dual-task training over single-task training with regard to effects on performance in demanding cognitive tasks (see Strobach, 2020). Finally, other approaches assume that cognitive training enhances the efficiency of the specific processes involved in the training (e.g., Dahlin et al., 2008) or develops the highly specific skills required to perform specific cognitive tasks (Gathercole et al., 2019). In this case, training tasks targeting different components of executive functions (that is, WM updating/shifting vs. inhibition) are expected to show differential improvements on transfer tasks with similar or dissimilar cognitive demands (cf. Miyake et al., 2000).

The Present Study

The present study investigates whether WM training is effective in a sample of adults at a range of ages and whether various cognitive training approaches lead to differential cognitive improvements. Specifically, we aimed to compare the effects of 4 weeks of training with a new tablet-based version of the dual n-back task with a new version of the DL task on the same set of near-transfer measures of attention and memory and far-transfer measures of intelligence and daily life memory. To estimate the significance of the training effects, we compared them to an active control group using a simple listening (SL) training task and a no-training control group. The SL task followed the same structure as the DL task but consisted of identical auditory stimuli simultaneously presented to both ears. Therefore, no directing or shifting of attention was required in this task. In both versions of the listening tasks, we implemented a prospective memory task in the second part: participants were asked to react to a specific word (e.g., “dog”) by pressing a special button. The main reason for this addition was to keep the task demanding and interesting for the participants.

With regard to the methods of the training tasks, the WM task and the listening training tasks all include a steady flow of information as well as a simultaneous presentation of information on two channels (n-back: one visual and one auditory stream; DL: two different auditory streams, one presented to each ear). Also, all the training tasks combine attentional and memory demands.

We were interested in whether possible training-related changes could be explained by the cognitive processes involved in specific training tasks. The assumption here was that while all three training tasks draw on executive functions, such as attentional control, the n-back task and the DL task focus on WM and inhibition, respectively. That is, the n-back task places high demands on the WM components updating and shifting, whereas the DL task places high demands on the attentional component inhibition (of irrelevant information) and thereby on selective attention.

If cognitive training effects are unspecific, no differential gains should emerge across the three training conditions. If cognitive training is less specific but requires high attentional load and processing speed to increase general cognitive processes, we expected higher benefits for both the n-back and the DL training than for the SL training. The same expectation (advantage of the two dual training tasks versus the single training task regarding broader cognitive benefits) results from the dual-task practice advantage phenomenon. If n-back training’s high WM demands produce a specific effect, we expected higher improvements especially in far transfer measures in the dual n-back condition than in the other conditions. In this case, the DL training is assumed to show specific improvements of inhibition. In contrast, if cognitive training is not effective, we expected no differential retest improvements in either the trained or the untrained participants.

Furthermore, cognitive training effects should ultimately be evaluated with measures that more closely reflect real-life experience (cf. Soveri et al., 2017a). We were interested in whether training participants noticed any impact of the intervention on the mindfulness and memory performance they experienced in daily life.

Finally, we were interested in whether we would find associations of personality, emotion regulation, self-efficacy, and training motivation with training outcomes, since such individual variables can change the engagement, commitment, and persistence of trainees.

Methods

Participants

One hundred thirty participants (62 male) with a mean age of 26.26 years (SD = 10.62; range = 18–55) were recruited from the personal networks of the study leaders. Participants were required to be adults between the ages of 18 and 55, in good health, and not taking any drugs. The participants were not paid, but they received our collection of cognitive training tasks after the completion of the study. All participants received the same information, reported normal vision and hearing, and provided informed, written consent before participation.

Assignment to the training groups was random, with matching for gender and age. To complete the study design, participants for the passive control group were recruited later, after the three training groups had finished the training. A larger number of participants, who were also matched for gender and age, was included in this group in order to enhance statistical power. The final sample consisted of 28 participants (mean age = 24.5 years; SD = 7.46; 11 male) in the dual n-back training, 30 participants (mean age = 25.6 years; SD = 10.01; 10 male) in the DL training, 24 participants (mean age = 26.70 years; SD = 11.45; 9 male) in the active control group (SL training), and 48 participants (mean age = 27.5 years; SD = 12.14; 32 male) in the no-training group.

Procedure

The recruited participants were assigned to one of the four experimental groups before taking any tests. They took the pretraining behavioral test in groups of around 20 participants in a computer room at the university. After the completion of the pretests, each training participant received a tablet to take home and training instructions. Participants were instructed to schedule five training sessions each week for 4 weeks for a total of 20 sessions. Finally, all the participants were tested 2 to 5 days after their last training session.

Measures

Cognitive Tasks

Choice Reaction Task

In this task, arrows pointing to the right or left were presented on the screen (presentation time of max. 5000 ms, interval between 300 and 500 ms), followed by a black screen (500 ms). Participants were requested to press the corresponding arrow on the keyboard as fast as possible. Mean accuracy served as the dependent variable.

Task Switching

A total of 32 numbers (1–10) were serially presented on the screen. Participants were required to assign the numbers to one of two categories by pressing a predefined key as fast as possible. Crucially, the task changed from odd/even to lower/higher than five in an AABB order, thus enabling switch costs to be calculated. The dependent variable was the difference in accuracy between task change and task repetition.
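
The switch-cost measure described above can be illustrated with a minimal sketch. It assumes that accuracy is coded per trial as correct or incorrect and that the cost is computed as repeat accuracy minus switch accuracy; the text does not state the direction of the subtraction, and the function names are hypothetical.

```python
from typing import List


def aabb_trial_types(n_trials: int) -> List[str]:
    """Label each trial in an AABB task order as 'first', 'repeat', or 'switch'.

    Tasks run A, A, B, B, A, A, ... so the task changes on every second trial.
    The first trial has no predecessor and is labelled 'first'.
    """
    labels = []
    for i in range(n_trials):
        if i == 0:
            labels.append("first")
        elif (i // 2) % 2 != ((i - 1) // 2) % 2:
            labels.append("switch")
        else:
            labels.append("repeat")
    return labels


def switch_cost(correct: List[bool]) -> float:
    """Accuracy on repeat trials minus accuracy on switch trials.

    Assumption: a positive value means lower accuracy after a task change.
    """
    labels = aabb_trial_types(len(correct))
    switch = [c for c, lab in zip(correct, labels) if lab == "switch"]
    repeat = [c for c, lab in zip(correct, labels) if lab == "repeat"]
    if not switch or not repeat:
        raise ValueError("need at least one switch and one repeat trial")
    return sum(repeat) / len(repeat) - sum(switch) / len(switch)
```

With 32 trials, this labelling yields 15 switch trials and 16 repeat trials, with the first trial left out of both sets.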

Processing Speed

The digit symbol substitution test (DSST) of the Wechsler Adult Intelligence Scale (WAIS; Wechsler, 1958) consists of nine digit-symbol pairs (e.g., 1/-, 8/X) followed by a list of digits. Participants were required to write the corresponding symbol under each digit as fast as possible. The number of correct symbols within the time allowed (120 s) served as the dependent measure.

Fluid Intelligence

We used Raven’s Standard Progressive Matrices test (RPM; Raven, Raven, and Court, 1998) separated into two forms of 30 items (items were split into odd and even sets and counterbalanced across testing times). Participants saw a 3 × 3 matrix of shapes presented with the last shape missing and were required to choose the item that completed the pattern from a set of six to eight choices. Participants were given 10 min to complete the task. The number of correctly answered items served as the dependent variable.

Working Memory

Verbal working memory capacity was individually assessed with the backward number span task of the Wechsler Memory Scale (Wechsler, 1997). Starting with two numbers, sequences of increasing length consisting of numbers between 1 and 9 were read out, and the participant was required to repeat each sequence in reverse order. The number of correctly reproduced sequences served as the dependent variable.

Episodic Memory

A total of 48 words consisting of a maximum of 9 letters were presented serially and dichotically through headphones (interval of 200 ms). Participants were instructed to pay attention only to the words presented to one ear and decide on the category (part 1: flower vs. tree; part 2: furniture vs. clothes). Half of the word pairs presented were congruent (identical words), and the other half were incongruent (different words). After completing the tasks, participants were asked to recall as many of the words as possible in 2 min. The number of correctly recalled words served as the dependent variable (Muhmenthaler and Meier, 2019).

Self-Reported Measures

Mindfulness

We used the German version of the Freiburg Mindfulness Inventory (FMI; Walach et al. 2006), which consists of 14 items (e.g., “I feel connected to my experience in the here-and-now”). Answers are given on a Likert scale ranging from 1 (rarely) to 5 (almost always).

Memory in Everyday Life

We used the Prospective and Retrospective Memory Questionnaire (PRMQ; Smith et al. 2000) as a self-report measure of prospective and retrospective memory slips in everyday life. The questionnaire consists of 16 questions, 8 asking about retrospective memory failures (e.g., “Do you forget what you watched on television the previous day?”) and 8 concerning prospective failures (e.g., “Do you decide to do something in a few minutes’ time and then forget to do it?”). Answers are given on a Likert scale ranging from 1 (never) to 5 (very often).

Neuroticism and Conscientiousness

These two personality traits of the Big Five Model developed by McCrae and Costa (1999) were measured with 24 items from the NEO-FFI Questionnaire (Costa and McCrae, 1992): 12 items concerning neuroticism (e.g., “I’m often tense and nervous”) and 12 items concerning conscientiousness (e.g., “I try to conscientiously finish given tasks”). Participants are asked to rate their agreement with a statement on a Likert scale from 1 (strong disagreement) to 5 (strong agreement).

Emotion Regulation

We used the Emotion Regulation Skills Questionnaire (ERSQ; Berking and Znoj 2008) to measure emotional regulation competences. The 27 questions explore the emotional competencies of awareness, clarity, sensation, understanding, acceptance, resilience, self-support, willingness to confront, and modification (e.g., “I can influence my negative emotions”). Answers are given on a Likert scale ranging from 1 (very often) to 5 (very rarely).

Self-efficacy

To assess belief in one’s own capacity to handle difficulties and challenges in everyday life, we used the General Self-Efficacy Short Scale (GSE; Beierlein et al. 2013). Answers to the four items (e.g., “When I am confronted with a problem, I can usually find several solutions”) are given on a Likert scale ranging from 1 (strong disagreement) to 4 (strong agreement).

Training Tasks

Both of the training tasks are part of our cognitive training task collection designed for application on tablets and smartphones (Studer-Luethi et al. 2017).

Dual N-Back Task

We used the dual n-back procedure described by Jaeggi et al. (2008). We created a version with motivating features and constant direct feedback. In our version of the task, an animal such as a rabbit or mole appears at different locations on the screen (presentation time 500 ms, interstimulus interval 2500 ms). Simultaneously, a letter of the alphabet is presented through the earphones. During each interval, the trainee is required to decide whether the current location of the animal and whether the heard letter is the same as n positions back in the sequence, and to touch a predefined target button on the tablet screen if so or a predefined nontarget button in any other case. Immediate feedback is provided at the top of the screen for each response in both the visual and auditory modalities (see Fig. 1a). For every level of n, there are three field sizes with 4, 8, and 11 grid compartments. If the trainee makes fewer than three mistakes, the field size increases. The level of n increases after successful completion of the third block. Similarly, the field size decreases after more than five mistakes, but the level of n decreases only after three unsuccessful blocks. After each block consisting of 20 + n trials, trainees receive performance feedback. Each training session consisted of 15 blocks and lasted approximately 20 min.
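
The core matching rule of the n-back task can be sketched as follows. This is only an illustration under stated assumptions, not the authors' tablet implementation: the function names are hypothetical, each modality (animal location, heard letter) is treated as its own stream, and a response is coded as True when the trainee indicates a match.

```python
from typing import List, Sequence


def nback_targets(stream: Sequence, n: int) -> List[bool]:
    """Mark each trial as a target if it equals the item n positions back.

    The first n trials have no item n positions back, so they can never be
    targets.
    """
    return [i >= n and stream[i] == stream[i - n] for i in range(len(stream))]


def count_mistakes(targets: Sequence[bool], responses: Sequence[bool]) -> int:
    """Count trials on which the response disagrees with the target status.

    Both misses and false alarms count as mistakes (an assumption; the text
    does not define how mistakes are scored).
    """
    return sum(t != r for t, r in zip(targets, responses))


# Hypothetical 2-back example: one visual stream (grid positions) and one
# auditory stream (letters), scored separately.
positions = [1, 2, 1, 2, 3]
letters = ["A", "B", "C", "B", "C"]
visual_targets = nback_targets(positions, n=2)
auditory_targets = nback_targets(letters, n=2)
```

The number of mistakes per block, summed over both streams, would then be the quantity that the adaptive rule described above compares against its thresholds.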

DL Task

In the forced-choice dichotic listening task, the participant is presented on each trial with two different words, one to each ear, over the headphones. In the first part of the task, the participant is instructed to direct auditory attention to either the left or the right ear and assign the word to one of two categories by touching the corresponding button on the right or the left side of the screen (see Fig. 1b). During the 20 training sessions, the categories changed between concrete/abstract, English/German, male/female voices, natural/artificial sounds, and smaller/bigger objects, as did the to-be-attended ear (i.e., left vs. right). In the second part, a prospective memory task is added by instructing participants to react to a predefined word (e.g., “dog”) or category (e.g., animal) by pressing a special key (Meier et al. 2011). Each part of the task consisted of 90 words and lasted approximately 20 min.

SL Task

In the simple listening task, the participant is presented on each trial with the same word to both ears over the headphones. Thus, auditory attention is not to be directed to one ear as in the dichotic listening task. Apart from that, the procedure is identical for the simple and dichotic listening tasks.

Results

Data Processing

Following outlier analysis, 5% of the data was trimmed to 3 SD above or below the mean scores. We compared pretest and post-test performance as a function of the three training groups and the no-training control group to analyze transfer effects. Descriptive data of pretest and post-test as well as within-group changes are presented in Table 1. Importantly, the four experimental groups did not significantly differ in any of their test performance at pretest (all t < 0.78, p = n.s.).
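
One possible reading of the trimming step is that scores beyond 3 SD from the mean were pulled back to the 3 SD boundary. The sketch below illustrates that reading only; the function name is hypothetical, the use of the sample standard deviation is an assumption, and the data are made up for the example.

```python
from statistics import mean, stdev
from typing import List, Sequence


def clip_to_k_sd(scores: Sequence[float], k: float = 3.0) -> List[float]:
    """Pull scores lying more than k sample SDs from the mean back to the bound.

    Mean and SD are computed once on the unclipped scores (an assumption).
    """
    m = mean(scores)
    s = stdev(scores)
    lower, upper = m - k * s, m + k * s
    return [min(max(x, lower), upper) for x in scores]


# Made-up data: 19 identical scores and one extreme value.
raw = [10] * 19 + [100]
clipped = clip_to_k_sd(raw)
```

In this example only the extreme value changes; the other 19 scores lie within the bounds and are returned unchanged.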

Table 1 Mean (M) and standard deviations (SD) of performance in the untrained cognitive measures and within group changes in these tasks (t values and effect size Cohen’s d for repeated measures)

Participants had to be excluded from the transfer analysis if they did not complete the post-tests (n = 4) or if they completed fewer than 17 of the 20 training sessions (dual n-back group: n = 4; DL training group: n = 3; SL training group: n = 6). This left a sample of N = 93 for training and transfer analysis.

Training Performance and Motivation

All three training groups showed improved performance across the 20 training sessions (see Fig. 2). The dependent measure for the dual n-back training was the average n-back level achieved in each session. The dependent measure for the dichotic listening task was the average accuracy of responses in each session. Mean performance in the dual n-back task increased from level 1.36 to level 9.69 (which is equivalent to level 4 in the classical dual n-back task; F_(1.19) = 81.91, p < 0.001, η_p² = 0.79). The mean DL and SL task performance increased in accuracy from 0.92 to 0.95 (F_(1.19) = 3.17, p < 0.01, η_p² = 0.10) and from 0.91 to 0.94 (F_(1.19) = 4.41, p < 0.001, η_p² = 0.17), respectively. Changes in reaction time differed: There were no significant changes in the dual n-back task (F < 1), but participants in the DL training decreased their reaction time from 1419 to 1204 ms (F_(1.19) = 39.36, p < 0.001, η_p² = 0.69) and those in the SL training from 1747 to 1438 ms (F_(1.19) = 28.66, p < 0.001, η_p² = 0.61).

We also collected feedback after completion of the training: (1) How motivated were you for the training? (2) How much did you enjoy the training? (3) To what extent did you feel it improved (a) your concentration, (b) your responsiveness, (c) your memory performance? We found no differences between the three training groups for any of these variables. When examining associations with training performance, we found a positive relation between training enjoyment and DL training performance (r = 0.34, p < 0.01) and, as a trend, between training enjoyment and n-back training performance (r = 0.29, p = 0.08). However, we did not find any significant association between these motivational variables and transfer performance in any of the training groups.

Transfer

We conducted ANOVAs for repeated measures (see Footnote 1) for the transfer variables with the factors group (dual n-back training group, DL training group, SL active control group, no-training control group) and time (pre- and post-training assessment). We conducted post hoc analyses of differences of means (Δ), corrected for multiple comparisons with the Bonferroni correction. We also computed the within-group changes by calculating the effect size Cohen’s d with the correction for repeated measures as proposed by Morris (2007). The resulting transfer effects are presented in Table 1 and Fig. 3.
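
As a brief illustration of the Bonferroni correction mentioned above, the sketch below multiplies each p value by the number of comparisons and caps the result at 1. It assumes that all pairwise comparisons among the four groups are corrected together, which the text does not specify; the p values in the example are invented.

```python
from itertools import combinations
from typing import List, Sequence

GROUPS = ["dual n-back", "DL", "SL", "no training"]

# All unordered pairs of the four groups: 6 pairwise comparisons.
PAIRS = list(combinations(GROUPS, 2))


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Return Bonferroni-adjusted p values: p * m, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented p values for the six pairwise comparisons, for illustration only.
example_p = [0.004, 0.03, 0.2, 0.5, 0.01, 0.8]
adjusted = bonferroni(example_p)
```

Under this scheme, a comparison remains significant at the 0.05 level only if its unadjusted p value is below 0.05 / 6.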