Introduction

In the context of healthy aging — the focus of WHO’s work on aging between 2015 and 2030 — regular physical activity not only improves and maintains physical functioning, but also evokes positive effects on cognitive functioning in older adults (Colcombe & Kramer, 2003; Voelcker-Rehage & Niemann, 2013). According to reviews and meta-analyses, improvements in cognitive and executive functions are detected predominantly following resistance training and aerobic exercise (Gallardo-Gómez et al., 2022; Northey et al., 2018). However, the specific impact of these exercise types on executive functions has recently been questioned and discussed (Diamond & Ling, 2016, 2019; Hillman et al., 2019). In particular, the meaning of “mindless” aerobic and resistance exercises with small demands for cognition has been under critical debate as benefits to executive functions appeared to be unjustified (Diamond & Ling, 2016).

The cognitive stimulation hypothesis assumes that cognitively engaging exercises that require whole-body coordination and attentional control to complete the task (Chen & Nakagawa, 2023; Tomporowski et al., 2015) activate similar neural circuits associated with executive function tasks (Best, 2010). Due to the complexity and novelty of such motor tasks, chronic effects are thought to involve learning processes, such that neuroplasticity and improvements in cognitive functions are task specific (Netz, 2019; Tomporowski & Pesce, 2019). Supporting this hypothesis, recent literature indicates that training interventions with simultaneous cognitive activation have a greater effect on cognitive functioning compared to endurance and strength training interventions alone (Gheysen et al., 2018). Moreover, motor tasks where task complexity rather than intensity is understood as the control variable seem to be a promising training mode for healthy aging (Netz, 2019). For instance, studies on gross motor exercises and coordination training showed equivalent improvements in executive functions, selective attention, and visual perception compared to aerobic and strength training (Berryman et al., 2014; Voelcker-Rehage et al., 2011).

In resistance training, unstable surfaces could be used to alter task complexity because they change coordinative requirements through increased vertical and transversal ground reaction forces (Eckardt et al., 2020a; Lehmann et al., 2022), in addition to increased demands on visual perception (Lord & Menz, 2000). However, adding balance requirements to resistance exercises has not yet been analyzed in terms of attentional demands. This study aims to show that cognitive engagement during acute resistance exercises can be increased by implementing unstable devices. Consequently, investigating the acute effects of exercising on unstable devices may contribute to its classification as a cognitively engaging exercise, and thus to the understanding of how resistance training with surface instability enhanced executive functions (Eckardt et al., 2020b).

Meta-analyses of the acute effects of resistance exercise on cognition have yielded inconsistent results. Chang et al. (2012) found negative effects while Wilke et al. (2019) showed moderate improvements in cognitive performance after an acute bout of resistance exercise. Reasons for divergent results may include differences in exercise intensity, timing of cognitive testing, and type of cognitive task. However, the studies included in these analyses mainly assessed cognitive performance within 5 min after the single bout of exercise. The literature on cognitive performance and attention during resistance exercise is limited. One study that examined the attentional demands of resistance exercise in young adults found performance declines in the cognitive task and concluded that barbell squats required a higher amount of attentional resources than standing upright (Herold et al., 2020).

In their study, Herold et al. (2020) employed a dual-task paradigm, which provides a methodology for examining the cognitive resources required during exercise. This paradigm involves the execution of two tasks simultaneously, usually a motor task and a concurrent cognitive task (Abernethy, 1988). It is assumed that the execution of a motor task requires a proportion of a limited processing capacity depending on its attentional demands (Huang & Mercer, 2001). In general, goal-directed behavior requires cognitive resources for selecting task-relevant from task-irrelevant information, which is summarized as selective attention (Lavie et al., 2004). Performance on the concurrent cognitive task is considered to reflect the amount of residual processing capacity not allocated to the motor task. Consequently, attentional demands for a motor task are assessed by comparing the individual’s cognitive task performance in a single-task (ST) condition (cognitive task only) with the cognitive task performance in a dual-task (DT) condition (cognitive and motor task concurrently) (Huang & Mercer, 2001). The difference between single- and dual-task performance is defined as the dual-task effect (\(\mathrm{DT\ effect}=(\mathrm{DT}-\mathrm{ST})/\mathrm{ST}*\pm 100\%\)), where the sign depends on whether higher values of the measure indicate better or worse performance. For example, a negative DT effect in the performance of the cognitive task would indicate that attentional resources were allocated to the motor task and that the remaining resources were insufficient for the participant to perform the cognitive task to the best of his or her ability (McIsaac et al., 2015).
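The sign convention in the dual-task effect can be illustrated with a minimal Python sketch. The function name, the `higher_is_better` flag, and the example values are hypothetical and serve only to show how the ± in the formula is resolved for a given performance measure.

```python
def dual_task_effect(st: float, dt: float, higher_is_better: bool) -> float:
    """Return the relative dual-task effect in percent.

    The sign factor is chosen so that a negative value always means a
    performance decrement under dual-task conditions.
    """
    if st == 0:
        raise ValueError("single-task performance must be non-zero")
    sign = 1.0 if higher_is_better else -1.0
    return (dt - st) / st * sign * 100.0


# Hypothetical values, not data from the study.
rt_effect = dual_task_effect(st=800.0, dt=880.0, higher_is_better=False)  # reaction time in ms
acc_effect = dual_task_effect(st=20.0, dt=18.0, higher_is_better=True)    # correct responses
print(f"RT: {rt_effect:.1f}%  accuracy: {acc_effect:.1f}%")
```

In both hypothetical cases the result is −10%, i.e., a decrement under dual-task conditions, although reaction time increased and the number of correct responses decreased.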

Overall, task complexity, such as postural demands, and individual prerequisites, such as experience, age, or cognitive abilities, are associated with the attentional resources required for motor task performance. Several studies have used a DT paradigm to compare attentional resources during static and dynamic motor tasks (i.e., sitting, standing, and walking) in young and older adults (for a review, see Woollacott & Shumway-Cook, 2002). In general, their findings show that attentional demands increased during a static standing task in both older and young adults when the complexity of the postural task increased, e.g., by changing the base of support (wide vs. normal stance, Lajoie et al., 1993, 1996), the surface (normal vs. foam, Marsh & Geel, 2000; Teasdale et al., 1993), or the sensory input (eyes open vs. eyes closed, Teasdale et al., 1993; Vuillerme et al., 2006), or by perturbation (platform displacement, Brown et al., 1999). Compared to static motor tasks, dynamic motor tasks (e.g., walking) were found to require more attentional resources, as indicated by larger performance decrements in the secondary cognitive task (Lajoie et al., 1993; Wollesen et al., 2016). Compared to young adults, older adults showed longer reaction times and larger DT effects under DT conditions, suggesting that postural control requires more attentional resources in older adults (Woollacott & Shumway-Cook, 2002). This age effect is attributed to functional and structural changes in motor control (Papegaaij et al., 2014), and can be attenuated by expertise or exacerbated by pathological aging (Li et al., 2010; Rapp et al., 2006; Tsang & Shaner, 1998).

To date, no studies have investigated whether using unstable devices, as in postural tasks, increases the attentional demands of executing dynamic resistance exercises and thus may activate cognitive resources in the sense of the cognitive stimulation hypothesis. Therefore, the purpose of this study was to examine the attentional resources required for squatting exercises under stable and unstable surface conditions. It was hypothesized that DT effects are higher in unstable than in stable conditions. Furthermore, this study explored the impact of age on attentional resources needed to execute squatting exercises on stable and unstable surfaces. Hence, DT effects were compared between older and young adults.

Materials and Methods

Participants

Older adults were recruited through an announcement in the local newspaper and a local senior citizen walking group. Young adults were recruited by an announcement in university seminars and lectures. In total, 74 participants, including older (n = 57; 26 females) and young adults (n = 17; 7 females), were invited to a pretest session to receive information on the experimental procedures. All participants were informed about the nature and the purpose of the investigation and provided written consent to participate. Participants confirmed that they were not suffering from acute or chronic musculoskeletal or neurological disorders or an acute or chronic disease hampering balance control (e.g., Parkinson’s disease, diabetes) and listed their medication. Participants were not excluded because of their medication (e.g., l-thyroxine, candesartan), as there is evidence that the medication may compensate for any attentional deficits due to their medical conditions (e.g., thyroid disease, hypertension). All participants completed the German version of the Physical Activity Readiness Questionnaire (PARQ) assessing an individual’s health risk when exercising physically (Adams, 1999). Additionally, three questionnaires were administered to further exclude neurological and psychological disorders. First, general cognitive abilities were assessed by the Montreal Cognitive Assessment (MoCA) (Nasreddine et al., 2005) with obtained permission. Participants with a MoCA score lower than the cutoff values provided by Thomann et al. (2018) were excluded from the study. The Fall Efficacy Scale-International (FES-I, Dias et al., 2006) and the short form of the Geriatric Depression Scale (GDS-S, Yesavage & Sheikh, 1986) were utilized to assess a concern of falling and depression due to their influence on attentional resources (Wollesen & Voelcker-Rehage, 2018; Young & Williams, 2015). 
Consequently, participants with a high concern about falling (score ≥ 23 points, Delbaere et al., 2010) or depressive symptoms (score ≥ 5, Pocklington et al., 2016) were excluded from the study. Figure 1 illustrates the exclusion of participants after pretesting and during data analysis. For instance, 14 older adults had to be excluded after pretesting because dumbbell squats on the unstable device were too challenging for them. After applying the exclusion criteria, 56 individuals remained in the study and participated in the testing session. All experimental procedures were conducted in accordance with the ethical standards of the Declaration of Helsinki and approved by the local ethics committee of the University of Kassel (E05202004).

Fig. 1 Subject inclusion chart

Procedure

Our experimental setup was adopted from the study protocol by Herold et al. (2020). Participants were first invited to a pretesting session to screen their eligibility.

Pretesting Session

During the pretesting session, participants filled out questionnaires, performed the pretests, as described below, and familiarized themselves with the cognitive and motor tasks. For a warm-up routine, participants exercised 5 min on a rowing machine at 1 W/kg body weight and a revolution of 24 strokes per minute. Prior to familiarization with the cognitive and motor task, the recording quality of the microphone and amplifier for the visual-verbal Stroop task was checked. Then, a familiarization block of 16 congruent and incongruent trials in a random order followed. Finally, a 30-s testing block of randomly assigned congruent and incongruent trials was executed while standing upright in front of the TV screen. Next, dumbbell squats with self-selected low-intensity weights were executed as the motor task on an even surface (stable condition) and the flat side of a Bosu® Balance Trainer (unstable condition). Dumbbell squats were performed as a single-task (ST condition) and in a dual-task (DT) condition concurrently with the Stroop task. Participants started their squatting movement following a visual countdown (from 3 to 1) on the screen. Depending on the task condition, either a white cross on a black screen or the first colored word of the Stroop task appeared on the screen following the countdown. The appearance of the white cross on the black screen indicated the motor ST condition (squatting exercise only). In contrast, the first colored word indicated the cognitive-motor DT condition (30-s Stroop task while squatting). In the DT conditions, the prioritization of either task in terms of functional relevance was avoided, since in everyday life people often strive to perform both tasks in a dual-task situation in the best possible way (Wehrle et al., 2010). 
Instead, participants were instructed to execute the squatting exercise as similarly as possible to the ST condition (i.e., with a similar velocity and downward displacement) while simultaneously trying to respond as fast as possible to the Stroop task. This instruction was intended to ensure that participants strove for consistent performance in the squatting exercise without one task being defined as the primary task, so that losses in the cognitive task could be attributed to the attentional resources required for the squatting exercise. At the end of each task, participants rated their perceived exertion (RPE) on the Borg Scale.

The final stage of the pretesting session comprised a multiple repetition maximum (M-RM) strength test to adjust the dumbbell weight to the individual’s strength abilities. Because 1-RM testing poses a risk of injury, especially for older adults, an M-RM test was conducted instead (Reynolds et al., 2006). For the M-RM strength test, participants were instructed to perform sets of dumbbell squats with increasing loads until a weight was found that they could lift approximately ten times. After each set, participants rated their perceived exertion using the Repetitions in Reserve (RIR) scale (Zourdos et al., 2016) and then rested for 3 to 5 min before the next set. The 1-RM was estimated using the equations provided by Tan et al. (2015), which refer to the squat performance of older adults and provide different calculations for male and female participants.

Testing Session

The second visit to the laboratory was scheduled with a rest period of at least 48 h. Figure 2 illustrates the procedures of the testing session. Participants started with a warm-up procedure on the rowing machine followed by one block of the Stroop task and one set of the squatting exercise in each ground condition (stable, unstable) for familiarization and a task-specific warm-up. Participants then executed one block of the cognitive task (i.e., Stroop task while standing) and one set each of the motor task (i.e., dumbbell squats on stable and unstable ground) in the ST condition followed by the performance of two sets each of the ground condition in the DT condition (i.e., Stroop task while squatting on stable and unstable ground). The testing session ended with a second set each of the motor task and a second block of the cognitive task in the ST condition. Accordingly, each task was conducted twice. The order of the stable and unstable ground conditions was counterbalanced in the motor ST and cognitive-motor DT conditions. The cognitive task was always the first and the last task in order to account for possible learning effects in the cognitive task performance. Each cognitive task block and motor task set lasted 30 s with a rest of at least 2 min between tasks and at least 5 min between the ST and DT conditions and within the DT condition between the stable and unstable conditions. Cognitive task blocks with many voice key failures (triggered by breathing or no trigger to acoustic response) and motor task sets with balance loss on the Bosu® Balance Trainer were aborted and repeated as extra sets or blocks.

Fig. 2 Overview of the testing session procedure. Dumbbell squats in the stable and the unstable conditions were conducted in a counterbalanced order in both the single-task and the dual-task conditions. Each task lasted 30 s with at least a 2-min rest between tasks and a 5-min rest between the single-task and dual-task conditions and within the dual-task condition between stable and unstable conditions

Instruments

Questionnaires and Pretests

To identify individual prerequisites, participants completed the following questionnaires and tests during the pretesting session: a German physical activity questionnaire to obtain an estimate of the individual’s level of physical activity and the type of sport exercise within the past 4 weeks (Fuchs et al., 2015), the d2 test as a measure of focused and sustained attention (Brickenkamp, 1962), and a computerized digit span (DS) backward task as an estimate of working memory capacity (Diamond, 2013). In addition, participants were asked about the duration of their school education as well as their experience in resistance training using a rating scale ranging from 1 (i.e., no experience) to 10 (i.e., strong experience). Finally, the 30-s chair-rise test was conducted to measure lower body muscle power (Csuka & McCarty, 1985).

Cognitive Task

The abilities to focus attention on relevant information and to withdraw attention from distracting information are commonly tested components of attentional resources (Perrey, 2022). A custom-built Stroop color-word task, implemented in E-Prime (version 3.0, PST, Pittsburgh, PA, USA), was administered to assess selective attention. Colored words were presented on a TV screen at a visual angle of α = 1.26° during an upright stance. The task consisted of congruent trials (word and ink color are identical, e.g., the word “red” in red ink) and incongruent trials (word and ink color differ, e.g., “red” in green ink) with the words and colors red, green, yellow, and blue, similar to the task applied by Wollesen et al. (2016). Congruent and incongruent trials were presented in a random order with each block lasting 30 s. A visual countdown starting from three indicated the start of the 30-s testing block. Participants were instructed to respond as fast as possible to the presented stimulus material by naming the ink color of the word loudly and clearly. Acoustic responses and the corresponding reaction times (RTs) were recorded using a microphone (Superlux ECM999) and an amplifier (Yamaha MG16/6FX). The acoustic responses triggered the voice key, which then caused the next trial to appear. Thus, the total number of trials in the 30-s testing block differed between participants depending on their reaction times.

Motor Task

The execution of the squat exercises was based on the setup reported by Herold et al. (2020). Dumbbell squats at 40% of the one repetition maximum (1-RM) were executed on a solid surface (floor) and on an instability device (Bosu® Balance Trainer; flat side facing up) at a self-paced frequency. During the squatting exercise, participants stood at a distance of 2.5 m in front of the TV screen with their feet shoulder-width apart. They were instructed to lower their body as far as possible until their thighs were oriented parallel to the ground. Dumbbells were held at each side of the body with arms extended and eyes resting on the center of the screen. According to Zawadka et al. (2020), squat depth should be at least 30% of leg length. Based on the average height of German citizens (male, 178.9 cm; female, 165.9 cm) (Federal Office for Statistics in Germany, 2022), a squatting depth of 30 cm corresponds approximately to the minimum squat depth of 30% of leg length. Therefore, as an estimate, a squat was considered valid when the downward displacement of the dumbbells reached at least 30 cm (Zawadka et al., 2020). For that purpose, an inertial measurement unit (IMU) (Vmaxpro, version 6.0, Magdeburg, Germany) was attached to one of the dumbbells. The IMU signals were transferred via Bluetooth to the Vmaxpro software (version 6.0). Additionally, the IMU was used to assess motor-related DT effects regarding the number of repetitions, movement velocity, and squat depth as suggested by Herold et al. (2020).

Perceived Exertion

During ST and DT conditions, a 6–20 Borg Scale was administered to estimate the perceived exertion (RPE) (Borg, 1982). Participants were instructed to rate their physical and cognitive perceived exertion immediately after the execution of each task.

Data Processing and Statistical Analysis

Participants were excluded from the analysis if they failed to execute two valid blocks of the Stroop task or two valid squat sets in each of the task and ground conditions. At least 12 valid trials for a block in the Stroop task were required for further analysis. Trials with either a voice key failure, an incomprehensibly recorded color name (e.g., due to heavy breathing), or RTs of less than 250 ms were excluded from the analysis. In addition, RTs larger than the median RT plus 3 × interquartile range were identified as outliers and excluded (Eckardt et al., 2020b). For the motor task, at least five correctly executed dumbbell squats in the ST and the DT conditions were required for further analysis. Additionally, in the motor ST condition, each dumbbell squat had to be executed with a dumbbell downward displacement of at least 30 cm. Since the estimation of attentional resources required for the performance of the squatting exercise is based on the performance differences in the cognitive task, it should be ensured that the motor task is performed similarly in ST and DT conditions. For that purpose, the deviation in motor performance (i.e., number of repetitions) between ST and DT conditions was calculated for each participant and expressed relative to the ST condition as follows: \((\mathrm{DT}-\mathrm{ST})/\mathrm{ST}*100\). Participants who were least able to maintain a comparable motor performance in ST and DT condition were excluded by including only DT trials within the 5th to 95th percentile of %-deviation from ST performance (as a result, acceptable range of deviation in stable condition, − 22.4 to 28.4%; unstable condition, − 18.3 to 40.0%).
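The trial-level and participant-level exclusion rules described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' processing script: the function names are hypothetical, the quartile method ("inclusive") is one of several possible definitions, and it is assumed that the upper fence is computed per block after the fast responses have been removed.

```python
from statistics import median, quantiles

MIN_RT_MS = 250.0       # responses faster than this are discarded
IQR_FACTOR = 3.0        # upper fence: median + 3 x interquartile range
MIN_VALID_TRIALS = 12   # minimum valid trials for a usable Stroop block


def filter_block(rts_ms):
    """Return the valid RTs of one block, or None if too few trials remain."""
    kept = [rt for rt in rts_ms if rt >= MIN_RT_MS]
    if len(kept) < 2:
        return None
    # Assumption: quartiles via the "inclusive" method; other IQR
    # definitions give slightly different fences.
    q1, _, q3 = quantiles(kept, n=4, method="inclusive")
    upper_fence = median(kept) + IQR_FACTOR * (q3 - q1)
    kept = [rt for rt in kept if rt <= upper_fence]
    return kept if len(kept) >= MIN_VALID_TRIALS else None


def repetition_deviation(st_reps, dt_reps):
    """Deviation of DT from ST repetitions, in percent of ST."""
    return (dt_reps - st_reps) / st_reps * 100.0


def within_central_range(deviations):
    """Flag deviations lying within the 5th to 95th percentile."""
    cuts = quantiles(deviations, n=20, method="inclusive")
    low, high = cuts[0], cuts[-1]
    return [low <= d <= high for d in deviations]


# Hypothetical block: one anticipatory response and one extreme RT.
block = [200, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 3000]
valid = filter_block(block)
print(len(valid), max(valid))
```

In this hypothetical block, the 200-ms response falls below the 250-ms limit and the 3000-ms response exceeds the upper fence, so 12 trials remain and the block is retained.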

For statistical analysis, the cognitive (MoCA, d2 test, DS backward) and physical (chair-rise, sport activity, experience) prerequisites of young and older adults assessed during pretesting were evaluated for normal distribution and then compared using an unpaired Student’s t-test. The median RT of correct responses was calculated for the congruent and incongruent trials for each of the Stroop task blocks. The median RTs were then averaged across the two blocks in each task condition (ST, DT) and each ground condition (stable, unstable). Response accuracy and the Stroop effect (difference in RT between congruent and incongruent trials) were calculated for each condition but not subjected to statistical analysis. The DT effect was calculated using the equation \(\mathrm{DT\ effect}=(\mathrm{DT}-\mathrm{ST})/\mathrm{ST}*(-100\%)\). The negative sign accounts for the inverse relationship between RT and task performance (McIsaac et al., 2015). A negative DT effect represents performance decrements while a positive DT effect represents improvements. For the calculation of DT effects in stable and unstable conditions, the average RT across congruent and incongruent trials was used.
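The aggregation steps described above (block medians, averaging across blocks, Stroop effect, and DT effect) can be sketched as follows. The data layout, the function names, and the example trials are hypothetical. It is also an assumption of this sketch that the average RT across congruent and incongruent trials is the mean of the two stimulus-type values.

```python
from statistics import mean, median

STIMULUS_TYPES = ("congruent", "incongruent")


def block_medians(trials):
    """Median RT of correct responses per stimulus type for one block.

    trials: list of (stimulus_type, rt_ms, is_correct) tuples.
    """
    return {
        kind: median(rt for s, rt, ok in trials if s == kind and ok)
        for kind in STIMULUS_TYPES
    }


def summarize_condition(blocks):
    """Average the block medians across the blocks of one condition."""
    per_block = [block_medians(b) for b in blocks]
    rt = {kind: mean(m[kind] for m in per_block) for kind in STIMULUS_TYPES}
    rt["stroop_effect"] = rt["incongruent"] - rt["congruent"]
    # Assumption: the overall RT is the mean of the two stimulus-type RTs.
    rt["overall"] = mean([rt["congruent"], rt["incongruent"]])
    return rt


def dt_effect(st_rt, dt_rt):
    """DT effect for RT; negative values indicate slower responses under DT."""
    return (dt_rt - st_rt) / st_rt * -100.0


# Hypothetical single-task data: two blocks, one incorrect response.
st_blocks = [
    [("congruent", 700, True), ("congruent", 720, True), ("congruent", 740, True),
     ("incongruent", 800, True), ("incongruent", 850, True), ("incongruent", 900, True),
     ("incongruent", 2000, False)],
    [("congruent", 740, True), ("incongruent", 870, True)],
]
st = summarize_condition(st_blocks)
print(st["overall"], st["stroop_effect"])
```

Using the median within blocks limits the influence of single slow responses, while the incorrect 2000-ms trial in the hypothetical data is ignored entirely because only correct responses enter the calculation.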

Further, RTs and DT effects were inspected for outliers and tested for normal distribution using the Shapiro–Wilk test. Outliers were defined as values larger than the median plus 3 × interquartile range. One young adult was identified as an outlier for the DT effect and was therefore excluded from further analysis. Main and interaction effects in the RTs were examined by a 2 × 3 × 2 mixed ANOVA (stimulus type × condition × group). Main and interaction effects in the DT effects were analyzed by a 2 × 2 mixed ANOVA (condition × group). Motor performance parameters for the squatting exercise (repetitions, velocity, vertical displacement) and RPEs were subjected to a 2 × 2 × 2 mixed ANOVA (task condition × ground condition × group).

Levene’s test was administered to examine the homogeneity assumption. Greenhouse–Geisser corrections were performed when necessary (Geisser & Greenhouse, 1958). Significant main effects and interaction effects were further analyzed by a post hoc Tukey’s HSD (honestly significant difference) and Tukey–Kramer test due to unequal sample sizes across age groups. The calculation of Tukey’s HSD was also adjusted if corrections of degrees of freedom were necessary (Hays, 1994). Significant main effects of the ANOVA were analyzed by non-parametric tests in cases when the normality assumption was violated. All statistical calculations were carried out using SPSS Statistics software (IBM, version 28). Significance levels were set to α = 0.05 for all statistical analyses.

Results

During data analysis, 26 participants were excluded because of invalid data (i.e., voice key errors, improper squat performance, outliers) (see Fig. 1). Since the sample size changed considerably during the data analysis, we performed a sensitivity analysis to investigate the impact on the detectable effect size (Giner-Sorolla et al., 2022). A sensitivity analysis on the actual sample size of 30 participants with an α level of 0.05 and a power of 1 − β = 0.8 showed that only a large effect (ηp2 = 0.22) can be detected with this sample size. Due to the high average effect sizes given in the literature related to the variation of the task condition (ηp2 = 0.42), we assumed that the remaining sample was sufficient to detect differences between the conditions. Since the effect size for age differences is smaller (ηp2 = 0.14) (Lindenberger et al., 2000), the sample of n = 30 is too small to detect group differences.
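To make the comparison of effect sizes concrete, the following Python sketch converts the partial eta squared values named above into Cohen's f using the standard relation f = sqrt(ηp2 / (1 − ηp2)). Power software typically works with f, but whether the sensitivity analysis was run on this scale is an assumption here, and the names in the code are illustrative.

```python
from math import sqrt


def cohens_f(partial_eta_sq: float) -> float:
    """Convert partial eta squared to Cohen's f."""
    if not 0.0 <= partial_eta_sq < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return sqrt(partial_eta_sq / (1.0 - partial_eta_sq))


MIN_DETECTABLE = 0.22  # smallest detectable effect from the sensitivity analysis (n = 30)
EXPECTED = {"task condition": 0.42, "age group": 0.14}  # literature values cited in the text

for label, eta_sq in EXPECTED.items():
    detectable = eta_sq >= MIN_DETECTABLE
    print(f"{label}: eta_p2 = {eta_sq:.2f}, f = {cohens_f(eta_sq):.2f}, "
          f"detectable = {detectable}")
```

Because the conversion is monotonic, the ordering of the effects is the same on both scales: the expected condition effect lies above the detectable minimum and the expected age effect lies below it.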

Finally, the data of 13 young adults (seven females, Mage = 23.54, SD = 2.67 years) and 17 older adults (four females, Mage = 70.18, SD = 4.25 years) were analyzed (see Table 1). All variables met the assumption of normal distribution except for the DT effects. Nevertheless, the ANOVA and the post hoc Tukey–Kramer test were applied for analysis because of their robustness to non-normality (Driscoll, 1996; Knief & Forstmeier, 2021). Levene’s test (p > 0.05) confirmed the homogeneity assumption for all variables.

Table 1 Mean values, standard deviations (SD), and median of the sample’s characteristics

As can be seen from Table 1, older adults showed lower MoCA scores, t(28) = 3.89, p < 0.001, and lower attentional capacity (d2 test), t(28) = 5.15, p < 0.001, in the cognitive tests than young adults. No significant differences between groups were found in the working memory capacity (DS backward), t(28) = 1.93, p = 0.064; the chair-rise test performance, t(28) = 0.56, p = 0.583; the amount of sport activity per week, t(28) = 0.21, p = 0.575; and the experience in resistance training, t(28) = 0.09, p = 0.984. Specifically, resistance training or working out in a gym was explicitly reported as a regular sport activity by 23 participants, including 12 young adults and 11 older adults.

Cognitive Task

Participants responded to a different number of trials in each task and ground condition due to their different reaction times. On average, the number of trials in each block was 24.0 trials (SD = 2.2) in the ST standing condition, 23.3 trials (SD = 2.4) in the DT squatting stable condition, and 22.6 trials (SD = 2.4) in the DT squatting unstable condition. Response accuracy averaged 94.7% (SD = 5.8) in the ST condition, 92.0% (SD = 7.5) in the DT unstable condition, and 91.2% (SD = 8.4) in the DT stable condition, with few false responses (ST: M = 0.9, SD = 1.6%; DT stable: M = 0.8, SD = 1.0%; DT unstable: M = 0.8, SD = 1.6%). The remaining trials were excluded from further analysis due to invalid reaction times and/or missing color identification in the recorded audio track.

The 2 × 3 × 2 mixed ANOVA (stimulus type × condition × group) of RT revealed a significant main effect of condition, εGG = 0.81, F(1.61, 45.10) = 19.64, p < 0.001, ηp2 = 0.41 with an increase in RT from the ST standing (M = 788 ms) to the DT squatting conditions on stable (M = 857 ms) and unstable surface (M = 899 ms). All differences in the means were found to be significant referring to Tukey’s post hoc HSD, HSD1%, ST-DT stable = 40.5 ms; HSD1%, ST-DT unstable = 65.5 ms; HSD5%, DT stable-DT unstable = 41.3 ms (see Fig. 3). The main effect of stimulus type, F(1, 28) = 78.30, p < 0.001, ηp2 = 0.74, was significant with, on average, 120-ms faster RTs for congruent as compared to the incongruent trials. Moreover, the ANOVA showed an interaction effect between stimulus type and condition, F(2, 56) = 4.91, p = 0.011, ηp2 = 0.15, indicating that conditions affected participants’ responses to congruent and incongruent trials differently. In detail, Tukey’s HSD showed that the difference between congruent and incongruent trials (Stroop effect) was significant across all conditions (HSD1% = 33.1 ms), but decreased from the ST standing (M = 146 ms) to the DT stable (M = 120 ms) and the DT unstable (M = 104 ms) conditions. This can be seen in the approximating RT levels for congruent and incongruent trials in Fig. 3. Additionally, a significant main effect of group was found on RT, F(1, 28) = 9.97, p = 0.004, ηp2 = 0.26. Overall, older adults showed, on average, 125-ms longer RTs as compared to young adults. However, the mixed ANOVA did not reveal an interaction effect between stimulus type, condition, and group, F(2, 56) = 0.26, p = 0.775, ηp2 = 0.01. Moreover, there were neither significant interaction effects between stimulus type and group, F(1, 28) = 2.43, p = 0.130, ηp2 = 0.08, nor between condition and group, εGG = 0.81, F(1.61, 45.10) = 2.40, p = 0.112, ηp2 = 0.08.
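The relation between the reported F values, degrees of freedom, and partial eta squared can be checked with a short Python sketch. It uses the standard identities ηp2 = F · df_effect / (F · df_effect + df_error) and, for the Greenhouse–Geisser correction, a multiplication of both degrees of freedom by ε. The function names are illustrative, and the sketch is a plausibility check rather than a reproduction of the SPSS output.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its df."""
    return f_value * df_effect / (f_value * df_effect + df_error)


def gg_corrected_df(epsilon, n_levels, n_participants, n_groups):
    """Greenhouse-Geisser corrected df for a within-subject effect
    in a mixed design with one between-subject factor."""
    df_effect = n_levels - 1
    df_error = df_effect * (n_participants - n_groups)
    return epsilon * df_effect, epsilon * df_error


# Values reported in the text for the main effect of condition
# (3 condition levels, 30 participants, 2 age groups).
print(round(partial_eta_squared(19.64, 2, 56), 2))
print(gg_corrected_df(0.81, n_levels=3, n_participants=30, n_groups=2))
```

With the rounded ε = 0.81, the corrected degrees of freedom come out at roughly 1.62 and 45.36; the reported values of 1.61 and 45.10 are consistent with an unrounded ε of about 0.805. Because the correction scales both degrees of freedom by the same factor, ηp2 is unaffected by it.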

Fig. 3 Reaction time to congruent and incongruent trials across the single-task and stable and unstable dual-task conditions for a young and b older adults. DT, dual-task; OA, older adults; ST, single-task; YA, young adults. Error bars represent standard deviations

Dual-Task Effect

The 2 × 2 mixed ANOVA (condition × group) of the DT effect revealed a significant main effect of ground condition, F(1, 28) = 5.27, p = 0.029, ηp2 = 0.16, with, on average, a 5% larger negative DT effect in the unstable condition than in the stable condition. A non-parametric Wilcoxon test also showed a significant difference between conditions, Z = −2.47, p = 0.042. There was no significant main effect of group, F(1, 28) = 2.07, p = 0.162, ηp2 = 0.07, and no significant interaction between ground condition and group, F(1, 28) = 1.81, p = 0.190, ηp2 = 0.06. DT effects of young and older adults are displayed in Fig. 4 for both conditions.

Fig. 4 Dual-task effect in the Stroop task for both groups across the stable and unstable conditions. OA, older adults; RT, reaction time; YA, young adults. Error bars represent standard deviations

Motor Task

Mean values of the performance in the squatting exercise are presented in Table 2. A 2 × 2 × 2 mixed ANOVA (task condition × ground condition × group) on repetitions showed a main effect of ground condition, indicating that significantly more squats were executed in stable than in unstable conditions, F(1, 28) = 42.59, p < 0.001, ηp2 = 0.60, across both age groups. Moreover, the ANOVA revealed an interaction effect between task condition and ground condition, F(1, 28) = 6.29, p = 0.018, ηp2 = 0.18. In this respect, post hoc Tukey’s HSD test showed that the DT condition increased the number of repetitions in the unstable condition, HSD1% = 0.69. Participants were able to conduct significantly more repetitions on unstable ground while simultaneously performing the Stroop task. Moreover, the mixed ANOVA revealed an interaction effect between task condition and group, F(1, 28) = 9.59, p = 0.004, ηp2 = 0.26. As such, older adults performed significantly more repetitions in the DT condition as compared to young adults, HSD5% = 1.14. Furthermore, the interaction between ground condition and group was significant, F(1, 28) = 12.10, p = 0.002, ηp2 = 0.30. Tukey–Kramer post hoc analysis showed that older adults not only significantly reduced the number of repetitions from the stable to the unstable condition but also conducted significantly more repetitions on stable ground than young adults, HSD5% = 1.43. The mixed ANOVA revealed neither an effect of task condition, F(1, 28) = 1.76, p = 0.196, ηp2 = 0.06, nor of group, F(1, 28) = 0.30, p = 0.586, ηp2 = 0.01.

Table 2 Mean values and standard deviations for the variables of the motor task performance

As another result, movement velocity in the squatting performance differed significantly between ground conditions, F(1, 28) = 65.41, p < 0.001, ηp2 = 0.70, indicating that both groups reduced their movement velocity while squatting on the unstable device in comparison to squatting on stable ground (YA, − 10.3%; OA, − 19.9%). Additionally, the interaction between ground condition and group was significant, F(1, 28) = 6.52, p = 0.016, ηp2 = 0.19, and Tukey–Kramer’s post hoc test showed that ground condition had a significant effect on movement velocity for older adults only, HSD1% = 0.11 m/s. The mixed ANOVA revealed neither an effect of task condition, F(1, 28) = 0.92, p = 0.347, ηp2 = 0.03, nor group, F(1, 28) = 0.42, p = 0.523, ηp2 = 0.02.

The mixed ANOVA based on the vertical dumbbell displacement detected a significant main effect of task condition, F(1, 28) = 6.29, p = 0.018, ηp2 = 0.18. In the ST conditions, participants moved dumbbells on average 5 cm lower than in the DT conditions. Additionally, a significant main effect of ground condition was found, F(1, 28) = 37.13, p < 0.001, ηp2 = 0.57, indicating a reduced squat depth on the unstable surface. Moreover, there was a significant interaction effect between ground condition and group, F(1, 28) = 11.78, p = 0.002, ηp2 = 0.30. Post hoc Tukey’s HSD test revealed that older adults executed squats on the unstable surface more shallowly than on the stable surface, HSD1% = 0.06 m. Furthermore, older adults’ vertical dumbbell displacement was significantly shorter on the unstable device as compared to young adults, HSD5% = 0.05 m. In turn, young adults’ squat depth was not significantly affected by ground condition. There was no main effect of group, F(1, 28) = 0.97, p = 0.333, ηp2 = 0.03.

Rating of Perceived Exertion

Ratings of perceived exertion (RPE) differed significantly between task conditions, F(1, 28) = 6.82, p = 0.014, ηp2 = 0.20, and between ground conditions, F(1, 28) = 17.13, p < 0.001, ηp2 = 0.38. The DT condition (M = 13.5, SD = 1.6) and the unstable ground condition (M = 13.6, SD = 1.5) were rated as more demanding than the ST condition (M = 13.0, SD = 1.2) and the stable ground condition (M = 12.8, SD = 1.4). There was no main effect of group, F(1, 28) = 0.79, p = 0.382, ηp2 = 0.03.

Discussion

The significant increase in reaction times (RT) during a verbal color-word Stroop task from the single-task (ST) to the dual-task (DT) conditions suggests that attentional resources are increasingly required during squatting exercises on stable and unstable surfaces. Additionally, the higher negative DT effect on unstable as compared to stable surfaces confirms our hypothesis that surface instability increases cognitive demands while squatting. However, contrary to our previous assumption, DT effects did not differ significantly between young and older adults.

Cognitive Task and Dual-Task Effect

The increase in the color-naming latency from a rested standing posture (i.e., in ST) to squatting exercises on stable and unstable grounds (i.e., in DT) suggests that the attentional resources required for the execution of squats compete with the cognitive task for the limited processing capacity (Huang & Mercer, 2001). This finding is consistent with that of Herold et al. (2020) who also found a decrease in cognitive task performance while squatting. Additionally, the significant increase of the DT effect and RTs from the stable to the unstable ground condition demonstrates that more attentional resources are needed to execute dynamic resistance exercises with increasing demands for postural control (Woollacott & Shumway-Cook, 2002).

So far, only a few studies have investigated the influence of various degrees of task complexity on cognitive task performance in dynamic motor tasks (Agmon et al., 2014; Lindenberger et al., 2000; Simoni et al., 2013). Lindenberger et al. (2000) found performance decrements in a cognitive task and increasing DT effects with increasing complexity in a walking task for young and older adults. However, our study showed smaller negative DT effects (stable, − 8.7%; unstable, − 14.0%) for the cognitive task and a smaller effect size (ηp2 = 0.16) compared to those obtained in walking tasks (oval track, DT effect = − 12.5%; aperiodic track, DT effect = − 29.3%; ηp2 = 0.21) (Lindenberger et al., 2000), possibly owing to a different cognitive task or higher attentional demands of locomotion tasks. The DT effect of the stable condition of our study was also smaller than the DT effect found by Herold et al. (2020) (back squats, DT effect = − 18.8%).
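To make the sign convention of the percentages above concrete, the following Python sketch computes a relative DT effect. It assumes the widely used formulation in which the change from ST to DT is expressed relative to ST and negated for measures where higher values mean worse performance (such as RT), so that negative values denote a performance decrement. The exact formula used in this study is defined in the methods and is not reproduced here; the function name and the input values are hypothetical.

```python
def dual_task_effect(st_value: float, dt_value: float, higher_is_worse: bool = True) -> float:
    """Relative dual-task effect in percent; negative values indicate a decrement.

    Assumed convention: (DT - ST) / ST * 100, negated when higher raw
    values reflect worse performance (e.g., reaction time).
    """
    if st_value == 0:
        raise ValueError("single-task value must be non-zero")
    change = (dt_value - st_value) / st_value * 100.0
    return -change if higher_is_worse else change


# Hypothetical reaction times in ms (not data from this study).
st_rt = 1000.0
dt_rt = 1100.0
print(f"DT effect: {dual_task_effect(st_rt, dt_rt):.1f}%")
```

Under this convention, a slowing from 1000 ms to 1100 ms corresponds to a DT effect of −10%, which is how the negative values reported for the stable and unstable conditions can be read.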

Furthermore, studies measuring brain activity during various balance tasks showed increased brain activity with increasing metastability demands in brain areas relevant to information processing such as the frontal, central, and parietal cortex (Büchel et al., 2021; Hülsdünker et al., 2015). In line with these findings, recent investigations on cortical processing during squats on stable and unstable surfaces indicated differences in information processing to be associated with differences in balance demands (Lehmann et al., 2022). Consequently, according to Wood’s (1986) task complexity construct, squats on unstable devices are assumed to have a higher task complexity than squats on stable ground as a result of a larger number of information cues to be processed. These information cues may be visual, proprioceptive, and kinesthetic information transmitted by the mechanoreceptors of the lower limb or the vestibular system in order to ensure balance and postural control under unstable surface conditions (Woollacott & Shumway-Cook, 2002).

The most interesting finding was that reaction times did not increase consistently across congruent and incongruent trials of the Stroop task. Task complexity seems to affect cognitive performance differently depending on trial type. Specifically, the Stroop interference effect (difference between congruent and incongruent trials) decreased with increasing task complexity from ST to DT conditions. An explanation for this outcome may be provided by the Load Theory of Attention (Lavie et al., 2004), which distinguishes between two types of load that influence processes of selective attention: cognitive load and perceptual load. Based on this theory, cognitive load is provoked by cognitive control processes that require higher cognitive functions (e.g., working memory) to successfully perform the current task according to the task definition. In high cognitive load situations, due to the limited processing capacity, task-irrelevant distractors are perceived but not ignored as efficiently as in low cognitive load situations, resulting in pronounced interference effects. By contrast, perceptual load, referring to the load caused by perceptual information processing, prevents the perception of irrelevant information in situations of high perceptual load because insufficient processing capacity remains. As a result, interference effects are reduced (Lavie et al., 2004). For example, if squatting induced cognitive load, the inhibition of distracting information in the incongruent trials (i.e., the color name) should be less efficient while squatting than while standing. Consequently, this would lead to increasing RTs in the incongruent trials and a more pronounced interference effect while squatting. In contrast, the results of this study showed a decreasing Stroop interference effect with increasing task complexity, which may originate from an increased perceptual load while squatting on stable and unstable surfaces. As mentioned above, in situations of high perceptual load, distracting information is supposed to be perceived less, resulting in reduced interference effects (Lavie et al., 2004). Contrary to the interpretation by Herold et al. (2020) that squatting exercises require higher order cognitive functions, our findings suggest that squatting induces perceptual load rather than cognitive load, with increased processing of perceptual information under surface instability. Our results support previous findings on the importance of visual perception for postural control on compliant and movable surfaces in older adults (Lord & Menz, 2000; Shumway-Cook & Woollacott, 2000).
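The Stroop interference effect discussed above can be illustrated with a short Python sketch. It assumes the effect is computed as the mean RT of incongruent trials minus the mean RT of congruent trials within one condition; the function name and all reaction times are hypothetical and serve only to show how a smaller interference effect in DT than in ST would appear.

```python
from statistics import mean


def stroop_interference(congruent_rts: list[float], incongruent_rts: list[float]) -> float:
    """Interference effect: mean incongruent RT minus mean congruent RT (same unit as input)."""
    if not congruent_rts or not incongruent_rts:
        raise ValueError("both trial types need at least one reaction time")
    return mean(incongruent_rts) - mean(congruent_rts)


# Hypothetical reaction times in ms (not data from this study).
single_task = {"congruent": [700.0, 720.0, 710.0], "incongruent": [820.0, 840.0, 830.0]}
dual_task = {"congruent": [800.0, 820.0, 810.0], "incongruent": [880.0, 900.0, 890.0]}

st_effect = stroop_interference(single_task["congruent"], single_task["incongruent"])
dt_effect = stroop_interference(dual_task["congruent"], dual_task["incongruent"])
print(f"ST interference: {st_effect:.0f} ms, DT interference: {dt_effect:.0f} ms")
```

In this made-up example, RTs rise in both trial types from ST to DT, but congruent trials slow more than incongruent trials, so the interference effect shrinks. This is the pattern that the perceptual load account would predict.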

Future research on cognitive and perceptual load during resistance exercise is required, including studies on (neuro)physiological measures of cognitive demand such as electroencephalography, heart rate variability, blink rate, or pupillary response (Perrey, 2022). In particular, pupil diameter has been considered to be a valid indicator of task complexity and cognitive effort (Beatty & Lucero-Wagoner, 2000). In fact, studies on motor tasks with varying task complexity show increases in pupil diameter with increasing task complexity (Kahya et al., 2018; Saeedpour-Parizi et al., 2020; White & French, 2017).

Overall, older adults showed slower reaction times as compared to young adults. These findings are in line with the results of a previous study on the visual-verbal Stroop task performance of young and older adults (Davidson et al., 2003). Furthermore, older adults’ increased response latency corresponds with findings on the age-related decline of processing speed and inhibitory control (Bugg et al., 2007; Ebaid et al., 2017; West & Alain, 2000).

Contrary to our expectations, DT effects did not significantly differ between age groups despite lower cognitive performance scores in older adults during the pretesting session (e.g., MoCA score, d2 test). However, Krampe and Baltes (2003) note that psychometric tests are of only limited value as measures of mental capacity for assessing individuals’ cognitive resources. This lack of group differences may be related to the small sample size, which was sensitive only to large effects (ηp2 = 0.22). Moreover, the rigorous selection of the individuals attending our study possibly excluded other aspects that may influence the DT performance in older adults, such as physical and functional capacity (Campos-Magdaleno et al., 2022). As seen in Table 1, older participants had similarly high levels of physical activity and experience in resistance training as compared to young adults. In addition, older adults were more likely to report using unstable surfaces for exercise than younger adults in the initial survey. Hence, our active older adults might have been used to similar situations and, therefore, already acquired appropriate strategies to cope with the test situation in our study (Engelhard & Kleinert, 2006). In addition, the effect of age on DT performance has been described as less robust than the effect of pathological aging. For instance, Alzheimer’s disease patients show distinct impairment in dual-task performance (Logie et al., 2004; Rapp et al., 2006). Nevertheless, our data for the Stroop task and the DT effects indicate that cognitive-motor interference was higher during squatting on unstable ground in both age groups. The higher ratings of perceived exertion for the squatting exercise on unstable ground, when compared to stable ground, additionally emphasize the various physical and cognitive demands when exercising on that surface. However, a shortcoming of our study is the lack of an objective measure of effort (e.g., heart rate) to assess the acute physical effects of the squatting exercises (Amico et al., 2021; Herold et al., 2020).
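The sensitivity threshold mentioned above (ηp2 = 0.22) can be related to Cohen’s conventional benchmarks by converting it to Cohen’s f via f = sqrt(ηp2 / (1 − ηp2)). The following Python sketch shows the conversion; it is an illustration rather than part of the original sensitivity analysis, the function name is hypothetical, and the benchmark of f = 0.40 for a large effect follows Cohen’s (1988) convention.

```python
import math


def cohens_f(partial_eta_sq: float) -> float:
    """Convert partial eta squared to Cohen's f."""
    if not 0.0 <= partial_eta_sq < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(partial_eta_sq / (1.0 - partial_eta_sq))


LARGE_EFFECT_F = 0.40  # Cohen's conventional benchmark for a large effect

f_detectable = cohens_f(0.22)  # smallest detectable effect stated in the text
print(f"Cohen's f = {f_detectable:.2f}, large by convention: {f_detectable >= LARGE_EFFECT_F}")
```

The resulting f of about 0.53 exceeds the conventional threshold for a large effect, consistent with the statement that the sample could detect only large effects.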

Motor Task

Squat performance differed between conditions and age groups. Older adults in particular conducted squats with slower movement velocity (approx. 0.16 m/s), with a reduced vertical displacement (approx. 7 cm), and with two to three fewer repetitions on the unstable device as compared to the stable surface. This may be associated with the age-related decline in postural reserve, a capability that allows people to respond effectively to postural demands (Yogev-Seligmann et al., 2012). In comparison to stable conditions, participants were constantly urged to adjust postural stability and to counteract higher vertical and transversal ground reaction forces while squatting on an unstable device (Eckardt et al., 2020a; Lehmann et al., 2022). Older adults may cope with increased postural demands by conducting squats more slowly and with less vertical displacement than young adults did. These motor performance decrements as a consequence of increasing task complexity accord with findings in back squatting on unstable surfaces (Drinkwater et al., 2007) and tend to be a common strategy known from walking tasks with increasing complexity (Lindenberger et al., 2000).

Although participants who showed a large difference in the number of repetitions between the ST and DT conditions were excluded from further analysis, the number of repetitions in the unstable condition and vertical displacement still differed between task conditions for the remaining participants. Notably, for the unstable surface condition, older adults executed, on average, one more repetition in the DT condition than in the ST condition, mainly due to smaller vertical displacements combined with faster movement velocity. Alterations in older adults’ motor performance were also found in DT walking with an additional Stroop task (Wollesen et al., 2016). They are associated with the demands of the concurrently executed cognitive task and may be explained by theoretical models regarding DT effects. According to the limited resource hypothesis (Wickens, 1980) and the cross-domain competition model (Lacour et al., 2008), motor and cognitive tasks compete for the same pool of limited attentional resources. This means that when both tasks are executed simultaneously, the limited amount of resources may be exceeded, leading to performance decrements in one or both tasks. The slowing in RT as well as the reduction in squat depth found in our study suggests that the unstable condition exhausted the available resources, resulting in performance decrements in both the cognitive and the motor tasks.

In contrast, young adults slightly reduced movement velocity and thus performed fewer repetitions in DT conditions, but these motor performance adaptations were not significant. Since we only used data from an IMU attached to a dumbbell to record squatting performance, possible other motor adaptations of the participants cannot be evaluated. For example, as found in a previous kinematic analysis (Wei et al., 2014), older people tend to lower the weights by additionally leaning and bending their upper body forward while squatting.

Overall, the motor task placed high demands on the participants’ postural control because only a few reported experience with metastable training. In particular, some older adults had to be excluded from the study directly after pretesting because they stood unsteadily on unstable ground. Furthermore, more older adults than young adults had to be excluded during data analysis due to insufficient squat performance. This suggests that the excluded older adults in particular had difficulty adequately allocating their available attentional resources to both tasks.

Limitations

A major limitation of this study relates to a possible selection bias in the participant recruitment process. Neither age group is representative of the general population of its age, as participants were highly physically active and experienced in resistance training. In addition, neuropsychological conditions (e.g., ADHD) cannot be ruled out in this sample, as they were not explicitly tested. Therefore, conclusions on attentional resources required for squatting can only be derived for the study sample. To include a broader, less active population, this study may be repeated with less challenging unstable devices such as foam pads.

One further weakness of this study relates to the measurement of the cognitive task, both because of the small number of trials per condition and because of the recording of RTs via a voice key. As a result of increased physical effort and increased breathing during the exercise, the participants’ voice key responses were subject to several sources of error. For example, participants may have triggered the voice key without naming a color, the color name could not be identified in the recorded audio track, or participants’ responses were not loud enough. In such cases, participants tried to ignore these voice key errors and continued with the task. Voice key errors occurred in both conditions (total number: stable, 113; unstable, 95). However, this disturbance of participants’ concentration and their refocusing on the task could have also accounted for the increased attentional resources observed while squatting on stable and unstable ground conditions. Other studies using the Stroop task to assess DT effects during walking evaluated the number of correct responses and accuracy scores on a slightly larger number of trials per block (32–40 trials) (Bishnoi et al., 2022; Chaparro et al., 2020; Wollesen et al., 2016). It should be noted that squats with additional weights cannot be repeated as frequently, as they are more demanding than walking. In this study, it was not intended to measure the effect of fatigue on RT, as fatigue would be an additional and undesirable source of variance. Therefore, we were limited in the duration of recording. Nevertheless, despite the small number of trials, we assume that measuring RT is useful. Herold et al. (2020) argued that the DT effect found in their study was quantitative rather than qualitative because the accuracy of the responses remained unchanged. In addition, both Wollesen et al. (2016) and Chaparro et al. (2020) noted the lack of RT measurement as a limitation of their studies. Furthermore, Chaparro et al. (2020) suggested that the use of accuracy as a measure of cognitive performance may not have been sensitive enough to capture motor-cognitive interference. They suggested that future studies should use variables that are more sensitive, such as RT. Thus, our study represents an attempt to capture RT in a visual-verbal Stroop task during a physically demanding motor task.

Furthermore, the difference in the squatting exercise performance between task and ground conditions can be considered another limitation. In principle, the primary motor task should be performed in the same way in the ST and DT conditions in order to appropriately assess the attentional resources of the motor task based on the performance decrements in the additional cognitive task (Abernethy, 1988; Huang & Mercer, 2001). Indeed, this is a central problem of the dual-task paradigm because the influence of the additional cognitive task on the motor task can hardly be prevented (i.e., cognitive-motor interference) (Plummer et al., 2013). The performance difference in the motor task between settings and conditions could be reduced by introducing a metronome providing a fixed rhythm for the concentric and eccentric phases (Eckardt et al., 2020a). However, a fixed rhythm may counteract intuitive, automatized movement patterns and cause additional attentional demands (Amico et al., 2021). Nonetheless, the increase in RT and the difference in DT effect with a concomitant adaptation of motor performance suggest that squatting on an unstable surface increases cognitive demands.

Conclusions

Our study found that surface instability increases the attentional resources required during squatting exercises. This finding is considered to emerge from increased perceptual load for both young and older adults. Therefore, unstable devices and free weights could provide a means to raise cognitive demands during resistance training. This cognitive engagement during exercise may be a possible indication of the positive effects of so-called metastable resistance training (MRT) versus machine-based resistance training on executive function in older adults as suggested by Kibele et al. (2021).

Notably, older adults in this study did not show increased attentional demands compared to young adults during squatting exercises with different degrees of metastability. Further research is needed to better understand the mental effort involved in motor tasks of varying complexity. Consequently, the cognitive demands of physical training and its cognitive effects could be better tailored to individual needs and goals. Nevertheless, resistance exercises with different degrees of metastability should be included in older adults’ daily routine to challenge not only the muscles but the brain as well.