Mindfulness is often defined as an accepting, open-minded attention to the present moment (Creswell, 2017). Mindfulness interventions in children typically consist of in-person trainings in mindfulness practices such as breath awareness, body scans, and journaling (Kabat-Zinn, 2003). These practices are intended to help children be fully aware of their experiences and to adopt an open and accepting stance toward them. Although mindfulness practices vary, they share a focus on training attention, building emotion regulation skills to effectively manage stress, and gaining self-knowledge (Greenberg & Harris, 2012). Further, some practices aim to build empathy and compassion (Kabat-Zinn, 2005). Mindfulness interventions in children have shown a host of benefits, including reduced psychopathology (Zoogman et al., 2015), reduced stress and negative feelings (Bauer et al., 2019), increased prosociality (Flook et al., 2015), improved attention (Bauer et al., 2020), and improved academic outcomes (Bakosh et al., 2016). A meta-analysis of randomized controlled trials (RCTs) in children and adolescents found decreases in anxiety relative to active control groups (Dunning et al., 2022). Due to these benefits, mindfulness has increasingly been used in schools to prevent and reduce stress and stress-related mental health and behavioral problems (e.g., mindfulness-based social emotional learning, MBSEL; Greenland, 2010).

Mindfulness may be beneficial for children because it teaches self-regulatory skills, which have been linked with both readiness to learn (Diamond & Lee, 2011; Mrnjaus & Krneta, 2014; Schonfeld et al., 2015) and decreased childhood stress (Lupien et al., 2009; Miller et al., 2011; Repetti et al., 2002). Two key types of self-regulatory skills are emotion regulation, the ability to monitor, evaluate, and modify emotional reactions (Thompson, 1994), and attention regulation, the ability to flexibly adjust attention (Posner, 2011). Emotion regulation in children is associated with better academic performance, higher social competence, and fewer behavior problems, among other outcomes (Beauchaine et al., 2007; Graziano et al., 2007; Spinrad et al., 2006). Attention regulation is related to lower rates of childhood ADHD, mood disorders, and anxiety (e.g., Emerson et al., 2005), greater interpersonal success (Rotenberg et al., 2008), and future self-regulatory abilities and outcomes (Friedman et al., 2007; Miller et al., 2011). Despite the positive long-term impacts of self-regulation, children are rarely explicitly taught how to pay attention and regulate their emotions (Semple et al., 2017). Critically, mindfulness trainings in children teach these regulatory skills “from the inside out” by focusing on mental experiences, such as thoughts, emotional states, the breath, and other bodily sensations (Semple et al., 2010). Mindfulness interventions also incorporate attentional training, e.g., repeated gentle redirection of attention to the breath (Lutz et al., 2008). In summary, mindfulness training may lead to the development of healthy self-regulation in children, resulting in improvements in academic outcomes (Bakosh et al., 2016), social outcomes (Flook et al., 2015), and mental health outcomes (Dunning et al., 2022).

Mindfulness instruction for children has typically been delivered in schools and other settings with an adult instructor controlling the content, pace, and duration of instruction. Another potential approach is to use remote, app-based mindfulness programs with children, which could allow for more widespread mindfulness instruction that reaches many diverse children at low cost. App-based interventions may be particularly useful for reaching children and adolescents, who are increasingly involved with the digital world. The possibility of such a highly scalable and wide-reaching intervention is especially salient given evidence for increasing risk for mental health difficulties in children in the United States (Centers for Disease Control and Prevention [CDC], 2022). A recent survey found that 95% of adolescents own smartphones (Anderson & Jiang, 2018), and they spend over 2.5 hr per day on their phones (Rideout, 2015). Some children and adolescents already seek mental health support on their phones (Rideout & Fox, 2018). Indeed, a review of 71 digital health interventions in children and adolescents found good retention (Liverpool et al., 2020), and another review identified improvements in mental health outcomes (e.g., depression and anxiety; Grist et al., 2019), although many of these interventions were internet-based CBT.

Beyond feasibility, an important open question is whether app-based mindfulness interventions are effective for children. Mindfulness apps usually deliver content in short (e.g., 10-min) recordings, contrasting with in-person workshops where sessions may be longer. Despite this difference, app-based interventions have been efficacious in adults. App-based interventions have been shown to decrease distress (Goldberg et al., 2020; Hirshberg et al., 2022), improve trait mindfulness (Flett et al., 2019), and improve various facets of well-being including mood (Economides et al., 2018) and positive affect (Mahlo & Windsor, 2021). A meta-analysis of 66 smartphone intervention studies found significant decreases in stress and in anxiety and depression symptoms (Linardon et al., 2019).

To date, no quantitative studies have been conducted on the use of app-based mindfulness interventions in pre-adolescent children (see Nunes et al., 2020; Puzia et al., 2020; Tunney et al., 2017 for qualitative reports). However, several studies have been conducted with adolescents, with positive results. A pilot study of a 6-week mindfulness app intervention in a normative population of adolescents found that adolescents who used the app reported healthier weight-related behaviors (Turner & Hingle, 2017). A larger study with 80 adolescents found that the app intervention reduced rumination, with effects persisting for up to 12 weeks after the intervention (Hilt & Swords, 2021). Finally, a 3-week RCT with 152 adolescents (with a mood-monitoring control intervention) found that the mindfulness intervention resulted in greater reductions in rumination compared to the control condition (Hilt et al., 2023). This research suggests that app-based mindfulness interventions are effective for adolescents, and raises the question of whether and under what circumstances such interventions are effective for pre-teen children.

While all studies so far in children have tested the efficacy of mindfulness interventions using trained adult instructors, in practice, many schools choose to implement mindfulness instruction using app-based programs. One such program, Inner Explorer, consists of mindfulness practices recorded by a diversity of speakers, and was developed for multiple age ranges including 8–10 year olds. Inner Explorer has been implemented in over 3,000 schools with 2 million students across 50 states (Inner Explorer, 2023). In one study, 383 students in grades 1–5 were randomized by classroom to either Inner Explorer, consisting of 10 weeks of daily 10-min recordings, or waitlist control (Bakosh, 2013). Teachers endorsed high adherence (completing 95% of sessions) and no disruption to daily routines. Students showed improvements in math scores as well as grade point average when compared to the waitlist control. Further analysis of these data showed particular academic improvements in a subgroup of students with disabilities (Dunlap, 2022). A non-randomized study in 191 third-graders also showed academic improvements, and, additionally, decreases in behavioral disruptions (Bakosh et al., 2016). However, two studies without control groups found no pre-post changes in academic, mental health, or behavioral outcomes (Goodman, 2019; Strickland, 2023). Thus far, there have been no randomized active-controlled trials of Inner Explorer, and it has only been implemented in school settings. In general, there is sufficient feasibility and acceptability evidence to support the use of Inner Explorer with children, and the hands-off audio recordings may be suitable for applications outside the classroom.

In the present study, we sought to rigorously test whether Inner Explorer delivered remotely is beneficial for children, and thus shed light on the effectiveness of app-based mindfulness interventions for children generally. In addition, one of our motivations for remote administration was the COVID-19 pandemic and associated restrictions. From July 2020 to January 2022, we administered a remote mindfulness intervention using the Inner Explorer app (Inner Explorer, 2023) to children from ages 8–10 years. Children were recruited to participate in a remote at-home audiobook program aimed at enhancing reading skills and then randomized to one of three conditions: (1) the audiobook program (Audiobook-only condition); (2) the audiobook program with additional personalized support (Audiobook-scaffolded condition); or (3) the mindfulness program (Mindfulness condition) (Ozernov-Palchik et al., 2022). We followed recommendations for a rigorous RCT design (Davidson & Kaszniak, 2015) including an active control condition, intent-to-treat analyses, and preregistration of hypotheses. Our main hypothesis was that the children in the mindfulness intervention would experience greater decreases in stress and anxiety than children in the audiobook interventions. We also assessed a range of other outcomes including negative affect, which may be reduced in children with greater trait mindfulness (Treves et al., 2023) and has been reduced in school mindfulness programs for children (Vickery & Dorjee, 2016).

Of note, this study was designed to evaluate the effects of remote audiobook interventions, and mindfulness was included as an active control condition. Mindfulness may be a suitable control for an audiobook intervention, as both interventions involve listening to recordings over a period of multiple weeks, but the mindfulness intervention was not hypothesized to influence reading skills. We believe this comparison was equally apt for studying whether changes in mental health outcomes like stress and anxiety were specific to the mindfulness intervention as the reading-related interventions were not hypothesized to enhance mental health outcomes. For context, the interventions took place during the height of the COVID-19 pandemic in the US, when social and academic restrictions were in place.

Method

Participants

We analyzed the pre-intervention (baseline) and post-intervention data from children and their parents in a remote audiobook intervention study, collected from July 2020 through January 2022. Third and fourth-grade children across the United States were recruited through Facebook ads, flyers, school partnerships, and through word of mouth. We aimed to recruit children from lower socioeconomic status (SES) backgrounds and therefore reached out to school districts with high percentages of students eligible for free/reduced lunch and targeted Facebook ads to lower income zip codes across the country (for more on recruitment strategies see Ozernov-Palchik et al., 2022). As a number of schools we contacted had significant numbers of Spanish-speaking families, we provided parents with the opportunity to indicate whether Spanish was their preferred language for communication. The research team included bilingual staff, fluent in Spanish, who translated all written parent communication and assessments into Spanish, and were available to communicate in Spanish with parents who indicated that was their preferred language.

A total of 1020 participants were assessed for eligibility, and 314 were randomized into the three conditions of the study: Mindfulness (n = 101), Audiobook-only (n = 105), and Audiobook-scaffolded (n = 108) conditions (Fig. 1). Of the 314 participants randomized, 279 completed pre-test and were given the interventions. We collected demographic information regarding child gender, age, grade, mental health diagnosis, parent education level and annual household income. Parents reported their child’s gender, and we did not inquire about sex assigned at birth. Therefore, we utilized gender to obtain scores for the self-report measures that were normed by sex.

Fig. 1 CONSORT diagram

The 279 children (157 male) with demographic information were on average 9 years, 5 months old (SD = 6 months, range 8 years, 0 months to 10 years, 5 months). The majority of children came from the states of Georgia, Massachusetts, California, and Texas. The median income of our obtained sample was US$80,000–120,000. Median income in the United States in 2021 was US$70,784 (census.gov). Demographics by group are shown in Table 1.

Table 1 Demographics of the total sample

Procedure

The entirety of the study was conducted remotely. Potentially interested parents filled out an eligibility screener and were invited to participate in a pretest session (pre-intervention) if initial inclusionary criteria were met (child speaks English proficiently, has normal or corrected-normal hearing, has access to a computer or tablet at home, and has a parent that speaks English or Spanish). Further inclusionary criteria required that the child achieve a standard score of 80 or above on the Kaufman Brief Intelligence Test (KBIT) Matrices subtest, a standardized nonverbal IQ assessment (Kaufman, 2004).

Eligible participants whose parents opted in to the study were block randomized into one of three experimental conditions, each lasting 8 weeks. We assigned groups based on a list of block-randomized integers 1–3, sampled from a normal distribution.
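As an illustration of block randomization in general, the sketch below shuffles the three condition codes within each block so that group sizes stay balanced as participants are enrolled. It is not the study's allocation script: the block size of three, the seed, and the function name `block_randomize` are assumptions made for this example, and the code-to-condition mapping simply follows the order in which the conditions are listed in the Introduction.

```python
import random

# Condition codes 1-3; the mapping order is an assumption for illustration.
CONDITIONS = {1: "Audiobook-only", 2: "Audiobook-scaffolded", 3: "Mindfulness"}


def block_randomize(n_participants, seed=0):
    """Return one condition code (1-3) per participant.

    Every complete block of three contains each code exactly once,
    in random order, so group sizes never differ by more than one.
    """
    rng = random.Random(seed)  # fixed seed only for reproducibility
    allocation = []
    while len(allocation) < n_participants:
        block = list(CONDITIONS)  # [1, 2, 3]
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]


allocation = block_randomize(314)
counts = {code: allocation.count(code) for code in CONDITIONS}
```

Under this scheme the three groups can differ by at most one participant, which is the main reason for using blocks rather than simple randomization in small trials.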

We obtained informed consent from the parents of all participants as well as assent from the children over Zoom before they participated in the study.

Interventions

The mindfulness intervention consisted of an 8-week Mindfulness-Based Stress Reduction (MBSR)-inspired curriculum, delivered remotely through the Inner Explorer smartphone app (Inner Explorer, 2023). Inner Explorer customized their 90-day program to 40 days for the purposes of best comparison to the 8-week audiobook interventions (as well as MBSR and the majority of mindfulness interventions). Parents of children assigned to this condition were instructed to encourage their children to engage in the practices as laid out in the Inner Explorer app. The program entailed approximately 10 min of practice per day, five times a week, over the course of the 8-week-long intervention phase. Types of practices include focused attention to breath, body scans, and gratitude and loving-kindness practices (also called ‘heart-centered’ practices by Inner Explorer). Support from research staff within the Mindfulness condition was minimal and was limited to video materials about the study, the app, and mindfulness in general, technical help if there were problems downloading or using the app, and weekly one-page digests briefly summarizing the past week’s practices and recommending an optional mindfulness exercise for the weekend. Parents were instructed to have their children follow the 40 practices in order and were advised to ‘catch up’ on the week’s practices over the weekend if needed. All participants were given continued access to the app at the end of the study.

There were two reading interventions, Audiobook-only and Audiobook-scaffolded. Children in the Audiobook-only intervention received unlimited access to audiobooks via the Learning Ally web-based platform (Learning Ally, 2023), curated based on their listening comprehension level. Children in the Audiobook-scaffolded intervention also received audiobooks and recommendations, as well as one-on-one 30-min online sessions with a learning facilitator twice per week, focused on improving their listening comprehension strategies and supporting their intervention adherence.

For the mindfulness intervention, we recorded number of days practiced from the Inner Explorer app. Participants were assigned one ten-minute practice per weekday for 8 weeks, so perfect adherence would be 40 days practiced. We also collected parental reports of number of practices completed with a categorical, ordinal question. For the reading intervention, we collected listening minutes from the Learning Ally app.

Measures

We obtained child self-reports via questionnaires administered over Zoom, a secure video chat platform, during testing sessions that involved a battery of language and cognitive assessments (Ozernov-Palchik et al., 2022). Experimenters read the assessment questions to the children who followed along on their computer screen and answered aloud. Testing sessions typically ranged between 1–2 hr, and there were 2–3 pretest sessions, as well as 1–3 post-test sessions, over the course of the study. Participants found out their group assignment before pre-testing. Testers were blind to group assignment during pretest and during post-test up until the final questionnaire (which was only administered to children in the reading interventions). When possible, the same tester was assigned to a child for multiple sessions. We also obtained parent-reports of child behavior through online questionnaires administered remotely through Research Electronic Data Capture (REDCap). REDCap is a secure, web-based application designed to support data capture for research studies (Harris et al., 2009). Parents were compensated $5.00 for every questionnaire completed (10 total questionnaires possible over the course of the study). Children were compensated $20.00 per hour of testing. Questionnaires were available in Spanish and English for parents and in English for children.

Child Measures

Perceived Stress

To measure stress, we administered the Perceived Stress Scale for Children (PSS-C) (White, 2014). This self-report measure consists of 13 items on a 4-point Likert scale ranging from never to a lot. The questions assess perceived stress related to time pressure, academic performance, and relationships with family and friends. A higher score indicates a greater level of perceived stress. For this study the Cronbach’s alpha was 0.60, and the McDonald’s omega was 0.52. Perceived stress was one of the two primary outcomes in our preregistration.
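For readers unfamiliar with the reliability coefficients reported for each scale, Cronbach's alpha follows the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The sketch below is a minimal illustration with made-up responses, not study data; the function name `cronbach_alpha` and the toy matrix are assumptions for the example.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))  # transpose: one tuple per item
    sum_item_vars = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_vars / total_var)


# Toy data: 5 respondents x 3 items (hypothetical values).
toy = [[1, 2, 2], [2, 2, 3], [3, 4, 4], [4, 4, 3], [2, 3, 3]]
alpha = cronbach_alpha(toy)
```

Alpha approaches 1 as items covary more strongly and falls toward 0 when item scores are largely unrelated.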

Depression & Anxiety

To measure child-reported anxiety and depressive symptoms we used the 25-item Revised Child Anxiety and Depression Scale (RCADS-25-C; Ebesutani et al., 2012), which contains the following scales: Anxiety Total scale, Depression Total scale, Total scale. We omitted the Total scale from analysis per our preregistration. Items are based on Diagnostic and Statistical Manual of Mental Disorders–Fourth edition (DSM-IV) criteria (American Psychiatric Association [APA], 1994). Items in the Anxiety Total scale measure a “broad anxiety” dimension, assessing a variety of anxiety symptoms. Items in the Depression Total scale measure symptoms of major depressive disorder. Higher scores represent a greater degree of symptoms. All items are rated on a 4-point Likert scale ranging from never to always. Raw summative scores from each of the scales are converted into T-scores normed by age and sex, with the following ranges: low severity (0–64), medium severity (65–70), and high severity (> 70). T-scores of medium severity are considered to be “borderline clinical threshold” whereas T-scores of high severity are considered to be “above clinical threshold” (Chorpita et al., 2000). For this study the Cronbach’s alpha values were 0.78 and 0.69 for the Anxiety Total and Depression Total scales respectively, and the McDonald’s omega values were 0.81 and 0.72, respectively. The Anxiety Total scale was one of the two primary outcomes in our preregistration.
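The severity bands described above can be expressed as a simple lookup. This sketch uses only the cut-offs stated in the text; the function name `rcads_severity` is hypothetical, and the conversion from raw scores to T-scores requires the published norm tables, which are not reproduced here.

```python
def rcads_severity(t_score):
    """Map an RCADS T-score to the severity band described in the text."""
    if t_score > 70:
        return "high"    # "above clinical threshold"
    if t_score >= 65:
        return "medium"  # "borderline clinical threshold"
    return "low"


bands = [rcads_severity(t) for t in (50, 65, 70, 71)]
```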

Negative Affect

To measure children’s affect we administered a brief 13-item questionnaire (Panorama Education, 2015). Children rated the degree to which they felt each of thirteen emotions in the past week, using a 5-point Likert scale ranging from almost never to almost always. We calculated a Negative Affect score from the seven items asking about negative affect: mad, bored, lonely, sad, nervous, worried, and afraid. This Negative Affect factor was supported by confirmatory factor analyses (see Supplement). The composite score ranges from 7 to 35, with higher scores representing more negative affect. The Cronbach’s alpha for this scale was 0.71, and the McDonald’s omega was 0.78.

Mindfulness

To assess child trait mindfulness we administered the Child and Adolescent Mindfulness Measure (CAMM; Greco et al., 2011). This self-report scale consists of 10 items querying the frequency of non-mindful thoughts or behaviors on a 5-point Likert scale from never true to always true. All items are negatively worded and reverse-scored. Higher scores represent greater levels of mindfulness. Specifically, the authors describe the scale as measuring both awareness of the present and the degree to which one has a nonjudgmental attitude towards one’s thoughts and feelings, which includes not suppressing or avoiding them. In this study the Cronbach’s alpha was 0.74, and the McDonald’s omega was 0.78.

Parent Measures

Negative Affect

To measure child affect we asked parents to rate the degree to which their child felt worried, frustrated, stressed, sad, mad, calm, and happy, using a 4-point Likert scale, ranging from not at all to very. We reverse-scored calm and happy and calculated a composite Negative Affect score ranging from 0 to 21, with higher scores signifying more negative affect and less positive affect. The Cronbach’s alpha for this scale was 0.82, and the McDonald’s omega was 0.88.
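The scoring rule for this composite can be sketched as follows. The example assumes each item is coded 0–3 (an assumption that is consistent with the stated 0–21 range for seven items on a 4-point scale); the function name and the dictionary layout are hypothetical.

```python
NEGATIVE_ITEMS = ["worried", "frustrated", "stressed", "sad", "mad"]
POSITIVE_ITEMS = ["calm", "happy"]  # reverse-scored
MAX_ITEM_SCORE = 3                  # assumes items are coded 0-3


def parent_negative_affect(ratings):
    """Sum negative items plus reverse-scored positive items (range 0-21)."""
    score = sum(ratings[item] for item in NEGATIVE_ITEMS)
    score += sum(MAX_ITEM_SCORE - ratings[item] for item in POSITIVE_ITEMS)
    return score


# Hypothetical parent report, not study data.
example = {"worried": 2, "frustrated": 1, "stressed": 1, "sad": 0,
           "mad": 1, "calm": 2, "happy": 3}
example_score = parent_negative_affect(example)
```

Reverse-scoring means a child rated as very calm and very happy contributes nothing from those two items, so the composite reflects both the presence of negative affect and the absence of positive affect.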

Prosociality

To assess child prosociality we asked parents to rate how often their child is helpful when asked, how often their child is helpful without being asked, and how often their child has done something kind. Parents answered each question on a 4-point Likert scale ranging from not at all to very. We calculated a prosociality composite by summing the raw scores from the three items. Scores ranged from 0 to 9, with higher scores indicating more prosociality. The Cronbach’s alpha for this scale was 0.80, and the McDonald’s omega was 0.81.

Perceived Stress

To measure parents’ own perceived stress, parents completed the ten-item adult self-report Perceived Stress Scale (PSS) (Cohen, 1988; Cohen et al., 1983), which assesses their overall perceived stress over the past month. Items are rated on a 5-point Likert scale from never to very often, with four items that are reverse-scored. Higher scores indicate greater levels of perceived stress. In this study the Cronbach’s alpha was 0.87, and the McDonald’s omega was 0.91.

Depression & Anxiety

In addition to the child’s self-report of their anxiety and depressive symptoms, we also obtained a parent report of their child’s anxiety and depressive symptoms, assessed with the 25-item Revised Child Anxiety and Depression Scale, Parent Form (RCADS-25-P) (Chorpita et al., 2000; Ebesutani et al., 2017). This form consists of the same scales and items that comprise the RCADS-25-C: Anxiety Total and Depression Total scales (see above). The only difference is that the items are worded to ask the parent about their child’s behavior. As is the case with the child scale, items are based on DSM-IV (APA, 1994). Higher scores represent a greater degree of symptoms. All of the items are rated on a 4-point Likert scale ranging from never to always. Raw summative scores from each of the scales are converted into T-scores normed by child age and sex, with the following ranges: low severity (0–64), medium severity (65–70), and high severity (> 70). T-scores of medium severity are considered to be “borderline clinical threshold” whereas T-scores of high severity are considered to be “above clinical threshold” (Chorpita et al., 2000). For this study the Cronbach’s alphas were 0.75 and 0.77 for the Anxiety Total and Depression Total scales respectively, and the McDonald’s omegas were 0.80 and 0.81, respectively.

Executive Functioning

We administered an 86-item parent-report of the child’s executive functioning, the Behavior Rating Inventory of Executive Function-Parent Form (BRIEF) (Gioia et al., 2000). Although the BRIEF has several subscales and composite indices, we chose to use the Global Executive Composite (GEC) as a summary measure of executive function, as well as the Behavioral Regulation Index (BRI). Higher scores on the GEC indicate a greater degree of difficulty with executive functioning (e.g., needs help from an adult to stay on task). Higher scores on the BRI indicate a greater degree of difficulty with behavioral regulation (e.g., overreacts to small problems). Raw scores are summed and transformed into T-scores normed by age and sex. T-scores ≥ 65 are considered clinically significant (Gioia et al., 2000). In this study the Cronbach’s alpha was 0.97 for the GEC, and 0.94 for the BRI, and the McDonald’s omegas were 0.98 and 0.96, respectively.

Data Analyses

Data were analyzed using intention-to-treat principles (i.e., participants were not excluded based on engagement) (Polit & Gillespie, 2010). We assessed differences in demographics and outcomes by group at baseline using ANOVAs, and in the case of significant differences, we used Tukey tests for isolating the group differences driving the effect. We assessed differences in completion by group using logistic regression. We assessed whether demographics or baseline outcome measures predicted completion using logistic regression. Last, we assessed correlations between outcome variables at baseline (Supplementary Table S1).

We used ANOVAs to test for group by time effects, comparing Mindfulness vs Audiobook-only groups, and Mindfulness vs Audiobook-scaffolded groups. To correct for multiple comparisons, we used false discovery rate (FDR) correction in the core R package stats. For the significant effects from the ANOVAs, we conducted one-tailed Welch’s t-tests on change scores to evaluate directional hypotheses from the preregistration. We used one-tailed t-tests as we hypothesized larger decreases in the Mindfulness group than in the audiobook groups (e.g., larger decreases in stress). Additionally, we used ANCOVA to evaluate group effects on changes in outcomes over time (i.e., post-test scores predicted by group covarying pre-test scores) (Supplementary Table S3). We used multiple imputation to handle missing data, an approach that is appropriate when data are missing at random (Graham, 2009). Multiple imputation was implemented using the mice (Van Buuren & Groothuis-Oudshoorn, 2011), jomo (Quartagno et al., 2020), and mitools (Lumley, 2019) packages in R. We conducted sensitivity analyses, assessing the pre-post effects using ANCOVA while also covarying demographic variables (Supplementary Table S4). We also explored whether demographic variables moderated the effect of group assignment for our main measures of anxiety and child perceived stress (Supplementary Tables S5, S6). Lastly, to examine dosage effects in the mindfulness group, we correlated the changes in measures with number of days practiced, and then conducted a median split by days practiced (a median was used because of deviations from normality in the distribution of days practiced; Supplementary Fig. S1). Dosage effects in the audiobook groups were assessed using minutes listened by each participant.
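To make the multiple-comparison step concrete, the sketch below implements the Benjamini-Hochberg procedure, which is the adjustment that R's `p.adjust` applies when `method = "fdr"`. We assume, without confirmation from the text, that this is the function referred to above. The Python function `bh_adjust` and the example p-values are hypothetical and are not taken from the study.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, idx in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical uncorrected p-values for four outcomes.
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

Each adjusted value is the smallest FDR level at which that outcome would be declared significant, so it can be compared directly against the chosen threshold (e.g., 0.05).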

Note that there were deviations from our preregistered analysis plan. First, we intended to test hypotheses about a foraging task, but were unable to do so due to programming issues. Second, we did not investigate mediation by hours practiced because those data were unavailable, and instead conducted correlations between the number of days practiced and pre-post changes in the measures. We assessed these dosage effects for all measures. The following analyses were not preregistered: one-tailed t-tests on change scores to examine the significant group X time effects from the ANOVAs, ANCOVAs controlling for baseline scores, sensitivity tests (covarying demographics in ANCOVAs for all outcomes), and moderation tests (assessing the interaction of demographics with group assignment for main outcomes).

Results

As assessed by Inner Explorer, the Mindfulness group practiced on average 25.30 (SD = 12.19) of the targeted 40 days (Supplementary Fig. S1). The most frequently selected option by parents for number of practices was 40 (Supplementary Fig. S2). Inner Explorer days practiced and parent-report number of practices were positively related (Supplementary Fig. S3). The Audiobook-only group listened to audiobooks on average for 839.16 min (SD = 1231.91 min), and the Audiobook-scaffolded group listened to audiobooks on average for 1022.26 min (SD = 1000.89 min).

Attrition

Overall, 84.6% (236/279) of children completed the intervention. Specifically, 85.6% of Mindfulness participants (77/90), 85.6% of Audiobook-only participants (77/90), and 82.8% of Audiobook-scaffolded participants (82/99) who completed pre-test also completed at least some post-test. There was no significant difference in completion of pre-test between the mindfulness and audiobook groups (p = 0.92, OR = 0.96, 95% CI [0.44, 2.00]). There was no significant difference in completion of the intervention between groups (p = 0.76, OR = 0.89, 95% CI [0.43, 1.78]). The only demographic variable associated with completion rates was maternal education; higher maternal education correlated with higher completion rates across all groups (p = 0.013, OR = 2.61, 95% CI [1.12, 6.58]). Further, completion was not associated with outcome variables at baseline (all p-values were greater than 0.15).

Power Analyses

We conducted post-hoc power analyses using the pwr.t2n.test and pwr.anova.test functions in the pwr package in R (Champely, 2020). For comparisons of change scores between the mindfulness and audiobook conditions, we were powered at 80% to detect Cohen’s d-values of at least 0.42 at a significance level of α = 0.05. We also conducted a power analysis of the ANOVA between the three conditions, which revealed that we were powered at 80% to detect f-values of at least 0.19 at a significance level of α = 0.05.
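The order of magnitude of the minimum detectable effect can be checked with a normal approximation, d ≈ (z₁₋α/₂ + z_power) × √(1/n₁ + 1/n₂). The sketch below is a rough check, not a re-implementation of `pwr.t2n.test`, which uses the noncentral t distribution. The group sizes of 90 and 90 (the pre-test completers in the Mindfulness and Audiobook-only groups) and the two-sided test are assumptions, and `min_detectable_d` is a hypothetical helper name.

```python
from statistics import NormalDist


def min_detectable_d(n1, n2, alpha=0.05, power=0.80):
    """Approximate smallest Cohen's d detectable by a two-sample test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = std_normal.inv_cdf(power)
    return (z_alpha + z_power) * (1 / n1 + 1 / n2) ** 0.5


# Assumed group sizes: 90 per group (see lead-in).
d = min_detectable_d(90, 90)
```

Under these assumptions the approximation gives a value of about 0.42, in line with the figure reported above.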

Outcome Analyses

Correlations between outcome measures at baseline can be found in Supplementary Table S1. The three groups did not differ significantly on demographic or outcome measures at baseline (p ≥ 0.05; Tables 1 and 2), with one exception. Specifically, the Mindfulness group had a higher proportion of child mental health diagnoses than the Audiobook-only group (Tukey test, p = 0.046). We controlled for possible effects of this difference in a sensitivity analysis using ANCOVA (Supplementary Table S4).

Table 2 Descriptive statistics for outcomes

Effect sizes were generally small (Cohen’s d < 0.45, Table 3). When controlling for multiple comparisons, there was one significant group X time interaction using pairwise ANOVAs (Fig. 2, Table 3). Parent-reported child negative affect decreased significantly more in the Mindfulness group than the Audiobook-only group (F(1,146) = 8.60, FDR-corrected p = 0.048). Child-perceived stress also decreased more in the Mindfulness group than the Audiobook-only group, although this trend was not significant (F(1,146) = 3.20, uncorrected p = 0.075). No significant group X time interactions were observed for the comparison of Mindfulness vs Audiobook-scaffolded groups.

Table 3 Pre-post effect sizes and ANOVA results
Fig. 2 Treatment effects for all groups and all scales. Note: Blue is Mindfulness, Green is Audiobook-scaffolded, Red is Audiobook-only. Error bars reflect standard errors. TimePoint 0 is Pre-test, TimePoint 1 is Post-test. Sample size is 279 individuals. CR: Child Report; PR: Parent Report; Negative Affect CR: Child Negative Affect; Mindfulness: Child and Adolescent Mindfulness Measure (CAMM); Stress Children: Perceived Stress Scale for Children; Anxiety CR: RCADS – Anxiety Scale; Depression CR: RCADS – Depression Scale; GEC: BRIEF Global Executive Composite; BRI: BRIEF Behavioral Regulation Index; Anxiety PR: RCADS – Anxiety Scale; Depression PR: RCADS – Depression Scale; Stress Parent: Perceived Stress Scale; Prosociality: Parent Report Prosociality; Negative Affect PR: Parent Report of Child Negative Affect

We followed up on these ANOVA effects with one-tailed t-tests comparing change scores between groups for child-perceived stress and parent-reported negative affect, because those measures yielded significant or trend-level differences. We found that reductions in child-perceived stress in the Mindfulness group were significantly greater than in the Audiobook-only group (t(148.74) = 1.79, p = 0.038). We also found that reductions in parent-reported negative affect in the Mindfulness group were significantly greater than in the Audiobook-only group (t(130.05) = 2.89, p = 0.0022).
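The fractional degrees of freedom reported above (e.g., 148.74) are consistent with an unequal-variance (Welch) t-test, although the text does not name the test. Under that assumption, a comparison of change scores can be sketched as follows; the function name and the data are hypothetical and are not taken from the study.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite df for mean(a) - mean(b).

    Assumption: unequal-variance t-test on change scores, inferred from
    the fractional degrees of freedom reported in the text.
    """
    # Squared standard error contributed by each group.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical change scores (pre minus post, so larger means a bigger reduction).
mindfulness_change = [2, 4, 6, 8]
audiobook_change = [1, 2, 3, 4, 5, 6]
t_stat, df = welch_t(mindfulness_change, audiobook_change)
print(t_stat, df)
```

The one-tailed p-value is the upper-tail probability of a t distribution with `df` degrees of freedom at `t_stat`. The Python standard library has no t distribution, so a statistics package would be needed for that last step (for example, `scipy.stats.t.sf(t_stat, df)`).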

Time effects across all interventions were observed for reduced child-reported negative affect, and also parent reports of improved global executive composite (GEC), improved behavioral regulation index (BRI), reduced child depression symptoms, reduced parent stress, and reduced child negative affect (Supplementary Table S2).

When controlling for multiple comparisons, there were no differences between groups on pre-post changes in main outcomes or exploratory outcomes using ANCOVA models (Table S3, p-values > 0.10). However, there was a trend difference between Mindfulness and Audiobook-only groups (FDR-corrected p = 0.072) in parent-reported child negative affect when covarying demographic variables; that is, children in the Mindfulness group had greater decreases in parent-reported child negative affect (Supplementary Table S4). We further tested for possible baseline moderation by demographic variables, and results were not significant (Supplementary Tables S5, S6).

We explored dosage effects using days practiced recorded by Inner Explorer in the Mindfulness group. We conducted a median split on days practiced where a Low-Practice group was defined as at or below the median (29 days), and a High-Practice group was defined as above the median. The High-Practice group had significantly larger decreases in parent-reported child negative affect than the Low-Practice group (t(63) = 2.07, p = 0.021), and significantly larger decreases in parental stress (t(63) = 4.16, p < 0.001). The High-Practice group had significantly larger decreases in parent-reported negative affect than both the Audiobook-scaffolded (t(115) = 2.18, p = 0.016) and Audiobook-only (t(104) = 2.98, p = 0.0018) groups (Fig. 3). The High-Practice group had significantly larger decreases in parental stress than both the Audiobook-scaffolded (t(115) = 2.53, p = 0.0065) and Audiobook-only (t(104) = 3.16, p = 0.0010) groups. The High-Practice group had significantly larger decreases in child-perceived stress than the Audiobook-only group (t(112) = 1.66, p = 0.049), and trended toward larger decreases in child-perceived stress than the Audiobook-scaffolded group (t(117) = 1.32, p = 0.094). Dosage effects were similar for parent reports of number of practices (Supplementary Fig. S4). No dosage effects were found for any other outcomes. No dosage effects were found for the audiobook groups when correlating reading minutes with outcomes (parent-reported child negative affect p = 0.12, parental stress p = 0.18, child-perceived stress p = 0.92).
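The median split described above can be sketched as follows. This is a minimal illustration of the grouping rule only (at or below the median is Low-Practice, above the median is High-Practice); the function name, participant identifiers, and day counts are hypothetical.

```python
from statistics import median


def median_split(days_by_child):
    """Split participants into Low-Practice (<= median) and High-Practice (> median)."""
    cutoff = median(days_by_child.values())
    low = [child for child, days in days_by_child.items() if days <= cutoff]
    high = [child for child, days in days_by_child.items() if days > cutoff]
    return cutoff, low, high


# Hypothetical days-practiced records (illustrative only, not study data).
days = {"c01": 10, "c02": 29, "c03": 29, "c04": 35, "c05": 40}
cutoff, low, high = median_split(days)
print(cutoff, low, high)
```

Because ties at the median are assigned to the Low-Practice group, the two groups need not be equal in size.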

Fig. 3 Change in parent-reported child negative affect with median split by days practiced. *p < 0.05, **p < 0.01. LowPracticeMindful is the group of mindfulness participants at or below the median of days practiced (29). HighPracticeMindful is the group above the median. Error bars reflect standard errors. Sample sizes are 90 for Audiobook-only, 99 for Audiobook-scaffolded, 37 for HighPracticeMindful, and 35 for LowPracticeMindful; 18 mindfulness participants were missing data for days practiced

Discussion

In this study, we administered an at-home app-based mindfulness intervention to children ages 8 to 10 years in the US. We hypothesized that there would be decreases in child-reported anxiety symptoms and perceived stress after the mindfulness intervention relative to the control audiobook interventions. There was evidence of larger decreases in children’s negative affect as reported by parents and of children’s self-perceived stress in the Mindfulness intervention compared to the Audiobook-only intervention, but no differences between the Mindfulness intervention and the Audiobook-scaffolded intervention. The reduction in stress is consistent with mindfulness studies reporting reduced stress with adults (Khoury et al., 2015) and children (Bauer et al., 2019). There were no significant group differences for trait mindfulness, parent-report anxiety and depression, executive function, prosociality, and parental stress. Effect sizes in all interventions were small (Cohen’s d < 0.45). There was evidence that participants who engaged with the mindfulness intervention more often showed larger reductions in negative affect.

This study was novel in applying a remote, app-based intervention with pre-adolescent children. The remote nature of the mindfulness intervention may have diminished its effectiveness. While app-based mindfulness interventions in adults often show decreases in anxiety and depression (Spijkerman et al., 2016), increases in well-being (Gál et al., 2021) and increases in mindfulness (Linardon et al., 2019), changes are often smaller in size than in-person interventions (Goldberg et al., 2018). Indeed, this has been observed across app-based health interventions generally (Goldberg et al., 2022a). There are barriers to success for app-based interventions, including low adherence (Linardon & Fuller-Tyszkiewicz, 2020). However, in the current study we sought to encourage adherence by including weekly practice reminders as well as personal communication in the case of multiple missed practices. Our monitoring data show that most families were practicing (on average 25 days of a total of 40), and dropout was relatively low compared to other app-based studies (e.g., Goldberg et al., 2020; Linardon & Fuller-Tyszkiewicz, 2020).

Another potential explanation for the small effects is the rigor of the control conditions. It is well-established that active control conditions produce smaller effect sizes (Gál et al., 2021; Goldberg et al., 2022b). In our study, participants were recruited for a reading study to evaluate the impact of audiobooks on reading scores, and were randomized into either Audiobook-only, Audiobook-scaffolded (with weekly meetings with a facilitator), or Mindfulness conditions. For our mindfulness analyses, we thus had two audiobook control conditions. It is possible that these audiobook conditions were not the best comparison conditions for mindfulness, especially because the Audiobook-scaffolded condition involved weekly contact with a facilitator which was not present in the mindfulness condition. The interventions took place during the height of the COVID-19 pandemic in the US, when social and academic restrictions were in place. Given the circumstances, the human contact provided by the Audiobook-scaffolded intervention may have been particularly beneficial.

Indeed, there were multiple pre-post improvements across all three interventions, including measures of children’s negative feelings, and parent ratings of children’s negative affect, depression symptoms, executive function, and behavior regulation. One possibility is that the human contact involved in remote but interactive pre-post testing and the activity involved in all three interventions were beneficial in the context of COVID isolation. Social support and interaction were important determinants of mental health during the COVID-19 pandemic for adults (Li et al., 2021; Saltzman et al., 2020), and likely for children as well (Wong et al., 2020). Whether all three interventions would show similar benefits in more typical circumstances is unknown. Additionally, it is possible that parents had biased expectancies of benefits for all the interventions.

Our study is not alone in finding limited effectiveness of mindfulness interventions for children. A school-based RCT with over 8,000 adolescents found that mindfulness interventions showed no advantages over teaching as usual (Montero-Marin et al., 2022). In addition, they found that the mindfulness interventions were actually contraindicated for some adolescents with baseline mental health difficulties. Other, smaller studies with children and adolescents have also found null results (Malboeuf-Hurtubise et al., 2021; Odgers et al., 2020). A meta-analysis of mindfulness interventions in youth found no increases in well-being relative to active or passive (i.e., waitlist or treatment-as-usual) controls, and only small decreases in anxiety and stress relative to active controls (Dunning et al., 2022).

In contrast, numerous studies show evidence for the benefits of mindfulness in children. In child and adolescent populations, mindfulness-based interventions (MBIs) have been related to symptom reduction, increased wellbeing (Caldwell et al., 2019; Carsley et al., 2018; Porter et al., 2022), improved attention (Bauer et al., 2020), and behavioral regulation (Kaunhoven & Dorjee, 2017; Schonert-Reichl et al., 2015). Researchers have found functional changes in brain circuitry in sixth graders after a mindfulness intervention (Bauer et al., 2019). Trait mindfulness, defined as the disposition or tendency to behave mindfully in day-to-day life, is connected to positive mental health outcomes in children (de Bruin et al., 2014; Greco et al., 2011) and positive academic outcomes and school behaviors (Caballero et al., 2019). In theory, MBIs promote trait mindfulness and thus positive mental health outcomes.

The benefit of the mindfulness app on children’s emotional well-being (child-perceived stress and parent-reported child negative affect) was related to how frequently the app was used. Children who practiced 30 days or more had significantly larger gains in emotional well-being than children in both the Audiobook-only and Audiobook-scaffolded conditions. We also found a similar dose-dependent benefit for parental stress. These results suggest there needs to be sufficient dosage for app-based interventions to show effects in children, paralleling similar findings in adults (Bostock et al., 2019). Further, the large school-based study of mindfulness in the UK that found no benefits contained only 10 instructional sessions spread across a semester, and most students reported little or no independent practice beyond those sessions (Montero-Marin et al., 2022). In contrast, for example, a school-based study that yielded multiple benefits with 6th graders involved about 24 hours of instruction and practice across 8 weeks (Bauer et al., 2019, 2020). It is, however, possible that participants in the present study who practiced more than 30 days were more motivated to report mindfulness-related benefits. Future studies could randomize participants to receive more or less treatment so that there is direct evidence of the relation between dosage and possible benefits and so that the minimally effective dosage might be identified.

These findings suggest that an important issue in at-home app-based mindfulness interventions is adherence to the program. For young children, such adherence most likely involves adult caregivers as well as the children who depend on them. This challenge contrasts with school-based programs, in which the instructor can ensure engagement when the program is delivered with high fidelity.

Limitations and Future Research

There are some limitations to our study that may account for the limited effectiveness of the mindfulness intervention. One limitation was that the families were initially recruited to take part in an audiobook intervention. Thus, they may have been less engaged and motivated to take part in the mindfulness intervention. There is evidence that creating more favorable subjective norms around mindfulness and increasing intentions by emphasizing the benefits of mindfulness (e.g. behavioral control) increases the number of minutes practiced by participants (Crandall et al., 2019). Initial motivations to engage in mindfulness practice also appear to be related to persistence (Jiwani et al., 2022), and motivations for mindfulness may have been initially low in this sample. As participants were informed about their intervention assignment before pre-test, any difference in motivation could have effects not just on adherence, but also on test scores. However, we found no baseline differences in scores between the groups, no differences in completion of pre-testing, and no differences in completion of the interventions.

An additional limitation is that we may have been underpowered to detect small to moderate selective effects between interventions. Power analyses for interventions have been conducted using between-group comparisons of change scores (Goldberg et al., 2020), or f-values (Kaplan et al., 2022). We were powered at 80% to detect effect sizes of d = 0.42 or larger for the comparisons of Mindfulness and Audiobook groups. We were powered at 80% to detect f-values of 0.19 using the ANOVA analyses. Future studies will ideally be more highly powered, but there may be questions about the value of benefits that are so small that large numbers of participants are needed to detect them. There is the possibility, however, that individual differences among participants are predictive of the magnitude of benefits (Webb et al., 2022). Also, to better generalize to the public, studies should be conducted with diverse participants. While we focused substantial effort and resources toward recruiting a socioeconomically diverse sample (Ozernov-Palchik et al., 2022), the majority of participants in our study were from middle- to high-income families.
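As a rough check on how a detectable effect size relates to group size, the sketch below uses the standard normal-approximation formula for a two-group comparison of means, n per group ≈ 2(z₁₋α/₂ + z_power)² / d². The two-sided α of 0.05, the normal approximation, and the function name are assumptions made for illustration; the text does not describe how the power analysis was computed.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect Cohen's d between two groups.

    Assumptions: two-sided test, equal group sizes, normal approximation
    to the t distribution (slightly underestimates the exact requirement).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile corresponding to the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Effect size reported in the text as detectable at 80% power.
print(n_per_group(0.42))
```

Under these assumptions the formula gives roughly 89 participants per group for d = 0.42, which is in the neighborhood of the group sizes listed in the Fig. 3 caption. Halving the detectable effect size roughly quadruples the required sample, which is the trade-off the paragraph above alludes to.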

The final limitation concerns the validity of our self-report measures. We used a clinical questionnaire (the RCADS; Chorpita et al., 2000) to measure anxiety symptoms. Floor effects may have obscured sub-clinical changes in anxiety. In addition, the child-perceived stress measure had low scale reliability.

In summary, we conducted a rigorously controlled, app-based mindfulness intervention in children, and found some limited benefits in reduced stress and enhanced emotional well-being. Smartphone apps offer benefits in terms of scalability, reach, and cost-effectiveness (Linardon & Fuller-Tyszkiewicz, 2020). For this reason, we suggest that future work should continue to study remote interventions in children, examining the types of mindfulness practices that are useful, how much practice is necessary to achieve beneficial outcomes, how to encourage children to accomplish that amount of practice, and which children may benefit most from a mindfulness app.