Early parent–child relational health contributes significantly to children’s social and emotional development, shaping the architecture of life-course well-being (Duschinsky, 2020). Relational health in the parent-infant dyad is characterized by consistent responsiveness and sensitivity by the parent, which promotes trust and organization in attachment for children, in turn forming the cornerstone of socioemotional development (Pederson et al., 2014). Parent sensitivity, which predicts early relational health, reflects a parent's capacity to accurately interpret their infant's emotional experiences and needs, and to respond in a timely, containing, and empathic manner (Fonagy, 2004). Suboptimal interactional patterns between parent and child during the critical period of psychosocial, immune, cardio-vascular, and neuro-biological development are of particular concern (Cassidy & Shaver, 2016). Whether entrenched in inter-generational dynamics or emerging from contemporary modifiable risk factors, disturbances to care during the perinatal and early childhood period create risk for enduring vulnerabilities, such as distorted representations of need and trust (Levendosky et al., 2011; Schore, 2019). Such sequalae provide a clear impetus for widespread prevention via strategic, evidence-informed investment. Importantly, relational health elements are largely modifiable. With this knowledge, promoting positive caregiving via psychoeducation and increased parenting capacity represents an important growth area.

The Potential of Technologically Supported Parent Education

Recent years have seen a surge in the availability of online and technologically supported programs for parents, accelerated by the COVID-19 pandemic. The ubiquitous nature of smartphones, computers, and internet access has further escalated development of online and digitally delivered health-based education. Emerging evidence around online interventions has found improvement in parent self-efficacy and confidence on par with in-person interventions, while reducing service system burden (Spencer et al., 2020; Thongseiratch et al., 2020). A diverse array of preventative universal parenting programs and associated evidence for their efficacy is mounting (Morris et al., 2019), including indications of equivalent efficacy between online universal programs with and without a clinical support component or active guidance (Spencer et al., 2020).

If shown to be efficacious, predominantly self-guided online programs represent tremendous value to the community, reducing the resources required for program delivery and maximizing program reach. Preventative parenting programs can be classified as (1) universal, (2) selective, or (3) indicated. Universal programs are available to all, irrespective of prior risk status (Baker, 2011; Greenberg et al., 2001). In contrast, selective programs target subgroups displaying above average risk, while indicated programs target individuals with current symptomatology (Baker, 2011; Fusar-Poli et al., 2020). Initial findings suggest that universal programs are highly suitable for mental health promotion, reducing the development of mental and behavioral health disorders and resulting in substantial social and economic gains (Arango et al., 2018; Ulfsdotter et al., 2014).

In-person parent education programs predominantly seek to enhance parent knowledge of young children’s socioemotional development at low cost to participants. Such programs are offered in a range of contexts and modes, including group or individual delivery, with evidence of improved outcomes via decreased caregiver stress, improved reflective functioning, and parent sensitivity, among other positive socioemotional parent outcomes (e.g., Havighurst et al., 2019; Powell et al., 2009).

With growing familiarity and accessibility of technology platforms, many in-person parenting programs have been adapted for digital delivery (e.g., Triple P Positive Parenting Program; Ehrensaft et al., 2016). Online programs hold much promise in alleviating barriers associated with in-person program delivery. Universally available online interventions allow for more equitable access to diverse populations, provide parents with knowledge in private thus mitigating stigma-based barriers, and minimize costs and logistical engagement obstacles, thereby broadening program reach. Online parenting programs can be delivered with guidance (from experts or peers), without guidance, or include elements of each. Universal programs are typically well-suited for online unguided or mixed methods delivery, reducing reliance on, or eliminating the need for contact with costly specialists, avoiding staff shortages and healthcare service demand (Canário et al., 2022). Entirely self-guided online programs offer even greater accessibility, with 24/7 on-demand content availability, program engagement flexibility, the provision of passive healthcare at minimal ongoing expense, and anonymous engagement reducing help-seeking associated stigma (Hollis et al., 2017).

Known Challenges of Online Parenting Programs

Despite the potential advantages of such online parenting programs, the e-mental health literature has highlighted existing program shortcomings. These include low adherence/high drop-out rates, as well as access program barriers for some populations due to low digital literacy (e.g., program navigation, troubleshooting), minimal technology accessibility (computer and/or internet connection), high internet data usage costs, and technology reliability concerns (Day et al., 2021; Ramos et al., 2022; Ros-DeMarize et al., 2021). Suitability for and impact on different population groups is important, however, the benefit of these programs at the sub-group level is currently under-researched. Similarly, the efficacy of these programs at a broad public health level is un-established. Preventative evidence is also lacking for early de-escalation of parent mental health concerns, and sequalae for parents and children.

The Need for Systematic Examination

Considering the limitations and advantages of universal online parenting programs, their rapid growth warrants systematic examination alongside evidence that has long supported the utility and efficacy of traditional in-person parenting programs (Barlow et al., 2002; Mingebach et al., 2018). Prior meta-analytic evidence demonstrates the efficacy of online parenting programs for enhancing parent, child, and, to a lesser extent, systemic relational outcomes. However, these reviews have been largely limited to selective and/or indicated prevention samples (Baumel et al., 2016; Cai et al., 2022; Florean et al., 2020; Li et al., 2021), or pooled populations with aggregated evidence across selected, indicated, and/or universally targeted interventions (Flujas-Contreras et al., 2019; MacKinnon et al., 2022; Nieuwboer et al., 2013; Thongseiratch et al., 2020). For children, these reviews have demonstrated small-to-moderate reductions in behavioural problems (Florean et al., 2020; Thongseiratch et al., 2020), psychological issues, anxiety (Spencer et al., 2020), and increased positive behaviours (Baumel et al., 2016; Spencer et al., 2020), adjustment (Cai et al., 2022), and emotional problems (Thongseiratch et al., 2020). For parents, they have shown moderate-to-large positive increases in parenting behaviour (Baumel et al., 2016; Florean et al., 2020; Spencer et al., 2020), confidence (Baumel et al., 2016), self-efficacy (Florean et al., 2020; Flujas-Contereras et al., 2019), and decreases in parent stress (Florean et al., 2020; MacKinnnon et al., 2022), depression (MacKinnon et al., 2022) and anxiety (MacKinnon et al., 2022). While prior meta-analyses and primary studies have assessed child-specific and parent-specific outcomes for target groups, there exists a dearth of literature examining relational outcomes. To date, only two digital parenting program meta-analyses have examined relational impacts, and these were for selective and indicated prevention samples (Li et al., 2021; Spencer et al., 2020). Each study identified significant relational effects for both parent–child and parent-parent level outcomes (Li et al., 2021; Spencer et al., 2020).

While these results appear promising, there is room now for meta-analytic evidence assessing the efficacy of universally targeted online parenting programs. To date, only one meta-analysis (Spencer et al., 2020) has been conducted, and only included a small number of studies (k = 3–6). The authors reported significant increases in parent confidence (d = 0.30) and decreases in stress (d = 0.31) for universal online programs and observed no significant effect for child problem behaviours or parent depression. This study reported significant declines in parent depression. Due to limited relevant studies, relational outcomes (i.e., parent–child and parent-parent) were not examined in Spencer et al. (2020) for universally targeted programs. Only a small number of universal studies has been previously identified in meta-analyses that pool indicated, selected, and universal target populations (Spencer et al., 2020 [k = 9]; MacKinnon et al. (2022) [k = 6]; Li et al., 2021 [k = 0]; Flujas-Contereras et al., 2019 [k = 6]). Taken together, there is a clear lack of literature evaluating the efficacy of socioemotional outcomes for online, universal parenting programs.

An additional limitation across prior meta-analyses is the aggregation of heterogenous data such as different socioemotional outcomes, guided and self-guided programs (Baumel et al., 2016; Flujas-Contreras et al., 2019; Li et al., 2021; Thongseiratch et al., 2020), broad child age ranges (Flujas-Contreras et al., 2019; Li et al., 2021; Nieuwboer et al., 2013; Spencer et al., 2020; Thongseiratch et al., 2020), and studies with varying methodological designs (randomized control trial [RCT] and pre-post; MacKinnon et al., 2022). In the case of study design, results observed in within-group study designs, which lack a control group, may be due to normative changes in outcomes from pre-to-post-intervention, rather than the intervention itself. This makes it challenging to draw causal conclusions regarding the effectiveness of interventions and raises questions about what the effect size meaningfully reflects. These concerns are reflected in these studies’ moderate-to-high rates of statistical heterogeneity, which lowers confidence that the parenting programs under examination have consistent effects across populations.

In this light, the current study aimed to meta-analytically evaluate all self-directed, universal, online, or smart phone-based parenting program studies and identify the impacts of such programs on parent, child, and relational socioemotional outcomes. The research question can be summarized as: “Do online universal parenting programs have a positive impact on socioemotional outcomes when compared to care as usual control groups?”.

Methods

This review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (PRISMA; Page et al., 2021). The review protocol was registered on PROSPERO CRD42021275647. MEDLine, Embase, PsycINFO (all via OVID interface), CINAHL (via Ebscohost interface), and Web of Science (all databases) electronic databases were initially searched from January 1st, 2000, to October 10, 2021, to identify peer-reviewed articles. Databases were not searched prior to the year 2000 due to limited internet availability and smartphone access, thus technology-delivered programs were less likely to be available prior to that time. The database search was re-run to identify any new publications from October 11, 2021 to February 15, 2023. The database search strategy was developed, piloted, and refined with a senior health-science librarian (AJH). See Supplementary Material 1 for detailed search strategy description. Note, the MEDLine search is accompanied by a contextual narrative to enhance search reproducibility and transparency (Cooper et al., 2018). Additionally, to ensure comprehensiveness and control for publication bias, unpublished research was examined at both search periods, as were manual searches of the reference lists of pertinent identified publications and relevant reviews.

In total, 8798 unique published records were identified in the initial search from published sources following duplicate removal. At the title and abstract level, 8622 records were excluded. Full-text screening of 176 articles resulted in a total of 48 eligible studies. Following an amendment to study eligibility (i.e., inclusion of universal programs only), a further 33 studies were removed and a total of 15 studies were included in the review. When the search was re-run in February 2023, the systematic search yielded an additional 2865 unique records, 2797 of which were excluded on screening. Of the 11,663 records screened for eligibility across both searches, 22 published studies met all inclusion criteria, all of which were double screened with 100% agreement.

In addition to published articles, and to control for publication bias, unpublished research was examined. Dissertations were identified via ProQuest Dissertations and Theses Global (787 records found). Conference proceedings, unpublished research, and dissertations were searched in Scopus (408 records found), Opengrey (22 records found), and the first 10 pages of Google Scholar were screened (100 records found), yielding 1317 records. Grey literature searching was re-run on February 15, 2023. A total of 201 records were retrieved and screened. Note, Opengrey is no longer active, hence was not included in the updated search. No relevant records were identified through searching the grey literature. See Supplementary Material 2 for search strategy and search details of unpublished research.

Selection Criteria

The search followed a PICOS framework for systematic reviews (Higgins et al., 2019). Studies were considered for inclusions if they met the following criteria reported in Table 1.

Table 1 PICOS framework

Eligibility Criteria and Study Selection

Identified studies were screened for eligibility via: (1) title, keyword, and abstract screening; and (2) full-text article screening. Due to the heterogenous array of potential socioemotional sequalae that could be examined after participating in parent online training, outcome terms were omitted from the search, but screened for in accordance with the inclusion and exclusion criteria.

A post hoc study amendment was made wherein universal only studies were examined, while selected studies were excluded from the review, a result of the large volume of references identified and minimal resources. This decision was made as we chose to examine this vast body of literature by commencing with programs that are most accessible to the widest population cohorts.

During full-text screening, papers were excluded if they reported on an overlapping-dependent sample to another included study (n = 7), wherein the study with the largest sample size was retained. Studies examining the same intervention with independent samples were retained. For studies where the full-text could not be located, authors were contacted via email. If the author did not respond within the specified time, the paper was excluded. Such references were excluded if information was not obtained following two online requests. Two studies were excluded on this basis, and no studies included. Following screening, 22 studies were identified that met eligibility criteria. Reference lists of all included studies were examined, yielding no additional references.

Screening Inter-Rater Reliability

At each level of screening and across both searches, all papers were double screened. At the title, keyword, and abstract level, for searches 1 and 2, four independent reviewers (JO, AV, FP, AB) yielded an agreement rate of 96.11% (interrater reliability [IRR]: κ = − 0.67). At the full-text screening level, an agreement rate of 85.23% (IRR: κ = 0.70) was identified.

Data Extraction

Four authors (JO, AV, FP, AB) extracted data, which included (1) study details (i.e., author name, year, country, design, sample size); (2) sample details and recruitment (i.e., parent age/sex, child age, recruitment); (3) intervention details (i.e., program name, delivery method, program goal, length, and duration, guidance provided); (4) socioemotional outcomes; and (5) results.

Quality Assessment

The Quality Assessment for Diverse Studies tool (QuADS; Harrison et al., 2021) was used to assess methodological quality and risk of bias of all included studies. QuADS is an appraisal tool for methodological and reporting quality in systematic reviews of mixed- or multi-method studies allowing for direct comparison of heterogenous study designs. QuADS comprises 13 criteria scored on a four-point scale. An additional item from the Jadad Scale for Reporting Randomized Control Trials (Jadad et al., 1996) was included to assess for study randomization where relevant. This resulted in four risk of bias levels for items 1–13 (i.e., small, small-moderate, moderate-high, high), and three levels of risk (i.e., small, moderate, high) for item 14. Per study, a maximum quality assessment score of 41 could be generated; when the Jadad item was excluded, a maximum quality score of 39 could be achieved. Quality assessment was independently conducted by two authors (AV, FP). A third independent author (JO), double reviewed 20% of all studies, yielding a weighted Cohen’s κ (Cohen, 1968) of 100%.

Data Analysis Strategy

Analyses focused on socioemotional outcomes. Many programs also reported on additional non-socioemotional outcomes, such as program user experience outcomes; however, these were excluded to contain the review’s scope. Only between-groups design studies were included to ensure rigour and increase homogeneity. The review only included experimental and quasi-experimental designs, with a sensitivity analysis to explore whether any heterogeneity present was due to the design. Findings were analyzed quantitatively, via a meta-analysis. An a priori decision was made to limit analyses to outcomes with a minimum of five independent effect sizes to reduce the likelihood of identifying spurious associations between variables due to ‘overfitting’, as recommended by (Geissbühler et al., 2021).

The reported data (i.e., raw scores, effect sizes, etc.) from primary studies were converted to Cohen’s d using several different methods, depending on the available effects. When available, our preference was to calculate Cohen’s d directly from reported means, standard deviations, and sample sizes. If these were not available, reported Cohen’s d values were used. When other effect types were used, these were converted to Cohen’s d if possible. One study was excluded from analysis as no data could be extracted that facilitated calculating Cohen’s d (Barrera et al., 2015). For all studies that included a pre-intervention baseline assessment for control and intervention groups, we applied the method in Morris (2007) to make a baseline adjustment and include a bias term. For parent satisfaction, parent self-efficacy, social support, and parent–child interaction, an increase in d represents positive change. For depression, anxiety, and stress, a decrease in d represents positive change. For instance, a d value of − 0.5 for depression indicates a reduction in depression for participants after completing the program. Cohen’s d can represent a small (d = 0.2), medium (d = 0.5), or large (d = 0.8) effect size (Cohen, 1988).

Meta-analyses were categorized into parent and relational outcomes. Meta-analyses were possible for six parent socioemotional outcomes (i.e., parent anxiety, parent depression, parent stress, parent satisfaction, parent self-efficacy, and parent social support) and one dyadic socioemotional outcome (parent–child interaction). No child-specific sub-group analyses could be conducted as fewer than five independent effects were reported for these outcomes (Geissbühler et al., 2021). To assess if effects of the interventions were stronger for mother or fathers, a meta-regression was conducted for each outcome.

Statistical Analysis

The findings related to each outcome were synthesised using statistical software R v4.2.3 (R Core Team, 2018). Statistical analyses were performed with the aid of third-party R packages robumeta (Fisher & Tipton, 2015) and metafor (Viechtbauer, 2010). Data loading and transformation was performed using the third-party R packagedplyr (Wickham et al., 2018).

All syntheses of effect size were conducted using robust variance estimation (RVE) techniques (Hedges et al., 2010; Tipton, 2015), following the procedure of Opie et al. (2021). To ensure the robustness and accuracy of the performed analyses and assumptions made, a series of tests and adjustments were performed.

Heterogeneity

The assumption of heterogeneity was tested for each meta-analysis using Cochran's Q, τ2, and I2 metrics. Based on the confirmation of heterogeneity between studies, a random effects model was used to compute the aggregate level of effects (Borenstein et al., 2009). The I2 statistic indicates the amount of variation across studies due to true differences (heterogeneity) rather than chance (sampling error) and is expressed as a proportion of the total observed variance. This statistic ranges from 0 to 100%, where a higher percentage suggests greater heterogeneity. The τ2 statistic displays the between-study variance. It is an estimate of the variance of the true effect sizes.

Multiple-Dependent Samples and Multi-Arm Studies

To account for intra-study sample correlations, meta-analytic estimates were calculated using RVE (Hedges et al., 2010; Tipton, 2015). RVE accounts for the correlation of measurements between dependent samples, such as due to repeated measures or due to the synthesis of different, but related, effects. This ensures that all data available at the time of analysis can be used without introducing undue risk of bias.

For multi-arm studies, where more than a single study arm met the criteria for inclusion, each eligible arm was compared to a common control group. Direct comparisons between different intervention groups were not performed as it is outside the scope of the current research question.

Small Sample Adjustment

As suggested by Tipton (2015), a small sample adjustment was applied to improve estimation robustness. This adjustment applies a modification to the residuals and degrees of freedom used by the statistical test to account for the potential for excess Type I error.

Sensitivity and Moderator Analyses

Sensitivity analyses were conducted to assess study-level heterogeneity sources. We intended to conduct a moderation analysis to assess whether change in parenting differs based on program delivery method received (i.e., web-based or in-person) to assess whether web-based programs were statistically superior, inferior, or equivalent to a comparable in-person program. However, this was not possible due to a lack of studies providing appropriate comparison data (n = 3). There are also challenges as, unlike using non-intervention comparison groups, comparing to a different in-person intervention adds a variable baseline that differs between studies. For this type of comparison, findings of non-inferiority of online vs. in-person may be due to the quality of either of the compared interventions and clouds our ability to scrutinize the impact of the online program.

Meta-regression analyses were used to examine differences due to study design (experimental vs. quasi-experimental), control type (inactive vs. minimally active controls), program guidance (fully vs. partially self-guided), and parent sex. Due to a lack of data, we were unable to explore differences within partially guided programs, such as synchronous (i.e., live guidance) vs. asynchronous (i.e., delayed) guidance. In addition, a pairwise subgroup analysis was conducted using meta-regression to identify the durability of program effects over time.

Presence of Publication Bias

We assessed for publication bias by visual inspection of the funnel plots of the meta-analyses and using Egger's regression test, which determines if there is a trend between effect size and sample size or variance (Egger et al., 1997). Identification of significance of such a trend demonstrates that studies with the same effect size but a smaller statistical power are less likely to be published.

Results

Study Selection

From the 11,663 references identified through database searching, grey literature searching, and hand searching reference lists, 22 published studies were included in the meta-analysis. Studies were published from 2003 to 2023. No unpublished references were included. See Fig. 1 for a PRISMA diagram of all included literature (Page et al., 2021) and Supplementary Material 3 for a PRISMA checklist (Page et al., 2021). See Supplementary Material 4 for a PRISMA diagram of the grey literature.

Fig. 1
figure 1

PRISMA flow diagram of the study selection process. Study amendment refers to the decision include only universal parenting programs (i.e., exclude indicated/selective programs). Grey literature searching involved the screening of 1518 articles (Search 1, n = 1317; Search 2, n = 201). No studies met criterion for inclusion, as such study identification has not been reported here (see Supplementary Material 4 for associated PRISMA diagram of searches 1 and 2)

Study Characteristics

Table 2 present the characteristics of included studies. Sixteen studies were experimental (i.e., RCTs) and six were quasi-experimental study designs. Study samples size ranged from 32 to 1141 parents (M = 236.29), with a total of 5671 parents included in the meta-analysis.

Table 2 Sample, program characteristics, & study design (N = 22)

Studies came from 13 countries, with most studies conducted in the USA (n = 5), Singapore (n = 4), Australia (n = 2), China (n = 2) and South Korea (n = 2). All participants were recruited from community settings and fell on the universal risk continuum, with no participant experiencing an acute psychiatric illness. Fifteen studies included only mothers, one study included only fathers, and six studies included both mothers and fathers. Eighteen studies reported on parents’ age, with a mean age of 30.47 years (range: 18–53). Six studies reported on child age, with a mean age of 1.32 years (range: 10 weeks gestation-6 years). Of the 13 studies that reported on marital status, the percentage of those who were partnered (e.g., in a committed relationship, married, cohabitation) ranged from 47 to 100%. Fifteen studies reported on income (e.g., monthly/yearly household income) of which three comprised entirely of those from low-income backgrounds (Baggett et al., 2010; Ehrensaft et al., 2016; Zuckerman et al., 2022) while one sampled participants (64%) who were low-income earners according to federal poverty level classifications (Breitenstein et al., 2021). Of the 12 studies that reported on ethnic or racial background, five studies had samples where more than half of the participants were of White backgrounds (range: 55.20–88.00%). Of the 20 studies that reported on highest level of educational attainment, nine studies had samples where more than half of the participants had completed a tertiary level education (e.g., Bachelor degree or higher; range: 50.50–80.88%).

We identified 22 unique interventions, with all studies reporting on a single intervention. All interventions were standardized and validated. Online program durations ranged from one week to six months (M = 9.28 weeks). The average number of modules per intervention was 6.89 (range: 4–10 modules, n = 9). Of the 21 studies that reported the level of guidance provided, 9 (42.86%) reported on entirely self-guided programs, while 12 (57.14%) reported on partially self-guided programs, which included a combination of human support and self-guided program elements. For the partially self-guided content, guided program support was delivered asynchronously in four studies and synchronously in two studies. Six study interventions delivered a combination of synchronous and asynchronous human support methods. Researchers were the primary providers of guided intervention content (n = 6), followed by mental health professionals (n = 3), and a combination of peers and mental health professionals (n = 3). Programs delivery method varied: 13 studies were web-based, four mobile phone-based, three app-based, and two via a combination of delivery methods. The mean follow-up period was 23.79 weeks (range 1.5 weeks- 21 months, n = 17) depending on the particular outcome category. Eight studies reported on more than one follow-up period.

For parent socioemotional outcomes, 13 studies reported on parent depression, nine reported on parent stress, nine reported on parent self-efficacy, eight reported on parent anxiety, seven reported on parent social support, and five reported on parent satisfaction. All included parent outcomes were from validated self-report questionnaires. At the relational level, of the 6 studies that reported on parent–child interaction, three used parent-reported questionnaires (Breitenstein et al., 2021; Ehrensaft et al., 2016; Na & Chia, 2008); two used observational data (Baggett et al., 2010; Park & Bang, 2022); and one study used parent-reported and observational (Mogil et al., 2022).

All studies included a control group. Control groups varied, 17 included inactive (care as usual) controls and five included active controls with access to additional information resources. Identified studies with controls that experienced care beyond the usual and access to static informational resources were excluded from this study.

Overall, the quality of included studies was largely rated high (90.90%, n = 20); with some of moderate quality (9.09%, n = 2) and no studies of low quality. See Fig. 2 for visual representation of study quality and Supplementary Material 5 for a tabular depiction.

Fig. 2
figure 2

Reviewers’ judgments regarding each risk of bias item, presented as percentages for the 22 included studies using a modified version of the QuADS (2021). For items 1–13, a score was assigned for each criterion on the checklist using a four-point rating scale developed by Harrison et al. (2021), which are shown by the colors in the figure legend: red(0) = not reported; orange(1) = reported but inadequate; yellow(2) = reported and partially adequate; green(3) = points denote a low-risk of bias sufficiently reported and adequate. Item 14 was also included from the Jadad (1996) measure to assess for randomization using a three-point rating scale: red(0) = not reported; yellow(1) = item described as randomized but method not described or inappropriate; green(2) = described as randomized with appropriate method. The 14-item modified QuADS was not used as a means of study exclusion, but as an indicator of study quality across included studies (Color figure online)

Meta-analysis

Meta-analysis of the effectiveness of online parenting programs was evaluated using 134 total effects across seven outcomes (see Table 3 for study outcomes). As presented in Figs. 3, 4, 5, 6, 7, 8 and 9, meta-analyses were conducted for six parental outcomes—depression, anxiety, stress, social support, self-efficacy, and satisfaction—and one relational outcome, parent–child interaction. Statistically significant reductions in parent depression (d = − 0.299, t = − 3.394, df = 12.506, p = 0.005) and parent anxiety (d = − 0.321, t = –3.15, df = 9.921, p = 0.01) were observed following completion of a universal online parenting program. Significant increases in parent self-efficacy (d = − 0.632, t = -3.139, df = 7.912, p = 0.014) were also observed. These significant effects were all small-to-moderate. The effect on improved parent–child interactions at post-intervention approached significance (d = 0.28, t = − 2.074, df = 5.22, p = 0.09). Online parenting programs did not produce a significant effect on outcomes of parent social support (d = 0.289, t = 1.738, df = 5.975, p = 0.133), parent satisfaction (d = 0.601, t = 1.772, df = 3.994, p = 0.151) and parent stress (d = 0.109, t = 0.658, df = 8.874, p = 0.527). As described in the methods, adjustments were made in effect size estimation to adjust for cases where multiple-dependent effects/samples were contributed by the same study. To ensure interpretability of results, we endeavored not to synthesize effects from different socioemotional outcomes. The exception to this was for analyses focussed primarily on secondary factors, such as parent sex and time effects, presented below.

Table 3 Meta-analysis studies and outcomes
Fig. 3
figure 3

Parent anxiety forest plot. Cohen’s d effect sizes are shown for all included studies and their subsamples. The summary effect size and 95% confidence intervals are presented. Where more than one non-independent subsample was reported in a single study, all samples are shown, with a description of each shown in the subsample column. FU (mo.) = Follow-up time in months from intervention completion to data collection

Fig. 4
figure 4

Parent depression forest plot. Cohen’s d effect sizes are shown for all included studies and their subsamples. The summary effect size and 95% confidence intervals are presented. Where more than one non-independent subsample was reported in a single study, all samples are shown, with a description of each shown in the subsample column. FU (mo.) = Follow-up time in months from intervention completion to data collection

Fig. 5
figure 5

Parent satisfaction forest plot. Cohen’s d effect sizes are shown for all included studies and their subsamples. The summary effect size and 95% confidence intervals are presented. Where more than one non-independent subsample was reported in a single study, all samples are shown, with a description of each shown in the subsample column. FU (mo.) = Follow-up time in months from intervention completion to data collection

Fig. 6
figure 6

Parent self-efficacy forest plot. Cohen’s d effect sizes are shown for all included studies and their subsamples. The summary effect size and 95% confidence intervals are presented. Where more than one non-independent subsample was reported in a single study, all samples are shown, with a description of each shown in the subsample column. FU (mo.) = Follow-up time in months from intervention completion to data collection

Fig. 7
figure 7

Parent–child interaction forest. Cohen’s d effect sizes are shown for all included studies and their subsamples. The summary effect size and 95% confidence intervals are presented. Where more than one non-independent subsample was reported in a single study, all samples are shown, with a description of each shown in the subsample column. FU (mo.) = Follow-up time in months from intervention completion to data collection

Fig. 8
figure 8

Parent social support forest plot. Cohen’s d effect sizes are shown for all included studies and their subsamples. The summary effect size and 95% confidence intervals are presented. Where more than one non-independent subsample was reported in a single study, all samples are shown, with a description of each shown in the subsample column. FU (mo.) = Follow-up time in months from intervention completion to data collection

Fig. 9
figure 9

Parent stress forest plot. Cohen’s d effect sizes are shown for all included studies and their subsamples. The summary effect size and 95% confidence intervals are presented. Where more than one non-independent subsample was reported in a single study, all samples are shown, with a description of each shown in the subsample column. FU (mo.) = Follow-up time in months from intervention completion to data collection

Evidence for Publication Bias

As shown in Fig. 10, evidence for publication bias was first assessed using funnel plot analyses to depict the relationship between effect sizes (Cohen’s d) and standard errors. No outcomes assessed showed significant levels bias via an Egger's regression test (see Table 4). Additional outcomes (i.e., anxiety, self-efficacy) appear to show bias in their funnel plots under visual inspection, however this is of course subjective and was not corroborated by statistical tests.

Fig. 10
figure 10figure 10

Funnel plots for parent socioemotional outcomes and parent–child interaction

Table 4 Sensitivity and moderator analyses

Sensitivity and Moderator Analyses

As shown in Table 4, a sensitivity analysis of study design (experimental vs. quasi-experimental) demonstrated a significant influence only within the parent satisfaction outcome (t = − 5.561, p = 0.021). Two outcomes, parent anxiety and social support, included only studies with experimental designs and so could not be included in this analysis. For the inactive vs. minimally active control group sensitivity analysis, significant differences were observed between the studies with each type of control for parent self-efficacy and social support. To estimate the impact on overall results, meta-analyses for these two outcomes were re-run using only studies with inactive controls. This yielded minor increases in the intervention vs. control effect, as expected. There was no change in findings of significance for self-efficacy, which had already been demonstrated, but there was a change for social support, demonstrating that with a purer comparison between online universal parenting programs and inactive controls, there is a significant positive influence on social support (d = 0.407, p = 0.038). Finally, when comparing program efficacy outcomes for self-guided vs. partially self-guided program delivery, parent satisfaction displayed a significant sensitivity.

To assess the maintenance of effects over time post-intervention, study sample effect sizes were grouped into short-term (0–3 months), medium-term (4–6 months), and long-term (7–24 months) evaluations. To obtain a suitable degree of statistical power, all outcomes were aggregated to perform this analysis. Pair-wise time group comparisons using t-tests demonstrated a minor (non-significant) decrease in effect from short to-medium-term time frames post-intervention (t = 1.29, df = 51.63, p = 0.204), followed by a significant reduction in effect when comparing both short- (t = 5.18, df = 78.65, p < 0.001) and medium-term effects to long-term outcomes (t = 3.67, df = 42.959, p < 0.001). See Fig. 11 for a visual representation of these data.

Fig. 11
figure 11

Change in effects from parenting programs over time, aggregating all outcomes. Mean Cohen’s d values and 95% confidence intervals are shown, along with significance indicators for each pairwise comparison (*** = p < 0.001)

At the combined outcome level, we identified no significant difference between parent sex (t = − 0.185, df = 3.39, p = 0.864) or between interventions with partially self-guided programs and entirely self-guided programs (t = 1.24, df = 19.14, p = 0.231). At the outcome-specific level, a significant difference in intervention outcome was observed between parent sex for social support, and for unguided vs. partially guided studies for parent satisfaction, as shown in Table 4.

Discussion

This meta-analysis examined seven parent-child outcomes, across 22 studies, associated with online parenting programs, and represents the first meta-analysis study focusing on the universal population. We found evidence that these programs can enhance parent behaviours, perceptions, and mental health, although these benefits are generally not sustained. (Baumel et al., 2016; Florean et al., 2020; Spencer et al., 2020; Thongseiratch et al., 2020)Improved parent self-efficacy was the strongest identified outcome, consistent with findings elsewhere in selective and indicated prevention sample meta-analyses (Florean et al., 2020; Flujas-Contreras et al., 2019). Significant impacts were also observed in parent depression and parent anxiety. Collectively, our findings demonstrate short-term meaningful impacts of online parenting support programs on parental mental health problems and the promise that these programs hold. However, it is important to note that effects did not appear to persist beyond the short-term following program completion and that effects did not extend to benefit the parent–child relationship. If these short-term effects can be sustained, programs are likely to have substantial preventative healthcare impacts, serving as a high-quality alternative that overcomes the logistical and financial barriers of face-to-face services, intensified by COVID-19 and a critical shortage of trained professionals.

The present results broadly align with prior meta-analyses that examined selective and/or indicated programs (Li et al., 2021; Spencer et al., 2020). Our findings provide an important addition to existing evidence for online universal parenting programs, which, to date, has been limited to a secondary analysis including a small sample of primary studies (Spencer et al., 2020; k = 9).

Parent Outcomes

Despite similarities with prior studies, our findings vary in important ways. First, our effect sizes are generally smaller or did not reach statistical significance. This difference is unsurprising considering the inclusion of solely universal programs, while other meta-analyses have included varied program types (i.e., selective, indicated, treatment), with many actively excluding universal programs (Baumel et al., 2016; Cai et al., 2022; Florean et al., 2020; Li et al., 2021; Thongseiratch et al., 2020). The typically more modest observable effects of universal programs represent a challenge in demonstrating efficacy in statistical terms, generally requiring larger sample sizes. By taking advantage of recently published primary research, this meta-analysis represents the first study to combine sufficient evidence to estimate population-wide efficacy of online parenting programs on a broad range of socioemotional outcomes. Programs applied to general populations typically have smaller individual effects relative to programs targeting higher-risk populations (McLaren, 2019; Rose, 2001; Werner-Seidler et al., 2017). Herein lies the prevention paradox: a preventative measure which brings great benefit to the population overall may offer little to each participating individual, yet small changes in the mean of whole-of-population distributions can result in marked societal benefits and reduce the need for costly subsequent targeted measures (Rose, 1981, 1985).

Again, in contrast to Spencer et al. (2020), the only other meta-analysis of universal programs, we did not observe significant changes in parent confidence or parent stress. This inconsistent finding may be explained by Spencer’s inclusion of parents with a child 0–18 years, while the current review included only parents of younger children (≤ 5 years). Only through comparison with a control group can intervention effects and natural changes over time be teased apart. Additionally, and unlike the present review, Spencer included both within-group and between-group study designs. Due to the lack of control groups against which to anchor results, within-group studies are likely to yield inflated results due to normative changes.

Child Outcomes

Like Spencer et al. (2020), but unlike prior meta-analyses that focused only on selected and/or indicated populations, we were unable to examine child-specific outcomes due to limited data availability, pointing to a lack of evidence examining universal self-directed parenting programs for children aged 0–5 years. This highlights the need for further investigation into the impact of such universal programs in the earliest years, relative to more intensive programs (Flujas-Contreras et al., 2019). Prior meta-analyses have not been limited to self-directed (entirely self-directed and partially self-directed programs) online parenting programs, with many providing direct professional support, longer training length, and greater program intensity. Considering these differences, cautious cross-study comparisons must be made.

Relational Outcomes

Even though few studies assessed dyadic or relational outcomes, programs that assessed relational health showed improvement in parent–child interaction following program participation, with effects approaching significance. We would expect additional study and data in this area to highlight some true benefits. This calls for additional research as only four studies were included in the meta-analysis, but past primary research has associated poor parental mental health with adverse parenting of offspring (Borre & Kliewer, 2014; Harvey et al., 2011; Sturge-Apple et al., 2011). However, relational health was variably assessed in the included studies, typically completed via parent-report questionnaires, with minimal studies including follow-up assessments, highlighting the need for more nuanced attention to measurement of dyadic interactions. Furthermore, there were few studies of father-child relationships. This is noteworthy considering the current study found no evidence to suggest that these programs are any less efficacious for fathers relative to mothers.

Notably, no digital programs focused primarily on strengthening the parent–child relationship, and few studies evaluated program impacts on parent–child interaction, despite the health of this relationship being seminal to the future mental health and well-being of both generations. This highlights an important gap in knowledge and practice in universal parenting programs, and is in stark contrast to significant research investments in relational health in high-risk populations (Bergsund et al., 2021). It may be that universal relational programs have been examined in less depth, due to the perceived non-critical nature of these dyads. Given the economic benefit of early intervention and prevention at the public health level further examination of the relational impacts of universal programs remains necessary (Heckman, 2012). Furthermore, given the centrality of the parent–child bond to child development across the lifecourse, additional investment in new digitally facilitated approaches focusing on this bond are likewise warranted.

Sensitivity and Moderator Analyses

Sensitivity and moderator analyses yielded interesting findings in the present review. Firstly, we found no intervention efficacy differences between partially guided and completely self-guided programs. This is a notable result given the considerably reduced cost and resources of entirely self-guided programs relative to partially guided programs. Through these analyses, we also observed a statistically significant decline in program efficacy over time. This suggests that, following the completion of the main program, additional refresher content or other initiatives may be required to sustain program efficacy (Furlong & McGilloway, 2012).

Of particular interest were results of a sensitivity analysis assessing the impact of programs on inactive vs. (minimally) active control groups. This analysis demonstrated sensitivity of results to control type for parent self-efficacy and social support. For each of these outcomes, the meta-analysis was then repeated after removing active controls, yielding significance for both outcomes and larger effect sizes. For social support, this is notable as the main analysis including all controls did not yield significance. This finding is intuitive and emphasises the importance of carefully considering the potential influence of differing control conditions when examining program efficacy.

Strengths and Limitations

We restricted our review to include a homogenous group of studies (i.e., experimental and quasi-experimental universal online programs and parents with a child aged 0–5), differing from other meta-analyses that have aggregated heterogenous data (e.g., Baumel et al., 2016; Thongseiratch et al., 2020). Additionally, we assessed a nuanced subset of digital parenting programs, namely self-guided and partially self-guided programs, due to their significantly greater potential for reach and funding (Spencer et al., 2020). Further, our meta-analysis builds on these prior studies through the examination of unique parent socioemotional outcomes not previously reported on at the universal level: parent self-efficacy, social support, parent anxiety, and parent–child interaction. Compared to prior meta-analyses, the scope of our search was also considerably larger, including a rigorous examination of a wide range of outcomes, at multiple follow-up time intervals, yielding some insight into the trajectory of parent-specific outcomes.

Despite these strengths, limitations must be noted. There was some variety in control groups used, with some receiving purely care as usual treatment and others receiving informational resources as part of their participation. Further, there can also be marked differences in what constitutes case-as-usual in different countries and regions. For example, care and availability of support services is probably better in metropolitan areas and developed countries, which may have some impact on results. Further, most relational and child-specific outcomes could not be examined due to a lack of included studies reporting on them. Given child-specific outcomes have previously been meta-analyzed, this limitation points towards a lack of evidence examining universal-specific program outcomes for parents and children aged 0–5. This in turn brings to light the need for further investigation into the impact of such universal programs in the earliest years. The dearth of universal programs focused primarily on strengthening the parent–child relationship is of concern, given health economic evidence highlighting the centrality of this relationship to the well-being of both generations, and the importance of early intervention (Heckman, 2012). This underscores an important gap in knowledge and practice in universal parenting programs, in contrast to significant research investments in relational health in high-risk populations (selective prevention, indicated prevention, and treatment programs) (Bergsund et al., 2021). Finally, there is a lack of clarity in program evaluation and what a robust approach should include before public dissemination (e.g., relational assessment looking at attachment outcomes, interactional quality using observational measures, and longitudinal child development including socioemotional functioning). Current studies appear to be failing to evaluate programs using such robust approaches.

Future Research

There is clear benefit in considering future research strategies that better clarify “what works for whom” within universal interventions, defining the levels of evaluation within a stratified model, that enables new knowledge of outcomes relative to pre-existing risk and vulnerability, including parents impacted by clinical depression and anxiety, histories of trauma, and attachment disruption. Longer-term outcome pathways for the general population warrant attention.

We suggest future research place greater focus on universal preventative programs to reduce later service system burden, which has public health significance. Once developed, these programs should be easily accessible, scalable, affordable, available ongoingly, and delivered flexibly and on-demand, providing greater program reach for those unable to attend through traditional in-person means. It is critical to distinguish effective universal support for well-being in the adaptation to parenting from targeted support for early mental health disorders including post-partum depression.

Future investment is needed in the formation and sustained evaluation of specific evidence-based public health programs pertaining to enhancing relational health and inhibiting early relational trauma. Investment in online universal relational interventions is an efficient way to utilize limited resources relative to other selective and indicated health initiatives. These programs will likely be relationally protective, resiliency-building within the relationship, while enhancing child health and well-being, thus reducing risk factors. Formation of such universal online relational programs may act as a socioemotional health preventative measure for the parent, child, and their relationship. While early relational health is routinely a feature of research and included in treatment programs for high-risk dyads; there appears to be an absence of specific universal parenting programs highlighting this content. Program elements that focus on enhancing parental awareness of their role in promoting early relational trust may assist with prevention of relational trauma, especially in the face of challenge.

Implications for Policy and Practice

In population health terms, our results suggest short-term small-to-moderate effects across several outcomes from online parenting education programs. Such effects are significant; however, these effects appear not to endure over time. Further development of practice and policy surrounding universally accessed parenting programs should work to ensure that learned behaviours and skills are retained over time. This will likely require refresher content and long-term access to program materials for participants.

In time, these programs may both provide early support for parents of young children and play a screening function, offering early triage to indicated supports, and potentially reducing the need for later targeted measures. Taken together, our findings support a future focus on the development of universally available, population health online preventative programs for parents. This is due to their capacity to reduce later public service system burden. Such programs would optimally be accessible, scalable, and affordable, with flexible delivery and ongoing access providing greater program reach, especially for those unable to attend through traditional in-person means.

Conclusion

Online preventative parenting programs have a unique contemporary capacity to provide mental health content and support for parents of young children (McGoron & Ondersma, 2015). In targeting improved understanding of child needs, empathic responsivity, and regulation of child experience, online parenting programs reviewed in this analysis show potential for both support of parents in their role and prevention of personal and developmental difficulties attendant to stressed parenting. Collectively, findings demonstrate the potential impact of these online parenting support programs on parent well-being, however, existing programs are yet to demonstrate that this impact can be maintained over longer periods. Universal online parenting programs (e.g., MERTIL for Parents; Opie et al., 2023) may have preventative healthcare impacts provided their efficacy can endure, serving as a high-quality alternative to overcoming the logistical and financial barriers in accessing face-to-face services.