Parents’ behavior has profound lifelong effects on young people’s health, wellbeing and education (Viner et al. 2012). Parenting interventions have been shown to improve child-caregiver relationships, promote positive parenting practices, and reduce harsh parenting as well as child maltreatment (e.g., see Barlow and Coren 2018). While most studies have been conducted in Western settings, a growing number of evaluations in low- and middle-income countries (LMICs) also suggest positive effects (Gardner et al. 2015; Knerr et al. 2013). Consequently, international agencies, such as the World Health Organization, have recommended parenting interventions among the key strategies for reducing violence within the family (World Health Organization 2016). Parenting support is seen as especially important in LMICs, as most of the world’s children live there and maltreatment tends to occur at higher rates than in high-income countries (Ward et al. 2016).

Converging evidence from interventions for a range of health and social outcomes suggests that interventions need to be delivered with sufficient quality and participant engagement to produce the intended results. In a widely cited review of prevention and health promotion programs for children, Durlak and DuPre (2008) found that in 45 of 59 studies there was a statistically significant positive relationship between implementation level and at least half of the intervention effects measured. A systematic review of group-based parenting programs for early-onset conduct disorders among children 3 to 12 years old found that studies reporting higher levels of implementation had larger intervention effects on measures such as negative parenting practices than studies with lower implementation rates (Furlong et al. 2012). In line with these findings, the development and study of parenting interventions have been accompanied by an emphasis on the quality of intervention processes (Forgatch et al. 2005; Olofsson et al. 2016).

Berkel and colleagues developed a helpful framework to guide research on the effects of implementation (Berkel et al. 2011), which focuses on two interrelated dimensions of implementation, facilitator and participant behavior, that can both influence program outcomes. For facilitator behavior, the authors identify three aspects: fidelity (intervention components delivered as prescribed), quality of delivery (facilitator teaching and process skills), and adaptation (facilitators modifying the intervention). Most of these terms have alternative labels; for instance, fidelity is often referred to as adherence, and quality as facilitator competence. For participant behavior (responsiveness), the relevant aspects in the model are attendance, the level of active engagement with the intervention, home practice and satisfaction.

Several parenting intervention studies have examined the effect of facilitator fidelity and quality of delivery on parenting outcomes, including studies of Incredible Years in the United Kingdom (Eames et al. 2009), the Chicago Parent Program (Breitenstein et al. 2010) and the Strengthening Families Program (Cantu et al. 2010) in the United States (US), Parent Management Training—Oregon Model in the US and Norway (Forgatch and DeGarmo 2011; Forgatch et al. 2005), Brief Parent Training in Norway (Kjøbli et al. 2012), and Growing Up Happily in the Family in Spain (Alvarez et al. 2016). Most, but not all, of these studies (for exceptions, see Breitenstein et al. 2010; Cantu et al. 2010) found higher facilitator fidelity or quality to be related to stronger improvements in some of the parenting or child behavior outcomes. A study of the Sinovuyo Kids project in South Africa focused on families with children 2–9 years old (Wessels 2017). In Sinovuyo Kids, the self-reported performance of all facilitators was close to the maximum possible on the measure; thus, there was insufficient variation to examine the effect of fidelity on program outcomes.

The differences between the resources available in high-income countries (HICs) and LMICs may affect the extent to which interventions can be implemented to the desired standard. In HICs, facilitators of parenting and other psychosocial interventions are often expected to have graduate degrees and backgrounds in social work, psychology or education (Webster-Stratton and McCoy 2015). In LMICs, it is increasingly common for interventions to be delivered by lay workers due to the need for service scale-up and the limited number of available professional staff (Patel et al. 2011).

During parenting intervention sessions, participants can learn about, observe and practice various techniques, and each session has a specific topic. Therefore, if an intervention is efficacious, the participants who attend and actively engage with it stand to benefit the most. Indeed, studies of interventions such as The Incredible Years (Baydar et al. 2003) and the Chicago Parent Program in the US (Gross et al. 2009) found that higher attendance and more active engagement in sessions were linked to better intervention outcomes. In an evaluation of Fast Track in the US, engagement in sessions predicted several of the outcomes, while attendance did not, suggesting that attendance without active engagement may be insufficient to improve outcomes (Nix et al. 2009). However, not all studies found a statistically significant effect of participant attendance and engagement on treatment outcomes, and even in studies that established a significant relationship, the dose-response pattern was generally present for some, but not all, outcomes (Alvarez et al. 2016; Gross et al. 2009; Wessels 2017; Weeland et al. 2017).

In summary, several previous parenting trials examined the role of implementation, attendance and engagement in outcomes, finding that the way the intervention was implemented and received was often significantly related to variation in participant outcomes. Most of these trials evaluated parenting programs aiming to improve child behavior. However, interventions targeting problem behavior in children are very similar in design to programs primarily addressing caregiver behavior to reduce child maltreatment (Knerr et al. 2013). The current study explores the implementation of Sinovuyo Teen, a parenting intervention for families with adolescents, within a randomised controlled trial (RCT) in South Africa. This paper aims, first, to describe the key dimensions of intervention implementation and, second, to examine whether the implementation measures in this study predict participant outcomes on child maltreatment and parenting behavior.

Method

Participants

This study was nested within an RCT in the Eastern Cape, South Africa. The Eastern Cape province is a historically disadvantaged part of South Africa, with poor infrastructure and high rates of poverty and unemployment. The trial was conducted between April 2015 and August 2016, and enrolled 552 families within 32 rural and 8 peri-urban study clusters, randomised into two arms. Families were identified through a range of sources, including self-referrals, local chieftains, community-selected representatives, schools, social services, and door-to-door visits. To be eligible for the study, families had to respond positively to one of the screening questions on whether there were conflicts between the caregiver and adolescent in the household, and to complete the two rounds of baseline assessments. The treatment clusters received the parenting program, described below; the control clusters received a hygiene information event. More details, including power calculations for the trial, are available in the study protocol (Cluver et al. 2016b).

Procedure

The intervention manual was based on social learning principles and was developed drawing on existing research, consultations with experts, and piloting in South Africa. Two pilot studies were conducted, including qualitative research with program facilitators and participants to incorporate their feedback into the revised manual and ensure program acceptability and relevance (Cluver et al. 2016a; Cluver et al. 2016c). Sessions were designed to be participatory and non-didactic, and covered topics such as praise and relationship-building, managing emotions and solving family problems. The group sessions included both the adolescents and their caregivers to facilitate change in family relationships. The intervention was delivered by a local non-governmental organisation (NGO) and consisted of 14 weekly group sessions. In the ten joint sessions, caregivers and adolescents worked together in the same room, while in the four separate sessions they met in parallel groups, ideally out of each other’s earshot, to promote open discussion. In addition, participants were given a home practice task after each session to practice new skills; for example, facilitators asked adolescents and caregivers to give each other compliments during the week to practice offering specific labelled praise. For those who could not attend sessions, facilitators delivered home visits with a short summary of the week’s topic.

The facilitators were recruited by the implementing NGO to deliver the intervention during the RCT, and all but two of the 25 facilitators had no previous experience with implementing a parenting intervention. Several facilitators were social workers seconded by the local government, and the others had various backgrounds in diverse fields, such as insurance, project management, arts and culture. Facilitators received a five-day training in collaborative facilitation methods and parenting principles, as well as on-going weekly day-long supervision and training on session content. Several strategies, such as session observations by the facilitators’ trainer and NGO staff, were used to maintain treatment fidelity and quality.

Measures

RCT data collection

Primary caregivers and adolescents completed self-report measures at baseline, at post-test one month after the intervention (92% retention from baseline), and at follow-up 5–9 months after the intervention (97% retention from baseline). Due to high population mobility in the study area, at the time of the follow-up data collection, 8% of the intervention-arm adolescents who completed follow-up were neither living in the same house as the original caregiver nor spending most of the week together. These participants could therefore not report on parenting in the original pair and were not included in the current analyses.

The randomised trial found improvements across several family and individual outcomes. Among the primary outcomes, caregivers reported reduced physical and emotional maltreatment, reduced use of corporal punishment and poor monitoring, as well as an increase in positive and involved parenting. Adolescents reported reduced maltreatment at post-test, but not follow-up, an increase in involved parenting, and a decrease in inconsistent discipline. Adolescents reported fewer intervention effects than their caregivers. Full details on the intervention effects are available in other publications (Cluver et al. 2018; Steinert et al. 2018), and basic descriptive information about the intervention arm is provided in Table 1.

Table 1 Baseline sample characteristics

Outcome measures

The outcome measures in this paper were the primary trial outcomes as specified in the study protocol. Adolescents’ emotional and physical maltreatment, and neglect by caregivers, were measured using a culturally-adapted version of the ISPCAN Child Abuse Screening Tool, ICAST-Trial (Meinck et al. 2018; Zolotor et al. 2009). At baseline, within the intervention arm, Cronbach alphas for the physical and emotional maltreatment scale were 0.78 (14 items) for caregiver report and 0.87 (12 items) for adolescent report. For neglect, the alphas were 0.62 (3 items) for caregiver report and 0.77 (6 items) for adolescent report.

Parenting (poor parental monitoring, inconsistent discipline, corporal punishment, positive parenting and positive involved parenting) was measured using the Alabama Parenting Questionnaire (APQ; parent and adolescent versions), which is widely used internationally and has previously been used in South Africa (Essau et al. 2006; Lachman et al. 2014). For caregiver report, Cronbach alphas were 0.78 for positive parenting (6 items), 0.77 for involved parenting (10 items), 0.72 for poor monitoring (10 items), 0.74 for corporal punishment (3 items) and 0.55 for inconsistent discipline (6 items). For adolescent report, Cronbach alphas were 0.89 for positive parenting (6 items), 0.87 for involved parenting (10 items), 0.76 for poor monitoring (10 items), 0.68 for corporal punishment (3 items) and 0.68 for inconsistent discipline (6 items).

Predictors

Implementation measures were collected by research assistants (RAs) observing sessions, in order to reduce the burden on the facilitators and participants and to avoid the social desirability bias associated with self-reported implementation measures (Dusenbury et al. 2005). The RAs collected data in 277 of the 279 sessions delivered, and 32% of the sessions were double-rated by two RAs. RA ratings were supplemented with comments that were reviewed during data analysis to validate and contextualize the quantitative trends. Information about home visits came from the records of the implementing NGO. We examine the following five implementation indicators as predictors of treatment outcomes.

Group session attendance—based on the program design, participation of both caregiver and adolescent in a family is necessary for intervention success. Therefore, we include the total number of sessions attended by adolescents and caregivers (the sum of the sessions each attended) as a candidate predictor of the changes in primary outcomes.

Home visits—as facilitators offered participants home visits after missed sessions, we also examined the effect of the total number of home visits delivered by facilitators to each household after missed sessions.

Overall dosage—we used the total sum of group sessions and home visits the family received to examine an overall dose-response effect. For simplicity, we treat group sessions at community venues and individual home sessions as equivalent. However, the home sessions were likely not equivalent to the group sessions, as the intervention relies on group interactions. In addition, the home sessions lasted only about 20 min each, roughly one-fifth the length of an average group session.

Family engagement—average of adolescent and caregiver engagement scores from all the sessions they attended. To measure the level of engagement in sessions, we used a behaviorally-anchored 3-point scale (1—Adolescent or caregiver was quiet or distracted most of the time; 2—Adolescent or caregiver participated in parts of the session; 3—Adolescent or caregiver participated through most of the session).

Facilitator fidelity—average rating of the facilitators in a cluster in sessions 1–14 (see Table 2). Slightly different items were used at the first and last sessions due to a different session structure. Cronbach alpha for the final scale was 0.92 (6 items). These items were developed based on the program manual and in consultation with the program developers. The observation tools assessed fidelity by measuring how well, according to RA observation, the facilitators implemented the core activities in a session, such as introducing, reinforcing and summarizing core lessons, and performing and discussing role plays. Moreover, we assessed facilitator skill by measuring the extent to which facilitators encouraged an open and supportive discussion.
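To make the first four indicators concrete, the sketch below shows one way they could be derived from per-session attendance and engagement records in R; the data frame session_records and all variable names are hypothetical rather than those of the actual trial database, and the fidelity scale is treated separately below.

```r
# Hypothetical construction of the dose and engagement predictors from
# per-session records (one row per family per session); all names assumed
library(dplyr)

predictors <- session_records %>%
  group_by(family_id) %>%
  summarise(
    # total group-session attendance: caregiver plus adolescent sessions
    attendance  = sum(caregiver_attended) + sum(adolescent_attended),
    # home visits delivered after missed sessions (from NGO records)
    home_visits = sum(home_visit_delivered),
    # overall dosage: group sessions and home visits treated as equivalent
    dosage      = attendance + home_visits,
    # family engagement: mean of the 1-3 ratings over attended sessions
    engagement  = mean(c(caregiver_rating[caregiver_attended == 1],
                         adolescent_rating[adolescent_attended == 1]))
  )
```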

Table 2 Items used to assess session implementation

While in seven of the 20 intervention groups the original pair of facilitators delivered the entire program, in others, the same pair of facilitators did not always facilitate in the same groups. Given this variation, we focus on the average ratings of all the facilitators in a group, rather than looking at the impact of individual facilitators. To create the final scale measuring facilitator performance, we used exploratory factor analysis, which suggested a one-factor solution (Kjøbli et al. 2012). The item on whether the facilitator was judgemental did not correlate well with the other items, perhaps because facilitators showing critical judgements of participant behaviors were quite rare; hence, this item was not included in the final scale. As the final scale primarily measured the completion of intervention activities, we treat it as a measure of fidelity. The approach to measuring fidelity was focused on the function of intervention activities (Hawe et al. 2004). In other words, the data collection emphasized assessing whether the purpose of the session activities was achieved rather than whether the activities were simply completed or not.
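For illustration, the scale-construction steps described in this paragraph (exploratory factor analysis, exclusion of a weakly correlating item, internal consistency) could be run in R along these lines; fidelity_items and retained_items are hypothetical objects, not the study's actual code.

```r
# Sketch of the fidelity scale construction; 'fidelity_items' (one row
# per observed session, one column per rated item) is hypothetical
library(psych)

# Exploratory factor analysis (the study reports a one-factor solution)
efa <- fa(fidelity_items, nfactors = 1, fm = "ml")
print(efa$loadings)  # a weakly loading item, such as the "judgemental"
                     # item here, would be dropped from the final scale

# Internal consistency of the retained six items
alpha(fidelity_items[, retained_items])  # study reports alpha = 0.92
```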

Research assistant training

Fifteen RAs were involved in the implementation data collection after a five-day training in observational research and the data collection forms. Daily supervision was conducted to ensure consistency in completing the forms. All RAs were from the local area and most had completed at least secondary education.

Data Analyses

For the double-coded sessions, we examined the inter-rater reliability of the observational measures. The intra-class correlation coefficients ranged from 0.7 to 0.9 for the fidelity items and from 0.8 to 0.9 for participant engagement. As high inter-rater reliability was achieved on these measures, we used the average of the two observations in the analyses.
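A minimal sketch of this reliability check, assuming ratings is a matrix with one row per double-coded session and one column per RA; the exact ICC specification used in the study is not reported, so the two-way agreement model below is an assumption.

```r
# Illustrative inter-rater reliability for the double-coded sessions;
# 'ratings' is a hypothetical sessions x raters matrix for one item
library(irr)

# Average-measures ICC, since analyses used the mean of the two ratings;
# the study reports ICCs of 0.7-0.9 (fidelity) and 0.8-0.9 (engagement)
icc(ratings, model = "twoway", type = "agreement", unit = "average")
```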

Fixed-effect regression analyses and Kruskal-Wallis tests were used to check whether the implementation characteristics differed significantly between groups. To examine changes in individual engagement over time, we used time (session number) as a predictor of engagement in a multi-level model. To examine the impact of implementation predictors on participant outcomes, we adopted a longitudinal data analysis approach, as recommended for trials with repeated measures (Moerbeek and Teerenstra 2016). To detect the effects of implementation characteristics, we used the following model:

$$\begin{aligned}Parenting_{ti} ={} & \beta_{00} + \beta_1\left(Predictor_i\right) + \beta_2\left(PT_{ti}\right) + \beta_3\left(FU_{ti}\right)\\ & + \beta_4\left(Predictor_i \times PT_{ti}\right) + \beta_5\left(Predictor_i \times FU_{ti}\right)\\ & + \alpha_1\left(Stratification_i\right) + u_{0i} + \varepsilon_{ti},\end{aligned}$$
(1)

where Level 1 = occasion (t) and Level 2 = individual (i); PT indicates the immediate post-test and FU the follow-up. $Parenting_{ti}$ is the estimated value of parenting at time t for person i, and $Predictor_i$ is the candidate attendance, engagement or implementation characteristic for individual i. The interaction terms $\beta_4$ and $\beta_5$ are the key parameters of interest, as they allowed us to examine the effect of the predictors on change over time in the intervention arm. The last two terms, $u_{0i}$ and $\varepsilon_{ti}$, are the person- and time-level residuals.

We estimated separate effects at post-test and follow-up. To avoid non-essential collinearity, all predictors were grand-mean centred. We used a linear link function for parenting outcomes and a negative binomial function for maltreatment, neglect and corporal punishment outcomes to account for the skewed distribution of the harsh parenting behaviors. The linear models were estimated using maximum likelihood estimation, drawing on all available data. The outputs of the negative binomial models are presented as incidence rate ratios (IRRs). We tested whether random slopes for time were indicated for these models (Barr 2013); however, they either did not substantively improve model fit or, in a small number of cases, produced convergence issues, and so were not included in the final models. Although this was a cluster randomised trial, we did not include a separate level for clusters, given the low ICC at the cluster level (under 5%) and a design effect under 2.0 (Peugh 2010); for ICC values, see Table 3. To obtain conservative estimates, we used robust clustered standard errors to account for cluster-level correlations.
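Although the trial analyses were conducted in Stata, Eq. (1) translates readily into R syntax; the sketch below is illustrative only, with hypothetical object names (long, pred, pt, fu, strat, id, cluster), and shows one possible way to obtain the negative binomial IRRs and cluster-robust standard errors described above.

```r
# Illustrative R translation of Eq. (1); the trial itself used Stata.
# 'long' is a hypothetical person-period data set: one row per person
# per wave, with dummies pt/fu for post-test and follow-up.
library(lme4)      # linear mixed models for the parenting outcomes
library(glmmTMB)   # negative binomial models for the skewed outcomes

# Grand-mean centre the implementation predictor (non-essential collinearity)
long$pred <- long$pred - mean(long$pred, na.rm = TRUE)

# Linear outcome (e.g., positive parenting): random intercept u_0i only,
# as random slopes for time did not improve fit or failed to converge;
# REML = FALSE gives maximum likelihood estimation
m_lin <- lmer(parenting ~ pred * pt + pred * fu + strat + (1 | id),
              data = long, REML = FALSE)

# Maltreatment, neglect, corporal punishment: negative binomial model;
# exponentiated fixed effects are the incidence rate ratios (IRRs)
m_nb <- glmmTMB(maltreat ~ pred * pt + pred * fu + strat + (1 | id),
                family = nbinom2, data = long)
exp(fixef(m_nb)$cond)

# Cluster-robust standard errors for the linear model (one option);
# the cluster vector must align with the estimation sample
library(clubSandwich)
coef_test(m_lin, vcov = "CR2", cluster = long$cluster)
```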

Table 3 Schedule of measurement of primary outcomes and their ICC values

We accounted for multiple hypothesis testing using the Benjamini–Hochberg procedure (Benjamini and Hochberg 1995), treating all tests as one family. For each outcome, we report both the unadjusted p-values and the corrected q-values. All analyses were implemented in Stata/SE 14.2, except the ICC inter-rater reliability calculations and the multiple hypothesis testing adjustment, which were implemented in R 3.3.0 (package FSA v0.8.20).
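The multiple-testing step is simple to reproduce; a minimal sketch, assuming pvals is a hypothetical numeric vector holding the unadjusted p-values from all tests in the family:

```r
# Benjamini-Hochberg adjustment over the whole family of tests;
# 'pvals' is a hypothetical vector of unadjusted p-values
qvals <- p.adjust(pvals, method = "BH")

# Worked example with three illustrative p-values:
p.adjust(c(0.010, 0.040, 0.053), method = "BH")
# [1] 0.030 0.053 0.053
```

Note how the middle p-value is pulled up to the value above it: the BH step enforces monotonicity of the adjusted q-values.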

Results

Overall, families received 91% of the sessions either as group sessions or as home visits (see Table 4). Average group session attendance, however, was only 58%. The highest number of group and home sessions received by a family was 31 rather than the expected maximum of 28 (14 sessions for each of the caregiver and the adolescent), because a few families received additional home visits even though they had attended the group sessions. Family engagement in group sessions averaged 74% of the possible maximum, and facilitator fidelity averaged 83%.

Table 4 Descriptive statistics of the implementation characteristics

Attendance, Participation and Implementation Over Time

Neither family attendance and engagement nor fidelity displayed systematic change over time (see Table 5 for fidelity ratings by week). The level of fidelity appears to have been influenced by the specific circumstances and content of the session in a particular week. Based on the RA observations, common challenges with regard to fidelity concerned the adoption of participatory approaches (e.g., facilitators reading role plays aloud instead of acting them out with participants) and facilitators missing or misunderstanding parts of the content.

Table 5 Session fidelity ratings by week from all 20 intervention groups

Attendance, Participation and Implementation Across Clusters

Overall, there was a statistically significant difference between clusters in the average participant attendance, engagement and overall treatment dose. In other work, we examined predictors of attendance and engagement (Shenderovich et al. 2018).

The Relationship of Implementation and Participant Behavior

There was a positive, statistically significant correlation between average participant engagement and fidelity at the cluster level, r(18) = 0.53, p < 0.05. However, when examined separately for caregivers and adolescents, this effect was statistically significant only for average adolescent engagement, r(18) = 0.62, p < 0.01. Although facilitators reported challenges implementing the program when attendance was low, fidelity was not correlated with the average attendance rate in a particular group, r(18) = 0.02, p > 0.05 (see Table 6 for cluster-level correlations).
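These cluster-level correlations amount to Pearson tests over the 20 cluster means (hence 18 degrees of freedom); a sketch with hypothetical vector names:

```r
# Pearson correlations across the 20 intervention clusters; each vector
# is a hypothetical set of cluster-level averages (df = 20 - 2 = 18)
cor.test(clus_engagement, clus_fidelity)  # reported: r(18) = 0.53, p < 0.05
cor.test(clus_attendance, clus_fidelity)  # reported: r(18) = 0.02, n.s.
```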

Table 6 Cluster-level correlations of implementation characteristics

Adaptations of the Intervention

The open-ended comments in the RA observation forms suggest some adaptations of the Sinovuyo Teen program by facilitators during implementation. While adaptation was not measured systematically, we provide examples of the common changes made by facilitators, as reported by the RA observers. In some groups, facilitators had to adapt to low attendance when activities were designed to be practiced in family dyads but only one person from a dyad attended. On other occasions, instead of starting a new topic, facilitators continued with lessons from previous weeks when they felt something had not been covered. In some locations, a lack of physical space meant that the separate sessions for adolescents and caregivers could not be held in quiet, confidential rooms. We also observed that sessions were shorter than prescribed (1.8 h on average instead of 3 h), but session duration was not correlated with fidelity (see Table 6).

Implementation Characteristics as Predictors of Outcomes

At both post-test and follow-up, there was no overall trend of the implementation characteristics affecting the outcomes in the intervention arm (see Tables 7 and 8). However, there were several statistically significant effects. A higher number of home visits was related to a slightly larger reduction in maltreatment at post-test (caregiver report), while at follow-up, a higher total dose of the intervention was related to larger reductions in inconsistent parenting (adolescent report) and in neglect scores (caregiver report). Higher family engagement predicted lower adolescent-reported involved parenting at post-test (p = 0.040), and higher fidelity was marginally related to lower adolescent-reported positive parenting at post-test (p = 0.053). Furthermore, participants in groups with higher fidelity reported an increase in adolescent-reported maltreatment and a smaller decrease in neglect than participants in groups with lower fidelity.

Table 7 The impact of implementation predictors on primary outcomes at post-test in the intervention arm
Table 8 Implementation predictors at follow-up (intervention arm)

After adjusting for multiple hypothesis testing, the only remaining significant effect of implementation characteristics on participant outcomes was the effect of fidelity on adolescent-reported maltreatment. Contrary to our expectation, higher fidelity was linked to increased reports of maltreatment, IRR = 1.33, 95% CI [1.19, 1.49], p < 0.01; that is, clusters with higher average fidelity showed a higher expected rate of adolescent-reported maltreatment.

Discussion

In summary, the current study suggests that it was possible to deliver the Sinovuyo Teen intervention with a level of implementation comparable to studies in HICs, although precise comparisons across studies are not possible due to differences in the interventions and measures used. For example, in this study, session fidelity was rated on average at 83% of the maximum, participant attendance at group sessions at 58%, and participant engagement at 74%, whereas the Chicago Parent Program in the US reported average facilitator fidelity at 89%, group attendance at 50% and participant engagement at 82% (Breitenstein et al. 2010). This finding adds to the growing body of evidence from fields such as mental health that complex interventions can be delivered by lay staff with appropriate training and supervision (Kazdin and Rabbitt 2013).

Overall, the analyses did not identify the hypothesised pattern of impact of the implementation characteristics on participant outcomes. Different explanations may account for the absence of the expected effects. One explanation is that there was very limited variation in dosage across participants within the RCT. As home visits were almost always delivered when participants could not attend, the power to detect differences due to dosage was very limited. There was also no pattern of effects of fidelity on the outcomes in the intervention arm, which may likewise be due to low statistical power (Cantu et al. 2010). As intervention implementation within studies is usually of high quality and has limited variation, statistical power to detect the effects of fidelity, dose and engagement is often limited (Breitenstein et al. 2010; Nix et al. 2009).

After adjustment for multiple testing, there was one significant effect of treatment quality, in the direction opposite to that expected: adolescents in clusters with higher quality of implementation reported an increase in maltreatment at follow-up, compared with adolescents in clusters with lower quality of implementation. One can only speculate about the explanations for this finding. One possibility, given that there was no evidence of harm from the intervention in the quantitative and qualitative RCT data, is that the increased reporting of maltreatment by adolescents reflected their increased confidence and willingness to disclose their experiences at home. Indeed, as discussed above, average quality of implementation and adolescent engagement in a cluster were strongly correlated. As highlighted in relation to intimate partner violence against women, violence is often under-reported, and “willingness to disclose often improves with increased awareness about the definitions and extent of such abuse” (Kim et al. 2007: 1799). Similarly, in a recent study of the Good Schools Toolkit, a whole-school violence-reduction intervention in Uganda, researchers found an increase in girls’ reported experience of sexual violence as a result of the intervention, presumably because the intervention increased girls’ sense of safety in reporting violence (Devries et al. 2017). Although the intervention described in our study did not explicitly address definitions of child maltreatment or children’s rights, it is possible that the intervention increased awareness of maltreatment by cultivating adolescents’ experience with non-violent problem-solving and communication. For example, one of the RA observers in a group session noted that the adolescents present “said they learned to tell their parents if they don’t like the way they [the caregivers] were treating them.”

Alternatively, higher fidelity in some clusters may have meant less flexibility to meet the specific needs of the participants; perhaps the manual was followed more closely at the expense of addressing unexpected issues. Indeed, modifying interventions can be useful to make them more suitable for particular circumstances (Mowbray et al. 2003), and in practice, adaptations always occur (Durlak 2015). Future parenting intervention research will therefore benefit from planning how to systematically record and assess adaptations in order to evaluate their impact (Dusenbury et al. 2005). Furthermore, in complex interventions, it may not always be clear to facilitators how each activity or intervention process is expected to lead to a change in behavior. Clarifying why various activities are included in the intervention manual, and how they can be modified while preserving the core principles, enables facilitators to adapt the intervention when necessary without undermining the ideas behind the program (Hill and Owens 2013).

The current evaluation also included a qualitative component exploring issues such as acceptability and mechanisms of change (Doubt et al. 2017). Intervention participants gave primarily positive feedback, with most reporting improvements in their households, such as perceived reductions in violent discipline among caregivers and in aggressive behaviors among children. Although high participant satisfaction was reported in interviews, attendance rates were lower than in some previous studies (e.g., Annan et al. 2017). Other work (Shenderovich et al. 2018) explores reasons for missed group sessions in Sinovuyo Teen. Overall, the study did not show evidence that the level of stressors, such as poverty, was related to attendance. There were several statistically significant predictors of attendance: for example, controlling for other variables, fewer sessions were attended in peri-urban clusters than in rural areas, and by employed caregivers and male caregivers. Participants also reported other reasons for missed sessions, such as illness and community events. The offer of home visits following missed sessions may also have reduced the incentive to attend group meetings.

It is important that replications be conducted by non-developers, and that research take place in circumstances typical of service settings (Forgatch and DeGarmo 2011). On the one hand, implementation tends to be closely supervised in a trial setting compared with routine services; for example, in routine services interventions often experience program drift, meaning that implementation fidelity falls over time (Mowbray et al. 2003). On the other hand, interventions within trials may be delivered without the infrastructure available in routine services. For instance, in this trial, nearly all facilitators delivered the intervention for the first time during the RCT, immediately after the initial training and without any previous experience of working together.

There are also many important organisational factors in implementing any intervention that should be examined in future research on parenting interventions in LMICs. Factors such as facilitator recruitment, compensation, incentives, and work conditions are likely to impact implementation (Fixsen et al. 2013). One model to promote fidelity at a larger scale that is used in parenting interventions is the certification of facilitators following training (Forgatch and DeGarmo 2011). Supervision and coaching of facilitators are likely to improve fidelity as well as help address facilitator burn-out and turnover (Sheidow et al. 2007). Various organisational models of delivering parenting programs can also be compared in experiments (Ashraf et al. 2014).

There is some evidence suggesting that monitoring implementation can boost its quality; hence, implementation quality may be lower if it is not monitored (Lachman et al. 2016). For instance, a review of 59 mentoring studies found that programs that monitored implementation had average effect sizes three times higher than programs without monitoring (DuBois et al. 2002). At the same time, this association might not be causal; it is also possible that projects that place an emphasis on quality have both higher overall quality of implementation and monitoring measures in place. Furthermore, as the Medical Research Council guidelines for process evaluations highlight, a key question for process evaluations is whether to communicate emerging findings while the intervention is being implemented (Moore et al. 2015). In this study, as the intervention team was delivering the program for the first time, several emerging issues were communicated to the implementation team. Future process evaluations conducted in effectiveness settings with established providers may be better placed to feed back all findings only at the end of the intervention (Hickey et al. 2016).

Only some parenting intervention studies report overall fidelity rates (Gardner et al. 2015), and most published information on the implementation of parenting interventions comes from experimental evaluations rather than routine delivery. However, as parenting programs are already widely implemented, future research would benefit from also drawing on outcome monitoring in ordinary service delivery (Hurst et al. 2014), as this can greatly enlarge the pool of available data for examining the relationships between implementation and participant outcomes.

Limitations

Our current analyses focus on quantitative implementation indicators, while providing context based on field notes from session observations; qualitative research findings are reported separately. As mentioned in the program description, participants who were unable to attend a session received a brief home visit from the facilitators with a summary of the week’s content. Close to a third of the sessions were delivered through these individual visits following missed sessions, but it was not feasible to observe home visits systematically, as observations in the home were considered too intrusive. It was also not possible to collect data on home practice completion, a program feature that parenting interventions rely on to reinforce learning between sessions (Berkel et al. 2018; Ros et al. 2016). Furthermore, due to the limited number of intervention clusters (20), statistical power was limited, especially for fidelity as a cluster-level predictor, and it was not possible to explore predictors of fidelity. Future studies with larger samples would make it possible to test further hypotheses about mechanisms, such as whether the effect of fidelity is mediated by participant engagement (Berkel et al. 2011).

Due to the relative brevity of the implementation measurement tool, our study did not examine facilitator fidelity and quality as two separate predictors, although they may have different effects on participant outcomes (Breitenstein et al. 2010; Humphrey et al. 2018). Future studies in LMICs could incorporate a more extensive subscale measuring facilitator competence. Another promising area is understanding the therapeutic alliance, that is, the interpersonal processes between facilitator and participant, in the context of interventions delivered by lay staff in different cultural settings (e.g., Elvins and Green 2008; Schmidt et al. 2014).

This study also has some important strengths. It adds to the literature on the impact of implementation characteristics on outcomes in parenting interventions. Fidelity and engagement were observed by independent researchers rather than the implementers themselves, providing a higher level of objectivity. In addition, data were collected from almost all intervention sessions rather than a limited sample. The fidelity, attendance and engagement levels within an RCT may not be representative of replications in other settings by various agencies; however, they offer helpful benchmarks for comparison and planning within the growing body of research examining parenting support in LMICs. Rigorously measuring implementation is a key step towards establishing evidence-informed parenting programs in LMICs.