An Evaluation of Parent Training Interventions in Scotland: The Psychology of Parenting Project (PoPP)

Early-onset behavioural difficulties that persist into the pre-school years can make young children vulnerable to poor long-term outcomes, including the development of conduct disorders, which are linked to significantly higher societal costs. Several parenting interventions have been shown to reduce behavioural difficulties in children, and this evaluation presents outcomes from the Psychology of Parenting Project (PoPP), a national implementation programme delivered in early years services in Scotland. This evaluation of service implementation reports on a large cohort of children (2204, age: 2–5 years) whose parents/caregivers participated in PoPP group-based parenting interventions. We explored change in parent-reported Strengths and Difficulties Questionnaire (SDQ) scores following either the Incredible Years Pre-school Basic or the Level 4 Group Triple P interventions. Latent profile analysis (LPA) was used to identify statistically distinct sub-groups of children based on SDQ subscale scores (emotional, conduct, hyperactivity, peer problems and prosocial). Pre- and post-intervention SDQ scores were available for 58% of children. Large intervention effects were reported, and analyses showed that 60% of “at-risk” children were no longer scoring in the at-risk range post-intervention. LPA identified four statistically distinct profiles of children. Children from “low” and “moderate” behavioural problem profiles benefited more from Triple P, whereas “severe” and “hyperactivity-focused” problem profiles displayed better outcomes following Incredible Years. When delivered through a robust implementation scheme, these parenting interventions can be effective in routine service settings and produce clinically important improvements. These findings, and the identification of distinct profiles of children who may respond differentially to interventions, could guide the planning of future dissemination schemes.
We examine the Psychology of Parenting Project (PoPP), a national roll-out of evidence-based parenting interventions in Scotland. We calculated intervention effects using data from a sample of 2,264 children enrolled in PoPP interventions in the evaluation period. 60% of children scoring in the high-risk range of the Strengths and Difficulties Questionnaire pre-assessment were no longer scored in the clinical range post-intervention. Overall, we observed similar outcomes when comparing the Incredible Years and Triple P intervention groups. These results suggest that PoPP interventions may offer considerable long-term savings when considering potential long-term costs associated with conduct disorders.

Persistent behavioural difficulties (which can include oppositional behaviours, attention difficulties, hyperactivity and conduct behaviours) in young children are associated with a range of negative life outcomes including school failure, unemployment, substance misuse and mental health problems, and can lead to the onset of the more serious conduct disorder and subsequent poor adjustment in adulthood (Fergusson et al. 2005; Knapp et al. 2011; Patterson et al. 2000). Conduct disorders are characterised by repetitive aggressive, antisocial or problem behaviours that contribute to persistent violations of social and family expectations (National Institute for Health and Care Excellence 2013). Health, social care, education and criminal justice costs are estimated to be ten times higher for children with conduct disorder compared to those without (Parsonage et al. 2014; Sainsbury Centre for Mental Health 2009). The "Growing up in Scotland" study (Bradshaw and Tipping 2010) surveyed the level of social, emotional and behavioural difficulties on entry to primary school and found that between 10 and 27% of children in Scotland displayed difficulties above the normal range, and between 5 and 12% of children were considered at "high-risk" for poor outcomes in the future. Parenting interventions aim to improve parenting skills, and address parenting practices that may be contributing to behavioural difficulties (Scott 2008). There is strong evidence from systematic reviews for their effectiveness in reducing reported behavioural difficulties in young children (Epstein et al. 2015; Sanders et al. 2014), especially those aiming to increase parent-child interactions and emotional communication skills (Wyatt Kaminski et al. 2008).
Multiple randomised controlled trials of parenting interventions compared to waitlist controls or minimal parenting interventions have found moderate effect sizes (ranging from 0.3 to 0.52) in favour of parenting interventions on measures of a range of behavioural difficulties (Furlong et al. 2012; Hutchings et al. 2007; Scott et al. 2010). This approach represents an opportunity for early intervention, and guidelines for the prevention and management of conduct disorder recommend the use of group-based parenting interventions as an effective method of managing children's behavioural difficulties (National Institute for Health and Care Excellence 2013).
Treatment gains achieved through the delivery of these interventions have been shown to persist for several years (Drugli et al. 2010). This has further contributed to the cost-effectiveness of these interventions (Edwards et al. 2016) and to the widespread dissemination of parent training interventions in England, for example through the Children and Young People's Increasing Access to Psychological Therapies (CYP-IAPT) initiative and the Parenting Early Intervention Programme (PEIP; Lindsay and Strand 2013), where moderate pre-post intervention effects were observed in reducing behavioural difficulties. NHS Education for Scotland (NES) initiated the Psychology of Parenting Project (PoPP) to increase the availability of evidence-based parent training interventions for Scottish families. In its first 3 years, the primary aim of PoPP was to improve outcomes for 3- and 4-year-old children displaying elevated behavioural difficulties (but not more serious conduct disorder). It sought to do this by equipping the existing multi-agency child and family care workforce with the skills to deliver evidence-based parenting interventions, following an identified gap between the number of practitioners who had received core training in evidence-based parenting interventions and the number of groups actually delivered. In response, the PoPP implementation framework drew upon research and practice within the field of implementation science, and was structured around the three principal implementation drivers proposed by Fixsen et al. (2005): staff competency, organisational supports and leadership. This plan recognised that, in addition to building a confident and competent workforce to deliver the interventions, attention must also be given to creating organisational systems and supports that allow staff to deliver the programmes with fidelity, supported by leadership at all levels of the system that is able to be both technical and adaptive.
The dissemination plan outlined the educational infrastructure required to complement a re-alignment of some of the considerable resources currently devoted to parenting work across the country, in order to address the current paucity of evidence-based parenting interventions being delivered to families in Scotland.
Between August 2013 and October 2016, the PoPP model was adopted by 14 Community Planning Partnerships (CPPs) in Scotland, and over 400 multi-agency early years practitioners completed the 3-day authorised training associated with PoPP-delivered parenting interventions, as well as an enhanced set of training activities developed by the PoPP team. These included training sessions on the PoPP implementation, strength-based communication skills for working with families and the distinct supervision models that were utilised in each of the parenting interventions.
The prevention protocol of the Incredible Years Pre-School intervention (Webster-Stratton 1998) and the Level 4 Group Triple P-Positive Parenting intervention (Sanders 2012) were selected, based on their substantial evidence base (Furlong et al. 2012; Sanders et al. 2014; Scott et al. 2010), to be delivered by practitioners already in the early years workforce but with newly-dedicated time for this work. A number of potential child moderators have been explored within studies of each intervention. Incredible Years has been found to be effective irrespective of gender, age, and co-occurring attentional issues, as well as levels of anxiety and depression symptoms (Seabra-Santos et al. 2016; Webster-Stratton 2016). A major systematic review of Triple P interventions identified younger age and higher levels of initial severity as associated with greater improvements in child outcomes (Sanders et al. 2014).
The creation of a PoPP database has supported the collection of parent-reported outcomes following parenting interventions and provides an opportunity to evaluate this large-scale roll out of parenting interventions in everyday service settings. It also allows an exploration of potential differences in outcome between two interventions for different groups of children. Previous analysis (Bradshaw and Tipping 2010) has suggested that clusters of children exist based on behavioural symptoms. These authors identified five clusters of children attending primary school, specifically a group that displayed little or no emotional and behavioural issues (the largest group), a group with moderate hyperactivity issues only, a group with particularly high hyperactivity scores only, a group displaying average levels of scores across subscales and a group scoring relatively high across domains of conduct, emotional problems, hyperactivity and peer problems. However, clusters have not been explored in a population of children identified as at-risk, and outcomes have not been explored between these clusters, especially in relation to different interventions. The identification of sub-groups of children with differential outcomes following Incredible Years or Triple P may help to refine the provision of parenting interventions.
The aim of this evaluation of service implementation is to assess the effectiveness of the PoPP-supported interventions in reducing behavioural difficulties in pre-school children in Scotland. In addition, latent profile methods were used to explore the hypothesis that there were different sub-groups of children served by the project, and that these different profiles have differential response to PoPP delivered interventions. This analysis was considered exploratory in approach, but considering previous findings (Bradshaw and Tipping 2010) we hypothesised the existence of a sub-group with difficulties on a number of domains, a group with relatively lower difficulties as well as the potential for a group where issues around hyperactivity were the main concern.

Participants
The dataset used for this analysis comes from assessment information for n = 2264 children whose parents enrolled in PoPP-supported parenting interventions between August 2013 and October 2016. Families were referred to a local PoPP team, who would contact the family to discuss the group and establish their interest in taking part. Although it was recommended that children score over 17 on the Strengths and Difficulties Questionnaire (SDQ; Goodman 2001) (see "Measures" section for further detail) for inclusion, this criterion was not enforced, and the only other inclusion criterion was that children were around 3 to 4 years of age (but families of 2- and 5-year-olds could be included if it was considered appropriate). Data were collected prior to the groups starting, at which time parents consented to their use for analysis and evaluation of the PoPP. Pre-intervention assessment data were available for n = 2204 children, the included sample for this analysis.

Interventions
The Incredible Years Pre-school BASIC intervention (Webster-Stratton 1998) involves the delivery of fourteen face-to-face, 2-hour long weekly sessions to groups of up to twelve parents. Two practitioners lead discussions of video vignettes, problem-solving exercises and role play activities that address parents' personal goals. The intervention focuses on strengthening parent-child interactions, nurturing relationships with children, effective management of behaviour, and promoting children's social, emotional and language development. The Incredible Years manual can be purchased from the Incredible Years website (http://www.incredibleyears.com). Level-4 Triple P groups are delivered to a similar number of families with a total of eight sessions delivered, five of which are 2-hour long face-to-face sessions delivered by two practitioners, and three of which are 20-30 minute long sessions delivered via individual telephone consultation between one of the practitioners and the parent. During Level-4 Group Triple P (Sanders 2012), parents learn strategies for improving their relationship with their child, increasing the child's competencies and discouraging unwanted child behaviour. Learning is supported through role play activities, group discussions of video vignettes, and problem-solving exercises focused on the parents' personal goals. The Triple P manual is available to trained practitioners, with more information available at the Triple P website (www.triplep.net). Fidelity of individual groups was not assessed by the PoPP, but across both interventions fidelity was supported by the use of standardised training, manualised and available materials as well as post-training supervision, accreditation and practitioner networks. Children did not attend either intervention group.
With these considerations in mind, workforce capacity was created in all but one participating Community Planning Partnership (CPP) for two-thirds of local delivery to involve Triple P groups and one-third to involve Incredible Years groups. This split was made for both pragmatic and financial reasons as Triple P, with 8 sessions, could be delivered four times a year compared to the 14-week Incredible Years groups. However, it was found that the overall split was closer to one-third Triple P and two-thirds Incredible Years. One CPP made the strategic decision to invest in only one parenting intervention and therefore did not complete any PoPP parenting interventions in the study period. Although the PoPP model was designed so that there was a one-third Incredible Years and two-thirds Triple P split in the delivery of interventions to families, decisions relating to which one of the two interventions would be best suited to a particular family's needs were to be made by the local implementation team. Families were not randomised to interventions and instead practitioners made pragmatic decisions about the allocation of families to interventions depending on a range of factors including the availability of practitioners trained in a specific intervention. In some localities the different interventions were offered on different days and therefore the families were given a choice as to which intervention they attended.

Measures
We asked all parents to complete a paper copy of the SDQ at both initial assessment and at the final session of the group. A total SDQ score of 17 or more is considered to be in the "high-risk" range of behavioural difficulties. The SDQ has demonstrated good reliability, including for younger children (Mieloo et al. 2012; Stone et al. 2015), but individual items were not available in the current dataset to allow study-specific reliability to be calculated. The SDQ is also made up of five sub-scales measuring specific domains of behavioural difficulties which were available to the research team: emotional, conduct problems, hyperactivity, peer problems and prosocial subscales, although the prosocial subscale does not contribute to the total score. The PoPP database also includes basic demographic information on age and gender of children, as well as information about the PoPP delivered parenting interventions.

Analysis
This analysis included data collected during the PoPP up until October 2016. Post-intervention SDQ scores were not available for n = 935 (42%) of children with initial SDQ information. This was due either to facilitators not gathering and submitting this information or to families not attending the final intervention session. We split the analysis into two samples. A completer sample included all children whose parents attended the last group session and provided post-intervention SDQ scores. A "full-cohort" sample comprised all children with pre-intervention scores, whether or not a post-intervention score was available. Where post-intervention SDQ scores were missing in the full-cohort, we carried forward the pre-intervention SDQ score to provide a post-intervention score, making baseline and endpoint information the same for the n = 935 children with missing outcome data. As no sessional SDQ scores were available, carrying forward initial scores was the only option for this analysis.
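The two samples described above can be illustrated with a minimal sketch. It assumes each child is stored as a (pre, post) pair of total SDQ scores with `None` marking a missing post-intervention score; the function names and the example scores are hypothetical and are not taken from the PoPP database.

```python
from typing import List, Optional, Tuple

def carry_baseline_forward(pre: float, post: Optional[float]) -> float:
    """Return the post score, or the pre score when the post score is missing."""
    return pre if post is None else post

def mean_change(pairs: List[Tuple[float, float]]) -> float:
    """Mean reduction in total SDQ score (pre minus post)."""
    return sum(pre - post for pre, post in pairs) / len(pairs)

# Hypothetical (pre, post) total SDQ scores; None marks missing outcome data.
records = [(22, 14), (19, None), (25, 20), (17, None)]

# Completer sample: only children with a post-intervention score.
completers = [(pre, post) for pre, post in records if post is not None]

# Full-cohort sample: every child, with baseline carried forward where needed,
# so children with missing outcomes contribute a change of zero.
full_cohort = [(pre, carry_baseline_forward(pre, post)) for pre, post in records]

completer_change = mean_change(completers)
full_cohort_change = mean_change(full_cohort)
```

Because every imputed child contributes zero change, the full-cohort estimate is always pulled towards zero relative to the completer estimate, which is why it serves as the more conservative of the two.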
We considered three primary outcomes of interest in this evaluation, which were provided for the full sample as well as compared between the intervention types:
(1) The mean change between pre-post intervention SDQ scores.
(2) The number of children achieving reliable change in SDQ scores, defined as a change in 7 or more points on the SDQ (Law and Wolpert 2014).
(3) The number of children "moving out of clinical range", which is defined as being in the "high-risk" range pre-assessment on the SDQ (score ≥ 17) then scoring below 17 post-intervention.
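Outcomes (2) and (3) can be expressed as simple rules on a pair of total SDQ scores. The sketch below uses the two thresholds stated in the text (7 points and a score of 17); it treats reliable change as a drop of 7 or more points, and the function names and example scores are hypothetical.

```python
RELIABLE_CHANGE_POINTS = 7  # from the text: a change of 7 or more points
HIGH_RISK_CUTOFF = 17       # from the text: total SDQ >= 17 is "high-risk"

def reliable_improvement(pre: float, post: float) -> bool:
    """True when the total SDQ score dropped by at least the reliable-change threshold."""
    return (pre - post) >= RELIABLE_CHANGE_POINTS

def moved_out_of_clinical_range(pre: float, post: float) -> bool:
    """True when a child was high-risk at baseline and below the cutoff afterwards."""
    return pre >= HIGH_RISK_CUTOFF and post < HIGH_RISK_CUTOFF

# Hypothetical (pre, post) total SDQ scores.
pairs = [(22, 14), (19, 19), (25, 20), (15, 9), (18, 16)]

n_reliable = sum(reliable_improvement(pre, post) for pre, post in pairs)

# Only children who were high-risk at baseline can move out of clinical range,
# so they form the denominator for outcome (3).
high_risk = [(pre, post) for pre, post in pairs if pre >= HIGH_RISK_CUTOFF]
n_moved = sum(moved_out_of_clinical_range(pre, post) for pre, post in high_risk)
proportion_moved = n_moved / len(high_risk)
```

Note that the two outcomes can disagree: a child moving from 18 to 16 leaves the clinical range without reliable change, while a child moving from 15 to 8 would show reliable change without ever having been in the clinical range.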
A second level of enquiry involved the identification of statistically distinct sub-groups of children whose parents enrolled in PoPP-supported groups. We used latent profile analysis (LPA), an extension of latent class analysis (Hagenaars and McCutcheon 2002; Lazarsfield and Henry 1968) for this purpose. Although previous analysis of SDQ clustering (Bradshaw and Tipping 2010) used K-means clustering to identify sub-groups based on SDQ assessment scores, LPA methods have shown better performance and are preferred due to the inclusion of model fit statistics (Magidson and Vermunt 2002; Schreiber and Pekarik 2014).
To identify the best fitting model for the data, we compared the Vuong-Lo-Mendell-Rubin Likelihood Ratio test (VLMR-LRT; Lo et al. 2001) between models, and this was considered alongside the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and entropy values (Asparouhov and Muthén 2012; Geiser 2013). The VLMR-LRT compares the K model (current model, with K number of profiles) to a model with one less profile (K − 1 model), with a significant p value (p < 0.05) indicating better fit for the K model. A p value ≥ 0.05 would suggest the K − 1 model is a better fit, and the more parsimonious model would be preferred. Lower AIC and BIC values between models suggest better fit, and higher entropy values indicate higher classification accuracy for the model. Analysis was conducted in Mplus version 7. We then explored outcomes between the identified profiles, with analysis performed in STATA 14 (StataCorp 2015).
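The VLMR-LRT decision rule described above can be sketched as a small function over a table of fit statistics. This is only an illustration of the stopping rule, not the Mplus procedure itself. The `FitRow` structure and the numbers in `rows` are hypothetical, apart from the p value of 0.053 for the 5-profile solution, which is reported in the Results.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FitRow:
    k: int          # number of profiles in the model
    aic: float      # Akaike Information Criterion (lower suggests better fit)
    bic: float      # Bayesian Information Criterion (lower suggests better fit)
    entropy: float  # higher indicates higher classification accuracy
    vlmr_p: float   # VLMR-LRT p value for K versus K - 1 profiles

def select_by_vlmr(rows: List[FitRow], alpha: float = 0.05) -> int:
    """Walk up from the smallest K and stop at the first non-significant
    VLMR-LRT, returning the last K whose test was significant."""
    ordered = sorted(rows, key=lambda r: r.k)
    chosen = ordered[0].k - 1  # fall back to K - 1 if even the first test fails
    for row in ordered:
        if row.vlmr_p >= alpha:
            break
        chosen = row.k
    return chosen

# Hypothetical fit statistics (illustrative only, not the values in Table 4).
rows = [
    FitRow(k=2, aic=48000.0, bic=48090.0, entropy=0.80, vlmr_p=0.001),
    FitRow(k=3, aic=47600.0, bic=47725.0, entropy=0.78, vlmr_p=0.010),
    FitRow(k=4, aic=47400.0, bic=47560.0, entropy=0.77, vlmr_p=0.020),
    FitRow(k=5, aic=47300.0, bic=47490.0, entropy=0.75, vlmr_p=0.053),
    FitRow(k=6, aic=47250.0, bic=47475.0, entropy=0.74, vlmr_p=0.300),
]

best_k = select_by_vlmr(rows)
lowest_bic_k = min(rows, key=lambda r: r.bic).k
```

In this made-up table the information criteria keep falling as profiles are added, so the lowest BIC alone would favour the largest model, whereas the VLMR-LRT rule stops earlier. This is one reason several indices are usually weighed together rather than relying on any single one.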

Results
A total of 357 PoPP-supported groups were delivered in the evaluation time frame (218 Incredible Years groups and 139 Triple P groups). The average number of families in Incredible Years groups was 7.2 (standard deviation = 2.59) compared to 5.00 (SD = 1.87) in Triple P groups. The average number of sessions attended for Incredible Years groups was 11.37 (SD = 3.93) and 6.12 (SD = 2.27) for Triple P groups. We present the baseline demographics of children from the completer and full-cohort samples in Table 1. The completer sample consisted of n = 1269 children with post-intervention SDQ information available (57.58% of children with initial SDQ assessment). The families of most children without post-intervention data were reported to have left the intervention groups before the final session (80.86%). The proportion of families leaving the interventions before the final session, and therefore not providing final SDQ scores, was not statistically different between groups (Incredible Years = 36%; Triple P = 32%; odds ratio = 0.843, p = 0.082). Over 65% of the child sample were male, with a mean age of three and a half years. The average total SDQ score pre-intervention was 19.55 (SD = 6.06) for the full-cohort sample, and 19.2 (SD = 6.11) for the completer sample. Mean SDQ subscale scores ranged from 3.1 to 7.1.
A comparison of the descriptive statistics for the full-cohort (pre-intervention SDQ only) and completer samples (pre- and post-intervention scores) suggested that means and proportions were very similar. However, comparison statistics indicated a number of statistical differences between samples. Independent samples t-tests showed that the mean age of children (t = 2.681, p = 0.007) and total SDQ at assessment (t = 3.178, p = 0.002) were significantly higher in the full-cohort sample, as were scores on the emotional (t = 2.167, p = 0.03), conduct (t = 3.481, p = 0.001) and hyperactivity (t = 1.974, p = 0.049) sub-scales.
We also performed comparison statistics between children whose parents attended either Incredible Years or Triple P groups to explore where there were differences between frequencies or averages of demographic variables, within both the full-cohort and completer samples. This analysis is presented in Supplementary Table S1. The mean age of children was significantly higher in Triple P groups than Incredible Years, in both the full-cohort (3.52 vs 3.64, t = −2.990, p = 0.003) and completer samples (3.47 vs 3.61, t = −2.479, p = 0.013). We found no other significant differences in child characteristics between interventions across either sample (p > 0.05).
We present the mean change in SDQ scores between the start and end of the parenting interventions in Table 2. The first section presents change for the full-cohort sample for all children, and by intervention. Results suggest an average reduction of 3.45 points on the SDQ following interventions, with a slightly greater effect found in favour of Incredible Years groups. When only intervention completers were included, mean change was 5.99 points on the SDQ, with large effect sizes indicated for both Incredible Years (d = 0.941) and Triple P (d = 0.915) groups.
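For readers unfamiliar with standardised effect sizes, the sketch below shows one common way to compute Cohen's d for pre-post change. The text does not state which variant was used, so standardising by the pooled standard deviation of the pre- and post-intervention scores is an assumption here, and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence

def cohens_d_pre_post(pre: Sequence[float], post: Sequence[float]) -> float:
    """Mean reduction divided by the pooled SD of pre and post scores.

    Assumption: equal-weight pooling of the two sample variances. Other
    variants standardise by the baseline SD or by the SD of change scores.
    """
    pooled_sd = sqrt((stdev(pre) ** 2 + stdev(post) ** 2) / 2)
    return (mean(pre) - mean(post)) / pooled_sd

# Hypothetical total SDQ scores for five completers.
pre_scores = [22, 19, 25, 17, 21]
post_scores = [14, 18, 20, 12, 16]

d = cohens_d_pre_post(pre_scores, post_scores)
```

By the usual convention, values of d around 0.8 or above are described as large, which is the sense in which the completer effects reported above are called large.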
The number of children from both the full-cohort and completer samples who displayed reliable change in SDQ scores (drop of 7 or more points), or who moved out of clinical range following intervention, is presented in Table 3. It was found that 44% of children in the completer sample, and 25% of the full-cohort sample (where pre-intervention SDQ scores were carried forward for children with missing post-intervention data), achieved reliable change. A total of n = 1582 children from the full-cohort and n = 839 children in the completer sample scored in the high-risk range of the SDQ at pre-intervention assessment and therefore could be included in the analysis of movement out of clinical range. We observed that moving out of clinical range was reported for approaching 60% of completers, and 32% of the full-cohort who were previously in the "high-risk" range. The proportion of positive outcomes was slightly higher following Incredible Years groups.
LPA was then performed on the SDQ sub-scale dataset to identify statistically distinct profiles of children served through the PoPP scheme. Of the initial n = 2204 children, n = 90 had no SDQ subscale scores available, and therefore the included sample consisted of n = 2114 children with pre-intervention SDQ subscale scores. Model fit statistics are displayed in Table 4. The VLMR-LRT indicated statistically significant p values for the 2-profile, 3-profile and 4-profile solutions, before the 5-profile solution resulted in a p value > 0.05, suggesting that the 4-profile solution was the best fit for the data. The AIC and BIC values decreased from the 2- to 6-profile solutions, with the rate decelerating. The VLMR-LRT p value for the 5-profile solution was only just non-significant (p = 0.053) and therefore the 4- and 5-profile solutions were compared.
The fifth profile (in the 5-profile solution) was found to have average subscale scores in between two other profiles, providing little additional clinical information, and therefore the 4-profile solution was considered optimal given the non-significant p value for the VLMR-LRT of the 5-profile solution (Nylund et al. 2007). Included children were allocated to the latent profile (LP) to which they had the highest probability of membership.
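Allocation to the most likely profile can be sketched as follows, assuming each child has a list of posterior membership probabilities, one per profile, as estimated by the fitted model. The child identifiers and probabilities are hypothetical.

```python
from typing import Dict, List

def assign_modal_profile(posteriors: List[float]) -> int:
    """Return the 1-based index of the profile with the highest posterior probability."""
    best_index = max(range(len(posteriors)), key=lambda i: posteriors[i])
    return best_index + 1

# Hypothetical posterior probabilities for LP1..LP4 (each row sums to 1).
children: Dict[str, List[float]] = {
    "child_a": [0.05, 0.10, 0.70, 0.15],
    "child_b": [0.60, 0.30, 0.05, 0.05],
    "child_c": [0.02, 0.08, 0.30, 0.60],
}

allocation = {child: assign_modal_profile(p) for child, p in children.items()}
```

This kind of modal assignment discards the uncertainty in membership: a child with probabilities of 0.51 and 0.49 for two profiles is treated the same as one with 0.99 and 0.01, which is why entropy is reported alongside the solution.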
The distribution of children across LPs and mean subscale scores for each LP are displayed in Supplementary Table S2. The four profiles are described below and are also presented graphically in Supplementary Fig. S3.
(1) LP1 (low problem behaviours): children in this profile score relatively low on all subscales of the SDQ and could be considered a group of children with low problem behaviours.
(2) LP2 (moderate problem behaviours): members of LP2 show moderate levels of problem behaviour. Making up 12.7% of children, this profile includes the smallest number of participants.
(3) LP3 (hyperactivity problems): LP3 is distinguished by its considerably higher mean hyperactivity subscale scores when compared to all other subscales. The mean subscale scores on all other domains are similar to those of LP2 (moderate), and therefore this profile is considered a hyperactivity problem specific subgroup.
(4) LP4 (severe problem behaviours): children in this profile display the most severe problem behaviours compared to the other profiles. All mean subscale scores except emotional symptoms are in the high-risk range, but hyperactivity and conduct are particularly severe.
In the next stage, behavioural outcomes were explored between the profiles, as well as in response to the two different intervention groups, using only children with post-intervention data available (the completer sample). Table 5 presents the proportion of children from each profile who were no longer scoring in the high-risk range following intervention. The table indicates that the number of children who were scoring high-risk initially was limited for the low and moderate profiles, which is due to the lower total SDQ scores for these profiles of children.
Overall the percentage of children moving out of the clinical range was more than 70% for the low, moderate and hyperactivity profiles, whilst it was 43% for children from the severe profile of behavioural difficulties. Comparing outcomes between Incredible Years and Triple P groups suggests that for the low and moderate profiles, Triple P groups show more positive outcomes, whereas slightly more benefit was indicated for hyperactivity and severe profile children following Incredible Years groups, although differences were not statistically significant (p > 0.05). It should be noted, however, that the low numbers of children included in this analysis for the low and moderate profile groups means that results should be interpreted with caution.
We then explored the mean change in SDQ total between profiles, and present results in the left-hand columns of Table 6. The findings suggest that effect sizes were highest in the severe profile, followed by the hyperactivity profile, then the moderate and low profiles, which was consistent across interventions. These larger effect sizes in the profiles with higher severity at baseline may be in part due to regression to the mean, as these children would have more available change in scale scores than children with lower scores. Effect sizes for Incredible Years groups were slightly higher than for Triple P groups, although differences in mean change scores were not statistically significant (p > 0.05). The smaller effect size for the low behavioural difficulties profile is likely due to the lower initial assessment symptom scores resulting in less available change in symptoms.
As the conduct and hyperactivity sub-scales were the highest scoring at initial assessment, and contributed to the distribution of the profiles, change in these sub-scales was also explored in this analysis and presented in the right-hand columns of Table 6. The results suggest that Triple P interventions produced more conduct subscale change for the moderate profile, whereas effect sizes are larger for the severe profile when Incredible Years was delivered. The hyperactivity subscale change showed a larger effect size for the low profile when Triple P interventions were received, but more change was indicated for the hyperactivity profile following Incredible Years interventions on this SDQ subscale. Although differences were observed, these were not statistically significant.

Discussion
The results of this evaluation suggest that the evidence-based parenting interventions delivered as part of the PoPP implementation scheme in everyday service settings have had a positive effect in reducing behavioural difficulties in children in Scotland, with effect sizes comparable to controlled trials. We found that nearly 60% of children with post-intervention assessment information who were initially scoring "high-risk" for problem behaviours were no longer high-risk following intervention. A conservative analysis performed on the full-cohort sample, whereby children who did not provide post-intervention data were assumed to still be high-risk, resulted in a drop from 60% of completers to 32% of the full-cohort sample moving out of clinical range (33% Incredible Years, 29% Triple P). Considering the potential long-term costs associated with conduct disorders, the effectiveness of the Incredible Years and Triple P groups implemented within the PoPP framework suggests the potential for considerable long-term savings. The effect sizes for pre-post change in total SDQ scores ranged from 0.915 to 0.941 for intervention completers and 0.496 to 0.527 for the full-cohort. These are equivalent to, if not larger than, effect sizes indicated in other research-oriented implementation evaluations and controlled trials of parenting groups (Lindsay and Strand 2013; Sanders et al. 2014; Scott et al. 2010). This may, in part, be due to the younger age of children included in the PoPP, as this has been associated with better outcomes (Sanders et al. 2014). Equally, this strong performance may be associated with the robustness of the PoPP implementation scheme, the aim of which was to support practitioners to provide interventions that were consistently delivered to a high standard, even in routine settings. Anecdotally, parents spoke positively about changes in their child's behaviour, their relationship with their child, and their confidence and skills as parents.
PoPP practitioners reported seeing tremendous changes in families, but also in their own skills and practice. Feedback evaluations on PoPP delivered groups were provided by 588 families, with 99.66% reporting they would recommend the groups received to a friend, 96.77% agreeing that the groups improved their relationship with their child and 97.69% reporting their family life has benefitted. In response to the question "how have you personally benefitted from the groups?", quotes included: "My relationship with my daughter has become more happier & a stronger bond" "I feel more calm & in control, the general household is much calmer & happier" "more confidence in myself & my child".
Comparison of the parenting intervention types showed that overall outcomes were similar, with a slight advantage of the Incredible Years interventions compared to Triple P. It is not possible to tell from this dataset what factors may be contributing to the reported differences. They could, for example, be due to the Incredible Years intervention being provided over 14 weeks compared to the 8-week Triple P groups, to programme-specific factors or to local implementation processes.
We found that 35% of children whose parents attended at least the first group session did not have a final SDQ score recorded, with little difference between Incredible Years and Triple P interventions, despite their different attendance demands. This suggests a higher non-completion rate for PoPP-supported groups than that reported in controlled trials of parenting interventions, which have indicated between 18 and 25% dropout (Scott et al. 2010). It is, however, still considerably lower than non-completion rates reported in other national evaluations of parenting interventions such as the PEIP (46%; Lindsay and Strand 2013). The higher non-completion in PoPP-supported groups compared to controlled trials is likely due to the more naturalistic settings in which the PoPP-supported groups were delivered. It may also be confounded by a lack of data to clarify the proportion of parents who did attend the final group session but did not provide a final SDQ score for their child, as opposed to those who had stopped attending the group. Nevertheless, as the reduction in outcomes between the completers and full-cohort samples was significant, methods of reducing the proportion of families leaving interventions early may further improve PoPP outcomes.
Using SDQ subscale scores, LPA identified four statistically-distinct profiles of children whose parents attended PoPP-supported groups (low, moderate, severe and hyperactivity problems profiles). The profiles have some broad similarities to profiles of SDQ subscales identified in previous analyses (Bradshaw and Tipping 2010), in that the current study also identified a low, moderate and high group, as well as the existence of a group with noticeably higher hyperactivity subscale scores. The main difference was that a fifth profile, identified in previous work as a sub-group with above-average hyperactivity that might be considered boisterous without having specific difficulties, was not identified in the current analysis. This may be due to the previous research surveying all primary school children rather than a group identified as having at-risk behavioural difficulties, as in the current analysis. This may also explain why the low severity group was the largest group in the previous study but not in the current study, and why the mean subscale scores of the "low" group in the current analysis were much higher than those in the previously identified cluster.
We observed differential outcomes between these profiles, and high-risk members of the severe profile group were less likely to move out of the clinical range (43%) compared to the three other profiles, where likelihood was over 70%. It should be noted that these differences were not statistically significant, indicating further research into differential outcomes between profiles is warranted. Large pre-post intervention effect sizes were observed for all profiles except the low problem profile. However, this may be accounted for by the lower initial SDQ score for this profile resulting in limited available change in measured behaviours. Some intervention-specific differential outcomes were also observed for children based on their different SDQ problem profiles. Local services could use these findings to support the implementation and sustainability of this form of effective early intervention; to take decisions about how they apportion their delivery capacity for each intervention in the future; and to guide a more tailored delivery of these parenting interventions. As only about half of the children with elevated levels of behavioural difficulty at this age would be expected to develop conduct disorder, efficient use of resources demands as accurate a prediction as possible of those children who are most at risk and who are most likely to benefit.
This naturalistic evaluation of parent training interventions had a number of limitations which need to be considered in relation to the findings discussed in this study. Firstly, no measure of staff adherence in the parenting groups was included and it is possible that there were local differences in delivery across PoPP sites. However, all PoPP practitioners received the same levels of training, and each site benefited from equivalent and standardised implementation support measures. This included additional clinical skills consultation days, supervision structures, as well as the implementation support given to each CPP by the PoPP implementation team (via regular email and telephone consultation and review meetings). A second limitation is that only pre- and post-intervention SDQ data were available, with no sessional data collection. Collecting assessment information at every session, or even midway through, would have allowed further exploration of change in scores over the course of the interventions. Instead, the conservative analysis performed using the full-cohort sample presumes that all children without post-intervention SDQ scores did not show any benefit. Only SDQ subscale scores were available, meaning the reliability of the scale could not be assessed in the sample. Lastly, the number of baseline characteristics of children and families was limited and collecting additional information would provide the opportunity to further explore characteristics of children who were more likely to benefit from one intervention as opposed to the other.
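The conservative full-cohort assumption described above can be illustrated with a short sketch. It is not the analysis code used in this evaluation: the function name and the counts are hypothetical and chosen only to show how treating children without post-intervention scores as still high-risk lowers the proportion moving out of the clinical range.

```python
def moved_out_rates(n_high_risk_baseline, n_high_risk_completers, n_moved_out):
    """Return (completer_rate, conservative_rate) for high-risk children.

    completer_rate: proportion of high-risk children WITH post-intervention
        scores who moved out of the clinical range.
    conservative_rate: the same count divided by ALL high-risk children at
        baseline, i.e. children with missing post-intervention scores are
        assumed to have remained high-risk.
    """
    if n_high_risk_completers <= 0:
        raise ValueError("need at least one completer")
    if not 0 <= n_moved_out <= n_high_risk_completers <= n_high_risk_baseline:
        raise ValueError("counts are inconsistent")
    completer_rate = n_moved_out / n_high_risk_completers
    conservative_rate = n_moved_out / n_high_risk_baseline
    return completer_rate, conservative_rate


# Hypothetical counts (NOT taken from the PoPP dataset): 100 children were
# high-risk at baseline, 50 of them had post-intervention scores, and 30 of
# those 50 were no longer high-risk.
completer_rate, conservative_rate = moved_out_rates(100, 50, 30)
print(f"completers: {completer_rate:.0%}, full cohort: {conservative_rate:.0%}")
```

Because the numerator is unchanged while the denominator grows to include every child without follow-up data, the conservative rate can never exceed the completer rate, and it acts as a lower bound under the stated assumption.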

Conclusion
Overall, the parent training interventions delivered through the PoPP implementation scheme were associated with significant reductions in behavioural difficulties in young children in various localities in Scotland. The effects reported are equivalent to, and at times larger than, those reported by randomised controlled trials of parenting interventions. The identification of profiles of children based on SDQ subscale scores could also be used to inform future service delivery decisions. The naturalistic settings of this evaluation have associated limitations, such as a lack of ability to assess practitioner fidelity, but provide support for the use of parent training programmes for reducing behavioural difficulties in at-risk pre-school children. Further research is needed to explore the utility of the profiles identified in supporting intervention allocation.
Funding The Psychology of Parenting Project is funded through the Scottish Government Mental Health Division. S.P. is supported by funding from the UCLH National Institute for Health Research (NIHR) Biomedical Research Centre.

Compliance with Ethical Standards
Conflict of Interest B.R. is a Mentor within the Incredible Years and delivers training in the Incredible Years intervention. All other authors declare that they have no conflicts of interest.
Ethical Approval Ethical standards: as the present study is a service evaluation, the NHS Health Research Authority does not classify this study as research, and as such ethical approval was not required.
Informed Consent Informed consent was gained from all parents for the collection of their data to be used for analysis to determine if the groups were effective for families and why they were effective.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.