Baby Triple P: A Randomized Controlled Trial Testing the Efficacy in First-Time Parent Couples

In a randomized controlled trial, we tested the efficacy of Baby Triple P in a community sample of first-time parent couples. The intervention was developed to promote better mental health, a positive couple relationship, positive parenting, and a better parent-infant relationship. One hundred and fifty-six couples were randomly allocated to intervention (n = 78) or care as usual (n = 78) conditions. The intervention was delivered in four antenatal face-to-face group sessions followed by four early postnatal individual telephone sessions. Couples completed self-report assessments at baseline, immediately postintervention and at 12 and 24 months. The study had one primary (the Depression, Anxiety and Stress Scale) and 11 secondary outcomes. Over half of the intervention and care as usual participants remained in the study for the full 24 months. Intention-to-treat analysis of the full sample yielded positive results in some mental health domains for mothers and fathers, but this was not evident when follow-up sensitivity analysis was conducted on a subsample of the data. There was limited support for the intervention in relation to secondary outcomes such as the couple relationship, social support and parenting. However, the parent couples were positive about the intervention and described it as providing the support that they wanted. This trial provides some evidence in support of Baby Triple P as an early intervention for new parent couples. High levels of satisfaction with the intervention are promising, especially in relation to the engagement of fathers.
Trial Registration: ISRCTN31955576
• This randomized controlled trial provides some evidence that the mental health of new parents is amenable to intervention.
• This randomized controlled trial provides some evidence in support of Baby Triple P as an early intervention.
• Evidence from this study demonstrates that engagement and retention of fathers is possible and fathers value involvement.
• Initial engagement is a critical point in the delivery of early interventions; couples who are engaged are more likely to complete the intervention.
Having a first baby is a life course event experienced by the majority of adults. While normative, and in many cases planned, the transition to parenthood is a time requiring significant adjustment, with the potential for disruption to psychosocial wellbeing in both the short- and long-term. There is evidence that having a baby is associated with declines in mental health (Figueiredo & Conde, 2011), and both mothers and fathers are affected (Parfitt & Ayers, 2014). For example, in a sample of UK mothers, the prevalence of elevated symptoms of depression at 2 months postpartum was 14.4%, and 11.6% for anxiety (Bell et al., 2016), and the prevalence of broader distress that might not reach clinical thresholds (i.e., depression and/or anxiety symptomology) could be as high as 29% (Miller et al., 2006). Meta-analytic estimates show paternal postpartum depression prevalence to be as high as 8% 1 year postpartum, with higher rates immediately post-birth (Cameron et al., 2016).
Of course, not every new parent reports declines in mental health in the perinatal period (the weeks immediately before and after birth). Subgroup analyses show that, even in low risk populations, some parents may be more vulnerable than others, and mental health outcomes may be mediated by other psychological and sociodemographic variables such as relationship status (Don et al., 2014; McKenzie & Carter, 2013). In addition, there is evidence that the impact of a first birth may be different to subsequent births (Ruppanner et al., 2019). Indeed, the complexity of the associations between parent mental health and other variables is such that risk pathways have yet to be modeled in a way that would allow targeted early intervention, except in the case of identified high risk groups (e.g., adolescent parents). Moreover, it is questionable whether it is appropriate to wait for the emergence of early mental health risk indicators before intervening when a preventative approach could be adopted. This is especially important when we know that anxiety and depressive symptomology during pregnancy are predictors of parenting-related stress, anxiety and depression after the birth of a child (Huizink et al., 2017), and there is an established risk pathway between parental mental health, the parent-child relationship and future child outcomes across a range of psycho-developmental variables (Behrendt et al., 2016; Vänskä et al., 2017).
In addition to the impact on individual parent wellbeing, the impending birth of a first child initiates a reimagination of the family unit, and for couples it necessitates change in their relationship. For many couples this begins prior to pregnancy, and in reflective accounts couples talk about a phase when closeness and intimacy are enhanced and their relationship is prioritized (Schwerdtfeger et al., 2013). Indeed, first-time parents describe the pregnancy as a shared experience that promotes a growing sense of commitment to their relationship (Schwerdtfeger et al., 2013). However, assessments of relationship functioning following the birth of a first baby show that in many cases relationship satisfaction begins to decline, communication becomes more problematic (Doss et al., 2009) and couple activities become child-focused and instrumental (MacDermid et al., 1990; Petch & Halford, 2008). As well as being important in the context of the couple relationship in itself, perceptions of reduced partner support have been highlighted as a risk factor for declines in mental health (Biaggi et al., 2016).
Interpersonal change is not confined to the couple. New parents will experience shifts in their relationship with significant others (including partners, friends and family), and the impact this has on the quantity and quality of social support they receive can have wider reaching consequences. The World Health Organization (2018) has cautioned that the lack of both peer and practical support in the perinatal period increases the risk of maternal mental health problems and poor infant outcomes. Social support offered by family and friends has been associated with reduced postnatal depression symptomology at 6 weeks postpartum in first-time mothers (Leahy-Warren et al., 2012). Moreover, wider reaching and high quality parental social support networks are also associated with better mental health outcomes for children (McPherson et al., 2014).
Like other life course change, the transition to parenthood is a journey rather than an event, and a wide range of factors can affect the ways in which parents manage and adapt to their circumstances (Kralik et al., 2006). Examined through the lens of the vulnerability-stress-adaptation model (Karney et al., 1995), it is the complexity of individual circumstances that accounts for the differential outcomes for parents noted above. Individual vulnerabilities, different experiences in relation to the qualitative and quantitative nature of stressful events (e.g., the birth of a child, sleeplessness), and different adaptive capabilities and capacities (e.g., good communication) interact to produce different outcomes (Doss et al., 2009). Thus, while treatment may be necessary in some cases, a better approach is a preventative one designed to reduce known risk factors and strengthen protective factors so parents are equipped with strategies that will help them adapt in both the short- and longer-term, irrespective of their individual trajectory.
Linked to this, first-time parents often report feeling unprepared for impending parenthood and seek out information to equip themselves (Barimani et al., 2017; Wilkins, 2006). In the UK, this can include attending prenatal (antenatal) classes provided as standard maternity care (see https://www.nhs.uk/pregnancy/labour-and-birth/preparing-for-the-birth/antenatal-classes). However, uptake of classes is by no means universal, and it is estimated that around one third of expectant first-time mothers never attend the offered sessions (Anderson et al., 2007). Furthermore, parents typically describe prenatal (antenatal) classes as being about pregnancy-related health and the birth event, with a focus on the experience of the mother rather than the couple. This can result in the (perceived) exclusion of fathers, and the narrow focus can lead to both parents being surprised at other changes that ensue; for example, changes in their relationship (Ingram et al., 2008; Kowlessar et al., 2014). Perhaps more importantly, prenatal (antenatal) classes invariably neglect discussion about parenting and the promotion of parenting cognitions and behaviors known to support better parent-child relationships and positive child outcomes.
The linkages between parental wellbeing, parenting and child outcomes are clearly defined. Negative mental health, such as depression, stress and fatigue, relationship problems and limited social support are all associated with reduced parenting capability, including less frequent positive parenting behaviors (e.g., nurturing), more frequent negative behaviors (e.g., harsh discipline) and disengagement from their child (e.g., less frequent eye contact) (Anthony et al., 2005; Lovejoy et al., 2000; Zemp et al., 2016). In turn, parenting cognitions and behaviors have a well-documented enduring impact on the full range of child social, emotional and behavioral outcomes (Biglan et al., 2012).
It is widely acknowledged that evidence-based parenting support interventions (EBPS) are an effective way of promoting better outcomes for children (World Health Organization, 2016) and, of course, they can make life better for parents themselves by improving wellbeing, parenting confidence and competence, and through improvements in the family (Bennett et al., 2013). Underpinned by the assumption that the family environment, including parenting practices, is fundamental to child development and is modifiable (Biglan et al., 2012), EBPS interventions work to promote positive outcomes by changing the family environment using appropriate behavioral, affective and cognitive change techniques (Sanders & Prinz, 2018), informed by the theoretical underpinnings of the intervention.
Despite clear evidence of the need for parenting support in the perinatal period, few programmes exist. To fill this gap, Baby Triple P (BTP) was developed as a broad focused programme of support for couples transitioning to parenthood, targeting modifiable risk and protection factors at the level of the parent, family and child (Spry, 2013). BTP is informed by social learning theory and developed using the theoretical principles of the Triple P-Positive Parenting Program (Sanders, 2012; Spry, 2013). The key areas targeted are fostering realistic expectations about the transition to parenthood, infant behavior and development, protecting the couple relationship, skills and competencies of dealing with infants, promoting increased confidence in new parents about their abilities to parent, and conveying adaptive strategies for emotion regulation, such as relaxation or seeking social support (see below for further information about intervention content and delivery). BTP sits at level 4 of the Triple P system and is moderate to high intensity (Sanders, 2012).
The aim of this current study was to provide evaluation of the efficacy of Baby Triple P in a UK context. There is limited evaluation of the intervention and none with the general population, with limited evidence about specific risk groups (e.g., parents of pre-term babies and those with mental health problems) (e.g., Evans et al., 2017; Wittkowski et al., 2018). The primary hypothesis was that first-time parent couples who participated in the intervention would have better mental health, assessed as depression, anxiety and stress, than control group parents. It was further predicted that couples who participated in Baby Triple P would demonstrate: (a) higher levels of subjective wellbeing; (b) indicators of a more positive couple relationship and communication about parenting; (c) more satisfaction with social support they receive; (d) increased parenting confidence, efficacy and role expectancy; (e) a better parent-infant relationship; and, (f) fewer perceived problems with their baby's behavior.

Sample
Recruitment took place between August 2011 and March 2014. The study was advertised to first-time parent couples through promotional material placed in maternity clinics and relevant community venues in Glasgow, Scotland, allowing couples to self-refer. In addition, researchers visited maternity clinics to undertake face-to-face recruitment. Inclusion criteria were: (a) experiencing a first pregnancy reaching the middle trimester (i.e., between 20 and 35 weeks gestation); (b) having a significant other (i.e., partner/father of baby) who was prepared to be involved in the programme; (c) a basic level of English literacy; (d) neither the mother nor her partner had sought treatment for depression or other mental health problems in the previous six months (i.e., neither had an existing mental health problem); and, (e) absence of a diagnosed genetic disorder or disability in the baby. Parents with existing mental health problems were excluded because Baby Triple P was offered as an early preventative, rather than a treatment, intervention. Eligibility was established during telephone screening.
Slow rates of recruitment in the early phase necessitated a review and simplification of recruitment material and the introduction of monetary compensation payments for assessment completion and intervention attendance; this was implemented in June 2012. Appropriate payments were made to all participants who had completed phases of the study prior to June 2012. At the same time there was a relaxation in the assessment of couple engagement at screening. Prior to June 2012 couples were excluded if they reported that they could not attend at least three of the face-to-face sessions together. After June 2012 couples were entered into the trial if they reported that one parent could attend all four sessions and a minimum of two sessions together. See below for information about impact on intervention attendance.

Measures
Information about the timepoints at which participants completed each assessment measure, and the internal consistency at first completion, is summarized in the supplementary materials (see Tables A1-1). Sociodemographic information was obtained from participant couples at T1 using a modified version of the Family Background Questionnaire (FBQ; Sanders & Morawska, 2010). To minimize participant burden, mothers were asked to report the majority of information. A short version of the FBQ was sent out at T2-4 to capture any changes in the individual/couple circumstances. Data collected included age, nationality/ethnicity, relationship status/living arrangements, education, occupation, financial comfort, mental health, pregnancy and pregnancy-related health care. Participants' postcodes were used to classify the relative level of deprivation of the area in which families lived, using the Scottish Index of Multiple Deprivation (SIMD).
The primary outcome, parent mental health, was assessed using the Depression Anxiety Stress Scale (DASS-21; Lovibond & Lovibond, 1995) to allow for assessment of change in three relevant elements of mental health. This 21-item self-report questionnaire assesses depression (DASS-D), anxiety (DASS-A) and stress (DASS-S); higher scores represent increasing symptomology. In addition to interpretation of the respondent subscale scores, each of the subscales was converted to severity ratings classifying symptomology as normal, mild, moderate, severe or extremely severe (Lovibond & Lovibond, 1995) and these data are presented as supplementary material (see Tables A1-2). The DASS-21 was selected because it has previously demonstrated good internal consistency in UK samples (α = 0.82-0.90; Henry & Crawford, 2005), although in this study internal consistency was lower (see Tables A1-1). It was completed by both parents at all four timepoints.
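The conversion from subscale scores to severity ratings can be sketched as follows. This is a minimal illustration, not the study's scoring script: it assumes the conventional DASS-21 practice of summing the seven items of a subscale (each rated 0-3) and doubling the total, and it assumes the severity cut-offs commonly cited from the DASS manual, which the paper itself does not list. The function names are hypothetical.

```python
from bisect import bisect_left

# Severity labels in ascending order of symptomology.
BANDS = ["normal", "mild", "moderate", "severe", "extremely severe"]

# Assumed inclusive upper bounds of the first four bands, applied to doubled
# DASS-21 subscale totals (cut-offs as commonly cited from the DASS manual).
UPPER_BOUNDS = {
    "depression": [9, 13, 20, 27],
    "anxiety": [7, 9, 14, 19],
    "stress": [14, 18, 25, 33],
}


def dass21_subscale_score(items):
    """Sum seven 0-3 item responses and double the total (assumed convention)."""
    if len(items) != 7 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("expected seven item responses, each scored 0-3")
    return 2 * sum(items)


def severity(subscale, score):
    """Map a doubled subscale score to its severity label."""
    return BANDS[bisect_left(UPPER_BOUNDS[subscale], score)]


score = dass21_subscale_score([1, 0, 2, 1, 0, 1, 0])
print(score, severity("depression", score))
```

The same doubled score falls in different bands depending on the subscale, which is why the severity ratings are derived separately for DASS-D, DASS-A and DASS-S.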
There were 11 secondary outcome measures. Subjective wellbeing was assessed using the Satisfaction with Life Scale (SWLS; Diener et al., 1985). The 5-item scale has demonstrated acceptable psychometric properties in large scale testing, with high internal consistency in adult community samples (α = 0.88; Kobau et al., 2010). Both parents completed the SWLS at all assessment points.
Three measures were included to assess elements of the couple relationship and were completed by both parents. Relationship acceptance was assessed (T1-T4) using the Frequency and Acceptability of Partner Behavior Inventory (FAPBI; Doss & Christensen, 2006). Participants were asked how frequently their partner performed positive (e.g., physical affection) and negative (e.g., invaded privacy) behaviors and how acceptable this frequency of the behavior was. The original scale had 20 items that represent four subscales but the childcare item was removed at all timepoints because at preintervention the couples would not have had a child to care for. The possible range of scores on each of the four acceptability subscales was: Affection 0-27 (FAPBI-Aa), Closeness 0-63 (FAPBI-Ca), Violation 0-54 (FAPBI-Va), Demand 0-27 (FAPBI-Da). The frequency of behavior subscale scores (FAPBI-Af, FAPBI-Cf, FAPBI-Vf, FAPBI-Df) were converted to number of times per month with the possible range from zero (behavior not performed in previous month) to the number of times it was relevant to the couple relationship. The FAPBI has demonstrated acceptable levels of internal consistency for the affection, closeness, demand subscales (α = 0.78-0.80) but lower levels for the violation subscale (α = 0.63-0.67) (Doss & Christensen, 2006), a pattern replicated in this study for the frequency subscale (see Tables A1-1).
The Household and Childcare Task Checklist (HCTC; Spry, 2013) assessed (T1-T4) participants' perceptions of fairness in relation to the distribution of labor within the couple relationship. The measure asks about 13 household tasks (HCTC-H) and 10 baby care tasks (HCTC-B) and has two items measuring global perceived fairness (HCTC-GF) and satisfaction (HCTC-GS) with the division of tasks. The Household and Baby subscale scores were presented as a mean of the relevant items, giving a possible range of 0-2. The global satisfaction item was responded to using a 5-point scale (1 = not satisfied at all, 5 = very satisfied). The HCTC has previously demonstrated moderate to high internal consistency (Household tasks α = 0.69-0.77, Baby care tasks α = 0.79-0.86). The baby care tasks subscale was not completed at T1 because the couple's baby had not been born.
A modified version of the Parent Problem Checklist (PPC; Dadds & Powell, 1991) assessed (T2-T4) interparental conflict, or the couple's ability to cooperate over child rearing practices. Two of the original 16 items were removed at all timepoints because they were not relevant to first-time parents and the remaining 14 items were modified to contextualize them to infants. Participants indicated if the defined issues had been a problem (PPC-P) for them in the previous 4 weeks, generating a subscale representing the number of problems. They also rated the extent of the problem on a 7-point scale (1 = not at all, 7 = very much) and the mean of the 14 items provides an Extent subscale (PPC-E). Both the Problem (α = 0.82) and Extent (α = 0.89) subscales have good internal consistency (Stallman et al., 2009).
Measures relating to the baby were completed at T2-T4. The Maternal Self-Efficacy Scale (MSES; Teti & Gelfand, 1991) measured mothers' beliefs about their parenting competence. Nine domain-specific items assessed self-efficacy beliefs in relation to infant care and a tenth item assessed parental self-efficacy in general, with higher scores indicating stronger self-efficacy beliefs. The measure has previously demonstrated good internal consistency (α = 0.83; Coleman & Karraker, 2003).
Mother-infant bonding was measured using a modified version of the Postpartum Bonding Instrument (PBI; Brockington et al., 2001). The 25 items represent four subscales measuring general bonding (PBI-G), rejection and pathological anger (PBI-RP), infant-focused anxiety (PBI-IA), and incipient abuse (high risk of abuse; PBI-A) with higher scores reflecting increasing level of psychopathology. The PBI has previously demonstrated moderate internal consistency in a UK sample (α = 0.63-0.79; Wittkowski et al., 2007).
The 11-item evaluation subscale from the What Being the Parent of a Baby is Like Questionnaire (WBPBL; Pridham & Chang, 1989) measured the participants' evaluation of themselves as a parent with respect to the meaning of their relationship with the infant and their care of them. Higher scores reflect more positive evaluations. The subscale has previously demonstrated good internal consistency (α = 0.87-0.90; Pridham & Chang, 1989) and was completed by both parents.
The Baby Behavior Inventory (BBI; Spry, 2013) assessed the frequency of occurrence (BBI-F) of 14 baby behaviors that parents commonly report as challenging (e.g., sleep and feeding behaviors). Higher scores are indicative of a parent experiencing higher intensity of infant behavioral problems (i.e., more perceived problems with their baby's behavior). In addition, the BBI assesses the number of problem areas (BBI-N) and parent confidence (BBI-C) in dealing with behaviors they identified as problematic using a 5-point confidence scale. Higher scores are indicative of parental confidence in being able to manage problem behaviors. The BBI has demonstrated good internal consistency (α = 0.80-0.84; Spry, 2013) and was completed by both parents.
Couples completed a 24-hour Baby Diary (BD; Spry, 2013) recording their infant's pattern of feeding, sleeping, crying/fussy behavior and time when they are happy/content/awake. Parents recorded the category of behavior that occurred the most in each 30-minute period. The percentage of time spent crying/fussy (BD-C) and happy/content/awake (BD-H) was calculated and used in the analysis.
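The diary scoring can be illustrated with a short sketch. It assumes a complete diary of 48 half-hour slots, each holding one of the four category labels described above; the example day and the function name are hypothetical and are not study data.

```python
from collections import Counter

SLOTS_PER_DAY = 48  # 24 hours recorded in 30-minute periods


def diary_percentages(slots):
    """Return the percentage of the day coded crying/fussy and happy/content/awake."""
    if len(slots) != SLOTS_PER_DAY:
        raise ValueError("expected one category for each of the 48 half-hour slots")
    counts = Counter(slots)
    return {
        "BD-C": 100 * counts["crying/fussy"] / SLOTS_PER_DAY,
        "BD-H": 100 * counts["happy/content/awake"] / SLOTS_PER_DAY,
    }


# Hypothetical diary: 26 slots sleeping, 8 feeding, 6 crying/fussy, 8 happy/content/awake.
example_day = (
    ["sleeping"] * 26
    + ["feeding"] * 8
    + ["crying/fussy"] * 6
    + ["happy/content/awake"] * 8
)
print(diary_percentages(example_day))
```

Because only the dominant behavior in each period is recorded, the percentages are approximations of time spent in each state rather than exact durations.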
Perception of available social support (T1-T4) was assessed using the Social Support Scale (SSS; Spry, 2013) by asking participants to list people they received formal social support (SS-F) from and to rate how satisfied they were with this support on a 5-point scale. This was replicated for informal social support (SS-I). The listing of people was not used in assessment but rather it was a way of prompting participants to think about the extent of their available support networks.
A 7-item Client Satisfaction Questionnaire (CSQ; Spry, 2013) was used to assess intervention participants' views on the quality of the intervention received, how well it addressed their needs, and their satisfaction with the intervention. Higher total scores reflect greater satisfaction. The CSQ has demonstrated good internal consistency (α = 0.89-0.91; Spry, 2013) and it was completed by intervention couples at T2.

Procedure
Potential participant couples were provided with a study information booklet at initial recruitment and, if they were less than 18 weeks pregnant, were asked to consent to the status of their pregnancy being reconfirmed and to being contacted for a telephone screening interview at approximately 20 weeks gestation. At screening, eligible couples consented to join the study and complete baseline (T1) assessments prior to randomization, which was carried out using a web-based system hosted by an external clinical trials unit.
After randomization, intervention group couples were invited to attend the next available Baby Triple P group. They were sent T2 assessments following completion of the intervention, when the baby was approximately 10 weeks old. CAU couples were sent T2 assessments when their baby was 10 weeks old. Subsequent assessments were sent to both groups when the baby was 12 (T3) and 24 months (T4) old. Monetary compensation payments for time (£10 shopping voucher) were offered for completion of assessments at each timepoint. Intervention participants received a travel cost compensation payment (£5) for each face-to-face intervention session they attended. In addition, couples who completed all four assessment points were entered into a draw to win a £200 shopping voucher. After completion of T1 assessments all couples, irrespective of randomized condition, were sent a Triple P home safety tip sheet.

Intervention
The intervention is typically called Baby Triple P, but in this study it was referred to as Triple P for Baby to avoid conflation with a highly publicized UK child abuse case known as Baby P.
BTP was delivered in eight sessions: four weekly 2-hour face-to-face group sessions delivered prenatally, and four weekly 30-minute telephone sessions starting when the baby was approximately 6 weeks old. Intervention couples were provided with a BTP manual that contained the session information and homework tasks, which meant parents who missed a face-to-face session could still engage with intervention content. Couples retained the manual after completion of the intervention.
Session 1 introduced parents to the intervention before focusing on strategies for developing a positive relationship with their baby. Session 2 focused on strategies to help parents teach their baby new skills and behaviors, ways of responding to their baby and understanding crying and sleep behaviors. Session 3 focused on changes that parents might expect following the birth of their baby and introduced coping strategies for emotion regulation. Session 4 focused on strategies for promoting a positive couple relationship, including communication and partner support. Sessions 5-8, the telephone sessions, were designed to provide parents with tailored, family-contextualized support for putting the strategies from previous sessions into practice. In these sessions practitioners promoted a self-regulatory approach through prompting to the BTP session materials and strategies, and encouraged parents to set goals and review progress.
Sessions were delivered by 13 trained and accredited Triple P practitioners, all health professionals (e.g., psychologists and health visitors). Face-to-face sessions were facilitated by two practitioners and telephone sessions were facilitated by one practitioner. In total, 16 groups were delivered and group sizes ranged from two to seven couples. Information about what the National Health Service Scotland offers as care as usual is available at https://www.nhsinform.scot/ready-steady-baby.

Protocol Adherence
As noted above, early recruitment was slow, necessitating a review and simplification of recruitment material, a relaxation in the assessment of couple engagement, and the introduction of compensation payments. Ethical approval was obtained for these changes. To facilitate adherence to the standardized intervention delivery manual, practitioners were required to complete protocol adherence checklists for each session delivered, including telephone sessions.

Statistical Analysis
Data were analyzed in line with the pre-specified analysis plan described in the protocol (unpublished) approved by the study steering group, the study sponsor (the NHS) and the approving ethics committees. An a priori power analysis indicated that to detect a medium effect size of 0.5, α = 0.05, (two tailed) and power at 0.80, a minimum sample size of 128 was required. Allowing for potential attrition of approximately 20%, a recruitment target of 160 couples was proposed to provide sufficient power to conduct the proposed analyses.
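The reported figures can be reproduced approximately with a short calculation. The sketch below is illustrative only: the paper does not state which test the power analysis assumed, so it assumes a two-sided comparison of two group means with a standardized effect size d, and it uses a normal approximation with a small-sample correction term rather than an exact noncentral-t calculation. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the two-tailed test
    z_beta = z(power)           # quantile corresponding to the desired power
    # Normal approximation plus a correction for using t rather than z.
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2 + z_alpha ** 2 / 4
    return ceil(n)


def inflate_for_attrition(n_total, attrition_percent):
    """Recruitment target such that n_total remain after the expected attrition."""
    retained_percent = 100 - attrition_percent
    return -(-n_total * 100 // retained_percent)  # integer ceiling division


per_group = n_per_group(0.5)
minimum_total = 2 * per_group
target = inflate_for_attrition(minimum_total, 20)
print(per_group, minimum_total, target)
```

Under these assumptions the calculation gives 64 per group, a minimum total of 128, and a recruitment target of 160 once 20% attrition is allowed for, matching the values reported above.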
For ethical purposes participants could choose not to answer items in the assessment battery. In keeping with a previous BTP efficacy study, a conservative approach to missing data was adopted (Spry, 2013). Where more than 10% of items needed to calculate a variable were missing, the whole case was excluded. Where fewer than 10% were missing, mean imputation was performed, with the participant's mean across the remaining items substituted for the missing value. This has been demonstrated as an appropriate technique with low levels of missing data (Lodder, 2014). Where appropriate (i.e., scales with more than three items), the internal consistency of the assessment measures was calculated at the first timepoint they were completed and is included in the supplementary materials (Tables A1-1).
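The missing-data rule can be sketched as below. This is an illustration under stated assumptions, not the analysis code used in the trial: missing responses are represented as None, the function name is hypothetical, and a case with exactly 10% of items missing (which the text does not address) is treated here as eligible for imputation.

```python
def impute_person_mean(items, max_missing=0.10):
    """Apply the 10% rule to one participant's responses on one scale.

    Returns None when the case should be excluded, otherwise the responses
    with each missing item replaced by the participant's own mean.
    """
    n_missing = sum(x is None for x in items)
    if n_missing / len(items) > max_missing:
        return None  # too many items missing: exclude the whole case
    if n_missing == 0:
        return list(items)
    observed = [x for x in items if x is not None]
    person_mean = sum(observed) / len(observed)
    return [person_mean if x is None else x for x in items]


# Hypothetical 14-item scale with one missing response (about 7% missing).
fourteen_items = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 2, None]
print(impute_person_mean(fourteen_items))

# Hypothetical 7-item subscale with one missing response (about 14% missing).
seven_items = [0, 1, 2, None, 1, 0, 3]
print(impute_person_mean(seven_items))
```

One consequence of the rule is that, on short scales, a single unanswered item already exceeds the threshold, so the case is excluded rather than imputed.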
Analysis was conducted using an intention to treat approach with all trial participants analyzed according to their original randomized group. The same analysis was performed for both mothers and partners, apart from outcomes that were only relevant to the mother. Descriptive summaries are presented for baseline sociodemographic and relationship characteristics (see Table 1). Outcomes from participant questionnaires are summarized at all timepoints they were collected. To evaluate efficacy of the intervention, mixed-effects repeated measures models were performed using data across all available timepoints. Time was treated as categorical in the models, allowing for models of fixed effects for treatment group, timepoint and a group x timepoint interaction, with random slopes and intercepts for trial participants, and consequently estimates of treatment effects produced for each timepoint. This method controls for any baseline differences between treatment groups. Likelihood ratio tests were performed comparing the full model to a model without the group x time interaction to assess for any overall intervention effect. All p values are two tailed and p values < 0.05 were considered statistically significant with no adjustment made for multiple comparisons, in line with the pre-specified analysis plan. An acknowledged limitation relates to deviation from the planned timing of the follow up assessments (see below). In keeping with a pre-specified plan to investigate deviations in engagement with the research demands, post hoc sensitivity analyses were conducted on the primary outcome by performing the repeated measures analysis using only data from participants who returned completed assessments within 8 weeks of the originally estimated time. The proportion of participants showing reliable change (Jacobson & Truax, 1991) was calculated for the primary outcome. 
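The reliable change calculation can be illustrated with the standard Jacobson and Truax formulation, in which the change score is divided by the standard error of the difference and values beyond ±1.96 are treated as reliable. The sketch below is not the trial's analysis code; the standard deviation, reliability and scores are hypothetical inputs chosen for illustration, and the function names are assumptions.

```python
from math import sqrt


def reliable_change_index(pre, post, sd, reliability):
    """Change score divided by the standard error of the difference."""
    se_measurement = sd * sqrt(1 - reliability)
    se_difference = sqrt(2 * se_measurement ** 2)
    return (post - pre) / se_difference


def classify_change(pre, post, sd, reliability, criterion=1.96):
    """Label a change as a reliable decrease, a reliable increase, or neither."""
    rci = reliable_change_index(pre, post, sd, reliability)
    if rci <= -criterion:
        return "reliable decrease"
    if rci >= criterion:
        return "reliable increase"
    return "no reliable change"


# Hypothetical values: baseline SD of 10 and a reliability coefficient of 0.75.
print(classify_change(pre=20, post=6, sd=10, reliability=0.75))
print(classify_change(pre=20, post=10, sd=10, reliability=0.75))
```

On a symptom measure such as the DASS-21, where higher scores indicate more symptomology, a reliable decrease corresponds to improvement. The proportion showing reliable change is then the share of participants in each group receiving that label.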
Analysis was performed in R version 3.3.3 (R Core Team, 2013) by a statistician who was independent from the research team.

Results
In total 858 couples registered an interest in participating in the study; however, 35 withdrew/were excluded prior to eligibility screening (17 did not have a continuing pregnancy, three were too advanced in pregnancy, six did not provide informed consent, one could not attend sessions, four moved out of the study area, one did not have an appropriate level of English literacy, one actively withdrew, two were not contactable). Therefore, 823 couples were eligible to be screened against the study inclusion criteria at the ~20 weeks gestation point. Participant flow is documented in Fig. 1, where it is noted that 650 couples were ineligible for randomization, most often because they did not meet the inclusion criteria, and 78 couples were randomized to each arm of the trial. In all cases, the mother's partner was described as being the baby's father.
As noted in Fig. 1, an administrative error led to 17 couple being randomized prior to completion of baseline assessment so they were excluded from the research study. Intervention couples who were randomized but stopped communicating with the research team before participating in the intervention were not sent the T2 assessments because it was assumed they had withdrawn (n = 12). A further two couples did not receive T2 assessments because difficulties engaging them in the telephone sessions meant they exceeded the deadline. These couples are represented as 'not offered' in Fig. 1 and none chose to complete T3/4 assessments when these were sent.
The baseline sociodemographic characteristics of participants in the BTP and CAU groups can be seen in Table 1. The majority of participants identified as white, married or living together, were employed and educated beyond high school. The majority of pregnancies were planned and were without complication at the point of baseline data collection. Despite the participants' higher levels of education and employment, the majority of mothers reported that in the previous 12 months there had been times they had been unable to meet their essential household expenses. Linked to this, perhaps reflecting a need to work rather than a desire to work, after the birth of their baby the majority of mothers who had returned to work said they would prefer to work less (T2 = 71.8%, T3 = 50.6%, T4 = 50.6%).

Engagement, Retention and Attrition
Participation in the intervention was variable. In 28 couples, both parents attended all four face-to-face sessions and at least one parent participated in all four telephone sessions; they received the full intervention. Four other couples were considered to have received the full intervention because at least one parent had attended the four face-to-face sessions and completed the four telephone calls. Seventeen couples did not participate in any sessions, which left 29 couples who received some of the intervention. Nine of these partial recipient couples had full attendance at the face-to-face sessions (both parents) but they did not complete all telephone sessions. Thirteen couples had attendance (at least one parent) at three face-to-face sessions, five couples had attendance (at least one parent) at two face-to-face sessions, and two couples attended (at least one parent) one of the face-to-face sessions; across these 20 couples there were varying levels of engagement with telephone sessions (0-4).
In summary, 41% of the 78 intervention couples received the full intervention, and 52.6% had attendance at all four face-to-face intervention sessions where BTP was delivered and practiced. The recruitment review did not appear to impact attendance at the intervention. Mothers recruited prior to the recruitment review (n = 26) attended an average of 2.7 face-to-face sessions and those recruited after the review (n = 52) attended an average of 2.8 sessions. Partners recruited prior to the recruitment review attended an average of 2.5 face-to-face sessions and those recruited after the review attended an average of 2.6 sessions.
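The summary percentages follow directly from the couple counts given above. As a minimal sketch (variable names are illustrative), the tally can be reproduced in Python. It assumes that the four couples in which at least one parent attended every face-to-face session count toward the "all four face-to-face sessions" figure.

```python
# Couple counts as reported for the intervention arm.
randomized = 78
full_both_parents = 28      # both parents at all four face-to-face sessions, all calls completed
full_one_parent = 4         # at least one parent at all four sessions, all calls completed
no_sessions = 17            # attended nothing
partial_four_f2f = 9        # all four face-to-face sessions, telephone sessions incomplete
partial_three_f2f = 13
partial_two_f2f = 5
partial_one_f2f = 2


def pct(count: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


full = full_both_parents + full_one_parent
partial = partial_four_f2f + partial_three_f2f + partial_two_f2f + partial_one_f2f

# The three categories should account for every randomized couple.
assert full + partial + no_sessions == randomized

all_four_f2f = full + partial_four_f2f
engaged = randomized - no_sessions
at_least_three_f2f = all_four_f2f + partial_three_f2f

print(pct(full, randomized))             # 41.0
print(pct(all_four_f2f, randomized))     # 52.6
print(pct(at_least_three_f2f, engaged))  # 88.5
```

The last figure is the share of engaged couples with attendance at three or more face-to-face sessions.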
We planned to conduct T2 assessment at approximately 10 weeks, T3 at ~52 weeks and T4 at ~104 weeks after the couple's baby was born. However, for a variety of different reasons, many related to the demands of new parenthood, there were often delays in the follow up assessments. For example, in some instances parents did not begin the telephone sessions at 6 weeks post-birth or it took longer than 4 weeks to complete them because they had other priorities. Similarly, at T3 and T4 parents could be late to return their assessments because these had to be completed in the evening while caring for their toddler. This resulted in the mean time of completion, from baby birth date, being 17.9 (SD = 9.9) weeks, 56.2 (SD = 5.9) weeks and 109.6 (SD = 10.3) weeks for mothers, and 20.3 (SD = 13.6) weeks, 57.7 (SD = 8.9) weeks and 111.2 (SD = 11.5) weeks for partners. At T2 66.0% (n = 70) of mother assessments were returned within an 8 week window, at T3 it was 82.8% (n = 82) and at T4 it was 83.9% (n = 78). Rates were slightly lower for partners with 60.2% (n = 62) returning within 8 weeks at T2, 71.9% (n = 69) at T3 and 76.4% (n = 68) at T4. Across both groups of mothers 68.6% completed T2, 64.1% completed T3 and 61.5% completed T4 assessments. Across both groups, 66.7% of partners completed T2, 62.8% completed T3 and 59.0% completed T4. There were no significant differences between BTP and CAU groups in the rates of attrition at T2-T4 (Mother T2 p = 0.730, T3 p = 1.00, T4 p = 0.869; Partner T2 p = 0.610, T3 p = 0.868, T4 p = 0.416).
Means and standard deviations for all outcome measures for each group of mothers and partners, at all four timepoints, are presented in Tables 2 and 3. All analyses were conducted with baseline scores included to account for differences across intervention and CAU groups when modeling intervention and time effects.

Primary Outcome: Parent Mental Health
As shown in Tables 2 and 3, for mothers there was no significant intervention effect on the DASS anxiety or stress subscale scores, and for partners no effect on the stress or depression subscales. There were significant effects for the remaining subscale outcomes.
There was a significant intervention effect for mothers' DASS depression scores across all postintervention timepoints. Follow up analysis identified significant differences at T2 only, with BTP mothers reporting lower mean scores compared to CAU mothers (estimated difference of −3.29, p = 0.001, 95% CI [−5.29, −1.30]). BTP mothers reported a reduction in depressive symptomology immediately postintervention whereas CAU mothers reported an increase.

Sensitivity Analysis
To investigate the effect that deviation from the planned timing of the follow up assessments (see above) may have had on the results, sensitivity analysis was conducted on the primary outcome by performing the repeated measures analysis using only data from participants who returned completed assessments within 8 weeks of the originally estimated time. Using this approach mothers' DASS depression (LR = 7.43, df = 3, p = 0.059) and partners' DASS anxiety (LR = 3.64, df = 3, p = 0.303) were no longer statistically significant. Although this sensitivity analysis casts some doubt on the reliability of the main findings, it should be interpreted with caution because the analyses are under-powered and effect sizes are large for some time points. Indeed, for maternal depression the calculated estimate for T2 follow up was significant and the effect size large (
The proportion of participants in both the BTP and CAU groups showing reliable change in either a positive or negative direction or showing no change between T1 and T2, T1 and T3, and T1 and T4 is presented in Table 4. The majority of participants in both BTP and CAU reported no reliable change across any timepoints. However, across all timepoints the proportion of mothers reporting significant improvements in depression, anxiety and stress was greater for those in the BTP group than the CAU group, and a similar trend was noted for partner anxiety. The proportion of mothers reporting worsening depression and anxiety between T1 and T2, and between T1 and T4 was greater among CAU mothers in comparison with BTP mothers. For partners, a greater proportion of those in the CAU group reported significant declines in all three mental health constructs than the BTP partner group.
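Reliable change in the sense of Jacobson and Truax (1991) is judged by dividing the change score by the standard error of the difference between two measurements. The sketch below illustrates that classification in Python. The baseline standard deviation, the reliability coefficient and the scores are hypothetical placeholders rather than values from this trial, and the conventional 1.96 cutoff is assumed.

```python
import math


def reliable_change_index(pre: float, post: float, sd_baseline: float, reliability: float) -> float:
    """Jacobson-Truax reliable change index: change divided by the SE of the difference."""
    se_measurement = sd_baseline * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0 * se_measurement ** 2)
    return (post - pre) / s_diff


def classify_change(pre: float, post: float, sd_baseline: float, reliability: float,
                    lower_is_better: bool = True, threshold: float = 1.96) -> str:
    """Label a participant's change as reliable improvement, deterioration, or no change."""
    rci = reliable_change_index(pre, post, sd_baseline, reliability)
    if abs(rci) < threshold:
        return "no reliable change"
    improved = rci < 0 if lower_is_better else rci > 0
    return "reliable improvement" if improved else "reliable deterioration"


# Hypothetical symptom scores (lower = fewer symptoms); SD and reliability are assumed.
for pre_score, post_score in [(12, 4), (10, 8), (4, 12)]:
    print(pre_score, post_score,
          classify_change(pre_score, post_score, sd_baseline=6.0, reliability=0.91))
```

For a symptom scale such as the DASS, a negative index marks improvement, which is why `lower_is_better` defaults to true in this sketch.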

Secondary Outcomes
As can be seen in Tables 2 and 3, for the majority of secondary outcomes there were no significant intervention effects for either mothers or partners, with the exception of the FAPBI acceptability of violation behaviors subscale and the WBPBL scale for partners only. The estimated model for FAPBI-Va score among partners indicated a significant intervention effect across all postintervention timepoints. Follow up analysis indicated significantly lower scores among BTP partners at T4 compared with CAU partners (estimated difference −3.44, p = 0.040, 95% CI [−6.71, −0.16]); BTP fathers were less accepting of their partner's violation behaviors than the CAU fathers. Analysis of the WBPBL scores for partners indicated a significant intervention effect across all timepoints, but follow up analyses were not significant at T2, T3 or T4. Therefore, while there was a main effect of the intervention, the analysis did not identify significant differences between the two groups at the three assessment points. A review of the data suggests that intervention partners had a trajectory of improving parental self-efficacy from postintervention to 24 months whereas the CAU group began to dip between T2 and T3.
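An estimated difference and its 95% confidence interval are enough to approximate the associated two-tailed p value, which is a useful consistency check on reported results. The following Python sketch assumes a normal approximation and a symmetric interval; the trial's mixed models may have used a different reference distribution, so the output is indicative only. The function name is hypothetical.

```python
import math


def p_from_ci(estimate: float, lower: float, upper: float,
              z_crit: float = 1.959964) -> tuple[float, float, float]:
    """Approximate SE, z and two-tailed p from an estimate and its 95% CI."""
    se = (upper - lower) / (2.0 * z_crit)   # half-width of the interval divided by z
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal p value
    return se, z, p


# Estimated differences and intervals as reported for the follow up comparisons.
comparisons = {
    "Mothers' DASS depression, T2": (-3.29, -5.29, -1.30),
    "Partners' FAPBI-Va, T4": (-3.44, -6.71, -0.16),
}
for label, (est, lo, hi) in comparisons.items():
    se, z, p = p_from_ci(est, lo, hi)
    print(f"{label}: SE = {se:.2f}, z = {z:.2f}, p = {p:.3f}")
```

Both intervals exclude zero, so under this approximation both p values fall below 0.05.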

Intervention Satisfaction
The CSQ was completed by 50 BTP mothers and 47 partners. Participants reported high levels of overall satisfaction with the intervention with a mean CSQ score of 39.0 (SD = 6.8, range 24.0-49.0) for mothers and 38.8 (SD = 6.3, range 24.0-49.0) for partners. When asked if they had received the type of help that they wanted from the programme, 88.0% of mothers and 93.6% of partners reported 'yes, generally' to 'yes, definitely' they had. When asked if they thought their relationship with their partner had been improved by the programme, 62.0% of mothers and 52.0% of partners reported 'yes, generally' to 'yes, definitely'.

Discussion
This RCT was undertaken to test the efficacy of Baby Triple P, an intervention for couples transitioning to parenthood. The primary hypothesis, that first-time parent couples who participated in BTP would have better mental health than control couples, was partially supported. Specifically, when the full set of data was included in the analysis, mothers who participated in BTP reported a decline in depressive symptomology postintervention whereas mothers in the CAU group had increased levels. In addition, where BTP partners had lower levels of anxiety immediately postintervention, CAU partners saw a rise in anxiety symptomology and, although both groups had a pattern of decline at 12 and 24 months, BTP partners reported significantly lower levels of anxiety symptoms than CAU partners across the full assessment period. Additional sensitivity analyses, which included only a subset of the available data (60.2-83.9%), did not reach significance for maternal depression or paternal anxiety. However, the reliability of this analysis must be treated with caution given that it was, in both cases, underpowered. Moreover, large effect sizes for the immediate postintervention comparison for maternal depression and all three time point comparisons for paternal anxiety suggest the effects are worthy of follow up. Alongside this, while some parents did not return their follow up assessments within 8 weeks of the original estimated completion, the exclusion of their data on the basis of time from the birth of their baby rather than time from intervention completion is, in retrospect, conceptually questionable. A range of factors meant that some couples took longer to complete the intervention than others. This included them delaying the start of their telephone consultations and, even when this phase had started, other priorities and the work schedule of intervention practitioners sometimes made weekly calls more difficult to achieve. This elongated the time associated with this phase of the intervention and, consequently, postintervention assessment. Given the study was an assessment of the impact of the intervention, rather than of becoming a parent, time from intervention completion may, with hindsight, have been a more appropriate variable to consider in the sensitivity analysis than time from birth. While variability in intervention timeline complicates the analysis of trial data, it without doubt reflects the expected variability of intervention delivery in a service as usual context. The data gathered as part of follow up still reflect short (~10 weeks), medium (~12 months) and longer term (~24 months) points within the transition to parenthood and provide an indication of the impact of BTP over this timeline. Specifically, with the data from all couples included, there is evidence to suggest that BTP works to support aspects of parent mental health during the transition to parenthood.
Maximizing the opportunity for successful transition during any major life course change is critical to ensuring positive short- and long-term outcomes. In the case of becoming a parent, the consequences of disrupted transition are far reaching, with implications for the parents themselves but also for their child. In addition, there is evidence that psychopathology and parenting practices, both positive and negative, interact and they can be transmitted to subsequent generations (Loeber et al., 2009; Neppl et al., 2009). This means that intervening early to prevent, or change, poor parenting practices and mental ill-health has the potential to benefit both current and future generations.
A particular challenge with respect to the current study was the low levels of psychopathology reported by participants at baseline; the majority of parents were within the normal range of scores for depression, anxiety and stress symptomology. While this limits the opportunity for any intervention to result in positive change, an effective intervention should at least be able to demonstrate the stabilizing of mental health symptomology within the normal range during a period of significant transition. Indeed, when the full data set was analyzed, mothers who participated in the BTP intervention reported improvements in their depression symptomology from baseline rather than just a stabilizing. This improvement is an important finding given that the risk of developing depression is elevated during the early post-birth period (Cooper & Murray, 1998), and is evidenced to a certain degree in this study by the concomitant increase in symptomology in the control group.
Sensitivity analysis notwithstanding, BTP appears to be an appropriate intervention to offer mothers, to complement existing antenatal care, as a way of preventing an increase in early postnatal depressive symptomology. It is, however, unclear why this benefit, relative to the control group, was not sustained over the longer term and, without assessment of the extent to which mothers were using the BTP strategies when contact with the practitioners ended, it is not possible to know if this is associated with them drifting away from the intervention or something else. For example, by the time their child is 12 months old many mothers will have finished their maternity leave and have returned to employment. This, in itself, is another important transition point that necessitates a renegotiation of roles both within and outside the household and it has the potential to impact on the parent-child relationship (Lucas-Thompson et al., 2010). Top up or booster intervention sessions that recontextualize the implementation of positive parenting strategies given changing circumstances might be an appropriate way to reinforce the intervention content in the longer term and help support positive mental health outcomes. This is something worthy of future investigation.
Assuming a level of confidence in the analysis of the full dataset over the sensitivity analysis, there is evidence that BTP is an appropriate intervention to offer first-time fathers as well as mothers. As with maternal depression, BTP partners showed a pattern of declining anxiety from baseline to postintervention, whereas there was an increase in the CAU group. In both groups, levels of anxiety receded at 12 and 24 months, but for partners who took part in the intervention there was a trend toward lower levels of anxiety than at baseline. Couples completed baseline assessment during their pregnancy so, unfortunately, it is not possible to know if this pattern of anxiety decline in BTP partners is a return to prepregnancy rates or a decline beyond this.
Although paternal, as compared to maternal, mental health has been neglected in the literature, there is increasing recognition that fathers may experience anxiety symptomology more commonly than depression during the perinatal period (Wynter et al., 2013; Vismara et al., 2016). Therefore, an intervention that is able to effect positive and sustained change in fathers' anxiety while they transition to parenthood, and in the period beyond the initial transition, is particularly valuable.
How BTP might work to impact anxiety is, as yet, unclear; however, emerging evidence linking perinatal paternal anxiety to cognitive bias and beliefs about fatherhood might offer insight (Sockol et al., 2018). The intervention sought to promote a self-regulatory approach to parenting and included learning about realistic expectations and parental self-care, all of which might contribute to a cognitive restructuring about parenting and a belief in one's own capabilities in relation to this, which in turn may impact on anxiety. Indeed, the trend of increasing fathers' parenting self-efficacy in the BTP group, as assessed by the WBPBL scale, may be a reflection of such cognitive change. As with many other interventions, future consideration of the mechanisms of action in relation to parent mental health and BTP is much needed.
There was limited support for the secondary hypotheses, with no effects of the intervention identified for mothers. In partners, effects were identified for an element of the couple relationship, acceptability of violation behaviors (FAPBI-Va), and global parenting self-efficacy (WBPBL) but not on other outcomes. Fathers in the BTP group reported lower levels of acceptance of their partner's violation behaviors (e.g., dishonesty, breaking of agreements, flirting/affairs, physical abuse and addictive behaviors) at 24 months than CAU fathers. It was not that there was a difference in the reported frequency of violation behaviors but rather a difference in the reported acceptability, which suggests that the intervention may have worked to change partners' expectations in relation to negative behaviors. Indeed, as noted above, an important component of the BTP intervention focuses on managing the couple relationship, to ensure it remains positive, and on looking after oneself. By reporting lower levels of acceptability in the absence of increased frequency of the behaviors, partners may be signaling less tolerance of negative behaviors because of the impact these have on them and their relationship.
In relation to global parenting self-efficacy (WBPBL), while the analysis identified an overall influence of the intervention on fathers' parenting self-efficacy after the birth of a child, no specific timepoint was identified where there was a significant difference between BTP and CAU partners. The pattern within the data suggests that intervention fathers saw an increase in parenting self-efficacy from postintervention to 24 months, which was in contrast to the CAU fathers who dipped at the 12 month assessment point; however, the ability to comment further on this is limited by the nature of the assessment. Specifically, there was no preintervention, baseline data available because assessment of parenting self-efficacy was not initiated until the birth of the couple's first baby (i.e., postintervention). Prior to this parents would not have been able to answer questions about the parenting role and/or their relationship with their baby. It is possible that a proxy indicator of imagined, or perceived, parenting self-efficacy in the antenatal phase would have provided an appropriate baseline comparator; however, the relationship between antenatal and postnatal parenting self-efficacy is currently untested and requires research investigation before it would be appropriate to adopt this approach in trial research.
Although it was disappointing that one fifth of couples allocated to the intervention arm of the trial chose not to attend any of the sessions, the vast majority (89%) of couples who did engage attended at least three of the four face-to-face sessions. In addition, levels of overall satisfaction with the programme were moderate to high and the majority of parents felt that they had received the type of help they wanted from the programme. This is an important consideration given that estimates of antenatal care uptake suggest that as many as one third of first-time parents may opt out of attending offered classes (Anderson et al., 2007), a higher number than the pattern of disengagement in this research. Although it is not possible to substantiate, these higher rates of engagement may be linked to a cultural normalization of parenting support in Scotland. Indeed, at the time this study was conducted standard maternity care in Scotland included free to access prenatal (antenatal) classes (see https://www.nhs.uk/pregnancy/labour-and-birth/preparing-for-the-birth/antenatal-classes) and the national government and local authorities were developing and implementing parenting support frameworks that had normalization-focused public health campaigns associated with them (e.g., Scottish Government, 2012).
Of particular interest are the moderate to high levels of satisfaction expressed by fathers. Previous reports of father satisfaction with antenatal care in the UK have tended to be negative. First, fathers have noted difficulties in terms of the logistics of attending classes at the time offered, yet, in this study, father attendance was similar to that of mothers. Although there is no data to support this contention, the delivery of the sessions in the evening may have facilitated father engagement. Second, fathers have previously reported that they found antenatal classes to be patronizing and entirely focused on the needs of the mother (Bradley et al., 2004; Kowlessar et al., 2014). In contrast, over 90% of partners who attended BTP in this study reported that the intervention had provided them with the help they wanted. The content of BTP is focused on encouraging positive psychological and parenting outcomes for parents as individuals and as a couple. This greater visibility of the psychoeducational needs of fathers in the transition to parenthood appears to be entirely palatable and, perhaps, stands in contrast to other antenatal support they would have been receiving as part of care as usual.

Limitations
This study makes important contributions to the parenting and antenatal care literature; however, there were a number of limitations that contextualize the findings and their interpretation. First, as noted above, while retention rates were relatively high, a moderate proportion of randomized couples failed to engage with either the intervention or the research element of the trial. This means that with an intention to treat approach to analysis the estimated treatment effect is conservative. Reasons offered for non-engagement were primarily associated with time and being able, or willing, to prioritize attendance over other demands, not least because care as usual appointments had to be attended as well. This is a pattern of engagement characteristic of the parenting intervention literature where uptake has been noted as being as low as 20-25% of the eligible population (Coie et al., 1993) with around 50% of parents finishing an intervention (Dumas et al., 2010). It may be possible to implement strategies to overcome barriers to uptake (e.g., monetary incentives), especially in research trials; however, it is important that these do not transgress the modifiable constraints of implementation ecological validity. In this study, monetary compensation was offered for time associated with the research element of the study and for travel costs associated with intervention attendance, but it is unlikely that this could be replicated in a service context. Alternative approaches to facilitate better engagement might include the normalization of antenatal parenting interventions (Sander & Kirby, 2012) or, based on what participants in this trial fed back, greater flexibility in delivery to enable working parents to attend at the weekend.
Second, and linked to this, no a priori minimum threshold for the number of intervention sessions that couples were required to attend was set; this means that there were no criteria to define couples as 'compliant' versus 'noncompliant' for consideration in the analysis. Parenting interventions rarely have full compliance so the pattern of attendance described in this study is likely to be reflective of the antenatal population (Anderson et al., 2007). However, more detailed consideration of the impact of differential compliance would make a useful addition to the literature. In particular, in the case of interventions like BTP where there are qualitative differences between each session, future consideration of whether some sessions are more important than others would allow for greater understanding of minimal sufficiency for positive gain.
Third, a further issue of compliance relates to the research element of the trial and the acceptable timeframe for assessment completion. While all research requires some flexibility in the assessment of participants, defining an a priori cut off for receipt of assessments would provide future research with a more robust approach to analysis planning. In the case of this study, a priori criteria associated with intervention completion may have eliminated the need for sensitivity analysis that was underpowered.
Fourth, as noted above, participants could choose not to answer items in the assessment battery, and mean imputation was used to produce values for those participants where less than 10% of items were missing. While this has been demonstrated as an appropriate technique when there are low levels of missing data (Lodder, 2013), recent calls have been made for the adoption of more sophisticated missing data techniques (Rioux & Little, 2021). Future research should consider these when planning analyses.
Fifth, while care was taken to use assessment measures that had previously been validated, the availability of measures sensitive to changes that might occur in multiple domains across a 24 month period following the birth of a child is limited. For example, assessments like the Baby Behavior Inventory (Spry, 2013) and What Being the Parent of a Baby is Like Questionnaire (Pridham & Chang, 1989) were worded such that they would have had explicit relevance to parents in the early part of the study, and they were retained across the 24 month period to allow for longitudinal comparison. However, as the baby became a toddler, the phrasing of these assessments became less appropriate and may have lacked meaning for parents, or limited the opportunity to report on other factors that more accurately reflect the constructs of interest during the toddler phase. There is a need for the development and testing of context-sensitive assessment measures that allow for longitudinal assessment of both parent and child outcomes in intervention research.
Finally, in relation to external validity, the couples who self-selected to participate in this study were all mother and father dyads living in Scotland, the majority of whom had low levels of psychopathology. This means no conclusions can be drawn about the efficacy of the intervention in the context of different types of parent groups; for example, single parents and same-sex couples, parents with higher levels of psychopathology, and parents from other cultural contexts (e.g., where antenatal care is not free at the point of use). Understanding if there is a difference in the type of support needed for other groups of parents who are transitioning to parenthood is an important consideration for future research.

Conclusions
The successful development and implementation of early interventions that support parents to create an optimal environment in which children can thrive continues to be of critical importance to policy makers (World Health Organization, 2018). An intervention that can effect positive change in key mental health variables, such as maternal depression and paternal anxiety, through joint delivery to couples transitioning to parenthood represents a valuable addition to perinatal care. While presenting caveats associated with sensitivity analysis, this trial offers some positive results in support of Baby Triple P as an early intervention. Of particular importance is the acceptability of the intervention for both mothers and their partners, which stands in clear contrast to views expressed about standard antenatal care. To support future development, research is needed to further explore the efficacy of BTP in more diverse populations, especially those at higher risk of mental health decline during pregnancy. In addition, there is much need for consideration of the outcome measures available to track change across life stages, which might allow for robust longitudinal research that explores the impact of early intervention on longer term outcomes.
members of the research team, Dr Elizabeth McGee and Ms Tania Loureiro. In addition, we acknowledge the clinical staff who supported recruitment of participants and the delivery of the intervention.
Funding This study was funded by NHS Greater Glasgow and Clyde.

Compliance with ethical standards
Conflict of interest The Parenting and Family Support Center is partly funded by royalties stemming from published resources of the Triple P – Positive Parenting Program, which is developed and owned by The University of Queensland (UQ). Royalties are also distributed to the Faculty of Health and Behavioral Sciences at UQ and contributory authors of published Triple P resources. Triple P International (TPI) Pty Ltd is a private company licensed by Uniquest Pty Ltd on behalf of UQ, to publish and disseminate Triple P worldwide. The authors of this report have no share or ownership of TPI. A.M. receives royalties from TPI. TPI had no involvement in the study design, collection, analysis or interpretation of data, or writing of this report. A.M. is an employee at UQ. All other authors have no competing interests.
Ethical approval Ethical approval for the study was obtained from NHS HRA West of Scotland Research Ethics Committee 5 (11/AL/0056) and Glasgow Caledonian University's Department of Psychology Ethics Committee. The procedures used in this study adhere to the tenets of the Declaration of Helsinki.
Informed consent Informed consent was obtained from all participants in this study.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.