Social deficits (e.g., difficulty initiating and maintaining social interactions, decreased verbal and nonverbal communication, limited perspective-taking) are core to autism spectrum disorder (ASD; American Psychiatric Association [APA], 2013; Baron-Cohen & Wheelwright, 2004). These difficulties are often present early in development (Paul, 2003) and may worsen as the child matures (Mundy, 2016; Rao et al., 2008). Caregiver-mediated interventions that target social skills may be particularly impactful, as incorporating caregivers can increase generalizability, maintenance of skills, and individual and caregiver/family outcomes (Factor et al., 2019; Klinger et al., 2022; Pacia et al., 2022; Trembath et al., 2019). While there has been a rapid increase in caregiver-mediated interventions for children with ASD, little research has investigated the impact of these interventions on the caregiver–child relationship or on caregiver outcomes (Reichow et al., 2012). Thus, the present study aimed to examine the impact of a caregiver-mediated social skills group intervention for children with ASD aged 4–7 years, the Program for the Education and Enrichment of Relational Skills (PEERS®) for Preschoolers (P4P; Factor et al., 2022a; Park et al., 2022), on caregiver–child relationships, caregiver confidence, parenting styles, and family functioning.

Early Social Skills Interventions

Early intervention can decrease ASD symptomology, although few evidence-based interventions explicitly address social skills broadly and as a primary intervention target in young children with ASD (Social Skills Group Intervention-High Functioning Autism or S.S.GRIN-HFA in DeRosier et al., 2011; Unnamed intervention in Kroeger et al., 2007). One review of social skills interventions found only 2 out of 48 studies included participants younger than 6 years (Kaat & Lecavalier, 2014). For example, the Superheroes Social Skills Program (Jenson et al., 2011) is a manualized intervention that incorporates didactic training and behavioral rehearsal and has been found to improve social skills in preschoolers with ASD (Radley et al., 2015). Video modeling (Murdock et al., 2013) has also been found to effectively teach social skills to children with ASD with various levels of cognitive and language ability. Otherwise, much of the research on social skills interventions for preschoolers with ASD consists of single-subject case studies that are not manualized (Reichow & Volkmar, 2010). Despite the importance of early interventions (Watkins et al., 2017), few comprehensive social skills programs exist for young children with ASD (DeRosier et al., 2011; Reichow & Volkmar, 2010; Tripathi et al., 2022; Wolstencroft et al., 2018). There are also interventions that include skill-building in social communication domains (e.g., language, play skills, joint attention, imitation, requesting, inclusive learning environments) within a broader curriculum, and thereby may include social skills as secondary intervention targets (e.g., Early Start Denver Model; Rogers & Dawson, 2020; LEAP; Boyd et al., 2014; Project ImPACT; Stahmer et al., 2020; JASPER; Shire et al., 2019). However, manualized interventions that focus on social skills as a primary target for preschoolers with ASD are important, given the strong impact of early intervention.

Caregiver Involvement in Interventions

There has been a movement toward a more family-focused model of intervention for children with ASD (Dixon et al., 2004; Thompson et al., 1997) as primary caregivers (e.g., parents, grandparents) are lifelong models for social learning, and therefore uniquely positioned to influence a child’s development (Tomasello, 2001). Caregiver-mediated interventions teach caregivers how to employ intervention strategies to enhance the maintenance and generalization of child improvements, while reducing the time and resources needed for the intervention (Bearss et al., 2013; Gantman et al., 2012; Laugeson et al., 2009; 2012). Further, caregiver involvement is also expected to have benefits for the caregiver including improved responsiveness, mental and physical health, and caregiver self-efficacy, as well as decreased stress and depression (McConachie & Diggle, 2007; Roberts & Pickering, 2010; Sofronoff & Farbotko, 2002; Solomon et al., 2004; Whittingham et al., 2009). Caregiver involvement in intervention is an especially salient area to examine, as research has indicated a link between caregiver behavior, confidence, and child outcomes (Brookman-Frazee & Koegel, 2004; Factor et al., 2019; Osborne et al., 2008). Wan et al. (2013) found that qualities of caregiver–child interaction (e.g., more directive, lower ratings of dyadic mutuality, and intensity of engagement) in infants at-risk for ASD were associated with ASD outcomes at 3 years. Similarly, caregiver self-efficacy (i.e., confidence) has been shown to relate to both child and caregiver intervention outcomes (Meirsschaut et al., 2010), highlighting the importance of implementing interventions that include a family-centered approach, to enhance caregiver confidence in skills, which then transfer to child improvements (Trivette et al., 2010; Wainer et al., 2017). Information regarding how to structure caregiver involvement is essential in the creation of the most beneficial interventions, for both caregiver and child.

Despite research suggesting that including caregivers in interventions enhances child outcomes (Bearss et al., 2015; DeRosier et al., 2011), only one unpublished social skills intervention for young children with ASD actively integrates caregivers (Reichow et al., 2012). The Hanen Centre’s TalkAbility™ program is a parent-mediated social skills intervention for this population; however, further programmatic research is necessary to examine the curriculum, the caregiver–child-based component, and the group setting (The Hanen Centre, 2016). The Program for the Education and Enrichment of Relational Skills (PEERS®) for Preschoolers (P4P) is a manualized intervention with a growing evidence base that has indicated positive child outcomes (e.g., increased social skills, reduction in ASD symptoms), but its impact on the family has not yet been explored (BLINDED FOR REVIEW).

Family Functioning in ASD

Interventions that champion the caregiver–child relationship may ameliorate specific features of ASD and therefore, improve family dynamics. Caregivers of children with ASD reported lower marital happiness, family cohesion, and family adaptability than caregivers of typically developing (TD) children (Higgins et al., 2005). These modifiable family characteristics, such as adaptability to stressors and family conflict, in turn, may partially predict the child’s ASD symptomatology and other behavior difficulties, such that improvements in these family behaviors have been suggested to alleviate distress related to disruptions in routine, sensory sensitivities, and challenging behaviors (Baker et al., 2011; Kelly et al., 2008). Similarly, chaos (lack of order in the family system) can lead to a greater risk of conduct and emotional problems in children with ASD (Midouhas et al., 2013; Osborne et al., 2008; Sivberg, 2002) and decreased family quality of life (Mugno et al., 2007; Sivberg, 2002). Conflict and chaos may also lead to more punishment and arguments with the child, preventing engagement in enjoyable activities and opportunities for modeling positive social interactions (Lam et al., 2010). In sum, family functioning and relationships involve social reciprocity, a core ASD difficulty (APA, 2013). Thus, ignoring the caregiver component can have deleterious effects (e.g., poor mental health) for the caregiver, which in turn may negatively impact the child, and, transactionally impact family functioning (Gulsrud et al., 2010). Focusing on caregiver mental health and family dynamics may indeed act as a protective factor against stress.

Despite the increase in caregiver-administered interventions and the need for interventions that can improve family functioning, the impact on family functioning and caregiver–child relationships has not been examined in detail in interventions for families of children with ASD (Lord & Bishop, 2010). One study that implemented PEERS® to target social skills did find reduced family chaos, even though family functioning was not specifically targeted in this intervention for adolescents (Karst et al., 2015). Another study found that maternal home-based involvement in the intervention of their children with ASD was linked to decreased psychological distress and increased parenting efficacy (confidence) and family cohesion (Benson, 2015). Similarly, these caregiver and family characteristics were not targeted in this intervention but rather were secondary outcomes. However, most intervention studies focus only on the child with ASD and do not provide a comprehensive picture of the family impact (Karst & Van Hecke, 2012).

Given these limitations, there may be a need for a theory-driven examination to guide the exploration of caregivers in the context of interventions (Klinger et al., 2021; Vivanti et al., 2014). One theory, family systems theory, emphasizes the reciprocal influences of family members on each other (Cox & Paley, 1997). Within that framework, studying families with a child with ASD seems essential, especially since family functioning involves social reciprocity, a core deficit of ASD (APA, 2013), and therefore, family reciprocal relationships and functioning are potentially already different from those of families of TD children. Thus, considering all aspects of an individual’s broader family environment and relationships is necessary to achieve the most beneficial intervention outcomes.

The PEERS® for Preschoolers Program

There are few social skills groups for young children with ASD that focus on social skills as a primary intervention target, though many include social skills as secondary intervention targets (Kaat & Lecavalier, 2014). Focusing on social skills early may lead to enhanced short- and long-term outcomes (Watkins et al., 2017). Given the lack of empirically supported social skills interventions for younger children on the autism spectrum, PEERS®, an evidence-based, ecologically valid caregiver-assisted social skills intervention for adolescents and young adults with ASD (Laugeson et al., 2009; 2012), was modified for this population. P4P highlights the same tenets as the other PEERS® programs in a developmentally appropriate manner (Factor et al., 2022a, b; Park et al., 2022). In addition to the PEERS® separate and simultaneous child and caregiver groups, P4P has an added caregiver-coached play component at the end of each session. This allows caregivers to be coached by a clinician in intervention skills. An initial randomized controlled trial (RCT) indicated P4P benefits (e.g., increased social skills, reduction in problem behaviors; Laugeson et al., 2016), but did not examine caregiver or family outcomes. To date, there is no research on caregiver and family functioning in the context of this social skills intervention. This pilot study builds on initial findings and demonstrates the feasibility of this intervention.

Aims and Hypotheses

The present pilot study aimed to preliminarily examine caregiver–child relationships, caregiver confidence, parenting styles, and family functioning in the context of P4P, as a first step for larger RCTs to add to this gap in intervention work. Based on extant literature on interventions involving caregivers for young children with ASD, hypotheses for a 16-session social skills program are:

  1. Caregivers would (a) increase confidence/self-efficacy managing their child’s social interactions measured by the Parental Self-Efficacy in the Management of Asperger Syndrome (PSEMAS); (b) improve overall parenting styles, specifically in laxness, overreactivity, and verbosity measured by the Parenting Scale (PS), and (c) caregiver–child interactions would increase in responsiveness, affect, achievement, and directiveness measured by the Maternal Behavioral Rating Scale (MBRS) from entry/pre-intervention to exit/post-intervention and be maintained at follow-up.

  2. Family dysfunction (chaos) would decrease from entry/pre-intervention to exit/post-intervention and be maintained at follow-up, measured by the Confusion, Hubbub, and Order Scale (CHAOS).

Method

Participants

Children aged 4–7 years (M = 4.87, SD = 1.25) diagnosed with ASD without intellectual impairment and their caregivers (27–42 years, M = 36.13, SD = 5.14) were recruited. One individual was 3 years old at intake, but 4 when groups began. For this pilot study, 15 caregiver–child dyads (11 males, 73.3%; 4 females, 26.7%) participated in four separate groups. To be eligible, children were required to have a previous ASD diagnosis, verified by meeting the cutoff on the Autism Diagnostic Observation Schedule, Second Edition (ADOS-2). Children also had to be toilet trained and able to tolerate a group setting, play preschool games, and sing songs, and children and caregivers were required to be fluent in English. Children were required to be verbally fluent (speaking in 4–5-word phrases) and meet specific cutoffs on a standard cognitive measure (more information below). Exclusion criteria included an active medical problem (e.g., unstable seizure disorder), severe mental health problems (e.g., psychosis), physical aggression, or an unstable medication regimen. Participants were recruited via local clinics, registries, support groups, or schools. Interested caregivers completed a phone screen to assess eligibility. Caregiver–child dyads who appeared appropriate were scheduled for an eligibility appointment (see Fig. 1). Participants were also reported to have co-morbid diagnoses in addition to ASD, including attention deficit/hyperactivity disorder (n = 6, 40%), generalized anxiety disorder (n = 3, 20%), obsessive compulsive disorder (n = 3, 20%), and developmental disability (n = 1, 6.7%). Ethnicity of the children included White (n = 10, 66.7%), Black (n = 2, 13.3%), Asian (n = 1, 6.7%), Mixed Race (n = 1, 6.7%), and Other (n = 1, 6.7%). Other sample demographics are presented in Table 1.

Fig. 1 CONSORT diagram for participant flow

Table 1 Descriptive statistics for categorical variables of interest

At eligibility sessions, consent and assent were obtained from caregivers and children. Assessments included the ADOS-2, Kaufman Brief Intelligence Test (KBIT-2), a 5-min interaction task between the caregiver and child (more details below), and caregiver completion of entry/pre-intervention forms. Participants were considered intervention completers if they attended more than 60% of sessions.

ADOS-2 severity scores (comparison score M = 6.80, SD = 2.01) and KBIT-2 IQ Composite scores (M = 102.00, SD = 15.34) fell in the average range. Of the target caregivers (i.e., those who completed interaction tasks and measures), 93.3% were mothers. One father completed the interaction task at the follow-up visit only, and some families had other caregivers attend sessions in addition to the target caregiver. Although all caregivers interested in participating were included, one caregiver was identified as the primary caregiver, and only this target caregiver completed measures and the interaction task at each timepoint, to keep the data consistent. Almost all target caregivers identified as female (n = 14, 93.3%); one target caregiver identified as male (n = 1, 6.7%).

Data were collected over four timepoints (i.e., entry/pre-intervention, midpoint (Session 8), exit/post-intervention, and a 4–6-week follow-up post-intervention) and were analyzed and collapsed over all groups.

Procedures

Intervention

Sessions followed the unpublished P4P manual, made available from the UCLA PEERS® Clinic, which includes a script for each child and caregiver session, and largely follows the format of other PEERS® interventions. Groups consisted of 16 1.5-h sessions that met twice per week over the course of eight weeks. Each group consisted of 2–5 children (Group 1 = 2 children (13.3%), Group 2 = 4 children (26.7%), Group 3 = 4 children (26.7%), Group 4 = 5 children (33.3%)) with 4–7 group assistants and leaders. Training for clinicians included a 1-day intensive training and receipt of manuals. Leaders included graduate students, master’s students, and students with their bachelor’s degrees. All but one clinician met training requirements and were deemed ready to lead groups (only one clinician was asked to cease leading groups, due to inadequate fidelity) and fidelity of administration was assessed at each session (see below for more information regarding fidelity). Groups were supervised by an advanced graduate student clinician and licensed clinical psychologist who were both PEERS® Certified Providers. A 45-min case conference was held before each group to review any clinical concerns as well as review the lesson content.

Sessions included a didactic lesson with concrete, ecologically valid, and developmentally appropriate social skill rules and steps, role-play demonstrations, behavioral rehearsals, and weekly socialization assignments, presented in a developmentally appropriate manner. Lesson targets included listening to and following directions, greeting friends, sharing and giving turns, keeping cool when upset and being flexible during play, being a good sport, asking friends to play, and maintaining appropriate body boundaries. Children were taught skills through a live puppet show and activities designed to reinforce skill development naturalistically. Simultaneously, caregivers engaged in an hour-long group, in which they learned skills for helping their children make and keep friends. Caregiver sessions also included discussion of joining playgroups, arranging and preparing for playdates, and caregivers discussed and received feedback on socialization homework assignments, often bringing up both successes and difficulties. The caregiver sessions also had a brief overview of the same lessons listed above that the children learned, so that they were apprised of the content their children learned each week and this dictated what skill (or skills) they focused on each week (e.g., listening and following directions, keeping cool, etc.).

The last 30 min of each 90-min session were devoted to caregiver-coached play, which consisted of in vivo feedback from the intervention team, while caregivers coached their children during play with other group members (e.g., in-group playdates) on targeted skills. Children were paired with different playmates, just as caregivers were paired with different group leaders. Caregivers also had handouts that highlighted key terms and skills focused on each week, to help guide them in their social coaching. Caregiver social coaching, which was reviewed in the caregiver session, highlighted the P4P technique known as the 4Ps: priming (a form of cognitive rehearsal to prepare a child to practice skills immediately before a social opportunity by reviewing rules and steps), prompting (gentle reminder to use a particular skill in the moment through buzzwords and other key terms), praising (complimenting a child when they used or attempted to use a skill), and providing corrective feedback in the form of a praise sandwich (start with praise using buzzwords, then feedback using buzzwords, end with general statement of praise). Caregivers were encouraged to utilize these strategies while interacting with their children.

Materials and Design

Randomization

A non-concurrent multiple baseline design (i.e., series of A–B replications) was employed to allow for rolling enrollment and a smaller sample without a control group (Horner et al., 2005; Morgan & Morgan, 2008). Single-case designs are less time intensive and more cost-effective than large-scale RCTs and therefore more feasible in the early stages of intervention development (Horner et al., 2005; Morgan & Morgan, 2008). This design and sample size were consistent with previous intervention studies for children with ASD and deemed appropriate for this study (Rao et al., 2008; Wan et al., 2013). Each group was randomized to a baseline condition. However, analyses presented in the current manuscript examine ratings administered during entry/pre-intervention, midpoint, exit/post-intervention, and at follow-up, rather than assessments completed each session and during the baseline period. Thus, these results are exploratory and we note the preliminary, yet important nature of the findings presented. This design was considered suitable for a pilot study without a comparison group. Administration of assessments did not occur directly following intervention groups. Approval for this research design was granted by the Institutional Review Board of participating institutions and was carried out in accordance with the Declaration of Helsinki. Families did not receive compensation.

Diagnostic and Screening Measures (To Determine Eligibility)

Autism Diagnostic Observation Schedule, Second Edition (ADOS-2; Lord et al., 2012)

The ADOS-2 is a semi-structured, observational assessment of ASD characteristics. The ADOS-2 consists of multiple modules, determined by age and language ability. For this study, Modules 2 (little or phrase speech; 5 individuals) and 3 (fluent speech; 10 individuals) were employed. The ADOS-2 demonstrates moderate to high levels of internal consistency, moderate test-retest reliability, and acceptable interrater reliability (McCrimmon & Rostad, 2014). This assessment was administered at entry/pre-intervention to verify each child met ASD criteria for the current study.

Kaufman Brief Intelligence Test (KBIT-2; Kaufman & Kaufman, 2004)

The KBIT-2 is an abbreviated measure of general intelligence. The KBIT-2 provides Verbal and Non-Verbal Intelligence scores, as well as a composite Intelligence Quotient (IQ) score and percentile ranks by age. Children needed to meet certain cutoffs on both domains, to demonstrate their adequate fit for the program, as it not only moves at a fast pace, but requires verbal fluency in thinking about social skills and social communication. The KBIT-2’s IQ Composite internal consistency coefficient was 0.93 across ages (0.89–0.96). This assessment was administered at entry/pre-intervention to verify that each child met inclusion criteria across both domains.

Demographic Questionnaire

This questionnaire includes general information regarding caregiver education, family history, composition, and child developmental and medical history.

Primary Outcome Measures for Exploratory Hypothesis Testing

Parental Self-Efficacy in the Management of Asperger Syndrome (PSEMAS; Sofronoff & Farbotko, 2002)

The PSEMAS is a 15-item questionnaire developed to assess parental self-efficacy (i.e., parent self-confidence) with children with Asperger Syndrome over the course of a specific intervention. This questionnaire assesses child behaviors and the extent to which caregivers feel they can handle them. The total self-efficacy score is determined by the total confidence score divided by the total number of behaviors that occur. This measure was administered at entry/pre-intervention, midpoint, exit/post-intervention, and follow-up.
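The scoring rule described above can be illustrated with a minimal sketch. It assumes that each item is recorded as a pair (whether the behavior occurs, the caregiver's confidence rating) and that only ratings for occurring behaviors enter the total; the function name, the data layout, and the example ratings are hypothetical and are not taken from the PSEMAS materials.

```python
from typing import Optional, Sequence, Tuple


def psemas_total_self_efficacy(
    items: Sequence[Tuple[bool, float]]
) -> Optional[float]:
    """Total confidence divided by the number of behaviors that occur.

    items: (behavior_occurs, confidence_rating) pairs, one per item.
    Assumption: confidence ratings count only for behaviors that occur.
    Returns None when no behaviors occur, since the ratio is undefined.
    """
    ratings = [confidence for occurs, confidence in items if occurs]
    if not ratings:
        return None
    return sum(ratings) / len(ratings)


# Hypothetical responses: three behaviors occur, one does not.
responses = [(True, 4), (False, 0), (True, 2), (True, 3)]
score = psemas_total_self_efficacy(responses)  # (4 + 2 + 3) / 3
```

Dividing by the number of occurring behaviors, rather than by all 15 items, keeps the score comparable across caregivers whose children show different numbers of the listed behaviors.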

The Maternal Behavioral Rating Scale (MBRS; Mahoney et al., 1986)

The MBRS is a 12-item observational measure that assesses four dimensions of parenting: responsiveness (RCO; 3 questions; responsivity, sensitivity, effectiveness engaging child in play), affect/animation (AA; 5 questions; acceptance, enjoyment, expressiveness, inventiveness, warmth), achievement orientation (AO; 2 questions; focus on child’s development, praise), and directiveness (DR; 2 questions; how much caregiver directs child or follows their lead, pace). Trained clinicians rate items on a 5-point Likert scale based on the first 5 min of the interaction, with higher codes usually indicating more positive parenting styles. The caregiver and child were presented with age-appropriate toys and instructed to “play as you usually do at home.” This was administered at entry/pre-intervention, midpoint, exit/post-intervention, and follow-up. The measure has indicated sensitivity to intervention changes (Mahoney et al., 1996).

Two research assistants (RAs) coded interaction tasks using the codes developed by Mahoney et al. (1986). RAs coded interactions from a previous study to achieve 85% absolute reliability before coding current study videos. Each RA coded two-thirds of videos (i.e., RA1 coded 20 videos, RA2 coded 20 videos) and 19 were double coded (ICC = 0.90). Scores from the overlapping videos were averaged for final codes.

Parenting Scale (PS; Arnold et al., 1993)

The PS is a 30-item measure of parenting style that indicates a total score of parenting style based on three styles: laxness (permissive, inconsistent), overreactivity (harsh, authoritarian, irritability, anger), and verbosity (overreliance on talking). Caregivers respond on a 7-point Likert scale. This was measured during the four assessment periods. The total score was primarily examined, and subscales were examined for exploratory analyses. Test-retest reliability has been proven adequate (Prinzie et al., 2007). In the current study, Cronbach’s alphas for the total score were 0.87 at entry/pre-treatment, 0.91 at midpoint, 0.90 at exit/post-treatment, and 0.77 at follow-up.

Confusion, Hubbub, and Order Scale (CHAOS; Matheny et al., 1995)

CHAOS is a 15-item, caregiver-report measure assessing environmental confusion in the home. Items are presented on a 6-point Likert scale from “Strongly Agree” to “Strongly Disagree,” with higher scores indicating greater family chaos. This was measured during the main assessment periods. Test-retest reliability has been shown to be satisfactory (Matheny et al., 1995). In the current study, Cronbach’s alphas were 0.52 at entry, −1.91 at midpoint, 0.44 at exit/post-treatment, and 0.44 at follow-up. The negative value at midpoint is likely due to a negative average covariance among items.
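To illustrate how Cronbach’s alpha can fall below zero, the following minimal sketch computes alpha from the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the toy response matrices are hypothetical and are not study data; they only show that items which covary negatively on average yield a negative alpha.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix.

    Uses sample variances (n - 1 denominator) throughout.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item (column) across respondents.
    item_vars: List[float] = [
        variance([row[j] for row in responses]) for j in range(n_items)
    ]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: items move together, so alpha is high.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

# Hypothetical data: the second item runs opposite to the first,
# giving a negative average inter-item covariance and a negative alpha.
inconsistent = [[1, 4, 2], [2, 3, 1], [3, 2, 4], [4, 1, 3]]

alpha_high = cronbach_alpha(consistent)
alpha_negative = cronbach_alpha(inconsistent)
```

When the summed item variances exceed the variance of the total score, the term in parentheses becomes negative, and alpha is not bounded below by −1.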

Fidelity of Implementation

Both caregiver and child groups were rated by a live observer to assess fidelity of intervention implementation each session (i.e., completion of session goals, therapist behavior, therapeutic relationship). Raters were trained on completion of fidelity forms, as well as the intervention, and observed the entirety of each session. Items were rated on a 0–5-point Likert scale, measuring the success of implementation in each session (5 being the highest score of implementation). Raters followed the same training procedures as group leaders (e.g., comprehensive one-day training on P4P intervention procedures) and were either graduate students, master’s students, or students with their bachelor’s degrees. They remained in each group through the entirety of each session. Groups did not vary in fidelity (the last session was excluded as it was graduation and a party in addition to review of material covered).

Analytic Plan

Data were first analyzed to determine whether necessary assumptions of normality, linearity, and homoscedasticity were met before proceeding. Next, descriptive statistics including the means, standard deviations, and ranges were determined for variables of interest (see Table 1).

Due to the non-normal distribution of data, nonparametric tests were used. Friedman tests were employed, followed by post-hoc Wilcoxon tests for pre–post comparisons, as well as an examination of follow-up data. Effect sizes (r values) were interpreted as follows: 0.5 = large effect, 0.3 = medium effect, 0.1 = small effect (Fritz et al., 2012).
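As an illustration of the quantities named above, the following minimal sketch computes a Friedman chi-square statistic and converts a Wilcoxon Z value to the effect size r = |Z| / √N, the conversion described by Fritz et al. (2012). It is a sketch under stated assumptions rather than the analysis code used in this study: the function names are hypothetical, the Friedman statistic omits the correction for tied ranks that statistical packages typically apply, N is assumed to be the number of dyads in the comparison, and the "negligible" label for r below 0.1 is an added convention.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..k, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def friedman_statistic(rows: Sequence[Sequence[float]]) -> float:
    """Friedman chi-square (df = k - 1), without tie correction.

    rows: one row per participant, one column per timepoint.
    """
    n = len(rows)
    if n == 0:
        raise ValueError("no participants")
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        if len(row) != k:
            raise ValueError("all rows must have the same length")
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


def effect_size_r(z: float, n: int) -> float:
    """Effect size r = |Z| / sqrt(N) for a Wilcoxon test."""
    return abs(z) / math.sqrt(n)


def label_effect(r: float) -> str:
    """Map r onto the benchmarks 0.1 / 0.3 / 0.5."""
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"  # assumed label, not from Fritz et al.


# Hypothetical example: Z = -2.00 with 15 dyads.
r_example = effect_size_r(-2.00, 15)
```

Under these assumptions, a Z of −2.00 with 15 dyads gives r of approximately 0.52, which falls in the large range.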

Single-Subject Analyses

A reliable change index (RCI) was calculated to determine change in social skills, caregiver confidence and knowledge, and family functioning relative to measurement error for each individual (Jacobson & Truax, 1991). RCIs determined the magnitude of change needed to show meaningful change above and beyond standard error. RCIs were calculated by dividing the difference in scores between two timepoints (i.e., either entry/pre-intervention and exit/post-intervention or entry/pre-intervention and follow-up) by the standard error of the difference (Sdiff), which incorporates the test-retest reliability and standard deviation of the original measure. RCI values exceeding 1.96 in magnitude are suggested to indicate statistically significant and meaningful change. The test-retest reliabilities and standard deviations used to compute the Sdiff score were obtained from the literature. If test-retest reliability was not previously reported in the literature, then Cronbach’s alpha from the literature was used.
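The RCI calculation can be sketched as follows, using the Jacobson and Truax (1991) formulation in which the standard error of measurement is SE = SD × √(1 − r) and Sdiff = √(2 × SE²). The function names and the example values (SD = 10, reliability = 0.80, scores of 50 and 35) are hypothetical and are not taken from the study data.

```python
import math


def standard_error_of_difference(sd: float, reliability: float) -> float:
    """Sdiff = sqrt(2 * SE^2), where SE = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    se = sd * math.sqrt(1.0 - reliability)
    return math.sqrt(2.0 * se ** 2)


def reliable_change_index(
    score_1: float, score_2: float, sd: float, reliability: float
) -> float:
    """RCI = (later score - earlier score) / Sdiff."""
    return (score_2 - score_1) / standard_error_of_difference(sd, reliability)


def is_reliable_change(rci: float, threshold: float = 1.96) -> bool:
    """Change is treated as reliable when |RCI| exceeds the threshold."""
    return abs(rci) > threshold


# Hypothetical participant: score drops from 50 to 35 on a measure
# with normative SD = 10 and test-retest reliability = 0.80.
rci = reliable_change_index(50, 35, sd=10, reliability=0.80)
reliable = is_reliable_change(rci)
```

In this hypothetical case Sdiff is about 6.32, so a 15-point drop gives an RCI of about −2.37; on a measure where lower scores are better, the negative sign denotes improvement.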

Results

Preliminary Analyses

No differences were indicated in group demographics and thus covariates were not included (see Table 1 for complete demographics).

Intervention Efficacy

Therapist fidelity of intervention implementation

Groups did not vary in fidelity. All sessions completed 90–100% of outlined components (Mchild group = 99.37, SDchild group = 2.06; Mcaregiver group = 99.63, SDcaregiver group = 1.84), other than one group session where 75% of outlined components were completed due to a late start for one session as a result of weather complications. This one session was identified as an outlier and not included in analyses in calculating the means and standard deviations presented above. Raters noted success of implementation of specific session content across all groups (Likert scale from 0 to 5; Mchild group = 4.89, SDchild group = 0.20; Mcaregiver group = 4.92, SDcaregiver group = 0.18).

Exploratory Intervention Outcomes

Caregiver confidence and self-efficacy

Mean scores at each timepoint are indicated below. Changes in total self-efficacy (PSEMAS) were not significant across all timepoints (χ²(3) = 7.58, p = 0.055; Table 2), though differences were significant between entry/pre-intervention and midpoint (Z = −2.48, p = 0.013, r = 0.029; Table 3) and between entry/pre-intervention and follow-up (Z = −2.23, p = 0.026, r = 0.23; Table 3). These are small and medium effect sizes, respectively.

Table 2 Comparison of variables of interest across timepoints
Table 3 Statistics for all variables of interest across all timepoints

Caregiver behavior

Significant changes across all four timepoints were indicated on the MBRS AO code (χ²(3) = 7.97, p = 0.047; Table 2). Further analyses revealed differences on AO from entry/pre-intervention to both midpoint (Z = −2.00, p = 0.046, r = 0.52; Table 3) and follow-up (Z = −2.39, p = 0.017, r = 0.62; Table 3) and on AA from midpoint to exit/post-intervention (Z = −2.501, p = 0.012, r = 0.67; Table 3). These all indicate large effect sizes.

Differences on the PS total score across all four timepoints were not significant (Table 3). For exploratory analyses, the three subscales were examined. Only the overreactivity scale showed significant change across all four timepoints (χ²(3) = 11.8, p = 0.008; Table 2). Wilcoxon tests indicated that change in the PS total score was significant from entry/pre-intervention (Table 3) to both midpoint (Z = −2.25, p = 0.024, r = 0.75) and follow-up (Z = −2.045, p = 0.041, r = 0.62), change in laxness was significant from entry/pre-intervention to midpoint (Z = −2.016, p = 0.044, r = 0.67), and change in overreactivity was significant from entry/pre-intervention to both midpoint (Z = −2.20, p = 0.028, r = 0.73) and exit/post-intervention (Z = −2.39, p = 0.017, r = 0.84). These were all large effect sizes. Change in verbosity was not significant.

Family functioning

No significant change in CHAOS scores was found across all timepoints (χ²(3) = 2.02, p = 0.57; Table 2) or between specific timepoints (Table 3).

Individual Outcomes

RCI determined meaningful change above and beyond standard error (Table 4). Negative scores indicate improvement (e.g., less chaos). This analysis adds another measure of change in single-subject design and adds to the rigor of statistical outcomes, as group-level analyses may obscure some individual changes. Further, this measure is essential when examining individual change in small-sample pilot intervention studies.

Table 4 Reliable change index for each participant

On the SRS-2, while no individuals showed significant reductions in the total score from pre- On the PSEMAS total self-efficacy score, more families reported clinically significant change at follow-up (50% or 7 out of 14), compared to exit/post-intervention (30% or 3 out of 10).

Fewer parents showed or reported clinically significant change in maternal behavior or family functioning, with the most change observed in directiveness. Specifically, on the MBRS, 21.43% of caregivers significantly improved on the RCO scale from entry/pre-intervention to exit/post-intervention (3 out of 14) and 13.33% significantly improved at follow-up (2 out of 15 caregivers). On the AA scale, 7.14% improved at exit/post-intervention (1 out of 14 caregivers) and 6.67% significantly improved at follow-up (1 out of 15 caregivers). On the AO scale, 21.43% of caregivers significantly improved from entry/pre-intervention to exit/post-intervention (3 out of 14), while 40% of caregivers significantly improved at follow-up (6 out of 15). On the DR scale, 35.71% significantly improved from entry/pre-intervention to exit/post-intervention (5 out of 14 caregivers) and 33.33% at follow-up (5 out of 15 caregivers). Similarly, on the self-report PS total scale, 22.22% of caregivers improved from entry/pre-intervention to exit/post-intervention (2 out of 9) and 9.09% significantly improved from entry/pre-intervention to follow-up (1 out of 11 caregivers). Finally, on the CHAOS scale, 11.11% of caregivers indicated significant improvements to family functioning from entry/pre-intervention to exit/post-intervention (1 out of 9), while 18.18% of caregivers indicated family functioning significantly improved from entry/pre-intervention to follow-up (2 out of 11).

Discussion and Implications

The current pilot study examined improvements over the course of a social skills intervention for young children with ASD and at follow-up regarding caregiver–child relationships, caregiver confidence/self-efficacy, parenting style, and family functioning. These preliminary, exploratory results support the hypotheses of improvements in caregiver efficacy, some parenting style components, and caregiver–child dynamics.

Significant improvements in the observed caregiver–child relationship and caregiver-reported self-efficacy were found at exit/post-intervention and follow-up. This may indicate that while the P4P program trained caregivers to serve as social coaches in play settings, the intervention may have a more widespread impact on some caregivers’ confidence in parenting their children (Karst & Van Hecke, 2012). Caregiver self-efficacy, which predicts caregiver characteristics and behaviors, including competence and mental health (Johnston & Mash, 1989; Jones & Prinz, 2005), is an important concept for caregivers of children on the autism spectrum. Most children with ASD are diagnosed as preschoolers, but symptoms often emerge earlier (CDC, 2014), so caregivers may feel ineffective until they receive a diagnosis or learn autism-specific techniques (Karst & Van Hecke, 2012). Given the bidirectional relationship between child and caregiver, in conjunction with the increased demand for involving caregivers in intervention, it is promising that P4P benefits the caregiver and caregiver–child relationship, though further research is needed to confirm these findings (Granger et al., 2012).

While overall parenting strategy did not change significantly at the group level, several caregivers showed improvements in parenting styles in this pilot study, particularly overreactivity and laxness, even though these were not specifically targeted. Caregivers may gain a sense of calm through in vivo coaching and practicing skills in P4P, which would decrease overreactivity. Additionally, caregivers may better understand targets and methods for coaching their child, increasing their consistency and therefore decreasing laxness. On the observational assessment, though not all domains were indicated to have improved, the Achievement Orientation (AO) and Animation/Affect (AA) scales suggested improvement from entry/pre-intervention to follow-up when data were aggregated and also when RCIs were calculated for individual caregivers. Specific improvements may relate to the P4P curriculum’s emphasis on praise and enjoyment during play, and these findings are consistent with previous systematic literature reviews, which found that inclusion of caregivers leads to positive relationship changes (Factor et al., 2019) and more positive caregiver–child interactional styles (Karst & Van Hecke, 2012). More caregivers may have reported improvements at follow-up, rather than post-intervention, as it takes time to become comfortable with techniques and they may see more success when they have fewer time demands from intervention participation (Bristol et al., 1993; Iadarola et al., 2018). Future research may identify which caregivers most benefit from these interventions. However, due to the preliminary nature of these findings, conclusions attributing these results to P4P must be tempered. Because of the sample size, specific caregiver characteristics were not examined; these can be explored in future studies.

Family functioning remained largely stable for most families over the course of intervention. There was some improvement in family functioning for two caregivers based on RCI scores at exit/post-intervention and follow-up. Overall, the current pilot study fails to support other research suggesting that caregiver training programs may lead to positive familial outcomes (Factor et al., 2019) and that changes in one relationship may impact larger family dynamics (Minuchin, 1985). Other family members may need to be involved in intervention (e.g., caregiver and sibling involvement; more in-home practice). In particular, the inclusion of fathers is likely important, as mothers and fathers respond differently to child behaviors and caregiver self-efficacy (Hastings & Brown, 2002).

The study is not without limitations. Specifically, more participants would allow for more power, which would expand the type of analyses appropriate to examine outcomes based on this intervention and may be important for studying RCIs, as values in the current small sample were low. Further, an RCT design could be possible with a larger sample size. This would allow for more conclusive findings and interpretations about the impact of P4P on caregiver and family outcomes. Another limitation was the primary reliance on caregiver-report (Whittingham et al., 2009). Observational measures of family functioning, particularly in naturalistic settings, may better elucidate intervention changes and may be more accurate in the home setting. Additionally, the use of four assessment timepoints may have impacted results, as repeated measurement increases familiarity with the assessments. For this reason, we also examined the follow-up measures and compared each timepoint. Finally, caregiver traits and other child behavior (e.g., problem behavior) were not explored. High levels of heritable traits (e.g., Broader Autism Phenotype (BAP); Bolton et al., 1994) often predict social and emotional challenges for both the family member who does not have an ASD diagnosis and for the child with ASD (Ingersoll & Hambrick, 2011; Maxwell et al., 2013). Further, other child behaviors may influence caregiver–child daily interactions. These factors could impact how both children and caregivers respond to intervention.

Future work should continue to address the experience of the caregiver and family unit within a social skills intervention for young children with ASD, given that interventions increasingly include caregivers as active intervention participants (Dixon et al., 2004; Thompson et al., 1997). Evaluating outcomes beyond those between the caregiver and child, even if other family members cannot participate, will help identify potential barriers to family involvement in interventions (Karst & Van Hecke, 2012). Results indicate further research is needed to design and administer the most successful interventions for individuals with ASD and their families. Exploration of caregiver traits, including BAP or stress, could help clinicians tailor intervention. Since caregivers of children with ASD experience higher levels of stress (Davis & Carter, 2008; Estes et al., 2013), it is essential to study mechanisms of change. Additionally, mothers with high rigidity on BAP measures may benefit from learning adaptive emotion regulation strategies and those with more pragmatic difficulties may need support showing positivity in interactions (Ingersoll & Hambrick, 2011; Rea et al., 2019). Therefore, BAP features may partially dictate the stress a caregiver experiences interacting with their child or how they may respond to social coaching. Functioning levels (e.g., lower IQs, less language), socioeconomic status (SES), race, and other factors should also be considered to determine whether intervention adaptations are needed.

In sum, this pilot study provides initial support for a caregiver-assisted social skills group for young children with ASD in improving caregiver confidence and caregiver–child interactions. Findings address a gap in the literature by demonstrating benefits for the caregiver and for caregiver–child interactions in response to caregiver-assisted social coaching for young children with ASD in the context of a social skills intervention. Further exploring findings from this pilot study will allow for a deeper understanding of the specific effects of caregiver-assisted social skills interventions on intervention implementation, caregiver confidence, parenting styles, and caregivers' relationships and interactions with their child and larger family.