Spanish Adaptation of the Parenting Practices Interview (PPI-25) for Families with Substantiated Reports or at Risk for Child Maltreatment

Parenting practices are a central focus of many family preventive and treatment programs due to their influence on children’s well-being. Reliable measures of parenting practices are relevant not only for research purposes, but also for assessment, selection of intervention goals, and evaluation of expected changes in clinical practice. However, measurement of parenting practices has been a challenge for researchers and practitioners. The Parenting Practices Interview (PPI) was developed to assess both positive and negative parenting dimensions and has been used in clinical contexts. The present study aimed to develop a Spanish adaptation of the PPI and to analyze its main psychometric properties. The sample consisted of 213 parents with substantiated reports or at risk for child maltreatment with significant problems in coping with their children’s behavioral problems, recruited from Child Welfare and Child Protection Services. Confirmatory factor analysis (CFA), measurement invariance (MI), convergent validity, and differences based on parents’ and children’s age and gender were analyzed. A four-factor model with 25 items (Appropriate Discipline, Verbal Praise and Incentives, Inconsistent Discipline, and Physical Punishment) met statistical requirements (RMSEA = 0.06, CFI = 0.92, TLI = 0.91) and showed adequate internal consistency and convergent validity. MI analyses supported comparisons across time and groups. Although more research is needed, the PPI-25’s psychometric properties are encouraging for its use with families with substantiated reports or at risk for child maltreatment in Spain.

• A brief Spanish adaptation of the Parenting Practices Interview with 25 items with both positive and negative parenting dimensions is proposed.
• Measurement Invariance analysis supports the use of the PPI-25 to compare parenting scores across time, and across mothers and fathers.
• Findings support the utility of the PPI-25 to measure parenting practices in Spanish families with substantiated reports or at risk for child maltreatment with children aged 4 to 9.
Despite substantial theory and research developed around parenting, it is challenging to find a clear definition (Hurley et al., 2014; Keijser et al., 2020; O'Connor, 2002). Usually, parenting has been conceptualized as a complex, multifaceted, and dynamic set of human activities (behaviors, cognitions, and emotions) that includes attitudes towards child rearing, parent-to-child nurturing behaviors, parenting strategies, and parenting skills and competences (Hurley et al., 2014; Lindhiem & Shaffer, 2017).
Two main perspectives have traditionally been adopted in the study of parenting. The first combines parental behaviors into styles, and usually includes four parenting styles described as authoritative, authoritarian, permissive, or disengaged (for more information see Baumrind, 1991). The second perspective focuses on specific dimensions of parental behavior (or parenting practices) and their association with child outcomes (O'Connor, 2002; Pinquart, 2017). Dimensions of parenting practices typically involve warmth/support, hostility/rejection, and control of children's behavior (O'Connor, 2002). These dimensions can be understood as positive or negative based on their effects on child development. For example, behavioral control is considered an indicator of positive parenting when it includes clear expectations or discipline appropriate to the child's age. However, it is considered an indicator of negative parenting when it includes harsh discipline, physical punishment, or intrusiveness (Parent & Forehand, 2017; Pinquart, 2017).
Parenting practices have been widely studied due to their direct and indirect influence on children's well-being. Effective parenting practices have been associated with fewer child behavior problems, improved social skills, better academic achievement, and better long-term personal and social adjustment (Lindhiem et al., 2019). Moreover, changes in parenting practices have been shown to affect child outcomes: increased effectiveness of parental skills is related to decreased child behavior problems, especially for families reporting higher levels of initial problems (Chamberlain et al., 2008).
Because parenting is a major determinant of child development and a relevant factor affecting many outcomes along the life course, it is usually a main target for preventive, early intervention, and treatment programs aimed at promoting child well-being and development (Sanders & Turner, 2018). This is the case, for example, of Behavioral Parent Training (BPT) programs, widely developed over the years, such as Triple P (Sanders et al., 2014) or The Incredible Years (Pidano & Allen, 2015). Such programs require valid and reliable measures of parenting practices to select areas of parenting in which intervention is needed, and to rigorously evaluate expected changes. However, measurement of parenting practices has been a challenge for researchers and practitioners. A recent review concluded that few measures have demonstrated adequate psychometric properties (Lindhiem et al., 2019). Hurley et al. (2014) carefully reviewed the psychometric properties of 164 measures of parenting skills and parental attitudes. Their findings showed that, although 25 measures provided some information, only 5 of them reported strong psychometric properties: Child Abuse Potential Inventory (Milner, 1986), Alabama Parenting Questionnaire (Shelton et al., 1996), Parenting Alliance Measure (Abidin & Konold, 1999), Parenting Scale (Arnold et al., 1993), and Parent Child Relationship Inventory (Gerard, 1994). Only two of these measures were specifically developed to assess parenting practices: the Alabama Parenting Questionnaire, which was designed for school-age children (6-18 years) and includes positive and negative dimensions of parenting practices, and the Parenting Scale, designed for toddlers and preschoolers (age 18 to 48 months) and focusing on dysfunctional parenting dimensions.
To date, two studies have been conducted to validate the Alabama Parenting Questionnaire (APQ) with Spanish samples. One study was conducted with the child self-report version with children aged 8 to 12 years (Escribano et al., 2013). A second study, by de la Osa et al. (2014), used 42 adapted items from the original parent self-report version with parents of 3-year-old children. No study has been conducted to validate the Parenting Scale in Spain.
Clearly, more studies are needed to provide validated measures of parenting practices for the Spanish population, particularly for assessment and intervention with families who have, or are at risk of, significant difficulties in the parent-child relationship. Moreover, both previously mentioned measures have limitations: the APQ is not applicable with children under 6 years, and the Parenting Scale focuses only on negative parenting dimensions.
The Parenting Practices Interview (PPI; Webster-Stratton et al., 2001) was adapted from the Oregon Social Learning Center's discipline questionnaire and was originally designed to measure both positive and negative dimensions of parenting practices. It asks parents how frequently they display specific responses toward their children when they misbehave (e.g., "give him/her a time out") or when they behave well or do a good job (e.g., "give your child a hug, kiss, pat, handshake or 'high five'"), as well as how much they agree or disagree with statements broadly describing ideas or attitudes about the right or wrong way to cope with children's behaviors (e.g., "being consistent in discipline is more important than giving big punishment for misbehavior"). The PPI can be administered as a structured interview or in a self-report format. It is composed of 64 items organized in seven summary scales: Harsh and Inconsistent Discipline, Physical Punishment, Appropriate Discipline, Positive Verbal Discipline, Praise and Incentives, Clear Expectations, and Monitoring (Webster-Stratton, 1998). The PPI has been widely used in clinical interventions, both in preventive programs (Reid et al., 2007; Webster-Stratton et al., 2001; Weeland et al., 2017) and in treatment programs with parents of children aged 3 to 12 years with significant behavioral problems (Reid et al., 2003; Smith et al., 2015; Webster-Stratton et al., 2004), including children with ADHD and ODD/CD diagnoses (Abikoff et al., 2015; Drugli et al., 2010; Lessard et al., 2016). It has also been applied with ethnic minorities (Leijten et al., 2017) and with families from the child protection system (Letarte et al., 2010; Linares et al., 2006; Smith et al., 2015).
Despite its application in numerous studies, different sets of items making up different scales or summary scores of the PPI have been used, so findings are difficult to compare. For example, Drugli et al. (2010), Reid et al. (2007) and Webster-Stratton et al. (2004) used only two PPI scales, one measuring a positive dimension of parenting and the other a negative dimension, but the dimensions were named in different ways (positive parenting/praise and incentives/supportive parenting; harsh discipline/harsh and inconsistent discipline/harsh and inappropriate discipline) and consisted of different numbers of items. In another study, Abikoff et al. (2015) used only two PPI dimensions of positive parenting (appropriate discipline and clear expectations). In two further studies, Linares et al. (2006) and Weeland et al. (2017) used four PPI scales, also named in different ways and using different numbers of items: whereas Linares et al. (2006) used 49 items grouped into three positive parenting dimensions (positive parenting, appropriate discipline and clear expectations) and one negative dimension (harsh discipline), Weeland et al. (2017) used 41 items grouped into two positive dimensions (positive verbal discipline and praise and incentives) and two negative dimensions (harsh and inconsistent discipline and physical discipline). In three other studies (Leijten et al., 2017; Lessard et al., 2016; Letarte et al., 2010) a total of 64 items were used and the seven scales proposed in the original version of the PPI were analyzed. In the study of Leijten et al. (2017) the positive verbal discipline and monitoring dimensions were excluded from analyses due to unreliability (alpha < 0.60).
Most of the studies using the PPI provided information about subscale internal consistency coefficients (Cronbach's alpha), ranging from 0.65 to 0.85. Only the study of Smith et al. (2015) conducted an Exploratory Factor Analysis (EFA), reporting adequate psychometric indexes for a three-factor solution with 17 items: effective discipline (6 items), inconsistent discipline (6 items) and punitive discipline (5 items). In summary, the main psychometric properties of the PPI have not been thoroughly assessed to date.
The goal of the present study was to develop a Spanish adaptation of the PPI to be used with families with substantiated reports or at risk for child maltreatment, and to analyze its main psychometric properties. More specifically, factorial structure, reliability, measurement invariance (across time and across parents' and children's age and gender), and convergent validity of the PPI were analyzed. We tested invariance across mothers vs. fathers, boys vs. girls, and younger vs. older children in order to confirm the hypothesis that the proposed factorial structure held and remained invariant for each group. Convergent validity was analyzed through measures of child behavior problems, parental stress, and depressive symptomatology, following the evidence of their relationship with parenting practices (Sanders & Turner, 2018). We hypothesized that parents with higher scores in negative PPI parenting dimensions would show higher scores in reported child behavior problems, parental stress and depressive symptomatology. Conversely, parents reporting higher scores in positive PPI dimensions were expected to show lower scores in reported child behavior problems, parental stress and depressive symptomatology. Additionally, PPI changes following parents' attendance at a parent training program, as well as differences according to parents' and children's age and gender, were explored.

Participants
The sample consisted of 213 parents (76% mothers) of 161 families, with children ranging in age from 4 to 9 years, recruited from Child Welfare (CW) and Child Protection Services (CPS) of the region of Gipuzkoa (Spain). Families were considered at risk by CW and CPS services due to substantiated reports or significant risk for child maltreatment. In all the families, children displayed behavioral problems and parents showed significant difficulties handling them.
Sociodemographic characteristics of the sample (161 mothers and 52 fathers) are presented in Table 1.

Parenting practices interview (PPI; Webster-Stratton et al., 2001)
The PPI consists of 64 items rated by parents of children aged 3 to 12 years. The original version includes seven summary scales: Harsh and Inconsistent Discipline (15 items; e.g., "Raise your voice", "How often does your child get away with things that you feel s/he should have been disciplined for?"), Physical Punishment (6 items; e.g., "Give your child a spanking"), Appropriate Discipline (12 items; e.g., "Take away privileges like TV, playing with friends"), Positive Verbal Discipline (9 items; e.g., "In an average week, how often do you praise or reward your child for doing a good job at home or school?"), Praise and Incentives (11 items; e.g., "Give your child a hug, kiss, pat, handshake for a good behavior"), Clear Expectations (6 items; e.g., "I have made clear rules or expectations for my child about chores"), and Monitoring (5 items; e.g., "How many hours in the last 24 h did your child spend at home without adult supervision, if any?"). The Spanish adaptation of the PPI used with Hispanic families in the USA (Linares et al., 2006; Reid et al., 2001) was applied in the present study, although the wording of some items was slightly modified to better fit the Spanish spoken in Spain.
Responses are given on a Likert-type scale from 1 (Never/Not at all likely/Totally disagree) to 7 (Always/Extremely likely/Totally agree), with the exception of five items from the Monitoring dimension (None/Less than 1/2 h/…/More than 4 h) and two items from Positive Verbal Discipline (Less than once per week/About once per week/…/More than 10 times per day), where responses are given in terms of the number of times parents say that the behavior happened.
Cronbach's alpha coefficients reported by Webster-Stratton et al. (2001)

Parenting stress index/short form (PSI-SF; Abidin, 1995)
The PSI-SF is a 36-item, self-report measure of parenting stress. It includes three subscales: Parental Distress (PD; e.g., "I feel trapped by my responsibilities as a parent", "I feel lonely and without friends"), Parent-Child Dysfunctional Interaction (PCDI; e.g., "Sometimes I feel my child doesn't like me and doesn't want to be close to me", "When I do things for my child I get the feeling that my efforts are not appreciated"), and Difficult Child (DC; e.g., "My child makes more demands on me than most children", "My child gets upset easily over the smallest thing"). Each subscale consists of 12 items rated from 1 (strongly disagree) to 5 (strongly agree), with scores ranging from 12 to 60. A Total score is calculated by summing the three subscale scores, ranging from 36 to 180. Scores of 90 or above may indicate a clinical level of stress. Abidin (1995) reported Cronbach's alpha coefficients of 0.91 for the PSI-SF Total Score, and 0.87, 0.80 and 0.85 for the PD, PCDI and DC subscales, respectively. The PSI-SF version validated with the Spanish population (Rivas et al., 2021) was used in the present study, with satisfactory internal consistency for the total score (α = 0.93) and all three dimensions (α = 0.86, 0.91, and 0.85).
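The PSI-SF scoring rule described above can be illustrated with a minimal sketch. The function name, the assumed item order (12 PD items, then 12 PCDI items, then 12 DC items), and the example responses are hypothetical and are not taken from the instrument manual.

```python
def score_psi_sf(responses):
    """Return PSI-SF subscale scores, the total score, and a cutoff flag.

    `responses` is a list of 36 integers from 1 to 5. The item order
    (PD first, then PCDI, then DC) is an assumption for illustration.
    """
    if len(responses) != 36:
        raise ValueError("expected 36 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    scores = {
        "PD": sum(responses[0:12]),
        "PCDI": sum(responses[12:24]),
        "DC": sum(responses[24:36]),
    }
    scores["Total"] = scores["PD"] + scores["PCDI"] + scores["DC"]
    # Totals of 90 or above may indicate a clinical level of stress.
    scores["above_cutoff"] = scores["Total"] >= 90
    return scores


# Hypothetical respondent, not study data.
example = [3] * 12 + [2] * 12 + [4] * 12
result = score_psi_sf(example)
print(result)
```

Each subscale sum stays within 12 to 60 and the total within 36 to 180, which matches the ranges given in the text.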
Beck depression inventory-II (BDI-II; Beck, Steer, & Brown, 1996)
The BDI-II is a 21-item, self-report measure of depressive symptomatology appropriate for both psychiatric and normative populations. Responses are given using a four-point scale from 0 to 3 (e.g., 0 "I do not feel like a failure"; 1 "I have failed more than I should have"; 2 "As I look back, I see a lot of failures"; 3 "I feel I am a total failure as a person"), with scores ranging from 0 to 63 and higher scores indicating higher levels of depressive symptomatology. The BDI-II has been shown to have adequate reliability (between 0.92 and 0.93 for internal consistency) as well as adequate construct validity (Beck et al., 1996). The BDI-II has been validated for use with the Spanish population (Sanz et al., 2003). In the present study, internal consistency was also satisfactory (Cronbach's alpha = 0.87).

Eyberg child behavior inventory (ECBI; Eyberg & Pincus, 1999)
The ECBI is a parent-rating scale covering 36 child disruptive behaviors with two subscales. The Intensity subscale measures the frequency of the child's behavior (e.g., "Acts defiant when told to do something", "Refuses to go to bed on time") on a seven-point scale, ranging from 1 to 7 with a minimum score of 36 and a maximum of 252. The Problem subscale measures the extent to which the parent finds the child's behavior troublesome, rated on a binary scale (0 no; 1 yes) with a score range from 0 to 36. Eyberg and Pincus (1999) reported high internal consistency for both Intensity and Problem subscales (α = 0.95 and KR20 = 0.94, respectively). The ECBI has been translated and validated with the Spanish population (García-Tornel et al., 1998). In the present study, both Intensity and Problem subscales showed high internal consistency (α = 0.91 and KR20 = 0.88).
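Because the ECBI Problem subscale is binary, its internal consistency is reported as KR20 rather than Cronbach's alpha. A minimal sketch of the Kuder-Richardson 20 formula follows. It assumes population variances throughout, and the small response matrix is hypothetical rather than ECBI data.

```python
from statistics import pvariance


def kr20(rows):
    """Kuder-Richardson 20 for binary (0/1) items.

    `rows` holds one list of item responses per respondent.
    Population variances are used for both the item terms (p * q)
    and the total score, so the two are on the same footing.
    """
    n = len(rows)
    k = len(rows[0])
    total_var = pvariance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in rows) / n  # proportion answering 1
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical 5 respondents x 4 binary items.
example_rows = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(example_rows), 3))
```

For binary items, KR20 computed this way coincides with Cronbach's alpha, which is why the two coefficients are reported side by side for the two ECBI subscales.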

Procedure
Parents were informed of the study goals by Child Welfare and Child Protection Services caseworkers and gave informed consent. Every parent agreed to participate in the study voluntarily and completed the instruments at the family home in the presence of a trained clinical psychologist at two times: before starting the assigned intervention (Time 1: pre-intervention) and six months later (Time 2: post-intervention). The Ethics Committee of the University of the Basque Country UPV/EHU approved the study protocol.

Preliminary analyses
Preliminary analyses were conducted to explore data characteristics. Multivariate normality was assessed with Mardia's multivariate skewness and kurtosis test (Mardia, 1970).

Factor analysis and reliability
Confirmatory Factor Analysis (CFA) was preferred over Exploratory Factor Analysis (EFA) based on three considerations. First, EFA assumes that there is no theoretical information on the variables under study (Lloret et al., 2014). In our case, there was sufficient theoretical information about the PPI dimensions. Second, large samples are a requirement of EFA, which is difficult to achieve in the field of family intervention programs. Third, and most relevant, CFA offers greater methodological rigor compared to EFA (Brown, 2015).
Confirmatory Factor Analysis (CFA) was conducted with Mplus 8 using the weighted least squares mean- and variance-adjusted (WLSMV) estimation method for categorical data. CFA was conducted with PPI data of participants before starting the assigned intervention at Time 1 (pre-intervention). Longitudinal Measurement Invariance was calculated to confirm the maintenance of the factorial structure at Time 2 (post-intervention).
Missing data were treated with pairwise deletion. Goodness of fit indices were examined: root mean square error of approximation (RMSEA), with values below 0.08 representing acceptable fit, and comparative fit index (CFI) and Tucker-Lewis Index (TLI), with values between 0.90 and 0.95 representing reasonable model fit and values above 0.95 an excellent model fit (Brown, 2015).
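The cutoffs above can be written as a small decision rule. The sketch below is illustrative only: the function name and the label used for values under 0.90 are assumptions, and the example calls reuse the fit indices reported in this article for the four-factor and six-factor models.

```python
def classify_fit(rmsea, cfi, tli):
    """Label fit indices using the cutoffs stated in the text."""

    def incremental(value):
        # CFI and TLI share the same cutoffs.
        if value > 0.95:
            return "excellent"
        if value >= 0.90:
            return "reasonable"
        return "below 0.90"  # label chosen for illustration

    return {
        "RMSEA": "acceptable" if rmsea < 0.08 else "not acceptable",
        "CFI": incremental(cfi),
        "TLI": incremental(tli),
    }


# Four-factor, 25-item model reported in the abstract.
print(classify_fit(rmsea=0.06, cfi=0.92, tli=0.91))
# Six-factor model reported in the results.
print(classify_fit(rmsea=0.07, cfi=0.68, tli=0.67))
```

The six-factor example shows why RMSEA alone was not sufficient: its RMSEA falls under 0.08 while CFI and TLI are far below 0.90.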
Internal consistency was examined by computing Cronbach's alpha coefficients for each PPI dimension. Cronbach's alpha is less reliable in multidimensional measures and requires equal factor loadings (Viladrich et al., 2017); the Omega coefficient was therefore also calculated using R software.
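To make the contrast between the two coefficients concrete, the following sketch computes Cronbach's alpha from raw item responses and an omega coefficient from standardized factor loadings. It is written in Python for illustration, whereas the study used R. The omega formula assumes a single factor with uncorrelated errors, and both the response matrix and the loadings are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def omega_from_loadings(loadings):
    """Omega for one factor, from standardized loadings.

    Assumes uncorrelated errors, so each unique variance is 1 - loading**2.
    """
    common = sum(loadings) ** 2
    unique = sum(1 - loading ** 2 for loading in loadings)
    return common / (common + unique)


# Hypothetical 5 respondents x 3 items.
example_rows = [
    [1, 2, 2],
    [2, 3, 3],
    [3, 3, 4],
    [4, 5, 5],
    [5, 5, 6],
]
print(round(cronbach_alpha(example_rows), 3))

# Hypothetical standardized loadings.
print(round(omega_from_loadings([0.7, 0.7, 0.7]), 3))
print(round(omega_from_loadings([0.9, 0.6, 0.4]), 3))
```

Omega weights each item by its own loading, so it does not rely on the equal-loadings assumption that alpha requires.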
Total scores for each PPI dimension were calculated by summing the responses for the items within each dimension.

Measurement invariance (MI)
Multigroup analyses were used to test measurement invariance across parents' gender (mothers vs. fathers), and children's gender (boys vs. girls) and age (4-6 years vs. 7-9 years). MI was calculated using parcels, since large group sizes are needed in order to have reasonable statistical power when testing for measurement invariance (Kline, 2011). Parcels were created for each dimension, with scores divided by the number of items within the dimension. In these MI comparisons, a nonsignificant Δχ2 along with ΔCFI ≤ 0.01 and ΔRMSEA ≤ 0.015 were considered evidence of invariance. Although both parents may belong to the same family, the PPI items ask the mother and the father independently about their usual behavior with their children. In this study, each parent was considered independently even if they participated in the assigned intervention as a couple. The focus of the assessment was each parent's perception of his/her child's behavior problems and the behavior each parent reported towards their children.
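One reading of the parceling description is that each parcel is the mean item response within a dimension. The sketch below follows that reading as an assumption. The item identifiers, the item-to-dimension assignment, and the responses are hypothetical placeholders, not the actual PPI-25 item keys.

```python
def make_parcels(responses, dimensions):
    """Return one parcel per dimension: the mean of its item responses.

    `responses` maps item id -> score; `dimensions` maps a dimension
    name -> list of item ids. Both structures are illustrative.
    """
    parcels = {}
    for name, item_ids in dimensions.items():
        values = [responses[item_id] for item_id in item_ids]
        parcels[name] = sum(values) / len(values)
    return parcels


# Hypothetical item keys and one hypothetical respondent.
dimensions = {
    "Appropriate Discipline": ["ad1", "ad2"],
    "Physical Punishment": ["pp1", "pp2", "pp3"],
}
responses = {"ad1": 4, "ad2": 6, "pp1": 1, "pp2": 2, "pp3": 3}
print(make_parcels(responses, dimensions))
```

Dividing by the number of items keeps every parcel on the original 1 to 7 response scale, regardless of how many items a dimension contains.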
Longitudinal Measurement Invariance (LMI) was tested across time (Time 1 and Time 2) following the recommendations of Liu et al. (2017). LMI was calculated only for the total sample; because of the limited number of participants at Time 2, it was not possible to calculate longitudinal MI for groups.
As recommended by Liu et al. (2017, p. 495), some response categories were collapsed in order to deal with sparse data and to ensure that the analyses were conducted with the same number of response categories at each measurement time. This strategy was only used for the LMI analyses. For each of the 7 items that make up the Verbal Praise and Incentives dimension, options 1 (never/totally disagree) and 2 (seldom/disagree) were merged into a single category, leaving a total of 6 categories. For the 5 items that make up the Inconsistent Discipline dimension, options 6 (very often/agree) and 7 (always/totally agree) were merged into a single category, also leaving a total of 6 categories. For the 6 items of the Physical Punishment dimension, it was necessary to merge options 4 (about half of the time/neither agree nor disagree), 5 (often/slightly agree), 6 (very often/agree) and 7 (always/totally agree) into a single category, leaving a total of 4 categories. It was not necessary to make any changes to the items belonging to the Appropriate Discipline dimension, which maintained 7 response categories.
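The collapsing scheme described above amounts to a recode map per dimension. The following sketch expresses it that way; renumbering the merged categories consecutively from 1 is an assumption made for illustration, as are the helper names.

```python
def merge_categories(groups):
    """Build a recode map from groups of original categories.

    Each inner list holds the original options that share one new code;
    new codes are numbered consecutively from 1 (an assumption).
    """
    return {old: new for new, group in enumerate(groups, start=1) for old in group}


COLLAPSE_RULES = {
    # Options 1 and 2 merged -> 6 categories.
    "Verbal Praise and Incentives": merge_categories([[1, 2], [3], [4], [5], [6], [7]]),
    # Options 6 and 7 merged -> 6 categories.
    "Inconsistent Discipline": merge_categories([[1], [2], [3], [4], [5], [6, 7]]),
    # Options 4 to 7 merged -> 4 categories.
    "Physical Punishment": merge_categories([[1], [2], [3], [4, 5, 6, 7]]),
    # Unchanged -> 7 categories.
    "Appropriate Discipline": merge_categories([[c] for c in range(1, 8)]),
}


def collapse(dimension, response):
    """Recode one original response (1-7) for an item of `dimension`."""
    return COLLAPSE_RULES[dimension][response]


for name, rule in COLLAPSE_RULES.items():
    print(name, len(set(rule.values())), "categories")
```

Applying the same map at Time 1 and Time 2 is what keeps the number of response categories identical across measurement occasions.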
Configural, metric and scalar invariance were tested, based on recommendations by Cheung & Rensvold (2002) and Little (2013). A ΔCFI ≤ 0.01 and a ΔRMSEA ≤ 0.015 were considered evidence of invariance. Chi-square difference tests were less favored given that the χ2 test is considered too sensitive to sample size (Little, 2013).
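The ΔCFI and ΔRMSEA criteria can be sketched as a comparison between each model and the next, more constrained one. In this sketch the deltas are interpreted as worsening of fit (a drop in CFI, a rise in RMSEA), which is an assumption about sign convention. The fit values are hypothetical, and differences are rounded to three decimals to avoid floating-point artifacts at the cutoff.

```python
def invariance_supported(less_constrained, more_constrained,
                         max_delta_cfi=0.01, max_delta_rmsea=0.015):
    """Check the change-in-fit criteria between two nested models.

    Each model is a dict with "cfi" and "rmsea". Deltas are taken in
    the direction of worse fit and rounded to three decimals.
    """
    delta_cfi = round(less_constrained["cfi"] - more_constrained["cfi"], 3)
    delta_rmsea = round(more_constrained["rmsea"] - less_constrained["rmsea"], 3)
    return delta_cfi <= max_delta_cfi and delta_rmsea <= max_delta_rmsea


# Hypothetical fit indices, not values from Table 4.
configural = {"cfi": 0.950, "rmsea": 0.050}
metric = {"cfi": 0.945, "rmsea": 0.055}
scalar = {"cfi": 0.930, "rmsea": 0.075}

print("metric vs configural:", invariance_supported(configural, metric))
print("scalar vs metric:", invariance_supported(metric, scalar))
```

In the hypothetical example the metric step passes and the scalar step fails, which is the pattern that would stop a comparison of latent means across groups or occasions.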

Validity analysis
Convergent validity was assessed by computing Spearman correlations between each factor of the PPI and parenting stress (PSI-SF), parental depressive symptomatology (BDI-II), and child behavior problems (ECBI). MANOVAs were conducted to test PPI score differences between pre- and post-intervention for parents participating in the Incredible Years Program (n = 104), between parents' gender, and between children's age and gender. Cohen's d was used to calculate effect sizes: d ≥ 0.20 was considered a small effect, d ≥ 0.50 a moderate effect, and d ≥ 0.80 a large effect.
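The two statistics named above can be illustrated with a short, self-contained sketch. The text does not state which variant of Cohen's d was used, so the pooled-standard-deviation formula for two independent groups is an assumption here, as is the "negligible" label for values under 0.20. All data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def label_effect(d):
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "negligible"  # label not defined in the text


# Hypothetical scores.
print(spearman([1, 2, 3, 4, 5], [2, 4, 5, 8, 20]))
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d, label_effect(d))
```

Spearman correlations depend only on rank order, which suits the non-normal item distributions reported for this sample.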

Preliminary Analysis
Descriptive statistics for all 64 PPI items used in the analysis are provided in the supplemental material. Analysis of the score distributions indicated violations of univariate normality in at least 20 items (skewness and kurtosis beyond ±2). Mardia's coefficient for multivariate kurtosis was also statistically significant (p < 0.001).
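The ±2 screening rule for univariate normality can be sketched as follows. The estimator is an assumption, since the text does not say which formulas were used: this version uses moment-based skewness and excess kurtosis. The two example items are hypothetical.

```python
def skew_kurtosis(values):
    """Moment-based skewness and excess kurtosis (assumed estimators)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("item has zero variance")
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


def violates_normality(values, limit=2.0):
    """Flag an item whose skewness or excess kurtosis exceeds +/- limit."""
    skew, kurt = skew_kurtosis(values)
    return abs(skew) > limit or abs(kurt) > limit


# Hypothetical item with a strong floor effect (most parents answer 1).
floor_item = [1] * 18 + [7] * 2
# Hypothetical item with evenly spread responses.
spread_item = [1, 2, 3, 4, 5, 6, 7]

print(violates_normality(floor_item))
print(violates_normality(spread_item))
```

Floor effects of this kind are plausible for low-frequency behaviors, and they are one reason the analyses treat responses as categorical rather than continuous.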
Further analysis indicated that the Monitoring dimension showed extreme kurtosis and skewness (beyond ±10), with more than 20% of missing data per item. Therefore, the Monitoring scale was eliminated from further analysis. Missing data analyses of the remaining 59 items showed that only 5% of the sample had missing values and less than 2% of responses per item were missing.

Factor Analysis and Reliability
A six-factor model was tested using confirmatory factor analysis (CFA). The six PPI dimensions used were Harsh and Inconsistent Discipline, Physical Punishment, Appropriate Discipline, Positive Verbal Discipline, Praise and Incentives, and Clear Expectations.
CFA results for the six-factor model were not acceptable (χ2 = 3096.39, df = 1637, p < 0.001, RMSEA = 0.07, RMSEA 90% CI = [0.06, 0.08], CFI = 0.68, TLI = 0.67). A total of 34 items with a factor loading < 0.30 and a negative correlation with items of the same factor were eliminated, including every item from Harsh and Inconsistent Discipline and Clear Expectations dimensions. Moreover, three items of the Positive Verbal Discipline dimension showed correlations between 0.20 and 0.40 with the Praise and Incentives dimension. A content analysis of these three items (8a "Within the last two days, how many times did you praise or compliment your child for anything s/he did well?", 9d "It is important to praise when they do well", and 11a "When your child completes his/her chores, how likely are you to praise or reward your child?") supported their inclusion in the Praise and Incentives dimension.
Positive correlations were observed between factors related to positive dimensions of parenting practices (Appropriate Discipline and Verbal Praise and Incentives), as well as between factors related to negative dimensions (Inconsistent Discipline and Physical Punishment). Additionally, only one negative correlation was found between positive and negative parenting dimensions (Appropriate Discipline and Physical Punishment).
Total scores were calculated for each of the four dimensions of the PPI-25. Means and standard deviations are presented in Table 3.

Measurement Invariance (MI)
MI was tested across pre- and post-intervention measures in parents participating in the Incredible Years Parent Training Program (see Table 4). Configural, metric and scalar invariance met all criteria for invariance (ΔCFI < 0.01 and ΔRMSEA < 0.015), allowing comparison of PPI-25 scores across time.
The same properties of invariance were tested across parent gender (mothers vs. fathers), child gender (boys vs. girls), and child age (4 to 6 years vs. 7 to 9 years). It was not possible to calculate invariance across parent age due to the small sample size in one of the groups. As can be seen in Table 4, configural, metric and scalar invariance met all criteria for invariance (non-significant Δχ2, ΔCFI ≤ 0.01 and ΔRMSEA ≤ 0.015), allowing comparison of PPI-25 scores across groups.

Convergent validity
Correlations between the four dimensions of the PPI-25 and parenting stress (PSI-SF), parental depressive symptomatology (BDI-II), and child behavior problems (ECBI Intensity and Problem subscales) were analyzed (Table 5).
For the negative parenting dimensions (Inconsistent Discipline and Physical Punishment), findings showed significant positive correlations with all external measures. The only exception was the correlation between Physical Punishment and parental depressive symptomatology, which was not significant.
For the positive parenting dimensions (Appropriate Discipline and Verbal Praise and Incentives) only Appropriate Discipline showed a weak negative correlation with parental depressive symptomatology.

Comparison across time and subgroups
Differences across pre- and post-intervention
Comparison between pre- and post-intervention PPI-25 scores of the 104 parents who participated in The Incredible Years Parent Training Program are presented in Table 6. Both negative parenting dimensions (Inconsistent Discipline and Physical Punishment) significantly decreased from pre- to post-intervention, and Verbal Praise and Incentives significantly increased with a large effect size. However, no difference between pre- and post-intervention was observed for the Appropriate Discipline dimension.

Differences across parent and child gender (mothers vs. fathers/girls vs. boys)
No differences were found for any dimension of the PPI-25 between mothers and fathers, or between boys and girls.

Differences across child age
Differences between PPI-25 scores of parents with children of different ages are presented in Table 7. Statistically significant differences were observed only for the Verbal Praise and Incentives and the Physical Punishment dimensions, with parents of children between 4-6 years old reporting higher scores than parents of children between 7-9 years old.

Discussion
The main goal of the present study was to adapt and to analyze the psychometric properties of the Spanish version of the Parenting Practices Interview (PPI), a comprehensive measure of parenting practices for parents of 3 to 12-year-old children.
Results showed that a brief adaptation of the PPI with 25 items presented the best fit for the sample of the present study. This Spanish version included four dimensions: Appropriate Discipline (7 items), Verbal Praise and Incentives (7 items), Inconsistent Discipline (5 items), and Physical Punishment (6 items). Furthermore, internal correlations were statistically significant between both positive parenting dimensions and between both negative parenting dimensions. These results were in line with those obtained by Smith et al. (2015), who found 17 PPI items organized in three dimensions named Inconsistent Discipline, Effective Discipline and Punitive Discipline. These dimensions were similar, respectively, to those found in the present study: Inconsistent Discipline, Appropriate Discipline/Verbal Praise and Incentives, and Physical Punishment. Adequate internal consistency coefficients (Cronbach's alpha and Omega) were found for every Spanish PPI-25 dimension, ranging from 0.70 to 0.87. These coefficients were similar to or higher than those observed in previous studies (Drugli et al., 2010; Lessard et al., 2016; Letarte et al., 2010; Linares et al., 2006; Webster-Stratton et al., 2001; Webster-Stratton et al., 2004). It is also relevant to highlight that the low reliability coefficients obtained in the present study for the Monitoring and Positive Verbal Discipline dimensions were in line with those found by Leijten et al. (2017), who eliminated items of both dimensions from their study, and by Lessard et al. (2016) for the Monitoring dimension (Cronbach's alpha = 0.54). Following empirical considerations, 39 items from the original version of the PPI were eliminated, including 20 items that made up the Monitoring, Clear Expectations, and Positive Verbal Discipline dimensions.
Also, 5 items from the Appropriate Discipline dimension, 4 items from the Verbal Praise and Incentives dimension, and 10 items from the Harsh and Inconsistent Discipline dimension were deleted due to low factor loadings and/or negative correlations with items of the same dimension. It is difficult to explain why these items were not suited to our Spanish sample of families with substantiated reports or at risk for child maltreatment, and any explanation would be merely speculative. The PPI-25 cannot be considered a Spanish version of the original PPI. It seems more appropriate to consider it an adaptation that could be useful, in both clinical and research contexts, for measuring parents' self-reports of positive (Appropriate Discipline and Verbal Praise and Incentives) and negative parenting practices (Inconsistent Discipline and Physical Punishment) in Spanish families with substantiated reports or at risk for child maltreatment. Findings also suggested the usefulness of the PPI-25 for measuring parenting practices in parents of both girls and boys aged 4 to 9 years, and for comparing scores between mothers and fathers. In the present study, the four dimensions of the PPI-25 remained invariant across groups (mothers/fathers, boys/girls, child age ranges). However, due to sample size limitations, it was not possible to confirm that every item making up each dimension actually worked in a similar way for each group, and new studies with larger samples are needed to confirm the invariance of the PPI-25 dimensions.
Findings of the Longitudinal Measurement Invariance (MI) analysis, together with the differences between pre- and post-intervention PPI-25 scores of the parents who participated in The Incredible Years Parent Training Program, showed that the negative parenting dimensions (Inconsistent Discipline and Physical Punishment) significantly decreased from pre- to post-intervention, and that Verbal Praise and Incentives significantly increased with a large effect size. These results are consistent with those obtained in most studies using the PPI to measure the outcomes of preventive and treatment programs for parents of children with significant behavioral problems (Reid et al., 2003, 2007; Smith et al., 2015; Webster-Stratton et al., 2001; Weeland et al., 2017), and parents from the child protection system (Letarte et al., 2010; Linares et al., 2006; Smith et al., 2015). Although preliminary, our findings suggested that the PPI-25 may be sensitive enough to detect changes in reported parenting behavior over time, thus showing potential for use in longitudinal studies and comparisons across time. This may be especially relevant for interventions aimed at changing parenting practices.
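As an illustration of what an effect size of this kind expresses, the sketch below computes Cohen's d as a mean difference divided by a pooled standard deviation. This is one common variant; it is an assumption that it matches the variant used in the present study, and several alternatives exist for pre/post designs. The function name `cohens_d` and the scores are hypothetical. By Cohen's conventional benchmarks, values of about 0.2, 0.5, and 0.8 are read as small, medium, and large.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(scores_a, scores_b):
    """Cohen's d (pooled-SD variant): (mean_b - mean_a) / pooled SD."""
    n_a, n_b = len(scores_a), len(scores_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * stdev(scores_a) ** 2 + (n_b - 1) * stdev(scores_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(scores_b) - mean(scores_a)) / sqrt(pooled_variance)


# Hypothetical dimension scores before and after an intervention.
pre_scores = [1, 2, 3]
post_scores = [3, 4, 5]
d = cohens_d(pre_scores, post_scores)
print(d)
```

A positive d here means higher scores at post-intervention; for dimensions expected to decrease, such as Physical Punishment, the sign would be negative under this ordering.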
It is important to underline that convergent validity of the PPI-25 was only observed for negative parenting dimensions (Inconsistent Discipline and Physical Punishment) but not for positive ones. With the exception of a weak relationship between Physical Punishment and parental depressive symptomatology, our findings suggested that parents reporting a higher use of negative parenting practices were more likely to report more child behavior problems, more parental stress, and more depressive symptomatology. Such findings are in line with those obtained by Smith et al. (2015) with externalized child behavior problems and Inconsistent Discipline scores.
Although findings of the present study should be considered preliminary, it is important to underline that this is the first study to use Confirmatory Factor Analyses to obtain data supporting the main instrument dimensions, and Measurement Invariance and Longitudinal Measurement Invariance analyses to confirm that it can be used with different groups of people (fathers and mothers, boys and girls, and children from different age groups) and across time (before and after intervention).
In summary, findings of the present study supported the utility of the PPI-25 to measure parenting practices in Spanish families with substantiated reports or at risk for child maltreatment with children aged 4 to 9, an age range when child behavior problems usually emerge and parents can start to display difficulties in coping with them (Prior et al., 2001; Webster-Stratton et al., 2004). Such a brief measure can help clinicians and practitioners to define intervention goals and to evaluate changes in programs aimed at helping parents to improve their parenting practices, particularly in services affected by time constraints. Although more research is needed, the present findings regarding the psychometric properties of the PPI-25 are encouraging for its use with families with substantiated reports or at risk for child maltreatment from Child Welfare and Child Protection Services in Spain.
Although the contribution of the present study is relevant, its limitations must be considered. The main limitation was the sample size. Additional studies with larger samples of mothers and fathers with equal gender distribution are needed to cross-validate the present findings, and to explore whether the PPI-25 structure holds in samples with different sociodemographic characteristics. Moreover, informant biases and inflated correlations derived from the use of self-report measures must also be considered.

Compliance with Ethical Standards
Conflict of Interest The authors declare no competing interests.
Ethics Approval The Ethics Committee of the University of the Basque Country UPV/EHU approved the study protocol.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.