Introduction

Childhood maltreatment is a strong predictor of poor health outcomes during childhood and across the lifespan, increasing disease risk by 2- to 6-fold [1,2,3,4,5]. Given the harm of child maltreatment, many clinicians and researchers are assessing child maltreatment exposure in clinical practice and research, as a means to identify high-risk populations, identify immediate safety issues, reduce long-term health risk, and understand the predictors and consequences of these experiences.

However, a lack of reliable tools to measure the occurrence of maltreatment has remained a major barrier to these efforts. Most often, researchers use self-reports, relying on retrospective study designs, wherein adolescents or adults report experiences from their childhood, or prospective study designs, in which a caregiver reports the child’s experiences over time. Most studies using self-reports of childhood maltreatment exposure are retrospective in nature. Although child-report measures are also available, caregiver reporting is often used in practice due to the developmental challenges of measurement with young children. Both reporting types have limitations. Retrospective measures are susceptible to memory and recall biases, including infantile amnesia (when adults cannot recollect childhood episodic memories) [6,7,8]. Furthermore, fear of legal consequences, denial, or shame may skew prospective measures by parents toward underreporting, especially when the reporter has perpetrated the maltreatment or has a relationship with the perpetrator [9, 10].

Studies gathering reports of child maltreatment from multiple sources often find disagreement between informants [9, 11,12,13,14]. A recent meta-analysis of 16 different studies found poor agreement between prospective reports from various informants and retrospective self-reports, such that only about half of those identified retrospectively as having experienced maltreatment also had a concordant prospective report [15]. Similar studies have also observed that adverse health outcomes are more closely associated with retrospective versus prospective maltreatment reports [16,17,18].

Little is known about the factors that drive disagreement in child maltreatment reporting or about how reporter disagreement might, in turn, shape child health. Although there are known demographic and family factors associated with differences in maltreatment exposure (e.g., child sex, race, maternal mental health status, parent history of abuse or neglect, bonding), these factors have not been extensively studied in relation to reporting agreement. Without these insights, clinicians and researchers are unable to understand the benefits and drawbacks of different measurement approaches and identify the mechanisms through which different reporting types may contribute to adverse health outcomes. This gap in our understanding of the predictors and consequences of child maltreatment reporting disagreement also creates challenges in clinical practice when deciding who should report maltreatment and which sources should be considered for reliable measurement and appropriate response.

To address these gaps, we investigated the prevalence, predictors, and consequences of disagreement between prospective and retrospective reports of child maltreatment using data from the Avon Longitudinal Study of Parents and Children (ALSPAC). First, we examined levels of agreement on reports of physical maltreatment (PM) and emotional maltreatment (EM) exposure by comparing: (1) prospective reports from mothers with prospective reports from their partners; and (2) prospective caregiver reports (combining mother and partner reports) with retrospective child reports. We hypothesized that (1) mothers would report maltreatment more frequently than their partners; and (2) children would report maltreatment more frequently than their caregivers. Second, we explored the predictors of disagreement between these two pairs of reporters. This aim was exploratory and thus no hypotheses were specified. Third, we examined the extent to which disagreement between reporters was associated with child health outcomes in young adulthood. We hypothesized that disagreement would be associated with greater health risk. The ALSPAC sample is uniquely poised to examine these questions, given its 30-year duration of follow-up, often with repeated measures from multiple family members [19].

Methods

Sample

ALSPAC is a birth cohort from Avon, England that follows pregnant mothers whose children had expected delivery dates between April 1991 and December 1992 [19, 20]. Informed consent for the use of data collected was obtained from participants following the recommendations of the ALSPAC Ethics and Law Committee at the time (additional details about the sample and methods are available in Supplemental Materials).

We constructed two analytic samples from the data (Fig. S1). Our primary analytic sample included reporter pairs from both mother and partner with three or more completed timepoints (Table S1) in which they responded to questions regarding maltreatment behaviors (N = 5799 pairs) [21]. Our secondary analytic sample was restricted to pairs from the primary sample who also had child-reported data for maltreatment questions as reported at age 22; this secondary sample allowed us to examine the differences in reports of maltreatment by caregivers (combined mother and partner reports) and children (N = 2373 pairs) [21]. See Table S2 for comparisons between our analytic subsamples and the total sample.

Measures

We examined two types of maltreatment behaviors using reports from mail-in questionnaires sent to mothers, partners, and children. At seven timepoints, from child age 8 months to 9 years, mothers and partners reported on maltreatment exposure using an ALSPAC-designed measure (see Table S3 for details). Most participants (≥ 92%) completing the partner report identified as the child’s father at each reporting time for both analytic samples (Table S4). We derived two variables to classify children as exposed or unexposed to PM and EM, respectively. Exposure to maltreatment was defined as at least one affirmative response by the reporter to one item in each maltreatment category, regardless of the perpetrator(s) identified. Exposure to “caregiver” maltreatment behavior was defined as maltreatment reported by either the mother or partner, regardless of the perpetrator(s) identified.
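The derivation of the reporter-level and caregiver-level exposure variables can be sketched as follows. This is a minimal illustration, not the study's own code (the analyses were run in SAS): the function names and the example responses are hypothetical, and missing responses are assumed to have been dropped before the rule is applied.

```python
from typing import Iterable


def reporter_exposed(responses: Iterable[bool]) -> bool:
    """True if a reporter gave at least one affirmative response to any
    item in a maltreatment category, across all timepoints."""
    return any(responses)


def caregiver_exposed(mother: Iterable[bool], partner: Iterable[bool]) -> bool:
    """Caregiver-reported exposure: reported by either mother or partner."""
    return reporter_exposed(mother) or reporter_exposed(partner)


# Hypothetical responses at seven timepoints (illustrative, not ALSPAC data)
mother_pm = [False, False, True, False, False, False, False]
partner_pm = [False] * 7

print(reporter_exposed(mother_pm))               # mother reports PM
print(reporter_exposed(partner_pm))              # partner does not
print(caregiver_exposed(mother_pm, partner_pm))  # combined caregiver report
```

Under this rule, a single affirmative response from either caregiver is enough to classify the child as exposed by caregiver report, regardless of the perpetrator named.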

At age 22, children reported their exposure to PM and EM by an adult in the family before age 11 (see Table S3 for questionnaire details). The questions were derived from the psychometrically-validated Child Abuse Questionnaire and Sexual Experiences Survey [22, 23]. Each item was rated on a 5-point frequency scale: never; rare; sometimes; often; or very often. For our analyses, children were considered exposed to maltreatment if their response was greater than “never” for any of the items associated with PM and EM (sensitivity analyses with other cut-points are described later).

Predictors of Disagreement

Disagreement was defined as the mother reporting exposure while her partner did not (or vice versa), or the caregivers reporting exposure while the child did not (or vice versa). We examined 16 possible predictors of reporter disagreement. The following demographic characteristics were investigated because they are risk factors for child maltreatment exposure [24, 25]: child’s sex, child’s race, and maternal factors (education level, marital status, age at child’s birth, and number of previous pregnancies). We also investigated familial factors as predictors of disagreement given their association with child maltreatment exposure, including maternal postnatal depression, caregivers’ exposure to neglect and abuse as children, maternal mental health history, and maternal and paternal bonding with child (see Sect. "Results" of Supplemental Materials for variable coding and complete list of variables) [26,27,28].
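As an illustration of this definition, the sketch below codes a reporter pair as discordant when exactly one of the two reports indicates exposure. The function names and pattern labels are hypothetical and are used only to make the four possible response combinations explicit.

```python
def disagreement(report_a: bool, report_b: bool) -> int:
    """1 if exactly one reporter indicated exposure, 0 if the pair agreed."""
    return int(report_a != report_b)


def response_pattern(caregiver: bool, child: bool) -> str:
    """Label the caregiver-child response combination (labels are illustrative)."""
    if caregiver and child:
        return "both_reported"
    if caregiver:
        return "caregiver_only"
    if child:
        return "child_only"
    return "neither_reported"


# The four possible caregiver-child combinations
for caregiver, child in [(True, True), (True, False), (False, True), (False, False)]:
    print(response_pattern(caregiver, child), disagreement(caregiver, child))
```

Keeping the direction of disagreement (caregiver only versus child only) alongside the binary indicator makes it possible to see which reporter tends to report less.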

Child Health Outcomes

Maltreatment has been associated with a wide range of health outcomes. Thus, we broadly evaluated the consequences of discordant reports, using data on both objective and subjective health measures of the children collected at two timepoints: (a) a self-report mail-in questionnaire at age 22 years, and (b) a clinic-based assessment at age 24 years. Health outcomes included self-reported lifetime presence of eight conditions (Sect. "Discussion" of Supplemental Materials), general health quality, and clinically assessed body mass index and blood pressure. Mood/behavioral outcomes included self-reported lifetime presence of four conditions (Sect. "Discussion" of Supplemental Materials), depressive symptoms at age 22, and clinically assessed symptoms of depression, anxiety, and substance abuse at the 24-year timepoint. By using data from both timepoints, we could better ascertain the implications of reporter discordance and maintain temporality in our exposure–outcome association.

Primary Analyses

First, to examine the level of agreement across reports of child maltreatment, we calculated two agreement statistics: the kappa coefficient (κ) of agreement and the prevalence-adjusted bias-adjusted κ (PABAK). κ is often used to assess the level of inter-rater reliability and is useful in situations where there is no standard measure to estimate validity [29]. However, some argue against using the κ coefficient, particularly when an outcome is rare, as rare events can create the paradox of high concordance but low κ values [30, 31]. Because we expected maltreatment to be underreported and thus rare in this population-based sample, which could artificially decrease κ, we also calculated the PABAK [31]. PABAK, like κ, provides an estimate of inter-rater agreement, but it also adjusts for the prevalence of the outcome. κ and PABAK values range from 0 (no agreement) to 1 (perfect agreement) and were interpreted as degrees of agreement, per standard interpretation guidelines: (a) 0.01–0.20 (slight); (b) 0.21–0.40 (fair); (c) 0.41–0.60 (moderate); and (d) > 0.60 (substantial) [32].
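For reference, the two statistics can be written out for a binary rating by two reporters. The expressions below are the standard textbook definitions and are assumed, not confirmed, to match the implementation used in the analysis; the cell labels $a$, $b$, $c$, $d$ are introduced here for illustration only ($a$ = both report exposure, $b$ = first reporter only, $c$ = second reporter only, $d$ = neither, $n = a + b + c + d$).

```latex
\begin{align}
p_o &= \frac{a + d}{n}
  && \text{(observed agreement)} \\
p_e &= \frac{(a+b)(a+c) + (c+d)(b+d)}{n^2}
  && \text{(agreement expected by chance)} \\
\kappa &= \frac{p_o - p_e}{1 - p_e} \\
\text{PABAK} &= 2\,p_o - 1
\end{align}
```

When exposure is rare, nearly all pairs fall in cell $d$, so $p_e$ approaches $p_o$ and κ shrinks even though observed agreement is high. For example, with hypothetical values $p_o = 0.94$ and $p_e = 0.93$, κ ≈ 0.14 while PABAK = 0.88.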

Second, we examined demographic and familial predictors of disagreement between reports of child maltreatment. Disagreement was coded as a binary variable capturing pair discordance versus concordance for maltreatment presence (0 = reporter pair agreed on presence/absence of maltreatment; 1 = reporter pair disagreed). We used simple logistic regression to model the association between demographic and familial factors as predictors of disagreement between: (a) mother and partner reports and (b) caregiver and child reports. No covariates were included in these analyses, as this was the first study to test these associations. Finally, we examined whether disagreement between caregiver and child reports was associated with child health outcomes using simple logistic and linear regression. Because we examined bivariate associations between disagreement and multiple, potentially correlated child outcomes, we report both unadjusted and Bonferroni-adjusted p-values to account for multiple testing. All analyses were conducted using SAS®.
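To make these two steps concrete, the sketch below shows how an unadjusted odds ratio and a Bonferroni correction can be computed. It is an illustration only and does not reproduce the SAS analysis. It relies on the fact that, for a single binary predictor, the odds ratio from a simple logistic regression equals the cross-product ratio of the 2 × 2 table, with a Wald interval on the log scale. All counts and p-values are hypothetical.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio and Wald 95% CI from a 2x2 table.

    a: predictor present, pair disagreed   b: predictor present, pair agreed
    c: predictor absent, pair disagreed    d: predictor absent, pair agreed
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts and p-values (not ALSPAC data)
or_, lo, hi = odds_ratio_wald_ci(40, 60, 20, 80)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
print(bonferroni([0.001, 0.02, 0.5]))
```

A predictor with more than two levels (such as maternal education) would instead require a fitted regression model with one indicator per non-reference level.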

Given the results of prior studies suggesting there are distinct pathways between prospective and retrospective reporting with health outcomes, we also explored, for comparison, associations between prospective caregiver-reported and retrospective child-reported maltreatment exposure and the health outcomes described above.

Sensitivity Analysis

To determine how results might change if different cut-points were applied to define child-reported maltreatment from the frequency rating scales, we re-analyzed the primary analyses using a more conservative definition of child-reported exposure (maltreatment items occurring “often” or “very often”) (Sect. "Methods" of Supplemental Materials).
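The effect of moving the cut-point can be illustrated with a short sketch. The function name and example responses are hypothetical; the response labels follow the 5-point frequency scale described for the child questionnaire.

```python
SCALE = ["never", "rare", "sometimes", "often", "very often"]


def child_exposed(item_responses, cut_point="rare"):
    """True if any item is rated at or above the chosen cut-point.

    cut_point="rare"  -> primary definition (any response above "never")
    cut_point="often" -> stricter sensitivity definition ("often" or "very often")
    """
    threshold = SCALE.index(cut_point)
    return any(SCALE.index(r) >= threshold for r in item_responses)


# Hypothetical item responses for one participant (not ALSPAC data)
responses = ["never", "sometimes", "never"]
print(child_exposed(responses, cut_point="rare"))   # exposed under primary definition
print(child_exposed(responses, cut_point="often"))  # unexposed under stricter definition
```

A participant who rates items as "rare" or "sometimes" is therefore counted as exposed in the primary analysis but as unexposed in the sensitivity analysis.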

Results

Prevalence of Disagreement

The most common response combination from mother–partner pairs was agreement that their child had not been exposed to PM (93.5%, N = 5420) or EM (84.3%, N = 4891, Table 1). Pair-wise disagreement was rarer: in 3.1% of pairs (N = 182), mothers reported PM but partners did not; in 2.6% of pairs (N = 151), partners reported PM but mothers did not; in 7.2% of pairs (N = 418), mothers reported EM but partners did not; and in 5.7% of pairs (N = 328), partners reported EM but mothers did not. While κ coefficients showed only slight agreement between mothers and partners for PM (κ = 0.19) and fair agreement for EM (κ = 0.23), the PABAK values indicated substantial agreement by pairs on reports of PM (PABAK = 0.89) and EM (PABAK = 0.74, Table 1).
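The gap between κ and PABAK can be checked directly from the counts given in this paragraph. The sketch below uses the standard formulas for a binary rating by two reporters; the "both reported" cell is not stated in the text and is derived here by subtraction, on the assumption that the four cells sum to N = 5799. The function name is hypothetical.

```python
def agreement_stats(both_yes: int, a_only: int, b_only: int, both_no: int):
    """Cohen's kappa and PABAK for a 2x2 table of two reporters."""
    n = both_yes + a_only + b_only + both_no
    p_o = (both_yes + both_no) / n                      # observed agreement
    p_yes = ((both_yes + a_only) / n) * ((both_yes + b_only) / n)
    p_no = ((both_no + b_only) / n) * ((both_no + a_only) / n)
    p_e = p_yes + p_no                                  # chance agreement
    kappa = (p_o - p_e) / (1 - p_e)
    pabak = 2 * p_o - 1
    return kappa, pabak


N = 5799  # mother-partner pairs
# Assumption: the both-reported cell is N minus the three counts in the text.
pm_kappa, pm_pabak = agreement_stats(N - 5420 - 182 - 151, 182, 151, 5420)
em_kappa, em_pabak = agreement_stats(N - 4891 - 418 - 328, 418, 328, 4891)

print(f"PM: kappa = {pm_kappa:.2f}, PABAK = {pm_pabak:.2f}")  # PM: kappa = 0.19, PABAK = 0.89
print(f"EM: kappa = {em_kappa:.2f}, PABAK = {em_pabak:.2f}")  # EM: kappa = 0.23, PABAK = 0.74
```

Under that assumption, the computed values match those reported. Because most pairs agree that no maltreatment occurred, chance-expected agreement is already very high, which leaves κ little room to rise above zero.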

Table 1 Prevalence and agreement between prospective mother and partner reports of their child’s exposure to physical and emotional maltreatment behaviors (N = 5799)

Children retrospectively reported more maltreatment than their caregivers (Table 2). Pair-wise disagreement was more common between children and their caregivers than between mothers and partners. In 20.4% of pairs (N = 484), the child retrospectively reported PM but the caregivers did not; in 4.6% of pairs (N = 109), the caregivers reported PM but the child did not; in 27.8% of pairs (N = 659), the child retrospectively reported EM but the caregivers did not; and in 8.9% of pairs (N = 210), the caregivers reported EM but the child did not. Compared to mother–partner pairs, caregiver–child pairs had lower levels of agreement across both behaviors: PM (κ = 0.05; PABAK = 0.50); EM (κ = 0.08; PABAK = 0.27, Table 2).

Table 2 Prevalence and agreement between prospective caregiver (combining mother and partner) reports and retrospective child reports on child’s exposure to physical and emotional maltreatment behaviors (N = 2373)

Predictors of Disagreement

Given the high concordance of mother–partner reports, only results related to predictors of caregiver–child disagreement are presented. Of 16 variables tested, eight were associated with disagreement between reporters for PM (Table 3). Male sex [odds ratio (OR) = 1.69, 95% CI = 1.40–2.04], certain maternal education levels lower than a college degree (less than O-level: OR = 1.46, 95% CI = 1.07–1.99; A-level: OR = 1.48, 95% CI = 1.13–1.93), maternal postnatal depression (OR = 1.69, 95% CI = 1.20–2.37), caregiver history of childhood maltreatment (e.g., mother emotionally neglected: OR = 1.40, 95% CI = 1.11–1.76; mother physically neglected: OR = 2.40, 95% CI = 1.21–4.75; mother physically abused: OR = 1.56, 95% CI = 1.02–2.38; and partner emotionally neglected: OR = 1.42, 95% CI = 1.11–1.83), and lower scores on maternal bonding measurements (OR = 1.53, 95% CI = 1.18–1.98) were significantly associated with reporter PM disagreement, with caregivers tending to underreport PM exposure (Fig. 1). All scores less than the 4th quartile on maternal bonding measurements predicted caregiver–child EM exposure disagreement (1st quartile: OR = 1.28, 95% CI = 1.02–1.62; 2nd quartile: OR = 1.41, 95% CI = 1.10–1.82; 3rd quartile: OR = 1.28, 95% CI = 1.01–1.63).

Table 3 Predictors of disagreement between prospective caregiver reports and retrospective child reports on child’s exposure to physical (PM) and emotional (EM) maltreatment (N = 2373)
Fig. 1 Significant predictors of pair-wise disagreement between prospective caregiver reports and retrospective child reports on child’s exposure to physical maltreatment (N = 2373)

Consequences of Disagreement

Caregiver–child report disagreement on the child’s exposure to maltreatment was associated with a decreased risk of several health outcomes. Both PM and EM reporter disagreement was associated with a decreased risk of child’s self-reported history of lifetime depression (PM: OR = 0.60, 95% CI = 0.48–0.74; EM: OR = 0.64, 95% CI = 0.52–0.79) and clinically-assessed generalized anxiety disorder (PM: OR = 0.59, 95% CI = 0.42–0.84; EM: OR = 0.59, 95% CI = 0.42–0.82, Table 4). Reporter disagreement on EM exposure, but not PM exposure, was associated with a statistically significant decreased risk of other lifetime medical conditions (OR = 0.67, 95% CI = 0.54–0.83).

Table 4 Estimated associations of child-caregiver disagreement for reports of physical (PM) and emotional (EM) maltreatment on medical, mood, and behavioral health outcomes in the secondary sample (N = 2373)

In contrast, caregiver–child disagreement for both types of maltreatment was significantly associated with higher ratings on clinical assessments of alcohol abuse, cannabis abuse, nicotine dependence, and lifetime illicit drug use (Table 4). Caregiver–child disagreement for both types of maltreatment was also associated with higher ratings of self-reported current depressive symptoms (PM: β = 2.03, 95% CI = 1.53–2.53; EM: β = 1.89, 95% CI = 1.44–2.34). Caregiver–child disagreement on PM exposure was associated with higher BMI, but disagreement on either maltreatment type was otherwise not significantly associated with clinically assessed BMI or blood pressure measures.

Retrospective, but not prospective, reports of PM and EM exposure were significantly associated with increased risk of most health outcomes (Fig. S3).

Sensitivity Analysis

Fewer maltreatment cases were identified when using a stricter cut-point to define maltreatment (Table S5); PABAK measures of agreement, but not κ measures, improved (Table S6), and associations between disagreement and child health outcomes were generally the same but fewer results were statistically significant when maltreatment frequency was high (often/very often) (Fig. S2).

Discussion

This study examined the predictors of disagreement between different reporters of child maltreatment exposure and explored implications of reporter disagreement. Like other studies [33,34,35], we found that parent and child reports often disagree on maltreatment exposure; however, our results are comparable to the literature only when our data are interpreted using PABAK values. Our κ statistics were even lower, thus indicating worse agreement, than those of a prior meta-analysis (which reported κ = 0.19) [15]. In our sample, maltreatment exposure was rare; rare events contribute to the reliability paradox described earlier, in which κ coefficients show poor agreement while PABAK values indicate moderate to substantial agreement between reporters. In these instances, PABAK provides a clearer picture of agreement between reporters. Our results may reflect a higher percentage of children never exposed to maltreatment, causing our κ values to be artificially decreased. Considering these findings, future studies should include multiple methods for assessing agreement (κ and PABAK) to account for the possible impacts of sample size and prevalence.

We identified correlates of self- and parent-reported maltreatment experiences of children that predicted reporting disagreement, and these associations provide actionable new insights to improve maltreatment measurement and the identification of high-risk children. While prior studies suggest maternal postnatal depression and caregivers’ own history of maltreatment may increase risk of early-life child maltreatment, our findings suggest these factors may also influence reporting of maltreatment [36, 37]. Caregivers’ social desirability or changes of societal definitions of maltreatment over generations may explain these results, exacerbating potential disagreement between multi-generational reporters. Future studies may adjust for possible underreporting by caregivers of their child’s exposure to maltreatment based on these characteristics. Furthermore, screening measures for childhood maltreatment, especially those with caregiver reporters, may also want to evaluate these characteristics to identify high-risk children. In future studies of child maltreatment, the role of disagreement in both retrospective and prospective reports of children’s maltreatment experiences should be considered.

Consistent with existing literature, we found that retrospective reports were more strongly associated with health outcomes than prospective reports [16]. In our analysis, disagreement between reporters was associated with a decreased risk of lifetime depression, anxiety, and poor general health. A possible explanation for why disagreement was associated with decreased risk could be that caregivers may be less likely to acknowledge health problems in their child, thus decreasing the chances their child received healthcare or a diagnosis when showing signs of distress. Another possible explanation is that the typical age of onset for common mental health disorders is after age 22 to 24 when health outcomes were assessed in the ALSPAC study, and as such, studying the mental health consequences of maltreatment reporting disagreement may require longer-term follow-up [38]. Disagreement in maltreatment reports was more common in our analysis among male children, who are less likely to exhibit internalizing symptoms in response to trauma than their female counterparts and more likely to express externalizing symptoms and substance use behaviors [39].

Although disagreement in reporting was associated with decreased risk for certain mental health diagnoses and overall poor health, we also observed that disagreement was associated with more self-reported and clinically assessed health problems, including elevated self-reported depressive symptoms, substance use severity, and BMI. Given the association of disagreement with depressive symptoms, but not with a depression or anxiety diagnosis, disagreement may be a risk factor for unmet mental health need or future mental health risk. Findings may also reflect sex differences in behavioral symptoms among children who experience maltreatment [39].

Limitations

Several limitations should be considered when interpreting these results. First, our study examined maltreatment exposure before age 11; thus, our results may not be generalizable to maltreatment occurring in later childhood and adolescence. Second, differences in maltreatment survey depth could have biased our results towards increased disagreement. Although mothers and partners identified the perpetrator of maltreatment, our definition of agreement was not based on the identity of the perpetrator. Therefore, a mother and partner pair could be coded as in agreement on the child’s exposure to maltreatment while disagreeing on who was responsible. Likewise, disagreement between caregiver–child pairs could be due to the child being maltreated by another adult in the family, although perpetrators of maltreatment are most often a child’s parent [40]. Finally, several types of bias, including memory, recall, and reporting bias, may have affected how ALSPAC participants responded to surveys. Future studies should focus on the way maltreatment is screened and how the identity of the perpetrator of maltreatment influences multi-informant disagreement.

Summary

Our study found that retrospective child reports and prospective caregiver reports of child maltreatment often disagree, and caution should be taken in using these reports interchangeably. We identified predictors of disagreement in reporting including child male sex, maternal education levels lower than a college degree, maternal postnatal depression, caregiver history of childhood maltreatment, and impaired maternal bonding. Further research is needed to understand what factors drive disagreement and how to optimize child maltreatment assessment in clinical practice accounting for the perspectives of both caregivers and children themselves. Disagreement between reporters may be important to consider when exploring the mechanisms underlying the connection between child maltreatment, poor health outcomes, and type of report, as well as possible unmet need for mental health evaluation.