Introduction

Physical child abuse is detrimental to child well-being and has adverse consequences throughout life [1,2,3,4,5]. Successful prevention necessitates proper risk assessment and early intervention, and a number of tools exist for early clinical detection [6,7,8]. Knowledge of causes and risk factors for physical child abuse is used to develop such interventions. Despite the often-cited causal assumption that physical child abuse occurs when family resources are overwhelmed by stressors [9], to the best of our knowledge, causal inference techniques and theory [10] have been used only sparsely in the field of physical child abuse [11], although their use has been encouraged [12]. As improvements in preventive efforts are called for [13], research is warranted to define which risk factors are most pertinent, including parental factors that are likely to induce stress in the whole family. Physical disease in adulthood is known to cause perceived individual stress [14]. However, whereas psychiatric disease among parents has previously been shown to increase the risk of lethal physical child abuse [15, 16], parental physical disease has scarcely been studied as a risk factor. Within the highly diverse construct of physical disease, only a few definitions have been explored. Chang et al. found an association with child maltreatment, but they did not ensure that parental physical disease preceded child maltreatment temporally [17]. Household AIDS, but not other household chronic illness, predicted a high intensity of abuse [18]. “Poor maternal health” was associated with physical child abuse but was not further defined, and the temporal separation of parental health and child abuse was not clear [19]. Thus, although there are indications in the literature that parental physical health is a predictor of subsequent physical abuse, the temporality, robustness, and dose–response relationship of the association between parental physical disease and physical child abuse remain unclear.

This study aims to explore the causal link between parental physical disease severity and severe documented physical child abuse. We hypothesize that parental physical disease severity, measured using a joint Charlson score [20] and preceding reported first incidents of abuse, will indicate the risk of severe physical child abuse documented in official registries.

Methods

Data and design

All public authorities in Denmark use the same unique personal identifier across registries [21]. This identifier was used to create a prospective dataset by linking administrative data on all children living in Denmark from the 1st of January 1997 until the 31st of December 2018 and their legal parents. The dataset was updated at monthly intervals and built on an existing data framework [22]. The year 1997 was chosen because some of the data sources used were initiated in this year, and 2018 was the latest available year, maximizing available data for a rare outcome. Children lacking both parents were censored from the first time point at which this occurred. As the personal identifier is a prerequisite for citizenship and many public services, the coverage in the population is assumed to be practically complete. This study was pre-registered; the pre-registration is available along with changes since pre-registration and the project code [23]. Access to data was granted by Statistics Denmark, and this study was exempt from ethics approval according to Danish law, as it only used administrative data.

Exposure—parental physical disease severity

Parental physical disease severity was modeled as the sum of both parents’ Charlson Comorbidity Index scores [20], as a proxy for joint parental stress from physical disease severity. Data for the index were drawn from the Danish National Patient Register. The Charlson Comorbidity Index is known as a reliable index of long-term mortality and is widely used in clinical settings [24]. ICD-8 and ICD-10 diagnoses that increased the Charlson Comorbidity Index [25] for an individual parent were assigned points according to the original definition [20]. For any given month in the dataset, all diagnoses available from the preceding 5 years [25] were included in the calculation, and the score was updated in this way until physical child abuse or censoring. A full description is provided in Table 1.
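As a minimal sketch of how such a joint score could be computed (this is not the study's code, which is available through the pre-registration [23]), the Python example below sums each parent's Charlson points over a 5-year lookback window. The category names, the helper functions, and the rule of counting each disease category once per parent are assumptions made for illustration. The weights are a small subset of the original index, and the mapping from ICD codes to categories is left out.

```python
from datetime import date

# Illustrative subset of weights from the original Charlson index.
# The full scoring used in the study is given in Table 1.
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "diabetes": 1,
    "any_tumor": 2,
    "moderate_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6,
}


def years_before(day, years):
    """Return the date `years` years before `day` (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def charlson_score(diagnoses, as_of, lookback_years=5):
    """Score one parent from (category, diagnosis_date) pairs.

    Assumption: each category counts once, and only diagnoses inside the
    lookback window ending at `as_of` contribute.
    """
    window_start = years_before(as_of, lookback_years)
    categories = {
        category
        for category, diagnosis_date in diagnoses
        if window_start <= diagnosis_date <= as_of
    }
    return sum(CHARLSON_WEIGHTS[category] for category in categories)


def joint_charlson_score(parents_diagnoses, as_of):
    """Sum the scores of all legal parents (a single parent contributes alone)."""
    return sum(charlson_score(diagnoses, as_of) for diagnoses in parents_diagnoses)
```

Under these assumptions, a mother with a myocardial infarction inside the window and a father with diabetes would give a joint score of 2, which corresponds to the exposure threshold used in model 1.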

Table 1 Scoring of Charlson Comorbidity Index

Outcome variable

Severe documented physical child abuse was identified through three sources: the Danish National Patient Register [26], the Cause of Death Register [27], and the Danish Criminal Register. All models in this study focused on first-time events. All children with events before the inclusion date were excluded. Severe documented physical child abuse was operationalized as abuse that resulted in contact with a hospital, including the emergency department or any other department, coded as described below. The outcome also included death of the child resulting from non-accidental injuries and court sentences of the perpetrator for violence according to police codes. Coding for documented physical abuse included all codes recommended by the Danish Paediatric Society [28], and all codes that specified physical violence (not accidents or uncertain intent) towards any individual younger than 18 years of age. For details on codes and diagnoses used, see pre-registration and associated resources [23].
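The first-event logic described above can be sketched as follows. This is an illustrative Python example, not the study's implementation: the function names, argument names, and status labels are hypothetical, and each registry is reduced to a plain list of event dates that have already been filtered to the relevant codes.

```python
from datetime import date


def first_documented_event(hospital_contacts=(), death_records=(), court_sentences=()):
    """Return the earliest event date across the three sources, or None."""
    all_dates = [*hospital_contacts, *death_records, *court_sentences]
    return min(all_dates) if all_dates else None


def classify_child(inclusion_date, **sources):
    """Classify a child by the timing of the first documented event.

    Children whose first event precedes the inclusion date are excluded,
    because the models focus on first-time events.
    """
    first = first_documented_event(**sources)
    if first is None:
        return ("no_event", None)
    if first < inclusion_date:
        return ("excluded", first)
    return ("first_event", first)
```

Taking the minimum over all sources means that a child recorded in more than one registry contributes only one event, dated at the earliest record.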

Covariates

Variables were chosen based on two Directed Acyclic Graphs [10], printed here as Fig. 1. They were drawn with exposure and outcome in mind, and based on prior literature, see Table 2. The link between ethnicity and physical child abuse was based on prior literature, showing differing risks for physical child abuse after adjusting for covariates [29,30,31,32]. Except where otherwise noted, variables were programmed as defined in the pre-registration. Table 2 provides references to further descriptions of the registries used.

Fig. 1

These directed acyclic graphs depict the causal pathways assumed in our models. All variables except physical environment are adjusted for; income is used as a proxy for occupation. For models 1a and 1b, the variables are measured at baseline and held constant during risk time, reflecting an assumption that the variables after baseline may be mediators. For model 2, the exposure is allowed to affect covariates, which in turn affect the exposure. For simplicity, only time points T0, T1, and T2 are shown here.

Table 2 List of variables, their study definitions, sources, and levels

Analysis

Model 1 assumed that the first change in parental physical disease severity could have downstream consequences on health and covariates. This was modeled as a survival model comparing exposed with unexposed children and analyzed by regression on pseudoobservations to estimate risk ratios [54]. Exposed children had a joint parental Charlson score of two or more. Single parents were included with their own Charlson score. The index date was the date when the joint parental Charlson score reached two or more, or the child’s birth date, whichever came last. Each exposed child was matched with five children with a joint parental Charlson score of less than two, matching on index date (matches drawn within 3 months), reconstituted family (living with biological parent(s), living with one or more unrelated adults, adopted or in foster care), child age (within 1 year), number of children (according to categories), and mean parental age (difference less than or equal to 5 years). The pseudoobservation approach assumes that censoring in the dataset is marginally independent of the covariates. This was checked by fitting a Cox model of the covariates with censoring as the dependent variable. Calendar year was not marginally independent of censoring; to adjust for this, pseudoobservations were generated after stratifying on calendar year group. Model 1a presents the model as specified a priori. The influence of collinearity and small cells resulting from the many covariates was addressed by removing candidate variables with collinearity and skewed distributions, shown as model 1b. The linearity of continuous variables was checked by introducing their quadratic terms in the model. Only cases with information on all variables were analyzed. For the distribution across covariates, see Online Resource 1. The number of missing values across variables can be seen in Table 3.
The same child could be included in the analysis as both exposed and non-exposed; to adjust for this, Eicker-Huber-White estimations were used for standard errors [55]. As siblings could be included, clustering was adjusted for at the family level, where each unique combination of parents, including single parents, was counted as a cluster. Children were followed from index date to outcome, emigration, death, age 18 years or the first of January 2018, whichever came first.
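To illustrate what a pseudoobservation is, the following Python sketch computes jackknife pseudo-values for the cumulative risk at a fixed time point: n times the full-sample estimate minus (n − 1) times the leave-one-out estimate. It is a simplified illustration, not the procedure of [54] as implemented in the study. It assumes the cumulative risk is estimated as one minus the Kaplan-Meier estimate, and it ignores competing events, matching, and the stratification on calendar year group. All function names are hypothetical.

```python
def km_cumulative_risk(times, events, t):
    """One minus the Kaplan-Meier survival estimate at time t.

    times:  follow-up time per child
    events: True if follow-up ended with the outcome, False if censored
    """
    survival = 1.0
    event_times = sorted({ti for ti, ei in zip(times, events) if ei and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        n_events = sum(1 for ti, ei in zip(times, events) if ei and ti == u)
        survival *= 1 - n_events / at_risk
    return 1 - survival


def pseudo_observations(times, events, t):
    """Jackknife pseudo-values: n * full estimate - (n - 1) * leave-one-out estimate."""
    n = len(times)
    full = km_cumulative_risk(times, events, t)
    values = []
    for i in range(n):
        loo_times = times[:i] + times[i + 1:]
        loo_events = events[:i] + events[i + 1:]
        leave_one_out = km_cumulative_risk(loo_times, loo_events, t)
        values.append(n * full - (n - 1) * leave_one_out)
    return values
```

Without censoring, each pseudo-value reduces to the indicator of having had the event by time t, which is why pseudo-values can stand in for the partly unobserved outcome in a regression model. The double loop is quadratic in the number of children and is meant only to show the definition.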

Table 3 Baseline characteristics of the source population according to severity of parental physical disease

Model 2, in Fig. 1, assumed that family physical disease severity could influence and be influenced by covariates during the child’s contributed risk time. This was modeled as a G-model [10]. For this model, imputation was performed [56], and analyses were carried out on a reduced dataset with half-year intervals and a random draw of 50,000 children; see Online Resource 2. Imputation convergence and the distribution of the resulting variables were checked with plots of iteration means and standard deviations, density plots, and box plots of distributions [57]. For the categorical and binary variables, the distribution of imputed values was compared to the observed distribution, following an approach inspired by Harrell [58] (results not shown).
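The idea behind the G-model can be illustrated with its simplest special case, standardization for a single baseline exposure and one categorical confounder. The Python sketch below is only that special case and makes no claim about the study's implementation, which handled time-varying exposure and covariates over half-year intervals. The record layout and function names are assumptions for illustration.

```python
from collections import defaultdict


def standardized_risk(records, exposure_level):
    """Risk if everyone had `exposure_level`, standardized to the whole sample.

    records: iterable of (confounder, exposed, outcome) with exposed and
    outcome coded 0/1. Computes sum over l of P(L = l) * P(Y = 1 | A = a, L = l).
    """
    records = list(records)
    n_total = len(records)
    stratum_size = defaultdict(int)
    cell_size = defaultdict(int)
    cell_events = defaultdict(int)
    for confounder, exposed, outcome in records:
        stratum_size[confounder] += 1
        if exposed == exposure_level:
            cell_size[confounder] += 1
            cell_events[confounder] += outcome
    risk = 0.0
    for confounder, size in stratum_size.items():
        if cell_size[confounder] == 0:
            raise ValueError("no children with this exposure level in a stratum")
        risk += (size / n_total) * (cell_events[confounder] / cell_size[confounder])
    return risk


def standardized_risk_ratio(records):
    """Ratio of the standardized risks under exposure and under no exposure."""
    return standardized_risk(records, 1) / standardized_risk(records, 0)
```

In a constructed dataset where the confounder raises both the probability of exposure and the risk of the outcome, the crude risk ratio exceeds 1 while the standardized ratio can equal 1, which is the kind of distinction the adjusted models are intended to capture.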

Results

The characteristics of the full population, from which the samples for model 1 and model 2 were drawn, are shown in Table 3.

Model 1a and 1b: survival model

Data from 1,160,840 children were used, 43% of the original cohort, and after removing those who could not be matched with five unexposed controls, 1,160,529 children remained. The average follow-up time for these children was 12.22 years, and the total follow-up time was 14,184,982 years. In this dataset, 28,190 children were registered as having been exposed to documented physical abuse for the first time between 1997 and 2018, including 29 cases of lethal abuse. Among the 1,160,529 children, deaths with no known relation to physical abuse during the same period totaled 1,097. Model 1a showed a relative risk of 0.99 (0.93–1.05) for children with a parental joint Charlson score of two or more, and model 1b showed a relative risk of 1.02 (0.96–1.08). As the Directed Acyclic Graph was designed with parental physical disease severity in mind, the other estimates in Table 4 cannot be interpreted causally but are reported for comparison with the existing literature. Having a parent with tertiary-level education or higher predicted less abuse, with a relative risk of 0.53 (0.50–0.57) in model 1a and 0.53 (0.49–0.56) in model 1b compared to primary education. Parental history of childhood maltreatment or neglect showed a relative risk of 2.25 (2.10–2.40) in model 1a and 2.30 (2.16–2.46) in model 1b. See Table 4 for further information.
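The follow-up figures above can be checked with simple arithmetic. The short Python sketch below recomputes the average follow-up from the reported totals and derives a crude event rate per 1,000 person-years. The crude rate is not reported in the text; it is an unadjusted, purely descriptive quantity derived here for orientation, and it should not be compared with the adjusted relative risks.

```python
# Figures reported for the matched dataset used in models 1a and 1b.
children = 1_160_529
person_years = 14_184_982
first_events = 28_190

# Average follow-up per child, in years.
mean_follow_up_years = person_years / children

# Crude (unadjusted) rate of first documented events per 1,000 person-years.
crude_rate_per_1000_py = 1000 * first_events / person_years

print(f"Mean follow-up: {mean_follow_up_years:.2f} years")
print(f"Crude rate: {crude_rate_per_1000_py:.2f} per 1,000 person-years")
```

The first figure reproduces the reported average of 12.22 years, and the crude rate comes to roughly 2 first events per 1,000 person-years.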

Table 4 Results from survival models 1a and 1b

Model 2: G-model

Plots of imputed data compared to observed data showed plausible results for all five datasets. The G-model on a random extract of 50,000 children from the main dataset showed that parental joint Charlson score of two or more had a relative risk of 1.08 for documented child abuse compared to unexposed children, adjusting for the same variables as included in model 1 with the exception of interparental violence, parental substance abuse, and status as refugee, as the model could not converge with the full set of variables.

Sensitivity analyses

Because of computational intensity, all model checks and sensitivity analyses were carried out on variations of model 1, except for the check of pseudoobservations, which was done in a Cox model. Model results are not shown. The quadratic terms were significantly associated with the outcome, but their introduction did not influence the main result. A number of separate models showed essentially unchanged results (data not shown): excluding parents with a registered death before the end of the study period, excluding child deaths, stratifying on children younger than and older than 7 years, changing the joint Charlson score cutoff to 4 and to 8 (two different models), excluding immigrants and emigrants, and stratifying on children living with single parents and children living in other family arrangements. A model exchanging parental physical disease with psychiatric disease, leaving out parental physical disease, showed an RR of 1.41 (1.30–1.52). Part of the outcome was drawn from the Victims of Criminal Offences registry, which is only available from after 2000; when the data were restricted to this time frame, results were essentially unchanged (data not shown).

Discussion

This study aimed to explore the link between parental physical disease severity and severe documented physical child abuse. The results from all models correspond closely and do not indicate a causal connection. To the best of our knowledge, there are no similar longitudinal results in the literature for comparison. However, given the methodological challenges of preceding studies mentioned in the introduction, this result is not in opposition to earlier findings. The associations estimated for the covariates in model 1a do not allow causal inference, but they were in line with previous literature, including the negative association with the highest education level in the family [32] and the positive association with parental maltreatment in childhood [43]. The sensitivity analysis exchanging the exposure with parental psychiatric disease showed an association between parental psychiatric disease and the outcome, in line with earlier studies [15, 16].

Strengths and weaknesses

To the best of our knowledge, no study before this has tested a causal hypothesis on parental physical disease and severe documented physical child abuse. Models and all definitions of variables were based on prior literature, and their associations were illustrated in a pre-specified directed acyclic graph. Also, no prior study on parental physical disease has tested disease severity as a risk factor. This study uses data from health registries and from police cases confirmed by the courts. Data from social services were not available on a national level in sufficient detail to be included in this study. This is likely to limit cases to those severe enough to warrant either hospital treatment, sufficient evidence for a court trial, or both. Consequently, our results may not be generalizable to the large number of cases of physical child abuse that do not reach the attention of health care or law enforcement. Because of the computational intensity of the models, the number of children included was limited by available machine power, even on a server with 1 terabyte of working memory. For the same reason, as imputation in the full dataset was not feasible, only cases with information on all variables were used in model 1, thereby deviating from the pre-registration. The number of missing values across variables was substantial. Assuming that low socioeconomic status predicts an increased risk of missing variables [59], parental physical disease [60], and physical child abuse [29], bias could be introduced. However, this would be accounted for by adjusting for socioeconomic status. As we adjust for both level of education and income, the influence of this bias is expected to be small. Additionally, the two models were based on two different data excerpts, one of which used imputation for missing values, and the essentially unchanged results across them do not indicate a major influence of missing values.
This also indicates a minimal risk of major differences when using a larger dataset or computer. The population-level data on place of residence proved to be unreliable, to the point that the number of adults in each family could not be reliably determined, and this was a deviation from the pre-registration. However, the reconstituted family variable made a useful replacement and is assumed to explain at least part of the same variation. Who lives with the child is of more importance than how many adults are in the household. The data specified whether a child lived with both biological parents, with a step-parent, or was adopted or in foster care. Living with a single parent may have influenced the joint Charlson score. However, a sensitivity analysis stratifying the dataset on single parents and children living in other family constitutions showed essentially unchanged results. The G-model includes a large number of model specifications, thus increasing the risk of mis-specifications. Nonetheless, it is reassuring that the G-model and the regression based on pseudoobservations reach highly similar conclusions, despite very different assumptions. The median child age at index date is 0.0 years in the source population (Table 3), 0.4 years in the random draw from the source population (Online Resource 2), and 10.3 years in the data for model 1 (Online Resource 1). The age differences between samples call into question the comparability of models 1 and 2. It is reassuring that stratifying model 1 into children younger than and older than 7 years gives essentially unchanged results in both strata, and that the results in models 1a, 1b, and 2 are similar. Confidence intervals for the G-model were intended to be derived from bootstrapping by family unit to adjust for similarities among siblings. However, because of the reduced sample size of 50,000 and only 1.7% of families with joint Charlson scores above 2, the G-model did not converge reliably using the bootstrapped samples.
Therefore, model 2 was presented without confidence intervals. For both models 1 and 2, the few families with a joint Charlson score above 2 meant a small group of exposed children, indicating that the Charlson score might not appropriately describe disease in the parental population.

The study controlled for socioeconomic indicators and age and included all kinds of citizens in Denmark, and its results can thus be generalized to similar populations with some degree of certainty. A note of caution applies to societies without universal health care, as it seems likely that disease in such a society may affect the family stress level differently than in this cohort. Access to affordable health care has been shown to affect levels of maltreatment, although not physical abuse [61]. Another issue is that data from administrative sources largely underestimate the incidence of physical child abuse [62]. Thus, misclassification of outcomes as non-abuse could result in a weakening of any associations found in the present study. As there is a risk that the underlying populations differ, our results may not be generalizable to cases of physical child abuse unknown to authorities or, as mentioned above, known only to child protective services. We had access to data on the entire child population living permanently in Denmark up until 2018 in this study, and in models 1a and 1b, we used substantial samples of this population. We did not, however, use a population correction factor to correct our confidence intervals. This was because we regarded the population studied as a sample of a child population living with universal health care, which is a limited population but one that extends beyond Danish children. Using a population correction factor on models 1a and 1b would have narrowed their confidence intervals but would likely not have changed their interpretation, as both point estimates are close to 1 (0.99 and 1.02). Model 2 was run on only a subsample of 50,000 children and would therefore not justify the use of a population correction factor.
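To give a sense of scale for the narrowing mentioned above, the sketch below applies the standard finite population correction, sqrt((N − n) / (N − 1)), to a risk ratio interval on the log scale. This is a hedged illustration rather than an analysis from the study. It assumes the reported intervals are symmetric on the log scale with a normal quantile of 1.96, and the function names are hypothetical. A cohort size of about 2.7 million, used in the note below the code, is a rough back-calculation from the statement that 1,160,840 children made up 43% of the original cohort.

```python
import math


def finite_population_correction(population_size, sample_size):
    """Standard correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))


def corrected_rr_interval(rr, lower, upper, factor, z=1.96):
    """Shrink a risk ratio interval by `factor` on the log scale.

    Assumption: the original interval is rr * exp(+/- z * SE), so the
    standard error can be recovered from the interval width.
    """
    standard_error = (math.log(upper) - math.log(lower)) / (2 * z)
    half_width = z * standard_error * factor
    return (
        math.exp(math.log(rr) - half_width),
        math.exp(math.log(rr) + half_width),
    )
```

With the assumed cohort size of 2,699,628 and the 1,160,529 children in model 1, the factor is about 0.76. Applied to the model 1a estimate of 0.99 (0.93–1.05), the interval narrows but still includes 1, consistent with the statement that the interpretation would likely not change.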

Interpretation

The results fail to reject the null hypothesis. This could indicate that although parents may be stressed by documented physical disease [14], this stress does not result in an increased risk of severe documented physical child abuse. It is, however, also possible that specific disease entities are associated with documented physical abuse in line with the hypothesis, but that any signal from these is muted by the large heterogeneity of the diseases included in the Charlson index. Different categories of disease would have to be studied separately to uncover this. In addition, a definition of physical disease that emphasizes broader disease categories rather than severity might describe this population better and show different results, as high Charlson scores are rare in this population, affecting only 1.7% of families. Furthermore, levels of health-related stress may differ in families living in countries without universal health care, and results may not replicate in such settings. If longitudinal results differ in such countries, our results might indicate that access to health care in families with parental physical disease is protective against physical child abuse.

Conclusion

The present data showed no link between parental physical disease and an increased risk of severe documented physical child abuse. A possible explanation is that previous studies have not modeled disease severity, categories, and covariates in a similar cohort. Another explanation could be that signals from any disease categories consistent with the hypothesis are muted by disease heterogeneity. Further research should explore individual disease categories and seek to confirm the results in countries with non-universal health care and in datasets including cases of physical abuse unknown to authorities, as the results might differ in these populations. These results should also be confirmed in data with self-report of physical child abuse.