This study was performed within the Commonwealth of Pennsylvania (PA) MIECHV evaluation. The study followed a partially mixed, concurrent, equal status design, in which qualitative and quantitative data were analyzed separately and mixed at the stage of interpretation (Leech and Onwuegbuzie 2009). The study was approved by PA’s Department of Human Services with human subjects approval by the Children’s Hospital of Philadelphia’s Institutional Review Board.
Data were obtained for clients enrolled in MIECHV-funded PA Nurse–Family Partnership (NFP, n = 22), Parents as Teachers (PAT, n = 9), or Early Head Start (EHS, n = 7) programs from 2008 to 2014. Clients were matched to local-area non-client women (comparisons) who (1) had similarly aged children identified in birth certificate files and (2) resided in the same local implementing agency catchment area (i.e., county or multi-county service area). Inclusion criteria for clients and comparisons were as follows: (1) the child affiliated with MIECHV program enrollment was identifiable in PA birth certificate files and (2) the child affiliated with MIECHV program enrollment was enrolled in the state medical assistance program (Medicaid) during the outcome observation period.
Clients and potential comparisons were identified in a multisource administrative data file linked using an iterative deterministic approach reliant on unique identifiers constructed from social security numbers, names, and dates of birth that included program enrollment, vital statistics (birth and death), welfare eligibility, and medical assistance claim files (Dusetzina et al. 2014).
The primary analysis examined if the prevalence and rate of child abuse episodes significantly differed between program clients and comparison women for NFP, EHS, and PAT programs separately. Two primary quasi-experimental methods were used for causal inference related to program effect on child abuse: entropy balancing for NFP and propensity score matching (PSM) for EHS and PAT. Both analytic approaches are widely used for obtaining covariate balance in observational data, but neither approach was suitable for all three program model analyses for the following reasons.
First, PSM has a disadvantage of dropping subjects unable to be matched to counterparts, which creates a biased sample. In the case of this study, PSM did not retain a generalizable subset of the clients in the NFP analysis (specifically, young mothers in rural areas were disproportionately dropped in attempted PSM). Entropy balancing retained all cases and was the approach used for NFP. While PSM was not the optimal approach for NFP, it was chosen for the EHS and PAT analyses because it allowed for standardized follow-up time in the outcome observation windows of matched sets of clients and comparisons. This is a critical analytic design feature for programs without standardized enrollment at a point in time. Unlike NFP, which uniformly enrolls clients into the program prior to a child’s birth, EHS and PAT programs do not uniformly enroll at a particular age. Therefore, for each client, PSM allowed for the identification of comparisons with similarly aged children at the time of program enrollment. The analysis then standardized observation periods for outcome ascertainment within matched sets of clients and comparisons using the client’s child’s age of enrollment and length of time in the program as the reference point (e.g., if client enrolled child at 3 months and was observed through month 27, all comparison children for that client are observed for months 3 through 27). This level of modeling flexibility is not possible with entropy balancing, but was also not necessary for NFP given the requirement of prenatal enrollment, which serves as a standardization (i.e., all client and comparison children begin observation at birth).
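The alignment of observation periods within a matched set can be illustrated with a minimal Python sketch. The names `Window` and `align_matched_set` are hypothetical and were introduced only for illustration; the sketch assumes ages are tracked in whole months and reproduces the example above (enrollment at 3 months, observation through month 27).

```python
from typing import Dict, List, NamedTuple


class Window(NamedTuple):
    start_month: int  # child's age (months) when outcome observation starts
    end_month: int    # child's age (months) when outcome observation ends


def align_matched_set(client_enroll_age: int,
                      client_last_observed_age: int,
                      comparison_ids: List[str]) -> Dict[str, Window]:
    """Give every comparison child in a matched set the client's age window."""
    if client_last_observed_age < client_enroll_age:
        raise ValueError("observation cannot end before enrollment")
    window = Window(client_enroll_age, client_last_observed_age)
    return {comp_id: window for comp_id in comparison_ids}


# Client enrolled the child at 3 months and was observed through month 27,
# so both matched comparison children are observed for months 3 through 27.
windows = align_matched_set(3, 27, ["comp_a", "comp_b"])
```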
Both entropy balancing and PSM were performed within local implementing agency catchment areas (Matone et al. 2012) to address the possibility that there is confounding by geography (i.e., the outcomes might vary across sites at a community level beyond maternal-level characteristics). Catchments included each implementing agency’s county and contiguous counties. Clients enrolled in a program in a particular catchment were matched to comparison women living in the same catchment. Entropy balancing and PSM were performed within catchments, and then the samples were aggregated.
Description of PSM for EHS and PAT
PSM is a matching technique for observational data that mimics a randomized controlled trial by creating pairs (or sets) of clients and comparison women with similar values of the propensity score (Stuart 2010). Multivariable logistic regression models estimated the probability of program participation using available maternal sociodemographic and clinical characteristics. From birth certificates: mother’s age at birth (continuous), race/ethnicity (white/black/Hispanic/other), maternal education (< high school/high school or greater), gestational age (continuous), smoking prior to pregnancy (y/n). From welfare eligibility files: receipt of Temporary Assistance for Needy Families (TANF) or supplemental nutrition program prior to or during the first trimester of pregnancy (y/n). From medical assistance claims: Medicaid eligibility (y/n), maternal diagnosis of substance abuse, depression and/or bipolar disorder in the immediate preconception period or first trimester of pregnancy (y/n). A separate logistic regression model was fit within each of the catchment areas for each local implementing agency. Our matching approach used both caliper and exact matching on covariates to produce matched sets. Any nearest neighbor within a caliper of 0.05 was considered a match (up to a maximum of four comparison women per client). Matching was conducted exactly on catchment area, infant year of birth, and maternal age (< 18 years of age at birth or 18 and older). A threshold of 2.5 absolute percentage points was used to determine balance within each catchment area model. Interaction terms were added to the propensity score model when needed to achieve balance. Analytic weights were developed within matched sets; each comparison woman was given a weight equal to the inverse of the number of comparison women matched to that client and each client was given a weight of 1. These weights were applied to outcome modeling.
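The matching and weighting rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MatchIt routine used in the study: propensity scores are taken as already estimated, the 0.05 caliper is assumed to apply on the raw probability scale, comparisons are assumed to be reusable across clients, and the function names and record fields (`id`, `ps`, `stratum`) are hypothetical.

```python
from collections import defaultdict


def match_within_caliper(clients, comparisons, caliper=0.05, max_matches=4):
    """Nearest-neighbour matching within a caliper, inside exact-match strata.

    Each record is a dict with 'id', 'ps' (propensity score) and 'stratum'
    (here: catchment area, infant birth year, maternal age group).
    Returns {client_id: [comparison_id, ...]}.
    """
    pool = defaultdict(list)
    for comp in comparisons:
        pool[comp["stratum"]].append(comp)

    matched_sets = {}
    for client in clients:
        candidates = [
            (abs(comp["ps"] - client["ps"]), comp["id"])
            for comp in pool.get(client["stratum"], [])
        ]
        nearest = sorted(c for c in candidates if c[0] <= caliper)[:max_matches]
        if nearest:  # clients with no comparison inside the caliper are dropped
            matched_sets[client["id"]] = [comp_id for _, comp_id in nearest]
    return matched_sets


def analytic_weights(matched_sets):
    """Rows of (set_id, member_id, role, weight): clients 1, comparisons 1/k."""
    rows = []
    for client_id, comp_ids in matched_sets.items():
        rows.append((client_id, client_id, "client", 1.0))
        for comp_id in comp_ids:
            rows.append((client_id, comp_id, "comparison", 1.0 / len(comp_ids)))
    return rows


stratum = ("catchment_1", 2010, "18+")
clients = [{"id": "A", "ps": 0.30, "stratum": stratum}]
comparisons = [
    {"id": "c1", "ps": 0.31, "stratum": stratum},
    {"id": "c2", "ps": 0.33, "stratum": stratum},
    {"id": "c3", "ps": 0.40, "stratum": stratum},                    # outside caliper
    {"id": "c4", "ps": 0.30, "stratum": ("catchment_2", 2010, "18+")},  # wrong stratum
]
sets = match_within_caliper(clients, comparisons)
weights = analytic_weights(sets)
```

In this toy example, client A retains two comparisons, each of which receives a weight of 0.5.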
Description of Entropy Balancing for NFP
Entropy balancing is a multivariable weighting technique that creates a balanced sample by reweighting the control group (in this case the comparison women) to have the same covariate distribution as the treatment group (i.e., the clients) using the above described maternal sociodemographic and clinical characteristics. In this approach, specifications for each covariate can be applied as to whether exact balance between the two groups should be achieved on the first moment (mean), second moment (variance), or higher moments (Hainmueller and Xu 2013). As is intended with this methodology, balance between the samples is created automatically by entropy balancing, so no additional balance checks or model adjustments to create balance were necessary. Covariates used in entropy balancing were the same as those included in PSM.
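To make the reweighting idea concrete, the following is a minimal sketch of entropy balancing on first moments only, written in plain Python rather than with Stata's ebalance. It assumes uniform base weights, solves the convex dual problem by gradient descent with a backtracking line search, and requires the target means to lie inside the range spanned by the comparison group. The function name, tolerance, and toy data are assumptions made for illustration.

```python
import math


def entropy_balance(controls, target_means, tol=1e-6, max_iter=1000):
    """Weights (summing to 1) over control rows whose weighted covariate
    means equal target_means (first-moment balance only)."""
    k, n = len(target_means), len(controls)
    # Scale each covariate by its control-group SD to keep the problem well
    # conditioned; rescaling does not change the solution weights.
    scales = []
    for j in range(k):
        col = [row[j] for row in controls]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        scales.append(sd if sd > 0 else 1.0)
    z = [[(x - m) / s for x, m, s in zip(row, target_means, scales)]
         for row in controls]

    def solve(lam):
        # Weights are proportional to exp(lam . z); loss is the log-sum-exp dual.
        scores = [sum(l * v for l, v in zip(lam, row)) for row in z]
        top = max(scores)
        expo = [math.exp(s - top) for s in scores]
        total = sum(expo)
        return [e / total for e in expo], top + math.log(total)

    lam = [0.0] * k
    weights, loss = solve(lam)
    for _ in range(max_iter):
        # Gradient of the dual = weighted mean of the centred covariates.
        grad = [sum(w * row[j] for w, row in zip(weights, z)) for j in range(k)]
        sq_norm = sum(g * g for g in grad)
        if math.sqrt(sq_norm) < tol:
            break
        step = 1.0
        while True:  # backtracking line search
            trial = [l - step * g for l, g in zip(lam, grad)]
            new_weights, new_loss = solve(trial)
            if new_loss <= loss - 0.25 * step * sq_norm:
                break
            step /= 2.0
        lam, weights, loss = trial, new_weights, new_loss
    return weights


# Toy comparison group: (maternal age, smoker indicator); client means assumed.
controls = [(18, 0), (20, 1), (22, 0), (25, 1), (30, 0), (35, 1)]
target_means = [23.0, 0.4]
w = entropy_balance(controls, target_means)
```

Balancing higher moments would follow the same pattern by adding squared (or higher-power) covariate columns with their own targets.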
Abuse and Injury Episode Creation
The primary outcome for this study was the presence of an abuse episode or high risk injury episode (composite measure) with a secondary outcome that identified the presence of any injury episode. Outcome measures were derived from child Medicaid claims. Episodes were created to conservatively count unique instances of abuse and injury recognizing that multiple claims/encounters may exist for a single event. The methodology of collapsing claim encounters to create episodes is described in Matone et al. 2012 and further in Online Appendix A.
The composite primary outcome comprised two episode types. Abuse episodes were those in which an ICD-9 code indicated child abuse (995.50–995.55, 995.59). High-risk injury (HRI) episodes were those involving specific types of severe injury considered highly suspicious for abuse even without a medical diagnosis of abuse in the medical record. HRI episodes include fractures of the femur, radius, ulna, tibia, fibula, humerus, or ribs, or traumatic brain injuries, within the first 24 months of life (without the presence of ICD-9 codes indicating an injury due to a motor vehicle crash) (Wood 2010) (Online Appendix B).
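As a small illustration of the abuse-code definition only, the sketch below flags claims carrying one of the listed ICD-9 child abuse codes, reading the listed range as 995.50 through 995.55 plus 995.59. The HRI code list is given in Online Appendix B and is not reproduced here; the function names and the handling of undotted codes are assumptions.

```python
# ICD-9 child abuse codes as listed in the text (dots removed for comparison).
ABUSE_CODES = {"99550", "99551", "99552", "99553", "99554", "99555", "99559"}


def is_abuse_code(icd9_code: str) -> bool:
    """True if the code is one of the listed child abuse diagnoses."""
    normalized = icd9_code.replace(".", "").strip()
    return normalized in ABUSE_CODES


def episode_has_abuse(episode_codes) -> bool:
    """An episode counts as an abuse episode if any of its claims is flagged."""
    return any(is_abuse_code(code) for code in episode_codes)


flags = [is_abuse_code(c) for c in ("995.54", "99559", "995.56", "821.0")]
```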
Injury outcomes included: superficial injuries, a composite of dislocation, fracture, and crush injuries, poisonings, and burns identified through ICD-9 codes.
The observation window for episodes was 0 to 24 months of life for the NFP cohort and, for the EHS and PAT cohorts, 24 months post-enrollment, up to 6 years of life. Episodes were right-censored for children whose observation periods extended beyond the end of the study period in 2014.
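These window rules can be expressed as a short Python sketch. Several details are assumptions made for illustration: the study end date is taken as 31 December 2014 (the text gives only the year), "up to 6 years of life" is read as a cap at the child's sixth birthday, and the function names are hypothetical.

```python
import calendar
from datetime import date

STUDY_END = date(2014, 12, 31)  # assumption: only the year 2014 is stated


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the month."""
    idx = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(idx, 12)
    day = min(d.day, calendar.monthrange(year, month0 + 1)[1])
    return date(year, month0 + 1, day)


def observation_window(program: str, birth: date, enrollment: date = None):
    """Return (start, end, censored) for a child's outcome observation."""
    if program == "NFP":
        start, end = birth, add_months(birth, 24)
    else:  # EHS or PAT: 24 months post-enrollment, capped at age 6
        if enrollment is None:
            raise ValueError("enrollment date required for EHS and PAT")
        start = enrollment
        end = min(add_months(enrollment, 24), add_months(birth, 72))
    censored = end > STUDY_END
    return start, min(end, STUDY_END), censored


nfp = observation_window("NFP", date(2010, 5, 15))
ehs = observation_window("EHS", date(2013, 1, 31), date(2013, 8, 31))
```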
The primary exposure was NFP, PAT, or EHS program participation. For EHS and PAT analyses, a weighted conditional logistic regression model was used to examine the unadjusted association between program participation and the primary outcome. For the NFP analysis, a weighted logistic regression with a random intercept for county was used to estimate the relationship between program participation and the primary outcome, controlling for the variability in the outcome across counties. The presence of abuse or injury prior to enrollment was included as an adjustment covariate in PAT and EHS final outcome models (not applicable for NFP modeling given prenatal enrollment) to account for baseline injury risk.
Two sensitivity analyses were conducted to test if the relationship found between program participation and abuse outcome was robust to potential confounders that could not be included as covariates in the PSM or entropy balancing model. We tested for confounding between program participation and abuse by maternal psychosocial risk factors by separately including each risk factor in the primary outcome model and examining if the estimated odds ratio effect for the program participation, adjusting for the risk factor, changed by 10% or more. The risk factors ascertained from the literature to be confounders are (1) maternal previous involvement with child protective services (CPS) before pregnancy and (2) intimate partner violence (IPV) measured after conception (Berlin et al. 2011; Eckenrode et al. 2000).
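The change-in-estimate criterion can be written as a one-line check. The sketch below assumes the change is measured relative to the odds ratio from the primary model; the function name and example values are hypothetical and are not study results.

```python
def changes_by_threshold(or_primary: float, or_adjusted: float,
                         threshold: float = 0.10) -> bool:
    """True if adjusting for the risk factor shifts the program odds ratio
    by at least `threshold` (10% by default), relative to the primary model."""
    if or_primary <= 0:
        raise ValueError("odds ratios must be positive")
    return abs(or_adjusted - or_primary) / or_primary >= threshold


# Illustrative values only: a shift from 0.80 to 0.70 is a 12.5% change,
# while a shift from 0.80 to 0.78 is a 2.5% change.
large_shift = changes_by_threshold(0.80, 0.70)
small_shift = changes_by_threshold(0.80, 0.78)
```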
To identify clients involved in CPS, NFP clients and comparisons residing in Philadelphia were linked to county child welfare records via first name, last name, date of birth, and gender. Child welfare systems are administered at the county-level in PA; Philadelphia represents the largest county in the state and produced a sample large enough for sensitivity analysis. For any clients and comparisons successfully linked to child welfare records, dates of protective service were provided. We identified mothers with childhood CPS involvement (prior to pregnancy).
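A deterministic linkage on these four fields might look like the following sketch. It assumes exact agreement after trimming whitespace and normalizing case, which may be stricter or looser than the linkage actually performed; the field names and records are invented for illustration.

```python
from datetime import date


def link_key(first_name, last_name, birth_date, gender):
    """Normalized key: exact match on names, date of birth, and gender."""
    return (first_name.strip().lower(), last_name.strip().lower(),
            birth_date, gender.strip().upper())


def link_to_cps(mothers, cps_records):
    """Return {mother_id: sorted protective-service dates} for linked mothers."""
    index = {}
    for rec in cps_records:
        key = link_key(rec["first"], rec["last"], rec["dob"], rec["gender"])
        index.setdefault(key, []).append(rec["service_date"])
    linked = {}
    for mother in mothers:
        key = link_key(mother["first"], mother["last"],
                       mother["dob"], mother["gender"])
        if key in index:
            linked[mother["id"]] = sorted(index[key])
    return linked


mothers = [
    {"id": "m1", "first": "Ann ", "last": "Lee", "dob": date(1990, 2, 1), "gender": "f"},
    {"id": "m2", "first": "Bea", "last": "Roe", "dob": date(1992, 7, 9), "gender": "F"},
]
cps_records = [
    {"first": "ann", "last": "LEE", "dob": date(1990, 2, 1), "gender": "F",
     "service_date": date(2001, 3, 4)},
]
linked = link_to_cps(mothers, cps_records)
```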
Regarding the second sensitivity analysis, IPV was identified in maternal medical encounters as an ICD-9 code of 995.8x during the observation window of child’s date of conception through the first month of life. Even though we identified IPV in some clients after program enrollment occurred or after the program services could have started, we deemed this time window as most meaningful for measuring IPV for two reasons: (1) to reflect a baseline risk proximal to program enrollment and (2) to increase likelihood of ascertainment in medical assistance files given increased health seeking during pregnancy and increased risk period for IPV. While rates of IPV during pregnancy vary depending on the samples studied and measures used, the prevalence of IPV during pregnancy is elevated compared to women of non-reproductive age and may be increased compared to non-pregnant patients (Hellmuth et al. 2013; Jasinski 2004). Less biased screening (i.e., more universal screening) may occur during the prenatal period due to recommendations by professional organizations, such as the American Congress of Obstetricians and Gynecologists (ACOG), that providers should screen all women for IPV at periodic intervals, including during obstetric care (at the first prenatal visit, at least once per trimester, and at the postpartum checkup) (“ACOG Committee Opinion No. 518: Intimate partner violence” 2012).
For each set of primary analyses described above, we ran two additional models—one with a dichotomous covariate for presence of maternal involvement with CPS prior to childbirth and another with a covariate for presence of IPV within child’s date of conception through the first month of life.
Presentation of Results
Logistic regression results were expressed as odds ratios (with 95% confidence intervals) and standardized marginal probabilities. All analyses were conducted using SAS version 9.4, Stata version 14.2, and R. Stata’s ebalance package was used for entropy balancing and R’s MatchIt package for PSM. All statistical tests were two-sided and used alpha = 0.05 as the threshold for statistical significance.
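For readers less familiar with the reporting scale, the sketch below shows how a logistic regression coefficient and its standard error convert to an odds ratio with a two-sided 95% confidence interval. It assumes a Wald-type interval, which the text does not specify, and the function name and input values are illustrative only.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(log_odds: float, std_error: float, alpha: float = 0.05):
    """Odds ratio and two-sided (1 - alpha) Wald interval from a logit
    coefficient and its standard error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return (math.exp(log_odds),
            math.exp(log_odds - z * std_error),
            math.exp(log_odds + z * std_error))


# Illustrative inputs: coefficient log(2) with standard error 0.1.
odds_ratio, lower, upper = odds_ratio_ci(math.log(2.0), 0.1)
```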
Setting and Participants
Eleven of the 38 PA MIECHV-funded programs were selected for the qualitative study to supply a representative sample of agencies based on program size, location, and model type, including NFP, PAT, EHS, and Healthy Families America (HFA, which, due to data constraints, could only be included in the qualitative study). Program staff were interviewed during day-long site visits; enrolled clients, recruited with flyers and help from program staff, were interviewed over the phone. Participants gave verbal consent before participating in interviews lasting between 30 and 60 minutes. Clients were sent a $20 gift card in appreciation for their time. Interviews took place between 2013 and 2015.
Measures and Analysis
The interdisciplinary project team worked with home visiting leadership to develop three distinct interview guides for program administrators, home visitors, and clients, each including questions on specific program outcomes (e.g., child maltreatment; see Online Appendix C for questions that elicited content related to child maltreatment). De-identified transcripts were imported into NVivo 10 for coding and analysis. We used a modified Grounded Theory approach to coding (Glaser and Strauss 1967), including a priori codes relating to quantitative metrics included in the evaluation. Using a constant comparative approach, coders met regularly to review memos and coding comparison queries to discuss and refine node definitions and the application of codes. Discrepancies were resolved through group consensus. A thematic analysis was conducted on all interview content related to child maltreatment.