Increased death rates of domestic violence victims from arresting vs. warning suspects in the Milwaukee Domestic Violence Experiment (MilDVE)

We explored death rates from all causes among victims of misdemeanor domestic violence 23 years after random assignment of their abusers to arrests vs. warnings. We gathered state and national death data on all 1,125 victims (89 % female; 70 % African-American; mean age = 30) enrolled by Milwaukee Police in 1987–88, after 98 % treatment as randomly assigned. Victims were 64 % more likely to have died of all causes if their partners were arrested and jailed than if warned and allowed to remain at home (p = .037, 95 % CI = risk ratio of 1:1.024 to 1:2.628). Among the 791 African-American victims, arrest increased mortality by 98 % (p = .019); among 334 white victims, arrest increased mortality by only 9 % (95 % CI = RR of 1:0.489 to 1:2.428). The highest victim death rate across four significant differences found in all 22 moderator tests was within the group of 192 African-American victims who held jobs: 11 % died after partner arrests, but none after warnings (d = .8, p = .003). Murder of the victims caused only three of all 91 deaths; heart disease and other internal morbidity caused most victim deaths. Partner arrests for domestic common assault apparently increased premature death for their victims, especially African-Americans. Victims who held jobs at the time of police response suffered the highest death rates, but only if they were African-American. Replications and detailed risk factor studies are needed to confirm these conclusions, which may support repeal or judicial invalidation of state-level mandatory arrest laws.


Introduction
Mandatory arrest for misdemeanor domestic violence, mostly without injury, was adopted by over half of U.S. state legislatures in the 1980s (Mills 2003; Sherman 1992). These laws required arrests for slaps, kicks, punches, or other acts unlikely to cause death or serious injury (as distinct from aggravated assault, for which arrest was not made mandatory). They were based in part on a randomized experiment that found arrest for these minor assaults caused less re-offending for 6 months after police response (Sherman and Berk 1984).
Yet by the early 1990s, five replications of the original experiment created a more complex body of evidence: arrests consistently reduced repeat offending for employed suspects, but among unemployed suspects arrest doubled recidivism (Pate and Hamilton 1992) or, at least, was ineffective (Berk et al. 1992). A pooled sample analysis of all six trials found a small average benefit but confirmed the different effects on jobless versus employed suspects (Maxwell et al. 2002). Although these recidivism findings suggested that many victims, like their suspect counterparts, might be harmed by mandatory arrest, the research had little impact on state laws or police policies, partly because of measurement issues in both low-response-rate victim interviews and selective reporting of domestic violence to police.
A quarter-century after these experiments, a clearer outcome can be examined to measure the impact of mandatory arrest on victims: death. This unambiguous measure provides a more powerful focus for debate about the wisdom of current state laws. While no experiment on cases can provide evidence on whether a general deterrent effect of mandatory arrest lowers domestic assault across entire cities, mortality data can suggest whether more victims are dying after arrests than after warnings. Since the majority of domestic homicides occur without police contact of any kind with the victim prior to these murders (Sherman 1991; Sherman and Strang 1996), knowledge of the effects of arrests on mortality of those victims who do have contact with police may provide the most concrete guidance for policy.
One case-control design has reported that for domestic homicide, arrest is a protective factor (Campbell et al. 2003). No prospective-longitudinal or experimental design, however, has previously examined the association between partner arrest and mortality of domestic violence victims.
We began this long-term follow-up by attempting to discover whether arrest deterred, or increased, future domestic homicide. As our study protocol indicated, we were also open to all possible effects of arrest on mortality. Our intention in this analysis was to explore any possible connection between arrest and long-term mortality rates. Milwaukee offered the experiment with the highest success rate (98 %) of the six 1980s trials in delivering treatments as randomly assigned (Sherman 1992). This trial arguably provides an opportunity to discover causal links of arrest to death even without directly measuring any causal pathways triggered by the randomly assigned treatments.

Sample and trial design
The 1987-88 Milwaukee Domestic Violence Experiment (MilDVE) was reported in detail shortly after its completion (Sherman 1992), including eligibility criteria, procedures, and treatment fidelity. No long-term mortality outcomes for victims have been reported from MilDVE to date (Sherman and Harris 2013).
MilDVE was a three-armed (1:1:1) randomized trial comparing "short" and "long" arrests of suspected assailants to a warning. In the latter arm, police left the scene while suspects and victims were usually still present together. In "short" arrests, police released suspects in a mean of 4.5 h after booking at police headquarters; in "long" arrests release averaged 11.1 h after booking. About half of the victims in the arrest arm met with a prosecutor at the courthouse the next business day, but only 5 % of the arrests were prosecuted.
Cases of serious injury were ineligible, as were attempts to inflict serious injury. While 13 % of the eligible cases had an injury for which victims sought medical attention, only 5 % of victims went to hospital after the police came (Sherman 1992). Both suspects and victims were required to be older than 18 and in a domestic relationship. A team of 36 officers screened, enrolled, and accepted random assignment of 1,200 cases from April 6, 1987, through August 8, 1988 (Sherman 1992).

Mortality follow-up study
In 2012-2013, the present authors obtained the list of victim names and dates of birth in MilDVE, then purchased mortality data from the Wisconsin Office of Vital Statistics that we cross-checked and expanded through searches of the Social Security Death Index (SSDI), which features national coverage of all persons who have ever received Federal welfare benefits of any kind. Tests of the comprehensiveness of its Death Master File (DMF) show that for the people aged 25-54 (the age most victims in our sample entered the study) the percentage of publicly available death records compared to the national (confidential) list of death certificates was 75 % in 1997, having risen steadily for over a decade (Hill and Rosenwaike 2001).
Since only 10 % of the DMF cases found in our study lacked records in Wisconsin, a small but unknown number of cases were probably undetected during mortality follow-up, but were arguably not "lost" in CONSORT terms (Fig. 1) because we looked for them in the same way that we found the other deaths. To avoid errors, multiple dates of birth, cities in which decedents had resided, and other checks were used to determine matches. Cause of death is not known for the nine deaths (out of 91) identified only by the DMF. There is no evidence that any differential movement out of state was created by the randomly assigned treatment. Errors in detecting death would likely have the same structure in both groups.
While there was no national registry of crime and justice trials in 1987, the primary MilDVE publications (Sherman 1992) report that approval for the random assignment of arrest was granted by a vote of the Milwaukee City Council and an Institutional Review Board of the Crime Control Institute of Washington DC, both in 1986. Approval for our 23-year follow-up study was obtained from the Institutional Review Board of the University of Maryland in 2012, title 334834-2.

Randomization and masking
Suspects were randomly assigned to arrest or warning in a sequence determined by an independent statistician (Sherman 1992), without consent of either victims or suspects, consistent with ethical standards and law for experimentation in criminal sanctions (Federal Judicial Center 1981). Assignments were enclosed in sequentially numbered, opaque, sealed envelopes opened one at a time by the independent research team. Details of the sequence remained unknown to all police and members of the research team until police called researchers to report identifying details for both suspect and victim, and each envelope was opened while police officers remained on the phone and researchers communicated the assigned treatment. Because randomization assigned suspects to different legal statuses, it was impossible to mask the treatment from the research staff or participants.

Procedures
After deciding that there was probable cause to believe that an eligible suspect had assaulted an eligible victim, MilDVE police officers called the warrant desk to ensure there were no outstanding arrest warrants for the suspect. Police then obtained the randomly assigned treatment and arrested or warned the suspects accordingly, using a standard script for the warning. Police handcuffed suspects in 94 % of arrests, doing so in front of the victims in 73 %. No victims in any case were arrested. MilDVE randomly assigned arrests or warnings to cases, rather than to people. A CONSORT diagram for victims appears below as Fig. 1. The experiment allowed re-enrolment of the same persons in multiple cases, based on a primary concern with short-term repeat offending (Sherman 1992). For purposes of assessing long-term differences in mortality, we analyzed the random assignment of intention-to-treat in the first case in which each individual appeared as a victim, even if they were subsequently treated as an offender or as a victim whose partner received a different treatment. Only 1 % of victims (13 of 1,125) had previously been treated in the experiment as suspects.

[Fig. 1: CONSORT diagram for victims, with stages Enrollment, Allocation, Follow-Up, and Analysis. Notes: * Estimated assuming all ineligible cases included unique individuals. 13 victims were treated as suspects before they were treated as victims.]

Statistical analyses
Most analyses below pool data from the two arms of the trial in which victims' partners were arrested. This procedure under-estimates the effect of the "standard" arrest in Milwaukee at the time, described here as "long" arrest, since it was only "short" arrest that was adopted for the unique purpose of the trial; "long" arrest was the standard arrest condition before and after the trial. We combine the two arrest categories to both increase the power of moderator analyses, and provide more external validity to a range of time periods in police custody-something that can vary widely across police agencies operating under identical mandatory arrest laws.
Using the intention-to-treat principle, we calculate relative risk ratios and their confidence intervals between mortality for victims whose partners were arrested or warned in their first enrolment as a victim. We examine both main effects and subgroup effects conditional on theoretically relevant moderator factors. As a sensitivity analysis, we tested the effect of partner arrest on victims who had at any point had their partner arrested by random assignment in the experiment, and not just at the point of their initial enrolment as victims; we found no different results from that procedure. We also used Cox hazard models to adjust for slight imbalances in known baseline covariates that could predict outcomes, and again for race as a strong moderator. We used Stata Version 13 for all analyses.
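The risk-ratio calculation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' Stata code: the function name `risk_ratio_ci` is hypothetical, the Katz log-scale interval is one common choice that the text does not confirm, and the counts in the usage lines (70 deaths among 754 victims with arrested partners, 21 among 371 with warned partners) are assumptions back-calculated from the reported total of 91 deaths among 1,125 victims and the per-1,000 rates given in the Results. They are not stated in the text.

```python
import math
from statistics import NormalDist


def risk_ratio_ci(deaths_exposed, n_exposed, deaths_control, n_control, level=0.95):
    """Return (risk ratio, lower bound, upper bound) using the Katz log-scale interval."""
    risk_exposed = deaths_exposed / n_exposed
    risk_control = deaths_control / n_control
    rr = risk_exposed / risk_control
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / deaths_exposed - 1 / n_exposed + 1 / deaths_control - 1 / n_control
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95 % interval
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# ASSUMED counts (back-calculated, not reported in the article):
# arrested-partner arm: 70 deaths of 754; warned-partner arm: 21 deaths of 371.
rr, lower, upper = risk_ratio_ci(70, 754, 21, 371)
print(f"RR = {rr:.2f}, {lower:.2f} to {upper:.2f}")
```

Under those assumed counts the sketch gives a ratio of about 1.64 with an interval of roughly 1.02 to 2.63, in line with the reported interval of 1:1.024 to 1:2.628.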

Results
Of the 2,054 cases assessed by the 35 police officers, 1,200 cases were randomly assigned to arrest or warning, with 1,125 unique victims whose partners were treated in one or more of the cases. The CONSORT diagram shows the treatment pathways of this sample with 98 % compliance on intent-to-treat. Table 1 shows victims' characteristics at baseline, which are similar but not identical for the two treatment groups. Both groups were about 70 % African-American, 10 % male, and averaging 30 years of age. Of eight characteristics examined, however, three show significant differences between treatment groups. Victims whose partners were arrested were more likely than those whose partners were warned to have had partners who were employed, partners with a prior arrest, and a prior arrest themselves. The combined effect of these differences yields an under-estimate of the effect of partner arrest on mortality, based on the moderator analyses reported in Table 3. We adjust for these differences in a Cox proportional hazard model estimated in Table 4 and displayed (after correction) in Fig. 6 below.

Main effects
The primary outcome for all cases showed 64 % more deaths among victims whose suspects had been arrested (Fig. 2). The arrested-partner group suffered 92.8 deaths per 1,000 victims from all causes, compared to 56.6 per 1,000 for the victims whose partners had been warned (p=0.02).
The relative risk ratio was even larger (but not significant) in the first 5 years after random assignment, with victim deaths in that period over three times higher in the partner-arrest group. The 23-year trend in main effect size is displayed in Fig. 3.
Disaggregating outcomes by cause of death (Table 2) showed only two categories with even marginally significant differences: heart disease (p=0.122), which caused twice as many deaths for partner-arrested victims as for partner-warned, and "other" internal causes (p=0.06), for which arrest raised the risk of death by 183 % (d=0.4). Homicide rates were identical for the two treatment groups at 2.7 per thousand. Differences in group death rates from cancer, alcohol, and drugs were also indiscernible.

Moderator analyses
We conducted 22 moderator analyses (Table 3), of which four (18 %) were significant, almost four times what would have been expected by chance. The moderator candidates were selected for examination largely on the basis of previous moderators found for arrest effects on repeat domestic assault. In this analysis, victim employment and race were both powerful moderators. The overall 64 % difference between the arrest and warned group was almost entirely concentrated among the African-American victims (Fig. 4). The white subgroup had only a 9 % higher death rate after partner arrest than after a partner warning. For the African-American victims, the rate of death after a partner's arrest was 98 % higher than it was after a partner's warning (p=.03).
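The comparison with chance can be made concrete with a short Python sketch. It assumes a 0.05 significance threshold, which the text implies but does not state, and it treats the 22 tests as independent, which they are not because the subgroups overlap. The binomial tail probability is therefore only a rough guide. The function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n_tests, alpha):
    """P(at least k of n_tests independent tests are significant under the null)."""
    below_k = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i) for i in range(k)
    )
    return 1.0 - below_k


n_tests, alpha, observed = 22, 0.05, 4   # alpha = 0.05 is an assumption
expected = n_tests * alpha               # significant results expected by chance
ratio = observed / expected              # observed relative to expected
p_chance = prob_at_least(observed, n_tests, alpha)
print(expected, round(ratio, 1), round(p_chance, 3))
```

With these assumptions about 1.1 significant results are expected by chance, so four observed is roughly 3.6 times the expectation.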
Further moderator analyses showed, in both races combined, that the association of partner arrest with victim mortality was much higher if the victim had a job (RR= 4.26:1; d=−0.51; p=0.03). That effect, however, was also entirely isolated among African-American victims, since there was no discernible treatment effect on mortality among white victims who were employed at time of enrolment. Not one of 67 employed black victims (0 %) whose partners were warned had died after 23 years, compared to 14 of the 125 employed black victims (11 %) whose partners had been arrested (p=0.003). That large effect size (d=−0.81) was the largest among all 22 moderator and main effects examined, and compares to a small effect size (but in the same direction) of suspect arrest for all other cases (Fig. 5).
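The comparison for employed black victims (0 deaths among 67 with warned partners versus 14 among 125 with arrested partners) can be checked with an exact test. The sketch below uses a two-sided Fisher exact test, chosen here purely for illustration. The text does not say which test produced the reported p=0.003, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )


# Rows: partner arrested, partner warned. Columns: died, survived.
p_value = fisher_exact_two_sided(14, 111, 0, 67)
print(f"two-sided p = {p_value:.4f}")
```

On these counts the exact test gives a p-value well below 0.01, the same order of magnitude as the reported figure.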
A final significant moderator was among victims whose abuser had no prior arrest, which raised the risk of victim death after partner-arrest by 129 % over a partner-warning (p=0.03, d=−0.39).

Survival analyses
In order to increase precision in the estimate of our main effect, we estimated a Cox hazard model by adjusting for the predictive influence of three imbalanced covariates at baseline (Fig. 6). The adjustment was informed by Table 4, which shows the predicted effect of having each imbalance equalized. Based on the moderator analyses (Table 3), the arrested group had slightly more prior victim arrests, which lowered their predicted mortality; more employed suspects, which raised predicted victim mortality; and more suspects with priors, which lowered predicted victim mortality. The results slightly increase the main effect of partner arrest on victim mortality over the unadjusted raw increase of 64 %, with the adjusted hazard rate showing victims in the partner-arrest group 85 % more likely to die than those in the partner-warned group. Table 5 performs further adjustments beyond baseline imbalances, incorporating key moderators including race and employment.

[Fig. 4: Effect of suspect arrest on victim mortality by race of victim]

Randomly assigned time in custody
Table 6 presents the main effects on victim mortality separately by all three randomly assigned treatments, including the two different randomly assigned lengths of time in custody for arrested partners. The results show higher mortality for the standard length of time in custody than for the artificially reduced time. Analyzing only the "long" arrest condition against the warn condition, the raw relative risk ratio is 1:1.87, compared to only 1:1.40 for the "short" arrest effect.

Other analyses
We used Stata to calculate the population attributable fraction of deaths (Rockhill et al. 1998) related to arrest as a risk factor in this analysis. In the absence of arrest, 30 % of the observed mortality in this population over 23 years would have been avoided, holding all other factors constant. For black victims, 40 % of deaths would have been avoided; for whites it would have been 6 %. Finally, since death was relatively rare in the sample overall, we performed a permutation test as a sensitivity analysis. This test shuffled the time order of deaths 1,125 times (creating 1,125 new variables). We then ran the Cox model 1,125 times and saved each outcome. Figure 7 plots the actual order of deaths when time is real in relation to the order when time is shuffled. It shows that the point estimate of arrest effects in the actual sequence (RR 1:1.85) lies in the far right tail of the distribution of all estimates. This result provides strong evidence that the relationship between victim death and partner arrest is causal, rather than a merely chance difference between the groups that was not overcome by random assignment with very high treatment fidelity and little potential attrition.
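The population attributable fraction reported above can be approximated by hand from figures given in this article. The sketch below uses one standard form, PAF = (overall risk − risk among the unexposed) / overall risk. Whether the Stata routine used exactly this form is an assumption, and the function name is hypothetical.

```python
def population_attributable_fraction(overall_risk, unexposed_risk):
    """Share of overall risk that would disappear if everyone were unexposed."""
    return (overall_risk - unexposed_risk) / overall_risk


overall_risk = 91 / 1125     # all deaths over all victims, as reported
warned_risk = 56.6 / 1000    # reported death rate in the partner-warned group
paf = population_attributable_fraction(overall_risk, warned_risk)
print(f"PAF = {paf:.2f}")
```

This simple calculation lands close to the 30 % figure reported for the full sample.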

Discussion
The analysis presented herein is an inductive exploration, not a deductive hypothesis test. We did not predict our findings, and were surprised by their direction and magnitude. This is not a reason to dismiss them: similar serendipity has characterized important scientific discoveries that withstood further testing, such as the effect of penicillin on infections (Haven 1994). Nevertheless, the present findings lack an empirical context: They are both the result of the longest-ever follow-up of a randomized trial of a criminal justice sanction and the first study to examine the victim-mortality effects of suspect arrest for domestic violence. This discussion seeks to provide a larger context for these findings. Doing so requires that three key issues be addressed. One is the statistical risk that the findings are unique to this sample, and not indicative of a more general pattern. The second is the theoretical pathway by which these results can be understood, especially the differences between black and white victims and by victim employment status. The third is the potential policy implications of these findings, both for Wisconsin and elsewhere.

Statistical interpretation
While our results remained robust through multiple sensitivity tests, it remains possible that they are not causal. Imbalances in measured covariates suggest that, despite random assignment, our results could contain spurious imbalances in unmeasured factors predisposing mortality. Additionally, our findings may not generalize beyond the Milwaukee sample, even if they generalize fully to the Milwaukee population from which they were sampled. Fortunately, because similar experiments were conducted at the same time, our analytic process might be replicable in three more samples (Charlotte, Miami, and Omaha). Those replications, if accomplished, would provide a valuable check on the external validity of our results.
Despite the robustness of our results, some may question whether random assignment can plausibly "last" so long and over a life-course in which other causes of death could intervene. For example, this analysis does not examine common proximate causes of death, such as cigarette smoking and obesity, but there is no time limit on the causal interpretation of randomized experiments. Whatever the proximate causes of death may be, they are logically connected to differences in causal pathways influenced by the exogenous instrument of random assignment itself. If, for example, the arrest group developed more obesity or a higher prevalence of smoking by year 23, such differences are likely to have been shaped by a causal pathway attributable to the randomly assigned treatment.

Theoretical interpretation
These results pose three related challenges of explanation: 1. By what causal pathways can arresting a suspect cause a victim to die? 2. What could explain why those pathways produced such different results for black and white victims? 3. Why does victim employment double the lethal effect of suspect arrest among black victims, but not among white victims?
Causal pathways
Currently, only hypotheses, not conclusions, can be drawn regarding the particular causal pathways that may have extended from the arrest of MilDVE suspects to the deaths of their victims. We suggest the interactions between arrest, race, and employment provide clues as to the direction those causal pathways took. We also suggest that those clues point toward differential post-traumatic stress manifestations in victims whose partners were arrested. Extensive studies of mortality and employment hierarchy in the British civil service demonstrate that a large portion of the variation in mortality remains unexplained by smoking, alcohol, or biological risk factors (Marmot et al. 1991, 1997; Kuper and Marmot 2003; Kunz-Ebrecht et al. 2004). That variance can, however, be explained by psychosocial causes like the degree of autonomy people have in their work, the extent of their social isolation (Stringhini et al. 2012), and depression, which could result from an arrest that increases stress by limiting current or even future intimate and other social relationships.
By the last wave of MilDVE victim interviews (77 % response rate at 6 months after enrolment), a large minority of victims in all three arms (45 % long-arrest; 44 % short-arrest; 38 % warned) had already separated from their partners (Sherman 1992). Yet, with no subsequent interviews, we cannot know whether arrest had long-term effects on their then-current or future relationships, on their loneliness and social isolation, or on any other potential stressor on a causal pathway to higher mortality. A new wave of MilDVE victim interviews could explore those stressors and causal pathways by examining, for example, whether post-traumatic stress differences between victims whose partners are warned or arrested create a biochemical pathway that reduces victims' life-spans.
Given evidence of elevated levels of post-traumatic stress symptoms among women victims of domestic violence in general (Silva et al. 1997; Golding 1999), the added trauma of an arrest could be more dangerous than witnessing an arrest at lower baseline levels of PTSS. Chronic but low-to-moderate elevation of post-traumatic stress symptoms has been reported to have a strong prospective link to premature mortality from coronary heart disease (Kubzansky et al. 2007; Boscarino 2006), possibly via hyperactivity of the central CRH [corticotrophin-releasing hormone] systems with underactivity of the pituitary-adrenal axis (Kasckow et al. 2001). If some domestic violence victims experience partner arrest as traumatic, that stress-related response chain could trigger an increase in coronary heart disease and other morbidity leading to premature death. As a first step toward examining the viability of this pathway as an explanation of our results, new interviews could use PTSS assessment instruments to determine whether victims in the partner-arrest group have higher current levels of PTSS than those in the partner-warned group. Even better would be new experiments in which PTSS measures were taken within a few weeks after random assignment.
Explaining racial differences
A major obstacle to applying a PTSS theory to these results lies with the racial differences. For the PTSS theory to fit the facts, African-American domestic victims should presumably have higher baseline levels of PTSS than white victims of domestic violence, on average. Yet, even a brief examination of the literature shows, at worst, that black domestic violence victims suffer less PTSS than whites, evincing greater resilience in relation to numerous symptoms and measures of PTSS. At best, the literature, which includes only a handful of studies, is mixed (Dutton 2009; Lily and Graham-Bermann 2009; Wright et al. 2010; Iverson et al. 2013).
A careful examination of that literature, however, shows that those findings may be confounded with employment because unemployed African-American domestic violence victims dominated the samples. Not all of the studies report employment data, but in the two that did, unemployment amongst black domestic violence victims was greater than, or statistically similar to, high unemployment among whites. In one study of 132 domestic violence victims in southern Michigan, in which 31 % of respondents were black and 59 % white, 63 % of the black domestic violence victims were unemployed, whereas only 31 % of the white victims were unemployed (Lily and Graham-Bermann 2009). In the second study of 204 residents in a domestic violence shelter, unemployment was slightly, although not significantly, higher for white (76 %) than for black (72 %) victims (Wright et al. 2010). In the MilDVE sample, unemployment predominated amongst all victims at 70.6 %, but varied considerably by race: 75.5 % of black victims were unemployed, whereas only 58.4 % of white victims were. Statistically speaking, this significant racial difference in employment opens the door for a theoretical explanation that may indicate a paradox of social context of the kind that Sampson (2013) calls "contextual causality." The African-American victims in the MilDVE resided in the context of the most hyper-segregated metropolitan area in the US (Massey and Denton 1993). Most of the black victims lived in areas of concentrated unemployment, while whites lived in mixed income neighborhoods dominated by employment rather than unemployment (Wilson 1981, 1996). This social context was also associated with a divergence in the structure of intimate partner relationships between blacks and whites. Fertility in the absence of marriage was more prevalent and accelerating far faster among black women than among white women (Wilson 1981; Wojtkiewicz et al. 1990).
These contextual differences within the MilDVE victim sample may have led to equally divergent identity constructs that Sampson (2013) calls "identities of place" (p. 2) that could theoretically create a paradox in the relationship between race, employment, and victims' vulnerability to PTSS mediating their risk of death in response to arrest.
The paradox would be present if there was greater resilience (and less PTSS) among black victims who were unemployed than among those who were employed, and if unemployed black victims also showed greater resilience than white victims whether they were employed or not. That paradox may relate to the extent to which employed African-American women, in particular, are more vulnerable to stress stemming from a threat to their livelihood or a change in their status and identity resulting from the threat of loss of employment, in comparison to unemployed women. The latter not only had no job to lose; many could also predictably depend on welfare benefits (such as aid for dependent children) for their financial support. Only a minority of women in Milwaukee's concentrated poverty neighborhoods were employed "strivers," pursuing what Anderson (1999) calls a "decent" code of conduct. They could reasonably have feared a far greater loss of status and respectability from their partner being arrested than most of the unemployed women, who had a different "code" or self-defined identity unrelated to employment.
For employed white women in a neighborhood context of relatively high employment, the impact of a partner's arrest could have been very different. White victims were more likely to have ever been married to their abusers than were black women (42 vs. 26 %), with higher rates of employment among their partners. In particular, only 16 % of the partners of employed black victims also had jobs, whereas 40 % of the partners of employed white victims, 38 % of the partners of unemployed black victims, and 30 % of the partners of unemployed white victims had jobs. White women were therefore more likely to have been second earners, whose own employment provided a buffer against a husband's or partner's loss of work, rather than the sole means of support in a single-parent household, as appears to have been the case for employed black women. The psycho-social (and possibly biomedical) meaning of employment and its contribution to the identities of married white women, living in predominantly white neighborhoods of higher employment, could be quite different from that for employment of unmarried African-American women in struggling neighborhoods. Employment for white victims could even provide a protective effect against PTSS in response to a partner's arrest. Unemployment for white victims, however, could leave them more economically vulnerable to a loss of partner's earnings (or even of the partner entirely) in the event of an arrest, thus increasing their stress levels.
A race-employment paradox?
The possibility that employment for black domestic violence victims may lead to more PTSS from partner arrest, but make white victims more resilient, implies that their death rates should correspond to this paradox of employment and identity. Arrest should cause higher death rates for employed (vs. unemployed) black victims, but have the opposite interaction for white victims, regardless of their employment. As Table 3 shows, that is generally the case. For employed black victims whose suspects were arrested, the 23-year death rate per 1,000 was 112; for unemployed black victims, the comparable figure was 94, which is less, but admittedly not lower than for employed white victims. For white unemployed victims, the death rate when partners were arrested was 111 per 1,000, almost the same as for employed black victims; for white employed victims, it was only 40, and was actually lower for those whose partners were arrested than for those whose partners were warned (which was 50).
A further nuance in the findings is the different rates of death by race and employment after partners are warned. For unemployed white women, the death rate after partner warning was close to their death rate after partner arrest (87 vs. 111 per 1,000), but for employed black women, the randomly assigned warning was a lifesaver: not one of them had died after their partner was warned. If differences in neighborhood context help shape different death rates (by race) from similar treatments, then these findings may be a prime example of Sampson's (2013) concept of "contextual causality." Although the results from this analysis failed to provide perfectly symmetrical support for it, the race-employment paradox may still be true. It is simply one hypothesis for which the evidence provides limited support. Suffice it to say that evidence on PTSS among black and white victims of different employment statuses remains quite thin, and future evidence may still show that PTSS levels differ in direct proportion to the differences in death rates by demographic subgroups, consistent with differences in social context.
There are two ways to continue to investigate and develop these theoretical possibilities. One is gathering new evidence in the field from the Milwaukee sample (and the samples of the companion experiments). The other is to analyze the data not by race but by neighborhood context. In further analyses we will examine the Census tract data for all victims, not only at random assignment but even at later periods of the follow-up, as may be indicated by change of address data from the Wisconsin drivers' license records we have obtained.

Policy implications
Regardless of the reasons for the racial disparity we observe, the evidence is clear: African-American victims of domestic violence are disproportionately likely to die after partner arrests relative to white victims. The magnitude of the disparity strongly indicates that mandatory arrest laws, however well-intentioned, can create a racially discriminatory impact on victims. While 6 % of the deaths amongst the white partners of arrested suspects could have been avoided, 40 % of the deaths amongst the black partners of arrested suspects could have been avoided, if only their partners had been warned instead.
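The text does not state how the 6 % and 40 % figures were computed. One standard measure of avoidable deaths in a treated group is the attributable fraction among the exposed, sketched below under that assumption; the symbols R_a and R_w (death rates after arrest and after warning) are our own notation. The worked line uses the unemployed white rates quoted in the discussion (111 and 87 per 1,000), not the race-level totals behind the 6 % and 40 % figures.

```latex
\[
\mathrm{AF}_{e} \;=\; \frac{R_a - R_w}{R_a} \;=\; 1 - \frac{1}{RR},
\qquad RR = \frac{R_a}{R_w}
\]
\[
\mathrm{AF}_{e} \;=\; \frac{111 - 87}{111} \;=\; \frac{24}{111} \;\approx\; 0.22
\]
```

Under this measure, a fraction near zero means arrest and warning carried similar mortality, and the fraction rises toward 1 as deaths after warning become rare relative to deaths after arrest.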
Wisconsin state law requires police to make arrests when they have probable cause to believe that a misdemeanor assault has occurred in a domestic relationship, irrespective of race. This law was not intended to place black victims at a disadvantage to white victims, or to deprive them of life. Like many other "race-neutral" policies, however, this policy generates racially disparate outcomes. A key question is whether the unintended potential harm generated by that disparity is outweighed by the intended potential benefit of the policy.
In the realm of labor law, employment policies often require a high school diploma or passage of tailored exams to qualify for certain jobs. Those policies exemplify seemingly race-neutral policies that have resulted in racial disparity. In considering whether those policies could be considered justified, the US Supreme Court has focused on whether the policy under examination has sufficient evidence to justify its use as an employment criterion (e.g., Griggs v. Duke Power Co., 401 U.S. 424, 1971; Ricci v. DeStefano, 557 U.S. 557, 2009).
In the realm of criminal law, there is no comparable body of US case law regarding unintentional disparate impacts of criminal justice policies on victims of crime. In fact, there has been no prior evidence of victim mortality differences apparently caused by any criminal sanction. Thus, it is unclear how US federal courts would rule if a Constitutional challenge to the Wisconsin mandatory arrest statute were filed by surviving families of deceased African-American domestic violence victims whose partners had been arrested.
Since 1871, Section 1983 of Title 42 of the U.S. Code has prohibited states from depriving people of certain rights "under color of" statutes (or other acts or omissions). This provision was enacted to enforce the 14th Amendment to the U.S. Constitution, which says (in part) that "No State shall make or enforce any law which shall … deny to any person within its jurisdiction the equal protection of the laws." Whether equal protection is defined by equal enforcement or equal effect is a theme of increasing discussion by legal scholars, especially with respect to racially disparate effects of incarceration rates and the collateral consequences of incarceration. This has led to proposals for "demographic impact statements" regarding sentencing of offenders and to one state (Minnesota) reconsidering a potential change to its sentencing regime based on the projection that it would differentially impact African-Americans (Reitz 2009).
That legislatures should consider the potential impact of legislation before enacting new sentencing laws could apply equally to repealing existing arrest laws. The present article can be seen as an example of what could be called a "demographic victim impact statement," as distinct from the case-by-case use of "victim impact statements" before sentencing (e.g., Davis and Smith 1994). Such statements are also relevant to proposals to repeal existing laws governing police conduct.
Perhaps the key fact in any "balancing test" of evidence for or against a law that legislatures or courts might consider is the lack of any discernible evidence of a longevity benefit for victims caused by the arrest of abusive partners. Using the stringently unambiguous criterion of victim life expectancy, none of the 22 moderator analyses found any significant increase in longevity associated with arrest. Earlier findings on short-term misdemeanor recidivism (Pate and Hamilton 1992) had demonstrated opposite effects for different subgroups, with victims of employed abusers gaining a benefit of less violence from arrest, at the cost of victims of unemployed abusers experiencing a higher rate of domestic violence if the abusers had been arrested. When death is the test of benefit, the result is unambiguous: no victims appear to benefit. The potential benefit of a general deterrent effect is much harder to examine, and remains untested with respect to overall mortality of domestic violence victims. Other benefits, such as moral condemnation of domestic violence, also remain untested. If nothing else, the present findings may justify a shift in the burden of proof onto those who would prefer to see mandatory arrest continued. For those who counsel against any policy change (at least in Wisconsin) until there is more evidence, we must recall that the evidence on which the Wisconsin statute was based was a 6-month follow-up of an experiment in Minnesota (Sherman and Berk 1984). For reasons of both internal and external validity, the present study covering 23 years arguably provides much more policy-relevant evidence for Wisconsin.
Whatever effect these findings may have on future research or policy decisions, they provide strong evidence for proposals to test criminal sanctions in the same way that medical treatments are tested. Evidence from controlled trials indicating a substantial cause of premature death for African-American victims should be taken seriously unless and until further strong evidence shows otherwise.