Introduction

In British policing and perhaps elsewhere in the world, the concept of “offender management” is increasingly discussed and deployed. It implies a wide range of strategies for dealing with people who have been convicted of crimes and are currently (or formerly) subject to supervision in the community by agents of the criminal justice system. These agents include not only probation (and parole) officers: they increasingly include police officers who employ legal powers to defer charges or to apply for a conditional caution for a specific individual in relation to a specific offense—for which they may not have been charged, let alone convicted. In all cases of community supervision, however, there is a public expectation that authorities know where these persons can be found and that the persons have not absconded from criminal justice supervision.

Increasing interest in police decisions to divert individuals from prosecution to a police-supervised, pre-charge program of rehabilitation (Neyroud, 2018; Slothower & Joyce, 2020) raises key questions about breaches of diversion agreements. One experiment in police-led diversion (Strang et al., 2017), for example, reported a breach rate of 14%, in which 22 of 154 individuals randomly assigned to attend a series of two domestic abuse workshops were prosecuted after failing to comply with all requirements of their “conditional caution.” Rapid growth of out-of-court disposals by police in the UK makes the issue of preventing breaches just as important to police offender managers as it has always been for probation and parole officers in a post-sentencing context. The latter’s experience with missed appointments may offer police much to learn about similar challenges in a pre-prosecution context.

The scale of community supervision in many countries vastly exceeds that of imprisonment, as well as police disposals out of court of cases with sufficient evidence to prosecute. Approximately 4.4 million American adults are under community supervision, more than double the number of individuals incarcerated (Maruschak & Minton, 2020). Despite providing the majority of correctional supervision, community supervision agency budgets have not grown in proportion to the increases in their caseloads (Petersilia, 1997). This has encouraged an emphasis on efficiency, resource management, and innovation in community supervision.

During their term of supervision, most individuals are subject to conditions that mandate the frequency and nature of contact with the officer(s) assigned to manage their case. These interactions often take place at an agency office, where the individual under supervision can provide updates, receive support and treatment, and be tested for recent drug use. These visits are described as the keystone of effective supervision (Lindner, 1992).

Despite their importance to effective supervision management, office visits are often difficult to coordinate, resulting in high numbers of missed appointments due to work, education, or an inability to travel to the location of the office meeting. This is especially true for higher-risk individuals (Hyatt & Barnes, 2017), for whom missed appointments can result in an officer requesting the revocation of probation and returning people to prison (Medina, 2017).

Probation agencies across the USA have embarked on a number of initiatives to increase compliance with the requirements of supervision, including attempts to increase attendance at office meetings. Technological innovation has played an increasingly important role in these initiatives. One widely discussed innovation from the “nudge” literature (Thaler & Sunstein, 2008) includes sending text messages to remind individuals to attend court appearances and other important scheduled events. One randomized trial of this approach by a police agency with arrested persons scheduled for court (Chivers & Barnes, 2018) in England found no effect on attendance—due to police failing to secure correct phone numbers at point of arrest. Another randomized trial in England with victims and witnesses tested text reminders 2–3 days prior to court dates, with no discernible effect on attendance rates (Cumberbatch & Barnes, 2018).

Yet testing such innovations may require a more nuanced approach than the binary conditions of the two police-led trials noted above. Simply comparing text messaging to no messaging may miss important mediating effects of how far in advance, or how many times, a text message is sent prior to the event a person should attend. Multiple conditions, randomly assigned, may have a better chance of identifying a “sweet spot” of timing or frequency that has the greatest effect. In this study, we do just that: we compare cancellations and no-show rates of client appointments with community supervisors under four conditions: no text, early text (2 days prior), late text (1 day before the appointment), and two texts (both 4 days and 1 day before the appointment). This approach provides three chances to create a condition that might increase compliance, as well as reducing the risks of more imprisonment due solely to failure to attend a meeting. This article reports on that research design as applied by the Arkansas Community Corrections (ACC) agency in 2018–2019, long before the COVID pandemic.

Setting

Community supervision in Arkansas requires that minimum-level clients under supervision have one office visit every 3 months with their probation/parole officer. As of 2018, these appointments were missed about 30% of the time, limiting the opportunity for prosocial contact, potentially resulting in a violation, and, in most cases, wasting time and effort for officers with already overburdened caseloads.

To reduce the number of missed appointments, ACC contracted with Marquis Software to enhance the capacity of its Case Management System (CMS). Among other changes, the revised system allowed the supervision agency to send text message reminders directly to clients, using contact information gathered by the agency and by the courts. This new system provided a low-cost means of systematically comparing different strategies for reducing missed appointments or cancellations, in order to select an optimal system based on strong evidence. The opportunity for testing helped ACC to frame two specific research questions around the core question of whether text message reminders can reduce missed appointments by clients under community supervision with their probation or parole officers.

Research Questions

This evaluation was designed to assess two primary research questions:

  1) Do text message appointment reminders reduce the rates of canceled and no-show appointments for community supervision participants?

  2) If text messages work, what is the optimal frequency and timing of the reminders for reducing missed appointments?

Data

The Case Management System (CMS) delivered by Marquis provides a central recording platform for all clients under community supervision across all 54 urban and rural offices in the state. Every appointment made for every client was recorded in advance in the CMS. Every outcome of every appointment (held, canceled, or no-show) was recorded on the date of the appointment. The CMS also held the phone number of record for each client. These numbers were available to Marquis to program a schedule of text message reminders for each of the participants in the sample.

Background characteristics on the cases include individual-level data about participant age at the start of the evaluation, sex, race, supervision assignments, risk levels, term length of community supervision, and the amount of time elapsed between the start of the individuals’ supervision and the beginning of the evaluation. Dependent variables include the total number of appointments assigned to individuals categorized as held, canceled, or no-shows.

Methods

To best isolate the effect of the text message reminders on the behaviors of clients, a randomized controlled trial (RCT) design was employed. Because the text message reminder technology allowed reminders to be assigned client by client, random assignment could be used to establish causality. This is important because RCTs are regarded as the “gold standard” of evaluation design and the ideal approach for policy assessment in the social sciences (Farrington et al., 2019).

In July 2018, Marquis identified all 23,209 clients in the ACC system who were on parole or probation, had an active cell phone, a supervision end date of February 1, 2019, or later, and no outstanding warrants or other issues that would interfere with their completion of the experiment. Stratified sampling applied five dimensions: gender, age, race, risk level, and supervision type. Gender and age (over thirty or thirty and under) were binary, as was supervision type (parole or probation). Race had three categories (Caucasian, Black, and other) as did risk class (minimum, medium, and maximum). This led to 29 combinations of variables that had more than 25 members. Members of each of these 29 client groups were then randomly selected into each of the four treatment groups, in proportion to the client group's share of the population (rounding down). Thus, 989 of 1,000 clients in each of the four cohorts were selected with the five-dimensional stratification, and the remaining 11 in each group were selected randomly from the clients that remained. The experiment thus began with a total of exactly 4,000 participants equally divided into the four treatment groups of 1,000.
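The selection procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the routine Marquis actually ran: the function `stratified_assign`, the helper `make_toy_clients`, the field names, and the group labels are all hypothetical, and the toy population is randomly generated rather than drawn from ACC data.

```python
import random
from collections import defaultdict

GROUPS = ["control", "early_text", "late_text", "two_texts"]
STRATA_KEYS = ("gender", "age_band", "race", "risk", "supervision")


def stratified_assign(clients, per_group, min_stratum_size=26, seed=0):
    """Assign clients to four groups, stratifying on five attributes.

    clients: list of dicts with an "id" plus the five STRATA_KEYS.
    Returns a dict mapping group name -> list of client ids.
    """
    if per_group * len(GROUPS) > len(clients):
        raise ValueError("not enough clients to fill every group")
    rng = random.Random(seed)

    # Bucket client ids by their combination of the five attributes.
    strata = defaultdict(list)
    for c in clients:
        strata[tuple(c[k] for k in STRATA_KEYS)].append(c["id"])

    assigned = {g: [] for g in GROUPS}
    leftover = []
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        if len(members) < min_stratum_size:
            leftover.extend(members)  # small strata are not stratified
            continue
        # Proportional share of each group's quota, rounded down.
        quota = (len(members) * per_group) // len(clients)
        for i, g in enumerate(GROUPS):
            assigned[g].extend(members[i * quota:(i + 1) * quota])
        leftover.extend(members[len(GROUPS) * quota:])

    # Top up each group at random from clients not yet selected.
    rng.shuffle(leftover)
    for g in GROUPS:
        short = per_group - len(assigned[g])
        assigned[g].extend(leftover[:short])
        del leftover[:short]
    return assigned


def make_toy_clients(n, seed=1):
    """Hypothetical population for demonstration only."""
    rng = random.Random(seed)
    return [{
        "id": i,
        "gender": rng.choice(["F", "M"]),
        "age_band": rng.choice(["30_or_under", "over_30"]),
        "race": rng.choice(["Black", "Caucasian", "other"]),
        "risk": rng.choice(["maximum", "medium", "minimum"]),
        "supervision": rng.choice(["parole", "probation"]),
    } for i in range(n)]
```

Calling `stratified_assign(make_toy_clients(4000), per_group=200)` returns four disjoint lists of 200 ids. Rounding each stratum's quota down is what leaves a small shortfall per group, which the final random top-up fills, mirroring the 989 plus 11 split reported above.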

Subsequent to random assignment, a delay of roughly 2 months in the start of the experiment reduced the overall sample of 4,000 clients still under supervision by 530 (13.25%) through attrition, but with similar totals remaining in each treatment group (see Table 2: sample sizes by treatment group were 865, 857, 868, and 880, or between 86% and 88% of the randomly assigned original sample). The causes of the delay included staff training and software programming changes to ensure that office visit appointments were consistently entered into the CMS. Thus, 530 clients did not complete the experiment because they either left supervision, experienced reincarceration, moved to another state, had an invalid phone number, or changed to an annual or unsupervised status that, in turn, impacted the scheduling of the appointments during the experiment. The attrition rates did not significantly differ between the groups.
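The attrition figures can be checked directly from the reported sample sizes. The short Python sketch below uses only numbers stated in this article; the variable names and group labels are illustrative, with the labels following the order in which the Findings section reports the group sizes.

```python
ASSIGNED_PER_GROUP = 1000  # clients randomly assigned to each group

# Remaining sample sizes per group, as reported in the text.
remaining = {
    "control": 865,
    "early_text": 857,
    "late_text": 868,
    "two_texts": 880,
}

assigned_total = ASSIGNED_PER_GROUP * len(remaining)
retained_total = sum(remaining.values())
attrition = assigned_total - retained_total
attrition_pct = round(100 * attrition / assigned_total, 2)

# Share of each group's original 1,000 clients that remained.
retention_pct = {
    group: round(100 * n / ASSIGNED_PER_GROUP, 1)
    for group, n in remaining.items()
}

print(retained_total, attrition, attrition_pct)
print(retention_pct)
```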

The study began with all remaining participants receiving appointment reminder texts under the four treatment conditions starting October 1, 2018, and concluding on April 15, 2019.

Treatment Groups

The treatment protocols for each of the four groups were as follows:

  • Group 1: a text 2 days before the appointment (“early text”)

  • Group 2: a text 1 day before the appointment (“late text”)

  • Group 3: texts 1 day and 4 days before the appointment (“two texts”)

  • Control group: no texts

Marquis Software used the CMS to send text message reminders to participants depending on their group assignment. Implementation and fidelity data were later drawn from their software portal(s). Additionally, background and covariate data for between-group balancing at baseline, as well as outcome data, were derived from the CMS (matched on date and a unique client identification number). These data were provided to the academic team (third and fourth authors) to conduct an external, independent analysis.
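The four protocols amount to a small lookup from group to the number of days before the appointment on which a reminder is sent. The Python sketch below illustrates that mapping; it is not the CMS implementation, and the names `REMINDER_OFFSETS` and `reminder_dates` are hypothetical.

```python
from datetime import date, timedelta

# Days before the appointment on which each group receives a text.
REMINDER_OFFSETS = {
    "early_text": (2,),
    "late_text": (1,),
    "two_texts": (4, 1),
    "control": (),
}


def reminder_dates(group, appointment):
    """Return the dates on which reminders go out, earliest first."""
    offsets = sorted(REMINDER_OFFSETS[group], reverse=True)
    return [appointment - timedelta(days=d) for d in offsets]


if __name__ == "__main__":
    appt = date(2018, 10, 15)  # arbitrary example date
    for group in REMINDER_OFFSETS:
        print(group, reminder_dates(group, appt))
```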

Statistical Approach

The statistical methodology employed in this analysis uses a one-way analysis of variance (ANOVA) to examine continuous variables (e.g., counts and scales) and chi-square analyses for categorical variables (e.g., binary variables). This approach allows for the comparison of all of the groups simultaneously. Multivariate statistics were not necessary given the successful implementation of the treatment assignment protocol detailed above.
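For readers less familiar with these tests, the two statistics can be computed as in the following Python sketch. The article does not state which software the analysis used, so this is a textbook implementation offered for illustration, with hypothetical function names and toy inputs. It returns only the test statistics and degrees of freedom; p-values would be read from the F and chi-square distributions, for example with a statistics package such as SciPy.

```python
def one_way_anova_f(groups):
    """One-way ANOVA. groups: list of lists of numbers.

    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def chi_square_stat(table):
    """Pearson chi-square for a contingency table (list of rows).

    Returns (chi2, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, rt in zip(table, row_totals):
        for observed, ct in zip(row, col_totals):
            expected = rt * ct / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof
```

In this study's setting, `groups` would hold one list per treatment group (for example, each client's count of no-show appointments), and `table` would cross treatment group with a categorical baseline characteristic.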

In addition to comparing the data directly extracted from the CMS (e.g., the average number of appointments between the four groups), several additional variables are calculated for this analysis. These include the average rates of held, canceled, and no-show appointments.
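One plausible way to derive such rates is sketched below in Python. It assumes that each rate is computed per client as a percentage of that client's assigned appointments and then averaged across clients; the article does not spell out the exact formula, so this is an assumption, and the function names and example counts are hypothetical.

```python
def outcome_rates(held, canceled, no_show):
    """Per-client percentage of appointments in each outcome category."""
    total = held + canceled + no_show
    if total == 0:
        return None  # no appointments assigned, so no rate is defined
    return {
        "held": 100 * held / total,
        "canceled": 100 * canceled / total,
        "no_show": 100 * no_show / total,
    }


def mean_rate(records, outcome):
    """Average a per-client rate over (held, canceled, no_show) tuples."""
    rates = [outcome_rates(*r) for r in records]
    values = [r[outcome] for r in rates if r is not None]
    return sum(values) / len(values)


if __name__ == "__main__":
    toy_clients = [(3, 1, 0), (2, 0, 2)]  # made-up counts
    print(mean_rate(toy_clients, "held"))
```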

Descriptive Statistics at Baseline

Table 1 shows that, on average, the sample was approximately 37 (SD = 11.09) years old on the date of the start of the evaluation, about 73% of the sample was male, 31% black and 66% white. About 42% were assigned to parole (coming out of prison), and 58% were assigned by their sentence solely to probation. Risk classifications included 1% of the sample being assigned to annual, 6% to maximum, 37% to medium, and 56% to minimum frequency of office visits. The sample was assigned to serve, on average, about 2,200 (SD = 1,657.97) days on community supervision and served an average of 720 (SD = 622.40) days of supervision before being enrolled in the study.

Table 1 Baseline comparison of randomly assigned treatment groups: descriptive statistics

As shown in Table 1, the assignment protocol successfully resulted in the creation of four groups that were functionally identical with regard to background characteristics. No statistically significant differences were found between the four treatment groups in regard to age, sex, race, supervision, and risk classifications. Moreover, there were no differences in the length of supervision and the time elapsed between supervision’s start and the beginning of the evaluation.

Findings

As Table 2 shows, about 14,000 appointments were assigned to the individuals enrolled in this sample during the evaluation window: 3,590 to the control group members (n = 865), 3,477 to the early text group (n = 857), 3,614 to the late text group (n = 868), and 3,454 to the two texts group (n = 880). On average, each person in the sample was assigned to participate in about 4 (SD = 2.83) appointments during the evaluation period (min = 2, max = 33). Notably, the four groups of interest did not differ significantly in the average total appointments assigned. Overall, sample members had an average of 3.44 (SD = 2.63) appointments that were successfully held during the course of the evaluation, with the average number of held appointments not significantly differing between the groups.

Table 2 Differences between groups in appointment behaviors during experiment

However, the average number of both canceled and no-show appointments did significantly differ between the groups (F = 4.41, p = 0.004 and F = 11.60, p < 0.001, respectively). These effects were driven by significant mean differences between the control group and the late text group (mean difference = 0.14, p = 0.006) for canceled appointments and, for no-show appointments, by significant mean differences between the groups receiving an early text and two texts (mean difference = −0.10, p = 0.003), between the control group and late text (mean difference = 0.11, p < 0.001), and between the control group and two texts (mean difference = 0.13, p < 0.001).

The rates of appointments held (F = 9.28, p < 0.001), canceled (F = 5.63, p < 0.001), and those where clients did not show up (F = 5.37, p = 0.001) significantly differed between the groups of interest. The effects for the rate of held appointments were driven by statistically significant mean rate differences between the control group and late text (mean rate difference = −6.41, p < 0.001) as well as the control group and two texts (mean rate difference = −4.87, p < 0.001). The effects for the rate of canceled appointments were driven by significant differences between the control group and late text (mean rate difference = 4.27, p < 0.001). The effects for the rate of no-show appointments were driven by the mean rate difference between the control group and two texts (mean rate difference = 3.12, p = 0.002).

As Table 3 shows, appointment behaviors were followed for 6 months after the conclusion of the treatment period, with the sample reduced somewhat because several hundred participants in the randomized trial completed their terms of supervision. During this post-experimental period, all (100%) of the participants received text messages 1 day prior to their appointment (the late text condition). In total, 15,933 appointments were assigned to the sample members during this time, with 3,203 unique persons under supervision receiving orders to attend appointments. These appointments were spread across the former 803 control group members (4,057 appointments), the former 802 early text group members (4,033 appointments), the former 801 late text group members (4,075 appointments), and the former 796 two texts group members (3,763 appointments). After the treatment period ended, the groups did not significantly differ from one another in regard to the total appointments held, canceled, or where clients did not show up. Likewise, the rates of appointments held, canceled, and “no-showed” did not differ between the groups after the evaluation was over.

Table 3 Differences between groups in appointment behaviors for 6 months with all groups receiving text messages 1 day prior to appointments

Figures 1 and 2 display the absolute differences between the treatment groups in the rates of cases having canceled (Fig. 1) or no-show (Fig. 2) outcomes during the treatment period. These absolute differences may appear small, but they are large in both the relative differences in rates and actual numbers of appointments missed.

Fig. 1 Percent of appointments canceled by treatment group

Fig. 2 Percent of appointments “no-show” by treatment group

During the 6-month experiment, the best attendance was found in the treatment group assigned to late text reminders 1 day before the appointment. That group had 29% fewer no-shows and 21% fewer canceled appointments than the control group during the experiment. In the subsequent rollout of the late text treatment to all clients still under supervision, the full group had 30% fewer missed appointments than the control group had during the experiment.
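The distinction between absolute and relative differences can be made concrete with a short Python sketch. The rates in the example are invented for illustration and are not the values from Table 2; the function names are hypothetical.

```python
def absolute_difference(control_rate, treatment_rate):
    """Difference in percentage points between two rates."""
    return control_rate - treatment_rate


def relative_reduction(control_rate, treatment_rate):
    """Percent reduction relative to the control group's rate."""
    if control_rate == 0:
        raise ValueError("control rate must be non-zero")
    return 100 * (control_rate - treatment_rate) / control_rate


if __name__ == "__main__":
    # Hypothetical rates: 10.0% of control appointments missed
    # versus 7.5% in a treatment group.
    print(absolute_difference(10.0, 7.5))  # percentage points
    print(relative_reduction(10.0, 7.5))   # percent of the control rate
```

A gap of a few percentage points can therefore correspond to a much larger relative reduction when the baseline rate is itself low.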

Conclusions

Overall, the results of this evaluation suggest that the Arkansas Community Corrections texting protocol had a favorable impact on the behavior of individuals under supervision. This effect varied, however, depending on when the reminders were sent and how many were sent.

The evidence indicates that the random assignment protocol successfully eliminated discernible differences in background factors between the four treatment groups, providing an appropriate empirical foundation for the analyses. This means that by far the most plausible interpretation is that the text messages caused a major reduction in missed appointments. Moreover, the findings offer a “sweet spot” of 1-day prior messaging as the best timing for the messages to reduce missed appointments.

This experiment identified this sweet spot by analyzing the significant differences that emerged between the groups, particularly between the control “no text messages” group and two other treatment groups: group 2 (a late text 1 day before the appointment) and group 3 (two texts sent 1 day and 4 days before the appointment). Control group members, on average, had significantly more canceled and no-show appointments when compared to the late text group, and more “no-show” appointments when compared to the two texts group. The rate at which control group members held appointments was significantly lower than the late text and two texts groups. The rate at which control group members canceled appointments was significantly higher than the late text group, and control group members had significantly higher rates of no-show appointments when compared to the two texts group. Finally, no specific between-group level effects were ascertained between the control group members and the early text group members that received a text 2 days before the appointment.

Considering that the intervention is relatively low cost, it may be prudent to send text reminders (or continue to send text reminders) to everyone under community supervision about their appointments as a matter of agency policy. Additionally, future work should build on the foundations of this study and explore which combinations of timing, substance, and individual characteristics (e.g., risk profile, offense history, treatment needs) may encourage behavioral change.

From a policing standpoint, this study has ample implications for out-of-court disposal programming. If defendants agree to attend drug or alcohol treatments, anger management courses, domestic abuse workshops, or other individual strategies for crime prevention, they are likely to require scheduled appointments. This study suggests that text messages 1 day prior to appointments with these service providers might increase their attendance and reduce the rate at which defendants breach their agreements with police offender managers, who must therefore proceed to charge and prosecute them in court.

Limitations

The present experiment’s conclusions are limited to the research questions about attendance at meetings. It seems important now to raise further questions in assessing the full effects of text messaging, including possible downsides. Future research can attempt to take a broader and more detailed look at the impact of altering appointment behaviors of community supervision groups. This should include, for example, any effects the messages may have on repeat offending, employment, or other dimensions of clients’ lives.

While using text messages with a one-day lead time appears to be a low-cost, simple, and effective way to increase the rate of held appointments and decrease canceled and no-show appointments, it is not clear how keeping appointments could positively or negatively impact clients’ supervision experiences. For example, the impact of a reminder on revocations could be explored. Even if keeping probation and/or parole appointments has no direct effects upon important outcomes such as employment or recidivism, it may increase rapport and connectedness between clients and supervising officers that could increase rehabilitative opportunities (e.g., program referrals, one-on-one counseling sessions, or communication about employment opportunities).

Mandating regular contact, on the other hand, may increase the rates of recidivism connected to clients running afoul of the rules and regulations of supervision due to being more closely monitored during appointments. The true impacts may, in fact, be driven by a combination of these outcomes. Future research is therefore needed to connect appointment-centric data to pertinent outcome data such as changes in risk levels, officer behaviors in regard to technical violations and program referrals, and employment outcomes for participants.