Introduction

Mandated community supervision following release from incarceration (i.e., parole) is a common correctional practice in many countries, including Australia (Australian Bureau of Statistics 2018), Canada (Public Safety Canada 2020), New Zealand (New Zealand Department of Corrections 2020), the United Kingdom (Ministry of Justice 2018) and the United States (Kaeble and Alper 2020). Although community corrections practices vary across jurisdictions, both within the United States and internationally (Cullen et al. 2017), community supervision supports successful re-entry when supervision officers focus their time with clients on addressing risk factors and barriers to desistance (Chadwick et al. 2015; Dowden and Andrews 2004). Assessment of risk for recidivism is standard practice in many jurisdictions, but it is unclear how risk assessments might support successful rehabilitative efforts (Viljoen et al. 2018). In particular, it is unclear how supervision officers should conceptualize changes in risk and respond when observing these changes. This study applies a novel analytic approach to help address this knowledge gap.

In this study, we used two types of information about individuals re-entering the community following incarceration. The first was repeated assessment of dynamic (changeable) risk factors. Theoretically, when dynamic risk is reduced, there is associated reduction in the likelihood of a new offense (Andrews and Bonta 2010). Statistically, this information is longitudinal within-individual data as each participant has multiple, time-linked assessment values. The second type of information was recidivism data, detailing if and when an individual committed a new criminal offense. Statistically, this information is single-instance time-to-event data, as each participant has one end time (time of recidivism or end of observation) and one recidivism outcome (dichotomously coded). These features yield two study objectives: describe how risk scores change over time and predict recidivism.

Joint modelling is an innovative approach that allows researchers to directly study the association between change in a longitudinal variable and time to a survival outcome. The joint modelling approach comprises a model of change in a time-dependent variable (usually a mixed model) and a model of time-to-event (usually a proportional hazard model), linked via a shared latent structure (Proust-Lima et al. 2014). In joint latent class modelling (JLCM, see Proust-Lima et al. 2014, 2017) trajectories of time-dependent variables are linked to survival outcomes via a shared latent class structure. This approach assumes that within a heterogeneous population, there are homogeneous subgroups that share both the same trajectory of a time-dependent variable and the same time-to-event. Essentially, it combines survival analysis with latent group trajectory analysis. This approach integrates latent classes into a prediction model and controls for systematically missing data when conducting trajectory analyses. Beyond statistical sophistication, the JLCM approach adds conceptual value for community corrections research. It is well-established that risk changes following re-entry (Papalia et al. 2020), but it is unclear how change varies across individuals. Whereas prior analytic approaches implicitly assumed a single-group trajectory during re-entry (e.g., Babchishin and Hanson 2020; Davies et al. 2021; Lloyd et al. 2020), this is an unlikely assumption and JLCM allows researchers to better examine individualized change while conserving data excluded in traditional group trajectory analysis.

Overview: Community Corrections, Core Correctional Practice, and Acute Risk Factors

Community corrections is often viewed as merely an alternative to incarceration or a strategy for shortening custodial sentences (Cullen et al. 2017). This reflects both growing recognition that mass incarceration is unsustainable (Petersilia and Cullen 2015) and the persistence of a “nothing works” ideology that views rehabilitation efforts as ineffective (see Martinson 1974; Martinson and Wilks 1978). The latter is of particular interest both for its deleterious impact on corrections policy (Petersilia 2008; Sarre 2001) and because it is unfounded (Andrews and Bonta 2010; Gendreau and Ross 1979; Lipsey 1989, 1995).

Within the framework of a “nothing works” ideology, community supervision functions as a form of containment, with surveillance and control as the primary objectives. Indeed, without rehabilitation-focused initiatives, Bonta et al. (2008) found that supervision officers were unlikely to use time with clients to address barriers to re-entry, focusing instead on adherence to parole conditions. Control-based approaches to supervision are generally ineffective for reducing recidivism (Georgiou 2014; Hyatt and Barnes 2017; Lane et al. 2005; Petersilia & Turner 1993) and may result in increased absconding and reincarceration (Hyatt & Barnes 2017).

The Risk Needs Responsivity model (RNR, Andrews & Bonta 2010; Bonta & Andrews 2017), drawn from social learning, social control, and differential association theories, offers an alternate approach. It posits that the degree of rehabilitation services must match the empirically assessed level of risk (i.e., individuals more likely to recidivate receive higher intensity interventions), that rehabilitation should target factors that are empirically related to recidivism and appropriate for intervention, and that program delivery should be consistent with evidence-based practices and the learning/motivational style of the individual. Use of the RNR model has proliferated, and RNR-based practices have been adopted by jurisdictions internationally (Cullen 2013).

Core Correctional Practice (CCP, Dowden & Andrews 2004) is the application of the RNR model. In the context of community corrections, supervision officers implementing CCP act as facilitators of clients’ process of change, or as agents of change. Within the context of a collaborative working alliance with a fair but firm approach, officers model and reinforce prosocial attitudes, facilitate concrete problem solving skills, and assist clients to access and effectively use community resources to overcome barriers to successful re-entry. In short, they identify and address risk factors. The CCP approach is associated with improved outcomes (Bonta et al. 2008; Dowden & Andrews 2004; Kennealy et al. 2012), and training supervision officers to implement these practices is associated with substantial reductions in recidivism compared to standard practice (Bonta et al. 2011; Chadwick et al. 2015).

Risk factors are personal, social, contextual, and community factors associated with a higher likelihood of recidivism. In this study, we focused on risk factors that predict recidivism for individuals already involved in the criminal justice system. Conceptually, risk factors may be divided into static and dynamic risk factors (Andrews & Bonta 2010). Static risk factors cannot be changed with intervention and are typically limited to demographic and criminal history factors, such as age of first involvement with the justice system, number of prior convictions, or severity of past convictions. Static risk factors are robust predictors of future recidivism, but by definition cannot be targets for intervention (Douglas & Skeem 2005). In contrast, dynamic factors can theoretically change over time and are promising targets for interventions.

Although all dynamic risk factors must be able to change, different risk factors change at different rates. Stable dynamic risk factors, such as antisocial attitudes, impulsivity, and poor problem solving are unlikely to change on a day-to-day basis, but may shift over months or years (Hanson & Harris 2000). In contrast, acute risk factors change rapidly. These can be internal states, such as anger/hostility (Hanson & Harris 2000; Pettersen et al. 2015; Skeem et al. 2006) and substance misuse (Horney et al. 1995; Kazemian et al. 2009; Serin et al. 2013), or situational factors, such as employment status (Berg & Huebner 2011; Hanson & Harris 2000; Kazemian et al. 2009) and relationship conflict (Mann et al. 2010; Serin et al. 2013). Acute risk factors may change over the course of days or even minutes.

Acute risk factors may act as destabilizers, interfering with adjustment and prosocial success outside of prison. Zamble and Quinsey (2001) found that 22% of recently reincarcerated men self-reported experiencing a destabilizing event in the month preceding their most recent recidivism event and that men who recidivated had higher rates of acute risk factors such as anger, substance misuse, and interpersonal conflict, relative to a sample of men who had not recidivated. In large-scale prospective studies controlling for static risk, acute dynamic risk factors predict recidivism better than stable dynamic risk factors (Davies et al. 2021; Lloyd et al. 2020).

These findings are consistent with phenomenological explanations of criminal behavior that deemphasize the role of general propensities and abstract categorizations. Rather, they posit that individuals choose actions that align with their perceived options, goals, and roles in the present situation (Lloyd 2015). Some theorists use the metaphor of a “game” (Szasz 1961), where individuals subjectively determine what outcomes constitute “winning”. As such, the situational antecedents of crime behaviors (e.g., opportunities, motivations, social standing) are integral to understanding criminal actions, because antecedents set and constrain the rules of the subjective “game”. Predicting recidivism in the short-term should therefore be improved by better understanding situational antecedents in general and acute risk factors in particular. In other words, because propensity-focused frameworks cannot explain periods of non-criminal behavior, frameworks focused on the immediate interaction between antecedents and subjective goals are better poised to explain the timing of recidivism (see Douglas & Skeem 2005). Methodologically, this approach requires assessing individualized experiences of potentially destabilizing antecedents and their association with criminal behavior across time. Individuals’ personal understandings and perceptions of their situation are especially relevant.

Community Studies of Risk Reassessment

Longitudinal studies of dynamic risk factors (e.g., ratings of internal and external situational antecedents) in community settings are the best approach to understand how dynamic risk changes. The community setting ensures that each participant has similar opportunity to commit a new crime. Further, the greater the number of reassessments included in a study, the higher the resolution for describing the change process. Community studies with frequent reassessments of dynamic risk remain relatively rare. Some notable exceptions include studies by Hanson et al. (2007), Babchishin and Hanson (2020), Hanson et al. (2021), Howard and Dixon (2013), Davies et al. (2021), Lloyd et al. (2020), and Stone et al. (2021). These studies have tended to focus primarily on prediction, with descriptions of risk change limited to the average trend for the total sample or for retrospectively identified subgroups (e.g., based on later recidivism status). These researchers have indirectly addressed change in risk using Cox regression survival analysis with time-varying predictors. In this approach, recidivism is framed as the survival event in a time-to-event survival analysis, and reassessments of dynamic risk are framed as a time-dependent covariate. Further, these researchers have applied different specifications of dynamic risk scores (e.g., total averages across time, recent averages of scores) to evaluate model fit and effect sizes.
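To make the time-varying covariate framing concrete, the following minimal sketch (not code from any of the cited studies) restructures one person's weekly scores into start-stop intervals and computes two of the score specifications mentioned above. The function name, the three-week window for the recent average, and the example scores are illustrative assumptions.

```python
from statistics import mean


def to_counting_process(scores, event_week=None, recent_window=3):
    """Build one row per week in (start, stop] format for a single person.

    scores[k] is the assessment available at the start of week k + 1.
    event_week is the week in which recidivism occurred (None if censored).
    recent_window is a hypothetical choice, not taken from the cited studies.
    """
    rows = []
    for week, score in enumerate(scores, start=1):
        history = scores[:week]
        rows.append({
            "start": week - 1,
            "stop": week,
            "current": score,                                # most recent score
            "total_mean": mean(history),                     # average of all scores so far
            "recent_mean": mean(history[-recent_window:]),   # average of latest scores
            "event": int(event_week == week),
        })
        if event_week == week:
            break  # no follow-up rows after the event
    return rows


rows = to_counting_process([8, 6, 5, 7], event_week=4)
```

Each row can then be passed to any survival routine that accepts start-stop data, with one of the three score columns entered as the time-dependent predictor.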

These studies support the conclusion that change in risk levels is important to understanding risk of recidivism. Updated risk assessments predict any, violent, and sexual recidivism (Howard & Dixon 2013; Davies et al. 2021; Babchishin & Hanson 2020; Lloyd et al. 2020) incrementally over initial time-at-release risk scores, with aggregated assessments particularly predicting violent recidivism (Babchishin & Hanson 2020; Stone et al. 2021). However, it may be beneficial to more closely examine how individuals’ assessed risk levels change over time. On average, levels of risk factors tend to decrease in the community whereas protective factors increase (Lloyd et al. 2020; Stone et al. 2021). Although there is clearly variability in how much each individual’s level of assessed risk changes following community re-entry, it remains unclear whether individuals’ risk score trajectories differ primarily in degree, or if there are qualitatively distinct patterns of change.

Group Trajectory Models

Group trajectory methods are useful for describing different patterns of change within a population. Group trajectory approaches assume that persistent unobserved heterogeneity has a discrete rather than continuous distribution (Piquero 2008). The method may be used to identify groups within a population that share similar trajectories of a variable (e.g., rate of criminal charges, levels of specific risk factors) over time. These methods are popular in the field of criminology (Piquero 2008), especially to describe patterns of criminal activity across the life course.

Research has generally substantiated that subpopulations of individuals involved in the criminal justice system follow different trajectories of criminal activity over time (Jennings & Reingle 2012; Piquero 2008). It would be theoretically consistent for trajectories of risk factor ratings (i.e., risk-related psychosocial factors) to be similarly discretely distributed in these populations. For instance, among a group of individuals paroled into the community, those with desisting trajectories would be expected to have faster declining levels of risk factors compared to those still persisting in criminal involvement, whereas those in the early stages of their criminal careers may be expected to have concurrently increasing levels of risk factors. Risk factors are not evenly distributed among individuals on supervision (Taxman & Caudy 2015) and different profiles of risk are associated with different recidivism outcomes. Zamble and Quinsey (2001) demonstrated among recently reincarcerated men that recent charges of assault, robbery, or property crimes were associated with different constellations of acute risk factors. For example, although each group of men reported alcohol misuse, alcohol was most strongly associated with assault, whereas men with recent robbery convictions reported more cocaine use and men with recent property crime convictions reported more heroin use. Affective factors especially differentiated the groups. Men with new assault convictions reported high levels of anger in the months preceding their arrest, whereas men convicted of robbery reported general dysphoria (e.g., hopelessness, anxiety) without anger. Men convicted of property crimes reported increasing levels of frustration prior to arrest, also without high levels of anger. Importantly, Zamble and Quinsey observed these different patterns over time, suggesting that trajectories of change in acute risk may inform recidivism and type of new charge.

However, researchers have rarely used group trajectory methods to describe patterns of change in ratings of risk factors. Two studies used samples of forensic patients (Billen et al. 2019; Quinsey et al. 2006a, b) and another two examined samples of justice-involved juveniles (Baglivio et al. 2017; Clark 2015). Each study found evidence of distinct groups with different patterns of change in various risk factors, revealing heterogeneity in trajectories of overall risk scores (Baglivio et al. 2017; Clark 2015; Quinsey et al. 2006a, b) and impulsivity and self-control (Billen et al. 2019), although the number of groups identified varied across studies. Additionally, post-hoc analyses of outcomes (e.g., ANOVA tests of differences in recidivism rates between participants designated to trajectory groups by posterior probabilities) indicated that group trajectories may be useful to understand the relationship between change in risk factors and recidivism or related outcomes (e.g., elopement, violation of conditions, violent and aggressive incidents in Quinsey et al. 2006a, b). For example, Billen et al. (2019) found that likelihood of recidivism was different across the three groups identified from trajectories of change in impulsivity ratings. Together, these studies indicated that trajectories helped to describe risk for recidivism in the context of community re-entry following either residential placement (Baglivio et al. 2017) or discharge from forensic psychiatric care (Billen et al. 2019; Quinsey et al. 2006a, b).

To our knowledge, however, no published studies have applied group trajectory methods to community-based samples of adults with criminal histories. One likely reason is non-randomly missing data. In samples of paroled individuals, recidivism often results in a return to incarceration. Once returned to incarceration, community-based risk assessment data ceases. Because higher risk for recidivism is associated with higher likelihood of return to incarceration, this non-randomly censored data introduces a high degree of bias within traditional group trajectory analyses that require uniform assessment schedules, thus over-representing individuals with lower risk levels in the data. Developing a nuanced understanding of how risk levels change across re-entry requires methods that allow retaining data from individuals who recidivate prior to study completion.

Joint Latent Class Modelling (JLCM)

JLCM presents a direct means of modelling group trajectories while accounting for non-randomly missing data by combining time-to-event survival analysis with latent class modelling. To our knowledge, this approach has not been used in community corrections research, but has been gaining traction in the field of biostatistics. For example, Smagula et al. (2019) used JLCM to control for attrition and death while modelling trajectories of daytime sleepiness in patients at risk for dementia. Proust-Lima et al. (2017) used JLCM to model trajectories of cognitive decline with time to dementia diagnosis, to create individualized dynamic predictions of when dementia diagnosis might occur. This approach can also be expanded to include multiple competing outcomes; Brilleman et al. (2019) examined trajectories of change in body mass index as these changes may have preceded either kidney transplantation or death.

JLCM undoubtedly presents a more statistically valid approach to modelling group trajectories in samples where non-randomly missing data are unavoidable. However, it is unclear whether this approach would improve understanding of community re-entry and changes in acute risk factors. This study aims to demonstrate the utility of JLCM for modelling corrections data and to test the value of JLCM as an approach to predicting recidivism. Specifically, we aim to illustrate how latent groups describe the association between changes in assessed levels of acute risk factors and recidivism. Given prior research demonstrating average declines in acute risk (Davies et al. 2021; Lloyd et al. 2020) we expected that most participants would evidence decreasing trajectories of assessed risk, although a small proportion would evidence increasing risk. We expected that increasing acute risk trajectories would be associated with high recidivism and that decreasing trajectories would be associated with lower recidivism. However, in consideration of phenomenological explanations of crime, we expected that decreases in risk would not be uniform and would not be uniformly associated with reduced recidivism. Specifically, participants with improving risk profiles might recidivate after experiencing a destabilizing event. Such events might be reflected by subsequent increases or plateaus in the assessed risk trajectory, or might occur so close to the recidivism event as to be unobservable in participants’ assessment histories. We expected a trajectory reflecting this pattern to be associated with more likely recidivism despite initially-decreasing acute risk.

Method

Participants

Beginning in April 2010, the New Zealand Department of Corrections mandated that supervision officers administer an assessment of dynamic risk (DRAOR, see Dynamic Risk and Protective Factors below) as part of routine practice with their caseloads of paroled individuals, re-scoring the tool at each contact of sufficient length to reassess the items. Coinciding with the implementation of this risk measure, this study used data recorded between April 1, 2010 and March 31, 2012. The data set included all individuals released into community supervision in New Zealand following incarceration during this two-year period.

New Zealand law mandates that all individuals sentenced to two or more years in prison must report to a supervision officer (approximately weekly to fortnightly) for at least six months following release. Those who had served the entirety of their sentences in prison were thus supervised in the community for six months, whereas those released prior to sentence expiry were supervised for additional time, depending on when parole was granted during their sentence. Although some records extended beyond 1 year, we used only up to 52 weeks of assessment data for each participant; during these 52 weeks, assessment records were largely complete up to time of censoring (i.e., weekly or fortnightly records), but assessments became less consistent as reporting requirements reduced after 52 weeks. To describe the association between recidivism and acute risk levels in the short-term, assessments must be adequately proximal to recidivism. After 52 weeks, this was rarely achievable, given sporadic assessment. The data set included 97,188 assessments completed for 3,498 participants (Lloyd 2015).

Measures

Recidivism

The New Zealand Department of Corrections provided official record recidivism outcomes from the beginning of data collection through July 18, 2012. Outcomes included technical violations, non-violent criminal offenses, or violent criminal offenses. Technical violations were defined as crimes without identifiable victims (e.g., parole violations, substance use offenses). Non-violent criminal offenses included non-violent crimes against others (e.g., fraud, property destruction). Violent offenses included non-sexual violent crimes against others (e.g., assault, robbery) as well as sexual violence. In our data set, parole violations were formally prosecuted, and convictions almost always resulted in cancellation of parole (i.e., reincarceration). As such, these outcomes were not fully due to supervision officers’ discretion.

Static Risk

In the New Zealand Department of Corrections, a computer automatically scores static risk using the Risk of Reconviction*Risk of Re-Imprisonment (RoC*RoI) tool developed internally from the logistic regression model reported by Bakker et al. (1999). The RoC*RoI’s 16 items include demographic information (gender and age) and criminal history information (e.g., criminal career length, crime frequency, and crime seriousness). We did not include age as a separate variable because, in addition to being included in RoC*RoI’s algorithm, it is strongly correlated with RoC*RoI scores in this sample (r = −0.51). RoC*RoI scores range from 0 to 1, with scores accurately calibrated to the likelihood of being reconvicted and reincarcerated within the next 5 years, with strong discrimination of future recidivists (AUC = 0.76; Nadesu 2007). Scores represent likelihood such that a score of 0.65 indicates a 65% likelihood of reincarceration within the next 5 years.

Dynamic Risk and Protective Factors

Supervision officers assessed dynamic factors using the Dynamic Risk Assessment for Offender Re-entry (DRAOR; Serin 2007), a 19-item case management tool conceptually organized into three subscales assessing the three domains of stable risk factors, acute risk factors, and protective factors. Assessors decide which risk factors are present and to what extent primarily based on interview information, supplemented with available file or collateral information. The DRAOR Stable subscale contains six relatively stable risk factors that are still theoretically amenable to change (i.e., attitudes, traits, and behavior patterns). The DRAOR Acute subscale comprises seven items that assess potentially rapid-changing situations (employment, living situation, interpersonal relationship conflict, and opportunity to recidivate), moods (negative mood and anger/hostility), and behaviors (substance use). Items from both DRAOR Stable and Acute subscales are scored as either no problem (0), slight/possible problem (1), or definite problem (2), for a maximum score of 12 and 14, respectively. Higher scores on DRAOR Stable and Acute subscales are associated with higher recidivism risk. The DRAOR Protect subscale consists of six items assessing prosocial perceptions of self and positive social connectedness. These are scored as either not an asset (0), slight/possible asset (1), or definite asset (2), for a maximum score of 12, with higher scores associated with lower recidivism risk.
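The scoring rules above can be summarized in a short sketch. This is an illustration of the arithmetic only, not the DRAOR scoring software; the function name and the example ratings are hypothetical, and items are represented by position because the individual item labels are not reproduced here.

```python
# Number of items per DRAOR subscale, as described in the text.
SUBSCALE_ITEMS = {"stable": 6, "acute": 7, "protect": 6}


def score_subscale(name, ratings):
    """Sum item ratings (each 0, 1, or 2) for one subscale.

    Raises ValueError if the number of ratings or any rating is invalid.
    """
    expected = SUBSCALE_ITEMS[name]
    if len(ratings) != expected:
        raise ValueError(f"{name} requires {expected} ratings, got {len(ratings)}")
    if any(r not in (0, 1, 2) for r in ratings):
        raise ValueError("each rating must be 0, 1, or 2")
    return sum(ratings)


# Hypothetical ratings for the seven Acute items at one assessment.
acute_total = score_subscale("acute", [2, 1, 0, 0, 1, 2, 1])
```

Because every item is capped at 2, the maximum totals follow directly from the item counts: 12 for Stable, 14 for Acute, and 12 for Protect.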

When supervision officers use the in-session interview to assess the DRAOR Acute dynamic risk factors, they take a phenomenological approach. This is especially apparent in assessments of internal mood but is also relevant when recording situations and behaviors. For example, the DRAOR manual guides supervision officers to query individuals’ perceptions of their substance use, employment, and relationships to assess personal engagement, rather than simply the presence or absence of the factor. Thus, to determine the intensity of the problem, ratings are based not simply on drug tests, employment status, or relationship status, but on the individual’s personal experiences pursuing sobriety, fulfilling employment, and relationship satisfaction.

Although scores on DRAOR subscales are correlated, the subscales are conceptually distinct by assessing risk of recidivism from different temporal and directional vantage points, and the tool developers currently do not recommend using a DRAOR composite score within research. For this reason, we chose not to examine composite scores. DRAOR has effectively discriminated people with future recidivism versus non-recidivism in multiple populations, including both men and women (Yesberg et al. 2015; Yesberg & Polaschek 2014, 2019). It discriminated general recidivism (Davies et al. 2021; Lloyd et al. 2020) as well as violent recidivism (Stone et al. 2021), incremental to static risk measures and initial dynamic scores assessed at time of re-entry.

Procedure

Ethics approval was obtained from institutional review boards at the following institutions: Carleton University, Swinburne University of Technology, and University of Texas at El Paso.

Supervision officers initially assessed and reassessed (approximately weekly to fortnightly) all 19 DRAOR items as part of routine practice. We excluded participants (less than 8% of the population of people supervised in the jurisdiction during the study time frame) when assessments could not be matched with a release date, if a static risk assessment was not recorded in the data set, or if the participant’s initial dynamic assessment occurred more than four weeks following release from incarceration. During the study period, some participants experienced return to incarceration and re-release. Because all relevant information was updated in the database with each re-release (i.e., age, static risk, and dynamic risk), we included subsequent re-entry sequences in our analyses as new trajectories toward recidivism. This was conceptually consistent with our definition of a trajectory as progressing from release from incarceration, reflecting a unique re-entry experience.

Given the approximate weekly assessment schedule, we chose weeks as a meaningful unit of time to measure dynamic risk trajectories. Although JLCM does not require imputation, it reduces model bias to have complete assessment schedules prior to censorship. To consider the maximum amount of data while ensuring that trajectories reflected actual scores, we imputed the most recent assessment into weeks missing an assessment, up to no more than six consecutive weeks. If seven consecutive weeks of data were missing, we truncated sequences at the last recorded assessment. Only 14.3% of sequences required imputation beyond three weeks, most missing weeks (65.7%) were single occurrences, and the majority (94.2%) of recidivism events occurred within two weeks of an assessment. Additionally, we lagged recidivism outcomes by one week to ensure that all DRAOR assessments were prospective to recidivism.
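The imputation rule described above can be sketched as follows. This is a minimal reconstruction from the description, not the code used in the study; the function name, the data layout (a mapping from week number to score), and the example values are assumptions. The sketch covers only gaps between recorded assessments and does not implement the one-week lag of recidivism outcomes.

```python
def impute_weekly(observed, max_gap=6):
    """Carry the most recent score forward into missing weeks.

    observed maps week number -> recorded score for one re-entry sequence.
    Gaps of up to max_gap consecutive missing weeks are filled; at the first
    longer gap the sequence is truncated at the last recorded assessment.
    Returns a list of (week, score, was_imputed) tuples.
    """
    filled = []
    previous_week = None
    for week in sorted(observed):
        if previous_week is not None:
            missing = week - previous_week - 1
            if missing > max_gap:
                break  # seven or more missing weeks: stop at the last record
            for gap_week in range(previous_week + 1, week):
                filled.append((gap_week, observed[previous_week], True))
        filled.append((week, observed[week], False))
        previous_week = week
    return filled


# Weeks 3-4 are imputed; the seven-week gap before week 13 truncates the sequence.
filled = impute_weekly({1: 5, 2: 4, 5: 6, 13: 3})
```

Keeping the `was_imputed` flag makes it straightforward to report how many weeks were filled in and how long each run of imputed values was.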

Analytic Approach

We conducted analyses on each of the three DRAOR subscales. However, we present analysis and results from DRAOR Acute only in this article because theoretically acute risk factors have unique value for predicting short-term recidivism. Further, prior studies demonstrated that change in DRAOR Acute scores occurred faster than change in other subscales. We include results from DRAOR Stable and DRAOR Protect in appendices.

Prior to estimating latent classes for our joint model, we used mixed modelling to determine the appropriate specification for time and establish variability in individual slopes and intercepts of DRAOR Acute. We used the lme4 package (Bates et al. 2015) for R (R Core Team 2015). For model selection, we randomly selected a calibration sample comprising approximately 60% of the full dataset (N = 51,864 observations of 2,050 re-entry sequences). Splitting the sample for model selection allowed us to test the parameters estimated and reduced computation time. We then applied the selected model to the remaining 40% of the sample. We compared proportions of participants likely to belong to each group across samples, and visually compared the predicted trajectories. Finally, we applied the final selected model to the full sample for descriptive analyses.
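A split of this kind can be sketched as below. We assume, for illustration, that the split is made at the level of the re-entry sequence so that all assessments from one sequence fall in the same subsample; the function name, the seed, and the example identifiers are hypothetical.

```python
import random


def split_sequences(sequence_ids, calibration_fraction=0.6, seed=2015):
    """Randomly assign whole re-entry sequences to calibration or validation.

    Splitting by sequence (an assumption made here) keeps every assessment
    from the same sequence in one subsample.
    """
    ids = sorted(set(sequence_ids))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    cut = round(len(ids) * calibration_fraction)
    return set(ids[:cut]), set(ids[cut:])


calibration, validation = split_sequences(range(1, 101))
```

Because sequences differ in length, a 60% share of sequences yields only approximately 60% of the assessment records.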

Modelling

We conducted joint latent class modelling using the Jointlcmm function of the lcmm package of R (Proust-Lima et al. 2017). This approach estimates a group structure that accounts for both the trajectory of a longitudinally reassessed variable (i.e., DRAOR Acute ratings) and the time to a survival event (i.e., recidivism). Individuals in the same group share similar trajectories of acute risk as well as similar time to recidivism. To be clear, individuals do not actually belong to any group identified by JLCM. Rather, the groups identified reflect points of support for unobserved heterogeneity in trajectories of acute risk and associated time to recidivism. We refer to participants as belonging to specific groups for the sake of parsimony, but as true latent class membership is unknown, when we refer to group membership we are actually referring to the probability that an individual can be characterized as belonging within a specific, unknowable latent class. The JLCM approach estimates these groups using (1) a class membership submodel, (2) a longitudinal submodel, and (3) a hazard submodel (see Proust-Lima et al. 2014, 2017).

Class Membership Submodel

The first regression submodel contained within the JLCM approach designates probabilistic class membership. We assume that each individual i (i = 1, …, N) belongs to a single latent class characterized by dynamic risk trajectory and associated recidivism outcome. True latent class membership, \(c_i\), cannot be known. We therefore model the probability that an individual i belongs to each of the possible latent classes through a multinomial distribution where \(P(c_i = g) = \pi_{ig}\) for latent classes g (g = 1, …, G) (Proust-Lima et al. 2017). We included RoC*RoI scores as a covariate in the class membership model to account for demographic (including age) and criminal history differences associated with recidivism and to ensure that groups differed primarily in terms of dynamic risk.
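As an illustration of how a covariate enters the class membership probabilities, the sketch below assumes the usual multinomial logistic parameterization, in which each class has an intercept and a covariate coefficient and the last class serves as the reference (both fixed at zero). The coefficient values are hypothetical and are not estimates from this study.

```python
import math


def class_probabilities(static_risk, intercepts, slopes):
    """Multinomial logistic class-membership probabilities for one person.

    intercepts and slopes hold one value per latent class; the final class is
    the reference, so its intercept and slope should both be 0.
    """
    linear = [a + b * static_risk for a, b in zip(intercepts, slopes)]
    top = max(linear)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in linear]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-class model evaluated at a RoC*RoI score of 0.65.
probs = class_probabilities(0.65, intercepts=[-1.0, 0.5, 0.0], slopes=[2.0, -1.0, 0.0])
```

The probabilities sum to one for every individual, and a positive coefficient means that higher static risk shifts membership toward that class relative to the reference class.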

Longitudinal Submodel

The second submodel within the JLCM approach describes the class-specific trajectories of longitudinal variables (i.e., DRAOR Acute scores). This model is identical to the standard linear mixed model, with the exception that it allows for class-specific distributions of both fixed and random effects. Assuming a Gaussian outcome, the longitudinal trajectory of marker variable y for individual i at time tij (j = 1,…, ni) in class g is assumed to follow

$${y}_{ij }{|}_{{c}_{i}=g}= {X}_{L1i}{\left({t}_{ij}\right)}^{T}\beta +{X}_{L2i}{\left({t}_{ij}\right)}^{T}{v}_{g}+{Z}_{i}{\left({t}_{ij}\right)}^{T}{u}_{ig}+{w}_{i}\left({t}_{ij}\right)+{\epsilon }_{ij}$$
(1)

(Proust-Lima et al. 2017, p. 6), where \({X}_{L1i}\) is the vector of covariates associated with class-common fixed effects, \(\beta\); \({X}_{L2i}\) is the vector of those same covariates, associated with class-specific effects, \({v}_{g}\); and \({Z}_{i}\) is the vector of covariates associated with individual random effects, \({u}_{ig}\). The process \({w}_{i}\left({t}_{ij}\right)\) is a zero-mean Gaussian stochastic process, and \({\epsilon }_{ij}\) represents normally distributed measurement error.

For the longitudinal model, we included assessment occasions and intercepts as covariates, with class-common, class-specific, and random effects. We included RoC*RoI scores as class-common covariates.
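As a rough illustration of Eq. (1), the sketch below evaluates the expected score for one individual in one class, using a class-common effect of static risk, a class-specific intercept and slope for time, and an individual random intercept and slope. It is a simplification under stated assumptions: time enters linearly, and the stochastic process and measurement error terms are omitted. All parameter values are hypothetical.

```python
def expected_score(week, static_risk, class_effects, common_effect,
                   random_effects=(0.0, 0.0)):
    """Expected marker value for one individual in one latent class.

    class_effects:  (intercept, slope) specific to the latent class.
    common_effect:  coefficient for static risk shared by all classes.
    random_effects: (intercept, slope) deviations for this individual.
    """
    class_intercept, class_slope = class_effects
    random_intercept, random_slope = random_effects
    return (common_effect * static_risk
            + class_intercept + class_slope * week
            + random_intercept + random_slope * week)

# Hypothetical decreasing class: starts near 6.5 and falls 0.05 points per week.
baseline = expected_score(0, 0.5, (6.0, -0.05), 1.0)
week_ten = expected_score(10, 0.5, (6.0, -0.05), 1.0)
```

Individuals in the same class share the class effects and differ only through their random effects, which is what allows within-class scatter around the mean trajectory.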

Hazard Submodel

The final submodel within JLCM describes class-specific hazard for a survival event (i.e., recidivism). We used any recidivism (including reconvictions for technical violations) as a survival event.


Preliminary visual inspection indicated that our survival data followed a convex monotonically decreasing pattern, with rates of recidivism decreasing over time, and cumulative hazard plots indicated relatively good fit of a Weibull parametric model. We therefore chose a class-specific Weibull proportional hazard model with the RoC*RoI as a class-common covariate. In the class-specific survival model described in Eq. (2) (see Proust-Lima et al. 2017, p. 7), \({T}_{i}^{*}\) denotes the time to recidivism, while \({\widetilde{T}}_{i}\) denotes censoring time (i.e., exit from the dataset in the absence of recidivism). The time of exit from the dataset is thus \({T}_{i}=\min({T}_{i}^{*}, {\widetilde{T}}_{i})\), and the event indicator is \({E}_{i}={1}_{{T}_{i}^{*}\le {\widetilde{T}}_{i}}\). Given the latent class g and survival up until time t, the risk of recidivism can be described using the following proportional hazard model:

$${\lambda }_{i}\left(t\right){|}_{{c}_{i}=g}= {\lambda }_{0g}\left(t\right){e}^{{X}_{Si1}^{\mathrm{\top }}\nu +{X}_{Si2}^{\mathrm{\top }}{\delta }_{g}},$$
(2)

where \({\lambda }_{0g}\) indicates the class-specific baseline hazard, \({X}_{Si1}\) is the vector of covariates associated with parameters common over classes, \(\nu\), and \({X}_{Si2}\) is the vector of covariates associated with class-specific parameters, \({\delta }_{g}\).
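The class-specific Weibull proportional hazard in Eq. (2) can be illustrated with the sketch below. The shape and scale parameterization used here (baseline hazard (k/b)(t/b)^(k−1)) is one common convention and is an assumption; the lcmm package may parameterize the Weibull baseline differently. The numeric values are hypothetical.

```python
import math

def weibull_hazard(t, shape, scale, linear_predictor=0.0):
    """Hazard at time t: Weibull baseline times exp(linear predictor)."""
    baseline = (shape / scale) * (t / scale) ** (shape - 1.0)
    return baseline * math.exp(linear_predictor)

def weibull_survival(t, shape, scale, linear_predictor=0.0):
    """Probability of remaining event-free up to time t."""
    cumulative_baseline = (t / scale) ** shape
    return math.exp(-cumulative_baseline * math.exp(linear_predictor))

# Hypothetical class with shape < 1, so the hazard decreases over time.
early_hazard = weibull_hazard(4.0, shape=0.7, scale=60.0)
late_hazard = weibull_hazard(20.0, shape=0.7, scale=60.0)
survival_at_one_year = weibull_survival(52.0, shape=0.7, scale=60.0)
```

A shape below one reproduces the pattern described in the text, with the rate of recidivism highest shortly after release and declining thereafter. Covariates scale the hazard by a constant factor at every time point, which is the proportional hazards property.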

Model Selection

We compared models with varying numbers of classes, estimated from the 60% calibration sample, using a spline link function to account for higher order time effects. Our model was selected using several criteria. First, the Bayesian information criterion (BIC) was considered. BIC provides the relative marginal likelihood of competing models, penalizing free parameters (Schwarz 1978). The model with the smallest BIC is most supported by the data. Another indication of fit, relative entropy, was calculated using the methods described by Brilleman et al. (2019; equation in Appendix 1). Values of relative entropy range from 0 to 1, with 0 indicating random classification and 1 indicating perfect classification. Another consideration was the conditional independence test statistic. JLCM assumes that the latent group structure accounts for the entirety of the relationship between the trajectory variable (risk for recidivism) and the time to the survival event (Proust-Lima et al. 2014). We therefore granted preference to models in which this assumption was upheld. The proportion of participants designated to each class was also considered, in accordance with the suggestions of Nagin (2005). Solutions were not considered if any of the classes estimated contained less than 1% of the sample. Finally, also in accordance with guidelines provided by Nagin (2005), we considered mean posterior probabilities, or the likelihood that any given participant belonged to the group to which they were assigned. Solutions were only considered if the mean posterior probability for each class was above 70% (Nagin 2005).
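The selection criteria above can be sketched in code. The relative entropy formula used here is the standard one (one minus the summed classification entropy divided by N log G); the exact equation the authors used appears in their Appendix 1, which is not reproduced here, so this form is an assumption. The thresholds (1% minimum class size, 0.70 mean posterior probability) come from the text; the function names and toy posterior matrix are hypothetical.

```python
import math

def bic(log_likelihood, n_parameters, n_subjects):
    """Bayesian information criterion; smaller values indicate better support."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_subjects)

def relative_entropy(posteriors):
    """1 = perfect classification, 0 = classification no better than random."""
    n_subjects = len(posteriors)
    n_classes = len(posteriors[0])
    entropy = -sum(p * math.log(p) for row in posteriors for p in row if p > 0.0)
    return 1.0 - entropy / (n_subjects * math.log(n_classes))

def class_summaries(posteriors):
    """Per class: (proportion assigned, mean posterior probability of members)."""
    n_classes = len(posteriors[0])
    assigned = [max(range(n_classes), key=row.__getitem__) for row in posteriors]
    summaries = {}
    for g in range(n_classes):
        members = [row[g] for row, a in zip(posteriors, assigned) if a == g]
        proportion = len(members) / len(posteriors)
        mean_posterior = sum(members) / len(members) if members else float("nan")
        summaries[g] = (proportion, mean_posterior)
    return summaries

def is_admissible(posteriors, min_proportion=0.01, min_mean_posterior=0.70):
    """Apply the class-size and mean-posterior-probability thresholds."""
    return all(prop >= min_proportion and mean > min_mean_posterior
               for prop, mean in class_summaries(posteriors).values())

# Hypothetical posterior probabilities for three individuals and two classes.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]
summaries = class_summaries(posteriors)
```

In this toy example the second class has a mean posterior probability of exactly 0.70, so it fails a strict "above 70%" rule.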

Post Fit Analyses

Descriptive

After selecting a model, posterior probabilities were used to designate participants to the groups identified. Descriptive statistics were then compared across groups, including participant age, length of follow up time (i.e., number of weeks assessed), risk scores, and recidivism. Risk scores included mean baseline RoC*RoI and DRAOR scores, as well as mean total change and mean net change in DRAOR scores. Mean total change was calculated by subtracting baseline DRAOR scores from final scores for each participant. It describes the overall direction and amount of change in risk, such that a negative score indicates a decrease in risk or protective scores, whereas a positive score indicates an increase. In contrast, mean net change was calculated to determine how much DRAOR scores changed on average from one week to the next, regardless of direction of change. This measure is the mean absolute difference between consecutive weekly scores, such that a higher number indicates greater instability from one week to the next. We also conducted paired-sample t-tests to assess whether participants’ final DRAOR ratings were different from initial DRAOR ratings to a statistically significant degree. Frequency of recidivism, as defined in the previous text, was also compared across groups.
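The two change measures can be illustrated with a short sketch. Total change is signed so that a negative value indicates a decrease, as described above, and net change is the mean absolute week-to-week difference. The function names and example scores are hypothetical.

```python
def total_change(scores):
    """Final score minus baseline score; negative values indicate a decrease."""
    return scores[-1] - scores[0]

def mean_net_change(scores):
    """Mean absolute difference between consecutive assessments."""
    if len(scores) < 2:
        return 0.0
    differences = [abs(later - earlier)
                   for earlier, later in zip(scores, scores[1:])]
    return sum(differences) / len(differences)

# Hypothetical weekly DRAOR Acute scores for one participant.
weekly_scores = [6, 5, 7, 4]
change = total_change(weekly_scores)          # 4 - 6 = -2
instability = mean_net_change(weekly_scores)  # (1 + 2 + 3) / 3 = 2.0
```

The example shows why the two measures are complementary: the participant's score fell by only two points overall, yet moved by an average of two points every week.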

We conducted Kruskal–Wallis rank sum tests to check for statistically significant differences between groups in continuous variables. We used Pearson’s Chi-squared test to assess differences in recidivism rates between groups.

Dynamic Prediction

Dynamic predictions were computed from the final selected model using the lcmm package (see Proust-Lima et al. 2017). The dynamic prediction provides the probability that the event time \({T}_{i}\) falls within the prediction window (\(s, s+t\)) due to cause p, given longitudinal data collected up until time s and the parameters of the model with G groups, \({\theta }_{G}\). Given a single cause of event (i.e., recidivism), \({\delta }_{i}\) = p = 1,

$${\text{P}}\left({T}_{i}\le s+t,{\delta }_{i}=p|{T}_{i}\ge s, {Y}_{i}^{\left(s\right)},{X}_{i}^{\left(s\right)},{X}_{Si},{X}_{ci};{\theta }_{G}\right)$$
$$=\frac{{\sum }_{g=1}^{G}\mathrm{\rm P}\left({c}_{i}=g|{X}_{ci} ; {\theta }_{G}\right)\mathrm{\rm P}\left({T}_{i}\in \left(s, s+t\right]|{X}_{Si},{c}_{i}=g;{\theta }_{G}\right)f\left({Y}_{i}^{(s)}|{X}_{i}^{\left(s\right)},{c}_{i}=g;{\theta }_{G}\right)}{{\sum }_{g=1}^{G}\mathrm{\rm P}\left({c}_{i}=g|{X}_{ci};{\theta }_{G}\right){S}_{i}\left(s|{X}_{Si},{c}_{i}=g; {\theta }_{G}\right)f\left({Y}_{i}^{(s)}|{X}_{i}^{\left(s\right)}, {c}_{i}=g;{\theta }_{G}\right)}$$
(3)

where the probability of class membership \(\mathrm{\rm P}\left({c}_{i}=g|{X}_{ci} ; {\theta }_{G}\right)\), the class-specific cumulative incidence \(\mathrm{\rm P}\left({T}_{i}\in \left(s, s+t\right]|{X}_{Si},{c}_{i}=g;{\theta }_{G}\right)\), the density of longitudinal outcomes \(f\left({Y}_{i}^{(s)}|{X}_{i}^{\left(s\right)},{c}_{i}=g;{\theta }_{G}\right)\), and the class-specific survival function \({S}_{i}\left(s|{X}_{Si},{c}_{i}=g; {\theta }_{G}\right)\) are conditional on covariate scores at time s (Eq. (3) from Proust-Lima et al. 2017, p. 24).
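Eq. (3) is a ratio of two class-weighted sums, which the following sketch makes explicit. It assumes a single cause of event, so the class-specific cumulative incidence over the window equals the class-specific survival at s minus the survival at s + t. The inputs would come from a fitted model; the numbers used here are hypothetical.

```python
def dynamic_prediction(class_probs, survival_at_s, survival_at_s_plus_t,
                       marker_densities):
    """Probability of the event in (s, s + t], given survival to s and the
    longitudinal data observed up to s. Each argument has one entry per class.
    """
    numerator = 0.0
    denominator = 0.0
    for pi, surv_s, surv_end, density in zip(
            class_probs, survival_at_s, survival_at_s_plus_t, marker_densities):
        incidence = surv_s - surv_end  # P(s < T <= s + t | class)
        numerator += pi * incidence * density
        denominator += pi * surv_s * density
    return numerator / denominator

# Hypothetical two-class example.
risk = dynamic_prediction(
    class_probs=[0.5, 0.5],
    survival_at_s=[0.9, 0.6],
    survival_at_s_plus_t=[0.8, 0.3],
    marker_densities=[0.2, 0.1],
)
```

The marker densities act as weights: a class under which the observed scores are more plausible contributes more to both sums, so the prediction shifts toward that class's survival curve as assessments accumulate.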

We selected prediction windows (\(s, s+t\)) that reflected early milestones in re-entry (4–12 weeks, 12–20 weeks, and 24–32 weeks). We chose to use narrow prediction windows to mirror the timescale over which supervision decisions are generally made.

The dynamic predictions from the selected model were compared to predictions from an equivalent joint model with no latent group structure (see Appendix 2). These models were fit using JM (Rizopoulos 2010).

Using methods described by Blanche et al. (2015), we used the dynamic predictions to calculate dynamic AUCs and Brier scores for the specified prediction windows. Dynamic AUCs quantify a model’s discriminative ability, such that a higher AUC indicates a greater ability to discriminate between participants who recidivate and participants who do not, during the specified prediction window. The Brier score reflects calibration, or how close the estimated risk score is to true underlying risk, such that a lower Brier score indicates a more accurate model. AUCs and Brier scores from the selected models were compared to those from the joint models with no latent class structure to test if the addition of a latent group structure improved prediction.
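The two accuracy measures can be illustrated in simplified form. The sketch below treats the outcome within a prediction window as fully observed and therefore omits the inverse probability of censoring weights that the estimators of Blanche et al. (2015) use to handle censored cases. It is a conceptual illustration only, with hypothetical data.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(predicted)

def auc(predicted, observed):
    """Probability that a randomly chosen case has a higher predicted risk
    than a randomly chosen control; ties count as one half."""
    cases = [p for p, y in zip(predicted, observed) if y == 1]
    controls = [p for p, y in zip(predicted, observed) if y == 0]
    concordant = 0.0
    for case in cases:
        for control in controls:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))

# Hypothetical predicted risks and observed recidivism within one window.
predicted = [0.8, 0.3, 0.6, 0.1]
observed = [1, 0, 0, 1]
window_brier = brier_score(predicted, observed)
window_auc = auc(predicted, observed)
```

An AUC of 0.5 corresponds to chance-level discrimination and a Brier score of 0 to perfect prediction, so a competing model improves on another when its AUC is higher and its Brier score is lower in the same window.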

Results

Descriptive Statistics of the Full Sample

After eliminating assessments in the data set for the reasons described earlier (representing N = 77 individuals; see Lloyd 2015), 92,104 observations of 3,421 individuals remained. These participants were primarily male (92.8%). Ages ranged from 17 to 86 years, with a mean age of 35.1 (SD = 11.7). Follow up times for individual participants ranged from one to 52 weeks (M = 25.2, SD = 17.0), at which point we truncated the follow up.

Of these individuals, some were returned to incarceration before being re-released into community supervision. There were 207 participants with two assessment sequences and 13 with three assessment sequences. As a result, there were a total of 3,648 independent sequences for analysis. Forty-two percent (N = 1,535) of the sample had any recidivism recorded. With overlap, convictions for technical violations were recorded for 30.4% of the sample (N = 1,108), nonviolent criminal offenses were recorded for 11.2% of the sample (N = 407), and non-sexual violent or sexual recidivism was only observed in 7.6% of the sample (N = 279).

The mean RoC*RoI score was 0.51 (SD = 0.24), representing a 51% likelihood of returning to incarceration within 5 years. We calculated averages of each participant’s entire history of DRAOR reassessments and computed sample means of these averages. The mean average score for DRAOR Stable was 5.70 (SD = 2.64), representing the equivalent of almost three out of six stable risk factors rated as definite problem. The mean average score for DRAOR Acute was 4.99 (SD = 2.18), representing two to three definite problem ratings out of seven acute risk factors or five factors rated as possible problem. The mean average score for DRAOR Protect was 6.78 (SD = 2.43), representing the definite or partial presence of three to six strength factors. Although participants’ risk scores showed statistically significant change using each DRAOR subscale (DRAOR Stable: t = 22.52, p < 0.01; DRAOR Acute: t = 31.97, p < 0.01; DRAOR Protect: t = −26.98, p < 0.01), DRAOR Acute scores tended to change more than other subscale scores during the follow up. On average, DRAOR Acute scores decreased by 1.32 (SD = 2.50) between first and final assessments, whereas DRAOR Protect scores increased by 1.04 (SD = 2.32), and DRAOR Stable scores decreased by only 0.89 (SD = 2.38). DRAOR Acute scores also demonstrated more week-to-week fluctuation, with mean net change of 0.38 (SD = 0.50). DRAOR Stable scores had a mean net change of 0.24 (SD = 0.51) and DRAOR Protect scores had a mean net change of 0.23 (SD = 0.51).

Assumption Checks

Mixed modelling indicated that there was statistically significant variation across individual slopes and intercepts. Thus, it was appropriate to proceed with group-based trajectory analysis. A full discussion of the mixed modelling process is presented in Appendix 3.

Model Selection

Joint latent class models fitted on the 60% calibration sample are presented in Table 1. See Appendices 4 and 5 for a discussion of selection of initial values for model estimation and results from DRAOR Stable and Protect subscales.

Table 1 DRAOR acute model selection criteria derived from 60% calibration sample

When modelling trajectories of DRAOR Acute, BIC values continued to decrease as classes were added to the model. This is common for group-based trajectory analyses, especially when sample sizes are large, and has been presented as a criticism of group-based trajectory analyses (Erosheva et al. 2014; Piquero 2008). Relative entropy was highest for the three-group solution, but the conditional independence test statistic was highly statistically significant. The four-group solution did not violate the assumption of conditional independence, and because the mean posterior probabilities fell below 0.70 with the addition of a fifth group, we selected the four-group model.

The solution indicated by the 60% calibration sample was then applied to the remaining 40% of the sample. Visual comparison indicated that intercepts and slopes of the classes identified remained consistent across samples, and class proportions were similar (Appendix Fig. 3). Importantly, groups that represented only a small proportion of participants were reliably estimated across samples, indicating that these were not merely artefacts of the latent group estimation process. The four-group solution for DRAOR Acute was therefore upheld and applied to the full sample. See Appendix 6 for a more complete discussion of this process.

Descriptive Statistics of the Selected Model

The four groups of trajectories of DRAOR Acute scores differed in the direction and degree of change and had different associated hazard for recidivism (see Fig. 1). Note that the JLCM approach explicitly includes survival time in the class structure. Therefore, the relationship between class membership and recidivism is descriptive rather than predictive in this context. The association between class membership and recidivism only becomes predictive when probability of class membership informs the baseline hazard of the survival model for prospective predictions, as described in the Dynamic Prediction subsection of the Methods and applied in the Dynamic Predictions and Comparison of Predictive Ability subsection of the Results.

Fig. 1

Mean group predicted trajectories of DRAOR Acute plotted with (A) and without (B) Monte Carlo confidence intervals, mean group survival curves (C) and group-specific baseline hazard rates (D). Predicted values from lcmm’s Jointlcmm function (see Proust-Lima et al. 2017, p. 22). DRAOR Acute = Dynamic Risk Assessment for Offender Re-entry (Serin 2007), Acute subscale. Scores range from 0 to 14 with higher scores indicating higher risk

The first group was characterized by moderate initial DRAOR Acute scores (equivalent of three to four of seven risk factors rated as definite problem) that decreased over time (moderate decreasing group; about 41% of sample; see Fig. 2 for sample trajectories within groups). The second group of trajectories was characterized by low initial DRAOR Acute scores (equivalent of two to three definite problem ratings) that declined slightly over time (low decreasing group; about 47% of sample). Trajectories in the third group started with moderate/high DRAOR Acute scores (equivalent of four to five definite problem ratings) that declined rapidly (rapid decreasing group; about 9% of sample). Finally, a fourth group of trajectories was characterized by low initial DRAOR scores (equivalent of two to three factors rated as definite problem) that increased over time (increasing group; about 3% of sample).

Fig. 2

Heavily weighted line represents predicted mean trajectories of DRAOR Acute for A Moderate Decreasing (41% of participants), B Low Decreasing (47% of participants), C Rapid Decreasing (9% of participants), and D Increasing (3% of participants). Unweighted lines represent 50 randomly selected individual sample trajectories within each group, jittered to reduce overlap. Jittering increases readability, facilitating illustration of within-group noise, but creates appearance of oscillation, where scores may actually be constant across measurement occasions. DRAOR Acute = Dynamic Risk Assessment for Offender Re-entry (Serin 2007), Acute subscale. Scores range from 0 to 14 with higher scores indicating higher risk

Descriptive statistics for these groups are presented in Table 2. Tests indicated statistically significant group differences in each domain. Participants in the low decreasing group were older than participants in other groups, were assessed in the community longer, and had far lower rates of any recidivism than any of the other groups. They had low initial risk levels as indicated by RoC*RoI and baseline DRAOR ratings, and they improved across subscales while maintaining low mean net change.

Table 2 Descriptive qualities of four DRAOR acute trajectories

The three remaining groups had similar levels of static risk indicated by RoC*RoI scores. The moderate decreasing and rapid decreasing groups started with higher initial DRAOR ratings of risk relative to other groups. Although participants in the rapid decreasing group demonstrated more improvement in risk levels across DRAOR subscales relative to the moderate decreasing group (a decrease equivalent to two to three risk factor ratings decreasing from slight/possible problem to no problem compared to a decrease equivalent to one to two risk factor ratings decreasing from slight/possible problem to no problem), they had higher rates of recidivism and were assessed in the community for shorter periods of time. Of all groups, the rapid decreasing group had the highest rates of criminal recidivism and violent recidivism.

Participants in the increasing group had high recidivism rates, similar to the rapid decreasing group. Although DRAOR scores were moderate at baseline, RoC*RoI scores indicated that these participants were high risk at time of re-entry. This was the only group to demonstrate deterioration (i.e., increasing risk ratings) across DRAOR subscales over time (an increase equivalent to three risk factor ratings increasing from no problem to slight/possible problem). Descriptive statistics of model-selected DRAOR Stable and Protect groups are presented in Appendices 7 and 8.

Dynamic Predictions and Comparison of Predictive Ability

Although the joint latent class models demonstrated good predictive discrimination (AUCs ranging from 0.68 to 0.74), when compared to equivalent joint models with no latent class structure, the joint latent class model selected did not demonstrate any advantages in model calibration or in discriminating between recidivists and non-recidivists. In all three prediction windows, the joint model with no latent class structure outperformed the selected joint latent class model, in both discrimination and calibration, with differences being greatest for the second prediction window (12 through 20 weeks, see Table 3). The predictive ability of selected DRAOR Stable and Protect models is presented in Appendix 9.

Table 3 Predictive discrimination and calibration of DRAOR Acute using selected joint latent class model and equivalent model without latent class structure

Discussion

Previous analyses of dynamic risk across community re-entry in this and other similar samples have indicated that risk levels generally decrease over time (Davies et al. 2021; Lloyd et al. 2020; Stone et al. 2021). The present results confirm this was the case for most participants, but decreases in risk scores were not uniform and were not uniformly associated with recidivism. Although most individuals evidenced more gradual reduction in risk, JLCM revealed a small group of individuals who demonstrated relatively more extreme variability characterized by rapid short-term improvement. However, this group also recidivated at a high rate shortly after re-entry. Other analytic approaches would be unlikely to capture this heterogeneity. Additionally, there was a small but reliably estimated trajectory with relatively rapidly increasing risk scores. Likely membership in this group was also associated with high rates of recidivism.

Should assessors and justice agencies consider trajectories of acute dynamic risk factors when evaluating risk for recidivism? Our results do not support trajectory methods as a means to discriminate between recidivists and non-recidivists in practice. Indeed, stratifying hazard estimates on probable latent group membership resulted in poorer predictive discrimination. Failure to improve discrimination is perhaps unsurprising, as Lloyd et al. (2020) found scores representing demographics and criminal history (i.e., RoC*RoI) drive predictive discrimination in this sample (although DRAOR Acute scores contributed incrementally to prediction). Supervision officers should not use dynamic DRAOR scores as the sole determinant of whether clients are at risk of criminal behavior or breaching supervision conditions. Rather, DRAOR should be used as a case management tool to identify specific areas of risk among clients known to be high risk based on a static risk assessment, so that targeted interventions and referrals can be used to ameliorate these risks.

Why is it important that trajectories of risk varied in this sample, if those differences did not predict recidivism? On one hand, these findings largely substantiate what is already known about risk for recidivism: dynamic risk generally decreases over time, and older participants with lower static risk are less likely to recidivate (see specifically the low decreasing group). However, the results also provide unique evidence supporting theories emphasizing multiple pathways to recidivism. Specifically, whereas there appeared to be only one pathway towards desistance in this sample, the pathways to recidivism varied. As suggested by Zamble and Quinsey (2001), different series of events, internal or external, may lead to criminal behavior. This requires supervision officers to be vigilant about considering varied client-informed, subjective motivations and goals that may arise from multiple situational triggers while avoiding assumptions about singular pathways and pitfalls. This is also consistent with a framework that hypothesized eight unique pathways to sexual criminal behavior based on different patterns of affect leading to the sexual behavior (Hudson et al. 1999). As such, there is no uniform, single risk factor or pattern of re-entry experiences that will signal recidivism for all individuals. In our sample, underlying differences in a collection of risk factors gave rise to three trajectory groups with high rates of recidivism.

Unfortunately, addressing and reducing acute risk factors may be insufficient to reduce recidivism in the absence of other simultaneous changes. From a coping-criminality perspective, for example, a decrease in acute risk factors (e.g., stability through housing and employment) may mean little for long-term desistance if not accompanied by improved coping or problem solving when encountering inevitable stressors. Many desistance theories emphasize that change needs to occur at a deeper intrapersonal level beyond situational adjustments. For example, interactionist theories of desistance posit that engagement in prosocial institutions facilitates desistance when accompanied by concomitant changes in identity and agency (Weaver 2019; see also LeBel et al. 2008). Further, interactionist theories recognize that the transition from criminal behavior to desistance is not linear, but that episodes of crime occur despite gradual progress towards desistance (Bottoms et al. 2004). Maruna (2001) identified “condemnation scripts” in his interviews of individuals whose desistance efforts did not lead to desistance: attempts to “make good” eventually gave way to resignation and beliefs that recidivism was inevitable. Perhaps for participants in the moderate decreasing or rapid decreasing groups, effortful reductions in acute risk (e.g., avoiding substance use, seeking employment, avoiding romantic conflict) eventually proved unsustainable, or behavioral efforts were insufficient in the absence of underlying changes in desistance belief structures (see Lloyd and Serin 2012).

Although a re-entry stress-coping framework (Zamble and Porporino 1988) suggests recidivism can occur when stressful events begin to outstrip available coping resources and legitimate efforts toward decreasing one’s risk factors, we did not expect similar survival curves for the rapid decreasing and increasing groups. This result indicates an important contribution of the joint latent class modelling approach. Namely, JLCM allows observation of patterned differences in how long individuals remain and are assessed in the community. Given existing approaches that either do not control for recidivism when examining patterns of change in assessed risk or assume a uniform relationship between risk and recidivism across participants, it is unsurprising that the pattern observed in the rapid decreasing group has not been previously observed. The group is unusual, not only in their high rates of recidivism despite apparent improvement but in the rapidity with which they recidivated (within the first 2 months of re-entry on average). Data from these participants would typically be removed from analyses using a two time-point design or even a traditional group-based trajectory panel design, as these designs would generally span longer timeframes and do not allow retention of data from censored cases. Conversely, using a hazard model that assumes a uniform relationship between risk and recidivism would absorb the variability contributed by this small group with relatively few assessments into the larger group. JLCM is uniquely able to parcel out heterogeneous information about risk and recidivism in the weeks immediately following release from incarceration. This information may have important implications for how to best contextualize instability and support individuals with multiple situational, behavioral, and affective risk factors early in their re-entry process.

Why would a group that appears to be improving have similar outcomes to a group that appears to be deteriorating? Results are surprising if one examines change in acute risk in isolation but are less surprising when one considers initial acute risk and static risk. Participants with rapid decreasing acute risk had, on average, identical static risk and potentially twice as many or twice the intensity of acute risk factors at the time of re-entry relative to those with increasing acute risk. Participants in the rapid decreasing group returned from incarceration to lives characterized by initial instability, indicated by multiple situational, behavioral, and emotional risk factors. The brief time they spent in the community suggests that, despite some improvement, these participants never quite achieved stability. Instead, their rapidly decreasing risk scores indicate a different form of lifestyle instability as they experienced many simultaneous changes in multiple important life areas.

Still, it is possible that supervision officers did not accurately assess risk for a small proportion of clients characterized by a unique presentation. Supervision officers may have overestimated improvements in the rapid decreasing group, or these clients’ risk may have been most closely tied to risk factors unassessed on DRAOR. This latent trajectory is also consistent with supervision officers using more accurate file information when initially assessing DRAOR, but needing to rely on inaccurate information from interview and collateral information for subsequent assessments, such that clients who are better able or more motivated to obscure their community functioning may appear to improve, despite remaining high risk. This may indicate a weakness in our measure of acute risk factors, as supervision officers may not adequately account for response bias when scoring acute items. Alternatively, the pattern observed in the rapid decreasing group might reflect instability (or even regression to the mean) in a small, high-risk group. It is possible that the high fluctuation in weekly ratings (as evidenced by mean net change) contributed to increased recidivism rates in both rapid decreasing and increasing groups. Even as individuals may be trying their best to “try on” a new identity as a desisted person, fluctuation may represent affective dysregulation, faltering attempts at abstinence from substance use, or unstable relationships (including prosocial relationships). This is consistent with findings that higher variability in acute risk is associated with higher rates of recidivism (Davies 2019; Stone et al. 2021). It also accords with several criminological and desistance theories that suggest circumstances conducive to stability (e.g., consistent engagement in employment, meaningful investment in relationships with prosocial others, ongoing efforts to maintain sobriety, etc.) are also less conducive to criminal behavior (e.g., strain theory, control theory; Bonta and Andrews 2017), or that instability and poor coping together lead to criminal behavior (e.g., coping-criminality hypothesis; Zamble and Porporino 1988).

Both explanations, and the patterns found across groups, have implications for supervision practice. First, the results support that acute dynamic risk factors should be considered incrementally to and in the context of static risk when predicting recidivism (Brown et al. 2009; Serin et al. 2016). Supervision officers should be cautious about discounting an extensive history of serious criminal behavior on the basis of potentially superficial current improvements, especially in the first months of re-entry. Second, results indicate that consistent, incrementally achieved change over time is associated with improved outcomes. Effective supervision should include facilitating overall stability characterized by gradual improvements, as this characterized the single trajectory most closely associated with desistance. For example, coaching clients how to keep a job and problem-solve through stressful temptations to give up (i.e., maintaining existing strengths) may be as important as initially helping a client to find a job. In practice, rather than focusing on patterns of change as a means to identify individuals at high risk, it may be more useful to use steadily decreasing change over sufficient time to identify desisting individuals for whom it would be appropriate to lower supervision intensity.

Limitations

The primary limitations of our study are common to applied research. We were limited to observing only the outcomes recorded by law enforcement and criminal justice agencies and using only the risk factor information provided by corrections staff, potentially colored by staff members’ personal biases or participants’ response bias. These data are important, as they provide the basis for case management decisions; however, these data cannot tell the entire story of desistance or recidivism, and assessors should always understand that there are individual variations masked by aggregate data.

We relied on officially recorded recidivism as an outcome. This limitation is twofold. First, law enforcement does not detect or respond to all criminal behavior equally. Multiple external factors and structural inequality influence patterns of detection and prosecution. Second, a brief timeframe and a binary recidivism outcome cannot adequately describe the process of desistance. Even those who are reincarcerated with new charges may already be making substantial changes towards desistance through reduced frequency or severity. It is also important that most of the participants who recidivated did so with convictions for technical violations. Although these violations merited prosecution and reincarceration in this jurisdiction, results may not generalize to jurisdictions where technical violations do not usually result in parole cancellation.

Another limitation of our data is that supervision officers assessed many participants for only six months, consistent with New Zealand’s policies for mandatory supervision. JLCM is well-suited to right censoring, and tests indicated that results did not differ when we restricted our analyses to only six months of data. Yet our decision to truncate data once assessments became inconsistent led us to censor all cases at 52 weeks, and this limited the scope of this study. However, prior studies in New Zealand indicated that approximately 60% of recidivism events observed over several years occurred within the first year in the community (Nadesu 2007). Although longer timeframes may be additionally informative, high first-year recidivism rates and low assessment rates beyond 52 weeks suggest our results meaningfully describe core risk-recidivism processes.

Our decision to focus on only acute risk factors may be an additional limitation. In particular, as the JLCM design intends covariates to be time-independent, we chose not to include other time-dependent risk dimensions as covariates. DRAOR Stable scores did not predict recidivism incrementally to RoC*RoI and DRAOR Acute in this sample (Lloyd et al. 2020), but it is still possible that variation in group trajectories is related to stable dynamic risk scores. A multi-factorial group trajectory model is beyond the scope of the current study but may be of interest to future researchers. Further, our use of a summary score obscures changes at the item-level, introducing another limitation. Our analyses cannot reveal the extent to which change in risk scores indicated changes in the variety of risk factors or their intensity.

Additional limitations concern model selection. As we constructed our models, we found that relatively minor changes to model specification often resulted in changes in the balance of model selection criteria, such that a different optimal number of groups was indicated. This is consistently cited as a critical problem with group-based trajectory models (D’Unger et al. 1998; Erosheva et al. 2014; Roeder et al. 1999; Sampson et al. 2004). Indeed, as selection criteria rarely converged on a single solution, model selection seemed disconcertingly arbitrary. It is probable that there is no single real or correct number of latent groups to be identified within our sample. This is not an indictment of the statistical concept of latent classes, but rather demonstrates a need for transparency and clarity when presenting results of group-based trajectory models. The groups identified are defined by the trajectories and survival times of the individuals observed, and not vice versa. This echoes Nagin and Tremblay’s (2005) discussion of misconceptions of how group-based trajectory modelling functions, particularly the misconceptions that individuals belong to trajectory groups and that the number of groups is immutable. Testing trajectory estimations across a split sample confirmed that the trajectories were consistently identifiable in our data. Plotting during the selection process (Appendix 9) suggested that trajectories identified in a model with four latent classes were still present when a fifth class was added. However, baseline risk functions based on these latent classes did not improve predictive discrimination or model calibration, compared to risk functions based on a single distribution of random effects. Researchers must understand the latent groups as the identification of patterns of unobserved heterogeneity within our data, a starting point for further examination. They should exercise caution in communicating this, especially with stakeholders less versed in statistical concepts, being careful not to imply that the directional pattern of scores at any one point in time is meaningful in practice for labelling individuals, predicting an individual’s future risk trajectory, or predicting an individual’s subsequent recidivism.

Future Directions

Our study had the advantages of a large sample with frequent reassessment recorded in close proximity to recidivism outcomes. It will be informative to recreate this study in similar corrections samples. The identification of similar trajectory groups in different samples would suggest that these patterns of heterogeneity are common to the re-entry process, rather than unique to this specific cohort.

If trajectories are replicated in new samples, it will then be helpful to identify potential sources of heterogeneity. Within DRAOR subscales, it may be useful to examine if specific risk items are driving differences in trajectories. For instance, in the group identified as increasing in acute dynamic risk, it would be helpful to examine if these increases are due to deterioration in situational factors (e.g., employment, relationship conflict), affective factors (e.g., negative mood, anger) or both. Additionally, it would be informative to examine if within-group changes in overall acute scores are due to changes in the number of risk factors observed, their apparent intensity, or both. Researchers could also test whether certain stable risk factors (e.g., impulsivity, attitudes towards authority) moderate the relationship between acute risk and recidivism. Further heterogeneity might be explained by examining file information beyond RoC*RoI summary scores (e.g., specific dimensions of criminal history) or by including information on the supervision officers conducting assessments (i.e., nesting observations conducted by specific supervision officers). Our data set did not include indicators that identified different assessors.

The question of stability in risk should be examined more closely. Mean net change provides a general measure of how much scores varied from week to week but does not indicate the direction of change or consider how fluctuation varies over time. Methods designed specifically to assess stability, such as P(Δ) (Barnes et al. 2017) or damped oscillator models (Chow et al. 2005), may be informative.
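The limitation of a magnitude-only summary can be illustrated with a minimal sketch. The code below assumes, purely for illustration, that such a measure is computed as the mean absolute difference between consecutive weekly scores; the function name and the three score series are hypothetical and are not drawn from our data or from the DRAOR scoring procedure.

```python
from statistics import mean


def mean_abs_weekly_change(scores):
    """Mean absolute difference between consecutive assessments.

    Assumed definition, used only to illustrate a magnitude-only summary.
    """
    diffs = [abs(later - earlier) for earlier, later in zip(scores, scores[1:])]
    return mean(diffs)


# Hypothetical weekly acute risk scores for three illustrative individuals.
improving = [8, 7, 6, 5, 4]    # steady decrease
worsening = [4, 5, 6, 7, 8]    # steady increase
fluctuating = [6, 7, 6, 7, 6]  # no overall trend

for label, series in [
    ("improving", improving),
    ("worsening", worsening),
    ("fluctuating", fluctuating),
]:
    print(label, mean_abs_weekly_change(series))
```

Under this assumed definition, all three hypothetical series receive the same summary value even though one decreases, one increases, and one oscillates, which is why a measure that also captures direction and timing of fluctuation may be informative.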

Future directions also include expanded application of dynamic predictions and exploration of dynamic predictions from joint models using a shared random-effects approach. Potential applications include comparing the predictive accuracy of different risk models or comparing models’ discriminative ability across different prediction windows. Eventually, individualized dynamic predictions may be a useful tool for identifying individuals for whom reductions in the amount of control imposed by community supervision are appropriate.