Introduction

An important motivation for the imposition of lengthier prison sentences is the belief that such sentences help to prevent crime. On this view, lengthy sentences reduce offending not only through incapacitation effects during imprisonment, but also through specific deterrence effects after release (see Von Hirsch et al., 2009). This belief has been a driving factor in the rising imprisonment rates and the associated criminal justice costs in recent decades, not only in the USA (e.g., Henrichson & Delaney, 2012; Latessa et al., 2020), but also in multiple Western European countries, such as Great Britain, Belgium, and Portugal (see Dünkel, 2017; Tak, 2008). This, in turn, resulted in what is referred to as the “prison paradox”: while the inflow of new entries into prisons has steadily decreased, imprisonment levels have continued to rise due to harsher sentencing (Aebi et al., 2015). Prior studies, however, suggest that the average prison spell only exerts negligible incapacitation effects on crime, and have thus far produced conflicting evidence on the effects of length of stay on recidivism (see Owens, 2009; Sweeten & Apel, 2007; Wermink et al., 2013). Simultaneously, longer prison spells are shown to have a more disruptive impact on multiple life domains, such as the development of a conventional career, bonds to conventional peers, and a stable family and housing situation (e.g., Ramakers et al., 2014). The detrimental impact of incarceration on an offender’s life prospects, substantial costs of imprisonment,Footnote 1 and high recidivism rates upon releaseFootnote 2 make the question of whether imprisonment length deters or stimulates reoffending a key public policy concern.

While a vast body of research has investigated the effects of imprisonment on recidivism as compared to alternative sanctions (see Nagin et al., 2009; Villettaz et al., 2015), comparatively little attention has gone to imprisonment length. Furthermore, the scarce existing research on length of stay has produced mixed findings, potentially due to the inability of older study designs to account for the considerable endogeneity in the relationship between the length of imprisonment and offending behavior (see Nagin et al., 2009). This endogeneity is in part an inherent consequence of the judicial decision-making process, as case and offender characteristics related to the probability to recidivate—such as criminal history (see Roberts, 1997; Van Wingerden et al., 2011)—are generally considered in sentencing decisions, with offenders with higher recidivism risk receiving lengthier prison sentences. Few methodologically rigorous studies have attempted to account for this endogeneity, and therefore there remains a remarkable lack of knowledge about the effects of the length of imprisonment on recidivism.

In this article, we address this gap using advanced quasi-experimental techniques to examine the effects of imprisonment length within the criminal justice system in the Netherlands. In doing so, we address two critical shortcomings in prior research.

The first concerns the overreliance on regression and matching techniques to examine imprisonment length and reoffending.Footnote 3 An inherent flaw of these techniques is that they rely on the availability of comprehensive data to control for endogeneity in the imprisonment length-recidivism relationship. As perfect data coverage on all external influences on these outcomes is not achievable, such approaches will always remain sensitive to unobserved heterogeneity. In light of this shortcoming, criminal justice scholars have more recently started to implement instrumental variable (IV) designs (see Bushway & Apel, 2010). In contrast to matching techniques, IV models are able to control for both observed and unobserved heterogeneity, by statistically isolating exogenous treatment variation from treatment variation that originates from differences in individual characteristics. In doing so, IV models enable us to completely distinguish treatment effects from selection effects. Despite this considerable methodological advantage, applications of IV designs to investigate the causal link between imprisonment length and recidivism remain scarce.

The second relates to the relative paucity of quasi-experimental research in criminal justice systems other than the USA. While the use of more advanced econometric techniques such as IV estimation has gained ground in criminological research, the application of such designs to investigate the causal effects of imprisonment length on recidivism remains primarily limited to the US context. Compared to most other Western countries, the US criminal justice system on average imposes much longer prison sentences (e.g., 3.8 months for Dutch prisoners versus 2.6 and 4.5 years for US state and federal prisoners, see Aebi & Tiago, 2020; Kaeble, 2018; Motivans, 2020), which has resulted in a body of research that primarily investigates exceptionally large changes in imprisonment length. As prior evidence suggests that the effect of imprisonment on recidivism varies across imprisonment length (see Meade et al., 2013), this may compromise the generalizability of earlier findings to shorter sentencing practices. Consequently, many questions remain as to the effects on recidivism of changes in imprisonment length around average prison spells in more lenient criminal justice systems, but also shorter custodial terms in the USA.Footnote 4

This study investigates the effects of imprisonment length on recidivism by exploiting exogenous variation from random judge assignment through an instrumental variable approach. Facilitated by longitudinal individual-level data on all offenders sentenced to a prison term of at most 1 year by a single-sitting judge in 2012 in the Netherlands (N = 5,092), this approach allows us to exploit differences in the proclivity of judges to impose lengthier prison sentences. As such, we assess the effect of length of stay in prison while holding constant other factors that may affect recidivism (e.g., offense and offender characteristics). The available data also enable us to investigate heterogeneous effects across crime categories (property, violent, and other crimes) and offender characteristics (first-time prisoners, repeat prisoners, young adults, and adults). This study thus contributes to the literature by being the first to examine the effects of imprisonment length on recidivism through an instrumental variable design in the European context. By using longitudinal administrative data from the Netherlands, this approach enables us to investigate the generalizability of prior findings to sentencing practices more common in Western European and Nordic countries, as well as US sentencing practices for lesser offenses.

Theory

The aim of this study is to address the question as to what extent imprisonment length affects recidivism after release. Deriving expectations on the direction of such an effect is challenging, however, as scholars have posited divergent theories on imprisonment having either a preventative or a criminogenic effect.

On the one hand, a sizable body of theoretical literature predicts the imposition of lengthier prison sentences to reduce reoffending after release. Most notable among these preventative perspectives is deterrence theory (Von Hirsch et al., 2009). Related to economic rational choice theory (Becker, 1968), this perspective is based on the notion that individuals decide to commit a crime by rationally weighing the perceived costs and benefits of such actions. From a deterrence perspective, these costs are dependent upon an individual’s estimates of the certainty, severity, and celerity of punishment (Nagin et al., 2009). As such, the imposition of lengthier prison sentences should exert greater deterrent effects, by increasing the perceived costs of reoffending. Furthermore, various other preventative theories posit imprisonment to be an effective method of reducing reoffending after release by rehabilitating offenders (e.g., Andrews & Bonta, 2010; Gendreau et al., 2008; MacKenzie, 2006).

On the other hand, many criminologists argue that increasing imprisonment length may stimulate recidivism after release. Such criminogenic effects are postulated to arise through various theoretical mechanisms. For instance, one school of thought focuses on imprisonment potentially cutting off opportunities to establish a conventional career and a law-abiding lifestyle, such as education, employment, and marriage (Sampson & Laub, 1993). Longer prison spells cut off such opportunities for longer periods of time, effectively further reducing opportunities to accumulate human capital and barring individuals from establishing stabilizing conditions outside of prison. The notion that lengthier prison sentences may stimulate further development of a criminal career can also be derived from labeling theory (Becker, 1963). A longer prison spell also implies that an individual is being treated as a criminal by others for a longer period of time. From a labeling perspective, this may result in a greater internalization of a criminal identity and subsequently more behavior conforming to this identity (i.e., recidivism). Lengthier prison sentences may also be more difficult to hide from one’s social circle, which may cause further labeling by others outside of prison. Furthermore, various criminological theories postulate the adverse effects of longer exposure to criminogenic social environments in prison. From a differential association perspective, increased lengths of stay may facilitate further transfer of criminal knowledge and skills among prisoners (Letkemann, 1973; Sutherland, 1947), whereas social learning theory emphasizes the transfer of deviant and antisocial norms (Akers, 1997; Clemmer, 1940; Sykes, 1958). Overall, lengthier imprisonment may cause further detachment from conventional social circles and further embeddedness in criminal networks (see Volker et al., 2016).

In summary, criminological theories support both hypotheses on preventative and criminogenic effects of longer prison sentences. The sizable body of theoretical literature underlying these antithetical hypotheses emphasizes the importance of investigating the causal relationship between imprisonment length and recidivism.

Prior research

Prior research on the specific deterrence effects of the length of stay in prison has long been hampered by the empirical challenges posed by the endogenous relationship between sentence length and recidivism (see Nagin et al., 2009). An important underlying cause of this endogeneity is the consideration of recidivism risk in sentencing decisions, with individuals more likely to reoffend receiving lengthier prison sentences. For example, a criminal record is generally considered to be an aggravating factor by judges and in sentencing guidelines (see Roberts, 1997; Van Wingerden et al., 2011). As prior criminal behavior is also a strong predictor of future criminal behavior (e.g., Farrington, 1992; Gendreau et al., 1996; Nagin & Paternoster, 2000), this may cause selection bias in estimates of the effects of imprisonment length on recidivism. As the methodology of the first generation of research on this topic is insufficiently rigorous to account for such endogeneity (see Nagin et al., 2009), we do not include these studies in this literature overview.

More recently, a second generation of studies has arisen that uses matching techniques to account for confounding factors in the imprisonment length-recidivism relationship. By matching individuals on observed characteristics, these papers attempt to construct near-identical counterfactual groups. Overall, these second-generation investigations have produced somewhat mixed results. Early attempts at matching suggest that longer sentences produce either null effects (Kraus, 1981) or increased recidivism after release (Jaman et al., 1972). Since then, notable advancements in matching techniques, such as the introduction of propensity score matching, have furthered the ability to control for selection bias (see Rosenbaum & Rubin, 1984; Shadish, 2013). Evidence produced by these novel approaches to matching primarily suggests that imprisonment length has neither preventative nor criminogenic effects (Loughran et al., 2009; Snodgrass et al., 2011; Wermink et al., 2018). Notably, both Snodgrass et al. (2011) and Wermink et al. (2018) find null effects on recidivism by investigating the dose–response relationship in Dutch prison samples serving short average prison terms comparable to those examined in the current study (6.7 and 4.1 months, respectively). However, there is also evidence of non-zero marginal effects, as Meade et al. (2013) find only the longest included prison spells (5+ years) to reduce recidivism after release. One other matching study also reports lengthier imprisonment to reduce reincarceration among a sample serving comparatively long prison terms, with a median sentence length of 5 years (Rydberg & Clark, 2016). Yet, this study also finds increased parole revocations, and considerable heterogeneity in recidivism across offense categories and offender characteristics, leading the authors to conclude that there is insufficient support for null, preventative, or criminogenic effects.

While matching techniques have enabled observational studies to account for endogeneity to a degree that was not previously possible, their ability to do so is limited to the extent that this endogeneity is related to observables. As such, matching estimates remain sensitive to bias originating from unobserved heterogeneity (see Shadish, 2013). Addressing this vulnerability, a third generation of research can be distinguished that employs econometric models such as regression discontinuity and instrumental variable designs to capitalize on exogenous variation in sentencing originating from differences in criminal law, sentencing guidelines, and judicial proclivities.

The first of these third-generation investigations by Drago et al. (2009) capitalizes on the Italian 2006 Collective Clemency Bill, which granted the release of prisoners with remaining lengths of stay of up to 3 years. This clemency came with the condition that the remainder of an individual’s prison spell would be reinstated if another offense was committed within 5 years after release. By investigating this commutation, Drago et al. find that a 1-month increase in remaining sentence length reduces recidivism by 0.16 percentage points (or 1.3%). Another related study exploits sharp discontinuities in parole guidelines in the US state of Georgia (Kuziemko, 2013). By employing an instrumental variable design, Kuziemko finds a 1-month increase in imprisonment length to reduce the 3-year recidivism rate by approximately 1.3 percentage points. However, Roodman (2017) reanalyzed the data used for this study and found either statistically nonsignificant estimates or significant increases in recidivism by investigating other parole guideline discontinuities. Another study in this tradition by Rhodes et al. (2018) exploits discontinuities in US sentencing guidelines using both a regression discontinuity and an instrumental variable design. Through these identification strategies, they find a 7.5-month increase in imprisonment length to reduce the 3-year recidivism rate after release by 1 percentage point (from 20 to 19%). Offense seriousness and criminal history do not moderate the effect, which Rhodes et al. also find to be homogeneous across educational attainment, race, and sex. Contrasting the findings from prior third-generation research, the most recent study does not find evidence for preventative effects of lengthier prison terms (Al Weswasi et al., 2022). By exploiting three Swedish reforms that changed the required time served before being considered eligible for parole, they do not find statistically significant effects on recidivism, irrespective of operationalization (including prevalence, incidence, and reconvictions). The authors consider a potential explanation for their findings to be the limited changes in imprisonment length under investigation.

Finally, studies that exploit exogenous variation in judge sentencing preferences in an instrumental variable design are few and far between. Thus far, the application of this approach has mainly been limited to research that compares custodial to non-custodial sentences (Bhuller et al., 2020; Di Tella & Schargrodsky, 2013; Green & Winik, 2010; Loeffler, 2013). The evidence produced by this approach varies notably across different institutional contexts, supporting preventative effects of imprisonment on recidivism in Norway (Bhuller et al., 2020), criminogenic effects in Argentina (Di Tella & Schargrodsky, 2013), and null effects in the USA (Green & Winik, 2010; Loeffler, 2013). To our knowledge, only one prior study employs a judge stringency IV design to investigate the effects of imprisonment length on recidivism. Roach and Schanzenbach (2015) exploit the random assignment of more than 8,000 offenders to sentencing judges after pleading guilty in the US district of Seattle, WA (with a median imposed sentence length of 4 months). By investigating recidivism rates up to 3 years after release, they find a 1-month increase in imprisonment length to reduce recidivism by around 1 percentage point. They also find the reduction in recidivism to be concentrated in the first year after release. However, a review by Roodman (2017) of this study did raise concerns about a potential weak instrument problem, non-random assignment to judges, and bias from parole supervision length being correlated to imprisonment length.

Despite the notable methodological advancements to account for the considerable endogeneity in the imprisonment length-recidivism relationship, evidence remains mixed and comparatively scarce. While matching studies predominately find null effects, most of the third-generation research presents evidence of preventative effects of longer imprisonment. Critical review has also brought to light several vulnerabilities of the limited extant evidence (see Roodman, 2017), which has thus far mostly focused on the USA.

The Dutch criminal justice context

The Netherlands has witnessed vast changes in incarceration rates in recent decades. For a country traditionally seen as having low imprisonment levels, the 1980s marked the onset of a trend described as “the end of tolerance” (Tak, 2008). By 2006, the Dutch imprisonment rate had quadrupled, which placed it among the highest in Western Europe. After this peak, the incarceration rate almost halved over the span of 10 years (−48%). This recent trend of decarceration coincided with decreasing crime rates, as well as with alternatives to custodial sanctions gaining ground in sentencing practices.

Similar to other Western European and Nordic countries, the present-day Dutch criminal justice system is considered to be relatively lenient. In 2018, approximately 30,854 adults entered a detention facility in the Netherlands (Aebi & Tiago, 2020). With an average of 54.2 prisoners per 100,000 inhabitants, the Dutch adult incarceration rate is comparable to countries such as Iceland (46.8), Finland (51.1), Sweden (56.5), Denmark (63.2), and Norway (65.4). These rates are substantially lower than that of the USA, which leads the world with an adult incarceration rate of 556 per 100,000 (Carson, 2020). The comparative leniency of the Dutch criminal justice system is further reflected in the relatively short average imprisonment length of 3.8 months (Aebi & Tiago, 2020). All other Western European and Nordic countries also have average lengths of stay of less than 1 year, which contrasts starkly with the USA, where the average sentence lengths are 2.6 years and 4.5 years for state and federal prisoners respectively (Kaeble, 2018; Motivans, 2020).Footnote 5 Whereas most prior studies investigate exceptionally large changes in time served in US correctional institutions, the Dutch context allows for investigation of sentencing practices more representative of most other developed nations, as well as US pre-trial detention and sentencing practices for less severe offenses.

In line with most European countries, the Netherlands does not have a jury-based criminal justice system. All criminal courts in the Netherlands adhere to uniform criminal procedure and national criminal law, and cases that are brought to trial are always tried by professional judges. The submission of a case to a court of first instance is dependent upon the severity of the offenses in question. Law enforcement agencies first send crime reports to the public prosecutor’s office, which holds the legal authority to dispose of lesser criminal cases through various non-custodial sanctions (e.g., monetary fines and community service). More severe criminal cases are submitted to criminal court, where the presiding judges hold the sole authority to assign prison sentences. The total influx to the Dutch public prosecutor’s office was around 189,000 criminal cases in 2019, of which approximately 92,900 were sent to criminal court (Meijer et al., 2020). Of this total caseload, judges ruled for (partially) unconditional sanctions for 69,600 cases, of which 25,100 concerned prison sentences (36%).

After submission to a criminal court of first instance, the criminal cases are randomly distributed among available judges. Depending on legal characteristics, all cases are assigned to either a single or a multiple criminal chamber. The lion’s share of cases are tried by a single professional judge (82%), whereas those of notable severity or complexity are presided over by a chamber of three judges (Meijer et al., 2020). Cases that are tried by single-sitting judges typically range from minor property and drug offenses to more severe violent crimes, such as assault and robbery. The maximum prison sentence length imposable by a single-sitting judge is 1 year (Art. 369 Code of Criminal Procedure). These sentences have to be served in full, as under Dutch criminal law the possibility for parole is limited to individuals serving more than a year (Art. 15 Penal Code). Single-sitting judges possess broad discretionary powers concerning both the type and severity of sentences, as presumptive sentencing guidelines are absent. The Dutch criminal code does not specify minimum sentences for specific crimes. With respect to imprisonment, single-sitting judges may impose sentences anywhere between 1 day and 1 year in each case brought before them.

Exploiting variation in the proclivity to impose lengthier sentences across single-sitting judges in the Netherlands as an identification strategy allows this study to avoid potential sources of bias inherent to other criminal justice systems (see Roodman, 2017). First, the absence of legal possibilities for parole allows for the observation of post-release behavior unaffected by parole supervision. Second, the random assignment of cases to judges avoids bias from non-random variation in sentencing, which is further confirmed by the randomization checks presented in the section “Empirical methodology”. Finally, the absence of sentencing guidelines avoids restriction of sentencing variation. Combined, these characteristics make the Dutch criminal justice context uniquely suited to investigate the imprisonment length-recidivism relationship.

Data

To answer our research questions, we use longitudinal individual-level data on 5,092 adult offenders sentenced to imprisonment by a single-sitting judge in 2012 in the Netherlands.Footnote 6 These data were provided to us by the Dutch Ministry of Justice and Security’s Research and Documentation Centre (WODC), and cover a 5-year observation window after conviction and release from prison. In addition to recidivism measurements, this dataset includes information concerning offender characteristics (age, sex, country of birth, and criminal history) and case characteristics (imposed sentence, type and number of offenses, and maximum penalty). Data on judge identifiers was provided to us by the Public Prosecutor’s Office.

As outcome variables, we separately run analyses for measures of both recidivism prevalence and recidivism incidence at 1, 3, and 5 years after release from prison.Footnote 7 Analyzing both prevalence and incidence measures of recidivism allows us to differentiate between criminal involvement at the extensive margin (recidivism status) and the intensive margin (number of offenses). We classify recidivism as the commission of 1 or more registered criminal offenses after release from prison. To account for time spent incarcerated during the follow-up period, we multiply the observed number of registered crimes by the inverse of the proportion of the follow-up period that individuals were free to commit crimes. This implies that we assume that individuals would have committed crimes at the same rate had they been on the street instead of incarcerated. For instance, if an individual commits 1 crime in a 1-year follow-up period but was incarcerated for 6 months during this follow-up period, we count 2 crimes (1/((360 − 180)/360)). Because we are faced with skewed incidence outcomes, we winsorize the recidivism incidence at the 99th percentile.Footnote 8 In line with related prior research (e.g., Al Weswasi et al., 2022; Bhuller et al., 2020; Wermink et al., 2023), we investigate recidivism incidence among the full sample (irrespective of recidivism) to avoid selection bias.Footnote 9
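The exposure adjustment and winsorization described above can be illustrated with a minimal Python sketch. The function names, the nearest-rank percentile rule, and the toy list of counts are illustrative assumptions and do not reproduce the procedure used to construct the analysis data.

```python
import math


def exposure_adjusted_count(n_crimes, days_followup, days_incarcerated):
    """Scale the observed offense count by the inverse of the share of
    the follow-up period during which the person was free."""
    days_free = days_followup - days_incarcerated
    if days_free <= 0:
        raise ValueError("no time at risk in the follow-up period")
    return n_crimes * days_followup / days_free


def winsorize_upper(values, pct=99.0):
    """Cap values above the pct-th percentile at that percentile.

    Assumption: nearest-rank percentile; other definitions interpolate.
    """
    ranked = sorted(values)
    rank = max(1, math.ceil(pct * len(ranked) / 100))
    cap = ranked[rank - 1]
    return [min(v, cap) for v in values]


# Worked example from the text: 1 offense, 360-day follow-up, 180 days inside.
print(exposure_adjusted_count(1, 360, 180))  # 2.0

# Hypothetical counts with one extreme value, capped at the 99th percentile.
counts = list(range(1, 100)) + [1000]
print(winsorize_upper(counts)[-1])  # 99
```

The adjustment rests on the constant-rate assumption stated in the text, so it inflates counts most for individuals who spent much of the follow-up period incarcerated, which is one reason the upper tail is capped.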

The independent variable of interest of this study concerns imprisonment length, which is measured by the number of days spent in incarceration. Figure 1 presents the distribution of the length of stay in the sample under consideration. We find our data to encompass the full spectrum of sentence lengths imposable by a single-sitting judge in the Netherlands, but also that the vast majority of offenders in our sample have served up to 6 months. To investigate whether the comparatively low number of prison spells longer than 6 months (N = 136) affects our estimates, we additionally performed our analyses over the selection of offenders who have served a maximum of 6 months. Furthermore, as very short sentences may differ from longer sentences in their effect on recidivism after release (e.g., Becker, 1963), we have also performed analyses excluding individuals sentenced to a maximum of 2 weeks. As these sensitivity analyses did not substantively change our estimates, their results are not reported here.

Fig. 1 Distribution of imprisonment length across criminal cases

To identify the causal effects of length of stay on recidivism, we instrument imprisonment length with judge stringency. To this end, all cases in our sample are linked to 250 unique judge identifiers. In addition to the random assignment of judges, our identification strategy relies on the existence of sufficient inter-judge variation in sentencing. Figure 2 presents a graphical representation of the first stage of our baseline model.Footnote 10 In line with the included prison spells, we find a distribution that is skewed towards the lower end. Nevertheless, punitivity varies considerably across judges, with average imposed prison spells ranging from close to 0 days to around 6 months.

Fig. 2 Average sentence length across judges ranked by punitivity

Table 1 gives an overview of the most relevant characteristics of the full sample. The shown recidivism outcomes are conditional on the ability to observe an individual at the respective follow-up period. The average imprisonment length in the full sample equals 50.63 days. The recidivism prevalence is 35%, 50%, and 56% for the 1-year, 3-year, and 5-year follow-up periods respectively. These rates are very much in line with the 2-year recidivism rate of 47% for Dutch ex-prisoners in general, as reported by the Research and Documentation Centre (WODC) of the Ministry of Justice and Security (Weijters et al., 2019). Both in terms of recidivism prevalence and incidence, we find most of the reoffending to occur within the first year after release. As there are no parole supervision provisions for individuals sentenced to less than 1 year, these observations likely reflect higher criminal activity in the first year after release, as opposed to a higher detection rate.

Table 1 Descriptive statistics

Empirical methodology

Estimating the effect of imprisonment length on recidivism after release is empirically challenging due to omitted variables affecting both the probability to receive lengthier sentences and the probability to recidivate. This considerable endogeneity is in part an inherent consequence of the judicial decision-making process, which takes recidivism risk into account in sentencing decisions (see Roberts, 1997; Van Wingerden et al., 2011). To address this endogeneity problem, we employ a two-stage least squares (2SLS) instrumental variable estimator for our baseline model.

By instrumenting imprisonment length with a judge stringency measure, we statistically isolate the variation in imprisonment length that originates from differences in judge stringency from the variation in imprisonment length caused by differences in offender and case characteristics. In cases of variable treatment intensity (such as imprisonment length), a 2SLS IV model estimates the weighted average causal change in the dependent variable from a one-unit increase in treatment among individuals whose treatment status is affected by the instrument (see Imbens & Angrist, 1994). As such, the local average treatment effects (LATEs) estimated in this study capture the average treatment effects on recidivism from an additional day spent in prison, weighted by the number of individuals whose sentence length changes at the respective instrument values (i.e., treatment compliers).
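To make the logic of this isolation concrete, the following minimal sketch simulates a stylized setting in which unobserved recidivism risk raises both sentence length and the outcome, and compares a naive regression with a hand-computed two-stage estimate. All names, coefficients, and the data-generating process are hypothetical; the sketch omits the covariates and district court fixed effects of the baseline model and uses a continuous outcome for simplicity.

```python
import random


def fit_line(x, y):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(0)
n = 20_000
true_effect = -0.01  # assumed causal effect of one extra day (hypothetical)

# Unobserved risk drives both sentence length and the outcome (confounding).
risk = [rng.gauss(0, 1) for _ in range(n)]
# Instrument: judge stringency, unrelated to risk by construction.
stringency = [rng.gauss(0, 1) for _ in range(n)]
days = [100 + 10 * z + 15 * u + rng.gauss(0, 5)
        for z, u in zip(stringency, risk)]
outcome = [true_effect * d + 0.5 * u + rng.gauss(0, 0.5)
           for d, u in zip(days, risk)]

# Naive regression: contaminated by the unobserved risk.
_, naive_slope = fit_line(days, outcome)

# Stage 1: predict days from the instrument only.
a0, a1 = fit_line(stringency, days)
days_hat = [a0 + a1 * z for z in stringency]
# Stage 2: regress the outcome on the predicted days.
_, iv_slope = fit_line(days_hat, outcome)

print("naive slope:", naive_slope)
print("2SLS slope: ", iv_slope)
```

In this simulated setting the naive slope is pushed upward by the confounder, while the two-stage slope should lie close to the assumed effect. Running the second stage by hand gives the right point estimate but not valid standard errors, so applied work relies on a dedicated 2SLS routine.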

A well-known challenge for 2SLS IV models is the potential for small-sample bias towards naïve ordinary least squares estimates when using many weak instruments (see Chao & Swanson, 2005). We therefore specify our baseline model using a single leave-out mean judge stringency instrument (also see Bhuller et al., 2020). This judge stringency measure is computed as a leave-out mean, which omits case \(i\) in the computation of the relative average sentence length that the judge in question has imposed within our observation window.

The baseline model is specified as follows:

$$y_{it}=\beta_{0}+\beta_{1}I_{i}+\beta_{2}X^{\prime}_{it}+\beta_{3}C^{\prime}_{i}+\beta_{4}D^{\prime}_{d}+\upsilon_{it} \tag{1}$$
$$I_{i}=\gamma_{0}+\gamma_{1}Z_{j(i)}+\gamma_{2}X^{\prime}_{it}+\gamma_{3}C^{\prime}_{i}+\gamma_{4}D^{\prime}_{d}+\varepsilon_{it} \tag{2}$$

where \({y}_{it}\) in Eq. (1) is a dichotomous or count variable that respectively captures recidivism prevalence or incidence at time \(t\) for individual \(i\), \({I}_{i}\) captures imprisonment length in days, \({X}_{it}\) is a vector of individual characteristics including native-born Dutch citizen status, sex, and linear, quadratic, and cubic terms for age, \({C}_{i}\) is a vector of case characteristics including offense type and severity, number of offenses, and various criminal history measures, \({D}_{d}\) indicates district court fixed effects, and \({\upsilon }_{it}\) is the error term. As we expect imprisonment length to be endogenous, we instrument imprisonment length with the judge stringency instrument, as shown in Eq. (2). In this equation, \({Z}_{j(i)}\) indicates the judge stringency leave-out mean and \({\varepsilon }_{it}\) is the error term. The error terms \(\upsilon\) and \(\varepsilon\) are i.i.d. \(\sim N(0,\sigma)\) and are allowed to be correlated. We are interested in the coefficient \({\beta }_{1}\), which captures the effect of imprisonment length on recidivism at 1, 3, or 5 years after release. The judge stringency leave-out mean is given by:

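$${Z}_{j(i)}=\frac{1}{{n}_{j(i)}-1}\sum_{k\ne i,\;j(k)=j(i)}\left({X}_{k}-\overline{X}\right)$$

where \({Z}_{j(i)}\) captures the average relative stringency of judge \(j\), determined by the average sentence length imposed by that judge for offenders other than individual \(i\), relative to the overall average sentence length \(\overline{X}\). Here \({n}_{j(i)}\) is the number of cases decided by the judge assigned to case \(i\), and \({X}_{k}\) denotes the sentence length imposed in case \(k\) (not to be confused with the covariate vector \({X}_{it}\)).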
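The leave-out construction can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `leave_out_stringency`, the toy data, and the choice to compute the overall mean over all cases (including case \(i\)) are assumptions made for illustration only.

```python
from collections import defaultdict

def leave_out_stringency(cases):
    """Leave-out mean judge stringency for each case.

    cases: list of (judge_id, sentence_days) tuples (hypothetical layout).
    Returns one value per case: the judge's mean sentence over all OTHER
    cases, minus the overall mean sentence length.
    """
    # Assumption: the overall mean is taken over the full sample.
    overall_mean = sum(days for _, days in cases) / len(cases)

    totals = defaultdict(float)
    counts = defaultdict(int)
    for judge, days in cases:
        totals[judge] += days
        counts[judge] += 1

    stringency = []
    for judge, days in cases:
        n = counts[judge]
        if n < 2:
            # The leave-out mean is undefined for a judge with a single case.
            stringency.append(None)
            continue
        leave_out_mean = (totals[judge] - days) / (n - 1)
        stringency.append(leave_out_mean - overall_mean)
    return stringency

# Toy data: judge A decides three cases, judge B decides two.
cases = [("A", 30), ("A", 60), ("A", 90), ("B", 10), ("B", 20)]
z = leave_out_stringency(cases)
print(z)  # [33.0, 18.0, 3.0, -22.0, -32.0]
```

Because a case's own sentence is excluded, its instrument value reflects only how the judge sentenced other offenders, which removes the mechanical correlation between a case's own sentence and its instrument.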

To investigate the functional form of the imprisonment length-recidivism relationship, we present exploratory graphs on the evolution of the recidivism rate over time served. Figure 3 presents local polynomial smooth plots of recidivism prevalence at the 1-, 3-, and 5-year mark after release from prison, overlaid by linear and quadratic fitted lines.

Fig. 3 Recidivism prevalence over days imprisoned by time since release

Overall, we do not find evidence of notable nonlinearity in the relationship between length-of-stay and recidivism across the included measurement periods. For recidivism prevalence within 1 year after release (Fig. 3), we find the fitted lines to diverge slightly towards the upper bound for imprisonment length. However, this does not hold true for recidivism prevalence at 3 and 5 years after release. Furthermore, a comparison of the standard errors across functional forms suggests that a quadratic model specification increases the risk of overfitting of the model without increasing performance. As such, we do not consider a higher-order baseline model specification to be warranted. Yet, to investigate whether linearization affects our estimation results, we also compare our baseline estimates to those obtained from a quadratic model specification.

Our identification strategy relies on the random assignment of cases to judges. While the assignment process described by the criminal procedures is explicitly random within district courts, we test whether this holds true by regressing all observed covariates on the full set of judge identifier dummies and district court fixed effects. These randomization tests do not reject the random assignment of offenders by sex, native status, age, or any of the observed case and criminal history measures (see Appendix Table 9). As such, the results suggest that the assignment procedure is indeed random.

While the instrumental variable approach allows us to account for endogeneity and exploit the full potential of the available data, it also rests on two main model assumptions: instrument exogeneity and instrument relevance. First, the assumption of instrument exogeneity (i.e., the exclusion restriction) states that the instrument must only affect the second-stage outcome through the instrumented variable (imprisonment length) and that the instrument may not be correlated with the second-stage errors. We consider it reasonable to assume that this holds, as there are no conceivable mechanisms through which the random assignment of a judge may affect recidivism after release other than through the imposed sentence. Second, the assumption of instrument relevance pertains to the requirement for the instrument to cause sufficient variation in the first-stage outcome variable.
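The following Python sketch shows on simulated data why these two assumptions matter. It is not an analysis of the study's data: all variable names and parameter values are hypothetical, covariates and fixed effects are omitted, and the simulated instrument is exogenous and relevant by construction. With a single instrument and no covariates, the 2SLS estimate reduces to the ratio of the instrument-outcome covariance to the instrument-treatment covariance.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance of two equally long sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def ols_slope(x, y):
    """Naive OLS slope of y on x."""
    return cov(x, y) / cov(x, x)

def iv_slope(z, x, y):
    """Single-instrument IV (2SLS) slope: reduced form over first stage."""
    return cov(z, y) / cov(z, x)

random.seed(0)
true_effect = -0.5  # hypothetical effect of sentence length on recidivism
stringency, length, recidivism = [], [], []
for _ in range(20000):
    risk = random.gauss(0, 1)  # unobserved recidivism risk (confounder)
    z_i = random.gauss(0, 1)   # judge stringency, independent of risk
    # Higher-risk offenders receive longer sentences: source of endogeneity.
    l_i = z_i + risk + random.gauss(0, 1)
    y_i = true_effect * l_i + 2.0 * risk + random.gauss(0, 1)
    stringency.append(z_i)
    length.append(l_i)
    recidivism.append(y_i)

ols_est = ols_slope(length, recidivism)
iv_est = iv_slope(stringency, length, recidivism)
print(f"true effect: {true_effect}, OLS: {ols_est:.3f}, IV: {iv_est:.3f}")
```

In this simulation the naive OLS slope is pushed upward, and even takes the wrong sign, because unobserved risk raises both sentence length and recidivism. The IV estimate stays close to the true value because it uses only the variation in sentence length induced by the judge.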

To test whether the employed instruments are sufficiently relevant, we perform F-tests over the first stages of our IV models. Prior research has shown that F-statistics of 10 and above are indicative of strong instruments, whereas values below 5 will cause 2SLS models to produce unreliable estimates over a finite sample (see Staiger & Stock, 1997). For our baseline 2SLS model, we find the first-stage F-statistic to be very high with a value of 1563.47. As such a value is indicative of a very strong instrument, we find the assumption of instrument relevance to hold for our baseline model specification.
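The logic of the first-stage F-test can be sketched for the simplest case of one instrument and no covariates, where the F-statistic equals the squared t-statistic of the instrument coefficient. The function name and toy numbers below are hypothetical, and the sketch assumes homoskedastic errors; the statistic reported above comes from the full first stage with covariates and fixed effects, which this sketch does not reproduce.

```python
def first_stage_f(z, x):
    """F-statistic for H0: gamma_1 = 0 in the regression x = gamma_0 + gamma_1 * z + e.

    Assumes a single instrument, no covariates, and homoskedastic errors.
    """
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))

    gamma1 = s_zx / s_zz               # first-stage slope
    gamma0 = x_bar - gamma1 * z_bar    # first-stage intercept
    ssr = sum((xi - gamma0 - gamma1 * zi) ** 2 for zi, xi in zip(z, x))
    sigma2 = ssr / (n - 2)             # residual variance
    var_gamma1 = sigma2 / s_zz         # squared standard error of the slope
    return gamma1 ** 2 / var_gamma1    # F = t^2 with one restriction

# Toy data: instrument values and the endogenous regressor.
z = [-2, -1, 0, 1, 2]
x = [1, 3, 2, 5, 4]
f_stat = first_stage_f(z, x)
print(f"first-stage F = {f_stat:.2f}")
print("passes the rule-of-thumb threshold of 10:", f_stat >= 10)
```

In this toy example the F-statistic is about 5.33, which falls below the threshold of 10 and would therefore signal a potentially weak instrument.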

Results

Baseline estimation results

Table 2 presents the baseline two-stage least squares (2SLS) instrumental variable estimation results for the full sample. To investigate the effects of imprisonment length on recidivism at the extensive as well as the intensive margin, we present estimates for both recidivism prevalence and recidivism incidence.

Table 2 Baseline two-stage least squares recidivism estimates by time since release

Starting with the estimates for recidivism prevalence, the coefficients suggest a reduction in the recidivism rate as the length of stay in prison increases. For example, we find that the probability to recidivate within 1 year after release from prison is reduced by around 18 percentage points per extra hundred days of incarceration. However, none of the estimates for recidivism prevalence are statistically significant. We also find the coefficient size to decrease substantially as the time since release increases, from − 0.176 at the 1-year mark to − 0.028 at the 5-year mark. This suggests that any potential inverse effect of imprisonment length on recidivism prevalence diminishes as time proceeds. These findings are further corroborated by the small and statistically non-significant estimates from the sensitivity analyses.

The estimates for recidivism incidence capture absolute changes in offense counts, allowing us to investigate the relationship between imprisonment length and reoffending frequency after release. As shown in Table 2, we find statistically significant absolute reductions in offense counts across all measurement periods (p < 0.05). Within the first year after release, we find a 100-day increase in imprisonment length to reduce recidivism incidence by 1.28 offenses. While the estimates suggest that the largest relative reduction in offenses takes place in the first year after imprisonment, we find the effect size to increase to − 2.91 offenses at the 5-year mark. The estimates for recidivism incidence remain statistically significant in a higher-order model specification (see “Higher-order estimation results”).

Reconciling the estimation results for recidivism prevalence and incidence, our evidence suggests that an increase in imprisonment length does not significantly affect recidivism prevalence. Yet, we do find longer prison spells to cause reoffenders to be less criminally active. Hence, the question arises whether the imprisonment length-recidivism relationship differs across offense categories. To address this question, we also investigate treatment effect heterogeneity.

Heterogeneity analyses

Tables 3 and 4 present 2SLS estimates for recidivism prevalence and recidivism incidence across multiple offense categories and offender characteristics. We investigate treatment effect heterogeneity across different crime categories by differentiating between violent crime, property crime, and other crime (e.g., drug offenses) among the full sample, and treatment effect heterogeneity across offender characteristics by differentiating between first-time prisoners and individuals who have previously been imprisoned, as well as young adults (ages 18–25) and adults (ages 26 +).

Table 3 Baseline two-stage least squares recidivism prevalence and incidence estimates by time since release and crime category
Table 4 Baseline two-stage least squares recidivism prevalence and incidence estimates by time since release and subpopulation

The estimation results shown in Table 3 suggest that there is considerable heterogeneity across crime categories in the imprisonment length-recidivism relationship. Starting with recidivism prevalence, we find a marginally significant reduction in property crime at 3 and 5 years after release (p < 0.10), whereas all of the estimates for violent and other crimes are close to zero and statistically non-significant. The suggestion that lengthier prison sentences may be more effective in reducing property crime than other offense categories after release is further supported by the estimation results for recidivism incidence. Only property offense counts appear to be reduced by longer prison spells (p < 0.10), whereas none of the estimates for violent and other crimes are statistically significant.

Table 4 shows that the conclusion that recidivism prevalence is unaffected by imprisonment length does not differ for any of the included subsamples. While the reduction in recidivism prevalence among repeated prisoners is statistically significant at the 3-year mark (p < 0.05), we find statistically non-significant estimates at both 1 and 5 years after release. The relationship between imprisonment length and recidivism incidence, however, does appear to vary across offender characteristics. The difference across imprisonment history seems especially pronounced, as we find substantially larger coefficient sizes for repeated prisoners than for first-time prisoners. Furthermore, only the estimation results for repeated prisoners remain marginally statistically significant beyond the 1-year mark (p < 0.10). While we find more similarly sized coefficients across age groups, only the estimates for adults are marginally statistically significant across all follow-up periods (p < 0.10). All subsample estimation results have to be interpreted with caution, however, as the sensitivity analyses show them to be sensitive to changes in functional form.

Higher-order estimation results

To further investigate the sensitivity of our estimates, we test for robustness against a higher-order model specification. To this end, we re-estimate the 2SLS models and add a quadratic term for days imprisoned, the results of which are presented in Tables 5 and 6. The quadratic estimates shown are computed as a linear combination of the linear and quadratic imprisonment length terms, and capture the total change associated with a 100-day increase in imprisonment length: in the recidivism rate for recidivism prevalence, and in offense counts for recidivism incidence. Through this approach, we investigate potential biases in the baseline 2SLS estimates due to nonlinearity in the imprisonment length-recidivism relationship (see Mogstad & Wiswall, 2010).
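As an illustration of such a linear combination, suppose the quadratic second stage adds a term \(\theta {I}_{i}^{2}\) to Eq. (1), where \(\theta\) is a hypothetical symbol that does not appear in the original specification. Holding all other terms fixed, the predicted change in the outcome when imprisonment length rises from a baseline of \(d\) days to \(d+100\) days is then:

$$\Delta y={\beta }_{1}\left[(d+100)-d\right]+\theta \left[{(d+100)}^{2}-{d}^{2}\right]=100{\beta }_{1}+\theta \left(200d+10000\right)$$

Evaluated at \(d=0\), this reduces to \(100{\beta }_{1}+10000\theta\). The baseline length at which the reported combination is evaluated is not stated in the text, so the choice of \(d\) here is an assumption. Because the combination is linear in the two coefficients, its standard error follows directly from their estimated variances and covariance.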

Table 5 Higher order two-stage least squares recidivism prevalence and incidence estimates by time since release and crime category
Table 6 Higher order two-stage least squares recidivism prevalence and incidence estimates by time since release and subpopulation

Table 5 shows the statistically significant baseline estimates for recidivism incidence among the full sample to remain statistically significant when we include a quadratic term. This also holds true for the property crime recidivism prevalence and incidence estimates. In comparison to the baseline linear model specification, we find the inclusion of a quadratic term to produce notably larger coefficients. As such, we consider the baseline estimation results to likely provide a lower bound for the true effect of imprisonment length on recidivism. As relaxing the linearity restriction confirms the robustness of the baseline estimation results for recidivism incidence among the full sample, we find these sensitivity analyses to support the conclusion that longer prison spells reduce reoffending frequency after release.

Table 6 presents similar results for recidivism prevalence among the included subsamples, as none of the reductions in recidivism prevalence remain statistically significant across functional forms. For recidivism incidence, however, we only find the reductions among the adult subsample (ages 26 +) to remain statistically significant across all follow-up periods in a quadratic model specification (p < 0.05). While this suggests that imprisonment length only affects recidivism incidence among adults, these results have to be interpreted with caution as we find notable changes in coefficient size across functional forms. Furthermore, despite being statistically non-significant, the quadratic model specification produces the largest coefficient sizes for repeated prisoners. Hence, while our investigation offers some insight into treatment effect heterogeneity across offender characteristics, we do not find the estimation results to be sufficiently robust to support substantive conclusions.

Indicator specification

Criminological theory postulates that preventative effects from lengthier prison terms arise from either deterrent or rehabilitative mechanisms. As rehabilitative provisions in the Dutch criminal justice system are limited for prisoners serving sentences of up to 90 days (RSJ, 2021), we further investigate the underlying causal mechanism by estimating a two-stage least squares (2SLS) instrumental variable model with an indicator for prison spells longer than 90 days. As such, the estimates capture the total change in recidivism by a prison term longer than 90 days, compared to a prison term up to 90 days.

Table 7 presents the estimation results from the 2SLS indicator specification for recidivism prevalence and recidivism incidence across multiple offense categories. Similar to the baseline estimation results, prison terms longer than 90 days appear to only significantly reduce recidivism incidence in general. More specifically, we find a marginally significant reduction in the reoffending frequency of 2.15 offenses in the first year after release (p < 0.10), which increases to − 3.92 and − 4.90 offenses at 3 and 5 years after release (p < 0.05). The estimation results across different crime categories are also in line with the baseline estimates. Again, we only find marginally significant estimates for property crime, including recidivism prevalence at 3 and 5 years after release (p < 0.10). All of the estimates for violent and other crimes remain statistically non-significant and mostly close to zero.

Table 7 Nonlinear two-stage least squares estimates for recidivism prevalence and incidence by time since release and crime category

Discussion

This study investigates the causal effects of the length of imprisonment on recidivism after release. Unique individual-level administrative data on all offenders sentenced to short-term imprisonment (≤ 1 year) by a single-sitting judge in 2012 in the Netherlands allow us to control for heterogeneity in offense and offender characteristics by exploiting variation in the proclivity to impose lengthier sentences across randomly assigned judges through a two-stage least squares (2SLS) instrumental variable design. To further investigate the effects of imprisonment length, we estimate separate models for recidivism prevalence and incidence, and multiple crime categories (violent crime, property crime, and other offenses).

Our findings suggest that length of imprisonment does not have a significant effect on recidivism prevalence and that this conclusion holds across various follow-up periods, that is, 1 year, 3 years, and 5 years after being released from prison. In contrast to the recidivism prevalence estimates, our findings suggest that longer prison sentences significantly reduce recidivism incidence. The reduction in recidivism incidence appears to be driven by property offense counts, as we find a marginally significant reduction in property offenses, whereas violent and other offenses are unaffected. More specifically, a 100-day increase in imprisonment length reduces offending frequency by 0.84 property offenses within the first year after imprisonment. After 5 years, this reduction has increased to 1.66 property offenses. All of the estimation results for recidivism incidence are robust to changes in functional form (i.e., linear versus quadratic). The recidivism prevalence estimates, however, were sensitive to changes in model specification, as only the quadratic model estimates show a statistically significant reduction in recidivism prevalence. While we do not find evidence of nonlinearity in the relationship between prison length and recidivism, the sensitivity analyses indicate that relaxing the linearity restriction may impact the results. Future research is therefore needed to further examine the relationship between imprisonment length and recidivism prevalence.

The estimation results support theoretical perspectives on the preventative effects of imprisonment, which posit lengthier sentences to reduce reoffending after release. Most notably, the reduction in recidivism incidence is in line with specific deterrence theory (Nagin et al., 2009), which states that greater punishment severity exerts a greater deterrent effect on reoffending. The evidence that only property offense counts are reduced by longer prison spells further supports this perspective, as deterrence theory suggests higher expected costs of offending to primarily deter rational decisions to commit instrumental crime, as opposed to expressive offenses (e.g., violent crime). Other preventative theoretical perspectives suggest longer prison spells to reduce recidivism through rehabilitative mechanisms. In the Netherlands, rehabilitative programming is largely absent when extremely short-term prison sanctions are considered. For instance, individuals serving short-term prison sanctions of up to a couple of months are not able to participate in Penitentiary Programs aimed at successful reintegration after release, and guidance and supervision from probation officers is largely absent (RSJ, 2021). The fact that we observe some crime preventative effects may, in part, also be the result of more emphasis on rehabilitative programming during longer terms of imprisonment. This is also in line with prior work that typically finds crime preventative effects of imprisonment in contexts where rehabilitative programming is emphasized (Loeffler & Nagin, 2022). At the same time, we also find that imprisonment length does not influence recidivism prevalence, and recidivism incidence for violent and other crimes, so the net effect of length of imprisonment may also represent an amalgam of both intended and unintended consequences, and rehabilitative and criminogenic effects on those types of repeat offending behavior.

As the current study is one of the first to investigate the effects of length of imprisonment on recidivism incidence, a comprehensive comparison of our findings to prior research is challenging. Overall, the unveiled reduction in reoffending frequency is in line with the preventative effects of longer imprisonment found in the current generation of quasi-experimental research (see Drago et al., 2009; Kuziemko, 2013; Rhodes et al., 2018; Roach & Schanzenbach, 2015). Yet, in contrast to these prior studies, we do not find statistically significant effects on recidivism prevalence. A potential explanation for these divergent findings may lie in the comparatively limited changes in imprisonment length under consideration, which is in line with the only other third-generation study that finds null effects on recidivism prevalence (Al Weswasi et al., 2022). While these changes prove to be sufficiently substantive to reduce reoffending frequency, they may be too small to nudge individuals towards desistance. Typically, transitioning from a life in which committing a crime is common to a life without committing crimes takes time and usually occurs gradually (Bushway et al., 2001; Maruna, 2001).

The exploitation of variation in judge stringency through an instrumental variable design allows us to control for both observed as well as unobserved heterogeneity. As such, this approach enables us to account for the considerable endogeneity that arises from the consideration of recidivism risk in sentencing decisions. Combined with our data, however, this approach also brings along multiple limitations. First, the resulting estimates represent local average treatment effects. As the average sentence length in our sample is only 51 days, and the vast majority of sentences are shorter than 6 months, the local average treatment effects are limited to sentences spanning weeks or months, rather than years. As prior research suggests that the effect of incarceration on reoffending varies across imprisonment length (see Meade et al., 2013), the generalizability to longer prison spells may be limited.

Second, 2SLS instrumental variable estimation is more vulnerable to small-sample bias than single-stage procedures, due to larger variance (see Boef et al., 2014). Despite the strength of the constructed judge stringency instrument, we find our sample size to be insufficient to further investigate heterogeneous effects across subsamples. Yet, it is important to acknowledge that there are reasons to expect the imprisonment length-recidivism relationship to differ across offender characteristics. From a specific deterrence perspective (Nagin et al., 2009), the length of a prison spell may have a greater influence on expectations about punishment severity among first-time prisoners, as compared to individuals who have prior prison experience. The significance of aging effects may also differ across age groups and length of imprisonment (e.g., Meade et al., 2013). While prior research comparing custodial to non-custodial sentences supports such treatment effect heterogeneity, further research is warranted as to what extent length-of-stay effects differ across these groups.

This study adds to the scarce body of quasi-experimental evidence on the effects of imprisonment length on recidivism after release. Only few studies have previously investigated this relationship, and even fewer have investigated these dynamics across lengths of stay shorter than 1 year. Yet, such short incarceration spells account for most of the prison sentences imposed in Western European and Nordic countries, as well as jail sentences in the USA. While prior evidence suggests that non-custodial alternative sanctions for such offenses are more effective in lowering recidivism than custodial sanctions (see Wermink et al., 2010), our findings add that, in the case of imprisonment, lengthier spells reduce the number of property crimes in society. In addition to greater specific deterrence, lengthier imprisonment may also increase exposure to rehabilitative services offered in prison, such as educational and vocational services (see Gendreau et al., 2008; MacKenzie, 2006; Nagin et al., 2009).

In summary, our findings confirm the belief that longer prison sentences reduce reoffending among individuals serving prison terms of up to 1 year. The relevance of these findings is emphasized by the continued rise in imprisonment levels and costs, due to harsher sentencing in many Western European countries (Dünkel, 2017; Tak, 2008), as well as the USA (Henrichson & Delaney, 2012; Latessa et al., 2020). We cautiously conclude that these substantial investments in longer prison sentences yield returns through the reduction of the number of property crimes committed after release and its societal costs, though the null effects on recidivism prevalence and recidivism incidence of violent and other crimes readily caution against unrealistic expectations in this respect. Moreover, there are high costs associated with prison sentences. For instance, the Ministry of Finance (2013) in the Netherlands reports 249 euros per day for imprisonment, while the daily costs for non-custodial sanctions are estimated to be much lower. From a policy perspective, alternative ways of punishment seem to be more promising in terms of reducing criminal justice expenditures and reducing repeat offending. This is especially relevant given the high number of extremely short prison sentences in many Western contexts. Yet, our results also suggest that, for individuals for whom a noncustodial sanction is not suitable, investing in longer prison sentences yields crime preventative effects, especially when property crimes are considered. Whether public resources should be spent to achieve such benefits should be discussed further, as an increase in imprisonment length also leads to a substantial increase in costs. For instance, at 249 euros per day, a 100-day increase in imprisonment length would equal an increase of 24,900 euros per imprisoned offender. It will also lead the prison population to increase, adding pressure to the prison system.
While further research is warranted to assess the generalizability of these findings to long-term prison sentences, this study contributes to a comprehensive overview of the costs and benefits of criminal justice policy.