Introduction and research approach

In the Netherlands, as in most Western countries, there appears to be a deeply rooted disagreement about the appropriate severity of punishment for criminal cases between judges in criminal courts and the general public. In the literature (discussed below), it has been suggested that such a gap may be an artefact of survey methodology and the lack of factual information on the part of the general public when responding to questions about the levels of sentencing. In this article we examine the question of whether a punitiveness gap between judges and the public in the Netherlands is really a problem of information rather than a true normative gap in terms of preferred severity of sentencing. Does public opinion become less punitive when more information is provided? Are sentences preferred by the public really that different from judges’ decisions in court when the public has available the same type and amount of information on a specific criminal case?

While the role of information in affecting levels of punitiveness has been the subject of much previous research, we believe that a thorough examination of the issues at hand can only be achieved through the combination of distinct but connected studies using both survey and experimental methodologies, and integrating samples from the general public and judges working in the criminal courts. In doing so, we contribute to an existing body of research that is largely based on separate studies incorporating single methodologies. Below, we will therefore discuss and integrate findings from three studies on punitive preferences in the Netherlands: study I, a sentencing study with a large sample of judges from Dutch criminal courts responding to three detailed and realistic case files; study II, a survey among the Dutch general public using survey questions that measured people’s punitive opinions off the top of their heads; study III, a sentencing study with a sub-sample from study II (i.e. the public survey), using exactly the same case files as in the judges’ sentencing study, as well as descriptions of the same cases in the abridged format of newspaper articles.

The relationship between the three sub-studies is schematically presented in Fig. 1. The design shown in Fig. 1 facilitates three comparisons:

  C1. Sentencing compared between lay persons and professional judges when presented with the same detailed case files (comparing study I with IIIa).

  C2. Sentencing compared between lay persons presented with a complete case file and lay persons given a short newspaper article of the same case (comparison within study III, i.e. comparing IIIa with IIIb).

  C3. Lay persons’ answers to general survey questions compared with the same persons’ sentences when presented with concrete cases (both case files and newspaper articles) (study II compared with study III).

Fig. 1 Design of the study incorporating three related studies, facilitating comparisons C1, C2, and C3

Combination of these connected studies results in the integration of both experimental and survey methodologies. Note that, next to the survey methodology of study II, our approach involves two explicitly experimental elements. Study III is a randomised experiment by design, assigning members of the general public either one of the three detailed case files (IIIa), or one of the three abridged newspaper articles based on the case files (IIIb). This design of study III enables experimental testing of effects of amount and nature of information on public sentencing preferences. The second experimental element in our approach is indirect; it is the result of integrating study I with study III (i.e. IIIa in Fig. 1), comparing sentencing decisions of the public with those of judges, based on identical case materials. Both in study I and study III, stimulus materials were randomly distributed to respondents.

Previous research and literature

The concept of a punitiveness gap is based on public opinion surveys. These surveys show, rather consistently, that there is a wide gap between judges and the public in terms of preferred severity of sentences. At first glance, public opinion on the issue of sentencing in the Netherlands appears to be crystal clear and has remained quite stable over time. Typically, between 80% and 90% of the Dutch public agree with the widely used survey statement ‘In general, sentences for crimes in the Netherlands are too lenient’ (e.g. Elffers and de Keijser 2004; Sociaal en Cultureel Planbureau 2002, 2005). In this respect, the Dutch public is not much different from that of other Western countries (cf. Barber and Doob 2004; Hough and Roberts 1998; Hutton 2005; Mattinson and Mirrlees-Black 2000; Roberts and Hough 2002). However, in recent years, research has accumulated to build a strong case against the validity of such survey measurement of public opinion on crime and punishment (see, for overviews, Roberts and Hough 2005; Roberts et al. 2003). A number of reasons have been put forward to explain why public attitudes towards sentencing are not as punitive as general survey questions would tempt us to believe. More informed methods of gauging public opinion on issues such as sentencing would approximate actual judges’ decisions much more closely than the ‘unreflecting views’ (Hough and Park 2002) produced by general survey methodologies.

Public opinion and populist punitiveness

The public outcry for harsher sentences has, in many Western countries, been associated with a ‘punitive turn’ that occurred in the past few decades. Hutton (2005) describes the two main manifestations of this punitive turn: rising prison populations and the politicisation of crime and punishment (see also Beyens et al. 1993). In the Netherlands a punitive turn has also taken place. While the Netherlands was, for a long time, a country with a mild sentencing climate, it is currently average in terms of prison population compared to other Western countries (cf. Tonry 2004). A mechanism through which public opinion may establish and continue to sustain such a punitive turn has been described by Bottoms (1995) as populist punitiveness. More recently, Roberts and colleagues (2003) discussed basically the same mechanism in terms of penal populism. The mechanism is ‘simply’ that policy makers, judges, and legislators respond to what they perceive as massive popular support for harsher sentencing. One driver fuelling this mechanism may be that the call for harsher sentences is believed to be associated with a lack of confidence in the criminal justice system (Hough and Roberts 1999; van Koppen 2003).

Issues with punitive attitude measurement

Research has shown how outcomes of penal attitude measurements are affected by questioning technique and context provided (cf. Durham III 1993; Green 2006; Hough and Park 2002; Tonry 2004; Hutton 2005; Roberts et al. 2003; Stalans 2002; Walker and Hough 1988). In the Netherlands, too, some of the scarce research that does not rely solely on the usual general survey questions shows public support for sanctions other than mere stiff prison sentences (Dümig and van Dijk 1975; van der Laan 1993).

Two related issues play a central role in the discussions about variations in public punitiveness. One is the specific method of inquiry, while the other concerns the type and degree of information that is provided to respondents. The combination of these determines what is being measured. Yankelovich (1991) has made a distinction between public opinion and public judgment. Public opinion is what is measured off the top of people’s heads without much prior deliberation or processing of specific information. This is measured by general surveys (see also, Zaller 1992). Public judgment, on the other hand, results from more informed and deliberated choice. Much recent empirical research indicates that informed public judgment is less punitive and more liberal than public opinion is (Hough and Park 2002; Hutton 2005). Techniques such as deliberative polls and focus groups have been shown to generate public judgments about punishment not far removed from, or even the same as, what actual sentencers would do (Green 2006; Hutton 2005; Roberts et al. 2003).

A factor that is believed to affect punitive responses further is the specificity of the questions asked and of context provided. General survey questions on sentencing produce a different type and more punitive response than questions on sentencing pertaining to specific cases (cf. Applegate et al. 1996; Cullen et al. 2000; Hutton 2005). Offering concise vignettes of criminal cases to respondents already produces public responses similar to actual judges’ sentences for the same cases, as a Swiss study has shown (Kuhn 2002). One reason that has been given for this is that, when asked a global question about sentencing, people tend to focus on stereotypes and worst case scenarios, which results in more punitive stances (cf. Roberts et al. 2003; Stalans 2002). In general, it appears that knowledge, information, and specificity are inversely related to public punitiveness (Doob and Roberts 1988; Indermauer and Hough 2002; Mirrlees-Black 2002; Seidman-Diamond 1990).

A final factor that has been argued to cause people to express punitive attitudes is their fear of crime (Indermauer and Hough 2002; Sprott and Doob 1997) and the belief (perception) that crime is strongly on the rise (Hough et al. 1988; Rossi and Berk 1997; Sprott and Doob 1997). In this respect, a punitive attitude may be conceived as part of a complex of fear, insecurity and negative attitudes towards crime and justice.Footnote 1 In a similar vein Hutton (2005) describes punitiveness as part of a narrative of insecurity that shapes the way that people perceive and respond to crime- and justice-related issues.

The current focus

Using general survey questions it is expected that members of the public will report being (highly) dissatisfied with the level of sentences in the Netherlands. After all, in most Western jurisdictions, this is consistently shown in public opinion surveys. However, when given realistic case files containing detailed information, the same dissatisfied persons are expected to prefer sentences similar to judges’ sentences. In contrast, when members of the public are presented with short, one-sided newspaper articles of the same cases, sentences are expected to be more severe.

Our research design facilitates within-subject comparison of survey responses with responses to the experimental case materials. It thus enables us to analyse people’s sentencing preferences in more depth by differentiating between types of case material, relating them not only to their general punitive attitudes but also to other crime- and justice-related attitudes.

In summary, against the backdrop of attitudinal survey information obtained through study II, our specific focus will be on the evaluation of two hypotheses:

  1. The general public reaches the same sentencing decisions as judges do when both groups are given exactly the same detailed case file of a specific criminal case.

  2. When members of the general public consider a concise newspaper report of a specific criminal case, sentencing decisions will be much harsher than when they are handed the full case file.

Background information on the Dutch criminal justice system

Before we discuss the empirical studies, it is necessary to provide very briefly some relevant background information on the Dutch legal system.

All cases in the Netherlands are tried exclusively by professional judges. Every official involved in each stage of the criminal process (such as police, prosecutor, defence, examining judge, expert witnesses) produces written records that become part of the case file. These written records sum up findings, courses of action and points of view. Dutch criminal procedure and judges’ decision making rely to a large extent on these written records. In court, interaction between judges, prosecutor, accused and counsel focuses on evaluation of the case file.

All criminal cases are tried, in the first instance, by the criminal law divisions of the district courts. In these courts the less serious cases are tried by judges sitting alone, the more serious cases by panels of three judges. All cases receive a full trial: plea bargaining does not exist in Dutch law. If the accused is found guilty by the judge(s), single-sitting judges generally give their verdict immediately. When a case is tried before a panel of judges, the verdict is given after the judges have deliberated in chambers. Decisions of a district court are open to full appeal, both on the facts and on the law, to one of the courts of appeal, without leave to appeal. Thereafter, appeal in cassation is possible to the Supreme Court, on matters of law only.Footnote 2

Dutch judges enjoy wide discretionary powers in choosing the type and severity of punishment (de Keijser 2000; Tak 1997). There are no mandatory sentences. For each type of punishment, a sentence cannot be set below a legal minimum (e.g., one day imprisonment, €10 fine). Specific maximum terms are laid down for each offence codified in the penal code. There are no sentencing guidelines, though Dutch judges do aim to enhance consistency through mutual consultation and by formulating sentencing policies for clearly defined types of offences. Furthermore, the Dutch prosecutor requests a specific punishment at the end of the trial hearing. Judges are not bound by the requested punishment, although it does provide some form of anchor point in judges’ deliberations on the sentence.

Study I: sentencing study with judges in criminal courts

Background and objective

This study was carried out earlier as a separate sentencing study with Dutch judges focusing on particular psychological pitfalls that may affect judges’ decisions on proof of guilt and on punishment (de Keijser and van Koppen 2006). The study used an experimental design with fictitious but realistic case files as stimuli, randomly distributed over participating judges. A selection of the case files and judges’ sentencing decisions from that study are used for current purposes. The objective is to obtain sentencing decisions in detailed fictitious case files that lend themselves to replication with an experimental study using a sample from the general population.

Materials

Dutch legal procedure strongly relies on evaluation of the written file. The reality of this mode of legal decision making was approximated as closely as possible by providing judges with realistic case files. We included in the case files all relevant information in the same raw format as judges are accustomed to.Footnote 3 In the original study (de Keijser and van Koppen 2006), case files were constructed from three basic stories: (A) aggravated assault, (B) simple assault, and (C) aggravated burglary.Footnote 4 Table 1 describes the characteristics of these case files. For each of the basic files, a strong-evidence version was constructed, together with a weak-evidence version, resulting in a total of six case files (cf. Table 1). While the two assault cases (A and B) involved basically the same incident, differences between them related to crime seriousness. Type and level of violence applied by the perpetrator, and subsequent injuries suffered by the victim, were varied. In the serious version (A), the offender kicked not only the victim’s body but also his head, resulting in permanent loss of the power of speech as well as irreparable paralysis from the waist down. In the less serious version (B), only the victim’s body was kicked, and no permanent injuries resulted. The aggravated burglary case (C) was constructed as a non-violent contrast to the two assault cases. Within its own legal qualification, however, it was also a serious case.

Table 1 The six dossiers in the original study and their content (taken from de Keijser and van Koppen 2006)

Case files included the usual elements, such as police affidavits of witness statements, victim statements and statements by the accused, forensic experts’ and medical examiners’ reports, prosecutor’s indictment and requisitoir (summing up), psychological reports on the accused, and criminal records of the accused. In the aggravated assault case, the prosecutor requested 30 months imprisonment; in the simple assault case 2.5 months imprisonment, and in the burglary case 6 months imprisonment. These requested punishments were consistent with national prosecution guidelines for similar cases. Each case file comprised about 20 pages of written reports.Footnote 5 An instruction page was added, stating our awareness of two unavoidable abstractions from reality in this study. These involved having to make a decision without actually seeing the accused at trial, and the absence of deliberations in chambers. The lack of a real trial was compensated for by a final sheet attached to the case files in which a short description was provided of the hypothetical trial.

Because issues related to strength of the evidence between case files are not of interest for current purposes, as indicated in Table 1, below we will only use the data that relate to the strong evidence versions of the aggravated assault (A), the simple assault (B), and the aggravated burglary (C).

Procedure and design

In October 2003 we asked all 629 judges in the district courts and all justicesFootnote 6 in the courts of appeal who served in the criminal law divisions to participate. We excluded the so-called replacement judges, who serve part time alongside their jobs elsewhere. The Dutch Council for the Administration of Justice (Raad voor de rechtspraak) wrote a letter of recommendation to the presidents of all 19 district courts and five courts of appeal, describing the study only in general terms as ‘a study on legal decision making’. Two weeks later we sent the case files to the judges. A reminder was sent out 2 weeks later. Participation in the study was anonymous. The dossiers were accompanied by a separate response form and a postage pre-paid return envelope. Judges were requested to write down their sentence freely, in a manner consistent with real sentences. The design of the original study was completely between subjects according to a 3 (case version, A/B/C) × 2 (evidence, strong/weak) design. The six case files were randomly distributed between judges.
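The 3 × 2 between-subjects distribution described above can be sketched as follows. This is only an illustration under stated assumptions: the text says that the six case files were randomly distributed, but not how, so the balanced shuffle used here (cell sizes differing by at most one) is an assumption, and the function name `assign_case_files` and the seed are hypothetical.

```python
import random
from collections import Counter
from itertools import product


def assign_case_files(n_judges, seed=0):
    """Randomly distribute the six case files (3 cases x 2 evidence levels).

    Assumption: a balanced shuffle; the source only states that the
    files were randomly distributed between judges.
    """
    cells = list(product(["A", "B", "C"], ["strong", "weak"]))
    # Repeat the six cells until every judge has one, then shuffle the order.
    deck = [cells[i % len(cells)] for i in range(n_judges)]
    random.Random(seed).shuffle(deck)
    return deck


assignment = assign_case_files(629)  # 629 judges were approached
counts = Counter(assignment)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Because each judge receives exactly one of the six files, every comparison between cases or evidence levels is a comparison between different judges, which is what makes the design completely between subjects.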

Sample and representativeness

A total of 229 judges returned their written decisions to us. This was 36% of the population of judges in the Dutch criminal courts. As noted above, for current purposes, we only used the data relating to the strong evidence versions of the cases. This selection of data resulted in 180 sentencing decisions pertaining to one of the three strong evidence cases.Footnote 7 This constituted 29% of the population of criminal judges at that time. A limited number of background variables for the population were available, enabling a rough indication of representativeness of our sample in terms of gender, type of judge (judge in court or justice in court of appeal), and regional dispersion grouped at the level of courts of appeal jurisdictions. Table 2 compares our sample with the population of criminal judges in terms of these background characteristics. It gives the descriptives for the original sample as well as for the selection that we made for current purposes. Although judges are slightly over-represented and justices slightly under-represented, overall representativeness may be considered satisfactory both in the original sample and in our current selection of 180 sentencing decisions in strong evidence cases.
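The two percentages reported above can be reproduced with a short calculation. The sketch assumes that the 629 judges mentioned under ‘Procedure and design’ make up the population to which both percentages refer; the helper name `pct` is hypothetical.

```python
population = 629        # judges approached (assumed to be the full population)
returned = 229          # written decisions returned
strong_evidence = 180   # decisions on the strong-evidence versions


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


response_rate = pct(returned, population)
selection_rate = pct(strong_evidence, population)
print(response_rate, selection_rate)
```

Under that assumption the calculation gives 36 and 29, matching the figures in the text.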

Table 2 Judges in criminal courts: sample representativeness, gender, type of judge, and regional dispersion (percentages). Regional dispersion is grouped at the level of courts of appeal jurisdictions.

Judges’ sentencing decisions

Coding of sentencing decisions was uncomplicated. All but four judges specified a straightforward prison sentence. The four exceptions involved community service orders. These were excluded from further analyses of the sentencing decisions.Footnote 8

A prison sentence in the Netherlands can be imposed completely unsuspended, partly suspended, or completely suspended. In all cases in this experiment the large majority of judges specified a completely unsuspended prison term (74% in the aggravated assault case; 70% in the simple assault case; 83% in the burglary case). Our analyses are based on total prison sentences.

Table 3 shows the sentencing decisions. While the average sentences in Table 3 are well below the formal upper limits, as they usually are, the table does show that Dutch judges make use of their discretionary powers in varying ways. Sentences differ substantially between judges given identical criminal cases. For the aggravated assault, the most serious case in our study, the standard deviation was close to 10 months imprisonment, with an average sentence length of 30 months. It should be noted, however, that these standard deviations most likely overestimate differences between judges in real cases in Dutch courts. While our participants evaluated the case file and made the subsequent sentencing decision in isolation, in real life serious cases such as these are dealt with by a panel of three judges (cf. above on the Dutch legal system). It may be expected that such deliberations in chambers between judges have a converging effect on the sentence.

Table 3 Judges’ sentences (N = 177; months of imprisonment)

Study II: survey among the Dutch population

Objective

In this survey general penal attitudes and their correlates are globally charted and will be used at a later stage to put the findings from the sentencing study (study III) into a wider attitudinal perspective (comparison C3).

Sample

The survey was carried out in November 2004 using the Telepanel of TNS-NIPO, a large Dutch marketing research bureau. The Telepanel is representative in terms of gender, age, education, and urbanisation. A representative sub-sample from the Telepanel of 2,155 Dutch persons of 18 years and older was used for the current study. The questionnaire was programmed to be self-administered (with computer-assisted personal interviewing methodology).

Questionnaire

Using, as far as possible, the same wording as previous (Dutch) survey research, our questionnaire covered the following areas: attitudes on sentencing climate in the Netherlands; attitudes toward judges; concern over and perceptions of crime and law enforcement; and knowledge of and interest in crime and law enforcement. Table 4 lists the survey questions. In addition to these items a number of background variables were available for analyses: gender, age, level of education, vocation, political preference, and media consumption.

Table 4 Survey questions

Findings

Penal attitudes and attitude towards judges

In our sample, 84% agreed that sentences are too lenient, whereas only 5% disagreed. In line with these percentages are people’s responses to the hypothetical situation of being in the judge’s chair. No fewer than 81% expected to be harsher than a real judge; almost one fifth (19%) expected to sentence about the same, and fewer than 1% expected to be more lenient than a real judge. However, the overall attitude towards judges was not negative at all. Fewer than 10% of the public rejected the notion that one can be confident that a judge will deal with one’s case in an independent and unbiased way. Moreover, when asked for a general evaluation mark (ranging from 0 to 10), four out of five persons rated judges’ performances as at least sufficient (6 or higher). And, somewhat surprisingly, as many as 75% agreed that, in the eyes of the general public, judges’ sentences will never be harsh enough.

Concern over and perceptions of crime and law enforcement

Our sample expresses great concern about crime: 86% are concerned. When further asked about perceived trends in crime rates, over two-thirds believe that crime has gone up strongly over the past years. Only 7% believe that crime rates have remained stable over the past years, and no more than 1% think that crime rates have dropped. When asked about perceived trends in sentencing, only 13% think that, nowadays, sentences are harsher than they were 10 years ago. As many as one-third believe that sentences have become more lenient. The remainder of the sample think that sentencing has remained at the same level as it was 10 years ago.

Knowledge of and interest in crime and law enforcement

Over 40% of the Dutch claim to be interested in news about criminal cases. Only one in five (18%) expresses no interest whatsoever in such media reports. In order to get a glimpse of our respondents’ general knowledge of the criminal justice system, we asked a straightforward multi-staged question on Dutch criminal procedure: which official(s) are responsible in a murder case for prosecution, the decision of guilt, and the sentencing decision, respectively? Just over 80% of our sample knew that the prosecutor is responsible for prosecution,Footnote 9 36% correctly stated that a panel of three judges decides upon the question of guilt and one-third correctly responded that the same panel of three judges gives the sentencing decision in a murder case. Overall, the total number of correct answers to the three straightforward knowledge questions was not very impressive. Only 24% of our sample knew the three correct answers; 16% gave two correct answers, 46% gave only one correct answer, and 14% had it all wrong.
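As a rough consistency check on the knowledge scores reported above, the distribution of correct answers can be compared with the three per-question success rates. The sketch below uses only the percentages from the text; reading ‘just over 80%’ as 80 and ‘one-third’ as 100/3 is an approximation, and the variable names are hypothetical.

```python
# Share of respondents (%) by number of correct answers, as reported.
share_by_score = {3: 24, 2: 16, 1: 46, 0: 14}

total_share = sum(share_by_score.values())
mean_correct = sum(score * share for score, share in share_by_score.items()) / total_share

# Approximate per-question success rates (%): prosecution, guilt, sentence.
item_rates = [80, 36, 100 / 3]
mean_from_items = sum(item_rates) / 100

print(total_share, mean_correct, round(mean_from_items, 2))
```

Both routes give an average of about 1.5 correct answers out of three, so the two sets of percentages are consistent with each other within rounding.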

We related knowledge to respondents’ consumption of various television news shows. Correlations were small but statistically significant. There was a clear and consistent pattern of association, in the sense that watching news shows on public television was positively associated with knowledge, whereas watching tabloid-type news shows was negatively associated with knowledge.

Penal attitude and its associates

We regressed responses to the statement ‘In general, sentences for crimes in the Netherlands are too lenient’ on the limited set of predictors available from our survey. These included gender, age, level of education, interest in news about crime, concern over crime, attitude towards judges, perceived trend in crime rates, perceived trend in sentencing, watching television news shows, knowledge, and political preference.Footnote 10 Multiple regression analysis reported in Table 5 shows that 29% of the variance in our respondents’ general penal attitudes, as measured by the typical survey question, can be explained.

Table 5 Explaining general penal attitude: standardised regression coefficients of background characteristics, perceptions and attitudes (multiple linear regression, N = 2,127). Variables not displaying a significant relation are not mentioned in the table: gender, knowledge, judge is seen as independent and unbiased, all other television news shows, all other political parties

While demographic characteristics, political preference and news consumption have only a minor influence, the regression model in Table 5 is dominated by three predictors. People who express a punitive penal attitude are, in general, those who are worried about crime (β = 0.24), perceive crime rates as rising (β = 0.18), and believe that sentencing is becoming more lenient (β = 0.15). The public outcry for harsher sentences may be understood not necessarily as dissatisfaction with the penal climate per se, but rather, as a general concern about crime and law enforcement.

An alternative way to analyse these data is by focusing on underlying dimensions, not on causal relations. We ran a principal components analysis (PCA) on opinion on sentencing climate, being worried about crime, perception of trend in crime rates, and perception of trend in sentencing. The dimensional analysis suggested that we retain a single principal component. This single component summarises 48% of the variance shared by these variables.Footnote 11 We describe this factor as General Concern over Crime, or GCC for short. People who score highly on this GCC factor believe that sentences are too lenient and that crime rates have risen while sentencing has become more lenient, and are, more than others, worried about crime. Our GCC factor connects well with Hutton’s earlier mentioned narrative of insecurity (Hutton 2005). Moreover, in a Dutch study explaining public support for capital punishment, similar patterns were found in relation to such extreme punitive attitudes (cf. Hessing et al. 2003).
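As a reading aid for the 48% figure, the share of variance captured by the first component can be related to its eigenvalue. This assumes that the PCA was run on the correlation matrix of the four standardised variables, which the text does not state; the symbols \(\lambda_1\) (first eigenvalue) and \(p\) (number of variables) are introduced here for illustration only.

```latex
\[
\frac{\lambda_1}{\sum_{j=1}^{p} \lambda_j} = \frac{\lambda_1}{p} \approx 0.48,
\qquad p = 4
\quad\Longrightarrow\quad
\lambda_1 \approx 0.48 \times 4 = 1.92 .
\]
```

Under that assumption, the retained component carries roughly as much variance as two of the four original variables.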

The wider attitudinal perspective of the study, as it is charted here, is in line with the familiar picture that usually emerges from survey research on public opinion towards crime and justice issues.

Study III: sentencing study with a sample from the Dutch population

Objective

This study enables the comparison of sentencing decisions between the general public and professional judges when the public sample is presented with the same case files as in study I (i.e. comparison C1 in Fig. 1 above). The study further incorporates an experimental ‘between-subjects factor’ that varies the amount and detail of information presented to members of the public. It is expected that people presented with a concise newspaper report will be harsher in their sentences than people presented with the detailed case file (C2 in Fig. 1 above).

Materials

In the sentencing study with the public, the experimental materials are: (a) the same three strong evidence case files as those used in the sentencing study with judges (study I), relating to an aggravated assault, a simple assault and an aggravated burglary, and (b) three newspaper reports based on the three case files.

The case files

For valid comparison between judges and the public, we carefully refrained from any alteration in the case files. We did, however, consider that one minor addition to the files was unavoidable. We added brief explanations of some juridical technical phrases where they occurred in the headings of reports. For instance, for the heading ‘Summons’ (Tenlastelegging), the clarification was added: ‘In juridical language this is the official written accusation of the prosecutor against the defendant’. Thus, the added information referred solely to the function of a specific report without any explanation or interpretation of content.

The newspaper reports

Three newspaper reports were obtained by giving the three case files to an experienced court journalist working for a Dutch national daily newspaper (Algemeen Dagblad). This newspaper, most would agree, is positioned on the right side of the political left–right continuum. Without revealing our objectives, and without giving any further instructions, we asked the journalist to produce a newspaper report based on each of the three case files. The resulting three newspaper reports were concise and, as expected, rather one-sided, reflecting mainly the seriousness of the crimes and the consequences for the victims, and giving only negative aspects relating to the offender. Without alteration, we adopted the three newspaper reports as experimental materials. The Appendix shows the three newspaper reports produced by the court journalist. All newspaper reports mentioned the punishment (the same as in the case file versions) that was requested by the prosecutor in that particular case (see Table 6).

Table 6 Sentencing study with the Dutch public: materials, response, requested punishment and legal sentencing maximum per case

Procedure and design

In April 2005, a random sub-sample of 1,200 persons was drawn from the 2,155 persons who had participated in the survey of November 2004 (study II above), using the respondent identification numbers kept by the research bureau TNS-NIPO. Responses by individuals in this second sample were linked to the same individuals’ responses in the earlier survey.

Case materials were randomly distributed through the normal mail. Because we feared that response rates would be relatively low among those who had received an extensive case file, more case files than newspaper reports were distributed (see Table 6). As in study II, participants were requested to respond using the self-administered CAPI-at-home questionnaire. Respondents were first asked in open and unrestricted format to give their written sentencing decision:

What punishment do you personally find appropriate in this case and how severe should it be? Please write this down concisely.

The question was designed to measure respondents’ sentencing decisions off the top of their heads with regard to the case in hand. Because we wanted to retain the possibility of comparing sentences preferred by the public with judges’ sentences, and because judges are bound to legal sentencing maxima (discussed above), in a follow-up question we mentioned the legal sentencing options and their respective legal maxima for the case in hand and asked the respondent again to give the preferred sentence.Footnote 12 This restricted follow-up question was more or less a back-up for the study in case the public’s sentences turned out to be too extreme for further comparison. It further enabled us to explore the effect of mentioning different sentencing options. The final column of Table 6 lists the legal maxima per case.

In order to understand respondents’ perceptions of judges’ sentencing behaviour in specific cases, we posed the following final question:Footnote 13

What sentence do you think a real judge would give for this case, expressed in months of imprisonment (and of course not exceeding the legal maximum)? Please try to give your best estimate.

The experiment was completely between subjects, and respondents were randomly assigned one of the case materials, with twice as many respondents being randomly assigned to a complete case file than to the newspaper versions. Respondents who were given a complete case file received a €10 voucher in return for their response, whereas those receiving a newspaper report were rewarded with a €5 voucher.

Response

Table 6 gives an overview of the case materials used in the experiment. For each type of case it shows numbers distributed and received, as well as specific and overall response rates. It also shows punishments requested by the prosecutor, and the legal sentencing maxima.

No systematic differences can be observed between response rates for complete case files and response rates for the newspaper reports. Overall response is 76% (N = 917). Moreover, the resulting sample is statistically equivalent to the original sample used in study II with respect to the attitudinal variates analysed in that study.

Findings

Sentences before and after provision of legal maxima

In all cases a large majority specified a prison sentence (cf. Table 7). Small numbers answered ‘prison sentence’ without specifying the amount. These were excluded from further quantitative analyses. Only two participants imposed the death penalty (for the aggravated assault case). For each case small numbers of respondents, never more than ten, imposed a life sentence. Sentences involving a combination of sanctions were not very common. These mostly involved prison combined with some form of treatment.

Table 7 Percentages preferring prison sentence (initial and bounded), and proportion of public harsher than judges (only initial sentence)

Mentioning the legal sentencing maximum for the case in hand did not change much in preference for the prison sentence in the aggravated assault and the burglary cases, both in the case file and in the newspaper report varieties. Table 7 shows that at least nine out of ten respondents in those cases still prefer a prison sentence after the maximum is mentioned. In the simple assault case, however, we observe a mitigating effect. Providing the maximum, and thereby also sentencing options other than imprisonment, does appear to have a modest effect on those judging either the case file or newspaper report of the simple assault: support for imprisonment decreases by, respectively, 9% and 14%. These respondents predominantly shift their preference to a community service order.

Did mentioning legal maxima have an effect on sentence severity? Table 8 shows that, in so far as it did, the effect was mainly in an unexpected direction: an increase in severity. Figure 2 enables a more focused investigation of the effect. The figure shows percentages equal to (or, in the case of the unrestricted question, above) the sentencing maximum. Figure 2 shows that, in all cases, mentioning the legal maximum produces a substantial movement from below to exactly equal to the legal maximum.Footnote 14 For instance, in the case of the newspaper report of the aggravated assault, 43% of the respondents initially sentenced the offender equal to or above the legal maximum. However, after the legal maximum was mentioned and respondents were asked again for their preferred sentence, no fewer than 65% imposed the legal maximum. For all the subjects lumped together, 22% initially chose the maximum or above, whereas 42% chose the maximum once it had been mentioned (χ² = 379.4; P < 0.001). Perhaps many of our respondents perceived the legal maxima as guidelines or orientation points (which, of course, they are not meant to be) and subsequently wanted to close the gap between their initial, apparently relatively lenient, sentence and the legal maximum.

Table 8 Sentences imposed (months imprisonment): initial sentence and sentence bounded by legal maxima; judges’ sentence as perceived by public
Fig. 2 Effect of mentioning legal maximum

Because providing the legal maximum not only results in harsher sentences than the initial ones, but also produces extremely skewed distributions on the bounded sentencing question, the following analyses focus exclusively on the prison sentences that were specified in response to the first, unrestricted sentencing question.

Case files versus newspaper reports (comparison C2)

The sentencing study with the public enables direct evaluation of our second hypothesis:

When members of the general public consider a concise newspaper report of a specific criminal case, sentencing decisions will be much harsher than when they are handed the full case file.

Table 8 shows that this hypothesis is confirmed for two of the three cases. Given the case file of the aggravated assault, the public’s average prison sentence is 61 months, whereas it is 79 months for those who were given the newspaper report as produced by the journalist. The magnitude of the effect here is thus 18 months imprisonment (P < 0.001).Footnote 15 For the simple assault there is no significant effect between the two versions of the case material. The burglary, on the other hand, shows an extreme effect: from 19 months imprisonment, when the full case file was given, up to 62 months when the newspaper version was given. One possible explanation for the magnitude of the effect (43 months, P < 0.001) is that the newspaper report in this case may inadvertently lead the reader to believe that the death of the victim is related to the burglary. In fact, and this is clear in the detailed case file, the victim’s later death is not related to the crime at all. If nothing else, the difference in sentence length between the two versions of the case material shows how extreme the impact of tone and choice of wording of a newspaper journalist may be on the public.

Perception of judges’ sentences

Table 8, in the final column, shows respondents’ perceptions of what sentences a real judge would impose for the cases presented. Members of the public think that a real judge would be much more lenient than they would themselves. Given the journalist’s version of the simple assault, respondents believe that they are twice as punitive as a real judge would be (11 months imposed versus 5 months perceived). Only the case file of the burglary generates a more modest difference between imposed and perceived sentences (19 months versus 15 months, respectively). In the next section we will return to this by relating the differences between sentences imposed and sentences perceived to judges’ actual sentences.

Integration and comparison over studies

The sentencing studies

Judges and the public on the same case files (comparison C1)

Integrating results from the two sentencing studies (studies I and III) enables evaluation of our first hypothesis:

The general public reaches the same sentencing decisions as judges do when both groups are given exactly the same detailed case file of a specific criminal case.

Table 8 shows that this hypothesis is to be rejected. The case against the hypothesis is a strong one. For the aggravated assault, judges’ average sentence was 29.7 months imprisonment. Given the same case file, lay persons’ average is more than 30 months harsher, with an average prison sentence of 60.9 months.Footnote 16 In Kuhn’s Swiss study that we mentioned earlier, public sentences initially appeared to be much more punitive than judges’ sentences when the same vignettes were given. However, Kuhn proceeded to show that the public’s average was distorted by a relatively small group of punitive extremes. The majority of lay participants in Kuhn’s study were, in fact, not more punitive than judges were. This is, however, not at all the case in our Dutch experiment. A clear majority of lay persons gave sentences harsher than judges’.

Table 7 shows that, when given the case file of the aggravated assault, no fewer than 91% impose a sentence above the judges’ average of 29.7 months. The public’s sentences for the case files of the simple assault and the burglary lead to the same conclusion: reject the hypothesis of no difference between judges and the public when presented with identical case files. For the simple assault, lay persons’ sentences are almost five times harsher than judges’ (12.1 months compared with 2.5 months). The public’s average for the case file of the burglary is 19 months, whereas the judges’ average for this case is a prison sentence of 5 months.

Thus, for the three case files given both to judges and to the public, the public is indeed much more punitive than judges are, despite parity of case materials. So, there is a real gap between public punitiveness and judges’ punitiveness. An interesting additional question is whether the public is aware of this gap. Does the public intend to be harsher than judges? We can answer this question by looking at what our lay participants in study III thought that a real judge would do when given the case in hand.

The punitiveness-gap between judges and the public

By comparing the public’s perceptions of judges’ sentences with judges’ actual sentences (see Table 8), one can observe that the lay participants systematically overestimated judges’ punitiveness. Without exception, what people think that a real judge would do is significantly more punitive than judges’ actual sentences.Footnote 17 So, the public’s general claim that sentences are too lenient would probably be even louder if the public were to be (ceteris paribus) better informed about the actual severity of sentences. This is especially interesting, since studies abroad report the opposite: a tendency among the general public to under-estimate the true severity of sentences (cf. Roberts and Hough 2005).

Integrating the findings from our sentencing studies allows us to dissect further the punitiveness gap between judges and the public into three sections. The first section is the real gap: the difference between the public’s sentences and judges’ sentences. The second section is the gap as perceived by the public: it is the difference between the public’s sentence and what the public thinks (perceives) a real judge would do. The third section is the public’s misperception: it is the difference between what the public thinks a real judge would do and the judges’ actual sentence. Figure 3 illustrates these sections of the gap.
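The three sections defined above are simple differences between average sentences, and the real gap equals the perceived gap plus the misperception. The following minimal sketch illustrates this arithmetic using the rounded averages quoted in the text for the burglary case file (public 19 months, perceived judges’ sentence 15 months, judges’ actual sentence 5 months). The names `GapBreakdown` and `dissect_gap` are hypothetical and introduced only for illustration; they are not part of the study’s analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapBreakdown:
    """Three sections of the punitiveness gap, all in months of imprisonment."""
    real_gap: float        # public's sentence minus judges' actual sentence
    perceived_gap: float   # public's sentence minus what the public thinks a judge would give
    misperception: float   # perceived judges' sentence minus judges' actual sentence


def dissect_gap(public_sentence, perceived_judge_sentence, judge_sentence):
    """Split the gap into the three sections named in the text (illustrative helper)."""
    return GapBreakdown(
        real_gap=public_sentence - judge_sentence,
        perceived_gap=public_sentence - perceived_judge_sentence,
        misperception=perceived_judge_sentence - judge_sentence,
    )


# Burglary case file: rounded averages as quoted in the text.
burglary = dissect_gap(public_sentence=19, perceived_judge_sentence=15, judge_sentence=5)
print(burglary)
# By construction, the real gap is the sum of the other two sections.
print(burglary.perceived_gap + burglary.misperception == burglary.real_gap)
```

With these rounded inputs the real gap is 14 months, of which 4 months is the gap as perceived by the public and 10 months is the public’s misperception. A perceived gap below zero would indicate respondents who expect a real judge to be harsher than they are themselves.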

Fig. 3 The punitiveness gap dissected

Figures 4, 5 and 6 show the different types of gaps for our three criminal cases, in case file version as well as in newspaper version. The figures show that the real gap is systematically larger than the gap perceived by the public. Real and perceived gaps are smaller for those who were handed a case file than for those given a newspaper report. This is in line with the effect of information that we discussed above (i.e. confirmation of our second hypothesis). Nevertheless, the perceived gap for each of the three case files remains a gap of substantial size.Footnote 18 For the burglary case file, the difference between the public’s sentence and what the public thinks that an actual judge would do is relatively the smallest (19 months imposed versus 15 months perceived).

Fig. 4 Aggravated assault: punitiveness gap dissected

Fig. 5 Simple assault: the punitiveness gap dissected

Fig. 6 Burglary: the punitiveness gap dissected

Figures 4, 5 and 6 also illustrate the public’s misperceptions of judges’ real sentences for the cases presented. The figures illustrate how lay participants overestimate judges’ sentences. As a result, our sample of the Dutch population underestimates the magnitude of the gap between themselves and judges in the criminal courts.

Conclusions from integration of the two sentencing studies

Integrating the findings from the two sentencing studies (study I and study III) reveals three things. First, in line with their answers on general survey questions, lay persons are more punitive than real judges, even when judgment is based on the same case files. Second, our study illustrated the impact that tone and wording of presentation and format of information about a case can have on the judgment of a lay person. It was shown (in two out of three cases) that providing lay persons with detailed information on a criminal case has, indeed, a strong mitigating effect on severity. While the effect of information may be huge, it did not suffice, however, in bridging the gap between judges and the public. Third, lay persons in our study consistently overestimated judges’ sentences for the cases presented. The gap as perceived by the public is, therefore, smaller than the real gap between judges and the public.

Comparing experimental findings with findings from general survey

Punitive attitudinal disposition versus sentence when presented with a concrete case (comparison C3)

In the survey in study II, four out of five respondents agreed that sentences in the Netherlands are too lenient. Not surprisingly, in response to the hypothetical situation of being in the judge’s chair, the same proportion of people expected to be harsher than a real judge, while one-fifth expected not to be harsher than a real judge. How do these groups compare on their decisions and perceptions when handed a concrete case file?

In Table 9, sentencing decisions from our public sample of study III are related to the same persons’ responses to the earlier survey question in study II. The table shows that those who claimed in the survey not to be harsher than a real judge are, indeed, more lenient than respondents who expressed a more punitive general attitude in the survey. However, the table also shows that, despite more lenient attitudes and more lenient sentences, these respondents remain much more punitive than real judges. This can be observed for each of the three cases. The most intriguing finding here concerns the burglary case. Those who claimed in the survey not to be harsher than a real judge sentenced the burglar to 10 months’ imprisonment, on average, which is, indeed, 11 months less than the sentence by respondents with a more punitive general attitude. At the same time this is still 5 months above the judges’ average (i.e. the real gap). However, these respondents with a relatively lenient general penal attitude think that a real judge would be harsher than they themselves in this case. The result is a negative perceived gap of more than 3 months.

Table 9 Punitiveness claimed in the survey related to judgment based on case files: sentences, perceptions and gaps (differences between ‘harsher’ and ‘not harsher’ significant in both parametric and non-parametric tests at least at P < 0.05, except for differences between the two groups in perception of judges’ sentence)

In summary, among the general public, punitive attitude as expressed earlier in the survey is indicative of relative punitiveness when deciding upon a specific case. It is relative because even those who claim in the survey not to be harsher than judges are still more punitive than real judges, albeit not to the extent of their counterparts with an explicit punitive attitude.

Conclusions and discussion

Providing complete information on criminal court cases to members of the general public does not bridge the gap between the public and the judiciary with respect to preferred sentence severity. The Dutch public is more punitive than judges in the criminal courts. Considering identical case files, the public imposes much harsher sentences than judges do. The general punitive public attitude that emerges from surveys, and was replicated in the current study, does persist when the public is provided with concrete and detailed case files. The hypothesis that lay persons reach the same sentencing decisions as judges do when given the same case file of a specific criminal case was rejected. Our study did show the potential impact of presentation of information on decisions and perceptions of lay persons. Participants who considered a complete case file of a criminal case were much less punitive than participants who based their judgment on a typical newspaper report of that same case. However, the demonstrated effect of information does not suffice to bridge the gap between judges and the public. Connecting the experimental findings to the survey data further showed that, in general, people with a more punitive disposition pass a more severe sentence, when given a specific case, than do people with a less punitive attitude. However, even the latter group remains significantly more punitive than judges. On top of that, the study showed how members of the public misperceive judges’ punitiveness in an unexpected direction: lay persons consistently, and to a considerable extent, overestimated judges’ sentences for the cases presented. The gap, as perceived by the public, is, therefore, smaller than the real gap that exists between judges and the public.

Our study is a multi-method study. We combined experimental and survey methodology. Our integration of three distinct but connected studies, using large samples from both the general public and from the population of professional criminal judges, has given us a unique and focused insight into the depth and nature of the gap between judges and the public. Our approach to the gap, combined with the reality of Dutch criminal procedure, further contributes to existing knowledge because of the external validity of what we have done. Dutch criminal procedure relies to a very large extent on the written case files, which are detailed and cover all relevant aspects of a case in hand. The task required from judges in our study, using realistic case files, was, therefore, very similar to what they do in daily practice. With minimal additional explanation, members of the Dutch public proved to be capable of judging the very same case files. Moreover, because a large number of judges working in the criminal courts cooperated with our study, the gap between judges and the public could be established and analysed in a very direct way under quasi-experimental conditions. There was no need, as in earlier studies on the subject, to have judges pass sentence in concise vignettes or to infer a gap from contrasting the public’s sentencing preferences with formal court statistics pertaining to the types of cases used in the study. Our integrative methodology was further tailored to actually measure (not assume) a punitiveness gap between judges and the public in a methodologically sound way, to relate the gap to the wider attitudinal public perspective, and to examine systematically the effect of information on the extent of the gap. 
Most earlier studies either focused on one or two of these aspects in isolation, or focused primarily on the public’s sentencing preferences whilst comparing it to court practices, or to a single decision by a court in a case that served as the basis for a vignette.

While our study provides strong support for the information hypothesis, it has also become clear that it would be naïve to expect additional or better information for the public to close the gap with judges completely. In our opinion, the gap that remains in our study is simply too large to support such an expectation. True, a sample of the public could be given much more information than we have on details of the criminal cases, on what happens during trial in court, on criminal law and criminal procedure, and about different types of sanctions and their effectiveness. However, given current findings, we are not at all convinced that such information would ever completely bridge the gap between lay persons and judges. It can, in our view, only be bridged by information if, through training, we make experts out of lay persons, who are, then, no longer lay persons.

Apart from charting the actual gap between judges and the public, and the role of information therein, our study further contributes to discussions on the gap by introducing the contrast between the real gap and the gap as perceived by the public. While the former is the actual difference between preferred sentences by judges and by the general public, the latter refers to the difference between what members of the public prefer and what they believe that a real judge would do. The gap, as perceived by the public, can, therefore, be established without reference to actual preferences measured on the part of the judiciary. Our study has shown that the two types of gap are certainly not the same thing. While other studies (many reviewed in Roberts and Hough 2005 and in Roberts et al. 2003) have shown that the public tends systematically to underestimate the severity of actual sentencing practices, the perceived gaps analysed in our study portray the opposite. With regard to the preferred sentences when concrete and detailed information about criminal cases is given, the Dutch public consistently overestimates judges’ (average) sentences in those cases. For each of our three cases, the gap as perceived by the public is thus a smaller one than the real gap between them and judges. This contradicts the results of other studies and deserves further investigation in the future.

This brings us to the question of why our findings are not in line with earlier studies abroad. A first and obvious explanation would focus on differences in the methodologies applied. This explanation includes reference to the nature and extent of the case materials used in the current study and to the integration of several connected studies. It should further be noted that our three cases may be considered special in the sense that each of them represents a more serious example within its own legal classification. We cannot indicate whether other types of cases or less serious cases would yield different results within our methodology. Would the same conclusions be reached if cases were used that were more eligible for sanctions other than mere prison sentences? One may further ask whether findings abroad would be that much different from ours if our study were to be replicated in other jurisdictions. Given earlier findings abroad, it does seem hard to believe that replication would lead to similar results. For instance, using British Crime Survey (BCS) data, Hough and Roberts (1999) showed that large majorities of respondents provided estimates of actual imprisonment rates for rape, mugging and burglary that were much too low (p. 16). Moreover, in the same study, it was shown that public preference when a burglary vignette was given (two sentence description) in the BCS was far less punitive than the decision by the court in the actual case on which that vignette was based. That study also showed that giving respondents a ‘menu’ of sentencing options (including alternatives to imprisonment) strongly reduced public preferences for imprisonment. Our own study provides some support for the latter finding in the simple assault case, where preference for imprisonment dropped as a result of giving sentencing options. Nevertheless, given the different approaches taken by researchers, comparing study findings is comparing chalk with cheese.

If we assume that the contrast between our study and findings from other countries is not due to differences in approach, there must be something special about the Dutch in comparison with other jurisdictions. The punitiveness gap between the Dutch public and Dutch judges may, then, be the result of either one or a combination of the following. First, the Dutch courts may be very lenient, also in international perspective. Second, on the opposite side of the gap, the general public may be especially punitive, also in an international perspective. It is, however, unlikely that leniency of the courts is a valid explanation. This may have been true more than a decade ago, but inspection of trends in prison populations shows that this has changed. From 2000 to 2005, the Dutch prison population increased from 90 per 100,000 inhabitants to 134 per 100,000, an increase of 49% (cf. Aebi and Stadnic 2007). To date, Dutch courts, in comparison with other jurisdictions, can no longer be labelled ‘mild’.Footnote 19 What about an excessively punitive public? The 2005 European Survey of Crime and Safety (van Dijk et al. 2007) gives an indication. Respondents were asked for a sentencing preference for a recidivist burglar, as described in a vignette. In the Netherlands, in the 2005 sweep, 32% preferred a prison sentence. While this is above the international average of 24% preferring prison, we do not consider that enough to merit the conclusion that the Dutch public is excessively punitive.Footnote 20 Moreover, it appears to be in line with the increased punitiveness of Dutch courts.
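The 49% increase in the prison population rate quoted above follows directly from the two rates given (90 and 134 per 100,000 inhabitants). As a quick check, the arithmetic can be sketched as follows; the function name `percent_increase` is a hypothetical helper used only for illustration.

```python
def percent_increase(old, new):
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100


# Dutch prison population per 100,000 inhabitants, as quoted in the text.
rate_2000 = 90
rate_2005 = 134

increase = percent_increase(rate_2000, rate_2005)
print(f"{increase:.1f}%")  # 48.9%, which rounds to the 49% reported
```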

In short, we are very much inclined to take the gap, as we have charted it, at face value. There is a punitiveness gap between judges and the public. More and better information results in a smaller gap, but is insufficient to close it. Does this mean that the legitimacy of our criminal justice system is in jeopardy? No, we think not. Our survey (study II above) showed that the Dutch public, while expressing dissatisfaction with levels of sentencing, is not necessarily negative about the performance of Dutch judges. Moreover, 75% agreed with the statement that, in the eyes of the general public, judges’ sentences will never be harsh enough. This combination of survey findings leads us to the conclusion that the Dutch public is willing to accept a certain gap, even finds the existence of such a gap a normal situation. However, when the courts fail to explain or fail to give reasons for their decisions and effectively convey them to the public, then the gap between the courts’ decisions and the public preferences may become a true threat to the legitimacy of the justice system. Apart from information, the explanation of decisions is a key aspect in the public’s acceptance of the courts’ decisions and of the gap of which they are aware. Indeed, Dutch criminal judges themselves appear to be aware of this fact. In an earlier study (de Keijser et al. 2004) judges claimed to be well able to relay their arguments and reasons to the public attending a case in the courtroom, and to create, there, at least understanding, if not approval, of their decisions. They add, however, that they fail to reach public opinion effectively outside the courtroom, which worries them. On a final note, we have discussed how the Dutch public overestimates judges’ sentences in the cases presented to them. It should be noted that, in so far as the public is willing to accept the existence of a gap, this necessarily refers to the gap as they perceive it. 
The judiciary now faces the challenge of removing the difference between the real gap and the gap perceived by the public. Increasing sentence severity is not likely to resolve the matter. For bridging this ‘gap between gaps’, providing the public with more factual information and better explanation of decisions would be a logical course of action.