Introduction

Slow study progress of students is a long-standing issue in educational policy. In the literature on study success, a host of explanatory variables has been suggested, spanning student factors, educational factors and government policy. Examples of student-related factors are the inability of students to adapt to the academic environment, a lack of ability, discipline or motivation and a general inability to generate sufficient “time-on-task” (Lowe & Cook, 2003; Thomas et al., 1991).

Insufficient time-on-task can result from students’ tendency to procrastinate, which can be defined as postponing academic tasks and engaging in distractions, with the result of being inadequately prepared for examinations. Temporal motivation theory offers a unifying theoretical framework to understand procrastination and to examine how educational interventions may affect procrastination (Steel & König, 2006; Steel, 2011). According to Steel (2011), students’ motivation to perform an educational task depends on four factors. The first two factors are the value that students attach to task completion and the likelihood that they can complete the task. Together, these are summarized in the expected value of task completion. A higher expected value will motivate students to put in more effort. The third factor is time. When students have more opportunities to delay their effort, they will be less motivated to complete their task now. Finally, students’ impulsiveness may increase their tendency to be distracted and delay task completion.
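Temporal motivation theory combines these four factors in a single expression. A common formulation (a sketch following Steel & König, 2006; the symbols are the ones usually presented in that literature, not notation taken from this paper) is

$$ \text{Motivation} = \frac{E \times V}{1 + \Gamma \times D}, $$

where E is the expectancy of completing the task, V the value attached to completion, Γ impulsiveness, and D the delay before the task must be completed. Motivation rises with expectancy and value and falls with impulsiveness and the available room for delay.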

This paper examines the effect of academic dismissal (AD) policies on students’ study efforts. While a lack of study progress is often attributed to procrastination, in empirical research it is hard to accurately measure procrastination and to unambiguously show that improvements in study progress result from reductions in procrastination. With this caveat in mind, we argue that procrastination is a potential, maybe even plausible mechanism through which AD policies may affect students’ efforts. AD policies specify a minimum completion of coursework in a specific period for students to be allowed to continue their study. Temporal motivation theory predicts that AD policies can help to reduce procrastination, by constraining students’ opportunities to delay their study effort. Yet the design of such policies matters. Setting a minimum level holds the risk that students will be nudged towards this level as being the institutionally accepted standard performance level and will not strive for the maximum performance. For some students, the minimum level may thus become a target level. If the minimum level is set too low, the effect may be to increase rather than to reduce procrastination (De Koning et al., 2014). This suggests that for AD policies to reduce procrastination, the level should be set sufficiently high. Temporal motivation theory also suggests that students who do not value their study or have a low probability of passing their coursework, will make more use of opportunities to delay task completion. This suggests that academically weaker students may especially benefit from interventions that reduce their room for delay, such as strict AD policies.

This paper describes the sensitivity of students’ academic performance to a large change in the minimum threshold level of an AD policy. We use a dataset from a large Economics & Business program at a Dutch university over the period from 2009 to 2019. At this institution, an AD policy is in place in the first year of the bachelor program. An advantage of our research setting is that during the sample period, the educational system of the program did not experience major changes, except for one change in the threshold for academic dismissal. In 2012, the threshold was raised from 40 credits to 60 credits (out of 60 credits). This means that during the sample period two different AD threshold regimes were in place (40 credits and 60 credits). All other features of the educational system, such as the academic calendar and the resit policy, remained unchanged. In previous studies examining the relationship between AD policies and first-year study success, variations in AD policies and threshold levels are entangled with other changes to the educational system, such as the resit policy (Schmidt et al., 2021). When educational innovation comprises multiple simultaneous interventions, this may hamper the attribution of their effectiveness. The setting of the present study is unique in the sense that we study an isolated change in the AD policy, not clouded by accompanying major changes to the curriculum or the educational system of the program. The data will be used to address the following research questions.

First, is there a relationship between variations in the threshold level, academic ability and academic performance? Temporal motivation theory suggests that academically weaker students with a low expected value of task completion are more susceptible to procrastination. A stricter AD policy may thus be more effective among this subset of the student population. The empirical evidence on this is scarce, yet the policy implication is important. AD policies that increase the study efforts of weaker students may reduce inequality in study progress.

Second, is there a relationship between variations in the threshold level and other measures of student behavior suggestive of procrastination? While the relationship between the threshold level and academic performance forms the core of this paper, our data yield additional insights into the timing of students’ academic activities during the first bachelor year. More specifically, we use data on students’ performance in regular exams and resits to show how their participation in different exam opportunities changes following the increase in the threshold level.

Finally, does the threshold level used in AD policies act as a target level for some students? While insights from the field of behavioral economics suggest that the presence of a numerical threshold level may nudge students’ performance towards this level, to our knowledge no empirical evidence has been presented on this tendency in the context of AD policies.

This paper makes use of a large historical dataset. As the past change in the threshold level has been implemented for the full student population, a (quasi-)experimental design was not possible. This makes it difficult to establish a causal relationship between the change in the threshold level and academic performance. We therefore proceed using a simple pre-post analysis, comparing measures of study success and study activity before and after the intervention. This descriptive approach may suggest possible correlations between students’ study success and the change in AD policy, without proving causality.

Our analysis includes confounding variables which are known to be related to study success, such as gender, age and ethnicity. We include the confounding variables in a multiple regression framework, not to establish a causal effect, but to check whether possible correlations between study success and the AD intervention remain intact once we control for confounding variables. Finally, we present evidence on the timing of students’ academic activities and on the nudging effect of the threshold level that lends additional plausibility to the relationship between variations in the threshold level and academic performance and supports an interpretation based on procrastination reduction.

The organization of this paper is as follows. The “Literature” section briefly reviews the literature on procrastination and AD policies in higher education. The “Setting, data and method” section describes the setting, data and empirical approach. Our findings are reported in The “Results” section. The “Conclusions” section concludes.

Literature

The prevalence of procrastination in higher education has been widely researched. Early studies report widespread procrastination among college students. Ziesat Jr et al. (1978) and Ellis and Knaus (1977) find that respectively 30–60% and over 70% of students engage in procrastination. Time has not solved the problem. More recent studies put the share of procrastinators among students above 80% (O’Brien, 2002; Steel, 2007). With the advent of mobile devices and social media in the past decade, distractions have multiplied. Severe procrastination has been found in over 30% of students (Day et al., 2000).

Most research on procrastination focuses on students’ personal characteristics, such as self-control, conscientiousness, impulsiveness, impatience and time preferences (Ferrari et al., 1995; Steel, 2007; Patrzek et al., 2012). The act of studying involves an intertemporal trade-off between the immediate cost of study effort and the longer term benefits of educational attainment. Impatient students prioritize their well-being in the present and postpone costly study effort. This tendency to procrastinate may occur both in the case of high yet time-consistent exponential discount factors and in the case of present-biased preferences (Non & Tempelaar, 2016). Students with present-biased preferences lack self-control, are more impulsive and are less able to resist temptation (DellaVigna, 2009).

Linking procrastination to academic performance requires a measure of procrastination. Many studies use the Procrastination Assessment Scale–Students (PASS) introduced by Solomon and Rothblum (1984). The correlation between self-reported PASS scores and actual procrastination is, however, rather low (Solomon & Rothblum, 1984; Tice & Baumeister, 1997). As an alternative to PASS scores, some studies measure the amount of time it takes students to start or complete specific educational tasks, such as completing assignments or homework (Rotenstein et al., 2009).

The literature reviews in Rotenstein et al. (2009) and De Paola and Scoppa (2015) report mixed evidence on the link between procrastination and academic performance: some studies find a negative relationship while others fail to find a significant correlation. Controlling for student quality, Rotenstein et al. (2009) find that task procrastination is associated with lower academic performance. De Paola and Scoppa (2015) measure procrastination by the time that students need to finalize their university enrolment procedure and find that this measure is a strong predictor of academic achievement. Diver and Martinez (2015) and Kim and Seo (2015) find that procrastination can predict dropout. Survey evidence also indicates that procrastination may negatively affect student well-being (Tice & Baumeister, 1997).

A subset of the literature examines whether procrastination depends on students’ background characteristics. The evidence of a gender effect is mixed. Milgram et al. (1995), Senecal et al. (1995) and Özer et al. (2009) find that male students have a higher inclination to procrastinate than female students. Stegers-Jager et al. (2020) find that raising first-year performance standards to reduce procrastination has a stronger effect on male than on female students. However, Ferrari (2001) and Kachgal et al. (2001) do not find a gender effect. There is some evidence for an effect of age. Kim and Seo (2015) and Van Eerde (2003) find that younger students suffer more from procrastination than older students. Baars et al. (2021) conclude that subgroups based on gender and ethnic background are equally susceptible to procrastination and benefit equally from measures to reduce procrastination. Using descriptive statistics, they also show that students with the lowest pre-university grades benefit most from changes in examination practices to reduce procrastination. The present study builds on this finding.

Educational interventions that address students’ self-regulatory deficiencies can help to reduce procrastination (Damgaard and Nielsen, 2018; DellaVigna, 2009). Jansen (2004), Van der Hulst and Jansen (2002) and Van den Berg and Hofman (2005) find that curriculum design influences study success. For example, offering more courses in an academic year or more courses in parallel tends to reduce study progress. Schmidt et al. (2021) and Ostermaier et al. (2013) focus on examination practices and argue that allowing students to do more resits facilitates procrastination. An AD policy is another example of such an intervention, as it sets a deadline before which students must have achieved a minimum performance level. This level is usually defined in terms of GPA or number of credits. If students fall below the threshold, they can be removed from the program or institution. AD policies are meant to ensure that students make adequate academic progress and are a common intervention in higher education. They are typically introduced early in the bachelor program and can be viewed as a form of selection “after the gate” (Sneyers and De Witte, 2018). AD policies may include a period of academic probation, which is a precursor to actual suspension or dismissal.

The literature on the effectiveness of AD policies is small but yields several insights. First, the effect on student retention and study duration is sensitive to the design of AD policies. For example, when AD policies include a period of temporary suspension that allows for a return of suspended students into the program, as in the Canadian setting in Lindo et al. (2010), the effect on study duration will differ from policies that are less forgiving. In the Dutch AD system, students who fail to meet the minimum level at the end of the first bachelor year are dismissed from the program and are not allowed to re-register for that program for the next three years. This effectively closes the option to re-enter the program. Sneyers and De Witte (2018) indeed find that the effects of AD policies differ across countries, with a strong effect on time to graduation in The Netherlands (Arnold, 2015), but no effect in Canada and the USA (Lindo et al., 2010; Fletcher & Tokmouline, 2017).

Second, the effectiveness of selection after the gate will be influenced by the presence of selection at the gate. When programs can select students based on motivation and ability, the student body will be more able and motivated to perform. In terms of temporal motivation theory, they will have a higher expected value of task completion and be less susceptible to procrastination. Selection at the gate can also explain the difference in the effectiveness of AD policies between the USA, where selection at the gate is common, and The Netherlands, where it is rare. It can also explain why some studies fail to find an effect of AD policies on the study progress of Dutch medicine students; medicine is one of the few bachelor programs in the Netherlands that select at the gate (Eijsvogels et al., 2015; Stegers-Jager et al., 2011). If selection at the gate is done well, selection after the gate will be less effective.

A third insight from the literature is that variations in the minimum performance level may affect the empirical results. Following the introduction of an AD policy with a threshold level at 40 (out of 60) credits, De Koning et al. (2014) found no positive effect on study progress. They concluded that the low threshold level may have had the adverse effect that students just try to meet the required minimum level and do not aim for the best possible performance. Similar findings are reported in Steel (2011) and Eijsvogels et al. (2015). In contrast, studies examining the effect of a high threshold level generally do find an effect on study progress (Schmidt et al., 2021). Kickert et al. (2020) find a significantly higher one-year study progress after an increase of the AD threshold and conclude that students adapt their study effort to the assessment policy. Summarizing the literature, they conclude that studies comparing no threshold with a low threshold generally find no effect on study success, while studies comparing a high with a low threshold find a positive effect.

A fourth observation relates to the research methodology. The severity of the sanction in AD policies usually precludes the use of randomized experimental designs. Most papers are therefore cohort studies using pre-post analysis or studies using quasi-experimental designs, such as regression discontinuity (RD) design. In the case of cohort studies, care must be taken to control for confounding factors, as a stricter AD policy may alter the characteristics of the student population by discouraging some students from enrolling. RD designs examine academic outcomes just around the AD threshold (Lindo et al., 2010; Fletcher & Tokmouline, 2017; Cornelisz et al., 2020). They typically find little or no effect of AD policies on academic outcomes just around the threshold. However, a non-trivial complication in the application of RD design to AD policies is the occurrence of non-random sorting around the cutoff point. Given the severe repercussions to students, AD policies will be communicated early and clearly to students. Most institutions using AD policies also have an early-warning system in place. The threshold level will thus be well-known. Students who want to continue their studies will make every effort to clear this hurdle. This could result in a discontinuity in the student distribution around the AD threshold, with a large group just above the threshold. The evidence presented below indeed suggests that students may clump together above the AD threshold.

In summary, the evidence that AD policies with a high threshold level increase short-run academic performance is strong. The evidence on the longer term effects on graduation rates is mixed. The literature on how increases in the threshold level affect different subgroups in the student population is scarce. The present study contributes to the latter literature, by describing how the study efforts of academically weak students have changed following the introduction of a stricter AD regime.

Setting, data and method

Setting

This study is conducted at a school of economics of a research university in the Netherlands. We use data from the 2009–2018 cohorts of the main Dutch-language bachelor program in Economics & Business. The nominal duration of the program is three years. The first two bachelor years consist of obligatory core and support courses, offering students a general background in economics and business before they specialize in a subfield in their third and final year. The first-year curriculum consists of 10 courses totaling 60 credits and is organized in five 8-week modules of twelve credits each. Most courses are delivered using a combination of large-scale plenary lectures and small-scale tutorials. Grading follows the Dutch system, in which grades are determined on a 1–10 scale. A grade of 8 or higher corresponds to an “A”, a grade of 7 to a “B”. The standard-setting strategy mixes compensatory and conjunctive elements. While the cutoff score for passing is 5.5, the examination rules allow for compensation, to the effect that three “failed” grades (between 4.5 and 5.5) can be compensated by higher grades for related courses. Resits are limited to a maximum of three out of ten courses and are scheduled during the summer break.

For the full sample period, an AD policy has been in place for first-year students. Since 1993, Dutch law allows universities to give first-year students the so-called binding study advice (BSA) at the end of the academic year. The BSA succeeded the so-called consilium abeundi, which was an urgent though non-binding advice to students to discontinue their study, based on their academic performance in the first year. As few students followed this advice, the consilium abeundi was considered ineffective. The introduction of the BSA took off quickly in the universities of applied sciences. Implementation at the research universities took more time. Currently, more than 80% of programs in Dutch higher education use a BSA. Resistance from student organizations to the BSA was and remains strong; partly as a result, the current government is considering a relaxation of the BSA.

The purpose of the BSA is to prevent students spending too much time pursuing a study program for which they do not have the skills, talent or motivation and to stimulate them to switch early on to a more suitable study. This makes the BSA an instrument to strengthen the orienting, selective and referential function of the first bachelor year. While the focus of this paper is on the selective function of the BSA, the referential function has been an important consideration in the introduction of the BSA. By putting students on the right track in time, the cost of late dropout can be reduced. According to education managers, the BSA provides students with a strong incentive to study (Onderwijsinspectie, 2010).

A student receiving a negative BSA is not allowed to re-register for the program in the next three academic years. The performance threshold below which students will receive a negative BSA can be set at program level. Most Dutch programs use a threshold level between 30 and 45 credits (out of a maximum of 60 credits per year). Students obtaining the maximum of 60 credits will receive a positive BSA. Students obtaining a number of credits between the threshold level and the maximum receive a conditional positive BSA. They are required to complete the first-year curriculum in their second year. Students can avoid a negative BSA by formally withdrawing from the program before February 1st. Below we label this as early dropout. In the setting of this study, one change was made to the AD threshold. In 2012, the BSA-norm was raised from 40 to 60 credits as part of a university-wide reform. As a consequence, the conditional positive BSA was abolished. The school’s AD policy is communicated to students at the start of the academic year in written communication and in the plenary opening sessions. More importantly, the AD policy is discussed in small-scale mentoring groups with obligatory attendance. This implies that we may safely assume that students are aware of the threshold level from the start of the program.
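The BSA decision rule described above can be summarized in a short function. This is an illustrative sketch of the rule as stated in the text, not official policy code; the function and label names are our own.

```python
def bsa_outcome(credits: int, norm: int = 60, withdrew_early: bool = False) -> str:
    """Illustrative BSA decision rule (sketch of the rule described in the text)."""
    if withdrew_early:
        return "early dropout"        # formal withdrawal before February 1st
    if credits < norm:
        return "negative"             # dismissal; no re-registration for three years
    if credits == 60:
        return "positive"
    return "conditional positive"     # only reachable when norm < 60 (the 40-credit regime)
```

Under the 40-credit norm a student with 44 credits receives a conditional positive BSA; under the 60-credit norm the same student is dismissed, which is exactly the margin this paper studies.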

Data

The university’s education research database contains data on students’ grades and background characteristics. The data have been anonymized before use, in line with privacy regulations. We use data on 5920 students from ten cohorts from 2009 to 2018. Various explanatory variables are known to be related to study success in economics. The most important of these is academic performance in preparatory education (Ballard & Johnson, 2004; Johnson & Kuennen, 2006; Lagerlöf & Seltzer, 2009). Our dataset contains high school GPA (denoted SchoolGPA), although the coverage is incomplete. For 13% of the students in our sample, high school GPA is missing. A related variable is the mathematical orientation of pre-university education. In the Dutch school system, pupils need to choose among four subject clusters (tracks), each emphasizing a different field of study: (1) Science & Technology, (2) Science & Health, (3) Economics & Society and (4) Culture & Society. While the Economics & Society track is meant to prepare pupils for studying economics at university, in practice students originating from a Science & Technology track do much better in economics (Arnold & Straten, 2012). This is related to the way in which math skills are embedded in the high school tracks. The Science & Technology track contains more differential calculus and thus better prepares pupils for a study in economics. Data on track choice are part of our dataset. We construct the dummy variable ScienceTrack which takes on the value one if students originate from the Science & Technology track and zero otherwise.

A large literature documents the evolution of the gender gap in economics. Early studies found evidence of a negative gender gap, in which females underperform compared to males (Ballard & Johnson, 2004). The gender gap has, however, been closed over time (Johnson et al., 2014). We nevertheless include gender as a potential confounding factor. The dummy variable Female takes on the value one for female and zero for male students. We also include information on students’ ethnic background. A general finding in the literature is that minority students experience slower study progress than majority students (Severiens & Wolff, 2008). In the Dutch context, ethnic minority students mainly come from Surinam, the Netherlands Antilles, Turkey and Morocco. The dummy variable EthnicMinority takes on the value one if a student is either born or has at least one parent born in one of these countries and zero otherwise. Following much of the literature on study success, an additional explanatory variable is students’ age (denoted Age). The sign of the relationship between age and study success is ambiguous. While a higher age at enrolment could indicate increased maturity and a correspondingly more serious attitude towards studying, it could also result from procrastination and study delay at high school.

Table 1 provides descriptive statistics on students’ background characteristics, calculated for two subperiods defined by the AD regime. During the subperiod 2009–2011 the BSA-norm of 40 credits applied; during the subperiod 2012–2018 the BSA-norm of 60 credits applied. The final column of Table 1 reports the p-values for the F-test or χ² test for differences across the two regimes. The average age of freshmen has declined significantly over time, from 19.0 to 18.7 years. The share of female students has hovered between 25% and 27% and does not differ significantly across subperiods. The statistics on the next three variables may indicate a selection effect due to a stricter AD policy. Over time, the share of minority students has dropped from 27% to 21% and the share of students originating from a science track has increased from 17% to 24%. The share of students with the highest school grades (SchoolGPA = 8) has increased from 12.7% to 13.7%, while the share of students in the lowest GPA category (SchoolGPA = 6) increased from 26.7% to 29.5%. Based on the data in Table 1, the possibility that the tightening of the AD policy in 2012 has affected the student composition cannot be excluded. This stresses the importance of including these variables in the empirical analysis below.
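Subperiod comparisons of this kind rely on standard tests. A minimal sketch using SciPy’s chi-square test of independence on an invented 2×2 table of AD regime by minority status (the counts are illustrative only, not the study’s data):

```python
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = AD regime, columns = minority status.
# Counts are made up for illustration; they are not taken from Table 1.
table = [[1200, 450],    # 40-credit regime: non-minority, minority
         [3300, 880]]    # 60-credit regime: non-minority, minority
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```

A small p-value would indicate that the composition of the student body differs significantly across the two regimes, which is the kind of evidence reported in the final column of Table 1.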

Table 1 Background characteristics

Table 1 finally shows data on first-year student retention across AD regimes. Overall, student dropout has not significantly increased. After the 2012 increase in the AD threshold, the combination of early and late dropout increased marginally (from 34.2% to 34.3%). However, the timing of the dropout did change, with more students opting for early dropout (from 12.0% to 15.9%). The stricter AD policy may have induced students to stop earlier, instead of vainly hoping for a positive BSA. It can be argued that this shift from late to early dropout is beneficial for all concerned. It enables the student to take the time to reflect on the next study choice and in the meantime earn a living with a temporary job. Educational institutions will benefit from spending fewer scarce resources on students who will drop out at the end of the year.

We define the following variables to measure study progress. Dropout is a dummy variable that takes on the value one for students that drop out early or receive a negative BSA at the end of the academic year and zero otherwise. CreditsB1 measures the total number of credits that students have earned in their first bachelor year. This measure includes credits gained during the resit period and credits for compensated courses. CreditsB1 is the measure for which the AD threshold applies. CreditsB1Reg measures the amount of credits gained during the regular teaching periods and excludes the results from the resits. We hypothesize that students that are less susceptible to procrastination will not postpone their efforts unnecessarily and will tend to earn more credits using the regular exams. NoResits is a dummy variable that takes on the value one for students that did not make use of the resit period to pass their first bachelor year and zero otherwise; #Resits measures the number of resits that a student has taken in the first year. Finally, #NoShows measures the number of times that a student has not turned up at a regular first-year exam. The latter variables give an indication of the timing of students’ study effort during the academic year. Students that are less susceptible to procrastination will tend to turn up more often at regular exams and make less use of resit opportunities.
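The construction of these study-progress measures can be sketched in a few lines of pandas. The raw column names (credits_total, credits_regular, resits_taken, early_dropout, negative_bsa) are hypothetical stand-ins for the actual database fields, and the four example rows are invented:

```python
import pandas as pd

# Four invented student records for illustration.
df = pd.DataFrame({
    "credits_total":   [60, 40, 12, 52],   # incl. resits and compensated courses
    "credits_regular": [60, 28, 12, 44],   # regular exam periods only
    "resits_taken":    [0, 3, 0, 2],
    "early_dropout":   [0, 0, 1, 0],       # formal withdrawal before February 1st
    "negative_bsa":    [0, 0, 0, 1],       # dismissed at year end (60-credit norm)
})

# Dropout: early dropout OR a negative BSA at the end of the year.
df["Dropout"] = ((df["early_dropout"] == 1) | (df["negative_bsa"] == 1)).astype(int)
df["CreditsB1"] = df["credits_total"]          # measure to which the AD threshold applies
df["CreditsB1Reg"] = df["credits_regular"]     # excludes resit results
df["NoResits"] = (df["resits_taken"] == 0).astype(int)
print(df[["Dropout", "CreditsB1", "CreditsB1Reg", "NoResits"]])
```

Students who pass the year on regular exams alone show CreditsB1 equal to CreditsB1Reg and NoResits equal to one, which is the behavioral signature of low procrastination used below.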

The gist of this paper’s argument can be seen in Figs. 1, 2 and 3, which plot the distribution of CreditsB1 by high school GPA and by threshold regime. For the lowest GPA category (Fig. 1), the share of students earning the maximum number of credits in the 60-credits regime is more than double the share in the 40-credits regime. This effect is much smaller in the middle GPA category (Fig. 2) and practically absent in the highest GPA category (Fig. 3). The latter graph suggests that good students tend to do well irrespective of the AD regime in place. In other words, this group seems largely immune to changes in AD policy. The contrast with the strong increase in the share of students that earn 60 credits in the lowest GPA category is stark. The graphs also suggest that for some students the minimum threshold level may have become a target level. This nudging effect occurs predominantly among weaker students. For the 40-credits BSA-norm, Fig. 1 shows a hump in the distribution at 40 and 44 credits, just above the threshold. In Figs. 2 and 3, the 40-credits hump is much smaller or almost indiscernible.

Fig. 1

Credit distributions for students with high school GPA= 6 (excluding early dropout)

Fig. 2

Credit distributions for students with high school GPA= 7 (excluding early dropout)

Fig. 3

Credit distributions for students with high school GPA= 8 (excluding early dropout)

Figures 4 and 5 further illustrate how a minimum threshold level can become a target level and how students may use resit opportunities to procrastinate. They document the presence of students that achieve the minimum threshold level at the latest possible time. These students could be labelled as “just-in-time minimizers”. For the two AD regimes, Fig. 4 plots the number of credits that students earn in the resit period against the number of credits earned during the regular exams. The edges of the distributions in Fig. 4 are close to zero. When students earn few credits before the resits, they stand little chance to reach the threshold level. For most of them, earning credits in the resit period will be pointless. When students have earned close to 60 credits before the resits, they have little need for resits. More interesting are the data points in the middle, which show how the final sprint towards the threshold level changes across the two AD regimes. The maximum of the solid black line, representing the BSA-norm of 40 credits, shows that students earning 28 credits before the resit period earned on average just above 12 credits during the resit period. This is exactly the amount they needed to reach the minimum threshold level of 40 credits. Beyond that point, the black line drops. Although students earning more than 28 credits before the resit period could be expected to perform similarly or even better during the resits than students earning 28 credits, they do not. An interpretation is that they aim for the minimum threshold level and need less than 12 credits to reach it. The broad striped line shows what happens to the resit performance when the AD threshold is raised to 60 credits. For students having earned less than 32 credits before the resits, the performance drops, as their chances to avoid dismissal are slim. For students above 32 credits, the resit performance improves markedly. Figure 5 shows similar data for students in the lowest GPA category. 
Here the gap between the solid and the broad striped lines is even more pronounced. Among students in the lowest GPA category, just-in-time minimizers are a non-trivial group: students earning 28–36 credits before resits make up 16% of students. Among students in the highest GPA category, this share is just 4%. Figures 4 and 5 strongly suggest that the academic performance of just-in-time minimizers can be influenced by variations in the AD threshold.
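The conditional averages behind Figs. 4 and 5 can be computed with a short script. The sketch below is illustrative only: the column names `regime`, `credits_regular` and `credits_resit` are hypothetical, not the variable names used in the study.

```python
import pandas as pd

def resit_profile(df: pd.DataFrame) -> pd.DataFrame:
    """Average credits earned in the resit period, conditional on credits
    earned at the regular exams, with one column per AD regime."""
    return (df.groupby(["regime", "credits_regular"])["credits_resit"]
              .mean()
              .unstack("regime"))
```

Plotting each column of the result against its index reproduces the lines in Fig. 4; restricting the input to the lowest GPA category yields Fig. 5.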

Fig. 4 The final sprint towards the threshold (excluding early dropout)

Fig. 5 The final sprint towards the threshold, high school GPA= 6 (excluding early dropout)

Method

Our empirical analysis consists of two parts. In the first part, we show how first-year study progress has developed following the increase in the AD threshold. Using a simple pre-post analysis, we test whether there is a significant difference in Dropout and CreditsB1 across the two AD regimes. Given our focus on academic ability and the stark differences across Figs. 1, 2 and 3, we do this for each of the three categories of SchoolGPA separately. In the next step, we include the confounding variables in the analysis and test for significant pre-post differences in study progress across various subgroups. We finally control for the influence of the confounding variables on the initial pre-post analysis by estimating the following regression equation:

$$ y_{i,t}=\beta_{0}+\beta_{1} BSA60_{i,t} +\sum\limits_{k=1}^{K}\gamma_{k}x_{k,i,t}+\varepsilon_{i,t}. $$
(1)

In Eq. (1), y_{i,t} measures the study progress of student i in cohort t. The dummy variable BSA60 equals one during the period that the 60-credits BSA-norm was in place and zero when the 40-credits BSA-norm applied. The K confounding variables are denoted x_{k,i,t}. The coefficient β1 is an estimate of the pre-post difference in study progress once we control for variations in the confounding variables. It does not represent a causal effect. We estimate OLS regressions for Eq. (1) for each of the three categories of SchoolGPA separately (see footnote 1).
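Estimating Eq. (1) by OLS is straightforward. The sketch below is a minimal illustration under stated assumptions, not the authors' actual code: the outcome, the BSA60 dummy and the confounders are passed as plain numeric arrays for one SchoolGPA category at a time.

```python
import numpy as np

def pre_post_beta1(y, bsa60, confounders):
    """OLS estimate of beta_1 in Eq. (1):
    y = b0 + b1 * BSA60 + sum_k g_k * x_k + e.

    y           : (n,) outcome, e.g. Dropout or CreditsB1
    bsa60       : (n,) dummy, 1 under the 60-credit norm, 0 under the 40-credit norm
    confounders : (n, K) matrix of confounding variables
    """
    # Design matrix: intercept, regime dummy, confounders
    X = np.column_stack([np.ones(len(y)), bsa60, confounders])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]  # coefficient on BSA60
```

Running this separately on the three SchoolGPA subsamples gives the β1 estimates reported in Tables 4 and 7.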

The second part of our empirical analysis focuses on students’ activity during the academic year. Instead of just looking at the outcomes at the end of the year, we now consider the timing of students’ study efforts during the year. To this end, we examine the variables NoResits, #NoShows, #Resits and CreditsB1Reg. The second part follows the structure of the first part. We first show how activity during the year has changed following the increase in the AD threshold. Next, we test for significant pre-post differences in activity across various subgroups of the confounding variables. We finally estimate an equation similar to Eq. (1) to control for the influence of the confounding variables on the pre-post analysis. We do this, again, for each of the three categories of SchoolGPA separately (see footnote 2).
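The pre-post comparisons in both parts test subgroup means across the two AD regimes. The paper does not specify the exact test statistic, so the sketch below assumes a standard Welch two-sample t statistic, which does not require equal variances across regimes.

```python
import numpy as np

def welch_t(pre, post):
    """Welch's t statistic for a pre-post difference in means
    (e.g. Dropout or CreditsB1 before vs. after the threshold change)."""
    pre, post = np.asarray(pre, float), np.asarray(post, float)
    m1, m2 = pre.mean(), post.mean()
    v1, v2 = pre.var(ddof=1), post.var(ddof=1)  # sample variances
    se = np.sqrt(v1 / len(pre) + v2 / len(post))
    return (m2 - m1) / se
```

A large absolute t value indicates a significant pre-post difference; the corresponding p-value follows from the Welch-Satterthwaite degrees of freedom.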

Results

First-year study progress

Table 2 presents statistics on first-year study progress by academic ability and AD regime. Starting with Dropout, the first column shows large differences in dropout rates across the different categories for SchoolGPA. For the lowest category (SchoolGPA= 6), dropout rates are almost ten times as large as for the highest category (SchoolGPA= 8). More noteworthy in the context of this study is that the dropout rates hardly change following the increase in the AD threshold in 2012. For the two lowest GPA categories, there is no significant difference between the dropout rates before and after the change in the AD regime. The increase in the highest category (from 0.040 to 0.076) is significant at just a 10% level. This suggests that the increase in the AD threshold level to 60 credits did not lead to a significantly higher first-year dropout.

Table 2 First-year study progress and academic ability

The remaining columns in Table 2 report the outcomes for CreditsB1. The results are reported for three subsamples: (1) including all students that enrolled at the start of the academic year; (2) excluding students that dropped out early; (3) excluding students that dropped out or were dismissed. Excluding dropouts may increase the difference in CreditsB1 across AD regimes, as students that drop out typically earn few credits.

The second column shows the outcomes for the full sample. The differences in CreditsB1 range from 5.49 (SchoolGPA= 6) to 0.74 (SchoolGPA= 8). Except in the case of SchoolGPA equalling 8, these differences are significant at a 1% level. The remaining columns show that, as expected, the exclusion of dropouts increases the pre-post differences in CreditsB1. Excluding all dropouts, the differences in CreditsB1 range from 10.02 (SchoolGPA= 6) to 2.53 (SchoolGPA= 8). All these differences are significant at a 1% level. These statistics thus confirm the visual impression from Figs. 1, 2 and 3 that the change in study progress following an increase in the AD threshold is much more pronounced among the academically weaker students.

Table 3 reports statistics for the subgroups of our confounding variables. For each subgroup we report averages by AD regime and test for the significance of differences across AD regimes. Starting with Age, the second column shows that dropout rates are much higher for older students. This suggests that a higher age at enrolment may result from study delay at school. Older students also earn fewer credits in their first year. This holds across AD regimes and for samples including and excluding dropouts. Regarding gender, the differences between female and male students are less pronounced. Both dropout rates and the number of credits earned are very close. This confirms recent evidence that the negative gender gap in economics is a thing of the past. Comparing study progress across AD regimes, the data show a larger change in Dropout and CreditsB1 among females following the increase in the AD threshold. Table 3 also shows that students from a non-western minority background have higher dropout rates and earn fewer credits in their first year. This again holds across AD regimes and for samples including and excluding dropouts. Finally, track choice at school also matters: students originating from a Science & Technology track have a significantly lower chance of dropping out and earn more credits in their first year.

Table 3 First-year study progress and confounding variables

With regard to the change in AD regime, Table 3 shows remarkably similar patterns across most subgroups. For Dropout, the pre-post differences are insignificant in most cases. The exceptions are older students and females. Older students have a slightly higher dropout rate following the increase in the threshold. The difference is significant at a 10% level. In contrast, female students show a lower dropout rate, which is also significant at a 10% level. For CreditsB1, the pre-post differences are significant at a 1% level for almost all subgroups. Here the exceptions are students originating from a Science & Technology track in the full sample and in the sample excluding early dropout.

The data in Table 1 show that we cannot exclude the possibility that the increase of the AD threshold in 2012 has changed the student composition of the program. If, for example, the stricter AD policy were to attract more students originating from a Science & Technology track or discourage students from other tracks, this could (partially) explain the change in study progress that we observe in Table 2. To control for the possible effects of variation in the confounding variables, Table 4 reports our estimates of the coefficient β1 in Eq. (1). This coefficient estimate is not intended to show a causal effect, but rather serves as our estimate of the pre-post difference in study progress once we control for variations in the confounding variables.

Table 4 First-year study progress and academic ability controlling for confounding variables

The results in Table 4 confirm, by and large, those in Table 2. For Dropout, β1 is insignificant across all categories of SchoolGPA. For CreditsB1, β1 is significant at at least a 5% level in almost all cases, the exception being the full sample for students with SchoolGPA equal to 8. In general, the β1 estimates are somewhat lower than the pre-post differences in Table 2. For example, for the sample excluding all dropouts, the pre-post difference of 10.02 for students in the lowest GPA category is reduced to 9.25 in Table 4. This implies that only a small part of the pre-post differences in study progress can be explained by variation in the confounding variables.

We conclude from this section that following the increase of the AD threshold to 60 credits, the number of credits that the weakest students earn in their first bachelor year increases significantly. For weak students that proceed to their second year, the number of credits earned has increased by 9 credits. This is a sizable change. An important finding is also that this change has not coincided with a significant increase in first-year dropout. This suggests that stricter AD policies may improve study progress without the collateral damage of higher dropout.

Activity during the academic year

We next consider the timing of students’ study efforts during the academic year, by doing a pre-post analysis for the variables NoResits, #NoShows, #Resits and CreditsB1Reg. We are interested in how students that have survived their first bachelor year changed the timing of their study efforts following the increase in the AD threshold. We therefore exclude students that dropped out during the first year.

Table 5 shows how activity during the year has changed following the increase in the AD threshold. As in Table 2, Table 5 presents the statistics by academic ability and AD regime. Starting with NoResits, the first column shows substantial differences in the share of students that do not need resits across the different categories for SchoolGPA. For the lowest category (SchoolGPA= 6), the share is 70% before the change in AD regime, compared to almost 90% for the highest category (SchoolGPA= 8). For all GPA categories this share increases following the change in AD policy. The change is highest (17 percentage points) in the lowest category and smallest (6 percentage points) in the highest category. The decreased use of resits is also evident from the third column, which shows that the average number of resits that students need to finish their first year drops across all GPA categories. Again, the change is largest among the weakest students, where the average number of resits drops from 1.83 (out of 10 courses) to 1.48. A related statistic is the average number of no-shows (#NoShows), which measures the number of times that a student has not turned up at a regular exam for a first-year course. A student that does not show up at a regular exam and still wants to proceed to the second year has to make use of the resit opportunity. The number of no-shows also declined significantly following the change in the AD policy, by 0.80 for the lowest GPA category and by 0.33 for the highest GPA category. An interpretation of these statistics is that the higher bar may have changed the study planning of students, especially the academically weaker ones. Under a stricter AD policy, some students may have concluded that skipping regular exam opportunities in favor of resits is too risky a strategy. As a result of bringing their study efforts forward in time, students earn more credits out of their regular exam opportunities after the increase in the AD threshold.
The change in CreditsB1Reg ranges from 9.5 credits for the lowest GPA category to 3.3 credits for the highest GPA category. All pre-post differences in our measures for academic activity during the year are significant at a 1% level.

Table 5 Activity during the year and academic ability

Table 6 reports averages of NoResits, #NoShows, #Resits and CreditsB1Reg for the subgroups of our confounding variables. The table presents averages by AD regime and tests for the significance of differences across AD regimes. In general, the results fit with those in Table 3: subgroups that show lower study progress in Table 3 have more no-shows, make more frequent use of resits and earn fewer credits at regular exam opportunities in Table 6. Older students tend to postpone their study efforts more than younger students, as evidenced by the lower values for NoResits and CreditsB1Reg and the higher values for #NoShows and #Resits. Similar to Table 3, the differences between female and male students are less pronounced. Students from a non-western minority background have more no-shows, need more resits and earn fewer credits at regular exam opportunities. The reverse holds for students originating from a Science & Technology track, although the differences with students from other tracks are small. As in Table 3, most pre-post differences are significant at a 1% level.

Table 6 Activity during the year and confounding variables

Finally, to control for the possible effects of variation in the confounding variables, Table 7 reports our estimates of the coefficient β1 in Eq. (1). Similar to Table 4, Table 7 shows that the pre-post differences in our measures for study activity during the academic year cannot be fully explained by variation in the confounding variables. For example, the pre-post difference in CreditsB1Reg of 9.55 for students in the lowest GPA category is reduced to 8.62 in Table 7. All β1 coefficients in Table 7 are significant at at least a 5% level. Overall, the evidence that students’ timing of their study efforts has changed following the increase in the AD threshold is strong.

Table 7 Activity during the year and academic ability controlling for confounding variables

Conclusions

This paper has used a large sample of economics students from a Dutch research university to describe changes in the study progress of academically weak students following a change in the stringency of AD policies. The setting in this study is unique, in that the only major educational intervention in this program has been the change in the AD threshold, while other characteristics of the educational system remained constant.

We find that after an increase in the AD threshold level the academic performance of weak students during their first bachelor year significantly improves. Most of this change is unrelated to age, gender, ethnicity or track choice at high school. We also find that first-year dropout did not increase following the introduction of a stricter AD regime. Our non-experimental empirical approach does not allow us to establish a causal effect. Nevertheless, our findings open up the possibility that a higher bar may result in a “Pareto improvement”: weak students that progress to their second year are better off, yet the share of students that drop out remains the same. This suggests that students that just passed their first year under a low-bar AD policy can be stimulated to improve their performance by raising the bar.

The evidence further shows that following an increase in the AD threshold level, the timing of weak students’ educational activities during the academic year also changes markedly. After the increase of the AD threshold level, students bring their study activity forward in time, as shown by a decrease in the number of no-shows and an increase in the number of credits earned during regular exams. These findings lend plausibility to the interpretation that the mechanism through which an increase in the AD threshold improves academic performance is a reduction in procrastination.

We also show that the presence of a numerical threshold level may nudge students’ performance towards this level. For the 40-credits AD threshold, the credit distribution shows a hump at or just above the threshold level. This hump is absent in the data for the 60-credits AD regime, suggesting that for some students the minimum threshold level acts as a target level. We identify just-in-time minimizers who achieve the minimum threshold level at the latest possible time and show that their performance during the resit period is affected by the AD threshold level. These effects occur predominantly in the group of students with low grades at high school.

We are aware that multiple explanations can be put forward to explain the observed changes in study progress and in the timing of students’ activities following the increase in the AD threshold. Theories centering on rational choice would hold that optimizing students will adjust their study planning when the AD constraints imposed on them by the educational institution change. While this explanation cannot be excluded, it does not offer an explanation for our observation that the largest changes in behavior occur among the academically weakest students. In our view, the findings in this paper are more compatible with temporal motivation theory, which hypothesizes that students with a low expected value of task completion will be more susceptible to procrastination. A stricter AD policy may therefore be more effective among these students.

In conclusion, the evidence in this paper suggests that variations in the implementation of academic dismissal policies affect students’ study progress, especially among weaker students. The latter finding has the important policy implication that more stringent AD policies have the potential to reduce inequality in study progress among students. Preventing weaker students from lagging behind will reduce the inequality in students’ time and money spent on education. It will also reduce the disadvantage that these students may experience in the labor market from having had a longer and possibly less productive academic career.