1 Introduction

There is a huge literature from the economics and sociology of education analysing the role played by family background and economic resources on individuals’ schooling and college choices. Overall, this body of work provides overwhelming evidence that educational choices are strongly influenced by family background. It is widely recognised that, on average, children from higher socioeconomic status backgrounds perform better at school: this pattern is attributed to the capability of more advantaged parents to purchase better quality education, offer cultural stimuli, and support their children in case of difficulties. Yet, students from advantaged backgrounds make more ambitious school choices and exhibit better outcomes net of prior scholastic results. Further differences in educational choices across family backgrounds may emerge because, acknowledging their own ability, rational individuals take decisions according to costs and expected benefits, maximising a utility function.

Breen and Goldthorpe (1997) conceptualise utility in terms of expectations concerning the social class destinations of their offspring and, emphasising the role of aspirations, assume that individuals aim at minimising the risk of social demotion (i.e. ending up in a lower class than that of their parents). Parental education is also valued as a major driver of aspirations, and most empirical analyses of the effects of family background on educational outcomes either focus on the role of parental education or control for it. Other channels might exacerbate differences across family backgrounds in retention. Tinto (1975, 1993) highlights the role played by academic and social integration. Student academic performance and interaction with faculty, as well as involvement in informal peer-group interactions, may lead to either positive or negative experiences that affect feelings of inclusion. Students who feel more disconnected are more likely to withdraw: because first-generation university students often lack good knowledge of and familiarity with the higher education system, they tend to have a higher chance of experiencing poor integration and eventually drop out.

A large body of the economic literature is centred on the role played by family income, and the utility function is defined in terms of children’s future earnings. As discussed in Becker (1975), low-income families may face limited borrowing opportunities. Credit constraints may discourage college attendance among youth from low-income families, even when the financial returns are high. However, Cameron and Heckman (2001) and Carneiro and Heckman (2002) find relatively small gaps by family income after controlling for children’s ability. They conclude that the long-run factors associated with family income—family environment, early investments in children’s education—are what play a prominent role in explaining differential college enrolment rates by family income compared to short-term borrowing constraints. Similarly, Stinebrickner and Stinebrickner (2008) study college dropout decisions and report little evidence of credit constraints on most students. Instead, other scholars find that financial constraints are important drivers of university enrolment and completion (Ellwood & Kane, 2000; Belley & Lochner, 2007). Comparing cohorts from the mid-seventies studied in Heckman and colleagues’ work with cohorts of students from the mid-nineties, Belley and Lochner (2007) find that family income has become substantially more important over time. They conclude that it is likely that borrowing constraints have become more stringent, although they acknowledge that other factors such as social networks, imperfect information and college admissions policies might have played a major role as well. Bound et al. (2012) find that growing difficulties in financing a college education, especially among students from low-income families, have contributed to increasing student employment to cover a greater share of college costs, and in turn to increasing time to degree. 
Examining college dropout, Stinebrickner and Stinebrickner (2012) argue that students learn about their academic ability from grade performance while in college and provide evidence that a substantial share of withdrawals can be attributed to the gained awareness of poor performance. Indeed, families invest more in their children’s education the higher the expectations are of their ability (Checchi, 2000). While affluent parents might still find it worthwhile to keep financing their offspring’s education even when they perform poorly, low-income ones are more likely to give up.

The issue of credit constraints is addressed mainly in research on the USA and UK, where the tertiary education system is strongly differentiated, and tuition fees are generally much higher. In European countries, where higher education institutions are mainly public and direct costs are much lower, the explanations put forward by scholars of the potential influence of family income on university attendance (conditional on prior ability and schooling careers) are more generically related to the inability to face costs, including the cost of living, and to foregone earnings (Glocker, 2011; Barone et al., 2014). Where financial difficulties and no efficient student aid system exist, disadvantaged students often need to cover their costs by working, increasing time to degree and/or leading to dropout (Glocker, 2011; Triventi, 2014). In favourable labour market conditions, pull factors may also operate, as in particular, low-income students might be induced to accept good job offers and leave university. Indirect evidence of an impact of family income on higher education attendance and completion is also provided by the numerous studies showing the beneficial effect of student aid in different countries (e.g. Dynarski, 2003; Glocker, 2011; Mealli & Rampichini, 2012; Singell, 2014; Bettinger et al., 2019; Denning et al., 2019; Modena et al., 2020).

Against this background, in this chapter, we analyse whether family economic conditions affect the probability of dropout from Italian university courses upon enrolment. Italy is an interesting case study because the education system is mainly public and university tuition fees are relatively low and income progressive. While parental education and occupation may shape aspirations (and thus the wish to undertake ambitious educational programmes), lack of income could represent a material obstacle to the continuation of study. However, because the direct costs for disadvantaged students are low, we would expect income not to be highly relevant in this context. As we will show, this is not the case: economic conditions appear strongly associated with student dropout, even after controlling for other dimensions of socioeconomic background, prior school achievement and school type. To our knowledge, there is little existing evidence in Italy on the role played by financial conditions in students’ academic careers at university. One reason is the lack of appropriate data. Although administrative data provide a measure of family income, it is difficult to identify its independent effect because of the potential confounding of other family background characteristics.

Our research focuses on student educational careers upon enrolment in higher education. Exploiting a unique data set from the University of Torino that links administrative data from students’ university careers, information on family income and wealth and information on mothers’ and fathers’ characteristics collected at matriculation, we disentangle the effects of income, parental education and parental occupation on the probability of dropping out in the first academic year. Information on the financial situation of the family is provided by the ISEE indicator (Indicatore della Situazione Economica Equivalente), which is an official document released by the tax authorities delivering a measure of the household economic condition, based on official records of family members’ labour income, property and real estate assets, and normalised by the number of components. This document is used to determine tuition fees due for each student.

Parental education and occupation are not available in university registries. To overcome this limitation, the University of Torino has been collecting data on parental education and occupation since the 2014/2015 academic year through an online questionnaire that students fill in at matriculation. Although this section is not mandatory, the large majority (approx. 90%, evenly distributed across subgroups) provide this information. However, nearly 30% of the students do not disclose the ISEE documentation. We show that these data are not randomly missing and that a non-negligible share can be attributed to early dropout decisions. Because in this case complete case analyses or naïve solutions will deliver biased estimates of income effects, we tackle this problem by implementing an appropriate ad hoc imputation strategy.

The rest of the chapter is organised as follows. In Sect. 2, we summarise the existing evidence for Italy. In Sect. 3, we describe the data, and in Sect. 4, we illustrate the problem of missing information affecting income data and how we tackle it. In Sect. 5, we describe the empirical strategy, and in Sect. 6, we present our findings. Conclusions follow in Sect. 7.

2 The Italian Context

Despite the absence of formal barriers to track choice and access to university, the Italian educational system is marked by strong socioeconomic inequalities (Cobalti & Schizzerotto, 1993; Checchi & Flabbi, 2007). In comparative research, Italy stands out as a country with particularly large inequalities across parental class and education in upper secondary school choice and access to tertiary education (Jackson, 2013). Family background critically influences students’ high school choices (Gambetta, 1987; Schizzerotto & Barone, 2006). Even if inequalities in access to upper secondary education have consistently declined and the share of students enrolling in the academic track has increased over time, class inequalities in track choices have not changed much (Panichella & Triventi, 2014). Horizontal segregation in high school has strong consequences for inequalities in university enrolment, as the transition rate to tertiary education varies largely across tracks (around 80% for students with a lyceum diploma, and below 30% for students with a vocational/technical diploma). Overall, there is evidence of increasing participation in higher education and slightly decreasing inequalities up to the 2000s (Argentin & Triventi, 2011; Guetto & Vergolini, 2016), but in the most recent decade, probably due to the economic crisis, transition rates have been declining and differences across high school tracks have increased, which has changed the composition of the enrolled population (ANVUR, 2016).

Research on student academic careers has been limited by the lack of appropriate longitudinal data at the national level. For this reason, the existing literature on university dropout is largely based on retrospective survey data on high school graduates, periodically run by the National Statistical Institute (Cingano & Cipollone, 2007; Di Pietro & Cutillo, 2008; Cappellari & Lucifora, 2009; Ghignoni, 2017; Contini et al., 2018). This literature reports substantial differentials related to family background and shows that disadvantaged groups in terms of enrolment are also disadvantaged in terms of persistence. These groups include students who attended technical institutes and vocational schools (largely composed of students of lower socioeconomic background), although parental education and social class also influence university attendance and retention, conditional on prior schooling experience. Disadvantaged students are also less likely to enrol in a second tier, once they have obtained a bachelor’s degree (Bratti & Cappellari, 2012).

Only a few studies have been based on micro-level administrative data (Belloc et al., 2010; Clerici et al., 2014; Carrieri et al., 2015; Zotti, 2016; Contini & Salza, 2020; Scagni, 2021). Because the archives on schooling and university careers are not linked together, it is not possible to study enrolment choice and consider selection effects. Moreover, a major limitation is that, while it is possible to obtain data on family income, there is no information on parental characteristics. Parental education and occupation influence individuals’ aspirations and shape their expectations about future life chances. Economic conditions influence the possibility of bearing the direct and indirect costs of schooling. To disentangle these effects, data on all of these dimensions are needed.

While parental education and class strongly influence high school choices, in Italy there is no evidence of income effects at this stage (Checchi, 2000). This is hardly surprising, because schooling is free up to high school completion, and the expansion of the educational system has now made high school attendance almost universal, as nearly 85% of the young attain a high school qualification. The evidence on the role of economic resources in higher education is mixed. Analysing a national sample in the survey on Household Incomes and Wealth, Checchi (2000) reports that family income does not seem to play a significant role in preventing the enrolment of cohabiting children in Italian public universities. Instead, Aina (2013) finds sizable effects on enrolment probability but small effects on dropout. Using administrative data from single institutions, Zotti (2016) and Scagni (2021) report income effects on dropout probability. Although analysing the data of single institutions has limited external validity, focusing on more homogeneous environments has the advantage of better controlling for contextual confounding effects. Analysing the University of Salerno, Zotti (2016) reports significant differences between low- and medium-income families in dropout probability. Scagni (2021) analyses data from the University of Torino and finds a sizable effect of income on dropout choices. Belloc et al. (2010), however, report the opposite finding, namely that low-income students drop out less, for the University Roma La Sapienza. Yet, this result is derived from including university performance (a mediator of dropout) as a control, and thus it is not comparable with the other studies. From a different perspective, Barone et al. (2018) use measures of material deprivation to study university enrolment and find that economic deprivation, as such, matters, even controlling for other variables meant to capture the rational choice mechanisms, in line with Breen and Goldthorpe’s theoretical model, although it does not play a major role.

Indirect evidence of the role of financial conditions on student academic careers is provided by the compelling evidence that income support provided to low-income students is effective in preventing dropout and fostering in-time graduation (Mealli & Rampichini, 2012; Vergolini & Zanini, 2015; Martini et al., 2021; Modena et al., 2020). Scholarships may favour college enrolment and persistence by providing income that allows students to allocate more time to school activities instead of work.Footnote 1

3 Data

We exploit administrative data provided by the Ministry of Education on the entire career of the cohorts of students first enrolled at the University of Torino in a bachelor’s programme in the three academic years from 2015/2016 to 2017/2018. The archive contains full information on the students’ progression (including exam transcripts and credits earned, degree changes, timing of degree attainment or withdrawal); demographic characteristics (gender, age, place of birth and place of residence); and information on previous schooling (type of high school and final examination marks). These data have been integrated with information on family income and tuition payments, with information on scholarship recipiencyFootnote 2 and with a unique piece of information on parental education and occupation collected independently by the University of Torino at matriculation since 2014.Footnote 3 This makes it possible to improve our understanding of socioeconomic inequalities in higher education, assess the independent contribution of each of these family characteristics and disentangle the effect of economic conditions.

We analyse the determinants of first-year dropout, with a particular focus on the role played by family income. Withdrawal is defined implicitly, based on whether we observe re-enrolment in year 2. Because we have access only to microdata from the University of Torino, we cannot distinguish between changes of institution and withdrawal from higher education altogether.Footnote 4 Previous analyses based on more comprehensive data have, however, shown that, among bachelor students, only a small share of the observed dropouts belong to the former group, so we believe we can safely interpret the results in terms of system-level dropout.

In Italian public universities, tuition fees are progressive, depending on household economic conditions. Students make a first payment of a fixed amount at the beginning of each academic year. In late fall, they are asked to provide the ISEE document reporting the family equivalized indicator, based on family members’ labour income, properties and real estate assets.Footnote 5 Students whose ISEE exceeds a given threshold (currently set around 85,000 euros) or not providing the document are requested to pay the maximum fee (approximately 2500 euros per year). Nearly 30% of students do not provide the ISEE declaration. In the next section, we deal with this issue: as we will show, this piece of information is clearly not missing randomly. This implies that we cannot ignore the issue and conduct a complete case analysis: instead, missing data will be imputed, based on the available information on the following academic years, on parental education and occupation and tuition payments.

4 Missing Data on Family Income

If we could assume that, conditional on observed variables, data on income were “missing at random” (MAR), we could conduct a complete case analysis including all of the relevant explanatory variables in the models. There are, however, good reasons to believe this is not the case. First, because high-income students have no tuition reductions, they have no incentive to provide an income declaration. Let us label these students rich. Indeed, if we could assume that all individuals with missing ISEE exceed the highest threshold, it would not be a big problem, because we would have relevant information on income that we could exploit. Unfortunately, there is evidence against this assumption. When we analyse the characteristics of the students with missing ISEE we find that: (a) many of the students with missing ISEE come from disadvantaged family backgrounds in terms of parental education and occupation (see Table 9 in Appendix A); and (b) many students not disclosing income in year 1 do so in subsequent years, often reporting a low ISEE value (see Table 10 in Appendix A). If economic conditions are fairly stable over a short time span, we may assume that in year 1 they had missed the deadlines, so we call these students sloppy and exploit the information provided in later years.

Second, students who decide to leave their studies within the first couple of months of the academic year also have no incentives to declare ISEE, because ISEE determines the second tuition payment, due in late fall. We call these students early dropouts. The choice timeline is depicted in Fig. 1.

Fig. 1 Decision-making timeline

While the rich and sloppy can be easily handled by imputing high income or subsequent ISEE values, early dropouts involve an endogeneity issue that must be considered. Endogeneity results from the fact that, although we are dealing with missing values for an independent variable, whether this variable is observed or not may depend on the dependent variable itself.Footnote 6 Hence, we cannot simply ignore the issue and exclude these cases from the analysis, because we would end up with potentially highly biased estimates of the effect of income on the dropout probability. As we will see later, this practice would lead to substantial underestimation of the effect of interest.
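The attenuation mechanism can be illustrated with a minimal sketch. The counts below are hypothetical and are not taken from the University of Torino data; the only assumption is that a fixed share of dropouts in each income group leaves before the ISEE deadline and therefore never reports income. Even though early exit does not depend on income here, dropping the cases with missing income shrinks the observed gap in dropout rates.

```python
# Hypothetical population (NOT the chapter's data): two income groups.
full = {
    "low": {"students": 1000, "dropouts": 300},
    "high": {"students": 1000, "dropouts": 100},
}

# Assumed share of dropouts who leave before the ISEE deadline
# and therefore have missing income.
EARLY_SHARE = 0.5


def dropout_gap(groups):
    """Dropout rate of the low-income group minus that of the high-income group."""
    rate = {name: g["dropouts"] / g["students"] for name, g in groups.items()}
    return rate["low"] - rate["high"]


def complete_cases(groups, early_share):
    """Remove early dropouts, whose income is unobserved, from each group."""
    observed = {}
    for name, g in groups.items():
        early = int(g["dropouts"] * early_share)
        observed[name] = {
            "students": g["students"] - early,
            "dropouts": g["dropouts"] - early,
        }
    return observed


true_gap = dropout_gap(full)
cc_gap = dropout_gap(complete_cases(full, EARLY_SHARE))
print(f"gap in full population: {true_gap:.3f}")
print(f"gap in complete cases:  {cc_gap:.3f}")
```

In this toy example the complete case gap is smaller than the gap in the full population, which is the direction of bias discussed above.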

We now describe how to identify the students in these subgroups and our imputation strategy. We classify the students in the cohorts of interest in terms of whether they have or have not provided the income declaration in academic years 1 and 2, whether they have or have not enrolled in year 2 and, when relevant, whether they have paid the second tuition instalment: this piece of information is useful to identify early dropout students. Details are provided in Table 1.

Table 1 Classification of students (matriculated population in BA degrees, 2015–2017)

Most of the students (more than 70% of the entire student population matriculated in bachelor’s degree courses) provide ISEE in year 1. Consider the students not declaring ISEE in year 1 (29.37% of the total population); as discussed above, we may identify three relevant clusters: the rich, the sloppy and the early dropouts, as well as an additional residual group. In the following lines, we describe how we identify them and the imputation strategy. Let us start with those who do not drop out by year 2.

  1.

    SLOPPY. As argued above, we assume that those who did not declare income in year 1 but declare income in year 2 had previously missed the deadlines: the sloppy represent 5.07% of the total population. Assuming short-term stability of economic conditions, we impute ISEE in year 1 using the value reported in year 2.

  2.

    RICH. Some students fail to provide the information even in year 2 (and in subsequent years). These students (17.5% of the total population) are labelled rich, under the assumption that if a student does not disclose ISEE more than once, it is because there would be no substantial tuition reduction justifying the burden required to produce the documentation. For these individuals, we impute ISEE with a conventional value exceeding the maximum threshold. To keep it simple, we impute the value 100,000 and run robustness checks with alternative values (see Sect. 6).

    After these imputations, the share of students with no information on economic condition drops from 29.37% to 6.76%. Even if the size of the missing ISEE population is small at this point, we must still account for the most problematic subgroup of students: those who do not enrol in year 2.

  3.

    EARLY DROPOUTS. To identify this group, we exploit an additional piece of information: whether students have made the second tuition payment, due in late fall. We assume that those who did not (4.46% of the total population) have taken the dropout decision before the ISEE deadline. Our imputation strategy for the early dropout students relies on the available information on parental education and occupation and on the observed relation between these family background characteristics and ISEE. Let us define I as the household economic condition indicator and z as the vector of dummy variables describing mother’s and father’s education and occupation. Assuming a linear relation \( I_i = a + b z_i + u_i \), we estimate model parameters, predict ISEE for given combinations of parental background characteristics and use the estimated \( E(I|z) \) to impute missing ISEE. Yet, to address the endogeneity issue, we must acknowledge that the relation between I and z is generally different in the dropout population from that in the student population at large, because economic conditions and other dimensions of family background may themselves affect dropout (see proof in Appendix B). Against this background, we estimate the relation between I and z among those dropouts disclosing income and impute the predicted expected value \( \hat{E}\left(I|z,\mathrm{drop}\ \mathrm{out}\right) \), under the additional assumption that the same relation holds for early and late dropouts.

  4.

    OTHER DROPOUTS. There is an additional small residual group of dropouts (2.3% of the total population), who did not declare ISEE in year 1, but, having paid the second instalment, should not be considered as early dropouts. In principle, we could exploit the observed relation between parental characteristics and income and impute expected income as for the early dropouts; however, this would imply neglecting their decision not to disclose their income. Instead, we may acknowledge that this group is likely to be composed of sloppy and rich students. However, because they drop out, we cannot observe their behaviour in year 2, so we have no means of identifying them. Hence, we will assume they are all rich. Although this is unlikely to be true for all of the students in this group, by imputing a high value of income to all of them, we tend to narrow the economic differences between dropouts and non-dropouts, delivering a conservative estimate of the true income effect.
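The four rules above can be sketched in a few lines. This is an illustrative simplification, not the procedure actually run on the data: the field names and the toy records are hypothetical, the rich are identified from year 2 only (rather than from all subsequent years), and the linear regression of ISEE on parental background dummies is replaced by cell means among dropouts who disclose income, which coincide with the regression predictions only when the dummy specification is fully saturated.

```python
from statistics import mean

RICH_ISEE = 100_000  # conventional value above the top threshold (see item 2)


def classify(s):
    """Assign a student record to one of the groups described above."""
    if s["isee_y1"] is not None:
        return "observed"
    if s["enrolled_y2"]:
        return "sloppy" if s["isee_y2"] is not None else "rich"
    return "other_dropout" if s["paid_second"] else "early_dropout"


def dropout_cell_means(students):
    """Mean ISEE by parental background among dropouts who disclosed income."""
    cells = {}
    for s in students:
        if not s["enrolled_y2"] and s["isee_y1"] is not None:
            cells.setdefault(s["background"], []).append(s["isee_y1"])
    return {z: mean(values) for z, values in cells.items()}


def impute(s, cell_means):
    """Return the observed or imputed year-1 ISEE for one student."""
    group = classify(s)
    if group == "observed":
        return s["isee_y1"]
    if group == "sloppy":
        return s["isee_y2"]
    if group in ("rich", "other_dropout"):
        return RICH_ISEE
    # Early dropouts: expected ISEE given background, estimated on dropouts.
    # Raises KeyError if no disclosing dropout shares this background.
    return cell_means[s["background"]]


# Toy records (hypothetical values, for illustration only).
students = [
    {"isee_y1": 12000, "isee_y2": None, "enrolled_y2": False, "paid_second": True, "background": "low"},
    {"isee_y1": 18000, "isee_y2": None, "enrolled_y2": False, "paid_second": True, "background": "low"},
    {"isee_y1": None, "isee_y2": 30000, "enrolled_y2": True, "paid_second": True, "background": "mid"},
    {"isee_y1": None, "isee_y2": None, "enrolled_y2": True, "paid_second": True, "background": "high"},
    {"isee_y1": None, "isee_y2": None, "enrolled_y2": False, "paid_second": False, "background": "low"},
    {"isee_y1": None, "isee_y2": None, "enrolled_y2": False, "paid_second": True, "background": "mid"},
]

cell_means = dropout_cell_means(students)
groups = [classify(s) for s in students]
imputed = [impute(s, cell_means) for s in students]
print(list(zip(groups, imputed)))
```

The key design point is that the cell means are computed on dropouts only, so that early dropouts receive an expected income conditional on both background and dropout status.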

5 Empirical Strategy and Variables Description

The original sample included 33,485 individuals who first matriculated in bachelor’s degree programmes between 2015 and 2017. We excluded from the analyses the students not reporting parental occupation or parental education for both parents (approximately 10% of the original sample, apparently randomly selected) and those who attained a high school degree abroad, because most of them did not report family background information (final sample size N = 29,719).

In Table 2, we show descriptive evidence on the ISEE and the parental education distributions of dropouts and non-dropouts. On average, the former display substantially less favourable economic conditions and a smaller share have parents with higher education degrees. In the last columns, we report the share of dropouts within the population at large and among those providing and not providing the income declaration. As we can see, dropouts are overrepresented among those not disclosing income, confirming the suspicion that provision of the income declaration may be endogenous to the early dropout decision.

Table 2 Family economic condition (ISEE) and parental education by dropout status and share of dropouts among students providing and not providing the income declaration

To analyse the role of family economic conditions on dropout probability, we estimate logit models where the dependent variable is a binary indicator taking the value 0 if the students enrolled in year 1 re-enrol in year 2 and the value 1 if they do not re-enrol, focusing on students who first matriculated between 2015 and 2017 in 3-year degree programmes. We consider the following baseline specification:

$$ D_i^{\ast} = \beta_0 + \beta_1 I_i + \beta_2 x_i + \beta_3 z_i + \beta_4 f_i + \beta_5 c_i + u_i $$
(1)
$$ D_i = \begin{cases} 1 & \text{if } D_i^{\ast} > 0 \\ 0 & \text{if } D_i^{\ast} \le 0 \end{cases} $$
(2)

where \( D^{\ast} \) is the latent utility of dropout, D is the observed binary counterpart and the error term u follows a logistic distribution. The explanatory variable of main interest is I = ln(income), while the control variables are x = parental education and occupation, z = socio-demographic characteristics and prior schooling, f = field of study and c = matriculation cohort.
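Because Table 4 reports average marginal effects rather than logit coefficients, it may help to see how the two relate. Under the model above, the dropout probability is the logistic function of the linear index, and the marginal effect of a continuous regressor such as ln(income) for individual i is its coefficient times p_i(1 − p_i). The sketch below averages this quantity over individuals; the coefficient and index values are hypothetical and are not estimates from this chapter.

```python
import math


def logistic(eta):
    """Dropout probability implied by a logit model with linear index eta."""
    return 1.0 / (1.0 + math.exp(-eta))


def average_marginal_effect(beta_income, linear_indices):
    """AME of a continuous regressor: mean over individuals of beta * p * (1 - p)."""
    effects = []
    for eta in linear_indices:
        p = logistic(eta)
        effects.append(beta_income * p * (1.0 - p))
    return sum(effects) / len(effects)


# Hypothetical values, for illustration only.
beta_income = -0.2                  # logit coefficient on ln(income)
linear_indices = [-2.0, -1.5, -1.0]  # fitted linear index for three students

ame = average_marginal_effect(beta_income, linear_indices)
print(f"AME of ln(income): {ame:.4f}")
```

Since p(1 − p) never exceeds 0.25, the AME is always at most a quarter of the logit coefficient in absolute value, which is why the two should not be compared directly.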

Given that we can control for a large array of explanatory variables capturing all of the main determinants described in the existing literature (including other dimensions of socioeconomic background), we are able to estimate the independent effect of family economic conditions on the probability of withdrawal. What often prevents researchers from being able to interpret the income effect as causal is the unavailability of information on parental education and occupation. In the absence of such controls, due to the association between these variables and family income, we would not be able to disentangle income effects from other effects related to family background. Moreover, there are possible selection effects that might affect our results, because by observing only university students we cannot model the enrolment decision. We address these limitations in Sect. 6.3. The explanatory variables are defined as follows:

  • Income is defined as the natural logarithm of the ISEE indicator, determining family economic conditions from household income, parental wealth and family size. When missing, we use the imputation strategy described in Sect. 4.

  • Parental education is recorded separately for mothers and fathers, according to the following classification: up to lower secondary school, upper secondary school and higher education. However, in the estimation, we include the highest level between mother and father, further distinguishing between households where one parent or both parents have a university degree.

  • Mother and father occupation is categorised as: blue collar, low-skilled white collar, high-skilled white collar and self-employed.Footnote 7 For the mother, we also add the category housework.

  • Female is a dummy variable identifying female students to account for the widespread evidence of gender differences in educational outcomes.

  • Age at matriculation is included because there is extensive evidence that individuals not enrolling right after the end of high school (possibly after a period of occupation or while working) or, more generally, at an older age (perhaps because they previously experienced grade repetition) are more likely to leave university before degree completion. The variable is included in a categorical version (<=19 years old, 20 years old, 21–25 years old and more than 25 years old) to capture possible non-linear effects.

  • High school track is included because prior schooling has been shown to strongly affect higher educational choice and outcomes. It is classified into traditional lyceums (classic and scientific), other lyceums (linguistic, human science, artistic), technical schools and vocational schools. Students who attended high school abroad (n = 881) were excluded from the analyses.

  • High school final grade, ranging from 60 (pass) to 100 (excellent), is a proxy of academic preparedness and has been shown to be an important predictor of students’ outcomes.

  • Area of origin may influence the dropout probability for several reasons. First, there is evidence from national and international standardised assessments that the level of competencies reached in school differs widely across the country (highest in the North and lowest in the South; see Bratti et al., 2007). Second, students who leave their family of origin and bear higher costs of living are, on the one hand, more exposed to changes in family economic conditions but, on the other hand, might be more motivated than stayers. The area of origin is based on the location of the high school attended. We adopt the following classification: Turin, Piedmont, North-West, North-East, Centre and South.

  • Field of study. University careers—withdrawal/completion, credit attainment speed, grades—vary across majors and disciplines. We classify the field of study into broad categories: Scientific, Political and Social Sciences, Economics, Humanities, Health and Psychology.Footnote 8

  • Scholarship is a binary variable taking value 1 if the student receives financial aid in the form of a (small) scholarship and 0 otherwise. In some specifications, we include the variable in the model to account for the evidence that financial aid has a beneficial effect on student progression.

  • Working student is a binary variable taking the value 1 if the student declares being a working student and 0 otherwise.Footnote 9 In some specifications, we include this variable because this condition often entails worse academic outcomes and higher chances of withdrawal.
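
As an illustration of the categorical coding of age at matriculation described in the list above, a minimal Python sketch follows. The function name `age_category` and the band labels are hypothetical; only the cut-offs (19 or younger, 20, 21–25, older than 25) come from the text.

```python
def age_category(age: int) -> str:
    """Map age at matriculation (completed years) to the four bands in the text."""
    if age <= 19:
        return "<=19"
    if age == 20:
        return "20"
    if age <= 25:
        return "21-25"
    return ">25"


# Made-up ages, for illustration only.
ages = [18, 19, 20, 23, 31]
bands = [age_category(a) for a in ages]
```

Entering the bands as separate dummies, rather than age in years, is what allows the age effect to be non-linear.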

Descriptive statistics on the full set of variables are presented in Table 3.

Table 3 Descriptive statistics

6 Results

In Table 4, we summarise the logit estimates of the effect of income on the dropout probability. All models control for parental education and occupation, gender, age at enrolment, high school type and final grade, and area of origin, and include field of study and cohort fixed effects. For comparative purposes, we start with two naïve strategies: a complete case analysis (column 1) and a model including all observations, in which the income variable takes the observed ISEE value if available and 0 if missing, together with a dummy indicator for missing ISEE (column 2).Footnote 10 We then move to models using the imputed ISEE, obtained with the procedure described in the previous section: a model with the baseline explanatory variables (column 3) and models that add, as control variables, indicators for being a scholarship recipient and for being a working student (columns 4–6).Footnote 11
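
The naïve strategy of column 2 can be sketched as follows: the income regressor is set to 0 where ISEE is missing, and a dummy flags those cases. This is only a sketch; the function name `zero_fill_with_dummy` is hypothetical and the values are made up for illustration.

```python
from typing import List, Optional, Tuple


def zero_fill_with_dummy(isee: Optional[float]) -> Tuple[float, int]:
    """Return (income regressor, missing-ISEE dummy) for one student."""
    if isee is None:
        return 0.0, 1   # value set to 0, dummy switched on
    return isee, 0      # observed value kept, dummy off


# Made-up income values; None marks a student who did not disclose ISEE.
observed: List[Optional[float]] = [9.8, None, 11.2, None]
coded = [zero_fill_with_dummy(v) for v in observed]
```

With this coding the dummy gives non-disclosers their own dropout level, so they are not treated as if their income were literally 0.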

Table 4 The effect of economic conditions on first year dropout probability (AME)

The effect of income is negative and highly significant in all models, implying that students from more affluent families are less likely to withdraw.Footnote 12 The effect appears weaker in the complete case model than in the models where we address the missing data issue with appropriate imputation. It is weaker still in the naïve model in column 2: interestingly, the estimates reveal that the dropout probability for individuals not disclosing ISEE is substantially larger even than that of students reporting very poor economic conditions, supporting the suspicion that missing income is at least partially endogenous. Column (3) reports our preferred estimates, for reasons we explain in further detail below. The average marginal effect (AME) is −0.0234; thus, between the 5th and the 95th income percentile (8.45 and 11.51), the dropout probability of two individuals who are otherwise identical in terms of demographic characteristics, prior schooling, field of study and parental background differs by 7.16 percentage points.Footnote 13 The effect is large, considering that the overall dropout share in the first academic year is 15–16%. In columns (4)–(6) we include the additional controls: the income effect increases when we include the scholarship variable and decreases slightly when we include the working-student variable. Interestingly, the effects of both controls are large and highly significant. Ceteris paribus, scholarship recipients have a dropout probability approximately 8 percentage points lower than that of non-recipients: this result is consistent with the findings of rigorous impact evaluation studies reporting a positive impact of scholarships on student academic careers. Student workers, in turn, have a much higher dropout probability (13 percentage points) than non-workers.
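
The 7.16-point gap can be reproduced with one line of arithmetic: the AME multiplied by the distance between the two percentiles. The sketch below takes the AME to be −0.0234 per unit of the income measure, which is the value implied by the reported gap and percentiles, and treats it as constant over the range (an approximation). Variable names are illustrative.

```python
ame = -0.0234            # average marginal effect per unit of the income measure
p5, p95 = 8.45, 11.51    # 5th and 95th percentiles of the income measure

# Linear approximation of the change in dropout probability across the range,
# expressed in percentage points.
gap_pp = ame * (p95 - p5) * 100
gap_pp_rounded = round(gap_pp, 2)
```

The result is a difference of about 7.16 percentage points in favour of the student at the 95th percentile.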

We believe the overall effect of income is best captured by the model that does not include being a scholarship recipient and being a working student as explanatory variables (Table 4, Column 3), because these variables are endogenous to income and play the role of mediators. Both receiving the scholarship and being a working student are influenced by income: by including them in the model as controls, we would capture the direct effect of income on dropout probability, while failing to acknowledge the—positive or negative—indirect effects. Let us be more specific. (1) Scholarships are typically granted to less affluent students, with the explicit aim of supporting their studies. Including the variable in the model would result in inflating the estimate of the income effect, because in this way the income effect would capture the difference in the dropout probability between more affluent and less affluent non-recipient students (or recipients, although this comparison seems less salient). In other words, in doing so we would end up interpreting the income effect as if income support policies did not exist. (2) Working students are generally less affluent than non-workers (Triventi, 2014); moreover, as we have seen, they have a much higher likelihood of leaving university before completion. By interpreting the income effect when controlling for this variable (and thus comparing students with different incomes, but either both working or both non-working), we would then end up underestimating the income effect by ascribing part of the negative effect of the lack of income to the condition of being a student worker, although being a student worker is itself influenced by the lack of income.
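
The argument about mediators can be illustrated with a small simulation. It is a sketch under strong assumptions that are not in our analysis: a linear model instead of the logit, a continuous "work intensity" mediator, and invented coefficients. It is meant to show only the direction of the bias discussed in point (2).

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def ols_one(y, x):
    """Slope of y on x (with intercept)."""
    return cov(x, y) / cov(x, x)


def ols_two(y, x1, x2):
    """Slopes of y on x1 and x2 (with intercept), via the 2x2 normal equations."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(0)
n = 50_000
# Invented values: direct income effect, income -> work, work -> dropout.
DIRECT, INCOME_TO_WORK, WORK_TO_DROPOUT = -0.02, -0.5, 0.10

income = [rng.gauss(0, 1) for _ in range(n)]
work = [INCOME_TO_WORK * x + rng.gauss(0, 1) for x in income]      # mediator
dropout = [DIRECT * x + WORK_TO_DROPOUT * w + rng.gauss(0, 0.1)
           for x, w in zip(income, work)]                          # outcome score

total_effect = ols_one(dropout, income)             # mediator left out
direct_effect, _ = ols_two(dropout, income, work)   # mediator controlled for
```

Leaving the mediator out recovers roughly −0.07 (the direct −0.02 plus the indirect −0.5 × 0.10 = −0.05), whereas controlling for it returns roughly −0.02, a smaller income effect in absolute value. With a mediator that is more common among low-income students but lowers dropout, as in the scholarship case of point (1), the sign of the indirect path flips and controlling for it inflates the income effect instead.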

6.1 Heterogeneity of the Income Effect

Does income influence dropout probability for all students, or is the observed average effect driven by the behaviour of specific subgroups? To answer this question, we conduct separate analyses by gender, high school type, parental education, area of origin and field of study. Overall, income seems to exert a sizable influence on all subgroups, with only minor differences between them and only a few exceptions. We also estimate the income effect by the levels of the two mediator variables, indicating whether the student is a scholarship recipient or a working student. The results are shown in Tables 5, 6, 7 and 8.

Table 5 Heterogeneous effects by gender
Table 6 Heterogeneous effects by high school type, parental education and area of origin
Table 7 Heterogeneous effects by field of study
Table 8 Heterogeneous effects by scholarship and working student

Gender differences are small (Table 5). Income seems to have a slightly lower impact on the dropout probability of female than of male students, but the difference is not statistically significant. Income has a stronger effect on students holding technical and vocational high school degrees. Having previously self-selected into less academically oriented high school types, these students are likely to be more exposed to difficulties and may be able to count on less family support than students from lyceums (Table 6, Columns 1a–4a). There are no sizable differences across parental education levels (Table 6, Columns 1b–4b). Income does not seem to exert an influence on students coming from Central and Southern Italy: we interpret this result in terms of self-selection as well. These students display a higher propensity to leave their studies than students from the North (results not presented here), perhaps because, as shown by standardised assessments, they reach lower competence levels (Bratti et al., 2007). However, they are likely to be especially positively selected in terms of aspirations and motivation and might thus be less exposed to the detrimental effects of low economic resources (Table 6, Columns 1c–4c).

Income plays a role in all fields of study except for health degrees (Table 7). This is not surprising, because of the selective admission to these programmes regulated by numerus clausus. Being strongly self-selected at entrance, these students are highly motivated and generally display very low dropout rates. Similarly, although still sizable and statistically significant, we observe a smaller income effect among students enrolled in the scientific fields, where in many degree programmes there are selective admission tests.

We find no income effects for working students, who are usually engaged in full-time jobs and display much higher dropout probabilities than full-time students. We interpret the absence of income effects for this subgroup as being related to the fact that, earning their own income, they are less dependent on family economic conditions. Income effects are weaker for scholarship recipients (AME = 0.017) than for non-recipients (AME = 0.032). This result provides additional evidence of the beneficial effect of student aid policies, as the scholarship contributes to making recipients less exposed to the negative impact of lack of family economic resources (Table 8).

6.2 The Effect of Parental Education and Occupation

Although the role of economic conditions emerges clearly, the effect of parental education and occupation is less clear. In Table 12 in Appendix A, we show the estimated effects for all family background dimensions. The effects of parental education go in the expected direction, but they are small and barely significant, and even weaker results are observed for parental occupation.Footnote 14 Hence, we may conclude that at this point of the educational career—after a strong previous social selection that may be represented as an obstacle course for low-SES and a flat road for high-SES individuals—parental education and occupation do not seem to exert any substantial residual effect on the decision to complete the bachelor’s degree.

6.3 Potential Limitations

6.3.1 Peer Effects

It might be argued that because we have not controlled for peer characteristics, we cannot rule out that our estimates of the effect of financial conditions also capture peer effects. Let us examine this point more closely. Students of higher socioeconomic background have, on average, better peers in terms of academic and soft skills, and better peers foster persistence in education. Hence, the link between socioeconomic background and persistence in education is likely to be causal, but (at least partly) indirect. Yet, if first-year university students’ relevant peers are high school friends and classmates, it is reasonable to consider parental education and high school track—taken jointly—as good proxies for peer quality. If we believe this is the case (this is our standpoint), the issue no longer exists. If instead we believe that income as such may influence the capability of making friends and which friends young individuals make, the income coefficient might indeed also incorporate peer effects. What would the policy implications be in this latter case? If the relevant peers have been established during high school, providing financial aid upon university enrolment might not help reduce dropout, because the aid comes too late. Instead, income support could contribute to reducing dropout if the relevant peers are made after university enrolment, because this additional source of income could foster social integration in university and the acquisition of better peers. In this scenario, the income coefficient captures the total causal effect of income (direct + indirect). Thus, the policy implications may depend not only on whether the relation between economic conditions and retention is truly causal but also on the mechanisms underlying this causal link.

6.3.2 Self-Selection Issues

By exploiting administrative data on university students, we cannot account for selection effects related to previous educational decisions—the choice of the high school track, high school completion and university entrance. Hence, our estimates of the effect of economic conditions on university dropout are not estimates of a causal effect in the usual sense: being conditional only on observed features, they do not capture the differences across the income distribution among individuals otherwise identical in terms of both observed and unobserved characteristics. The comparison is not fully ‘like with like’, because—due to the strong social selection operating along the entire schooling career—upon university enrolment, low-socioeconomic status individuals are likely to be positively selected and thus more endowed in terms of unobserved traits such as motivation and resilience than students from advantaged backgrounds (Cingano & Cipollone, 2007). For this reason, we expect our estimates to be conservative estimates of the total causal effect of income (by total effect we mean the effect inclusive of the potential effects of mediators). This conclusion holds under the assumption that motivation is independent of financial conditions after controlling for parental education and occupation (see Appendix B for proof).
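
The direction of this selection bias can be illustrated with a simulation. It is a sketch under invented assumptions (a linear outcome, standard normal income and motivation that are independent in the population, and enrolment whenever their sum is positive) and is not the proof given in Appendix B.

```python
import random


def slope(y, x):
    """OLS slope of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
TRUE_INCOME_EFFECT = -0.05   # invented
MOTIVATION_EFFECT = -0.05    # invented; motivation is unobserved by the analyst

income_enrolled, dropout_enrolled = [], []
for _ in range(100_000):
    income = rng.gauss(0, 1)
    motivation = rng.gauss(0, 1)       # independent of income in the population
    if income + motivation <= 0:       # stylised selection before university
        continue
    score = (TRUE_INCOME_EFFECT * income
             + MOTIVATION_EFFECT * motivation
             + rng.gauss(0, 0.1))
    income_enrolled.append(income)
    dropout_enrolled.append(score)

# Regression on the enrolled only, with motivation omitted.
estimated_effect = slope(dropout_enrolled, income_enrolled)
```

Among the enrolled, low-income students have higher motivation on average, so the estimated slope lies between the true −0.05 and zero: a conservative estimate in the sense used above.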

7 Conclusions and Discussion

As maintained by Manski (1989) and more recently by Bertola (2021), college dropout need not be considered a social problem, because ‘students contemplating college entrance do not know whether completion will be feasible or desirable. Hence, enrolment is a decision to initiate an experiment, one of whose possible outcomes is dropout’ (Manski, 1989, p. 1). While we do agree with this point, we believe that dropout becomes a social problem if it is mainly experienced by students from disadvantaged backgrounds. If this is the case, we need to gain a better understanding of how to weaken barriers to higher education attainment among young individuals who have taken the decision to enrol in college and thus reduce intergenerational transmission of education and income.

Exploiting unique data from the University of Torino, which since academic year 2014/2015 augment administrative university records with information on mothers’ and fathers’ educational level and occupation, we have been able to analyse whether and how family economic conditions, parental education and occupation influence university students’ dropout probability, and to disentangle their effects. We highlight the existence of a severe missing data problem, arising from the lack of incentives to provide ISEE documentation if the student’s income exceeds a certain threshold and, most importantly, in case of an early dropout decision. This source of missing data cannot be ignored. We deal with the endogenous missing data issue with an ad hoc imputation strategy and find that at this stage of the schooling career, after a strong previous social selection operating up to university enrolment, parental education and occupation no longer exert a sizable effect on educational choices. Instead, there is evidence that, despite the progressive character of tuition fees and the existence of scholarships provided to low-income students, financial conditions have a substantial impact on university dropout.

Our results suggest that low tuition fees and current student aid policies, although beneficial, are not sufficient to eliminate the negative effect of a lack of economic resources on student academic careers. Further investigation is needed to gain a better understanding of why this is the case. While still preliminary, our analyses reveal that scholarship recipients are much less exposed to family income effects than non-recipients, even if a sizable effect also exists among them. Moreover, although all eligible applicants have received a scholarship in recent years, the take-up rate is low, as only about half of the students meeting the income requirements apply (Laudisa, 2017). Whether this is due to a lack of information or to other reasons remains to be determined. Answering this question is necessary if we wish to promote equity and, at the same time, raise the share of young people with tertiary education, which is still dramatically low in Italy.