1 Introduction

In recent years, a substantial number of universities around the world have become increasingly research-oriented. Most universities adopt reward systems that favour academics with high-ranking publications and guarantee career prospects to those with high research productivity, while paying only marginal attention to teaching effectiveness (ter Bogt & Scapens 2012; Parker 2012; Douglas 2013; Cadez et al. 2017). Nevertheless, contrary to what such an orientation seems to suggest, universities are also interested in high-quality teaching, since both research and teaching are leading missions for them. Hence, it is important to understand what consequences a reward system so skewed towards research may have on the quality of teaching. In fact, if the two activities are substitutes, a reward system based mainly on research might reduce the quality of teaching. The contrary happens if the two activities are complements: in this case, rewarding research also raises teaching quality.

Although this is a crucial issue for the university system, the literature has not reached a wide consensus on the nature and sign of the relationship between research and teaching. On the one hand, there are those who claim that the relationship is positive because the abilities in running the two activities are complementary, since excellent researchers may also provide high-quality teaching, being people with deeper insights on scientific topics that they transfer through teaching (Braxton 1996; Sullivan 1996; Rodriguez & Rubio 2016). On the other hand, there are those who emphasize substitutability, arguing that the abilities in teaching and research are independent and both activities need time and effort which are limited resources for researchers. As a consequence, an incentive scheme more skewed towards research might drastically lower the time and effort that individual researchers dedicate to teaching activity as well as its effectiveness (Barnett 1992; Marsh 1987; Ramsden & Moses 1992; Parker 2012). Empirical analysis run on the question has not solved the puzzle. Several papers find a positive relationship between research and teaching quality, others a negative or even a null one. Moreover, results show a high variability since they change according to the level of degree programs, the proxies used to measure quality of teaching and research, and the variables capturing the context within which the two activities are performed.Footnote 1

The large variability in the empirical results could also derive from the fact that, although the relationship between teaching and research depends both on the behaviour of academic professors and on the organization of the universities and departments within which the two activities are carried out, the aims of the professors and universities do not completely coincide, and, under some circumstances, there might be even a conflict between them. While universities are multitasking institutions, for which teaching and research are complementary activities, this is not necessarily true for a single researcher, for whom the two activities are more likely substitutes (Barnett 1992; Hattie & Marsh 1996; Cadez et al. 2017). In fact, universities derive funds from tuition fees and research funds, while researchers derive their wages, tenure, and scientific reputation mainly by research productivity. Since both teaching and research require effort and time which are limited resources, they might be perceived as substitutes by the individual researcher. This framework can be further complicated by the fact that between universities and academic professors there is a principal–agent relation (Gautier & Wauthy 2007; Bak & Kim 2015; De Philippis 2020). While universities can observe research productivity, they cannot perfectly observe teaching effectiveness. This implies that if universities adopt an incentive scheme based on research performance, in order to solve the agency problem, this strategy may have an unintended detrimental effect on the teaching quality, since professors would choose to put more effort on research activity by free riding on teaching activity, which is perceived as a sort of public good (Gautier & Wauthy 2007; Payne & Roberts 2010).

In this chapter, we study the relationship between research and teaching in the Italian university system at department and study program level. We will explore the role of the department organization in reducing the detrimental effects on teaching quality, which derive from an incentive scheme based on research performance, adopted in order to solve the agency relation between professors and departments. In particular, we will consider the multi-unit nature of departments, which in Italy typically supply different study programs both at bachelor and at master level, and the fact that university departments are financed by funds received from the government both for their research productivity and for the number of students enrolled in their programs. These characteristics of the institutional context are conducive to a yardstick competition between study programs, which reduces the incentive for individual professors to free ride on teaching activities and also the trade-off between teaching and research.

In our empirical analysis, we exploit a very rich dataset collected by the Italian National Agency for the Evaluation of Universities and Research Institutes (ANVUR), which provides information on almost five thousand degree programs belonging to all Italian universities, public and private, telematic or traditional. To measure the quality of teaching for each study program, we consider objective as well as subjective measures. Objective measures include the initial efficacy of the study programs and the regularity of study paths. As subjective indicators, we consider the graduates’ satisfaction with the degree program they graduated from. All measures are based on data collected for the 2016–2017 academic year. We measure the academic research performance by using an indicator of quality rather than quantity: the R indicator provided by ANVUR, calculated at the level of study program and department. More precisely, the R indicator is the average score that researchers who teach in a given study program in the 2016–2017 academic year received during the 2011–2014 Italian Research Assessment, normalized by scientific macro-field.

From our analysis, a positive teaching–research quality relationship emerges rather clearly when we carefully represent the nexus between research and teaching in the framework of yardstick competition among study programs, proxied by the number of study programs activated within the same department. In particular, we interact research performance with our proxy for yardstick competition and find its coefficient to be positive for degree programs facing relatively few competitors in the same department and negative if the number of competing degree programs is larger, such that free-riding behaviours are harder to detect. Less clear is instead the relationship between research quality and teaching in BA-level programs, where topics are typically far from the research interests of the faculty. Other interesting results regard the student–instructor ratio: in study programs with a below-median student–instructor ratio, research quality and teaching quality are more strongly associated, likely because classes with fewer students relax the time and energy constraints faced by professors. However, we find that when the student–instructor ratio is below median, additional students per instructor tend to weaken the positive relationship between research and teaching, as shown by the coefficient estimates of an interaction term. This can be explained by the fact that, when departments’ budget is based on both research productivity and the number of students, smaller classes also imply a lower amount of funds available to departments, which limits the scope of winner-picking in research activity and raises incentives for teaching. Results about control variables, accounting for heterogeneity in terms of average instructors’ age, qualification, gender composition of the faculty, research funding, and degree program internationalization, follow expectations.

The empirical literature analysing the relationship between teaching and research at the university (or department) level for the Italian case is quite scant. In this regard, we mention the contributions of Sylos Labini and Zinovyeva (2011) and Braga et al. (2014), which focus, respectively, on the teaching performance of the departments of all the Italian universities and on the teaching effectiveness of academic staff of some degree programs of Bocconi University. Results of both seem to suggest the existence of a weak positive correlation between the two phenomena. The existence of a weak positive correlation between teaching and research is also confirmed in De Philippis (2020), who analyses the case of Bocconi University. By comparing the results before and after the application of an incentive scheme more biased towards research, she finds evidence of a negative effect of research productivity on teaching effectiveness at the individual level, but a positive effect at the university level due to a composition effect. Although these results are particularly interesting, they are not completely transferable to the whole Italian university system.Footnote 2 Hence, a wider empirical analysis is needed in order to obtain more general results. This chapter may help to fill this gap, which is important given also the great interest shown by policy makers in setting policies aimed at enhancing the effectiveness of teaching and the productivity of research in the Italian university system.

The remainder of the chapter is organized as follows. Section 2 draws on the existing literature to overview the main insights on the relationship between teaching and research at the department level. Section 3 focuses on the Italian institutional context. Section 4 describes the data. Section 5 presents the econometric model. Section 6 includes the main results. Finally, Sect. 7 concludes.

2 The Relationship Between Teaching and Research Within Higher Education Institutions

The relationship between teaching and research has long been debated in the literature on scientific productivity. The topic, however, has been analysed mainly at the single researcher level, by paying less attention to the organizations within which both activities are carried out, such as departments and faculties (Marsh & Hattie 2002). Moreover, the theoretical justifications provided for the existence of a negative or positive relationship between the two activities are mainly at the individual level, while factors that explain the existence of a relationship between the two activities at the department level have been much less analysed.

We believe that the most appropriate unit of analysis to study such a relationship is the department and/or the university, for several reasons. First, the research activity is often conducted by teams which are composed of members of the same department.Footnote 3 Second, there is no reason to expect that what holds for individual academics holds in the aggregate, since individual- and organization-level goals do not fully overlap. Moreover, the relationship between professors and universities can be understood in the principal–agent framework, where universities are the principal. As an implication, the incentive scheme adopted by universities is of crucial importance for determining whether the two activities are complements or substitutes, for which the relation is, respectively, positive or negative. Third, universities and departments may affect the relationship between teaching and research through their organization of human resources by favouring specialization among faculty members, or the emergence of positive externalities through different forms of collaboration among members of departments (Bäker & Goodall 2020; Bradford et al. 2014; Carillo et al. 2013), or again by adopting a type of organization which reduces (or supports) the administrative tasks carried out by professors. All these aspects may reduce the trade-off between the two activities.

2.1 The Multitasking Nature of Universities and the Principal–Agent Relation Between University and Professors

As already stated, universities are multitasking institutions for which teaching and research are complements, since the universities’ budget is composed of both research funds and students’ fees. This is not necessarily true for researchers individually, who have to allocate time and effort between the two tasks, which makes the two activities substitutes rather than complements. The agency problem further complicates the framework, since the non-observability of teaching effectiveness induces universities to adopt an incentive scheme biased towards research, thereby incentivizing professors to free ride on teaching activity. In fact, as predicted by Holmstrom and Milgrom (1991), if a multitasking principal adopts a performance-based scheme, this drives the agents to reallocate time and resources towards the more rewarding task, at the expense of the less rewarding one. Hence, the final results may radically change according to the incentive scheme adopted by universities and whether universities are able to counterbalance the free riding of professors through other aspects of their organization.

Recently, several papers have adopted this framework in analysing the relation between research and teaching, by asking how multitasking universities may solve the principal–agent problem. Gautier and Wauthy (2007) assume that departments (or universities) are multi-unit organizations, and budget allocation among units depends both on the number of students and on research productivity. The authors show that such an allocation rule induces yardstick competition among units, which reduces the substitutability between research and teaching effectiveness. In particular, yardstick competition among the different units reduces the incentive to free ride and raises the complementarity between teaching and research if the number of units is not too high. In the same vein, De Philippis (2020) also studies the allocation of professors’ efforts between research and teaching when there is an agency problem and universities adopt an incentive scheme biased towards research. In particular, she focuses on the relationship between research and teaching abilities in order to assess the effect of such an incentive scheme on the relation between the two activities. She shows that in such a framework, the degree of substitutability between the two tasks arises because when the reward is biased towards research, the cost of effort in teaching is higher for academics who are more involved in research. However, the negative effect can be counterbalanced by a composition effect which occurs if the ability for teaching is complementary to the ability for research. In this case, incentives highly skewed towards research attract a supply of academics with high ability, thus counterbalancing the negative effect at the individual level. Also, Bak and Kim (2015) adopt the multitasking theory for analysing the research and teaching relationship in the case of the Korean university system. The authors find that in a context where the incentive scheme is more skewed towards research, there is a reduction in teaching effectiveness. However, the negative effect is higher for undergraduate programs, for which the substitutability between the two tasks for individual researchers is higher.

An incentive structure biased towards research may reduce teaching effectiveness also by modifying the type of research. If it incentivizes the quantity rather than quality of research, the possibility of transferring new scientific knowledge to students is reduced, and hence the sign of the correlation is more likely negative (Shin 2011).

Finally, several authors suggest that not only explicit but also implicit rewards are important in shaping the relation between teaching and research (Marsh & Hattie 2002; Carillo & Papagni 2014). A departmental ethos that places more emphasis on research (or on teaching) could lead academics to attach greater importance to research (or to teaching). If colleagues are particularly committed to research or teaching, then it is more likely that there are intrinsic rewards and higher reputation for excellence in that activity. Ramsden and Moses (1992) suggested that “high departments are populated by staff who are on average less effective teachers and vice-versa” (p. 287).

2.2 Specialization

At the department level, the way in which tasks and duties are allocated among the department members affects the time required for the implementation of teaching activities. For example, the involvement of PhD students and research assistants in teaching activities can improve the quality of teaching and at the same time relax the time constraints faced by senior scholars. A division of labour between senior and junior academic members, which gives more administrative duties related to teaching activities to senior academics, can also achieve the same results. Bäker and Goodall (2020) find that in departments where junior members have a low administrative burden, their research activity improves and there is less substitutability between the two tasks at the individual level. Also, Garcia-Gallego et al. (2015), by exploring the case of Castellona University in Spain, ask whether specialization within the university, whereby some professors specialize more in administrative and teaching duties, may reduce the substitutability between the two activities. They find that all phenomena arising within departments which increase specialization and collaboration among their members give rise to a positive correlation between research and teaching at the department level.

2.3 Positive Scientific Externalities

Another important factor is the existence of positive externalities generated by the scientific activity of the members of the same department. Positive scientific externalities within the department can spread through scientific collaborations between members of the same department, the organization and participation in seminars, participation in funded research projects, or even just through the exchange of ideas and information sharing (Carillo et al. 2013; Carillo & Papagni 2005). This implies that in an environment with high scientific externalities, it is possible to obtain a certain level of scientific production while investing less time and resources, which in turn improves the time and resource constraints on individuals and the trade-off between teaching and research activities.

2.4 The Level of Education

Another important feature of universities and departments that affects the relationship between teaching and research is the level of education they offer. Several authors (Brew & Boud 1995; Griffiths 2004; Brew 1999; Healey 2005; Palali et al. 2018) argue that undergraduate university programs offer less space for transferring new frontier knowledge into teaching, while in more advanced education levels, such as masters or doctoral programs, this transfer is wider, if not a necessary part of the teaching activity. Brew and Boud (1995) and Griffiths (2004) have focused on how departments define teaching activities: when they define it as a “student learning process,” research is closely related to teaching. Obviously, this definition is more suitable for higher level education. This result is confirmed by Palali et al. (2018). The authors run an empirical analysis on professors in the Netherlands and find a positive relationship in the case of master students and students in the last year of their bachelor degree, but a negative one for lower degrees. De Philippis (2020) finds a similar result for Bocconi University. Hence, when professors can bring their research into class and disseminate it to students, the substitutability between teaching and research does not apply.

2.5 Research Fields

Finally, the nexus between the two activities varies according to the disciplines that characterize a department or a faculty, because of differences in epistemology, research methods, and types of academic cultures existing among them. Shin (2011) and Shin and Kim (2017), in empirical papers on the Korean university system, find that in hard science departments the relationship between teaching and research is null or even negative in low-level education, while it becomes weakly positive in high-level ones. The contrary happens in the social sciences and humanities. The authors argue that this can derive from the fact that research in hard sciences produces more articles in international journals, while humanities and social sciences produce more books and articles in domestic journals. These characteristics make it easier for the humanities and social sciences to transfer new knowledge in undergraduate programs. The contrary happens in higher levels of education, where students are more accustomed to formal reasoning and have good knowledge of foreign languages: in this case, hard sciences can more easily transfer new knowledge to students. Walstad and Allgood (2005), for example, find that, in the US, economics is excessively oriented towards research activity and rewards it too heavily, compared with fields in Business, Engineering, Mathematics, and Statistics.

3 The Italian University System

The Italian university system has been profoundly transformed since the Gelmini reform of the university system, implemented in 2010, and the introduction of the National Scientific Qualification (Abilitazione Scientifica Nazionale, ASN), which jointly characterize it as a system wherein public funding is allocated to universities mostly based on teaching indicators, but individual careers depend on research performance.

After the reform, universities are organized in departments, which have responsibilities on research, teaching, and the related recruitment, within the budget allocated by the university. Each department can manage one or more degree programs, including both BA-level and MA-level programs. Each program is managed by a council, including a number of professors affiliated to the department or to other departments. Such professors are termed reference professors and can take this role only in one degree program.Footnote 4 The department is responsible for proposing to the university the structure of the degree programs, namely, the list of subjects, their weights in terms of ECTS, and the allocation of instructors among subjects.

The enrolment fees are collected by the university and contribute to its budget, along with transfers from the Ministry of University and Research (MUR). Such transfers are based on the number of enrolled students, as well as on teaching quality indicators and, for a limited share, on research assessment outcomes performed by ANVUR. Part of the budget is used by universities for recruitment of academic staff. This can amount to new recruits or to upgrading the position of the existing academic staff.

Academic staff members can apply for career upgrades within their university, provided they have obtained the National Scientific Qualification to that position in the relevant academic field. Qualification is awarded by national committees, by considering above all the scientific quality of the publications submitted by candidates (originality, impact, editorial collocation, coherence with the field), provided that candidates satisfy certain threshold values in terms of a number of publications. Some teaching-related aspects are also taken into account, such as teaching fellowships in foreign universities or PhD board membership, but their weight in the evaluation is very minor. Significantly, no indicator about undergraduate teaching is considered.

To sum up, for the purposes of studying the teaching–research relationship, one can summarize the Italian university system as follows. Universities collect enrolment fees from students and transfers from the ministry and use part of them to finance new recruits or career upgrades. Candidates for academic positions, though, compete in terms of research performance. Hence, in the aggregate, opportunities for academic careers depend on the ability of universities to attract students by carefully balancing tuition fees and teaching quality; however, individual opportunities do not depend on teaching efforts and could in fact be hampered by allocating too much effort away from research.

It is worth noting that the multi-unit structure of (Italian) universities adds a further layer of incentives that may affect the teaching–research trade-off. Universities can allocate their funding for recruitment among departments and degree programs based on their relative performances in attracting students. Degree programs with more students and/or with students who report better satisfaction or job market placement may be allocated larger shares of the recruitment budget. Competition among degree programs, based on better teaching indicators, is what provides the best researchers with larger opportunities for their career concerns. But there may not be enough incentives for the individual academic to improve his/her teaching performance since positions are awarded based on research quality.

3.1 The Italian Evaluation of Research Quality

The Italian assessment of research quality (VQR) has been carried out by ANVUR, on behalf of MUR, since 2011, to evaluate the scientific production of Italian universities and departments. Researchers have to submit a limited number of research papers, presumably their best papers,Footnote 5 which are evaluated by a panel of experts, selected by ANVUR for each macroarea of scientific research. The evaluation process is based on two evaluation methods: bibliometric analysis, based on bibliometric indicators (i.e. citations of the paper and the impact factor of the journal in which the paper is published) and informed peer-review evaluation by external experts, named by the panel. Each product receives a score ranging from 1 (excellent) to 0.7 (good), 0.4 (fair), 0.1 (acceptable), and 0 (limited or inadmissible). Hence, the research productivity is valued in terms of quality rather than in terms of quantity.

The contribution of each researcher to the scientific performance of the university is significant, given that the results of the research evaluation contribute to determining the share of the fund that MUR allocates to each university. However, only a small part of this fund depends on the results of VQR. In 2017, after the publication of the VQR results that referred to the evaluation of scientific production in the period 2011–2014, this share represented 80% of the “reward fund” (quota premiale), which in turn amounted to 23% of the ordinary fund.
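
Taking the two percentages above at face value, the part of the ordinary fund tied to VQR results in 2017 can be worked out directly:

$$ 0.80 \times 0.23 = 0.184, $$

that is, roughly 18.4% of the ordinary fund.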

4 Data and Variables

For the purposes of this research, we obtained from ANVUR data on 4858 degree programs activated by all public and private Italian universities in the 2016–2017 academic year. We also consider programs provided by online universities, as they are exposed to the same hiring rules and incentives as all other universities in Italy.

The dataset includes a number of variables that proxy for quality of research and teaching in Italian universities. In particular, in order to measure the quality of research performed by members of a study program, we rely on a variable that represents the key indicator within the 2011–2014 Italian Research Assessment, the so-called R indicator.Footnote 6 More specifically, the R indicator is calculated as the ratio between the average grade of the expected products by a given university in a certain scientific area and the average grade received by all the products of the area; the aggregate measure for the degree program is computed as the weighted sum of the area-wise R indicators, using the number of expected products of each area as weights.

Indicating with $v_{i,j,k}$ the sum of the evaluations of the k-th degree program of the i-th university in the j-th area and with $n_{i,j,k}$ the number of products expected for the VQR of the k-th degree program of university i in the j-th area and defining as $q_{i,j,k}$ the share of professors belonging to area j who teach in the k-th degree, we have

$$\displaystyle \begin{aligned} R_{ik} = \sum_{j=1}^{N_{j}} q_{i,j,k}\frac{\frac{v_{i,j,k}}{n_{i,j,k}}}{\frac{ \sum_{i=1}^{N_{i}}v_{i,j}}{N_{j}}} = \sum_{j=1}^{N_{j}} q_{i,j,k}\frac{\frac{v_{i,j,k}}{n_{i,j,k}}}{\frac{V_{j}}{N_{j}}}, \end{aligned} $$
(1)

where $N_i$ and $N_j$ are the cardinalities of, respectively, universities and areas.
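
As a rough illustration of how the indicator aggregates across areas, the following Python sketch computes a program-level value from area-level inputs. The record layout, the function name `r_indicator`, and the sample figures are hypothetical and are not ANVUR's data format; the area-wide average score is taken as a given input, and skipping areas with no expected products is an assumption made here for convenience.

```python
from dataclasses import dataclass


@dataclass
class AreaRecord:
    """Inputs for one scientific area within one degree program (hypothetical layout)."""
    score_sum: float          # sum of product scores for the program in the area (v)
    expected_products: int    # number of expected products for the program in the area (n)
    staff_share: float        # share of the program's professors belonging to the area (q)
    area_mean_score: float    # average score of all products in the area (taken as given)


def r_indicator(records):
    """Weighted sum over areas of program mean score divided by area mean score."""
    total = 0.0
    for rec in records:
        # Assumption: an area with no expected products contributes nothing.
        if rec.expected_products == 0 or rec.area_mean_score == 0:
            continue
        program_mean = rec.score_sum / rec.expected_products
        total += rec.staff_share * program_mean / rec.area_mean_score
    return total


# Invented example: a program whose staff belong to two areas.
example_program = [
    AreaRecord(score_sum=7.0, expected_products=10, staff_share=0.6, area_mean_score=0.5),
    AreaRecord(score_sum=2.0, expected_products=5, staff_share=0.4, area_mean_score=0.5),
]

print(round(r_indicator(example_program), 2))
```

In this invented example the first area performs above the area average and the second below it, and the weighted result is above 1, i.e. an above-average research performance overall.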

This indicator captures the relative research performance of researchers teaching in a given degree program, with respect to research performances in the scientific areas involved in the degree program. Values below (above) 1 indicate a below-average (above-average) research performance. We recall that individual grades, which make up the sum $v_{i,j,k}$ for each degree/university/field combination, range from 1 (excellent) to 0.7 (good), 0.4 (fair), 0.1 (acceptable), 0 (limited or inadmissible).

The teaching quality indicators that we consider measure both the academic performance of students and the satisfaction of graduates. Among indicators of students’ academic performance, we use the percentage of credits obtained in the first year with respect to the total number of credits to be obtained in the first year, the percentage of students who have obtained at least 40 credits in the first year and then enrol in the second year, and finally, the percentage of freshmen who graduate not later than one year after the ordinary duration of the study program. The first two indicators capture the initial efficacy of study programs, i.e. the ability of university teaching staff to allow students a fairly swift transition from the first year courses, when students learn the basics, to second and third year courses that are more specifically aimed at preparing students for the job market. If students struggle to pass first year exams, this may be due to poor selection of freshmen, to ineffective organization of first year courses,Footnote 7 or to a teaching staff who set very high standards. For these reasons, students may decide to transfer to another university where they expect to find a better match, or to give up university altogether. In both cases, one may argue that the university has failed in its teaching mission. The third indicator of students’ academic performance captures the regularity of study paths, since it is achieved when students complete their curriculum in due time. This indicator refers to cohorts of students who have managed to pass first year exams. However, some students may still find difficulties in passing second and third year exams, which may require the application of basic notions learned in the first year, as well as learning more advanced concepts and analytical tools. Policy makers tend to have a negative assessment of universities in which students struggle to graduate in time, as this may prevent an effective school-to-work transition. On the other hand, students may also take longer to graduate because they engage in activities that improve their chances of a successful school-to-work transition, such as internships or advanced dissertation topics.
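
To make the three student-performance indicators concrete, the sketch below computes them for a small invented cohort. The field names and function names are hypothetical, and the choice of denominators (pooled credits for the first indicator, all freshmen in the cohort for the other two) is an assumption made for illustration rather than ANVUR's official definition.

```python
def first_year_credit_share(students):
    """Percentage of first-year credits obtained out of those to be obtained (pooled)."""
    earned = sum(s["credits_year1"] for s in students)
    due = sum(s["credits_due_year1"] for s in students)
    return 100.0 * earned / due


def share_with_40_credits(students, threshold=40):
    """Percentage of freshmen with at least `threshold` first-year credits who enrol in year two."""
    ok = sum(
        1 for s in students
        if s["credits_year1"] >= threshold and s["enrolled_year2"]
    )
    return 100.0 * ok / len(students)


def share_graduating_within_one_extra_year(students, duration_years):
    """Percentage of freshmen graduating no later than one year after the ordinary duration."""
    ok = sum(
        1 for s in students
        if s["years_to_degree"] is not None
        and s["years_to_degree"] <= duration_years + 1
    )
    return 100.0 * ok / len(students)


# Invented cohort of four freshmen in a three-year program.
cohort = [
    {"credits_year1": 60, "credits_due_year1": 60, "enrolled_year2": True, "years_to_degree": 3},
    {"credits_year1": 45, "credits_due_year1": 60, "enrolled_year2": True, "years_to_degree": 4},
    {"credits_year1": 30, "credits_due_year1": 60, "enrolled_year2": True, "years_to_degree": 5},
    {"credits_year1": 15, "credits_due_year1": 60, "enrolled_year2": False, "years_to_degree": None},
]

print(first_year_credit_share(cohort))
print(share_with_40_credits(cohort))
print(share_graduating_within_one_extra_year(cohort, duration_years=3))
```

The third student in the invented cohort illustrates the point made above: passing into the second year does not guarantee graduating within one year of the ordinary duration.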

A final category of teaching quality indicators concerns the satisfaction of graduates. We consider the percentage of graduates who would enrol again in the same degree program and the percentage of graduates who are overall satisfied with their degree program.Footnote 8 Students may be satisfied with their university choice for several reasons. Perhaps the most straightforward reason concerns job market outcomes. Students who quickly find jobs that correspond to their labour market expectations or ambitions are supposedly more satisfied than average. Yet, satisfaction may originate from having attended classes given by highly skilled professors, from spending time in a well-organized university environment, or from the sheer interest of the discipline, regardless of labour market outcomes. All teaching indicators are provided by ANVUR in respect of AVA (Autovalutazione, Valutazione periodica, Accreditamento) obligations on universities and refer to academic year 2016–2017.

In Fig. 1, we present the relationship between the teaching indicators and the R measure of program-level research quality. The three scatter plots in the upper panel of Fig. 1 refer to students’ academic performance and show a positive association between the research performance of teaching staff and all three indicators, i.e. the average number of credits obtained in the first year, the percentage of students enrolled in the second year with 40 credits in the first year, and the percentage of students who graduate within one year of the legal duration of the study program. By contrast, we observe no association, or at most a weakly positive one, between program research quality and the satisfaction of graduates in the scatter plots at the bottom of Fig. 1, which refer to the percentage of graduates who would enrol again in the same degree program (Program satisfaction I) and the percentage of graduates who are overall satisfied with their degree program (Program satisfaction II).

Fig. 1 Correlation between teaching indicators and research quality

As suggested in Sect. 2 when summarizing the theoretical insights on the research–teaching relationship, it is essential to take into account the organization of departments. In particular, we have to consider the multi-unit nature of departments in Italy: with few exceptions, departments house more than one degree program, often at least one BA-level and one MA-level degree. Degree programs organize the teaching activity and establish the actions to be taken in order to improve the teaching quality indicators (e.g. tutoring of students who struggle to pass exams, recommendations on syllabi so that they match students’ expectations, organization of internships). However, degree programs are designed by university departments, which decide their goals and modules, as well as the allocation of teaching personnel among them.

To measure the internal organization of departments, we use several variables. First, the number of professors allocated to each degree program. Second, the ratio of the number of students to the number of instructors, which indicates how relevant students’ fees are in the department budget, but also the effort required by teaching activity. Third, the number of degree programs per department, which captures the yardstick competition arising within a department, given that degree programs compete with each other for funds and resources from the department. Moreover, we include two further ANVUR indicators that are study program-specific. These are the percentage of professors who teach basic subjects and are at the same time reference professors for the study program (i.e. directly engaged in the management of the study program) and the percentage of credits obtained abroad by students. High values of the former may signal that the management strategy defined by the professors who coordinate the program directly affects the process of basic knowledge acquisition by the students and the selectiveness of the program. The latter (credits abroad) can be seen as a proxy for the intrinsic motivation of students and for their income. Indeed, although students in international mobility receive a small scholarship, students coming from lower-income families may not be able to afford the full cost of a foreign stay. Typically, students who are less motivated will not apply to Erasmus programs.

Other important aspects are the shares of full and associate professors, the share of post-docs, and research funding per capita. The average teaching experience of the department professors (as proxied by their role) and the availability of younger colleagues who may help them carry out research and teaching tasks (post-docs) sound like useful control variables. The department staff composition tells something about the division of labour within a department, which may be a key driver of teaching quality, as well as about the pattern of intra-department externalities (see Sect. 2). Also, higher research funding per capita may alter the trade-off between teaching and research efforts, as it may be reflective of an incentive structure biased in favour of research, possibly to the detriment of teaching.

Finally, we control for some characteristics of individual professors such as the average age of professors and the share of women. Younger professors may master the most advanced methodological tools, yet they may lack experience. Women may face a tighter work–life constraint and therefore may have to choose between excelling in teaching and in research.

We also consider fixed effects. We include university dummies to control for unobserved university-specific features that may affect performance.Footnote 9 We control for the level of education: BA-level degree (laurea), MA-level degree (laurea magistrale), and laurea magistrale a ciclo unico (a 5-year degree). Indeed, the knowledge base and motivation of students in different degree types change considerably (see Sect. 2): MA-level students are “better selected” and are interested in more applied topics. We also control for the geographical area (North, Centre, South), as the socio-economic differentials that characterize Italy may have an impact on students’ performance. In the South, with less infrastructure and lower per capita income, students may have fewer resources for their education and lower expectations about job opportunities, and may therefore underperform even if their universities are well organized and house highly skilled professors.

Moreover, our estimates will take account of the irreducible specificities of scientific areas, as discussed in Sect. 2.5, by including the degree type dummies (i.e. Economics, Humanities, Mathematics, Medicine, etc.) and performing estimates on area-specific subsamples (bibliometric vs. non-bibliometric areas). All control variables refer to the academic year 2016–2017.

Descriptive statistics for the variables considered in this study are displayed in Table 1. Means are computed for the whole sample (column 1) and by type of degree program (BA- and MA-level degrees, respectively, in columns 2 and 3). In column 4, we report t-tests to verify whether there are statistically significant differences between BA- and MA-level programs. With regard to the ANVUR indicators on students’ academic performance, we see that, on average, students obtain about 60% of the required ECTS credits within the first academic year, while the percentage of those who progress to the second year with at least 40 credits is on average 49%; finally, about 61% of students graduate within one year beyond the legal duration of the study program. According to column 4, there is a substantial difference between BA- and MA-level programs in all the indicators of students’ performance, with MA-level degree students outperforming BA-level degree students by, respectively, 9.5%, 7.5%, and 23.5%, which confirms the higher ability of students who self-select into master programs vis-a-vis those who enrol in bachelor programs.
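As an illustration of the kind of comparison reported in column 4, the sketch below computes a two-sample t-statistic for the difference in means between BA- and MA-level programs. The text does not say which variant of the t-test is used, so the unequal-variance (Welch) form is an assumption here, and the program-level values are invented.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def welch_t(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """t-statistic for mean(group_b) - mean(group_a), allowing unequal variances."""
    # Standard error of the difference in means, using sample variances.
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_b) - mean(group_a)) / se


# Invented shares (%) of first-year credits obtained, by program.
ba_programs = [55.0, 58.0, 60.0, 52.0, 57.0]
ma_programs = [66.0, 64.0, 69.0, 63.0, 68.0]

t_stat = welch_t(ba_programs, ma_programs)
print(f"difference in means: {mean(ma_programs) - mean(ba_programs):.1f}")
print(f"t-statistic: {t_stat:.2f}")
```

A large positive statistic would indicate that the MA-level mean exceeds the BA-level mean by more than sampling noise alone would suggest.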

Table 1 Descriptive statistics

As for the ANVUR indicators on the satisfaction of graduates, Table 1 shows that the percentage of graduates declaring that they would enrol again in the same program (Program satisfaction I) or that they are completely satisfied with the program they attended (Program satisfaction II) is relatively large, that is, 67% and 84%, respectively. The t-tests in column 4 highlight that the percentage of satisfied graduates is higher for MA-level degree students with respect to the former indicator only, while there is no statistically significant difference between BA- and MA-level degree students in the percentage of graduates who are completely satisfied with the program.

On average, the R score at the degree program level, i.e. the research quality indicator, is about 1, and it is slightly larger for MA-level programs (1.046) than for BA-level programs (0.99). Looking at the other covariates at the degree program level, we see that the student–instructor ratio is higher for BA-level than for MA-level programs (17.6 vs. 7.6), while the percentage of instructors teaching basic topics who are “reference professors” is almost 90% for both BA- and MA-level programs (93% and 86%, respectively). The percentage of female instructors is also slightly higher in BA-level programs. Finally, the percentage of ECTS obtained abroad is very low: 2.1%, and less than 1% in the case of students enrolled in BA-level programs. These significant differences between BA- and MA-level programs in most of the variables we consider may signal that the teaching–research quality relationship works differently depending on the degree type.

5 Econometric Model

The relationship between research quality and teaching quality in study programs is estimated through the following model:

$$\text{Teachingquality}_{idk} = \alpha + \beta R_{idk} + X_{idk}\gamma + Z_{id}\delta + \varepsilon_{idk} \qquad (2)$$

where i denotes the generic university, d the department, and k the generic study program; $\text{Teachingquality}_{idk}$ is a teaching quality indicator for study program k in university i and in department d; $R_{idk}$ represents the research quality indicator based on the Research Assessment grades; $X_{idk}$ is a matrix of control variables for study program k in university i, including also dummies accounting for fixed effects; $Z_{id}$ is a matrix of control variables for department d in university i; $\alpha$ is a constant, $\gamma$ and $\delta$ are the coefficient vectors on the controls, and $\varepsilon_{idk}$ is the error term. β is our coefficient of interest. If positive, it indicates a positive correlation between the research quality of the professors teaching in a study program and the performance of the study program according to the teaching quality indicator considered (such as student academic performance or graduate satisfaction).
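The following minimal sketch shows how a coefficient such as β in Eq. (2) can be obtained by ordinary least squares. It is not the authors' estimation code: the data are invented, the outcome is generated without noise from known coefficients so that the estimate can be checked, a single MA-level dummy stands in for the full set of fixed effects, and the helper `ols` is a hypothetical name.

```python
from typing import List


def ols(X: List[List[float]], y: List[float]) -> List[float]:
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y], one row per regressor.
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("regressors are perfectly collinear")
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        # Eliminate this column from every other row (Gauss-Jordan step).
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Invented program-level data: (R score, students per instructor, MA dummy).
programs = [
    (0.8, 17.0, 0), (1.0, 18.0, 0), (1.2, 16.0, 0),
    (0.9, 7.0, 1), (1.1, 8.0, 1), (1.3, 9.0, 1),
]
# Assumed "true" coefficients: constant, beta on R, ratio, MA dummy.
TRUE_COEF = (0.50, 0.10, 0.02, 0.09)

X = [[1.0, r, ratio, float(ma)] for r, ratio, ma in programs]
y = [sum(c * x for c, x in zip(TRUE_COEF, row)) for row in X]

coef = ols(X, y)
beta = coef[1]  # coefficient on the research quality indicator
print(f"estimated beta: {beta:.4f}")
```

With real data the outcome contains noise, so β would come with a standard error and a significance test, which this sketch omits.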

In some model specifications, we use a slightly different measure of research quality, i.e. the R indicator normalized by instructor-specific academic discipline rather than by academic field. This gives a more precise measure of research quality because it captures relevant differences in research performance that are discipline-specific.Footnote 10

Some of the control variables at the study program level and at the department level are of particular interest, since they can modify the relationship between teaching and research; thus, in the subsequent section, we use some of them to explore whether they are moderating factors for this relationship. Indeed, the theoretical insights summarized in Sect. 2 suggest that, despite the existence of a trade-off between research and teaching efforts from the viewpoint of the individual academic, a positive correlation between teaching and research quality may arise at the study program level due to the multi-unit nature of universities and departments. Thus, to account for the moderating role that some department characteristics may play, we estimate models that include interaction terms. In one model, the research performance indicator is interacted with the number of programs per department. In another model, the research performance indicator is interacted with the number of students per instructor at the study program level. Such interaction terms are meant to capture the effect on teaching quality of yardstick competition among programs within the same department and the effects of funds related to students’ fees.Footnote 11
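One way to write the first of these specifications, as a sketch rather than the authors' exact equation, is to augment the model with the product of the research indicator and the number of degree programs per department. The symbols $N_{id}$ (number of programs in department d of university i), $\theta$, $\lambda$, $\alpha$, $\gamma$, $\delta$, and $\varepsilon_{idk}$ are notation assumed here for illustration:

```latex
\begin{align*}
\text{Teachingquality}_{idk}
  &= \alpha + \beta R_{idk} + \theta \,(R_{idk} \times N_{id}) + \lambda N_{id}
     + X_{idk}\gamma + Z_{id}\delta + \varepsilon_{idk}, \\
\frac{\partial\, \text{Teachingquality}_{idk}}{\partial R_{idk}}
  &= \beta + \theta N_{id}.
\end{align*}
```

Under this form, the association between research and teaching quality is no longer a single number: it equals β when $N_{id} = 0$ and shifts by θ for each additional degree program in the department. The specification with the student–instructor ratio would follow the same pattern, with the ratio in place of $N_{id}$.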

In commenting on these results, a special emphasis will also be put on the competition among teaching staff driven by career concerns, which is captured by the coefficient associated with the number of instructors. Indeed, the number of instructors can also be seen as a proxy for the competition faced by academics, within their degree program, for potential upgrades. In degree programs with more instructors, we expect academics to focus more on research and less on teaching, on average, in order to win the competition for upgrades.Footnote 12 Thus, we expect a negative coefficient for the number of instructors. Such a negative effect due to career concerns may be weaker in MA-level programs, where academics typically teach topics that are closer to their research interests; they may rather prefer to reduce teaching efforts in the more basic BA-level programs.

We finally compare the results obtained on subsamples of bibliometric and non-bibliometric fields.

6 Results

A first estimation exercise considers, as dependent variables, proxies for teaching quality and students’ progress, namely: the average ECTS obtained in the first year, the percentage of students enrolled in the second year after obtaining 40 or more ECTS in the first year, and the percentage of graduates within one year after the ordinary duration of the program. Table 2 collects these results. In detail, the first three columns of Table 2 focus on the average ECTS obtained by students in their first year. Column (1) reports estimates for a sample including all programs, whereas the following two focus on, respectively, only BA-level and only MA-level programs.Footnote 13 The results concerning the other two indicators of teaching quality are similarly organized: columns (4), (5), and (6) for the students with 40 or more ECTS in the first year and columns (7), (8), and (9) for graduates within one year after the ordinary duration of the program.

Table 2 The relationship between teaching effectiveness and research. The effect on student academic performance

From our results, it seems that teaching quality is not robustly associated with research quality. We find positive and (weakly) significant coefficients when focusing on the sample including all programs and on the sample including only MA-level programs, but not for graduates at time N + 1. By contrast, teaching quality in BA-level programs is not significantly correlated with research performance, and the coefficient is even significantly negative when considering graduates at N + 1. As concerns the number of degree programs per department, its coefficient is negative and not significant for BA-level programs, but positive and significant for MA-level degrees (except in the case of graduates at N + 1). Hence, while yardstick competition appears to have no effect on teaching effectiveness in BA-level programs, it has a positive effect in MA-level programs. The coefficient on the student–instructor ratio is positive in all specifications, although it is strongly significant only in the case of MA-level graduates at N + 1. We interpret the positive correlation between the student–instructor ratio and the quality of teaching as a consequence of the fund allocation scheme, where having more students raises the amount of funds devoted to each program.

The coefficient estimates for the number of instructors, a proxy for competition driven by career concerns, are instead consistent across teaching quality proxies. Degree programs with more instructors appear to perform worse in terms of teaching quality. This finding is statistically significant only for BA-level programs, presumably because BA-level programs do not often allow instructors to teach subjects that are close to their research interests.

As regards control variables (reported in Table A.1 in the Appendix), the average age of instructors shows predominantly negative and significant coefficients. Indeed, younger instructors may possess more frontier knowledge on teaching methods and/or on research concepts and tools, which may be valuable especially for MA-level students. We find positive coefficients for the shares of full and associate professors and for per capita research funds. Full and associate professors are supposedly more talented on average than assistant professors, given their academic age, or more experienced. The availability of more research funds per capita allows the acquisition of equipment that may be helpful for teaching and may ease the trade-off between teaching and research efforts by relaxing the time constraint. Because this trade-off is more stringent when teaching is perceived as subtracting precious time from research, it is no surprise to find that the coefficients of per capita research funds lack significance in MA-level programs, where research and teaching are more complementary. Another variable that displays positive and significant coefficients, both for BA-level and for MA-level programs, is the number of ECTS obtained abroad over total ECTS. This may be due to a selection effect: only students who are more motivated and have a higher than average income can afford to visit a foreign university.

Students’ satisfaction, too, could depend on the research performance of instructors. Table 3 presents estimates of our econometric model including, as dependent variables, two alternative satisfaction proxies: the percentage of graduates who declared they would enrol again in the same program (Program satisfaction I) and the percentage of graduates who are overall satisfied about their degree program (Program satisfaction II).Footnote 14

Table 3 The relationship between teaching effectiveness and research. The effect on student satisfaction
Table 4 The relationship between teaching effectiveness and research. Using an alternative measure for research quality

The most striking result in Table 3 is that, whatever the degree level and the satisfaction variable, there is no significant correlation between satisfaction and research performance. In fact, the signs are negative, except in one column. One may argue that top researchers may not possess the teaching skills, the business contacts, or the incentives to make their classes fit the expectations of students who, after graduation, will look for jobs outside academia. Satisfaction may rather be improved by instructors who can offer opportunities for business sector internships and job interviews. This lack of correlation would also be in line with the most recent criticisms of the use of students’ opinions to measure teaching quality (Weinberg et al. 2009; Babcock 2010; Carrell & West 2010; Braga et al. 2014). According to this literature, the most skilled researchers, if they are also more demanding as teachers, would be penalized when evaluated through students’ opinions, as students may seek to minimize effort.

Despite the lack of significant correlations between satisfaction and research performance, the estimates on graduates’ satisfaction bring a few interesting take-home messages (see control variable results in Table A.2 in the Appendix). One is that, for a given research performance, it pays off for degree programs to allocate basic subjects to the professors who are responsible for managing the degree. The coefficients on the corresponding variable are positive and statistically significant, both in BA- and in MA-level programs. Another insight is that more ECTS obtained abroad do not help in terms of satisfaction in the case of MA programs. Perhaps students unfavourably compare their degree of origin with the foreign one, if the latter is better organized; or they may attribute to their degree of origin the responsibility for a weak performance abroad. In these estimates, too, the average age of instructors shows negative coefficients. The coefficient is statistically significant for MA-level degrees, where younger professors, supposing they are on the scientific knowledge frontier, may be most intellectually stimulating.

Table 4 performs the same exercise as in Tables 2 and 3, but using a slightly different research quality indicator. It is represented by the average of the R values (VQR 2011–2014) of all university teachers belonging to each academic discipline, weighted by the ECTS of the related programs. Thus, compared to the research quality measure used so far, the new indicator would represent a more precise measure of research quality because it would capture relevant differences in research performance, which are specific to academic discipline, rather than academic macro-field, to which each instructor belongs. However, such indicator is calculated only for MA-level degree programs.

Table 5 The relationship between teaching effectiveness and research. Controlling for heterogeneity

According to the results in Table 4, the positive correlation is confirmed for all teaching indicators and emerges even more clearly, as can be seen from the larger point estimates, especially in columns (3), (4), and (5). The coefficient on the number of degree programs is still positive and significant in the case of the average ECTS obtained in the first year and of the percentage of students enrolled in the second year after obtaining 40 or more ECTS in the first year. The positive influence of the student–instructor ratio is also confirmed. Conversely, our proxy for career concerns (the number of instructors) does not affect teaching quality negatively, as we would expect, and only shows a statistically significant (positive) coefficient with respect to one of the satisfaction variables (column 5). As for the other control variables (see Table A.3 in the Appendix), negative coefficients on age are confirmed, as well as the negative correlation of graduates’ satisfaction with per capita research funds. ECTS gained abroad keep their positive correlation with teaching quality proxies, except in the case of graduates’ satisfaction.

6.1 Estimates Including Interaction Terms

As the above tables show, the evidence on the effects of yardstick competition and career concerns in multi-task multi-unit universities is, at best, mixed. However, our estimation strategy so far has probably overlooked the essentially non-linear relationship between yardstick competition, research performance, and teaching quality. If resources are distributed among degree programs in the same department through some form of yardstick competition, we expect the relationship between teaching quality and research performance to change across departments characterized by different competitive conditions. Let us set aside the extreme case of departments with a single degree program: in such a case, no competition arises, and therefore instructors will put their efforts into research, in order to achieve their career upgrades, to the detriment of teaching quality. Consider, by contrast, a department with several degree programs. It would be difficult to avoid free-riding behaviours in that case, as strategic interaction among degree programs would be weaker, and each degree program may rather behave in a sort of “price-taking” fashion. The argument by Gautier and Wauthy (2007) may imply a positive correlation between teaching and research quality in departments with relatively few degree programs, where performance comparisons among degree programs are easier. Therefore, in further estimations, which focus on the research performance indicator studied in Table 4, we include an interaction term between the research performance indicator and the number of degree programs per department. Furthermore, we separately analyse two subsamples: one including departments with a below-median number of degree programs and one including departments with an above-median number. The median number of degree programs per department is 7 (considering only MA-level degrees). We expect the coefficient of the interaction term to be positive for degree programs that compete with few other programs in the same department and negative if the number of competing degree programs is larger.

Columns (1), (2), and (3) of Panel A in Table 5 report estimates of the model, using the average ECTS in the first year as the dependent variable. The sign and significance of the coefficients associated with the interaction term in the two subsamples confirm our expectations: the coefficient is 0.0246 and significant in the below-median subsample and −0.0118 and significant in the above-median subsample. Hence, the role of yardstick competition in yielding a positive relationship between teaching and research quality is confirmed when degree programs are few, while it vanishes when the number of competing degree programs is relatively large. The direct effect of the research performance indicator keeps its positive sign in the whole sample (0.0720) and is even stronger in the above-median subsample (0.1373). Similar results hold for the direct effect of the number of degree programs (0.0064 in the whole sample and 0.0084 in the above-median subsample, while it is negative in the below-median subsample). Finally, we replicate the above analysis of heterogeneous effects using graduates’ program satisfaction as the measure of teaching quality and find similar results, as reported in columns (1), (2), and (3) of Panel B in Table 5. In particular, the interaction term is positively associated with program satisfaction where the number of degree programs in the department is low (column 2), confirming the role of yardstick competition in modulating the research–teaching relationship.
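A short worked example may help in reading the above-median estimates. The sketch below combines the reported direct effect (0.1373) and interaction coefficient (−0.0118) into the implied association between research and teaching quality at a given number of degree programs. It assumes that the number of programs enters the interaction in levels and uncentred, which the text does not state; the function name is hypothetical.

```python
def marginal_effect(direct: float, interaction: float, n_programs: float) -> float:
    """Implied change in the teaching indicator per unit of R at n_programs."""
    return direct + interaction * n_programs


# Coefficients reported in the text for the above-median subsample.
DIRECT_ABOVE = 0.1373
INTERACTION_ABOVE = -0.0118

# Number of programs at which the implied association changes sign.
break_even = -DIRECT_ABOVE / INTERACTION_ABOVE

for n in (8, 10, 12):
    effect = marginal_effect(DIRECT_ABOVE, INTERACTION_ABOVE, n)
    print(f"{n} programs: {effect:+.4f}")
print(f"sign change at about {break_even:.1f} programs")
```

Under these assumptions, the implied association is still positive just above the median of 7 programs and shrinks towards zero as the number of competing programs grows, which is consistent with the weakening of yardstick competition described in the text.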

Table 6 The relationship between teaching effectiveness and research. Bibliometric vs. non-bibliometric sector
Table 7 Bibliometric vs. non-bibliometric sector. Heterogeneity by # of degree programs in department
Table A.1 The relationship between teaching effectiveness and research. The effect on student academic performance—complete specification of Table 2

Another possible source of heterogeneity in the teaching–research quality relationship concerns the student–instructor ratio. For high-performance researchers, teaching time may have a rather high opportunity cost, but this can be mitigated if students are fewer. A smaller number of students allows instructors to customize teaching methods and to involve students in research-intensive activities (e.g. data collection, experiments, discussion of scholarly articles). To capture such a form of non-linearity, here too, we split the sample at the median of the student–instructor ratio, which is equal to 7.325. According to the above insights, we expect a positive correlation between teaching and research quality, especially in degrees with a below-median student–instructor ratio, and a positive correlation between the teaching indicator and the student–instructor ratio, while the interaction term (research quality times student–instructor ratio) should feature a negative coefficient in degrees with a below-median student–instructor ratio. This is because, as shown in Sect. 2, when the budget sharing rule adopted by departments to allocate resources among study programs is based on both research performance and the number of students, as occurs in Italy, smaller classes also imply a lower amount of funds available to departments for career advancements and research. This limits, on the one hand, the incentive for academics to free ride on teaching activity, by raising the appropriability of teaching effort by academics, and, on the other hand, the scope for winner-picking in research activity. The results in columns (4), (5), and (6) of both Panels A and B in Table 5 confirm our expectations. In Panel A, we find that the positive influence of research performance, which emerges in the whole-sample analysis, seems to be driven by the below-median subsample, where the coefficient increases in size and is still significant (even if weakly). The analysis of program satisfaction in Panel B of Table 5 bolsters our expectations. The coefficient associated with research performance in the whole sample is 0.4797 and strongly significant; it rises to 0.6419 in degrees with a below-median student–instructor ratio and loses significance in the above-median subsample. Finally, the interaction between research performance and the student–instructor ratio has a negative coefficient, which is significant and larger in the below-median subsample, but smaller and not significant in the above-median subsample.

6.2 Bibliometric vs. Non-bibliometric Fields

We now explore the heterogeneity in the teaching–research relationship by bibliometric field.Footnote 15 The results in Table 6 show the link between the average ECTS acquired by students during the first academic year and the research indicator normalized by academic macro-field. As regards bibliometric fields, we find a positive correlation between our teaching indicator and the instructors’ performance in research, which arises in both BA- and MA-level degree programs. The positive result for BA-level degree programs is not surprising, given the less generalist nature of hard science programs, which makes the transfer of knowledge suitable even for younger and unselected students. As for the non-bibliometric fields, by contrast, we find that research is negatively associated with the academic performance of BA-level students. This is probably a consequence of the more generalist nature of social science programs. However, these results could also depend on measurement error in the indicator we use for the research performance of non-bibliometric instructors; the R indicator would mainly capture the scientific products of international relevance, penalizing domestic publications, which are more common in social science (Shin 2011).


Interestingly, results are more homogeneous when we control for the moderating effect of the number of degree programs in the department (Table 7), from which we can infer the main role of the yardstick competition in influencing the teaching–research relationship for both bibliometric and non-bibliometric sectors.

Table A.2 The relationship between teaching effectiveness and research. The effect on student satisfaction—complete specification of Table 3

6.3 Assessment

Overall, our estimates offer some food for thought. A first take-home message is that scientific performance and teaching quality move together in line with yardstick competition among degree programs activated in the same department. The positive conditional correlation between research and teaching indicators, measured at the study program level, is stronger in departments where degree programs are relatively few and can be immediately compared, and declines when the degree programs competing for the resources allocated by their department are many. This is the message from the coefficient estimates on the interaction between research performance and the number of study programs per department. Such results suggest that the multi-unit and multi-task nature of Italian university departments, along with the lack of alignment between university goals and individual goals, subverts the trade-off that would otherwise characterize individual decisions about teaching and research efforts.

Second, the teaching–research relationship is best understood by analysing relatively homogeneous subsamples of degree programs. BA-level and MA-level students differ in terms of knowledge base, learning potential, and goals. At the same time, professors may have different expectations from students at different education levels and may tune their teaching style accordingly. Complementarity between teaching and research is less likely in BA-level programs, where the basics are taught and topics are far from the scientific interests of professors, and indeed we find the teaching–research relationship to be weaker in the subsample focused on BA-level degrees.

Our estimates can only be interpreted as correlations since we do not have information on the identity of universities and therefore cannot rely on a causality identification strategy. Yet, some insights on how to unleash an effective knowledge transmission can be drawn from further econometric exercises. In particular, the specifications including interaction terms confirm that yardstick competition and a budget sharing rule, which includes the number of students, are essential in order to allow a more effective transmission of advanced knowledge to students.

7 Concluding Remarks

The growing research orientation of universities in recent years has fostered an intense debate among academics on its consequences for teaching activities. There are, in fact, several arguments both in support of and against the complementarity between the two main university missions: teaching and research. Empirical evidence from previous studies is mixed. Therefore, the question of whether being a good researcher also implies being a good teacher remains open.

This study contributes to the ongoing debate in that it examines the relationship between teaching and research in the Italian university system. To do so, we use a rich dataset provided by ANVUR on the study programs of all Italian universities and measure the quality of teaching using both students’ academic performance and their degree of satisfaction with the programs attended. Our analysis suggests that the involvement of good quality researchers in the program supports mostly the academic career of MA-level degree program students and increases their program satisfaction, once they graduate. On the contrary, with regard to the BA-level degree program students, we find some negative correlation between teaching and research quality.

An interesting result that emerges from our study is the heterogeneous effect in the teaching–research relationship, which stems from the multi-unit organization of departments. In particular, we find that the positive correlation between research and teaching indicators is stronger in departments where degree programs are relatively few and shrinks when the number of the degree programs competing for the resources allocated by their department increases, suggesting the major role played by yardstick competition in shaping the teaching–research relationship in the Italian universities.

Appendix

Table A.3 The relationship between teaching effectiveness and research. Using an alternative measure for research quality—complete specification of Table 4