Students’ opinions on teaching and services provided by the Italian Universities: a proposal for a new evaluation scheme

In Italy, the internal effectiveness of academic training courses has been evaluated for over 20 years through periodic surveys of students’ opinions on teaching and related services. The first proposal to harmonize the various measurement methods adopted by the universities was advanced by the former National Committee for the Evaluation of the University System in 2000, and it remained the reference model until 2011, when the first Board of Directors of the National Agency for Evaluation of University and Research (ANVUR) took office. The Agency’s attempt, within the AVA (Self-assessment, Periodic Evaluation and Accreditation) methodological framework, to enrich and update the survey highlighted a number of critical issues, essentially linked to the ways and times in which students participate, compared with the ways in which the universities’ course offerings are organized. Taking a cue from these critical issues, this paper proposes a new, simpler and more rational evaluation model that maintains substantial continuity with the inspiring principles of the previous frameworks and seeks to consolidate the monitoring efforts made by the universities to date.


Introduction
In Italy, since 1999 (with Ministerial Decree No. 370/99), the periodic collection of the opinions expressed by students about the educational characteristics of the courses has been the responsibility of the universities. Implicit in the legislator's request was an efficient and effectively concise treatment of the information collected, aimed at ascertaining whether there were margins for raising the quality of the academic educational offer.
The Italian National Committee for the Evaluation of the University System (CNVSU, established by the same law) immediately moved in this direction, promoting the main results obtained over a three-year period by a specially appointed research group (Chiandotto and Gola 2000; Gola et al. 2002). The group's guiding principle was the need to implement a survey capable of producing information complementary to the "career" information usually available in the administrative archives. The survey was to become part of a more general verification process in which the University's governing bodies, at every level of the didactic offer (faculties, courses of study, teaching activities), could be put in a position to obtain appropriate evaluations on topics such as teachers' didactic abilities, training objectives, disciplinary updating and content level, coordination between lessons with regard to the general formative profile, and the adequacy of the resources.
In particular, the work of 2002 was strongly promoted by the CNVSU after it observed the high degree of heterogeneity "in terms of articulation, level of completeness and legibility" of the methods used by the universities that had already activated policies for evaluating the quality of teaching as perceived by the students. While underlining that the use of a questionnaire was indispensable, the research team observed how "… the methods of administration, the use of open questions and their possible elaboration (…)" were "so different to make the reconstruction of a single scenario at the national level almost impossible, even if of a very general kind". These observations led to the proposal of the first survey model, essentially based on a paper questionnaire. The questionnaire comprised a minimum set of 15 questions organized in 5 thematic sections (Organization of the Study Program, Organization of teaching, Educational and study activities, Infrastructures, Interest and satisfaction), and the universities could supplement it with specific questions at any hierarchical level of the educational offer. Indications were also given on the scale to be used for recording individual answers and on the most suitable time for administering the questionnaire. As regards the response scale, the working group rejected any hypothesis of using scales with an odd number of categories, so as to force the respondent towards a non-neutral position. They proposed a scale composed of four balanced categories (two positive and two negative), which was held to combine immediate comprehensibility with an intrinsic ability to ensure higher response rates than the other scales then used by the universities. The 4-point scale also had the acknowledged advantage of fitting the layout prepared for paper delivery in A4 format.
The interval between one half and two thirds of the course was considered the most suitable time for filling out the questionnaires: the right compromise between a level of attendance that would allow the students to make an appropriate assessment and the possibility for the teacher to make the first corrective interventions (Chiandotto 2002, 2004). The choices made on the methods of administration involved an aspect that turned out to be the essential prerequisite for conducting the investigation: the survey, managed in PASI mode (Paper Aided Self-Interviewing), necessarily had to be restricted to attending students. For this reason it was known for at least a decade as the evaluation of teaching activities by "attending students". In fact, administering a paper form in this way made it possible to collect only the opinions of those who were present in the classroom on the day of administration.
At the end of 2006 (with Law No. 286 of 24 November 2006, which converted, with amendments, Decree-Law No. 262 of 3 October 2006) the National Agency for Evaluation of University and Research (ANVUR) was established, with the simultaneous termination of the activities of the CNVSU. Since the inauguration of its first Board of Directors (which took place on May 2, 2011, after the publication of Presidential Decree No. 76 of 02/02/2010, which established its structure and functioning), the Agency has undertaken to establish a series of methods for the accreditation and periodic evaluation of the effectiveness of the training offer and the research programs of the universities. Among these, the most important is the AVA system (Self-assessment, Periodic Evaluation and Accreditation), conceived with the primary objective of enhancing the processes of self-assessment of the quality of university courses.
The survey of the students' opinions thus becomes one of the main elements of this integrated monitoring system, because it is present as a thematic section of the annual report named Scheda Unica Annuale (SUA-B6 and B7) and serves as useful information for drawing up the "Riesame" and "Commissione Paritetica Docenti Studenti" reports. The role attributed to the survey, first by the ministerial governing bodies and then by the ANVUR, is so important that it calls for collecting the opinion of all students (including those not attending), as well as of graduating students, graduates and teachers. From a model based on a single questionnaire, we then moved on to a model based on 7 questionnaires, to be distributed at different times to enrolled students, graduates and teachers. Bertaccini (2016) contains a comprehensive analysis of the current regulations and guidelines regarding the evaluation of university teaching by the students, and illustrates the main critical issues impeding a correct methodological structure for the surveys provided for by the AVA system. The analysis was accomplished by comparing the provisions of the ANVUR with what has actually been carried out and implemented by the universities, both from the logical point of view and in relation to the new computerized technologies provided for by the Agency itself.
In this paper we intend to follow up the critical issues highlighted above, proposing a new evaluation system based on a reduced number of questionnaires and a reduced number of items per questionnaire, operating in continuity with the inspiring principles of the previous system.

Evaluation of teaching according to the ANVUR-AVA model
The AVA system moves from the so-called "Chiandotto-Gola" model, promoted by the CNVSU and based on a single evaluation form given to attending students, to a model consisting of 7 forms to be administered to enrolled students, graduates and teaching staff.
ANVUR-AVA questionnaires consist of questions that are generally the same as those found in the CNVSU form. More specifically, the CNVSU form has been broken down into AVA-Form 1 (dedicated to the evaluation of lessons and to be completed once 2/3 of the lessons have been carried out) and the first part (part A) of AVA-Form 2 (dedicated to the evaluation of the organization of the Study Program, classrooms, equipment and support services and to be completed at the beginning of the academic year starting from the second year of enrolment). The reasons leading to this decomposition can be found in the appreciable desire to solve a methodological error that characterized the initial model. With the CNVSU questionnaire, each student, for each lesson attended, was asked to produce a duplication of the evaluations regarding the organization of the study course, equipment, classrooms and support services: these aspects, in the hypothesis of consistency of the answers provided in the same academic year, then received implicit weighting related to the number of teaching activities evaluated inside each course program.
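The implicit weighting described above can be illustrated with a minimal sketch. The data are invented for illustration (two hypothetical students answering on the 4-point scale), and the function names are assumptions rather than anything defined in the CNVSU or ANVUR documents.

```python
from statistics import mean

# Hypothetical data: each student repeats the same rating of the study
# programme organisation on every course form filled in during the year.
forms_filled = {"student_a": 5, "student_b": 1}      # forms per student
programme_rating = {"student_a": 2, "student_b": 4}  # 4-point scale


def mean_per_form(forms, rating):
    """Average over all forms: a student counts once per course evaluated."""
    pooled = [rating[s] for s, n in forms.items() for _ in range(n)]
    return mean(pooled)


def mean_per_student(rating):
    """Average over students: each student counts exactly once."""
    return mean(list(rating.values()))


per_form = mean_per_form(forms_filled, programme_rating)
per_student = mean_per_student(programme_rating)
print(per_form, per_student)
```

In this invented case the per-form average is pulled towards the student who evaluated more teaching activities, which is the duplication effect that splitting the CNVSU form into AVA-Form 1 and part A of AVA-Form 2 was meant to avoid.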
Forms 3 and 4 differ from Forms 1 and 2, respectively, only in their reduced number of questions, as they are intended for students who declare an attendance rate lower than 50%. In other words, they exclude those aspects on which a student can reasonably give an opinion only after attending an adequate share of the lessons. In Forms 2 and 4, the only novelty compared to the previous CNVSU model concerns a section (Part B) devoted to the assessment of the exam, to be completed for each course for which the final examination was taken.
Two other questionnaires were also introduced to measure the same quality aspects of teaching at the time of graduation (Form 5) and 1, 3 or 5 years after achievement of the degree (Form 6). Lastly, Form 7 is a re-adaptation of Form 1 for the teachers in charge and should be completed, for each course, after 2/3 of the lessons; it is aimed at allowing a verification of the critical issues that could emerge from the survey on students.
In the AVA document, ANVUR states that "it is intended to generalize surveys in online mode" (section G of the final AVA document of 24 July 2012); consequently, "… it is necessary that the Universities prepare procedures to make the compilation mandatory". The requirements for filling in the forms provide clear indications in this direction: if Form 1 (or 3) has not been completed after 2/3 of the lessons, it has to be filled in at the time of booking the exam at the end of the course. Nevertheless, the operative proposals issued by the same Agency also allow the use of optical reading questionnaires, because of the difficulty some universities still have in preparing suitable web-based administration tools.
The same operational proposals also define the timing of administration and the units of detection (all the courses that provide a total number of CFUs exceeding 3).

The main critical issues of the ANVUR-AVA model
As mentioned in the introduction, an in-depth analysis of the requirements and guidelines that define the ANVUR-AVA model has already been carried out (Bertaccini 2016). In this section we will only briefly review the main critical issues that emerged after 6 years of application of this model.
It has been stated that the set of items on which the ANVUR forms are based is, in general, the same as the one composing the CNVSU form. From AVA-Form 1, however, the question on overall satisfaction was curiously eliminated; it was an item repeatedly requested and analyzed by both academic and ministerial governing bodies because it is considered a valid synthesis of the mnemonic-cognitive processes by which students quantify all their evaluations. Some of the initial questions also continue to raise doubts about how students interpret them (for example the item: "Does the teacher explain the arguments clearly?"), with the risk that the concepts measured differ from those the questions were built to capture. The 4-point balanced response scale was also inherited from the CNVSU model ("definitely not", "more no than yes", "more yes than no", "definitely yes") in order to force the respondent towards a non-neutral position. The justifications given in the CNVSU documents to endorse this choice (the immediate comprehensibility and the intrinsic ability to ensure higher response rates compared to other scales) are not, however, entirely convincing. In fact, in this context, the scale that seems to respond most adequately to the requirements of comprehensibility, familiarity and consequent intrinsic ability to raise response rates is the equidistant 10-point scale (1-10), which induces the expression of judgment by analogy with the scholastic experience (Various Authors 2008). Moreover, this scale does not require any coding of the response categories, a coding that the CNVSU instead suggested applying to the ordinal categories of the 4-point scale to facilitate the interpretation of the indicators during analysis, according to a conversion rule that placed the threshold of sufficiency at the value 7 (Chiandotto and Gola 2000).
The desired transition to online administration is strongly linked to the technical specifications of the applications in use by the universities. As regards the time window of administration, ANVUR requires that the evaluation of teaching activities be possible only once two-thirds of their duration have been reached. In general, this request poses technical and practical difficulties due to the organization of the courses during the reference period, depending on their start date, the number of assigned CFUs, and the distribution of the hours over the weeks of teaching activities planned in the calendar. In other words, the correct computation of the administration window for each teaching activity would be feasible only if the timetable of the lessons of each course were integrated in the university databases, which, at the moment, cannot be guaranteed by all universities. The obligation to complete the questionnaire upon booking the final exam (provided that the form has not been filled in previously) is a technical-political solution adopted and shared by many universities, which follows the clear logic of raising the level of coverage of the monitored lessons and consequently extending the survey to all active students. However, this method of administration is often criticized by teachers, who fear a superficial or even random compilation of some forms.
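To make the dependence on the lesson timetable concrete, the following is a minimal sketch of how the opening of the administration window might be computed once the timetable is available. It assumes that "two-thirds of the duration" is counted in lessons and that the window opens on the day of the lesson that reaches that share; the function name and the timetable are hypothetical and do not come from the ANVUR documents.

```python
from datetime import date, timedelta


def window_opening(lesson_dates):
    """Return the date of the lesson that completes two-thirds of the course.

    Assumption: the share is counted in lessons, not in hours or CFUs.
    """
    if not lesson_dates:
        raise ValueError("the timetable is empty")
    ordered = sorted(lesson_dates)
    # Integer ceiling of 2n/3, avoiding floating-point rounding.
    needed = (2 * len(ordered) + 2) // 3
    return ordered[needed - 1]


# Hypothetical timetable: 12 weekly lessons starting on 4 March 2024.
timetable = [date(2024, 3, 4) + timedelta(weeks=k) for k in range(12)]
opening = window_opening(timetable)
print(opening)
```

Without the list of lesson dates the function has nothing to work on, which is the practical obstacle noted above for universities whose databases do not hold the timetable.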
The AVA questionnaires are in fact numerous, excessively long (thus encouraging a lack of attention in the compilation) and often perceived (both by the governing bodies, and above all by the students) as a useless bureaucratic burden. This becomes particularly true when the obligations of AVA Form 1 (or 3) have to be replicated for all teaching activities that compose the so-called integrated courses or for courses provided by co-teachers. The evaluation is certainly a right for students but, in the light of the normative references indicated above, it becomes in fact also a duty. Thus, in parallel, it becomes a right for the teacher to be evaluated. The evaluation should therefore be guaranteed for all teaching activities, including those with a number of CFUs less than 4 (limit established by the current ANVUR operational proposal). The effective replicability of the administration of AVA-Form 1 (or 3) for each of the teachers involved in an integrated course is however difficult to propose at this time, due to the large number of questions it consists of.
Part B of Form 2 was introduced following several positive experiments conducted at the local level on the evaluation of the examination procedures. In this case, the technical difficulties lie only in the impossibility of identifying the best moment at which to make completion mandatory (here ANVUR does not provide any indications). Each proposed solution (acceptance of the grade when it is recorded online, the beginning of the next semester, the booking of the next exam, the first post-examination access to the web services) in many cases entails advantages, but also significant disadvantages.
AVA-Forms 5 and 6 can realistically be administered thanks to the technological support of the ALMALAUREA Consortium, which has for years been surveying the external effectiveness of academic qualifications. Form 5 is already an integral part of the so-called "Profile" survey of graduates, and is generally administered by checking that it has been filled in when the thesis application is presented. The administration of Form 6 is currently suspended (the form is only proposed for compilation on a voluntary basis); it is in fact believed that a re-evaluation of the quality of teaching received 3 or 5 years after the achievement of the university degree is of very questionable value. For example, difficulty in entering the world of work, or dissatisfaction with the job carried out, risks impairing the ability to judge.
The administration of AVA-Form 7, which could prove extremely useful in the analysis of the assessments obtained from Form 1 (or 3), remains in the complete autonomy of the universities that have, however, very few tools to impose the compilation on teachers.

A new evaluation model
The proposed new evaluation model is divided into 4 questionnaires, which are the result of the experience accrued by the authors in the institutional positions they have held at the local and national levels of governance, as well as in their role as contact persons for the SIS-VALDIDAT project, which to date has been joined by about twenty Italian universities (Bertaccini 2006). The new model therefore takes into account the critical issues set out above, while trying to maintain substantial continuity with the methodological evaluation framework to which the universities have long been accustomed.
The first questionnaire (Form A; see Table 1) is designed to replace Forms 1 and 3 of the ANVUR-AVA model. It is therefore aimed at all students, for all the teaching activities that make up their study plan. It is deliberately based on a few questions concerning specific aspects of the teaching provided, so that it can be administered repeatedly for all the modules (with any number of CFUs) or for all the co-teachers engaged in the same activity.
The form includes two initial filter questions (section F) aimed at acquiring both the correct "coverage" of the teaching (matching academic year and teacher/s in charge, in order to quantify the distance between the year of any attendance and the year in which the evaluation was carried out) and the declared level of attendance. The attendance level acts as a filter on the questions related to punctuality and to the ability of the teacher to stimulate interest and attention: it is believed that those who declare an attendance of less than 2/3 of the lessons may not be able to assess these aspects adequately.
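The filter logic just described can be sketched as follows. This is only an illustration: the answer codes follow the attendance question of Form A, while the item keys (`study_load`, `overall_satisfaction`, `punctuality`, `stimulates_interest`) are hypothetical placeholders, since the form's own item codes for these questions are not given here.

```python
# Answer codes of the attendance filter question (F2) of Form A.
F2_OPTIONS = {
    1: "More than 2/3 of the lessons",
    2: "Between 1/3 and 2/3 of the lessons",
    3: "Less than 1/3 of the lessons",
    4: "Never attended",
}

# Hypothetical item keys, used only for this illustration.
ALWAYS_SHOWN = ["study_load", "overall_satisfaction"]
ATTENDANCE_ONLY = ["punctuality", "stimulates_interest"]


def items_to_show(f2_answer):
    """Return the section D items offered for a given F2 answer.

    Items that presuppose regular attendance are offered only to
    respondents who declare more than 2/3 of the lessons (code 1).
    """
    if f2_answer not in F2_OPTIONS:
        raise ValueError(f"unknown F2 answer: {f2_answer!r}")
    if f2_answer == 1:
        return ALWAYS_SHOWN + ATTENDANCE_ONLY
    return list(ALWAYS_SHOWN)


for code, label in F2_OPTIONS.items():
    print(code, label, items_to_show(code))
```

Keeping the filter as a single function of the F2 answer means the same short form can be reused for every module or co-teacher without branching into separate questionnaires for attending and non-attending students.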
With respect to the existing system, the questions on the teacher's clarity of exposition and on the interest in teaching subjects (the usefulness of a teaching cannot be confused with the interest that students feel for the same) have been eliminated. The question on the sufficiency of the preliminary knowledge possessed for understanding the topics provided for in the course has also been eliminated; its evaluation should be the responsibility of the governing body of the degree course program. The remaining questions that require a level of satisfaction (section D) have all been reformulated in a neutral way so as not to influence the respondent's cognitive process. Added to these is the question on overall satisfaction, which is not present in the current ANVUR-AVA evaluation framework. For the reasons explained in the previous section, the judgments relating to this section would therefore be expressed using the equidistant scale 1-10.

Table 1 Form A (in bold, the items not proposed to those who declare a frequency of less than 2/3 of the lessons)
F1 Indicate the teacher responsible for the teaching activity (drop-down menu with teaching coverage list in the last 10 A.Y.)
F2 What was your attendance rate at the course?
   1. More than 2/3 of the lessons
   2. Between 1/3 and 2/3 of the lessons
   3. Less than 1/3 of the lessons
   4. Never attended
D1 Proportionality between the required study load and credits assigned to the teaching activity
Almost all the pre-formulated suggestions already present in the ANVUR-AVA form (section S) are confirmed, with the exception of the activation of the evening courses (an element that may not be available or possible in the study course). This section should only require a substantially short compilation time since any non-response to the single suggestion is equated with a non-identification of the relative criticality.
Finally, the free comments section (section C), which teachers find very useful and have requested, is reintroduced.
The choices made are in line with the general objectives of the survey and, as mentioned, allow a wide margin of continuity with the previous assessment frameworks. It is also proposed to confirm the administration windows and the survey methods of the current system (evaluation on the web platform, indicatively from 2/3 of the course, with compulsory completion when booking the exam if the form has not already been completed).
The second questionnaire (Form B; see Table 2) is designed to replace Forms 2 and 4 of the current ANVUR-AVA model. It targets all enrolled students and must be filled in at the first access to the university web services after the conclusion of each teaching period (semester, quarter) of reference, and in any case within 2 months of its conclusion. In this case too, the form uses two initial filter questions (section F) to reduce the number of questions offered to those who declare that they have attended less than half the scheduled activities.
There are only 9 questions (section D) for regular attendees, reduced to 4 if the student declares occasional or partial attendance in relation to the number of teaching activities offered in the period (filter question F1). Compared to Forms 2 and 4 (part A) of the ANVUR-AVA system, the questions have been reviewed and reformulated in a neutral manner so as not to influence the respondent's cognitive process. Furthermore, it is proposed to eliminate almost all the aspects related to the post-examination phase (part B of Forms 2 and 4) from the evaluation process so as not to weigh down the overall structure of the survey, also in the light of the different administration time and the associated technical difficulties, mentioned above, faced by the universities.
The congruence between the methods of examination declared by the teachers during the activities and the methods by which the examinations were actually carried out, however, is an aspect that can conveniently be added to the list of aspects to be assessed at the end of each teaching period, because it is certainly of interest for the governance and coordination of the study courses. Consistent with the choices made for Form A, the judgments related to this section should also use an equidistant scale from 1 to 10.

Finally, in this form too a free comments section (section C) is provided, to be filled in optionally.

Table 2 Form B (in bold, items not offered to those who declare to have attended less than half the courses scheduled in the reference period)
F1 How many teaching activities planned for the semester just ended did you attend regularly?
   1. All or almost all
   2. More than half
   3. Less than half
   4. None or almost none
F2 (if F1 > 2) Why did you rarely attend scheduled activities/lessons? (indicate the prevailing reason)
   1. Working student
   2. Off-site student
   3. Because the organization and content of the teachings did not make it necessary
   4. Other
D1 Congruence between credits and the study load required by the courses scheduled in the teaching period just ended
D2 Compatibility of the timetable of the lessons with the individual study activity required by the courses scheduled in the teaching period just ended
D3 Coordination of the contents of the courses scheduled in the teaching period just ended
D4 Adequacy of the classrooms in which the lessons were held (accessibility, capacity, visibility, acoustics, air conditioning, Wi-Fi)
D5 Adequacy of technical/IT laboratories (accessibility, capacity, level of instrumentation update, air conditioning, Wi-Fi)
D6 Adequacy of classrooms and facilities dedicated to individual study (accessibility, capacity, air conditioning, Wi-Fi)
D7 Adequacy of students' secretarial services
D8 Adequacy of other support services (web services, libraries, canteens)
D9 Congruence between the methods of examination declared by the teacher and the methods via which they were actually carried out
C Free comments

[Fragment of a further table, presumably Table 3 (Form C); the question text is missing]
   1. Yes, in the same course at this university
   2. Yes, but in another course at this university
   3. Yes, in the same course but at another university
   4. Yes, but in another course and at another university
   5. No, I would not enrol at university again
The third and last questionnaire (Form C; see Table 3), to be filled out compulsorily at the time of the thesis application, is designed to replace the "Graduates" Form 5 of the current ANVUR-AVA model. It deliberately contains more questions than the forms presented above because of the wider spectrum of aspects investigated. Since it is administered only once for each qualification, the time required to fill it in is greater, although the presence of 3 filter questions (section F) can significantly reduce its length depending on the initial statements made by those who are about to graduate.
The aspects evaluated (section D) are in principle the same as those composing Form B, with the addition of some questions dedicated to the internship experience, any experience of study abroad, and the relationship with the thesis supervisor (analogous to the contents of AVA-Form 5). In this case too, the items have been reviewed and reformulated in a neutral manner because of the aforementioned need not to influence the respondent's cognitive process. In line with the choices made for Forms A and B, the judgments related to this section should also use an equidistant scale from 1 to 10.
Another proposal concerns the elimination from the assessment system of the "re-evaluation" questionnaires of the teaching to be delivered 1, 3 and 5 years after the achievement of the degree because, as mentioned, the difficulty of entering the world of work or reasons of dissatisfaction with the job carried out would risk making the assessment capacity less objective. As a result, these forms would in fact end up being an appendix of the evaluation questionnaires of the external effectiveness of the degree titles issued by the universities, lengthening the compilation times and increasing non-response and/or interruption rates for such investigations.
Finally, while recognizing its usefulness and underlining the impossibility of making it compulsory, it is considered that the administration of AVA-Form 7 should be delegated to the free choice of the academic organs of government.

Conclusions
The evaluation of the quality of the teaching provided, a subject that has always been widely debated, should take into consideration numerous elements that certainly cannot be reduced to the results of surveys on students' opinions. However, in the Italian AVA system, the internal effectiveness of the training processes is evaluated by assessing the opinion of the students, both during the course and ex post. Since ongoing surveys play a key role, this work, which has been inspired by the critical issues found in the evaluation model currently used in universities, proposes a new methodological system that operates in continuity with the inspiring principles of the previous frameworks. Specifically, the preparation of guidelines that can be easily implemented (i.e. based on simple questionnaires, with a low impact on compilation times and easy to administer with the technological equipment of the universities) and the sharing of analysis tools and indicators could contribute to the improvement of the AVA system. At the same time, this could prevent the new evaluation system from being perceived by those operating in the university system as yet another intangible bureaucratic burden, with a negative influence on the quality of the survey.
The proposal presented in this work is in line with the historical objectives of the survey on students' opinions and is based on the need to raise the level of coverage of the monitored lessons and, consequently, to extend the survey to all active students. This objective is achievable only by relying on electronic administration systems, although these are often criticized by teachers, who fear a superficial or even random compilation of some forms. While this risk is undeniable, a return to paper administration (as suggested by some) is not necessarily the only solution to the problem. For example, today's telecommunications technologies provide mobile applications (so-called apps) that teachers could use to encourage compilation from 2/3 of the lessons onwards, as was the case for paper delivery.
There is undoubtedly a great deal to be done on the methods for analyzing and synthesizing the data collected by the evaluation systems. It would be desirable for the community of Italian statisticians involved in evaluation to succeed in formulating a unanimous methodological proposal for resolving these problems (Biggeri 2000). It would also be important for the community itself to concentrate its efforts on identifying a single tool for analyzing and consulting the evaluation results, capable of responding to the needs of local and national government bodies that currently need to request this information from the Statistical Offices or the Evaluation Support Units of the universities.
As a last reflection in the margins of this proposal, we wish to highlight the final passage of the introduction of the first report produced by the CNVSU working group, a passage that today assumes the outline of a prophetic warning. The indicators coming from the students' opinions must not be the only element but one of the many on which to base the evaluation of academic teaching. And "it is important that these indicators are not used for automatic reward/sanction mechanisms, but instead pass, together with other information, through the filter of a competent judgment, consistently with a correct University Quality Assurance policy". In full awareness that the use of these indicators in individual "reward" procedures should be avoided in the light of the various critical form and method aspects set out above, it should however be noted that adequate and thoughtful forms of "reward" could contribute actively to the dissemination of the culture of the evaluation of teaching. The teaching body could therefore have a different approach to the evaluation process, clarifying to the students the purpose of the survey and stimulating them to evaluate in an objective way without waiting for the booking of the exams.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.