Introduction

In recent years, we have witnessed a progressive evolution of assessment processes that has changed the focus of attention towards students’ strategic and lifelong learning. From the recall of knowledge, emphasis has moved onto students’ ability to respond to assessment tasks that are divergent rather than convergent and complex rather than simple (Sadler 2012). Among the approaches accompanying this refocusing are assessment for learning (Carless et al. 2017; Lai 2006; Sambell et al. 2013), learning-oriented assessment (Carless et al. 2006; Carless 2015), assessment as learning (Dann 2014; Earl 2013), sustainable assessment (Boud and Soler 2016; Nguyen and Walker 2016) and assessment as learning and empowerment (Rodríguez-Gómez and Ibarra-Sáiz 2015).

In these different approaches, particular significance is given to participatory modalities of assessment, such as self-assessment and peer assessment. The reviews carried out by Dochy et al. (1999), Gielen et al. (2011) and Panadero (2016) illustrate the variety of ways in which peer assessment can be implemented in practice.

Given the centrality of peer assessment in reforming assessment practices, the purpose of this paper is to examine how peer assessment practices can be analysed and thereby improved. It develops an exploratory and predictive model that considers the key variables involved in peer assessment. To do this, it discusses theoretical foundations that suggest possible causal relationships between relevant variables. The model developed is validated using data gathered over four academic years from peer assessment activities with undergraduates taking an Economics and Business degree in Spain. The students experienced peer assessment as part of the course, using the EvalCOMIX® web service, which was expressly developed to promote participatory modes of assessment (Ibarra-Sáiz and Rodríguez-Gómez 2017).

Specifically, this paper aims to:

  • Provide a predictive model of students’ competence development, based on the practice of peer assessment, that illustrates the relationships between variables such as evaluative judgement, participation, feedback, self-regulation and the quality of the assessment.

  • Offer an instrument that facilitates analysis and understanding of the perception of university students about peer assessment practices using technological resources.

  • Orientate the practice of peer assessment towards those aspects with the greatest potential for improving students’ competence development.

Framework and development of hypotheses

Practicing peer assessment

Until the 1990s, assessment processes in universities tended to focus on what students knew. Students were assessed, above all, on their understanding of some domain of specific knowledge within the subject area they studied. Progressively, the emphasis has been refocused onto what students can do and the value of transferable, generic or essential skills, that is, the skills and competences that all students should develop (Boud 2014; Strijbos et al. 2015).

As other authors have argued (Nicol et al. 2014; Thomas et al. 2011), learning with peers is assessed because it is a key skill required for lifelong learning, which involves critical thinking and reflection and being able to evaluate one’s own work and that of others. Universities increasingly focus their efforts on these skills so that they represent an essential part of what students learn throughout their university studies.

At the beginning of any peer assessment practice we are faced with a situation in which the role of students needs to change substantially. Giving students a voice implies modifying the traditional relationship of power in assessment processes, from one in which the lecturer holds a dominant position to a more equitable and democratic relationship in which students themselves assume responsibility as assessors. Assessment therefore transforms from a unidirectional process, dominated by lecturers, to a socio-constructive and dynamic process in which lecturers and students interact (Rust et al. 2005).

This new situation requires numerous variables to be taken into consideration. Aspects such as participation (Falchikov 2005), students’ evaluative judgement (Boud et al. 2018a, b), self-regulation (Hawe and Dixon 2017; Panadero et al. 2017), feedback (Boud and Molloy 2013; Nicol et al. 2014), a climate of trust (Carless 2009, 2013) and the quality of the assessment (Sadler 2016) are all elements that play a vital role in assessment practice. In this study, we focus primarily on the two elements that can be considered as basic to peer assessment: student participation and their evaluative judgement. Secondly, we analyse the role played by feedback and self-regulation, as well as the value that students attribute to peer assessment in contributing to their competence development.

Participation and evaluative judgement

Contemporary theories acknowledge the central role of the student in the construction of their own learning (Penuel and Shepard 2016). In the case of peer assessment, the importance of student participation has been highlighted by Falchikov (2005), Thomas et al. (2011), Reinholz (2016) and López-Pastor and Sicilia-Camacho (2017). Participation means encouraging dialogue with students and enabling them to collaborate in the process of assessing their learning in ways that are transparent. This participation can be established during all three phases of the assessment process: planning, development and judgement.

During the planning phase, tutors can decide or agree with their students the selection or definition of criteria, the means of assessment, the design of assessment instruments or the grading system. When the assessment is carried out, students can participate by assessing their own work and actions and those of their peers, through assessment modes such as self-assessment or peer assessment. Finally, students can participate fully in the grading process through dialogue and consensus around the grades allocated.

For students’ judgements on assessed work to be fair and equitable, it is vital that they have significant competence in the practice of assessment. The importance of students’ ability to make judgements has been evidenced by Boud and Falchikov (2007), Cowan (2010), Yucel et al. (2014), Nguyen and Walker (2016) and Murillo and Hidalgo (2017). In fact, Boud (2014, p. 27) highlights the importance of the development of informed judgement as one of the strategic axes in the assessment change agenda, because it is the “‘sine qua non’ of assessment”.

Tai et al. (2018, p. 471) define evaluative judgement as “the ability to make decisions about the quality of work of self and others”. In short, this implies the identification or discernment of standards, the application of them to a given piece of work, techniques for calibrating judgement and mechanisms to avoid being fooled (Boud 2016).

This conception of evaluative judgement has a double dimension, in so far as it involves assessing both one’s own work and that of others. Furthermore, if we add another determining factor in the assessment process, such as trust or a lack of trust (Carless 2009, 2013), we can consider that, in the context of peer assessment, evaluative judgement is underpinned by trust in one’s own judgement and trust in the judgement of others.

Feedback and self-regulation

In peer assessment, the role of feedback is crucial. In general, research shows that feedback is associated with learning and performance because, as Hounsell (2007) states, feedback can improve learning in three different ways: by accelerating learning, by optimizing the quality of what is learned and by raising the level of achievement of individuals and of groups.

Numerous contributions describe the characteristics of high-quality feedback, as well as suggesting how further improvements can be made (Ajjawi and Boud 2017, 2018; Boud and Molloy 2013; Espasa and Meneses 2010; Evans 2013; Gielen et al. 2011; Pardo 2018). However, in recent years, we have seen a change in the meaning and purpose of feedback. Previously, special attention was paid to features such as speed, level of detail, clarity, structure or relevance in the delivery of information to students; nowadays, attention has shifted towards the meaning of feedback for the student and the interaction between the student and the giver of the feedback (Rowe 2017). Feedback has evolved from being perceived as a one-way process of transmitting information from lecturer to student to being considered a process using multiple communication channels, through which lecturers and students interact with each other in order to improve outcomes. This highlights the importance of facilitating the participation of students as a source of feedback and learning (Moore and Teather 2013; Nicol et al. 2014).

In the context of peer assessment, it is essential for students to understand what quality feedback involves. They need to learn to evaluate and make judgements about the quality of the work of others whilst maintaining a dialogue with their lecturers and colleagues about the quality of their assessment (Sadler 2012). Consequently, it is important for students to develop the ability to make judgements and evaluate constructively. However, that alone is not sufficient. In addition, students must be able to use the feedback offered in such a way as to reduce the gap between the feedback given and the feedback used (Cartney 2012).

This change in the focus and actors of feedback implies that lecturers should pay less attention to delivering unidirectional, focused and direct feedback and more attention to how students understand and interpret multidirectional feedback from their self-regulatory and self-productive identities (Dann 2014).

The self-regulation of learning is an essential feature of effective learning processes, as supported by multiple existing models (Panadero 2017). One of the assumptions underlying self-regulated learning is the importance of the mediation that takes place between personal and contextual characteristics and the level of achievement attained (Pintrich 2000; Järvelä et al. 2016). The work of Panadero et al. (2018) shows that many studies have explored the relationships between self-regulated learning, the use of learning strategies and academic performance. However, few studies have focused their attention on the role of assessment as an element conducive to self-regulated learning.

Assessment quality

Students require high and consistent quality in assessment practices (Smith and Coombe 2006). With regard to assessment standards, it has been proposed that “Classroom assessment practices meet the standards of quality when teachers can be confident that their assessment practices provide accurate and dependable information about student learning” (Klinger et al. 2015). In fact, the generation of quality evidence is one of the four principles on which the Berkeley Evaluation and Assessment Research (BEAR) Assessment System is based: (1) a developmental perspective, (2) a match between instruction and assessment, (3) the generation of quality evidence and (4) management by instructors to allow appropriate feedback and follow-up (Wilson and Scalise 2006, p. 646). However, this is a partial view, since it only takes the perspective of teachers into consideration. Quality assessment should also be perceived by students as a rigorous evaluation that is both valuable and interesting. Assessment quality also relates to other aspects mentioned previously, such as trust in the judgement of others and the usefulness of their judgements.

During peer assessment, if students do not receive pertinent, constructive feedback from their peers, there is a risk that they will perceive their peers’ judgements as unfair and discouraging. It is vitally important, therefore, that quality feedback information is given during peer assessment to reduce any perception of injustice and to increase students’ motivation and commitment (Moore and Teather 2013).

Research model and hypotheses

The model adopted here proposes that, within the context of an assessment process based on peer assessment, competence development is determined by self-regulation and feedback which, in turn, are conditioned by the quality of the assessment, which is itself dependent on participation and evaluative judgement. Figure 1 illustrates the base model indicating the relationships between the constructs. Table 1 summarises the definitions of the constructs in the model.

Fig. 1
figure 1

Model for testing drivers of competency development in the context of peer assessment

Table 1 Constructs definition

Based on this theoretical model and within the scope of the contributions presented in this study, we propose the following hypotheses:

  • H1. Evaluative judgement consists of two components: trust in one’s own judgement and trust in the judgement of others.

  • H2a. Evaluative judgement is expected to be positively related to the quality of the assessment.

  • H2b. Participation is expected to be positively related to the quality of the assessment.

  • H3a. Participation is expected to be positively related to the development of competence.

  • H3b. Evaluative judgement is expected to be directly related to the development of competence.

  • H3c. Self-regulation is expected to be positively related to the development of competence.

  • H3d. Feedback is expected to be positively related to the development of competence.

  • H4a. The relationship between evaluative judgement and the development of competence is expected to be mediated by feedback.

  • H4b. The relationship between participation and the development of competence is expected to be mediated by feedback.

  • H5a. The relationship between evaluative judgement and the development of competence is expected to be mediated by self-regulation.

  • H5b. The relationship between participation and the development of competence is expected to be mediated by self-regulation.

  • H6a. The relationship between participation and the development of competence is expected to be mediated by the quality of the assessment.

  • H6b. The relationship between evaluative judgement and the development of competence is expected to be mediated by the quality of the assessment.

Methodology

This study was carried out during four academic years (2012/2013, 2013/2014, 2016/2017 and 2017/2018) using peer assessment as an integral part of the assessment process in the subject Project Management. A cohort design was used, since students’ self-perceptions were collected at the end of each semester; consequently, different students responded in each academic year. A set of three assessment tasks was designed in which the students had to perform both self-assessment and peer assessment of the products or actions being assessed.

Peer assessment in practice

The way in which peer assessment is implemented in practice is of fundamental importance if the intention is to enable the study to be replicated and to compare and synthesise results (Topping 2010). Consequently, with the intention of avoiding what Makel and Plucker (2014) refer to as the “replication crisis”, the characteristics of each of the three assessment tasks are described in Appendix A (Online Resource 1), and Appendix B (Online Resource 2) describes the study in terms of the nineteen elements that Adachi et al. (2018) propose should be considered in any description of peer assessment. The assessment tools used in each of the tasks are summarised in Table 2 and can be found in Appendix C (Online Resource 3).

Table 2 Assessment tools used for every assessment task

The EvalCOMIX® web service (Ibarra-Sáiz and Rodríguez-Gómez 2017), integrated into the Moodle server of the university’s virtual campus, was used to design, manage and apply all assessment instruments. Using EvalCOMIX® facilitated the peer assessment process, the delivery of feedback and the final calculation of grades, based on the criteria and weighting of each of the elements within the assessment task.
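To make the weighted-grading logic concrete, the following is a minimal sketch of how a final grade of this kind can be computed. The criteria, criterion weights and tutor/peer/self split shown are hypothetical illustrations, not the course’s actual configuration (the instruments and weightings actually used are those summarised in Table 2 and Appendix C).

```python
# Hypothetical assessor split; the real weighting scheme is in Table 2.
SOURCE_WEIGHTS = {"tutor": 0.6, "peer": 0.3, "self": 0.1}  # assumed split

def task_grade(scores: dict, criterion_weights: dict) -> float:
    """Weighted mean over the assessment criteria for one assessor."""
    total = sum(criterion_weights.values())
    return sum(scores[c] * w for c, w in criterion_weights.items()) / total

def final_grade(scores_by_source: dict, criterion_weights: dict) -> float:
    """Combine tutor, peer (already averaged over peers) and self marks."""
    return sum(SOURCE_WEIGHTS[src] * task_grade(s, criterion_weights)
               for src, s in scores_by_source.items())

criteria = {"analysis": 0.5, "viability": 0.3, "presentation": 0.2}
marks = {"tutor": {"analysis": 8.0, "viability": 9.0, "presentation": 7.0},
         "peer":  {"analysis": 7.5, "viability": 8.0, "presentation": 8.5},
         "self":  {"analysis": 9.0, "viability": 9.0, "presentation": 8.0}}
print(f"{final_grade(marks, criteria):.2f}")  # ≈ 8.1
```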

Participants

A total of 301 students from the Faculty of Economic and Business Sciences of the University of Cádiz (Spain) participated in the study (Table 3). These students were taking the Project Management module, taught in the final year of the Business Administration and Management Degree (BAM) and the Finance and Accounting Degree (FINA).

Table 3 Demographic characteristics

Instrument

In order to obtain the students’ views on participating in assessment processes through peer assessment, we designed an ad hoc questionnaire, the “Student Perception of Peer Assessment in Practice” questionnaire (Appendix D) (Online Resource 4).

Figure 2 illustrates the process followed in the design and validation of the questionnaire. The constructs and their items of measurement were developed first, following an extensive literature review. The content validity was then determined using the process of validation by experts. Of the various methods available (Johnson and Morgan 2016; Litwin 2003), we chose to use the group consensus option, in which the experts consulted arrive at a final product on which they all agree, following an incremental and iterative process in several cycles. Five experts in assessment in higher education were consulted, chosen using the criteria proposed by Skjong and Wentworth (2000) regarding making judgements, decision making, availability, motivation and impartiality. Starting with an initial proposition, two iterative cycles were completed before the final version was agreed. The definition of the constructs was revised during each cycle and the appropriateness of each of the indicators was considered during discussions of approximately an hour and a half. Finally, to achieve face validity, the final version of the questionnaire was shown to a group of eight Master’s students, who reviewed it for clarity of language and ease of understanding.

Fig. 2
figure 2

Questionnaire design process

The final version comprised 40 items in a Likert scale format (1–6) structured around seven dimensions (Table 4).

Table 4 Participation satisfaction questionnaire structure

Data analysis

The partial least squares structural equation modelling (PLS-SEM) method was used, together with the statistical software SmartPLS 3 (Ringle et al. 2015). This is a second-generation technique designed to overcome the weaknesses of more traditional, exploratory first-generation methods such as cluster analysis, exploratory factor analysis or multidimensional scaling (Hair et al. 2017). PLS-SEM is used specifically to develop theories in exploratory research by focusing on the explanation of variance in the dependent variables when analysing a model.

PLS-SEM is recommended when, as here, the objective is to predict a target construct or to identify “driver constructs”; when the research model is complex in terms of the types of relationships hypothesised (direct and mediating) and the levels of dimensionality (first-order and second-order constructs); when formatively measured constructs are part of the structural model; and when the data are non-normal (Hair et al. 2017).
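To make the estimation procedure concrete, the following is a minimal, self-contained sketch of the core PLS algorithm (Mode A outer estimation with a centroid inner weighting scheme) for a two-construct model with simulated data. SmartPLS 3 implements the full procedure internally, including alternative weighting schemes, bootstrapping and fit measures; the construct structure, loadings and sample below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
# Simulated example: one latent trait drives another; each trait is
# measured by three Likert-like indicators (assumed structure).
t1 = rng.normal(size=n)
t2 = 0.6 * t1 + 0.8 * rng.normal(size=n)
X = t1[:, None] * [0.8, 0.7, 0.9] + 0.5 * rng.normal(size=(n, 3))
Y = t2[:, None] * [0.9, 0.8, 0.7] + 0.5 * rng.normal(size=(n, 3))

def standardize(a):
    return (a - a.mean(0)) / a.std(0)

X, Y = standardize(X), standardize(Y)
wx, wy = np.ones(3), np.ones(3)
for _ in range(100):
    lx = standardize(X @ wx)            # latent scores, exogenous block
    ly = standardize(Y @ wy)            # latent scores, endogenous block
    # centroid scheme: inner proxy = sign of LV correlation * partner LV
    s = np.sign(np.corrcoef(lx, ly)[0, 1])
    wx_new = X.T @ (s * ly) / n         # Mode A: indicator-proxy correlations
    wy_new = Y.T @ (s * lx) / n
    done = max(np.abs(wx_new - wx).max(), np.abs(wy_new - wy).max()) < 1e-6
    wx, wy = wx_new, wy_new
    if done:
        break
lx, ly = standardize(X @ wx), standardize(Y @ wy)
beta = np.corrcoef(lx, ly)[0, 1]        # single-predictor path coefficient
print(f"path coefficient: {beta:.3f}, R2: {beta**2:.3f}")
```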

Confirmatory tetrad analysis (CTA-PLS) was employed to confirm the formative or reflective nature of the constructs. This is used to check the adequacy of the specification of the measurement model and test the null hypothesis that the indicators for a model are reflective (Garson 2016), so that the reflective or formative nature of the latent variables can be confirmed (Hair et al. 2018).
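The core quantity in CTA-PLS is the tetrad, a difference of products of covariances among four indicators of a block; for a genuinely reflective (one-factor) block all tetrads should vanish. The sketch below is a simplification of the full procedure, which tests only the non-redundant tetrads per block and applies a multiple-testing adjustment (Hair et al. 2018); it shows how tetrads and their bootstrap confidence intervals could be computed.

```python
import numpy as np

def tetrads(cov, i, j, k, l):
    """The three tetrads for indicators i, j, k, l (only two independent).
    Under a reflective, one-factor specification all should equal zero."""
    s = lambda a, b: cov[a, b]
    return (s(i, j) * s(k, l) - s(i, k) * s(j, l),
            s(i, l) * s(j, k) - s(i, j) * s(k, l),
            s(i, k) * s(j, l) - s(i, l) * s(j, k))

def bootstrap_tetrads(X, quad, n_boot=5000, seed=0):
    """Percentile bootstrap CIs for one indicator quadruple's tetrads:
    a CI excluding 0 speaks against the reflective specification."""
    rng = np.random.default_rng(seed)
    n = len(X)
    draws = np.array([tetrads(np.cov(X[rng.integers(0, n, n)].T), *quad)
                      for _ in range(n_boot)])
    return np.percentile(draws, [2.5, 97.5], axis=0)
```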

The evaluation of the model was carried out according to the reflective (Mode A) or formative (Mode B) character of each construct. Once the measurements of the constructs were confirmed as reliable and valid, we proceeded to analyse the predictive capacity of the model and the relationships between the constructs.

Finally, the importance-performance map analysis (IPMA) technique was used to identify predecessor constructs that have a relatively high importance for predicting the target construct, but “also have a relatively low performance so that improvements can be implemented” (Hair et al. 2018, p. 105). This technique allows constructs and indicators to be easily identified so they can be modified to improve results in an effective way and enable improvements or changes to be prioritised.
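In essence, IPMA plots each predecessor construct’s unstandardised total effect on the target (importance) against its average score rescaled to 0–100 (performance); high-importance, low-performance constructs are the priorities for action. A brief sketch using the paper’s construct codes but assumed, hypothetical effect values and scores:

```python
import numpy as np

def performance(raw_scores, scale_min=1, scale_max=6):
    """Mean construct score rescaled to 0-100 (items here use a 1-6
    Likert scale): the 'performance' axis of the map."""
    return 100 * (np.mean(raw_scores) - scale_min) / (scale_max - scale_min)

def ipma(total_effects, raw_scores_by_construct, target):
    """Importance (unstandardised total effect on the target) versus
    performance for each predecessor construct."""
    return {c: (total_effects[target][c], performance(s))
            for c, s in raw_scores_by_construct.items()}

# Hypothetical values (NOT the paper's estimates):
rng = np.random.default_rng(1)
effects = {"CODEVP": {"FEEDFP": 0.48, "JUDGEP": 0.45,
                      "PARTI": 0.30, "SELFRP": 0.28}}
scores = {c: rng.uniform(2, 6, 301) for c in effects["CODEVP"]}
for c, (imp, perf) in ipma(effects, scores, "CODEVP").items():
    print(f"{c}: importance={imp:.2f}, performance={perf:.1f}")
```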

Results

Before proceeding to the results of our evaluation of the measurement model and the structural model, Table 5 offers descriptive results for each of the variables measured and an analysis of possible differences between groups by gender, degree studied and cohort. Table 5 shows that, in terms of gender, the only statistically significant difference is in self-regulation, whilst there are differences in practically all of the variables when considered by degree studied and cohort.

Table 5 Descriptive statistics and contrast tests (Mann-Whitney U and Kruskal-Wallis)

We initially considered all the constructs to be formative but, after carrying out confirmatory tetrad analysis (CTA-PLS), no empirical evidence of this could be found for the constructs feedback (FEEDFP), trust in one’s own judgement (OWNJUP) and trust in the judgement of others (OTHJUP), so the decision was taken to treat them as reflective.

Evaluation of the measurement model

(a) Reflective model

The evaluation of the measurement model for reflective indicators in PLS-SEM is based on internal consistency reliability, convergent validity and discriminant validity (Hair et al. 2017). As the internal consistency reliability and Cronbach’s alpha values are above the 0.70 threshold, we can conclude that the four constructs are reliable. Average variance extracted (AVE) values for the latent variables are greater than 0.61, so the measures of the four reflective constructs have high levels of convergent validity. All Heterotrait-Monotrait Ratio (HTMT) values are below the relevant threshold of 0.85, meaning that all the constructs are empirically distinct (Online Resource 5).
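For reference, these criteria can be computed directly from the indicator data and the standardised outer loadings. The sketch below assumes positively keyed indicator matrices; the thresholds (0.70 for reliability, 0.50 for AVE, 0.85 for HTMT) follow Hair et al. (2017).

```python
import numpy as np

def cronbach_alpha(items):
    """items: (n_respondents, k_items) matrix for one reflective construct."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(0, ddof=1).sum()
                          / items.sum(1).var(ddof=1))

def ave(loadings):
    """Average variance extracted from standardised outer loadings."""
    return np.mean(np.square(loadings))

def htmt(X, Y):
    """Heterotrait-monotrait ratio for two constructs' indicator blocks
    (indicators assumed positively keyed)."""
    R = np.corrcoef(np.hstack([X, Y]).T)
    p = X.shape[1]
    hetero = R[:p, p:].mean()                      # between-block correlations
    mono = lambda B: B[np.triu_indices_from(B, 1)].mean()
    return hetero / np.sqrt(mono(R[:p, :p]) * mono(R[p:, p:]))
```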

(b) Formative model

With variance inflation factor (VIF) values between 1.43 and 4.06, we can conclude that collinearity does not reach critical levels in any of the formative constructs and is not an issue for the estimation of the PLS path model (threshold value of 5). Some indicators’ weights were not statistically significant but their loadings were greater than 0.5, so, following the rules of thumb of Hair et al. (2017, p. 151), all the formative indicators were retained (Online Resource 6).
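The VIF for each formative indicator is obtained by regressing it on the other indicators of its block; a minimal sketch:

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of indicator matrix X:
    VIF_j = 1 / (1 - R^2_j), regressing column j on the remaining ones.
    Values above 5 signal critical collinearity (Hair et al. 2017)."""
    X = (X - X.mean(0)) / X.std(0)       # standardise; no intercept needed
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ coef
        out.append(1 / (resid.var() / X[:, j].var()))
    return np.array(out)
```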

Evaluation of structural model

All variance inflation factor (VIF) values are clearly below the threshold of 5. Therefore, collinearity among the predictor constructs is not a critical issue in the structural model. In order to assess the statistical significance of the path coefficients, consistent with Hair et al. (2017), bootstrapping (5000 resamples) was used to generate t-statistics and confidence intervals (Table 6). Figure 3 shows the results obtained in the evaluation of the model.
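The bootstrapping logic can be illustrated for a single structural path: resample cases with replacement, re-estimate the coefficient, and derive the t value and percentile confidence interval from the resulting distribution. The sketch below uses the simplification that, with one standardised predictor, the path coefficient equals the correlation of the latent variable scores; SmartPLS re-estimates the full model in every resample.

```python
import numpy as np

def bootstrap_path(x, y, n_boot=5000, seed=0):
    """Percentile bootstrap (5000 resamples, as in the paper) of a single
    path coefficient between two standardised LV score vectors."""
    rng = np.random.default_rng(seed)
    n = len(x)
    est = np.corrcoef(x, y)[0, 1]        # single-predictor path
    boots = np.array([
        np.corrcoef(x[idx], y[idx])[0, 1]
        for idx in (rng.integers(0, n, n) for _ in range(n_boot))
    ])
    t_stat = est / boots.std(ddof=1)     # bootstrap t value
    ci = np.percentile(boots, [2.5, 97.5])
    return est, t_stat, ci
```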

Table 6 Structural model results using t values and percentile bootstrap 95% confidence interval (n = 5000 subsamples)
Fig. 3
figure 3

Structural model results

We can confirm the predictive value of the model through the analysis of the coefficient of determination (R2). Thus, 65.9% of the variance (R2) of the competence development construct (CODEVP) is explained by four essential constructs. The strongest effect is exerted by the feedback construct (FEEDFP, 0.429), followed by self-regulation (SELFRP, 0.261), evaluative judgement (JUDGEP, 0.140) and participation (PARTI, 0.128).

Evaluative judgement (JUDGEP) is a hierarchical component model (HCM) constructed using the repeated indicators approach (Hair et al. 2017, 2018). That is to say, it is a multidimensional construct, formed by trust in one’s own judgement (OWNJUP) and trust in the judgement of others (OTHJUP). Our research model achieves an SRMR of 0.072 (Fig. 3), indicating an appropriate fit given the usual cut-off of 0.08.

All Stone-Geisser’s Q2 values (predictive relevance) for the endogenous constructs are considerably above zero (Online Resource 7). More precisely, evaluative judgement (JUDGEP) has the highest Q2 value (0.530), followed by quality of the assessment (QUASSP, 0.404), feedback (FEEDFP, 0.397), competence development (CODEVP, 0.350) and, finally, self-regulation (SELFRP, 0.347). These results provide clear support for the model’s predictive relevance regarding the endogenous latent variables. A medium effect size (q2) is reached in the cases of evaluative judgement (JUDGEP) on quality of the assessment (QUASSP) and participation (PARTI) on feedback (FEEDFP), with smaller values in the other cases.
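Q2 is obtained by blindfolding: with an omission distance D, every D-th data point of the endogenous construct’s indicators is treated as omitted, predicted from the model, and compared against mean replacement. The sketch below is a deliberate simplification in that the model predictions are held fixed rather than re-estimated in each omission round, as genuine blindfolding requires.

```python
import numpy as np

def q_squared(Y, Y_hat, D=7):
    """Simplified blindfolding sketch. Y: endogenous indicator matrix;
    Y_hat: model predictions for it. SSE compares omitted cells with the
    predictions, SSO with column-mean replacement. Q2 = 1 - SSE/SSO;
    values above zero indicate predictive relevance."""
    y, y_hat = Y.ravel(), Y_hat.ravel()
    means = np.broadcast_to(Y.mean(0), Y.shape).ravel()
    sse = sso = 0.0
    for d in range(D):                   # one omission round per offset
        omit = np.arange(d, Y.size, D)
        sse += ((y[omit] - y_hat[omit]) ** 2).sum()
        sso += ((y[omit] - means[omit]) ** 2).sum()
    return 1 - sse / sso
```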

Mediation analysis

(a) Feedback as mediator

In our model (Fig. 3), feedback operates as a mediating variable between participation and competence development (Table 7). We found a significant specific indirect effect of the relationship from participation to competence development (0.111). Participation also has a significant direct effect (0.128) on competence development. Since both the direct and the specific indirect effects from participation to competence development are significant, feedback partially mediates the relationship between them (Hair et al. 2017). Moreover, the product of the direct and specific indirect effects is positive. Hence, the result reveals that feedback represents a complementary partial mediation for the path from participation to competence development. This complementary mediation suggests that feedback explains part of the relationship between participation and competence development. When students participate by being involved in producing assessment criteria or in selecting assessment instruments, it contributes to their competence development through the learning produced by assessing their own learning and that of their peers.

Table 7 Summary of mediating effect test of PARTI on CODEVP
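The classification applied in this and the following analyses follows the decision rules reported by Hair et al. (2017), based on Zhao et al.’s typology, and can be summarised as below; the illustrative call reproduces the participation case above.

```python
def classify_mediation(direct, indirect, sig_direct, sig_indirect):
    """Classify a mediation pattern from the significance and signs of
    the direct effect and the specific indirect effect (Hair et al. 2017,
    following Zhao et al.'s typology)."""
    if sig_indirect and sig_direct:
        return ("complementary partial mediation" if direct * indirect > 0
                else "competitive partial mediation")
    if sig_indirect:
        return "full (indirect-only) mediation"
    if sig_direct:
        return "no mediation (direct-only)"
    return "no effect"

# PARTI -> FEEDFP -> CODEVP: direct 0.128 and indirect 0.111, both
# significant and of the same sign -> complementary partial mediation.
print(classify_mediation(0.128, 0.111, True, True))
```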

Feedback also operates as a mediating variable between evaluative judgement and competence development (Table 8). We found a significant specific indirect effect of the relationship from evaluative judgement to competence development (0.101), and evaluative judgement has a significant direct effect (0.140). Since both the direct and the indirect effects are significant, feedback partially mediates the relationship between them. Moreover, the product of the direct and indirect effects is positive. Hence, the result reveals that feedback represents a complementary partial mediation for the path from evaluative judgement to competence development. This complementary mediation suggests that feedback explains part of the relationship between evaluative judgement and competence development. Ultimately, the information gained by students through peer assessment aimed at improving their work serves to mediate the relationship between their trust in their own judgements and those made by their peers, on the one hand, and the development of their generic competences, on the other.

Table 8 Summary of mediating effect test of JUDGEP on CODEVP

So far, we have presented the results of the analysis of simple mediation but, as can be seen in Fig. 3 and Tables 7 and 8, feedback also operates in a context of multiple mediation, that is, the mediation that occurs when an exogenous construct exerts its influence through more than one mediating variable. This multiple analysis allows us to consider all the mediators simultaneously in one model, enabling us to obtain a better representation of the mechanisms through which an exogenous construct affects an endogenous construct (Hair et al. 2017).
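Each specific indirect effect is simply the product of the path coefficients along one route from the exogenous to the endogenous construct, and the specific indirect effects across all routes sum to the total indirect effect. A sketch using the paper’s construct codes; apart from the direct effects reported above, the path values are hypothetical placeholders.

```python
from functools import reduce
from operator import mul

# Hypothetical path coefficients (NOT the paper's full estimates),
# keyed as paths[from][to].
paths = {
    "PARTI":  {"QUASSP": 0.30, "FEEDFP": 0.26, "SELFRP": 0.10},
    "QUASSP": {"FEEDFP": 0.40, "SELFRP": 0.35},
    "FEEDFP": {"CODEVP": 0.429},
    "SELFRP": {"CODEVP": 0.261},
}

def specific_indirect(route):
    """Product of the path coefficients along one mediation route."""
    return reduce(mul, (paths[a][b] for a, b in zip(route, route[1:])))

routes = [
    ("PARTI", "FEEDFP", "CODEVP"),
    ("PARTI", "SELFRP", "CODEVP"),
    ("PARTI", "QUASSP", "FEEDFP", "CODEVP"),
    ("PARTI", "QUASSP", "SELFRP", "CODEVP"),
]
for r in routes:
    print(" -> ".join(r), f"= {specific_indirect(r):.3f}")
```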

In the case of our model, we can see in Table 7 how feedback also intervenes in this multiple mediation between participation and competence development, together with the quality of the assessment (0.049) and self-regulation (0.024). The multiple mediation of feedback between evaluative judgement and competence development is also evident (Table 8), together with the quality of the assessment (0.111) and self-regulation (0.022 and 0.024). In all cases, it is a partial (complementary) mediation.

(b) Self-regulation as mediator

Regarding the mediation of self-regulation between participation and competence development (Table 7), we find that the value of the specific indirect effect is not significant (0.027). Since the direct effect is significant and this indirect effect is not significant, we conclude that there is no mediation, but when analysing multiple mediation, the results indicate that it occurs in conjunction with the quality of the assessment (0.027), with feedback (0.024) and with the quality of the assessment and the feedback together (0.011). In this case, it would be a partial (complementary) mediation.

In the case of the mediation of self-regulation between evaluative judgement and competence development (Table 8) something similar happens. The results indicate that there is no simple mediation (0.004). On the other hand, it is evident that it mediates in conjunction with feedback (0.022), the quality of the assessment (0.061) and with both together (0.024). In this case, there is also a partial (complementary) mediation. Therefore, within the context of collaboration, the process by which students critically analyse their own work and that of their peers and identify omissions or errors that help them improve their own and their peers’ work is seen to be a valuable element in developing their competences.

Importance-performance map analysis

In addition to the evaluation of the measurement model, the structural model and the analysis of simple and multiple mediation, an importance-performance map analysis (IPMA) was carried out (Online Resource 8). The rationale of IPMA is “to identify predecessor constructs that have a relatively high importance for predicting the target construct, but also have a relatively low performance so that improvements can be implemented” (Hair et al. 2018, p. 105). In our case, the constructs on which action could be taken to improve competence development are, firstly, feedback and evaluative judgement, followed by participation. If self-regulation is made the objective, action could be taken on the quality of the assessment. Finally, if the objective is the improvement of feedback, the variable to act on is evaluative judgement.

Discussion

In this paper, we intended, firstly, to provide a predictive model of students' competence development based on the practice of peer assessment. Secondly, we sought to propose an instrument through which to analyse and understand the perception of university students about peer assessment practices using technological resources. Finally, we wanted to guide the practice of peer assessment towards formats with the greatest potential for change and improvement. The results achieved in this study suggest there are important implications, both from a theoretical and practical perspective, to understanding peer assessment processes. At the same time, they also provide insight into future lines of research.

Theoretical implications

The primary objective of this study was to provide a predictive model of students’ competence development based on the practice of peer assessment. In this regard, our study reflects the proposals made by Panadero et al. (2018) who suggest that an analysis is needed of the influence that different models of formative assessment and self-regulation have on each other, what practices considered as formative can promote self-regulated learning and under what conditions.

One of the main contributions of this work is the construction of a model that integrates the relationships between significant variables of peer assessment in a university context. The results obtained demonstrate that the hypothesised model can, indeed, predict a large part of the relationships between the variables and show, on the one hand, that participation and evaluative judgement are directly related to competence development and, on the other, the mediating role of feedback and self-regulation in the context of peer assessment.

Hypothesis H1, that evaluative judgement is constructed from two components (trust in one’s own judgement and trust in the judgement of others), has been tested and verified. Likewise, hypotheses H2a and H2b, which directly relate evaluative judgement and participation to the quality of the assessment, and hypotheses H3a–H3d, regarding the positive relationships of participation, evaluative judgement, self-regulation and feedback with competence development, have also been confirmed. Finally, the remaining hypotheses concerning the mediating character of feedback (H4), self-regulation (H5) and the quality of the assessment (H6) have also been supported.

One of the essential purposes of assessment as learning is that students should become the protagonists of their learning (Coombs et al. 2018; DeLuca et al. 2016). This means tutors must assume more of a role as facilitator. Our study confirms student participation in assessment as a variable that is directly related to their competence development and which exerts a direct influence on other aspects such as feedback or self-regulation.

It has been shown in this study how evaluative judgement, in terms of trust in one’s own judgement and in the judgement of others, is directly related to students’ competence development, as well as to feedback and self-regulation. The systematic development of evaluative judgement is currently an important challenge for the university curriculum, since it places students’ judgement at the very centre of education (Boud et al. 2018b).

Practical implications

A second objective that has guided our research has been to develop an instrument through which to analyse and understand the perception of university students about peer assessment practices using technological resources. In relation to this, our evaluation of the measurement model supports the validity of the questionnaire used to operationalise the latent variables, since the items are relevant and each item loads on its intended construct. As a result, tutors now have access to an easy-to-use instrument through which they can collect students’ perceptions of the implementation of peer assessment.

In this research, the peer assessment process was carried out using the EvalCOMIX® web service, which allows for greater speed and efficiency in the assessment process. It requires tutors to design and manage the assessment instruments used and to monitor the process in order to address any problems students have with it. This web service can be an excellent technological tool to facilitate the impetus for change identified by Bearman et al. (2017).

The last objective that guided this study was to influence the practice of peer assessment to focus on areas of greatest potential for change and improvement. The results obtained in the IPMA analysis confirm the importance of evaluative judgement and feedback as the primary elements on which to act in order to significantly improve competence development. These results are consistent with the contributions of authors such as Boud et al. (2018a), Dawson et al. (2018), Hernández (2012), Nicol et al. (2014), Rodríguez-Gómez and Ibarra-Sáiz (2015) and Sadler (2016). Consequently, the importance of the mediating role of tutors is crucial. They must foster a climate of trust among students that allows them to carry out a rigorous, credible and objective assessment, whilst providing useful and relevant information for the improvement of future activity. This requires lecturers to educate students about assessment so that they, in turn, can participate and deliver judgements that can be reviewed and contrasted, allowing them to progressively acquire greater confidence in their own judgements and those of their peers.

Limitations and future research

From a methodological perspective, the research described in this paper suffers from three specific limitations. First, the research was carried out in a specific context, with final-year project management students in Spain; research needs to be undertaken in other subject areas and with students at other stages of their studies. Second, the investigation was based on a design with post-test measurement only, meaning the degree of control over the intervening variables is reduced and, in line with the caution advised by Stone-Romero and Rosopa (2008), the inferences that can be made about the mediation model are limited. Finally, the measurement instrument is based on the perception of the students themselves, which, as indicated by Panadero et al. (2018), could be improved by the use of alternative measuring instruments.

To generalise our results more widely, further studies need to be carried out using experimental designs in which both the independent variable (the practice of peer assessment) and the mediating variables (essentially feedback and self-regulation) can be manipulated. Despite these limitations, though, we have been able to verify the great diversity and variability of current assessment practices. This diversity and variability make comparison and generalisation difficult, especially in the context of formal education, where experimental studies are often hard to carry out; self-assessment and peer assessment in particular seem set to be key issues in the future (Pereira et al. 2017).

Struyven et al. (2005) highlight that students’ perceptions serve to guide us in our reflective attempts to improve our educational practices and achieve a higher quality of learning and education for our students. A second line of research, however, would be to improve the instrumentation used in the data collection process, incorporating other ways of collecting students’ perceptions and collecting data from sources other than students, whilst incorporating approaches that combine measurement and intervention (Panadero et al. 2016).

The variables involved in the process of peer assessment are highly complex and interact in so many different ways that it is important to try and achieve a greater level of detail, precision and understanding of them. It would therefore be valuable if further research was undertaken using mixed methodologies which, on the basis of explanatory sequential design, could explain the quantitative results in terms of qualitative data obtained. As Creswell and Clark (2010, p. 82) argue, this type of design “is most useful when the researcher wants to assess trends and relationships with quantitative data but also be able to explain the mechanism or reasons behind the resultant trends”.

Conclusion

In this paper, we have demonstrated how the practice of peer assessment is perceived by students as an element that promotes their competence development. We have devised tools that can facilitate adaptation or replication in other contexts and have suggested future lines of research that will lead to further improvements in assessment. Likewise, we have shown how the implementation of participatory assessment involves a series of interrelationships between different aspects, highlighting the need to address the improvement of feedback processes and the development of evaluative judgement. This means providing a context in which assessment processes are rigorous, credible, objective and participative, and in which useful and relevant information is delivered for the ongoing development of peer assessment practices.

It is vital that policies are developed in higher education that encourage the creation of contexts in which peer assessment can be incorporated, both from a pedagogical and conceptual perspective and also from a technological perspective. Lecturer education on these practices should be promoted and technological resources provided so that the implementation of participatory assessment methods does not become a continuous struggle to overcome bureaucratic difficulties or technological limitations, which can often frustrate and limit educational improvement and change.