1 Introduction

The reflection of teachers regarding their practice is essential during pre-service training (Beauchamp, 2015; Smith et al., 2017). Studies published by Theodoulides and Armour (2001) or Slepcevic-Zach and Stock (2018) concerning the reflections performed by teachers regarding their teaching practices during pre-service training highlight that these reflections may serve to self-regulate learning and may also be useful once they enter service. So, when they are in-service teachers, they could also place greater emphasis on helping adolescents to self-regulate their own learning (Sebre & Miltuze, 2021). However, the review by Van Beveren et al. (2018) into reflection activities at university shows the need to carry out further empirical studies that may help to highlight the type of reflection activities that work best. In addition, although reflection practices are encouraged in the classroom, there is often little or no preparation or instructions explaining how to carry them out (Russell, 2005).

Lee (2007) argues that pre-service teachers benefit from reflecting on their teaching practice as early as possible, so that they learn how to resolve any difficulties arising from the difference between their vision of teaching and reality.

This reflection forms part of constructivism, which supports the idea that students themselves should construct their own knowledge as an active participant in their own learning process (Poerksen, 2004; Woolfolk, 2014). One such interesting model is known as social constructivism, where students actively construct their own knowledge by way of experiences and interactions with others (McKinley, 2015; Rannikmäe et al., 2020) and in which group-based reflection is of particular importance. Social constructivism is based around the social nature of cognition and proposes an educational framework that allows students to become involved in discussion and reflection, encouraging them to argue their ideas in order to share them with others and seek autonomy by interacting for mutual benefit (Akpan & Kennedy, 2020). Numerous factors are known to influence learning and their reflection concerning practice as part of this approach, thus affecting both the activities proposed and the evaluation thereof. Social constructivism emphasises the importance of having students actively involved in their learning process (Guillén et al., 2021).

In order to maintain the interest of students in this type of reflexive activity, which is generally associated with a low “emotional climate” (Bellocchi et al., 2014), interesting, active and entertaining proposals, such as those related to gamification, are required. Polin (2018) has highlighted the potential of educational games from a social constructivism point of view, as they allow the creation of attractive, authentic, complex and collaborative spaces for reflection and learning for students and teachers alike. One alternative with significant advantages is the construction of games by and with students. However, this requires an in-depth understanding of the subject and a significant time commitment for the construction and reflection activities in class (Polin, 2018).

Social constructivism is also present in science fairs, when students explain a project they have designed to their companions (Duit & Treagust, 1998). Paul et al. (2016) concluded that communication processes between students in a science fair promote a meaningful reflection about their project.

Lee’s study (2007) and the opinions of teachers reported in the studies by Ryan et al. (2017) or Goulette and Swanson (2019) confirm that the evaluation process is one of the most complex tasks in teaching; we therefore believe that it should receive greater attention from the very earliest stages of training. Reflection on evaluation in science education from a social constructivism perspective requires alternative evaluation methods based on the students’ different learning styles in order to offer them all the opportunity to express themselves (Iofciu et al., 2012). In this regard, electronic rubrics may facilitate this reflection on evaluation, and may also aid dialogue and feedback between teacher and student (Lu & Zhang, 2012). Thus, students can interact amongst themselves using digital tools and develop communication and social skills by way of peer evaluation.

One final yet important aspect of reflection concerns the relationship between social learning and emotions, that is, the close interactions between cognitive, conative and affective factors during students’ learning and problem-solving (Op’T Eynde et al., 2006). An understanding of the role of emotions in science education, and the ability to reflect on them, implies an understanding of the nature of the cognitive processes involved.

In light of the above, this study attempts to incorporate educational activities for pre-service secondary science teachers (hereinafter PSTs) into the framework of social constructivism in science education. The originality of this study lies in several aspects:

(1) This paper presents a training programme for PSTs integrating several key constructs of social constructivism (reflection on practice, gamification resources and science fairs, e-rubric assessment and social-emotional learning). The study thus links together aspects of science education that have traditionally been considered individually.

(2) The programme involves PSTs in the design, implementation and evaluation of gamified resources using e-rubrics, and in formative and continuous reflection on these experiences and on the emotions felt during the activities.

(3) The programme uses science fairs to present and reflect on gamification resources and their evaluation. Unfortunately, this format has received little attention in initial teacher training in Spain.

(4) This study examines the impact of the programme on transfer into practice, particularly whether the PSTs carried e-rubrics and gamification resources over into their teaching practice.

2 Framework

This section describes the key variables considered in this study related to social constructivism, namely reflections regarding practice, gamification resources and science fairs, e-rubric evaluation and social-emotional learning. The combination of all these variables in a social or group setting is expected to favour actual learning and is intended to be a driver for change and for the generation of new reflections, ideas and knowledge, an effect that would be insufficient if the variables were applied individually (Bovill, 2020). The scheme in Fig. 1 shows these variables.

Fig. 1 Proposed reference framework

Definitions of the key constructs of this research are shown in Table 1.

Table 1 Key constructs of the research

2.1 Reflections Regarding Practice in Pre-Service Teacher Training

The literature concerning pre-service teacher training emphasises the importance of defining the type of teacher to be trained. From our perspective, pre-service training of secondary school science teachers should contribute to the development of a professional role as a reflexive and critical agent, as defined and discussed in various studies (Danielowich, 2007; Körkkö et al., 2016; O’Keeffe & Paige, 2020; Schön, 1992).

Reflection is important for both professional teaching and personal development (Le Cornu & Peters, 2005). Thus, Marcelo (2009) considers this initial training period to be very fertile and important as regards learning the teaching profession, and highlights the need to present training proposals to these future teachers aimed at enhancing their reflexive, self-critical and self-assessment abilities. These proposals should be formulated considering the concept of the teacher as a reflexive practitioner, with an ability to construct understanding based on his or her personal and professional involvement.

Since the early work by Schön (1992), the inclusion of references to reflection, reflexive practices and the reflective teacher has become widespread in training programmes and has led to a large number of studies in pre-service teachers in general (Beauchamp, 2015; Van Beveren et al., 2018), and in science education in particular. In the latter case, a reflection in science should be able to convince students of their own explanations for phenomena, which are often counter-intuitive, prior to studying these phenomena (Parker & Heywood, 2013).

Despite this, it is still impossible to conclude that the reflexive teacher is the most common profile in our schools (Russell, 2012). The complexity of changing paradigms, the stability of teaching practices and preconceived ideas arising from observation-based learning are just a few of the obstacles to the development of this type of professional. Moreover, teachers must participate in reflection in order to understand the nature of their teaching, achieve results and see the way in which their personal values and beliefs guide the teaching–learning process, thus helping them to understand the role of education as an instrument for change (Smith et al., 2017). If this atmosphere of reflection can be promoted in PSTs, who then use it in their classrooms once in-service, teaching staff may change from a transition centred on their students to a more in-depth understanding of the content involving an analysis of their scientific beliefs and knowledge (Abell & Lederman, 2007).

In addition, the need to contemplate a broader view of those aspects that must be taken into consideration in PSTs, including the three fields —professional, social and personal— in which various competences must be developed, should be considered (Perrenoud, 2004). As noted by Slepcevic-Zach and Stock (2018), we believe that reflection should involve self-perception and a reflection of personal competences (including social and personal skills), professional and methodological competences, and in the same state to ensure an integral, holistic reflection of the action-based competence.

As part of the personal aspect, we consider it important to reflect on emotions, which can act as facilitators or obstacles to teaching and learning (Blanco et al., 2010; King et al., 2015). Indeed, the study of emotions plays a key role in PSTs (Schoffner, 2009) as they can minimise student tiredness, which is one means of engaging with the most relevant scientific content (Ritchie et al., 2011).

All the above supports the need to integrate reflection into PSTs in the framework of social constructivism, which is the aim of the training programme presented herein, in order to help to overcome various practices which may sometimes appear out of date.

2.2 Gamification Resources and Science Fair, e-rubric Evaluation and Social-Emotional Learning as Elements for Reflection in the Framework of Social Constructivism

2.2.1 Gamification Resources and Science Fair

Although all the elements of the teaching–learning process are susceptible to reflection, in PSTs we consider it important to pay greater attention to the resources to be used and one of the most complex processes to be faced, namely evaluation (Goulette & Swanson, 2019).

Although innovations in educational resources may have great potential for significant learning, they must fulfil the various human needs and material resources available in the classroom, even though these resources often reach schools via commercial networks (McKenney, 2018). In addition to commercial networks, teachers can also use Open Educational Resources (OERs), which can be found, amongst others, on the internet. The subsequent adaptation of such OERs by teachers to the specific context of their classrooms can improve and redefine the knowledge acquired by them during their studies (Kim, 2018). However, from our perspective, teachers must also be able to develop and innovate their own resources, even though this is not usually standard practice amongst teachers who, in most cases, use science textbooks (Yun & Park, 2018) or simply select materials prepared by other professionals for incorporation into their classes, subsequently adapting them to their own scientific, pedagogical and teaching ideas and beliefs with no in-depth reflection during the process. These aspects have received little attention in educational research and represent an additional difficulty in PSTs (Kang et al., 2016). It is therefore essential to involve teachers in resource design tasks, and their implementation and evaluation, as occurs in research with in-service teachers during the design of teaching sequences to prepare them for innovation (Coenders et al., 2010). This strategy is in agreement with the concept of a teacher as a reflective and innovative professional and also with the social and educational requirements proposed recently.

Amongst the numerous resources available, gamification-based resources (hereinafter GRs), or the use of game design elements in non-game contexts (Deterding et al., 2011), have been widely explored in science teaching (Gaydos & Squire, 2010). GRs are typically activities involving one or more players, with objectives, rewards, governed by rules and with some degree of competition, and in which social collaboration is important (Stieglitz et al., 2017). Some examples of GRs include puzzles (Joag, 2014), board games (Perkins, 2016), card games (Luttikhuizen, 2018), bingo (Tan et al., 2019), augmented reality games (Crandall et al., 2015), video games (Annetta, 2012), etc. The proliferation of GRs is due to the qualities they present in science teaching in the context of a socio-constructivist learning framework. Benefits are seen in terms of the development and promotion of creativity, collaboration, exploration and imagination (Kangas, 2010) and for promoting interpersonal qualities such as respect, fair play, integrity, justice, etc. The implications of the emotional type or the game-playing nature are other aspects that encourage learning. However, perhaps the most important quality is the motivation they produce, which is why the design of this type of approach is valued by teachers as it fits well with their desire to seek new and motivational ways of teaching science. Other researchers argue that the main motivation underlying any GR should be to learn, with this serving as an effective model to do so (Dörner et al., 2016). As such, they may be very effective alternatives for more traditional tasks at all educational levels (Haz et al., 2018).

The format in which the designed GR is presented is also important. Thus, a science fair is a format with significant advantages as it involves teachers in training, thereby improving their performance and facilitating their learning experience (De-Barros-Miller, 2016). Science fairs make students the protagonists by allowing them to present their scientific project in a context other than the classroom and explain a complex concept to their fellow students (McComas, 2011). A science fair provides a common space for the group to allow the generation of shared knowledge (Matusov, 2001). At primary and secondary levels, they provide students with critical thinking skills (De-Barros-Miller, 2016) and promote their creativity and interest in science (Sahin, 2013). One negative aspect is the belief that, when used as the main axis of teaching, science fairs are not in alignment with the curriculum. Despite being widely used in other countries, science fairs have received little attention at a secondary level and in PSTs in Spain, which is why we have decided to use this format to present GRs.

2.2.2 Evaluation Using e-rubrics

Another important aspect that requires reflection from PSTs is evaluation, with one means of improving this capacity involving an “evaluate to learn” approach (Folkes & Carmichael, 2006). In this sense, some proposals concentrate on the interiorisation and reflection of evaluation criteria (Lehesvuori et al., 2017).

Different evaluation methodologies are available to achieve learning by reflection in practice, including self-evaluation activities (Wanner & Palmer, 2018), which allow students to learn strategies that let them self-regulate their learning as well as receive a self-criticism of their work (Steffens & Underwood, 2008). Other such methodologies include peer-evaluation or co-evaluation activities, in which they will carry out a critical exercise with their partner, thus allowing different problem-solving approaches to be appraised (Mustafa, 2017); or the so-called 360-degree evaluation, in which these methodologies are combined (Tee & Ahmed, 2014). This practice-related reflection has been shown to be very useful and effective for improving the performance of pre-service teachers (Lei & Chan, 2018). According to Zimmerman (2013), self-reflection is the final phase in the cyclic process of self-regulated learning (forethought, performance and self-reflection), in which students evaluate and correct their learning before returning to the initial forethought phase. According to Chetcuti and Cutajar (2014), student learning, which is usually dependent on supervision by the teacher to carry out the evaluations, is required to implement these methodologies.

Although the entire teaching–learning process that takes place in a science classroom is susceptible to evaluation, instruments that allow the degree of compliance with the expected objectives to be assessed objectively are required to carry this out successfully. Of the various evaluation instruments available, we consider rubrics to be of particular interest as they provide benefits in terms of time-saving during evaluation, transmit effective feedback and promote learning (Stevens & Levi, 2005); specifically, electronic rubrics, which facilitate cooperative evaluation (Cebrián-de-la-Serna et al., 2014). In addition, one of the advantages of using online platforms for evaluation is that they promote self-evaluation and autonomous learning that is supervised instantaneously, thereby facilitating dialogue between fellow students, teacher-guided reflection and the communication and justification of the various elements evaluated by the rubric (Cebrián-Robles, 2016). Studies providing the satisfaction results for e-rubrics for teachers and students are available (Quintana et al., 2014).

In light of the above, we have considered reflection in terms of evaluation with e-rubrics in the social-constructivism framework. In addition to being important for PSTs during their training, evaluation with e-rubrics is one of the tasks they will need to carry out in their future professional career. We are therefore convinced that the collaborative construction of rubrics, and the use thereof during evaluation, helps PSTs to learn and interiorise those aspects that need to be improved in order to obtain a quality instrument. In summary, the involvement of PSTs in evaluation during teaching practices in higher education is relevant as the lecturer is not the only source of evaluation within such processes.

2.2.3 Social-Emotional Learning

For a number of years, researchers in the field of social psychology have studied the influence of mood and emotions on cognitive processing (Fredricks et al., 2016). Emotions are “brief, psychophysiological changes that result from a response to a meaningful situation in one’s environment” (Rosenberg, 1998, p. 250) that usually arise in response to a specific person or event (Linnenbrink & Pintrich, 2002). Emotions are fast, automatic and occur unconsciously yet still have a marked influence on the way in which we think and interpret events (Kagan, 2007). Emotions are also present in science education despite the widely held belief that it is a dispassionate and emotionless discipline, whereas a full range of emotions is actually required for learning (Sinatra et al., 2014). Moreover, collaborative learning promoted by social constructivism may also help to improve the emotions experienced when pursuing common learning. The aim is to engage students in exploring their emotions about each other and about science as well for the purpose of supporting them in improving their social and emotional skills (Matthews, 2004). Different studies in science education provide evidence that positive emotions and enjoyment from learning science play a significant role in learning outcomes and serve as a driving force for self-learning, and for retaining knowledge (Nicolaou et al., 2015). In short, three essential principles of social-emotional learning are: (1) Caring relationships are the foundation of all lasting learning, (2) Emotions affect how and what we learn, and (3) Goal setting and problem-solving provide focus, direction, and energy for learning (Elias, 2004; Haynes et al., 2003).

3 Purpose of the Research

This research was carried out in a training programme for PSTs centred on the design, implementation and evaluation of GRs and e-rubrics for science teaching in the social-constructivism framework and the reflective teacher. The aim was to study the impact of reflection performed by PSTs on the evaluation of GRs when using e-rubrics and optimising them in an iterative process involving all PSTs. The initial hypothesis proposed was that the activities carried out, and the reflection thereof, may influence the criteria proposed when designing the e-rubric or affect their emotions. Secondly, the potential impact of the programme as regards design of the GR and evaluation using e-rubrics may affect its transfer into educational practice.

In order to achieve these goals the following research questions were posed:

  • Research Question 1 What is the impact of reflection on the evaluation of GRs using e-rubrics?: (a) How do the evaluation criteria that PSTs use when designing an e-rubric to evaluate GRs evolve as the e-rubric is modified during a programme that encourages reflection? and (b) how does the group consensus affect the criteria chosen?

  • Research Question 2 What is the impact of emotions on PSTs when participating in reflection activities related to the design of GRs and an e-rubric to evaluate them?

  • Research Question 3 What is the impact of the programme on transfer into practice?: (a) How do PSTs perceive the e-rubric as an evaluation instrument before and after participating in the programme? and (b) to what extent are the PSTs able to transfer the knowledge that they have learned in this training programme into practice?

4 Method

4.1 Participants

A nonprobability convenience sample of 50 Spanish PSTs studying the “Teaching innovation and introduction to educational research” module of the Masters in Secondary School Teaching at the University of Málaga (Málaga, Spain) in the academic year 2018/19 was taken. This subject requires a reflection concerning educational innovations as one of the main teaching contents as they are being trained for their professional activity as teachers. These PSTs were from the specialist subjects Biology and Geology (31) and Physics and Chemistry (19). Of these, 60% were female and the rest male, and they were aged between 21 and 39 years. The most common profile amongst Biology and Geology PSTs was a degree in biology or environmental science, whereas that for Physics and Chemistry was a degree in chemistry or chemical engineering. It should be noted that although participation in all activities was high, as many as nine participants failed to answer some questionnaires. Moreover, 53.1% of PSTs had never used a rubric for evaluation, and 97.9% had never used an e-rubric.

The programme was designed jointly by the paper’s three authors, who had extensive experience as lecturers in science education. Two lecturers (second and third authors) collaborated with the course lecturer (first author) in the implementation and the evaluation.

4.2 Training Programme

This programme was carried out during this module as part of a topic on educational innovation in science teaching that included a series of tasks which allowed PSTs to reflect on GRs and e-rubrics. The programme comprised seven sessions (8.5 h of class participation plus 25 h of personal homework: 20 h for design and construction of the GR and 5 h for design and improvement of the e-rubrics) distributed over a period of 23 days.

Figure 2 provides a schematic view of the training programme, showing the four dimensions that were carried out in parallel: (1) Reflection activities; (2) GRs, in which PSTs received training and designed, constructed and exchanged experiences; (3) evaluation using an e-rubric, in which PSTs were expected to design, optimise and use their own e-rubric to evaluate GRs, and then optimise it with the group; (4) social-emotional learning, in which students were expected to question their own emotional situation in each of the activities in order to learn how to understand their emotions as a future teacher.

Fig. 2 Training programme

4.2.1 Dimension 1: Reflection Activities

This dimension was mainly carried out via a training activity known as personal reflection (Luque et al., 2021), which was obligatory and held at the start of the session. It is considered to be the presentation of a personal view of a topic covered during the module and involves summarising, self-evaluation, broadening of understanding and the drafting of questions. All PSTs were required to perform this once during the module. At each session, two PSTs reflected independently about aspects covered in the previous session, using a digital presentation of around 10 min and including three important ideas covered in class, three aspects that they did not understand, three aspects they would have liked to learn more about, and one question for the discussion. As such, PSTs reflected on training in GRs and e-rubrics in sessions 2 and 3, respectively, on the GR fair and the efficacy of the e-rubric designed in session 5, and on the consensus e-rubric in session 7. In addition, two surveys were used in sessions 5 and 6 to prompt reflection on some aspects of the e-rubric, namely its drawbacks or strengths, or consensus criteria.

4.2.2 Dimension 2: Gamification-Based Resources

This dimension involves training in GRs, followed by the design and elaboration of a GR and its implementation and dissemination as part of a science fair.

Gamification-Based Resources Training (session 1) The aim was to determine best practices for GRs for use in secondary education and analyse key aspects of the design, implementation and evaluation thereof. The lecturer presented different GRs, supported by research, previously implemented in a secondary school classroom. During the session, PSTs had the opportunity to use and analyse, amongst others, a set of cards for learning the periodic table (Franco-Mariscal et al., 2012), a puzzle to discover the bones in the body (Franco-Mariscal & Cano-Iglesias, 2011), augmented and virtual reality tools for science (Moreno & Franco-Mariscal, 2019), a worldwide football competition based on the properties of the chemical elements (Franco-Mariscal, 2014) or a bingo concerning the properties of atoms (Franco-Mariscal & Cano-Iglesias, 2014) (Fig. 3).

Fig. 3 Some of the GRs used in the training session: set of cards for the periodic table (left) and skeleton puzzle (right)

Design and Elaboration of the Gamification-Based Resources The design and elaboration of a GR for secondary education (12 to 18 years) were proposed as a task (session 1), with two products having to be submitted: The GR itself and a project explaining the design, including learning objectives, the educational level at which it is to be used, the competences to be developed, the scientific aspects covered, the materials required, methodology, instructions for use and explanatory photographs. The PSTs designed board games (13), cards (10), puzzles (8), digital GRs (7), question and answer games (3), dominoes (2), a role-playing game (1) and an escape room (1). Some examples can be found in Fig. 4. Despite having asked for a GR, five PSTs designed resources that did not meet this requirement.

Fig. 4 Examples of GRs designed by the PSTs: cards to learn the periodic table (top left), board game on blood groups (top right) and digital resource on the cardiovascular system (bottom)

Implementation of the Gamification-Based Resources in a Science Fair The PSTs presented their GR in a science fair in class (session 4), 15 days after session 1, and were given access to a stand to present the GR and a poster; they were allowed to use any type of material. The fair was held in two sittings of one hour each, with the PSTs being organised into two groups with different roles. At each sitting, one group of PSTs played the role of the teacher implementing the GR, explaining it to their fellow students (second group), who played the role of a student visiting the fair and evaluating the GRs (Fig. 5). These groups exchanged roles after the first hour.

Fig. 5 Science fair for GRs in the classroom

4.2.3 Dimension 3: Evaluation Using e-rubrics

The third dimension of the programme involved the e-rubric as an evaluation instrument and included training, different design activities and optimisation of the e-rubric for evaluating GRs. It comprised the following activities:

(1) Perceptions of Evaluation Instruments (sessions 2 and 6). The PSTs responded to the question “With which evaluation instrument(s) will you feel most comfortable evaluating your future students: a written exam, observation in class, portfolios, rubrics, questionnaire, oral exam, essays or evaluation with GRs? You can choose more than one instrument”.

(2) Training in e-rubrics and CoRubric (session 2) This training covered e-rubrics as an evaluation instrument and the free collaborative evaluation platform (corubric.com) (Cebrián-Robles, 2016). It was noted that an e-rubric is an evaluation instrument, generally in the form of a grid, although flexible, that collects various pieces of evidence and specifies different achievement levels for each (Cebrián-Robles, 2016). CoRubric was presented as a collaboration-based methodology and technological tool that is characterised by its flexibility and allows the design of e-rubrics whose evidence items each have different levels of achievement. This tool allows different qualitative and quantitative evaluation methodologies to be designed and applied, such as peer evaluation, self-evaluation, ipsative evaluation, 360° evaluation or group evaluation. It also allows evaluation using mobile devices, and evaluations performed by fellow students to be received in situ. As part of this training, instructions regarding how to register with the CoRubric platform and how to create e-rubrics were provided, along with the means of including an e-rubric in a common project that allows each PST to see the e-rubrics of their fellow students and use their own e-rubric to evaluate GRs. PSTs had the opportunity to use an e-rubric to evaluate the personal reflection activity performed by a fellow student.

(3) Design of an Initial e-rubric to Evaluate Gamification-Based Resources This task consisted of preparing an initial e-rubric using CoRubric, which each PST then had to improve, on an individual basis, on at least three occasions during the programme. This was posed as follows: “Using CoRubric, construct an e-rubric that can be used to evaluate all GRs that will be created by your fellow students. This e-rubric must contain a minimum of three elements and a maximum of five. Each element must have four achievement levels.” When constructing the initial e-rubric, the PSTs had not yet prepared the GR, so the design criteria reflected their prior conceptions.
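The structural constraints in this task statement (three to five elements, each with four achievement levels) can be expressed as a simple check. The following is a minimal sketch only: the `RubricElement` class, the `validate_rubric` function and the example criteria are hypothetical illustrations, and they do not represent CoRubric's data model or any e-rubric actually designed by the PSTs.

```python
from dataclasses import dataclass
from typing import List

MIN_ELEMENTS = 3        # minimum number of elements stated in the task
MAX_ELEMENTS = 5        # maximum number of elements stated in the task
LEVELS_PER_ELEMENT = 4  # achievement levels required for each element


@dataclass
class RubricElement:
    """One rubric element (criterion) with its achievement-level descriptors."""
    name: str
    levels: List[str]


def validate_rubric(elements: List[RubricElement]) -> List[str]:
    """Return a list of problems; an empty list means the constraints are met."""
    problems = []
    if not MIN_ELEMENTS <= len(elements) <= MAX_ELEMENTS:
        problems.append(
            f"expected {MIN_ELEMENTS}-{MAX_ELEMENTS} elements, got {len(elements)}"
        )
    for element in elements:
        if len(element.levels) != LEVELS_PER_ELEMENT:
            problems.append(
                f"element '{element.name}' has {len(element.levels)} levels, "
                f"expected {LEVELS_PER_ELEMENT}"
            )
    return problems


# Hypothetical rubric; element names are invented for illustration only.
example = [
    RubricElement("Scientific content", ["poor", "fair", "good", "excellent"]),
    RubricElement("Playability", ["poor", "fair", "good", "excellent"]),
    RubricElement("Originality", ["poor", "fair", "good"]),  # only three levels
]

for problem in validate_rubric(example):
    print(problem)
```

In this invented example the third element is reported because it has three levels rather than four; a rubric that meets both constraints yields an empty list.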

(4) Construction of a Second e-rubric (Pre-Fair e-rubric) Based on Reflection During the Preparation of a Gamification-Based Resource Whilst the PSTs were preparing the GR, they were given a refinement period of 14 days in which they could modify the initial e-rubric, if they considered this to be necessary, to include aspects related to the GR. During this period, PSTs did not receive any instructions concerning how to improve the initial e-rubric, although they were given access to the e-rubrics designed by their fellow students via CoRubric and could search for information on the internet. Moreover, session 3, which involved a reflection on e-rubric training, was also held in this period.

(5) Use of the Pre-Fair e-rubric to Evaluate Gamification-Based Resources in a Science Fair The PSTs were given the opportunity to use CoRubric and the pre-fair e-rubric to evaluate the GRs of other PSTs at the science fair (session 4; Fig. 6).

Fig. 6 PST using CoRubric to evaluate a GR at the fair

(6) Construction of a Third e-rubric (Post-Fair e-rubric) After the fair, PSTs were able to consult the evaluations given to their GRs by their fellow students in CoRubric. This activity allowed them to reflect on the drawbacks and strengths of their e-rubric in situ, doing so in writing by completing a survey (Table 2) (session 5). The PSTs were given three days to prepare this post-fair e-rubric as a result of this reflection.

Table 2 Survey to reflect on the efficacy of the pre-fair e-rubric

(7) Construction of a Consensus e-rubric At session 6, and in light of the results of the post-fair e-rubric, the PSTs performed the following task, which allowed the consensus e-rubric to be constructed as a group: “In this activity, we are all going to decide on a consensus e-rubric to evaluate a GR. To that end, the course lecturers have reviewed the e-rubrics designed by you. Amongst all PSTs, you have used a total of 18 items which you consider to be important for evaluating a GR. Rate each one from 1 to 10 depending on their suitability for inclusion in a consensus e-rubric for evaluating a GR.”

4.2.4 Dimension 4: Social-Emotional Learning

Upon completing each session, each PST was asked to complete an adapted version of the KPSI (Knowledge and Prior Study Inventory) questionnaire (Jiménez-Liso et al., 2021) (Appendix), in an attempt to get them to reflect on the emotions experienced in some activities of the programme. They indicated which of the nine emotions presented they had felt and could mark as many as they wished. Five important aspects concerning GRs (training and fair) and the e-rubrics (training, use and consensus) were covered. This questionnaire uses different emotions applicable to educational activities while avoiding an overlap between them, specifically: rejection, concentration, insecurity, interest, boredom, confidence, satisfaction, dissatisfaction, and shame. This study is not intended to stimulate some emotions or consider some as positive (e.g. interest, satisfaction) and others as negative (e.g. insecurity, boredom), a dichotomy that has been used to analyse teaching practice (Marks, 2000) by associating positive emotions with success and negative emotions with failure (Pekrun & Linnenbrink-García, 2014). In our opinion, emotions are not always positive or negative but are much more complex and are found on a continuum. This study aims simply to recognise the emotions experienced by way of an individual and collective reflection designed to describe which types of emotions are experienced in the classroom setting during the programme while giving a meaning to them as part of educational practice.

The questionnaire also included a question to evaluate their perception of understanding, before and after the session, of the aspects covered, using a Likert scale with responses from 1 to 5 points (1: I do not know anything, 2: I know a little, 3: I know it well, 4: I know it very well, 5: I can explain it to a friend).

4.3 School-Based Practicals for Pre-Service Secondary Science Teachers

The participants in this study carried out 125 h of practice in a secondary school over a period of five months, which started one month after the end of the proposed training programme. During this period, the PSTs had to design and implement a science teaching–learning sequence for secondary school students. At the end of the period, they presented a portfolio in which this sequence was included. These teaching–learning sequences were not predefined in the training programmes of their Masters course and could include content from any subject. Thus, the PSTs themselves decided the specific work topic, the content and their approach to teaching activities, although the content was always agreed with their academic and professional tutors, who were not involved in the research presented here. Consequently, PSTs were free to decide whether or not to include GRs designed by themselves or e-rubrics to evaluate as part of the teaching–learning sequence. In addition to these practicals, PSTs were required to submit a Final Masters Project, which included the teaching–learning sequence designed and implemented and an improved version of that project based on the reflection of their teaching practice. The PSTs submitted the Final Masters Project six months after completing the training programme. For this study, it was only possible to access the 38 Final Masters Projects deposited in the official repository established by the faculty.

5 Data Analysis

Data analysis was carried out in three stages, one for each research question proposed. The first analysis addressed the impact of reflection on the evaluation of the GRs by studying the evolution of the four e-rubrics designed by the PSTs during the programme (Research Question 1). The second analysis involved emotions and the evolution of understanding at different moments associated with the GR and the e-rubric (Research Question 2). The third analysis addressed transfer into practice, concentrating on the design of the GR and use of e-rubrics found in the PSTs’ Final Masters Project (Research Question 3).

5.1 Data Analysis 1: E-rubrics


The researchers established the items that the PSTs considered important for assessing the GRs by performing the analysis shown in Fig. 7.

Fig. 7 Data analysis of e-rubrics

5.1.1 Establishment of Micro- and Macrocategories


Initially, the evaluation criteria proposed by each PST in the initial e-rubric were analysed, establishing microcategories that included statements regarding the same idea. It was also determined whether these microcategories were maintained in the pre-fair e-rubric and the post-fair e-rubric, or whether new ones appeared after the reflection activities carried out. Finally, these microcategories were grouped into two macrocategories to differentiate between who (speaker) or what (GR) was the basis for each evaluation criterion.

5.1.2 Establishment of a Consensus E‑Rubric


The microcategories were used to allow the PSTs to establish to what extent each evaluation criterion was pertinent for inclusion in the collective consensus e-rubric. A score-based voting system was used, and each PST could score each microcategory from 1 to 10 points (Section 4.2.3, point 7). The total score given by the 50 PSTs to each microcategory was then added up, with the score ranging from 50 to 500 points (if all PSTs had scored that microcategory as 1 or 10, respectively). The researchers selected half of the microcategories (those with the highest scores) for the consensus rubric.
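The score-based vote described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_consensus`, the labels and the toy scores are hypothetical, and the treatment of ties (keeping every microcategory whose total equals the score of the last selected one) is an assumption rather than a documented rule of the study.

```python
from typing import Dict, List


def select_consensus(scores: Dict[str, List[int]]) -> List[str]:
    """Sum the 1-10 scores of each microcategory and keep the top-scoring half.

    `scores` maps a microcategory label to the list of scores it received,
    one per voter. Ties with the last selected total are kept (assumption).
    """
    for label, values in scores.items():
        if any(not 1 <= v <= 10 for v in values):
            raise ValueError(f"{label}: every score must lie between 1 and 10")

    totals = {label: sum(values) for label, values in scores.items()}
    ranked = sorted(totals, key=totals.get, reverse=True)

    keep = len(ranked) // 2  # half of the microcategories
    if keep == 0:
        return []
    cutoff = totals[ranked[keep - 1]]  # total of the last selected item
    return [label for label in ranked if totals[label] >= cutoff]


# Hypothetical toy data: four microcategories scored by three voters.
toy_scores = {
    "A": [10, 9, 8],
    "B": [5, 5, 5],
    "C": [7, 8, 9],
    "D": [1, 2, 3],
}
selected = select_consensus(toy_scores)
```

With 50 voters, each total necessarily falls between 50 and 500, which matches the range given in the text.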

5.1.3 Evolution of the Categories in the E-Rubrics

To study the evolution of the evaluation criteria from the initial e-rubric to the consensus e-rubric, the frequency of appearance of each microcategory at the four stages was compared. Specifically, the number of PSTs who included each microcategory was analysed for the different possible combinations of the four e-rubrics. These possible combinations were represented with a four-digit code consisting of 0 and 1 (for example 1010), with each position representing the corresponding e-rubric (the first digit relates to the initial e-rubric, the second to the pre-fair e-rubric, and so on), and with a value of 1 indicating that said category appeared in that e-rubric and a value of 0 indicating that it did not. Thus, the combination 1010 assigned to a specific microcategory indicates that said criterion was only included in the initial e-rubric (1 in the first digit) and the post-fair e-rubric (1 in the third digit).
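As an illustration of this coding scheme, the following Python sketch builds and decodes the four-digit codes and tallies how many PSTs share each combination. Following the 1010 example, a 1 marks inclusion of the microcategory in the corresponding e-rubric. The stage labels and function names are hypothetical choices made for this sketch.

```python
from collections import Counter
from typing import Iterable, List, Set

# Order of the four e-rubrics, matching the digit positions in the code.
STAGES = ("initial", "pre-fair", "post-fair", "consensus")


def presence_code(included_in: Set[str]) -> str:
    """Return the four-digit code for one PST and one microcategory."""
    unknown = included_in - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stage(s): {sorted(unknown)}")
    return "".join("1" if stage in included_in else "0" for stage in STAGES)


def decode(code: str) -> List[str]:
    """List the e-rubrics in which the microcategory appeared."""
    if len(code) != len(STAGES) or set(code) - {"0", "1"}:
        raise ValueError("code must be four digits, each 0 or 1")
    return [stage for stage, digit in zip(STAGES, code) if digit == "1"]


def tally(codes: Iterable[str]) -> Counter:
    """Count how many PSTs fall into each combination."""
    return Counter(codes)


example_code = presence_code({"initial", "post-fair"})
example_stages = decode("0110")
example_tally = tally(["0000", "0001", "0000", "1111"])
```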

The McNemar test was used to check for possible statistically significant differences between the different moments at which PSTs prepared the e-rubric, as these are related (paired) samples of categorical variables, for which a non-parametric test is appropriate because the data do not fit a normal curve. Specifically, three key moments in the programme that could affect the decision taken by the person preparing the e-rubric, and which allowed three pairs of e-rubrics to be compared, were chosen, namely:

  • Preparation of the GR: This allows a comparison between the initial e-rubric and the pre-fair e-rubric.

  • Use of the e-rubric to evaluate GRs at the science fair: This allows a comparison between the pre-fair e-rubric and the post-fair e-rubric.

  • Establishment of the consensus e-rubric: This allows a comparison between the post-fair e-rubric and the consensus e-rubric.

In addition, the McNemar test was also used to check the impact of the training programme. This test allows a comparison between the initial e-rubric and the consensus e-rubric.
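For readers unfamiliar with the McNemar test, a minimal Python sketch of the chi-square version is given below. It compares, for one microcategory, whether each PST included it (1) or not (0) in two e-rubrics, and uses only the discordant pairs. The function name and the toy data are hypothetical, and whether the study applied a continuity correction is not stated, so the `correction` flag is an assumption left to the reader. When there are few discordant pairs, an exact binomial version of the test is often preferred.

```python
import math
from typing import Sequence, Tuple


def mcnemar(before: Sequence[int], after: Sequence[int],
            correction: bool = False) -> Tuple[float, float]:
    """McNemar chi-square test (1 degree of freedom) for paired 0/1 data.

    Returns (statistic, p_value). Only discordant pairs contribute:
    b = included before but not after, c = included after but not before.
    """
    if len(before) != len(after):
        raise ValueError("both samples must contain one value per participant")

    b = sum(1 for x, y in zip(before, after) if x and not y)
    c = sum(1 for x, y in zip(before, after) if not x and y)
    n = b + c
    if n == 0:
        return 0.0, 1.0  # no change at all between the two moments

    diff = abs(b - c)
    if correction:
        diff = max(diff - 1, 0)  # Edwards continuity correction
    statistic = diff ** 2 / n

    # Survival function of chi-square with 1 df: P(X > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical toy data for 22 participants:
# 2 dropped the criterion, 10 added it, 5 kept it, 5 never used it.
toy_before = [1] * 2 + [0] * 10 + [1] * 5 + [0] * 5
toy_after = [0] * 2 + [1] * 10 + [1] * 5 + [0] * 5
toy_stat, toy_p = mcnemar(toy_before, toy_after)
```

The same function can be applied to each of the pairs of e-rubrics listed above by passing the corresponding columns of 0/1 values.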

Finally, the researchers in this study constructed a new e-rubric (individual evolution e-rubric) to study the evolution between the first three e-rubrics reflected on individually by the PSTs and the consensus e-rubric reflected on collectively. This individual evolution e-rubric considers all the microcategories which the students highlighted in at least one of their three individual rubrics. The McNemar test was also used to check for statistically significant differences between the individual evolution e-rubric and the consensus e-rubric. The frequency of PSTs who used each microcategory in at least one of the three e-rubrics prepared was taken into consideration for the individual evolution e-rubric.
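The individual evolution e-rubric amounts to a logical OR over the first three e-rubrics. A short Python sketch, assuming the four-digit codes described earlier (first three digits for the individual e-rubrics, fourth for the consensus) and using hypothetical function names and toy codes, could look like this:

```python
from typing import Iterable


def individual_evolution(code: str) -> int:
    """Return 1 if the microcategory appears in at least one of the
    three individual e-rubrics (first three digits), otherwise 0."""
    if len(code) != 4 or set(code) - {"0", "1"}:
        raise ValueError("code must be four digits, each 0 or 1")
    return int("1" in code[:3])


def evolution_frequency(codes: Iterable[str]) -> int:
    """Number of PSTs who used the microcategory at least once individually."""
    return sum(individual_evolution(code) for code in codes)


# Hypothetical codes of five PSTs for a single microcategory.
toy_codes = ["0001", "0100", "1111", "0000", "1010"]
toy_frequency = evolution_frequency(toy_codes)
```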

5.2 Data Analysis 2: Emotions and Evolution of the Perception of Understanding

Emotions were analysed quantitatively based on the frequency of PSTs who expressed each emotion in the most important aspects concerning GRs (training and fair) and the e-rubrics (training, implementation and consensus). It was taken into account that the PSTs could mark more than one emotion. Therefore, a bar graph was constructed in which each bar represents an activity of the training programme.
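A frequency count of this kind, in which each PST may mark several emotions, can be sketched as follows. The nine emotion labels are those of the questionnaire; the function name and the toy responses are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set

EMOTIONS = (
    "rejection", "concentration", "insecurity", "interest", "boredom",
    "confidence", "satisfaction", "dissatisfaction", "shame",
)


def emotion_frequencies(responses: Iterable[Set[str]]) -> Dict[str, int]:
    """Count how many respondents marked each emotion for one activity.

    Each response is the set of emotions marked by one PST, so a PST
    contributes at most once to each emotion but may contribute to several.
    """
    counts: Counter = Counter()
    for chosen in responses:
        unknown = set(chosen) - set(EMOTIONS)
        if unknown:
            raise ValueError(f"unknown emotion(s): {sorted(unknown)}")
        counts.update(set(chosen))
    return {emotion: counts[emotion] for emotion in EMOTIONS}


# Hypothetical responses of three PSTs for a single activity.
toy_responses = [{"interest", "concentration"}, {"interest"}, set()]
toy_counts = emotion_frequencies(toy_responses)
```

Because several emotions can be marked per person, the counts for one activity can add up to more than the number of respondents.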

The perception of the understanding of GRs and e-rubrics before and after each activity was analysed, calculating the percentage of PSTs who marked each value on the Likert scale (from 1 to 5) in the two moments. Finally, a comparison between the majority percentages in each case was performed to detect any progress that had taken place.
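The percentage calculation and the comparison of majority responses can be illustrated with the short Python sketch below. The function names and the toy responses are hypothetical, and resolving ties in favour of the lowest option is an assumption made only for this example.

```python
from collections import Counter
from typing import Dict, Sequence


def likert_percentages(responses: Sequence[int]) -> Dict[int, float]:
    """Percentage of respondents who marked each option from 1 to 5."""
    if not responses:
        raise ValueError("at least one response is required")
    if any(r not in range(1, 6) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    counts = Counter(responses)
    total = len(responses)
    return {opt: round(100 * counts.get(opt, 0) / total, 1)
            for opt in range(1, 6)}


def majority_option(responses: Sequence[int]) -> int:
    """Option with the highest percentage (lowest option wins ties)."""
    percentages = likert_percentages(responses)
    return max(percentages, key=percentages.get)


# Hypothetical responses of five PSTs before and after one activity.
toy_before = [1, 2, 2, 2, 3]
toy_after = [4, 4, 5, 5, 5]
progress = majority_option(toy_after) - majority_option(toy_before)
```

A positive value of `progress` indicates that the majority response moved towards a higher level of perceived understanding.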

5.3 Data Analysis 3: Transfer into Practice

Analysis of the possible change in perception of the PSTs regarding the use of the e-rubric as an evaluation instrument as a result of the programme was performed quantitatively through the variation in the percentage of their choice before and after the training programme.

Data analysis for the transfer into practice in the Final Masters Projects was performed as follows. Initially, a search was carried out for possible references to the design of GRs and use of e-rubrics in the Final Masters Project by reading each portfolio. In addition, portfolios were submitted in electronic format (PDF), and a search was carried out using Adobe Acrobat Reader to confirm that all the references to GRs and e-rubrics had been analysed. In this search, the keywords “resource”, “fair” and “rubric” were used to allow the different words related to GRs and e-rubrics to be located. The second step involved an analysis of the extent to which the GRs or e-rubric were mentioned in the Final Masters Project. This was achieved by establishing a system of progress that had different categories for the ideas learned during the training programme. For the case of GRs: (1) GR is not mentioned in the Final Masters Project; (2) GR is mentioned, but is not part of the teaching–learning sequence activities; and (3) GR is mentioned and is part of the design of activities for the teaching–learning sequence and is put into practice in the intervention. For the e-rubric: (1) rubric is not mentioned in the Final Masters Project; (2) rubric is mentioned, but is not part of the teaching–learning sequence design; (3) rubric is mentioned and is incorporated into the teaching–learning sequence design but is not involved in the intervention; (4) rubric is mentioned, is incorporated into the teaching–learning sequence design and is involved in the intervention, but it is not an e-rubric; and (5) rubric is mentioned, is incorporated into the teaching–learning sequence design, is involved in the intervention and an e-rubric is used.
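The two systems of progress can be read as simple decision rules. The Python sketch below encodes them under the assumption that each Final Masters Project has already been read and summarised as a few yes/no observations; the function and parameter names are hypothetical and the coding itself was carried out manually by the researchers.

```python
def gr_category(mentioned: bool, used_in_sequence: bool) -> int:
    """Category (1-3) for transfer of the GR into practice."""
    if not mentioned:
        return 1  # GR is not mentioned
    if not used_in_sequence:
        return 2  # mentioned, but not part of the sequence activities
    return 3      # part of the sequence design and put into practice


def rubric_category(mentioned: bool, in_design: bool,
                    in_intervention: bool, electronic: bool) -> int:
    """Category (1-5) for transfer of the rubric into practice."""
    if not mentioned:
        return 1  # rubric is not mentioned
    if not in_design:
        return 2  # mentioned, but not part of the sequence design
    if not in_intervention:
        return 3  # in the design, but not involved in the intervention
    return 5 if electronic else 4  # used in practice: e-rubric or traditional


# Hypothetical project: a traditional rubric designed and used in class.
toy_rubric = rubric_category(mentioned=True, in_design=True,
                             in_intervention=True, electronic=False)
toy_gr = gr_category(mentioned=True, used_in_sequence=False)
```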

6 Results and Discussion

6.1 Results 1: E-Rubrics

6.1.1 Establishment of Micro- and Macrocategories

Table 3 shows the micro- and macrocategories categorised by the authors of this study in accordance with the evaluation criteria used by the PSTs in their e-rubrics.

Table 3 Micro- and macrocategories

The first 15 microcategories (Table 3) were identified during analysis of the initial e-rubric. Three new microcategories (M16: Time required to use the GR; M17: Ability of the GR to transmit the content; and M18: Ability of the GR to evaluate understanding) emerged in the pre-fair e-rubric. The improvements made to the post-fair e-rubric and the consensus e-rubric did not generate new microcategories, which remained at 18. All microcategories could be grouped into two macrocategories, one centred on the speaker for the GR at the fair and the other centred on the GR itself.

As an example, Table 4 presents the evolution of the evaluation criterion content of the GR, associated with M08 (Sources used to prepare the GR) and proposed by one of the PSTs at the three perfection stages.

Table 4 Evolution of the e-rubric for one PST

As can be seen, this criterion was modified in terms of both the text defining it and the different achievement levels, and the process of improving the e-rubric provided aspects to be taken into consideration in these levels.

In general, the most important changes were found between the initial e-rubric and the pre-fair e-rubric and concerned improvements to the wording of the indicators and achievement levels for the various criteria similar to those of the examples. These changes were performed as a result of the reflection process carried out by the PSTs and may have been influenced by the preparation of their own GR, access to the initial e-rubric of their fellow students in CoRubric, the search for information on the internet or the personal reflections of PSTs in class. In contrast to Chetcuti and Cutajar (2014), who considered supervision of rubric preparation by the tutor to be essential, in the design of e-rubrics, PSTs did not require such supervision when modifying the evaluation instrument based on the limitations and reflections they performed during the training programme.

By contrast, the improvements made between the pre-fair e-rubric and the post-fair e-rubric were much less significant and, indeed, comprised the incorporation of minor variations in the criteria and indicators formulated previously rather than the introduction of new microcategories. The analysis of the reflections carried out by the PSTs at that stage, by way of a strengths and drawbacks survey for the pre-fair e-rubric after use in the science fair (Table 2), supports these minor changes. Thus, 98% of PSTs indicated that, after being used at the fair to evaluate the GRs of their fellow students, the pre-fair e-rubric needed to be improved, and 67.3% considered their e-rubric to be subjective. However, the majority were unable to apply this evaluation in the post-fair e-rubric and justified the need for changes in one of the following ways: (1) the e-rubric is too long for the time anticipated to evaluate a GR (6% of PSTs); (2) the achievement levels are imprecise (31%); (3) the e-rubric requires a higher number of evaluation criteria (41%); and (4) I disagree with the score that other e-rubrics give my GR (4%). As can be seen, the reflections noted by the PSTs simply reflect weaknesses in the pre-fair e-rubric for some criteria rather than a need to change the type of microcategory. In addition, it can be seen that all these changes are related to the macrocategory GR, with no changes being proposed for the macrocategory speaker.

6.1.2 Establishment of a Consensus E-Rubric

Table 5 lists the total score for each microcategory and indicates those accepted for inclusion thereof in the consensus e-rubric. Those microcategories that scored equal to or above 346 points, which was the score of the ninth microcategory in the ranking, were selected.

Table 5 Microcategories with or without consensus for the consensus e-rubric

These results are used in the following section to perform a statistical treatment that will allow us to compare the evolution of the evaluation criteria selected by each PST and the consensus version, thereby comparing individual and collective thinking.

6.1.3 Evolution of the Categories in the E-Rubrics

Table 6 lists the frequency (f) of PSTs, from highest to lowest, who included each of the 18 microcategories in each possible combination of the four e-rubrics. For instance, for the combination 0110, 12 PSTs included the microcategory M08 in the intermediate e-rubrics (pre-fair e-rubric and post-fair e-rubric) only. The total for each possible combination for the appearance of microcategories in e-rubrics gives some idea of the impact of the different activities in the programme as regards changes to the evaluation criteria as a result of reflection. The combinations with a high frequency in a microcategory, corresponding to f ≥ 22 (50% or more of the participating PSTs) are highlighted in black, those with 11 < f < 22 (between 25 and 50% of PSTs) in grey and those with f < 11 (less than 25% of PSTs) in white.

Table 6 Frequency of PSTs in each microcategory for the possible combinations of e-rubrics

As can be seen (Table 6), the combination 0000 is the most common and comprises four microcategories not assigned to any of the e-rubrics by more than 50% of the PSTs, including the consensus e-rubric: Sources used to prepare the GR (M08), Relationship between the GR and teaching in context (M11), Versatility of the GR for use in other activities (M15), and Time required to use the GR (M16). Indeed, M16 is the least common microcategory, being absent from the e-rubrics of more than 60% of PSTs.

It should also be noted that the second most common combination (0001) represents the appearance of some microcategories in the consensus version only. The categories Adaptation of the GR to the educational level (M05) and Ability of the GR to evaluate students’ understanding (M18) had the greatest impact on the changes in the consensus e-rubric given that more than 50% of PSTs had not even taken these criteria into consideration during the e-rubric construction process (from the initial e-rubric to the post-fair e-rubric).

Of those microcategories taken into consideration in all e-rubrics (combination 1111), the Creativity/originality/innovation of the GR (M12) was the only one considered by more than 50% of PSTs. The remaining combinations are not representative as they were not considered by more than 50% of PSTs.

Table 7 presents the frequency of appearance of each microcategory in each e-rubric and in at least one of the first three (individual evolution e-rubric). For example, 21 PSTs included M01 in at least one of the initial e-rubric, the pre-fair e-rubric or the post-fair e-rubric. Table 7 also presents the results of the McNemar test carried out to detect possible statistically significant differences between the e-rubrics designed at different moments of the programme: preparation of the GR (initial e-rubric & pre-fair e-rubric), evaluation of the GR in fair (pre-fair e-rubric & post-fair e-rubric), consensus (post-fair e-rubric & consensus e-rubric), training programme (initial e-rubric & consensus e-rubric) and during the programme and after consensus (individual evolution e-rubric & consensus e-rubric). The last column in Table 7 lists the differences between the evolution of the e-rubrics for each individual evolution e-rubric and the consensus e-rubric.

Table 7 Frequencies for microcategories in each e-rubric and McNemar test applied to different stages in the programme

These results showed that some aspects related to the design and preparation of the GR had a marked influence on the selection of criteria between the initial e-rubric and the pre-fair e-rubric given that statistically significant differences were found for three microcategories (M05: Adaptation of the GR to the educational level [χ² = 8.3; p = 0.006], M08: Sources used to prepare the GR [χ² = 6.8; p = 0.012] and M13: Ease of use of the GR [χ² = 7.1; p = 0.013]). This may be due to the fact that, during preparation of the GR, PSTs had to reflect individually on the design of their own GR and the best way of evaluating learning with the GRs prepared by their fellow PSTs, of which they were not yet aware. The fact that they were expected to construct their GR individually showed them that some important characteristics (educational level, sources or ease of use) favoured learning and could be extrapolated to other GRs. This reflection fits well with the constructivist approach, as the construction of knowledge begins with students, who actively participate in the construction of their own learning (Poerksen, 2004; Woolfolk, 2014). The case of M08 caused a high percentage of PSTs to reconsider its inclusion in the e-rubrics at the stage at which PSTs were collecting information for preparation of their own GR although, as noted above, more than 50% of them decided not to include it in any of the four e-rubrics in the end (combination 0000).

Evaluation of the GRs at the fair using an e-rubric did not affect the changes produced between the pre-fair e-rubric and the post-fair e-rubric, thus demonstrating that reflection on the evaluation criteria during preparation of the GR was sufficient. Despite this, use of the pre-fair e-rubric demonstrated the need to redefine the tests and achievement levels in the criteria, as also shown by the reflection survey completed after use of the e-rubric.

However, the programme stage with the most marked influence was clearly the contribution of criteria to the group when establishing the consensus version, as statistically significant differences were detected for 11 microcategories both between the post-fair e-rubric and the consensus e-rubric and between the initial e-rubric and the consensus e-rubric. These findings appear to indicate that a global vision of the criteria proposed by the group of PSTs causes them to reflect and to make new changes to their own e-rubrics.

Finally, the comparison between the individual evolution e-rubric and the consensus e-rubric, which showed statistically significant differences for 50% of the microcategories, should be noted. This highlights the fact that the criteria in an e-rubric used to evaluate GRs are different if designed in a consensual manner in a group or individually. In other words, PSTs have greater confidence when selecting criteria from a list generated in consensus with their fellow students than those that they would have chosen previously, which they exclude from the consensus despite their presence in the three previous e-rubrics. These findings reinforce the idea that encouraging PSTs to discuss ideas and share them with their fellow students leads to better learning regarding complex tasks. In this case, reaching consensus on the criteria for an e-rubric is a clear example of social constructivism (Akpan & Kennedy, 2020).

6.2 Results 2: Emotions and Evolution of the Perception of Understanding

Figure 8 shows both the emotions felt by the participants and the frequencies of each. The first two columns relate to GR-related aspects and the final three to the e-rubric.

Fig. 8 Emotions experienced by the participants

In general, it can be seen that not all emotions were experienced equally by the PSTs. Thus, several emotions are more common than others given the high frequencies obtained. This is the case, for example, for interest, satisfaction and concentration, with interest being the most common emotion at all stages analysed. Training in GRs was well-received, with a high degree of interest (42/48 PSTs) and concentration (33/48). Once the GR had been designed and implemented, this interest was maintained (32/41) and also generated satisfaction (24/41). In the case of e-rubrics, training also generated interest (30/44) and concentration (28/44), although their use at the fair generated low levels of interest (15/41), satisfaction (10/41) and confidence (11/41), thus suggesting that, at that stage, it was more important for PSTs to discover new GRs than to evaluate them. The implementation of the e-rubric also resulted in some degree of insecurity (9/41), thus highlighting that some PSTs did not agree with the result of their e-rubric. Finally, although the consensus criteria produced some degree of interest (17/44) and concentration (15/44), they also generated boredom (10/44) and insecurity and dissatisfaction (9/44), thus appearing to indicate that some PSTs accepted the group consensus but continued to believe that some of the criteria not present in the consensus e-rubric should have been included.

Table 8 lists the perception of understanding (in percentages) that PSTs indicated before and after these five activities. In all cases, an evolution of their learning of both GRs and e-rubrics was observed as the majority responses were found prior to the activities for options 1 and 2 and after them for options 4 or 5.

Table 8 Perception of GR and e-rubrics understanding before and after the activity analysed (in percentages)

6.3 Results 3: Transfer into Practice

The programme resulted in an important change in perception regarding the use of e-rubrics as an evaluation instrument, as shown by the increase of 28.7 percentage points in PSTs who felt comfortable with them (from 46.9% before the programme to 75.6% afterwards). The Final Masters Projects were analysed in order to check whether this positive perception of the use of e-rubrics, and the use of GRs, was translated into teaching practice. Figure 9 shows the percentages for transfer of the GR designed into practice, and use of e-rubrics by the PSTs, as obtained upon analysis of their Final Masters Project, for each category.

Fig. 9 Categorisation of transfer of the GRs (above) and e-rubrics (below) into practice

The results show that 74.4% of PSTs did not manage to transfer the GR designed into teaching practice as it was neither used nor even mentioned in the Final Masters Project. Only 23.3% included their GR as part of the design of the teaching–learning sequence in addition to putting it into practice. When interpreting these findings, we must take into account that each GR was designed for a specific topic, which in many cases was impossible to put into practice as the groups of students where the PSTs implemented their teaching–learning sequence may already have covered that topic.

However, there was no such problem with e-rubrics, which are applicable to any topic: 67.4% of PSTs incorporated a rubric into their teaching–learning sequence design and implemented a traditional (category 4) or electronic rubric (category 5) during their practice period, with 20.9% of these being electronic. This reflects the importance and awareness of PSTs regarding the use of rubrics as an evaluation instrument, thus allowing a more objective evaluation. It also shows that PSTs feel more able to evaluate using rubrics after the programme.

7 Conclusions and Educational Implications

The results obtained in this study allowed us to draw a series of conclusions regarding the research questions.

The study carried out has shown that the training programme is able to promote reflection in PSTs for the design and improvement of the evaluation criteria for an e-rubric that is to be used to evaluate GRs, as part of a social constructivism approach. Said reflection had an impact on both the evaluation criteria proposed and the selection thereof in a consensual manner (Research Question 1). Thus, the criteria for the e-rubric undergo marked changes during the design and preparation of a GR in parallel, with adaptation of the GR to the educational level (M05), the sources used to prepare the GR (M08) and ease of use of the GR (M13) being the most influential aspects as regards evolution of the criteria. Thus, the evaluation criteria in the e-rubric can be defined adequately during the GR preparation process, although use of the e-rubric itself causes the PSTs to become aware of how the achievement levels for the criteria defined should be scored (Research Question 1a). In addition, the reflections of PSTs after using the e-rubric show that it is more difficult to evaluate a GR than the speaker. This may be due to the fact that more evaluation criteria were identified to evaluate the speaker than the GR.

Similarly, the agreement of consensus criteria by PSTs when preparing an e-rubric results in a more in-depth and significant reflection as regards the criteria that must be applied after evaluation than when these criteria are defined individually, despite a prior reflection concerning them (Research Question 1b). This highlights the importance of a social constructivism that facilitates discussion and reflection amongst PSTs, encouraging them to discuss ideas which they must agree on and support (Akpan & Kennedy, 2020). Thus, the group consensus when preparing e-rubrics is shown to be a key aspect for many PSTs, prompting them to modify their evaluation criteria so that they are closer to a majority agreement that they had previously not even considered, for example in the case of the importance of designing and adapting the GR to the educational level (M05) or the ability of the GR to evaluate understanding (M18). The selection of this latter criterion for the consensus e-rubric highlights their awareness of the utility of GRs for both the transmission of understanding and as an evaluation instrument.

Moreover, the programme had an impact on the emotions experienced by the PSTs (Research Question 2), which were favourable, especially during preparation and use of the GR, with interest being the most common emotion in all cases. The insecurity regarding use of the e-rubric and the dissatisfaction during the consensus should also be noted, the latter suggesting that some PSTs were unwilling to change their criteria.

The programme also had a marked impact on transfer into practice (Research Question 3). This contributed to a positive change in perception with regard to use of the e-rubric as an evaluation instrument (Research Question 3a), as confirmed upon follow-up of its use in the teaching–learning sequences carried out during their practice periods. In contrast, the transfer of GRs was not as successful (23.3% of PSTs) as, in most cases, it was impossible to put them into practice given that the stage of the curriculum which the students had reached did not coincide with the content for which they had been designed (Research Question 3b).

We believe that the training programme presented herein makes a series of contributions to the literature:

  1. This programme allows reflection to be promoted during evaluation in the context of gamification and use of e-rubrics, and gives some idea of the type of activities that produce good results in the framework of social constructivism.

  2. In contrast to other evaluation methodologies, where student learning is typically dependent on supervision by the tutor when conducting evaluations (Chetcuti & Cutajar, 2014), our programme does not require this and is able to involve students in decision-making when constructing an e-rubric that is perfected several times. This is due to the different reflection activities proposed and the need to choose appropriate criteria for evaluating different GRs with unknown characteristics.

  3. The programme also helps to ensure that PSTs both “do science” and “feel science” (Bellocchi et al., 2017), and to that end provides opportunities to reflect on their emotions, such as a science fair where they can share the GRs they have designed or collaborate or exchange ideas to prepare a consensus e-rubric.

According to the results, within the framework of social constructivism and the reflective teacher, the educational implications of this study are:

(a) Involve PSTs in reflection throughout complete cycles of design, implementation and evaluation of activities, in this case on GRs and e-rubrics, offering opportunities at different moments of the training programme.

(b) Reflect on the same activity, first individually and then collectively, generating participation processes in group activities. In this sense, the use of science fairs with PSTs, which have been shown to promote group reflection, should be further encouraged. In addition, these science fairs could be extended to topics other than GRs, such as inquiry projects (Alarcón et al., 2021).

(c) Encourage group reflection on evaluation instruments to lead to consensus, which will impact the evaluation. Thus, a group consensus to develop e-rubrics can facilitate the understanding of evaluation criteria that otherwise would not have been considered.

(d) Help PSTs to understand the evaluation criteria when designing an e-rubric. It is essential to keep in mind that an e-rubric containing too many evaluation criteria can be complex. Therefore, a good balance in the number of criteria will have to be found. On the other hand, PSTs are often unaware of the criteria and scope of an e-rubric until it is used, even if they are the authors of the rubric. According to Cebrián-Robles et al. (2014), peer review helps students understand the criteria.

(e) Introduce activities in which PSTs can interact with each other to facilitate bringing out the emotions of PSTs during design, implementation, and evaluation. For instance, science fairs are again a meeting point that brings out these emotions as PSTs interact.

Finally, the main limitation of this study, which has already been noted by O’Keeffe and Paige (2020), is that undertaking reflective practices in the classroom requires both time and the opportunity to implement them. In our opinion, and given the results obtained, this is well worth the effort.