1 Introduction

Educational innovations such as blended learning (BL) have become increasingly popular in higher education as educational technology has evolved and become more user friendly and accessible (Garrison & Kanuka, 2004). The Covid-19 pandemic, however, accelerated this assimilation into the digital era, forcing universities to lurch forward in taking up digital tools to meet the need for online teaching (Scherer et al., 2021; UNESCO IESALC, 2020). Under normal circumstances, successful implementation of BL initiatives requires careful planning and consideration of multidimensional factors (Philipsen et al., 2019a), and this success needs to be carefully and thoroughly evaluated (Guskey, 2000).

There is a great need for transparency in evaluating professional development initiatives (PDIs) in higher education, where the interests of many stakeholders, as well as the allocation of institutional funding, are at stake. With BL an ever-rising trend in universities, often applied as a panacea for “upgrading” a university’s educational profile (Becker et al., 2017), these initiatives need to be implemented in a systematic and context-appropriate way. Evaluation of these initiatives therefore cannot be applied haphazardly as an afterthought, but needs to be integrated into the implementation process itself (Philipsen et al., 2019a).

The Covid-19 pandemic has undoubtedly had an impact on higher education globally. The shift towards online learning has fast-tracked the implementation of educational technologies in higher education, forcing teaching staff who were previously skeptical or unwilling to reckon with the realities of online and blended learning. Training and support had to step up accordingly, and the important lessons learned during this period should not be lost going forward. Evaluation is an important component of professional development for BL that can ensure transparency, continuity and efficient allocation of institutional resources. Ultimately, this study aims to gain an overview of how the evaluation of professional development for BL is organized, and to discuss recommendations on how to integrate evaluation into future professional development initiatives for BL.

1.1 Blended learning in higher education

While there are varying definitions of and approaches to BL, the focus of transforming courses into a blended format is on enhancing student learning, rather than on replacing face-to-face lectures or simply adding another learning platform (Bohle Carbonell et al., 2013). BL has been praised as a flexible approach that takes advantage of the best that online teaching tools can offer while optimizing traditional physical lectures (Köse, 2010). Claims about the effectiveness of BL have been the subject of countless studies, many of which point to a collection of evidence that students achieve better academic results in BL environments than in purely online or face-to-face formats (Siemens et al., 2015). BL is an approach that requires innovation and the deliberate design of a combination of teaching and assessment. The approach thus allows for constructive alignment between theory, practice and work experience, with consideration for the skills that young professionals need when entering the labour market (Biggs, 1996; Bohle Carbonell et al., 2013).

Educational technology allows institutes to stay competitive as educational trends change. Garrison and Kanuka (2004, p. 2) predicted that young learners would need “flexibility of time and place and the reality of unbounded educational discourse”. Almost a decade after their initial paper, frameworks for institution-wide adoption of BL emerged, with a call for re-examining and refining BL policies specifically in the context of higher education. Graham et al. (2013) found that institutional BL implementation often happens in three stages: first awareness and exploration, followed by some form of adoption with intensive central support. The third stage is characterized by growth and mature implementation, in which BL is well established and becomes an integral part of the normal functions of the institute. Han et al. (2019) corroborated Graham’s framework by assessing faculty online teaching presence in universities at various BL implementation stages: the more advanced a university’s implementation stage, the more frequent its online interactive course facilitation. Ultimately, success in the institutional implementation of BL depends heavily on central support and leadership in the form of quality PDIs aimed at transforming teaching practices (Garrison & Vaughan, 2013).

1.2 Professional development for blended learning in higher education

PDIs comprise all learning opportunities and events, ranging from intentionally and formally organized activities to informal, implicitly occurring events that may remain unrecognized. Examples include skills workshops, formal courses, communities of practice, and coaching and mentoring situated within the work environment. The ultimate aim of all PDIs in educational settings is to improve teaching quality and student learning outcomes (Evans, 2014; Guskey, 2000). Like the design of blended courses themselves, PDIs are deliberately and thoughtfully designed, systematically implemented, and require intentional and ongoing effort from participants and leaders (Guskey, 2000; Merchie et al., 2018). PDIs involving university teaching staff, however, need to take into account a unique context that differs from teacher training and teacher professional development. University teaching staff usually comprise professors, their assistants and other research or teaching staff who, besides teaching, are often also burdened with administrative duties alongside research and project management (Díaz et al., 2010; Teixeira Antunes et al., 2021). Depending on the country and educational legislation in question, university teaching staff have varied backgrounds in teaching competences, ranging from extensive training and certification to almost no training at all (Díaz et al., 2010).

Professional development for BL as an educational innovation should therefore address the possible need for change in teaching practices, as well as change in institutional policy and leadership structures (Garrison & Vaughan, 2013). Research on institutional drivers for BL has indicated that one of the strongest factors in BL and change management is a strong institutional triggering event (Vaughan, 2010). Triggering events at the institutional, or even macro (national/regional), level include, for example, realizations about student satisfaction, changes in labour market needs, internationalization, and mobility. Ultimately, university teaching staff have to feel a sense of urgency that moves them to address the issue through change and innovation (Vaughan, 2010).

On a micro-level, a pedagogical shift is associated with BL approaches, in which teaching staff learn to develop and integrate their content knowledge along with pedagogical and technological knowledge (Koehler & Mishra, 2009). Brinkley-Etzkorn (2018) found that after a PDI for BL, faculty adopted more of a pedagogical role, but that the integration of technology and pedagogy remained a challenge. A plausible reason for this, they argue, is that educational technology is fast-changing and evolving, with every new tool needing to be learned anew.

1.3 Evaluating professional development initiatives

The goal of a professional development initiative for university teaching staff is to provide an effective way of transferring knowledge and skills in order to improve student learning outcomes. However, determining the effectiveness of a PDI is challenging (Zeggelaar et al., 2020). Attempts to evaluate professional development outcomes range from comparing students’ scores before and after the PDI and testing the knowledge transfer of the teaching staff, to gathering data on the satisfaction and personal experiences of the participants involved (Jaramillo-Baquerizo et al., 2018; Zeggelaar et al., 2020).

Zeggelaar et al. (2020) addressed the question of PDI effectiveness in a study in which a PDI was evaluated against a list of design requirements, and outcomes were measured using an evaluative framework comprising mainly the Guskey (2000) and the Kirkpatrick and Kirkpatrick (2006) frameworks, as well as measurement concepts such as the stages of concern (George et al., 2006). The study concluded that post-training compliance, and therefore effectiveness, depended on timing and duration issues, as well as on frequent follow-ups and systematic support for the continuous stimulation of learning.

On evaluation, both Guskey (2000) and Kirkpatrick and Kirkpatrick (2006) argue that focusing on evaluation and accountability in PDI design is the next step towards creating transparent, efficient and effective PDIs. The importance of transparency, efficiency and effectiveness in both online and blended learning has become even more apparent since the Covid-19 pandemic, as improving student outcomes ultimately remains the most important goal of teaching innovations (OECD, 2021). Both evaluative frameworks present evaluation as a multilevel initiative that requires the collection of evaluative data at multiple time points before, during and after the initiative. The Guskey evaluative framework consists of five levels: (1) Participants’ Reactions; (2) Participants’ Learning; (3) Organisation Support & Change; (4) Participants’ Use of New Knowledge and Skills; and (5) Student Learning Outcomes (Guskey, 2000). The Kirkpatrick framework, meanwhile, consists of four levels: (1) Evaluation of reaction; (2) Evaluation of learning; (3) Evaluation of behaviour; and (4) Evaluation of results (Kirkpatrick & Kirkpatrick, 2006). Both frameworks are relevant to the context of professional development for BL in higher education, as can be seen when the Kirkpatrick framework is compared with the Guskey framework:

Guskey’s “Participants’ reactions” (Level 1) refers to what the participants think of the initiative, and more specifically of organizational aspects such as the physical environment, the format/structure, the timing, and the pace. This level is comparable to Kirkpatrick level 1, “Evaluation of reaction”, defined as participants’ perceptions of, satisfaction with, and thoughts on the training. This information can be gathered through surveys, focus group discussions or interviews.

“Participants’ learning” (Level 2) can be measured through skills demonstrations, personal reflections, or the assessment of personal portfolios. This level is comparable to Kirkpatrick level 2, “Evaluation of learning”.

“Organizational support and change” (Level 3) draws on the institutional context. This includes the degree to which the institute has supported the initiative, from communication about the initiative to providing funding or setting aside time for staff to invest in professional development. This level goes one step further in that recommendations for practice are then implemented to adjust institutional policies to better accommodate current, ongoing, or future innovations. This level is partially reflected in Kirkpatrick level 4, “Evaluation of results”.

“Participants’ use of new knowledge and skills” (Level 4) refers to documenting actual changes in practice, for instance via observations of the newly developed courses and comparison against a qualitative “checklist” of innovative features. This level corresponds to Kirkpatrick level 3, “Evaluation of behaviour”.

“Student outcomes” (Level 5) refers to comparing student scores and grades before and after the initiative, and to gathering feedback from students about the course. This level is also encompassed within Kirkpatrick level 4, “Evaluation of results”, where “results” further refers to the measurement and observation of impacts on the institute, colleagues, students, and the greater society.

All these levels contribute to a well-rounded approach to evaluating professional development initiatives. In sum, both evaluation frameworks measure similar dimensions, and have thus been important tools in the literature on professional development and support for higher education teaching staff.
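For reference, this correspondence can be captured in a small data structure. The following Python sketch is purely illustrative: the level names are taken from the two frameworks as cited above, while the mapping itself (including the partial overlap at Guskey level 3) is our own summary, not part of either framework.

```python
# Illustrative summary (our own, not from either source): each Guskey level
# mapped to its closest Kirkpatrick counterpart, as discussed above.
GUSKEY_TO_KIRKPATRICK = {
    1: ("Participants' Reactions", "K1: Evaluation of reaction"),
    2: ("Participants' Learning", "K2: Evaluation of learning"),
    3: ("Organisation Support & Change", "K4: Evaluation of results (partial)"),
    4: ("Participants' Use of New Knowledge and Skills", "K3: Evaluation of behaviour"),
    5: ("Student Learning Outcomes", "K4: Evaluation of results"),
}

for level, (guskey, kirkpatrick) in GUSKEY_TO_KIRKPATRICK.items():
    print(f"Guskey level {level} ({guskey}) ~ {kirkpatrick}")
```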

2 Purpose of this study

The purpose of this study is to synthesize the findings from empirical studies on the evaluation of professional development initiatives for BL in higher education. In particular, this study focuses on understanding the content of the evaluations: participants’ reactions, what participants have learnt, the organizational support and change factors, how participants use their new knowledge and skills, and the student outcomes of the studied professional development initiatives. The Guskey framework was chosen as the framework of analysis because, even though other frameworks exist and are important, it is more differentiated and better suited to the higher education context, particularly through its focus on organizational support and change, an important factor in the institutional implementation of BL (Garrison & Kanuka, 2004). This study is guided by the following research questions, formulated with regard to professional development initiatives for BL in higher education:

  1. What are the common findings for participants’ reactions (Level 1)?

  2. How is participants’ learning evaluated (Level 2)?

  3. What are the factors of organizational support and change (Level 3)?

  4. How is participants’ use of new knowledge and skills measured (Level 4)?

  5. What were the student outcomes (Level 5)?

3 Methodology

To identify evaluative components within current empirical research on PDIs for BL in higher education settings, a meta-aggregative approach to synthesizing qualitative evidence was employed (Lockwood et al., 2017). This approach is mainly used in fields where qualitative studies are common, such as the healthcare, social and educational sciences (Philipsen et al., 2019a; Tondeur et al., 2017). It yields aggregated evidence and insights into evaluative practices in PDIs for BL specifically in the higher education context.

3.1 Data collection

A systematic review was conducted to locate and select empirical studies that fit the scope of this study, which were then evaluated using a list of selection criteria, including a critical appraisal framework for evaluating qualitative studies from the Joanna Briggs Institute (Critical Appraisal Skills Programme, 2018).

The search therefore focused on finding studies that described PDIs for BL in which the main participants were teaching staff at higher education institutes. Empirical research papers were obtained through multiple search strategies. The Web of Science and Scopus databases were consulted in June 2018 with the search term “Blended learning”, refined to “Higher education” (1681 results), then to “Institutional development” (38), and finally to articles (9). A follow-up Web of Science search was conducted with additional keywords: “Blended Learning” and “Higher Education” and “Training and professional development”. Publication year was restricted to the years 2000 through 2018. The total number of initially selected articles was 141, of which 43 were selected based on the abstracts. The search strategy and selection criteria are illustrated in Fig. 1.

Fig. 1 Search strategy and selection criteria
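As a rough illustration, the search cascade and counts reported above could be recorded as follows. This is a hypothetical sketch for transparency only: the exact database field tags and filter syntax used in Web of Science and Scopus are not reported here, so the query labels are assumptions.

```python
# Hypothetical record of the search cascade from Section 3.1. The counts come
# from the text; the query labels are illustrative, not actual database syntax.
primary_search = [
    ('"Blended learning"', None),                  # initial topic search
    ('+ refine: "Higher education"', 1681),
    ('+ refine: "Institutional development"', 38),
    ("+ refine: document type = article", 9),
]
follow_up_search = {
    "terms": '"Blended Learning" AND "Higher Education" AND '
             '"Training and professional development"',
    "years": (2000, 2018),          # publication-year restriction
    "initially_selected": 141,      # before abstract screening
    "after_abstract_screening": 43,
}

for step, count in primary_search:
    print(step, "->", "n/a" if count is None else count)
```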

3.2 Selection criteria

This study focuses on higher education teaching staff as the main participants. Studies with teaching staff from secondary and primary education were therefore excluded, as were studies that dealt with BL stakeholders other than teaching staff. The following criteria were applied after careful reading of the full text of each article (a hypothetical screening predicate implementing them is sketched after this list):

  1. Studies were excluded if the only participants were teacher trainers.

  2. The PDI had to be for BL rather than purely e-learning. Initiatives in an e-learning format were nevertheless included if the teachers being trained were new to e-learning, because in that case the online aspect is new to their teaching, and the process of becoming an online teacher requires the same change in teaching practices, namely translating face-to-face teaching practices into meaningful online teaching practices.

  3. Empirical studies were included, while reviews and conceptual framework studies were excluded.

  4. Studies had to include qualitative evidence: mixed-methods studies were included, while purely quantitative studies were excluded. The purpose of this qualitative synthesis is to synthesize evidence on the lived experiences, attitudes and other qualitative data concerning PDI for BL participants, which cannot be captured through quantitative data alone.

  5. A description of the PDI design, research methods and evaluation approaches, as well as the institutional context, had to be present. This means the studies had to include some reporting on the evaluation and assessment of one or more PDIs for BL; case-study papers therefore had to be excluded.

  6. Finally, full-text journal articles in English were included, while conference papers and studies for which the full text was unavailable were excluded.
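To make the screening logic explicit, the six criteria can be read as a single filter predicate. The sketch below is a hypothetical construction: the Study record and its field names are ours, and the actual screening was performed by reading full texts, not programmatically.

```python
# Hypothetical sketch of the six inclusion/exclusion criteria from Section 3.2.
# All field names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Study:
    only_teacher_trainers: bool              # criterion 1
    pdi_is_blended: bool                     # criterion 2
    teachers_new_to_elearning: bool          # criterion 2 (exception)
    is_empirical: bool                       # criterion 3
    has_qualitative_evidence: bool           # criterion 4
    reports_pdi_design_and_evaluation: bool  # criterion 5
    english_full_text_journal_article: bool  # criterion 6

def include(s: Study) -> bool:
    if s.only_teacher_trainers:
        return False
    if not (s.pdi_is_blended or s.teachers_new_to_elearning):
        return False
    if not (s.is_empirical and s.has_qualitative_evidence):
        return False
    if not s.reports_pdi_design_and_evaluation:
        return False
    return s.english_full_text_journal_article
```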

After judging all the full texts of the articles against the above criteria, 14 journal articles were selected. The relevance of these publications was further judged by an independent coder. The articles, the contexts of the studies and the PDI designs are listed in Table 1.

Table 1 Classification of the analysed papers

4 Analysis

A deductive analysis strategy was chosen: rather than inductively creating new categories of evaluation, this study makes use of the predetermined levels of the evaluation framework by Guskey (2000). The analysed data were findings extracted from the results, discussion, and conclusion sections of all the included studies. The Joanna Briggs Institute defines a finding as “…a verbatim extract of the author’s analytic interpretation accompanied by either a participant voice, or fieldwork observations or other data” (Lockwood et al., 2017), supported by illustrative, in-text evidence such as direct quotations from participants, observations, or other anecdotal data such as learning management system analytics and logs, and participants’ in-training performance, activities and reflections (Lockwood et al., 2017).

The selected studies were imported into NVivo 12, where the texts were analysed and coded directly into five parent nodes corresponding to the five levels: 1) Participants’ reactions, 2) Participants’ learning, 3) Organizational support and change, 4) Participants’ use of new knowledge and skills, and 5) Student outcomes. The full coding scheme can be found in Appendix Table 4.
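As a simplified illustration of that scheme, the five parent nodes and the subcategories reported in the Results section can be laid out as a nested structure. The node labels below come from the text itself; the dictionary layout is our own sketch.

```python
# Parent nodes (Guskey levels) and subcategories as reported in the Results
# section; see Appendix Table 4 for the full coding scheme.
coding_scheme = {
    "1 Participants' reactions": [
        "1.1 Reaction to collaboration", "1.2 Participant satisfaction",
        "1.3 Reactions to blended learning", "1.4 Technical problems and issues",
        "1.5 Time and workload management",
    ],
    "2 Participants' learning": [
        "2.1 Blended optimization", "2.2 Blended tools or methods",
        "2.3 Changing attitudes and beliefs", "2.4 Collaborating with colleagues",
        "2.5 Student needs",
    ],
    "3 Organizational support and change": [
        "3.1 Addressing students' needs and concerns",
        "3.2 Addressing teachers' needs and concerns",
        "3.3 Improvement of PDI", "3.4 Institutional considerations",
        "3.5 Institutional triggers for BL",
    ],
    "4 Participants' use of new knowledge and skills": [
        "4.1 Change in teaching practices", "4.2 No change in teaching practices",
    ],
    "5 Student outcomes": [
        "5.1 Management voice", "5.2 Student voice", "5.3 Teachers' voice",
    ],
}
```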

4.1 Inter-coder reliability

Two co-authors participated in the inter-coder reliability check. A preliminary parallel coding exercise showed low inter-coder reliability, particularly between levels 2 and 4. After discussion among the authors, changes were made to some of the coding to reflect more accurately the differences between the levels. In a second round, four articles were coded in parallel with the first author, using the same coding scheme and codebook. Inter-rater reliability was calculated via a coding comparison query in NVivo 12, and an overall percentage agreement of 98.06% across all five main level-codes was reached (see Table 2). Some levels had lower percentage agreement, such as organizational support and change; the disagreement was not due to the definition of the level but rather to overlooking relevant information pertaining to it.
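Percentage agreement is simply the share of coding decisions on which both coders agree. A minimal sketch of this arithmetic, assuming unit-level yes/no decisions per level-code (NVivo’s coding comparison query actually computes agreement over character overlap per source), could look like this; all names are hypothetical:

```python
# Minimal sketch of percentage agreement between two coders, assuming each
# coding unit receives a yes/no decision for each level-code. This simplified
# unit-level version is our own illustration of the arithmetic, not NVivo's
# exact character-overlap computation.
def percent_agreement(coder_a: list[set[str]], coder_b: list[set[str]],
                      codes: list[str]) -> float:
    """coder_a[i] and coder_b[i] hold the level-codes each coder applied to unit i."""
    decisions = agreements = 0
    for a_codes, b_codes in zip(coder_a, coder_b):
        for code in codes:
            decisions += 1
            if (code in a_codes) == (code in b_codes):
                agreements += 1
    return 100 * agreements / decisions

levels = ["L1", "L2", "L3", "L4", "L5"]
coder_a = [{"L1"}, {"L2", "L4"}, set(), {"L3"}]
coder_b = [{"L1"}, {"L2"},       set(), {"L3"}]
print(f"{percent_agreement(coder_a, coder_b, levels):.2f}%")  # 95.00%
```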

Table 2 Inter-coder reliability

5 Results

The five main synthesized findings, with their respective subcategories, are presented in this section. Each finding is illustrated with supporting direct quotes from the articles or with authors’ statements. Figures 2, 3, 4, 5, and 6 illustrate the synthesized findings. The referenced studies are indicated according to the numbers assigned to them in Table 1. The numbers in parentheses next to each category indicate the reference frequency found within this sample of studies.

Fig. 2 Synthesized finding 1 – Participants’ reactions, and recommendation for evaluating participants’ reactions

Fig. 3 Synthesized finding 2 – Participants’ learning, and recommendation for evaluating participants’ learning

Fig. 4 Synthesized finding 3 – Organizational support and change, and recommendation for evaluating organizational support and change

Fig. 5 Synthesized finding 4 – Participants’ use of new knowledge and skills, and recommendation for evaluating use of new knowledge and skills

Fig. 6 Synthesized finding 5 – Student outcomes, and recommendation for evaluating student outcomes

5.1 Participants’ reactions

Participants’ reactions featured prominently in the evaluation activities reported by the authors (12 out of 14 studies), with a total of 142 references. Five categories emerged under participants’ reactions: 1.1 Reaction to collaboration, 1.2 Participant satisfaction, 1.3 Reactions to blended learning, 1.4 Technical problems and issues, and 1.5 Time and workload management. The synthesized finding and corresponding evaluation recommendation are found in Fig. 2.

Various reactions were recorded by the authors. Most data collected on participants’ reactions were qualitative, stemming from interviews with participants, observations and reflections by PDI organisers and leaders, or written reflections made by participants during or as part of the training. Reactions to collaboration can be considered an experience that reflects the nature of BL: teaching staff must learn to experience BL from the students’ point of view, as organizing online collaboration is generally accepted as an effective means of increasing student engagement. Furthermore, reactions to collaborating with colleagues featured heavily in studies that reported PDIs with specific collaborative and groupwork strategies, such as teacher design teams (Nihuka & Voogt, 2012) or problem-based learning and interdisciplinary teams (Donnelly, 2010).

Positive collaboration experiences, illustrated by direct quotes, show how some participants valued the input and feedback from their peers, while others appreciated the community feeling that came with collaboration:

…Project member B well, showed me the how to do this…how to work with the equipment and of course he shared his experiences with me on how to get the best results. So in that way doing this, during that first year, well Project member B was really of great importance to me.

(Bohle Carbonell et al., 2013, p.33)

Negative collaboration experiences were also reported. Some authors wrote of participants who felt frustrated with their colleagues due to unclear communication, a lack of effort or engagement, or simply discomfort in working with unfamiliar colleagues:

Our group suffered severely for several weeks from misunderstandings and a complete disagreement on our concepts and ideas of how to move things forward; […] well that really set me off. (Sorcha, FG2)

(Donnelly, 2010, p.355)

Satisfaction, one of the main indicators in this level (Guskey, 2000), was evaluated via interviews, reflective reports, surveys and feedback forms. Authors reported on satisfaction regarding various issues, such as support during training, the usefulness of the tools or methods covered during training, and whether or not the training was perceived as sufficient by the participants.

We note that even tutors who were familiar with functionality still appreciated support in the development of their sessions…

(Macdonald & Campbell, 2012, p. 890)

Reflections about the nature of BL, realizations about the changes needed to accommodate its implementation, and realizations about the possibilities and/or restrictions associated with BL were categorized specifically as reactions to BL. Authors reported these mainly in connection with the desired pedagogical shift they hoped would come about with BL, ranging from realizations about the potential for increasing student engagement to highlighting the advantages that BL has over purely online learning:

I think it is a weakness of e-Learning that in many cases it relies on written communication because although people can misinterpret things in any form of communication, when you are online, it is much more complex and intricate to re-explain what I mean than what I can do in the f2f tutorial. (Aine, FG2).

(Donnelly, 2010, p.355)

The technical aspect of BL was pinpointed by these authors as having a significant effect on how the participants experienced the training. Technical issues were seen as one of the biggest problems that can affect the success of PDIs for BL, and several authors therefore reported specifically on these experiences and reactions:

Three quarters of respondents experienced technical problems, which affected mainly the audio-graphic conferencing system (mentioned by 14 respondents), the electronic assignment submission system (mentioned by 8) and the audio recording tool (mentioned by 7).

(Comas-Quinn, 2011, p.17)

In addition to technical issues, time and workload management were mainly seen as negatively impacting the PDI experience for participants. In several studies, this was mentioned in the context of the university environment where faculty staff already felt overwhelmed with many additional tasks other than teaching.

…instructors stated that collaborating in TDTs was time consuming because of too many demanding university routines.

(Nihuka & Voogt, 2012, p.239)

5.2 Participants’ learning

Evaluation of participants’ learning was less prominent in this sample of articles, with 10 out of 14 studies (112 references) reporting explicitly on what participants learnt during the PDI. The main themes under this level concern participants learning about: 2.1 Blended optimization, 2.2 Blended tools or methods, 2.3 Changing attitudes and beliefs, 2.4 Collaborating with colleagues, and 2.5 Student needs. The synthesized finding and corresponding recommendation for evaluating participants’ learning can be found in Fig. 3.

A common feature of PDIs for BL is the objective for participants to understand that online tools are a means to optimize learning environments, not simply to replace face-to-face activities. As such, themes around the optimization and harmonization of BL, and teaching participants the skills to implement BL in such a way that it enhances student learning, were central to many of the PDIs described in this sample of articles. Blended optimization and harmonization are often explicit intended learning outcomes in PDIs for BL, and therefore a natural focus point for evaluating participants’ learning:

The beauty of the mix between f2f and online is that you would never reach that on your own. Even in 10 weeks, you would never acquire that amount of knowledge as an individual in a lecture situation. (Declan, FG2)

(Donnelly, 2010, p.354)

Online tools and technology are a prominent feature of BL. It is therefore to be expected that many PDIs for BL focus on training teaching staff in specific tools and learning management systems, the efficient use of which is often key to the successful implementation of BL. In fact, several studies focused on specific tools or technologies that were explicitly stated as a main goal of the PDI, while other studies reported on participants learning how to evaluate the suitability of certain tools or online content:

I have learned that there are various styles out there that work really well. I also got a lot of ideas out of it on how to include the several tools into my online exercises.

(Macdonald & Campbell, 2012, p.889)

Many authors who reported on participants’ learning deemed changing attitudes and beliefs an indicator of learning. These were often reported in the form of reflections voiced during interviews, or entries and comments observed in online learning environments. Participants commented on realizing possibilities offered through BL that they had not considered before, or on overcoming their apprehension about learning to use online tools and methods. It is important that PDIs for BL address attitudes and beliefs specifically concerning technology and student-centred teaching approaches. For this reason, several studies described PDIs designed to place participants in specific scenarios, such as experiencing BL from the students’ point of view, where active reflection about pedagogy and technology is triggered:

I also learnt that exposure to pedagogic methods makes a tutor more receptive to ideas and methods from other faculties. Supporting learning makes one a better learner.

(Macdonald & Campbell, 2012, p.887)

Learning to collaborate with colleagues was reported by authors whose PDI and research objectives aligned with investigating the effects of collegial collaboration on participant learning. Learning from and with colleagues featured prominently in PDIs that were team or community based. One example where participants’ learning to collaborate was evaluated in depth:

Another element apparently also contributing to the effective collaboration within the group was that members were able to accept individual initiatives and were willing to work together to develop these.

(Ernest et al., 2013, p.14)

Participants’ learning about students’ needs was reported mostly through reflections voiced during interviews or observations of comments in online learning environments. Some PDIs were structured in such a way that participants had to become learners themselves, the objective being to understand which approaches to BL would best suit the learners in their own courses:

Silke also reflects on the teacher and the learner role and compares both of them and she understands the importance of task instructions: ―Sometimes […] we had problems to figure out what exactly we were supposed to do. Experiences like this made clear to me how important the formulation of a task is

(Fuchs et al., 2012, p.91)

5.3 Organizational support and change

Organizational support and change was by far the most heavily evaluated aspect in this sample of articles (all 14 studies, 173 references). Five categories were found under this level: 3.1 Addressing students’ needs and concerns, 3.2 Addressing teachers’ needs and concerns, 3.3 Improvement of PDI, 3.4 Institutional considerations, and 3.5 Institutional triggers for BL. These themes were found in the findings or results sections, via evidence from interviews with PDI participants or other key institutional stakeholders, or in the discussion sections, where authors formulated institutional policy recommendations based on the findings of their studies. The synthesized finding and recommendation for evaluating organizational support and change can be found in Fig. 4.

Institutional stakeholders were often interviewed concerning BL and the adjustments and accommodations that need to take place at an institutional level. Key concerns were quality assurance, students’ technology skills and access, and the logistics that need to be considered in the institution-wide implementation of BL initiatives.

‘I think key considerations for me are the quality of the student experience and the viability of the approach… [The] quality of the student experience is paramount’

(Adekola et al., 2017, p.6)

Besides students’ needs, authors also addressed concerns arising directly from the teaching staff. These issues mainly concern teaching staff’s technological competence, pedagogical issues within specific fields, and time and workload challenges specific to the higher education academic environment. Such statements were most often found in the discussion sections of the studies, where authors addressed the findings from evaluations of participants’ reactions and learning.

More provision for staff development is now being made and new staff allocation policies are being implemented to resolve staff workload issues.

(Ramos et al., 2011, pp. 169-170)

Closely connected to addressing teacher- and student-related concerns were discussions on improving future PDIs. Authors who formulated recommendations for future iterations of trainings often drew on findings from interviews or feedback forms gathered in connection with the PDI to state how they would address these issues in the future.

The recommendations for improvement and changes in future editions of the course focused on the possibility of going into an in-depth analysis of specific issues of collaborative activity design, such as communicative channels and spaces for building, sharing and discussing knowledge, and the procedures for its assessment.

(Guasch et al., 2010, p. 205)

Institutional considerations were widely discussed within this sample of articles. Under this theme, authors discussed central support for BL, centrally organized professional development support, infrastructure and the allocation of resources, central evaluation of blended programmes, and support and leadership for institutional learning communities.

there is a need to repurpose learning spaces to support a blended environment

(Adekola et al., 2017, p.8)

Institutional triggers for BL were evaluated either before the PDIs took place, serving as the justification for the initiatives, or afterwards, as a justification for applying a smaller initiative to the wider institutional context. Internationalization was seen as a main driver for offering blended and online courses; other triggers, such as labour market needs and changing demographics, were also listed. The changing population dynamics in Mozambique, for instance, created greater demand for access to higher education, prompting the need to provide distance education and, in turn, to prepare staff to use educational technology to meet these needs:

This enormous imbalance between supply and demand has been the main driver for UEM’s adoption of distance education.

(Ramos et al., 2011, p.161)

5.4 Participants’ use of new knowledge and skills

Evaluating participants’ use of new knowledge and skills was not as widely reported as participants’ learning (8 authors, 51 references). The challenge of measuring change in behaviour and teaching practices was approached in various ways: quantitatively, through examining online logs and learning analytics, or qualitatively, through interviews with participants some time after the conclusion of the PDI, or through examining participants’ reflections during formative evaluation processes in longitudinal PDIs such as action research cycles, communities of inquiry or teacher design teams. Authors reported findings under two main themes: 4.1 Change in teaching practices, or 4.2 No change in teaching practices. The synthesized finding and recommendation for evaluating participants’ use of new knowledge and skills can be found in Fig. 5.

Authors who evaluated the use of new knowledge and skills overwhelmingly reported positive changes in behaviour and teaching practices. Depending on the specific goals of the study or PDI, specific changes were focused on, such as participants using newly learnt evaluation and assessment methods, innovative approaches to teaching, and efficient use of the new tools and technology covered within the PDI. The difference between this evaluation level and that of participants’ learning is mostly the timing of the evaluation. Most authors reported changes in behaviour that were expected, or part of the intended learning outcomes of the PDI, and thus very similar to the learning evaluated within the PDI; the difference is that the evaluation took place some time after the conclusion of the PDI, or in connection with long-term formative evaluation in the context of communities of practice or action research. While self-reported evidence and personal reflections can show evidence of change and the retention of knowledge, classroom observations provide impartial and effective evidence of the actual implementation of BL methods, as illustrated below in the reported observation of a faculty dean concerning new teachers who had previously completed a PDI for BL:

The new HE teacher in our department was more kind of tech-savvy and implemented new ICT into teaching practice very fast … He was now the first teacher who opened an online course in our department. (Respondent 1)

(Wu et al., 2016, p. 551)

One author, however, specifically reported instances where no change in teaching practices was observed. These observations were then reflected upon and used to formulate improvement suggestions for future PDI iterations:

Online asynchronous tools were neglected because teachers were possibly not made adequately aware that online teaching through asynchronous tools could also be a central part of their jobs as teachers, just a different way of performing their role.

(Comas-Quinn, 2011, p. 21)

5.5 Student outcomes

Student outcomes was the least evaluated level in this sample of articles (6 authors, 46 references). Authors who evaluated the impact that the PDIs ultimately had on students reported on observations by teachers, faculty or other key stakeholders in institutional leadership positions expressed during interviews or feedback surveys; conducted interviews or focus groups directly with students; or examined learning analytics such as exam results, grades and failure rates in connection with the implemented blended courses. The three main themes under this level are 5.1 Management voice, 5.2 Student voice, and 5.3 Teachers’ voice. The synthesized finding and recommendation for evaluating student outcomes can be found in Fig. 6.

Faculty deans and heads of departments can provide a different perspective on student outcomes. Observations carried out by institutional leadership can help facilitate decision making, promote BL initiatives more widely, and secure better allocation of resources towards this goal. Wu et al. (2016) reported on leaders’ personal observations of the impact on students in their institutes, commenting on students’ satisfaction and engagement within the new blended courses carried out by the newly trained teachers:

Students were more engaged in (P4, 10 respondents), or more satisfied with, these new HE teachers’ courses (P5, 7 respondents).

(Wu et al., 2016, p.552)

Nevertheless, the most conventional evaluation of student outcomes employed by authors in this sample of studies was to go to the students directly: examining student performance or behaviour quantitatively via learning management system analytics, or gathering student views qualitatively via interviews, focus groups, feedback and survey forms. Participation, perceptions of BL, learning preferences and satisfaction were all important issues that the students themselves focused on when asked to comment on the new BL approaches in their courses.

Students indicated that these blended courses provided them with more flexibility but they expected that less class time would equate to less work and were frustrated to discover the opposite.

(Garrison & Vaughan, 2013, p.27)

Authors who interviewed teachers as part of evaluating the PDIs found that teachers commented on positive impacts that they had observed in their students, such as increased engagement, participation and satisfaction in the newly implemented blended courses.

I adopted inquiry-based learning strategy and provided topic-related video clips as learning trigger … Students felt more satisfied with this teaching mode compared with traditional lecture. (Respondent 41)

(Wu et al., 2016, p.549)

6 Discussion

The empirical evidence found within these studies provides a snapshot of the reality of BL PDI implementation in the higher education context, and of how these initiatives are evaluated. Evidence of evaluation corresponding to each of the five Guskey levels could be found throughout this sample of studies. Level 3 (Organizational support and change) was found in all 14 studies and was referenced most abundantly. This may be explained by the specific context of these articles, namely BL implemented in higher education institutes. Such initiatives focus heavily on institutional issues rather than on the individual participant level. Institutes frequently upgrade their infrastructure to accommodate new technologies and teaching methods, and with BL some level of technological infrastructure adaptation is necessary, especially in the case of a learning management system roll-out. Evaluation of support and change at the central institutional level thus becomes a very important factor in BL PDIs.

Evaluation of level 1 (Participants’ reactions) came second in prevalence in this sample, with 12 authors evaluating their participants’ reactions to the PDIs in various ways. Reactions to collaboration with colleagues reveal the importance of gathering qualitative evidence during and after the PDI to understand the group dynamics that may positively or negatively affect the PDI experience. This finding is further confirmed by Philipsen and colleagues (2019b), who found that participants’ PDI experiences can be greatly affected by how connected they feel with their peers.

Time and workload management issues featured prominently as important factors in higher education institutes such as universities. Institutional leadership must therefore prepare for changes to policies concerning the balance between teaching and other tasks, provide additional support where needed to ensure the effective use of new tools and infrastructure, and act to prevent widespread underuse of, or resistance to, the implemented changes (Díaz et al., 2010; Teixeira Antunes et al., 2021).

Technical challenges and reactions to BL are typical issues in PDIs for BL. Trainings and initiatives must account for the risk factors that come with technology use, such as software or hardware problems, or other previously unforeseen issues that may be discovered during the training, such as inadequate infrastructure or the unsuitability of tools for specific subject fields. Unforeseen contextual factors and technological challenges became extremely evident during the Covid-19 pandemic, when all education suddenly had to shift to online and distance learning. Internet access and access to spaces for learning and teaching came to the forefront of lessons learnt that need to be considered in future professional development initiatives for BL, during and even after the global pandemic (Lockee, 2021; OECD, 2021).

Levels 2 (Participants’ learning) and 4 (Use of new knowledge and skills) were at first difficult to differentiate from one another, in part because of how the results were communicated: it was not always clear which evaluations took place directly in the context of the PDI and which took place afterwards and/or independently of it. It was also not always clear whether sufficient time had passed between the conclusion of the PDI and the data collection moment for evaluating change in teaching behaviours. Some studies were of a longitudinal design (action research, community of inquiry, teacher design teams), and thus the two levels were intertwined within the design of the PDI, where change in behaviour was observed over time and occurred in parallel with learning. This enmeshment of the two levels was evident in the first round of inter-coder reliability checks, which prompted the authors to reassess the accuracy of the category definitions.

Levels 2 and 4 can be closely associated with changes in behaviour and attitudes and a shift towards reflection on pedagogy. Zeggelaar et al. (2020) found that the evaluation of such outcomes is key to understanding the effect of a PDI on participants, but more importantly emphasized that continuous evaluation, and by extension continuous professional development support, ensures better compliance and retention of learning. Therefore, to ensure the widespread use and implementation of BL, evaluation can be used as a tool to ensure that learning and the use of new knowledge and skills are retained and continue to develop. Formative approaches to evaluation and continuous professional development formats are more likely to enable institutional transitions towards BL (Garrison & Kanuka, 2004; Graham et al., 2013). The results found under level 4 are also in line with the findings of Brinkley-Etzkorn (2018), in that the integration of pedagogy and technology is challenging.

Further concerning the evaluation of level 4, Guskey (2000) places great importance on classroom observations. Many authors in this sample reported on online observations within learning management systems. Only one, however, reported on classroom observations, via interviews with the faculty deans and heads who carried out these observations (Wu et al., 2016). This is a point of consideration when planning evaluation for BL PDIs: the chosen methods should enable both online and face-to-face classroom observations. Evaluating both environments ensures a comprehensive understanding of how effectively BL is being implemented as a result of the PDI. To ensure the uptake of innovations, level 4 evaluations are important for understanding which aspects of the innovation to improve. This level was largely under-reported in this sample of articles, which may indicate that further research attention is needed on evaluating the use of new knowledge and skills in more in-depth ways, such as online and classroom observations.

Level 5 (Student outcomes) seemed to be neglected by most authors; others mentioned in their articles that these results had been examined in previously published work or discussed in project documents, and thus merited only a brief mention within the space of the journal article. An interesting aspect was the value of understanding the student voice, particularly concerning expectations regarding the use of new technologies and methods, as unclear expectations can often lead to frustration and disengagement among students.

6.1 Limitations and recommendations

The authors are aware that the findings from these studies are not exhaustive or conclusive of all the evaluation approaches that likely took place. The evaluations reported in these studies might have focused on the most significant results, while additional quantitative analyses (e.g. on student outcomes) were reported in separate research articles or documents.

Based on our study, we propose the recommendations listed in Table 3, along with the target evaluation participants. Most notably, special attention should be paid to how technology and the BL approach to teaching can cause tensions in higher education settings, particularly universities, where teaching staff often see PDIs as a burden rather than an opportunity. Hence, the focus of level 3, organizational support and change, needs to be on triangulating all evaluation data in order to align institutional vision, policies and resource allocation with the immediate needs of the teaching staff who will be expected to carry the innovation on their shoulders. Furthermore, paying attention to participants’ reactions (level 1), for example to in-training group dynamics, technological competences, and workload and time management issues, can indicate which level 3 improvements should be most urgently addressed at an institutional level or in future iterations of the PDI.

Table 3 Recommendations for evaluation strategies of PDIs for blended learning in higher education

Participants’ learning (level 2) and use of new knowledge and skills (level 4) tend to be more difficult to follow up on, but are vital for understanding progress in BL implementation (Han et al., 2019). The literature on professional development for BL observes that continuous check-ups, feedback and “reminders” can play an important role in cementing adherence to the use of technologies and BL methods (Zeggelaar et al., 2020). Most authors presented in this study thus advocate continuous professional development and support for BL, whether through several iterations or by setting up communities of practice/inquiry to support ongoing innovation. Here, again, level 3 plays an important role: central support for BL and PDIs provides the necessary foundation for learning and change to take place over sufficient time and with sufficient resources.

Future research should look into further contextual issues surrounding the evaluation of PDIs for BL in higher education settings, such as why certain evaluation methods are preferred over others (and how these preferences differ from other educational settings), and what the implications of these choices are. Furthermore, publication bias needs to be considered, in that authors typically submit their most attractive results, which are also the most likely to be published (Torgerson, 2006). Further studies should examine the biases in this particular field of professional development for blended learning, particularly regarding “unpublished” results and how these have impacted trends in PDI implementation in higher education.

7 Conclusion

Despite the limited number of articles included in this study, a broad variety of approaches and PDI designs was found, providing a holistic view of evaluation that corresponds to all five of the Guskey evaluation levels. These results thus provide a general impression of how professional development initiatives for BL in higher education are evaluated. The results make it possible to ascertain that special considerations apply in the context of BL in higher education institutes that are not necessarily relevant in other contexts. Considerations for technological issues and infrastructure, and for the context of universities where teaching staff must often divide their time between teaching and other duties, show that evaluation approaches need to take a comprehensive look at all of the levels, while paying special attention to the changes that need to take place at an institutional level.